Pavel Janata

Machine Learning • Data Science • he/him

Prague, CZ • pavel@janata.me • LinkedIn • GitLab

Professional summary

Senior ML Engineer and Data Scientist with 7+ years’ experience designing and delivering ML‑powered solutions end‑to‑end across industries including energy, manufacturing, finance, and healthcare. Combines strong data‑science fundamentals — statistical modelling, time‑series forecasting, anomaly detection — with production engineering skills to turn experimental models into robust, scalable systems. Currently tech‑leading projects, mentoring junior engineers across teams, and driving pre‑sales and solution design for new client engagements.

Core skills

Data Science Experiment design, feature engineering, statistical modelling, Bayesian methods, time‑series forecasting, anomaly detection, NLP / topic modelling
ML / DL Frameworks scikit‑learn, PyTorch, Darts, XGBoost / LightGBM, Hugging Face Transformers
Languages Python, Scala, SQL
Python ecosystem Pandas, NumPy, SciPy, Jupyter, Plotly, Dash, FastAPI, Pydantic, uv, ruff
Data & Storage DuckDB, PostgreSQL (Aurora), Cassandra, S3
Streaming & Processing Apache Flink, Spark Structured Streaming, Kafka, Kinesis
Cloud & Ops AWS (Bedrock, Glue, S3, RDS, Kinesis, CloudWatch), Kubernetes, Docker, GitLab Pipelines, GitHub Actions
ML Domains Time‑Series Forecasting, Anomaly Detection, Demand Forecasting, NLP / LLMs, Fraud Detection

Experience

Blindspot.ai

Senior AI Engineer / Tech Lead

Part of Adastra Group
AI/ML consultancy

Prague · Jul 2018 – present (full‑time since Feb 2023)

Responsibilities

2026 Probability of Default Model Development · Senior Data Scientist / Tech Lead
Led development of probability‑of‑default (PD) models for a consumer lending company. Performed end‑to‑end EDA on client data, designed the modelling approach, evaluated and validated results, and proposed a deployment architecture for client‑side implementation.
Python, Pandas, XGBoost, scikit‑learn, DuckDB

2025–26 GenAI Use Case Assessment & Prototyping · AI Consultant / Project Lead
Led a GenAI assessment for a major European payment‑services provider — from use‑case discovery and prioritisation workshops through to functional LLM‑based prototypes (AWS Bedrock flows, RAG via Knowledge Bases, agentic architectures). Delivered a comprehensive feasibility assessment with production effort estimates.
AWS, AWS Bedrock, LLM

2025 Asphalt Heating Optimisation · ML Engineer
Built a predictive model to optimise the asphalt heating process for an international manufacturer. Predicted cooling behaviour based on weather, transport, and process variables to recommend optimal heating temperatures — reducing energy use and CO₂ emissions while meeting technical standards. Developed GBM and Bayesian models with end‑to‑end feature engineering, and delivered a FastAPI prediction service for production integration.
Python, scikit‑learn, Pandas, FastAPI

2025 Credit Portfolio Forecasting · Data Scientist
Developed a forecasting solution to predict the future structure of individual credit portfolios for a consumer microloan provider. Built analytical datasets from raw credit report data, developed and evaluated deep forecasting models (N‑BEATS, Temporal Fusion Transformer), and applied hierarchical reconciliation techniques to ensure consistency across product categories.
Python, scikit‑learn, PyTorch, Pandas, DuckDB

2025 Warehouse Resupply Forecasting · Data Scientist
Built demand forecasting models for a pharmacy retail chain to predict warehouse delivery volumes for shift scheduling and staffing. Designed both monthly workload predictions and granular daily forecasts, leveraging delivery patterns and long‑term supplier trends.
Python, scikit‑learn, Pandas

2025 EU AI Act Compliance · AI Consultant
Guided a major telco in complying with the EU AI Act — gap analysis, governance process design, MLOps architecture for traceability, and cross‑functional coordination (legal, compliance, engineering).
Kubernetes, MLflow, Elastic

2023–25 IoT Measurements Correction Platform · Tech Lead
Designed and delivered a fault‑tolerant, real‑time streaming platform for a major natural‑gas distributor. The system aggregates, validates, and automatically corrects measurements from hundreds of thousands of sensors within legally mandated timeframes. Developed and integrated an ML model for real‑time fault detection and correction. Managed production deployment end‑to‑end.
Python, Apache Flink, AWS Kinesis, AWS Glue, PostgreSQL (Aurora), S3

2023 Customer Support Ticket Analysis · Data Scientist
Analysed unstructured customer‑service chat data for a major Slovak telco using topic modelling to categorise key communication themes and derive operational insights.
Python, Transformers, UMAP

2023 Bank Balance Prediction · Data Scientist / Data Engineer
Developed a real‑time predictive model for customer account balances within a major Czech bank’s data platform. Implemented in Spark Structured Streaming with data integration from multiple warehouses and Airflow‑orchestrated workflows.
Scala, Spark Structured Streaming, Python, Kafka, Hive, Airflow

2018–20, 2020–23 User & Entity Behaviour Analytics · ML Engineer / Data Engineer
Developer on a cloud‑native, multi‑tenant anomaly‑detection platform for a major EU SIEM provider. Designed unsupervised anomaly‑detection algorithms, user‑clustering approaches, and large‑scale feature extraction in Apache Flink. Managed data ingestion, processing, and storage across the full platform lifecycle.
Python, Scala, Apache Flink, Kafka, Cassandra, Kubernetes, Docker, AWS

2020 Epidemic Forecasting (COVID‑19) · Data Engineer
Supported epidemiological research in collaboration with the University of Oxford: built data pipelines for aggregating data, integrating forecasting models, and publishing predictions.
Python, GCP, GitHub Actions

2019 Fraud Detection · Data Scientist
Developed ML models and feature‑extraction techniques for scoring transaction risk at a major telco operator.
Python, scikit‑learn, Docker

Education



2021–2023 Data Science Master Program at Open Informatics, FEE CTU
Thesis topic: Decentralized Federated Learning for Network Security

2019–2021 Artificial Intelligence Master Program at Open Informatics, FEE CTU
Unfinished

2016–2019 Informatics and Computer Science Bachelor Program at Open Informatics, FEE CTU
Thesis topic: Transfer Learning for Textual Topic Classification