Pavel Janata
Machine Learning • Data Science • he/him
Professional summary
Senior ML Engineer and Data Scientist with 7+ years’ experience designing and delivering ML‑powered solutions end‑to‑end across industries including energy, manufacturing, finance, and healthcare. Combines strong data‑science fundamentals — statistical modelling, time‑series forecasting, anomaly detection — with production engineering skills to turn experimental models into robust, scalable systems. Currently tech‑leading projects, mentoring junior engineers across teams, and driving pre‑sales and solution design for new client engagements.
Core skills
| Data Science | Experiment design, feature engineering, statistical modelling, Bayesian methods, time‑series forecasting, anomaly detection, NLP / topic modelling |
| ML / DL Frameworks | scikit‑learn, PyTorch, Darts, XGBoost / LightGBM, Hugging Face Transformers |
| Languages | Python, Scala, SQL |
| Python ecosystem | Pandas, NumPy, SciPy, Jupyter, Plotly, Dash, FastAPI, Pydantic, uv, ruff |
| Data & Storage | DuckDB, PostgreSQL (Aurora), Cassandra, S3 |
| Streaming & Processing | Apache Flink, Spark Structured Streaming, Kafka, Kinesis |
| Cloud & Ops | AWS (Bedrock, Glue, S3, RDS, Kinesis, CloudWatch), Kubernetes, Docker, GitLab Pipelines, GitHub Actions |
| ML Domains | Time‑Series Forecasting, Anomaly Detection, Demand Forecasting, NLP / LLMs, Fraud Detection |
Experience
Blindspot.ai
Senior AI Engineer / Tech Lead
Part of Adastra Group
AI/ML consultancy
Prague · Jul 2018 – present (full‑time since Feb 2023)
Responsibilities
- Technical delivery – architect data pipelines, develop & deploy ML models, perform data‑science experimentation; hands‑on coding (Python/Scala).
- Tech lead – own technical direction on projects, drive architecture decisions, coordinate with client engineering teams.
- People leadership – mentor interns, junior, and mid‑level engineers; conduct 1‑on‑1s, performance reviews, and growth planning.
- Pre‑sales & solution design – lead discovery workshops with prospective clients, translate business needs into technical proposals and effort estimates.
2026
Probability of Default Model Development · Senior Data Scientist / Tech Lead
Led development of probability‑of‑default (PD) models for a consumer lending company. Performed end‑to‑end EDA on client data, designed the modelling approach, evaluated and validated results, and proposed a deployment architecture for client‑side implementation.
Python, Pandas, XGBoost, scikit‑learn, DuckDB
2025–26
GenAI Use Case Assessment & Prototyping · AI Consultant / Project Lead
Led a GenAI assessment for a major European payment‑services provider — from use‑case discovery and prioritisation workshops through to functional LLM‑based prototypes (AWS Bedrock flows, RAG via Knowledge Bases, agentic architectures). Delivered a comprehensive feasibility assessment with production effort estimates.
AWS, AWS Bedrock, LLM
2025
Asphalt Heating Optimisation · ML Engineer
Built a predictive model to optimise the asphalt heating process for an international manufacturer. Predicted cooling behaviour based on weather, transport, and process variables to recommend optimal heating temperatures — reducing energy use and CO₂ emissions while meeting technical standards. Developed GBM and Bayesian models with end‑to‑end feature engineering, and delivered a FastAPI prediction service for production integration.
Python, scikit‑learn, Pandas, FastAPI
2025
Credit Portfolio Forecasting · Data Scientist
Developed a forecasting solution to predict the future structure of individual credit portfolios for a consumer microloan provider. Built analytical datasets from raw credit report data, developed and evaluated deep forecasting models (N‑BEATS, Temporal Fusion Transformer), and applied hierarchical reconciliation techniques to ensure consistency across product categories.
Python, scikit‑learn, PyTorch, Pandas, DuckDB
2025
Warehouse Resupply Forecasting · Data Scientist
Built demand forecasting models for a pharmacy retail chain to predict warehouse delivery volumes for shift scheduling and staffing. Designed both monthly workload predictions and granular daily forecasts, leveraging delivery patterns and long‑term supplier trends.
Python, scikit‑learn, Pandas
2025
EU AI Act Compliance · AI Consultant
Guided a major telco in complying with the EU AI Act — gap analysis, governance process design, MLOps architecture for traceability, and cross‑functional coordination (legal, compliance, engineering).
Kubernetes, MLflow, Elastic
2023–25
IoT Measurements Correction Platform · Tech Lead
Designed and delivered a fault‑tolerant, real‑time streaming platform for a major natural‑gas distributor. The system aggregates, validates, and automatically corrects measurements from hundreds of thousands of sensors within legally mandated timeframes. Developed and integrated an ML model for real‑time fault detection and correction. Managed production deployment end‑to‑end.
Python, Apache Flink, AWS Kinesis, AWS Glue, PostgreSQL (Aurora), S3
2023
Customer Support Ticket Analysis · Data Scientist
Analysed unstructured customer‑service chat data for a major Slovak telco using topic modelling to categorise key communication themes and derive operational insights.
Python, Transformers, UMAP
2023
Bank Balance Prediction · Data Scientist / Data Engineer
Developed a real‑time predictive model for customer account balances within a major Czech bank’s data platform. Implemented in Spark Structured Streaming with data integration from multiple warehouses and Airflow‑orchestrated workflows.
Scala, Spark Structured Streaming, Python, Kafka, Hive, Airflow
2018–20, 2020–23
User & Entity Behaviour Analytics · ML Engineer / Data Engineer
Developer on a cloud‑native, multi‑tenant anomaly‑detection platform for a major EU SIEM provider. Designed unsupervised anomaly‑detection algorithms, user‑clustering approaches, and large‑scale feature extraction in Apache Flink. Managed data ingestion, processing, and storage across the full platform lifecycle.
Python, Scala, Apache Flink, Kafka, Cassandra, Kubernetes, Docker, AWS
2020
Epidemic Forecasting (COVID‑19) · Data Engineer
Supported epidemiological research in collaboration with the University of Oxford: built data pipelines for aggregating data, integrating forecasting models, and publishing predictions.
Python, GCP, GitHub Actions
2019
Fraud Detection · Data Scientist
Developed ML models and feature‑extraction techniques for scoring transaction risk at a major telco operator.
Python, scikit‑learn, Docker
Education
2021–2023
Data Science Master Program at Open Informatics, FEE CTU
Thesis topic: Decentralized Federated Learning for Network Security
2019–2021
Artificial Intelligence Master Program at Open Informatics, FEE CTU
Unfinished
2016–2019
Informatics and Computer Science Bachelor Program at Open Informatics, FEE CTU
Thesis topic: Transfer Learning for Textual Topic Classification