Introduction

Hi, I am a data practitioner with experience in various data-related areas (data engineering, data analysis, data science, machine learning) and business domains (customer support, customer success, product, marketing, personalization, and sales).

I am an independent Python user who is also proficient in multiple database query languages and dialects (PostgreSQL, MySQL, Snowflake, Standard SQL in BigQuery), with experience ranging from data collection through wrangling, modeling, description, and visualisation to machine learning techniques and models (XGBoost, KMs, Word2vec).

Lately, I have gained experience with data science techniques (like segmentation or recommendation engines) and LLMs in the industry context and data engineering areas (ETL, data models, data architecture).

Data Engineer

Make, Prague, Czech Republic, 12/2022-present

  • Data engineering. Development and implementation of the CI/CD pipeline for Keboola using kb-cli, GitHub, and GitHub Actions. Building ELT pipelines and data modelling. Building custom Extract/Load components for various Sales/CRM platforms (Celonis, ZoomInfo) using the requests library. Addressing data quality. Code reviewing. DataOps and FinOps. Airflow, Amazon Managed Workflows for Apache Airflow. Data Quality using Soda.
  • Machine Learning. Deploying data science projects (LTV or personas) into production. MLOps. Strategy for the ML infrastructure development. Code reviewing.
  • Generally used tools. Python 3.11+, Snowflake, Airflow, AWS, GitHub, Make, Confluence, Slack, Jira.

Machine Learning Engineer

Ataccama, Prague, Czech Republic, 07/2022-10/2022

  • As a part of the Data Stories team, I contributed to the back-end (idiomatic and pydantic-driven Python, FastAPI) and front-end (TypeScript, Vue.js) codebases. The use cases included user-defined filters or charts and attribute recommendation.
  • Unit testing via pytest (including contribution to the CI/CD pipeline). Git using GitLab. Linting using black, isort, and flake8. Docker, Kubernetes, kustomize and Helm.
  • Data analytics and data science - prototyping via Jupyter Notebooks.
  • Generally used tools. Python 3.10, Snowflake, MySQL, REST API, Jira, Notion.

Data & Machine Learning Engineer

CloudTalk, Prague, Czech Republic, 04/2022-06/2022

  • Data engineering. Development and implementation of the CI/CD pipeline for Keboola using kb-cli, GitHub, and GitHub Actions. Building ELT pipelines and data modelling. Building custom Extract/Load components for various Sales/CRM platforms (HubSpot, Crunchbase, Apollo) using the requests library. Addressing data quality topics like entity profiling. Code reviewing.
  • Machine Learning. Developing anti-fraud prediction models. Defining business logic with stakeholders and setting up alerting. Implementing the cookiecutter framework for developing data science projects. Preparing a strategy for the ML infrastructure development. Code reviewing.
  • Data analytics. Ad hoc reports using Jupyter Notebooks and Redash.
  • Generally used tools. Python 3, Snowflake, MLflow, MySQL, REST API, Jira, Outline.

Interim Product Owner of Personalization

Rohlik Group, Prague, Czech Republic, 09/2021-03/2022

  • Distributed leadership of the 4-member team (a back-end developer, a tester, a data analyst, a machine-learning engineer). Setting a roadmap, agile ceremonies (planning, grooming, retrospectives, etc.).
  • Feature Lifecycle Management - collecting and prioritising business requirements based on their alignment with the company’s business goals. Alignment with other relevant stakeholders across the company (including preemptive identification of dependencies, synergies, and blockers). Development and deployment of personalised features to production, including evaluating the business impact.
  • Setting up objectives, key results, and key performance indicators of the Personalisation squad based on the company’s OKRs (GTMHub). Reporting to the C-level management and the board of directors.

Machine Learning Engineer

Rohlik Group, Prague, Czech Republic, 02/2021-03/2022

  • Covered domains: stakeholder management in the agile set-up (using Jira), data engineering, ML model development and deployment in production using Keboola and AWS (both batch and real-time predictions) and performance evaluation. Focus on the personalization of the product and CRM, solving various classification and regression tasks in the international context.
  • Data engineering. Data models and ETL self-service in Keboola (including REST APIs via Postman) and AWS (S3, Redshift, MySQL).
  • MLOps using AWS - SageMaker, Lambda, CloudWatch, Glue, API Gateway.
  • A/B testing - design, execution, and evaluation. Dashboards in Tableau, ad hoc reports using Jupyter Notebooks. Git using GitLab and AWS CodeCommit.
  • Other tools - Python (including various packages like pandas or seaborn), Snowflake, PySpark, TensorFlow.

Data Analyst

Meiro, Brno, Czech Republic, 08/2020–11/2020

  • Covered domains: Data Engineering, Data Analysis, Data Science.
  • ETL pipelines management, Design of data models.
  • Integrating PostgreSQL and MySQL with R and Python; work ranging from data wrangling, description, and visualization to advanced statistical and data science techniques (e.g. recommendation engine and customer segmentation) in Python using VS Code. Fundamentals of Spark.
  • Linux desktop (Ubuntu), Bash, Git. Fundamentals of Docker and CI/CD.

Data Analyst

Kiwi.com, Brno, Czech Republic, 04/2019–07/2020

  • Covered domains: Product, Customer Experience, Customer Support.
  • Reports. Integrating PostgreSQL, Snowflake, and BigQuery with R and Python; work ranging from data wrangling, description, and visualization to advanced statistical techniques (regression models, structural equation modelling) in R. Reporting in Markdown (R notebooks, Jupyter Notebooks, oral presentations).
  • Dashboards. I’m proficient in working with tools like Looker, GoodData, Google Data Studio, or Retool (Plotly).
  • Data engineering. Sense of data models and ETL self-service (Keboola, dbt).
  • Linux desktop (Ubuntu), Bash, Git.
  • Jira, Scrum.

Lecturer

Masaryk University, Brno, Czech Republic, 09/2015–09/2020

  • Demonstrations of analyses and their interpretations, providing oral and written feedback in the following courses:
    • Statistical Data Analysis I.
      • Correlation, contingency tables, t-test, an introduction to non-parametric tests (e.g. chi-square).
    • Statistical Data Analysis II.
      • Multiple linear regression, ANOVA, logistic regression, factor analysis, mixed effect models.
    • An Introduction to R
      • An introduction to the R language, data cleaning, wrangling, description and visualization, multivariate analyses in the R language.

Researcher

Transport Research Centre, Brno, Czech Republic, 06/2015–03/2019

  • Travel behaviour analyses, coordination of research activities, communication with the lay public as well as the community of experts.
  • Investigator in the project Česko v pohybu. Development of the research tool, development and implementation of the probabilistic sampling procedure and algorithms for automated control of the data quality.
  • Methodological and analytical supervision of the research project for the Czech Ministry of Transport on public opinion on autonomous vehicles.
  • Project management of small project teams, e.g. program section for the conference Dopravní chování v datech.

Education

Psychology (Ph.D.)

Masaryk University, Brno, Czech Republic, 09/2015–06/2022

Psychology (Master's degree)

Masaryk University, Brno, Czech Republic, 2013–2015

Psychology and Sociology (Bachelor's degree)

Masaryk University, Brno, Czech Republic, 2010–2013

Internships

Danmarks Tekniske Universitet, Management Engineering, Lyngby, Denmark. 08-09/2017

Corpus Christi College, University of Cambridge, Cambridge, UK. 07-09/2015

Selected publications

Selected talks


