Resume

Nathan Leclercq

nathan.leclercq9@protonmail.com | LinkedIn | GitHub | Blog


Profile

Data Engineer and ML Engineer at DataKhi for 3 years (apprenticeship, then full-time). I design and operate end-to-end data platforms: collection, pipelines, ML models, deployment, monitoring. Background in mathematics and computer science, Master’s in Machine Learning (University of Lille). I don’t just write code: I deploy, industrialize and deliver.


Professional Experience

DataKhi — Data Consulting Firm, Tourcoing (2023 - present)

Data Engineer — Nyukom Project · Full-time · Oct 2025 - present

  • Designed and deployed an end-to-end telecom data platform: collection (3CX web scraping, Centreon API), MinIO data lake, PostgreSQL star schema warehouse, Power BI reporting
  • Full infrastructure deployment: K3s, Airflow, Ansible, private Docker registry
  • Multi-tenant architecture with partitioning, idempotency and historical backfill
  • Stack: Airflow, K3s, Ansible, Docker, PostgreSQL, MinIO, Playwright, Pandas

ML Engineer — Hall U Need Project · Full-time (continued from apprenticeship) · 2023 - present

  • Industrialized a restaurant demand forecasting model (XGBoost quantile regression)
  • Multi-restaurant prediction models, feature engineering (weather, calendar, reservations)
  • Custom loss function (Huber), confidence interval calibration, regression tests
  • Full pipeline: Microsoft Fabric collection → training → prediction · Makefile workflow

Data Engineer & ML Engineer — Tossée Project · Apprenticeship · 2023 - 2025

  • Architected a complete data ecosystem for an eco-friendly fashion aggregator
  • Multi-brand scraping (Playwright, Scrapy, custom YAML rules engine)
  • Normalization pipeline, environmental impact calculation (Ecobalyse API), product embeddings
  • Backend API (FastAPI, PostgreSQL/pgvector, semantic search, recommendation)
  • Flutter mobile app: virtual try-on (DM-VTON), barcode scanning, multi-provider OAuth, geolocation
  • React/TypeScript browser extension for real-time environmental impact display
  • AI agent (OpenAI Agents SDK) for automated data extraction from HTML
  • Hybrid on-premise / Azure deployment (Functions, Blob, DevOps)

Full-Stack Developer — Internship · 2023 · 4 months

  • Power BI versioning system: C++ backend (report differentials), React frontend, Electron distribution

Music Teacher · 2017 - present

  • Saxophone (jazz, soul) and music theory — private lessons and music schools

Technical Skills

Data Engineering

  • End-to-end ETL pipelines, star schema, partitioning, idempotency, backfill
  • Apache Airflow · PostgreSQL · MinIO (S3) · Parquet / PyArrow · Microsoft Fabric

Machine Learning

  • XGBoost (quantile regression) · Feature engineering · Temporal cross-validation
  • Embeddings / vector search (pgvector) · CamemBERT / Transformers · MLflow
  • Confidence interval calibration · Custom loss functions

DevOps / Infrastructure

  • Kubernetes (K3s) · Docker · Ansible (IaC, roles, vault) · Proxmox
  • Monitoring: Prometheus / Grafana · CI/CD: Makefile, pipelines
  • Azure (Fabric, Functions, Blob, DevOps)

Development

  • Python (FastAPI, Pandas, scikit-learn) · SQL · TypeScript (React) · Dart (Flutter)
  • Scraping: Playwright, Scrapy, BeautifulSoup
  • Familiar with: Go, Rust, Haskell, C++

Scientific / Competitive Programming

  • Julia (competitions: Google Hash Code, Reply Challenge, Cloudflight) · R · NumPy / SciPy

Personal Projects

MLOps Homelab Platform · 2024 - present

  • Self-hosted infrastructure: Proxmox, GPU servers, ML services, crewAI agents with RAG
  • Prometheus/Grafana monitoring, Ansible deployment, Docker registry, Gitea
  • Published technical articles

Book Recommendation System · 2023 - 2025

  • Full data pipeline: large book catalogue scraping, embeddings (TF-IDF + CamemBERT), FastAPI API
  • PostgreSQL/pgvector, MLflow, Vue.js interface
  • Published technical articles

Algorithms Club · 2020 - 2024

  • Preparation for and participation in competitive programming contests
  • Optimized solutions in Julia · Google Hash Code, Reply Challenge, Cloudflight

Research: Melody Harmonization · 2024

  • Comparative study of models and algorithms for automatic melody harmonization

Education

Master’s in Machine Learning · University of Lille · 2023 - 2025

  • Deep Learning, NLP, MLOps · LLM deployment on GPU infrastructure

Bachelor’s in Computer Science · University of Lille · 2020 - 2023

  • Advanced algorithms, distributed architecture, full-stack development

Mathematics Studies (3 years) · University of Lille · 2017 - 2020

  • Numerical analysis, probability/statistics, applied linear algebra

Languages

  • French: native
  • English: professional (TOEIC 885)

Interests

  • Music: jazz/soul saxophone, orchestra
  • Sports: daily cycling, badminton
  • Reading: science fiction, technical essays
  • Tabletop role-playing games

Publications