Course Outline
Module 1: Core Python for Machine Learning Workflows
• Course kickoff and environment setup
Align course objectives and establish a reproducible Python machine learning (ML) workspace for government.
• Python language essentials (fast-track)
Review essential Python syntax, control flow, functions, and patterns commonly used in ML codebases for government.
• Data structures for ML
Utilize lists, dictionaries, sets, and tuples for managing features, labels, and metadata in government ML projects.
• Comprehensions and functional tools
Employ comprehensions and higher-order functions to express data transformations in ML workflows for government.
• Object-oriented Python for ML developers
Explore classes, methods, composition, and practical design decisions for government ML applications.
• Dataclasses and lightweight modeling
Use typed containers for configuration, examples, and results in government ML projects.
• Decorators and context managers
Implement timing, caching, logging, and resource-safe execution patterns in government ML scripts.
• Working with files and paths
Ensure robust dataset handling and support various serialization formats in government ML workflows.
• Exceptions and defensive programming
Write ML scripts for government that fail safely and transparently.
• Modules, packages, and project structure
Organize reusable ML codebases for government projects.
• Typing and code quality
Utilize type hints, documentation, and lint-friendly structures to enhance code quality in government ML projects.
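Several Module 1 topics (dataclasses, decorators, typing) can be illustrated together in a minimal sketch. The names `TrainConfig`, `timed`, and `train` are hypothetical, chosen only for illustration:

```python
import time
from dataclasses import dataclass
from functools import wraps

@dataclass(frozen=True)
class TrainConfig:
    """Typed, immutable container for run parameters."""
    learning_rate: float = 0.01
    epochs: int = 5

def timed(fn):
    """Decorator recording the last call's wall-clock duration."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        wrapper.last_elapsed = time.perf_counter() - start
        return result
    return wrapper

@timed
def train(config: TrainConfig) -> float:
    # Placeholder "training": pretend loss shrinks with more epochs
    return 1.0 / config.epochs

cfg = TrainConfig(epochs=10)
loss = train(cfg)
```

A frozen dataclass makes experiment configuration hashable and tamper-proof, while the decorator adds timing without touching the training logic.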
Module 2: Numerical Python, SciPy, and Data Handling
• NumPy foundations for vectorized computing
Master efficient array operations and performance-aware coding in government ML tasks.
• Indexing, slicing, broadcasting, and shapes
Perform safe tensor manipulation and shape reasoning in government ML data.
• Linear algebra essentials with NumPy and SciPy
Conduct stable matrix operations and decompositions used in government ML.
• SciPy deep dive
Explore statistics, optimization, curve fitting, and sparse matrices for government ML applications.
• Pandas for tabular ML data
Clean, join, aggregate, and prepare datasets for government ML projects.
• Scikit-learn deep dive
Explore the estimator interface, pipelines, and reproducible workflows in government ML.
• Visualization essentials
Create diagnostic plots for data exploration and model behavior in government ML.
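The broadcasting and shape-reasoning topics above can be sketched with a common preprocessing step, column-wise standardization, where per-column statistics of shape `(2,)` broadcast across every row:

```python
import numpy as np

X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# mean/std along axis 0 have shape (2,); broadcasting
# applies them to each row of the (3, 2) array
mu = X.mean(axis=0)
sigma = X.std(axis=0)
Z = (X - mu) / sigma
```

After this transform every column has zero mean and unit standard deviation, regardless of the original scale difference between features.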
Module 3: Programming Patterns for Building ML Applications
• From notebook to maintainable project
Refactor exploratory code into structured packages for government ML.
• Configuration management
Externalize parameters and implement startup validation in government ML projects.
• Logging, warnings, and observability
Implement structured logging for debuggable ML systems in government.
• Reusable components with OOP and composition
Design extensible transformers and predictors for government ML.
• Practical design patterns
Apply pipeline, factory, registry, strategy, and adapter patterns in government ML.
• Data validation and schema checks
Prevent silent data issues in government ML applications.
• Performance and profiling
Identify bottlenecks and apply optimization techniques in government ML.
• Model I/O and inference interfaces
Ensure safe persistence and clean prediction interfaces for government ML models.
• End-to-end mini build
Develop a production-style ML pipeline with configuration and logging for government.
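The data-validation and logging topics in Module 3 can be combined in a small sketch. `SCHEMA` and `validate_record` are hypothetical names for a hand-rolled schema check; production code might use a library such as pydantic instead:

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("ml_pipeline")

# Expected field names and their allowed Python types
SCHEMA = {"age": (int, float), "income": (int, float), "label": (int,)}

def validate_record(record: dict) -> list:
    """Return a list of schema violations; an empty list means valid."""
    errors = []
    for field, types in SCHEMA.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], types):
            errors.append(f"bad type for {field}: {type(record[field]).__name__}")
    if errors:
        logger.warning("record rejected: %s", errors)
    return errors

good = {"age": 42, "income": 50_000.0, "label": 1}
bad = {"age": "42", "income": 50_000.0}
```

Returning the violation list (rather than raising immediately) lets a pipeline log every problem in a batch before deciding whether to fail, which prevents silent data issues.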
Module 4: Statistical Learning for Tabular, Text, and Image Data
• Evaluation foundations
Implement train and validation splits, honest cross-validation, and business-aligned metrics in government ML.
• Advanced tabular ML
Apply regularized GLMs, tree ensembles, and leakage-free preprocessing in government ML.
• Calibration and uncertainty
Use Platt scaling, isotonic regression, bootstrap, and conformal prediction in government ML.
• Classical NLP methods
Explore tokenization trade-offs, TF-IDF, linear models, and Naive Bayes in government NLP.
• Topic modeling
Understand LDA fundamentals and practical limitations in government NLP.
• Classical computer vision
Work with HOG, PCA, and feature-based pipelines in government computer vision.
• Error analysis
Detect bias, label noise, and spurious correlations in government ML.
• Hands-on labs
Develop a leakage-proof tabular pipeline
Compare and interpret text baselines
Create a classical vision baseline with structured failure analysis for government.
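The "honest cross-validation" and "leakage-free preprocessing" topics above come together in one scikit-learn idiom: put the scaler inside the pipeline so it is re-fit on each training fold, never on validation data. A minimal sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Scaling lives inside the pipeline, so its statistics are computed
# only from each fold's training portion -- no leakage into validation
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(pipe, X, y, cv=5, scoring="accuracy")
```

Fitting the scaler on the full dataset before splitting would leak validation statistics into training and inflate the reported scores.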
Module 5: Neural Networks for Tabular, Text, and Image Data
• Training loop mastery
Master clean PyTorch loops with AMP, clipping, and reproducibility in government ML.
• Optimization and regularization
Explore initialization, normalization, optimizers, and schedulers in government ML.
• Mixed precision and scaling
Apply gradient accumulation and checkpointing strategies in government ML.
• Tabular neural networks
Use categorical embeddings, feature crosses, and ablation studies in government tabular data.
• Text neural networks
Implement embeddings and CNN, BiLSTM, or GRU layers for sequence handling in government text data.
• Vision neural networks
Utilize CNN fundamentals and ResNet-style architectures in government computer vision.
• Hands-on labs
Develop a reusable training framework
Compare tabular neural networks with boosting
Experiment with CNNs, augmentation, and scheduling in government ML.
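The anatomy of a training loop with gradient clipping, covered above in PyTorch, can be shown framework-agnostically with NumPy. This is a sketch of the loop structure on a toy linear-regression problem, not the course's PyTorch implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=100)

w = np.zeros(3)
lr, max_norm = 0.1, 5.0
for epoch in range(200):
    grad = 2 * X.T @ (X @ w - y) / len(y)   # MSE gradient
    norm = np.linalg.norm(grad)
    if norm > max_norm:                      # gradient norm clipping
        grad *= max_norm / norm
    w -= lr * grad                           # optimizer step
```

The same forward / gradient / clip / step skeleton carries over directly to a PyTorch loop, where clipping becomes `torch.nn.utils.clip_grad_norm_`.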
Module 6: Advanced Neural Architectures
• Transfer learning strategies
Apply freeze and unfreeze patterns, and discriminative learning rates in government ML.
• Transformer architectures for text
Understand self-attention internals and fine-tuning approaches in government NLP.
• Vision backbones and dense prediction
Explore ResNet, EfficientNet, Vision Transformers, and U-Net concepts in government computer vision.
• Advanced tabular architectures
Use TabTransformer, FT-Transformer, and Deep and Cross networks in government tabular data.
• Time series considerations
Address temporal splits and covariate shift detection in government time series data.
• PEFT and efficiency techniques
Evaluate LoRA, distillation, and quantization trade-offs in government ML.
• Hands-on labs
Fine-tune a pretrained text transformer
Fine-tune a pretrained vision model
Compare a tabular transformer with GBDT in government ML.
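The LoRA idea from the PEFT topic above reduces to a small linear-algebra identity: the frozen weight `W` is augmented by a trainable low-rank product `B @ A`, scaled by `alpha / r`. A NumPy sketch (all dimensions and names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
d_out, d_in, r, alpha = 8, 8, 2, 4

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # small random down-projection
B = np.zeros((d_out, r))                # zero-initialized up-projection

def forward(x):
    # Base path plus low-rank update, scaled by alpha / r as in LoRA
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
```

Because `B` starts at zero, the adapted model initially matches the frozen model exactly; training only `A` and `B` updates `2 * r * d` parameters instead of `d * d`.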
Module 7: Generative AI Systems
• Prompting fundamentals
Implement structured prompting and controlled generation in government AI.
• LLM foundations
Understand tokenization, instruction tuning, and hallucination mitigation in government large language models (LLMs).
• Retrieval-Augmented Generation
Use chunking, embeddings, hybrid search, and evaluation metrics in government AI.
• Fine-tuning strategies
Apply LoRA and QLoRA with data quality controls in government AI.
• Diffusion models
Gain intuition and practical adaptation for latent diffusion in government AI.
• Synthetic tabular data
Utilize CTGAN and consider privacy implications in government synthetic data.
• Hands-on labs
Build a production-style RAG mini-application
Implement structured output validation with schema enforcement
Optionally experiment with diffusion in government AI.
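The retrieval half of the RAG topic above can be sketched with a toy bag-of-words retriever. The chunks, `embed`, and `retrieve` below are illustrative stand-ins for real chunking and dense embeddings:

```python
import math
from collections import Counter

chunks = [
    "the permit application requires two forms of identification",
    "benefit payments are issued on the first business day of each month",
    "permit renewals can be completed online within ten minutes",
]

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words term-count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str) -> str:
    """Return the chunk most similar to the query."""
    q = embed(query)
    return max(chunks, key=lambda c: cosine(q, embed(c)))

context = retrieve("renew my permit online")
```

A production RAG system swaps the count vectors for learned embeddings and hybrid search, but the retrieve-then-generate flow is the same: the selected `context` would be placed into the LLM prompt.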
Module 8: AI Agents and MCP
• Agent loop design
Design observe, plan, act, reflect, and persist loops for government AI agents.
• Agent architectures
Explore ReAct, plan-and-execute, and multi-agent coordination in government AI.
• Memory management
Implement episodic, semantic, and scratchpad approaches in government AI agents.
• Tool integration and safety
Ensure tool contracts, sandboxing, and prompt injection defenses in government AI.
• Evaluation frameworks
Develop replayable traces, task suites, and regression testing for government AI.
• MCP and protocol-based interoperability
Design MCP servers with secure tool exposure for government AI.
• Hands-on labs
Build an AI agent from scratch
Expose tools via an MCP-style server
Create an evaluation harness with safety constraints for government AI.
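The observe/plan/act loop, tool contracts, and replayable traces above can be condensed into a minimal agent skeleton. The tool names and the rule-based `plan` function are hypothetical stand-ins for an LLM planner:

```python
# Registry of callable tools -- the agent may only act through these
TOOLS = {
    "lookup_office_hours": lambda city: f"{city} office: 9am-5pm weekdays",
    "lookup_fee": lambda service: {"permit": "$25"}.get(service, "unknown"),
}

def plan(task: str):
    # Observe the task and pick a tool call (a real agent would ask a model)
    if "hours" in task:
        return ("lookup_office_hours", "Springfield")
    return ("lookup_fee", "permit")

def run_agent(task: str, max_steps: int = 3) -> list:
    trace = []                        # replayable trace for evaluation
    for _ in range(max_steps):
        tool, arg = plan(task)
        if tool not in TOOLS:         # tool-contract safety check
            trace.append(("error", "unknown tool", tool))
            break
        result = TOOLS[tool](arg)
        trace.append((tool, arg, result))
        return trace                  # single-step task: act, then stop
    return trace

trace = run_agent("what are the office hours")
```

Keeping every step in a trace list is what makes regression testing possible: an evaluation harness replays stored tasks and asserts on the recorded tool calls and results.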
Requirements
Participants should possess a strong foundation in Python programming and prior experience with machine learning using scikit-learn or a comparable framework.
A working knowledge of NumPy, basic linear algebra, probability, and model evaluation concepts is required.
Experience with PyTorch or another deep learning framework is highly recommended for those intending to engage in the advanced modules.
This program is designed for intermediate to advanced technical professionals and is not appropriate for beginners. It is tailored to enhance the skills of professionals who are actively involved in data science and machine learning projects for government and other public sector applications.
Testimonials (2)
The ML ecosystem: not only MLflow but also Optuna, Hyperopt, Docker, and docker-compose.
Guillaume GAUTIER - OLEA MEDICAL
Course - MLflow
I enjoyed participating in the Kubeflow training, which was held remotely. This training allowed me to consolidate my knowledge of AWS services, K8s, and all the DevOps tools around Kubeflow, which are the necessary bases to properly tackle the subject. I wanted to thank Malawski Marcin for his patience and professionalism during the training, and for his advice on best practices. Malawski approaches the subject from different angles and with different deployment tools: Ansible, EKS, kubectl, Terraform. Now I am definitely convinced that I am going into the right field of application.