Course Outline
Introduction
- Developing effective algorithms for pattern recognition, classification, and regression is essential for government applications.
Setting up the Development Environment
- Utilizing Python libraries to support algorithm development for government use.
- Evaluating the benefits of online versus offline editors in a government context.
Overview of Feature Engineering
- Identifying and utilizing input and output variables (features) for government datasets.
- Assessing the advantages and disadvantages of feature engineering in public sector projects.
Types of Problems Encountered in Raw Data
- Addressing issues such as unclean data, missing data, and other common data challenges for government datasets.
Pre-Processing Variables
- Strategies for managing missing data in government datasets.
Handling Missing Values in the Data
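As an illustration of this topic, the following is a minimal sketch of median and mode imputation with pandas. The column names (`age`, `region`) and the values are invented for the example and do not come from any real dataset; other strategies, such as dropping rows or model-based imputation, may suit a given dataset better.

```python
import numpy as np
import pandas as pd

# Hypothetical records; column names and values are invented for illustration.
df = pd.DataFrame({
    "age": [34.0, np.nan, 52.0, 41.0],
    "region": ["north", "south", None, "south"],
})

# Keep a flag so a model can still see that the value was originally missing.
df["age_was_missing"] = df["age"].isna()

# Numeric column: fill with the median, which is less sensitive to extreme values than the mean.
df["age"] = df["age"].fillna(df["age"].median())

# Categorical column: fill with the most frequent value (the mode).
df["region"] = df["region"].fillna(df["region"].mode()[0])

print(df)
```

Recording the missing-value flag before filling is a design choice: the fact that a field was left blank can itself carry predictive information.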
Working with Categorical Variables
Converting Labels into Numbers
Handling Labels in Categorical Variables
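One possible way to convert labels into numbers is sketched below with pandas. It assumes two hypothetical columns: `priority`, whose labels have a natural order, and `department`, whose labels do not. The ordering used in `priority_order` is an assumption made for the example.

```python
import pandas as pd

# Hypothetical data; both columns and their labels are invented for illustration.
df = pd.DataFrame({
    "priority": ["low", "high", "medium", "low"],
    "department": ["health", "transport", "health", "education"],
})

# Ordered labels: map each label to an integer that preserves the assumed order.
priority_order = {"low": 0, "medium": 1, "high": 2}
df["priority_code"] = df["priority"].map(priority_order)

# Unordered labels: one-hot encode so that no artificial order is implied.
encoded = pd.get_dummies(df, columns=["department"], prefix="dept")

print(encoded)
```

Integer codes suit ordered labels, while one-hot columns avoid suggesting that one department is "greater" than another.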
Transforming Variables to Improve Predictive Power
- Techniques for transforming numerical, categorical, and date variables to enhance predictive models for government use.
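The sketch below shows two common transformations of this kind: a log transform for a skewed numerical variable and the extraction of calendar parts from a date variable. The columns `budget` and `filed_on` are hypothetical, and whether a log transform helps depends on the distribution of the actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical data; column names and values are invented for illustration.
df = pd.DataFrame({
    "budget": [1_000.0, 25_000.0, 3_000_000.0],
    "filed_on": ["2023-01-15", "2023-06-30", "2024-02-29"],
})

# Numerical variable: log1p computes log(1 + x), compressing large values
# while remaining defined at zero.
df["budget_log"] = np.log1p(df["budget"])

# Date variable: parse the strings, then derive features a model can use directly.
df["filed_on"] = pd.to_datetime(df["filed_on"])
df["filed_year"] = df["filed_on"].dt.year
df["filed_month"] = df["filed_on"].dt.month
df["filed_dayofweek"] = df["filed_on"].dt.dayofweek  # Monday is 0, Sunday is 6

print(df)
```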
Cleaning a Data Set
Machine Learning Modelling
Handling Outliers in Data
- Methods for identifying and managing outliers in numerical and categorical variables within government datasets.
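As a rough sketch of one such method, the example below flags numerical outliers with the interquartile range (IQR) rule and groups rare labels in a categorical variable. The 1.5 multiplier is the conventional choice for the IQR rule, while the 20% rarity threshold, the series names, and the values are assumptions made for the example.

```python
import pandas as pd

# Hypothetical numerical variable with one extreme value.
days = pd.Series([12, 14, 15, 13, 16, 14, 95], name="processing_days")

# IQR rule: values beyond 1.5 * IQR from the quartiles are treated as outliers.
q1, q3 = days.quantile(0.25), days.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
is_outlier = (days < lower) | (days > upper)

# One way to manage them: cap (winsorize) values at the bounds instead of dropping rows.
capped = days.clip(lower=lower, upper=upper)

# Hypothetical categorical variable: labels below an assumed 20% frequency
# are grouped into a single "other" label.
requests = pd.Series(["permit", "permit", "license", "permit", "license", "appeal"])
freq = requests.value_counts(normalize=True)
rare_labels = freq[freq < 0.2].index
grouped = requests.where(~requests.isin(rare_labels), "other")

print(is_outlier.tolist())
print(capped.tolist())
print(grouped.tolist())
```

Capping keeps every record in the dataset, which can matter when each row represents a case that must still be scored.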
Summary and Conclusion
Requirements
- Proficiency in Python programming.
- Experience with NumPy, pandas, and scikit-learn.
- Knowledge of Machine Learning algorithms.
Intended Audience
- Developers
- Data Scientists
- Data Analysts
Testimonials (2)
The ML ecosystem: not only MLflow but also Optuna, Hyperopt, Docker, and docker-compose.
Guillaume GAUTIER - OLEA MEDICAL
Course - MLflow
I enjoyed participating in the Kubeflow training, which was held remotely. This training allowed me to consolidate my knowledge of AWS services, K8s, and all the DevOps tools around Kubeflow, which are the necessary foundations for tackling the subject properly. I want to thank Malawski Marcin for his patience and professionalism in the training and for his advice on best practices. Malawski approaches the subject from different angles and with different deployment tools: Ansible, EKS, kubectl, and Terraform. Now I am definitely convinced that I am going into the right field of application.