Course Outline
Overview
This section outlines the appropriate contexts for applying machine learning methodologies, critical considerations for implementation, and the underlying principles governing these systems. It addresses the advantages and limitations of such approaches, encompassing data types (structured, unstructured, static, and streaming), data integrity and volume, the distinction between data-driven and user-driven analytics, and the comparison between statistical models and machine learning frameworks. Key topics include the challenges of unsupervised learning, the bias-variance tradeoff, iterative evaluation processes, cross-validation techniques, and the distinctions among supervised, unsupervised, and reinforcement learning paradigms. This guidance is intended for government applications to ensure rigorous and accountable data analytics practices.
PRIMARY TOPICS
1. Principles of Naive Bayes
- Foundational concepts of Bayesian methods
- Probability theory
- Joint probability
- Conditional probability and Bayes' theorem
- The Naive Bayes algorithm
- Naive Bayes classification
- The Laplace estimator
- Application of numeric features in Naive Bayes
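As a rough illustration of how Bayes' theorem, the "naive" independence assumption, and the Laplace estimator fit together, the sketch below trains on a tiny, made-up categorical dataset. The function names, the feature layout, and the spam/ham data are assumptions for illustration only and are not course material; numeric features are not handled here.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, laplace=1):
    """Count class frequencies and per-feature value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature index, label) -> value counts
    feature_values = defaultdict(set)     # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values, laplace


def predict_proba(model, row):
    """Return normalized posterior probabilities for one row."""
    class_counts, value_counts, feature_values, laplace = model
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / total  # prior P(class)
        for i, value in enumerate(row):
            count = value_counts[(i, label)][value]
            n_values = len(feature_values[i])
            # Laplace estimator: add a small count so that an unseen
            # value does not force the whole product to zero.
            score *= (count + laplace) / (n_label + laplace * n_values)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


# Hypothetical features: (contains "free", contains "meeting")
rows = [("yes", "no"), ("yes", "no"), ("no", "yes"), ("no", "no")]
labels = ["spam", "spam", "ham", "ham"]
model = train_naive_bayes(rows, labels, laplace=1)
probs = predict_proba(model, ("yes", "yes"))
```

With a Laplace value of 1, the message containing both words still receives a non-zero probability for each class, even though "meeting" never appeared in a spam example.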
2. Principles of Decision Trees
- Divide and conquer methodology
- The C5.0 decision tree algorithm
- Optimal split selection
- Decision tree pruning
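One common way to score candidate splits is information gain based on entropy. The sketch below shows that calculation on made-up labels; it is a simplified illustration and not the full C5.0 procedure, and the helper names are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, partitions):
    """Entropy before the split minus the weighted entropy after it."""
    n = len(labels)
    weighted = sum(len(part) / n * entropy(part) for part in partitions)
    return entropy(labels) - weighted


# Hypothetical example: a split that separates the classes perfectly
labels = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]
useless_split = [["yes", "no"], ["yes", "no"]]
gain_perfect = information_gain(labels, perfect_split)
gain_useless = information_gain(labels, useless_split)
```

A split that produces pure partitions recovers all of the original entropy, while a split that leaves each partition as mixed as the parent gains nothing.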
3. Principles of Neural Networks
- Transition from biological to artificial neurons
- Activation functions
- Network topology
- Layer configuration
- Information flow direction
- Number of nodes per layer
- Training via backpropagation
- Deep learning architectures
4. Principles of Support Vector Machines
- Hyperplane-based classification
- Maximizing the margin
- Handling linearly separable data
- Handling non-linearly separable data
- Kernel methods for non-linear spaces
5. Principles of Clustering
- Clustering as a machine learning objective
- The k-means algorithm
- Distance metrics for cluster assignment and update
- Selection of optimal cluster count
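The assignment and update steps of k-means can be sketched in a few lines. This is a minimal illustration under stated assumptions: Euclidean distance, the first k points used as initial centers (real implementations usually randomize this), and made-up data.

```python
import math


def kmeans(points, k, iterations=100):
    """Alternate assignment and update steps until the centers stop moving."""
    # Assumption: the first k points serve as the initial centers.
    centers = [tuple(p) for p in points[:k]]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assignments


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centers, assignments = kmeans(points, k=2)
```

On this toy data the two nearby pairs end up in separate clusters, even though both initial centers start in the same corner.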
6. Classification Performance Metrics
- Analysis of classification prediction data
- Confusion matrix structure
- Performance evaluation via confusion matrices
- Alternative performance measures beyond accuracy
- The kappa statistic
- Sensitivity and specificity
- Precision and recall
- The F-measure
- Visualization of performance tradeoffs
- Receiver Operating Characteristic (ROC) curves
- Estimation of future performance
- The holdout method
- Cross-validation
- Bootstrap sampling
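The measures listed above can all be derived from the four cells of a two-class confusion matrix. The sketch below assumes counts of true positives, false positives, false negatives, and true negatives are already available; the function name and the example counts are made up for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Derive common performance measures from a 2x2 confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    # Kappa adjusts accuracy for the agreement expected by chance alone.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    expected = p_yes + p_no
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
        "kappa": kappa,
    }


# Hypothetical counts
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
```

In this example accuracy is 0.7 but kappa is only 0.4, which illustrates why accuracy alone can overstate performance.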
7. Optimization of Standard Models
- Automated parameter tuning with caret
- Construction of optimized models
- Customization of the tuning process
- Performance improvement through meta-learning
- Ensemble methodology
- Bagging
- Boosting
- Random forests
- Training random forests
- Evaluation of random forest performance
SECONDARY TOPICS
8. Classification via Nearest Neighbors
- The k-nearest neighbors (kNN) algorithm
- Distance calculation methods
- Selection of appropriate k value
- Data preparation for kNN application
- Characteristics of lazy algorithms in kNN
9. Classification via Rule-Based Systems
- Separate and conquer strategy
- The OneR algorithm
- The RIPPER algorithm
- Derivation of rules from decision trees
10. Principles of Regression
- Simple linear regression
- Ordinary least squares estimation
- Correlation analysis
- Multiple linear regression
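For simple linear regression, the ordinary least squares estimates have a closed form: the slope is the covariance of x and y divided by the variance of x. The sketch below computes it directly on made-up data; the function name is an assumption, and multiple regression would need a matrix formulation instead.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-deviations and squared deviations from the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
intercept, slope = fit_simple_ols(x, y)
```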
11. Regression Trees and Model Trees
- Integration of regression functions into tree structures
12. Association Rule Mining
- The Apriori algorithm
- Measurements of rule interest: support and confidence
- Rule set construction using the Apriori principle
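Support and confidence can be computed directly from a list of transactions. The sketch below shows only these two measures on made-up basket data; it does not implement the Apriori candidate search itself, and the function names are assumptions.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, lhs, rhs):
    """Support of the full rule divided by the support of its left-hand side."""
    return support(transactions, set(lhs) | set(rhs)) / support(transactions, lhs)


# Hypothetical market basket data
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
```

Here the rule {bread} -> {milk} appears in half of all transactions, and two out of every three bread purchases also include milk.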
SUPPLEMENTARY TOPICS
- Spark/PySpark/MLlib implementations
- Multi-armed bandit algorithms
Requirements
Testimonials (7)
I thoroughly enjoyed the training and appreciated the deeper dive into the subject of Machine Learning. I appreciated the balance between theory and practical applications, especially the hands-on coding sessions. The trainer provided engaging examples and well-designed exercises that enhanced the learning experience. The course covered a wide range of topics, and Abhi demonstrated excellent expertise by answering all questions with clarity and ease.
Valentina
Course - Machine Learning
I appreciated the exercises, which helped me understand the theory and apply it step by step, as well as the way the trainer explained everything in a simple and clear manner. It was easy to follow even though I'm not very experienced with Python; still, I didn't want to miss the opportunity to learn something that really interests me. I also appreciated the variety of information provided and the trainer’s availability to explain and support us in understanding the concepts. After this course, machine learning concepts are much clearer to me, and now I feel like I have a direction and a better understanding of the topic.
Cristina
Course - Machine Learning
At the end of the training, I could see the real-life use-case of the subjects presented.
Daniel
Course - Machine Learning
I liked the pace, the balance between theory and practice, the main topics covered, and the way the trainer was able to put everything into balance. I also really like your training infrastructure; working with VMs is very practical.
Andrei
Course - Machine Learning
Keeping it short and simple. Creating intuition and visual models around the concepts (decision tree graph, linear equations, calculating y_pred manually to prove how the model works).
Nicolae - DB Global Technology
Course - Machine Learning
It helped me achieve my goal of understanding ML. Much respect for Pablo for giving a proper introduction to this topic, since it becomes obvious after 3 days of training how vast this topic is. I also enjoyed A LOT the idea of the virtual machines you provided, which had very good latency! It allowed every participant to experiment at their own pace.
Silviu - DB Global Technology
Course - Machine Learning
The practical part is great: seeing the theory materialize into something practical.