Course Outline
Module 1: Microservices Design
• Establishing Effective Microservice Boundaries
• Applying Domain-Driven Design (DDD)
• Exploring Alternatives to Business Domain Boundaries (Volatility, Data, Technology, Organizational)
• Strategies for Decomposing Monolithic Applications
• Avoiding Premature Decomposition
• Decomposing by Layer
• Utilizing Decomposition Patterns (Strangler, Parallel Run, Feature Toggle)
• Addressing Data Decomposition Concerns (Performance, Integrity, Transactions)
Module 2: Optimizing Docker Images and the Container Runtime for Government
• Selecting the Appropriate Base Image
• Reducing the Number of Layers
• Implementing Multi-Stage Builds
• Optimizing Images (Sorting Multi-Line Arguments, etc.)
• Utilizing Build Cache Effectively
• Pinning Image Versions for Stability
• Fine-Tuning Resource Allocation
• Adhering to Secure Container Practices
• Configuring Runtime Settings for Optimal Performance
Module 3: Kubernetes & Release Strategies for Government
Kubernetes Deployments Overview
• Creating and Executing Initial Deployments
• Exploring Kubernetes Deployment Options
Performing Rolling Update Deployments
• Understanding the Concept of Rolling Updates
• Creating and Executing a Rolling Update
• Rolling Back a Deployment When Necessary
Performing Canary Deployments
• Understanding Canary Deployments
• Creating and Executing a Canary Deployment
Performing Blue-Green Deployments
• Understanding Blue-Green Deployments
• Creating and Executing a Blue-Green Deployment
Running Jobs and CronJobs
• Creating and Managing Jobs and CronJobs
Monitoring and Troubleshooting for Government
• Advanced Troubleshooting Techniques with kubectl
Module 4: Automation & Operational Efficiency for Government
Using Python to Automate Common Tasks in Kubernetes
• Leveraging Python for Administrative Operations in Kubernetes
• Using Python to Define Configuration Objects
• Creating Deployment Objects with Python
• Monitoring Kubernetes Events via Python
• Scaling Deployments Automatically Using Python
Understanding the Challenges of Automating Deployments for Government
• Declarative Configuration in Kubernetes
• Ensuring Configuration Integrity and Consistency
Using the GitOps Approach for Automating Deployments for Government
• Principles of GitOps
• Introduction to Flux
• Installing Flux on a Kubernetes Cluster
Configuring Flux for Automated Deployments for Government
• Utilizing Notifications in Flux
• Structuring the Source Repository
Handling Application Updates with Image Automation for Government
• Updating Application Deployments with Flux
• Scanning Container Image Repositories for Tags
• Defining Policies for Selecting Latest Images
• Configuring Flux to Perform Automatic Image Updates
Module 5: Observability & Root Cause Clarity for Government
Kubernetes Logging and Tracing Capabilities for Government
• The Importance of Logging and Tracing
• Accessing Kubernetes Logs
• Reviewing Pod and Container Logs
• Examining Control Plane Logs
• Monitoring Resource Usage of Nodes and Pods
Collecting and Analyzing Logs for Government
• Log Aggregation Techniques
• Visualizing Logs for Better Insights
Distributed Tracing in Kubernetes for Government
• Overview of Distributed Tracing
• Utilizing OpenTelemetry
• Exploring Distributed Tracing Tools
• Instrumenting Applications for Tracing
• Using Tracing to Identify Performance Issues
Monitoring with Prometheus and Grafana for Government
• Key Concepts of Observability
• Monitoring Tools Overview
• Implementing Prometheus Instrumentation
Advanced Use Cases for Logging for Government
• Advanced Log Processing Techniques
• Filtering and Enriching Logs
• Implementing Event Sourcing
Module 6: Cluster Crisis Simulation & Incident Response for Government
• Understanding Various Types of Failures in a Cluster Environment
• Simulating Node Failures
• Handling Pod Eviction and Resource Exhaustion Scenarios
• Addressing Network Issues
• Managing DNS Failures and Verifying Application Timeout Handling
• Simulating an API Server Outage
• Testing System Stability Under High Traffic Conditions
• Handling Storage Failures
• Resolving Configuration Errors
• Understanding Incident Reporting Procedures
Module 7: AI to Support Troubleshooting for Government
• Benefits of Generative AI for Kubernetes Operations
• Overview of the K8sGPT CLI Architecture
• Installing the K8sGPT CLI
• Utilizing K8sGPT Commands and Features
• Using K8sGPT Analyzers (podAnalyzer, pvcAnalyzer, rsAnalyzer, etc.)
• Analyzing the Cluster with K8sGPT
• Addressing Real-Time Issues with K8sGPT
• Deploying an In-Cluster Operator for K8sGPT
Requirements
- Proficiency in Linux command-line operations
- Experience in application development or system administration
- Familiarity with container technologies, including Docker concepts
- Fundamental knowledge of Kubernetes principles (pods, deployments, services)
- General understanding of software architecture (e.g., APIs, services)
Target Audience
- DevOps Engineers
- Site Reliability Engineers (SREs)
- Backend and Software Developers working with microservices
- Cloud Engineers and Platform Engineers
- System Administrators transitioning to Kubernetes environments for government use cases