Course Outline

SRE Anti-patterns

  • Identifying counterproductive practices in government operations
  • Recognizing the impact of anti-patterns on system reliability
  • Best practices and corrective alternatives to enhance reliability

SLO as a Proxy for Customer Satisfaction

  • Defining Service Level Indicators (SLIs) and Service Level Objectives (SLOs)
  • Managing error budgets and balancing innovation with reliability in public sector services
  • Understanding the limitations of distributed systems in government environments

Building Secure and Reliable Systems

  • Designing for fault tolerance and resilience in government systems
  • Integrating security into reliability engineering practices for government
  • Scalability and data protection strategies tailored for public sector use

Full-stack Observability

  • Instrumentation and metrics collection methods for government systems
  • Distributed tracing and synthetic monitoring techniques for enhanced visibility
  • Observability-driven development to improve system transparency and accountability

Platform Engineering and AIOps

  • Platform-centered engineering approaches for government applications
  • Automation and orchestration in Site Reliability Engineering (SRE) for public sector operations
  • Leveraging DataOps and operational intelligence to optimize government services

Incident Management in SRE

  • Roles and responsibilities in incident response within government agencies
  • Applying frameworks such as the Observe, Orient, Decide, Act (OODA) loop for efficient response
  • Automated remediation and AI/ML-assisted resolution to enhance incident management in government systems

Chaos Engineering

  • Principles and strategies for resilience testing in government IT environments
  • Planning and executing “game day” exercises to prepare for potential failures
  • Learning from controlled failure experiments to improve system reliability for government operations

SRE as a Pure Form of DevOps

  • Integrating SRE into DevOps workflows within government agencies
  • Cultural alignment and collaboration practices for effective implementation in the public sector
  • Driving organizational transformation through SRE to enhance government service delivery

Post-class Exercises

  • Large-scale system design case studies relevant to government operations
  • Advanced instrumentation and monitoring scenarios for public sector systems
  • Real-world reliability problem-solving exercises for government applications

Review and Exam Preparation

  • Final review of the DevOps Institute SRE Practitioner syllabus with a focus on government relevance
  • Sample questions and practice tests to prepare public sector professionals for the certification exam
  • Exam-taking strategies and recommendations tailored for government professionals

Summary and Next Steps

Requirements

  • Comprehension of fundamental Site Reliability Engineering principles
  • Practical experience with DevOps methodologies and associated tools
  • Familiarity with system monitoring, incident management, and automation techniques

Audience

  • SRE professionals pursuing the DevOps Institute SRE Practitioner certification
  • DevOps engineers looking to transition into reliability-focused roles
  • Operations leaders tasked with developing and implementing reliability strategies within their organizations

Duration: 35 hours
