Course Outline

Introduction to EXO and Local AI Clustering for Government

  • Overview of the EXO framework and the exo-explore ecosystem
  • Comparing centralized cloud inference with distributed local inference for government operations
  • Architecture: libp2p device discovery, MLX backend, dashboard, and API layers
  • Hardware requirements: Apple Silicon (M3 Ultra, M4 Pro/Max), Thunderbolt 5, and shared storage

Installing EXO on macOS for Government

  • Setting up Xcode, the Metal Toolchain, and macOS prerequisites
  • Installing uv, Node.js, and the Rust nightly toolchain
  • Installing the pinned macmon fork for Apple Silicon monitoring
  • Cloning the repository and building the dashboard with npm
  • Running EXO from source and verifying the dashboard at localhost:52415
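
The dashboard verification step can be scripted as a simple TCP reachability check. This is a minimal sketch that only confirms something is listening on the port named in the outline; it does not verify that the listener is EXO. The function name `dashboard_reachable` is hypothetical and chosen for illustration.

```python
import socket


def dashboard_reachable(host: str = "localhost", port: int = 52415,
                        timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Assumption: 52415 is the dashboard port from the outline; a successful
    connection only shows that some process is listening there.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling `dashboard_reachable()` on a node after starting EXO gives a quick yes/no answer before opening the dashboard in a browser.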

Installing EXO on Linux for Government

  • Installing dependencies via apt or Homebrew on Linux
  • Configuring uv, Node.js 18+, and Rust nightly
  • Building the dashboard and running EXO in CPU-only mode
  • Directory layout: XDG Base Directory paths for config, data, cache, and logs
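
The XDG Base Directory lookup rules can be sketched as follows. The environment variable names and fallback paths come from the XDG specification; the application subdirectory name `exo` and the helper name `xdg_dirs` are assumptions for illustration. The specification has no dedicated log directory (it suggests the state directory for logs), so where EXO actually writes its logs should be confirmed against its documentation.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional

# XDG variable name and its default relative to the home directory.
_XDG_DEFAULTS = {
    "config": ("XDG_CONFIG_HOME", ".config"),
    "data": ("XDG_DATA_HOME", ".local/share"),
    "cache": ("XDG_CACHE_HOME", ".cache"),
    "state": ("XDG_STATE_HOME", ".local/state"),
}


def xdg_dirs(app: str = "exo",
             env: Optional[Mapping[str, str]] = None,
             home: Optional[Path] = None) -> Dict[str, Path]:
    """Resolve per-application XDG directories.

    Assumption: the application uses a subdirectory named `app` under each
    XDG base directory. Unset, empty, or relative values fall back to the
    defaults, as the XDG specification requires.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    resolved = {}
    for kind, (var, default) in _XDG_DEFAULTS.items():
        value = env.get(var, "")
        if value and Path(value).is_absolute():
            base = Path(value)
        else:
            base = home / default
        resolved[kind] = base / app
    return resolved
```

Printing `xdg_dirs()` on a Linux node shows where to look for configuration, data, and cache files under the current user's environment.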

Automatic Device Discovery and Cluster Formation for Government

  • Understanding libp2p-based auto-discovery across local networks
  • Configuring custom namespaces with EXO_LIBP2P_NAMESPACE for cluster isolation
  • Verifying node membership in the dashboard cluster view
  • Handling discovery failures and network segmentation issues
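
Namespace-based isolation comes down to starting every node of one cluster with the same `EXO_LIBP2P_NAMESPACE` value. The sketch below prepares such an environment for a child process. The helper name `env_with_namespace` and the example value are hypothetical, and any naming rules EXO applies to namespaces are not covered here.

```python
import os
from typing import Dict, Mapping, Optional


def env_with_namespace(namespace: str,
                       base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with EXO_LIBP2P_NAMESPACE set.

    Assumption: nodes only discover peers that share the same namespace
    value, as the outline's "cluster isolation" wording suggests.
    """
    if not namespace.strip():
        raise ValueError("namespace must be a non-empty string")
    env = dict(os.environ if base_env is None else base_env)
    env["EXO_LIBP2P_NAMESPACE"] = namespace
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` together with whatever command starts EXO on the node, so that two clusters on the same network segment, for example one per department, do not merge.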

Enabling RDMA over Thunderbolt 5 for Government

  • RDMA architecture and the 99 percent latency reduction claim
  • Enabling RDMA in macOS Recovery mode with rdma_ctl
  • Cable requirements and port topology constraints on Mac Studio
  • Matching macOS versions across all cluster nodes
  • Troubleshooting RDMA discovery and DHCP configuration
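
Checking that macOS versions match across nodes is easy to automate once the version strings have been collected. This is a minimal sketch that flags nodes differing from the most common version; the function name `find_version_mismatches`, the node names, and the version strings in the example are hypothetical.

```python
from collections import Counter
from typing import Dict


def find_version_mismatches(versions: Dict[str, str]) -> Dict[str, str]:
    """Return the nodes whose OS version differs from the most common one.

    `versions` maps a node name to its reported OS version string.
    Versions are compared as exact strings, with no semantic ordering.
    """
    if not versions:
        return {}
    majority, _count = Counter(versions.values()).most_common(1)[0]
    return {node: v for node, v in versions.items() if v != majority}


# Hypothetical inventory: studio-3 lags behind the other two nodes.
print(find_version_mismatches(
    {"studio-1": "26.2", "studio-2": "26.2", "studio-3": "26.1"}
))
```

On each Mac, `platform.mac_ver()[0]` from the Python standard library returns the local macOS version string, which can feed this inventory.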

Deploying Frontier Models for Government

  • Using the dashboard to load and shard DeepSeek V3.1, Qwen3-235B, and Llama family models
  • Previewing instance placements with the /instance/previews API endpoint
  • Creating model instances with pipeline or tensor-parallel sharding
  • Configuring custom model cards from the Hugging Face Hub
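
Previewing placements before creating an instance can be scripted against the `/instance/previews` endpoint. The sketch below is hedged: the endpoint path and port come from the outline, but the `model_id` query parameter, the JSON response, and the helper names are assumptions that should be checked against the EXO API documentation for the installed version.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def previews_url(model_id: str, host: str = "localhost", port: int = 52415) -> str:
    """Build the preview URL for a model.

    Assumption: the endpoint accepts the model as a `model_id` query parameter.
    """
    query = urlencode({"model_id": model_id})
    return f"http://{host}:{port}/instance/previews?{query}"


def fetch_previews(model_id: str, host: str = "localhost",
                   port: int = 52415, timeout: float = 10.0):
    """Fetch and decode the preview response (assumed to be JSON)."""
    with urlopen(previews_url(model_id, host, port), timeout=timeout) as resp:
        return json.load(resp)
```

`fetch_previews` needs a running EXO node; `previews_url` alone is enough to produce a URL for use with curl or a browser.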

Monitoring and Troubleshooting for Government

  • Reading EXO logs and understanding distributed tracing
  • Interpreting cluster health in the dashboard cluster view
  • Diagnosing worker node failures and reconnection behavior
  • Using EXO_TRACING_ENABLED for performance bottleneck analysis

Cluster Maintenance and Updates for Government

  • Updating EXO binaries and rebuilding the dashboard
  • Migrating model caches and managing pre-downloaded models over NFS
  • Gracefully removing nodes and rebalancing workloads

Requirements

  • An understanding of networking fundamentals, including IP addressing, subnetting, and firewall configurations
  • Experience with command-line administration in macOS or Linux environments
  • Familiarity with Python package management (pip/uv) and Node.js development tools

Audience

  • System administrators responsible for maintaining secure and efficient IT infrastructure
  • DevOps engineers tasked with automating deployment and management processes
  • AI infrastructure architects focused on deploying large language models in on-premises settings

Duration: 21 Hours
