Course Outline
Introduction to EXO and Local AI Clustering for Government
- Overview of the EXO framework and the exo-explore ecosystem
- Comparing centralized cloud inference with distributed local inference
- Architecture: libp2p device discovery, MLX backend, dashboard, and API layers
- Hardware requirements: Apple Silicon (M3 Ultra, M4 Pro/Max), Thunderbolt 5, and shared storage
Installing EXO on macOS for Government
- Setting up Xcode, the Metal Toolchain, and other macOS prerequisites
- Installing uv, Node.js, and the Rust nightly toolchain
- Installing the pinned macmon fork for Apple Silicon monitoring
- Cloning the repository and building the dashboard with npm
- Running EXO from source and verifying the dashboard at localhost:52415
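
As a rough illustration of the verification step above, the following Python sketch checks only that something accepts TCP connections on the dashboard port named in the outline (52415). The function name `dashboard_port_open` is hypothetical and not part of EXO; a successful connection does not prove the dashboard itself is healthy.

```python
import socket


def dashboard_port_open(host: str = "127.0.0.1", port: int = 52415,
                        timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    Hypothetical helper: it only confirms that a listener exists on the port,
    not that the listener is the EXO dashboard.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("dashboard port reachable:", dashboard_port_open())
```

Opening http://localhost:52415 in a browser remains the more direct check; a probe like this is mainly useful in scripted post-install verification.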
Installing EXO on Linux for Government
- Installing dependencies via apt or Homebrew on Linux
- Configuring uv, Node.js 18+, and Rust nightly
- Building the dashboard and running EXO in CPU-only mode
- Directory layout: XDG Base Directory paths for config, data, cache, and logs
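
The directory-layout topic can be illustrated with a short Python sketch of how XDG Base Directory paths are resolved in general. The environment variables and fallbacks follow the XDG specification; the application subdirectory name `exo` and the helper `xdg_dirs` are assumptions for illustration, and where EXO actually writes its logs should be confirmed against its own documentation.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

APP_NAME = "exo"  # assumed subdirectory name, for illustration only


def xdg_dirs(app: str = APP_NAME,
             env: Optional[Mapping[str, str]] = None,
             home: Optional[Path] = None) -> dict:
    """Resolve per-application XDG base directories with spec-defined fallbacks."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    defaults = {
        "config": ("XDG_CONFIG_HOME", home / ".config"),
        "data": ("XDG_DATA_HOME", home / ".local" / "share"),
        "cache": ("XDG_CACHE_HOME", home / ".cache"),
        "state": ("XDG_STATE_HOME", home / ".local" / "state"),
    }
    resolved = {}
    for kind, (var, fallback) in defaults.items():
        value = env.get(var, "")
        # The spec says unset, empty, or relative values must be ignored.
        base = Path(value) if value and Path(value).is_absolute() else fallback
        resolved[kind] = base / app
    return resolved


if __name__ == "__main__":
    for kind, path in xdg_dirs().items():
        print(f"{kind:>6}: {path}")
```

Passing `env` and `home` explicitly keeps the function easy to test and makes it clear which overrides are in effect on a given node.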
Automatic Device Discovery and Cluster Formation for Government
- Understanding libp2p-based auto-discovery across local networks
- Configuring custom namespaces with EXO_LIBP2P_NAMESPACE for cluster isolation
- Verifying node membership in the dashboard cluster view
- Handling discovery failures and network segmentation issues
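
To make the namespace-isolation topic concrete, here is a minimal Python sketch that prepares a process environment with `EXO_LIBP2P_NAMESPACE` set. Only the variable name comes from the outline; the helper `namespaced_env`, the example namespace value, and the idea of passing the result to a launcher are assumptions.

```python
import os
from typing import Mapping, Optional


def namespaced_env(namespace: str,
                   base: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with EXO_LIBP2P_NAMESPACE set.

    Hypothetical helper: nodes started with the same namespace value are
    expected to discover each other, while other namespaces stay separate.
    """
    if not namespace or not namespace.strip():
        raise ValueError("namespace must be a non-empty string")
    env = dict(os.environ if base is None else base)
    env["EXO_LIBP2P_NAMESPACE"] = namespace.strip()
    return env


if __name__ == "__main__":
    env = namespaced_env("lab-a")  # "lab-a" is an illustrative value
    print("EXO_LIBP2P_NAMESPACE =", env["EXO_LIBP2P_NAMESPACE"])
    # The dict could then be passed as env=... to subprocess.run together
    # with whatever command starts EXO in a given installation.
```

Building the environment explicitly, rather than exporting the variable in a shell profile, makes it harder for a node to join the wrong cluster by accident.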
Enabling RDMA over Thunderbolt 5 for Government
- RDMA architecture and the claimed 99 percent latency reduction
- Enabling RDMA in macOS Recovery mode with rdma_ctl
- Cable requirements and port topology constraints on Mac Studio
- Matching macOS versions across all cluster nodes
- Troubleshooting RDMA discovery and DHCP configuration
Deploying Frontier Models for Government
- Using the dashboard to load and shard DeepSeek V3.1, Qwen3-235B, and Llama family models
- Previewing instance placements with the /instance/previews API endpoint
- Creating model instances with pipeline or tensor-parallel sharding
- Configuring custom model cards from the Hugging Face Hub
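
The placement-preview topic can be sketched as a small Python helper that builds a request URL for the `/instance/previews` endpoint on the dashboard port mentioned earlier. The query parameter name `model_id`, the example model identifier, and the helper `previews_url` are assumptions; the exact request shape should be checked against the EXO API documentation before use.

```python
from urllib.parse import urlencode


def previews_url(model_id: str, base: str = "http://localhost:52415") -> str:
    """Build a preview URL for a given model identifier.

    Hypothetical helper: the "model_id" parameter name is an assumption.
    """
    query = urlencode({"model_id": model_id})
    return f"{base.rstrip('/')}/instance/previews?{query}"


if __name__ == "__main__":
    # Illustrative model identifier, not taken from the course outline.
    print(previews_url("llama-3.2-1b"))
```

The resulting URL can be fetched with any HTTP client; reviewing the previewed placements before creating an instance helps when choosing between pipeline and tensor-parallel sharding.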
Monitoring and Troubleshooting for Government
- Reading EXO logs and understanding distributed tracing
- Interpreting cluster health in the dashboard cluster view
- Diagnosing worker node failures and reconnection behavior
- Using EXO_TRACING_ENABLED for performance bottleneck analysis
Cluster Maintenance and Updates for Government
- Updating EXO binaries and rebuilding the dashboard
- Migrating model caches and managing pre-downloaded models over NFS
- Gracefully removing nodes and rebalancing workloads
Requirements
- An understanding of networking fundamentals, including IP addressing, subnetting, and firewall configuration
- Experience with command-line administration in macOS or Linux environments
- Familiarity with Python package management (pip/uv) and Node.js development tools
Audience for government
- System administrators responsible for maintaining secure and efficient IT infrastructure
- DevOps engineers tasked with automating deployment and management processes
- AI infrastructure architects focused on deploying large language models in on-premises settings
21 Hours