Course Outline

Introduction to Multimodal Large Language Models in Vertex AI

  • Overview of multimodal capabilities within the Vertex AI platform
  • Overview of Gemini models and supported modalities
  • Enterprise and research use cases for government applications

Setting Up the Development Environment

  • Configuring Vertex AI for multimodal workflows
  • Working with datasets across modalities
  • Hands-on lab: environment setup and dataset preparation

Long Context Windows and Advanced Reasoning

  • Understanding long-context workflows
  • Use cases in planning and decision-making
  • Hands-on lab: implementing long-context analysis

Cross-Modal Workflow Design

  • Combining text, audio, and image analysis
  • Chaining multimodal steps in pipelines
  • Hands-on lab: designing a multimodal pipeline

Working with Gemini API Parameters

  • Configuring multimodal inputs and outputs
  • Optimizing inference and efficiency
  • Hands-on lab: tuning Gemini API parameters

Advanced Applications and Integrations

  • Interactive multimodal agents and assistants
  • Integrating external APIs and tools
  • Hands-on lab: building a multimodal application for government

Evaluation and Iteration

  • Testing multimodal performance
  • Metrics for accuracy, alignment, and drift
  • Hands-on lab: evaluating multimodal workflows

Summary and Next Steps

Requirements

  • Proficiency in Python programming
  • Experience developing machine learning models
  • Working knowledge of multimodal data, including text, audio, and image inputs

Audience

  • AI researchers
  • Senior software engineers
  • Machine learning scientists

This course is designed specifically for government agencies seeking advanced technical expertise.

Duration: 14 hours
