Course Outline

Introduction

  • What is GPU programming?
  • Why use GPU programming for government applications?
  • What are the challenges and trade-offs of GPU programming in a public sector context?
  • What are the frameworks and tools available for GPU programming?
  • Choosing the right framework and tool for your application to meet government requirements

OpenCL

  • What is OpenCL, and how does it support government initiatives?
  • What are the advantages and disadvantages of using OpenCL in a public sector environment?
  • Setting up the development environment for OpenCL to ensure compliance with government standards
  • Creating a basic OpenCL program that performs vector addition, tailored for government use cases
  • Using the OpenCL API to query device information, allocate and deallocate device memory, copy data between host and device, enqueue kernels, and synchronize work-items in a secure manner
  • Using the OpenCL C language to write kernels that execute on the device and manipulate data efficiently for government applications
  • Utilizing OpenCL built-in functions, variables, and libraries to perform common tasks and operations relevant to public sector workflows
  • Optimizing data transfers and memory accesses using OpenCL memory spaces such as global, local, constant, and private, ensuring optimal performance for government tasks
  • Controlling the work-items, work-groups, and ND-ranges that define parallelism, in line with public sector governance and accountability requirements
  • Debugging and testing OpenCL programs using tools such as CodeXL to ensure reliability and security for government operations
  • Optimizing OpenCL programs using techniques such as coalescing, caching, prefetching, and profiling to enhance performance in a public sector setting

CUDA

  • What is CUDA, and how does it support government applications?
  • What are the advantages and disadvantages of using CUDA in a public sector context?
  • Setting up the development environment for CUDA to meet government standards and security requirements
  • Creating a basic CUDA program that performs vector addition, tailored for government use cases
  • Using the CUDA API to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize threads in a secure manner
  • Writing kernels in the CUDA C/C++ language that execute on the device and manipulate data efficiently for government applications
  • Utilizing CUDA built-in functions, variables, and libraries to perform common tasks and operations relevant to public sector workflows
  • Optimizing data transfers and memory accesses using CUDA memory spaces such as global, shared, constant, and local, ensuring optimal performance for government tasks
  • Controlling the threads, blocks, and grids that define parallelism, in line with public sector governance and accountability requirements
  • Debugging and testing CUDA programs using tools such as CUDA-GDB, CUDA-MEMCHECK (superseded by Compute Sanitizer in recent CUDA releases), and NVIDIA Nsight to ensure reliability and security for government operations
  • Optimizing CUDA programs using techniques such as coalescing, caching, prefetching, and profiling to enhance performance in a public sector setting
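
As a preview of the vector-addition and thread/block/grid topics above, here is a minimal host-only C++ sketch. It is not CUDA code: it emulates on the CPU how a one-dimensional launch maps each block and thread to one element index. The function name `vector_add_emulated` and the `block_size` parameter are illustrative assumptions, not part of any framework API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side emulation of a 1-D GPU launch (illustrative only).
// Each (block, thread) pair computes one global index, in the same way a
// CUDA kernel would with blockIdx.x * blockDim.x + threadIdx.x.
std::vector<float> vector_add_emulated(const std::vector<float>& a,
                                       const std::vector<float>& b,
                                       std::size_t block_size) {
    assert(a.size() == b.size());
    assert(block_size > 0);
    const std::size_t n = a.size();
    std::vector<float> c(n);
    // Round up so the last, partially filled block is still "launched".
    const std::size_t grid_size = (n + block_size - 1) / block_size;
    for (std::size_t block = 0; block < grid_size; ++block) {
        for (std::size_t thread = 0; thread < block_size; ++thread) {
            const std::size_t i = block * block_size + thread;
            if (i < n) {  // bounds guard for the tail block
                c[i] = a[i] + b[i];
            }
        }
    }
    return c;
}
```

On a real device the two loops disappear: the runtime launches the blocks and threads in parallel, and only the index computation and the bounds guard remain in the kernel body.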

ROCm

  • What is ROCm, and how does it support government initiatives?
  • What are the advantages and disadvantages of using ROCm in a public sector environment?
  • Setting up the development environment for ROCm to ensure compliance with government standards
  • Creating a basic ROCm program that performs vector addition, tailored for government use cases
  • Using the ROCm API to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize threads in a secure manner
  • Writing kernels in C/C++ for the ROCm platform (for example, with HIP) that execute on the device and manipulate data efficiently for government applications
  • Utilizing ROCm built-in functions, variables, and libraries to perform common tasks and operations relevant to public sector workflows
  • Optimizing data transfers and memory accesses using ROCm memory spaces such as global, local, constant, and private, ensuring optimal performance for government tasks
  • Controlling the threads, blocks, and grids that define parallelism, in line with public sector governance and accountability requirements
  • Debugging and testing ROCm programs using tools such as the ROCm Debugger (ROCgdb) and the ROCm profiler (rocprof) to ensure reliability and security for government operations
  • Optimizing ROCm programs using techniques such as coalescing, caching, prefetching, and profiling to enhance performance in a public sector setting

HIP

  • What is HIP, and how does it support government applications?
  • What are the advantages and disadvantages of using HIP in a public sector context?
  • Setting up the development environment for HIP to meet government standards and security requirements
  • Creating a basic HIP program that performs vector addition, tailored for government use cases
  • Using the HIP language to write kernels that execute on the device and manipulate data efficiently for government applications
  • Utilizing HIP built-in functions, variables, and libraries to perform common tasks and operations relevant to public sector workflows
  • Optimizing data transfers and memory accesses using HIP memory spaces such as global, shared, constant, and local, ensuring optimal performance for government tasks
  • Controlling the threads, blocks, and grids that define parallelism, in line with public sector governance and accountability requirements
  • Debugging and testing HIP programs using tools such as the ROCm Debugger (ROCgdb) and the ROCm profiler (rocprof) to ensure reliability and security for government operations
  • Optimizing HIP programs using techniques such as coalescing, caching, prefetching, and profiling to enhance performance in a public sector setting

Comparison

  • Comparing the features, performance, and compatibility of OpenCL, CUDA, ROCm, and HIP for government applications
  • Evaluating GPU programs using benchmarks and metrics to ensure they meet government standards and requirements
  • Learning best practices and tips for GPU programming in a public sector context
  • Exploring current and future trends and challenges of GPU programming, particularly as they relate to government initiatives
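
One metric commonly used when evaluating memory-bound GPU kernels is effective bandwidth: bytes read plus bytes written, divided by elapsed time. The sketch below computes it for a vector addition over `float` elements, assuming the elapsed time was measured separately (for example, with the framework's own timing facilities). The function names are hypothetical and not taken from any of the frameworks covered.

```cpp
#include <cassert>
#include <cstddef>

// Effective bandwidth in GB/s: (bytes read + bytes written) / 1e9 / seconds.
double effective_bandwidth_gbps(std::size_t bytes_read,
                                std::size_t bytes_written,
                                double seconds) {
    assert(seconds > 0.0);
    const double total_bytes = static_cast<double>(bytes_read) +
                               static_cast<double>(bytes_written);
    return total_bytes / 1e9 / seconds;
}

// Vector addition c[i] = a[i] + b[i] over n floats moves two reads and
// one write per element.
double vector_add_bandwidth_gbps(std::size_t n, double seconds) {
    const std::size_t elem = sizeof(float);
    return effective_bandwidth_gbps(2 * n * elem, n * elem, seconds);
}
```

Comparing this figure with the device's theoretical peak bandwidth gives a rough indication of how much headroom remains for memory-access optimizations.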

Summary and Next Steps

Requirements

  • An understanding of the C/C++ language and parallel programming concepts
  • Basic knowledge of computer architecture and memory hierarchy
  • Experience with command-line tools and code editors

Audience

  • Developers who seek to acquire foundational skills in GPU programming, including the primary frameworks and tools for developing GPU applications for government
  • Developers aiming to write portable and scalable code that can function across various platforms and devices
  • Programmers interested in exploring the advantages and challenges of GPU programming and optimization

Duration

  • 21 hours
