Course Outline
Introduction
- What is GPU programming?
- Why use GPU programming for government applications?
- What are the challenges and trade-offs of GPU programming for government operations?
- What are the frameworks for GPU programming in federal agencies?
- Choosing the right framework for your application within a government context
OpenCL
- What is OpenCL and its role in government computing environments?
- What are the advantages and disadvantages of OpenCL for government applications?
- Setting up the development environment for OpenCL in a federal IT setting
- Creating a basic OpenCL program that performs vector addition, suitable for government use cases
- Using the OpenCL API to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize work-items in a secure manner for government operations
- Writing kernels using the OpenCL C language that execute on the device and manipulate data efficiently for government tasks
- Utilizing OpenCL built-in functions, variables, and libraries to perform common tasks and operations relevant to public sector needs
- Optimizing data transfers and memory accesses using OpenCL memory spaces such as global, local, constant, and private for government applications
- Controlling the work-items, work-groups, and ND-ranges that define parallelism in OpenCL to meet government performance requirements
- Debugging and testing OpenCL programs using tools such as CodeXL to ensure reliability and security for government use
- Optimizing OpenCL programs using techniques such as coalescing, caching, prefetching, and profiling to enhance performance for government operations
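
The work-item, work-group, and ND-range topic above can be illustrated with a small host-side sketch in plain C++ (no OpenCL headers required). It assumes a one-dimensional ND-range with zero global offset; the helper names `padded_global_size` and `decompose` are hypothetical and not part of the OpenCL API. The first helper rounds the problem size up to a multiple of the work-group size, which OpenCL 1.x requires of the global size. The second shows how a global ID splits into a group ID and a local ID.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: round n up to the next multiple of local_size so the
// global work size is evenly divisible by the work-group size.
std::size_t padded_global_size(std::size_t n, std::size_t local_size) {
    assert(local_size > 0);
    return ((n + local_size - 1) / local_size) * local_size;
}

// Group ID and local ID of one work-item in a 1-D ND-range.
struct WorkItemId {
    std::size_t group;
    std::size_t local;
};

// Hypothetical helper: with zero global offset, the global ID equals
// group * local_size + local, so division and remainder recover both parts.
WorkItemId decompose(std::size_t global_id, std::size_t local_size) {
    assert(local_size > 0);
    return {global_id / local_size, global_id % local_size};
}
```

For example, 1000 elements with a work-group size of 256 give a padded global size of 1024, so a kernel launched this way needs a bounds check to skip the 24 extra work-items.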
CUDA
- What is CUDA and its relevance to government computing?
- What are the advantages and disadvantages of CUDA for government applications?
- Setting up the development environment for CUDA in a federal IT infrastructure
- Creating a basic CUDA program that performs vector addition, tailored to government requirements
- Using the CUDA API to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize threads in a secure environment for government operations
- Writing kernels using the CUDA C/C++ language that execute on the device and manipulate data effectively for government tasks
- Utilizing CUDA built-in functions, variables, and libraries to perform common tasks and operations relevant to public sector needs
- Optimizing data transfers and memory accesses using CUDA memory spaces such as global, shared, constant, and local for government applications
- Controlling the threads, blocks, and grids that define parallelism in CUDA to meet government performance requirements
- Debugging and testing CUDA programs using tools such as CUDA-GDB, Compute Sanitizer (the successor to CUDA-MEMCHECK), and NVIDIA Nsight to ensure reliability and security for government use
- Optimizing CUDA programs using techniques such as coalescing, caching, prefetching, and profiling to enhance performance for government operations
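
The thread, block, and grid indexing behind the vector-addition exercise above can be sketched on the CPU in plain C++. This is a serial emulation under stated assumptions, not CUDA code: the function name `vector_add_emulated` is hypothetical, and the two loops stand in for the `blockIdx.x` and `threadIdx.x` values that a one-dimensional launch would supply in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial emulation of a 1-D vector-addition launch (hypothetical helper).
// block_dim plays the role of blockDim.x; the grid size is derived from n.
std::vector<float> vector_add_emulated(const std::vector<float>& a,
                                       const std::vector<float>& b,
                                       std::size_t block_dim) {
    assert(a.size() == b.size() && block_dim > 0);
    const std::size_t n = a.size();
    // Ceiling division: enough blocks to cover all n elements.
    const std::size_t grid_dim = (n + block_dim - 1) / block_dim;
    std::vector<float> c(n);
    for (std::size_t block_idx = 0; block_idx < grid_dim; ++block_idx) {
        for (std::size_t thread_idx = 0; thread_idx < block_dim; ++thread_idx) {
            // Same arithmetic as blockIdx.x * blockDim.x + threadIdx.x.
            const std::size_t i = block_idx * block_dim + thread_idx;
            // Bounds guard: the last block may extend past the end of the data.
            if (i < n) {
                c[i] = a[i] + b[i];
            }
        }
    }
    return c;
}
```

With five elements and a block size of 4, the grid has two blocks, and the guard keeps the three surplus threads of the second block from writing out of range.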
ROCm
- What is ROCm and its significance in government computing environments?
- What are the advantages and disadvantages of ROCm for government applications?
- Setting up the development environment for ROCm in a federal IT setting
- Creating a basic ROCm program, written in HIP, that performs vector addition, suitable for government use cases
- Using the HIP runtime API provided by ROCm to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize threads in a secure manner for government operations
- Writing kernels in HIP C++, the kernel language of ROCm, that execute on the device and manipulate data efficiently for government tasks
- Utilizing ROCm built-in functions, variables, and libraries to perform common tasks and operations relevant to public sector needs
- Optimizing data transfers and memory accesses using ROCm memory spaces such as global, shared (LDS), constant, and private for government applications
- Controlling the threads, blocks, and grids that define parallelism in ROCm to meet government performance requirements
- Debugging and testing ROCm programs using tools such as the ROCm Debugger (ROCgdb) and the ROCm Profiler (rocprof) to ensure reliability and security for government use
- Optimizing ROCm programs using techniques such as coalescing, caching, prefetching, and profiling to enhance performance for government operations
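
Coalescing, named in the optimization topics above, can be illustrated with a deliberately simplified model in plain C++. The sketch counts how many aligned memory segments a group of adjacent threads touches when each thread reads one element at a given stride. The function name `segments_touched` is hypothetical, and the 32-lane group and 128-byte segment used in the example are illustrative assumptions; real hardware differs by vendor and generation.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Simplified coalescing model (hypothetical helper): lane k reads the element
// at index k * stride_elems. Returns the number of distinct aligned segments
// of segment_bytes that the whole group touches. Fewer segments means the
// accesses coalesce better.
std::size_t segments_touched(std::size_t lanes, std::size_t stride_elems,
                             std::size_t elem_bytes, std::size_t segment_bytes) {
    assert(segment_bytes > 0);
    std::set<std::size_t> segments;
    for (std::size_t lane = 0; lane < lanes; ++lane) {
        const std::size_t byte_address = lane * stride_elems * elem_bytes;
        segments.insert(byte_address / segment_bytes);
    }
    return segments.size();
}
```

Under these assumptions, 32 lanes reading consecutive 4-byte floats fall into a single 128-byte segment, whereas a stride of 32 elements places every lane in its own segment.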
Comparison
- Comparing the features, performance, and compatibility of OpenCL, CUDA, and ROCm in the context of government computing
- Evaluating GPU programs using benchmarks and metrics relevant to public sector operations
- Learning best practices and tips for GPU programming within a government framework
- Exploring current and future trends and challenges of GPU programming for government applications
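
One metric often used when benchmarking a memory-bound kernel such as vector addition is effective bandwidth: the bytes moved divided by the elapsed time. The sketch below is a minimal illustration in plain C++. The function name `effective_bandwidth_gbps` is hypothetical, the elapsed time is assumed to come from whatever timer the framework under test provides, and the factor of three reflects the two reads and one write per element in c[i] = a[i] + b[i].

```cpp
#include <cassert>
#include <cstddef>

// Effective bandwidth of a vector addition in GB/s (hypothetical helper).
// n is the element count, elem_bytes the size of one element, and seconds
// the measured kernel time.
double effective_bandwidth_gbps(std::size_t n, std::size_t elem_bytes,
                                double seconds) {
    assert(seconds > 0.0);
    // Two input reads plus one output write per element.
    const double bytes_moved = 3.0 * static_cast<double>(n) *
                               static_cast<double>(elem_bytes);
    return bytes_moved / seconds / 1e9;
}
```

Comparing this figure across OpenCL, CUDA, and ROCm builds of the same kernel, and against the device's published peak bandwidth, gives a like-for-like view that raw run times alone do not.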
Summary and Next Steps
Requirements
- Proficiency in C/C++ language and parallel programming concepts
- Fundamental knowledge of computer architecture and memory hierarchy
- Experience with command-line tools and code editors
Audience
- Developers interested in learning to utilize various frameworks for GPU programming, comparing their features, performance, and compatibility
- Developers aiming to write portable and scalable code that can function across different platforms and devices
- Programmers seeking to understand the trade-offs and challenges associated with GPU programming and optimization
Duration
- 28 hours