Course Outline
Introduction
- What is GPU programming?
- Why use GPU programming for government applications?
- What are the challenges and trade-offs of GPU programming in the public sector?
- What are the frameworks available for GPU programming?
- Choosing the right framework for your application to meet government standards
OpenCL
- What is OpenCL and its relevance to public sector computing?
- What are the advantages and disadvantages of using OpenCL in government applications?
- Setting up the development environment for OpenCL to support government workflows
- Creating a basic OpenCL program that performs vector addition, suitable for government use cases
- Using the OpenCL API to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize work-items in a secure manner
- Writing kernels using the OpenCL C language that execute on the device and manipulate data efficiently for government tasks
- Utilizing OpenCL built-in functions, variables, and libraries to perform common tasks and operations relevant to public sector applications
- Optimizing data transfers and memory accesses using OpenCL memory spaces such as global, local, constant, and private to improve efficiency in government workloads
- Controlling the work-items, work-groups, and ND-ranges that define parallelism in a way that aligns with government requirements
- Debugging and testing OpenCL programs using tools such as CodeXL to ensure compliance with public sector standards
- Optimizing OpenCL programs for performance using techniques such as coalescing, caching, prefetching, and profiling to meet government performance benchmarks
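As a rough illustration of the work-item, work-group, and ND-range topics above, the following host-only C++ sketch emulates a one-dimensional vector addition without calling the OpenCL API. The function names `padded_global_size` and `emulate_vector_add` are hypothetical, and the sketch assumes a zero global offset and a global size rounded up to a multiple of the work-group size, which is the usual way to satisfy runtimes that require uniform work-groups.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round the problem size up to the next multiple of the work-group size.
// (Hypothetical helper; real code would pass the result as the global work size.)
std::size_t padded_global_size(std::size_t n, std::size_t local_size) {
    assert(local_size > 0);
    return ((n + local_size - 1) / local_size) * local_size;
}

// Host-side emulation of a vector-add kernel over a 1-D ND-range.
// Each (group, local) pair stands in for one work-item.
std::vector<float> emulate_vector_add(const std::vector<float>& a,
                                      const std::vector<float>& b,
                                      std::size_t local_size) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    const std::size_t global_size = padded_global_size(n, local_size);
    const std::size_t num_groups = global_size / local_size;

    std::vector<float> c(n, 0.0f);
    for (std::size_t group = 0; group < num_groups; ++group) {
        for (std::size_t local = 0; local < local_size; ++local) {
            // Corresponds to get_global_id(0) with a zero global offset.
            const std::size_t gid = group * local_size + local;
            // Bounds guard: padding work-items past the end do nothing.
            if (gid < n) {
                c[gid] = a[gid] + b[gid];
            }
        }
    }
    return c;
}
```

For example, 1000 elements with a work-group size of 64 give a padded global size of 1024, so the last 24 work-items fall through the bounds guard.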
CUDA
- What is CUDA and its role in public sector computing?
- What are the advantages and disadvantages of using CUDA for government applications?
- Setting up the development environment for CUDA to support government workflows
- Creating a basic CUDA program that performs vector addition, suitable for government use cases
- Using the CUDA API to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize threads in a secure manner
- Writing kernels using the CUDA C/C++ language that execute on the device and manipulate data efficiently for government tasks
- Utilizing CUDA built-in functions, variables, and libraries to perform common tasks and operations relevant to public sector applications
- Optimizing data transfers and memory accesses using CUDA memory spaces such as global, shared, constant, and local to improve efficiency in government workloads
- Controlling the threads, blocks, and grids that define parallelism in a way that aligns with government requirements
- Debugging and testing CUDA programs using tools such as CUDA-GDB, CUDA-MEMCHECK (superseded by Compute Sanitizer in recent CUDA toolkits), and NVIDIA Nsight to ensure compliance with public sector standards
- Optimizing CUDA programs for performance using techniques such as coalescing, caching, prefetching, and profiling to meet government performance benchmarks
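To make the thread, block, and shared-memory topics above more concrete, here is a host-only C++ sketch that emulates a per-block tree reduction, a pattern commonly written with a `__shared__` array and `__syncthreads()` in CUDA. It is not CUDA code and does not run on a GPU. The function name `emulate_block_reduce` is hypothetical, and the sketch assumes a power-of-two block size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side emulation of a block-level sum reduction.
// Returns one partial sum per block, as a kernel would write to global memory.
std::vector<float> emulate_block_reduce(const std::vector<float>& in,
                                        std::size_t block_dim) {
    // Assumption: block size is a power of two so the stride halves cleanly.
    assert(block_dim > 0 && (block_dim & (block_dim - 1)) == 0);
    const std::size_t grid_dim = (in.size() + block_dim - 1) / block_dim;

    std::vector<float> partial(grid_dim, 0.0f);
    for (std::size_t block = 0; block < grid_dim; ++block) {
        // Stands in for a __shared__ array private to this block.
        std::vector<float> shared(block_dim, 0.0f);
        for (std::size_t tid = 0; tid < block_dim; ++tid) {
            const std::size_t idx = block * block_dim + tid;
            shared[tid] = (idx < in.size()) ? in[idx] : 0.0f;
        }
        // On a GPU a barrier would separate the load from the first step.
        for (std::size_t stride = block_dim / 2; stride > 0; stride /= 2) {
            for (std::size_t tid = 0; tid < stride; ++tid) {
                shared[tid] += shared[tid + stride];
            }
            // On a GPU a barrier would follow each halving step.
        }
        partial[block] = shared[0];
    }
    return partial;
}
```

With the values 1 through 10 and a block size of 4, the sketch produces three partial sums (10, 26, and 19), which a second pass or the host would then add together.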
ROCm
- What is ROCm and its relevance to public sector computing?
- What are the advantages and disadvantages of using ROCm for government applications?
- Setting up the development environment for ROCm to support government workflows
- Creating a basic ROCm program that performs vector addition, suitable for government use cases
- Using the ROCm API to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize threads in a secure manner
- Writing kernels using HIP C++, the C++ dialect used on ROCm, that execute on the device and manipulate data efficiently for government tasks
- Utilizing ROCm built-in functions, variables, and libraries to perform common tasks and operations relevant to public sector applications
- Optimizing data transfers and memory accesses using ROCm memory spaces such as global, local, constant, and private to improve efficiency in government workloads
- Controlling the threads, blocks, and grids that define parallelism in a way that aligns with government requirements
- Debugging and testing ROCm programs using tools such as ROCm Debugger (ROCgdb) and ROCm Profiler (rocprof) to ensure compliance with public sector standards
- Optimizing ROCm programs for performance using techniques such as coalescing, caching, prefetching, and profiling to meet government performance benchmarks
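The coalescing topic above can be illustrated with a small host-only C++ model that counts how many memory segments one wavefront touches for a given access stride. The function name `segments_touched` is hypothetical. The 64-lane wavefront and 64-byte segment used in the example are assumptions chosen for illustration, not figures for any particular GPU, and the model assumes aligned elements that never straddle a segment boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Count the distinct memory segments touched when each lane of a wavefront
// reads one element at index lane * stride_elems.
// Fewer segments means the accesses coalesce into fewer memory transactions.
std::size_t segments_touched(std::size_t lanes,
                             std::size_t stride_elems,
                             std::size_t elem_bytes,
                             std::size_t segment_bytes) {
    assert(segment_bytes > 0);
    std::set<std::size_t> segments;
    for (std::size_t lane = 0; lane < lanes; ++lane) {
        const std::size_t byte_addr = lane * stride_elems * elem_bytes;
        segments.insert(byte_addr / segment_bytes);
    }
    return segments.size();
}
```

Under these assumptions, 64 lanes reading consecutive 4-byte floats touch 4 segments, while a stride of 16 elements spreads the same 64 reads over 64 segments, which is the kind of access pattern that coalescing-oriented optimization tries to avoid.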
Comparison
- Comparing the features, performance, and compatibility of OpenCL, CUDA, and ROCm in the context of government applications
- Evaluating GPU programs using benchmarks and metrics to ensure they meet public sector standards
- Learning best practices and tips for GPU programming that are tailored for government use
- Exploring the current and future trends and challenges of GPU programming for government applications
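One metric often used when evaluating GPU programs across frameworks is effective bandwidth: bytes read plus bytes written, divided by elapsed time. The sketch below computes it for a vector addition of `float` values. The function names are hypothetical, and the elapsed time is assumed to come from whatever timer the chosen framework provides.

```cpp
#include <cassert>
#include <cstddef>

// Bytes moved by c[i] = a[i] + b[i] over n floats: two reads and one write.
std::size_t vector_add_bytes_moved(std::size_t n) {
    return 3 * n * sizeof(float);
}

// Effective bandwidth in GB/s (1 GB = 1e9 bytes) for a measured run.
double effective_bandwidth_gb_per_s(std::size_t bytes_moved, double seconds) {
    assert(seconds > 0.0);
    return static_cast<double>(bytes_moved) / 1e9 / seconds;
}
```

For instance, adding one million floats moves 12,000,000 bytes; if that took one millisecond, the effective bandwidth would be about 12 GB/s, a number that can be compared with the device's theoretical peak regardless of which framework produced it.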
Summary and Next Steps
Requirements
- A solid understanding of the C/C++ language and parallel programming concepts
- Fundamental knowledge of computer architecture and memory hierarchy
- Practical experience with command-line tools and code editors
Audience
- Developers who want to learn how to use different GPU programming frameworks, compare their features, performance, and compatibility, and apply that knowledge to projects in government and other sectors
- Developers aiming to write portable and scalable code that can effectively run on multiple platforms and devices, enhancing efficiency for government applications
- Programmers who want to explore the trade-offs and challenges of GPU programming and optimization, particularly where government operations demand high performance and reliability
28 Hours