GPU Programming with CUDA Training Course

CUDA is NVIDIA's proprietary parallel computing platform and programming model for running general-purpose code on NVIDIA GPUs, which are widely used for high-performance computing, artificial intelligence (AI), gaming, and graphics. CUDA exposes hardware details to the programmer and gives fine-grained control over how work is parallelized. Using that control well, however, requires a good understanding of the device architecture, memory model, execution model, and optimization techniques.

This instructor-led, live training (online or onsite) is aimed at beginner-level to intermediate-level developers who wish to use CUDA to program NVIDIA GPUs and exploit their parallelism.

By the end of this training, participants will be able to:

Set up a development environment that includes the CUDA Toolkit, an NVIDIA GPU, and Visual Studio Code.
Create a basic CUDA program that performs vector addition on the GPU and retrieves the results from the GPU memory.
Use the CUDA API to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize threads.
Use the CUDA C/C++ language to write kernels that execute on the GPU and manipulate data.
Use CUDA built-in functions, variables, and libraries to perform common tasks and operations.
Use CUDA memory spaces, such as global, shared, constant, and local, to optimize data transfers and memory accesses.
Use the CUDA execution model to control the threads, blocks, and grids that define the parallelism.
Debug and test CUDA programs using tools such as CUDA-GDB, CUDA-MEMCHECK (superseded by Compute Sanitizer in recent CUDA Toolkit releases), and NVIDIA Nsight.
Optimize CUDA programs using techniques such as coalescing, caching, prefetching, and profiling.
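
To give a sense of what the first few objectives involve, here is a minimal sketch of GPU vector addition. It is not taken from the course materials; the names vectorAdd, runVectorAdd, and CUDA_CHECK, the block size of 256, and the input values are assumptions chosen for illustration. It uses only standard CUDA runtime calls: cudaGetDeviceProperties, cudaMalloc, cudaMemcpy, cudaGetLastError, cudaDeviceSynchronize, and cudaFree.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                                \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            std::fprintf(stderr, "CUDA error: %s (%s:%d)\n",            \
                         cudaGetErrorString(err_), __FILE__, __LINE__); \
            std::exit(EXIT_FAILURE);                                    \
        }                                                               \
    } while (0)

// Kernel: each thread adds one pair of elements.
__global__ void vectorAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) {                                    // guard the last, partial block
        c[i] = a[i] + b[i];
    }
}

// Adds two n-element vectors on the GPU; returns true if every result matches.
bool runVectorAdd(int n) {
    std::vector<float> hA(n), hB(n), hC(n);
    for (int i = 0; i < n; ++i) {
        hA[i] = static_cast<float>(i);
        hB[i] = static_cast<float>(2 * i);
    }

    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;

    // Allocate device memory and copy the inputs from host to device.
    CUDA_CHECK(cudaMalloc(&dA, bytes));
    CUDA_CHECK(cudaMalloc(&dB, bytes));
    CUDA_CHECK(cudaMalloc(&dC, bytes));
    CUDA_CHECK(cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice));

    // Launch enough blocks of 256 threads to cover all n elements.
    const int threads = 256;  // assumed block size, not tuned
    const int blocks = (n + threads - 1) / threads;
    vectorAdd<<<blocks, threads>>>(dA, dB, dC, n);
    CUDA_CHECK(cudaGetLastError());       // catches launch-configuration errors
    CUDA_CHECK(cudaDeviceSynchronize());  // waits for the kernel to finish

    // Copy the result back to the host and release device memory.
    CUDA_CHECK(cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(dA));
    CUDA_CHECK(cudaFree(dB));
    CUDA_CHECK(cudaFree(dC));

    for (int i = 0; i < n; ++i) {
        if (hC[i] != hA[i] + hB[i]) return false;
    }
    return true;
}

int main() {
    // Query and print basic information about device 0.
    cudaDeviceProp prop;
    CUDA_CHECK(cudaGetDeviceProperties(&prop, 0));
    std::printf("Device 0: %s (compute capability %d.%d)\n",
                prop.name, prop.major, prop.minor);

    const bool ok = runVectorAdd(1 << 20);
    std::printf("Vector addition %s\n", ok ? "matched" : "MISMATCHED");
    return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Assuming the CUDA Toolkit is installed and the file is saved as vector_add.cu (a hypothetical name), it can be built with nvcc vector_add.cu -o vector_add. The bounds check inside the kernel matters because the grid is rounded up to a whole number of blocks, so some threads in the last block may fall past the end of the arrays.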

Format of the Course

Interactive lecture and discussion.
Lots of exercises and practice.
Hands-on implementation in a live-lab environment.

Course Customization Options

To request customized training for this course, please contact us.

This course is available as onsite live training in Australia or online live training.

Provisional Upcoming Courses (Require 5+ participants)

Course | Start | Duration | Venue | Online price | Classroom price
GPU Programming with CUDA | 2026-08-20 09:30 | 28 hours | Adelaide City Central | 10000 AUD | 23720 AUD
GPU Programming with CUDA | 2026-09-03 09:30 | 28 hours | Melbourne 385 Bourke Street | 10000 AUD | 23720 AUD
GPU Programming with CUDA | 2026-09-17 09:30 | 28 hours | Cliftons Perth | 10000 AUD | 24600 AUD
GPU Programming with CUDA | 2026-10-01 09:30 | 28 hours | Sydney King Street | 10000 AUD | 23720 AUD
GPU Programming with CUDA | 2026-10-15 09:30 | 28 hours | London Circuit | 10000 AUD | 23720 AUD

Related Courses

Developing AI Applications with Huawei Ascend and CANN

Deploying AI Models with CANN and Ascend AI Processors

AI Inference and Deployment with CloudMatrix

GPU Programming on Biren AI Accelerators

Cambricon MLU Development with BANGPy and Neuware

Introduction to CANN for AI Framework Developers

CANN for Edge AI Deployment

Understanding Huawei’s AI Compute Stack: From CANN to MindSpore

Optimizing Neural Network Performance with CANN SDK

CANN SDK for Computer Vision and NLP Pipelines

Building Custom AI Operators with CANN TIK and TVM

Migrating CUDA Applications to Chinese GPU Architectures

Performance Optimization on Ascend, Biren, and Cambricon