Course Outline

Overview of the Indigenous Chinese AI GPU Ecosystem

  • Comparison of Huawei Ascend, Biren, and Cambricon MLU
  • How the CUDA programming model compares with CANN, the Biren SDK, and BANGPy
  • Industry trends and vendor ecosystems

Preparing for Migration

  • Assessing your existing CUDA codebase
  • Identifying target platforms and SDK versions
  • Toolchain installation and environment setup

Code Translation Techniques

  • Translating CUDA memory access and kernel logic
  • Mapping compute grid and thread models
  • Options for automated versus manual translation
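
As a small illustration of the grid and thread mapping topic, the sketch below emulates in plain Python how a one-dimensional CUDA launch assigns a global index to each thread (`blockIdx.x * blockDim.x + threadIdx.x`) and how many blocks are needed to cover an array. The function names are illustrative only and do not come from any vendor SDK; how each target platform expresses the equivalent decomposition is covered in the course itself.

```python
def global_thread_index(block_idx: int, block_dim: int, thread_idx: int) -> int:
    """Flat 1-D index, mirroring CUDA's blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


def blocks_needed(n_elements: int, block_dim: int) -> int:
    """Ceiling division: number of blocks required to cover n_elements."""
    return (n_elements + block_dim - 1) // block_dim


def emulate_launch(n_elements: int, block_dim: int) -> list[int]:
    """Visit every (block, thread) pair and collect the in-range indices."""
    covered = []
    for b in range(blocks_needed(n_elements, block_dim)):
        for t in range(block_dim):
            i = global_thread_index(b, block_dim, t)
            if i < n_elements:  # bounds guard, as a CUDA kernel would use
                covered.append(i)
    return covered
```

When porting, the same two quantities (work items per group and number of groups) usually have to be re-derived for the target's own execution model rather than copied over unchanged.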

Platform-Specific Implementations

  • Utilising Huawei CANN operators and custom kernels
  • The Biren SDK conversion pipeline
  • Rebuilding models with BANGPy (Cambricon)

Cross-Platform Testing and Optimisation

  • Profiling execution on each target platform
  • Memory tuning and parallel execution comparisons
  • Performance tracking and iteration

Managing Mixed GPU Environments

  • Hybrid deployments involving multiple architectures
  • Fallback strategies and device detection
  • Abstraction layers to ensure code maintainability
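
A minimal sketch of the fallback and device detection idea, assuming each backend is exposed through an importable Python module: the helper walks a preference list and returns the first backend whose module is installed, falling back to CPU otherwise. The backend labels and module names in the usage note are placeholders, not real package names, and `pick_backend` is a hypothetical helper rather than part of any vendor toolkit.

```python
import importlib.util
from typing import Sequence, Tuple


def pick_backend(candidates: Sequence[Tuple[str, str]], fallback: str = "cpu") -> str:
    """Return the label of the first candidate whose module can be found.

    candidates: (label, top_level_module_name) pairs in order of preference.
    The module is only located, not imported, so probing has no side effects.
    """
    for label, module_name in candidates:
        if importlib.util.find_spec(module_name) is not None:
            return label
    return fallback
```

For example, `pick_backend([("ascend", "hypothetical_ascend_rt"), ("cambricon", "hypothetical_mlu_rt")])` returns `"cpu"` on a machine where neither placeholder module is installed. Finding a module only shows that the software stack is present; a real deployment would also need to confirm that a device is attached and usable.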

Case Studies and Best Practices

  • Translating vision and NLP models to Ascend or Cambricon
  • Adapting inference pipelines for Biren clusters
  • Handling version mismatches and API gaps

Summary and Next Steps

Requirements

  • Experience programming with CUDA or developing GPU-based applications
  • Understanding of GPU memory models and compute kernels
  • Familiarity with AI model deployment or acceleration workflows

Target Audience

  • GPU programmers
  • System architects
  • Porting specialists

Duration: 21 Hours
