NVIDIA

Deep Learning Compiler Engineer - CUDA

Onsite, mid-level, full-time, Shanghai

First indexed 29 May 2026

Description

We are now looking for a cuTile Core Compiler Architect to join our group. The NVIDIA Architecture group is looking for world-class architects and engineers to join and lead our various architecture efforts. A key part of NVIDIA's strength is to innovate in the graphics and parallel computing fields, delivering the highest performance in the world for parallel processing algorithms.

What you'll be doing:

  • Design and implement the DSL and the core compiler for a tile-aware GPU programming model targeting emerging GPU architectures
  • Continuously innovate and iterate on the core architecture of the compiler to consistently optimize performance
  • Investigate next-generation GPU architectures and provide solutions in the DSL and compiler stack
  • Analyze performance on emerging AI/LLM workloads and integrate with AI/ML frameworks

Requirements:

  • Master's or PhD, or equivalent experience, in a relevant discipline (CE, CS&E, CS, AI)
  • 2+ years of relevant work experience
  • Excellent C/C++ programming and software engineering skills; an ACM background is a plus
  • Good fundamental knowledge of computer architecture
  • Strong ability to abstract problems and a methodical approach to solving them
  • A strong compiler background, including MLIR/TVM/Triton/LLVM, is desired
  • Good knowledge of GPU architecture and fast kernel programming skills are a plus
  • Knowledge of LLM algorithms or a specific HPC domain is a plus
  • Knowledge of multi-GPU distributed communication is a plus
  • Excellent oral communication in English is a plus

This listing is enriched and indexed by YubHub. To apply, use the employer's original posting: https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/China-Shanghai/Deep-Learning-Compiler-Engineer---CUDA_JR2010731