Description
We are seeking engineers to develop algorithms and optimizations for our LPX inference and compiler stack. You will work at the intersection of large-scale systems, compilers, and deep learning, shaping how neural network workloads map onto future NVIDIA platforms.
Key responsibilities include:
- Designing, building, and maintaining high-performance runtime and compiler components, focusing on end-to-end inference optimization.
- Defining and implementing mappings of large-scale inference workloads onto NVIDIA's systems.
- Extending and integrating with NVIDIA's software ecosystem, contributing to libraries, tooling, and interfaces that enable seamless deployment of models across platforms.
- Benchmarking, profiling, and monitoring key performance and efficiency metrics to ensure the compiler generates efficient mappings of neural network graphs to our inference hardware.
- Collaborating closely with hardware architects and design teams to feed software observations back to them, influence future architectures, and co-design features that unlock new performance and efficiency points.
- Prototyping and evaluating new compilation and runtime techniques, including graph transformations, scheduling strategies, and memory/layout optimizations tailored to spatial processors.
- Publishing and presenting technical work on novel compilation approaches for inference and related spatial accelerators at top-tier ML, compiler, and computer architecture venues.
Ideal candidates will have direct experience with MLIR-based compilers or other multilevel IR stacks, especially in the context of graph-based deep learning workloads.
To apply, use the employer's original posting:
https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/Canada-Toronto/Machine-Learning-Applications-and-Compiler-Engineer--LPX---New-College-Grad-2026_JR2016937