Description
Apply now to become a Machine Learning Applications and Compiler Engineer at NVIDIA. We're seeking engineers to develop algorithms and optimizations for our LPX inference and compiler stack. You will work at the intersection of large-scale systems, compilers, and deep learning, crafting how neural network workloads map onto future NVIDIA platforms.
Key responsibilities include:
- Developing and maintaining high-performance runtime and compiler components, with a focus on end-to-end inference optimization.
- Defining and implementing mappings of large-scale inference workloads onto NVIDIA's systems.
- Extending and integrating with NVIDIA's software ecosystem, contributing to libraries, tooling, and interfaces that enable seamless deployment of models across platforms.
- Benchmarking, profiling, and monitoring key performance and efficiency metrics to ensure the compiler generates efficient mappings of neural network graphs to our inference hardware.
- Collaborating closely with hardware architects and design teams to feed back software observations, influence future architectures, and co-design features that unlock new performance and efficiency points.
- Prototyping and evaluating new compilation and runtime techniques, including graph transformations, scheduling strategies, and memory/layout optimizations tailored to spatial processors.
Ideal candidates will have a strong background in software engineering, systems-level programming, and compiler development. Experience with LLVM, MLIR, and deep learning frameworks such as TensorFlow and PyTorch is highly desirable. Strong analytical and debugging skills, with experience using profiling, tracing, and benchmarking tools to drive performance improvements, are essential.
As a Machine Learning Applications and Compiler Engineer at NVIDIA, you will have the opportunity to work on cutting-edge projects, collaborate with talented engineers, and contribute to the development of innovative AI and machine learning solutions.