Description
We are seeking a Senior Software Engineer to join our team working on the CUDA driver, a key component of accelerated GPU computing. As a Senior Software Engineer, you will work with a versatile software engineering team that delivers innovative software features to unlock the full potential and performance of NVIDIA hardware across diverse workloads like deep learning, scientific research, autonomous vehicles, gaming, and virtual reality.
Your system-level expertise and creativity in solving complex problems will help invent the future of CUDA and NVIDIA's compute technologies. You will collaborate with hardware architects, deep learning specialists, and both internal and external partners to advance the CUDA architecture. With the opportunity to collaborate with teams across the whole NVIDIA computing stack, you will help design software solutions across kernel mode components, compilers, and networking software.
Key responsibilities include evangelizing, architecting, and implementing new CUDA features; coordinating and driving development efforts across multiple teams; collaborating with members of hardware architecture teams; defining forward-looking improvements to the CUDA APIs and programming model; building and maintaining performance and precision modeling; writing effective, maintainable, and well-tested code; and developing code for multiple operating systems.
To be successful in this role, you will need a Bachelor of Science or Master of Science degree in Computer Science, Electrical Engineering, or a related field, or equivalent experience. You will also need 5+ years of relevant experience in developing systems software; strong C programming skills; experience designing, debugging, and maintaining complex software stacks; experience with operating system interfaces for threads, process control, and virtual memory; and an understanding of system-level architecture, such as interconnects, memory hierarchy, interrupts, and memory-mapped IO.
Preferred qualifications include prior experience with parallel computing, knowledge of memory coherence and consistency models, a background in kernel mode development, experience with Linux systems software development, and familiarity with distributed-system and training/inference patterns and with deep learning frameworks.