Description

As a Senior Developer Technology Engineer, you will be at the forefront of innovation, working with leading industry partners and exciting OSS projects to help them adopt groundbreaking advancements in AI and accelerated computing on NVIDIA RTX.

Your primary responsibilities will include:

Working closely with internal engineering and product teams and external app developers to solve local end-to-end AI GPU deployment challenges on the NVIDIA RTX AI platform.
Applying powerful profiling and debugging tools to analyze the most demanding GPU-accelerated end-to-end AI applications and detect insufficient GPU utilization that results in suboptimal runtime performance.
Conducting hands-on training sessions, developing sample code, and hosting presentations to provide guidance on efficient end-to-end AI deployment targeting optimal runtime performance on NVIDIA ARM-based SoCs.
Improving the Windows LLM and GenAI user experience on NVIDIA RTX by working on feature and performance enhancements of open-source software, including but not limited to projects like GGML, Llama.cpp, Ollama, and ONNX Runtime.
Collaborating with GPU driver and architecture teams as well as NVIDIA research to influence next-generation GPU features by providing real-world workflows and giving feedback on partner and customer needs.
Providing technical leadership and mentorship to junior engineers, encouraging an inclusive and high-performing team environment.

To succeed in this role, you will need:

A proven track record of 8+ years of professional experience in local GPU deployment, profiling and optimization.
Bachelor's or Master's degree or equivalent experience in Computer Science, Engineering, or a related field.
Strong proficiency in C/C++, Python, software design, and programming techniques.
Familiarity with and development experience on the Windows operating system.
Experience working with open-source LLM and GenAI software.
Experience with CUDA and NVIDIA's Nsight GPU profiling and debugging suite.
Willingness to travel occasionally for conferences and for on-site visits with external partners.
Strong problem-solving skills and the ability to work both independently and collaboratively in a fast-paced environment.
Excellent interpersonal and communication skills and a passion for keeping up with the latest advancements in AI technology.

You will stand out from the crowd if you have experience with GPU-accelerated AI inference driven by NVIDIA APIs (specifically cuDNN, CUTLASS, and TensorRT), proven expert knowledge of Vulkan and/or DX12, detailed knowledge of the latest-generation GPU architectures, or experience with AI deployment on NPUs and ARM architectures.

This listing is enriched and indexed by YubHub. To apply, use the employer's original posting: https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-CA-Santa-Clara/Senior-Developer-Technology-Engineer---Windows-AI-Platform_JR2011694-1