# Senior DL Algorithms Engineer - Inference Performance

**Company**: NVIDIA
**Location**: Santa Clara
**Work arrangement**: remote
**Experience**: senior
**Job type**: full-time
**Category**: Engineering
**Industry**: Technology

**Apply**: https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-CA-Santa-Clara/Senior-DL-Algorithms-Engineer---Inference-Performance_JR2017176?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply
**Canonical**: https://yubhub.co/jobs/job_acb3edd0-b88

## Description

We are seeking a Senior DL Algorithms Engineer to join our team. You will enable and optimize state-of-the-art open models, contribute new features, and deliver production code to open-source frameworks. Your expertise in deep learning, neural networks, and performance profiling will help us push the boundaries of inference performance.

Key responsibilities include:

- Enabling and optimizing state-of-the-art open models on NVIDIA's accelerated inference SW stack.

- Contributing new features, fixing bugs, and delivering production code to open-source frameworks such as TRT-LLM, vLLM, SGLang, and FlashInfer.

- Profiling and analyzing bottlenecks across the full inference stack to push the boundaries of inference performance.

- Benchmarking state-of-the-art offerings and performing competitive analysis for NVIDIA's SW/HW stack.

- Co-designing with partner teams to develop the next generation of AI models and services.

Requirements include:

- PhD in CS, EE, or CSEE, or equivalent experience.

- 3+ years of experience.

- Strong background in deep learning and neural networks, in particular inference.

- Experience with performance profiling, analysis, and optimization, especially for GPU-based applications.

- Proficiency in PyTorch or an equivalent AI framework, or in HPC-heavy application development.

- Deep understanding of computer architecture and familiarity with the fundamentals of GPU architecture.

Preferred qualifications include:

- Proven experience with processor and system-level performance optimization.

- Deep understanding of modern LLM/Diffusion architectures.

- Strong fundamentals in algorithms.

- GPU programming experience (CUDA or OpenCL) is a strong plus.

## Skills

### Required
- PyTorch
- Deep learning
- Neural networks
- Performance profiling
- GPU-based applications
- Computer architecture
- GPU architecture

### Nice to have
- Processor and system-level performance optimization
- Modern LLM/Diffusion architectures
- Algorithms
- GPU programming (CUDA or OpenCL)

