Description
We are now looking for a Senior Deep Learning Performance Architect!
Responsibilities:
- Design and evaluate hardware architectures to improve performance, efficiency, and scalability of production AI workloads.
- Analyse and optimise large-scale deep learning workloads, especially LLM inference/training in real-world deployments.
- Build and use performance and power models (Python/C++) to drive architecture and product decisions.
- Identify and resolve system bottlenecks across compute, memory, and interconnect.
- Evaluate power, performance, and area (PPA) trade-offs and guide feature prioritisation for next-generation GPU/ASIC designs.
- Partner closely with software, systems, and product teams to align hardware capabilities with workload requirements.
Requirements:
- MS or PhD in a relevant field (Computer Science, Electrical Engineering, Computer Engineering, etc.) or equivalent experience.
- 5+ years of hands-on experience in GPU/ASIC architecture, parallel computing, or system performance engineering.
- Experience with deep learning workloads in production environments (training and/or inference).
- Proficiency in Python and C++ for building performance models, simulators, or analysis tools.
- Solid understanding of system architecture: memory hierarchy, data movement, and scalability.
- Prior experience debugging, profiling, and performance tuning on real systems.
- Ability to work across teams and drive decisions in fast-paced product environments.
Benefits:
- Eligible for equity and benefits.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.