Description
We are looking for a technical, hands-on Product Manager to lead our product efforts for local AI on Linux and for developers. Client AI is the technology platform on top of NVIDIA's client hardware (GeForce RTX, RTX PRO, DGX Spark, DGX Station, and N1X) that enables AI and agents, content creation, and developer workflows.
Generative AI is moving from the cloud to the workstation and the edge. Developers want to prototype, fine-tune, and run frontier models locally. Enterprises want to deploy agents against their private data on-prem. Inference stacks like vLLM, SGLang, TensorRT-LLM, and PyTorch are becoming the default runtimes for these workflows. This Product Manager will help NVIDIA win the Linux side of this shift by making our client platforms the best place to build and run modern AI.
Responsibilities:
- Define and lead the enterprise agent use case: understand how enterprises deploy agents on-prem, what they need from the platform, and where NVIDIA should invest.
- Collaborate with Product Managers who work on cloud inference backends (vLLM, SGLang, TensorRT-LLM, and PyTorch) to drive and prioritize requirements for local AI.
- Own the product strategy and roadmap for the Linux developer experience on NVIDIA client platforms (DGX Spark, DGX Station, RTX PRO workstations, RTX Spark).
- Research the developer and enterprise AI ecosystem: interview customers, build personas and user journeys, and map workflows across training, fine-tuning, inference, and agent deployment.
- Work hands-on with the latest models, frameworks, and agent tooling so you can represent the developer's point of view in every decision.
- Lead cross-functional teams (engineering, DevRel, marketing, partnerships) to ship features and grow adoption.
- Influence NVIDIA's GPU, system, and software roadmaps based on what Linux developers and enterprise AI teams actually need.
- Build product positioning, technical demos, and sales and partner enablement material for a developer audience.
Requirements:
- 8+ years of product management experience, with meaningful time on AI/ML, developer tools, or infrastructure products.
- First-hand experience as a developer or engineer: you have shipped code in production and can debug a CUDA, PyTorch, or Docker issue alongside an engineer, not just manage around it.
- Deep familiarity with modern AI workflows: training and fine-tuning, inference serving, agent frameworks, RAG pipelines, and evaluation.
- Working knowledge of at least one major inference backend (vLLM, SGLang, TensorRT-LLM, or PyTorch-based serving).
- Fluency in Linux as a development and deployment environment.
- Strong written communication and the ability to translate technical depth for both engineers and executives.
Nice to Have:
- Prior role as an AI/ML engineer, inference systems engineer, or application developer building with LLM APIs and agent frameworks (LangChain, LlamaIndex, MCP).
- Experience with model optimization: quantization, distillation, speculative decoding, KV-cache strategies.
- Hands-on with CUDA, Triton, or low-level GPU programming.
- Background in enterprise software, on-prem deployments, or private AI.
- Open-source contributions to AI/ML, inference, or agent projects.