NVIDIA

Senior Solutions Architect - Generative AI

Onsite | Senior | Full-time | Pune

First indexed 29 May 2026

Description

We're seeking a Senior Solutions Architect who can operate fluently across two worlds: the AI Factory infrastructure stack (designing accelerated compute environments built on DGX/HGX, networking, storage, and orchestration) and the Generative & Agentic AI application stack (guiding customers through open-source frameworks, NeMo, NIM, and Blueprints to deliver production AI workloads).

As a Senior Solutions Architect, you will be a trusted technical advisor to enterprise customers, partnering with sales, product, and engineering to translate business outcomes into deployable architectures.

Key responsibilities include:

  • Leading enterprise customers through the full lifecycle of on-prem Generative AI, from use case discovery and model selection to fine-tuning, deployment, and continuous evaluation.
  • Architecting production-grade RAG systems using NeMo Retriever, embedding and reranker NIMs, and vector databases such as Milvus, pgvector, and Weaviate, tuned for accuracy, latency, and cost at scale.
  • Guiding customers in selecting and customizing open-source foundation models like Llama, Mistral, Qwen, and Gemma, and leading them through fine-tuning workflows using NeMo Customizer, PEFT/LoRA, SFT, DPO, and RLHF.
  • Composing and building agentic applications using frameworks like LangGraph, LlamaIndex, CrewAI, AutoGen, and NVIDIA AI Blueprints, covering use cases such as customer service automation, enterprise search, document intelligence, video analytics, software engineering agents, and industry-specific copilots.
  • Advising on inference optimization with TensorRT-LLM, vLLM, and SGLang, including quantization, speculative decoding, and multi-LoRA serving.
  • Championing responsible AI practices: guardrails (NeMo Guardrails), red-teaming, evaluation harnesses, and observability for LLM and agent systems.
  • Helping customers operationalize GenAI with accurate MLOps, versioning, CI/CD for prompts and models, drift detection, and human-in-the-loop feedback.
  • Providing architectural mentorship on on-prem AI Factory deployments built around DGX BasePOD/SuperPOD, HGX-based OEM systems, Spectrum-X and Quantum InfiniBand networking, and high-performance storage.
  • Scaling clusters appropriately for combined fine-tuning and inference tasks, and building orchestration layers on Kubernetes (including the NVIDIA GPU Operator, Network Operator, and Run:ai), applying Slurm as needed.
  • Leading technical workshops, PoCs, and architecture reviews with enterprise customers as well as representing NVIDIA at customer briefings, industry events, and technical forums.
  • Translating field findings into product feedback for NVIDIA engineering as well as building reusable assets such as reference architectures, deployment guides, and demos that scale across the SA community.
This listing is enriched and indexed by YubHub. To apply, use the employer's original posting: https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/India-Pune/Senior-Solutions-Architect---Generative-AI_JR2018210