Description
We are looking for a Senior Software Engineer, Applied AI Systems, to build production AI/ML and agentic solutions. You will work at the intersection of applied AI, agentic workflows, software engineering, distributed systems, performance engineering, accelerated computing, and data infrastructure.
Responsibilities:
- Build and own production-grade applied AI systems for NVIDIA’s technical and solution development use cases, including agentic solutions where they materially improve the systems and software.
- Design and build agentic workflows and the software around them: workflow services, APIs, retrieval, MCP/A2A-style tool integrations, agent harnesses, automation, telemetry, operational controls, and human oversight.
- Design reliable services, APIs, workflow state, event-driven execution, and observability using systems such as Kafka, ClickHouse, and OTel-style patterns.
- Translate complex technical and operational requirements into clear system designs, plans, interfaces, measurable outcomes, and pragmatic technical decisions through design reviews, code reviews, and clear communication.
- Develop production software in Python and other relevant languages, with strong testing, observability, CI/CD, documentation, and operational practices.
- Build performance and benchmarking workflows for existing production solutions or products, including validation harnesses, regression tests, tracing, metrics, and failure analysis, covering latency, throughput, reliability, resource usage, and AI/inference behavior where relevant.
- Improve standard solution patterns alongside larger applied AI systems, working with NVIDIA engineering and solution teams to codify repeated patterns, product gaps, and field lessons into APIs, services, reference architectures, playbooks, test harnesses, and shared engineering building blocks.
- Debug and support production solutions across software, infrastructure, AI models, data pipelines, inference services, and GPU-accelerated environments, turning recurring support patterns into product or platform improvements.
Requirements:
- BS, MS, or PhD in Computer Science, Engineering, AI/ML, or equivalent experience, with 5+ years of professional software engineering experience owning production systems or meaningful platform components.
- Hands-on experience with LLMs, generative AI, RAG, agentic AI, MCP, or similar AI technologies beyond simple prompting or notebooks, including tool use, retrieval, evaluation, guardrails, orchestration, or human-in-the-loop control.
- Strong Python engineering skills and practical experience with at least one additional production programming language such as C++, Go, Rust, or TypeScript.
- Demonstrated ability to design and build distributed systems, backend services, data pipelines, workflow orchestration, APIs, or developer platforms in production environments, using technologies such as Kafka, ClickHouse, PostgreSQL, Redis, object storage, Kubernetes, or similar.
- Strong system design and operational judgment, including reliability, latency, cost, security, privacy, scalability, debuggability, maintainability, performance analysis, benchmarking, profiling, or capacity evaluation.
- Excellent debugging and problem-solving skills across software, infrastructure, AI systems, and performance bottlenecks.
- Proven ownership of ambiguous, cross-team engineering work, with the ability to collaborate with distributed teams spanning US Pacific, EMEA, and APAC time zones.
- Strong written and verbal communication skills in English.
To apply, use the employer's original posting:
https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/Germany-Munich/Senior-Software-Engineer--Applied-AI_JR2019145