# Principal Software Engineer - DGX Cloud

**Company**: NVIDIA
**Location**: Santa Clara
**Work arrangement**: onsite
**Experience**: senior
**Job type**: full-time
**Salary**: $120,000–$180,000
**Category**: Engineering
**Industry**: Technology

**Apply**: https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-CA-Santa-Clara/Principal-Software-Engineer---DGX-Cloud_JR2012048-1?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply
**Canonical**: https://yubhub.co/jobs/job_1ee56d6e-eb0

## Description

We are looking for a Principal Software Engineer to join our DGX Cloud team and build the foundational systems that drive NVIDIA's high-performance GPU infrastructure. You will play a meaningful role in crafting scalable automation solutions, integrating diverse systems, and enabling seamless workflows across global cloud operations.

As a Principal Engineer in DGX Cloud, you will be at the pinnacle of technical leadership. You will directly craft the platform that fuels the future of AI and cloud computing.

Responsibilities:

- Lead the development of next-generation APIs, state management, and workflow orchestration systems that automate fleet lifecycle operations at massive scale.

- Drive technical alignment across dependent systems and partner teams to ensure cohesive integration, clear interfaces, and reliable end-to-end workflows, with a strong focus on delivery.

- Act as a force-multiplier by coaching, mentoring, and encouraging senior engineers, elevating the technical standards and guidelines across the organization.

- Maintain a sharp focus on the customer experience and product requirements, translating deep technical insight into high-impact business solutions.

- Partner with executive and engineering leadership to codify critical business processes into self-measuring, scalable, and operationally consistent platforms, drastically reducing manual toil.

- Direct the integration strategy for key technologies, including common AI schedulers (e.g., Kubernetes, Slurm) and innovative observability systems (e.g., Prometheus, OpenTelemetry, Grafana).

Requirements:

- 16+ years of progressive industry experience.

- Master's or Bachelor's degree, or equivalent experience defining and shipping complex distributed systems.

- Deep, hands-on expertise in establishing, operating, and scaling services in a fast-paced, high-reliability environment.

- Ability to thrive in ambiguous, fast-paced environments by rapidly testing ideas, iterating toward working solutions, and then hardening the winners into reliable, scalable systems.

- Outstanding proficiency in modern systems programming languages such as Go, Java, or Python.

- Proven track record of defining, owning, and evolving the architecture of high-scale distributed systems, including advanced patterns for APIs, control planes, and data pipelines.

- Deep understanding of global cloud infrastructure (AWS, GCP, Azure) and container ecosystems (Docker, Kubernetes).

- Demonstrated ability to drive technical strategy and influence outcomes across organizational boundaries.

- Outstanding ability to communicate complex technical concepts, drive organizational consensus, and mentor high-performing engineers.

Benefits:

- Widely considered to be one of the technology world's most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package.

- Eligible for equity and benefits.

## Skills

### Required
- Go
- Java
- Python
- Kubernetes
- Slurm
- Prometheus
- OpenTelemetry
- Grafana

---

Source: [Apply at nvidia.wd5.myworkdayjobs.com](https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-CA-Santa-Clara/Principal-Software-Engineer---DGX-Cloud_JR2012048-1?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply)
