# Lead Software Engineer, Fleet Management - DGX Cloud

**Company**: NVIDIA
**Location**: Seattle
**Work arrangement**: remote
**Experience**: senior
**Job type**: full-time
**Category**: Engineering
**Industry**: Technology

**Apply**: https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-WA-Seattle/Lead-Software-Engineer--Fleet-Management---DGX-Cloud_JR2016367?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply
**Canonical**: https://yubhub.co/jobs/job_9b407143-c7c

## Description

We are looking for a Lead Software Engineer to join our DGX Cloud team and build the foundational systems that drive NVIDIA's high-performance GPU infrastructure. As a technical lead, you will play a key role in designing scalable cloud services that integrate with diverse systems, including GPU telemetry in datacenters, and enabling operational automation across global cloud operations.

Your responsibilities will include:

- Acting as technical lead for a team of software engineers designing cloud services backed by databases and data warehouses.
- Designing and developing RESTful APIs to ingest telemetry from AI datacenters.
- Building scalable cloud services for high-volume ingestion, processing, and storage of large datasets.
- Building and managing data pipelines for online and offline data storage.
- Collaborating across teams to codify business processes into scalable, self-measuring systems.
- Optimizing the reliability and efficiency of cloud services and operations.
- Leading and shipping impactful technical projects, ensuring quality and scalability at every stage.

Requirements include:

- 12+ years of industry experience with a Bachelor's or Master's degree (or equivalent experience); PhD preferred.
- Expertise in building scalable REST APIs backed by PostgreSQL-compatible data stores.
- Proficiency in programming languages such as Go or Python.
- Familiarity with modern JavaScript frameworks (e.g., React, Angular, Next.js).
- Expertise in cloud infrastructure (AWS, GCP, Azure, etc.) and container technologies such as Docker and Kubernetes.
- Expertise with high-scale distributed systems, including architectural patterns for APIs and data pipelines.
- Outstanding communication and collaboration skills, with a focus on solving complex operational challenges.
- A passion for delivering scalable and efficient cloud services.
- Familiarity with Linux operating systems.

Preferred qualifications include:

- A track record of leading engineers to successful delivery and operations of high-performance cloud services at Internet scale.
- Experience operating NVIDIA datacenter GPUs.
- Strong debugging and problem-solving skills in distributed environments.

## Skills

### Required
- RESTful APIs
- PostgreSQL
- Go
- Python
- JavaScript
- Cloud infrastructure
- Docker
- Kubernetes
- Linux operating systems

### Nice to have
- High-scale distributed systems
- Architectural patterns for APIs and data pipelines
- Debugging and problem-solving skills in distributed environments

---

Source: [Apply at nvidia.wd5.myworkdayjobs.com](https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-WA-Seattle/Lead-Software-Engineer--Fleet-Management---DGX-Cloud_JR2016367?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply)
