# Director, Global Network Reliability Engineering

**Company**: NVIDIA
**Location**: Santa Clara
**Work arrangement**: onsite
**Experience**: executive
**Job type**: full-time
**Category**: Engineering
**Industry**: Technology

**Apply**: https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-CA-Santa-Clara/Director--Global-Network-Reliability-Engineering_JR2007400?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply
**Canonical**: https://yubhub.co/jobs/job_ed5c35b4-bb5

## Description

We are seeking a Director of Network Reliability Engineering to lead our global network operations and ensure that reliability, scalability, and efficiency goals are defined and met. In this role, you will be responsible for maturing our current support model and processes into a more data-driven, automated SRE model. You will build an in-house team of reliability experts for networking support and operations; set the technical vision, strategy, and roadmap for network operations; and work across Network Architecture, Network Engineering, and partner teams to establish runbooks, hold regular training sessions, and ensure we build the network to be self-healing.

Your main focus will be on understanding root cause analyses (RCAs) from events and incidents, working with our AI operations to enrich our observability tooling for a better full-stack view from the network to applications, and influencing the architecture of NVIDIA networks both on-prem and in the cloud.

To succeed in this role, you will need a Bachelor's degree in Computer Science or a related technical field, or equivalent experience; 12+ years of experience in system design, network architecture, network engineering, and network operations; and 7+ years of leadership experience. You should also have experience transforming network operations using software-driven methods, knowledge of SRE principles, and knowledge of software interface design and documentation for less technical end users.

## Skills

### Required
- Network Reliability Engineering
- System Design
- Network Architecture
- Network Engineering
- Network Operations
- Software-Driven Methods
- SRE Principles
- Software Interface Design
- Documentation

