NVIDIA

Senior Network Reliability Engineer - DGX Cloud

Remote, senior, full-time, Santa Clara

First indexed 18 May 2026

Description

We are looking for a Senior Network Reliability Engineer to support and maintain our cloud and datacenter network infrastructure. This network serves the entire NVIDIA software stack, from graphics drivers to autonomous vehicles and artificial intelligence.

In this role, you will remediate critical alerts within defined SLAs, triage production-impacting network incidents, and work with internal customers on network-related issues. You will also engage with external vendors to remediate hardware and software issues, and participate in project-related work such as network device upgrades and capacity augmentations.

Key responsibilities include:

  • Engaging in 24/7 global shift rotations to provide remote support for network repairs and changes while collaborating across teams and updating customers on status and ticket information.
  • Driving operational improvements in change management and daily operations by following procedures.
  • Managing and operating large-scale IP network technologies and infrastructures.
  • Applying your skills in peering and datacenter interconnect technologies: PNI, transit, exchange, passive DWDM, and wave circuits.
  • Monitoring and supporting the network health of on-premises and cloud infrastructures.
  • Collaborating on and developing workflow enhancements while documenting best practices.

Requirements include:

  • Deep knowledge of and experience with TCP/IP, BGP, OSPF, MPLS, IS-IS, VXLAN, EVPN, QoS, GRE, IPsec, DNS, and MACsec.
  • 5+ years of experience in network operations.
  • Skill in network troubleshooting techniques and demonstrated creative problem-solving ability.
  • Strong track record of alert response within defined SLAs and of incident management.
  • Experience with one or more of the following CSP environments: AWS, Azure, GCP, OCI.
  • Familiarity with Arista, Fortinet, and Juniper.
  • Hands-on experience contributing to tooling and automation for provisioning, monitoring, and managing complex network infrastructures.
  • Bachelor’s degree in Computer Science, related technical field, or equivalent experience.
  • Excellent verbal and written communication skills.