Description

We are seeking a Senior Manager to lead the design, scaling, and operations of high-performance networking for GPU-based cloud infrastructure. This role is critical to enabling cloud gaming workloads, AI/ML training, and inference platforms by delivering ultra-low-latency, high-throughput, and highly reliable interconnects across data centers and cloud environments.

Key responsibilities include building and mentoring a specialized team of network architects, overseeing the design of intra-cluster and inter-cluster connectivity, driving technical tuning to reduce latency and increase throughput, defining the roadmap for networking strategies, engaging with ISPs to optimize low-latency edge networks, and implementing Infrastructure as Code (IaC) and observability frameworks.

The ideal candidate will have 12+ years of proven experience in networking, cloud infrastructure, or distributed systems, including 5+ years directly managing technical teams. They should have mastery of data center networking, including Clos/spine-leaf architectures and high-performance interconnect technologies such as RDMA, RoCE, or InfiniBand.

Additionally, the candidate should have hands-on experience with BGP, EVPN/VXLAN, and kernel-level development for routing and switching, and should be skilled in using Ansible or Terraform for infrastructure automation alongside monitoring tools such as Prometheus and Grafana.

A Bachelor’s or Master’s degree in Computer Science or a related engineering field is required, and relevant top-tier certifications, such as CCIE or specialized cloud networking designations, are a plus.

This listing is enriched and indexed by YubHub. To apply, use the employer's original posting: https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-CA-Santa-Clara/Senior-Manager--GPU-Cloud-Infrastructure---GeForce-NOW_JR2015878