Description

As an IPP Cloud Data Center Platform Development Operations Program Manager, you will play a pivotal role in driving the global-scale expansion of NVIDIA's infrastructure, supporting next-generation AI and GPU platforms. You will lead centralized operations, infrastructure coordination, and strategic capacity planning across NVIDIA's growing infrastructure footprint.

Your responsibilities will include:

Driving centralized operational coordination across global Data Center and Super Lab infrastructure initiatives.
Supporting infrastructure planning and execution across a rapidly growing portfolio of AI/GPU infrastructure sites.
Improving infrastructure forecasting, portfolio visibility, dependency management, and operational readiness processes.
Partnering with cross-functional organizations to align infrastructure capacity, deployment priorities, and execution timelines.
Developing executive-level operational reporting, infrastructure dashboards, hotspot analysis, and portfolio readiness metrics.
Supporting strategic data center capacity planning initiatives across power, liquid cooling, and space allocation requirements.
Coordinating operational readiness activities supporting infrastructure build-outs, site readiness, and launch execution.
Identifying process gaps, operational bottlenecks, and scaling challenges associated with global infrastructure growth.
Improving prioritization rigor and execution alignment across multiple collaborating organizations.
Facilitating operational alignment between infrastructure demand, deployment schedules, and site readiness constraints.

To succeed in this role, you will need:

15+ years of experience in infrastructure operations, developer operations, technical operations, program management, or large-scale infrastructure planning environments.
A strong understanding of liquid cooling systems, power and space planning, infrastructure operations, data center capacity planning and forecasting, the GPU infrastructure deployment lifecycle, portfolio management, operational readiness, and infrastructure dependency management.
Experience managing collaboration across engineering, facilities, operations, deployment, and executive leadership groups.
Strong executive communication, analytical, and operational problem-solving skills.
Ability to operate effectively in rapidly scaling and highly matrixed environments.
