Description
As an IPP Cloud Data Center Platform Development Operations Program Manager, you will play a pivotal role in driving the global-scale expansion of NVIDIA's infrastructure, supporting next-generation AI and GPU platforms. You will provide centralized operational leadership, advance infrastructure coordination, and lead strategic capacity planning across NVIDIA's growing infrastructure footprint.
Your responsibilities will include:
- Driving centralized operational coordination across global Data Center and Super Lab infrastructure initiatives.
- Supporting infrastructure planning and execution across a rapidly growing portfolio of AI/GPU infrastructure sites.
- Improving infrastructure forecasting, portfolio visibility, dependency management, and operational readiness processes.
- Partnering with cross-functional organizations to align infrastructure capacity, deployment priorities, and execution timelines.
- Developing executive-level operational reporting, infrastructure dashboards, hotspot analysis, and portfolio readiness metrics.
- Supporting strategic data center capacity planning initiatives across power, liquid cooling, and space allocation requirements.
- Coordinating operational readiness activities supporting infrastructure build-outs, site readiness, and launch execution.
- Identifying process gaps, operational bottlenecks, and scaling challenges associated with global infrastructure growth.
- Improving prioritization rigor and keeping execution aligned across multiple partner organizations.
- Facilitating operational alignment between infrastructure demand, deployment schedules, and site readiness constraints.
To succeed in this role, you will need:
- 15+ years of experience in infrastructure operations, developer operations, technical operations, program management, or large-scale infrastructure planning environments.
- A strong understanding of liquid cooling systems, power and space planning, infrastructure operations, data center capacity planning and forecasting, the GPU infrastructure deployment lifecycle, portfolio management, operational readiness, and infrastructure dependency management.
- Experience managing collaboration across engineering, facilities, operations, deployment, and executive leadership groups.
- Strong executive communication, analytical, and operational problem-solving skills.
- Ability to operate effectively in rapidly scaling and highly matrixed environments.
This listing is enriched and indexed by YubHub. To apply, use the employer's original posting:
https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-CA-Santa-Clara/Colossus-Cloud-Platform-Datacenter-Engineer_JR2018810