Description
The Command Center Systems Engineer builds and maintains the operational backbone of the company's command center, ensuring the uptime and operational excellence of the world's largest GPU clusters. Responsibilities include strengthening, maintaining, and governing all SOPs, MOPs, and EOPs across the Command Center; enhancing and owning the escalation framework; leading change management governance; developing and managing shift structure, handover protocols, and staffing frameworks; owning the incident management lifecycle; defining and tracking operational KPIs; building and overseeing onboarding and ongoing training programs for Command Center Technicians; partnering with engineering, facilities, and site operations teams; and leading vendor governance.
The ideal candidate has 5+ years of experience in data center operations, operations management, or mission-critical infrastructure in a 24/7 environment; a proven track record of building and scaling operational frameworks; strong project and program management skills; excellent written and verbal communication skills; experience facilitating root cause analysis and driving corrective actions to closure; and comfort working with operational metrics and reporting.
In addition to a competitive salary, the company offers a variety of benefits, including medical, dental, and vision insurance; company-paid life insurance; voluntary supplemental life insurance; short- and long-term disability insurance; a flexible spending account; a health savings account; tuition reimbursement; an employee stock purchase program; mental wellness benefits; family-forming support; paid parental leave; flexible PTO; catered lunch; and a casual work environment.