Description
We are looking for an HPC and AI Data Center Engineer to join our networking cloud solutions HPC/AI Infrastructure team. As a key player in building supercomputers and HPC clusters based on groundbreaking technologies, you will contribute to the latest breakthroughs in artificial intelligence and GPU computing.
Your primary responsibilities will include planning and building complex clusters and supercomputers in various data centers and labs, ensuring power and cooling efficiency in those data centers and labs while optimizing rack space utilization, and troubleshooting across the network, optical cabling, bare metal, and operating system layers.
To succeed in this role, you will need MCSE or MCITP/CCNA certification, 3+ years of experience as a lab manager, and proven hands-on experience in Linux troubleshooting, with strong problem identification and resolution skills.
You will stand out from the crowd if you have scripting experience in Bash and/or Python, experience with widely used configuration management tools (e.g. Ansible, Puppet), CI tools and job schedulers (e.g. Jenkins, SLURM), virtualization platforms (KVM, VMware, Hyper-V), and L2 and L3 network protocols.