# Network Engineer - AI/HPC

**Company**: xAI
**Location**: Palo Alto, CA
**Work arrangement**: onsite
**Experience**: senior
**Job type**: full-time
**Salary**: $180,000 - $440,000
**Category**: Engineering
**Industry**: Technology
**Wikidata**: https://www.wikidata.org/wiki/Q120599684

**Apply**: https://job-boards.greenhouse.io/xai/jobs/5074185007
**Canonical**: https://yubhub.co/jobs/job_93c935ee-a9f

## Description

## About the Role

xAI is a technology company that aims to create AI systems to understand the universe and aid humanity in its pursuit of knowledge. We are seeking a Network Engineer - AI/HPC to join our team.

## Responsibilities

- Develop and maintain large-scale networks, applying RoCEv2 expertise to optimize performance and availability.

- Design and implement metric dashboards to monitor network performance.

- Collaborate with the team to design and implement the next iteration of our back-end and front-end networks.

- Participate in a team on-call rotation and help with scaling and maintenance efforts.

## Requirements

- At least 10 years of experience designing and operating large-scale networks, including 5 years in the Ethernet AI/HPC space.

- Deep understanding of congestion control on Ethernet; InfiniBand experience is an added bonus.

- Expertise in creating a portfolio of metrics for performance and operations to optimize the fleet for training and inference traffic.

- Experience using Python to automate repetitive tasks and to work with and analyze large data sets as part of the daily job.

## Compensation and Benefits

$180,000 - $440,000 base salary. Comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short- and long-term disability insurance, life insurance, and various other discounts and perks.

## Skills

### Required
- RoCEv2
- Ethernet
- InfiniBand
- NCCL
- Python
- Large-scale network design and operation
