# Systems Engineer, HPC (US & Canada)

**Company**: Mistral AI
**Location**: Montreal
**Work arrangement**: remote
**Experience**: senior
**Job type**: full-time
**Category**: Engineering
**Industry**: Technology
**Wikidata**: https://www.wikidata.org/wiki/Q119718658

**Apply**: https://jobs.lever.co/mistral/18347854-1639-43f7-a0ff-7dfed538420a?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply
**Canonical**: https://yubhub.co/jobs/job_44b4ad94-bc2

## Description

At Mistral AI, we build high-performance, open, and efficient AI systems designed to power the next generation of applications.

We are looking for Systems Engineers / System Administrators to help design, operate, and scale the infrastructure behind Mistral’s AI platforms.

This is a hands-on role that combines two disciplines:

- Systems administration (operating and troubleshooting large-scale Linux environments)

- Systems engineering (automation, scalability, and performance improvements)

You’ll work closely with infrastructure, HPC, and research teams to ensure our clusters and platforms run reliably at scale.

## Core Systems Operations

- Operate and maintain large-scale Linux environments (bare metal, clusters, cloud)

- Monitor system health, troubleshoot incidents, and ensure high availability

- Support production and research workloads across multiple environments

## Scaling Infrastructure

- Help scale clusters to hundreds or thousands of nodes

- Work on systems handling petabyte-scale storage

- Improve performance, reliability, and resource utilisation

## Automation & Engineering

- Automate operational tasks using tools like Python, Bash, Ansible, or Terraform

- Improve deployment, provisioning, and system lifecycle management

- Contribute to system design and architecture decisions

## Cross-Functional Collaboration

- Work closely with HPC / infrastructure teams, platform / DevOps engineers, and research teams

- Act as a bridge between users and infrastructure

## Must-have

- Strong Linux systems administration experience (core requirement)

- Experience working in large-scale environments, such as HPC clusters or cloud infrastructure

- Experience with job schedulers (e.g. Slurm)

- Solid troubleshooting skills across systems, hardware, and networks

## Nice-to-have (any of these)

- Containers / orchestration (e.g. Kubernetes)

- Storage systems (e.g. Ceph, Lustre, NFS)

- Networking fundamentals (Ethernet; InfiniBand is a plus)

- Infrastructure as Code / automation tooling

- GPU or AI/ML experience

## Profile We Value

- Pragmatic problem solver who can operate in fast-scaling environments

- Comfortable working across multiple domains (“Swiss army knife” mindset)

- Able to go deep in one area while learning others

- Low-ego, collaborative, and hands-on

## Why Join Mistral?

- Impact: Play a pivotal role in scaling Mistral’s cutting-edge AI infrastructure.

- Growth: Opportunity to shape data centre operations from the ground up in a high-growth startup environment.

- Collaboration: Work with a talented, cross-functional team passionate about AI and technology.

- Flexibility: Competitive compensation, benefits, and the chance to contribute to revolutionary projects.

## Skills

### Required
- Linux systems administration
- Large-scale environments
- Job schedulers
- Troubleshooting

### Nice to have
- Containers / orchestration
- Storage systems
- Networking fundamentals
- Infrastructure as Code
- GPU or AI/ML experience

---

Source: [Apply at jobs.lever.co](https://jobs.lever.co/mistral/18347854-1639-43f7-a0ff-7dfed538420a?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply)
