# Field Hardware Engineer, HPC

**Company**: Mistral AI
**Location**: Paris
**Work arrangement**: onsite
**Experience**: senior
**Job type**: full-time
**Category**: Engineering
**Industry**: Technology
**Wikidata**: https://www.wikidata.org/wiki/Q119718658

**Apply**: https://jobs.lever.co/mistral/ea94b55b-58e1-437b-bf3d-07ed150308e3?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply
**Canonical**: https://yubhub.co/jobs/job_058f6a10-283

## Description

Our compute footprint is growing fast to support our science and engineering teams. We're hiring a Field HW Engineer to understand end-to-end systems, execute complex/vendor-level interventions, and guide L1 engineers on site, without direct line management.

You'll work hands-on across compute, storage, interconnect and cooling to keep one of France's largest GPU/CPU clusters healthy and scalable.

Location: Bruyères-le-Châtel, on-site, field role (multi-site mobility: Paris area and nearby)

Reporting line: Hardware Ops

Impact:

• Compute is a key lever for Mistral's success and our largest spend item.

• Direct impact on scale: you'll restore service on complex incidents and raise the bar on reliability as we grow.

• Enable breakthrough AI: your work enables our science & engineering teams to deliver state-of-the-art AI.

What you will do:

• Lead complex interventions: plan and execute vendor-level or multi-node operations (e.g., full rack work, intricate recabling, post-restart diagnosis), own risk assessment/rollback, and coordinate with vendors (RMA/escalations).

• Advanced diagnostics: correlate symptoms across compute, storage, interconnect, cooling; read system indicators (LED/POST/beep), BMC/IPMI consoles, and logs to identify root causes.

• Guide and uplift L1s: coach on safe practices (ESD/LOTO), first-line triage, rack craftsmanship, documentation quality; pair on tricky procedures.

• Process & automation: improve SOPs/checklists; propose/build small automation (Python/Bash) for photo/serial capture, inventory sync, dashboards/alerts; shorten MTTR.

• Safety & compliance: enforce lockout/tagout, ESD, PPE; ensure audit-ready tickets, evidence and change traces.

• Parts & logistics (advanced): plan spares strategy, track failure trends, and drive proactive vendor actions.

About you:

• 5+ years in data center/server hardware or L2/L3 hardware support, with proven complex hands-on work in production (HPC/AI/Cloud at scale).

• End-to-end hardware expertise: comfortable across CPU/memory/PCIe cards (incl. accelerators), NICs, PSUs, drives, network, power and cooling (including DLC); strong judgment on when/how to escalate.

• Diagnostics depth: confident analyzing BMC/IPMI logs and Linux software logs and crashes, and running simple CLI checks; methodical root cause analysis.

• Safety & discipline: impeccable ESD/LOTO/PPE habits; zero rough handling; clean, labeled, auditable work.

• Communication & mentoring: crisp status/handovers; able to coach L1s during live operations.

• Documentation: provide technical documentation to L1s or other teams.

• Mobility: willing to travel between sites (Paris area or nearby regions, occasionally in Europe or the US).

Nice to have:

• Vendor tools (iDRAC/iLO/IPMI), RAID/storage basics (NVMe/SAS/SATA), high-speed interconnect (Ethernet/InfiniBand).

• Coding/automation (Python/Bash) for small ops tools and reporting.

• Experience with ticketing (Jira/ServiceNow), inventory/RMA flows, vendor coordination.

## Skills

### Required
- data center/server hardware
- L2/L3 hardware support
- HPC/AI/Cloud at scale
- end-to-end hardware expertise
- diagnostics depth
- safety & discipline
- communication & mentoring

### Nice to have
- vendor tools
- RAID/storage basics
- high-speed interconnect
- coding/automation
- ticketing
- inventory/RMA flows
- vendor coordination

---

Source: [Apply at jobs.lever.co](https://jobs.lever.co/mistral/ea94b55b-58e1-437b-bf3d-07ed150308e3?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply)
