Mistral AI

Applied AI Engineer, Site Reliability Engineer

Hybrid, senior, full-time, Paris

First indexed 30 May 2026

Description

About the Role

At Mistral AI, we are seeking an experienced Applied AI Engineer, Site Reliability Engineer to join our Applied AI team. As a key member of the team, you will build and operate the framework that keeps Mistral's solution delivery reliable and sustainable.

Responsibilities

  • Design for a fleet of Mistral platforms and apps, building in proactive measures to reduce reactive work.
  • Productize reliability: author runbooks, create SLO templates, and implement observability.
  • Operate the Tier-1 customer environments that Mistral is contracted to operate: ensure SLO compliance, own on-call and incident response, manage drift, partner with Technical Support as the L3 escalation point, and champion high-signal post-mortems.
  • Productize how we deploy, secure, and scale our Applied AI solutions: engineer on-demand provisioning, author security baseline packages, embed security guardrails, and automate everything.
  • Own the security operations layer for our customer-side deployments: lead CVE response across the fleet, ship supply-chain integrity controls (SBOM, signed images, provenance), co-page with InfoSec on security incidents, and enforce secure-config baselines.

How We Work in Applied AI

  • We care about people and outputs.
  • What matters is what you ship, not the time you spend on it.
  • Bureaucracy is where urgency goes to vanish. You talk to whoever you need to talk to. The best idea wins, whether it comes from a principal engineer or someone in their first week.
  • Always ask why. The best solutions come from deep understanding, not from copying what worked before.
  • We say what we mean. Feedback is direct, timely, and given because we care.
  • No politics. Low ego, high standards.
  • We embrace an unstructured environment and find joy in it.

About You

  • Fluent in English.
  • 5+ years in SRE, Production Engineering, or DevOps, with a record of shipping tooling.
  • Strong multi-tenant Kubernetes fluency: namespace segmentation, network policy, RBAC, admission control, and operations at scale.
  • On-call discipline: incident response, blameless post-mortem culture, runbook-first mindset.
  • Observability stack in production: Prometheus, Grafana, OpenTelemetry, Loki, Tempo, SigNoz.
  • Infrastructure as code: Terraform, Ansible (or close equivalents).
  • Proficient in Python and/or Golang for tooling and automation.
  • Security mindset: you treat secure-SDLC, CVE response, and supply-chain integrity as reliability properties of the shipped artifact, not as someone else's job.
  • Strong written communication skills: runbooks, post-mortems, and customer-facing incident comms are core deliverables of this role.
  • Comfortable operating with high autonomy in an ambiguous, fast-paced environment, and disciplined enough to defend the team's scope when work tries to spill in.
  • Solid Linux internals, networking debug, and distributed-systems fundamentals.

Strong Plus

  • Cloud or application security background (AppSec, K8s security, supply chain, SBOM, cosign, SLSA).
  • Experience operating LLM / model-serving stacks in production.
  • Experience with multi-cloud or on-prem hybrid customer environments (AWS, GCP, Azure, sovereign clouds).
  • Open-source contributions, particularly in SRE, observability, or security tooling.
This listing is enriched and indexed by YubHub. To apply, use the employer's original posting: https://jobs.lever.co/mistral/a93b2891-9aaa-4c18-855e-37ef159d4eed