# Lead Software Engineer, Runtime

**Company**: Mistral AI
**Location**: Paris
**Work arrangement**: onsite
**Experience**: senior
**Job type**: full-time
**Category**: Engineering
**Industry**: Technology
**Wikidata**: https://www.wikidata.org/wiki/Q119718658

**Apply**: https://jobs.lever.co/mistral/0593f273-44f5-4c20-a84c-0406d5da6a0b
**Canonical**: https://yubhub.co/jobs/job_8d359571-77e

## Description

As the Technical Lead for the Inference team, you will drive the architecture and optimization of our inference backbone, ensuring high performance, scalability, and efficiency in a dynamic environment.

The role involves architecting and optimizing the inference stack for high-volume, low-latency, high-availability environments, leading the acquisition and automation of benchmarks, collaborating with cross-functional teams, and developing new solutions to improve our AI-powered applications.

Key responsibilities include:

* Architecting and optimizing the inference stack for high-volume, low-latency, high-availability environments
* Leading the acquisition and automation of benchmarks at both micro and macro scales
* Introducing new techniques and tools to improve performance, latency, throughput, and efficiency in our model inference stack
* Building tools to identify bottlenecks and sources of instability, and designing solutions to address them
* Collaborating with machine learning researchers, engineers, and product managers to bring cutting-edge technologies into production
* Optimizing code and infrastructure to maximize hardware utilization and efficiency
* Mentoring and guiding team members, fostering a culture of collaboration, innovation, and continuous learning

Requirements include:

* Extensive experience in C++ and Python, with a strong focus on backend development and performance optimization
* Deep understanding of modern ML architectures and experience with performance optimization for inference
* Proven track record with large-scale distributed systems, particularly performance-critical ones
* Familiarity with PyTorch, TensorRT, CUDA, NCCL
* Strong grasp of infrastructure, continuous integration, and continuous delivery principles
* Ability to lead and mentor team members, driving projects from concept to implementation
* Results-oriented mindset with a bias towards flexibility and impact
* Passion for staying ahead of emerging technologies and applying them to AI-driven solutions
* Humble attitude, eagerness to help colleagues, and a desire to see the team succeed

### Our Culture

We're driven to build a strong company culture and are looking for individuals who align closely with the following values:

* Reason with rigor
* Are you audacious enough?
* Make our customers succeed
* Ship early and accelerate
* Leave your ego aside

## Skills

### Required
- C++
- Python
- PyTorch
- TensorRT
- CUDA
- NCCL