# Engineering Manager - Inference

**Company**: Perplexity
**Location**: San Francisco
**Work arrangement**: onsite
**Experience**: senior
**Job type**: full-time
**Salary**: $300K - $405K
**Category**: Engineering
**Industry**: Technology

**Apply**: https://jobs.ashbyhq.com/perplexity/2a87ccbf-82ef-4fc7-b1ed-4dd18b11baf9
**Canonical**: https://yubhub.co/jobs/job_7917d1eb-6e2

## Description

We are looking for an Inference Engineering Manager to lead our AI Inference team. This is a unique opportunity to build and scale the infrastructure that powers Perplexity's products and APIs, serving millions of users with state-of-the-art AI capabilities.

## What you'll do

You will own the technical direction and execution of our inference systems while building and leading a world-class team of inference engineers. Our current stack includes Python, PyTorch, Rust, C++, and Kubernetes.

- Lead and grow a high-performing team of AI inference engineers
- Develop APIs for AI inference used by both internal and external customers
- Architect and scale our inference infrastructure for reliability and efficiency

## What you need

- 5+ years of engineering experience, including 2+ years in a technical leadership or management role
- Deep experience with ML systems and inference frameworks (PyTorch, TensorFlow, ONNX, TensorRT, vLLM)
- Strong understanding of LLM architecture: Multi-Head Attention, Multi/Grouped-Query Attention, and common layers

## Skills

### Required
- ML systems
- inference frameworks
- LLM architecture

### Nice to have
- CUDA
- Triton
- custom kernel development
