# Member of Technical Staff - Large Scale Data Infrastructure

**Company**: Black Forest Labs
**Location**: Freiburg (Germany), San Francisco (USA)
**Work arrangement**: hybrid
**Experience**: staff
**Job type**: full-time
**Salary**: $180,000–$300,000 USD + Equity
**Category**: Engineering
**Industry**: Technology

**Apply**: https://job-boards.greenhouse.io/blackforestlabs/jobs/5019171008
**Canonical**: https://yubhub.co/jobs/job_4075c787-328

## Description

We're looking for infrastructure engineers to work at peta-to-exabyte scale. You'll build data systems behind the largest training runs on thousands of GPUs, where fixing one bottleneck lets researchers train the next breakthrough model.

**What You'll Work On:**

- Scalable data loaders for training runs across thousands of GPUs

- Efficient storage and retrieval systems for petabyte-scale datasets

- Multi-cloud object storage abstraction

- Large-scale data migrations across storage systems and providers

- Debugging and resolving performance bottlenecks in distributed data loading

**Technical Focus:**

- Python, PyTorch DataLoader internals

- Object storage (e.g. S3, Azure Blob, GCS)

- Parquet for metadata

- Video: ffmpeg, PyAV, codec fundamentals

**What We're Looking For:**

- Built and operated data pipelines at petabyte scale

- Optimized data loading

- Worked with petabyte-scale video and image datasets

- Written processing jobs operating on millions of files

- Debugged distributed system bottlenecks across large fleets of machines

**Nice to Have:**

- Experience with streaming dataset formats (e.g. WebDataset)

- Video codec internals and frame-accurate seeking

- Distributed systems experience

- Slurm and Kubernetes for job orchestration

- Experience with object storage performance tuning across providers

**How We Work Together:**

- We're a distributed team with real offices that people actually use. Depending on your role, you'll either join us in Freiburg or SF at least 2 days a week (or one full week every other week) or work remotely with one in-person week each month to stay connected. We'll cover reasonable travel costs to make this possible. We think in-person time matters, and we've structured things to make it accessible to all. We'll discuss what this will look like for the role during our interview process.

**Everything we do is grounded in four values:**

- Obsessed. We are a frontier research lab. The science has to be right, the understanding deep, the product beautiful.

- Low Ego. The work speaks. The best idea wins, no matter who said it. Credit is shared. Nobody is above any task.

- Bold. We take the ambitious bet. We ship; we do not wait for conditions to be perfect.

- Kind. People over politics. We treat each other with genuine warmth. Agency without empathy creates chaos.

## Skills

### Required
- Python
- PyTorch
- DataLoader Internals
- Object Storage
- Parquet
- Video
- ffmpeg
- PyAV
- Codec Fundamentals

### Nice to have
- WebDataset
- Distributed Systems
- Slurm
- Kubernetes
- Object Storage Performance Tuning
