About the Role

We're seeking a Member of Technical Staff to own the post-training pipeline for our multimodal models end to end. This includes data strategy, reward modeling, preference optimization, distillation, and safety tuning across image generation, image editing, and video.

Responsibilities

Own the full post-training pipeline end to end, from data curation and reward modeling through fine-tuning, preference optimization, distillation, safety tuning, evaluation, and deployment
Advance techniques across the post-training stack: SFT, RLHF, RLAIF, DPO, preference learning, and reward modeling to align models with human intent and aesthetic judgment
Work across modalities: text-to-image, image editing, multi-reference, and video post-training
Build personalization and customization capabilities that let users adapt our models to their own creative style
Design and maintain high-throughput fine-tuning and evaluation infrastructure to support rapid iteration across the research team
Identify quality and alignment gaps through rigorous evaluation, then close them through targeted research and engineering

Requirements

You've owned post-training for a frontier generative model through release, including SFT, preference optimization (DPO or RLHF), distillation, and safety tuning, with measurable quality wins on human preference evaluations or standard benchmarks
Deep experience across the post-training stack, not just one slice: reward modeling, preference learning, RLHF/RLAIF, and personalization
Comfortable working across modalities: text-to-image, image editing, multi-reference, and ideally video
Strong PyTorch fluency; you write research code that others can build on
Experience with distillation (LADD, DMD, consistency models, or similar) or with building high-throughput eval pipelines is a strong plus
Bias toward shipping: measurable model-quality improvements that reach users, not just papers

How We Work Together

We’re a distributed team with real offices that people actually use. Depending on your role, you’ll either join us in Freiburg or SF at least 2 days a week (or one full week every other week), or work remotely with a monthly in-person week to stay connected. We’ll cover reasonable travel costs to make this possible. We think in-person time matters, and we’ve structured things to make it accessible to all. We’ll discuss what this will look like for the role during our interview process.

To apply, use the employer's original posting: https://job-boards.greenhouse.io/blackforestlabs/jobs/5193512008