Description
About the Role
We're seeking a Member of Technical Staff to own the post-training pipeline for our multimodal models end to end. This includes data strategy and reward modeling, preference optimization, distillation, and safety tuning across image, editing, and video.
Responsibilities
- Own the full post-training pipeline end to end, from data curation and reward modeling through fine-tuning, preference optimization, distillation, safety tuning, evaluation, and deployment
- Advance techniques across the post-training stack (SFT, RLHF, RLAIF, DPO, preference learning, and reward modeling) to align models with human intent and aesthetic judgment
- Work across modalities: text-to-image, image editing, multi-reference, and video post-training
- Build personalization and customization capabilities that let users adapt our models to their own creative style
- Design and maintain high-throughput fine-tuning and evaluation infrastructure to support rapid iteration across the research team
- Identify quality and alignment gaps through rigorous evaluation, then close them through targeted research and engineering
Requirements
- You've owned post-training for a frontier generative model through release, including SFT, preference optimization (DPO or RLHF), distillation, and safety tuning, with measurable quality wins on human preference evaluations or standard benchmarks
- Deep experience across the post-training stack, not just one slice: reward modeling, preference learning, RLHF/RLAIF, and personalization
- Comfortable working across modalities: text-to-image, image editing, multi-reference, and ideally video
- Strong PyTorch fluency; you write research code that others can build on
- Experience with distillation (LADD, DMD, consistency models, or similar) or with building high-throughput eval pipelines is a strong plus
- Bias toward shipping: measurable model-quality improvements that reach users, not just papers
How We Work Together
We’re a distributed team with real offices that people actually use. Depending on your role, you’ll either join us in Freiburg or San Francisco at least 2 days a week (or one full week every other week), or work remotely with a monthly in-person week to stay connected. We’ll cover reasonable travel costs to make this possible. We think in-person time matters, and we’ve structured things to make it accessible to all. We’ll discuss what this will look like for the role during our interview process.