Description
As a researcher on the Alignment team, you will design and run experiments that improve our ability to oversee increasingly capable models. You will work on hands-on model training, evaluation design, and research infrastructure, and you will translate promising oversight ideas into systems that can operate on real model traffic and real user workflows.
This role combines longer-horizon research with shorter deployment sprints, with projects typically scoped around 3-6 month research timelines and aimed at directly improving future model behavior.
In this role, you will:
- Design and implement alignment experiments focused on oversight systems for increasingly agentic AI models.
- Deploy practical systems for action monitoring, red-teaming, and human-in-the-loop control.
- Develop evaluations for alignment failure modes of frontier models, such as overeagerness, instruction-following failures, covert actions, restriction avoidance, and scheming propensity.
- Analyze deployment data to understand model failures, oversight gaps, and opportunities for training more aligned models.
- Develop techniques for feeding oversight signals back into training while preserving the reliability and independence of the oversight process.
- Produce externally publishable research when results advance the broader science of alignment.
- Collaborate across research, product, security, safety, and engineering teams to turn alignment ideas into working systems.
You might thrive in this role if you:
- Have strong hands-on experience training, evaluating, or debugging large ML models, especially LLMs.
- Have experience with reinforcement learning, post-training, preference optimization, scalable oversight, model evaluation, or adjacent empirical ML research.
- Have strong engineering execution and can turn ambiguous research ideas into reliable experiments, tools, training pipelines, and production-facing systems.
- Have research intuitions for which experiments are likely to teach us something, while staying grounded in implementation details and empirical results.
- Are a team player who is willing to take on a variety of tasks that move the team forward.
- Enjoy fast-paced, collaborative research environments where priorities shift as models and evidence change.
- See safety and usefulness as coupled goals.
This listing is enriched and indexed by YubHub. To apply, use the employer's original posting:
https://jobs.ashbyhq.com/openai/16ae0f82-e390-453e-b175-0655f1b0fc67