# Researcher, Alignment Oversight

**Company**: OpenAI
**Location**: San Francisco
**Work arrangement**: hybrid
**Experience**: mid
**Job type**: Full time
**Salary**: $250K – $445K
**Category**: Engineering
**Industry**: Technology
**Wikidata**: https://www.wikidata.org/wiki/Q124605186

**Apply**: https://jobs.ashbyhq.com/openai/16ae0f82-e390-453e-b175-0655f1b0fc67?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply
**Canonical**: https://yubhub.co/jobs/job_8b0ae9b6-96d

## Description

As a researcher on the Alignment team, you will design and run experiments that improve our ability to oversee increasingly capable models. You will work hands-on with model training, evaluation design, and research infrastructure, and translate promising oversight ideas into systems that can operate on real model traffic and real user workflows.

This role combines longer-horizon research with shorter deployment sprints. Projects are typically scoped around 3-6 month research timelines and aim to directly improve future model behavior.

In this role, you will:

- Design and implement alignment experiments focused on oversight systems for increasingly agentic AI models.

- Deploy practical systems for action monitoring, red-teaming, and human-in-the-loop control.

- Develop evaluations for alignment failure modes of frontier models, such as overeagerness, instruction-following failures, covert actions, restriction avoidance, and scheming propensity.

- Analyze deployment data to understand model failures, oversight gaps, and opportunities for training more aligned models.

- Develop techniques for feeding oversight signals back into training while preserving the reliability and independence of the oversight process.

- Produce externally publishable research when results advance the broader science of alignment.

- Collaborate across research, product, security, safety, and engineering teams to turn alignment ideas into working systems.

You might thrive in this role if you:

- Have strong hands-on experience training, evaluating, or debugging large ML models, especially LLMs.

- Have experience with reinforcement learning, post-training, preference optimization, scalable oversight, model evaluation, or adjacent empirical ML research.

- Have strong engineering execution and can turn ambiguous research ideas into reliable experiments, tools, training pipelines, and production-facing systems.

- Have research intuitions for which experiments are likely to teach us something, while staying grounded in implementation details and empirical results.

- Are a team player who is willing to take on a variety of tasks that move the team forward.

- Enjoy fast-paced, collaborative research environments where priorities shift as models and evidence change.

- See safety and usefulness as coupled goals.

## Skills

### Required
- ML models
- reinforcement learning
- post-training
- preference optimization
- scalable oversight
- model evaluation

### Nice to have
- LLMs
- empirical ML research
- engineering execution
- research intuitions
- team player

---

Source: [Apply at jobs.ashbyhq.com](https://jobs.ashbyhq.com/openai/16ae0f82-e390-453e-b175-0655f1b0fc67?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply)
