# Member of Technical Staff - Post-Training and RL

**Company**: xAI
**Location**: Palo Alto, CA
**Work arrangement**: onsite
**Experience**: staff
**Job type**: full-time
**Salary**: $180,000 - $600,000 USD
**Category**: Engineering
**Industry**: Technology
**Wikidata**: https://www.wikidata.org/wiki/Q120599684

**Apply**: https://job-boards.greenhouse.io/xai/jobs/5114737007?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply
**Canonical**: https://yubhub.co/jobs/job_fd550095-8ce

## Description

## About the Role

You will work on the most critical post-training and reinforcement learning challenges at any given time, including reward modeling, preference optimisation (RLHF/DPO), and RL for improving reasoning, truthfulness, and real-world capabilities.

You will get clarity on your first project before receiving an offer.

## Responsibilities

- Work on post-training and reinforcement learning challenges

- Develop and implement reward models and preference optimisation techniques

- Improve reasoning, truthfulness, and real-world capabilities using RL

## Qualifications

- Believe truth-seeking AI is the most important and challenging problem

- Obsessed with building incredibly useful models through post-training and RL techniques

- Power user of AI models and eager to push the boundaries of what's possible with reinforcement learning and alignment methods

## Compensation and Benefits

$180,000 - $600,000 USD

Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks.

## Skills

### Required
- post-training
- reinforcement learning
- reward modeling
- preference optimisation
- RLHF/DPO

### Nice to have
- alignment methods
- real-world capabilities

---

Source: [Apply at job-boards.greenhouse.io](https://job-boards.greenhouse.io/xai/jobs/5114737007?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply)
