Anthropic

Research Engineer / Scientist, Alignment Science - London

Hybrid · Senior · Full-time · £260,000–£370,000 · London, UK

First indexed 18 Apr 2026

Description

About the role:

You will contribute to exploratory experimental research on AI safety, with a focus on risks from powerful future systems. As a Research Engineer on the Alignment Science team, you'll develop methods to ensure advanced AI systems remain safe and harmless in unfamiliar or adversarial scenarios.

Responsibilities:

  • Conduct research on AI control and alignment stress-testing
  • Develop and implement new techniques for ensuring AI safety
  • Collaborate with other teams, including Interpretability, Fine-Tuning, and the Frontier Red Team
  • Test and evaluate the effectiveness of AI safety techniques

Requirements:

  • Significant software, ML, or research engineering experience
  • Familiarity with technical AI safety research
  • Experience contributing to empirical AI research projects

Preferred qualifications:

  • Experience authoring research papers in machine learning, NLP, or AI safety
  • Experience with LLMs
  • Experience with reinforcement learning

Benefits:

  • Competitive compensation and benefits
  • Optional equity donation matching
  • Generous vacation and parental leave
  • Flexible working hours

Note:

This role requires candidates to be based in London at least 25% of the time and to travel to San Francisco occasionally.

This listing is enriched and indexed by YubHub. To apply, use the employer's original posting: https://job-boards.greenhouse.io/anthropic/jobs/4610158008