Description
As a Research Engineer on the RL Velocity team, you'll build and improve the core platform that underpins how we do RL at Anthropic, removing bottlenecks that slow down research and making it easier for the broader org to ship better models faster.
This is high-leverage work: small improvements to velocity compound across every researcher and every run.
Key responsibilities include building and improving the RL training infrastructure, identifying and removing bottlenecks across the RL stack, partnering closely with researchers and adjacent engineering teams, owning the reliability and performance of research runs end-to-end, and contributing to design decisions that shape how Anthropic does RL at scale.
Strong candidates will have solid software engineering fundamentals, a track record of building performant, reliable systems, and experience with ML infrastructure, distributed systems, or research tooling.
They should also be comfortable operating across the stack, from low-level performance work to RL algorithms, and should combine a bias toward shipping and iterating quickly with high agency and low ego.
The annual compensation range for this role is £370,000 to £630,000.