About the role
You will work at the intersection of cutting-edge research and practical engineering, contributing to the development of safe, steerable, and trustworthy AI systems.
Responsibilities
- Conduct research and implement solutions in areas such as model architecture, algorithms, data processing, and optimiser development
- Independently lead small research projects while collaborating with team members on larger initiatives
- Design, run, and analyze scientific experiments to advance our understanding of large language models
- Optimise and scale our training infrastructure to improve efficiency and reliability
- Develop and improve dev tooling to enhance team productivity
- Contribute to the entire stack, from low-level optimisations to high-level model design
Qualifications & Experience
- Degree (BA required, MS or PhD preferred) in Computer Science, Machine Learning, or a related field
- Strong software engineering skills with a proven track record of building complex systems
- Expertise in Python and deep learning frameworks
- Experience building high-performance, large-scale ML systems, particularly for language modelling
- Familiarity with ML accelerators, Kubernetes, and large-scale data processing
- Strong problem-solving skills and a results-oriented mindset
- Excellent communication skills and ability to work in a collaborative environment
You'll thrive in this role if you
- Have significant software engineering experience
- Are able to balance research goals with practical engineering constraints
- Are happy to take on tasks outside your job description to support the team
- Enjoy pair programming and collaborative work
- Are eager to learn more about machine learning research
- Are enthusiastic to work at an organisation that functions as a single, cohesive team pursuing large-scale AI research projects
- Have ambitious goals for AI safety and general progress in the next few years, and are excited to create the best outcomes over the long term
Sample Projects
- Optimising the throughput of novel attention mechanisms
- Proposing Transformer variants and experimentally comparing their performance
- Preparing large-scale datasets for model consumption
- Scaling distributed training jobs to thousands of accelerators
- Designing fault tolerance strategies for training infrastructure
- Creating interactive visualisations of model internals, such as attention patterns
How we're different
We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts, and we value impact: advancing our long-term goals of steerable, trustworthy AI, rather than working on smaller, more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.
Come work with us!
Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues.