About the role
You will work at the intersection of cutting-edge research and practical engineering, contributing to the development of safe, steerable, and trustworthy AI systems.
Responsibilities
- Conduct research and implement solutions in areas such as model architecture, algorithms, data processing, and optimiser development
- Independently lead small research projects while collaborating with team members on larger initiatives
- Design, run, and analyze scientific experiments to advance our understanding of large language models
- Optimise and scale our training infrastructure to improve efficiency and reliability
- Develop and improve dev tooling to enhance team productivity
- Contribute to the entire stack, from low-level optimisations to high-level model design
Qualifications & Experience
- Degree (BA required, MS or PhD preferred) in Computer Science, Machine Learning, or a related field
- Strong software engineering skills with a proven track record of building complex systems
- Expertise in Python and deep learning frameworks
- Experience building high-performance, large-scale ML systems, particularly for language modelling
- Familiarity with ML accelerators, Kubernetes, and large-scale data processing
- Strong problem-solving skills and a results-oriented mindset
- Excellent communication skills and ability to work in a collaborative environment
You'll thrive in this role if you
- Have significant software engineering experience
- Are able to balance research goals with practical engineering constraints
- Are happy to take on tasks outside your job description to support the team
- Enjoy pair programming and collaborative work
- Are eager to learn more about machine learning research
- Are enthusiastic to work at an organisation that functions as a single, cohesive team pursuing large-scale AI research projects
- Have ambitious goals for AI safety and general progress in the next few years, and are excited to create the best outcomes over the long term
Sample Projects
- Optimising the throughput of novel attention mechanisms
- Proposing Transformer variants and experimentally comparing their performance
- Preparing large-scale datasets for model consumption
- Scaling distributed training jobs to thousands of accelerators
- Designing fault tolerance strategies for training infrastructure
- Creating interactive visualisations of model internals, such as attention patterns
How we're different
We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts, and we value impact: advancing our long-term goals of steerable, trustworthy AI, rather than working on smaller, more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.
Come work with us!
Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues.