Description
We are seeking a Research Engineer to join our Pretraining team. In this role, you will conduct research and implement solutions in areas such as model architecture, algorithms, data processing, and optimizer development. You will also independently lead small research projects while collaborating with team members on larger initiatives.
Key responsibilities include designing, running, and analyzing scientific experiments to advance our understanding of large language models. You will also optimize and scale our training infrastructure to improve efficiency and reliability, and build and improve developer tooling to enhance team productivity.
As a Research Engineer, you will contribute to the entire stack, from low-level optimizations to high-level model design. You will work at the intersection of cutting-edge research and practical engineering, contributing to the development of safe, steerable, and trustworthy AI systems.
The ideal candidate will have an advanced degree in Computer Science, Machine Learning, or a related field, and strong software engineering skills with a proven track record of building complex systems. You should be familiar with Python and have experience with deep learning frameworks, particularly PyTorch. You should also have expertise in large-scale machine learning, particularly in the context of language models.
You will thrive in this role if you have significant software engineering experience, are results-oriented with a bias towards flexibility and impact, are willing to take on tasks outside your job description to support the team, enjoy pair programming and collaborative work, and are eager to learn more about machine learning research.