Description
We're looking for a Research Engineer / Research Scientist to join our team. As a Research Engineer, you'll touch all parts of our code and infrastructure, whether that's making the cluster more reliable for our big jobs, improving throughput and efficiency, running and designing scientific experiments, or improving our dev tooling.
You'll work on large-scale ML systems from the ground up, helping build systems that are safe, steerable, and trustworthy. You'll be excited to write code when you understand the research context and, more broadly, why the work is important.
Strong candidates may also have experience with high-performance, large-scale ML systems; GPUs, Kubernetes, PyTorch, or OS internals; language modeling with transformers; reinforcement learning; and large-scale ETL.
Representative projects may include optimizing the throughput of a new attention mechanism, comparing the compute efficiency of two transformer variants, converting a Wikipedia dataset into a format that models can easily consume, scaling a distributed training job to thousands of GPUs, writing a design doc for fault tolerance strategies, and creating an interactive visualization of attention between tokens in a language model.
The annual compensation range for this role is $350,000-$500,000 USD.