Description
Summary
Microsoft AI is looking for a talented Member of Technical Staff, LLM Inference, to join its team in Redmond. In this role, you will work alongside researchers and engineers to implement frontier AI research ideas and introduce new systems, tools, and techniques to improve model inference performance.
About the Role
As a Member of Technical Staff, LLM Inference, you will be responsible for building and maintaining the tools and systems that enable Microsoft AI researchers to run models easily and efficiently. This will involve working on a variety of tasks, including building tools to help debug performance bottlenecks, numeric instabilities, and distributed systems issues. You will also be responsible for building tools and establishing processes to enhance the team's collective productivity.
Accountabilities
- Work alongside researchers and engineers to implement frontier AI research ideas
- Introduce new systems, tools, and techniques to improve model inference performance
- Build tools to help debug performance bottlenecks, numeric instabilities, and distributed systems issues
- Build tools and establish processes to enhance the team's collective productivity
The Candidate We're Looking For
Experience:
- 6+ years of technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
Technical skills:
- Experience with generative AI
- Experience with distributed computing
- Expertise in Python and the Python ecosystem (e.g., uv, pybind/nanobind, FastAPI)
Personal attributes:
- Results-oriented, with a bias toward action and a desire to own problems end to end
Benefits
- Competitive salary
- Comprehensive benefits package
- Opportunities for professional growth and development
- Collaborative and dynamic work environment