Description
As a Senior Data Scientist for LLM Evaluation, you will develop and implement cutting-edge methodologies to help evaluate how well Copilot performs in real-world usage scenarios.
Users turn to Copilot for various tasks, making it crucial to ensure our AI systems effectively assist them.
Your responsibilities will include:
- Developing new methods to evaluate LLMs, training classifiers, and experimenting with data collection techniques
- Implementing methodologies to provide real-time signals on Copilot performance
- Collaborating with user researchers and product leaders to build automated evaluation frameworks
The ideal candidate will have experience in social sciences, machine learning, and natural language analysis, with strong problem-solving skills and the ability to work independently.
Responsibilities
- Apply your expertise to measure Copilot performance, identify failure modes, and develop mitigation strategies
- Create and implement comprehensive evaluation frameworks across diverse scenarios
- Build automated testing systems and write efficient code for model pipelines
- Maintain a user-oriented perspective and serve as a trusted advisor on AI matters
- Track advances in research and adapt algorithms to drive innovation
Qualifications
- Doctorate or Master's degree in Data Science, Mathematics, Statistics, or a related field, with relevant experience
- Experience with large language models, Python programming, and Responsible AI
This listing is enriched and indexed by YubHub. To apply, use the employer's original posting:
https://microsoft.ai/job/senior-data-scientist-15/