# Staff Software Engineer, Machine Learning Platform

**Company**: Stripe
**Location**: San Francisco, Seattle
**Work arrangement**: remote
**Experience**: staff
**Job type**: full-time
**Category**: Engineering
**Industry**: Technology

**Apply**: https://job-boards.greenhouse.io/stripe/jobs/7939868?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply
**Canonical**: https://yubhub.co/jobs/job_037d00b1-967

## Description

## About the role

You will serve as a technical lead across the Machine Learning Platform space and a key contributor to the evolution of the platforms that power Stripe's ML-driven products.

## Responsibilities

- Take ownership of end-to-end architecture and system design for large, complex projects across ML Platform.

- Define technical direction for highly ambiguous projects, translating complex user needs into durable platform strategy.

- Design the system architecture and solutions for the most challenging problems in the ML Platform domain, including low-latency model inference, large-scale feature stores, real-time monitoring, and LLM/agent orchestration.

- Turn high-leverage ideas into tangible, robust solutions that shape the platform and product roadmaps, combining technical excellence with creative problem-solving.

- Scope and lead large projects with significant business impact, driving them from requirements through design, implementation, and production operation.

- Work directly with ML engineers, data scientists, and product teams to translate their needs into functional requirements and scalable technical solutions.

- Arbitrate critical decisions that balance competing priorities while meeting latency, reliability, cost, and security constraints.

- Serve as a key engineering representative, engaging senior leaders across Stripe and advising the leadership team on key technical considerations related to the end-to-end ML lifecycle.

- Drive cross-team technical initiatives that improve ML development velocity and MLOps maturity across the company.

- Mentor and grow other engineers. Serve as a role model for designing, implementing, and operating great software systems.

## Requirements

- 10+ years of professional software development experience, or equivalent domain expertise, with a solid background in service-oriented architecture and large-scale distributed systems.

- Track record of serving as a technical lead, with the ability to provide technical direction, lead multi-team initiatives, and mentor team members.

- Experience working on production ML platform services.

- Strong product instincts and a deep understanding of the business context in which you operate.

- Strong communication skills with the ability to explain complex technical concepts to both technical and non-technical stakeholders.

- Demonstrated ability to work cross-functionally, collaborating effectively with ML engineers, data scientists, software engineers, product managers, and business stakeholders.

- Ability to thrive with a high level of autonomy and responsibility, and comfort operating in ambiguous environments.

- Hands-on experience using AI tools to accelerate how you work.

## Preferred qualifications

- Experience building large-scale serving or data infrastructure for machine learning use cases (e.g., model inference, feature stores, real-time feature computation, model registries).

- Familiarity with LLMs, LLM frameworks, and agentic AI patterns (e.g., tool use, multi-agent orchestration, retrieval-augmented generation).

- Experience rapidly developing prototypes and iterating based on user feedback.

- Familiarity with cloud services (e.g., AWS) and cloud-based AI/ML services (e.g., SageMaker, Bedrock, Databricks, OpenAI).

- Experience training and shipping machine learning models to production to solve critical business problems.

- Ability to synthesize ideas across the organization while setting a compelling technical vision.

- Comfort working with geographically distributed teams.

- Passion for side projects, open source, or self-driven technical initiatives.

## Skills

### Required
- service-oriented architecture
- large-scale distributed systems
- machine learning platform services
- product instincts
- communication
- cross-functional collaboration
- autonomy
- responsibility
- AI tools

### Nice to have
- large-scale serving or data infrastructure
- LLMs
- LLM frameworks
- agentic AI patterns
- cloud services
- cloud-based AI/ML services
- prototype development
- user feedback
- geographically distributed teams
- side projects
- open source
- self-driven technical initiatives

---

Source: [Apply at job-boards.greenhouse.io](https://job-boards.greenhouse.io/stripe/jobs/7939868?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply)
