Description
We're seeking a senior software engineer to lead the design and development of our Kubernetes-native inference platform. In this role, you will lead design reviews, drive architecture, and own the reliability and scalability of the platform.
Key responsibilities include:
- Leading design reviews and driving architecture within the team
- Defining and owning SLIs/SLOs, ensuring post-incident action items are completed, and improving reliability release over release
- Implementing advanced optimizations such as micro-batch schedulers, speculative decoding, and KV-cache reuse
- Strengthening incident posture through capacity planning, autoscaling policy, and rollback/traffic-shift strategies
- Mentoring IC1/IC2 engineers and reviewing cross-team designs to raise coding and testing standards
We're looking for someone with strong coding skills in Python or Go, deep familiarity with networked systems and performance, and hands-on experience with Kubernetes at production scale. Experience with inference internals, batching, caching, mixed precision, and streaming token delivery is a plus.
In addition to a competitive salary, we offer a range of benefits including medical, dental, and vision insurance, company-paid life insurance, and flexible PTO. We're committed to creating a work environment that's inclusive, diverse, and supportive of our employees' well-being.