Description
Build real-time AI agent infrastructure: Design and operate the stateful, low-latency runtime that powers voice and chat AI agents, from LLM streaming and conversation state management to graceful recovery and multi-channel support.
Solve distributed systems problems: Own session management across scaled-out workers, including affinity, checkpointing, crash recovery, and consistency under concurrent access.
Build a function execution platform: Own a serverless-style runtime where customers deploy custom logic, including build orchestration, container lifecycle, autoscaling, and versioned rollouts.
Own developer experience and test infrastructure: Build CLI tools, local development environments, and test execution frameworks that let engineers iterate quickly and ship with confidence.
Raise the bar on production quality: Drive observability, incident response, and engineering best practices across the team.
We're looking for a senior software engineer with 5+ years of experience in infrastructure, platform, or systems work. You should have strong Python and Go skills, as well as a deep understanding of distributed systems, including consistency, fault tolerance, state management, and concurrency.
Experience with Kubernetes and cloud-native infrastructure is also required. You should be able to build developer-facing tooling, such as CLIs, SDKs, local dev environments, or internal platforms.
A high bar for code quality is essential, including thorough testing, thoughtful code review, and sustainable engineering practices. You should be comfortable operating what you build, including on-call, incident response, and production ownership.
An AI-native workflow is a must: you should actively use LLMs and AI-assisted tools in your daily development.