Description
As a Staff Site Reliability Engineer at Reddit, you will play a key role in leading reliability engineering initiatives for critical user-facing systems at internet scale. You will partner closely with product and infrastructure teams to improve availability, latency, scalability, and operational excellence across Reddit's most business-critical experiences.
In this role, you will:
- Lead Reliability Engineering for User Experience: Drive reliability, scalability, and operational excellence for critical user-facing systems and services.
- Architect for Scale: Partner with product and infrastructure engineering teams to design systems that remain highly available and performant under massive global load.
- Reduce Operational Risk: Identify systemic risks and reliability bottlenecks across services, dependencies, deployments, and infrastructure.
- Drive Automation: Eliminate repetitive operational work through automation and tooling.
- Incident Management: Lead complex incident response efforts across engineering teams.
You will have the opportunity to work on a wide range of projects, from improving the performance of Reddit's core systems to developing new tools and processes to enhance the reliability and scalability of our platform.
We are looking for a highly skilled, experienced Site Reliability Engineer who is passionate about building scalable, reliable systems. If you are a strong collaborator with excellent communication skills who enjoys problem-solving, we encourage you to apply.