Description

About EarnIn

EarnIn is a pioneer in earned wage access, providing financial flexibility for individuals living paycheck to paycheck.

Position Summary

We have a real passion for delivering the best product experience for our community members. We work closely with all teams and share responsibility for rapidly delivering production-ready features to our community. We build or contribute to infrastructure, reliability tooling, and practices that help teams ship quickly and safely.

Responsibilities

Design systems with resilience, graceful degradation, and capacity in mind.
Define and measure SLOs and SLIs that actually reflect what our customers feel.
Use Datadog (logging, metrics, APM) together with CloudWatch to build signal-heavy, noise-light observability.
Configure alerting and routing that reach engineers through incident.io, where we run incident management and on-call, so that when a human gets paged, it really matters.
Continuously improve our incident lifecycle, from fast detection and solid triage, through clear communication, to blameless, actionable follow-ups.
Combine solid software fundamentals with reliability thinking so our systems are highly available, easy to debug, and a joy to work on.

Requirements

A bachelor's or master's degree in computer science or equivalent industry experience.
3+ years of experience in an SRE or Software Engineering role.
Hands-on coding experience in Python and/or Go.
Distributed Systems Expertise: Proven experience designing, operating, and shepherding large-scale distributed systems from design through production, including incident learnings that make on-call quieter over time.
Reliability Engineering Mindset: Deep fluency in SLOs, SLIs, error budgets, and MTTR, using them to drive decisions and explain tradeoffs, not just decorate dashboards.
Observability & Incident Response: Treats observability as essential, not optional; stays calm under pressure; can diagnose incidents from logs and metrics and translate findings into durable process and technical improvements.
Cross-functional Communication: Able to work across technical and non-technical teams, reduce silos through documentation and runbooks, and explain reliability concepts in plain language.
Operational Tooling & AI Fluency: Selects the right tools for production management and leverages AI-assisted development to reduce toil, accelerate RCA, and streamline infrastructure-as-code workflows.
Leadership & Mentorship: Can plan and lead strategic reliability initiatives across engineering, and invests in mentoring engineers as a high-leverage path to long-term reliability improvements.

What We're Looking For

A bachelor's or master's degree in computer science or equivalent industry experience; 3+ years of experience in an SRE or Software Engineering role; hands-on coding experience in Python and/or Go; and strengths in distributed systems, reliability engineering, observability and incident response, cross-functional communication, operational tooling and AI fluency, and leadership and mentorship.

Benefits

EarnIn offers excellent employee benefits, including healthcare, internet and cell phone reimbursement, a learning and development stipend, and potential opportunities to travel to our Mountain View headquarters.

This listing is enriched and indexed by YubHub. To apply, use the employer's original posting: https://job-boards.greenhouse.io/earnin/jobs/7895723