Anduril

Infrastructure Reliability Engineer

onsite · senior · full-time · $146,000-$194,000 USD · Costa Mesa, California, United States

First indexed 2 Jun 2026

Description

This is a small but growing team responsible for the infrastructure and operations behind core developer tools used across the entire engineering organization. You'll own the full lifecycle (patching, upgrades, backups, scaling, and incident response) for services that every engineer depends on daily. The role blends DevOps, SRE, and software engineering, and is ideal for engineers who want high ownership and company-wide impact. You should have a mindset of continuous improvement: if something is manual and repetitive, your instinct should be to automate it away. As the company's on-prem infrastructure footprint grows, this team will expand its scope to provide SRE capabilities for on-prem systems, making this an opportunity to help shape that practice from the ground up.

  • Own the lifecycle of core self-hosted developer tools (e.g., GitHub Enterprise Server, CircleCI, JFrog Artifactory/Xray)
  • Design and implement automated systems for patching, backups (with validation), and upgrades
  • Scale infrastructure to support a fast-growing engineering org
  • Use Infrastructure-as-Code (Terraform) to manage environments
  • Operate and troubleshoot systems using Docker, Kubernetes, and cloud platforms (AWS, GCP, Azure)
  • Define and maintain SLOs for service availability, reliability, and performance
  • Build and maintain monitoring, alerting, and observability for developer tool services
  • Lead and participate in incident response and root cause analysis
  • Work cross-functionally with platform, security, infrastructure (on-prem and cloud), and software teams
This listing is enriched and indexed by YubHub. To apply, use the employer's original posting: https://job-boards.greenhouse.io/andurilindustries/jobs/5149139007