Anduril Industries

Senior Site Reliability Engineer

Senior | Full-time | $166,000-$220,000 USD | Costa Mesa, California

First indexed 18 Jun 2026

Description

Anduril Industries is a defense technology company with a mission to transform U.S. and allied military capabilities with advanced technology.

As a Senior Site Reliability Engineer on the Maritime Digital Shipbuilding team, you will build and operate the infrastructure that keeps our digital production systems running at full speed.

Responsibilities

  • Build and Manage CI/CD Pipelines: Develop and maintain CI/CD pipelines using tools like GitHub Actions and JFrog Artifactory to ensure seamless integration and deployment of machine learning models and applications.
  • Infrastructure as Code (IaC): Utilize Terraform and Ansible to automate infrastructure provisioning and management on cloud platforms such as Azure, AWS, or Google Cloud Platform (GCP).
  • Containerization and Orchestration: Implement containerization solutions with Docker and manage container orchestration using Kubernetes to ensure reliable deployment and scaling of applications.
  • Model Management and Deployment: Set up and maintain model registries and feature stores (e.g., MLflow, Kubeflow), and manage deployment pipelines for both batch and real-time inference.
  • Monitoring and Logging: Establish comprehensive monitoring and logging solutions using tools like ELK Stack (Elasticsearch, Logstash, Kibana), Prometheus, and Grafana to ensure the smooth operation of deployment environments.
  • Collaborate with Cross-Functional Teams: Work closely with development, data science, and operations teams to foster collaboration and ensure the efficient and effective deployment of machine learning models.
  • Optimize Performance: Utilize parallel computing frameworks such as CUDA and OpenCL to accelerate high-performance computing tasks, ensuring timely processing of large datasets and complex simulations.

Requirements

  • Advanced proficiency in Python for scripting and integration.
  • Experience with CI/CD tools like GitHub Actions, JFrog Artifactory, and Git.
  • Proficiency with IaC tools (Terraform, Ansible).
  • Experience with cloud platforms (Azure, AWS, GCP).
  • Proficiency in containerization (Docker) and container orchestration (Kubernetes).
  • Knowledge of model registries and feature stores (e.g., MLflow, Kubeflow).
  • Experience with logging and monitoring tools (ELK Stack, Prometheus, Grafana).
  • Understanding of parallel computing frameworks (CUDA, OpenCL).
  • Strong collaboration skills and proficiency with collaborative tools (Jira, Confluence).
  • Eligible to obtain and maintain an active U.S. Secret security clearance.

Preferred Qualifications

  • Previous experience in a manufacturing or industrial setting.
  • Familiarity with observability concepts and tools.
  • Knowledge of security best practices for DevOps and MLOps.

Benefits

  • US Salary Range: $166,000-$220,000 USD
  • Comprehensive medical, dental, and vision plans
  • Income protection: life and disability insurance
  • Generous time off: highly competitive PTO plans
  • Family planning and parenting support
  • Mental health resources
  • Professional development
  • Commuter benefits
  • Relocation assistance
  • Retirement savings plan

This listing is enriched and indexed by YubHub. To apply, use the employer's original posting: https://job-boards.greenhouse.io/andurilindustries/jobs/4995589007