# Senior Site Reliability Engineer

**Company**: Anduril Industries
**Location**: Costa Mesa, California
**Experience**: senior
**Job type**: full-time
**Salary**: $166,000-$220,000 USD
**Category**: Engineering
**Industry**: Technology

**Apply**: https://job-boards.greenhouse.io/andurilindustries/jobs/4995589007?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply
**Canonical**: https://yubhub.co/jobs/job_c9285b30-d26

## Description

Anduril Industries is a defense technology company with a mission to transform U.S. and allied military capabilities with advanced technology.

As a Senior Site Reliability Engineer on the Maritime Digital Shipbuilding team, you will build and operate the infrastructure that keeps our digital production systems running at full speed.

## Responsibilities

- Build and Manage CI/CD Pipelines: Develop and maintain CI/CD pipelines using tools like GitHub Actions and JFrog Artifactory to ensure seamless integration and deployment of machine learning models and applications.

- Infrastructure as Code (IaC): Utilize Terraform and Ansible to automate infrastructure provisioning and management on cloud platforms such as Azure, AWS, or Google Cloud Platform (GCP).

- Containerization and Orchestration: Implement containerization solutions with Docker and manage container orchestration using Kubernetes to ensure reliable deployment and scaling of applications.

- Model Management and Deployment: Set up and maintain model registries and feature stores (e.g., MLflow, Kubeflow), and manage deployment pipelines for both batch and real-time inference.

- Monitoring and Logging: Establish comprehensive monitoring and logging solutions using tools like ELK Stack (Elasticsearch, Logstash, Kibana), Prometheus, and Grafana to ensure the smooth operation of deployment environments.

- Collaborate with Cross-Functional Teams: Work closely with development, data science, and operations teams to foster collaboration and ensure the efficient and effective deployment of machine learning models.

- Optimize Performance: Utilize parallel computing frameworks such as CUDA and OpenCL to accelerate high-performance computing tasks, ensuring timely processing of large datasets and complex simulations.

## Requirements

- Advanced proficiency in Python for scripting and integration.

- Experience with CI/CD and version control tools such as GitHub Actions, JFrog Artifactory, and Git.

- Proficiency with IaC tools (Terraform, Ansible).

- Experience with cloud platforms (Azure, AWS, GCP).

- Proficiency in containerization (Docker) and container orchestration (Kubernetes).

- Knowledge of model registries and feature stores (e.g., MLflow, Kubeflow).

- Experience with logging and monitoring tools (ELK Stack, Prometheus, Grafana).

- Understanding of parallel computing frameworks (CUDA, OpenCL).

- Strong collaboration skills and proficiency with collaboration tools (Jira, Confluence).

- Eligible to obtain and maintain an active U.S. Secret security clearance.

## Preferred Qualifications

- Previous experience in a manufacturing or industrial setting.

- Familiarity with observability concepts and tools.

- Knowledge of security best practices for DevOps and MLOps.

## Benefits

- US Salary Range: $166,000-$220,000 USD

- Comprehensive medical, dental, and vision plans

- Income protection: life and disability insurance

- Generous time off: highly competitive PTO plans

- Family planning and parenting support

- Mental health resources

- Professional development

- Commuter benefits

- Relocation assistance

- Retirement savings plan

## Skills

### Required
- Python
- GitHub Actions
- JFrog Artifactory
- Git
- Terraform
- Ansible
- Azure
- AWS
- Google Cloud Platform (GCP)
- Docker
- Kubernetes
- MLflow
- Kubeflow
- ELK Stack
- Prometheus
- Grafana
- CUDA
- OpenCL
- Jira
- Confluence

### Nice to have
- Manufacturing
- Observability
- Security best practices for DevOps and MLOps

---

Source: [Apply at job-boards.greenhouse.io](https://job-boards.greenhouse.io/andurilindustries/jobs/4995589007?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply)
