Shield AI

Senior Cloud Engineer

Onsite | Senior | Full-time | $110,000 - $170,000 a year | San Diego, California / Dallas, Texas / San Francisco, California

First indexed 17 Apr 2026

Description

Shield AI is seeking a Senior Cloud Engineer to support its leadership in applied artificial intelligence development. In this role, you will be responsible for engineering, deploying, provisioning, and managing critical cloud systems that drive innovation across Shield AI's public and private cloud environments, both domestically and internationally.

As part of the Cloud and Infrastructure team within Enterprise Operations, you will play a key role in ensuring the performance, scalability, and reliability of these systems to support various business units. This position may involve occasional travel to Shield AI locations.

Responsibilities:

Engineering:
  • Manage and optimize multi-cloud infrastructure (Azure, AWS) for performance, reliability, and scalability.
  • Support and optimize cloud and virtual machine environments, assisting with capacity planning, performance monitoring, security compliance, and vulnerability remediation.
  • Assist in implementing and maintaining infrastructure systems, including servers, storage, backup solutions, and disaster recovery processes, for both public and private clouds.
  • Continuously learn and adapt to emerging technologies and platforms, leveraging automation wherever possible.
  • Author documentation for the systems you engineer and maintain, along with their associated processes, so that supporting teams can rely on it.
  • Assist in researching, recommending, and developing innovative solutions for complex requirements and issue resolution.
  • Collaborate cross-functionally with AI, DevOps, and Security teams to ensure compliance, observability, and resilience in mission-critical environments.
  • Follow Agile methodologies and apply sound engineering principles.
Operations and Support:
  • Perform daily system monitoring: verify the integrity and availability of all server resources, systems, and key processes, and review system and application logs.
  • Support system maintenance and upgrades, including OS patching, software configuration, hardware updates, and performance tuning to ensure optimal cloud infrastructure performance.
  • Provide escalated support for operational issues affecting systems, workloads, and Kubernetes AI infrastructure, potentially both during and after normal business hours.
  • Analyze, troubleshoot, and resolve system infrastructure and software issues.
  • Participate in on-call, emergency, or maintenance roles.

Requirements:

  • Bachelor’s degree in Computer Science or a related field, or equivalent experience (4+ years) plus an engineer-level certification such as Azure/AWS Associate or another certification of a similar level.
  • 4 years’ experience supporting applications and systems in a production environment; experience in high-availability, mission-critical, or defense-grade environments preferred.
  • Comfortable improving operational efficiency using Infrastructure as Code (IaC) solutions (e.g., Terraform, Ansible).
  • Strong understanding of networking concepts (VPCs, VPNs, subnets, routing, firewalls).
  • Experience in automating repetitive tasks using scripting languages such as PowerShell, Python, or Bash.
  • Experience with deployment and systems administration of at least one Linux distribution (e.g., RHEL, Ubuntu).
  • Experience with Microsoft Windows Server administration concepts and with Azure and Active Directory environments.
  • Organizational skills, a process-oriented mindset, attention to detail, and effective verbal and written communication abilities.
  • Ability to work independently to accomplish assigned tasks.
  • Solution-oriented, constructive approach to problem-solving.

Preferred Qualifications:

  • Experience deploying and maintaining workloads in Azure public cloud environments.
  • Hands-on experience with containerization and Kubernetes-based workloads.
  • Strong understanding of virtualization and private cloud platforms (e.g., VMware, Hyper-V, KVM).
  • Background in DevOps, Site Reliability Engineering (SRE), or cloud infrastructure roles.
  • Proficiency with configuration management and automation tools (e.g., Ansible, Chef, Puppet, Terraform).
  • Experience building and optimizing CI/CD pipelines.

Salary and Benefits:

  • $110,000 - $170,000 a year
  • Full-time regular employee offer package: Pay within range listed + Bonus + Benefits + Equity
  • Temporary employee offer package: Pay within range listed above + temporary benefits package (applicable after 60 days of employment)
This listing is enriched and indexed by YubHub. To apply, use the employer's original posting: https://jobs.lever.co/shieldai/702e2609-db48-49ab-8bec-d405c956a6ce