Anthropic

Technical Program Manager, Safeguards (Infrastructure & Evals)

Hybrid · Senior · Full-time · $290,000-$365,000 USD · San Francisco, CA | New York City, NY | Seattle, WA

First indexed 18 Apr 2026

Description

About the Role

Safeguards Engineering builds and operates the infrastructure that keeps Anthropic's AI systems safe in production. As a Technical Program Manager for Safeguards Infrastructure and Evals, you'll own the operational health and forward momentum of this stack.

Your primary responsibility is driving reliability: owning the incident-response and post-mortem process, ensuring SLOs are defined and met in partnership with various teams, and making sure that when things go wrong, the right people know, the right actions get taken, and those actions actually get closed out.

Alongside that ongoing operational rhythm, you'll coordinate the larger platform investments: migrations, eval-platform improvements, and the cross-team dependencies that connect them.

This role sits at the intersection of operations and program management. It requires genuine technical depth: you need to understand how these systems work well enough to triage effectively, judge what's actually safety-critical versus what can wait, and have informed conversations with the engineers building and maintaining them.

But the core of the job is keeping the machine running well and the work moving.

Responsibilities

  • Own the Safeguards Engineering ops review: drive the recurring cadence that keeps the team informed and coordinated, surfacing recent incidents and failures, bringing visibility to reliability trends, and making sure the right people are in the room when decisions need to be made.
  • Drive incident tracking and post-mortem execution
  • Establish and maintain SLOs with partner teams
  • Maintain runbook quality and incident-ownership clarity
  • Drive platform migrations and infrastructure projects
  • Coordinate evals platform improvements

Requirements

  • Solid technical program management experience, particularly in operational or infrastructure-heavy environments
  • An understanding of production ML systems deep enough to triage incidents intelligently and have substantive conversations with engineers about what's going wrong and why
  • Ability to work effectively across team boundaries
  • Experience with or strong interest in AI safety

Nice to Have

  • Experience with SRE practices, incident management frameworks, or on-call operations at scale
  • Familiarity with monitoring and alerting tooling (PagerDuty, Datadog, or equivalents)
  • Experience driving infrastructure migrations in complex, multi-team environments
This listing is enriched and indexed by YubHub. To apply, use the employer's original posting: https://job-boards.greenhouse.io/anthropic/jobs/5108695008