Cloudflare

Principal Software Engineer: Resiliency

hybrid senior full-time $230,000 - $281,000

First indexed 26 Apr 2026

Description

At Cloudflare, we're not looking for people who wait for a polished roadmap; we're looking for the builders who see the cracks in the Internet that everyone else has simply learned to live with.

We value candidates who have the instinct to spot a "normalized" problem and the AI-native curiosity to create a solution using the latest tools. Our culture is built on iteration, leveraging AI to ship faster today to make it better tomorrow, while ensuring that every improvement, no matter how small, is shared across the team to lift everyone up.

As a Principal Software Engineer on our Resiliency Engineering Team, you will work with several teams of passionate and talented engineers who are building the internal Control Plane used by our SRE and Infrastructure Operations teams to manage our internal DCaaS and IaaS platforms.

You will be responsible for tools that support the management of a growing, globally distributed fleet of servers, storage, and network gear spread across more than a thousand colos worldwide. You will play an active part in shaping the future of the infrastructure that propels Cloudflare's scale and growth.

Along the way, you will have the opportunity to write code to bring this design to fruition and to mentor high-potential engineers on their distributed systems journey.

You will be working alongside engineers who have presented at DevOpsDays, Config Management Camp 2024 & 2025, Monitorama, OSMC, KubeCon and PromCon. Together you will deliver on the key Health Mediated Deployment projects that are tracked by Cloudflare's senior leadership, up to the founders.

Examples of desirable skills, knowledge and experience include:

  • Minimum 10 years of experience working with distributed systems.
  • Experience designing, building and managing high-volume software applications.
  • Expert in at least one modern strongly-typed programming language.
  • Experience debugging, measuring, optimizing and identifying failure modes in a large-scale distributed system.
  • Excellent collaboration skills.
  • Proven ability to convey ideas effectively through verbal and written communication.
  • Ability to translate business needs into requirements, design documents and technical solutions.
  • Knowledge of API design standards, patterns and best practices.
  • Proven ability to use data to drive business outcomes.
  • Proven experience in developing architects and lead engineers.
  • Solid understanding of computer science fundamentals including data structures, algorithms, and object-oriented or functional design.

Bonus points for experience with optimizing and scaling infrastructure provisioning, repair, and decommissioning processes and automations.

This listing is enriched and indexed by YubHub. To apply, use the employer's original posting: https://job-boards.greenhouse.io/cloudflare/jobs/6709422