Description
We're seeking a Staff Software Engineer, Capacity Engineering to join our team. As a key member of our infrastructure team, you will be responsible for efficiently managing one of the largest-scale cloud-native infrastructures in the world. This role has direct visibility across Pinterest Engineering and with Engineering and company leadership.
The ideal candidate will have a strong background in implementing performance and efficiency projects on large-scale distributed systems. You will improve the efficiency of large-scale shared environments like Kubernetes, improve the performance and efficiency of large-scale distributed systems that drive Pinterest systems, and build, develop and mature profiling and optimization capabilities for Pinterest scale.
You will collaborate with Infrastructure Engineering and SRE teams in their mission to deliver highly available, resilient, secure and efficient foundations for Pinterest’s tech stack. You will leverage AI to scale the impact of yourself and the team, including accelerating performance investigations, building tooling and agents that allow users to self-serve efficiency insights and recommendations, and iterating faster on optimization approaches and rollout plans.
To be successful in this role, you will need a deep understanding of infrastructure capacity and performance, experience leading efficiency initiatives at scale on Kubernetes or other large-scale shared infrastructure, and strong technical and performance engineering skills to collaborate with stakeholders on complex and ambiguous technical challenges.
Additionally, you will need experience building and managing highly available distributed applications at scale, proficiency in software development languages such as Java, Python and C++, excellent skills in communicating complex technical issues, experience with AWS or similar cloud environments, and a demonstrated ability to use AI to improve the speed and quality of your day-to-day work.
Bonus points for hands-on experience with large, cloud-native multi-tenant platforms at Internet scale.