Description
The Senior Site Reliability Engineer will play a key role in developing the scalable, reliable, and efficient infrastructure that powers the entire company. This includes building and scaling internal platform offerings; designing and implementing monitoring, alerting, and incident response systems; and collaborating with application software engineers to guide their designs so that they scale to meet Carta's long-term needs.
The ideal candidate will have extensive experience with a cloud platform such as AWS, Google Cloud Platform, or Azure, including services like EC2, S3, RDS, and Lambda on AWS. They will also be proficient with tools such as Terraform, Ansible, or CloudFormation for provisioning and managing cloud infrastructure.
The team is responsible for providing secure, reliable, scalable, and performant infrastructure to Carta's customers and developers. The successful candidate will be a strong communicator who enjoys collaborating to solve complex problems and is familiar with infrastructure best practices for performance, reliability, and security, along with the tools that support them.
Our stack is Python, Java, Terraform, gRPC, Docker, Kubernetes, and Postgres, running on AWS. Come join us!