Description

Electronic Arts creates next-level entertainment experiences that inspire players and fans around the world. This role is part of the CT - Infrastructure & Platform team, which builds and operates distributed, large-scale, cloud-based infrastructure using modern open-source software solutions.

As an SEIII/SRE Engineer, you will be responsible for building and operating a unified platform across EA, extracting and processing massive volumes of data spanning 20+ game studios, and using the resulting insights to serve a massive volume of online requests. You will also use automation technologies to ensure repeatability, eliminate toil, reduce mean time to detection (MTTD) and mean time to resolution (MTTR), and repair services.

Your responsibilities will include:

Building and operating distributed, large-scale, cloud-based infrastructure using modern open-source software solutions
Helping build and operate a unified platform across EA, extracting and processing massive volumes of data spanning 20+ game studios, and using the resulting insights to serve a massive volume of online requests
Using automation technologies to ensure repeatability, eliminate toil, reduce MTTD and MTTR, and repair services
Performing root cause analysis and post-mortems with an eye towards future prevention
Designing and building CI/CD pipelines
Creating monitoring, alerting and dashboarding solutions that improve visibility into EA's application performance and business metrics
Producing documentation and support tooling for online support teams
Developing reporting systems that inform on important metrics, detect anomalies, and forecast future results
Developing and operating both SQL and NoSQL solutions
Building complex queries to solve data mining problems
Developing a large-scale online platform to personalize the player experience and provide reporting and feedback
Helping interview and hire the best candidates for the team
Mentoring team members and helping them grow their skill sets
Driving growth and modernization efforts and projects for the team

To be successful in this role, you will need:

7+ years of experience with virtualization, containerization, cloud computing (AWS preferred), VMware ecosystems, Kubernetes, or Docker
7+ years of experience supporting high-availability, production-grade data infrastructure and applications with defined SLIs and SLOs
Systems Administration or Cloud experience, including a strong understanding of Linux / Unix
Network experience, including an understanding of standard protocols/components
Automation and orchestration experience, including Terraform, Helm, Chef, and Packer
Experience writing code in Python, Golang, or Java
Experience with a monitoring stack such as Prometheus, Grafana, Loki, and Alertmanager
Experience with distributed systems that serve massive numbers of concurrent requests
Experience working with large-scale systems and data platforms/warehouses

If you are passionate about building and operating scalable, reliable, and efficient systems, and have a strong background in software development and operations, we encourage you to apply for this exciting opportunity.

This listing is enriched and indexed by YubHub. To apply, use the employer's original posting: https://jobs.ea.com/en_US/careers/JobDetail/SRE-III/214248