Description
Role Overview
The purpose of this role is to ensure the smooth operation of our production data assets. Activities include monitoring production systems for incidents, alerting the applicable parties when incidents arise, and triaging and managing those incidents. The analyst will also carry out activities to prevent production incidents.
Key Responsibilities
- Work with data pipelines, handling incidents and root cause analysis (RCA).
- Administer, analyze, and prioritize system issues and negotiate a course of action for resolution.
- Support workflows and solutions; troubleshoot user errors and support reporting capabilities.
- Use system monitoring utilities to monitor system availability.
- Extract and compile system monitoring data to create availability scorecards and reports.
- System Monitoring: Continuously monitor IT systems to ensure optimal performance and availability, identifying and addressing potential issues before they escalate.
- Monitoring and Maintenance: Regularly monitor production data assets to ensure they are functioning correctly and efficiently. Alert applicable parties if an issue arises in production.
- Issue Resolution: Work with the data team to identify, diagnose, and resolve technical issues related to production data assets. Work with relevant teams to implement effective solutions.
- Incident Management: Manage and prioritize incidents, ensuring that they are resolved promptly and efficiently in line with the incident management process. Document incidents and resolutions for future reference.
- Incident Management: Respond to and resolve technical issues reported by users or automated monitoring alerts. This includes diagnosing problems, identifying solutions, and implementing fixes.
- Problem Analysis: Analyze recurring issues to identify root causes and implement long-term solutions to prevent future occurrences.
- Root Cause Analysis: Conduct thorough investigations to determine the underlying causes of recurring incidents and implement preventive measures.
- Preventative Measures: Identify incidents that recur and put solutions in place to prevent recurrence.
- Data Integrity: Work with the data team to ensure the accuracy and integrity of data produced and provided to the business. Work with the data teams to implement and maintain quality control measures that prevent errors.
- Documentation: Maintain comprehensive documentation of processes, system configurations, and troubleshooting procedures. Ensure documentation is created and has a clear owner, whether that is the data team or the production support team.
- Support: Provide support to data teams, data users, and stakeholders. Respond to inquiries and assist with requests as applicable.
- Optimization: Identify opportunities to optimize data production processes and implement improvements to enhance efficiency.
- Performance Optimization: Analyze system performance and identify areas for improvement. Suggest and implement changes to enhance system efficiency and reliability.
This listing is enriched and indexed by YubHub. To apply, use the employer's original posting:
https://jobs.workable.com/view/ffvEvDAAYzjgBfJeCMdK9E/remote-fbs-data-production-support-analyst-(data-pipelines)-in-mexico-at-capgemini