Description
Some careers have more impact than others. If you’re looking for a career where you can make a real impression, join HSBC and discover how valued you’ll be. We are currently seeking an experienced professional to join our team in the role of Senior Consultant Specialist.
As a Senior Consultant Specialist, you will be responsible for: end-to-end service ownership; observability, monitoring, and alerting; API gateway and traffic management; authentication, authorisation, and security controls; resilience engineering and DR; incident management and problem management; CI/CD, release reliability, and environment management; database and dependency reliability; vulnerability management (CVE) and patching; and leadership and stakeholder management.
Key responsibilities include:
- End-to-end service ownership (AWS API platform) - Own reliability and operational readiness for API services deployed on AWS across multiple environments (dev/test/stage/prod).
- Observability, monitoring, and alerting - Design and implement monitoring and service status visibility (dashboards, service health views, dependency mapping).
- API gateway and traffic management - Operate and optimise Kong Gateway (or equivalent) for routing, rate limiting, throttling, authentication integration, and policy enforcement.
- Authentication, authorisation, and security controls - Work with IAM/security teams to ensure strong authentication/authorisation controls (e.g., OAuth2/OIDC, mTLS, token validation, secrets management).
- Resilience engineering and DR - Define and implement resilience patterns: multi-AZ design, failover strategies, graceful degradation, and dependency resilience.
- Incident management and problem management - Lead major incident response (triage, coordination, communications, recovery).
- CI/CD, release reliability, and environment management - Partner with engineering teams to improve CI/CD pipelines and release safety (progressive delivery, canary/blue-green, automated rollback).
- Database and dependency reliability - Provide reliability guidance for databases and stateful components (performance, backup/restore, replication, patching, capacity).
- Vulnerability management (CVE) and patching - Own and drive the operational response to CVEs and other vulnerability findings: triage, risk assessment, patch planning, and verification.
Requirements include:
- 10+ years’ experience in fintech or regulated financial services operating customer-facing digital platforms.
- Proven experience leading SRE/production operations for cloud-based services, ideally AWS.
- Strong hands-on experience with: API gateways (Kong preferred) and API platform operations; observability tooling (AppDynamics, Splunk); incident management, RCA, and operational governance; CI/CD pipelines and release engineering practices; security controls (IAM, secrets management, secure configuration, vulnerability/CVE remediation); and database operations and performance troubleshooting (SQL/NoSQL exposure beneficial).
- Strong understanding of reliability engineering concepts: SLI/SLO, error budgets, capacity planning, resilience patterns, DR.
Technical skills include Infrastructure as Code (Terraform/CloudFormation), containers and orchestration (Docker/Kubernetes/EKS), scripting/automation (Python, Bash), and AWS services commonly used in API platforms.
Soft skills include calm, structured leadership during high-severity incidents, strong stakeholder communication, a bias for automation and continuous improvement, and a collaborative mindset across engineering, security, and operations.