Databricks

Sr. Manager – Data & AI Support Engineering

onsite senior full-time Texas

First indexed 5 Jun 2026

Description

As a Sr. Manager of the Data & AI Support Engineering team, you will lead and manage a team of Technical Solutions Engineers responsible for driving deep technical resolutions for complex customer issues across Spark, AI/ML, Streaming, and Lakehouse platforms.

You will help customers realise business value from Databricks Ecosystem products through strong technical leadership, AI-first operational innovation and customer-centric execution.

Mission:

Lead and scale a world-class AI-first Data & AI Support Engineering organisation that combines deep technical expertise, operational excellence, intelligent automation and customer-centric support to accelerate issue resolution, improve platform reliability and drive exceptional customer outcomes across enterprise-scale Data and AI workloads.

Key Responsibilities:

  • Build AI-enabled support workflows and reusable automations to improve resolution speed and support quality.
  • Use Agentic AI systems, logs, telemetry, observability platforms and internal systems to accelerate troubleshooting and root-cause analysis safely.
  • Create reusable runbooks, prompts, and agentic workflows that scale operational efficiency across teams.
  • Ensure strong AI governance, customer data safety, validation practices, auditability, and human-in-the-loop controls.
  • Partner with Engineering and Product teams to drive AI-first support innovation and operational excellence.

Outcomes:

  • Drive AI-first support transformation initiatives that improve resolution speed, case quality, operational efficiency and customer experience.
  • Partner with Engineering and Product teams to operationalise AI-assisted diagnostics, observability insights, and intelligent escalation management for enterprise customers.
  • Build and scale reusable AI-enabled workflows, automations, runbooks, and operational intelligence frameworks across the support organisation.
  • Lead and manage Technical Solutions Engineers, Team Leads, and support operations personnel across AMER support functions based out of the Dallas location.
  • Own and improve operational KPIs including customer satisfaction, escalation management, backlog health, resolution efficiency, and support quality.
  • Act as a senior escalation point for customers and internal teams while driving operational excellence and process optimisation.
  • Lead hiring, onboarding, mentoring, technical assessments, training, and career development for support engineers and technical leads.
  • Conduct regular one-on-ones, annual reviews, and career development discussions with direct reports.
  • Be a hands-on technical leader supporting complex issues related to Spark Core, Spark SQL, Structured Streaming, Delta Lake, Lakehouse architecture, and Databricks Runtime technologies.
  • Guide customers on Spark runtime optimisation, distributed systems performance, and best practices for scalable Data & AI workloads.
  • Own Engineering JIRA escalations and proactively drive faster resolutions for customer-reported product issues.
  • Maintain internal operational documentation, runbooks, and customer-facing knowledge base assets.
  • Coordinate closely with Engineering, Backline Support Engineering, and customer experience intelligence teams to identify, reproduce, and report product defects effectively.
  • Act as a strong customer advocate and collaborate with cloud partners to support mutual customer success.
  • Participate in major incident management, escalation handling, on-call rotations, and critical production support activities.

Requirements:

  • 10+ years of experience designing, building, troubleshooting, and supporting large-scale Data & AI applications using Python, Java, Scala, Spark, or related distributed technologies.
  • Strong work experience with AI-enabled support workflows, agentic AI systems, Claude Skills workflows, RAG architectures, vector databases, and other operational automation frameworks.
  • Proven development and delivery experience at production scale with Databricks technologies such as Model Serving, Lakehouse, Delta, DLT, Lakeflow, and Lakebase is a strong plus.
  • Experience using AI tools for troubleshooting, root-cause analysis, observability analysis, and support workflow acceleration.
  • Strong hands-on expertise in Apache Spark, Spark SQL, Structured Streaming, Delta Lake, and distributed data processing systems.
  • Experience leading production-scale workloads across Big Data, Hadoop, AI/ML, Kafka, Streaming, Data Science, or Analytics platforms.
  • Strong troubleshooting and performance tuning experience for Spark and JVM-based distributed systems, including memory management, garbage collection, heap analysis, and thread dump analysis.
  • Hands-on experience with AWS, Azure, or GCP cloud platforms.
  • Proven experience managing globally distributed technical teams and handling high-severity customer escalations.
  • Strong analytical, debugging, problem-solving, and distributed systems troubleshooting skills.
  • Excellent written and verbal communication skills with strong customer-facing leadership abilities.
  • Strong organisational, multitasking, stakeholder management, and operational leadership capabilities.
This listing is enriched and indexed by YubHub. To apply, use the employer's original posting: https://job-boards.greenhouse.io/databricks/jobs/7914730002