Description
Our mission at Databricks is to simplify the data lifecycle from ingestion to ETL, BI, and all the way up to ML/AI with a unified platform.
To achieve this goal, we believe the data warehouse architecture as we know it today will be replaced by a new architectural pattern, the Lakehouse: open platforms that unify data warehousing and advanced analytics.
A critical part of realizing this vision is the next-generation (decoupled) query engine and structured storage system that can outperform specialized data warehouses in relational query performance, yet retain the expressiveness of general-purpose systems such as Apache Spark to support diverse workloads ranging from ETL to data science.
As part of this team, you will work in one or more of the following areas to design and implement these next-generation systems that leapfrog the state of the art:
- Query compilation and optimization
- Distributed query execution and scheduling
- Vectorized execution engine
- Data security
- Resource management
- Transaction coordination
- Efficient storage structures (encodings, indexes)
- Automatic physical data optimization
We look for individuals with a passion for database systems, storage systems, distributed systems, language design, or performance optimization. You should have experience working towards a multi-year vision with incremental deliverables, motivated by delivering customer value and impact.
The pay range for this role is $192,000-$260,000 USD, and the total compensation package may also include eligibility for an annual performance bonus, equity, and benefits.