Alluxio

Staff Software Engineer

onsite staff full-time

First indexed 17 Apr 2026

Description

We're looking for experienced distributed-systems engineers to join our Core Product team and advance the next generation of Alluxio's data-orchestration engine - the foundation for AI and analytics at global scale.

As a Staff Software Engineer, you'll work on high-impact systems problems such as optimizing metadata management, caching, and replication across thousands of nodes; designing concurrent, fault-tolerant services for multi-region and multi-cloud environments; evolving Alluxio's storage abstraction and scheduling layer to support large-scale AI/ML data pipelines; and collaborating with internal product teams to push the limits of distributed I/O performance.

This is a hands-on, architecture-plus-implementation role for engineers who love deep systems work and want visible impact in a small, senior, highly technical team.

What You'll Own

  • Cache and metadata consistency - advance Alluxio's intelligent caching framework for multi-tenant environments (TTL policies, write-back consistency, invalidation protocols, and distributed metadata scaling).
  • High-throughput data I/O optimization - profile and optimize Alluxio's data path across S3, GCS, HDFS, and POSIX interfaces using adaptive prefetching, async I/O, and tier-aware scheduling.
  • Scaling for AI and analytics workloads - evolve the coordination layer to efficiently serve distributed AI training clusters, accelerating model load and shuffle operations across regions and clouds.
  • Observability and performance insights - build fine-grained metrics and tracing for cache efficiency, throughput, and latency across storage tiers.
  • Open-source leadership - drive design discussions, mentor contributors, and represent Alluxio's core-systems direction within the OSS community.

What You'll Do

  • Design and implement core components of Alluxio's distributed file and object-access layer.
  • Optimize performance for large-scale, high-throughput environments using advanced concurrency and caching techniques.
  • Build scalable metadata and coordination systems that ensure strong consistency, high availability, and minimal latency.
  • Collaborate cross-functionally with product, solution-engineering, and research teams to drive roadmap and customer success.

What We're Looking For

  • Strong computer-science fundamentals and a passion for large-scale distributed systems.
  • Professional experience developing in Java, C++, or Go.
  • Deep understanding of concurrency, replication, fault tolerance, and performance optimization.
  • Experience with distributed storage, data-access layers, or cloud infrastructure (e.g., Spark, Presto, Hadoop, Kubernetes).
  • Bachelor's or advanced degree in Computer Science or a related technical field (or equivalent experience).
  • Demonstrated technical leadership: defining architecture, mentoring peers, or driving major projects from design through release.

Why Alluxio

  • Build infrastructure trusted by the world's largest AI and data-driven companies.
  • Join a small, senior engineering team where your designs shape the product's evolution.
  • Work directly with the original creators of open-source Alluxio.
  • A culture of empathy, curiosity, and ownership - where engineers collaborate closely to solve hard problems.

This listing is enriched and indexed by YubHub. To apply, use the employer's original posting: https://jobs.lever.co/alluxio/65f09933-df44-4f0d-b70d-7d4e6fd57330