Description
CoreWeave is seeking a Staff Engineer, Storage Engine, to join its team. The successful candidate will design and implement distributed storage solutions that support the scaling of data-intensive AI workloads. They will contribute to the development of exabyte-scale, S3-compatible object storage and integrate dedicated storage clusters into diverse customer environments.
Key responsibilities include:
- Designing and implementing distributed storage solutions to support scaling data-intensive AI workloads
- Contributing to the development of exabyte-scale, S3-compatible object storage
- Integrating dedicated storage clusters into diverse customer environments
- Working with technologies such as RDMA and GPU Direct Storage, and with distributed filesystem protocols such as NFS or FUSE, to optimize storage performance and efficiency
- Leading efforts to improve the reliability, durability, security, and observability of the storage stack
- Collaborating with operations teams to monitor, troubleshoot, and improve storage systems in production environments
- Setting the standard for metrics and dashboards that provide visibility into storage performance and health
- Analyzing telemetry and system data to drive improvements in throughput, latency, and resilience
- Working cross-functionally with platform, product, and infrastructure teams to deliver seamless storage capabilities across the stack
- Sharing knowledge and mentoring other engineers on best practices in building distributed, high-performance systems
Requirements include:
- Bachelor's, Master's, or PhD degree in Computer Science, Engineering, or a related field
- 8-10+ years of experience working in storage systems engineering or infrastructure
- Strong hands-on experience with object storage or distributed filesystems in production environments
- Experience with one or more storage protocols (e.g., S3, NFS) and with filesystems such as Ceph or DAOS
- Proficiency in a systems programming language such as Go, C, or Rust
- Proficiency in using AI tools to augment software development
- Familiarity with storage observability tools and telemetry pipelines (e.g., ClickHouse, Prometheus, Grafana)
- Experience working with cloud-native infrastructure, Kubernetes, and scalable system architectures
The base salary range for this role is $188,000 to $275,000.
To apply, use the employer's original posting:
https://job-boards.greenhouse.io/coreweave/jobs/4612047006