Description

Job Summary

Join CoreWeave, The Essential Cloud for AI, as a Software Engineer on the Inference team. You'll ship production features that improve latency and reliability and reduce cost for model serving on our GPU platform.

Responsibilities

Implement features and fixes in Python/Go/C++ for model-serving services like Triton, vLLM, TensorRT-LLM, and Ray Serve.
Write tests, code comments, and design documents, and participate in code reviews.
Develop basic metrics and dashboards, and assist with alarms and runbooks.
Follow on-call runbooks and learn incident response in a guided rotation.
Contribute to performance experiments in areas such as request batching, concurrency, and caching.

Requirements

BS/MS in Computer Science, Electrical Engineering, or a related field, or equivalent practical experience.
Strong foundations in data structures, algorithms, and networked services.
Experience with Python or Go, plus Linux fundamentals and Git/CI basics.
Exposure to containers and Kubernetes.
Curiosity about GPU inference concepts.

Preferred Qualifications

Internship or project experience deploying a microservice or ML inference demo.
Coursework or research with PyTorch or TensorFlow; simple CUDA projects a plus.
Familiarity with Grafana/Prometheus/OpenTelemetry or similar tooling.

What We Offer

Competitive salary: $92,000 to $135,000 per year.
Comprehensive benefits package, including medical, dental, and vision insurance, company-paid life insurance, and flexible spending accounts.
Opportunities for professional growth and development in a rapidly growing company.

Our Culture

We prioritize a hybrid work environment, with remote work options available.
We're committed to fostering an inclusive and supportive workplace.

To apply, use the employer's original posting: https://job-boards.greenhouse.io/coreweave/jobs/4609928006