Description
Job Summary
Join CoreWeave, The Essential Cloud for AI, as a Software Engineer on the Inference team. You'll ship production features that improve the latency, reliability, and cost of model serving on our GPU platform.
Responsibilities
- Implement features and fixes in Python, Go, or C++ for model-serving services such as Triton, vLLM, TensorRT-LLM, and Ray Serve.
- Write tests, code comments, and design documents, and participate in code reviews.
- Develop basic metrics and dashboards, and assist with alarms and runbooks.
- Follow on-call runbooks and learn incident response in a guided rotation.
- Contribute to performance experiments, such as request batching, concurrency, and caching.
Requirements
- BS/MS in Computer Science, Electrical Engineering, or a related field, or equivalent practical experience.
- Strong foundations in data structures, algorithms, and networked services.
- Experience with Python or Go, plus Linux fundamentals and Git/CI basics.
- Exposure to containers and Kubernetes.
- Curiosity about GPU inference concepts.
Preferred Qualifications
- Internship or project experience deploying a microservice or ML inference demo.
- Coursework or research with PyTorch or TensorFlow; simple CUDA projects a plus.
- Familiarity with Grafana/Prometheus/OpenTelemetry or similar tooling.
What We Offer
- Competitive salary: $92,000 to $135,000 per year.
- Comprehensive benefits package, including medical, dental, and vision insurance, company-paid life insurance, and flexible spending accounts.
- Opportunities for professional growth and development in a rapidly growing company.
Our Culture
- We prioritize a hybrid work environment, with remote work options available.
- We're committed to fostering an inclusive and supportive workplace.
This listing is enriched and indexed by YubHub. To apply, use the employer's original posting:
https://job-boards.greenhouse.io/coreweave/jobs/4609928006