# Solutions Architect - CPU and LPU

**Company**: NVIDIA
**Location**: Beijing, Shanghai
**Work arrangement**: onsite
**Experience**: senior
**Job type**: full-time
**Category**: Engineering
**Industry**: Technology

**Apply**: https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/China-Beijing/Solutions-Architect---CPU-and-LPU_JR2015614?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply
**Canonical**: https://yubhub.co/jobs/job_222fc54a-654

## Description

We're looking for a software-focused Solutions Architect to drive adoption of next-generation AI infrastructure across NVIDIA CPU platforms and LPU-based inference systems.

As a Solutions Architect, you will be the primary technical point of contact between NVIDIA and our customers for CPU- and LPU-centric AI system design. You will help customers understand how NVIDIA CPUs and LPU-based systems can improve the efficiency, latency, throughput, and total cost of their AI workloads, especially when deployed alongside NVIDIA GPUs in heterogeneous production environments.

### Key Responsibilities

- Evangelize NVIDIA CPU platforms, including Grace, Vera, and future generations, as well as LPU-based systems and LPX-class platforms, with a strong focus on AI software stacks and workload efficiency.

- Help customers design and optimize AI workloads across CPU, GPU, and LPU, improving latency, throughput, utilization, and overall cost efficiency.

- Analyze and tune LLM and generative AI pipelines across serving, runtime, memory, I/O, batching, scheduling, and orchestration layers.

- Build proof-of-concepts, reference architectures, and technical guidance in partnership with Engineering, Product, and Sales teams.

- Establish trusted technical relationships with customer architects, infrastructure teams, and senior leaders, becoming a strategic advisor for heterogeneous AI system design.

### Requirements

- MS or PhD in Computer Science, Engineering, Mathematics, Physics, or a related field, or equivalent experience, plus 5+ years of experience in AI systems, infrastructure, performance engineering, or solution architecture.

- Strong understanding of modern CPU architecture, Linux systems, and software performance tuning, along with hands-on experience in AI inference for LLM, generative AI, or agentic AI workloads.

- Experience optimizing heterogeneous systems that combine CPUs and accelerators, and familiarity with frameworks such as PyTorch, Triton, TensorRT-LLM, vLLM, or ONNX Runtime.

- Strong programming, problem-solving, and communication skills, with the ability to work effectively with both technical teams and senior customer stakeholders.

### Nice to Have

- Experience with NVIDIA CPU platforms such as Grace or Grace Hopper, or with Arm64 server environments, and familiarity with LPU-based systems or other low-latency inference accelerators.

- Deep expertise in LLM inference optimization, serving architecture, and workload placement across CPU, GPU, and LPU.

- Experience building customer-facing proof-of-concepts and measuring AI efficiency through latency, throughput, cost per token, power, or utilization.

- Familiarity with NVIDIA AI software and platform technologies.

## Skills

### Required
- AI systems
- infrastructure
- performance engineering
- solution architecture
- modern CPU architecture
- Linux systems
- software performance tuning
- AI inference
- PyTorch
- Triton
- TensorRT-LLM
- vLLM
- ONNX Runtime

---

Source: [Apply at nvidia.wd5.myworkdayjobs.com](https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/China-Beijing/Solutions-Architect---CPU-and-LPU_JR2015614?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply)
