# Systems Quality and Reliability Lead

**Company**: NVIDIA
**Location**: Santa Clara
**Work arrangement**: onsite
**Experience**: senior
**Job type**: full-time
**Category**: Engineering
**Industry**: Technology

**Apply**: https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-CA-Santa-Clara/Systems-Quality-and-Reliability-Lead---LPU_JR2013680?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply
**Canonical**: https://yubhub.co/jobs/job_63c5de5d-a95

## Description

We are seeking a Lead Systems Quality and Reliability Engineer to join our LPU team! You will own, build, and manage RMA and failure analysis (FA) debug and root-cause analysis for existing and new NVIDIA AI/ML products.

Responsibilities:

- Conduct and lead debug and root-cause analysis of field RMAs, collaborating with systems, hardware, software, and operations engineers as required

- Scale root-cause FA capabilities within your organization

- Create FA result reports that align with the standard 8D process or a similar one

- Analyze RMA, FA and repair data. Identify trends and raise quality alerts when necessary. Drive resolution, containment, and mitigation plans for such quality alerts

- Oversee hardware quality performance, monitoring field quality data and associated metrics including RMA rates, MTBF, and Reliability Ratio

- Manage the operational performance of FA at CMs, ensuring partners achieve key performance indicators, including FA cycle times, fault duplication rates, and fault isolation rates

- Oversee the setup of new products into Failure Analysis operations

Requirements:

- BS/MS in EE, Physics, or a related field (or equivalent experience)

- 8+ years of hands-on systems test and/or validation engineering experience

- Proven hands-on management and leadership experience

- Competence using lab equipment such as oscilloscopes, logic analyzers, and power analyzers

- Experience enabling reliability tests such as HTOL and quality tests such as burn-in

- The ideal candidate will have working knowledge of FA techniques and tools such as FIB, SEM, TDR, VNA, and CSAM

- Strong knowledge of fault isolation techniques such as OBIRCH, DLS/LADA, LVP, and LVI

- Proficiency with high-speed interfaces (SerDes, PCIe, DDR)

- Proficiency in Python, Perl, C++, or other languages on UNIX/Linux

- Excellent knowledge of PCB card and system-level test and debug, and the ability to manage factory floor partners (CMs) for RMA/FA activities

## Skills

### Required
- lab equipment
- reliability tests
- quality tests
- FA techniques
- fault isolation techniques
- high-speed interfaces
- programming languages
- PCB card and system-level test and debug

---

Source: [Apply at nvidia.wd5.myworkdayjobs.com](https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-CA-Santa-Clara/Systems-Quality-and-Reliability-Lead---LPU_JR2013680?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply)
