# Senior Solutions Architect, Cloud Infrastructure and DevOps

**Company**: NVIDIA
**Location**: Saudi Arabia
**Work arrangement**: remote
**Experience**: senior
**Job type**: full-time
**Category**: Engineering
**Industry**: Technology

**Apply**: https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/Saudi-Arabia-Remote/Senior-Solutions-Architect--Cloud-Infrastructure-and-DevOps_JR2016420?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply
**Canonical**: https://yubhub.co/jobs/job_1ffc8c95-ac3

## Description

We are looking for a Senior Cloud Infrastructure and DevOps Solutions Architect to join our NVIDIA Infrastructure Specialist Team. As a Senior Solutions Architect, you will engage directly with customers, partners, and multi-functional teams to assess, architect, and guide the implementation of large-scale infrastructure projects.

The scope of this role spans system architecture, Kubernetes-based platforms, and automation, serving as both a trusted advisor and a hands-on technical leader. You will advise on and help maintain large-scale computational and AI infrastructure, including monitoring, logging, and workload orchestration (Kubernetes and Linux job schedulers).

Key responsibilities include:

- Providing consultative guidance and hands-on troubleshooting across the full stack, from bare metal and operating system through the software stack, container platform, networking, and storage.

- Assessing customer environments and recommending optimized, production-ready Kubernetes-based container platforms integrated with enterprise-grade networking and storage solutions.

- Serving as a key technical resource: developing, refining, and documenting standard methodologies and operational guidelines to be shared with internal teams and customer partners.

- Supporting Research & Development activities and engaging in POCs/POVs to validate new features, architectures, and upgrade approaches.

- Creating and delivering high-quality documentation, including runbooks, onboarding materials, and best-practice guides for customers and internal teams.

- Acting as the technical leader for assigned customer accounts, providing strategic guidance on DevOps and platform architecture and influencing long-term infrastructure and operations decisions.

Requirements:

- Education & Experience: BS/MS/PhD in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, or related fields (or equivalent experience), with 8+ years of professional experience managing scalable cloud environments and working in automation engineering roles.

- Cloud & HPC Expertise: Demonstrated understanding of networking fundamentals and data center architectures, and hands-on experience managing HPC/AI clusters, including deployment, optimization, and troubleshooting.

- NVIDIA GPU Expertise: Proven hands-on experience deploying, configuring, and optimizing NVIDIA GPU-accelerated infrastructure, including driver management, CUDA toolkit integration, and GPU workload profiling.

- Kubernetes & AI/ML Workloads: Extensive experience with Kubernetes for container orchestration, resource scheduling, scaling, and integration with GPU-accelerated and HPC environments.

- Hardware & Software Knowledge: Strong familiarity with HPC and AI technologies (CPUs, GPUs, high-speed interconnects) and supporting software stacks.

- Linux & Storage Systems: Deep knowledge of Linux (Red Hat, Ubuntu), OS-level security, and protocols. Experience with storage solutions such as Lustre, GPFS, ZFS, XFS, and emerging Kubernetes storage technologies.

- Automation & Observability: Proficiency in Python and Bash scripting, configuration management, and Infrastructure-as-Code tools (e.g., Ansible, Terraform). Experience with observability stacks (Grafana, Loki, Prometheus) for monitoring, logging, and building fault-tolerant systems.

- Solution Architecture & Customer Engagement: Strong background in designing scalable solutions and providing consultative support to customers, including leading architectural reviews and presenting to executive partners.

Preferred qualifications include:

- Knowledge of CI/CD pipelines for software deployment and automation.

- Experience working with NVIDIA GPU and Network Operators to manage automated resource lifecycle in Kubernetes environments.

- Solid hands-on knowledge of Kubernetes and container-based microservices architectures.

- Experience with NVIDIA Base Command Manager (BCM) for provisioning, managing, and monitoring GPU clusters at scale, and a background in RDMA-based fabrics (InfiniBand or RoCE) in HPC or AI environments.

## Skills

### Required
- Cloud Infrastructure
- DevOps
- Kubernetes
- NVIDIA GPU
- HPC
- Linux
- Storage Systems
- Automation
- Observability
- Solution Architecture
- Customer Engagement

### Nice to have
- CI/CD Pipelines
- NVIDIA GPU and Network Operators
- Kubernetes and Container-Based Microservices Architectures
- NVIDIA Base Command Manager
- RDMA-Based Fabrics

---

Source: [Apply at nvidia.wd5.myworkdayjobs.com](https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/Saudi-Arabia-Remote/Senior-Solutions-Architect--Cloud-Infrastructure-and-DevOps_JR2016420?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply)
