# AI Researcher, TAO Multi-Modal Model Development

**Company**: NVIDIA
**Location**: Hanoi, Ho Chi Minh City
**Work arrangement**: onsite
**Experience**: entry
**Job type**: full-time
**Category**: Engineering
**Industry**: Technology

**Apply**: https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/Vietnam-Hanoi/AI-Researcher--TAO-Multi-Modal-Model-Development_JR2017527?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply
**Canonical**: https://yubhub.co/jobs/job_88e91fb2-c40

## Description

We are seeking a motivated AI Model Development Researcher to join the TAO (Train, Adapt, Optimize) Multi-Modal Model Development team in Hanoi or Ho Chi Minh City, Vietnam.

In this role, you will contribute to the development, adaptation, optimization, and evaluation of advanced AI models within NVIDIA frameworks. You will work on cutting-edge areas such as multi-modal learning, vision-language models, image segmentation, foundation model adaptation, and scalable deep learning workflows.

You will collaborate with engineers, researchers, and cross-functional teams to build practical AI solutions that can be integrated into production pipelines, NVIDIA SDKs, and real-world customer use cases. This is an excellent opportunity for an early-career engineer/scientist who is passionate about machine learning, deep learning, vision-language models, and building high-quality AI software.

Responsibilities:

- Develop and fine-tune multi-modal AI models using NVIDIA’s TAO Toolkit and deep learning frameworks.

- Contribute to the design and implementation of vision-language models (VLMs) and universal segmentation systems.

- Conduct experiments and benchmarking to evaluate model accuracy, robustness, and scalability.

- Collaborate with cross-functional teams to integrate your research into production-level pipelines and NVIDIA SDKs.

- Participate in research discussions, code reviews, and technical documentation to share insights and improve methodologies.

Requirements:

- BS or MS in Electrical Engineering, Computer Engineering, Computer Science, or a related field (or equivalent experience).

- 2+ years of experience in machine learning, deep learning, or computer vision model development.

- Strong Python programming skills and proficiency with PyTorch or similar frameworks.

- Solid understanding of neural network architectures, transformers, and multi-modal learning techniques.

- Excellent problem-solving abilities, attention to detail, and a collaborative mindset.

- Familiarity with vision-language models, image segmentation, or large-scale pretraining is a strong plus.

## Skills

### Required
- Python
- PyTorch
- Deep Learning
- Machine Learning
- Computer Vision
- Neural Network Architectures
- Transformers
- Multi-Modal Learning Techniques

### Nice to have
- Vision-Language Models
- Image Segmentation
- Large-Scale Pretraining

---

Source: [Apply at nvidia.wd5.myworkdayjobs.com](https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/Vietnam-Hanoi/AI-Researcher--TAO-Multi-Modal-Model-Development_JR2017527?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply)