# Research Engineer, Human Understanding

**Company**: Google DeepMind
**Location**: Los Angeles, California, US; Mountain View, California, US
**Work arrangement**: onsite
**Experience**: senior
**Job type**: full-time
**Salary**: $174,000 USD - $252,000 USD
**Category**: Engineering
**Industry**: Technology
**Wikidata**: https://www.wikidata.org/wiki/Q15733006

**Apply**: https://job-boards.greenhouse.io/deepmind/jobs/7669433
**Canonical**: https://yubhub.co/jobs/job_e121da52-304

## Description

We are seeking a highly motivated Research Engineer with a strong background in multimodal modelling of humans, with a focus on speech and audio-visual data, to join this effort within Google DeepMind's Frontier AI unit.

This role is pivotal in developing foundational multimodal AI capabilities to understand, generate, and protect human likeness. As a key contributor, you will design and implement cutting-edge models and frameworks for human-centric understanding and generation.

This is a unique opportunity to contribute to impactful research and advance Google DeepMind's mission towards Artificial General Intelligence (AGI).

### Key Responsibilities

- Advance multimodal human representations & understanding: Research and implement novel models and other multimodal techniques for a more holistic understanding of humans across visual, audio, and textual data.

- Conduct applied research: Run full experimental research cycles, from hypothesis to deployment.

- Drive technical projects: Take ownership of substantial technical projects within the effort, from ideation and design to implementation and evaluation, often involving cross-functional collaboration.

- Contribute to infrastructure: Inform and contribute to the development of scalable and efficient research infrastructure for multimodal human understanding models and datasets.

- Adapt foundation models: Design and execute strategies for tuning and adapting vision language models (VLMs) and other foundation models for specific tasks.

### Requirements

- PhD degree in Computer Science, Machine Learning, or a related technical field with 3+ years of relevant experience.

- Experience in developing machine learning models, such as audio & speech-visual models.

- Experience in working with and tuning large-scale vision language models.

- Strong programming skills in Python and experience with at least one major deep learning framework (e.g., JAX).

- Experience conducting independent research and development, including experimental design, implementation, and analysis.

### Salary

The US base salary range for this full-time position is $174,000 USD - $252,000 USD, plus bonus, equity, and benefits.

## Skills

### Required
- Python
- JAX
- Machine Learning
- Deep Learning
- Vision Language Models
- Audio & Speech-Visual Models

### Nice to have
- Generative AI
- Reinforcement Learning
- Alignment Methods
- Multimodal Learning
- Privacy-Preserving Machine Learning
