A Generalist Agent

2022-05-12 · DeepMind 2022 · Code Available

Scott Reed, Konrad Zolna, Emilio Parisotto, Sergio Gomez Colmenarejo, Alexander Novikov, Gabriel Barth-Maron, Mai Gimenez, Yury Sulsky, Jackie Kay, Jost Tobias Springenberg, Tom Eccles, Jake Bruce, Ali Razavi, Ashley Edwards, Nicolas Heess, Yutian Chen, Raia Hadsell, Oriol Vinyals, Mahyar Bordbar, Nando de Freitas

arXiv PDF

Code

github.com/OrigamiDream/gato (TensorFlow, ★ 220)
github.com/ManifoldRG/gato-control (PyTorch, ★ 45)
github.com/LAS1520/Gato-A-Generalist-Agent (PyTorch, ★ 44)

Abstract

Inspired by progress in large-scale language modeling, we apply a similar approach towards building a single generalist agent beyond the realm of text outputs. The agent, which we refer to as Gato, works as a multi-modal, multi-task, multi-embodiment generalist policy. The same network with the same weights can play Atari, caption images, chat, stack blocks with a real robot arm and much more, deciding based on its context whether to output text, joint torques, button presses, or other tokens. In this report we describe the model and the data, and document the current capabilities of Gato.

Tasks

Language Modeling, Skill Generalization, Skill Mastery

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
RGB-Stacking	Gato	Group 1	24.5	—	Unverified