OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models

2023-08-02

Anas Awadalla, Irena Gao, Josh Gardner, Jack Hessel, Yusuf Hanafy, Wanrong Zhu, Kalyani Marathe, Yonatan Bitton, Samir Gadre, Shiori Sagawa, Jenia Jitsev, Simon Kornblith, Pang Wei Koh, Gabriel Ilharco, Mitchell Wortsman, Ludwig Schmidt

Code

github.com/mlfoundations/open_flamingo (official, in paper; PyTorch; ★ 4,079)
github.com/luodian/otter (PyTorch; ★ 3,348)

Abstract

We introduce OpenFlamingo, a family of autoregressive vision-language models ranging from 3B to 9B parameters. OpenFlamingo is an ongoing effort to produce an open-source replication of DeepMind's Flamingo models. On seven vision-language datasets, OpenFlamingo models average between 80% and 89% of corresponding Flamingo performance. This technical report describes our models, training data, hyperparameters, and evaluation suite. We share our models and code at https://github.com/mlfoundations/open_flamingo.

Tasks

Visual Question Answering (VQA)

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
InfiMM-Eval	OpenFlamingo-v2	Overall score	6.82	—	Unverified
