RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness

2024-05-27 · CVPR 2025 · Code Available

Tianyu Yu, Haoye Zhang, Qiming Li, Qixin Xu, Yuan YAO, Da Chen, Xiaoman Lu, Ganqu Cui, Yunkai Dang, Taiwen He, Xiaocheng Feng, Jun Song, Bo Zheng, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun

Code

github.com/openbmb/omnilmm
Official · In paper · pytorch · ★ 24,167
github.com/rlhf-v/rlaif-v
Official · In paper · pytorch · ★ 448
github.com/OpenBMB/MiniCPM-o
pytorch · ★ 24,170
github.com/rlhf-v/rlhf-v
none · ★ 306

Abstract

Traditional feedback learning for hallucination reduction relies on labor-intensive manual labeling or expensive proprietary models. This leaves the community without foundational knowledge about how to build high-quality feedback with open-source MLLMs. In this work, we introduce RLAIF-V, a novel framework that aligns MLLMs in a fully open-source paradigm. RLAIF-V maximally explores open-source MLLMs from two perspectives: high-quality feedback data generation for preference learning, and self-feedback guidance for inference-time scaling. Extensive experiments on six benchmarks, in both automatic and human evaluation, show that RLAIF-V substantially enhances the trustworthiness of models at both preference learning and inference time. RLAIF-V 7B reduces object hallucination by 80.7% and overall hallucination by 33.7%. Remarkably, RLAIF-V 12B further reveals the self-alignment potential of open-source MLLMs: the model can learn from its own feedback to achieve super GPT-4V trustworthiness.

Tasks

Hallucination, Image Captioning, Object Hallucination, Visual Question Answering

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
Object HalBench	RLAIF-V 7B	chair_i	4.3	—	Unverified
Object HalBench	RLAIF-V 12B	chair_i	1.8	—	Unverified
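
The chair_i column is not defined on this page. As a rough illustration, the sketch below assumes it follows the usual instance-level CHAIR definition: the share of object mentions in generated text that do not appear in the image's ground-truth object set, reported as a percentage. The function name, argument names, and toy data are hypothetical, and the sketch is not the Object HalBench evaluation pipeline, which also has to extract object mentions from free-form responses.

```python
from typing import Iterable, Set


def chair_i(mentioned: Iterable[Iterable[str]],
            ground_truth: Iterable[Set[str]]) -> float:
    """Instance-level CHAIR (assumed definition), as a percentage.

    mentioned:    per-response lists of object names found in the model output
    ground_truth: per-image sets of objects actually present
    """
    total = 0          # all object mentions across responses
    hallucinated = 0   # mentions absent from the matching ground-truth set
    for objs, truth in zip(mentioned, ground_truth):
        for obj in objs:
            total += 1
            if obj not in truth:
                hallucinated += 1
    # Return 0.0 when no objects were mentioned, to avoid division by zero.
    return 100.0 * hallucinated / total if total else 0.0


# Toy data (hypothetical): "car" is mentioned but is not in the first image.
responses = [["dog", "frisbee", "car"], ["cat", "sofa"]]
truth = [{"dog", "frisbee"}, {"cat", "sofa"}]
score = chair_i(responses, truth)  # 1 of 5 mentions is hallucinated
```

Under this reading, lower is better, so the claimed 1.8 for RLAIF-V 12B would mean fewer hallucinated object mentions than the 4.3 claimed for RLAIF-V 7B.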
