
RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness

2024-05-27 · CVPR 2025 · Code Available

Tianyu Yu, Haoye Zhang, Qiming Li, Qixin Xu, Yuan Yao, Da Chen, Xiaoman Lu, Ganqu Cui, Yunkai Dang, Taiwen He, Xiaocheng Feng, Jun Song, Bo Zheng, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun


Abstract

Traditional feedback learning for hallucination reduction relies on labor-intensive manual labeling or expensive proprietary models, leaving the community without foundational knowledge of how to build high-quality feedback with open-source MLLMs. In this work, we introduce RLAIF-V, a novel framework that aligns MLLMs in a fully open-source paradigm. RLAIF-V maximally exploits open-source MLLMs along two axes: high-quality feedback data generation for preference learning, and self-feedback guidance for inference-time scaling. Extensive experiments on six benchmarks, in both automatic and human evaluation, show that RLAIF-V substantially enhances model trustworthiness both during preference learning and at inference time. RLAIF-V 7B reduces object hallucination by 80.7% and overall hallucination by 33.7%. Remarkably, RLAIF-V 12B further reveals the self-alignment potential of open-source MLLMs: the model can learn from its own feedback to achieve trustworthiness surpassing GPT-4V.
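The abstract does not spell out which preference-learning objective is used; a common choice in this line of MLLM alignment work is Direct Preference Optimization (DPO), which trains on (chosen, rejected) response pairs produced by the feedback model. The sketch below is an illustrative, generic DPO-style loss for a single preference pair, not the paper's exact method; the `beta` parameter and the reference-model log-probabilities are standard DPO ingredients, assumed here for illustration.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO-style loss for one preference pair (illustrative sketch).

    logp_* are sequence log-probabilities under the policy being trained;
    ref_logp_* are the same quantities under a frozen reference model.
    """
    # Implicit reward margin: beta-scaled difference of policy/reference
    # log-ratios between the chosen and rejected responses.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log sigmoid(margin): small when the policy ranks the chosen
    # response above the rejected one more strongly than the reference.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

The loss decreases as the policy's preference for the chosen response (relative to the reference model) grows, which is what pushes the model toward the feedback model's judgments.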

Tasks

Benchmark Results

Dataset         | Model       | Metric  | Claimed | Verified | Status
Object HalBench | RLAIF-V 7B  | chair_i | 4.3     |          | Unverified
Object HalBench | RLAIF-V 12B | chair_i | 1.8     |          | Unverified
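The `chair_i` metric in the table is the instance-level CHAIR score on Object HalBench: the fraction of object mentions in generated captions that do not appear in the image's ground-truth object set (lower is better; the table reports it as a percentage). A minimal sketch of the computation, assuming object mentions have already been extracted from the caption and normalized against the ground-truth vocabulary:

```python
def chair_i(generated_objects, gt_objects):
    """Instance-level CHAIR: share of generated object mentions that are
    hallucinated, i.e. absent from the ground-truth object set.

    generated_objects: list of object mentions from a caption
    (duplicates count once per mention); gt_objects: set of objects
    actually present in the image.
    """
    if not generated_objects:
        return 0.0
    hallucinated = sum(1 for obj in generated_objects
                       if obj not in gt_objects)
    return hallucinated / len(generated_objects)
```

For example, a caption mentioning "dog", "cat", and "car" against ground truth {"dog", "cat"} has one hallucinated mention out of three, giving a chair_i of about 33.3%.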

Reproductions