Benchmarking Multi-dimensional AIGC Video Quality Assessment: A Dataset and Unified Model

2024-07-31

Zhichao Zhang, Wei Sun, Xinyue Li, Jun Jia, Xiongkuo Min, ZiCheng Zhang, Chunyi Li, Zijian Chen, Puyi Wang, Fengyu Sun, Shangling Jui, Guangtao Zhai

Code

github.com/zczhang-sjtu/ugvq
Official implementation, referenced in the paper

Abstract

In recent years, artificial intelligence (AI)-driven video generation has gained significant attention. Consequently, there is a growing need for accurate video quality assessment (VQA) metrics to evaluate the perceptual quality of AI-generated content (AIGC) videos and optimize video generation models. However, assessing the quality of AIGC videos remains a significant challenge because these videos often exhibit highly complex distortions, such as unnatural actions and irrational objects. To address this challenge, we systematically investigate the AIGC-VQA problem, considering both subjective and objective quality assessment perspectives. For the subjective perspective, we construct the Large-scale Generated Video Quality assessment (LGVQ) dataset, consisting of 2,808 AIGC videos generated by 6 video generation models using 468 carefully curated text prompts. We evaluate the perceptual quality of AIGC videos from three critical dimensions: spatial quality, temporal quality, and text-video alignment. For the objective perspective, we establish a benchmark for evaluating existing quality assessment metrics on the LGVQ dataset. Our findings show that current metrics perform poorly on this dataset, highlighting a gap in effective evaluation tools. To bridge this gap, we propose the Unify Generated Video Quality assessment (UGVQ) model, designed to accurately evaluate the multi-dimensional quality of AIGC videos. The UGVQ model integrates the visual and motion features of videos with the textual features of their corresponding prompts, forming a unified quality-aware feature representation tailored to AIGC videos. Experimental results demonstrate that UGVQ achieves state-of-the-art performance on the LGVQ dataset across all three quality dimensions. Both the LGVQ dataset and the UGVQ model are publicly available at https://github.com/zczhang-sjtu/UGVQ.git.

Tasks

Benchmarking, Large Language Model, Video Alignment, Video Generation, Video Quality Assessment, Visual Question Answering (VQA)
