SOTAVerified

Audio-visual Question Answering

Papers

Showing 21-27 of 27 papers

Title | Status | Hype
SHMamba: Structured Hyperbolic State Space Model for Audio-Visual Question Answering | - | 0
AVQACL: A Novel Benchmark for Audio-Visual Question Answering Continual Learning | Code | 0
Target-Aware Spatio-Temporal Reasoning via Answering Questions in Dynamics Audio-Visual Scenarios | Code | 0
Towards Multilingual Audio-Visual Question Answering | Code | 0
Object-aware Adaptive-Positivity Learning for Audio-Visual Question Answering | Code | 0
Answering Diverse Questions via Text Attached with Key Audio-Visual Clues | Code | 0
Music's Multimodal Complexity in AVQA: Why We Need More than General Multimodal LLMs | Code | 0

Benchmark Results

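# | Model | Metric | Claimed | Verified | Status
1 | VAST | Acc | 80.7 | - | Unverified
2 | CoQo(Internvideo2) | Acc | 79.6 | - | Unverified
3 | VALOR | Acc | 78.9 | - | Unverified
4 | CAD | Acc | 78.26 | - | Unverified
5 | LAVISH | Acc | 77.08 | - | Unverified
6 | ST-AVQA | Acc | 71.52 | - | Unverified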