SOTAVerified

Audio-visual Question Answering

Papers

Showing 26-27 of 27 papers

| Title | Status | Hype |
|---|---|---|
| Patch-level Sounding Object Tracking for Audio-Visual Question Answering | | 0 |
| OMCAT: Omni Context Aware Transformer | | 0 |

Benchmark Results

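| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | VAST | Acc | 80.7 | | Unverified |
| 2 | CoQo(Internvideo2) | Acc | 79.6 | | Unverified |
| 3 | VALOR | Acc | 78.9 | | Unverified |
| 4 | CAD | Acc | 78.26 | | Unverified |
| 5 | LAVISH | Acc | 77.08 | | Unverified |
| 6 | ST-AVQA | Acc | 71.52 | | Unverified |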