Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 151–160 of 1149 papers

Title	Date	Tasks	Status	Hype
Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation	Mar 25, 2025	Hallucination, Hallucination Evaluation	Code Available	1
PAVE: Patching and Adapting Video Large Language Models	Mar 25, 2025	Audio-visual Question Answering, Multi-Task Learning	Code Available	1
ACVUBench: Audio-Centric Video Understanding Benchmark	Mar 25, 2025	Video Understanding	Code Available	0
SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding	Mar 24, 2025	Form, Video Understanding	Unverified	0
CRCL: Causal Representation Consistency Learning for Anomaly Detection in Surveillance Videos	Mar 24, 2025	Anomaly Detection, Anomaly Detection In Surveillance Videos	Unverified	0
Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding	Mar 24, 2025	8k, GPU	Unverified	0
Unbiasing through Textual Descriptions: Mitigating Representation Bias in Video Benchmarks	Mar 24, 2025	Common Sense Reasoning, Prediction	Unverified	0
Breaking the Encoder Barrier for Seamless Video-Language Understanding	Mar 24, 2025	Decoder, Language Modeling	Unverified	0
MammAlps: A multi-view video behavior monitoring dataset of wild mammals in the Swiss Alps	Mar 23, 2025	Scene Segmentation, Video Understanding	Code Available	1
4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding	Mar 22, 2025	Benchmarking, Object	Code Available	0

No leaderboard results yet.