SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 581–590 of 1149 papers

Title	Date	Tasks	Status	Hype	Score
Video Swin Transformers for Egocentric Video Understanding @ Ego4D Challenges 2022	Jul 22, 2022	ObjectObject State Change Classification	—Unverified	0	0
Video Time: Properties, Encoders and Evaluation	Jul 18, 2018	Video Understanding	—Unverified	0	0
Video Token Merging for Long-form Video Understanding	Oct 31, 2024	FormVideo Classification	—Unverified	0	0
Video Understanding as Machine Translation	Jun 12, 2020	Machine TranslationMetric Learning	—Unverified	0	0
Video Watermarking: Safeguarding Your Video from (Unauthorized) Annotations by Video-based LLMs	Jul 2, 2024	Video Understanding	—Unverified	0	0
Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding	Mar 24, 2025	8kGPU	—Unverified	0	0
VidHalluc: Evaluating Temporal Hallucinations in Multimodal Large Language Models for Video Understanding	Dec 4, 2024	HallucinationInstruction Following	—Unverified	0	0
VidLPRO: A Video-Language Pre-training Framework for Robotic and Laparoscopic Surgery	Sep 7, 2024	Computational EfficiencyContrastive Learning	—Unverified	0	0
ViFi-ReID: A Two-Stream Vision-WiFi Multimodal Approach for Person Re-identification	Oct 13, 2024	Contrastive LearningPerson Re-Identification	—Unverified	0	0
VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation	Dec 1, 2024	Instruction FollowingVideo Understanding	—Unverified	0	0

Show:10 25 50

← PrevPage 59 of 115Next →

No leaderboard results yet.