Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 431–440 of 1149 papers

Title	Date	Tasks	Status	Hype
METok: Multi-Stage Event-based Token Compression for Efficient Long Video Understanding	Jun 3, 2025	Video Understanding	Code Available	0
InterRVOS: Interaction-aware Referring Video Object Segmentation	Jun 3, 2025	8k, Object	Unverified	0
EgoVLM: Policy Optimization for Egocentric Video Understanding	Jun 3, 2025	EgoSchema, Question Answering	Code Available	0
ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding	Jun 2, 2025	Action Recognition, Video Understanding	Unverified	0
FlexSelect: Flexible Token Selection for Efficient Long Video Understanding	Jun 1, 2025	Video Understanding	Unverified	0
Scene Detection Policies and Keyframe Extraction Strategies for Large-Scale Video Analysis	May 31, 2025	Scene Segmentation, Segmentation	Unverified	0
Learning reusable concepts across different egocentric video understanding tasks	May 30, 2025	Video Understanding	Unverified	0
Threading Keyframe with Narratives: MLLMs as Strong Long Video Comprehenders	May 30, 2025	Video Understanding	Unverified	0
Time Blindness: Why Video-Language Models Can't See What Humans Can?	May 30, 2025	Temporal Sequences, Video Understanding	Unverified	0
VUDG: A Dataset for Video Understanding Domain Generalization	May 30, 2025	Domain Generalization, Multiple-choice	Unverified	0

No leaderboard results yet.