SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 351–360 of 1149 papers

Title	Date	Tasks	Status	Hype
Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition	Feb 14, 2021	Action RecognitionTemporal Action Localization	CodeCode Available	1
Leveraging triplet loss for unsupervised action segmentation	Apr 13, 2023	Action SegmentationClustering	CodeCode Available	1
LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts	May 20, 2025	Caption GenerationRetrieval	CodeCode Available	1
SiLVR: A Simple Language-based Video Reasoning Framework	May 30, 2025	MathMME	CodeCode Available	1
Slow-Fast Architecture for Video Multi-Modal Large Language Models	Apr 2, 2025	Video Understanding	CodeCode Available	1
Isolated Sign Recognition from RGB Video using Pose Flow and Self-Attention	Jun 11, 2021	Action RecognitionSign Language Recognition	CodeCode Available	1
IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs	Apr 21, 2025	Video Understanding	CodeCode Available	1
From My View to Yours: Ego-Augmented Learning in Large Vision Language Models for Understanding Exocentric Daily Living Activities	Jan 10, 2025	Human-Object Interaction DetectionKnowledge Distillation	CodeCode Available	1
EgoTaskQA: Understanding Human Tasks in Egocentric Videos	Oct 8, 2022	Action Localizationcounterfactual	CodeCode Available	1
EgoSurgery-Phase: A Dataset of Surgical Phase Recognition from Egocentric Open Surgery Videos	May 30, 2024	Action RecognitionSurgical phase recognition	CodeCode Available	1

Show:10 25 50

← PrevPage 36 of 115Next →

No leaderboard results yet.