Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 1081–1090 of 1149 papers

Title	Date	Tasks	Status	Hype	Score
SF2T: Self-supervised Fragment Finetuning of Video-LLMs for Fine-Grained Understanding	Apr 10, 2025	Video Understanding	Unverified	0	0
ShotVL: Human-Centric Highlight Frame Retrieval via Language Queries	Dec 17, 2024	Human Detection, image-classification	Unverified	0	0
SkillFormer: Unified Multi-View Video Understanding for Proficiency Estimation	May 13, 2025	Computational Efficiency, Video Understanding	Unverified	0	0
Skimming and Scanning for Untrimmed Video Action Recognition	Apr 21, 2021	Action Recognition, Temporal Action Localization	Unverified	0	0
Slicing Convolutional Neural Network for Crowd Video Understanding	Jun 1, 2016	Attribute, Video Understanding	Unverified	0	0
Slot-VLM: SlowFast Slots for Video-Language Modeling	Feb 20, 2024	Language Modeling, Language Modelling	Unverified	0	0
SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding	Mar 24, 2025	Form, Video Understanding	Unverified	0	0
SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability	Mar 18, 2025	Language Modeling, Language Modelling	Unverified	0	0
Spacewalk-18: A Benchmark for Multimodal and Long-form Procedural Video Understanding	Nov 30, 2023	Form, Video Retrieval	Unverified	0	0
Sparse-to-Dense: A Free Lunch for Lossless Acceleration of Video Understanding in LLMs	May 25, 2025	Video Understanding	Unverified	0	0
