
Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers



Title	Date	Tasks	Status	Hype
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer	Sep 22, 2022	Action Classification, Action Recognition	Code Available	2
ActionFormer: Localizing Moments of Actions with Transformers	Feb 16, 2022	Action Localization, Action Recognition	Code Available	2
PyTorchVideo: A Deep Learning Library for Video Understanding	Nov 18, 2021	Deep Learning, Self-Supervised Learning	Code Available	2
Attention Mechanisms in Computer Vision: A Survey	Nov 15, 2021	Image Classification	Code Available	2
TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device	Sep 27, 2021	Video Recognition, Video Understanding	Code Available	2
Video Swin Transformer	Jun 24, 2021	Action Classification, Action Recognition	Code Available	2
Is Space-Time Attention All You Need for Video Understanding?	Feb 9, 2021	Action Classification, Action Recognition	Code Available	2
Video Instance Segmentation	May 12, 2019	Instance Segmentation, Segmentation	Code Available	2
UGC-VideoCaptioner: An Omni UGC Video Detail Caption Model and New Benchmarks	Jul 15, 2025	Video Captioning, Video Understanding	Code Available	1
MCAM: Multimodal Causal Analysis Model for Ego-Vehicle-Level Driving Video Understanding	Jul 8, 2025	Autonomous Driving, Video Understanding	Code Available	1

