Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 1051–1060 of 1149 papers

Title	Date	Tasks	Status	Hype	Score
Recurring the Transformer for Video Action Recognition	Jan 1, 2022	Action Recognition, GPU	Unverified	0	0
Relational Space-Time Query in Long-Form Videos	Jan 1, 2023	Form, Video Understanding	Unverified	0	0
Residual Frames with Efficient Pseudo-3D CNN for Human Action Recognition	Aug 3, 2020	Action Recognition, Optical Flow Estimation	Unverified	0	0
ResNetVLLM -- Multi-modal Vision LLM for the Video Understanding Task	Apr 20, 2025	Language Modeling, Language Modelling	Unverified	0	0
Rethinking Image-to-Video Adaptation: An Object-centric Perspective	Jul 9, 2024	Action Recognition, Object	Unverified	0	0
Rethinking Video-Text Understanding: Retrieval from Counterfactually Augmented Data	Jul 18, 2024	Language Modelling, Large Language Model	Unverified	0	0
Retrieval-based Video Language Model for Efficient Long Video Question Answering	Dec 8, 2023	Language Modeling, Language Modelling	Unverified	0	0
RETTA: Retrieval-Enhanced Test-Time Adaptation for Zero-Shot Video Captioning	May 11, 2024	Image-text matching, Retrieval	Unverified	0	0
Revealing Occlusions with 4D Neural Fields	Apr 22, 2022	Video Understanding	Unverified	0	0
Revisiting Kernel Temporal Segmentation as an Adaptive Tokenizer for Long-form Video Understanding	Sep 20, 2023	Action Localization, Form	Unverified	0	0

No leaderboard results yet.