Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 681–690 of 1149 papers

Title	Date	Tasks	Status	Hype
End-to-end Generative Pretraining for Multimodal Video Captioning	Jan 20, 2022	Action Classification, Decoder	Unverified	0
End-to-End Joint Semantic Segmentation of Actors and Actions in Video	Sep 1, 2018	Action Recognition, Segmentation	Unverified	0
End-to-End Video Classification with Knowledge Graphs	Nov 6, 2017	BIG-bench Machine Learning, Classification	Unverified	0
Enhanced Motion-Text Alignment for Image-to-Video Transfer Learning	Jan 1, 2024	Transfer Learning, Video Understanding	Unverified	0
Enhancing Long Video Understanding via Hierarchical Event-Based Memory	Sep 10, 2024	Video Understanding	Unverified	0
Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization	Oct 9, 2024	Audio captioning, Large Language Model	Unverified	0
Enhancing Transformer for Video Understanding Using Gated Multi-Level Attention and Temporal Adversarial Training	Mar 18, 2021	Video Understanding	Unverified	0
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis	Feb 11, 2025	Action Recognition, Video Description	Unverified	0
Espresso: High Compression For Rich Extraction From Videos for Your Vision-Language Model	Dec 6, 2024	EgoSchema, Language Modeling	Unverified	0
EVA: An Embodied World Model for Future Video Anticipation	Oct 20, 2024	Language Modeling, Language Modelling	Unverified	0

No leaderboard results yet.