Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Title	Date	Tasks	Status	Hype	Score
SPAct: Self-supervised Privacy Preservation for Action Recognition	Mar 29, 2022	Action Classification, Action Recognition	Code Available	1	5
InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges	Nov 17, 2022	Future Hand Prediction, Moment Queries	Code Available	1	5
EPIC Fields: Marrying 3D Geometry and Video Understanding	Jun 14, 2023	3D geometry, Neural Rendering	Code Available	1	5
∞-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation	Jan 31, 2025	Question Answering, Video Question Answering	Code Available	1	5
CEFHRI: A Communication Efficient Federated Learning Framework for Recognizing Industrial Human-Robot Interaction	Aug 29, 2023	Federated Learning, image-classification	Code Available	1	5
Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation	Mar 25, 2025	Hallucination, Hallucination Evaluation	Code Available	1	5
A Simple LLM Framework for Long-Range Video Question-Answering	Dec 28, 2023	EgoSchema, Language Modelling	Code Available	1	5
Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation	Mar 18, 2024	Referring Video Object Segmentation, Semantic Segmentation	Code Available	1	5
SoccerNet-v2: A Dataset and Benchmarks for Holistic Understanding of Broadcast Soccer Videos	Nov 26, 2020	Action Spotting, Boundary Detection	Code Available	1	5
Spotting Temporally Precise, Fine-Grained Events in Video	Jul 20, 2022	Action Detection, Action Spotting	Code Available	1	5
Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models	Mar 20, 2025	Multiple-choice, Video Understanding	Code Available	1	5
Enhancing Traffic Safety with Parallel Dense Video Captioning for End-to-End Event Analysis	Apr 12, 2024	Dense Video Captioning, Transfer Learning	Code Available	1	5
Hypergraph Multi-modal Large Language Model: Exploiting EEG and Eye-tracking Modalities to Evaluate Heterogeneous Responses for Video Understanding	Jul 11, 2024	EEG, Language Modeling	Code Available	1	5
Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition	Feb 14, 2021	Action Recognition, Temporal Action Localization	Code Available	1	5
Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization	Aug 4, 2021	Contrastive Learning, Representation Learning	Code Available	1	5
Facial Dynamics in Video: Instruction Tuning for Improved Facial Expression Perception and Contextual Awareness	Jan 14, 2025	Event Extraction, Instruction Following	Code Available	1	5
Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning	May 22, 2025	Misinformation, reinforcement-learning	Code Available	1	5
Learning Optical Flow with Adaptive Graph Reasoning	Feb 8, 2022	Motion Estimation, Optical Flow Estimation	Code Available	1	5
Learning Temporally Latent Causal Processes from General Temporal Data	Sep 29, 2021	Causal Discovery, Disentanglement	Code Available	1	5
SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos	Apr 6, 2024	Graph Generation, Relation	Code Available	1	5
Leveraging triplet loss for unsupervised action segmentation	Apr 13, 2023	Action Segmentation, Clustering	Code Available	1	5
Feature Combination Meets Attention: Baidu Soccer Embeddings and Transformer based Temporal Detection	Jun 28, 2021	Action Recognition, Action Spotting	Code Available	1	5
Large Scale Holistic Video Understanding	Apr 25, 2019	Action Classification, Action Recognition	Code Available	1	5
Slow-Fast Architecture for Video Multi-Modal Large Language Models	Apr 2, 2025	Video Understanding	Code Available	1	5
Hier-EgoPack: Hierarchical Egocentric Video Understanding with Diverse Task Perspectives	Feb 4, 2025	Video Understanding	Code Available	1	5
