Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 351–375 of 1149 papers

Title	Date	Tasks	Status	Hype	Score
Elaborative Rehearsal for Zero-shot Action Recognition	Aug 5, 2021	Action Recognition, Few-Shot Learning	Code Available	1	5
InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding	Jun 28, 2024	Multiple-choice, Video Understanding	Code Available	1	5
Language-Assisted Skeleton Action Understanding for Skeleton-Based Temporal Action Segmentation	Oct 31, 2024	Action Segmentation, Action Understanding	Code Available	1	5
Temporal Context Aggregation Network for Temporal Action Proposal Refinement	Mar 24, 2021	Action Detection, Action Localization	Code Available	1	5
TSM: Temporal Shift Module for Efficient Video Understanding	Nov 20, 2018	3D Action Recognition, Action Classification	Code Available	1	5
TCLR: Temporal Contrastive Learning for Video Representation	Jan 20, 2021	Action Classification, Action Recognition	Code Available	1	5
How Severe is Benchmark-Sensitivity in Video Self-Supervised Learning?	Mar 27, 2022	Self-Supervised Learning, Sensitivity	Code Available	1	5
From My View to Yours: Ego-Augmented Learning in Large Vision Language Models for Understanding Exocentric Daily Living Activities	Jan 10, 2025	Human-Object Interaction Detection, Knowledge Distillation	Code Available	1	5
Teaching VLMs to Localize Specific Objects from In-context Examples	Nov 20, 2024	Object, Object Tracking	Code Available	1	5
From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding	Sep 27, 2024	Video Understanding, Visual Reasoning	Code Available	1	5
EgoTaskQA: Understanding Human Tasks in Egocentric Videos	Oct 8, 2022	Action Localization, counterfactual	Code Available	1	5
EgoSurgery-Phase: A Dataset of Surgical Phase Recognition from Egocentric Open Surgery Videos	May 30, 2024	Action Recognition, Surgical phase recognition	Code Available	1	5
Technical Report: Temporal Aggregate Representations	Jun 6, 2021	Action Anticipation, Action Recognition	Code Available	1	5
EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding	Aug 17, 2023	Diagnostic, EgoSchema	Code Available	1	5
Large Scale Holistic Video Understanding	Apr 25, 2019	Action Classification, Action Recognition	Code Available	1	5
Can An Image Classifier Suffice For Action Recognition?	Jun 26, 2021	Action Recognition, image-classification	Code Available	1	5
Hier-EgoPack: Hierarchical Egocentric Video Understanding with Diverse Task Perspectives	Feb 4, 2025	Video Understanding	Code Available	1	5
How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation	Dec 12, 2023	Anomaly Detection, Autonomous Driving	Code Available	1	5
Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation	Dec 16, 2021	Contrastive Learning, Representation Learning	Code Available	1	5
EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval	Jul 23, 2024	Re-Ranking, Retrieval	Code Available	1	5
Procedure-Aware Pretraining for Instructional Video Understanding	Mar 31, 2023	Video Understanding	Code Available	1	5
HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization	Aug 12, 2024	Action Localization, Temporal Action Localization	Code Available	1	5
An Empirical Study of End-to-End Temporal Action Detection	Apr 6, 2022	Action Classification, Action Detection	Code Available	1	5
ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning	Jun 27, 2022	Action Classification, Action Recognition	Code Available	1	5
Helping Hands: An Object-Aware Ego-Centric Video Recognition Model	Aug 15, 2023	Decoder, Object	Code Available	1	5
