Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 451–475 of 1149 papers

Title	Date	Tasks	Status	Score
HA-ViD: A Human Assembly Video Dataset for Comprehensive Assembly Knowledge Understanding	Jul 9, 2023	Action RecognitionAction Segmentation	CodeCode Available	5
Hallucination Mitigation Prompts Long-term Video Understanding	Jun 17, 2024	Answer GenerationHallucination	CodeCode Available	5
Video action detection by learning graph-based spatio-temporal interactions	Dec 9, 2019	Action DetectionAction Localization	CodeCode Available	5
Spatio-Temporal Perturbations for Video Attribution	Sep 1, 2021	Video Understanding	CodeCode Available	5
4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding	Mar 22, 2025	BenchmarkingObject	CodeCode Available	5
SoccerNet 2024 Challenges Results	Sep 16, 2024	Action SpottingDense Video Captioning	CodeCode Available	5
Streaming Detection of Queried Event Start	Dec 4, 2024	Autonomous Drivingparameter-efficient fine-tuning	CodeCode Available	5
Situational Scene Graph for Structured Human-centric Situation Understanding	Oct 30, 2024	Graph GenerationPredicate Classification	CodeCode Available	5
Creative Flow+ Dataset	Jun 1, 2019	3D Character Animation From A Single PhotoDepth Estimation	CodeCode Available	5
ScVLM: Enhancing Vision-Language Model for Safety-Critical Event Understanding	Oct 1, 2024	Contrastive LearningHallucination	CodeCode Available	5
Screencast Tutorial Video Understanding	Jun 1, 2020	object-detectionObject Detection	CodeCode Available	5
ScaleLong: A Multi-Timescale Benchmark for Long Video Understanding	May 29, 2025	AvgVideo Understanding	CodeCode Available	5
SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding	Apr 30, 2025	Video Understanding	CodeCode Available	5
Snippet-Aware Transformer With Multiple Action Elements for Skeleton-Based Action Segmentation	May 6, 2024	Action SegmentationSkeleton Based Action Segmentation	CodeCode Available	5
Gaussian Temporal Awareness Networks for Action Localization	Sep 9, 2019	Action Localizationobject-detection	CodeCode Available	5
Relation-aware Hierarchical Attention Framework for Video Question Answering	May 13, 2021	Question AnsweringRelation	CodeCode Available	5
Re-ID-AR: Improved Person Re-identification in Video via Joint Weakly Supervised Action Recognition	Nov 1, 2021	Action RecognitionPerson Re-Identification	CodeCode Available	5
Representation Flow for Action Recognition	Oct 2, 2018	Action ClassificationAction Recognition	CodeCode Available	5
Contextual Explainable Video Representation: Human Perception-based Understanding	Dec 12, 2022	Action DetectionAction Recognition	CodeCode Available	5
DriftNet: Aggressive Driving Behavior Classification using 3D EfficientNet Architecture	Apr 18, 2020	Anomaly DetectionClassification	CodeCode Available	5
Recurrent Space-time Graph Neural Networks	Apr 11, 2019	Action RecognitionHuman-Object Interaction Detection	CodeCode Available	5
FriendsQA: A New Large-Scale Deep Video Understanding Dataset with Fine-grained Topic Categorization for Story Videos	Dec 22, 2024	Language ModellingLarge Language Model	CodeCode Available	5
Constrained-size Tensorflow Models for YouTube-8M Video Understanding Challenge	Aug 21, 2018	Video Understanding	CodeCode Available	5
VideoDG: Generalizing Temporal Relations in Videos to Novel Domains	Dec 8, 2019	Action RecognitionData Augmentation	CodeCode Available	5
SoccerChat: Integrating Multimodal Data for Enhanced Soccer Game Understanding	May 22, 2025	Action ClassificationAutomatic Speech Recognition	CodeCode Available	5

Show:10 25 50

← PrevPage 19 of 46Next →

No leaderboard results yet.