
Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers


Showing 826–850 of 1149 papers

Title	Date	Tasks	Status
VideoGLUE: Video General Understanding Evaluation of Foundation Models	Jul 6, 2023	Action Recognition, Temporal Localization	Unverified
ICSVR: Investigating Compositional and Syntactic Understanding in Video Retrieval Models	Jun 28, 2023	Retrieval, Video Retrieval	Code Available
Temporal Action Proposal Generation With Action Frequency Adaptive Network	Jun 23, 2023	Knowledge Distillation, Temporal Action Proposal Generation	Code Available
Learning Space-Time Semantic Correspondences	Jun 16, 2023	Imitation Learning, Semantic correspondence	Unverified
Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment	Jun 8, 2023	Video Understanding	Unverified
MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning	Jun 4, 2023	Benchmarking, Contrastive Learning	Unverified
Teacher Agent: A Knowledge Distillation-Free Framework for Rehearsal-based Video Incremental Learning	Jun 1, 2023	Incremental Learning, Knowledge Distillation	Code Available
Action Sensitivity Learning for Temporal Action Localization	May 25, 2023	Action Localization, Moment Queries	Unverified
Learning Higher-order Object Interactions for Keypoint-based Video Understanding	May 16, 2023	Action Localization, Action Recognition	Unverified
A Video Is Worth 4096 Tokens: Verbalize Videos To Understand Them In Zero Shot	May 16, 2023	Emotion Classification, Question Answering	Code Available
Vehicle Detection and Classification without Residual Calculation: Accelerating HEVC Image Decoding with Random Perturbation Injection	May 14, 2023	Image Reconstruction, vehicle detection	Unverified
ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System	Apr 27, 2023	Video Understanding	Unverified
MRSN: Multi-Relation Support Network for Video Action Detection	Apr 24, 2023	Action Detection, Relation	Unverified
Search-Map-Search: A Frame Selection Paradigm for Action Recognition	Apr 20, 2023	Action Recognition, Heuristic Search	Unverified
LASER: A Neuro-Symbolic Framework for Learning Spatial-Temporal Scene Graphs with Weak Supervision	Apr 15, 2023	Language Modeling, Language Modelling	Unverified
Therbligs in Action: Video Understanding through Motion Primitives	Apr 6, 2023	Action Anticipation, Action Recognition	Unverified
DOAD: Decoupled One Stage Action Detection Network	Apr 1, 2023	Action Detection, Action Recognition	Unverified
SVT: Supertoken Video Transformer for Efficient Video Understanding	Apr 1, 2023	Video Understanding	Unverified
System-status-aware Adaptive Network for Online Streaming Video Understanding	Mar 28, 2023	Streaming video understanding, Video Understanding	Unverified
Selective Structured State-Spaces for Long-Form Video Understanding	Mar 25, 2023	Contrastive Learning, Form	Unverified
Leaping Into Memories: Space-Time Deep Feature Synthesis	Mar 17, 2023	Diversity, Video Understanding	Code Available
Video4MRI: An Empirical Study on Brain Magnetic Resonance Image Analytics with CNN-based Video Classification Frameworks	Feb 24, 2023	Classification, Data Augmentation	Unverified
MINOTAUR: Multi-task Video Grounding From Multimodal Queries	Feb 16, 2023	Action Detection, Sentence	Code Available
Semi-Parametric Video-Grounded Text Generation	Jan 27, 2023	Language Modeling, Language Modelling	Unverified
Building Scalable Video Understanding Benchmarks through Sports	Jan 17, 2023	Video Understanding	Unverified


