SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 681-690 of 1149 papers

Title | Status | Hype
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks | Code | 2
Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding | Code | 4
MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning | - | 0
Teacher Agent: A Knowledge Distillation-Free Framework for Rehearsal-based Video Incremental Learning | Code | 0
Action Sensitivity Learning for Temporal Action Localization | - | 0
VideoLLM: Modeling Video Sequence with Large Language Models | Code | 1
A Video Is Worth 4096 Tokens: Verbalize Videos To Understand Them In Zero Shot | Code | 0
Learning Higher-order Object Interactions for Keypoint-based Video Understanding | - | 0
Vehicle Detection and Classification without Residual Calculation: Accelerating HEVC Image Decoding with Random Perturbation Injection | - | 0
Transformer-Based Model for Monocular Visual Odometry: A Video Understanding Approach | Code | 1

No leaderboard results yet.