Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 351–400 of 1149 papers

Title | Status | Hype
Learning the Predictability of the Future | Code | 1
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions | Code | 1
End-to-end Temporal Action Detection with Transformer | Code | 1
Isolated Sign Recognition from RGB Video using Pose Flow and Self-Attention | Code | 1
VT-SSum: A Benchmark Dataset for Video Transcript Segmentation and Summarization | Code | 1
Technical Report: Temporal Aggregate Representations | Code | 1
FineAction: A Fine-Grained Video Dataset for Temporal Action Localization | Code | 1
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions | Code | 1
MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions | Code | 1
Stochastic Image-to-Video Synthesis using cINNs | Code | 1
FrameExit: Conditional Early Exiting for Efficient Video Recognition | Code | 1
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval | Code | 1
Crossover Learning for Fast Online Video Instance Segmentation | Code | 1
TubeR: Tubelet Transformer for Video Action Detection | Code | 1
Visual Semantic Role Labeling for Video Understanding | Code | 1
Learning Salient Boundary Feature for Anchor-free Temporal Action Localization | Code | 1
Temporal Context Aggregation Network for Temporal Action Proposal Refinement | Code | 1
Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation | Code | 1
Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition | Code | 1
Relaxed Transformer Decoders for Direct Action Proposal Generation | Code | 1
Occluded Video Instance Segmentation: A Benchmark | Code | 1
TCLR: Temporal Contrastive Learning for Video Representation | Code | 1
TrackFormer: Multi-Object Tracking with Transformers | Code | 1
Learning Self-Similarity in Space and Time as a Generalized Motion for Action Recognition | Code | 1
A Comprehensive Study of Deep Video Action Recognition | Code | 1
End-to-End Video Instance Segmentation with Transformers | Code | 1
SoccerNet-v2: A Dataset and Benchmarks for Holistic Understanding of Broadcast Soccer Videos | Code | 1
QuerYD: A video dataset with high-quality text and audio narrations | Code | 1
Improved Actor Relation Graph based Group Activity Recognition | Code | 1
PAN: Towards Fast Action Recognition via Learning Persistence of Appearance | Code | 1
The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020) | Code | 1
Uncertainty-based Traffic Accident Anticipation with Spatio-Temporal Relational Learning | Code | 1
MotionSqueeze: Neural Motion Feature Learning for Video Understanding | Code | 1
Video Moment Localization using Object Evidence and Reverse Captioning | Code | 1
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization | Code | 1
Temporal Aggregate Representations for Long-Range Video Understanding | Code | 1
Towards Visually Explaining Video Understanding Networks with Perturbation | Code | 1
Top-1 Solution of Multi-Moments in Time Challenge 2019 | Code | 1
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning | Code | 1
Weakly Supervised Temporal Action Localization Using Deep Metric Learning | Code | 1
Tree-Structured Policy based Progressive Reinforcement Learning for Temporally Language Grounding in Video | Code | 1
Temporal Interlacing Network | Code | 1
EEV: A Large-Scale Dataset for Studying Evoked Expressions from Video | Code | 1
A Multigrid Method for Efficiently Training Video Models | Code | 1
CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning | Code | 1
Lightweight Network Architecture for Real-Time Action Recognition | Code | 1
Large Scale Holistic Video Understanding | Code | 1
TSM: Temporal Shift Module for Efficient Video Understanding | Code | 1
VirtualHome: Simulating Household Activities via Programs | Code | 1
AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions | Code | 1

No leaderboard results yet.