SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 1051-1100 of 1149 papers

Title | Status | Hype
Representation Learning on Visual-Symbolic Graphs for Video Understanding | - | 0
Video Instance Segmentation | Code | 2
Large Scale Holistic Video Understanding | Code | 1
Recurrent Space-time Graph Neural Networks | Code | 0
Constructing Hierarchical Q&A Datasets for Video Story Understanding | - | 0
Wasserstein Dependency Measure for Representation Learning | - | 0
4D Generic Video Object Proposals | Code | 0
DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition | - | 0
Future semantic segmentation of time-lapsed videos with large temporal displacement | - | 0
Dynamic Graph Modules for Modeling Object-Object Interactions in Activity Recognition | - | 0
Long-Term Feature Banks for Detailed Video Understanding | Code | 0
A Structured Model For Action Detection | - | 0
An Attempt towards Interpretable Audio-Visual Video Captioning | - | 0
The Visual Centrifuge: Model-Free Layered Video Representations | Code | 0
How to Make a BLT Sandwich? Learning to Reason towards Understanding Web Instructional Videos | - | 0
Self-Supervised Spatiotemporal Feature Learning via Video Rotation Prediction | - | 0
Integrated Object Detection and Tracking with Tracklet-Conditioned Detection | - | 0
Efficient Video Understanding via Layered Multi Frame-Rate Analysis | - | 0
TSM: Temporal Shift Module for Efficient Video Understanding | Code | 1
NeXtVLAD: An Efficient Neural Network to Aggregate Frame-level Features for Large-scale Video Classification | Code | 0
Random Temporal Skipping for Multirate Video Analysis | - | 0
Morph: Flexible Acceleration for 3D CNN-based Video Understanding | - | 0
Unsupervised Adversarial Visual Level Domain Adaptation for Learning Video Object Detectors from Images | Code | 0
Representation Flow for Action Recognition | Code | 0
Learnable Pooling Methods for Video Classification | Code | 0
Non-local NetVLAD Encoding for Video Classification | - | 0
Large-Scale Video Classification with Feature Space Augmentation coupled with Learned Label Relations and Ensembling | - | 0
Label Denoising with Large Ensembles of Heterogeneous Neural Networks | - | 0
Localizing Moments in Video with Temporal Language | Code | 0
End-to-End Joint Semantic Segmentation of Actors and Actions in Video | - | 0
Teaching Machines to Understand Baseball Games: Large-Scale Baseball Video Database for Multiple Video Understanding Tasks | - | 0
Constrained-size Tensorflow Models for YouTube-8M Video Understanding Challenge | Code | 0
Diagnosing Error in Temporal Action Detectors | Code | 0
Video Time: Properties, Encoders and Evaluation | - | 0
Query-Conditioned Three-Player Adversarial Network for Video Summarization | - | 0
When Work Matters: Transforming Classical Network Structures to Graph CNN | - | 0
Deep Spatio-Temporal Random Fields for Efficient Video Segmentation | - | 0
Long Activity Video Understanding using Functional Object-Oriented Network | - | 0
Exploiting Spatial-Temporal Modelling and Multi-Modal Fusion for Human Action Recognition | - | 0
VirtualHome: Simulating Household Activities via Programs | Code | 1
Massively Parallel Video Networks | - | 0
Geometry Guided Convolutional Neural Networks for Self-Supervised Video Representation Learning | - | 0
What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets | - | 0
DenseImage Network: Video Spatial-Temporal Evolution Encoding and Understanding | - | 0
Fast Retinomorphic Event Stream for Video Recognition and Reinforcement Learning | - | 0
Dilated Temporal Relational Adversarial Network for Generic Video Summarization | - | 0
Charades-Ego: A Large-Scale Dataset of Paired Third and First Person Videos | - | 0
ECO: Efficient Convolutional Network for Online Video Understanding | Code | 0
Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning | Code | 0
End-to-End Learning of Motion Representation for Video Understanding | Code | 0

No leaderboard results yet.