SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 1001-1050 of 1149 papers

Title | Status | Hype
Video Action Understanding | Code | 0
Global Self-Attention Networks for Image Recognition | | 0
Features Understanding in 3D CNNs for Actions Recognition in Video | Code | 0
Residual Frames with Efficient Pseudo-3D CNN for Human Action Recognition | | 0
Self-supervised Motion Representation via Scattering Local Motion Cues | | 0
Detection and Localization of Robotic Tools in Robot-Assisted Surgery Videos Using Deep Neural Networks for Region Proposal and Detection | | 0
Perceptron Synthesis Network: Rethinking the Action Scale Variances in Videos | | 0
MovieNet: A Holistic Dataset for Movie Understanding | | 0
Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training | | 0
Video Understanding as Machine Translation | | 0
Screencast Tutorial Video Understanding | Code | 0
Large Scale Video Representation Learning via Relational Graph Clustering | | 0
CARPe Posterum: A Convolutional Approach for Real-time Pedestrian Path Prediction | Code | 0
DramaQA: Character-Centered Video Story Understanding with Hierarchical QA | Code | 0
HLVU : A New Challenge to Test Deep Understanding of Movies the Way Humans do | | 0
CATER: A diagnostic dataset for Compositional Actions & TEmporal Reasoning | | 0
Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube | Code | 0
DriftNet: Aggressive Driving Behavior Classification using 3D EfficientNet Architecture | Code | 0
Knowledge-Based Visual Question Answering in Videos | | 0
Real-Time Segmentation Networks should be Latency Aware | | 0
Context Modulated Dynamic Networks for Actor and Action Video Segmentation with Language Queries | | 0
Fully Automated Hand Hygiene Monitoring in Operating Room using 3D Convolutional Neural Network | | 0
Beyond the Camera: Neural Networks in World Coordinates | | 0
CTM: Collaborative Temporal Modeling for Action Recognition | | 0
Cut-Based Graph Learning Networks to Discover Compositional Structure of Sequential Video Data | | 0
SoccerDB: A Large-Scale Database for Comprehensive Video Understanding | Code | 0
Video action detection by learning graph-based spatio-temporal interactions | Code | 0
VideoDG: Generalizing Temporal Relations in Videos to Novel Domains | Code | 0
Context R-CNN: Long Term Temporal Context for Per-Camera Object Detection | Code | 0
A Context-Aware Loss Function for Action Spotting in Soccer Videos | Code | 0
BERT for Large-scale Video Segment Classification with Test-time Augmentation | | 0
AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization | | 0
Mimic The Raw Domain: Accelerating Action Recognition in the Compressed Domain | | 0
Cross-Class Relevance Learning for Temporal Concept Localization | | 0
Multi-attention Networks for Temporal Localization of Video-level Labels | Code | 0
Multi-Moments in Time: Learning and Interpreting Models for Multi-Action Video Understanding | Code | 0
Comprehensive Video Understanding: Video summarization with content-based video recommender design | | 0
MOD: A Deep Mixture Model with Online Knowledge Distillation for Large Scale Video Temporal Concept Localization | Code | 0
KnowIT VQA: Answering Knowledge-Based Questions about Videos | | 0
AFO-TAD: Anchor-free One-Stage Detector for Temporal Action Detection | | 0
Tiny Video Networks | Code | 0
OmniTrack: Real-time detection and tracking of objects, text and logos in video | | 0
ViP: Video Platform for PyTorch | Code | 0
A SPIKING SEQUENTIAL MODEL: RECURRENT LEAKY INTEGRATE-AND-FIRE | | 0
Question Answering is a Format; When is it Useful? | | 0
Zero-Shot Action Recognition in Videos: A Survey | | 0
Gaussian Temporal Awareness Networks for Action Localization | Code | 0
Only Time Can Tell: Discovering Temporal Data for Temporal Modeling | | 0
Localizing Unseen Activities in Video via Image Query | | 0
UniDual: A Unified Model for Image and Video Understanding | | 0

No leaderboard results yet.