SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 1001–1050 of 1149 papers

| Title | Status | Hype |
| --- | --- | --- |
| HLVU : A New Challenge to Test Deep Understanding of Movies the Way Humans do | | 0 |
| Towards Visually Explaining Video Understanding Networks with Perturbation | Code | 1 |
| Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube | Code | 0 |
| DriftNet: Aggressive Driving Behavior Classification using 3D EfficientNet Architecture | Code | 0 |
| Knowledge-Based Visual Question Answering in Videos | | 0 |
| Real-Time Segmentation Networks should be Latency Aware | | 0 |
| Context Modulated Dynamic Networks for Actor and Action Video Segmentation with Language Queries | | 0 |
| Fully Automated Hand Hygiene Monitoring in Operating Room using 3D Convolutional Neural Network | | 0 |
| Beyond the Camera: Neural Networks in World Coordinates | | 0 |
| Top-1 Solution of Multi-Moments in Time Challenge 2019 | Code | 1 |
| Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning | Code | 1 |
| CTM: Collaborative Temporal Modeling for Action Recognition | | 0 |
| Weakly Supervised Temporal Action Localization Using Deep Metric Learning | Code | 1 |
| Tree-Structured Policy based Progressive Reinforcement Learning for Temporally Language Grounding in Video | Code | 1 |
| Cut-Based Graph Learning Networks to Discover Compositional Structure of Sequential Video Data | | 0 |
| Temporal Interlacing Network | Code | 1 |
| EEV: A Large-Scale Dataset for Studying Evoked Expressions from Video | Code | 1 |
| SoccerDB: A Large-Scale Database for Comprehensive Video Understanding | Code | 0 |
| Video action detection by learning graph-based spatio-temporal interactions | Code | 0 |
| VideoDG: Generalizing Temporal Relations in Videos to Novel Domains | Code | 0 |
| Context R-CNN: Long Term Temporal Context for Per-Camera Object Detection | Code | 0 |
| A Context-Aware Loss Function for Action Spotting in Soccer Videos | Code | 0 |
| BERT for Large-scale Video Segment Classification with Test-time Augmentation | | 0 |
| A Multigrid Method for Efficiently Training Video Models | Code | 1 |
| AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization | | 0 |
| Mimic The Raw Domain: Accelerating Action Recognition in the Compressed Domain | | 0 |
| Cross-Class Relevance Learning for Temporal Concept Localization | | 0 |
| Multi-attention Networks for Temporal Localization of Video-level Labels | Code | 0 |
| Multi-Moments in Time: Learning and Interpreting Models for Multi-Action Video Understanding | Code | 0 |
| Comprehensive Video Understanding: Video summarization with content-based video recommender design | | 0 |
| MOD: A Deep Mixture Model with Online Knowledge Distillation for Large Scale Video Temporal Concept Localization | Code | 0 |
| KnowIT VQA: Answering Knowledge-Based Questions about Videos | | 0 |
| AFO-TAD: Anchor-free One-Stage Detector for Temporal Action Detection | | 0 |
| Tiny Video Networks | Code | 0 |
| OmniTrack: Real-time detection and tracking of objects, text and logos in video | | 0 |
| CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning | Code | 1 |
| ViP: Video Platform for PyTorch | Code | 0 |
| A SPIKING SEQUENTIAL MODEL: RECURRENT LEAKY INTEGRATE-AND-FIRE | | 0 |
| Question Answering is a Format; When is it Useful? | | 0 |
| Zero-Shot Action Recognition in Videos: A Survey | | 0 |
| Gaussian Temporal Awareness Networks for Action Localization | Code | 0 |
| Only Time Can Tell: Discovering Temporal Data for Temporal Modeling | | 0 |
| Localizing Unseen Activities in Video via Image Query | | 0 |
| UniDual: A Unified Model for Image and Video Understanding | | 0 |
| Hierarchical Video Frame Sequence Representation with Deep Convolutional Graph Network | | 0 |
| Creative Flow+ Dataset | Code | 0 |
| Audio Caption in a Car Setting with a Sentence-Level Loss | Code | 0 |
| AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures | Code | 0 |
| Exploring Temporal Information for Improved Video Understanding | Code | 0 |
| Lightweight Network Architecture for Real-Time Action Recognition | Code | 1 |

No leaderboard results yet.