Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) the different actions or events that appear in a video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 876–900 of 1149 papers

Title	Date	Tasks	Status
Learning to Focus on the Foreground for Temporal Sentence Grounding	Oct 1, 2022	Sentence, Temporal Sentence Grounding	Unverified
In-the-Wild Video Question Answering	Oct 1, 2022	Evidence Selection, Question Answering	Unverified
Speeding Up Action Recognition Using Dynamic Accumulation of Residuals in Compressed Domain	Sep 29, 2022	Action Recognition, Video Understanding	Unverified
AVT: Audio-Video Transformer for Multimodal Action Recognition	Sep 22, 2022	Action Recognition, Audio Classification	Unverified
WildQA: In-the-Wild Video Question Answering	Sep 14, 2022	Evidence Selection, Question Answering	Unverified
Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions	Sep 7, 2022	Image Generation, Text to Image Generation	Unverified
Visual Subtitle Feature Enhanced Video Outline Generation	Aug 24, 2022	Articles, Headline Generation	Unverified
Identifying Auxiliary or Adversarial Tasks Using Necessary Condition Analysis for Adversarial Multi-task Video Understanding	Aug 22, 2022	Action Recognition, Multi-Task Learning	Unverified
Motion Sensitive Contrastive Learning for Self-supervised Video Representation	Aug 12, 2022	Contrastive Learning, Representation Learning	Unverified
Exploring Anchor-based Detection for Ego4D Natural Language Query	Aug 10, 2022	Video Understanding	Unverified
SA-NET.v2: Real-time vehicle detection from oblique UAV images with use of uncertainty estimation in deep meta-learning	Aug 4, 2022	Meta-Learning, Semantic Segmentation	Unverified
Two-Stream Transformer Architecture for Long Video Understanding	Aug 2, 2022	Action Recognition, GPU	Unverified
BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation	Aug 1, 2022	Object, Optical Flow Estimation	Unverified
EgoEnv: Human-centric environment representations from egocentric video	Jul 22, 2022	Video Understanding	Unverified
Video Swin Transformers for Egocentric Video Understanding @ Ego4D Challenges 2022	Jul 22, 2022	Object, Object State Change Classification	Unverified
AE-Net:Adjoint Enhancement Network for Efficient Action Recognition in Video Understanding	Jul 21, 2022	Action Recognition, Video Understanding	Unverified
An Efficient Spatio-Temporal Pyramid Transformer for Action Detection	Jul 21, 2022	Action Detection, Video Understanding	Unverified
SVGraph: Learning Semantic Graphs from Instructional Videos	Jul 16, 2022	Graph Learning, Video Understanding	Unverified
GraphVid: It Only Takes a Few Nodes to Understand a Video	Jul 4, 2022	Superpixels, Video Understanding	Unverified
Multimodal Intent Discovery from Livestream Videos	Jul 1, 2022	Intent Discovery, Video Summarization	Unverified
(Un)likelihood Training for Interpretable Embedding	Jul 1, 2022	Ad-hoc video search, Decoder	Code Available
Dynamic Multistep Reasoning based on Video Scene Graph for Video Question Answering	Jul 1, 2022	Question Answering, Video Question Answering	Unverified
Submission to Generic Event Boundary Detection Challenge@CVPR 2022: Local Context Modeling and Global Boundary Decoding Approach	Jun 30, 2022	Boundary Detection, Generic Event Boundary Detection	Code Available
Technical Report for CVPR 2022 LOVEU AQTC Challenge	Jun 29, 2022	Video Understanding	Code Available
Multimodal Dialogue State Tracking	Jun 16, 2022	Dialogue State Tracking, Video Understanding	Code Available

No leaderboard results yet.