Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers


Showing 276–300 of 1149 papers

Title	Date	Tasks	Status	Hype
Whether and When does Endoscopy Domain Pretraining Make Sense?	Mar 30, 2023	Action Triplet Detection, Surgical phase recognition	Code Available	1
Streaming Video Model	Mar 30, 2023	Action Recognition, Decoder	Code Available	1
TimeBalance: Temporally-Invariant and Temporally-Distinctive Video Representations for Semi-Supervised Action Recognition	Mar 28, 2023	Action Recognition, Optical Flow Estimation	Code Available	1
Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos	Mar 22, 2023	Representation Learning, Sentence	Code Available	1
Dual-path Adaptation from Image to Video Transformers	Mar 17, 2023	Action Classification, Action Recognition	Code Available	1
TemporalMaxer: Maximize Temporal Context with only Max Pooling for Temporal Action Localization	Mar 16, 2023	Action Localization, Temporal Action Localization	Code Available	1
Localizing Moments in Long Video Via Multimodal Guidance	Feb 26, 2023	Natural Language Moment Retrieval, Natural Language Visual Grounding	Code Available	1
Test of Time: Instilling Video-Language Models with a Sense of Time	Jan 5, 2023	Video-Text Retrieval, Video Understanding	Code Available	1
Boosting Single Image Super-Resolution via Partial Channel Shifting	Jan 1, 2023	Diversity, Image Super-Resolution	Code Available	1
Modeling Video As Stochastic Processes for Fine-Grained Video Representation Learning	Jan 1, 2023	Contrastive Learning, Representation Learning	Code Available	1
Towards Smooth Video Composition	Dec 14, 2022	Image Generation, single-image-generation	Code Available	1
MOMA-LRG: Language-Refined Graphs for Multi-Object Multi-Actor Activity Parsing	Nov 28, 2022	Activity Recognition, Few Shot Action Recognition	Code Available	1
Contrastive Masked Autoencoders for Self-Supervised Video Hashing	Nov 21, 2022	Decoder, Retrieval	Code Available	1
EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens	Nov 19, 2022	Action Recognition, Object State Change Classification	Code Available	1
InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges	Nov 17, 2022	Future Hand Prediction, Moment Queries	Code Available	1
VTC: Improving Video-Text Retrieval with User Comments	Oct 19, 2022	Representation Learning, Retrieval	Code Available	1
EgoTaskQA: Understanding Human Tasks in Egocentric Videos	Oct 8, 2022	Action Localization, counterfactual	Code Available	1
SoccerNet 2022 Challenges Results	Oct 5, 2022	Action Spotting, Camera Calibration	Code Available	1
Learning Transferable Spatiotemporal Representations from Natural Script Knowledge	Sep 30, 2022	Descriptive, Representation Learning	Code Available	1
Streaming Video Temporal Action Segmentation In Real Time	Sep 28, 2022	Action Segmentation, Language Modelling	Code Available	1
Panoramic Vision Transformer for Saliency Detection in 360° Videos	Sep 19, 2022	Saliency Detection, Saliency Prediction	Code Available	1
EchoCoTr: Estimation of the Left Ventricular Ejection Fraction from Spatiotemporal Echocardiography	Sep 9, 2022	Video Understanding	Code Available	1
DeepSportradar-v1: Computer Vision Dataset for Sports Understanding with High Quality Annotations	Aug 17, 2022	Camera Calibration, Instance Segmentation	Code Available	1
Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding	Jul 30, 2022	point cloud video understanding, Video Understanding	Code Available	1
Static and Dynamic Concepts for Self-supervised Video Representation Learning	Jul 26, 2022	Diversity, Representation Learning	Code Available	1

