SOTAVerified

Temporal Localization

Papers

Showing 150 of 153 papers

Title | Status | Hype
VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning | Code | 3
OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding | Code | 2
MINERVA: Evaluating Complex Video Reasoning | Code | 2
LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding | Code | 2
Egocentric Video-Language Pretraining | Code | 2
VideoMolmo: Spatio-Temporal Grounding Meets Pointing | Code | 2
TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding | Code | 2
Number it: Temporal Grounding Videos like Flipping Manga | Code | 2
LITA: Language Instructed Temporal-Localization Assistant | Code | 2
Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation | Code | 2
TimeMarker: A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability | Code | 2
Weakly Supervised Temporal Action Localization Using Deep Metric Learning | Code | 1
Training-free Video Temporal Grounding using Large-scale Pre-trained Models | Code | 1
Learning Salient Boundary Feature for Anchor-free Temporal Action Localization | Code | 1
VLG-Net: Video-Language Graph Matching Network for Video Grounding | Code | 1
Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding | Code | 1
LocVTP: Video-Text Pre-training for Temporal Localization | Code | 1
Weakly Supervised Action Selection Learning in Video | Code | 1
FineAction: A Fine-Grained Video Dataset for Temporal Action Localization | Code | 1
TubeDETR: Spatio-Temporal Video Grounding with Transformers | Code | 1
Temporally Precise Action Spotting in Soccer Videos Using Dense Detection Anchors | Code | 1
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks | Code | 1
Boundary-sensitive Pre-training for Temporal Localization in Videos | Code | 1
Human-centric Spatio-Temporal Video Grounding With Visual Transformers | Code | 1
Video Moment Localization using Object Evidence and Reverse Captioning | Code | 1
Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA | Code | 1
Finding Moments in Video Collections Using Natural Language | Code | 1
Unsupervised classification to improve the quality of a bird song recording dataset | Code | 1
Enriching Local and Global Contexts for Temporal Action Localization | Code | 1
OpenTAL: Towards Open Set Temporal Action Localization | Code | 1
Self-Chained Image-Language Model for Video Localization and Question Answering | Code | 1
Audio-Visual Event Localization in Unconstrained Videos | Code | 1
Multi-Task Learning of Object State Changes from Uncurated Videos | Code | 1
MAC: Mining Activity Concepts for Language-based Temporal Localization | Code | 1
DisTime: Distribution-based Time Representation for Video Large Language Models | Code | 1
End-to-End Semi-Supervised Learning for Video Action Detection | Code | 1
Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time | Code | 1
Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding in Videos | Code | 1
Stargazer: A transformer-based driver action detection system for intelligent transportation | Code | 1
TALL: Temporal Activity Localization via Language Query | Code | 1
Few-Shot Temporal Action Localization with Query Adaptive Transformer | Code | 1
CityFlow-NL: Tracking and Retrieval of Vehicles at City Scale by Natural Language Descriptions | Code | 1
Unsupervised Pre-training for Temporal Action Localization Tasks | Code | 1
TimeLoc: A Unified End-to-End Framework for Precise Timestamp Localization in Long Videos | Code | 1
Detection of Sleep Apnea-Hypopnea Events Using Millimeter-wave Radar and Pulse Oximeter | - | 0
Described Spatial-Temporal Video Detection | - | 0
Density-Guided Label Smoothing for Temporal Localization of Driving Actions | - | 0
Action Shuffling for Weakly Supervised Temporal Localization | - | 0
Learning to track for spatio-temporal action localization | - | 0
Deep-Learning-Assisted Analysis of Cataract Surgery Videos | - | 0

No leaderboard results yet.