| Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA | May 13, 2020 | Image Captioning, Multi-Label Classification | Code Available | 1 | 5 |
| TubeDETR: Spatio-Temporal Video Grounding with Transformers | Mar 30, 2022 | Decoder, Language-Based Temporal Localization | Code Available | 1 | 5 |
| Weakly Supervised Action Selection Learning in Video | May 6, 2021 | Temporal Localization, Weakly Supervised Action Localization | Code Available | 1 | 5 |
| Enriching Local and Global Contexts for Temporal Action Localization | Jul 27, 2021 | Action Classification, Action Localization | Code Available | 1 | 5 |
| End-to-End Semi-Supervised Learning for Video Action Detection | Mar 8, 2022 | Action Detection, Classification Consistency | Code Available | 1 | 5 |
| Finding Moments in Video Collections Using Natural Language | Jul 30, 2019 | Moment Retrieval, Re-Ranking | Code Available | 1 | 5 |
| Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding | Feb 16, 2025 | Attribute, Object | Code Available | 1 | 5 |
| Audio-Visual Event Localization in Unconstrained Videos | Mar 23, 2018 | audio-visual event localization, Temporal Localization | Code Available | 1 | 5 |
| Learning Salient Boundary Feature for Anchor-free Temporal Action Localization | Mar 24, 2021 | Action Localization, Temporal Action Localization | Code Available | 1 | 5 |
| DisTime: Distribution-based Time Representation for Video Large Language Models | May 30, 2025 | Temporal Localization, Video Understanding | Code Available | 1 | 5 |
| LocVTP: Video-Text Pre-training for Temporal Localization | Jul 21, 2022 | Retrieval, Temporal Localization | Code Available | 1 | 5 |
| TALL: Temporal Activity Localization via Language Query | May 5, 2017 | Natural Language Queries, regression | Code Available | 1 | 5 |
| Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding in Videos | Jan 25, 2022 | Natural Language Queries, Sentence | Code Available | 1 | 5 |
| Boundary-sensitive Pre-training for Temporal Localization in Videos | Nov 21, 2020 | Action Classification, Classification | Code Available | 1 | 5 |
| MAC: Mining Activity Concepts for Language-based Temporal Localization | Nov 21, 2018 | Language-Based Temporal Localization, Temporal Localization | Code Available | 1 | 5 |
| Temporally Precise Action Spotting in Soccer Videos Using Dense Detection Anchors | May 20, 2022 | Action Spotting, Data Augmentation | Code Available | 1 | 5 |
| CityFlow-NL: Tracking and Retrieval of Vehicles at City Scale by Natural Language Descriptions | Jan 12, 2021 | Multi-Object Tracking, Object Tracking | Code Available | 1 | 5 |
| FineAction: A Fine-Grained Video Dataset for Temporal Action Localization | May 24, 2021 | Action Detection, Action Localization | Code Available | 1 | 5 |
| Unsupervised classification to improve the quality of a bird song recording dataset | Feb 15, 2023 | Sound Classification, Temporal Localization | Code Available | 1 | 5 |
| Skeleton-Based Human Action Recognition with Noisy Labels | Mar 15, 2024 | Action Recognition, Denoising | Code Available | 0 | 5 |
| Asynchronous Temporal Fields for Action Recognition | Dec 19, 2016 | Action Classification, Action Recognition | Code Available | 0 | 5 |
| SoftLoc: Robust Temporal Localization under Label Misalignment | Sep 25, 2019 | Position, Temporal Localization | Code Available | 0 | 5 |
| Am I Done? Predicting Action Progress in Videos | May 4, 2017 | Action Detection, Temporal Localization | Code Available | 0 | 5 |
| Dense Video Object Captioning from Disjoint Supervision | Jun 20, 2023 | Object, Sentence | Code Available | 0 | 5 |
| Semi-supervised Active Learning for Video Action Detection | Dec 12, 2023 | Action Detection, Active Learning | Code Available | 0 | 5 |