| Few-Shot Temporal Action Localization with Query Adaptive Transformer | Oct 20, 2021 | Action Localization, Action Segmentation | Code Available | 1 | 5 |
| Audio-Visual Event Localization in Unconstrained Videos | Mar 23, 2018 | audio-visual event localization, Temporal Localization | Code Available | 1 | 5 |
| FineAction: A Fine-Grained Video Dataset for Temporal Action Localization | May 24, 2021 | Action Detection, Action Localization | Code Available | 1 | 5 |
| DisTime: Distribution-based Time Representation for Video Large Language Models | May 30, 2025 | Temporal Localization, Video Understanding | Code Available | 1 | 5 |
| Human-centric Spatio-Temporal Video Grounding With Visual Transformers | Nov 10, 2020 | Referring Expression, Sentence | Code Available | 1 | 5 |
| OpenTAL: Towards Open Set Temporal Action Localization | Mar 10, 2022 | Action Classification, Action Localization | Code Available | 1 | 5 |
| Enriching Local and Global Contexts for Temporal Action Localization | Jul 27, 2021 | Action Classification, Action Localization | Code Available | 1 | 5 |
| Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding in Videos | Jan 25, 2022 | Natural Language Queries, Sentence | Code Available | 1 | 5 |
| TALL: Temporal Activity Localization via Language Query | May 5, 2017 | Natural Language Queries, regression | Code Available | 1 | 5 |
| TimeLoc: A Unified End-to-End Framework for Precise Timestamp Localization in Long Videos | Mar 9, 2025 | Action Localization, Boundary Detection | Code Available | 1 | 5 |