| Open-Vocabulary Action Localization with Iterative Visual Prompting | Aug 30, 2024 | Action Localization, Temporal Action Localization | Code Available | 1 |
| Prompting Visual-Language Models for Efficient Video Understanding | Dec 8, 2021 | Action Recognition, Language Modelling | Code Available | 1 |
| Zero-Shot Temporal Action Detection via Vision-Language Prompting | Jul 17, 2022 | Action Detection, Classification | Code Available | 1 |
| Spatio-Temporal Context Prompting for Zero-Shot Action Detection | Aug 28, 2024 | Action Detection, Zero-Shot Action Detection | Unverified | 0 |
| Scaling Open-Vocabulary Action Detection | Apr 4, 2025 | Action Detection, Multiple Action Detection | Code Available | 0 |
| UnLoc: A Unified Framework for Video Localization Tasks | Aug 21, 2023 | Action Segmentation, Moment Retrieval | Code Available | 0 |