| The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval | Jun 26, 2024 | Action Localization, Moment Retrieval | Code Available | 2 | 5 |
| Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval | Jul 21, 2024 | General Knowledge, Highlight Detection | Code Available | 2 | 5 |
| UniMD: Towards Unifying Moment Retrieval and Temporal Action Detection | Apr 7, 2024 | Action Detection, Moment Queries | Code Available | 2 | 5 |
| UniVTG: Towards Unified Video-Language Temporal Grounding | Jul 31, 2023 | Highlight Detection, Moment Retrieval | Code Available | 2 | 5 |
| Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding | Nov 15, 2023 | Highlight Detection, Moment Retrieval | Code Available | 2 | 5 |
| LD-DETR: Loop Decoder DEtection TRansformer for Video Moment Retrieval and Highlight Detection | Jan 18, 2025 | Contrastive Learning, Decoder | Code Available | 1 | 5 |
| FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding | Dec 18, 2024 | Highlight Detection, Moment Retrieval | Code Available | 1 | 5 |
| Localizing Moments in Long Video Via Multimodal Guidance | Feb 26, 2023 | Natural Language Moment Retrieval, Natural Language Visual Grounding | Code Available | 1 | 5 |
| Dense Regression Network for Video Grounding | Apr 7, 2020 | Natural Language Moment Retrieval, Natural Language Queries | Code Available | 1 | 5 |
| DeCafNet: Delegate and Conquer for Efficient Temporal Grounding in Long Videos | May 22, 2025 | Natural Language Moment Retrieval, Natural Language Queries | Code Available | 1 | 5 |