| Improving Zero-Shot Action Recognition using Human Instruction with Text Description | Jan 21, 2023 | Action Recognition, Sentence | Unverified | 0 |
| Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models | Dec 31, 2022 | Action Classification, Action Recognition | Code Available | 2 |
| VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners | Dec 9, 2022 | Question Answering, Retrieval | Unverified | 0 |
| REST: REtrieve & Self-Train for generative action recognition | Sep 29, 2022 | Action Recognition, Caption Generation | Unverified | 0 |
| Global Semantic Descriptors for Zero-Shot Action Recognition | Sep 24, 2022 | Action Classification, Action Recognition | Code Available | 0 |
| Expanding Language-Image Pretrained Models for General Video Recognition | Aug 4, 2022 | Action Classification, Action Recognition | Code Available | 3 |
| Multimodal Open-Vocabulary Video Classification via Pre-Trained Vision and Language Models | Jul 15, 2022 | Optical Flow Estimation, Video Classification | Unverified | 0 |
| Revisiting Classifier: Transferring Vision-Language Models for Video Recognition | Jul 4, 2022 | Action Classification, Action Recognition | Code Available | 2 |
| Learning Using Privileged Information for Zero-Shot Action Recognition | Jun 17, 2022 | Action Recognition, Hallucination | Unverified | 0 |
| A CLIP-Hitchhiker's Guide to Long Video Retrieval | May 17, 2022 | Retrieval, Video Retrieval | Code Available | 1 |