| Attribute-based Visual Reprogramming for Image Classification with CLIP | Jan 23, 2025 | Attribute, Descriptive | Code Available | 0 |
| Graphite: GRAPH-Induced feaTure Extraction for Point Cloud Registration | Oct 18, 2020 | Descriptive, Keypoint Detection | Code Available | 0 |
| SemStyle: Learning to Generate Stylised Image Captions using Unaligned Text | May 18, 2018 | Descriptive, Image Captioning | Code Available | 0 |
| Less Descriptive yet Discriminative: Quantifying the Properties of Multimodal Referring Utterances via CLIP | May 1, 2022 | Descriptive | Code Available | 0 |
| Attend to You: Personalized Image Captioning with Context Sequence Memory Networks | Apr 21, 2017 | Descriptive, Image Captioning | Code Available | 0 |
| Let's Think Frame by Frame with VIP: A Video Infilling and Prediction Dataset for Evaluating Video Chain-of-Thought | May 23, 2023 | Descriptive, Video Prediction | Code Available | 0 |
| CoinMath: Harnessing the Power of Coding Instruction for Math LLMs | Dec 16, 2024 | Descriptive, Math | Code Available | 0 |
| Good News, Everyone! Context driven entity-aware captioning for news images | Apr 2, 2019 | Articles, Descriptive | Code Available | 0 |
| Picture It In Your Mind: Generating High Level Visual Representations From Textual Descriptions | Jun 23, 2016 | Cross-Modal Information Retrieval, Cross-Modal Retrieval | Code Available | 0 |
| Leveraging Vision-Language Models for Open-Vocabulary Instance Segmentation and Tracking | Mar 18, 2025 | Descriptive, Instance Segmentation | Code Available | 0 |