| Dual-Level Collaborative Transformer for Image Captioning | Jan 16, 2021 | Descriptive, Image Captioning | Code Available | 1 |
| EgoTaskQA: Understanding Human Tasks in Egocentric Videos | Oct 8, 2022 | Action Localization, counterfactual | Code Available | 1 |
| Enhancing CLIP with GPT-4: Harnessing Visual Descriptions as Prompts | Jul 21, 2023 | Descriptive, Prompt Engineering | Code Available | 1 |
| Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval | Oct 4, 2024 | Descriptive, Language Modeling | Code Available | 1 |
| Deep learning based geometric registration for medical images: How accurate can we get without visual features? | Mar 1, 2021 | Decoder, Descriptive | Code Available | 1 |
| Descriptive and Predictive Analysis of Euroleague Basketball Games and the Wisdom of Basketball Crowds | Feb 19, 2020 | BIG-bench Machine Learning, Binary Classification | Code Available | 1 |
| Dataset Distillation via Vision-Language Category Prototype | Jun 30, 2025 | Dataset Distillation, Descriptive | Code Available | 1 |
| Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability | Jun 2, 2025 | Descriptive, Synthetic Data Generation | Code Available | 1 |
| FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes | Oct 15, 2021 | Descriptive, Image Classification | Code Available | 1 |
| CRAFT: A Benchmark for Causal Reasoning About Forces and inTeractions | Dec 8, 2020 | counterfactual, Descriptive | Code Available | 1 |