| Cognitive resilience: Unraveling the proficiency of image-captioning models to interpret masked visual content | Mar 23, 2024 | Descriptive, Image Captioning | Code Available | 0 |
| MASSTAR: A Multi-Modal and Large-Scale Scene Dataset with a Versatile Toolchain for Surface Prediction and Completion | Mar 18, 2024 | Descriptive | Unverified | 0 |
| Does the Performance of Text-to-Image Retrieval Models Generalize Beyond Captions-as-a-Query? | Mar 15, 2024 | Descriptive, Image Captioning | Code Available | 0 |
| CSDNet: Detect Salient Object in Depth-Thermal via A Lightweight Cross Shallow and Deep Perception Network | Mar 15, 2024 | Descriptive, Informativeness | Unverified | 0 |
| Medical Image Synthesis via Fine-Grained Image-Text Alignment and Anatomy-Pathology Prompting | Mar 11, 2024 | Anatomy, Descriptive | Unverified | 0 |
| Structure Your Data: Towards Semantic Graph Counterfactuals | Mar 11, 2024 | counterfactual, Descriptive | Code Available | 0 |
| The Case for Evaluating Multimodal Translation Models on Text Datasets | Mar 5, 2024 | Descriptive, Image Captioning | Unverified | 0 |
| LLMs for Targeted Sentiment in News Headlines: Exploring the Descriptive-Prescriptive Dilemma | Mar 1, 2024 | Descriptive, In-Context Learning | Unverified | 0 |
| Gender Bias in Large Language Models across Multiple Languages | Mar 1, 2024 | Descriptive | Unverified | 0 |
| ARED: Argentina Real Estate Dataset | Mar 1, 2024 | Descriptive | Code Available | 0 |