| Text-Visual Semantic Constrained AI-Generated Image Quality Assessment | Jul 14, 2025 | Image Description, Image Quality Assessment | Code Available | 1 |
| Mitigating Hallucinations in Vision-Language Models through Image-Guided Head Suppression | May 22, 2025 | Hallucination, Image Description | Code Available | 1 |
| Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner | May 16, 2025 | Cross-Modal Retrieval, Diagnostic | Code Available | 2 |
| Advanced Chest X-Ray Analysis via Transformer-Based Image Descriptors and Cross-Model Attention Mechanism | Apr 23, 2025 | Decoder, Image Description | Unverified | 0 |
| LaMOuR: Leveraging Language Models for Out-of-Distribution Recovery in Reinforcement Learning | Mar 21, 2025 | Code Generation, Deep Reinforcement Learning | Unverified | 0 |
| Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model | Mar 10, 2025 | Image Description, Image Generation | Code Available | 2 |
| VisBias: Measuring Explicit and Implicit Social Biases in Vision Language Models | Mar 10, 2025 | Image Description, Multiple-choice | Code Available | 0 |
| SPIDER: A Comprehensive Multi-Organ Supervised Pathology Dataset and Baseline Models | Mar 4, 2025 | Image Description | Code Available | 1 |
| Boli: A dataset for understanding stuttering experience and analyzing stuttered speech | Jan 27, 2025 | Image Description | Unverified | 0 |
| IDEA: Image Description Enhanced CLIP-Adapter | Jan 15, 2025 | Few-Shot Image Classification, image-classification | Code Available | 0 |