| DiffRhythm+: Controllable and Flexible Full-Length Song Generation with Preference Optimization | Jul 17, 2025 | Descriptive | Unverified | 0 |
| Assay2Mol: large language model-based drug design using BioAssay context | Jul 16, 2025 | Descriptive, Drug Design | Code Available | 0 |
| Describe Anything Model for Visual Question Answering on Text-rich Images | Jul 16, 2025 | Descriptive, Language Modeling | Code Available | 1 |
| FIFA: Unified Faithfulness Evaluation Framework for Text-to-Video and Video-to-Text Generation | Jul 9, 2025 | Descriptive, Text Generation | Unverified | 0 |
| Beyond Accuracy: Metrics that Uncover What Makes a 'Good' Visual Descriptor | Jul 4, 2025 | Descriptive, image-classification | Code Available | 0 |
| Prompt Disentanglement via Language Guidance and Representation Alignment for Domain Generalization | Jul 3, 2025 | Descriptive, Disentanglement | Unverified | 0 |
| Dataset Distillation via Vision-Language Category Prototype | Jun 30, 2025 | Dataset Distillation, Descriptive | Code Available | 1 |
| Show, Tell and Summarize: Dense Video Captioning Using Visual Cue Aided Sentence Summarization | Jun 25, 2025 | Dense Video Captioning, Descriptive | Unverified | 0 |
| Experiential marketing strategy and tourism demand in the contribution of the positioning of the floating islands Los Uros, Puno | Jun 22, 2025 | Descriptive, Marketing | Unverified | 0 |
| DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving | Jun 21, 2025 | Autonomous Driving, Descriptive | Code Available | 1 |