SOTAVerified

Image to text

Papers

Showing 141 to 150 of 246 papers

Title | Status | Hype
MedM2G: Unifying Medical Multi-Modal Generation via Cross-Guided Diffusion with Visual Invariant | | 0
Enhancing Vision-Language Pre-training with Rich Supervisions | | 0
Attention Guidance Mechanism for Handwritten Mathematical Expression Recognition | | 0
Probing Multimodal Large Language Models for Global and Local Semantic Representations | Code | 0
A Unified Framework and Dataset for Assessing Societal Bias in Vision-Language Models | | 0
Captions Are Worth a Thousand Words: Enhancing Product Retrieval with Pretrained Image-to-Text Models | | 0
Dynamic Traceback Learning for Medical Report Generation | | 0
CoCoT: Contrastive Chain-of-Thought Prompting for Large Multimodal Models with Multiple Image Inputs | | 0
SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment | | 0
Accept the Modality Gap: An Exploration in the Hyperbolic Space | | 0

No leaderboard results yet.