SOTAVerified

Image to text

Papers

Showing 21-30 of 246 papers

Title | Status | Hype
From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models | Code | 2
ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation | Code | 1
Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models | Code | 1
Distilled Dual-Encoder Model for Vision-Language Understanding | Code | 1
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles | Code | 1
Improving Factual Completeness and Consistency of Image-to-Text Radiology Report Generation | Code | 1
Beyond One-to-One: Rethinking the Referring Image Segmentation | Code | 1
Improving Image Restoration through Removing Degradations in Textual Representations | Code | 1
Bootstrapping Vision-Language Learning with Decoupled Language Pre-training | Code | 1
LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation? | Code | 1
Page 3 of 25

No leaderboard results yet.