SOTAVerified

Image to text

Papers

Showing 51-60 of 246 papers

Title | Status | Hype
Multimodal Foundation Models For Echocardiogram Interpretation | Code | 1
Multimodal Procedural Planning via Dual Text-Image Prompting | Code | 1
Can MLLMs Perform Text-to-Image In-Context Learning? | Code | 1
Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models | Code | 1
ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation | Code | 1
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models | Code | 1
FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training | Code | 1
Improving Factual Completeness and Consistency of Image-to-Text Radiology Report Generation | Code | 1
Bootstrapping Vision-Language Learning with Decoupled Language Pre-training | Code | 1
L-Verse: Bidirectional Generation Between Image and Text | Code | 1

No leaderboard results yet.