SOTAVerified

Image to text

Papers

Showing 126–150 of 246 papers

| Title | Status | Hype |
| --- | --- | --- |
| CoBIT: A Contrastive Bi-directional Image-Text Generation Model | | 0 |
| Contrastive Learning of Visual-Semantic Embeddings | | 0 |
| COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval | | 0 |
| Cross-Modal Adaptive Dual Association for Text-to-Image Person Retrieval | | 0 |
| Cross-Modal Alignment with Mixture Experts Neural Network for Intral-City Retail Recommendation | | 0 |
| Cross-modal Contrastive Attention Model for Medical Report Generation | | 0 |
| Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic | | 0 |
| DART: Disease-aware Image-Text Alignment and Self-correcting Re-alignment for Trustworthy Radiology Report Generation | | 0 |
| Deductron -- A Recurrent Neural Network | | 0 |
| Development of a New Image-to-text Conversion System for Pashto, Farsi and Traditional Chinese | | 0 |
| DiffusionSTR: Diffusion Model for Scene Text Recognition | | 0 |
| DiffuVST: Narrating Fictional Scenes with Global-History-Guided Denoising Models | | 0 |
| DIR: Retrieval-Augmented Image Captioning with Comprehensive Understanding | | 0 |
| Discovering Bugs in Vision Models using Off-the-shelf Image Generation and Captioning | | 0 |
| Doc2Im: document to image conversion through self-attentive embedding | | 0 |
| DOCCI: Descriptions of Connected and Contrasting Images | | 0 |
| Do DALL-E and Flamingo Understand Each Other? | | 0 |
| Do LLMs Understand Visual Anomalies? Uncovering LLM's Capabilities in Zero-shot Anomaly Detection | | 0 |
| Dynamic Traceback Learning for Medical Report Generation | | 0 |
| Efficient End-to-End Visual Document Understanding with Rationale Distillation | | 0 |
| EI-CLIP: Entity-Aware Interventional Contrastive Learning for E-Commerce Cross-Modal Retrieval | | 0 |
| EmojiGAN: learning emojis distributions with a generative model | | 0 |
| Enhancing Vision-Language Pre-training with Rich Supervisions | | 0 |
| Evaluating authenticity and quality of image captions via sentiment and semantic analyses | | 0 |
| Every picture tells a story: Image-grounded controllable stylistic story generation | | 0 |

No leaderboard results yet.