Image to text

Papers

Title	Date	Tasks	Status
CoBIT: A Contrastive Bi-directional Image-Text Generation Model	Mar 23, 2023	Decoder, Image Generation	Unverified
Contrastive Learning of Visual-Semantic Embeddings	Oct 17, 2021	Contrastive Learning, image-classification	Unverified
COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval	Apr 15, 2022	Contrastive Learning, Cross-Modal Retrieval	Unverified
Cross-Modal Adaptive Dual Association for Text-to-Image Person Retrieval	Dec 4, 2023	Attribute, Cross-Modal Person Re-Identification	Unverified
Cross-Modal Alignment with Mixture Experts Neural Network for Intral-City Retail Recommendation	Sep 17, 2020	cross-modal alignment, Image to text	Unverified
Cross-modal Contrastive Attention Model for Medical Report Generation	Oct 1, 2022	Image to text, Medical Report Generation	Unverified
Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic	Jul 25, 2024	Image to text, Language Modeling	Unverified
DART: Disease-aware Image-Text Alignment and Self-correcting Re-alignment for Trustworthy Radiology Report Generation	Apr 16, 2025	Contrastive Learning, Image to text	Unverified
Deductron -- A Recurrent Neural Network	Jun 23, 2018	Image to text, Optical Character Recognition (OCR)	Unverified
Development of a New Image-to-text Conversion System for Pashto, Farsi and Traditional Chinese	May 8, 2020	Image to text, Optical Character Recognition (OCR)	Unverified
DiffusionSTR: Diffusion Model for Scene Text Recognition	Jun 29, 2023	Image to text, model	Unverified
DiffuVST: Narrating Fictional Scenes with Global-History-Guided Denoising Models	Dec 12, 2023	Denoising, Diversity	Unverified
DIR: Retrieval-Augmented Image Captioning with Comprehensive Understanding	Dec 2, 2024	Caption Generation, Domain Generalization	Unverified
Discovering Bugs in Vision Models using Off-the-shelf Image Generation and Captioning	Aug 18, 2022	Image Generation, Image to text	Unverified
Doc2Im: document to image conversion through self-attentive embedding	Nov 8, 2018	Document To Image Conversion, document understanding	Unverified
DOCCI: Descriptions of Connected and Contrasting Images	Apr 30, 2024	Image Generation, Image to text	Unverified
Do DALL-E and Flamingo Understand Each Other?	Dec 23, 2022	Image Captioning, Image Generation	Unverified
Do LLMs Understand Visual Anomalies? Uncovering LLM's Capabilities in Zero-shot Anomaly Detection	Apr 15, 2024	Anomaly Detection, Anomaly Localization	Unverified
Dynamic Traceback Learning for Medical Report Generation	Jan 24, 2024	Image to text, Medical Report Generation	Unverified
Efficient End-to-End Visual Document Understanding with Rationale Distillation	Nov 16, 2023	document understanding, Image to text	Unverified
EI-CLIP: Entity-Aware Interventional Contrastive Learning for E-Commerce Cross-Modal Retrieval	Jan 1, 2022	Causal Inference, Contrastive Learning	Unverified
EmojiGAN: learning emojis distributions with a generative model	Oct 1, 2018	Image Captioning, Image to text	Unverified
Enhancing Vision-Language Pre-training with Rich Supervisions	Mar 5, 2024	Image to text, Table Detection	Unverified
Evaluating authenticity and quality of image captions via sentiment and semantic analyses	Sep 14, 2024	Image Captioning, Image to text	Unverified
Every picture tells a story: Image-grounded controllable stylistic story generation	Sep 4, 2022	Image Captioning, Image to text	Unverified
