Image to text

Papers

Showing 176–200 of 246 papers

Title	Date	Tasks	Status
Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic	Jul 25, 2024	Image to text, Language Modeling	Unverified
DART: Disease-aware Image-Text Alignment and Self-correcting Re-alignment for Trustworthy Radiology Report Generation	Apr 16, 2025	Contrastive Learning, Image to text	Unverified
Deductron -- A Recurrent Neural Network	Jun 23, 2018	Image to text, Optical Character Recognition (OCR)	Unverified
Development of a New Image-to-text Conversion System for Pashto, Farsi and Traditional Chinese	May 8, 2020	Image to text, Optical Character Recognition (OCR)	Unverified
DiffusionSTR: Diffusion Model for Scene Text Recognition	Jun 29, 2023	Image to text, model	Unverified
DiffuVST: Narrating Fictional Scenes with Global-History-Guided Denoising Models	Dec 12, 2023	Denoising, Diversity	Unverified
DIR: Retrieval-Augmented Image Captioning with Comprehensive Understanding	Dec 2, 2024	Caption Generation, Domain Generalization	Unverified
Discovering Bugs in Vision Models using Off-the-shelf Image Generation and Captioning	Aug 18, 2022	Image Generation, Image to text	Unverified
Doc2Im: document to image conversion through self-attentive embedding	Nov 8, 2018	Document To Image Conversion, document understanding	Unverified
DOCCI: Descriptions of Connected and Contrasting Images	Apr 30, 2024	Image Generation, Image to text	Unverified
Do DALL-E and Flamingo Understand Each Other?	Dec 23, 2022	Image Captioning, Image Generation	Unverified
Do LLMs Understand Visual Anomalies? Uncovering LLM's Capabilities in Zero-shot Anomaly Detection	Apr 15, 2024	Anomaly Detection, Anomaly Localization	Unverified
Dynamic Traceback Learning for Medical Report Generation	Jan 24, 2024	Image to text, Medical Report Generation	Unverified
Efficient End-to-End Visual Document Understanding with Rationale Distillation	Nov 16, 2023	document understanding, Image to text	Unverified
EI-CLIP: Entity-Aware Interventional Contrastive Learning for E-Commerce Cross-Modal Retrieval	Jan 1, 2022	Causal Inference, Contrastive Learning	Unverified
EmojiGAN: learning emojis distributions with a generative model	Oct 1, 2018	Image Captioning, Image to text	Unverified
Enhancing Vision-Language Pre-training with Rich Supervisions	Mar 5, 2024	Image to text, Table Detection	Unverified
Evaluating authenticity and quality of image captions via sentiment and semantic analyses	Sep 14, 2024	Image Captioning, Image to text	Unverified
Every picture tells a story: Image-grounded controllable stylistic story generation	Sep 4, 2022	Image Captioning, Image to text	Unverified
Everything is a Video: Unifying Modalities through Next-Frame Prediction	Nov 15, 2024	Caption Generation, Cross-Modal Retrieval	Unverified
Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation	Mar 14, 2024	Image to text, Optical Character Recognition (OCR)	Unverified
Faithful Chart Summarization with ChaTS-Pi	May 29, 2024	Image to text, Sentence	Unverified
Fetch-A-Set: A Large-Scale OCR-Free Benchmark for Historical Document Retrieval	Jun 11, 2024	Image Retrieval, Image to text	Unverified
From Image to Text Classification: A Novel Approach based on Clustering Word Embeddings	Jul 25, 2017	Clustering, General Classification	Unverified
From Image to Text in Sentiment Analysis via Regression and Deep Learning	Sep 1, 2019	Image to text, regression	Unverified

