Image to text

Papers

Showing 201–225 of 246 papers

Title	Date	Tasks	Status
From Pixels to Prose: Advancing Multi-Modal Language Models for Remote Sensing	Nov 5, 2024	Change Detection, Contrastive Learning	Unverified
GPC: Generative and General Pathology Image Classifier	Jul 12, 2024	Classification, image-classification	Unverified
GPT-4V(ision) as a Generalist Evaluator for Vision-Language Tasks	Nov 2, 2023	Image Generation, Image to text	Unverified
GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training	Aug 22, 2023	image-classification, Image Classification	Unverified
Hierarchical Gumbel Attention Network for Text-based Person Search	Oct 10, 2020	Image Retrieval, Image to text	Unverified
HyCIR: Boosting Zero-Shot Composed Image Retrieval with Synthetic Labels	Jul 8, 2024	Contrastive Learning, Image Retrieval	Unverified
I2T2I: Learning Text to Image Synthesis with Textual Data Augmentation	Mar 20, 2017	Caption Generation, Data Augmentation	Unverified
Illegible Text to Readable Text: An Image-to-Image Transformation using Conditional Sliced Wasserstein Adversarial Networks	Oct 11, 2019	Generative Adversarial Network, Image-to-Image Translation	Unverified
Image2Text2Image: A Novel Framework for Label-Free Evaluation of Image-to-Text Generation with Text-to-Image Diffusion Models	Nov 8, 2024	Image Captioning, Image Generation	Unverified
Image Captioners Sometimes Tell More Than Images They See	May 4, 2023	Descriptive, Image Captioning	Unverified
Image Semantic Relation Generation	Oct 19, 2022	Image Retrieval, Image Segmentation	Unverified
Image-to-Text for Medical Reports Using Adaptive Co-Attention and Triple-LSTM Module	Mar 24, 2025	Image to text, Medical Report Generation	Unverified
Image-to-Text Logic Jailbreak: Your Imagination can Help You Do Anything	Jul 1, 2024	Image to text, Language Modeling	Unverified
Improving Factuality of 3D Brain MRI Report Generation with Paired Image-domain Retrieval and Text-domain Augmentation	Nov 23, 2024	Cross-Modal Retrieval, Image to text	Unverified
Improving Medical Visual Representation Learning with Pathological-level Cross-Modal Alignment and Correlation Exploration	Jun 12, 2025	cross-modal alignment, Image to text	Unverified
Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling	Mar 13, 2023	Decoder, Image to text	Unverified
Improving the Factual Correctness of Radiology Report Generation with Semantic Rewards	Oct 21, 2022	Image to text, named-entity-recognition	Unverified
Instruction Tuning-free Visual Token Complement for Multimodal LLMs	Aug 9, 2024	Image Generation, Image to text	Unverified
Interpreting Vision and Language Generative Models with Semantic Visual Priors	Apr 28, 2023	Image to text	Unverified
Is Cross-modal Information Retrieval Possible without Training?	Apr 20, 2023	Contrastive Learning, Cross-Modal Information Retrieval	Unverified
I See Dead People: Gray-Box Adversarial Attack on Image-To-Text Models	Jun 13, 2023	Adversarial Attack, Decoder	Unverified
Knowledge Aware Semantic Concept Expansion for Image-Text Matching	Aug 10, 2019	Common Sense Reasoning, Content-Based Image Retrieval	Unverified
Knowledge driven Description Synthesis for Floor Plan Interpretation	Mar 15, 2021	Caption Generation, Descriptive	Unverified
Semantically Grounded QFormer for Efficient Vision Language Understanding	Nov 13, 2023	Diversity, Image to text	Unverified
Learning by Hallucinating: Vision-Language Pre-training with Weak Supervision	Oct 24, 2022	cross-modal alignment, Cross-Modal Retrieval	Unverified


No leaderboard results yet.