Image to text

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 151–175 of 246 papers

Title	Date	Tasks	Status
An Online Learning Approach to Prompt-based Selection of Generative Models	Oct 17, 2024	Image to text	—Unverified
Ask, Attend, Attack: A Effective Decision-Based Black-Box Targeted Attack for Image-to-Text Models	Aug 16, 2024	Image to text	—Unverified
A Thousand Words Are Worth More Than a Picture: Natural Language-Centric Outside-Knowledge Visual Question Answering	Jan 14, 2022	Generative Question AnsweringImage to text	—Unverified
Attention Guidance Mechanism for Handwritten Mathematical Expression Recognition	Mar 4, 2024	Image to text	—Unverified
A Unified Framework and Dataset for Assessing Societal Bias in Vision-Language Models	Feb 21, 2024	BenchmarkingImage to text	—Unverified
Backdooring Vision-Language Models with Out-Of-Distribution Data	Oct 2, 2024	Image CaptioningImage to text	—Unverified
Better Text Understanding Through Image-To-Text Transfer	May 23, 2017	Image to text	—Unverified
Beyond Color and Lines: Zero-Shot Style-Specific Image Variations with Coordinated Semantics	Oct 24, 2024	Image to textImage-Variation	—Unverified
Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation	Nov 18, 2023	Image to textSemantic Similarity	—Unverified
BiLMa: Bidirectional Local-Matching for Text-based Person Re-identification	Sep 9, 2023	Image to textLanguage Modeling	—Unverified
BIMCV-R: A Landmark Dataset for 3D CT Text-Image Retrieval	Mar 24, 2024	DiagnosticImage Retrieval	—Unverified
BRIT: Bidirectional Retrieval over Unified Image-Text Graph	May 24, 2025	Image to textQuestion Answering	—Unverified
Canonical Correlation Analysis for Misaligned Satellite Image Change Detection	Dec 21, 2018	Action RecognitionChange Detection	—Unverified
CapText: Large Language Model-based Caption Generation From Image Context and Description	Jun 1, 2023	Caption GenerationImage to text	—Unverified
Captions Are Worth a Thousand Words: Enhancing Product Retrieval with Pretrained Image-to-Text Models	Feb 13, 2024	Image CaptioningImage to text	—Unverified
ChartReasoner: Code-Driven Modality Bridging for Long-Chain Reasoning in Chart Question Answering	Jun 11, 2025	Chart Question AnsweringImage to text	—Unverified
VITR: Augmenting Vision Transformers with Relation-Focused Learning for Cross-Modal Information Retrieval	Feb 13, 2023	Cross-Modal Information RetrievalCross-Modal Retrieval	—Unverified
CLIP the Bias: How Useful is Balancing Data in Multimodal Learning?	Mar 7, 2024	Image to textImage-to-Text Retrieval	—Unverified
CoBIT: A Contrastive Bi-directional Image-Text Generation Model	Mar 23, 2023	DecoderImage Generation	—Unverified
CoCoT: Contrastive Chain-of-Thought Prompting for Large Multimodal Models with Multiple Image Inputs	Jan 5, 2024	Image ComprehensionImage to text	—Unverified
Contrastive Learning of Visual-Semantic Embeddings	Oct 17, 2021	Contrastive Learningimage-classification	—Unverified
COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval	Apr 15, 2022	Contrastive LearningCross-Modal Retrieval	—Unverified
Cross-Modal Adaptive Dual Association for Text-to-Image Person Retrieval	Dec 4, 2023	AttributeCross-Modal Person Re-Identification	—Unverified
Cross-Modal Alignment with Mixture Experts Neural Network for Intral-City Retail Recommendation	Sep 17, 2020	cross-modal alignmentImage to text	—Unverified
Cross-modal Contrastive Attention Model for Medical Report Generation	Oct 1, 2022	Image to textMedical Report Generation	—Unverified

Show:10 25 50

← PrevPage 7 of 10Next →

No leaderboard results yet.