Image to text

Papers

Showing 101–125 of 246 papers

Title	Date	Tasks	Status
Towards Cross-modal Retrieval in Chinese Cultural Heritage Documents: Dataset and Solution	May 16, 2025	Cross-Modal Retrieval, Image to text	Unverified
ABC: Achieving Better Control of Multimodal Embeddings using VLMs	Mar 1, 2025	Image to text, Image-to-Text Retrieval	Unverified
Accept the Modality Gap: An Exploration in the Hyperbolic Space	Jan 1, 2024	Image to text, Image-to-Text Retrieval	Unverified
Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training	Jan 1, 2025	Image-text Retrieval, Image to text	Unverified
AICoderEval: Improving AI Domain Code Generation of Large Language Models	Jun 7, 2024	Code Generation, Image to text	Unverified
AI Recommendation System for Enhanced Customer Experience: A Novel Image-to-Text Method	Nov 16, 2023	Image to text, Object	Unverified
An End-to-End Neural Network for Image-to-Audio Transformation	Mar 10, 2023	Image to text, text-to-speech	Unverified
An Online Learning Approach to Prompt-based Selection of Generative Models	Oct 17, 2024	Image to text	Unverified
Ask, Attend, Attack: A Effective Decision-Based Black-Box Targeted Attack for Image-to-Text Models	Aug 16, 2024	Image to text	Unverified
A Thousand Words Are Worth More Than a Picture: Natural Language-Centric Outside-Knowledge Visual Question Answering	Jan 14, 2022	Generative Question Answering, Image to text	Unverified
Attention Guidance Mechanism for Handwritten Mathematical Expression Recognition	Mar 4, 2024	Image to text	Unverified
A Unified Framework and Dataset for Assessing Societal Bias in Vision-Language Models	Feb 21, 2024	Benchmarking, Image to text	Unverified
Backdooring Vision-Language Models with Out-Of-Distribution Data	Oct 2, 2024	Image Captioning, Image to text	Unverified
Better Text Understanding Through Image-To-Text Transfer	May 23, 2017	Image to text	Unverified
Beyond Color and Lines: Zero-Shot Style-Specific Image Variations with Coordinated Semantics	Oct 24, 2024	Image to text, Image-Variation	Unverified
Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation	Nov 18, 2023	Image to text, Semantic Similarity	Unverified
BiLMa: Bidirectional Local-Matching for Text-based Person Re-identification	Sep 9, 2023	Image to text, Language Modeling	Unverified
BIMCV-R: A Landmark Dataset for 3D CT Text-Image Retrieval	Mar 24, 2024	Diagnostic, Image Retrieval	Unverified
BRIT: Bidirectional Retrieval over Unified Image-Text Graph	May 24, 2025	Image to text, Question Answering	Unverified
Canonical Correlation Analysis for Misaligned Satellite Image Change Detection	Dec 21, 2018	Action Recognition, Change Detection	Unverified
CapText: Large Language Model-based Caption Generation From Image Context and Description	Jun 1, 2023	Caption Generation, Image to text	Unverified
Captions Are Worth a Thousand Words: Enhancing Product Retrieval with Pretrained Image-to-Text Models	Feb 13, 2024	Image Captioning, Image to text	Unverified
ChartReasoner: Code-Driven Modality Bridging for Long-Chain Reasoning in Chart Question Answering	Jun 11, 2025	Chart Question Answering, Image to text	Unverified
VITR: Augmenting Vision Transformers with Relation-Focused Learning for Cross-Modal Information Retrieval	Feb 13, 2023	Cross-Modal Information Retrieval, Cross-Modal Retrieval	Unverified
CLIP the Bias: How Useful is Balancing Data in Multimodal Learning?	Mar 7, 2024	Image to text, Image-to-Text Retrieval	Unverified

No leaderboard results yet.