Image to text

Papers

Showing 126–150 of 246 papers

Title	Date	Tasks	Status
TMCIR: Token Merge Benefits Composed Image Retrieval	Apr 15, 2025	Contrastive Learning, cross-modal alignment	Unverified
TNG-CLIP: Training-Time Negation Data Generation for Negation Awareness of CLIP	May 24, 2025	Image Captioning, Image Generation	Unverified
Towards a Visual-Language Foundation Model for Computational Pathology	Jul 24, 2023	Contrastive Learning, image-classification	Unverified
Transform-Retrieve-Generate: Natural Language-Centric Outside-Knowledge Visual Question Answering	Jan 1, 2022	Generative Question Answering, Image to text	Unverified
TrojVLM: Backdoor Attack Against Vision Language Models	Sep 28, 2024	Backdoor Attack, Image Captioning	Unverified
Turbo Learning for Captionbot and Drawingbot	May 21, 2018	Image Captioning, Image Generation	Unverified
Two-stream Hierarchical Similarity Reasoning for Image-text Matching	Mar 10, 2022	Image-text matching, Image to text	Unverified
Uncertainty-based Cross-Modal Retrieval with Probabilistic Representations	Apr 20, 2022	Cross-Modal Retrieval, Image Retrieval	Unverified
Understanding the Effect of using Semantically Meaningful Tokens for Visual Representation Learning	May 26, 2024	Image to text, Image-to-Text Retrieval	Unverified
UNITE-FND: Reframing Multimodal Fake News Detection through Unimodal Scene Translation	Feb 16, 2025	Binary Classification, Fake News Detection	Unverified
Using Inter-Sentence Diverse Beam Search to Reduce Redundancy in Visual Storytelling	May 30, 2018	Image to text, Sentence	Unverified
Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages	Nov 24, 2021	Decoder, Image to text	Unverified
Vision-Braille: An End-to-End Tool for Chinese Braille Image-to-Text Translation	Jul 8, 2024	Image to text, Lifelong learning	Unverified
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation	Apr 30, 2024	Caption Generation, Hallucination	Unverified
When are Lemons Purple? The Concept Association Bias of Vision-Language Models	Dec 22, 2022	Attribute, image-classification	Unverified
X-Fusion: Introducing New Modality to Frozen Large Language Models	Apr 29, 2025	Image to text	Unverified
15M Multimodal Facial Image-Text Dataset	Jul 11, 2024	Image to text	Unverified
Ziya-Visual: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning	Oct 12, 2023	Image Captioning, Image-text Retrieval	Unverified
Towards Cross-modal Retrieval in Chinese Cultural Heritage Documents: Dataset and Solution	May 16, 2025	Cross-Modal Retrieval, Image to text	Unverified
ABC: Achieving Better Control of Multimodal Embeddings using VLMs	Mar 1, 2025	Image to text, Image-to-Text Retrieval	Unverified
Accept the Modality Gap: An Exploration in the Hyperbolic Space	Jan 1, 2024	Image to text, Image-to-Text Retrieval	Unverified
Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training	Jan 1, 2025	Image-text Retrieval, Image to text	Unverified
AICoderEval: Improving AI Domain Code Generation of Large Language Models	Jun 7, 2024	Code Generation, Image to text	Unverified
AI Recommendation System for Enhanced Customer Experience: A Novel Image-to-Text Method	Nov 16, 2023	Image to text, Object	Unverified
An End-to-End Neural Network for Image-to-Audio Transformation	Mar 10, 2023	Image to text, text-to-speech	Unverified

No leaderboard results yet.