SOTAVerified

Image to text

Papers

Showing 201-225 of 246 papers

| Title | Status | Hype |
| --- | --- | --- |
| From Pixels to Prose: Advancing Multi-Modal Language Models for Remote Sensing | | 0 |
| GPC: Generative and General Pathology Image Classifier | | 0 |
| GPT-4V(ision) as a Generalist Evaluator for Vision-Language Tasks | | 0 |
| GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training | | 0 |
| Hierarchical Gumbel Attention Network for Text-based Person Search | | 0 |
| HyCIR: Boosting Zero-Shot Composed Image Retrieval with Synthetic Labels | | 0 |
| I2T2I: Learning Text to Image Synthesis with Textual Data Augmentation | | 0 |
| Illegible Text to Readable Text: An Image-to-Image Transformation using Conditional Sliced Wasserstein Adversarial Networks | | 0 |
| Image2Text2Image: A Novel Framework for Label-Free Evaluation of Image-to-Text Generation with Text-to-Image Diffusion Models | | 0 |
| Image Captioners Sometimes Tell More Than Images They See | | 0 |
| Image Semantic Relation Generation | | 0 |
| Image-to-Text for Medical Reports Using Adaptive Co-Attention and Triple-LSTM Module | | 0 |
| Image-to-Text Logic Jailbreak: Your Imagination can Help You Do Anything | | 0 |
| Improving Factuality of 3D Brain MRI Report Generation with Paired Image-domain Retrieval and Text-domain Augmentation | | 0 |
| Improving Medical Visual Representation Learning with Pathological-level Cross-Modal Alignment and Correlation Exploration | | 0 |
| Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling | | 0 |
| Improving the Factual Correctness of Radiology Report Generation with Semantic Rewards | | 0 |
| Instruction Tuning-free Visual Token Complement for Multimodal LLMs | | 0 |
| Interpreting Vision and Language Generative Models with Semantic Visual Priors | | 0 |
| Is Cross-modal Information Retrieval Possible without Training? | | 0 |
| I See Dead People: Gray-Box Adversarial Attack on Image-To-Text Models | | 0 |
| Knowledge Aware Semantic Concept Expansion for Image-Text Matching | | 0 |
| Knowledge driven Description Synthesis for Floor Plan Interpretation | | 0 |
| Semantically Grounded QFormer for Efficient Vision Language Understanding | | 0 |
| Learning by Hallucinating: Vision-Language Pre-training with Weak Supervision | | 0 |

No leaderboard results yet.