SOTAVerified

Image to text

Papers

Showing 151–175 of 246 papers

| Title | Status | Hype |
| --- | --- | --- |
| An Online Learning Approach to Prompt-based Selection of Generative Models | | 0 |
| Ask, Attend, Attack: A Effective Decision-Based Black-Box Targeted Attack for Image-to-Text Models | | 0 |
| A Thousand Words Are Worth More Than a Picture: Natural Language-Centric Outside-Knowledge Visual Question Answering | | 0 |
| Attention Guidance Mechanism for Handwritten Mathematical Expression Recognition | | 0 |
| A Unified Framework and Dataset for Assessing Societal Bias in Vision-Language Models | | 0 |
| Backdooring Vision-Language Models with Out-Of-Distribution Data | | 0 |
| Better Text Understanding Through Image-To-Text Transfer | | 0 |
| Beyond Color and Lines: Zero-Shot Style-Specific Image Variations with Coordinated Semantics | | 0 |
| Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation | | 0 |
| BiLMa: Bidirectional Local-Matching for Text-based Person Re-identification | | 0 |
| BIMCV-R: A Landmark Dataset for 3D CT Text-Image Retrieval | | 0 |
| BRIT: Bidirectional Retrieval over Unified Image-Text Graph | | 0 |
| Canonical Correlation Analysis for Misaligned Satellite Image Change Detection | | 0 |
| CapText: Large Language Model-based Caption Generation From Image Context and Description | | 0 |
| Captions Are Worth a Thousand Words: Enhancing Product Retrieval with Pretrained Image-to-Text Models | | 0 |
| ChartReasoner: Code-Driven Modality Bridging for Long-Chain Reasoning in Chart Question Answering | | 0 |
| VITR: Augmenting Vision Transformers with Relation-Focused Learning for Cross-Modal Information Retrieval | | 0 |
| CLIP the Bias: How Useful is Balancing Data in Multimodal Learning? | | 0 |
| CoBIT: A Contrastive Bi-directional Image-Text Generation Model | | 0 |
| CoCoT: Contrastive Chain-of-Thought Prompting for Large Multimodal Models with Multiple Image Inputs | | 0 |
| Contrastive Learning of Visual-Semantic Embeddings | | 0 |
| COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval | | 0 |
| Cross-Modal Adaptive Dual Association for Text-to-Image Person Retrieval | | 0 |
| Cross-Modal Alignment with Mixture Experts Neural Network for Intral-City Retail Recommendation | | 0 |
| Cross-modal Contrastive Attention Model for Medical Report Generation | | 0 |

No leaderboard results yet.