SOTAVerified

Image to text

Papers

Showing 201–225 of 246 papers

| Title | Status | Hype |
|---|---|---|
| Image Semantic Relation Generation | | 0 |
| Cross-modal Contrastive Attention Model for Medical Report Generation | | 0 |
| Every picture tells a story: Image-grounded controllable stylistic story generation | | 0 |
| Discovering Bugs in Vision Models using Off-the-shelf Image Generation and Captioning | | 0 |
| Paired Cross-Modal Data Augmentation for Fine-Grained Image-to-Text Retrieval | | 0 |
| SRCB at SemEval-2022 Task 5: Pretraining Based Image to Text Late Sequential Fusion System for Multimodal Misogynous Meme Identification | | 0 |
| Delving into the Openness of CLIP | Code | 0 |
| Multilingual Image Corpus – Towards a Multimodal and Multilingual Dataset | | 0 |
| Uncertainty-based Cross-Modal Retrieval with Probabilistic Representations | | 0 |
| COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval | | 0 |
| Characterizing and Understanding the Behavior of Quantized Models for Reliable Deployment | Code | 0 |
| Two-stream Hierarchical Similarity Reasoning for Image-text Matching | | 0 |
| A Thousand Words Are Worth More Than a Picture: Natural Language-Centric Outside-Knowledge Visual Question Answering | | 0 |
| Transform-Retrieve-Generate: Natural Language-Centric Outside-Knowledge Visual Question Answering | | 0 |
| EI-CLIP: Entity-Aware Interventional Contrastive Learning for E-Commerce Cross-Modal Retrieval | | 0 |
| Self-Supervised Image-to-Text and Text-to-Image Synthesis | Code | 0 |
| Exploration into Translation-Equivariant Image Quantization | Code | 0 |
| Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages | | 0 |
| Contrastive Learning of Visual-Semantic Embeddings | | 0 |
| Survey of Visual-Semantic Embedding Methods for Zero-Shot Image Retrieval | | 0 |
| Knowledge driven Description Synthesis for Floor Plan Interpretation | | 0 |
| Hierarchical Gumbel Attention Network for Text-based Person Search | | 0 |
| Cross-Modal Alignment with Mixture Experts Neural Network for Intral-City Retail Recommendation | | 0 |
| Development of a New Image-to-text Conversion System for Pashto, Farsi and Traditional Chinese | | 0 |
| Multimodal Intelligence: Representation Learning, Information Fusion, and Applications | | 0 |
Page 9 of 10

No leaderboard results yet.