SOTAVerified

Image to text

Papers

Showing 201–246 of 246 papers

| Title | Status | Hype |
| --- | --- | --- |
| Image Semantic Relation Generation | | 0 |
| Cross-modal Contrastive Attention Model for Medical Report Generation | | 0 |
| Every picture tells a story: Image-grounded controllable stylistic story generation | | 0 |
| Discovering Bugs in Vision Models using Off-the-shelf Image Generation and Captioning | | 0 |
| Paired Cross-Modal Data Augmentation for Fine-Grained Image-to-Text Retrieval | | 0 |
| SRCB at SemEval-2022 Task 5: Pretraining Based Image to Text Late Sequential Fusion System for Multimodal Misogynous Meme Identification | | 0 |
| Delving into the Openness of CLIP | Code | 0 |
| Multilingual Image Corpus – Towards a Multimodal and Multilingual Dataset | | 0 |
| Uncertainty-based Cross-Modal Retrieval with Probabilistic Representations | | 0 |
| COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval | | 0 |
| Characterizing and Understanding the Behavior of Quantized Models for Reliable Deployment | Code | 0 |
| Two-stream Hierarchical Similarity Reasoning for Image-text Matching | | 0 |
| A Thousand Words Are Worth More Than a Picture: Natural Language-Centric Outside-Knowledge Visual Question Answering | | 0 |
| Transform-Retrieve-Generate: Natural Language-Centric Outside-Knowledge Visual Question Answering | | 0 |
| EI-CLIP: Entity-Aware Interventional Contrastive Learning for E-Commerce Cross-Modal Retrieval | | 0 |
| Self-Supervised Image-to-Text and Text-to-Image Synthesis | Code | 0 |
| Exploration into Translation-Equivariant Image Quantization | Code | 0 |
| Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages | | 0 |
| Contrastive Learning of Visual-Semantic Embeddings | | 0 |
| Survey of Visual-Semantic Embedding Methods for Zero-Shot Image Retrieval | | 0 |
| Knowledge driven Description Synthesis for Floor Plan Interpretation | | 0 |
| Hierarchical Gumbel Attention Network for Text-based Person Search | | 0 |
| Cross-Modal Alignment with Mixture Experts Neural Network for Intral-City Retail Recommendation | | 0 |
| Development of a New Image-to-text Conversion System for Pashto, Farsi and Traditional Chinese | | 0 |
| Multimodal Intelligence: Representation Learning, Information Fusion, and Applications | | 0 |
| Illegible Text to Readable Text: An Image-to-Image Transformation using Conditional Sliced Wasserstein Adversarial Networks | | 0 |
| Aligning Multilingual Word Embeddings for Cross-Modal Retrieval Task | Code | 0 |
| From Image to Text in Sentiment Analysis via Regression and Deep Learning | | 0 |
| Knowledge Aware Semantic Concept Expansion for Image-Text Matching | | 0 |
| MirrorGAN: Learning Text-to-image Generation by Redescription | Code | 0 |
| Canonical Correlation Analysis for Misaligned Satellite Image Change Detection | | 0 |
| Doc2Im: document to image conversion through self-attentive embedding | | 0 |
| SpatialVOC2K: A Multilingual Dataset of Images with Annotations and Features for Spatial Relations between Objects | Code | 0 |
| EmojiGAN: learning emojis distributions with a generative model | | 0 |
| Text-to-Image-to-Text Translation using Cycle Consistent Adversarial Networks | Code | 0 |
| Deductron -- A Recurrent Neural Network | | 0 |
| Using Inter-Sentence Diverse Beam Search to Reduce Redundancy in Visual Storytelling | | 0 |
| Turbo Learning for Captionbot and Drawingbot | | 0 |
| Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions | Code | 0 |
| Synthesizing Novel Pairs of Image and Text | | 0 |
| From Image to Text Classification: A Novel Approach based on Clustering Word Embeddings | | 0 |
| Better Text Understanding Through Image-To-Text Transfer | | 0 |
| I2T2I: Learning Text to Image Synthesis with Textual Data Augmentation | | 0 |
| A Gentle Tutorial of Recurrent Neural Network with Error Backpropagation | Code | 0 |
| Learning Deep Structure-Preserving Image-Text Embeddings | | 0 |
| Effective Use of Word Order for Text Categorization with Convolutional Neural Networks | Code | 0 |

No leaderboard results yet.