SOTAVerified

Image to text

Papers

Showing 201–246 of 246 papers

| Title | Status | Hype |
| --- | --- | --- |
| Uncertainty-based Cross-Modal Retrieval with Probabilistic Representations | | 0 |
| COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval | | 0 |
| Characterizing and Understanding the Behavior of Quantized Models for Reliable Deployment | Code | 0 |
| Two-stream Hierarchical Similarity Reasoning for Image-text Matching | | 0 |
| A Thousand Words Are Worth More Than a Picture: Natural Language-Centric Outside-Knowledge Visual Question Answering | | 0 |
| EI-CLIP: Entity-Aware Interventional Contrastive Learning for E-Commerce Cross-Modal Retrieval | | 0 |
| Transform-Retrieve-Generate: Natural Language-Centric Outside-Knowledge Visual Question Answering | | 0 |
| ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation | Code | 1 |
| Distilled Dual-Encoder Model for Vision-Language Understanding | Code | 1 |
| Self-Supervised Image-to-Text and Text-to-Image Synthesis | Code | 0 |
| Exploration into Translation-Equivariant Image Quantization | Code | 0 |
| ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic | Code | 1 |
| Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages | | 0 |
| L-Verse: Bidirectional Generation Between Image and Text | Code | 1 |
| Unifying Multimodal Transformer for Bi-directional Image and Text Generation | Code | 1 |
| Contrastive Learning of Visual-Semantic Embeddings | | 0 |
| Survey of Visual-Semantic Embedding Methods for Zero-Shot Image Retrieval | | 0 |
| Concadia: Towards Image-Based Text Generation with a Purpose | Code | 1 |
| Knowledge driven Description Synthesis for Floor Plan Interpretation | | 0 |
| Progressive Transformer-Based Generation of Radiology Reports | Code | 1 |
| Improving Factual Completeness and Consistency of Image-to-Text Radiology Report Generation | Code | 1 |
| Hierarchical Gumbel Attention Network for Text-based Person Search | | 0 |
| Cross-Modal Alignment with Mixture Experts Neural Network for Intral-City Retail Recommendation | | 0 |
| Development of a New Image-to-text Conversion System for Pashto, Farsi and Traditional Chinese | | 0 |
| Multimodal Intelligence: Representation Learning, Information Fusion, and Applications | | 0 |
| Illegible Text to Readable Text: An Image-to-Image Transformation using Conditional Sliced Wasserstein Adversarial Networks | | 0 |
| Aligning Multilingual Word Embeddings for Cross-Modal Retrieval Task | Code | 0 |
| From Image to Text in Sentiment Analysis via Regression and Deep Learning | | 0 |
| Knowledge Aware Semantic Concept Expansion for Image-Text Matching | | 0 |
| MirrorGAN: Learning Text-to-image Generation by Redescription | Code | 0 |
| Canonical Correlation Analysis for Misaligned Satellite Image Change Detection | | 0 |
| Doc2Im: document to image conversion through self-attentive embedding | | 0 |
| SpatialVOC2K: A Multilingual Dataset of Images with Annotations and Features for Spatial Relations between Objects | Code | 0 |
| EmojiGAN: learning emojis distributions with a generative model | | 0 |
| Text-to-Image-to-Text Translation using Cycle Consistent Adversarial Networks | Code | 0 |
| Deductron -- A Recurrent Neural Network | | 0 |
| Using Inter-Sentence Diverse Beam Search to Reduce Redundancy in Visual Storytelling | | 0 |
| Turbo Learning for Captionbot and Drawingbot | | 0 |
| Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions | Code | 0 |
| Synthesizing Novel Pairs of Image and Text | | 0 |
| From Image to Text Classification: A Novel Approach based on Clustering Word Embeddings | | 0 |
| Better Text Understanding Through Image-To-Text Transfer | | 0 |
| I2T2I: Learning Text to Image Synthesis with Textual Data Augmentation | | 0 |
| A Gentle Tutorial of Recurrent Neural Network with Error Backpropagation | Code | 0 |
| Learning Deep Structure-Preserving Image-Text Embeddings | | 0 |
| Effective Use of Word Order for Text Categorization with Convolutional Neural Networks | Code | 0 |

No leaderboard results yet.