SOTAVerified

Image to text

Papers

Showing 176-200 of 246 papers

Title | Status | Hype
SLAN: Self-Locator Aided Network for Vision-Language Understanding | | 0
Do DALL-E and Flamingo Understand Each Other? | | 0
When are Lemons Purple? The Concept Association Bias of Vision-Language Models | | 0
MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering | | 0
SLAN: Self-Locator Aided Network for Cross-Modal Understanding | | 0
Retrieval-Augmented Multimodal Language Modeling | | 0
Versatile Diffusion: Text, Images and Variations All in One Diffusion Model | Code | 6
Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models | Code | 1
Learning by Hallucinating: Vision-Language Pre-training with Weak Supervision | | 0
Improving the Factual Correctness of Radiology Report Generation with Semantic Rewards | | 0
Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text Generation | Code | 1
Image Semantic Relation Generation | | 0
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding | Code | 2
Cross-modal Contrastive Attention Model for Medical Report Generation | | 0
Linearly Mapping from Image to Text Space | Code | 1
FETA: Towards Specializing Foundation Models for Expert Task Applications | Code | 1
Every picture tells a story: Image-grounded controllable stylistic story generation | | 0
Discovering Bugs in Vision Models using Off-the-shelf Image Generation and Captioning | | 0
Paired Cross-Modal Data Augmentation for Fine-Grained Image-to-Text Retrieval | | 0
SRCB at SemEval-2022 Task 5: Pretraining Based Image to Text Late Sequential Fusion System for Multimodal Misogynous Meme Identification | | 0
What is Where by Looking: Weakly-Supervised Open-World Phrase-Grounding without Text Inputs | Code | 1
Write and Paint: Generative Vision-Language Models are Unified Modal Learners | Code | 1
Delving into the Openness of CLIP | Code | 0
Multilingual Image Corpus – Towards a Multimodal and Multilingual Dataset | | 0
GIT: A Generative Image-to-text Transformer for Vision and Language | Code | 2

No leaderboard results yet.