
Image to text

Papers


Showing 1–10 of 246 papers

Title	Date	Tasks	Status	Hype
Versatile Diffusion: Text, Images and Variations All in One Diffusion Model	Nov 15, 2022	All, Disentanglement	Code Available	6
Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages	Aug 23, 2023	Image Generation, Image to text	Code Available	6
FlowTok: Flowing Seamlessly Across Text and Image Tokens	Mar 13, 2025	Denoising, Image to text	Code Available	5
Magma: A Foundation Model for Multimodal AI Agents	Feb 18, 2025	Autonomous Web Navigation, Image to text	Code Available	5
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models	Jan 30, 2023	Generative Visual Question Answering, Image Captioning	Code Available	4
Emu: Generative Pretraining in Multimodality	Jul 11, 2023	Image Captioning, Image Generation	Code Available	3
Evaluating Text-to-Visual Generation with Image-to-Text Generation	Apr 1, 2024	Image to text, Question Answering	Code Available	3
One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale	Mar 12, 2023	All, Image Generation	Code Available	3
Generative Diffusion Models on Graphs: Methods and Applications	Feb 6, 2023	Denoising, Graph Generation	Code Available	2
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching	Apr 4, 2024	Attribute, Image Captioning	Code Available	2


No leaderboard results yet.