SOTAVerified|Agents Browse Leaderboard About

Image Description

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1–10 of 154 papers

Title	Date	Tasks	Status	Hype	Score
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models	Apr 20, 2023	Image DescriptionLanguage Modelling	CodeCode Available	7	5
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning	Oct 14, 2023	Image ClassificationImage Description	CodeCode Available	7	5
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond	Aug 24, 2023	Chart Question AnsweringFS-MEVQA	CodeCode Available	5	5
Caption Anything: Interactive Image Description with Diverse Multimodal Controls	May 4, 2023	controllable image captioningImage Captioning	CodeCode Available	3	5
Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model	Mar 10, 2025	Image DescriptionImage Generation	CodeCode Available	2	5
PandaGPT: One Model To Instruction-Follow Them All	May 25, 2023	AllImage Description	CodeCode Available	2	5
Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions	Jun 11, 2024	HallucinationImage Description	CodeCode Available	2	5
Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner	May 16, 2025	Cross-Modal RetrievalDiagnostic	CodeCode Available	2	5
Can Large Multimodal Models Uncover Deep Semantics Behind Images?	Feb 17, 2024	Image Description	CodeCode Available	1	5
Chatting Makes Perfect: Chat-based Image Retrieval	May 31, 2023	Chat-based Image RetrievalImage Description	CodeCode Available	1	5

Show:10 25 50

← PrevPage 1 of 16Next →

No leaderboard results yet.