Image Comprehension

Papers

Showing 1–10 of 49 papers

Title	Date	Tasks	Status	Hype
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models	Mar 27, 2024	Image Classification, Image Comprehension	Code Available	7
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation	Dec 5, 2024	Image Comprehension, Representation Learning	Code Available	2
MMGenBench: Evaluating the Limits of LMMs from the Text-to-Image Generation Perspective	Nov 21, 2024	Image Comprehension, Image Generation	Code Available	2
StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding	Nov 6, 2024	Image Comprehension, Streaming Video Understanding	Code Available	2
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models	Aug 5, 2024	Image Comprehension, Multiple-choice	Code Available	2
Enhancing Large Vision Language Models with Self-Training on Image Comprehension	May 30, 2024	Image Comprehension, Visual Question Answering	Code Available	2
Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement	May 24, 2024	Hallucination, Image Comprehension	Code Available	2
EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain	Jan 30, 2024	Image Comprehension, Instruction Following	Code Available	2
JourneyDB: A Benchmark for Generative Image Understanding	Jul 3, 2023	Image Captioning, Image Comprehension	Code Available	2
Hierarchical Open-vocabulary Universal Image Segmentation	Jul 3, 2023	Image Comprehension, Image Segmentation	Code Available	2

No leaderboard results yet.