
MMR total

The sum of scores across the 11 distinct tasks in the Multi-Modal Reading (MMR) Benchmark, covering texts, fonts, visual elements, bounding boxes, spatial relations, and grounding.
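As a minimal sketch, the MMR total is simply the sum of a model's per-task scores. The task names and values below are illustrative assumptions, not the benchmark's actual task list or results.

```python
# Hypothetical per-task scores for one model. Task names and values are
# illustrative only -- MMR defines 11 tasks spanning texts, fonts, visual
# elements, bounding boxes, spatial relations, and grounding.
task_scores = {
    "text_recognition": 45,
    "font_recognition": 40,
    "visual_elements": 42,
    "bounding_box": 38,
    "spatial_relations": 41,
    "grounding": 44,
    # remaining tasks omitted in this sketch
}

# The MMR total is the sum over all per-task scores.
mmr_total = sum(task_scores.values())
print(mmr_total)
```

With all 11 task scores present, this total is what the "Claimed" column in the results table reports.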

Papers

Showing 12 of 12 papers

| Title | Status | Hype |
| --- | --- | --- |
| Visual Instruction Tuning | Code | 6 |
| GPT-4 Technical Report | Code | 6 |
| Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | Code | 5 |
| Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models | Code | 3 |
| OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents | Code | 2 |
| InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | Code | 1 |
| The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision) | Code | 1 |
| MMR: Evaluating Reading Ability of Large Multimodal Models | — | 0 |
| Claude 3.5 Sonnet Model Card Addendum | — | 0 |
| GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding | — | 0 |
| What matters when building vision-language models? | — | 0 |
| Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone | — | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Claude 3.5 Sonnet | Total Column Score | 463 | — | Unverified |
| 2 | GPT-4o | Total Column Score | 457 | — | Unverified |
| 3 | GPT-4V | Total Column Score | 415 | — | Unverified |
| 4 | LLaVA-NEXT-34B | Total Column Score | 412 | — | Unverified |
| 5 | Phi-3-Vision | Total Column Score | 397 | — | Unverified |
| 6 | InternVL2-8B | Total Column Score | 368 | — | Unverified |
| 7 | Qwen-vl-max | Total Column Score | 366 | — | Unverified |
| 8 | LLaVA-NEXT-13B | Total Column Score | 335 | — | Unverified |
| 9 | Qwen-vl-plus | Total Column Score | 310 | — | Unverified |
| 10 | Idefics-2-8B | Total Column Score | 256 | — | Unverified |