SOTAVerified

Referring Expression Comprehension

Papers

Showing 1–10 of 167 papers

| Title | Status | Hype |
|---|---|---|
| VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model | Code | 9 |
| DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding | Code | 9 |
| Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models | Code | 7 |
| MiniGPT-v2: Large Language Model as a Unified Interface for Vision-Language Multi-task Learning | Code | 7 |
| Improved Baselines with Visual Instruction Tuning | Code | 6 |
| Visual Instruction Tuning | Code | 6 |
| Efficient Multimodal Learning from Data-centric Perspective | Code | 5 |
| Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection | Code | 5 |
| Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V | Code | 4 |
| LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day | Code | 4 |

No leaderboard results yet.