SOTAVerified

Image Comprehension

Papers

Showing 41-49 of 49 papers

| Title | Status | Hype |
|---|---|---|
| Rec-GPT4V: Multimodal Recommendation with Large Vision-Language Models | | 0 |
| RGB-Th-Bench: A Dense benchmark for Visual-Thermal Understanding of Vision Language Models | | 0 |
| InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition | Code | 0 |
| RRHF-V: Ranking Responses to Mitigate Hallucinations in Multimodal Large Language Models with Human Feedback | Code | 0 |
| FTII-Bench: A Comprehensive Multimodal Benchmark for Flow Text with Image Insertion | Code | 0 |
| CLIC: Contrastive Learning Framework for Unsupervised Image Complexity Representation | Code | 0 |
| MM-MATH: Advancing Multimodal Math Evaluation with Process Evaluation and Fine-grained Classification | Code | 0 |
| MIRe: Enhancing Multimodal Queries Representation via Fusion-Free Modality Interaction for Multimodal Retrieval | Code | 0 |
| VGA: Vision GUI Assistant -- Minimizing Hallucinations through Image-Centric Fine-Tuning | Code | 0 |

No leaderboard results yet.