SOTAVerified

Image Comprehension

Papers

Showing 21-30 of 49 papers

Title | Status | Hype
VGA: Vision GUI Assistant -- Minimizing Hallucinations through Image-Centric Fine-Tuning | Code | 0
RRHF-V: Ranking Responses to Mitigate Hallucinations in Multimodal Large Language Models with Human Feedback | Code | 0
FTII-Bench: A Comprehensive Multimodal Benchmark for Flow Text with Image Insertion | Code | 0
InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition | Code | 0
CLIC: Contrastive Learning Framework for Unsupervised Image Complexity Representation | Code | 0
Multiplane Prior Guided Few-Shot Aerial Scene Rendering |  | 0
An End-to-End OCR Text Re-organization Sequence Learning for Rich-text Detail Image Comprehension |  | 0
Aquila: A Hierarchically Aligned Visual-Language Model for Enhanced Remote Sensing Image Comprehension |  | 0
GeoLocator: a location-integrated large multimodal model for inferring geo-privacy |  | 0
CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation |  | 0

No leaderboard results yet.