Multimodal Large Language Model

Papers

Showing 221–230 of 347 papers

Title	Date	Tasks	Status	Hype
Face-LLaVA: Facial Expression and Attribute Understanding through Instruction Tuning	Apr 9, 2025	Action Unit Detection, Age Estimation	Unverified	0
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms	Oct 24, 2024	Diversity, Language Modeling	Unverified	0
ForgeryGPT: Multimodal Large Language Model For Explainable Image Forgery Detection and Localization	Oct 14, 2024	Explanation Generation, Image Forgery Detection	Unverified	0
From Street Views to Urban Science: Discovering Road Safety Factors with Multimodal Large Language Models	Jun 2, 2025	Large Language Model, Multimodal Large Language Model	Unverified	0
GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing	Jul 8, 2024	Image Generation, Language Modeling	Unverified	0
GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing	Mar 16, 2025	Change Detection, Image Captioning	Unverified	0
Gesture-Aware Zero-Shot Speech Recognition for Patients with Language Disorders	Feb 18, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation	Nov 25, 2023	Instruction Following, Language Modeling	Unverified	0
Graph-based Unsupervised Disentangled Representation Learning via Multimodal Large Language Models	Jul 26, 2024	Disentanglement, Language Modeling	Unverified	0
GroundingFace: Fine-grained Face Understanding via Pixel Grounding Multimodal Large Language Model	Jan 1, 2025	Attribute, Language Modeling	Unverified	0

No leaderboard results yet.