Multimodal Large Language Model

Papers

Showing 101–110 of 347 papers

Title	Date	Tasks	Status	Hype
Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy	Feb 27, 2025	Large Language Model, Minecraft	Unverified	0
AsymLoRA: Harmonizing Data Conflicts and Commonalities in MLLMs	Feb 27, 2025	Language Modeling, Language Modelling	Code Available	3
Introducing Visual Perception Token into Multimodal Large Language Model	Feb 24, 2025	Language Modeling, Language Modelling	Code Available	2
R1-Onevision: An Open-Source Multimodal Large Language Model Capable of Deep Reasoning	Feb 24, 2025	Language Modeling, Language Modelling	Code Available	4
OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models	Feb 22, 2025	document understanding, Key Information Extraction	Unverified	0
Gesture-Aware Zero-Shot Speech Recognition for Patients with Language Disorders	Feb 18, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Towards Text-Image Interleaved Retrieval	Feb 18, 2025	Information Retrieval, Language Modeling	Code Available	1
MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation	Feb 17, 2025	Language Modeling, Language Modelling	Unverified	0
Leveraging Multimodal-LLMs Assisted by Instance Segmentation for Intelligent Traffic Monitoring	Feb 16, 2025	Instance Segmentation, Language Modeling	Unverified	0
Distraction is All You Need for Multimodal Large Language Model Jailbreaking	Feb 15, 2025	All, Language Modeling	Unverified	0

No leaderboard results yet.