SOTAVerified

Multimodal Large Language Model

Papers

Showing 326–347 of 347 papers

| Title | Status | Hype |
| --- | --- | --- |
| LLMGA: Multimodal Large Language Model based Generation Assistant | Code | 2 |
| GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation | | 0 |
| LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge | Code | 1 |
| How to Bridge the Gap between Modalities: Survey on Multimodal Large Language Model | | 0 |
| Chain of Images for Intuitively Reasoning | Code | 1 |
| Multimodal ChatGPT for Medical Applications: an Experimental Study of GPT-4V | Code | 1 |
| CXR-LLAVA: a multimodal large language model for interpreting chest X-ray images | Code | 1 |
| Multimodal Large Language Model for Visual Navigation | | 0 |
| Ferret: Refer and Ground Anything Anywhere at Any Granularity | Code | 5 |
| UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model | Code | 1 |
| Comics for Everyone: Generating Accessible Text Descriptions for Comic Strips | | 0 |
| Investigating the Catastrophic Forgetting in Multimodal Large Language Models | | 0 |
| Imaginations of WALL-E: Reconstructing Experiences with an Imagination-Inspired Module for Advanced AI Systems | | 0 |
| FinVis-GPT: A Multimodal Large Language Model for Financial Chart Analysis | Code | 1 |
| ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning | | 0 |
| mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding | | 0 |
| Kosmos-2: Grounding Multimodal Large Language Models to the World | Code | 1 |
| MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models | Code | 2 |
| A Survey on Multimodal Large Language Models | | 0 |
| Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks | Code | 2 |
| LMEye: An Interactive Perception Network for Large Language Models | Code | 1 |
| Language Is Not All You Need: Aligning Perception with Language Models | | 0 |
Page 14 of 14