SOTAVerified

Multimodal Large Language Model

Papers

Showing 326–347 of 347 papers

Title (Hype)

Lumos: Empowering Multimodal LLMs with Scene Text Recognition (0)
LLaVA-Docent: Instruction Tuning with Multimodal Large Language Model to Support Art Appreciation Education (0)
LLaVA-MoLE: Sparse Mixture of LoRA Experts for Mitigating Data Conflicts in Instruction Finetuning MLLMs (0)
UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion (0)
MLLMReID: Multimodal Large Language Model-based Person Re-identification (0)
CoDi-2: In-Context Interleaved and Interactive Any-to-Any Generation (0)
ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation (0)
Audio-Visual LLM for Video Understanding (0)
EtC: Temporal Boundary Expand then Clarify for Weakly Supervised Video Grounding with Multimodal Large Language Model (0)
MedXChat: A Unified Multimodal Large Language Model Framework towards CXRs Understanding and Generation (0)
CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation (0)
mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model (0)
GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation (0)
How to Bridge the Gap between Modalities: Survey on Multimodal Large Language Model (0)
Multimodal Large Language Model for Visual Navigation (0)
Comics for Everyone: Generating Accessible Text Descriptions for Comic Strips (0)
Investigating the Catastrophic Forgetting in Multimodal Large Language Models (0)
Imaginations of WALL-E: Reconstructing Experiences with an Imagination-Inspired Module for Advanced AI Systems (0)
ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning (0)
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding (0)
A Survey on Multimodal Large Language Models (0)
Language Is Not All You Need: Aligning Perception with Language Models (0)
Page 14 of 14

No leaderboard results yet.