SOTAVerified

Multimodal Large Language Model

Papers

Showing 151-175 of 347 papers

| Title | Status | Hype |
| --- | --- | --- |
| Layout Generation Agents with Large Language Models | Code | 0 |
| Automatically Generating Visual Hallucination Test Cases for Multimodal Large Language Models | Code | 0 |
| Value-Spectrum: Quantifying Preferences of Vision-Language Models via Value Decomposition in Social Media Contexts | Code | 0 |
| MovSAM: A Single-image Moving Object Segmentation Framework Based on Deep Thinking | Code | 0 |
| MedViLaM: A multimodal large language model with advanced generalizability and explainability for medical data understanding and generation | Code | 0 |
| Leveraging Multimodal LLM for Inspirational User Interface Search | Code | 0 |
| SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model | | 0 |
| SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection | | 0 |
| SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability | | 0 |
| ST^3: Accelerating Multimodal Large Language Model by Spatial-Temporal Visual Token Trimming | | 0 |
| StreetviewLLM: Extracting Geographic Information Using a Chain-of-Thought Multimodal Large Language Model | | 0 |
| Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization | | 0 |
| SubstationAI: Multimodal Large Model-Based Approaches for Analyzing Substation Equipment Faults | | 0 |
| TalkFashion: Intelligent Virtual Try-On Assistant Based on Multimodal Large Language Model | | 0 |
| The NTNU System at the S&I Challenge 2025 SLA Open Track | | 0 |
| The Solution for CVPR2024 Foundational Few-Shot Object Detection Challenge | | 0 |
| Think Before You Diffuse: LLMs-Guided Physics-Aware Video Generation | | 0 |
| TimeSoccer: An End-to-End Multimodal Large Language Model for Soccer Commentary Generation | | 0 |
| MERaLiON-SpeechEncoder: Towards a Speech Foundation Model for Singapore and Beyond | | 0 |
| Towards LLM-Centric Multimodal Fusion: A Survey on Integration Strategies and Techniques | | 0 |
| Towards Visual Text Grounding of Multimodal Large Language Model | | 0 |
| Unbridled Icarus: A Survey of the Potential Perils of Image Inputs in Multimodal Large Language Model Security | | 0 |
| UniGen: Enhanced Training & Test-Time Strategies for Unified Multimodal Understanding and Generation | | 0 |
| UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion | | 0 |
| Universal Item Tokenization for Transferable Generative Recommendation | | 0 |

No leaderboard results yet.