SOTAVerified

Multimodal Large Language Model

Papers

Showing 1-25 of 347 papers

Title | Status | Hype
KptLLM++: Towards Generic Keypoint Comprehension with Large Language Model |  | 0
MFGDiffusion: Mask-Guided Smoke Synthesis for Enhanced Forest Fire Detection | Code | 0
LRMR: LLM-Driven Relational Multi-node Ranking for Lymph Node Metastasis Assessment in Rectal Cancer |  | 0
Chat with AI: The Surprising Turn of Real-time Video Communication from Human to AI |  | 0
TalkFashion: Intelligent Virtual Try-On Assistant Based on Multimodal Large Language Model |  | 0
BlueLM-2.5-3B Technical Report |  | 0
CoT-lized Diffusion: Let's Reinforce T2I Generation Step-by-step |  | 0
Mask-aware Text-to-Image Retrieval: Referring Expression Segmentation Meets Cross-modal Retrieval |  | 0
OracleFusion: Assisting the Decipherment of Oracle Bone Script with Structurally Constrained Semantic Typography | Code | 0
ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing | Code | 5
MedTVT-R1: A Multimodal LLM Empowering Medical Reasoning and Diagnosis | Code | 1
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation | Code | 3
DreamJourney: Perpetual View Generation with Video Diffusion Models |  | 0
The Condition Number as a Scale-Invariant Proxy for Information Encoding in Neural Units | Code | 1
ASCD: Attention-Steerable Contrastive Decoding for Reducing Hallucination in MLLM |  | 0
CFBenchmark-MM: Chinese Financial Assistant Benchmark for Multimodal Large Language Model |  | 0
VIS-Shepherd: Constructing Critic for LLM-based Data Visualization Generation | Code | 0
VGR: Visual Grounded Reasoning |  | 0
PHRASED: Phrase Dictionary Biasing for Speech Translation |  | 0
Parking, Perception, and Retail: Street-Level Determinants of Community Vitality in Harbin |  | 0
The NTNU System at the S&I Challenge 2025 SLA Open Track |  | 0
Towards LLM-Centric Multimodal Fusion: A Survey on Integration Strategies and Techniques |  | 0
A Novel Data Augmentation Approach for Automatic Speaking Assessment on Opinion Expressions |  | 0
From Street Views to Urban Science: Discovering Road Safety Factors with Multimodal Large Language Models |  | 0
Period-LLM: Extending the Periodic Capability of Multimodal Large Language Model | Code | 1

No leaderboard results yet.