SOTAVerified

Multimodal Large Language Model

Papers

Showing 126–150 of 347 papers

Title | Status | Hype
Kosmos-2: Grounding Multimodal Large Language Models to the World | Code | 1
LMEye: An Interactive Perception Network for Large Language Models | Code | 1
LRMR: LLM-Driven Relational Multi-node Ranking for Lymph Node Metastasis Assessment in Rectal Cancer | - | 0
MFGDiffusion: Mask-Guided Smoke Synthesis for Enhanced Forest Fire Detection | Code | 0
KptLLM++: Towards Generic Keypoint Comprehension with Large Language Model | - | 0
Chat with AI: The Surprising Turn of Real-time Video Communication from Human to AI | - | 0
TalkFashion: Intelligent Virtual Try-On Assistant Based on Multimodal Large Language Model | - | 0
BlueLM-2.5-3B Technical Report | - | 0
CoT-lized Diffusion: Let's Reinforce T2I Generation Step-by-step | - | 0
Mask-aware Text-to-Image Retrieval: Referring Expression Segmentation Meets Cross-modal Retrieval | - | 0
OracleFusion: Assisting the Decipherment of Oracle Bone Script with Structurally Constrained Semantic Typography | Code | 0
DreamJourney: Perpetual View Generation with Video Diffusion Models | - | 0
ASCD: Attention-Steerable Contrastive Decoding for Reducing Hallucination in MLLM | - | 0
VIS-Shepherd: Constructing Critic for LLM-based Data Visualization Generation | Code | 0
CFBenchmark-MM: Chinese Financial Assistant Benchmark for Multimodal Large Language Model | - | 0
VGR: Visual Grounded Reasoning | - | 0
PHRASED: Phrase Dictionary Biasing for Speech Translation | - | 0
Parking, Perception, and Retail: Street-Level Determinants of Community Vitality in Harbin | - | 0
Towards LLM-Centric Multimodal Fusion: A Survey on Integration Strategies and Techniques | - | 0
The NTNU System at the S&I Challenge 2025 SLA Open Track | - | 0
A Novel Data Augmentation Approach for Automatic Speaking Assessment on Opinion Expressions | - | 0
From Street Views to Urban Science: Discovering Road Safety Factors with Multimodal Large Language Models | - | 0
S4-Driver: Scalable Self-Supervised Driving Multimodal Large Language Model with Spatio-Temporal Visual Representation | - | 0
Cross-modal RAG: Sub-dimensional Retrieval-Augmented Text-to-Image Generation | Code | 0
Think Before You Diffuse: LLMs-Guided Physics-Aware Video Generation | - | 0

No leaderboard results yet.