SOTAVerified

Multimodal Large Language Model

Papers

Showing 121–130 of 347 papers

| Title | Status | Hype |
| --- | --- | --- |
| When language and vision meet road safety: leveraging multimodal large language models for video-based traffic accident analysis | Code | 1 |
| Interpretable Droplet Digital PCR Assay for Trustworthy Molecular Diagnostics | | 0 |
| Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks | | 0 |
| Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding | Code | 2 |
| 3UR-LLM: An End-to-End Multimodal Large Language Model for 3D Scene Understanding | Code | 1 |
| LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding | Code | 2 |
| ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation | Code | 2 |
| MinMo: A Multimodal Large Language Model for Seamless Voice Interaction | | 0 |
| Valley2: Exploring Multimodal Models with Scalable Vision-Language Design | Code | 3 |
| LLaVA-Octopus: Unlocking Instruction-Driven Adaptive Projector Fusion for Video Understanding | | 0 |

No leaderboard results yet.