SOTAVerified

Multimodal Large Language Model

Papers

Showing 271-280 of 347 papers

Title | Status | Hype
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding | Code | 7
Layout Generation Agents with Large Language Models | Code | 0
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition | | 0
SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing | Code | 4
WorldGPT: Empowering LLM as Multimodal World Model | Code | 2
Paint by Inpaint: Learning to Add Image Objects by Removing Them First | Code | 2
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites | | 0
Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation | | 0
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models | Code | 4
RAGAR, Your Falsehood Radar: RAG-Augmented Reasoning for Political Fact-Checking using Multimodal Large Language Models | | 0

No leaderboard results yet.