
Multimodal Large Language Model

Papers


Showing 271–280 of 347 papers

Title	Date	Tasks	Status	Hype
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding	May 14, 2024	Image Generation, Language Modeling	Code Available	7
Layout Generation Agents with Large Language Models	May 13, 2024	Language Modeling, Language Modelling	Code Available	0
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition	May 7, 2024	Large Language Model, Multimodal Large Language Model	Unverified	0
SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing	May 7, 2024	Image Manipulation, Language Modeling	Code Available	4
WorldGPT: Empowering LLM as Multimodal World Model	Apr 28, 2024	Language Modeling, Language Modelling	Code Available	2
Paint by Inpaint: Learning to Add Image Objects by Removing Them First	Apr 28, 2024	Image Inpainting, Language Modeling	Code Available	2
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites	Apr 25, 2024	4k, Language Modeling	Unverified	0
Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation	Apr 23, 2024	Image Generation, Language Modeling	Unverified	0
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models	Apr 19, 2024	Language Modeling, Language Modelling	Code Available	4
RAGAR, Your Falsehood Radar: RAG-Augmented Reasoning for Political Fact-Checking using Multimodal Large Language Models	Apr 18, 2024	Fact Checking, Language Modeling	Unverified	0



No leaderboard results yet.