SOTAVerified

World Knowledge

Papers

Showing 11–20 of 818 papers

| Title | Status | Hype |
| --- | --- | --- |
| VILA: On Pre-training for Visual Language Models | Code | 4 |
| Text2SQL is Not Enough: Unifying AI and Databases with TAG | Code | 4 |
| LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation | Code | 4 |
| Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks | Code | 4 |
| V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs | Code | 4 |
| AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning | Code | 3 |
| Tokenization, Fusion, and Augmentation: Towards Fine-grained Multi-modal Entity Representation | Code | 3 |
| Are We on the Right Way for Evaluating Large Vision-Language Models? | Code | 3 |
| LLaRA: Supercharging Robot Learning Data for Vision-Language Policy | Code | 3 |
| HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation | Code | 3 |

No leaderboard results yet.