SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 3671 to 3680 of 474,278 papers

Title | Status | Hype
MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making | Code | 3
The Common Core Ontologies | Code | 3
PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks | Code | 3
PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models | Code | 3
RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents | Code | 3
SEED-Bench: Benchmarking Multimodal Large Language Models | Code | 3
Reasoning with Language Model Prompting: A Survey | Code | 3
ThoughtSource: A central hub for large language model reasoning data | Code | 3
VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion | Code | 3
Foundation Models for Music: A Survey | Code | 3