SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 3,911–3,920 of 474,278 papers

Title | Status | Hype
YourBench: Easy Custom Evaluation Sets for Everyone | Code | 3
Token Reduction Should Go Beyond Efficiency in Generative Models -- From Vision, Language to Multimodality | Code | 3
Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers | Code | 3
Spurious Rewards: Rethinking Training Signals in RLVR | Code | 3
MotionDirector: Motion Customization of Text-to-Video Diffusion Models | Code | 3
River: machine learning for streaming data in Python | Code | 3
Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design | Code | 3
Personalized Benchmarking with the Ludwig Benchmarking Toolkit | Code | 3
Deep symbolic regression for physics guided by units constraints: toward the automated discovery of physical laws | Code | 3
Large Language Models for Generative Information Extraction: A Survey | Code | 3