SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 2241–2250 of 177340 papers

Title | Status | Hype
SWE-bench: Can Language Models Resolve Real-World GitHub Issues? | Code | 4
RUMI: Rummaging Using Mutual Information | Code | 4
ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks | Code | 4
A General Theoretical Paradigm to Understand Learning from Human Preferences | Code | 4
Self-Supervised Geometry-Guided Initialization for Robust Monocular Visual Odometry | Code | 4
MUSE: Machine Unlearning Six-Way Evaluation for Language Models | Code | 4
Stock Price Prediction via Discovering Multi-Frequency Trading Patterns | Code | 4
The Model Openness Framework: Promoting Completeness and Openness for Reproducibility, Transparency, and Usability in Artificial Intelligence | Code | 4
Fast Transformer Decoding: One Write-Head is All You Need | Code | 4
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data | Code | 4
Page 225 of 17734