SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 411–420 of 659,983 papers

Title | Status | Hype
LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models | Code | 7
Cradle: Empowering Foundation Agents Towards General Computer Control | Code | 7
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments | Code | 7
Efficient Track Anything | Code | 7
Streamlining Ocean Dynamics Modeling with Fourier Neural Operators: A Multiobjective Hyperparameter and Architecture Optimization Approach | Code | 7
Embedding Atlas: Low-Friction, Interactive Embedding Visualization | Code | 7
A Library for Learning Neural Operators | Code | 7
Kimi k1.5: Scaling Reinforcement Learning with LLMs | Code | 7
AutoCodeRover: Autonomous Program Improvement | Code | 7
S*: Test Time Scaling for Code Generation | Code | 7