SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 2921-2930 of 177340 papers

Title | Status | Hype
Discovering Language Model Behaviors with Model-Written Evaluations | Code | 3
A Survey of Camouflaged Object Detection and Beyond | Code | 3
MCTrack: A Unified 3D Multi-Object Tracking Framework for Autonomous Driving | Code | 3
Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents | Code | 3
PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition | Code | 3
A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond | Code | 3
MVSFormer++: Revealing the Devil in Transformer's Details for Multi-View Stereo | Code | 3
Prisma: An Open Source Toolkit for Mechanistic Interpretability in Vision and Video | Code | 3
MyoSuite -- A contact-rich simulation suite for musculoskeletal motor control | Code | 3
Effects of charging and discharging capabilities on trade-offs between model accuracy and computational efficiency in pumped thermal electricity storage | Code | 3
Page 293 of 17734