SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 841-850 of 177,340 papers

Title | Status | Hype
SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? | Code | 5
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models | Code | 5
TikZero: Zero-Shot Text-Guided Graphics Program Synthesis | Code | 5
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness | Code | 5
ZeroSearch: Incentivize the Search Capability of LLMs without Searching | Code | 5
Show-o2: Improved Native Unified Multimodal Models | Code | 5
REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards | Code | 5
DoWhy-GCM: An extension of DoWhy for causal inference in graphical causal models | Code | 5
VADv2: End-to-End Vectorized Autonomous Driving via Probabilistic Planning | Code | 5
Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral | Code | 5