SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 2711-2720 of 474,278 papers

Title | Status | Hype
Self-rewarding correction for mathematical reasoning | Code | 3
Verdict: A Library for Scaling Judge-Time Compute | Code | 3
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation | Code | 3
Harnessing Multiple Large Language Models: A Survey on LLM Ensemble | Code | 3
Chain of Draft: Thinking Faster by Writing Less | Code | 3
S-Graphs 2.0 -- A Hierarchical-Semantic Optimization and Loop Closure for SLAM | Code | 3
Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction | Code | 3
AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement | Code | 3
AlphaAgent: LLM-Driven Alpha Mining with Regularized Exploration to Counteract Alpha Decay | Code | 3
Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs | Code | 3