SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 2481-2490 of 474,278 papers

Title | Status | Hype
Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning | Code | 3
AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models | Code | 3
Arctic-Text2SQL-R1: Simple Rewards, Strong Reasoning in Text-to-SQL | Code | 3
MASLab: A Unified and Comprehensive Codebase for LLM-based Multi-Agent Systems | Code | 3
LaViDa: A Large Diffusion Language Model for Multimodal Understanding | Code | 3
IFEval-Audio: Benchmarking Instruction-Following Capability in Audio-based Large Language Models | Code | 3
Training-Free Efficient Video Generation via Dynamic Token Carving | Code | 3
Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space | Code | 3
Distance Adaptive Beam Search for Provably Accurate Graph-Based Nearest Neighbor Search | Code | 3
OmniGenBench: A Modular Platform for Reproducible Genomic Foundation Models Benchmarking | Code | 3