SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 4326-4350 of 177,340 papers

Title | Status | Hype
MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning | Code | 3
On Distillation of Guided Diffusion Models | Code | 3
SWE-bench-java: A GitHub Issue Resolving Benchmark for Java | Code | 3
SoundStream: An End-to-End Neural Audio Codec | Code | 3
Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective | Code | 3
On the Content Bias in Fréchet Video Distance | Code | 3
Flow Matching for Generative Modeling | Code | 3
W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training | Code | 3
3D Diffuser Actor: Policy Diffusion with 3D Scene Representations | Code | 3
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion | Code | 3
SkyMath: Technical Report | Code | 3
XuanYuan 2.0: A Large Chinese Financial Chat Model with Hundreds of Billions Parameters | Code | 3
Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning | Code | 3
Designing and building the mlpack open-source machine learning library | Code | 3
One-step Diffusion with Distribution Matching Distillation | Code | 3
EAFormer: Scene Text Segmentation with Edge-Aware Transformers | Code | 3
Accurate clinical and biomedical Named entity recognition at scale | Code | 3
Planning in Strawberry Fields: Evaluating and Improving the Planning and Scheduling Capabilities of LRM o1 | Code | 3
EventRL: Enhancing Event Extraction with Outcome Supervision for Large Language Models | Code | 3
LRM: Large Reconstruction Model for Single Image to 3D | Code | 3
GluonTS: Probabilistic Time Series Models in Python | Code | 3
Practical Deep Reinforcement Learning Approach for Stock Trading | Code | 3
CodeBLEU: a Method for Automatic Evaluation of Code Synthesis | Code | 3
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction | Code | 3
Merlin: A Vision Language Foundation Model for 3D Computed Tomography | Code | 3