SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 1026-1050 of 659,983 papers

Title | Status | Hype
GaussianObject: High-Quality 3D Object Reconstruction from Four Views with Gaussian Splatting | Code | 5
3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations | Code | 5
MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images | Code | 5
Lean Copilot: Large Language Models as Copilots for Theorem Proving in Lean | Code | 5
Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation | Code | 5
Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation | Code | 5
The Vizier Gaussian Process Bandit Algorithm | Code | 5
Fundamental Components of Deep Learning: A category-theoretic approach | Code | 5
Magma: A Foundation Model for Multimodal AI Agents | Code | 5
LiveBench: A Challenging, Contamination-Limited LLM Benchmark | Code | 5
FuXi-2.0: Advancing machine learning weather forecasting model for practical applications | Code | 5
Retinexformer: One-stage Retinex-based Transformer for Low-light Image Enhancement | Code | 5
Neural Fields in Robotics: A Survey | Code | 5
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs | Code | 5
SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? | Code | 5
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models | Code | 5
TikZero: Zero-Shot Text-Guided Graphics Program Synthesis | Code | 5
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness | Code | 5
ZeroSearch: Incentivize the Search Capability of LLMs without Searching | Code | 5
Show-o2: Improved Native Unified Multimodal Models | Code | 5
REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards | Code | 5
DoWhy-GCM: An extension of DoWhy for causal inference in graphical causal models | Code | 5
VADv2: End-to-End Vectorized Autonomous Driving via Probabilistic Planning | Code | 5
Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral | Code | 5
Penzai + Treescope: A Toolkit for Interpreting, Visualizing, and Editing Models As Data | Code | 5
Page 42 of 26,400