SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 4426-4450 of 661570 papers

| Title | Status | Hype |
| --- | --- | --- |
| Benchmarking Reinforcement Learning via Stochastic Converse Optimality: Generating Systems with Known Optimal Policies | | 0 |
| Process Supervision for Chain-of-Thought Reasoning via Monte Carlo Net Information Gain | | 0 |
| Adaptive Guidance for Retrieval-Augmented Masked Diffusion Models | | 0 |
| From Virtual Environments to Real-World Trials: Emerging Trends in Autonomous Driving | | 0 |
| Federated Distributional Reinforcement Learning with Distributional Critic Regularization | | 0 |
| Machine Learning for Network Attacks Classification and Statistical Evaluation of Machine Learning for Network Attacks Classification and Adversarial Learning Methodologies for Synthetic Data Generation | | 0 |
| SARE: Sample-wise Adaptive Reasoning for Training-free Fine-grained Visual Recognition | | 0 |
| TAPESTRY: From Geometry to Appearance via Consistent Turntable Videos | | 0 |
| Event-Centric Human Value Understanding in News-Domain Texts: An Actor-Conditioned, Multi-Granularity Benchmark | | 0 |
| Omni-3DEdit: Generalized Versatile 3D Editing in One-Pass | | 0 |
| RHYME-XT: A Neural Operator for Spatiotemporal Control Systems | | 0 |
| ShapleyLaw: A Game-Theoretic Approach to Multilingual Scaling Laws | | 0 |
| ConGA: Guidelines for Contextual Gender Annotation. A Framework for Annotating Gender in Machine Translation | | 0 |
| IndicSafe: A Benchmark for Evaluating Multilingual LLM Safety in South Asia | | 0 |
| Only relative ranks matter in weight-clustered large language models | | 0 |
| Multi-Armed Sequential Hypothesis Testing by Betting | | 0 |
| CARE: Covariance-Aware and Rank-Enhanced Decomposition for Enabling Multi-Head Latent Attention | | 0 |
| Beyond Muon: MUD (MomentUm Decorrelation) for Faster Transformer Training | | 0 |
| AHOY! Animatable Humans under Occlusion from YouTube Videos with Gaussian Splatting and Video Diffusion Priors | | 0 |
| Versatile Editing of Video Content, Actions, and Dynamics without Training | | 0 |
| ScheduleMe: Multi-Agent Calendar Assistant | | 0 |
| TRiMS: Real-Time Tracking of Minimal Sufficient Length for Efficient Reasoning via RL | | 0 |
| A Deep Surrogate Model for Robust and Generalizable Long-Term Blast Wave Prediction | | 0 |
| Unlearnable phases of matter | | 0 |
| CTG-DB: An Ontology-Based Transformation of ClinicalTrials.gov to Enable Cross-Trial Drug Safety Analyses | | 0 |
Page 178 of 26463