SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 626–650 of 659,983 papers

Title | Status | Hype
Kinetic Langevin Splitting Schemes for Constrained Sampling | | 0
Graph Energy Matching: Transport-Aligned Energy-Based Modeling for Graph Generation | | 0
Unleashing Spatial Reasoning in Multimodal Large Language Models via Textual Representation Guided Reasoning | | 0
Evaluating a Multi-Agent Voice-Enabled Smart Speaker for Care Homes: A Safety-Focused Framework | | 0
SortedRL: Accelerating RL Training for LLMs through Online Length-Aware Scheduling | | 0
Biased Error Attribution in Multi-Agent Human-AI Systems Under Delayed Feedback | | 0
Bilevel Autoresearch: Meta-Autoresearching Itself | | 0
Mecha-nudges for Machines | | 0
Similarity-Aware Mixture-of-Experts for Data-Efficient Continual Learning | | 0
Targeted Adversarial Traffic Generation : Black-box Approach to Evade Intrusion Detection Systems in IoT Networks | | 0
SIGMA: A Physics-Based Benchmark for Gas Chimney Understanding in Seismic Images | | 0
Evaluating LLM-Based Test Generation Under Software Evolution | | 0
VISion On Request: Enhanced VLLM efficiency with sparse, dynamically selected, vision-language interactions | | 0
Estimating Flow Velocity and Vehicle Angle-of-Attack from Non-invasive Piezoelectric Structural Measurements Using Deep Learning | | 0
WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG | | 0
DA-Flow: Degradation-Aware Optical Flow Estimation with Diffusion Models | | 0
UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation | | 0
MedObvious: Exposing the Medical Moravec's Paradox in VLMs via Clinical Triage | | 0
OccAny: Generalized Unconstrained Urban 3D Occupancy | | 0
Agentic Automation of BT-RADS Scoring: End-to-End Multi-Agent System for Standardized Brain Tumor Follow-up Assessment | | 0
The Geometric Price of Discrete Logic: Context-driven Manifold Dynamics of Number Representations | | 0
Residual Attention Physics-Informed Neural Networks for Robust Multiphysics Simulation of Steady-State Electrothermal Energy Systems | | 0
MetaKube: An Experience-Aware LLM Framework for Kubernetes Failure Diagnosis | | 0
The Mass Agreement Score: A Point-centric Measure of Cluster Size Consistency | | 0
Estimating Individual Tree Height and Species from UAV Imagery | | 0
Page 26 of 26,400