SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 3951-3975 of 661570 papers

Title | Status | Hype
Efficient Dense Crowd Trajectory Prediction Via Dynamic Clustering | | 0
Enactor: From Traffic Simulators to Surrogate World Models | | 0
Modeling the human lexicon under temperature variations: linguistic factors, diversity and typicality in LLM word associations | | 0
Conflict-Free Policy Languages for Probabilistic ML Predicates: A Framework and Case Study with the Semantic Router DSL | | 0
Starting Off on the Wrong Foot: Pitfalls in Data Preparation | | 0
MicroVision: An Open Dataset and Benchmark Models for Detecting Vulnerable Road Users and Micromobility Vehicles | | 0
Tackling the Sign Problem in the Doped Hubbard Model with Normalizing Flows | | 0
Semantic Segmentation and Depth Estimation for Real-Time Lunar Surface Mapping Using 3D Gaussian Splatting | | 0
A Hybrid Conditional Diffusion-DeepONet Framework for High-Fidelity Stress Prediction in Hyperelastic Materials | | 0
Toward Reliable, Safe, and Secure LLMs for Scientific Applications | | 0
Gradient-Informed Temporal Sampling Improves Rollout Accuracy in PDE Surrogate Training | | 0
EDM-ARS: A Domain-Specific Multi-Agent System for Automated Educational Data Mining Research | | 0
Detection Is Cheap, Routing Is Learned: Why Refusal-Based Alignment Evaluation Fails | | 0
CycleCap: Improving VLMs Captioning Performance via Self-Supervised Cycle Consistency Fine-Tuning | | 0
Offload or Overload: A Platform Measurement Study of Mobile Robotic Manipulation Workloads | | 0
The Validity Gap in Health AI Evaluation: A Cross-Sectional Analysis of Benchmark Composition | | 0
Sparse3DTrack: Monocular 3D Object Tracking Using Sparse Supervision | | 0
Fast and Generalizable NeRF Architecture Selection for Satellite Scene Reconstruction | | 0
Unrolled Reconstruction with Integrated Super-Resolution for Accelerated 3D LGE MRI | | 0
Learning to Reason with Curriculum I: Provable Benefits of Autocurriculum | | 0
Escaping Offline Pessimism: Vector-Field Reward Shaping for Safe Frontier Exploration | | 0
Consumer-to-Clinical Language Shifts in Ambient AI Draft Notes and Clinician-Finalized Documentation: A Multi-level Analysis | | 0
A Family of Adaptive Activation Functions for Mitigating Failure Modes in Physics-Informed Neural Networks | | 0
FaithSteer-BENCH: A Deployment-Aligned Stress-Testing Benchmark for Inference-Time Steering | | 0
MemArchitect: A Policy Driven Memory Governance Layer | | 0