SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 2951-2975 of 661,570 papers

Title | Status | Hype
AE-LLM: Adaptive Efficiency Optimization for Large Language Models |  | 0
PARHAF, a human-authored corpus of clinical reports for fictitious patients in French |  | 0
Meeting in the Middle: A Co-Design Paradigm for FHE and AI Inference |  | 0
CogFormer: Learn All Your Models Once |  | 0
Delightful Distributed Policy Gradient |  | 0
Does This Gradient Spark Joy? |  | 0
RMNP: Row-Momentum Normalized Preconditioning for Scalable Matrix-Based Optimization |  | 0
Memory Over Maps: 3D Object Localization Without Reconstruction |  | 0
Epistemic Observability in Language Models |  | 0
When Negation Is a Geometry Problem in Vision-Language Models |  | 0
Permutation-Consensus Listwise Judging for Robust Factuality Evaluation |  | 0
ReBOL: Retrieval via Bayesian Optimization with Batched LLM Relevance Observations and Query Reformulation |  | 0
Evaluating Large Language Models on Historical Health Crisis Knowledge in Resource-Limited Settings: A Hybrid Multi-Metric Study |  | 0
Shift-Invariant Feature Attribution in the Application of Wireless Electrocardiograms |  | 0
Diffutron: A Masked Diffusion Language Model for Turkish Language |  | 0
Goal-oriented learning of stochastic dynamical systems using error bounds on path-space observables |  | 0
DiffGraph: An Automated Agent-driven Model Merging Framework for In-the-Wild Text-to-Image Generation |  | 0
End-to-End Optimization of Polarimetric Measurement and Material Classifier |  | 0
Efficient Counterfactual Reasoning in ProbLog via Single World Intervention Programs |  | 0
Distributed Gradient Clustering: Convergence and the Effect of Initialization |  | 0
Measuring Reasoning Trace Legibility: Can Those Who Understand Teach? |  | 0
Lessons and Open Questions from a Unified Study of Camera-Trap Species Recognition Over Time |  | 0
Grounded Chess Reasoning in Language Models via Master Distillation |  | 0
Revenue-Sharing as Infrastructure: A Distributed Business Model for Generative AI Platforms |  | 0
Towards Practical Multimodal Hospital Outbreak Detection |  | 0
Page 119 of 26463