SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 1401-1425 of 659,983 papers

Title | Status | Hype
Thermal is Always Wild: Characterizing and Addressing Challenges in Thermal-Only Novel View Synthesis | | 0
Solver-Aided Verification of Policy Compliance in Tool-Augmented LLM Agents | | 0
Policies Permitting LLM Use for Polishing Peer Reviews Are Currently Not Enforceable | | 0
SDE-Driven Spatio-Temporal Hypergraph Neural Networks for Irregular Longitudinal fMRI Connectome Modeling in Alzheimer's Disease | | 0
Reinforcement Learning from Multi-Source Imperfect Preferences: Best-of-Both-Regimes Regret | | 0
From Data to Laws: Neural Discovery of Conservation Laws Without False Positives | | 0
CREG: Compass Relational Evidence for Interpreting Spatial Reasoning in Vision-Language Models | | 0
Profiling learners' affective engagement: Emotion AI, intercultural pragmatics, and language learning | | 0
Spatio-Temporal Grid Intelligence: A Hybrid Graph Neural Network and LSTM Framework for Robust Electricity Theft Detection | | 0
AE-LLM: Adaptive Efficiency Optimization for Large Language Models | | 0
PARHAF, a human-authored corpus of clinical reports for fictitious patients in French | | 0
Meeting in the Middle: A Co-Design Paradigm for FHE and AI Inference | | 0
CogFormer: Learn All Your Models Once | | 0
Delightful Distributed Policy Gradient | | 0
Does This Gradient Spark Joy? | | 0
RMNP: Row-Momentum Normalized Preconditioning for Scalable Matrix-Based Optimization | | 0
Memory Over Maps: 3D Object Localization Without Reconstruction | | 0
Epistemic Observability in Language Models | | 0
When Negation Is a Geometry Problem in Vision-Language Models | | 0
Permutation-Consensus Listwise Judging for Robust Factuality Evaluation | | 0
ReBOL: Retrieval via Bayesian Optimization with Batched LLM Relevance Observations and Query Reformulation | | 0
Evaluating Large Language Models on Historical Health Crisis Knowledge in Resource-Limited Settings: A Hybrid Multi-Metric Study | | 0
Shift-Invariant Feature Attribution in the Application of Wireless Electrocardiograms | | 0
Diffutron: A Masked Diffusion Language Model for Turkish Language | | 0
Goal-oriented learning of stochastic dynamical systems using error bounds on path-space observables | | 0
Page 57 of 26400