SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 3926–3950 of 661,570 papers

Title | Status | Hype
ClinConsensus: A Consensus-Based Benchmark for Evaluating Chinese Medical LLMs across Difficulty Levels | | 0
Adversarial Latent-State Training for Robust Policies in Partially Observable Domains | | 0
Structure from rank: Rank-order coding as a bridge from sequence to structure | | 0
AgentDrift: Unsafe Recommendation Drift Under Tool Corruption Hidden by Ranking Metrics in LLM Agents | | 0
Listening to the Echo: User-Reaction Aware Policy Optimization via Scalar-Verbal Hybrid Reinforcement Learning | | 0
Variational Phasor Circuits for Phase-Native Brain-Computer Interface Classification | | 0
Discovering What You Can Control: Interventional Boundary Discovery for Reinforcement Learning | | 0
A Synthesizable RTL Implementation of Predictive Coding Networks | | 0
CWoMP: Morpheme Representation Learning for Interlinear Glossing | | 0
Lightweight Adaptation for LLM-based Technical Service Agent: Latent Logic Augmentation and Robust Noise Reduction | | 0
SLEA-RL: Step-Level Experience Augmented Reinforcement Learning for Multi-Turn Agentic Training | | 0
EgoAdapt: Enhancing Robustness in Egocentric Interactive Speaker Detection Under Missing Modalities | | 0
Probabilistic Federated Learning on Uncertain and Heterogeneous Data with Model Personalization | | 0
Uncovering Latent Phase Structures and Branching Logic in Locomotion Policies: A Case Study on HalfCheetah | | 0
One-to-More: High-Fidelity Training-Free Anomaly Generation with Attention Control | | 0
A Trace-Based Assurance Framework for Agentic AI Orchestration: Contracts, Testing, and Governance | | 0
ARTEMIS: A Neuro Symbolic Framework for Economically Constrained Market Dynamics | | 0
From Concepts to Judgments: Interpretable Image Aesthetic Assessment | | 0
Discovery of Bimodal Drift Rate Structure in FRB 20240114A: Evidence for Dual Emission Regions | | 0
BoundAD: Boundary-Aware Negative Generation for Time Series Anomaly Detection | | 0
VC-Soup: Value-Consistency Guided Multi-Value Alignment for Large Language Models | | 0
MAED: Mathematical Activation Error Detection for Mitigating Physical Fault Attacks in DNN Inference | | 0
Evaluating FrameNet-Based Semantic Modeling for Gender-Based Violence Detection in Clinical Records | | 0
Towards sample-optimal learning of bosonic Gaussian quantum states | | 0
How LLMs Distort Our Written Language | | 0
Page 158 of 26,463