SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 4751–4775 of 661,570 papers

Title | Status | Hype
Transformers are Bayesian Networks | | 0
TrackDeform3D: Markerless and Autonomous 3D Keypoint Tracking and Dataset Collection for Deformable Objects | | 0
Large Reasoning Models Struggle to Transfer Parametric Knowledge Across Scripts | | 0
PRISM: Demystifying Retention and Interaction in Mid-Training | | 0
Evaluating LLM-Simulated Conversations in Modeling Inconsistent and Uncollaborative Behaviors in Human Social Interaction | | 0
An End-to-End Framework for Functionality-Embedded Provenance Graph Construction and Threat Interpretation | | 0
Knowledge Localization in Mixture-of-Experts LLMs Using Cross-Lingual Inconsistency | | 0
When the Specification Emerges: Benchmarking Faithfulness Loss in Long-Horizon Coding Agents | | 0
SENSE: Efficient EEG-to-Text via Privacy-Preserving Semantic Retrieval | | 0
Pixel-level Counterfactual Contrastive Learning for Medical Image Segmentation | | 0
Hidden Clones: Exposing and Fixing Family Bias in Vision-Language Model Ensembles | | 0
MosaicMem: Hybrid Spatial Memory for Controllable Video World Models | | 0
Security Assessment and Mitigation Strategies for Large Language Models: A Comprehensive Defensive Framework | | 0
Topology-Preserving Deep Joint Source-Channel Coding for Semantic Communication | | 0
Personalized Fall Detection by Balancing Data with Selective Feedback Using Contrastive Learning | | 0
Intent Formalization: A Grand Challenge for Reliable Coding in the Age of AI Agents | | 0
GazeOnce360: Fisheye-Based 360° Multi-Person Gaze Estimation with Global-Local Feature Fusion | | 0
Quadratic Surrogate Attractor for Particle Swarm Optimization | | 0
SLAM Adversarial Lab: An Extensible Framework for Visual SLAM Robustness Evaluation under Adverse Conditions | | 0
PAuth - Precise Task-Scoped Authorization For Agents | | 0
Detecting Data Poisoning in Code Generation LLMs via Black-Box, Vulnerability-Oriented Scanning | | 0
Domain-informed explainable boosting machines for trustworthy lateral spread predictions | | 0
Catching rationalization in the act: detecting motivated reasoning before and after CoT via activation probing | | 0
Visual Product Search Benchmark | | 0
Abstraction as a Memory-Efficient Inductive Bias for Continual Learning | | 0
Page 191 of 26,463