SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 4301-4325 of 661570 papers

Title | Status | Hype
A Noise Sensitivity Exponent Controls Large Statistical-to-Computational Gaps in Single- and Multi-Index Models | | 0
Pretrained Multilingual Transformers Reveal Quantitative Distance Between Human Languages | | 0
Interpretable Traffic Responsibility from Dashcam Video via Legal Multi Agent Reasoning | | 0
Efficient Training-Free Multi-Token Prediction via Embedding-Space Probing | | 0
Unified Policy Value Decomposition for Rapid Adaptation | | 0
VideoAtlas: Navigating Long-Form Video in Logarithmic Compute | | 0
LaDe: Unified Multi-Layered Graphic Media Generation and Decomposition | | 0
Robust-ComBat: Mitigating Outlier Effects in Diffusion MRI Data Harmonization | | 0
Specification-Aware Distribution Shaping for Robotics Foundation Models | | 0
TechImage-Bench: Rubric-Based Evaluation for Technical Image Generation | | 0
Revisiting foundation models for cell instance segmentation | | 0
Automated Grammar-based Algebraic Multigrid Design With Evolutionary Algorithms | | 0
Differential Attention-Augmented BiomedCLIP with Asymmetric Focal Optimization for Imbalanced Multi-Label Video Capsule Endoscopy Classification | | 0
Omnilingual MT: Machine Translation for 1,600 Languages | | 0
Efficient Exploration at Scale | | 0
Thin Keys, Full Values: Reducing KV Cache via Low-Dimensional Attention Selection | | 0
Bodhi VLM: Privacy-Alignment Modeling for Hierarchical Visual Representations in Vision Backbones and VLM Encoders via Bottom-Up and Top-Down Feature Search | Code | 0
Omni-I2C: A Holistic Benchmark for High-Fidelity Image-to-Code Generation | Code | 0
TheraMind: A Strategic and Adaptive Agent for Longitudinal Psychological Counseling | Code | 0
EvoGuard: An Extensible Agentic RL-based Framework for Practical and Evolving AI-Generated Image Detection | | 0
A practical artificial intelligence framework for legal age estimation using clavicle computed tomography scans | | 0
MALLES: A Multi-agent LLMs-based Economic Sandbox with Consumer Preference Alignment | | 0
Proactive Knowledge Inquiry in Doctor-Patient Dialogue: Stateful Extraction, Belief Updating, and Path-Aware Action Planning | | 0
Large Language Models as a Semantic Interface and Ethical Mediator in Neuro-Digital Ecosystems: Conceptual Foundations and a Regulatory Imperative | | 0
The Phasor Transformer: Resolving Attention Bottlenecks on the Unit Circle | | 0
← Prev | Page 173 of 26463 | Next →