SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 5551-5600 of 661,570 papers

Title | Status | Hype
A Family of LLMs Liberated from Static Vocabularies | | 0
Robust Language Identification for Romansh Varieties | | 0
UMO: Unified In-Context Learning Unlocks Motion Foundation Model Priors | | 0
An Agentic Evaluation Framework for AI-Generated Scientific Code in PETSc | | 0
Standardizing Medical Images at Scale for AI | | 0
Aligning Paralinguistic Understanding and Generation in Speech LLMs via Multi-Task Reinforcement Learning | | 0
Determinism in the Undetermined: Deterministic Output in Charge-Conserving Continuous-Time Neuromorphic Systems with Temporal Stochasticity | | 0
The Midas Touch in Gaze vs. Hand Pointing: Modality-Specific Failure Modes and Implications for XR Interfaces | | 0
Mostly Text, Smart Visuals: Asymmetric Text-Visual Pruning for Large Vision-Language Models | | 0
Understanding Moral Reasoning Trajectories in Large Language Models: Toward Probing-Based Explainability | | 0
IRAM-Omega-Q: A Computational Architecture for Uncertainty Regulation in Artificial Agents | | 0
Agentic Exploration of Physics Models | | 0
Balancing Saliency and Coverage: Semantic Prominence-Aware Budgeting for Visual Token Compression in VLMs | | 0
Describing Agentic AI Systems with C4: Lessons from Industry Projects | | 0
POLAR: A Per-User Association Test in Embedding Space | Code | 0
GASP: Guided Asymmetric Self-Play For Coding LLMs | | 0
MAC: Multi-Agent Constitution Learning | | 0
Datasets for Verb Alternations across Languages: BLM Templates and Data Augmentation Strategies | | 0
RoCo Challenge at AAAI 2026: Benchmarking Robotic Collaborative Manipulation for Assembly Towards Industrial Automation | | 0
Learning Latent Proxies for Controllable Single-Image Relighting | | 0
From Text to Forecasts: Bridging Modality Gap with Temporal Evolution Semantic Space | | 0
Embedding Compression via Spherical Coordinates | | 0
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data | | 3
Prompt Readiness Levels (PRL): a maturity scale and scoring framework for production grade prompt assets | | 0
PCodeTrans: Translate Decompiled Pseudocode to Compilable and Executable Equivalent | | 0
Massive Redundancy in Gradient Transport Enables Sparse Online Learning | | 0
AI Evasion and Impersonation Attacks on Facial Re-Identification with Activation Map Explanations | | 0
ELISA: An Interpretable Hybrid Generative AI Agent for Expression-Grounded Discovery in Single-Cell Genomics | Code | 0
Fold-CP: A Context Parallelism Framework for Biomolecular Modeling | | 0
Active Seriation: Efficient Ordering Recovery with Statistical Guarantees | | 0
A WDLoRA-Based Multimodal Generative Framework for Clinically Guided Corneal Confocal Microscopy Image Synthesis in Diabetic Neuropathy | | 0
OrgForge: A Multi-Agent Simulation Framework for Verifiable Synthetic Corporate Corpora | | 0
CyCLeGen: Cycle-Consistent Layout Prediction and Image Generation in Vision Foundation Models | | 0
Evolutionary Transfer Learning for Dragonchess | | 0
Counteractive RL: Rethinking Core Principles for Efficient and Scalable Deep Reinforcement Learning | | 0
Resilience Meets Autonomy: Governing Embodied AI in Critical Infrastructure | | 0
Persistent Autoregressive Mapping with Traffic Rules for Autonomous Driving | | 0
Deterministic Policy Gradient for Reinforcement Learning with Continuous Time and State | | 0
Rethinking LLM Watermark Detection in Black-Box Settings: A Non-Intrusive Third-Party Framework | | 0
Interpretable Predictability-Based AI Text Detection: A Replication Study | | 0
Detection of Autonomous Shuttles in Urban Traffic Images Using Adaptive Residual Context | | 0
Self-supervised Disentanglement of Disease Effects from Aging in 3D Medical Shapes | Code | 0
Learning to Recall with Transformers Beyond Orthogonal Embeddings | | 0
Learning Question-Aware Keyframe Selection with Synthetic Supervision for Video Question Answering | | 0
Machine learning for sustainable geoenergy: uncertainty, physics and decision-ready inference | | 0
Mathematical Foundations of Polyphonic Music Generation via Structural Inductive Bias | | 0
Consequentialist Objectives and Catastrophe | | 0
Efficient Document Parsing via Parallel Token Prediction | | 0
Criterion-referenceability determines LLM-as-a-judge validity across physics assessment formats | | 0
SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering? | Code | 0