SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 2751-2775 of 661570 papers

Title | Status | Hype
KG-Hopper: Empowering Compact Open LLMs with Knowledge Graph Reasoning via Reinforcement Learning | | 0
Beyond Static Visual Tokens: Structured Sequential Visual Chain-of-Thought Reasoning | | 0
Distilled Large Language Model-Driven Dynamic Sparse Expert Activation Mechanism | | 0
Ordinal Semantic Segmentation Applied to Medical and Odontological Images | | 0
Errors in AI-Assisted Retrieval of Medical Literature: A Comparative Study | | 0
T-MAP: Red-Teaming LLM Agents with Trajectory-aware Evolutionary Search | | 0
Neutrino Oscillation Parameter Estimation Using Structured Hierarchical Transformers | | 0
Causal Direct Preference Optimization for Distributionally Robust Generative Recommendation | | 0
Graphs RAG at Scale: Beyond Retrieval-Augmented Generation With Labeled Property Graphs and Resource Description Framework for Complex and Unknown Search Spaces | | 0
Implicit Turn-Wise Policy Optimization for Proactive User-LLM Interaction | | 0
Subject Information Extraction for Novelty Detection with Domain Shifts | | 0
LJ-Bench: Ontology-Based Benchmark for U.S. Crime | | 0
Context Cartography: Toward Structured Governance of Contextual Space in Large Language Model Systems | | 0
Position: Multi-Agent Algorithmic Care Systems Demand Contestability for Trustworthy AI | | 0
Graph-based data-driven discovery of interpretable laws governing corona-induced noise and radio interference for high-voltage transmission lines | | 0
Interpretable Operator Learning for Inverse Problems via Adaptive Spectral Filtering: Convergence and Discretization Invariance | | 0
Bayesian Learning in Episodic Zero-Sum Games | | 0
Towards Practical World Model-based Reinforcement Learning for Vision-Language-Action Models | | 0
GaussianPile: A Unified Sparse Gaussian Splatting Framework for Slice-based Volumetric Reconstruction | | 0
Beyond Token Eviction: Mixed-Dimension Budget Allocation for Efficient KV Cache Compression | | 0
Where can AI be used? Insights from a deep ontology of work activities | | 0
Reasoning Traces Shape Outputs but Models Won't Say So | | 0
LassoFlexNet: Flexible Neural Architecture for Tabular Data | | 0
Optimal low-rank stochastic gradient estimation for LLM training | | 0
Seed1.8 Model Card: Towards Generalized Real-World Agency | | 0
Page 111 of 26463