SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 1101–1125 of 659,983 papers

| Title | Status | Hype |
| --- | --- | --- |
| Reframing Long-Tailed Learning via Loss Landscape Geometry | | 0 |
| ConsRoute: Consistency-Aware Adaptive Query Routing for Cloud-Edge-Device Large Language Models | | 0 |
| Amortized Variational Inference for Logistic Regression with Missing Covariates | | 0 |
| Accelerate Vector Diffusion Maps by Landmarks | | 0 |
| Graph Fusion Across Languages using Large Language Models | | 0 |
| Graph of States: Solving Abductive Tasks with Large Language Models | | 0 |
| The Library Theorem: How External Organization Governs Agentic Reasoning Capacity | | 0 |
| Aggregation Alignment for Federated Learning with Mixture-of-Experts under Data Heterogeneity | | 0 |
| Conversation Tree Architecture: A Structured Framework for Context-Aware Multi-Branch LLM Conversations | | 0 |
| Closed-form conditional diffusion models for data assimilation | | 0 |
| AutoKernel: Autonomous GPU Kernel Optimization via Iterative Agent-Driven Search | | 0 |
| EmoTaG: Emotion-Aware Talking Head Synthesis on Gaussian Splatting with Few-Shot Personalization | | 0 |
| ARYA: A Physics-Constrained Composable & Deterministic World Model Architecture | | 0 |
| RoboAlign: Learning Test-Time Reasoning for Language-Action Alignment in Vision-Language-Action Models | | 0 |
| Generalized Discrete Diffusion from Snapshots | | 0 |
| The AI Scientific Community: Agentic Virtual Lab Swarms | | 0 |
| Efficient Coarse-to-Fine Diffusion Models with Time Step Sequence Redistribution | | 0 |
| Respiratory Status Detection with Video Transformers | | 0 |
| Beyond Memorization: Distinguishing between Reductive and Epistemic Reasoning in LLMs using Classic Logic Puzzles | | 0 |
| The Workload-Router-Pool Architecture for LLM Inference Optimization: A Vision Paper from the vLLM Semantic Router Project | | 0 |
| FluidGaussian: Propagating Simulation-Based Uncertainty Toward Functionally-Intelligent 3D Reconstruction | | 0 |
| AgentHER: Hindsight Experience Replay for LLM Agent Trajectory Relabeling | | 0 |
| Benchmarking Bengali Dialectal Bias: A Multi-Stage Framework Integrating RAG-Based Translation and Human-Augmented RLAIF | | 0 |
| AdaRubric: Task-Adaptive Rubrics for LLM Agent Evaluation | | 0 |
| TIDE: Token-Informed Depth Execution for Per-Token Early Exit in LLM Inference | | 0 |