SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 1101-1150 of 659983 papers

Title | Status | Hype
Reframing Long-Tailed Learning via Loss Landscape Geometry | | 0
ConsRoute: Consistency-Aware Adaptive Query Routing for Cloud-Edge-Device Large Language Models | | 0
Amortized Variational Inference for Logistic Regression with Missing Covariates | | 0
Accelerate Vector Diffusion Maps by Landmarks | | 0
Graph Fusion Across Languages using Large Language Models | | 0
Graph of States: Solving Abductive Tasks with Large Language Models | | 0
The Library Theorem: How External Organization Governs Agentic Reasoning Capacity | | 0
Aggregation Alignment for Federated Learning with Mixture-of-Experts under Data Heterogeneity | | 0
Conversation Tree Architecture: A Structured Framework for Context-Aware Multi-Branch LLM Conversations | | 0
Closed-form conditional diffusion models for data assimilation | | 0
AutoKernel: Autonomous GPU Kernel Optimization via Iterative Agent-Driven Search | | 0
EmoTaG: Emotion-Aware Talking Head Synthesis on Gaussian Splatting with Few-Shot Personalization | | 0
ARYA: A Physics-Constrained Composable & Deterministic World Model Architecture | | 0
RoboAlign: Learning Test-Time Reasoning for Language-Action Alignment in Vision-Language-Action Models | | 0
Generalized Discrete Diffusion from Snapshots | | 0
The AI Scientific Community: Agentic Virtual Lab Swarms | | 0
Efficient Coarse-to-Fine Diffusion Models with Time Step Sequence Redistribution | | 0
Respiratory Status Detection with Video Transformers | | 0
Beyond Memorization: Distinguishing between Reductive and Epistemic Reasoning in LLMs using Classic Logic Puzzles | | 0
The Workload-Router-Pool Architecture for LLM Inference Optimization: A Vision Paper from the vLLM Semantic Router Project | | 0
FluidGaussian: Propagating Simulation-Based Uncertainty Toward Functionally-Intelligent 3D Reconstruction | | 0
AgentHER: Hindsight Experience Replay for LLM Agent Trajectory Relabeling | | 0
Benchmarking Bengali Dialectal Bias: A Multi-Stage Framework Integrating RAG-Based Translation and Human-Augmented RLAIF | | 0
AdaRubric: Task-Adaptive Rubrics for LLM Agent Evaluation | | 0
TIDE: Token-Informed Depth Execution for Per-Token Early Exit in LLM Inference | | 0
Relax Forcing: Relaxed KV-Memory for Consistent Long Video Generation | | 0
Conspiracy Frame: a Semiotically-Driven Approach for Conspiracy Theories Detection | | 0
PLR: Plackett-Luce for Reordering In-Context Learning Examples | | 0
Constrained Online Convex Optimization with Memory and Predictions | | 0
HamVision: Hamiltonian Dynamics as Inductive Bias for Medical Image Analysis | | 0
An InSAR Phase Unwrapping Framework for Large-scale and Complex Events | | 0
PivotRL: High Accuracy Agentic Post-Training at Low Compute Cost | | 0
Mitigating Objectness Bias and Region-to-Text Misalignment for Open-Vocabulary Panoptic Segmentation | | 0
Task-Specific Efficiency Analysis: When Small Language Models Outperform Large Language Models | | 0
A Generalised Exponentiated Gradient Approach to Enhance Fairness in Binary and Multi-class Classification Tasks | | 0
Mechanisms of Introspective Awareness | | 0
Persona Vectors in Games: Measuring and Steering Strategies via Activation Vectors | | 0
The Myhill-Nerode Theorem for Bounded Interaction: Canonical Abstractions via Agent-Bounded Indistinguishability | | 0
Multi-Perspective LLM Annotations for Valid Analyses in Subjective Tasks | | 0
Fingerprinting Deep Neural Networks for Ownership Protection: An Analytical Approach | | 0
Silent Commitment Failure in Instruction-Tuned Language Models: Evidence of Governability Divergence Across Architectures | | 0
Efficient Fine-Tuning Methods for Portuguese Question Answering: A Comparative Study of PEFT on BERTimbau and Exploratory Evaluation of Generative LLMs | | 0
Is the future of AI green? What can innovation diffusion models say about generative AI's environmental impact? | | 0
HyReach: Vision-Guided Hybrid Manipulator Reaching in Unseen Cluttered Environments | | 0
Uncertainty-Aware Knowledge Distillation for Multimodal Large Language Models | | 0
Image-Based Structural Analysis Using Computer Vision and LLMs: PhotoBeamSolver | | 0
Left Behind: Cross-Lingual Transfer as a Bridge for Low-Resource Languages in Large Language Models | | 0
Single-Eye View: Monocular Real-time Perception Package for Autonomous Driving | | 0
Gradient Descent with Projection Finds Over-Parameterized Neural Networks for Learning Low-Degree Polynomials with Nearly Minimax Optimal Rate | | 0
LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning | | 0
Page 23 of 13200