SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 8201-8250 of 661,570 papers

Title | Status | Hype
Time series forecasting with Hahn Kolmogorov-Arnold networks | | 0
Structured Matrix Scaling for Multi-Class Calibration | | 0
Memorization capacity of deep ReLU neural networks characterized by width and depth | | 0
SimpleQA Verified: A Reliable Factuality Benchmark to Measure Parametric Knowledge | | 0
TSFM in-context learning for time-series classification of bearing-health status | | 0
Training-Free Coverless Multi-Image Steganography with Access Control | | 0
Quantifying the Necessity of Chain of Thought through Opaque Serial Depth | | 0
From Prior to Pro: Efficient Skill Mastery via Distribution Contractive RL Finetuning | | 0
FreqCycle: A Multi-Scale Time-Frequency Analysis Method for Time Series Forecasting | | 0
Improving 3D Foot Motion Reconstruction in Markerless Monocular Human Motion Capture | | 0
Influencing LLM Multi-Agent Dialogue via Policy-Parameterized Prompts | | 0
Calibration-Reasoning Framework for Descriptive Speech Quality Assessment | | 0
GSStream: 3D Gaussian Splatting based Volumetric Scene Streaming System | | 0
Governance Architecture for Autonomous Agent Systems: Threats, Framework, and Engineering Practice | | 0
No evaluation without fair representation : Impact of label and selection bias on the evaluation, performance and mitigation of classification models | | 0
Large Spikes in Stochastic Gradient Descent: A Large-Deviations View | | 0
A Graph-Based Approach to Spectrum Demand Prediction Using Hierarchical Attention Networks | | 0
Singing Syllabi with Virtual Avatars: Enhancing Student Engagement Through AI-Generated Music and Digital Embodiment | | 0
Removing the Trigger, Not the Backdoor: Alternative Triggers and Latent Backdoors | | 0
Global universality via discrete-time signatures | | 0
Why LLMs Fail: A Failure Analysis and Partial Success Measurement for Automated Security Patch Generation | | 0
CRANE: Causal Relevance Analysis of Language-Specific Neurons in Multilingual Large Language Models | | 0
ParTY: Part-Guidance for Expressive Text-to-Motion Synthesis | | 0
VarSplat: Uncertainty-aware 3D Gaussian Splatting for Robust RGB-D SLAM | | 0
ReCoSplat: Autoregressive Feed-Forward Gaussian Splatting Using Render-and-Compare | | 0
Digging Deeper: Learning Multi-Level Concept Hierarchies | | 0
Common Sense vs. Morality: The Curious Case of Narrative Focus Bias in LLMs | | 0
a-TMFG: Scalable Triangulated Maximally Filtered Graphs via Approximate Nearest Neighbors | | 0
Learning Transferable Skills in Action RPGs via Directed Skill Graphs and Selective Adaptation | | 0
Marginals Before Conditionals | | 0
GR-SAP: Generative Replay for Safety Alignment Preservation during Fine-Tuning | Code | 0
MMGraphRAG: Bridging Vision and Language with Interpretable Multimodal Knowledge Graphs | | 0
Learning Encoding-Decoding Direction Pairs to Unveil Concepts of Influence in Deep Vision Networks | | 0
MKE-Coder: Multi-Axial Knowledge with Evidence Verification in ICD Coding for Chinese EMRs | | 0
Experiments with Optimal Model Trees | | 0
Pure Exploration with Infinite Answers | | 0
TaoSR1: The Thinking Model for E-commerce Relevance Search | | 0
On the mechanical creation of mathematical concepts | | 0
VistaWise: Building Cost-Effective Agent with Cross-Modal Knowledge Graph for Minecraft | | 0
AgentCoMa: A Compositional Benchmark Mixing Commonsense and Mathematical Reasoning in Real-World Scenarios | | 0
Iterative In-Context Learning to Enhance LLMs Abstract Reasoning: The Case-Study of Algebraic Tasks | | 0
Intrinsic Numerical Robustness and Fault Tolerance in a Neuromorphic Algorithm for Scientific Computing | | 0
Automatic Paper Reviewing with Heterogeneous Graph Reasoning over LLM-Simulated Reviewer-Author Debates | | 0
CoRe-GS: Coarse-to-Refined Gaussian Splatting with Semantic Object Focus | | 0
Repulsive Monte Carlo on the sphere for the sliced Wasserstein distance | | 0
VoiceBridge: General Speech Restoration with One-step Latent Bridge Models | | 0
Reasoning Efficiently Through Adaptive Chain-of-Thought Compression: A Self-Optimizing Framework | | 0
Kuramoto Orientation Diffusion Models | | 0
Periodic Asynchrony: An On-Policy Approach for Accelerating LLM Reinforcement Learning | | 0
LLaVAShield: Safeguarding Multimodal Multi-Turn Dialogues in Vision-Language Models | | 0
Page 165 of 13,232