SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 2151-2200 of 659,983 papers

Title | Status | Hype
Is Hierarchical Quantization Essential for Optimal Reconstruction? | | 0
RE-SAC: Disentangling aleatoric and epistemic risks in bus fleet control: A stable and robust ensemble DRL approach | | 0
Discounted Beta-Bernoulli Reward Estimation for Sample-Efficient Reinforcement Learning with Verifiable Rewards | | 0
GAPSL: A Gradient-Aligned Parallel Split Learning on Heterogeneous Data | | 0
Transformers Learn Robust In-Context Regression under Distributional Uncertainty | | 0
OpenT2M: No-frill Motion Generation with Open-source, Large-scale, High-quality Data | | 0
Automatic detection of Gen-AI texts: A comparative framework of neural models | | 0
WeNLEX: Weakly Supervised Natural Language Explanations for Multilabel Chest X-ray Classification | | 0
Towards Verifiable AI with Lightweight Cryptographic Proofs of Inference | | 0
Reasonably reasoning AI agents can avoid game-theoretic failures in zero-shot, provably | | 0
D-Mem: A Dual-Process Memory System for LLM Agents | | 0
Communication-Efficient and Robust Multi-Modal Federated Learning via Latent-Space Consensus | | 0
Multi-Domain Causal Empirical Bayes Under Linear Mixing | | 0
Multimodal Task Interference: A Benchmark and Analysis of History-Target Mismatch in Multimodal LLMs | | 0
On the Peril of (Even a Little) Nonstationarity in Satisficing Regret Minimization | | 0
Can LLM generate interesting mathematical research problems? | | 0
SuperDec: 3D Scene Decomposition with Superquadric Primitives | | 0
Online Convex Optimization with Heavy Tails: Old Algorithms, New Regrets, and Applications | | 0
CrossHOI-Bench: A Unified Benchmark for HOI Evaluation across Vision-Language Models and HOI-Specific Methods | | 0
AI-driven Dispensing of Coral Reseeding Devices for Broad-scale Restoration of the Great Barrier Reef | | 0
Closed-form ℓ_r norm scaling with data for overparameterized linear regression and diagonal linear networks under ℓ_p bias | | 0
LucidFlux: Caption-Free Photo-Realistic Image Restoration via a Large-Scale Diffusion Transformer | | 5
Splines-Based Feature Importance in Kolmogorov-Arnold Networks: A Framework for Supervised Tabular Data Dimensionality Reduction | | 0
Support Basis: Fast Attention Beyond Bounded Entries | | 0
If Probable, Then Acceptable? Understanding Conditional Acceptability Judgments in Large Language Models | | 0
Evaluating Hallucinations in Audio-Visual Multimodal LLMs with Spoken Queries under Diverse Acoustic Conditions | | 0
Manual2Skill++: Connector-Aware General Robotic Assembly from Instruction Manuals via Vision-Language Models | | 0
StoryBox: Collaborative Multi-Agent Simulation for Hybrid Bottom-Up Long-Form Story Generation Using Large Language Models | | 0
Linear Attention for Joint Power Optimization and User-Centric Clustering in Cell-Free Networks | | 0
Adaptive Accountability in Networked MAS: Tracing and Mitigating Emergent Norms at Scale | | 0
Neuron-Guided Interpretation of Code LLMs: Where, Why, and How? | | 0
Image2Garment: Simulation-ready Garment Generation from a Single Image | | 0
HyperAlign: Hyperbolic Entailment Cones for Adaptive Text-to-Image Alignment Assessment | | 0
GeoMotionGPT: Geometry-Aligned Motion Understanding with Large Language Models | | 0
The Coordination Gap: Multi-Agent Alternation Metrics for Temporal Fairness in Repeated Games | | 0
Towards Efficient and Stable Ocean State Forecasting: A Continuous-Time Koopman Approach | | 0
How to Take a Memorable Picture? Empowering Users with Actionable Feedback | | 1
LucidNFT: LR-Anchored Multi-Reward Preference Optimization for Generative Real-World Super-Resolution | | 0
Neural Networks as Local-to-Global Computations | | 0
SIA: A Synthesize-Inject-Align Framework for Knowledge-Grounded and Secure E-commerce Search LLMs with Industrial Deployment | | 0
Flow Matching Policy with Entropy Regularization | | 0
Rigorous Error Certification for Neural PDE Solvers: From Empirical Residuals to Solution Guarantees | | 0
The Impact of Corporate AI Washing on Farmers' Digital Financial Behavior Response -- An Analysis from the Perspective of Digital Financial Exclusion | | 0
MLOW: Interpretable Low-Rank Frequency Magnitude Decomposition of Multiple Effects for Time Series Forecasting | | 0
Recovering Sparse Neural Connectivity from Partial Measurements: A Covariance-Based Approach with Granger-Causality Refinement | | 0
When Names Change Verdicts: Intervention Consistency Reveals Systematic Bias in LLM Decision-Making | | 0
Scaling Sim-to-Real Reinforcement Learning for Robot VLAs with Generative 3D Worlds | | 0
Balancing the Reasoning Load: Difficulty-Differentiated Policy Optimization with Length Redistribution for Efficient and Robust Reinforcement Learning | Code | 0
SCISSR: Scribble-Conditioned Interactive Surgical Segmentation and Refinement | | 0
Learning Decision-Sufficient Representations for Linear Optimization | | 0
Page 44 of 13,200