SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 3701-3725 of 661570 papers

Title | Status | Hype
OpenT2M: No-frill Motion Generation with Open-source, Large-scale, High-quality Data | | 0
Automatic detection of Gen-AI texts: A comparative framework of neural models | | 0
WeNLEX: Weakly Supervised Natural Language Explanations for Multilabel Chest X-ray Classification | | 0
Towards Verifiable AI with Lightweight Cryptographic Proofs of Inference | | 0
Reasonably reasoning AI agents can avoid game-theoretic failures in zero-shot, provably | | 0
D-Mem: A Dual-Process Memory System for LLM Agents | | 0
Communication-Efficient and Robust Multi-Modal Federated Learning via Latent-Space Consensus | | 0
Multi-Domain Causal Empirical Bayes Under Linear Mixing | | 0
Multimodal Task Interference: A Benchmark and Analysis of History-Target Mismatch in Multimodal LLMs | | 0
On the Peril of (Even a Little) Nonstationarity in Satisficing Regret Minimization | | 0
Can LLM generate interesting mathematical research problems? | | 0
SuperDec: 3D Scene Decomposition with Superquadric Primitives | | 0
Online Convex Optimization with Heavy Tails: Old Algorithms, New Regrets, and Applications | | 0
CrossHOI-Bench: A Unified Benchmark for HOI Evaluation across Vision-Language Models and HOI-Specific Methods | | 0
AI-driven Dispensing of Coral Reseeding Devices for Broad-scale Restoration of the Great Barrier Reef | | 0
Closed-form ℓ_r norm scaling with data for overparameterized linear regression and diagonal linear networks under ℓ_p bias | | 0
LucidFlux: Caption-Free Photo-Realistic Image Restoration via a Large-Scale Diffusion Transformer | | 5
Splines-Based Feature Importance in Kolmogorov-Arnold Networks: A Framework for Supervised Tabular Data Dimensionality Reduction | | 0
Support Basis: Fast Attention Beyond Bounded Entries | | 0
If Probable, Then Acceptable? Understanding Conditional Acceptability Judgments in Large Language Models | | 0
Evaluating Hallucinations in Audio-Visual Multimodal LLMs with Spoken Queries under Diverse Acoustic Conditions | | 0
Manual2Skill++: Connector-Aware General Robotic Assembly from Instruction Manuals via Vision-Language Models | | 0
StoryBox: Collaborative Multi-Agent Simulation for Hybrid Bottom-Up Long-Form Story Generation Using Large Language Models | | 0
Linear Attention for Joint Power Optimization and User-Centric Clustering in Cell-Free Networks | | 0
Adaptive Accountability in Networked MAS: Tracing and Mitigating Emergent Norms at Scale | | 0
Page 149 of 26463