SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 2326-2350 of 661,570 papers

Title | Status | Hype
Identify Critical KV Cache in LLM Inference from an Output Perturbation Perspective | Code | 4
Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data | Code | 4
FinBen: A Holistic Financial Benchmark for Large Language Models | Code | 4
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models | Code | 4
χ_0: Resource-Aware Robust Manipulation via Taming Distributional Inconsistencies | - | 3
InstantSfM: Towards GPU-Native SfM for the Deep Learning Era | - | 3
Simulating the Visual World with Artificial Intelligence: A Roadmap | - | 3
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience | - | 3
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution | - | 3
DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation | - | 3
LLaDA2.1: Speeding Up Text Diffusion via Token Editing | - | 3
ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking | - | 3
JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion | - | 3
LLM-in-Sandbox Elicits General Agentic Intelligence | - | 3
AnyUp: Universal Feature Upsampling | - | 3
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence | - | 3
GEM: A Gym for Agentic LLMs | - | 3
DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing | - | 3
Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence | - | 3
LongCat-Flash-Thinking-2601 Technical Report | - | 3
HY3D-Bench: Generation of 3D Assets | - | 3
PartUV: Part-Based UV Unwrapping of 3D Meshes | - | 3
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence | - | 3
AI Can Learn Scientific Taste | - | 3
EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling | - | 3
Page 94 of 26,463