The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 2,326–2,350 of 661,570 papers

Title	Date	Tasks	Status	Hype
Identify Critical KV Cache in LLM Inference from an Output Perturbation Perspective	Feb 6, 2025		Code Available	4
Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data	Jul 22, 2021	Blind Super-Resolution, Super-Resolution	Code Available	4
FinBen: A Holistic Financial Benchmark for Large Language Models	Feb 20, 2024	Question Answering, RAG	Code Available	4
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models	Nov 7, 2024	GPU, Quantization	Code Available	4
χ_0: Resource-Aware Robust Manipulation via Taming Distributional Inconsistencies	Mar 17, 2026		Unverified	3
InstantSfM: Towards GPU-Native SfM for the Deep Learning Era	Mar 11, 2026		Unverified	3
Simulating the Visual World with Artificial Intelligence: A Roadmap	Feb 5, 2026		Unverified	3
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience	Jan 23, 2026		Unverified	3
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution	Feb 26, 2026		Unverified	3
DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation	Feb 12, 2026		Unverified	3
LLaDA2.1: Speeding Up Text Diffusion via Token Editing	Feb 13, 2026		Unverified	3
ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking	Jan 22, 2026		Unverified	3
JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion	Jan 29, 2026		Unverified	3
LLM-in-Sandbox Elicits General Agentic Intelligence	Feb 12, 2026		Unverified	3
AnyUp: Universal Feature Upsampling	Feb 16, 2026		Unverified	3
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence	Feb 26, 2026		Unverified	3
GEM: A Gym for Agentic LLMs	Mar 1, 2026		Unverified	3
DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing	Feb 13, 2026		Unverified	3
Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence	Mar 8, 2026		Unverified	3
LongCat-Flash-Thinking-2601 Technical Report	Feb 1, 2026		Unverified	3
HY3D-Bench: Generation of 3D Assets	Feb 3, 2026		Unverified	3
PartUV: Part-Based UV Unwrapping of 3D Meshes	Feb 17, 2026		Unverified	3
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence	Mar 11, 2026		Unverified	3
AI Can Learn Scientific Taste	Mar 15, 2026		Unverified	3
EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling	Feb 1, 2026		Unverified	3