The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers


Showing 2376–2400 of 661,570 papers

Title	Date	Status	Hype
DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation	Feb 12, 2026	Unverified	3
LLM-in-Sandbox Elicits General Agentic Intelligence	Feb 12, 2026	Unverified	3
SceneSmith: Agentic Generation of Simulation-Ready Indoor Scenes	Feb 9, 2026	Unverified	3
Yunjue Agent Tech Report: A Fully Reproducible, Zero-Start In-Situ Self-Evolving Agent System for Open-Ended Tasks	Feb 6, 2026	Unverified	3
Baichuan-M3: Modeling Clinical Inquiry for Reliable Medical Decision-Making	Feb 6, 2026	Unverified	3
Simulating the Visual World with Artificial Intelligence: A Roadmap	Feb 5, 2026	Unverified	3
Scaling Multiagent Systems with Process Rewards	Feb 4, 2026	Unverified	3
SWE-Pruner: Self-Adaptive Context Pruning for Coding Agents	Feb 4, 2026	Unverified	3
HY3D-Bench: Generation of 3D Assets	Feb 3, 2026	Unverified	3
CL-bench: A Benchmark for Context Learning	Feb 3, 2026	Unverified	3
MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents	Feb 2, 2026	Unverified	3
Making Avatars Interact: Towards Text-Driven Human-Object Interaction for Controllable Talking Avatars	Feb 2, 2026	Unverified	3
A Survey of Token Compression for Efficient Multimodal Large Language Models	Feb 1, 2026	Unverified	3
EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling	Feb 1, 2026	Unverified	3
LongCat-Flash-Thinking-2601 Technical Report	Feb 1, 2026	Unverified	3
DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion	Jan 29, 2026	Unverified	3
MetricAnything: Scaling Metric Depth Pretraining with Noisy Heterogeneous Sources	Jan 29, 2026	Unverified	3
Deep Delta Learning	Jan 29, 2026	Unverified	3
JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion	Jan 29, 2026	Unverified	3
TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows	Jan 28, 2026	Unverified	3
Geometry-Grounded Gaussian Splatting	Jan 27, 2026	Unverified	3
Self-Distillation Enables Continual Learning	Jan 27, 2026	Unverified	3
VoXtream: Full-Stream Text-to-Speech with Extremely Low Latency	Jan 26, 2026	Unverified	3
AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security	Jan 26, 2026	Unverified	3
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience	Jan 23, 2026	Unverified	3