The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 101–125 of 474,278 papers

Title	Date	Tasks	Status	Hype
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering	May 6, 2024	Bug fixing, Language Modeling	Code Available	11
HybridFlow: A Flexible and Efficient RLHF Framework	Sep 28, 2024	Large Language Model	Code Available	11
PaperBanana: Automating Academic Illustration for AI Scientists	Jan 30, 2026		Unverified	9
Qwen3-TTS Technical Report	Jan 22, 2026		Unverified	9
RWKV-7 "Goose" with Expressive Dynamic State Evolution	Mar 18, 2025	In-Context Learning, Language Modeling	Code Available	9
MuseTalk: Real-Time High-Fidelity Video Dubbing via Spatio-Temporal Sampling	Oct 14, 2024	Audio-Visual Synchronization, GPU	Code Available	9
Contextual Augmented Multi-Model Programming (CAMP): A Hybrid Local-Cloud Copilot Framework	Oct 20, 2024	Code Completion, RAG	Code Available	9
OpenELM: An Efficient Language Model Family with Open Training and Inference Framework	Apr 22, 2024	Language Modeling, Language Modelling	Code Available	9
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer	Sep 1, 2024	Self-Supervised Learning, text-to-speech	Code Available	9
LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection	Jun 5, 2024	Decoder, object-detection	Code Available	9
FinRobot: AI Agent for Equity Research and Valuation with Large Language Models	Nov 13, 2024	AI Agent	Code Available	9
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on	Mar 4, 2024	Denoising, Image Generation	Code Available	9
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer	Oct 14, 2024	Image Generation, Image Reconstruction	Code Available	9
StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models	Mar 12, 2024	Benchmarking	Code Available	9
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second	Oct 2, 2024	Depth Estimation, GPU	Code Available	9
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction	Apr 3, 2024	Image Generation, Image Reconstruction	Code Available	9
Moshi: a speech-text foundation model for real-time dialogue	Sep 17, 2024	Action Detection, Activity Detection	Code Available	9
Sapiens: Foundation for Human Vision Models	Aug 22, 2024	2D Human Pose Estimation, 2D Pose Estimation	Code Available	9
SkyReels-V2: Infinite-length Film Generative Model	Apr 17, 2025	Large Language Model, model	Code Available	9
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models	Feb 5, 2024	Arithmetic Reasoning, Math	Code Available	9
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism	Jan 5, 2024		Code Available	9
SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer	Jan 30, 2025	Image Generation, Model Compression	Code Available	9
AutoAgent: A Fully-Automated and Zero-Code Framework for LLM Agents	Feb 9, 2025	Large Language Model, RAG	Code Available	9
TripoSR: Fast 3D Object Reconstruction from a Single Image	Mar 4, 2024	3D Generation, 3D Object Reconstruction	Code Available	9
MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm	Jun 5, 2025	GPU, Relation	Code Available	9