The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers, 248,104 code links, 4,818 tasks

Papers


Showing 301–325 of 177,339 papers

Title	Date	Tasks	Status	Hype	Score
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding	Apr 17, 2025	Video Question Answering, Video Understanding	Code Available	7	5
Tulu 3: Pushing Frontiers in Open Language Model Post-Training	Nov 22, 2024	Language Modeling, Language Modelling	Code Available	7	5
Measuring Massive Multitask Chinese Understanding	Apr 25, 2023	All	Code Available	7	5
YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors	Jul 6, 2022	2D Object Detection, GPU	Code Available	7	5
FoundationStereo: Zero-Shot Stereo Matching	Jan 17, 2025	Depth Estimation, Diversity	Code Available	7	5
Mirage: A Multi-Level Superoptimizer for Tensor Programs	May 9, 2024	GPU, Navigate	Code Available	7	5
TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables	Feb 29, 2024	Time Series, Time Series Forecasting	Code Available	7	5
Visual Agentic Reinforcement Fine-Tuning	May 20, 2025	Image Manipulation	Code Available	7	5
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection	May 16, 2024	Edge-computing, Few-Shot Object Detection	Code Available	7	5
Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback	Dec 20, 2024	All, Instruction Following	Code Available	7	5
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models	Jul 10, 2024	Video Question Answering, Zero-Shot Video Question Answer	Code Available	7	5
Measuring short-form factuality in large language models	Nov 7, 2024	Form	Code Available	7	5
RedPajama: an Open Dataset for Training Large Language Models	Nov 19, 2024		Code Available	7	5
Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library	Jun 6, 2025	Management	Code Available	7	5
BrowseComp: A Simple Yet Challenging Benchmark for Browsing Agents	Apr 16, 2025		Code Available	7	5
Easy Begun is Half Done: Spatial-Temporal Graph Modeling with ST-Curriculum Dropout	Nov 28, 2022		Code Available	7	5
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning	Apr 24, 2025	Code Generation	Code Available	7	5
On the Vulnerability of LLM/VLM-Controlled Robotics	Feb 15, 2024	Language Modelling, Robot Manipulation	Code Available	7	5
Grounding Image Matching in 3D with MASt3R	Jun 14, 2024	3D Reconstruction	Code Available	7	5
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides	Jan 7, 2025		Code Available	7	5
VACE: All-in-One Video Creation and Editing	Mar 10, 2025	All, Human-Domain Subject-to-Video	Code Available	7	5
Revisiting PCA for time series reduction in temporal dimension	Dec 27, 2024	Computational Efficiency, Dimensionality Reduction	Code Available	7	5
Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis	May 14, 2025	Denoising, Depth Estimation	Code Available	7	5
RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning	Apr 24, 2025	Decision Making, Reinforcement Learning (RL)	Code Available	7	5
Flow-GRPO: Training Flow Matching Models via Online RL	May 8, 2025	Denoising, Diversity	Code Available	7	5