The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers, 248,104 code links, 4,818 tasks

Papers


Showing 326–350 of 659,983 papers

Title	Date	Tasks	Status	Hype
Pyramidal Flow Matching for Efficient Video Generative Modeling	Oct 8, 2024	GPU, Text-to-Video Generation	Code Available	7
Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis	May 14, 2025	Denoising, Depth Estimation	Code Available	7
TextGrad: Automatic "Differentiation" via Text	Jun 11, 2024	Question Answering, Specificity	Code Available	7
RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning	Apr 24, 2025	Decision Making, Reinforcement Learning (RL)	Code Available	7
Flow-GRPO: Training Flow Matching Models via Online RL	May 8, 2025	Denoising, Diversity	Code Available	7
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning	May 30, 2025	GPU, Math	Code Available	7
Pre^3: Enabling Deterministic Pushdown Automata for Faster Structured LLM Generation	Jun 4, 2025		Code Available	7
DeepSeek-VL: Towards Real-World Vision-Language Understanding	Mar 8, 2024	Chatbot, Language Modelling	Code Available	7
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability	May 27, 2024	Autonomous Driving, Video Generation	Code Available	7
Grants4Companies: Applying Declarative Methods for Recommending and Reasoning About Business Grants in the Austrian Public Administration (System Description)	Jun 21, 2024		Code Available	7
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models	Apr 10, 2024	Image to 3D	Code Available	7
PEER: Expertizing Domain-Specific Tasks with a Multi-Agent Framework and Tuning Methods	Jul 9, 2024	Information Retrieval, LEMMA	Code Available	7
Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering	Jan 16, 2024	Code Generation, Prompt Engineering	Code Available	7
Dynamic Evaluation of Large Language Models by Meta Probing Agents	Feb 21, 2024	Data Augmentation	Code Available	7
Better Synthetic Data by Retrieving and Transforming Existing Datasets	Apr 22, 2024	Dataset Generation, Diversity	Code Available	7
Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation	Mar 22, 2024	Depth Estimation, Surface Normal Estimation	Code Available	7
From RAG to Memory: Non-Parametric Continual Learning for Large Language Models	Feb 20, 2025	Continual Learning, Knowledge Graphs	Code Available	7
AIOS Compiler: LLM as Interpreter for Natural Language Programming and Flow Programming of AI Agents	May 11, 2024		Code Available	7
Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP	Dec 28, 2022	In-Context Learning, Language Modelling	Code Available	7
ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness?	Jul 19, 2024	Benchmarking, Code Generation	Code Available	7
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models	Aug 9, 2024	Language Modeling, Language Modelling	Code Available	7
MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery	Sep 9, 2024	Memorization, Question Answering	Code Available	7
PyRIT: A Framework for Security Risk Identification and Red Teaming in Generative AI System	Oct 1, 2024	Red Teaming	Code Available	7
AutoTrain: No-code training for state-of-the-art models	Oct 21, 2024	Classification, image-classification	Code Available	7
The Road Less Scheduled	May 24, 2024	Scheduling	Code Available	7