The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers


Showing 326–350 of 177,339 papers

Title	Date	Tasks	Status	Hype	Score
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning	May 30, 2025	GPU, Math	Code Available	7	5
Pre^3: Enabling Deterministic Pushdown Automata for Faster Structured LLM Generation	Jun 4, 2025		Code Available	7	5
DeepSeek-VL: Towards Real-World Vision-Language Understanding	Mar 8, 2024	Chatbot, Language Modelling	Code Available	7	5
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability	May 27, 2024	Autonomous Driving, Video Generation	Code Available	7	5
Grants4Companies: Applying Declarative Methods for Recommending and Reasoning About Business Grants in the Austrian Public Administration (System Description)	Jun 21, 2024		Code Available	7	5
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models	Apr 10, 2024	Image to 3D	Code Available	7	5
PEER: Expertizing Domain-Specific Tasks with a Multi-Agent Framework and Tuning Methods	Jul 9, 2024	Information Retrieval, LEMMA	Code Available	7	5
Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering	Jan 16, 2024	Code Generation, Prompt Engineering	Code Available	7	5
Dynamic Evaluation of Large Language Models by Meta Probing Agents	Feb 21, 2024	Data Augmentation	Code Available	7	5
Better Synthetic Data by Retrieving and Transforming Existing Datasets	Apr 22, 2024	Dataset Generation, Diversity	Code Available	7	5
Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation	Mar 22, 2024	Depth Estimation, Surface Normal Estimation	Code Available	7	5
From RAG to Memory: Non-Parametric Continual Learning for Large Language Models	Feb 20, 2025	Continual Learning, Knowledge Graphs	Code Available	7	5
AIOS Compiler: LLM as Interpreter for Natural Language Programming and Flow Programming of AI Agents	May 11, 2024		Code Available	7	5
Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP	Dec 28, 2022	In-Context Learning, Language Modelling	Code Available	7	5
ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness?	Jul 19, 2024	Benchmarking, Code Generation	Code Available	7	5
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models	Aug 9, 2024	Language Modeling, Language Modelling	Code Available	7	5
MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery	Sep 9, 2024	Memorization, Question Answering	Code Available	7	5
PyRIT: A Framework for Security Risk Identification and Red Teaming in Generative AI System	Oct 1, 2024	Red Teaming	Code Available	7	5
AutoTrain: No-code training for state-of-the-art models	Oct 21, 2024	Classification, image-classification	Code Available	7	5
ThunderKittens: Simple, Fast, and Adorable AI Kernels	Oct 27, 2024	GPU, State Space Models	Code Available	7	5
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning	Feb 20, 2025	Math, reinforcement-learning	Code Available	7	5
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity	Mar 20, 2025	Image Generation	Code Available	7	5
A Scalable Approach to Clustering Embedding Projections	Apr 9, 2025	Clustering, Density Estimation	Code Available	7	5
Real-Time Video Generation with Pyramid Attention Broadcast	Aug 22, 2024	Video Generation	Code Available	7	5
Stable Audio Open	Jul 19, 2024	Audio Generation, Text-to-Music Generation	Code Available	7	5