The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 351–400 of 474,278 papers

Title	Date	Tasks	Status	Hype
VACE: All-in-One Video Creation and Editing	Mar 10, 2025	All, Human-Domain Subject-to-Video	Code Available	7
Revisiting PCA for time series reduction in temporal dimension	Dec 27, 2024	Computational Efficiency, Dimensionality Reduction	Code Available	7
Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis	May 14, 2025	Denoising, Depth Estimation	Code Available	7
RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning	Apr 24, 2025	Decision Making, Reinforcement Learning (RL)	Code Available	7
Flow-GRPO: Training Flow Matching Models via Online RL	May 8, 2025	Denoising, Diversity	Code Available	7
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning	May 30, 2025	GPU, Math	Code Available	7
Pre^3: Enabling Deterministic Pushdown Automata for Faster Structured LLM Generation	Jun 4, 2025		Code Available	7
DeepSeek-VL: Towards Real-World Vision-Language Understanding	Mar 8, 2024	Chatbot, Language Modelling	Code Available	7
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability	May 27, 2024	Autonomous Driving, Video Generation	Code Available	7
Grants4Companies: Applying Declarative Methods for Recommending and Reasoning About Business Grants in the Austrian Public Administration (System Description)	Jun 21, 2024		Code Available	7
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models	Apr 10, 2024	Image to 3D	Code Available	7
PEER: Expertizing Domain-Specific Tasks with a Multi-Agent Framework and Tuning Methods	Jul 9, 2024	Information Retrieval, LEMMA	Code Available	7
Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering	Jan 16, 2024	Code Generation, Prompt Engineering	Code Available	7
Dynamic Evaluation of Large Language Models by Meta Probing Agents	Feb 21, 2024	Data Augmentation	Code Available	7
Better Synthetic Data by Retrieving and Transforming Existing Datasets	Apr 22, 2024	Dataset Generation, Diversity	Code Available	7
Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation	Mar 22, 2024	Depth Estimation, Surface Normal Estimation	Code Available	7
From RAG to Memory: Non-Parametric Continual Learning for Large Language Models	Feb 20, 2025	Continual Learning, Knowledge Graphs	Code Available	7
AIOS Compiler: LLM as Interpreter for Natural Language Programming and Flow Programming of AI Agents	May 11, 2024		Code Available	7
Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP	Dec 28, 2022	In-Context Learning, Language Modelling	Code Available	7
ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness?	Jul 19, 2024	Benchmarking, Code Generation	Code Available	7
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models	Aug 9, 2024	Language Modeling, Language Modelling	Code Available	7
MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery	Sep 9, 2024	Memorization, Question Answering	Code Available	7
PyRIT: A Framework for Security Risk Identification and Red Teaming in Generative AI System	Oct 1, 2024	Red Teaming	Code Available	7
AutoTrain: No-code training for state-of-the-art models	Oct 21, 2024	Classification, image-classification	Code Available	7
ThunderKittens: Simple, Fast, and Adorable AI Kernels	Oct 27, 2024	GPU, State Space Models	Code Available	7
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning	Feb 20, 2025	Math, reinforcement-learning	Code Available	7
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity	Mar 20, 2025	Image Generation	Code Available	7
A Scalable Approach to Clustering Embedding Projections	Apr 9, 2025	Clustering, Density Estimation	Code Available	7
Real-Time Video Generation with Pyramid Attention Broadcast	Aug 22, 2024	Video Generation	Code Available	7
Stable Audio Open	Jul 19, 2024	Audio Generation, Text-to-Music Generation	Code Available	7
OpenThoughts: Data Recipes for Reasoning Models	Jun 4, 2025	Math	Code Available	7
Training AI to be Loyal	Jan 27, 2025		Code Available	7
CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models	Apr 19, 2024		Code Available	7
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers	May 27, 2025		Code Available	7
MoBA: Mixture of Block Attention for Long-Context LLMs	Feb 18, 2025	Mixture-of-Experts	Code Available	7
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?	Nov 25, 2024	Hallucination, Knowledge Distillation	Code Available	7
D-FINE: Redefine Regression Task in DETRs as Fine-grained Distribution Refinement	Oct 17, 2024	GPU, Real-Time Object Detection	Code Available	7
pySLAM: An Open-Source, Modular, and Extensible Framework for SLAM	Feb 17, 2025	Depth Estimation, Depth Prediction	Code Available	7
Exploring Compressed Image Representation as a Perceptual Proxy: A Study	Jan 14, 2024	Image Compression, Perceptual Distance	Code Available	7
Practical Efficiency of Muon for Pretraining	May 4, 2025		Code Available	7
Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models	May 6, 2023	Math	Code Available	7
Low-code LLM: Graphical User Interface over Large Language Models	Apr 17, 2023	Prompt Engineering	Code Available	7
O1 Replication Journey: A Strategic Progress Report -- Part 1	Oct 8, 2024	Math, scientific discovery	Code Available	7
Large Concept Models: Language Modeling in a Sentence Representation Space	Dec 11, 2024	Language Modeling, Language Modelling	Code Available	7
HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance	Jan 16, 2024	In-Context Learning	Code Available	7
3DGUT: Enabling Distorted Cameras and Secondary Rays in Gaussian Splatting	Dec 17, 2024	3DGS, Novel View Synthesis	Code Available	7
Scalable MatMul-free Language Modeling	Jun 4, 2024	GPU, Language Modeling	Code Available	7
Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving	Jun 24, 2024	CPU, GPU	Code Available	7
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models	Jun 4, 2024	In-Context Learning, Language Modelling	Code Available	7
MMSU: A Massive Multi-task Spoken Language Understanding and Reasoning Benchmark	Jun 5, 2025	Rhythm, Spoken Language Understanding	Code Available	7