The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 801–825 of 659,983 papers

Title	Date	Tasks	Status	Hype
Weakly Supervised Detection of Hallucinations in LLM Activations	Dec 5, 2023	Hallucination, Language Modeling	Code Available	5
Vectorized and performance-portable Quicksort	May 12, 2022	CPU	Code Available	5
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation	Apr 2, 2025	Conditional Image Generation, Image Generation	Code Available	5
ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis	Sep 3, 2024	3D Generation, 3D Reconstruction	Code Available	5
PaSa: An LLM Agent for Comprehensive Academic Paper Search	Jan 17, 2025		Code Available	5
Voyager: An Open-Ended Embodied Agent with Large Language Models	May 25, 2023	Lifelong learning, Minecraft	Code Available	5
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs	Jul 31, 2023	Trajectory Planning, Zero-shot Generalization	Code Available	5
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts	Jun 18, 2024	Language Modeling, Language Modelling	Code Available	5
On the Computation of the Fisher Information in Continual Learning	Feb 17, 2025	Continual Learning	Code Available	5
Direct3D-S2: Gigascale 3D Generation Made Easy with Spatial Sparse Attention	May 23, 2025	3D Generation, 3D geometry	Code Available	5
How NeRFs and 3D Gaussian Splatting are Reshaping SLAM: a Survey	Feb 20, 2024	3DGS, Simultaneous Localization and Mapping	Code Available	5
GRUtopia: Dream General Robots in a City at Scale	Jul 15, 2024	Language Modelling, Large Language Model	Code Available	5
Fractal Generative Models	Feb 24, 2025	Image Generation	Code Available	5
Scaling Up Your Kernels: Large Kernel Design in ConvNets towards Universal Representations	Oct 10, 2024	Time Series Forecasting, Video Recognition	Code Available	5
Factuality Enhanced Language Models for Open-Ended Text Generation	Jun 9, 2022	Misconceptions, Sentence	Code Available	5
Tool Learning with Foundation Models	Apr 17, 2023		Code Available	5
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators	Apr 6, 2024	Chatbot, counterfactual	Code Available	5
Deep Lake: a Lakehouse for Deep Learning	Sep 22, 2022	Decision Making, Deep Learning	Code Available	5
MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit	Apr 22, 2024	Math	Code Available	5
YOLO-RD: Introducing Relevant and Compact Explicit Knowledge to YOLO by Retriever-Dictionary	Oct 20, 2024	object-detection, Object Detection	Code Available	5
Efficient Diffusion Model for Image Restoration by Residual Shifting	Mar 12, 2024	Blind Face Restoration, Image Inpainting	Code Available	5
τ^2-Bench: Evaluating Conversational Agents in a Dual-Control Environment	Jun 9, 2025	AI Agent	Code Available	5
DUSt3R: Geometric 3D Vision Made Easy	Dec 21, 2023	3D Reconstruction, Camera Calibration	Code Available	5
Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling	Jan 1, 2024	NeRF	Code Available	5
EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine	Jun 21, 2022	MuJoCo, reinforcement-learning	Code Available	5