The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers


Showing 801–850 of 659,983 papers

Title	Date	Tasks	Status	Hype
Weakly Supervised Detection of Hallucinations in LLM Activations	Dec 5, 2023	Hallucination, Language Modeling	Code Available	5
Vectorized and performance-portable Quicksort	May 12, 2022	CPU	Code Available	5
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation	Apr 2, 2025	Conditional Image Generation, Image Generation	Code Available	5
ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis	Sep 3, 2024	3D Generation, 3D Reconstruction	Code Available	5
PaSa: An LLM Agent for Comprehensive Academic Paper Search	Jan 17, 2025		Code Available	5
Voyager: An Open-Ended Embodied Agent with Large Language Models	May 25, 2023	Lifelong learning, Minecraft	Code Available	5
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs	Jul 31, 2023	Trajectory Planning, Zero-shot Generalization	Code Available	5
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts	Jun 18, 2024	Language Modeling, Language Modelling	Code Available	5
On the Computation of the Fisher Information in Continual Learning	Feb 17, 2025	Continual Learning	Code Available	5
Direct3D-S2: Gigascale 3D Generation Made Easy with Spatial Sparse Attention	May 23, 2025	3D Generation, 3D geometry	Code Available	5
How NeRFs and 3D Gaussian Splatting are Reshaping SLAM: a Survey	Feb 20, 2024	3DGS, Simultaneous Localization and Mapping	Code Available	5
GRUtopia: Dream General Robots in a City at Scale	Jul 15, 2024	Language Modelling, Large Language Model	Code Available	5
Fractal Generative Models	Feb 24, 2025	Image Generation	Code Available	5
Scaling Up Your Kernels: Large Kernel Design in ConvNets towards Universal Representations	Oct 10, 2024	Time Series Forecasting, Video Recognition	Code Available	5
Factuality Enhanced Language Models for Open-Ended Text Generation	Jun 9, 2022	Misconceptions, Sentence	Code Available	5
Tool Learning with Foundation Models	Apr 17, 2023		Code Available	5
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators	Apr 6, 2024	Chatbot, counterfactual	Code Available	5
Deep Lake: a Lakehouse for Deep Learning	Sep 22, 2022	Decision Making, Deep Learning	Code Available	5
MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit	Apr 22, 2024	Math	Code Available	5
YOLO-RD: Introducing Relevant and Compact Explicit Knowledge to YOLO by Retriever-Dictionary	Oct 20, 2024	object-detection, Object Detection	Code Available	5
Efficient Diffusion Model for Image Restoration by Residual Shifting	Mar 12, 2024	Blind Face Restoration, Image Inpainting	Code Available	5
τ^2-Bench: Evaluating Conversational Agents in a Dual-Control Environment	Jun 9, 2025	AI Agent	Code Available	5
DUSt3R: Geometric 3D Vision Made Easy	Dec 21, 2023	3D Reconstruction, Camera Calibration	Code Available	5
Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling	Jan 1, 2024	NeRF	Code Available	5
EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine	Jun 21, 2022	MuJoCo, reinforcement-learning	Code Available	5
ProPainter: Improving Propagation and Transformer for Video Inpainting	Sep 7, 2023	Optical Flow Estimation, Video Inpainting	Code Available	5
MedRAX: Medical Reasoning Agent for Chest X-ray	Feb 4, 2025	AI Agent, Management	Code Available	5
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression	Oct 10, 2023	Code Completion, Few-Shot Learning	Code Available	5
TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document	Mar 7, 2024	document understanding, Key Information Extraction	Code Available	5
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases	Feb 22, 2024		Code Available	5
Self-Instruct: Aligning Language Models with Self-Generated Instructions	Dec 20, 2022	Instruction Following, Language Modelling	Code Available	5
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning	Oct 3, 2024	Efficient Exploration, Mathematical Problem-Solving	Code Available	5
ASAP: Aligning Simulation and Real-World Physics for Learning Agile Humanoid Whole-Body Skills	Feb 3, 2025		Code Available	5
MambaIRv2: Attentive State Space Restoration	Nov 22, 2024	Computational Efficiency, Image Restoration	Code Available	5
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue	Feb 8, 2024	Conversational Web Navigation, Text Generation	Code Available	5
VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models	Nov 20, 2024	Benchmarking, Image Generation	Code Available	5
Trust Regions for Explanations via Black-Box Probabilistic Certification	Feb 17, 2024		Code Available	5
MEIA: Multimodal Embodied Perception and Interaction in Unknown Environments	Feb 1, 2024	Embodied Question Answering, Language Modeling	Code Available	5
EasyPhoto: Your Smart AI Photo Generator	Oct 7, 2023		Code Available	5
Language Agents as Optimizable Graphs	Feb 26, 2024	Prompt Engineering	Code Available	5
Data-Juicer: A One-Stop Data Processing System for Large Language Models	Sep 5, 2023	Distributed Computing	Code Available	5
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively	Jan 5, 2024	image-classification, Image Classification	Code Available	5
Common 7B Language Models Already Possess Strong Math Capabilities	Mar 7, 2024	GSM8K, Math	Code Available	5
Fast On-device LLM Inference with NPUs	Jul 8, 2024	CPU, GPU	Code Available	5
VideoCrafter1: Open Diffusion Models for High-Quality Video Generation	Oct 30, 2023	Text-to-Video Generation, Video Generation	Code Available	5
Efficient Multimodal Learning from Data-centric Perspective	Feb 18, 2024	Image Classification, Referring Expression Comprehension	Code Available	5
RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation	Aug 15, 2024	Diagnostic, RAG	Code Available	5
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference	Dec 18, 2024	Decoder, Retrieval	Code Available	5
StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning	Jun 5, 2024	Automatic Speech Recognition (ASR), de-en	Code Available	5
A ConvNet for the 2020s	Jan 10, 2022	Classification, Domain Generalization	Code Available	5