The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers


Showing 1,176–1,200 of 177,339 papers

Title	Date	Tasks	Status	Hype	Score
Deep Lake: a Lakehouse for Deep Learning	Sep 22, 2022	Decision Making, Deep Learning	Code Available	5	5
MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit	Apr 22, 2024	Math	Code Available	5	5
YOLO-RD: Introducing Relevant and Compact Explicit Knowledge to YOLO by Retriever-Dictionary	Oct 20, 2024	object-detection, Object Detection	Code Available	5	5
Efficient Diffusion Model for Image Restoration by Residual Shifting	Mar 12, 2024	Blind Face Restoration, Image Inpainting	Code Available	5	5
τ^2-Bench: Evaluating Conversational Agents in a Dual-Control Environment	Jun 9, 2025	AI Agent	Code Available	5	5
DUSt3R: Geometric 3D Vision Made Easy	Dec 21, 2023	3D Reconstruction, Camera Calibration	Code Available	5	5
Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling	Jan 1, 2024	NeRF	Code Available	5	5
EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine	Jun 21, 2022	MuJoCo, reinforcement-learning	Code Available	5	5
ProPainter: Improving Propagation and Transformer for Video Inpainting	Sep 7, 2023	Optical Flow Estimation, Video Inpainting	Code Available	5	5
MedRAX: Medical Reasoning Agent for Chest X-ray	Feb 4, 2025	AI Agent, Management	Code Available	5	5
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression	Oct 10, 2023	Code Completion, Few-Shot Learning	Code Available	5	5
TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document	Mar 7, 2024	document understanding, Key Information Extraction	Code Available	5	5
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases	Feb 22, 2024		Code Available	5	5
Self-Instruct: Aligning Language Models with Self-Generated Instructions	Dec 20, 2022	Instruction Following, Language Modelling	Code Available	5	5
R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model	Mar 7, 2025	Multimodal Reasoning, reinforcement-learning	Code Available	4	5
Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis	Feb 6, 2025	Speech Synthesis	Code Available	4	5
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications	Jan 11, 2024	image-classification, Image Classification	Code Available	4	5
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation	Jun 6, 2022	Image Segmentation, Instance Segmentation	Code Available	4	5
Baichuan 2: Open Large-scale Language Models	Sep 19, 2023	Feature Engineering, GSM8K	Code Available	4	5
SEED-Story: Multimodal Long Story Generation with Large Language Model	Jul 11, 2024	Image Generation, Language Modeling	Code Available	4	5
VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents	Oct 14, 2024	RAG, Retrieval	Code Available	4	5
Otter: A Multi-Modal Model with In-Context Instruction Tuning	May 5, 2023	GPU, In-Context Learning	Code Available	4	5
Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation	Feb 4, 2025	Benchmarking, Information Retrieval	Code Available	4	5
Safurai 001: New Qualitative Approach for Code LLM Evaluation	Sep 20, 2023	Language Modeling, Language Modelling	Code Available	4	5
The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models	Mar 14, 2022	CPU, Quantization	Code Available	4	5