The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers, 248,104 code links, 4,818 tasks

Papers

Showing 1001–1050 of 659983 papers

Title	Date	Tasks	Status	Hype
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models	Jun 5, 2025	Reranking, Retrieval	Code Available	5
SoundMind: RL-Incentivized Logic Reasoning for Audio-Language Models	Jun 15, 2025	Logical Reasoning, Reinforcement Learning (RL)	Code Available	5
Matrix-Game: Interactive World Foundation Model	Jun 23, 2025	Minecraft, model	Code Available	5
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models	Aug 21, 2024	GPU, Quantization	Code Available	5
MambaIR: A Simple Baseline for Image Restoration with State-Space Model	Feb 23, 2024	Image Restoration, Image Super-Resolution	Code Available	5
DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving	Nov 22, 2024	Autonomous Driving, Denoising	Code Available	5
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos	Jan 7, 2025	2k, Language Modeling	Code Available	5
TS3-Codec: Transformer-Based Simple Streaming Single Codec	Nov 27, 2024	Audio Compression	Code Available	5
MOSPAT: AutoML based Model Selection and Parameter Tuning for Time Series Anomaly Detection	May 24, 2022	Anomaly Detection, AutoML	Code Available	5
On the reusability of samples in active learning	Jun 13, 2022	Active Learning	Code Available	5
Robust Multivariate Time-Series Forecasting: Adversarial Attacks and Defense Mechanisms	Jul 19, 2022	Adversarial Attack, Multivariate Time Series Forecasting	Code Available	5
AlphaPose: Whole-Body Regional Multi-Person Pose Estimation and Tracking in Real-Time	Nov 7, 2022	Knowledge Distillation, Multi-Person Pose Estimation	Code Available	5
DepthSplat: Connecting Gaussian Splatting and Depth	Oct 17, 2024	Depth Estimation, Novel View Synthesis	Code Available	5
BLAST: Balanced Sampling Time Series Corpus for Universal Forecasting Models	May 23, 2025	Diversity, Time Series	Code Available	5
Does 'Deep Learning on a Data Diet' reproduce? Overall yes, but GraNd at Initialization does not	Mar 26, 2023		Code Available	5
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment	Apr 13, 2023	Ethics	Code Available	5
Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success	Feb 27, 2025	Action Generation, Chunking	Code Available	5
Infinite Photorealistic Worlds using Procedural Generation	Jun 15, 2023	3D Reconstruction, object-detection	Code Available	5
ChatGPT MT: Competitive for High- (but not Low-) Resource Languages	Sep 14, 2023	Machine Translation	Code Available	5
YOLOR-Based Multi-Task Learning	Sep 29, 2023	Image Captioning, Instance Segmentation	Code Available	5
CacheGen: KV Cache Compression and Streaming for Fast Large Language Model Serving	Oct 11, 2023	Language Modeling, Language Modelling	Code Available	5
InstructPix2Pix: Learning to Follow Image Editing Instructions	Nov 17, 2022	Image Editing	Code Available	5
LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training	Jun 24, 2024	Mixture-of-Experts	Code Available	5
Human Gaussian Splatting: Real-time Rendering of Animatable Avatars	Nov 28, 2023		Code Available	5
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models	Jan 2, 2024		Code Available	5
GaussianObject: High-Quality 3D Object Reconstruction from Four Views with Gaussian Splatting	Feb 15, 2024	3D Object Reconstruction, Neural Rendering	Code Available	5
3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations	Mar 6, 2024	Imitation Learning, Robot Manipulation	Code Available	5
MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images	Mar 21, 2024	3D Reconstruction, Generalizable Novel View Synthesis	Code Available	5
Lean Copilot: Large Language Models as Copilots for Theorem Proving in Lean	Apr 18, 2024	Automated Theorem Proving, Hallucination	Code Available	5
Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation	May 2, 2024	MuJoCo, Reinforcement Learning (RL)	Code Available	5
Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation	May 31, 2024	MuJoCo, reinforcement-learning	Code Available	5
The Vizier Gaussian Process Bandit Algorithm	Aug 21, 2024	Bayesian Optimization	Code Available	5
Fundamental Components of Deep Learning: A category-theoretic approach	Mar 13, 2024	Deep Learning, Descriptive	Code Available	5
Magma: A Foundation Model for Multimodal AI Agents	Feb 18, 2025	Autonomous Web Navigation, Image to text	Code Available	5
LiveBench: A Challenging, Contamination-Limited LLM Benchmark	Jun 27, 2024	Articles, Instruction Following	Code Available	5
FuXi-2.0: Advancing machine learning weather forecasting model for practical applications	Sep 11, 2024	Weather Forecasting	Code Available	5
Retinexformer: One-stage Retinex-based Transformer for Low-light Image Enhancement	Mar 12, 2023	Image Enhancement, Low-light Image Deblurring and Enhancement	Code Available	5
Neural Fields in Robotics: A Survey	Oct 26, 2024	3D Reconstruction, Autonomous Driving	Code Available	5
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs	Dec 25, 2024	Reinforcement Learning (RL)	Code Available	5
SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?	Feb 17, 2025		Code Available	5
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models	Feb 10, 2025	3D Generation, 3D Reconstruction	Code Available	5
TikZero: Zero-Shot Text-Guided Graphics Program Synthesis	Mar 14, 2025	Program Synthesis	Code Available	5
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness	Mar 27, 2025	Anomaly Detection, Video Generation	Code Available	5
ZeroSearch: Incentivize the Search Capability of LLMs without Searching	May 7, 2025	Reinforcement Learning (RL), Retrieval	Code Available	5
Show-o2: Improved Native Unified Multimodal Models	Jun 18, 2025	Language Modeling, Language Modelling	Code Available	5
REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards	May 30, 2025	reinforcement-learning, Reinforcement Learning	Code Available	5
DoWhy-GCM: An extension of DoWhy for causal inference in graphical causal models	Jun 14, 2022	Causal Inference	Code Available	5
VADv2: End-to-End Vectorized Autonomous Driving via Probabilistic Planning	Feb 20, 2024	Autonomous Driving, NavSim	Code Available	5
Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral	Mar 4, 2024	Language Modeling, Language Modelling	Code Available	5
Penzai + Treescope: A Toolkit for Interpreting, Visualizing, and Editing Models As Data	Aug 1, 2024		Code Available	5