The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers, 248,104 code links, 4,818 tasks

Papers


Showing 701–725 of 659,983 papers

Title	Date	Tasks	Status	Hype
MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining	May 12, 2025	Language Modeling, Language Modelling	Code Available	5
UniVLA: Learning to Act Anywhere with Task-centric Latent Actions	May 9, 2025	Robot Manipulation, Vision-Language-Action	Code Available	5
Continuous Thought Machines	May 8, 2025	Computational Efficiency, Question Answering	Code Available	5
Generating Physically Stable and Buildable LEGO Designs from Text	May 8, 2025	3D Generation, Large Language Model	Code Available	5
ZeroSearch: Incentivize the Search Capability of LLMs without Searching	May 7, 2025	Reinforcement Learning (RL), Retrieval	Code Available	5
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation	May 7, 2025	Human-Domain Subject-to-Video, Single-Domain Subject-to-Video	Code Available	5
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities	May 5, 2025	Image Generation, Survey	Code Available	5
DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition	Apr 30, 2025	Automated Theorem Proving, Large Language Model	Code Available	5
WebThinker: Empowering Large Reasoning Models with Deep Research Capability	Apr 30, 2025	Navigate	Code Available	5
Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers	Apr 27, 2025	Hallucination, Question Answering	Code Available	5
Reservoir-enhanced Segment Anything Model for Subsurface Diagnosis	Apr 26, 2025	Anomaly Detection, GPR	Code Available	5
MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention	Apr 22, 2025	GPU	Code Available	5
Reinforcement Learning from Human Feedback	Apr 16, 2025	Math, Philosophy	Code Available	5
InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework	Apr 16, 2025	Image Generation	Code Available	5
Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding	Apr 14, 2025	Question Answering	Code Available	5
Kimi-VL Technical Report	Apr 10, 2025	Long-Context Understanding, Mathematical Reasoning	Code Available	5
M-Prometheus: A Suite of Open Multilingual LLM Judges	Apr 7, 2025	Machine Translation, Model Selection	Code Available	5
The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation	Apr 7, 2025	Inference Optimization, Referring Video Object Segmentation	Code Available	5
PaperBench: Evaluating AI's Ability to Replicate AI Research	Apr 2, 2025		Code Available	5
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation	Apr 2, 2025	Conditional Image Generation, Image Generation	Code Available	5
HDVIO2.0: Wind and Disturbance Estimation with Hybrid Dynamics VIO	Apr 1, 2025	State Estimation	Code Available	5
4th PVUW MeViS 3rd Place Report: Sa2VA	Apr 1, 2025	Language Modeling, Language Modelling	Code Available	5
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness	Mar 27, 2025	Anomaly Detection, Video Generation	Code Available	5
Understanding R1-Zero-Like Training: A Critical Perspective	Mar 26, 2025	Reinforcement Learning (RL)	Code Available	5
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning	Mar 25, 2025	reinforcement-learning, Reinforcement Learning	Code Available	5