The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 701–750 of 659,983 papers

Title	Date	Tasks	Status	Hype
DanceGRPO: Unleashing GRPO on Visual Generation	May 12, 2025	Denoising, reinforcement-learning	Code Available	5
UniVLA: Learning to Act Anywhere with Task-centric Latent Actions	May 9, 2025	Robot Manipulation, Vision-Language-Action	Code Available	5
Generating Physically Stable and Buildable LEGO Designs from Text	May 8, 2025	3D Generation, Large Language Model	Code Available	5
Continuous Thought Machines	May 8, 2025	Computational Efficiency, Question Answering	Code Available	5
ZeroSearch: Incentivize the Search Capability of LLMs without Searching	May 7, 2025	Reinforcement Learning (RL), Retrieval	Code Available	5
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation	May 7, 2025	Human-Domain Subject-to-Video, Single-Domain Subject-to-Video	Code Available	5
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities	May 5, 2025	Image Generation, Survey	Code Available	5
DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition	Apr 30, 2025	Automated Theorem Proving, Large Language Model	Code Available	5
WebThinker: Empowering Large Reasoning Models with Deep Research Capability	Apr 30, 2025	Navigate	Code Available	5
Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers	Apr 27, 2025	Hallucination, Question Answering	Code Available	5
Reservoir-enhanced Segment Anything Model for Subsurface Diagnosis	Apr 26, 2025	Anomaly Detection, GPR	Code Available	5
MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention	Apr 22, 2025	GPU	Code Available	5
InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework	Apr 16, 2025	Image Generation	Code Available	5
Reinforcement Learning from Human Feedback	Apr 16, 2025	Math, Philosophy	Code Available	5
Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding	Apr 14, 2025	Question Answering	Code Available	5
Kimi-VL Technical Report	Apr 10, 2025	Long-Context Understanding, Mathematical Reasoning	Code Available	5
M-Prometheus: A Suite of Open Multilingual LLM Judges	Apr 7, 2025	Machine Translation, Model Selection	Code Available	5
The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation	Apr 7, 2025	Inference Optimization, Referring Video Object Segmentation	Code Available	5
PaperBench: Evaluating AI's Ability to Replicate AI Research	Apr 2, 2025		Code Available	5
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation	Apr 2, 2025	Conditional Image Generation, Image Generation	Code Available	5
HDVIO2.0: Wind and Disturbance Estimation with Hybrid Dynamics VIO	Apr 1, 2025	State Estimation	Code Available	5
4th PVUW MeViS 3rd Place Report: Sa2VA	Apr 1, 2025	Language Modeling, Language Modelling	Code Available	5
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness	Mar 27, 2025	Anomaly Detection, Video Generation	Code Available	5
Understanding R1-Zero-Like Training: A Critical Perspective	Mar 26, 2025	Reinforcement Learning (RL)	Code Available	5
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning	Mar 25, 2025	reinforcement-learning, Reinforcement Learning	Code Available	5
TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools	Mar 14, 2025	AI Agent, Decision Making	Code Available	5
TikZero: Zero-Shot Text-Guided Graphics Program Synthesis	Mar 14, 2025	Program Synthesis	Code Available	5
Transformers without Normalization	Mar 13, 2025	Self-Supervised Learning	Code Available	5
FlowTok: Flowing Seamlessly Across Text and Image Tokens	Mar 13, 2025	Denoising, Image to text	Code Available	5
OminiControl2: Efficient Conditioning for Diffusion Transformers	Mar 11, 2025	Conditional Image Generation, Denoising	Code Available	5
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models	Mar 9, 2025	Math, Multimodal Reasoning	Code Available	5
R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcement Learning	Mar 7, 2025	Emotion Recognition, Language Modeling	Code Available	5
GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control	Mar 5, 2025	Novel View Synthesis, Video Generation	Code Available	5
InspireMusic: Integrating Super Resolution and Large Language Model for High-Fidelity Long-Form Music Generation	Feb 28, 2025	Audio Generation, Form	Code Available	5
Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success	Feb 27, 2025	Action Generation, Chunking	Code Available	5
Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts	Feb 27, 2025	Computational Efficiency, GPU	Code Available	5
UniDepthV2: Universal Monocular Metric Depth Estimation Made Simpler	Feb 27, 2025	Depth Estimation, Monocular Depth Estimation	Code Available	5
NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms	Feb 25, 2025	Language Modeling, Language Modelling	Code Available	5
Fractal Generative Models	Feb 24, 2025	Image Generation	Code Available	5
From System 1 to System 2: A Survey of Reasoning Large Language Models	Feb 24, 2025	Logical Reasoning	Code Available	5
Getting SMARTER for Motion Planning in Autonomous Driving Systems	Feb 20, 2025	Autonomous Driving, Motion Planning	Code Available	5
TrustRAG: An Information Assistant with Retrieval Augmented Generation	Feb 19, 2025	Answer Generation, Chunking	Code Available	5
Magma: A Foundation Model for Multimodal AI Agents	Feb 18, 2025	Autonomous Web Navigation, Image to text	Code Available	5
AIDE: AI-Driven Exploration in the Space of Code	Feb 18, 2025		Code Available	5
SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?	Feb 17, 2025		Code Available	5
On the Computation of the Fisher Information in Continual Learning	Feb 17, 2025	Continual Learning	Code Available	5
Time-series attribution maps with regularized contrastive learning	Feb 17, 2025	Contrastive Learning, Time Series	Code Available	5
Phantom: Subject-consistent video generation via cross-modal alignment	Feb 16, 2025	cross-modal alignment, Human-Domain Subject-to-Video	Code Available	5
The Role of World Models in Shaping Autonomous Driving: A Comprehensive Survey	Feb 14, 2025	Autonomous Driving, Survey	Code Available	5
HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation	Feb 14, 2025	Language Modeling, Language Modelling	Code Available	5