The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 276–300 of 659,983 papers

Title	Date	Tasks	Status	Hype
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning	Apr 24, 2025	Code Generation	Code Available	7
RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning	Apr 24, 2025	Decision Making, Reinforcement Learning (RL)	Code Available	7
Step1X-Edit: A Practical Framework for General Image Editing	Apr 24, 2025	Image Editing	Code Available	7
Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning	Apr 23, 2025	Multimodal Reasoning, reinforcement-learning	Code Available	7
TTRL: Test-Time Reinforcement Learning	Apr 22, 2025	Math, reinforcement-learning	Code Available	7
Chinese-Vicuna: A Chinese Instruction-following Llama-based Model	Apr 17, 2025	Code Generation, CPU	Code Available	7
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding	Apr 17, 2025	Video Question Answering, Video Understanding	Code Available	7
BrowseComp: A Simple Yet Challenging Benchmark for Browsing Agents	Apr 16, 2025		Code Available	7
Aligning Anime Video Generation with Human Feedback	Apr 14, 2025	Video Generation	Code Available	7
The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search	Apr 10, 2025	scientific discovery	Code Available	7
A Scalable Approach to Clustering Embedding Projections	Apr 9, 2025	Clustering, Density Estimation	Code Available	7
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought	Apr 8, 2025	Language Modeling, Language Modelling	Code Available	7
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems	Mar 31, 2025	AutoML, Continual Learning	Code Available	7
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model	Mar 31, 2025		Code Available	7
Large Language Model Agent: A Survey on Methodology, Applications and Challenges	Mar 27, 2025	Language Modeling, Language Modelling	Code Available	7
Qwen2.5-Omni Technical Report	Mar 26, 2025	Automatic Speech Recognition (ASR), GSM8K	Code Available	7
Open Deep Search: Democratizing Search with Open-source Reasoning Agents	Mar 26, 2025	10-shot image generation	Code Available	7
Bridging Evolutionary Multiobjective Optimization and GPU Acceleration via Tensorization	Mar 26, 2025	CPU, GPU	Code Available	7
Scaling Vision Pre-Training to 4K Resolution	Mar 25, 2025	4k, Contrastive Learning	Code Available	7
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild	Mar 24, 2025	Instruction Following, Math	Code Available	7
Enhancing Fourier Neural Operators with Local Spatial Features	Mar 22, 2025	Computational Efficiency	Code Available	7
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity	Mar 20, 2025	Image Generation	Code Available	7
xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference	Mar 17, 2025	Mamba, Math	Code Available	7
LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds	Mar 13, 2025	3D Human Reconstruction	Code Available	7
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning	Mar 12, 2025	Question Answering, RAG	Code Available	7