The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers, 248,104 code links, 4,818 tasks

Papers

Showing 1276–1300 of 659,983 papers

Title	Date	Tasks	Status	Hype
UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation	Jun 3, 2025	Image Editing	Code Available	4
Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning	Jun 3, 2025	Code Generation, reinforcement-learning	Code Available	4
ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding	Jun 2, 2025	3D Generation, Large Language Model	Code Available	4
RewardBench 2: Advancing Reward Model Evaluation	Jun 2, 2025	Instruction Following, model	Code Available	4
GigaAM: Efficient Self-Supervised Learner for Speech Recognition	Jun 1, 2025	Automatic Speech Recognition, Language Modeling	Code Available	4
AutoSchemaKG: Autonomous Knowledge Graph Construction through Dynamic Schema Induction from Web-Scale Corpora	May 29, 2025	graph construction, Knowledge Graphs	Code Available	4
RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination	May 28, 2025	Neural Rendering	Code Available	4
Skywork Open Reasoner 1 Technical Report	May 28, 2025	Math, Reinforcement Learning (RL)	Code Available	4
OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation	May 26, 2025	Human-Domain Subject-to-Video, Open-Domain Subject-to-Video	Code Available	4
On Path to Multimodal Historical Reasoning: HistBench and HistAgent	May 26, 2025	Optical Character Recognition (OCR)	Code Available	4
ImgEdit: A Unified Image Editing Dataset and Benchmark	May 26, 2025	Image Editing	Code Available	4
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation	May 26, 2025	Question Answering, Synthetic Data Generation	Code Available	4
Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution	May 26, 2025		Code Available	4
DeepInverse: A Python package for solving imaging inverse problems with deep learning	May 26, 2025	Image Reconstruction	Code Available	4
A Survey of LLM DATA	May 24, 2025	Large Language Model, Management	Code Available	4
Partition Generative Modeling: Masked Modeling Without Masks	May 24, 2025	Computational Efficiency, Language Modeling	Code Available	4
LORE: Lagrangian-Optimized Robust Embeddings for Visual Encoders	May 24, 2025	Adversarial Robustness, Out-of-Distribution Generalization	Code Available	4
Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models	May 23, 2025		Code Available	4
Qiskit Machine Learning: an open-source library for quantum machine learning tasks at scale on quantum hardware and classical simulators	May 23, 2025	Quantum Machine Learning	Code Available	4
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning	May 23, 2025	Question Answering, Reinforcement Learning (RL)	Code Available	4
Scaling Up Biomedical Vision-Language Models: Fine-Tuning, Instruction Tuning, and Multi-Modal Learning	May 23, 2025	Decoder, Image Captioning	Code Available	4
SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis	May 22, 2025	Diversity, Information Retrieval	Code Available	4
Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO	May 22, 2025	Domain Generalization, Image Generation	Code Available	4
R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning	May 22, 2025	Memorization, RAG	Code Available	4
lmgame-Bench: How Good are LLMs at Playing Games?	May 21, 2025	Language Modeling, Language Modelling	Code Available	4