The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Title	Date	Tasks	Status	Hype
DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization	May 18, 2025	Mathematical Reasoning	Code Available	2
HISTAI: An Open-Source, Large-Scale Whole Slide Image Dataset for Computational Pathology	May 17, 2025	Diagnostic, Diversity	Code Available	2
AI-Driven Automation Can Become the Foundation of Next-Era Science of Science Research	May 17, 2025	scientific discovery	Code Available	2
Safe Delta: Consistently Preserving Safety when Fine-Tuning LLMs on Diverse Datasets	May 17, 2025		Code Available	2
DraftAttention: Fast Video Diffusion via Low-Resolution Attention Guidance	May 17, 2025	Video Generation	Code Available	2
Demystifying and Enhancing the Efficiency of Large Language Model Based Search Agents	May 17, 2025	Language Modeling, Language Modelling	Code Available	2
LifelongAgentBench: Evaluating LLM Agents as Lifelong Learners	May 17, 2025	Language Modeling, Language Modelling	Code Available	2
Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner	May 16, 2025	Cross-Modal Retrieval, Diagnostic	Code Available	2
DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling	May 16, 2025	Attribute	Code Available	2
Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation	May 16, 2025	3D geometry, Navigate	Code Available	2
Mergenetic: a Simple Evolutionary Model Merging Library	May 16, 2025	Evolutionary Algorithms, model	Code Available	2
Search and Refine During Think: Autonomous Retrieval-Augmented Reasoning of LLMs	May 16, 2025	Retrieval	Code Available	2
Think Twice Before You Act: Enhancing Agent Behavioral Safety with Thought Correction	May 16, 2025	Contrastive Learning, Safety Alignment	Code Available	2
Relational Graph Transformer	May 16, 2025	Graph Neural Network	Code Available	2
ForensicHub: A Unified Benchmark & Codebase for All-Domain Fake Image Detection and Localization	May 16, 2025	All, DeepFake Detection	Code Available	2
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning	May 16, 2025	Data Augmentation	Code Available	2
DexGarmentLab: Dexterous Garment Manipulation Environment with Generalizable Policy	May 16, 2025	Reinforcement Learning (RL)	Code Available	2
SoftCoT++: Test-Time Scaling with Soft Chain-of-Thought Reasoning	May 16, 2025	Contrastive Learning	Code Available	2
A Tutorial on Structural Identifiability of Epidemic Models Using StructuralIdentifiability.jl	May 15, 2025	parameter estimation	Code Available	2
PnPXAI: A Universal XAI Framework Providing Automatic Explanations Across Diverse Modalities and Models	May 15, 2025	Explainable artificial intelligence, Explainable Artificial Intelligence (XAI)	Code Available	2
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models	May 15, 2025	Math, reinforcement-learning	Code Available	2
Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis	May 15, 2025	Image Generation, Text to Image Generation	Code Available	2
MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly	May 15, 2025	8k, Benchmarking	Code Available	2
VRSplat: Fast and Robust Gaussian Splatting for Virtual Reality	May 15, 2025	3DGS, GPU	Code Available	2
AdaptCLIP: Adapting CLIP for Universal Visual Anomaly Detection	May 15, 2025	Anomaly Detection	Code Available	2