The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 5,276–5,300 of 661,570 papers

Title	Date	Tasks	Status	Hype
A Lightweight Hybrid Dual Channel Speech Enhancement System under Low-SNR Conditions	May 26, 2025	Speech Enhancement	Code Available	2
The Missing Point in Vision Transformers for Universal Image Segmentation	May 26, 2025	Image Segmentation, Instance Segmentation	Code Available	2
SAEs Are Good for Steering -- If You Select the Right Features	May 26, 2025		Code Available	2
SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond	May 26, 2025	Logical Reasoning, Reinforcement Learning (RL)	Code Available	2
One-shot Entropy Minimization	May 26, 2025	reinforcement-learning, Reinforcement Learning	Code Available	2
WINA: Weight Informed Neuron Activation for Accelerating Large Language Model Inference	May 26, 2025	Language Modeling, Language Modelling	Code Available	2
Training-Free Multi-Step Audio Source Separation	May 26, 2025	Audio Source Separation, Denoising	Code Available	2
Shifting AI Efficiency From Model-Centric to Data-Centric Compression	May 25, 2025	Position	Code Available	2
I2MoE: Interpretable Multimodal Interaction-aware Mixture-of-Experts	May 25, 2025	Mixture-of-Experts, multimodal interaction	Code Available	2
Jodi: Unification of Visual Generation and Understanding via Joint Modeling	May 25, 2025		Code Available	2
VPGS-SLAM: Voxel-based Progressive 3D Gaussian SLAM in Large-Scale Scenes	May 25, 2025	3DGS	Code Available	2
VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use	May 25, 2025	Multimodal Reasoning, Question Answering	Code Available	2
MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems	May 25, 2025		Code Available	2
Benchmarking Laparoscopic Surgical Image Restoration and Beyond	May 25, 2025	Benchmarking, Image Restoration	Code Available	2
LiteCUA: Computer as MCP Server for Computer-Use Agent on AIOS	May 24, 2025		Code Available	2
Using Large Language Models to Tackle Fundamental Challenges in Graph Learning: A Comprehensive Survey	May 24, 2025	Graph Learning	Code Available	2
Improved Immiscible Diffusion: Accelerate Diffusion Training by Reducing Its Miscibility	May 24, 2025	Denoising	Code Available	2
Spiking Transformers Need High Frequency Information	May 24, 2025	Avg	Code Available	2
CRMArena-Pro: Holistic Assessment of LLM Agents Across Diverse Business Scenarios and Interactions	May 24, 2025	Benchmarking	Code Available	2
Geometry Aware Operator Transformer as an Efficient and Accurate Neural Surrogate for PDEs on Arbitrary Domains	May 24, 2025	Computational Efficiency, Operator learning	Code Available	2
DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding	May 23, 2025	Language Modeling, Language Modelling	Code Available	2
VeriThinker: Learning to Verify Makes Reasoning Model Efficient	May 23, 2025	model	Code Available	2
MetaBox-v2: A Unified Benchmark Platform for Meta-Black-Box Optimization	May 23, 2025	Meta-Learning	Code Available	2
Managing FAIR Knowledge Graphs as Polyglot Data End Points: A Benchmark based on the rdf2pg Framework and Plant Biology Data	May 23, 2025	Knowledge Graphs, Management	Code Available	2
TopoPoint: Enhance Topology Reasoning via Endpoint Detection in Autonomous Driving	May 23, 2025	Autonomous Driving, Lane Detection	Code Available	2