The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers


Showing 5501–5525 of 474278 papers

Title	Date	Tasks	Status	Hype
Dynamic Early Exit in Reasoning Models	Apr 22, 2025	GSM8K, Math	Code Available	2
Text-based Animatable 3D Avatars with Morphable Model Alignment	Apr 22, 2025	3D Generation, 3DGS	Code Available	2
WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks	Apr 22, 2025	Benchmarking	Code Available	2
WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents	Apr 22, 2025	Knowledge Graphs, Minecraft	Code Available	2
Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning	Apr 21, 2025	All, Form	Code Available	2
MARFT: Multi-Agent Reinforcement Fine-Tuning	Apr 21, 2025		Code Available	2
Vision6D: 3D-to-2D Interactive Visualization and Annotation Tool for 6D Pose Estimation	Apr 21, 2025	6D Pose Estimation, Pose Estimation	Code Available	2
Learning Adaptive Parallel Reasoning with Language Models	Apr 21, 2025	4k	Code Available	2
FlowReasoner: Reinforcing Query-Level Meta-Agents	Apr 21, 2025	Reinforcement Learning (RL)	Code Available	2
Retrieval Augmented Generation Evaluation in the Era of Large Language Models: A Comprehensive Survey	Apr 21, 2025	Computational Efficiency, Information Retrieval	Code Available	2
DyFo: A Training-Free Dynamic Focus Visual Search for Enhancing LMMs in Fine-Grained Visual Understanding	Apr 21, 2025	Hallucination	Code Available	2
Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction	Apr 21, 2025	Math	Code Available	2
Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs	Apr 21, 2025	Attribute, Camera Pose Estimation	Code Available	2
Generative Auto-Bidding with Value-Guided Explorations	Apr 20, 2025	Reinforcement Learning (RL)	Code Available	2
NTIRE 2025 Challenge on Image Super-Resolution (4): Methods and Results	Apr 20, 2025	Image Super-Resolution, Super-Resolution	Code Available	2
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning	Apr 20, 2025	Attribute, Face Swapping	Code Available	2
SG-Reg: Generalizable and Efficient Scene Graph Registration	Apr 20, 2025	GPU	Code Available	2
Seurat: From Moving Points to Depth	Apr 20, 2025	Depth Estimation, Point Tracking	Code Available	2
SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation	Apr 19, 2025	ERP, Video Generation	Code Available	2
Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale	Apr 19, 2025	Benchmarking	Code Available	2
InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners	Apr 19, 2025	Action Generation, Logical Reasoning	Code Available	2
CLIP-Powered Domain Generalization and Domain Adaptation: A Comprehensive Survey	Apr 19, 2025	Computational Efficiency, Domain Adaptation	Code Available	2
LangCoop: Collaborative Driving with Language	Apr 18, 2025	Autonomous Driving	Code Available	2
EyecareGPT: Boosting Comprehensive Ophthalmology Understanding with Tailored Dataset, Benchmark and Model	Apr 18, 2025	Diagnostic	Code Available	2
TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials	Apr 17, 2025	Articles	Code Available	2