The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 7876–7900 of 474,278 papers

Title	Date	Status
Fast PINN Eigensolvers via Biconvex Reformulation	Nov 2, 2025	Code Available
CodeClash: Benchmarking Goal-Oriented Software Engineering	Nov 2, 2025	Unverified
The Biased Oracle: Assessing LLMs' Understandability and Empathy in Medical Diagnoses	Nov 2, 2025	Code Available
Music Arena: Live Evaluation for Text-to-Music	Nov 2, 2025	Unverified
Continual Learning, Not Training: Online Adaptation For Agents	Nov 2, 2025	Unverified
GeoToken: Hierarchical Geolocalization of Images via Next Token Prediction	Nov 2, 2025	Code Available
A Survey of Reasoning and Agentic Systems in Time Series with Large Language Models	Nov 2, 2025	Code Available
Dropping the D: RGB-D SLAM Without the Depth Sensor	Nov 2, 2025	Unverified
HarnessLLM: Automatic Testing Harness Generation via Reinforcement Learning	Nov 2, 2025	Code Available
LiteTracker: Leveraging Temporal Causality for Accurate Low-latency Tissue Tracking	Nov 2, 2025	Code Available
Count-Based Approaches Remain Strong: A Benchmark Against Transformer and LLM Pipelines on Structured EHR	Nov 2, 2025	Code Available
MARS-SQL: A multi-agent reinforcement learning framework for Text-to-SQL	Nov 2, 2025	Code Available
Linear Differential Vision Transformer: Learning Visual Contrasts via Pairwise Differentials	Nov 2, 2025	Code Available
Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning	Nov 2, 2025	Code Available
MOSPA: Human Motion Generation Driven by Spatial Audio	Nov 2, 2025	Code Available
Beyond Autoregression: An Empirical Study of Diffusion Large Language Models for Code Generation	Nov 2, 2025	Code Available
Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks	Nov 2, 2025	Code Available
Layer-Wise Modality Decomposition for Interpretable Multimodal Sensor Fusion	Nov 2, 2025	Code Available
Advancing Machine-Generated Text Detection from an Easy to Hard Supervision Perspective	Nov 2, 2025	Code Available
iFlyBot-VLA Technical Report	Nov 1, 2025	Unverified
OmniTrack++: Omnidirectional Multi-Object Tracking by Learning Large-FoV Trajectory Feedback	Nov 1, 2025	Code Available
RoboOmni: Proactive Robot Manipulation in Omni-modal Context	Nov 1, 2025	Unverified
Applying Medical Imaging Tractography Techniques to Painterly Rendering of Images	Nov 1, 2025	Code Available
Word Salad Chopper: Reasoning Models Waste A Ton Of Decoding Budget On Useless Repetitions, Self-Knowingly	Nov 1, 2025	Code Available
Exploring the Hidden Capacity of LLMs for One-Step Text Generation	Nov 1, 2025	Unverified