The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 7,801–7,825 of 474,278 papers

Title	Date	Status
LongRM: Revealing and Unlocking the Context Boundary of Reward Modeling	Nov 4, 2025	Unverified
Deep Ideation: Designing LLM Agents to Generate Novel Research Ideas on Scientific Concept Network	Nov 4, 2025	Code Available
M3PD Dataset: Dual-view Photoplethysmography (PPG) Using Front-and-rear Cameras of Smartphones in Lab and Clinical Settings	Nov 4, 2025	Code Available
MammoClean: Toward Reproducible and Bias-Aware AI in Mammography through Dataset Harmonization	Nov 4, 2025	Code Available
Zero-Shot Multi-Animal Tracking in the Wild	Nov 4, 2025	Code Available
VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation	Nov 4, 2025	Code Available
MemSearcher: Training LLMs to Reason, Search and Manage Memory via End-to-End Reinforcement Learning	Nov 4, 2025	Code Available
Evaluating Large Language Models for Detecting Antisemitism	Nov 4, 2025	Code Available
Weakly Supervised Object Segmentation by Background Conditional Divergence	Nov 4, 2025	Code Available
Crucial-Diff: A Unified Diffusion Model for Crucial Image and Annotation Synthesis in Data-scarce Scenarios	Nov 4, 2025	Code Available
WXSOD: A Benchmark for Robust Salient Object Detection in Adverse Weather Conditions	Nov 4, 2025	Code Available
Exploring Human-AI Conceptual Alignment through the Prism of Chess	Nov 4, 2025	Code Available
Demo: Statistically Significant Results On Biases and Errors of LLMs Do Not Guarantee Generalizable Results	Nov 4, 2025	Code Available
Monocular absolute depth estimation from endoscopy via domain-invariant feature learning and latent consistency	Nov 4, 2025	Code Available
A Novel Grouping-Based Hybrid Color Correction Algorithm for Color Point Clouds	Nov 4, 2025	Code Available
SigmaCollab: An Application-Driven Dataset for Physically Situated Collaboration	Nov 4, 2025	Code Available
Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification	Nov 4, 2025	Code Available
NABench: Large-Scale Benchmarks of Nucleotide Foundation Models for Fitness Prediction	Nov 4, 2025	Code Available
Identity Increases Stability in Neural Cellular Automata	Nov 3, 2025	Code Available
MCFCN: Multi-View Clustering via a Fusion-Consensus Graph Convolutional Network	Nov 3, 2025	Code Available
Efficient Tool-Calling Multi-Expert NPC Agent for Commonsense Persona-Grounded Dialogue	Nov 3, 2025	Code Available
Vote-in-Context: Turning VLMs into Zero-Shot Rank Fusers	Nov 3, 2025	Code Available
Unified Diffusion VLA: Vision-Language-Action Model via Joint Discrete Denoising Diffusion Process	Nov 3, 2025	Unverified
When to Trust the Answer: Question-Aligned Semantic Nearest Neighbor Entropy for Safer Surgical VQA	Nov 3, 2025	Code Available
TPS-Bench: Evaluating AI Agents' Tool Planning & Scheduling Abilities in Compounding Tasks	Nov 3, 2025	Code Available