The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 9,201–9,225 of 474,278 papers

Title	Date	Status
LLM Unlearning Without an Expert Curated Dataset	Oct 7, 2025	Code Available
Bridging Semantic Logic Gaps: A Cognition Inspired Multimodal Boundary Preserving Network for Image Manipulation Localization	Oct 7, 2025	Code Available
Benchmarking the Robustness of Agentic Systems to Adversarially-Induced Harms	Oct 7, 2025	Code Available
Sparse Representations Improve Adversarial Robustness of Neural Network Classifiers	Oct 7, 2025	Code Available
vAttention: Verified Sparse Attention	Oct 7, 2025	Code Available
Redefining Generalization in Visual Domains: A Two-Axis Framework for Fake Image Detection with FusionDetect	Oct 7, 2025	Code Available
Are Heterogeneous Graph Neural Networks Truly Effective? A Causal Perspective	Oct 7, 2025	Code Available
CalibCLIP: Contextual Calibration of Dominant Semantics for Text-Driven Image Retrieval	Oct 7, 2025	Code Available
Towards Unified Image Deblurring using a Mixture-of-Experts Decoder	Oct 7, 2025	Code Available
AUREXA-SE: Audio-Visual Unified Representation Exchange Architecture with Cross-Attention and Squeezeformer for Speech Enhancement	Oct 6, 2025	Code Available
Reproducibility Study of "XRec: Large Language Models for Explainable Recommendation"	Oct 6, 2025	Code Available
Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models	Oct 6, 2025	Unverified
One Patch to Caption Them All: A Unified Zero-Shot Captioning Framework	Oct 6, 2025	Unverified
ParallelBench: Understanding the Trade-offs of Parallel Decoding in Diffusion LLMs	Oct 6, 2025	Unverified
Learning on the Job: Test-Time Curricula for Targeted Reinforcement Learning	Oct 6, 2025	Unverified
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA	Oct 6, 2025	Unverified
Less is More: Recursive Reasoning with Tiny Networks	Oct 6, 2025	Unverified
Test-Time Scaling in Diffusion LLMs via Hidden Semi-Autoregressive Experts	Oct 6, 2025	Unverified
StaMo: Unsupervised Learning of Generalizable Robot Motion from Compact State Representation	Oct 6, 2025	Unverified
Character Mixing for Video Generation	Oct 6, 2025	Unverified
VChain: Chain-of-Visual-Thought for Reasoning in Video Generation	Oct 6, 2025	Unverified
Pulp Motion: Framing-aware multimodal camera and human motion generation	Oct 6, 2025	Unverified
A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility	Oct 6, 2025	Unverified
DynaGuard: A Dynamic Guardian Model With User-Defined Policies	Oct 6, 2025	Unverified
VER: Vision Expert Transformer for Robot Learning via Foundation Distillation and Dynamic Routing	Oct 6, 2025	Unverified