The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 6,201–6,225 of 474,278 papers

Title	Date	Status
Sequential Testing for Descriptor-Agnostic LiDAR Loop Closure in Repetitive Environments	Dec 10, 2025	Code Available
Benchmarking Real-World Medical Image Classification with Noisy Labels: Challenges, Practice, and Outlook	Dec 10, 2025	Code Available
Gradient-Guided Learning Network for Infrared Small Target Detection	Dec 10, 2025	Code Available
IF-Bench: Benchmarking and Enhancing MLLMs for Infrared Images with Generative Visual Prompting	Dec 10, 2025	Code Available
Training One Model to Master Cross-Level Agentic Actions via Reinforcement Learning	Dec 10, 2025	Code Available
Benchmarking Document Parsers on Mathematical Formula Extraction from PDFs	Dec 10, 2025	Code Available
Diffusion Is Your Friend in Show, Suggest and Tell	Dec 10, 2025	Code Available
DB2-TransF: All You Need Is Learnable Daubechies Wavelets for Time Series Forecasting	Dec 10, 2025	Code Available
SoMe: A Realistic Benchmark for LLM-based Social Media Agents	Dec 9, 2025	Code Available
Generalization vs. Specialization: Evaluating Segment Anything Model (SAM3) Zero-Shot Segmentation Against Fine-Tuned YOLO Detectors	Dec 9, 2025	Code Available
MolSculpt: Sculpting 3D Molecular Geometries from Chemical Syntax	Dec 9, 2025	Code Available
Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform	Dec 9, 2025	Unverified
Decentralized Trust for Space AI: Blockchain-Based Federated Learning Across Multi-Vendor LEO Satellite Networks	Dec 9, 2025	Code Available
WonderZoom: Multi-Scale 3D World Generation	Dec 9, 2025	Unverified
SceneGen: Single-Image 3D Scene Generation in One Feedforward Pass	Dec 9, 2025	Unverified
AraLingBench A Human-Annotated Benchmark for Evaluating Arabic Linguistic Capabilities of Large Language Models	Dec 9, 2025	Unverified
Arbitrage: Efficient Reasoning via Advantage-Aware Speculation	Dec 9, 2025	Unverified
Direct transfer of optimized controllers to similar systems using dimensionless MPC	Dec 9, 2025	Code Available
SegEarth-OV3: Exploring SAM 3 for Open-Vocabulary Semantic Segmentation in Remote Sensing Images	Dec 9, 2025	Code Available
SimpleFold: Folding Proteins is Simpler than You Think	Dec 9, 2025	Unverified
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance	Dec 9, 2025	Unverified
LLM Collaboration With Multi-Agent Reinforcement Learning	Dec 9, 2025	Code Available
Co-Seg++: Mutual Prompt-Guided Collaborative Learning for Versatile Medical Segmentation	Dec 9, 2025	Code Available
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows	Dec 9, 2025	Code Available
MVP: Multiple View Prediction Improves GUI Grounding	Dec 9, 2025	Code Available