The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers


Showing 8,351–8,375 of 474,278 papers

Title	Date	Status
Empirical Study on Robustness and Resilience in Cooperative Multi-Agent Reinforcement Learning	Oct 23, 2025	Code Available
DAIL: Beyond Task Ambiguity for Language-Conditioned Reinforcement Learning	Oct 23, 2025	Code Available
FerretNet: Efficient Synthetic Image Detection via Local Pixel Dependencies	Oct 23, 2025	Code Available
Multi-modal Co-learning for Earth Observation: Enhancing single-modality models via modality collaboration	Oct 22, 2025	Unverified
Seabed-Net: A multi-task network for joint bathymetry estimation and seabed classification from remote sensing imagery in shallow waters	Oct 22, 2025	Code Available
The Temporal Graph of Bitcoin Transactions	Oct 22, 2025	Code Available
KnowMol: Advancing Molecular Large Language Models with Multi-Level Chemical Knowledge	Oct 22, 2025	Code Available
dInfer: An Efficient Inference Framework for Diffusion Language Models	Oct 22, 2025	Code Available
One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution	Oct 22, 2025	Code Available
Democratizing AI scientists using ToolUniverse	Oct 22, 2025	Unverified
kabr-tools: Automated Framework for Multi-Species Behavioral Monitoring	Oct 22, 2025	Unverified
ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints	Oct 22, 2025	Unverified
Can LLMs Correct Themselves? A Benchmark of Self-Correction in LLMs	Oct 22, 2025	Unverified
MUG-V 10B: High-efficiency Training Pipeline for Large Video Generation Models	Oct 22, 2025	Code Available
HAD: HAllucination Detection Language Models Based on a Comprehensive Hallucination Taxonomy	Oct 22, 2025	Code Available
KORE: Enhancing Knowledge Injection for Large Multimodal Models via Knowledge-Oriented Augmentations and Constraints	Oct 22, 2025	Unverified
VideoAgentTrek: Computer Use Pretraining from Unlabeled Videos	Oct 22, 2025	Unverified
CARES: Context-Aware Resolution Selector for VLMs	Oct 22, 2025	Unverified
Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1	Oct 22, 2025	Unverified
HSCodeComp: A Realistic and Expert-level Benchmark for Deep Search Agents in Hierarchical Rule Application	Oct 22, 2025	Unverified
The Massive Legal Embedding Benchmark (MLEB)	Oct 22, 2025	Unverified
Deep Research Brings Deeper Harm	Oct 22, 2025	Unverified
Data Efficient Adaptation in Large Language Models via Continuous Low-Rank Fine-Tuning	Oct 22, 2025	Code Available
ToMMeR -- Efficient Entity Mention Detection from Large Language Models	Oct 22, 2025	Unverified
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing	Oct 22, 2025	Unverified