The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers


Showing 9,726–9,750 of 474,278 papers

Title	Date	Status
KnowMT-Bench: Benchmarking Knowledge-Intensive Long-Form Question Answering in Multi-Turn Dialogues	Sep 26, 2025	Code Available
Abductive Logical Rule Induction by Bridging Inductive Logic Programming and Multimodal Large Language Models	Sep 26, 2025	Code Available
Discrete Guidance Matching: Exact Guidance for Discrete Flow Matching	Sep 26, 2025	Code Available
FailureAtlas: Mapping the Failure Landscape of T2I Models via Active Exploration	Sep 26, 2025	Code Available
Beyond Textual Context: Structural Graph Encoding with Adaptive Space Alignment to alleviate the hallucination of LLMs	Sep 26, 2025	Code Available
Multidimensional Uncertainty Quantification via Optimal Transport	Sep 26, 2025	Code Available
DEFT: Decompositional Efficient Fine-Tuning for Text-to-Image Models	Sep 26, 2025	Code Available
APRIL: Active Partial Rollouts in Reinforcement Learning to Tame Long-tail Generation	Sep 26, 2025	Code Available
From Long to Lean: Performance-aware and Adaptive Chain-of-Thought Compression via Multi-round Refinement	Sep 26, 2025	Code Available
Scalable Option Learning in High-Throughput Environments	Sep 26, 2025	Code Available
NIFTY: a Non-Local Image Flow Matching for Texture Synthesis	Sep 26, 2025	Code Available
Vivid-VR: Distilling Concepts from Text-to-Video Diffusion Transformer for Photorealistic Video Restoration	Sep 26, 2025	Code Available
Multi-Channel Differential Transformer for Cross-Domain Sleep Stage Classification with Heterogeneous EEG and EOG	Sep 26, 2025	Code Available
Chain or tree? Re-evaluating complex reasoning from the perspective of a matrix of thought	Sep 26, 2025	Code Available
TrustJudge: Inconsistencies of LLM-as-a-Judge and How to Alleviate Them	Sep 26, 2025	Code Available
UniVid: Unifying Vision Tasks with Pre-trained Video Generation Models	Sep 26, 2025	Code Available
MIRG-RL: Multi-Image Reasoning and Grounding with Reinforcement Learning	Sep 26, 2025	Code Available
FastGRPO: Accelerating Policy Optimization via Concurrency-aware Speculative Decoding and Online Draft Learning	Sep 26, 2025	Code Available
RedNote-Vibe: A Dataset for Capturing Temporal Dynamics of AI-Generated Text in Social Media	Sep 26, 2025	Code Available
SpecXNet: A Dual-Domain Convolutional Network for Robust Deepfake Detection	Sep 26, 2025	Code Available
Think Right, Not More: Test-Time Scaling for Numerical Claim Verification	Sep 26, 2025	Code Available
Johnson-Lindenstrauss Lemma Guided Network for Efficient 3D Medical Segmentation	Sep 26, 2025	Code Available
Zero-Effort Image-to-Music Generation: An Interpretable RAG-based VLM Approach	Sep 26, 2025	Code Available
γ-Quant: Towards Learnable Quantization for Low-bit Pattern Recognition	Sep 26, 2025	Code Available
Language Models Can Learn from Verbal Feedback Without Scalar Rewards	Sep 26, 2025	Code Available