The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers


Showing 8,851–8,875 of 474,278 papers

Title	Date	Status
LongLive: Real-time Interactive Long Video Generation	Oct 13, 2025	Unverified
LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens	Oct 13, 2025	Unverified
Direct Multi-Token Decoding	Oct 13, 2025	Unverified
Scaling Long-Horizon LLM Agent via Context-Folding	Oct 13, 2025	Unverified
PanoTPS-Net: Panoramic Room Layout Estimation via Thin Plate Spline Transformation	Oct 13, 2025	Code Available
Do LLMs "Feel"? Emotion Circuits Discovery and Control	Oct 13, 2025	Unverified
Diffusion-Link: Diffusion Probabilistic Model for Bridging the Audio-Text Modality Gap	Oct 13, 2025	Code Available
Coupled Degradation Modeling and Fusion: A VLM-Guided Degradation-Coupled Network for Degradation-Aware Infrared and Visible Image Fusion	Oct 13, 2025	Code Available
ReLook: Vision-Grounded RL with a Multimodal LLM Critic for Agentic Web Coding	Oct 13, 2025	Unverified
Deconstructing Attention: Investigating Design Principles for Effective Language Modeling	Oct 13, 2025	Unverified
ExpVid: A Benchmark for Experiment Video Understanding & Reasoning	Oct 13, 2025	Unverified
Demystifying Reinforcement Learning in Agentic Reasoning	Oct 13, 2025	Code Available
IVEBench: Modern Benchmark Suite for Instruction-Guided Video Editing Assessment	Oct 13, 2025	Unverified
InfiniHuman: Infinite 3D Human Creation with Precise Control	Oct 13, 2025	Unverified
Point Prompting: Counterfactual Tracking with Video Diffusion Models	Oct 13, 2025	Unverified
ACADREASON: Exploring the Limits of Reasoning Models with Academic Research Problems	Oct 13, 2025	Unverified
Diffusion Transformers with Representation Autoencoders	Oct 13, 2025	Unverified
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs	Oct 13, 2025	Unverified
SPADE: Spatial Transcriptomics and Pathology Alignment Using a Mixture of Data Experts for an Expressive Latent Space	Oct 13, 2025	Code Available
APLOT: Robust Reward Modeling via Adaptive Preference Learning with Optimal Transport	Oct 13, 2025	Code Available
Add-One-In: Incremental Sample Selection for Large Language Models via a Choice-Based Greedy Paradigm	Oct 13, 2025	Code Available
Go Beyond Earth: Understanding Human Actions and Scenes in Microgravity Environments	Oct 13, 2025	Code Available
Prompt4Trust: A Reinforcement Learning Prompt Augmentation Framework for Clinically-Aligned Confidence Calibration in Multimodal Large Language Models	Oct 13, 2025	Code Available
STAR: A Benchmark for Astronomical Star Fields Super-Resolution	Oct 13, 2025	Code Available
FSA: An Alternative Efficient Implementation of Native Sparse Attention Kernel	Oct 13, 2025	Code Available