The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6,201–6,225 of 474,278 papers

Title	Date	Tasks	Status	Hype
Sparse Autoencoders for Hypothesis Generation	Feb 5, 2025		Code Available	2
Seeing World Dynamics in a Nutshell	Feb 5, 2025	Video Reconstruction	Code Available	2
The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models via Visual Information Steering	Feb 5, 2025	Hallucination	Code Available	2
On-device Sora: Enabling Training-Free Diffusion-based Text-to-Video Generation for Mobile Devices	Feb 5, 2025	Denoising, Model Optimization	Code Available	2
Diff9D: Diffusion-Based Domain-Generalized Category-Level 9-DoF Object Pose Estimation	Feb 4, 2025	Denoising, Domain Generalization	Code Available	2
Reviving The Classics: Active Reward Modeling in Large Language Model Alignment	Feb 4, 2025	Computational Efficiency, Experimental Design	Code Available	2
STAIR: Improving Safety Alignment with Introspective Reasoning	Feb 4, 2025	Safety Alignment	Code Available	2
QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search	Feb 4, 2025		Code Available	2
On the Guidance of Flow Matching	Feb 4, 2025	Decision Making, Image Generation	Code Available	2
CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance	Feb 4, 2025	Code Generation, Text Generation	Code Available	2
Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs	Feb 4, 2025	Code Generation, Language Modeling	Code Available	2
Honegumi: An Interface for Accelerating the Adoption of Bayesian Optimization in the Experimental Sciences	Feb 4, 2025	Bayesian Optimization, Experimental Design	Code Available	2
The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles	Feb 3, 2025	ARC, Multimodal Reasoning	Code Available	2
Massive Values in Self-Attention Modules are the Key to Contextual Knowledge Understanding	Feb 3, 2025	Quantization	Code Available	2
Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization	Feb 3, 2025	model	Code Available	2
Efficient Diffusion Models: A Survey	Feb 3, 2025	Survey	Code Available	2
Compressed Image Generation with Denoising Diffusion Codebook Models	Feb 3, 2025	Conditional Image Generation, Denoising	Code Available	2
LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer	Feb 3, 2025		Code Available	2
Towards Robust and Generalizable Lensless Imaging with Modular Learned Reconstruction	Feb 3, 2025	Transfer Learning	Code Available	2
Preference Leakage: A Contamination Problem in LLM-as-a-judge	Feb 3, 2025		Code Available	2
When Do LLMs Help With Node Classification? A Comprehensive Analysis	Feb 2, 2025	Node Classification	Code Available	2
LEAD: Large Foundation Model for EEG-Based Alzheimer's Disease Detection	Feb 2, 2025	Alzheimer's Disease Detection, EEG	Code Available	2
FlexCloud: Direct, Modular Georeferencing and Drift-Correction of Point Cloud Maps	Feb 1, 2025	Autonomous Driving, motion prediction	Code Available	2
Segment Anything for Histopathology	Feb 1, 2025	Image Segmentation, Instance Segmentation	Code Available	2
PyMOLfold: Interactive Protein and Ligand Structure Prediction in PyMOL	Feb 1, 2025	Prediction, Protein Folding	Code Available	2