The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 8501–8550 of 661570 papers

Title	Date	Status	Hype
Bayesian Hierarchical Models and the Maximum Entropy Principle	Mar 10, 2026	Unverified	0
Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards	Mar 10, 2026	Unverified	0
ES-dLLM: Efficient Inference for Diffusion Large Language Models by Early-Skipping	Mar 10, 2026	Unverified	0
Mashup Learning: Faster Finetuning by Remixing Past Checkpoints	Mar 10, 2026	Unverified	0
Pretraining Frame Preservation for Lightweight Autoregressive Video History Embedding	Mar 10, 2026	Unverified	0
Fairness-Aware Fine-Tuning of Vision-Language Models for Medical Glaucoma Diagnosis	Mar 10, 2026	Unverified	0
CyberThreat-Eval: Can Large Language Models Automate Real-World Threat Research?	Mar 10, 2026	Code Available	0
ALARM: Audio-Language Alignment for Reasoning Models	Mar 10, 2026	Unverified	0
Improved Robustness of Deep Reinforcement Learning for Control of Time-Varying Systems by Bounded Extremum Seeking	Mar 10, 2026	Unverified	0
The Affine Divergence: Aligning Activation Updates Beyond Normalisation	Mar 10, 2026	Unverified	0
PostTrainBench: Can LLM Agents Automate LLM Post-Training?	Mar 10, 2026	Unverified	0
LLM as a Meta-Judge: Synthetic Data for NLP Evaluation Metric Validation	Mar 10, 2026	Unverified	0
Towards Understanding Adam Convergence on Highly Degenerate Polynomials	Mar 10, 2026	Unverified	0
Do What I Say: A Spoken Prompt Dataset for Instruction-Following	Mar 10, 2026	Unverified	0
Leveraging whole slide difficulty in Multiple Instance Learning to improve prostate cancer grading	Mar 10, 2026	Unverified	0
Pathwise Test-Time Correction for Autoregressive Long Video Generation	Mar 10, 2026	Unverified	0
Discovery of a Hematopoietic Manifold in scGPT Yields a Method for Extracting Performant Algorithms from Biological Foundation Model Internals	Mar 10, 2026	Unverified	0
Physics-Informed Neural Engine Sound Modeling with Differentiable Pulse-Train Synthesis	Mar 10, 2026	Unverified	0
CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR	Mar 10, 2026	Code Available	0
Long Chain-of-Thought Compression via Fine-Grained Group Policy Optimization	Mar 10, 2026	Code Available	0
Directional Textual Inversion for Personalized Text-to-Image Generation	Mar 10, 2026	Code Available	0
Latent Equivariant Operators for Robust Object Recognition: Promises and Challenges	Mar 10, 2026	Code Available	0
Breaking the Factorization Barrier in Diffusion Language Models	Mar 10, 2026	Code Available	0
SlowBA: An efficiency backdoor attack towards VLM-based GUI agents	Mar 10, 2026	Code Available	0
PromptDLA: A Domain-aware Prompt Document Layout Analysis Framework with Descriptive Knowledge as a Cue	Mar 10, 2026	Code Available	0
CIGPose: Causal Intervention Graph Neural Network for Whole-Body Pose Estimation	Mar 10, 2026	Code Available	0
ActiveUltraFeedback: Efficient Preference Data Generation using Active Learning	Mar 10, 2026	Code Available	0
Test-time Ego-Exo-centric Adaptation for Action Anticipation via Multi-Label Prototype Growing and Dual-Clue Consistency	Mar 10, 2026	Code Available	0
From Data Statistics to Feature Geometry: How Correlations Shape Superposition	Mar 10, 2026	Code Available	0
No Memorization, No Detection: Output Distribution-Based Contamination Detection in Small Language Models	Mar 10, 2026	Code Available	0
SGI: Structured 2D Gaussians for Efficient and Compact Large Image Representation	Mar 10, 2026	Code Available	0
Rethinking Adam for Time Series Forecasting: A Simple Heuristic to Improve Optimization under Distribution Shifts	Mar 10, 2026	Code Available	0
Robotic Ultrasound Makes CBCT Alive	Mar 10, 2026	Code Available	0
More than the Sum: Panorama-Language Models for Adverse Omni-Scenes	Mar 10, 2026	Code Available	0
DT-BEHRT: Disease Trajectory-aware Transformer for Interpretable Patient Representation Learning	Mar 10, 2026	Code Available	0
Making Training-Free Diffusion Segmentors Scale with the Generative Power	Mar 10, 2026	Code Available	0
Compiler-First State Space Duality and Portable O(1) Autoregressive Caching for Inference	Mar 10, 2026	Code Available	0
BinaryAttention: One-Bit QK-Attention for Vision and Diffusion Transformers	Mar 10, 2026	Code Available	0
MM-algorithms for traditional and convex NMF with Tweedie and Negative Binomial cost functions and empirical evaluation	Mar 10, 2026	Code Available	0
IndiMathBench: Autoformalizing Mathematical Reasoning Problems with a Human Touch	Mar 10, 2026	Code Available	0
HTMuon: Improving Muon via Heavy-Tailed Spectral Correction	Mar 10, 2026	Code Available	0
KernelSkill: A Multi-Agent Framework for GPU Kernel Optimization	Mar 10, 2026	Code Available	0
A Survey of Weight Space Learning: Understanding, Representation, and Generation	Mar 10, 2026	Code Available	0
HG-Lane: High-Fidelity Generation of Lane Scenes under Adverse Weather and Lighting Conditions without Re-annotation	Mar 10, 2026	Code Available	0
OilSAM2: Memory-Augmented SAM2 for Scalable SAR Oil Spill Detection	Mar 10, 2026	Code Available	0
PanoAffordanceNet: Towards Holistic Affordance Grounding in 360° Indoor Environments	Mar 10, 2026	Code Available	0
Multimodal Classification via Total Correlation Maximization	Mar 10, 2026	Code Available	0
Video-Based Reward Modeling for Computer-Use Agents	Mar 10, 2026	Unverified	1
InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing	Mar 10, 2026	Unverified	3
TinyNav: End-to-End TinyML for Real-Time Autonomous Navigation on Microcontrollers	Mar 10, 2026	Unverified	1