The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 2,726–2,750 of 661,570 papers

Title	Date	Tasks	Status	Hype
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation	Sep 6, 2024	Image Generation	Code Available	3
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't	Mar 20, 2025	Mathematical Reasoning, Reinforcement Learning (RL)	Code Available	3
Vine Copulas as Differentiable Computational Graphs	Jun 16, 2025	GPU, Scheduling	Code Available	3
Safe RLHF: Safe Reinforcement Learning from Human Feedback	Oct 19, 2023	reinforcement-learning, Reinforcement Learning	Code Available	3
Predicting from Strings: Language Model Embeddings for Bayesian Optimization	Oct 14, 2024	Bayesian Optimization, Experimental Design	Code Available	3
Discovering Language Model Behaviors with Model-Written Evaluations	Dec 19, 2022	Language Modeling, Language Modelling	Code Available	3
A Survey of Camouflaged Object Detection and Beyond	Aug 26, 2024	Instance Segmentation, Object	Code Available	3
MCTrack: A Unified 3D Multi-Object Tracking Framework for Autonomous Driving	Sep 23, 2024	3D Multi-Object Tracking, Autonomous Driving	Code Available	3
Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents	Mar 4, 2024	Contrastive Learning	Code Available	3
PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition	Jul 15, 2024	Automated Theorem Proving	Code Available	3
A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond	Mar 21, 2024	Survey	Code Available	3
MVSFormer++: Revealing the Devil in Transformer's Details for Multi-View Stereo	Jan 22, 2024	3D Reconstruction, Depth Estimation	Code Available	3
Prisma: An Open Source Toolkit for Mechanistic Interpretability in Vision and Video	Apr 28, 2025		Code Available	3
MyoSuite -- A contact-rich simulation suite for musculoskeletal motor control	May 26, 2022	continuous-control, Continuous Control	Code Available	3
Effects of charging and discharging capabilities on trade-offs between model accuracy and computational efficiency in pumped thermal electricity storage	Nov 8, 2024	Computational Efficiency	Code Available	3
Evolving from Single-modal to Multi-modal Facial Deepfake Detection: A Survey	Jun 11, 2024	DeepFake Detection, Face Swapping	Code Available	3
Towards Kinetic Manipulation of the Latent Space	Sep 15, 2024		Code Available	3
Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation	Apr 25, 2023	Image Segmentation, Medical Image Segmentation	Code Available	3
AA-CLIP: Enhancing Zero-shot Anomaly Detection via Anomaly-Aware CLIP	Mar 9, 2025	Anomaly Detection, Anomaly Localization	Code Available	3
xLSTM-UNet can be an Effective 2D & 3D Medical Image Segmentation Backbone with Vision-LSTM (ViL) better than its Mamba Counterpart	Jul 1, 2024	3D Medical Imaging Segmentation, image-classification	Code Available	3
Open-Source Skull Reconstruction with MONAI	Nov 25, 2022	C++ code, Deep Learning	Code Available	3
MMedAgent: Learning to Use Medical Tools with Multi-modal Agent	Jul 2, 2024		Code Available	3
DiarizationLM: Speaker Diarization Post-Processing with Large Language Models	Jan 7, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	3
RelBench: A Benchmark for Deep Learning on Relational Databases	Jul 29, 2024	Deep Learning, Feature Engineering	Code Available	3
A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions	Jun 9, 2024	3D visual grounding, Survey	Code Available	3