The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers


Showing 1,751–1,775 of 177,339 papers

Title	Date	Tasks	Status	Hype	Score
REFINE: Inversion-Free Backdoor Defense via Model Reprogramming	Feb 22, 2025	backdoor defense	Code Available	4	5
Relationships are Complicated! An Analysis of Relationships Between Datasets on the Web	Aug 26, 2024	Decision Making, Multi-class Classification	Code Available	4	5
Benchmarking Graphormer on Large-Scale Molecular Modeling Datasets	Mar 9, 2022	Benchmarking, Graph Regression	Code Available	4	5
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment	Oct 3, 2023	Audio Classification, Contrastive Learning	Code Available	4	5
SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement	Oct 26, 2024	Large Language Model	Code Available	4	5
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization	Mar 13, 2025	Multimodal Reasoning	Code Available	4	5
Recurrent Partial Kernel Network for Efficient Optical Flow Estimation	Feb 1, 2024	Optical Flow Estimation	Code Available	4	5
DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality	Oct 25, 2022	Deep Reinforcement Learning, GPU	Code Available	4	5
Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning	Mar 20, 2025	Decision Making, Language Modeling	Code Available	4	5
Are Transformers Effective for Time Series Forecasting?	May 26, 2022	Anomaly Detection, Relation Extraction	Code Available	4	5
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models	May 8, 2025	Multimodal Reasoning	Code Available	4	5
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation	Dec 4, 2023	Depth Estimation, GPU	Code Available	4	5
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot	Jan 2, 2023	Common Sense Reasoning, Language Modelling	Code Available	4	5
AlignScore: Evaluating Factual Consistency with a Unified Alignment Function	May 26, 2023	Fact Verification, Information Retrieval	Code Available	4	5
TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos	Mar 26, 2024	3D Human Pose Estimation	Code Available	4	5
TableGPT2: A Large Multimodal Model with Tabular Data Integration	Nov 4, 2024	Benchmarking, Data Integration	Code Available	4	5
Human-Humanoid Robots Cross-Embodiment Behavior-Skill Transfer Using Decomposed Adversarial Learning from Demonstration	Dec 19, 2024	Human-Object Interaction Detection, motion retargeting	Code Available	4	5
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds	Jul 1, 2024	Audio Generation, Video Alignment	Code Available	4	5
MovieChat+: Question-aware Sparse Memory for Long Video Question Answering	Apr 26, 2024	2k, Question Answering	Code Available	4	5
Knowledge Fusion of Chat LLMs: A Preliminary Technical Report	Feb 25, 2024		Code Available	4	5
R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning	May 22, 2025	Memorization, RAG	Code Available	4	5
The case for 4-bit precision: k-bit Inference Scaling Laws	Dec 19, 2022	Quantization	Code Available	4	5
ActiveAnno3D -- An Active Learning Framework for Multi-Modal 3D Object Detection	Feb 5, 2024	3D Object Detection, Active Learning	Code Available	4	5
DepGraph: Towards Any Structural Pruning	Jan 30, 2023	Network Pruning, Neural Network Compression	Code Available	4	5
Improving Training Stability for Multitask Ranking Models in Recommender Systems	Feb 17, 2023	Recommendation Systems	Code Available	4	5