The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers


Showing 12,351–12,400 of 474,278 papers

Title	Date	Tasks	Status	Hype
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time	Oct 26, 2023	In-Context Learning	Code Available	2
FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding	Apr 14, 2025		Code Available	2
Diffusion Models for Tabular Data: Challenges, Current Progress, and Future Directions	Feb 24, 2025	Data Augmentation, Image Generation	Code Available	2
WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents	Apr 22, 2025	Knowledge Graphs, Minecraft	Code Available	2
Evaluating Large Language Models: A Comprehensive Survey	Oct 30, 2023	Survey	Code Available	2
Parsel: Algorithmic Reasoning with Language Models by Composing Decompositions	Dec 20, 2022	Automated Theorem Proving, Code Generation	Code Available	2
MolTC: Towards Molecular Relational Modeling In Language Models	Feb 6, 2024	Relational Reasoning	Code Available	2
FastReID: A Pytorch Toolbox for General Instance Re-identification	Jun 4, 2020	Face Recognition, GPU	Code Available	2
DEGAS: Detailed Expressions on Full-Body Gaussian Avatars	Aug 20, 2024	3DGS, Neural Rendering	Code Available	2
Self-Consistent Recursive Diffusion Bridge for Medical Image Translation	May 10, 2024	Denoising, Scheduling	Code Available	2
HoliTom: Holistic Token Merging for Fast Video Large Language Models	May 27, 2025		Code Available	2
Tuning-Free Image Customization with Image and Text Guidance	Mar 19, 2024	Decoder, Denoising	Code Available	2
Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective	Mar 27, 2023	Image Quality Assessment, No-Reference Image Quality Assessment	Code Available	2
Variable Bitrate Neural Fields	Jun 15, 2022	Decoder	Code Available	2
LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge	Jan 1, 2024	Language Modeling, Language Modelling	Code Available	2
Cross-lingual and Multilingual CLIP	Jun 1, 2022	Contrastive Learning, Image-text Retrieval	Code Available	2
A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization	Jun 3, 2024	Combinatorial Optimization	Code Available	2
Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching	May 18, 2024	3D Generation, Denoising	Code Available	2
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging	Oct 2, 2024	Auto Debugging, Bug fixing	Code Available	2
BMFM-DNA: A SNP-aware DNA foundation model to capture variant effects	Jun 26, 2025	Imputation, Promoter Detection	Code Available	2
Honeybee: Locality-enhanced Projector for Multimodal LLM	Dec 11, 2023	MME, Science Question Answering	Code Available	2
GigaPose: Fast and Robust Novel Object Pose Estimation via One Correspondence	Nov 23, 2023	3D Reconstruction, 6D Pose Estimation	Code Available	2
MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis	Aug 14, 2024	Anomaly Detection, Boundary Detection	Code Available	2
Density Estimation via Binless Multidimensional Integration	Jul 10, 2024	Density Estimation	Code Available	2
RenderIH: A Large-scale Synthetic Dataset for 3D Interacting Hand Pose Estimation	Sep 17, 2023	3D Interacting Hand Pose Estimation, Diversity	Code Available	2
MVGamba: Unify 3D Content Generation as State Space Sequence Modeling	Jun 10, 2024	3D Generation, Attribute	Code Available	2
Sparse2DGS: Geometry-Prioritized Gaussian Splatting for Surface Reconstruction from Sparse Views	Apr 29, 2025	NeRF, Surface Reconstruction	Code Available	2
Autoregressive Pretraining with Mamba in Vision	Jun 11, 2024	Mamba	Code Available	2
pymdp: A Python library for active inference in discrete state spaces	Jan 11, 2022	Bayesian Inference	Code Available	2
3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly	Feb 9, 2025	Anomaly Detection, Unsupervised Anomaly Detection	Code Available	2
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline	Apr 3, 2024	Math, Mathematical Problem-Solving	Code Available	2
Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models	Mar 31, 2025		Code Available	2
StreamBench: Towards Benchmarking Continuous Improvement of Language Agents	Jun 13, 2024	Benchmarking, Language Modeling	Code Available	2
MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning	Mar 10, 2025	Benchmarking, Medical Question Answering	Code Available	2
CLEAR: A Fully User-side Image Search System	Jun 17, 2022	Image Retrieval, Privacy Preserving	Code Available	2
Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging	May 8, 2025		Code Available	2
Agent-SafetyBench: Evaluating the Safety of LLM Agents	Dec 19, 2024		Code Available	2
LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation	Apr 10, 2025	Code Generation, Continual Learning	Code Available	2
Logits-Based Finetuning	May 30, 2025	Out of Distribution (OOD) Detection	Code Available	2
Accelerating Online Mapping and Behavior Prediction via Direct BEV Feature Attention	Jul 9, 2024	Autonomous Driving, Decoder	Code Available	2
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition	Oct 30, 2023	Image Classification, Object Detection	Code Available	2
Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents	Feb 14, 2024	Language Modeling, Language Modelling	Code Available	2
Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution	Jan 12, 2025	Computational Efficiency, GPU	Code Available	2
VLSBench: Unveiling Visual Leakage in Multimodal Safety	Nov 29, 2024		Code Available	2
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning	Jan 22, 2025	Mathematical Reasoning	Code Available	2
xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories	Oct 22, 2024	Multivariate Time Series Forecasting, Temporal Sequences	Code Available	2
Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models	Aug 4, 2024	Hallucination	Code Available	2
Restoring and attributing ancient texts using deep neural networks	Mar 9, 2022	Ancient Text Restoration, Attribute	Code Available	2
BAD: Bidirectional Auto-regressive Diffusion for Text-to-Motion Generation	Sep 17, 2024	Human motion prediction, Motion Forecasting	Code Available	2
Data-efficient Large Vision Models through Sequential Autoregression	Feb 7, 2024		Code Available	2