The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers


Showing 3,901–3,950 of 177,340 papers

Title	Date	Tasks	Status	Hype	Score
MUSt3R: Multi-view Network for Stereo 3D Reconstruction	Mar 3, 2025	3D Reconstruction, Articles	Code Available	3	5
SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference	Oct 6, 2024	Language Modeling, Language Modelling	Code Available	3	5
Inversion-Free Image Editing with Language-Guided Diffusion Models	Jan 1, 2024	Denoising, Image Manipulation	Code Available	3	5
Beyond Quacking: Deep Integration of Language Models and RAG into DuckDB	Apr 1, 2025	Decision Making, RAG	Code Available	3	5
OpenSpiel: A Framework for Reinforcement Learning in Games	Aug 26, 2019	General Reinforcement Learning, reinforcement-learning	Code Available	3	5
Scikit-fingerprints: easy and efficient computation of molecular fingerprints in Python	Jul 18, 2024	Molecular Property Prediction, Property Prediction	Code Available	3	5
NeRF-SLAM: Real-Time Dense Monocular SLAM with Neural Radiance Fields	Oct 24, 2022	NeRF	Code Available	3	5
CLIMB: Class-imbalanced Learning Benchmark on Tabular Data	May 23, 2025		Code Available	3	5
Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation	Dec 9, 2024	Denoising, Photo geolocation estimation	Code Available	3	5
Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews	Oct 18, 2023	CPU, GPU	Code Available	3	5
Meta-Transformer: A Unified Framework for Multimodal Learning	Jul 20, 2023	Time Series	Code Available	3	5
GroundingGPT: Language Enhanced Multi-modal Grounding Model	Jan 11, 2024	Language Modelling, Large Language Model	Code Available	3	5
Evaluating Large Language Models with fmeval	Jul 15, 2024	Question Answering	Code Available	3	5
Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey	Sep 26, 2024	Safety Alignment	Code Available	3	5
HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale	Jun 27, 2024	Visual Question Answering (VQA)	Code Available	3	5
APPL: A Prompt Programming Language for Harmonious Integration of Programs and Large Language Model Prompts	Jun 19, 2024	Language Modeling, Language Modelling	Code Available	3	5
Rethinking Early Stopping: Refine, Then Calibrate	Jan 31, 2025	Decision Making	Code Available	3	5
CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark	Sep 17, 2024		Code Available	3	5
Better by Default: Strong Pre-Tuned MLPs and Boosted Trees on Tabular Data	Jul 5, 2024	Classification, regression	Code Available	3	5
Automatic Gradient Estimation for Calibrating Crowd Models with Discrete Decision Making	Apr 6, 2024	Decision Making	Code Available	3	5
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought	Dec 23, 2024	Machine Translation, Math	Code Available	3	5
Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective State Spaces	Feb 1, 2024	Computational Efficiency, GPU	Code Available	3	5
MVGS: Multi-view-regulated Gaussian Splatting for Novel View Synthesis	Oct 2, 2024	3DGS, NeRF	Code Available	3	5
Classification Done Right for Vision-Language Pre-Training	Nov 5, 2024	Classification	Code Available	3	5
Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation	Jun 14, 2024	Audio-Visual Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	3	5
ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems	Sep 2, 2024	Benchmarking, Instruction Following	Code Available	3	5
AutoScraper: A Progressive Understanding Web Agent for Web Scraper Generation	Apr 19, 2024	Action Generation	Code Available	3	5
Anatomy-informed Data Augmentation for Enhanced Prostate Cancer Detection	Sep 7, 2023	Anatomy, Data Augmentation	Code Available	3	5
Improving Model Evaluation using SMART Filtering of Benchmark Datasets	Oct 26, 2024	Chatbot, Diversity	Code Available	3	5
ZO2: Scalable Zeroth-Order Fine-Tuning for Extremely Large Language Models with Limited GPU Memory	Mar 16, 2025	CPU, GPU	Code Available	3	5
A new face swap method for image and video domains: a technical report	Feb 7, 2022	Action Recognition In Videos, Face Recognition	Code Available	3	5
MooER: LLM-based Speech Recognition and Translation Models from Moore Threads	Aug 9, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	3	5
Reinforcement Learning Enhanced LLMs: A Survey	Dec 5, 2024	reinforcement-learning, Reinforcement Learning	Code Available	3	5
PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting	Oct 29, 2024	3DGS, 3D Reconstruction	Code Available	3	5
RT-DETRv3: Real-time End-to-End Object Detection with Hierarchical Dense Positive Supervision	Sep 13, 2024	Decoder, object-detection	Code Available	3	5
AesBench: An Expert Benchmark for Multimodal Large Language Models on Image Aesthetics Perception	Jan 16, 2024	MLLM Evaluation: Aesthetics	Code Available	3	5
One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt	Jan 23, 2025	Image Generation, Story Generation	Code Available	3	5
An Imitative Reinforcement Learning Framework for Autonomous Dogfight	Jun 17, 2024	Imitation Learning, reinforcement-learning	Code Available	3	5
FusionBench: A Comprehensive Benchmark of Deep Model Fusion	Jun 5, 2024	image-classification, Image Classification	Code Available	3	5
FedMKT: Federated Mutual Knowledge Transfer for Large and Small Language Models	Jun 4, 2024	Text Generation, Transfer Learning	Code Available	3	5
Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning	Mar 25, 2024	Visual Question Answering (VQA)	Code Available	3	5
From Panels to Prose: Generating Literary Narratives from Comics	Mar 30, 2025	Optical Character Recognition (OCR)	Code Available	3	5
TorchCP: A Python Library for Conformal Prediction	Feb 20, 2024	Conformal Prediction, Deep Learning	Code Available	3	5
RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing	Apr 30, 2024	Computational Efficiency, Hallucination	Code Available	3	5
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs	Mar 3, 2025	Reinforcement Learning (RL)	Code Available	3	5
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution	Sep 19, 2024	document understanding, Video Question Answering	Code Available	3	5
X-LoRA: Mixture of Low-Rank Adapter Experts, a Flexible Framework for Large Language Models with Applications in Protein Mechanics and Molecular Design	Feb 11, 2024	graph construction, Knowledge Graphs	Code Available	3	5
Segment Any Medical Model Extended	Mar 26, 2024	Data Augmentation, Image Segmentation	Code Available	3	5
An Image is Worth 32 Tokens for Reconstruction and Generation	Jun 11, 2024	Image Generation, Image Reconstruction	Code Available	3	5
Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models	Jan 1, 2024	Image Generation, Text to Image Generation	Code Available	3	5