The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers


Showing 6,051–6,075 of 177,340 papers

Title	Date	Tasks	Status	Hype	Score
Lenia - Biology of Artificial Life	Dec 13, 2018	Artificial Life, Diversity	Code Available	2	5
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models	Jun 26, 2024	Chatbot, Red Teaming	Code Available	2	5
SGPT: GPT Sentence Embeddings for Semantic Search	Feb 17, 2022	Argument Retrieval, Biomedical Information Retrieval	Code Available	2	5
AXIAL: Attention-based eXplainability for Interpretable Alzheimer's Localized Diagnosis using 2D CNNs on 3D MRI brain scans	Jul 2, 2024	3D Classification, Alzheimer's Disease Detection	Code Available	2	5
Model Uncertainty in Evolutionary Optimization and Bayesian Optimization: A Comparative Analysis	Mar 21, 2024	Bayesian Optimization	Code Available	2	5
AGILE: A Novel Reinforcement Learning Framework of LLM Agents	May 23, 2024	Question Answering, reinforcement-learning	Code Available	2	5
Learning Local Equivariant Representations for Large-Scale Atomistic Dynamics	Apr 11, 2022	Atomic Forces	Code Available	2	5
How Can Time Series Analysis Benefit From Multiple Modalities? A Survey and Outlook	Mar 14, 2025	Time Series, Time Series Analysis	Code Available	2	5
Bayesian Neural Networks for One-to-Many Mapping in Image Enhancement	Jan 24, 2025	Image Enhancement	Code Available	2	5
BK-SDM: A Lightweight, Fast, and Cheap Version of Stable Diffusion	May 25, 2023	DreamBooth Personalized Generation, Image-to-Image Translation	Code Available	2	5
Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training	Sep 29, 2023	Decision Making, Language Modeling	Code Available	2	5
MicroFlow: An Efficient Rust-Based Inference Engine for TinyML	Sep 28, 2024	Human Detection	Code Available	2	5
Advancing Time Series Classification with Multimodal Language Modeling	Mar 19, 2024	Classification, Language Modeling	Code Available	2	5
Trajectory balance: Improved credit assignment in GFlowNets	Jan 31, 2022	Diversity	Code Available	2	5
From Instance Training to Instruction Learning: Task Adapters Generation from Instructions	Jun 18, 2024	Knowledge Distillation	Code Available	2	5
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models	Oct 1, 2023	Benchmarking	Code Available	2	5
φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation	Mar 17, 2025		Code Available	2	5
Efficient Mixed Transformer for Single Image Super-Resolution	May 19, 2023	Image Super-Resolution, Super-Resolution	Code Available	2	5
CoMoGaussian: Continuous Motion-Aware Gaussian Splatting from Motion-Blurred Images	Mar 7, 2025	3DGS, 3D Scene Reconstruction	Code Available	2	5
PAL: Proxy-Guided Black-Box Attack on Large Language Models	Feb 15, 2024		Code Available	2	5
PyReason: Software for Open World Temporal Logic	Feb 27, 2023	Knowledge Graphs	Code Available	2	5
mDPO: Conditional Preference Optimization for Multimodal Large Language Models	Jun 17, 2024	Hallucination, Language Modeling	Code Available	2	5
In-Context Matting	Mar 23, 2024	Image Matting	Code Available	2	5
NTIRE 2025 Challenge on Image Super-Resolution (×4): Methods and Results	Apr 20, 2025	Image Super-Resolution, Super-Resolution	Code Available	2	5
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models	Jun 23, 2023	Benchmarking, Language Modeling	Code Available	2	5