The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 4276–4300 of 177340 papers

Title	Date	Tasks	Status	Hype	Score
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model	Nov 27, 2023	Image Animation	Code Available	3	5
CRAG -- Comprehensive RAG Benchmark	Jun 7, 2024	Hallucination, Language Modelling	Code Available	3	5
Major TOM: Expandable Datasets for Earth Observation	Feb 19, 2024	Earth Observation	Code Available	3	5
Uni-QSAR: an Auto-ML Tool for Molecular Property Prediction	Apr 24, 2023	Drug Discovery, Model Selection	Code Available	3	5
Optimal Variable Speed Limit Control Strategy on Freeway Segments under Fog Conditions	Jul 30, 2021		Code Available	3	5
Towards General-purpose Infrastructure for Protecting Scientific Data Under Study	Oct 4, 2021		Code Available	3	5
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning	Mar 6, 2025		Code Available	3	5
Genie: Generative Interactive Environments	Feb 23, 2024		Code Available	3	5
Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation	Jan 1, 2024	Segmentation, Semantic Segmentation	Code Available	3	5
Efficiently Serving LLM Reasoning Programs with Certaindex	Dec 30, 2024	Code Generation, Mathematical Problem-Solving	Code Available	3	5
SPO: Sequential Monte Carlo Policy Optimisation	Feb 12, 2024	Decision Making, Model-based Reinforcement Learning	Code Available	3	5
AgentStudio: A Toolkit for Building General Virtual Agents	Mar 26, 2024	Visual Grounding	Code Available	3	5
Is Value Learning Really the Main Bottleneck in Offline RL?	Jun 13, 2024	Imitation Learning, Offline RL	Code Available	3	5
DANA: Domain-Aware Neurosymbolic Agents for Consistency and Accuracy	Sep 27, 2024	Financial Analysis	Code Available	3	5
Compact 3D Gaussian Splatting for Static and Dynamic Radiance Fields	Aug 7, 2024	3DGS, Model Compression	Code Available	3	5
MAGiC-SLAM: Multi-Agent Gaussian Globally Consistent SLAM	Nov 25, 2024	Autonomous Driving, Novel View Synthesis	Code Available	3	5
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2	Aug 9, 2024	All	Code Available	3	5
DPLM-2: A Multimodal Diffusion Protein Language Model	Oct 17, 2024	Language Modeling, Language Modelling	Code Available	3	5
Automated Formulaic Alpha Generation for Quantitative Investing using Evolutionary Algorithms	Mar 13, 2022	Evolutionary Algorithms	Code Available	3	5
The False Promise of Imitating Proprietary LLMs	May 25, 2023	Language Modelling	Code Available	3	5
Visual Geometry Grounded Deep Structure From Motion	Dec 7, 2023	Point Tracking	Code Available	3	5
A Foundation Model for the Earth System	May 20, 2024	Computational Efficiency, Deep Learning	Code Available	3	5
DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning	Jun 14, 2024	Offline RL	Code Available	3	5
Human-level play in the game of Diplomacy by combining language models with strategic reasoning	Nov 22, 2022	AI Agent, Language Modeling	Code Available	3	5
Improving Text Embeddings with Large Language Models	Dec 31, 2023	Decoder, Diversity	Code Available	3	5