The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 7,501–7,525 of 177,340 papers

Title	Date	Tasks	Status	Hype	Score
UniVST: A Unified Framework for Training-free Localized Video Style Transfer	Oct 26, 2024	Style Transfer, Video Editing	Code Available	2	5
Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis	Jun 13, 2025	Autonomous Driving, Autonomous Vehicles	Code Available	2	5
Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbreaks	Nov 23, 2024	Language Modeling, Language Modelling	Code Available	2	5
SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models	Dec 15, 2024		Code Available	2	5
Offline Reinforcement Learning for LLM Multi-Step Reasoning	Dec 20, 2024	GSM8K, Math	Code Available	2	5
Image Restoration with Mean-Reverting Stochastic Differential Equations	Jan 27, 2023	Deblurring, Denoising	Code Available	2	5
Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach	Dec 19, 2023	Language Modelling, Large Language Model	Code Available	2	5
MixFormer: End-to-End Tracking with Iterative Mixed Attention	Feb 6, 2023	Object Tracking, Visual Object Tracking	Code Available	2	5
Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval	Jan 28, 2022	Language Modeling, Language Modelling	Code Available	2	5
AdaMixer: A Fast-Converging Query-Based Object Detector	Mar 30, 2022	Object, Object Detection	Code Available	2	5
Rethinking Depth Estimation for Multi-View Stereo: A Unified Representation	Jan 5, 2022	3D Reconstruction, Classification	Code Available	2	5
CORE4D: A 4D Human-Object-Human Interaction Dataset for Collaborative Object REarrangement	Jun 27, 2024	Human-Object Interaction Detection, Human-Object Interaction Generation	Code Available	2	5
Mechanistic understanding and validation of large AI models with SemanticLens	Jan 9, 2025	Decision Making	Code Available	2	5
FinEval: A Chinese Financial Domain Knowledge Evaluation Benchmark for Large Language Models	Aug 19, 2023	Multiple-choice	Code Available	2	5
SPIdepth: Strengthened Pose Information for Self-supervised Monocular Depth Estimation	Apr 18, 2024	Autonomous Driving, Depth Estimation	Code Available	2	5
ThingTalk: An Extensible, Executable Representation Language for Task-Oriented Dialogues	Mar 23, 2022	Semantic Parsing	Code Available	2	5
N-BVH: Neural ray queries with bounding volume hierarchies	May 25, 2024		Code Available	2	5
Understanding The Robustness in Vision Transformers	Apr 26, 2022	Domain Generalization, Image Classification	Code Available	2	5
TaleCrafter: Interactive Story Visualization with Multiple Characters	May 29, 2023	Image Generation, Layout Generation	Code Available	2	5
Extreme Video Compression with Pre-trained Diffusion Models	Feb 14, 2024	Decoder, Image Compression	Code Available	2	5
Tiny Object Tracking: A Large-scale Dataset and A Baseline	Feb 11, 2022	Attribute, Knowledge Distillation	Code Available	2	5
Progressive-Hint Prompting Improves Reasoning in Large Language Models	Apr 19, 2023	Arithmetic Reasoning, GSM8K	Code Available	2	5
Scaling the leading accuracy of deep equivariant models to biomolecular simulations of realistic size	Apr 20, 2023	GPU	Code Available	2	5
Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs	Aug 25, 2023		Code Available	2	5
Hungry Hungry Hippos: Towards Language Modeling with State Space Models	Dec 28, 2022	8k, Coreference Resolution	Code Available	2	5