The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 6,401–6,450 of 177,340 papers

Title	Date	Tasks	Status	Hype	Score
Efficient Online Reinforcement Learning with Offline Data	Feb 6, 2023	reinforcement-learning, Reinforcement Learning	Code Available	2	5
Language Models are Multilingual Chain-of-Thought Reasoners	Oct 6, 2022	GSM8K, Math	Code Available	2	5
UniVST: A Unified Framework for Training-free Localized Video Style Transfer	Oct 26, 2024	Style Transfer, Video Editing	Code Available	2	5
Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis	Jun 13, 2025	Autonomous Driving, Autonomous Vehicles	Code Available	2	5
Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbreaks	Nov 23, 2024	Language Modeling, Language Modelling	Code Available	2	5
SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models	Dec 15, 2024		Code Available	2	5
Offline Reinforcement Learning for LLM Multi-Step Reasoning	Dec 20, 2024	GSM8K, Math	Code Available	2	5
Image Restoration with Mean-Reverting Stochastic Differential Equations	Jan 27, 2023	Deblurring, Denoising	Code Available	2	5
Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach	Dec 19, 2023	Language Modelling, Large Language Model	Code Available	2	5
MixFormer: End-to-End Tracking with Iterative Mixed Attention	Feb 6, 2023	Object Tracking, Visual Object Tracking	Code Available	2	5
Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval	Jan 28, 2022	Language Modeling, Language Modelling	Code Available	2	5
AdaMixer: A Fast-Converging Query-Based Object Detector	Mar 30, 2022	Object, Object Detection	Code Available	2	5
Rethinking Depth Estimation for Multi-View Stereo: A Unified Representation	Jan 5, 2022	3D Reconstruction, Classification	Code Available	2	5
CORE4D: A 4D Human-Object-Human Interaction Dataset for Collaborative Object REarrangement	Jun 27, 2024	Human-Object Interaction Detection, Human-Object Interaction Generation	Code Available	2	5
Mechanistic understanding and validation of large AI models with SemanticLens	Jan 9, 2025	Decision Making	Code Available	2	5
FinEval: A Chinese Financial Domain Knowledge Evaluation Benchmark for Large Language Models	Aug 19, 2023	Multiple-choice	Code Available	2	5
SPIdepth: Strengthened Pose Information for Self-supervised Monocular Depth Estimation	Apr 18, 2024	Autonomous Driving, Depth Estimation	Code Available	2	5
ThingTalk: An Extensible, Executable Representation Language for Task-Oriented Dialogues	Mar 23, 2022	Semantic Parsing	Code Available	2	5
N-BVH: Neural ray queries with bounding volume hierarchies	May 25, 2024		Code Available	2	5
Understanding The Robustness in Vision Transformers	Apr 26, 2022	Domain Generalization, Image Classification	Code Available	2	5
TaleCrafter: Interactive Story Visualization with Multiple Characters	May 29, 2023	Image Generation, Layout Generation	Code Available	2	5
Extreme Video Compression with Pre-trained Diffusion Models	Feb 14, 2024	Decoder, Image Compression	Code Available	2	5
Tiny Object Tracking: A Large-scale Dataset and A Baseline	Feb 11, 2022	Attribute, Knowledge Distillation	Code Available	2	5
Progressive-Hint Prompting Improves Reasoning in Large Language Models	Apr 19, 2023	Arithmetic Reasoning, GSM8K	Code Available	2	5
Scaling the leading accuracy of deep equivariant models to biomolecular simulations of realistic size	Apr 20, 2023	GPU	Code Available	2	5
Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs	Aug 25, 2023		Code Available	2	5
Hungry Hungry Hippos: Towards Language Modeling with State Space Models	Dec 28, 2022	8k, Coreference Resolution	Code Available	2	5
Chameleon: Fast-slow Neuro-symbolic Lane Topology Extraction	Mar 10, 2025	Autonomous Driving, Scene Understanding	Code Available	2	5
Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints	Nov 26, 2024	Denoising, Image Generation	Code Available	2	5
Learning Deep Time-index Models for Time Series Forecasting	Jul 13, 2022	Inductive Bias, Meta-Learning	Code Available	2	5
VEXIR2Vec: An Architecture-Neutral Embedding Framework for Binary Similarity	Dec 1, 2023	Graph Embedding, Knowledge Graph Embedding	Code Available	2	5
A Semi-supervised Nighttime Dehazing Baseline with Spatial-Frequency Aware and Realistic Brightness Constraint	Mar 27, 2024	Image Dehazing, Pseudo Label	Code Available	2	5
AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation	Apr 19, 2023	All, Video Frame Interpolation	Code Available	2	5
Queryable Prototype Multiple Instance Learning with Vision-Language Models for Incremental Whole Slide Image Classification	Oct 14, 2024	Classification, image-classification	Code Available	2	5
HumanBench: Towards General Human-centric Perception with Projector Assisted Pretraining	Mar 10, 2023	Attribute, Autonomous Driving	Code Available	2	5
Teola: Towards End-to-End Optimization of LLM-based Applications	Jun 29, 2024	Language Modeling, Language Modelling	Code Available	2	5
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages	Jun 14, 2024	Diversity	Code Available	2	5
Orientation-Independent Chinese Text Recognition in Scene Images	Sep 3, 2023	Benchmarking, Image Reconstruction	Code Available	2	5
Human Motion Diffusion as a Generative Prior	Mar 2, 2023	Denoising, Motion Synthesis	Code Available	2	5
Gradient-Driven 3D Segmentation and Affordance Transfer in Gaussian Splatting Using 2D Masks	Sep 18, 2024	3DGS, Segmentation	Code Available	2	5
ResT V2: Simpler, Faster and Stronger	Apr 15, 2022	Semantic Segmentation	Code Available	2	5
Beyond Generalization: A Survey of Out-Of-Distribution Adaptation on Graphs	Feb 17, 2024		Code Available	2	5
TCTrack: Temporal Contexts for Aerial Tracking	Mar 3, 2022		Code Available	2	5
HAHA: Highly Articulated Gaussian Human Avatars with Textured Mesh Prior	Apr 1, 2024		Code Available	2	5
Cross-Scale MAE: A Tale of Multi-Scale Exploitation in Remote Sensing	Jan 29, 2024	GPU, Representation Learning	Code Available	2	5
Robust Dynamic Facial Expression Recognition	Feb 22, 2025	Dynamic Facial Expression Recognition, Facial Expression Recognition	Code Available	2	5
PUGS: Zero-shot Physical Understanding with Gaussian Splatting	Feb 17, 2025	Friction	Code Available	2	5
MAGE: A Multi-Agent Engine for Automated RTL Code Generation	Dec 10, 2024	Code Generation, Navigate	Code Available	2	5
ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image	Oct 27, 2023	Diversity, NeRF	Code Available	2	5
OneLLM: One Framework to Align All Modalities with Language	Dec 6, 2023	All, Question Answering	Code Available	2	5