The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Recently Added | Most Hyped | Most Active | Needs Verification | Most Verified

Showing 3,351–3,400 of 177,340 papers

Title	Date	Tasks	Status	Hype	Score
Rewrite the Stars	Mar 29, 2024		Code Available	3	5
OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning	Feb 10, 2024	Federated Learning, Instruction Following	Code Available	3	5
Test-Time Training Scaling Laws for Chemical Exploration in Drug Design	Jan 31, 2025	Drug Design, Drug Discovery	Code Available	3	5
Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models	Sep 11, 2024	3D Generation, 3D Reconstruction	Code Available	3	5
Robust and Efficient Medical Imaging with Self-Supervision	May 19, 2022	Diagnostic, Representation Learning	Code Available	3	5
Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning	Feb 5, 2024	reinforcement-learning, Reinforcement Learning	Code Available	3	5
Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI	Jul 16, 2025	GPU	Code Available	3	5
LangProBe: a Language Programs Benchmark	Feb 27, 2025		Code Available	3	5
VerMCTS: Synthesizing Multi-Step Programs using a Verifier, a Large Language Model, and Tree Search	Feb 13, 2024	Language Modeling, Language Modelling	Code Available	3	5
STORM: Spatio-Temporal Reconstruction Model for Large-Scale Outdoor Scenes	Dec 31, 2024	Dynamic Reconstruction, Scene Flow Estimation	Code Available	3	5
Differentiable Data Augmentation with Kornia	Nov 19, 2020	Image Augmentation, Image Manipulation	Code Available	3	5
Supplementary Material for Efficient and Robust Automated Machine Learning	Jan 1, 2015	BIG-bench Machine Learning, Hyperparameter Optimization	Code Available	3	5
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks	Apr 16, 2022	Benchmarking, Instruction Following	Code Available	3	5
Why Do Multi-Agent LLM Systems Fail?	Mar 17, 2025		Code Available	3	5
SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining	Mar 23, 2025	3DGS, Benchmarking	Code Available	3	5
Token Merging: Your ViT But Faster	Oct 17, 2022	Efficient ViTs	Code Available	3	5
StableVideo: Text-driven Consistency-aware Diffusion Video Editing	Aug 18, 2023	Video Editing	Code Available	3	5
Data-centric AI: Perspectives and Challenges	Jan 12, 2023		Code Available	3	5
Declarative Machine Learning Systems	Jul 16, 2021	BIG-bench Machine Learning	Code Available	3	5
Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models	Mar 21, 2023	3D geometry, Text to 3D	Code Available	3	5
BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects	Mar 24, 2023	3D Object Detection, 3D Object Tracking	Code Available	3	5
TorchBench: Benchmarking PyTorch with High API Surface Coverage	Apr 27, 2023	Benchmarking, GPU	Code Available	3	5
How Can Recommender Systems Benefit from Large Language Models: A Survey	Jun 9, 2023	Ethics, Feature Engineering	Code Available	3	5
Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering	Nov 30, 2023	Neural Rendering	Code Available	3	5
DeFlow: Decoder of Scene Flow Network in Autonomous Driving	Jan 29, 2024	Autonomous Driving, Decoder	Code Available	3	5
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models	Feb 10, 2024	CPU, GPU	Code Available	3	5
FaceXFormer: A Unified Transformer for Facial Analysis	Mar 19, 2024	Age and Gender Estimation, Age Estimation	Code Available	3	5
Benchmarking Large Language Models on CFLUE -- A Chinese Financial Language Understanding Evaluation Dataset	May 17, 2024	16k, Benchmarking	Code Available	3	5
Vaporetto: Efficient Japanese Tokenization Based on Improved Pointwise Linear Classification	Jun 24, 2024		Code Available	3	5
HARDVS: Revisiting Human Activity Recognition with Dynamic Vision Sensors	Nov 17, 2022	Activity Prediction, Activity Recognition	Code Available	3	5
A Note on the Prediction-Powered Bootstrap	May 28, 2024	Prediction	Code Available	3	5
S-Graphs 2.0 -- A Hierarchical-Semantic Optimization and Loop Closure for SLAM	Feb 25, 2025	global-optimization, Management	Code Available	3	5
AudioBench: A Universal Benchmark for Audio Large Language Models	Jun 23, 2024	Audio Scene Understanding, Instruction Following	Code Available	3	5
Alias-Free Generative Adversarial Networks	Jun 23, 2021	Image Generation	Code Available	3	5
HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting	May 24, 2024	NeRF, Novel View Synthesis	Code Available	3	5
Embodied CoT Distillation From LLM To Off-the-shelf Agents	Dec 16, 2024	Decision Making, In-Context Learning	Code Available	3	5
MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields	Jun 15, 2022	Computational chemistry	Code Available	3	5
GLiREL -- Generalist Model for Zero-Shot Relation Extraction	Jan 6, 2025	model, named-entity-recognition	Code Available	3	5
ZIM: Zero-Shot Image Matting for Anything	Nov 1, 2024	Image Inpainting, Image Matting	Code Available	3	5
ivis Dimensionality Reduction Framework for Biomacromolecular Simulations	Apr 22, 2020	Dimensionality Reduction	Code Available	3	5
Vulnerability Detection with Code Language Models: How Far Are We?	Mar 27, 2024	Vulnerability Detection	Code Available	3	5
Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models	May 26, 2023	GSM8K, Multimodal Reasoning	Code Available	3	5
Vision-LSTM: xLSTM as Generic Vision Backbone	Jun 6, 2024		Code Available	3	5
A Survey on Evaluation of Large Language Models	Jul 6, 2023	Ethics, Survey	Code Available	3	5
Movie Gen: A Cast of Media Foundation Models	Oct 17, 2024	Audio Generation, Video Editing	Code Available	3	5
Recent Advances on Machine Learning for Computational Fluid Dynamics: A Survey	Aug 22, 2024	scientific discovery, Symbolic Regression	Code Available	3	5
Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community	Aug 17, 2024	Novel Concepts, Object	Code Available	3	5
Point Transformer V3: Simpler, Faster, Stronger	Dec 15, 2023	3D Semantic Segmentation, LIDAR Semantic Segmentation	Code Available	3	5
OmniSQL: Synthesizing High-quality Text-to-SQL Data at Scale	Mar 4, 2025	Text to SQL, Text-To-SQL	Code Available	3	5
OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models	Jul 5, 2023		Code Available	3	5