The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers


Showing 6,001–6,050 of 177,340 papers

Title	Date	Tasks	Status	Hype	Score
Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis	Jul 24, 2022	3D geometry, NeRF	Code Available	2	5
VEGS: View Extrapolation of Urban Scenes in 3D Gaussian Splatting using Learned Priors	Jul 3, 2024	Neural Rendering	Code Available	2	5
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models	Oct 10, 2024	GSM8K, Math	Code Available	2	5
Exploring CLIP for Assessing the Look and Feel of Images	Jul 25, 2022	Image Quality Assessment, No-Reference Image Quality Assessment	Code Available	2	5
Visual Perception by Large Language Model's Weights	May 30, 2024		Code Available	2	5
MCP-Solver: Integrating Language Models with Constraint Programming Systems	Dec 31, 2024	Natural Language Understanding	Code Available	2	5
SegNet4D: Efficient Instance-Aware 4D Semantic Segmentation for LiDAR Point Cloud	Jun 24, 2024	Autonomous Driving, Autonomous Navigation	Code Available	2	5
Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation	Nov 20, 2023	3D Human Pose Estimation, Pose Estimation	Code Available	2	5
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing	Apr 3, 2025	Benchmarking, Logical Reasoning	Code Available	2	5
Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning	Oct 10, 2023	Language Modeling, Language Modelling	Code Available	2	5
CMB: A Comprehensive Medical Benchmark in Chinese	Aug 17, 2023		Code Available	2	5
Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy	Oct 2, 2024	Motion Planning, Robot Manipulation	Code Available	2	5
StructChart: On the Schema, Metric, and Augmentation for Visual Chart Understanding	Sep 20, 2023	Chart Question Answering, Chart Understanding	Code Available	2	5
CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making	Jun 13, 2024	Decision Making	Code Available	2	5
The P^3 dataset: Pixels, Points and Polygons for Multimodal Building Vectorization	May 21, 2025		Code Available	2	5
Protein Representation Learning by Geometric Structure Pretraining	Mar 11, 2022	Contrastive Learning, Prediction	Code Available	2	5
SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation	Sep 18, 2022	Real-Time Semantic Segmentation, Segmentation	Code Available	2	5
JudgeLM: Fine-tuned Large Language Models are Scalable Judges	Oct 26, 2023		Code Available	2	5
DeepInteraction: 3D Object Detection via Modality Interaction	Aug 23, 2022	3D Object Detection, Decoder	Code Available	2	5
Internal Consistency and Self-Feedback in Large Language Models: A Survey	Jul 19, 2024		Code Available	2	5
Hybrid-SORT: Weak Cues Matter for Online Multi-Object Tracking	Aug 1, 2023	Multi-Object Tracking, Multiple Object Tracking	Code Available	2	5
PartIR: Composing SPMD Partitioning Strategies for Machine Learning	Jan 20, 2024		Code Available	2	5
SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created Through Human-Machine Collaboration	May 28, 2023	Response Generation	Code Available	2	5
FastVID: Dynamic Density Pruning for Fast Video Large Language Models	Mar 14, 2025		Code Available	2	5
Embedding Earth: Self-supervised contrastive pre-training for dense land cover classification	Mar 11, 2022	Earth Observation, Land Cover Classification	Code Available	2	5
AudioDec: An Open-source Streaming High-fidelity Neural Audio Codec	May 26, 2023	CPU, GPU	Code Available	2	5
Self-Normalizing Neural Networks	Jun 8, 2017	Astronomy, BIG-bench Machine Learning	Code Available	2	5
Discovering uncertainty: Gaussian constitutive neural networks with correlated weights	Mar 16, 2025		Code Available	2	5
InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback	Jun 26, 2023	Benchmarking, Code Generation	Code Available	2	5
SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices	Jun 4, 2024	Text Generation	Code Available	2	5
CAGRA: Highly Parallel Graph Construction and Approximate Nearest Neighbor Search for GPUs	Aug 29, 2023	CPU, GPU	Code Available	2	5
Defending LLMs against Jailbreaking Attacks via Backtranslation	Feb 26, 2024	Language Modelling	Code Available	2	5
TabDDPM: Modelling Tabular Data with Diffusion Models	Sep 30, 2022	Denoising	Code Available	2	5
MCIBI++: Soft Mining Contextual Information Beyond Image for Semantic Segmentation	Sep 9, 2022	Segmentation, Semantic Segmentation	Code Available	2	5
RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts	Nov 22, 2024	AI Agent, Language Modeling	Code Available	2	5
3D LiDAR Mapping in Dynamic Environments Using a 4D Implicit Neural Representation	May 6, 2024	Autonomous Vehicles, Decoder	Code Available	2	5
Long and Short Guidance in Score identity Distillation for One-Step Text-to-Image Generation	Jun 3, 2024	Image Generation, Text to Image Generation	Code Available	2	5
Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding	Mar 27, 2024	Attribute, Decision Making	Code Available	2	5
RepoHyper: Search-Expand-Refine on Semantic Graphs for Repository-Level Code Completion	Mar 10, 2024	Code Completion, Link Prediction	Code Available	2	5
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix	May 19, 2025		Code Available	2	5
Multi-Agent Trajectory Prediction with Difficulty-Guided Feature Enhancement Network	Jul 26, 2024	Autonomous Driving, Decoder	Code Available	2	5
SRGS: Super-Resolution 3D Gaussian Splatting	Apr 16, 2024	3DGS, NeRF	Code Available	2	5
ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming	Apr 6, 2024	Adversarial Robustness, Dialogue Safety Prediction	Code Available	2	5
AdaNeRF: Adaptive Sampling for Real-time Rendering of Neural Radiance Fields	Jul 21, 2022	Novel View Synthesis	Code Available	2	5
Language Models can Self-Lengthen to Generate Long Texts	Oct 31, 2024	Text Generation	Code Available	2	5
Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling	Apr 17, 2025	Hallucination	Code Available	2	5
Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection	Dec 18, 2024		Code Available	2	5
Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing	Dec 21, 2022	Contrastive Learning, Drug Design	Code Available	2	5
MM-Retinal: Knowledge-Enhanced Foundational Pretraining with Fundus Image-Text Expertise	May 20, 2024		Code Available	2	5
VOOM: Robust Visual Object Odometry and Mapping using Hierarchical Landmarks	Feb 21, 2024	Computational Efficiency, Object	Code Available	2	5