The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers


Showing 5601–5625 of 177340 papers

Title	Date	Tasks	Status	Hype	Score
Human Preference Score: Better Aligning Text-to-Image Models with Human Preference	Mar 25, 2023		Code Available	2	5
Rotation Invariant Graph Neural Networks using Spin Convolutions	Jun 17, 2021	Graph Neural Network, Initial Structure to Relaxed Energy (IS2RE)	Code Available	2	5
UDAPDR: Unsupervised Domain Adaptation via LLM Prompting and Distillation of Rerankers	Mar 1, 2023	Domain Adaptation, Information Retrieval	Code Available	2	5
ActionFormer: Localizing Moments of Actions with Transformers	Feb 16, 2022	Action Localization, Action Recognition	Code Available	2	5
Learning Hazing to Dehazing: Towards Realistic Haze Generation for Real-World Image Dehazing	Mar 25, 2025	Image Dehazing, Image Generation	Code Available	2	5
Multiview Compressive Coding for 3D Reconstruction	Jan 19, 2023	3D Reconstruction, Decoder	Code Available	2	5
Improving Text Embeddings for Smaller Language Models Using Contrastive Fine-tuning	Aug 1, 2024	Language Modeling, Language Modelling	Code Available	2	5
Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate	May 30, 2023	Arithmetic Reasoning, Machine Translation	Code Available	2	5
BigTranslate: Augmenting Large Language Models with Multilingual Translation Capability over 100 Languages	May 29, 2023	Machine Translation, Translation	Code Available	2	5
Retrieval Augmented Visual Question Answering with Outside Knowledge	Oct 7, 2022	Answer Generation, Diagnostic	Code Available	2	5
Towards Zero-Shot Scale-Aware Monocular Depth Estimation	Jun 29, 2023	Decoder, Depth Estimation	Code Available	2	5
A Dynamic Points Removal Benchmark in Point Cloud Maps	Jul 14, 2023	Benchmarking, Dynamic Point Removal	Code Available	2	5
MT-Eval: A Multi-Turn Capabilities Evaluation Benchmark for Large Language Models	Jan 30, 2024		Code Available	2	5
Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving	Apr 27, 2023	3D geometry, Autonomous Driving	Code Available	2	5
OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies	May 8, 2024	Domain Adaptation, Scene Understanding	Code Available	2	5
What Can Natural Language Processing Do for Peer Review?	May 10, 2024	Articles	Code Available	2	5
Mixed-Curvature Decision Trees and Random Forests	Jun 7, 2024		Code Available	2	5
SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion	Sep 26, 2024	Descriptive, Generalized Referring Expression Comprehension	Code Available	2	5
GeoGround: A Unified Large Vision-Language Model for Remote Sensing Visual Grounding	Nov 16, 2024	Instruction Following, Language Modeling	Code Available	2	5
RecFlow: An Industrial Full Flow Recommendation Dataset	Oct 28, 2024	Recommendation Systems, Selection bias	Code Available	2	5
LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization	Mar 11, 2025	GPU, Image Generation	Code Available	2	5
ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware	Dec 2, 2018	GPU, Image Classification	Code Available	2	5
PerAct2: Benchmarking and Learning for Robotic Bimanual Manipulation Tasks	Jun 29, 2024	Diversity	Code Available	2	5
GPQA: A Graduate-Level Google-Proof Q&A Benchmark	Nov 20, 2023	Multiple-choice	Code Available	2	5
PruneVid: Visual Token Pruning for Efficient Video Large Language Models	Dec 20, 2024	Video Understanding	Code Available	2	5