The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 5,601–5,650 of 177,340 papers

Title	Date	Tasks	Status	Hype	Score
Human Preference Score: Better Aligning Text-to-Image Models with Human Preference	Mar 25, 2023		Code Available	2	5
Rotation Invariant Graph Neural Networks using Spin Convolutions	Jun 17, 2021	Graph Neural Network, Initial Structure to Relaxed Energy (IS2RE)	Code Available	2	5
UDAPDR: Unsupervised Domain Adaptation via LLM Prompting and Distillation of Rerankers	Mar 1, 2023	Domain Adaptation, Information Retrieval	Code Available	2	5
ActionFormer: Localizing Moments of Actions with Transformers	Feb 16, 2022	Action Localization, Action Recognition	Code Available	2	5
Learning Hazing to Dehazing: Towards Realistic Haze Generation for Real-World Image Dehazing	Mar 25, 2025	Image Dehazing, Image Generation	Code Available	2	5
Multiview Compressive Coding for 3D Reconstruction	Jan 19, 2023	3D Reconstruction, Decoder	Code Available	2	5
Improving Text Embeddings for Smaller Language Models Using Contrastive Fine-tuning	Aug 1, 2024	Language Modeling, Language Modelling	Code Available	2	5
Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate	May 30, 2023	Arithmetic Reasoning, Machine Translation	Code Available	2	5
BigTranslate: Augmenting Large Language Models with Multilingual Translation Capability over 100 Languages	May 29, 2023	Machine Translation, Translation	Code Available	2	5
Retrieval Augmented Visual Question Answering with Outside Knowledge	Oct 7, 2022	Answer Generation, Diagnostic	Code Available	2	5
Towards Zero-Shot Scale-Aware Monocular Depth Estimation	Jun 29, 2023	Decoder, Depth Estimation	Code Available	2	5
A Dynamic Points Removal Benchmark in Point Cloud Maps	Jul 14, 2023	Benchmarking, Dynamic Point Removal	Code Available	2	5
MT-Eval: A Multi-Turn Capabilities Evaluation Benchmark for Large Language Models	Jan 30, 2024		Code Available	2	5
Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving	Apr 27, 2023	3D geometry, Autonomous Driving	Code Available	2	5
OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies	May 8, 2024	Domain Adaptation, Scene Understanding	Code Available	2	5
What Can Natural Language Processing Do for Peer Review?	May 10, 2024	Articles	Code Available	2	5
Mixed-Curvature Decision Trees and Random Forests	Jun 7, 2024		Code Available	2	5
SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion	Sep 26, 2024	Descriptive, Generalized Referring Expression Comprehension	Code Available	2	5
GeoGround: A Unified Large Vision-Language Model for Remote Sensing Visual Grounding	Nov 16, 2024	Instruction Following, Language Modeling	Code Available	2	5
RecFlow: An Industrial Full Flow Recommendation Dataset	Oct 28, 2024	Recommendation Systems, Selection bias	Code Available	2	5
LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization	Mar 11, 2025	GPU, Image Generation	Code Available	2	5
ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware	Dec 2, 2018	GPU, Image Classification	Code Available	2	5
PerAct2: Benchmarking and Learning for Robotic Bimanual Manipulation Tasks	Jun 29, 2024	Diversity	Code Available	2	5
GPQA: A Graduate-Level Google-Proof Q&A Benchmark	Nov 20, 2023	Multiple-choice	Code Available	2	5
PruneVid: Visual Token Pruning for Efficient Video Large Language Models	Dec 20, 2024	Video Understanding	Code Available	2	5
Voice Conversion With Just Nearest Neighbors	May 30, 2023	Voice Conversion	Code Available	2	5
Learning Affinity from Attention: End-to-End Weakly-Supervised Semantic Segmentation with Transformers	Mar 5, 2022	Semantic Segmentation, Weakly supervised Semantic Segmentation	Code Available	2	5
DreamLLM: Synergistic Multimodal Comprehension and Creation	Sep 20, 2023	multimodal generation, Visual Question Answering	Code Available	2	5
On-Device Domain Generalization	Sep 15, 2022	Data Augmentation, Domain Generalization	Code Available	2	5
Dynamic Early Exit in Reasoning Models	Apr 22, 2025	GSM8K, Math	Code Available	2	5
Medical Vision Generalist: Unifying Medical Imaging Tasks in Context	Jun 8, 2024	Conditional Image Generation, Denoising	Code Available	2	5
AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark	Dec 17, 2024	Information Retrieval, Retrieval	Code Available	2	5
Revisiting Adversarial Training under Long-Tailed Distributions	Mar 15, 2024	Adversarial Defense, Data Augmentation	Code Available	2	5
Many-Shot In-Context Learning in Multimodal Foundation Models	May 16, 2024	image-classification, Image Classification	Code Available	2	5
Towards Unified Keyframe Propagation Models	May 19, 2022	Image Inpainting, Video Editing	Code Available	2	5
Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks	Mar 27, 2025	Imitation Learning, Mathematical Reasoning	Code Available	2	5
OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents	Jun 17, 2025		Code Available	2	5
A Versatile Framework for Multi-scene Person Re-identification	Mar 17, 2024	Data Augmentation, Person Re-Identification	Code Available	2	5
Measuring Massive Multitask Language Understanding	Sep 7, 2020	Elementary Mathematics, Multi-task Language Understanding	Code Available	2	5
CombatVLA: An Efficient Vision-Language-Action Model for Combat Tasks in 3D Action Role-Playing Games	Mar 12, 2025	Decision Making, Vision-Language-Action	Code Available	2	5
Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning	Mar 20, 2025	Classification, Few-Shot Learning	Code Available	2	5
Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer	Dec 1, 2021		Code Available	2	5
YOLO-UniOW: Efficient Universal Open-World Object Detection	Dec 30, 2024	Incremental Learning, Object	Code Available	2	5
Voxurf: Voxel-based Efficient and Accurate Neural Surface Reconstruction	Aug 26, 2022	Surface Reconstruction	Code Available	2	5
DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems	Feb 6, 2024		Code Available	2	5
CLRerNet: Improving Confidence of Lane Detection with LaneIoU	May 15, 2023	Autonomous Driving, Lane Detection	Code Available	2	5
Do we actually understand the impact of renewables on electricity prices? A causal inference approach	Jan 10, 2025	Causal Inference	Code Available	2	5
Transformer Circuit Faithfulness Metrics are not Robust	Jul 11, 2024		Code Available	2	5
Retinexmamba: Retinex-based Mamba for Low-light Image Enhancement	May 6, 2024	Computational Efficiency, Deep Learning	Code Available	2	5
COVID-19 Image Data Collection: Prospective Predictions Are the Future	Jun 22, 2020	Management	Code Available	2	5