The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 11401–11450 of 474278 papers

Title	Date	Tasks	Status	Hype
AIM 2022 Challenge on Super-Resolution of Compressed Image and Video: Dataset, Methods and Results	Aug 23, 2022	Super-Resolution	Code Available	2
VenusFactory: A Unified Platform for Protein Engineering Data Retrieval and Language Model Fine-Tuning	Mar 19, 2025	Benchmarking, Language Modeling	Code Available	2
Grounded 3D-LLM with Referent Tokens	May 16, 2024	Dense Captioning, Diversity	Code Available	2
Generative Planning with 3D-vision Language Pre-training for End-to-End Autonomous Driving	Jan 15, 2025	Autonomous Driving, Trajectory Planning	Code Available	2
GraphMAE2: A Decoding-Enhanced Masked Self-Supervised Graph Learner	Apr 10, 2023	Self-Supervised Learning	Code Available	2
ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM	Aug 22, 2024	Misinformation	Code Available	2
Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model	Jul 9, 2024	Chart Understanding, Language Modeling	Code Available	2
Medical Image Segmentation Review: The success of U-Net	Nov 27, 2022	Image Segmentation, Medical Image Segmentation	Code Available	2
A Survey on Protein Representation Learning: Retrospect and Prospect	Dec 31, 2022	Representation Learning, Survey	Code Available	2
Tissue Concepts: supervised foundation models in computational pathology	Sep 5, 2024	Diagnostic, Multi-Task Learning	Code Available	2
Real-time Scene Text Detection with Differentiable Binarization	Nov 20, 2019	Binarization, Optical Character Recognition (OCR)	Code Available	2
Characterization of Large Language Model Development in the Datacenter	Mar 12, 2024	GPU, Language Modeling	Code Available	2
Scalable telomere-to-telomere assembly for diploid and polyploid genomes with double graph	Jun 6, 2023		Code Available	2
Generative Semi-supervised Graph Anomaly Detection	Feb 19, 2024	Anomaly Detection, Graph Anomaly Detection	Code Available	2
Reconstructing Personalized Semantic Facial NeRF Models From Monocular Video	Oct 12, 2022	NeRF	Code Available	2
Language-driven Semantic Segmentation	Jan 10, 2022	Descriptive, Few-Shot Semantic Segmentation	Code Available	2
dtaianomaly: A Python library for time series anomaly detection	Feb 20, 2025	Anomaly Detection, Time Series	Code Available	2
Benchmarking Deep Reinforcement Learning for Continuous Control	Apr 22, 2016	Action Triplet Recognition, Atari Games	Code Available	2
Model Comparison and Calibration Assessment: User Guide for Consistent Scoring Functions in Machine Learning and Actuarial Practice	Feb 25, 2022		Code Available	2
MPAX: Mathematical Programming in JAX	Dec 12, 2024		Code Available	2
Multi-View Reasoning: Consistent Contrastive Learning for Math Word Problem	Oct 21, 2022	Contrastive Learning, Math	Code Available	2
LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs	Jun 21, 2022	3D Object Detection, Object	Code Available	2
ColorVideoVDP: A visual difference predictor for image, video and display distortions	Jan 21, 2024	Video Compression	Code Available	2
MHNet: Multi-view High-order Network for Diagnosing Neurodevelopmental Disorders Using Resting-state fMRI	Jul 3, 2024	Functional Connectivity, Graph Neural Network	Code Available	2
Dungeons and Data: A Large-Scale NetHack Dataset	Nov 1, 2022	Decision Making, NetHack	Code Available	2
DataSciBench: An LLM Agent Benchmark for Data Science	Feb 19, 2025	Code Generation, Large Language Model	Code Available	2
Navigating the Shadows: Unveiling Effective Disturbances for Modern AI Content Detectors	Jun 13, 2024	Data Augmentation, Text Detection	Code Available	2
Trieste: Efficiently Exploring The Depths of Black-box Functions with TensorFlow	Feb 16, 2023	Active Learning, Bayesian Optimization	Code Available	2
Foundation Policies with Hilbert Representations	Feb 23, 2024	Reinforcement Learning (RL), Unsupervised Pre-training	Code Available	2
ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization	Jun 6, 2024		Code Available	2
Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs	Jun 13, 2024	Benchmarking, Question Answering	Code Available	2
IRef-VLA: A Benchmark for Interactive Referential Grounding with Imperfect Language in 3D Scenes	Mar 20, 2025	Scene Understanding, Spatial Reasoning	Code Available	2
Explore the Limits of Omni-modal Pretraining at Scale	Jun 13, 2024	Language Modeling, Language Modelling	Code Available	2
Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation	May 9, 2025	Image Generation, Image Segmentation	Code Available	2
Learning to Generate Explainable Stock Predictions using Self-Reflective Large Language Models	Feb 6, 2024	Stock Prediction	Code Available	2
FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner	Sep 26, 2024	Image Generation, Text to Image Generation	Code Available	2
MC-Calib: A generic and robust calibration toolbox for multi-camera systems	Jan 12, 2022	Camera Calibration	Code Available	2
Yggdrasil Decision Forests: A Fast and Extensible Decision Forests Library	Dec 6, 2022		Code Available	2
NUDT4MSTAR: A Large Dataset and Benchmark Towards Remote Sensing Object Recognition in the Wild	Jan 23, 2025	Earth Observation, Object Recognition	Code Available	2
Imp: Highly Capable Large Multimodal Models for Mobile Devices	May 20, 2024	Quantization, Visual Question Answering	Code Available	2
DurLAR: A High-fidelity 128-channel LiDAR Dataset with Panoramic Ambient and Reflectivity Imagery for Multi-modal Autonomous Driving Applications	Jun 14, 2024	Autonomous Driving, Depth Estimation	Code Available	2
Graph-ToolFormer: To Empower LLMs with Graph Reasoning Ability via Prompt Augmented by ChatGPT	Apr 10, 2023	Graph Learning, Knowledge Graphs	Code Available	2
RecBole: Towards a Unified, Comprehensive and Efficient Framework for Recommendation Algorithms	Nov 3, 2020	Collaborative Filtering, GPU	Code Available	2
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution	Oct 21, 2024	Allmodel	Code Available	2
Trusted Multi-View Classification with Dynamic Evidential Fusion	Apr 25, 2022	Classification, MULTI-VIEW LEARNING	Code Available	2
Reading Between the Frames: Multi-Modal Depression Detection in Videos from Non-Verbal Cues	Jan 5, 2024	Depression Detection	Code Available	2
Deep Differentiable Logic Gate Networks	Oct 15, 2022	CPU, Efficient Neural Network	Code Available	2
SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models	Feb 7, 2024	Diversity, Multiple-choice	Code Available	2
The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"	Sep 21, 2023	Data Augmentation, Sentence	Code Available	2
Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation	Jan 12, 2024	Language Modeling, Language Modelling	Code Available	2