The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers


Showing 8101–8150 of 661,570 papers

Title	Date	Tasks	Status	Hype
Law of Vision Representation in MLLMs	Aug 29, 2024	cross-modal alignment, Language Modeling	Code Available	2
CLUE: A Chinese Language Understanding Evaluation Benchmark	Apr 13, 2020	General Classification, Machine Reading Comprehension	Code Available	2
MASS: Multi-Agent Simulation Scaling for Portfolio Construction	May 15, 2025		Code Available	2
Towards High-Resolution 3D Anomaly Detection: A Scalable Dataset and Real-Time Framework for Subtle Industrial Defects	Jul 10, 2025	3D Anomaly Detection, Anomaly Detection	Code Available	2
Massive Activations in Large Language Models	Feb 27, 2024		Code Available	2
Natural and Robust Walking using Reinforcement Learning without Demonstrations in High-Dimensional Musculoskeletal Models	Sep 6, 2023	Reinforcement Learning (RL)	Code Available	2
Feature Pyramid Networks for Object Detection	Dec 9, 2016	GPU, Object	Code Available	2
Frozen Transformers in Language Models Are Effective Visual Encoder Layers	Oct 19, 2023	Action Recognition, Image-text Retrieval	Code Available	2
The Russian Legislative Corpus	Jun 7, 2024		Code Available	2
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction	May 30, 2023	Image Generation, Instruction Following	Code Available	2
FisherRF: Active View Selection and Uncertainty Quantification for Radiance Fields using Fisher Information	Nov 29, 2023	NeRF, Uncertainty Quantification	Code Available	2
Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding	Sep 15, 2023	Language Modeling, Language Modelling	Code Available	2
Generative Pretraining from Pixels	Jul 17, 2020	Image Classification, Representation Learning	Code Available	2
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)	Jan 16, 2024	Scheduling	Code Available	2
Invertible Diffusion Models for Compressed Sensing	Mar 25, 2024	compressed sensing, GPU	Code Available	2
HASSOD: Hierarchical Adaptive Self-Supervised Object Detection	Feb 5, 2024	Object, object-detection	Code Available	2
OpenCity: Open Spatio-Temporal Foundation Models for Traffic Prediction	Aug 16, 2024	Prediction, Traffic Prediction	Code Available	2
Toward Unified Practices in Trajectory Prediction Research on Bird's-Eye-View Datasets	May 1, 2024	Autonomous Vehicles, Motion Forecasting	Code Available	2
Closing the Gap Between Synthetic and Ground Truth Time Series Distributions via Neural Mapping	Jan 29, 2025	Time Series, Time Series Classification	Code Available	2
EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality	Nov 22, 2024	Efficient Neural Network, Image Classification	Code Available	2
DynRefer: Delving into Region-level Multi-modality Tasks via Dynamic Resolution	May 25, 2024	Attribute	Code Available	2
An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM	Mar 27, 2024	Language Modeling, Language Modelling	Code Available	2
EMOv2: Pushing 5M Vision Model Frontier	Dec 9, 2024	Image Generation, model	Code Available	2
PSP-HDRI+: A Synthetic Dataset Generator for Pre-Training of Human-Centric Computer Vision Models	Jul 11, 2022	Keypoint Estimation	Code Available	2
OpenBox: A Python Toolkit for Generalized Black-box Optimization	Apr 26, 2023	Experimental Design	Code Available	2
When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute	Feb 24, 2021	GPU, Language Modeling	Code Available	2
ICML 2023 Topological Deep Learning Challenge: Design and Results	Sep 26, 2023	Deep Learning	Code Available	2
Longhorn: State Space Models are Amortized Online Learners	Jul 19, 2024	Language Modeling, Language Modelling	Code Available	2
CCPL: Contrastive Coherence Preserving Loss for Versatile Style Transfer	Jul 11, 2022	Image-to-Image Translation, Style Transfer	Code Available	2
A mmWave Software-Defined Array Platform for Wireless Experimentation at 24-29.5 GHz	Sep 17, 2024		Code Available	2
Empirical Asset Pricing with Large Language Model Agents	Sep 25, 2024	Language Modeling, Language Modelling	Code Available	2
The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements	Jun 27, 2025		Code Available	2
DCoM: Active Learning for All Learners	Jul 1, 2024	Active Learning, All	Code Available	2
Foundation Models for Remote Sensing and Earth Observation: A Survey	Oct 22, 2024	Earth Observation, Humanitarian	Code Available	2
PMC-LLaMA: Towards Building Open-source Language Models for Medicine	Apr 27, 2023	Language Modeling, Language Modelling	Code Available	2
SWE-bench Goes Live!	May 29, 2025		Code Available	2
Uncertainty-Informed Deep Learning Models Enable High-Confidence Predictions for Digital Histopathology	Apr 9, 2022	Attribute, Uncertainty Quantification	Code Available	2
Accelerated Policy Learning with Parallel Differentiable Simulation	Apr 14, 2022	Deep Reinforcement Learning	Code Available	2
SimVP: Simpler yet Better Video Prediction	Jun 9, 2022	Prediction, Video Prediction	Code Available	2
Rethinking Imitation-based Planner for Autonomous Driving	Sep 19, 2023	Autonomous Driving, Data Augmentation	Code Available	2
Contrastive Flow Matching	Jun 5, 2025		Code Available	2
Conformal prediction interval for dynamic time-series	Oct 18, 2020	Conformal Prediction, Ensemble Learning	Code Available	2
ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale	Mar 24, 2023		Code Available	2
video-SALMONN 2: Captioning-Enhanced Audio-Visual Large Language Models	Jun 18, 2025	Audio captioning, Large Language Model	Code Available	2
InstructSAM: A Training-Free Framework for Instruction-Oriented Remote Sensing Object Recognition	May 21, 2025	Earth Observation, Object	Code Available	2
DiscoveryBench: Towards Data-Driven Discovery with Large Language Models	Jul 1, 2024	Code Generation, Sociology	Code Available	2
Investigating image-based fallow weed detection performance on Raphanus sativus and Avena sativa at speeds up to 30 km h^-1	May 17, 2023		Code Available	2
Training Socially Aligned Language Models on Simulated Social Interactions	May 26, 2023		Code Available	2
Stabilizing Transformer Training by Preventing Attention Entropy Collapse	Mar 11, 2023	Automatic Speech Recognition, image-classification	Code Available	2
End-to-End Vectorized HD-map Construction with Piecewise Bezier Curve	Jun 16, 2023	3D geometry, Autonomous Driving	Code Available	2