The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 2,751–2,775 of 177,340 papers

Title	Date	Tasks	Status	Hype	Score
Evaluating Large Language Models Trained on Code	Jul 7, 2021	Code Generation, HumanEval	Code Available	3	5
Learning Inclusion Matching for Animation Paint Bucket Colorization	Mar 27, 2024	Colorization	Code Available	3	5
Learning to Use Tools via Cooperative and Interactive Agents	Mar 5, 2024		Code Available	3	5
WhisperNER: Unified Open Named Entity and Speech Recognition	Sep 12, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	3	5
Theory, Analysis, and Best Practices for Sigmoid Self-Attention	Sep 6, 2024		Code Available	3	5
Fairness in Serving Large Language Models	Dec 31, 2023	Fairness, Scheduling	Code Available	3	5
Degradation-Aware Residual-Conditioned Optimal Transport for Unified Image Restoration	Nov 3, 2024	5-Degradation Blind All-in-One Image Restoration, Blind All-in-One Image Restoration	Code Available	3	5
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images	Mar 18, 2024	Long-Context Understanding, TextVQA	Code Available	3	5
Language Models are Few-Shot Learners	May 28, 2020	answerability prediction, Articles	Code Available	3	5
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models	Dec 24, 2024	Attribute	Code Available	3	5
A Smart Multimodal Healthcare Copilot with Powerful LLM Reasoning	Jun 3, 2025	Decision Making, Diagnostic	Code Available	3	5
TCFormer: Visual Recognition via Token Clustering Transformer	Jul 16, 2024	Clustering, image-classification	Code Available	3	5
TSI-Bench: Benchmarking Time Series Imputation	Jun 18, 2024	Benchmarking, Deep Learning	Code Available	3	5
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint	Dec 18, 2023	Language Modeling, Language Modelling	Code Available	3	5
A Unified Anomaly Synthesis Strategy with Gradient Ascent for Industrial Anomaly Detection and Localization	Jul 12, 2024	Anomaly Detection, Defect Detection	Code Available	3	5
Seamless Human Motion Composition with Blended Positional Encodings	Feb 23, 2024	Denoising, Motion Generation	Code Available	3	5
AdaRevD: Adaptive Patch Exiting Reversible Decoder Pushes the Limit of Image Deblurring	Jun 13, 2024	Deblurring, Decoder	Code Available	3	5
LocalMamba: Visual State Space Model with Windowed Selective Scan	Mar 14, 2024	Mamba, State Space Models	Code Available	3	5
CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models	Jan 30, 2024	Knowledge Base Construction, Question Answering	Code Available	3	5
Will LLMs be Professional at Fund Investment? DeepFund: A Live Arena Perspective	Mar 24, 2025	Decision Making	Code Available	3	5
Event-Enhanced Blurry Video Super-Resolution	Apr 17, 2025	Deblurring, Motion Estimation	Code Available	3	5
Generative Data Augmentation using LLMs improves Distributional Robustness in Question Answering	Sep 3, 2023	Data Augmentation, Domain Adaptation	Code Available	3	5
Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning	May 1, 2024	ARC, GSM8K	Code Available	3	5
A Survey of Large Language Models in Finance (FinLLMs)	Feb 4, 2024	Named Entity Recognition (NER), Question Answering	Code Available	3	5
INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning	Jan 12, 2024	Diversity, document understanding	Code Available	3	5