The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 19651–19700 of 474278 papers

Title	Date	Tasks	Status	Hype
Model-Preserving Adaptive Rounding	May 29, 2025	model, Quantization	Code Available	2
Pseudo Multi-Source Domain Generalization: Bridging the Gap Between Single and Multi-Source Domain Generalization	May 29, 2025	Data Augmentation, Domain Generalization	Code Available	0
Don't Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language Models	May 29, 2025		Code Available	0
Bounded-Abstention Pairwise Learning to Rank	May 29, 2025	Decision Making, Learning-To-Rank	Unverified	0
PhysicsNeRF: Physics-Guided 3D Reconstruction from Sparse Views	May 29, 2025	3D Reconstruction, NeRF	Code Available	0
ScEdit: Script-based Assessment of Knowledge Editing	May 29, 2025	counterfactual, knowledge editing	Code Available	0
Efficiently Access Diffusion Fisher: Within the Outer Product Span Space	May 29, 2025		Code Available	0
DyePack: Provably Flagging Test Set Contamination in LLMs Using Backdoors	May 29, 2025	MMLU, Multiple-choice	Code Available	0
Document-Level Text Generation with Minimum Bayes Risk Decoding using Optimal Transport	May 29, 2025	Document Level Machine Translation, Image Captioning	Code Available	0
Gibbs randomness-compression proposition: An efficient deep learning	May 29, 2025	Deep Learning, Neural Architecture Search	Code Available	0
Diverse Prototypical Ensembles Improve Robustness to Subpopulation Shift	May 29, 2025		Code Available	0
Efficient Parameter Estimation for Bayesian Network Classifiers using Hierarchical Linear Smoothing	May 29, 2025	parameter estimation	Code Available	0
Model Immunization from a Condition Number Perspective	May 29, 2025	model	Code Available	1
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers	May 29, 2025	Denoising, Image Generation	Unverified	0
Data-efficient Meta-models for Evaluation of Context-based Questions and Answers in LLMs	May 29, 2025	Dimensionality Reduction, Hallucination	Unverified	0
UrbanCraft: Urban View Extrapolation via Hierarchical Sem-Geometric Priors	May 29, 2025	Neural Rendering	Unverified	0
Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs	May 29, 2025	Machine Unlearning	Unverified	0
ToolHaystack: Stress-Testing Tool-Augmented Language Models in Realistic Long-Term Interactions	May 29, 2025		Code Available	0
Augment or Not? A Comparative Study of Pure and Augmented Large Language Model Recommenders	May 29, 2025	Language Modeling, Language Modelling	Code Available	0
Autoformalization in the Era of Large Language Models: A Survey	May 29, 2025	Automated Theorem Proving	Code Available	5
Hyperbolic-PDE GNN: Spectral Graph Neural Networks in the Perspective of A System of Hyperbolic Partial Differential Equations	May 29, 2025		Code Available	0
Discriminative Policy Optimization for Token-Level Reward Models	May 29, 2025	GSM8K, Language Modeling	Code Available	0
Automatic classification of stop realisation with wav2vec2.0	May 29, 2025	Classification	Code Available	0
Translation in the Wild	May 29, 2025	Machine Translation, Translation	Unverified	0
CLDTracker: A Comprehensive Language Description for Visual Tracking	May 29, 2025	Image Captioning, Visual Tracking	Code Available	0
Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought	May 29, 2025	Multimodal Reasoning	Unverified	0
UAQFact: Evaluating Factual Knowledge Utilization of LLMs on Unanswerable Questions	May 29, 2025		Code Available	0
To Trust Or Not To Trust Your Vision-Language Model's Prediction	May 29, 2025	Transfer Learning	Code Available	1
VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning	May 29, 2025	Anomaly Detection, Descriptive	Code Available	2
Diffusion Guidance Is a Controllable Policy Improvement Operator	May 29, 2025	Offline RL	Code Available	2
Directed Graph Grammars for Sequence-based Learning	May 29, 2025	Bayesian Optimization, Graph Generation	Code Available	1
Puzzled by Puzzles: When Vision-Language Models Can't Take a Hint	May 29, 2025	Image Captioning, Question Answering	Code Available	1
Label-Guided In-Context Learning for Named Entity Recognition	May 29, 2025	In-Context Learning, named-entity-recognition	Code Available	1
MathArena: Evaluating LLMs on Uncontaminated Math Competitions	May 29, 2025	Math, Mathematical Reasoning	Code Available	3
CrossLinear: Plug-and-Play Cross-Correlation Embedding for Time Series Forecasting with Exogenous Variables	May 29, 2025	Time Series, Time Series Forecasting	Code Available	1
FreRA: A Frequency-Refined Augmentation for Contrastive Learning on Time Series Classification	May 29, 2025	Anomaly Detection, Contrastive Learning	Code Available	1
AutoSchemaKG: Autonomous Knowledge Graph Construction through Dynamic Schema Induction from Web-Scale Corpora	May 29, 2025	graph construction, Knowledge Graphs	Code Available	4
Foundation Molecular Grammar: Multi-Modal Foundation Models Induce Interpretable Molecular Graph Languages	May 29, 2025	Diversity, Prompt Learning	Code Available	1
Graph Random Walk with Feature-Label Space Alignment: A Multi-Label Feature Selection Method	May 29, 2025	feature selection	Code Available	0
Boosting Domain Incremental Learning: Selecting the Optimal Parameters is All You Need	May 29, 2025	All, image-classification	Code Available	0
Infinite-Instruct: Synthesizing Scaling Code instruction Data with Bidirectional Synthesis and Static Verification	May 29, 2025	Code Generation	Code Available	0
URWKV: Unified RWKV Model with Multi-state Perspective for Low-light Image Restoration	May 29, 2025	Deblurring, Image Enhancement	Code Available	1
Distributed Federated Learning for Vehicular Network Security: Anomaly Detection Benefits and Multi-Domain Attack Threats	May 29, 2025	Anomaly Detection, Autonomous Vehicles	Unverified	0
Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering	May 29, 2025	Reinforcement Learning (RL)	Code Available	1
Zero-to-Hero: Zero-Shot Initialization Empowering Reference-Based Video Appearance Editing	May 29, 2025	Optical Flow Estimation, Video Editing	Code Available	1
Towards Privacy-Preserving Fine-Grained Visual Classification via Hierarchical Learning from Label Proportions	May 29, 2025	Classification, Dictionary Learning	Unverified	0
Position Dependent Prediction Combination For Intra-Frame Video Coding	May 29, 2025	Position, Prediction	Unverified	0
Neural Interpretable PDEs: Harmonizing Fourier Insights with Attention for Scalable and Interpretable Physics Discovery	May 29, 2025	Computational Efficiency	Code Available	1
Adversarial Semantic and Label Perturbation Attack for Pedestrian Attribute Recognition	May 29, 2025	Adversarial Attack, Attribute	Unverified	0
CURVE: CLIP-Utilized Reinforcement Learning for Visual Image Enhancement via Simple Image Processing	May 29, 2025	Computational Efficiency, Image Enhancement	Unverified	0