The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 17,151–17,200 of 474,278 papers

Title	Date	Tasks	Status	Hype
GradEscape: A Gradient-Based Evader Against AI-Generated Text Detectors	Jun 9, 2025	Benchmarking, Model extraction	Unverified	0
LEANN: A Low-Storage Vector Index	Jun 9, 2025	Question Answering, RAG	Unverified	0
Jamais Vu: Exposing the Generalization Gap in Supervised Semantic Correspondence	Jun 9, 2025	Depth Estimation, Monocular Depth Estimation	Unverified	0
Reinforcement Learning from Human Feedback with High-Confidence Safety Constraints	Jun 9, 2025	Safe Reinforcement Learning	Code Available	0
Hierarchical Lexical Graph for Enhanced Multi-Hop Retrieval	Jun 9, 2025	Dataset Generation, RAG	Code Available	3
Silencing Empowerment, Allowing Bigotry: Auditing the Moderation of Hate Speech on Twitch	Jun 9, 2025		Code Available	0
Lite-RVFL: A Lightweight Random Vector Functional-Link Neural Network for Learning Under Concept Drift	Jun 9, 2025	Drift Detection	Code Available	0
BLUR: A Bi-Level Optimization Approach for LLM Unlearning	Jun 9, 2025		Code Available	0
WWAggr: A Window Wasserstein-based Aggregation for Ensemble Change Point Detection	Jun 9, 2025	Change Point Detection	Unverified	0
Federated Learning on Stochastic Neural Networks	Jun 9, 2025	Edge-computing, Federated Learning	Unverified	0
GIQ: Benchmarking 3D Geometric Reasoning of Vision Foundation Models with Simulated and Real Polyhedra	Jun 9, 2025	3D Reconstruction, Benchmarking	Unverified	0
Snap-and-tune: combining deep learning and test-time optimization for high-fidelity cardiovascular volumetric meshing	Jun 9, 2025		Code Available	2
A Good CREPE needs more than just Sugar: Investigating Biases in Compositional Vision-Language Benchmarks	Jun 9, 2025	Language Modeling, Language Modelling	Unverified	0
Temporalizing Confidence: Evaluation of Chain-of-Thought Reasoning with Signal Temporal Logic	Jun 9, 2025	Mathematical Reasoning	Unverified	0
FreeGave: 3D Physics Learning from Dynamic Videos by Gaussian Velocity	Jun 9, 2025	Motion Segmentation	Code Available	1
SWAT-NN: Simultaneous Weights and Architecture Training for Neural Networks in a Latent Space	Jun 9, 2025	Neural Architecture Search	Code Available	0
UniVarFL: Uniformity and Variance Regularized Federated Learning for Heterogeneous Data	Jun 9, 2025	Federated Learning	Code Available	0
Eliciting Fine-Tuned Transformer Capabilities via Inference-Time Techniques	Jun 9, 2025	In-Context Learning, Retrieval-augmented Generation	Unverified	0
DLNet: Direction-Aware Feature Integration for Robust Lane Detection in Complex Environments	Jun 9, 2025	Autonomous Driving, Lane Detection	Code Available	0
Discrete and Continuous Difference of Submodular Minimization	Jun 9, 2025	Compressive Sensing	Code Available	0
Evidential Spectrum-Aware Contrastive Learning for OOD Detection in Dynamic Graphs	Jun 9, 2025	Contrastive Learning, Out of Distribution (OOD) Detection	Code Available	0
Fractional-order Jacobian Matrix Differentiation and Its Application in Artificial Neural Networks	Jun 9, 2025	GPU	Unverified	0
Difference Inversion: Interpolate and Isolate the Difference with Token Consistency for Image Analogy Generation	Jun 9, 2025	In-Context Learning	Unverified	0
Decoding Saccadic Eye Movements from Brain Signals Using an Endovascular Neural Interface	Jun 9, 2025	Brain Computer Interface, EEG	Unverified	0
SAFEFLOW: A Principled Protocol for Trustworthy and Transactional Autonomous Agent Systems	Jun 9, 2025	Scheduling	Unverified	0
Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding	Jun 9, 2025		Code Available	1
Flowing Datasets with Wasserstein over Wasserstein Gradient Flows	Jun 9, 2025	Dataset Distillation, Domain Adaptation	Code Available	1
Circumventing Backdoor Space via Weight Symmetry	Jun 9, 2025	Self-Supervised Learning	Code Available	0
ProtocolLLM: RTL Benchmark for SystemVerilog Generation of Communication Protocols	Jun 9, 2025	Code Generation, Specificity	Unverified	0
Multiple Object Stitching for Unsupervised Representation Learning	Jun 9, 2025	Contrastive Learning, Object	Code Available	1
CausalPFN: Amortized Causal Effect Estimation via In-Context Learning	Jun 9, 2025	Decision Making, Heterogeneous Treatment Effect Estimation	Code Available	2
Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning	Jun 8, 2025	Attribute, Hallucination	Unverified	0
Theorem-of-Thought: A Multi-Agent Framework for Abductive, Deductive, and Inductive Reasoning in Language Models	Jun 8, 2025	Natural Language Inference	Code Available	0
Next-Generation Conflict Forecasting: Unleashing Predictive Patterns through Spatiotemporal Learning	Jun 8, 2025	Feature Engineering, Humanitarian	Unverified	0
Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification	Jun 8, 2025	Question Answering, Visual Question Answering	Code Available	1
Learning to Clarify by Reinforcement Learning Through Reward-Weighted Fine-Tuning	Jun 8, 2025	Offline RL, Question Answering	Unverified	0
Filling the Missings: Spatiotemporal Data Imputation by Conditional Diffusion	Jun 8, 2025	Imputation	Code Available	0
Paged Attention Meets FlexAttention: Unlocking Long-Context Efficiency in Deployed Inference	Jun 8, 2025	GPU	Unverified	0
Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs	Jun 8, 2025		Code Available	2
Representing Time-Continuous Behavior of Cyber-Physical Systems in Knowledge Graphs	Jun 8, 2025	Graph Generation, Knowledge Graphs	Unverified	0
MS-TVNet:A Long-Term Time Series Prediction Method Based on Multi-Scale Dynamic Convolution	Jun 8, 2025	Time Series, Time Series Prediction	Code Available	0
Latency Optimization for Wireless Federated Learning in Multihop Networks	Jun 8, 2025	Federated Learning	Code Available	0
FLAIR-HUB: Large-scale Multimodal Dataset for Land Cover and Crop Mapping	Jun 8, 2025	Earth Observation, Multi-Task Learning	Unverified	0
Manifesto from Dagstuhl Perspectives Workshop 24352 -- Conversational Agents: A Framework for Evaluation (CAFE)	Jun 8, 2025	Conversational Information Access	Unverified	0
SDE-SQL: Enhancing Text-to-SQL Generation in Large Language Models via Self-Driven Exploration with SQL Probes	Jun 8, 2025	Text to SQL, Text-To-SQL	Unverified	0
Reliable Critics: Monotonic Improvement and Convergence Guarantees for Reinforcement Learning	Jun 8, 2025	Reinforcement Learning (RL)	Unverified	0
AnnoDPO: Protein Functional Annotation Learning with Direct Preference Optimization	Jun 8, 2025	Language Modeling, Language Modelling	Code Available	0
Joint Channel and Symbol Estimation for Communication Systems with Movable Antennas	Jun 8, 2025	Tensor Decomposition	Unverified	0
Transfer Learning and Explainable AI for Brain Tumor Classification: A Study Using MRI Data from Bangladesh	Jun 8, 2025	Brain Tumor Classification, Explainable artificial intelligence	Unverified	0
Simultaneous Segmentation of Ventricles and Normal/Abnormal White Matter Hyperintensities in Clinical MRI using Deep Learning	Jun 8, 2025	Computational Efficiency, Segmentation	Unverified	0