The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 19,251–19,300 of 474,278 papers

Title	Date	Tasks	Status	Hype
Detecting Airborne Objects with 5G NR Radars	May 30, 2025	Integrated sensing and communication, ISAC	Unverified	0
ProofNet++: A Neuro-Symbolic System for Formal Proof Verification with Self-Correction	May 30, 2025	Automated Theorem Proving	Unverified	0
LLMs Are Globally Multilingual Yet Locally Monolingual: Exploring Knowledge Transfer via Language and Thought Theory	May 30, 2025	Cross-Lingual Transfer, Transfer Learning	Unverified	0
Improving Reliability and Explainability of Medical Question Answering through Atomic Fact Checking in Retrieval-Augmented LLMs	May 30, 2025	Fact Checking, Hallucination	Unverified	0
Provably Improving Generalization of Few-Shot Models with Synthetic Data	May 30, 2025	Few-Shot Image Classification, image-classification	Unverified	0
Searching Clinical Data Using Generative AI	May 30, 2025	Diagnostic	Unverified	0
Cartan Networks: Group theoretical Hyperbolic Deep Learning	May 30, 2025	Deep Learning	Code Available	0
Knockoff-Guided Compressive Sensing: A Statistical Machine Learning Framework for Support-Assured Signal Recovery	May 30, 2025	Compressive Sensing	Code Available	0
Circuit Stability Characterizes Language Model Generalization	May 30, 2025	Language Modeling, Language Modelling	Code Available	0
ERU-KG: Efficient Reference-aligned Unsupervised Keyphrase Generation	May 30, 2025	Informativeness, Keyphrase Generation	Code Available	0
HardTests: Synthesizing High-Quality Test Cases for LLM Coding	May 30, 2025	Code Generation, Language Modeling	Unverified	0
Pretraining Multi-Speaker Identification for Neural Speaker Diarization	May 30, 2025	speaker-diarization, Speaker Diarization	Unverified	0
A Flat Minima Perspective on Understanding Augmentations and Model Robustness	May 30, 2025	Adversarial Robustness, Data Augmentation	Unverified	0
Category-aware EEG image generation based on wavelet transform and contrast semantic loss	May 30, 2025	EEG, Image Generation	Code Available	0
MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs	May 30, 2025	Benchmarking	Code Available	0
Tag-Evol: Achieving Efficient Instruction Evolving via Tag Injection	May 30, 2025	TAG	Code Available	0
Diffusion-Based Symbolic Regression	May 30, 2025	Audio Synthesis, Denoising	Unverified	0
Binary Cumulative Encoding meets Time Series Forecasting	May 30, 2025	Time Series, Time Series Forecasting	Unverified	0
Weisfeiler and Leman Follow the Arrow of Time: Expressive Power of Message Passing in Temporal Event Graphs	May 30, 2025	Graph Classification	Unverified	0
Can Slow-thinking LLMs Reason Over Time? Empirical Studies in Time Series Forecasting	May 30, 2025	Language Modeling, Language Modelling	Code Available	1
Multilingual Gloss-free Sign Language Translation: Towards Building a Sign Language Foundation Model	May 30, 2025	Gloss-free Sign Language Translation, Sign Language Translation	Code Available	0
SwiftEval: Developing a Language-Specific Benchmark for LLM-generated Code Evaluation	May 30, 2025	Code Generation, HumanEval	Unverified	0
When Humans Growl and Birds Speak: High-Fidelity Voice Conversion from Human to Animal and Designed Sounds	May 30, 2025	Voice Conversion	Unverified	0
A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings	May 30, 2025	Math	Code Available	1
Interpreting Large Text-to-Image Diffusion Models with Dictionary Learning	May 30, 2025	Dictionary Learning, Image Generation	Code Available	0
Mixpert: Mitigating Multimodal Learning Conflicts with Efficient Mixture-of-Vision-Experts	May 30, 2025	Multi-Task Learning	Unverified	0
Beyond Multiple Choice: Evaluating Steering Vectors for Adaptive Free-Form Summarization	May 30, 2025	Form, Language Modeling	Unverified	0
Feature Attribution from First Principles	May 30, 2025		Code Available	0
Effects of Theory of Mind and Prosocial Beliefs on Steering Human-Aligned Behaviors of LLMs in Ultimatum Games	May 30, 2025	Decision Making	Code Available	0
Diversify and Conquer: Open-set Disagreement for Robust Semi-supervised Learning with Outliers	May 30, 2025	Outlier Detection	Code Available	0
K^2IE: Kernel Method-based Kernel Intensity Estimators for Inhomogeneous Poisson Processes	May 30, 2025	Computational Efficiency, Unity	Code Available	0
MMAFFBen: A Multilingual and Multimodal Affective Analysis Benchmark for Evaluating LLMs and VLMs	May 30, 2025	Emotion Classification, Sentiment Analysis	Code Available	0
Don't Erase, Inform! Detecting and Contextualizing Harmful Language in Cultural Heritage Collections	May 30, 2025		Code Available	0
Revisiting Epistemic Markers in Confidence Estimation: Can Markers Accurately Reflect Large Language Models' Uncertainty?	May 30, 2025	Question Answering	Code Available	0
CHIP: Chameleon Hash-based Irreversible Passport for Robust Deep Model Ownership Verification and Active Usage Control	May 30, 2025		Code Available	0
Unsupervised Evolutionary Cell Type Matching via Entropy-Minimized Optimal Transport	May 30, 2025		Code Available	0
Chameleon: A Flexible Data-mixing Framework for Language Model Pretraining and Finetuning	May 30, 2025	Language Modeling, Language Modelling	Code Available	0
Drop Dropout on Single-Epoch Language Model Pretraining	May 30, 2025	Language Modeling, Language Modelling	Code Available	0
Biological Pathway Guided Gene Selection Through Collaborative Reinforcement Learning	May 30, 2025	Dimensionality Reduction, feature selection	Code Available	0
The Road to Generalizable Neuro-Symbolic Learning Should be Paved with Foundation Models	May 30, 2025		Code Available	0
Impact of Bottleneck Layers and Skip Connections on the Generalization of Linear Denoising Autoencoders	May 30, 2025	Denoising	Unverified	0
Decoding Knowledge Attribution in Mixture-of-Experts: A Framework of Basic-Refinement Collaboration and Efficiency Analysis	May 30, 2025	Blocking, Mixture-of-Experts	Unverified	0
Random Rule Forest (RRF): Interpretable Ensembles of LLM-Generated Questions for Predicting Startup Success	May 30, 2025	Decision Making	Unverified	0
BIMA: Bijective Maximum Likelihood Learning Approach to Hallucination Prediction and Mitigation in Large Vision-Language Models	May 30, 2025	Hallucination	Unverified	0
On the Expressive Power of Mixture-of-Experts for Structured Complex Tasks	May 30, 2025	Mixture-of-Experts	Unverified	0
Heterogeneous Graph Masked Contrastive Learning for Robust Recommendation	May 30, 2025	Contrastive Learning	Unverified	0
On the Scaling of Robustness and Effectiveness in Dense Retrieval	May 30, 2025	Adversarial Robustness, Retrieval	Unverified	0
Anomaly Detection and Improvement of Clusters using Enhanced K-Means Algorithm	May 30, 2025	Anomaly Detection	Unverified	0
Voice Conversion Improves Cross-Domain Robustness for Spoken Arabic Dialect Identification	May 30, 2025	Dialect Identification, Voice Conversion	Unverified	0
Speech-to-Text Translation with Phoneme-Augmented CoT: Enhancing Cross-Lingual Transfer in Low-Resource Scenarios	May 30, 2025	Cross-Lingual Transfer, Phoneme Recognition	Unverified	0