The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers


Showing 18651–18700 of 474278 papers

Title	Date	Tasks	Status	Hype
Confidence intervals for forced alignment boundaries using model ensembles	Jun 2, 2025		Code Available	0
System Calls for Malware Detection and Classification: Methodologies and Applications	Jun 2, 2025	Anomaly Detection, Malware Analysis	Unverified	0
Memory Access Characterization of Large Language Models in CPU Environment and its Potential Impacts	Jun 2, 2025	CPU	Unverified	0
Sensitivity-Aware Density Estimation in Multiple Dimensions	Jun 2, 2025	Density Estimation, Sensitivity	Unverified	0
The Promise of Spiking Neural Networks for Ubiquitous Computing: A Survey and New Perspectives	Jun 2, 2025	Time Series	Unverified	0
Not All Jokes Land: Evaluating Large Language Models Understanding of Workplace Humor	Jun 2, 2025	All, Automatic Writing	Unverified	0
SOC-DGL: Social Interaction Behavior Inspired Dual Graph Learning Framework for Drug-Target Interaction Identification	Jun 2, 2025	Drug Discovery, Graph Learning	Code Available	0
Reasoning-Table: Exploring Reinforcement Learning for Table Reasoning	Jun 2, 2025	Fact Verification, Language Modeling	Code Available	2
Synthesis of discrete-continuous quantum circuits with multimodal diffusion models	Jun 2, 2025	Denoising, Parameter Prediction	Code Available	2
STORM-BORN: A Challenging Mathematical Derivations Dataset Curated via a Human-in-the-Loop Multi-Agent Framework	Jun 2, 2025	Math	Code Available	1
EfficientFER: EfficientNetv2 Based Deep Learning Approach for Facial Expression Recognition	Jun 2, 2025	Deep Learning, Emotion Recognition	Code Available	1
Parameter Efficient Fine Tuning Llama 3.1 for Answering Arabic Legal Questions: A Case Study on Jordanian Laws	Jun 2, 2025	Language Modeling, Language Modelling	Code Available	0
Compiler Optimization via LLM Reasoning for Efficient Model Serving	Jun 2, 2025	Compiler Optimization, Large Language Model	Code Available	2
FlexSelect: Flexible Token Selection for Efficient Long Video Understanding	Jun 1, 2025	Video Understanding	Unverified	0
L3A: Label-Augmented Analytic Adaptation for Multi-Label Class Incremental Learning	Jun 1, 2025	class-incremental learning, Class Incremental Learning	Code Available	0
Infinity Parser: Layout Aware Reinforcement Learning for Scanned Document Parsing	Jun 1, 2025	Document AI, document understanding	Code Available	0
Affordance Benchmark for MLLMs	Jun 1, 2025		Code Available	0
Predicting Empirical AI Research Outcomes with Language Models	Jun 1, 2025	Retrieval	Unverified	0
PFMBench: Protein Foundation Model Benchmark	Jun 1, 2025	model	Code Available	1
LD-RPMNet: Near-Sensor Diagnosis for Railway Point Machines	Jun 1, 2025	Computational Efficiency, Diagnostic	Unverified	0
Self-Supervised-ISAR-Net Enables Fast Sparse ISAR Imaging	Jun 1, 2025	Image Reconstruction, Self-Supervised Learning	Unverified	0
Near-Field Directional Modulation for RIS-Aided Movable Antenna MIMO Systems with Hardware Impairments	Jun 1, 2025	compressed sensing	Unverified	0
ModuLM: Enabling Modular and Multimodal Molecular Relational Learning with Large Language Models	Jun 1, 2025	Benchmarking, Relational Reasoning	Unverified	0
ChemAU: Harness the Reasoning of LLMs in Chemical Research with Adaptive Uncertainty Estimation	Jun 1, 2025	Natural Language Understanding	Unverified	0
Uncertainty-Aware Metabolic Stability Prediction with Dual-View Contrastive Learning	Jun 1, 2025	Contrastive Learning, Prediction	Unverified	0
Explainable-AI powered stock price prediction using time series transformers: A Case Study on BIST100	Jun 1, 2025	Explainable artificial intelligence, Explainable Artificial Intelligence (XAI)	Unverified	0
A Group-Wise Narrow Beam Design for Uplink Channel Estimation in Hybrid Beamforming Systems	Jun 1, 2025	Bayesian Inference	Unverified	0
Bridging Quantum and Classical Computing in Drug Design: Architecture Principles for Improved Molecule Generation	Jun 1, 2025	Bayesian Optimization, Drug Design	Code Available	0
TRUST -- Transformer-Driven U-Net for Sparse Target Recovery	Jun 1, 2025	Decoder, Hallucination	Unverified	0
anyECG-chat: A Generalist ECG-MLLM for Flexible ECG Input and Multi-Task Understanding	Jun 1, 2025	Open-Ended Question Answering, Question Answering	Unverified	0
Training Beam Design for Channel Estimation in Hybrid mmWave MIMO Systems	Jun 1, 2025	Compressive Sensing	Unverified	0
ProtInvTree: Deliberate Protein Inverse Folding with Reward-guided Tree Search	Jun 1, 2025	Denoising	Unverified	0
Projection Pursuit Density Ratio Estimation	Jun 1, 2025	Causal Inference, Density Ratio Estimation	Unverified	0
Evaluating the Unseen Capabilities: How Many Theorems Do LLMs Know?	Jun 1, 2025	Diversity, Information Retrieval	Unverified	0
Uncovering Bias Mechanisms in Observational Studies	Jun 1, 2025	Causal Inference	Unverified	0
A Reinforcement Learning Approach for RIS-aided Fair Communications	Jun 1, 2025	Fairness, reinforcement-learning	Unverified	0
Protap: A Benchmark for Protein Modeling on Realistic Downstream Applications	Jun 1, 2025		Code Available	1
Can AI Master Econometrics? Evidence from Econometrics AI Agent on Expert-Level Tasks	Jun 1, 2025	AI Agent, Econometrics	Unverified	0
Designing DSIC Mechanisms for Data Sharing in the Era of Large Language Models	Jun 1, 2025	Privacy Preserving	Unverified	0
NR4DER: Neural Re-ranking for Diversified Exercise Recommendation	Jun 1, 2025	Re-Ranking	Code Available	0
Fast or Slow? Integrating Fast Intuition and Deliberate Thinking for Enhancing Visual Question Answering	Jun 1, 2025	All, MME	Unverified	0
GThinker: Towards General Multimodal Reasoning via Cue-Guided Rethinking	Jun 1, 2025	4k, Math	Code Available	0
SocialEval: Evaluating Social Intelligence of Large Language Models	Jun 1, 2025	Navigate	Code Available	0
Probing Neural Topology of Large Language Models	Jun 1, 2025	Functional Connectivity, Graph Matching	Code Available	0
SealQA: Raising the Bar for Reasoning in Search-Augmented Language Models	Jun 1, 2025		Code Available	3
DeepVerse: 4D Autoregressive Video Generation as a World Model	Jun 1, 2025	Video Generation	Unverified	0
Pi-SQL: Enhancing Text-to-SQL with Fine-Grained Guidance from Pivot Programming Languages	Jun 1, 2025	Text to SQL, Text-To-SQL	Unverified	0
Fighting Fire with Fire (F3): A Training-free and Efficient Visual Adversarial Example Purification Method in LVLMs	Jun 1, 2025	Adversarial Purification, Computational Efficiency	Unverified	0
A Review on Coarse to Fine-Grained Animal Action Recognition	Jun 1, 2025	Action Recognition, Animal Action Recognition	Unverified	0
AuralSAM2: Enabling SAM2 Hear Through Pyramid Audio-Visual Feature Prompting	Jun 1, 2025	Contrastive Learning, Decoder	Code Available	0