The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers


Showing 15851–15900 of 474278 papers

Title	Date	Tasks	Status	Hype
Simple Radiology VLLM Test-time Scaling with Thought Graph Traversal	Jun 13, 2025		Code Available	0
FAME: A Lightweight Spatio-Temporal Network for Model Attribution of Face-Swap Deepfakes	Jun 13, 2025	DeepFake Detection, Face Swapping	Code Available	0
ViSAGe: Video-to-Spatial Audio Generation	Jun 13, 2025	Audio Generation	Unverified	0
Self-supervised Learning of Echocardiographic Video Representations via Online Cluster Distillation	Jun 13, 2025	Anomaly Detection, Clustering	Code Available	1
From Sharpness to Better Generalization for Speech Deepfake Detection	Jun 13, 2025	DeepFake Detection, Face Swapping	Unverified	0
Semantic Scheduling for LLM Inference	Jun 13, 2025	Fairness, Management	Code Available	0
Deep Symmetric Autoencoders from the Eckart-Young-Schmidt Perspective	Jun 13, 2025	Anomaly Detection, Dimensionality Reduction	Code Available	0
DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents	Jun 13, 2025		Code Available	1
SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks	Jun 13, 2025	Benchmarking, Large Language Model	Code Available	2
SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes	Jun 13, 2025	Linear evaluation, Self-Supervised Learning	Code Available	2
Efficient Long-Context LLM Inference via KV Cache Clustering	Jun 13, 2025	Clustering	Unverified	0
Efficient Speech Enhancement via Embeddings from Pre-trained Generative Audioencoders	Jun 13, 2025	Speech Enhancement	Code Available	2
GPLQ: A General, Practical, and Lightning QAT Method for Vision Transformers	Jun 13, 2025	Fine-Grained Image Classification, Quantization	Unverified	0
Vectorized Sparse Second-Order Forward Automatic Differentiation for Optimal Control Direct Methods	Jun 13, 2025	Computational Efficiency	Code Available	1
Dual‑detector Re‑optimization for Federated Weakly Supervised Video Anomaly Detection Via Adaptive Dynamic Recursive Mapping	Jun 13, 2025	Anomaly Detection, Anomaly Detection In Surveillance Videos	Code Available	1
A Multi-Agent Probabilistic Inference Framework Inspired by Kairanban-Style CoT System with IdoBata Conversation for Debiasing	Jun 12, 2025	Diversity, Prediction	Unverified	0
DiffPR: Diffusion-Based Phase Reconstruction via Frequency-Decoupled Learning	Jun 12, 2025	Denoising, Diagnostic	Unverified	0
Design of 3D Beamforming and Deployment Strategies for ISAC-based HAPS Systems	Jun 12, 2025	Integrated sensing and communication, ISAC	Unverified	0
The Sample Complexity of Parameter-Free Stochastic Convex Optimization	Jun 12, 2025	Few-Shot Learning, Model Selection	Unverified	0
Convolutional method for data assimilation An improved method on neuronal electrophysiological data	Jun 12, 2025	parameter estimation	Unverified	0
Measuring multi-calibration	Jun 12, 2025	Density Estimation	Unverified	0
Multimodal Modeling of CRISPR-Cas12 Activity Using Foundation Models and Chromatin Accessibility Data	Jun 12, 2025	Activity Prediction	Unverified	0
Optimal experiment design for practical parameter identifiability and model discrimination	Jun 12, 2025	Experimental Design	Unverified	0
HyBiomass: Global Hyperspectral Imagery Benchmark Dataset for Evaluating Geospatial Foundation Models in Forest Aboveground Biomass Estimation	Jun 12, 2025	Benchmarking	Unverified	0
Invocable APIs derived from NL2SQL datasets for LLM Tool-Calling Evaluation	Jun 12, 2025	Intent Detection, Natural Language Queries	Unverified	0
LLM-as-a-Judge for Reference-less Automatic Code Validation and Refinement for Natural Language to Bash in IT Automation	Jun 12, 2025	Code Generation	Unverified	0
Can Time-Series Foundation Models Perform Building Energy Management Tasks?	Jun 12, 2025	energy management, Management	Unverified	0
Conformal Safety Shielding for Imperfect-Perception Agents	Jun 12, 2025	Conformal Prediction, State Estimation	Unverified	0
Score-based Generative Diffusion Models to Synthesize Full-dose FDG Brain PET from MRI in Epilepsy Patients	Jun 12, 2025	Diagnostic	Unverified	0
A Hybrid Adaptive Nash Equilibrium Solver for Distributed Multi-Agent Systems with Game-Theoretic Jump Triggering	Jun 12, 2025	Computational Efficiency	Unverified	0
Polymorphism Crystal Structure Prediction with Adaptive Space Group Diversity Control	Jun 12, 2025	Diversity	Code Available	0
Joint Denoising of Cryo-EM Projection Images using Polar Transformers	Jun 12, 2025	Cryogenic Electron Microscopy (cryo-EM), Denoising	Unverified	0
FedNano: Toward Lightweight Federated Tuning for Pretrained Multimodal Large Language Models	Jun 12, 2025	Cross-Modal Retrieval, Federated Learning	Unverified	0
Learning a Continue-Thinking Token for Enhanced Test-Time Scaling	Jun 12, 2025	GSM8K, Math	Code Available	0
Don't Pay Attention	Jun 12, 2025		Unverified	0
BotTrans: A Multi-Source Graph Domain Adaptation Approach for Social Bot Detection	Jun 12, 2025	Domain Adaptation, GRAPH DOMAIN ADAPTATION	Code Available	0
Improving Group Robustness on Spurious Correlation via Evidential Alignment	Jun 12, 2025	Uncertainty Quantification	Code Available	0
ClimateChat: Designing Data and Methods for Instruction Tuning LLMs to Answer Climate Change Queries	Jun 12, 2025	scientific discovery	Code Available	1
An Attention-based Spatio-Temporal Neural Operator for Evolving Physics	Jun 12, 2025	Interpretable Machine Learning	Unverified	0
UCD: Unlearning in LLMs via Contrastive Decoding	Jun 12, 2025	Machine Unlearning	Unverified	0
Intelligent Automation for FDI Facilitation: Optimizing Tariff Exemption Processes with OCR And Large Language Models	Jun 12, 2025	Large Language Model, Optical Character Recognition	Unverified	0
Brain2Vec: A Deep Learning Framework for EEG-Based Stress Detection Using CNN-LSTM-Attention	Jun 12, 2025	Diagnostic, EEG	Unverified	0
Efficient Traffic Classification using HW-NAS: Advanced Analysis and Optimization for Cybersecurity on Resource-Constrained Devices	Jun 12, 2025	Hardware Aware Neural Architecture Search, Neural Architecture Search	Unverified	0
Gondola: Grounded Vision Language Planning for Generalizable Robotic Manipulation	Jun 12, 2025	Referring Expression	Unverified	0
Anti-Aliased 2D Gaussian Splatting	Jun 12, 2025	3DGS, Novel View Synthesis	Code Available	1
LLM Embedding-based Attribution (LEA): Quantifying Source Contributions to Generative Model's Response for Vulnerability Analysis	Jun 12, 2025	Attribute, RAG	Code Available	0
Advances in Small-Footprint Keyword Spotting: A Comprehensive Review of Efficient Models and Algorithms	Jun 12, 2025	Automatic Speech Recognition, Keyword Spotting	Code Available	0
LLM-as-a-Fuzzy-Judge: Fine-Tuning Large Language Models as a Clinical Evaluation Judge with Fuzzy Logic	Jun 12, 2025	Large Language Model, Prompt Engineering	Code Available	0
SocialCredit+	Jun 12, 2025	Credit score, Ethics	Unverified	0
Poutine: Vision-Language-Trajectory Pre-Training and Reinforcement Learning Post-Training Enable Robust End-to-End Autonomous Driving	Jun 12, 2025	Autonomous Driving	Unverified	0