The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 17,851–17,900 of 474,278 papers

Title	Date	Tasks	Status	Hype
Towards provable probabilistic safety for scalable embodied AI systems	Jun 5, 2025	Autonomous Vehicles	Unverified	0
Learning to Plan via Supervised Contrastive Learning and Strategic Interpolation: A Chess Case Study	Jun 5, 2025	Contrastive Learning	Code Available	0
Improved Regret Bounds for Linear Bandits with Heavy-Tailed Rewards	Jun 5, 2025	Experimental Design, Multi-Armed Bandits	Unverified	0
MMSU: A Massive Multi-task Spoken Language Understanding and Reasoning Benchmark	Jun 5, 2025	Rhythm, Spoken Language Understanding	Code Available	7
Distributional encoding for Gaussian process regression with qualitative inputs	Jun 5, 2025	Bayesian Optimization, Multi-Task Learning	Unverified	0
Generating Synthetic Stereo Datasets using 3D Gaussian Splatting and Expert Knowledge Transfer	Jun 5, 2025	3DGS, Dataset Generation	Unverified	0
Truth in the Few: High-Value Data Selection for Efficient Multi-Modal Reasoning	Jun 5, 2025		Code Available	0
Using In-Context Learning for Automatic Defect Labelling of Display Manufacturing Data	Jun 5, 2025	Defect Detection, In-Context Learning	Unverified	0
Causal Effect Identification in lvLiNGAM from Higher-Order Cumulants	Jun 5, 2025	Causal Inference	Code Available	0
Invisible Backdoor Triggers in Image Editing Model via Deep Watermarking	Jun 5, 2025	Backdoor Attack, Image Generation	Code Available	0
LotusFilter: Fast Diverse Nearest Neighbor Search via a Learned Cutoff Table	Jun 5, 2025	RAG	Code Available	1
LLMs for sensory-motor control: Combining in-context and iterative learning	Jun 5, 2025	MuJoCo	Code Available	0
Unsupervised Machine Learning for Scientific Discovery: Workflow and Best Practices	Jun 5, 2025	Astronomy, scientific discovery	Code Available	0
Towards Holistic Visual Quality Assessment of AI-Generated Videos: A LLM-Based Multi-Dimensional Evaluation Model	Jun 5, 2025	Prompt Engineering	Code Available	1
Spatiotemporal Contrastive Learning for Cross-View Video Localization in Unstructured Off-road Terrains	Jun 5, 2025	Contrastive Learning	Unverified	0
Cloud-Based Interoperability in Residential Energy Systems	Jun 5, 2025	Edge-computing	Unverified	0
Feature-Based Lie Group Transformer for Real-World Applications	Jun 5, 2025	Object, Object Recognition	Unverified	0
LESS: Large Language Model Enhanced Semi-Supervised Learning for Speech Foundational Models	Jun 5, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Deep learning image burst stacking to reconstruct high-resolution ground-based solar observations	Jun 5, 2025	Image-to-Image Translation	Unverified	0
Toward Better SSIM Loss for Unsupervised Monocular Depth Estimation	Jun 5, 2025	Depth Estimation, Form	Unverified	0
A Multi-Dataset Evaluation of Models for Automated Vulnerability Repair	Jun 5, 2025	Program Repair, Vulnerability Detection	Unverified	0
NIMO: a Nonlinear Interpretable MOdel	Jun 5, 2025	model	Unverified	0
Unfolding Spatial Cognition: Evaluating Multimodal Models on Visual Simulations	Jun 5, 2025	4k, Spatial Reasoning	Code Available	1
MegaHan97K: A Large-Scale Dataset for Mega-Category Chinese Character Recognition with over 97K Categories	Jun 5, 2025	Benchmarking, Optical Character Recognition	Code Available	2
SupeRANSAC: One RANSAC to Rule Them All	Jun 5, 2025	All, Pose Estimation	Code Available	3
Text-Aware Real-World Image Super-Resolution via Diffusion Model with Joint Segmentation Decoders	Jun 5, 2025	Image Super-Resolution, Super-Resolution	Code Available	0
HypeVPR: Exploring Hyperbolic Space for Perspective to Equirectangular Visual Place Recognition	Jun 5, 2025	Retrieval, Visual Place Recognition	Code Available	0
AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs	Jun 5, 2025	Benchmarking, Video Understanding	Unverified	0
Membership Inference Attacks on Sequence Models	Jun 5, 2025	Inference Attack, Membership Inference Attack	Unverified	0
Why LLM Safety Guardrails Collapse After Fine-tuning: A Similarity Analysis Between Alignment and Fine-tuning Datasets	Jun 5, 2025	Safety Alignment	Unverified	0
Artificial Intelligence Should Genuinely Support Clinical Reasoning and Decision Making To Bridge the Translational Gap	Jun 5, 2025	Decision Making, Diagnostic	Unverified	0
Robust Few-Shot Vision-Language Model Adaptation	Jun 5, 2025	Language Modeling, Language Modelling	Unverified	0
SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training	Jun 5, 2025	Image Restoration, Video Restoration	Unverified	0
Towards LLM-Centric Multimodal Fusion: A Survey on Integration Strategies and Techniques	Jun 5, 2025	cross-modal alignment, Large Language Model	Unverified	0
hdl2v: A Code Translation Dataset for Enhanced LLM Verilog Generation	Jun 5, 2025	Code Generation, Code Translation	Unverified	0
On Automating Security Policies with Contemporary LLMs	Jun 5, 2025	In-Context Learning, RAG	Unverified	0
Urania: Differentially Private Insights into AI Use	Jun 5, 2025	Benchmarking, Chatbot	Unverified	0
Clustering and Median Aggregation Improve Differentially Private Inference	Jun 5, 2025	Clustering, Language Modeling	Unverified	0
Intentionally Unintentional: GenAI Exceptionalism and the First Amendment	Jun 5, 2025	Misinformation	Unverified	0
Federated Isolation Forest for Efficient Anomaly Detection on Edge IoT Systems	Jun 5, 2025	Anomaly Detection, Federated Learning	Unverified	0
TQml Simulator: Optimized Simulation of Quantum Machine Learning	Jun 5, 2025	Quantum Machine Learning	Unverified	0
Handle-based Mesh Deformation Guided By Vision Language Model	Jun 5, 2025	Language Modeling, Language Modelling	Unverified	0
VoxDet: Rethinking 3D Semantic Occupancy Prediction as Dense Object Detection	Jun 5, 2025	3D geometry, 3D Semantic Occupancy Prediction	Unverified	0
Conservative classifiers do consistently well with improving agents: characterizing statistical and online learning	Jun 5, 2025	PAC learning	Unverified	0
Privacy Amplification Through Synthetic Data: Insights from Linear Regression	Jun 5, 2025	regression	Unverified	0
GOLFer: Smaller LM-Generated Documents Hallucination Filter & Combiner for Query Expansion in Information Retrieval	Jun 5, 2025	Hallucination, Information Retrieval	Code Available	0
Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting	Jun 5, 2025	Autonomous Driving, NeRF	Code Available	2
Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games	Jun 5, 2025	Action Generation, Asynchronous Group Communication	Code Available	1
Exp4Fuse: A Rank Fusion Framework for Enhanced Sparse Retrieval using Large Language Model-based Query Expansion	Jun 5, 2025	Information Retrieval, Language Modeling	Code Available	0
Sparse Autoencoders, Again?	Jun 5, 2025	Language Modeling, Language Modelling	Unverified	0