The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers; 248,326 code links; 4,818 tasks

Papers


Showing 16,001–16,050 of 474,278 papers

Title	Date	Tasks	Status	Hype
Rethinking Repetition Problems of LLMs in Code Generation	May 15, 2025	Code Generation, HumanEval	Code Available	1
Learned Lightweight Smartphone ISP with Unpaired Data	May 15, 2025		Code Available	1
Large Wireless Localization Model (LWLM): A Foundation Model for Positioning in 6G Networks	May 15, 2025	Autonomous Driving, Contrastive Learning	Code Available	1
Evaluating Robustness of Deep Reinforcement Learning for Autonomous Surface Vehicle Control in Field Tests	May 15, 2025	Benchmarking, Deep Reinforcement Learning	Code Available	1
Sparse Point Cloud Patches Rendering via Splitting 2D Gaussians	May 14, 2025	NeRF	Code Available	1
Empirical Performance Evaluation of Lane Keeping Assist on Modern Production Vehicles	May 14, 2025	Autonomous Driving	Code Available	1
MAKE: Multi-Aspect Knowledge-Enhanced Vision-Language Pretraining for Zero-shot Dermatological Assessment	May 14, 2025	Clinical Knowledge, Contrastive Learning	Code Available	1
OpenLKA: An Open Dataset of Lane Keeping Assist from Recent Car Models under Real-world Driving Conditions	May 14, 2025	Autonomous Driving, Benchmarking	Code Available	1
BiECVC: Gated Diversification of Bidirectional Contexts for Learned Video Compression	May 14, 2025	Video Compression	Code Available	1
EDBench: Large-Scale Electron Density Data for Molecular Modeling	May 14, 2025	Drug Discovery	Code Available	1
Slow Transition to Low-Dimensional Chaos in Heavy-Tailed Recurrent Neural Networks	May 14, 2025		Code Available	1
Bridging Human Oversight and Black-box Driver Assistance: Vision-Language Models for Predictive Alerting in Lane Keeping Assist Systems	May 14, 2025		Code Available	1
TopoDiT-3D: Topology-Aware Diffusion Transformer with Bottleneck Structure for 3D Point Cloud Generation	May 14, 2025	Diversity, Point Cloud Generation	Code Available	1
DRA-GRPO: Exploring Diversity-Aware Reward Adjustment for R1-Zero-Like Training of Large Language Models	May 14, 2025	Diversity, Mathematical Reasoning	Code Available	1
Online Isolation Forest	May 14, 2025	Anomaly Detection, Fault Detection	Code Available	1
Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs	May 14, 2025	counterfactual	Code Available	1
UMotion: Uncertainty-driven Human Motion Estimation from Inertial and Ultra-wideband Units	May 14, 2025	Motion Estimation, Pose Estimation	Code Available	1
InvDesFlow-AL: Active Learning-based Workflow for Inverse Design of Functional Materials	May 14, 2025	Active Learning, Formation Energy	Code Available	1
Mini Diffuser: Fast Multi-task Diffusion Policy Training Using Two-level Mini-batches	May 14, 2025	Action Generation, Image Generation	Code Available	1
Evaluation in EEG Emotion Recognition: State-of-the-Art Review and Unified Framework	May 14, 2025	EEG, EEG Emotion Recognition	Code Available	1
Examining Deployment and Refinement of the VIOLA-AI Intracranial Hemorrhage Model Using an Interactive NeoMedSys Platform	May 14, 2025	Diagnostic, Sensitivity	Code Available	1
Analog Foundation Models	May 14, 2025	4k, Quantization	Code Available	1
Achieving Tokenizer Flexibility in Language Models through Heuristic Adaptation and Supertoken Learning	May 14, 2025		Code Available	1
AdaFortiTran: An Adaptive Transformer Model for Robust OFDM Channel Estimation	May 14, 2025		Code Available	1
Introducing voice timbre attribute detection	May 14, 2025	Attribute	Code Available	1
Towards scalable surrogate models based on Neural Fields for large scale aerodynamic simulations	May 14, 2025	Benchmarking	Code Available	1
LatticeVision: Image to Image Networks for Modeling Non-Stationary Spatial Data	May 14, 2025	parameter estimation	Code Available	1
TT-DF: A Large-Scale Diffusion-Based Dataset and Benchmark for Human Body Forgery Detection	May 13, 2025	Disentanglement, Face Swapping	Code Available	1
From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation	May 13, 2025	Robot Manipulation, Spatial Reasoning	Code Available	1
Foundation Models Knowledge Distillation For Battery Capacity Degradation Forecast	May 13, 2025	Knowledge Distillation, Time Series	Code Available	1
Large Language Models for Computer-Aided Design: A Survey	May 13, 2025	Survey	Code Available	1
WaveGuard: Robust Deepfake Detection and Source Tracing via Dual-Tree Complex Wavelet and Graph Neural Networks	May 13, 2025	DeepFake Detection, Face Swapping	Code Available	1
ADC-GS: Anchor-Driven Deformable and Compressed Gaussian Splatting for Dynamic Scene Reconstruction	May 13, 2025		Code Available	1
ReSurgSAM2: Referring Segment Anything in Surgical Video via Credible Long-term Tracking	May 13, 2025	Diversity, Mamba	Code Available	1
TiMo: Spatiotemporal Foundation Model for Satellite Image Time Series	May 13, 2025	Temporal Sequences, Time Series	Code Available	1
FlashMLA-ETAP: Efficient Transpose Attention Pipeline for Accelerating MLA Inference on NVIDIA H20 GPUs	May 13, 2025	GPU	Code Available	1
Total Variation-Based Image Decomposition and Denoising for Microscopy Images	May 13, 2025	Denoising	Code Available	1
Rejoining fragmented ancient bamboo slips with physics-driven deep learning	May 13, 2025		Code Available	1
PrePrompt: Predictive prompting for class incremental learning	May 13, 2025	Classifier calibration, class-incremental learning	Code Available	1
Benchmarking AI scientists in omics data-driven biological research	May 13, 2025	Benchmarking, Multiple-choice	Code Available	1
Domain Knowledge Integrated CNN-xLSTM-xAtt Network with Multi Stream Feature Fusion for Cuffless Blood Pressure Estimation from Photoplethysmography Signals	May 13, 2025	Blood pressure estimation, Photoplethysmography (PPG)	Code Available	1
Hyperbolic Contrastive Learning with Model-augmentation for Knowledge-aware Recommendation	May 13, 2025	Contrastive Learning, Knowledge-Aware Recommendation	Code Available	1
DeepMath-Creative: A Benchmark for Evaluating Mathematical Creativity of Large Language Models	May 13, 2025		Code Available	1
A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs	May 13, 2025	Hallucination, Uncertainty Quantification	Code Available	1
Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving	May 13, 2025	3D visual grounding, Autonomous Driving	Code Available	1
Codifying Character Logic in Role-Playing	May 12, 2025		Code Available	1
Towards Actionable Pedagogical Feedback: A Multi-Perspective Analysis of Mathematics Teaching and Tutoring Dialogue	May 12, 2025	TAG	Code Available	1
Symbolic Regression with Multimodal Large Language Models and Kolmogorov Arnold Networks	May 12, 2025	Kolmogorov-Arnold Networks, Language Modeling	Code Available	1
Chronocept: Instilling a Sense of Time in Machines	May 12, 2025	Fact Checking, RAG	Code Available	1
A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models	May 12, 2025	Instruction Following	Code Available	1