The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 5,651–5,675 of 474,278 papers

Title	Date	Tasks	Status	Hype
Luminance-GS: Adapting 3D Gaussian Splatting to Challenging Lighting Conditions with View-Adaptive Curve Adjustment	Apr 2, 2025	3DGS, NeRF	Code Available	2
LARGE: Legal Retrieval Augmented Generation Evaluation Tool	Apr 2, 2025	RAG, Retrieval	Code Available	2
shapr: Explaining Machine Learning Models with Conditional Shapley Values in R and Python	Apr 2, 2025		Code Available	2
AI-Newton: A Concept-Driven Physical Law Discovery System without Prior Physical Knowledge	Apr 2, 2025	scientific discovery	Code Available	2
Scene-Centric Unsupervised Panoptic Segmentation	Apr 2, 2025	Instance Segmentation, Panoptic Segmentation	Code Available	2
MCP Safety Audit: LLMs with the Model Context Protocol Allow Major Security Exploits	Apr 2, 2025		Code Available	2
Less is More: Efficient Black-box Attribution via Minimal Interpretable Subset Selection	Apr 1, 2025		Code Available	2
FortisAVQA and MAVEN: a Benchmark Dataset and Debiasing Framework for Robust Multimodal Reasoning	Apr 1, 2025	Audio-visual Question Answering, Audio-Visual Question Answering (AVQA)	Code Available	2
CrackSQL: A Hybrid SQL Dialect Translation System Powered by Large Language Models	Apr 1, 2025	Large Language Model, Translation	Code Available	2
A Decade of Deep Learning for Remote Sensing Spatiotemporal Fusion: Advances, Challenges, and Opportunities	Apr 1, 2025		Code Available	2
Z1: Efficient Test-time Scaling with Code	Apr 1, 2025		Code Available	2
Learned Image Compression with Dictionary-based Entropy Model	Apr 1, 2025	Image Compression, model	Code Available	2
OpenFACADES: An Open Framework for Architectural Caption and Attribute Data Enrichment via Street View Imagery	Apr 1, 2025	Attribute	Code Available	2
Dynamic Parametric Retrieval Augmented Generation for Test-time Knowledge Enhancement	Mar 31, 2025	Hallucination, RAG	Code Available	2
Training-Free Text-Guided Image Editing with Visual Autoregressive Model	Mar 31, 2025	text-guided-image-editing	Code Available	2
A Survey on Test-Time Scaling in Large Language Models: What, How, Where, and How Well?	Mar 31, 2025		Code Available	2
Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning	Mar 31, 2025	General Reinforcement Learning, Instruction Following	Code Available	2
SALT: A Flexible Semi-Automatic Labeling Tool for General LiDAR Point Clouds with Cross-Scene Adaptability and 4D Consistency	Mar 31, 2025	Zero-Shot Learning	Code Available	2
Force-Free Molecular Dynamics Through Autoregressive Equivariant Networks	Mar 31, 2025	Numerical Integration	Code Available	2
Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead	Mar 31, 2025	Math, Spatial Reasoning	Code Available	2
Effective Cloud Removal for Remote Sensing Images by an Improved Mean-Reverting Denoising Model with Elucidated Design Space	Mar 31, 2025	Cloud Removal, Denoising	Code Available	2
THEMIS: Towards Practical Intellectual Property Protection for Post-Deployment On-Device Deep Learning Models	Mar 31, 2025	GPU	Code Available	2
TeleAntiFraud-28k: An Audio-Text Slow-Thinking Dataset for Telecom Fraud Detection	Mar 31, 2025	Fraud Detection, Large Language Model	Code Available	2
Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs	Mar 31, 2025	Large Language Model, Video Chaptering	Code Available	2
Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models	Mar 31, 2025		Code Available	2