The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 1801–1850 of 659,983 papers

Title	Date	Status	Hype
In-the-Wild Camouflage Attack on Vehicle Detectors through Controllable Image Editing	Mar 19, 2026	Unverified	0
GeoLAN: Geometric Learning of Latent Explanatory Directions in Large Language Models	Mar 19, 2026	Unverified	0
Deep Hilbert–Galerkin Methods for Infinite-Dimensional PDEs and Optimal Control	Mar 19, 2026	Unverified	0
Hyperagents	Mar 19, 2026	Unverified	4
Global Convergence of Multiplicative Updates for the Matrix Mechanism: A Collaborative Proof with Gemini 3	Mar 19, 2026	Unverified	0
ProactiveBench: Benchmarking Proactiveness in Multimodal Large Language Models	Mar 19, 2026	Unverified	1
A Framework for Formalizing LLM Agent Security	Mar 19, 2026	Unverified	0
Reinforcement-guided generative protein language models enable de novo design of highly diverse AAV capsids	Mar 19, 2026	Unverified	0
Narrative Aligned Long Form Video Question Answering	Mar 19, 2026	Unverified	0
Instruction-Free Tuning of Large Vision Language Models for Medical Instruction Following	Mar 19, 2026	Unverified	0
Any-Subgroup Equivariant Networks via Symmetry Breaking	Mar 19, 2026	Unverified	0
ICLAD: In-Context Learning for Unified Tabular Anomaly Detection Across Supervision Regimes	Mar 19, 2026	Unverified	0
Teaching an Agent to Sketch One Part at a Time	Mar 19, 2026	Unverified	0
Stochastic Sequential Decision Making over Expanding Networks with Graph Filtering	Mar 19, 2026	Unverified	0
Vision Tiny Recursion Model (ViTRM): Parameter-Efficient Image Classification via Recursive State Refinement	Mar 19, 2026	Unverified	0
Beyond the Desk: Barriers and Future Opportunities for AI to Assist Scientists in Embodied Physical Tasks	Mar 19, 2026	Unverified	0
Linear Social Choice with Few Queries: A Moment-Based Approach	Mar 19, 2026	Unverified	0
FedAgain: A Trust-Based and Robust Federated Learning Strategy for an Automated Kidney Stone Identification in Ureteroscopy	Mar 19, 2026	Unverified	0
Learning to Disprove: Formal Counterexample Generation with Large Language Models	Mar 19, 2026	Unverified	0
ItinBench: Benchmarking Planning Across Multiple Cognitive Dimensions with Large Language Models	Mar 19, 2026	Unverified	0
Gastric-X: A Multimodal Multi-Phase Benchmark Dataset for Advancing Vision-Language Models in Gastric Cancer Analysis	Mar 19, 2026	Unverified	0
ReXInTheWild: A Unified Benchmark for Medical Photograph Understanding	Mar 19, 2026	Unverified	0
Inducing Sustained Creativity and Diversity in Large Language Models	Mar 19, 2026	Unverified	0
Recognising BSL Fingerspelling in Continuous Signing Sequences	Mar 19, 2026	Unverified	0
SurfaceXR: Fusing Smartwatch IMUs and Egocentric Hand Pose for Seamless Surface Interactions	Mar 19, 2026	Unverified	0
AURORA: Adaptive Unified Representation for Robust Ultrasound Analysis	Mar 19, 2026	Code Available	0
Cooperation and Exploitation in LLM Policy Synthesis for Sequential Social Dilemmas	Mar 19, 2026	Code Available	0
TRACE: Trajectory Recovery with State Propagation Diffusion for Urban Mobility	Mar 19, 2026	Code Available	0
End-to-End QGAN-Based Image Synthesis via Neural Noise Encoding and Intensity Calibration	Mar 19, 2026	Unverified	0
Detecting Basic Values in A Noisy Russian Social Media Text Data: A Multi-Stage Classification Framework	Mar 19, 2026	Unverified	0
Balancing Performance and Fairness in Explainable AI for Anomaly Detection in Distributed Power Plants Monitoring	Mar 19, 2026	Unverified	0
Sheaf Neural Networks and biomedical applications	Mar 19, 2026	Unverified	0
Image2Gcode: Image-to-G-code Generation for Additive Manufacturing Using Diffusion-Transformer Model	Mar 19, 2026	Unverified	0
Score Reversal Is Not Free for Quantum Diffusion Models	Mar 19, 2026	Unverified	0
To See or To Please: Uncovering Visual Sycophancy and Split Beliefs in VLMs	Mar 19, 2026	Unverified	0
Do VLMs Need Vision Transformers? Evaluating State Space Models as Vision Encoders	Mar 19, 2026	Unverified	0
Generalization of Long-Range Machine Learning Potentials in Complex Chemical Spaces	Mar 19, 2026	Unverified	0
All-in-One Slider for Attribute Manipulation in Diffusion Models	Mar 19, 2026	Code Available	0
PlanTwin: Privacy-Preserving Planning Abstractions for Cloud-Assisted LLM Agents	Mar 19, 2026	Unverified	0
Language Model Maps for Prompt-Response Distributions via Log-Likelihood Vectors	Mar 19, 2026	Unverified	0
WarPGNN: A Parametric Thermal Warpage Analysis Framework with Physics-aware Graph Neural Network	Mar 19, 2026	Unverified	0
Bridging Network Fragmentation: A Semantic-Augmented DRL Framework for UAV-aided VANETs	Mar 19, 2026	Unverified	0
AU Codes, Language, and Synthesis: Translating Anatomy to Text for Facial Behavior Synthesis	Mar 19, 2026	Unverified	0
Student views in AI Ethics and Social Impact	Mar 19, 2026	Unverified	0
ITKIT: Feasible CT Image Analysis based on SimpleITK and MMEngine	Mar 19, 2026	Unverified	0
Investigating Faithfulness in Large Audio Language Models	Mar 19, 2026	Unverified	0
From Workflow Automation to Capability Closure: A Formal Framework for Safe and Revenue-Aware Customer Service AI	Mar 19, 2026	Unverified	0
Redundancy-as-Masking: Formalizing the Artificial Age Score (AAS) to Model Memory Aging in Generative AI	Mar 19, 2026	Unverified	0
Augmenting Rating-Scale Measures with Text-Derived Items Using the Information-Determined Scoring (IDS) Framework	Mar 19, 2026	Unverified	0
REST: Receding Horizon Explorative Steiner Tree for Zero-Shot Object-Goal Navigation	Mar 19, 2026	Unverified	0