The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 3001–3050 of 659,983 papers

Title	Date	Status
ANTS: Adaptive Negative Textual Space Shaping for OOD Detection via Test-Time MLLM Understanding and Reasoning	Mar 17, 2026	Code Available
A Practical Algorithm for Feature-Rich, Non-Stationary Bandit Problems	Mar 17, 2026	—Unverified
TurnWise: The Gap between Single- and Multi-turn Language Model Capabilities	Mar 17, 2026	—Unverified
Conservative Continuous-Time Treatment Optimization	Mar 17, 2026	—Unverified
Designing for Disagreement: Front-End Guardrails for Assistance Allocation in LLM-Enabled Robots	Mar 17, 2026	—Unverified
Tarab: A Multi-Dialect Corpus of Arabic Lyrics and Poetry	Mar 17, 2026	—Unverified
Evaluating Ill-Defined Tasks in Large Language Models	Mar 17, 2026	—Unverified
Why the Valuable Capabilities of LLMs Are Precisely the Unexplainable Ones	Mar 17, 2026	—Unverified
Controlling Fish Schools via Reinforcement Learning of Virtual Fish Movement	Mar 17, 2026	—Unverified
Ontological foundations for contrastive explanatory narration of robot plans	Mar 17, 2026	—Unverified
VQKV: High-Fidelity and High-Ratio Cache Compression via Vector-Quantization	Mar 17, 2026	—Unverified
TempCore: Are Video QA Benchmarks Temporally Grounded? A Frame Selection Sensitivity Analysis and Benchmark	Mar 17, 2026	—Unverified
Good Arguments Against the People Pleasers: How Reasoning Mitigates (Yet Masks) LLM Sycophancy	Mar 17, 2026	—Unverified
What DINO saw: ALiBi positional encoding reduces positional bias in Vision Transformers	Mar 17, 2026	—Unverified
BenchPreS: A Benchmark for Context-Aware Personalized Preference Selectivity of Persistent-Memory LLMs	Mar 17, 2026	—Unverified
From Passive to Persuasive: Localized Activation Injection for Empathy and Negotiation	Mar 17, 2026	—Unverified
LLM-Guided Reinforcement Learning for Audio-Visual Speech Enhancement	Mar 17, 2026	—Unverified
Scalable Feature Learning on Huge Knowledge Graphs for Downstream Machine Learning	Mar 17, 2026	—Unverified
When Machine Learning Gets Personal: Evaluating Prediction and Explanation	Mar 17, 2026	—Unverified
Feature Attribution in 5G Intrusion Detection: A Statistical vs. Logic-Based Comparison	Mar 17, 2026	—Unverified
WildCap: Facial Albedo Capture in the Wild via Hybrid Inverse Rendering	Mar 17, 2026	—Unverified
LANCE: Low Rank Activation Compression for Efficient On-Device Continual Learning	Mar 17, 2026	—Unverified
Representing Beauty: Towards a Participatory but Objective Latent Aesthetics	Mar 17, 2026	—Unverified
When a Robot is More Capable than a Human: Learning from Constrained Demonstrators	Mar 17, 2026	—Unverified
Distributional Consistency Loss: Beyond Pointwise Data Terms in Inverse Problems	Mar 17, 2026	—Unverified
Strategic Costs of Perceived Bias in Fair Selection	Mar 17, 2026	—Unverified
Evontree: Ontology Rule-Guided Self-Evolution of Large Language Models	Mar 17, 2026	—Unverified
S2WMamba: A Wavelet-Assisted Mamba-Based Dual-Branch Network For Pansharpening	Mar 17, 2026	—Unverified
Analyzing Planner Design Trade-offs for MAPF under ADG-based Realistic Execution	Mar 17, 2026	—Unverified
On Geometric Understanding and Learned Priors in Feed-forward 3D Reconstruction Models	Mar 17, 2026	—Unverified
Toward Better Temporal Structures for Geopolitical Events Forecasting	Mar 17, 2026	—Unverified
A Novel Patch-Based TDA Approach for Computed Tomography Imaging	Mar 17, 2026	—Unverified
DiG: Differential Grounding for Enhancing Fine-Grained Perception in Multimodal Large Language Model	Mar 17, 2026	—Unverified
Diffusion-DRF: Free, Rich, and Differentiable Reward for Video Diffusion Fine-Tuning	Mar 17, 2026	—Unverified
Large Language Models Approach Expert Pedagogical Quality in Math Tutoring but Differ in Instructional and Linguistic Profiles	Mar 17, 2026	—Unverified
Few-Shot Video Object Segmentation in X-Ray Angiography Using Local Matching and Spatio-Temporal Consistency Loss	Mar 17, 2026	Code Available
SentGraph: Hierarchical Sentence Graph for Multi-hop Retrieval-Augmented Question Answering	Mar 17, 2026	—Unverified
Aletheia: What Makes RLVR For Code Verifiers Tick?	Mar 17, 2026	—Unverified
VisTIRA: Closing the Image-Text Modality Gap in Visual Math Reasoning via Structured Tool Integration	Mar 17, 2026	—Unverified
Think3D: Thinking with Space for Spatial Reasoning	Mar 17, 2026	Code Available
Building a Correct-by-Design Lakehouse. Data Contracts, Versioning, and Transactional Pipelines for Humans and Agents	Mar 17, 2026	—Unverified
LogicSkills: A Structured Benchmark for Formal Reasoning in Large Language Models	Mar 17, 2026	—Unverified
Fluids You Can Trust: Property-Preserving Operator Learning for Incompressible Flows	Mar 17, 2026	—Unverified
Synergizing Understanding and Generation with Interleaved Analyzing-Drafting Thinking	Mar 17, 2026	—Unverified
Efficient Continual Learning in Language Models via Thalamically Routed Cortical Columns	Mar 17, 2026	—Unverified
Ask don't tell: Reducing sycophancy in large language models	Mar 17, 2026	—Unverified
Fixed Anchors Are Not Enough: Dynamic Retrieval and Persistent Homology for Dataset Distillation	Mar 17, 2026	—Unverified
Transit Network Design with Two-Level Demand Uncertainties: A Machine Learning and Contextual Stochastic Optimization Framework	Mar 17, 2026	—Unverified
Is Seeing Believing? Evaluating Human Sensitivity to Synthetic Video	Mar 17, 2026	—Unverified
Model Medicine: A Clinical Framework for Understanding, Diagnosing, and Treating AI Models	Mar 17, 2026	—Unverified