The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers


Showing 8951–8975 of 474278 papers

Title	Date	Status
Mem4Nav: Boosting Vision-and-Language Navigation in Urban Environments with a Hierarchical Spatial-Cognition Long-Short Memory System	Oct 10, 2025	Code Available
TriP-LLM: A Tri-Branch Patch-wise Large Language Model Framework for Time-Series Anomaly Detection	Oct 10, 2025	Code Available
Transferable Parasitic Estimation via Graph Contrastive Learning and Label Rebalancing in AMS Circuits	Oct 10, 2025	Code Available
Reasoning through Exploration: A Reinforcement Learning Framework for Robust Function Calling	Oct 10, 2025	Unverified
Visual Representation Alignment for Multimodal Large Language Models	Oct 10, 2025	Unverified
Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity	Oct 10, 2025	Unverified
GyroSwin: 5D Surrogates for Gyrokinetic Plasma Turbulence Simulations	Oct 10, 2025	Unverified
Online Video Depth Anything: Temporally-Consistent Depth Prediction with Low Memory Consumption	Oct 10, 2025	Unverified
On Epistemic Uncertainty of Visual Tokens for Object Hallucinations in Large Vision-Language Models	Oct 10, 2025	Unverified
LLaMAX2: Your Translation-Enhanced Model also Performs Well in Reasoning	Oct 10, 2025	Unverified
Towards Safer and Understandable Driver Intention Prediction	Oct 10, 2025	Unverified
Obstacle Avoidance using Dynamic Movement Primitives and Reinforcement Learning	Oct 10, 2025	Code Available
Goal-oriented Backdoor Attack against Vision-Language-Action Models via Physical Objects	Oct 10, 2025	Unverified
KORMo: Korean Open Reasoning Model for Everyone	Oct 10, 2025	Unverified
PhysToolBench: Benchmarking Physical Tool Understanding for MLLMs	Oct 10, 2025	Unverified
StatEval: A Comprehensive Benchmark for Large Language Models in Statistics	Oct 10, 2025	Unverified
SpaceVista: All-Scale Visual Spatial Reasoning from mm to km	Oct 10, 2025	Unverified
Why Do Transformers Fail to Forecast Time Series In-Context?	Oct 10, 2025	Unverified
Don't Throw Away Your Pretrained Model	Oct 10, 2025	Unverified
RePIC: Reinforced Post-Training for Personalizing Multi-Modal Language Models	Oct 10, 2025	Code Available
AMFT: Aligning LLM Reasoners by Meta-Learning the Optimal Imitation-Exploration Balance	Oct 10, 2025	Code Available
StreamingVLM: Real-Time Understanding for Infinite Video Streams	Oct 10, 2025	Code Available
TTS-VAR: A Test-Time Scaling Framework for Visual Auto-Regressive Generation	Oct 10, 2025	Code Available
Building a Foundational Guardrail for General Agentic Systems via Synthetic Data	Oct 10, 2025	Unverified
InteractScience: Programmatic and Visually-Grounded Evaluation of Interactive Scientific Demonstration Code Generation	Oct 10, 2025	Code Available