The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers


Showing 13,451–13,500 of 474,278 papers

Title	Date	Tasks	Status	Hype
MadCLIP: Few-shot Medical Anomaly Detection with CLIP	Jun 30, 2025		Code Available	0
Consistent Time-of-Flight Depth Denoising via Graph-Informed Geometric Attention	Jun 30, 2025		Code Available	0
GazeTarget360: Towards Gaze Target Estimation in 360-Degree for Robot Perception	Jun 30, 2025		Code Available	0
OcRFDet: Object-Centric Radiance Fields for Multi-View 3D Object Detection in Autonomous Driving	Jun 30, 2025		Code Available	0
Event-based Tiny Object Detection: A Benchmark Dataset and Baseline	Jun 30, 2025		Code Available	0
MReg: A Novel Regression Model with MoE-based Video Feature Mining for Mitral Regurgitation Diagnosis	Jun 30, 2025		Code Available	0
AutoEvoEval: An Automated Framework for Evolving Close-Ended LLM Evaluation Data	Jun 30, 2025		Code Available	0
Interpretable Zero-Shot Learning with Locally-Aligned Vision-Language Model	Jun 30, 2025		Code Available	0
How to Design and Train Your Implicit Neural Representation for Video Compression	Jun 30, 2025		Code Available	0
State and Memory is All You Need for Robust and Reliable AI Agents	Jun 30, 2025	All, Benchmarking	Unverified	0
LineRetriever: Planning-Aware Observation Reduction for Web Agents	Jun 30, 2025	Retrieval, Semantic Similarity	Unverified	0
Supercm: Revisiting Clustering for Semi-Supervised Learning	Jun 30, 2025	Clustering	Unverified	0
Discovering the underlying analytic structure within Standard Model constants using artificial intelligence	Jun 30, 2025	Symbolic Regression	Code Available	0
A Data-Ensemble-Based Approach for Sample-Efficient LQ Control of Linear Time-Varying Systems	Jun 30, 2025	Q-Learning	Unverified	0
MMReason: An Open-Ended Multi-Modal Multi-Step Reasoning Benchmark for MLLMs Toward AGI	Jun 30, 2025	Memorization	Code Available	0
Robustness of Misinformation Classification Systems to Adversarial Examples Through BeamAttack	Jun 30, 2025	Adversarial Attack, Misinformation	Code Available	0
A Survey on Vision-Language-Action Models for Autonomous Driving	Jun 30, 2025	Autonomous Driving, Autonomous Vehicles	Code Available	4
Advancing Learnable Multi-Agent Pathfinding Solvers with Active Fine-Tuning	Jun 30, 2025	Imitation Learning, Trajectory Planning	Code Available	2
GroundingDINO-US-SAM: Text-Prompted Multi-Organ Segmentation in Ultrasound with LoRA-Tuned Vision-Language Models	Jun 30, 2025	Organ Segmentation, Segmentation	Unverified	0
Foundation Models for Zero-Shot Segmentation of Scientific Images without AI-Ready Data	Jun 30, 2025	Visual Reasoning, Zero Shot Segmentation	Unverified	0
MedSAM-CA: A CNN-Augmented ViT with Attention-Enhanced Multi-Scale Fusion for Medical Image Segmentation	Jun 30, 2025	Decoder, Image Segmentation	Unverified	0
STACK: Adversarial Attacks on LLM Safeguard Pipelines	Jun 30, 2025	Red Teaming	Unverified	0
Flash-VStream: Efficient Real-Time Understanding for Long Video Streams	Jun 30, 2025	cross-modal alignment, EgoSchema	Code Available	3
Consensus-based optimization for closed-box adversarial attacks and a connection to evolution strategies	Jun 30, 2025		Code Available	0
Mamba-FETrack V2: Revisiting State Space Model for Frame-Event based Visual Object Tracking	Jun 30, 2025	Mamba, Object Tracking	Code Available	1
Dataset Distillation via Vision-Language Category Prototype	Jun 30, 2025	Dataset Distillation, Descriptive	Code Available	1
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers	Jun 30, 2025	Multimodal Reasoning	Code Available	5
Diffusion Model-based Data Augmentation Method for Fetal Head Ultrasound Segmentation	Jun 30, 2025	Data Augmentation, Segmentation	Unverified	0
Visual and Memory Dual Adapter for Multi-Modal Object Tracking	Jun 30, 2025	Object Tracking, Prompt Learning	Code Available	0
HiNeuS: High-fidelity Neural Surface Mitigating Low-texture and Reflective Ambiguity	Jun 30, 2025	Inverse Rendering, Neural Rendering	Unverified	0
DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World	Jun 30, 2025	Caption Generation, Object	Code Available	2
Refine Any Object in Any Scene	Jun 30, 2025	Novel View Synthesis, Object	Code Available	1
Epona: Autoregressive Diffusion World Model for Autonomous Driving	Jun 30, 2025	Autonomous Driving, model	Code Available	3
MTADiffusion: Mask Text Alignment Diffusion Model for Object Inpainting	Jun 30, 2025	Image Inpainting	Unverified	0
μ^2Tokenizer: Differentiable Multi-Scale Multi-Modal Tokenizer for Radiology Report Generation	Jun 30, 2025	Computed Tomography (CT)	Unverified	0
Large Language Models Don't Make Sense of Word Problems. A Scoping Review from a Mathematics Education Perspective	Jun 30, 2025	Mathematical Reasoning	Unverified	0
Flow-Through Tensors: A Unified Computational Graph Architecture for Multi-Layer Transportation Network Optimization	Jun 30, 2025	Tensor Decomposition	Unverified	0
Do Thinking Tokens Help or Trap? Towards More Efficient Large Reasoning Model	Jun 30, 2025	Math	Unverified	0
FADRM: Fast and Accurate Data Residual Matching for Dataset Distillation	Jun 30, 2025	Computational Efficiency, Dataset Distillation	Code Available	1
The Trilemma of Truth in Large Language Models	Jun 30, 2025	Attribute, Conformal Prediction	Code Available	0
Constructing Non-Markovian Decision Process via History Aggregator	Jun 30, 2025	Decision Making, Reinforcement Learning (RL)	Code Available	0
Thought-Augmented Planning for LLM-Powered Interactive Recommender Agent	Jun 30, 2025	Interactive Recommendation, Large Language Model	Code Available	0
Auto-TA: Towards Scalable Automated Thematic Analysis (TA) via Multi-Agent Large Language Models with Reinforcement Learning	Jun 30, 2025	Language Modeling, Language Modelling	Unverified	0
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning	Jun 30, 2025	Math, Multi-agent Reinforcement Learning	Code Available	2
MDPG: Multi-domain Diffusion Prior Guidance for MRI Reconstruction	Jun 30, 2025	Mamba, MRI Reconstruction	Code Available	0
Self-Supervised Multiview Xray Matching	Jun 30, 2025	Fracture detection	Code Available	0
Seeding neural network quantum states with tensor network states	Jun 30, 2025		Code Available	0
Real-World En Call Center Transcripts Dataset with PII Redaction	Jun 30, 2025	PII Redaction	Code Available	0
Computational Detection of Intertextual Parallels in Biblical Hebrew: A Benchmark Study Using Transformer-Based Language Models	Jun 30, 2025	Word Embeddings	Unverified	0
Ella: Embodied Social Agents with Lifelong Memory	Jun 30, 2025	Lifelong learning	Unverified	0