The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers


Showing 12751–12800 of 474278 papers

Title	Date	Tasks	Status	Hype
Seg-Wild: Interactive Segmentation based on 3D Gaussian Splatting for Unconstrained Image Collections	Jul 10, 2025	Interactive Segmentation, Segmentation	Code Available	1
MIRIX: Multi-Agent Memory System for LLM-Based Agents	Jul 10, 2025	RAG	Unverified	0
NLGCL: Naturally Existing Neighbor Layers Graph Contrastive Learning for Recommendation	Jul 10, 2025	Collaborative Filtering, Contrastive Learning	Code Available	1
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding	Jul 10, 2025	Scene Understanding, Spatial Reasoning	Code Available	0
SAMO: A Lightweight Sharpness-Aware Approach for Multi-Task Optimization with Joint Global-Local Perturbation	Jul 10, 2025	Multi-Task Learning	Code Available	0
HiM2SAM: Enhancing SAM2 with Hierarchical Motion Estimation and Memory Optimization towards Long-term Tracking	Jul 10, 2025	Motion Estimation, Object Tracking	Code Available	1
Goal-Oriented Sequential Bayesian Experimental Design for Causal Learning	Jul 10, 2025	Experimental Design	Unverified	0
Objectomaly: Objectness-Aware Refinement for OoD Segmentation with Structural Consistency and Boundary Precision	Jul 10, 2025	Autonomous Driving	Unverified	0
MUVOD: A Novel Multi-view Video Object Segmentation Dataset and A Benchmark for 3D Segmentation	Jul 10, 2025	NeRF, Object	Unverified	0
Rethinking Query-based Transformer for Continual Image Segmentation	Jul 10, 2025	Continual Learning, Image Segmentation	Code Available	1
Towards Interpretable Time Series Foundation Models	Jul 10, 2025	Time Series	Code Available	0
SCOOTER: A Human Evaluation Framework for Unrestricted Adversarial Examples	Jul 10, 2025		Code Available	0
PacGDC: Label-Efficient Generalizable Depth Completion with Projection Ambiguity and Consistency	Jul 10, 2025	Depth Completion	Code Available	1
Towards High-Resolution 3D Anomaly Detection: A Scalable Dataset and Real-Time Framework for Subtle Industrial Defects	Jul 10, 2025	3D Anomaly Detection, Anomaly Detection	Code Available	2
An Automated Classifier of Harmful Brain Activities for Clinical Usage Based on a Vision-Inspired Pre-trained Framework	Jul 10, 2025	Binary Classification, EEG	Unverified	0
Uncertainty Quantification for Motor Imagery BCI -- Machine Learning vs. Deep Learning	Jul 10, 2025	Deep Learning, Motor Imagery	Unverified	0
GNN-CNN: An Efficient Hybrid Model of Convolutional and Graph Neural Networks for Text Representation	Jul 10, 2025	Graph Generation, Sentiment Analysis	Code Available	0
You Don't Bring Me Flowers: Mitigating Unwanted Recommendations Through Conformal Risk Control	Jul 9, 2025		Code Available	0
LongAnimation: Long Animation Generation with Dynamic Global-Local Memory	Jul 9, 2025		Unverified	0
Does Data Scaling Lead to Visual Compositional Generalization?	Jul 9, 2025		Code Available	0
Unlocking Thermal Aerial Imaging: Synthetic Enhancement of UAV Datasets	Jul 9, 2025		Code Available	0
DICE: Data Influence Cascade in Decentralized Learning	Jul 9, 2025		Unverified	0
MST-Distill: Mixture of Specialized Teachers for Cross-Modal Knowledge Distillation	Jul 9, 2025		Code Available	0
Robust Multimodal Large Language Models Against Modality Conflict	Jul 9, 2025		Unverified	0
AdeptHEQ-FL: Adaptive Homomorphic Encryption for Federated Learning of Hybrid Classical-Quantum Models with Dynamic Layer Sparing	Jul 9, 2025		Unverified	0
Instance-Wise Monotonic Calibration by Constrained Transformation	Jul 9, 2025		Code Available	0
MADPOT: Medical Anomaly Detection with CLIP Adaptation and Partial Optimal Transport	Jul 9, 2025		Code Available	0
Omni-Fusion of Spatial and Spectral for Hyperspectral Image Segmentation	Jul 9, 2025		Code Available	0
MultiJustice: A Chinese Dataset for Multi-Party, Multi-Charge Legal Prediction	Jul 9, 2025		Code Available	0
MCCD: A Multi-Attribute Chinese Calligraphy Character Dataset Annotated with Script Styles, Dynasties, and Calligraphers	Jul 9, 2025		Code Available	0
Colors See Colors Ignore: Clothes Changing ReID with Color Disentanglement	Jul 9, 2025		Code Available	0
ViDove: A Translation Agent System with Multimodal Context and Memory-Augmented Reasoning	Jul 9, 2025		Code Available	0
Test-Time Scaling with Reflective Generative Model	Jul 9, 2025		Code Available	0
Label-Efficient Chest X-ray Diagnosis via Partial CLIP Adaptation	Jul 9, 2025		Code Available	0
Text-promptable Object Counting via Quantity Awareness Enhancement	Jul 9, 2025		Code Available	0
DS@GT at CheckThat! 2025: Exploring Retrieval and Reranking Pipelines for Scientific Claim Source Retrieval on Social Media Discourse	Jul 9, 2025		Code Available	0
Evaluating and Improving Robustness in Large Language Models: A Survey and Future Directions	Jul 9, 2025		Code Available	0
XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs	Jul 9, 2025		Code Available	0
Airway Segmentation Network for Enhanced Tubular Feature Extraction	Jul 9, 2025		Code Available	0
Ambiguity-aware Point Cloud Segmentation by Adaptive Margin Contrastive Learning	Jul 9, 2025		Code Available	0
FlexGaussian: Flexible and Cost-Effective Training-Free Compression for 3D Gaussian Splatting	Jul 9, 2025		Code Available	0
Finetuning Vision-Language Models as OCR Systems for Low-Resource Languages: A Case Study of Manchu	Jul 9, 2025		Code Available	0
Bridging the Last Mile of Prediction: Enhancing Time Series Forecasting with Conditional Guided Flow Matching	Jul 9, 2025	Time Series, Time Series Forecasting	Unverified	0
Orchestrator-Agent Trust: A Modular Agentic AI Visual Classification System with Trust-Aware Orchestration and RAG-Based Reasoning	Jul 9, 2025	Benchmarking, Image Retrieval	Code Available	0
Rethinking Verification for LLM Code Generation: From Generation to Testing	Jul 9, 2025	Code Generation, HumanEval	Code Available	1
HVI-CIDNet+: Beyond Extreme Darkness for Low-Light Image Enhancement	Jul 9, 2025	Image Enhancement, Low-Light Image Enhancement	Code Available	1
The Dark Side of LLMs Agent-based Attacks for Complete Computer Takeover	Jul 9, 2025	Large Language Model, RAG	Unverified	0
Multi-Agent Retrieval-Augmented Framework for Evidence-Based Counterspeech Against Health Misinformation	Jul 9, 2025	Informativeness, Misinformation	Unverified	0
VisualTrap: A Stealthy Backdoor Attack on GUI Agents via Visual Grounding Manipulation	Jul 9, 2025	Backdoor Attack, Visual Grounding	Unverified	0
Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning	Jul 9, 2025	Reinforcement Learning (RL)	Unverified	0