The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 16101–16150 of 474278 papers

Title	Date	Tasks	Status	Hype
Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments	May 8, 2025	Benchmarking, Prompt Engineering	Code Available	1
Crosslingual Reasoning through Test-Time Scaling	May 8, 2025	Mathematical Reasoning	Code Available	1
CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory	May 8, 2025	Large Language Model, Navigate	Code Available	1
EquiHGNN: Scalable Rotationally Equivariant Hypergraph Neural Networks	May 8, 2025		Code Available	1
HiBayES: A Hierarchical Bayesian Modeling Framework for AI Evaluation Statistics	May 8, 2025	parameter estimation, Uncertainty Quantification	Code Available	1
Physics-Assisted and Topology-Informed Deep Learning for Weather Prediction	May 8, 2025	Deep Learning, Graph Neural Network	Code Available	1
Augmented Deep Contexts for Spatially Embedded Video Coding	May 8, 2025		Code Available	1
X-Transfer Attacks: Towards Super Transferable Adversarial Attacks on CLIP	May 8, 2025		Code Available	1
PyTDC: A multimodal machine learning training, evaluation, and inference platform for biomedical foundation models	May 8, 2025	Benchmarking, Graph Representation Learning	Code Available	1
Griffin: Towards a Graph-Centric Relational Database Foundation Model	May 8, 2025	Decoder, Diversity	Code Available	1
Enhancing Cooperative Multi-Agent Reinforcement Learning with State Modelling and Adversarial Exploration	May 8, 2025	Deep Reinforcement Learning, Multi-agent Reinforcement Learning	Code Available	1
scDrugMap: Benchmarking Large Foundation Models for Drug Response Prediction	May 8, 2025	Benchmarking, Drug Discovery	Code Available	1
The City that Never Settles: Simulation-based LiDAR Dataset for Long-Term Place Recognition Under Extreme Structural Changes	May 8, 2025		Code Available	1
A Preliminary Study for GPT-4o on Image Restoration	May 8, 2025	Image Dehazing, Image Generation	Code Available	1
A Simple Detector with Frame Dynamics is a Strong Tracker	May 8, 2025	Object, object-detection	Code Available	1
Hearing and Seeing Through CLIP: A Framework for Self-Supervised Sound Source Localization	May 8, 2025	Scene Understanding, Sound Source Localization	Code Available	1
ArrayDPS: Unsupervised Blind Speech Separation with a Diffusion Prior	May 8, 2025	Room Impulse Response (RIR), Speech Separation	Code Available	1
Adaptive Markup Language Generation for Contextually-Grounded Visual Document Understanding	May 8, 2025	document understanding, Instruction Following	Code Available	1
UncertainSAM: Fast and Efficient Uncertainty Quantification of the Segment Anything Model	May 8, 2025	Semantic Segmentation, Uncertainty Quantification	Code Available	1
KG-HTC: Integrating Knowledge Graphs into LLMs for Effective Zero-shot Hierarchical Text Classification	May 8, 2025	Knowledge Graphs, RAG	Code Available	1
Scalable Chain of Thoughts via Elastic Reasoning	May 8, 2025		Code Available	1
FilterTS: Comprehensive Frequency Filtering for Multivariate Time Series Forecasting	May 7, 2025	Computational Efficiency, Multivariate Time Series Forecasting	Code Available	1
VideoPath-LLaVA: Pathology Diagnostic Reasoning Through Video Instruction Tuning	May 7, 2025	Decision Making, Diagnostic	Code Available	1
TS-Diff: Two-Stage Diffusion Model for Low-Light RAW Image Enhancement	May 7, 2025	Denoising, Image Enhancement	Code Available	1
Componential Prompt-Knowledge Alignment for Domain Incremental Learning	May 7, 2025	Incremental Learning, Transfer Learning	Code Available	1
Histo-Miner: Deep Learning based Tissue Features Extraction Pipeline from H&E Whole Slide Images of Cutaneous Squamous Cell Carcinoma	May 7, 2025	Segmentation, whole slide images	Code Available	1
TrajEvo: Designing Trajectory Prediction Heuristics via LLM-driven Evolution	May 7, 2025	Diversity, Prediction	Code Available	1
WDMamba: When Wavelet Degradation Prior Meets Vision Mamba for Image Dehazing	May 7, 2025	Image Dehazing, Mamba	Code Available	1
RGB-Event Fusion with Self-Attention for Collision Prediction	May 7, 2025	Benchmarking, Computational Efficiency	Code Available	1
EvEnhancer: Empowering Effectiveness, Efficiency and Generalizability for Continuous Space-Time Video Super-Resolution with Events	May 7, 2025	Space-time Video Super-resolution, Super-Resolution	Code Available	1
Lightweight RGB-D Salient Object Detection from a Speed-Accuracy Tradeoff Perspective	May 7, 2025	object-detection, Object Detection	Code Available	1
LLAMAPIE: Proactive In-Ear Conversation Assistants	May 7, 2025		Code Available	1
Retrieval Augmented Time Series Forecasting	May 7, 2025	Retrieval, Time Series	Code Available	1
Registration of 3D Point Sets Using Exponential-based Similarity Matrix	May 7, 2025	Point Cloud Registration	Code Available	1
Image Restoration via Multi-domain Learning	May 7, 2025	Cloud Removal, Deblurring	Code Available	1
ABKD: Pursuing a Proper Allocation of the Probability Mass in Knowledge Distillation via α-β-Divergence	May 7, 2025	Knowledge Distillation	Code Available	1
Nature's Insight: A Novel Framework and Comprehensive Analysis of Agentic Reasoning Through the Lens of Neuroscience	May 7, 2025		Code Available	1
Reward-SQL: Boosting Text-to-SQL via Stepwise Reasoning and Process-Supervised Rewards	May 7, 2025	Text to SQL, Text-To-SQL	Code Available	1
Vision Graph Prompting via Semantic Low-Rank Decomposition	May 7, 2025	parameter-efficient fine-tuning, Visual Prompting	Code Available	1
Benchmarking LLMs' Swarm intelligence	May 7, 2025	Benchmarking	Code Available	1
DFVO: Learning Darkness-free Visible and Infrared Image Disentanglement and Fusion All at Once	May 7, 2025	All, Autonomous Driving	Code Available	1
Object-Shot Enhanced Grounding Network for Egocentric Video	May 7, 2025	Video Grounding	Code Available	1
GAPrompt: Geometry-Aware Point Cloud Prompt for 3D Vision Model	May 7, 2025	parameter-efficient fine-tuning	Code Available	1
Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards	May 7, 2025	Benchmarking, Hallucination	Code Available	1
Token Communication-Driven Multimodal Large Models in Resource-Constrained Multiuser Networks	May 6, 2025		Code Available	1
Learning-based Homothetic Tube MPC	May 6, 2025	Model Predictive Control	Code Available	1
WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch	May 6, 2025		Code Available	1
IndicSQuAD: A Comprehensive Multilingual Question Answering Dataset for Indic Languages	May 6, 2025	Question Answering	Code Available	1
OSUniverse: Benchmark for Multimodal GUI-navigation AI Agents	May 6, 2025		Code Available	1
1st Place Solution of WWW 2025 EReL@MIR Workshop Multimodal CTR Prediction Challenge	May 6, 2025	Click-Through Rate Prediction, Recommendation Systems	Code Available	1