Benchmarking

Papers

Showing 2851–2900 of 5548 papers

Title	Date	Tasks	Status
The Design and Implementation of a Scalable DL Benchmarking Platform	Nov 19, 2019	Benchmarking	Unverified
Handwritten Text Recognition: A Survey	Feb 12, 2025	Benchmarking, Handwritten Text Recognition	Unverified
HaN-Seg: The head and neck organ-at-risk CT and MR segmentation dataset	Jan 3, 2023	Benchmarking, Computed Tomography (CT)	Unverified
xai_evals : A Framework for Evaluating Post-Hoc Local Explanation Methods	Feb 5, 2025	Benchmarking	Unverified
Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead	Dec 21, 2020	Autonomous Driving, Benchmarking	Unverified
Hardware-aware mobile building block evaluation for computer vision	Aug 26, 2022	Benchmarking, Efficient Neural Network	Unverified
The Disagreement Problem in Faithfulness Metrics	Nov 13, 2023	Benchmarking, Explainable artificial intelligence	Unverified
The DLV System for Knowledge Representation and Reasoning	Nov 4, 2002	Benchmarking	Unverified
Harnessing Large Language Models for Software Vulnerability Detection: A Comprehensive Benchmarking Study	May 24, 2024	Benchmarking, Vulnerability Detection	Unverified
The Dota 2 Bot Competition	Mar 4, 2021	Benchmarking, Dota 2	Unverified
Benchmarking XAI Explanations with Human-Aligned Evaluations	Nov 4, 2024	Benchmarking	Unverified
A Baseline Method for Removing Invisible Image Watermarks using Deep Image Prior	Feb 19, 2025	Benchmarking, Misinformation	Unverified
Benchmarking with MIMIC-IV, an irregular, sparse clinical time series dataset	Jan 27, 2024	Benchmarking, Time Series	Unverified
HA-VLN: A Benchmark for Human-Aware Navigation in Discrete-Continuous Environments with Dynamic Multi-Human Interactions, Real-World Validation, and an Open Leaderboard	Mar 18, 2025	Benchmarking, Human Dynamics	Unverified
Hawk: An Industrial-strength Multi-label Document Classifier	Jan 15, 2023	Benchmarking, Document Classification	Unverified
Benchmarking Waitlist Mortality Prediction in Heart Transplantation Through Time-to-Event Modeling using New Longitudinal UNOS Dataset	Jul 9, 2025	Benchmarking, Decision Making	Unverified
Benchmarking VLMs' Reasoning About Persuasive Atypical Images	Sep 16, 2024	Benchmarking, Object Recognition	Unverified
Haze Visibility Enhancement: A Survey and Quantitative Benchmarking	Jul 21, 2016	Benchmarking, Survey	Unverified
Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information	May 9, 2025	Benchmarking, Form	Unverified
Heidelberg Colorectal Data Set for Surgical Data Science in the Sensor Operating Room	May 7, 2020	Benchmarking	Unverified
HelixDesign-Binder: A Scalable Production-Grade Platform for Binder Design Built on HelixFold3	May 28, 2025	Benchmarking, Efficient Exploration	Unverified
Helsinki Deblur Challenge 2021: description of photographic data	May 21, 2021	Benchmarking, Deblurring	Unverified
HERM: Benchmarking and Enhancing Multimodal LLMs for Human-Centric Understanding	Oct 9, 2024	Benchmarking, Instruction Following	Unverified
Agent-oriented Joint Decision Support for Data Owners in Auction-based Federated Learning	May 9, 2024	Benchmarking, Federated Learning	Unverified
Benchmarking Visual-Inertial Deep Multimodal Fusion for Relative Pose Regression and Odometry-aided Absolute Pose Regression	Aug 1, 2022	Benchmarking, regression	Unverified
Benchmarking Vision Language Models on German Factual Data	Apr 15, 2025	Benchmarking	Unverified
The Effect of Domain and Diacritics in Yoruba–English Neural Machine Translation	Aug 1, 2021	Benchmarking, Machine Translation	Unverified
Jointly Modeling and Clustering Tensors in High Dimensions	Apr 15, 2021	Benchmarking, Clustering	Unverified
Heterogeneous graph neural networks for species distribution modeling	Mar 14, 2025	Benchmarking	Unverified
Hide and Seek: on the Stealthiness of Attacks against Deep Learning Systems	May 31, 2022	Benchmarking	Unverified
Hiding in Plain Sight: Reframing Hardware Trojan Benchmarking as a Hide&Seek Modification	Oct 21, 2024	Benchmarking	Unverified
Agentic Mixture-of-Workflows for Multi-Modal Chemical Search	Feb 26, 2025	Benchmarking, Retrieval	Unverified
Benchmarking Vision Language Models for Cultural Understanding	Jul 15, 2024	Benchmarking, Question Answering	Unverified
Hierarchical Knowledge Graph Construction from Images for Scalable E-Commerce	Oct 28, 2024	Benchmarking, graph construction	Unverified
AA3DNet: Attention Augmented Real Time 3D Object Detection	Jul 26, 2021	3D Object Detection, Autonomous Vehicles	Unverified
High Accuracy Tumor Diagnoses and Benchmarking of Hematoxylin and Eosin Stained Prostate Core Biopsy Images Generated by Explainable Deep Neural Networks	Aug 2, 2019	Benchmarking, SSIM	Unverified
Agentic AI for Improving Precision in Identifying Contributions to Sustainable Development Goals	Nov 26, 2024	Benchmarking, Retrieval	Unverified
High Fidelity RF Clutter Modeling and Simulation	Feb 10, 2022	Benchmarking, Vocal Bursts Intensity Prediction	Unverified
High-Level Synthesis Performance Prediction using GNNs: Benchmarking, Modeling, and Advancing	Jan 18, 2022	Benchmarking, Feature Engineering	Unverified
Benchmarking Vision Foundation Models for Input Monitoring in Autonomous Driving	Jan 14, 2025	Autonomous Driving, Benchmarking	Unverified
The EuroCity Persons Dataset: A Novel Benchmark for Object Detection	May 18, 2018	Benchmarking, Object	Unverified
The Evolutionary Computation Methods No One Should Use	Jan 5, 2023	Benchmarking	Unverified
HIMO: A New Benchmark for Full-Body Human Interacting with Multiple Objects	Jul 17, 2024	Benchmarking, Human-Object Interaction Detection	Unverified
Benchmarking Vision-Based Object Tracking for USVs in Complex Maritime Environments	Dec 10, 2024	Benchmarking, object-detection	Unverified
Hints-In-Browser: Benchmarking Language Models for Programming Feedback Generation	Jun 7, 2024	Benchmarking	Unverified
Benchmarking Video Frame Interpolation	Mar 25, 2024	Benchmarking, Computational Efficiency	Unverified
SnCQA: A hardware-efficient equivariant quantum convolutional circuit architecture	Nov 23, 2022	Benchmarking, Computational chemistry	Unverified
HLB: Benchmarking LLMs' Humanlikeness in Language Use	Sep 24, 2024	Benchmarking	Unverified
Benchmarking Unsupervised Outlier Detection with Realistic Synthetic Data	Apr 15, 2020	Benchmarking, Outlier Detection	Unverified
The Expressive Power of Word Embeddings	Jan 15, 2013	Benchmarking, Sentence	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified