Benchmarking

Papers

Showing 1651–1700 of 5548 papers

Title	Date	Tasks	Status
Determinants of Performance in European ATM -- How to Analyze a Diverse Industry	Feb 20, 2023	Benchmarking, Management	Unverified
Benchmarking data encoding methods in Quantum Machine Learning	May 20, 2025	Benchmarking, Quantum Machine Learning	Unverified
DetoxBench: Benchmarking Large Language Models for Multitask Fraud & Abuse Detection	Sep 9, 2024	Abuse Detection, Abusive Language	Unverified
An Interpretable Measure for Quantifying Predictive Dependence between Continuous Random Variables -- Extended Version	Jan 18, 2025	Benchmarking	Unverified
Benchmarking Data Efficiency and Computational Efficiency of Temporal Action Localization Models	Aug 24, 2023	Action Localization, Benchmarking	Unverified
Detection and Evaluation of Clusters within Sequential Data	Oct 4, 2022	Benchmarking, Clustering	Unverified
Benchmarking Data-driven Automatic Text Simplification for German	May 1, 2020	Benchmarking, Machine Translation	Unverified
Detection of Adversarial Attacks and Characterization of Adversarial Subspace	Oct 26, 2019	Benchmarking, Environmental Sound Classification	Unverified
detrex: Benchmarking Detection Transformers	Jun 12, 2023	Benchmarking, object-detection	Unverified
Development details and computational benchmarking of DEPAM	Mar 3, 2019	Benchmarking, Distributed Computing	Unverified
Benchmarking Cross-Domain Audio-Visual Deception Detection	May 11, 2024	Benchmarking, Deception Detection	Unverified
Benchmarking Counterfactual Interpretability in Deep Learning Models for Time Series Classification	Aug 22, 2024	Benchmarking, counterfactual	Unverified
Benchmarking Convolutional Neural Network and Graph Neural Network based Surrogate Models on a Real-World Car External Aerodynamics Dataset	Apr 9, 2025	Benchmarking, Graph Neural Network	Unverified
An in-depth experimental study of sensor usage and visual reasoning of robots navigating in real environments	Nov 29, 2021	Benchmarking, Visual Navigation	Unverified
Benchmarking Conventional Vision Models on Neuromorphic Fall Detection and Action Recognition Dataset	Jan 28, 2022	Action Recognition, Benchmarking	Unverified
Benchmarking Conventional and Learned Video Codecs with a Low-Delay Configuration	Aug 9, 2024	Benchmarking, Video Compression	Unverified
Benchmarking Continual Learning from Cognitive Perspectives	Dec 6, 2023	Benchmarking, Continual Learning	Unverified
Absolute Ranking: An Essential Normalization for Benchmarking Optimization Algorithms	Sep 6, 2024	Bayesian Inference, Benchmarking	Unverified
Detecting Finger-Vein Presentation Attacks Using 3D Shape & Diffuse Reflectance Decomposition	Dec 3, 2019	Benchmarking, Finger Vein Recognition	Unverified
Benchmarking Constraint-Based Bayesian Structure Learning Algorithms: Role of Network Topology	Jan 2, 2025	Benchmarking, Sensitivity	Unverified
Benchmarking confound regression strategies for the control of motion artifact in studies of functional connectivity	Aug 11, 2016	Benchmarking, Functional Connectivity	Unverified
ABSA-Bench: Towards the Unified Evaluation of Aspect-based Sentiment Analysis Research	Dec 1, 2020	Aspect-Based Sentiment Analysis, Aspect-Based Sentiment Analysis (ABSA)	Unverified
Design of Supervision-Scalable Learning Systems: Methodology and Performance Benchmarking	Jun 18, 2022	Benchmarking, image-classification	Unverified
Design Target Achievement Index: A Differentiable Metric to Enhance Deep Generative Models in Multi-Objective Inverse Design	May 6, 2022	Benchmarking	Unverified
Benchmarking common uncertainty estimation methods with histopathological images under domain shift and label noise	Jan 3, 2023	Benchmarking, Classification	Unverified
Benchmarking Collaborative Learning Methods Cost-Effectiveness for Prostate Segmentation	Sep 29, 2023	Benchmarking, Federated Learning	Unverified
Benchmarking Cognitive Domains for LLMs: Insights from Taiwanese Hakka Culture	Sep 3, 2024	Benchmarking, RAG	Unverified
A Distance Oriented Kalman Filter Particle Swarm Optimizer Applied to Multi-Modality Image Registration	Mar 20, 2018	Benchmarking, Image Registration	Unverified
MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems	May 16, 2025	Benchmarking, Mixture-of-Experts	Unverified
Detecting Out-Of-Distribution Samples Using Low-Order Deep Features Statistics	May 1, 2019	Benchmarking	Unverified
Device Modeling Bias in ReRAM-based Neural Network Simulations	Nov 29, 2022	Benchmarking	Unverified
Different Horses for Different Courses: Comparing Bias Mitigation Algorithms in ML	Nov 17, 2024	Benchmarking, Fairness	Unverified
Diverse Community Data for Benchmarking Data Privacy Algorithms	Jun 20, 2023	Benchmarking	Unverified
Benchmarking CNN on 3D Anatomical Brain MRI: Architectures, Data Augmentation and Deep Ensemble Learning	Jun 2, 2021	Benchmarking, Data Augmentation	Unverified
Benchmarking Clinical Decision Support Search	Jan 29, 2018	Articles, Benchmarking	Unverified
Ad-hoc Concept Forming in the Game Codenames as a Means for Evaluating Large Language Models	Feb 17, 2025	Benchmarking	Unverified
Design2Code: Benchmarking Multimodal Code Generation for Automated Front-End Engineering	Mar 5, 2024	Benchmarking, Code Generation	Unverified
Benchmarking Classical, Deep, and Generative Models for Human Activity Recognition	Jan 14, 2025	Activity Recognition, Benchmarking	Unverified
An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis	Dec 8, 2023	Benchmarking, Quantization	Unverified
Benchmarking Chinese Medical LLMs: A Medbench-based Analysis of Performance Gaps and Hierarchical Optimization Strategies	Mar 10, 2025	Benchmarking, Ethics	Unverified
A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection	Jun 5, 2024	Anomaly Detection, Benchmarking	Unverified
ABOUT ML: Annotation and Benchmarking on Understanding and Transparency of Machine Learning Lifecycles	Dec 12, 2019	Benchmarking, BIG-bench Machine Learning	Unverified
Design and benchmarking of a two degree of freedom tendon driver unit for cable-driven wearable technologies	Apr 24, 2025	Benchmarking	Unverified
CKnowEdit: A New Chinese Knowledge Editing Dataset for Linguistics, Facts, and Logic Error Correction in LLMs	Sep 9, 2024	Benchmarking, knowledge editing	Unverified
A New Stereo Benchmarking Dataset for Satellite Images	Jul 9, 2019	Benchmarking	Unverified
A New Real-World Video Dataset for the Comparison of Defogging Algorithms	Oct 2, 2023	Benchmarking, Deblurring	Unverified
Benchmarking Chest X-ray Diagnosis Models Across Multinational Datasets	May 21, 2025	Benchmarking, Diagnostic	Unverified
A Density-Guided Temporal Attention Transformer for Indiscernible Object Counting in Underwater Video	Mar 6, 2024	Benchmarking, Crowd Counting	Unverified
A Boosting Approach to Constructing an Ensemble Stack	Nov 28, 2022	Benchmarking, Ensemble Learning	Unverified
An Analysis of an Integrated Mathematical Modeling -- Artificial Neural Network Approach for the Problems with a Limited Learning Dataset	Nov 8, 2019	Benchmarking	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified