Benchmarking

Papers

Title	Date	Tasks	Status
Benchmarking Collaborative Learning Methods Cost-Effectiveness for Prostate Segmentation	Sep 29, 2023	Benchmarking, Federated Learning	Unverified
Benchmarking Cognitive Domains for LLMs: Insights from Taiwanese Hakka Culture	Sep 3, 2024	Benchmarking, RAG	Unverified
A Distance Oriented Kalman Filter Particle Swarm Optimizer Applied to Multi-Modality Image Registration	Mar 20, 2018	Benchmarking, Image Registration	Unverified
MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems	May 16, 2025	Benchmarking, Mixture-of-Experts	Unverified
Detecting Out-Of-Distribution Samples Using Low-Order Deep Features Statistics	May 1, 2019	Benchmarking	Unverified
Device Modeling Bias in ReRAM-based Neural Network Simulations	Nov 29, 2022	Benchmarking	Unverified
Different Horses for Different Courses: Comparing Bias Mitigation Algorithms in ML	Nov 17, 2024	Benchmarking, Fairness	Unverified
Diverse Community Data for Benchmarking Data Privacy Algorithms	Jun 20, 2023	Benchmarking	Unverified
Benchmarking CNN on 3D Anatomical Brain MRI: Architectures, Data Augmentation and Deep Ensemble Learning	Jun 2, 2021	Benchmarking, Data Augmentation	Unverified
Benchmarking Clinical Decision Support Search	Jan 29, 2018	Articles, Benchmarking	Unverified
Ad-hoc Concept Forming in the Game Codenames as a Means for Evaluating Large Language Models	Feb 17, 2025	Benchmarking	Unverified
Design2Code: Benchmarking Multimodal Code Generation for Automated Front-End Engineering	Mar 5, 2024	Benchmarking, Code Generation	Unverified
Benchmarking Classical, Deep, and Generative Models for Human Activity Recognition	Jan 14, 2025	Activity Recognition, Benchmarking	Unverified
An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis	Dec 8, 2023	Benchmarking, Quantization	Unverified
Benchmarking Chinese Medical LLMs: A Medbench-based Analysis of Performance Gaps and Hierarchical Optimization Strategies	Mar 10, 2025	Benchmarking, Ethics	Unverified
A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection	Jun 5, 2024	Anomaly Detection, Benchmarking	Unverified
ABOUT ML: Annotation and Benchmarking on Understanding and Transparency of Machine Learning Lifecycles	Dec 12, 2019	Benchmarking, BIG-bench Machine Learning	Unverified
Design and benchmarking of a two degree of freedom tendon driver unit for cable-driven wearable technologies	Apr 24, 2025	Benchmarking	Unverified
CKnowEdit: A New Chinese Knowledge Editing Dataset for Linguistics, Facts, and Logic Error Correction in LLMs	Sep 9, 2024	Benchmarking, knowledge editing	Unverified
A New Stereo Benchmarking Dataset for Satellite Images	Jul 9, 2019	Benchmarking	Unverified
A New Real-World Video Dataset for the Comparison of Defogging Algorithms	Oct 2, 2023	Benchmarking, Deblurring	Unverified
Benchmarking Chest X-ray Diagnosis Models Across Multinational Datasets	May 21, 2025	Benchmarking, Diagnostic	Unverified
A Density-Guided Temporal Attention Transformer for Indiscernible Object Counting in Underwater Video	Mar 6, 2024	Benchmarking, Crowd Counting	Unverified
A Boosting Approach to Constructing an Ensemble Stack	Nov 28, 2022	Benchmarking, Ensemble Learning	Unverified
An Analysis of an Integrated Mathematical Modeling -- Artificial Neural Network Approach for the Problems with a Limited Learning Dataset	Nov 8, 2019	Benchmarking	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified