Benchmarking

Papers

Showing 4101–4125 of 5548 papers

Title	Date	Tasks	Status
Benchmarking the Performance and Energy Efficiency of AI Accelerators for AI Training	Sep 15, 2019	Benchmarking, CPU	Unverified
Performance Benchmarking of Psychomotor Skills Using Wearable Devices: An Application in Sport	Nov 25, 2024	Benchmarking	Unverified
Performance Comparison of Surrogate-Assisted Evolutionary Algorithms on Computational Fluid Dynamics Problems	Feb 26, 2024	Benchmarking, Evolutionary Algorithms	Unverified
Performance Evaluation Methodology for Long-Term Visual Object Tracking	Jun 19, 2019	Benchmarking, Object	Unverified
Benchmark Dataset for Pore-Scale CO2-Water Interaction	Mar 22, 2025	Benchmarking	Unverified
TTSlow: Slow Down Text-to-Speech with Efficiency Robustness Evaluations	Jul 2, 2024	Benchmarking, text-to-speech	Unverified
Performance Evaluation of Transcriptomics Data Normalization for Survival Risk Prediction	Feb 8, 2021	Benchmarking, Prediction	Unverified
Performance-Guided LLM Knowledge Distillation for Efficient Text Classification at Scale	Nov 7, 2024	Active Learning, Benchmarking	Unverified
Where Paths Collide: A Comprehensive Survey of Classic and Learning-Based Multi-Agent Pathfinding	May 25, 2025	Benchmarking, Multi-Agent Path Finding	Unverified
Performance of large language models in numerical vs. semantic medical knowledge: Benchmarking on evidence-based Q&As	Jun 6, 2024	Articles, Benchmarking	Unverified
Performance prediction of data streams on high-performance architecture	Jan 7, 2019	Benchmarking, Dimensionality Reduction	Unverified
Periocular Recognition in the Wild with Orthogonal Combination of Local Binary Coded Pattern in Dual-stream Convolutional Neural Network	Feb 18, 2019	Benchmarking	Unverified
Which models are innately best at uncertainty estimation?	Jun 5, 2022	Benchmarking, Out-of-Distribution Detection	Unverified
PerMedCQA: Benchmarking Large Language Models on Medical Consumer Question Answering in Persian Language	May 23, 2025	Benchmarking, Question Answering	Unverified
WeQA: A Benchmark for Retrieval Augmented Generation in Wind Energy Domain	Aug 21, 2024	Answer Generation, Benchmarking	Unverified
Perona: Robust Infrastructure Fingerprinting for Resource-Efficient Big Data Analytics	Nov 15, 2022	Benchmarking	Unverified
PerSEval: Assessing Personalization in Text Summarizers	Jun 29, 2024	Benchmarking, Human Judgment Correlation	Unverified
A Conformance Checking-based Approach for Drift Detection in Business Processes	Jul 9, 2019	Benchmarking, Drift Detection	Unverified
Personalised Feedback Framework for Online Education Programmes Using Generative AI	Oct 14, 2024	Benchmarking, Management	Unverified
Benchmark Data Repositories for Better Benchmarking	Oct 31, 2024	Benchmarking	Unverified
Personalized Multimodal Large Language Models: A Survey	Dec 3, 2024	Benchmarking, Survey	Unverified
Personalized On-Device E-health Analytics with Decentralized Block Coordinate Descent	Dec 17, 2021	Benchmarking, Diagnostic	Unverified
Person Re-Identification by Unsupervised Video Matching	Nov 25, 2016	Benchmarking, Dynamic Time Warping	Unverified
Person Re-Identification in Identity Regression Space	Jun 25, 2018	Benchmarking, Incremental Learning	Unverified
Person Re-identification in the Wild	Apr 9, 2016	Benchmarking, Pedestrian Detection	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified