Benchmarking

Papers

Showing 3601–3650 of 5548 papers

Title	Date	Tasks	Status
Open the box of digital neuromorphic processor: Towards effective algorithm-hardware co-design	Mar 27, 2023	Benchmarking, Edge-computing	Unverified
Opposition based Ensemble Micro Differential Evolution	Sep 8, 2017	Benchmarking, Diversity	Unverified
Optimal Eco-driving Control of Autonomous and Electric Trucks in Adaptation to Highway Topography: Energy Minimization and Battery Life Extension	Sep 10, 2020	Benchmarking, Model Predictive Control	Unverified
Optimally-Weighted Maximum Mean Discrepancy Framework for Continual Learning	Jan 21, 2025	Benchmarking, Continual Learning	Unverified
Optimal PMU Placement for Kalman Filtering of DAE Power System Models	Feb 5, 2025	Benchmarking, State Estimation	Unverified
Optimal Scheduling of Anticipated COVID-19 Vaccination: A Case Study of New York State	Aug 24, 2020	Benchmarking, Scheduling	Unverified
Optimization of Genomic Classifiers for Clinical Deployment: Evaluation of Bayesian Optimization to Select Predictive Models of Acute Infection and In-Hospital Mortality	Mar 27, 2020	Bayesian Optimization, Benchmarking	Unverified
Optimization Techniques for a Physical Model of Human Vocalisation	Sep 26, 2023	Benchmarking	Unverified
Optimizing open-domain question answering with graph-based retrieval augmented generation	Mar 4, 2025	Benchmarking, Language Modeling	Unverified
Optimizing Recommendations using Fine-Tuned LLMs	May 11, 2025	Benchmarking, Recommendation Systems	Unverified
OPTION: OPTImization Algorithm Benchmarking ONtology	Apr 24, 2021	Benchmarking, Data Integration	Unverified
OPTION: OPTImization Algorithm Benchmarking ONtology	Nov 21, 2022	Benchmarking, Data Integration	Unverified
OReole-FM: successes and challenges toward billion-parameter foundation models for high-resolution satellite imagery	Oct 25, 2024	Benchmarking, image-classification	Unverified
Organ-aware Multi-scale Medical Image Segmentation Using Text Prompt Engineering	Mar 18, 2025	Benchmarking, Descriptive	Unverified
Orthogonal Deep Features Decomposition for Age-Invariant Face Recognition	Oct 17, 2018	Age-Invariant Face Recognition, Benchmarking	Unverified
OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents	Jun 19, 2025	Benchmarking	Unverified
oTTC: Object Time-to-Contact for Motion Estimation in Autonomous Driving	May 13, 2024	Attribute, Autonomous Driving	Unverified
Out of Distribution Performance of State of Art Vision Model	Jan 25, 2023	Benchmarking	Unverified
Overconfident Oracles: Limitations of In Silico Sequence Design Benchmarking	Feb 24, 2025	Benchmarking	Unverified
Overview and practical recommendations on using Shapley Values for identifying predictive biomarkers via CATE modeling	May 2, 2025	Benchmarking	Unverified
Overview of Todai Robot Project and Evaluation Framework of its NLP-based Problem Solving	May 1, 2014	Benchmarking	Unverified
OVQA: A Clinically Generated Visual Question Answering Dataset	Jul 7, 2022	Benchmarking, Medical Visual Question Answering	Unverified
Paddy Doctor: A Visual Image Dataset for Automated Paddy Disease Classification and Benchmarking	May 23, 2022	Benchmarking, Classification	Unverified
PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms	Oct 5, 2024	Benchmarking, GPU	Unverified
Paradigm Shift in Sustainability Disclosure Analysis: Empowering Stakeholders with CHATREPORT, a Language Model-Based Tool	Jun 27, 2023	Benchmarking, Language Modeling	Unverified
Para-Lane: Multi-Lane Dataset Registering Parallel Scans for Benchmarking Novel View Synthesis	Feb 21, 2025	3DGS, Autonomous Driving	Unverified
Parsing Any Domain English text to CoNLL dependencies	May 1, 2012	Benchmarking, Dependency Parsing	Unverified
Participatory Personalization in Classification	Feb 8, 2023	Benchmarking, Classification	Unverified
'Part'ly first among equals: Semantic part-based benchmarking for state-of-the-art object recognition systems	Nov 23, 2016	Benchmarking, Object	Unverified
PASTA: A Dataset for Modeling Participant States in Narratives	Jul 31, 2022	Benchmarking, Common Sense Reasoning	Unverified
PatentNet: A Large-Scale Incomplete Multiview, Multimodal, Multilabel Industrial Goods Image Database	Jun 23, 2021	Benchmarking, Clustering	Unverified
PathBench: A Benchmarking Platform for Classical and Learned Path Planning Algorithms	May 4, 2021	Benchmarking	Unverified
PathBench: A comprehensive comparison benchmark for pathology foundation models towards precision oncology	May 26, 2025	Benchmarking, Prognosis	Unverified
Patherea: Cell Detection and Classification for the 2020s	Dec 21, 2024	Benchmarking, Cell Detection	Unverified
Pathway: a fast and flexible unified stream data processing framework for analytical and Machine Learning applications	Jul 12, 2023	Benchmarking	Unverified
Patterns of Convergence and Bound Constraint Violation in Differential Evolution on SBOX-COST Benchmarking Suite	May 20, 2023	Benchmarking	Unverified
PawPrint: Whose Footprints Are These? Identifying Animal Individuals by Their Footprints	May 23, 2025	Benchmarking	Unverified
Perception Test 2023: A Summary of the First Challenge And Outcome	Dec 20, 2023	Benchmarking, Grounded Video Question Answering	Unverified
Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark	Nov 29, 2024	Benchmarking, Grounded Video Question Answering	Unverified
Benchmarking the Performance and Energy Efficiency of AI Accelerators for AI Training	Sep 15, 2019	Benchmarking, CPU	Unverified
Performance Benchmarking of Psychomotor Skills Using Wearable Devices: An Application in Sport	Nov 25, 2024	Benchmarking	Unverified
Performance Comparison of Surrogate-Assisted Evolutionary Algorithms on Computational Fluid Dynamics Problems	Feb 26, 2024	Benchmarking, Evolutionary Algorithms	Unverified
Performance Evaluation Methodology for Long-Term Visual Object Tracking	Jun 19, 2019	Benchmarking, Object	Unverified
Performance Evaluation of Transcriptomics Data Normalization for Survival Risk Prediction	Feb 8, 2021	Benchmarking, Prediction	Unverified
Performance-Guided LLM Knowledge Distillation for Efficient Text Classification at Scale	Nov 7, 2024	Active Learning, Benchmarking	Unverified
Performance of large language models in numerical vs. semantic medical knowledge: Benchmarking on evidence-based Q&As	Jun 6, 2024	Articles, Benchmarking	Unverified
Performance prediction of data streams on high-performance architecture	Jan 7, 2019	Benchmarking, Dimensionality Reduction	Unverified
Periocular Recognition in the Wild with Orthogonal Combination of Local Binary Coded Pattern in Dual-stream Convolutional Neural Network	Feb 18, 2019	Benchmarking	Unverified
PerMedCQA: Benchmarking Large Language Models on Medical Consumer Question Answering in Persian Language	May 23, 2025	Benchmarking, Question Answering	Unverified
WeQA: A Benchmark for Retrieval Augmented Generation in Wind Energy Domain	Aug 21, 2024	Answer Generation, Benchmarking	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified