Benchmarking

Papers

Showing 1801–1825 of 5548 papers

Title	Date	Tasks	Status
An Empirical Study of Benchmarking Chinese Aspect Sentiment Quad Prediction	Nov 3, 2023	Benchmarking, Sentence	Unverified
Comparative Benchmarking of Causal Discovery Techniques	Aug 18, 2017	Benchmarking, Causal Discovery	Unverified
User-in-the-loop Evaluation of Multimodal LLMs for Activity Assistance	Aug 4, 2024	Action Anticipation, Benchmarking	Unverified
Comparative Design Space Exploration of Dense and Semi-Dense SLAM	Sep 15, 2015	Benchmarking	Unverified
Comparative evaluation of instrument segmentation and tracking methods in minimally invasive surgery	May 7, 2018	Benchmarking, Segmentation	Unverified
ChatGPT vs State-of-the-Art Models: A Benchmarking Study in Keyphrase Generation Task	Apr 27, 2023	Articles, Benchmarking	Unverified
Benchmarking Answer Verification Methods for Question Answering-Based Summarization Evaluation Metrics	Apr 21, 2022	Attribute, Benchmarking	Unverified
Comparing Computing Platforms for Deep Learning on a Humanoid Robot	Sep 11, 2018	Benchmarking, CPU	Unverified
Benchmarking Answer Verification Methods for Question Answering-Based Summarization Evaluation Metrics	Sep 17, 2021	Attribute, Benchmarking	Unverified
Comparing Hyper-optimized Machine Learning Models for Predicting Efficiency Degradation in Organic Solar Cells	Mar 29, 2024	Benchmarking	Unverified
ChatGPT Alternative Solutions: Large Language Models Survey	Mar 21, 2024	Benchmarking, Chatbot	Unverified
Comparison and Benchmarking of AI Models and Frameworks on Mobile Devices	May 7, 2020	Benchmarking, Diversity	Unverified
Comparison of feature extraction and dimensionality reduction methods for single channel extracellular spike sorting	Feb 10, 2016	Benchmarking, Clustering	Unverified
Comparison of tree-based ensemble algorithms for merging satellite and earth-observed precipitation data at the daily time scale	Dec 31, 2022	Benchmarking, regression	Unverified
An Empirical Study of Automated Mislabel Detection in Real World Vision Datasets	Dec 2, 2023	Benchmarking	Unverified
CompBench: Benchmarking Complex Instruction-guided Image Editing	May 18, 2025	Benchmarking, Instruction Following	Unverified
Chart-to-Experience: Benchmarking Multimodal LLMs for Predicting Experiential Impact of Charts	May 23, 2025	Benchmarking	Unverified
CHaRNet: Conditioned Heatmap Regression for Robust Dental Landmark Localization	Jan 22, 2025	Benchmarking, regression	Unverified
Characterizing Transactional Databases for Frequent Itemset Mining	Nov 9, 2020	Benchmarking	Unverified
Benchmarking and Validation of Sub-mW 30GHz VG-LNAs in 22nm FDSOI CMOS for 5G/6G Phased-Array Receivers	Sep 11, 2024	Benchmarking	Unverified
Complexity of Representations in Deep Learning	Sep 1, 2022	Benchmarking, Deep Learning	Unverified
Comprehensive Benchmark Datasets for Amharic Scene Text Detection and Recognition	Mar 23, 2022	Benchmarking, Scene Text Detection	Unverified
Characterizing the adversarial vulnerability of speech self-supervised learning	Nov 8, 2021	Adversarial Robustness, Benchmarking	Unverified
Characterizing Missing Information in Deep Networks Using Backpropagated Gradients	Jan 1, 2020	Anomaly Detection, Attribute	Unverified
An Empirical Investigation into Benchmarking Model Multiplicity for Trustworthy Machine Learning: A Case Study on Image Classification	Nov 24, 2023	Benchmarking, image-classification	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified