Benchmarking

Papers

Showing 2251–2275 of 5548 papers

Title	Date	Tasks	Status
A CUDA-Based Real Parameter Optimization Benchmark	Jul 29, 2014	Benchmarking, CPU	Unverified
Beyond Text: A Deep Dive into Large Language Models' Ability on Understanding Graph Data	Oct 7, 2023	Benchmarking	Unverified
BEADs: Bias Evaluation Across Domains	Jun 6, 2024	Benchmarking, Bias Detection	Unverified
FedSym: Unleashing the Power of Entropy for Benchmarking the Algorithms for Federated Learning	Oct 11, 2023	Benchmarking, Diversity	Unverified
FERA 2017 - Addressing Head Pose in the Third Facial Expression Recognition and Analysis Challenge	Feb 14, 2017	Benchmarking, Facial Action Unit Detection	Unverified
Energy Models for Better Pseudo-Labels: Improving Semi-Supervised Classification with the 1-Laplacian Graph Energy	Jun 20, 2019	Benchmarking, Multi-class Classification	Unverified
Beyond Static Models and Test Sets: Benchmarking the Potential of Pre-trained Models Across Tasks and Languages	May 12, 2022	Benchmarking, Diversity	Unverified
Beyond Specialization: Benchmarking LLMs for Transliteration of Indian Languages	May 26, 2025	Benchmarking, Transliteration	Unverified
BEACON: A Benchmark for Efficient and Accurate Counting of Subgraphs	Apr 15, 2025	Benchmarking, Subgraph Counting	Unverified
FIMP: Foundation Model-Informed Message Passing for Graph Neural Networks	Oct 17, 2022	Benchmarking, Graph Neural Network	Unverified
Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms	Mar 1, 2024	Benchmarking, Stochastic Optimization	Unverified
Beyond Self-Talk: A Communication-Centric Survey of LLM-Based Multi-Agent Systems	Feb 20, 2025	Benchmarking, Decision Making	Unverified
ActPlan-1K: Benchmarking the Procedural Planning Ability of Visual Language Models in Household Activities	Oct 4, 2024	Benchmarking, counterfactual	Unverified
BBOB Instance Analysis: Landscape Properties and Algorithm Performance across Problem Instances	Nov 29, 2022	Benchmarking	Unverified
A Benchmark for Multi-speaker Anonymization	Jul 8, 2024	Benchmarking, Disentanglement	Unverified
FedHPO-B: A Benchmark Suite for Federated Hyperparameter Optimization	Jun 8, 2022	Benchmarking, Federated Learning	Unverified
FedNLP: Benchmarking Federated Learning Methods for Natural Language Processing Tasks	Jan 16, 2022	Benchmarking, Federated Learning	Unverified
FER-C: Benchmarking Out-of-Distribution Soft Calibration for Facial Expression Recognition	Dec 16, 2023	Benchmarking, Facial Expression Recognition	Unverified
A Modular Framework for Centrality and Clustering in Complex Networks	Nov 23, 2021	Benchmarking, Clustering	Unverified
Beyond Monocular Deraining: Stereo Image Deraining via Semantic Understanding	Aug 1, 2020	Benchmarking, Rain Removal	Unverified
Beyond Monocular Deraining: Parallel Stereo Deraining Network Via Semantic Prior	May 9, 2021	Benchmarking, Rain Removal	Unverified
Bayesian Neural Networks at Scale: A Performance Analysis and Pruning Study	May 23, 2020	Benchmarking, Network Pruning	Unverified
SPINEX-TimeSeries: Similarity-based Predictions with Explainable Neighbors Exploration for Time Series and Forecasting Problems	Aug 4, 2024	Benchmarking, Computational Efficiency	Unverified
Beyond Metrics: A Critical Analysis of the Variability in Large Language Model Evaluation Frameworks	Jul 29, 2024	Benchmarking, Language Model Evaluation	Unverified
Bayesian Multi-type Mean Field Multi-agent Imitation Learning	Dec 1, 2020	Benchmarking, Imitation Learning	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified