Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 2551–2600 of 5548 papers

Title	Date	Tasks	Status
Handwritten Text Recognition: A Survey	Feb 12, 2025	BenchmarkingHandwritten Text Recognition	—Unverified
Benchmarking Resource Usage for Efficient Distributed Deep Learning	Jan 28, 2022	BenchmarkingDeep Learning	—Unverified
FedHPO-B: A Benchmark Suite for Federated Hyperparameter Optimization	Jun 8, 2022	BenchmarkingFederated Learning	—Unverified
HaN-Seg: The head and neck organ-at-risk CT and MR segmentation dataset	Jan 3, 2023	BenchmarkingComputed Tomography (CT)	—Unverified
A Survey on Vision Autoregressive Model	Nov 13, 2024	3D GenerationBenchmarking	—Unverified
Federated Deconfounding and Debiasing Learning for Out-of-Distribution Generalization	May 8, 2025	AttributeBenchmarking	—Unverified
A Survey on Temporal Sentence Grounding in Videos	Sep 16, 2021	Action LocalizationBenchmarking	—Unverified
FedAD-Bench: A Unified Benchmark for Federated Unsupervised Anomaly Detection in Tabular Data	Aug 8, 2024	Anomaly DetectionBenchmarking	—Unverified
FeDa4Fair: Client-Level Federated Datasets for Fairness Evaluation	Jun 26, 2025	AttributeBenchmarking	—Unverified
Benchmarking Reinforcement Learning Methods for Dexterous Robotic Manipulation with a Three-Fingered Gripper	Aug 27, 2024	BenchmarkingReinforcement Learning (RL)	—Unverified
4Seasons: Benchmarking Visual SLAM and Long-Term Localization for Autonomous Driving in Challenging Conditions	Dec 31, 2022	Autonomous DrivingBenchmarking	—Unverified
Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead	Dec 21, 2020	Autonomous DrivingBenchmarking	—Unverified
HA-VLN: A Benchmark for Human-Aware Navigation in Discrete-Continuous Environments with Dynamic Multi-Human Interactions, Real-World Validation, and an Open Leaderboard	Mar 18, 2025	BenchmarkingHuman Dynamics	—Unverified
Imitation Learning Datasets: A Toolkit For Creating Datasets, Training Agents and Benchmarking	Mar 1, 2024	BenchmarkingImitation Learning	—Unverified
FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning	Sep 1, 2023	BenchmarkingFederated Learning	—Unverified
FedEval: A Holistic Evaluation Framework for Federated Learning	Nov 19, 2020	BenchmarkingFederated Learning	—Unverified
FeB4RAG: Evaluating Federated Search in the Context of Retrieval Augmented Generation	Feb 19, 2024	BenchmarkingChatbot	—Unverified
Feature selection in linear SVMs via a hard cardinality constraint: a scalable SDP decomposition approach	Apr 15, 2024	Benchmarkingfeature selection	—Unverified
A Survey on Semi-Supervised Learning for Delayed Partially Labelled Data Streams	Jun 16, 2021	Active LearningBenchmarking	—Unverified
Feature Selection and Classification of Hyperspectral Images With Support Vector Machines	Oct 15, 2007	BenchmarkingClassification	—Unverified
Featuremetric benchmarking: Quantum computer benchmarks based on circuit features	Apr 17, 2025	Benchmarking	—Unverified
Airport Capacity and Performance in Europe -- A study of transport economics, service quality and sustainability	Feb 4, 2021	Benchmarking	—Unverified
A Survey on Preserving Fairness Guarantees in Changing Environments	Nov 14, 2022	BenchmarkingDecision Making	—Unverified
Benchmarking Reasoning Robustness in Large Language Models	Mar 6, 2025	BenchmarkingMath	—Unverified
Feature Encodings for Gradient Boosting with Automunge	Sep 25, 2022	BenchmarkingBinarization	—Unverified
gSuite: A Flexible and Framework Independent Benchmark Suite for Graph Neural Network Inference on GPUs	Oct 20, 2022	BenchmarkingComputational Efficiency	—Unverified
Feature-based Evolutionary Diversity Optimization of Discriminating Instances for Chance-constrained Optimization Problems	Jan 24, 2025	BenchmarkingDiversity	—Unverified
Benchmarking real-time monitoring strategies for ethanol production from lignocellulosic biomass	Jan 29, 2021	Benchmarking	—Unverified
FERA 2017 - Addressing Head Pose in the Third Facial Expression Recognition and Analysis Challenge	Feb 14, 2017	BenchmarkingFacial Action Unit Detection	—Unverified
FER-C: Benchmarking Out-of-Distribution Soft Calibration for Facial Expression Recognition	Dec 16, 2023	BenchmarkingFacial Expression Recognition	—Unverified
GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation	Jul 8, 2024	BenchmarkingGraph Embedding	—Unverified
Feasibility of BERT Embeddings For Domain-Specific Knowledge Mining	Jan 16, 2022	BenchmarkingLanguage Modelling	—Unverified
Benchmarking real-time algorithms for in-phase auditory stimulation of low amplitude slow waves with wearable EEG devices during sleep	Mar 4, 2022	BenchmarkingComputational Efficiency	—Unverified
F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration	Dec 17, 2024	BenchmarkingFace Generation	—Unverified
FewNLU: Benchmarking State-of-the-Art Methods for Few-Shot Natural Language Understanding	Nov 16, 2021	BenchmarkingNatural Language Understanding	—Unverified
Few-Shot Defect Segmentation Leveraging Abundant Normal Training Samples Through Normal Background Regularization and Crop-and-Paste Operation	Jul 18, 2020	Anomaly DetectionBenchmarking	—Unverified
Benchmarking Randomized Optimization Algorithms on Binary, Permutation, and Combinatorial Problem Landscapes	Jan 21, 2025	Benchmarking	—Unverified
FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding	Mar 19, 2025	BenchmarkingMultiple-choice	—Unverified
Benchmarking Robustness in Neural Radiance Fields	Jan 10, 2023	BenchmarkingCamera Calibration	—Unverified
Fiber Bundle Morphisms as a Framework for Modeling Many-to-Many Maps	Mar 15, 2022	BenchmarkingSentiment Analysis	—Unverified
Fast Training of Deep Networks with One-Class CNNs	Jun 28, 2020	BenchmarkingClassification	—Unverified
AI-ready Snow Radar Echogram Dataset (SRED) for climate change monitoring	May 1, 2025	BenchmarkingDeep Learning	—Unverified
A Comprehensive Benchmarking Platform for Deep Generative Models in Molecular Design	May 19, 2025	BenchmarkingDrug Discovery	—Unverified
Guidelines for Fine-grained Sentence-level Arabic Readability Annotation	Oct 11, 2024	BenchmarkingSentence	—Unverified
Fast Labeling and Transcription with the Speechalyzer Toolkit	May 1, 2012	Audio ClassificationBenchmarking	—Unverified
Benchmarking Quantum Hardware for Training of Fully Visible Boltzmann Machines	Nov 14, 2016	Benchmarking	—Unverified
FastEnsemble: Benchmarking and Accelerating Ensemble-based Uncertainty Estimation for Image-to-Image Translation	Sep 29, 2021	BenchmarkingImage Generation	—Unverified
Findings of the Shared Task on Offensive Language Identification in Tamil, Malayalam, and Kannada	Apr 1, 2021	BenchmarkingLanguage Identification	—Unverified
Fast Empirical Scenarios	Jul 8, 2023	BenchmarkingDecision Making	—Unverified
Benchmarking Quantum Convolutional Neural Networks for Signal Classification in Simulated Gamma-Ray Burst Detection	Jan 28, 2025	Benchmarking	—Unverified

Show:10 25 50

← PrevPage 52 of 111Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified