Benchmarking

Papers

Showing 2976–3000 of 5548 papers

Title	Date	Tasks	Status
Benchmarking the Robustness of Panoptic Segmentation for Automated Driving	Feb 23, 2024	Benchmarking, Decision Making	Unverified
The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs	Oct 2, 2024	Benchmarking, Hallucination	Unverified
Identifiable Convex-Concave Regression via Sub-gradient Regularised Least Squares	Jun 22, 2025	Benchmarking, regression	Unverified
Identification of vortex in unstructured mesh with graph neural networks	Nov 11, 2023	Benchmarking, Graph Generation	Unverified
The Leaderboard Illusion	Apr 29, 2025	Benchmarking, Chatbot	Unverified
XCSP3: An Integrated Format for Benchmarking Combinatorial Constrained Problems	Nov 10, 2016	Benchmarking	Unverified
Identifying patterns and recommendations of and for sustainable open data initiatives: a benchmarking-driven analysis of open government data initiatives among European countries	Dec 1, 2023	Benchmarking	Unverified
Identifying the Context Shift between Test Benchmarks and Production Data	Jul 3, 2022	Benchmarking, BIG-bench Machine Learning	Unverified
The Liouville Generator for Producing Integrable Expressions	Jun 17, 2024	Benchmarking	Unverified
Benchmarking the Robustness of Instance Segmentation Models	Sep 2, 2021	Benchmarking, Domain Adaptation	Unverified
Benchmarking the Reliability of Post-training Quantization: a Particular Focus on Worst-case Performance	Mar 23, 2023	Benchmarking, Data Augmentation	Unverified
IEA: Inner Ensemble Average within a convolutional neural network	Aug 30, 2018	Benchmarking, Ensemble Learning	Unverified
Benchmarking the rationality of AI decision making using the transitivity axiom	Feb 14, 2025	Benchmarking, Decision Making	Unverified
A Gap in Time: The Challenge of Processing Heterogeneous IoT Data in Digitalized Buildings	May 23, 2024	Benchmarking, Data Integration	Unverified
Exploring the Decentraland Economy: Multifaceted Parcel Attributes, Key Insights, and Benchmarking	Apr 11, 2024	Attribute, Benchmarking	Unverified
A2Perf: Real-World Autonomous Agents Benchmark	Mar 4, 2025	Benchmarking, Combinatorial Optimization	Unverified
Benchmarking the Physical-world Adversarial Robustness of Vehicle Detection	Apr 11, 2023	Adversarial Attack, Adversarial Robustness	Unverified
Benchmarking the Neural Linear Model for Regression	Dec 18, 2019	Bayesian Optimization, Benchmarking	Unverified
The Low Emission Oil&Gas Open (LEOGO) Reference Platform of an Off-Grid Energy System for Renewable Integration Studies	Aug 16, 2022	Benchmarking, Management	Unverified
From Attack to Protection: Leveraging Watermarking Attack Network for Advanced Add-on Watermarking	Aug 14, 2020	Benchmarking	Unverified
Image2Struct: Benchmarking Structure Extraction for Vision-Language Models	Oct 29, 2024	Benchmarking	Unverified
Image-Based Benchmarking and Visualization for Large-Scale Global Optimization	Jul 24, 2020	Benchmarking, Dimensionality Reduction	Unverified
Benchmarking the Impact of Noise on Deep Learning-based Classification of Atrial Fibrillation in 12-Lead ECG	Mar 24, 2023	Atrial Fibrillation Detection, Benchmarking	Unverified
Benchmarking the human brain against computational architectures	May 15, 2023	Benchmarking, Computational Efficiency	Unverified
Image Matching: An Application-oriented Benchmark	Sep 12, 2017	Attribute, Benchmarking	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified