Benchmarking

Papers



Title	Date	Tasks	Status
Disability prediction in multiple sclerosis using performance outcome measures and demographic data	Apr 8, 2022	Benchmarking, BIG-bench Machine Learning	Unverified
Discriminative Link Prediction using Local Links, Node Features and Community Structure	Oct 17, 2013	Benchmarking, Clustering	Unverified
CLAMS: A Cluster Ambiguity Measure for Estimating Perceptual Variability in Visual Clustering	Aug 1, 2023	Benchmarking, Clustering	Unverified
Benchmarking a wide range of optimisers for solving the Fermi-Hubbard model using the variational quantum eigensolver	Nov 20, 2024	Benchmarking	Unverified
Classification and Retrieval of Digital Pathology Scans: A New Dataset	May 22, 2017	Benchmarking, General Classification	Unverified
A biologically-inspired multi-modal evaluation of molecular generative machine learning	Aug 20, 2022	Benchmarking, Drug Discovery	Unverified
Classifying neuromorphic data using a deep learning framework for image classification	Jul 2, 2018	Benchmarking, Deep Learning	Unverified
DIF: A Framework for Benchmarking and Verifying Implicit Bias in LLMs	May 15, 2025	Benchmarking, Fairness	Unverified
Benchmarking Automatic Speech Recognition coupled LLM Modules for Medical Diagnostics	Feb 18, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
DI-BENCH: Benchmarking Large Language Models on Dependency Inference with Testable Repositories at Scale	Jan 23, 2025	Benchmarking	Unverified
Diff5T: Benchmarking Human Brain Diffusion MRI with an Extensive 5.0 Tesla K-Space and Spatial Dataset	Dec 9, 2024	Benchmarking, Diffusion MRI	Unverified
CityLearn v2: Energy-flexible, resilient, occupant-centric, and carbon-aware management of grid-interactive communities	May 2, 2024	Benchmarking, Management	Unverified
Benchmarking Bayesian Deep Learning on Diabetic Retinopathy Detection Tasks	Nov 23, 2022	Benchmarking, Deep Learning	Unverified
Addressing the Real-world Class Imbalance Problem in Dermatology	Oct 9, 2020	Benchmarking, Few-Shot Learning	Unverified
CISOL: An Open and Extensible Dataset for Table Structure Recognition in the Construction Industry	Jan 26, 2025	Benchmarking, Object Detection	Unverified
Benchmarking Automated Review Response Generation for the Hospitality Domain	Dec 1, 2020	Benchmarking, Domain Adaptation	Unverified
Benchmarking bias: Expanding clinical AI model card to incorporate bias reporting of social and non-social factors	Nov 21, 2023	Benchmarking	Unverified
Dialogue Games for Benchmarking Language Understanding: Motivation, Taxonomy, Strategy	Apr 14, 2023	Benchmarking	Unverified
CLIRudit: Cross-Lingual Information Retrieval of Scientific Documents	Apr 22, 2025	Benchmarking, Cross-Lingual Information Retrieval	Unverified
DiffBody: Human Body Restoration by Imagining with Generative Diffusion Prior	Apr 4, 2024	Benchmarking, Image Restoration	Unverified
CLLMate: A Multimodal Benchmark for Weather and Climate Events Forecasting	Sep 27, 2024	Articles, Benchmarking	Unverified
Benchmarking Automated Machine Learning Methods for Price Forecasting Applications	Apr 28, 2023	AutoML, Benchmarking	Unverified
CIMLA: Interpretable AI for inference of differential causal networks	Apr 25, 2023	Benchmarking	Unverified
CloudifierNet -- Deep Vision Models for Artificial Image Processing	Nov 4, 2019	Benchmarking, Code Generation	Unverified
CIFAR-10-Warehouse: Broad and More Realistic Testbeds in Model Generalization Analysis	Oct 6, 2023	Benchmarking, Domain Generalization	Unverified


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified