Benchmarking

Papers

Showing 2151–2175 of 5548 papers

Title	Date	Tasks	Status
Benchmarking Adaptative Variational Quantum Algorithms on QUBO Instances	Aug 3, 2023	Benchmarking	Unverified
Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark	Jun 4, 2018	Benchmarking, BIG-bench Machine Learning	Unverified
Exploring the Adversarial Frontier: Quantifying Robustness via Adversarial Hypervolume	Mar 8, 2024	Adversarial Robustness, Benchmarking	Unverified
Exploring Thermography Technology: A Comprehensive Facial Dataset for Face Detection, Recognition, and Emotion	May 28, 2024	Benchmarking, Emotion Recognition	Unverified
A Benchmarking on Cloud based Speech-To-Text Services for French Speech and Background Noise Effect	May 7, 2021	Benchmarking, Speech-to-Text	Unverified
Benchmarking Active Learning Strategies for Materials Optimization and Discovery	Apr 12, 2022	Active Learning, Benchmarking	Unverified
Analysis and Benchmarking of Extending Blind Face Image Restoration to Videos	Oct 15, 2024	Benchmarking, Blind Face Restoration	Unverified
TaskEval: Assessing Difficulty of Code Generation Tasks for Large Language Models	Jul 30, 2024	Benchmarking, Code Completion	Unverified
BrokenVideos: A Benchmark Dataset for Fine-Grained Artifact Localization in AI-Generated Videos	Jun 25, 2025	Artifact Detection, Benchmarking	Unverified
Bringing Quantum Algorithms to Automated Machine Learning: A Systematic Review of AutoML Frameworks Regarding Extensibility for QML Algorithms	Oct 6, 2023	AutoML, Benchmarking	Unverified
Benchmarking Active Learning for NILM	Nov 24, 2024	Active Learning, Benchmarking	Unverified
Bridging vision language model (VLM) evaluation gaps with a framework for scalable and cost-effective benchmark generation	Feb 21, 2025	Benchmarking, Language Modeling	Unverified
Analysing Features Learned Using Unsupervised Models on Program Embeddings	Jan 1, 2021	Benchmarking, Binary Classification	Unverified
ExtremeAIGC: Benchmarking LMM Vulnerability to AI-Generated Extremist Content	Mar 13, 2025	Benchmarking, Image Generation	Unverified
Toward Bridging the Simulated-to-Real Gap: Benchmarking Super-Resolution on Real Data	Sep 17, 2018	Benchmarking, Super-Resolution	Unverified
Analysing Errors of Open Information Extraction Systems	Jul 24, 2017	Benchmarking, Open Information Extraction	Unverified
Exploring Capabilities of Time Series Foundation Models in Building Analytics	Oct 28, 2024	Benchmarking, energy management	Unverified
Bridging the Gap Between Theory and Practice: Benchmarking Transfer Evolutionary Optimization	Apr 20, 2024	Benchmarking	Unverified
Bridging the Bosphorus: Advancing Turkish Large Language Models through Strategies for Low-Resource Language Adaptation and Benchmarking	May 7, 2024	Benchmarking, Model Selection	Unverified
Benchmarking Abstractive Summarisation: A Dataset of Human-authored Summaries of Norwegian News Articles	Jan 13, 2025	Articles, Benchmarking	Unverified
Exploring Continual Learning of Diffusion Models	Mar 27, 2023	Benchmarking, Continual Learning	Unverified
Benchmarking a Benchmark: How Reliable is MS-COCO?	Nov 5, 2023	Benchmarking, image-classification	Unverified
A Benchmarking Environment for Reinforcement Learning Based Task Oriented Dialogue Management	Nov 29, 2017	Benchmarking, Deep Reinforcement Learning	Unverified
Breakpoint: Scalable evaluation of system-level reasoning in LLM code agents	May 30, 2025	Benchmarking, Code Repair	Unverified
A new pathway to generative artificial intelligence by minimizing the maximum entropy	Feb 18, 2025	Benchmarking	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified