Benchmarking

Papers

Showing 1376–1400 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Introducing Milabench: Benchmarking Accelerators for AI	Nov 18, 2024	Benchmarking, Deep Learning	Code Available	1	5
Benchpress: A Scalable and Versatile Workflow for Benchmarking Structure Learning Algorithms	Jul 8, 2021	Benchmarking	Code Available	1	5
BEND: Benchmarking DNA Language Models on biologically meaningful tasks	Nov 21, 2023	Benchmarking, Language Modeling	Code Available	1	5
Introducing the VoicePrivacy Initiative	May 4, 2020	Benchmarking	Code Available	1	5
BenchML: an extensible pipelining framework for benchmarking representations of materials and molecules at scale	Dec 4, 2021	Benchmarking, Hyperparameter Optimization	Code Available	1	5
Benchmarking the Robustness of Deep Neural Networks to Common Corruptions in Digital Pathology	Jun 30, 2022	Benchmarking, Diagnostic	Code Available	1	5
Benchmarking Implicit Neural Representation and Geometric Rendering in Real-Time RGB-D SLAM	Mar 28, 2024	Benchmarking	Code Available	1	5
Benchmark on Drug Target Interaction Modeling from a Structure Perspective	Jul 4, 2024	Benchmarking, Drug Discovery	Code Available	1	5
Benchmarks for Deep Off-Policy Evaluation	Mar 30, 2021	Benchmarking, continuous-control	Code Available	1	5
Intrinsic Image Harmonization	Jun 19, 2021	Benchmarking, Image Harmonization	Code Available	1	5
Exploiting News Article Structure for Automatic Corpus Generation of Entailment Datasets	Oct 22, 2020	Articles, Benchmarking	Code Available	1	5
Align and Distill: Unifying and Improving Domain Adaptive Object Detection	Mar 18, 2024	Benchmarking, object-detection	Code Available	1	5
Event-Free Moving Object Segmentation from Moving Ego Vehicle	Apr 28, 2023	Autonomous Driving, Benchmarking	Code Available	1	5
Ducho 2.0: Towards a More Up-to-Date Unified Framework for the Extraction of Multimodal Features in Recommendation	Mar 7, 2024	Benchmarking, Multimodal Recommendation	Code Available	1	5
Benchmarking the Robustness of Spatial-Temporal Models Against Corruptions	Oct 13, 2021	Benchmarking, Computational Efficiency	Code Available	1	5
Benchmarking Image Retrieval for Visual Localization	Nov 24, 2020	Autonomous Driving, Benchmarking	Code Available	1	5
ArabicaQA: A Comprehensive Dataset for Arabic Question Answering	Mar 26, 2024	Benchmarking, Machine Reading Comprehension	Code Available	1	5
Benchmarking human visual search computational models in natural scenes: models comparison and reference datasets	Dec 10, 2021	Benchmarking	Code Available	1	5
Interpretable statistical representations of neural population dynamics and geometry	Apr 6, 2023	Benchmarking, Decision Making	Code Available	1	5
InstructTTSEval: Benchmarking Complex Natural-Language Instruction Following in Text-to-Speech Systems	Jun 19, 2025	Benchmarking, Descriptive	Code Available	1	5
Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks	Apr 5, 2022	Benchmarking	Code Available	1	5
Physiology-based simulation of the retinal vasculature enables annotation-free segmentation of OCT angiographs	Jul 22, 2022	Benchmarking, Retinal Vessel Segmentation	Code Available	1	5
PIC4rl-gym: a ROS2 modular framework for Robots Autonomous Navigation with Deep Reinforcement Learning	Nov 19, 2022	Autonomous Navigation, Benchmarking	Code Available	1	5
Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning	May 30, 2024	Autonomous Driving, Benchmarking	Code Available	1	5
IntelliGraphs: Datasets for Benchmarking Knowledge Graph Generation	Jul 13, 2023	Benchmarking, Graph Embedding	Code Available	1	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified