Benchmarking

Papers


Showing 4976–5000 of 5548 papers

Title	Date	Tasks	Status
Evaluating Feature Attribution Methods in the Image Domain	Feb 22, 2022	Benchmarking	Code Available
NegBio: a high-performance tool for negation and uncertainty detection in radiology reports	Dec 16, 2017	Benchmarking, Negation	Code Available
A Comprehensive Comparison of Multi-Dimensional Image Denoising Methods	Nov 6, 2020	Benchmarking, Denoising	Code Available
NeMig -- A Bilingual News Collection and Knowledge Graph about Migration	Sep 1, 2023	Articles, Benchmarking	Code Available
NengoDL: Combining deep learning and neuromorphic modelling methods	May 28, 2018	Benchmarking, Deep Learning	Code Available
Evaluating AI Recruitment Sourcing Tools by Human Preference	Apr 3, 2025	Benchmarking	Code Available
EvalAI: Towards Better Evaluation Systems for AI Agents	Feb 10, 2019	Benchmarking, BIG-bench Machine Learning	Code Available
Essential guidelines for computational method benchmarking	Dec 3, 2018	Benchmarking	Code Available
Benchmarking of LSTM Networks	Aug 11, 2015	Benchmarking	Code Available
NerveNet: Learning Structured Policy with Graph Neural Networks	Jan 1, 2018	Benchmarking, continuous-control	Code Available
How Fragile is Relation Extraction under Entity Replacements?	May 22, 2023	Benchmarking, Causal Inference	Code Available
Benchmarking Network Embedding Models for Link Prediction: Are We Making Progress?	Feb 25, 2020	Benchmarking, Link Prediction	Code Available
Sequence-Aware Recommender Systems	Feb 23, 2018	Benchmarking, Matrix Completion	Code Available
WCEbleedGen: A wireless capsule endoscopy dataset and its benchmarking for automatic bleeding classification, detection, and segmentation	Aug 22, 2024	Benchmarking, Classification	Code Available
Enterprise Benchmarks for Large Language Model Evaluation	Oct 11, 2024	Benchmarking, Language Model Evaluation	Code Available
Enriching Social Science Research via Survey Item Linking	Dec 20, 2024	Benchmarking, Entity Disambiguation	Code Available
Sequential Large Language Model-Based Hyper-parameter Optimization	Oct 27, 2024	Bayesian Optimization, Benchmarking	Code Available
Neural Network Design: Learning from Neural Architecture Search	Nov 1, 2020	Benchmarking, image-classification	Code Available
Benchmarking of image registration methods for differently stained histological slides	Oct 11, 2018	Benchmarking, BIRL	Code Available
BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs	Jun 21, 2022	Anomaly Detection, Benchmarking	Code Available
Enhancing Video Summarization with Context Awareness	Apr 6, 2024	Benchmarking, Informativeness	Code Available
Enhancing Treatment Effect Estimation via Active Learning: A Counterfactual Covering Perspective	May 8, 2025	Active Learning, Benchmarking	Code Available
Benchmarking Neural Machine Translation for Southern African Languages	Jun 17, 2019	Benchmarking, Machine Translation	Code Available
Enhancing Hyper-To-Real Space Projections Through Euclidean Norm Meta-Heuristic Optimization	Jan 31, 2023	Benchmarking	Code Available
Enhancing Biomedical Knowledge Discovery for Diseases: An Open-Source Framework Applied on Rett Syndrome and Alzheimer's Disease	Jul 18, 2024	Benchmarking	Code Available



Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified