Benchmarking

Papers


Showing 1251–1300 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Benchmarking Transcriptomics Foundation Models for Perturbation Analysis : one PCA still rules them all	Oct 17, 2024	All, Benchmarking	Code Available	1	5
ImageNet-E: Benchmarking Neural Network Robustness via Attribute Editing	Mar 30, 2023	Attribute, Benchmarking	Code Available	1	5
Implicit Multi-Spectral Transformer: An Lightweight and Effective Visible to Infrared Image Translation Model	Apr 10, 2024	Benchmarking, Image-to-Image Translation	Code Available	1	5
Benchmarking emergency department triage prediction models with machine learning and large public electronic health records	Nov 22, 2021	Benchmarking	Code Available	1	5
SoK: Membership Inference Attacks on LLMs are Rushing Nowhere (and How to Fix It)	Jun 25, 2024	Benchmarking, Experimental Design	Code Available	1	5
CompanyKG: A Large-Scale Heterogeneous Graph for Company Similarity Quantification	Jun 18, 2023	Benchmarking, Retrieval	Code Available	1	5
Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks	Jun 14, 2020	Benchmarking, Deep Reinforcement Learning	Code Available	1	5
Benchmarking Language Models for Code Syntax Understanding	Oct 26, 2022	Benchmarking	Code Available	1	5
TextEE: Benchmark, Reevaluation, Reflections, and Future Challenges in Event Extraction	Nov 16, 2023	Benchmarking, Event Extraction	Code Available	1	5
Illuminating Darkness: Enhancing Real-world Low-light Scenes with Smartphone Images	Mar 10, 2025	4k, Benchmarking	Code Available	1	5
A Survey on Graph Counterfactual Explanations: Definitions, Methods, Evaluation, and Research Challenges	Oct 21, 2022	Benchmarking, Community Detection	Code Available	1	5
MEGA: Multilingual Evaluation of Generative AI	Mar 22, 2023	Benchmarking	Code Available	1	5
Benchmarking Language Model Creativity: A Case Study on Code Generation	Jul 12, 2024	Benchmarking, Code Generation	Code Available	1	5
Benchmarking the Spectrum of Agent Capabilities	Sep 14, 2021	Benchmarking	Code Available	1	5
Benchmarking of DL Libraries and Models on Mobile Devices	Feb 14, 2022	Benchmarking, GPU	Code Available	1	5
MetaFormer and CNN Hybrid Model for Polyp Image Segmentation	Sep 16, 2024	Benchmarking, Image Segmentation	Code Available	1	5
Meta-Surrogate Benchmarking for Hyperparameter Optimization	May 30, 2019	Benchmarking, Hyperparameter Optimization	Code Available	1	5
Benchmarking Quantized Neural Networks on FPGAs with FINN	Feb 2, 2021	Benchmarking, Quantization	Code Available	1	5
Image Colorization: A Survey and Dataset	Aug 25, 2020	Benchmarking, Colorization	Code Available	1	5
MGTBench: Benchmarking Machine-Generated Text Detection	Mar 26, 2023	Benchmarking, Question Answering	Code Available	1	5
IDToolkit: A Toolkit for Benchmarking and Developing Inverse Design Algorithms in Nanophotonics	May 30, 2023	Benchmarking	Code Available	1	5
Benchmarking the Robustness of Spatial-Temporal Models Against Corruptions	Oct 13, 2021	Benchmarking, Computational Efficiency	Code Available	1	5
Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation	Feb 18, 2024	Benchmarking, Language Modeling	Code Available	1	5
MIMII DG: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection for Domain Generalization Task	May 27, 2022	Benchmarking, Domain Generalization	Code Available	1	5
Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions	Mar 29, 2024	Action Detection, Benchmarking	Code Available	1	5
Contemporary Symbolic Regression Methods and their Relative Performance	Jul 29, 2021	Benchmarking, parameter estimation	Code Available	1	5
Benchmarking Recommendation, Classification, and Tracing Based on Hugging Face Knowledge Graph	May 23, 2025	Benchmarking, Management	Code Available	1	5
minicons: Enabling Flexible Behavioral and Representational Analyses of Transformer Language Models	Mar 24, 2022	Benchmarking, Sentence	Code Available	1	5
ILIAS: Instance-Level Image retrieval At Scale	Feb 17, 2025	Benchmarking, Image Retrieval	Code Available	1	5
Image Matching across Wide Baselines: From Paper to Practice	Mar 3, 2020	Benchmarking	Code Available	1	5
Benchmarking Relief-Based Feature Selection Methods for Bioinformatics Data Mining	Nov 22, 2017	Benchmarking, feature selection	Code Available	1	5
Benchmarking the Robustness of Deep Neural Networks to Common Corruptions in Digital Pathology	Jun 30, 2022	Benchmarking, Diagnostic	Code Available	1	5
Benchmarking the Performance of Bayesian Optimization across Multiple Experimental Materials Science Domains	May 23, 2021	Active Learning, Bayesian Optimisation	Code Available	1	5
iAMPCN: a deep-learning approach for identifying antimicrobial peptides and their functional activities	Jun 27, 2024	Benchmarking	Code Available	1	5
AirSim Drone Racing Lab	Mar 12, 2020	Benchmarking, Optical Flow Estimation	Code Available	1	5
A framework for benchmarking clustering algorithms	Sep 20, 2022	Benchmarking, Clustering	Code Available	1	5
ICU-Sepsis: A Benchmark MDP Built from Real Medical Data	Jun 9, 2024	Benchmarking, Management	Code Available	1	5
A Comprehensive Overview of Large Language Models	Jul 12, 2023	Benchmarking	Code Available	1	5
CovDocker: Benchmarking Covalent Drug Design with Tasks, Datasets, and Solutions	Jun 26, 2025	Benchmarking, Drug Design	Code Available	1	5
Benchmarking Retrieval-Augmented Multimomal Generation for Document Question Answering	May 22, 2025	Benchmarking, Evidence Selection	Code Available	1	5
Benchmarking the Generation of Fact Checking Explanations	Aug 29, 2023	Abstractive Text Summarization, Articles	Code Available	1	5
Arctique: An artificial histopathological dataset unifying realism and controllability for uncertainty quantification	Nov 11, 2024	Benchmarking, Image Segmentation	Code Available	1	5
A Systematic Benchmarking Analysis of Transfer Learning for Medical Image Analysis	Aug 12, 2021	Benchmarking, Medical Image Analysis	Code Available	1	5
Benchmarking Vision, Language, & Action Models on Robotic Learning Tasks	Nov 4, 2024	Action Generation, Benchmarking	Code Available	1	5
Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection	May 30, 2022	3D Object Detection, Autonomous Driving	Code Available	1	5
A framework for benchmarking class-out-of-distribution detection and its application to ImageNet	Feb 23, 2023	Benchmarking, Knowledge Distillation	Code Available	1	5
Benchmarking TinyML Systems: Challenges and Direction	Mar 10, 2020	Benchmarking, Position	Code Available	1	5
Geometric Deep Learning for Structure-Based Drug Design: A Survey	Jun 20, 2023	Benchmarking, Deep Learning	Code Available	1	5
A Japanese Dataset for Subjective and Objective Sentiment Polarity Classification in Micro Blog Domain	Jun 1, 2022	Benchmarking, Emotion Recognition	Code Available	1	5
iDNA-ABF: multi-scale deep biological language learning model for the interpretable prediction of DNA methylations	Oct 17, 2022	Benchmarking, Text Classification	Code Available	1	5


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified