Benchmarking

Papers

Title	Date	Tasks	Status	Score
Bayesian Neural Networks with Soft Evidence	Oct 19, 2020	Benchmarking	Code Available	5
A Modular Workflow for Performance Benchmarking of Neuronal Network Simulations	Dec 16, 2021	Benchmarking	Code Available	5
IndiBias: A Benchmark Dataset to Measure Social Biases in Language Models for Indian Context	Mar 29, 2024	Benchmarking, Sentence	Code Available	5
CVPR 2020 Continual Learning in Computer Vision Competition: Approaches, Results, Current Challenges and Future Directions	Sep 14, 2020	Benchmarking, Continual Learning	Code Available	5
Partial Rankings of Optimizers	Feb 26, 2024	Benchmarking	Code Available	5
Improving Generalization of Neural Vehicle Routing Problem Solvers Through the Lens of Model Architecture	Jun 10, 2024	Benchmarking, Decoder	Code Available	5
Beyond Marginal Uncertainty: How Accurately can Bayesian Regression Models Estimate Posterior Predictive Correlations?	Nov 6, 2020	Active Learning, Benchmarking	Code Available	5
Improving Pretrained Models for Zero-shot Multi-label Text Classification through Reinforced Label Hierarchy Reasoning	Apr 4, 2021	Benchmarking, Multi Label Text Classification	Code Available	5
Improvements & Evaluations on the MLCommons CloudMask Benchmark	Mar 7, 2024	Benchmarking	Code Available	5
Improving the Perturbation-Based Explanation of Deepfake Detectors Through the Use of Adversarially-Generated Samples	Feb 6, 2025	Benchmarking, DeepFake Detection	Code Available	5
Individual Fairness Guarantees for Neural Networks	May 11, 2022	Benchmarking, Fairness	Code Available	5
Beyond Document Page Classification: Design, Datasets, and Challenges	Aug 24, 2023	Benchmarking, Classification	Code Available	5
A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning	Jan 29, 2019	Benchmarking, Deep Learning	Code Available	5
Improved Target-specific Stance Detection on Social Media Platforms by Delving into Conversation Threads	Nov 6, 2022	Benchmarking, Opinion Mining	Code Available	5
Benchmarking Feature-based Algorithm Selection Systems for Black-box Numerical Optimization	Sep 17, 2021	Benchmarking	Code Available	5
Performance Evaluation of Real-Time Object Detection for Electric Scooters	May 5, 2024	Autonomous Vehicles, Benchmarking	Code Available	5
Improve Machine Learning carbon footprint using Nvidia GPU and Mixed Precision training for classification models -- Part I	Sep 12, 2024	Benchmarking, CPU	Code Available	5
BASED: Benchmarking, Analysis, and Structural Estimation of Deblurring	May 27, 2023	Benchmarking, Deblurring	Code Available	5
Neural Style Transfer Improves 3D Cardiovascular MR Image Segmentation on Inconsistent Data	Sep 20, 2019	Benchmarking, Ensemble Learning	Code Available	5
Beyond Atomic Geometry Representations in Materials Science: A Human-in-the-Loop Multimodal Framework	May 30, 2025	Benchmarking	Code Available	5
Benchmarking Feature Upsampling Methods for Vision Foundation Models using Interactive Segmentation	May 4, 2025	Benchmarking, Feature Upsampling	Code Available	5
Beyond Accuracy: A Consolidated Tool for Visual Question Answering Benchmarking	Oct 11, 2021	Benchmarking, Question Answering	Code Available	5
Improved Multilingual Language Model Pretraining for Social Media Text via Translation Pair Prediction	Oct 20, 2021	Benchmarking, Language Modeling	Code Available	5
Improve Machine Learning carbon footprint using Parquet dataset format and Mixed Precision training for regression models -- Part II	Sep 17, 2024	Benchmarking, Descriptive	Code Available	5
InDL: A New Dataset and Benchmark for In-Diagram Logic Interpretation based on Visual Illusion	May 28, 2023	Benchmarking, Decision Making	Code Available	5
Integration of nested cross-validation, automated hyperparameter optimization, high-performance computing to reduce and quantify the variance of test performance estimation of deep learning models	Mar 11, 2025	Benchmarking, Hyperparameter Optimization	Code Available	5
Immunofluorescence Capillary Imaging Segmentation: Cases Study	Jul 14, 2022	Benchmarking, Image Segmentation	Code Available	5
Impact of ImageNet Model Selection on Domain Adaptation	Feb 6, 2020	Benchmarking, Domain Adaptation	Code Available	5
Better Late Than Never: Formulating and Benchmarking Recommendation Editing	Jun 6, 2024	Benchmarking, Recommendation Systems	Code Available	5
Better force fields start with better data -- A data set of cation dipeptide interactions	Jul 19, 2021	Benchmarking	Code Available	5
BanglaNLP at BLP-2023 Task 2: Benchmarking different Transformer Models for Sentiment Analysis of Bangla Social Media Posts	Oct 13, 2023	Benchmarking, Sentiment Analysis	Code Available	5
ImmersePro: End-to-End Stereo Video Synthesis Via Implicit Disparity Learning	Sep 30, 2024	Benchmarking, Disparity Estimation	Code Available	5
BeSt-LeS: Benchmarking Stroke Lesion Segmentation using Deep Supervision	Oct 10, 2023	Acute Stroke Lesion Segmentation, Benchmarking	Code Available	5
Balancing policy constraint and ensemble size in uncertainty-based offline reinforcement learning	Mar 26, 2023	Behavioural cloning, Benchmarking	Code Available	5
Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions	Dec 11, 2024	Benchmarking, Question Answering	Code Available	5
Action-conditioned Benchmarking of Robotic Video Prediction Models: a Comparative Study	Oct 7, 2019	Benchmarking, Prediction	Code Available	5
Illuminating the Diversity-Fitness Trade-Off in Black-Box Optimization	Aug 29, 2024	Benchmarking, Diversity	Code Available	5
ImpliRet: Benchmarking the Implicit Fact Retrieval Challenge	Jun 17, 2025	Benchmarking, Retrieval	Code Available	5
A Meta-Analysis of the Anomaly Detection Problem	Mar 3, 2015	Anomaly Detection, Benchmarking	Code Available	5
Benchmarks for Graph Embedding Evaluation	Aug 19, 2019	Benchmarking, Graph Embedding	Code Available	5
BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset	Mar 9, 2023	Benchmarking, Deep Learning	Code Available	5
Identifying the Smallest Adversarial Load Perturbations that Render DC-OPF Infeasible	Jul 10, 2025	Adversarial Attack, Benchmarking	Code Available	5
Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari	Feb 24, 2018	Atari Games, Benchmarking	Code Available	5
IHCV: Discovery of Hidden Time-Dependent Control Variables in Non-Linear Dynamical Systems	Apr 5, 2023	Benchmarking	Code Available	5
Benchmark of Deep Learning Models on Large Healthcare MIMIC Datasets	Oct 23, 2017	Benchmarking, BIG-bench Machine Learning	Code Available	5
AlphaZip: Neural Network-Enhanced Lossless Text Compression	Sep 23, 2024	Benchmarking, Data Compression	Code Available	5
Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study	Mar 15, 2024	Benchmarking	Code Available	5
PPM: Automated Generation of Diverse Programming Problems for Benchmarking Code Generation Models	Jan 28, 2024	Benchmarking, Code Generation	Code Available	5
IdeaBench: Benchmarking Large Language Models for Research Idea Generation	Oct 31, 2024	Benchmarking, scientific discovery	Code Available	5
Identifying and Benchmarking Natural Out-of-Context Prediction Problems	Oct 25, 2021	Benchmarking	Code Available	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified