Benchmarking

Papers

Showing 4651–4700 of 5548 papers

Title	Date	Tasks	Status
Mamba-Based Ensemble learning for White Blood Cell Classification	Apr 15, 2025	Benchmarking, Classification	Code Available
Better Late Than Never: Formulating and Benchmarking Recommendation Editing	Jun 6, 2024	Benchmarking, Recommendation Systems	Code Available
Better force fields start with better data -- A data set of cation dipeptide interactions	Jul 19, 2021	Benchmarking	Code Available
MANTRA: The Manifold Triangulations Assemblage	Oct 3, 2024	Benchmarking	Code Available
BeSt-LeS: Benchmarking Stroke Lesion Segmentation using Deep Supervision	Oct 10, 2023	Acute Stroke Lesion Segmentation, Benchmarking	Code Available
debiaSAE: Benchmarking and Mitigating Vision-Language Model Bias	Oct 17, 2024	Benchmarking, Bias Detection	Code Available
VizSeq: A Visual Analysis Toolkit for Text Generation Tasks	Sep 12, 2019	Benchmarking, Image Captioning	Code Available
PATH: A Discrete-sequence Dataset for Evaluating Online Unsupervised Anomaly Detection Approaches for Multivariate Time Series	Nov 21, 2024	Anomaly Detection, Benchmarking	Code Available
Hi Guys or Hi Folks? Benchmarking Gender-Neutral Machine Translation with the GeNTE Corpus	Oct 8, 2023	Benchmarking, Machine Translation	Code Available
Margin-bounded Confidence Scores for Out-of-Distribution Detection	Sep 22, 2024	Autonomous Driving, Benchmarking	Code Available
Benchmarks for Graph Embedding Evaluation	Aug 19, 2019	Benchmarking, Graph Embedding	Code Available
High-Quality, ROS Compatible Video Encoding and Decoding for High-Definition Datasets	Aug 1, 2024	Benchmarking, Simultaneous Localization and Mapping	Code Available
MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset	Jun 4, 2024	Benchmarking	Code Available
MARTA: a model for the automatic phonemic grouping of the parkinsonian speech	Mar 19, 2024	Benchmarking, Classification	Code Available
High-Dynamic-Range Imaging for Cloud Segmentation	Mar 2, 2018	Benchmarking, Image Generation	Code Available
Hierarchical Neural Networks for Sequential Sentence Classification in Medical Scientific Abstracts	Aug 19, 2018	Benchmarking, Classification	Code Available
The Freiburg Groceries Dataset	Nov 17, 2016	Benchmarking, BIG-bench Machine Learning	Code Available
AMPCliff: quantitative definition and benchmarking of activity cliffs in antimicrobial peptides	Apr 15, 2024	Benchmarking, Protein Language Model	Code Available
Z_2 × Z_2 Equivariant Quantum Neural Networks: Benchmarking against Classical Neural Networks	Nov 30, 2023	Benchmarking, Binary Classification	Code Available
Benchmark of Deep Learning Models on Large Healthcare MIMIC Datasets	Oct 23, 2017	Benchmarking, BIG-bench Machine Learning	Code Available
Hi-EF: Benchmarking Emotion Forecasting in Human-interaction	Jul 23, 2024	Benchmarking	Code Available
Heterogeneous Datasets for Federated Survival Analysis Simulation	Jan 28, 2023	Benchmarking, Federated Learning	Code Available
Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study	Mar 15, 2024	Benchmarking	Code Available
Robust 2D/3D Vehicle Parsing in Arbitrary Camera Views for CVIS	Jan 1, 2021	Benchmarking, Data Augmentation	Code Available
Adaptive Visual Scene Understanding: Incremental Scene Graph Generation	Oct 2, 2023	Benchmarking, Continual Learning	Code Available
HERMES: Holographic Equivariant neuRal network model for Mutational Effect and Stability prediction	Jul 9, 2024	Benchmarking	Code Available
HATE-ITA: New Baselines for Hate Speech Detection in Italian	Jul 1, 2022	Benchmarking, Hate Speech Detection	Code Available
Benchmarking YOLOv5 and YOLOv7 models with DeepSORT for droplet tracking applications	Jan 19, 2023	Benchmarking, GPU	Code Available
Benchmarking White Blood Cell Classification Under Domain Shift	Mar 3, 2023	Benchmarking, Classification	Code Available
MAYA: Addressing Inconsistencies in Generative Password Guessing through a Unified Benchmark	Apr 23, 2025	Benchmarking	Code Available
Robust Benchmarking for Machine Learning of Clinical Entity Extraction	Jul 31, 2020	Benchmarking, BIG-bench Machine Learning	Code Available
MCA-Bench: A Multimodal Benchmark for Evaluating CAPTCHA Robustness Against VLM-based Attacks	Jun 6, 2025	Benchmarking	Code Available
Benchmarking Vision-Language Contrastive Methods for Medical Representation Learning	Jun 11, 2024	Benchmarking, Contrastive Learning	Code Available
A Wild Bootstrap for Degenerate Kernel Tests	Aug 23, 2014	Benchmarking, Time Series	Code Available
Harnessing Orthogonality to Train Low-Rank Neural Networks	Jan 16, 2024	Benchmarking	Code Available
Aux-Drop: Handling Haphazard Inputs in Online Learning Using Auxiliary Dropouts	Mar 9, 2023	Benchmarking	Code Available
Causally Testing Gender Bias in LLMs: A Case Study on Occupational Bias	Dec 20, 2022	Benchmarking	Code Available
Benchmarking Unsupervised Strategies for Anomaly Detection in Multivariate Time Series	Jun 25, 2025	Anomaly Detection, Benchmarking	Code Available
Harmonization Benchmarking Tool for Neuroimaging Datasets	Nov 15, 2022	Benchmarking, Diffusion MRI	Code Available
Adaptive Shrinkage Estimation For Personalized Deep Kernel Regression In Modeling Brain Trajectories	Apr 10, 2025	Additive models, Benchmarking	Code Available
Benchmarking Unsupervised Online IDS for Masquerade Attacks in CAN	Jun 19, 2024	Benchmarking, Intrusion Detection	Code Available
The iToBoS dataset: skin region images extracted from 3D total body photographs for lesion detection	Jan 30, 2025	Benchmarking, Diagnostic	Code Available
Benchmarking Ultra-High-Definition Image Reflection Removal	Aug 1, 2023	Benchmarking, Image Restoration	Code Available
Understanding the Role of LLMs in Multimodal Evaluation Benchmarks	Oct 16, 2024	Benchmarking, Large Language Model	Code Available
VocalBench: Benchmarking the Vocal Conversational Abilities for Speech Interaction Models	May 21, 2025	Benchmarking	Code Available
Measuring what Really Matters: Optimizing Neural Networks for TinyML	Apr 21, 2021	Benchmarking	Code Available
Benchmarking Traditional Machine Learning and Deep Learning Models for Fault Detection in Power Transformers	May 7, 2025	Benchmarking, Fault Detection	Code Available
Benchmarking TPU, GPU, and CPU Platforms for Deep Learning	Jul 24, 2019	Benchmarking, CPU	Code Available
RoLargeSum: A Large Dialect-Aware Romanian News Dataset for Summary, Headline, and Keyword Generation	Dec 15, 2024	Articles, Benchmarking	Code Available
Hardware Aware Neural Network Architectures using FbNet	Jun 17, 2019	Benchmarking, Neural Architecture Search	Code Available

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified