Benchmarking

Papers

Showing 2201–2250 of 5548 papers

Title	Date	Tasks	Status
Few-Shot Defect Segmentation Leveraging Abundant Normal Training Samples Through Normal Background Regularization and Crop-and-Paste Operation	Jul 18, 2020	Anomaly Detection, Benchmarking	Unverified
ForamViT-GAN: Exploring New Paradigms in Deep Learning for Micropaleontological Image Analysis	Apr 9, 2023	Benchmarking, Deep Learning	Unverified
GANmut: Generating and Modifying Facial Expressions	Jun 16, 2024	Benchmarking, Diversity	Unverified
h4rm3l: A language for Composable Jailbreak Attack Synthesis	Aug 9, 2024	Benchmarking, Program Synthesis	Unverified
BLAZE: Blazing Fast Privacy-Preserving Machine Learning	May 18, 2020	Benchmarking, BIG-bench Machine Learning	Unverified
BLADE: Benchmark suite for LLM-driven Automated Design and Evolution of iterative optimisation heuristics	Apr 28, 2025	Benchmarking	Unverified
Black-Box Optimization Revisited: Improving Algorithm Selection Wizards through Massive Benchmarking	Oct 8, 2020	Benchmarking	Unverified
BENCHIP: Benchmarking Intelligence Processors	Oct 23, 2017	Benchmarking, Diversity	Unverified
A Multi-Task Deep Learning Approach for Sensor-based Human Activity Recognition and Segmentation	Mar 20, 2023	Activity Recognition, Benchmarking	Unverified
Black-box Bayesian inference for economic agent-based models	Feb 1, 2022	Bayesian Inference, Benchmarking	Unverified
BIQ2021: A Large-Scale Blind Image Quality Assessment Database	Feb 8, 2022	Benchmarking, Blind Image Quality Assessment	Unverified
Adaptive Deep Kernel Learning	May 28, 2019	Benchmarking, Drug Discovery	Unverified
A Multisensory Learning Architecture for Rotation-invariant Object Recognition	Sep 14, 2020	Benchmarking, Object	Unverified
FeB4RAG: Evaluating Federated Search in the Context of Retrieval Augmented Generation	Feb 19, 2024	Benchmarking, Chatbot	Unverified
FeDa4Fair: Client-Level Federated Datasets for Fairness Evaluation	Jun 26, 2025	Attribute, Benchmarking	Unverified
FedAD-Bench: A Unified Benchmark for Federated Unsupervised Anomaly Detection in Tabular Data	Aug 8, 2024	Anomaly Detection, Benchmarking	Unverified
BenchCouncil's View on Benchmarking AI and Other Emerging Workloads	Dec 2, 2019	Benchmarking	Unverified
Biomedical image analysis competitions: The state of current participation practice	Dec 16, 2022	Benchmarking, Survey	Unverified
Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method	Sep 30, 2023	Benchmarking, Reinforcement Learning (RL)	Unverified
Biological Valuation Map of Flanders: A Sentinel-2 Imagery Analysis	Jan 26, 2024	Benchmarking, Semantic Segmentation	Unverified
@Bench: Benchmarking Vision-Language Models for Human-centered Assistive Technology	Sep 21, 2024	Benchmarking, Depth Estimation	Unverified
Biologically Plausible Learning on Neuromorphic Hardware Architectures	Dec 29, 2022	Benchmarking, Quantization	Unverified
Bio-Image Informatics Index BIII: A unique database of image analysis tools and workflows for and by the bioimaging community	Dec 18, 2023	Benchmarking	Unverified
A Multi-rater Comparative Study of Automatic Target Localization Methods for Epilepsy Deep Brain Stimulation Procedures	Jan 26, 2022	Benchmarking	Unverified
A Benchmark for Spray from Nearby Cutting Vehicles	Aug 24, 2021	Autonomous Driving, Benchmarking	Unverified
BioDSA-1K: Benchmarking Data Science Agents for Biomedical Research	May 22, 2025	Benchmarking	Unverified
A Multimodal, Full-Surround Vehicular Testbed for Naturalistic Studies and Benchmarking: Design, Calibration and Deployment	Sep 21, 2017	Autonomous Driving, Benchmarking	Unverified
Binary Classification with Positive Labeling Sources	Aug 2, 2022	Benchmarking, Binary Classification	Unverified
Benanza: Automatic μBenchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs	Nov 16, 2019	Benchmarking, GPU	Unverified
Featuremetric benchmarking: Quantum computer benchmarks based on circuit features	Apr 17, 2025	Benchmarking	Unverified
A Multi-Labeled Dataset for Indonesian Discourse: Examining Toxicity, Polarization, and Demographics Information	Mar 1, 2025	Benchmarking	Unverified
BigDataBench: A Scalable and Unified Big Data and AI Benchmark Suite	Feb 23, 2018	Benchmarking, CPU	Unverified
BelHouse3D: A Benchmark Dataset for Assessing Occlusion Robustness in 3D Point Cloud Semantic Segmentation	Nov 20, 2024	Benchmarking, Point Cloud Segmentation	Unverified
Feature Encodings for Gradient Boosting with Automunge	Sep 25, 2022	Benchmarking, Binarization	Unverified
Feature Selection and Classification of Hyperspectral Images With Support Vector Machines	Oct 15, 2007	Benchmarking, Classification	Unverified
Bi-Discriminator Class-Conditional Tabular GAN	Nov 12, 2021	Benchmarking	Unverified
Bi-DCSpell: A Bi-directional Detector-Corrector Interactive Framework for Chinese Spelling Check	Jun 4, 2024	Benchmarking, Representation Learning	Unverified
Behavior Structformer: Learning Players Representations with Structured Tokenization	Jun 7, 2024	Benchmarking	Unverified
BEHAVIOR in Habitat 2.0: Simulator-Independent Logical Task Description for Benchmarking Embodied AI Agents	Jun 13, 2022	Benchmarking	Unverified
BIAS: Transparent reporting of biomedical image analysis challenges	Oct 9, 2019	Benchmarking	Unverified
AM-RADIO: Agglomerative Vision Foundation Model Reduce All Domains Into One	Jan 1, 2024	All, Benchmarking	Unverified
A Benchmark for Out of Distribution Detection in Point Cloud 3D Semantic Segmentation	Nov 11, 2022	3D Semantic Segmentation, Autonomous Driving	Unverified
Feature-based Evolutionary Diversity Optimization of Discriminating Instances for Chance-constrained Optimization Problems	Jan 24, 2025	Benchmarking, Diversity	Unverified
Feature selection in linear SVMs via a hard cardinality constraint: a scalable SDP decomposition approach	Apr 15, 2024	Benchmarking, feature selection	Unverified
Bias Mitigation for Machine Learning Classifiers: A Comprehensive Survey	Jul 14, 2022	Benchmarking, BIG-bench Machine Learning	Unverified
Beyond Visual Understanding: Introducing PARROT-360V for Vision Language Model Benchmarking	Nov 20, 2024	Benchmarking, Language Modeling	Unverified
Beyond Uniform Lipschitz Condition in Differentially Private Optimization	Jun 21, 2022	Benchmarking, regression	Unverified
FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding	Mar 19, 2025	Benchmarking, Multiple-choice	Unverified
Beyond the Singular: The Essential Role of Multiple Generations in Effective Benchmark Evaluation and Analysis	Feb 13, 2025	Benchmarking	Unverified
Beyond the Hype: Benchmarking LLM-Evolved Heuristics for Bin Packing	Jan 20, 2025	Benchmarking, Evolutionary Algorithms	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified