Benchmarking

Papers

Title	Date	Tasks	Status
Benchmarking Adversarial Robustness of Image Shadow Removal with Shadow-adaptive Attacks	Mar 15, 2024	Adversarial Attack, Adversarial Robustness	Unverified
Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study	Mar 15, 2024	Benchmarking	Code Available
SpokeN-100: A Cross-Lingual Benchmarking Dataset for The Classification of Spoken Numbers in Different Languages	Mar 14, 2024	Benchmarking, Dimensionality Reduction	Code Available
Attention-based Class-Conditioned Alignment for Multi-Source Domain Adaptation of Object Detectors	Mar 14, 2024	Benchmarking, Domain Adaptation	Code Available
Semi-Supervised Learning for Anomaly Traffic Detection via Bidirectional Normalizing Flows	Mar 13, 2024	Anomaly Detection, Benchmarking	Code Available
An Approach to Evaluate Modeling Adequacy for Small-Signal Stability Analysis of IBR-related SSOs in Multimachine Systems	Mar 12, 2024	Benchmarking	Unverified
A tutorial on multi-view autoencoders using the multi-view-AE library	Mar 12, 2024	Benchmarking	Unverified
IndicSTR12: A Dataset for Indic Scene Text Recognition	Mar 12, 2024	Benchmarking, Scene Text Recognition	Unverified
(N,K)-Puzzle: A Cost-Efficient Testbed for Benchmarking Reinforcement Learning Algorithms in Generative Language Model	Mar 11, 2024	Benchmarking, Language Modeling	Unverified
Class Imbalance in Object Detection: An Experimental Diagnosis and Study of Mitigation Strategies	Mar 11, 2024	Benchmarking, Data Augmentation	Code Available
A Holistic Framework Towards Vision-based Traffic Signal Control with Microscopic Simulation	Mar 11, 2024	Benchmarking, Traffic Signal Control	Unverified
Multi-GPU-Enabled Hybrid Quantum-Classical Workflow in Quantum-HPC Middleware: Applications in Quantum Simulations	Mar 9, 2024	Benchmarking, CPU	Code Available
Exploring the Adversarial Frontier: Quantifying Robustness via Adversarial Hypervolume	Mar 8, 2024	Adversarial Robustness, Benchmarking	Unverified
Synth4bench: a framework for generating synthetic genomics data for the evaluation of tumor-only somatic variant calling algorithms	Mar 8, 2024	Benchmarking, Synthetic Data Generation	Code Available
Benchmarking Large Language Models for Molecule Prediction Tasks	Mar 8, 2024	Benchmarking, Prediction	Code Available
Improvements & Evaluations on the MLCommons CloudMask Benchmark	Mar 7, 2024	Benchmarking	Code Available
NLPre: a revised approach towards language-centric benchmarking of Natural Language Preprocessing systems	Mar 7, 2024	Benchmarking, Dependency Parsing	Unverified
Benchmarking News Recommendation in the Era of Green AI	Mar 7, 2024	Benchmarking, GPU	Unverified
Dissecting Sample Hardness: A Fine-Grained Analysis of Hardness Characterization Methods for Data-Centric AI	Mar 7, 2024	Benchmarking	Code Available
Comparison Performance of Spectrogram and Scalogram as Input of Acoustic Recognition Task	Mar 6, 2024	Benchmarking	Code Available
BAIT: Benchmarking (Embedding) Architectures for Interactive Theorem-Proving	Mar 6, 2024	Automated Theorem Proving, Benchmarking	Unverified
Three Revisits to Node-Level Graph Anomaly Detection: Outliers, Message Passing and Hyperbolic Neural Networks	Mar 6, 2024	Anomaly Detection, Benchmarking	Code Available
Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem	Mar 6, 2024	Benchmarking, Hallucination	Code Available
A Density-Guided Temporal Attention Transformer for Indiscernible Object Counting in Underwater Video	Mar 6, 2024	Benchmarking, Crowd Counting	Unverified
Benchmarking the Text-to-SQL Capability of Large Language Models: A Comprehensive Evaluation	Mar 5, 2024	Benchmarking, In-Context Learning	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified