SOTAVerified

Benchmarking

Papers

Showing 3276–3300 of 5548 papers

Title | Status | Hype
Benchmarking Adversarial Robustness of Image Shadow Removal with Shadow-adaptive Attacks | - | 0
Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study | Code | 0
SpokeN-100: A Cross-Lingual Benchmarking Dataset for The Classification of Spoken Numbers in Different Languages | Code | 0
Attention-based Class-Conditioned Alignment for Multi-Source Domain Adaptation of Object Detectors | Code | 0
Semi-Supervised Learning for Anomaly Traffic Detection via Bidirectional Normalizing Flows | Code | 0
An Approach to Evaluate Modeling Adequacy for Small-Signal Stability Analysis of IBR-related SSOs in Multimachine Systems | - | 0
A tutorial on multi-view autoencoders using the multi-view-AE library | - | 0
IndicSTR12: A Dataset for Indic Scene Text Recognition | - | 0
(N,K)-Puzzle: A Cost-Efficient Testbed for Benchmarking Reinforcement Learning Algorithms in Generative Language Model | - | 0
Class Imbalance in Object Detection: An Experimental Diagnosis and Study of Mitigation Strategies | Code | 0
A Holistic Framework Towards Vision-based Traffic Signal Control with Microscopic Simulation | - | 0
Multi-GPU-Enabled Hybrid Quantum-Classical Workflow in Quantum-HPC Middleware: Applications in Quantum Simulations | Code | 0
Exploring the Adversarial Frontier: Quantifying Robustness via Adversarial Hypervolume | - | 0
Synth4bench: a framework for generating synthetic genomics data for the evaluation of tumor-only somatic variant calling algorithms | Code | 0
Benchmarking Large Language Models for Molecule Prediction Tasks | Code | 0
Improvements & Evaluations on the MLCommons CloudMask Benchmark | Code | 0
NLPre: a revised approach towards language-centric benchmarking of Natural Language Preprocessing systems | - | 0
Benchmarking News Recommendation in the Era of Green AI | - | 0
Dissecting Sample Hardness: A Fine-Grained Analysis of Hardness Characterization Methods for Data-Centric AI | Code | 0
Comparison Performance of Spectrogram and Scalogram as Input of Acoustic Recognition Task | Code | 0
BAIT: Benchmarking (Embedding) Architectures for Interactive Theorem-Proving | - | 0
Three Revisits to Node-Level Graph Anomaly Detection: Outliers, Message Passing and Hyperbolic Neural Networks | Code | 0
Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem | Code | 0
A Density-Guided Temporal Attention Transformer for Indiscernible Object Counting in Underwater Video | - | 0
Benchmarking the Text-to-SQL Capability of Large Language Models: A Comprehensive Evaluation | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified