SOTAVerified

Benchmarking

Papers

Showing 2726–2750 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Data Analysis in the Era of Generative AI | | 0 |
| Benchmarking Feature Extractors for Reinforcement Learning-Based Semiconductor Defect Localization | | 0 |
| A Parallel Corpus for Evaluating Machine Translation between Arabic and European Languages | | 0 |
| Accelerating the discovery of steady-states of planetary interior dynamics with machine learning | | 0 |
| DASB -- Discrete Audio and Speech Benchmark | | 0 |
| DarkBench: Benchmarking Dark Patterns in Large Language Models | | 0 |
| Danish Airs and Grounds: A Dataset for Aerial-to-Street-Level Place Recognition and Localization | | 0 |
| AnyTOD: A Programmable Task-Oriented Dialog System | | 0 |
| DailyQA: A Benchmark to Evaluate Web Retrieval Augmented LLMs Based on Capturing Real-World Changes | | 0 |
| DACSA: A large-scale Dataset for Automatic summarization of Catalan and Spanish newspaper Articles | | 0 |
| Benchmarking Expressive Japanese Character Text-to-Speech with VITS and Style-BERT-VITS2 | | 0 |
| DACOS-A Manually Annotated Dataset of Code Smells | | 0 |
| Benchmarking Explanatory Models for Inertia Forecasting using Public Data of the Nordic Area | | 0 |
| Anytime Bi-Objective Optimization with a Hybrid Multi-Objective CMA-ES (HMO-CMA-ES) | | 0 |
| Adversarially Training for Audio Classifiers | | 0 |
| CzechLynx: A Dataset for Individual Identification and Pose Estimation of the Eurasian Lynx | | 0 |
| Benchmarking Evolutionary Community Detection Algorithms in Dynamic Networks | | 0 |
| CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset | | 0 |
| Benchmarking Evolutionary Algorithms For Single Objective Real-valued Constrained Optimization - A Critical Review | | 0 |
| Anytime Behavior of Inexact TSP Solvers and Perspectives for Automated Algorithm Selection | | 0 |
| Benchmarking Evaluation Metrics for Code-Switching Automatic Speech Recognition | | 0 |
| Benchmarking Ethical and Safety Risks of Healthcare LLMs in China-Toward Systemic Governance under Healthy China 2030 | | 0 |
| Labelling Vertebrae with 2D Reformations of Multidetector CT Images: An Adversarial Approach for Incorporating Prior Knowledge of Spine Anatomy | | 0 |
| Accelerating IoV Intrusion Detection: Benchmarking GPU-Accelerated vs CPU-Based ML Libraries | | 0 |
| GradEscape: A Gradient-Based Evader Against AI-Generated Text Detectors | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |