SOTAVerified

Benchmarking

Papers

Showing 1901–1910 of 5548 papers

Title | Status | Hype
Creating and Leveraging a Synthetic Dataset of Cloud Optical Thickness Measures for Cloud Detection in MSI | Code | 0
CREPO: An Open Repository to Benchmark Credal Network Algorithms | Code | 0
AdamZ: An Enhanced Optimisation Method for Neural Network Training | Code | 0
Improvements & Evaluations on the MLCommons CloudMask Benchmark | Code | 0
Bias Analysis and Mitigation in the Evaluation of Authorship Verification | Code | 0
BED: Bi-Encoder-Based Detectors for Out-of-Distribution Detection | Code | 0
Critical review of conformational B-cell epitope prediction methods | Code | 0
Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions | Code | 0
BEARD: Benchmarking the Adversarial Robustness for Dataset Distillation | Code | 0
AMQA: An Adversarial Dataset for Benchmarking Bias of LLMs in Medicine and Healthcare | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified