SOTAVerified

Benchmarking

Papers

Showing 2691–2700 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Apples to Apples: Learning Semantics of Common Entities Through a Novel Comprehension Task | | 0 |
| Advocating Character Error Rate for Multilingual ASR Evaluation | | 0 |
| Decentralized Federated Learning on the Edge over Wireless Mesh Networks | | 0 |
| Benchmarking for Public Health Surveillance tasks on Social Media with a Domain-Specific Pretrained Language Model | | 0 |
| DECASTE: Unveiling Caste Stereotypes in Large Language Models through Multi-Dimensional Bias Analysis | | 0 |
| Benchmarking for Metaheuristic Black-Box Optimization: Perspectives and Open Challenges | | 0 |
| DeAR: Debiasing Vision-Language Models with Additive Residuals | | 0 |
| DDR-ID: Dual Deep Reconstruction Networks Based Image Decomposition for Anomaly Detection | | 0 |
| Benchmarking for Bayesian Reinforcement Learning | | 0 |
| DBsurf: A Discrepancy Based Method for Discrete Stochastic Gradient Estimation | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |