SOTAVerified

Benchmarking

Papers

Showing 3851–3860 of 5548 papers

Title | Status | Hype
RISEdb: a Novel Indoor Localization Dataset |  | 0
Risk Aware Benchmarking of Large Language Models |  | 0
Risk-Neutral Generative Networks |  | 0
RL2Grid: Benchmarking Reinforcement Learning in Power Grid Operations |  | 0
RL-Based Method for Benchmarking the Adversarial Resilience and Robustness of Deep Reinforcement Learning Policies |  | 0
RNAmountAlign: efficient software for local, global, semiglobal pairwise and multiple RNA sequence/structure alignment |  | 0
A Comprehensive Guide to CAN IDS Data & Introduction of the ROAD Dataset |  | 0
ROBBIE: Robust Bias Evaluation of Large Generative Language Models |  | 0
OOD-CV: A Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images |  | 0
Robust 2D/3D Vehicle Parsing in CVIS |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 |  | Unverified