SOTAVerified

Benchmarking

Papers

Showing 1371-1380 of 5548 papers

Title | Status | Hype
Comprehensive benchmarking of large language models for RNA secondary structure prediction | Code | 1
Marine Snow Removal Benchmarking Dataset | Code | 1
AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models | Code | 1
ConsumerBench: Benchmarking Generative AI Applications on End-User Devices | Code | 1
CompanyKG: A Large-Scale Heterogeneous Graph for Company Similarity Quantification | Code | 1
Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning | Code | 1
Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks | Code | 1
Enhancing clinical decision support with physiological waveforms -- a multimodal benchmark in emergency care | Code | 1
Benchmarking the Robustness of Spatial-Temporal Models Against Corruptions | Code | 1
scSSL-Bench: Benchmarking Self-Supervised Learning for Single-Cell Data | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified