SOTAVerified

Benchmarking

Papers

Showing 751-760 of 5548 papers

Title | Status | Hype
A Comprehensive Overview of Large Language Models | Code | 1
Benchmarking Simulation-Based Inference | Code | 1
A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models | Code | 1
BeHonest: Benchmarking Honesty in Large Language Models | Code | 1
Benchmarking Large Language Models on CMExam -- A Comprehensive Chinese Medical Exam Dataset | Code | 1
AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling | Code | 1
Benchmarking Skeleton-based Motion Encoder Models for Clinical Applications: Estimating Parkinson's Disease Severity in Walking Sequences | Code | 1
Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs | Code | 1
AirSim Drone Racing Lab | Code | 1
A SWAT-based Reinforcement Learning Framework for Crop Management | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified