SOTAVerified

Benchmarking

Papers

Showing 491-500 of 5548 papers

Title | Status | Hype
CLoG: Benchmarking Continual Learning of Image Generation Models | Code | 1
Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT | Code | 1
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs | Code | 1
Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs | Code | 1
Clinical Prompt Learning with Frozen Language Models | Code | 1
CloudEval-YAML: A Practical Benchmark for Cloud Configuration Generation | Code | 1
A Platform for the Biomedical Application of Large Language Models | Code | 1
Decoding the Enigma: Benchmarking Humans and AIs on the Many Facets of Working Memory | Code | 1
Benchmarking Detection Transfer Learning with Vision Transformers | Code | 1
Benchmarking Deep Reinforcement Learning for Navigation in Denied Sensor Environments | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified