SOTAVerified

Benchmarking

Papers

Showing 2491-2500 of 5548 papers

| Title | Status | Hype |
| On the Evaluation of Engineering Artificial General Intelligence | | 0 |
| Genicious: Contextual Few-shot Prompting for Insights Discovery | | 0 |
| GenTel-Safe: A Unified Benchmark and Shielding Framework for Defending Against Prompt Injection Attacks | | 0 |
| Benchmarking Scientific Image Forgery Detectors | | 0 |
| Benchmarking Scene Text Recognition in Devanagari, Telugu and Malayalam | | 0 |
| Benchmarking Sample Selection Strategies for Batch Reinforcement Learning | | 0 |
| A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking | | 0 |
| Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models | | 0 |
| GenzIQA: Generalized Image Quality Assessment using Prompt-Guided Latent Diffusion Models | | 0 |
| GeoGebra Tools with Proof Capabilities | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |