SOTAVerified

Benchmarking

Papers

Showing 1601–1610 of 5548 papers

Title | Status | Hype
HumaniBench: A Human-Centric Framework for Large Multimodal Models Evaluation | Code | 0
Knowledge-Driven Slot Constraints for Goal-Oriented Dialogue Systems | Code | 0
Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering | Code | 0
KhabarChin: Automatic Detection of Important News in the Persian Language | Code | 0
KamNet: An Integrated Spatiotemporal Deep Neural Network for Rare Event Search in KamLAND-Zen | Code | 0
An implementation of the "Guess who?" game using CLIP | Code | 0
KArSL: Arabic Sign Language Database | Code | 0
Adjusting Pretrained Backbones for Performativity | Code | 0
Benchmarking community drug response prediction models: datasets, models, tools, and metrics for cross-dataset generalization analysis | Code | 0
An extensible Benchmarking Graph-Mesh dataset for studying Steady-State Incompressible Navier-Stokes Equations | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified