SOTAVerified

Benchmarking

Papers

Showing 3861-3870 of 5548 papers

Title | Status | Hype
NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models | | 0
Noisy intermediate-scale quantum (NISQ) algorithms | | 0
Trajectory Normalized Gradients for Distributed Optimization | | 0
ActPlan-1K: Benchmarking the Procedural Planning Ability of Visual Language Models in Household Activities | | 0
InferBench: Understanding Deep Learning Inference Serving with an Automatic Benchmarking System | | 0
Non-Contextual Modeling of Sarcasm using a Neural Network Benchmark | | 0
Non-Reference Quality Assessment for Medical Imaging: Application to Synthetic Brain MRIs | | 0
Nonstochastic Bandits with Infinitely Many Experts | | 0
TRAM: Benchmarking Temporal Reasoning for Large Language Models | | 0
NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified