SOTAVerified

Benchmarking

Papers

Showing 1731-1740 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| KhabarChin: Automatic Detection of Important News in the Persian Language | Code | 0 |
| Can LLMs Grasp Implicit Cultural Values? Benchmarking LLMs' Metacognitive Cultural Intelligence with CQ-Bench | Code | 0 |
| InViG: Benchmarking Interactive Visual Grounding with 500K Human-Robot Interactions | Code | 0 |
| Introducing SLAMBench, a performance and accuracy benchmarking methodology for SLAM | Code | 0 |
| Inverse Contextual Bandits: Learning How Behavior Evolves over Time | Code | 0 |
| Investigating the Impact of Hard Samples on Accuracy Reveals In-class Data Imbalance | Code | 0 |
| INTERSPEECH 2009 Emotion Challenge Revisited: Benchmarking 15 Years of Progress in Speech Emotion Recognition | Code | 0 |
| CLEAVE: Scalable and Edge-native Benchmarking of Networked Control Systems | Code | 0 |
| Can geometric combinatorics improve RNA branching predictions? | Code | 0 |
| Air Learning: A Deep Reinforcement Learning Gym for Autonomous Aerial Robot Visual Navigation | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |