SOTAVerified

Benchmarking

Papers

Showing 1711-1720 of 5548 papers

Title | Status | Hype
Capsule Vision 2024 Challenge: Multi-Class Abnormality Classification for Video Capsule Endoscopy | Code | 0
Chumor 2.0: Towards Benchmarking Chinese Humor Understanding | Code | 0
Investigating the Impact of Hard Samples on Accuracy Reveals In-class Data Imbalance | Code | 0
An Analyst-Inspector Framework for Evaluating Reproducibility of LLMs in Data Science | Code | 0
InViG: Benchmarking Interactive Visual Grounding with 500K Human-Robot Interactions | Code | 0
Learn How to Query from Unlabeled Data Streams in Federated Learning | Code | 0
Introducing SLAMBench, a performance and accuracy benchmarking methodology for SLAM | Code | 0
Can Tree Based Approaches Surpass Deep Learning in Anomaly Detection? A Benchmarking Study | Code | 0
Inverse Contextual Bandits: Learning How Behavior Evolves over Time | Code | 0
Can LLMs Grasp Implicit Cultural Values? Benchmarking LLMs' Metacognitive Cultural Intelligence with CQ-Bench | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified