SOTAVerified

Benchmarking

Papers

Showing 4491–4500 of 5548 papers

Title | Status | Hype
UBENCH: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions | Code | 0
BED: Bi-Encoder-Based Detectors for Out-of-Distribution Detection | Code | 0
Replicable Benchmarking of Neural Machine Translation (NMT) on Low-Resource Local Languages in Indonesia | Code | 0
RUHSNet: 3D Object Detection Using Lidar Data in Real Time | Code | 0
Replication Study and Benchmarking of Real-Time Object Detection Models | Code | 0
IPC: A Benchmark Data Set for Learning with Graph-Structured Data | Code | 0
RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content | Code | 0
Building a Large Scale Dataset for Image Emotion Recognition: The Fine Print and The Benchmark | Code | 0
IoT Data Trust Evaluation via Machine Learning | Code | 0
Representation Learning of Limit Order Book: A Comprehensive Study and Benchmarking | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified