SOTAVerified

Benchmarking

Papers

Showing 3251-3260 of 5548 papers

Title | Status | Hype
Comparing Hyper-optimized Machine Learning Models for Predicting Efficiency Degradation in Organic Solar Cells | | 0
IndiBias: A Benchmark Dataset to Measure Social Biases in Language Models for Indian Context | Code | 0
Are Large Language Models Good at Utility Judgments? | Code | 0
Benchmarking Image Transformers for Prostate Cancer Detection from Ultrasound Data | | 0
GPTs and Language Barrier: A Cross-Lingual Legal QA Examination | | 0
Benchmarking Video Frame Interpolation | | 0
NSINA: A News Corpus for Sinhala | Code | 0
DISL: Fueling Research with A Large Dataset of Solidity Smart Contracts | | 0
On the Fragility of Active Learners for Text Classification | Code | 0
TrustSQL: Benchmarking Text-to-SQL Reliability with Penalty-Based Scoring | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified