SOTAVerified

Benchmarking

Papers

Showing 3831–3840 of 5548 papers

Title | Status | Hype
Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations | Code | 2
SwinCheX: Multi-label classification on chest X-ray images with transformers | Code | 1
Functional Code Building Genetic Programming | | 0
Do We Need Another Explainable AI Method? Toward Unifying Post-hoc XAI Evaluation Methods into an Interactive and Multi-dimensional Benchmark | Code | 1
Benchmarking Bayesian neural networks and evaluation metrics for regression tasks | | 0
FedHPO-B: A Benchmark Suite for Federated Hyperparameter Optimization | | 0
Scaling laws in global corporations as a benchmarking approach to assess environmental performance | | 0
Revisiting Realistic Test-Time Training: Sequential Inference and Adaptation by Anchored Clustering | Code | 1
MorisienMT: A Dataset for Mauritian Creole Machine Translation | | 0
Which models are innately best at uncertainty estimation? | | 0
Page 384 of 555

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified