SOTAVerified

Benchmarking

Papers

Showing 1591-1600 of 5548 papers

Title | Status | Hype
CLLMate: A Multimodal Benchmark for Weather and Climate Events Forecasting |  | 0
MCUBench: A Benchmark of Tiny Object Detectors on MCUs |  | 0
Data Analysis in the Era of Generative AI |  | 0
Constructing Confidence Intervals for 'the' Generalization Error -- a Comprehensive Benchmark Study | Code | 0
ARLBench: Flexible and Efficient Benchmarking for Hyperparameter Optimization in Reinforcement Learning | Code | 1
The Elephant in the Room: Towards A Reliable Time-Series Anomaly Detection Benchmark | Code | 3
Conformal Prediction: A Theoretical Note and Benchmarking Transductive Node Classification in Graphs | Code | 0
MALPOLON: A Framework for Deep Species Distribution Modeling | Code | 1
Omnibenchmark (alpha) for continuous and open benchmarking in bioinformatics |  | 0
Proof of Thought : Neurosymbolic Program Synthesis allows Robust and Interpretable Reasoning |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 |  | Unverified