SOTAVerified

Holdout Set

Papers

Showing 1-10 of 35 papers

Title | Status | Hype
TotalVibeSegmentator: Full Body MRI Segmentation for the NAKO and UK Biobank | Code | 2
Distribution-Free, Risk-Controlling Prediction Sets | Code | 2
Liver Tumor Screening and Diagnosis in CT with Pixel-Lesion-Patient Network | Code | 1
Understanding Transformers via N-gram Statistics | Code | 1
Template-Based Automatic Search of Compact Semantic Segmentation Architectures | Code | 1
xView3-SAR: Detecting Dark Fishing Activity Using Synthetic Aperture Radar Imagery | Code | 1
Challenges in Bayesian Adaptive Data Analysis | - | 0
Benchmark Inflation: Revealing LLM Performance Gaps Using Retro-Holdouts | - | 0
A Meta-Analysis of Overfitting in Machine Learning | - | 0
Diversified Ensembling: An Experiment in Crowdsourced Machine Learning | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | BloodAxe, 1st place xView3 prize challenge | Aggregate xView3 Score | 0.62 | - | Unverified
2 | selim_sef, 2nd place xView3 prize challenge | Aggregate xView3 Score | 0.6 | - | Unverified
3 | Tumen, 3rd place xView3 prize challenge | Aggregate xView3 Score | 0.58 | - | Unverified
4 | Skylight at AI2, 4th place xView3 prize challenge | Aggregate xView3 Score | 0.58 | - | Unverified
5 | Kohei, 5th place xView3 prize challenge | Aggregate xView3 Score | 0.57 | - | Unverified