SOTAVerified

Holdout Set

Papers

Showing 21-30 of 35 papers

Title | Status | Hype
Predicting Individual Responses to Vasoactive Medications in Children with Septic Shock | | 0
Using Poisson Binomial GLMs to Reveal Voter Preferences | | 0
STAND: Data-Efficient and Self-Aware Precondition Induction for Interactive Task Learning | | 0
Benchmark Inflation: Revealing LLM Performance Gaps Using Retro-Holdouts | | 0
Adaptive Statistical Learning with Bayesian Differential Privacy | | 0
The Benefits and Risks of Transductive Approaches for AI Fairness | | 0
The DCR Delusion: Measuring the Privacy Risk of Synthetic Data | | 0
Diversified Ensembling: An Experiment in Crowdsourced Machine Learning | | 0
The Generic Holdout: Preventing False-Discoveries in Adaptive Data Science | | 0
Challenges in Bayesian Adaptive Data Analysis | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | BloodAxe, 1st place xView3 prize challenge | Aggregate xView3 Score | 0.62 | | Unverified
2 | selim_sef, 2nd place xView3 prize challenge | Aggregate xView3 Score | 0.6 | | Unverified
3 | Tumen, 3rd place xView3 prize challenge | Aggregate xView3 Score | 0.58 | | Unverified
4 | Skylight at AI2, 4th place xView3 prize challenge | Aggregate xView3 Score | 0.58 | | Unverified
5 | Kohei, 5th place xView3 prize challenge | Aggregate xView3 Score | 0.57 | | Unverified