SOTAVerified

Holdout Set

Papers

Showing 1–25 of 35 papers

| Title | Status | Hype |
|---|---|---|
| Distribution-Free, Risk-Controlling Prediction Sets | Code | 2 |
| TotalVibeSegmentator: Full Body MRI Segmentation for the NAKO and UK Biobank | Code | 2 |
| Liver Tumor Screening and Diagnosis in CT with Pixel-Lesion-Patient Network | Code | 1 |
| Understanding Transformers via N-gram Statistics | Code | 1 |
| xView3-SAR: Detecting Dark Fishing Activity Using Synthetic Aperture Radar Imagery | Code | 1 |
| Template-Based Automatic Search of Compact Semantic Segmentation Architectures | Code | 1 |
| Generalization of Reinforcement Learners with Working and Episodic Memory | Code | 0 |
| A shared latent space matrix factorisation method for recommending new trial evidence for systematic review updates | Code | 0 |
| Comprehensive dataset of user-submitted articles with ideological and extreme bias from Reddit | Code | 0 |
| Generalization in Adaptive Data Analysis and Holdout Reuse | Code | 0 |
| Holdouts set for safe predictive model updating | Code | 0 |
| Parametric Scaling Law of Tuning Bias in Conformal Prediction | Code | 0 |
| Persistent Homology Captures the Generalization of Neural Networks Without A Validation Set | Code | 0 |
| RATT: Leveraging Unlabeled Data to Guarantee Generalization | Code | 0 |
| Testing for Overfitting | Code | 0 |
| Uncovering convolutional neural network decisions for diagnosing multiple sclerosis on conventional MRI using layer-wise relevance propagation | Code | 0 |
| Who's the (Multi-)Fairest of Them All: Rethinking Interpolation-Based Data Augmentation Through the Lens of Multicalibration | Code | 0 |
| Outcome-based Reinforcement Learning to Predict the Future | | 0 |
| Who Wins the Game of Thrones? How Sentiments Improve the Prediction of Candidate Choice | | 0 |
| A Meta-Analysis of Overfitting in Machine Learning | | 0 |
| Predicting Individual Responses to Vasoactive Medications in Children with Septic Shock | | 0 |
| Using Poisson Binomial GLMs to Reveal Voter Preferences | | 0 |
| STAND: Data-Efficient and Self-Aware Precondition Induction for Interactive Task Learning | | 0 |
| Benchmark Inflation: Revealing LLM Performance Gaps Using Retro-Holdouts | | 0 |
| Adaptive Statistical Learning with Bayesian Differential Privacy | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | BloodAxe, 1st place xView3 prize challenge | Aggregate xView3 Score | 0.62 | | Unverified |
| 2 | selim_sef, 2nd place xView3 prize challenge | Aggregate xView3 Score | 0.6 | | Unverified |
| 3 | Tumen, 3rd place xView3 prize challenge | Aggregate xView3 Score | 0.58 | | Unverified |
| 4 | Skylight at AI2, 4th place xView3 prize challenge | Aggregate xView3 Score | 0.58 | | Unverified |
| 5 | Kohei, 5th place xView3 prize challenge | Aggregate xView3 Score | 0.57 | | Unverified |