SOTAVerified

Holdout Set

Papers

Showing 1–10 of 35 papers

Title | Status | Hype
Outcome-based Reinforcement Learning to Predict the Future |  | 0
The DCR Delusion: Measuring the Privacy Risk of Synthetic Data |  | 0
Parametric Scaling Law of Tuning Bias in Conformal Prediction | Code | 0
Navigating Towards Fairness with Data Selection |  | 0
Who's the (Multi-)Fairest of Them All: Rethinking Interpolation-Based Data Augmentation Through the Lens of Multicalibration | Code | 0
Benchmark Inflation: Revealing LLM Performance Gaps Using Retro-Holdouts |  | 0
STAND: Data-Efficient and Self-Aware Precondition Induction for Interactive Task Learning |  | 0
Machine Learning for Quantifier Selection in cvc5 |  | 0
Comprehensive dataset of user-submitted articles with ideological and extreme bias from Reddit | Code | 0
Understanding Transformers via N-gram Statistics | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | BloodAxe, 1st place xView3 prize challenge | Aggregate xView3 Score | 0.62 |  | Unverified
2 | selim_sef, 2nd place xView3 prize challenge | Aggregate xView3 Score | 0.6 |  | Unverified
3 | Tumen, 3rd place xView3 prize challenge | Aggregate xView3 Score | 0.58 |  | Unverified
4 | Skylight at AI2, 4th place xView3 prize challenge | Aggregate xView3 Score | 0.58 |  | Unverified
5 | Kohei, 5th place xView3 prize challenge | Aggregate xView3 Score | 0.57 |  | Unverified