Holdout Set

Papers

Showing 11–20 of 35 papers

Title	Date	Tasks	Status	Hype
Who's the (Multi-)Fairest of Them All: Rethinking Interpolation-Based Data Augmentation Through the Lens of Multicalibration	Dec 13, 2024	All, Data Augmentation	Code Available	0
Benchmark Inflation: Revealing LLM Performance Gaps Using Retro-Holdouts	Oct 11, 2024	Holdout Set, Misconceptions	Unverified	0
STAND: Data-Efficient and Self-Aware Precondition Induction for Interactive Task Learning	Sep 11, 2024	Active Learning, Holdout Set	Unverified	0
Machine Learning for Quantifier Selection in cvc5	Aug 26, 2024	Holdout Set	Unverified	0
Comprehensive dataset of user-submitted articles with ideological and extreme bias from Reddit	Aug 12, 2024	Articles, Holdout Set	Code Available	0
The Benefits and Risks of Transductive Approaches for AI Fairness	Jun 17, 2024	Fairness, Holdout Set	Unverified	0
Diversified Ensembling: An Experiment in Crowdsourced Machine Learning	Feb 16, 2024	Fairness, Holdout Set	Unverified	0
Testing for Overfitting	May 9, 2023	Holdout Set, valid	Code Available	0
Holdouts set for safe predictive model updating	Feb 13, 2022	Holdout Set, model	Code Available	0
Persistent Homology Captures the Generalization of Neural Networks Without A Validation Set	May 31, 2021	Holdout Set	Code Available	0


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	BloodAxe, 1st place xView3 prize challenge	Aggregate xView3 Score	0.62	—	Unverified
2	selim_sef, 2nd place xView3 prize challenge	Aggregate xView3 Score	0.6	—	Unverified
3	Tumen, 3rd place xView3 prize challenge	Aggregate xView3 Score	0.58	—	Unverified
4	Skylight at AI2, 4th place xView3 prize challenge	Aggregate xView3 Score	0.58	—	Unverified
5	Kohei, 5th place xView3 prize challenge	Aggregate xView3 Score	0.57	—	Unverified