SOTAVerified

Sparse-Group Boosting with Balanced Selection Frequencies: A Simulation-Based Approach and R Implementation

2024-05-31Code Available0· sign in to hype

Fabian Obster, Christian Heumann

Code Available — Be the first to reproduce this paper.

Reproduce

Code

Abstract

This paper introduces a novel framework for reducing variable selection bias by balancing selection frequencies of base-learners in boosting and introduces the sgboost package in R, which implements this framework combined with sparse-group boosting. The group bias reduction algorithm employs a simulation-based approach to iteratively adjust the degrees of freedom for both individual and group base-learners, ensuring balanced selection probabilities and mitigating the tendency to over-select more complex groups. The efficacy of the group balancing algorithm is demonstrated through simulations. Sparse-group boosting offers a flexible approach for both group and individual variable selection, reducing overfitting and enhancing model interpretability for modeling high-dimensional data with natural groupings in covariates. The package uses regularization techniques based on the degrees of freedom of individual and group base-learners. Through comparisons with existing methods and demonstration of its unique functionalities, this paper provides a practical guide on utilizing sparse-group boosting in R, accompanied by code examples to facilitate its application in various research domains.

Tasks

Reproductions