
AutoLearn - Automated Feature Generation and Selection

2017-11-17 · IEEE International Conference on Data Mining (ICDM) 2017 · Code available

Ambika Kaul, Saket Maheshwary, Vikram Pudi


Abstract

In recent years, the importance of feature engineering has been confirmed by the exceptional performance of deep learning techniques, which automate this task for some applications. For others, feature engineering requires substantial manual effort in designing and selecting features, and is often tedious and non-scalable. We present AutoLearn, a regression-based feature learning algorithm. Being data-driven, it requires no domain knowledge and is hence generic. The new feature representation is learnt by mining pairwise feature associations, identifying the linear or non-linear relationship between each pair, applying regression, and selecting those relationships that are stable and improve prediction performance. Our experimental evaluation on 18 UC Irvine and 7 gene expression datasets, across different domains, provides evidence that the features learnt through our model can improve the overall prediction accuracy by 13.28% compared to the original feature space, and by 5.87% over other top-performing models, across 8 different classifiers, without using any domain knowledge.
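
The abstract only outlines the pipeline, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes Pearson correlation as the association measure, degree-1 and degree-2 polynomial fits as the linear and non-linear regressors, and a bootstrap correlation filter as a stand-in for the stability-based selection; none of these choices are stated in the abstract. The function names `generate_pairwise_features` and `stable_feature_mask` and all thresholds are hypothetical.

```python
import numpy as np


def generate_pairwise_features(X, linear_cutoff=0.7, min_assoc=0.1):
    """Regress feature j on feature i for every associated ordered pair and
    return the predictions and residuals as new columns."""
    n_samples, n_features = X.shape
    new_columns = []
    for i in range(n_features):
        for j in range(n_features):
            if i == j:
                continue
            xi, xj = X[:, i], X[:, j]
            if xi.std() == 0 or xj.std() == 0:
                continue  # constant columns carry no association
            assoc = abs(np.corrcoef(xi, xj)[0, 1])
            if assoc < min_assoc:
                continue  # treat the pair as unrelated
            # Assumption: strong Pearson correlation -> linear fit,
            # weaker association -> quadratic fit as the non-linear stand-in.
            degree = 1 if assoc >= linear_cutoff else 2
            coeffs = np.polyfit(xi, xj, deg=degree)
            predicted = np.polyval(coeffs, xi)
            new_columns.append(predicted)       # what x_i explains of x_j
            new_columns.append(xj - predicted)  # what it does not explain
    if not new_columns:
        return np.empty((n_samples, 0))
    return np.column_stack(new_columns)


def stable_feature_mask(F, y, n_rounds=50, cutoff=0.2, min_fraction=0.8, seed=0):
    """Keep features whose |correlation| with y reaches `cutoff` in at least
    `min_fraction` of bootstrap resamples (a simplified stability check)."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = F.shape
    hits = np.zeros(n_features)
    for _ in range(n_rounds):
        idx = rng.integers(0, n_samples, size=n_samples)
        Fb, yb = F[idx], y[idx]
        for k in range(n_features):
            if Fb[:, k].std() == 0 or yb.std() == 0:
                continue  # correlation is undefined for constant vectors
            if abs(np.corrcoef(Fb[:, k], yb)[0, 1]) >= cutoff:
                hits[k] += 1
    return hits / n_rounds >= min_fraction


# Usage on synthetic data: two perfectly linearly related features.
rng = np.random.default_rng(1)
x0 = rng.normal(size=200)
X = np.column_stack([x0, 2.0 * x0 + 1.0])
y = (x0 > 0).astype(float)

F = generate_pairwise_features(X)
mask = stable_feature_mask(F, y)
X_augmented = np.hstack([X, F[:, mask]])
```

Pearson correlation misses symmetric non-linear dependencies such as a parabola centred on the mean, so a dependence measure designed for non-linear relationships would be a better fit for the non-linear case than this sketch suggests.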
