An automated machine learning framework to optimize radiomics model construction validated on twelve clinical applications
Martijn P. A. Starmans, Sebastian R. van der Voort, Thomas Phil, Milea J. M. Timbergen, Melissa Vos, Guillaume A. Padmos, Wouter Kessels, David Hanff, Dirk J. Grunhagen, Cornelis Verhoef, Stefan Sleijfer, Martin J. van den Bent, Marion Smits, Roy S. Dwarkasing, Christopher J. Els, Federico Fiduzi, Geert J. L. H. van Leenders, Anela Blazevic, Johannes Hofland, Tessa Brabander, Renza A. H. van Gils, Gaston J. H. Franssen, Richard A. Feelders, Wouter W. de Herder, Florian E. Buisman, Francois E. J. A. Willemssen, Bas Groot Koerkamp, Lindsay Angus, Astrid A. M. van der Veldt, Ana Rajicic, Arlette E. Odink, Mitchell Deen, Jose M. Castillo T., Jifke Veenland, Ivo Schoots, Michel Renckens, Michail Doukas, Rob A. de Man, Jan N. M. IJzermans, Razvan L. Miclea, Peter B. Vermeulen, Esther E. Bron, Maarten G. Thomeer, Jacob J. Visser, Wiro J. Niessen, Stefan Klein
Code
- github.com/MStarmans91/WORC (official)
- github.com/mstarmans91/worcdatabase (official)
Abstract
Predicting clinical outcomes from medical images using quantitative features ("radiomics") requires many method design choices. Currently, in new clinical applications, finding the optimal radiomics method out of the wide range of methods relies on a manual, heuristic trial-and-error process. We introduce a novel automated framework that optimizes radiomics workflow construction per application by standardizing the radiomics workflow in modular components, including a large collection of algorithms for each component, and formulating a combined algorithm selection and hyperparameter optimization problem. To solve it, we employ automated machine learning through two strategies (random search and Bayesian optimization) and three ensembling approaches. Results show that a medium-sized random search and straightforward ensembling perform similarly to more advanced methods while being more efficient. Validated across twelve clinical applications, our approach outperforms both a radiomics baseline and human experts. In conclusion, our framework improves and streamlines radiomics research by fully automatically optimizing radiomics workflow construction. To facilitate reproducibility, we publicly release six datasets, software of the method, and code to reproduce this study.
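
To make the idea of a combined algorithm selection and hyperparameter optimization problem solved by random search more concrete, the following is a minimal sketch and not the framework's actual implementation. The search space, the component and algorithm names, the `toy_evaluate` scoring function, and the choice of keeping the top-N workflows as an ensemble are all illustrative assumptions; a real application would replace `toy_evaluate` with cross-validated performance of the trained workflow.

```python
import random

# Hypothetical search space: every workflow component offers several
# algorithms, and each algorithm has its own hyperparameter ranges.
SEARCH_SPACE = {
    "feature_selection": {
        "none": {},
        "variance_threshold": {"threshold": (0.0, 0.1)},
    },
    "classifier": {
        "svm": {"C": (0.1, 10.0)},
        "random_forest": {"max_features_fraction": (0.1, 1.0)},
    },
}


def sample_configuration(rng):
    """Draw one workflow: pick an algorithm per component, then its hyperparameters."""
    config = {}
    for component, algorithms in SEARCH_SPACE.items():
        name = rng.choice(sorted(algorithms))
        params = {p: rng.uniform(lo, hi) for p, (lo, hi) in algorithms[name].items()}
        config[component] = (name, params)
    return config


def random_search(evaluate, n_iter, seed=0):
    """Evaluate n_iter random workflows and return (score, config) pairs, best first."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_iter):
        config = sample_configuration(rng)
        results.append((evaluate(config), config))
    results.sort(key=lambda pair: pair[0], reverse=True)
    return results


def top_n_ensemble(results, n):
    """Assumed simple ensembling: keep the n best-scoring workflows."""
    return [config for _, config in results[:n]]


def toy_evaluate(config):
    """Stand-in score (assumption); real use would train and validate the workflow."""
    score = 0.6 if config["classifier"][0] == "svm" else 0.7
    if config["feature_selection"][0] != "none":
        score += 0.05
    return score


if __name__ == "__main__":
    ranked = random_search(toy_evaluate, n_iter=50, seed=1)
    for workflow in top_n_ensemble(ranked, n=3):
        print(workflow)
```

In this sketch the algorithm choice and its hyperparameters are sampled jointly, which is what distinguishes the combined problem from tuning a single fixed algorithm. The predictions of the retained workflows could then be combined, for example by averaging.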