
Enhancing Bagging Ensemble Regression with Data Integration for Time Series-Based Diabetes Prediction

2025-06-11

Vuong M. Ngo, Tran Quang Vinh, Patricia Kearney, Mark Roantree


Abstract

Diabetes is a chronic metabolic disease characterized by elevated blood glucose levels, leading to complications like heart disease, kidney failure, and nerve damage. Accurate state-level predictions are vital for effective healthcare planning and targeted interventions, but in many cases, data for necessary analyses are incomplete. This study begins with a data engineering process to integrate diabetes-related datasets from 2011 to 2021 to create a comprehensive feature set. We then introduce an enhanced bagging ensemble regression model (EBMBag+) for time series forecasting to predict diabetes prevalence across U.S. cities. Several baseline models, including SVMReg, BDTree, LSBoost, NN, LSTM, and ERMBag, were evaluated for comparison with our EBMBag+ algorithm. The experimental results demonstrate that EBMBag+ achieved the best performance, with an MAE of 0.41, RMSE of 0.53, MAPE of 4.01, and an R² of 0.9.
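
For readers unfamiliar with the four error measures quoted in the abstract, the following is a minimal sketch of how MAE, RMSE, MAPE, and R² are conventionally computed. It is not the authors' evaluation code: the function name `regression_metrics`, the sample values, and the choice to express MAPE as a percentage are all assumptions made for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (as a percentage) and R^2 for two sequences.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean absolute error: average magnitude of the residuals.
    mae = sum(abs(e) for e in errors) / n

    # Root mean squared error: penalises large residuals more heavily.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Mean absolute percentage error; undefined if any true value is zero.
    # Assumption: reported as a percentage (multiplied by 100).
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    # Coefficient of determination: 1 - residual variance / total variance.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"mae": mae, "rmse": rmse, "mape": mape, "r2": r2}


# Made-up prevalence values (percent of population), not data from the study.
observed = [10.0, 12.0, 8.0, 10.0]
predicted = [10.5, 11.5, 8.5, 9.5]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Lower MAE, RMSE, and MAPE indicate smaller forecast errors, while an R² closer to 1 indicates that the predictions explain more of the variance in the observed values.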
