Citation
Candia Jr., Jose and Adonis, Airish Mae and Perlas, Jesica (2024) Optimizing bagged trees in an ensemble classifier for improved prediction of diabetes prevalence in women. Pertanika Journal of Science & Technology (Malaysia), 32 (4). pp. 1753-1764. ISSN 2231-8526
Abstract
This study aims to optimize the performance of bagged trees in an ensemble classifier for predicting diabetes prevalence in women. The study used a dataset of 1,888 women with six features: age, BMI, glucose level, insulin level, blood pressure, and pregnancy status. The dataset was divided into training and testing sets in a 70:30 ratio. A bagged tree ensemble classifier was used for the analysis, with five-fold cross-validation. Using all features during training yielded a training accuracy of 92.3% and a testing accuracy of 99.5%. Applying optimization techniques, namely feature selection, parameter tuning, and limiting the maximum number of splits, improved model performance further. Feature selection improved accuracy by 0.2%, and parameter tuning improved test accuracy by 0.2%. Decreasing the maximum number of splits from 1322 to 800 or 600 produced an optimized model with 0.1% higher validation accuracy. Finally, the optimized bagged tree models were evaluated on accuracy, precision, recall, and F1 score. Model 1, which used a maximum of 800 splits and 50 learners, outperformed Model 2 in recall and F1 score, while Model 2, which used a maximum of 600 splits and 50 learners, had higher precision. The study concludes that optimization techniques can significantly improve the performance of bagged trees in predicting diabetes prevalence in women.
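The workflow summarized in the abstract (70:30 split, five-fold cross-validation, 50 bagged learners, a cap on the number of splits per tree) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: it assumes scikit-learn, uses synthetic data from `make_classification` as a stand-in for the study's dataset, and approximates a "maximum number of splits" of N with `max_leaf_nodes = N + 1`, since a binary tree with N splits has N + 1 leaves. The helper name `evaluate` is hypothetical, and the numbers it prints have no relation to the results reported in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in with the same shape as the study's data
# (1,888 samples, six features); it is NOT the diabetes dataset.
X, y = make_classification(
    n_samples=1888, n_features=6, n_informative=4, n_redundant=0, random_state=0
)

# 70:30 train/test split, stratified on the class label.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)


def evaluate(max_splits, n_learners=50):
    """Hypothetical helper: fit one bagged-tree model and report its metrics."""
    # Assumption: N splits in a binary tree correspond to N + 1 leaf nodes.
    tree = DecisionTreeClassifier(max_leaf_nodes=max_splits + 1, random_state=0)
    model = BaggingClassifier(tree, n_estimators=n_learners, random_state=0)

    # Five-fold cross-validated accuracy on the training portion.
    cv_accuracy = cross_val_score(
        model, X_train, y_train, cv=5, scoring="accuracy"
    ).mean()

    # Refit on the full training portion and score the held-out 30%.
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    return {
        "cv_accuracy": cv_accuracy,
        "test_accuracy": accuracy_score(y_test, predictions),
        "precision": precision_score(y_test, predictions),
        "recall": recall_score(y_test, predictions),
        "f1": f1_score(y_test, predictions),
    }


# Compare the two split caps named in the abstract (800 and 600), 50 learners each.
results = {max_splits: evaluate(max_splits) for max_splits in (800, 600)}
for max_splits, metrics in results.items():
    summary = ", ".join(f"{name}={value:.3f}" for name, value in metrics.items())
    print(f"max_splits={max_splits}: {summary}")
```

Comparing precision, recall, and F1 side by side, rather than accuracy alone, mirrors the abstract's observation that the two configurations trade off differently across metrics.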
Official URL: http://www.pertanika.upm.edu.my/resources/files/Pe...
Additional Metadata
| Item Type: | Article |
|---|---|
| AGROVOC Term: | diabetes |
| AGROVOC Term: | women |
| AGROVOC Term: | body mass index |
| AGROVOC Term: | blood pressure |
| AGROVOC Term: | insulin |
| AGROVOC Term: | tidal prediction |
| AGROVOC Term: | optimization methods |
| AGROVOC Term: | training |
| AGROVOC Term: | machine learning |
| AGROVOC Term: | accuracy |
| Geographical Term: | Philippines |
| Uncontrolled Keywords: | Bagged trees, diabetes prevalence, ensemble classifier, feature selection, model optimization, parameter tuning |
| Depositing User: | Ms. Azariah Hashim |
| Date Deposited: | 23 Apr 2026 01:34 |
| Last Modified: | 23 Apr 2026 01:34 |
| URI: | http://webagris.upm.edu.my/id/eprint/3020 |