Benchmarking of Ensembles and Meta‐Ensembles in the Multiclass Classification of Obesity Risk: Predictive Performance, Calibration and Interpretability

Obesity represents a significant public health concern, owing to its high prevalence and its association with cardiometabolic comorbidities. This study compared a set of ensemble learning models, including canonical ensembles, meta-ensembles, and baselines for tabular data, in a multiclass obesity status prediction task using the "Obesity Dataset" (n = 1,610; 14 predictors; 4 classes). To ensure methodological rigor, a pipeline was implemented using ColumnTransformer, standardization, one-hot encoding, and rebalancing via SMOTENC applied exclusively to the training folds, thereby preventing data leakage. Performance was evaluated with accuracy, F1-score, precision, recall, Cohen's kappa, and the Matthews correlation coefficient, supplemented by a computational cost analysis. Inferential comparisons were performed using the Friedman test with the Nemenyi post-hoc test (α = 0.05). The results indicated high overall performance (≈89–90.5% accuracy) and a leading group of statistically indistinguishable models (Group A): LightGBM (90.49% ± 1.38), Random Forest (90.16% ± 1.70), Stacking (90.21% ± 1.70), and Extra Trees (89.69% ± 1.55). XGBoost, Bagging, and CatBoost showed competitive performance with partial statistical overlap, whereas Gradient Boosting and AdaBoost performed significantly worse. In summary, no single dominant model emerged; rather, a set of statistically equivalent solutions was identified, and model selection should balance accuracy, computational cost, and interpretability. Random Forest and Extra Trees are efficient options, while Stacking is a valid alternative when maximizing predictive performance is the priority.
