Interpretable Photoplethysmography Feature Engineering for Multi-Class Blood Pressure Staging
Hypertension is a leading global health risk whose effective management requires accurate, continuous monitoring. Although photoplethysmography (PPG) is a promising non-invasive modality for cuffless blood pressure (BP) assessment, many existing approaches, especially raw-signal deep learning, are vulnerable to data leakage, overfit on small datasets, offer limited interpretability, and perform poorly on minority BP stages. To address these limitations, we propose a robust, physiologically grounded framework for multi-class BP stage classification based on interpretable PPG features. Our approach centers on a multi-domain feature engineering pipeline that extracts 124 features across six domains: demographic, morphological, functional decomposition, spectral, nonlinear dynamics, and clinical composite indices. We apply rigorous preprocessing and feature selection before model training. We validate the framework on two datasets: the PPG-BP dataset (657 segments, 4 classes) for benchmarking and PulseDB (283,773 segments, 3 classes) to assess scalability. On PPG-BP, LightGBM trained on the selected features achieved macro-F1 = 0.81 and accuracy = 0.79, outperforming comparable deep-learning models and achieving strong minority-class performance (e.g., HT2 F1-score = 0.92). On PulseDB, a custom Residual MLP achieved accuracy = 0.81 and macro-F1 = 0.79, supporting generalization at scale. These results indicate that the proposed feature-based approach can outperform complex end-to-end deep-learning models on small datasets while providing improved interpretability. This work offers a reliable and transparent pathway toward clinically viable continuous BP staging, moving beyond black-box models toward physiologically grounded decision support.
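Because the results are reported as both accuracy and macro-F1, it may help to see how the two metrics diverge on imbalanced BP stages. The following is a minimal illustrative sketch, not the evaluation code used in this work: the function names, the two-class toy labels, and the toy predictions are all assumptions chosen for illustration (only the stage name HT2 appears in the text above).

```python
def per_class_f1(y_true, y_pred, labels):
    """Return {label: F1}, treating each label one-vs-rest."""
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # By convention here, F1 is 0 when a class is never present or predicted.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1, so every stage counts equally."""
    scores = per_class_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of segments whose predicted stage matches the true stage."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: 8 majority-stage segments, 2 minority-stage segments,
# and a degenerate classifier that always predicts the majority stage.
labels = ["NT", "HT2"]  # "NT" is an assumed label name for illustration
y_true = ["NT"] * 8 + ["HT2"] * 2
y_pred = ["NT"] * 10

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred, labels)
print(f"accuracy = {acc:.2f}, macro-F1 = {mf1:.2f}")
```

In this toy case the always-majority classifier looks acceptable on accuracy yet scores poorly on macro-F1, because the missed minority stage contributes an F1 of zero to the unweighted mean. This is the reason macro-F1 and per-class F1 are informative alongside accuracy when minority stages matter.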