Responsible AI for Sepsis Prediction: Bridging the Gap Between Machine Learning Performance and Clinical Trust

Background: Sepsis remains a leading cause of mortality in Intensive Care Units (ICUs) worldwide. Machine learning models for clinical prediction must be accurate, fair, transparent, and reliable if physicians are to rely on them in their decision-making. Methods: We used the MIMIC-IV database (version 3.1) to evaluate several machine learning architectures, including Logistic Regression, XGBoost, LightGBM, Long Short-Term Memory (LSTM) networks, and Transformer models. We predicted three main clinical targets: hospital mortality, length of stay, and septic shock onset. Model interpretability was assessed using Shapley Additive Explanations (SHAP). Results: The XGBoost model achieved the strongest performance across the prediction tasks, particularly for hospital mortality (AUROC 0.874), outperforming the LSTM networks, Transformers, and linear baselines. Variable importance analysis confirmed the clinical relevance of the model. Conclusions: While XGBoost and ensemble algorithms demonstrate superior predictive power for sepsis prognosis, their clinical adoption requires robust explainability mechanisms to gain physicians' trust.
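For readers less familiar with the headline metric, AUROC can be read as the probability that a randomly chosen positive case (for example, a patient who died in hospital) receives a higher predicted risk than a randomly chosen negative case. The following is a minimal, illustrative sketch of that rank-based definition in plain Python. The function name `auroc` and the toy labels and scores are assumptions made for illustration; they are not taken from the study's code or from MIMIC-IV data.

```python
def auroc(labels, scores):
    """Rank-based AUROC (Mann-Whitney U statistic, normalised to [0, 1]).

    labels: iterable of 0/1 outcomes (1 = event, e.g. hospital mortality).
    scores: iterable of predicted risks, higher meaning more likely positive.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks, giving tied scores their average rank.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")

    # Sum of ranks of the positives, minus the smallest value that sum can take.
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical toy example: four patients, two of whom had the outcome.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))  # 0.75
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported 0.874 means that roughly 87% of positive-negative patient pairs would be ranked correctly by the model.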
