Chinese Journal of Evidence-Based Pediatrics ›› 2024, Vol. 19 ›› Issue (1): 31-35. DOI: 10.3969/j.issn.1673-5501.2024.01.006


Mortality risk predicting and clinical feature screening of children with severe infection by machine learning based on multicenter cohort data

ZHU Xuemei1,4, CHEN Shencheng2,4, ZHANG Yingying1, LU Guoping1, YE Qi2, RUAN Tong2, ZHENG Yingjie3   

  1 Department of Critical Care Medicine, Children's Hospital of Fudan University, Shanghai 201102, China; 2 School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China; 3 Department of Epidemiology, School of Public Health, Fudan University, Shanghai 200032, China; 4 Co-first authors
  • Received: 2024-01-25 Revised: 2024-02-23 Online: 2024-02-25 Published: 2024-02-25

Abstract: Background Predicting mortality in children with severe infection scientifically and effectively is of great importance. In the past, the relationship between illness severity and death in critically ill children was mostly predicted by scoring systems with poor accuracy, such as the Pancreatitis Complications and Severity Index. Objective To explore sensitive indicators for the early warning of death in children with severe infection using machine learning combined with feature screening. Design Cohort study. Methods This cohort study was based on the pediatric Multi-center Infectious Diseases Collaboration Network database, which covers 54 PICUs in 20 provincial administrative regions of China. In total, 122 clinical features across 11 clinical dimensions were collected from children aged >28 days to 18 years with confirmed infection and at least one organ dysfunction. A mortality risk prediction model for critically ill children with infection was established by constructing logistic regression (LR), random forest (RF), extreme gradient boosting tree (XGB), and backpropagation neural network (BP) models and by screening important clinical features. Main outcome measures AUROC and the performance of the models in screening clinical features. Results From April 1, 2022 to December 31, 2023, there were 1 738 cases of severe infection with complete records at PICU admission, at 24 h of PICU stay, and at PICU discharge, of whom 1 396 survived or improved and 342 (19.6%) died or deteriorated. After data preprocessing (outlier handling, missing-value imputation, mandatory value-range checks, and normalization), all 1 738 records were used to build the models. At a ratio of 4∶1, 1 390 patients were assigned to the training set and 348 to the validation set.
In the training set, 1 116 patients survived (or were cured) and 274 died (or worsened); in the validation set, 280 patients survived (or were cured) and 68 died (or worsened). All 122 clinical features were input in the training set. After machine learning and feature screening, the AUROC of the LR, RF, and XGB models ranged from 0.74 to 0.78 in the validation set after 50 rounds of 5-fold stratified cross-validation. Features with importance greater than the mean were selected to construct the optimal clinical feature set for the LR, RF, and XGB models. At present, there is no good method for measuring feature importance in BP models. The clinical features selected by the LR model were closer to clinical expectations than those selected by the RF and XGB models. Conclusion Machine learning is still far from perfect in predicting death from severe infection in children, and the clinical features screened by the predictive models remain far from clinical expectations.
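
The abstract does not state which software was used, so the following is only a minimal sketch of three steps it names: stratified 5-fold assignment, AUROC, and keeping features whose importance exceeds the mean. It uses plain Python rather than any particular machine learning library. The function names, the round-robin fold assignment, and the example feature names are assumptions for illustration, not the authors' implementation.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds while keeping class ratios similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the indices of each class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def auroc(labels, scores):
    """AUROC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one case of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def select_above_mean(importances):
    """Return the names of features whose importance exceeds the mean."""
    mean_importance = sum(importances.values()) / len(importances)
    return [name for name, value in importances.items() if value > mean_importance]


# Outcome counts taken from the abstract: 1 396 survived, 342 died.
labels = [0] * 1396 + [1] * 342
folds = stratified_folds(labels, k=5, seed=0)
deaths_per_fold = [sum(labels[i] for i in fold) for fold in folds]

# Hypothetical feature names and importance values, for illustration only.
selected = select_above_mean({"lactate": 0.6, "age": 0.1, "gcs": 0.2})
```

Repeating `stratified_folds` with 50 different seeds and averaging the per-fold AUROC would give one possible reading of "50 rounds of 5-fold stratified cross-validation"; the abstract does not specify how the rounds were aggregated.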

Key words: Machine learning, Pediatric intensive care unit, Infection, Random forest model, Extreme gradient boosting tree