Chinese Journal of Evidence-Based Pediatrics ›› 2017, Vol. 12 ›› Issue (1): 22-26.

• Original Article •

BP neural network model for differentiating Kawasaki disease from febrile illnesses in children, based on data mining

FAN Chu (樊楚)1, HE Xiang-qian (贺向前)1, YU Yue (于跃)1, TIAN Jie (田杰)2, ZHANG Sheng (张胜)1, LI Zhe (李哲)1

  1 College of Medical Informatics, Chongqing Medical University, Chongqing 400016, China; 2 Department of Cardiology, Children's Hospital of Chongqing Medical University, Chongqing 400000, China
  • Received: 2017-01-16 Revised: 2017-02-13 Online: 2017-02-25 Published: 2017-02-25
  • Corresponding author: HE Xiang-qian

Abstract:

Objective: To develop a back-propagation (BP) neural network model for diagnosing Kawasaki disease (KD) from clinical manifestations and laboratory indices, and to evaluate its diagnostic performance.

Methods: Consecutive cases with a discharge diagnosis of KD, together with cases of febrile illnesses requiring differentiation from KD, were collected from the electronic medical record system of the Children's Hospital of Chongqing Medical University for January 2007 to January 2016. Cases were randomly divided into a training set and a test set using the random sampling function in R 3.2.3. A total of 51 items covering general condition, clinical manifestations and laboratory indices were extracted from the records. Variables that were statistically significant in univariate analysis were used to build a Logistic regression model and a BP neural network model, and the diagnostic performance of the two models was compared.

Results: A total of 905 patients with KD and 438 patients with other febrile illnesses were included in the analysis. The training set comprised 1 042 patients (700 with KD, 342 with other febrile illnesses) and the test set comprised 301 patients (205 with KD, 96 with other febrile illnesses). Univariate analysis identified 37 items that differed significantly between KD and other febrile illnesses. Sixteen variables entered the optimal Logistic regression equation. The BP neural network had 37 input-layer nodes, 24 hidden-layer nodes and 1 output-layer node. The Logistic regression model classified 84.1% of the training set and 82.1% of the test set correctly, with areas under the ROC curve (AUC) of 0.91 and 0.89, respectively. The BP neural network model classified 96.4% of the training set and 86.0% of the test set correctly, with AUCs of 0.94 and 0.92, respectively. Both models showed good sensitivity, and the specificity of the BP neural network model was higher than that of the Logistic regression model.

Conclusion: The BP neural network model developed in this study has good auxiliary diagnostic value for KD, but it requires further validation in clinical practice.
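The network shape and the evaluation measures reported above can be illustrated with a minimal sketch. It is not the authors' implementation: the study used R 3.2.3, and the abstract does not state the activation function, loss, learning rate or weight initialisation. The sigmoid units, cross-entropy loss, learning rate of 0.1 and all function names below are assumptions chosen for illustration; only the 37-24-1 layer sizes come from the abstract.

```python
import math
import random

# Layer sizes reported in the abstract; everything else here is assumed.
N_IN, N_HID, N_OUT = 37, 24, 1


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(seed=0):
    """Small random weights; the initialisation scheme is an assumption."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.1, 0.1) for _ in range(N_IN)] for _ in range(N_HID)],
        "b1": [0.0] * N_HID,
        "w2": [rng.uniform(-0.1, 0.1) for _ in range(N_HID)],
        "b2": 0.0,
    }


def n_parameters():
    """Weights plus biases of a fully connected 37-24-1 network."""
    return N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT


def forward(net, x):
    """Return hidden activations and the output probability for one case."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(net["w1"], net["b1"])
    ]
    out = sigmoid(sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"])
    return hidden, out


def train_step(net, x, y, lr=0.1):
    """One back-propagation update on a single case (cross-entropy loss)."""
    hidden, out = forward(net, x)
    delta_out = out - y  # output-layer error term for sigmoid + cross-entropy
    # Hidden-layer error terms are computed before the output weights change.
    delta_hid = [
        delta_out * net["w2"][j] * hidden[j] * (1.0 - hidden[j])
        for j in range(N_HID)
    ]
    for j in range(N_HID):
        net["w2"][j] -= lr * delta_out * hidden[j]
        net["b1"][j] -= lr * delta_hid[j]
        for i in range(N_IN):
            net["w1"][j][i] -= lr * delta_hid[j] * x[i]
    net["b2"] -= lr * delta_out


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity from 0/1 labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In use, each case would be a list of 37 numeric values (the variables retained after univariate analysis), `train_step` would be called repeatedly over the training set, and `confusion_metrics` and `auc` would be applied separately to the training-set and test-set predictions. Comparing the two sets matters here because the reported accuracy drops from 96.4% on the training set to 86.0% on the test set.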