IEEE Journal of Biomedical and Health Informatics | IEEE Xplore

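Contents

Survey of Deep Learning in Healthcare

Improving the In-Hospital Mortality Prediction of Diabetes ICU Patients Using a Process Mining/Deep Learning Architecture

Early Detection of Alzheimer's Disease with Blood Plasma Proteins Using Support Vector Machines

XGBoost for Estimating Vascular Aging, with SHAP Explanations

Multi-Source Transfer Learning Via Multi-Kernel Support Vector Machine Plus for B-Mode Ultrasound-Based Computer-Aided Diagnosis of Liver Cancers

Predicting Recurrence for Patients With Ischemic Cerebrovascular Events Based on Process Discovery and Transfer Learning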


Survey of Deep Learning in Healthcare

Deep Learning for Health Informatics

Deep Learning for Health Informatics | IEEE Journals & Magazine | IEEE Xplore

2017.1

Challenges in using EHR data

D. Medical Informatics

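Medical informatics focuses on analyzing large, aggregated data in health care settings, with the aim of enhancing and developing clinical decision support systems or assessing medical data for both quality assurance and accessibility of health care services. Electronic health records (EHR) are an extremely rich source of patient information, including medical history details such as diagnoses, diagnostic exams, medications and treatment plans, immunization records, allergies, radiology images, multivariate sensor time series (such as EEG from intensive care units), and laboratory and test results. Efficient mining of this big data would provide valuable insight into disease management [138], [139]. However, this is not trivial, for several reasons: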

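  1. The data are complex, owing to varying length, irregular sampling, lack of structured reporting, and missing data. The quality of reporting varies among institutions and individuals.
  2. Multimodal datasets of several petabytes, including medical images, sensor data, laboratory results, and unstructured text reports.
  3. Long-term dependencies between clinical events and disease diagnosis and treatment complicate learning. For example, long and varying delays often separate the onset of a disease from the appearance of symptoms.
  4. Conventional machine learning approaches do not scale to large unstructured datasets.
  5. Lack of interpretability of results hinders adaptation of the methods to clinical settings.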

Deep learning approaches have been designed to scale well with big and distributed datasets. The success of DNNs is largely due to their ability to learn novel features/patterns and understand data representation in both unsupervised and supervised hierarchical manners. DNNs have also proven to be efficient in handling multimodal information since they can combine several DNN architectural components. Therefore, it is unsurprising that deep learning has quickly been adopted in medical informatics research.

(Note: multimodal information means information in several modalities, such as text, images, video, and audio.)

The motivation behind these studies is to develop general-purpose systems that accurately predict length of stay, future illness, readmission, and mortality, in order to improve clinical decision making and optimize clinical pathways. Early prediction in health care is directly related to saving patients' lives. Furthermore, the discovery of novel patterns can result in new hypotheses and research questions. In computational phenotyping research, the goal is to discover meaningful data-driven features and disease characteristics.

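(Note: computational phenotyping is a family of methods for identifying patients from the symptoms and traits they present: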

computational phenotyping, a biomedical informatics method for identifying patient populations. In this course you will learn how different clinical data types perform when trying to identify patients with a particular disease or trait.)

Does our data count as electronic health records (EHR)?

SECTION IV. Deep Learning in Healthcare: Limitations and Challenges

2. As we have already highlighted in the previous sections, to train a reliable and effective model, large sets of training data are required for the expression of new concepts. Although recently we have witnessed an explosion of available healthcare data, with many organizations starting to effectively transform medical records from paper to electronic records, disease-specific data is often limited. Therefore, not all applications, particularly those involving rare diseases or events, are well suited to deep learning. A common problem that can arise during the training of a DNN (especially in the case of small datasets) is overfitting, which may occur when the number of parameters in the network is proportional to the total number of samples in the training set. In this case, the network is able to memorize the training examples, but cannot generalize to new samples that it has not already observed. Therefore, although the error on the training set is driven to a very small value, the errors for new data will be high.
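The memorize-but-not-generalize behaviour described above can be reproduced on a toy problem. The sketch below is not a DNN: as a stand-in, it assumes a 1-nearest-neighbour lookup (one stored value per training sample) and compares it with a one-parameter threshold rule. The data, the noise pattern, and all function names are made up for illustration.

```python
def true_label(x: float) -> int:
    """Ground-truth rule that both models are supposed to learn."""
    return 1 if x > 0.5 else 0

# Ten training points; two labels are flipped to simulate label noise.
train_x = [0.05 + 0.1 * i for i in range(10)]
train_y = [true_label(x) for x in train_x]
for noisy_index in (2, 7):
    train_y[noisy_index] = 1 - train_y[noisy_index]

# Fifty clean test points spread over the same interval.
test_x = [0.01 + 0.02 * i for i in range(50)]
test_y = [true_label(x) for x in test_x]

def memorizer(x: float) -> int:
    """1-nearest-neighbour lookup: effectively one parameter per training sample."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

def threshold_model(x: float) -> int:
    """A one-parameter model: a single cut at 0.5."""
    return 1 if x > 0.5 else 0

def error_rate(model, xs, ys) -> float:
    """Fraction of samples the model labels incorrectly."""
    wrong = sum(model(x) != y for x, y in zip(xs, ys))
    return wrong / len(xs)

memorizer_train = error_rate(memorizer, train_x, train_y)
memorizer_test = error_rate(memorizer, test_x, test_y)
threshold_train = error_rate(threshold_model, train_x, train_y)
threshold_test = error_rate(threshold_model, test_x, test_y)

print("memorizer: train", memorizer_train, "test", memorizer_test)
print("threshold: train", threshold_train, "test", threshold_test)
```

The memorizer reproduces the noisy training labels perfectly but carries the noise over to nearby test points, while the simpler rule accepts some training error and generalizes better.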

4. The last aspect that we would like to underline is that many DNNs can be easily fooled. For example, [144] shows that it is possible to add small changes to the input samples (such as imperceptible noise in an image) to cause samples to be misclassified. However, it is important to note that almost all machine learning algorithms are susceptible to such issues. Values of particular features can be deliberately set very high or very low to induce misclassification in logistic regression. Similarly, for decision trees, a single binary feature can be used to direct a sample along the wrong partition by simply switching it at the final layer. Hence, in general, any machine learning model is susceptible to such manipulations. On the other hand, the work in [145] discusses the opposite problem. The author shows that it is possible to obtain meaningless synthetic samples that are strongly classified into classes even though they should not have been classified. This is also a genuine limitation of the deep learning paradigm, but it is a drawback for other machine learning algorithms as well.
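The logistic regression remark can be made concrete with a small sketch. The weights, bias, and feature values below are hypothetical and hand-picked; the point is only that pushing one feature to an extreme value flips the predicted class.

```python
import math

# Hypothetical, hand-picked parameters of a two-feature logistic regression.
WEIGHTS = [1.5, -2.0]
BIAS = -0.5

def predict_proba(features) -> float:
    """Probability of class 1 under the logistic model."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-z))

def predict(features) -> int:
    """Class label using the usual 0.5 decision threshold."""
    return 1 if predict_proba(features) >= 0.5 else 0

original = [0.2, 0.6]      # z = -0.5 + 0.3 - 1.2 = -1.4, so class 0
manipulated = [10.0, 0.6]  # only the first feature is changed; z = 13.3, so class 1

print(predict(original), round(predict_proba(original), 3))
print(predict(manipulated), round(predict_proba(manipulated), 3))
```

Because the model is linear in its inputs, any feature with a non-zero weight can be used this way, which is why the vulnerability is not specific to deep networks.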

To conclude, we believe that healthcare informatics today is a human-machine collaboration that may ultimately become a symbiosis in the future. As more data becomes available, deep learning systems can evolve and deliver where human interpretation is difficult. This can make diagnoses of diseases faster and smarter and reduce uncertainty in the decision-making process. Finally, the last boundary of deep learning could be the feasibility of integrating data across disciplines of health informatics to support the future of precision medicine.

Improving the In-Hospital Mortality Prediction of Diabetes ICU Patients Using a Process Mining/Deep Learning Architecture

Improving the In-Hospital Mortality Prediction of Diabetes ICU Patients Using a Process Mining/Deep Learning Architecture | IEEE Journals & Magazine | IEEE Xplore

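Similar to our work in terms of interpretability: it includes a Shapley value analysis of the neural network.

The paper's main contribution is using process mining to add features to the neural network's input.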

D. Ablation Study

This subsection describes multiple performed ablation studies, including an analysis of Shapley values of the neural network, an event type ablation on the event log, a layer-wise ablation study on the neural network, and an ablation study on the artificial event learning approach, in Sections V-D1, V-D2, V-D3, and V-D4, respectively.

1) Shapley Value Analysis

A Shapley value analysis is performed on every patient of the test set to investigate the impact of each patient's input on the neural network's output probability. To this end, the Shapley Additive Explanation approach [62] is leveraged. A Shapley value describes the average contribution of a feature value to the prediction across different coalitions. The 3rd quartile range of Shapley values provides a good estimate of a feature's importance while neglecting outliers and accommodating the sparsity of features with low impact due to the high dimensionality of the process model. Fig. 8 shows the obtained results.
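To make "average contribution across different coalitions" concrete, here is a minimal sketch that computes exact Shapley values by enumerating every feature ordering. It is not the SHAP library and not the paper's pipeline: the toy model, the all-zero baseline, the patient vectors, and the reading of "3rd quartile" as the third quartile of absolute Shapley values per feature are all assumptions made for illustration.

```python
from itertools import permutations
from statistics import quantiles

def toy_model(x):
    """Hypothetical model: one additive term plus one interaction term."""
    return 2.0 * x[0] + x[1] * x[2]

def shapley_values(model, instance, baseline):
    """Exact Shapley values: average marginal contribution over all orderings.

    Features not yet added to the coalition keep their baseline value.
    The cost grows factorially, so this only works for a handful of features.
    """
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous_output = model(current)
        for feature in order:
            current[feature] = instance[feature]
            output = model(current)
            totals[feature] += output - previous_output
            previous_output = output
    return [total / len(orderings) for total in totals]

baseline = [0.0, 0.0, 0.0]
patients = [[1.0, 2.0, 3.0], [0.5, 1.0, 1.0], [2.0, 0.0, 4.0], [1.5, 3.0, 1.0]]
per_patient = [shapley_values(toy_model, p, baseline) for p in patients]

# Assumed aggregation: third quartile of |Shapley value| per feature over patients.
third_quartile = [
    quantiles([abs(values[i]) for values in per_patient], n=4)[2]
    for i in range(len(baseline))
]

print(per_patient[0])
print(third_quartile)
```

For the first patient the additive feature receives its full term, and the two interacting features split the interaction term equally; the values sum to the difference between the model output for the patient and for the baseline.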

Paper Writing Notes 6: JBHI Paper Roundup - 编程知识网

Fig. 8. Average 3rd quartile range of Shapley values per feature type.

The figure shows that the patient's count of events has the largest impact on the prediction, followed by the severity scores. The token count vectors created by replaying the patient's history on the process model have an impact similar to the severity scores. The average 3rd quartile ranges of Shapley values for the time decay function values and markings of the timed state samples originating from the patient's history show a range of around 10 percent. The demographic information and exponential time decay function seem to be less important on average. The figure confirms that patient history and the timing of events modeled using process mining have an impact on the mortality probability prediction in the proposed setting.

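Model section

Mortality prediction consists of:

A process-mining-based method plus a neural network (Dense Neural Network); the paper states these names directly.

The method uses the constructed event log, patient demographics, and admission-day severity scores to predict the in-hospital mortality of diabetes ICU patients.

Experiments section

(Evaluation)

Covers Dataset / Setup / Results / Ablation Study / Discussion.

The metrics are introduced under Setup.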

Early Detection of Alzheimer's Disease with Blood Plasma Proteins Using Support Vector Machines

Early Detection of Alzheimer's Disease with Blood Plasma Proteins Using Support Vector Machines | IEEE Journals & Magazine | IEEE Xplore

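Disease prediction with SVM; similar to our work in how machine learning methods are used.

The main contribution: during feature selection, the authors found several proteins usable for prediction that conventional medicine had not used.

Dataset description: the demographic information of the subjects.

Data pre-processing: was standardization applied (or was it done directly before dimensionality reduction)? See the paper's Data Pre-processing section: standardized.

Cross-validation: using 10-fold cross-validation.

Feature subset selection (CFS) is used to reduce the dimension of the study data prior to model development.

The paper introduces the soft-margin SVM formulation.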
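For reference, the standard textbook form of the soft-margin SVM primal problem is given below. The symbols ($w$, $b$, slack variables $\xi_i$, penalty $C$, feature map $\phi$) follow the usual convention and are an assumption here; the paper's own notation may differ.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i \\
\text{subject to}\quad & y_i \left( w^{\top} \phi(x_i) + b \right) \ge 1 - \xi_i, \\
& \xi_i \ge 0, \qquad i = 1, \dots, n .
\end{aligned}
```

Here $y_i \in \{-1, +1\}$ are the class labels; a larger $C$ penalizes margin violations more heavily, and letting $C \to \infty$ recovers the hard-margin case.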

Use of off-the-shelf packages:

Feature selection using CFS as discussed earlier was conducted with the attribute selection toolbox in the Weka software package [41]. All classification tasks were performed using MATLAB and the Weka software package.

Open-source code:

The MATLAB code is available at https://github.com/chimastan/earlydetectionofAD

Class imbalance handling:

No class imbalance handling procedure was applied to the training dataset (Dataset 1) during model development, as the minority-to-majority class distribution was 35:65%, which is acceptable in ML-based classification problems [44], [45].

How the curse of dimensionality is phrased

We have also shown the performance of the larger panel derived by combining all five panels we identified, although it has a lower performance relative to the individual panels, perhaps due to the curse of dimensionality.

Important features obtained from the experiments (here, obtained through feature selection)

Several of these proteins are found in nearly all the previously reported models investigated in this study.

Limitations

However, this study has several limitations including the following:

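Sample size and ML method

In this work, the sample size of the study data is small. This is because of the limited availability of relevant data, partly owing to the high cost of collecting such specialized data. Because of the limited dataset, this study did not explore recent ML methods such as deep learning (DL), as they require large datasets. As more data become available, we will explore DL methods such as convolutional and recurrent neural networks [54], [55]. Nonetheless, traditional ML methods remain attractive in this field because they are relatively simple, less costly, and useful for data modeling [56]. However, although the traditional ML method we applied achieved high classification performance, there are other methods (e.g., ensemble learning [57]) that have the potential to improve performance and could therefore be applied in future studies.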

Contributions

The main contributions of this study include the potential biomarker signatures identified and the methodological approach adopted in the search for these signatures, in an effort to bridge an important study gap of early detection of AD with proteomic-based non-amyloid blood biomarkers.

Conclusion

This may aid identification of individuals at the earliest stages of AD who may benefit from early interventions. Furthermore, new insights about the disease may be gained from understanding the interactions between the proteins in disease subjects. Such enhanced understanding may contribute to the improvement of interventions in clinical trials.

Model section

Model Development

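Includes feature selection / choosing the ML algorithm (SVM-based Evaluation) / Classification With Kernelized SVM

Several SVM [37] classification models with different kernels were constructed, including linear, second- and third-order polynomial, and radial basis function (RBF) kernels. The average performance of each model in classifying ADD and HC subjects was obtained using a 10-fold cross-validation [29] scheme repeated 10 times. Next, the most stable models (SVM algorithm and feature panel) that met the performance criterion of mean SN and SP ≥ 70% in classifying ADD and HC subjects were tested on Dataset 2 for their ability to distinguish the MCI and HC groups. Finally, the model and underlying protein panel that performed best in classifying the MCI and HC groups were selected as the putative model and non-amyloid biomarker panel for early detection of AD.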
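The selection rule (mean SN and SP of at least 70% over the cross-validation runs) is easy to check in code. The sketch below assumes the predictions of each fold are already available as label lists; the function names and the example labels are hypothetical, and the SVM training itself is not shown.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (SN, SP): true-positive rate and true-negative rate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

def meets_criterion(fold_results, threshold=0.70):
    """True if mean SN and mean SP over all folds both reach the threshold."""
    scores = [sensitivity_specificity(y_true, y_pred) for y_true, y_pred in fold_results]
    mean_sn = sum(sn for sn, _ in scores) / len(scores)
    mean_sp = sum(sp for _, sp in scores) / len(scores)
    return mean_sn >= threshold and mean_sp >= threshold

# Two made-up folds of (true labels, predicted labels); 1 = ADD, 0 = HC.
fold_1 = ([1, 1, 1, 1, 0, 0, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 0, 0, 1, 0])
fold_2 = ([1, 1, 0, 0], [1, 0, 0, 0])

print(sensitivity_specificity(*fold_1))
print(meets_criterion([fold_1]), meets_criterion([fold_1, fold_2]))
```

Averaging SN and SP separately matters: a model can have high accuracy on an imbalanced set while missing most positive cases, which the sensitivity threshold rules out.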

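Experiments section

The metrics are not described in detail.

The SVM's performance is evaluated directly (AUC, etc.).

Dataset description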


Computing the SD in Excel:

STDEVP
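Excel's STDEVP is the population standard deviation (it divides by n), whereas STDEV / STDEV.S is the sample standard deviation (it divides by n − 1). A small Python check of the difference, using the standard library and a made-up list of values:

```python
from statistics import pstdev, stdev

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5

population_sd = pstdev(values)  # same definition as Excel's STDEVP (divide by n)
sample_sd = stdev(values)       # same definition as Excel's STDEV.S (divide by n - 1)

print(population_sd, sample_sd)
```

If the rows in a dataset table are a sample drawn from a larger population, the n − 1 version may be the more appropriate one to report; the two values converge as n grows.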

XGBoost for Estimating Vascular Aging, with SHAP Explanations

XGBoost Regression of the Most Significant Photoplethysmogram Features for Assessing Vascular Aging

XGBoost Regression of the Most Significant Photoplethysmogram Features for Assessing Vascular Aging | IEEE Journals & Magazine | IEEE Xplore

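Model section

Each small module of the model is presented separately (the core model is simply written as XGBoost),

and the modules are presented side by side.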

I. Model Generation and Validation

Model generation and validation were performed through 10-fold cross validation. The 10-fold cross validation divided the data into 10 sets of the same size, created a model using 9 sets as a training set, and repeated the process of testing with the remaining set until all sets were used for testing once. It has the advantage that all data can be used for training and testing. Bayesian optimization was used for hyperparameter optimization during model training. The parameters optimized using Bayesian optimization included learning rate, max tree depth, and subsample ratio, and optimal values were derived in the ranges (0.0001 – 0.1), (2 – 20), and (0.1 – 1), respectively.
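The 10-fold procedure described above can be sketched in a few lines of plain Python. This is only the index bookkeeping: the XGBoost model and the Bayesian optimizer are not implemented here, and the function name, the seed, and the `SEARCH_SPACE` dictionary keys are hypothetical (only the three ranges come from the text).

```python
import random

# Ranges quoted in the text; the dictionary itself and its keys are illustrative.
SEARCH_SPACE = {
    "learning_rate": (0.0001, 0.1),
    "max_depth": (2, 20),
    "subsample": (0.1, 1.0),
}

def k_fold_splits(n_samples, k=10, seed=0):
    """Return k (train_indices, test_indices) pairs.

    Every sample lands in exactly one test fold; fold sizes differ by at
    most one when n_samples is not divisible by k.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for held_out in range(k):
        test_idx = folds[held_out]
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        splits.append((train_idx, test_idx))
    return splits

splits = k_fold_splits(103)
for train_idx, test_idx in splits[:2]:
    print(len(train_idx), len(test_idx))
```

In a full pipeline, each candidate hyperparameter setting proposed by the optimizer would be scored by training on `train_idx` and evaluating on `test_idx` for every split, then averaging the ten scores.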

J. Statistical Analysis

To statistically verify the performance of the generated model, the mean absolute error (MAE) and root mean squared error (RMSE), which represented the difference between the estimated age and the actual age, were calculated. In addition, Pearson's correlation coefficient was used to analyze the correlation between the actual age and the estimated age, and Bland–Altman analysis was used to analyze the vascular age estimation error.
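The four statistics named above can be computed with the standard library alone. The sketch below uses made-up ages and assumes the usual definitions, including 1.96 sample standard deviations for the Bland–Altman limits of agreement; the paper's exact conventions are not stated in these notes.

```python
import math
from statistics import mean, stdev

def mae(actual, estimated):
    """Mean absolute error."""
    return mean(abs(e - a) for a, e in zip(actual, estimated))

def rmse(actual, estimated):
    """Root mean squared error."""
    return math.sqrt(mean((e - a) ** 2 for a, e in zip(actual, estimated)))

def pearson_r(x, y):
    """Pearson's correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def bland_altman(actual, estimated):
    """Return (bias, lower limit, upper limit) of agreement."""
    diffs = [e - a for a, e in zip(actual, estimated)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread

actual_age = [30.0, 40.0, 50.0, 60.0]     # made-up values
estimated_age = [32.0, 38.0, 53.0, 59.0]  # made-up values

print(mae(actual_age, estimated_age))
print(rmse(actual_age, estimated_age))
print(pearson_r(actual_age, estimated_age))
print(bland_altman(actual_age, estimated_age))
```

MAE and RMSE summarize the size of the error, Pearson's r only measures linear association, and the Bland–Altman bias and limits show whether the estimates are systematically too high or too low.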

Multi-Source Transfer Learning Via Multi-Kernel Support Vector Machine Plus for B-Mode Ultrasound-Based Computer-Aided Diagnosis of Liver Cancers

Multi-Source Transfer Learning Via Multi-Kernel Support Vector Machine Plus for B-Mode Ultrasound-Based Computer-Aided Diagnosis of Liver Cancers | IEEE Journals & Magazine | IEEE Xplore

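Transfer learning + liver

The methods section directly presents the various improved algorithms the authors propose.

Experiments section

"Experimental Results and Analysis" includes:

Data processing and feature extraction / Experimental Design / Experimental Results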

Predicting Recurrence for Patients With Ischemic Cerebrovascular Events Based on Process Discovery and Transfer Learning

Predicting Recurrence for Patients With Ischemic Cerebrovascular Events Based on Process Discovery and Transfer Learning | IEEE Journals & Magazine | IEEE Xplore

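Differs from our work in the transfer method, but is very similar in terms of data and ablation experiments.

This paper is well structured.

Methods section

Prediction model development and validation

In this step, we constructed a training dataset and developed a binary classification model to predict whether ICE would recur within two years. After variable screening, the number and types of variables in the two domains were the same (i.e., feature space Xs = Xt). Similar to reference [23], we chose a straightforward way to combine the source and target domains, in which the data were concatenated to form a unified training set. However, because the numbers of instances from the source and target domains differ (the former contains far more samples), we adjusted the instance weights to change the relative degree of "attention" paid to the target-domain data during training.

This so-called transfer is really just giving the target-domain samples a larger weight.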
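A minimal sketch of this instance-weighting idea, assuming a single hypothetical `target_weight` applied to every target-domain sample (the paper's actual weighting scheme and values are not given in these notes). The weighted log loss shows where the weights would enter a binary classifier's training objective.

```python
import math

def combine_domains(source, target, target_weight=4.0):
    """Concatenate source and target samples and attach per-instance weights.

    Source samples get weight 1.0; target samples get target_weight.
    """
    samples = list(source) + list(target)
    weights = [1.0] * len(source) + [target_weight] * len(target)
    return samples, weights

def target_share(weights, n_source):
    """Fraction of the total training weight carried by target-domain samples."""
    return sum(weights[n_source:]) / sum(weights)

def weighted_log_loss(probabilities, labels, weights):
    """Binary cross-entropy in which each instance counts according to its weight."""
    total = 0.0
    for p, y, w in zip(probabilities, labels, weights):
        total -= w * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / sum(weights)

# Made-up (features, label) pairs: 8 source samples and 2 target samples.
source = [([0.1 * i], i % 2) for i in range(8)]
target = [([0.9], 1), ([0.2], 0)]

samples, weights = combine_domains(source, target, target_weight=4.0)
labels = [label for _, label in samples]
predicted = [0.5] * len(samples)  # placeholder predictions from an untrained model

print(target_share(weights, len(source)))
print(weighted_log_loss(predicted, labels, weights))
```

With a weight of 4.0, the two target samples carry as much total weight as the eight source samples, so errors on the target domain count for half of the loss even though they are only a fifth of the data.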

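Experiments section

Figures and tables

For the dataset description, the data split used for transfer, and the class imbalance situation, see Table 3 of this paper.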


Computing the standard deviation in Excel: STDEVP