Updating and recalibrating causal probabilistic models on a new target population

To be clinically useful, a Bayesian Network (BN) needs to perform well in different populations. For example, a model should perform equally well whether it is used in an intensive care unit in England or in France. Although internally validated BNs usually perform well in the population in which they were developed (the source population), they often show poor calibration when applied to a new population (the target population) [1]. External validation of model performance is commonly considered a stronger test than internal validation, since it addresses generalisability rather than reproducibility [2], [3]. External validity may be evaluated by studying patients from a different time period (temporal validation), from other hospitals (geographic validation), or treated in entirely different settings (strong external or domain validation) [3]. The chance of finding worse model calibration grows as more stringent external validation methods, such as domain validation, are used [4]. This poor calibration can stem from heterogeneity between populations in patient characteristics, disease prevalence, patient management and treatment policies [1].

When a model lacks credibility in the target population, developers tend to simply reject it and develop a new one, sometimes by repeating the entire development process [4]. This discards previously acquired scientific information and costs additional time and effort. A better solution is to build on previous development work by updating and recalibrating the source BN's structure and parameters so that they better represent the characteristics of the target population. In this way, previously acquired scientific information is retained and the time and effort needed to obtain a usable model are substantially reduced. However, existing approaches for updating model structure and parameters rely mainly on data-driven methods, completely ignoring knowledge from experts and the literature. Moreover, the recalibration of latent variables has received little research attention, because latent variables are often omitted by data-driven algorithms despite their importance for models developed using expert knowledge.

In this paper, we propose a pragmatic methodology to update and recalibrate both the structure and the parameters of a BN so that it better models the characteristics of the target population. Our approach uses expert knowledge to update the model's structure. The model's parameters are recalibrated using data collected from the target population and/or expert knowledge, with parameters transferred from the source population when possible and necessary. A novel approach for recalibrating latent variables is also proposed. The method is illustrated by a case study on the prediction of trauma-induced coagulopathy (TIC), in which a BN that had already been developed for civilian trauma patients (TIC-CIV) is recalibrated for military combat casualties (TIC-MIL).

The remainder of this paper is organised as follows. Section 2 presents an overview of our methodology. The case study is introduced in Section 3 and developed further in Section 4 (model update and recalibration) and Section 5 (model performance). Sections 6 and 7 present the discussion and the conclusion, respectively.
