A deep learning model for predicting multidrug-resistant organism infection in critically ill patients

Study population

We retrospectively collected data from patients treated in the ICU of the Affiliated Hospital of Qingdao University from July 2021 to January 2022. The primary cohort comprised 688 critically ill patients. For external validation, patients admitted to the same study center from May 2022 to July 2022 formed the validation set.

All adult ICU patients (aged ≥ 18 years) who had at least one microbial culture performed during ICU hospitalization were enrolled in this study. Patients who died or left the ICU within 48 h, had incomplete case data, or were diagnosed with multidrug-resistant organism (MDRO) infection prior to ICU admission were excluded. For patients with multiple ICU admissions during hospitalization, only the first admission was included in the analysis.
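The inclusion and exclusion criteria above can be written as a single eligibility check. The following is a minimal sketch, not the authors' code: the record type `IcuAdmission` and all of its field names are hypothetical, and it assumes that a stay shorter than 48 h (ending in death or discharge) is excluded and that only the admission numbered 1 is ever considered.

```python
from dataclasses import dataclass


@dataclass
class IcuAdmission:
    # All field names are hypothetical; the source does not describe a schema.
    patient_id: str
    admission_order: int    # 1 = first ICU admission of this hospitalization
    age_years: int
    icu_hours: float        # hours until death or ICU discharge
    n_cultures: int         # microbial cultures performed during the ICU stay
    complete_record: bool
    mdro_before_icu: bool   # MDRO infection diagnosed before ICU admission


def is_eligible(a: IcuAdmission) -> bool:
    """Return True if the admission meets the criteria described in the text."""
    return (
        a.age_years >= 18
        and a.n_cultures >= 1
        and a.icu_hours >= 48          # assumption: exactly 48 h is kept
        and a.complete_record
        and not a.mdro_before_icu
        and a.admission_order == 1     # only the first ICU admission
    )


def select_cohort(admissions):
    """Keep only the eligible admissions."""
    return [a for a in admissions if is_eligible(a)]
```

Whether a later admission should be used when the first one is ineligible is not stated in the text; this sketch simply drops such patients.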

This study was approved by the Ethics Committee of Qingdao University Medicine (QDU-HEC-2021173). Because the study was retrospective and the data were anonymized, the requirement for informed consent was waived.

Data collection

We obtained patient information through the hospital infection surveillance and electronic medical record systems. Candidate factors that may be associated with MDRO infection included general data, invasive procedures, medication, laboratory indicators, and clinical scores. General data included gender, age, body mass index, length of hospitalization, length of ICU stay, and comorbid diseases (diabetes, hypertension, chronic lung disease, liver disease, chronic renal disease, congestive heart failure, and cerebrovascular disease). Invasive procedures included surgical situations, mechanical ventilation, central venous catheters, gastrointestinal decompression, peripherally inserted central venous catheters, extracorporeal membrane oxygenation, urinary catheters, and other drainage tubes in the ICU. Medication included antibiotic use, hormone therapy, and nutritional support therapy during the ICU stay. Laboratory indicators included albumin, prealbumin, C-reactive protein, procalcitonin, white blood cells, blood urea nitrogen, and creatinine measured within the first 24 h of the ICU stay. The scores included the APACHE II score, the Glasgow Coma Scale, and the Nutritional Risk Screening (NRS)-2002 score, all assessed within 24 h of ICU admission. Comorbid diseases were diagnosed according to the International Classification of Diseases, 10th Revision (ICD-10) codes [17].

Specimens for microbiologic culture were obtained from blood, urine, sputum, pus, drainage fluid, and secretions. The VITEK 2 Compact automated microbial identification and drug sensitivity analysis system was used to identify cultured strains, and the Kirby-Bauer disk diffusion method was used for drug sensitivity testing. MDRO was defined according to the provisional standard definition published by Magiorakos et al. [18]. Long-term bed rest was defined as being bedridden for at least 15 days, with more than 90% of each day spent in bed. The surgical situation included the grade of the operation, the classification of the incision, and the healing of the incision.

Screening for risk factors

Patients were categorized into MDRO-infected and non-MDRO-infected groups according to the presence or absence of MDRO infection during the ICU stay. We combined Lasso and stepwise regression to screen risk factors. Lasso regression used tenfold cross-validation to select the optimal penalty coefficient (lambda). Variables with nonzero coefficients were considered to be related to the dependent variable and were retained [19]. Lasso prevents too many independent variables from entering the back-propagation neural network (BPNN) model, thereby reducing the network's complexity and computation and improving the model's prediction accuracy. Stepwise regression was then applied to further select the optimal combination of independent variables. In this method, variables are introduced one at a time. After each new variable is introduced, the variables already selected into the regression model are tested one by one, and those that are no longer meaningful are deleted [20]. This process continues until no new variable can be introduced and no old variable can be deleted. Variables with two-sided P < 0.05 were identified as independent risk factors for MDRO infection.
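To illustrate how the Lasso penalty drives some coefficients exactly to zero, the following is a minimal sketch of coordinate descent with soft-thresholding. It is not the authors' procedure: the study used the R package "glmnet" with tenfold cross-validation, whereas this sketch uses plain Python, a squared-error loss, a fixed penalty `lam`, and assumes that the columns of `X` and the outcome `y` have already been centered (no intercept). All function names are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * sum((y - X b)^2) + lam * sum(|b_j|).

    X is a list of rows, y a list of outcomes. Assumes centered data.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant (all-zero) column carries no information
            # Covariance of column j with the residual that ignores column j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


def selected_variables(beta):
    """Indices of variables whose coefficient is not zero."""
    return [j for j, b in enumerate(beta) if b != 0.0]
```

In practice the penalty would be chosen by cross-validation rather than fixed, and a binary outcome such as MDRO infection would typically be modeled with a logistic rather than squared-error loss.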

Development and validation of the BPNN model

The confirmed independent risk factors for MDRO infection were used as input variables to construct a BPNN model. The BPNN algorithm uses gradient descent to iteratively adjust the weights and thresholds between layers through backpropagation, minimizing the network's sum of squared errors [21].
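The weight-update rule described above can be sketched for a network with one hidden layer. This is an illustrative sketch only: the study built its model with the R package "nnet", and the architecture, sigmoid activations, initial weight range, and learning rate here are assumptions rather than the authors' settings. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_in, n_hidden, seed=0):
    """Small random weights; biases (thresholds) start at zero."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Return hidden activations and the output probability for one sample."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["W1"], net["b1"])]
    out = sigmoid(sum(w * hi for w, hi in zip(net["W2"], h)) + net["b2"])
    return h, out


def sum_squared_error(net, X, y):
    return sum((forward(net, x)[1] - t) ** 2 for x, t in zip(X, y))


def train_epoch(net, X, y, lr=0.5):
    """One full-batch gradient descent step on 0.5 * sum of squared errors."""
    n_hidden, n_in = len(net["W2"]), len(net["W1"][0])
    gW1 = [[0.0] * n_in for _ in range(n_hidden)]
    gb1 = [0.0] * n_hidden
    gW2 = [0.0] * n_hidden
    gb2 = 0.0
    for x, t in zip(X, y):
        h, out = forward(net, x)
        delta_out = (out - t) * out * (1.0 - out)  # error signal at the output
        gb2 += delta_out
        for k in range(n_hidden):
            # Propagate the output error back through hidden unit k.
            delta_h = delta_out * net["W2"][k] * h[k] * (1.0 - h[k])
            gW2[k] += delta_out * h[k]
            gb1[k] += delta_h
            for j in range(n_in):
                gW1[k][j] += delta_h * x[j]
    for k in range(n_hidden):
        net["W2"][k] -= lr * gW2[k]
        net["b1"][k] -= lr * gb1[k]
        for j in range(n_in):
            net["W1"][k][j] -= lr * gW1[k][j]
    net["b2"] -= lr * gb2
```

Calling `train_epoch` repeatedly on the training data and monitoring `sum_squared_error` shows the error that backpropagation is reducing.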

The data of the primary cohort were randomly divided into a training set and a test set in an 8:2 ratio. The training set was used to construct the model, and the test set was used to evaluate the model's ability to discriminate new samples. To further evaluate the generalizability of the model, external validation was performed by period (temporal) validation, that is, with patients from the same study center admitted during a different time period. At this stage, patient data were collected mainly for the independent risk factors confirmed during model construction.
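A random 8:2 split of this kind can be sketched as follows. The function name, the fixed seed, and the rounding rule (truncating the training size to an integer) are assumptions made for illustration; the source does not report how the split was implemented.

```python
import random


def split_train_test(records, train_fraction=0.8, seed=42):
    """Shuffle record indices reproducibly and cut them at train_fraction."""
    indices = list(range(len(records)))
    random.Random(seed).shuffle(indices)
    cut = int(len(records) * train_fraction)  # assumption: round down
    train = [records[i] for i in indices[:cut]]
    test = [records[i] for i in indices[cut:]]
    return train, test
```

Fixing the seed makes the partition reproducible, so the same patients end up in the test set each time the analysis is rerun.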

Statistical analysis

All variables in this study had less than 5% missing values, and missing values were imputed with the mean. Outliers were defined as values below the first quartile minus 1.5 times the interquartile range (IQR) or above the third quartile plus 1.5 times the IQR. Outliers in the data were replaced with mean values [22].
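The two preprocessing steps can be sketched for a single continuous variable as shown below. Several details are assumptions, since the text does not specify them: missing values are represented as `None`, quartiles are computed with the "inclusive" method of Python's `statistics.quantiles`, and outliers are replaced with the mean of the non-outlying values. The function names are hypothetical.

```python
from statistics import mean, quantiles


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed_mean = mean(v for v in values if v is not None)
    return [observed_mean if v is None else v for v in values]


def replace_outliers_with_mean(values):
    """Replace values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] with a mean.

    Assumption: the replacement is the mean of the non-outlying values.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inlier_mean = mean(v for v in values if low <= v <= high)
    return [v if low <= v <= high else inlier_mean for v in values]
```

For example, `replace_outliers_with_mean([1, 2, 3, 4, 100])` has quartiles 2 and 4, so the fences are -1 and 7 and the value 100 is replaced by 2.5.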

Continuous data were described as mean ± standard deviation or median and interquartile range (IQR), and group comparisons were performed using Student's t test or the Mann–Whitney U test. Categorical data were expressed as frequency and percentage, and groups were compared using the Chi-square test or Fisher's exact test.

Lasso and stepwise regression were performed using the "glmnet" and "MASS" packages of R 4.2.3. The BPNN model was constructed with the "nnet" package of R 4.2.3. The model's predictive performance was evaluated in terms of discrimination and calibration. Discrimination was assessed by accuracy, sensitivity, specificity, and area under the curve (AUC). Calibration was assessed with calibration curves.
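The discrimination measures named above can be computed from predicted probabilities as in the following sketch. The 0.5 classification threshold and the function names are assumptions; the AUC is computed as the proportion of (infected, non-infected) pairs in which the infected patient receives the higher score, with ties counted as one half.

```python
def confusion_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a probability threshold."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


def auc(y_true, y_prob):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Both functions assume that each class appears at least once in `y_true`; otherwise sensitivity, specificity, or AUC is undefined.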
