Hybridizing machine learning in survival analysis of cardiac PET/CT imaging

Study population

We retrospectively evaluated 951 consecutive symptomatic patients with an intermediate pre-test probability of CAD (categorized in line with the 2019 ESC CCS guidelines) who had been referred for CCTA at the PET Centre of the Turku University Hospital in Finland. According to the local sequential imaging protocol, patients with suspected obstructive stenosis on CCTA underwent downstream stress PET perfusion imaging to evaluate the hemodynamic significance of the stenosis.3 Patients with documented CAD, prior MI, or prior revascularization (PCI or CABG) were not included in the present study. After exclusion of patients with non-diagnostic imaging results or failure to adhere to the local sequential imaging protocol, data from 739 subjects were analyzed. A report of the CCTA and PET findings by a cardiovascular imaging specialist, as recommended by SCCT guidelines, was provided to the treating physician to guide patient management. The study was performed in accordance with the Declaration of Helsinki. The Ethics Committee of the Hospital District of Southwest Finland waived the need for written informed consent owing to the retrospective observational study design.

Clinical variables

Demographic (sex and age) and clinical data (hypertension, dyslipidemia, smoking status, type 2 diabetes mellitus, family history of cardiovascular disease, chest complaints, and dyspnea) were extracted from the electronic medical records system and are summarized in Table 1.

Table 1 Characteristics of baseline risk factors and imaging expert interpretations

Follow-up and clinical outcomes

The analyzed endpoints were recorded in binary form for the occurrence of MI or all-cause death using the registries of the Finnish National Institute for Health and Welfare and the Centre for Clinical Informatics of the Turku University Hospital. Identified events were manually confirmed by the investigators through the electronic medical records system following the European Society of Cardiology recommendations. There were no missing data in this regard for our sample.

Imaging acquisition

Patients underwent a sequential imaging protocol with CCTA and selective 15O-water stress PET perfusion imaging. The corresponding acquisition protocols have been described previously.1 CCTA scans were performed on a 64-row PET/CT scanner (GE Discovery VCT or GE D690, General Electric Medical Systems, Waukesha, Wisconsin). Prior to acquisition, 0-30 mg of metoprolol was administered intravenously to achieve a target heart rate of <60 bpm, and 1.25 mg of isosorbide dinitrate aerosol, or alternatively 800 mg of sublingual nitrate, was also administered. CCTA used an intravenously administered low-osmolal iodine contrast agent (48-155 mL at 320-400 mg iodine·mL−1) and prospective ECG-triggered acquisition.

Dynamic quantitative PET myocardial perfusion imaging during pharmacological stress (using adenosine infusion of 140 µg·kg−1·min−1 as vasodilator) was performed as previously described.1 The mean injected activity of 15O-water was 1042 ± 117 MBq. All patients were instructed to refrain from methyl-xanthine-containing food, beverages, and medications (e.g., coffee, chocolate, tea) for 24 hours before the PET study.

Image analysis

CCTA data were analyzed according to the segmentation system recommended in the SCCT guidelines.4 In detail, the coronary artery tree was described by: (1) its system dominance [right, left or co-dominance], (2) the anatomical presence or absence of each theoretical coronary segment, (3) the presence or absence of atherosclerotic plaque per segment, (4) the visually estimated percentage of luminal narrowing [0%, <50%, 50-69%, 70-99% or 100%], and (5) the complete, partial or absent calcification of an atherosclerotic plaque when present. The database was coded so that normal coronary segments could be distinguished from anatomically absent ones (0 and 1 values, respectively).

From these recorded variables, we implemented the statistical CCTA score proposed by de Graaf et al., which is generated by calculating plaque, stenosis, and segment weight factors for each coronary segment and summing the individual segment scores, as described previously.16 Because the de Graaf score integrates these variables linearly, it served as an example comparator representing a non-ML approach to CCTA scoring.

The Coronary Calcium Score17 was quantified globally by the Agatston method and stored as a single continuous variable.

PET data were quantitatively analyzed with Carimas software (Turku PET Centre, Finland) using a one-tissue (two-compartment) kinetic model to estimate absolute stress myocardial blood flow (MBF) in the standardized 17-segment AHA model (segments 2 and 3 were excluded because they correspond to the membranous interventricular septum).2 The finding of at least one segment with a stress MBF of <2.4 mL·g−1·min−1 was considered abnormal and indicative of myocardial ischemia.3 Missing values were treated as null parameters and entered accordingly.
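The per-patient ischemia rule described above can be illustrated with a minimal Python sketch. It is not the Carimas implementation; the function name, the list layout (position 0 holds AHA segment 1), and the choice to skip segments without a measurement are assumptions made only for this example.

```python
from typing import Optional, Sequence

ISCHEMIA_THRESHOLD = 2.4    # mL/g/min, stress MBF cutoff stated in the text
EXCLUDED_SEGMENTS = {2, 3}  # AHA segments left out of the analysis


def is_ischemic(stress_mbf: Sequence[Optional[float]]) -> bool:
    """Return True if any analyzed segment has stress MBF below the cutoff.

    `stress_mbf` holds 17 values ordered by AHA segment number.
    None marks a segment without a measurement; skipping it is an
    assumption of this sketch, not a rule taken from the study.
    """
    if len(stress_mbf) != 17:
        raise ValueError("expected 17 AHA segments")
    for segment, flow in enumerate(stress_mbf, start=1):
        if segment in EXCLUDED_SEGMENTS or flow is None:
            continue
        if flow < ISCHEMIA_THRESHOLD:
            return True
    return False
```

With this layout, a low value in segment 2 or 3 never triggers the flag, which mirrors the exclusion of the membranous septum from the 15 analyzed PET variables.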

In total, this segmental anatomo-functional description delivered 58 CCTA anatomically descriptive variables, a single continuous Calcium Score variable, and 15 PET variables. All image interpretations were performed by an experienced cardiovascular imaging physician and recorded in a standardized reporting system.

Machine learning

Our machine learning workflow (Figure 1) was generated in accordance with current state-of-the-art recommendations and the Proposed Requirements for Cardiovascular Imaging-Related Machine Learning Evaluation (PRIME) Checklist,18 which can be consulted in the supplement.

Figure 1 Machine learning and survival workflow. CCS, chronic coronary syndromes; CCTA, coronary computed tomography angiography; ML, machine learning; PET, positron emission tomography

Data pre-processing and re-sampling

Missing values in the clinical data were imputed as negative (risk factor absent) for smoking (N = 91 in the training and N = 28 in the test dataset), diabetes (N = 116 in training and N = 47 in test), hypertension (N = 80 in training and N = 43 in test), and dyslipidemia (N = 85 in training and N = 40 in test). The variables sex and age had no missing values. Imaging variables had no missing values except when the subject did not proceed to the PET scan. In these cases (N = 259 in the training and N = 126 in the test dataset), all PET variables were set to zero.

The ML analysis used the gradient boosting machine (GBM)5 algorithm, chosen for its robustness and stability. The 739 subjects were randomly assigned to a training or a testing dataset (with a 2:1 split ratio, yielding datasets with N = 493 and N = 246).19,20 A 10-fold cross-validation policy was applied within the training dataset (N = 493) to tune model parameters during ML training and validation. Model building and optimization were performed exclusively in the training dataset, while the final comparative performance evaluations were conducted in the test dataset. Feature selection of the most informative CCTA and PET variables was performed during model building as a recommended measure to promote dimensionality reduction.
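The data partitioning described above can be sketched as follows. This is an illustration, not the study code, which was written in R. The function names, the fixed random seed, and the use of unstratified shuffling are assumptions, since the text does not state whether the split was stratified by outcome.

```python
import random
from typing import Dict, List, Optional, Tuple


def split_train_test(n_subjects: int, n_train: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly partition subject indices into training and test sets."""
    indices = list(range(n_subjects))
    random.Random(seed).shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


def k_fold_indices(train_ids: List[int], k: int = 10, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (fit, validation) index pairs drawn from the training set only."""
    shuffled = list(train_ids)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    pairs = []
    for i, validation in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        pairs.append((fit, validation))
    return pairs


def impute_missing(record: Dict[str, Optional[float]], fields: List[str], fill: float = 0.0) -> Dict[str, Optional[float]]:
    """Replace None in the listed fields with `fill` (0 = negative / absent)."""
    return {key: (fill if key in fields and value is None else value)
            for key, value in record.items()}


# 739 subjects split 2:1 into 493 training and 246 test subjects.
train_ids, test_ids = split_train_test(739, 493)
# Cross-validation folds never touch the held-out test subjects.
cv_pairs = k_fold_indices(train_ids, k=10)
```

Keeping the fold construction inside the training indices is what allows the test dataset to serve as an untouched basis for the final comparison.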

ML-integrated scores

ML modeling was performed as described above, separately for the CCTA (58) and PET (17) variables. The outcome for both ML models was MI or all-cause death. The resulting continuous output (a pseudo-probability ranging from 0 to 1) integrated the extensive imaging information into a single ML-CCTA score and a single ML-PET score. These ML-integrated imaging scores were then used in the subsequent survival analysis (see below).

(Hybridized) survival analysis

Continuous variables are presented as mean and standard deviation (SD). Categorical variables are presented as counts and corresponding percentages. Differences in baseline characteristics between the training and testing datasets were tested using the Wilcoxon rank sum test for continuous variables and the chi-squared test for categorical ones.

The isolated performance of the integrated ML scores (ML-CCTA and ML-PET) was evaluated through the area under the curve (AUC) obtained from receiver operating characteristic (ROC) analyses.
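As an illustration of the AUC used here, the sketch below computes it through its rank interpretation: the probability that a randomly chosen patient with an event receives a higher score than a randomly chosen patient without one. The function name and the example inputs are hypothetical, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], events: Sequence[int]) -> float:
    """AUC as the probability that an event case outranks a non-event case.

    A tie between an event score and a non-event score counts as one half.
    """
    positives = [s for s, e in zip(scores, events) if e == 1]
    negatives = [s for s, e in zip(scores, events) if e == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

A value of 0.5 corresponds to a score that ranks patients no better than chance, and 1.0 to a score that ranks every event case above every non-event case.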

The prognostic analyses were performed through Cox regression modeling21 for the occurrence of the previously described composite endpoint (i.e., MI or all-cause death). Predictive performance in test data was illustrated using Kaplan-Meier curves accompanied with log-rank test P values. The resulting hazard ratios were expressed along with their 95% confidence intervals.
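For readers less familiar with Kaplan-Meier curves, the following minimal sketch shows the product-limit estimate that underlies them. It is a didactic stand-in and does not reproduce the R survival package used in the study; the function name and the input layout are assumptions.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (event time, survival probability) pairs of the product-limit estimate.

    events[i] is 1 if the endpoint occurred at times[i] and 0 if the
    subject was censored at that time.
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        n_events = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        survival *= 1.0 - n_events / at_risk
        curve.append((t, survival))
    return curve
```

Censored subjects contribute to the at-risk count until their censoring time and then drop out without lowering the curve, which is why the estimate remains valid under incomplete follow-up.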

The independent and comparative significance of the ML-integrated scores was evaluated through the following three predictive models: (1) Hybridized ML Model: considering clinical variables, ML-CCTA score and ML-PET score, (2) Expert-based Model: considering clinical variables, CCTA expert interpretation and PET expert interpretation, and (3) Calcium score-based Model: considering global Calcium Score. Performance was evaluated through the concordance index (CI) after dichotomizing each model's predictions at a threshold determined by Youden’s J statistic in the training dataset.22 Concordance indexes were compared using Student’s t test.23,24 A two-tailed P value < .05 was considered statistically significant.
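The two quantities named above can be sketched in a few lines. The first function selects the cutoff that maximizes Youden's J (sensitivity + specificity - 1), and the second computes Harrell's concordance index. This is a simplified illustration with hypothetical function names; the study itself relied on the R package survcomp, whose handling of ties and censoring may differ from this sketch.

```python
from typing import Sequence


def youden_threshold(scores: Sequence[float], events: Sequence[int]) -> float:
    """Return the cutoff maximizing J = sensitivity + specificity - 1."""
    n_pos = sum(1 for e in events if e == 1)
    n_neg = len(events) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both outcome classes are required")
    best_j, best_cut = -1.0, min(scores)
    for cut in sorted(set(scores)):
        # A score at or above the cutoff is called positive.
        tp = sum(1 for s, e in zip(scores, events) if e == 1 and s >= cut)
        tn = sum(1 for s, e in zip(scores, events) if e == 0 and s < cut)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_j, best_cut = j, cut
    return best_cut


def concordance_index(times: Sequence[float], events: Sequence[int], risk: Sequence[float]) -> float:
    """Harrell's C: share of comparable pairs ranked in the right order.

    A pair is comparable when the subject with the shorter time had an
    event; it is concordant when that subject also has the higher risk.
    """
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if events[i] != 1:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

To mirror the dichotomization step, the cutoff would be estimated on training scores only and then applied to the test scores (for example, `[1 if s >= cut else 0 for s in test_scores]`) before those binary predictions are passed to `concordance_index` as `risk`.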

Statistical and machine learning analyses were implemented in R (version 3.5.3) with the packages survival (version 2.44-1.1), survcomp (version 1.36.1), and gbm (version 2.1.5) for GBM.
