Nomograms integrating CT radiomic and deep learning signatures to predict overall survival and progression-free survival in NSCLC patients treated with chemotherapy

Patients and clinicopathologic features

A total of 187 patients with NSCLC were enrolled in this study according to the inclusion and exclusion criteria. The demographic and histopathological characteristics of the enrolled patients are shown in Table 1. All patients were randomly allocated to two cohorts: 126 patients (mean (SD) age, 62.43 (12.15) years; median age, 55 years; 67 (53.2%) female) in the training cohort, and 61 patients (mean (SD) age, 60.74 (13.25) years; median age, 50 years; 35 (57.4%) female) in the validation cohort. The median follow-up period was 28.3 (0–60.0) and 25.6 (5.8–55.0) months in the training and validation cohorts, respectively. No significant difference in OS or PFS was found between the training cohort (OS: median, 36.7 months; PFS: median, 24.2 months) and the validation cohort (OS: median, 32.5 months; PFS: median, 21.8 months). In addition, no statistically significant differences (p > 0.05) were found in the demographic characteristics (sex, age, smoking status, histopathology, tumor location, and TNM stage) between the two cohorts.

Table 1 Demographic and histopathologic characteristics of study patients

Radiomic features and signatures

Phenotypic features were extracted from the intra- and peritumoral regions of the CT images of each patient acquired before chemotherapy. For the inter-observer reproducibility of segmentation by the two radiologists, the Dice coefficient was 0.86 ± 0.04, and the over- and under-segmentation errors of the segmented tumor volume were 0.19 ± 0.11 and 0.26 ± 0.11, respectively.
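As an illustration of the agreement measures reported above, the following minimal sketch computes the Dice coefficient between two segmentations represented as sets of voxel coordinates. The Dice formula is standard; the over- and under-segmentation errors are computed here under an assumed definition (voxels present in only one mask, relative to the reference volume), since the exact definition is not given in this section. The function names and the toy masks are hypothetical.

```python
def dice_coefficient(voxels_a, voxels_b):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two sets of voxel coordinates."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2 * len(a & b) / (len(a) + len(b))


def over_under_segmentation(voxels_test, voxels_ref):
    """ASSUMED definitions (not taken from the paper):
    over  = voxels only in the test mask / reference volume
    under = voxels only in the reference mask / reference volume
    """
    test, ref = set(voxels_test), set(voxels_ref)
    return len(test - ref) / len(ref), len(ref - test) / len(ref)


# Toy 2D masks: two 4x4 squares, the second shifted down by one row.
reader_1 = {(r, c) for r in range(1, 5) for c in range(1, 5)}
reader_2 = {(r, c) for r in range(2, 6) for c in range(1, 5)}

dice = dice_coefficient(reader_1, reader_2)
over, under = over_under_segmentation(reader_1, reader_2)
print(f"Dice={dice:.2f}, over={over:.2f}, under={under:.2f}")
```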

In the training cohort, 1688 radiomic features were obtained from the intra- and peritumoral regions of each patient. Subsequently, 12 (intratumoral) and 9 (peritumoral) significant features were screened using the LASSO Cox proportional hazards regression model. The weights of the 12 selected features were used to build the signature (S1) of the intratumoral region, and the weights of the nine selected features were used to build the signature (S2) of the peritumoral region. For S1 and S2, cutoff thresholds of 0.27 and 0.70, respectively, were obtained by X-tile using the maximum chi-squared log-rank value to stratify the NSCLC patients into over-expression and under-expression groups.
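The construction of a signature from LASSO-selected features can be sketched as a weighted sum of feature values followed by dichotomization at the cutoff. This is a minimal illustration, assuming the signature is a linear combination of the selected features and that scores above the cutoff are labeled over-expression; the feature names, weights, and values below are hypothetical and are not the ones selected in this study.

```python
def signature_score(features, weights):
    """Linear signature: sum of (LASSO weight x feature value) over selected features."""
    return sum(weights[name] * features[name] for name in weights)


def stratify(score, cutoff):
    """ASSUMPTION: scores above the cutoff are labeled over-expression."""
    return "over-expression" if score > cutoff else "under-expression"


# Hypothetical weights for three selected features (the study selected 12 and 9).
weights = {"feature_a": 0.5, "feature_b": -0.2, "feature_c": 0.1}
patient = {"feature_a": 1.0, "feature_b": 0.5, "feature_c": 2.0, "unused_feature": 3.0}

score = signature_score(patient, weights)   # 0.5 - 0.1 + 0.2
group_s1 = stratify(score, cutoff=0.27)     # cutoff reported for S1
group_s2 = stratify(score, cutoff=0.70)     # cutoff reported for S2
print(round(score, 2), group_s1, group_s2)
```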

Deep learning features and signatures

Deep learning features were extracted from each pretrained model (AlexNet, VGG16, and ResNet34) in the training cohort. Following the backbone module, a global average pooling layer, max pooling layer, and fully connected layer were used to obtain the signatures (S3, S4, and S5). A cutoff for each signature was then determined using X-tile to stratify the NSCLC patients into over-expression and under-expression groups; the cutoff values of S3, S4, and S5 were −0.83, 0.33, and 0.62, respectively.
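The cutoff selection can be illustrated with a small sketch that scans candidate thresholds and keeps the one with the largest two-group log-rank chi-squared statistic. This is not the X-tile software itself, only an assumed, simplified version of the same idea; the function names, the minimum group size, and the toy data are hypothetical.

```python
def logrank_chi2(times, events, in_group_1):
    """Two-group log-rank chi-squared statistic (1 degree of freedom)."""
    observed = expected = variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)                       # at risk overall
        n1 = sum(1 for ti, g in zip(times, in_group_1) if ti >= t and g)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)  # events at t
        d1 = sum(1 for ti, e, g in zip(times, events, in_group_1)
                 if ti == t and e and g)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return (observed - expected) ** 2 / variance if variance > 0 else 0.0


def best_cutoff(scores, times, events, min_group=2):
    """Scan observed scores as candidate cutoffs; return (cutoff, chi2) with max chi2."""
    best = (None, -1.0)
    for c in sorted(set(scores)):
        group = [s > c for s in scores]
        if min(sum(group), len(group) - sum(group)) < min_group:
            continue  # skip splits that leave a group too small
        chi2 = logrank_chi2(times, events, group)
        if chi2 > best[1]:
            best = (c, chi2)
    return best


# Hypothetical signature scores, follow-up times (months), and event indicators.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
times = [5, 6, 7, 8, 20, 22, 24, 26]
events = [1, 1, 1, 1, 1, 1, 1, 1]
cutoff, chi2 = best_cutoff(scores, times, events)
print(cutoff, round(chi2, 2))
```

Because the cutoff is chosen to maximize the test statistic, the resulting p value is optimistic; this is one reason an independent validation cohort is needed.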

Independent prognostic factors in the training set

For both OS and PFS, a univariable unadjusted Cox analysis was performed for the following factors: sex, age, smoking status, histopathology, tumor location, TNM stage, S1, S2, S3, S4, and S5 in the training cohort. For OS, TNM stage (HR (95% CI): III, 1.49 (1.03–2.21), p < 0.05; IV, 2.28 (1.19–4.36), p < 0.05), S1 (HR (95% CI), 0.56 (0.33–0.95), p < 0.05), S2 (HR (95% CI), 2.48 (1.56–3.94), p < 0.001), and S3 (HR (95% CI), 0.43 (0.26–0.69), p < 0.001) were identified as statistically significant prognostic features. For PFS, TNM stage (HR (95% CI): III, 1.45 (1.03–2.38), p = 0.14; IV, 2.15 (1.13–4.13), p < 0.05), S1 (HR (95% CI), 0.58 (0.34–0.99), p < 0.05), S2 (HR (95% CI), 2.45 (1.53–3.94), p < 0.001), and S3 (HR (95% CI), 0.45 (0.28–0.74), p < 0.05) were identified as statistically significant prognostic features.
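To make the relationship between a Cox coefficient and the reported hazard ratios concrete, the sketch below recovers the approximate coefficient and standard error from a reported HR and 95% CI, then rebuilds the interval and a Wald p value. It assumes a Wald-type interval, HR = exp(β) with bounds exp(β ± 1.96·SE); because the published values are rounded, the results are approximate. The function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def beta_se_from_reported(hr, lower, upper):
    """Back-calculate the log hazard ratio and its SE from a reported HR and 95% CI."""
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return beta, se


def hazard_ratio_ci(beta, se):
    """HR and Wald 95% CI from a Cox coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - Z_95 * se), math.exp(beta + Z_95 * se)


def wald_p_value(beta, se):
    """Two-sided p value of the Wald z test, via the complementary error function."""
    return math.erfc(abs(beta / se) / math.sqrt(2))


# Univariable OS result reported for S2: HR 2.48 (95% CI 1.56-3.94).
beta, se = beta_se_from_reported(2.48, 1.56, 3.94)
hr, ci_low, ci_high = hazard_ratio_ci(beta, se)
p_value = wald_p_value(beta, se)
print(f"HR={hr:.2f} ({ci_low:.2f}-{ci_high:.2f}), p={p_value:.5f}")
```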

We analyzed OS and PFS in the over- and under-expression subgroups using Kaplan-Meier curves; Fig. 2A and B confirm significant differences in both OS and PFS between the two S2 subgroups. For OS, the under-expression subgroup (median, 29.2 months) tended to have a lower survival probability than the over-expression subgroup (median, 45.7 months) (p < 0.001). Similarly, for PFS, the under-expression subgroup (median, 20.3 months) tended to have a lower survival probability than the over-expression subgroup (median, 36.7 months) (p < 0.001). The Kaplan-Meier curves for TNM stage, S1, and S3 are shown in Supplemental Fig. S2, Fig. S3, and Fig. S4.

Fig. 2

Kaplan-Meier curves of S2 for over- and under-expression subgroups: (A) Kaplan-Meier curves in OS; (B) Kaplan-Meier curves in PFS.
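For readers unfamiliar with how such curves and median survival times are obtained, the following minimal sketch implements the Kaplan-Meier product-limit estimator and reads off the median as the first time the estimated survival drops to 0.5 or below. The follow-up data are hypothetical and unrelated to the study cohorts.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, S(time)) at each distinct event time."""
    survival = 1.0
    curve = []
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First event time at which S(t) <= 0.5, or None if the median is not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up (months); event = 1 means death/progression, 0 means censored.
times = [6, 12, 18, 24, 30, 36]
events = [1, 1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
median = median_survival(curve)
print(curve, median)
```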

These prognostic features were included in the multivariable Cox analysis; the results are shown in Table 2. TNM stage (HR (95% CI): III, 1.48 (1.04–2.47), p = 0.13; IV, 2.10 (1.09–4.04), p < 0.05), S2 (HR (95% CI), 2.26 (1.40–3.67), p < 0.001), and S3 (HR (95% CI), 0.48 (0.29–0.79), p < 0.05) remained significantly associated with OS. Similarly, TNM stage (HR (95% CI): III, 1.42 (1.06–2.34), p = 0.17; IV, 1.98 (1.02–3.84), p < 0.05), S2 (HR (95% CI), 2.23 (1.36–3.65), p < 0.05), and S3 (HR (95% CI), 0.55 (0.33–0.90), p < 0.05) were independently associated with PFS.

Table 2 Results of multivariable Cox regression

All significant factors were then incorporated into the prognostic model to develop individualized nomograms for OS and PFS at 3 and 5 years.

As shown in the nomogram of OS (Fig. 3A), S2 made the largest contribution to the prognosis, followed by TNM stage, S1, and S3. Similarly, in the nomogram of PFS (Fig. 3B), S2 made the largest contribution to the prognosis, followed by S1 and S3.

Fig. 3

Nomograms for predicting survival: (A) 3- and 5-year OS probability; (B) 3- and 5-year PFS probability.

The calibration curves (Fig. 4A and B) obtained from the individualized nomogram demonstrated good agreement between the predicted and observed outcomes for both the 3-year and 5-year OS in the training and independent validation cohorts. The performance in each cohort is shown on the plot relative to the 45-degree line, which represents perfect prediction. The mean absolute errors for the 3-year and 5-year OS were 0.068 and 0.072 in the training cohort, and 0.054 and 0.052 in the independent validation cohort. The mean absolute errors for the 3-year and 5-year PFS were 0.081 and 0.077 in the training cohort, and 0.046 and 0.041 in the independent validation cohort. For OS, the Harrell C-index (95% CI) of the nomogram was 0.74 (0.70–0.79) for the training cohort and 0.72 (0.67–0.78) for the validation cohort. For PFS (Fig. 4C and D), the Harrell C-index (95% CI) of the nomogram was 0.71 (0.68–0.81) for the training cohort and 0.72 (0.66–0.79) for the validation cohort.

Fig. 4

Calibration plots of the nomograms: (A) OS in the training cohort, (B) OS in the validation cohort, (C) PFS in the training cohort, (D) PFS in the validation cohort
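The Harrell C-index used to summarize discrimination can be sketched as the fraction of comparable patient pairs in which the patient with the shorter observed survival also has the higher predicted risk. The implementation below is a minimal illustration that counts tied risk scores as half-concordant and treats a pair as comparable only when the shorter time ends in an event; the data are hypothetical, and confidence intervals (for example by bootstrapping) are not shown.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C: concordant comparable pairs / all comparable pairs."""
    concordant = tied = comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / comparable


# Hypothetical survival times (months), event indicators, and nomogram risk scores.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.6, 0.1]
c_index = harrell_c_index(times, events, risk)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk against survival time.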

As shown in Fig. 5, we compared the performance of the nomograms with that of the TNM stage using decision curve analysis (DCA). Our model provided a larger overall net benefit in predicting both OS and PFS than the TNM stage alone (C-index (95% CI) of the TNM stage: OS, 0.64 (0.58–0.69); PFS, 0.62 (0.54–0.67)).

Fig. 5

Decision curve analysis of nomogram and TNM stage in the validation cohort: (A) Decision curve analysis of OS; (B) Decision curve analysis of PFS.
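The net benefit plotted in a decision curve can be sketched as follows. This is a simplified illustration for a binary outcome at a fixed time horizon that ignores censoring (an assumption; decision curves for survival data typically estimate event rates with Kaplan-Meier methods). At threshold probability p_t, net benefit = TP/n − (FP/n) × p_t/(1 − p_t). The predicted probabilities and outcomes are hypothetical.

```python
def net_benefit(pred_probs, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(outcomes)
    tp = sum(1 for p, y in zip(pred_probs, outcomes) if p >= threshold and y)
    fp = sum(1 for p, y in zip(pred_probs, outcomes) if p >= threshold and not y)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: treat everyone regardless of predicted risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted event probabilities and observed binary outcomes.
probs = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
outcomes = [1, 1, 0, 1, 0, 0]

nb_model = net_benefit(probs, outcomes, threshold=0.5)
nb_all = net_benefit_treat_all(outcomes, threshold=0.5)
print(round(nb_model, 3), round(nb_all, 3))
```

A decision curve repeats this calculation over a range of thresholds; a model is preferred where its curve lies above both the treat-all and treat-none (net benefit 0) strategies.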
