The survival prediction of advanced colorectal cancer received neoadjuvant therapy—a study of SEER database

Basic characteristics

As shown in Figure 1, a total of 1833 subjects were enrolled in this study, including 1759 cases from the SEER database and 74 cases from our center. The SEER dataset was randomly divided in a 3:2 ratio into a training group of 1055 cases and a testing group of 704 cases. The 74 patients from our center served as the validation group. As shown in Table 1, no statistically significant differences in clinical features were observed between the training group and the testing group, except for gender. Table 2 showed the clinical characteristics of the patients from Tangdu hospital. The proportion of LOCRC patients was close to 80%, and the tumor was located in the rectum in 75% of patients. Most tumors were moderately differentiated. Most patients received radiotherapy and chemotherapy, and low levels of LODDS and LNR were the predominant characteristics.
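The 3:2 random assignment described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `split_cohort`, the fixed seed and the use of integer case IDs are all assumptions.

```python
import random

def split_cohort(case_ids, train_fraction=0.6, seed=2024):
    """Randomly split case IDs into a training and a testing group.

    train_fraction=0.6 corresponds to a 3:2 ratio; the seed is an
    arbitrary choice made here only for reproducibility.
    """
    ids = list(case_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# 1759 SEER cases, identified here by hypothetical integer IDs
train_ids, test_ids = split_cohort(range(1759))
print(len(train_ids), len(test_ids))  # 1055 704
```

With 1759 cases, a 60% share rounds to 1055, which matches the group sizes reported above.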

Fig. 1

The detailed flowchart of the study

Table 1 Characteristics of training group and testing group

Table 2 Characteristics of validation group

Screening variables associated with survival in training group

First, univariate Cox regression, lasso regression and random forest survival (RFS) analysis were used to screen factors associated with OS in patients with advanced CRC treated with neoadjuvant therapy, and the results were shown in Fig. 2. Univariate Cox regression showed that age, marital status, location, T, N, M, stage, radiation sequence, radiotherapy, chemotherapy, time from diagnosis to treatment, CEA, perineural invasion, tumor size, LODDS, LNR and bone/liver/lung metastasis were associated with OS. Lasso regression selected location, T, M, stage, radiation sequence, perineural invasion, LODDS and LNR. In the RFS analysis, variables were ranked by importance, and those with a relative importance greater than 12% were selected as candidates: M, stage, location, chemotherapy, perineural invasion, radiation sequence, LNR, liver/lung metastasis and LODDS.
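Two of the screened variables, LNR and LODDS, are derived from lymph node counts. The text does not state how they were computed, so the sketch below uses the definitions commonly found in the literature: LNR as positive nodes over examined nodes, and LODDS as the natural log of (positive + 0.5) over (negative + 0.5). The 0.5 offset, the log base and the function names are assumptions.

```python
import math

def lnr(positive_nodes, examined_nodes):
    """Lymph node ratio: positive nodes divided by examined nodes."""
    if examined_nodes <= 0:
        raise ValueError("examined_nodes must be positive")
    return positive_nodes / examined_nodes

def lodds(positive_nodes, examined_nodes, offset=0.5):
    """Log odds of positive lymph nodes.

    The offset avoids division by zero and log(0) when either count is 0.
    """
    negative_nodes = examined_nodes - positive_nodes
    if negative_nodes < 0:
        raise ValueError("positive_nodes cannot exceed examined_nodes")
    return math.log((positive_nodes + offset) / (negative_nodes + offset))

# Hypothetical patient: 3 positive nodes out of 12 examined
print(lnr(3, 12))
print(lodds(3, 12))
```

Unlike LNR, LODDS still separates patients with no positive nodes according to how many nodes were examined, which is one reason both measures are often screened together.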

Fig. 2

Screening of variables associated with prognosis of advanced CRC treated with neoadjuvant therapy in the training group. (a) univariate Cox regression, (b) random forest survival (RFS), (c) lasso regression

Second, stepwise multivariate Cox regression was used for further screening (Tables S1-S3). Of the variables screened by univariate Cox regression, M, age, chemotherapy, CEA, perineural invasion, tumor size, LODDS, liver metastasis and radiation were retained, corresponding to an AIC of 7072.72 and a C-index of 0.695. Of the variables screened by lasso regression, only five were retained (location, T, M, perineural invasion and LNR), corresponding to an AIC of 7099.68 and a C-index of 0.67. Of the variables screened by RFS, M, chemotherapy, perineural invasion, location, LNR and liver metastasis were retained, corresponding to an AIC of 7090.51 and a C-index of 0.676. Finally, the 1-year, 3-year and 5-year AUCs were 0.765, 0.761 and 0.742 for the univariate Cox regression model, 0.673, 0.727 and 0.717 for the lasso model, and 0.726, 0.744 and 0.724 for the RFS model (Fig. 3a-c). Considering the three evaluation indexes (AIC, C-index and AUC), we chose the variables screened by univariate Cox regression to construct the survival prediction model.
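For readers unfamiliar with the two summary indexes used in this comparison, the sketch below shows how they are usually defined: AIC as 2k minus twice the log-likelihood (lower is better), and Harrell's C-index as the share of comparable patient pairs in which the patient who died earlier has the higher predicted risk (higher is better). The function names and toy data are hypothetical, and the study's own software may treat ties differently.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion; for a Cox model the partial
    log-likelihood is normally used."""
    return 2 * n_params - 2 * log_likelihood

def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times: follow-up times; events: 1 if death observed, 0 if censored;
    risk_scores: higher score means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # the earlier time in a pair must be an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data: four hypothetical patients, the third one censored
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.4, 0.2]
print(harrell_c_index(times, events, risk))
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which puts the reported range of 0.67 to 0.695 in context.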

Fig. 3

a-c. Comparison of the 1-, 3- and 5-year predictive value of the three models for advanced CRC in the training group (a. univariate Cox regression, b. lasso regression, c. RFS). d. Nomogram survival prediction model for advanced CRC treated with neoadjuvant therapy in the training group. e-f. Verification of the model in the testing and validation groups

Construction and validation of the nomogram model

A nomogram survival prediction model was constructed for advanced CRC patients treated with neoadjuvant therapy based on the variables screened by univariate Cox regression (Fig. 3d). A total score was calculated from the scores of each patient's clinical characteristics, and the corresponding 1-, 3- and 5-year survival rates were obtained. The predictive ability of the model was assessed by AUC. The AUCs for predicting 1-year OS in the training, testing and validation groups were 0.765 (0.703,0.827), 0.772 (0.697,0.847) and 0.742 (0.601,0.883), respectively; the AUCs for predicting 3-year OS were 0.761 (0.725,0.780), 0.742 (0.699,0.785) and 0.733 (0.560,0.905), respectively; and the AUCs for predicting 5-year OS were 0.742 (0.711,0.773), 0.746 (0.709,0.783) and 0.838 (0.670,0.980), respectively (Fig. 3a, e and f). All AUCs were above 0.70, so the model was considered to have some predictive value for OS in patients with advanced CRC treated with neoadjuvant therapy.
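To make the 1-, 3- and 5-year AUCs concrete, the following sketch computes an AUC at a fixed time horizon from nomogram-style risk scores. It is a simplified estimator, assumed here for illustration only: patients censored before the horizon are dropped, whereas time-dependent ROC software typically reweights them. The function name and toy data are hypothetical.

```python
def auc_at_horizon(times, events, risk_scores, horizon):
    """AUC for discriminating death by `horizon` from survival beyond it.

    Cases: death observed at or before the horizon.
    Controls: still under follow-up after the horizon.
    Patients censored at or before the horizon have unknown status and
    are excluded (a simplification).
    """
    cases, controls = [], []
    for t, e, s in zip(times, events, risk_scores):
        if e and t <= horizon:
            cases.append(s)
        elif t > horizon:
            controls.append(s)
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))

# Toy data: follow-up in months, 1-year horizon
times = [6, 10, 30, 40, 8]
events = [1, 1, 0, 1, 0]
scores = [0.9, 0.4, 0.5, 0.1, 0.7]
print(auc_at_horizon(times, events, scores, horizon=12))  # 0.75
```

In this toy example three of the four case-control pairs are ordered correctly, giving 0.75.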

Figure 4 demonstrated the accuracy of the model in the training, testing and validation groups through calibration curves. The closeness of the red line to the 45° calibration line illustrated the agreement between the predicted and observed results. We also calculated the C-index for the calibration curves. The corrected C-indexes of the 1-, 3- and 5-year calibration curves were 0.6862, 0.6860 and 0.6869 in the training group, 0.6967, 0.6964 and 0.6943 in the testing group, and 0.6979, 0.6967 and 0.6951 in the validation group. These results indicated that the nomogram model had a certain predictive ability. The DCA curve indicated that the difference between the predicted results and the actual survival status was not significant. In addition, the DCA curve was used to assess the practical clinical value of the model, and the corresponding results were displayed in Fig. 5.
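A DCA curve plots net benefit against the threshold probability. As a rough sketch of the quantity being plotted, the code below uses the standard net-benefit formula for a binary outcome (for example, death by a fixed time point); handling of censored follow-up is omitted, and the function names and toy data are assumptions rather than the authors' implementation.

```python
def net_benefit(predicted_risks, outcomes, threshold):
    """Net benefit of acting on patients whose predicted risk is at or
    above the threshold: TP/n - FP/n * threshold / (1 - threshold)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(outcomes)
    pairs = list(zip(predicted_risks, outcomes))
    tp = sum(1 for p, y in pairs if p >= threshold and y == 1)
    fp = sum(1 for p, y in pairs if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: every patient is treated as high risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Toy data: predicted risks and observed outcomes (1 = event)
risks = [0.9, 0.8, 0.3, 0.2]
outcomes = [1, 0, 1, 0]
print(net_benefit(risks, outcomes, 0.25))
print(net_benefit_treat_all(outcomes, 0.25))
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all line and zero, the treat-none line.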

Fig. 4

Prediction accuracy of the nomogram model in the training, testing and validation groups, assessed by calibration curves

Fig. 5

Clinical application value of the nomogram model in the training, testing and validation groups, evaluated by DCA curves

Subsequently, the predictive value of the model was evaluated in different N stages and LODDS grades (Table 3). The model had a higher predictive value for 1-year and 3-year OS in patients with N2 than in those with N1, and its predictive value for 1-year OS in patients with N2a reached 0.862 (0.801,0.923). Across LODDS levels, the model had a higher predictive value for 1-year OS in patients with low LODDS (0.729 (0.644,0.814)), but a higher predictive value for 3-year and 5-year OS in patients with high LODDS. The results were shown in supplementary Figures S1 and S2.

Table 3 The prediction value of nomogram model in subgroups of N stage and LODDS

Prediction of nomogram model for EOCRC and LOCRC

This study also explored the predictive value of the established nomogram model for EOCRC and LOCRC. A total of 1833 patients from the SEER database and Tangdu hospital were included, of whom 405 had EOCRC and 1428 had LOCRC. Given the large difference in sample size between the two groups, PSM was conducted to balance them. The matching variables were gender, tumor location and stage, and the caliper value was set to 0.01. After matching, 405 EOCRC patients and 810 LOCRC patients were included. Figure 6 showed that the predictive values of the model for 1-, 3- and 5-year OS in patients treated with neoadjuvant therapy were 0.853 (0.766,0.939), 0.784 (0.728,0.839) and 0.767 (0.718,0.815) for EOCRC, and 0.768 (0.711,0.825), 0.743 (0.702,0.784) and 0.735 (0.699,0.770) for LOCRC.
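The matched sample sizes (405 and 810) are consistent with 1:2 matching, although the text does not say so explicitly. Under that assumption, the sketch below shows greedy nearest-neighbour matching without replacement on precomputed propensity scores with a caliper of 0.01. How the scores were estimated, the matching order and all names in the code are hypothetical.

```python
def match_with_caliper(treated, controls, ratio=2, caliper=0.01):
    """Greedy nearest-neighbour propensity score matching without replacement.

    treated, controls: dicts mapping patient ID to propensity score.
    Each treated patient receives up to `ratio` controls whose scores
    lie within `caliper` of their own score.
    """
    available = dict(controls)
    matches = {}
    for tid, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        candidates = sorted(
            (abs(c_score - t_score), cid)
            for cid, c_score in available.items()
            if abs(c_score - t_score) <= caliper
        )
        chosen = [cid for _, cid in candidates[:ratio]]
        for cid in chosen:
            del available[cid]  # each control is used at most once
        matches[tid] = chosen
    return matches

# Hypothetical propensity scores for EOCRC (E*) and LOCRC (L*) patients
eocrc = {"E1": 0.30, "E2": 0.50}
locrc = {"L1": 0.301, "L2": 0.295, "L3": 0.35,
         "L4": 0.502, "L5": 0.508, "L6": 0.52}
print(match_with_caliper(eocrc, locrc))
```

Here L3 and L6 are left unmatched because their scores differ from every EOCRC patient by more than the caliper.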

Fig. 6

Predictive value of the nomogram model for EOCRC and LOCRC treated with neoadjuvant therapy
