Mapping the immunological battlefield in gastric cancer: prognostic implications of an immune gene expression signature

4.1 Identification of the differentially expressed immune-related genes and their enriched functions

Through overlapping 8833 DEGs and 2553 ImmPort- and InnateDB-obtained IRGs (Immune-related genes), 493 DEIRGs (differentially expressed immune related genes) were obtained (Fig. 2A). The results regarding Gene Ontology (GO) terms enriched in DEIRGs show that 493 DEIRGs were significantly enriched in several biological processes such as leukocyte migration, leukocyte chemotaxis, myeloid leukocyte migration, neutrophil chemotaxis; several cellular components for example collagen-containing extracellular matrix, immunoglobulin complex, and secretary granule membrane; several molecular functions such as cytokine activity, cytokine receptor binding, chemokine receptor binding, and chemokine activity (Fig. 2B). The KEGG enrichment analysis shows that DEIRGs were enriched in several key signaling pathways, for instance, MAPK signaling pathway, PI3K-Akt signaling pathway, NF-kappa B signaling pathway, Rap1 signaling pathway, TGF-beta signaling pathway, EGFR tyrosine kinase inhibitor resistance, and p53 signaling pathway (Fig. 2C).

Fig. 2figure 2

The functional enrichment analysis of DEIRGs and identification of hub DEIRGs. A Venn diagram shows that 493 DEIRGs were overlapped between 8833 DEGs and 2533 IRGs; B Gene Ontology (GO) enrichment analysis of the DEIRG (p < 0.05); C Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the DEIRG (p < 0.05); D Weighted gene coexpression network analysis (WGCNA) of immune-related differentially expressed genes with a soft threshold β = 5. Gene dendrogram representing the hierarchical clustering of genes based on their expression patterns; E The heatmap illustrates the module-trait relationships between the identified gene modules and trait-sample type (healthy control/tumor). Each row represents a module, and each column represents a trait. The level of correlation is indicated by color, with red colors representing positive correlations and blue colors representing negative correlations. F The PPI network of the genes in the turquoise module

4.2 Identification of the key module by performing WGCNA analysis

Figure 2D shows that the gene dendrogram was generated using the WGCNA algorithm (Weighted Gene Co-expression Network Analysis) with a soft threshold β = 5. The module colors were assigned based on the unsupervised clustering algorithm employed by WGCNA. The module colors in the dynamic tree cut figure reflect the different co-expression modules. Notably, the turquoise and blue module represent the largest module, suggesting these two modules’ potential importance in the biological processes under investigation. Figure 2E shows the associations between gene modules (e.g., turquoise, yellow, blue, brown, and grey) and various phenotypic traits (healthy control samples/tumor samples). Positive or negative correlations indicate the potential functional relevance of specific gene modules to the corresponding traits. For instance, the blue module exhibits a significant positive correlation with Trait-tumor samples (correlation value = 0.27, p = 1e−07), suggesting its potential involvement in the biological processes underlying tumor samples. Conversely, the turquoise module with Trait-tumor samples (correlation value = – 0.57, p = 4e–34), implying a potential inhibitory or opposing relationship. Figure 2F shows the PPI network constructed by the genes in the turquoise module. The gene nodes with the highest degree were shown to be ANGPTL1, CSRP1, TPM2, PGR, ZFPM2, ANGPTL7, JAM3, MLNR, ADCYAP1R1, DES, and FGF2.

4.3 Establishment of a prognostic index

The uni- and multi- variate Cox regression analyses were conducted to identify the independent prognostic DEIRGs for predicting the overall survival outcomes of STAD patients. Figure 3A shows that hub DEIRGs (e.g., PROCR (p = 0.039), S100A12 (p = 0.019), SLC22A17 (p = 0.011)) significantly affected the overall survival outcomes of STAD patients. Figure 3B shows the significantly mutated prognostic genes (e.g., SLIT2 (mutation rate = 7%), ZFPM2 (mutation rate = 3%), ANGPT1 (mutation rate = 3%), and NPR3 (mutation rate = 3%)) among the univariate Cox regression analysis-obtained statistically significant DEIRGs. The multivariate Cox regression analysis confirmed that RNASE2 (p = 0.026), CGB5 (p = 0.002), CTLA4 (p = 0.012), and DUSP1 (p = 0.008) were independent prognostic genetic factor after adjusted for other clinicopathologic factors (Fig. 3C).

Fig. 3figure 3

The uni- and multi-variate Cox regression analysis of DEIRGs. A Univariate Cox regression analysis of hub DEIRG; B Significantly mutated prognostic genes among the univariate Cox regression analysis-obtained statistically significant DEIRGs. Samples (columns) are arranged to emphasize mutual exclusivity among mutations. The right shows the mutation percentage, and the top shows the overall number of mutations. The color coding indicates the mutation type. C Multivariate Cox regression analysis of the DEIRGs which were significant in the univariate Cox analysis (p < 0.05)

Several prior studies using a similar methodology to develop prognostic gene signatures have described the thought behind choosing the Cox model for developing the IRGPI formula [13,14,15,16,17]. Based on the results of multivariate Cox regression analysis, a prognostic index for STAD tumor samples was established based on the formula immune-related gene prognostic index (IRGPI) = expression level of RNASE2* 0.277 + expression level of CGB5* 0.255 + expression level of INHBE*0.501 + expression level of PTGER3*(– 0.348) + expression level of CTLA4*(– 0.341) + expression level of DUSP1*0.260 + expression level of APOA1*0.071 +  + expression level of CD36*0.234.

4.4 Comparison of the survival probability between two IRGPI-defined clusters

Figure 4A, B shows that univariate and multivariate Cox regression analysis confirmed that IRGPI genes were independent prognostic factors (p < 0.001) after adjusting for clinicopathologic factors including age, gender, grade, and stage. The cutoff to categorize STAD samples into IRGPI-high and low groups was determined by taking the median value of the IRGPI risk scores as the threshold. The Kaplan–Meier survival analysis based on the training cohort dataset (TCGA-STAD) shows that the patients in the IRGPI-high expression cluster indicated lower survival probability and thus worse prognosis; while the patients in the IRGPI-low expression cluster represented higher survival probability and thus favorable prognosis (p < 0.001, Fig. 4C). The results validated by using the GSE84437 as the validation cohort dataset obtained the same trend by showing a higher survival probability in the IRGPI-low expression STAD patients compared with the IRGPI-high expression STAD patients (p−0.024 < 0.05, Fig. 4D).

Fig. 4figure 4

Prognostic analysis based on immune-related gene prognostic index (IRGPI). A Univariate Cox regression analysis of clinicopathologic factors and the IRGPI in train cohort (TCGA-STAD); B Multivariate Cox regression analysis of the factors significant in the univariate Cox regression analysis (p < 0.05); C K–M survival analysis of the IRGPI subgroups in the training cohort (TCGA-STAD); D K–Msurvival analysis of the IRGPI subgroups in the validation cohort (GSE84437)

4.5 Identification of the biological functions of IRGPI-based subgroups

GSEA was performed to determine the gene sets enriched in two IRGPI subgroups, including IRGPI-high and IRGPI-low subgroups. The gene sets of the IRGPI-high samples were enriched in complement and coagulation cascades, ECM receptor interaction, focal adhesion, neuroactive ligand receptor interaction, and PPAR signaling pathway (Fig. 5A). The gene sets of the IRGPI-low samples were enriched in Aminoacyl-TRNA-biosynthesis, cell cycle, DNA replication, pyrimidine metabolism, and spliceosome (Fig. 5B). Afterward, the gene mutations were analyzed to identify the significantly mutated genes in the mutated STAD samples of two different IRGPI-based subgroups. Figure 5C, D shows that the top 10 genes with the highest mutation rates in both the IRGPI-high and -low subgroups was identified to be TNN, TP53, MUC16, ARID1A, LRP1B, SYNE1, FLG, FAT4, CSMD3, and PCLO.

Fig. 5figure 5

Molecular characteristics of different IRGPI subgroups. A Gene set enrichment analysis in IRGPI-high subgroup (p < 0.05, FDR < 0.25); B Gene set enrichment analysis in IRGPI-low subgroup (p < 0.05, FDR < 0.25); C, D Significantly mutated genes in the mutated GC samples of IRGPI-high (C) and IRGPI-low (D) subgroups

4.6 Immune characteristics of two different IRGPI subgroups

The Wilcoxon test was utilized to compare the relative percentage of immune cells in different IRGPI subgroups (Fig. 6A). Figure 6B shows that monocytes, macrophage M2, and neutrophils were more abundant in the IRGPI-high subgroup. A predominant M2 macrophage population may suppress anti-tumor immune responses and promote a favorable environment for tumor growth. In the IRGPI-low subgroup, T cells CD8, T cells CD4 memory activated, T cells follicular helper, and macrophage M1 were more abundant. An increased presence of M1 macrophages can drive an effective anti-tumor immune response. Such findings indicate that the IRGPI-high subgroup is more aggressive while the IRGPI-low subgroup is less aggressive.

Fig. 6figure 6

The characteristics of immune cell infiltration in different IRGPI subgroups. A The IRGPI grouping and proportions of immune cells for GC patients in TCGA cohort. B The fractions of TME cells in different IRGPI subgroups. The scattered dots represent the immune score of the two subgroups. The thick lines represent the median value. The bottom and top of the boxes are the 25th and 75th percentiles (interquartile range), respectively. Significant statistical differences between the two subgroups were assessed using the Wilcoxon test (*p < 0.05; **p < 0.01;***p < 0.001). C–H KM plots of six types of immune cells (neutrophils (C), NK cells resting (D), mast cells resting (E), T cells resting (F), macrophage M2 (G), dendritic cells resting (H).

In addition, Kaplan–Meier plot analysis was performed to investigate the prognostic values of these significantly abundant immune cells. Figure 6C shows that the score of neutrophils didn’t show significant prognostic values in STAD patients. Figure 6D shows that STAD patients with a low score of NK cells resting had a favorable overall survival outcome, while STAD patients with a higher score of neutrophils had a worse prognosis (p = 0.040 < 0.05). Figure 6E shows that STAD patients with a high score of mast cells resting had a favorable overall survival outcome, while STAD patients with a low score of mast cells resting had a worse prognosis (p = 0.008 < 0.05). Figure 6F shows that STAD patients with a high score of Tcells CD8 had a favorable overall survival outcome, while STAD patients with a low score of T cells CD8 had a worse prognosis (p = 0.045 < 0.05). Figure 6G shows that STAD patients with a low score of macrophage M2 had a favorable overall survival outcome, while STAD patients with a higher score of macrophage M2 had a worse prognosis (p = 0.015 < 0.05). Figure 6H shows that STAD patients with a low score of dendritic cells resting had a favorable overall survival outcome, while STAD patients with a higher score of dendritic cells resting had a worse prognosis (p = 0.034 < 0.05).

Figure 7A shows the distribution of immune subtypes (C1: wound healing; C2: IFN-gamma dominant; C3: inflammatory; C4: lymphocyte depleted) between the two IRGPI-based subgroups. There were more wound healing samples and inflammatory samples in the IRGPI-high subgroup compared to the IRGPI-low subgroup; more IFN-gamma dominant samples and lymphocyte depleted samples in the IRGPI-low subgroup compared than the IRGPI-low subgroup (p = 0.046 < 0.05). Figure 7B shows that the TMB was significantly higher in the IRGPI-low subgroup compared to the IRGPI-high subgroup (p = 6.5e−06). Figure 7C shows a negative correlation between IRGPI expression and TMB (r = − 0.29, p = 1.8e−08).

Fig. 7figure 7

Distribution of immune subtypes in different IRGPI subgroups. A Heatmap and table showing the distribution of immune subtypes (C1: wound healing; C2: IFN-gamma dominant; C3: inflammatory; C4: lymphocyte depleted) between the IRGPI subgroups; B The difference of total mutational burden between IRGPI subgroups; C Correlation analysis between IRGPI and total mutational burden

The various immune-related scores (e.g., Tumor immune dysfunction and exclusion (TIDE) (Fig. 8A), microsatellite instability (MSI) (Fig. 8B), T-cell dysfunction (Fig. 8C), and exclusion score (Fig. 8D)) were compared between two IRGPI subgroups. Figure 8E, F shows that IRGPI had good diagnostic values in predicting the overall survival at 3-year (AUC = 0.719) and 5-year post operation (AUC = 0.709).

Fig. 8figure 8

The potential clinical efficacy of immunotherapy in different IRGPI subgroups and the value of predicting the prognosis of IRGPI in patients with GC. A–D Tumor immune dysfunction and exclusion (TIDE) (A), microsatellite instability (MSI) (B), and T-cell dysfunction (C) and exclusion score (D) in different IRGPI subgroups. The scores between the two IRGPI subgroups were compared through the Wilcoxon test (*p < 0.05; **p < 0.01; ***p < 0.001). E ROC analysis of IRGPI, TIS, and TIDE on OS at 3-years follow-up in TCGA cohort. F ROC analysis of IRGPI, TIS, and TIDE on OS at 5-years follow-up in TCGA cohort

留言 (0)

沒有登入
gif