Identification of E3 ubiquitin ligase-associated prognostic genes and construction of a prediction model for uterine cervical cancer based on bioinformatics analysis

3.1 Identification of differentially expressed E3URGs in CESC

A total of 1142 E3URGs were compared between adjacent normal samples (n = 3) and CESC samples (n = 306). Of these, 148 E3URGs were differentially expressed, as shown in the volcano plot (Fig. 2A) and the heatmap (Fig. 2B). ZBTB16, TRIM63, and PDZRN3 were expressed at low levels in normal tissues, whereas SFN, H2BC9, and H2BC17 were highly expressed in CESC tissues. Univariate Cox regression analysis identified 19 E3URGs (Fig. 2C). Furthermore, correlation analysis showed that most of these genes were correlated with one another (Fig. 2D).
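The coloring rule behind a volcano plot such as Fig. 2A can be sketched as follows. This is only an illustration: the text does not report the cutoffs used, so the thresholds (|log2 fold change| > 1, adjusted P < 0.05), the function name `classify_gene`, and the example values are all assumptions and are not data from this study.

```python
import math

def classify_gene(mean_tumor, mean_normal, adj_p, lfc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene 'up', 'down', or 'ns', the way a volcano plot colors its points.

    The cutoffs are assumed defaults; the paper does not state its thresholds.
    """
    log2fc = math.log2(mean_tumor / mean_normal)
    if adj_p < p_cutoff and log2fc > lfc_cutoff:
        return "up"      # drawn in red
    if adj_p < p_cutoff and log2fc < -lfc_cutoff:
        return "down"    # drawn in green
    return "ns"          # drawn in black

# Hypothetical genes: (mean tumor expression, mean normal expression, adjusted P).
examples = {
    "GENE_A": (40.0, 5.0, 0.001),
    "GENE_B": (2.0, 16.0, 0.004),
    "GENE_C": (10.0, 9.0, 0.60),
}
labels = {gene: classify_gene(*values) for gene, values in examples.items()}
```

In practice the fold changes and adjusted P values would come from a dedicated differential expression tool rather than from raw group means.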

Fig. 2

Expression patterns of E3 ubiquitin ligase-related genes in CESC. A Volcano plot of E3URGs in CESC. Red dots denote up-regulated genes, green dots denote down-regulated genes, and black dots denote genes without a significant difference. B Heatmap of E3URG expression levels in each sample. "Tumor" denotes tumor samples and "Normal" denotes normal samples. Red denotes high expression and green denotes low expression. C Forest plot of the univariate Cox regression analysis for the 19 E3URGs. D Prognostic network map and gene correlation analysis

3.2 Consensus clustering identified two molecular subtypes

Univariate Cox regression identified 19 E3URGs. To further explore their clinical importance, consensus clustering was performed on their expression profiles for k = 2 to 9, and k = 2 showed the best clustering stability (Fig. 3A, B). CESC patients were accordingly divided into two subtypes, A and B (Fig. 3C). Survival analysis showed that subtype A had a markedly better prognosis than subtype B (Fig. 3D). The heatmap shows the distribution of clinical characteristics, such as age, TNM stage, and tumor grade, across subtypes A and B (Fig. 3E).

Fig. 3

Identification of E3 ligase gene-related subtypes by consensus clustering analysis. A, B, C TCGA cervical cancer samples can be divided into two subtypes, A and B; k = 2 is the optimal number of clusters. D Kaplan-Meier (K-M) curves of overall survival (OS) for subtypes A and B. E Heatmap of age, grade, TNM stage, and E3 ligase-related gene expression in each subtype

3.3 Construction of E3 ligase gene-related prognostic model

Based on the association between these regulators and OS in CESC patients, univariate Cox regression analyses were conducted on the expression levels of the 148 E3URGs. The TCGA cervical cancer data were randomly split into a training set and a test set 100 times. LASSO regression and multivariate Cox regression analyses were then run on the training cohort to reduce overfitting of the model. This procedure selected the most predictive markers and produced a prognostic signature for predicting clinical outcomes (Fig. 4A, B). The PCA plot showed that the two risk groups were distributed differently (Fig. 4C). Four genes were found to have the greatest predictive capacity (Fig. 4D). The risk score was calculated from the corresponding coefficients as:

risk score = (0.54 × PSMD14) − (0.74 × PSMA4) − (0.42 × ZBTB16) − (0.021 × RADD) + (0.54 × ANKRD9)

where each gene symbol denotes the expression value of that gene. To determine whether the risk score was a prognostic marker of CESC independent of other clinicopathological traits, multivariate Cox regression analysis was carried out. The risk score was independently associated with OS (Fig. 4E, P < 0.01).
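As a worked illustration of the risk score, the sketch below applies the coefficients exactly as printed in the text (including the term labeled RADD) to hypothetical expression values and then splits patients into risk groups. The median cutoff, the function names, and the patient values are assumptions for illustration; the text does not state how the threshold was chosen.

```python
from statistics import median

# Coefficients as reported in the text; gene labels are kept as printed.
COEFFICIENTS = {
    "PSMD14": 0.54,
    "PSMA4": -0.74,
    "ZBTB16": -0.42,
    "RADD": -0.021,
    "ANKRD9": 0.54,
}

def risk_score(expression):
    """Linear combination of expression values weighted by the model coefficients."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())

def assign_groups(scores, cutoff=None):
    """Split patients into 'high' and 'low' risk; the median cutoff is an assumption."""
    if cutoff is None:
        cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

# Hypothetical patients, not data from this study.
patients = {
    "P1": {"PSMD14": 2.0, "PSMA4": 1.0, "ZBTB16": 1.0, "RADD": 0.0, "ANKRD9": 1.0},
    "P2": {"PSMD14": 0.0, "PSMA4": 0.0, "ZBTB16": 0.0, "RADD": 0.0, "ANKRD9": 0.0},
    "P3": {"PSMD14": 0.0, "PSMA4": 1.0, "ZBTB16": 1.0, "RADD": 0.0, "ANKRD9": 0.0},
}
scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = assign_groups(scores)
```

Passing an explicit `cutoff` to `assign_groups` mirrors applying a threshold derived from one cohort to another cohort.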

Fig. 4

Construction of the risk signature for cervical cancer patients in the TCGA database. A, B Construction of the signature by least absolute shrinkage and selection operator (LASSO) Cox regression. C PCA plot of cervical cancer samples according to risk score. D Coefficients of the four genes that make up the signature. E Multivariate Cox regression analysis of OS and clinicopathological characteristics

3.4 Survival analysis and ROC curve based on prognostic modelling

Time-dependent ROC curves demonstrated the prognostic value of the risk score, and the OS rate in the high-risk group was significantly lower than that in the low-risk group. As shown in Fig. 5A, the AUC values for the test cohort were 0.676 at one year, 0.700 at three years, and 0.698 at five years. Next, we used the GSE44001 dataset as a validation cohort to assess the predictive accuracy of our risk model [20] (Fig. 5C). The threshold from the TCGA cohort was used to divide these patients into low- and high-risk groups. Survival analysis showed that the high-risk group had a lower OS rate than the low-risk group, consistent with the findings in the TCGA cohort. The AUC was 0.694 for 1-year OS, 0.730 for 3-year OS, and 0.702 for 5-year OS. These results confirmed the predictive ability of the risk model (Fig. 5B). We also examined how the risk score differed among subgroups defined by clinicopathological variables (Fig. 5D).
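The survival curves compared between risk groups are Kaplan-Meier estimates. The following is a minimal sketch of that estimator, assuming follow-up times and event indicators are available for one group; the function name `kaplan_meier` and the example data are hypothetical, and the text does not state which software produced the published curves.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times:  follow-up durations
    events: 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= leaving
    return curve

# Hypothetical group of four patients; the second one is censored at time 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Running the function once per risk group gives the two curves that a log-rank test would then compare.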

Fig. 5

Construction and validation of the E3 typing-related gene signature. A, B, C ROC curves at 1, 3, and 5 years and K-M survival curves of the high-risk and low-risk groups for the training set, validation set, and GEO cohort. D Differences in risk scores among subgroups with distinct clinicopathological features. (*P < 0.05; **P < 0.01; ***P < 0.001; ns, not significant)

3.5 Construction of the nomogram

To aid clinical application of the risk model, we constructed a prognostic nomogram based on tumor grade and risk score (Fig. 6A). The nomogram predicted the 1-, 3-, and 5-year OS of CESC patients. Its performance was further evaluated using calibration curves (Fig. 6B). The results showed that the model could accurately predict OS in patients with cervical cancer.

Fig. 6

Development and verification of a prognostic nomogram for patients with cervical cancer. A Nomogram combining clinical data with the risk score to predict 1-, 3-, and 5-year OS of cervical cancer patients. B Calibration curves of the nomogram for predicting survival probability at one, two, three, and five years

3.6 CeRNA network construction

Target miRNAs of each of the four genes were predicted using the miRanda, miRDB, miRWalk, and TargetScan databases [21]. Intersecting the predicted targets of hsa-miR-1237-3p with the DEmRNAs yielded six mRNAs in total. Integrating the lncRNA-miRNA pairs and miRNA-mRNA pairs yielded a network of two miRNAs (hsa-miR-1297-3p and hsa-miR-205-5p), two mRNAs, and six lncRNAs (Fig. 7).
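The integration step can be pictured as a join on the shared miRNA. The sketch below first keeps miRNA-mRNA pairs predicted by every database and then links lncRNA-miRNA pairs to them. Requiring agreement across all databases is an assumption, since the text does not state the rule used, and all identifiers in the example (`LNC1`, `miR-x`, `GENE1`, and so on) are hypothetical placeholders.

```python
def consensus_pairs(predictions):
    """Keep only the miRNA-mRNA pairs predicted by every database (assumed rule)."""
    pair_sets = [set(pairs) for pairs in predictions.values()]
    return set.intersection(*pair_sets)

def build_cerna(lnc_mirna_pairs, mirna_mrna_pairs):
    """Join lncRNA-miRNA and miRNA-mRNA pairs on the shared miRNA."""
    triples = set()
    for lnc, mirna in lnc_mirna_pairs:
        for other_mirna, mrna in mirna_mrna_pairs:
            if mirna == other_mirna:
                triples.add((lnc, mirna, mrna))
    return sorted(triples)

# Hypothetical predictions keyed by database name.
predictions = {
    "miRanda": [("miR-x", "GENE1"), ("miR-y", "GENE2"), ("miR-z", "GENE1")],
    "miRDB": [("miR-x", "GENE1"), ("miR-y", "GENE2")],
    "miRWalk": [("miR-x", "GENE1"), ("miR-y", "GENE2"), ("miR-y", "GENE3")],
    "TargetScan": [("miR-y", "GENE2"), ("miR-x", "GENE1")],
}
mirna_mrna = consensus_pairs(predictions)
lnc_mirna = [("LNC1", "miR-x"), ("LNC2", "miR-x"), ("LNC3", "miR-q")]
network = build_cerna(lnc_mirna, mirna_mrna)
```

Each resulting triple is one lncRNA-miRNA-mRNA edge chain of the kind drawn in a ceRNA network.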

Fig. 7

The ceRNA regulatory network. A lncRNA-miRNA-mRNA ceRNA network

3.7 Differential gene enrichment study using GO and KEGG

GO and KEGG enrichment analyses were performed on the DEGs associated with E3 ubiquitin ligases (Fig. 8A and B). GO terms were grouped into three categories: biological process (BP), cellular component (CC), and molecular function (MF). In BP, the genes were mainly enriched in the proteasome-mediated ubiquitin-dependent protein catabolic process. In CC, they were mainly enriched in the proteasome complex and endopeptidase complex, while in MF they were mainly enriched in endopeptidase activator activity, K63-linked deubiquitinase activity, proteasome binding, and ubiquitin ligase-substrate adaptor activity. The significantly enriched KEGG pathways included Proteasome, Prion disease, Pathways of neurodegeneration - multiple diseases, and Acute myeloid leukemia.
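Enrichment of a GO term or KEGG pathway is commonly assessed with a hypergeometric (over-representation) test. The sketch below shows that calculation under the assumption that such a test applies here; the text does not name the test or tool used, and the function name and the numbers in the example are hypothetical.

```python
from math import comb

def enrichment_p(background, annotated, selected, overlap):
    """Hypergeometric upper-tail probability P(X >= overlap).

    background: number of genes in the background set
    annotated:  background genes annotated to the term or pathway
    selected:   number of genes in the list being tested (e.g. the DEGs)
    overlap:    selected genes that are annotated to the term
    """
    upper = min(annotated, selected)
    total = comb(background, selected)
    favorable = sum(
        comb(annotated, i) * comb(background - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return favorable / total

# Hypothetical toy numbers: 10 background genes, 4 annotated, 3 selected, 2 overlapping.
p_value = enrichment_p(10, 4, 3, 2)
```

Real analyses run this test for every term and then correct the resulting P values for multiple testing.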

Fig. 8

GO and KEGG enrichment analyses of the differential genes. A Histogram of the KEGG enrichment analysis. B Histogram of the GO enrichment analysis

3.8 Analysis of the correlation between cervical cancer risk score and immune infiltration

Using the "limma" package, we compared the abundance of 22 immune cell types in cervical cancer between the low- and high-risk groups of subtypes A and B. The high-risk group had substantially lower levels of CD8 T cells, activated memory CD4 T cells, resting mast cells, and neutrophils than the low-risk group (Fig. 9A). This suggests that the abundance of immune cells in cervical cancer may affect patient survival to some extent. Correlation scatter plots were produced from stem cell analysis of the patient risk scores (Fig. 9B). Using the ESTIMATE algorithm in the "estimate" R package, the ImmuneScore, StromalScore, and ESTIMATEScore were markedly higher in the low-risk group than in the high-risk group (Fig. 9C).

Fig. 9

Differences in the tumor microenvironment. A Proportions of 22 immune cell subtypes in high- and low-risk patients. B Scatter plot of the correlation between risk group and stem cells. C Differences in immune microenvironment scores between the two groups

3.9 Validation results are consistent with the trends in the risk prediction model

According to IHC data, PSMD14 expression was higher in cervical cancer specimens than in normal specimens, whereas ZBTB16 and PSMA4 expression was decreased in tumor tissues. ANKRD9 expression was low and did not differ significantly between tumor and normal tissue (Fig. 10).

Fig. 10

Protein expression of the four prognostic genes in the HPA database
