Deciphering the Prognostic and Therapeutic Value of a Gene Model Associated with Two Aggressive Hepatocellular Carcinoma Phenotypes Using Machine Learning

Introduction

Hepatocellular carcinoma (HCC) is the primary form of liver cancer and the third leading cause of cancer-related deaths worldwide.1 Several clinical staging systems have been developed to optimize patient care in HCC, with the Barcelona Clinic Liver Cancer (BCLC) system being widely used for guiding treatment decisions in HCC.2 Despite the availability of curative therapies for early-stage HCC, the long-term prognosis remains poor due to high tumor recurrence rates following surgery.3,4 Transarterial chemoembolization (TACE) is the standard treatment for intermediate-stage HCC, but over 40% of patients do not respond effectively.5 For advanced HCC, first-line treatments involve multi-kinase inhibitors like sorafenib, and targeted therapy combined with and immunotherapy.2 However, a majority of patients fail to respond or derive sustained clinical benefit due to tumor heterogeneity and treatment resistance.6 Therefore, there is an urgent need to identify and validate reliable biomarkers for HCC to enable patient stratification and personalized treatment strategies, ultimately improving prognosis.

The growing recognition of HCC has identified various histopathological phenotypes, each exhibiting distinctive clinical and molecular features.7,8 Among these, the macrotrabecular-massive hepatocellular carcinoma (MTM-HCC) mentioned in the 2019 WHO Classification of Digestive System Tumors has garnered attention due to its highly aggressive biological characteristics.9,10 MTM-HCC is characterized by thick trabeculae consisting of more than six cells and occupying over 50% of the tumor area.9 Interestingly, previous studies have shown a strong association between MTM-HCC and another pathological architecture known as vessels encapsulating tumor clusters (VETC).11 In contrast to epithelial-mesenchymal transition, VETC represents a distinctive histologic vascular pattern linked to a novel mechanism of metastasis.12,13 The VETC pattern is characterized by sinusoid-like vessels that enclose individual tumor clusters, forming a cobweb-like pattern on CD34-immunostained sections.13 It is noteworthy that VETC-HCC and MTM-HCC often exhibit overlapping characteristics and share diverse markers associated with aggressive behavior. The MTM/VETC-HCC are generally associated with poorer tumor differentiation, higher tumor burden, and elevated levels of AFP compared to negative cases.7,10,14,15 Furthermore, MTM/VETC-HCC has been established as an independent prognostic factor for tumor recurrence and overall survival in patients who undergo surgery for HCC.11,12,15,16 In addition to their prognostic implications, MTM/VETC-HCC can offer valuable insights for treatment decision-making in HCC patients, including options such as TACE, hepatic arterial infusion chemotherapy, or systemic therapy.17–21 A recent study by Kurebayashi et al further classified VETC-HCC and MTM-HCC into a unified immune-low/angiogenic phenotype based on analysis of the tumor microenvironments.22 However, the biological changes underlying MTM and VETC pathological subclasses have not been fully elucidated.

Recent advancements in high-throughput sequencing technologies have transformed the classification of HCC by shifting from purely pathological categorizations to refined molecular subtypes.8 Groundbreaking studies, such as that conducted by Lee et al in 2004, utilized transcriptome microarray data to identify distinct subclasses within HCC, revealing significant differences in survival and biological characteristics.23 This pioneering work paved the way for subsequent molecular stratification systems, including those developed by Boyault et al (G1-G6),24 Chiang et al (CTNNB1, proliferation, IFN, poly 7, and unannotated class),25 and Hoshida et al (S1-S3),26 which are all based on gene expression profiles. Additionally, the construction of risk models based on key genes has shown implications for prognosis and therapeutic strategies in liver cancer. For instance, Gao et al employed machine learning techniques to develop a DNA damage repair score that successfully predicts overall survival and resistance to PD-1 blockade in HCC patients.27 Nevertheless, the specific molecular markers that associated with MTM/VETC and their implications for prognosis and therapeutic strategies in HCC remained unclear.

In this study, we leveraged high-throughput sequencing to uncover MTM and VETC-related genes and applied machine learning techniques to construct a novel prognostic risk score. Additionally, we investigated the correlation between the risk score and survival outcomes, clinical-pathological features, molecular subclasses, and responsiveness to conventional therapies. Furthermore, we explored underlying biological processes, potential drug targets, and tumor microenvironment. The utility of the score was confirmed using the in-house cohort. Insights from this research may enhance the prognostication of HCC and inform patient-specific therapeutic approaches.

Materials and Methods Data Acquisition

This study utilized a total of five HCC cohorts: TCGA-LIHC, ICGC-LIRI-JP, and GSE14520, GSE104580, and GSE109211. The gene expression data and metadata for TCGA-LIHC were obtained through the TCGA biolinks package.28 Furthermore, survival outcome data, encompassing overall survival (OS), disease-free interval (DFI), disease-specific survival (DSS), and progression-free interval (PFI), were acquired from the research conducted by Hu et al.29 The TCGA IDs for patients with MTM-HCC were extracted from the research conducted by Zucman-Rossi et al.14 In addition, the GSE14520, GSE104580, GSE109211 and ICGC-LIRI-JP cohorts were accessible on GEO (https://www.ncbi.nlm.nih.gov/geo/) and the ICGC database (https://dcc.icgc.org/). ICGC-LIRI-JP IDs for HCC patients with VETC pattern were collected from Aikata et al’s research.30 RNA sequencing data and clinical information from the IMvigor210 cohort were also acquired,31 which investigated the efficacy of atezolizumab, an inhibitor targeting programmed death-ligand 1, in urothelial carcinoma.

Identification of Differential Expressed MTM and VETC-Related Genes

We conducted a differential analysis to compare the tumor tissue of MTM-HCC and non-MTM-HCC with normal liver tissues. The DESeq2 was employed for the differential expression analysis, utilizing a screening criterion of | log2(fold change) | > 1 and adjusted P value <0.05. To identify MTM-related differentially expressed genes (DEGs), we specifically considered genes that were found in the DEGs for MTM-HCC compared to normal liver tissues but were not present in the DEGs for non-MTM-HCC compared to normal liver tissues from the TCGA-LIHC cohort. Subsequently, we applied a similar process to identify VETC-related DEGs from the ICGC-LIRI-JP cohort. Finally, we determined the intersection of the MTM-related genes and VETC-related genes, resulting in the identification of MTM and VETC-related genes. An Upset diagram32 was also used to show the results.

Development of MTM and VETC-Related Risk Score

Initially, univariate Cox regression analysis was performed to identify genes associated with OS, using an adjusted P value threshold of <0.05. Next, lasso regression was conducted 1000 times, and features that appeared at least 100 times out of the 1000 trials were considered as significant lasso features. Furthermore, the random forest (RF) algorithm was employed to select representative genes (normalized variable importance measure index >0.4) using the “randomSurvivalForest” package. Finally, the overlapping features obtained from both the repeated lasso results and the random forest results were integrated into a stepwise multivariate Cox regression model. This model takes into account multiple gene signatures to develop the MTM and VETC-related risk score. We determined the cutoff value for the risk score based on the median value, and subsequently classified patients into high-risk and low-risk groups accordingly.

Analysis of Survival and Clinical Characteristics

Kaplan-Meier curves and Log rank tests were then employed to compare the OS, DFI, DSS, and PFI between the high-risk and low-risk groups. Additionally, chi-squared tests and Wilcoxon rank-sum tests were utilized to explore any associations between the risk score or risk group and clinical characteristics. To evaluate the independent prognostic effect of the risk score, both univariate and multivariate Cox regression analyses were also performed.

Functional Enrichment Analysis

The Hallmark gene sets was acquired from the msigdbr package (v.7.5.1). To minimize overlap and redundancy among pathways, gene sets associated with each pathway were filtered to include unique genes, and genes related to multiple pathways were subsequently excluded. We applied the limma package to identify the pathways that exhibited significant modifications between the high- and low-risk score groups. Furthermore, we performed gene set enrichment analysis (GSEA) using the clusterProfiler package.33

Drug Target Analysis

We obtained compound target information from the study conducted by Yang et al.34 Next, we computed the risk score for each liver cancer cell line sourced from the Cancer Cell Line Encyclopedia project, and conducted a correlation analysis between the CERES score and risk score across these cell lines. A lower CERES score for a gene suggests a higher likelihood of its dependency within a specific cancer cell line. Consequently, we regarded a gene with a correlation coefficient less than −0.5 (P <0.05) as a potential drug target associated with poor prognosis.

Assessment of Drug Reactivity and Possible Therapeutic Agents

We collected sensitivity information for cancer cell lines (CCLs) from the Profiling Relative Inhibition Simultaneously in Mixture (PRISM) and Cancer Therapeutics Response Portal (CTRP) datasets. Both datasets utilize the area under the dose-response curve (AUC) value as a measure of drug sensitivity, where lower AUC values indicate a stronger response to the drug. To address any missing AUC values, we applied K-nearest neighbor (K-NN) imputation, excluding compounds with less than 20% missing values.

Analysis of Tumor Microenvironment and Evaluation of Immunotherapy Response

To assess the tumor microenvironment (TME), we utilized the IOBR package35 to collect previously published signatures related to various aspects, including TME cell types, response to immunotherapy, immune suppression, and immune exclusion. Furthermore, we computed the enrichment score for each sample, providing an analysis of the immunological differences between high- and low-risk score patients. For evaluating the potential response to immunotherapy, we employed the Tumor Immune Dysfunction and Exclusion (TIDE) algorithm36 and The Cancer Immunome Atlas (TCIA) database (tcia.at). We also verified the immunotherapy response in the IMvigor210 cohort.

Characterization of HCC Subclasses

We used nearest template prediction (NTP) analyses to predict previously published molecular classifications of HCC, including those by Lee, Hoshida, Boyault, and Chiang.23–26 We then compared these prediction results with our risk score classification.

In-House Validation Cohort Preparation

We retrospectively included 25 patients with surgical-proven HCC at our institution between July 2019 and June 2020. Comprehensive data including clinical, pathological, follow-up information, as well as Magnetic Resonance Imaging (MRI) features were collected. Formalin-Fixed Paraffin-Embedded tissues of HCC samples were collected from these patients for further analysis.

Transcriptome Analysis of in-House Cohort

Bulk RNA-sequencing analysis was conducted on 25 Formalin-Fixed Paraffin-Embedded HCC tissue samples. The RNA-seq library was constructed and subsequently sequenced using the Illumina NovaSeq 6000 platform. Paired-end sequencing with a read length of 150 bp was performed. To align the raw sequencing reads to the human reference genome (hg38), the STAR algorithm was employed.

Statistical Analysis

All the statistical analyses were performed using R (version 4.3.1). For continuous variables that followed a normal distribution across two groups, t-tests were employed; otherwise, Wilcoxon tests were utilized. Categorical variables were analyzed using either the chi-squared test or Fisher’s exact test. Pearson or Spearman coefficients were used to assess correlations among variables. Survival analysis was performed using the Kaplan-Meier method. Uni- and multivariate analysis was performed using Cox-regression analysis. Significance was defined as P < 0.05, and all tests were conducted as two-sided tests.

Results Identification and Functional Analysis of MTM and VETC-Related Genes

The overall design of this study is displayed in Figure 1. We utilized multiple HCC cohorts, and the names and sample sizes of each cohort are provided in Table S1. In the TCGA-LIHC cohort, a total of 59 samples were classified as MTM-HCC. Differential gene expression analysis revealed 1909 upregulated genes and 1102 downregulated genes in MTM-HCC (Figure 2A), as well as 1568 upregulated genes and 618 downregulated genes in non-MTM-HCC (Figure 2B) compared to normal liver tissue. Among these genes, we identified 1080 MTM-related genes that were primarily associated with metabolic pathways, specifically glycolysis and gluconeogenesis (Figure 2C). Similarly, in the ICGC-LIRI-JP cohort, 16 hCC samples with the VETC pattern were identified. In VETC positive-HCC, we observed 677 upregulated genes and 304 downregulated genes (Figure 2D), while in VETC negative-HCC, 244 genes were upregulated and 201 genes were downregulated (Figure 2E). Functional enrichment analysis revealed an association of VETC-related genes with the VEGFA-VEGFR2 signaling pathway (Figure 2F).

Figure 1 Study flowchart.

Abbreviations: MTM, Macrotrabecular-massive; VETC, Vessels encapsulating tumor clusters; HCC, Hepatocellular carcinoma.

Figure 2 Identification of MTM and VETC-related genes and construction of the risk score based on the TCGA cohort. (A) The volcano plot of DEGs between MTM-HCC and normal tissues. (B) The volcano plot of DEGs between non-MTM-HCC and normal tissue. (C) Wiki pathway enrichment analysis of MTM-related genes. (D) The volcano plot of DEGs between VETC-positive-HCC and normal tissues. (E) The volcano plot of DEGs between VETC-negative-HCC and normal tissues. (F) Wiki pathway enrichment analysis of VETC-related genes. (G) The Upset diagram shows the intersection genes of MTM-related genes and VETC-related genes. (H) Wiki pathway enrichment analysis of MTM and VETC-related genes. (I) Forest plot of univariate cox analysis. Error bars represent 95% confidence intervals. (J) Repeated lasso analysis identifies genes that appear more than 100 times. (K) Random Forest variable importance plot. (L) Venn diagram of the 9 key genes between repeated Lasso and Random Forest analysis. (M) Coefficients of 4 genes finally obtained in stepwise Cox regression.

Abbreviations: MTM, Macrotrabecular-massive; VETC, Vessels encapsulating tumor clusters; HCC, Hepatocellular carcinoma; DEGs, Differentially expressed genes.

By intersecting the gene sets related to both MTM-HCC and VETC positive-HCC, we identified 140 genes that exhibit associations with both phenotypes of HCC (Figure 2G). These genes play roles in signaling pathways involving cytoplasmic ribosomal proteins, VEGFA-VEGFR2, and mRNA processing (Figure 2H). The identified 140 genes will be utilized for subsequent modeling purposes. Among the 140 MTM and VETC-related genes, 83 genes were found to be significantly associated with prognosis using univariate Cox regression analysis in the TCGA cohort. Figure 2I displays the genes with hazard ratios (HR) above 1.5 and below 1. To further refine the gene selection, repeated lasso regression was performed, resulting in the identification of 25 frequently appearing genes out of 1000 repetitions (Figure 2J). Additionally, RF analysis identified genes with a relative importance exceeding 0.4 (Figure 2K). By comparing these results, we identified 9 candidate hub genes for constructing the risk score (Figure 2L). We then developed a risk score using multivariate Cox regression analysis with these genes. After refining the risk score with stepAIC, we established a risk score based on four genes: SLC41A3, NOP56, FNDC4, and NDRG1 (Figure 2M and Table S2).

Application of Risk Score in Patient Prognostic Assessment

Through Principal Component Analysis (PCA), we observed that the TCGA-LIHC patient cohort with high- and low- risk could be separated into two distinct clusters (Figure 3A). High-risk patients had lower survival rates and higher mortality rates compared to the low-risk group (Figure 3D and G). These findings were validated in two independent cohorts, GSE14520 and ICGC-LIRI-JP, which also exhibited similar results (Figure 3B, C, E, F, H and I). The low-risk group consistently displayed better outcomes in terms of DSS, DFI, and PFI (Figure 3J–L and Figure S1). Subgroup survival analyses showed a trend of poorer prognosis in high-risk score patients. However, not all trends were statistically significant (Figures S2–S8).

Figure 3 Internal training and external validation of MTM and VETC-related risk score. (A–C) Principal component analysis plot based on the expression matrix of the four model genes (SLC41A3, FNDC4, NDRG1, and NOP56) in the TCGA, GSE14520, and ICGC cohorts. (D–F) Distribution of risk scores (upper), survival time (middle), and model gene expression levels (lower) by survival status in the TCGA, GSE14520, and ICGC cohorts. Patients were classified into low-risk and high-risk groups using the median score as the cut-off value. Red dots and lines represent patients in the high-risk group, while blue dots and lines represent those in the low-risk group. (G–I) Comparison of overall survival between low-risk and high-risk patient groups in TCGA, GSE14520, and ICGC cohorts. (J–L) Comparison of disease-specific survival, disease-free interval, and progression-free interval comparison between low-risk and high-risk patient groups in the TCGA cohort.

Associations Between Different Risk Groups and Clinicopathologic Characteristics

The findings revealed that the high-risk group was significantly associated with advanced TNM stage as well as increased mortality across the TCGA-LIHC, GSE14520, and ICGC-LIRI-JP cohorts (Figure 4A–C). Additionally, in the TCGA-LIHC cohort, patients classified as “Dead”, with AFP levels over 400 ng/mL, G3 & G4, stage III & IV, and vascular invasion, exhibited significantly higher risk scores than their counterparts (Figure 4D, K and P–R). No distinct trends were observed concerning age and Child-Pugh variables (Figure 4E and J). In the GSE14520 cohort, patients classified as “Dead”, at BCLC stage B & C, with multinodular tumors and at stage III, showed significantly higher risk scores compared to the low-risk group (Figure 4F and S–U). No different trends were noted regarding age, gender, and tumor size (Figure 4G, L and M). Similarly, in the ICGC-LIRI-JP cohort, patients classified as “Dead”, and at stage III & IV exhibited significantly higher risk scores compared to the low-risk group (Figure 4H and O), with no different trends concerning age and gender variables (Figure 4I and N). On the other hand, the univariate analysis consistently showed a significant association between the high-risk group and poor prognosis across all three HCC cohorts. Moreover, the multivariate analysis, adjusting for potential confounding factors, confirmed that the high-risk group remained an independent prognostic indicator in all three cohorts (Tables S3–S5).

Figure 4 Association between risk scores and clinicopathological characteristics in the TCGA, GSE14520, and ICGC cohorts. (A–C) Pie charts show the chi-squared test between various risk groups and clinical pathological factors in HCC samples from the TCGA cohort, GSE14520 cohort, and ICGC cohort. Different colored circles represent different clinical variables. (D, E, J, K, P–R) Differences in risk scores between different clinical subgroups were analyzed in the TCGA cohort, and statistical differences between clinical subgroups were tested using the Wilcoxon rank test. (F, G, L, M and S–U) Differences in risk scores between different clinical subgroups were analyzed in the GSE14520 cohort, and statistical differences between clinical subgroups were tested using the Wilcoxon rank test. (H, I, N and O) Differences in risk scores between different clinical subgroups were analyzed in the ICGC cohort, and statistical differences between clinical subgroups were tested using the Wilcoxon rank test. *P < 0.05; **P < 0.01; ***P < 0.001; ****P < 0.0001.

Abbreviation: NS, Not significant.

Correlation Between Different Risk Groups and HCC Molecular Subclasses

We investigated the relationship between risk score and molecular subclasses in four HCC cohorts (TCGA-LIHC, ICGC-LIRI-JP, GSE14520, and our in-house cohort). Compared to low-risk patients, high-risk patients were frequently classified into subclasses G1 and G2 based on Boyault’s classification, SURVIVAL_DN based on Lee’s classification, as well as S1 and S2 based on Hoshida’s classification (Figure 5A–D).

Figure 5 Association of risk scores with hepatocellular carcinoma molecular subclasses. (A–D) Heatmaps illustrating the relationship between risk scores and various hepatocellular carcinoma molecular subclasses across four independent cohorts: TCGA, ICGC, GSE14520, and FAHZU.

Application of Risk Score in TACE and Sorafenib Treatment for HCC

In the GSE104580 cohort, patients who did not respond to TACE had higher risk scores than responders (Figure 6A). Additionally, the high-risk group was more likely to exhibit TACE non-responsiveness compared to the low-risk group (Figure 6B). The risk score demonstrated good predictive precision for TACE responsiveness, with a high AUC value of 0.781 (Figure 6C). Furthermore, HCC patients treated with TACE and high-risk scores experienced poorer OS (Figure 6D) and RFS (Figure 6E) compared to those with low-risk scores. Additional drug analysis revealed that patients with low-risk scores were more sensitive to sorafenib treatment (Figure 6F). In the GSE109211 cohort, we observed that sorafenib non-responders had higher risk scores than responders (Figure 6G), and patients in the high-risk group were more prone to non-responsiveness (Figure 6H). The risk score exhibited a moderate predictive value for sorafenib response, with an AUC value of 0.681 (Figure 6I).

Figure 6 Guidance of risk score in the TACE and sorafenib therapy for hepatocellular carcinoma patients. (A) Differences in risk scores between TACE responders and nonresponders. (B) Comparison of TACE response rates across different risk groups. (C) AUC value of the risk score in predicting TACE efficacy. (D and E) Differences in overall survival and recurrence-free survival between low-risk and high-risk patient groups treated with TACE. (F) Sensitivity to sorafenib between low-risk and high-risk patient groups, (G) Differences in risk scores between sorafenib responders and nonresponders. (H) Comparison of sorafenib response rates between low-risk and high-risk patient groups. (I) AUC value of the risk score in predicting sorafenib efficacy.

Abbreviations: TACE, Transarterial chemoembolization; AUC, Area under the curve.

Functional Disparity Analysis and Drug Target Profiling Between Different Risk Groups

Geneset Variation Analysis (GSVA) analysis revealed that the high-risk groups were enriched in pathways such as Myc targets v1, E2F targets, and Myc targets v2 (Figure 7A). In contrast, the low-risk group showed enrichment in metabolism-related pathways like bile acid, fatty acid, and xenobiotic metabolism. Further analysis demonstrated a positive correlation between the risk score and GSVA scores of Myc targets v1, E2F targets, and Myc targets v2 pathways, with a negative correlation to metabolism-related pathways (Figure 7B). GSEA confirmed the activation of cancer-related pathways including Myc targets v1, E2F targets, and Myc targets v2 in the high-risk group (Figure 7C–E).

Figure 7 Identification of biological processes and drug targets associated with the risk score. (A) Geneset variation analysis enrichment analysis comparing low-risk and high-risk patient groups. (B) The correlation between risk scores and gene set enrichment scores. (C–E) Representative geneset variation analysis enrichment plots of the “Myc targets v1”, ‘E2F targets’, and “Myc targets v2” hallmarks. (F–J) Volcano plot and scatter plots depicting Spearman correlations and the statistical significance between risk scores and CERES drug target scores. The shaded blue areas represent the confidence intervals of the fitted lines.

To identify potential therapeutic targets associated with the risk score, we examined the relationship between individual gene dependencies and the risk score across 22 liver cancer cell lines using CERES scores. This analysis identified P2RY12, SLC25A4, CD44, and LDHAL6B as potential therapeutic targets (Figure 7F–J).

Identification of Potential Therapeutic Drugs for the High-Risk Group

Two approaches were used to identify candidate agents with increased sensitivity in high-risk patients (Figure 8A). We analyzed drug response data from CTRP and PRISM datasets. Initially, we compared drug responses between high-risk (top 10%) and low-risk (bottom 10%) patients to find compounds with significantly lower estimated AUC values in the high-risk group (log2FC > 0.1). Subsequently, Spearman correlation analysis was performed to select compounds that exhibited a strong negative correlation between AUC value and risk score (Spearman’s r < −0.3). The results indicated that 12 compounds exhibited both lower estimated AUC values in the high-risk group and a negative correlation with risk score (Figure 8B–E).

Figure 8 Discovery of potential therapeutic compounds with higher drug sensitivity in high-risk patient group. (A) Illustration outlining the approach to identify compounds exhibiting increased drug sensitivity in high-risk patient group. (B–E) Comparison of the Potential therapeutic compounds for high-risk patient group. ***P < 0.001.

Immune Characteristics and Immunotherapy Between the Different Risk Groups

Using the IOBR package, we observed increased infiltration of immunosuppressive cells such as Regulatory T cells (Tregs), Myeloid-derived suppressor cells (MDSCs), and exhausted CD8 T cells in the high-risk group (Figure 9A). Furthermore, enrichment of immune exclusion, exhaustion, and suppression markers indicated an immunosuppressive environment in high-risk group (Figure 9B–D). Assessing the correlation between risk score and immunotherapy responsiveness, we applied the TIDE algorithm and found that the low-risk group exhibited better responses to immunotherapy with lower TIDE scores (Figure 9E and F). The high-risk group showed higher exclusion scores, suggesting a greater likelihood of immune escape (Figure 9G).

Figure 9 TME-related molecular characteristics and immunotherapy responsiveness of low-risk and high-risk patient groups. (A) Comparison of the TME signatures between low-risk and high-risk patient groups. (B) Comparison of the immune exhaustion signatures between low-risk and high-risk patient groups. (C) Comparison of the immune exclusion signatures between low-risk and high-risk patient groups. (D) Comparison of the immune suppression signatures between low-risk and high-risk patient groups. (E–G) Distribution of TIDE scores and exclusion scores between low-risk and high-risk patient groups. (H–K) Comparison of the immunotherapy scores of anti-CTLA4 and anti-PD1 inhibitors between the low-risk and high-risk patient groups. (L) Comparison of overall survival between the low-risk and high-risk patient groups in the IMvigor210 cohort. *P < 0.05; **P < 0.01; ***P < 0.001; ****P < 0.0001.

Abbreviations: TME, Tumor microenvironment; TIDE, Tumor Immune Dysfunction and Exclusion.

Treatment response analysis for anti-CTLA4, anti-PD1, and combined therapies using the TCIA database revealed that the low-risk group derived more benefit from these treatments (Figure 9H–K). Lastly, we confirmed the predictive value of the risk score for immunotherapy response in the IMvigor210 cohort, where the low-risk group demonstrated improved prognostic outcomes following immunotherapy (Figure 9L).

Validation of Risk Score in the in-House Cohort

In terms of clinical-pathological features, our findings revealed a higher incidence of high AFP levels (> 400 ng/mL) and microvascular invasion in the high-risk group compared to the low-risk group (Figure 10A, B and Table S6). Additionally, Kaplan-Meier survival analysis demonstrated elevated rates of early recurrence in the high-risk group (Figure 10C). Condensing the potential utility of imaging examinations in predicting HCC aggressiveness, we also explored the association between MRI features and the risk score. We found that HCC in the high-risk group showed a higher occurrence of aggressive MRI features, including non-smooth tumor margin, mosaic architecture, and corona enhancement, compared to the low-risk group (Figure 10D–G and Table S7).

Figure 10 Validation in a clinical in-house cohort. (A and B) Comparison of clinicopathological characteristics between low-risk and high-risk patient groups. (C) Comparison of postoperative recurrence free survival between low-risk and high-risk patient groups. (D–G) Comparison of the frequency of aggressive magnetic resonance imaging features between low-risk and high-risk patient groups.

Discussion

In recent years, the distinct histopathological phenotypes of HCC, namely MTM and VETC, have gained significant attention for their implications in clinical prognosis and treatment strategies. By investigating gene expression profiles from both MTM and VETC phenotypes, a more comprehensive analysis becomes possible, overcoming limitations related to variability in single-subtype gene expression and enhancing the precision and reliability of prognostic predictions. Thus, the current study explored MTM and VETC-related genes for the first time and developed a promising prognostic model that offers improved predictive value for HCC patients.

In this study, we initially employed sequencing data to identify MTM and VETC-related genes through differential analysis. Subsequent Wiki pathway37 enrichment analysis revealed the association of these genes with cytoplasmic ribosomal proteins, the VEGFA-VEGFR2 signaling pathway, and the mRNA processing pathway. Notably, the VEGF signaling pathway has been recognized as a pivotal contributor to oncogenesis, affecting angiogenesis, metastasis, and tumor cell proliferation.38,39 A prior study has indicated that the long non-coding RNA ZFAS1 promotes colorectal cancer progression by influencing the miR-150-5p/VEGFA axis, thereby impacting the VEGFA/VEGFR2/Akt/mTOR signaling pathway and epithelial-mesenchymal transition processes.40 More importantly, both MTM-HCC and VETC-HCC exhibited elevated expression of neoangiogenesis-related genes such as VEGF and Ang2.14,19,22 To enhance the predictive power of our model, Unicox, repeated lasso, and RF methods were used for feature reduction and gene selection. Finally, through stepwise multi-Cox regression analysis, we identified four key genes that were used to construct a prognostic risk score.

We assessed the clinical significance of the risk score by investigating its prognostic value in multiple cohorts. Previous studies have established MTM-HCC and VETC-HCC as reliable prognostic markers for HCC patients undergoing surgery.11,12,15,16 Consistent with these findings, our results showed that high-risk patients displayed higher tumor aggressiveness and poorer clinical outcomes compared to those in the low-risk group. Multivariate Cox regression analysis further confirmed the risk score as an independent factor for HCC patients. Notably, in our internal validation cohort, we observed that HCC patients in the high-risk group had a higher prevalence of prognostic MRI features, including non-smooth tumor margin, mosaic architecture, and corona enhancement. These features are known to be associated with aggressive pathologic and molecular phenotypes.16,41,42 On the other hand, our risk score showed associations with aggressive molecular subclasses of HCC, including SURVIVAL_DN in Lee’s classification, S1 and S2 in Hoshida’s classification, and G1/2 and G3 in Boyault’s classification. These subclasses are characterized by their low differentiation, proliferative nature, and unfavorable prognosis.43,44 Collectively, these findings highlight the potential of the risk score to serve as a valuable prognostic tool for HCC patients.

TACE has long been considered the standard therapeutic approach for intermediate-stage HCC,45 while sorafenib has emerged as the first-line treatment for advanced HCC.2 Against this background, we investigated whether stratifying patients based on the risk score could provide valuable insights for treatment decision-making. Our findings revealed that the high-risk group exhibited diminished responsiveness to both TACE and sorafenib treatments, characterized by a higher proportion of non-responders and elevated risk scores. Additional drug sensitivity analyses supported the superior effectiveness of sorafenib in the low-risk group. Furthermore, in the context of TACE treatment, the low-risk group demonstrated superior OS and RFS compared to the high-risk group. Notably, the high AUC value of 0.781 from the ROC analysis for TACE responsiveness emphasizes the value of the risk score in guiding treatment decisions for patients with intermediate and advanced HCC, specifically for TACE.

Regarding the biological mechanisms underlying the risk score, we observed that the high-risk group is associated with processes related to tumor proliferation, including MYC targets V1, DNA repair, and cell cycle regulation. Previous studies have provided insights into the role of MYC in cancer cell proliferation and survival. Sun et al46 reported that inhibiting MYC through dual HDAC and PI3K inhibition hinders the growth of MYC-dependent cancers, underscoring the crucial role of MYC in cancer cell proliferation and survival. Luo et al47 demonstrated that microRNA-146a-5p enhances radiosensitivity in HCC by promoting DNA repair pathways, suggesting the involvement of MYC targets in HCC’s DNA repair mechanisms. Furthermore, Dhanasekaran et al48 showed that targeting the MYC oncogene with antisense oligonucleotides restrains tumor growth in transgenic mouse models of liver and kidney cancer, as well as human liver cancer xenografts, while exhibiting minimal toxicity. These cumulative findings may shed light on the unfavorable prognosis observed in high-risk patients.

In the tumor microenvironment, immune cell populations including MDSCs, TAMs, CAFs, and Tregs have been shown to impede the infiltration and function of CD8+ T cells.49 Our analysis of immune cell infiltration revealed a higher abundance of Tregs and MDSCs in the high-risk group, suggesting the presence of an immunosuppressive microenvironment. Additionally, the use of TIDE and TCIA, along with analysis of an independent immunotherapy cohort, further supports the observation that the high-risk group exhibits a diminished response to immunotherapy. Given the suboptimal response to immunotherapy observed in the high-risk group, we employed a screening framework that has previously demonstrated effectiveness to identify potential therapeutic agents.34 Through this approach, candidate drugs such as paclitaxel and rigosertib could be selected for targeted treatment of high-risk patients.

This study has certain limitations. First, the size of our in-house cohort was relatively small, and overall survival data were not available. Second, some clinical and molecular characteristics were limited in public datasets, potentially affecting the correlation between the risk score and certain variables. Finally, the functional roles of the MTM and VETC-related genes in HCC prognosis have yet to be fully elucidated. To address these gaps, future research should focus on investigating the involvement of these genes in HCC development through in vivo and in vitro studies.

Conclusion

The present study successfully established a risk score based on MTM and VETC-related genes through machine learning methods. This risk score has been shown to be an independent prognostic predictor for HCC patients, effectively predicting their response to TACE, targeted therapy, and immunotherapy. The analysis revealed significant variations in biological function, immune cell infiltration, and drug sensitivity among different risk groups, which can also provide novel insights for optimizing treatment strategies for HCC patients. The robustness of the risk score was further validated in our in-house cohort, demonstrating its potential as a precision clinical management tool.

Abbreviations

MTM, Macrotrabecular-massive; VETC, Vessels encapsulating tumor clusters; HCC, Hepatocellular carcinoma; BCLC, Barcelona Clinic Liver Cancer; TACE, Transarterial chemoembolization; AFP, Alpha-fetoprotein; OS, Overall survival; DFI, Disease-free interval; DSS, Disease-specific survival; PFI, Progression-free interval; RFS, Recurrence free survival; DEGs, Differentially expressed genes; RF, Random forest; GSEA, Gene set enrichment analysis; CCLs: Cancer cell lines; PRISM, Profiling Relative Inhibition Simultaneously in Mixture; CTRP, Cancer Therapeutics Response Portal; AUC, Area under the curve; KNN, K-nearest neighbor; TME, Tumor microenvironment; TIDE, Tumor Immune Dysfunction and Exclusion; TCIA, The Cancer Immunome Atlas; NTP, Nearest template prediction; MRI, Magnetic Resonance Imaging; HR, Hazard Ratios; GSVA, Geneset variation analysis; Tregs, Regulatory T cells; MDSCs, Myeloid-derived suppressor cells; PCA, Principal Component Analysis.

Data Sharing Statement

Gene expression data described in this study were deposited in TCGA, GEO and ICGC databases. Additional relevant data supporting the findings of this study are available from the corresponding authors upon request.

Ethics Approval and Consent to Participate

This study received approval from the Research Ethics Committee of the First Affiliated Hospital, Zhejiang University School of Medicine (approval numbers: IIT20220893A).

Author Contributions

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.

Funding

This study has received funding by Zhejiang Provincial Natural Science Foundation Committee-Zhejiang Society for Mathematical Medicine Joint Fund Major Project (LSD19H180003) and National Natural Science Foundation of China (12090020 and 12090025).

Disclosure

The authors declare no potential conflicts of interest regarding the research, or publication of this article.

References

1. F Bray, J Ferlay, I Soerjomataram. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2020;70(4):313. doi:10.3322/caac.21609

2. Reig M, Forner A, Rimola J, et al. BCLC strategy for prognosis prediction and treatment recommendation: the 2022 update. J Hepatol. 2022;76(3):681–693. doi:10.1016/j.jhep.2021.11.018

3. Xu XF, Xing H, Han J, et al. Risk Factors, Patterns, and Outcomes of Late Recurrence After Liver Resection for Hepatocellular Carcinoma: a Multicenter Study From China. JAMA Surg. 2019;154(3):209–217. doi:10.1001/jamasurg.2018.4334

4. Tabrizian P, Jibara G, Shrager B, et al. Recurrence of hepatocellular cancer after resection: patterns, treatments, and prognosis. Ann Surg. 2015;261(5):947–955. doi:10.1097/SLA.0000000000000710

5. Gillmore R, Stuart S, Kirkwood A, et al. EASL and mRECIST responses are independent prognostic factors for survival in hepatocellular cancer patients treated with transarterial embolization. J Hepatol. 2011;55(6):1309–1316. doi:10.1016/j.jhep.2011.03.007

6. Yang X, Yang C, Zhang S, et al. Precision treatment in advanced hepatocellular carcinoma. Cancer Cell. 2024;42(2):180–197. doi:10.1016/j.ccell.2024.01.007

7. Renne SL, Woo HY, Allegra S, et al. Vessels Encapsulating Tumor Clusters (VETC) Is a Powerful Predictor of Aggressive Hepatocellular Carcinoma. Hepatology. 2020;71(1):183–195. doi:10.1002/hep.30814

8. Calderaro J, Ziol M, Paradis V, Zucman-Rossi J. Molecular and histological correlations in liver cancer. J Hepatol. 2019;71(3):616–630. doi:10.1016/j.jhep.2019.06.001

9. Nagtegaal ID, Odze RD, Klimstra D, et al. The 2019 WHO classification of tumours of the digestive system. Histopathology. 2020;76(2):182–188. doi:10.1111/his.13975

10. Ziol M, Pote N, Amaddeo G, et al. Macrotrabecular-massive hepatocellular carcinoma: a distinctive histological subtype with clinical relevance. Hepatology. 2018;68(1):103–112. doi:10.1002/hep.29762

11. Akiba J, Nakayama M, Sadashima E, et al. Prognostic impact of vessels encapsulating tumor clusters and macrotrabecular patterns in hepatocellular carcinoma. Pathol Res Pract. 2022;238:154084. doi:10.1016/j.prp.2022.154084

12. Zhou HC, Liu CX, Pan WD, et al. Dual and opposing roles of the androgen receptor in VETC-dependent and invasion-dependent metastasis of hepatocellular carcinoma. J Hepatol. 2021;75(4):900–911. doi:10.1016/j.jhep.2021.04.053

13. Fang JH, Zhou HC, Zhang C, et al. A novel vascular pattern promotes metastasis of hepatocellular carcinoma in an epithelial-mesenchymal transition-independent manner. Hepatology. 2015;62(2):452–465. doi:10.1002/hep.27760

14. Calderaro J, Couchy G, Imbeaud S, et al. Histological subtypes of hepatocellular carcinoma are related to gene mutations and molecular tumour classification. J Hepatol. 2017;67(4):727–738. doi:10.1016/j.jhep.2017.05.014

15. Huang CW, Lin SE, Huang SF, et al. The Vessels That Encapsulate Tumor Clusters (VETC) Pattern Is a Poor Prognosis Factor in Patients with Hepatocellular Carcinoma: an Analysis of Microvessel Density. Cancers. 2022;14(21):5428. doi:10.3390/cancers14215428

16. Rhee H, Cho ES, Nahm JH, et al. Gadoxetic acid-enhanced MRI of macrotrabecular-massive hepatocellular carcinoma and its prognostic implications. J Hepatol. 2021;74(1):109–121. doi:10.1016/j.jhep.2020.08.013

17. He X, Li K, Wei R, et al. A multitask deep learning radiomics model for predicting the macrotrabecular-massive subtype and prognosis of hepatocellular carcinoma after hepatic arterial infusion chemotherapy. Radiol Med. 2023;128(12):1508–1520. doi:10.1007/s11547-023-01719-1

18. Feng Z, Li H, Liu Q, et al. CT Radiomics to Predict Macrotrabecular-Massive Subtype and Immune Status in Hepatocellular Carcinoma. Radiology. 2023;307(1):e221291. doi:10.1148/radiol.221291

19. Fang JH, Xu L, Shang LR, et al. Vessels That Encapsulate Tumor Clusters (VETC) Pattern Is a Predictor of Sorafenib Benefit in Patients with Hepatocellular Carcinoma. Hepatology. 2019;70(3):824–839. doi:10.1002/hep.30366

20. Lin C, He Y, Liu M, et al. Vessels That Encapsulate Tumor Clusters (VETC) Predict cTACE Response in Hepatocellular Carcinoma. J Hepatocell Carcinoma. 2023;10:383–397. doi:10.2147/JHC.S395903

21. Lin W, Lu L, Zheng R, et al. Vessels encapsulating tumor clusters: a novel efficacy predictor of hepatic arterial infusion chemotherapy in unresectable hepatocellular carcinoma. J Cancer Res Clin Oncol. 2023;149(19):17231–17239. doi:10.1007/s00432-023-05444-0

22. Kurebayashi Y, Matsuda K, Ueno A, et al. Immunovascular classification of HCC reflects reciprocal interaction between immune and angiogenic tumor microenvironments. Hepatology. 2022;75(5):1139–1153. doi:10.1002/hep.32201

23. Lee JS, Chu IS, Heo J, et al. Classification and prediction of survival in hepatocellular carcinoma by gene expression profiling. Hepatology. 2004;40(3):667–676. doi:10.1002/hep.20375

24. Boyault S, Rickman DS, de Reynies A, et al. Transcriptome classification of HCC is related to gene alterations and to new therapeutic targets. Hepatology. 2007;45(1):42–52. doi:10.1002/hep.21467

25. Chiang DY, Villanueva A, Hoshida Y, et al. Focal gains of VEGFA and molecular classification of hepatocellular carcinoma. Cancer Res. 2008;68(16):6779–6788. doi:10.1158/0008-5472.CAN-08-0742

26. Hoshida Y, Nijman SM, Kobayashi M, et al. Integrative transcriptome analysis reveals common molecular subclasses of human hepatocellular carcinoma. Cancer Res. 2009;69(18):7385–7392. doi:10.1158/0008-5472.CAN-09-1089

27. Hong W, Zhang Y, Wang S, et al. Deciphering the immune modulation through deep transcriptomic profiling and therapeutic implications of DNA damage repair pattern in hepatocellular carcinoma. Cancer Lett. 2024;582:216594. doi:10.1016/j.canlet.2023.216594

28. Mounir M, Lucchetta M, Silva TC, et al. New functionalities in the TCGAbiolinks package for the study and integration of cancer data from GDC and GTEx. PLoS Comput Biol. 2019;15(3):e1006701. doi:10.1371/journal.pcbi.1006701

29. Liu J, Lichtenberg T, Hoadley KA, et al. An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell. 2018;173(2):400–416e11. doi:10.1016/j.cell.2018.02.052

30. Zhang P, Ono A, Fujii Y, et al. The presence of vessels encapsulating tumor clusters is associated with an immunosuppressive tumor microenvironment in hepatocellular carcinoma. Int J Cancer. 2022;151(12):2278–2290. doi:10.1002/ijc.34247

31. Mariathasan S, Turley SJ, Nickles D, et al. TGFbeta attenuates tumour response to PD-L1 blockade by contributing to exclusion of T cells. Nature. 2018;554(7693):544–548. doi:10.1038/nature25501

32. Lex A, Gehlenborg N, Strobelt H, et al. UpSet: visualization of Intersecting Sets. IEEE Trans Vis Comput Graph. 2014;20(12):1983–1992. doi:10.1109/TVCG.2014.2346248

33. Wu T, Hu E, Xu S, et al. clusterProfiler 4.0: a universal enrichment tool for interpreting omics data. Innovation. 2021;2(3):100141. doi:10.1016/j.xinn.2021.100141

34. Yang C, Huang X, Li Y, et al. Prognosis and personalized treatment prediction in TP53-mutant hepatocellular carcinoma: an in silico strategy towards precision oncology. Brief Bioinform. 2021;22(3):bbaa164.

35. Zeng D, Ye Z, Shen R, et al. IOBR: multi-Omics Immuno-Oncology Biological Research to Decode Tumor Microenvironment and Signatures. Front Immunol. 2021;12:687975. doi:10.3389/fimmu.2021.687975

36. Jiang P, Gu S, Pan D, et al. Signatures of T cell dysfunction and exclusion predict cancer immunotherapy response. Nat Med. 2018;24(10):1550–1558. doi:10.1038/s41591-018-0136-1

37. Slenter DN, Kutmon M, Hanspers K, et al. WikiPathways: a multifaceted pathway database bridging metabolomics to other omics research. Nucleic Acids Res. 2018;46(D1):D661–D667. doi:10.1093/nar/gkx1064

38. Zhang Q, Lu S, Li T, et al. ACE2 inhibits breast cancer angiogenesis via suppressing the VEGFa/VEGFR2/ERK pathway. J Exp Clin Cancer Res. 2019;38(1):173. doi:10.1186/s13046-019-1156-5

39. Yu B, Zhu N, Fan Z, et al. miR-29c inhibits metastasis of gastric cancer cells by targeting VEGFA. J Cancer. 2022;13(14):3566–3574. doi:10.7150/jca.77727

40. Chen X, Zeng K, Xu M, et al. SP1-induced lncRNA-ZFAS1 contributes to colorectal cancer progression via the miR-150-5p/VEGFA axis. Cell Death Dis. 2018;9(10):982. doi:10.1038/s41419-018-0962-6

41. Zhu Y, Yang L, Wang M, et al. Preoperative MRI features to predict vessels that encapsulate tumor clusters and microvascular invasion in hepatocellular carcinoma. Eur J Radiol. 2023;167:111089. doi:10.1016/j.ejrad.2023.111089

42. Wei H, Fu F, Jiang H, et al. Development and validation of the OSASH score to predict overall survival of hepatocellular carcinoma after surgical resection: a dual-institutional study. Eur Radiol. 2023;33(11):7631–7645. doi:10.1007/s00330-023-09725-7

43. Zucman-Rossi J, Villanueva A, Nault JC, Llovet JM. Genetic Landscape and Biomarkers of Hepatocellular Carcinoma. Gastroenterology. 2015;149(5):1226–1239e4. doi:10.1053/j.gastro.2015.05.061

44. Hoshida Y, Toffanin S, Lachenmayer A, et al. Molecular classification and novel targets in hepatocellular carcinoma: recent advancements. Semin Liver Dis. 2010;30(1):35–51. doi:10.1055/s-0030-1247131

Comments (0)

No login
gif