Molecular Characterization and Landscape of Breast cancer Models from a multi-omics Perspective

Breast cancer currently represents 30% of new cancer cases in American women (American Cancer Society, 2022). Therapy is dictated by the stage and individual characteristics of each cancer, with striking differences among the major subtypes. Given these differences, choosing an accurate breast cancer model is essential. Because human breast cancer is highly heterogeneous at the genomic, transcriptomic, and proteomic levels, choosing the optimal model for a given research question can be challenging [1]. Currently, a variety of in vivo and in vitro research models are available, ranging from cell lines, 3D cultures, murine models, and mammospheres to patient-derived xenografts (PDX) and patient-derived organoids (PDO). With the rapid progress of omics technologies, researchers are now gathering and cataloging information on the molecular mechanisms of both cancer and the research models used to study it [2]. Recent advances in machine learning have been making inroads and are poised to change precision medicine [3]. This review examines the molecular characterization and landscape of breast cancer models from a multi-omic perspective (Fig. 1), with particular attention to how each system resembles human breast cancer subtypes and the advantages of each.

Fig. 1

Multi-omics techniques have been uncovering the intrinsic features and parallels between human breast cancer and research models

Multi-omics Analysis of Human Breast Cancers

Transcriptomics in Human Breast Cancer

Prior to the advent of microarrays, breast cancer classification was determined by the status of markers such as BRCA, estrogen receptor (ER), progesterone receptor (PR) and HER2. Diagnosis was refined by relying on the histological description and other tumor characteristics, including metastasis and invasion of the lymph nodes. The initial description of a classification system using transcriptomics in a semi-supervised clustering analysis revolutionized the field and introduced the now familiar luminal A / luminal B / basal / HER2 / normal-like subtypes [4,5,6]. The initial set of classification genes was refined, the claudin-low subtype was added [7, 8], and the gene set was eventually FDA approved to predict the subtype of breast cancer. Given the diversity of gene expression present within basal tumors, it was not surprising that subtypes with differences in survival outcome were noted [9]. Following this, the basal subtype was refined to contain 4 subtypes: two basal-like (BL1, BL2), a mesenchymal (M), and a luminal androgen receptor (LAR) subtype [10]. In 2012, a large comprehensive multi-omics study of 825 human breast cancer samples was published as part of The Cancer Genome Atlas (TCGA) project, revealing novel molecular features of each subtype [11]. This study revealed gene expression patterns that were characteristic of, and augmented, the intrinsic subtypes, ranging from ESR1, GATA3, FOXA1, XBP1 and cMYB in ER+/luminal-like subtypes to high expression of receptor tyrosine kinases like FGFR4 and HER1/EGFR and loss of PTEN and INPP4B in the HER2 subtype.
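
Centroid-based subtype predictors of the kind described above are often implemented by correlating a tumor's expression profile with one reference profile (centroid) per subtype and assigning the best match. The sketch below illustrates that idea with Spearman correlation. It is a minimal illustration, not a clinical classifier: the six-gene panel is far smaller than any real signature, and every centroid and sample value is invented for the example.

```python
# Toy nearest-centroid subtype call using Spearman correlation.
# Gene symbols are real, but ALL centroid and sample values are hypothetical.

def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

GENES = ["ESR1", "GATA3", "FOXA1", "ERBB2", "KRT5", "KRT14"]
CENTROIDS = {  # hypothetical centered log-expression per subtype
    "luminal": [2.0, 1.8, 1.5, 0.0, -1.5, -1.2],
    "HER2": [-0.5, 0.2, 0.4, 2.5, -0.8, -1.0],
    "basal": [-2.0, -1.5, -1.8, -0.3, 2.0, 1.7],
}

def classify(sample):
    """Return the best-matching subtype and all correlation scores."""
    scores = {name: spearman(sample, c) for name, c in CENTROIDS.items()}
    return max(scores, key=scores.get), scores

# Hypothetical tumor with low ESR1/GATA3/FOXA1 and high keratin 5/14.
sample = [-1.7, -1.2, -1.4, 0.1, 1.9, 1.5]
subtype, scores = classify(sample)
print(subtype, scores)
```

Using rank correlation rather than raw distance makes the call less sensitive to platform-specific scaling, which is one reason correlation to centroids is a common design choice for this kind of classifier.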

The shift from microarrays to RNAseq has also allowed the development of RNA sequencing at the single-cell level. This has uncovered new dimensions of mammary gland development and breast cancer heterogeneity. Lineage tracing of mammary epithelium at different developmental stages revealed cell populations and differentially expressed genes for each cell type according to the developmental phase, composing a new cluster of signature genes and shedding light on cell fate decisions [12, 13]. Although the specific timeline in which the mammary cell commits to a lineage (basal, luminal progenitor or mature luminal) remains an open area of investigation, it appears that embryonic cells are closely related to the basal population and that commitment starts in early postnatal stages, finishing during puberty. Interestingly, intermediate populations arise, and population composition constantly changes during development.

In breast cancer, while bulk RNAseq was found to closely resemble single-cell profiling, particularly in the Luminal and HER2-enriched subtypes, the single-cell approach helped to define other characteristics important to the understanding of cancer biology [14]. Within tumors, mixed subtypes and varying proportions of carcinoma and non-carcinoma cells are observed; the latter include mainly stromal cells (such as fibroblasts) and immune cells (e.g., tumor-associated macrophages and different phenotypes of T and B lymphocytes). As an example, while Luminal subtype tumors are largely composed of carcinoma cells expressing high levels of ER and its canonical pathway genes, basal breast cancer is marked by immune infiltration, which likely contributes to the high heterogeneity of this subtype.

In addition to single-cell level work, spatial transcriptomics has allowed the complex interactions within human breast cancer, and the subsets of cells that make up a tumor, to be examined. In a recent analysis, breast cancer was stratified into nine ecotypes (E1-E9), with the various cell populations defined by a single-cell derived subtype (SC) algorithm that was then extended to bulk RNAseq data to partially associate it with the PAM50 intrinsic molecular subtypes [15]. Although a total of 9 major cell types have been identified, most ecotypes agglomerate cells from the major cell lineages, i.e. epithelial, stromal, and immune cells, but in different cell states and with unique compositions of the last two. Importantly, the spatial organization of the cell states in discrete zones of the tumor suggests a role for the microenvironment in driving the zone phenotype, such as proliferative or mesenchymal-like. Different prognoses can also be correlated with the ecotype. For instance, E7 and E3, which are enriched in HER2_SC and HER2 tumors or basal_SC, cycling and luminal progenitor cells, respectively, have a worse 5-year survival. The interplay between cell types described in this manuscript provides a unique glimpse of the complex interactions in cancer biology that were first seen through histology.

Proteomics

It is well appreciated that the correlation between mRNA and protein abundance may vary greatly, emphasizing the importance of validating how well breast cancer mRNA subtyping is reflected in proteomics. Overall, unsupervised proteomics analysis of breast cancer tumors resembles the PAM50 subtypes for the basal-like, Luminal A and normal-like subtypes, whereas Luminal B and HER2+ present mingled profiles, reflecting similarities between these two phenotypes [16]. Furthermore, highly proliferative subtypes (basal-like, Luminal B and HER2+) were found to have a greater correlation between transcriptome and proteome than the low-proliferative subtypes (normal-like and Luminal A). Proteomes from Luminal, HER2+ and basal-like tumors are enriched in E2F and MYC targets as well as G2M checkpoint proteins; however, basal-like tumors are distinguished by immune markers, especially MHC class proteins. Increased proliferation and glycolysis, features of the Warburg effect, are notably observed in the HER2+ and Luminal B subtypes. Just as was noted with gene expression for the basal subtype, proteomics and metabolomics can divide basal breast cancer into 3 subtypes: C1, enriched in sphingolipids and long-chain and unsaturated fatty acids; C2, presenting high metabolism of glutamate and carbohydrates and oxidation reactions; and C3, whose metabolomic profile is more closely related to normal breast tissue [17]. The mRNA basal breast cancer subtype luminal androgen receptor (LAR) fits within the C1 subtype, while the basal-like immune-suppressed (BLIS), immunomodulatory (IM) and mesenchymal-like (M) subtypes fall within C2 and C3. In addition, LAR tumors are marked by high activation of the ceramide pathway and high levels of SP1, while BLIS tumors are abundant in NAAG, which raises these as potential tumor-promoting metabolites.

Genomic Alterations

Underlying the transcriptomic and proteomic characteristics are genomic alterations. An obvious example is the amplification of HER2 that results in overexpression and signaling. However, a more nuanced examination reveals that the amplification of HER2 includes other neighboring genes, with resulting gene expression alterations [18]. Since approximately 62% of gene amplifications result in elevated gene expression, this allows for copy number alterations to be predicted from gene expression data [19, 20]. Aside from predicting alterations, the TCGA project resulted in extensive sequencing data. As expected, a majority of tumors were noted to have p53 alterations with a substantial fraction also harboring PIK3CA mutations [11]. The TCGA data does permit a detailed view of events in specific subtypes. For example, the Luminal B subtype is overrepresented with ATM loss and Cyclin D1 and MDM2 amplification.
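
One simple way to see how copy number alterations can be predicted from expression is to order genes by chromosomal position, compare each gene with a reference profile, and smooth over neighboring genes so that a contiguous block of elevated expression stands out. The sketch below illustrates this general idea only; it is not the method of the cited studies, and the window size, threshold, and all expression values are assumptions chosen for the example.

```python
# Sketch: flag putative copy number gains from position-ordered expression.
# Window, threshold, and all expression values are hypothetical.

def moving_average(values, window):
    """Centered moving average; the window shrinks at both ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def call_gains(tumor, reference, window=3, threshold=0.8):
    """Flag genes whose smoothed tumor-minus-reference log2 expression exceeds threshold."""
    relative = [t - r for t, r in zip(tumor, reference)]
    smoothed = moving_average(relative, window)
    return [s > threshold for s in smoothed]

# Hypothetical log2 expression for 8 genes ordered along one chromosome arm.
reference = [5.0, 6.0, 4.5, 7.0, 5.5, 6.5, 5.0, 6.0]
tumor = [5.1, 5.9, 4.6, 8.5, 7.2, 8.0, 5.1, 6.1]

gains = call_gains(tumor, reference)
print(gains)
```

Smoothing over neighbors reflects the observation in the text that an amplicon typically spans several adjacent genes: a single overexpressed gene is weak evidence of amplification, whereas a run of neighboring genes that are all elevated is more suggestive.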

While the TCGA data and other large-scale sequencing resources such as COSMIC [21] have revealed individual tumors with single nucleotide variations relative to the reference genome, many of these remain variants of unknown significance. For other mutations, such as frameshift and missense mutations, additional information about their potential importance can be gained by combining the mutation data with the cancer dependency atlas [22]. This integrative approach is essential for combining the multiple data streams, since a combined analysis of transcriptomic and sequence data is much more powerful than either alone.

Summary of the multi-omic Analysis of Human Breast cancer

Recent publications of multi-omic analyses highlight the utility of data-intensive biology, but also illustrate the complexity of the analysis. A recent manuscript detailing the workflow illustrates the effort required for a single patient with multiple biopsies, but also reveals how the multiple data streams can be successfully integrated [23]. Classification of breast cancer tumors according to the classic PAM50 mRNA subtypes reflects the transcriptomic landscape, but interrogating each of these subtypes from a multi-omics perspective reveals novel molecular features, helping to explain their heterogeneity [11]. For instance, stratifying the basal-like subtype into new groups according to the nuances of the proteomic and transcriptomic landscapes emphasizes the complexity of these cancers [10, 17]. In addition, unique genomic alterations have been found in the normal-like, basal-like, luminal, HER2+, and claudin-low subtypes [11, 19,20,21]. These observations provide a comprehensive spectrum of breast tumors. Hence, acknowledging the intrinsic subgroups and their underlying molecular features can illuminate how cancer behavior and response to therapy are dictated.

A number of model systems are available to study human breast cancer. Each system (Fig. 2) has strengths and weaknesses and is usually suited to particular experiments. Recent omic data has allowed a more detailed examination of the suitability of these model systems.

Fig. 2

Simplified representation of the generation of breast cancer (BC) models. (a) Transgenic mice: one common approach for generating genetically engineered mouse models of BC is to overexpress an oncogene driven by a specific promoter targeting the mammary gland, such as MMTV. (b) 3D culture: the combination of a supporting scaffold (scaffold-dependent model), such as hydrogels and inert matrices, and different cell types allows cell growth and cell-extracellular matrix and cell-cell interactions. (c) Mammospheres (MM): these spheroids can originate either from breast cancer cell lines (BCCL) or from a BC biopsy. A single-cell suspension is obtained from the material, cell phenotypes are sorted for stem and progenitor cells, and the cells are then cultured on an ultra-low adherent surface for MM formation. (d) Patient-derived xenograft (PDX): tissue fragments from the patient's tumor are transplanted directly into immunodeficient mice, heterotopically or orthotopically, with no need for an in vitro preparation step (F0). Once the tumor reaches an appropriate size, it can be dissected and expanded by reimplanting it into another recipient mouse (F1). The tumor expansion can go on for multiple generations (Fn). (e) Patient-derived organoid (PDO): tissue fragments from the patient's tumor are digested and cultured in a 3D extracellular matrix hydrogel, giving rise to organoids.

Murine Models

Subtypes of Mouse Models

The description of inbred strains of mice that were susceptible to mammary tumors after fostering pups [24] led to the discovery of mouse mammary tumor virus (MMTV) through an interesting series of discoveries, from MMTV to the genes misregulated at the integration sites [25, 26]. Modeling cancer in mice was revolutionized by the description of some of the first transgenic mouse models of breast cancer, which expressed well-known oncogenes like Myc [27], Ras [28] and Neu [29] under the control of the glucocorticoid-regulated MMTV promoter/enhancer [30]. In addition to overexpression models, a series of knockouts was used to study breast cancer; early knockouts often suffered from embryonic lethality [31,32,33] or, in a surprising finding for BRCA1 heterozygous mice, lacked tumor formation [34, 35]. Development of independent MMTV-Cre transgenics in the Hennighausen [36] and Muller [37] labs allowed for tissue-specific control of gene expression, including expression of the activated Neu allele under the control of the endogenous promoter [37]. Today, mouse models of breast cancer can be induced and de-induced through tet-on systems [38], can be lineage traced [39, 40], can carry multiple transgenes with an IRES system [41], and incorporate numerous other technical advances that allow precise questions to be addressed. These advances allow many aspects of tumor biology to be studied, from metastasis in widely used models like MMTV-PyMT [42] to tumor heterogeneity [43, 44]. Indeed, heterogeneity is present across many mouse models [45]; while this mimics human breast cancer, it also confounds the analysis, since treating a tumor simply as the product of an overexpressed oncogene is too reductionist an approach. Hence, understanding the molecular landscape of murine models and their variety of tumors becomes crucial.

Histological Subtypes

Early in the analysis of mouse mammary tumor models it was noted that specific histological characteristics were associated with expression of given oncogenes. This was encapsulated in a review of various initiating oncogenes and the physical characteristics of their tumors [46]. While the hypothesis that there are defining histological subtypes associated with specific oncogenes holds, an analysis of a larger number of samples revealed heterogeneity within individual models: a dominant histology was noted, with other subtypes arising in a smaller fraction of tumors [43, 44]. More recently, it was noted that specific gene expression programs were associated with the various subtypes and were predictive in nature [45]. While human breast cancer is not associated with the same subtypes noted in the mouse models, there are patterns that do hold, including an epithelial to mesenchymal transition (EMT). However, the EMT noted in primary tumors in mouse models is not associated with metastasis [44], a marked difference from the prevailing opinion in human breast cancer [47]. Examination of the histological subtype is a key component in the analysis of mouse mammary tumor models, as these subtypes are reflected in the gene expression patterns that were noted to occur [44].

A key difference between mouse models and human breast cancer is noted at the immunohistochemical level. A substantial fraction of human breast cancers stain positively for the ER and the PR but there is a paucity of mouse models that are ER or PR positive. Stat1 deficient mice develop ER positive mammary tumors [48] and the MMTV-PyMT tumors [42] begin as ER positive but transition to ER negative during tumor progression [49], but other genetically engineered mice are lacking in this regard. This is poised to change with the development of lines such as the activated ESR1 mice [50] that can be interbred with other GEMs, which should allow for the development of ER positive mouse models of breast cancer.

Transcriptomics in Mouse Mammary Tumors

Transcriptomic profiles of murine mammary tumor models have now been detailed by a number of groups. Aside from individual characterization of models, three studies have completed larger cohort analyses spanning many mouse models. These include key comparisons of mouse models to human breast cancer [51,52,53] that revealed the extent of similarity of the models to the human disease. Removal of batch effects and interspecies differences allowed for the direct comparison of human breast tumors and murine tumors. This allowed the identification of common features across species and association with the human breast cancer subtypes (Table 1).
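
To make the idea of removing dataset-level differences before a cross-species comparison concrete, the sketch below median-centers each gene within each dataset and then merges samples on mapped orthologs. This is a deliberately simple stand-in, not the normalization used in the cited studies; dedicated batch-correction methods such as ComBat are more elaborate. The ortholog map and all expression values are hypothetical.

```python
# Sketch: per-dataset median centering before merging human and mouse data.
# The ortholog map and all expression values are hypothetical.
from statistics import median

def median_center(dataset):
    """Subtract each gene's within-dataset median so datasets share a baseline."""
    centered = {}
    for gene, values in dataset.items():
        m = median(values)
        centered[gene] = [v - m for v in values]
    return centered

def merge_on_shared_genes(human, mouse, orthologs):
    """Center each dataset separately, then concatenate samples per mapped ortholog pair."""
    human_c, mouse_c = median_center(human), median_center(mouse)
    merged = {}
    for human_gene, mouse_gene in orthologs.items():
        if human_gene in human_c and mouse_gene in mouse_c:
            merged[human_gene] = human_c[human_gene] + mouse_c[mouse_gene]
    return merged

# Hypothetical log2 expression; the two datasets sit on different scales.
human = {"KRT5": [2.0, 8.0, 3.0], "ESR1": [9.0, 1.0, 8.0]}
mouse = {"Krt5": [11.0, 5.0, 6.0], "Esr1": [4.0, 12.0, 5.0], "Wap": [7.0, 7.5, 8.0]}
orthologs = {"KRT5": "Krt5", "ESR1": "Esr1"}  # human symbol -> mouse symbol

merged = merge_on_shared_genes(human, mouse, orthologs)
print(merged)
```

Genes without a mapped ortholog (here the mouse gene Wap) are dropped, so the merged matrix contains only features that are comparable across species. The merged profiles could then be clustered together to ask which murine tumors group with which human subtypes.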

Table 1 Association of murine models with the human breast cancer samples. * Distinction between Luminal A and Luminal B not provided by authors. Note that not all murine models currently available are shown in this table. Only the most representative subtypes of each model published in the literature were assigned here. MMTV = Mouse mammary tumor virus promoter / enhancer, WAP = Whey acidic protein promoter

Signature genes of the basal-like subtype, which include Laminin gamma 2, Keratins 5, 6B, 13, 14 and 15, TRIM29, c-KIT and CRYAB, showed high agreement in the models harboring BRCA, p53 and Rb deficiency (Brca1+/-;p53+/-;IR, MMTV-Cre;Brca1Co/Co;p53+/-, MMTV-Wnt1, and a few DMBA-induced tumors) [52]. The MMTV-Wnt1 model can also overlap with the normal breast-like subtype, while the MMTV-Neu tumors unexpectedly associate with the Luminal A subtype instead of the HER2 subtype [53]. While one analysis suggested that MMTV-PyMT and WAP-Myc models were similar to HER2+/ER- and/or luminal tumors [52], a more nuanced examination that included the histological subtypes suggested that Myc and PyMT tumors with EMT histology resembled the gene expression markers of the claudin-low subtype [51], highlighting the heterogeneity of tumors that can be driven by these oncogenes.

The MMTV-Myc model can give rise to distinct tumor histological subtypes, each associated with a gene expression cluster [44]. Interrogation of the transcriptomic data with cell signaling signatures revealed that the histological subtypes activate specific pathways. For example, papillary and microacinar tumors both activate the E2F1 and Myc pathways, but papillary tumors show activation of Stat3 while microacinar tumors have elevated β-catenin signaling. EMT tumors were noted for high expression of the Ras pathway but low expression of Myc. Given the dominance of KRas over Myc [54], the induction of KRas mutations in the MMTV-Myc model and the subsequent development were not surprising. The mouse EMT subgroup appears to resemble human data more closely: elevation of the EMT signature is found in the triple negative subtype, which is also associated with greater metastatic potential.

The histological disconnect between ER positive human breast cancer and the mouse models that fail to develop ER positive tumors was again observed in the transcriptomic data [52, 53]. However, this analysis did identify an association of human ER positive luminal tumors with the WAP-Int3 (Notch4) [55], MMTV-Pik3ca-H1047R [56] and Stat1−/− models.

Whole Genome Sequencing of Mouse Mammary Tumors

Relative to the models analyzed through transcriptomics analysis, there is a dearth of whole genome sequencing data for mouse mammary tumor models. Indeed, a search of the literature reveals whole genome sequence data for only four models [18, 57, 58], including the NRL-PRL model which has elevated prolactin [57], p53 null and the MMTV-Neu [59] and MMTV-PyMT [42] models. The Prolactin model identified both mutations and copy number alterations in Kras and transcriptional analysis confirmed activation of the pathway [57]. Not surprisingly, the p53 null model resembled the basal breast cancer subtype and the WGS data was used in conjunction with transcriptomic data to identify new therapeutic approaches [58].

Compared to human tumors, both the MMTV-Neu and MMTV-PyMT murine models have a lower mutation burden. Copy number alterations (CNA) are markedly predominant in MMTV-Neu and are likely associated with increased activity of the PI3K/AKT/MTOR signaling pathway. In the MMTV-Neu tumors, CNA in the 11D locus, a chromosomal region homologous to human 17q21.33, leads to amplification of the genes Collagen type 1 alpha 1 (COL1A1) and Chondroadherin (CHAD). This genomic event occurs in 8% of human breast cancers, of which 25% are HER2-enriched, and depletion of these two genes impacts migration and the ability to form tumors [18]. Interestingly, over 80% of MMTV-PyMT tumors have a V483M mutation in Ptprh, a phosphatase targeting EGFR and other kinases. Mutation of Ptprh results in phosphorylation of EGFR and upregulation of downstream pathways at the transcriptomic level [60].

Together, these models illustrate the power of integrated sequence and transcriptomic analysis, with each model system having accumulated genomic events that result in transcriptional changes. Other studies have employed exome sequencing to identify mutations in other model systems, including other backgrounds of MMTV-PyMT [61] and Trp53 and BRCA2 deficient strains [62].

Epigenomics of Mouse Mammary Tumors

Studies of epigenomic regulation in murine models of breast cancer are currently lacking, hence an accurate comparison with human breast cancers from this perspective is not yet possible. Nonetheless, a few conclusions can be drawn from the epigenetic information available in the literature. In the c-Neu (Erbb2/HER2) cancer models driven by the MMTV promoter, overexpression of the oncogene and tumor development are owed to promoter demethylation, an event that happens in early stages of development [63]. Genome-wide chromosomal losses in these animals, such as loss of heterozygosity at chromosome 4 or 15, are also likely to be associated with an unidentified epigenetic mechanism [64]. Epigenetic reprogramming can also be influenced by diet: calorie restriction was shown to preserve ER expression in MMTV-neu transgenic mice by differentially methylating CpG islands within or in the flanking regions of the ESR1/ESR2 genes [65]. In overweight and obese mice, expression of the methylation enzyme DNMT1 is increased compared to lean/calorie-restricted mice, hence it is possible that other genes are being targeted. Besides epigenetic regulation at the gene level, energy balance is linked to histone modifications [
