The underlying study in which the present investigation was conducted has been described in detail elsewhere [11]. In brief, we conducted a case-control study nested within a cohort of 15,395 women aged 21 to 85 years who received a histopathologic diagnosis of benign breast disease (BBD) within the Kaiser Permanente Northwest Region (KPNW) health care system between August 3, 1971 and December 31, 2006 and were followed until July 1, 2015.
Cases and controls
Cases were women with a biopsy for BBD who developed a subsequent first diagnosis of invasive breast cancer (IBC) at least one year after the index BBD biopsy; they were ascertained by linking records from the BBD cohort to the KPNW Tumor Registry. The KPNW Tumor Registry has an excellent follow-up rate, even for women who are no longer health plan members, and it maintained a follow-up rate of 98% of patients (living and dead) during the study period. Women who were diagnosed with ductal carcinoma in situ prior to the first BBD biopsy or were diagnosed with IBC prior to or within a year of the BBD biopsy were excluded from the study, as were those who had no breast tissue in the biopsied material. For each case, we randomly selected one control from the BBD cohort using risk-set sampling. Each control was individually matched to the corresponding case on age at diagnosis of BBD (± 1 year) and was sampled randomly from the risk set with replacement [12]. In addition to being alive and free of invasive breast cancer during the same follow-up period as that for the corresponding case, each eligible control had not undergone a mastectomy before the date of diagnosis of breast cancer for its matched case.
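The control selection described above can be illustrated with a minimal sketch. It assumes that follow-up is measured in years from the index BBD biopsy and that each woman has a single exit time (IBC diagnosis, death, mastectomy, or end of follow-up). The `Woman` record, its field names, and `sample_control` are hypothetical names introduced for illustration only; they are not taken from the study's code.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Woman:
    id: int
    age_at_bbd: float   # age at BBD diagnosis (years)
    exit_time: float    # years from index biopsy to IBC, death, mastectomy, or end of follow-up
    is_case: bool       # True if follow-up ended with an IBC diagnosis


def sample_control(case: Woman, cohort: List[Woman], rng: random.Random,
                   caliper: float = 1.0) -> Optional[Woman]:
    """Draw one control at random from the case's risk set.

    The risk set holds women matched on age at BBD diagnosis (within
    `caliper` years) who were still under follow-up, and hence free of IBC
    and mastectomy, when the case was diagnosed. Women who become cases
    later remain eligible, and because each draw is independent the same
    woman can serve as a control for several cases (sampling with
    replacement).
    """
    risk_set = [
        w for w in cohort
        if w.id != case.id
        and abs(w.age_at_bbd - case.age_at_bbd) <= caliper
        and w.exit_time > case.exit_time
    ]
    return rng.choice(risk_set) if risk_set else None
```

Passing a seeded `random.Random` instance keeps the draw reproducible; a case with an empty risk set returns `None` and would need separate handling.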
Histopathology
We obtained BBD tissue blocks for the cases and controls. Hematoxylin and eosin (H&E)-stained sections prepared from the blocks were reviewed by a breast pathologist who was blinded to the case-control status of the study subjects. The BBD lesions were classified according to the well-established criteria of Page and colleagues [3, 13, 14, 15] as follows: no lesions/non-proliferative lesions (cysts, fibrosis, apocrine metaplasia, adenosis, simple fibroadenoma); proliferative disease without atypia (mild, moderate, or florid epithelial hyperplasia; columnar cell change and columnar cell hyperplasia; complex fibroadenoma; sclerosing adenosis; radial scar; complex sclerosing lesion; papilloma); and proliferative disease with atypia (atypical ductal hyperplasia, atypical lobular hyperplasia, columnar cell change and columnar cell hyperplasia with atypia/flat epithelial atypia).
Senescence prediction
The H&E sections were scanned with a Panoramic 250 High Capacity Slide Scanner (3D Histech) using brightfield with a 20× air objective, numerical aperture 0.8. Cellular senescence was predicted from nuclear morphology observed in the scanned H&E images of BBD tissue using AI methods. To this end, whole slide images were split into 2048 × 2048 pixel tiles, which were then rescaled by 50% to 1024 × 1024 pixels. Of the 1,652,812 tiles extracted, 60 tiles were randomly selected for annotation, in which we identified nuclei in samples drawn from the major tissue types (epithelial, stromal, and adipose tissue). A segmentation model based on U-Net [16] was trained to identify nuclei in these samples and then applied to segment nuclei across the entire set of image tiles. We also applied segmentation models that we had previously trained to identify adipose and epithelial tissues [10] and a published model to identify terminal duct lobular units (TDLUs) [17]. Nuclei classified as TDLU were excluded from the epithelial type, so that epithelial nuclei were classified as belonging to either TDLU or non-TDLU epithelial tissue. To the 8,305,727 identified nuclei, we applied five senescence prediction models, each previously trained on fibroblasts in cell culture using a different senescence inducer: ionizing radiation (IR), replicative senescence (RS), doxorubicin (Doxo), antimycin-A (Anti), and atazanavir-ritonavir treatment (Atvr) [9]. We also evaluated senescence using a model that was trained on all three drug treatments together (AAD), because we had previously observed a high per-nucleus association between the scores for these treatments [10]. Each model is based on an ensemble of 10 independent neural networks whose results are averaged, as described in our previous work. After generating scores for each nucleus, scores were averaged by model and tissue per individual.
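The two averaging steps at the end of this paragraph (across the networks of an ensemble, then across nuclei by model and tissue within each individual) can be sketched as follows. This is an illustrative sketch only: the record layout and the names `ensemble_score` and `average_scores` are assumptions made here, and the network outputs are taken as already computed.

```python
from collections import defaultdict
from statistics import fmean


def ensemble_score(network_outputs):
    """Average the outputs of the networks in one ensemble for one nucleus."""
    return fmean(network_outputs)


def average_scores(nuclei):
    """Average per-nucleus scores by (individual, model, tissue).

    `nuclei` is an iterable of dicts with the hypothetical keys
    'individual', 'model', 'tissue', and 'outputs', where 'outputs' holds
    the predictions of the individual networks (10 per model in the study).
    """
    groups = defaultdict(list)
    for nucleus in nuclei:
        key = (nucleus["individual"], nucleus["model"], nucleus["tissue"])
        groups[key].append(ensemble_score(nucleus["outputs"]))
    return {key: fmean(scores) for key, scores in groups.items()}
```

Keeping the per-nucleus ensemble mean separate from the per-individual mean makes it easy to retain nucleus-level scores for later spatial analysis.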
We investigated spatial patterns of cellular senescence in breast tissue by identifying the nuclei within the top 10th percentile of scores and then examining the scores of surrounding nuclei by distance. Based on patterns observed during the analysis, we fit negative exponential curves to the scores of all nuclei of the same tissue type near high-scoring nuclei (Supplementary Fig. 1a). Curves with fits of R² > 0.1 were classified as good fits because they generally showed a negative exponential pattern. To characterize these patterns, we focused on two spatial metrics, percent difference and half-life (Supplementary Fig. 1b). The first and last fit points were used to calculate the percent difference, showing the magnitude of senescence score change between the nearest and farthest buckets, which were defined as 0–233 μm and 2097–2330 μm. Additionally, we calculated half-life to determine the rate of exponential senescence decay by distance. Senescence prediction and spatial analysis were performed with Python, Keras, and SciPy.
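To make the two spatial metrics concrete, the sketch below fits a decay curve to mean scores in ten 233 μm distance buckets and derives R², percent difference, and half-life. Several points are assumptions made here rather than details given in the text: the curve is taken to be y = a·exp(−k·x) with no offset term, it is fitted by least squares on log(y) in plain Python rather than with SciPy, bucket midpoints serve as distances, and percent difference is expressed relative to the first fitted point. All function names are hypothetical.

```python
import math

BUCKET_WIDTH_UM = 233.0   # nearest bucket 0-233 um, farthest 2097-2330 um
N_BUCKETS = 10


def bucket_midpoints():
    """Midpoint distance (um) of each bucket, used as the x values."""
    return [BUCKET_WIDTH_UM * (i + 0.5) for i in range(N_BUCKETS)]


def fit_exponential(x, y):
    """Fit y = a * exp(-k * x) by least squares on log(y); needs all y > 0."""
    n = len(x)
    log_y = [math.log(v) for v in y]
    mean_x = sum(x) / n
    mean_l = sum(log_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_l) for xi, li in zip(x, log_y))
    slope = sxy / sxx
    a = math.exp(mean_l - slope * mean_x)
    return a, -slope   # decay rate k is the negated slope


def spatial_metrics(x, y):
    """Return R^2, percent difference, and half-life for one fitted curve."""
    a, k = fit_exponential(x, y)
    fitted = [a * math.exp(-k * xi) for xi in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    # Change between the first and last fitted points, relative to the first.
    percent_difference = 100.0 * (fitted[0] - fitted[-1]) / fitted[0]
    # Distance over which the fitted score halves; undefined without decay.
    half_life_um = math.log(2) / k if k > 0 else math.inf
    return {"r2": r2, "percent_difference": percent_difference,
            "half_life_um": half_life_um}
```

Under these assumptions a curve would count as a good fit when `r2` exceeds 0.1. A fit with an additive offset, for example through `scipy.optimize.curve_fit`, would change the numerical values but not the definitions of the two metrics.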
Covariates
Risk factor data were obtained by abstraction from the KPNW medical records using a chart abstraction manual and included information on age at menarche; age at first live birth; number of pregnancies; menopausal status; family history of breast cancer in a first-degree relative; height; weight; cigarette use (ever/never); ever use of postmenopausal hormone therapy (HT); and history of bilateral oophorectomy.
Analytical sample
In the present study, we excluded samples whose senescence scores were calculated using ≤ 100 cells, as the data for this group were considered to be less reliable. To maximize the sample size of the study, all cases and controls whose senescence scores were estimated based on more than 100 cells were included in the analysis, regardless of the presence/absence of their matched counterpart. The final sample size included 1,003 women (491 controls and 512 cases) for the epithelial tissue analyses, 712 (358 controls and 354 cases) for the fat tissue analyses, 1,006 (492 controls and 514 cases) for the stromal tissue analyses, and 937 (465 controls and 472 cases) for the TDLU tissue analyses.
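The inclusion rule amounts to a per-tissue filter applied independently of matching, which is why the sample size differs by tissue. A minimal sketch, assuming one record per woman and tissue with hypothetical keys `n_cells`, `tissue`, and `is_case`:

```python
from collections import Counter

MIN_CELLS = 100   # scores based on this many nuclei or fewer are excluded


def analytic_sample(records):
    """Keep records whose score was computed from more than MIN_CELLS nuclei.

    Matched pairs are not enforced: a case is kept even if its matched
    control is dropped, and vice versa.
    """
    return [r for r in records if r["n_cells"] > MIN_CELLS]


def counts_by_tissue(records):
    """Count retained records by (tissue, case status)."""
    return Counter((r["tissue"], r["is_case"]) for r in records)
```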
Statistical analysis
Correlations between the senescence scores in the different tissue types (epithelial, adipose, stromal and terminal duct lobular units) obtained using the 3 different prediction models (RS, IR and AAD) and the spatial senescence metric data were calculated using Spearman correlation coefficients.
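For readers unfamiliar with the statistic, a Spearman coefficient is the Pearson correlation of the ranks of two variables, with tied values sharing their average rank. The study computed these coefficients in Stata; the dependency-free Python sketch below only illustrates the definition, and its function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1   # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)
```

Because only ranks enter the calculation, the coefficient is unchanged by any monotone rescaling of the senescence scores.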
Unconditional logistic regression was performed to estimate age-adjusted and multivariable odds ratios (OR) and 95% confidence intervals (CI) for the associations of the senescence scores with breast cancer risk. Age was included as a covariate in the regression models to account for the potential residual confounding effect of age, although we note that the case-control pairs were closely matched on age (Supplementary Table 1). For these analyses, senescence scores and spatial decay metrics were each categorized into quartiles (qt), with the lowest quartile serving as the reference group. Covariates were included in the models if they were known risk factors for IBC or if adjustment for them resulted in a change in the estimated OR of ≥ 10%. The following variables were adjusted for: cigarette smoking status (yes/no), BMI (calculated close to the date of BBD diagnosis by dividing weight (kg) by the square of height (m²); < 18.5, 18.5–24.9, 25–29.9, ≥ 30 kg/m²), family history of breast cancer in a first-degree female relative (yes/no), age at menarche (≤ 11, 12–13, ≥ 14 years), age at first live birth (never had, 15–19, 20–24, 25–29, ≥ 30 years), number of pregnancies (never pregnant, 1, 2, 3, ≥ 4), history of bilateral oophorectomy (no/yes), HT use (no/yes), and menopausal status (premenopausal/postmenopausal). Women were considered to be postmenopausal if they had had a natural menopause, were aged at least 53 years [18] and did not report their menopausal status, or had had a bilateral oophorectomy before this age. All variables with missing information were assigned a missing value indicator for the analyses. To test for linear trend, senescence score quartiles were included in the model as continuous variables and Wald test p-values were calculated.
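The quartile coding with the lowest quartile as reference can be illustrated with a crude, unadjusted calculation. The sketch below is not the adjusted logistic regression fitted in Stata: it computes odds ratios from 2 × 2 tables with Woolf (log-based) 95% confidence intervals, and it takes quartile cut points from all subjects combined, which is an assumption made here because the text does not say which distribution defined them. All names are hypothetical.

```python
import math
from statistics import quantiles


def quartile_index(value, cuts):
    """Return 0..3: the number of cut points that `value` exceeds."""
    return sum(value > c for c in cuts)


def odds_ratios_by_quartile(scores, is_case):
    """Crude OR and Woolf 95% CI for quartiles 2-4 versus quartile 1.

    Assumes every quartile contains at least one case and one control;
    an empty cell would cause a division by zero.
    """
    cuts = quantiles(scores, n=4)          # three cut points
    counts = [[0, 0] for _ in range(4)]    # [controls, cases] per quartile
    for score, case in zip(scores, is_case):
        counts[quartile_index(score, cuts)][int(case)] += 1
    ref_controls, ref_cases = counts[0]
    results = []
    for q in range(1, 4):
        controls, cases = counts[q]
        odds_ratio = (cases * ref_controls) / (controls * ref_cases)
        se = math.sqrt(1 / cases + 1 / controls
                       + 1 / ref_cases + 1 / ref_controls)
        lower = math.exp(math.log(odds_ratio) - 1.96 * se)
        upper = math.exp(math.log(odds_ratio) + 1.96 * se)
        results.append((q + 1, odds_ratio, lower, upper))
    return results
```

In a regression setting the same quartile index would enter the model as three indicator variables for the main estimates, or as a single numeric term for the linear trend test.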
In further analyses, we examined the association of combined pairs of senescence scores in epithelial and adipose tissue obtained using the RS and IR models with the risk of IBC using as reference groups the quartiles which showed the lowest risk when analyzed as individual scores (1st quartiles of these scores). A similar approach was adopted for combined analysis of senescence scores (RS-epithelial) and spatial decay metrics (percent-difference-epithelial). Additionally, subgroup analyses were performed by menopausal status and by BBD histopathological classification (no lesions/non-proliferative lesions and epithelial hyperplasia with/without atypia). Finally, we examined the association between senescence scores and the risk of developing ipsilateral or contralateral IBC using multinomial logistic regression, so that the risks of these two outcomes were estimated simultaneously [19].
All statistical analyses were performed using Stata version 18 (StataCorp LLC, College Station, TX). All p values were two-sided, and p values < 0.05 were considered statistically significant.