We used autopsy and cognitive data from the Religious Orders Study (ROS) and the Rush Memory and Aging Project (MAP), collectively known as ROS/MAP [7]. Data collection began in 1994 for ROS and in 1997 for MAP, yielding extensive longitudinal clinical-pathologic data on aging and Alzheimer’s disease (AD) risk factors. ROS includes religious clergy members from across the United States, while MAP includes lay individuals from northeastern Illinois. Participants are older adults, free of known dementia at study entry, and primarily of European ancestry (see cohort demographics in Table 1). All participants consented to organ donation. Each study was approved by a Rush University Medical Center Institutional Review Board (IRB), and the approved protocols include guidelines for data sharing. All participants provided informed consent and repository consent and signed an Anatomic Gift Act. The present analyses were additionally approved by the Vanderbilt University Medical Center IRB.
Table 1 Participant demographics, ROS/MAP
Genotyping
DNA was extracted from whole blood lymphocytes or frozen brain tissue, adhering to previously established quality control (QC) measures [30]. APOE genotyping was conducted by investigators blinded to cohort data at Polymorphic DNA Technologies. The APOE gene was sequenced to identify the isoforms APOE-ε2, APOE-ε3, and APOE-ε4, defined by codons 112 and 158 on exon 4.
Neuropsychological composites
Details of the neuropsychological testing have been previously published [5, 6, 7], and additional documentation is available on the ROS/MAP website at www.radc.rush.edu. Briefly, 19 neuropsychological tests across five cognitive domains (episodic memory, semantic memory, working memory, visuospatial ability/perceptual orientation, and perceptual speed) are used to calculate a composite global cognition variable in ROS/MAP. This variable represents a participant’s overall cognitive function. Raw scores from each test were converted to z-scores using the mean and standard deviation, and the z-scores from all tests were then averaged to give the final composite score.
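The composite described above can be sketched in a few lines of Python. This is only an illustration: the test names and scores are hypothetical, and standardizing each test against the mean and standard deviation of the supplied sample is an assumption, since the reference values actually used are specified in the ROS/MAP documentation rather than here.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize raw scores against their own mean and sample SD (an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def global_composite(raw_scores):
    """Average the per-test z-scores for each participant.

    raw_scores maps a test name to a list of raw scores, one per participant,
    with participants in the same order for every test.
    """
    standardized = [zscores(scores) for scores in raw_scores.values()]
    # zip(*...) regroups the z-scores by participant instead of by test.
    return [mean(per_person) for per_person in zip(*standardized)]


# Hypothetical raw scores for three participants on two made-up tests.
example = {
    "word_list_recall": [4, 6, 8],
    "digit_span": [10, 12, 14],
}
composite = global_composite(example)
```

In this toy input the middle participant sits at the sample mean on both tests, so their composite is zero, while the other two fall one standard deviation below and above it.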
Final summary clinical diagnosis
A clinical diagnosis was determined at each participant visit based on cognitive test scores, clinical judgment by a neuropsychologist, and diagnostic classification by a clinician (neurologist, geriatrician, or geriatric nurse practitioner), as described previously [5, 6, 7]. Clinical diagnoses of AD or other dementias followed criteria recommended by the joint working group of the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer’s Disease and Related Disorders Association (NINCDS/ADRDA). A diagnosis of mild cognitive impairment (MCI) was given to individuals judged by the neuropsychologist to have cognitive impairment but judged by the clinician not to meet dementia criteria. The final summary clinical diagnosis at the time of death was made by a neurologist, blinded to post-mortem data, based on a review of select clinical data from all years.
Neuropathological measures
Core AD pathology
All neuropathological marker quantifications have been described previously [5, 6, 7]. Briefly, quantification of neuritic plaques and neurofibrillary tangles was based on silver staining of five brain regions (midfrontal cortex, midtemporal cortex, inferior parietal cortex, entorhinal cortex, and hippocampus) averaged to obtain a summary score of overall burden. Additionally, immunohistochemistry was used to calculate semi-quantitative scores for amyloid-β and phospho-tau abundance in the cortex, using antibodies specific to Aβ1-42 and abnormally phosphorylated tau (AT8 epitope), respectively, based on the average of eight regions (hippocampus, entorhinal cortex, midfrontal cortex, inferior temporal cortex, angular gyrus, calcarine cortex, anterior cingulate cortex, and superior frontal cortex).
Cerebrovascular pathology
Macroinfarcts were visualized on fixed slabs and dissected for confirmation [2, 38]. Microinfarcts were examined on 6 µm paraffin-embedded sections stained with hematoxylin/eosin. Gross infarcts and microinfarcts were each categorized as present (1) or absent (0) based on visual inspection in nine brain regions (midfrontal, middle temporal, entorhinal, hippocampal, inferior parietal, and anterior cingulate cortices, anterior basal ganglia, midbrain, and thalamus) [2]. A semi-quantitative score for cerebral amyloid angiopathy (CAA) was obtained by amyloid-β immunostaining in four neocortical regions (midfrontal, midtemporal, angular, and calcarine cortices), scored on a scale from 0 to 4 (0 = no pathology, 4 = severe pathology). A meningeal and a parenchymal vessel score were obtained for each brain region, and the maximum of the two was used in each case. Final scores were averaged across regions [9].
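One reading of the CAA summary score described above is: take the larger of the meningeal and parenchymal scores within each region, then average those regional values. The Python sketch below follows that reading. The function name, the dictionary layout, and the example scores are hypothetical and are not taken from the ROS/MAP pipeline.

```python
from statistics import mean

# The four neocortical regions named in the text.
REGIONS = ("midfrontal", "midtemporal", "angular", "calcarine")


def caa_summary(scores):
    """Return the CAA summary score for one participant.

    scores maps each region to a (meningeal, parenchymal) pair, each on the
    0-4 scale. The regional value is the maximum of the pair, and the summary
    is the mean of the regional values.
    """
    regional = []
    for region in REGIONS:
        meningeal, parenchymal = scores[region]
        for value in (meningeal, parenchymal):
            # Reject anything outside the 0-4 scoring scale.
            if not 0 <= value <= 4:
                raise ValueError(f"score out of range in {region}: {value}")
        regional.append(max(meningeal, parenchymal))
    return mean(regional)


# Hypothetical scores for a single participant.
example = {
    "midfrontal": (1, 2),
    "midtemporal": (0, 0),
    "angular": (3, 1),
    "calcarine": (2, 2),
}
score = caa_summary(example)  # regional maxima are 2, 0, 3, 2
```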
TDP-43 and Lewy body pathology
TDP-43 immunohistochemistry was performed on eight brain regions using a phosphorylated monoclonal TAR5P-1D3 TDP-43 antibody, and the presence of TDP-43 cytoplasmic inclusions in neurons and glia was assessed for each region [28]. Our analyses used a dichotomized variable coded 0 for no TDP-43 pathology or TDP-43 pathology in the amygdala only, and 1 for TDP-43 pathology extending beyond the amygdala. Lewy body stages were determined by α-synuclein immunostaining and encompassed four stages [39]. Our analyses used a dichotomized variable coded 0 for Lewy body pathology outside the neocortex and 1 for neocortical-type pathology.
Autopsy measures of GFAP mRNA expression
As previously described [5], a standardized protocol for post-mortem biological specimens was used. RNA was extracted from specific brain regions using a Qiagen miRNeasy Mini Kit with an RNase-free DNase Set and quantified on a Nanodrop. The integrity and purity of the RNA were assessed using an Agilent Bioanalyzer. Samples with an RNA integrity number (RIN) greater than five were included for bulk next-generation RNA sequencing. RNA-seq data were generated for 946 participants.
Sequencing was performed in multiple phases. Phase one focused on the dorsolateral prefrontal cortex (dlPFC). Phase two added more dlPFC samples and included samples from the posterior cingulate cortex (PCC) and the head of the caudate nucleus (CN). Phase three included additional participant samples from the dlPFC. Detailed information on RNA processing and sequencing is available on Synapse (syn3388564).
In summary, phase one employed poly-A selection, strand-specific dUTP library preparation, and Illumina HiSeq sequencing with 101 bp paired-end reads, achieving a coverage of 150 million reads for the first 12 reference samples. These deeply sequenced reference samples included two males and two females from each of the non-impaired, mild cognitive impairment, and Alzheimer’s disease groups. The remaining samples were sequenced with a coverage of 50 million reads. Phase two used the KAPA Stranded RNA-Seq Kit with RiboErase (Kapa Biosystems) for ribosomal depletion and fragmentation. Sequencing for this phase was performed on an Illumina NovaSeq6000 with 2 × 100 bp cycles, targeting 30 million reads per sample. In phase three, RNA was extracted with a Chemagic RNA tissue kit (Perkin Elmer, CMG-1212) using a Chemagic 360 instrument, and ribosomal RNA was depleted using RiboGold (Illumina, 20020599). Sequencing for phase three was carried out on an Illumina NovaSeq6000 with 40-50 million 2 × 150 bp paired-end reads.
Data processing and QC of RNA sequencing runs were performed by the Vanderbilt Memory and Alzheimer’s Center Computational Neurogenomics Team using an automated pipeline and are described in detail elsewhere [41, 50]. Samples from participants whose last visit was >5 years before death or who had non-AD dementia were excluded. GFAP measurements were available for 917 samples. We further assessed correlations of GFAP protein and transcript levels in the dlPFC with cytokine/chemokine transcripts known to be upregulated in reactive astrocytes, as reported by Sofroniew [45]; five such inflammatory transcripts were available for correlation analyses.
Cellular fraction data
A deconvolution technique was previously employed to derive cellular fraction data for a subset of ROS/MAP participants [27]. This approach involved a subset of bulk RNA samples from the dlPFC that also had single-nucleus data (N = 48 individuals and 80,660 single-nucleus transcriptomes). These data were used to identify the best predictors of each cellular component (e.g., excitatory neurons, microglia, oligodendrocytes), using all genes in the RNA-seq data to build models based on an optimized set of genes. The process of isolating and extracting nuclei from frozen tissue has been previously described [16]. In summary, single-nucleus RNA sequencing (snRNA-seq) employed high-throughput droplet technology and massively parallel sequencing following the DroNc-seq protocol [15], with modifications for the 10X Genomics Chromium platform. Gene counts were obtained by aligning reads to the hg38 reference genome (GRCh38.p5) using CellRanger software. Unspliced nuclear transcripts were included by counting reads mapped to pre-mRNA. Each individual library was quantified for pre-mRNA and then aggregated to equalize read depth between libraries, generating a gene count matrix.
The quality control criteria for cell inclusion have been described in detail previously [27]. The final dataset comprised 17,926 genes in 75,060 nuclei. These snRNA-seq data were used in a regression-based approach to generate a reference expression profile and decompose bulk RNA sequencing data, resulting in cellular fraction estimates for each sample across eight cell types (microglia, astrocytes, inhibitory neurons, excitatory neurons, oligodendrocytes, oligodendrocyte progenitors, and endothelial or pericyte cells).
Autopsy measures of GFAP protein expression
GFAP protein expression was quantified using isobaric tandem-mass tag mass spectrometry (TMT-MS) on dlPFC tissue from 400 ROS/MAP samples (syn17015098). Briefly, protein abundance was determined against the UniProtKB human proteome database containing both Swiss-Prot and TrEMBL reference sequences (downloaded on 21 April 2015; processed data available at syn21266454), as reported previously [18]. Samples from participants whose last visit was >5 years before death or who had non-AD dementia were excluded. GFAP measurements were available for 386 samples.
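The exclusion rule applied to both the transcript and protein data (last visit more than 5 years before death, or non-AD dementia) can be expressed as a simple filter. The sketch below is illustrative only: the record fields, sample identifiers, and values are hypothetical and do not correspond to ROS/MAP variable names.

```python
from dataclasses import dataclass

# Maximum allowed gap between the last clinical visit and death, in years.
MAX_YEARS_BEFORE_DEATH = 5.0


@dataclass
class Sample:
    sample_id: str
    years_last_visit_to_death: float
    non_ad_dementia: bool


def is_eligible(sample):
    """Keep samples with a recent last visit and no non-AD dementia."""
    recent_visit = sample.years_last_visit_to_death <= MAX_YEARS_BEFORE_DEATH
    return recent_visit and not sample.non_ad_dementia


# Hypothetical records: only "A" satisfies both criteria.
samples = [
    Sample("A", 0.8, False),
    Sample("B", 6.2, False),  # last visit too long before death
    Sample("C", 1.5, True),   # non-AD dementia
]
eligible_ids = [s.sample_id for s in samples if is_eligible(s)]
```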
Statistical analyses
Statistical analyses were conducted in R v4.1.2 using the RStudio IDE (https://www.rstudio.com/). Multiple linear regression models were used for cross-sectional cognitive and pathological outcomes, and linear mixed-effects models were used for longitudinal cognition. Models were run separately for GFAP expression in each brain region. We used generalized linear models for binomial cerebrovascular outcomes and proportional odds models for multinomial cerebrovascular outcomes. Linear regression models adjusted for age at death, sex, and post-mortem interval. Models with cognition as the outcome also included education and the time in years between the final visit and death. In mixed-effects regression models, time was represented as years from the final visit, with both time and intercept included as fixed and random effects. Measures of AD pathology from immunohistochemistry and silver staining were square-root transformed to better approximate a normal distribution. In secondary analyses, the estimated astrocytic cell-type fraction was included as a covariate to account for its potential influence on model results. Furthermore, in models assessing the interaction of GFAP expression and amyloid status, we used a binary variable in which amyloid negativity was defined as CERAD “none” or “sparse” and amyloid positivity as CERAD “moderate” or “frequent.” When assessing GFAP associations with non-AD pathologies, we also ran amyloid-stratified models using the same binary amyloid status variable.
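Two of the variable derivations above, the binary amyloid status and the square-root transform of pathology burden, are simple enough to sketch directly. The Python below is an illustration rather than the study code (the analyses themselves were run in R); the function names and the 0/1 coding of amyloid negativity and positivity are assumptions.

```python
import math

# CERAD neuritic plaque categories mapped to binary amyloid status
# (0 = amyloid negative, 1 = amyloid positive), following the text.
CERAD_TO_AMYLOID_POSITIVE = {
    "none": 0,
    "sparse": 0,
    "moderate": 1,
    "frequent": 1,
}


def amyloid_status(cerad_category):
    """Return 1 for amyloid-positive and 0 for amyloid-negative participants."""
    key = cerad_category.strip().lower()
    if key not in CERAD_TO_AMYLOID_POSITIVE:
        raise ValueError(f"unknown CERAD category: {cerad_category!r}")
    return CERAD_TO_AMYLOID_POSITIVE[key]


def sqrt_transform(burden):
    """Square-root transform a non-negative pathology burden score."""
    if burden < 0:
        raise ValueError("pathology burden cannot be negative")
    return math.sqrt(burden)


# Hypothetical values for illustration.
status = [amyloid_status(c) for c in ("none", "Sparse", "moderate", "frequent")]
transformed = sqrt_transform(4.0)
```

Stratified models would then be fit separately within the participants whose status is 0 and those whose status is 1.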
P-values from all models were corrected for multiple comparisons using the Benjamini and Hochberg (1995) false discovery rate procedure, based on the total number of tests completed across all GFAP predictors, modalities, and outcomes (N = 66). Statistical significance was determined using the a priori threshold of p < 0.05. Among the 917 participants with GFAP mRNA measurement from the dlPFC, 668 also had GFAP measurement from the CN and 511 also had measurement from the PCC. There were 435 participants with GFAP measurements from all three brain regions. In addition, 281 participants had both mRNA transcript and protein measures of GFAP from the dlPFC.
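For readers unfamiliar with the Benjamini and Hochberg procedure, the following Python sketch computes adjusted p-values in the standard step-up form. It is a generic illustration, not the study code: the four p-values are hypothetical (the study corrected across 66 tests), and comparing the adjusted values against 0.05 is an assumption about how the threshold was applied.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is the
    # minimum of p * m / rank over its own rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four tests.
p = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(p)
# Assumed decision rule: adjusted p-value below 0.05.
significant = [q < 0.05 for q in adj]
```

Because the adjustment depends on the total number of tests, the same raw p-value can be significant in a small family of tests and non-significant in a larger one, which is why the text reports the family size.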