Using RNA-seq to identify suitable housekeeping genes for hypoxia studies in human adipose-derived stem cells

Cell culture maintenance

hADSC from 5 donors (A, B, C, D and E) were purchased from Lonza (Lonza, Walkersville, Maryland) or ZenBio (ZenBio, Research Triangle Park, North Carolina), where cells were previously characterised (general information on hADSC donors and cell characterisation are provided in Additional file 5). hADSC were cultured in Dulbecco’s Modified Eagle’s medium (DMEM) containing 1 g/L D-glucose (Gibco, Carlsbad, California), 2 mM GlutaMAX™ (hADSC-E in 4 mM L-glutamine) and 110 mg/L Sodium Pyruvate, supplemented with 10% foetal bovine serum (FBS, Bovogen, Keilor East, Australia) and 1% Penicillin–Streptomycin (P/S; Gibco, Carlsbad, California). Spent media were replaced every 2–3 days. Cells were incubated in standard culture conditions of 37 °C, with 5% carbon dioxide (CO2) and 21% atmospheric oxygen (O2) in a humidified cell culture incubator unless otherwise specified, and passaged at approximately 80% confluence. hADSC were used between passage 5 to 8.

A human eardrum keratinocyte cell line characterised previously [45] was cultured in DMEM containing 4.5 g/L D-glucose (Gibco, Carlsbad, California), 4 mM L-glutamine) and 110 mg/L Sodium Pyruvate, supplemented with 10% FBS and 1% P/S. Spent media were replaced every 3–4 days. Cells were incubated in standard culture conditions of 37 °C, with 5% CO2 and 21% atmospheric oxygen (O2) in a humidified cell culture incubator and passaged at approximately 90% confluence. Keratinocytes of passage 36 were used.

Hypoxia conditioning

hADSC were cultured to 80–90% confluence in 75 cm2 tissue culture flasks, rinsed with phosphate-buffered saline (PBS), pH 7.2 (Gibco, Carlsbad, California), and cultured in serum-free DMEM for at least 8 h before replacing with fresh serum-free medium for a further 48 h. hADSC at this step were either incubated in standard (Nx; 37 °C, 5% CO2) or Hx conditions in the GENbox Jar (BioMérieux, Craponne, France) produced by an anaerobic atmosphere generator sachet (BioMérieux, Craponne, France) at 37 °C. A hypoxia indicator strip (BioMérieux, Craponne, France) was included in the sealed chamber to confirm gas composition (< 0.1% O2 and 15% CO2). At the end of the 48-h incubation, total RNA was extracted.

RNA extraction

Total RNA from hADSC-E was extracted using the FavorPrep™ Tissue total RNA Mini Kit (Favorgen Biotech Corp, Ping Tung, Taiwan) according to the manufacturer’s protocol, before eluted RNA was cleaned up of genomic DNA (gDNA) using the RNeasy Plus Micro kit (Qiagen, Hilden, Germany). Total RNA from hADSC-A, -B, -C, -D, and keratinocytes were extracted using the RNeasy Plus Micro kit according to the manufacturer’s instructions. Briefly, cells were lysed and gDNA was removed via the gDNA Eliminator spin column. Total RNA was then bound, collected and washed on the RNeasy MinElute spin column before being eluted at approximately 14 μL per sample. All RNA samples were stored at -80 °C until required for RNA sequencing and cDNA synthesis.

RNA samples were submitted to the Australia Genome Research Facility (AGRF, Melbourne, Australia) for quality control, library preparation and RNA sequencing. Briefly, RNA samples were quantified using a 2100 Bioanalyzer (Agilent Technologies, Santa Clara, California) where samples of at least an RNA Integrity Number (RIN) of ≥ 8.0 and an A260/280 ratio of 1.8 – 2.0 were accepted. RNA quantification results are provided in Additional file 6.

Library preparation and RNA sequencing

For all samples, library preparation was performed using the Illumina TruSeq® Stranded mRNA kit according to the manufacturer’s protocol (Illumina, San Diego, California). Briefly, mRNA was purified and fragmented for double-stranded cDNA synthesized using specific methods to improve strand specificity. Adapters were ligated to DNA fragments to prepare for hybridization onto a flow cell during RNA sequencing, and enriched by amplification.

For hADSC-E, samples were processed earlier in a separate batch to the rest. RNA sequencing was performed on the Illumina HiSeq 2500 platform High Output Mode using the HiSeq Control Software (HCS) v2.2.68 and Real Time Analysis (RTA) v1.18.66.3 running on the Illumina sequencing computer for single-end reads at 50 bp. The Illumina bcl2fastq 2.18.0.12 pipeline was used next to generate the sequence data.

For hADSC-A, -B, -C and -D, RNA sequencing was performed on the Illumina NovaSeq 2500 platform High Output Mode using the NovaSeq Control Software (NCS) v 1.6.0 and Real Time Analysis (RTA) v 3.4.4 performing real-time based calling on the NovaSeq instrument computer for single-end reads at 100 bp. The Illumina bcl2fastq v 2.20.0.422 pipeline was used next to generate the sequence data. All RNAseq files were uploaded onto ArrayExpress accession E-MTAB-9979.

Bioinformatic analysis

RNAseq raw fastq files were checked using the FastQC (v 0.11.9) tool [46] to evaluate the quality of the sequenced samples. Reads were trimmed with Trimmomatic (v 0.36) [47] with parameters recommended in the Trimmomatic tool manual for single-end reads. Trimmed reads were then aligned to the GrCh38 human reference genome assembly obtained from Ensembl using the HISAT2 (v 2.1.0) tool [48], with the default parameters in the HISAT2 tool manual. Mapped reads were processed using SamBamba (v 0.6.6) [49]. For hADSC-A, -B, -C and -D, lane 1 files were merged with their corresponding repeated lane 2 files. Multimappers were filtered out using a mapping quality threshold of ≥ 5. Sorted and filtered Bam files were used as input for the featureCounts function in subread [50] to count each transcript to the GrCh38.90 reference genome. Sequencing details and processing steps are attached in Additional file 7. Count files (number of reads per transcript counted in data files) were parsed to include the corresponding Ensembl Gene IDs, before analysing with DESeq2 (v 1.26.0) [51] via R studio.

Differential expression analyses were performed using DESeq2 to compare differential gene expressions from hADSC cultured in Hx against Nx. Low count reads (< 10) were removed and DESeq2 was used with the multi-factor design for “conditions” (Hx vs Nx) and “cell types” (hADSC-A, -B, -C, -D and -E). Count data were transformed using regularised logarithmic transformation (rlog function) in DESeq2 to compare between datasets in the clustering analyses. Differential expression analysis was performed on raw count data and results were shrunk using the Apeglm function [52] for better estimation of distribution and exported as L2FC. Transcripts were considered differentially expressed (DE) if L2FC ≥ 2 at adjusted-P ≤ 0.05.

In addition to the L2FC data, raw count reads from all 5 hADSC were also normalised to transcripts per million (TPM) by normalising for both gene lengths and sequence depths.

All analysis performed in R for this study can be found in Additional file 4.

Ranking method

From the literature, a list of 78 previously known HKG were curated. L2FC and DESeq2-normalised values were extracted from the differential expression analysis data of the 5 hADSC, and TPM were calculated using the formula below, for the 78 HKG.

$$Transcripts\;per\;million\;\left(TPM\right)=Raw\;counts\div\frac\div Sequencing\;depth$$

The CV of each HKG between Hx and Nx was calculated with TPM or DE data using the following formula below, for all 5 hADSC (CVABCDE) as well as individually (CVA, CVB, CVC, CVD, CVE):

$$Co\;efficient\;of\;Variance\;\left(CV\right)=\frac_}_}$$

An average of CV (\(\underline\)) was first calculated using only CV ≤ 0.15. For example, ALAS1 had 7 CV values ≤ 0.15, the average CV was calculated by averaging the 7 CV values only. Next, the top 8 genes were selected to normalise for both CV and L2FC by dividing each value with the smallest corresponding values for an arbitrary score. For example, from the averaged CV, ALAS1 was the smallest at 0.013 so average CV of all other genes were normalised to 0.013, and of the L2FC, GUSB was the smallest (closest to 0) at 0.01 so L2FC of all other genes were normalised to 0.01. These two arbitrary scores were then summed and ranked from the smallest to the largest where the top 4 HKG were selected for validation via qRT-PCR. Additional file 3 contains all 78 HKG data, calculations, and ranking.

cDNA synthesis

Complementary DNA was synthesised using the RT2 Omniscript kit (Qiagen, Hilden, Germany) according to manufacturer’s instructions. Briefly, 1 μg of template RNA was reverse transcribed in accordance with the manufacturer’s protocol for each sample, together with the corresponding no reverse transcriptase controls (NRT) and no template controls (NTC). For each reaction, 1 μM oligo(dT)15 primers (Promega Corporation, Fitchburg, Wisconsin) were used, where reactions were incubated at 37 °C for 60 min.

qRT-PCR

qRT-PCR was performed using the QuantiFast Probe PCR kit (Qiagen, Hilden, Germany) on cDNA generated from hADSC-A, -B, -C and -D RNA samples. Gene expressions were performed using FAM-labelled and VIC-labelled TaqMan Assay probes (Thermo Fisher Scientific, Waltham, Massachusetts) as listed below in Table 4. 18S was not included in our RNAseq since it’s an rRNA, VEGFA was to validate the known regulation by hypoxia. Briefly, 2.5 μL of cDNA, 5 μL of 2 × PCR Mastermix, 0.5 μL of TaqMan Assay probes and RNase-free H2O were combined to a total of 10 μL per reaction per well in a white 96-well full-skirt plate, along with appropriate controls for NRT and NTC. qRT-PCR was performed on the CFX Real-Time PCR Detection System (Bio-Rad Laboratories, Hercules, California) under the following conditions: 95 °C for 3 min, and 40 cycles of denaturation at 95 °C for 3 s and annealing at 60 °C for 30 s. Threshold cycles (Ct) were determined using the exponential growth phase and the baseline signal from fluorescence versus cycle number plots.

Table 4 Description of TaqMan Assay probes used in qRT-PCR validation of housekeeping genesValidation of HKG stability

Using RefFinder [35], raw Ct values for each HKG of all 4 hADSC lines were entered according to the instructions to calculate for geometric average (GeNorm), model-based variance estimation (NormFinder), pair-wise correlations (BestKeeper), \(\Delta Ct\) method, as well as the geometric mean of the weighted ranks from the 4 methods to be expressed as RefFinder rank. Then, relative gene expression levels were determined using \(^\) method [39] for each hADSC to compare Hx (test) against Nx (control). The overall rank was averaged for each HKG. Relative gene expression levels (n = 3) using \(^\) was also performed for VEGFA against 18S and RRP1 in 2 biological replicates of hADSC in Hx (test) against Nx (control).

留言 (0)

沒有登入
gif