Glial cell proteome using targeted quantitative methods for potential multi-diagnostic biomarkers

Study layout for developing biomarker candidates from primary cells to diagnose glioma

The first step in biomarker development is to identify candidates. To this end, we performed a comprehensive proteomics study of glial cells using pooled glial primary cells (Control: 5 and Grade 4: 5). The next step was to validate the glial marker candidates in primary glial cells and tissues (Control: 10 and 10, Grade 2: 10 and 10, Grade 3: 12 and 10, and Grade 4: 15 and 10, respectively) (Fig. 1 and Additional file 1: Table S1). Briefly, in the first stage, we profiled the human glial proteome to obtain a pool of biomarker candidates, using a TMT-labeled quantitation method to compare protein abundance between control and cancer. We then stratified the biomarker candidates by small-scale MRM analysis, which served as the initial selection tool in our systematic proteomic pipeline. In the second stage, a large-scale MRM analysis of targeted peptides was performed in individual glial cells and tissues using the corresponding heavy peptide mixtures as internal standards. Finally, to develop a multiplex assay, a multimarker panel was established based on candidate variables in individual primary cells.

Fig. 1

Schematic of the glioma primary cell biomarker study workflow. Sample preparation (A): surgically removed tissue samples were enzymatically dissociated into single cells and cultured. Pooled control primary cells and pooled grade 4 glioma primary cells were lysed, digested, and labeled with TMT reagents 126 and 130, respectively. TMT-labeled control and grade 4 glioma peptides were mixed and subjected to HPLC fractionation. Candidate screening (B): the fractions obtained (n = 12) were subjected to LC–MS/MS, and the acquired data were analyzed with Proteome Discoverer to obtain differentially expressed proteins in glioma primary cells. Ingenuity Pathway Analysis (IPA) was performed to further understand the biological significance of the differentially expressed proteins. Validation (C): MRM assays for the differentially expressed proteins were developed using synthetic peptides for each protein

Identification of proteins in glial primary cells

To obtain an in-depth proteome of glial primary cells, we implemented TMT labeling combined with LC-based mid-pH peptide fractionation. Proteome analysis was based on high-resolution mass spectrometry and a multiple-database search strategy including SEQUEST and X! Tandem. In total, 7,429 protein groups were identified at a minimum confidence level > 95%, more than 2 unique peptides, and FDR < 5% (Additional file 1: Table S2).

To determine the functions of the proteins in our glial proteome, we used Gene Ontology (GO) to classify them by biological process (BP), molecular function (MF), and cellular component (CC).

Regarding biological process, our glial proteome was significantly enriched in proteins that participate in ‘cellular process’ (38.6%), ‘biological regulation’ (29.4%), and ‘metabolic process’ (25.8%). Regarding molecular function, the proteome was significantly enriched in proteins that mediate ‘binding’ (34.4%), ‘catalytic activity’ (17.7%), and ‘enzyme regulator activity’ (3.3%). Regarding cellular component, the proteome was significantly enriched in proteins associated with ‘intracellular organelle’ (37.2%), ‘cytoplasm’ (35.7%), and ‘membrane’ (21.8%) (Additional file 2: Fig. S2).

Differential expression of proteins in control and glioma cells

For the differential proteome analysis of control and cancer cells, three technical replicates were performed, and TMT-labeled quantitation was used to compare protein expression between the two conditions.

To identify reliable key proteins with a consistent differential expression pattern, we first narrowed down the proteins using the following cutoffs (t-test p value ≤ 0.05, minimum confidence level > 95%, more than 2 unique peptides, and FDR < 1%), which yielded 3,311 proteins. We then applied fold-change thresholds (expressed as log2 ratio) of > 2 or < -2 to identify true differences in protein expression. Finally, to obtain a more reliable list of differentially expressed proteins, we assessed technical variability using the coefficient of variation (CV) across all experiments (CV < 20%). In total, 476 proteins met these quantitative criteria (Fig. 2 and Additional file 1: Table S3), and these differentially expressed proteins are shown as a volcano plot and a heat map (Fig. 2). Notably, three replicate experiments in control and glioma samples were used to demonstrate experimental accuracy and reproducibility.
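
The quantitative criteria above (p value, log2 fold change, and CV) can be illustrated with a minimal sketch. It is not the pipeline used in the study, which relied on Proteome Discoverer, Perseus, and R. The function names, the toy protein records, and the choice to compute the CV separately within each condition are assumptions made for illustration, and the t-test p value is assumed to have been computed upstream.

```python
from math import log2
from statistics import mean, stdev

def cv_percent(values):
    """Coefficient of variation (sample SD / mean) as a percentage."""
    return 100.0 * stdev(values) / mean(values)

def is_differential(control, glioma, p_value,
                    p_max=0.05, log2_fc_min=2.0, cv_max=20.0):
    """Apply the three quantitative criteria to one protein.

    control, glioma: intensities from the three technical replicates.
    p_value: t-test p value, assumed to be computed upstream.
    Assumption: the CV cutoff is checked within each condition.
    """
    log2_fc = log2(mean(glioma) / mean(control))
    return (p_value <= p_max
            and abs(log2_fc) > log2_fc_min
            and cv_percent(control) < cv_max
            and cv_percent(glioma) < cv_max)

# Hypothetical toy records, not data from the study.
proteins = {
    "PROT_A": {"control": [100, 105, 98], "glioma": [520, 500, 510], "p": 0.001},
    "PROT_B": {"control": [100, 102, 99], "glioma": [150, 148, 155], "p": 0.01},
    "PROT_C": {"control": [100, 300, 50], "glioma": [900, 950, 910], "p": 0.04},
}

selected = [name for name, rec in proteins.items()
            if is_differential(rec["control"], rec["glioma"], rec["p"])]
```

In this toy input, PROT_B fails the fold-change threshold and PROT_C fails the CV cutoff in the control replicates, so only PROT_A is retained.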

Fig. 2

Differentially expressed proteins. The cutoffs for protein identification were as follows: protein confidence level > 95.0%, at least two peptides per protein, and decoy FDR < 1%. After applying the t-test (p ≤ 0.05) and CV < 20% criteria, a total of 476 proteins were listed as differentially expressed. Volcano plot (A): Perseus (version 1.5.8.2) and R were used for the differential expression and statistical analyses, with the cutoffs for significant fold change (FC) and t-test p value set to ± 2.0 and 0.05, respectively. Heat map (B): 2D hierarchical clustering analysis exploring the difference in protein expression between control and grade 4 glioma primary cells. Each row in the map represents a differentially expressed protein, and each column represents the condition used. Log2 (DEP) values were used to construct the heat map

Analysis of canonical pathway and protein networks

To investigate the signaling pathways and protein–protein interactions related to the upregulated and downregulated proteins in our glial proteome, we performed canonical pathway and protein network analyses of the differentially expressed proteins using IPA. Compared with control samples, 476 proteins were differentially expressed in grade 4 glioma, of which 228 increased and 248 decreased in abundance. In the canonical pathway analysis, the 476 regulated proteins were enriched in 470 pathways, of which 21 representative signaling pathways related to carcinogenesis and neurogenesis were as follows: Protein Ubiquitination, Protein Kinase A Signaling, Sertoli Cell-Sertoli Cell Junction Signaling, PI3K/AKT Signaling, Leukocyte Extravasation Signaling, Systemic Lupus Erythematosus Signaling, IGF-1 Signaling, 14-3-3-mediated Signaling, HIPPO Signaling, ERK5 Signaling, Inhibition of ARE-Mediated mRNA Degradation, Necroptosis Signaling, Calcium Signaling, FAT10 Signaling, Mitochondrial Dysfunction, Cell Cycle, Pentose Phosphate, DNA Methylation and Transcriptional Repression Signaling, Apelin Adipocyte Signaling, Neuroprotective Role of THOP1, and TCA Cycle II pathway (Table 1).

Table 1 Analysis of canonical pathway and protein networks

Validation of biomarker candidates in the MRM analysis

To select biomarker candidates, we first excluded proteins that shared gene and protein names. To select reliable MRM transitions, we constructed a glial-specific MS/MS spectral library and compared its MS/MS spectra with the experimental spectra from our MRM analysis. In this study, 321 proteins showing the same fragmentation spectral pattern were selected. We then examined the detectability of the marker candidates on the MRM platform and confirmed low, middle, and high endogenous concentrations of the candidates, where the low, middle, and high ranges were defined by comparing endogenous peptides with heavy peptide concentrations. For the MRM validation analysis, we excluded proteins with no detectable concentration range. To narrow down the number of marker candidates, we performed a small-scale MRM analysis, in which we verified whether the candidates showed the same expression pattern in the MRM and TMT-labeled datasets. From the small-scale MRM, 90 proteins were selected. Finally, bioinformatics analysis of the differentially expressed proteins revealed several putative enriched functional and disease networks and upstream regulators, such as networks related to cancer, cell death and survival, and organismal injury and abnormalities, which were used to select biomarker candidates. Consequently, 20 proteins, viz., ATP2B4, ATP5ME, CCT3, DNMT1, FKBP2, GLRX5, IDH3A, JAM2, LDHA, PCMT1, PLEKHG3, PRDX6, SLC44A2, TACC3, TINAGL, TKT, TOMM34, UACA, UBA1, and YWHAE, were selected (Table 2).
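
The concordance check in the small-scale MRM step can be sketched as a comparison of the direction of change on the two platforms. This is only an illustration under stated assumptions: the text does not say how "same expression pattern" was scored, so matching the sign of the log2 (glioma / control) ratio is our reading, and the function names and toy values are hypothetical.

```python
def same_direction(tmt_log2_ratio, mrm_log2_ratio):
    """True when both platforms report the same direction of change."""
    return tmt_log2_ratio * mrm_log2_ratio > 0

def concordant_candidates(tmt, mrm):
    """Keep proteins quantified on both platforms with matching direction.

    tmt, mrm: dicts mapping protein name -> log2(glioma / control).
    Proteins missing from either dataset are dropped.
    """
    shared = tmt.keys() & mrm.keys()
    return sorted(name for name in shared
                  if same_direction(tmt[name], mrm[name]))

# Hypothetical toy values, not data from the study.
tmt = {"PROT_A": 2.4, "PROT_B": -2.1, "PROT_C": 2.2, "PROT_D": -3.0}
mrm = {"PROT_A": 1.1, "PROT_B": 0.6, "PROT_C": 0.9}

kept = concordant_candidates(tmt, mrm)
```

Here PROT_B changes direction between platforms and PROT_D has no MRM measurement, so neither is carried forward.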

Table 2 Selected 20 proteins and MRM transition

Individual sample analysis by MRM

Using the heavy peptide mixture (20 fmol/μL) of each target peptide as an internal standard, we analyzed individual human primary cells by MRM. We first confirmed the differential concentration of target proteins between control (N: 10) and cancer (grade 2: 10, grade 3: 12, grade 4: 15). All 20 proteins were detected in glial cells, and 5 proteins had disparate expression patterns in the control and cancer groups (Fig. 3 and Table 2). Student’s t-test and ROC curve analysis were performed to compare the control and cancer groups; 5 proteins (CCT3, PCMT1, TKT, TOMM34, UBA1) and 2 proteins (CCT3 and TOMM34) satisfied the significance criteria (Student’s t-test: p ≤ 0.05; AUC ≥ 0.7) in control versus cancer (grade 4) and control versus cancer (grade 3 and 4), respectively.
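
The selection rule above (p ≤ 0.05 and AUC ≥ 0.7) can be sketched as follows. The AUC is computed here in its pairwise (Mann-Whitney) form, which equals the area under the empirical ROC curve for a single marker. The function names and toy values are hypothetical, the p value is assumed to come from a separate t-test, and treating markers that decrease in cancer symmetrically (taking the larger of AUC and 1 - AUC) is an assumption of this sketch rather than something stated in the text.

```python
def auc(control, cancer):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (cancer, control) pairs in which the cancer
    sample has the higher value; ties count as half a win.
    """
    wins = 0.0
    for x in cancer:
        for y in control:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cancer) * len(control))

def passes_rule(control, cancer, p_value, p_max=0.05, auc_min=0.7):
    """Apply the p value and AUC cutoffs to one marker."""
    a = auc(control, cancer)
    a = max(a, 1.0 - a)  # assumption: direction-agnostic AUC
    return p_value <= p_max and a >= auc_min

# Hypothetical toy concentrations, not data from the study.
control = [1.0, 1.2, 0.9, 1.1]
cancer = [1.5, 1.3, 1.15, 1.8]

example_auc = auc(control, cancer)
```

In this toy input, 15 of the 16 cancer-control pairs are ordered correctly, so the marker would pass the AUC cutoff provided its p value is also at or below 0.05.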

Fig. 3

Validation of biomarker candidates in the control and cancer groups. The 20 proteins selected from TMT-labeled quantitation were verified by MRM in control (N = 10) and cancer (grade 3 and grade 4) (N = 27) primary cell samples

Construction of a multi-marker panel based on the MRM results

To improve the power to discriminate between the control and cancer groups, we constructed a multi-marker panel using logistic regression analysis and evaluated it statistically.

We first selected a multi-marker panel that showed the best discriminatory power between the control and cancer group (grade 4). We then applied this multi-marker panel to evaluate its discriminatory power in control group versus cancer group (grade 3 and 4).
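
The two-step procedure described above (fit the panel on one comparison, then apply the same model to another) can be sketched with a small logistic regression written from scratch. This is an illustration only and not the statistical software used in the study. We read the "enter method" mentioned in the figure legends as entering all markers simultaneously. The gradient-descent fitting, the small L2 penalty, the function names, and the two-marker toy data are all assumptions of this sketch.

```python
from math import exp

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)

def predict_proba(w, x):
    """Predicted probability of the cancer class; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.1, epochs=2000, l2=0.01):
    """Fit all markers simultaneously by batch gradient descent.

    The small L2 penalty (an assumption of this sketch) keeps the weights
    finite when the toy groups are perfectly separable.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        # Average gradient step; the intercept (index 0) is not penalized.
        w = [wj - lr * (g / n + (l2 * wj if j > 0 else 0.0))
             for j, (wj, g) in enumerate(zip(w, grad))]
    return w

# Hypothetical two-marker toy data: 0 = control, 1 = cancer.
train_X = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.3, 0.0],
           [1.0, 0.9], [0.8, 1.1], [1.2, 1.0], [0.9, 0.8]]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
weights = fit_logistic(train_X, train_y)

# Apply the fitted panel, unchanged, to samples from a second comparison.
new_X = [[0.1, 0.1], [1.0, 1.0]]
new_calls = [int(predict_proba(weights, x) >= 0.5) for x in new_X]
```

The point of the design is that the weights are fixed after the first fit, so performance in the second comparison reflects how well the same panel transfers rather than a refit.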

In the comparison of the control with the cancer group (grade 4), the 5-marker panel (CCT3, PCMT1, TKT, TOMM34, UBA1) showed higher sensitivity (0.90 and 0.90), specificity (0.93 and 1.00), and AUC value (0.94 and 0.96), and a lower error rate (8 and 4%), than the best single marker (TOMM34). Indeed, the best single-marker model showed lower sensitivity (0.70 and 0.80), specificity (0.80 and 0.50), and AUC value (0.88 and 0.72), and a higher error rate (24 and 11%) (Figs. 4, 5 and Additional file 2: Fig. S3). Moreover, for the control versus cancer group (grade 3 and 4) comparison, the 5-marker panel (sensitivity, 0.80 and 0.90; specificity, 0.92 and 1.00; error rate, 10 and 2%; and AUC, 0.93 and 0.98) also performed better than the best single marker (sensitivity, 0.50 and 0.40; specificity, 0.88 and 0.85; error rate, 26 and 7%; and AUC, 0.82 and 0.82) (Figs. 4, 5).
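
For reference, the sensitivity, specificity, and error rate reported above follow from the confusion matrix in the standard way. The sketch below uses hypothetical labels and predictions, not the study's classification results, and the function name is our own.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and error rate from binary labels.

    Labels: 1 = cancer, 0 = control.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return {
        "sensitivity": tp / (tp + fn),   # cancers correctly called
        "specificity": tn / (tn + fp),   # controls correctly called
        "error_rate": (fp + fn) / len(pairs),
    }

# Hypothetical toy labels and predictions, not data from the study.
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]

metrics = classification_metrics(y_true, y_pred)
```

With one misclassified control and one misclassified cancer sample out of ten, the toy example gives a sensitivity and specificity of 0.8 each and an error rate of 0.2.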

Fig. 4

Comparison of the discriminatory power of the 5-marker panel versus the best single marker in primary glial cells. Five proteins were selected on the basis of the t-test and ROC curve analysis and used to construct the 5-marker panel, and its performance was evaluated. Logistic regression with the enter method was used to evaluate the discriminatory power between the control group and the grade 2, grade 4, and grade 3 and 4 groups (Control: 10, Grade 2: 10, Grade 3: 12, and Grade 4: 15). The results of the evaluation of the best single marker A and C and the 5-marker panel B and C are presented as confusion matrices with sensitivity, specificity, and error rate, and the ROC curves D, E, and F are presented with AUC values

Fig. 5

Comparison of the discriminatory power of the 5-marker panel versus the best single marker in glial tissues. The performance of the 5-marker panel was also evaluated in glial tissue samples. Logistic regression with the enter method was used to evaluate the discriminatory power between the control group and the grade 2, grade 4, and grade 3 and 4 groups (Control: 10, Grade 2: 10, Grade 3: 10, and Grade 4: 10). The results of the evaluation of the best single marker A and C and the 5-marker panel B and C are presented as confusion matrices with sensitivity, specificity, and error rate, and the ROC curves D, E, and F are presented with AUC values
