A total of 2689 SNPs for the LCN2 gene were retrieved from the NCBI dbSNP database (https://www.ncbi.nlm.nih.gov/projects/SNP). Among these, 180 were missense non-synonymous SNPs (nsSNPs), 1341 were intronic SNPs, and 88 were synonymous SNPs, while the remainder belonged to other categories. The missense nsSNPs were selected for this study because deleterious nsSNPs can affect the structure and function of the protein.
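The category breakdown above can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the text; the variable and function names are illustrative, and the "other" count is derived by subtraction rather than taken from dbSNP.

```python
# Counts reported in the text for LCN2 SNPs retrieved from dbSNP.
total_snps = 2689
reported = {"missense": 180, "intronic": 1341, "synonymous": 88}

# Everything outside the three named classes is grouped as "other".
# This value is derived by subtraction, not reported by dbSNP.
reported["other"] = total_snps - sum(reported.values())


def category_shares(counts, total):
    """Return each category's share of the total as a percentage."""
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


shares = category_shares(reported, total_snps)
for name, n in reported.items():
    print(f"{name:>10}: {n:5d} ({shares[name]:.2f}%)")
```

Under these counts, 1080 SNPs fall outside the three named categories.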
Prediction and functional analysis of nsSNPs in LCN2
The 180 missense nsSNPs were chosen for this study because they may have both structural and functional effects on the protein. Several in silico tools (SIFT, PolyPhen-2, PROVEAN, PredictSNP, MAPP, and SNAP2) were used to predict the deleterious effects of these SNPs. Initially, the 180 missense SNPs were submitted to the SIFT server, which returned predictions for 132 nsSNPs. Of these, 35 nsSNPs were predicted to be deleterious (score ≤ 0.05) and the remaining 97 were tolerated. The nsSNPs were then analyzed with the PolyPhen-2 server, which classifies nsSNPs as “Probably Damaging” (score 0.9–1) or “Possibly Damaging” (score 0.7–0.9). The SIFT and PolyPhen-2 results were combined to improve prediction accuracy. The additional tools PROVEAN, PredictSNP, MAPP, and SNAP2 were then applied. PROVEAN predicted all 7 nsSNPs as deleterious. PredictSNP predicted 6 nsSNPs as deleterious and 1 nsSNP as neutral. SNAP2 predicted 5 nsSNPs as disease-causing and 2 nsSNPs as neutral. After prediction with the above-mentioned tools, only 6 nsSNPs were found to be deleterious; these are listed in Table 1. These potentially deleterious SNPs were carried forward for further analysis.
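The filtering logic described above can be illustrated with a minimal sketch. It applies the cutoffs quoted in the text (SIFT score ≤ 0.05 for deleterious; PolyPhen-2 bands of 0.9–1 and 0.7–0.9) and flags a variant only when both tools agree. The function names, the "benign" label for scores below 0.7, the treatment of the boundary value 0.9, and the example scores are all assumptions made for illustration; they are not taken from the study or from either server's output.

```python
from typing import NamedTuple

SIFT_CUTOFF = 0.05  # a score <= 0.05 is called deleterious in the text


class Call(NamedTuple):
    sift: str
    polyphen: str
    consensus: bool  # True only if both tools flag the variant


def sift_label(score: float) -> str:
    """Label a SIFT score using the cutoff quoted in the text."""
    return "deleterious" if score <= SIFT_CUTOFF else "tolerated"


def polyphen_label(score: float) -> str:
    """Label a PolyPhen-2 score using the bands quoted in the text.

    Assumptions: 0.9 itself counts as "probably damaging", and anything
    below 0.7 is labelled "benign" (the text does not name this band).
    """
    if score >= 0.9:
        return "probably damaging"
    if score >= 0.7:
        return "possibly damaging"
    return "benign"


def classify(sift_score: float, polyphen_score: float) -> Call:
    """Combine both labels into a simple agreement-based consensus."""
    s = sift_label(sift_score)
    p = polyphen_label(polyphen_score)
    return Call(s, p, s == "deleterious" and p != "benign")


# Hypothetical variants and scores, for illustration only.
examples = {
    "variant_A": (0.01, 0.98),
    "variant_B": (0.20, 0.95),
    "variant_C": (0.03, 0.40),
}
for name, (sift_score, pp_score) in examples.items():
    print(name, classify(sift_score, pp_score))
```

Requiring agreement between tools trades sensitivity for specificity, which is the usual motivation for combining predictors in this way.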
Table 1 List of nsSNPs of the LCN2 gene predicted as deleterious by various in silico tools
Prediction of the effect of nsSNPs on protein stability
MUpro and I-MUTANT 2.0 were used to assess whether the selected missense nsSNPs alter the stability of the LCN2 protein. According to the I-MUTANT 2.0 server, nsSNPs rs11556770, rs142623708, rs200107414, rs201365744, and rs368926734 were predicted to decrease protein stability. According to the MUpro server, all of the nsSNPs (rs147787222, rs11556770, rs139418967, rs142623708, rs200107414, rs201365744, and rs368926734) decreased protein stability, as listed in Table 2.
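The agreement between the two stability predictors can be made explicit with set operations. The rsIDs below are those listed in the text; the variable names are illustrative.

```python
# nsSNPs predicted to decrease stability, as listed in the text.
i_mutant_decrease = {
    "rs11556770", "rs142623708", "rs200107414",
    "rs201365744", "rs368926734",
}
mupro_decrease = {
    "rs147787222", "rs11556770", "rs139418967", "rs142623708",
    "rs200107414", "rs201365744", "rs368926734",
}

# Variants flagged as destabilizing by both servers.
both = i_mutant_decrease & mupro_decrease
# Variants flagged by MUpro but not listed for I-MUTANT 2.0.
mupro_only = mupro_decrease - i_mutant_decrease

print("both tools:", sorted(both))
print("MUpro only:", sorted(mupro_only))
```

On the lists given in the text, five nsSNPs are flagged by both servers, while rs147787222 and rs139418967 are flagged by MUpro only.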
Table 2 Prediction of protein stability by I-MUTANT 2.0 and MUpro
Conservation analysis of deleterious nsSNPs
According to the phylogenetic conservation analysis, amino acid substitutions in conserved regions are considerably more harmful than those in non-conserved regions. The ConSurf server was used to analyze the conservation profiles of amino acids in LCN2. The results showed that Q39H, L6P, M71I, Y52C, Y76H, and Y135 were located at highly conserved positions; the variant amino acids are marked with black boxes in Fig. 2. The ConSurf results are shown in Table 2.
Fig. 2 Conservation analysis of LCN2 by the ConSurf server. Amino acid substitutions in conserved regions are considerably more harmful than those in non-conserved regions. The variant positions were found to be highly conserved and are marked with black boxes
Prediction of relative solvent accessibility
NetSurfP-2.0 was employed to assess solvent accessibility and stability and to predict secondary structure for the variants with high conservation scores in the ConSurf output. According to the NetSurfP-2.0 server, Q39H, L6P, and Y135H were predicted to be exposed, whereas M71I, Y52C, and Y135 were predicted to be buried. The results are displayed in Table 3.
Table 3 Prediction of stability, secondary structure, and relative solvent accessibility
Structural analysis of nsSNPs by PSIPRED
PSIPRED predicted the distribution of alpha-helices, beta-sheets, and coils in the LCN2 secondary structure. The PSIPRED server analysis indicated that the predominant secondary structure was strand, with fewer occurrences of coil and helix, as illustrated in Fig. 3. PSIPRED also reported the MEMSAT transmembrane topology and the amino acid types. The transmembrane topology was entirely cytoplasmic, and the amino acid types were aromatic plus cysteine, hydrophobic, and polar, as listed in Table 4.
Fig. 3 Structural analysis by PSIPRED. PSIPRED predicted the distribution of alpha-helices, beta-sheets, and coils in the LCN2 secondary structure. Strand was the most common secondary structure element, with fewer coil and helix residues
Table 4 Structural analysis of LCN2
Secondary structural analysis of LCN2 by SOPMA
SOPMA analysis indicated that the secondary structure of LCN2 comprises alpha-helices, extended strands, beta turns, and random coils. The SOPMA secondary structure prediction for LCN2 is displayed in Fig. 4: 21.21% of residues were in alpha-helices, 51.52% in random coils, 3.54% in beta turns, and 23.74% in extended strands.
Fig. 4 Prediction of secondary structure using SOPMA. The secondary structure of LCN2 comprised 21.21% alpha-helix, 23.74% extended strand, 3.54% beta turn, and 51.52% random coil
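Percentages of this kind are obtained by counting per-residue state letters in the prediction string. The sketch below assumes the SOPMA letter codes h, e, t, and c for alpha helix, extended strand, beta turn, and random coil. The input string is synthetic, not the actual SOPMA output: its counts (42, 47, 7, and 102 residues out of 198) are an assumption chosen so that the result reproduces the reported percentages.

```python
from collections import Counter

# Assumed SOPMA state letters and their meanings.
STATE_NAMES = {
    "h": "alpha helix",
    "e": "extended strand",
    "t": "beta turn",
    "c": "random coil",
}


def composition(states: str) -> dict:
    """Return the percentage of residues in each secondary structure state."""
    if not states:
        raise ValueError("empty prediction string")
    counts = Counter(states.lower())
    n = len(states)
    return {
        name: round(100 * counts.get(letter, 0) / n, 2)
        for letter, name in STATE_NAMES.items()
    }


# Synthetic 198-residue state string (illustrative counts, see lead-in).
synthetic = "h" * 42 + "e" * 47 + "t" * 7 + "c" * 102
print(composition(synthetic))
```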
Protein interaction by STRING server
The STRING server results showed that the LCN2 protein interacts with ten proteins: matrix metalloproteinase-9 (MMP9), solute carrier family 22 member 17 (SLC22A17), lactotransferrin (LTF), hepcidin-20 (HAMP), cytotoxic T-lymphocyte protein 4 (CTLA4), low-density lipoprotein receptor-related protein 2 (LRP2), gamma-secretase C-terminal fragment 50 (APP), fibronectin (FN1), cystatin-C (CST3), and hepatitis A virus cellular receptor 1 (HAVCR1). Based on the analysis, CTLA4, LTF, SLC22A17, HAVCR1, MMP9, APP, and HAMP interacted directly with LCN2, as shown in Fig. 5.
Fig. 5 Protein–protein interaction network of LCN2. The network of protein–protein interactions is critical for understanding biological processes. Functional and evolutionary aspects of the LCN2 protein were examined using STRING functional genomics data and structural assessment. Based on this analysis, seven proteins (CTLA4, LTF, SLC22A17, HAVCR1, MMP9, APP, and HAMP) have strong, direct interactions with the LCN2 protein
Gene–gene interaction by GeneMANIA
The GeneMANIA tool was used to analyze gene interactions with LCN2. The server predicted that 9 genes have physical and genetic interactions with LCN2: matrix metalloproteinase-9 (MMP9), matrix metallopeptidase 2 (MMP2), S100 calcium binding protein P (S100P), lysozyme (LYZ), S100 calcium binding protein A8 (S100A8), GID complex subunit 8 homolog (GID8), LDL receptor-related protein 2 (LRP2), integrin subunit alpha 9 (ITGA9), and L-2-hydroxyglutarate dehydrogenase (L2HGDH). Seven genes were co-localized: WAP four-disulfide core domain 2 (WFDC2), lactotransferrin (LTF), lysozyme (LYZ), secretory leukocyte peptidase inhibitor (SLPI), transcobalamin 1 (TCN1), serpin family B member 5 (SERPINB5), and peptidase inhibitor 3 (PI3). One gene, progestagen-associated endometrial protein (PAEP), shared a protein domain, and 6 genes (MMP9, MMP2, LRP2, GID8, L2HGDH, and ITGA9) were directly bound to LCN2, as shown in Fig. 6.
Fig. 6 Gene–gene interaction network of the LCN2 gene. GeneMANIA identified functional interactions with 6 genes (MMP9, MMP2, LRP2, GID8, L2HGDH, and ITGA9), which were directly bound to LCN2
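The STRING and GeneMANIA partner lists can be compared directly to see which partners are reported by both resources. The sketch below uses the gene symbols listed in the text, with SLPI and PI3 written as their official symbols; the grouping keys and variable names are illustrative.

```python
# LCN2 partners reported by STRING, as listed in the text.
string_partners = {
    "MMP9", "SLC22A17", "LTF", "HAMP", "CTLA4",
    "LRP2", "APP", "FN1", "CST3", "HAVCR1",
}

# LCN2 partners reported by GeneMANIA, grouped by interaction type.
genemania = {
    "physical_or_genetic": {
        "MMP9", "MMP2", "S100P", "LYZ", "S100A8",
        "GID8", "LRP2", "ITGA9", "L2HGDH",
    },
    "co_localization": {
        "WFDC2", "LTF", "LYZ", "SLPI", "TCN1", "SERPINB5", "PI3",
    },
    "shared_protein_domain": {"PAEP"},
}

# Pool all GeneMANIA partners, then intersect with the STRING list.
genemania_all = set().union(*genemania.values())
shared = string_partners & genemania_all

print("reported by both:", sorted(shared))
print("STRING only:", sorted(string_partners - genemania_all))
```

On these lists, MMP9, LTF, and LRP2 appear in both networks.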
3D structure prediction
The 3D structure of the LCN2 protein was analyzed with AlphaFold. AlphaFold assigns each residue a confidence score (pLDDT) ranging from 0 to 100, and the average pLDDT across all residues indicates the overall confidence in the predicted chain. The predicted 3D structure shows very high confidence (pLDDT > 90) and consists mostly of α-helical domains, while the remaining regions are represented as unresolved loops with low (70 > pLDDT > 50) or very low (pLDDT < 50) scores, as shown in Fig. 7.
Fig. 7 AlphaFold 3D structure prediction of LCN2. AlphaFold assigns each residue a confidence score (pLDDT) ranging from 0 to 100. The predicted 3D structure shows very high confidence (pLDDT > 90) and is made up primarily of α-helical domains, while the remaining regions are shown as unresolved loops with low (70 > pLDDT > 50) or very low (pLDDT < 50) scores
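Per-residue pLDDT values can be summarized directly from an AlphaFold model file, because AlphaFold PDB files store pLDDT in the B-factor column of each ATOM record. The sketch below reads the Cα atoms, computes the mean pLDDT, and counts residues per confidence band. The input records are synthetic and do not come from the LCN2 model; the helper names and the handling of exact boundary values (70 and 50 are assigned to the higher band) are assumptions.

```python
from collections import Counter
from statistics import mean


def plddt_band(score: float) -> str:
    """Map a pLDDT score to a confidence band (boundary handling assumed)."""
    if score > 90:
        return "very high"
    if score >= 70:
        return "confident"
    if score >= 50:
        return "low"
    return "very low"


def ca_plddt(pdb_text: str) -> dict:
    """Return {residue number: pLDDT} from the CA atoms of a PDB string."""
    scores = {}
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns: atom name 13-16, residue number 23-26,
        # B-factor 61-66 (holds pLDDT in AlphaFold models).
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def summarize(pdb_text: str):
    """Return the mean pLDDT and the number of residues in each band."""
    scores = ca_plddt(pdb_text)
    return mean(scores.values()), Counter(plddt_band(s) for s in scores.values())


def make_ca_line(serial: int, resseq: int, plddt: float) -> str:
    """Build a synthetic CA ATOM record with pLDDT in the B-factor column."""
    return (f"ATOM  {serial:5d}  CA  ALA A{resseq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{plddt:6.2f}           C")


# Synthetic pLDDT values, for illustration only.
toy_scores = [95.2, 91.0, 85.5, 62.3, 41.7]
toy_pdb = "\n".join(make_ca_line(i, i, s) for i, s in enumerate(toy_scores, start=1))

average, bands = summarize(toy_pdb)
print(f"mean pLDDT: {average:.2f}")
print(dict(bands))
```

For a real model, the text of the downloaded PDB file would be passed to `summarize` in place of `toy_pdb`.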