Genotype imputation methods for whole and complex genomic regions utilizing deep learning technology

Uffelmann E, Huang QQ, Munung NS, de Vries J, Okada Y, Martin AR, et al. Genome-wide association studies. Nat Rev Methods Prim. 2021;1:59 https://doi.org/10.1038/s43586-021-00056-9.

Sherry ST. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2001;29:308–11. https://doi.org/10.1093/nar/29.1.308.

Schaid DJ, Chen W, Larson NB. From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat Rev Genet. 2018;19:491–504. https://doi.org/10.1038/s41576-018-0016-z.

Wang QS, Huang H. Methods for statistical fine-mapping and their applications to auto-immune diseases. Semin Immunopathol. 2022;44:101–13. https://doi.org/10.1007/s00281-021-00902-8.

Das S, Abecasis GR, Browning BL. Genotype imputation from large reference panels. Annu Rev Genom Hum Genet. 2018;19:73–96. https://doi.org/10.1146/annurev-genom-083117-021602.

Naj AC. Genotype imputation in genome-wide association studies. Curr Protoc Hum Genet. 2019;102:1–15. https://doi.org/10.1002/cphg.84.

LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521:436–44. https://doi.org/10.1038/nature14539.

Rubinacci S, Delaneau O, Marchini J. Genotype imputation using the Positional Burrows Wheeler Transform. PLOS Genet. 2020;16:e1009049 https://doi.org/10.1371/journal.pgen.1009049.

Das S, Forer L, Schönherr S, Sidore C, Locke AE, Kwong A, et al. Next-generation genotype imputation service and methods. Nat Genet. 2016;48:1284–7. https://doi.org/10.1038/ng.3656.

Browning BL, Zhou Y, Browning SR. A one-penny imputed genome from next-generation reference panels. Am J Hum Genet. 2018;103:338–48. https://doi.org/10.1016/j.ajhg.2018.07.015.

Li N, Stephens M. Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics. 2003;165:2213–33.

De Marino A, Mahmoud AA, Bose M, Bircan KO, Terpolovsky A, Bamunusinghe V, et al. A comparative analysis of current phasing and imputation software. PLoS One. 2022;17:1–22. https://doi.org/10.1371/journal.pone.0260177.

International HapMap 3 Consortium. Integrating common and rare genetic variation in diverse human populations. Nature. 2010;467:52–58. https://doi.org/10.1038/nature09298.

1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, et al. A global reference for human genetic variation. Nature. 2015;526:68–74. https://doi.org/10.1038/nature15393.

McCarthy S, Das S, Kretzschmar W, Delaneau O, Wood AR, Teumer A, et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat Genet. 2016;48:1279–83. https://doi.org/10.1038/ng.3643.

Durbin R. Efficient haplotype matching and storage using the positional Burrows–Wheeler transform (PBWT). Bioinformatics. 2014;30:1266–72. https://doi.org/10.1093/bioinformatics/btu014.

Chen J, Shi X. Sparse convolutional denoising autoencoders for genotype imputation. Genes. 2019;10:1–16. https://doi.org/10.3390/genes10090652.

Song M, Greenbaum J, Luttrell J, Zhou W, Wu C, Luo Z, et al. An autoencoder-based deep learning method for genotype imputation. Front Artif Intell. 2022;5. https://doi.org/10.3389/frai.2022.1028978.

Dias R, Evans D, Chen SF, Chen KY, Loguercio S, Chan L, et al. Rapid, Reference-Free human genotype imputation with denoising autoencoders. Elife. 2022;11:1–20. https://doi.org/10.7554/elife.75600.

Kojima K, Tadaka S, Katsuoka F, Tamiya G, Yamamoto M, Kinoshita K. A genotype imputation method for de-identified haplotype reference information by using recurrent neural network. PLOS Comput Biol. 2020;16:e1008207 https://doi.org/10.1371/journal.pcbi.1008207.

Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9:1735–80. https://doi.org/10.1162/neco.1997.9.8.1735.

Cho K, van Merrienboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, et al. Learning phrase representations using RNN encoder–decoder for statistical machine translation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg, PA, USA: Association for Computational Linguistics; 2014, pp 1724–34.

Guo M-H, Xu T-X, Liu J-J, Liu Z-N, Jiang P-T, Mu T-J, et al. Attention mechanisms in computer vision: a survey. Comput Vis Media. 2022;8:331–68. https://doi.org/10.1007/s41095-022-0271-y.

Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. Adv Neural Inf Process Syst. 2017;30:5998–6008.

Mowlaei ME, Li C, Chen J, Jamialahmadi B, Kumar S, Rebbeck TR, et al. Split-transformer impute (STI): genotype imputation using a transformer-based model. bioRxiv. 2023. https://www.biorxiv.org/content/10.1101/2023.03.05.531190v1.

Horton R, Wilming L, Rand V, Lovering RC, Bruford EA, Khodiyar VK, et al. Gene map of the extended human MHC. Nat Rev Genet. 2004;5:889–99.

Shiina T, Hosomichi K, Inoko H, Kulski JK. The HLA genomic loci map: expression, interaction, diversity and disease. J Hum Genet. 2009;54:15–39. https://doi.org/10.1038/jhg.2008.5.

MacArthur J, Bowler E, Cerezo M, Gil L, Hall P, Hastings E, et al. The new NHGRI-EBI catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res. 2017;45:D896–D901.

Brandt DYC, Aguiar VRC, Bitarello BD, Nunes K, Goudet J, Meyer D. Mapping bias overestimates reference allele frequencies at the HLA genes in the 1000 Genomes Project Phase I Data. G3 Genes|Genomes|Genetics. 2015;5:931–41.

Dilthey AT, Moutsianas L, Leslie S, McVean G. HLA*IMP-an integrated framework for imputing classical HLA alleles from SNP genotypes. Bioinformatics. 2011;27:968–72. https://doi.org/10.1093/bioinformatics/btr061.

Jia X, Han B, Onengut-Gumuscu S, Chen WM, Concannon PJ, Rich SS, et al. Imputing amino acid polymorphisms in human leukocyte antigens. PLoS One. 2013;8:e64683 https://doi.org/10.1371/journal.pone.0064683.

Naito T, Okada Y. HLA imputation and its application to genetic and molecular fine-mapping of the MHC region in autoimmune diseases. Semin Immunopathol. 2022;44:15–28. https://doi.org/10.1007/s00281-021-00901-9.

Karnes JH, Shaffer CM, Bastarache L, Gaudieri S, Glazer AM, Steiner HE, et al. Comparison of HLA allelic imputation programs. PLoS One. 2017;12:1–12. https://doi.org/10.1371/journal.pone.0172444.

Naito T, Suzuki K, Hirata J, Kamatani Y, Matsuda K, Toda T, et al. A deep learning method for HLA imputation and trans-ethnic MHC fine-mapping of type 1 diabetes. Nat Commun. 2021;12:1639 https://doi.org/10.1038/s41467-021-21975-x.

Gu J, Wang Z, Kuen J, Ma L, Shahroudy A, Shuai B, et al. Recent advances in convolutional neural networks. Pattern Recognit. 2018;77:354–77. https://doi.org/10.1016/j.patcog.2017.10.013.

Naito T, Satake W, Ogawa K, Suzuki K, Hirata J, Foo JN, et al. Trans‐ethnic fine‐mapping of the major histocompatibility complex region linked to Parkinson’s disease. Mov Disord. 2021;36:1805–14. https://doi.org/10.1002/mds.28583.

Akiyama Y, Sonehara K, Maeda D, Katoh H, Naito T, Yamamoto K, et al. Genome-wide association study identifies risk loci within the major histocompatibility complex region for Hunner-type interstitial cystitis. Cell Rep Med. 2023;4:101114 https://doi.org/10.1016/j.xcrm.2023.101114.

Tanaka K, Kato K, Nonaka N, Seita J. Efficient HLA imputation from sequential SNPs data by Transformer. arXiv. 2022. https://doi.org/10.48550/arXiv.2211.06430.

Zhou J, Theesfeld CL, Yao K, Chen KM, Wong AK, Troyanskaya OG. Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk. Nat Genet. 2018;50:1171–9. https://doi.org/10.1038/s41588-018-0160-6.

Avsec Ž, Agarwal V, Visentin D, Ledsam JR, Grabska-Barwinska A, Taylor KR, et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat Methods. 2021;18:1196–203. https://doi.org/10.1038/s41592-021-01252-x.
