Massive underrepresentation of Arabs in genomic studies of common disease

Discoveries from GWAS have begun to move into the clinical space through implementation of polygenic risk scores (PRS). It is now well established that lack of diversity in reference genomes and published GWAS limits the portability and accuracy of PRS prediction in non-European ancestries. The issues begin with SNP array imputation, where reference panels are predominantly of European origin; thus, when linkage disequilibrium (LD) blocks differ across ancestries, variation in diverse populations can be completely obscured [6]. Differences in minor allele frequency (MAF) and effect size of variants among individuals of different genetic ancestries also reduce the accuracy of prediction and increase the likelihood of propagating healthcare disparities [6]. When comparing the MAF of causal variants associated with 9 cardiometabolic traits in a large European-ancestry cohort (the UK Biobank) to their frequency in an indigenous Arab population, the MAF is on average 7.6% lower in Arabs (p-value = 4.2e-06), explaining part of the reduced performance of PRS that are derived in European populations when they are applied to Arabs (Fig. 1f) [7].
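To make the link between allele frequency and score performance concrete, the following is a minimal sketch, not the method used in the cited studies. It assumes a simple additive PRS (a weighted sum of risk-allele dosages), Hardy-Weinberg equilibrium, and an effect size that is identical in both populations. The function names and all numbers are hypothetical and chosen only for illustration.

```python
def prs(dosages, weights):
    """Additive polygenic score: sum of effect size times risk-allele dosage (0, 1, or 2)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for d, w in zip(dosages, weights))


def variant_score_variance(maf, beta):
    """Variance one variant adds to the score, assuming Hardy-Weinberg
    equilibrium: Var(beta * dosage) = 2 * p * (1 - p) * beta**2."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("maf must lie between 0 and 0.5")
    return 2.0 * maf * (1.0 - maf) * beta ** 2


# Hypothetical values (assumptions, not taken from the article):
beta = 0.10               # per-allele effect size, assumed equal in both groups
maf_discovery = 0.30      # frequency in the population where the PRS was derived
maf_target = 0.20         # lower frequency in the population where it is applied

var_discovery = variant_score_variance(maf_discovery, beta)
var_target = variant_score_variance(maf_target, beta)

# Share of the discovery-population variance that this variant retains
# in the target population.
retained_fraction = var_target / var_discovery
```

Under these assumptions, a variant that is rarer in the target population contributes less variance to the score there, so it separates individuals less well even when its biological effect is unchanged. Differences in LD and in effect size, which the sketch ignores, would reduce performance further.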

The use of computational methods to improve cross-ethnic PRS performance in genetically diverse populations is encouraging, but not enough to allay real concerns regarding ongoing health disparities. While large biobanks, population cohort studies, and international consortia have allowed for successively larger GWAS, the genetic diversity of these GWAS has not kept pace, despite advocacy from groups like the American Society of Human Genetics to prioritize portability of scores. Polygenic scores are increasingly being returned to patients, and clinico-genomic models that predict risk and enable intervention early in the life course through a precision medicine approach are likely to become the norm. Without appropriate development or validation of such clinico-genomic models in Arabs, widespread implementation risks propagating existing disparities in care.

Beyond the serious impact on public health genomics, the massive underrepresentation of Arabs in genomic studies is a missed opportunity to discover new disease biology. A prior comment in the journal eloquently described the opportunity to identify homozygous loss-of-function variants and novel candidate genes for recessive disease through more sequencing of highly consanguineous Arab populations [9]. We highlight that for common disease as well, homozygous loss-of-function variants (human knockouts) might inform interpretation of GWAS. Increasingly, the diversity of GWAS participants, more than the sample size, is advancing the discovery of new loci. For example, in a recent multi-ancestry GWAS of type 2 diabetes, 46% of new loci would not have been identified in a European-ancestry-only GWAS [10]. Today, more than 30 million whole genomes have been sequenced, which has advanced our understanding of the genetic underpinnings of disease, but it is now well understood that, in addition to the number of genomes, future opportunities for discovery lie in sequencing a diversity of genomes. First, genomes of different ancestries, when combined, allow different LD block structures to be leveraged for better identification of causal variants within a specific genetic region. Second, different causal variants for similar diseases may be identified by studying diverse populations.

The unique ancestral, geographic, and cultural histories of the Arab people offer many opportunities for discovery and for improved risk prediction and health for the 450 million Arabs in the world. Future directions and actions to reduce disparities and increase the yield of novel variants would include (1) encouraging census bodies to disaggregate Arabs from White individuals within the diaspora, (2) actively recruiting Arabs into genomic databases, and (3) supporting infrastructure and training programs in Arab countries to develop their own biobanks and genomic resources to add to the global collective pool of data.
