IJNS, Vol. 8, Pages 61: A Roadmap for Potential Improvement of Newborn Screening for Inherited Metabolic Diseases Following Recent Developments and Successful Applications of Bivariate Normal Limits for Pre-Symptomatic Detection of MPS I, Pompe Disease, and Krabbe Disease

3.1. General Introduction to BVNL

BVNL methodology is based on well-established multivariate normal distribution theory [56]. Sir Francis Galton recognized that plots of the heights of parents and their adult children formed ellipses [57]. His depiction drove the development of a large portion of modern statistical techniques, ranging from regression and correlation to multivariate normal distribution theory and its two-dimensional special case, bivariate normal distribution theory [56,58].

In the context of disease diagnosis, these methods utilize a set of biomarker observations from disease-free patients to form a normal tolerance region that can be estimated by a prediction region. With sufficient normal observations, this prediction region contains a pre-specified portion, (1−α)100%, of the disease-free or normal population, where (1−α) represents the confidence that a randomly sampled individual from the normal population will fall in the region. The approximation improves with increasing sample size, as prediction regions converge to corresponding tolerance regions as sample size increases. While these techniques generalize to any number of variables, this discussion will center on the two-variable case where (1−α)100% prediction regions are ellipses.
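The coverage property described above can be illustrated with a small simulation. The sketch below is not taken from the source. It assumes the mean and covariance of the normal population are known exactly (a standardized biomarker pair with a hypothetical correlation `rho`), in which case the (1 − α)100% ellipse is the set of points whose squared Mahalanobis distance does not exceed the chi-square quantile with 2 degrees of freedom, which has the closed form −2 ln α. With parameters estimated from a finite sample, the boundary of a prediction region would typically be somewhat wider.

```python
import math
import random


def mahalanobis_sq(x, y, mean, cov):
    """Squared Mahalanobis distance of (x, y) for a 2x2 covariance matrix."""
    dx, dy = x - mean[0], y - mean[1]
    sxx, sxy, syy = cov[0][0], cov[0][1], cov[1][1]
    det = sxx * syy - sxy * sxy
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def ellipse_cutoff(alpha):
    """Upper-alpha quantile of chi-square with 2 df: P(D^2 > c) = exp(-c / 2)."""
    return -2.0 * math.log(alpha)


def simulate_coverage(rho, alpha, n_draws, seed=0):
    """Fraction of simulated normal observations that fall inside the ellipse."""
    rng = random.Random(seed)
    mean = (0.0, 0.0)
    cov = ((1.0, rho), (rho, 1.0))
    cutoff = ellipse_cutoff(alpha)
    inside = 0
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Build a correlated standard bivariate normal pair from z1 and z2.
        x = z1
        y = rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        if mahalanobis_sq(x, y, mean, cov) <= cutoff:
            inside += 1
    return inside / n_draws


# rho = -0.4 and alpha = 0.05 are illustrative values only.
coverage = simulate_coverage(rho=-0.4, alpha=0.05, n_draws=200_000)
print(f"simulated coverage: {coverage:.4f} (nominal 0.95)")
```

The simulated fraction should sit close to the nominal 1 − α, and the agreement tightens as the number of draws grows.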

For a diagnostic screen that utilizes a single predictive biomarker, this tolerance region would be a univariate (1−α)100% interval, and observations falling above or below a certain critical univariate threshold value for the biomarker would be considered a positive screen. Given a second predictive biomarker, using a second critical univariate threshold would improve diagnostic accuracy. However, given two biomarkers with a bivariate normal distribution that are informative about disease risk, simply using univariate intervals or thresholds does not account for the correlation between the two biomarkers and is, therefore, less efficient than a bivariate approach, resulting in additional false positives.

Reference ranges for single biomarkers measured from blood spots or urine analysis have long been used in NBS program protocols [10]. More recently, there has been increasing interest in two-tier testing, analyzing a second biomarker on the same NBS blood spot material used in an initial positive screen [59]. While separate univariate reference ranges for each can be useful in two-tier testing, statistical improvements in accuracy are achievable by using bivariate normal limits based on tolerance regions, as suggested above, provided the distribution of the biomarkers is at least approximately bivariate normal [5]. Furthermore, screening tools utilizing three or more biomarkers in concert may be improved by implementing multivariate normal limits, as these methods generalize to any number of biomarkers.

3.2. General Definition and Discussion of BVNL Screening Tests for Inherited Metabolic Diseases

The root causes of inherited metabolic diseases such as KD, PD, and MPS disorders are pathogenic genetic variants and related enzyme deficiencies/inactivity that result in an accumulation of toxic substances in cells and ultimately damage the central nervous system, organs, or tissue. The etiology of such diseases makes them particularly well-suited for NBS tests that are based on multiple biomarkers: a measure of enzyme level or activity (or a monotone transformation of such a variable) and a measure of the toxic substance levels in the cells (or a monotone transformation of such a variable). Transformations are selected, if necessary, so that the biomarkers are normally distributed. If the assumption of normality is met, a (1 − α)100% prediction ellipse can be estimated. Given biomarkers X and Y, and thresholds τ1 (indicating “low” values of X) and τ2 (indicating “high” values of Y), a BVNL NBS test employs the following prediction rule:

(1) Predict that the infant will experience clinical symptoms of the disease in early childhood if

(a) the observed value of X is less than τ1,

(b) the observed value of Y is greater than τ2, and

(c) the observed value of the pair (X, Y) falls outside the estimated (1 − α)100% prediction ellipse; and

(2) Predict that the infant will not experience clinical symptoms during early childhood if any of the conditions a, b, or c above do not hold.

To define a specific test for use, an NBS program must specify the values of τ1, τ2, and α.
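The prediction rule can be sketched in a few lines. The code below is an illustration rather than the authors' implementation: the function names are hypothetical, the normal population is assumed to be a standardized, uncorrelated pair with known mean and covariance, and the ellipse boundary uses the large-sample chi-square cutoff −2 ln α (2 degrees of freedom). The values τ1 = −2.9, τ2 = 2.9, and α = 10⁻⁶ are illustrative settings.

```python
import math


def outside_ellipse(x, y, mean, cov, alpha):
    """Condition c: (x, y) lies outside the (1 - alpha)100% ellipse."""
    dx, dy = x - mean[0], y - mean[1]
    sxx, sxy, syy = cov[0][0], cov[0][1], cov[1][1]
    det = sxx * syy - sxy * sxy
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    # Chi-square(2 df) cutoff; assumes known parameters (large-sample case).
    return d2 > -2.0 * math.log(alpha)


def bvnl_positive(x, y, mean, cov, tau1, tau2, alpha):
    """Screen-positive only if conditions a, b, and c all hold."""
    return x < tau1 and y > tau2 and outside_ellipse(x, y, mean, cov, alpha)


# Hypothetical standardized normal population (uncorrelated for simplicity).
settings = dict(
    mean=(0.0, 0.0),
    cov=((1.0, 0.0), (0.0, 1.0)),
    tau1=-2.9,
    tau2=2.9,
    alpha=1e-6,
)

print(bvnl_positive(-5.0, 5.0, **settings))  # True: a, b, and c all hold
print(bvnl_positive(-3.0, 3.0, **settings))  # False: inside the ellipse
print(bvnl_positive(-6.0, 0.0, **settings))  # False: Y is not above tau2
```

The second point shows why condition c matters: both univariate thresholds are crossed, yet the observation is still within the region expected for normal newborns at this α.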

α should be set at the largest false positive rate that is tolerable to the program, as α is the maximum possible false positive rate of a prediction rule defined as above. The actual false positive rate (FPR) will be less than α; how much less depends on the choice of τ1 and τ2. The smaller τ1 and the larger τ2 are, the lower the FPR will be. Given the choices of τ1, τ2, and α and a large sample of normal infants, a close estimate of the actual FPR can be computed theoretically or estimated by a simulation study. This is not necessary if the assumption of normality is met, as it is then guaranteed that FPR < α [56]. Then, given large n and consistent maximum likelihood estimates, the resulting prediction ellipses will contain approximately (1 − fpr)100% of future normal observations [60].

Evaluating a diagnostic test for any disease requires estimation of six epidemiological parameters: Sensitivity (Sens), Specificity (Spec), Positive Predictive Value (PPV), Negative Predictive Value (NPV), False Positive Rate (FPR), and False Negative Rate (FNR). These are estimated as follows when a random sample of size N = TP + FP + FN + TN is available from the general population:

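\[ \text{Sensitivity} = \frac{TP}{TP + FN} \tag{1} \]

\[ \text{Specificity} = \frac{TN}{TN + FP} \tag{2} \]

\[ \text{Positive Predictive Value} = \frac{TP}{TP + FP} \tag{3} \]

\[ \text{Negative Predictive Value} = \frac{TN}{TN + FN} \tag{4} \]

\[ \text{False Positive Rate} = \frac{FP}{FP + TN} = 1 - \text{Specificity} \tag{5} \]

\[ \text{False Negative Rate} = \frac{FN}{FN + TP} = 1 - \text{Sensitivity} \tag{6} \]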

where TP represents the number of correct positive test results, FP the number of incorrect positive test results, FN the number of incorrect negative test results, and TN the number of correct negative test results.

In the case of rare diseases, small samples may leave one or more of these quantities inestimable. For some diseases, the necessary additional information is available in the form of disease prevalence estimates from the literature. If accurate disease prevalence estimates are available, PPV can be estimated using Bayes’ Rule as follows:

\[ \text{PPV} = \frac{\text{Sens}}{\text{Sens} + \text{FPR} \cdot O} \tag{7} \]

where O = (1 − Prev)/Prev, and Prev denotes the prevalence of the disease in the general population. Thus, O is the odds that a randomly sampled infant from the general population does not have the disease. So, if the prevalence of the disease is known or externally estimated, then a valid estimate of PPV can be calculated by substituting the estimate of sensitivity in Equation (1), the estimate of FPR in Equation (5), and the calculated value of O into Equation (7), provided that FP ≠ 0. Analogous statements hold for NPV but are not discussed here, as Equation (7) is particularly relevant in BVNL applications to KD and MPS I presented below.

A beauty of BVNL NBS tests is that FPR is knowable or can even be fixed at an acceptable level by one’s choice of τ1, τ2, and α. Thus, PPV can be properly estimated even when FP = 0, if an estimate of prevalence is available. Furthermore, even without performing the difficult mathematical tasks of theoretically calculating FPR for chosen values of τ1, τ2, and α, or of choosing values of τ1, τ2, and α that ensure an acceptably low pre-specified FPR, we have the following lower bound on PPV:

\[ \text{PPV} > \frac{\text{Sens}}{\text{Sens} + \alpha \cdot O}, \tag{8} \]

because it is mathematically guaranteed that the FPR of a BVNL test is less than α.

3.3. Review of an Application of a BVNL NBS Test for KD

The need for a bivariate approach to EIKD NBS was established by examination of univariate normal limits of GaLC enzyme, which concluded that while depleted GaLC enzyme levels were indicative of EIKD, they could not solely determine phenotype [61]. After interest in PSY re-emerged, measurements of GaLC and PSY were used successfully in an initial BVNL approach, although the lack of simultaneous GaLC/PSY measurements from a normal population limited studies to investigations of the potential benefits of a bivariate approach [59,62] and ad hoc development and application of the first BVNL NBS test for KD [5].

The results were positive, and work began to collect simultaneous GaLC/PSY measurements from healthy newborns, which was necessary for fully rigorous development. In October 2016, data from 166 NBS dried blood spots, as well as 15 affected KD cases with symptom onset prior to 29 months, were utilized to further develop a BVNL test for KD screening [6]. This involved standardizing and centering natural-log transformations of GaLC and PSY determinations and deriving a (1 − 10⁻⁶)100% prediction ellipse for z-scores that is portable to any NBS program. The values of τ1 = −2.90, τ2 = 2.90, and α = 10⁻⁶ were chosen. These settings corresponded roughly to an FPR of 10⁻⁷. Figure 1 below shows the resulting ellipse and the results of applying the resulting BVNL test to the normative and diseased samples.

At this FPR, only one falsely predicted early childhood case of KD is expected to occur in every 10 million newborns [6]. The rough approximation that FPR = 10⁻⁷ for the above settings of τ1, τ2, and α was confirmed by Monte Carlo simulation results.
By generating 100,000,000 observations from the estimated distribution of normal newborns illustrated by the ellipses in the figure above and tabulating the number of observations falling in the abnormal region, a simulated FPR was estimated and reported to be 1.1 per 10 million newborns, very close to the roughly approximated 1 per 10 million [5]. This FPR corresponds to one expected false positive every 2.5 years if every US newborn were screened [6].

Langan et al. [6] reported an estimated sensitivity of 1.0 and a specificity also of 1.0. An estimated PPV was obtained by substituting these estimates, along with FPR = 10⁻⁷ in place of α and O = 149,999, into the right-hand side of Equation (8) above, giving an estimated PPV of 98.5%, which far exceeds the estimated PPV of previously employed test protocols that do not use BVNL [46]. O was calculated from a reported prevalence of 1 in 150,000 from the literature [63]. Efforts are currently underway for a prospective evaluation/validation of this BVNL test for KD.

Carter et al. showed that the BVNL test for KD performed better than a univariate test based on GaLC alone (predict KD if X < τ1), a univariate test based on PSY alone (predict KD if Y > τ2), and a bivariate test that predicts KD if X < τ1 and Y > τ2 (i.e., a bivariate test that is based on conditions a and b above, but not c) [64]. The operating characteristics of the BVNL test were better than those of the last of these three tests because the BVNL test incorporates information about the shape of the distribution of (X, Y) points in the normal population to better identify abnormal observations. The BVNL test offers a theoretically ensured improvement in FPR that did not increase FNR in the application.

3.4. Review of an Application of a BVNL NBS Test for MPS I

Just as deficient GaLC enzyme activity causes a harmful excess of PSY in KD patients, deficiency in IDUA enzyme leads to the harmful accumulation of GAGs. Kubaski et al.
examined dried blood spots from 2862 NBS patients and 14 MPS cases, 7 of which were MPS I patients, demonstrating that certain GAGs may be beneficial in NBS programs as a first- or second-tier test [27]. Langan et al. considered IDUA enzyme and the GAG heparan ΔDi-NS [2-deoxy-2-sulfamino 4-O-(4-deoxy-α-L-threo-hex-4-enopyranosyluronic acid)-D-glucose] (HS) as part of a BVNL approach to MPS I NBS [7].

Using 5000 normal newborns from Japanese screening efforts in the Gifu prefecture, BVNL prediction ellipses were calculated for univariately centered and standardized natural log values of HS and IDUA activity. This resulted in a (1 − 10⁻⁷)100% BVNL prediction ellipse. The values of τ1 = −3.62, τ2 = 1.90, and α = 10⁻⁷ were chosen. Thus, NBS observations with transformed IDUA less than −3.62 and transformed HS greater than 1.90 that fall outside the BVNL ellipse are BVNL test-positive. These thresholds and α-level result in an FPR of roughly 10⁻⁸. The MPS I BVNL plot is shown below in Figure 2, with seven MPS I cases and 12 pseudo-deficient normal newborns [7].

Langan et al. followed a simulation strategy similar to that of their KD application to estimate specificity and PPV [6,7]. Ultimately, they report that this BVNL tool for MPS I yields one false positive in 100 million newborns tested, with a sensitivity of 100%, specificity of 99.999999%, and a PPV of 99.9%. Langan et al. conclude that the BVNL tool outperformed univariate threshold tests using IDUA and HS, and also the joint univariate test of both IDUA/HS [7].

3.5. Review of an Application of a BVNL NBS Test for Infantile PD

Following efforts with KD and MPS I, Langan et al. applied BVNL methods to biomarkers relevant to PD. In the case of PD, deficient GAA enzyme activity combined with creatine (CRE) levels appears to offer potential in the diagnosis of IOPD. While further refinement of the ellipse and additional testing of some referred patients are required before the ellipse itself may be published, the preliminary findings are discussed below.

Dried blood spots of 312,105 normal newborns from New York State screening programs were tested for CRE and GAA activity. This resulted in a (1 − 10⁻⁸)100% BVNL prediction ellipse. The values of τ1 = −4, τ2 = −1, and α = 10⁻⁸ were chosen. Thus, NBS observations with transformed GAA less than −4 and transformed CRE greater than −1 that fall outside the BVNL ellipse are BVNL test-positive. These thresholds and α-level result in an FPR of roughly 10⁻⁹. The PD BVNL test accurately identified seven known IOPD cases and 312,105 normal newborns. Four of the presumptively normal newborns were also identified as PD cases by the preliminary BVNL tool.

Unlike KD and MPS I, the presence of false positives on the preliminary PD ellipse allows for direct estimation of the achieved false positive rate and PPV using Equations (3) and (5). With four apparent false positives, the BVNL method achieves an FPR of 0.000013, with a 95% confidence interval of (0.0000003, 0.00003), and a PPV of 63.64%, with a 95% confidence interval of (35.21, 92.06). It should be noted that adjusting τ1 (the GAA threshold) to −7.5 would eliminate all false positives, although the risk of missing true cases would increase. The PD test does not yet perform as well as the BVNL tests for KD and MPS I, and further refinements, such as an alternative to CRE as a second-tier biomarker or a different choice of transformation for GAA and/or CRE, are currently being investigated. Nevertheless, these achieved values are improvements on several reported studies [32,33,36,37,38,39,40]. Further, compared to the BVNL implementation, using univariate thresholds alone as screening criteria would have resulted in a minimum of 299 additional false positives.
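The PPV reported for KD in Section 3.3 and the PD estimates above can be checked with a short calculation. The sketch below is not from the source and rests on stated assumptions: the confidence intervals are taken to be Wald (normal-approximation) intervals, which is a guess about the method used, and the four flagged newborns are assumed to be counted among the 312,105 normal newborns. Function names are hypothetical.

```python
import math


def ppv_from_prevalence(sens, fpr, prevalence):
    """PPV via Bayes' Rule: Sens / (Sens + FPR * O), with O = (1 - Prev) / Prev."""
    odds_against = (1.0 - prevalence) / prevalence
    return sens / (sens + fpr * odds_against)


def wald_interval(successes, n, z=1.96):
    """Point estimate and Wald 95% interval for a proportion (assumed method)."""
    p = successes / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p, p - half_width, p + half_width


# KD: sensitivity 1.0, FPR 1e-7, prevalence 1 in 150,000 (values from the text).
kd_ppv = ppv_from_prevalence(sens=1.0, fpr=1e-7, prevalence=1 / 150_000)

# PD: 7 true positives and 4 false positives, so PPV = TP / (TP + FP).
pd_ppv, pd_ppv_lo, pd_ppv_hi = wald_interval(7, 7 + 4)

# PD: FPR = FP / (FP + TN), assuming FP + TN = 312,105 normal newborns.
pd_fpr, pd_fpr_lo, pd_fpr_hi = wald_interval(4, 312_105)

print(f"KD PPV: {kd_ppv:.3f}")
print(f"PD PPV: {pd_ppv:.4f} ({pd_ppv_lo:.4f}, {pd_ppv_hi:.4f})")
print(f"PD FPR: {pd_fpr:.7f} ({pd_fpr_lo:.7f}, {pd_fpr_hi:.7f})")
```

Under these assumptions the calculation gives a KD PPV of about 98.5%, a PD PPV of about 63.6% with interval bounds near 35.2% and 92.1%, and a PD FPR of about 0.000013 as a proportion.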
