Performance of the FIND-FH machine learning algorithm for the identification of individuals with suspected familial hypercholesterolemia

Familial hypercholesterolemia (FH) is an autosomal dominant disorder of cholesterol metabolism that results in high levels of low-density lipoprotein cholesterol (LDL-C) from birth and up to a 20-fold increase in lifetime risk of cardiovascular disease.1,2 Individuals with heterozygous FH experience shortened life expectancy and a 30% to 50% risk of major cardiac events by age 50.2,3 FH is common, affecting approximately 1 in every 250 individuals, yet it is significantly underdiagnosed and undertreated.1,4 Early diagnosis of FH allows for prompt initiation of appropriate lipid-lowering therapy, which can markedly lower lifetime cardiovascular risk.3 Ideally, diagnosis should be followed by screening of all first-degree relatives with cholesterol testing or genetic testing to identify additional affected individuals; genetic testing for FH is designated Tier 1 status by the Centers for Disease Control and Prevention because of its significant potential for positive impact on public health.5,6 Given the critical importance of early intervention, identification of undiagnosed patients with FH could meaningfully improve clinical outcomes for these individuals.

Machine learning algorithms (MLAs) offer a scalable path toward identifying people at risk of cardiac diseases and related complications.7 In 2016, the Family Heart Foundation developed the Find, Identify, Network, Deliver FH (FIND-FH) MLA, designed specifically to identify individuals with a high likelihood of FH. This random forest-based MLA, trained using pooled electronic health record (EHR) data from over 80,000 patients with and without FH, calculates a FIND-FH score from various demographic and clinical criteria. When the model was applied to a national healthcare encounter database and an integrated healthcare delivery system dataset, expert clinical review of a sample of flagged individuals found that 87% and 77%, respectively, had a high clinical suspicion of FH.8

Through the FIND-FH Collaborative Learning Network, which is coordinated by the Family Heart Foundation, the FIND-FH MLA has recently been implemented at 6 health systems to identify individuals with a high suspicion of FH. A better understanding of the score's performance in real-world datasets could support its larger-scale application in FH quality improvement programs. Prior work has examined the FIND-FH algorithm’s performance with prespecified complete datasets,8 but its performance in real-world clinical samples remains unknown. This study sought to evaluate the performance of the FIND-FH score when the MLA was applied to real-world EHR data from a large academic healthcare system, including the characteristics of the patients identified as possibly having FH, phenotypic differences by score category, accuracy for identification of FH, and appropriateness for FH outreach.
