Preeclampsia (PE) is a pregnancy-specific multisystem disorder. It is characterized by elevated blood pressure and proteinuria after the 20th week of gestation, and is often accompanied by dysfunction of vital organs such as the brain, heart, liver, and kidneys [1]. Depending on the time of onset, PE is categorized as early-onset preeclampsia (EOPE) if it occurs between 20 and 34 weeks of gestation and late-onset preeclampsia (LOPE) if it occurs after 34 weeks of gestation [2]. The overall prevalence of PE is 3.1 %. The prevalence of EOPE and LOPE is 0.38 % and 2.72 %, respectively [3]. EOPE is a severe form of PE with worse maternal and fetal outcomes compared to LOPE, and it is one of the leading causes of maternal and perinatal mortality [4].
As EOPE occurs early in pregnancy, early termination of pregnancy is usually required to safeguard maternal and fetal health. The condition is associated with high rates of neonatal mortality and of preterm complications such as respiratory distress syndrome, intraventricular hemorrhage, and necrotizing enterocolitis [5], [6]. Furthermore, pregnant women with EOPE face a higher risk of severe complications, including acute renal failure, liver failure, and disseminated intravascular coagulation [7]. In developing countries, insufficient prenatal screening and inadequate prenatal care due to economic and medical limitations have led to an increasing number of EOPE cases with poor prognosis. Many patients are diagnosed late, which makes treatment more complicated and leaves fewer options. Multiple studies have confirmed that the rational use of aspirin in early pregnancy in populations at high risk of EOPE can reduce the risk of PE [8], [9], [10], [11]. Therefore, early identification of populations at high risk of EOPE and timely implementation of targeted treatment and care measures are crucial for preventing and controlling PE and for improving its outcomes.
Traditional methods for predicting EOPE rely primarily on clinical risk assessment and biochemical markers, but these methods often lack adequate accuracy and timeliness. For instance, proteinuria and the ratio of soluble fms-like tyrosine kinase-1 (sFlt-1) to placental growth factor (PlGF) are conventional predictive indicators. Although proteinuria is a hallmark of PE, its predictive value is limited because significant proteinuria often indicates a relatively advanced stage of the disease [12]. Additionally, the levels of sFlt-1 and PlGF can be influenced by various factors, and their sensitivity and specificity remain limited when used to predict EOPE [13]. Previous studies have attempted to construct PE prediction models by combining maternal factors, laboratory tests, and specific biomarkers using traditional statistical methods such as logistic regression. However, the performance of most of these prediction models has been suboptimal [14], owing to the limited sensitivity and processing power of traditional statistical methods, the absence of certain predictors from routine antenatal care in resource-limited settings, and ethnic disparities. Consequently, such models are often unsuitable for application in developing countries and regions with less advanced healthcare systems.
In recent years, the rapid development of medical big data platforms has raised the question of how to mine latent patterns from vast datasets and apply them to clinical diagnosis and treatment, a topic of keen interest among researchers. As a core component of artificial intelligence (AI), machine learning (ML) algorithms have evolved over several decades, improving the rigor and accuracy of data analysis while reducing time and labor costs. They are now widely applied to intelligent disease screening, diagnosis, and treatment across medical fields, and AI-driven methods can improve disease diagnosis, risk prediction, and auxiliary treatment [15], [16]. ML typically requires large training samples to improve the sensitivity of prediction models and recognizes latent patterns in data through its algorithms, thereby providing new avenues for the early prediction of EOPE.
Research on applying ML algorithms to the prediction of PE remains in its infancy. While some studies have explored this area, most focus solely on predicting the occurrence of PE without differentiating between the early-onset and late-onset subtypes. For instance, Bülent et al. [17] conducted a retrospective analysis of medical records from 10,352 pregnant women and used the LightGBM algorithm to predict the incidence of PE. Their model achieved an area under the curve (AUC) of 0.832 (95 % CI: 0.818–0.846) and an accuracy of 90.6 % (95 % CI: 90.1 %–91.1 %). Similarly, Li et al. [18] analyzed pregnancy data from 5,116 cases and applied logistic regression alongside four ML methods: Extra Trees Classifier, Voting Classifier, Gaussian Process Classifier, and Stacking Classifier. Among these, the Voting Classifier performed best in predicting PE, with an AUC of 0.831 and a detection rate of 51.3 % at a 10 % false-positive rate. However, the features used in that study were not derived from routine prenatal examinations, which potentially limits its clinical applicability.
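A detection rate at a fixed false-positive rate, as quoted above, can be read as the highest sensitivity a model reaches while flagging no more than the stated share of unaffected pregnancies. The following is a minimal sketch of that calculation, assuming binary outcome labels and model risk scores; the function name `detection_rate_at_fpr` and the toy data are illustrative and are not taken from the cited studies.

```python
def detection_rate_at_fpr(labels, scores, max_fpr=0.10):
    """Highest sensitivity achievable with a false-positive rate <= max_fpr.

    labels: 1 for affected (e.g. PE), 0 for unaffected.
    scores: model risk scores; higher means higher predicted risk.
    A case is flagged as positive when its score >= threshold.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one affected and one unaffected case")

    best = 0.0
    # Lower the threshold step by step; both rates can only grow, so we
    # stop as soon as the false-positive rate exceeds the allowed limit.
    for threshold in sorted(set(scores), reverse=True):
        fpr = sum(s >= threshold for s in neg) / len(neg)
        if fpr > max_fpr:
            break
        best = sum(s >= threshold for s in pos) / len(pos)
    return best


# Hypothetical toy data: 4 affected and 10 unaffected pregnancies.
toy_labels = [1, 1, 1, 1] + [0] * 10
toy_scores = [0.90, 0.85, 0.50, 0.20,
              0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.80]

print(detection_rate_at_fpr(toy_labels, toy_scores, max_fpr=0.10))  # 0.75
```

In the toy data, one unaffected case (score 0.80) may be flagged before the 10 % limit is exceeded, which allows a threshold of 0.50 and detects three of the four affected cases.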
Therefore, we aim to apply ML algorithms to routine clinical data from pregnant women, extract potential features from these complex datasets, construct five risk prediction models for EOPE, and compare their accuracy. Our goal is to identify the most accurate model, thereby improving the prediction of EOPE, providing reliable decision support for early clinical intervention, reducing maternal and neonatal complications, and improving the quality of prenatal care.