Serological indices and ultrasound variables in predicting the staging of hepatitis B liver fibrosis: A comparative study based on random forest algorithm and traditional methods
Daolin Xie1, Minghua Ying2, Jingru Lian2, Xin Li2, Fangyi Liu2, Xiaoling Yu2, Caifang Ni3
1 Department of Interventional Radiology, The First Affiliated Hospital of Soochow University, Suzhou; Department of Interventional Ultrasound, Fifth Medical Center of Chinese PLA General Hospital, Beijing, China
2 Department of Interventional Ultrasound, Fifth Medical Center of Chinese PLA General Hospital, Beijing, China
3 Department of Interventional Radiology, The First Affiliated Hospital of Soochow University, Suzhou, China
Correspondence Address:
Caifang Ni
Department of Interventional Radiology, The First Affiliated Hospital of Soochow University, 188 Shizi Road, Suzhou - 215006
China
Xiaoling Yu
Department of Interventional Ultrasound, Fifth Medical Center of Chinese PLA General Hospital, Beijing
China
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/jcrt.jcrt_1394_22
Objective: To compare the diagnostic efficacy of serological indices and ultrasound (US) variables in hepatitis B virus (HBV) liver fibrosis staging using random forest algorithm (RFA) and traditional methods.
Methods: The demographic and serological indices and US variables of patients with HBV liver fibrosis were retrospectively collected and divided into serology group, US group, and serology + US group according to the research content. RFA was used for training and validation. The diagnostic efficacy was compared to logistic regression analysis (LRA) and APRI and FIB-4 indices.
Results: For the serology group, the diagnostic performance of RFA was significantly higher than that of APRI and FIB-4 indices. The diagnostic accuracy of RFA in the four classifications (S0S1/S2/S3/S4) of the hepatic fibrosis stage was 79.17%. The diagnostic accuracy for significant fibrosis (≥S2), advanced fibrosis (≥S3), and cirrhosis (S4) was 87.99%, 90.69%, and 92.40%, respectively. The area under the curve (AUC) values were 0.945, 0.959, and 0.951, respectively. For the US group, there was no significant difference in diagnostic performance between RFA and LRA. The diagnostic performance of RFA in the serology + US group was significantly better than that of LRA. The diagnostic accuracy of the four classifications (S0S1/S2/S3/S4) of the hepatic fibrosis stage was 77.21%. The diagnostic accuracy for significant fibrosis (≥S2), advanced fibrosis (≥S3), and cirrhosis (S4) was 87.50%, 90.93%, and 93.38%, respectively. The AUC values were 0.948, 0.959, and 0.962, respectively.
Conclusion: RFA can significantly improve the diagnostic performance of HBV liver fibrosis staging. RFA based on serological indices has a good ability to predict liver fibrosis staging. RFA can help clinicians accurately judge liver fibrosis staging and reduce unnecessary biopsies.
Keywords: Liver fibrosis, logistic regression analysis, random forest algorithm, serum, ultrasound
Authors Xiaoling Yu and Caifang Ni as co-corresponding authors have made equal contributions.
> Introduction
Hepatitis B virus (HBV) infection is a global public health problem. According to statistics, approximately 292 million people worldwide have chronic HBV infection,[1] including 32 million in China.[2] At present, liver biopsy is still regarded as the “gold standard” for the clinical diagnosis of liver fibrosis, but its many shortcomings make it difficult for patients to accept. Therefore, finding an alternative method to evaluate liver fibrosis staging has become a focus of hepatology research worldwide. Noninvasive evaluation methods for liver fibrosis reported in the literature mainly comprise serology and imaging. No single serum index is specific for liver fibrosis, so serological indices are mainly studied in combination with imaging indices.[3],[4]
Ultrasound (US) is one of the main imaging methods used to screen for liver diseases. In recent years, interventional US has been used increasingly in chronic liver diseases, and US-guided ablation[5] has become one of the treatment methods for liver cancer. Artificial intelligence (AI) is now widely applied in the medical field, including the central nervous system,[6] breast,[7] gynecology,[8] respiratory system, and digestive system. AI also plays an important role in radiation oncology, where it can increase the automation of radiotherapy plan design and quality control, thereby promoting and guaranteeing individualized precision treatment. A previous review covered the application of AI in radiotherapy physics and anticipated its prospects in radiotherapy plan design, radiotherapy quality assurance, quality control, delineation of high-risk organs, and dose prediction.[9]
Combining machine learning (ML) with medical imaging in the field of liver disease has gradually become one of the hotspots. There are many reports in the literature, including diagnosis and differential diagnosis of benign and malignant tumors of the liver,[10],[11],[12],[13] determination of the tumor boundary,[14] prediction of postoperative recurrence for hepatocellular carcinoma (HCC),[15],[16],[17] screening of risk factors for HCC,[18] and screening of prognostic factors for HCC.[19],[20]
ML has also been combined with serological indices for liver fibrosis staging. This approach uses the computing power of ML to extract hidden information from many features that have only a weak linear or a nonlinear relationship with fibrosis stage, thereby improving diagnostic performance. As early as 2006, Piscaglia et al.[21] compared neural networks with logistic regression analysis (LRA) on the serological indices of liver transplant patients (414 cases in the training group and 96 cases in the validation group). The former performed clearly better than the latter. In 2018, Wei et al.[22] compared gradient boosting (GB) with FIB-4 on the serological indices of a group of HBV patients (490 in the training group and 86 in the validation group). In the binary classification, the former performed significantly better than the latter. In the same year, Shousha et al.[23] obtained a similar result. In 2022, Sarvestany et al.[24] conducted a multicenter study with external validation on patients with liver fibrosis of multiple etiologies. They analyzed the patients' serological indices using support vector machines, random forests, gradient boosting classifiers, logistic regression, artificial neural networks, and a fusion model combining these ML algorithms (MLAs), and compared the results with those of APRI, FIB-4, and NFS. The fusion model was better than the traditional methods at distinguishing advanced liver fibrosis.
In this study, serological indices and grayscale US variables of patients with chronic HBV were collected. Random forest algorithm (RFA) and traditional noninvasive liver fibrosis staging models were used to classify liver fibrosis.
> Materials and Methods
General information
A total of 1359 patients with HBV who underwent US-guided puncture biopsy at the Fifth Medical Center of Chinese PLA General Hospital from January 1, 2014 to August 31, 2021 were selected for this study. Patients were randomly divided into training set (n = 951) and validation set (n = 408). Inclusion criteria were as follows: (1) age ≥18 years; (2) the pathological result was chronic HBV, and the fibrosis stage was clear; and (3) inpatients at the hospital had complete relevant data. Exclusion criteria were as follows: (1) complications with fatty liver or overlapping liver, (2) hematological diseases, (3) pregnant patients, (4) liver cancer patients or liver transplantation patients, (5) hepatic cholestasis or hepatic edema caused by biliary obstruction or other factors, and (6) patients with incomplete data [Figure 1].
Figure 1: Flowchart of RFA and traditional methods. A total of 1359 patients with chronic HBV were selected from 8554 patients. RFA and LRA were used for US variables, RFA and APRI/FIB-4 were used for serological indices, and RFA and LRA were used for serological indices + US variables
Demographic and serological indices
This study included 44 indices potentially related to liver fibrosis: demographic indices (age, sex, and body mass index), routine blood indices [red blood cell, white blood cell, and platelet (PLT) counts], coagulation indices, and four items of HBV and HBV DNA replication. Liver function indices included cholinesterase, alanine aminotransferase (ALT), aspartate aminotransferase (AST), and γ-glutamyltransferase (GGT).
Grayscale US
A routine US examination of the upper abdomen was performed after fasting for >8 h. The variables recorded were liver parenchyma echo, hepatic capsule, width of the portal vein, width of the common bile duct, intercostal thickness of the spleen, and long diameter of the spleen. (Because this study is retrospective, most grayscale US examinations did not record the blood flow velocity of the portal and splenic veins or the width of the splenic vein, so these three indices were not included.) The liver parenchyma echo was described in five grades: dense, thickening, cords, nodules, and patches. The hepatic capsule was described in four grades: smooth, undersmooth, unsmooth, and serrated. All liver US images were read by three senior doctors who had worked in US diagnosis for >5 years, and disagreements were resolved by discussion.
Liver puncture and pathological diagnosis
Liver puncture was carried out during the same hospitalization, within 1 week before or after the routine US examination. All pathological sections went through a process of preliminary reading, report drafting, senior doctor review, and report issuance. In this study, senior doctors were asked to read all sections once again. According to the Scheuer score system, the degree of liver fibrosis was divided into five grades (S0–S4). In this study, S0 and S1 were combined into one group, and S2, S3, and S4 were combined as a group.
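The groupings reported in the Results can be illustrated with a minimal sketch that maps a Scheuer stage (0 to 4) to the four-class label S0S1/S2/S3/S4 and to the three binary targets (≥S2, ≥S3, S4). The function and dictionary names are hypothetical and are not taken from the study's code.

```python
from typing import Dict

# Four-class grouping used in the Results: S0 and S1 share one label.
FOUR_CLASS = {0: "S0S1", 1: "S0S1", 2: "S2", 3: "S3", 4: "S4"}

# Binary targets and the minimum Scheuer stage that counts as positive.
BINARY_THRESHOLDS = {
    "significant fibrosis": 2,  # >= S2
    "advanced fibrosis": 3,     # >= S3
    "cirrhosis": 4,             # S4
}


def four_class_label(stage: int) -> str:
    """Return the four-class label for a Scheuer stage in 0..4."""
    if stage not in FOUR_CLASS:
        raise ValueError("Scheuer stage must be an integer from 0 to 4")
    return FOUR_CLASS[stage]


def binary_targets(stage: int) -> Dict[str, int]:
    """Return 1/0 for each binary target, given a Scheuer stage in 0..4."""
    if stage not in FOUR_CLASS:
        raise ValueError("Scheuer stage must be an integer from 0 to 4")
    return {name: int(stage >= cut) for name, cut in BINARY_THRESHOLDS.items()}
```

For example, a stage S3 patient is positive for significant and advanced fibrosis but negative for cirrhosis.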
Research methods
The analysis was divided into three groups according to the source of the data: serology group, US group, and serology + US group. For serological indices, RFA and traditional noninvasive liver fibrosis staging models were used for classification and diagnosis. RFA included all 44 demographic and serological indices, whereas the traditional noninvasive models were taken from the “consensus on diagnosis and treatment of liver fibrosis (2019)”.[25] The APRI and FIB-4 diagnostic models were used [calculation formulas: APRI = AST × 100/PLT; FIB-4 = (age × AST)/(PLT × √ALT)], and the relevant evaluation indices of each model were calculated. For US variables, RFA and LRA were used for classification and diagnosis. Six US variables and five demographic indices were included, and the relevant evaluation indices of each model were calculated. For the joint diagnosis based on serological indices + US variables, RFA and LRA were used for classification and diagnosis. RFA included all 50 indices, and LRA included alkaline phosphatase, PLT, GGT, HBV surface antigen, and liver parenchyma echo. The relevant evaluation indices of each model were calculated.
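The two index formulas can be written as a short sketch. It assumes AST and ALT in U/L, PLT in 10^9/L, and age in years. The optional `ast_uln` argument is an assumption added here: the widely used APRI definition divides AST by its upper limit of normal, whereas the formula as printed in this paper does not, so the default of 1.0 reproduces the printed form.

```python
import math


def apri(ast: float, plt: float, ast_uln: float = 1.0) -> float:
    """APRI = (AST / ast_uln) * 100 / PLT.

    With ast_uln = 1.0 this is the formula as printed in the text.
    ast_uln (upper limit of normal for AST) is an assumed, optional parameter.
    """
    if plt <= 0 or ast_uln <= 0:
        raise ValueError("plt and ast_uln must be positive")
    return (ast / ast_uln) * 100.0 / plt


def fib4(age: float, ast: float, alt: float, plt: float) -> float:
    """FIB-4 = (age * AST) / (PLT * sqrt(ALT))."""
    if plt <= 0 or alt <= 0:
        raise ValueError("plt and alt must be positive")
    return (age * ast) / (plt * math.sqrt(alt))
```

With the illustrative values age = 50, AST = 40, ALT = 25, and PLT = 200, `fib4` gives (50 × 40)/(200 × 5) = 2.0.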
Data preprocessing and standardization processing
Missing values in the dataset were filled with the mean of the same feature within the same class. That is, if a value XP of a feature is missing and the sample belongs to S1, the mean of all non-missing values of that feature among samples classified as S1 is calculated and used to fill the missing entry. The rationale of this filling method is to replace the missing value with its most probable value, inferred from the existing data. Data standardization refers to scaling the attributes of a sample to a specified range. In this study, min-max normalization was adopted. After normalization, the search range of the optimization process becomes smaller, the optimization becomes smoother, and it is easier to converge to the optimal solution. In normalization, the maximum value is mapped to 1, the minimum value is mapped to 0, and the other values are distributed between them. For each feature, let min and max be its minimum and maximum values, respectively; an original value x of this feature is normalized to the value Xscale in the interval [0,1]. The formula is as follows:
$$X_{\text{scale}} = \frac{x - \min}{\max - \min}$$
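The two preprocessing steps described above can be sketched in plain Python for a single feature column. This is only an illustration under stated assumptions: the function names are hypothetical, missing values are represented as `None`, and a constant feature is mapped to zeros, a case the text does not address.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence


def impute_by_class_mean(values: Sequence[Optional[float]],
                         labels: Sequence[str]) -> List[float]:
    """Replace each None with the mean of the non-missing values sharing its label."""
    sums: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for v, lab in zip(values, labels):
        if v is not None:
            sums[lab] += v
            counts[lab] += 1
    filled: List[float] = []
    for v, lab in zip(values, labels):
        if v is None:
            if counts[lab] == 0:
                raise ValueError(f"no observed values for class {lab!r}")
            v = sums[lab] / counts[lab]
        filled.append(v)
    return filled


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Map values linearly onto [0, 1]; a constant feature maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Toy data: one S1 value is missing and is filled with the S1 mean.
raw = [10.0, None, 30.0, 50.0]
stage = ["S1", "S1", "S1", "S4"]
filled = impute_by_class_mean(raw, stage)   # [10.0, 20.0, 30.0, 50.0]
scaled = min_max_scale(filled)              # [0.0, 0.25, 0.5, 1.0]
```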
Confusion matrix and related evaluation indices
The confusion matrix is the most basic tool for measuring the accuracy of an ML classification model. The vertical axis of the matrix represents the true class y, the horizontal axis represents the predicted class x, and each cell (x, y) holds the number of samples of class y that were predicted as class x. Taking binary classification as an example, the confusion matrix has four indices: true positive (TP), false positive (FP), true negative (TN), and false negative (FN). When the four indices are presented together in a table, the confusion matrix is as follows:
                    Predicted positive    Predicted negative
Actual positive     TP                    FN
Actual negative     FP                    TN
TP means that the sample is actually positive and the model predicts positive. FN means that the sample is actually positive and the model predicts negative. FP means that the sample is actually negative and the model predicts positive. TN means that the sample is actually negative and the model predicts negative. In the confusion matrix, the higher the TP and TN counts, the better the performance of the model; accordingly, the lower the FP and FN counts, the better the performance. However, the confusion matrix contains only sample counts, and these four indices alone cannot measure all aspects of model quality. Therefore, second-level indices are derived from the confusion matrix, such as accuracy (ACC), specificity (SPE), sensitivity (SEN), positive predictive value (PPV), and negative predictive value (NPV), and are reported together with the area under the curve (AUC).
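The second-level indices named above follow directly from the four counts. A minimal sketch is given below; the function name is hypothetical, and returning NaN when a denominator is zero is a choice made here, not something the paper specifies.

```python
from typing import Dict


def _ratio(num: int, den: int) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute ACC, SEN, SPE, PPV, and NPV from binary confusion-matrix counts."""
    return {
        "ACC": _ratio(tp + tn, tp + fp + tn + fn),  # all correct / all samples
        "SEN": _ratio(tp, tp + fn),                 # true positive rate
        "SPE": _ratio(tn, tn + fp),                 # true negative rate
        "PPV": _ratio(tp, tp + fp),                 # precision of positive calls
        "NPV": _ratio(tn, tn + fn),                 # precision of negative calls
    }
```

With the made-up counts TP = 40, FP = 10, TN = 45, and FN = 5, this gives ACC = 0.85, PPV = 0.80, and NPV = 0.90.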
Training and validation of ML models
After the correlation analysis with the liver fibrosis stage, 5 demographic and 39 serological indices were included in the serology group model. Six US variables and five demographic indices were included in the US group model. All variables were included in the serology + US group model. The dataset was divided into training and validation sets in a 7:3 ratio. On the training set, a grid search was run over preset combinations of values for two RFA parameters: the number of decision trees and the maximum tree depth. Each parameter combination was evaluated by 10-fold cross-validation, and the optimal parameters were then selected. Cross-validation can improve the performance and generalization ability of the model. On the validation set, the model was evaluated by accuracy for the multiclass problem and by ACC, SEN, SPE, and AUC for the binary classification problems.
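The training procedure (7:3 split, grid search, 10-fold cross-validation) can be outlined with the standard library alone. This is a sketch, not the authors' pipeline: `score_fn` is a hypothetical callback that would fit a random forest with the given parameters on the `fit` indices and return its accuracy on the `held_out` indices, and the parameter names `n_trees` and `max_depth` and any grid values are illustrative assumptions.

```python
import itertools
import random
from typing import Callable, Dict, Iterator, List, Optional, Sequence, Tuple

Params = Dict[str, int]
ScoreFn = Callable[[Params, List[int], List[int]], float]


def split_7_3(n_samples: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle sample indices and split them 7:3 into training and validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    cut = n_samples * 7 // 10
    return idx[:cut], idx[cut:]


def k_fold(indices: Sequence[int], k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (fit, held_out) index lists; every index is held out exactly once."""
    folds = [list(indices[i::k]) for i in range(k)]
    for i, held_out in enumerate(folds):
        fit = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
        yield fit, held_out


def grid_search(train_idx: Sequence[int], grid: Dict[str, Sequence[int]],
                score_fn: ScoreFn, k: int = 10) -> Tuple[Optional[Params], float]:
    """Return the parameter combination with the highest mean k-fold score."""
    best_params: Optional[Params] = None
    best_score = float("-inf")
    names = sorted(grid)
    for combo in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        scores = [score_fn(params, fit, held) for fit, held in k_fold(train_idx, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score
```

For 1359 samples, `split_7_3` yields 951 training and 408 validation indices, matching the set sizes reported in this study.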
Statistical analysis
Continuous variables were expressed as the mean ± standard deviation or median (interquartile range) and compared using the t-test. Categorical variables were expressed as n (%) and compared using the χ2 test. All P values were two-sided, and P < 0.05 was considered statistically significant. Pearson and Spearman correlation tests were used for correlation analysis. For RFA, ACC, SEN, SPE, PPV, and NPV were calculated from the confusion matrix, and receiver operating characteristic (ROC) curves and AUC values were obtained. All ML calculations were run on Jupyter 4.4.0 (Python version 3.7.0; the processor was an AMD Ryzen 7 4800U). Baseline data and non-ML analyses were performed in SPSS version 25.0.
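As an illustration of what the AUC value measures, it can be computed as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below uses this pairwise definition; it is not the routine used in the study, and the function name is hypothetical.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive (1) and negative (0) scores."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a validation set of a few hundred patients; a rank-based formulation would give the same value more efficiently.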
> Results
Clinical and US data
From January 1, 2014 to August 31, 2021, a total of 8554 patients with chronic diffuse liver disease underwent US-guided liver puncture. After the exclusion criteria were applied, 1359 patients were enrolled. There were 837 males and 522 females, with a mean age of 36.14 ± 11.03 years. Of the 1359 patients, 542 were in stages S0-S1, 387 in stage S2, 249 in stage S3, and 181 in stage S4. All patients were randomly divided into a training set (n = 951) and a validation set (n = 408). The baseline characteristics are shown in [Supplementary Table 1], and the correlation analysis is shown in [Supplementary Table 2].
Diagnostic performance of RFA and LRA based on US variables
Based on six US variables and five demographic indices, the diagnostic accuracy of RFA in the four classifications (S0S1/S2/S3/S4) of the hepatic fibrosis stage was 47.79%. The diagnostic accuracy for significant fibrosis (≥S2), advanced fibrosis (≥S3), and cirrhosis (S4) was 62.25%, 75.25%, and 88.73%, respectively. The AUC values were 0.684, 0.776, and 0.821, respectively. The diagnostic accuracy of LRA in the four classifications (S0S1/S2/S3/S4) was 47.8%. The diagnostic accuracy for significant fibrosis (≥S2), advanced fibrosis (≥S3), and cirrhosis (S4) was 64.09%, 78.07%, and 87.86%, respectively. The AUC values were 0.698, 0.792, and 0.828, respectively [Figure 2]; [Table 1].
Figure 2: ROC curves of two methods. RFA and LRA for the diagnosis of HBV liver fibrosis stage based on US variables. The upper pictures are the ROC curves of RFA, the lower pictures are the ROC curves of LRA. There are no significant differences between RFA and LRA in each subclassification
Table 1: Diagnostic performance of random forest and logistic regression in staging of liver fibrosis based on ultrasound variables
Diagnostic performance of RFA, APRI, and FIB-4 indices based on serological indices
Based on 44 demographic and serological indices, the diagnostic accuracy of RFA in the four classifications (S0S1/S2/S3/S4) of the hepatic fibrosis stage was 79.17%. The diagnostic accuracy for significant fibrosis (≥S2), advanced fibrosis (≥S3), and cirrhosis (S4) was 87.99%, 90.69%, and 92.40%, respectively. The AUC values were 0.945, 0.959, and 0.951, respectively. The diagnostic accuracy of APRI in the four classifications was 42.3%. The diagnostic accuracy for significant fibrosis (≥S2), advanced fibrosis (≥S3), and cirrhosis (S4) was 60.12%, 69.09%, and 86.61%, respectively. The AUC values were 0.711, 0.700, and 0.730, respectively. The diagnostic accuracy of FIB-4 in the four classifications (S0S1/S2/S3/S4) was 41.4%. The diagnostic accuracy for significant fibrosis (≥S2), advanced fibrosis (≥S3), and cirrhosis (S4) was 60.12%, 70.20%, and 87.12%, respectively. The AUC values were 0.562, 0.625, and 0.684, respectively [Figure 3]; [Table 2].
Figure 3: ROC curves of three methods. RFA, APRI, and FIB-4 for the diagnosis of HBV liver fibrosis stage based on serological indices. The upper pictures are the ROC curves of RFA, the middle pictures are those of APRI, and the lower pictures are those of FIB-4. The AUC value of RFA in each subclassification is higher than the AUC value of APRI and FIB-4
Table 2: Diagnostic performance of random forest, APRI, and FIB-4 in liver fibrosis staging based on serological indexes
Diagnostic performance of RFA and LRA based on serological indices + US variables
Taking into account serological indices and US variables, the diagnostic accuracy of the four classifications (S0S1/S2/S3/S4) of the hepatic fibrosis stage was 77.21%. The diagnostic accuracy for significant fibrosis (≥S2), advanced fibrosis (≥S3), and cirrhosis (S4) was 87.50%, 90.93%, and 93.38%, respectively. The AUC values were 0.948, 0.959, and 0.962, respectively. LRA included alkaline phosphatase, PLT, GGT, HBV surface antigen, and liver parenchyma echo. The diagnostic accuracy in the four classifications (S0S1/S2/S3/S4) was 51.1%. The diagnostic accuracy for significant fibrosis (≥S2), advanced fibrosis (≥S3), and cirrhosis (S4) was 70.86%, 79.54%, and 89.26%, respectively. The AUC values were 0.770, 0.829, and 0.870, respectively [Figure 4]; [Table 3].
Figure 4: ROC curves of the two methods. RFA and LRA for the diagnosis of HBV liver fibrosis stage based on serological indices + US variables. The upper pictures are the ROC curves of RFA, the lower pictures are the ROC curves of LRA. The AUC value of RFA in each subclassification is higher than the AUC value of LRA
Table 3: Diagnostic performance of random forest and logistic regression in staging of liver fibrosis based on serological indexes + ultrasound variables
> Discussion
Liver fibrosis is a key step in the progression to liver cirrhosis after HBV infection and an important factor affecting prognosis. It results mainly from a series of pathological and physiological changes, such as hepatocyte swelling, necrosis, collapse of cell structure, stimulation of hepatic stellate cells to produce collagen fibers, and deposition of extracellular matrix and collagen.[26] The degree of fibrosis depends on the balance between the production and dissolution of hepatic collagen fibers. When the cause of collagen production is controlled or removed, dissolution predominates, and fibrosis can be alleviated or even reversed. Timely determination of the liver fibrosis stage is therefore particularly important for guiding clinical treatment and course management. Liver biopsy is still regarded as the “gold standard” for the clinical diagnosis of liver fibrosis, but it is criticized by clinicians and patients because it is an invasive examination and carries a certain probability of serious complications during and after the procedure. 
In addition, the uneven distribution of liver fibrosis often leads to errors in histological evaluation.[27] Insufficient sample length can lead to FN results, such as underestimation of the fibrosis stage and missed diagnosis of liver cirrhosis,[28] whereas sample fragmentation can lead to FP results.[29] Histological analysis of biopsy specimens requires experience and skill, yet it remains subjective and prone to differences within and between observers. The agreement of liver pathologists in diagnosing fibrosis stages is 60%–90% between observers and 70%–90% within observers.[30] Therefore, noninvasive techniques for liver fibrosis staging, mainly serological and imaging examinations, have become a research hotspot in this field. Among imaging examinations, elastography has achieved good results,[31],[32] but it also has some shortcomings. In recent years, with the integration of medical data, especially medical imaging data, and the rapid development of computing power, ML based on medical big data analysis has not only made great achievements in the diagnosis and treatment of liver space-occupying lesions but has also performed well in diffuse liver diseases, such as the diagnosis of nonalcoholic fatty liver disease,[33],[34],[35] the clinical study of portal hypertension,[36],[37] and the risk prediction of esophageal and gastric variceal bleeding.[38] MLA for the diagnosis of liver fibrosis staging is mainly based on imaging[39],[40],[41],[42],[43] and has achieved good diagnostic efficiency. MLA based on serological indices performs better in the middle and late stages of liver fibrosis,[22] but there are few reports on the diagnosis of early liver fibrosis.
In this study, the diagnostic performance of RFA based on demographic and serological indices for HBV liver fibrosis staging was significantly higher than that of the two traditional noninvasive diagnosis models (APRI and FIB-4), especially in the four-class diagnosis, where the accuracy was 79.17%, 42.3%, and 41.4% for RFA, APRI, and FIB-4, respectively. The diagnostic accuracy for significant fibrosis (≥S2), advanced fibrosis (≥S3), and cirrhosis (S4) increased by roughly 27, 20, and 5 percentage points, respectively [Table 4]. The main reason is that RFA can extract hidden information from many features with weak linear relationships, improving diagnostic performance. LRA and RFA were both used for liver fibrosis staging based on US variables. The results showed that the diagnostic performance of the two methods was only moderate, with no significant difference between them. A possible reason is that too few indices were included for the advantage of RFA to emerge. When the demographic and serological indices and the US variables were included together, RFA again performed well. Compared to LRA, RFA improved the accuracy of the four-class diagnosis by about 26 percentage points. The diagnostic accuracy for significant fibrosis (≥S2), advanced fibrosis (≥S3), and cirrhosis (S4) increased by about 16, 11, and 4 percentage points, respectively [Table 4]. No significant difference in diagnostic performance was found between the RFA model based on serology + US and the RFA model based on serological indices alone; that is, the inclusion of US variables did not improve diagnostic performance. This might be due to the limited information provided by the US variables. To sum up, RFA can significantly improve the accuracy of the classified diagnosis of HBV liver fibrosis, especially for significant fibrosis (≥S2), and can accurately screen patients with early liver fibrosis, which provides a basis for clinical treatment. Through timely intervention in patients with early liver fibrosis, the progression of liver fibrosis can be alleviated or even reversed. The improved accuracy of the four-class diagnosis provides clinicians with accurate liver fibrosis staging, which is conducive to managing the course of the disease and observing curative effect, and reduces unnecessary puncture biopsies.
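The accuracy gains quoted in the Discussion can be recomputed from the validation-set accuracies reported in the Results. The short sketch below does this arithmetic; the dictionary keys and the function name are hypothetical, and the values are copied from the Results text.

```python
from typing import Dict

# Validation-set accuracies (%) as reported in the Results.
RFA_SEROLOGY = {"four-class": 79.17, ">=S2": 87.99, ">=S3": 90.69, "S4": 92.40}
APRI = {"four-class": 42.3, ">=S2": 60.12, ">=S3": 69.09, "S4": 86.61}
FIB4 = {"four-class": 41.4, ">=S2": 60.12, ">=S3": 70.20, "S4": 87.12}
RFA_COMBINED = {"four-class": 77.21, ">=S2": 87.50, ">=S3": 90.93, "S4": 93.38}
LRA_COMBINED = {"four-class": 51.1, ">=S2": 70.86, ">=S3": 79.54, "S4": 89.26}


def gain(model: Dict[str, float], baseline: Dict[str, float]) -> Dict[str, float]:
    """Accuracy difference (model minus baseline) in percentage points."""
    return {task: round(model[task] - baseline[task], 2) for task in model}


print(gain(RFA_SEROLOGY, APRI))
print(gain(RFA_SEROLOGY, FIB4))
print(gain(RFA_COMBINED, LRA_COMBINED))
```

The differences are absolute gains in percentage points, not relative improvements.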
Table 4: Diagnostic accuracy of random forest algorithm and traditional methods
This study has some shortcomings. The distribution of liver fibrosis stages among the included cases is uneven, which is unfavorable both for traditional statistical analysis and for building the MLA model. In addition, this is a single-center study, and the findings need validation in a multicenter study with a large sample size.
> Conclusion
RFA can significantly improve the classification accuracy of HBV liver fibrosis staging. The RFA model based on serological indices has good predictive efficiency for liver fibrosis staging. RFA can help clinicians accurately judge the liver fibrosis stage and reduce unnecessary biopsies.
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
> References