Deep-kidney: an effective deep learning framework for chronic kidney disease prediction

Many researchers have proposed health risk prediction algorithms for a variety of diseases in an effort to reduce mortality. In 2018, Li et al. [19] forecasted the risk of hospital readmission for diabetic patients using a combination of collaborative filtering and deep learning approaches. With 88.94% accuracy, their algorithm outperformed the Naïve Bayes, SVM, and decision tree algorithms. Later that year, Alam et al. [3] created a medical data classifier for ten different diseases based on the Random Forest algorithm and feature ranking. The proposed method determined the significance of features for classification using various feature ranking techniques.

In 2020, Bikku et al. [2] proposed a multi-layer perceptron algorithm based on supervised learning to predict the risk of various diseases (breast cancer, diabetes, and heart disease) with a high degree of certainty. Following that, Shankar et al. [4] developed a technique based on Naïve Bayes and KNN to predict whether a person is a current or future heart disease patient. Coronary heart disease prediction was also addressed with the introduction of the accuracy-based weighted aging classifier ensemble (AB-WAE) algorithm [5], which achieved 93% and 91% accuracy on two different datasets.

Because diabetes is a high-risk disease, its classification has occupied researchers' minds, and the Random Forest and XGBoost algorithms have been applied to the PIMA diabetes dataset for early prediction of this disease. XGBoost was superior, achieving 74.10% accuracy versus 71% for Random Forest [1]. In CKD prediction, however, Random Forest proved superior to XGBoost, reaching 100% accuracy on the CKD dataset [20], as reported in [9, 11].

Risk detection and prediction for chronic kidney disease

Given the riskiness of kidney disease to human health, scientists have attempted to detect it early or predict its occurrence in advance. Disease detection implies that the patient already has the disease, whereas disease prediction implies that it will occur in the future. Consequently, research has been divided into two types: detection and prediction. Regarding the first type, CKD detection, most studies used the same dataset [20]. Almansour et al. [6] used SVM and ANN to detect CKD at an early stage. The dataset was preprocessed, missing values were replaced, and ten-fold cross-validation was applied. This study concluded that ANN outperformed SVM, with accuracy up to 99.75%. The limitation of this study is the small number of samples, which causes the curse of dimensionality; this problem was addressed by employing the SVM algorithm. The study suggests using a deep learning technique to detect CKD.

In the same year, Elhoseny et al. [12] developed an intelligent classification technique for CKD called Density-based Feature Selection (DFS) with Ant Colony based Optimization (D-ACO). This technique addressed the growing number of features in medical data by removing redundant ones, which greatly aided in resolving issues such as low interpretability, high computation cost, and overfitting. Using this method, the authors achieved 95% detection accuracy with only 14 of the 24 features.

During the same year, Kriplani et al. [7] proposed a deep neural network model to detect the absence or presence of CKD in its early stages. The model used cross-validation to avoid overfitting and reached 97% accuracy, outperforming Naïve Bayes, Logistic Regression, Random Forest, AdaBoost, and SVM.

Following that, in 2020, Jongbo et al. [8] used ensemble algorithms (Random Subspace and Bagging) to achieve 100% accuracy on the same dataset, which is suitable for efficient CKD diagnosis. The data was preprocessed, missing values were handled, and the data was normalized. The ensemble was based on majority voting among three base learners: KNN, Naïve Bayes, and Decision Tree. This study demonstrated that combining the base classifiers improved classification performance: according to the experimental results, the proposed model outperformed the individual classifiers on the performance metrics, and in most cases the Random Subspace algorithm outperformed the Bagging algorithm.

During the same year, Ekanayake et al. [9] proposed an efficient method for detecting CKD from medical data, beginning with data preprocessing, then filling missing values with a K-Nearest Neighbors imputer, which yielded higher detection accuracy, and finally applying the classification step. They focused on the practical aspects of data collection, emphasizing the importance of incorporating domain knowledge into machine learning based CKD detection. Among the 11 classifiers tested (logistic regression, KNN, SVC with a linear kernel, SVC with an RBF kernel, Gaussian NB, a decision tree classifier, an XGBoost classifier, an AdaBoost classifier, a classical neural network, Extra Trees, and Random Forest), the authors demonstrated the superiority of the Extra Trees and Random Forest classifiers for CKD detection. The study recommends the K-Nearest Neighbors imputer for handling missing values in other diseases, as well as adding features such as food types, water consumption, and genomics knowledge to the analysis.

In addition, in 2020 Gudeti et al. [10] compared the performance of several machine learning techniques in analyzing CKD and distinguishing between CKD and non-CKD patients. The authors used Logistic Regression, SVM, and KNN models; the SVM model outperformed the other techniques, achieving 99.2% accuracy. The main benefit of this research is that detection is fast, allowing doctors to start treating patients sooner and to further categorize the patient population in less time. However, they used a small dataset of only 400 patients.

Later, in 2021, Chittora et al. [21] detected CKD using either the full feature set or selected important features. Results were computed for the full features, correlation-based feature selection, wrapper-based feature selection, least absolute shrinkage and selection operator (LASSO) regression, LASSO-selected features combined with the synthetic minority over-sampling technique (SMOTE), and the full features combined with SMOTE. C5.0, CHAID, ANN, linear support vector machine (LSVM), logistic regression (LR), random tree (RT), and KNN were used as classifiers. With the full features and SMOTE, LSVM achieved the highest accuracy of 98.86%.

Following that, Senan et al. [11] used machine learning techniques to develop a diagnosis system for CKD to aid experts in early diagnosis. The mean and mode were used to replace missing values, and Recursive Feature Elimination (RFE) was used to select the most important features. The dataset was divided into 75% for training and 25% for testing and validation. Four machine learning algorithms were then applied: support vector machine (SVM), Random Forest (RF), k-nearest neighbors (KNN), and decision tree (DT), with parameters tuned for all classifiers to achieve the best results. Among these four classifiers, Random Forest outperformed the other three by achieving 100% accuracy.

Finally, Singh et al. [22] proposed a deep neural network in 2022. Missing values were replaced by the average of the associated feature, and the Recursive Feature Elimination (RFE) algorithm was used to select features. The key parameters identified were Specific Gravity, Hemoglobin, Red Blood Cell Count, Creatinine Levels, Packed Cell Volume, Albumin, and Hypertension. The selected features were then fed into five classifiers: a deep neural network (DNN), Naïve Bayes, KNN, Random Forest, and Logistic Regression. The DNN outperformed all other models in terms of accuracy. The size of the dataset is a limitation of both the proposed algorithm and previous studies; the next step in this research will be to collect more sophisticated and representative CKD data to detect disease severity. The authors intend to apply the proposed model to medical data containing night urination, acid–base parameters, inorganic phosphorus concentration, and hyperparathyroidism features.

Concerning the second type of investigation, disease risk prediction, the first pioneering technique was proposed in 2021 and concerned CKD prediction, as opposed to the previous studies, which concerned CKD detection [13]. The primary goal of this study was to forecast the occurrence of CKD 6–12 months before disease onset using Taiwan's National Health Insurance dataset [23]. The predictive model was developed using comorbidity, demographic, and medication data from patients over a 2-year period. For the 12-month and 6-month predictions, the CNN model achieved the best AUROC of 0.954 and 0.957, with accuracies of 88% and 89%, respectively. The most important predictors were gout, diabetes mellitus, age, and medications such as angiotensin and sulfonamides. Table 1 summarizes the recent health risk prediction research for CKD.

Table 1 Summary of recent health risk detection and prediction models for CKD

Classification ensemble techniques

Ensemble techniques are considered state-of-the-art methodologies for solving problems in a wide range of machine learning applications. The intuitive motivation for ensembles stems from human nature: the inclination to gather disparate viewpoints and integrate them in order to make a complex decision. The idea depends on integrating multiple base learners to obtain a classifier that outperforms them all, using one of the combination algorithms: Average Ensemble (AE), Weighted Average Ensemble (WAE), or Majority Voting Ensemble (MVE). In recent years, machine learning researchers have demonstrated through hands-on experimental research that combining the outputs of multiple classifiers improves on the performance of a single classifier [18]. Owing to its impact on several machine learning challenges, the ensemble technique has been used in a variety of applications, including disease detection and prediction [5, 8, 24, 25]. The ensemble technique's main idea is to maximize predictive performance by combining the strengths of multiple individual classifiers; deep ensemble models, in turn, aim to create a model that incorporates the advantages of both ensemble and deep models.

Individual classifiers suffer from issues such as overfitting, class imbalance, concept drift, and the curse of dimensionality, which can cause a single classifier's prediction to fail [26]. As a result, ensemble learning has emerged in scientific research to address these issues, and applying it improves predictive accuracy across different machine learning challenges. The main idea of any ensemble learning method is to use a combination function \(f\) to combine a set of \(k\) individual classifiers, \(c_1, c_2, \ldots, c_k\), to predict a single output. Given a dataset of size \(n\) with features of dimension \(m\), \(D=\{(x_i, y_i)\}\), \(1 \le i \le n\), \(x_i \in \mathbb{R}^m\), the prediction of this method is shown in Eq. (1) [27].

$$\hat{y}_i=\varnothing\left(x_i\right)=f\left(c_1,c_2,\ldots,c_k\right)$$

(1)

In this section, we will examine the most common Ensemble techniques that are commonly used in many machine learning applications, as well as some literature reviews on using ensemble techniques in health risk prediction.

Average ensemble (AE)

This technique has demonstrated high efficiency in scientific research. Its main idea is that the final prediction is calculated by taking the average of the individual learners' outputs. This average is calculated either directly from the outcomes of the individual learners or by applying the softmax function to the predicted class probabilities, as shown in Eq. (2). The technique improves performance because the variance among the models is reduced [18].

$$P_i^j=\mathrm{softmax}\left(a_i^j\right)=\frac{e^{a_i^j}}{\sum_{k=1}^{K}e^{a_k^j}}$$

(2)

where \(P_i^j\) is the probability of the outcome of the ith unit on the jth base learner, \(a_i^j\) is the output of the ith unit of the jth base learner, and \(K\) is the number of classes. This approach is appropriate when the individual performances are comparable [28]. On the other hand, it is not appropriate when the individual classifiers' performances are grossly disproportionate: in that case the overall performance is reduced due to the influence of weak learners.
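As a minimal sketch of the averaging step, assuming each base learner exposes raw class scores (logits), the following Python snippet applies the softmax of Eq. (2) to each learner and averages the resulting probabilities; all names and values are illustrative, not taken from any cited work.

```python
import numpy as np

def softmax(logits):
    # Eq. (2): turn raw outputs into class probabilities (numerically stabilised)
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def average_ensemble(logits_per_model):
    # Average the softmax probabilities of all base learners,
    # then predict the class with the highest mean probability.
    probs = np.stack([softmax(l) for l in logits_per_model])  # shape (k, n, K)
    return probs.mean(axis=0).argmax(axis=1)

# Illustrative logits from k = 3 base learners for n = 2 samples and K = 2 classes
logits = [np.array([[2.0, 0.5], [0.2, 1.8]]),
          np.array([[1.5, 1.0], [0.5, 2.0]]),
          np.array([[0.8, 1.2], [1.0, 0.4]])]
print(average_ensemble(logits))  # -> [0 1]
```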

Because this technique does not take the performance of the individual models into account, all models receive the same weight, so the results of a weak base classifier adversely affect the final output. To avoid this problem, the Weighted Average Ensemble (WAE) was proposed, which assigns weights to the models according to their efficiency.

Weighted average ensemble (WAE)

As discussed above, the plain average is unsuitable when the individual learners' performances are markedly unequal, and the WAE avoids this by weighting models according to their efficiency [28]. It can be considered an extension of the previous method: instead of obtaining the final prediction as the plain average of all the base classifiers' predictions, each base classifier is assigned a pre-defined weight that indicates its importance to the prediction, and the final value is the weighted average of the classifiers' outputs. Each classifier in the ensemble therefore contributes to the final prediction in proportion to its weight. For class label prediction, the final prediction is calculated using the mode of the individuals' predictions; for class probability prediction, it is calculated using the argmax of the summed weighted probabilities of each class label [29].
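A minimal sketch of this weighting scheme, assuming the class probabilities of each base learner are already available; the weights here are hypothetical, for example proportional to each model's validation accuracy.

```python
import numpy as np

def weighted_average_ensemble(probs_per_model, weights):
    # Normalise the per-model weights, form the weighted sum of each
    # model's class probabilities, and take the argmax as the prediction.
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    probs = np.stack(probs_per_model)            # shape (k, n, K)
    weighted = np.tensordot(w, probs, axes=1)    # shape (n, K)
    return weighted.argmax(axis=1)

# Illustrative probabilities from three base learners for one sample;
# the stronger (higher-weighted) model dominates the outcome.
probs = [np.array([[0.9, 0.1]]),
         np.array([[0.4, 0.6]]),
         np.array([[0.3, 0.7]])]
print(weighted_average_ensemble(probs, weights=[0.95, 0.70, 0.60]))  # -> [0]
```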

Majority voting ensemble (MVE)

In the research field, this technique is regarded as the most widely used ensemble approach. Like the previous ones, it combines the outputs of the individual learners, but instead of averaging the probability results, the MVE counts the votes of the individual classifiers and predicts the final class based on the majority of votes [18]. The main advantage of this technique is that it is less biased towards the outcome of any specific individual learner, since the majority vote count dilutes each learner's influence; in particular, the influence of weak learners is no longer significant. The majority voting rule comes in three varieties:

(i) Unanimous voting, in which all individual classifiers agree on the prediction;

(ii) Simple majority voting, in which the prediction must be supported by more than half of all classifiers; and

(iii) Majority (plurality) voting, in which the individual learners' votes are counted and the final prediction is the class with the most votes.

The majority voting rule improves prediction performance.
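A minimal sketch of the plurality variant, assuming each base learner outputs hard class labels; the predictions below are illustrative, not outputs of any model from the cited studies.

```python
import numpy as np

def majority_vote_ensemble(labels_per_model):
    # Plurality voting: each base learner casts one vote per sample;
    # the class with the most votes becomes the final prediction.
    votes = np.stack(labels_per_model)                # shape (k, n)
    n_classes = votes.max() + 1
    counts = np.apply_along_axis(
        np.bincount, 0, votes, minlength=n_classes)   # shape (K, n) vote counts
    return counts.argmax(axis=0)

# Three base learners voting on four samples (illustrative labels only)
preds = [np.array([0, 1, 1, 0]),
         np.array([0, 1, 0, 0]),
         np.array([1, 1, 1, 0])]
print(majority_vote_ensemble(preds))  # -> [0 1 1 0]
```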

This model caught the interest of scientists and researchers, and it is now used in a variety of applications in health risk detection and prediction for a variety of diseases.

Literature review of using ensemble in disease detection

This section examines the literature on ensemble learning in disease detection, with machine learning or deep learning models as the individual classifiers. Raza et al. [30] created a reliable and accurate ensemble model for detecting heart disease. The majority voting rule was used to combine the results of three classification algorithms: logistic regression (LR), multilayer perceptron (MLP), and Naïve Bayes (NB). The proposed ensemble achieved a classification accuracy of 88.88%, superior to any of the base classifiers.

Following that, in the same field, Atallah et al. [24] presented an ensemble method based on the majority voting technique, combining Stochastic Gradient Descent (SGD), KNN, Random Forest, and Logistic Regression to provide doctors with greater dependability and accuracy. Using the hard voting ensemble model, this technique achieved 90% accuracy. Yadav et al. [31] applied various ensemble techniques to 10 biomedical datasets [32]; these techniques performed competitively against the individual classifiers, with the Average Ensemble and the Rank Average Ensemble (RAE) achieving the highest AUC on most datasets.

Similarly, Tao Zhou et al. [33] proposed an ensemble deep learning model called EDL-COVID to detect COVID-19 in lung CT images, employing a relative majority vote algorithm and achieving 99.05% accuracy. Before applying the ensemble technique, the base models were built using ResNet, GoogleNet, and AlexNet. In terms of performance and detection speed, the EDL-COVID classifier outperformed the single classifiers. Similarly, Chandra et al. [34] used the majority voting ensemble technique to create a two-phase classification system: normal vs. abnormal (phase-I) and COVID-19 vs. pneumonia (phase-II). The precision obtained for phase-I and phase-II was 98.062% and 91.329%, respectively.

Neloy et al. [25] proposed an ensemble model to achieve excellent results in heart disease prediction, using Random Forest, Decision Tree, and Naïve Bayes as baseline models. The combining process, based on the Weighted Average Ensemble technique, achieved 100% accuracy on training and 93% on testing [25].

Using voice recordings of 50 patients and 50 healthy people, Hire et al. proposed an ensemble of CNNs for detecting Parkinson's disease. The publicly available PC-GITA database was used, and the base classifiers were trained with a multiple-fine-tuning method. Each vowel was trained and tested separately, and ten-fold cross-validation was used to test the models. The proposed approach reliably differentiated between the voices of patients and healthy people for all vowels, achieving 99% accuracy, 93.3% specificity, 86.2% sensitivity, and 89.6% AUC. Patient monitoring can be performed online without additional hardware.

Table 2 summarizes the ensemble disease detection techniques, the dataset used in the experiments, and the highest accuracy.

Table 2 Previous ensemble models for the detection of various diseases
