Left Ventricular Myocardial Dysfunction Evaluation in Thalassemia Patients Using Echocardiographic Radiomic Features and Machine Learning Algorithms

Despite recent advances, heart failure resulting from iron deposition remains the most serious complication and the leading cause of death in patients with thalassemia major [18]. Furthermore, these patients may be asymptomatic, which can delay the diagnosis of myocardial dysfunction and jeopardize successful reversal of the disease [10]. Methods for identifying iron overload, such as serum ferritin and liver and heart biopsy, have limitations and cannot reliably and accurately evaluate myocardial iron concentration [15]. T2* CMRI is an excellent non-invasive technique for identifying cardiac iron content [10], although its high cost, long examination time, limited availability across medical centers, and MRI contraindications prevent its widespread use [18]. Echocardiography can also evaluate heart failure resulting from iron deposition. Echo is widely available, can be performed on an outpatient basis, and is portable; nevertheless, its interpretation depends heavily on the user’s knowledge and experience [24]. In this study, the CMRI findings were used to categorize participants into two groups: normal and prone to iron-overload cardiac complications. It should be emphasized that while echo results were comparable (LVEF > 55%) in all patients, those with T2* ≤ 20 ms were more likely to experience future cardiac issues. Radiomic features of echo images were therefore used to categorize patients and determine who is most likely to experience cardiac difficulties owing to iron overload in the future. As noted above, early identification of these individuals can play a crucial role in the treatment process and in reducing the mortality rate. To the best of our knowledge, this study is the first attempt to identify cardiac problems caused by iron overload using radiomic features extracted from echo images and ML, with labels based on T2* values obtained from MRI. In previous studies [6, 8, 9, 10, 17, 18, 51], the correlation between T2* and echo parameters was determined using statistical tests and software.

According to Fig. 4, first-order, GLRLM, and GLZLM features were the most frequently selected across the three FS methods. Specifically, first-order (50%) and GLZLM (20%) features were the most frequent in the ANOVA method, first-order (47%) and GLRLM (23%) features in the MRMR method, and first-order (36%) and GLRLM (23%) features in the RFE method. Shape features were not selected by any FS method because they have no relationship with the amount of iron deposition or the T2* value in the septal area. Among the features selected by the ANOVA method, the highest scores in the ED, ES, and ED&ES datasets belonged to the first quartile of the discretized intensities (DISCRETIZED_Q1), GLRLM_High Gray-Level Run Emphasis (HGRE), and GLCM_Energy, respectively (Fig. 2). Joint Energy, a GLCM feature, evaluates homogeneity patterns in the myocardium [31], and HGRE, a GLRLM feature, describes the distribution of the higher gray-level values [43]. The quartile coefficient of dispersion is an effective substitute for the coefficient of variation, and the first quartile (Q1) likewise evaluates the distribution of gray-level values [43]. Among the features selected by the MRMR method, the highest scores in the ED, ES, and ED&ES datasets belonged to GLCM_Correlation, GLCM_Dissimilarity, and GLCM_Dissimilarity, respectively (Fig. 3). Correlation, a GLCM feature, measures the linear dependency of gray levels, and dissimilarity reflects the local intensity variation [42]. In CMRI images, iron deposits give iron-overloaded myocardium a lower signal intensity than normal tissue [11]; in echocardiography images, by contrast, iron accumulation leads to a heterogeneous distribution of gray-level intensities. These radiomic features may help identify at-risk patients before any heart failure is apparent.
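The ANOVA scoring mentioned above can be illustrated with a minimal sketch. It assumes the score of each feature is the one-way ANOVA F statistic between the two classes, which is the usual choice but is not confirmed by this section. The feature names and values are hypothetical toy numbers, not data from the study.

```python
from statistics import mean

def anova_f_score(groups):
    """One-way ANOVA F statistic for a single feature across class groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of classes
    n = len(all_values)    # total number of samples
    # Between-group sum of squares: distance of each class mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical toy values (class 0, class 1) for two made-up features.
features = {
    "feature_a": ([1.0, 1.2, 0.9, 1.1], [2.0, 2.1, 1.9, 2.2]),  # well separated
    "feature_b": ([1.0, 2.0, 1.5, 2.5], [1.1, 2.1, 1.4, 2.6]),  # overlapping
}
scores = {name: anova_f_score(groups) for name, groups in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

A feature whose class means differ strongly relative to its within-class spread receives a high F score and is ranked first; features with overlapping class distributions score near zero.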

Barzin et al. [18] stated that all diastolic functional indicators, except for the ratio of early (E) to late (A) transmitral peak flow velocity (E⁄A), exhibit a notable relationship with T2*. In our research, the radiomic features likewise showed that diastolic indices are related to the T2* parameter. Meanwhile, in the study of Aypar et al. [10], diastolic dysfunction was seen locally in the septal wall in patients with thalassemia major. In our study, the diastolic radiomic features were obtained from the segmented area (septum) and had the highest score and importance in the ANOVA method.

Model explainability seeks to identify a distinctive set of biomarkers, known as a signature, that can potentially predict a clinical outcome such as a diagnosis, prognosis, or response to treatment. Intriguing radiomics research has been carried out recently; however, there has been little emphasis on developing explainable models. The value of explainable models lies in their ability to gain the approval and trust of physicians in clinical settings. When a model is developed, it is crucial to demonstrate to physicians that it is not just a black-box computerized system. Providing explanations for the model’s outcomes can foster confidence and encourage the use of these models in practice [39].

The important issue is that many of the developed models are not easily interpretable. Physicians and clinicians cannot easily understand and subsequently trust them because of their black-box nature [39]. In Fig. 6, the four features that had the highest scores and were selected most often across the FS methods are visualized by voxel-wise feature extraction for the different classes. Entropy, a first-order feature, quantifies the randomness of the intensity distribution in the region of interest (ROI). A lower entropy value denotes a more uniform distribution, while a greater value reflects a more heterogeneous intensity [43, 52]. Dissimilarity, a GLCM-derived feature, measures the difference between adjacent pixel intensities, revealing changes in intensity values and indicating texture edges or sharp transitions. Higher dissimilarity values indicate greater contrast and variation, while lower values suggest more uniformity [43, 52]. HGRE, a GLRLM-derived feature, is computed from the frequency and length of runs of consecutive pixels with the same gray level. It weights runs by their gray level and thus emphasizes the dominance or prevalence of runs with high gray-level values [43, 52]. Small Zone High Gray-Level Emphasis (SZHGE), a GLZLM-derived feature, evaluates the weight given to smaller zones containing higher gray-level values within the image. It offers insight into the frequency and dominance of these small zones, which are characterized by elevated gray-level values [43, 52]. The mean values of entropy and GLCM dissimilarity were higher, and the mean values of GLRLM_HGRE and GLZLM_SZHGE were lower, in the control group in both the ES and ED datasets. Our hypothesis regarding the former two features is that the formation of iron overload may have reduced the dissimilarity and randomness of the intensity, which caused these values to be lower in patients than in the control group. The latter two features may hypothetically be related to the points where iron overload is developing.
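The feature definitions above can be made concrete with a small sketch. It is a simplified illustration under stated assumptions, not the extraction software used in the study: neighbours and runs are taken in the horizontal direction only (radiomics tools typically aggregate several directions), gray levels are assumed to be already discretized to integers starting at 1, and the two 4×4 images are hypothetical toy patches. SZHGE is omitted because it requires labelling connected zones.

```python
from collections import Counter
from itertools import groupby
from math import log2

def first_order_entropy(image):
    """Shannon entropy (in bits) of the gray-level histogram."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    return -sum((c / n) * log2(c / n) for c in Counter(pixels).values())

def glcm_dissimilarity(image):
    """Mean absolute gray-level difference between horizontal neighbours.

    Equals sum(|i - j| * p(i, j)) over a normalized GLCM built with a
    one-pixel horizontal offset.
    """
    diffs = [abs(row[j] - row[j + 1]) for row in image for j in range(len(row) - 1)]
    return sum(diffs) / len(diffs)

def glrlm_hgre(image):
    """High Gray-Level Run Emphasis: mean of gray_level**2 over horizontal runs."""
    run_levels = [gray for row in image for gray, _ in groupby(row)]
    return sum(g ** 2 for g in run_levels) / len(run_levels)

# Hypothetical toy patches with discretized gray levels.
uniform = [[2, 2, 2, 2] for _ in range(4)]
checker = [[1, 4, 1, 4], [4, 1, 4, 1], [1, 4, 1, 4], [4, 1, 4, 1]]

for name, img in (("uniform", uniform), ("checker", checker)):
    print(name, first_order_entropy(img), glcm_dissimilarity(img), glrlm_hgre(img))
```

The uniform patch yields zero entropy and zero dissimilarity, whereas the alternating patch yields high values for both. The toy patches only illustrate how the features respond to texture; they say nothing about the direction of the group differences reported above.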

In the ED dataset, the MRMR-XGB model achieved the best result. In the ES dataset, the top models were MLP with ANOVA and KNN with RFE. In the ED&ES dataset, RFE-KNN had the best result. The models built on the ED dataset outperformed those built on the other feature sets. A likely explanation is that the motion of the heart is lowest in the mid-to-end-diastolic phase, which probably distorts the features less and yields a better result.

According to our findings, radiomic features extracted from echo images make it possible to classify individuals who are labeled according to CMRI T2*. Meanwhile, the subjects examined in this study had normal LVEF, and no dysfunction was evident. In other words, based on image analysis, echo radiomic features are related to the T2* value. In conventional echocardiography studies, by contrast, Moussavi et al. [21] found no remarkable association between T2* MRI and echocardiographic results. Vogel et al. [9] stated that the sensitivity of tissue Doppler echocardiography in detecting abnormal iron load is 88% and its specificity is 65%. In comparison, our study achieved 73% sensitivity and 73% specificity with the MRMR-XGB model and 83% sensitivity and 56% specificity with the ANOVA-MLP model. Aypar et al. [10] also concluded that when the mid-septal Sm ≤ 5.7 cm⁄s, the sensitivity of tissue Doppler echocardiography is 63% and the specificity is 83%, and when the mid-septal Em ≤ 12.1 cm⁄s, the sensitivity is 75% and the specificity is 75%. Djer et al. [17] claimed that no significant relationship exists between T2* and left ventricular systolic indices. In our study, however, ANOVA-MLP had the best performance among the models applied to the ES dataset (AUC: 0.69, SPE: 0.56, SEN: 0.83, and ACC: 0.69) in diagnosing cardiac problems caused by iron overload, and it is considered among the top three models. Since systolic dysfunction occurs late in the disease process, this finding can be significant.
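For reference, the sensitivity, specificity, and accuracy values compared above follow from the confusion matrix of a binary classifier. The sketch below shows the arithmetic; the counts are hypothetical and do not correspond to any model or cohort in this study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate
        "specificity": tn / (tn + fp),            # true negative rate
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Hypothetical counts: 10 iron-overloaded and 10 normal subjects.
metrics = classification_metrics(tp=8, fn=2, tn=6, fp=4)
print(metrics)
```

With these counts, 8 of 10 positive cases and 6 of 10 negative cases are classified correctly, so sensitivity exceeds specificity, the same kind of trade-off seen when comparing the ANOVA-MLP and MRMR-XGB results.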

Our study emphasizes the strong potential of radiomics for the early detection of cardiomyopathy resulting from iron deposition in conditions where LVEF is preserved. Therefore, the presented findings could potentially help physicians make decisions regarding heart failure caused by iron deposition using echo images. In this way, early diagnosis may allow physicians to reverse the cardiomyopathy and prevent the progression of the disease. Furthermore, since echocardiography costs less than a method like MRI and is available in most centers, it is a cost-effective way to evaluate heart failure in patients with thalassemia. In addition, echocardiography is non-invasive as well as portable.

This study had some limitations. First, the sample size was small because we selected patients whose echo and CMRI studies were performed within a short interval of at most 6 months; a larger sample in future studies would be of more value. Second, data were collected from a single center. To ensure the models’ generalizability, it is necessary to collect data from different centers and evaluate model performance across them. Third, as RFE relied only on the RF model, the selected features may not be the optimal choice for other classifiers.
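The last limitation can be sketched in code. The example below assumes scikit-learn and synthetic data from `make_classification`; the sample size, number of features, and hyperparameters are illustrative assumptions, not the settings used in the study. It shows that the feature subset is determined solely by the random forest's importances and is then handed unchanged to a different classifier.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for a radiomic feature table (assumption, not study data).
X, y = make_classification(
    n_samples=120, n_features=20, n_informative=5, random_state=0
)

# RFE repeatedly fits the random forest and drops the least important feature
# until only the requested number of features remains.
selector = RFE(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=5,
    step=1,
)
selector.fit(X, y)
X_sel = selector.transform(X)

# The subset was chosen by RF importances; KNN simply inherits it,
# even though a distance-based model might prefer different features.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_sel, y)
print(selector.support_.nonzero()[0])
```

In a leak-free evaluation, the selector and the classifier would be placed inside a single cross-validated pipeline; repeating RFE with an estimator matched to each downstream classifier is one way to probe the limitation described above.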
