Machine learning-based radiomic analysis and growth visualization for ablation site recurrence diagnosis in follow-up CT

Study design and patient selection

A retrospective cohort of adult patients who underwent TA for liver tumors, including HCC and metastases from colorectal and breast cancer, between 2008 and 2020 was established from the electronic patient records at the XXX. At our center, the follow-up protocol after TA consists of a first CT scan one week after TA, followed by CT scans every four months during the first two years and every six months thereafter, up to five years after treatment.

All reports of follow-up CT scans after TA, generated by abdominal radiologists as part of routine patient care, were retrospectively reviewed for evidence of recurrent disease. ASR was defined as the emergence of a contrast-enhancing lesion within or in the immediate vicinity of the ablation zone, with the largest diameter of the lesion in direct contact with the ablation zone [22]. Patients with radiological evidence of ASR and histopathological confirmation were assigned to the ‘ASR-positive’ group. In this group, the follow-up CT scan on which the ASR was identified was used for the radiomic analysis (average 12 months [interquartile range: 5–17 months] after the date of TA).

A control group was established by randomly selecting, from the cohort up to 2020, patients whose follow-up CT scans showed no evidence of ASR. In these patients, the most recent follow-up CT scan was used for radiomic analysis (average 18 months [interquartile range: 12–23 months] after the TA date).

Exclusion criteria (Fig. 1) for radiomic analysis were (1) unavailability of contrast-enhanced portal venous phase CT scans, such as cases where ASR was confirmed through follow-up magnetic resonance (MR) or positron emission tomography (PET) scans, or when only the arterial phase was accessible in the picture archiving system; (2) distant intrahepatic liver lesions, identified by a radiologist as a novel lesion rather than ASR; (3) inability to delineate the ablation zone on the latest follow-up scan in patients in the control group. This inability arises when the ablation zone is overgrown by normal liver tissue, rendering it invisible in the patient's most recent follow-up scan. This indicates normalization and, consequently, the absence of recurrence.

Fig. 1

Patient exclusion diagram

Multivendor CT systems were used, with scan parameters harmonized between our hospital and referring institutions as follows: automatic tube current modulation and tube voltage selection, 1 mm slice thickness, and a 75-s delay following the intravenous injection of 90–100 mL contrast medium at a flow rate of 3.6–4.0 mL/s, followed by 32 mL of saline solution. The Institutional Review Board granted approval, and the requirement for written informed consent was waived.

Region of interest and image processing

The entire workflow of the study is shown in Fig. 2. The ablation zone and a 2 cm diameter surrounding rim of liver parenchyma constituted the region of interest (ROI). Ablation zones were delineated by two experienced abdominal radiologists, each working on a different part of the dataset. The mask of the surrounding liver parenchyma rim was generated automatically by morphological dilation of the delineated ablation zone. Figure 3 shows examples of binary masks for the ablation zone and the adjacent liver parenchyma rim.
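The rim generation described above can be sketched as follows. This is a minimal illustration rather than the study's code: the function name `parenchyma_rim`, the `rim_mm` parameter, and the toy volume are assumptions. The sketch thresholds a Euclidean distance map, which is equivalent to dilation with a spherical structuring element and respects anisotropic voxel spacing; the structuring element actually used is not stated in the text.

```python
import numpy as np
from scipy import ndimage


def parenchyma_rim(ablation_mask, spacing_mm, rim_mm):
    """Return a binary rim of voxels within rim_mm of the ablation zone.

    Thresholding the Euclidean distance to the mask acts like a
    morphological dilation with a spherical structuring element.
    """
    mask = np.asarray(ablation_mask, dtype=bool)
    # Distance (in mm) from each outside voxel to the nearest mask voxel;
    # voxels inside the mask get distance 0.
    dist = ndimage.distance_transform_edt(~mask, sampling=spacing_mm)
    dilated = dist <= rim_mm
    # Keep only the newly added shell, not the ablation zone itself.
    return dilated & ~mask


# Toy example (hypothetical sizes): a 5x5x5 voxel zone in a 31^3 volume.
zone = np.zeros((31, 31, 31), dtype=bool)
zone[13:18, 13:18, 13:18] = True
rim = parenchyma_rim(zone, spacing_mm=(1.0, 1.0, 1.0), rim_mm=5.0)
```

Working in millimetres rather than voxel counts keeps the rim width consistent across scans with different slice spacing.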

Fig. 2

Workflow of radiomic analysis

Fig. 3

Example of Region of Interest

To modulate contrast and brightness of the CT scan, thereby augmenting soft tissue visibility, a soft tissue window centered at 50 HU with a width of 400 HU was implemented. For normalization, all images employed in the radiomic analysis were resampled to identical spacing [1.0 mm, 1.0 mm, 2.0 mm] using a B-spline interpolator. Gray-level discretization employed a fixed bin size method and tested the size set of .
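As an illustration of the windowing and discretization steps, the sketch below clips HU values to the soft tissue window (center 50 HU, width 400 HU, i.e. the range from -150 to 250 HU) and then applies fixed-bin-size discretization. The function names and the bin width of 25 are hypothetical; the bin sizes actually tested are not given in the text. Resampling is omitted here.

```python
import numpy as np


def apply_window(hu, center=50.0, width=400.0):
    """Clip HU values to [center - width/2, center + width/2]."""
    lo, hi = center - width / 2.0, center + width / 2.0
    return np.clip(hu, lo, hi)


def discretize_fixed_bin_size(values, bin_width):
    """Fixed-bin-size discretization; bin indices start at 1."""
    values = np.asarray(values, dtype=float)
    shifted = np.floor(values / bin_width) - np.floor(values.min() / bin_width)
    return (shifted + 1).astype(int)


hu = np.array([-1000.0, -150.0, 0.0, 50.0, 249.0, 1200.0])
windowed = apply_window(hu)  # values now lie in [-150, 250]
# bin_width=25.0 is a hypothetical value chosen for illustration only.
bins = discretize_fixed_bin_size(windowed, bin_width=25.0)
```

With a fixed bin size, the number of gray levels depends on the intensity range inside the ROI, whereas the width of each bin in HU stays constant across patients.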

Radiomic features

Radiomic features are quantitative measurements derived from medical images that translate radiological visual information into numerical data. A predefined set of radiomic features according to the Image Biomarker Standardization Initiative (IBSI) was extracted from the original pre-processed CT scans [23], encompassing morphological features, first-order statistical features, gray level co-occurrence matrix features, gray level size zone matrix features, gray level run length matrix features, neighboring gray tone difference matrix features, and gray level dependence matrix features. Additionally, first-order statistical features and texture features were extracted from Laplacian of Gaussian (LoG) filter-transformed CT scans, since the LoG filter enhances the visibility of subtle image structures, such as edges. Combining radiomic features from the original and LoG-transformed scans provides more comprehensive insight into underlying tissue characteristics. Feature extraction was performed using the open-source library PyRadiomics 3.0 with Python 3.7.9 [24].
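To make the LoG step concrete, the following sketch filters a volume with a Laplacian of Gaussian and computes a few first-order statistics inside a mask. It uses `scipy.ndimage.gaussian_laplace` as a stand-in and is not the PyRadiomics implementation; the function name `log_first_order`, the sigma of 2 mm, and the toy data are assumptions made for illustration.

```python
import numpy as np
from scipy import ndimage


def log_first_order(image, mask, sigma_mm, spacing_mm):
    """First-order statistics of a LoG-filtered image inside a mask."""
    # Convert the physical sigma to voxel units along each axis.
    sigma_vox = [sigma_mm / s for s in spacing_mm]
    filtered = ndimage.gaussian_laplace(image.astype(float), sigma=sigma_vox)
    roi = filtered[mask]
    return {
        "mean": float(roi.mean()),
        "std": float(roi.std()),
        "min": float(roi.min()),
        "max": float(roi.max()),
    }


# Toy volume: a brighter cube inside a uniform background.
image = np.full((24, 24, 24), 60.0)
image[8:16, 8:16, 8:16] = 100.0
mask = np.zeros(image.shape, dtype=bool)
mask[6:18, 6:18, 6:18] = True  # ROI covering the cube and its edges
feats = log_first_order(image, mask, sigma_mm=2.0, spacing_mm=(1.0, 1.0, 2.0))
```

The filter response is strongest near the cube boundary, which is why LoG-derived features are sensitive to edges that are faint in the original image.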

Extracted radiomic features may exhibit strong linear relationships with one another. To address collinearity, Pearson correlation coefficients between radiomic features were computed. Radiomic feature groups exhibiting Pearson correlation coefficients > 0.8 were deemed highly correlated and therefore removed to decrease dataset dimensionality and mitigate collinearity issues.
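One common way to implement such a correlation filter is sketched below. It is an assumption, not the study's exact procedure: the sketch uses the absolute Pearson coefficient and greedily keeps the first feature of each highly correlated group, whereas the text does not specify which member of a group was retained. The function name `drop_correlated` and the synthetic data are hypothetical.

```python
import numpy as np


def drop_correlated(features, threshold=0.8):
    """Return column indices kept after greedy correlation filtering."""
    # Pairwise absolute Pearson correlation between feature columns.
    corr = np.abs(np.corrcoef(features, rowvar=False))
    kept = []
    for j in range(corr.shape[0]):
        # Keep feature j only if it is not highly correlated with any
        # feature that has already been kept.
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(j)
    return kept


rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
# Column 1 is almost a rescaled copy of column 0; column 2 is independent.
X = np.column_stack([a, 2.0 * a + 0.01 * rng.normal(size=200), b])
kept = drop_correlated(X)
```

In this toy example the near-duplicate second column is dropped while the independent third column is retained.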

Machine learning classifiers

Owing to the limited dataset size, logistic regression with L1 penalty (Lasso regression) was employed for feature modeling, as it is well suited to small-scale data analysis tasks [25]. The L1 penalty regularized the logistic regression classifier by penalizing large regression coefficients, which eliminates redundant features and reduces multicollinearity in the feature set. During training, the classifier thus automatically selected the radiomic features related to the training target. Feature importance for Lasso regression was measured by the corresponding feature weights in the trained classifiers. In addition, extreme gradient boosting (XGBoost) was used for radiomic feature modeling. XGBoost classifiers, built from decision trees, provide powerful feature selection to distinguish ASR at each split node [26]. Feature importance for the XGBoost classifier was measured by Gini importance (mean decrease in impurity) [27].
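The sparsity effect of the L1 penalty can be illustrated with scikit-learn on synthetic data. This is a minimal sketch under stated assumptions: the data, the regularization strength `C=0.1`, and the `liblinear` solver are illustrative choices and are not taken from the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data: only the first two of eight features carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Smaller C means stronger L1 regularization and more zero coefficients.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(X, y)

coef = clf.coef_.ravel()
selected = np.flatnonzero(coef)  # features with non-zero weight
# Rank the selected features by absolute weight (feature importance).
ranking = selected[np.argsort(-np.abs(coef[selected]))]
```

Features whose coefficients are driven exactly to zero are effectively removed from the model, and the magnitudes of the remaining weights serve as importance scores.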

To provide an unbiased performance estimate of the trained classifiers, a leave-one-out test (LOOT) approach was employed. In LOOT, the dataset was divided into n subsets, where n is the number of patients in the entire dataset. For each subset, a model was trained on the remaining n-1 samples using five-fold cross-validation and subsequently tested on the held-out sample. This process was repeated n times, and results were computed from the predictions on the held-out samples. In addition, the dataset was imbalanced, with more patients lacking ASR than patients with ASR, which could influence machine learning model performance [28, 29]. Therefore, a class weight of 1.5 was applied to the minority class. Increasing the weight of patients with ASR forced the classifier to account for the asymmetric misclassification cost between the positive and control groups: the model incurred a greater penalty for misclassifying ASR patients during training. The model was developed using the open-source library scikit-learn 0.23.2 with Python 3.7.9 [30].
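The evaluation scheme can be sketched as an outer leave-one-out loop with an inner five-fold cross-validation that tunes the regularization strength, combined with a class weight of 1.5 on the positive class. The sketch is an assumption about how the pieces fit together, not the study's code: the function name `leave_one_out_scores`, the grid of five `C` values, and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import LeaveOneOut


def leave_one_out_scores(X, y, minority_weight=1.5):
    """Predicted ASR probability for each patient when held out."""
    scores = np.zeros(len(y), dtype=float)
    for train_idx, test_idx in LeaveOneOut().split(X):
        # Inner five-fold CV on the n-1 training patients selects C.
        model = LogisticRegressionCV(
            Cs=5, cv=5, penalty="l1", solver="liblinear",
            class_weight={0: 1.0, 1: minority_weight},
        )
        model.fit(X[train_idx], y[train_idx])
        # Score the single held-out patient, never seen during training.
        scores[test_idx] = model.predict_proba(X[test_idx])[:, 1]
    return scores


# Synthetic, imbalanced toy cohort: 18 controls and 12 positives.
rng = np.random.default_rng(0)
y = np.array([0] * 18 + [1] * 12)
X = rng.normal(size=(30, 5))
X[:, 0] += 2.0 * y  # one informative feature
scores = leave_one_out_scores(X, y)
```

Because every prediction comes from a model that never saw the corresponding patient, metrics computed on `scores` are not inflated by hyperparameter tuning on the test sample.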

Visualization method for post-ablative grown region

Diagnosing ASR can be challenging for radiologists because of the small tumor size and the similarity between ASR, post-ablative necrosis, and perilesional inflammation. It was hypothesized that malignant recurrent tumors grow between two follow-up scans; emphasizing the differences between follow-up scans could therefore help radiologists focus on the grown region of disease recurrence, making ASR easier to visualize and identify.

To generate a heatmap (diff-map) highlighting differences between two follow-up scans, the images were aligned using elastix software [31]. Liver segmentation on the CT ensured accurate registration across scans. Subsequently, the diff-map was generated by subtracting the two registered follow-up scans. To further refine the diff-map, it was smoothed using a Gaussian kernel with a standard deviation of 2.5, and then normalized to a range of 0 to 1. Regions exhibiting growth during the time interval between the two follow-up scans were characterized by larger differences in gray values on the scans, thus emphasizing the disease recurrence regions on the diff-map.
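The subtraction, smoothing, and normalization steps can be sketched as follows, assuming the two scans have already been registered (the elastix step is not reproduced here). The function name `diff_map`, the sign convention (later minus earlier), and the interpretation of the standard deviation of 2.5 in voxel units are assumptions made for illustration.

```python
import numpy as np
from scipy import ndimage


def diff_map(earlier, later, sigma=2.5):
    """Smoothed, min-max normalized difference of two registered scans."""
    diff = later.astype(float) - earlier.astype(float)
    smoothed = ndimage.gaussian_filter(diff, sigma=sigma)
    lo, hi = smoothed.min(), smoothed.max()
    if hi == lo:  # identical scans: nothing to highlight
        return np.zeros_like(smoothed)
    return (smoothed - lo) / (hi - lo)


# Toy example: a region that becomes brighter in the later scan.
earlier = np.full((32, 32, 32), 60.0)
later = earlier.copy()
later[10:16, 10:16, 10:16] += 40.0
heat = diff_map(earlier, later)
peak = np.unravel_index(np.argmax(heat), heat.shape)
```

In this toy case the peak of the normalized map falls inside the region that changed, which is the behaviour the diff-map relies on to draw attention to growth.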
