PSMA-positive prostatic volume prediction with deep learning based on T2-weighted MRI

Patients

In this retrospective study, we screened all consecutive PCa patients who underwent PSMA PET/MRI at our institution between April 2016 and December 2020, either for staging (n = 177) or, within a prospective study, for biopsy guidance (n = 45). We then excluded patients who had previously undergone locoregional therapies (n = 34) and patients without any prostatic PSMA uptake (n = 11). Given the limited sample size, we used every MRI slice containing prostatic tissue and dichotomized PSMA uptake at a cutoff of SUV 4. This cutoff is based on suggestions that SUV 4 can differentiate PCa from normal prostatic gland uptake [15]. To further increase the specificity of our model, we also excluded patients without prostatic PSMA uptake greater than SUV 4 (n = 23). Fig. 1 summarizes the patient enrollment flowchart. The study was approved by the institutional review board (2020–02861); all staging patients signed a general informed consent for retrospective studies, while biopsy guidance patients signed the specific written informed consent of the prospective study [9].

Fig. 1

Patients’ enrollment flowchart. PSMA prostate-specific membrane antigen, PET/MRI positron emission tomography/magnetic resonance imaging

PSMA PET/MRI

All patients underwent [68Ga]Ga-PSMA-11 or [18F]PSMA-1007 PET/MRI scans (SIGNA PET/3T MRI, GE Healthcare, Waukesha, WI, USA). Images were acquired 60 and 90 min after the injection of [68Ga]Ga-PSMA-11 or [18F]PSMA-1007, respectively, starting with a whole-body MRI localizer scan. Then, a 3D dual-echo, spoiled gradient recalled echo sequence (LAVA-FLEX) for attenuation correction and a PET emission scan were acquired. The protocol included dedicated sequences covering the pelvis: a high-resolution T1-weighted LAVA-FLEX sequence, a T2-weighted fast recovery fast spin-echo (FRFSE) sequence in two planes, and diffusion-weighted images (DWI, with b values of 0, 400, and 700). Specifically, the axial T2 FRFSE sequence had a repetition time (TR) of 2600 ms, an echo time (TE) of 117 ms, a flip angle (FA) of 125°, an acquisition matrix of 416 × 224 with a voxel size of 512 × 512, a slice thickness of 4 mm, 2 signal averages, a bandwidth of 326 Hz/pixel, and an acquisition time of 3 min 48 s. Details of the other MRI sequences are given in supplemental Table 1. The PET scan was acquired in 3D time-of-flight (TOF) mode with a default of 6 bed positions and an acquisition time of 2 min per bed position (axial FOV of 25 cm and overlap of 24%, matrix of 256 × 256, 2 iterations, 28 subsets, with the sharpIR algorithm (GE Healthcare) and a 5-mm filter cutoff). To reduce [68Ga]Ga-PSMA-11 activity in the urinary system, furosemide (0.13 mg/kg) was injected intravenously 30 min before the tracer injection, and the patients were asked to void before the scan. All institutional protocols were in agreement with the joint EANM-SNMMI procedure guidelines [16].

Image segmentation

For each anonymized patient, a nuclear medicine physician with 3 years of experience (R.L.) segmented the whole prostatic gland on the T2 MRI images using 3D Slicer software [17] version 4.11, through a free-hand, slice-by-slice segmentation on the axial view. These volumes were then confirmed by a double board-certified radiologist and nuclear medicine physician with 15 years of experience (I.A.B.). Next, the selected volumes of interest (VOIs) from MRI were transferred to the PSMA PET scans and dichotomized with a threshold at SUV 4 using the image biomarker standardization initiative (IBSI) compliant software LIFEx [18] version 7.1 to generate positive and negative voxels (namely, binary masks: 1 for the target, 0 for the background).
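The dichotomization step can be illustrated with a minimal sketch. This is not the LIFEx implementation: the function name `dichotomize_suv`, the flattened-list representation of a slice, and the use of a strict "greater than" comparison at the cutoff are assumptions made for illustration only.

```python
SUV_THRESHOLD = 4.0  # cutoff stated in the text


def dichotomize_suv(suv_values, voi_mask, threshold=SUV_THRESHOLD):
    """Return 1 for voxels inside the VOI whose SUV exceeds the threshold, else 0.

    Assumption: the comparison is strict (SUV > threshold).
    """
    if len(suv_values) != len(voi_mask):
        raise ValueError("SUV values and VOI mask must have the same length")
    return [
        1 if inside and suv > threshold else 0
        for suv, inside in zip(suv_values, voi_mask)
    ]


# Toy flattened slice: six voxels, five of them inside the prostate VOI.
suv = [1.2, 3.9, 4.0, 4.1, 7.5, 9.0]
voi = [1, 1, 1, 1, 0, 1]
mask = dichotomize_suv(suv, voi)
print(mask)  # [0, 0, 0, 1, 0, 1]
```

In this toy example the voxel with SUV 7.5 stays negative because it lies outside the VOI, which mirrors the restriction of the binary mask to the segmented gland.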

Because T2-weighted and diffusion-weighted imaging (DWI) are the most important sequences routinely acquired for cancer detection, and because distortions often limit the spatial correlation of DWI with other images, we chose the T2-weighted sequence of the mpMRI as the model input. Indeed, T2 images have a higher resolution than dynamic contrast-enhanced (DCE) images or DWI (including the apparent diffusion coefficient, ADC); we therefore hypothesized that T2 images contain information that is not visually apparent but that the neural network could use to predict the PSMA prostatic uptake.

Deep learning

The customized-efficient neural network (C-ENet) [19] was used to predict increased prostatic PSMA uptake based on the axial T2-weighted sequence alone. ENet is commonly used in mobile applications, where hardware resources are limited and accurate segmentation is critical [20]. It was subsequently modified into C-ENet for biomedical imaging applications, such as segmentation of the prostate in MRI images [21]. Specifically, ENet is built from residual network blocks, each consisting of three convolutional layers. It uses asymmetric, separable convolutions, in which a 5 × 5 convolution is replaced by a sequence of 5 × 1 and 1 × 5 convolutions: a 5 × 5 kernel has 25 parameters, whereas the corresponding asymmetric pair has 10, which reduces the size of the network. As explained in [19], the customization of ENet (namely, C-ENet) is achieved by replacing the output of 128 × 64 × 64 with an output of 256 × 64 × 64. This modification yields a dice similarity coefficient (DSC) distribution that, on average, is higher and less variable than that of the standard ENet. In our study, we used a C-ENet to automatically segment the prostate volume in the T2 MRI images and to automatically generate, within the obtained volume, a predictive PSMA PET map. To achieve these results, the C-ENet was trained twice. For the first training phase, the manual segmentations performed in the MRI dataset were used to automatically identify the anatomical prostate region. A stratified fivefold cross-validation strategy was used: as described in [22], the data set was divided into 5 folds of similar size, and five models were trained, each combining four of the five folds into a training set and keeping the remaining fold for validation. For the training task, an initial set of 16 patients was used to determine experimentally the best learning rate (0.0001) for the Adam optimizer [23].
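As an illustration of the fivefold scheme, the following sketch builds stratified folds in plain Python. It is not the procedure of [22]: the function names, the random seed, and in particular the stratification labels are hypothetical, since the text does not state which variable was used for stratification.

```python
import random
from collections import defaultdict


def stratified_k_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with similar label proportions."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_label):
        members = by_label[label]
        rng.shuffle(members)
        # Deal indices round-robin so every fold receives a similar share
        # of each label; the running offset keeps fold sizes balanced.
        for pos, idx in enumerate(members):
            folds[(pos + offset) % k].append(idx)
        offset += len(members)
    return folds


def cross_validation_splits(labels, k=5, seed=0):
    """Yield (train_indices, validation_indices) once per fold."""
    folds = stratified_k_folds(labels, k, seed)
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield sorted(train), sorted(folds[i])


# Hypothetical stratification labels for 20 cases (12 of class 0, 8 of class 1).
labels = [0] * 12 + [1] * 8
splits = list(cross_validation_splits(labels))
for train, val in splits:
    print(len(train), len(val))  # 16 training and 4 validation cases per fold
```

Each case appears in exactly one validation fold, so the five trained models are each evaluated on data they did not see during training.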
Since prostate segmentation suffers from unbalanced data (the prostate region is very small compared to the background), the Tversky loss function was used to adjust the weights of false positives and false negatives, as reported in [24]. Training was stopped automatically if the loss had not decreased for 10 consecutive epochs. Finally, we compared the proposed C-ENet method with the most widely used DL algorithm in biomedical image segmentation, namely UNet [25]. The distinctive feature of UNet is its U-shaped structure: the network contracts the input image through a series of convolutional and pooling layers to capture context and then expands the representation back to the original input size to generate a segmentation map. The contracting path consists of convolutional layers followed by max-pooling layers that reduce the spatial dimensions, while the expanding path uses up-sampling and concatenation operations to recover spatial information. Skip connections between corresponding layers in the contracting and expanding paths help retain fine-grained details during segmentation.
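A minimal sketch of the Tversky loss and of a patience-based stop criterion is given below. The weights `alpha` and `beta` are illustrative assumptions (the values used in this study are not stated here), as are the function names and the choice to monitor a single loss history.

```python
def tversky_loss(pred, target, alpha=0.3, beta=0.7, eps=1e-7):
    """Tversky loss: 1 - TP / (TP + alpha * FP + beta * FN).

    `pred` holds probabilities in [0, 1], `target` holds 0/1 labels.
    alpha weights false positives, beta weights false negatives;
    alpha = beta = 0.5 reduces to the Dice loss.
    """
    tp = sum(p * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    return 1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)


def should_stop(loss_history, patience=10):
    """True once the loss has not improved on its earlier best for
    `patience` consecutive epochs."""
    if len(loss_history) <= patience:
        return False
    best_before = min(loss_history[:-patience])
    return min(loss_history[-patience:]) >= best_before


# Two false positives versus two false negatives around one true positive.
loss_fp = tversky_loss([1, 1, 1, 0], [1, 0, 0, 0])
loss_fn = tversky_loss([1, 0, 0, 0], [1, 1, 1, 0])
print(loss_fn > loss_fp)  # True when beta > alpha
```

With `beta` larger than `alpha`, missed target voxels cost more than spurious ones, which is one way to counter the dominance of background voxels in small-target segmentation.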

For the second training phase, the prostate regions segmented in the MRI images in the previous phase were superimposed on the binary PET masks obtained using LIFEx. In this way, only the prostate regions with high PSMA uptake were retained in the MRI images, because the voxels belonging to the background (namely, the prostate region with low PSMA uptake) were set to zero. These images were then used to train a new C-ENet-based model to generate a prostatic PSMA predictive PET map using the MRI dataset alone. Patients were divided into a training cohort (n = 124) and a validation cohort (n = 30) according to the general 80:20 rule, also known as the Pareto Principle. Specifically, the smaller lesions (lesion volume lower than 15 cc) were used as the training set, while the larger ones were used as the test set, to account for the partial volume effect in PET images [26] and thereby improve the robustness of this preliminary model. For preprocessing, MRI exams were resampled to an isotropic voxel size (1 × 1 × 1 mm³) with a matrix resolution of 512 × 512 using linear interpolation. Manually segmented whole-gland masks served as the ground truth; these and the PSMA prostatic uptake masks were resampled using nearest-neighbor interpolation. For both models, data augmentation based on six different techniques (rotation, translation in the x and y directions, trimming, horizontal flipping, and zooming of the original images) was used to reduce overfitting. Finally, data standardization and normalization were used for faster convergence and to avoid numerical instability. An NVIDIA RTX A5000 (16 GB VRAM, 6144 CUDA cores) was used as the graphics processing unit (GPU). The flowchart of the proposed model is presented in Fig. 2.
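The masking and the volume-based cohort assignment can be sketched as follows. The function names, the flattened-list image representation, the patient identifiers, and the assignment of a lesion of exactly 15 cc to the test set are assumptions for illustration; the 15 cc cutoff and the 1 mm³ voxel volume come from the text.

```python
VOLUME_CUTOFF_CC = 15.0  # lesion-volume cutoff stated in the text


def mask_image(mri_values, uptake_mask):
    """Keep MRI intensities where the PSMA uptake mask is 1; zero elsewhere."""
    return [v if m == 1 else 0 for v, m in zip(mri_values, uptake_mask)]


def mask_volume_cc(mask, voxel_volume_mm3=1.0):
    """Volume of the positive voxels in cubic centimetres (1 cc = 1000 mm3)."""
    return sum(mask) * voxel_volume_mm3 / 1000.0


def split_by_volume(volumes_cc, cutoff=VOLUME_CUTOFF_CC):
    """Assign patients with smaller lesions to training, larger ones to test.

    Assumption: a lesion of exactly `cutoff` cc goes to the test set.
    """
    train = [pid for pid, v in volumes_cc.items() if v < cutoff]
    test = [pid for pid, v in volumes_cc.items() if v >= cutoff]
    return train, test


# Toy data with hypothetical intensities and patient identifiers.
masked = mask_image([120, 80, 200, 50], [0, 1, 1, 0])
print(masked)  # [0, 80, 200, 0]
train_ids, test_ids = split_by_volume({"P01": 3.2, "P02": 18.5, "P03": 14.9})
print(train_ids, test_ids)  # ['P01', 'P03'] ['P02']
```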

Fig. 2

Flowchart for PSMA prostatic uptake prediction based on T2 MRI. Starting from the superimposition of the prostate region segmentation based on MRI images (a) and the binary PSMA uptake mask based on PET images (b), the prostate regions with high PSMA uptake (c) were used to train the C-ENet-based model (d). In this way, the model was able to predict prostatic PSMA PET maps (f) using the MRI dataset alone (e). C-ENet customized-efficient neural network, MRI magnetic resonance imaging, PET positron emission tomography, PSMA prostate-specific membrane antigen

Statistical analysis

Statistical analyses were performed using SPSS Statistics software, version 26 (IBM). In the descriptive analyses, patient data are reported as means with standard deviations for normally distributed values and as medians with ranges for non-normally distributed values; categorical variables are summarized as frequency distributions with percentages. These analyses were performed by R.L. (MD, PhD). To evaluate the model, for each clinical case we computed a set of performance indicators routinely used in the literature for shape comparison, namely sensitivity, positive predictive value (PPV), DSC, volume overlap error (VOE), and volumetric difference (VD):

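$$\begin{aligned} & \text{Sensitivity} = \text{TP}/(\text{TP} + \text{FN}) \\ & \text{PPV} = \text{TP}/(\text{TP} + \text{FP}) \\ & \text{DSC} = 2 \times \text{TP}/(2 \times \text{TP} + \text{FP} + \text{FN}) \\ & \text{VOE} = 1 - \text{TP}/(\text{TP} + \text{FP} + \text{FN}) \\ & \text{VD} = \left| \text{FN} - \text{FP} \right| /(2 \times \text{TP} + \text{FP} + \text{FN}) \\ \end{aligned}$$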

where TP, FP, TN, and FN are the true positives, false positives, true negatives, and false negatives, respectively. To verify whether the performances of the two DL algorithms (i.e., C-ENet and UNet) differed significantly, a one-way analysis of variance (ANOVA) on the DSC was used. A p value < 0.05 was considered statistically significant.
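The five indicators can be computed from a predicted and a reference binary mask as in the sketch below. It assumes the usual confusion-count definitions, which are restated in the code comments; the function names and the flattened-list mask representation are illustrative, and empty denominators are returned as NaN by assumption.

```python
def _ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero (e.g. empty masks)."""
    return numerator / denominator if denominator else float("nan")


def shape_metrics(pred, truth):
    """Compute sensitivity, PPV, DSC, VOE and VD from two binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    return {
        "sensitivity": _ratio(tp, tp + fn),          # TP / (TP + FN)
        "ppv": _ratio(tp, tp + fp),                  # TP / (TP + FP)
        "dsc": _ratio(2 * tp, 2 * tp + fp + fn),     # 2TP / (2TP + FP + FN)
        "voe": 1 - _ratio(tp, tp + fp + fn),         # 1 - TP / (TP + FP + FN)
        "vd": _ratio(abs(fn - fp), 2 * tp + fp + fn),  # |FN - FP| / (2TP + FP + FN)
    }


# Toy masks: 3 true positives, 1 false positive, 1 false negative.
pred = [1, 1, 1, 0, 0, 0, 1, 0]
truth = [1, 1, 0, 1, 0, 0, 1, 0]
metrics = shape_metrics(pred, truth)
print(metrics["dsc"], metrics["voe"], metrics["vd"])
```

In this example the false positives and false negatives balance, so VD is zero even though the overlap is imperfect, which is why VD is reported alongside DSC and VOE and not on its own.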

Statistical analyses for the model estimation were performed by A.C. (PhD).
