Deep learning (DL) reconstruction of diffusion-weighted imaging (DWI) is an emerging technique that allows for a substantially shorter acquisition time than conventional DWI.1 In this study, we prospectively evaluated the image quality of DL DWI with superresolution processing (DWIDL) in direct comparison to standard DWI (DWISTD) in a clinical setting. Diffusion-weighted imaging has substantial value in breast imaging. By providing information on tissue microstructure,2 DWI helps to distinguish between different lesion types3,4 and can be used to assess the success of neoadjuvant therapy.5 Diffusion-weighted imaging can be integrated into a standard dynamic breast MRI protocol, where it increases diagnostic confidence through improved specificity in multiparametric breast MRI6,7 and improves the diagnosis of axillary metastasis.2,8 Substantial additional diagnostic value was shown in a study of women who had previously undergone mammography screening, in whom DWI revealed 8 additional breast cancers per 1000 women.9 For both 1.5 T and 3 T MRI, apparent diffusion coefficient (ADC) values were shown to be generally lower in malignant lesions than in benign lesions (BEs), which reduced the rate of false-positive findings and biopsies.10 In addition, the diagnostic accuracy of lesion characterization is improved in 3 T breast MRI when DWI is performed.7,11 However, DWI has not yet been implemented in broad clinical routine. This might be explained by the additional scanning time required for DWI and its rather low spatial resolution compared with other sequences.
Therefore, a reduction of acquisition time and an improvement of spatial resolution are of high relevance for successful implementation of DWI on a broader scale.6 In the past years, methods such as parallel imaging, simultaneous multislice techniques, and compressed sensing were used to optimize DWI in terms of image quality and acquisition time.12 Image reconstruction with DL networks has recently been introduced as a promising technique to further shorten acquisition time. It has already been applied to DWI of the breast,1 liver,13 and prostate14; to T2-weighted turbo spin echo sequences of the breast,15 spine,16 and prostate17–20; and to T1-weighted imaging of the chest.21 Each of these studies reported similar or better image quality for the DL sequences.
The aim of this study was to evaluate image quality and acquisition time of DWIDL in comparison to current standard imaging in a clinical routine setting. We hypothesized that DWIDL has equally good overall image quality with higher lesion conspicuity due to superresolution while acquisition time is reduced.
MATERIALS AND METHODS
Study Design
Patients who underwent a multiparametric 3 T breast MRI between August 2022 and November 2022 for diagnostic or screening purposes at our tertiary referral center were included in this prospective, single-center study (Table 1). The study was approved by the institutional review board (EK 22-1185) and listed in the German Clinical Trials Register (DRKS-ID: DRKS00029550). Study procedures are in line with the Declaration of Helsinki. Informed written consent was obtained from all participants. Exclusion criteria were age younger than 18 years, metallic devices not suitable for MRI, and pregnancy. For study workflow, see the STROBE flowchart in Figure 1.
TABLE 1 - Participant and Breast Cancer Characteristics

| Characteristic | Count/Total |
|---|---|
| Age, y* | 65 ± 13 (range, 29–85) |
| Total no. participants | 70 |
| Exclusion | 5 |
| Inclusion of n | 65 |
| Total no. lesions | 90 |
| Inclusion of lesions | 82 |
| Biopsy-proven invasive breast cancers | 15 |
| Cystic lesions‡ | 35 |
| Intramammary lymph nodes‡ | 20 |
| Biopsy-proven fibroadenoma | 6 |
| Fibroadenoma‡ | 6 |
| Reason for examination† | |
|   Diagnostic confirmation | 16/65 |
|   Screening | 9/65 |
|   High risk screening | 14/65 |
|   Staging | 15/65 |
|   Follow-up | 11/65 |
| Menopausal status† | |
|   Premenopausal | 20/65 |
|   Postmenopausal | 45/65 |
| Antihormonal therapy† | |
|   Yes | 13/65 |
| Breast density right breast† | |
|   A (almost entirely fat) | 13/65 |
|   B (scattered FGT) | 13/65 |
|   C (heterogeneous FGT) | 18/65 |
|   D (extreme FGT) | 20/65 |
|   Silicone | 1/65 |
| Breast density left breast† | |
|   A (almost entirely fat) | 15/65 |
|   B (scattered FGT) | 10/65 |
|   C (heterogeneous FGT) | 18/65 |
|   D (extreme FGT) | 19/65 |
|   Silicone | 3/65 |
| Mutation† | |
|   No or not diagnosed | 48/65 |
|   BRCA 1/2 | 11/65 |
|   ATM | 1/65 |
|   CHEK2 | 1/65 |
|   PALB2 | 4/65 |
| Histopathological finding† | |
|   No special type | 14/15 |
|   Lobular type | 1/15 |
| TNM (8th edition)† | |
|   cT1 | 5/15 |
|   cT2 | 9/15 |
|   cT3 | 0/15 |
|   cT4 | 1/15 |
| Grading† | |
|   G1 | 0/15 |
|   G2 | 7/15 |
|   G3 | 8/15 |
| ER† | |
|   Negative | 7/15 |
|   Positive | 8/15 |
| PR† | |
|   Negative | 8/15 |
|   Positive | 7/15 |
| HER2† | |
|   Negative | 10/15 |
|   Positive | 5/15 |
| Proliferation index (Ki-67)† | |
|   <10% | 2/15 |
|   >10% | 13/15 |

*Mean value ± standard deviation.
†Count of participants.
‡Criteria of benign lesion in MRI.
ATM, ataxia telangiectasia mutated; BRCA, breast cancer gene; CHEK2, checkpoint kinase 2; ER, estrogen receptor; FGT, fibroglandular tissue; HER2, human epidermal growth factor receptor 2; PALB2, partner and localizer of breast cancer 2 susceptibility protein; PR, progesterone receptor.
FIGURE 1 - Flow diagram of study enrollment.
Image Acquisition
The standard diagnostic contrast-enhanced multiparametric breast MRI protocol was performed on a 3 T MRI scanner (MAGNETOM Vida; Siemens Healthcare, Erlangen, Germany) using an 18-channel breast coil. The protocol comprised a single-shot echo-planar DWI combined with reduced field-of-view excitation (ZOOMitPRO; Siemens Healthcare, Erlangen, Germany). Two b-values of 50 s/mm2 (b50) and 800 s/mm2 (b800) with a spatial resolution of 1.2 × 1.2 × 3.0 mm3 were chosen in accordance with current literature recommendations.6,22 Diffusion-weighted imaging was performed before the dynamic contrast-enhanced series with bolus injection of 0.1 mL/kg body weight gadoteridol (ProHance; Bracco Imaging, Konstanz, Germany) at a rate of 3 mL/s. A parallel imaging acceleration factor of 2 was applied for both sequences using generalized autocalibrating partially parallel acquisition (GRAPPA). In addition, a prototype DL-accelerated DWI research application sequence with superresolution processing and similar spatial resolution was acquired with a reduced number of averages (3 and 10 averages for b50 and b800, respectively, for DWIDL vs 6 and 20 for DWISTD). Apparent diffusion coefficient maps were generated for both sequence types. To create a more consistent image for reporting purposes, the signal in areas outside of anatomic structures, such as air, was set to zero. Detailed acquisition parameters are provided in Table 2. The DWIDL acquisition and reconstruction pipeline relies on 2 independent networks as given in Figure 2.
TABLE 2 - Comparison of Acquisition Parameters for Standard DWI (DWISTD) and Deep Learning DWI (DWIDL)

| Characteristic | Standard ZOOMit (DWISTD) | Deep Learning ZOOMit (DWIDL) |
|---|---|---|
| Field of view, mm2 | 380 × 192 | 380 × 192 |
| Matrix size | 306 × 152 | 612 × 304 (interpolated) |
| No. slices | 45 | 45 |
| Slice thickness, mm | 3 | 3 |
| Resolution, mm2 | 1.24 × 1.24 | 0.62 × 0.62 |
| PAT acceleration | 2 (GRAPPA) | 2 (GRAPPA) |
| Fat saturation technique | Spectral Attenuated Inversion Recovery | Spectral Attenuated Inversion Recovery |
| B-values, s/mm2 | 50; 800 | 50; 800 |
| Diffusion directions | 1 | 1 |
| Averages | 6; 20 | 3; 10 |
| TR, ms | 10,400 | 10,400 |
| TE, ms | 79 | 79 |
| Acquisition time, min:s | 05:02 | 02:44 |

DL, deep learning; DWI, diffusion-weighted imaging; GRAPPA, generalized autocalibrating partially parallel acquisition; PAT, parallel imaging techniques; STD, standard; TE, echo time; TR, repetition time.
FIGURE 2 - Illustration of standard and deep learning diffusion-weighted imaging (DWI) at 3 T breast MRI. The upper box demonstrates the standard DWI (DWISTD) with a conventional generalized autocalibrating partially parallel acquisition (GRAPPA) reconstruction. Subsampled k-space data with a reduced number of averages were reconstructed for the deep learning DWI with superresolution (DWIDL) with 2 deep learning networks. One network (yellow circle, number 1) was used to improve acquisition time. In addition, another network (red circle, number 2) was applied to increase spatial resolution through interpolation. The resulting spatial resolution of the DWIDL was 0.6 × 0.6 × 3.0 mm3 compared with 1.2 × 1.2 × 3.0 mm3 for the DWISTD. Image examples are given with b-values of 800 s/mm2 (b800) and apparent diffusion coefficient maps (ADC maps) on the right side, for standard and deep learning DWI, respectively. ADC, apparent diffusion coefficient; C, contrast; CNR, contrast-to-noise ratio; DL, deep learning; GRAPPA, generalized autocalibrating partially parallel acquisition; DWI, diffusion-weighted imaging.
First, for the accelerated DWI acquisition, a k-space to image reconstruction based on the idea of a variational network23 was deployed. Precalculated coil sensitivity maps as well as k-space data were used as input. In total, 17 unrolled iterations were performed with trainable step sizes and Nesterov-type extrapolation. The first 6 iterations focus on parallel imaging by applying data consistency without additional regularization. Subsequently, 11 iterations with a convolutional network–based regularization with down-up architecture were applied.
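The unrolled scheme described above can be illustrated with a heavily simplified NumPy sketch. It is not the vendor's network: the step size, the momentum value, and the `regularizer` callable are hypothetical placeholders for quantities that are learned in the actual reconstruction, and the function names are chosen here for illustration only. The sketch only shows how 6 data-consistency iterations followed by 11 regularized iterations with Nesterov-type extrapolation could be arranged.

```python
import numpy as np

def forward(x, smaps, mask):
    """Image -> undersampled multi-coil k-space (coil weighting, FFT, sampling mask)."""
    return mask * np.fft.fft2(smaps * x, norm="ortho")

def adjoint(y, smaps, mask):
    """Undersampled multi-coil k-space -> coil-combined image."""
    return np.sum(np.conj(smaps) * np.fft.ifft2(mask * y, norm="ortho"), axis=0)

def unrolled_recon(y, smaps, mask, n_dc=6, n_reg=11,
                   step=1.0, momentum=0.5, regularizer=None):
    """Toy unrolled reconstruction with a fixed step size and fixed momentum.

    The first n_dc iterations apply data consistency only; the remaining
    n_reg iterations additionally subtract the output of `regularizer`,
    a stand-in (assumption) for the trained convolutional network.
    """
    x = adjoint(y, smaps, mask)
    x_prev = x.copy()
    for k in range(n_dc + n_reg):
        z = x + momentum * (x - x_prev)            # Nesterov-type extrapolation
        grad = adjoint(forward(z, smaps, mask) - y, smaps, mask)
        x_next = z - step * grad                   # data-consistency gradient step
        if k >= n_dc and regularizer is not None:
            x_next = x_next - regularizer(x_next)  # placeholder for learned prior
        x_prev, x = x, x_next
    return x
```

With normalized coil sensitivity maps and a fully sampled mask, the data-consistency term alone already reproduces the input image, which is a convenient sanity check for the operators.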
The entire reconstruction was trained offline in PyTorch in a supervised setting on approximately 500,000 single-shot DWI scans of volunteers collected on various clinical 1.5 T and 3 T scanners (MAGNETOM; Siemens Healthcare, Erlangen, Germany). Training data were acquired with parallel imaging acceleration factors of 1 or 2, and corresponding ground truth images were generated by a conventional reconstruction. To create training pairs, the acceleration factor was retrospectively doubled. Training was performed on an NVIDIA Tesla V100-SXM2 GPU (NVIDIA Corporation, Santa Clara, CA) with 16 GB of memory. Network training and inference were performed on single-shot images and therefore without coupling between different b-values or averages. After training, the network was frozen and integrated in the C++-based reconstruction pipeline on the scanner.
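As a rough illustration of how the acceleration factor can be doubled retrospectively, the following sketch discards every second acquired phase-encoding line of a regularly undersampled k-space. It assumes a simple regular sampling pattern that starts at line 0 and ignores autocalibration lines; the function name and the array layout are hypothetical and not taken from the study.

```python
import numpy as np

def retrospective_pair(kspace, accel):
    """Build a network input by doubling the acceleration of an acquired scan.

    kspace: array of shape (n_lines, n_readout) in which every `accel`-th
            phase-encoding line (starting at line 0) was acquired and all
            other lines are zero.
    Returns the further undersampled k-space and the indices of kept lines.
    """
    n_lines = kspace.shape[0]
    kept = np.arange(0, n_lines, 2 * accel)   # every second acquired line
    net_input = np.zeros_like(kspace)
    net_input[kept] = kspace[kept]
    return net_input, kept
```

The conventionally reconstructed image of the original `kspace` would then serve as the target, and `net_input` as the input of the training pair.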
Second, after application of the DL-based reconstruction to obtain single-shot images from acquired k-space data, an additional image-based superresolution network was applied to increase spatial resolution to 0.6 × 0.6 × 3.0 mm3. For this task, a second network with pixel shuffle architecture24 was used. In order not to modify the actual acquired image information but only to extrapolate nonmeasured frequencies, hard data consistency steps were applied. The network was trained on ground truth MR images acquired in volunteers and reconstructed in-line with conventional GRAPPA using various sequences and covering various body regions. For these nonaveraged single-shot images, spatial resolution along the phase-encoding and readout directions was decreased by a factor of 2 to build corresponding training pairs. Again, training was performed offline, and the trained network was afterward integrated into the scanner's reconstruction pipeline.
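One common way to realize a hard data consistency step is to overwrite the central k-space region of the superresolved image with the k-space of the acquired low-resolution image, so that only the outer, nonmeasured frequencies come from the network. The sketch below assumes this k-space replacement approach, which the text does not spell out; the helper names are hypothetical. The `downsample` function additionally mimics how resolution could be halved along both in-plane directions to build training pairs.

```python
import numpy as np

def _to_k(img):
    """Centered, orthonormal 2D FFT."""
    return np.fft.fftshift(np.fft.fft2(img, norm="ortho"))

def _to_img(k):
    """Inverse of _to_k."""
    return np.fft.ifft2(np.fft.ifftshift(k), norm="ortho")

def _center(shape_hr, shape_lr):
    """Slices selecting the central low-resolution block of a centered k-space."""
    (H, W), (h, w) = shape_hr, shape_lr
    r0, c0 = H // 2 - h // 2, W // 2 - w // 2
    return slice(r0, r0 + h), slice(c0, c0 + w)

def downsample(img_hr, factor=2):
    """Reduce in-plane resolution by cropping k-space to its central part."""
    H, W = img_hr.shape
    h, w = H // factor, W // factor
    rows, cols = _center((H, W), (h, w))
    scale = np.sqrt((H * W) / (h * w))        # keeps image intensities comparable
    return _to_img(_to_k(img_hr)[rows, cols] / scale)

def hard_data_consistency(img_sr, img_lr):
    """Replace the measured (central) frequencies of img_sr with those of img_lr."""
    rows, cols = _center(img_sr.shape, img_lr.shape)
    scale = np.sqrt(img_sr.size / img_lr.size)
    k_sr = _to_k(img_sr)
    k_sr[rows, cols] = _to_k(img_lr) * scale
    return _to_img(k_sr)
```

After this step, downsampling the corrected high-resolution image returns the acquired low-resolution image, that is, the measured information is left unchanged.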
After performing the described reconstruction steps, final operations such as averaging and calculation of ADC maps were performed identically to the DWISTD.
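The text does not state how the ADC maps were computed. A minimal sketch, assuming magnitude averaging of the single-shot images and the standard two-point monoexponential model ADC = ln(S_b50 / S_b800) / (800 − 50), could look as follows. The function names and the optional `mask` argument (standing in for the zeroing of signal outside the anatomy) are illustrative assumptions.

```python
import numpy as np

def average_shots(shots):
    """Magnitude-average repeated single-shot images (shape: n_averages, ny, nx)."""
    return np.mean(np.abs(shots), axis=0)

def adc_map(s_low, s_high, b_low=50.0, b_high=800.0, mask=None):
    """Two-point ADC in mm2/s from averaged images at b_low and b_high (s/mm2).

    Voxels with non-positive signal, or outside the optional mask,
    are set to zero instead of being evaluated.
    """
    s_low = np.asarray(s_low, dtype=float)
    s_high = np.asarray(s_high, dtype=float)
    valid = (s_low > 0) & (s_high > 0)
    if mask is not None:
        valid &= np.asarray(mask, dtype=bool)
    adc = np.zeros_like(s_low)
    adc[valid] = np.log(s_low[valid] / s_high[valid]) / (b_high - b_low)
    return adc
```

Multiplying the result by 1000 gives values in units of 10−3 mm2/s, the scale used in the results tables.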
Quantitative Image Analysis
Image quality was quantitatively evaluated based on region of interest (ROI) measurements to calculate the signal-to-noise ratio (SNR) in breast tissue (BT) for b50, b800, and ADC maps. A radiologist with 5 years of experience in breast radiology placed the ROIs using a local instance of the postprocessing platform NORA (http://www.nora-imaging.com) to ensure identical sizes and locations of ROIs. To avoid bias from distortion artifacts, anatomic regions were exactly matched to ensure identical measurements. Signal intensity (SI) in ADC maps was measured only in solid parts of the invasive breast cancers (IBCs) or BEs and centrally in cysts (CYs). Apparent diffusion coefficient values were analyzed separately (ADCIBC, ADCBE, ADCCY) and compared between DWISTD and DWIDL. Contrast-to-noise ratio (CNR) and contrast (C) were calculated for IBC and BE with the ROI placed in solid parts of the masses and for CY with the ROI placed in the center; in each case, a second ROI was placed in fibroglandular tissue (FGT) close to the lesion. Because of image processing with zeroing of air, we were not able to calculate the classic SNR. Therefore, SNR was calculated as the mean SI of the lesion (LES) divided by the standard deviation (SD) of the SI within the same ROI: $\mathrm{SNR}_{\mathrm{LES}} = \mathrm{MeanSI}_{\mathrm{LES}} / \mathrm{SD}_{\mathrm{LES}}$. Contrast-to-noise ratio of lesions (labeled as CNRIBC, CNRBE, and CNRCY according to lesion type) was calculated as follows: $\mathrm{CNR}_{\mathrm{LES}} = |\mathrm{MeanSI}_{\mathrm{LES}} - \mathrm{MeanSI}_{\mathrm{FGT}}| / \mathrm{SD}_{\mathrm{LES}}$. Contrast ratio of lesions (CLES) was calculated by dividing the mean SI of the lesion by the mean SI of FGT: $C_{\mathrm{LES}} = \mathrm{MeanSI}_{\mathrm{LES}} / \mathrm{MeanSI}_{\mathrm{FGT}}$.
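The three ROI-based metrics defined above can be computed directly from the voxel values of a lesion ROI and an FGT ROI. The following sketch uses only the Python standard library; the function name is hypothetical, and the use of the sample SD (rather than the population SD) is an assumption, since the text does not specify which was used.

```python
from statistics import mean, stdev

def roi_metrics(lesion_roi, fgt_roi):
    """Return SNR, CNR, and contrast (C) for one lesion.

    lesion_roi, fgt_roi: sequences of signal intensities from the two ROIs.
    Sample standard deviation is used (assumption).
    """
    mean_les = mean(lesion_roi)
    sd_les = stdev(lesion_roi)
    mean_fgt = mean(fgt_roi)
    return {
        "SNR": mean_les / sd_les,
        "CNR": abs(mean_les - mean_fgt) / sd_les,
        "C": mean_les / mean_fgt,
    }
```

For example, a lesion ROI with values 90, 100, and 110 and an FGT ROI with a constant value of 20 gives a mean lesion SI of 100 and an SD of 10, and therefore SNR = 10, CNR = 8, and C = 5.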
Qualitative Image Analysis
Two radiologists (C.W., J.N.) with 5 and 11 years of experience in breast imaging, respectively, blinded to sequence type and clinical parameters, independently performed the qualitative analysis. Overall image quality and lesion conspicuity were rated separately using Likert scales ranging from 1 (nondiagnostic), 2 (poor), 3 (moderate), 4 (good), to 5 (excellent). Ratings “4 (good)” and “5 (excellent)” were considered as high diagnostic image quality. Overall artifacts and motion artifacts were scored from 0 (none), 1 (minimal), 2 (moderate), to 3 (very strong). Sequences were presented to the raters in a random order on diagnostic monitors. To effectively blind the readers, the DWIDL was downsampled to the resolution of the DWISTD for this qualitative evaluation only. The direct comparison method was adopted to enable raters to pick up on minute discrepancies that could otherwise go undetected.
Statistical Analysis
Statistical analysis was performed using SPSS 29.0 (IBM SPSS Statistics, Armonk, NY; Navering USA). Continuous data are presented as mean ± SD; ordinal or nonnormally distributed data are presented as median and interquartile range (IQR), if not otherwise specified. Categorical variables are given as counts relative to the entire population. Normality was tested with the Shapiro-Wilk test. The paired-samples t test was used for normally distributed data, and the Wilcoxon signed rank test was applied to nonnormally distributed data. Interrater reliability was analyzed using a weighted Cohen κ. Agreement was interpreted according to the definition of Landis and Koch, in which κ = 0.00 indicates no, κ ≤ 0.20 slight, 0.20 < κ ≤ 0.40 fair, 0.40 < κ ≤ 0.60 moderate, 0.60 < κ ≤ 0.80 substantial, and 0.80 < κ ≤ 1.00 an almost perfect agreement.25 Two-sided P values of <0.05 were considered statistically significant.
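The analysis itself was run in SPSS. For readers who want to retrace the agreement statistic, the following standard-library sketch computes a weighted Cohen κ and maps it to the Landis and Koch categories listed above. Linear weights are an assumption (the text does not state whether linear or quadratic weights were used), the function names are hypothetical, and negative κ values are grouped with "no agreement" for simplicity.

```python
def weighted_kappa(rater1, rater2, categories, weights="linear"):
    """Weighted Cohen kappa for two raters on an ordered scale.

    categories: ordered list of all possible scores, e.g. [1, 2, 3, 4, 5].
    Undefined (division by zero) if both raters give one identical constant score.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater1)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater1, rater2):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    num = sum(disagreement(i, j) * observed[i][j] for i in range(k) for j in range(k))
    den = sum(disagreement(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1.0 - num / den

def landis_koch(kappa):
    """Verbal interpretation of kappa following the thresholds given in the text."""
    if kappa <= 0.0:
        return "no agreement"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial"), (1.00, "almost perfect")]:
        if kappa <= upper:
            return label
    raise ValueError("kappa must not exceed 1")
```

Applied to the κ values reported later in Table 4, `landis_koch(0.881)` yields "almost perfect" and `landis_koch(0.675)` yields "substantial".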
RESULTS
Participant Characteristics
Between August 2022 and December 2022, a total of 70 participants agreed to participate in the study and underwent a standard breast MRI protocol with an additional DWIDL sequence before administration of contrast agent. Five participants were excluded due to incomplete or differing standard DWI sequences (Fig. 1). Thus, the final cohort consisted of 65 participants with a mean age of 54 ± 13 years (range, 29–85 years; 64 women). Of these, 16 participants had histologically proven IBC. One patient was excluded because the tumor was too small for objective measurements (<2 mm). A total of 15 histologically proven IBCs were therefore included in the analysis, with an average tumor size of 2.3 ± 1.1 cm (median, 2.1 cm; range, 0.6–4.2 cm); 14/15 were of no special type, and 1/15 was of lobular type. Of the 90 lesions identified in the cohort, 74 were classified as benign; 5 of these had to be excluded because they were smaller than 2 mm and thus too small for sufficient measurements, and 2 were not represented in the DWI due to distortion artifacts. A total of 67 BEs could be included in the study. Benign lesions had an average diameter of 0.7 ± 0.5 cm (median, 0.6 cm; range, 0.2–3.4 cm). Of those, 20/67 were classified as intramammary lymph nodes, 12/67 as fibroadenoma (with 6/12 proven in biopsy), and 35/67 as cystic lesions. For detailed information on participant and lesion characteristics, see Table 1.
Quantitative Analysis
Acquisition Time
The mean acquisition time was 5:02 minutes for DWISTD and 2:44 minutes for DWIDL, resulting in a reduction of acquisition time by 46% for DWIDL (P < 0.001).
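The reported reduction can be retraced from the two acquisition times: 5:02 min corresponds to 302 s and 2:44 min to 164 s, so the difference of 138 s is about 46% of the standard acquisition time. A short sketch of this arithmetic (the helper name is illustrative):

```python
def to_seconds(min_sec):
    """Convert a 'min:s' string such as '05:02' to seconds."""
    minutes, seconds = min_sec.split(":")
    return int(minutes) * 60 + int(seconds)

t_std = to_seconds("05:02")   # standard DWI: 302 s
t_dl = to_seconds("02:44")    # deep learning DWI: 164 s
reduction_percent = 100 * (t_std - t_dl) / t_std   # approximately 45.7
```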
Quantitative Analysis of Breast Tissue
The median SNRBT was significantly higher in DWISTD than in DWIDL for b50 (2.8 [IQR, 0.9] vs 2.4 [IQR, 0.9]; P < 0.001), b800 (4.5 [IQR, 2.3] vs 3.8 [IQR, 1.6]; P < 0.001), and ADC maps (3.9 [IQR, 2.7] vs 3.5 [IQR, 2.9]; P = 0.01). For a detailed overview of the data, including mean and SD, see Supplementary Table S1, https://links.lww.com/RLI/A831.
Quantitative Analysis on Breast Cancers
The mean ADC SI in IBC did not differ significantly (P = 0.32), with (0.77 ± 0.13) × 10−3 mm2/s (range, 0.61–1.07 × 10−3 mm2/s) in DWISTD and (0.75 ± 0.12) × 10−3 mm2/s (range, 0.60–0.96 × 10−3 mm2/s) in DWIDL (see Table 3).
TABLE 3 - Mean SI of Breast Cancers in ADC Maps for Standard and Deep Learning DWI

| Feature (ADC) | Mean SI* | SD | Median | IQR | P |
|---|---|---|---|---|---|
| IBC ADCSTD | 0.77 | 0.13 | 0.76 | 0.23 | 0.32 |
| IBC ADCDL | 0.75 | 0.12 | 0.76 | 0.2 | |
| BE ADCSTD | 1.32 | 0.48 | 1.29 | 0.51 | 0.12 |
| BE ADCDL | 1.39 | 0.54 | 1.33 | 0.52 | |
| CY ADCSTD | 2.18 | 0.49 | 2.11 | 0.64 | 0.002 |
| CY ADCDL | 2.31 | 0.43 | 2.29 | 0.41 | |

*Mean signal intensity measured in solid areas of invasive breast cancers (IBCs), benign lesions (BEs), and cysts (CY).
Mean SI of IBCs, BEs, and CYs for ADC maps for DWISTD and DWIDL. ADC values are given in × 10−3 mm2/s. P values are given for comparisons between DWISTD and DWIDL. BEs and CYs showed higher mean SI in DWIDL compared with DWISTD, whereas IBCs showed slightly lower mean SI. Note that a significant difference between DWISTD and DWIDL was observed only for CY, not for IBC and BE.
SI, signal intensity; ADC, apparent diffusion coefficient; DWI, diffusion-weighted imaging; IQR, interquartile range; IBC, invasive breast cancer; BE, benign lesion; CY, cyst; STD, standard; DL, deep learning.
The median SNRIBC and CNRIBC did not differ significantly between DWISTD and DWIDL for b50 (SNR: P = 0.23; CNR: P = 0.69), b800 (SNR: P = 0.43; CNR: P = 0.21), and ADC maps (SNR: P = 0.65; CNR: P = 0.21). CIBC differed significantly between DWISTD and DWIDL for b50 (5.0 ± 3.8 vs 7.2 ± 6.3; P = 0.01) and for ADC maps (2.7 ± 2.7 vs 4.6 ± 4.7; P = 0.008), but not for the b-value of 800 s/mm2 (3.5 ± 1.8 vs 5.1 ± 3.3; P = 0.08). For detailed data including b-values, see Figure 3 and Supplementary Tables S1–3, https://links.lww.com/RLI/A831.
FIGURE 3 - Quantitative analysis of breast cancers, benign lesions, and cysts. Absolute values of apparent diffusion coefficient maps (ADC; value × 10−3 mm2/s), mean signal-to-noise ratio (SNRLES), mean contrast-to-noise ratio (CNRLES), and mean contrast (CLES) for biopsy-proven breast cancers (IBC), for benign lesions (BE) and cysts (CY) are shown with comparisons between standard DWI (DWISTD, blue bars) and deep learning (DL) DWI (DWIDL, red bars). Left side shows boxplots for IBC, middle column for BE, and right side for CY. No significant differences were observed for SNR and CNR for all lesion types. Note the improved contrast of lesions for DWIDL. ***P values less than 0.001. *P values less than 0.05. Whiskers indicate standard deviations of means. ADC, apparent diffusion coefficient; C, contrast; CNR, contrast-to-noise ratio; DL, deep learning; DWI, diffusion-weighted imaging.
Quantitative Analysis on Benign Lesions and Cysts
Benign Lesions
The mean ADC SI in BE did not differ significantly (P = 0.12), with (1.3 ± 0.5) × 10−3 mm2/s in DWISTD and (1.4 ± 0.5) × 10−3 mm2/s in DWIDL (see Table 3).
The median SNRBE and CNRBE did not differ significantly between DWISTD and DWIDL for b50 (SNR: P = 0.31; CNR: P = 0.23), b800 (SNR: P = 0.49; CNR: P = 0.82), and ADC maps (SNR: P = 0.20; CNR: P = 0.30). CBE differed significantly between DWISTD and DWIDL for b50 (7.0 ± 3.9 vs 10.0 ± 5.1; P < 0.001), b800 (3.3 ± 1.6 vs 4.2 ± 2.3; P < 0.001), and ADC maps (7.3 ± 4.0 vs 14.6 ± 16.7; P = 0.02). For detailed data including b-values, see Figure 3 and Supplementary Tables S1–3, https://links.lww.com/RLI/A831.
Cysts
The mean ADC SI in cysts differed significantly (P = 0.002), with (2.2 ± 0.5) × 10−3 mm2/s in DWISTD and (2.3 ± 0.4) × 10−3 mm2/s in DWIDL (see Table 3).
The median SNRCY and CNRCY did not differ significantly between DWISTD and DWIDL for b50 (SNR: P = 0.26; CNR: P = 0.20), b800 (SNR: P = 0.92; CNR: P = 0.47), and ADC maps (SNR: P = 0.53; CNR: P = 0.39). The median CCY differed significantly between DWISTD and DWIDL for b50 (12.1 [IQR, 10.0] vs 15.3 [IQR, 14.7]; P < 0.001), b800 (3.5 [IQR, 2.2] vs 4.8 [IQR, 3.5]; P < 0.001), and ADC maps (7.1 [IQR, 6.3] vs 7.9 [IQR, 11.6]; P < 0.001). For detailed data including b-values, see Figure 3 and Supplementary Tables S1–3, https://links.lww.com/RLI/A831.
Qualitative Analysis
Although DWISTD was rated toward the higher image quality scores (P < 0.001), both sequences revealed a high image quality; good image quality scores were recorded in 26/65 cases for DWISTD and 36/65 cases for DWIDL compared with excellent scores in 31/65 cases for DWISTD and 3/65 cases for DWIDL. Moderate image quality scores were reported in 5/65 cases for DWISTD and 21/65 for DWIDL. In only 2/65 and 3/65 cases, the image quality was classified as poor for DWISTD and DWIDL, respectively. The same distribution was found for nondiagnostic image quality. Detailed data are given in Table 4.
TABLE 4 - Ratings of the Qualitative Analysis on Image Quality, Lesion Conspicuity, and Artifacts Performed by 2 Raters

| Characteristic | DWISTD Mean ± SD | DWISTD Median | DWISTD IQR | DWISTD κ | DWIDL Mean ± SD | DWIDL Median | DWIDL IQR | DWIDL κ | P |
|---|---|---|---|---|---|---|---|---|---|
| Overall image quality | 4.2–4.3 ± 0.9–0.9 | 4–4 | 0–0 | 0.881 | 3.5–3.5 ± 0.8–0.8 | 4–4 | 0–0 | 0.87 | <0.001 |
| Conspicuity IBC | 3.8–3.9 ± 0.9–1.0 | 4–4 | 1.5–1.0 | 0.675 | 4.3–4.4 ± 0.9–0.9 | 5–5 | 1.5–1.5 | 0.922 | <0.001 |
| Conspicuity BE | 4.0–4.0 ± 0.6–0.7 | 4–4 | 0–0 | 0.802 | 4.7–4.7 ± 0.6–0.6 | 5–5 | 0–0 | 1 | <0.001 |
| Conspicuity CY | 3.9–3.9 ± 0.8–0.8 | 4–4 | 0.5–0.5 | 0.933 | 4.6–4.6 ± 0.8–0.8 | 5–5 | 0.5–0.5 | 0.912 | <0.001 |
| Overall artifact level | 1.4–2.2 ± 0.7–0.6 | 1–1 | | | | | | | |