Tomography, Vol. 9, Pages 1-11: Predicting Underestimation of Invasive Cancer in Patients with Core-Needle-Biopsy-Diagnosed Ductal Carcinoma In Situ Using Deep Learning Algorithms

1. Introduction

Ductal carcinoma in situ (DCIS) is a noninvasive breast cancer characterized by abnormal cells inside a milk duct [1]. Unlike invasive ductal carcinoma (IDC), which spreads into surrounding breast tissue, the proliferation of malignant cells in DCIS is confined within the basement membrane of the milk ducts [2]. DCIS therefore permits less-invasive therapy options than IDC, which usually requires axillary interventions. While core-needle biopsy (CNB) is a gold standard for the diagnosis of breast lesions, a presurgical diagnosis of DCIS using CNB with a small-caliber needle carries a risk of sampling error and may result in DCIS being upgraded to invasive disease in the histopathology of surgically excised specimens. The reported rate of upgrade from DCIS at CNB to invasive disease after surgery ranges from 6% to 41% [3,4].

The differentiation of pure DCIS from upgraded DCIS with an invasive component is of clinical importance because the treatment strategies and prognosis of these two conditions are markedly different. Sentinel lymph node biopsy (SLNB) is not recommended for DCIS when breast-conserving surgery is planned because of the low incidence of axillary involvement in pure DCIS (1–2%) [5]. In cases of upgraded DCIS, however, SLNB or axillary lymph node dissection (ALND) is necessary. Presurgical prediction of DCIS with an occult invasive component would give clinicians an important tool for providing optimal care to these patients.

A number of efforts have been made to evaluate preoperative factors that are predictive of an occult invasive component in DCIS using various breast imaging modalities [6,7,8,9]. Several studies have shown that magnetic resonance imaging (MRI) has the potential to distinguish DCIS with an occult invasive component from pure DCIS [6,10,11]. These studies examined conventional MR imaging features, such as cancer size and lesion signal intensity, to identify predictors of an occult invasive component in DCIS. Recently, deep-learning-based methods have emerged as one of the most powerful tools for computerized pattern recognition in the analysis of medical images. Zhu et al. demonstrated that a convolutional-neural-network (CNN)-based algorithm using breast MRI can predict DCIS with occult invasion with a borderline performance (AUC of 0.68–0.70) [12].

The aim of this study was to develop deep learning models based on breast MRI to distinguish pure from upgraded DCIS diagnosed by CNB. Three CNN-based models, which differed in the type of input images and in CNN architecture, were developed to investigate the feasibility of predicting the upgrade status of DCIS.

3. Results

Table 3 compares the performance of the three classifiers: the D-T RRCNN, RRCNN with ROIs, and CNN with ROIs models. For the validation data, the three models demonstrated comparable performance, with sensitivity, specificity, accuracy, and AUC ranging from 0.600 to 0.640, from 0.800 to 0.828, from 73.3% to 75.0%, and from 0.767 to 0.785, respectively.

For the testing data, the RRCNN with ROIs model achieved the highest performance, with a sensitivity, specificity, accuracy, and AUC of 0.677, 0.804, 75.0%, and 0.796, respectively. The D-T RRCNN model performed similarly to the RRCNN with ROIs model, with a sensitivity, specificity, accuracy, and AUC of 0.645, 0.804, 73.6%, and 0.762, respectively. The CNN with ROIs model performed slightly worse than the other two models, with a sensitivity, specificity, accuracy, and AUC of 0.645, 0.756, 70.8%, and 0.755, respectively. Figure 6 compares the receiver operating characteristic (ROC) curves of the three deep learning models.

4. Discussion

This study demonstrated the feasibility of deep learning models based on breast MRI for distinguishing pure from upgraded DCIS. The best-performing model achieved an accuracy of 75.0% and an AUC of 0.796 on the testing data. These results suggest that a deep-learning-based approach has the potential to predict the upgrade status of DCIS from presurgical MRI data.

Although the differentiation of upgraded DCIS from pure DCIS is of clinical importance because the two conditions call for distinct treatment strategies, the presurgical prediction of DCIS with occult invasion using medical imaging data is challenging. Several researchers have used various breast imaging modalities, including mammography and MRI, to evaluate preoperative factors that are predictive of DCIS upgrade [7,8,9,10,11,16,17,18]. Most of these efforts, however, analyzed medical imaging data in conventional ways based on the qualitative assessment of imaging parameters. For example, Lamb et al. reported that larger size on MRI and the presence of comedonecrosis at biopsy were significantly associated with the upgrade of DCIS [17].

Recently, a few reports have applied deep learning methods to the prediction of DCIS upgrade [12,19,20]. Using mammograms to predict the upgrade status of DCIS, Shi et al. showed that deep learning features from a CNN pretrained on non-medical images and hand-crafted computer vision (CV) features provided comparable performance, with borderline AUCs of 0.70 and 0.68 for the deep and hand-crafted CV features, respectively [20]. In another study, Hou et al. applied a deep learning model using a domain adaptation approach for distinguishing DCIS with atypical ductal hyperplasia (ADH) and DCIS with invasive component and reported an AUC of 0.697 [19]. Although MRI is the most sensitive tool for malignancy detection among breast imaging modalities [21,22], the application of deep learning to MRI for predicting the upgrade status of DCIS remains very limited. Compared with mammography, the larger volume of three-dimensional data in MRI demands more elaborate processing when MRI data are used for deep learning applications. One previous study applied a pre-trained deep learning algorithm to breast MRI for the prediction of DCIS with occult invasion and reported an AUC of 0.68–0.70 [12]. In comparison, the deep learning models proposed in our study achieved higher AUCs, ranging from 0.755 to 0.796 on the testing data.
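For readers less familiar with the metric behind these comparisons, the sketch below illustrates one standard way to interpret AUC: the probability that a randomly chosen upgraded case receives a higher model score than a randomly chosen pure case. The function name and the scores are hypothetical and are not data from this study; the study's own AUCs were presumably derived from the ROC curves in Figure 6.

```python
def auc_from_scores(pos_scores, neg_scores):
    """Pairwise (Mann-Whitney) estimate of AUC; ties count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical predicted probabilities of "upgraded" (illustration only).
upgraded_scores = [0.9, 0.7, 0.6, 0.4]
pure_scores = [0.8, 0.5, 0.3, 0.2, 0.1]

auc = auc_from_scores(upgraded_scores, pure_scores)
print(f"AUC = {auc:.3f}")
```

Under this reading, an AUC of 0.796 means that the model ranks an upgraded case above a pure case in roughly four of five such random pairs.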

In this study, we compared the performance of models using multiple MRI slices (sequential model, RRCNN) and a single MRI slice (2D model, CNN) as input. The performance of the RRCNN with ROIs model (accuracy = 75.0%, AUC = 0.796) was higher than that of the CNN with ROIs model (accuracy = 70.8%, AUC = 0.755). The information extracted across multiple MRI slices in the sequential model may have helped the classification of upgraded versus pure DCIS. The high specificity (>0.8) of the sequential models suggests that the relative information between slices plays an important role in identifying pure DCIS cases. In addition, the RRCNN with ROIs model showed a comparable, but slightly higher, performance than the D-T RRCNN model (accuracy = 73.6%, AUC = 0.762). One advantage of the D-T model over the RRCNN with ROIs model is that it does not require manual annotation of ROIs. The information transferred from the detection network directly feeds the quarter of the subtraction images in the D-T model, while the ROIs models (both the RRCNN with ROIs and CNN with ROIs models) require manual input of the lesion location. As manual lesion annotation usually takes a great deal of human effort and time, the D-T RRCNN model may be beneficial by minimizing human involvement during model training, especially when handling a large amount of data.

All three models showed relatively high specificities (0.756–0.804 on testing data) but low sensitivities (0.645–0.677 on testing data). The low sensitivity of these models means that a considerable number of upgraded DCIS cases were underestimated as pure DCIS. The smaller number of upgraded DCIS patients (n = 150) compared with pure DCIS patients (n = 202) may have caused this imbalance between sensitivity and specificity. In addition, the variation in the patterns of upgraded DCIS, and the consequent difficulty of identifying it, may demand more upgraded DCIS patient data for training deep learning models.
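To make the relationship between these metrics concrete, the following minimal sketch computes sensitivity, specificity, and accuracy from confusion-matrix counts, treating upgraded DCIS as the positive class. The counts are hypothetical: they were chosen because they approximately reproduce the rates reported for the RRCNN with ROIs model on the testing data, and the actual test-set composition is not stated in this section.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) for a binary classifier."""
    sensitivity = tp / (tp + fn)   # upgraded cases correctly flagged
    specificity = tn / (tn + fp)   # pure cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts (assumption): 31 upgraded and 41 pure test cases.
# fn = 10 are upgraded cases predicted as pure, i.e. underestimations.
sens, spec, acc = binary_metrics(tp=21, fn=10, tn=33, fp=8)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

In this illustration, each false negative is an upgraded case that would be managed as pure DCIS, which is why sensitivity rather than specificity governs the underestimation rate.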

There are several limitations in our study. First, this was a retrospective study without external test data. Future studies with external validation on multi-center data will strengthen the validity of our approach. The compilation of multi-center patient data from our collaborating institutions is currently ongoing. Second, we used only subtraction images to train our models. Because DCIS usually appears as non-mass enhancement, it is challenging to define the tumor boundary against background parenchymal tissue in T2-weighted MRI. Further studies are necessary to properly define and delineate the lesions in T2-weighted images, and to evaluate the effect of adding multi-parametric MRI data on the predictive performance of the proposed models. Third, the process of selecting 20 imaging slices for the sequential models still required expert involvement. The total number of imaging slices in each MRI dataset can vary, and only some of the slices in an MRI sequence contained tumors. These factors made the slice selection process challenging to automate.

The proposed models showed the feasibility of using deep learning as an assistive tool for estimating invasiveness in DCIS diagnosed by core-needle biopsy. Sensitivity, however, fell short of our expectations and was lower than specificity. Even though we applied augmentation to make the training data more balanced, the small number of upgraded DCIS patients (n = 150) still biased the proposed models toward pure DCIS. The difficulty and variability of upgraded DCIS patterns also call for more upgraded DCIS patients to train a deep learning model with less bias toward pure DCIS.
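As an illustration of how such class imbalance is often handled alongside augmentation, the sketch below computes inverse-frequency class weights from the patient counts given above. This is a common general technique, not the method used in this study, and the function name is hypothetical.

```python
def inverse_frequency_weights(counts):
    """Weight each class by total / (num_classes * class_count)."""
    total = sum(counts.values())
    num_classes = len(counts)
    return {label: total / (num_classes * n) for label, n in counts.items()}


# Patient counts reported in the text: 150 upgraded and 202 pure DCIS.
weights = inverse_frequency_weights({"upgraded": 150, "pure": 202})
for label, w in weights.items():
    print(f"{label}: {w:.3f}")
```

Such weights would typically scale each sample's contribution to the training loss so that errors on the minority (upgraded) class cost more; whether this would improve sensitivity here is an open question that the source does not address.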
