Removing outliers from the normative database improves regional atrophy detection in single-subject voxel-based morphometry

Test dataset

The test dataset for this retrospective study comprised 118 subjects: 81 patients (age 65.9 ± 8.2 years, 54% females) with AD (18 with amnestic dementia due to AD, 22 with amnestic mild cognitive impairment (MCI) due to AD, 11 with posterior cortical atrophy (PCA)) or FTLD (20 with behavioral variant FTLD (bvFTLD), 10 with semantic variant primary progressive aphasia (SD)), and 37 healthy controls (HC, 58.1 ± 10.9 years, 43% females). The subjects were included retrospectively from a previous prospective study on the relationship between local neuronal activity and the functional coupling among distributed brain regions [18] and from a previous retrospective study on the utility of single-subject VBM with a scanner- and sequence-specific NDB for the differential diagnosis of dementing neurodegenerative diseases in clinical practice [3]. The ground truth diagnoses had been established by dementia experts based on biomarker information (FDG-PET, amyloid-PET, and/or CSF amyloid-β42, phosphorylated tau, and total tau), clinical examination, neuropsychological testing, and clinical follow-up.

In all subjects, simultaneous FDG-PET/MRI had been performed with the same PET-MRI hybrid system (Siemens Biograph mMR PET-MRI, Siemens Healthineers, Erlangen, Germany) using exactly the same acquisition sequence. Imaging included a 3D T1-weighted sequence with a resolution of 1 × 1 × 1 mm³ (TR = 2300 ms, TE = 2.98 ms, TI = 900 ms, flip angle = 9°).

Normative databases

The scanner-specific single-scanner NDB (SSD) consisted of the 37 healthy controls from the test dataset.

The non-scanner-specific multiple-scanner NDB (MSD) comprised 3D T1-weighted MRI scans with 1 × 1 × 1 mm³ resolution from 164 subjects (64.1 ± 9.4 years, 57% females) acquired for nonspecific symptoms (e.g., headache, dizziness) with 164 different MRI scanners at 164 different sites using acquisition sequences recommended by the respective scanner manufacturer. Imaging was performed at 3/1.5/1.0 Tesla in 47/114/3 cases (28.7/69.5/1.8%) using MRI scanners from three different manufacturers: Siemens (n = 110; Aera, Amira, Avanto, Espree, Galan, HarmonyExpert, MAGNETOM (Lumina, Vida, ESSENZA), Orian, Skyra (fit), Symphony (Tim), TrioTim, Verio), Philips (n = 40; Achieva (dStream), Ingenia, Intera, Panorama HFO), and GE (n = 14; DISCOVERY MR750, Optima MR450w, SIGNA (Hde, HDxt, Voyager)).

None of the patients had a history of or currently ongoing neurological or psychiatric disease. All scans were free of abnormalities beyond those expected for the patients’ age based on visual inspection by an experienced radiologist.

Removal of outliers from the NDB

GM density maps in the anatomical space of the Montreal Neurological Institute (MNI) were obtained for each scan in the NDB as described in subsection “Single-subject voxel-based morphometry”. Then, a leave-one-out z-score map was computed for each GM map by voxel-wise application of the following formula:

$$\text{z-score}=\left(\mathrm{individual\;GM\;density}-\mathrm{mean\;GM\;density}\right)/\mathrm{standard\;deviation\;of\;GM\;density}$$

(1)

where mean and standard deviation of the GM density were computed over all GM density maps in the NDB excluding the individual GM map. The calculation of the z-score map was restricted to a standard GM mask predefined in MNI space (in order to avoid division by zero or very small numbers).
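The leave-one-out computation can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name and the data layout (one flat list of GM densities per scan, already restricted to the GM mask) are hypothetical, and both the sign convention (individual minus mean) and the use of the sample standard deviation are assumptions.

```python
from statistics import mean, stdev


def leave_one_out_z(gm_maps, index):
    """Return the z-score map of gm_maps[index] relative to all other maps.

    gm_maps: list of equally long lists, one per scan, holding the GM
    density at each voxel inside the GM mask (hypothetical layout).
    """
    target = gm_maps[index]
    # Reference sample: every scan in the NDB except the one being scored.
    others = [m for i, m in enumerate(gm_maps) if i != index]
    z_map = []
    for voxel, value in enumerate(target):
        reference = [m[voxel] for m in others]
        # Sample SD (n - 1 denominator) is an assumption. Restricting the
        # computation to the GM mask keeps this denominator away from zero.
        z_map.append((value - mean(reference)) / stdev(reference))
    return z_map


# Toy NDB with four scans and two mask voxels; the last scan is atypical.
toy_ndb = [[1.0, 2.0], [2.0, 2.5], [3.0, 3.0], [10.0, 2.5]]
z_last = leave_one_out_z(toy_ndb, 3)
```

Because the scored scan is excluded from the mean and standard deviation, an atypical scan cannot mask itself by inflating the variance of its own reference sample.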

The following quality metrics were computed for each individual leave-one-out z-score map in a given NDB:

$$\text{z-mean}=\mathrm{mean\;of\;the\;z\text{-}score}\left(\mathrm{absolute\;value}\right)\mathrm{in\;the\;GM\;mask}$$

(2)

$$\text{z-SD}=\mathrm{standard\;deviation\;of\;z\text{-}score}\;\left(\mathrm{absolute\;value}\right)\;\mathrm{in\;the\;GM\;mask}$$

(3)

$$\text{z-fraction}=\mathrm{fraction\;of\;voxels\;in\;the\;GM\;mask\;with}\;z\;\left(\mathrm{absolute\;value}\right)>2.5$$

(4)

A scan in the NDB was considered an outlier with respect to one of these quality metrics if its value of that metric was equal to or larger than the upper quartile + 1.0 × the interquartile range of the metric in the NDB. A scan was considered an (overall) outlier if it was an outlier with respect to one or more of the quality metrics.

Identification and removal of outliers were performed separately for the two NDBs.
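A minimal sketch of the outlier rule, applied to one NDB at a time, is given below. Several points are assumptions rather than details confirmed by the text: the three quality metrics are taken to be the mean and the standard deviation of the absolute z-score and the fraction of mask voxels with absolute z-score above 2.5; the quartiles use the "inclusive" method of Python's `statistics.quantiles`; and all function and key names are hypothetical.

```python
from statistics import mean, quantiles, stdev


def quality_metrics(z_map, threshold=2.5):
    """Summarize one leave-one-out z-score map (metric definitions assumed)."""
    abs_z = [abs(z) for z in z_map]
    return {
        "mean_abs_z": mean(abs_z),
        "sd_abs_z": stdev(abs_z),
        "fraction_above": sum(a > threshold for a in abs_z) / len(abs_z),
    }


def flag_outliers(values, k=1.0):
    """Flag values equal to or larger than upper quartile + k * IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    cutoff = q3 + k * (q3 - q1)
    return [v >= cutoff for v in values]


def overall_outliers(metrics_per_scan):
    """A scan is an overall outlier if any single metric flags it."""
    flags = [False] * len(metrics_per_scan)
    for name in metrics_per_scan[0]:
        per_metric = flag_outliers([m[name] for m in metrics_per_scan])
        flags = [f or p for f, p in zip(flags, per_metric)]
    return flags


# Toy example with two made-up metrics for five scans.
toy_metrics = [
    {"a": 1.0, "b": 50.0},
    {"a": 2.0, "b": 1.0},
    {"a": 3.0, "b": 2.0},
    {"a": 4.0, "b": 3.0},
    {"a": 100.0, "b": 4.0},
]
toy_flags = overall_outliers(toy_metrics)
```

Note that the "equal to or larger" comparison flags every scan when a metric takes the same value in all scans (interquartile range of zero); with continuous metrics computed over many voxels this degenerate case should not arise.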

Single-subject voxel-based morphometry (VBM)

Single-subject VBM relative to each of the four different NDBs (SSD and MSD before and after removal of outliers) was performed with the Biometrica analysis platform (jung diagnostics GmbH, Hamburg, Germany). In brief, the original 3D T1-weighted MRI was segmented into GM, white matter, and cerebrospinal fluid component images [15]. Spatial correspondence between the GM component image of the patient and the GM component images of the NDB was established via high dimensional non-linear image registration (DARTEL) [19]. The registered and modulated individual GM component image was smoothed by convolution with an isotropic Gaussian kernel of 8 mm full-width-at-half-maximum. After smoothing, a voxel-based two-sample t test of the individual smoothed GM component image against the smoothed GM component images of the NDB was carried out, resulting in a statistical t-map. Age and total intracranial volume (TIV) were taken into account as nuisance covariates. The TIV was estimated in each T1-weighted MRI by using a 3D convolutional neural network specifically trained for accurate and stable delineation of the TIV, in particular to avoid TIV overestimation occasionally observed with conventional methods [20, 21].
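When one group consists of a single subject, the pooled-variance two-sample t test reduces to a simple closed form. The sketch below shows that form for one voxel under simplifying assumptions: the nuisance covariates (age and TIV) are omitted, the function name is hypothetical, and the code is not the Biometrica implementation.

```python
from math import sqrt
from statistics import mean, stdev


def single_subject_t(subject_value, ndb_values):
    """t statistic of one subject against a normative group at one voxel.

    With a single observation in the first group, the pooled variance
    equals the variance of the normative group, and the test has
    len(ndb_values) - 1 degrees of freedom.
    """
    n = len(ndb_values)
    standard_error = stdev(ndb_values) * sqrt(1.0 + 1.0 / n)
    # Negative t means lower GM density in the subject than in the NDB.
    t = (subject_value - mean(ndb_values)) / standard_error
    return t, n - 1


t_value, dof = single_subject_t(0.0, [1.0, 2.0, 3.0])
```

The factor sqrt(1 + 1/n) accounts for the uncertainty of the NDB mean. This is one way in which the size and the variance of the NDB both affect the sensitivity of single-subject VBM.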

Visual interpretation of individual VBM maps

Individual VBM maps were thresholded at p = 0.005. For visual interpretation of the VBM maps, a standardized display was used that provided the thresholded VBM maps as color-coded overlay on axial slices and as a glass brain view in a one-page pdf document separately for each case (Fig. 1).

Fig. 1. Standard display for visual interpretation of VBM maps. The example shows the VBM map of a 66-year-old patient with posterior cortical atrophy obtained with the full single-scanner normative database (SSD)

The VBM maps were interpreted by two neuroradiologists with 3 years and 8 years of experience in reading VBM maps of patients with suspected neurodegenerative disease. The readers were blinded to all clinical and biomarker information except age.

There were 472 different pdf documents (118 test cases × two NDBs × without or with removal of outliers). A copy was generated from each of these pdf documents to allow assessment of intra-reader variability of the visual interpretation. This resulted in 944 anonymized pdf documents that were presented in randomized order.

The readers were asked to use the following two-step approach for visual interpretation. First, the readers had to decide whether a neurodegenerative disorder was “present”, “absent”, or “uncertain”. If a neurodegenerative disorder was “present”, in the second step the reader categorized the atrophy pattern as AD or FTLD using criteria described previously [3] (Supplementary Fig. 1).

Cases with intra-reader discrepancy with respect to the detection of a neurodegenerative disease in the first step and/or categorization of the neurodegenerative disease in the second step were read a third time by the same reader to obtain an intra-reader consensus, separately for both readers. A joint reading session was used to resolve between-reader discrepancies of the intra-reader consensus to obtain a between-readers consensus.

Statistical analysis

For each thresholded VBM map, the total volume of atrophy was computed by counting the suprathreshold voxels and multiplying this count by the volume of a single voxel. A general linear model for repeated measures was used to test the impact of NDB cleaning on the total volume of atrophy. NDB (SSD or MSD) and cleaning (without or with) were included as within-subject factors. The ground truth diagnosis (AD, FTLD, HC) was included in the model as between-subjects factor.
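The volume computation amounts to a voxel count scaled by the voxel volume, as the following sketch illustrates. The function name is hypothetical, and the default voxel size of 1.5 mm isotropic is an assumption: the text does not state the voxel size of the VBM maps, but this value would make 296 voxels correspond to about 1 ml.

```python
def atrophy_volume_ml(suprathreshold, voxel_size_mm=(1.5, 1.5, 1.5)):
    """Total atrophy volume in ml from a flat suprathreshold voxel mask.

    suprathreshold: iterable of booleans, True where the thresholded VBM
    map marks a voxel as atrophic (hypothetical layout).
    voxel_size_mm: voxel edge lengths; 1.5 mm isotropic is an assumption.
    """
    n_voxels = sum(1 for flag in suprathreshold if flag)
    dx, dy, dz = voxel_size_mm
    voxel_volume_mm3 = dx * dy * dz
    # 1 ml equals 1000 cubic millimeters.
    return n_voxels * voxel_volume_mm3 / 1000.0


# 296 suprathreshold voxels among 1000 mask voxels.
toy_mask = [True] * 296 + [False] * 704
toy_volume = atrophy_volume_ml(toy_mask)
```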

Cross tables and Cohen’s kappa were used to assess intra- and between-reader agreement of the visual interpretation and to assess the accuracy of the between-readers consensus relative to the clinical ground truth diagnosis, separately for each NDB. “Uncertain” cases were included in the “no neurodegenerative disease” category for statistical analyses to be as specific as possible.
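Cohen's kappa for two sets of readings can be computed from the observed agreement and the agreement expected by chance. The sketch below is illustrative only (the analyses in this study were run in SPSS); the function names and the category labels are hypothetical. It also shows the pooling of "uncertain" readings with the "absent" category before agreement is assessed.

```python
from collections import Counter


def merge_uncertain(ratings):
    """Count 'uncertain' readings as 'absent' (no neurodegenerative disease)."""
    return ["absent" if r == "uncertain" else r for r in ratings]


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long lists of categorical ratings."""
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement from the product of the marginal frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    # Undefined (division by zero) if both raters use a single category.
    return (observed - expected) / (1.0 - expected)


reader_1 = merge_uncertain(["present", "present", "absent", "uncertain"])
reader_2 = merge_uncertain(["present", "absent", "absent", "absent"])
toy_kappa = cohen_kappa(reader_1, reader_2)
```

The same function applies to intra-reader agreement (first versus second reading by the same reader) and to the comparison of the between-readers consensus with the ground truth diagnosis.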

IBM SPSS (version 27) was used for these statistical analyses. The threshold for statistical significance was set at two-sided p = 0.05.

Voxel-based group-level comparison of the GM density between the two NDBs, SSD and MSD, was performed with the heteroscedastic two-sample t test implemented in the statistical parametric mapping software package (version SPM12), separately before and after NDB cleaning. To detect regional GM differences with high sensitivity, the voxel-level significance threshold was set to one-sided p = 0.005 uncorrected for multiple comparisons. The minimum cluster size was fixed at 296 voxels (corresponding to a volume of 1 ml).

Ethics statement

The retrospective use of the test dataset was approved by the ethics committee of the Technical University of Munich (Reference 176/18 s). The need for written informed consent was waived by the ethics committee due to the retrospective nature of the analysis.

The MRI data of the MSD had been transferred to jung diagnostics GmbH for remote image analysis under the terms and conditions of the European General Data Protection Regulation. Subsequently, the data had been anonymized. The need for written informed consent for the retrospective use of the anonymized data was waived by the ethics review board of the general medical council of the state of Hamburg, Germany.
