The statistical impact of ROI referencing on quantitative susceptibility mapping

We investigated referencing in a synthetic dataset with a known ground truth (from the 2019 QSM reconstruction challenge [15]), as well as in a cohort of patients with temporal lobe epilepsy and healthy controls originally investigated by Kiersnowski et al. [25]. The QSM challenge dataset consists of a single realistic numerical phantom reconstructed with many different QSM techniques, which limited comparisons to those between reconstructions. Using this dataset, the aim was to highlight how the reconstruction method influences the impact of referencing. The epilepsy cohort, on the other hand, contains multiple subjects with left or right temporal lobe epilepsy (LTLE and RTLE, respectively), as well as healthy controls (HC), allowing us to evaluate the effect of referencing in a published clinical QSM study.

Reference regions

We considered four reference regions commonly used in the QSM literature [4, 19, 20, 24] and one novel referencing approach:

1. CSF: Cerebrospinal fluid.

2. WM: Corpus callosum (CC). For the epilepsy dataset, both the corpus callosum (CC) and internal capsule (IC) were chosen as white matter reference regions [18].

3. Whole brain: Whole brain (masked).

These three are common reference regions based on brain anatomy [18, 20]. These were obtained by brain segmentation, described in detail below for the specific datasets.

The final two reference regions are based on maps of the relative variance of the susceptibility maps across subjects and on subject-specific maps of the \(R_2^*\) relaxation rate:

4. RelVar: Thresholding the relative susceptibility variance map [19] has been used as a method to provide a reference region independent of anatomical segmentation.

The relative susceptibility variance map was computed according to

$$\begin{aligned} \text{RelVar}\left[\textrm{r}\right] = \frac{\text{Var}(\text{QSM})[r]}{\text{Mean}(\text{QSM})[r]} \end{aligned}$$

(13)

where [r] indicates that these are values in a voxel at position r within a study-specific susceptibility template. This study-specific QSM template was generated, according to Acosta-Cabronero et al. [19], by registering the susceptibility maps in each subject to a study-specific \(T_1\)-weighted atlas, created by co-registering \(T_1\)-weighted images of all subjects together. The rationale for this method is that voxels with a low susceptibility variance across subjects are likely to provide a robust reference susceptibility. We chose the voxels with the lowest third percentile of relative variance because that resulted in a reasonably contiguous region of white matter as described in [19]. The reference susceptibility value was then calculated as the mean susceptibility in all voxels below the chosen threshold.
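The thresholding step described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes that the relative variance is the across-subject variance divided by the across-subject mean at each template voxel, it takes the magnitude of the mean so that the ratio stays non-negative (our assumption), and it uses plain Python lists as stand-ins for flattened template-space maps. The function names and the `fraction` parameter are hypothetical.

```python
from statistics import fmean, pvariance

def relative_variance_map(subject_maps):
    """Voxel-wise variance across subjects divided by the voxel-wise mean.

    subject_maps: list of equally long lists, one flattened template-space
    susceptibility map per subject (hypothetical input layout).
    """
    relvar = []
    for voxel_values in zip(*subject_maps):
        mean = fmean(voxel_values)
        var = pvariance(voxel_values)
        # Assumption: divide by |mean| so negative susceptibilities do not
        # produce negative ratios; voxels with zero mean are never selected.
        relvar.append(var / abs(mean) if mean != 0 else float("inf"))
    return relvar

def low_variance_mask(relvar, fraction):
    """Mark the given fraction of voxels with the lowest relative variance."""
    n_keep = max(1, int(len(relvar) * fraction))
    order = sorted(range(len(relvar)), key=relvar.__getitem__)
    keep = set(order[:n_keep])
    return [i in keep for i in range(len(relvar))]

def reference_value(chi_map, mask):
    """Mean susceptibility of one map over the voxels selected by the mask."""
    return fmean(v for v, m in zip(chi_map, mask) if m)

# Toy example: three "subjects", four voxels each (made-up values in ppm).
subjects = [
    [0.10, 0.02, -0.05, 0.30],
    [0.11, 0.08, -0.01, 0.10],
    [0.09, -0.04, -0.09, 0.50],
]
relvar = relative_variance_map(subjects)
mask = low_variance_mask(relvar, fraction=0.25)
ref = reference_value(subjects[0], mask)
```

In this toy example only the first voxel, which varies least across the three maps, is selected, and the reference value for the first subject is that voxel's susceptibility.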

5. \(R_2^*\)-based: Thresholding the subject-specific \(R_2^*\) map is our novel approach to obtain a reference region independent of anatomical segmentation.

Voxels with low \(R_2^*\) values are expected to contain few (or weak) susceptibility sources and mostly liquid (isotropic) tissue, such as the CSF. Using a low relaxation rate as a cut-off therefore limits potential contamination by grey matter voxels or partial volume effects, which can occur when the CSF in the ventricles is selected using anatomical markers. We used auto-regression on linear operations (ARLO [27]) to compute the \(R_2^*\) maps, with the reference “region” defined as those voxels with \(R_2^*\) values below 4 Hz. We chose this threshold to be as low as possible while keeping a number of voxels similar to that of the more conventional, anatomical, reference regions.
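As an illustration of the \(R_2^*\)-based approach, the following sketch selects voxels below the 4 Hz threshold, takes their mean susceptibility as the reference, and subtracts it from the map. It assumes that the \(R_2^*\) map has already been computed (the ARLO fit is not shown) and that a brain mask restricts the selection; the brain mask, the function names and the list-based layout are our assumptions for the example.

```python
from statistics import fmean

def r2star_reference(chi_map, r2star_map, brain_mask, threshold_hz=4.0):
    """Mean susceptibility over brain voxels whose R2* is below the threshold."""
    values = [
        chi
        for chi, r2s, inside in zip(chi_map, r2star_map, brain_mask)
        if inside and r2s < threshold_hz
    ]
    if not values:
        raise ValueError("no voxels below the R2* threshold")
    return fmean(values)

def apply_reference(chi_map, reference):
    """Subtract a scalar reference susceptibility from every voxel."""
    return [chi - reference for chi in chi_map]

# Toy example with made-up values: susceptibility in ppm, R2* in Hz.
chi = [0.02, 0.04, 0.15, -0.03, 0.00]
r2star = [2.0, 3.5, 25.0, 18.0, 1.0]
brain = [True, True, True, True, False]  # last voxel lies outside the brain

ref = r2star_reference(chi, r2star, brain)
chi_referenced = apply_reference(chi, ref)
```

Only the first two voxels fall below the threshold inside the mask, so the reference is their mean and every voxel is shifted by that amount.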

QSM reconstruction challenge 2.0 dataset

This is a synthetic dataset, which was used to compare and evaluate QSM reconstruction methods and is now used frequently as a ground-truth dataset for developing new reconstruction methods. More than 100 different reconstructions were submitted and compared in the original publication [15]. In this dataset, we used pairwise t tests to compare the reconstructed ROI mean susceptibility values, with and without referencing, to the ground truth susceptibility values to decide whether the values were reconstructed accurately. We compared the overall accuracy of all submissions [28], and provide literature references to 11 frequently used and top-scoring algorithms in Table 1 for convenience.

Table 1 Overview of major reconstruction methods compared in this ROI-based analysis. DOIs are provided for convenience and correspond as closely as possible to the submissions, but there may be minor differences between the implementation of a method for the challenge and the original publication. For further details, we refer to the original challenge results.

The QSM challenge 2.0 ground truth ROIs are not separated per hemisphere, and some commonly investigated deep gray matter regions are not divided into sub-regions, e.g., pallidum, amygdala and hippocampus. Therefore, for this study, we re-segmented the phantom using the ground truth susceptibility maps combined with the anatomical contrast using MRCloud [29]. The relative variance was computed across all reconstructions, selecting the voxels within the brain that the reconstructions mostly “agreed” on. For the \(R_2^*\)-based ROI, the magnitude data provided with the challenge were fitted in the same way as the epilepsy data.

Anatomical test ROIs were chosen to be the same as those used in the epilepsy study: cerebrospinal fluid (CSF), corpus callosum (CC), internal capsule (IC), amygdala, caudate, pallidum, putamen, thalamus, and hippocampus.

Any inaccuracy in the QSM reconstruction of the reference region would lead to an error in the referenced ROI susceptibility values. We investigated whether such errors significantly impact the outcomes of clinical studies, and which reconstruction methods are susceptible to them.

A drawback of the synthetic QSM challenge dataset being a single “subject” is that all the reconstructions are based on the same data and, therefore, cannot be treated as independent measurements, making any group-wise statistics flawed. However, it is possible to compare the reconstructions to the ground truth, performing the same t test as described above, but considering the voxels within an ROI to be independent samples from an ROI-specific distribution (rather than ROI mean values being samples of a distribution across subjects).

Note that in these comparisons we subtracted a scalar reference value from all the test ROI mean susceptibility values, which moved the test ROI means closer or further apart but did not impact the overall distribution of the voxel values within the test ROI. A visual depiction of this can be seen in Fig. 1a.
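A small numerical sketch of this point: subtracting a scalar reference shifts the ROI mean but leaves the spread of the voxel values unchanged, so a t statistic against the ground truth changes only through the mean difference. The voxel values and the reference below are made up, the paired form of the t statistic is one possible reading of the test described above, and the ground truth values are assumed to be expressed relative to the same reference region already.

```python
from math import sqrt
from statistics import fmean, stdev

def paired_t_statistic(recon, truth):
    """t statistic of the voxel-wise differences between two ROIs."""
    diffs = [r - t for r, t in zip(recon, truth)]
    return fmean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Made-up voxel values (ppm) for one test ROI.
recon_roi = [0.11, 0.13, 0.09, 0.12, 0.10]
truth_roi = [0.10, 0.11, 0.10, 0.12, 0.08]
reference = 0.02  # hypothetical scalar reference susceptibility

referenced_roi = [v - reference for v in recon_roi]

mean_shift = fmean(recon_roi) - fmean(referenced_roi)   # equals the reference
spread_change = stdev(recon_roi) - stdev(referenced_roi)  # zero up to rounding

t_raw = paired_t_statistic(recon_roi, truth_roi)
t_referenced = paired_t_statistic(referenced_roi, truth_roi)
```

In this toy case the unreferenced reconstruction lies slightly above the ground truth and the referenced one below it, so the t statistic changes sign even though the voxel-wise spread is identical.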

Epilepsy dataset

To investigate the effect of referencing in the presence of pathology, we used a dataset of left and right temporal lobe epilepsy patients (LTLE and RTLE, respectively). The dataset, originally analysed by Kiersnowski et al. [25], consists of 27 healthy controls (HC), 19 patients with LTLE, and 17 with RTLE, with ages ranging from 16 to 67 years (range (median); HC: 16.5\(-\)55.1 (30); LTLE: 19.4\(-\)66.5 (32.9); RTLE: 21.4\(-\)67.1 (34)).

The original results were not explicitly referenced and, hence, intrinsically used referencing method 3 from above (whole brain). To validate this, we have included both unreferenced and explicitly referenced results. To investigate the effect of referencing, we therefore compared the unreferenced results with results explicitly referenced to regions 1-5 defined above. In each case, the referencing was performed on the “raw” susceptibility maps before age correction was applied. Following referencing and age correction, we assessed group-wise ROI mean susceptibility differences using an analysis of variance (ANOVA) [30]. A post-hoc Tukey-Kramer multiple comparison of the subject ROI mean susceptibilities was then used to compute the statistical significance of the between-group differences. To segment test ROIs and reference regions 1-3, GIF [31, 32, 33] was used to segment the \(T_1\)-weighted images, HippoSeg [34] was used for the hippocampus, and the cerebrospinal fluid (reference region 1) was segmented using SPM12 [35]. Anatomical test ROIs included the amygdala, caudate, putamen, globus pallidus (internal and external combined), thalamus, internal capsule and hippocampus. These were analysed separately for each hemisphere, as the pathology is primarily single-sided, and were also investigated in the original study (Fig. 3). As in the original analysis [25], all segmented ROIs were eroded by applying a spherical kernel of radius 1 voxel three times to the binary ROI mask to reduce partial volume effects, after which outliers (values below the 1st or above the 99th percentile) were removed before computing the ROI statistics (mean and standard deviation). A visual depiction of the study design can be seen in Fig. 1b.
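The outlier removal and ROI statistics at the end of this procedure can be sketched as below. This is a minimal example under stated assumptions: the erosion step is not shown, the percentile cut points are computed with the standard library's inclusive interpolation (the interpolation rule used in the original analysis is not specified here), and the function name and toy data are hypothetical.

```python
from statistics import fmean, quantiles, stdev

def roi_statistics(values, lower_pct=1, upper_pct=99):
    """Mean and standard deviation of ROI voxel values after percentile trimming.

    Values below the lower or above the upper percentile are discarded.
    """
    # quantiles(..., n=100) returns the 99 cut points between percentiles.
    cuts = quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    kept = [v for v in values if low <= v <= high]
    return fmean(kept), stdev(kept), len(kept)

# Toy ROI: 200 evenly spaced values (ppm) plus two gross outliers.
roi_values = [i / 1000 for i in range(1, 201)] + [5.0, -5.0]
roi_mean, roi_sd, n_kept = roi_statistics(roi_values)
```

Both gross outliers fall outside the percentile window and are discarded, together with the few most extreme regular values, before the mean and standard deviation are computed.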

Fig. 1

Illustration of the study design for experiments on (a) QSM challenge 2.0 simulated data and (b) data from the epilepsy study [25]
