Analysis of pre-existing neuroimaging data was approved by the University of Minnesota and Oregon Health and Science University Institutional Review boards. Participants consented (or assented when below 18 years old) at their respective collection sites for each study (ABCD21: https://abcdstudy.org and HCP-D study36,99: https://www.humanconnectome.org/study/hcp-lifespan-development/, MSC9: https://openneuro.org/datasets/ds000224/versions/1.0.4) and have agreed to have their anonymized data shared. Participants were recruited under the auspice of the ABCD study to follow brain development and health in a longitudinal manner from 9 to 10 years of age until adolescence. Written informed consent and assent were obtained from a parent or guardian and the child, respectively, to participate in the ABCD study. Behavioral analysis used previously collected behavioral measures including the NIH toolbox tasks, assessments of mental health using Kiddie Schedule for Affective Disorders and Schizophrenia (KSADS) and surveys of substance use, culture and environment from baseline protocols (https://abcdstudy.org/scientists/protocols/).
ABCD-matched groupsParticipants from ABCD were split into three demographically matched cohorts, the ARMS13,100. Groups were matched using site, age, sex, ethnicity, grade, highest level of parental education, handedness, combined family income and exposure to anesthesia. Full demographic information for all ABCD cohorts 1, 2 and 3 are described in Extended Data Table 1, with the exception of participants that were excluded. Of these cohorts, participants were excluded either because they were unable to be processed through the DCAN processing pipeline (https://github.com/DCAN-Labs/abcd-hcp-pipeline) described in Methods (typically due to poor image quality) or had fewer than 10 min of resting-state data postmotion correction.
A diagram showing which participants were used to generate probabilistic maps is shown in Supplementary Fig. 1. Brain mapping was performed for all participants with at least 10 min of resting-state data using all three brain mapping methods—IM, TM and NMF. We also used TM and IM separately on concatenated rest and task data using the same network templates used for rs-fMRI data.
MRI image acquisitionABCD MRI images were collected from 11,572 participants across 21 sites across the United States of America (Children’s Hospital Los Angeles, University of Colorado Boulder, Florida International University, Laureate Institute for Brain Research, Medical University of South Carolina, Oregon Health and Science University, University of Rochester, SRI International, University of California Los Angeles, University of California San Diego, University of Florida, University of Maryland at Baltimore, University of Michigan, University of Minnesota, University of Pittsburgh Medical Center, University of Utah, University of Vermont, University of Wisconsin-Madison, Virginia Commonwealth University and Washington University in St. Louis)21. ABCD participants were aged approximately 9–10 years at the time of collection with ~50% female (see Extended Data Table 1 for additional details). The imaging component of the study was developed by the ABCD Data Analysis and Informatics Center and the ABCD Imaging Acquisition Workgroup. No statistical methods were used to predetermine sample sizes for this manuscript, but the ABCD study is the largest fMRI study conducted in the United States to date and is likely to be sufficiently powered to capture and analyze different patterns of substance use along with many other variables of interest20,101 and have been used by other studies to examine brain–behavior relationships14. Neuroimaging and behavioral data were collected in accordance with local institutional review boards at each institution.
Sequences were harmonized across Siemens, Philips and GE 3 Tesla (3T) scanners. For further details regarding MRI acquisitions, see refs. 21,43. Briefly, participants underwent 25–45 min of prescan task compliance, localizer, 3D T1-weighted MRI (1 mm isotropic, TR = either 2,500 or 6,100 ms, TE = 2–2.9 ms, 8o flip angle, 256 × 256 field of view (FOV)), diffusion-weighted images, 3D T2-weighted MRI (1 mm isotropic, TR = 2,500 or 3,200 ms, TE = 60–565 ms, variable flip angle, 256 × 256 FOV), 1–2 runs of rs-fMRI (2.4 mm isotropic, TR = 800 ms, TE = 30, variable flip angle = 52o, 216 × 216 FOV) and a randomized order of monetary incentive delay (MID), stop signal task (SST) and emotional n-back (EN-back) tasks.
Of the original 11,572 participants from the ABCD 2.0 release20, participants were divided into discovery (n = 5,786) and replication (n = 5,786) sets that were matched along the following ten variables: site location, age, sex, ethnicity, grade, highest level of parental education, handedness, combined family income and exposure to anesthesia93 (Extended Data Table 1). All resting-state scans were acquired using a gradient-echo, echo-planar imaging sequence (TR = 800 ms, TE = 30 ms, flip angle = 90°, voxel size = 2.4 mm3, 60 slices). Head motion was monitored in real time using Framewise Integrated Real-time MRI Monitor (FIRMM) software at Siemens sites102. For resting-state scans, participants were instructed to lie still and fixate on a crosshair at the center of their visual field.
All functional MRIs were processed with the publicly available ABCD-BIDS pipeline (https://github.com/DCAN-Labs/abcd-hcp-pipeline), which is a modified version of the HCP processing pipelines13. Brain extraction was performed by PreFreesurfer after denoising and bias field correction of the anatomical T1- and/or T2-weighted images. The DCAN-labs processing pipeline applies advanced normalization tools (ANTs) DenoiseImage to improve structural clarity and ANTs N4BiasFeildCorrection (ANTs)103,104 to reduce field bias93.
Resting-state time course processingSignal regressionTime courses were corrected using DCAN-BOLDproc13. The method for signal regression has been previously described65. Briefly, resting-state time courses (using surface registration for cortex and volume registration for subcortical gray matter) were detrended and further processed105 using mean whole brain, ventricle and white matter signal as well as displacement on the 6 degrees of freedom, rigid body registration, their derivatives and their squares by regression106,107. Finally, time courses were filtered using a first-order Butterworth band pass filter between 9 and 80 mHz backward and forward using MATLAB’s filtfilt function (The MathWorks, v2016-2018x).
The BOLD fMRI volumetric data from the cerebral cortex were constrained to the cortical sheet for surface-based imaging108 and combined with volumetric midbrain and hindbrain time courses into a Connectivity Informatics Technology Initiative (CIFTI) format. Once BOLD data were mapped to the sheet, time courses were deformed and resampled to the original surface.
Head motion correctionHead movement in the scanner interferes with the ability to identify a grayordinate from one time point to the next, and the movement of a large electrically conductive tissue in a magnet introduces contaminating artifacts from eddy currents. To minimize these effects, we rigorously controlled for head motion by using an FD threshold of 0.2 mm and only using participants with at least 10 min of resting-state data postmotion correction. Movement was calculated by FD in mm using the formula FDi = ∣Δpix∣ + ∣Δpiy∣ + ∣Δpiz∣ + ∣Δαi∣ + ∣Δβi∣ + ∣Δγi∣, where Δpix is the frame-to-frame change in the x position, p: Δpix = p(i−1)x − pix, and so forth for the other rigid body parameters (pix, piy, piz, αi, βi, γi)109. Rotational displacements were converted from degrees to millimeters by calculating displacement on the surface of a sphere with a 50 mm radius, which is approximately the mean distance from the cerebral cortex to the center of the head. Frames were removed if their total relative movement in any direction (FD) was greater than 0.2 mm relative to the previous frame or if they were contained within a segment of five contiguous frames that violated the threshold.
For the remaining frames, the s.d. was calculated across all grayordinates to remove potential artifacts. Frames that had outliers in the s.d. of the bold signal were removed using the median absolute deviation method in MATLAB and Statistics Toolbox Release 2016b-2022b (The MathWorks). In time courses containing more than 10 min of resting-state data, frames were randomly sampled to generate correlation matrices using exactly 10 min of fs-MRI data. Of the 11,572 participants enrolled, 10,079 in groups 1 and 2 had usable structural and functional MRI collected, and of these based on our movement/signal criteria, approximately 3,973 (~40%) children were excluded based on excessive movement in the scanner during resting-state scans, resulting in 2,995 (group 1) and 3,111 (group 2) participants. During task-based scans, we were able to retain many more usable frames at the FD criteria (group 1, n = 4,699 and group 2, n = 4732), excluding only 607 participants (6%).
IM community detection methodThe community detection method using the graph theory-based algorithm IM has been previously described7,9. The same correlation matrices that were used in the TM processes were used to detect networks using IM. Briefly, vertices/voxels within 30 mm of each other were set to zero in the matrix to avoid biasing network membership for nearby connections that had undergone spatial smoothing. The resulting correlation matrix was then thresholded at a range of density thresholds (0.3%, 0.4%, 0.5%, 1%, 1.5%, 2.0%, 2.5% and 3.0%) and each one was used as an input for IM. For instances where IM was implemented on combined cortical and subcortical data (data shown in Fig. 3c and the average group matrix shown in Supplementary Fig. 6), we extended the range of density thresholds to include 4% and 5%. IM calculates the network assignment based on an optimized code length using a flow-based method31,41. Networks that are computed in the group average are labeled based on similar patterns of activation observed in the scientific literature9,23. Small networks with 400 or fewer grayordinates were defined as ‘unassigned’.
Networks identified in each individual were then labeled based on the Jaccard similarity index to a network observed in the group average; however, often individuals will retain new networks that are not observed in group averaging, and these remain unlabeled. The list of networks included is the DMN, the visual network (Vis), the frontal-parietal network (FPN), the dorsal attention network (DAN), the ventral attention network (VAN), the salience network (Sal), the cingulo-opercular network (CO), the sensorimotor dorsal network (SMd), the sensorimotor lateral network (SMl), the auditory network (Aud), the anterior medial temporal network, the posterior medial temporal network (post-MTL), parieto-occipital network (PON) and the parietal medial network (PMN)9,22. In each participant and in the average, a ‘consensus’ network assignment was determined across the various thresholds, by giving each node the assignment it had at the sparsest possible threshold at which it was successfully assigned to one of the known group networks (Supplementary Fig. 2). Contiguous network clusters that were smaller than 30 grayordinates were removed and merged into neighboring networks, with the largest networks given priority.
Similar to how a letter in the United States can be addressed to a house with a two-level description state/province, then city, brain network organization can be described using a two-level system of networks and nodes, respectively. IM is a network-describing algorithm that tries to minimize the number of bits (using Huffman coding) necessary to describe the whole network31,110. For example, would it require fewer bits to describe the whole brain with few networks containing many nodes, or many networks with fewer nodes? IM uses a random walk algorithm that uses connection weights to determine the minimum descriptor code length necessary to describe the structure. Notably, while the solution provides modules, it is not designed to maximize modularity. As others have done previously9, we thresholded the correlation matrix to the top x% of connections (or edges) because of the computational limitations of using a full set of 4.1 billion connections as descriptors in the map equation. We thresholded the connectivity matrix at a threshold of 0.3%, 0.4%, 0.5%, 1%, 1.5%, 2%, 2.5% and 3%. These thresholds were chosen to scale the number of edges in proportion to edge densities that have been used previously9.
To generate a consensus across multiple edge densities, we implemented a methodology developed by Gordon et al. 9. Briefly, after IM detected communities for each participant, putative network assignments were then assigned to each participant’s communities by matching them at each threshold to the independent group networks from WashU (n = 120). To do this, for each individual, at each density threshold, the spatial overlap of each unknown community was compared to each one of the independent group networks separately using the Jaccard similarity index. The unknown community was then assigned that network identity to which it had the highest Jaccard similarity index. If the Jaccard Index was less than 0.1, the community remained unassigned, so as to avoid assigning communities to known networks based on only a few vertices. Assignments were first made with the large, well-known networks (default, lateral visual, motor, frontoparietal, cingulo-opercular and dorsal attention) and then to the smaller, less well-known networks (ventral attention, salience, parietal memory, parieto-occipital, temporal pole, medial temporal). In each individual, a ‘consensus’ network assignment was created by giving each grayordinate the assignment it had at the sparsest threshold at which it was successfully assigned to one of the canonical group networks.
TM methodMultiple versions of the time series were used depending on the analysis—either exactly 10 min of randomly sampled frames, all available frames below the FD threshold, or concatenated rest and task data in the following order: rest, MID, n-back and SST (provided that the participant had an available scan for the task). To generate the templates, IM community detection was performed at several tie densities (for full details of average networks, see refs. 9,23) on an average connectivity matrix (n = 120 participants) using a two-level solution. This yielded 14 networks that include the DMN, the Vis, the FPN, the DAN, the VAN, the Sal, the CO, the SMd, the SMl, the Aud, the temporal pole network (Tpole), the MTL, the PON and the PMN. Sensory and motor systems were combined due to the coupled nature of activation. Despite high reproducibility in resting-state functional connectivity, the extent to which these networks are activated on a neuronal time scale is unclear. However, recent work discussed in ref. 58 suggests that the contribution of short-term dynamic changes (for example, from task-based states) to variation in brain organization is quite modest relative to resting-state organization.
A graphical description of the TM method is shown in Supplementary Fig. 3. Gordon et al. 9 generated single network assignments using IM on a group average dense connectivity matrix from a cohort of 210 adults. The parcellation was used to anatomically define networks for each participant and create seed-based correlations for each network in all participants in the template group (n = 164 ABCD-group 3 participants). Seed-based correlation maps were averaged across the participants in the template group for each network separately (Supplementary Fig. 3a). Each template was then thresholded to correlation values >z score = 1 (~top 15.9% of connections). To perform TM in the group 3 participants, we first generated whole-brain connectivity matrices. Here we show an example of the whole-brain connectivity of a grayordinate within the PCC. We then thresholded the connectivity for each grayordinate in the same manner as the template and calculated an η2 value for each network. Each grayordinate is then assigned to the network based on the maximum η2 value (Supplementary Fig. 3b).
To generate an independent template, we conducted a seed-based correlation (using an average time series correlated to all the grayordinates) for all networks. Seed-based correlations were generated using the dense time series from each template participant that was smoothed with a within-frame spatial Gaussian smoothing kernel of 2.55 mm using each participant’s own mid-thickness surfaces (extracted from the Surf stage of Freesurfer). The resulting networks were converted to a dlabel CIFTI file and applied to the smooth dense series to generate an average time series for each network. We then correlated the time series of the seed with the time series of all other grayordinates. The seed and remaining time series were motion-censored using an FD of 0.2 mm, and outliers in the BOLD signal were removed using the median absolute deviation in the remaining frames using the motion-censoring method outlined above.
Seed-based correlation values were averaged across all the participants in the template group (n = 164, 9–10-year-olds), resulting in a vector (91,282 × 1) of average correlation values for each network correlated with each grayordinate. Each network vector was averaged independently across participants in the template group to generate seed-based templates for each network. We then thresholded each network template at z ≥ 1.
To generate precision maps for each participant in ABCD-group 1 and ABCD-group 2, we examined the whole-brain connectivity for each grayordinate by correlating the dense time series against all other grayordinates. For each participant in each test group (group 1, n = ~5,000 and group 2, n = ~5,000), we generated a Pearson correlation matrix (91,282 × 91,282 grayordinates) for each connection using the dense time series using the Connectome Workbench command ‘-cifti-correlate’ (https://www.humanconnectome.org/software/connectome-workbench). Time series were then motion-censored (‘Head motion correction’) to reduce artifacts induced by head motion.
Because connectivity matrices were generated including subcortical brain regions, the correlation matrix was z-scored separately for each hemisphere, the subcortical region, and the connections between the cortex and the subcortex. This allowed for the normalization of connectivity between subcortex and cortex where there is the potential for a decreased SNR in the subcortex. We thresholded the whole-brain connectivity for each grayordinate to only include correlated grayordinates with z-score values greater than or equal to one. This resulted in a vector of whole-brain connectivity for each grayordinate that only includes grayordinates that are strongly correlated to a given network template. We then calculated an η2 value between the remaining grayordinates and each of the network templates seen in Supplementary Fig. 4. The grayordinate is assigned to whichever network with the maximum η2 value.
OMNI mapping methodTo generate overlapping networks for each participant, rather than assigning the grayordinate to the network with the maximum η2 value, we used a data-driven approach to assign multiple networks to each grayordinate. For each network, we plotted the distribution of η2 values (Fig. 5c). The connectivity for each network demonstrates a characteristic skewed bimodal distribution. The distribution for η2 values was distributed into 10,000 bins and fitted with a cubic spline. The distribution was then smoothed using a Savitzky–Golay filter using a 2,000 data point window within MATLAB (The MathWorks, v2016-2022x). We calculated the local minimum of the bimodal distribution by taking the derivative of the smoothed data between 4,000 and 7,000 bins. We then used this local minimum as the threshold for whether or not a grayordinate would be labeled with this network, where grayordinates above this threshold would receive the network assignment. Grayordinates that had an η2 value higher than the threshold were assigned to those networks (Fig. 5d). Notably, unlike the winner-take-all approach, this method does not require an assigned network at each grayordinate, whereby grayordinates that do pass this thresholding and are not preferentially linked to any given network will go unassigned. In Supplementary Fig. 7a, an example of overlapping networks using OMNI mapping is shown for an ABCD participant with 10 min of low-motion resting-state data. Because each grayordinate can belong to multiple networks, we used the ten ABCD participants mentioned above to calculate NMI independently for each network (Supplementary Fig. 7). Networks that have larger topographical variability among the population (for example, the frontoparietal network) are those that had larger difference in NMI between intraparticipant split halves compared to the null distribution, indicating topographical specificity (Supplementary Fig. 7b, yellow distribution versus black distribution). Probabilistic mapping of overlapping networks for each group revealed that these probabilistic maps were reliable (see Fig. 6 and Supplementary Fig. 11 for all the network maps).
OMNI mapping allowed us to identify regions of network overlap and integration zones. One might be tempted to interpret the observed integration zones as a byproduct of volumetric averaging due to the limited volumetric resolution of rs-fMRI (Supplementary Fig. 12d–f). rs-MRI is generally collected with 3–4 mm resolution to optimize the SNR when using a 3T scanner111. There is evidence to suggest that smaller voxels produce a higher SNR and stronger BOLD effects at high fields such as 7T, which, at least within the motor system, can significantly affect the estimate of intervoxel correlation112. Newton and colleagues112 demonstrated that BOLD imaging at very high spatial resolution (1 × 1 × 2 mm) allows for improved functional connectivity analyses, allowing them to distinguish the intricacies of the sensorimotor network (as defined by a finger tapping task compared to rest) in resting-state functional connectivity maps, which may be attributable to due to decreased partial volume averaging. Any voxel size larger than a single hemodynamic unit (a neuron, corresponding capillaries and supporting astrocytes) is going to be susceptible to volumetric averaging. However, while volumetric averaging resulting from our collection resolution (2.4 × 2.4 × 2.4 mm) does occur, there are several reasons why it is still likely that neurons residing at the boundaries between networks are important for integration.
First, integration zones appear to be in generally similar locations across the population. If volumetric averaging contributed to the overlapping integration that we’ve observed, then we would expect them to exist indiscriminately near the boundaries of all networks. Instead, what we observe is that integration zones are present at relatively similar network intersections across participants. Second, the location of the integration zones closely corresponds to hubs with regions that are either highly metabolically active47, relay information between nodes85 or process multimodal information113,114, which supports the hypothesis that these regions are likely integrating information from multiple networks.
Furthermore, discrete network boundaries such as those shown in Supplementary Fig. 12b do not preclude neurons at the interface boundary from communicating with one another. On the contrary, they reinforce the boundary through persistent internetwork communication. Techniques that implement boundary mapping are predicated on the observation that resting-state functional connectivity patterns can abruptly change from one cortical region to an adjacent cortical region, which often reflects the abrupt changes in cytoarchitectonics in the cortex in nonhuman primates5,115. Few studies have examined cross-network communication at the resolution necessary to capture the nuances of integration between networks. However, in one study of adjacent brain regions, Carmichael and Price identified two distinct networks within the macaque orbital and medial prefrontal cortex using retrograde and anterograde tracers87. Although the regions’ networks were clearly distinct, they were highly interconnected at the boundary region between them87,116.
Probabilistic mapsProbabilistic maps were generated separately for each group, method and network. Probabilistic maps were generated by calculating the probability that a grayordinate was assigned to a given network using all the participants within the group. The TM ROI set was generated by converting clusters produced by thresholding the probability maps at 0.8 (excluding clusters smaller than 30 grayordinates), converting them to dlabel files (dlabel.nii) and combining ROIs into one combined probabilistic parcellation. Probabilistic dlabel files are available for the combined networks and each network separately from the MiDB Precision Brain Atlas webpage: http://neuroatlas.org. We performed an additional analysis with participants separated by site. When we correlated probabilistic maps between sites, we observed that probabilistic maps were nearly identical (Supplementary Fig. 16).
NMF community detection methodWe implemented a community detection technique used previously8 to decompose non-negative participant-specific functional networks using their corresponding concatenated rest + task dense time series in a constrained manner using three regularized terms32,46. Briefly, a voxel-wise group sparsity regularization term was first used to ensure that a group consensus was used as a prior using group 3. Second, the spatial locality regularization term was used to ensure that functional coherent voxels are encouraged to reside in the same functional network. Finally, a within-participants regularization term was used to eliminate redundant functional networks8,32. The weights from the consensus were then applied to each of the time series for participants in ABCD-group 1 and ABCD-group 2.
Analysis of minutes necessary for reliable communities using split halvesWe calculated the similarity between split halves for an individual by splitting the resting-state time series in half and generating a correlation matrix of all grayordinates from each half using exactly 10 min of randomly sampled frames. Each correlation matrix was used as an input to both the TM algorithm and IM algorithm to generate networks for each half (‘TM method’). We then calculated the NMI between halves (https://github.com/MidnightScanClub/MSCcodebase). To create the null distribution, we calculated the NMI between an individual participant’s half and all other halves in the group set for all participants. The difference between the test (self) and null (other) distributions was assessed using an independent two-sample t test with unequal variance.
Brain–behavior associations using subset reliabilityTo assess the reliability of a probabilistic parcellation schema, we conducted a split-group subset reliability association analysis. We randomly sampled participants from group 1 at discrete sample sizes and correlated each corresponding element of the matrix to measure reliability against the participants’ behavioral measures. For each analysis, we quantified the correlation between each participant’s behavioral measure and (1) the Gordon connectivity matrix, (2) the probabilistic parcellation connectivity matrix or (3) the integrative zone. The resultant correlation matrix for each subset was then correlated to the correlation matrix made from all the participants in group 2. To calculate a nonlinear regression estimate across sample sizes, we then fitted a curve through the data points using an exponential rise-to-maximum single 3-parameter estimate (SigmaPlot 12.5 (Systat Software)) with the following equation:
where y is the correlation, y0 is the y-intercept, a is a scaling parameter, b is the rate of rise to maximum and x is the number of participants. All regression parameter fits were significant (P < 0.0001) and were highly correlated with the data (PC1 Gordon: r2 = 0.8045, TM: 0.8540; PC2: Gordon = 0.7181, TM = 0.6260; PC3: Gordon = 0.8086, TM = 0.6999). The coefficients for the curves for each model are provided in Supplementary Table 5. To ensure that the increase in intergroup reproducibility observed with the integration zones was not simply due to the reduced number of ROIs, we conducted an additional subset reliability analysis, where we randomly sampled 80 ROIs from the Gordon parcellation (the same number of ROIs in the TM region set) at each of the various participant subsets (Fig. 4d–f). Neural network graphs can be mapped at multiple scales, and the reliability when resolving network assignments at those scales can vary depending on the dataset and the community detection algorithm117. However, thresholding by population-level network probabilities ultimately yields different numbers of ROIs, but more importantly, this ROI set is not based on thresholding a graph (as is done for IM), so assumptions about scale/resolution may not apply to this region set. Because the number of regions differs between the MIDB probabilistic parcellation and the Gordon parcellation, the within-network connectivity was not normally distributed, so for every network, equal variance was not assumed for these statistical tests.
Data requirements for network specificityAn open question in the field of neuroimaging is ‘What amount of resting-state data is required to draw reliable conclusions about an individual’s connectome?’. Some estimates examining split-half reliability of connectivity matrices have demonstrated that upwards of 30 min of low-motion BOLD data are necessary11. We performed a split-half reliability analysis for network maps generated in ten participants from the MSC dataset, who underwent 5 h of rs-fMRI (in addition to task collection)9,118. We split the resting-state scans into interleaved halves, generated networks from each half as described in Methods and calculated the NMI between networks generated from halves of the within versus between participants (identical to the analysis shown in Fig. 3b). As with the ABCD dataset, the NMI of networks generated from the same MSC participants was significantly higher than networks from different participants. In Supplementary Fig. 5, the range of same-participant NMIs is shown in a blue box (0.527–0.648) and the range of null NMI (from comparing different participants) is shown as a gray box (0.314–0.378). Interestingly, the average intraparticipant NMI from MSC participants was higher than ABCD participants (MSC: 0.584 versus ABCD: 0.4214), suggesting that random sampling from longer/multiple sessions may produce more reliable network maps.
In addition to the split halves analysis, we also compared the similarity of networks generated from the second half of a participant’s data (average of 71.28
Comments (0)