Task sub-type states decoding via group deep bidirectional recurrent neural network

Decoding brain cognitive states from functional brain imaging data has attracted increasing attention in recent years (Cohen et al., 2017; Haynes and Rees, 2006; Jang et al., 2017; Li and Fan, 2019; Wang et al., 2020; Zhang et al., 2018). On the one hand, it is one of the essential techniques in developing brain-computer interfaces (Dornhege et al., 2007; Du et al., 2022; Jeong et al., 2020). On the other hand, it has emerged as an effective framework for disentangling the neural circuits underlying information encoding in the brain (Haynes and Rees, 2006; Kragel and LaBar, 2016; Nummenmaa et al., 2023; Tagliazucchi and Laufs, 2014; Wang, 2008).

Such a decoding framework typically builds predictive models, based on machine learning algorithms for classification or regression, to predict brain states represented by cognitive-domain labels from neural recordings and their derivatives (e.g., functional connectivity) (Cohen et al., 2017; Karahanoğlu and Van De Ville, 2017; Li and Fan, 2019; Naselaris et al., 2011; Shirer et al., 2012; Zhang et al., 2018). Multi-voxel pattern analysis (MVPA) equipped with conventional machine learning algorithms such as support vector machine (SVM) and ridge/sparse regression is one of the most popular implementations of neural decoding (Haxby et al., 2001; Tam et al., 2019; Tong and Pratte, 2012; Xu et al., 2021). Several strategies have been incorporated into MVPA to alleviate the difficulties caused by the high dimensionality of functional brain recordings, including manual selection of regions of interest (ROIs) guided by prior knowledge (Huth et al., 2016; Wang et al., 2020; Yousefnezhad et al., 2016), extra localizer scans (Walther et al., 2009) and a sliding “searchlight” that restricts the analysis to brain activity within local regions (Mumford et al., 2012; Norman et al., 2006).

Recent advances in deep neural networks (DNNs) facilitate a seamless end-to-end brain decoding framework (Du et al., 2020; Jang et al., 2017; Li et al., 2020; Qiang et al., 2022; Zhang et al., 2021), bypassing any feature selection/engineering operation. Specifically, DNNs in various architectures have been utilized for brain state decoding using functional magnetic resonance imaging (fMRI), for example, deep belief network (DBN) (Jang et al., 2017), deep sparse recurrent auto-encoder (DSRAE) optimized by neural architecture search (NAS) (Li et al., 2020), convolutional neural network (CNN) (Du et al., 2020) and graph neural network (GNN) (Zhang et al., 2021).

While existing studies have substantially advanced brain state decoding, several limitations warrant further exploration. First, the well-known temporal dependency in sequential fMRI (Fox and Raichle, 2007; Gilbert and Sigman, 2007b; Lindquist et al., 2007a) has not been fully exploited. Effective modeling of temporal dependency has proven to be one of the critical factors contributing to accurate brain state decoding (Grootswagers et al., 2017; Ye et al., 2022). For example, when the brain switches from one state to another, fMRI data are expected to capture this temporal change in the contrast or dependency between consecutive fMRI volumes. However, some of the algorithms on which brain decoding frameworks are built, for example SVM, DBN, CNN and GNN, lack the capability to model temporal dependency. Although unidirectional sequential models (e.g., unidirectional recurrent neural network, RNN) (Chung et al., 2014) have been adopted in some brain state decoding studies to model the temporal dependency (Qiang et al., 2022), their representational capability is relatively limited compared with bidirectional RNNs (Schuster and Paliwal, 1997), which process each sequence in both the forward and the backward direction. Bidirectional RNNs may therefore model the temporal dependency more comprehensively and improve the performance of brain state decoding.
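The difference between unidirectional and bidirectional recurrence can be illustrated with a toy example. The sketch below is only an illustration, not the model used in this study: it assumes a scalar Elman-style recurrence with hypothetical weights `w` and `u`, and two made-up signals that are identical up to a state switch and differ afterwards. The forward state at a time point before the switch is the same for both signals, whereas the backward state already reflects the upcoming change.

```python
import math


def rnn_pass(xs, w=1.0, u=0.5):
    """Scalar recurrence h_t = tanh(w * x_t + u * h_{t-1}), starting from h = 0.

    The weights w and u are hypothetical values chosen for illustration.
    """
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(w * x + u * h)
        states.append(h)
    return states


def bidirectional_pass(xs, w=1.0, u=0.5):
    """Pair the forward state with the state of a pass over the reversed input."""
    forward = rnn_pass(xs, w, u)
    backward = rnn_pass(xs[::-1], w, u)[::-1]
    return list(zip(forward, backward))


# Two toy signals: identical for t = 0..2, then switching to different states.
signal_a = [0.0, 0.0, 0.0, 1.0, 1.0]
signal_b = [0.0, 0.0, 0.0, -1.0, -1.0]

t = 2  # last time point before the switch
forward_a, backward_a = bidirectional_pass(signal_a)[t]
forward_b, backward_b = bidirectional_pass(signal_b)[t]

print("forward states at t=2: ", forward_a, forward_b)
print("backward states at t=2:", backward_a, backward_b)
```

In this toy setting a unidirectional model cannot distinguish the two signals at `t = 2`, while the bidirectional representation can, because its backward component summarizes the volumes that follow.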

Second, the granularity of the cognitive domains under decoding is relatively coarse. For example, the Human Connectome Project (HCP) task-fMRI dataset (Barch et al., 2013; Van Essen et al., 2013) is widely used as a benchmark in fMRI brain state decoding studies, and a decoding model is typically trained to classify brain states corresponding to the seven cognitive tasks in HCP. However, each task may include several distinct events. For example, the motor task is composed of six events, i.e., visual cue, left- and right-finger tapping, left- and right-toe squeezing, and tongue movement. Given that different task events activate distinct brain regions, an accurate brain decoding model is expected to differentiate the sub-type states corresponding to these events. In addition, most existing studies use isolated task fMRI sequences to train RNN-based decoding models. That is, a single training sample contains only fMRI volumes extracted from the fMRI sequence of one specific task, overlooking the brain activity contrast across different tasks. A training sample organization strategy that covers fMRI volumes from different tasks, together with effective modeling of task-relevant brain activity contrast, could further boost the performance of brain state decoding models.

In this study, we propose a novel group deep bidirectional recurrent neural network (Group-DBRNN) model for decoding fine-grained task sub-type states from fMRI volumetric data. In this model, the deep bidirectional recurrent neural network (DBRNN) serves as the backbone architecture to model temporal dependency in fMRI. We also propose a training sample collection strategy, namely the multiple-scale random fragment strategy (MRFS), to introduce task-relevant brain activity contrast into training samples, as well as a multi-task interaction layer (MTIL) to effectively encode this contrast. Experimental results on the HCP task fMRI dataset demonstrate that the proposed Group-DBRNN model outperforms existing methods in differentiating 23 task sub-type states in the seven independent tasks. Moreover, our interpretation of the intermediate features, through feature visualizations and quantitative assessments of their discriminability and inter-subject alignment, shows that the proposed Group-DBRNN model can effectively capture the temporal dependency and task-relevant contrast.
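To make the idea of a training sample that spans several tasks concrete, the following sketch assembles one sample from randomly placed fragments of different lengths, one fragment per task. It is an assumption-laden illustration and not the MRFS procedure itself, whose details are given in the method description: the function names, the candidate fragment lengths in `scales`, the toy task runs and the event labels are all hypothetical.

```python
import random


def sample_fragment(volumes, labels, length, rng):
    """Cut a contiguous fragment of the given length at a random position."""
    start = rng.randrange(0, len(volumes) - length + 1)
    return volumes[start:start + length], labels[start:start + length]


def build_mixed_sample(task_runs, scales, rng):
    """Concatenate one random fragment per task, in random task order.

    task_runs maps a task name to (volumes, per-volume labels).
    scales lists the candidate fragment lengths (hypothetical values).
    """
    sample_volumes, sample_labels = [], []
    task_order = list(task_runs)
    rng.shuffle(task_order)
    for task in task_order:
        volumes, labels = task_runs[task]
        length = min(rng.choice(scales), len(volumes))
        frag_volumes, frag_labels = sample_fragment(volumes, labels, length, rng)
        sample_volumes.extend(frag_volumes)
        sample_labels.extend(frag_labels)
    return sample_volumes, sample_labels


# Toy stand-ins: integers play the role of fMRI volumes.
task_runs = {
    "motor": (list(range(0, 20)), ["cue"] * 5 + ["left_finger"] * 15),
    "language": (list(range(100, 120)), ["story"] * 10 + ["math"] * 10),
}
rng = random.Random(0)
volumes, labels = build_mixed_sample(task_runs, scales=[4, 8, 12], rng=rng)
print(len(volumes), "volumes with per-volume labels, drawn from", len(task_runs), "tasks")
```

Because every sample built this way contains volumes from more than one task, a sequence model trained on it sees transitions between tasks rather than a single isolated task sequence.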

The preliminary results of this study were presented at MICCAI 2022 (Zhao et al., 2022b). The extensions made in this study are threefold. Methodologically, we have replaced the single-layer gated recurrent unit (GRU), which serves as the building block of an RNN, with a stacked GRU layer to better capture the temporal dependency in fMRI, as existing machine learning studies have shown the superiority of stacked GRU over single-layer GRU (Luo et al., 2017). Experimentally, we have performed five-fold cross-validation instead of a single train-test trial to evaluate the proposed model more comprehensively, compared the proposed model with more existing methods, and performed additional experiments to explore the impact of the model's hyperparameters. In terms of results, we provide extensive interpretations and visualizations of the intermediate features that the model learns. In addition, we have provided a deeper background and a more detailed explanation of the proposed method.
