Emotion recognition based on microstate analysis from temporal and spatial patterns of electroencephalogram

1 Introduction

In recent years, affective computing has become an emerging direction in the field of brain-inspired intelligence. Researchers aim to enable intelligent systems to recognize, perceive, infer and interpret human emotions (Poria et al., 2017), and aspire to develop “emotional machines” with human-like emotions. Emotion is a complex psychological state. Psychologists have proposed several typical theories to model human emotion: the basic emotion model, the dimensional emotion model and the constructed emotion theory. Ekman believed that human beings have six fundamental discrete emotions: sadness, joy, fear, anger, surprise, and disgust (Ekman and Friesen, 1971). The most widely used dimensional model is the circumplex model of affect proposed by Russell and Barrett (1999), which uses only the valence and arousal dimensions to model emotions. The theory of constructed emotion proposed by Barrett (2017) holds that emotions should be modeled holistically, as whole brain–body phenomena in context. The theory views emotions as constructions of the world, rather than reactions to it. In the field of cognitive neuroscience, event-related potential (ERP) components with short (N100 and P100) to medium (N200 and P200) latency have been shown to correlate with valence, whereas medium to long latency components (P300 and late positive potential) have been shown to correlate with arousal (Hajcak et al., 2010; Kim et al., 2013). Neuroimaging studies with positron emission tomography (PET) and functional magnetic resonance imaging (fMRI), as reviewed in Phan et al. (2002), have shown that the medial prefrontal cortex, the anterior cingulate, the amygdala and the insula are essential brain areas in emotional information processing; that sadness is associated with activity in the subcallosal cingulate; and that the occipital cortex and the amygdala are activated by visual emotional stimuli.

Emotion recognition is one of the core topics in the field of affective computing, aiming to detect the emotional state of human beings from subjective experiences, neurophysiological signals, and external emotional expressions (Alarcao and Fonseca, 2019). Among the commonly used neurophysiological signals, electroencephalography (EEG) has been widely used in the field of emotion recognition due to its excellent time resolution (millisecond level) and non-invasiveness. There are usually two strategies for EEG emotion recognition: step-by-step machine learning and end-to-end deep learning (Zhang et al., 2020). The step-by-step machine learning strategy mainly involves three steps: EEG data acquisition and preprocessing, feature extraction, and machine-learning-based classification. Generally, features from EEG can be divided into time domain, frequency domain, time-frequency domain and spatial domain features. The time domain features, such as statistical features and entropy features, can capture the dynamic characteristics and temporal variation trends of non-stationary EEG signals (Nawaz et al., 2020). The frequency domain features describe the periodic characteristics of EEG signals, including differential entropy (Zheng et al., 2019), power spectral density (Li X. et al., 2019) and so on. The commonly used feature extraction methods in the time-frequency domain include wavelet transform (Subasi et al., 2021), empirical mode decomposition (EMD) (Mert and Akan, 2018) and so on, which combine the temporal and spectral information of EEG. Besides, common spatial pattern (CSP) (Hu et al., 2022) and hierarchical discriminant component analysis (HDCA) are popular feature extraction methods which focus on the relationship between electrodes and specific brain regions. In order to describe emotion in a more comprehensive way from different perspectives, researchers usually combine various feature extraction strategies to improve the performance of emotion recognition (Li et al., 2018).

With the wide application of deep learning strategies, the accuracy of EEG-based emotion recognition has been steadily increasing. Zhang et al. (2020) investigated the application of several deep learning models to EEG-based emotion recognition, including deep neural networks (DNN), convolutional neural networks (CNN), long short-term memory (LSTM), and a hybrid model of CNN and LSTM (CNN-LSTM). The results showed that the hybrid CNN-LSTM model achieved the highest accuracy of 94.17% on the raw DEAP dataset. Recently, graph neural networks (GNN), which regard EEG signals as graph-structured data and extract high-level spatiotemporal information from EEG, have shown excellent performance in EEG emotion recognition (Zhang et al., 2022; Pan et al., 2024). Besides, some deep learning training strategies, such as domain adaptation (He et al., 2022) and transfer learning (Li J. et al., 2019), are highly favored, especially in cross-subject EEG emotion recognition.

These previous studies using time and frequency domain features have achieved great success in EEG-based emotion recognition. However, these features mainly reflect the characteristics of localized brain activities, failing to describe the global working mode of the brain during the affective process. In addition, EEG is a non-stationary and fast-changing voltage signal, which results in dramatic and rapid changes in features extracted from EEG, whereas emotion states change gradually and gently (Chen et al., 2021). Most existing feature extraction methods ignore these differences between emotion and EEG signals. On account of these aspects, we propose a feature extraction method based on EEG microstates for emotion recognition, which can capture the temporal and spatial dynamics of EEG from a global perspective.

The microstate analysis technique is based on the clustering of scalp topographic maps. It has been proven effective in capturing the rich spatial–temporal information in EEG signals and can reflect the global functional network activity of the brain (Khanna et al., 2015; Michel and Koenig, 2018; Tarailis et al., 2023). Lehmann et al. (1987) showed that the time series of scalp potential topographic maps of spontaneous EEG does not change continuously or randomly over time, but remains stable within a certain period, typically ranging from 80 to 120 milliseconds, followed by an abrupt transition into a new configuration which then regains stability (Michel and Koenig, 2018). The scalp electric potential can reflect the instantaneous state of global activity of the underlying brain functional network, and the changes in topographical configuration indicate the transformation of the global cooperation mode of the brain functional network. The stages at which these topographic maps remain in a stable state are called “functional microstates” (Pascual-Marqui et al., 1995; Lehmann et al., 1998; Khanna et al., 2015), which reflect the basic steps of information processing in the human brain.

The key challenge of utilizing microstate analysis to study EEG signals in emotional states is how to determine the optimal number of microstates. In resting-state EEG signals, despite the different clustering algorithms and datasets, researchers commonly identify four clusters (i.e., microstates). These four microstate categories exhibit highly similar configurations across studies (Michel and Koenig, 2018; Tarailis et al., 2023). Thus, many studies tend to fix the number of microstates at four to keep consistent with previous studies.

However, since the EEG signal in the emotional state contains the dynamics of emotion and other emotion-related cognitive processes, it is more complex than the EEG in the resting state. It is therefore necessary to combine various optimization criteria to determine the optimal number of microstates quantitatively for emotional EEG. Commonly used optimization criteria in resting EEG include global explained variance (GEV), the cross-validation (CV) criterion, the dispersion criterion, the Krzanowski-Lai (KL) criterion, and the normalized KL criterion (Murray et al., 2008; Michel and Koenig, 2018; Poulsen et al., 2018). GEV represents the proportion of the data that can be explained by all microstate classes and is used to evaluate the quality of clustering. However, a compromise between clustering quality and data reduction is needed when using the GEV criterion. Dispersion is a measure of intra-cluster similarity, but it cannot be used with polarity-invariant clustering methods such as modified K-means. Both the KL criterion (Krzanowski and Lai, 1988) and the normalized KL criterion are essentially methods for finding the “elbow” of the dispersion curve. The “elbow” refers to the point of highest deceleration, beyond which adding one more microstate no longer noticeably improves the quality of the results (Murray et al., 2008; Poulsen et al., 2018). Inspired by the GEV and KL criteria, we propose a KLGEV criterion to address the core problem of determining the optimal number of microstates in emotional EEG. The core idea of the KLGEV criterion is to find the “elbow point” (L-corner) of the GEV curve, in other words, the inflection point between the rapidly growing period and the flat period of the GEV curve.

The work in this paper mainly includes the following three aspects: (1) We proposed a KLGEV-criterion-based microstate analysis method, built on the GEV and KL criteria, which can automatically and adaptively determine the optimal number of microstates in emotional EEG signals, so as to explore the global working mode of the brain during the occurrence and evolution of emotion. Extensive experiments were carried out on two public emotional datasets. (2) We introduced two microstate spatial parameters (Poulsen et al., 2018) in addition to the five commonly used temporal parameters. These parameters were used as feature sets for emotion recognition on two benchmark datasets, SEED and DEAP, yielding good performance. (3) We performed statistical analysis on the seven microstate parameters to investigate the spatiotemporal dynamic characteristics of EEG signals under different emotional states. The results partially revealed the specific neurophysiological significance of microstates during the emotional cognitive process and broaden our knowledge of the functional interpretability of microstates. The schema of the present study is shown in Figure 1.

Figure 1. The schema of the study. (A) Spatial clustering of topographic maps across subjects using the proposed KLGEV-based K-means clustering algorithm. (B) Construction of emotional dynamic microstate sequences. (C) Microstate temporal and spatial feature extraction from the microstate sequences (Here we take the DEAP dataset as an example). (D) Statistical analysis of microstate features to characterize spatiotemporal dynamics under different emotional states. (E) Emotion recognition with microstate features on the SEED and DEAP datasets.

2 Materials and methods

2.1 Electroencephalogram datasets and preprocessing

Two public emotional EEG datasets were used for microstate analysis: the SJTU Emotion EEG Dataset (SEED) (Duan et al., 2013; Zheng and Lu, 2015) and the Database for Emotion Analysis using Physiological Signals (DEAP) (Koelstra et al., 2012).

2.1.1 Dataset 1: the SJTU emotion electroencephalogram dataset

The SEED dataset contains EEG data of 15 subjects recorded while they were watching different types (positive, negative, and neutral emotions) of film clips. The EEG was continuously recorded with a 62-channel ESI NeuroScan System at a sampling rate of 1,000 Hz. Each subject performed the experiment three times with an interval of about one week, for a total of 45 sessions. Each session consists of 15 trials, in which subjects were asked to watch a film clip lasting about four minutes. We visually inspected the EEG data from all 45 sessions and removed 5 sessions that contained heavy noise and artifacts. As a result, 40 sessions in SEED were used for subsequent processing and analysis.

A standard preprocessing pipeline was conducted for artifact removal. Firstly, we applied a bandpass filter of 1–45 Hz for the desired frequency range and a notch filter of 48–52 Hz for power line noise removal to each session of EEG data. Secondly, the filtered EEG data were common average referenced. Thirdly, the EEG was down-sampled to 200 Hz. Finally, we removed the artifacts from the eyes and muscles using independent component analysis (ICA).
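The first three steps of this pipeline can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the filter family and order (fourth-order Butterworth, zero-phase), the notch quality factor, and the function name `preprocess` are all hypothetical choices, and the ICA step is omitted because it requires manual or tool-specific component selection.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, iirnotch, filtfilt, resample_poly

def preprocess(eeg, fs=1000, fs_out=200):
    """eeg: array of shape (n_channels, n_samples), sampled at fs Hz."""
    # 1) Band-pass 1-45 Hz (assumed: 4th-order Butterworth, zero-phase).
    sos = butter(4, [1.0, 45.0], btype="bandpass", fs=fs, output="sos")
    x = sosfiltfilt(sos, eeg, axis=-1)
    # 2) Notch centred at 50 Hz; Q = 50 / 4 gives roughly a 48-52 Hz stop band.
    b, a = iirnotch(50.0, Q=12.5, fs=fs)
    x = filtfilt(b, a, x, axis=-1)
    # 3) Common average reference: subtract the mean over channels per sample.
    x = x - x.mean(axis=0, keepdims=True)
    # 4) Down-sample by an integer factor (1000 Hz -> 200 Hz is a factor of 5).
    factor = fs // fs_out
    return resample_poly(x, up=1, down=factor, axis=-1)
```

With `fs=1000` and `fs_out=200`, a 5-second recording of 5,000 samples per channel is reduced to 1,000 samples per channel.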

2.1.2 Dataset 2: database for emotion analysis using physiological signals

The DEAP dataset contains EEG data of 32 subjects recorded while they were watching music video clips. The EEG was collected with a 32-channel Biosemi ActiveTwo system at a sampling rate of 512 Hz. Each experiment consists of 40 trials, in which subjects were asked to watch a one-minute music video clip and fill out a self-assessment mood scale after watching. Each video is scored on the dimensions of arousal and valence, which are rated on a continuous scale ranging from 1 to 9.

Since the EEG data from subjects No. 1–22 and No. 23–32 in the DEAP dataset were collected under different hardware conditions, only No. 1–22 were selected in this study to exclude the influence of different experimental conditions. In addition, the EEG data of No. 1–22 were visually inspected, and two subjects (8 and 17) with heavy noise and artifacts were removed. As a result, we used the EEG data from 20 subjects for further processing and analysis. The preprocessing procedure for DEAP is the same as that for SEED, except that the data were down-sampled to 256 Hz after common average referencing.

2.2 The proposed KLGEV-criterion-based microstate analysis

Based on the modified K-means spatial clustering algorithm, we proposed a KLGEV-criterion-based microstate analysis method, which can automatically and adaptively determine the optimal number of microstates in emotional EEG signals. The proposed method was used to construct the microstate time series, so as to capture important spatiotemporal dynamics of EEG signals during the affective process.

2.2.1 Global field power

EEG microstates are defined as successive short time periods (or stages) during which the configuration of the scalp potential field remains semi-stable (Michel and Koenig, 2018). Before clustering of the original topographic maps, the global field power (GFP) at each time point in the EEG signal is calculated. The scalp potential maps at the peak point of GFP curves are used as the original maps of the spatial clustering algorithm. GFP is calculated as follows:

$$\mathrm{GFP}_n = \sqrt{\frac{\sum_{i=1}^{C}\left(v_{in}-\bar{v}_n\right)^2}{C}} \tag{1}$$

where $C$ represents the number of electrodes, $v_{in}$ is the measured voltage of a specific electrode $i$ at sampling point $n$, and $\bar{v}_n$ is the average voltage of all $C$ electrodes at the respective sampling point $n$.

Mathematically, GFP equals the root mean square across the average-referenced electrode values at a given instant in time, i.e., the standard deviation of all electrodes at a given time. GFP provides a single and reference-independent measure of response strength of topographic maps (Lehmann and Skrandies, 1980). The local maxima of the GFP curve are considered to have stable topological configuration and high signal-to-noise ratio, whereas topographic maps with low GFP tend to have low signal-to-noise ratio, which means the topographical configuration is changing from one to another (Murray et al., 2008). As a result, only the topographic maps at the GFP peak point are selected as the original maps for the spatial clustering algorithm.
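As a minimal sketch of this step, the snippet below computes GFP as the standard deviation across electrodes and picks its local maxima. The function names and the strict-neighbor definition of a peak are assumptions made for illustration; the paper does not specify how peaks are detected.

```python
import numpy as np

def global_field_power(eeg):
    """GFP per sample (Equation 1). eeg has shape (n_channels, n_samples)."""
    centered = eeg - eeg.mean(axis=0, keepdims=True)  # remove the channel mean
    return np.sqrt(np.mean(centered ** 2, axis=0))

def gfp_peak_indices(gfp):
    """Indices of samples whose GFP is strictly larger than both neighbors."""
    interior = (gfp[1:-1] > gfp[:-2]) & (gfp[1:-1] > gfp[2:])
    return np.flatnonzero(interior) + 1

# Toy recording with two electrodes of opposite polarity.
eeg = np.array([[1.0, 2.0, 1.0, 3.0, 1.0],
                [-1.0, -2.0, -1.0, -3.0, -1.0]])
gfp = global_field_power(eeg)
peaks = gfp_peak_indices(gfp)
original_maps = eeg[:, peaks].T  # one row per GFP-peak topographic map
```

The rows of `original_maps` would then serve as the input maps for the spatial clustering.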

2.2.2 KLGEV-based K-means clustering algorithm

Based on the modified K-means spatial clustering algorithm, we proposed a KLGEV-criterion-based microstate analysis method in this paper to automatically and adaptively determine the optimal number of microstates in emotional EEG signals. The flowchart of the proposed algorithm is shown in Figure 2, to provide a clear and concise depiction of the steps involved in the algorithm.

Figure 2. Flowchart of the proposed KLGEV-criterion-based algorithm. (A) The inputs of the algorithm include N original topographic maps and M candidate numbers. The original maps are defined as the scalp potential maps at the peak point of GFP curves. (B) For each candidate number Km , the modified K-means clustering algorithm is used. The N original topographic maps are thus clustered into Km clustering centers. We then calculate GEV for each candidate number as a preparation for the KLGEV criterion. (C) The schematic diagram of the KLGEV criterion and determination of the Koptimal . K* is regarded as the ‘elbow point’ (the star) of the GEV curve, i.e., the local peak points of the KLGEV curve. According to the KLGEV criterion, the largest local peak point of the KLGEV curve is determined as the optimal one. (D) The outputs of the algorithm are the identified Koptimal clustering centers, i.e., the optimal microstate classes.

Global explained variance (GEV) represents the proportion of the data that can be explained by all microstate classes, and is commonly used to evaluate the quality of clustering. Theoretically, a higher GEV stands for a better clustering result, which means that the current K microstates can explain a higher proportion of the data. The GEV of the current K clusters is calculated by Equation (2), and equals the sum of the global explained variance $\mathrm{GEV}_k$ of all clusters. The global explained variance of each cluster is calculated by Equation (3), and equals the sum of the global explained variance of all sampling points with cluster label $k$:

$$\mathrm{GEV}_K = \sum_{k=1}^{K}\mathrm{GEV}_k \tag{2}$$

$$\mathrm{GEV}_k = \sum_{n}^{N_k}\mathrm{GEV}_n,\quad \text{for } l_n = k \tag{3}$$

where $N_k$ refers to the number of sampling points assigned to cluster $k$, and $l_n$ is the microstate label of the potential topographic map at sampling point $n$.

The global explained variance at each sampling point is calculated by Equation (4), which reflects the spatial similarity between the potential topographic map $x_n$ at each sampling point and the microstate template map (cluster center) $a_{l_n}$ to which $x_n$ belongs:

$$\mathrm{GEV}_n = \frac{\left(\mathrm{Corr}\left(x_n, a_{l_n}\right)\cdot \mathrm{GFP}_n\right)^2}{\sum_{n'=1}^{N}\mathrm{GFP}_{n'}^2} \tag{4}$$

where $\mathrm{Corr}(x_n, a_{l_n})$ is the spatial correlation coefficient between $x_n$ and $a_{l_n}$. $\mathrm{GFP}_n$ and $\mathrm{GFP}_{n'}$ represent the global field power at sampling points $n$ and $n'$ respectively, calculated by Equation (1). $N$ is the number of all sampling points.
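Equations (2) to (4) can be illustrated with a short sketch. It assumes that the spatial correlation is the normalized inner product of average-referenced maps and that labels are integers in 0..K-1; the function name `global_explained_variance` is hypothetical.

```python
import numpy as np

def global_explained_variance(maps, templates, labels):
    """maps: (N, C) topographies, templates: (K, C), labels: (N,) ints in 0..K-1.

    Returns (total GEV, per-cluster GEV_k).
    """
    labels = np.asarray(labels)
    x = maps - maps.mean(axis=1, keepdims=True)            # average reference
    a = templates - templates.mean(axis=1, keepdims=True)
    gfp = x.std(axis=1)                                    # Equation (1)
    assigned = a[labels]                                   # template of each map
    corr = (x * assigned).sum(axis=1) / (
        np.linalg.norm(x, axis=1) * np.linalg.norm(assigned, axis=1))
    gev_n = (corr * gfp) ** 2 / np.sum(gfp ** 2)           # Equation (4)
    gev_k = np.bincount(labels, weights=gev_n,
                        minlength=len(templates))          # Equation (3)
    return gev_k.sum(), gev_k                              # Equation (2)

# Toy check: maps that are exact (signed, scaled) copies of their templates
# are fully explained, so the total GEV is 1.
A = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
B = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
total, per_cluster = global_explained_variance(
    np.array([2 * A, -B, 0.5 * A]), np.array([A, B]), [0, 1, 0])
```

Because the correlation is squared, a map and its polarity-inverted copy contribute equally to GEV.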

As GEV will increase with the number of microstates (i.e., the number of clusters), a larger GEV usually corresponds to a larger number of microstates. Excessive microstates can result in high similarity between each microstate and fail to reflect the activity characteristics of different neuronal assemblies. In order to make a compromise between clustering quality and data reduction, the KL criterion was introduced in this paper to find the “elbow point” (L-corner) of the GEV curve, to automatically determine the optimal number of microstates (i.e., the optimal number of clusters) in emotional EEG. The “elbow” is the point where the growth of GEV is significantly reduced, in other words, where the increase in GEV caused by adding one more microstate decreases significantly.

To find the optimal number of microstates, we need to find the inflection point between the rapidly growing period and the flat period of the GEV curve. The KLGEV criterion examines the first-order difference of the GEV curve with a microstate number interval of 2. Compared with an interval of 1, this reduces the influence of irregular local jitter on the curve and reflects the overall trend of the GEV curve more clearly and accurately. Let $\mathrm{DIFF}_K$ denote the first-order discrete difference with interval 2 of the function $K^{2/C}\,\mathrm{GEV}_K$ when the number of groups in the clustering is increased from $K-2$ to $K$, i.e.,

$$\mathrm{DIFF}_K = K^{2/C}\,\mathrm{GEV}_K - (K-2)^{2/C}\,\mathrm{GEV}_{K-2} \tag{5}$$

where $C$ is the number of electrodes, and $\mathrm{GEV}_K$ refers to the global explained variance when the candidate number of microstates is $K$, calculated by Equation (2).

Then we would expect GEV to increase dramatically as K is increased, as long as K is less than the optimal number K*, but this increase should slow down after K = K*. Thus, we would expect that (as shown in Figure 2C):

(i) For $K < K^*$, both $\mathrm{DIFF}_K$ and $\mathrm{DIFF}_{K+2}$ should be large (or medium) and positive;

(ii) For $K > K^*$, both $\mathrm{DIFF}_K$ and $\mathrm{DIFF}_{K+2}$ should be small (or medium) and positive;

(iii) For $K = K^*$, $\mathrm{DIFF}_{K^*}$ should be large and positive, while $\mathrm{DIFF}_{K^*+2}$ should be relatively small and positive.

On the basis of the above expectation, therefore, a reasonable criterion to determine the optimal number of microstates automatically is:

$$\mathrm{KLGEV}_K = \frac{\mathrm{DIFF}_K}{\mathrm{DIFF}_{K+2}} \tag{6}$$

As a consequence, the local peak points of the KLGEV curve correspond to the elbow of the GEV curve. In practice, there are usually several local peak points on the KLGEV curve, and the KLGEV criterion identifies the largest local peak point as the one indicating the optimal number of microstates.
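The selection rule can be sketched as follows, given a GEV value for each candidate K. Several details are assumptions made for illustration rather than part of the paper: a local peak is taken to be an interior point larger than both of its neighbors, ratios with a non-positive denominator are skipped, and the code falls back to the global maximum if no interior peak exists. The function name `klgev_optimal_k` is hypothetical.

```python
import numpy as np

def klgev_optimal_k(ks, gev, n_channels):
    """Return (K_optimal, {K: KLGEV_K}) from candidate numbers and GEV values."""
    ks = [int(k) for k in ks]
    # Weighted curve K^(2/C) * GEV_K used inside Equation (5).
    scaled = {k: k ** (2.0 / n_channels) * g for k, g in zip(ks, gev)}
    # Equation (5): first-order difference with an interval of 2.
    diff = {k: scaled[k] - scaled[k - 2] for k in ks if k - 2 in scaled}
    # Equation (6): ratio of successive differences.
    klgev = {k: diff[k] / diff[k + 2]
             for k in diff if k + 2 in diff and diff[k + 2] > 0}
    # Interior local peaks of the KLGEV curve (assumed definition).
    peaks = [k for k in klgev
             if k - 1 in klgev and k + 1 in klgev
             and klgev[k] > klgev[k - 1] and klgev[k] > klgev[k + 1]]
    candidates = peaks if peaks else list(klgev)
    return max(candidates, key=klgev.get), klgev

# Synthetic GEV curve that grows quickly up to K = 7 and then flattens.
ks = np.arange(3, 16)
gev = np.where(ks <= 7, 0.30 + 0.05 * (ks - 3), 0.50 + 0.005 * (ks - 7))
k_opt, klgev = klgev_optimal_k(ks, gev, n_channels=32)
```

On this synthetic curve the largest KLGEV peak falls at the designed elbow, K = 7. Note that with candidates from 3 to 15 the ratio is only defined for K from 5 to 13, since each value needs both $\mathrm{DIFF}_K$ and $\mathrm{DIFF}_{K+2}$.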

The clustering algorithm includes two steps: reassigning and recalculation. During the reassigning step, the algorithm determines the category $l_n$ for each original topographic map $x_n$, that is, it assigns each original topographic map to one of the $K$ clusters. $l_n$ is determined using Equations (7) and (8) as follows:

$$l_n = \arg\min_{k}\, d_{kn}^2 \tag{7}$$

$$d_{kn}^2 = x_n^{T}\cdot x_n - \left(x_n^{T}\cdot a_k\right)^2 - \lambda b_{kn} \tag{8}$$

where $x_n$ refers to the potential vector of the original map $n$, $a_k$ is the potential vector of the $k$th cluster center, and $d_{kn}^2$ is the squared orthogonal Euclidean distance between $x_n$ and $a_k$.

The recalculation step recalculates the cluster center of each cluster, which is defined as the mathematical average of all original maps in each cluster. After the clustering algorithm is finished, all the original topographic maps are clustered into K classes, and K clustering centers (i.e., microstate template maps) are obtained.
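The two alternating steps can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the $\lambda b_{kn}$ term of Equation (8) is omitted, the initial centers are taken from user-chosen maps, centers are normalized to unit length, and each map's polarity is aligned with the current center before averaging so that the mean does not cancel out. The function name `modified_kmeans` and its parameters are hypothetical.

```python
import numpy as np

def modified_kmeans(maps, init_idx, n_iter=20):
    """maps: (N, C) GFP-peak topographies; init_idx: indices of initial centers."""
    x = maps - maps.mean(axis=1, keepdims=True)        # average reference
    centers = x[list(init_idx)].copy()
    centers /= np.linalg.norm(centers, axis=1, keepdims=True)
    labels = np.zeros(len(x), dtype=int)
    for _ in range(n_iter):
        # Reassigning step: squared orthogonal distance without the lambda term.
        proj = x @ centers.T                           # (N, K) projections
        d2 = (x ** 2).sum(axis=1, keepdims=True) - proj ** 2
        labels = d2.argmin(axis=1)
        # Recalculation step: polarity-aligned average of the member maps.
        for k in range(len(centers)):
            members = x[labels == k]
            if len(members) == 0:
                continue                               # keep an empty cluster's center
            signs = np.sign(members @ centers[k])
            signs[signs == 0] = 1.0
            mean_map = (members * signs[:, None]).mean(axis=0)
            centers[k] = mean_map / np.linalg.norm(mean_map)
    return centers, labels
```

Running this function for every candidate K and scoring each result with GEV yields the curve from which the KLGEV criterion picks the optimal number.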

The complete procedure of the KLGEV-based K-means clustering algorithm is shown in Algorithm 1. The algorithm consists of two stages. The first stage is the modified K-means spatial clustering algorithm, which clusters the original topographic maps once for each candidate number of microstates. The second stage is the identification of the optimal number of microstates from the candidate numbers based on the KLGEV criterion. The algorithm outputs the final cluster centers, i.e., the microstate template maps.

ALGORITHM 1. KLGEV-based K-means clustering algorithm.

2.2.3 Backfitting and temporal smoothing

The obtained microstate template maps are used to backfit scalp potential maps at each sampling point in EEG data based on Pearson spatial correlation coefficients. The Pearson correlation coefficient between each scalp potential map and each template map is calculated by Equation (9) as follows:

$$\mathrm{Corr}(u, v) = \frac{\sum_{i=1}^{C} u_i v_i}{\sqrt{\sum_{i=1}^{C} u_i^2}\,\sqrt{\sum_{i=1}^{C} v_i^2}} \tag{9}$$

where $C$ represents the number of electrodes, $u$ or $v$ refers to a potential topographic map, i.e., the potential topographic map at a sampling point or a template map, and $u_i$ or $v_i$ is the potential value of the topographic map $u$ or $v$ at electrode $i$, respectively.

After the calculation of spatial correlation coefficients, the topographic map at each sampling point is assigned to one template map (i.e., microstate) with the highest spatial correlation coefficient. In this way, the potential topographic maps at all sampling points in EEG signals are represented as a series of template maps, and the raw EEG signals are modeled as a time series of alternating functional microstates, which can characterize the dynamic process of the brain during affective processing.
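A minimal sketch of the backfitting step is given below. It assumes average-referenced maps and, by default, ignores polarity by comparing absolute correlations, which is a common convention in microstate analysis but an assumption here; the text itself only states that the highest correlation is used. The function name `backfit` and the `ignore_polarity` flag are hypothetical.

```python
import numpy as np

def backfit(eeg, templates, ignore_polarity=True):
    """eeg: (n_channels, n_samples); templates: (K, n_channels).

    Returns the microstate label of every sample and the spatial
    correlation between each sample and its assigned template.
    """
    x = eeg.T - eeg.T.mean(axis=1, keepdims=True)          # (N, C) maps
    a = templates - templates.mean(axis=1, keepdims=True)
    x_norm = np.linalg.norm(x, axis=1, keepdims=True)
    a_norm = np.linalg.norm(a, axis=1, keepdims=True)
    corr = (x @ a.T) / (x_norm * a_norm.T)                 # Equation (9), (N, K)
    score = np.abs(corr) if ignore_polarity else corr
    labels = score.argmax(axis=1)
    return labels, corr[np.arange(len(labels)), labels]

# Toy example: four samples built from two orthogonal templates.
A = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
B = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
eeg = np.stack([A, -B, 2 * B, -0.5 * A], axis=1)           # (8 channels, 4 samples)
labels, corr = backfit(eeg, np.array([A, B]))
```

The resulting label sequence is the raw microstate time series, before any temporal smoothing is applied.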

Due to the presence of noise, there are usually some short-duration microstate segments in the microstate time series obtained from topographic map backfitting. We adopted the windowed smoothing algorithm proposed by Pascual-Marqui et al. (1995) to smooth these short noise segments.

2.2.4 Microstate temporal and spatial features

By analyzing the EEG microstate time series, several microstate parameters can be obtained (Murray et al., 2008; Michel and Koenig, 2018; Tarailis et al., 2023). We introduced two microstate spatial parameters, namely average global field power and mean spatial correlation (Poulsen et al., 2018), on the basis of the commonly used temporal parameters. The microstate temporal and spatial parameters used in this paper are summarized as follows:

(a) Occurrence: the frequency of occurrence of each microstate;

(b) Duration: the average duration (average lifespan) that a given microstate remains stable;

(c) Coverage: the time coverage rate of each microstate throughout the whole time course, in other words, the fraction of the total recording time for which a given microstate is dominant;

(d) Transition probability between microstate classes;

(e) Global explained variance (GEV) of each microstate, which is calculated using Equation (3);

(f) Average global field power ($\mathrm{GFP}_k$) of each microstate, represented by the average global field power $\mathrm{GFP}_n$ of all sampling points assigned to the $k$th microstate. $\mathrm{GFP}_k$ is calculated by Equation (10) as follows:

$$\mathrm{GFP}_k = \frac{1}{N_k}\sum_{n}^{N_k}\mathrm{GFP}_n,\quad \text{for } l_n = k \tag{10}$$

where $\mathrm{GFP}_n$ is calculated using Equation (1).

(g) Mean spatial correlation (MspatCorr) of each microstate, which is the average spatial correlation between the template map of each microstate class and the potential topographic maps assigned to this microstate. It is calculated by Equation (11) as follows:

$$\mathrm{MspatCorr}_k = \frac{1}{N_k}\sum_{n}^{N_k}\mathrm{Corr}\left(x_n, a_{l_n}\right),\quad \text{for } l_n = k \tag{11}$$

As reviewed in Murray et al. (2008), Khanna et al. (2015), Poulsen et al. (2018) and Tarailis et al. (2023), these parameters well describe the temporal and spatial dynamic characteristics of the microstate series and the EEG signals, reflecting the response strength, temporal and spatial characteristics of potential neural assemblies and nervous systems.
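The parameters listed above, except GEV (which follows Equation 3), can be derived from a label sequence as in the sketch below. The exact conventions are assumptions made for illustration: occurrence is counted in segments per second, duration is reported in seconds, and transition probabilities are computed between consecutive segments and normalized per source state. The function name `microstate_features` is hypothetical.

```python
import numpy as np

def microstate_features(labels, gfp, corr, n_states, fs):
    """labels, gfp, corr: (N,) arrays per sample; fs: sampling rate in Hz."""
    labels = np.asarray(labels)
    gfp = np.asarray(gfp, dtype=float)
    corr = np.asarray(corr, dtype=float)
    # Split the sequence into segments of constant label.
    starts = np.concatenate(([0], np.flatnonzero(np.diff(labels)) + 1))
    lengths = np.diff(np.concatenate((starts, [labels.size])))
    seg_labels = labels[starts]
    total_s = labels.size / fs
    feats = {name: np.zeros(n_states) for name in
             ("occurrence", "duration", "coverage", "mean_gfp", "mspatcorr")}
    for k in range(n_states):
        seg = lengths[seg_labels == k]
        mask = labels == k
        feats["occurrence"][k] = seg.size / total_s             # segments per second
        feats["duration"][k] = seg.mean() / fs if seg.size else 0.0
        feats["coverage"][k] = mask.mean()
        feats["mean_gfp"][k] = gfp[mask].mean() if mask.any() else 0.0    # Eq. (10)
        feats["mspatcorr"][k] = corr[mask].mean() if mask.any() else 0.0  # Eq. (11)
    # Transition probabilities between consecutive segments, row-normalized.
    counts = np.zeros((n_states, n_states))
    np.add.at(counts, (seg_labels[:-1], seg_labels[1:]), 1)
    row = counts.sum(axis=1, keepdims=True)
    feats["transition"] = np.divide(counts, row, out=np.zeros_like(counts),
                                    where=row > 0)
    return feats

# Toy sequence: 8 samples at 4 Hz, i.e. 2 seconds, with three microstates.
f = microstate_features([0, 0, 1, 1, 1, 0, 2, 2], np.arange(8.0),
                        np.ones(8), n_states=3, fs=4)
```

In the toy sequence, microstate 0 occurs twice in 2 seconds, lasts 0.375 s on average and covers 37.5% of the recording.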

2.3 Statistical analysis of microstate features

Statistical analyses were performed to characterize the EEG microstate differences in different emotional states. Each microstate parameter was compared on the valence and arousal dimension separately. The level differences in valence describe the positive or negative degree of emotional states, whereas the arousal dimension characterizes the level of physiological activation of emotions (Russell and Barrett, 1999).

For the DEAP dataset, we first classified all emotion-evoked EEG trials into low- or high-level groups based on the self-assessment ratings of all subjects. Each trial was rated separately in the arousal and valence dimensions, where each rating was a floating-point number ranging from 1 to 9. However, the ranges of the reported self-assessment ratings could be quite different from subject to subject, due to individual-specific experience of emotions (Hu et al., 2022b). As a result, it would be unsuitable to use a fixed threshold (e.g., 5) for grouping. Therefore, this paper adopted the self-adaptive threshold reassignment method proposed by Yin et al. (2017) to determine the threshold for level grouping for each subject. The method is illustrated in Supplementary Figure S1, and the obtained self-adaptive thresholds on the arousal and valence dimensions for each subject are shown in Supplementary Table S1. In this way, all trials were divided into four classes: high arousal and high valence (HAHV), high arousal and low valence (HALV), low arousal and high valence (LAHV), and low arousal and low valence (LALV). For the SEED dataset, each trial has an explicit emotion label: positive, negative or neutral. Trials with positive labels and negative labels were included in the statistical analysis.

Secondly, the Wilcoxon rank-sum test was used to identify whether statistically significant differences exist between high (or positive) and low (or negative) groups for each microstate class in every parameter. The significance level is set to 0.05.
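For a single microstate class and a single parameter, this comparison could look like the sketch below, which uses `scipy.stats.ranksums`. The helper name `compare_groups` and the toy values are hypothetical and only illustrate the test; they are not data from the study.

```python
import numpy as np
from scipy.stats import ranksums

def compare_groups(high, low, alpha=0.05):
    """Wilcoxon rank-sum test between two groups of one microstate parameter."""
    stat, p_value = ranksums(high, low)
    return stat, p_value, bool(p_value < alpha)

# Hypothetical example: parameter values of one microstate class in two groups.
high_group = np.arange(30, 50, dtype=float)
low_group = np.arange(0, 20, dtype=float)
stat, p_value, significant = compare_groups(high_group, low_group)
```

In practice the test would be repeated for every microstate class and every parameter, separately for the valence and arousal dimensions.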

2.4 Emotion recognition

In order to verify whether the microstate temporal and spatial parameters extracted in this paper can effectively capture the emotional characteristics of EEG signals, we employed all the parameters described in Section 2.2.4 as a feature set for the subject-dependent emotion recognition experiment. We conducted additional comparison experiments that used only the temporal parameters as a feature set, so as to investigate whether the introduced spatial parameters can further improve the accuracy of emotion recognition. Besides, we also tested whether the characterization ability of the model would be further enhanced by frequency domain features. Specifically, we extracted power spectral density (PSD) features from five bands: δ (1–4 Hz), θ (4–8 Hz), α (8–12 Hz), β (12–30 Hz), and γ (30–45 Hz), and combined them with the microstate temporal and spatial features for emotion recognition. The experiments were carried out on the SEED and DEAP datasets.
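One way to obtain band-wise PSD features for the five bands is sketched below using Welch's method. The estimator, the 2-second segment length and the averaging of PSD within each band are assumptions; the paper does not state how the PSD features were computed. The names `BANDS` and `band_psd` are hypothetical.

```python
import numpy as np
from scipy.signal import welch

# Band edges in Hz, as listed in the text.
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 12),
         "beta": (12, 30), "gamma": (30, 45)}

def band_psd(eeg, fs):
    """Mean PSD per band and channel. eeg: (n_channels, n_samples)."""
    freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs), axis=-1)
    features = {}
    for name, (low, high) in BANDS.items():
        mask = (freqs >= low) & (freqs < high)
        features[name] = psd[..., mask].mean(axis=-1)
    return features

# Toy check: a 10 Hz sinusoid should carry most of its power in the alpha band.
fs = 200
t = np.arange(0, 10, 1 / fs)
features = band_psd(np.sin(2 * np.pi * 10 * t)[None, :], fs)
dominant = max(features, key=lambda band: features[band][0])
```

The per-band values can then be concatenated with the microstate parameters to form the combined feature vector for each trial.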

The open-source automatic machine learning framework AutoGluon-Tabular (Erickson et al., 2020) was chosen as the classifier for emotion recognition. AutoGluon-Tabular is an easy-to-use Python library for automatic machine learning with tabular data. It automatically evaluates the performance of multiple machine learning models (e.g., KNN, random forests, XGBoost, ensemble learning models, multi-layer stack ensembling models and even self-implemented models) at the same time, and returns the classification results using the best-performing model. Unlike existing automatic machine learning frameworks that primarily focus on model/hyperparameter selection, AutoGluon-Tabular succeeds by multi-layer stack ensemble and n-repeated k-fold bagging. For each subject, a fivefold cross-validation method was adopted to obtain the final average accuracy.

3 Results

3.1 Results of KLGEV-criterion-based microstate analysis

We used the proposed KLGEV-criterion-based method to perform microstate analysis on two public emotional EEG datasets, SEED and DEAP, to evaluate the effectiveness of the proposed method.

3.1.1 Determination of the Koptimal using the KLGEV criterion

To investigate the performance of the proposed KLGEV criterion, we demonstrate here how the criterion determines the optimal number of microstates Koptimal on the SEED and DEAP datasets. For the SEED dataset, with the candidate number of microstates ranging from 3 to 15, the corresponding $\mathrm{GEV}_K$, $K^{2/C}\,\mathrm{GEV}_K$, $\mathrm{DIFF}_K$, and KLGEV values are listed in Table 1. When the number of microstates K is smaller than 10, $\mathrm{DIFF}_K$ and $\mathrm{DIFF}_{K+2}$ are relatively large (or medium); when K is larger than 10, $\mathrm{DIFF}_K$ and $\mathrm{DIFF}_{K+2}$ are relatively small (or medium); and when K equals 10, $\mathrm{DIFF}_{10}$ is relatively large while $\mathrm{DIFF}_{12}$ is relatively small. As a consequence, the ratio of $\mathrm{DIFF}_{10}$ to $\mathrm{DIFF}_{12}$ is larger than the ratios at neighboring points (e.g., the ratio of $\mathrm{DIFF}_{9}$ to $\mathrm{DIFF}_{11}$ or the ratio of $\mathrm{DIFF}_{11}$ to $\mathrm{DIFF}_{13}$). Therefore, 10 is regarded as the ‘elbow point’ of the GEV curve, i.e., a local peak point of KLGEV. According to the KLGEV criterion, the largest local peak point, 10, is chosen as the final optimal number of microstates (Koptimal). In the same way, Koptimal equals 9 for the DEAP dataset.

Table 1. Determination of the Koptimal using the KLGEV criterion on (A) SEED and (B) DEAP datasets.

3.1.2 The identified optimal microstate classes

For the SEED dataset, the GEV curve and the corresponding KLGEV values obtained by the modified K-means clustering algorithm for candidate numbers from 3 to 15 are shown in Figure 3A. The corresponding template topographic maps (i.e., microstate maps) are shown in Figure 3B and are named “MS1-MS10”, respectively.

Figure 3. The identified optimal microstate classes using the proposed KLGEV criterion for the SEED and DEAP datasets. (A) The GEV and corresponding KLGEV values of the SEED dataset for different number of microstates. The KLGEV criterion identified 10 microstates, which explained 65.11% of the data in all time points. (B) The final identified 10 microstate template maps from the SEED dataset. (C) The GEV and corresponding KLGEV values of the DEAP dataset for different number of microstates. The KLGEV criterion identified 9 microstates, which explained 66.14% of the data in all time points. (D) The final identified 9 microstate template maps from the DEAP dataset.

The Pearson spatial correlation coefficient measures the topographic similarity of microstate maps. We calculated this coefficient for every pair of topographic maps within each dataset. The correlation coefficient matrix for the SEED dataset is shown in Supplementary Figure S2A. The similarity between the extracted microstate maps was relatively low: most coefficients were less than 0.8, and only a few were slightly greater than 0.8.
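The pairwise comparison described above can be sketched as follows. This is a minimal example under stated assumptions, not the authors' code: each template map is taken to be a vector with one value per electrode, the helper names `pearson` and `spatial_correlation_matrix` are hypothetical, and the maps are random placeholders (10 maps over 62 channels) rather than the real MS1 to MS10 templates.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equally long vectors of channel values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


def spatial_correlation_matrix(maps):
    """Pairwise Pearson correlation between template maps (one row per map)."""
    return [[pearson(a, b) for b in maps] for a in maps]


# Placeholder data: 10 random "maps" over 62 channels (assumed sizes, not real templates).
rng = random.Random(0)
maps = [[rng.gauss(0.0, 1.0) for _ in range(62)] for _ in range(10)]
corr = spatial_correlation_matrix(maps)

# List the pairs (1-based map indices) whose similarity exceeds the 0.8 level
# used as a reference point in the text.
similar_pairs = [(i + 1, j + 1)
                 for i in range(len(maps))
                 for j in range(i + 1, len(maps))
                 if corr[i][j] > 0.8]
```

Microstate analyses often ignore map polarity, in which case the absolute value of the coefficient would be compared instead. This section does not say which convention was used, so the sketch keeps the signed coefficient.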

Similarly, for the DEAP dataset, the GEV curve and the corresponding KLGEV values obtained with the modified K-means clustering algorithm are shown in Figure 3C. The corresponding template topographic maps (i.e., microstate maps), named MS1 to MS9, are shown in Figure 3D.

The Pearson spatial correlation coefficient matrix for the DEAP dataset is shown in Supplementary Figure S2B. Again, the similarity between the extracted microstate maps was relatively low: most coefficients were less than 0.8, and only a few were slightly greater than 0.8.

3.1.3 Corresponding relationship between microstates of two datasets

The stimulus materials used to induce emotion are movie clips in the SEED dataset and music videos in the DEAP dataset. Given this difference, together with differences in subjects, number of electrodes (Zhang et al., 2021), hardware conditions and the stability of the algorithm, the number of identified microstates differs between the two datasets. To identify correspondences between the microstates of the two datasets, we calculated the Pearson spatial correlation coefficient between their microstate topographic maps; the results are shown in Figure 4. As the figure shows, the spatial correlation between microstates in the SEED dataset and some of the microstates in the DEAP dataset is very high, indicating a clear correspondence. These consistent microstates may reflect functional patterns of the brain during emotional cognition and may correspond to the basic building blocks of emotion-related information processing.

Figure 4. The Pearson correlation coefficient matrix between the microstates of the SEED and DEAP datasets. The corresponding relationship is listed on the right. These consistent microstates may represent the basic building blocks of emotion cognition.
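A cross-dataset matching of this kind can be sketched as below. The sketch makes assumptions that this section does not confirm: both sets of maps are assumed to have already been reduced to the same electrodes in the same order (the two datasets use different electrode counts, and how this was reconciled is not described here), the function name `match_maps` is hypothetical, and the four-channel maps are toy values chosen only to make the arithmetic easy to follow.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long vectors of channel values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


def match_maps(maps_a, maps_b):
    """For each map in maps_a, return (index of its best match in maps_b, correlation)."""
    matches = []
    for a in maps_a:
        scores = [pearson(a, b) for b in maps_b]
        best = max(range(len(scores)), key=scores.__getitem__)
        matches.append((best, scores[best]))
    return matches


# Toy maps over 4 shared channels (illustrative values only).
seed_like = [[1.0, 0.5, -0.5, -1.0],
             [-1.0, 1.0, 1.0, -1.0],
             [1.0, -1.0, 1.0, -1.0]]
deap_like = [[-0.9, 1.1, 0.9, -1.1],
             [2.0, 1.0, -1.0, -2.0]]

for i, (j, r) in enumerate(match_maps(seed_like, deap_like)):
    print(f"map {i + 1} in set A best matches map {j + 1} in set B (r = {r:.3f})")
```

In the toy data the first two maps have a close counterpart while the third does not, which mirrors the situation in which only some microstates correspond across datasets. A threshold on the best correlation would be needed to decide which matches count as a correspondence; no such threshold is stated in this section.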

3.2 Spatiotemporal dynamics of EEG under different emotions

Statistical analyses were performed to characterize differences in EEG microstates and thereby investigate the spatiotemporal dynamics of EEG signals under different emotional states.
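As a concrete illustration of this kind of two-group comparison, the sketch below implements a two-sided Wilcoxon rank-sum test, the test applied to the SEED dataset in this section. It is a minimal version under stated assumptions: it uses the normal approximation with average ranks for ties and no tie correction of the variance, the function name `rank_sum_test` is hypothetical, and the Coverage values are invented for illustration rather than taken from the study.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation). Returns (z, p)."""
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += average_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    expected = n1 * (n1 + n2 + 1) / 2.0
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - expected) / std
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value under N(0, 1)
    return z, p


# Invented Coverage values of one microstate in two conditions (illustrative only).
positive = [0.31, 0.35, 0.33, 0.38, 0.36, 0.34, 0.37, 0.32]
negative = [0.21, 0.25, 0.23, 0.28, 0.26, 0.24, 0.27, 0.22]

z, p = rank_sum_test(positive, negative)
print(f"z = {z:.3f}, p = {p:.5f}")
```

A positive z means the first group tends to have larger values. In practice the test would be repeated for every microstate parameter, so some control for multiple comparisons may be appropriate; whether one was applied is not stated in this section.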

On the SEED dataset, we performed the Wilcoxon rank-sum test on each microstate parameter to identify statistically significant differences between emotional states (positive vs. negative). The results are shown in Supplementary Figure S3. The activities of MS2, MS3, MS4, MS8 and MS9 were significantly decreased in the positive groups compared to the negative groups, while the activities of MS1, MS5 and MS10 were significantly increased. Specifically, the Occurrence, Coverage and GEV of MS2 were significantly lower in the positive groups. MS3 showed decreased Occurrence, Duration, Coverage and GEV in the positive groups compared with the negative groups. For MS4 and MS8, the Duration and GEV were significantly lower in the positive groups. Finally, the Occurrence and GEV of MS9 were significantly decreased in the positive groups. By contrast, the Occurrence, Coverage, Duration and GEV of MS5 and MS10 were significantly increased in the positive groups compared to the negative groups, and the GEV of MS1 was also significantly increased. In addition, all 10 microstates showed higher GFP in the positive groups, and all microstates except MS1 and MS10 had decreased MspatCorr in the positive groups as compared to the negative


FRONTIERS IN NEUROSCIENCE
