TQCPat: Tree Quantum Circuit Pattern-based Feature Engineering Model for Automated Arrhythmia Detection using PPG Signals

Dataset

The dataset used in this study is derived from the publicly available dataset described in Liu et al. [30], which consists of photoplethysmography (PPG) and electrocardiography (ECG) signals collected from 228 patients undergoing radiofrequency catheter ablation (RFCA) for arrhythmia treatment at Fuwai Hospital, China. A combined total of 118,217 10-s PPG and ECG waveforms were recorded. These signals were collected using a multiparameter monitoring system (BeneVision N12; Shenzhen Mindray Bio-Medical Electronics) equipped with a fingertip PPG sensor and a 5-lead ECG setup. The signals were obtained while the patients were in a supine position, and data acquisition was conducted at sampling rates of 250 Hz for ECG and 100 Hz for PPG. In the current study, we used only the PPG waveforms, which amounted to a total of 46,827 PPG signals. The study dataset consists of six different rhythm types, which were annotated by two cardiologists based on ECG interpretations: 0: sinus rhythm; 1: premature ventricular contraction; 2: premature atrial contraction; 3: ventricular tachycardia; 4: supraventricular tachycardia; and 5: atrial fibrillation (AF) (Table 1).

Table 1 Distribution of study PPG samples

Proposed TQCPat and TCPPat-based Feature Extraction

The proposed TQCPat is a novel graph pattern modeled after a quantum circuit. It comprises 34 nodes, with multiple directed paths connecting the start node to the final node via intermediary nodes, and these nodes are populated in sequence by the values of the input signal block (Fig. 1). The optimal path linking the start and final nodes is selected dynamically by applying the maximum function to the populated values, after which local feature extraction is performed using histogram analysis (Fig. 2).

Fig. 1

The tree quantum circuit pattern is a directed graph comprising a starting node (V1), a final node (V34), and intermediary nodes (V2–V33), linked by directed paths. The node values v are sequentially populated by the 34 values of the PPG signal block

Fig. 2

Schema overview of the proposed TQCPat-based feature extraction. bit, binary features; fv, feature vector; LT, lower ternary kernel; Map, feature map signals; Signum, signum kernel; UT, upper ternary kernel. The superscripts 1, 2, and 3 denote the binary feature, feature map signal, and feature vector outputs of the signum, lower ternary, and upper ternary functions, respectively

The steps of the TQCPat-based feature extraction are detailed below.

S1: Divide the signal into overlapping blocks, each of length 34.

$$v\left(j\right)=signal\left(i+j-1\right), i\in \left\{1,2,\dots ,L-33\right\}, j\in \left\{1,2,\dots ,34\right\}$$

(1)

where \(v\) represents an overlapping block and \(L\), the length of the signal.
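As an illustration, S1 can be sketched in NumPy (the function and variable names below are ours, not the paper's):

```python
import numpy as np

def overlapping_blocks(signal, width=34):
    """Slide a window of `width` samples over the signal with stride 1,
    yielding one block per start position, as in Eq. (1)."""
    signal = np.asarray(signal)
    n = len(signal) - width + 1          # number of overlapping blocks
    return np.stack([signal[i:i + width] for i in range(n)])
```

For example, a 100-sample signal yields 100 − 34 + 1 = 67 overlapping blocks of length 34.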

S2: Use the values of the block (Fig. 1) to create a tree quantum graph.

S3: Select the optimal path using the maximum-based selection function. A numerical example is depicted in Fig. 3.

Fig. 3

The TQCPat is populated with values of a sample signal block of length 34

Where the paths bifurcate, the greater of the two values is selected by the maximum function to construct the optimal path linking the start and final nodes (yellow nodes). The values in the yellow nodes are used for feature extraction.
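Fig. 1 fixes the actual wiring of the 34 nodes; purely as a hedged sketch of the maximum-based selection rule, the greedy walk below operates on a toy adjacency map of our own choosing:

```python
def select_path(values, successors):
    """Greedily walk from node 1 to the final node, always moving to the
    successor holding the larger populated value (maximum function).
    `values[k]` is the value at node k; `successors[k]` lists next nodes."""
    node = 1
    path = [node]
    while successors.get(node):
        node = max(successors[node], key=lambda nxt: values[nxt])
        path.append(node)
    return path  # node indices of the selected (optimal) path
```

The values stored at the returned path's nodes form the selected sequence used for feature extraction in the next step.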

S4: Extract binary features by applying three kernels (signum, upper ternary, and lower ternary) to the selected values.

$$\begin{array}{c}bit^{k}\left(h\right)=kernel_{k}\left(vs\left(h\right),vs\left(h+1\right)\right), h\in \left\{1,2,\dots ,8\right\}, k\in \left\{1,2,3\right\}\\ kernel_{1}\left(vs\left(h\right),vs\left(h+1\right)\right)=\left\{\begin{array}{l}0, vs\left(h\right)-vs\left(h+1\right)<0\\ 1, vs\left(h\right)-vs\left(h+1\right)\ge 0\end{array}\right.\\ kernel_{2}\left(vs\left(h\right),vs\left(h+1\right)\right)=\left\{\begin{array}{l}0, vs\left(h\right)-vs\left(h+1\right)\ge -t\\ 1, vs\left(h\right)-vs\left(h+1\right)<-t\end{array}\right.\\ kernel_{3}\left(vs\left(h\right),vs\left(h+1\right)\right)=\left\{\begin{array}{l}0, vs\left(h\right)-vs\left(h+1\right)\le t\\ 1, vs\left(h\right)-vs\left(h+1\right)>t\end{array}\right.\\ t=\frac{std\left(signal\right)}{2}\end{array}$$

(2)

where \(bit\) represents the binary features, each of length 8; \(vs\), the values selected along the optimal path; \(kernel_{1}(.)\), the signum function; \(kernel_{2}(.)\), the lower ternary function; \(kernel_{3}(.)\), the upper ternary function; \(std(.)\), the standard deviation calculation function; and \(t\), the threshold value. In this step, three 8-bit groups are extracted.
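The three kernels of Eq. (2) can be sketched as follows; `vs` holds the nine values along the selected path and `t` is the threshold (the paper derives it from the signal's standard deviation), so each kernel produces eight bits:

```python
import numpy as np

def binary_features(vs, t):
    """Apply the signum, lower-ternary, and upper-ternary kernels of
    Eq. (2) to consecutive selected values, yielding three 8-bit patterns."""
    vs = np.asarray(vs, dtype=float)
    d = vs[:-1] - vs[1:]                  # vs(h) - vs(h+1), h = 1..8
    signum = (d >= 0).astype(int)         # kernel 1: signum
    lower = (d < -t).astype(int)          # kernel 2: lower ternary
    upper = (d > t).astype(int)           # kernel 3: upper ternary
    return signum, lower, upper
```

Differences inside the band [−t, t] are ignored by both ternary kernels, which makes the bits robust to small fluctuations that the plain signum kernel would still register.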

S5: Calculate three map values using binary-to-decimal transformation.

$$fmap^{k}\left(i\right)=\sum_{u=1}^{8}bit^{k}\left(u\right)\bullet {2}^{8-u}$$

(3)

where \(fmap\) represents the feature map signal.
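The binary-to-decimal transform of Eq. (3) treats the first bit as the most significant, so each 8-bit pattern maps to an integer in [0, 255]; a minimal sketch:

```python
def bits_to_decimal(bits):
    """Eq. (3): fold an 8-bit pattern, first bit most significant,
    into a decimal value in the range [0, 255]."""
    value = 0
    for b in bits:           # bits ordered bit(1) .. bit(8)
        value = value * 2 + b
    return value
```

Applying this to the three bit groups of each block produces one value per kernel, i.e. the three feature map signals grow by one sample per block.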

S6: Repeat S1–S5 until all overlapping blocks have been processed, generating three feature map signals.

S7: Extract histogram features from the created map signals to create three feature vectors. Each feature vector is of length 256 as each feature map value is coded with 8 bits.

$$fv^{k}=\lambda \left(fmap^{k}\right)$$

(4)

where \(fv\) represents feature vector; and \(\lambda (.)\), histogram extraction function.
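The histogram extraction of Eq. (4) simply counts how often each 8-bit code occurs in a feature map signal; a minimal NumPy sketch:

```python
import numpy as np

def histogram_features(fmap, bins=256):
    """Eq. (4): count the occurrences of each 8-bit code in the
    feature map signal, giving a fixed-length-256 feature vector."""
    fmap = np.asarray(fmap, dtype=int)
    return np.bincount(fmap, minlength=bins)
```

Because the histogram has a fixed length of 256, signals of different durations all yield feature vectors of identical size.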

Proposed Feature Engineering Model

The proposed system consists of four main stages: (i) multilevel feature extraction, (ii) feature selection, (iii) classification, and (iv) information fusion. In the multilevel feature extraction stage, the discrete wavelet transform (DWT) is applied to decompose the PPG signals into four levels using the Daubechies 4 (db4) wavelet. Additionally, the proposed Tree Quantum Circuit Pattern (TQCPat) extracts three distinct feature vectors per signal by employing Signum, Upper Ternary (UT), and Lower Ternary (LT) transformations (see Sect. "Proposed TQCPat and TCPPat-based feature extraction"). In the feature selection phase, Chi-squared (Chi2) [41] and Neighborhood Component Analysis (NCA) [40] methods are used to rank the most relevant features and reduce dimensionality, ensuring improved classifier performance. The classification stage involves training k-Nearest Neighbors (kNN) [42] and Support Vector Machine (SVM) [43] models with tenfold cross-validation to obtain 12 classifier-specific predictions. Finally, the information fusion phase [45] integrates classification results using an Iterative Majority Voting (IMV) [44] approach, where multiple classifier-specific outputs are combined and the most frequently predicted class label is selected as the final output. A block diagram summarizing all these phases is given in Fig. 4.

Fig. 4

Schema of the proposed TQCPat-based feature engineering model. c, classifier-specific outcome; f, individual feature vector; F, merged feature vector; MDWT, multilevel discrete wavelet transform; s, selected feature vector; v, voted outcomes; w, wavelet subband

The steps of the model are detailed below.

Feature Extraction

Step 1: Apply four-level MDWT to the PPG signal using the Daubechies 4 (db4) wavelet filter.

$$\begin{array}{c}\left[low^{1}, high^{1}\right]=\delta \left(signal\right)\\ \left[low^{g+1}, high^{g+1}\right]=\delta \left(low^{g}\right), g\in \left\{1,2,3\right\}\end{array}$$

(5)

where \(\delta (.)\) represents the discrete wavelet transform function; \(low\), the low-pass filter output (approximation coefficients); and \(high\), the high-pass filter output (detail coefficients).
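The recursive low-pass splitting of Eq. (5) can be sketched as follows; for brevity we use a Haar step as a stand-in for the paper's db4 filter (a real implementation would call PyWavelets, e.g. `pywt.dwt(x, 'db4')`):

```python
import numpy as np

def dwt_step(x):
    """One Haar DWT step (a stand-in for the db4 filter): the low-pass
    branch holds scaled pairwise sums, the high-pass branch differences."""
    x = np.asarray(x, dtype=float)
    low = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    high = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return low, high

def mdwt(signal, levels=4):
    """Eq. (5): repeatedly decompose the low-pass branch, collecting
    the approximation subbands low^1 .. low^4."""
    lows, approx = [], signal
    for _ in range(levels):
        approx, _ = dwt_step(approx)
        lows.append(approx)
    return lows
```

Each level halves the subband length, so the four approximation subbands cover progressively lower frequency bands of the PPG signal.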

Step 2: Generate the individual feature vectors by applying the TQCPat-based feature extraction function (see Sect. "Proposed TQCPat and TCPPat-based feature extraction") to both the raw PPG signal and its corresponding MDWT-generated wavelet subbands.

$$\begin{array}{c}\left[f_{1}^{1}\; f_{1}^{2}\; f_{1}^{3}\right]=\psi \left(signal\right)\\ \left[f_{l+1}^{1}\; f_{l+1}^{2}\; f_{l+1}^{3}\right]=\psi \left(low^{l}\right), l\in \left\{1,2,3,4\right\}\end{array}$$

(6)

where \(f\) represents a generated feature vector of length 256; and \(\psi (.)\), the TQCPat feature extraction function.

Step 3: Apply categorical merging to generate three merged feature vectors.

$$F^{g}\left(r+\left(x-1\right)\bullet 256\right)=f_{x}^{g}\left(r\right), r\in \left\{1,2,\dots ,256\right\}, g\in \left\{1,2,3\right\}, x\in \left\{1,2,\dots ,5\right\}$$

(7)

where \(F\) represents merged feature vector of length 1280 (= 256 × 5).
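The categorical merging of Eq. (7) is plain concatenation of the five length-256 vectors belonging to one kernel (raw signal plus four subbands); a minimal sketch:

```python
import numpy as np

def merge_features(per_source_fvs):
    """Eq. (7): concatenate the five length-256 vectors of one kernel
    (raw signal + four wavelet subbands) into a length-1280 vector."""
    return np.concatenate(per_source_fvs)
```

Repeating this for the three kernels yields the three merged feature vectors.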

Step 4: Repeat Steps 1–3 until all PPG signals have been processed to generate three feature matrices (\(X\)).

Feature Selection

Step 5: Calculate six qualified indices of the three extracted feature matrices using NCA [40] and Chi2 [41] feature selectors.

$$\begin{array}{c}idx_{g}=Chi2\left(X_{g},y\right), g\in \left\{1,2,3\right\}\\ idx_{g+3}=NCA\left(X_{g},y\right)\end{array}$$

(8)

where \(idx\) represents the qualified indices of the generated feature matrices; and \(y,\) actual output.

Step 6: Select the top 256 features from each feature matrix using the corresponding qualified index.

$$\begin{array}{c}s_{g}\left(d,z\right)=X_{g}\left(d,idx_{g}\left(z\right)\right), d\in \left\{1,2,\dots ,N\right\}, z\in \left\{1,2,\dots ,256\right\}\\ s_{g+3}\left(d,z\right)=X_{g}\left(d,idx_{g+3}\left(z\right)\right), g\in \left\{1,2,3\right\}\end{array}$$

(9)

where \(s\) represents a selected feature vector (six feature vectors are selected, each of length 256); and \(N\), the number of PPG signals used.
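Steps 5 and 6 reduce to ranking columns by a selector score and keeping the best 256. The scores would come from Chi2 (e.g., `sklearn.feature_selection.chi2`) or from NCA feature weights; the rank-and-index operation itself can be sketched as:

```python
import numpy as np

def select_top_features(X, scores, k=256):
    """Step 6 (Eq. 9): rank feature columns by their selector scores
    (Chi2 statistics or NCA weights) and keep the k highest-scoring ones."""
    idx = np.argsort(scores)[::-1][:k]   # qualified indices, best first
    return X[:, idx], idx
```

Running this once per selector on each of the three matrices yields the six selected feature matrices.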

Classification

kNN [42] and SVM [43] are applied to the six selected feature vectors, generating 12 (6 × 2) outcomes.

Step 7: Calculate 12 classifier-specific outcomes from the six selected feature vectors using kNN and SVM classifiers with ten-fold cross-validation (CV).

$$\begin{array}{c}c_{r}=kNN\left(s_{r},y\right), r\in \left\{1,2,\dots ,6\right\}\\ c_{r+6}=SVM\left(s_{r},y\right)\end{array}$$

(10)

where \(c\) represents classifier-specific outcomes.
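Step 7 can be sketched with scikit-learn (an assumed dependency; the paper's kNN and SVM hyperparameters are not given in this excerpt, so defaults are used here):

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

def classifier_outcomes(selected, y, cv=10):
    """Step 7 (Eq. 10): run kNN and SVM on each selected feature matrix
    with ten-fold CV; six matrices give 12 outcome vectors c_1 .. c_12."""
    outcomes = []
    for s in selected:                       # kNN outcomes first
        outcomes.append(cross_val_predict(KNeighborsClassifier(), s, y, cv=cv))
    for s in selected:                       # then SVM outcomes
        outcomes.append(cross_val_predict(SVC(), s, y, cv=cv))
    return outcomes
```

`cross_val_predict` returns one out-of-fold prediction per sample, which is what the accuracy calculation and voting of the fusion stage operate on.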

Information Fusion

IMV [44] is applied to generate 10 voted outcomes from the 12 classifier-specific outcomes; and a greedy algorithm, to select the best outcome from the total of 22 (12 classifier-specific plus 10 voted outcomes). The steps are detailed below:

Step 8: Apply the IMV algorithm to generate 10 voted results.

$$\begin{array}{c}acc\left(b\right)=\theta \left(c_{b},y\right), b\in \left\{1,2,\dots ,12\right\}\\ id=\rho \left(acc\right)\\ v^{a-2}=\varpi \left(c_{id\left(1\right)},c_{id\left(2\right)},\dots ,c_{id\left(a\right)}\right), a\in \left\{3,4,\dots ,12\right\}\end{array}$$

(11)

where \(acc\) represents classification accuracy; \(\theta \left(.\right)\), the classification accuracy calculation function; \(id\), the qualified indices; \(\rho (.)\), the descending sort function; \(v\), a voted outcome; and \(\varpi (.)\), the mode function.

Step 9: Select the best outcome using the greedy algorithm.

$$\begin{array}{c}acc\left(z+12\right)=\theta \left(v^{z},y\right), z\in \left\{1,2,\dots ,10\right\}\\ ind=argmax\left(acc\right)\\ finres=\left\{\begin{array}{l}c_{ind}, ind\le 12\\ v^{ind-12}, ind>12\end{array}\right.\end{array}$$

(12)

where \(ind\) represents the index of the maximum accuracy; and \(finres\), the final outcome.
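Steps 8 and 9 can be sketched together (names are ours; ties in the column-wise mode are broken toward the smaller label by `np.bincount`):

```python
import numpy as np

def majority_vote(outcomes):
    """Column-wise mode of several prediction vectors (Eq. 11)."""
    outcomes = np.asarray(outcomes)
    return np.array([np.bincount(col).argmax() for col in outcomes.T])

def imv_with_greedy(outcomes, y):
    """Steps 8-9 (Eqs. 11-12): sort outcomes by accuracy, vote over the
    best 3, 4, ..., n, then greedily return the highest-accuracy candidate
    among all classifier-specific and voted outcomes."""
    acc = [np.mean(c == y) for c in outcomes]
    order = np.argsort(acc)[::-1]                 # descending accuracy
    voted = [majority_vote([outcomes[i] for i in order[:a]])
             for a in range(3, len(outcomes) + 1)]
    candidates = list(outcomes) + voted
    accs = [np.mean(c == y) for c in candidates]
    return candidates[int(np.argmax(accs))]
```

With the paper's 12 classifier-specific outcomes this produces 10 voted outcomes (votes over the top 3 through the top 12) and selects the final result from the resulting 22 candidates.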
