Duan Li, Ruizheng Shi, Ni Yao, Fubao Zhu and Ke Wang
(1. School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450001, China; 2. Department of Cardiology, Xiangya Hospital, Central South University, Changsha 410008, China)
Abstract: The automatic detection of cardiac arrhythmias through remote monitoring remains a challenging task, since electrocardiograms (ECGs) are easily contaminated by physiological artifacts and external noise, and their morphological characteristics vary significantly between patients. A fast patient-specific arrhythmia classification scheme is proposed, in which wavelet adaptive threshold denoising is combined with a least squares twin support vector machine (LSTSVM) tuned by a quantum genetic algorithm (QGA). The wavelet adaptive threshold denoising is employed for noise reduction, and morphological features combined with timing interval features are then extracted to evaluate the classifier. For each patient, an individual and fast classifier is trained on common and patient-specific training data. Following the recommendations of the Association for the Advancement of Medical Instrumentation (AAMI), experimental results on the MIT-BIH arrhythmia benchmark database demonstrate that the proposed method achieves average detection accuracies of 98.22%, 99.65% and 99.41% for abnormal beats, ventricular ectopic beats (VEBs) and supraventricular ectopic beats (SVEBs), respectively. In addition to its detection accuracy, sensitivity and specificity, the proposed method consumes less CPU running time than other representative state-of-the-art methods. It can be ported to an Android-based embedded system and is therefore suitable for wearable devices.
Key words: wearable ECG monitoring systems; patient-specific arrhythmia classification; quantum genetic algorithm; least squares twin SVM
Any disorder of heart rate or rhythm, or change in the morphological pattern, is an indication of an arrhythmia. Numerous arrhythmias, such as ventricular fibrillation and atrial flutter, can result in cardiac arrest, hemodynamic collapse, and unexpected death[1-2]. Therefore, automatic, highly efficient arrhythmia monitoring and early alerting are critical in clinical cardiology[3]. The electrocardiogram (ECG) is a routine clinical tool for continuous monitoring of cardiac activity and is also an essential monitoring item in the ICU. However, automatic, real-time detection and classification of arrhythmias is still a challenging task, as the morphological and temporal characteristics of ECG signals vary significantly between patients and under different temporal and physical conditions[4-7].
In recent years, many approaches to the ECG classification problem have been reported, such as linear discriminant analysis[8-9], the K-nearest neighbor classifier, the decision tree classifier[10-11], the random forest[12], and various artificial neural networks[13-15]. Another widely used method is the support vector machine (SVM), including the least squares SVM and SVMs combined with intelligent optimization algorithms such as particle swarm optimization (PSO) and the genetic algorithm (GA), which have been shown to give high-quality results in automatic ECG diagnosis[16-20].
Although research in this area has made excellent progress, some issues remain to be investigated before these methods can be applied in clinical practice. First, the design of a robust arrhythmia classifier that is suitable for real-time monitoring and early alerting is still a challenging task. Second, traditional methods share the drawback of inconsistent performance when classifying a new patient’s ECG signals[21]. Third, common evaluation practice is still not consistently applied when a particular method is tested on a benchmark dataset.
To address these issues, a least squares twin SVM (LSTSVM) classifier is proposed in this paper to improve on traditional SVMs for the classification of ECG arrhythmias. The twin support vector machine (TWSVM) is approximately 4 times faster than conventional SVMs and generalizes well[22-25]. It has also been used successfully in bioinformatics[26-27]. The LSTSVM is the least squares version of the TWSVM, which uses equality constraints rather than inequality constraints and is therefore more suitable for real-time hardware implementation. Furthermore, it has better generalization ability and faster computational speed than the TWSVM[28-29]. To learn how the method would perform, we investigated the influence of changing the parameters of the LSTSVM classifiers. One of the strengths of this study is the use of the search capability of the QGA to find the soft margin constant and the kernel parameters of the proposed classifiers. The QGA uses quantum principles such as the superposition of states, quantum gates and quantum registers to overcome the high complexity of optimization problems[30].
In this paper, an individual LSTSVM classifier is trained using common and patient-specific training data; such a patient-specific approach can improve classification performance. Once a patient has used the device for several days, the system will have acquired that patient's ECG signals, and a heartbeat classifier specific to that patient can be trained. Such a solution can conveniently be used for remote arrhythmia monitoring and diagnosis on a lightweight wearable device. Following the recommendations of the AAMI, the proposed approach is validated experimentally on the well-known MIT-BIH arrhythmia database.
The remainder of this paper is organized as follows. Section 1 describes the proposed patient-specific heartbeat classification system and gives a detailed description of the raw ECG data representations used for it. Section 2 presents the self-adaptive wavelet method for ECG signal preprocessing. In Section 3, the proposed QGA-based LSTSVM classification model is formulated and implemented for the patient-specific ECG recognition system. Experimental results are given in Section 4, and Section 5 concludes the paper.
In this work, ECG datasets from the MIT-BIH arrhythmia database are used for the performance evaluation of the proposed patient-specific ECG recognition system. This database consists of 48 two-lead recordings, each approximately half an hour long and sampled at 360 Hz. We used the modified-lead II signals. To extract the ECG beat waveforms, QRS detection is performed by means of the WFDB software, which is available at https://physionet.org/physiotools/wfdb-linux-quick-start.shtml. The QRS detection result was also adjusted by the maximum correction. The implementation scheme of the real-time patient-specific arrhythmia detection system is shown in Fig.1.
Fig.1 Real-time classification and monitoring system scheme
ECG signals from a wearable device are transmitted to the remote monitoring platform, then filtered by a wavelet filter and segmented according to the R position of each beat detected by the WFDB software. The proposed remote monitoring system includes offline training and real-time prediction. The ECG data used for training the classifier consist of common and patient-specific training patterns. We selected representative beats from each class in some recordings to help the classifier learn arrhythmia patterns that were not included in the patient-specific data. The patient-specific data were selected from each patient’s ECG record and were used as part of the training data to perform patient adaptation.
In long-term dynamic ECGs, various sources of background noise are expected, such as baseline wander, electromyogram noise, motion artifacts, power-line interference, and muscle noise, all of which distort the signals of interest and consequently reduce classification accuracy. To keep the hardware implementation minimal and improve the de-noising performance, wavelet self-adaptive threshold de-noising is used for ECG signal preprocessing in this study.
Power-line interference can be regarded as narrow-band noise, which can be eliminated by the ECG signal acquisition hardware. However, baseline wander, which ranges from 0.05 Hz to 2 Hz, and other wideband noise cannot be eliminated easily in hardware without sophisticated electric circuits.
The wavelet transform is a powerful tool for non-stationary signal analysis and has been widely used in ECG signal preprocessing; it decomposes a signal into several scales that represent different frequency bands. In this study, a wavelet self-adapting filter was employed to preprocess the ECG signals. It is composed of 3 steps, namely, wavelet decomposition, adaptive threshold de-noising, and reconstruction. As the symlet wavelet family is closely related to the shape of ECG waveforms, a Sym6 wavelet function with decomposition level 9 was selected in the wavelet decomposition step. The Mallat algorithm is employed to compute the detail coefficients $d_k[n]$ and the approximation coefficients $a_k[n]$, which correspond to the high- and low-frequency parts of the signal, respectively. To remove the baseline drift, the approximation coefficients at level 9 were set to 0, because the power of the baseline wander is concentrated in the low-frequency levels. In addition, a thresholding algorithm that reduces the noise by shrinking or zeroing the detail coefficients was used to remove high-frequency noise. The universal method is applied to calculate a general optimal threshold for white Gaussian noise under a mean square error criterion and its side condition. In this method, the threshold function is defined as follows:
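$$\hat{d}_k[n] = \begin{cases} \operatorname{sgn}(d_k[n])\,\big(|d_k[n]| - Th\big), & |d_k[n]| \ge Th \\ 0, & |d_k[n]| < Th \end{cases} \tag{1}$$
The threshold is selected as
$$Th = \sigma\sqrt{2\ln n} \tag{2}$$
where $n$ is the number of samples in the noisy signal and $\sigma$ is the standard deviation of the noise, which is estimated by the following relationship:
$$\sigma = \frac{\operatorname{median}(|Y_{ij}|)}{0.6745} \tag{3}$$
in which $|Y_{ij}|$ is the magnitude of the first-level detail coefficients of the wavelet transform of the noisy ECG. Detail coefficients with a magnitude smaller than the threshold $Th$ were set to 0, and the magnitudes of the remaining coefficients were reduced by $Th$. The high-frequency detail coefficients in each layer were processed by this threshold quantization. Finally, the ECG signal is reconstructed from the new wavelet coefficients via the inverse wavelet transform.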
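The thresholding step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the constant 0.6745 in the median-based noise estimate is the customary choice and is assumed here, and the wavelet decomposition and reconstruction themselves (Sym6, level 9) are left to a wavelet library.

```python
import math
import statistics


def estimate_noise_sigma(level1_details):
    """Robust noise estimate: median magnitude of the level-1 details / 0.6745 (assumed constant)."""
    return statistics.median(abs(d) for d in level1_details) / 0.6745


def universal_threshold(sigma, n_samples):
    """Universal threshold Th = sigma * sqrt(2 * ln(n))."""
    return sigma * math.sqrt(2.0 * math.log(n_samples))


def soft_threshold(coeffs, th):
    """Zero coefficients smaller than th in magnitude; shrink the others toward zero by th."""
    out = []
    for d in coeffs:
        if abs(d) < th:
            out.append(0.0)
        else:
            out.append(math.copysign(abs(d) - th, d))
    return out
```

In a full pipeline, `soft_threshold` would be applied to the detail coefficients of every level, with the level-9 approximation coefficients set to zero before reconstruction.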
As shown in Fig.2, the original ECG signal is taken from record 105 of the MIT-BIH arrhythmia database, which contains a large amount of induced noise. The solid-line waveform is the denoised signal obtained with a tenth-order Butterworth bandpass filter, whose lower and upper cutoff frequencies were 0.25 Hz and 40 Hz, respectively. From Fig.2, it is clear that the proposed wavelet filter can effectively remove the baseline wander and other wideband noise.
Fig.2 Comparison of different de-noising methods
Segmentation is performed after the adaptive filtering. Each heartbeat sample included 129 points before the R peak and 190 points after the R peak. As the time intervals between adjacent beats carry useful arrhythmia information, we also extracted three local timing features that add to the discriminating power of the morphology-based features, so each heartbeat sample had 323 points. The timing features are nextRR, prevRR and ratRR, defined as the time interval to the next beat (nextRR), the time interval from the previous beat (prevRR), and the ratio of the previous interval to the next one (ratRR), respectively. The morphological features in combination with the temporal features of the heartbeats were used in this study.
The traditional SVM involves the solution of a single quadratic programming problem (QPP)[31]. This can be time-consuming for datasets with a large number of features. In addition, the SVM obtains the predicted label from a single maximum-margin hyperplane. The twin SVM is based on the intuition that a better prediction can be obtained with a formulation that allows more than one hyperplane and does not require the hyperplanes to be parallel.
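The construction of one 323-value heartbeat sample can be sketched as follows. This is a hedged illustration: it assumes that the R-peak sample itself is counted (129 + 1 + 190 = 320 morphological points), that the RR intervals are expressed in seconds, and that ratRR is prevRR divided by nextRR; the function name and the ordering of the three timing features are hypothetical.

```python
FS = 360  # sampling rate of the MIT-BIH recordings in Hz
PRE, POST = 129, 190  # samples taken before / after the R peak


def beat_features(signal, r_peaks, i):
    """Return the 323-value vector for beat i, or None near the record edges."""
    if i <= 0 or i >= len(r_peaks) - 1:
        return None  # prevRR / nextRR are undefined for the first and last beat
    r = r_peaks[i]
    if r - PRE < 0 or r + POST >= len(signal):
        return None  # the window would run off the signal
    morphology = list(signal[r - PRE : r + POST + 1])  # 129 + 1 + 190 = 320 points
    prev_rr = (r - r_peaks[i - 1]) / FS
    next_rr = (r_peaks[i + 1] - r) / FS
    rat_rr = prev_rr / next_rr
    return morphology + [next_rr, prev_rr, rat_rr]
```

For example, with R peaks at samples 300, 600 and 960, the middle beat has prevRR = 300/360 s and nextRR = 1 s.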
Consider a binary classification problem with $m_1$ samples belonging to class +1 and $m_2$ samples belonging to class -1 in the $n$-dimensional real space $\mathbb{R}^n$. Let the matrix $X_1 \in \mathbb{R}^{m_1 \times n}$ represent the data points of class +1 and the matrix $X_2 \in \mathbb{R}^{m_2 \times n}$ represent the data points of class -1. Nonlinear LSTSVM seeks two non-parallel hyperplanes in $\mathbb{R}^{m_1+m_2}$:
$$K(x, X)u_1 + \gamma_1 = 0, \qquad K(x, X)u_2 + \gamma_2 = 0 \tag{4}$$
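The primal QPPs of the nonlinear TWSVM can be modified by using the 2-norm of the slack variables and replacing the inequality constraints with equality constraints:
$$\min_{u_1,\gamma_1,\xi}\ \frac{1}{2}\left\|K(X_1,X)u_1+e_1\gamma_1\right\|^2+\frac{c_1}{2}\xi^{T}\xi \quad \text{s.t.}\quad -\big(K(X_2,X)u_1+e_2\gamma_1\big)+\xi=e_2 \tag{5}$$
and
$$\min_{u_2,\gamma_2,\eta}\ \frac{1}{2}\left\|K(X_2,X)u_2+e_2\gamma_2\right\|^2+\frac{c_2}{2}\eta^{T}\eta \quad \text{s.t.}\quad \big(K(X_1,X)u_2+e_1\gamma_2\big)+\eta=e_1 \tag{6}$$
By substituting the constraints into the objective functions, these QPPs become
$$\min_{u_1,\gamma_1}\ \frac{1}{2}\left\|K(X_1,X)u_1+e_1\gamma_1\right\|^2+\frac{c_1}{2}\left\|K(X_2,X)u_1+e_2\gamma_1+e_2\right\|^2 \tag{7}$$
$$\min_{u_2,\gamma_2}\ \frac{1}{2}\left\|K(X_2,X)u_2+e_2\gamma_2\right\|^2+\frac{c_2}{2}\left\|K(X_1,X)u_2+e_1\gamma_2-e_1\right\|^2 \tag{8}$$
The solutions of QPPs (7) and (8) can be derived to be
$$\begin{bmatrix}u_1\\ \gamma_1\end{bmatrix} = -\left(H^{T}H+\frac{1}{c_1}G^{T}G\right)^{-1}H^{T}e_2 \tag{9}$$
and
$$\begin{bmatrix}u_2\\ \gamma_2\end{bmatrix} = \left(G^{T}G+\frac{1}{c_2}H^{T}H\right)^{-1}G^{T}e_1 \tag{10}$$
where $G=[K(X_1,X)\ \ e_1]$, $H=[K(X_2,X)\ \ e_2]$, $X=[X_1;\,X_2]$ stacks all training samples, and $e_1$, $e_2$ are vectors of ones of length $m_1$ and $m_2$.
In this way, the nonlinear binary LSTSVM classifier generates a kernel surface for each class in the higher-dimensional space. For a test data point, the perpendicular distance to each kernel-generated surface is calculated, and the classifier assigns the point to the class whose surface is nearer. The final decision function of the nonlinear classifier is as follows:
$$\operatorname{class}(x) = \arg\min_{k=1,2}\ \left|K(x,X)u_k+\gamma_k\right| \tag{11}$$
where $k=1$ corresponds to class +1 and $k=2$ to class -1.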
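Because the LSTSVM solution reduces to two linear systems in $G$ and $H$, training can be sketched compactly with NumPy. The sketch below follows the standard closed-form LSTSVM solution from the literature rather than the authors' code: the function names are hypothetical, the RBF parameterization $\exp(-\|a-b\|^2/(2\sigma^2))$ is an assumption, and the small ridge term `eps` is added here only because the kernel systems are otherwise rank-deficient.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix exp(-||a - b||^2 / (2 sigma^2)) (assumed form)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def lstsvm_fit(X1, X2, c1, c2, sigma, eps=1e-6):
    """Solve the two LSTSVM linear systems; returns (X, z1, z2) with z = [u; gamma]."""
    X = np.vstack([X1, X2])
    G = np.hstack([rbf_kernel(X1, X, sigma), np.ones((len(X1), 1))])
    H = np.hstack([rbf_kernel(X2, X, sigma), np.ones((len(X2), 1))])
    ridge = eps * np.eye(G.shape[1])  # assumption: small ridge for numerical stability
    z1 = -np.linalg.solve(H.T @ H + G.T @ G / c1 + ridge, H.T @ np.ones(len(X2)))
    z2 = np.linalg.solve(G.T @ G + H.T @ H / c2 + ridge, G.T @ np.ones(len(X1)))
    return X, z1, z2


def lstsvm_predict(model, X_test, sigma):
    """Assign +1 or -1 according to the nearer kernel-generated surface."""
    X, z1, z2 = model
    K = np.hstack([rbf_kernel(X_test, X, sigma), np.ones((len(X_test), 1))])
    d1, d2 = np.abs(K @ z1), np.abs(K @ z2)
    return np.where(d1 <= d2, 1, -1)
```

No iterative QPP solver is needed, which is the reason the text gives for the speed advantage of LSTSVM over TWSVM and the conventional SVM.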
The performance of the LSTSVM model, such as its generalization ability and forecasting accuracy, can be greatly affected by the two penalty parameters $c_1$, $c_2$ and the kernel function parameter $\sigma$, so the choice of these parameters has a heavy impact on the forecasting accuracy. The genetic algorithm has proved well suited to optimization problems with many parameters. To overcome the high complexity of the optimization problem, we used the QGA, which combines the GA with quantum-inspired mechanisms from quantum physics, for parameter tuning of the classifier. To reduce the computational complexity, we set $c_1=c_2=c$. As shown in Fig.1, the QGA is used to optimize the LSTSVM parameters $(c,\sigma)$ in this study, where $\sigma$ is the kernel function parameter of the Gaussian RBF. The optimal values of the penalty parameter $c$ and the kernel parameter $\sigma$ were selected from the following ranges: $c\in\{10^{-8},\ldots,10^{4}\}$, $\sigma\in\{2^{-4},\ldots,2^{8}\}$, and the 5-fold cross-validation classification result is the fitness function of both the QGA and the GA. The maximum number of generations of the algorithms was set to 300 in the following study. The QGA shares several phases with its classical counterpart, such as the initialization of solutions, selection, variation operations, evaluation and finally replacement. In addition, it has specific features of its own, such as quantum interference and measurement. A typical iteration of the QGA consists of the cyclic application of selection, variation operators (quantum crossover, quantum mutation and quantum interference), quantum measurement, evaluation and replacement[32].
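The search over $(c,\sigma)$ can be illustrated with a deliberately simplified quantum-inspired loop. This sketch is an assumption-laden toy, not the QGA of [32]: it keeps only qubit measurement and a rotation-gate update toward the best solution found so far, and omits quantum crossover, mutation and interference. The function names, the 8-bit encoding per parameter, the population size and the rotation step are all hypothetical; in the paper's setting `fitness` would return the 5-fold cross-validation accuracy of the LSTSVM.

```python
import math
import random


def decode(bits, lo, hi):
    """Map a list of bits to a real-valued exponent in [lo, hi]."""
    value = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * value / (2 ** len(bits) - 1)


def qga_search(fitness, n_bits=8, pop=10, generations=50,
               delta=0.05 * math.pi, seed=0):
    """Return (best_fitness, c, sigma) found by a simplified quantum-inspired search."""
    rng = random.Random(seed)
    n = 2 * n_bits  # first half encodes log10(c), second half log2(sigma)
    # Each qubit starts in equal superposition: P(bit = 1) = sin^2(pi/4) = 0.5
    thetas = [[math.pi / 4] * n for _ in range(pop)]
    best = None
    for _ in range(generations):
        for q in thetas:
            # Measurement collapses every qubit to 0 or 1
            bits = [1 if rng.random() < math.sin(t) ** 2 else 0 for t in q]
            c = 10.0 ** decode(bits[:n_bits], -8, 4)
            sigma = 2.0 ** decode(bits[n_bits:], -4, 8)
            score = fitness(c, sigma)
            if best is None or score > best[0]:
                best = (score, c, sigma, bits)
        for q in thetas:
            # Rotation gate: nudge each qubit toward the best bit string so far,
            # keeping the angle away from 0 and pi/2 so exploration never stops
            for j, b in enumerate(best[3]):
                step = delta if b == 1 else -delta
                q[j] = min(max(q[j] + step, 0.05), math.pi / 2 - 0.05)
    return best[0], best[1], best[2]
```

A call such as `qga_search(lambda c, s: cv_accuracy(c, s))`, with `cv_accuracy` a user-supplied (hypothetical) cross-validation routine, would return the best accuracy together with the selected `c` and `sigma`.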
The MIT-BIH arrhythmia database contains 48 recordings and 18 types of heartbeats, all of which are classified and labeled. Two recordings do not contain lead II ECG signals, so the lead II ECG signals of the remaining 46 recordings are used in this study. According to the AAMI recommendations, the heartbeat types can be grouped into five heartbeat classes: normal beat (N), supraventricular ectopic beat (S), ventricular ectopic beat (V), fusion (F) of a V and an N, and unknown beat type (Q)[12]. In this study, 300 common beats were selected from the recordings that included representative beats. For each patient-specific classifier, another 300 patient-specific training beats were selected from the patient's record, and the remaining beats of the record were used to evaluate the proposed patient-specific classifiers.
To evaluate the performance of the proposed classifier, we performed experiments on normal detection (N class versus [S, V, F and Q]), VEB detection (V class versus [N, S and F]) and SVEB detection (S class versus [N, V and F]). QGA and GA were compared for tuning the parameters of the classifiers, and several neural network methods were also implemented for comparison. The classification performance is measured using three standard metrics: sensitivity (Se), specificity (Sp) and classification accuracy (Acc)[7].
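The one-versus-rest tasks and the three metrics can be made concrete with a short sketch. It assumes the usual definitions Se = TP/(TP+FN), Sp = TN/(TN+FP) and Acc = (TP+TN)/total, with beats encoded by their AAMI class letter; the function names are hypothetical.

```python
def binary_task(aami_labels, positive, negatives):
    """Keep only beats that belong to the task; mark the positive class 1, the rest 0."""
    return [(i, 1 if lab == positive else 0)
            for i, lab in enumerate(aami_labels)
            if lab == positive or lab in negatives]


def se_sp_acc(y_true, y_pred):
    """Sensitivity, specificity and accuracy for 0/1 labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    se = tp / (tp + fn) if tp + fn else None  # undefined when no positive beats exist
    sp = tn / (tn + fp) if tn + fp else None  # undefined when no negative beats exist
    acc = (tp + tn) / len(y_true)
    return se, sp, acc
```

For VEB detection, `binary_task(labels, "V", {"N", "S", "F"})` drops Q beats and labels V beats as positive; a record without any V beats yields an undefined Se.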
4.2.1 QGA tuning of the LSTSVM classifier
Two classification experiments were conducted to compare QGA with GA using 10 recordings of the MIT-BIH arrhythmia database. The optimal parameters and the testing classification results are shown in Tab.1. The best results are highlighted in all the following tables of this study. As shown in Tab.1, using the same ECG database, the average classification accuracy of the QGA LSTSVM classifier reaches 98.22%, which is 5.45% higher than that of the GA LSTSVM. We therefore used the QGA to tune the parameters of the classifiers in the following experiments. Furthermore, the CPU running times of the LSTSVM, TWSVM and SVM classifiers were evaluated on the same datasets. The experimental results are shown in Fig.3. From Fig.3, it is clear that the proposed LSTSVM classifier had the lowest CPU running time in these experiments.
Tab.1 Parameter tuning of the classifiers for abnormal detection
Fig.3 Comparison of the CPU running time for different SVMs
4.2.2 Comparison of different neural networks for VEB and SVEB detection
In accordance with the AAMI recommendations, VEB and SVEB detection are considered separately, and the experimental results are summarized in Tab.2 and Tab.3. We randomly selected 8 recordings (100, 113, 203, 207, 217, 221, 230, 232) from the 46 records, and several SVM classifiers and a BP neural network were applied to detect VEBs and SVEBs individually. In Tab.2 and Tab.3, the number of type V or type S beats in some recordings was zero, so the corresponding sensitivities are represented by “*”. Se is the rate of correctly classified events among all events, which is important in disease detection. In VEB detection, Se is the rate of correctly classified V events among all V events, and the proposed method achieved a better Se for all the recordings. From Tab.2, it can be seen that the average VEB detection accuracies of the LSTSVM, TWSVM, SVM and BP classifiers are 99.65%, 94.69%, 94.69% and 90.02%, respectively. In SVEB detection, Se is the rate of correctly classified S events among all S events. From Tab.3, the average SVEB detection accuracies of the LSTSVM, TWSVM, SVM and BP classifiers are 99.65%, 94.69%, 94.69% and 90.02%, respectively. Overall, the proposed method achieves better performance on the Se and Sp metrics.
Tab.2 Experimental results in terms of VEB
Tab.3 Experimental results in terms of SVEB
This paper proposes a novel classifier scheme combining wavelet denoising with a QGA-tuned LSTSVM for patient-specific ECG arrhythmia detection. This approach is suitable for remote ECG monitoring and diagnosis, especially on wearable devices. The wavelet self-adapting threshold filter was used to remove baseline drift and electromyographic noise. The common beats combined with the patient-specific beats were used to train the proposed LSTSVM classifier. The experimental results on the MIT-BIH database showed that the overall recognition accuracy, sensitivity and specificity were all superior to those of the traditional SVMs and the BP network. The experimental results also verified that the proposed method runs faster than the other machine learning methods. Future work will modify the training mechanism of the proposed classifier to realize real-time training and self-learning, which may also yield higher accuracy.
Journal of Beijing Institute of Technology, 2020, Issue 1