Sample Entropy on Multidistance Signal Level Difference for Epileptic EEG Classification


This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

Biological signals are complex signals produced by physiological processes in the body [1]. A complex signal has properties that lie between those of a periodic signal and those of a random signal. Such signals are analyzed from several points of view, including fractal, entropy, and chaos-based approaches. One commonly used method for complex signal analysis is multiscale entropy (MSE), which Costa et al. proposed for the analysis of biological signals [2]. Because biological signals are considered to have multiscale properties, an analysis on multiple scales provides more complete information about the signal characteristics.

One of the biological signals most commonly analyzed using MSE is the electroencephalogram (EEG). Measurements of brain function through EEG can be used to monitor and interpret brain activity, and even to predict outcomes [3]. MSE has been used to analyze EEG signals for monitoring the depth of anesthesia during surgery [4]; the results showed that MSE at the presurgical stage was lower than at the anesthesia stage. MSE has also been used to measure the dynamical complexity of EEG signals in patients with Alzheimer’s disease (AD) [5]. Although the difference between normal and AD EEG was not statistically very significant, the MSE pattern of the EEG signals differed between normal subjects and AD patients. Lu et al. used the MSE of EEG signals as a predictor for the prognosis of neonatal seizures [6]. The EEG signals were acquired from 32 infants below two months old and analyzed using sample entropy, multiscale entropy, and the complexity index (CI); the MSE and CI values decreased in infants who experienced seizures. A multiscale entropy analysis of attention-related EEG based on motor imaginary potential was reported in [7], where MSE was used to differentiate EEG signals recorded during three attention-related activities with an accuracy of 63.158%. Other studies [8, 9] compared epileptic EEG signals with the EEG signals of normal subjects using Detrended Cross-Correlation Analysis (DCCA) and concluded that the DCCA value of epileptic EEG signals was greater than that of normal subjects’ EEG signals.

A review of the application of entropy methods to the recognition of epilepsy from EEG signals was presented in [10]. The paper compared various entropy methods used for the classification of normal, interictal, and ictal EEG signals. Many researchers have experimented with different entropies in the analysis and classification of EEG signals. Some used a single entropy feature, and others used a combination of entropies; both approaches have been reported to reach accuracies of more than 92%. From this review, it can be concluded that entropy is one of the state-of-the-art methods with good performance for recognizing EEG signals in normal, ictal, and interictal conditions, which may be difficult to distinguish visually. Entropy can also be used for focal and nonfocal EEG signals.

The related works described above show that feature extraction plays an essential role in pattern recognition, especially for EEG signals. In this research, we simulated and analyzed sample entropy (SampEn) on Multidistance Signal Level Difference (MSLD) for feature extraction, with the Support Vector Machine (SVM) algorithm for epileptic EEG signal classification. MSLD was selected because it performed well in a previous study [11]. The EEG signal was first processed with MSLD, and SampEn was then calculated at each MSLD level. The resulting features were classified using SVM. The test was performed on three classes of EEG data: EEG signals of epilepsy patients in ictal conditions, EEG signals in interictal conditions, and normal EEG signals from healthy subjects. All datasets used in this study were sourced from the open database available at the University of Bonn.

2. Materials and Methods

2.1. EEG Data

In this research, we used the EEG dataset available at the University of Bonn [12] (source: http://epileptologie-bonn.de/cms/upload/workgroup/lehnertz/eegdata.html). The data were recorded at a sampling frequency of 173.61 Hz and filtered using a 40 Hz low-pass filter (LPF), and were thus considered free from artifact noise. Each recording has a length of 4096 samples, corresponding to a duration of 23.6 s. In this study, we used three classes of EEG data: EEG signals from epileptic subjects during seizures (ictal), EEG signals in interictal conditions, and normal EEG signals from healthy subjects with eyes closed. The ictal and interictal data were obtained from five patients with pharmacoresistant focal onset epilepsy undergoing presurgical evaluation. These patients had long-term intracranial EEG recordings in the Department of Neurology, University of Bern. Electrodes were implanted in the brain area to record the interictal segments, that is, the intervals between seizures. Each data class consists of 100 recordings; thus, a total of 300 EEG recordings were tested in this study. Sample data for each class can be seen in Figure 1.

[figures omitted; refer to PDF]

2.2. Multidistance Signal Level Difference

Multidistance Signal Level Difference (MSLD) is a modification of the gray-level difference (GLD) proposed by Weszka et al. [13]. GLD is calculated from the absolute value of the difference between two pixels in the horizontal, vertical, or diagonal direction [11]. In the horizontal direction, GLD is calculated as $\begin{matrix} (1) & y (i, j) = |x (i, j) - x (i, j + D)|, \end{matrix}$ where $D$ is the pixel distance.

Because the signal used in MSLD is one-dimensional (1D), (1) is modified to (2): $\begin{matrix} (2) & y_{d} (i) = |x (i) - x (i + d)|, \end{matrix}$ where $i = 1, 2, \dots, N - d$ and $d = 1, 2, \dots, K$, with $N$ the number of samples in the signal and $K$ the largest distance used. MSLD is illustrated in Figure 2.

[figure omitted; refer to PDF]
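Equation (2) can be sketched in a few lines of Python with NumPy. This is a minimal illustration rather than the authors' implementation; the function name `msld` and the parameter name `max_distance` (standing in for K) are hypothetical.

```python
import numpy as np


def msld(x, max_distance):
    """Return the MSLD signals y_1 ... y_K of a 1D signal, following (2).

    `msld` and `max_distance` (K in the text) are illustrative names.
    """
    x = np.asarray(x, dtype=float)
    if not 1 <= max_distance < x.size:
        raise ValueError("max_distance must satisfy 1 <= K < N")
    # y_d(i) = |x(i) - x(i + d)|; each y_d has N - d samples.
    return [np.abs(x[:-d] - x[d:]) for d in range(1, max_distance + 1)]


levels = msld([1.0, 4.0, 2.0, 7.0], max_distance=2)
```

Each output signal is shorter than the input by d samples, so the K outputs are returned as a list rather than as a rectangular array.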

2.3. Sample Entropy

Sample entropy (SampEn) was proposed by Richman and Moorman to address a weakness of approximate entropy (ApEn) [14]. ApEn is biased by self-matches, in which each template vector of the signal is counted as matching itself. SampEn is based on the conditional probability that two sequences of the signal that match for $m$ points within a tolerance $r$ will still match when the sequence length is increased to $m + 1$ points. In this computation, a scalar distance between the two vectors being compared is used [15]. SampEn is expressed by $\begin{matrix} (3) & \mathrm{SampEn} (m, r) = \lim_{N \to \infty} - \ln \frac{A^{m} (r)}{B^{m} (r)}, \end{matrix}$ where $A^{m} (r)$ is the probability that two data sequences match for $m + 1$ points within tolerance $r$, and $B^{m} (r)$ is the probability that two data sequences match for $m$ points within tolerance $r$. Self-matches are excluded from both quantities. For a signal of finite length $N$, (3) can be estimated by defining $\begin{matrix} (4) & B = \left\{ \frac{(N - m - 1) (N - m)}{2} \right\} B^{m} (r) \end{matrix}$ and $\begin{matrix} (5) & A = \left\{ \frac{(N - m - 1) (N - m)}{2} \right\} A^{m} (r) . \end{matrix}$ Then, SampEn can be expressed by $\begin{matrix} (6) & \mathrm{SampEn} (m, r, N) = - \ln \frac{A}{B} . \end{matrix}$

The advantages of SampEn include its usability for short and noisy data sequences, its ability to separate large system variations, its better agreement with theory compared to ApEn, its consistent entropy values for different pattern lengths, and its exclusion of self-matches. The weakness of SampEn is the inconsistency of its entropy values for short data [10].
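The estimate in (4) to (6) can be sketched as follows in Python with NumPy. This is an illustrative sketch, not the authors' code, and it rests on several assumptions that the text does not state: the default m = 2, a tolerance equal to r times the standard deviation of the signal, the maximum absolute difference as the distance between templates, and a match defined as a distance less than or equal to the tolerance. The function name `sample_entropy` is hypothetical.

```python
import numpy as np


def sample_entropy(x, m=2, r=0.25):
    """Estimate SampEn(m, r, N) = -ln(A / B) as in (6).

    Assumptions (not stated in the text): tolerance = r * std(x),
    maximum-absolute-difference distance, and matches counted with <=.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if n <= m + 1:
        raise ValueError("signal is too short for the chosen m")
    tol = r * np.std(x)

    def count_matches(length):
        # Use the same N - m starting points for both template lengths.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        count = 0
        # Compare each template only with later ones: no self-matches.
        for i in range(len(templates) - 1):
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= tol))
        return count

    b = count_matches(m)      # matching pairs of length m
    a = count_matches(m + 1)  # matching pairs of length m + 1
    if a == 0 or b == 0:
        return float("nan")   # undefined when no matches are found
    return float(-np.log(a / b))


periodic = np.tile([0.0, 1.0], 50)
noise = np.random.default_rng(0).standard_normal(300)
se_periodic = sample_entropy(periodic)
se_noise = sample_entropy(noise)
```

A perfectly periodic signal yields a SampEn of zero because every match of length m is also a match of length m + 1, whereas white noise yields a clearly larger value.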

2.4. Support Vector Machine

Support Vector Machine (SVM) is a machine learning algorithm widely used for pattern recognition. The method, proposed in [16], is based on the principle of Structural Risk Minimization (SRM) and aims to find the best hyperplane separating two classes in a space. SVM is a popular, high-performance algorithm that is widely used for classification in computationally complex cases [17].

SVM was originally formulated for linear problems and was later extended to nonlinear problems. The extension relies on the kernel trick, which lets the classifier work in a high-dimensional space. The SVM concept is to design a hyperplane that classifies all training data into two classes. Figure 3 shows some patterns that are members of two classes, drawn as triangles and squares. Various alternative lines of discrimination (discrimination boundaries) are shown in Figure 4.

[figure omitted; refer to PDF] [figure omitted; refer to PDF]

SVM applies a kernel function to separate the two classes in the training data. Of the commonly used kernel functions, two are considered here. The first is the linear kernel function: $\begin{matrix} (7) & K (X, Y) = X^{T} Y . \end{matrix}$ The second is the polynomial kernel function: $\begin{matrix} (8) & K (X_{i}, X_{j}) = {(X_{i}^{T} X_{j})}^{d}, \end{matrix}$ where $d$ ($d \geq 1$) is the degree of the polynomial. For $d = 2$ or $d = 3$, the function is called a quadratic or a cubic kernel function, respectively.
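The kernels in (7) and (8) can be written out directly with NumPy, as in the short sketch below. The function names are hypothetical, and the convention that each row of the input matrix is one feature vector is an assumption made for this example.

```python
import numpy as np


def linear_kernel(X, Y):
    """Kernel (7) for every pair of rows of X and Y (rows are feature vectors)."""
    return np.asarray(X, dtype=float) @ np.asarray(Y, dtype=float).T


def polynomial_kernel(X, Y, degree):
    """Kernel (8): the linear kernel raised to the power `degree` (d >= 1)."""
    if degree < 1:
        raise ValueError("degree must be at least 1")
    return linear_kernel(X, Y) ** degree


X = np.array([[1.0, 2.0], [3.0, 4.0]])
gram_linear = linear_kernel(X, X)
gram_quadratic = polynomial_kernel(X, X, degree=2)
```

With `degree=1` the polynomial kernel reduces to the linear kernel, which is why the linear, quadratic, and cubic classifiers can be seen as one family.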

2.5. K-Fold Cross-Validation

Validating a machine learning algorithm requires a performance evaluation through cross-validation (CV), in which the dataset is separated into a training subset and a test subset. In K-fold cross-validation, the data are divided into K subsets. In this research, we used 5-fold CV: the data were divided into five subsets, of which four were used as training data and one as test data. The procedure was iterated five times, so that each subset served once as the test data, as illustrated in Figure 5.

[figure omitted; refer to PDF]
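The 5-fold partition described above can be sketched with NumPy as follows. The shuffling step and the fixed random seed are assumptions of this example, since the text does not say how the recordings were assigned to folds; the function name `k_fold_indices` is hypothetical.

```python
import numpy as np


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for K-fold cross-validation.

    Shuffling with a fixed seed is an assumption made for this sketch.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx


# 300 recordings, as in this study: 240 for training and 60 for testing per fold.
splits = list(k_fold_indices(300, k=5))
```

Every recording appears in exactly one test fold, so the five test results together cover the whole dataset.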

3. Results and Discussion

The result of the MSLD process for the seizure EEG signal with distance d = 1 to 5 is shown in Figure 6. MSLD calculates the absolute value of the difference between two data samples at distance $d$, so the resulting signal is never negative. The new signals generated by MSLD have properties slightly different from those of the original signal, and these properties were quantified using sample entropy.

[figure omitted; refer to PDF]

The sample entropy values of the MSLD results for each data class can be seen in Figure 7. Sample entropy was calculated with r = 0.25. The seizure EEG produced the highest SampEn value of the three conditions, and the interictal EEG produced the lowest. This indicates that the seizure EEG signal had the highest complexity. Visually, the SampEn values of the three classes were clearly separated, so the classes could be differentiated well in the classification process.

[figure omitted; refer to PDF]

The next step was to test the performance of MSLD-SampEn using SVM with several kernels as the classifier. The test was first performed using 20 SampEn values as features; the number of features was then reduced to observe the effect of feature reduction on accuracy. The results are shown in Tables 1, 2, and 3.
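The evaluation procedure described above can be sketched with scikit-learn, which is an assumption of this example because the text does not name the software used. The feature matrix below is synthetic stand-in data and not the SampEn values of this study, so the accuracies it produces have no relation to Tables 1 to 3. The function name `cv_accuracy` is hypothetical, and scikit-learn's polynomial kernel includes a scaling factor `gamma` that (8) does not.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in: 3 classes x 100 recordings, 20 features per recording.
# These are NOT the paper's SampEn features.
labels = np.repeat([0, 1, 2], 100)
features = rng.normal(loc=1.0 + 0.5 * labels[:, None], scale=0.3, size=(300, 20))


def cv_accuracy(X, y, degree):
    """Mean 5-fold CV accuracy of a polynomial-kernel SVM.

    degree 1, 2, 3 stands in for linear, quadratic, and cubic SVM;
    coef0=0.0 keeps the kernel close to the form of (8).
    """
    model = SVC(kernel="poly", degree=degree, coef0=0.0)
    folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    return float(cross_val_score(model, X, y, cv=folds).mean())


# Keep the first n features, mirroring the scale ranges 1-20, 1-15, 1-10, 1-5.
results = {n: cv_accuracy(features[:, :n], labels, degree=3) for n in (20, 15, 10, 5)}
```

Stratified folds are used here so that each fold contains the same number of recordings from every class; whether the original experiment did this is not stated.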

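From Tables 1, 2, and 3, the highest accuracy is 97.7%, obtained using MSLD with distance d = 1 to 20 and SampEn with r = 0.25, with both the quadratic SVM (Table 2) and the cubic SVM (Table 3). Reducing the number of features generally decreased the accuracy. The exceptions occur when the features are reduced from 20 to 15: the accuracy rose slightly for r = 0.25 using linear SVM, r = 0.15 using quadratic SVM, and r = 0.1 using cubic SVM, and was unchanged for r = 0.2 using linear SVM.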

Table 1

Accuracy (%) using linear SVM and 5-fold CV.

SampEn	Scale 1-20	Scale 1-15	Scale 1-10	Scale 1-5
r = 0.1	96	95.7	84.3	79.7
r = 0.15	96	95	83.7	78.7
r = 0.2	96	96	84	78.7
r = 0.25	95.7	96	84	78

Table 2

Accuracy (%) using quadratic SVM and 5-fold CV.

SampEn	Scale 1-20	Scale 1-15	Scale 1-10	Scale 1-5
r = 0.1	96.7	95.7	85.7	81.7
r = 0.15	95.3	96.7	86.3	82.7
r = 0.2	96.7	95.7	87.3	82.7
r = 0.25	97.7	96.7	85.3	80.7

Table 3

Accuracy (%) using cubic SVM and 5-fold CV.

SampEn	Scale 1-20	Scale 1-15	Scale 1-10	Scale 1-5
r = 0.1	96.3	97	85.7	85
r = 0.15	97	89.7	84.3	73.3
r = 0.2	97	96	88.3	82
r = 0.25	97.7	95.7	83.7	83

MSLD describes the relationship between pairs of samples separated by a specific distance, and the results show that this relationship differs between classes. An advantage of MSLD is that the signal variance is unchanged, unlike coarse-grained procedures, which decrease the signal variance as discussed in a previous study [11]. A decrease in signal variance indicates a change in the signal features, so a coarse-grained procedure alters the features of the original signal. In previous research on lung sound classification, the MSLD Hjorth descriptor obtained higher accuracy than the multiscale Hjorth descriptor using the coarse-grained procedure [18]. A disadvantage of MSLD is that the range of distances must be determined by trial and error. Empirically, however, MSLD works well for the distance range d = 1 to 15.

The MSLD method can be developed further in combination with other feature extraction methods, such as other entropy measures, statistical methods, or signal complexity methods. MSLD can also be used to process biological signals other than EEG signals and lung sounds [11]. Applying MSLD to the classification of ECG, EMG, or other biological signals is an interesting direction for future research.

4. Conclusion

This research describes the classification of epileptic EEG signals using MSLD sample entropy. Tests were performed on three classes of EEG signals: normal, seizure, and interictal. The dataset is available online from the Department of Epileptology, University of Bonn. In the feature extraction process, the sample entropy with r = 0.25 differed between the classes, and the seizure EEG signal produced the highest SampEn value of the three conditions. The simulation results show that the sample entropy values of the classes can be differentiated, which makes the signals easy to classify. We also tested the classifier performance by applying SVM to the MSLD-SampEn results. The highest accuracy, 97.7%, was obtained using MSLD with distance d = 1 to 20, SampEn with r = 0.25, and cubic SVM. MSLD is well suited to revealing differences between signals whose sample-to-sample differences are distinct. For further research, we suggest using MSLD for the classification of other biopotential signals that have high complexity.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] A. L. Goldberger, G. B. Moody, M. D. Costa, Variability vs. Complexity, 2012.

[2] M. Costa, A. L. Goldberger, C. Peng, "Multiscale entropy analysis of biological signals," Physical Review E: Statistical, Nonlinear, and Soft Matter Physics, vol. 71 no. 2, DOI: 10.1103/PhysRevE.71.021906, 2005.

[3] C. Uyulan, T. T. Erguzel, "Analysis of Time - Frequency EEG Feature Extraction Methods for Mental Task Classification," International Journal of Computational Intelligence Systems, vol. 10 no. 1, DOI: 10.2991/ijcis.10.1.87, 2017.

[4] Q. Liu, Y.-F. Chen, S.-Z. Fan, M. F. Abbod, J.-S. Shieh, "EEG Signals Analysis Using Multiscale Entropy for Depth of Anesthesia Monitoring during Surgery through Artificial Neural Networks," Computational and Mathematical Methods in Medicine, vol. 2015, DOI: 10.1155/2015/232381, 2015.

[5] T. Mizuno, T. Takahashi, R. Y. Cho, M. Kikuchi, T. Murata, K. Takahashi, Y. Wada, "Assessment of EEG dynamical complexity in Alzheimer's disease using multiscale entropy," Clinical Neurophysiology, vol. 121 no. 9, pp. 1438-1446, DOI: 10.1016/j.clinph.2010.03.025, 2010.

[6] W. Lu, J. Chen, C. Chang, W. Weng, W. Lee, J. Shieh, C. D. Hahn, "Multiscale Entropy of Electroencephalogram as a Potential Predictor for the Prognosis of Neonatal Seizures," PLoS ONE, vol. 10 no. 12, DOI: 10.1371/journal.pone.0144732, 2015.

[7] D. Ming, M. Zhang, Y. Xi, H. Qi, Y. Hu, K. D. K. Luk, "Multiscale entropy analysis of attention ralated EEG based on motor imaginary potential," Proceedings of the 2009 IEEE International Conference on Computational Intelligence for Measurement Systems and Applications, CIMSA 2009, pp. 24-27.

[8] J. Zhao, W. Dou, H. Ji, J. Wang, "Detrended Cross-Correlation Analysis on Epilepsy Electroenceplalagram Signals," Proceedings of the International Conference on Systems Engineering and Modelling (ICSEM-13), pp. 888-890.

[9] J. Zhao, W. Dou, H. Ji, J. Wang, "Epilepsy electroencephalagram singals study based on detrended cross-correlation analysis," Proceedings of the 2013 International Conference on Information, Business and Education Technology (ICIBET-2013), DOI: 10.2991/icibet.2013.273.

[10] U. R. Acharya, H. Fujita, V. K. Sudarshan, S. Bhat, J. E. W. Koh, "Application of entropies for automated diagnosis of epilepsy using EEG signals: a review," Knowledge-Based Systems, vol. 88, pp. 85-96, DOI: 10.1016/j.knosys.2015.08.004, 2015.

[11] A. Rizal, R. Hidayat, H. A. Nugroho, "Hjorth descriptor measurement on multidistance signal level difference for lung sound classification," Journal of Telecommunication, Electronic and Computer Engineering, vol. 9 no. 2, pp. 23-27, 2017.

[12] R. G. Andrzejak, K. Lehnertz, F. Mormann, C. Rieke, P. David, C. E. Elger, "Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: dependence on recording region and brain state," Physical Review E: Statistical, Nonlinear, and Soft Matter Physics, vol. 64 no. 6, DOI: 10.1103/PhysRevE.64.061907, 2001.

[13] J. S. Weszka, C. R. Dyer, A. Rosenfeld, "Comparative study of texture measures for terrain classification," IEEE Transactions on Systems, Man, and Cybernetics, vol. 6 no. 4, pp. 269-285, DOI: 10.1109/tsmc.1976.5408777, 1976.

[14] J. S. Richman, J. R. Moorman, "Physiological time-series analysis using approximate entropy and sample entropy," American Journal of Physiology-Heart and Circulatory Physiology, vol. 278 no. 6, pp. H2039-H2049, DOI: 10.1152/ajpheart.2000.278.6.H2039, 2000.

[15] A. Humeau-Heurtier, "The multiscale entropy algorithm and its variants: A review," Entropy, vol. 17 no. 5, pp. 3110-3123, DOI: 10.3390/e17053110, 2015.

[16] V. N. Vapnik, The Nature of Statistical Learning Theory, DOI: 10.1007/978-1-4757-2440-0, 1995.

[17] J. L. Rojo-Álvarez, M. Martínez-Ramón, J. Muñoz-Marí, G. Camps-Valls, Digital Signal Processing with Kernel Methods, 2018.

[18] A. Rizal, R. Hidayat, H. A. Nugroho, "Multiscale Hjorth descriptor for lung sound classification," Proceedings of the AIP Conference, vol. 1755.


Copyright © 2018 Achmad Rizal and Sugondo Hadiyoso. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0/

Abstract

Epilepsy is a neurological disorder resulting from excessive brain cell activity. It is generally characterized by recurrent unprovoked seizures. This neurological abnormality can be detected and evaluated using the electroencephalogram (EEG) signal. Many algorithms have been applied to achieve high performance in the classification of epileptic EEG. However, the complexity and randomness of EEG signals make it challenging for researchers to choose appropriate algorithms. In this research, sample entropy on Multidistance Signal Level Difference (MSLD) was applied to obtain the characteristics of EEG signals, especially those of epilepsy patients. The test was performed on three classes of EEG data: EEG signals of epilepsy patients in ictal conditions (during seizures), EEG signals in interictal conditions (between seizures), and normal EEG signals from healthy subjects with eyes closed. Classification and verification were done using the Support Vector Machine (SVM) method. With 5-fold cross-validation, the experimental results showed a highest accuracy of 97.7%.

Details

Title

Sample Entropy on Multidistance Signal Level Difference for Epileptic EEG Classification

Author

Achmad Rizal¹; Sugondo Hadiyoso²

¹ School of Electrical Engineering, Telkom University, Bandung 40257, Indonesia
² Telkom Applied Science School, Telkom University, Bandung 40257, Indonesia

Editor

Bernhard Schaller

Publication year

2018

Publication date

2018

Publisher

John Wiley & Sons, Inc.

ISSN

23566140

e-ISSN

1537744X

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1155/2018/8463256

ProQuest document ID

2111093865
