Advanced Deep Learning Spectroscopy of Scalogram

Introduction

Hypoxic–ischemic (HI) insults before and during birth, secondary to events such as placental abruption or umbilical cord occlusion, are a significant contributor to neonatal brain injury (hypoxic–ischemic encephalopathy; HIE).^[^1,2^] The preterm newborn is at greater risk of HIE.^[¹^] In contrast to an overall incidence of 1–3/1000 live births at term in high-income countries, babies born before 37 weeks of gestation have an HIE incidence of around 37.3/1000, rising to 120/1000 in infants born before 28 weeks of gestation.^[^1,3^] Survivors face life-long neurodevelopmental problems, including learning and cognitive impairments, behavioral problems, cerebral palsy, and epilepsy.^[⁴^] The burden on individuals and their families, and the economic costs to health and education systems, are substantial.^[^4,5^] To develop effective treatments, or to use existing treatments appropriately, we need to understand fully how brain injury evolves and to identify biological markers (biomarkers) that allow us to determine the phase of injury.

Key to effective therapy is the knowledge that injury evolves in different phases over time—a latent phase of recovery of oxidative metabolism, which is followed by a secondary loss of cerebral energy metabolism during which time most brain cell injury occurs, followed by a tertiary phase of both repair and ongoing injury.^[^1,3,6^] Therapeutic hypothermia (TH) is currently the only established treatment for HIE in babies born >36 weeks of gestation with moderate–severe HIE, and this treatment is now being cautiously explored for use in preterm babies.^[^1,2^] The current clinical protocol for TH was based on our preclinical studies in term and preterm fetal sheep, which established that this therapy is only effective if started within 6 h after the end of an HI insult and continued for ≈3 days.^[^1–3^] However, current clinical data show that many babies do not benefit from TH.^[²^] This in part reflects late recruitment of babies (typically 4–5 h) into treatment. However, it also reflects the fact that birth cannot always be taken as time zero. Many babies may also have experienced HI insults before birth, so that injury may have already evolved beyond the 6 h window of opportunity for TH efficacy.^[^2,3,6^] Thus, to determine which infants will benefit from TH, and any new therapies that optimally start in the latent phase, we need biomarkers to allow us to determine phases of injury.

Electroencephalogram (EEG) recordings can provide useful diagnostic and prognostic biomarkers of evolving HI injury.^[^7,8^] We have previously shown, in preterm fetal sheep EEG recordings, that epileptiform transients such as sharp waves and high-frequency microscale spike transients in the gamma frequency band (80–120 Hz), superimposed on a suppressed EEG background, are key EEG waveforms and are predictive of neural outcome during the first 6 h of recovery from an HI insult (see Figure 1).^[^6,8–11^] Similar EEG transients are also seen clinically and are associated with adverse neurological outcomes.^[¹²^]

FIGURE: This figure shows examples of EEG activity following a hypoxic–ischemic insult in preterm fetal sheep. A) Raw data example from one fetal sheep from 2 h before hypoxia-ischemia induced by umbilical cord occlusion (UCO) until 12 h postinsult showing profoundly suppressed EEG intensity during the latent phase followed by high-amplitude seizures. B,C) Examples of microscale HI gamma spike transients sampled from the EEG record at 55 min postinsult.

Current studies on the application of automated convolutional neural network (CNN)-based strategies in clinical neonatal EEG studies have primarily focused on developing consistently interpretable determinations of seizures and other EEG features, but not microseizures or post-HI EEG biomarkers.^[^13–17^] Our team has previously shown various successful fusion strategies for automatic detection of post-HI microscale EEG transients.^[^8,10^] Further, we have undertaken preliminary examination of the wavelet-scalogram (WS) CNN structure to robustly identify post-HI sharp waves from an EEG background and artifact during the latent phase after the HI insult with 95.34% accuracy.^[¹⁷^]

In the current study, we evaluated, for the first time, how application of spectrally rich WSs to our post-HI experimental EEG data could be used as inputs to a pattern classifier in the form of a 17-layer deep 2D-CNN for the accurate identification of microscale gamma spike transients in the latent phase. Our study showed that the WSs, generated through continuous wavelet transform (CWT) of the gamma spike transients over a broad scale-range of 1–19, using a reverse biorthogonal basis wavelet (rbio2.8), provide spectrally rich feature maps for the 17-layer deep WS-CNN pattern classifier to accurately (99.81 ± 0.15%) classify gamma spike transients from EEG background noise and other artifacts. Further, we introduced a spectral-based feature extraction strategy to create robust input matrices to be infused into a deep CNN classifier and compared the results against the WS-CNN and conventional 1D-CNN approaches. The set of deep CNN-based approaches in this study provided a reliable platform for data exploration of real-world clinical datasets without the requirement of manual intervention.

Results

Cross-Dataset Results of the WS-CNN Classifier

The overall accuracy was 99.81 ± 0.15% (range 99.6–100%), confirming the reliability of the developed WS-CNN pattern classifier for the identification and classification of post-HI microscale gamma spike transients in the fetal electrocorticogram (ECoG) recordings collected at 1024 Hz (Table 1 and Figure 2). Results of the sevenfold cross-validation, using the entire 6 h of data across all sheep (42 h total), showed that the performance of the WS-CNN classifier decreased slightly from the 17-layer result to 99.03 ± 1.66%, 98.54 ± 1.43%, and 97.70 ± 1.99% for the network architectures with 13, 9, and 5 layers of depth, respectively (Table S1–S3 of Appendix A, Supporting Information). Average AUC values of 1.000, 0.997, 0.995, and 0.976 were calculated for the overall performance of the WS-CNN pattern classifier using 17, 13, 9, and 5 layers of depth, respectively. The results show predictable reductions in total accuracy, along with increasing variability (as seen in the increasing standard deviation), as the number of layers decreased (Figure 2H). The receiver operating characteristic (ROC) curves and the corresponding AUC values in Figure 2A–G show how the performance of the WS-CNN classifier in the sevenfold cross-validation analysis changed across the fetal data as the number of layers decreased.

Table 1 Results of the WS-CNN classifier for post-HI spike transient identification in experimental data (entire 6 h and 17 layers). TP: true-positive, TN: true-negative, FP: false-positive, and FN: false-negative

Trained and validated on sheep no.	No. of patterns in the Train and Validation Dataset	Tested on sheep no.	No. of patterns in the test-set	TP hits	TN hits	FP hits	FN hits	Sensitivity [%]	Selectivity [%]	Precision [%]	Accuracy [%]
2,3,4,5,6,7	4567	1	443	173	269	1	0	100	99.6	99.4	99.8
1,3,4,5,6,7	4751	2	259	110	149	0	0	100	100	100	100
1,2,4,5,6,7	4731	3	279	82	196	0	1	98.8	100	100	99.6
1,2,3,5,6,7	3372	4	1638	824	807	7	0	100	99.1	99.2	99.6
1,2,3,4,6,7	4088	5	922	454	467	0	1	99.8	100	100	99.9
1,2,3,4,5,7	4466	6	544	231	313	0	0	100	100	100	100
1,2,3,4,5,6	4085	7	925	209	714	2	0	100	99.7	99.1	99.8
Overall performance of the 17-layer WS-CNN in the entire 6 h											99.81 ± 0.15
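
The percentage columns of Table 1 follow directly from the four hit counts. The block below is a minimal sketch of that arithmetic, not the authors' code: the function name `confusion_metrics` is hypothetical, and "selectivity" is assumed to mean TN/(TN + FP), which reproduces the tabulated values.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Return sensitivity, selectivity, precision and accuracy in percent."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "selectivity": 100.0 * tn / (tn + fp),  # assumed definition: TN / (TN + FP)
        "precision": 100.0 * tp / (tp + fp),
        "accuracy": 100.0 * (tp + tn) / (tp + tn + fp + fn),
    }

# Hit counts copied from Table 1, test sheep 1 to 7: (TP, TN, FP, FN)
TABLE_1 = [
    (173, 269, 1, 0),
    (110, 149, 0, 0),
    (82, 196, 0, 1),
    (824, 807, 7, 0),
    (454, 467, 0, 1),
    (231, 313, 0, 0),
    (209, 714, 2, 0),
]

per_fold = [confusion_metrics(*row) for row in TABLE_1]
# Unweighted mean of the seven per-fold accuracies
mean_accuracy = sum(m["accuracy"] for m in per_fold) / len(per_fold)
```

Averaging the seven per-fold accuracies without weighting by test-set size gives the 99.81% reported in the last row of the table.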

FIGURE: A–G) ROC curves and the corresponding area under curve (AUC) values from sevenfold cross-validation of the results along 6 h of 1024 Hz data across seven preterm fetal sheep (sheep 1–7, 42 h total) using 17, 13, 9, and 5 layers in the proposed WS-CNN classifier. H) The data for each WS-CNN classifier are presented as mean ± SD in the boxplot demonstrating reduced accuracy and increased variability with fewer layers.

Cross-Dataset Results of the WF-CNN Classifier

Sevenfold cross-validation of the 11-layer WF-CNN pattern classifier resulted in an overall accuracy of 99.44 ± 0.44% (range: 98.7–100%) when tested on the total 42 h of data. Reducing the original depth of the 11-layer WF-CNN architecture (eight convolutional layers) down to 9, 7, and 5 layers (corresponding to 6, 4, and 2 convolutional layers, respectively) resulted in overall accuracies of 99.33 ± 0.36%, 98.07 ± 1.92%, and 97.96 ± 1.48%, respectively (Tables S4–S7 in Appendix B, Supporting Information). Average AUC values of 0.998, 0.998, 0.997, and 0.996 were calculated for the overall performance of the WF-CNN pattern classifier using 11, 9, 7, and 5 layers of depth, respectively. The confusion matrix results, as well as the ROC curves and the corresponding AUC values of the WF-CNN classifier, are shown in Figure S1 and Tables S4–S7, Supporting Information.

Cross-Dataset Results of the 1D-CNN Classifier

Sevenfold cross-validation of the 11-layer 1D-CNN pattern classifier resulted in an overall accuracy of 99.27 ± 0.50% (range: 98.5–100%). Reducing the original depth of the 11-layer 1D-CNN architecture (with eight convolutional layers) down to 9, 7, and 5 layers (corresponding to 6, 4, and 2 convolutional layers, respectively) resulted in overall accuracies of 98.07 ± 2.62%, 96.83 ± 2.83%, and 95.86 ± 3.74%, respectively. Average AUC values of 0.998, 0.998, 0.995, and 0.993 were calculated for the overall performance of the 1D-CNN pattern classifier using the 11-, 9-, 7-, and 5-layer architectures, respectively. The confusion matrix results, as well as the ROC curves and the corresponding AUC values of the 1D-CNN classifier, are shown in Figure S2 and Tables S8–S11 of Appendix C, Supporting Information.

Cross-Dataset Results of the Spectral-Fuzzy Classifiers

As expected, the WT-Type-I-Fuzzy and FFT-Type-I-FLC approaches resulted in high overall accuracies of 99.04 ± 0.53% and 98.42 ± 0.71%, respectively, for the identification of individual microscale spike transients when evaluated at the best threshold value, cross-validated across the sampled data (Tables S12 and S13 of Appendix D, Supporting Information).

Discussion

Our article presents, for the first time, a series of high-performance deep convolutional pattern classifiers infused with spectrally rich feature map input matrices of EEG segments (i.e., WSs) to automatically identify microscale gamma spike transients during the first 6 h (the latent phase) of recovery after an HI insult in preterm fetal sheep. Performance comparisons of our classifiers (Table 2) show that the 17-layer deep WS-CNN pattern classifier, fed with reverse biorthogonal scalograms of ECoG data, performed the best, with an accuracy of 99.81 ± 0.15% (AUC: 1.000) when tested over 5010 ECoG scalogram images. In contrast, the smaller WS-CNN architectures lost accuracy and showed larger standard deviations as the number of convolutional layers decreased. Nevertheless, we showed that the spectrally detailed scalograms can provide robust feature maps for the deep WS-CNN pattern classifier to reliably classify post-HI gamma spike transients regardless of their polarity.

Table 2 Comparison of the evaluated performance of the proposed strategies in the current article

Strategy	No. of layers	Sensitivity [%]	Selectivity [%]	Precision [%]	Accuracy [%]
WS-CNN	17 layers	99.80 ± 0.41	99.77 ± 0.31	99.67 ± 0.39	99.81 ± 0.15
	13 layers	97.90 ± 4.16	99.68 ± 0.31	99.54 ± 0.38	99.03 ± 1.66
	9 layers	97.50 ± 3.54	99.06 ± 1.28	98.83 ± 1.24	98.54 ± 1.43
	7 layers	98.77 ± 1.36	96.71 ± 3.31	95.63 ± 3.88	97.70 ± 1.99
WF-CNN	11 layers	99.91 ± 0.15	98.57 ± 1.40	99.93 ± 0.11	99.44 ± 0.44
	9 layers	99.10 ± 1.39	99.10 ± 0.86	99.60 ± 0.51	99.33 ± 0.37
	7 layers	96.07 ± 5.72	98.80 ± 0.91	97.84 ± 2.95	98.07 ± 1.92
	5 layers	97.63 ± 2.35	96.86 ± 4.71	98.73 ± 1.12	97.96 ± 1.48
1D-CNN	11 layers	99.51 ± 0.96	99.13 ± 0.75	98.87 ± 1.04	99.27 ± 0.50
	9 layers	94.94 ± 7.18	99.91 ± 0.11	99.83 ± 0.30	98.07 ± 2.62
	7 layers	96.67 ± 5.32	96.54 ± 4.26	96.00 ± 3.47	96.83 ± 2.84
	5 layers	90.89 ± 9.79	99.44 ± 0.88	98.67 ± 2.08	95.86 ± 3.75
WT-Type-I-FLC	N/A	99.16 ± 0.44	98.92 ± 0.79	N/A	99.04 ± 0.53
FFT-Type-I-FLC	N/A	98.51 ± 0.62	98.32 ± 0.85	N/A	98.42 ± 0.71

The article also showed the novel use of a spectral-based strategy to extract the spectrally dominant features of the raw ECoG epochs to form robust input matrix sets to be infused into the deep 2D-CNN (WF-CNN) for the classification of spike transients. Our data show that the 11-layer WF-CNN pattern classifier can competitively identify the spike transients compared to the WS-CNN approach, with 99.44 ± 0.44% accuracy (AUC: 0.998). This suggests that the minimal spectral features in the input matrices can provide sufficiently rich features for the 2D-CNN to build computationally simpler feature maps for accurate classification that run faster and require less memory. However, this comes at the expense of a negligible loss of accuracy compared to the WS-CNN approach. This was then followed closely by the results of an 11-layer 1D-CNN pattern classifier with an overall accuracy of 99.27 ± 0.50% (AUC: 0.998). The performance of the 1D-CNN classifier decreased considerably, compared to the WS-CNNs and WF-CNNs, when shallower architectures were used.

The cross-validation results identified the 17-layer WS-CNN approach as the least sensitive to the potential morphological variations of spike transients across all subjects, with a standard deviation of only ±0.15% in overall performance and an accuracy of at least 99.6% in every subject. Greater standard deviations of ±0.44% and ±0.50%, with lower overall performance for some subjects, were observed when the deepest WF-CNN and 1D-CNN approaches were used (also see Table 2). We postulate that there are two reasons for the increase in the overall standard deviation when the deepest 1D-CNN and WF-CNN were used compared to the WS-CNN. First, the different feature extraction strategies used in each of these techniques can provide different levels of information at different resolutions. For example, the WS method provides more information compared to the WF and 1D raw data. Second, the embedded morphological uncertainties of the spike transients can add ambiguity in the extracted features at different levels when each of the proposed feature extraction techniques is used, causing the biggest standard deviation for the 1D data when raw data were directly fed to the CNN. Results from Bayesian estimation analysis further indicate that the deepest WS-CNN statistically outperforms the other classifiers, whether in other classes or with shallower structures within the same class (see Figure S4 and S5 of Appendix G, Supporting Information).

Results from Table 2 further suggest that shallower CNN architectures are much more sensitive to the morphological variations of spike transients, highlighting the robust capabilities of deeper CNN architectures to handle uncertainty within the data. The 1D-CNN scheme, directly fed with the raw EEG time-series, was proposed to utilize/assess the standalone feature extraction/generation property of the CNNs on its own, compared to the WS-CNN and WF-CNN techniques, where robust spectral feature-extraction approaches were combined to reinforce the performance of the 2D-CNNs. Results indicate that the combination of deeper neural networks with successful feature extraction strategies can be a powerful tool in handling the embedded uncertainties within the data, which could ultimately lead to much better accuracies that could be relied on in real practice. In this work, classifiers were trained on spikes aligned at the center of the analyzing windows. We suggest that this is a beneficial property of the proposed classifiers as it provides more realistic signal processing of the data in an online environment.

In real-time practice, a sliding window of specified size can be run over streaming data, with the algorithm highlighting regions that have higher similarity probabilities to spike transients. This property will be limited by the length selected for the sliding window, as a window that is either too wide or too narrow would strongly influence the classifiers’ outputs. Therefore, we would recommend choosing a window size that closely represents the target waveform in any application. Moreover, background noise contamination may impact the overall performance and needs to be controlled as required by the individual needs of an application. In this work, only 50 Hz noise was removed from the original EEG time-series to allow generalization of the technique to a more challenging environment, such as spectrally complex clinical data.
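
The sliding-window idea can be sketched as follows. This is an illustrative outline under stated assumptions rather than the procedure used in the study: the function names, the step of 8 samples, the 0.5 threshold, and the toy scoring rule (a stand-in for a trained classifier) are all hypothetical. The 72-sample window matches the ECoG segment length quoted for the WF-CNN inputs.

```python
def sliding_windows(signal, window, step):
    """Yield (start_index, segment) pairs for overlapping windows."""
    for start in range(0, len(signal) - window + 1, step):
        yield start, signal[start:start + window]

def flag_windows(signal, score_fn, window=72, step=8, threshold=0.5):
    """Return the start indices of windows whose score reaches the threshold."""
    return [start for start, segment in sliding_windows(signal, window, step)
            if score_fn(segment) >= threshold]

def toy_score(segment):
    """Hypothetical stand-in for a classifier: 1.0 if peak-to-peak exceeds 20."""
    return 1.0 if max(segment) - min(segment) > 20.0 else 0.0

# One second of flat synthetic data at 1024 Hz with a single spike-like sample
stream = [0.0] * 1024
stream[500] = 30.0
hits = flag_windows(stream, toy_score)
```

Every window that contains the synthetic event is flagged, so a wider window or a smaller step produces more overlapping detections for the same event. This is one way in which the window length shapes the classifier output.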

We suggest that careful architecture design, which avoided significant image-size reductions between the layers of the deeper networks, has helped to avoid overfitting. Figure 2H and Figure S1H, Supporting Information, show that the deeper architectures suffered no performance drop relative to the shallower networks, which supports the strategy of avoiding large size reductions when designing the nets.

In addition, our study further compared the results of the deep CNN approaches with other spectral-based techniques, such as the WT-Type-I-Fuzzy and FFT-Type-I-FLC classifiers. However, unlike the comprehensive confusion matrix measures for the CNN-based classifiers, the nature of the fuzzy-based approaches only allowed the evaluation of sensitivity and selectivity in the absence of other performance measures (i.e., precision) for the WT-Type-I-Fuzzy and FFT-Type-I-FLC classifiers, which resulted in overall accuracies of 99.04 ± 0.53% and 98.42 ± 0.71%, respectively. Although the overall accuracies of these techniques are numerically close to those of the deep CNN classifiers, their reliability is arguably lower, especially when applied to the highly complex clinical data environment. In fact, the spectrally complex nature of EEG in a challenging clinical environment, with its abundance of high-frequency spectral content and background artifacts, could lead to lower precision in the spectral-based techniques. This in turn would cause a significant drop in the overall performance of the latter fuzzy-based techniques. Thus, collectively our data show that the WS-CNN, WF-CNN, and 1D-CNN pattern classifiers robustly outperform the fuzzy-based approaches, despite the small numerical differences in overall performance.

Overall, we expect the proposed high-performance EEG pattern classifiers in this work to have the potential to also accurately detect similar morphology waveforms in clinical data. HI insults leading to brain injury are treatable, but the current standard therapy, therapeutic hypothermia, is only effective when started within a specific window of time in the evolution of brain injury. Our EEG pattern classifiers will allow us to determine the phase of injury and thus whether infants will benefit from treatment.

Methods and Computational Approach

Definition of HI Microscale Spike Transients

We have previously described the appearance of microscale gamma spike transients after an HI insult.^[^8,10,11^] Briefly, they are EEG epileptiform waveforms characterized as events with a pointed peak, with amplitude of >20 μV and duration of <12.5 ms (equivalent to the frequency range of >80 Hz) (see Figure 1). To evaluate the performance of the classifier, and for consistency between all methods, individual microscale gamma spikes (i.e., excluding spike complexes) with an amplitude of at least 14 μV were assessed for automatic classification against manual quantifications. Gamma spike transients were identified manually by an expert (HA). A total of 5010 manually annotated ECoG patterns (scalogram images), within a total of 42 h of data, including 2085 microscale post-HI spike transients and 2925 nonspike events, were used for training, validation, and testing of the deep CNN-based pattern classifiers.

CNNs for Evaluation of the Post-HI EEG

Recent artificial neural network (ANN) architectures with deeper learning structures include a much higher number of nodes and neurons that can better mimic the intricate connectivity of the human brain. CNN architectures are generally categorized under deep learning and have significant classification abilities. To date, CNN-based analysis has not been utilized for the identification of post-HI perinatal EEG microscale transients. However, recent studies have investigated the use of CNNs for automatic analysis of brain data, including prediction of epileptic seizures,^[¹⁸^] epilepsy classification,^[¹⁹^] EEG artifact identification,^[²⁰^] and detection of high-frequency EEG oscillations.^[²¹^] Only limited work to date has used CNNs for the identification and classification of seizure-like patterns in EEG,^[^22–25^] grading the severity of HIE,^[²⁶^] and in particular neonatal EEG seizure detection through multichannel EEG recordings.^[^14,15^] The literature suggests that 1D time-series can also be directly fed into various formats of CNN architectures for seizure identification^[²³^] and epilepsy detection.^[²⁷^] Feature extraction through wavelet transform of data (time-frequency images) has been shown to enhance the performance of the conventional CNN approaches.

To the best of our knowledge, our previous study is the only published work to date on the use of CNNs for automatic identification of HI biomarkers; in that study, the proposed WS-CNN pattern classifier achieved an overall performance of 95.34% for the classification of HI sharp waves in a limited dataset of fetal sheep HI EEG recordings (trained and tested on 2 and 1 h recordings from two different sheep).^[¹⁷^] By contrast, the current study investigated the validity of a series of deep CNN-based pattern classifiers, with different depth configurations, for the identification of gamma spike transients post-HI insult.

WS-CNN Classifier

Scalogram Image Preparation

We have previously shown that the rbio2.8 mother wavelet can serve as an optimal wavelet basis for the time-localization of gamma spike transients when used as the transfer function in a CWT.^[⁸^] We have also shown that the spectral features of the rbio2.8 mother wavelet are well aligned with the spectral features of the HI gamma spike transients.^[¹⁰^] In this work, the normalized/zero-meaned raw 1024 Hz sampled ECoG was used for scalogram feature extraction. The CWTs of the zero-meaned ECoGs, using the rbio2.8 mother wavelet at scales 1–19, are initially used to generate scalogram images of an arbitrary ECoG section. Gamma spike transients observed within the raw fetal ECoG, 30–210 min postinsult, are shown in Figure 3A–D. The 2D WSs of the spikes in Figure 3A–D were constructed using rbio2.8 CWT at scales 1–19 and are shown in Figure 3E–H, respectively. Examples of the nonspike events from the same ECoG sections, as well as their corresponding scalograms, are shown in Figure 3I–L and M–P, respectively.

FIGURE: Examples of microscale spike transients taken from the 1024 Hz EEG recordings of preterm fetal sheep A–D) 1–2.5 h after an HI insult as well as I–L) nonspike events, along with E–H) and M–P) their corresponding WSs using CWT with Rbio2.8 basis function of scales 1–19. The example WS images in (E–H) and (M–P) were used for training, validation, and testing of the deep WS-CNN classifier.

These figures show how the spectrally rich rbio2.8 CWT of ECoG sections at scales 1–19 can provide distinct feature maps in the form of high-resolution scalogram images for deep training of the WS-CNN pattern classifier to distinguish between gamma spikes and background ECoG and artifact. CWT scalogram images from the labeled data were stored for the training of the WS-CNN pattern classifier. To generalize the approach and assess its capabilities for use on clinical data, which undergo less processing, both clean and noisy data were used to train, validate, and test the WS-CNN pattern classifier.
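
To make the scalogram construction concrete, the block below sketches a CWT-magnitude image over scales 1–19 in plain Python. It is an illustration under explicit assumptions, not the study's pipeline: a Ricker (Mexican-hat) wavelet is swapped in for rbio2.8 because the rbio2.8 filter coefficients are not reproduced in this text, and the function names, the 72-sample segment, and the truncation of the wavelet at ±5 scales are all hypothetical choices.

```python
import math

def ricker(t, scale):
    """Ricker (Mexican-hat) wavelet, used here only as a stand-in for rbio2.8."""
    x = t / scale
    return (1.0 - x * x) * math.exp(-0.5 * x * x) / math.sqrt(scale)

def cwt(signal, scales, wavelet=ricker):
    """Return a len(scales) x len(signal) matrix of wavelet coefficients."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [v - mean for v in signal]  # zero-mean the segment first
    rows = []
    for s in scales:
        half = int(math.ceil(5 * s))  # truncate the wavelet support at +/- 5 scales
        kernel = [wavelet(k, s) for k in range(-half, half + 1)]
        row = []
        for i in range(n):
            acc = 0.0
            for j, w in enumerate(kernel):
                idx = i + j - half
                if 0 <= idx < n:  # samples outside the segment are treated as zero
                    acc += centred[idx] * w
            row.append(acc)
        rows.append(row)
    return rows

def scalogram(signal, scales=range(1, 20)):
    """Magnitude of the CWT coefficients, one row per scale (1 to 19 by default)."""
    return [[abs(c) for c in row] for row in cwt(signal, scales)]

# Synthetic segment with a single spike-like sample in the middle
segment = [0.0] * 72
segment[36] = 30.0
image = scalogram(segment)
```

The result has one row per scale and one column per sample. Rendering it as a color image of the size expected by the network is a separate step that is not shown here.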

The Proposed Deep WS-CNN Classifier: Model Setup and Architecture

Analysis of high-frequency events in the ECoG data requires extra levels of algorithm proficiency due to the complex nature of the signal. The approach taken in this study was designed to tackle the issue by combining the opposite “decomposition” and “combination” functionalities of wavelets and the CNN architecture, respectively. Our study, in particular, validates the performance of our previously developed 17-layer deep WS-CNN pattern classifier architecture^[¹⁷^] for the automatic identification of gamma spike transients in the ECoG datasets from seven fetal sheep post-HI.

The 2D WSs from the reverse biorthogonal wavelet (rbio2.8 over the scale range 1–19) provide spectrally detailed decomposition representations of ECoG patterns for the CNN classifier to combine/convolve the high-resolution elements for final decision-making between a high-frequency gamma spike transient and background activities and/or artifact. A detailed summary of the proposed CNN architecture is shown in Table 3.

Table 3 The architecture of the proposed deep WS-CNN classifier

Layers	Type	No. of neurons (output layer)	Kernel size	Stride	Padding	No. of filters
0–1	Conv.	303 × 404	3	1	1	16
1–2	Max_pool	151 × 202	[3 2]	2	0
2–3	Conv.	151 × 202	3	1	1	32
3–4	Max_pool	75 × 101	[3 2]	2	0
4–5	Conv.	75 × 101	3	1	1	48
5–6	Max_pool	37 × 50	3	2	0
6–7	Conv.	37 × 50	3	1	1	72
7–8	Max_pool	18 × 25	[3 2]	2	0
8–9	Conv.	18 × 25	3	1	1	96
9–10	Max_pool	9 × 12	[2 3]	2	0
10–11	Conv.	9 × 12	3	1	1	128
11–12	Max_pool	4 × 6	[3 2]	2	0
12–13	Conv.	4 × 6	3	1	1	256
13–14	Max_pool	2 × 3	2	2	0
14–17	Fully_connected	1536
	Fully_connected	24
	Fully_connected	2
Output	Softmax and Classification
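
The output sizes in Table 3 can be checked with the usual size formula for convolution and pooling layers, floor((n + 2p − k)/s) + 1. The sketch below assumes that a kernel written as a pair, such as [3 2], means [height width], and that sizes are rounded down; both assumptions reproduce the table. The names `out_size`, `LAYERS`, and `trace_shapes` are hypothetical.

```python
def out_size(n, kernel, stride, padding):
    """Output length along one axis of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1

# (type, (kernel_height, kernel_width), stride, padding, n_filters), in Table 3 order
LAYERS = [
    ("conv", (3, 3), 1, 1, 16), ("pool", (3, 2), 2, 0, None),
    ("conv", (3, 3), 1, 1, 32), ("pool", (3, 2), 2, 0, None),
    ("conv", (3, 3), 1, 1, 48), ("pool", (3, 3), 2, 0, None),
    ("conv", (3, 3), 1, 1, 72), ("pool", (3, 2), 2, 0, None),
    ("conv", (3, 3), 1, 1, 96), ("pool", (2, 3), 2, 0, None),
    ("conv", (3, 3), 1, 1, 128), ("pool", (3, 2), 2, 0, None),
    ("conv", (3, 3), 1, 1, 256), ("pool", (2, 2), 2, 0, None),
]

def trace_shapes(height, width, layers, channels=3):
    """Return the (height, width, channels) shape after every layer."""
    shapes = []
    for _kind, (kh, kw), stride, pad, n_filters in layers:
        height = out_size(height, kh, stride, pad)
        width = out_size(width, kw, stride, pad)
        if n_filters is not None:  # pooling keeps the channel count unchanged
            channels = n_filters
        shapes.append((height, width, channels))
    return shapes

shapes = trace_shapes(303, 404, LAYERS)
# Number of values entering the first fully connected layer
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
```

The final 2 × 3 map with 256 filters flattens to 1536 values, which matches the size of the first fully connected layer in Table 3.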

Figure 4 shows the graphical flow chart of the proposed WS-CNN architecture. Scalogram input images of size 303 × 404 × 3 were fed into the WS-CNN classifier and were then processed through a 17-layer deep structure, consisting of seven convolutional layers (with rectified linear activation units (ReLUs) and batch normalization), seven max-pooling layers, three fully connected layers (with output sizes of 1536, 24, and 2), and finally a softmax and a classification layer, consecutively. The sizes of the kernel filters at each layer were chosen arbitrarily to derive an adequate amount of features from the data. The stride values were also set to 1 and 2 for the convolution and max-pooling layers, respectively, to adjust the mathematical computations. The number of filters at each convolutional block was set to 16, 32, 48, 72, 96, 128, and 256, respectively (Figure 5A). Stepping through the CNN layers yields higher decompositions, while using a larger number of filters at the deeper layers allows performance of an opposite functionality by “combining/convolving” the elements back for a more accurate classification.

FIGURE: The schematic of our proposed procedure for gamma spike transient identification.

FIGURE: The architecture of our proposed A) WS-CNN and B) 1D-CNN spike pattern classifiers.

Initially, an input image was convolved in a convolutional layer with the kernel filters/matrixes to generate “feature maps” in its output.

Stride parameters of a convolutional layer are used to control the amount of jump/overlap that is considered across the input image for the next convolutional filtering step. A convolutional layer is usually followed by an ReLU layer to introduce nonlinearity. This was performed by using an activation function that maps the outputs of the previous layer into a thresholded representation. An ReLU layer was followed by a max-pool layer to reduce the dimension of the matrixes from the convolutional/ReLU layers. The size reduction significantly contributed to the simplification of the computational burden and helped to avoid overfitting. The maximum values from each ReLU block (feature maps) were selected in a max-pool operation. The output nodes from the last max-pool layer are directly linked to all of the input neurons of a fully connected block that multiplies its inputs by a weight matrix and adds a bias vector. Features from the previous layers throughout the entire image were combined in a fully connected block to detect embedded patterns and helped with final classification. A softmax function was then used to evaluate the probability distribution of the output classes from the last fully connected layer. Finally, a classification layer calculated the cross-entropy loss of the softmax outputs and allocated each value to one of the categories (classes) (see Figure 5A).^[²⁸^]
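
The last two stages described above, softmax followed by a cross-entropy classification layer, can be written out in a few lines. This is a generic sketch of the standard definitions; the function names and the example scores are hypothetical, and the two classes are assumed to be ordered as spike and nonspike.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_index):
    """Cross-entropy loss of one example against a one-hot target."""
    return -math.log(probs[true_index])

# Hypothetical outputs of the final two-unit fully connected layer
probs = softmax([2.0, 0.0])
predicted = probs.index(max(probs))  # class with the highest probability
loss = cross_entropy(probs, 0)       # loss if class 0 is the true label
```

The loss approaches zero as the probability assigned to the true class approaches one, which is the quantity the training procedure minimizes.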

The effects of reducing the number of layers in the proposed deep classifying architecture were investigated using shallower structures of 13, 9, and 5 layers derived from the original 17-layer structure. To do so, the seven blocks of convolutional, ReLU, and max-pool layers in the original architecture (14 convolutional layers) were redesigned, with careful tuning of the inner convolutional layers to avoid massive image-size reductions, to form 5, 3, and 2 blocks in the new architectures (corresponding to 10, 6, and 4 convolutional layers, respectively). The new architectures were obtained only by removing convolutional blocks, not by changing the size or number of filters in the remaining blocks. However, the output of the final convolutional block was always designed to match the original size of 2 × 3 in all cases.

Training and Testing of the WS-CNN Classifier

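The stochastic gradient descent with momentum (SGDM) strategy was used to minimize the loss function E(θ) through updating of the weights and bias parameters, where θ represents the parameter vector. In its standard form, the SGDM update at iteration ℓ is

\[ \theta_{\ell+1} = \theta_{\ell} - \alpha \nabla E(\theta_{\ell}) + \gamma \left( \theta_{\ell} - \theta_{\ell-1} \right) \]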

Initial values of 0.01 and 0.09 were assigned to the learning rate, α, and momentum, γ, respectively. In the SGDM updating algorithm, α controls the learning speed and γ controls convergence by reducing the oscillations of the parameters during the update steps along the steepest-descent optimization path. Further tuning of the learning rate and momentum was not investigated because the classifier already performed satisfactorily. The classifier was trained over a total of 180 epochs with a training-to-validation split of 80% to 20%. The entire training set was passed through the net during each epoch.
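
As a rough illustration of how α and γ enter the update, the sketch below applies the textbook SGDM rule, θ ← θ − α∇E(θ) + γ(θ − θ_prev), to a toy quadratic loss. It assumes this standard form of the rule and uses the initial values quoted above. The function names and the toy loss are hypothetical and unrelated to the actual network.

```python
def sgdm_step(theta, prev_theta, grad, alpha=0.01, gamma=0.09):
    """One SGDM update: theta - alpha * grad + gamma * (theta - prev_theta)."""
    return [t - alpha * g + gamma * (t - p)
            for t, p, g in zip(theta, prev_theta, grad)]

def toy_grad(theta):
    """Gradient of the toy loss E(theta) = 0.5 * sum(theta_i ** 2)."""
    return list(theta)

theta_prev = [1.0, -2.0]
theta = [1.0, -2.0]
for _ in range(2000):
    # The right-hand side is evaluated with the old theta before reassignment
    theta, theta_prev = sgdm_step(theta, theta_prev, toy_grad(theta)), theta
```

The momentum term reuses a fraction of the previous step, which damps oscillations between successive updates. On this toy loss the parameters shrink steadily towards the minimum at zero.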

The batch size parameter was set to 128. The batch size indicates the number of training examples used at each training iteration, and a larger batch size requires more memory. The dataset from the remaining sheep, which was not used in the training process, was allocated for testing of the net. Replacing the SGDM optimizer with the root mean square propagation (RMSProp) or Adam updating algorithm was observed to cause more convergence fluctuations and a much slower training process, respectively. Thus, these updating algorithms were not investigated further. Figure 6 shows a schematic of data distribution for training, validation, and testing of the net. A total of 5010 manually annotated ECoG scalogram images, within a total of 42 h of data, including 2085 microscale HI gamma spike transients and 2925 nonspike events, were used for training, validation, and testing of the deep WS-CNN classifier. The net was trained over a total of 180 epochs, taking almost 31 h to train using the aforementioned core configuration.
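
The allocation of data across folds can be sketched as a leave-one-sheep-out loop. The pattern counts per sheep are taken from the test-set column of Table 1. How the 80%/20% training-validation split was drawn is not stated, so the random shuffle of the pooled patterns below is an assumption, and the function name and seed are hypothetical.

```python
import random

# Annotated patterns per sheep, read from the test-set column of Table 1
PATTERNS_PER_SHEEP = {1: 443, 2: 259, 3: 279, 4: 1638, 5: 922, 6: 544, 7: 925}

def leave_one_sheep_out(counts, val_fraction=0.2, seed=0):
    """Yield one fold per sheep as a dict of (sheep, pattern_index) lists."""
    rng = random.Random(seed)
    for test_sheep in counts:
        test = [(test_sheep, i) for i in range(counts[test_sheep])]
        pool = [(s, i) for s, n in counts.items() if s != test_sheep
                for i in range(n)]
        rng.shuffle(pool)  # assumption: random 80/20 split of the pooled patterns
        n_val = round(val_fraction * len(pool))
        yield {"test_sheep": test_sheep, "train": pool[n_val:],
               "val": pool[:n_val], "test": test}

folds = list(leave_one_sheep_out(PATTERNS_PER_SHEEP))
```

Each fold holds out every pattern from one sheep for testing, so the pooled training and validation counts reproduce the second column of Table 1 (for example, 4567 patterns when sheep 1 is held out).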

FIGURE: Allocations of the ECoG datasets for training, validation, and testing of the proposed WS-CNN, the WF-CNN and 1D-CNN pattern classifiers (see Table 1).

WF-CNN Classifier Spectral Feature Fusion

The superior compatibility of the reverse biorthogonal wavelet rbio2.8 basis function for gamma spike detection, in comparison to other wavelet basis functions, allows minimal features to be extracted from a spike transient for use in a 2D-CNN.^[¹⁰^] Here, for the first time, we also show an approach that, instead of the full-range spectral features (scalograms) used in the WS-CNN approach, extracts only the spectrally dominant features of the raw ECoG epochs, using wavelet and Fourier spectra (WF), to form robust input matrix sets. In this method, three components were combined: the CWT coefficients of each zero-meaned ECoG segment (72 × 1) using rbio2.8 at scale 7 (Figure S3E–H of Appendix F, Supporting Information), the inverse Fourier transform time-series of the data (IFFT: spectral components within 80–120 Hz preserved; Figure S3I–L, Supporting Information), and the original raw ECoG segment (Figure S3A–D, Supporting Information). Together these form robust 3D input-matrix sets of size 72 × 3 × 1 (Figure S3M,N, Supporting Information) that are fed directly into the deep 2D-CNN classifier (WF-CNN). Such a strategy creates consistent profiles that facilitate the internal feature extraction within the CNN algorithm to generate even more robust feature maps for classification between spikes and nonspikes.
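The fusion step can be sketched as follows, using only the Python standard library. This is an illustration under stated assumptions, not the authors' implementation: the wavelet column is produced by convolving with a caller-supplied kernel, and the kernel in the demo is a generic Mexican-hat-like stand-in rather than rbio2.8 at scale 7; the column order (wavelet, band-limited, raw) and all function names are likewise assumed, and the trailing singleton dimension of the 72 × 3 × 1 input is omitted.

```python
import cmath
import math

FS = 1024.0            # sampling rate in Hz (from the text)
BAND = (80.0, 120.0)   # preserved band in Hz (from the text)


def bandlimit(x, fs=FS, band=BAND):
    """Zero every DFT bin outside `band`, then return the real inverse transform."""
    n = len(x)
    out = [0.0] * n
    for k in range(n):
        f = min(k, n - k) * fs / n  # frequency magnitude of bin k
        if band[0] <= f <= band[1]:
            xk = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for t in range(n):
                out[t] += (xk * cmath.exp(2j * math.pi * k * t / n)).real / n
    return out


def convolve_same(x, kernel):
    """Centered, zero-padded convolution returning len(x) samples (odd kernel length)."""
    n, m = len(x), len(kernel)
    half = m // 2
    return [sum(kernel[j] * x[t + half - j]
                for j in range(m) if 0 <= t + half - j < n)
            for t in range(n)]


def wf_input(segment, kernel):
    """Stack wavelet-like, band-limited, and raw columns into a len(segment) x 3 matrix."""
    mean = sum(segment) / len(segment)
    x = [v - mean for v in segment]            # zero-mean the segment
    wavelet_col = convolve_same(x, kernel)     # stand-in for the CWT at one scale
    fourier_col = bandlimit(x)                 # 80-120 Hz components only
    return [[w, f, r] for w, f, r in zip(wavelet_col, fourier_col, x)]


# Hypothetical 72-sample segment: a 100 Hz burst on a slow wave plus an offset.
seg = [math.sin(2 * math.pi * 100 * t / FS)
       + 0.5 * math.sin(2 * math.pi * 10 * t / FS) + 3.0 for t in range(72)]
# Stand-in kernel (NOT rbio2.8): a sampled Mexican-hat shape.
kernel = [(1 - (j / 3.0) ** 2) * math.exp(-((j / 3.0) ** 2) / 2) for j in range(-10, 11)]
wf = wf_input(seg, kernel)
```

With 72 samples at 1024 Hz the DFT bins are about 14.2 Hz apart, so only three bins (near 85, 100, and 114 Hz) fall inside the preserved band.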

The Proposed Deep WF-CNN Classifier: Model Setup and Architecture

Compared to the WS-CNN, here we introduce an 11-layer 2D WF-CNN classifier that can be considered computationally more efficient because its input feature matrix is much simpler than the computationally intensive scalograms used in the WS-CNN. In fact, the WS block in Figure 4, or more specifically, the scalograms in Figure 5A, are replaced with an input matrix of size 72 × 3 × 1 containing the CWT, IFFT, and raw ECoG data (see Figure S3M, Supporting Information). The structure of the 2D-CNN used in the WF-CNN is shown in Table S14 of Appendix E, Supporting Information. Compared to the 17-layer structure designed for the WS-CNN, a maximum feasible depth of 11 layers was chosen, owing to the small size of the WF-CNN inputs and the need to tune the inner convolutional layers so as to avoid excessive size reductions within them. Similar to the WS-CNN, an SGDM updating strategy was used for sevenfold cross-validation of the WF-CNN. Our recent assessments using limited data indicated that this approach is capable of identifying both sharps and spikes in the recordings of fetal sheep models.^[^29,30^] Here we validated the technique using a much larger dataset, and we also assessed how the size of the network influences the overall performance. Using the procedure described for the WS-CNN pattern classifier, we investigated the effects of reducing the original 11-layer 2D-CNN structure down to 9, 7, and 5 layers; the 11-, 9-, 7-, and 5-layer architectures use 4, 3, 2, and 1 block(s) of convolutional, ReLU, and max-pool layers, respectively.

1D-CNN Classifier

Here we also investigate the performance of a 1D-CNN classifier applied directly to the ECoG time-series as the input. ECoG segments of length 72 × 1 (for both spike and nonspike events) were fed into an 11-layer deep 1D-CNN structure for classification. In the proposed 1D-CNN structure, the WS generating block in Figure 4 of the previously detailed WS-CNN pattern classifier is bypassed. The architecture of the proposed 1D-CNN pattern classifier is detailed in Table S15 of Appendix E, Supporting Information, while the CNN block in this approach is shown in Figure 5B. The maximum depth of 11 layers was dictated by the limited length of the input ECoG segment, given a stride value of 2 for all max-pooling layers. Our recent assessments indicated that the 1D-CNN classifier is capable of identifying high-amplitude stereotypic epileptiform seizures in a limited dataset of fetal sheep;^[³¹^] therefore, here we evaluated the technique for high-frequency gamma spikes. Using a similar procedure to that described for the WS-CNN pattern classifier, we investigated the effects of reducing the original 11-layer 1D-CNN structure down to 9, 7, and 5 layers; the 11-, 9-, 7-, and 5-layer architectures use 8, 6, 4, and 2 convolutional layers, respectively.
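The depth constraint can be illustrated with a short calculation of how quickly a 72-sample input shrinks under repeated stride-2 max-pooling. The sketch assumes a pooling window of 2, no padding in the pooling layers, and convolutions that preserve length; these are illustrative assumptions, and the actual layer settings are those given in Table S15.

```python
def pooled_lengths(length, pool=2, stride=2):
    """Feature-map lengths after successive unpadded max-pooling stages."""
    lengths = [length]
    while lengths[-1] >= pool:
        lengths.append((lengths[-1] - pool) // stride + 1)
    return lengths


print(pooled_lengths(72))  # [72, 36, 18, 9, 4, 2, 1]
```

Under these assumptions only a handful of pooling stages are possible before the feature map collapses to a single sample, which bounds the number of convolution and pooling blocks that can be stacked.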

Wavelet Type-I Fuzzy Classifier

In 2018, we introduced a successful WT-Type-I-Fuzzy approach for the identification of HI microscale gamma spike transients in the latent phase of HI ECoG.^[⁸^] In this work, the performance of the WT-Type-I-Fuzzy classifier was also assessed for the quantification of individual microscale spike transients over the entire 6 h post-HI ECoG. In brief, data were initially continuous wavelet transformed using the rbio2.8 mother wavelet at scale 7 and then passed to the fuzzy classifier for final reasoning (see ref. [8] for detailed information). In this approach, the WT-Type-I-Fuzzy classifier was separately cross-validated on data from each individual sheep.

Spectral Fourier-Fuzzy Classifier

We recently introduced a spectral-based fuzzy approach (FFT-Type-I-FLC classifier) for the identification of HI microscale gamma spike transients in the latent phase recordings of post-HI ECoG activity.^[¹⁰^] In this work, the performance of the FFT-Type-I-FLC was also assessed for the quantification of individual microscale spike transients over the entire 6 h of post-HI ECoG recordings. In brief, data were initially Fourier/inverse Fourier transformed, preserving the spectral components in the 80–120 Hz frequency band, and then passed to the type-I fuzzy classifier for final reasoning (see ref. [10] for detailed information). In this approach, the FFT-Type-I-FLC classifier was separately cross-validated on data from each individual sheep.

Computing Infrastructure

The deep WS-CNN and WF-CNN classifiers were trained using New Zealand eScience Infrastructure (NeSI) high-performance computing facilities that offer the Cray CS400 cluster. The classifiers were trained using 12 CPUs (six hyperthreaded cores) on an Intel Xeon Broadwell node (E5-2695v4, 2.1 GHz) with 18 GB of memory (1.5 GB of RAM per CPU). The algorithms were executed in MATLAB.

Performance Evaluation Metrics

K-Fold Cross-Validation for the Deep CNN-Based Classifiers

The performance of the WS-CNN, WF-CNN, and 1D-CNN classifiers was evaluated using a subject-based k-fold cross-validation strategy (sevenfold) to ensure the validity, consistency, and reliability of the proposed pattern classifiers across all subjects. This strategy helps to assess how reliably the classifiers deal with potential morphological variations of the transients across all datasets. Typically, k-fold cross-validation is used within a single dataset where the entire dataset is subpartitioned. Here, as in our previous works,^[^8,10^] we performed cross-validation across a seven-sheep dataset: in each fold, the data from six sheep were subpartitioned for training and validation, while the data from the remaining sheep were used for testing of the classifier. This procedure was repeated seven times, with a different sheep held out for testing each time.
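A subject-based split of this kind can be sketched as below. The function name, the random shuffling, the seed, and the per-sheep example counts in the demo are assumptions for illustration; only the seven-subject hold-out scheme and the 80%/20% training–validation split come from the text.

```python
import random


def subject_folds(subject_of, val_fraction=0.2, seed=0):
    """Yield (test_subject, train_idx, val_idx, test_idx), one held-out subject per fold."""
    rng = random.Random(seed)
    for test_subject in sorted(set(subject_of)):
        test_idx = [i for i, s in enumerate(subject_of) if s == test_subject]
        rest = [i for i, s in enumerate(subject_of) if s != test_subject]
        rng.shuffle(rest)                      # mix examples from the other subjects
        n_val = round(val_fraction * len(rest))
        yield test_subject, rest[n_val:], rest[:n_val], test_idx


# Hypothetical labels: 7 sheep with 10 examples each (real per-sheep counts differ).
subject_of = [s for s in range(7) for _ in range(10)]
folds = list(subject_folds(subject_of))
```

Because the test indices always belong to a single subject that never appears in training or validation, each fold measures generalization to an unseen animal rather than to unseen segments from a familiar one.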

K-Fold Cross-Validation for the Spectral-Fuzzy Classifiers

A subject-based k-fold cross-validation strategy was also used for the performance evaluation of the WT-Type-I-Fuzzy and FFT-Type-I-FLC approaches. However, these two classifiers were individually tested over the entire 6 h data set for each sheep. This was repeated across all seven fetal sheep by replacing the test set with the data from a new unseen sheep. At each fold, the average of the selectivity and sensitivity measures was used to evaluate the overall performance of the classifiers.
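The per-fold score can be sketched as follows. The definitions are assumptions, since this section does not state them: sensitivity is taken as TP/(TP + FN) and selectivity as TP/(TP + FP); the counts in the example are invented for illustration.

```python
def overall_performance(tp, fp, fn):
    """Average of sensitivity and selectivity for one fold (assumed definitions)."""
    sensitivity = tp / (tp + fn)   # fraction of true spikes that were detected
    selectivity = tp / (tp + fp)   # fraction of detections that were true spikes (assumed)
    return (sensitivity + selectivity) / 2


# Hypothetical counts: 90 detected spikes, 10 false detections, 30 missed spikes.
score = overall_performance(tp=90, fp=10, fn=30)  # (0.75 + 0.90) / 2 = 0.825
```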

Conclusion

Reliable, high-performance biomarkers are essential if we are to improve the outcomes for newborns after perinatal HI insults. In part, improved outcomes will require better determination of the phase of brain injury so that we know whether babies may benefit from treatment, and potentially in the future, the type of intervention that is most likely to be beneficial. The current standard therapy of TH has a very narrow window of opportunity for efficacy: the first 6 h of post-HI recovery. Cot-side EEG recordings offer a quick, easily applied method for extracting continuous biomarker data, although, problematically, data are often collected at a very low sampling frequency. Our study has shown, using a preclinical animal HI model with 1024 Hz sampling, that the deep WS-CNN and WF-CNN structures reliably outperform the conventional 1D-CNN and spectral-fuzzy classifiers, and that the greater the number of convolutional layers within the architecture, the better the performance. In conclusion, our study provides a reliable framework that could help with a well-timed diagnosis of at-risk neonates in clinical practice.

Experimental Section

All procedures were approved by the Animal Ethics Committee of the University of Auckland (R1942) and conducted in accordance with the Code of Ethical Conduct for animals in research established by the Ministry of Primary Industries, Government of New Zealand. The experiments are reported in accordance with the ARRIVE guidelines for reporting animal research.

Surgical and Experimental Procedures

This study was conducted in preterm fetal sheep after surgical recovery (i.e., they were studied in utero without the confounding effects of anesthesia). Fetuses rather than neonates were used because at this time there is no neonatal preterm model that has a gyrencephalic brain similar to that of humans and that uses a whole-body HI insult similar to that seen at birth.^[^32,33^]

Seven singleton Romney/Suffolk fetuses at 98–99 days of gestation (full term ≈147 days gestation) were used in this study. The methods are as previously described.^[^11,33–36^] Briefly, ewes were anesthetized by an intravenous injection of propofol (5 mg kg⁻¹; AstraZeneca Limited, Auckland, New Zealand) and intubated, and general anesthesia was maintained using 2–3% isoflurane in O₂ (Bomac Animal Health, NSW, Australia). Ewes were given an intramuscular injection of the antibiotic oxytetracycline (20 mg kg⁻¹; Phoenix Pharm, Auckland, New Zealand) for prophylaxis. Maternal fluid balance was maintained via a constant saline infusion (≈250 ml h⁻¹). Animals were constantly monitored by trained anesthetic technicians throughout surgery.

Fetuses were surgically instrumented with catheters and electrodes as previously described.^[³⁵^] Specific to this study, polyvinyl catheters (SteriHealth, Dandenong South, VIC, Australia) were placed in the brachial and femoral arteries for blood sampling and blood pressure measurement. Electrodes (AS633-3SSF wire; Cooner Wire, Chatsworth, CA) were placed subcutaneously over the right shoulder and at the level of the left fifth intercostal space to measure the fetal electrocardiogram (ECG) for EEG recordings and HI insult. Two pairs of left/right electrodes (AS633-5SSF; Cooner Wire) were placed bilaterally onto the parasagittal cortical dura (5 and 10 mm anterior to bregma and 5 mm lateral) via burr holes, which were then sealed, for recording of the ECoG. Recording directly from the dura reduces signal artifacts, and the signal can therefore technically be referred to as an “ECoG.” A reference electrode was also sewn over the occiput. An inflatable silicone occluder (OC16HD, 16 mm, In Vivo Metric, Healdsburg, CA, USA) was loosely placed around the umbilical cord to allow postsurgical occlusion of the umbilical cord to induce fetal HI. When fetal procedures were complete, the fetus was returned to the amniotic sac and the uterus and maternal surgical sites were closed. The maternal laparotomy skin incision was infiltrated with a local analgesic (10 ml of 0.5% bupivacaine plus adrenaline; AstraZeneca, Auckland, NZ). A maternal saphenous vein was cannulated for postoperative care.

Ewes and their fetuses recovered from their surgery for 4–5 days before experiments began. Ewes were housed together in separate metabolic cages with access to concentrated pelleted food and water ad libitum. Rooms were temperature-controlled (16 ± 1 °C, humidity 50 ± 10%) with a 12:12 h light:dark cycle. Antibiotics were given intravenously to the ewe each day for 4 days: 600 mg benzylpenicillin sodium (Novartis, Auckland, New Zealand) and 80 mg gentamicin (Pfizer, Auckland, New Zealand). Fetal vascular catheters were continuously infused with heparinized saline (20 U ml⁻¹ at 0.2 ml h⁻¹) to maintain patency. The fetal condition was assessed via recordings of all fetal physiological variables, and daily arterial samples were drawn to monitor pH and blood gases (ABL800 Flex analyzer, Radiometer, Auckland, New Zealand), glucose, and lactate (YSI 2300 Analyzer, YSI Ltd., Yellow Springs, Ohio, USA).

Fetuses were studied at 103–104 days of gestation. At this age, the fetal brain is the equivalent in neuronal maturation to a human brain of around 28–30 weeks of gestation.^[³³^] They underwent complete inflation of the umbilical cord occluder to induce a severe HI insult for 25 min or until blood pressure fell below 8 mmHg or there was asystole (25 min (n = 4), 19 min (n = 1), 15 min (n = 2)). All occlusions started at 09.00 h. A successful occlusion was defined by measurements of fetal blood pressure, heart rate, and blood samples for pH and blood gas analysis. At the completion of the experiment at 7 days post-HI, fetuses and their ewes were killed by an overdose of pentobarbital sodium intravenously to the ewe (9 g of Pentobarb 300; Chemstock International, Christchurch, New Zealand).

Measurements

Fetal mean arterial blood pressure, heart rate, and ECoG were continuously recorded by a computer using custom data acquisition software (LabVIEW for Windows; National Instruments, Austin, TX). Specifically for this study, the fetal ECoG was initially amplified with a gain of ×10 000, then passed through a fifth-order low-pass Butterworth antialiasing filter and a first-order high-pass filter with cut-off frequencies set at 512 and 1.6 Hz, respectively. Data were then digitized at a sampling rate of 4096 Hz, low-pass filtered with a digital IIR type II Chebyshev filter with a cut-off frequency of 512 Hz, and resampled to 1024 Hz for analysis of the raw EEG waveforms. Data were finally decoded into MATLAB for microscale transient analysis. The first 6 h of post-HI ECoG for all animals (total: 42 h) was used for analysis. Where needed, depending on the amount of 50 Hz noise contamination of the signal, the data were initially passed through a 100th-order digital bandstop finite impulse response (FIR) filter with a normalized stop band (ω) between 0.05 and 0.13 (25.60 Hz < f < 66.56 Hz). The data were not denoised further; they were only normalized and zero-meaned, so that the classifiers could be trained in a more challenging environment that could mimic spectrally complex clinical data.
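The 50 Hz rejection stage can be sketched with a Hamming-windowed sinc design, using only the Python standard library. Several points are assumptions rather than facts from the text: the filter is treated as a band-stop design (the stated stop band around 50 Hz implies this), the window type is not given in the source, and the function names are illustrative. The normalized edges convert to hertz as ω × fs/2, which reproduces the 25.60 and 66.56 Hz quoted above for fs = 1024 Hz.

```python
import cmath
import math

FS = 1024.0  # sampling rate after resampling, in Hz (from the text)


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def bandstop_fir(order, w1, w2):
    """Hamming-windowed sinc band-stop FIR; w1 < w2 are normalized to Nyquist (0..1)."""
    mid = order / 2
    taps = []
    for n in range(order + 1):
        # All-pass impulse minus the ideal band-pass response between w1 and w2.
        ideal = (1.0 if n == mid else 0.0) - (
            w2 * sinc(w2 * (n - mid)) - w1 * sinc(w1 * (n - mid)))
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / order)  # assumed window
        taps.append(ideal * window)
    dc = sum(taps)
    return [t / dc for t in taps]  # scale to unit gain at 0 Hz


def gain(taps, f, fs=FS):
    """Magnitude response of the FIR filter at frequency f (Hz)."""
    return abs(sum(t * cmath.exp(-2j * math.pi * f * n / fs)
                   for n, t in enumerate(taps)))


edges_hz = [w * FS / 2 for w in (0.05, 0.13)]   # 25.6 Hz and 66.56 Hz
taps = bandstop_fir(100, 0.05, 0.13)            # 100th order -> 101 coefficients
```

Evaluating `gain(taps, 50.0)` and `gain(taps, 100.0)` gives a quick check that mains interference is suppressed while the 80–120 Hz band of interest is largely preserved; with only 101 taps the transition bands are wide, so frequencies near the stop-band edges are partly attenuated.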

Acknowledgements

The research was supported by grants from the Health Research Council of New Zealand (HRC-17/601) and the Auckland Medical Research Foundation (AMRF-1117017). The authors would also like to acknowledge the use of the New Zealand eScience Infrastructure (NeSI) high-performance computing facilities for the results of this research. URL: https://www.nesi.org.nz. The Supporting Information of this article can be found here: https://doi.org/10.22541/au.160372551.19723833/v1.

Conflict of Interest

The authors declare no conflict of interest.

Author Contributions

The algorithm development, data analysis, and manuscript writing/preparation were undertaken by H.A. The EEG recordings were provided by L.B. and A.G. The manuscript was reviewed and revised by L.B., A.G., and C.P.U. Funding acquisition: L.B. and A.G. (HRC-17/601). The final submitted article has been revised and approved by all authors.

© 2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

There is a lack of reliable prognostic biomarkers for hypoxic-ischemic (HI) brain injury in preterm infants. Herein, spectrally detailed wavelet scalograms (WSs), derived from the 1024 Hz sampled electroencephalograms (EEG) of preterm fetal sheep after HI (n = 7), are infused into a high-performance deep convolutional neural network (CNN) pattern classifier to identify high-frequency spike transient biomarkers. The deep WS-CNN pattern classifier identifies EEG spikes with remarkable accuracy of 99.81 ± 0.15% (area under curve, AUC = 1.000), cross-validated across 5010 EEG waveforms, during the first 6 h post-HI (42 h total), an important clinical period for diagnosis of HI brain injury. Further, a feature-fusion strategy is introduced to extract the spectrally dominant features of the raw EEG epochs to form robust 3D input matrix sets to be infused into the deep 2D-CNNs for pattern classification. The results show that the proposed WS-CNN approach is less sensitive to the potential morphological variations of spikes across all subjects compared to other deep CNNs and spectral-fuzzy classifiers, allowing the user to flexibly choose an approach depending on their computational requirements. Collectively, the data provide a reliable framework that could help support well-timed diagnosis of at-risk neonates in clinical practice.

Details

Title

Advanced Deep Learning Spectroscopy of Scalogram Infused CNN Classifiers for Robust Identification of Post-Hypoxic Epileptiform EEG Spikes

Author

Abbasi, Hamid¹; Gunn, Alistair J²; Unsworth, Charles P³; Bennet, Laura²

¹ Department of Physiology, Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand; Department of Engineering Science, Faculty of Engineering, University of Auckland, Auckland, New Zealand
² Department of Physiology, Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
³ Department of Engineering Science, Faculty of Engineering, University of Auckland, Auckland, New Zealand

Section

Full Papers

Publication year

2021

Publication date

Feb 2021

Publisher

John Wiley & Sons, Inc.

e-ISSN

26404567

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1002/aisy.202000198

ProQuest document ID

2822734090
