Introduction
Equatorial plasma bubbles (EPBs) are medium-scale ionospheric plasma depletions that occur predominantly in the neighborhood of the Earth's magnetic equator during post-sunset hours (Bhattacharyya, 2022; Park et al., 2009). EPBs, which are triggered by the Rayleigh-Taylor instability, disrupt radio wave propagation, leading to signal degradation and loss of communication in satellite navigation systems, such as Global Navigation Satellite Systems (GNSS), and in satellite-based communication networks (Huba et al., 2008; Kelley, 2009). The impact of EPBs on satellite communications has necessitated continuous monitoring for better understanding, prediction, and mitigation of their effects.
One popular method for observing EPBs is through the use of airglow imaging by all-sky imagers (ASIs) (Makela et al., 2004; Okoh et al., 2017; Shiokawa et al., 1999). Airglow refers to the natural emission of light by atoms and molecules in the upper atmosphere, typically occurring between 80 and 300 km above the Earth's surface (Taylor et al., 1997). This phenomenon allows for the visualization of ionospheric disturbances, including EPBs, which appear as dark depletions or bubbles in airglow images (Makela et al., 2010). However, manual detection of EPBs from airglow images is labor-intensive and subjective, prompting the need for automated detection methods.
Recent advancements in machine learning, particularly convolutional neural networks (CNNs), offer an efficient and robust approach to automating the detection of EPBs in airglow images (Chakrabarti et al., 2024; Srisamoodkham et al., 2022; Thanakulketsarat et al., 2023). CNNs, a class of deep learning algorithms inspired by the human visual cortex, have demonstrated significant success in image recognition tasks across various domains, from medical imaging (Litjens et al., 2017) to satellite image analysis (Ma et al., 2019). Their ability to extract hierarchical features from images, such as edges, textures, and complex patterns, makes them ideal candidates for detecting the subtle features associated with EPBs in airglow images. The application of CNNs in ionospheric image studies is still in its early stages, but the potential for improving detection accuracy and operational efficiency is immense, especially given the rapid increase in the volume of ionospheric imaging data (Zhang et al., 2020).
Although EPBs have long been observed using ASIs (e.g., Weber et al., 1978), the application of machine learning to automated EPB detection in ASI images is quite recent (e.g., Chakrabarti et al., 2024; Githio et al., 2024; Srisamoodkham et al., 2022; Thanakulketsarat et al., 2023). Srisamoodkham et al. (2022) presented a method for detecting EPBs in airglow imager images using a YOLO v3-based CNN; the model was reported to have slightly lower recognition accuracy than the standard model but to run faster (Srisamoodkham et al., 2022). The authors determined the presence of EPBs in the ASI images based on anomalies computed for each image during training. They found that 40% was a suitable anomaly threshold for deciding that an image contained EPBs, and so classified images with anomalies greater than or equal to 40% as containing EPBs.
Thanakulketsarat et al. (2023) proposed two real-time plasma bubble detection systems based on support vector machine (SVM) techniques. The two designs used (a) a CNN and (b) singular value decomposition (SVD) for feature extraction, each connected to an SVM for EPB classification. The models were trained on quick-look images from the VHF radar system at the Chumphon station, Thailand, in 2017. Their results showed that the combined CNN-SVM model using the radial basis function (RBF) kernel gave the highest accuracy of 93.08%, while the model using the polynomial kernel gave an accuracy of 92.14%. The combined SVD-SVM models gave lower accuracies of 88.37% and 85.00% for the RBF and polynomial SVM kernels, respectively. Chakrabarti et al. (2024) similarly proposed a deep learning framework for detecting different types of mid-latitude ionospheric plasma structures, including medium-scale traveling ionospheric disturbances with single and multiple bands, and plasma bubbles. They used 630.0 nm airglow data captured by an ASI installed in the mid-latitude Himalayan region near the Indian Astronomical Observatory, Hanle, Ladakh, India.
In a different but related study, Githio et al. (2024) conducted simultaneous observations using two GNSS receivers and an ASI to study the morphology of EPBs over Brazil. The study had two main objectives: (a) to develop a Random Forest (RF) machine learning model for estimating and predicting the zonal drift velocities of EPBs, and (b) to compare the model's predictions with actual EPB drift measurements from the instruments, as well as with zonal neutral wind speeds from the Horizontal Wind Model-14 (HWM-14). The model was built using reliable EPB drift data collected during geomagnetically quiet days between 2013 and 2017 in Brazil. It predicted the velocities from various parameters, including day of year, universal time, critical frequency of the F2 layer (foF2), and solar and interplanetary indices. The model achieved correlation coefficients of 0.98 and 0.96, with RMSE values of 10.61 and 10.06 m/s, during the training and validation phases, respectively. Its performance was further evaluated on two geomagnetically quiet nights, yielding an average correlation coefficient of 0.89 and an RMSE of 15.74 m/s.
Beyond classification, CNN techniques can also be applied to other airglow image processing tasks, for example, feature extraction (Jogin et al., 2018), segmentation (Mortazi & Bagci, 2018), object detection (Aulia et al., 2024), denoising (Ilesanmi & Ilesanmi, 2021; Liu et al., 2018), super-resolution enhancement (Tian et al., 2020), and motion tracking (Fan et al., 2010).
In the present study, we present a new bootstrapping CNN approach for detecting EPBs in airglow images captured by an ASI installed in the African equatorial region (Abuja, Nigeria; Geographic: 8.99°N, 7.38°E; Geomagnetic: 1.60°S). By leveraging the pattern recognition capabilities of CNNs, we develop an automated, efficient, and more accurate method for identifying EPBs, which could significantly enhance real-time space weather monitoring and forecasting capabilities. The present study seeks to improve on earlier studies in the following ways:
- The present study relies on the raw airglow images, rather than on the processed (quick-look) images (see Section 2.2) typically used in previous studies. The advantage is that the model developed here can directly classify isolated airglow images, independent of other airglow images captured by the camera at about the same time. When quick-look images are used, the airglow images must first be processed by the technique described in Section 2.2 before being passed to the model for classification. Besides the computational time and energy required, that processing additionally requires the availability of other airglow images obtained by the ASI at around the same time (e.g., within a 1-hr interval) as the image to be classified. The model of the present study therefore offers the added capacity of classifying images that are isolated in time.
- The bootstrapping technique employed in the present study (see Section 2.3 for details) makes it possible to utilize as many images as possible, and in varied ways, for training the network. Computer systems are limited by the amount of data they can hold in Random Access Memory (RAM), which usually compels model developers to use a subset of a large available image data set; such a subset may not be truly representative of the entire set, leading to less accurate models. The bootstrapping technique also promotes the use of an equal number of images for each of the categories to be trained, mitigating the bias that would otherwise favor a category for which more samples are presented.
- The bootstrapping technique as applied to the final model prediction increases the accuracy of the final model predictions, as described in Section 3.2.
This work contributes to the growing body of research on applying deep learning techniques to space weather monitoring and aims to set the foundation for future advancements in the automated analysis of ionospheric phenomena.
Data and Methods
Source of Data
Data used in this study are airglow images from an ASI installed in Abuja, Nigeria (Geographic coordinates: 8.99°N, 7.38°E; Geomagnetic coordinates: 1.60°S, 79.39°E) as illustrated in Figure 1, for the period spanning from 2015 to 2020. This ASI is No. 5 of the optical mesosphere thermosphere imagers (OMTIs) which is provided by the Institute for Space-Earth Environmental Research (ISEE) at Nagoya University, and operated by the Space Environment Research Laboratory (SERL), National Space Research and Development Agency (NASRDA) of Nigeria. The imager, equipped with a fish-eye lens, provides a 180° field of view, covering an approximate radial distance of 500 km. The blue circle in Figure 1 illustrates the imager's 500 km radial coverage. The imager operates with five filters in the wavelength ranges of 557.7, 630.0, 720–910, 777.4, and 572.5 nm. For this study, images from filter 2, corresponding to the 630.0 nm wavelength (OI species), were used. This filter has an exposure time of 165 s, and the images are recorded at intervals of 5.5 min. Further details about the ASI can be found in Shiokawa et al. (2009). The images are in binary, 2-byte unsigned integer format, with a resolution of 512 × 512 pixels and airglow intensity represented as count rates ranging from 0 to 65,535.
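For illustration, the snippet below sketches how a raw frame in this format could be loaded in MATLAB, the language used for the model training (Section 2.4). The file name, the headerless layout, and the little-endian byte order are our assumptions and may differ from the actual OMTI file structure.

```matlab
% Minimal sketch: read one raw ASI frame (512 x 512, 2-byte unsigned integers).
% Assumptions: headerless file, little-endian byte order, hypothetical file name.
fid = fopen('asi_abuja_630nm_frame.dat', 'r', 'ieee-le');
raw = fread(fid, [512 512], 'uint16');              % counts range from 0 to 65,535
fclose(fid);
imagesc(raw); axis image; colormap gray; colorbar;  % quick visual check
```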
[IMAGE OMITTED. SEE PDF]
Conventional Method of Detecting EPBs on ASI Images
It is difficult to detect the presence or absence of EPBs in the ASI images by simple visual inspection of the raw images, because the structures buried within the raw images are not evidently visible to the human eye, as can be seen in Figure 2. The three images on the top row of Figure 2 are raw ASI images for three different scenarios in which (a) there are EPB structures, (b) there are no EPB structures, and (c) the image is so cloudy/noisy that it is difficult to say whether there are EPB structures or not. To the human eye, the three images look quite similar, and it is difficult to make this kind of classification by simple visual inspection. To enhance the visibility of EPB structures in the images, the conventional practice has been to compute the percentage intensity deviation ($\Delta I$), also known as the quick-look image, using the formula in Equation 1 (Okoh et al., 2017; Shiokawa et al., 2015; Thanakulketsarat et al., 2023). The quick-look image ($\Delta I$) is obtained by subtracting the one-hour running average ($\bar{I}$) of raw images around the given raw image from the raw image itself ($I$), and then dividing the result by $\bar{I}$; the multiplication by 100 expresses the result as a percentage:

$$\Delta I = \frac{I - \bar{I}}{\bar{I}} \times 100 \quad (1)$$
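As a minimal sketch of Equation 1, assuming the raw frames within the surrounding hour are held in a 512 × 512 × N array `imgs` (our variable names), the quick-look image for the frame at index `k` could be computed as:

```matlab
Ibar = mean(imgs, 3);                        % one-hour running average of raw images
dI   = 100 * (imgs(:,:,k) - Ibar) ./ Ibar;   % percentage intensity deviation (Equation 1)
imagesc(dI); axis image; colormap gray;      % EPBs appear as dark depletion bands
```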
[IMAGE OMITTED. SEE PDF]
After computing the quick-look image, the presence or absence of EPB structures can then be determined by visual inspection of the quick-look image. The three images on the lower panel of Figure 2 are the respective quick-look images for the exact same scenarios illustrated on the upper panel. It becomes evident that image (a) contains EPB structures, image (b) does not contain EPB structures, and image (c) is so noisy/cloudy that it is difficult to ascertain whether it contains EPB structures or not.
While it is indeed challenging to detect EPBs directly from raw images by human-in-the-loop visualization, CNN procedures have been shown to effectively identify faint structures by leveraging data preprocessing and feature extraction techniques that enhance such signatures (Garvin et al., 2024; Walmsley et al., 2019). Unlike human visual inspection, a CNN-based model learns hierarchical spatial features that are not immediately apparent in raw images (Bilal et al., 2017; Gheller et al., 2018). Convolutional layers capture variations in intensity, texture, and spatial gradients, allowing the model to detect subtle features even when they are visually ambiguous in raw images (Aggarwal, 2018). While training on processed images can enhance performance (e.g., Ferrarini et al., 2018; Khalilimoghadama et al., 2017), we did not find significant improvement when we tested the CNN procedure on quick-look images. Given the aim of the present study and the aforementioned merits of using raw images (Section 1), our study therefore focused on training on the raw images rather than the quick-look ones. In this study, images were classified into three categories: (a) EPB (containing EPB structures), (b) No EPB (lacking EPB structures), and (c) Noisy/Cloudy (affected by noise or cloud cover). It is important to clarify that the images used for the CNN training are the raw images, not the processed quick-look images; the quick-look images were used only to label the raw images appropriately, so the resulting trained network classifies raw ASI images. Labeling was performed through visual inspection of the quick-look images, following the illustration in Figure 2: the corresponding raw images were labeled Noisy/Cloudy if they exhibited significant cloud cover, excessive background noise, or distortions that obscured the determination of whether EPBs were present; otherwise, the images were definitively labeled as containing EPBs or not.
Bootstrapping Technique Applied in This Study
The idea of bootstrapping as applied in the present study is that we first resample the training data set to create three subsets, in the manner explained below in this section. Each subset is then used to train a different CNN model in order to reduce overfitting and variance (see Brodeur et al., 2020; Fitzgerald et al., 2013). The three resulting networks are then combined by the ensemble method known as bagging, or bootstrap aggregating, in which the final prediction is computed as an aggregate of the predictions of the three trained networks.
The number of samples (or raw images) we classified for the CNN training are as follows:
- 1,650 images that contain EPBs
- 3,203 images that do not contain EPBs
- 9,495 images that are Noisy/Cloudy
These gave a total of 14,348 images. Using all of these images for the CNN training at once had two main drawbacks: (a) it was computationally intensive and caused the computer we used for training (11th Gen Intel(R) Core(TM) i7-1195G7 @ 2.92 GHz, 16 GB RAM) to run out of RAM, and (b) it biased the network's learning accuracy across the three categories, since many more Noisy/Cloudy samples, and far fewer EPB samples, were presented.
We first ascertained that 3,000 images could be trained at a time without causing the computer to run out of RAM, and we then created the training subsets such that each presents an equal number of samples (1,000 images) for each of the three categories. Prior to creating subsets of the training data set, 200 images of each of the three categories (total: 600 images) were randomly selected and set aside for testing. We henceforth refer to this data set as the testing data set.
Three subsets of the training data set were then created from the remaining images as follows (a minimal code sketch follows the list):
- The first subset was constituted by randomly taking 1,000 images from each of the three categories, for a total of 3,000 images, which we hereafter refer to as the Random1 training subset.
- The second subset was constituted by first arranging the images of each category in chronological order (based on the time they were captured), and then taking 1,000 images systematically from each category using a linearly spaced vector of 1,000 indices running from the first to the last image in each category. This also gave a total of 3,000 images, which we hereafter refer to as the System training subset.
- Similar to the Random1 training subset, the third subset was constituted by randomly taking 1,000 images from each of the three categories, but now focusing on images that had not been selected in the first two subsets. For a category (e.g., the EPB category) with fewer than 1,000 previously unselected images, all the unselected images are taken and the balance is drawn randomly from the first two subsets. This also gave a total of 3,000 images, which we hereafter refer to as the Random2 training subset.
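The sketch below illustrates this subset construction for a single category, assuming `files` is a chronologically sorted list of that category's images (variable names are ours; the same logic is applied per category and the three draws of 1,000 images are then pooled):

```matlab
n  = numel(files);
i1 = randperm(n, 1000);                      % Random1: simple random draw
i2 = round(linspace(1, n, 1000));            % System: linearly spaced chronological indices
rest = setdiff(1:n, union(i1, i2));          % images not selected in the first two subsets
if numel(rest) >= 1000
    i3 = rest(randperm(numel(rest), 1000));  % Random2: previously unselected images
else
    pool = union(i1, i2);                    % top up the balance from the first two subsets
    i3 = [rest, pool(randperm(numel(pool), 1000 - numel(rest)))];
end
```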
Since the focus of our study is on spatial features in airglow images rather than temporal variations, the model is trained to recognize spatial structures associated with EPBs, independent of when the images were captured. However, to ensure transparency, we have included a data set distribution analysis (Figure 3) covering:
- The number of images per year from 2015 to 2020.
- The seasonal distribution of images [March Equinox (MarEq: February–April); June Solstice (JunSol: May–July); September Equinox (SepEq: August–October); December Solstice (DecSol: November–January)].
- The time-of-night distribution of image acquisition.
[IMAGE OMITTED. SEE PDF]
In Figure 3, the panels on the left-hand side show distribution of the entire data set, while those on the right-hand side show distribution of the training subsets used (i.e., the Random1, System, and Random2 training subsets).
The figure shows that all years, seasons, and times of night are well represented in the data set. A few unavoidable biases are noticeable in the distribution. For example, the year 2015 has fewer images because the equipment was installed toward the second half of that year; some of the other variations are attributable to periods of equipment inactivity or breakdown. The hour of 04:00 UT (05:00 LT) has fewer images because it is the terminating hour for the observations; depending on the sky glow, observations did not always run to this hour.
The entire data set initially contained mostly cloudy or noisy images, primarily from the September equinox season (see Figure 3c), which coincides with the peak of the rainy season in Abuja. However, the bootstrapping technique applied in this study helped correct this bias, resulting in a more balanced distribution that now favors the December solstice season (Figure 3d). This shift is beneficial because the December solstice has the clearest skies in Abuja, making it easier to manually classify all-sky images based on the presence or absence of EPBs. As a result, the CNN training benefits from a more balanced data set, with better representation of EPB and No EPB images relative to the large number of cloudy/noisy images in the entire data set.
CNN Training Options and Procedure
A schematic diagram of the CNN architecture used is illustrated in Figure 4. The same architecture and training procedure were applied to each of the three training subsets. The CNN is made up of an input layer, which takes in the raw 512 × 512 input images. Three convolutional layers follow, applying filters/kernels to the input to produce feature maps that highlight various aspects of the data. The number of filters defines how many features are learned at each layer, and the filter size determines how much of the input each filter sees at once; both are hyper-parameters. The hyper-parameters used in this study were determined by first trying a number of options informed by previous literature, and then choosing those that maximized the model's prediction accuracy on the testing data set. The optimal network comprised three convolution layers of 3 × 3 filter size, with 8 filters in the first convolution layer, 16 in the second, and 32 in the third. Table 1 lists some of the hyper-parameter options we tested and the optimal options used.
[IMAGE OMITTED. SEE PDF]
Table 1. Hyper-parameter options tested and the optimal options used.

| Hyper-parameter | Brief explanation | Options tested | Optimal option |
| --- | --- | --- | --- |
| Weights initializer | Defines how initial weights are set before training | Glorot, He, orthogonal, narrow-normal, ones | Glorot |
| Number of filters | Number of feature detectors in each convolutional layer | (4, 8, 16), (8, 16, 32), (16, 32, 64) | (8, 16, 32) |
| Activation functions | Introduce non-linearity into the model | ELU, GELU, PReLU, ReLU, sigmoid, softmax, softplus, SReLU, swish, tanh | ReLU for the convolution layers; softmax for the fully connected (output) layer |
| L2 regularization value | Applies penalties on layer weights to prevent overfitting | 1.0e−02, 1.0e−03, 1.0e−04, 1.0e−05 | 1.0e−04 |
| Solver/optimization algorithm | Determines how weights are updated during training | Adam, L-BFGS, RMSProp, SGDM | Adam |
| Initial learning rate | Controls the step size for weight updates | 1.0e−02, 1.0e−03, 1.0e−04, 1.0e−05 | 1.0e−03 |
| Batch size | Number of samples processed before updating weights | 8, 16, 32, 64 | 16 |
| Total number of epochs | Number of times the model sees the full data set | 10, 30, 50 | 10 |
Pooling layers were used to reduce the spatial size of the feature maps; 2 × 2 max pooling was used for down-sampling in each of the three layers. Fully connected layers were then used at the end of the network to make predictions based on the features extracted in previous layers. Momentum is a technique used during CNN training to improve the efficiency and stability of gradient-based optimization; it helps the model converge faster and reduces oscillations during weight updates by taking into account not only the current gradient but also previous gradient updates. The momentum coefficient (a value between 0 and 1) indicates how much of the previous velocity is carried forward. The optimization algorithm used is the adaptive moment estimation (Adam) algorithm (Kingma & Ba, 2014; MathWorks, 2024). Adam is a popular optimization algorithm for training deep learning models, especially CNNs. It combines the advantages of two other algorithms, the Adaptive Gradient algorithm and Root Mean Square Propagation, by computing adaptive learning rates for each parameter (Reyad et al., 2023). Adam utilizes both the first moment (mean of gradients) and the second moment (uncentered variance of gradients) to adjust the learning rate, making it well suited to problems with noisy or sparse gradients (Vamaraju et al., 2021; Zhang et al., 2019).
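The sketch below assembles this architecture and the optimal options of Table 1 in MATLAB Deep Learning Toolbox syntax. It is a minimal reconstruction from the description above, not the authors' actual routine (available from Okoh, 2024); details not stated in the text, such as 'same' padding and per-epoch shuffling, are our assumptions.

```matlab
layers = [
    imageInputLayer([512 512 1])             % raw 512 x 512 ASI image, single channel
    convolution2dLayer(3,  8, 'Padding','same', 'WeightsInitializer','glorot')
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)        % 2 x 2 max pooling for down-sampling
    convolution2dLayer(3, 16, 'Padding','same', 'WeightsInitializer','glorot')
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)
    convolution2dLayer(3, 32, 'Padding','same', 'WeightsInitializer','glorot')
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)
    fullyConnectedLayer(3)                   % three classes: EPB, No EPB, Noisy/Cloudy
    softmaxLayer
    classificationLayer];

options = trainingOptions('adam', ...        % optimal options from Table 1
    'InitialLearnRate', 1e-3, ...
    'MiniBatchSize', 16, ...
    'MaxEpochs', 10, ...
    'L2Regularization', 1e-4, ...
    'Shuffle', 'every-epoch');               % shuffling is our assumption

% net = trainNetwork(imdsTrain, layers, options);  % imdsTrain: labeled imageDatastore
```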
The output layer is the final layer of the network that provides the output/class probabilities. The CNNs in this work were trained to classify the airglow images based on 3 classes/categories: EPBs, No EPBs, and Noisy/Cloudy. Three different CNNs were trained using the three different data subsets described in Section 2.3. The final model calls each of the three trained CNNs to make predictions for a given input image, and the final model prediction is computed as an aggregate of predictions from all three sub-models. The MATLAB routine used for the neural network training, and the resulting networks obtained after the training, can be downloaded from Okoh (2024).
Results and Discussion
Results From Sub-Models
In evaluating the three sub-models, each was tested on the testing data set of 600 images (200 per category) to assess classification accuracy for the three categories: EPB, No EPB, and Noisy/Cloudy. The basic outputs of the trained CNN models are prediction probabilities: for a given image, each sub-model generates three values (each between 0 and 1) representing the probabilities that the image is an EPB image, a No EPB image, or a Cloudy image. Lower probabilities for a given category indicate that the image is less likely to belong to that category, and higher values that it is more likely. The image is therefore classified by each sub-model into the category with the greatest probability.
Figures 5a–5c show the confusion matrices for each of the three CNN sub-models trained using the respective data subsets: Random1, System, and Random2 training subsets. The confusion matrix is a table that shows the different combinations of actual versus predicted categories. For a three-category classification as in the present study, it is a 3 × 3 matrix as illustrated in Figure 5. Each row of the matrix represents the actual class, and each column represents the predicted class as the output of the CNN model. Elements in the leading diagonal show correct predictions, while off-diagonal elements show misclassifications. The confusion matrices help to visualize class-specific performances and to see where the models are confused (i.e., predicting one class as another).
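For reference, a confusion matrix like those in Figure 5 could be produced for one sub-model as sketched below, assuming `imdsTest` is an imageDatastore of the reserved test images with ground-truth labels (names are ours):

```matlab
predLabels = classify(net, imdsTest);            % predicted class for each test image
C = confusionmat(imdsTest.Labels, predLabels);   % 3 x 3 actual-vs-predicted counts
confusionchart(imdsTest.Labels, predLabels);     % visualization as in Figure 5
```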
[IMAGE OMITTED. SEE PDF]
Figure 5 contains information on the individual prediction accuracies of each of the three sub-models. The prediction accuracy is simply the percentage of correctly classified images, computed using the formula in Equation 2:

$$\text{Accuracy} = \frac{\text{Number of correctly classified images}}{\text{Total number of images}} \times 100 \quad (2)$$
Figure 5a indicates that the Random1 sub-model shows strong classification accuracy across all categories. For EPB images, it correctly classified 195 of the 200 test instances, misclassifying 3 as No EPB and 2 as Noisy/Cloudy. Similarly, for No EPB images, the model correctly classified 198 cases but misclassified 2 as EPB. It exhibited near-perfect performance in identifying Noisy/Cloudy images, correctly classifying 199 out of 200, with only a single misclassification as EPB. In general, the Random1 sub-model demonstrates a balanced performance with high accuracy.
The System sub-model performed similarly to the Random1 sub-model, with a few notable differences. It achieved higher accuracy in classifying EPB images, with 197 correctly identified and none misclassified as Noisy/Cloudy. However, its ability to distinguish between EPB and No EPB images is slightly less reliable, with 4 No EPB images misclassified as EPB and 3 EPB images misclassified as No EPB. The sub-model also confused some Noisy/Cloudy images with the EPB category, misclassifying 3 Cloudy images as EPB. Overall, the System sub-model demonstrated slightly lower performance than the Random1 sub-model, particularly in differentiating EPB from No EPB, but its accuracy remained high, indicating a solid ability to generalize across the test data set.
The Random2 sub-model displays a slightly different pattern of performance. Its accuracy for EPB images is on par with the other sub-models, correctly classifying 195 instances, although it has the highest rate of confusing EPB with Noisy/Cloudy images, misclassifying 4 EPB images as Noisy/Cloudy (and the remaining 1 as No EPB). Handling of Noisy/Cloudy images is the sub-model's weakest point: it misclassified 6 Noisy images as EPB and 1 as No EPB. The Random2 sub-model also struggled with No EPB images, misclassifying 13 instances as EPB, suggesting that it is less effective than the other two sub-models at distinguishing No EPB from EPB images.
Among the three sub-models, the Random1 sub-model emerges as the most balanced performer, particularly in correctly identifying Noisy images. The System sub-model was better at identifying EPB images. The Random2 sub-model, though comparable in EPB classification, falls short in distinguishing Noisy from EPB images. These results highlight that while all three sub-models are generally effective, each shows distinct areas where performance can be improved. Rather than selecting one of the three sub-models, the technique of ensemble learning (illustrated in Section 3.2) has been exploited to further enhance the model prediction accuracies.
Results From the Final Model
To improve the prediction accuracy, the final model was developed as an ensemble of the three sub-models. To classify an image, each of the three sub-models performs the classification, and the output of the final model is an aggregate of the three results. This technique leverages the strengths of the three different models, reducing the risk of overfitting and improving generalization (Folino et al., 2021). Two aggregating methods were considered: the first computes the mean of the three sub-model probabilities, while the second computes the mode of the three sub-model classifications.
Mean of Sub-Model Probabilities
In this method, the sub-models were first used to obtain probabilities for the three categories. These are values between 0 (if the image definitely does not belong to the category) and 1 (if the image definitely belongs to the category). The mean of the probabilities predicted by each of the three sub-models for a given category is computed. This is done for all three categories, and the image is classified as the category which gave the highest mean of probabilities. Figure 5d shows the confusion matrix for this final model, which we have referred to as the Final1 model.
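A minimal sketch of this mean-based aggregation for a single image `img` is given below, where `nets` holds the three trained sub-models (the variable names and category ordering are our assumptions):

```matlab
p = zeros(3, 3);                        % rows: sub-models; columns: categories
for k = 1:3
    p(k, :) = predict(nets{k}, img);    % class probabilities from sub-model k
end
[~, idx] = max(mean(p, 1));             % category with the highest mean probability
labels = ["Cloudy", "EPB", "No EPB"];   % assumed column order
final1 = labels(idx);
```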
This method yielded an improved Final1 model with notable classification accuracy across the EPB, No EPB, and Cloudy categories. The resulting confusion matrix demonstrates a high degree of accuracy, with 197, 199, and 199 correctly classified images in each respective category. This aggregation strategy reduced misclassification rates compared to individual sub-models, especially for cases previously problematic in the System and Random2 models. The combined model accurately classified 199 out of 200 No EPB and Cloudy images, reflecting its robustness in these categories. Minor misclassifications in the EPB category (2 images misclassified as No EPB and 1 as Cloudy) indicate near-perfect performance.
This aggregated model illustrates the effectiveness of ensemble learning through probability averaging, achieving improved generalization and reliability in classifying ASI images. This approach not only leverages the strengths of each sub-model but also demonstrates enhanced consistency, making it an effective strategy for identifying EPB, No EPB, and Cloudy conditions in space weather monitoring applications.
Mode of Sub-Model Classifications
In this second method, each of the three sub-models is first used to classify a given image, and the final model classification is then obtained as the mode of the sub-model classifications. For example, if two of the sub-models classify the given image as an EPB image and the third classifies it as a Cloudy image, the final model classifies it as an EPB image (the mode of the sub-model classifications). In the unlikely scenario that each of the three sub-models classifies the image differently (i.e., one as EPB, another as No EPB, and the third as Cloudy), the final model adopts the category assigned by the Random1 sub-model, which is chosen because it gave the greatest accuracy on the testing data set. We refer to this method as the Final2 model. Figures 6a and 6b show the confusion matrices for the Final1 and Final2 models, respectively.
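The following sketch illustrates the mode-based aggregation with the Random1 fallback, assuming `nets` is ordered {Random1, System, Random2} (our naming):

```matlab
votes = strings(1, 3);
for k = 1:3
    votes(k) = string(classify(nets{k}, img));   % class label from each sub-model
end
if numel(unique(votes)) == 3                     % all three disagree (rare case)
    final2 = votes(1);                           % fall back to the Random1 sub-model
else
    final2 = string(mode(categorical(votes)));   % majority vote among the sub-models
end
```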
[IMAGE OMITTED. SEE PDF]
This second aggregation method, which determines the final classification by taking the mode of the sub-model classifications, yielded further improvements in classification accuracy. The confusion matrix shows near-perfect performance, with 198, 199, and 199 correctly classified images in the EPB, No EPB, and Cloudy categories, respectively.
This mode-based aggregation approach effectively captures consensus between the sub-models, reinforcing correct classifications by relying on majority agreement. The strategy also includes a fallback rule, defaulting to the classification of the Random1 model (the highest-performing individual model) in the rare case that each sub-model outputs a different category. This fallback enhances stability and helps prevent potential misclassifications in ambiguous cases, resulting in only one misclassified image in each of the EPB and No EPB categories.
Specifically, the mode-based aggregation approach correctly classified one of the EPB images that the mean-based aggregation approach misclassified as a No EPB image. We traced this image to 21:47:30 UT on 06 January 2016, shown in Figure 7a. This image marks the onset of EPBs on the ASI images for that night; the EPB structure was still weak at that time because the EPB was still developing. Figures 7b–7f, the ASI images recorded immediately after Figure 7a, clearly show the development of the EPB. It is therefore apparent that the image of Figure 7a was misclassified by the mean-based aggregation model as a No EPB image because the developing EPB structure was weak in that image, whereas the mode-based aggregation model captured the faint signature and correctly classified it as an EPB image. More detailed analysis showed that the Random1 and System sub-models independently and correctly classified the image as an EPB image, whereas only the Random2 sub-model misclassified it as a No EPB image; this majority vote is the reason the mode-based aggregation model classified it correctly. The sub-model prediction probabilities for the image are as follows:
- The Random1 sub-model predicted 0.000, 0.595, and 0.405 as probabilities for the Cloudy, EPB, and No EPB categories, respectively.
- The System sub-model predicted 0.000, 0.861, and 0.139 as probabilities for the Cloudy, EPB, and No EPB categories, respectively.
- The Random2 sub-model predicted 0.000, 0.034, and 0.966 as probabilities for the Cloudy, EPB, and No EPB categories, respectively.
[IMAGE OMITTED. SEE PDF]
The means of the sub-model probabilities are therefore 0.000, 0.497, and 0.503, respectively; for example, the EPB mean is (0.595 + 0.861 + 0.034)/3 ≈ 0.497, and the No EPB mean is (0.405 + 0.139 + 0.966)/3 ≈ 0.503. The mean probability for the No EPB category (0.503) is slightly greater than the 0.497 value for the EPB category, which is why the mean-based aggregation model misclassified the image as a No EPB image.
On the other hand, the mode-based method correctly classified the image as an EPB image because two of the three sub-models (Random1 and System) classified it as such. The mode-based method demonstrates slight yet notable improvements, underscoring the benefit of combining model outputs through majority voting. The method maximizes the ensemble's accuracy, showing that mode-based aggregation provides a robust and reliable approach for classifying the ASI images.
Conclusion
In this study, we developed and tested different CNN architectures and aggregation techniques to classify ASI images into three categories: images containing EPB, images with no EPB, and images with cloud cover. We initially trained three distinct CNN sub-models, namely Random1, System, and Random2 sub-models. Each of the sub-models yielded high classification accuracies, though with varying levels of performance across the different categories. Random1 consistently demonstrated the highest classification accuracy among the three, indicating its particular strength in identifying these atmospheric conditions with minimal misclassification.
To leverage the complementary strengths of each sub-model, we employed ensemble methods as a means of aggregating predictions to improve overall classification accuracy and robustness. Our first ensemble approach used probability averaging, wherein the mean of the predicted probabilities from the three sub-models was computed, and the category with the highest mean probability was taken as the final classification. This mean-based ensemble provided a boost in overall accuracy compared to the individual sub-models, reducing the rate of misclassification in each category and achieving high consistency across the test images. However, while effective, the approach was limited by its reliance on precise probability distributions and was therefore more sensitive to any overconfidence in individual sub-model predictions.
We then implemented an alternative aggregation method based on majority voting (mode) to further enhance classification accuracy and address the limitations of the mean-based method. The mode-based ensemble method classified each image according to the most frequently predicted category among the sub-models. In cases where each sub-model provided a different classification (a scenario that introduced ambiguity), the model defaulted to the classification given by the Random1 sub-model, as it had shown the greatest individual accuracy during testing. This approach effectively leveraged consensus among the sub-models, improving robustness in cases where the sub-models partially disagreed. The final mode-based ensemble model achieved near-perfect classification accuracy across the EPB, No EPB, and Cloudy categories, as evidenced by a confusion matrix with minimal off-diagonal entries and misclassifications.
The success of the mode-based ensemble approach highlights the efficacy of majority voting in ensemble models for image classification tasks, especially when combined with a fallback mechanism for ambiguous cases. This method demonstrated higher resilience to misclassification than the probability-based aggregation, making it particularly advantageous for space weather applications that demand high precision and reliability. The enhanced accuracy and robustness of the final model suggest that this approach is well-suited for operational implementation in space weather monitoring systems, where rapid and accurate detection of EPB and related phenomena is critical.
In conclusion, our study reveals that ensemble methods, particularly mode-based aggregation with a fallback mechanism, can significantly enhance the performance of CNN-based classifiers in detecting atmospheric phenomena from ASI images. This approach not only maximizes classification accuracy but also provides a systematic way to address ambiguity and disagreement among sub-model predictions. Future work will focus on refining the training of individual sub-models, potentially incorporating additional data or other types of atmospheric events to improve generalizability further. Moreover, exploring dynamic ensemble strategies that adjust aggregation methods based on real-time model performance could offer further gains in accuracy and operational reliability for space weather mitigation and forecasting.
Acknowledgments
ASI images used in this study were obtained from No. 5 of the optical mesosphere thermosphere imagers (OMTIs), which is provided by the Institute for Space-Earth Environmental Research (ISEE) at Nagoya University and operated by the Space Environment Research Laboratory (SERL), National Space Research and Development Agency (NASRDA) of Nigeria. This research was conducted under the framework of the New Observatory for Real-time Ionospheric Sounding over Kenya (NORISK), a project of the Istituto Nazionale Geofisica e Vulcanologia (INGV). Daniel Okoh acknowledges the INGV, NASRDA, and DARA (Development in Africa with Radio Astronomy) for the support they provided in the course of the study. Punyawi Jamjareegulgarn acknowledges financial support from the Fundamental Fund (FF68) of the Thailand Science Research and Innovation Fund (Grant Number: RE-KRIS/FF68/84). This work was supported by the JSPS Core-to-Core Program, B. Asia-Africa Science Platforms (JPJSCCB20210003; JPJSCCB20240003), and JSPS KAKENHI Grants (21H04518; 22K21345).
Data Availability Statement
All-sky imager data from No. 5 of the optical mesosphere thermosphere imagers (OMTIs) were used in the creation of this manuscript; the data are available through the OMTI website. The MATLAB routine used for the neural network training, and the resulting networks obtained after the training, are available at Okoh (2024).
Aggarwal, C. C. (2018). Convolutional neural networks. In Neural networks and deep learning: A textbook (pp. 315–371). Springer.
Aulia, U., Hasanuddin, I., Dirhamsyah, M., & Nasaruddin, N. (2024). A new CNN‐BASED object detection system for autonomous mobile robots based on real‐world vehicle datasets. Heliyon, 10(15), e35247. https://doi.org/10.1016/j.heliyon.2024.e35247
Bhattacharyya, A. (2022). Equatorial plasma bubbles: A review. Atmosphere, 13(10), 1637. https://doi.org/10.3390/atmos13101637
Bilal, A., Jourabloo, A., Ye, M., Liu, X., & Ren, L. (2017). Do convolutional neural networks learn class hierarchy? IEEE Transactions on Visualization and Computer Graphics, 24(1), 152–162. https://doi.org/10.1109/tvcg.2017.2744683
Brodeur, Z. P., Herman, J. D., & Steinschneider, S. (2020). Bootstrap aggregation and cross‐validation methods to reduce overfitting in reservoir control policy search. Water Resources Research, 56(8), e2020WR027184. https://doi.org/10.1029/2020wr027184
Chakrabarti, S., Patgiri, D., Rathi, R., Dixit, G., Krishna, M. S., & Sarkhel, S. (2024). Optimizing a deep learning framework for accurate detection of the Earth’s ionospheric plasma structures from all‐sky airglow images. Advances in Space Research, 73(12), 5990–6005. https://doi.org/10.1016/j.asr.2024.03.014
Fan, J., Xu, W., Wu, Y., & Gong, Y. (2010). Human tracking using convolutional neural networks. IEEE Transactions on Neural Networks, 21(10), 1610–1623. https://doi.org/10.1109/tnn.2010.2066286
Ferrarini, B., Ehsan, S., Leonardis, A., Rehman, N. U., & McDonald‐Maier, K. D. (2018). Performance characterization of image feature detectors in relation to the scene content utilizing a large image database. IEEE Access, 6, 8564–8573. https://doi.org/10.1109/access.2018.2795460
Fitzgerald, J., Azad, R. M. A., & Ryan, C. (2013). A bootstrapping approach to reduce over‐fitting in genetic programming. In Proceedings of the 15th annual conference companion on Genetic and evolutionary computation (pp. 1113–1120).
Folino, F., Folino, G., Guarascio, M., Pisani, F. S., & Pontieri, L. (2021). On learning effective ensembles of deep neural networks for intrusion detection. Information Fusion, 72, 48–69. https://doi.org/10.1016/j.inffus.2021.02.007
Garvin, E. O., Bonse, M. J., Hayoz, J., Cugno, G., Spiller, J., Patapis, P. A., et al. (2024). Machine learning for exoplanet detection in high‐contrast spectroscopy‐Revealing exoplanets by leveraging hidden molecular signatures in cross‐correlated spectra with convolutional neural networks. Astronomy & Astrophysics, 689, A143. https://doi.org/10.1051/0004‐6361/202449149
Gheller, C., Vazza, F., & Bonafede, A. (2018). Deep learning based detection of cosmological diffuse radio sources. Monthly Notices of the Royal Astronomical Society, 480(3), 3749–3761. https://doi.org/10.1093/mnras/sty2102
Githio, L., Liu, H., Arafa, A. A., & Mahrous, A. (2024). A machine learning approach for estimating the drift velocities of equatorial plasma bubbles based on All‐Sky Imager and GNSS observations. Advances in Space Research, 74(11), 6047–6064. https://doi.org/10.1016/j.asr.2024.08.067
Huba, J. D., Joyce, G., & Krall, J. (2008). Three‐dimensional equatorial spread F modeling. Geophysical Research Letters, 35(10), L10102. https://doi.org/10.1029/2008GL033509
Ilesanmi, A. E., & Ilesanmi, T. O. (2021). Methods for image denoising using convolutional neural network: A review. Complex & Intelligent Systems, 7(5), 2179–2198. https://doi.org/10.1007/s40747‐021‐00428‐4
Jogin, M., Madhulika, M. S., Divya, G. D., Meghana, R. K., & Apoorva, S. (2018). Feature extraction using convolution neural networks (CNN) and deep learning. In 2018 3rd IEEE international conference on recent trends in electronics, information & communication technology (RTEICT) (pp. 2319–2323).
Kelley, M. C. (2009). The Earth's ionosphere: Plasma physics and electrodynamics. Academic Press.
Khalilimoghadama, N., Delavar, M. R., & Hanachi, P. (2017). Performance evaluation of three different high resolution satellite images in semi‐automatic urban illegal building detection. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 42, 505–514. https://doi.org/10.5194/isprs‐archives‐xlii‐2‐w7‐505‐2017
Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
Litjens, G., Kooi, T., Bejnordi, B. E., Setio, A. A., Ciompi, F., Ghafoorian, M., et al. (2017). A survey on deep learning in medical image analysis. Medical Image Analysis, 42, 60–88. https://doi.org/10.1016/j.media.2017.07.005
Liu, Z., Yan, W. Q., & Yang, M. L. (2018). Image denoising based on a CNN model. In 2018 4th International conference on control, automation and robotics (ICCAR) (pp. 389–393). IEEE.
Ma, L., Liu, Y., Zhang, X., Ye, Y., Yin, G., & Johnson, B. A. (2019). Deep learning in remote sensing applications: A meta‐analysis and review. ISPRS Journal of Photogrammetry and Remote Sensing, 152, 166–177. https://doi.org/10.1016/j.isprsjprs.2019.04.015
Makela, J. J., Kelley, M. C., & Miller, C. A. (2010). Optical observations of F region depletion drifts. Journal of Geophysical Research, 105(A9), 21383–21390.
Makela, J. J., Ledvina, B. M., Kelley, M. C., & Kintner, P. M. (2004). Analysis of the seasonal variations of equatorial plasma bubble occurrence observed from Haleakala, Hawaii. In Annales geophysicae (Vol. 22(9), pp. 3109–3121). Göttingen, Germany: Copernicus Publications. https://doi.org/10.5194/angeo‐22‐3109‐2004
MathWorks. (2024). TrainingOptions; Options for training deep learning neural network. Retrieved from https://www.mathworks.com/help/deeplearning/ref/trainingoptions.html#bu59f0q_sep_mw_11d10ee1‐6ccb‐42e1‐8628‐c88bf76485b8_head
Mortazi, A., & Bagci, U. (2018). Automatically designing CNN architectures for medical image segmentation. In Machine learning in medical imaging: 9th international workshop, MLMI 2018, held in conjunction with MICCAI 2018, Granada, Spain, September 16, 2018, Proceedings (Vol. 9, pp. 98–106). Springer International Publishing. https://doi.org/10.1007/978‐3‐030‐00919‐9_12
Okoh, D. (2024). MATLAB software for convolutional neural network training of all sky imager images. Zenodo. https://doi.org/10.5281/zenodo.14191577
Okoh, D., Rabiu, B., Shiokawa, K., Otsuka, Y., Segun, B., Falayi, E., et al. (2017). First study on the occurrence frequency of equatorial plasma bubbles over West Africa using an all‐sky airglow imager and GNSS receivers. Journal of Geophysical Research: Space Physics, 122(12), 12–430. https://doi.org/10.1002/2017ja024602
Park, J., Lühr, H., Stolle, C., Rother, M., Min, K., & Michaelis, I. (2009). The characteristics of field‐aligned currents associated with equatorial plasma bubbles as observed by the CHAMP satellite. In Annales geophysicae (Vol. 27(7), pp. 2685–2697). Göttingen, Germany: Copernicus Publications.
Reyad, M., Sarhan, A. M., & Arafa, M. (2023). A modified Adam algorithm for deep neural network optimization. Neural Computing & Applications, 35(23), 17095–17112. https://doi.org/10.1007/s00521‐023‐08568‐z
Shiokawa, K., Katoh, Y., Satoh, M., Ejiri, M. K., Ogawa, T., Nakamura, T., et al. (1999). Development of optical mesosphere thermosphere imagers (OMTI). Earth Planets and Space, 51(7–8), 887–896. https://doi.org/10.1186/bf03353247
Shiokawa, K., Otsuka, Y., Lynn, K. J., Wilkinson, P., & Tsugawa, T. (2015). Airglow‐imaging observation of plasma bubble disappearance at geomagnetically conjugate points. Earth Planets and Space, 67, 1–12. https://doi.org/10.1186/s40623‐015‐0202‐6
Shiokawa, K., Otsuka, Y., & Ogawa, T. (2009). Propagation characteristics of nighttime mesospheric and thermospheric waves observed by optical mesosphere thermosphere imagers at middle and low latitudes. Earth, Planets and Space, 61(4), 479–491. https://doi.org/10.1186/BF03353165
Srisamoodkham, W., Shiokawa, K., Otsuka, Y., Ansari, K., & Jamjareegulgarn, P. (2022). Detecting equatorial plasma bubbles on all‐sky imager images using convolutional neural network. In Communication and intelligent systems: Proceedings of ICCIS 2021 (pp. 481–487). Springer Nature Singapore.
Taylor, M. J., Eccles, J. V., LaBelle, J., & Larsen, M. F. (1997). High‐resolution airglow imaging of F‐region depletion drifts during the Guara campaign. Geophysical Research Letters, 24(13), 1699–1702. https://doi.org/10.1029/97gl01207
Thanakulketsarat, T., Supnithi, P., Myint, L. M. M., Hozumi, K., & Nishioka, M. (2023). Classification of the equatorial plasma bubbles using convolutional neural network and support vector machine techniques. Earth Planets and Space, 75(1), 161. https://doi.org/10.1186/s40623‐023‐01903‐7
Tian, C., Zhuge, R., Wu, Z., Xu, Y., Zuo, W., Chen, C., & Lin, C. W. (2020). Lightweight image super‐resolution with enhanced CNN. Knowledge‐Based Systems, 205, 106235. https://doi.org/10.1016/j.knosys.2020.106235
Vamaraju, J., Vila, J., Araya‐Polo, M., Datta, D., Sidahmed, M., & Sen, M. K. (2021). Minibatch least‐squares reverse time migration in a deep‐learning framework. Geophysics, 86(2), S125–S142. https://doi.org/10.1190/geo2019‐0707.1
Walmsley, M., Ferguson, A. M., Mann, R. G., & Lintott, C. J. (2019). Identification of low surface brightness tidal features in galaxies using convolutional neural networks. Monthly Notices of the Royal Astronomical Society, 483(3), 2968–2982. https://doi.org/10.1093/mnras/sty3232
Weber, E. J., Buchau, J., Eather, R. H., & Mende, S. B. (1978). North‐south aligned equatorial airglow depletions. Journal of Geophysical Research, 83(A2), 712–716. https://doi.org/10.1029/ja083ia02p00712
Zhang, J., Hu, F., Li, L., Xu, X., Yang, Z., & Chen, Y. (2019). An adaptive mechanism to achieve learning rate dynamically. Neural Computing & Applications, 31(10), 6685–6698. https://doi.org/10.1007/s00521‐018‐3495‐0
Zhang, Y., Li, Y., Feng, Z., & Wang, X. (2020). Deep learning‐based ionospheric anomaly detection using GNSS data. Remote Sensing, 12(9), 1442.
© 2025. This work is published under the Creative Commons Attribution 4.0 License (http://creativecommons.org/licenses/by/4.0/).
Abstract
Equatorial plasma bubbles (EPBs) disrupt satellite-based communication and navigation systems, particularly in equatorial regions. Reliable detection and classification of EPBs from all-sky imager (ASI) images are essential for accurate space weather monitoring and forecasting. This study presents a novel bootstrapping convolutional neural network (CNN) approach that optimizes automated EPB detection in ASI images for operational space weather monitoring, overcoming challenges related to image variability and imbalanced data sets. Data used for CNN training were obtained from the optical mesosphere thermosphere imagers (OMTI) ASI installed at the Space Environment Research Laboratory, National Space Research and Development Agency, Abuja, during the period from 2015 to 2020. Our method involved training three sub-models and aggregating their predictions. The CNN trainings were conducted on three sub-datasets of 3,000 images each, categorized as "EPB," "Noisy/Cloudy," or "No EPB," and three corresponding sub-models were developed from the trainings. The three sub-models independently gave prediction accuracies of 98.67%, 98.33%, and 95.83% on a reserved test data set of 600 images. Ensemble models further improved the prediction accuracies to 99.17% and 99.33% for methods based on the mean of sub-model probabilities and the mode of sub-model classifications, respectively. Our results indicate that the bootstrapping CNN technique enhances EPB detection accuracy, providing a powerful tool for real-time space weather monitoring, with implications for improving the operational reliability of satellite-based navigation and communication in the equatorial region.
Affiliations
1 Istituto Nazionale Geofisica e Vulcanologia (INGV), Roma, Italy, National Space Research and Development Agency (NASRDA), Abuja, Nigeria, Institute for Space Science and Engineering (ISSE), African University of Science and Technology (AUST), Abuja, Nigeria, Department of Astronomy and Space Sciences, Technical University of Kenya, Nairobi, Kenya
2 Istituto Nazionale Geofisica e Vulcanologia (INGV), Roma, Italy
3 National Space Research and Development Agency (NASRDA), Abuja, Nigeria, Institute for Space Science and Engineering (ISSE), African University of Science and Technology (AUST), Abuja, Nigeria
4 Institute for Space‐Earth Environmental Research (ISEE), Nagoya University, Nagoya, Japan
5 Department of Physics, Federal University of Technology Akure (FUTA), Akure, Nigeria
6 National Space Research and Development Agency (NASRDA), Abuja, Nigeria
7 South African National Space Agency (SANSA), Hermanus, South Africa, Department of Physics and Electronics, Rhodes University, Makhanda, South Africa, Centre for Space Research, North‐West University, Potchefstroom, South Africa
8 STI, The Abdus Salam International Centre for Theoretical Physics, Strada Costiera, Trieste, Italy
9 King Mongkut's Institute of Technology Ladkrabang, Prince of Chumphon Campus, Chumphon, Thailand
10 National Institute for Space Research, Sao Jose dos Campos, Brazil
11 Department of Physics and Astronomy, University of Nigeria, Nsukka, Nigeria
12 Department of Astronomy and Space Sciences, Technical University of Kenya, Nairobi, Kenya