This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
1. Introduction
Breast cancer has the highest incidence rate among female malignancies [1]. Detecting it early, standardizing diagnosis, and reducing mortality are major challenges for the medical community worldwide. Medical image analysis is an important means of detecting breast cancer lesions early. It is also recognized as the most effective and reliable examination method, and it can reveal early signs of breast cancer lesions. Combined with early prevention and control measures, it can significantly reduce the cancer rate and mortality of the screened population. Because current medical technology offers no effective means of preventing breast cancer, early detection is an important way to combat breast tumors. With the rapid development of computer technology, computer-based digital diagnosis systems have become a development trend in medical technology, and computer-aided diagnosis of breast tumors has become an inevitable trend.
In recent years, many deep learning detection methods based on breast images have been proposed [2–4]. However, training deep neural networks (DNNs) requires large numbers of samples and annotations. Moreover, since DNNs contain multiple hidden layers, training them consumes substantial computing resources and time. Medical images are not easy to acquire, and senior doctors are needed to label the lesions. Compared with a DNN, a spiking neural network (SNN) is a more accessible tool. An SNN simulates the information processing mechanism of the brain, taking spiking neurons as its basic units to realize pattern recognition tasks [5]. An SNN represents information in spike patterns, and each spiking neuron exhibits rich dynamic behavior. In addition to the spread of information in the spatial domain, the current state is also affected by past states in the time domain. Therefore, SNNs are usually more general with respect to temporal information. Owing to its biological characteristics, an SNN has a lower computational cost in theory. However, the accuracy of SNNs is lower than that of artificial neural networks (ANNs), which rely on spatial propagation and continuous activation functions. In addition, SNNs are rarely used in medical image recognition. Therefore, this paper focuses on SNNs for breast cancer image recognition.
In this paper, an SNN is employed to recognize breast cancer on datasets of three modalities. The network consists of three parts (i.e., input layer, reservoir layer, and readout layer). In the input layer, the image is encoded into spike times, and the saliency features of the input image are extracted to better identify the lesion information. To obtain the optimal architecture, an evolutionary algorithm, i.e., the Fruit Fly Optimization Algorithm (FOA), is employed to optimize the architecture of the SNN.
The main contributions of this work are as follows.
(1) Two time encoding schemes (i.e., linear time encoding and entropy-based time encoding) are proposed to encode the input information. The encoding methods are compared and analyzed in the experimental part. Linear time encoding maps the pixels of the input image to spike times linearly. The entropy-based encoding scheme calculates the statistical form of image features. This method includes not only the aggregation feature at the image level but also the spatial feature of the image distribution.
(2) A method of saliency feature extraction is proposed to detect the lesion features of breast images so that the network can better detect the images containing lesions. This method uses a spike convolution network to extract input features. According to the output feature maps, the heat map of the image is reconstructed, and then, the mask is calculated to obtain the salient region of the image.
(3) To improve the SNN performance, an evolutionary computing method, namely, the Fruit Fly Optimization Algorithm (FOA) [6], is employed to optimize the SNN architecture. The SNN optimized via FOA can improve the classification accuracy and reduce the number of synaptic connections in the reservoir. The structure of the network in our work is determined by the neurons in the reservoir layer and the connections between them. The network structure determines the resulting performance to a large extent. The hyperparameters that determine the structure are usually empirical values, and it is difficult to select appropriate values manually. Therefore, the evolutionary algorithm is used to obtain the optimal structure in this work.
The rest of the paper is organized as follows. Related works are reviewed in Section 2. Section 3 proposes two methods of spike time encoding, the saliency feature extraction method, and the spiking reservoir neural network. Section 4 presents a method for SNN architecture optimization using FOA. The experimental results are provided in Section 5, which demonstrates the performance of the proposed methods on three breast cancer recognition tasks. Section 6 concludes this paper.
2. Related Works
Interventional therapy of medical imaging has significantly improved the level of early diagnosis of breast cancer. With the application of artificial intelligence in the field of healthcare, researchers use image processing and computer vision technology to design effective intelligent computer-aided detection and diagnosis systems.
Deep learning has made great progress in the field of pattern recognition. Various DNN-based feature extraction architectures have been proposed for breast cancer detection and classification [7]. A deep convolutional neural network (CNN) framework is proposed in [8]: features of the input images are extracted by three pretrained CNN architectures and, after average pooling, fed into a fully connected layer to distinguish malignant from benign tumors. In [9], whole slide images of breast biopsies are classified into five categories with a saliency detector built as a pipeline of four fully convolutional networks; the network fuses saliency and classification maps for the final categorization. An automated multiscale end-to-end DNN framework is proposed for mammogram classification [10]. It only requires mammogram images with annotations. The model generates feature maps at three scales so that the classifier combines global information with local lesions for classification. Software for automatic breast cancer detection is developed in [11], together with an algorithm that extracts features based on biodata, image analysis, and image statistics. These features are used by CNNs, optimized with the Bayes algorithm, to label images as normal or suspected. [12] proposes a hybrid deep CNN and RNN to classify breast cancer histopathological images.
Unlike deep CNNs, SNNs have seen limited use in the field of breast cancer detection and diagnosis. [13] treats the discontinuities at spike times as noise and the membrane potential as a continuous signal, so that an SNN can be trained by a back propagation algorithm based on gradient descent. [14] trains a deep SNN in two stages: first, the convolution kernels are pretrained layer by layer through unsupervised learning; then, the synaptic weights are fine-tuned through spike-based supervised gradient descent back propagation. [15] proposes an approximate derivative method that accounts for the leaky behavior of LIF neurons, which allows a deep convolutional SNN to be trained directly with spike-based back propagation. [5] proposes an ensemble SNN for histopathological images and applies it to an eight-class task covering four types of malignant tumors and four types of benign tumors. In the current study, a saliency-based SNN with time encoding is proposed to recognize breast images on datasets of three different modalities, namely, breast ultrasound images, breast X-ray images, and histopathological images. On the breast ultrasound and breast X-ray databases, the input image is classified as malignant or nonmalignant, with a false negative rate lower than 3%. A further investigation on the histopathological database addresses a multiclass task, which helps to distinguish tumor types in detail. Experimental results show that our approach can be applied to real-world scenarios.
3. Proposed Methods
A spiking reservoir neural network architecture, two manners of spike encoding (i.e., linear time encoding and entropy-based time encoding), and a saliency-based feature detection network are presented in this section. A diagram of the proposed method is shown in Figure 1. Compared with the existing spike encoding schemes, the linear spike time encoding method proposed in this paper is more concise in the calculation. The entropy-based spike time encoding calculates the statistical form of features and includes the spatial characteristics of the gray distribution.
[figure omitted; refer to PDF]
As shown in Figure 2, a liquid state machine (LSM) consists of three parts, i.e., an input layer, a reservoir (or liquid) layer, and a memoryless readout layer. The reservoir consists of Leaky Integrate and Fire (LIF) neurons, which are connected recurrently through dynamic synaptic connections. The red solid circles in the reservoir represent the excitatory neurons, and the black circles represent the inhibitory neurons. The number of excitatory neurons is set to four times the number of inhibitory neurons, following the ratio observed in cerebral cortex circuits. The readout layer is also built from LIF neurons, but there is no interconnection among them. The reservoir transforms the low dimensional input data into a high dimensional internal state. The internal state acts as the input of the memoryless readout layer, which is responsible for generating the final output of the LSM. The dynamic equations of LIF [20] simulate excitatory and inhibitory spike neurons. It can be calculated by
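As a rough illustration of LIF dynamics, the sketch below integrates one neuron with a forward Euler step. It uses a common textbook form of the model, which is not necessarily the exact equation of [20]. The function name `simulate_lif` and every default value (time step, membrane time constant, resting, reset and threshold potentials, refractory period) are hypothetical placeholders.

```python
import numpy as np

def simulate_lif(input_current, dt=1.0, tau_m=10.0, v_rest=0.0,
                 v_reset=0.0, v_thresh=0.2, t_refrac=2.0):
    """Forward-Euler simulation of a single LIF neuron (textbook form).

    All parameter values are illustrative assumptions, not the paper's settings.
    Returns the membrane potential trace and the list of spike times (ms).
    """
    v = v_rest
    refrac_left = 0.0
    trace, spikes = [], []
    for step, i_in in enumerate(input_current):
        if refrac_left > 0.0:
            # During the refractory period the potential is clamped to reset.
            refrac_left -= dt
            v = v_reset
        else:
            # Leaky integration: decay toward rest plus the input drive.
            v += dt / tau_m * (-(v - v_rest) + i_in)
            if v >= v_thresh:
                spikes.append(step * dt)
                v = v_reset
                refrac_left = t_refrac
        trace.append(v)
    return np.array(trace), spikes

trace, spikes = simulate_lif(np.ones(20))
print(spikes)
```

With a constant input, the neuron charges, fires, rests for the refractory period, and repeats, which gives a regular spike train; with zero input it stays at rest.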
The neurons in the reservoir act as liquid filters
The output state of the reservoir is taken as the input of the readout layer, and the state is converted to the output
The readout layer of an LSM usually uses an exponential filter to convert the spike output of the reservoir into a continuous signal for linear regression training. The LSM model in [21] is trained by the recursive least squares method using the liquid-filtered output, which loses the accurate spike time information produced by the liquid neurons [22]. In this work, readout neurons are trained by the ReSuMe algorithm [23, 24]. ReSuMe is a biologically plausible supervised learning algorithm. It allows a neuron to adjust its synaptic weights so that it can learn any spike pattern corresponding to a given synaptic stimulus. ReSuMe can guarantee convergence for a single input spike on each synapse. In the case of multiple spikes, a low learning rate can be used to ensure algorithm performance. The weight adjustment can be calculated by
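The idea behind a ReSuMe-style update can be sketched for a single synapse as follows. This follows the commonly cited form of the rule, in which each desired spike potentiates and each actual output spike depresses the weight, both scaled by a constant term plus an exponentially decaying trace of preceding input spikes. The function name `resume_delta_w` and the values of `lr`, `a`, `A` and `tau` are assumptions for illustration and are not taken from this paper.

```python
import numpy as np

def resume_delta_w(input_spikes, desired_spikes, output_spikes,
                   lr=0.01, a=0.05, A=1.0, tau=5.0):
    """Weight change of one synapse after one trial (spike times in ms).

    Hypothetical sketch of a ReSuMe-style rule; parameters are placeholders.
    """
    pre = np.asarray(input_spikes, dtype=float)

    def drive(t):
        # Constant term plus the decaying trace of input spikes at or before t.
        earlier = pre[pre <= t]
        return a + A * np.exp(-(t - earlier) / tau).sum()

    potentiation = sum(drive(t) for t in desired_spikes)  # desired spikes
    depression = sum(drive(t) for t in output_spikes)     # actual spikes
    return lr * (potentiation - depression)

print(resume_delta_w([1.0], desired_spikes=[3.0], output_spikes=[]))
```

When the output spike train already matches the desired one, the two terms cancel and the weight stops changing.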
3.2. Spike Encoding Scheme
For the human visual system (HVS), neural coding helps to explain how neural spike activity represents a visual scene and how that activity can be decoded to recover the visual information. Therefore, in an SNN, the primary problem to be solved is how to encode the numerical input data from the input image into input neuron spikes. An ANN uses matrix-vector operations to transform input to output and can therefore operate on the values directly. In an SNN, numerical data cannot be operated on directly. Instead, these values must be converted into spikes or spike trains. This conversion is very important: the choice of encoding scheme affects not only the accuracy in practical applications but also the speed of data processing and the energy efficiency of the system. This section proposes two different input encoding schemes, i.e., linear spike time encoding and entropy-based spike time encoding, and their effectiveness in medical image recognition tasks is evaluated in Section 5.
3.2.1. Linear Spike Time Encoding
The purpose of time encoding is to generate a spike pattern that represents the input. To supply the input layer of the SNN, each input representation is converted into a corresponding time pattern. The spike time encoding method uses the time of the spike to express the stimulation of the input signal. This form of time encoding is the basis for back propagation algorithms in SNNs, such as SpikeProp [25]. Time encoding is also used with other learning algorithms, such as spike-timing-dependent plasticity (STDP) [26].
In this section, a linear spike time encoding scheme is presented. Using this method, the pixel information of the image can be linearly mapped to the firing time of the spike. It can be calculated by
[figure omitted; refer to PDF]
This encoding method is a form of time-to-first-spike. According to neuroscience theory, the first spike fired by neurons carries the most information, so it is theoretically reliable.
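A minimal sketch of a linear time-to-first-spike mapping is given below. The exact mapping of this paper is not reproduced here; the sketch assumes that brighter pixels fire earlier and that spike times lie in a window of `t_max` milliseconds. The function name `linear_spike_time` and both default values are hypothetical.

```python
import numpy as np

def linear_spike_time(image, t_max=100.0, max_value=255.0):
    """Map every pixel to one spike time, linearly in its intensity.

    Assumption for illustration: intensity max_value fires at t = 0 and
    intensity 0 fires at t = t_max.
    """
    pixels = np.asarray(image, dtype=float)
    return t_max * (1.0 - pixels / max_value)

print(linear_spike_time([[0, 255]]))
```

Each input neuron then emits exactly one spike per image, so the encoding cost grows only with the number of pixels.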
3.2.2. Entropy-Based Spike Time Encoding
Spike time encoding method based on information entropy calculates the statistical form of image features. The average information in the image is taken as the time of firing spike. This method includes not only the aggregation feature of gray level but also the spatial feature of gray distribution. The adjacent gray value is selected as the spatial feature of the gray distribution. The adjacent gray value and the pixel of the image form a feature tuple, i.e.,
The above formula reflects the gray value of a pixel position and the comprehensive characteristics of the distribution, where
According to the information entropy, the time of spike firing can be calculated by
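The entropy over (gray value, adjacent gray value) tuples can be illustrated with the sketch below. It assumes that the adjacent gray value is the integer mean of the eight neighbors, that borders are handled by edge replication, and that the entropy is measured in bits; these choices, and the function name `two_dim_entropy`, are assumptions. The subsequent mapping from entropy to spike time is not reproduced.

```python
import numpy as np

def two_dim_entropy(image, levels=256):
    """Entropy (bits) of the joint distribution of (pixel, neighbor-mean) tuples.

    Hypothetical sketch: 8-neighborhood mean with edge-replicated borders.
    The image must contain integers in [0, levels).
    """
    img = np.asarray(image, dtype=np.int64)
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    neigh_sum = np.zeros((h, w), dtype=np.int64)
    for dy in (0, 1, 2):
        for dx in (0, 1, 2):
            if dy == 1 and dx == 1:
                continue  # skip the center pixel itself
            neigh_sum += padded[dy:dy + h, dx:dx + w]
    neigh_mean = neigh_sum // 8
    # Joint histogram of the feature tuples (gray value, adjacent gray value).
    joint = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(joint, (img.ravel(), neigh_mean.ravel()), 1.0)
    p = joint / joint.sum()
    nz = p[p > 0]
    return float(-(nz * np.log2(nz)).sum())

half = np.zeros((4, 4), dtype=int)
half[:, 2:] = 255
print(two_dim_entropy(half))
```

A uniform image yields zero entropy, while an image with more varied gray values and local structure yields a larger value, which is the property such an encoding relies on.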
3.3. Visual Saliency Detection
The visual data entering the human eyes amounts to about $10^{8}$ to $10^{9}$ bits per second [27]. For the human visual system (HVS), processing these data streams in real time is an extremely heavy task, so the HVS understands and processes only part of the information. This selection mechanism is named visual attention. Attention is considered to be driven by two mechanisms, i.e., a stimulus-driven bottom-up mechanism and an expectation-driven top-down mechanism [28]. Bottom-up attention is mainly driven by the orientation, contrast, color, motion, and other attributes of the visual scene. Top-down attention is related to cognitive aspects such as memory, experience, and cultural background. In the field of computer vision, because of its simplicity, visual attention mainly refers to the former (bottom-up) mechanism, which is often called visual saliency [29].
In this section, a saliency-based calculation model is proposed, as shown in Figure 4. A malignant tumor image is taken as an example. The spiking CNN is employed to extract the features. Then, the two-dimensional feature maps generated by the spike convolution layer are summed, and the mask is calculated to obtain the saliency feature map. However, for medical imaging devices, the relationship between the input energy and the brightness of the color recorded in the image file is linear. As a result, the image displayed on the device is inconsistent with the actual image captured by the camera. To correct this difference, gamma nonlinearity is performed before the saliency calculation. The gamma nonlinearity can be calculated by
[figure omitted; refer to PDF]
When
The two-dimensional feature maps mainly depend on convolution calculation. It can be calculated by
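The pipeline described in this section (gamma correction, summation of the feature maps, mask calculation) can be sketched as below. The sketch assumes a plain power-law gamma on a normalized image and a fixed threshold on the normalized heat map. The function names, the value of `gamma` and the value of `threshold` are hypothetical, and the feature maps are taken as given rather than produced by a spiking convolution layer.

```python
import numpy as np

def gamma_correct(image, gamma=2.2, max_value=255.0):
    """Power-law gamma on an image scaled to [0, 1] (gamma value is assumed)."""
    x = np.asarray(image, dtype=float) / max_value
    return x ** gamma

def saliency_mask(feature_maps, threshold=0.5):
    """Sum 2-D feature maps into a heat map and threshold it into a mask.

    feature_maps has shape (n_maps, H, W). The threshold is a placeholder.
    """
    maps = np.asarray(feature_maps, dtype=float)
    heat = maps.sum(axis=0)
    span = heat.max() - heat.min()
    # Normalize to [0, 1]; a flat heat map has no salient region.
    heat = (heat - heat.min()) / span if span > 0 else np.zeros_like(heat)
    return heat, heat >= threshold

maps = np.zeros((2, 3, 3))
maps[:, 1, 1] = 1.0
heat, mask = saliency_mask(maps)
print(mask)
```

Multiplying the input image by the mask keeps only the salient region for the subsequent encoding step.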
4. SNN Architecture Optimization
The connection of neurons in the LSM neural network includes the connection between input neurons and excitatory (or inhibitory) neurons in the reservoir, interconnection of neurons in the reservoir, and connection between neurons in the reservoir and neurons in the readout layer. There is no interconnection between neurons in the input layer (or readout layer). There are four types of recursive connections between the synapses in the reservoir, i.e., the connection between excitatory and excitatory synapses (
The performance of the LSM model in pattern recognition tasks depends not only on the strength of the connections between neurons but also on the number of neurons and the probability of synaptic connection. To design an efficient reservoir layer that performs desired kernel functions, these parameters need to be optimized. Therefore, the FOA [6] is employed to search for the best network architecture, that is, the parameters of connection probability
Algorithm 1
# Randomly initialize the fruit fly swarm location, the search radius (R), and the number of optimization variables (D):
Init X_axis, Y_axis, Z_axis, R, D
# Enter the iterative optimization. Each iteration repeats the steps below and judges whether the smell concentration is superior to that of the previous iteration.
for i in range(popsize):
    X(i) = X_axis + R
    Y(i) = Y_axis + R
    Z(i) = Z_axis + R
    # The distance to the origin is estimated.
    Dist(i) = sqrt(X(i)^2 + Y(i)^2 + Z(i)^2)
    # The smell concentration judgment value is calculated.
    S(i) = 1/Dist(i)
    # Substitute the smell concentration judgment value (S) into the smell concentration judgment function.
    Smell(i) = function(S(i))
# Find the best value and its location index.
Smellbest, bestIndex = min(Smell), argmin(Smell)
bestSmell = Smellbest
X_axis = X(bestIndex)
Y_axis = Y(bestIndex)
Z_axis = Z(bestIndex)
FOA is a global optimization algorithm based on the foraging behavior of fruit flies, as shown in Figure 5. The fruit fly is superior to other species in sensory perception, especially smell and vision: its olfactory organs can collect all kinds of odors floating in the air and even smell food sources 40 km away. After flying close to the food location, the fruit fly can also use its eyes to find where the food and its companions gather and fly in that direction. The random direction and distance of an individual fruit fly searching for food can be calculated by
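A runnable sketch of the loop in Algorithm 1 is given below. It assumes that each fly takes a uniform random step within radius `R` of the swarm location and that the swarm moves only when the best smell of the current iteration improves on the best so far. The function name `foa_minimize`, the toy smell function and all default values are hypothetical; in this work the smell function would instead evaluate an SNN built from the candidate architecture parameters.

```python
import numpy as np

def foa_minimize(smell_fn, n_iter=100, popsize=20, radius=1.0, seed=0):
    """Minimize smell_fn(S) with a basic three-coordinate FOA loop.

    Hypothetical sketch: uniform random steps, minimization of the smell value.
    Returns the best judgment value S and its smell concentration.
    """
    rng = np.random.default_rng(seed)
    axis = rng.uniform(-1.0, 1.0, size=3)      # X_axis, Y_axis, Z_axis
    best_s, best_smell = None, np.inf
    for _ in range(n_iter):
        # Random direction and distance for every fly around the swarm location.
        pos = axis + rng.uniform(-radius, radius, size=(popsize, 3))
        dist = np.sqrt((pos ** 2).sum(axis=1))  # distance to the origin
        s = 1.0 / dist                          # smell concentration judgment value
        smell = np.array([smell_fn(v) for v in s])
        idx = int(np.argmin(smell))
        if smell[idx] < best_smell:
            # Keep the best location and let the swarm fly toward it.
            best_smell = float(smell[idx])
            best_s = float(s[idx])
            axis = pos[idx]
    return best_s, best_smell

best_s, best_smell = foa_minimize(lambda s: (s - 2.0) ** 2)
print(best_s, best_smell)
```

The search needs no gradient of the smell function, which is why it suits architecture parameters whose effect on accuracy can only be measured by training and testing the network.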
5. Experimental Results
To effectively verify the performance of the network, three different modalities of breast image datasets are selected to test the proposed network, namely, breast ultrasound images, breast X-ray images, and breast histopathological images.
5.1. Experiment Settings
The number of input neurons is set according to the image size. The image size of the three datasets is different, so it is necessary to set a different number of input neurons. Other parameters in the network are set according to experience, as shown in Table 1.
Table 1
The parameters of neuron model and network for experiments.
Neuron model | Network | ||
Parameters | Value | Parameters | Value |
0 mV | 100 | ||
0.2 mV | 1 | ||
0 mV | 1 | ||
10 ms | 1 | ||
1 ms | -1.2 | ||
2 ms | 0.5 |
5.2. BreastMNIST Database
The dataset of breast ultrasound images [30] consists of 780 images in 3 classes, i.e., normal, benign, and malignant, as shown in Figure 5. In our work, the BreastMNIST database [31], which is based on these breast ultrasound images, is used for testing the proposed SNN. The dataset uses low-resolution images and simplifies the task to binary classification: normal and benign are combined as negative, and malignant is classified as positive. And the source images with an average image size of
The three types in the dataset are shown in Figure 6. Figure 6(a) is a normal image, Figure 6(b) is a benign tumor image, and Figure 6(c) is a malignant tumor image. The images in each category are shown in Table 2. As shown in Table 2, the dataset contains 133 normal images, 437 benign images, and 210 malignant images.
[figures omitted; refer to PDF]
Table 2
The number of images in each category.
Classes of images | Number of images |
Normal | 133 |
Benign | 437 |
Malignant | 210 |
Total | 780 |
The network performance comparison of several models on the BreastMNIST dataset is shown in Table 3. It shows that 74.36% training accuracy and 81.19% test accuracy can be obtained by setting the LSM with empirical parameters only. Using the linear time coding algorithm, the training accuracy of the network is 77.67%, and the test accuracy is 89.32%. The training accuracy of 86.19% and the test accuracy of 92.74% are obtained by the spike time encoding scheme based on information entropy. After employing the saliency-based feature module, the training accuracy is 85.20%, and the test accuracy is 92.74%. Further optimization of the network architecture parameters can obtain the optimal training accuracy and test accuracy, which are 88.57% and 97.44%, respectively.
Table 3
The network performance comparison of different methods on BreastMNIST.
Methods | Training accuracy | Test accuracy |
Original SNN | 74.36% | 81.19% |
SNN with linear time encoding | 77.67% | 89.32% |
SNN with entropy-based time encoding | 86.19% | 92.74% |
Saliency-based SNN | 85.20% | 92.74% |
Improved SNN | 88.57% | 97.44% |
The performance curve is shown in Figure 7. Figure 7(a) shows the performance curve and the area under the curve on the training data, and Figure 7(b) shows them on the test data. As shown in the figure, the areas under the curve on the training data and test data are 0.90 and 0.99, respectively.
[figures omitted; refer to PDF]
In the optimized SNN, the total number of neurons in the reservoir
[figures omitted; refer to PDF]
Table 4 compares the performance of different algorithms on the BreastMNIST database. ResNet-18 obtains 85.9% and 87.8% accuracy for the (28) and (224) settings listed in the table, and ResNet-50 obtains 85.3% and 83.3%. Auto-sklearn and AutoKeras obtain 80.8% and 80.1% accuracy, respectively, and Google AutoML Vision obtains 86.5%. Our work obtains the best result, i.e., 97.4%, on the BreastMNIST database.
Table 4
Performance comparison of several networks on the BreastMNIST database (
Methods | AUC | Accuracy | Confidence intervals |
ResNet-18 (28) [32] | 0.821 | 85.9% | (0.8328, 0.8817) |
ResNet-18 (224) [32] | 0.857 | 87.8% | (0.8534, 0.8993) |
ResNet-50 (28) [32] | 0.839 | 85.3% | (0.826, 0.8757) |
ResNet-50 (224) [32] | 0.818 | 83.3% | (0.8055, 0.8578) |
Auto-sklearn [33] | 0.673 | 80.8% | (0.7786, 0.8338) |
AutoKeras [34] | 0.646 | 80.1% | (0.7719, 0.8278) |
Google AutoML Vision [31] | 0.932 | 86.5% | (0.8396, 0.8876) |
This work | 0.997 | 97.4% | (0.9608, 0.9834) |
5.3. Mini-MIAS Database
Breast X-ray imaging offers good contrast and a resolution that can distinguish differences in microstructure density between tissues; it is also easy to operate, relatively cheap, readily accepted, and has high diagnostic accuracy. It is internationally recognized as an effective measure for early opportunistic screening and early detection of breast cancer. The Mammographic Image Analysis Society (MIAS) dataset includes 322 images from 161 patients. This work uses the mini-MIAS database, which contains images with a size of
The three types of images in the dataset are shown in Figure 9. Figure 9(a) is the normal image, Figure 9(b) is the benign tumor image, and Figure 9(c) is the malignant tumor image. To get an effective model and balance all kinds of data, the original image is rotated at different angles to expand the data. The experimental results are shown in Table 5.
[figures omitted; refer to PDF]
Table 5
The performance of different methods on the mini-MIAS database.
Methods | Training accuracy | Test accuracy |
Original SNN | 81.99% | 83.54% |
SNN with linear time encoding | 92.85% | 89.75% |
SNN with entropy-based time encoding | 94.72% | 95.03% |
Saliency-based SNN | 93.17% | 96.27% |
Improved SNN | 95.17% | 98.27% |
The SNN performance comparison of different methods on the mini-MIAS dataset is shown in Table 5. It shows that 81.99% training accuracy and 83.54% test accuracy can be obtained by setting the LSM with empirical parameters only. Using the linear time encoding algorithm, the training accuracy of the network is 92.85%, and the test accuracy is 89.75%. The training accuracy of 94.72% and the test accuracy of 95.03% are obtained by the spike time encoding scheme based on information entropy. After employing the saliency-based feature module, the training accuracy is 93.17%, and the test accuracy is 96.27%. Further optimization of the network architecture parameters yields the best training accuracy and test accuracy, which are 95.17% and 98.27%, respectively. The performance curve is shown in Figure 10. Figure 10(a) shows the performance curve and the area under the curve on the training data, and Figure 10(b) shows them on the test data. As shown in the figure, the areas under the curve on the training data and test data are 0.98 and 0.99, respectively.
[figures omitted; refer to PDF]
In the optimized SNN, the total neurons in the reservoir
[figures omitted; refer to PDF]
Performance comparison of different algorithms on the mini-MIAS database is shown in Table 6. The method of adaptive thresholding provides 93% accuracy in [36]. An accuracy of 94.57% is achieved in [37] using Fisher linear discriminant analysis features of neighborhood structural similarity. An accuracy of 96.7% is obtained in [38] using deep distance metric learning. Using texture features with a neural network classifier provides 95.2% accuracy [39]. A superresolution reconstruction module with texture features provides 96.7% accuracy in [40]. The Hanman transform classifier and the hesitancy-based Hanman transform classifier used in [41] can achieve 100% accuracy. This work provides 98.27% accuracy while using only one LSM classifier for classification.
Table 6
Performance comparison of different algorithms on the mini-MIAS database (
Algorithms or classifiers | Accuracy | Confidence intervals |
Adaptive thresholding [36] | 93% | (0.8951, 0.952) |
Fisher’s LDA [37] | 94.57% | (0.9171, 0.9668) |
Deep distance metric learning [38] | 96.7% | (0.9398, 0.9808) |
Neural network [39] | 95.2% | (0.9208, 0.9692) |
Simple logistic [40] | 96.7% | (0.9398, 0.9808) |
HT, HHT [41] | 100% | (0.9882, 1.0) |
Saliency-based SNN (this work) | 98.27% | (0.96, 0.9915) |
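One common way to obtain such intervals from an accuracy and a sample count is the Wilson score interval, sketched below. The interval method is not stated in the text here, so treating it as Wilson at the 95% level is an assumption. The assumption is at least consistent with the 100% row: for 322 images it gives a lower bound of about 0.9882 and an upper bound of 1.0. The function name `wilson_interval` is hypothetical.

```python
import math

def wilson_interval(accuracy, n, z=1.959964):
    """Wilson score interval for a proportion observed on n samples.

    z = 1.959964 corresponds to a 95% confidence level (assumed here).
    """
    denom = 1.0 + z * z / n
    center = (accuracy + z * z / (2.0 * n)) / denom
    half = z / denom * math.sqrt(
        accuracy * (1.0 - accuracy) / n + z * z / (4.0 * n * n)
    )
    return center - half, center + half

low, high = wilson_interval(1.0, 322)
print(round(low, 4), round(high, 4))
```

Unlike the normal approximation, this interval stays inside [0, 1] and remains informative when the observed accuracy is at or near 100%.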
5.4. BreaKHis Database
To further test the performance of the reservoir SNN model in breast cancer image recognition, the proposed SNN is applied to the breast histopathological images, i.e., the BreaKHis database [42]. The dataset contains four types of histologically benign tumors (phyllodes tumor, fibroadenoma, tubular adenoma, and adenosis) and four types of malignant tumors (mucinous, lobular, papillary, ductal). Each type of image is shown in Figure 12.
[figures omitted; refer to PDF]
This database contains 7909 images acquired at four magnification factors, i.e., 40x, 100x, 200x, and 400x. It includes 2480 images of benign tumors and 5429 images of malignant tumors. The size of each image is
Table 7
The detailed information of the BreaKHis database.
Tumor types | The number of images | |
Benign | Adenosis | 444 |
Fibroadenoma | 1014 | |
Phyllodes tumor | 453 | |
Tubular adenoma | 569 | |
Malignant | Ductal carcinoma | 3451 |
Lobular carcinoma | 626 | |
Mucinous carcinoma | 792 | |
Papillary carcinoma | 560 |
The SNN performance comparison of several methods on the BreaKHis dataset is shown in Table 8. On the 40x images, 76.33% training accuracy and 78.20% test accuracy are obtained by setting the LSM with empirical parameters only. Using the linear time encoding algorithm, the training accuracy of the network is 80.20%, and the test accuracy is 84.54%. The training accuracy of 86.56% and the test accuracy of 89.90% are obtained by using the spike time encoding scheme based on information entropy. After adding the saliency analysis module, the training accuracy is 89.92%, and the test accuracy is 94.46%. Further optimization of the network architecture yields the best training accuracy and test accuracy, which are 91.10% and 96.27%, respectively. On the 100x images, the best training accuracy is 89.60%, and the test accuracy is 93.35%. On the 200x images, the best training accuracy is 90.33%, and the test accuracy is 95.24%. On the 400x images, the best training accuracy is 93.06%, and the test accuracy is 98.44%.
Table 8
Performance comparison of different methods on the BreaKHis database.
Methods | Training accuracy (%) | Test accuracy (%) | |||||||
40x | 100x | 200x | 400x | 40x | 100x | 200x | 400x | ||
Original SNN | 76.33 | 73.25 | 75.13 | 74.12 | 78.20 | 74.52 | 76.31 | 75.90 | |
SNN with linear time encoding | 80.20 | 80.20 | 81.72 | 81.33 | 84.54 | 83.36 | 82.90 | 84.74 | |
SNN with entropy-based time encoding | 86.56 | 83.15 | 85.77 | 87.29 | 89.90 | 85.33 | 86.83 | 90.70 | |
Saliency-based SNN | 89.92 | 88.31 | 89.60 | 90.60 | 94.46 | 91.67 | 93.25 | 94.74 | |
Improved SNN | 91.10 | 89.60 | 90.33 | 93.06 | 96.27 | 93.35 | 95.24 | 98.44 |
The results of several other studies on the BreaKHis dataset are used for performance comparison. Table 9 shows the accuracies of these approaches, and Table 10 compares the confidence intervals of the proposed method and several other methods on the BreaKHis database. [43] reports accuracies of 86%-90% using AlexNet. The experimental results of [42] demonstrate that the QDA classifier can achieve higher accuracies than the RF and SVM classifiers on BreaKHis. A deep convolutional neural network is used to achieve 95.7%-97.1% classification accuracies in [44]. [45] investigates the performance of five CNN architectures (i.e., LeNet-5, AlexNet, VGG-16, ResNet-50, and Inception-v1) on the basis of test accuracy. Inception-v1 achieves test accuracies of 89%, 92%, 94%, and 90% at the 40x, 100x, 200x, and 400x magnification factors, respectively. [46] proposes an efficient and lightweight CNN model for histopathological image classification based on MobileNet. It achieves test accuracies of 91.42%, 89.93%, 92.70%, and 85.84% at the 40x, 100x, 200x, and 400x magnification factors, respectively. The ensemble SNN [5] achieves 98.7%, 95.1%, 96.7%, and 97.5% accuracies, respectively, which are higher than those of the other approaches. The result of our work is not the best at the 40x, 100x, and 200x magnifications. However, it is better than the others on the 400x images.
Table 9
Accuracy comparison of the proposed method and several other methods over the BreaKHis database.
Methods | Magnification factors | |||
40x | 100x | 200x | 400x | |
AlexNet [43] | 90.0% | 88.4% | 84.6% | 86.1% |
PFTAS+SVM [42] | 81.6% | 79.9% | 85.1% | 82.3% |
PFTAS+RF [42] | 81.8% | 81.3% | 83.5% | 81.0% |
PFTAS+QDA [42] | 83.8% | 82.1% | 84.2% | 82.0% |
CSDCNN [44] | 97.1% | 95.7% | 96.7% | 95.7% |
Inception-v1 [45] | 89% | 92% | 94% | 90% |
MobiHisNet [46] | 91.42% | 89.93% | 92.70% | 85.84% |
The ensemble SNN [5] | 98.7% | 95.1% | 96.7% | 97.5% |
This work | 96.3% | 93.4% | 95.2% | 98.4% |
Table 10
Confidence interval comparison of the proposed method and several other methods over the BreaKHis database (
Methods | Confidence intervals | |||
40x | 100x | 200x | 400x | |
AlexNet [43] | (0.8857, 0.9121) | (0.8697, 0.8972) | (0.8296, 0.8611) | (0.8443, 0.8761) |
PFTAS+SVM [42] | (0.7984, 0.8324) | (0.7813, 0.8158) | (0.8348, 0.8659) | (0.8049, 0.8399) |
PFTAS+RF [42] | (0.8005, 0.8343) | (0.7958, 0.8293) | (0.8183, 0.8507) | (0.7912, 0.8273) |
PFTAS+QDA [42] | (0.8213, 0.8536) | (0.8041, 0.8371) | (0.8254, 0.8573) | (0.8015, 0.8368) |
CSDCNN [44] | (0.9626, 0.9774) | (0.9476, 0.9651) | (0.9585, 0.9741) | (0.9468, 0.9655) |
Inception-v1 [45] | (0.8752, 0.9027) | (0.9073, 0.9307) | (0.9287, 0.9495) | (0.8854, 0.913) |
MobiHisNet [46] | (0.9012, 0.9258) | (0.8854, 0.9113) | (0.9148, 0.9376) | (0.8414, 0.8735) |
The ensemble SNN [5] | (0.981, 0.9911) | (0.9409, 0.9595) | (0.9585, 0.9741) | (0.9671, 0.9815) |
This work | (0.9537, 0.9703) | (0.9227, 0.9441) | (0.9416, 0.9603) | (0.9772, 0.9889) |
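The text available here does not state how the intervals in Table 10 were computed, nor the confidence level or test-set sizes. As a hedged illustration only, the sketch below shows one common way to attach an interval to a classification accuracy, the Wilson score interval for a binomial proportion. The choice of the Wilson method, the 95% level (`z=1.959964`), the function name `wilson_interval`, and the sample size in the usage example are all assumptions and are not taken from the paper.

```python
import math


def wilson_interval(accuracy, n_samples, z=1.959964):
    """Wilson score interval for a proportion.

    accuracy  : observed accuracy as a fraction in [0, 1]
    n_samples : number of test images (hypothetical here; not given in the text)
    z         : standard normal quantile; 1.959964 corresponds to a 95% level,
                which is an assumption, not a value taken from the paper
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    z2_over_n = z * z / n_samples
    denom = 1.0 + z2_over_n
    # The interval is centered slightly toward 0.5 relative to the raw accuracy.
    center = (accuracy + z2_over_n / 2.0) / denom
    half_width = (z / denom) * math.sqrt(
        accuracy * (1.0 - accuracy) / n_samples + z * z / (4.0 * n_samples ** 2)
    )
    return center - half_width, center + half_width


if __name__ == "__main__":
    # n_samples=500 is a made-up test-set size used only for illustration.
    low, high = wilson_interval(0.984, 500)
    print(f"({low:.4f}, {high:.4f})")
```

Under this method the interval narrows as the number of test images grows, so intervals are only comparable across methods when the test-set sizes are known.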
In the optimized SNN, the total neurons in the reservoir
[figures omitted; refer to PDF]
6. Conclusions and Future Work
This paper has presented a saliency-based SNN for breast cancer recognition, underpinned by the ReSuMe learning algorithm and a spike time encoding scheme. To improve performance, the FOA was employed to optimize the architecture of the SNN. The results demonstrate that the network is effective for breast images. Experimental results show that the SNN with entropy-based time encoding performs better than the SNN with linear time encoding, and that the saliency model and the optimized SNN further improve the classification accuracy. However, on the multiclass classification task of the BreaKHis database, the performance of the proposed SNN is still insufficient. Future work will focus on multiclass learning with the SNN and on further research into SNN architectures, such as multireservoir cascade structures and parallel reservoir architectures.
Acknowledgments
This work is supported by the Natural Science Foundation of Heilongjiang Province under Grant LH2020F023.
[1] X. Qi, L. Zhang, Y. Chen, Y. Pi, Y. Chen, Q. Lv, Z. Yi, "Automated diagnosis of breast ultrasonography images using deep neural networks," Medical Image Analysis, vol. 52, pp. 185-198, DOI: 10.1016/j.media.2018.12.006, 2019.
[2] B. Lei, S. Huang, H. Li, R. Li, C. Bian, Y. H. Chou, J. Qin, P. Zhou, X. Gong, J. Z. Cheng, "Self-co-attention neural network for anatomy segmentation in whole breast ultrasound," Medical Image Analysis, vol. 64, DOI: 10.1016/j.media.2020.101753, 2020.
[3] G. Maicas, A. P. Bradley, J. C. Nascimento, I. Reid, G. Carneiro, "Pre and post-hoc diagnosis and interpretation of malignancy from breast DCE-MRI," Medical Image Analysis, vol. 58, article 101562, DOI: 10.1016/j.media.2019.101562, 2019.
[4] C. Gallego-Ortiz, A. L. Martel, "A graph-based lesion characterization and deep embedding approach for improved computer-aided diagnosis of nonmass breast MRI lesions," Medical Image Analysis, vol. 51, pp. 116-124, DOI: 10.1016/j.media.2018.10.011, 2019.
[5] Q. Fu, H. Dong, "An ensemble unsupervised spiking neural network for objective recognition," Neurocomputing, vol. 419, pp. 47-58, DOI: 10.1016/j.neucom.2020.07.109, 2021.
[6] W. T. Pan, "A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example," Knowledge-Based Systems, vol. 26, pp. 69-74, DOI: 10.1016/j.knosys.2011.07.001, 2012.
[7] Y. Benhammou, B. Achchab, F. Herrera, S. Tabik, "BreakHis based breast cancer automatic diagnosis using deep learning: taxonomy, survey and insights," Neurocomputing, vol. 375, DOI: 10.1016/j.neucom.2019.09.044, 2020.
[8] S. U. Khan, N. Islam, Z. Jan, I. Ud Din, J. J. P. C. Rodrigues, "A novel deep learning based framework for the detection and classification of breast cancer using transfer learning," Pattern Recognition Letters, vol. 125, DOI: 10.1016/j.patrec.2019.03.022, 2019.
[9] B. Gecer, S. Aksoy, E. Mercan, L. G. Shapiro, D. L. Weaver, J. G. Elmore, "Detection and classification of cancer in whole slide breast histopathology images using deep convolutional networks," Pattern Recognition, vol. 84, pp. 345-356, DOI: 10.1016/j.patcog.2018.07.022, 2018.
[10] L. Xie, L. Zhang, T. Hu, H. Huang, Z. Yi, "Neural networks model based on an automated multi-scale method for mammogram classification," Knowledge-Based Systems, vol. 208, pp. 106465-106469, DOI: 10.1016/j.knosys.2020.106465, 2020.
[11] S. Ekici, H. Jawzal, "Breast cancer diagnosis using thermography and convolutional neural networks," Medical Hypotheses, vol. 137, DOI: 10.1016/j.mehy.2019.109542, 2020.
[12] R. Yan, F. Ren, Z. Wang, L. Wang, T. Zhang, Y. Liu, X. Rao, C. Zheng, F. Zhang, "Breast cancer histopathological image classification using a hybrid deep neural network," Methods, vol. 173, pp. 52-60, DOI: 10.1016/j.ymeth.2019.06.014, 2020.
[13] J. H. Lee, T. Delbruck, M. Pfeiffer, "Training deep spiking neural networks using backpropagation," Frontiers in Neuroscience, vol. 10 no. 508, DOI: 10.3389/fnins.2016.00508, 2016.
[14] C. Lee, P. Panda, G. Srinivasan, K. Roy, "Training deep spiking convolutional neural networks with STDP-based unsupervised pre-training followed by supervised fine-tuning," Frontiers in Neuroscience, vol. 12, DOI: 10.3389/fnins.2018.00435, 2018.
[15] C. Lee, S. S. Sarwar, P. Panda, G. Srinivasan, K. Roy, "Enabling spike-based backpropagation for training deep neural network architectures," Frontiers in Neuroscience, vol. 14, DOI: 10.3389/fnins.2020.00119, 2020.
[16] W. Maass, T. Natschlager, H. Markram, "Real-time computing without stable states: a new framework for neural computation based on perturbations," Neural Computation, vol. 14 no. 11, pp. 2531-2560, DOI: 10.1162/089976602760407955, 2002.
[17] T. Yamazaki, S. Tanaka, "The cerebellum as a liquid state machine," Neural Networks, vol. 20 no. 3, pp. 290-297, DOI: 10.1016/j.neunet.2007.04.004, 2007.
[18] N. Soures, D. Kudithipudi, "Spiking reservoir networks: brain-inspired recurrent algorithms that use random, fixed synaptic strengths," IEEE Signal Processing Magazine, vol. 36 no. 6, pp. 78-87, DOI: 10.1109/MSP.2019.2931479, 2019.
[19] P. Enel, E. Procyk, R. Quilodran, P. F. Dominey, "Reservoir computing properties of neural dynamics in prefrontal cortex," PLoS Computational Biology, vol. 12 no. 6, DOI: 10.1371/journal.pcbi.1004967, 2016.
[20] W. Ponghiran, G. Srinivasan, K. Roy, "Reinforcement learning with low-complexity liquid state machines," Frontiers in Neuroscience, vol. 13, DOI: 10.3389/fnins.2019.00883, 2019.
[21] W. Nicola, C. Clopath, "Supervised learning in spiking neural networks with FORCE training," Nature Communications, vol. 8 no. 1, DOI: 10.1038/s41467-017-01827-3, 2017.
[22] D. Florescu, D. Coca, "Learning with precise spike times: a new decoding algorithm for liquid state machines," Neural Computation, vol. 31 no. 9, pp. 1825-1852, DOI: 10.1162/neco_a_01218, 2019.
[23] F. Ponulak, A. Kasiński, "Supervised learning in spiking neural networks with ReSuMe: sequence learning, classification, and spike shifting," Neural Computation, vol. 22 no. 2, pp. 467-510, DOI: 10.1162/neco.2009.11-08-901, 2010.
[24] F. Ponulak, "Analysis of the ReSuMe Learning Process For Spiking Neural Networks," International Journal of Applied Mathematics & Computer Science, vol. 18 no. 2, pp. 117-127, DOI: 10.2478/v10006-008-0011-1, 2008.
[25] X. Wang, X. Lin, X. Dang, "Supervised learning in spiking neural networks: a review of algorithms and evaluations," Neural Networks, vol. 125, pp. 258-280, DOI: 10.1016/j.neunet.2020.02.011, 2020.
[26] Y. Hao, X. Huang, M. Dong, B. Xu, "A biologically plausible supervised learning method for spiking neural networks using the symmetric STDP rule," Neural Networks, vol. 121, pp. 387-395, DOI: 10.1016/j.neunet.2019.09.007, 2020.
[27] K. Koch, J. McLean, R. Segev, M. A. Freed, M. J. Berry, V. Balasubramanian, P. Sterling, "How much the eye tells the brain," Current Biology, vol. 16 no. 14, pp. 1428-1434, DOI: 10.1016/j.cub.2006.05.056, 2006.
[28] S. Ullman, C. Koch, "Shifts in selective visual-attention: towards the underlying neural circuitry," Human Neurobiology, vol. 4 no. 4, pp. 219-277, 1985.
[29] A. Borji, D. N. Sihite, L. Itti, "Quantitative analysis of human-model agreement in visual saliency modeling: a comparative study," IEEE Transactions on Image Processing, vol. 22 no. 1, pp. 55-69, DOI: 10.1109/TIP.2012.2210727, 2013.
[30] W. Al-Dhabyani, M. Gomaa, H. Khaled, A. Fahmy, "Dataset of breast ultrasound images," Data Br., vol. 28, DOI: 10.1016/j.dib.2019.104863, 2020.
[31] J. Yang, R. Shi, B. Ni, "MedMNIST classification decathlon: a lightweight AutoML benchmark for medical image analysis," 2020, http://arxiv.org/abs/2010.14925
[32] K. He, X. Zhang, S. Ren, J. Sun, "Deep Residual Learning for Image Recognition," Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 770-778, DOI: 10.1109/CVPR.2016.90, 2016.
[33] M. Feurer, A. Klein, K. Eggensperger, J. T. Springenberg, M. Blum, F. Hutter, "Efficient and Robust Automated Machine Learning," Advances in Neural Information Processing Systems, pp. 2962-2970, 2015.
[34] H. Jin, Q. Song, X. Hu, "Auto-keras: An Efficient Neural Architecture Search System," Proc. ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., pp. 1946-1956, DOI: 10.1145/3292500.3330648, 2019.
[35] J. Suckling, "The mammographic image analysis society digital mammogram database," Expert. Medica, Int. Congr. Ser., vol. 1069, pp. 375-378, 1994.
[36] J. Anitha, J. Dinesh Peter, S. I. A. Pandian, "A dual stage adaptive thresholding (DuSAT) for automatic mass detection in mammograms," Computer Methods and Programs in Biomedicine, vol. 138, pp. 93-104, DOI: 10.1016/j.cmpb.2016.10.026, 2017.
[37] R. Rabidas, A. Midya, J. Chakraborty, "Neighborhood structural similarity mapping for the classification of masses in mammograms," IEEE Journal of Biomedical and Health Informatics, vol. 22 no. 3, pp. 826-834, DOI: 10.1109/JBHI.2017.2715021, 2018.
[38] Z. Jiao, X. Gao, Y. Wang, J. Li, "A parasitic metric learning net for breast mass classification based on mammography," Pattern Recognition, vol. 75, pp. 292-301, DOI: 10.1016/j.patcog.2017.07.008, 2018.
[39] M. M. Abdelsamea, M. H. Mohamed, M. Bamatraf, "Automated classification of malignant and benign breast cancer lesions using neural networks on digitized mammograms," Cancer Informatics, vol. 18, pp. 10-12, DOI: 10.1177/1176935119857570, 2019.
[40] S. Boudraa, A. Melouah, H. F. Merouani, "Improving mass discrimination in mammogram-CAD system using texture information and super-resolution reconstruction," Evolving Systems, vol. 11 no. 4, pp. 697-706, DOI: 10.1007/s12530-019-09322-4, 2020.
[41] J. Dabass, M. Hanmandlu, R. Vig, "Classification of digital mammograms using information set features and Hanman transform based classifiers," Informatics Med. Unlocked, vol. 20, pp. 100401-100408, DOI: 10.1016/j.imu.2020.100401, 2020.
[42] F. A. Spanhol, L. S. Oliveira, C. Petitjean, L. Heutte, "A dataset for breast cancer histopathological image classification," IEEE Transactions on Biomedical Engineering, vol. 63 no. 7, pp. 1455-1462, DOI: 10.1109/TBME.2015.2496264, 2016.
[43] F. A. Spanhol, L. S. Oliveira, C. Petitjean, L. Heutte, "Breast Cancer Histopathological Image Classification Using Convolutional Neural Networks," 2016 International Joint Conference on Neural Networks (IJCNN), vol. 2016, pp. 2560-2567, 2016.
[44] Z. Han, B. Wei, Y. Zheng, Y. Yin, K. Li, S. Li, "Breast cancer multi-classification from histopathological images with structured deep learning model," Scientific Reports, vol. 7 no. 1, DOI: 10.1038/s41598-017-04075-z, 2017.
[45] F. Parvin, M. A. M. Hasan, "A comparative study of different types of convolutional neural networks for breast cancer histopathological image classification," 2020 IEEE Region 10 Symposium, pp. 945-948, DOI: 10.1109/TENSYMP50017.2020.9230787, 2020.
[46] A. Kumar, A. Sharma, V. Bharti, A. K. Singh, S. K. Singh, S. Saxena, "MobiHisNet: a lightweight CNN in mobile edge computing for histopathological image classification," IEEE Internet of Things Journal, vol. 8 no. 24, pp. 17778-17789, DOI: 10.1109/jiot.2021.3119520, 2021.
Copyright © 2022 Qiang Fu and Hongbin Dong. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Spiking neural networks (SNNs) use event-driven signals to encode physical information for neural computation. An SNN takes the spiking neuron as its basic unit and models the process by which nerve cells receive stimuli and fire spikes, which makes it more biologically plausible. Although the SNN shares more characteristics with biological neurons, it is rarely used for medical image recognition because of its poor performance. In this paper, a reservoir spiking neural network is used for breast cancer image recognition. Because lesion features are difficult to extract from medical images, a salient feature extraction method is used for image recognition. The salient feature extraction network is composed of spiking convolution layers, which can effectively extract the features of lesions. Two temporal encoding schemes, linear time encoding and entropy-based time encoding, are used to encode the input patterns. The readout neurons are trained with the ReSuMe algorithm, and the Fruit Fly Optimization Algorithm (FOA) is employed to optimize the network architecture and further improve the performance of the reservoir SNN. Datasets from three imaging modalities are used to verify the effectiveness of the proposed method. With saliency feature extraction, entropy-based time encoding, and network optimization, the method achieves an accuracy of 97.44% on the BreastMNIST database, a classification accuracy of 98.27% on the mini-MIAS database, and an overall accuracy of 95.83% on the BreaKHis database.