This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

1. Introduction

Breast cancer has the highest incidence rate among female malignancies [1]. Detecting it early, standardizing diagnosis, and reducing mortality are major challenges facing the medical community worldwide. Medical image analysis is an important means of detecting breast cancer lesions early. It is also recognized as the most effective and reliable medical examination method, and it can reveal clues of early breast cancer lesions. Combined with early prevention and control measures, it can significantly reduce the cancer rate and mortality of the screened population. Current medical technology offers no effective means of preventing breast cancer, so early detection is an important way to combat breast tumors. With the rapid development of computer technology, computer-based digital diagnosis systems have become the development trend of medical technology, and computer-aided diagnosis of breast tumors has become an inevitable trend.

In recent years, many deep learning detection methods based on breast images have been proposed [2–4]. However, training deep neural networks (DNNs) requires large numbers of samples and annotations. Moreover, since DNNs contain multiple hidden layers, training them consumes considerable computing resources and time. Medical images are not easy to acquire, and senior doctors are needed to label the lesions. Compared with a DNN, a spiking neural network (SNN) is a more accessible tool. An SNN simulates the information processing mechanism of the brain and its nerve cells, taking spiking neurons as the basic units to realize pattern recognition tasks [5]. An SNN represents information in spike patterns, and each spiking neuron exhibits rich dynamic behavior. In addition to the spread of information in the spatial domain, the current state is also affected by past states in the time domain. Therefore, an SNN usually carries richer temporal information. Due to its biological characteristics, an SNN has a lower computational cost in theory. However, the accuracy of an SNN is lower than that of an artificial neural network (ANN), which relies on spatial propagation and continuous activation functions. In addition, SNNs are rarely used in the field of medical image recognition. Therefore, this paper focuses on SNNs for breast cancer image recognition.

In this paper, an SNN is employed to recognize breast cancer on datasets of three modalities. The network consists of three parts (i.e., input layer, reservoir layer, and readout layer). In the input layer, the image is encoded into spike times, and the saliency features of the input image are extracted to better identify the lesion information. To obtain the optimal architecture, an evolutionary algorithm, i.e., the Fruit Fly Optimization Algorithm (FOA), is employed to optimize the architecture of the SNN.

The main contributions of this work are as follows.

(1) Two time encoding schemes (i.e., linear time encoding and entropy-based time encoding) are proposed to encode the input information, and they are compared and analyzed in the experimental part. Linear time encoding maps the pixels of the input image to spike times linearly. The entropy-based encoding scheme calculates the statistical form of image features. It includes not only the image-level aggregation feature but also the spatial feature of the image distribution.

(2) A method of saliency feature extraction is proposed to detect the lesion features of breast images so that the network can better detect the images containing lesions. This method uses a spike convolution network to extract input features. The heat map of the image is reconstructed from the output feature maps, and the mask is then calculated to obtain the salient region of the image.

(3) To improve the SNN performance, an evolutionary computing method, namely, the Fruit Fly Optimization Algorithm (FOA) [6], is employed to optimize the SNN architecture. The SNN optimized via FOA can improve the classification accuracy and reduce the number of synaptic connections in the reservoir. The structure of the network in our work is determined by the neurons and the connections between neurons in the reservoir layer, and the structure largely determines the performance of the network. The hyperparameters that determine the structure are usually empirical values, and it is difficult to select appropriate values manually. Therefore, the evolutionary algorithm is used to obtain the optimal structure in this work.

The rest of the paper is organized as follows. The related works are provided in Section 2. Section 3 proposes two methods of spike time encoding, the saliency feature extraction method, and the spiking reservoir neural network. Section 4 presents a manner of SNN architecture optimization using FOA. The experimental results are provided in Section 5, which demonstrates the performance of the proposed methods under three breast cancer recognition tasks. Section 6 concludes this paper.

2. Related Works

Interventional therapy of medical imaging has significantly improved the level of early diagnosis of breast cancer. With the application of artificial intelligence in the field of healthcare, researchers use image processing and computer vision technology to design effective intelligent computer-aided detection and diagnosis systems.

Deep learning has made great progress in the field of pattern recognition. Various DNN-based feature extraction architectures have been proposed for breast cancer detection and classification [7]. A deep convolutional neural network (CNN) framework is proposed in [8]. Features of input images are extracted by three pretrained CNN architectures and fed into a fully connected layer, using average pooling, for the recognition of malignant and benign tumors. In [9], whole slide images of breast biopsies are classified into five categories, and a saliency detector that uses a pipeline of four fully convolutional networks is proposed. The network fuses saliency and classification maps for the final categorization. An automated multiscale end-to-end DNN framework is proposed for mammogram classification in [10]. It only requires mammogram images with annotations. The model generates feature maps at three scales, which lets the classifier combine global information with local lesions for classification. To detect breast cancer automatically, software is developed in [11], together with an algorithm that extracts features based on biodata, image analysis, and image statistics. These features are used to classify the images as normal or suspected by using CNNs optimized by the Bayes algorithm. A hybrid deep CNN and RNN is proposed in [12] to classify breast cancer histopathological images.

Unlike for deep CNNs, limited work has been done on SNNs in the field of breast cancer detection and diagnosis. In [13], the discontinuous spike times are regarded as noise and the membrane potential as a continuous signal, so that an SNN can be trained by a backpropagation algorithm based on gradient descent. In [14], deep SNNs are trained in two stages: first, the convolution kernels are pretrained layer by layer through unsupervised learning, and then the synaptic weights are fine-tuned through spike-based supervised gradient descent backpropagation. In [15], an approximate derivative method is proposed to account for the leaky behavior of LIF neurons, which allows deep convolutional SNNs to be trained directly with spike-based backpropagation. In [5], an ensemble SNN is proposed for histopathological images and used for an eight-class task covering four types of malignant tumors and four types of benign tumors. In the current study, a saliency-based SNN with time encoding is proposed to recognize breast images on three datasets of different modalities, namely, breast ultrasound images, breast X-ray images, and histopathological images. On the breast ultrasound and breast X-ray databases, the input image is classified as malignant or nonmalignant, with a false negative rate lower than 3%. Further investigation is performed on the histopathological database for a multiclass recognition task, which helps to distinguish tumor types in detail. Experimental results show that our approach can be applied to real-world scenarios.

3. Proposed Methods

A spiking reservoir neural network architecture, two manners of spike encoding (i.e., linear time encoding and entropy-based time encoding), and a saliency-based feature detection network are presented in this section. A diagram of the proposed method is shown in Figure 1. Compared with the existing spike encoding schemes, the linear spike time encoding method proposed in this paper is more concise in the calculation. The entropy-based spike time encoding calculates the statistical form of features and includes the spatial characteristics of the gray distribution.

[figure omitted; refer to PDF]

As shown in Figure 2, the liquid state machine (LSM) consists of three parts, i.e., an input layer, a reservoir (or liquid) layer, and a memoryless readout layer. The reservoir consists of Leaky Integrate and Fire (LIF) neurons, which are connected recurrently through dynamic synaptic connections. The red solid circles in the reservoir represent the excitatory neurons, and the black circles represent the inhibitory neurons. The number of excitatory neurons is set to four times the number of inhibitory neurons, following the ratio observed in cerebral cortex circuits. The readout layer is also built from LIF neurons, but there are no interconnections among them. The reservoir transforms the low-dimensional input data into a high-dimensional internal state. The internal state acts as the input of the memoryless readout layer, which is responsible for generating the final output of the LSM. The dynamic equations of the LIF model [20] describe the excitatory and inhibitory spiking neurons: $\begin{matrix} (1) & I_{i}(t) = \sum_{p \in N_{P}} W_{p i} ∙ δ(t - t_{p}) + \sum_{j \in N_{E}} W_{j i} ∙ δ(t - t_{j}) - \sum_{k \in N_{I}} W_{k i} ∙ δ(t - t_{k}), \\ (2) & \frac{d V_{i}}{d t} = \frac{V_{rest} - V_{i}}{τ} + I_{i}(t), \end{matrix}$ where $V_{i}$ represents the membrane potential of the $i$-th neuron and $V_{rest}$ is the resting potential, towards which $V_{i}$ decays with time constant $τ$ when there is no input current. The neuron fires an output spike if the membrane potential reaches the threshold $V_{threshold}$. After that, the membrane potential is reset to $V_{rest}$ and held constant, so that spikes are suppressed during the subsequent refractory period. $I_{i}(t)$ is the instantaneous current projected to the $i$-th neuron. The input, excitatory, and inhibitory neuron populations are denoted $N_{P}$, $N_{E}$, and $N_{I}$, respectively. The instantaneous current consists of three parts, i.e., the currents from the input, excitatory, and inhibitory neurons. The term $δ(t - t_{p})$ means that the current from the input neurons integrates the sum of presynaptic spikes, $t_{p}$ is the firing time, and $W_{p i}$ is the corresponding synaptic weight. Similarly, $δ(t - t_{j})$ represents the presynaptic spikes of the excitatory neurons, and $δ(t - t_{k})$ represents the presynaptic spikes of the inhibitory neurons. $W_{j i}$ and $W_{k i}$ are the corresponding synaptic weights of the excitatory and inhibitory neurons, respectively.
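The LIF dynamics of Eq. (2) can be illustrated with a minimal forward-Euler sketch. The default values of `v_rest`, `v_threshold`, `tau`, and `refractory` follow Table 1; the function name, the step size `dt`, and the constant test current are assumptions made only for this illustration and are not taken from the paper.

```python
def simulate_lif(input_current, v_rest=0.0, v_threshold=0.2, tau=10.0,
                 dt=1.0, refractory=1.0):
    """Forward-Euler integration of dV/dt = (V_rest - V) / tau + I(t).

    input_current: list with one instantaneous current value per time step.
    Returns (spike_times_ms, membrane_trace).
    """
    v = v_rest
    refractory_left = 0.0
    spike_times, trace = [], []
    for step, current in enumerate(input_current):
        if refractory_left > 0.0:
            # Hold the membrane potential constant during the refractory period.
            refractory_left -= dt
        else:
            v += dt * ((v_rest - v) / tau + current)
            if v >= v_threshold:
                spike_times.append(step * dt)
                v = v_rest  # reset after the spike
                refractory_left = refractory
        trace.append(v)
    return spike_times, trace


spikes, trace = simulate_lif([0.05] * 12)
print(spikes)
```

With a constant current, the membrane potential climbs towards the threshold, fires, resets, and pauses for the refractory period before integrating again. In the reservoir, `input_current` would instead be the weighted sum of presynaptic spikes given by Eq. (1).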

The neurons in the reservoir act as a liquid filter $L^{M}$, which maps the vector of input spikes to a vector of continuous functions. When the input function is $u(t)$, $L^{M}$ maps $u(t)$ to the internal state $x^{M}(t)$. It can be calculated by $\begin{matrix} (3) & x^{M}(t) = (L^{M} u)(t) . \end{matrix}$

The output state of the reservoir is taken as the input of the readout layer, and the state is converted to the output $y(t)$ of the readout layer at each time $t$. It can be calculated by $\begin{matrix} (4) & y(t) = f^{M}(x^{M}(t)) . \end{matrix}$

The readout layer of an LSM usually uses an exponential filter to convert the spike output of the reservoir into a continuous signal for linear regression training. The LSM model in [21] is trained by the recursive least squares method using the liquid filtered output, which loses the accurate spike time information produced by the liquid neurons [22]. In this work, readout neurons are trained by the ReSuMe algorithm [23, 24]. ReSuMe is a biologically plausible supervised learning algorithm. It allows a neuron to adjust its synaptic weights so that it can learn any spike pattern corresponding to a given synaptic stimulus. ReSuMe guarantees convergence for a single input spike on each synapse. In the case of multiple spikes, a low learning rate can be used to ensure algorithm performance. The weight adjustment can be calculated by $\begin{matrix} (5) & \frac{d}{d t} w_{o i}(t) = [S_{d}(t) - S_{o}(t)] [a_{d} + \int_{0}^{\infty} a_{d_{i}}(s) S_{i}(t - s) d s], \end{matrix}$ where $w_{o i}$ is the $i$-th synaptic weight of output neuron $o$, $S_{d}(t)$ is the desired spike train, and $S_{o}(t)$ is the actual spike train. The learning rate is $a_{d}$, and $S_{i}(t)$ is the input spike train of the $i$-th neuron. $a_{d_{i}}$ is a kernel function; it can be calculated by $\begin{matrix} (6) & a_{d_{i}}(s) = A_{d_{i}} e^{- s / τ_{d_{i}}}, \end{matrix}$ where $A_{d_{i}}$ is the constant of the learning rate and $τ_{d_{i}}$ is a time constant.
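A discrete-time reading of the ReSuMe rule in Eqs. (5) and (6) can be sketched as follows. This is an illustrative discretization under stated assumptions, not the training code used in this work: spike trains are taken to be binary lists sampled every `dt` milliseconds, the exponential kernel is accumulated as a running trace per input, and the function name and the values of `a_d`, `amplitude`, and `tau_d` are hypothetical.

```python
import math


def resume_update(weights, input_trains, desired, actual,
                  a_d=0.01, amplitude=0.1, tau_d=5.0, dt=1.0):
    """Return updated readout weights after one pass over binary spike trains.

    weights:      one weight per input synapse
    input_trains: one binary list per input synapse (S_i)
    desired:      binary list, desired output spike train (S_d)
    actual:       binary list, actual output spike train (S_o)
    """
    decay = math.exp(-dt / tau_d)
    traces = [0.0] * len(weights)
    new_weights = list(weights)
    for t in range(len(desired)):
        # Running value of the kernel integral in Eq. (5) for each input.
        for i, train in enumerate(input_trains):
            traces[i] = traces[i] * decay + amplitude * train[t]
        error = desired[t] - actual[t]  # S_d(t) - S_o(t)
        if error != 0:
            for i in range(len(new_weights)):
                new_weights[i] += error * (a_d + traces[i])
    return new_weights


# One input spike at t = 0, a desired output spike at t = 2, no actual spike.
print(resume_update([0.5], [[1, 0, 0]], [0, 0, 1], [0, 0, 0]))
```

A missing output spike strengthens the synapses that were recently active, an extra output spike weakens them, and the weights are left unchanged when the actual train matches the desired one.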

3.2. Spike Encoding Scheme

For the human visual system (HVS), neural coding helps to explain how neural spike activity represents a visual scene and how that activity can be decoded to recover the visual information. Therefore, in an SNN, how to encode the numerical input data of the input image into input neuron spikes is the primary problem to be solved. An ANN uses matrix-vector operations to transform input into output and can operate on numerical values directly. In an SNN, numerical data cannot be operated on directly; instead, the values must be converted into spikes or spike trains. This conversion is very important. The choice of encoding scheme affects not only the accuracy of the practical application but also the speed of data processing and the energy efficiency of the system. This section proposes two input encoding schemes, i.e., linear spike time encoding and entropy-based spike time encoding, and their effectiveness in medical image recognition tasks is evaluated in Section 5.

3.2.1. Linear Spike Time Encoding

The purpose of the time encoding is to generate the corresponding spike pattern representing the input. To supply the input layer of SNN, each input representation is converted into a corresponding time pattern. The spike time encoding method uses the time of the spike to express the stimulation of the input signal. This form of time encoding is the basis for the back propagation algorithm in SNNs, such as SpikeProp [25]. Time encoding is also used with other learning algorithms, such as spike time-dependent plasticity (STDP) [26].

In this section, a linear spike time encoding scheme is presented. Using this method, the pixel information of the image is linearly mapped to the firing time of the spike. It can be calculated by $\begin{matrix} (7) & t_{i}^{f} = \frac{P_{i} - P_{\min}}{P_{\max} - P_{\min}} ∙ T_{\max} \ ms, \end{matrix}$ where $t_{i}^{f}$ is the spike time encoded by the linear method, $P_{i}$ is the value of the corresponding pixel of the input image, $P_{\min}$ is the minimum input pixel value, $P_{\max}$ is the maximum input pixel value, and $T_{\max}$ is the maximum firing time. For example, as shown in Figure 3, suppose that $T_{\max} = 10$ and the pixel range of the image is the gray value range 0-255. The pixel values 0 and 110 are then mapped to 0 ms and 4.314 ms, respectively.
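The mapping in Eq. (7) is simple enough to check directly. The short sketch below reproduces the worked example from the text; the function name is a hypothetical one chosen for this illustration.

```python
def linear_spike_times(pixels, t_max=10.0, p_min=0, p_max=255):
    """Map pixel values linearly to spike times in milliseconds (Eq. (7))."""
    span = p_max - p_min
    if span == 0:
        raise ValueError("p_max and p_min must differ")
    return [(p - p_min) / span * t_max for p in pixels]


times = linear_spike_times([0, 110, 255])
print([round(t, 3) for t in times])  # [0.0, 4.314, 10.0]
```

Darker pixels fire earlier and brighter pixels fire later, with every spike falling inside the window from 0 to $T_{\max}$.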

[figure omitted; refer to PDF]

This encoding method is a form of time-to-first-spike. According to neuroscience theory, the first spike fired by neurons carries the most information, so it is theoretically reliable.

3.2.2. Entropy-Based Spike Time Encoding

The spike time encoding method based on information entropy calculates the statistical form of image features. The average information in the image is taken as the firing time of the spike. This method includes not only the aggregation feature of the gray level but also the spatial feature of the gray distribution. The adjacent gray value is selected as the spatial feature of the gray distribution. The pixel gray value and the adjacent gray value form a feature tuple $(i, j)$, where $i$ is the pixel gray value ( $0 \leq i \leq 255$ ) and $j$ is the adjacent gray value ( $0 \leq j \leq 255$ ). It can be calculated by $\begin{matrix} (8) & P_{i j} = \frac{f(i, j)}{N^{2}} . \end{matrix}$

The above formula reflects the comprehensive characteristics of the gray value at a pixel position and the gray distribution around it, where $f(i, j)$ is the frequency of feature tuple $(i, j)$ and $N$ represents the image scale. The information entropy of an input image can be calculated by $\begin{matrix} (9) & H = - \sum_{i = 0}^{255} \sum_{j = 0}^{255} P_{i j} \log P_{i j} . \end{matrix}$

According to the information entropy, the firing time of the spike can be calculated by $\begin{matrix} (10) & t^{f} = H ∙ T_{\max} \ ms, \end{matrix}$ where $t^{f}$ is the firing time of the spike.
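The entropy-based encoding of Eqs. (8) to (10) can be sketched as below. Several details are assumptions made only for this illustration, because the text does not fix them: the adjacent gray value is taken to be the right-hand neighbor (with the last column paired with itself), the logarithm is base 2, the entropy carries the conventional negative sign so that it is non-negative, and no further normalization is applied before multiplying by $T_{\max}$. The function name is hypothetical.

```python
import math
from collections import Counter


def entropy_spike_time(image, t_max=10.0):
    """Spike time from the 2D gray-level entropy of an image (list of rows)."""
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            # Assumed neighbor: the pixel to the right, clamped at the edge.
            neighbour = image[r][min(c + 1, cols - 1)]
            counts[(image[r][c], neighbour)] += 1
    total = rows * cols  # N^2 for an N x N image
    entropy = -sum((n / total) * math.log2(n / total)
                   for n in counts.values())
    return entropy * t_max


print(entropy_spike_time([[0, 255], [0, 255]]))  # 10.0
```

A uniform image yields a single tuple with probability 1 and therefore a spike time of 0 ms, while images with more varied gray-level pairs fire later.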

3.3. Visual Saliency Detection

The visual data entering the human eyes amounts to about $10^{8}$ to $10^{9}$ bits per second [27]. For the human visual system (HVS), real-time processing of this data stream is an extremely heavy task, so the HVS only understands and processes part of the information. This selection mechanism is named visual attention. It is considered to be led by two mechanisms, i.e., stimulus-driven bottom-up and expectation-driven top-down mechanisms [28]. Bottom-up attention is mainly driven by the orientation, contrast, color, motion, and other attributes of the visual scene. Top-down attention is related to cognitive aspects such as memory, experience, and cultural background. In the field of computer vision, because the bottom-up mechanism is simpler, visual attention mainly refers to the former mechanism, which is often called visual saliency [29].

In this section, a saliency-based calculation model is proposed, as shown in Figure 4. A malignant tumor image is taken as an example. The spiking CNN is employed to extract the features. Then, the two-dimensional feature maps generated by the spike convolution layer are summed, and the mask is calculated to obtain the saliency feature map. However, for medical imaging devices, the relationship between the input energy and the brightness of the color recorded in the image file is linear. As a result, the image displayed on the device is inconsistent with the actual image captured by the camera. To correct this difference, a gamma nonlinearity is applied before the saliency calculation. The gamma nonlinearity can be calculated by $\begin{matrix} (11) & f(I) = I^{γ} . \end{matrix}$

[figure omitted; refer to PDF]

When $γ < 1$, the dynamic range of the low gray value region gets larger and its contrast is enhanced, whereas the dynamic range of the high gray value region gets smaller and its contrast decreases; the overall gray value of the image gets larger. When $γ > 1$, the dynamic range of the low gray value region gets smaller and that of the high gray value region gets larger, which reduces the contrast of the low gray value region and improves the contrast of the high gray value region; at the same time, the gray value of the whole image gets smaller.
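The effect of the exponent in Eq. (11) can be seen on a few sample intensities. The sketch assumes that gray values have been normalized to the range [0, 1] before the correction, which the text does not state explicitly; the function name is hypothetical.

```python
def gamma_correct(image, gamma):
    """Apply f(I) = I ** gamma to an image with intensities in [0, 1]."""
    return [[value ** gamma for value in row] for row in image]


dark_and_bright = [[0.25, 0.81]]
print(gamma_correct(dark_and_bright, 0.5))  # gamma < 1 brightens
print(gamma_correct(dark_and_bright, 2.0))  # gamma > 1 darkens
```

With $γ = 0.5$ the dark value 0.25 is lifted to 0.5, stretching the low gray value range, while $γ = 2$ pushes it down and leaves more of the output range to the bright values.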

The two-dimensional feature maps are obtained mainly by convolution. It can be calculated by $\begin{matrix} (12) & c_{p, q} = \sum_{m = 0}^{M - 1} \sum_{n = 0}^{N - 1} I(m, n) ∙ R(p - m, q - n), \end{matrix}$ where $c$ represents the 2D feature map generated by spike convolution, $I$ is the input image, $R$ is the receptive field, and $M$ and $N$ are the numbers of rows and columns of the image. The sum of multiple 2D feature maps can be calculated by $\begin{matrix} (13) & C = \sum_{i = 1}^{n} c_{i}, \end{matrix}$ where $C$ is the sum of the $n$ 2D feature maps, $n$ represents the number of feature maps, and $c_{i}$ is the $i$-th feature map. The mean value $\bar{C}$ of $C$ is calculated, and each element $C_{i j}$ of $C$ is then compared with $\bar{C}$ to calculate the mask map. If $C_{i j}$ is greater than $\bar{C}$, the corresponding element $m$ of the mask map is 1; otherwise, it is 0. It can be calculated by $\begin{matrix} (14) & \bar{C} = \frac{1}{M N} \sum_{j = 1}^{N} \sum_{i = 1}^{M} C_{i j}, \\ (15) & m = \begin{cases} 1, & \text{if } C_{i j} > \bar{C}, \\ 0, & \text{if } C_{i j} \leq \bar{C}, \end{cases} \end{matrix}$ where $m$ is the element value of the obtained mask map Mask. The saliency feature map is then obtained as the element-wise product of the original image and the mask. It can be calculated by $\begin{matrix} (16) & S = I ⊙ Mask . \end{matrix}$

$S$ is the final saliency feature map.
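The masking steps of Eqs. (13) to (16) can be sketched in a few lines. The sketch assumes that the feature maps have already been produced by the spike convolution layer and have the same size as the image; the function name and the toy values are hypothetical.

```python
def saliency_map(image, feature_maps):
    """Return (saliency, mask) following Eqs. (13)-(16).

    image:        2D list of gray values (M rows, N columns)
    feature_maps: list of 2D lists with the same shape as the image
    """
    rows, cols = len(image), len(image[0])
    # Eq. (13): sum the feature maps element by element.
    summed = [[sum(fm[r][c] for fm in feature_maps) for c in range(cols)]
              for r in range(rows)]
    # Eq. (14): mean of the summed map.
    mean = sum(sum(row) for row in summed) / (rows * cols)
    # Eq. (15): binary mask, 1 where the summed response exceeds the mean.
    mask = [[1 if summed[r][c] > mean else 0 for c in range(cols)]
            for r in range(rows)]
    # Eq. (16): element-wise product of image and mask.
    saliency = [[image[r][c] * mask[r][c] for c in range(cols)]
                for r in range(rows)]
    return saliency, mask


image = [[1, 2], [3, 4]]
maps = [[[0, 0], [0, 4]], [[0, 1], [0, 4]]]
print(saliency_map(image, maps))  # ([[0, 0], [0, 4]], [[0, 0], [0, 1]])
```

Only the pixels whose summed feature response exceeds the mean survive in the saliency map; the rest are set to zero.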

4. SNN Architecture Optimization

The connections of neurons in the LSM neural network include the connections between input neurons and excitatory (or inhibitory) neurons in the reservoir, the interconnections of neurons within the reservoir, and the connections between neurons in the reservoir and neurons in the readout layer. There are no interconnections between neurons within the input layer or within the readout layer. There are four types of recurrent connections in the reservoir, i.e., connections from excitatory to excitatory neurons ( $E \rightarrow E$ , $W_{E E}$ ), from excitatory to inhibitory neurons ( $E \rightarrow I$ , $W_{E I}$ ), from inhibitory to excitatory neurons ( $I \rightarrow E$ , $W_{I E}$ ), and from inhibitory to inhibitory neurons ( $I \rightarrow I$ , $W_{I I}$ ).

The performance of the LSM model in pattern recognition tasks depends not only on the strength of the connections between neurons but also on the number of neurons and the probability of synaptic connection. To design an efficient reservoir layer that performs the desired kernel functions, these parameters need to be optimized. Therefore, the FOA [6] is employed to search for the best network architecture, that is, the connection probability $p$ and the number of neurons $N$ in the reservoir, which determine the network architecture. The steps of FOA are as follows.

Algorithm 1

# Randomly initialize the fruit fly swarm location, the radius (R), and the number of optimization variables (D).
Init X_axis, Y_axis, Z_axis, R, D
# Enter the iterative optimization and repeat the steps below, judging whether the smell concentration is superior to that of the previous iteration.
for i in range(popsize):
    X(i) = X_axis + R * (random(1, D) - 1)
    Y(i) = Y_axis + R * (random(1, D) - 1)
    Z(i) = Z_axis + R * (random(1, D) - 1)
    # Estimate the distance to the origin.
    Dist(i) = sqrt(X(i)**2 + Y(i)**2 + Z(i)**2)
    # Calculate the smell concentration judgment value.
    S(i) = 1 / Dist(i)
    # Substitute the smell concentration judgment value (S) into the smell concentration judgment function.
    Smell(i) = function(S(i))
# Find the best value and its location index.
bestSmell, bestIndex = min(Smell), argmin(Smell)
X_axis = X(bestIndex)
Y_axis = Y(bestIndex)
Z_axis = Z(bestIndex)

FOA is a global optimization algorithm based on the foraging behavior of fruit flies, as shown in Figure 5. The fruit fly is superior to other species in sensory perception, especially in smell and vision. Its olfactory organs can collect all kinds of odors floating in the air and can even smell food sources 40 km away. After flying close to the food location, the fruit fly can also use its eyes to find the location where the food and its companions gather and fly in that direction. The random direction and distance of an individual fruit fly searching for food can be calculated by $\begin{matrix} (17) & x_{i} = x_{init} + \text{random value}, \\ (18) & y_{i} = y_{init} + \text{random value}, \\ (19) & z_{i} = z_{init} + \text{random value}, \end{matrix}$ where $x_{init}$, $y_{init}$, and $z_{init}$ are the initial positions of the individual in the $x$-axis, $y$-axis, and $z$-axis directions, respectively. According to the individual position, the distance and smell concentration can be calculated by $\begin{matrix} (20) & dist = \sqrt{{x_{i}}^{2} + {y_{i}}^{2} + {z_{i}}^{2}}, \\ (21) & S_{i} = \frac{1}{dist}, \end{matrix}$ where dist is the distance between the individual and the origin and $S_{i}$ is the smell concentration judgment value. The goal of the SNN is to minimize the gap between the desired output and the actual output on the training samples. It can be calculated by $\begin{matrix} (22) & E = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} (y_{o} - y_{d})^{2}}, \end{matrix}$ where $N$ is the number of samples, $y_{o}$ is the actual output of the network, and $y_{d}$ is the desired output of the network. In addition to minimizing the error on the training set, the number of neurons in the reservoir ( $n_{r}$ ) and the number of connections between neurons are expected to be as small as possible, to simplify the network structure. Therefore, the objective function is set to $\begin{matrix} (23) & fitness = \min (w_{1} E + w_{2} n_{r} + w_{3} p), \end{matrix}$ where $w_{1} + w_{2} + w_{3} = 1$. $w_{1}$, $w_{2}$, and $w_{3}$ are the weights of $E$, $n_{r}$, and $p$ in the objective function, respectively. The parameter $p$ is the connection probability between neurons in the reservoir. The larger the value of $w_{1}$, the more emphasis is placed on minimizing the training error. The smaller the value of $w_{1}$, the more emphasis is placed on simplifying the network structure, thus reducing the training time. If $w_{1}$ is too large, the algorithm will overemphasize parameter optimization and ignore structural optimization. If $w_{2}$ and $w_{3}$ are too large, the network will perform poorly. In this work, $w_{1}$ is set to 0.6, $w_{2}$ is set to 0.2, and $w_{3}$ is set to 0.2.
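The search loop of Algorithm 1 and the objective of Eq. (23) can be sketched in plain Python. This is a generic illustration under stated assumptions, not the optimizer configuration used in this work: the random step is assumed to be symmetric in [-radius, radius], a single judgment value $S$ is optimized, and the function names, default arguments, and toy objective are hypothetical. How $S$ is decoded into $n_{r}$ and $p$ is not specified in the text, so the sketch leaves that mapping to the caller-supplied `smell_fn`.

```python
import math
import random


def fitness(error, n_reservoir, p_connect, weights=(0.6, 0.2, 0.2)):
    """Weighted objective of Eq. (23); smaller values are better."""
    w1, w2, w3 = weights
    return w1 * error + w2 * n_reservoir + w3 * p_connect


def foa_minimise(smell_fn, pop_size=30, iterations=50, radius=1.0,
                 init_range=(0.0, 1.0), seed=0):
    """Minimise smell_fn(S) with a basic fruit fly search (Eqs. (17)-(21))."""
    rng = random.Random(seed)
    # Random initial swarm location.
    x_axis = rng.uniform(*init_range)
    y_axis = rng.uniform(*init_range)
    z_axis = rng.uniform(*init_range)
    best_smell, best_s = math.inf, None
    for _ in range(iterations):
        candidates = []
        for _ in range(pop_size):
            # Assumed symmetric random step around the swarm location.
            x = x_axis + radius * rng.uniform(-1.0, 1.0)
            y = y_axis + radius * rng.uniform(-1.0, 1.0)
            z = z_axis + radius * rng.uniform(-1.0, 1.0)
            dist = math.sqrt(x * x + y * y + z * z)
            if dist == 0.0:
                continue  # S = 1 / dist is undefined at the origin
            s = 1.0 / dist
            candidates.append((smell_fn(s), s, x, y, z))
        if not candidates:
            continue
        smell, s, x, y, z = min(candidates)
        # Move the swarm only when the best smell improves.
        if smell < best_smell:
            best_smell, best_s = smell, s
            x_axis, y_axis, z_axis = x, y, z
    return best_s, best_smell


best_s, best_value = foa_minimise(lambda s: (s - 0.5) ** 2)
print(best_s, best_value)
```

In an architecture search, `smell_fn` would decode $S$ into candidate values of $n_{r}$ and $p$, train the readout, and return `fitness(E, n_r, p)`; the toy quadratic above only demonstrates the search loop.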

[figure omitted; refer to PDF]

5. Experimental Results

To effectively verify the performance of the network, three different modalities of breast image datasets are selected to test the proposed network, namely, breast ultrasound images, breast X-ray images, and breast histopathological images.

5.1. Experiment Settings

The number of input neurons is set according to the image size. The image sizes of the three datasets differ, so a different number of input neurons is set for each. Other parameters in the network are set according to experience, as shown in Table 1. $V_{rest}$ is the membrane potential of neurons in the resting state. In this paper, it is set to 0 mV. $V_{threshold}$ is the threshold that determines whether a spike is fired. $w_{\max}$, $w_{\min}$, $A_{pre}$, and $A_{post}$ are used to perform STDP training. $n_{r}$ is the total number of neurons in the initial reservoir, which determines the size of the network. In the initial parameter setup of the FOA, the range of the random initial fruit fly swarm location is [0, 300], the range of the random flight direction and distance in the iterative food search is [-100, 100], the fruit fly population size is 1000, and the number of iterations is 120. The confidence intervals are calculated under the Gaussian distribution hypothesis of classification accuracy. It can be calculated by $\begin{matrix} (24) & l = \frac{2 n p + z^{2} - z \sqrt{z^{2} + 4 n p (1 - p)}}{2 (n + z^{2})}, \\ (25) & u = \frac{2 n p + z^{2} + z \sqrt{z^{2} + 4 n p (1 - p)}}{2 (n + z^{2})}, \end{matrix}$ where $l$ and $u$ are the lower and upper bounds of the confidence interval, respectively, $p$ is the classification accuracy, $n$ is the sample size, and $z$ is the critical value of the Gaussian distribution.
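The interval of Eqs. (24) and (25) has the form of the Wilson score interval, with $z$ entering squared. As a numerical check, the sketch below evaluates it for an accuracy of 85.9% with $z = 1.96$. The sample size $n = 780$ (the full image count of the breast ultrasound dataset) is an assumption, chosen because it reproduces the interval reported for ResNet-18 (28) in Table 4; the function name is hypothetical.

```python
import math


def wilson_interval(p, n, z=1.96):
    """Lower and upper confidence bounds for an accuracy p measured on n samples."""
    z2 = z * z
    centre = 2 * n * p + z2
    half_width = z * math.sqrt(z2 + 4 * n * p * (1 - p))
    denominator = 2 * (n + z2)
    return (centre - half_width) / denominator, (centre + half_width) / denominator


lower, upper = wilson_interval(0.859, 780)
print(round(lower, 4), round(upper, 4))  # 0.8328 0.8817
```

Unlike the simple normal approximation, this interval is not symmetric around $p$ and stays within [0, 1] for accuracies close to 100%.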

Table 1

The parameters of neuron model and network for experiments.

Neuron model		Network
Parameters	Value	Parameters	Value
$V_{rest}$	0 mV	$n_{r}$	100
$V_{threshold}$	0.2 mV	$w_{\max}$	1
$V_{reset}$	0 mV	$w_{\min}$	1
$τ$	10 ms	$A_{pre}$	1
$τ_{refractory}$	1 ms	$A_{post}$	-1.2
$∆ t$	2 ms	$p$	0.5

5.2. BreastMNIST Database

The dataset of breast ultrasound images [30] consists of 780 images in 3 classes, i.e., normal, benign, and malignant, as shown in Figure 5. In our work, the BreastMNIST database [31], which is based on these breast ultrasound images, is used to test the proposed SNN. The dataset uses low-resolution images and simplifies the task to binary classification: normal and benign are combined as negative, and malignant is classified as positive. The source images, with an average size of $500 \times 500$ pixels, are resized to $28 \times 28$.

The three types of images in the dataset are shown in Figure 6. Figure 6(a) is a normal image, Figure 6(b) is a benign tumor image, and Figure 6(c) is a malignant tumor image. The number of images in each category is shown in Table 2: the dataset contains 133 normal images, 437 benign images, and 210 malignant images.

[figures omitted; refer to PDF]

Table 2

The number of images in each category.

Classes of images	Number of images
Normal	133
Benign	437
Malignant	210
Total	780

The network performance comparison of several models on the BreastMNIST dataset is shown in Table 3. It shows that 74.36% training accuracy and 81.19% test accuracy can be obtained by setting the LSM with empirical parameters only. Using the linear time coding algorithm, the training accuracy of the network is 77.67%, and the test accuracy is 89.32%. The training accuracy of 86.19% and the test accuracy of 92.74% are obtained by the spike time encoding scheme based on information entropy. After employing the saliency-based feature module, the training accuracy is 85.20%, and the test accuracy is 92.74%. Further optimization of the network architecture parameters can obtain the optimal training accuracy and test accuracy, which are 88.57% and 97.44%, respectively.

Table 3

The network performance comparison of different methods on BreastMNIST.

Methods	Training accuracy	Test accuracy
Original SNN	74.36%	81.19%
SNN with linear time encoding	77.67%	89.32%
SNN with entropy-based time encoding	86.19%	92.74%
Saliency-based SNN	85.20%	92.74%
Improved SNN	88.57%	97.44%

The performance curve is shown in Figure 7. Figure 7(a) represents the performance curve and area under the curve on training data, and Figure 7(b) represents the performance curve and area under the curve on test data. As shown in the figure, the area on training data and test data are 0.90 and 0.99, respectively.

[figures omitted; refer to PDF]

In the optimized SNN, the total number of neurons in the reservoir $n_{r}$ is 300; the connection probability $p$ of neurons is 0.1. The optimization process of SNN architecture on the BreastMNIST database is shown in Figure 8.

[figures omitted; refer to PDF]

A performance comparison of different algorithms on the BreastMNIST database is shown in Table 4. ResNet-18 achieves accuracies of 85.9% and 87.8% at input resolutions of 28 and 224, respectively, and ResNet-50 achieves 85.3% and 83.3%. Auto-sklearn and AutoKeras obtain accuracies of 80.8% and 80.1%, respectively, and Google AutoML Vision obtains 86.5%. Our work achieves the best result on the BreastMNIST database, i.e., 97.4%.

Table 4

Performance comparison of several networks on the BreastMNIST database ( $z = 1.96$ ).

Methods	AUC	Accuracy	Confidence intervals
ResNet-18 (28) [32]	0.821	85.9%	(0.8328, 0.8817)
ResNet-18 (224) [32]	0.857	87.8%	(0.8534, 0.8993)
ResNet-50 (28) [32]	0.839	85.3%	(0.826, 0.8757)
ResNet-50 (224) [32]	0.818	83.3%	(0.8055, 0.8578)
Auto-sklearn [33]	0.673	80.8%	(0.7786, 0.8338)
AutoKeras [34]	0.646	80.1%	(0.7719, 0.8278)
Google AutoML Vision [31]	0.932	86.5%	(0.8396, 0.8876)
This work	0.997	97.4%	(0.9608, 0.9834)

5.3. Mini-MIAS Database

Breast X-ray imaging offers good contrast and resolution and can distinguish differences in microstructure density between tissues. It is also easy to operate, relatively inexpensive, well accepted, and diagnostically accurate. It is internationally recognized as an effective measure for early opportunistic screening and early detection of breast cancer. The Mammographic Image Analysis Society (MIAS) dataset includes 322 images from 161 patients. This work uses the mini-MIAS database, which contains images with a size of $1024 \times 1024$ pixels [35].

The three types of images in the dataset are shown in Figure 9. Figure 9(a) is the normal image, Figure 9(b) is the benign tumor image, and Figure 9(c) is the malignant tumor image. To get an effective model and balance all kinds of data, the original image is rotated at different angles to expand the data. The experimental results are shown in Table 5.
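The rotation-based expansion described above can be sketched as follows. The rotation angles are not given in this section, so the sketch assumes quarter turns (90, 180, and 270 degrees), which need no interpolation. The helper names `rotate90` and `rotation_augment` are hypothetical.

```python
def rotate90(image):
    """Rotate a 2-D image (a list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotation_augment(image, quarter_turns=(1, 2, 3)):
    """Return rotated copies of the image, one per entry in quarter_turns.

    Assumption (not from the paper): only multiples of 90 degrees are used.
    """
    copies = []
    for k in quarter_turns:
        rotated = image
        for _ in range(k):
            rotated = rotate90(rotated)
        copies.append(rotated)
    return copies


# Tiny 2 x 2 stand-in for a mammogram, for illustration only.
tiny = [[1, 2],
        [3, 4]]
augmented = rotation_augment(tiny)
```

Applying more rotations to the under-represented classes than to the majority class is one way to balance the categories. Arbitrary angles would additionally require interpolation and a policy for the image corners.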

[figures omitted; refer to PDF]

Table 5

The performance of different methods on the mini-MIAS database.

Methods	Training accuracy	Test accuracy
Original SNN	81.99%	83.54%
SNN with linear time encoding	92.85%	89.75%
SNN with entropy-based time encoding	94.72%	95.03%
Saliency-based SNN	93.17%	96.27%
Improved SNN	95.17%	98.27%

Table 5 compares the SNN performance of the different methods on the mini-MIAS database. The LSM with empirical parameters alone achieves 81.99% training accuracy and 83.54% test accuracy. With the linear time encoding algorithm, the training accuracy is 92.85% and the test accuracy is 89.75%. The spike time encoding scheme based on information entropy yields 94.72% training accuracy and 95.03% test accuracy. After the saliency-based feature module is added, the training accuracy is 93.17% and the test accuracy is 96.27%. Further optimization of the network architecture parameters gives the best training and test accuracies, 95.17% and 98.27%, respectively. The performance curves are shown in Figure 10. Figure 10(a) shows the performance curve and the area under the curve on the training data, and Figure 10(b) shows them on the test data. The areas under the curve on the training and test data are 0.98 and 0.99, respectively.

[figures omitted; refer to PDF]

In the optimized SNN, the total number of neurons in the reservoir $n_{r}$ is 276, and the connection probability $p$ of neurons is 0.1. The optimization process of the SNN architecture on the mini-MIAS database is shown in Figure 11.

[figures omitted; refer to PDF]

Table 6 compares the performance of different algorithms on the mini-MIAS database. Adaptive thresholding provides 93% accuracy in [36]. An accuracy of 94.57% is achieved in [37] using Fisher linear discriminant analysis on features of neighborhood structural similarity. An accuracy of 96.7% is obtained in [38] using deep distance metric learning. Texture features with a neural network classifier provide 95.2% accuracy [39]. A superresolution reconstruction module with texture features provides 96.7% accuracy in [40]. The Hanman transform classifier and the hesitancy-based Hanman transform classifier used in [41] achieve 100% accuracy. This work reaches 98.27% accuracy with only a single LSM classifier.

Table 6

Performance comparison of different algorithms on the mini-MIAS database ( $z = 1.96$ ).

Algorithms or classifiers	Accuracy	Confidence intervals
Adaptive thresholding [36]	93%	(0.8951, 0.952)
Fisher’s LDA [37]	94.57%	(0.9171, 0.9668)
Deep distance metric learning [38]	96.7%	(0.9398, 0.9808)
Neural network [39]	95.2%	(0.9208, 0.9692)
Simple logistic [40]	96.7%	(0.9398, 0.9808)
HT, HHT [41]	100%	(0.9882, 1.0)
Saliency-based SNN (this work)	98.27%	(0.96, 0.9915)
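The confidence intervals above are reported with $z = 1.96$, but the interval formula is not stated. As a hedged illustration, the sketch below computes a Wilson score interval, which is one common choice for a proportion such as accuracy. Its use here is an assumption, as are the function name `wilson_interval` and the sample size of 322, which is taken from the number of images in the MIAS dataset.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion correct / total.

    Assumption (not from the paper): the paper does not say which interval
    formula it uses; the Wilson score interval is shown as one option.
    """
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("require 0 <= correct <= total and total > 0")
    p = correct / total
    z2 = z * z
    denom = 1.0 + z2 / total
    center = (p + z2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z2 / (4 * total * total)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


# 322 of 322 correct, i.e. 100% accuracy on a set of 322 images.
low, high = wilson_interval(322, 322)
```

Under these assumptions, 322 correct out of 322 gives a lower bound of about 0.988 and an upper bound of 1.0, which is close to the interval listed for 100% accuracy in Table 6. Unlike the simple normal approximation, the Wilson interval does not collapse to zero width when the accuracy is 100%.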

5.4. BreaKHis Database

To further test the performance of the reservoir SNN model in breast cancer image recognition, the proposed SNN is applied to breast histopathological images, i.e., the BreaKHis database [42]. The dataset contains four types of histologically benign tumors (phyllodes tumor, fibroadenoma, tubular adenoma, and adenosis) and four types of malignant tumors (mucinous, lobular, papillary, and ductal carcinoma). Each type of image is shown in Figure 12.

[figures omitted; refer to PDF]

This database contains 7909 images acquired at four magnification factors, i.e., 40x, 100x, 200x, and 400x. It includes 2480 images of benign tumors and 5429 images of malignant tumors. The size of each image is $700 \times 460$ pixels. The detailed information of the database is shown in Table 7.

Table 7

The detailed information of the BreaKHis database.

Tumor types		The number of images
Benign	Adenosis	444
	Fibroadenoma	1014
	Phyllodes tumor	453
	Tubular adenoma	569
Malignant	Ductal carcinoma	3451
	Lobular carcinoma	626
	Mucinous carcinoma	792
	Papillary carcinoma	560
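The class totals and the degree of imbalance can be checked directly from the counts in Table 7. The short sketch below only sums the tabulated values; the variable names are illustrative.

```python
# Image counts per tumor type, copied from Table 7.
benign = {
    "Adenosis": 444,
    "Fibroadenoma": 1014,
    "Phyllodes tumor": 453,
    "Tubular adenoma": 569,
}
malignant = {
    "Ductal carcinoma": 3451,
    "Lobular carcinoma": 626,
    "Mucinous carcinoma": 792,
    "Papillary carcinoma": 560,
}

benign_total = sum(benign.values())
malignant_total = sum(malignant.values())
total = benign_total + malignant_total

# Share of the largest class in the whole database.
ductal_share = malignant["Ductal carcinoma"] / total
```

The sums are 2480 benign and 5429 malignant images, 7909 in total. Ductal carcinoma alone accounts for about 43.6% of all images, which makes it by far the largest class.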

Table 8 compares the SNN performance of several methods on the BreaKHis dataset. On the 40x images, the LSM with empirical parameters alone achieves 76.33% training accuracy and 78.20% test accuracy. With the linear time encoding algorithm, the training accuracy is 80.20% and the test accuracy is 84.54%. The spike time encoding scheme based on information entropy yields 86.56% training accuracy and 89.90% test accuracy. After the saliency analysis module is added, the training accuracy is 89.92% and the test accuracy is 94.46%. Further optimization of the network architecture gives the best training and test accuracies, 91.10% and 96.27%, respectively. On the 100x images, the best training accuracy is 89.60% and the test accuracy is 93.35%. On the 200x images, the best training accuracy is 90.33% and the test accuracy is 95.24%. On the 400x images, the best training accuracy is 93.06% and the test accuracy is 98.44%.

Table 8

Performance comparison of different methods on the BreaKHis database.

Methods	Training accuracy (%)				Test accuracy (%)
Methods	40x	100x	200x	400x	40x	100x	200x	400x
Original SNN	76.33	73.25	75.13	74.12	78.20	74.52	76.31	75.90
SNN with linear time encoding	80.20	80.20	81.72	81.33	84.54	83.36	82.90	84.74
SNN with entropy-based time encoding	86.56	83.15	85.77	87.29	89.90	85.33	86.83	90.70
Saliency-based SNN	89.92	88.31	89.60	90.60	94.46	91.67	93.25	94.74
Improved SNN	91.10	89.60	90.33	93.06	96.27	93.35	95.24	98.44

The results of several other studies on the BreaKHis dataset are used for performance comparison. Table 9 shows the accuracies of these approaches, and Table 10 compares the confidence intervals of the proposed method and several other methods on the BreaKHis database. The results of [43] report accuracies (86%-90%) using AlexNet. The experimental results of [42] demonstrated that the QDA classifier can achieve higher accuracies than the RF and SVM classifiers on BreaKHis. A deep convolutional neural network is used to achieve 95.7%-97.1% classification accuracies in [44]. The study in [45] investigated the performance of five CNN architectures (i.e., LeNet-5, AlexNet, VGG-16, ResNet-50, and Inception-v1) on the basis of test accuracy. Inception-v1 achieves test accuracies of 89%, 92%, 94%, and 90% at the 40x, 100x, 200x, and 400x magnification factors, respectively. The work in [46] proposes an efficient and lightweight CNN model for histopathological image classification based on MobileNet. It achieves test accuracies of 91.42%, 89.93%, 92.70%, and 85.84% at the 40x, 100x, 200x, and 400x magnification factors, respectively. The ensemble SNN [5] achieves 98.7%, 95.1%, 96.7%, and 97.5% accuracy, respectively, which is the highest or tied for the highest among the other approaches at 40x, 200x, and 400x. The result of our work is not the best at the 40x, 100x, and 200x magnifications. However, it is better than the others on the 400x images.

Table 9

Accuracy comparison of the proposed method and several other methods over the BreaKHis database.

Methods	Magnification factors
Methods	40x	100x	200x	400x
AlexNet [43]	90.0%	88.4%	84.6%	86.1%
PFTAS+SVM [42]	81.6%	79.9%	85.1%	82.3%
PFTAS+RF [42]	81.8%	81.3%	83.5%	81.0%
PFTAS+QDA [42]	83.8%	82.1%	84.2%	82.0%
CSDCNN [44]	97.1%	95.7%	96.7%	95.7%
Inception-v1 [45]	89%	92%	94%	90%
MobiHisNet [46]	91.42%	89.93%	92.70%	85.84%
The ensemble SNN [5]	98.7%	95.1%	96.7%	97.5%
This work	96.3%	93.4%	95.2%	98.4%

Table 10

Confidence interval comparison of the proposed method and several other methods over the BreaKHis database ( $z = 1.96$ ).

Methods	Confidence intervals
Methods	40x	100x	200x	400x
AlexNet [43]	(0.8857, 0.9121)	(0.8697, 0.8972)	(0.8296, 0.8611)	(0.8443, 0.8761)
PFTAS+SVM [42]	(0.7984, 0.8324)	(0.7813, 0.8158)	(0.8348, 0.8659)	(0.8049, 0.8399)
PFTAS+RF [42]	(0.8005, 0.8343)	(0.7958, 0.8293)	(0.8183, 0.8507)	(0.7912, 0.8273)
PFTAS+QDA [42]	(0.8213, 0.8536)	(0.8041, 0.8371)	(0.8254, 0.8573)	(0.8015, 0.8368)
CSDCNN [44]	(0.9626, 0.9774)	(0.9476, 0.9651)	(0.9585, 0.9741)	(0.9468, 0.9655)
Inception-v1 [45]	(0.8752, 0.9027)	(0.9073, 0.9307)	(0.9287, 0.9495)	(0.8854, 0.913)
MobiHisNet [46]	(0.9012, 0.9258)	(0.8854, 0.9113)	(0.9148, 0.9376)	(0.8414, 0.8735)
The ensemble SNN [5]	(0.981, 0.9911)	(0.9409, 0.9595)	(0.9585, 0.9741)	(0.9671, 0.9815)
This work	(0.9537, 0.9703)	(0.9227, 0.9441)	(0.9416, 0.9603)	(0.9772, 0.9889)

In the optimized SNN, the total number of neurons in the reservoir $n_{r}$ is selected as 400, and the connection probability $p$ of neurons is 0.2. The optimization process of the SNN architecture on the BreaKHis dataset is shown in Figure 13. In the BreaKHis dataset, images of ductal carcinoma account for about two-fifths of the total, and they are most numerous at the 100x and 200x magnifications. This leads to data imbalance between categories. In addition, some ductal carcinoma images are misclassified as lobular carcinoma in the experiment. For both reasons, the recognition results on the 100x and 200x images are lower than those on the 40x and 400x images.

[figures omitted; refer to PDF]

6. Conclusions and Future Work

A saliency-based SNN with breast cancer recognition capability, underpinned by the ReSuMe learning algorithm and spike time encoding scheme, has been presented in this paper. To improve the performance, the FOA was employed to optimize the architecture of the SNN. The performance of the proposed methods demonstrates that the network is effective for breast images. Experimental results show that the SNN with entropy-based time encoding can get better performance than the SNN with a linear time encoding scheme. The saliency model and optimized SNN can further improve the classification accuracy. However, on the multiclassification task of the BreaKHis database, the performance of the proposed SNN is still insufficient. Future work will focus on multiclassification learning using the SNN and further research on SNN architecture, such as multireservoir cascade structure and parallel reservoir architecture.

Acknowledgments

This work is supported by the Natural Science Foundation of Heilongjiang Province under Grant LH2020F023.

References

[1] X. Qi, L. Zhang, Y. Chen, Y. Pi, Y. Chen, Q. Lv, Z. Yi, "Automated diagnosis of breast ultrasonography images using deep neural networks," Medical Image Analysis, vol. 52, pp. 185-198, DOI: 10.1016/j.media.2018.12.006, 2019.

[2] B. Lei, S. Huang, H. Li, R. Li, C. Bian, Y. H. Chou, J. Qin, P. Zhou, X. Gong, J. Z. Cheng, "Self-co-attention neural network for anatomy segmentation in whole breast ultrasound," Medical Image Analysis, vol. 64, DOI: 10.1016/j.media.2020.101753, 2020.

[3] G. Maicas, A. P. Bradley, J. C. Nascimento, I. Reid, G. Carneiro, "Pre and post-hoc diagnosis and interpretation of malignancy from breast DCE-MRI," Medical Image Analysis, vol. 58, article 101562, DOI: 10.1016/j.media.2019.101562, 2019.

[4] C. Gallego-Ortiz, A. L. Martel, "A graph-based lesion characterization and deep embedding approach for improved computer-aided diagnosis of nonmass breast MRI lesions," Medical Image Analysis, vol. 51, pp. 116-124, DOI: 10.1016/j.media.2018.10.011, 2019.

[5] Q. Fu, H. Dong, "An ensemble unsupervised spiking neural network for objective recognition," Neurocomputing, vol. 419, pp. 47-58, DOI: 10.1016/j.neucom.2020.07.109, 2021.

[6] W. T. Pan, "A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example," Knowledge-Based Systems, vol. 26, pp. 69-74, DOI: 10.1016/j.knosys.2011.07.001, 2012.

[7] Y. Benhammou, B. Achchab, F. Herrera, S. Tabik, "BreakHis based breast cancer automatic diagnosis using deep learning: taxonomy, survey and insights," Neurocomputing, vol. 375, DOI: 10.1016/j.neucom.2019.09.044, 2020.

[8] S. U. Khan, N. Islam, Z. Jan, I. Ud Din, J. J. P. C. Rodrigues, "A novel deep learning based framework for the detection and classification of breast cancer using transfer learning," Pattern Recognition Letters, vol. 125, DOI: 10.1016/j.patrec.2019.03.022, 2019.

[9] B. Gecer, S. Aksoy, E. Mercan, L. G. Shapiro, D. L. Weaver, J. G. Elmore, "Detection and classification of cancer in whole slide breast histopathology images using deep convolutional networks," Pattern Recognition, vol. 84, pp. 345-356, DOI: 10.1016/j.patcog.2018.07.022, 2018.

[10] L. Xie, L. Zhang, T. Hu, H. Huang, Z. Yi, "Neural networks model based on an automated multi-scale method for mammogram classification," Knowledge-Based Systems, vol. 208, pp. 106465-106469, DOI: 10.1016/j.knosys.2020.106465, 2020.

[11] S. Ekici, H. Jawzal, "Breast cancer diagnosis using thermography and convolutional neural networks," Medical Hypotheses, vol. 137, DOI: 10.1016/j.mehy.2019.109542, 2020.

[12] R. Yan, F. Ren, Z. Wang, L. Wang, T. Zhang, Y. Liu, X. Rao, C. Zheng, F. Zhang, "Breast cancer histopathological image classification using a hybrid deep neural network," Methods, vol. 173, pp. 52-60, DOI: 10.1016/j.ymeth.2019.06.014, 2020.

[13] J. H. Lee, T. Delbruck, M. Pfeiffer, "Training deep spiking neural networks using backpropagation," Frontiers in Neuroscience, vol. 10 no. 508, DOI: 10.3389/fnins.2016.00508, 2016.

[14] C. Lee, P. Panda, G. Srinivasan, K. Roy, "Training deep spiking convolutional neural networks with STDP-based unsupervised pre-training followed by supervised fine-tuning," Frontiers in Neuroscience, vol. 12, DOI: 10.3389/fnins.2018.00435, 2018.

[15] C. Lee, S. S. Sarwar, P. Panda, G. Srinivasan, K. Roy, "Enabling spike-based backpropagation for training deep neural network architectures," Frontiers in Neuroscience, vol. 14, DOI: 10.3389/fnins.2020.00119, 2020.

[16] W. Maass, T. Natschlager, H. Markram, "Real-time computing without stable states: a new framework for neural computation based on perturbations," Neural Computation, vol. 14 no. 11, pp. 2531-2560, DOI: 10.1162/089976602760407955, 2002.

[17] T. Yamazaki, S. Tanaka, "The cerebellum as a liquid state machine," Neural Networks, vol. 20 no. 3, pp. 290-297, DOI: 10.1016/j.neunet.2007.04.004, 2007.

[18] N. Soures, D. Kudithipudi, "Spiking reservoir networks: brain-inspired recurrent algorithms that use random, fixed synaptic strengths," IEEE Signal Processing Magazine, vol. 36 no. 6, pp. 78-87, DOI: 10.1109/MSP.2019.2931479, 2019.

[19] P. Enel, E. Procyk, R. Quilodran, P. F. Dominey, "Reservoir computing properties of neural dynamics in prefrontal cortex," PLoS Computational Biology, vol. 12 no. 6, DOI: 10.1371/journal.pcbi.1004967, 2016.

[20] W. Ponghiran, G. Srinivasan, K. Roy, "Reinforcement learning with low-complexity liquid state machines," Frontiers in Neuroscience, vol. 13, DOI: 10.3389/fnins.2019.00883, 2019.

[21] W. Nicola, C. Clopath, "Supervised learning in spiking neural networks with FORCE training," Nature Communications, vol. 8 no. 1, DOI: 10.1038/s41467-017-01827-3, 2017.

[22] D. Florescu, D. Coca, "Learning with precise spike times: a new decoding algorithm for liquid state machines," Neural Computation, vol. 31 no. 9, pp. 1825-1852, DOI: 10.1162/neco_a_01218, 2019.

[23] F. Ponulak, A. Kasiński, "Supervised learning in spiking neural networks with ReSuMe: sequence learning, classification, and spike shifting," Neural Computation, vol. 22 no. 2, pp. 467-510, DOI: 10.1162/neco.2009.11-08-901, 2010.

[24] F. Ponulak, "Analysis of the ReSuMe Learning Process For Spiking Neural Networks," International Journal of Applied Mathematics & Computer Science, vol. 18 no. 2, pp. 117-127, DOI: 10.2478/v10006-008-0011-1, 2008.

[25] X. Wang, X. Lin, X. Dang, "Supervised learning in spiking neural networks: a review of algorithms and evaluations," Neural Networks, vol. 125, pp. 258-280, DOI: 10.1016/j.neunet.2020.02.011, 2020.

[26] Y. Hao, X. Huang, M. Dong, B. Xu, "A biologically plausible supervised learning method for spiking neural networks using the symmetric STDP rule," Neural Networks, vol. 121, pp. 387-395, DOI: 10.1016/j.neunet.2019.09.007, 2020.

[27] K. Koch, J. McLean, R. Segev, M. A. Freed, M. J. Berry, V. Balasubramanian, P. Sterling, "How much the eye tells the brain," Current Biology, vol. 16 no. 14, pp. 1428-1434, DOI: 10.1016/j.cub.2006.05.056, 2006.

[28] S. Ullman, C. Koch, "Shifts in selective visual-attention: towards the underlying neural circuitry," Human Neurobiology, vol. 4 no. 4, pp. 219-277, 1985.

[29] A. Borji, D. N. Sihite, L. Itti, "Quantitative analysis of human-model agreement in visual saliency modeling: a comparative study," IEEE Transactions on Image Processing, vol. 22 no. 1, pp. 55-69, DOI: 10.1109/TIP.2012.2210727, 2013.

[30] W. Al-Dhabyani, M. Gomaa, H. Khaled, A. Fahmy, "Dataset of breast ultrasound images," Data Br., vol. 28, DOI: 10.1016/j.dib.2019.104863, 2020.

[31] J. Yang, R. Shi, B. Ni, "MedMNIST classification decathlon: a lightweight AutoML benchmark for medical image analysis," 2020, http://arxiv.org/abs/2010.14925

[32] K. He, X. Zhang, S. Ren, J. Sun, "Deep Residual Learning for Image Recognition," Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 770-778, DOI: 10.1109/CVPR.2016.90, 2016.

[33] M. Feurer, A. Klein, K. Eggensperger, J. T. Springenberg, M. Blum, F. Hutter, "Efficient and Robust Automated Machine Learning," Advances in Neural Information Processing Systems, pp. 2962-2970, 2015.

[34] H. Jin, Q. Song, X. Hu, "Auto-keras: An Efficient Neural Architecture Search System," Proc. ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., pp. 1946-1956, DOI: 10.1145/3292500.3330648, 2019.

[35] J. Suckling, "The mammographic image analysis society digital mammogram database," Expert. Medica, Int. Congr. Ser., vol. 1069, pp. 375-378, 1994.

[36] J. Anitha, J. Dinesh Peter, S. I. A. Pandian, "A dual stage adaptive thresholding (DuSAT) for automatic mass detection in mammograms," Computer Methods and Programs in Biomedicine, vol. 138, pp. 93-104, DOI: 10.1016/j.cmpb.2016.10.026, 2017.

[37] R. Rabidas, A. Midya, J. Chakraborty, "Neighborhood structural similarity mapping for the classification of masses in mammograms," IEEE Journal of Biomedical and Health Informatics, vol. 22 no. 3, pp. 826-834, DOI: 10.1109/JBHI.2017.2715021, 2018.

[38] Z. Jiao, X. Gao, Y. Wang, J. Li, "A parasitic metric learning net for breast mass classification based on mammography," Pattern Recognition, vol. 75, pp. 292-301, DOI: 10.1016/j.patcog.2017.07.008, 2018.

[39] M. M. Abdelsamea, M. H. Mohamed, M. Bamatraf, "Automated classification of malignant and benign breast cancer lesions using neural networks on digitized mammograms," Cancer Informatics, vol. 18, pp. 10-12, DOI: 10.1177/1176935119857570, 2019.

[40] S. Boudraa, A. Melouah, H. F. Merouani, "Improving mass discrimination in mammogram-CAD system using texture information and super-resolution reconstruction," Evolving Systems, vol. 11 no. 4, pp. 697-706, DOI: 10.1007/s12530-019-09322-4, 2020.

[41] J. Dabass, M. Hanmandlu, R. Vig, "Classification of digital mammograms using information set features and Hanman transform based classifiers," Informatics Med. Unlocked, vol. 20, pp. 100401-100408, DOI: 10.1016/j.imu.2020.100401, 2020.

[42] F. A. Spanhol, L. S. Oliveira, C. Petitjean, L. Heutte, "A dataset for breast cancer histopathological image classification," IEEE Transactions on Biomedical Engineering, vol. 63 no. 7, pp. 1455-1462, DOI: 10.1109/TBME.2015.2496264, 2016.

[43] F. A. Spanhol, L. S. Oliveira, C. Petitjean, L. Heutte, "Breast Cancer Histopathological Image Classification Using Convolutional Neural Networks," 2016 International Joint Conference on Neural Networks (IJCNN), vol. 2016, pp. 2560-2567.

[44] Z. Han, B. Wei, Y. Zheng, Y. Yin, K. Li, S. Li, "Breast cancer multi-classification from histopathological images with structured deep learning model," Scientific Reports, vol. 7 no. 1, DOI: 10.1038/s41598-017-04075-z, 2017.

[45] F. Parvin, M. A. M. Hasan, "A comparative study of different types of convolutional neural networks for breast cancer histopathological image classification," 2020 IEEE Region 10 Symposium, pp. 945-948, DOI: 10.1109/TENSYMP50017.2020.9230787, 2020.

[46] A. Kumar, A. Sharma, V. Bharti, A. K. Singh, S. K. Singh, S. Saxena, "MobiHisNet: a lightweight CNN in mobile edge computing for histopathological image classification," IEEE Internet of Things Journal, vol. 8 no. 24, pp. 17778-17789, DOI: 10.1109/jiot.2021.3119520, 2021.

Copyright © 2022 Qiang Fu and Hongbin Dong. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Spiking neural networks (SNNs) use event-driven signals to encode physical information for neural computation. An SNN takes the spiking neuron as its basic unit and models the process by which nerve cells receive stimuli and fire spikes. Therefore, the SNN is more biologically plausible. Although the SNN has more characteristics of biological neurons, it is rarely used for medical image recognition due to its poor performance. In this paper, a reservoir spiking neural network is used for breast cancer image recognition. Because lesion features are difficult to extract from medical images, a salient feature extraction method is used in image recognition. The salient feature extraction network is composed of spiking convolution layers, which can effectively extract the features of lesions. Two temporal encoding methods, namely, linear time encoding and entropy-based time encoding, are used to encode the input patterns. Readout neurons are trained with the ReSuMe algorithm, and the Fruit Fly Optimization Algorithm (FOA) is employed to optimize the network architecture to further improve the reservoir SNN performance. Datasets from three imaging modalities are used to verify the effectiveness of the proposed method. The results show an accuracy of 97.44% on the BreastMNIST database and a classification accuracy of 98.27% on the mini-MIAS database. The overall accuracy is 95.83% on the BreaKHis database using saliency feature extraction, entropy-based time encoding, and network optimization.

Details

Title

Breast Cancer Recognition Using Saliency-Based Spiking Neural Network

Author

Fu, Qiang¹; Dong, Hongbin¹

¹ College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China

Editor

M Hassaballah

Publication year

2022

Publication date

2022

Publisher

John Wiley & Sons, Inc.

e-ISSN

15308677

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1155/2022/8369368

ProQuest document ID

2646636259
