This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

1. Introduction

Cognitive radio (CR) has been proposed to address the issue of spectrum scarcity resulting from inefficient utilization of spectrum resources [1, 2]. A CR user has unlicensed access to the spectrum under the constraint that primary user (PU) communication is not affected. To ensure this, the spectrum is continuously monitored for PU activity. Spectrum sensing can also be used to detect spectral holes and enable CR users to transmit opportunistically. The performance gain of a CR system is further improved by cooperative spectrum sensing (CSS), where multiple CR users cooperate to detect spectral holes.

While matched filtering outperforms other spectrum sensing techniques such as cyclostationary detection and energy detection, its complexity makes it impractical for most systems. Energy detection is the simplest technique and therefore suits the limited resources (e.g., energy and computational power) of most CR users. Common spectrum sensing problems such as multipath fading and shadowing can be overcome by exploiting spatial diversity using CSS, thereby ensuring that PU constraints are met [3]. In CSS, individual CR users share their data with a fusion center (FC) that combines local reports to make a global decision. CR users can report the actual amount of received energy, i.e., the energy is reported without first quantizing it into different levels and reporting the quantized level, which could be represented by fewer bits than the actual amount of energy received. This is called soft-decision combination and results in optimal detection performance but theoretically requires infinite bandwidth [4]. Alternatively, CR users can make a hard decision based on the received energy and report a single bit representing either the presence or absence of the PU to the FC [5]. Hard reporting saves bandwidth but produces inferior results compared to soft reporting. Linear soft combination has nearly the same performance as likelihood ratio tests [6].

To balance performance and bandwidth efficiency, a combination of soft and hard decisions can be used in which the energy range is quantized, as in [4, 7]. In [4], the authors used a so-called softened hard combination scheme, in which the observed energy is quantized into four regions using two bits, where each region is represented by a label. This achieves an acceptable trade-off between the improved performance resulting from soft reporting and the information loss during the quantization process. The FC uses a decision combination rule to combine the decisions reported by CR users and make a global decision. The decisions of CR users are in quantized form; i.e., instead of reporting a one-bit decision or the actual amount of energy received to the FC, the CR users quantize the received energy into multiple levels and send multiple bits denoting the quantization zone. This is called quantized-hard decision combination [8].

Along with other factors such as the number of participating CR users, the sensing environment, and sensing capabilities of CR users, the FC’s global decision combination rule determines the detection performance of the CR system. For instance, an OR rule results in good protection for the PU but has the lowest spectral hole exploration capability [9], whereas an AND rule improves spectral hole detection but lowers the PU protection capacity. Likewise, poor sensing and/or malicious CR users reduce the performance of the k-out-N decision combination rule. More sophisticated combination rules such as Bayesian analysis and the sequential probability ratio test (SPRT) have better PU protection and spectral hole exploration but require prior information which may not always be available in a given CR environment [10].

The notion of learning from the environment is embedded in the concept of cognitive radios. CR users are meant to monitor the environment and adapt their operating characteristics (operating frequency, transmitting power, etc.) to the changing conditions. To enable CR users to learn from the environment, several authors have considered machine learning algorithms [11–16]. Machine learning in spectrum sensing becomes a task of extracting a feature vector from a pattern and classifying it into a hypothesis class corresponding either to the absence or presence of PU activity. Fading and shadowing can make estimating the channel condition difficult, and hence spectrum sensing cannot reliably determine the PU status based on the current sensing slot only [17]. However, machine learning-based spectrum sensing is capable of implicitly learning the surrounding environment. Another advantage of machine learning-based spectrum sensing is that it can reliably detect PU activity without requiring any prior knowledge of the environment.

Machine learning algorithms are classified into two types: supervised and unsupervised. K-nearest neighbor (KNN) is a supervised machine learning algorithm. In KNN, training instances (spectrum sensing feature vectors) are used to form K neighborhood classes. A test instance is then classified into one of the K neighborhood classes based on majority voting. The voting is based on statistical information gained from finding the distance between the test instance and the training instances. The distance should be calculated accurately so as to truly reflect the class to which the test instance belongs [18]. KNN is the simplest of machine learning algorithms, suitable for the low-complexity requirements of CR users. KNN is also the most stable machine learning algorithm [19].

The authors in [20–22] considered KNN for spectrum sensing. In [20] the authors considered binary hypothesis testing and proposed to optimize the distance between the two classes. The drawback of their scheme is that they considered soft-decision combination and used one-time spectrum sensing, which cannot be checked against ground reality. In [21], KNN is used in the conventional way as a counting mechanism to fill the gaps in building a TV white space database. The use of KNN is limited to reconstruction of the missing spectrum sensing points, and thus the full capacity of KNN as a classifier is not exploited. In [22], the authors found a global energy detection threshold for different conventional rules of decision combination. These rules are used in conjunction with different classification schemes to classify a test instance which takes the signal strength as a feature vector. The authors in [22] also used KNN as a counting mechanism and, moreover, the global decision combination rule does not take into consideration the weight of individual CR users and their performance history.

The authors in [23] used multiple antennas for centralized spectrum sensing, while in [24] a scheme based on multiple energy detectors and adaptive multiple thresholds for cooperative spectrum sensing was presented. For regional area networks, some improved energy detectors were presented in [25, 26]. The authors in [25] proposed a two-stage energy detector in which the decisions of both detectors are fused at a decision device, while in [26] multiple antennas were used for spectrum sensing in regional area networks. In [27] both a fixed energy detector and an adaptive double threshold were used for cooperative spectrum sensing. In [28] a multiple-antenna energy detector utilizing an adaptive double threshold for spectrum sensing was proposed, while in [29] a comparison between the cyclostationary detection technique and an adaptive double threshold based energy detection scheme was carried out.

In this paper, we propose a machine learning-based reliable spectrum sensing algorithm in which the FC uses a weight-based decision combination rule. In the training phase, CR users perform spectrum sensing, and based on an acknowledgment signal (ACK) and the global decision, the sensing report is assigned to a sensing class. The sensing class corresponds to the behavior of a CR user in a changing environment, where the change is due to the changing activity of the PU. These sensing classes reliably reflect the activity of the PU and the CR user’s behavior in response to it. After enough information is gathered about the surrounding environment, the classification phase begins. More specifically, in the training phase each CR user forms a local decision in quantized-hard form. The local decisions of the CR users are sent to the FC, and the FC makes a global decision. The CR users stay silent or transmit according to the global decision. If the CR users transmit and an ACK is received in the next time slot, then the transmission was successful. Sensing classes are formed based on the global decision and the status of the ACK signal. The training phase is over when enough training data for the sensing classes has been gathered. In the classification phase, the KNN algorithm is used, where the sensing classes obtained in the training phase are treated as neighbors for the test instance, which is the current sensing report. The Smith-Waterman algorithm (SWA) is used to accurately find the distance between the current sensing report and the neighboring classes. Based on quantitative variables such as the conditional probability and posterior probability, which are calculated through KNN, the current sensing report is classified into one of the sensing classes, corresponding to either the absence or presence of the PU. The local decision is then reported to the FC, where the local decisions of all CR users are combined to make a global decision, taking into consideration the reliability of each CR user.

The proposed scheme uses quantized information, as opposed to the soft-decision combination scheme proposed in [30]. The spectrum is sensed multiple times in a sensing slot, which adds temporal diversity to the spectrum sensing process and makes the proposed scheme more reliable, since the wireless channel changes rapidly. The scheme proposed in [30] was based on one-time spectrum sensing, whereas we add a verification mechanism for the case in which the spectrum sensing decision is the absence of PU activity. The classification problem in the proposed scheme is a multilabel one in which the current spectrum sensing report is classified into one of eight classes. Each of these eight classes belongs to one of the two hypotheses, but dividing the binary hypothesis into subclasses lets the proposed scheme analyze the PU activity more accurately. In addition, the scheme proposed in [30] used KNN in the traditional way as a counting mechanism. In contrast, the proposed scheme uses the posterior probability to find the nearest neighbor and utilizes KNN to calculate the conditional and prior probabilities.

In [31], KNN was simply used for data recovery in a white space database as a mechanism for majority voting. The classification problem in [31] is also a binary one, and KNN decides a label based on the majority label of the neighboring data points. The proposed spectrum sensing scheme differs from that of [31] in that quantized energy levels are used to train the classifier, and the stored sensing reports are then used to find the class label of the current sensing report by finding the distance between them. Instead of majority voting, we use an efficient distance measuring algorithm, the Smith-Waterman algorithm (SWA), to calculate the similarity between the current sensing report and the training reports.

Mikaeil et al. proposed different classification schemes which work on thresholds calculated through different fusion rules [22]. In this paper, we utilize a different fusion rule at the fusion center, which takes into consideration the weight of different CR users before making a global decision. The focus of [22] is to find the thresholds for different schemes, and KNN is used as one of the classification schemes. In the proposed scheme, on the other hand, the fusion rule utilizes the distance between the test report and the training reports intrinsically at the CR user level, and at the FC the historical accuracy of each CR user is also taken into consideration. In this way, the global fusion rule at the FC makes use of the training reports as well as the performance history of each CR user. Therefore, the global fusion rule is more robust as well as more reliable.

Spectrum sensing has been incorporated into satellite communications, 5G, and MIMO schemes. The growing need for spectrum has made spectrum sensing crucial for next-generation communication technologies. The authors in [32] employed CR for future broadband satellite-terrestrial communications under the broader framework toward 5G, while the authors in [33] employed joint spectrum sensing and channel selection optimization for satellite communication based on cognitive radios. The concept of the PU as used in CR networks was applied to satellite cluster communications by J. Min et al. [34], where the presence of the primary satellite system was detected using spectrum sensing. The spectral efficiency of MIMO systems with hybrid architectures was investigated in [35] by studying the optimal number of users in the system, while in [36] the upper limits of downlink spectral efficiency and energy efficiency were investigated for massive MIMO systems with hybrid architecture.

The rest of the paper is organized as follows: Section 2 describes the system model; Section 3 describes the spectrum sensing scheme, which consists of the KNN algorithm, SWA, the training phase, and the classification phase, in detail; Section 4 describes the cooperative spectrum sensing and the global decision combination in detail; Section 5 discusses the results; and Section 6 concludes the paper.

2. System Model

This section discusses the energy detection method and the quantization method employed, and explains how the sensing report used in both the training phase and the classification phase of the spectrum sensing scheme is formed. We consider $N$ CR users that continuously sense the spectrum and report their local decisions to the FC through a dedicated control channel [4]. A CR user transmits information if a spectral hole exists, as determined by the FC. CR users can either transmit or receive at a given time; i.e., they operate in half-duplex mode. CR users are assumed to be close to the PU and outside the range of other PUs. The system model is presented in Figure 1.

[figure omitted; refer to PDF]

CSS introduces spatial diversity, while temporal diversity is introduced by dividing the sensing slot into minislots. We consider a slotted time-frame structure, where the first slot is used for spectrum sensing and the second slot is used for transmitting CR user data. The authors in [37] investigated the optimal sensing slot duration; in this work a suboptimal sensing slot duration is considered. The sensing result may change when fading and shadowing are present. Temporal diversity counters these effects by sensing the spectrum $n$ times in the sensing slot: the sensing slot is divided into $n$ minislots, and in each minislot the spectrum is sensed independently. The sensing performance can be improved by increasing the number of minislots, and hence the sensing duration, but this shortens the transmission slot. The authors in [37] investigated the optimal number of minislots for the sensing-throughput trade-off in CRNs. According to [37], diversity reception is introduced in the sensing process by sensing the channel independently in minislots within the same sensing phase. In our proposed scheme the results of these minislots are combined to form a sensing report, which is later used in the classification phase as described in Section 3.2. Sensing reports were previously used in [8] to calculate the trust of each CR user in a CRN under attack by malicious users. In this work the sensing reports are used to train the classifier and later to classify the current sensing report. A half-duplex CR user system is considered in which the CR users remain silent during the sensing slot. If it is decided in the sensing slot that the PU is absent, the CR users transmit in the transmission slot; otherwise they remain silent. When one time frame, which consists of a sensing slot and a transmission slot, is over, the CR users sense the spectrum again.
Energy detection is used in each minislot. The energy received in the $w$-th sensing slot by the $i$-th CR user at the $k$-th minislot, $X_{k, w, i}$, can be expressed as $$X_{k, w, i} = \sum_{j = 1}^{N_{0}} {|e_{k, w, i} (j)|}^{2} \tag{1}$$ where $k \in \{1, 2, 3, \dots, n\}$, $n$ is the total number of minislots, $e_{k, w, i} (j)$ is the j-th sample received at the k-th minislot of the $w$-th sensing slot by the i-th CR user, and N₀ is the total number of samples, given by $N_{0} = 2 T B$. T and B are the detection time and the signal bandwidth in Hertz, respectively. The number of samples received in a particular minislot therefore depends on the bandwidth of the sensed spectrum and the sensing time. The received signal $e_{k, w, i} (j)$ in the absence of the PU ($H_{0}$) and in the presence of the PU ($H_{1}$) is given as follows: $$e_{k, w, i} (j) = \begin{cases} v_{k, w, i} (j); & H_{0} \\ s_{k, w, i} (j) + v_{k, w, i} (j); & H_{1} \end{cases} \tag{2}$$ where $v_{k, w, i} (j)$ is zero-mean additive white Gaussian noise (AWGN) and $s_{k, w, i} (j)$ is the j-th sample of the PU signal received at the k-th minislot of the $w$-th sensing slot by the i-th CR user.
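
The energy statistic in (1) and the two hypotheses in (2) can be illustrated with a short sketch. It is not the authors' simulation code: the function names, the unit noise variance, the real-valued Gaussian PU signal, and the time-bandwidth product of 500 are all assumptions chosen only for illustration.

```python
import random

def minislot_energy(samples):
    """Energy statistic of Eq. (1): sum of squared sample magnitudes."""
    return sum(abs(e) ** 2 for e in samples)

def draw_samples(n0, pu_present, snr, rng):
    """Draw N0 received samples according to Eq. (2).

    Assumptions (not from the paper): real-valued samples, unit-variance
    Gaussian noise, and a Gaussian PU signal whose variance equals the
    linear SNR.
    """
    noise_std = 1.0
    signal_std = snr ** 0.5
    samples = []
    for _ in range(n0):
        v = rng.gauss(0.0, noise_std)                        # noise term v(j)
        s = rng.gauss(0.0, signal_std) if pu_present else 0.0  # PU term s(j)
        samples.append(s + v)
    return samples

time_bandwidth = 500          # hypothetical product T * B
n0 = 2 * time_bandwidth       # N0 = 2TB samples per minislot
rng = random.Random(1)        # fixed seed so the run is repeatable

x_h0 = minislot_energy(draw_samples(n0, False, 1.0, rng))  # PU absent
x_h1 = minislot_energy(draw_samples(n0, True, 1.0, rng))   # PU present
print(f"energy under H0: {x_h0:.1f}, energy under H1: {x_h1:.1f}")
```

With these assumed parameters the energy under $H_{1}$ is clearly larger than under $H_{0}$, which is the separation that the quantization thresholds introduced later rely on.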

It was shown in [37] that if the primary signal is absent, the probability density function of the energy of the received signal at the i-th CR user ($X_{k, w, i}$) follows a central chi-square distribution with mean $μ_{0}$ and variance $σ_{0}^{2}$; otherwise it follows a noncentral chi-square distribution with mean $μ_{1}$ and variance $σ_{1}^{2}$, which can be estimated as $$μ_{0} = N_{0}, \quad σ_{0}^{2} = 2 N_{0}, \quad μ_{1} = N_{0} (γ_{i} + 1), \quad σ_{1}^{2} = 2 N_{0} (2 γ_{i} + 1) \tag{3}$$ where $γ_{i}$ is the signal to noise ratio (SNR) of the received signal at the i-th CR user.

When the total number of samples, N₀, is large, the energy signal received, $X_{k, w, i}$ , under both hypotheses $H_{0}$ and $H_{1}$ can be approximated by a Gaussian random variable. In our scheme, the energy signal at each minislot is quantized into discrete zones. Multiple bits representing the corresponding zone are transmitted to the FC, rather than transmitting a continuous energy variable (a soft decision) or a single bit (a hard decision). An M-level quantizer of an input variable is represented by a set of quantization levels and a set of quantization thresholds. These quantization thresholds determine the accuracy to which the quantization levels represent the actual received signal.
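
As a rough numerical illustration of the Gaussian approximation, the moments in (3) can be used to estimate how often the minislot energy exceeds a given threshold under each hypothesis. The sketch below is only an example: the helper names, the SNR of -10 dB, and the threshold value are hypothetical and are not taken from the paper.

```python
import math

def energy_moments(n0, snr):
    """Mean and variance of the minislot energy under H0 and H1, per Eq. (3).

    n0  : number of samples per minislot (N0 = 2TB)
    snr : linear (not dB) SNR at the CR user
    """
    h0 = (n0, 2 * n0)
    h1 = (n0 * (snr + 1), 2 * n0 * (2 * snr + 1))
    return h0, h1

def exceed_probability(threshold, mean, variance):
    """P(X > threshold) for a Gaussian X with the given mean and variance."""
    z = (threshold - mean) / math.sqrt(variance)
    return 0.5 * math.erfc(z / math.sqrt(2))   # Gaussian tail function Q(z)

n0 = 1000                      # hypothetical number of samples
snr = 10 ** (-10 / 10)         # hypothetical SNR of -10 dB in linear scale
(mu0, var0), (mu1, var1) = energy_moments(n0, snr)

threshold = 1050.0             # hypothetical quantization threshold
p_exceed_h0 = exceed_probability(threshold, mu0, var0)
p_exceed_h1 = exceed_probability(threshold, mu1, var1)
print(f"P(X > threshold | H0) = {p_exceed_h0:.3f}")
print(f"P(X > threshold | H1) = {p_exceed_h1:.3f}")
```

Moving the threshold trades one exceedance probability against the other, which is why the choice of quantization thresholds affects how well the zones represent the received signal.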

In this paper, a slotted-frame structure is considered in which a frame is one unit of spectrum access. The first slot in each frame, called the sensing slot, is used to sense the spectrum and decide whether the PU is active. If it is decided in the sensing slot that the PU is absent, the CR users transmit in the transmission slot. Otherwise, they remain silent for the duration of the transmission slot. When the transmission slot is over, the CR users start sensing the spectrum again.

Because the wireless channel changes rapidly, the spectrum is sensed multiple times instead of only once so as to account for the changing behavior of the channel. To do this, the sensing slot is divided into minislots. In each minislot, the spectrum is sensed independently, and based on the results a sensing report is formed. The sensing report is formed from the quantized decision of each minislot, which is expressed by (4), and is used later in the classification phase. For spectrum sensing, energy detection is utilized: samples of received energy are summed and compared with a threshold, and based on the comparison result it is decided whether the PU is present or absent.

In this work the number of quantization levels is four, i.e., $M = 4$. These levels or quantization zones are represented by Z₁, Z₂, Z₃, and Z₄. Zones Z₁ and Z₂ represent low energy or the absence of the PU (H₀), while Z₃ and Z₄ represent high energy or the presence of the PU (H₁). The quantized energy zones are given as $$u_{k, w, i} = \begin{cases} Z_{1}; & X_{k, w, i} \leq λ_{Z_{1}} \quad (H_{0}) \\ Z_{2}; & λ_{Z_{1}} < X_{k, w, i} \leq λ_{Z_{2}} \quad (H_{0}) \\ Z_{3}; & λ_{Z_{2}} < X_{k, w, i} \leq λ_{Z_{3}} \quad (H_{1}) \\ Z_{4}; & X_{k, w, i} > λ_{Z_{3}} \quad (H_{1}) \end{cases} \tag{4}$$ where $u_{k, w, i}$ represents the quantized energy for the k-th minislot of the $w$-th sensing slot of the i-th CR user and $λ_{Z_{1}}$, $λ_{Z_{2}}$, and $λ_{Z_{3}}$ are the thresholds that separate the quantization zones. The set of quantization zones is $q = \{Z_{1}, Z_{2}, Z_{3}, Z_{4}\}$ and the set of thresholds is $λ = \{λ_{Z_{1}}, λ_{Z_{2}}, λ_{Z_{3}}\}$. Equation (4) signifies that, in the case of H₀, the energy received by the i-th CR user at the k-th minislot of the $w$-th sensing slot ($X_{k, w, i}$) is quantized into either Z₁ or Z₂, and in the case of H₁, $X_{k, w, i}$ is quantized into either Z₃ or Z₄.
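
The four-level quantizer in (4) and the formation of a sensing report from the minislot energies can be sketched as follows. This is a minimal illustration, not the authors' implementation: the zone labels are plain strings, and the threshold values and minislot energies in the usage lines are hypothetical.

```python
from bisect import bisect_left

ZONES = ("Z1", "Z2", "Z3", "Z4")

def quantize(energy, thresholds):
    """Map an energy value to a quantization zone as in Eq. (4).

    thresholds is the ascending triple (lambda_Z1, lambda_Z2, lambda_Z3).
    bisect_left counts the thresholds strictly below the energy, so an
    energy equal to a threshold falls in the lower zone, matching the
    "less than or equal" conditions in Eq. (4).
    """
    return ZONES[bisect_left(thresholds, energy)]

def hypothesis(zone):
    """Z1 and Z2 indicate PU absence (H0); Z3 and Z4 indicate presence (H1)."""
    return "H0" if zone in ("Z1", "Z2") else "H1"

def sensing_report(minislot_energies, thresholds):
    """Quantize each of the n minislot energies to build the report."""
    return [quantize(x, thresholds) for x in minislot_energies]

thresholds = (1000.0, 1050.0, 1100.0)               # hypothetical values
energies = [980.0, 1020.0, 1075.0, 1150.0, 1000.0]  # hypothetical energies
report = sensing_report(energies, thresholds)
print(report, [hypothesis(z) for z in report])
```

Each symbol in the report takes one of four values, so it can be carried by two bits per minislot instead of a full-precision energy value.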

At each sensing slot, a sensing report is formed that consists of symbols belonging to q. The report for the i-th CR user at the $w$-th sensing slot is called the sensing report and is represented by $R_{i, w}$, which contains n elements belonging to q (the sensing report formation is further explained in Section 3.1). This report is used as a feature vector for the machine learning algorithm. During the training phase, this report is assigned to a sensing class based on the ACK and the global decision, which will be discussed in detail in Section 3.1. The next section describes the spectrum sensing algorithm at the CR user level.

3. Spectrum Sensing

The proposed spectrum sensing scheme has two goals: to improve PU detection capability under varying environments and to improve spectral hole detection. The first goal protects the PU’s data from harmful interference and is the foremost constraint specified by IEEE 802.22, which is the standard for accessing TV white spaces [38]. The second goal efficiently exploits spectrum access opportunities, enabling the CR user to transmit data. For the i-th CR user at the $w$-th sensing slot, channel availability is decided on the basis of the energy vector ($R_{i, w}$). To correctly map $R_{i, w}$ to PU activity, the behavior of the PU has to be learned. Thus the energy vector in our case is analogous to a feature vector in the context of machine learning.

To construct a classifier, i.e., to classify the current sensing report into channel available (H₀) or channel busy (H₁) classes, a training phase is needed. Each CR user stores a set of energy vectors of size W, where W is the length of the training phase. The training phase uses the slotted-frame structure explained in Section 2, in which each time slot has a sensing phase and a transmission phase, and there are W such time slots in the training phase. The stored vectors are the input of the classifier in the classification phase, where the current sensing report is compared with the previously stored sensing reports to decide between H₀ and H₁.

In our proposed scheme, first the CR users learn the behavior of the PU by mapping the generated quantized energy vectors, which are called sensing reports, to the accurate status of the PU. The true status of the PU is found through ACK and a reliable combination of local decisions of CR users determined by the FC. The function of the CR user in the training phase is different from its function in the classification phase. In the training phase, sensing reports are assigned to sensing classes according to the actual activity of the PU and the corresponding behavior of the CR user. In the classification phase, sensing reports are sorted into one of the sensing classes using KNN. To accurately calculate the distance between the current sensing report and existing members of the sensing classes, SWA is used. Section 3.1 describes the training phase, while Section 3.2 describes the classification phase.

3.1. Training Phase

In this phase, the operating environment is learned by gauging the behavior of the CR user in response to the changing activity of the PU. The i-th CR user generates a sensing report $R_{i, w}$, makes a local decision on the basis of the average received energy in the current sensing slot, sends the local decision to the FC, and, based on the decision of the FC and the status of the ACK, assigns the sensing report to a sensing class. This section explains these steps in detail.

Let the average energy received in the $w$-th sensing slot at the i-th CR user be represented by $Y_{i, w}$, which is given as $$Y_{i, w} = \frac{\sum_{k = 1}^{n} X_{k, w, i}}{n} \tag{5}$$ where $X_{k, w, i}$ is given by (1).

The local decision for the i-th CR user at the $w$-th sensing slot in the training phase is represented by $q_{i, w}$ and is given by [3] $$q_{i, w} = \begin{cases} Z_{1}; & Y_{i, w} \leq λ_{Z_{1}} \quad (H_{0}) \\ Z_{2}; & λ_{Z_{1}} < Y_{i, w} \leq λ_{Z_{2}} \quad (H_{0}) \\ Z_{3}; & λ_{Z_{2}} < Y_{i, w} \leq λ_{Z_{3}} \quad (H_{1}) \\ Z_{4}; & Y_{i, w} > λ_{Z_{3}} \quad (H_{1}) \end{cases} \tag{6}$$

The local decision is sent to the FC, which combines local decisions from all CR users and renders a global decision. In the training phase, the simple majority rule is used as the rule of decision combination. The symbol (quantization zone) reported by the majority of CR users determines the global decision at the FC. As can be seen from (6), the local decision during the training phase is in the quantized-hard form, so the global decision at the FC is also in the quantized-hard form. The sensing report of a CR user is as shown in Figure 2. The local sensing report was explained above in the previous section. In Figure 2, the first six minislots constitute a local sensing report. As can be seen every element of the report belongs to q. For every CR user, at every sensing slot a sensing report (the current sensing report is represented by $R_{i, w}$ ) is formed, and the local decision is taken according to (6).

[figure omitted; refer to PDF]

Next, the global decision is returned to the CR users. The CR users either transmit or remain silent based on the global decision. If the global decision is H₀, it can be verified by the ACK signal, which the CR receiver sends to the CR sender after receiving the transmission. Because an overlay cognitive radio network is considered, there is no interference to the PU communications. The ACK signal is affected by the PU communication only when the spectrum sensing result is wrong and the ground reality is in fact H₁. Based on the local decision and the global decision, there are eight possible cases for the CR user and the sensing classes according to our system model. These possible cases, called observations, are given below.

Observation 1.

The local decision ($q_{i, w}$) is Z₁ and the global decision is also Z₁. The CR user transmits its data. If ACK is received, the sensing result was correct and the actual status of the PU was H₀. Through the ACK signal, the true status of the PU is known. The sensing report corresponding to this decision ($R_{i, w}$) is stored in a class labelled R₁; if no ACK signal is received, it is stored in R₂.

Observation 2.

Both the local decision ($q_{i, w}$) and the global decision are Z₁, or the global decision is Z₂ and the local decision is Z₁. The CR user will transmit, but ACK is not received, meaning that the sensing decision was wrong and the PU was present. The CR user will store $R_{i, w}$ in a class labelled R₂. If the ACK signal is received, $R_{i, w}$ will be stored in R₁. If the local decision is Z₁ and the global decision is Z₃ or Z₄, then $R_{i, w}$ will also be stored in R₂.

Observation 3.

The local decision ($q_{i, w}$) is Z₂ and the global decision is also Z₂. The CR users follow the procedure explained in Observation 1. If ACK is received, the sensing decision is correct and the PU is not present, and $R_{i, w}$ is stored in a class labelled R₃; otherwise it is stored in R₄.

Observation 4.

The local decision is Z₂ and the global decision is Z₁ or Z₂. The CR user transmits, and if ACK is not received, $R_{i, w}$ is assigned to the class labelled R₄; otherwise it is assigned to R₃. If the local decision is Z₂ and the global decision is either Z₃ or Z₄, then $R_{i, w}$ will again be stored in the class labelled R₄.

Observation 5.

The local decision is Z₃ and the global decision is also Z₃. There will be no transmission in this case. The true status of the PU thus cannot be known. $R_{i, w}$ will be assigned to a class which is labelled as R₅. The sensing report will also be stored in class R₅ if the global decision is Z₄ and the local decision is Z₃.

Observation 6.

The local decision is Z₃ but the global decision is either Z₁ or Z₂. The CR user will transmit. If ACK is received, $R_{i, w}$ will be stored in a class labelled R₆; otherwise it will be stored in R₅.

Observation 7.

Both the local and global decisions are Z₄. There will be no transmission and $R_{i, w}$ will be stored in the class labelled as R₇. $R_{i, w}$ will also be stored in R₇ if the local decision is Z₄ and global decision is Z₃.

Observation 8.

The local decision is Z₄, but the global decision is either Z₁ or Z₂. The CR user will transmit. If ACK is received, $R_{i, w}$ will be stored in class R₈. If no ACK is received, $R_{i, w}$ will be stored in R₇.

The observations above show that the ACK signal is used when the global decision is H₀. When the global decision is H₁, the CR users do not transmit, and thus the ACK signal cannot be used to ascertain the ground reality. So, when H₁ is the global decision at the FC, the CR users store the current sensing report in one of the no-ACK classes (R₂, R₄, R₅, or R₇, according to the local decision), as the current sensing decision cannot be verified in any other way than at the risk of causing interference to the PU transmission.
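
Observations 1 to 8 can be condensed into a small lookup: for each local decision there is one class used when the CR user transmits and an ACK is received, and one class used in every other case (no ACK, or no transmission because the global decision indicates H₁). The sketch below encodes this reading of the observations; the function and table names are hypothetical.

```python
H0_ZONES = {"Z1", "Z2"}

# For each local decision: (class when transmission is confirmed by ACK,
#                           class in every other case)
CLASS_TABLE = {
    "Z1": ("R1", "R2"),   # Observations 1 and 2
    "Z2": ("R3", "R4"),   # Observations 3 and 4
    "Z3": ("R6", "R5"),   # Observations 5 and 6
    "Z4": ("R8", "R7"),   # Observations 7 and 8
}

def sensing_class(local_zone, global_zone, ack_received):
    """Return the sensing class label for one training slot.

    The CR user transmits only when the global decision is Z1 or Z2 (H0),
    so an ACK can only confirm the decision in that case.
    """
    acked_class, other_class = CLASS_TABLE[local_zone]
    transmitted = global_zone in H0_ZONES
    return acked_class if (transmitted and ack_received) else other_class

# A few cases taken directly from the observations.
print(sensing_class("Z1", "Z1", True))    # Observation 1, ACK received
print(sensing_class("Z1", "Z4", False))   # Observation 2, global decision H1
print(sensing_class("Z3", "Z2", True))    # Observation 6, ACK received
print(sensing_class("Z4", "Z3", False))   # Observation 7, no transmission
```

A table-driven form makes it easy to check that every combination of local decision, global decision, and ACK status maps to exactly one of the eight classes.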

The observations are given in decision tree form in Figure 3. As the observations do not stem from one set of decisions, there is no unified root of the decision tree. The decision trees are given in four partitions depending on the local decision. The local decision is abbreviated as LD and the global decision as GD in Figure 3. Figure 3(a) corresponds to the case that the local decision is Z₁ and Observations 1 and 2 are obtained. Figure 3(b) corresponds to the case that the local decision is Z₂ and the Observations 3 and 4 are obtained. Figures 3(c) and 3(d), respectively, correspond to the cases that the local decisions are Z₃ and Z₄ and the Observations 5, 6, 7, and 8 are obtained.

[figures omitted; refer to PDF]

These observations help the CR user learn about the surrounding environment and its own behavior in response to the environment, and they also give CR users historical data that can be used in conjunction with the current sensing behavior to predict the PU status more reliably. This process can be seen as cooperative learning, where not only the individual CR user is taken into account but the impact of other CR users is also incorporated through the global decision. This adds spatial diversity to the learning process, where a receiver with better signal to noise ratio (SNR) conditions can drive the behavior of CR users with poorer SNR conditions.

The training phase is run until the CR user is sufficiently trained in the behavior of the surrounding environment, including changing SNR conditions and changing PU behavior. Because the sensing environment changes continuously, fading can also temporarily affect the signal and thus the received energy. The training scheme takes the presence of fading into consideration and therefore stores sensing reports that may have resulted from either fading or bad sensing in their corresponding categories. As the learning is based on the ACK and on a reliable decision combination at the FC, the classes obtained through training reflect the sensing environment and the PU activity more reliably. The results of either fading or bad sensing at the CR user level appear in the above observations as the cases where the local decision differs from the global decision or where the ACK is not received.

The training data is collected locally at each CR user during the training phase. The performance of machine learning techniques depends on the size of the training phase: as the training size increases, the performance improves. As the number of CR users increases, a larger part of the area served by the PU is covered. Because our training model incorporates the global decision by acting on it, and because the ground reality is known through the ACK signal, the training phase can accurately capture the behavior of CR users in response to the PU activity. With a large number of CR users, each CR user can reflect the ground reality in its training classes through the global decision. With a large training phase, the behavior of CR users in response to the varying nature of the PU activity can also be accurately captured. In conventional machine learning techniques, the training phase can gather an adequate amount of training data to learn the environment. Knowing the exact nature of the PU activity is practically infeasible because both the wireless channel and the PU activity are random. However, as will be shown in the simulations, given a sufficiently large training phase, the system detection performance can converge even at a very low SNR.

Figure 4(a) presents the frame structure when the FC decides that the PU is present during the training phase. The CR users remain silent during the transmission phase in this case. The operations in the sensing phase proceed as follows: first the local decision is made, and then the local sensing decision is sent to the FC through a CCC. The FC combines the local sensing decisions and decides whether the PU is present or absent. If the FC decides that the PU is absent, the CR users transmit and listen for the ACK signal over the same channel on which the transmission took place. The CCC is not used for establishing links between the CR users. Rather, communication between the CR users takes place over the spectrum that is licensed to the PU and that the CR users access when the PU is absent. Figure 4(b) presents the time frame for the case when the PU is absent during the training phase. On the basis of the ACK signal, the sensing report of the sensing slot is assigned to one of the classes defined by the observations above. The frame structure of the training phase differs from that of the classification phase. In the training phase, the sensing classes are updated on the basis of the status of the ACK signal, which helps train the CR user to accurately reflect the ground reality.

[figures omitted; refer to PDF]

3.2. Classification Phase

In the previous phase, information was gathered regarding the operating environment and the CR user behavior in response to the changing environment. Learning the environment is made especially difficult by the nature of CR networks. Because of the noisy sensing environment, CR users only obtain partial observations of the environment variables. In addition, CR users must also transmit data. This results in a trade-off between sensing time and throughput: the higher the sensing time, the more accurate the sensing result and thus the more efficient the learning. Therefore, partial observability and capping the sensing time complicate the learning process. A third limitation is that a PU is considered to be autonomous. A CR user may not have any prior information about PU behavior, its operating characteristics, the RF environment, interference levels, or noise power distribution.

Our learning scheme addresses these issues. Partial observability is addressed by incorporating the behavior of other CR users into the learning process through the global decision. The ACK enables CR users to better learn the operating environment and divide the sensing observations into their respective classes more accurately. Our learning scheme requires no prior information and can efficiently map sensing performance to the changing activity of the PU, thus enabling the CR user to more reliably detect the PU.

A frame structure during the classification phase is presented in Figure 5. In the local decision making phase, the spectrum is sensed and a sensing report is created. The first six minislots in the local decision making part of Figure 5 represent the local sensing report. The second part is the classification phase discussed in Section 3.2.3. The last part of the local decision making slot is the reporting phase, where the local decision is reported to the FC, the global decision is returned and the CR user takes action accordingly. The transmission phase follows the local decision making phase.

[figure omitted; refer to PDF]

In this section, we will present in detail how the current sensing report is classified into one of the training classes. KNN, a machine learning algorithm, is used to accurately classify the current instance into one of the sensing classes and thus reliably detect PU activity. Section 3.2.1 presents the KNN algorithm.

3.2.1. K-Nearest Neighbor Algorithm

KNN is a distribution-free machine learning algorithm that classifies observations into one of several classes based on quantitative variables. KNN, being a distribution-free method, is suitable for the context of cognitive radios. KNN classifies a test instance, in our case the current sensing report as described in Section 3.1, into one of several neighboring classes by majority voting. The voting can be modified to calculate the distance between any two sensing reports. In the context of CR networks, it is highly improbable that any two sensing reports are exactly the same, so we have to measure the similarity between them.

The classification plane is divided into a number of neighbors, and the distance from the current sensing report to each of those neighbors is found. For notational simplicity, the sensing report of the current sensing slot at the i-th CR user is denoted by $x_{i}$ from here on. Let $d (x_{i}, y)$ be the distance, where $y$ represents the neighbors, i.e., the sensing classes obtained in Section 3.1, given by $y \in \{R_{1}, R_{2}, R_{3}, R_{4}, R_{5}, R_{6}, R_{7}, R_{8}\}$. The distance is calculated to each of the neighbors representing either H₀ or H₁. Based on the calculated distance, the current sensing report is classified either to H₀ or to H₁. Section 3.2.2 shows how the distance is calculated and Section 3.2.3 shows the procedure for using KNN for classification.

3.2.2. Smith-Waterman Algorithm

The Smith-Waterman algorithm (SWA) [39] is a local alignment algorithm that calculates an accurate distance between two vectors. The sensing reports in our case can differ from each other due to spatial and temporal diversity, so the voting method conventionally used in KNN, which is based on finding a match or a mismatch, is not applicable here. Instead, we focus on measuring the similarity between sensing reports, using SWA to calculate the distance between the current sensing report and the sensing classes.

SWA consists of three stages: initialization, matrix fill, and trace back. The three stages are briefly described as follows.

Initialization: one sensing report is arranged horizontally and the other vertically. The top row and the leftmost column of the score matrix are initialized to 0.

Matrix fill: let the sensing report arranged vertically be ${\hat{q}}_{m}$ and the sensing report arranged horizontally be ${\hat{q}}_{j}$. Each element $q_{p, m}$ of ${\hat{q}}_{m}$ is compared with every element $q_{l, j}$ of ${\hat{q}}_{j}$, and the score $F (p, l)$ is computed according to the matrix fill equation as follows: $\begin{matrix} (7) & F (p, l) = \max \left\{ \begin{matrix} 0 \\ F (p - 1, l - 1) + o (q_{p, m}, q_{l, j}) \\ F (p - 1, l) - t (q_{p, m}, q_{l, j}) \\ F (p, l - 1) - t (q_{p, m}, q_{l, j}) \end{matrix} \right. \end{matrix}$ where $p, l = 1, 2, \dots, n$ are indices of the elements of report ${\hat{q}}_{m}$ and report ${\hat{q}}_{j}$, respectively, $q_{p, m}$ is the p-th element of report ${\hat{q}}_{m}$, $q_{l, j}$ is the l-th element of report ${\hat{q}}_{j}$, $o (q_{p, m}, q_{l, j})$ is the similarity reward between two characters, and $t (q_{p, m}, q_{l, j})$ is the gap penalty (dissimilarity) that determines how strongly a mismatch between $q_{p, m}$ and $q_{l, j}$ is penalized. Different reward and penalty values are defined for different types of sequences and applications. Here, we use intuitive values based on experimental results. The gap penalty is determined as $\begin{matrix} (8) & t (q_{p, m}, q_{l, j}) = \begin{cases} 4, & (q_{p, m} = Z_{1}, q_{l, j} = Z_{4}) \\ 3, & (q_{p, m} = Z_{1}, q_{l, j} = Z_{3}) \ \text{or} \ (q_{p, m} = Z_{2}, q_{l, j} = Z_{4}) \\ 2, & (q_{p, m} = Z_{2}, q_{l, j} = Z_{3}) \\ 1, & (q_{p, m} = Z_{1}, q_{l, j} = Z_{2}) \ \text{or} \ (q_{p, m} = Z_{3}, q_{l, j} = Z_{4}) \\ 0, & \text{otherwise} \end{cases} \end{matrix}$ and the similarity reward is calculated as $\begin{matrix} (9) & o (q_{p, m}, q_{l, j}) = \begin{cases} 2, & (q_{p, m} = q_{l, j}) \\ 1, & (q_{p, m} = Z_{1}, q_{l, j} = Z_{2}) \ \text{or} \ (q_{p, m} = Z_{3}, q_{l, j} = Z_{4}) \\ 0, & \text{otherwise.} \end{cases} \end{matrix}$

It is important to note that $o (q_{p, m}, q_{l, j}) = o (q_{l, j}, q_{p, m})$ and $t (q_{p, m}, q_{l, j}) = t (q_{l, j}, q_{p, m})$, which means that the similarity reward and the gap penalty have the commutative property. The similarity score between two sensing reports, $F_{{\hat{q}}_{m}, {\hat{q}}_{j}}$, is obtained by taking the maximum element of the score matrix $F$. The similarity score of the m-th sensing report when compared with the j-th sensing report is given as $\begin{matrix} (10) & F_{{\hat{q}}_{m}, {\hat{q}}_{j}} = \max_{p, l = 1, 2, \dots, n} \{F (p, l)\} . \end{matrix}$

Trace back: the third stage of the SWA is called trace back and is performed to align sequences based on the scores computed in the “matrix fill” stage. Since our objective is just to find the similarity score, the trace back stage is not required in our work.
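The matrix fill stage and the similarity score of (7)-(10) can be illustrated with a short sketch. This is not the authors' implementation; it is a minimal Python rendering that assumes each sensing report is a sequence of zone labels written as the strings "Z1" to "Z4", and the names `gap_penalty`, `similarity_reward`, and `similarity_score` are hypothetical.

```python
# Gap penalty t(.,.) from (8). Keys are unordered zone pairs because t is symmetric.
GAP_PENALTY = {
    frozenset({"Z1", "Z4"}): 4,
    frozenset({"Z1", "Z3"}): 3,
    frozenset({"Z2", "Z4"}): 3,
    frozenset({"Z2", "Z3"}): 2,
    frozenset({"Z1", "Z2"}): 1,
    frozenset({"Z3", "Z4"}): 1,
}

# Similarity reward o(.,.) from (9) for adjacent, non-identical zones.
PARTIAL_REWARD = {
    frozenset({"Z1", "Z2"}): 1,
    frozenset({"Z3", "Z4"}): 1,
}


def gap_penalty(a, b):
    """Gap penalty of (8); 0 for identical zones or unlisted pairs."""
    return GAP_PENALTY.get(frozenset({a, b}), 0)


def similarity_reward(a, b):
    """Similarity reward of (9); 2 for identical zones."""
    if a == b:
        return 2
    return PARTIAL_REWARD.get(frozenset({a, b}), 0)


def similarity_score(report_m, report_j):
    """Fill the score matrix with (7) and return its maximum element, as in (10)."""
    rows, cols = len(report_m) + 1, len(report_j) + 1
    # Initialization: top row and leftmost column are zero.
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for p in range(1, rows):
        for l in range(1, cols):
            a, b = report_m[p - 1], report_j[l - 1]
            score[p][l] = max(
                0,
                score[p - 1][l - 1] + similarity_reward(a, b),
                score[p - 1][l] - gap_penalty(a, b),
                score[p][l - 1] - gap_penalty(a, b),
            )
            best = max(best, score[p][l])
    return best
```

Under these assumptions, two identical reports of length n score 2n, while single-element reports holding Z1 and Z4 score 0. The trace back stage is deliberately left out, since only the score is needed.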

3.2.3. Classification

As explained above, a sensing report has n elements belonging to q. The sensing report $x_{i}$ has to be classified into one of the sensing classes, which are treated as neighbors for $x_{i}$. The candidate set of neighbors for $x_{i}$ is denoted by $N (x_{i})$ and contains all classes found in Section 3.1, such that $N (x_{i}) = \{R_{1}, R_{2}, R_{3}, R_{4}, R_{5}, R_{6}, R_{7}, R_{8}\}$, and each CR user has its own version of the sensing classes.

The current sensing report is compared with every member of each of the sensing classes belonging to $N (x_{i})$. The membership counting vector is represented by $\vec{y}_{x_{i}} (l)$. Each element of $\vec{y}_{x_{i}} (l)$ is the result of comparing $x_{i}$ with the j-th member of the l-th sensing class, computed by (10). Let $h_{1}^{l}$ be the event that sensing report $x_{i}$ belongs to class l and $h_{0}^{l}$ be the event that sensing report $x_{i}$ does not belong to class l. Furthermore, let $E_{ω}^{l}$ be the event that $ω$ elements in $\vec{y}_{x_{i}} (l)$ are greater than a threshold. Then the posterior probability $P_{x_{i}} (l)$ that the current sensing report $x_{i}$ belongs to class l is found as $\begin{matrix} (11) & P_{x_{i}} (l) = P (h_{1}^{l} \mid E_{ω}^{l}) = \frac{P (h_{1}^{l}) P (E_{ω}^{l} \mid h_{1}^{l})}{\sum_{b \in \{0, 1\}} P (h_{b}^{l}) P (E_{ω}^{l} \mid h_{b}^{l})} \propto P (h_{1}^{l}) P (E_{ω}^{l} \mid h_{1}^{l}) . \end{matrix}$ Based on the posterior probability, the local decision for the i-th CR user at the r-th sensing slot, represented by $q_{i, r}$, is given as $\begin{matrix} (12) & q_{i, r} = \begin{cases} H_{0}, & P_{0} > P_{1} \\ H_{1}, & \text{otherwise} \end{cases} \end{matrix}$ where $P_{0}$ is the sum of the posterior probabilities of the sensing classes representing H₀ and is given as $\begin{matrix} (13) & P_{0} = P_{x_{i}} (R_{1}) + P_{x_{i}} (R_{3}) + P_{x_{i}} (R_{6}) + P_{x_{i}} (R_{8}) \end{matrix}$ and $P_{1}$ is the sum of the posterior probabilities of the sensing classes representing H₁ and is given as $\begin{matrix} (14) & P_{1} = P_{x_{i}} (R_{2}) + P_{x_{i}} (R_{4}) + P_{x_{i}} (R_{5}) + P_{x_{i}} (R_{7}) . \end{matrix}$
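The decision rule of (11)-(14) can be sketched as follows. The text does not state how the prior $P (h_{b}^{l})$ and the likelihood $P (E_{ω}^{l} \mid h_{b}^{l})$ are estimated, so this sketch simply takes them as inputs; the function names and the dictionary layout are assumptions made for illustration only.

```python
# Class grouping taken from (13) and (14).
H0_CLASSES = ("R1", "R3", "R6", "R8")
H1_CLASSES = ("R2", "R4", "R5", "R7")


def count_similar(scores, threshold):
    """Return omega: how many similarity scores in the membership
    counting vector of one class exceed the threshold."""
    return sum(1 for s in scores if s > threshold)


def class_posterior(prior_in, lik_in, prior_out, lik_out):
    """Bayes' rule as in the middle expression of (11).

    prior_in, lik_in   : P(h_1^l) and P(E | h_1^l), supplied by the caller
    prior_out, lik_out : P(h_0^l) and P(E | h_0^l), supplied by the caller
    """
    numerator = prior_in * lik_in
    denominator = numerator + prior_out * lik_out
    return numerator / denominator if denominator > 0 else 0.0


def local_decision(posteriors):
    """Apply (12): compare the summed posteriors of the H0 and H1 classes.

    posteriors maps a class name ("R1".."R8") to its posterior probability;
    missing classes are treated as having zero posterior (an assumption).
    """
    p0 = sum(posteriors.get(c, 0.0) for c in H0_CLASSES)
    p1 = sum(posteriors.get(c, 0.0) for c in H1_CLASSES)
    return "H0" if p0 > p1 else "H1"
```

As in (12), a tie between $P_{0}$ and $P_{1}$ resolves to H₁, which errs on the side of protecting the PU.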

4. Cooperative Spectrum Sensing

The FC receives the local decisions as $D_{i}$, where $i = 1, 2, 3, \dots, N$. In CSS, the sensing capabilities of CR users differ from each other, which results in different local sensing results [40]. In the proposed scheme, we use a weight-based decision combination at the FC. Each CR user is assigned a weight based on its effectiveness.

A partial global decision at the FC, represented by $L_{G, i}$, is made by excluding the result of the i-th CR user as $\begin{matrix} (15) & L_{G, i} = \begin{cases} H_{0}, & N_{H_{0}}^{i} > N_{H_{1}}^{i} \\ H_{1}, & \text{otherwise} \end{cases} \end{matrix}$ where $N_{H_{0}}^{i}$ is the number of CR users reporting H₀, excluding the local decision of the i-th CR user, and is given as $\begin{matrix} (16) & N_{H_{0}}^{i} = \sum_{j = 1, j \neq i}^{N} I_{0} (D_{j} = H_{0}) \end{matrix}$ where $I_{0} (D_{j} = H_{0})$ is the indicator function for $H_{0}$ and is given by $\begin{matrix} (17) & I_{0} (D_{j} = H_{0}) = \begin{cases} 1, & D_{j} = H_{0} \\ 0, & D_{j} \neq H_{0} . \end{cases} \end{matrix}$

On the other hand, $N_{H_{1}}^{i}$ is the number of CR users reporting H₁, excluding the local decision of the i-th CR user, and is given as $\begin{matrix} (18) & N_{H_{1}}^{i} = \sum_{j = 1, j \neq i}^{N} I_{0} (D_{j} = H_{1}) \end{matrix}$ where $I_{0} (D_{j} = H_{1})$ is the indicator function for $H_{1}$ and is given by $\begin{matrix} (19) & I_{0} (D_{j} = H_{1}) = \begin{cases} 1, & D_{j} = H_{1} \\ 0, & D_{j} \neq H_{1} . \end{cases} \end{matrix}$

Partial global decisions are found for all CR users. The local decisions of all CR users are then combined through a majority rule into $L_{G, \mathrm{all}}$, which can be expressed as $\begin{matrix} (20) & L_{G, \mathrm{all}} = \begin{cases} H_{0}, & N_{H_{0}} > N_{H_{1}} \\ H_{1}, & \text{otherwise} \end{cases} \end{matrix}$ where $N_{H_{0}}$ is the number of CR users reporting $H_{0}$ and $N_{H_{1}}$ is the number of CR users reporting $H_{1}$. Based on (15) and (20), the weight for each CR user, $α_{i}$, is calculated as $\begin{matrix} (21) & α_{i} = \begin{cases} α_{i} + 1, & L_{G, i} \neq L_{G, \mathrm{all}} \\ α_{i}, & L_{G, i} = L_{G, \mathrm{all}} . \end{cases} \end{matrix}$

The cumulative weight for each hypothesis, $β_{a}$, where $a \in \{H_{0}, H_{1}\}$, is then calculated as $\begin{matrix} (22) & β_{a} = \sum_{i = 1}^{N} α_{i} I_{0} (D_{i} = a), \quad a \in \{H_{0}, H_{1}\} \end{matrix}$ where $I_{0} (D_{i} = a)$ is given by $\begin{matrix} (23) & I_{0} (D_{i} = a) = \begin{cases} 1, & D_{i} = a \\ 0, & \text{otherwise.} \end{cases} \end{matrix}$ The final global decision is denoted by $L_{G}$ and is calculated as $\begin{matrix} (24) & L_{G} = \begin{cases} H_{0}, & β_{H_{0}} > β_{H_{1}} \\ H_{1}, & \text{otherwise.} \end{cases} \end{matrix}$

The global decision is returned to CR users and the CR users then transmit or stay silent according to the global decision.
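The weight-based combination of (15)-(24) can be summarized in a brief sketch. It assumes that local decisions arrive as the strings "H0" and "H1" and that every weight starts from a caller-chosen value, since the initial value of $α_{i}$ is not stated here; the function names are hypothetical.

```python
def majority(decisions):
    """Majority rule used in (15) and (20); a tie resolves to H1."""
    n_h0 = sum(1 for d in decisions if d == "H0")
    n_h1 = sum(1 for d in decisions if d == "H1")
    return "H0" if n_h0 > n_h1 else "H1"


def update_weights(decisions, weights):
    """Apply (21) once: increment the weight of user i when the partial
    decision that excludes user i differs from the decision of all users."""
    decisions = list(decisions)
    overall = majority(decisions)                       # L_{G,all} in (20)
    updated = list(weights)
    for i in range(len(decisions)):
        partial = majority(decisions[:i] + decisions[i + 1:])  # L_{G,i} in (15)
        if partial != overall:
            updated[i] += 1
    return updated


def fuse(decisions, weights):
    """Final global decision of (24) from the cumulative weights of (22)."""
    beta_h0 = sum(w for d, w in zip(decisions, weights) if d == "H0")
    beta_h1 = sum(w for d, w in zip(decisions, weights) if d == "H1")
    return "H0" if beta_h0 > beta_h1 else "H1"
```

For example, with decisions H0, H0, H0, H1, H1 and unit starting weights, removing any one of the three H0 reports produces a tie and flips the partial decision, so only those three weights grow.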

Let $β = \sqrt{2 γ \sum_{k = 1}^{n} {|h_{k}|}^{2} + 1}$, where $h_{k}$ is the channel gain between the primary user and the i-th CR user during the k-th minislot and $γ$ is the mean SNR as received from the PU. If it is assumed that the system’s coefficients are known, then the system probability of false alarm under nonfading channels is given as [37] $\begin{matrix} (25) & P_{f}^{S} = Q \left( β Q^{- 1} (\bar{P_{d}}) + \sqrt{N_{0}} γ \sum_{k = 1}^{n} {|h_{k}|}^{2} \right) \end{matrix}$ where $Q (\cdot)$ is the complementary distribution function of the standard Gaussian, i.e., $Q (χ) = (1 / \sqrt{2 π}) \int_{χ}^{\infty} \exp (- t^{2} / 2) d t$, and $\bar{P_{d}}$ is the system target probability of detection. The probability of detection and the probability of false alarm of the proposed scheme depend both on the probability of the sensing report falling into a particular quantization zone and on the number of minislots in the sensing slot. The target probabilities of detection and false alarm depend on the number of quantization zones, the probability that the sensing decision falls in a particular quantization zone under a particular hypothesis, and the weight of each quantization zone. The quantization thresholds are adjusted until the optimal thresholds are found, and the target probabilities of detection and false alarm are then optimized on the basis of the quantization parameters.

For cooperative spectrum sensing, if the weights of the quantization zones are considered equal, i.e., each quantization zone contributes equally to the final decision combination, the target probability of detection can be given as [41] $\begin{matrix} (26) & \bar{P_{d}} = \prod_{m = 1}^{M} \left\{ \binom{N - \sum_{s = 1}^{l} N_{Z_{s}}}{N_{B_{m}}} {(P_{H_{1}} (Z_{m}))}^{N_{Z_{m}}} \right\} \end{matrix}$ where $N_{Z_{m}}$ is the number of CR users having the local sensing decision in zone $Z_{m}$, $l$ is the largest integer less than m, and $P_{H_{1}} (Z_{m})$ is the probability of having the local sensing decision in quantization zone $Z_{m}$ under $H_{1}$.

The system probability of detection can be given as [37] $\begin{matrix} (27) & P_{d}^{S} = Q \left( β Q^{- 1} (\bar{P_{f}}) + \sqrt{N_{0}} γ \sum_{k = 1}^{n} {|h_{k}|}^{2} \right) \end{matrix}$ where $\bar{P_{f}}$ is the system target probability of false alarm and is given by [41] $\begin{matrix} (28) & \bar{P_{f}} = \prod_{m = 1}^{M} \left\{ \binom{N - \sum_{s = 1}^{l} N_{Z_{s}}}{N_{B_{m}}} {(P_{H_{0}} (Z_{m}))}^{N_{Z_{m}}} \right\} \end{matrix}$ and $P_{H_{0}} (Z_{m})$ is the probability of having the local sensing decision in quantization zone $Z_{m}$ under $H_{0}$.
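For readers who want to evaluate (25) numerically, the Gaussian tail function $Q (\cdot)$ and its inverse can be obtained from the Python standard library. The sketch below is only an illustration: the function names are hypothetical, and the parameter `n0` stands for the quantity written $N_{0}$ in (25), which is passed through exactly as the formula states without any interpretation added.

```python
import math
from statistics import NormalDist


def q_function(x):
    """Standard Gaussian tail probability Q(x) = P(X > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def q_inverse(p):
    """Inverse of Q for 0 < p < 1, via the standard normal quantile."""
    return NormalDist().inv_cdf(1.0 - p)


def system_false_alarm(target_pd, gamma, channel_gains, n0):
    """Evaluate (25) for a target detection probability.

    gamma         : mean SNR on a linear scale (not dB)
    channel_gains : h_k for each of the n minislots
    n0            : the quantity N_0 appearing in (25) (taken as given)
    """
    energy = sum(abs(h) ** 2 for h in channel_gains)
    beta = math.sqrt(2.0 * gamma * energy + 1.0)
    return q_function(beta * q_inverse(target_pd)
                      + math.sqrt(n0) * gamma * energy)
```

A quick sanity check is that `q_function(0.0)` returns 0.5, which holds only with the $1 / \sqrt{2 π}$ normalization of the Gaussian density.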

5. Results and Analysis

In this section, we observe the behavior of the proposed scheme and compare it with other schemes in terms of the probability of detection, the probability of error, and the probability of spectral hole exploitation. In [8], the effect of reporting multiple bits and of sensing the spectrum multiple times within the same sensing phase was investigated, and the scheme using multiple reporting bits and multiple minislots was shown to be robust against all kinds of attacks. The authors in [3, 37] have also shown the reliability gain brought by using multiple minislots. The number of CR users is 5, the number of iterations is 1000, the sensing slot duration is 1 ms, the sampling frequency is 300 kHz, and the number of energy samples in each sensing slot is 600. The idle probability of the PU is 0.5. The SNR ranges from -25 to -10 dB. When the number of CR users is large, clusters are formed for spectrum sensing to reduce the overhead. The authors in [42] considered clusters of five CR users each to sense the spectrum. That is, when clustering is used, the CR users send their local decisions to a cluster-head, which reduces the number of direct reports sent to the FC. Considering a higher number of CR users requires adopting clusters, which is beyond the scope of this paper; according to [42], however, the sensing performance improves as the number of clusters, and thus the number of CR users, increases. An idle probability of 0.5 is used in the literature for the sake of fairness [8, 42]. If the idle probability of the PU is increased, the CR user obtains more transmission opportunities. Therefore, the idle probability of the PU in this paper is taken as 0.5 to maintain fairness between the CR and PU systems.

As the idle probability of the PU is taken equal to its probability of activity, the target detection probability for the channel without fading is set to 0.8 at an SNR of -20 dB. This detection probability, set with a higher PU active probability of 0.5 (the authors in [37] considered a low PU active probability of 0.3), guarantees the protection of the PU data. We measure the performance of the proposed scheme and of the other schemes in both AWGN and fading channels under varying SNR conditions for different system parameters. The training phase strongly impacts the system performance because the sensing classes are developed in this phase. The larger this phase, the greater the number of training instances, which means the current sensing report has more similar reports to match with. We plot two variants of the proposed scheme: in one, the training phase is 100 iterations, and in the other it is 330 iterations. These schemes are compared with a scheme in which the CR users make a one-bit local decision and the local decisions are combined at the FC by using a conventional OR rule.

In this paper, the probability of error ($P_{e}$) is given as $\begin{matrix} (29) & P_{e} = P_{f} \times P (H_{0}) + (1 - P_{d}) \times P (H_{1}) \end{matrix}$ where $P_{d}$ is the probability of detection, $P_{f}$ is the probability of false alarm, $P (H_{0})$ is the prior probability of H₀, and $P (H_{1})$ is the prior probability of H₁. The probability of detection ($P_{d}$) is defined as $\begin{matrix} (30) & P_{d} = \frac{n_{(D_{G} = 1 \,\&\&\, H = 1)}}{n_{(D_{G} = 1 \,\&\&\, H = 1)} + n_{(D_{G} = 0 \,\&\&\, H = 1)}} \end{matrix}$ and the probability of false alarm ($P_{f}$) is defined as $\begin{matrix} (31) & P_{f} = \frac{n_{(D_{G} = 1 \,\&\&\, H = 0)}}{n_{(D_{G} = 1 \,\&\&\, H = 0)} + n_{(D_{G} = 0 \,\&\&\, H = 0)}} \end{matrix}$ where $D_{G}$ is the global decision and H is the real status of the PU, which is a randomly generated stream of ones and zeroes with size equal to the total number of iterations. A one represents the presence of the PU, while a zero represents the absence of the PU. The false-alarm count in (31) is normalized by the number of iterations in which the PU is absent, so that $P_{f}$ is the fraction of idle slots wrongly declared busy. The notation $n_{(x \,\&\&\, y)}$ means the number of times the condition in the subscript is satisfied. The probability of spectral hole exploitation is represented by $P_{nf}$ and can be expressed as $\begin{matrix} (32) & P_{nf} = 1 - P_{f} . \end{matrix}$
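The empirical metrics of (29)-(32) can be computed from the global decisions and the true PU states with a few counters. The following sketch assumes both are given as equal-length sequences of ones (PU present) and zeroes (PU absent); the function name and the returned dictionary keys are hypothetical. The false-alarm rate is normalized here by the number of slots in which the PU is actually absent, which is the usual definition of a false-alarm rate.

```python
def sensing_metrics(decisions, truth, p_h0=0.5):
    """Return Pd, Pf, Pe and Pnf from 0/1 global decisions and 0/1 PU states.

    p_h0 is the prior probability of H0 (PU idle); P(H1) is taken as 1 - p_h0.
    """
    pairs = list(zip(decisions, truth))
    n11 = sum(1 for d, h in pairs if d == 1 and h == 1)  # detections
    n01 = sum(1 for d, h in pairs if d == 0 and h == 1)  # missed detections
    n10 = sum(1 for d, h in pairs if d == 1 and h == 0)  # false alarms
    n00 = sum(1 for d, h in pairs if d == 0 and h == 0)  # correct idle calls

    pd = n11 / (n11 + n01) if (n11 + n01) else 0.0       # (30)
    pf = n10 / (n10 + n00) if (n10 + n00) else 0.0       # (31)
    pe = pf * p_h0 + (1.0 - pd) * (1.0 - p_h0)           # (29)
    pnf = 1.0 - pf                                       # (32)
    return {"Pd": pd, "Pf": pf, "Pe": pe, "Pnf": pnf}
```

For instance, with true states 1, 1, 1, 1, 0, 0, 0, 0 and decisions 1, 1, 1, 0, 1, 0, 0, 0, three of four busy slots are detected and one of four idle slots raises a false alarm.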

Soft-decision combination gives the optimal sensing performance [4]. In [4], it is also shown that hard-decision combination gives inferior results but incurs only one bit of overhead, while soft combination incurs a large overhead. In a one-bit hard combination scheme, sensing information is lost during local decision making because only one threshold is used. By using multiple thresholds, the loss of sensing information can be reduced, which leads to better performance at the cost of more overhead. In [7], it is also shown that using two bits for reporting the local decision can significantly improve the sensing performance. The effectiveness of using two bits (four quantization levels) was shown for both perfect and imperfect reporting channels. In [43], H. Sakran et al. utilized three bits to report the local decision to the FC, and the performance improvement was shown to be larger than with two reporting bits. In summary, a trade-off exists between spectrum sensing performance and overhead when the quantization levels are designed. Therefore, in this paper we mainly focus on applying a machine learning algorithm to Smith-Waterman algorithm-based soft-decision spectrum sensing for the case of four quantization levels. To consider more than four quantization levels, the whole problem formulation, including the observations in Section 3.1 and the classification classes, would have to be redesigned. Therefore, the simulation results are limited to the case of four quantization levels.

In the training phase, the probability of detection of the proposed scheme is equal to that of the majority rule with quantization. As with any machine learning technique, the performance of the proposed scheme depends on the classification phase. In the simulations, the probability of detection is composed of those of both the training and the classification phases. Training data is required to train the KNN classifier, and the performance of the classifier depends on the size of the training data. The proposed scheme uses the majority rule to obtain the training data. Since malicious users and anomalies are not considered in this paper, the majority rule works by majority voting, and the corresponding performance depends on the local sensing decisions of the CR users. When the training phase is over, the classifier has ample data reflecting the changing behavior of the PU and is considered trained.

Figure 6 shows the system detection performance in an AWGN channel. The proposed scheme with the larger training phase outperforms the other two schemes. The proposed scheme with a smaller training phase has the same detection performance as an OR rule in the low SNR regime. The reason is that the sensing reports in low SNR regimes do not have large distances from each other. The energies received under both hypotheses in the low SNR regime vary little from each other and thus, the scheme with fewer training instances fails to learn the environment more reliably. As the SNR improves, the proposed scheme with the smaller training phase results in more reliable spectrum sensing than conventional schemes. Figure 7 shows the error performance as calculated by (29). In this figure, it can also be seen that the proposed scheme with the larger training phase has a low probability of error even in the low SNR regime. The scheme with the smaller training phase converges to one with a larger training phase in better SNR conditions, which shows that even with a smaller training size the proposed scheme can result in more reliable spectrum sensing than conventional schemes.

[figure omitted; refer to PDF] [figure omitted; refer to PDF]

Figure 8 shows the capability of the proposed scheme to exploit spectral holes, which is defined by (32). Exploiting available opportunities for transmitting data is the highest priority from the perspective of a CR user. Even in bad SNR conditions, our proposed scheme enables CR users to exploit data transmission opportunities. The proposed scheme with the smaller training phase lags behind the one with the larger training phase in bad SNR conditions but converges to it in good SNR regimes.

[figure omitted; refer to PDF]

In the high SNR region, the sensing reports are better reflections of the PU’s activity. The sensing performance improves at high SNR because the PU signal makes up a larger portion of the received signal compared with the added noise. That is to say, as the SNR gets larger, fewer training samples, and hence a smaller training window, are required to train the classifier. Therefore, when the SNR improves, a smaller training size results in the same performance. On the other hand, in the lower SNR region, more training samples and a larger training window are needed to accurately reflect the PU’s activity. All three schemes show the same performance trend but at different SNR levels. The OR rule has the best detection performance among conventional schemes, as it uses the most relaxed criterion of all the conventional rules for declaring the PU present. However, this means that the OR rule cannot efficiently exploit data transmission opportunities. These figures show that our proposed scheme can protect PU data more effectively as well as provide more data transmission opportunities.

Figure 9 shows the detection performance of the proposed scheme in a fading environment. Fading affects the power of the received signal and thus the number of energy samples required to efficiently decide the status of the PU. In a nonfading environment the amplitude gain of the channel is deterministic, while in fading channels the amplitude gain of the channel varies [17]. Thus the probability of detection depends on the instantaneous SNR. The effect of fading on the performance of spectrum sensing was investigated in detail in [17]. Instead of following (2) and (3) for setting up a simulation environment, we have followed a path-loss model to incorporate fading, as presented in [44]. We assume a path-loss model in which the signal is attenuated in proportion to $d^{α}$, where $d$ is the distance between the PU and the CR users and $α = 3$. The average distance between the PU and the CR users is assumed to be 20 m. Our proposed scheme with the larger training phase outperforms the OR rule by 5% when the SNR is -23 dB, and when the SNR improves to -16 dB, the improvement is about 20%. That is, the margin by which the proposed scheme outperforms the OR rule grows as the SNR conditions improve. As can be seen from the figure, the OR rule has a very poor detection performance in a fading environment despite the fact that it has the best detection performance among conventional fusion rules. Figure 10 shows the error performance of the proposed scheme in a fading environment. It can be seen that the error decreases with increasing SNR. At -25 dB, the error probability is just above 0.1. Due to fading, the error probability of the OR rule is 0.35, which is very high compared to our proposed scheme.
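The path-loss assumption can be made concrete with a small sketch. It assumes that the received power falls off as $d^{- α}$ and expresses the SNR at distance $d$ relative to a reference distance; taking the 20 m average distance as that reference is an assumption made only for this illustration, as are the function names.

```python
import math


def path_loss_gain(distance_m, alpha=3.0, reference_m=20.0):
    """Power gain relative to the reference distance under a d^(-alpha) law."""
    return (reference_m / distance_m) ** alpha


def received_snr_db(snr_ref_db, distance_m, alpha=3.0, reference_m=20.0):
    """SNR in dB at distance_m, given the SNR in dB at the reference distance."""
    gain = path_loss_gain(distance_m, alpha, reference_m)
    return snr_ref_db + 10.0 * math.log10(gain)
```

With $α = 3$, doubling the distance lowers the received SNR by $30 \log_{10} 2$ dB, i.e., roughly 9 dB, which indicates how strongly CR users far from the PU are affected.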

[figure omitted; refer to PDF] [figure omitted; refer to PDF]

Figure 11 shows the effect of the number of CR users on the performance of cooperative spectrum sensing under fading channels, where some CR users undergo deep fading and thus have unreliable training data. Fading conditions are required to show the effect of an increasing number of CR users: in nonfading channels the performance stays the same as the number of CR users grows, because the training data of even a small number of CR users is reliable and reflects the PU activity accurately. In the figure, for each number of CR users the SNR is varied from -25 to -10 dB and the mean of the resulting probabilities of detection is taken. For instance, when the number of CR users is 6, the probability of detection is calculated for multiple values of SNR varying between -25 and -15 dB, and the mean of the computed probabilities is the mean probability of detection. The mean probability of detection is denoted by $P_{md}$. Because the values shown are means over the SNR range, $P_{md}$ cannot converge to 1. For each value of SNR the system is run 1,000 times for the proposed scheme with a training size of 100 and 300 times for the proposed scheme with a training size of 100. As can be seen, once the number of CR users increases beyond a limit, in this case 10, the improvement in the mean probability of detection is no longer abrupt. As explained in the first paragraph of this section, clusters need to be formed to utilize the gain that additional CR users can provide. When the FC combines the sensing decisions of all CR users rather than those of cluster heads, the sensing decisions of many CR users may fall outside the similarity distance range calculated in Section 3.2.2, and their reports are then rejected. From Figure 11 it can be seen that the proposed scheme with the larger training size achieves a higher mean probability of detection than the other schemes. A mean probability of detection of 0.8 is reached when the number of users equals 20; this is the target detection probability considered in this paper at an SNR of -20 dB for nonfading channels with 5 CR users. The proposed scheme reaches a highest mean probability of detection of nearly 0.7, and the OR rule achieves a highest mean probability of detection of less than 0.6.
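The averaging procedure described above can be sketched as follows. This is only an illustration of how a mean probability of detection over an SNR sweep might be computed. The `detector` callable, the placeholder `toy_detector`, the 5 dB grid spacing, and the seed are all assumptions of this sketch and do not reproduce the proposed scheme or the OR rule.

```python
import random
from statistics import fmean


def estimate_pd(detector, snr_db, trials, rng):
    """Monte Carlo estimate of the detection probability at one SNR.

    `detector(snr_db, rng)` is a hypothetical callable that simulates one
    sensing interval with the PU active and returns True on detection.
    """
    hits = sum(1 for _ in range(trials) if detector(snr_db, rng))
    return hits / trials


def mean_pd(detector, snr_grid_db, trials, seed=0):
    """Average the per-SNR detection probabilities over the whole SNR grid."""
    rng = random.Random(seed)
    per_snr = [estimate_pd(detector, snr, trials, rng) for snr in snr_grid_db]
    return fmean(per_snr)


def toy_detector(snr_db, rng):
    """Placeholder detector (assumption): success probability rises with SNR."""
    p_detect = min(1.0, max(0.0, (snr_db + 30.0) / 20.0))
    return rng.random() < p_detect


# Assumed grid: -25 dB to -10 dB in 5 dB steps.
snr_grid_db = [-25, -20, -15, -10]
p_md = mean_pd(toy_detector, snr_grid_db, trials=1000)
```

Because the low-SNR points of the sweep contribute detection probabilities well below 1, the average stays below 1 even when the detector is nearly perfect at the high-SNR end, which matches the remark that the mean value cannot converge to 1.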

[figure omitted; refer to PDF]

6. Conclusion

In this paper, a machine learning-based reliable spectrum sensing scheme is proposed. The proposed scheme learns from the environment by taking into account the true status of the PU. Sensing reports are stored in appropriate sensing classes, and the current sensing report is then classified into one of these classes. Based on the result of classification, the PU is declared present or absent. Local decisions are combined at the FC by a novel decision combination scheme that takes into account the reliability of the CR users. Mechanisms at both the CR level and the FC level ensure reliable spectrum sensing. Simulation results show that our proposed scheme has better detection performance and better spectral hole exploitation capability than the conventional OR rule. Fading affects detection performance, but our scheme successfully detects the PU 80% of the time at -10 dB SNR even in a fading environment.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2015R1D1A1A09057077) as well as by the Korea Government (MSIT) (2018R1A2B6001714).

References

[1] J. Mitola, Cognitive Radio—An Integrated Agent Architecture for Software Defined Radio [Ph.D. thesis], 2000.

[2] P. Kolodzy, "Spectrum policy task force," Rep. ET Docket 02-135, 2002.

[3] R. Fan, H. Jiang, "Optimal multi-channel cooperative sensing in cognitive radio networks," IEEE Transactions on Wireless Communications, vol. 9 no. 3, pp. 1128-1138, DOI: 10.1109/TWC.2010.03.090467, 2010.

[4] J. Ma, G. Zhao, Y. Li, "Soft combination and detection for cooperative spectrum sensing in cognitive radio networks," IEEE Transactions on Wireless Communications, vol. 7 no. 11, pp. 4502-4507, DOI: 10.1109/T-WC.2008.070941, 2008.

[5] S. Kyperountas, N. Correal, Q. Shi, "A comparison of fusion rules for cooperative spectrum sensing in fading channels," EMS Research, Motorola, 2010.

[6] H. Guo, W. Jiang, W. Luo, "Linear Soft Combination for Cooperative Spectrum Sensing in Cognitive Radio Networks," IEEE Communications Letters, vol. 21 no. 7, pp. 1573-1576, DOI: 10.1109/LCOMM.2017.2686393, 2017.

[7] H. Sakran, M. Shokair, "Hard and softened combination for cooperative spectrum sensing over imperfect channels in cognitive radio networks," Telecommunication Systems, vol. 52 no. 1, pp. 61-71, DOI: 10.1007/s11235-011-9467-7, 2013.

[8] H. A. Shah, M. Usman, I. Koo, "Bioinformatics-inspired quantized hard combination-based abnormality detection for cooperative spectrum sensing in cognitive radio networks," IEEE Sensors Journal, vol. 15 no. 4, pp. 2324-2334, DOI: 10.1109/jsen.2014.2375363, 2015.

[9] P. Kaligineedi, V. K. Bhargava, "Sensor allocation and quantization schemes for multi-band cognitive radio cooperative sensing system," IEEE Transactions on Wireless Communications, vol. 10 no. 1, pp. 284-293, DOI: 10.1109/TWC.2010.102810.100650, 2011.

[10] R. Chen, J.-M. Park, K. Bian, "Robust distributed spectrum sensing in cognitive radio networks," pp. 31-35, DOI: 10.1109/INFOCOM.2007.251.

[11] Z. Han, R. Zheng, H. V. Poor, "Repeated auctions with Bayesian nonparametric learning for spectrum access in cognitive radio networks," IEEE Transactions on Wireless Communications, vol. 10 no. 3, pp. 890-900, DOI: 10.1109/TWC.2011.010411.100838, 2011.

[12] J. Lundén, V. Koivunen, S. R. Kulkarni, H. V. Poor, "Reinforcement learning based distributed multiagent sensing policy for cognitive radio networks," Proceedings of the 2011 IEEE International Symposium on Dynamic Spectrum Access Networks (DYSPAN), pp. 642-646, DOI: 10.1109/DYSPAN.2011.5936261.

[13] M. Bkassiny, S. K. Jayaweera, K. A. Avery, "Distributed Reinforcement Learning based MAC protocols for autonomous cognitive secondary users," Proceedings of the 20th Annual Wireless and Optical Communications Conference (WOCC '11).

[14] A. Galindo-Serrano, L. Giupponi, "Distributed Q-learning for aggregated interference control in cognitive radio networks," IEEE Transactions on Vehicular Technology, vol. 59 no. 4, pp. 1823-1834, DOI: 10.1109/tvt.2010.2043124, 2010.

[15] B. Y. Reddy, "Detecting Primary Signals for Efficient Utilization of Spectrum Using Q-Learning," Proceedings of the 2008 Fifth International Conference on Information Technology: New Generations (ITNG), pp. 360-365, DOI: 10.1109/ITNG.2008.95.

[16] Q. Zhu, Z. Han, T. Başar, "No-Regret Learning in Collaborative Spectrum Sensing with Malicious Nodes," Proceedings of the 2010 IEEE International Conference on Communications, DOI: 10.1109/ICC.2010.5502580.

[17] A. Ghasemi, E. S. Sousa, "Collaborative spectrum sensing for opportunistic access in fading environments," pp. 131-136, DOI: 10.1109/DYSPAN.2005.1542627.

[18] K. M. Thilina, K. W. Choi, N. Saquib, E. Hossain, "Machine learning techniques for cooperative spectrum sensing in cognitive radio networks," IEEE Journal on Selected Areas in Communications, vol. 31 no. 11, pp. 2209-2221, DOI: 10.1109/JSAC.2013.131120, 2013.

[19] M. Y. Kiang, "A comparative assessment of classification methods," Decision Support Systems, vol. 35 no. 4, pp. 441-454, DOI: 10.1016/S0167-9236(02)00110-0, 2003.

[20] K. M. Thilina, K. W. Choi, N. Saquib, E. Hossain, "Pattern classification techniques for cooperative spectrum sensing in cognitive radio networks: SVM and W-KNN approaches," Proceedings of the 2012 IEEE Global Communications Conference (GLOBECOM '12), pp. 1260-1265.

[21] M. Tang, Z. Zheng, G. Ding, Z. Xue, "Efficient TV white space database construction via spectrum sensing and spatial inference," Proceedings of the 34th IEEE International Performance Computing and Communications Conference (IPCCC 2015).

[22] A. M. Mikaeil, B. Guo, Z. Wang, "Machine learning to data fusion approach for cooperative spectrum sensing," Proceedings of the 6th International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC '14), pp. 429-434.

[23] J. Kanti, G. S. Tomar, A. Bagwari, "A novel multiple antennas based centralized spectrum sensing technique," Transactions on Computational Science, vol. 10220, pp. 64-85, DOI: 10.1007/978-3-662-54563-8_4, 2017.

[24] A. Bagwari, G. S. Tomar, S. Verma, "Cooperative spectrum sensing based on two-stage detectors with multiple energy detectors and adaptive double threshold in cognitive radio networks," Canadian Journal of Electrical and Computer Engineering, vol. 36 no. 4, pp. 172-180, DOI: 10.1109/cjece.2014.2303519, 2013.

[25] J. Kanti, G. S. Tomar, A. Bagwari, "An improved-two stage detection technique for IEEE 802.22 WRAN," Optik - International Journal for Light and Electron Optics, vol. 140, pp. 695-708, DOI: 10.1016/j.ijleo.2017.04.073, 2017.

[26] J. Kanti, G. S. Tomar, D. Pham, "Improved sensing detector for wireless regional area networks," Cogent Engineering, vol. 4 no. 1, DOI: 10.1080/23311916.2017.1286729, 2017.

[27] A. Samarah, G. S. Tomar, A. Bagwari, J. Kanti, "Double Stage Energy Detectors for Sensing Spectrum in Cognitive Radio Networks," Proceedings of the 2015 Fifth International Conference on Communication Systems and Network Technologies (CSNT '15), pp. 181-184, DOI: 10.1109/CSNT.2015.259.

[28] A. Bagwari, G. S. Tomar, "Cooperative Spectrum Sensing with Multiple Antennas Using Adaptive Double-Threshold Based Energy Detector in Cognitive Radio Networks," Journal of The Institution of Engineers (India): Series B, vol. 95 no. 2, pp. 107-112, DOI: 10.1007/s40031-014-0088-x, 2014.

[29] A. Bagwari, G. S. Tomar, "Comparison between Adaptive Double-Threshold Based Energy Detection and Cyclostationary Detection Technique for Cognitive Radio Networks," Proceedings of the 5th International Conference on Computational Intelligence and Communication Networks (CICN '13), pp. 182-185, DOI: 10.1109/CICN.2013.47.

[30] K. M. Thilina, K. W. Choi, N. Saquib, E. Hossain, "Pattern classification techniques for cooperative spectrum sensing in cognitive radio networks: SVM and W-KNN approaches," Proceedings of the 2012 IEEE Global Communications Conference (GLOBECOM '12), pp. 1260-1265, DOI: 10.1109/GLOCOM.2012.6503286.

[31] T. Mengyun, Z. Zheng, G. Ding, Z. Xue, "Efficient TV white space database construction via spectrum sensing and spatial inference," Proceedings of the 2015 IEEE 34th International Performance Computing and Communications Conference (IPCCC '15), DOI: 10.1109/PCCC.2015.7410268.

[32] M. Jia, X. Gu, Q. Guo, W. Xiang, N. Zhang, "Broadband hybrid satellite-terrestrial communication systems based on cognitive radio toward 5G," IEEE Wireless Communications Magazine, vol. 23 no. 6, pp. 96-106, DOI: 10.1109/MWC.2016.1500108WC, 2016.

[33] M. Jia, X. Liu, X. Gu, Q. Guo, "Joint cooperative spectrum sensing and channel selection optimization for satellite communication systems based on cognitive radio," International Journal of Satellite Communications and Networking, vol. 3 no. 2, pp. 139-150, DOI: 10.1002/sat.1169, 2015.

[34] M. Jia, X. Liu, Z. Yin, Q. Guo, X. Gu, "Joint cooperative spectrum sensing and spectrum opportunity for satellite cluster communication networks," Ad Hoc Networks, vol. 58, pp. 231-238, DOI: 10.1016/j.adhoc.2016.05.012, 2017.

[35] W. Tan, M. Matthaiou, S. Jin, X. Li, "Spectral Efficiency of DFT-Based Processing Hybrid Architectures in Massive MIMO," IEEE Wireless Communications Letters, vol. 6 no. 5, pp. 586-589, DOI: 10.1109/LWC.2017.2719036, 2017.

[36] W. Tan, D. Xie, J. Xia, W. Tan, L. Fan, S. Jin, "Spectral and Energy Efficiency of Massive MIMO for Hybrid Architectures Based on Phase Shifters," IEEE Access, vol. 6, pp. 11751-11759, DOI: 10.1109/ACCESS.2018.2796571, 2018.

[37] Y.-C. Liang, Y. Zeng, E. Peh, A. T. Hoang, "Sensing-throughput tradeoff for cognitive radio networks," IEEE Transactions on Wireless Communications, vol. 7 no. 4, pp. 1326-1337, DOI: 10.1109/TWC.2008.060869, 2008.

[38] K. Taniuchi, Y. Ohba, V. Fajardo, S. Das, M. Tauil, Y.-H. Cheng, A. Dutta, D. Baker, M. Yajnik, D. Famolari, "IEEE 802.21: Media Independent Handover: Features, Applicability, and Realization," IEEE Communications Magazine, vol. 47 no. 1, pp. 112-120, DOI: 10.1109/MCOM.2009.4752687, 2009.

[39] T. F. Smith, M. S. Waterman, "Comparison of biosequences," Advances in Applied Mathematics, vol. 2 no. 4, pp. 482-489, DOI: 10.1016/0196-8858(81)90046-4, 1981.

[40] M. Usman, K. Insoo, "Secure cooperative spectrum sensing via a novel user-classification scheme in cognitive radios for future communication technologies," Symmetry, vol. 7 no. 2, pp. 675-688, DOI: 10.3390/sym7020675, 2015.

[41] H. Birkan Yilmaz, T. Tugcu, F. Alagoz, "Novel quantization-based spectrum sensing scheme under imperfect reporting channel and false reports," International Journal of Communication Systems, vol. 27 no. 10, pp. 1459-1475, DOI: 10.1002/dac.2408, 2014.

[42] H. Vu-Van, I. Koo, "A cluster-based sequential cooperative spectrum sensing scheme utilizing reporting framework for cognitive radios," IEEJ Transactions on Electrical and Electronic Engineering, vol. 9 no. 3, pp. 282-287, DOI: 10.1002/tee.21968, 2014.

[43] A. Ghasemi, E. S. Sousa, "Optimization of spectrum sensing for opportunistic spectrum access in cognitive radio networks," Proceedings of the 4th Annual IEEE Consumer Communications and Networking Conference, pp. 1022-1026, DOI: 10.1109/CCNC.2007.206.

[44] F. Gabry, A. Zappone, R. Thobaben, E. A. Jorswieck, M. Skoglund, "Energy Efficiency Analysis of Cooperative Jamming in Cognitive Radio Networks with Secrecy Constraints," IEEE Wireless Communications Letters, vol. 4 no. 4, pp. 437-440, DOI: 10.1109/LWC.2015.2432802, 2015.


Copyright © 2018 Hurmat Ali Shah and Insoo Koo. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Spectrum sensing is of crucial importance in cognitive radio (CR) networks. In this paper, a reliable spectrum sensing scheme is proposed, which uses K-nearest neighbor, a machine learning algorithm. In the training phase, each CR user produces a sensing report under varying conditions and, based on a global decision, either transmits or stays silent. In the training phase the local decisions of CR users are combined through majority voting at the fusion center and a global decision is returned to each CR user. A CR user transmits or stays silent according to the global decision, and at each CR user the global decision is compared to the actual primary user (PU) activity, which is ascertained through an acknowledgment signal. In the training phase enough information about the surrounding environment, i.e., the activity of the PU and the behavior of each CR user in response to that activity, is gathered and sensing classes are formed. In the classification phase, each CR user compares its current sensing report to the existing sensing classes and distance vectors are calculated. Based on quantitative variables, the posterior probability of each sensing class is calculated and the sensing report is classified as representing either the presence or the absence of the PU. The quantitative variables used for calculating the posterior probability are obtained through the K-nearest neighbor algorithm. These local decisions are then combined at the fusion center using a novel decision combination scheme, which takes into account the reliability of each CR user. The CR users then transmit or stay silent according to the global decision. Simulation results show that our proposed scheme outperforms conventional spectrum sensing schemes, both in fading and in nonfading environments, where performance is evaluated using metrics such as the probability of detection, total probability of error, and the ability to exploit data transmission opportunities.

Details

Title: Reliable Machine Learning Based Spectrum Sensing in Cognitive Radio Networks

Author: Hurmat Ali Shah¹; Insoo Koo¹

¹ School of Electrical Engineering, University of Ulsan, Republic of Korea

Editor: Mu Zhou

Publication year: 2018

Publication date: 2018

Publisher: John Wiley & Sons, Inc.

e-ISSN: 15308677

Source type: Scholarly Journal

Language of publication: English

DOI: https://doi.org/10.1155/2018/5906097

ProQuest document ID: 2407627982
