Abstract

Source number detection and Direction-of-Arrival (DOA) estimation are usually addressed in two stages, leading to high computational load. This paper proposes a simple solution to efficiently estimate the source number and DOAs using a deep neural network (DNN) and clustering, named DNN-C. By observing that sources in space are usually few, DNN-C uses a simple fully connected DNN to obtain a spatial spectrum. Then, a K2-means clustering is specially designed to extract the source information from the obtained spatial spectrum. In particular, to give DNN-C the ability to detect mixed sources, we first develop a new strategy for training data generation and provide a guideline for data balance setting. We then exploit the prior knowledge of array signal processing and of the spatial spectrum to obtain a peak vector, and propose to add a virtual peak to the peak vector, thus transforming the task of source detection into a binary clustering problem of noise and sources. Overall, DNN-C provides a lightweight solution to implement source number detection and DOA estimation simultaneously and efficiently. Its testing time is about half that of the classical solution (i.e., minimum description length and multiple signal classification, shortened as MDL-MUSIC) when the grid step is 1°. Importantly, it is robust to nonuniform noise by nature and can identify the absence of sources. The effectiveness of DNN-C is verified by simulation results. Furthermore, the DNN-C model trained by simulated data shows its generalization to real data measured by a circular array of eight sensors.

1. Introduction

Source number detection/estimation and Direction-of-Arrival (DOA) estimation are two important topics in array signal processing, and have been used in wireless communications, electronic reconnaissance, radar, and sonar systems [1]. DOA estimation provides the direction of the sources, which is important for target localization and tracking and has been widely used in military applications such as moving target detection and civil applications such as automotive driving. Many DOA estimation methods assume prior knowledge of the number of sources, which is not available for non-cooperative targets. Therefore, the number of sources is also considered an unknown parameter to be estimated in this paper. For source number estimation, traditional methods use information theory-based criteria such as the Akaike information criterion (AIC) and the minimum description length (MDL) criterion [2,3]. The classical lightweight DOA solutions include the phase interferometric approach [4] and conventional beamforming (CBF) [5]. The experimental results in [4] demonstrate that the phase interferometric approach is a simple and effective solution, and thus is applicable for digital dedicated hardware implementation. Experiments on Unmanned Aerial Vehicle (UAV) localization in passive radar (PR) scenarios [5] indicate that the CBF solution presents high estimation accuracy. In addition, traditional high-resolution DOA estimation solutions generally include subspace-based methods such as multiple signal classification (MUSIC) [6] and sparse representation-based methods such as the ℓ1-SVD method [7], and their variants [8,9,10,11]. These methods were developed for uniform noise with the same noise powers at different sensors. In practice, the noise at different sensors might have different powers because of the imperfect channel response and mutual coupling [12]. In this situation, the noise is named nonuniform noise [13].
The nonuniform noise MDL method (named NUMDL) was proposed in [14] and is not robust. The Signal Subspace Matching (SSM) method was developed in [15] and is only suitable for medium and high signal-to-noise ratios (SNRs) in the case of multiple sources. For handling nonuniform noise, robust DOA estimation was proposed in [13,16,17]. Most of them either eliminate or estimate nonuniform noise or seek the optimal subspace for estimating DOAs. Most of the aforementioned solutions require the eigendecomposition or inverse of the array covariance matrix, leading to a high computational load.

At present, deep neural network (DNN)-based methods for source number detection and DOA estimation are emerging. For source number estimation, two networks, i.e., a regression network (ERNet) and a classification network (ECNet), were proposed in [18] and are superior to traditional algorithms based on simulation results. However, the involved eigendecomposition increases the computational load. For DOA estimation, the DNN-based methods are designed in a regression model [19,20], or in a classification model [21,22,23,24]. In [21,25], robust DNN-based DOA methods were developed to handle the array imperfections.

The simulation results demonstrate that compared with the conventional DOA estimation methods, the DOA estimation algorithms based on deep learning improve the estimation performance. However, it should be emphasized that the DNN-based DOA estimation methods require the number of sources. In particular, for the DNN methods with regression, the number of sources is required to design the dimension of the output vector of the DNN model. For the ones with classification, the number of sources is needed as well to select the number of peaks from the output vector of the DNN model (often called a spatial spectrum) for DOA estimation. It is noted that peak searching on the spatial spectrum is a way to find the number of sources. However, this way faces the challenge of selecting the pseudo peaks as the peak of true DOAs as illustrated in Section 3.3.1 in this paper.

Thus, it is necessary to develop the simultaneous estimation of the source number and DOAs. In [26], a deep beamforming method was proposed. It uses long short-term memory (LSTM), fully connected DNN, and 1-D convolutional neural network (CNN) together to learn the mapping from the received signal to the spatial spectrum. Then, the peak searching result of the spatial spectrum is used for the detection of sources. However, in practice, the spatial spectrum commonly not only forms peaks at the locations of sources but also gives pseudo peaks at other locations, which fails the peak searching procedure.

In this paper, simultaneous source number detection and DOA estimation using DNN and clustering (named DNN-C) are proposed. By observing that sources are sparse in space (that is, there are only a few sources in space), DNN-C uses a simple fully connected DNN to produce a spatial spectrum. For extracting the source information from the spatial spectrum, we develop a specified K2-means clustering.

The main contributions of our work are given as follows.

  1. We develop a new strategy for training data generation, which generates mixed data under multiple low SNRs. Moreover, it provides a guideline for data balance setting and thus trains the DNN model to gain equal inference ability for different numbers of sources.

  2. We obtain the peak vector by selecting the $M-1$ largest peaks of the obtained spatial spectrum, exploiting the prior knowledge that an array of $M$ antennas can detect at most $M-1$ sources.

  3. We add one virtual peak $p_{\mathrm{vir}}$ (a positive value less than 1) to the peak vector to obtain an extended peak vector, utilizing the prior knowledge that the labels of DNN-C are set to be either 1 for the angles of sources or 0 for other angles. In this way, DNN-C is able to detect the absence of sources. The selection of the virtual peak is analyzed by numerical results.

  4. We transform the task of source detection into a binary clustering problem of noise and sources. Thus, we set the number of clusters as 2 and the initial centroids as $(0, p_{\mathrm{vir}})$ in the proposed K2-means clustering. Because neither the number of clusters nor the initial centroids has to be selected, the proposed K2-means clustering is more efficient and effective than the conventional K-means clustering.

We verify the performance of DNN-C and compare it with traditional solutions by using numerical results. The results demonstrate that DNN-C is the most efficient in terms of testing time. Moreover, it is the most robust in terms of source number detection and DOA estimation because it is robust to nonuniform noise by nature. In addition, it can identify the absence of sources. Real data confirm the generalization of the DNN-C model trained with simulated data.

This paper is organized as follows. Section 2 introduces the array signal model. Section 3 presents the proposed DNN-C method. In Section 4, we evaluate the proposed method and compare it with other methods by simulation results. In Section 5, the generalization ability of the DNN-C model trained with simulated data is further verified by real data. Section 6 draws the conclusion.

Notations: Throughout the paper, $\|\cdot\|_p$ is the $\ell_p$ norm. Real{·} and Imag{·} are the real and imaginary parts of a complex value. $(\cdot)^T$ and $(\cdot)^H$ denote the transpose operation and conjugate transpose operation, respectively. $j$ is the imaginary unit.

2. Array Signal Model

The array consists of $M$ elements, and $K$ far-field narrow-band source signals are in the same plane as the array and impinge on the array with DOAs of $\theta_1, \ldots, \theta_K$. Without loss of generality, the received signal model for a uniform linear array is given in Figure 1.

The received array signal vector is given as

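(1) $\mathbf{r}(t) = \mathbf{A}\mathbf{s}(t) + \mathbf{n}(t), \quad t = 1, \ldots, N,$

where $N$ is the number of snapshots. The noise vector $\mathbf{n}(t)$ consists of additive zero-mean Gaussian noise, $\mathbf{s}(t) = [s_1(t), \ldots, s_K(t)]^T$, and $\mathbf{A} = [\mathbf{a}(\theta_1), \ldots, \mathbf{a}(\theta_K)]$ with $\mathbf{a}(\theta_k)$ being the steering vector at $\theta_k$.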

The expected array covariance matrix is

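(2) $\mathbf{R} = E[\mathbf{r}(t)\mathbf{r}^H(t)] = \mathbf{A}\mathbf{R}_s\mathbf{A}^H + \mathbf{R}_n,$

where $\mathbf{R}_s = E[\mathbf{s}(t)\mathbf{s}^H(t)]$ and $\mathbf{R}_n = \mathrm{diag}\{\mathbf{v}_n\}$. Note that $\mathrm{diag}\{\mathbf{v}_n\}$ represents a diagonal matrix with $\mathbf{v}_n$ as its diagonal elements, and $\mathbf{v}_n$ is the noise power vector given as

(3) $\mathbf{v}_n = [\sigma_{n,1}^2, \ldots, \sigma_{n,M}^2]^T,$

where $\sigma_{n,m}^2$ is the noise power at the $m$-th sensor. If the noise powers are all equal, the noise is uniform noise. Otherwise, it is nonuniform.

It is noted that when the sources are uncorrelated, we have $\mathbf{R}_s = \mathrm{diag}\{\sigma_{s,1}^2, \sigma_{s,2}^2, \ldots, \sigma_{s,K}^2\}$ with $\sigma_{s,k}^2$ being the power of the $k$-th source. From Equation (2), the $m$-th diagonal element of $\mathbf{R}$ is then $R_{m,m} = \sum_{k=1}^{K} \sigma_{s,k}^2 + \sigma_{n,m}^2$. Therefore, the diagonal elements of $\mathbf{R}$ do not contain DOA information and they can be removed from the input of the DNN model.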
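The step from Equation (2) to the diagonal elements can be written out as a short worked equation. It assumes that every entry $a_m(\theta_k)$ of the steering vector has unit modulus (as for ideal omnidirectional sensors), which is an assumption made here for illustration and is not stated explicitly in the text.

```latex
R_{m,m}
  = \sum_{k=1}^{K} \left| a_m(\theta_k) \right|^2 \sigma_{s,k}^2 + \sigma_{n,m}^2
  = \sum_{k=1}^{K} \sigma_{s,k}^2 + \sigma_{n,m}^2,
  \qquad \left| a_m(\theta_k) \right| = 1 .
```

Under that assumption the angles $\theta_k$ drop out of every diagonal entry, which is why only the off-diagonal entries are kept as DNN input.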

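In practice, $\mathbf{R}$ is estimated from $N$ snapshots as $\hat{\mathbf{R}}$ below:

(4) $\hat{\mathbf{R}} = \frac{1}{N} \sum_{t=1}^{N} \mathbf{r}(t)\mathbf{r}^H(t).$

The eigendecomposition of $\hat{\mathbf{R}}$ is given as

(5) $\hat{\mathbf{R}} = \sum_{i=1}^{M} \hat{\beta}_i \hat{\mathbf{u}}_i \hat{\mathbf{u}}_i^H,$

where $\hat{\beta}_i$, $i = 1, \ldots, M$, are the eigenvalues listed in descending order, and $\hat{\mathbf{u}}_i$ is the eigenvector corresponding to $\hat{\beta}_i$. The first $K$ eigenvectors are signal eigenvectors and the remaining $M-K$ eigenvectors are noise eigenvectors.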

With the assumption that the number of sources K is known, the spatial spectrum of the MUSIC is given below:

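(6) $P(\theta) = \left\| \sum_{m=K+1}^{M} \mathbf{a}^H(\theta)\, \hat{\mathbf{u}}_m \hat{\mathbf{u}}_m^H\, \mathbf{a}(\theta) \right\|_2^{-2}, \quad \theta \in [\theta_{\min}, \theta_{\max}).$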

Then, the angles of the first K largest peaks are taken as the estimated DOAs of the K sources.
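A minimal NumPy sketch of this peak-search step is given below. It uses the common reciprocal-projection form of the MUSIC pseudo-spectrum (one over the energy of $\mathbf{a}(\theta)$ in the noise subspace), whose peak locations coincide with those of other monotone variants. The half-wavelength ULA steering vector, the 10 dB SNR, the on-grid DOAs, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def ula_steering(theta_deg, M, spacing=0.5):
    """ULA steering vector; spacing in wavelengths (half-wavelength assumed)."""
    m = np.arange(M)
    return np.exp(2j * np.pi * spacing * m * np.sin(np.deg2rad(theta_deg)))

def music_spectrum(R_hat, K, grid_deg):
    """Reciprocal of the noise-subspace energy of a(theta) at each grid angle."""
    M = R_hat.shape[0]
    _, eigvecs = np.linalg.eigh(R_hat)        # eigenvalues in ascending order
    U_noise = eigvecs[:, : M - K]             # eigenvectors of the M-K smallest
    A = np.stack([ula_steering(t, M) for t in grid_deg], axis=1)
    return 1.0 / np.sum(np.abs(U_noise.conj().T @ A) ** 2, axis=0)

def largest_peaks(spectrum, grid_deg, K):
    """Angles of the K largest local maxima, sorted by angle."""
    i = np.arange(1, len(spectrum) - 1)
    peaks = i[(spectrum[i] > spectrum[i - 1]) & (spectrum[i] >= spectrum[i + 1])]
    best = peaks[np.argsort(spectrum[peaks])[::-1][:K]]
    return np.sort(grid_deg[best])

# Illustrative scenario: two on-grid sources, 10 dB SNR, unit-power noise.
rng = np.random.default_rng(1)
M, N, K = 10, 1000, 2
true_doas = [-20.0, 30.0]
A = np.stack([ula_steering(t, M) for t in true_doas], axis=1)
s = np.sqrt(10.0) * (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
n = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
r = A @ s + n
R_hat = r @ r.conj().T / N                    # sample covariance, Equation (4)

grid = np.arange(-60.0, 60.0, 1.0)
doa_hat = largest_peaks(music_spectrum(R_hat, K, grid), grid, K)
print(doa_hat)
```

Note that `K` has to be supplied to both `music_spectrum` and `largest_peaks`, which is exactly the dependence on the source number that motivates the rest of the paper.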

In practice, the number of sources K is unknown and needs to be estimated by the source number detection methods such as the MDL method for the case of uniform noise, and NUMDL and SSM methods for the cases of nonuniform noise.

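By considering that $\hat{\mathbf{R}}$ is a Hermitian matrix, the DNN model is real-valued, and the diagonal elements of $\mathbf{R}$ do not contain DOA information, we arrange the off-diagonal upper-right triangular elements of $\hat{\mathbf{R}}$ to form an $\frac{M(M-1)}{2}$-dimensional vector

(7) $\mathbf{b} = [\hat{R}_{1,2}, \hat{R}_{1,3}, \ldots, \hat{R}_{1,M}, \hat{R}_{2,3}, \ldots, \hat{R}_{2,M}, \ldots, \hat{R}_{M-1,M}]^T,$

where $\hat{R}_{i,j}$ represents the element in the $i$-th row and $j$-th column of the matrix $\hat{\mathbf{R}}$. Since the vector $\mathbf{b}$ is a complex vector, we split $\mathbf{b}$ into real and imaginary parts and concatenate them to obtain a real-valued, $M(M-1)$-dimensional vector $\mathbf{b}_1$ as follows:

(8) $\mathbf{b}_1 = [\mathrm{Real}\{\mathbf{b}^T\}, \mathrm{Imag}\{\mathbf{b}^T\}]^T.$

Then $\mathbf{b}_1$ is normalized to obtain the vector $\mathbf{x}$ as follows:

(9) $\mathbf{x} = \frac{\mathbf{b}_1 - \mathrm{mean}\{\mathbf{b}_1\}}{\|\mathbf{b}_1 - \mathrm{mean}\{\mathbf{b}_1\}\|_2},$

where $\mathrm{mean}\{\cdot\}$ represents the mean operation.

The vector $\mathbf{x}$ is then used as the input vector of the DNN model. We use the function $f_{\mathrm{input}}$ to represent the transformation from Equations (4)–(9). Therefore, we have

(10) $\mathbf{x} = f_{\mathrm{input}}(\hat{\mathbf{R}}).$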
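The chain from snapshots to the DNN input can be sketched in a few lines of NumPy. The function names `sample_covariance` and `f_input` and the random test data are illustrative assumptions; only the operations of Equations (4) and (7)–(9) are taken from the text.

```python
import numpy as np

def sample_covariance(r):
    """Equation (4): r has shape (M, N), one column per snapshot."""
    n_snapshots = r.shape[1]
    return r @ r.conj().T / n_snapshots

def f_input(R_hat):
    """Equations (7)-(9): map an M x M covariance estimate to the DNN input."""
    M = R_hat.shape[0]
    rows, cols = np.triu_indices(M, k=1)        # row-major order, as in Equation (7)
    b = R_hat[rows, cols]                       # M(M-1)/2 complex entries
    b1 = np.concatenate([b.real, b.imag])       # M(M-1) real entries, Equation (8)
    centered = b1 - b1.mean()
    return centered / np.linalg.norm(centered)  # unit l2 norm, Equation (9)

# Illustrative data: noise-only snapshots for a 10-sensor array.
rng = np.random.default_rng(0)
M, N = 10, 1000
r = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
x = f_input(sample_covariance(r))
print(x.shape)   # (90,)
```

For $M = 10$ the input has $M(M-1) = 90$ real entries, with zero mean and unit $\ell_2$ norm by construction.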

The existing DNN in [21] trains the neural network by the simulation data generated by a fixed number of sources, and the generalization to a different number of sources is not good. In addition, in the inference stage, DNN in [21] assumes the number of sources is known, which is not satisfied in the applications for detecting non-cooperative targets. In the following, we propose a method which trains the neural network with different numbers of sources and is able to estimate the number of sources and DOAs simultaneously.

3. Proposed Method

To estimate the source number and DOAs efficiently, we propose a method that combines a DNN and clustering, called DNN-C. Considering that the sources are few (i.e., spatially sparse), DNN-C adopts the simple fully connected DNN given in [21] and trains it with mixed data generated under different numbers of sources and under multiple low SNRs. Then, we exploit the prior knowledge of array signal processing, including the fact that an array of $M$ sensors identifies at most $M-1$ sources and the field of view (FOV) (i.e., angle scope) of an array, which is determined by the antenna pattern. In addition, we utilize the prior knowledge that the maximum peak in the spatial spectrum output by the DNN is equal to 1 since, in the training stage, we set the label to 1 if there is a source at the grid, or 0 otherwise. Thus, we develop a specified K2-means clustering for the output spatial spectrum to complete the tasks of source number estimation and DOA estimation. The scheme of the DNN-C method is shown in Figure 2 and explained below.

3.1. DNN Structure

The DNN structure is given in Figure 3, which consists of a multi-task autoencoder and multi-layer classifiers. The input of the autoencoder is the vector x in Equation (10). The autoencoder operates as a set of spatial filters, which provides inputs to subsequent classifiers and helps reduce the burden on subsequent classifiers. The outputs of the classifiers are cascaded to reconstruct a conjunction vector (also named as a spatial spectrum and denoted as y).

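In the DNN model, the size of the output layer (denoted as $\gamma$) for each multi-layer classifier is

(11) $\gamma = \left\lceil \frac{\theta_{\max} - \theta_{\min}}{\eta p} \right\rceil,$

where $\lceil x \rceil$ denotes the smallest integer not smaller than $x$, $\eta$ is the grid step, and $p$ is the number of multi-layer classifiers. $\theta_{\min}$ and $\theta_{\max}$ represent the minimum and maximum detectable angles, respectively. Thus, the FOV of the array is $[\theta_{\min}, \theta_{\max})$.

Then, the conjunction vector $\mathbf{y}$ has a size of $\gamma p$. Therefore, the function of the DNN can be represented by $f_{\mathrm{DNN}}$ below:

(12) $\mathbf{y} = f_{\mathrm{DNN}}(\mathbf{x}).$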

3.2. New Strategy for Training Data Generation

3.2.1. Motivation

It is well known that the training data are critical for the inference of the DNN model. Herein, we propose to generate the mixed training data with multiple numbers of sources and multiple low SNRs. In this way, the trained DNN model gains the abilities of source number detection and DOA estimation as well in low to high SNRs. The new training data generation process is given below.

3.2.2. Training Data in the Case of Different Source Numbers

Different from DNN in [21], we train DNN with mixed data under different numbers of sources, enabling the inference of DNN in the absence and presence of sources.

For example, we generate the training data in the absence of source signals (i.e., K=0) as follows:

(13) $\mathbf{r}_0(t) = \mathbf{n}(t),$

where the noise $\mathbf{n}(t)$ is uniform.

We generate the signal at the $z$-th trial as

(14) $\mathbf{r}_0^{(z)}(t) = \mathbf{n}^{(z)}(t).$

It is noticed that the noise is randomly generated by the Gaussian-distributed process at each trial.

In the case of one source signal, the received data are

(15) $\mathbf{r}_1(t) = \mathbf{a}(\theta_1)s_1(t) + \mathbf{n}(t).$

It is noted that the number of possible angles of a single source is

(16) $I = \frac{\theta_{\max} - \theta_{\min}}{\eta}.$

We then generate data for each angle $\theta_i$ at the $z$-th trial as

(17) $\mathbf{r}_1^{(i,z)}(t) = \mathbf{a}(\theta_i)s_1^{(i,z)}(t) + \mathbf{n}^{(i,z)}(t),$

where the superscript $i$ indicates the angle located at the $i$-th grid, for $i = 1, \ldots, I$. The signal and noise at each grid and each trial are generated randomly according to a Gaussian distribution.

In the case of two source signals, the received data are

(18) $\mathbf{r}_2(t) = \mathbf{a}(\theta_1)s_1(t) + \mathbf{a}(\theta_2)s_2(t) + \mathbf{n}(t).$

Since the order of the two DOAs does not change the representation of the array signal model, we generate data for a pair of DOAs $(\theta_i, \theta_i + \Delta_j)$ at the $z$-th trial as

(19) $\mathbf{r}_2^{(i,j,z)}(t) = \mathbf{a}(\theta_i)s_1^{(i,j,z)}(t) + \mathbf{a}(\theta_i + \Delta_j)s_2^{(i,j,z)}(t) + \mathbf{n}^{(i,j,z)}(t).$

It is noted that the signal and noise at each pair of angles and each trial are produced randomly according to a Gaussian distribution.

In Equation (19), $\Delta_j$ is the angle interval between the DOAs of the two sources, and it is taken from the set $\{\Delta_{\min}, \Delta_{\min} + \Delta, \Delta_{\min} + 2\Delta, \ldots, \Delta_{\max}\}$, which has $J$ elements, where $\Delta_{\min}$ and $\Delta_{\max}$ define the minimum and maximum separations between the two DOAs, respectively. In addition, we use $\Delta$ to denote the increment of the DOA separation. Therefore, for each $\Delta_j$, we have the set of angle pairs $(\theta_i, \theta_i + \Delta_j)$, $i = 1, 2, \ldots, I_j$, and the number of angle pairs $I_j$ is given as

(20) $I_j = \frac{\theta_{\max} - \theta_{\min} - \Delta_j}{\eta}.$

Based on the received data when $K = 0, 1, 2$ (i.e., $\mathbf{r}_0^{(z)}(t)$, $\mathbf{r}_1^{(i,z)}(t)$, and $\mathbf{r}_2^{(i,j,z)}(t)$) and according to Equation (4), we obtain the estimated covariance matrices when $K = 0, 1, 2$, that is, $\hat{\mathbf{R}}_0^{(z)}$, $\hat{\mathbf{R}}_1^{(i,z)}$, and $\hat{\mathbf{R}}_2^{(i,j,z)}$. Afterwards, by the $f_{\mathrm{input}}$ function in Equation (10), we obtain the input vectors in the cases of $K = 0, 1, 2$ as $\mathbf{x}_0^{(z)}$, $\mathbf{x}_1^{(i,z)}$, and $\mathbf{x}_2^{(i,j,z)}$, respectively.

3.2.3. Labeling for Different Numbers of Sources

It is noted that in each trial, the source signal and noise are randomly generated. However, the label vectors are only relevant to the DOAs of sources, and they are regardless of the trials and thus we ignore the index of trials in the following label vectors.

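In the absence of sources (i.e., $K = 0$), we set the label vector $\mathbf{L}_0$ as a vector of zeros, that is

(21) $\mathbf{L}_0 = [0, \ldots, 0]^T.$

In the case of one source (i.e., $K = 1$), we set the label vector $\mathbf{L}_1^{(i)}$ for the angle $\theta_i$ as

(22) $\mathbf{L}_1^{(i)} = [0, \ldots, 0, 1, 0, \ldots, 0]^T.$

It is noted that the element 1 in $\mathbf{L}_1^{(i)}$ appears at the grid corresponding to the DOA of the source, and the rest of the elements are equal to 0.

In the case of two sources (i.e., $K = 2$), we set the label vector $\mathbf{L}_2^{(i,j)}$ for $(\theta_i, \theta_i + \Delta_j)$ as

(23) $\mathbf{L}_2^{(i,j)} = [0, \ldots, 0, 1, 0, \ldots, 0, 1, 0, \ldots, 0]^T.$

In $\mathbf{L}_2^{(i,j)}$, the element 1 appears at the two angles corresponding to the pair of DOAs $(\theta_i, \theta_i + \Delta_j)$ of the two sources, and the rest of the elements are equal to 0.

As a result, we obtain the training data sets for $K = 0, 1, 2$ as follows:

(24) $\Omega_0 = \{(\mathbf{x}_0^{(z)}, \mathbf{L}_0),\ z = 1, \ldots, Z_0\},$

(25) $\Omega_1 = \{(\mathbf{x}_1^{(i,z)}, \mathbf{L}_1^{(i)}),\ z = 1, \ldots, Z_1,\ i = 1, \ldots, I\},$

(26) $\Omega_2 = \{(\mathbf{x}_2^{(i,j,z)}, \mathbf{L}_2^{(i,j)}),\ z = 1, \ldots, Z_2,\ i = 1, \ldots, I_j,\ j = 1, \ldots, J\},$

where $Z_0$, $Z_1$, and $Z_2$ represent the numbers of trials for each direction setting for $K = 0, 1, 2$, respectively. The mixed training data set is

(27) $\Omega = \{\Omega_0, \Omega_1, \Omega_2\}.$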
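The generation and labeling rules of Equations (13)–(23) can be sketched as follows. This is a minimal illustration under stated assumptions: a half-wavelength ULA steering vector, unit-power complex Gaussian noise, equal-power complex Gaussian sources, an FOV of $[-60°, 60°)$ with a 1° grid, and hypothetical helper names (`make_snapshots`, `make_label`). The mapping from snapshots to the input vector via $f_{\mathrm{input}}$ is omitted.

```python
import numpy as np

THETA_MIN, THETA_MAX, ETA = -60.0, 60.0, 1.0        # assumed FOV and grid step
GRID = np.arange(THETA_MIN, THETA_MAX, ETA)          # I = 120 grid angles

def ula_steering(theta_deg, M, spacing=0.5):
    """ULA steering vector; spacing in wavelengths (half-wavelength assumed)."""
    m = np.arange(M)
    return np.exp(2j * np.pi * spacing * m * np.sin(np.deg2rad(theta_deg)))

def complex_gaussian(shape, rng):
    """Unit-power circular complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def make_snapshots(doas_deg, snr_db, M=10, N=1000, rng=None):
    """Equations (13), (15), (18): noise only when doas_deg is empty."""
    rng = np.random.default_rng() if rng is None else rng
    r = complex_gaussian((M, N), rng)                # uniform noise, unit power
    if len(doas_deg) > 0:
        A = np.stack([ula_steering(t, M) for t in doas_deg], axis=1)
        s = 10 ** (snr_db / 20) * complex_gaussian((len(doas_deg), N), rng)
        r = r + A @ s
    return r

def make_label(doas_deg):
    """Equations (21)-(23): 1 at each source grid point, 0 elsewhere."""
    label = np.zeros(len(GRID))
    for theta in doas_deg:
        label[int(round((theta - THETA_MIN) / ETA))] = 1.0
    return label

rng = np.random.default_rng(0)
theta_i, delta_j = -30.0, 12.0                       # one illustrative angle pair
cases = [[], [theta_i], [theta_i, theta_i + delta_j]]  # K = 0, 1, 2
dataset = [(make_snapshots(d, snr_db=-4.0, rng=rng), make_label(d)) for d in cases]
print([int(lbl.sum()) for _, lbl in dataset])        # [0, 1, 2]
```

Each label depends only on the DOAs, so repeated trials at the same angles share one label vector while the snapshots differ.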

3.2.4. Training Data with Multiple SNRs

Moreover, we train the DNN with more data generated under multiple low SNRs. In this way, the DNN is adapted to different situations from low SNR to high SNR. We set the number of SNRs as $N_{\mathrm{snr}}$.

3.2.5. Data Balance for Different Numbers of Sources

In order to train the DNN model to gain equal inference ability for each case of $K = 0, 1, 2$, we need to balance the data by setting the numbers of elements in $\Omega_0$, $\Omega_1$, and $\Omega_2$ to be the same. In addition, in the absence of sources, SNR is not applicable. Thus, according to Equations (24)–(26), we have

(28) $Z_0 = N_{\mathrm{snr}} \times Z_1 \times I = N_{\mathrm{snr}} \times Z_2 \times \sum_{j=1}^{J} I_j.$

We follow Equation (28) to set the numbers of trials for each direction setting in the cases of different numbers of sources (i.e., $Z_0, Z_1, Z_2$) for the purpose of data balance.

Thus, according to Equations (27) and (28), we obtain the length of the training data as $3 \times Z_0$. It is noted that we use $K = 0, 1, 2$ as an example to explain the generation of mixed training data, and the generation for the cases when $K > 2$ is straightforward.
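The balance rule can be checked with plain arithmetic using the settings reported in Section 4 (FOV $[-60°, 60°)$, $\eta = 1°$, separations from 2° to 60° in steps of 2°, $N_{\mathrm{snr}} = 10$, $Z_2 = 10$). The variable names are illustrative. Dividing the $K = 2$ total by $I$ reproduces the per-angle count quoted there for $K = 1$; reading that count as pooled over all SNR levels is an assumption of this sketch.

```python
theta_min, theta_max, eta = -60, 60, 1
n_snr, z2 = 10, 10

separations = range(2, 61, 2)                            # 2, 4, ..., 60 degrees (J = 30)
I_j = [(theta_max - theta_min - d) // eta for d in separations]   # Equation (20)
I = (theta_max - theta_min) // eta                       # Equation (16)

n_two = sum(I_j) * n_snr * z2                            # number of K = 2 samples
print(len(I_j), I_j[0], I_j[-1], sum(I_j))               # 30 118 60 2670
print(n_two)                                             # 267000
print(n_two / I)                                         # 2225.0 samples per angle for K = 1
print(3 * n_two)                                         # 801000 samples in total
```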

3.3. Specified K2-Means Clustering

3.3.1. Motivation

DNN-C is then trained with the simulated data generated according to the procedure in Section 3.2. Afterward, in Figure 4, we illustrate the spatial spectrum of the trained DNN by testing it with the real data collected by a uniform circular array (UCA) (please refer to Section 5 for the detailed descriptions).

From Figure 4, we notice that the spatial spectrum given by the DNN model gives a peak at the true DOA of a source (i.e., 8°, marked with a red cross). However, it suffers from a pseudo peak which is located at 17.8° beside the true DOA. Therefore, the simple peak searching for source number detection will give a false alarm. This fact motivates us to develop a specified K2-means clustering to identify the number of sources automatically.

3.3.2. Specified Peak Searching

We consider the conjunction vector $\mathbf{y}$ of the DNN as a spatial spectrum. Moreover, we exploit the prior knowledge that an array of $M$ sensors can identify at most $M-1$ sources. We then form a peak vector $\mathbf{p}$ by selecting the $M-1$ largest peaks from the vector $\mathbf{y}$ and obtain its corresponding angle vector below:

(29) $\mathbf{p} = f_{\mathrm{peaks}}(\mathbf{y}) = [p_1, \ldots, p_{M-1}]^T,$

(30) $\boldsymbol{\Theta} = [\hat{\theta}_1, \ldots, \hat{\theta}_{M-1}]^T,$

where $f_{\mathrm{peaks}}$ is a function which returns the $M-1$ largest peaks of $\mathbf{y}$, and the peaks $p_m$, $m = 1, \ldots, M-1$, are listed in descending order. $\hat{\theta}_m$ is the angle corresponding to $p_m$.

3.3.3. Addition of Virtual Peak

In order to identify the absence of sources, we extend the peak vector to $\tilde{\mathbf{p}}$ by adding one virtual peak with a specified amplitude $p_{\mathrm{vir}}$ to $\mathbf{p}$. Correspondingly, we add an arbitrary angle $\theta_{\mathrm{vir}}$ which is outside the FOV (that is, $\theta_{\mathrm{vir}} \notin [\theta_{\min}, \theta_{\max}]$) to the angle set $\boldsymbol{\Theta}$. Then, we obtain the extended peak vector $\tilde{\mathbf{p}}$ and its angle vector $\tilde{\boldsymbol{\Theta}}$ below:

(31) $\tilde{\mathbf{p}} = [p_1, \ldots, p_{M-1}, p_{\mathrm{vir}}]^T,$

(32) $\tilde{\boldsymbol{\Theta}} = [\hat{\theta}_1, \ldots, \hat{\theta}_{M-1}, \theta_{\mathrm{vir}}]^T.$

We herein use the prior knowledge that the maximum peak of the spatial spectrum $\mathbf{y}$ is equal to 1 since, in the training stage, we set the label as 1 if there is a source at the grid or 0 otherwise. Therefore, we have $0 < p_{\mathrm{vir}} < 1$. The refinement of $p_{\mathrm{vir}}$ is given in Section 3.3.7. In addition, the angles of the spatial spectrum of DNN-C are not larger than $\theta_{\max}$. Thus, the virtual peak can be located at any angle that is larger than $\theta_{\max}$. Herein, we set $\theta_{\mathrm{vir}} = \theta_{\max} + 1°$.

3.3.4. Normalization

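We normalize the extended peak vector to obtain $\tilde{\mathbf{p}}_{\mathrm{norm}}$, which prevents large elements from dominating the distance-based objective function of the subsequent clustering:

(33) $\tilde{\mathbf{p}}_{\mathrm{norm}} = (\tilde{\mathbf{p}} - \mathrm{mean}\{\tilde{\mathbf{p}}\}) / \mathrm{var}\{\tilde{\mathbf{p}}\},$

where $\mathrm{mean}\{\tilde{\mathbf{p}}\}$ and $\mathrm{var}\{\tilde{\mathbf{p}}\}$ are the mean and variance of $\tilde{\mathbf{p}}$, respectively.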

3.3.5. Specified Clustering

It is noted that the number of elements in $\tilde{\mathbf{p}}_{\mathrm{norm}}$ is as small as $M$. Thus, we develop a K2-means clustering algorithm for efficiently clustering the elements in $\tilde{\mathbf{p}}_{\mathrm{norm}}$ by specializing the parameters of K-means to fit source number detection and DOA estimation in array signal processing. In particular, we transform the task of source detection into a binary clustering problem of noise and sources and thus set the number of clusters as $N_c = 2$. Moreover, we set the initial centroids as $\mathrm{centroids} = (0, p_{\mathrm{vir}})$. Therefore, the developed K2-means clustering is given as

(34) $(\tilde{\mathbf{c}}_{\mathrm{noise}}, \tilde{\mathbf{c}}_{\mathrm{sig}}) = f_{\mathrm{K2means}}(\tilde{\mathbf{p}}_{\mathrm{norm}}, N_c, \mathrm{centroids}),$

(35) $\tilde{\boldsymbol{\Theta}}_{\mathrm{sig}} \longleftrightarrow \tilde{\mathbf{c}}_{\mathrm{sig}},$

where $f_{\mathrm{K2means}}$ represents the improved K-means clustering, which is herein designed with the specified $N_c$ and centroids. $(\tilde{\mathbf{c}}_{\mathrm{noise}}, \tilde{\mathbf{c}}_{\mathrm{sig}})$ are the binary clusters (i.e., the noise cluster and the source signal cluster, respectively). $\longleftrightarrow$ denotes one-to-one correspondence. $\tilde{\boldsymbol{\Theta}}_{\mathrm{sig}}$ denotes the angle set corresponding to $\tilde{\mathbf{c}}_{\mathrm{sig}}$.

It is noted that the K2-means clustering is different from the general K-means. This is because the proposed K2-means clustering fixes the number of clusters as 2 and the initial centroids as $(0, p_{\mathrm{vir}})$. Because neither the number of clusters nor the initial centroids has to be selected, the proposed K2-means clustering is more efficient and effective than the general K-means.

3.3.6. Masking of Signal Cluster and Estimated Results

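After clustering, we select the cluster $\tilde{\mathbf{c}}_{\mathrm{sig}}$ which has the largest centroid as the source signal cluster and remove the virtual peak by a mask function $f_{\mathrm{mask}}$ below:

(36) $\hat{\mathbf{c}}_{\mathrm{sig}} = f_{\mathrm{mask}}(\tilde{\mathbf{c}}_{\mathrm{sig}}), \quad \hat{\boldsymbol{\Theta}}_{\mathrm{sig}} \longleftrightarrow \hat{\mathbf{c}}_{\mathrm{sig}},$

where $f_{\mathrm{mask}}$ keeps an element of $\tilde{\mathbf{c}}_{\mathrm{sig}}$ if the corresponding angle in $\tilde{\boldsymbol{\Theta}}_{\mathrm{sig}}$ is within $[\theta_{\min}, \theta_{\max})$, and deletes the element otherwise. $\hat{\boldsymbol{\Theta}}_{\mathrm{sig}}$ is the angle vector corresponding to $\hat{\mathbf{c}}_{\mathrm{sig}}$. In this way, the virtual peak is identified and removed from the set $\hat{\mathbf{c}}_{\mathrm{sig}}$.

We estimate the number of sources $\hat{K}$ as the length of $\hat{\boldsymbol{\Theta}}_{\mathrm{sig}}$ and estimate the DOAs of the $\hat{K}$ sources as $\hat{\boldsymbol{\Theta}}_{\mathrm{sig}}$.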

3.3.7. Illustration of the Clustering Results with Different Virtual Peaks

The virtual peak $p_{\mathrm{vir}}$ is a hyperparameter and we have $0 < p_{\mathrm{vir}} < 1$ as given in Section 3.3.3. In the following, we further provide a refined range of the proper peak value by numerically analyzing the clustering results of DNN-C in the testing stage. We consider a uniform linear array (ULA) of 10 sensors. Similar to [21], we set the FOV of the ULA to be $[\theta_{\min} = -60°, \theta_{\max} = 60°)$. According to Section 3.3.3, we insert the virtual peak $p_{\mathrm{vir}}$ at the angle $\theta_{\max} + 1° = 61°$. We numerically analyze the results in the absence and presence of sources to obtain a range of the proper virtual peak $p_{\mathrm{vir}}$ as follows.

  (a) Absence of sources.

Firstly, we consider the noise in the absence of sources with 1000 snapshots and give the clustering result of DNN-C with different virtual peaks in Figure 5.

In Figure 5, the red dots represent the elements in the source signal cluster and the purple dots represent the elements in the noise cluster. It is noted that the red dot at 61° corresponds to the virtual peak. From Figure 5a,b, we observe that if the virtual peak is too small, such as 0.2 and 0.1, the clustering results give a false alarm, and DNN-C estimates the number of sources as 1. In contrast, when the virtual peak is as large as 0.3 and 0.4, there is no false alarm, and DNN-C correctly estimates the number of sources as 0. Therefore, we conclude that in order to avoid false alarms in the absence of sources, $p_{\mathrm{vir}} \geq 0.3$.

  (b) Presence of sources.

Herein, we consider two sources with SNR=2 dB and 1000 snapshots, and the DOAs of two sources are (θ1=40.1°,θ2=28.2°). The clustering results of DNN-C with different virtual peaks $p_{\mathrm{vir}}$ are given in Figure 6. From Figure 6, we observe that when the virtual peak is smaller than 0.6, such as 0.5, 0.4, and 0.3, DNN-C classifies the two peaks around θ1=40.1° and θ2=28.2° into the source signal cluster, and thus it correctly estimates the number of sources as 2. On the other hand, when the virtual peak is as large as 0.6, DNN-C classifies the peak around θ1=40.1° into the noise cluster and the peak around θ2=28.2° into the source signal cluster. Therefore, in this case, DNN-C suffers from a missed detection and incorrectly estimates the number of sources as 1. As a result, we conclude that in order to avoid missed detections in the presence of sources, $p_{\mathrm{vir}} \leq 0.5$.

According to the above-mentioned clustering results, we observe that $0.3 \leq p_{\mathrm{vir}} \leq 0.5$. In the following simulation results, we set $p_{\mathrm{vir}} = 0.4$.

3.4. Summary of Proposed DNN-C Method

The pseudocode of the proposed DNN-C method is summarized in Algorithm 1.

Algorithm 1 Proposed DNN-C method in testing stage.
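  • Require: Array data $\mathbf{r}(t)$, $t = 1, \ldots, N$, and $p_{\mathrm{vir}}$.

  • // Estimate the covariance matrix

    $\hat{\mathbf{R}} = \frac{1}{N} \sum_{t=1}^{N} \mathbf{r}(t)\mathbf{r}^H(t).$

  • // Construct input vector for DNN

    $\mathbf{x} = f_{\mathrm{input}}(\hat{\mathbf{R}}).$

  • // Get conjunction output vector of DNN

    $\mathbf{y} = f_{\mathrm{DNN}}(\mathbf{x}).$

  • // Obtain the vector of the largest $M-1$ peaks and the corresponding angle vector

    $\mathbf{p} = f_{\mathrm{peaks}}(\mathbf{y}) = [p_1, \ldots, p_{M-1}]^T,$

    $\boldsymbol{\Theta} = [\hat{\theta}_1, \ldots, \hat{\theta}_{M-1}]^T.$

  • // Add virtual peak

    $\tilde{\mathbf{p}} = [p_1, \ldots, p_{M-1}, p_{\mathrm{vir}}]^T,$

    $\tilde{\boldsymbol{\Theta}} = [\hat{\theta}_1, \ldots, \hat{\theta}_{M-1}, \theta_{\mathrm{vir}}]^T.$

  • // Normalization

    $\tilde{\mathbf{p}}_{\mathrm{norm}} = (\tilde{\mathbf{p}} - \mathrm{mean}\{\tilde{\mathbf{p}}\}) / \mathrm{var}\{\tilde{\mathbf{p}}\}.$

  • // K2-means clustering with $N_c = 2$ and $\mathrm{centroids} = (0, p_{\mathrm{vir}})$

    $(\tilde{\mathbf{c}}_{\mathrm{noise}}, \tilde{\mathbf{c}}_{\mathrm{sig}}) = f_{\mathrm{K2means}}(\tilde{\mathbf{p}}_{\mathrm{norm}}, N_c, \mathrm{centroids}).$

  • // Remove virtual peak

    $\hat{\mathbf{c}}_{\mathrm{sig}} = f_{\mathrm{mask}}(\tilde{\mathbf{c}}_{\mathrm{sig}}), \quad \hat{\boldsymbol{\Theta}}_{\mathrm{sig}} \longleftrightarrow \hat{\mathbf{c}}_{\mathrm{sig}}.$

  • Return: Estimated number of sources $\hat{K}$: the length of $\hat{\boldsymbol{\Theta}}_{\mathrm{sig}}$; estimated DOAs of the $\hat{K}$ sources: $\hat{\boldsymbol{\Theta}}_{\mathrm{sig}}$.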
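The post-DNN steps of Algorithm 1 can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the authors' implementation: the spatial spectra `y_two` and `y_none` are hand-made stand-ins for DNN outputs, peaks are taken as simple local maxima, the two-cluster K-means is a plain Lloyd iteration, and all function names are hypothetical. The normalization divides by the variance and the initial centroids are $(0, p_{\mathrm{vir}})$, as the text states.

```python
import numpy as np

def f_peaks(y, grid_deg, n_peaks):
    """Amplitudes and angles of the n_peaks largest local maxima of y."""
    i = np.arange(1, len(y) - 1)
    idx = i[(y[i] > y[i - 1]) & (y[i] >= y[i + 1])]
    idx = idx[np.argsort(y[idx])[::-1][:n_peaks]]
    return y[idx], grid_deg[idx]

def k2_means(values, init_centroids, max_iter=100):
    """1-D K-means with two clusters and fixed initial centroids."""
    centroids = np.asarray(init_centroids, dtype=float)
    for _ in range(max_iter):
        labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        updated = np.array([values[labels == c].mean() if np.any(labels == c)
                            else centroids[c] for c in range(2)])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids

def dnn_c_postprocess(y, grid_deg, M, p_vir=0.4):
    """Peaks -> virtual peak -> normalization -> clustering -> masking."""
    theta_min = grid_deg[0]
    theta_max = grid_deg[-1] + (grid_deg[1] - grid_deg[0])
    p, theta = f_peaks(y, grid_deg, M - 1)
    p_ext = np.append(p, p_vir)                       # Equation (31)
    theta_ext = np.append(theta, theta_max + 1.0)     # Equation (32)
    p_norm = (p_ext - p_ext.mean()) / p_ext.var()     # Equation (33)
    labels, centroids = k2_means(p_norm, (0.0, p_vir))
    in_signal = labels == np.argmax(centroids)        # cluster with largest centroid
    in_fov = (theta_ext >= theta_min) & (theta_ext < theta_max)   # f_mask
    doas = np.sort(theta_ext[in_signal & in_fov])
    return len(doas), doas

grid = np.arange(-60.0, 60.0, 1.0)
idx = [5, 15, 20, 35, 50, 60, 75, 88, 100]            # isolated peak positions
y_two = np.zeros(len(grid))
y_two[idx] = [0.08, 0.06, 0.95, 0.05, 0.04, 0.03, 0.02, 0.90, 0.01]
y_none = np.zeros(len(grid))
y_none[idx] = [0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01]

print(dnn_c_postprocess(y_two, grid, M=10))           # two sources, at -40 and 28 degrees
print(dnn_c_postprocess(y_none, grid, M=10))          # no source detected
```

In the second spectrum every real peak is small, so the virtual peak ends up alone in the signal cluster and is removed by the mask, which yields an estimate of zero sources.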

3.5. Analysis of Computational Complexity

In this section, similar to [27], we quantify the primary computational complexity through the calculation of real-valued multiplications as given in Table 1.

In this table, for the spectrum searching in MUSIC, we define the grid step as $\eta_1$, and then the number of grids is

(37) $\tilde{\gamma} = \frac{\theta_{\max} - \theta_{\min}}{\eta_1}.$

In Table 1, for MDL, the primary computational complexity is due to the eigendecomposition of the array covariance matrix with a size of $M \times M$. NUMDL performs $M$ eigendecompositions of array covariance matrices with a size of $(M-1) \times (M-1)$. For MUSIC, the first term is due to the spatial spectrum searching, and the second one is caused by the eigendecomposition of the array covariance matrix with a size of $M \times M$. For SSM, the first term of computational complexity is for the calculation of the projection matrix on the matrix of the received signal with a dimension of $M \times N$. The second term accounts for the eigendecomposition of the array covariance matrix with a size of $M \times M$. The third one is for calculating the SSM criterion with a maximum of $M-1$ iterations.

For a linear array of 10 elements, we set the number of snapshots as $N = 1000$, the grid steps $\eta = \eta_1 = 1°$, $p = 3$, $\theta_{\min} = -60°$, and $\theta_{\max} = 60°$. Thus, we have $\gamma = 40$ and $\tilde{\gamma} = 120$. In this case, the result is given in the last column of Table 1. We find that DNN-C has the lowest computational complexity and SSM has significantly higher computational complexity due to $N^3$, which corresponds to the testing time in Figure 7 below.

3.6. Evaluation Criterion

3.6.1. Source Number Detection

In most cases, such as non-cooperative target detection by a sensor array, the number of targets is unknown. Therefore, we use the probability of detection (PD) to measure the performance of the source number detection, which is often used in the literature as well. In addition, we notice that if the PD equals 1, then the number of targets is detected 100% correctly. On the other hand, if the PD is less than 1, then the detected number of targets can be larger or smaller than the true number of targets. In order to further indicate the relation between the detected number and the true number, similar to [28,29], we define FAR (false alarm rate) and PMD (probability of missed detection). FAR is used to indicate the false alarms, and PMD is used to show the situation of missed detection. PD is defined below:

(38) $\mathrm{PD} = \frac{100\%}{L} \sum_{i=1}^{L} f_{\mathrm{PD}}(K, \hat{K}_i),$

where $f_{\mathrm{PD}}$ is given below:

(39) $f_{\mathrm{PD}}(K, \hat{K}_i) = \begin{cases} 1, & \text{if } \hat{K}_i = K \\ 0, & \text{if } \hat{K}_i \neq K, \end{cases}$

where $L$ represents the number of Monte Carlo trials, and $\hat{K}_i$ represents the estimated number of sources in the $i$-th trial.

We define the FAR as follows:

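(40) $\mathrm{FAR} = \frac{1}{L} \sum_{i=1}^{L} f_{\mathrm{FAR}}(K, \hat{K}_i),$

and the function $f_{\mathrm{FAR}}$ is given below:

(41) $f_{\mathrm{FAR}}(K, \hat{K}_i) = \begin{cases} \hat{K}_i - K, & \hat{K}_i > K \\ 0, & \hat{K}_i \leq K. \end{cases}$

It is noted in Equation (41) that a false alarm occurs when $\hat{K}_i > K$. Similarly, we define the PMD as follows:

(42) $\mathrm{PMD} = \frac{100\%}{L} \sum_{i=1}^{L} f_{\mathrm{PMD}}(K, \hat{K}_i),$

where the function $f_{\mathrm{PMD}}$ is given below:

(43) $f_{\mathrm{PMD}}(K, \hat{K}_i) = \begin{cases} \dfrac{K - \hat{K}_i}{K}, & \hat{K}_i < K \\ 0, & \hat{K}_i \geq K. \end{cases}$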
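The three detection criteria of Equations (38)–(43) can be computed directly from a list of per-trial estimates. The function name `detection_metrics` and the five-trial example are illustrative assumptions.

```python
def detection_metrics(K, K_hat):
    """Return (PD in %, FAR, PMD in %) for true source number K and estimates K_hat."""
    L = len(K_hat)
    pd = 100.0 * sum(1 for k in K_hat if k == K) / L              # Equations (38)-(39)
    far = sum(k - K for k in K_hat if k > K) / L                  # Equations (40)-(41)
    pmd = 100.0 * sum((K - k) / K for k in K_hat if k < K) / L    # Equations (42)-(43)
    return pd, far, pmd

# Five illustrative trials with K = 2: one false alarm and one missed source.
pd, far, pmd = detection_metrics(2, [2, 2, 3, 1, 2])
print(pd, far, pmd)   # 60.0 0.2 10.0
```

When $K = 0$ no estimate can fall below $K$, so the PMD sum is empty and the division by $K$ is never evaluated.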

3.6.2. DOA Estimation

We use the root mean square error (RMSE) to evaluate the DOA estimation performance. Moreover, false alarms can be removed by trajectory tracking after DOA estimation. In contrast, the missed detections of sources usually lead to serious results in early-warning systems for safety purposes. Thus, we ignore the false alarms and involve missed detections when calculating the RMSE of DOA estimation below:

(44) $\mathrm{RMSE} = \sqrt{\frac{1}{LK} \sum_{l=1}^{L} \left[ \sum_{k=1}^{K} (\hat{\theta}_{k,l} - \theta_k)^2 + \tilde{\Delta}^2 (K - \hat{K}_{l,\min}) \right]},$

where $\hat{\theta}_{k,l}$ is the estimated value of $\theta_k$ at the $l$-th trial, $\hat{K}_{l,\min} = \min\{\hat{K}_l, K\}$, and $\hat{K}_l$ is the estimated number of sources at the $l$-th trial. $\tilde{\Delta}$ is an arbitrary value and is set as 20° so that the RMSE axis covers the large RMSE values while still showing the details of small RMSE values. It is noted that $(K - \hat{K}_{l,\min})$ in Equation (44) is larger than zero if $\hat{K}_l < K$ (i.e., missed detections occur). Otherwise, it is zero. Therefore, Equation (44) considers missed detections, but it does not consider false alarms.
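A small sketch of this penalized RMSE is given below. Two points are assumptions of the sketch and are not specified in the text: only the $\min\{\hat{K}_l, K\}$ available estimates enter the squared-error sum, and estimates are paired with true DOAs by the assignment that minimizes the squared error (found by brute force, which is adequate for the small $K$ considered here). The function names are hypothetical.

```python
import itertools
import math

def trial_squared_error(true_doas, est_doas, penalty_deg=20.0):
    """Squared error of one trial plus the missed-detection penalty."""
    K, K_hat = len(true_doas), len(est_doas)
    k_min = min(K, K_hat)
    if K_hat >= K:
        # Extra estimates (false alarms) are ignored: pick the best K of them.
        best = min(sum((e - t) ** 2 for e, t in zip(perm, true_doas))
                   for perm in itertools.permutations(est_doas, k_min))
    else:
        # Missed detections: match each estimate to a distinct true DOA.
        best = min(sum((e - t) ** 2 for e, t in zip(est_doas, perm))
                   for perm in itertools.permutations(true_doas, k_min))
    return best + penalty_deg ** 2 * (K - k_min)

def penalized_rmse(true_doas, trials, penalty_deg=20.0):
    """RMSE over L trials, normalized by L*K."""
    total = sum(trial_squared_error(true_doas, est, penalty_deg) for est in trials)
    return math.sqrt(total / (len(trials) * len(true_doas)))

true_doas = [-20.0, 30.0]
trials = [[-20.5, 30.5],          # both sources found, 0.5 degree error each
          [30.0],                 # one missed detection
          [-20.0, 5.0, 30.0]]     # one false alarm, ignored
rmse = penalized_rmse(true_doas, trials)
print(round(rmse, 3))             # 8.17
```

The single missed detection contributes $20^2 = 400$ to the sum and dominates the result, while the false alarm in the third trial contributes nothing.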

4. Simulation Results

In this section, we provide the comparisons between the DNN-C and traditional source number estimation methods including MDL [2], NUMDL [14], and SSM [15]. We also present the combined methods which first use the MDL, NUMDL, and SSM methods for estimating the number of sources and then employ the classical MUSIC method [6] with the estimated number of sources (named MDL-MUSIC, NUMDL-MUSIC, and SSM-MUSIC) for DOA estimation. In addition, except the cases we indicate clearly in Section 4.3.1 and Section 4.3.2, for MUSIC, the grid step is fixed as η1=1°. Moreover, Equation (31) in [13] is used to calculate the deterministic CRB as a lower bound in the case of nonuniform noise, which is an extension of the well-known results for the uniform noise to a more general noise model.

We consider a 10-element ULA with half-wavelength inter-element spacing. We generate the training data under uniform noise and for different numbers of sources (i.e., K=0,1,2). The SNRs are sampled from [−16 dB, 2 dB] at an interval of 2 dB, that is, Nsnr=10. The settings for the training data when K=0,1,2 are given below.

  • 1. For K=2, we set Δmin=2°, Δmax=60°, and Δ=2°. The DOAs of the two sources are (θ1, θ1+Δ2), where θ1 is generated within [−60°, 60°) with a grid of η=1° and Δ2 belongs to the set {2°, 4°, …, 60°}. Furthermore, 10 groups of input vectors are collected for each direction setting with random noise, that is, the number of trials for each direction setting is Z2=10. Therefore, for K=2, according to Equation (20), the number of input vectors is (J1+J2+⋯+J30)×Nsnr×Z2=(118+116+⋯+60)×10×10=267,000.

  • 2. For K=1, the DOA of the single source is sampled from [−60°, 60°) with a grid step of 1°. Therefore, according to Equation (16), the number of angles is I=120. According to the data balance derived in Equation (28), the number of trials for each direction setting is Z1=267,000/I=2225.

  • 3. For K=0, according to Equation (28), we obtain the number of trials as Z0=267,000.
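
The counts in the list above can be checked with a short Python sketch. The variable names are hypothetical, and the relation J_j = I − Δ_j is inferred from the values 118, 116, …, 60 listed in the text rather than taken from Equation (20) directly.

```python
# SNR grid: -16 dB to 2 dB in 2 dB steps (N_snr values).
snr_grid_db = list(range(-16, 3, 2))
n_snr = len(snr_grid_db)

# Integer DOAs in [-60, 60) with a 1 degree grid (I angles).
n_angles = len(range(-60, 60))

# Separations 2, 4, ..., 60 degrees for the two-source case.
separations = list(range(2, 61, 2))
# Assumed: J_j = I - separation, which reproduces 118, 116, ..., 60.
j_counts = [n_angles - d for d in separations]

z2 = 10                                  # trials per direction setting, K = 2
n_two_sources = sum(j_counts) * n_snr * z2
z1 = n_two_sources // n_angles           # balanced trials per angle, K = 1
z0 = n_two_sources                       # balanced trials, K = 0
n_total = 3 * n_two_sources

print(n_snr, n_angles, sum(j_counts))    # 10 120 2670
print(n_two_sources, z1, z0, n_total)    # 267000 2225 267000 801000
```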

Thus, the training data set contains 3×267,000=801,000 elements. The training data set only contains integer DOAs, whereas we use non-integer DOAs in the testing stage. Moreover, the source signals differ between the training and testing stages, which allows us to verify the generalization ability of DNN-C. Furthermore, nonuniform noise is present in the testing stage. We define the SNR for the k-th source as $\mathrm{SNR}_k=\sigma_{s_k}^2/\tilde{\sigma}_n^2$, where $\sigma_{s_k}^2$ is the power of the k-th source signal and $\tilde{\sigma}_n^2=\frac{1}{M}\sum_{m=1}^{M}\sigma_{n,m}^2$ is the averaged noise power. In the simulations for testing, we use nonuniform noise in the following two cases.

  • 1. We set the noise power at the second sensor to be the largest and equal to 4, and the others equal to 1. According to [13], the WNPR of the nonuniform noise is $\sigma_{\max}^2/\sigma_{\min}^2=4$.

  • 2. Similar to the setting in [13], we fix the noise powers at the first and eighth sensors as $\sigma_1^2=1$ and $\sigma_8^2=4$, respectively. The remaining noise powers are drawn from the interval $[\sigma_{\min}^2=\sigma_1^2,\ \sigma_{\max}^2=\sigma_8^2]$ using a uniform random generator. Thus, the WNPR of the nonuniform noise is again $\sigma_{\max}^2/\sigma_{\min}^2=4$.

It is noted that the WNPRs for Cases 1 and 2 are the same. However, the nonuniform noise power vectors defined in Equation (3) are different. In the testing stage, for K=0,1, we use the nonuniform noise in Case 1. For K=2, we use the nonuniform noise in Case 2. In addition, we set N=1000 and the number of Monte Carlo trials as L=200. The above parameters are fixed unless we indicate other values explicitly.
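
As a rough illustration of the two noise cases and of the SNR definition based on the averaged noise power, consider the following Python sketch. The helper names, the random seed, and the unit source power are hypothetical choices made only for this example.

```python
import math
import random
from typing import Sequence

M = 10  # number of sensors in the simulated ULA


def wnpr(noise_powers: Sequence[float]) -> float:
    """Ratio of the largest to the smallest sensor noise power."""
    return max(noise_powers) / min(noise_powers)


def average_noise_power(noise_powers: Sequence[float]) -> float:
    """Noise power averaged over the M sensors."""
    return sum(noise_powers) / len(noise_powers)


def snr_db(source_power: float, noise_powers: Sequence[float]) -> float:
    """SNR of one source relative to the averaged noise power, in dB."""
    return 10.0 * math.log10(source_power / average_noise_power(noise_powers))


# Case 1: second sensor has power 4, all others have power 1.
case1 = [1.0] * M
case1[1] = 4.0

# Case 2: first and eighth sensors fixed, the rest drawn uniformly from [1, 4].
rng = random.Random(0)  # hypothetical seed, for repeatability only
case2 = [rng.uniform(1.0, 4.0) for _ in range(M)]
case2[0], case2[7] = 1.0, 4.0

print(wnpr(case1), wnpr(case2))          # 4.0 4.0
print(average_noise_power(case1))        # 1.3
print(round(snr_db(1.0, case1), 3))      # SNR of a unit-power source in Case 1
```

Both cases share the same WNPR, yet their averaged noise powers differ, so the same source power corresponds to different SNRs in the two cases.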

4.1. Testing Time

A comparison of the testing time for all the methods is provided. For a fair comparison of the testing time, all the methods are executed on a Windows operating system with a 13th Gen Intel(R) processor clocked at 3.0 GHz, without the utilization of the NVIDIA GPU, during the testing stage. Based on 200 trials, the average testing time is given in Figure 7.

From Figure 7, we find that DNN-C is the most efficient: its testing time is about half that of MDL-MUSIC when the grid step is 1°. Note that SSM involves the inverse operation of the received array matrix with a dimension of M×N, leading to a heavy load when N=1000. In addition, MDL-MUSIC with a grid step of 0.1° requires a long testing time because of the high computational load of the spectrum search.

Note that the testing times of DNN-C, MDL-MUSIC, and NUMDL-MUSIC remain similar to those shown in Figure 7 for different SNRs and numbers of snapshots because, as shown in Table 1, their computational complexity is independent of the SNR and the number of snapshots. In contrast, the computation of SSM depends on the number of snapshots. For example, when we increase the number of snapshots to 2000, the testing time of SSM-MUSIC rises to 6120 ms; when we decrease the number of snapshots to 200, it falls to 655 ms.

4.2. Absence of Sources

In the absence of sources, K=0. We treat K=0 as a case to be identified because whether sources are present is unknown in practice. In this case, we measure the PD of K=0 and the FAR, whereas missed detection is meaningless. Therefore, we do not report the PMD or the RMSE of DOA estimation for K=0.

When K=0, we plot the PD and FAR versus the number of snapshots in Figure 8 to verify the ability of DNN-C to identify the absence of sources. From Figure 8a, we observe that DNN-C identifies the absence of sources with a PD higher than 94% even with as few as 20 snapshots. In contrast, the other methods suffer from serious false alarms, as shown in Figure 8b. This is because SSM can only detect the number of sources when K>0, MDL cannot cope with nonuniform noise, and NUMDL is by nature a reduced-dimensional MDL method.

4.3. Single Source

In this section, we consider a single source, that is, K=1.

4.3.1. DOA of Single Source

We first set SNR = 0 dB, and illustrate the performance against the DOA of a single source in Figure 9.

From Figure 9a–c, we find that DNN-C and SSM successfully detect the number of sources. In contrast, MDL gives false alarms due to nonuniform noise [30]. In addition, NUMDL always gives false alarms because it is by nature a reduced-dimension MDL method. Figure 9d shows that DNN-C and SSM-MUSIC have similar DOA estimation accuracy. Their DOA estimation performance deviates from the CRB because it is limited by the grid step of 1°.

It is noted in Figure 9d that MDL-MUSIC and NUMDL-MUSIC also have similar accuracy to that of DNN-C. This is because the MDL and NUMDL have false alarms. In particular, under false alarms, the spatial spectrum of the MUSIC method in Equation (6) is changed as

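$$P(\theta)=\left\|\sum_{m=\hat{K}+1}^{M}\mathbf{a}^{H}(\theta)\hat{\mathbf{u}}_m\hat{\mathbf{u}}_m^{H}\mathbf{a}(\theta)\right\|_2^{-2},\quad \theta\in[\theta_{\min},\theta_{\max}). \tag{45}$$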

It is worth noting that when $\hat{K}>K$ (i.e., under false alarms), the eigenvectors $\hat{\mathbf{u}}_{\hat{K}+1},\dots,\hat{\mathbf{u}}_M$ in Equation (45) form a subset of the noise-subspace eigenvectors and are therefore still orthogonal to the steering vectors of the sources. Consequently, in Figure 9d, MDL-MUSIC and NUMDL-MUSIC retain good DOA estimation performance despite the false alarms.
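
The orthogonality argument above can be illustrated numerically. The sketch below is a simplified illustration under stated assumptions, not the exact setup of the experiments: it uses the exact covariance matrix of one unit-power source in unit-power uniform noise, a common steering-vector sign convention, the conventional MUSIC pseudo-spectrum (reciprocal of the noise-subspace projection, whose normalization may differ from that of Equation (45)), and NumPy. All names are hypothetical.

```python
import numpy as np

M = 10                                   # sensors of a half-wavelength ULA
grid_deg = np.arange(-60, 60, 1.0)       # search grid with a 1 degree step


def steering(theta_deg: float) -> np.ndarray:
    """ULA steering vector for half-wavelength spacing (assumed sign convention)."""
    m = np.arange(M)
    return np.exp(-1j * np.pi * m * np.sin(np.deg2rad(theta_deg)))


def music_spectrum(R: np.ndarray, k_hat: int) -> np.ndarray:
    """Conventional MUSIC pseudo-spectrum built from M - k_hat eigenvectors."""
    _, vecs = np.linalg.eigh(R)           # eigenvalues in ascending order
    noise = vecs[:, : M - k_hat]          # eigenvectors of the smallest eigenvalues
    A = np.stack([steering(t) for t in grid_deg], axis=1)
    proj = np.sum(np.abs(noise.conj().T @ A) ** 2, axis=0)
    return 1.0 / (proj + 1e-12)           # small constant avoids division by zero


# Exact covariance: one unit-power source at 20 degrees plus unit-power noise.
a0 = steering(20.0)
R = np.outer(a0, a0.conj()) + np.eye(M)

for k_hat in (1, 3):                      # correct K and an overestimated K
    peak = grid_deg[np.argmax(music_spectrum(R, k_hat))]
    print(k_hat, peak)                    # the peak stays at 20.0 in both cases
```

With an overestimated source number, fewer noise eigenvectors are used, but each of them remains orthogonal to the true steering vector, so the spectrum peak does not move.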

Furthermore, when the grid step in MDL-MUSIC is changed to 0.1°, its estimation accuracy is close to the CRB. However, the price is high computational complexity: MDL-MUSIC with a grid step of 0.1° requires a much longer testing time than DNN-C, as shown in Figure 7.

4.3.2. SNR

We set θ1=20.2°, and illustrate the performance of the methods versus SNR in Figure 10.

From Figure 10a–c, we can see that both DNN-C and SSM estimate the number of sources correctly, even at a low SNR such as −12 dB. In contrast, the MDL and NUMDL methods suffer from false alarms. From Figure 10d, we observe that all the methods perform better as the SNR increases. Moreover, they are close to the CRB when the SNR is less than −6 dB. However, they deviate from the CRB at high SNRs because of the grid step of 1°. The good DOA estimation performance of MDL-MUSIC and NUMDL-MUSIC in the presence of false alarms is explained by Equation (45) and the discussion following it. Moreover, MDL-MUSIC with a grid step of 0.1° approaches the CRB as the SNR increases, at the price of a high computational load.

4.3.3. Number of Snapshots

We set SNR=4 dB, θ1=20.2°, and illustrate the performance of the methods versus the number of snapshots in Figure 11.

From Figure 11a–c, we observe that DNN-C and SSM perform well from small to large numbers of snapshots. In contrast, both MDL and NUMDL fail at source number detection and yield false alarms. From Figure 11d, we notice that DNN-C, NUMDL-MUSIC, and SSM-MUSIC have DOA estimation performance approaching the CRB for small numbers of snapshots. However, they deviate from the CRB for larger numbers of snapshots because of the limit imposed by the grid step of 1°.

4.4. Two Sources

In this section, we consider two sources, that is, K=2.

4.4.1. SNR

We set two sources with θ1=10.2° and θ2=40.1°, and give the performance versus SNR in Figure 12.

From Figure 12, it is illustrated that DNN-C performs the best in terms of PD and RMSE, especially when SNR<8 dB. In addition, as shown in Figure 12d, its DOA estimation accuracy improves as SNR increases. In contrast, as given in Figure 12c,d, when K=2, SSM performs well only in high SNRs such as SNR>0 dB. In addition, NUMDL tends to have missed detections when SNR<8 dB. It is worth noticing that Figure 12c indicates that the MDL method has false alarms, and Figure 12d shows that MDL-MUSIC fails in terms of DOA estimation regardless of SNRs. Therefore, in the case of false alarms, the MDL-MUSIC method is not robust.

By comparing the results of MDL, NUMDL, MDL-MUSIC and NUMDL-MUSIC in Figure 10 and Figure 12, we confirm that MDL, NUMDL, MDL-MUSIC and NUMDL-MUSIC are not robust.

4.4.2. DOA Separation

Moreover, we consider two sources with θ1=10.2° and θ2=θ1+Δ2, where Δ2 is the DOA separation between the two sources. In addition, SNR = 0 dB. Figure 13 illustrates the performance versus DOA separation.

From Figure 13a–c, we observe that DNN-C and NUMDL perform well in terms of PD. In contrast, SSM suffers from missed detections due to a low SNR, and MDL suffers from false alarms. In addition, from Figure 13d, we can see both DNN-C and NUMDL-MUSIC have DOA estimation accuracy of about 0.1°. On the other hand, MDL is not robust, and SSM fails in terms of DOA estimation.

4.4.3. WNPR

We consider two sources with θ1=10.2° and θ2=35.1°.

  • (a). SNR=4 dB and N=1000.

With the conditions of SNR=4 dB and N=1000, for the nonuniform noise in case 2, we increase the noise power at the eighth sensor (i.e., σ82) from 1 to 12, implying WNPR is changing from 1 to 12. Then, we present the performance versus WNPR in Figure 14.

From Figure 14a–c, we observe that DNN-C is immune to the WNPR because it uses only the off-diagonal upper-triangular elements of the array covariance matrix and thus excludes the noise powers from the input vector. In contrast, SSM fails due to the low SNR. MDL performs well only when WNPR=1, that is, when the noise is uniform. NUMDL behaves well in this case; however, as illustrated in Figure 9, Figure 10, Figure 11 and Figure 12, it is not robust because it is by nature a reduced-dimensional MDL method.
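
A small Python sketch can illustrate why an input built from the off-diagonal upper-triangular covariance elements does not depend on the sensor noise powers. It uses the exact (infinite-snapshot) covariance matrix with spatially uncorrelated noise; the steering-vector sign convention, the ordering of real and imaginary parts, and all function names are assumptions, and any normalization applied to the actual input vector is omitted.

```python
import cmath
import math
from typing import List, Sequence


def steering(theta_deg: float, m_sensors: int) -> List[complex]:
    """Half-wavelength ULA steering vector (assumed sign convention)."""
    phase = math.pi * math.sin(math.radians(theta_deg))
    return [cmath.exp(-1j * phase * m) for m in range(m_sensors)]


def exact_covariance(
    doas_deg: Sequence[float],
    source_powers: Sequence[float],
    noise_powers: Sequence[float],
) -> List[List[complex]]:
    """R = sum_k p_k a_k a_k^H + diag(noise powers) for uncorrelated sources."""
    m = len(noise_powers)
    R = [[0j] * m for _ in range(m)]
    for theta, power in zip(doas_deg, source_powers):
        a = steering(theta, m)
        for i in range(m):
            for j in range(m):
                R[i][j] += power * a[i] * a[j].conjugate()
    for i, q in enumerate(noise_powers):
        R[i][i] += q  # noise only enters the diagonal
    return R


def upper_offdiag_vector(R: List[List[complex]]) -> List[float]:
    """Real parts followed by imaginary parts of the elements above the diagonal."""
    m = len(R)
    elems = [R[i][j] for i in range(m) for j in range(i + 1, m)]
    return [e.real for e in elems] + [e.imag for e in elems]


doas, powers = [10.2, 35.1], [1.0, 1.0]
uniform = [1.0] * 10
nonuniform = [1.0, 4.0] + [1.0] * 8      # noise Case 1 of the simulations
v_uniform = upper_offdiag_vector(exact_covariance(doas, powers, uniform))
v_nonuniform = upper_offdiag_vector(exact_covariance(doas, powers, nonuniform))
print(len(v_uniform), v_uniform == v_nonuniform)  # 90 True
```

With a finite number of snapshots, the sample covariance contains small residual noise cross terms off the diagonal, so the invariance holds only approximately in practice.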

  • (b). SNR=6 dB and N=100.

We change the setting of SNR and the number of snapshots as SNR=6 dB and N=100, and keep other simulation settings the same as those in Figure 14. In this case, the performance versus WNPR is given in Figure 15.

From Figure 15a–c, we note that the NUMDL method suffers from missed detections, which results in a large DOA estimation error as shown in Figure 15d. The MDL method with a small number of snapshots in Figure 15 performs better than with a large number of snapshots in Figure 14. This phenomenon is observed in [30,31] as well. Overall, Figure 14 and Figure 15 illustrate that NUMDL, MDL, and their combinations with MUSIC are not robust. In contrast, the DNN-C method is always robust to nonuniform noise because its input vector does not contain the nonuniform noise powers.

  • (c). Correlation coefficient.

We consider two sources with θ1=20.2° and θ2=40.1°. The SNR is 10 dB, and the number of snapshots is 1000. The performance of the above-mentioned methods is given in Figure 16. From Figure 16, we observe the interesting phenomenon that the proposed DNN model performs well regardless of the correlation coefficient of the two source signals. A possible explanation is that the simulated training data contain the cases of 0, 1, and 2 sources. When the two sources are completely correlated (that is, the correlation coefficient is equal to 1), the number of sources effectively becomes 1, which corresponds to the single-source case. Therefore, the DNN model is able to adapt to correlated sources in the testing stage, even though the two sources in the training data set are uncorrelated. In contrast, the other methods such as SSM-MUSIC and NUMDL-MUSIC fail when the correlation coefficient is larger than 0.6.

5. Experimental Results Based on Real Data

This section demonstrates the generalization ability of the proposed DNN-C model trained with simulated data by testing it on real data. The real data are collected using a uniform circular array (UCA) of 8 sensors with a radius of 0.56 wavelength. The data are obtained after the Radio Frequency (RF) signal passes through a mixer and a low-pass filter and is then sampled by an Analog-to-Digital Converter (ADC) at a sampling rate of 375 kHz. The propagation environment is open space and free-field, which contains only direct waves and no multipath. Note that this UCA is composed of omnidirectional antennas and has a FOV of [θmin=0°, θmax=360°). Therefore, the location of the virtual peak is set as θmax+1°=361°. The grid step for MUSIC is 1°.

5.1. Presence of Source Signal

In the presence of a source signal, the data comprise an electronic reconnaissance signal and noise. The measured DOA of the source is 8°, and the number of snapshots is 1100. We estimate the SNR of the source as 50 dB, calculated as $10\log_{10}(\hat{\sigma}_s^2/\hat{\sigma}_n^2)$, where $\hat{\sigma}_s^2$ and $\hat{\sigma}_n^2$ are the estimated signal and noise powers, which are usually obtained from the eigenvalues of the array covariance matrix.

5.1.1. High SNR

Since the estimated SNR in the presence of a source signal is about 50 dB, the received real data have a high SNR. In this case, the spatial spectrum of DNN-C is shown in Figure 17. From Figure 17, we observe that DNN-C correctly estimates the number of sources and the DOA. Additionally, we notice that the maximum peak in Figure 17 is about 0.25 instead of 1, which might be caused by array error. We also notice a pseudo peak located at approximately 17°, which is classified as noise in Figure 17. Table 2 shows the results of the different algorithms. In Figure 18, we observe that NUMDL-MUSIC and SSM-MUSIC correctly estimate the DOA of the source, although their peaks at the DOA of the source are not sharp. In contrast, MDL-MUSIC fails because of the false alarms produced by MDL.

From Table 2, we find that all the methods except MDL (i.e., DNN-C, NUMDL, and SSM) correctly estimate the number of sources. Moreover, DNN-C, NUMDL-MUSIC, and SSM-MUSIC all correctly estimate the DOA as 8°.

5.1.2. Low SNR

Since the estimated SNR in the presence of sources is as high as 50 dB, we treat the received signal as the pure source signal. We then adjust the ratio of the power of this source signal to that of the noise received in the absence of sources, and sum the two to generate real data with an SNR of −10 dB.
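
The mixing step can be sketched as follows. This is an illustration under assumptions: the noise recording is scaled while the source recording is left unchanged (the text only states that the power ratio is adjusted), both recordings have the same size, and synthetic random data stand in for the measured recordings. The function names are hypothetical.

```python
import math
import random
from typing import List, Tuple

Matrix = List[List[complex]]  # M sensors x N snapshots


def mean_power(x: Matrix) -> float:
    """Power averaged over all sensors and snapshots."""
    total = sum(abs(v) ** 2 for row in x for v in row)
    count = sum(len(row) for row in x)
    return total / count


def mix_to_snr(source: Matrix, noise: Matrix, target_snr_db: float) -> Tuple[Matrix, float]:
    """Scale the noise recording (assumed choice) and add it to the source recording."""
    target_ratio = 10.0 ** (target_snr_db / 10.0)
    gain = math.sqrt(mean_power(source) / (target_ratio * mean_power(noise)))
    mixed = [
        [s + gain * n for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(source, noise)
    ]
    return mixed, gain


# Synthetic stand-ins for the two recordings (8 sensors, 1100 snapshots).
rng = random.Random(0)
src = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1100)] for _ in range(8)]
nse = [[complex(rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for _ in range(1100)] for _ in range(8)]

mixed, gain = mix_to_snr(src, nse, -10.0)
scaled_noise_power = gain ** 2 * mean_power(nse)
achieved_snr_db = 10.0 * math.log10(mean_power(src) / scaled_noise_power)
print(round(achieved_snr_db, 6))  # -10.0
```

Scaling the source instead of the noise would give the same SNR; only the absolute level of the mixed data would differ.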

In this case, the spatial spectra of DNN-C and the other methods are shown in Figure 19 and Figure 20, respectively. From Figure 19, we observe that at a low SNR of −10 dB, DNN-C still correctly estimates the number of sources as 1 and the DOA as 8°. Additionally, we notice that the maximum peak in Figure 19 is lower than that in Figure 17. On the other hand, from Figure 20, we observe that MDL-MUSIC fails, while NUMDL-MUSIC and SSM-MUSIC correctly estimate the DOA as 8°. By comparing Figure 18 and Figure 20, we can see that the sidelobes of NUMDL-MUSIC and SSM-MUSIC are significantly higher at a low SNR of −10 dB than at a high SNR of 50 dB. Table 3 shows the results of the different algorithms at a low SNR of −10 dB. By comparing Table 2 and Table 3, we observe that all the methods except MDL give the same estimation results at SNRs of 50 dB and −10 dB. At a low SNR of −10 dB, MDL overestimates the number of sources as 6, and MDL-MUSIC estimates the DOA of the source as 259°, which deviates significantly from the true DOA.

5.2. Absence of Source Signal

In the absence of sources, the real data comprise only noise. The spatial spectrum of DNN-C is shown in Figure 21. From Figure 21, we observe that only the virtual peak is classified into the signal cluster, and it is then removed by the mask function in Equation (36). Therefore, DNN-C correctly estimates the number of sources as 0 and identifies the absence of a source. In contrast, MDL, NUMDL, and SSM overestimate the number of sources as 6, 1, and 1, respectively, failing to identify the absence of sources.

6. Conclusions

In this paper, the DNN-C method is proposed for simultaneously estimating the source number and DOAs. DNN-C is a lightweight solution, as it is composed of a simple fully connected DNN and K2-means clustering, which involve only multiplications and additions and avoid matrix decomposition. DNN-C trains the network using mixed data generated under different numbers of sources and multiple SNRs, with the data balanced so that the inference ability is equal across different numbers of sources. Moreover, it exploits prior knowledge of array signal processing and the spatial spectrum to design a dedicated K2-means clustering that automatically removes pseudo peaks and detects both the presence and the absence of sources. The simulation results verify that DNN-C is more computationally efficient than the conventional solutions considered. Moreover, DNN-C is robust to nonuniform noise by nature and thus demonstrates the most robust performance in terms of the PD, PMD, and FAR for source number estimation and the RMSE for DOA estimation. In addition, it can identify the absence of sources and performs robustly with correlated sources. The real data preliminarily confirm the generalization ability of the DNN-C model trained with simulated data.

Although DNN-C demonstrates better performance and efficiency than classical solutions such as MDL-MUSIC, NUMDL-MUSIC, and SSM-MUSIC, it still faces certain challenges: the amount of training data explodes as the number of sources increases, the trained model is only suitable for a fixed array geometry, and the performance of the trained model might degrade in a complex propagation environment. Transfer learning might be a way to generalize the trained model to different scenarios. Therefore, the deep neural network for source detection with a sensor array is an interesting and promising solution, but it is still far from complete.

Author Contributions

Conceptualization and methodology, A.L., Y.Z. and Z.L. (Zhiling Liu); software, validation, and visualization, Y.Z., A.L., Z.L. (Zi Li) and Y.X.; formal analysis and investigation, Y.Z., A.L., Z.L. (Zi Li) and Y.X.; resources and data curation, C.Z. and Z.L. (Zhiling Liu); writing—original draft preparation and writing—review and editing, A.L., Y.Z. and Y.X.; supervision, A.L., C.Z. and Z.L. (Zhiling Liu); project administration, A.L.; funding acquisition, A.L. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement

Data available on request due to restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Figures and Tables
Figure 1. Array signal model for a uniform linear array: [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.] are the waveform and DOA of the k-th sources, respectively; d is the distance between two adjacent sensors; the steering vector at [Forumla omitted. See PDF.] is [Forumla omitted. See PDF.]; the received signal at the m-th sensor is [Forumla omitted. See PDF.]; and the received array signal vector is [Forumla omitted. See PDF.].

Figure 2. Scheme of DNN-C.

Figure 3. Structure of DNN; for the autoencoder, we define the size of each of the input and output layers as [Forumla omitted. See PDF.] since the input vector [Forumla omitted. See PDF.] has a size of [Forumla omitted. See PDF.]. In addition, we denote that the number of each encoder and decoder has one hidden layer of which the size is [Forumla omitted. See PDF.], and denote the number of spatial subregions as p. For each of the multi-layer classifiers after the autoencoder, the sizes of the two hidden layers are, respectively, [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.].

Figure 4. Spatial spectrum of DNN for real data with a source of [Forumla omitted. See PDF.] and an estimated SNR of 50 dB. The red cross corresponds to the true DOA of source (i.e., 8°).

Figure 5. Spatial spectra of DNN-C under different virtual peaks in absence of sources; (a) [Forumla omitted. See PDF.]; (b) [Forumla omitted. See PDF.]; (c) [Forumla omitted. See PDF.]; (d) [Forumla omitted. See PDF.].

Figure 6. Spatial spectra of DNN-C under different virtual peaks in presence of two sources with DOAs of [Forumla omitted. See PDF.]; (a) [Forumla omitted. See PDF.]; (b) [Forumla omitted. See PDF.]; (c) [Forumla omitted. See PDF.]; (d) [Forumla omitted. See PDF.].

Figure 7. Testing time for different methods.

Figure 8. Performance versus number of snapshots when [Forumla omitted. See PDF.]; (a) PD of source number detection; (b) FAR of source number detection.

Figure 9. Performance versus DOA of a single source; (a) PD of source number detection; (b) PMD of source number detection; (c) FAR of source number detection; (d) RMSE of DOA estimation based on the estimated number of sources.

Figure 10. Performance versus SNR when [Forumla omitted. See PDF.]; (a) PD of source number detection; (b) PMD of source number detection; (c) FAR of source number detection; (d) RMSE of DOA estimation based on the estimated number of sources.

Figure 11. Performance versus number of snapshots when [Forumla omitted. See PDF.]; (a) PD of source number detection; (b) PMD of source number detection; (c) FAR of source number detection; (d) RMSE of DOA estimation based on the estimated number of sources.

Figure 12. Performance versus SNR when [Forumla omitted. See PDF.]; (a) PD of source number detection; (b) PMD of source number detection; (c) FAR of source number detection; (d) RMSE of DOA estimation based on the estimated number of sources.

Figure 13. Performance versus DOA separation when [Forumla omitted. See PDF.]; (a) PD of source number detection; (b) PMD of source number detection; (c) FAR of source number detection; (d) RMSE of DOA estimation based on the estimated number of sources.

Figure 14. Performance versus WNPR when [Forumla omitted. See PDF.], [Forumla omitted. See PDF.] dB, and [Forumla omitted. See PDF.]; (a) PD of source number detection; (b) PMD of source number detection; (c) FAR of source number detection; (d) RMSE of DOA estimation based on the estimated number of sources.

Figure 15. Performance versus WNPR when [Forumla omitted. See PDF.], [Forumla omitted. See PDF.]6 dB, and [Forumla omitted. See PDF.]; (a) PD of source number detection; (b) PMD of source number detection; (c) FAR of source number detection; (d) RMSE of DOA estimation based on the estimated number of sources.

Figure 16. Performance versus correlation coefficient when [Forumla omitted. See PDF.]; (a) PD of source number detection; (b) PMD of source number detection; (c) FAR of source number detection; (d) RMSE of DOA estimation based on the estimated number of sources.

Figure 17. Spatial spectrum of DNN-C for real data with source of [Forumla omitted. See PDF.] and an estimated SNR of 50 dB. The red dot with a red circle corresponds to the true DOA of source (i.e., 8°). The red dot with a black circle corresponds to the virtual peak.

Figure 18. Normalized spatial spectra of conventional methods for real data with source of [Forumla omitted. See PDF.] and an estimated SNR of 50 dB. The blue cross corresponds to the true DOA of source (i.e., 8°).

Figure 19. Spatial spectrum of DNN-C for real data with source of [Forumla omitted. See PDF.] and an estimated SNR of −10 dB; The red dot with a red circle corresponds to the true DOA of source (i.e., 8°). The red dot with a black circle corresponds to the virtual peak.

Figure 20. Normalized spatial spectra of conventional methods for real data with source of [Forumla omitted. See PDF.] and an estimated SNR of −10 dB. The blue cross corresponds to the true DOA of source (i.e., 8°).

Figure 21. Spatial spectrum of DNN-C for real data in the absence of the source. The red dot with a black circle corresponds to the virtual peak.

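Table 1. Analysis of primary computational complexity.

| Algorithm | Primary computational complexity | Result under specific setting |
|---|---|---|
| DNN-C | $O\left(\tilde{J}\times\frac{\tilde{J}}{2}+\frac{\tilde{J}}{2}\times\tilde{J}\times p+\tilde{J}\times\frac{2}{3}\tilde{J}\times p+\frac{2}{3}\tilde{J}\times\frac{4}{9}\tilde{J}\times p+\frac{4}{9}\tilde{J}\times\gamma\times p\right)$ | O(44,400) |
| MDL [2] | $4\times O(M^3)$ | O(4000) |
| NUMDL [14] | $4\times O\left(M\times(M-1)^3\right)$ | O(29,160) |
| SSM [15] | $4\times O\left((N^2M+N^3)+M^3+M^2(M-1)\right)$ | $O(4.04\times10^9)$ |
| MUSIC [6] | $4\times O\left((M+1)(M-K)\tilde{\gamma}+M^3\right)$ | O(46,240) |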
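
The numerical results in the last column of Table 1 can be reproduced with the short Python check below. M=10 and N=1000 come from the simulation setup. The remaining values are assumptions that this part of the text does not state: K=2 and a MUSIC grid size of 120 points for the MUSIC row, and an input size of 90 (M(M−1)), p=3 spatial subregions, and 40 grid points per subregion for the DNN-C row. They were chosen because they reproduce the tabulated results.

```python
# Values stated in the simulation setup.
M, N = 10, 1000

# Assumed values (not stated alongside the table); see the note above.
K, gamma_tilde = 2, 120          # sources and MUSIC grid points
J, p, gamma = 90, 3, 40          # DNN input size, subregions, grid points per subregion

mdl = 4 * M ** 3
numdl = 4 * M * (M - 1) ** 3
ssm = 4 * ((N ** 2 * M + N ** 3) + M ** 3 + M ** 2 * (M - 1))
music = 4 * ((M + 1) * (M - K) * gamma_tilde + M ** 3)

# Layer-by-layer multiplication counts of the DNN-C expression in Table 1.
dnn_c = (
    J * (J // 2)                        # encoder: J -> J/2
    + (J // 2) * J * p                  # p decoders: J/2 -> J
    + J * (2 * J // 3) * p              # first classifier hidden layer
    + (2 * J // 3) * (4 * J // 9) * p   # second classifier hidden layer
    + (4 * J // 9) * gamma * p          # output layer of each classifier
)

print(dnn_c, mdl, numdl, music)  # 44400 4000 29160 46240
print(f"{ssm:.2e}")              # 4.04e+09
```

Under these assumptions, DNN-C and MUSIC have counts of the same order, while SSM is dominated by the N³ term.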

Table 2. Results of real data with a single source of θ1=8° at a high SNR of 50 dB.

| Algorithm | Estimated number | Algorithm | Estimated DOA (°) |
|---|---|---|---|
| DNN-C | 1 | DNN-C | 8 |
| MDL | 7 | MDL-MUSIC | 354 |
| NUMDL | 1 | NUMDL-MUSIC | 8 |
| SSM | 1 | SSM-MUSIC | 8 |

Table 3. Results of real data with a single source of θ1=8° at a low SNR of −10 dB.

| Algorithm | Estimated number | Algorithm | Estimated DOA (°) |
|---|---|---|---|
| DNN-C | 1 | DNN-C | 8 |
| MDL | 6 | MDL-MUSIC | 259 |
| NUMDL | 1 | NUMDL-MUSIC | 8 |
| SSM | 1 | SSM-MUSIC | 8 |

References

1. Krim, H.; Viberg, M. Two decades of array signal processing research: The parametric approach. IEEE Signal Process. Mag.; 1996; 13, pp. 67-94. [DOI: https://dx.doi.org/10.1109/79.526899]

2. Wax, M.; Kailath, T. Detection of signals by information theoretic criteria. IEEE Trans. Acoust. Speech Signal Process.; 1985; 33, pp. 387-392. [DOI: https://dx.doi.org/10.1109/TASSP.1985.1164557]

3. Huang, L.; Wu, S.; Li, X. Reduced-rank MDL method for source enumeration in high-resolution array processing. IEEE Trans. Signal Process.; 2007; 55, pp. 5658-5667. [DOI: https://dx.doi.org/10.1109/TSP.2007.899344]

4. Antonello, F.; Gianfranco, A.; Giuseppe, C. Multiple Source Angle of Arrival Estimation Through Phase Interferometry. IEEE Trans. Circuits Syst. II Express Briefs; 2022; 69, pp. 674-678.

5. Del-Rey-Maestre Nerea, M.-M.D.; María-Pilar, J.; Anabel, A.; Javier, R. DOA techniques in UAV detection with DVB-T based Passive Radar. Proceedings of the 2023 IEEE International Radar Conference (RADAR); Sydney, Australia, 6–10 November 2023.

6. Schmidt, R. Multiple emitter location and signal parameter estimation. IEEE Trans. Antennas Propag.; 1986; 34, pp. 276-280. [DOI: https://dx.doi.org/10.1109/TAP.1986.1143830]

7. Malioutov, D.; Cetin, M.; Willsky, A.S. A sparse signal reconstruction perspective for source localization with sensor arrays. IEEE Trans. Signal Process.; 2005; 53, pp. 3010-3022. [DOI: https://dx.doi.org/10.1109/TSP.2005.850882]

8. Yin, J.; Chen, T. Direction-of-arrival estimation using a sparse representation of array covariance vectors. IEEE Trans. Signal Process.; 2011; 59, pp. 4489-4493. [DOI: https://dx.doi.org/10.1109/TSP.2011.2158425]

9. Liu, A.; Liao, G. An eigenvector based method for estimating DOA and sensor gain-phase errors. Digit. Signal Process.; 2018; 79, pp. 116-124. [DOI: https://dx.doi.org/10.1016/j.dsp.2018.04.013]

10. Liu, A.; Yang, D.; Shi, S.; Zhu, Z.; Li, Y. Augmented subspace MUSIC method for DOA estimation using acoustic vector sensor array. IET Radar Sonar Navig.; 2019; 13, pp. 969-975. [DOI: https://dx.doi.org/10.1049/iet-rsn.2018.5440]

11. Liu, L.; Li, Z.; An, J.; Gan, L.; Li, H. DOA Estimation for Switch-Element Arrays Based on Sparse Representation. Proceedings of the ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); Seoul, Republic of Korea, 14–19 April 2024.

12. Matveyev, A.L.; Gershman, A.B.; Böhme, J.F. On the direction estimation Cramér-Rao bounds in the presence of uncorrelated unknown noise. Circuits Syst. Signal Process.; 1999; 18, 5. [DOI: https://dx.doi.org/10.1007/BF01387467]

13. Pesavento, M.; Gershman, A.B. Maximum-likelihood direction-of-arrival estimation in the presence of unknown nonuniform noise. IEEE Trans. Signal Process.; 2001; 49, pp. 1310-1324. [DOI: https://dx.doi.org/10.1109/78.928686]

14. Aouada, S.; Zoubir, A.M.; See, C.M.S. Source detection in the presence of nonuniform noise. Proceedings of the 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing; Montreal, QC, Canada, 17–21 May 2004.

15. Wax, M.; Adler, A. Detection of the number of signals by signal subspace matching. IEEE Trans. Signal Process.; 2021; 69, pp. 973-985. [DOI: https://dx.doi.org/10.1109/TSP.2021.3053495]

16. He, Z.; Shi, Z.; Huang, L. Covariance sparsity-aware DOA estimation for nonuniform noise. Digit. Signal Process.; 2014; 28, pp. 75-81. [DOI: https://dx.doi.org/10.1016/j.dsp.2014.02.013]

17. Liao, B.; Chan, S.; Huang, L.; Guo, C. Iterative methods for subspace and DOA estimation in nonuniform noise. IEEE Trans. Signal Process.; 2016; 64, pp. 3008-3020. [DOI: https://dx.doi.org/10.1109/TSP.2016.2537265]

18. Yang, Y.; Gao, F.; Qian, C.; Liao, G. Model-aided deep neural network for source number detection. IEEE Signal Process. Lett.; 2019; 27, pp. 91-95. [DOI: https://dx.doi.org/10.1109/LSP.2019.2957673]

19. Agatonovic, M.; Stankovic, Z.; Milovanovic, I.; Doncov, N.; Sit, L.; Zwick, T.; Milovanovic, B. Efficient neural network approach for 2D DOA estimation based on antenna array measurements. Prog. Electromagn. Res.; 2013; 137, pp. 741-758. [DOI: https://dx.doi.org/10.2528/PIER13012114]

20. Liu, A.; Guo, J.; Arnatovich, Y.; Liu, Z. Lightweight Deep Neural Network with Data Redundancy Removal and Regression for DOA Estimation in Sensor Array. Remote Sens.; 2024; 16, 1423. [DOI: https://dx.doi.org/10.3390/rs16081423]

21. Liu, Z.; Zhang, C.; Yu, P.S. Direction-of-arrival estimation based on deep neural networks with robustness to array imperfections. IEEE Trans. Antennas Propag.; 2018; 66, pp. 7315-7327. [DOI: https://dx.doi.org/10.1109/TAP.2018.2874430]

22. Wu, L.; Liu, Z.; Huang, Z. Deep convolution network for direction of arrival estimation with sparse prior. IEEE Signal Process. Lett.; 2019; 26, pp. 1688-1692. [DOI: https://dx.doi.org/10.1109/LSP.2019.2945115]

23. Elbir, A.M. DeepMUSIC: Multiple signal classification via deep learning. IEEE Sens. Lett.; 2020; 4, pp. 1-4. [DOI: https://dx.doi.org/10.1109/LSENS.2020.2980384]

24. Papageorgiou, G.K.; Sellathurai, M.; Eldar, Y.C. Deep networks for direction-of-arrival estimation in low SNR. IEEE Trans. Signal Process.; 2021; 69, pp. 3714-3729. [DOI: https://dx.doi.org/10.1109/TSP.2021.3089927]

25. Ji, Y.; Wen, C.; Huang, Y.; Peng, J.; Fan, J. Robust direction-of-arrival estimation approach using beamspace-based deep neural networks with array imperfections and element failure. IET Radar Sonar Navig.; 2022; 16, pp. 1761-1778. [DOI: https://dx.doi.org/10.1049/rsn2.12295]

26. Chaudhari, S.; Moura, J.M.F. Deep beamforming for joint direction of arrival estimation and source detection. Proceedings of the 2022 56th Asilomar Conference on Signals, Systems, and Computers; Pacific Grove, CA, USA, 31 October–2 November 2022.

27. Yan, F.-G.; Jin, M.; Liu, S.; Qiao, X.-L. Real-valued MUSIC for efficient direction estimation with arbitrary array geometries. IEEE Trans. Signal Process.; 2014; 62, pp. 1548-1560. [DOI: https://dx.doi.org/10.1109/TSP.2014.2298384]

28. Huang, L.; Xiao, Y.; Liu, K.; So, H.C.; Zhang, J.-K. Bayesian information criterion for source enumeration in large-scale adaptive antenna array. IEEE Trans. Veh. Technol.; 2016; 65, pp. 3018-3032. [DOI: https://dx.doi.org/10.1109/TVT.2015.2436060]

29. Evers, C.; Löllmann, H.W.; Mellmann, H.; Schmidt, A.; Barfuss, H.; Naylor, P.A. The LOCATA challenge: Acoustic source localization and tracking. IEEE/ACM Trans. Audio Speech Lang. Process.; 2020; 28, pp. 1620-1643. [DOI: https://dx.doi.org/10.1109/TASLP.2020.2990485]

30. Liu, A.; Guo, H.; Arnatovich, Y. Global MDL minimization-based method for detection of the number of sources in presence of unknown nonuniform noise. Proceedings of the 2022 30th European Signal Processing Conference (EUSIPCO); Belgrade, Serbia, 29 August–2 September 2022; pp. 1936-1940.

31. Huang, L.; Long, T.; Mao, E.; So, H.C. MMSE-based MDL method for robust estimation of number of sources without eigendecomposition. IEEE Trans. Signal Process.; 2009; 57, pp. 4135-4142. [DOI: https://dx.doi.org/10.1109/TSP.2009.2024043]

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).