This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
1. Introduction
The convergence of the internet, cloud computing, and big data technology has broken down isolated information islands and reduced data fragmentation. A secure, stable, and reliable network connection is the basis for information sharing. As networks grow in scale and in number of users, traditional networking modes and management methods can no longer meet the diverse needs of future Internet development. Software Defined Network (SDN) is a new type of network architecture that decouples the control and forwarding functions of network devices, allowing the network to evolve independently of hardware devices [1].
Compared with the traditional network architecture, the advantage of SDN is that it acts as an operating system built on virtualized network resources and devices: the complexity of traffic control and management is concentrated on the controller, while the forwarding plane only performs simple packet forwarding [2]. SDN networks can therefore be adjusted and optimized quickly for the requirements of specific applications, and they are more open, flexible, and programmable. This reduces the load on network equipment and lowers operating costs, so SDN is increasingly favored by equipment manufacturers and network operators.
SDN’s centralized controller and open design also introduce security risks, such as Denial of Service (DoS) attacks against the available queue capacity of the controller [3], overflow attacks and distributed DoS (DDoS) attacks on switch flow tables, caches, and communication links [4], blocking attacks on southbound interfaces [5], low-rate DoS (LDoS) attacks that exploit transmission protocol vulnerabilities [6], and worm and virus attacks that exploit system vulnerabilities [7]. These attacks not only degrade network operation but may also compromise the core controller, causing the entire SDN network to collapse.
Attacks are often accompanied by abnormal changes in network traffic or by performance degradation, exhibiting behavior patterns that do not conform to expectations. For this reason, researchers have proposed detection methods based on statistical analysis, information theory, clustering analysis, machine learning, and deep learning. Machine learning and deep learning are applied particularly widely, for example:
Anyanwu et al. [8] employ a Support Vector Machine (SVM) classifier with a Radial Basis Function (RBF) kernel and an exhaustive parameter search technique known as Grid Search Cross-Validation (GSCV) to construct the intrusion detection framework of in-vehicle units.
The proposed models of Wang et al. [9], Jiang et al. [10], and Cheng et al. [11] utilize the powerful feature extraction capability of neural networks to enhance the representation of network traffic features. Jiang et al. [12] used the weights of neurons in the graph neural network to adjust the available bandwidth of links, thereby restricting the transmission of data from abnormal hosts.
By detecting and analyzing network traffic, it is possible to quickly and effectively identify abnormal traffic within the network. This provides support for taking further corresponding countermeasures to enhance network security.
However, SDN networks operate differently from traditional networks. If a traditional detection method is migrated to a large-scale SDN network with multiple controllers, it places great pressure on high-traffic communication links and on the cloud server that performs the centralized detection tasks. The controllers must also bear excessive computation and storage costs for the heavy detection tasks, which seriously interferes with their core control function [13] and can lead to a single point of failure at the controller or even to abnormal operation of the whole SDN network. This has become a bottleneck restricting the development of SDN networks. An important research topic is therefore how to distribute detection tasks in SDN networks and balance the load during abnormal detection so as to reduce the impact on network operation.
We propose an abnormal traffic detection method for SDN networks based on a distributed deep learning framework, called Distributed Convolutional Neural Networks and Gated Recurrent Unit (DCNN-GRU). As shown in Figure 1, we use the DCNN model to pre-extract the subnet traffic characteristics at multiple SDN controllers and train it jointly with the GRU deep detection model located in the cloud server, constructing an efficient and accurate detection model that distinguishes normal network traffic from attack traffic. Compared with detection at each independent controller, DCNN-GRU can train a stronger detection model from the traffic information of the whole network, because it uses the traffic characteristics provided by all distributed SDN controllers. Compared with a solution in which a centralized server collects all the traffic of each SDN controller for attack detection, DCNN-GRU lets the distributed SDN controllers share the feature extraction task and transmits only the extracted feature data to the cloud server instead of the complete traffic data, which greatly reduces the communication cost of the network. Extracting traffic characteristics step by step and balancing the detection load in this way alleviates the pressure of centralized computation on the cloud server and reduces its communication burden. The aim is to minimize the local pressure on network equipment during abnormal detection, improve the utilization of network resources, and maintain the normal operation of the whole SDN network.
[figure(s) omitted; refer to PDF]
Our contributions are summarized as follows:
1. We propose a distributed abnormal traffic detection framework based on DCNN-GRU for large-scale multicontroller SDN networks, which avoids the high system consumption in centralized detection and reduces communication consumption.
2. We explore a deep learning approach that combines a distributed Convolutional Neural Network (CNN) with a GRU and design an efficient collaborative abnormal traffic detection method. It moves the task of extracting features from large volumes of data from the cloud server to multiple SDN controllers and trains them jointly with the detection model in the cloud server, which shortens the detection time and benefits the rapid detection of abnormal traffic in large-scale networks.
3. We evaluate the performance of DCNN-GRU in detecting abnormal samples in traffic datasets from different network environments. The experiments show that the method is effective and generalizes well.
2. Related Work
SDN has the characteristics of logical centralization, programmability, and openness, which leads to significant differences in network attacks compared to traditional networks. Attackers can exploit vulnerabilities in SDN by sending a large number of spoofed IP address packets to SDN switches, easily causing congestion in the SDN switches and even overloading the controller. This method of attack differs from traditional network attack methods, resulting in substantial differences in abnormal traffic detection between the two.
Traditional abnormal detection methods assume that switches are intelligent and capable of independently handling complex security defense tasks. In the SDN architecture, however, the control plane and data plane are separated, switches possess only simple functions, and security defense is not an inherent feature. Therefore, the defense architecture of SDN relies on switches to collect network traffic and then extracts and analyzes the characteristics of abnormal behavior in that traffic to classify it as normal or abnormal.
Taking the OpenFlow-based SDN as an example, commonly used DDoS attack defense architectures collect network traffic through OpenFlow switches, analyze the characteristic information of data packets, and match it against a DDoS attack rule database, ultimately utilizing the controller to complete intrusion responses. In this process, OpenFlow switches not only need to perform traffic forwarding tasks but also take on additional tasks such as data protocol analysis and matching against the DDoS attack rule database. Meanwhile, the controller, in addition to maintaining and controlling forwarding tasks, must also collect DDoS attack data packet characteristics and respond to intrusions. This is significantly different from traditional network abnormal detection methods.
We have briefly summarized the related research, which can be classified into detection methods based on statistical analysis, information theory, cluster analysis, traditional machine learning, and deep learning.
2.1. Abnormal Traffic Detection Methods Based on Statistical Analysis
Detection methods based on statistical analysis identify abnormal traffic by analyzing statistical data and parameter information related to traffic characteristics. Bavani et al. [14] proposed an abnormal traffic identification algorithm based on packet statistics. The algorithm evaluates the size and duration of the captured original packets and analyzes differences in packet behavior by comparing packet header information, which serves as the basis for judging abnormal traffic. Software-defined detector (SDDetector), a lightweight DDoS detection and defense framework proposed by Jia Kun et al. [15], uses statistical features to detect suspicious behaviors in coarse-grained mode and then uses an entropy detection algorithm and an SVM detection algorithm to detect DDoS attack flows in fine-grained mode. However, detection algorithms based on statistical learning usually target only attacks with obvious characteristics, and they can identify few abnormal types. Their recognition performance depends mainly on the threshold setting, which is closely tied to the abnormal type, and they have no learning ability, which hinders wider adoption.
2.2. Abnormal Traffic Detection Methods Based on Information Theory
Detection methods based on information theory mainly use the mutual information of traffic characteristics to calculate entropy or KL divergence and thereby identify abnormal distributions. Li et al. [16] proposed an entropy-based detection method, which expanded the characteristic difference between normal traffic and abnormal traffic by adjusting the relevant parameters, thus making it easier to detect anomalies in the early stage of DDoS attacks. Mishra et al. [17] proposed an entropy-based detection scheme that sets the packet rate in the switch as a trigger condition and adds a blocking flow table entry to the abnormal switch. Based on the resource usage of the SDN controller, Mousavi et al. [18] proposed a lightweight DDoS detection scheme that relies on the entropy change of the destination IP address, but they did not consider that the entropy of the destination IP may also change greatly under normal circumstances. Duo et al. [19] used the destination IP addresses in the packet headers collected by the controller to calculate the entropy of the destination IP address and judged the network state with a dynamic threshold that changes over time. However, methods based on information theory are only suitable for detecting dense abnormal samples; their low sensitivity to sparse abnormalities makes it difficult to locate the root cause of anomalies.
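The entropy test that several of these schemes share can be illustrated with a minimal Python sketch. It computes the Shannon entropy of the destination IP addresses seen in one observation window and flags the window when the entropy drops below a threshold. The function names, the window contents, and the fixed threshold are assumptions made for illustration; the cited works use their own features and, in the case of Duo et al. [19], a dynamic threshold.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def is_suspicious(dst_ips, threshold):
    """Flag a window whose destination-IP entropy falls below `threshold`.

    A fixed threshold is an assumption of this sketch.
    """
    return shannon_entropy(dst_ips) < threshold

# Hypothetical windows: traffic spread evenly over four hosts versus
# traffic concentrated on a single victim address.
normal_window = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"] * 25
attack_window = ["10.0.0.1"] * 97 + ["10.0.0.2", "10.0.0.3", "10.0.0.4"]

print(shannon_entropy(normal_window), shannon_entropy(attack_window))
```

When traffic concentrates on one destination, as in a DDoS flood, the entropy of the destination address distribution falls, which is what the threshold comparison detects.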
2.3. Abnormal Traffic Detection Methods Based on Unsupervised Learning
Data flows with no obvious statistical characteristics can be divided into multiple clusters by a clustering algorithm, and data points that do not belong to any cluster can be identified as abnormal. Cui et al. [20] proposed a K-Means clustering algorithm based on imbalanced traffic distribution, which realized real-time detection of DDoS attack traffic in SDN. Zolotukhin et al. [21] combined K-Means, Fuzzy C-Means, and Self-Organizing Map (SOM) algorithms into a detection proxy model through training, which redirects SDN flows, performs attack detection, and updates the network dynamically in real time. The decision threshold is defined by calculating the maximum number of packets sent by different processes, and a centroid clustering method is then used to detect abnormal payloads. However, K-Means cannot know in advance how many kinds of data streams there are, so it cannot be used in many applications. Jasim et al. [22] put forward a semisupervised classification algorithm: a mixed feature selection method first optimizes the features, and a clustering algorithm then selects the best two centroids as the basis for distinguishing normal traffic from DDoS attacks. This method, however, could not achieve high detection accuracy. For datasets with varying densities, clustering methods may fail to separate abnormal data points accurately and adapt poorly.
2.4. Abnormal Traffic Detection Methods Based on Shallow Supervised Learning
Machine learning algorithms can learn rules from a large number of data stream samples, analyze traffic data in SDN, and perform abnormal detection. Their development has roughly gone through two waves: shallow supervised learning and deep learning. This section reviews abnormal traffic detection methods that use shallow supervised learning. Boero et al. [23] studied an SVM-based IDS solution in SDN, in which core features based on probability density estimation are selected with an entropy-based information gain (IG) method to detect various malware intrusions in the network. Cheng et al. [24] proposed a deep detection method based on OpenFlow packets, which used Random Forest (RF), Decision Tree (DT), SVM, Naïve Bayes (NB), and K-Nearest Neighbor (KNN) algorithms separately to analyze packet payloads and find malicious attacks in the network. Because it adopts a packet-level sampling strategy, this method alleviates the resource performance problem to a certain extent. Tayfour et al. [25] integrated four machine learning methods (NB, KNN, DT, and Extra Trees) into a voting classifier, V-NKDE. After the classifiers label the extracted traffic features, a voting mechanism determines the final detection result. This combined detection model can balance the individual weaknesses of each classifier, reducing the false positive rate and the risk of overfitting.
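The voting mechanism described for ensemble classifiers can be sketched in a few lines of Python. This is a generic hard (majority) vote, not the V-NKDE implementation; the four threshold rules and the feature names are hypothetical stand-ins for trained base classifiers.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most frequent label; ties resolve to the label seen first."""
    return Counter(labels).most_common(1)[0][0]

def ensemble_predict(classifiers, sample):
    """Ask every base classifier for a label and combine them by majority vote."""
    return majority_vote([clf(sample) for clf in classifiers])

# Hypothetical threshold rules standing in for trained base models.
base_classifiers = [
    lambda s: "attack" if s["pkt_rate"] > 1000 else "normal",
    lambda s: "attack" if s["dst_entropy"] < 1.0 else "normal",
    lambda s: "attack" if s["syn_ratio"] > 0.8 else "normal",
    lambda s: "attack" if s["flow_count"] > 500 else "normal",
]

sample = {"pkt_rate": 5000, "dst_entropy": 0.3, "syn_ratio": 0.9, "flow_count": 40}
print(ensemble_predict(base_classifiers, sample))
```

Because a single dissenting classifier is outvoted, an error made by one base model does not necessarily change the final label, which is the balancing effect attributed to the combined model.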
Li et al. [26] designed a two-stage abnormal detection scheme based on the RF algorithm for the software-defined Internet of Things (SD-IoT) network. Five stream-based features are selected from the KDD99 dataset, and an improved RF algorithm with a weighted voting mechanism then adjusts the weights of the sample data and classifies the streams. Because the scheme is optimized for attacks in SD-IoT networks and uses selected core features, it shows high detection accuracy. Sebbar et al. [27] proposed a model using the RF algorithm, which pre-establishes a security policy and a TTL delay in the SDN network and identifies Man in the Middle (MitM) attacks by detecting the state of the context selection node. Any connection request that exceeds the delay is judged to be an attack, and the system disconnects all connections with the node, thus blocking any authentication request from it. Dehkordi et al. [28] proposed a detection method combining statistical learning with machine learning, which detects DDoS attacks by detecting traffic imbalance in SDN networks and thereby further improves detection accuracy. To improve detection ability, Akbas et al. [29] evaluated the KNN, DT, and SVM algorithms for detecting DDoS attacks in SDN and, to reduce the detection cost, selected 6 of the 41 available features in the dataset. Similarly, Elsayed et al. [30] used principal component analysis (PCA) to reduce the dataset dimensions before applying ML for abnormal detection, reducing the original 122 features to 20.
Although machine learning algorithms can detect traffic anomalies, manually extracting features from huge amounts of data is prohibitively expensive, and their intrusion detection performance in large-scale SDN is not reliable [31].
2.5. Abnormal Traffic Detection Methods Based on Deep Learning
Deep learning is a branch of machine learning based on neural network algorithms, which offers higher accuracy and precision than traditional machine learning models. Because of its excellent representation ability, it is widely used in computer vision [32], intelligent diagnosis [33], assistant decision [34], and other fields. In addition, deep learning can extract the features of input data on its own, without additional feature engineering. Common deep learning models include CNN, Recurrent Neural Network (RNN), AutoEncoder (AE), and Generative Adversarial Network (GAN), among others.
Kim et al. [35] put forward a DoS attack detection scheme based on CNN. Using the KDD99 and CSE-CIC-IDS2018 datasets, they created two kinds of intrusion images, in RGB and in gray level, and obtained different precision in binary and multiclass frameworks by changing the convolution kernel size. Kurochkin et al. [36] proposed a detection model based on GRU that used the attack-oriented dataset CSE-CIC-IDS2018; its F1-score in detecting DDoS attacks was almost 100%, but it performed poorly on web attacks and penetration tag attacks. Abdallah et al. [37] used a hybrid CNN and long short-term memory (LSTM) model to extract spatial and temporal features from traffic data and classified the traffic samples in the InSDN dataset. Niyaz et al. [38] used a stacked autoencoder (SAE) to establish a multivector DDoS detection model, which simplified the set of feature attributes extracted from network traffic headers, improved the recognition performance, and reduced the false recognition rate. Maeda et al. [39] proposed a deep learning method for detecting botnets, trained on botnet traffic collected on traditional networks. When an infected host is detected, it is blocked through SDN, and a machine learning classifier is used to determine the attack source and isolate the network to prevent internal infection.
Javeed et al. [40] proposed a hybrid detection model based on LSTM and GRU algorithms to detect network attacks in the IoT environment. The model consists of two deep LSTM layers and a GRU layer, with a SoftMax output layer. The experiment shows that the model needs little testing time and detects abnormal samples in data streams efficiently and accurately. Khan et al. [41] proposed a hybrid deep learning model with LSTM and CNN to detect the traffic generated by malware on the IoT. In this model, the CNN mainly extracts local features, and its output is passed to the LSTM to obtain more independent features and improve the performance of the hybrid model. Novaes et al. [42] developed an adversarial training detection and prevention system for SDN networks, which identifies DDoS attacks using the GAN framework. Adversarial training makes the system less susceptible to adversarial attacks.
However, in large-scale networks, deploying detection nodes only at the SDN controller leaves the detection coverage insufficient and makes it difficult to defend effectively against network attacks. When multiple detection nodes are deployed instead, the nodes may lack training data and coordinate poorly when training the detection model.
To overcome the lack of training data that single-node detection equipment may face in SDN networks, AlEroud et al. [43] put forward a method of anti-intrusion detection based on GAN. They input the malicious traffic data of DDoS attacks and normal traffic data into the GAN to continuously train the generator network so that it can generate realistic samples for evaluating the performance of the intrusion detection system. Elsayed et al. [44] proposed an autoencoder model based on One-Class SVM (OC-SVM) and LSTM for the joint detection of attack samples in unbalanced datasets. In this model, the LSTM autoencoder compresses the features of normal samples and feeds them into the OC-SVM for classification, which overcomes the shortcomings of OC-SVM when it handles massive high-dimensional datasets alone. Shu et al. [45] demonstrated an IDS for vehicular ad hoc networks (VANETs) that distinguishes normal from malicious network traffic by installing a distributed SDN controller on each base station. The SDN controllers are jointly trained with a GAN on the traffic information of the whole VANET, without directly exchanging their subnet traffic. The IDS allows each distributed SDN controller to detect its subnet traffic independently, thus reducing communication and computation costs. However, the generalization ability of this detection method is weak, and controllers with more nodes may be overloaded. Additionally, the generative capacity of GANs is often used to create new sample data. Ezeh et al. [46] proposed an efficient GAN-based abnormal detection model (EGBAD), built on a multigenerator adversarial approach against multiple discriminators. Besides detecting abnormal traffic in SDN, this model is also used to generate samples of currently prevalent network attack traffic, thereby addressing the shortage of samples for novel attack traffic patterns.
Our survey indicates that current research on abnormal traffic detection in SDN largely relies on centralized detection models. These models consume significant computational and storage resources, and the dense transmission of large volumes of data can easily create bottlenecks in the network links. In future large-scale SDN networks especially, centralized detection methods will find it increasingly difficult to meet the demands of abnormal traffic detection. This paper therefore proposes a distributed detection method that disperses the extensive feature extraction tasks across the subnetworks. Instead of transmitting complete network data packets, the subnetworks convey only compressed feature data to the detection center. This approach can significantly reduce the computational load on the detection center and the stress on the communication links between the subnetworks and the detection center.
3. Abnormal Traffic Detection Based on DCNN-GRU Deep Learning Architecture
SDN mainly consists of the control plane, data plane, and application plane, as shown in Figure 2. The control plane is the core of SDN, which includes the controller and the network operating system. It is primarily responsible for formulating forwarding tasks, debugging communication units, executing instructions issued by the application plane, and optimizing the allocation of network resources. The data plane, also referred to as the forwarding plane, is composed of network forwarding devices such as switches and routers. It mainly receives forwarding rules set by the control plane through the southbound interface and deals with network flows according to these rules. The application plane primarily consists of business-oriented network applications, such as intrusion detection, traffic control, and load balancing. These applications use the northbound interface to issue network control actions that need to be executed to the controller.
[figure(s) omitted; refer to PDF]
Our proposed distributed abnormal traffic detection method based on the DCNN-GRU deep learning architecture can perform abnormal traffic detection tasks in large-scale SDN environments with multiple controllers and backend servers. Initially, lightweight detection agents embedded in each controller independently complete feature extraction of traffic data and preliminary detection of abnormal traffic within their subnets, yielding coarse-grained detection results. Subsequently, the feature data extracted by each subnet detection agent is sent to the GRU deep detection module located on the cloud-based backend server for fine-grained abnormal traffic detection, thereby enabling the detection of abnormal traffic across the entire SDN network. The structure of DCNN-GRU is shown in Figure 3.
[figure(s) omitted; refer to PDF]
A large-scale SDN is divided into multiple sub-domains, each managed by an individual controller, and the lightweight detection agent deployed in each controller is trained to complete the initial detection of abnormal traffic. Meanwhile, the spatial feature samples that these agents learn from the input traffic data are tagged with a unique network-wide identifier and mapped to the deep detection modules deployed on cloud servers. The deep detection modules carry out in-depth feature extraction and classification on the traffic feature samples from each sub-domain, thereby detecting abnormal traffic in a large-scale SDN environment.
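The two-stage data flow described above can be sketched in Python as follows. This is only an illustration of the message passing under stated assumptions: `FeatureMessage`, `agent_step`, `cloud_step`, and the stand-in functions `mean_pool`, `coarse_rule`, and `deep_rule` are hypothetical names, and the stand-ins replace the trained CNN agent and GRU module with trivial rules so that the example stays self-contained.

```python
from typing import Callable, List, NamedTuple, Tuple

class FeatureMessage(NamedTuple):
    controller_id: str      # network-wide identifier of the sending sub-domain
    features: List[float]   # compressed features produced by the local agent
    coarse_label: int       # agent's preliminary verdict: 0 normal, 1 abnormal

def agent_step(controller_id: str, raw_flow: List[float],
               extract: Callable, classify: Callable) -> FeatureMessage:
    """Runs at a controller: extract features and make a coarse decision."""
    feats = extract(raw_flow)
    return FeatureMessage(controller_id, feats, classify(feats))

def cloud_step(messages: List[FeatureMessage],
               deep_detect: Callable) -> List[Tuple[str, int]]:
    """Runs at the cloud server: fine-grained decision per received message."""
    return [(m.controller_id, deep_detect(m.features)) for m in messages]

# Trivial stand-ins for the trained CNN agent and the GRU module (assumptions).
def mean_pool(flow, size=4):
    return [sum(flow[i:i + size]) / len(flow[i:i + size])
            for i in range(0, len(flow), size)]

def coarse_rule(feats):
    return int(max(feats) > 0.8)

def deep_rule(feats):
    return int(sum(feats) / len(feats) > 0.5)

msgs = [agent_step("c1", [0.9] * 16, mean_pool, coarse_rule),
        agent_step("c2", [0.1] * 16, mean_pool, coarse_rule)]
print(cloud_step(msgs, deep_rule))
```

Only the short feature vector and the identifier leave the controller, which is the source of the communication savings claimed for the distributed design.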
3.1. Coarse-Grained Abnormal Traffic Detection Based on Lightweight CNN Agents
We have designed a lightweight detection agent based on CNNs. CNN, through training, possesses the capability to extract spatial features from samples and can roughly determine whether a sample is normal or abnormal based on these features.
CNN is a neural network structure similar to Multilayer Perceptron (MLP), commonly used to extract spatial feature information of data. It is composed of several convolution layers, pooling layers, and fully connected layers, as shown in Figure 4.
[figure(s) omitted; refer to PDF]
CNN’s neuron activation calculation can be expressed by convolution calculation of input data and model parameters:
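As an illustration of that convolution step, the following minimal Python sketch slides a kernel over a single-channel input, sums the element-wise products with a bias, and applies a ReLU activation. It uses "valid" borders for brevity, whereas Table 1 lists "Same" padding, and the function and variable names are assumptions rather than the authors' notation.

```python
def conv2d_valid(image, kernel, bias=0.0):
    """Cross-correlate a 2-D input with a 2-D kernel (no padding), then apply ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the input patch under the kernel, plus bias.
            z = bias + sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw))
            row.append(max(0.0, z))  # ReLU activation
        out.append(row)
    return out

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[-1, 0],
          [0, 1]]
print(conv2d_valid(image, kernel))
```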
Additionally, we have integrated residual transformation techniques and channel enhancement into the CNN to achieve enhancement and expansion of the sample features.
To achieve diversity in residual transformations, each residual block has a different number of layers, so that the input vector undergoes transformations ranging from simple to complex. Additionally, to keep the residual transformation controllable, direct mapping is applied only at the beginning and end of each residual pathway, with no bypasses in the intermediate network layers. Comparative experiments on detection accuracy and computational cost have shown that satisfactory results can be achieved when the residual groups are arranged in a three-level parallel connection. Therefore, a feature-enhanced residual classifier has been designed as shown in Figure 5, where BN stands for batch normalization, Compose summarizes the transformation results of each residual block, and FC represents the fully connected operation.
[figure(s) omitted; refer to PDF]
The enhanced error vector’s output through the expected transformation of the kth residual block, can be expressed as
Here,
Here,
We employ mean squared error as the loss function. Traditional L1 and L2 regularization methods only control individual weight values. However, the attribute characteristics of network traffic do not exist in isolation, so there is a certain correlation between the weight elements. For the weight matrices of multilayer neural networks:
To better utilize the correlation information between feature weights, we have designed a regularization method based on the standard deviation (SD) constraint operator, according to the formula for calculating standard deviation, to prevent the model from overfitting during training.
Here,
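One plausible reading of a standard-deviation-based penalty is sketched below in Python: the population standard deviation of the entries of each weight matrix is summed over the layers, scaled by a coefficient, and added to the training loss. This is an assumption made for illustration only; the exact form of the SD constraint operator, the coefficient `lam`, and the function names are not taken from the source.

```python
import math

def matrix_std(weights):
    """Population standard deviation over all entries of one weight matrix."""
    flat = [w for row in weights for w in row]
    mean = sum(flat) / len(flat)
    return math.sqrt(sum((w - mean) ** 2 for w in flat) / len(flat))

def sd_penalty(weight_matrices, lam=0.01):
    """Assumed regularization term: lam times the summed per-layer standard deviations."""
    return lam * sum(matrix_std(w) for w in weight_matrices)

layers = [[[0.0, 2.0], [0.0, 2.0]],   # entries spread around their mean
          [[1.0, 1.0], [1.0, 1.0]]]   # identical entries, zero spread
print(sd_penalty(layers, lam=0.5))
```

Unlike an L1 or L2 term, such a penalty depends on how the weights of a layer relate to one another rather than on each weight separately.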
The collected traffic data is a sequence of data, which is converted into image format before use. The image contains three residual blocks; hence the structure is different.
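A minimal Python sketch of one possible sequence-to-image conversion is given below, assuming min-max scaling followed by zero padding into a square single-channel grid (Table 1 lists one input channel). The scaling scheme, the padding, and the function name `to_image` are assumptions for illustration and are not taken from the source.

```python
import math

def to_image(features):
    """Scale a feature vector to [0, 1] and zero-pad it into a square grid."""
    lo, hi = min(features), max(features)
    span = (hi - lo) or 1.0                      # avoid division by zero
    scaled = [(v - lo) / span for v in features]
    side = math.isqrt(len(scaled) - 1) + 1       # smallest side with side*side >= len
    scaled += [0.0] * (side * side - len(scaled))
    return [scaled[r * side:(r + 1) * side] for r in range(side)]

grid = to_image([0, 5, 10, 5, 0])
for row in grid:
    print(row)
```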
The lightweight detection agent we designed runs on the host of each controller with low power consumption. It consists of three convolutional residual blocks, and its hyperparameter settings are shown in Table 1. The optimizer is Adam, the learning rate is 0.01, and the model is trained for 50 epochs.
Table 1
The optimal hyper-parameters used for our CNN model.
Hyper-parameters | Optimal values |
Conv2D | 2 |
Num of filters | 16, 32 |
Channel number | 1 |
Kernel size | 3 × 3 |
Padding method | Same |
Activation function | ReLU, sigmoid |
Dropout | 0.2 |
Pooling layers | Max (2 × 2) |
Using mean squared error as the loss function:
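In its standard form, with $N$ training samples, true labels $y_i$, and network outputs $\hat{y}_i$ (symbol names assumed here for illustration), the mean squared error reads:

```latex
\mathcal{L}_{\mathrm{MSE}} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^{2}
```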
In the training process, the labeled traffic sample data is used to train the CNN detection agent. Before being input into the CNN, the data is converted into image format, and then it is fed into the CNN for training. The training of CNN can be expressed as Algorithm 1.
Algorithm 1: Training of CNN methods.
Input: epochs: number of iterations over the data, adam: optimizer, data: dataset, m: batch size,
Set the parameters of the CNN by Table 1.
Initialize a parameter matrix using random values
for epoch in range epochs
Sample
end for
return
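The loop structure of Algorithm 1 (random initialization, sampling a batch of size m in each iteration, and an Adam update on a mean squared error loss) can be sketched in plain Python. To stay self-contained, the sketch trains a single sigmoid unit as a stand-in for the CNN; the stand-in model, the Adam constants, and all names are assumptions, so this is not the authors' training code.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, epochs=50, m=4, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, seed=0):
    """Mini-batch training of one sigmoid unit with MSE loss and Adam.

    `data` is a list of (features, label) pairs. The sigmoid unit only
    stands in for the CNN so that the loop can run without a framework.
    """
    rng = random.Random(seed)
    dim = len(data[0][0])
    theta = [rng.uniform(-0.1, 0.1) for _ in range(dim + 1)]  # weights + bias
    mom = [0.0] * (dim + 1)   # Adam first-moment estimate
    vel = [0.0] * (dim + 1)   # Adam second-moment estimate
    for step in range(1, epochs + 1):
        batch = rng.sample(data, min(m, len(data)))           # sample m examples
        grad = [0.0] * (dim + 1)
        for x, y in batch:
            xb = list(x) + [1.0]
            pred = sigmoid(sum(t * v for t, v in zip(theta, xb)))
            # Gradient of (pred - y)^2 through the sigmoid, averaged over the batch.
            common = 2.0 * (pred - y) * pred * (1.0 - pred) / len(batch)
            for k, v in enumerate(xb):
                grad[k] += common * v
        for k in range(dim + 1):                              # Adam update
            mom[k] = beta1 * mom[k] + (1 - beta1) * grad[k]
            vel[k] = beta2 * vel[k] + (1 - beta2) * grad[k] ** 2
            m_hat = mom[k] / (1 - beta1 ** step)
            v_hat = vel[k] / (1 - beta2 ** step)
            theta[k] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta

# Hypothetical one-feature dataset: negative values are normal (0), positive are attacks (1).
toy = [([-2.0], 0), ([-1.0], 0), ([1.0], 1), ([2.0], 1)]
theta = train(toy, epochs=200)
print(theta)
```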
The CNN, after training, can extract features from the sample space, and with these features, it can roughly determine whether a sample is normal or abnormal. This enables each CNN-based lightweight detection agent to independently detect abnormal samples within the subnet. To achieve more accurate detection results, the spatial feature data of the traffic, extracted from the last hidden layer of the CNN, is mapped to the cloud server. This allows the GRU-based deep detection module on the cloud server to perform further feature extraction and conduct fine-grained abnormal detection.
3.2. Fine-Grained Abnormal Traffic Detection Based on Improved GRU
The cloud server, with its powerful computing capabilities, performs in-depth detection on the features extracted from the traffic data by each lightweight CNN detection agent. The deep detection module is built on an RNN architecture, which can extract the time series features of data; our research uses the GRU architecture. Both GRU and LSTM networks are designed to solve the long-term dependence problem of traditional RNNs, but GRU has fewer parameters than LSTM for equivalent functionality and is easier to train. An ordinary GRU is composed of two control units: a reset gate and an update gate. The reset gate decides how to combine the new input with the previous memory, and the update gate decides how much of the previous memory is kept. Together, they decide which information is passed on and which is discarded.
We improve on the standard GRU by adding an input block. In the input block, the state information and the input data are first rectified by the Leaky ReLU activation function, then weighted and added to the input data to highlight the main feature information in the input. Compared with an ordinary GRU, this improvement extracts richer features from sequence data, which helps identify fine-grained differences in the input. The internal structure of the improved GRU is shown in Figure 6.
[figure(s) omitted; refer to PDF]
Its internal information process can be expressed as follows.
Reset gate:
Update gate:
Input block:
Candidate hidden state:
Output value:
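As a rough illustration of the gate computations named above, the following minimal sketch runs one step of a scalar GRU cell with an extra input block. The reset gate, update gate, candidate state, and output follow the standard GRU formulation. The exact form of the input block (a Leaky ReLU over a weighted sum of the input and the previous state, scaled by a hypothetical weight `alpha` and added back to the input) is an assumption based on the verbal description, not the formulas of the original paper, and all parameter names are illustrative.

```python
import math

# Hypothetical parameter names for a scalar (1-dimensional) cell.
PARAM_NAMES = (
    "w_bx", "w_bh", "b_b", "alpha",   # input block (assumed form)
    "w_rx", "w_rh", "b_r",            # reset gate
    "w_zx", "w_zh", "b_z",            # update gate
    "w_hx", "w_hh", "b_h",            # candidate hidden state
)


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def leaky_relu(v, slope=0.01):
    return v if v >= 0 else slope * v


def improved_gru_step(x, h_prev, p):
    """One time step of a scalar GRU with an assumed input block."""
    # Input block (assumption): rectify a mix of input and state,
    # weight it, and add it back to the raw input.
    block = leaky_relu(p["w_bx"] * x + p["w_bh"] * h_prev + p["b_b"])
    x_tilde = x + p["alpha"] * block

    # Standard GRU gates, fed with the enhanced input.
    r = sigmoid(p["w_rx"] * x_tilde + p["w_rh"] * h_prev + p["b_r"])
    z = sigmoid(p["w_zx"] * x_tilde + p["w_zh"] * h_prev + p["b_z"])

    # Candidate state uses the reset-gated previous state.
    h_cand = math.tanh(p["w_hx"] * x_tilde + p["w_hh"] * (r * h_prev) + p["b_h"])

    # Output: interpolate between old state and candidate
    # (one of the two common sign conventions for the update gate).
    return (1.0 - z) * h_prev + z * h_cand
```

With every parameter set to zero, both gates evaluate to 0.5 and the candidate state to 0, so the step simply halves the previous state, which gives a quick sanity check of the update rule.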
Our deep detection module consists of an improved GRU neural network with two hidden layers. The optimizer is Adam, and the activation functions are tanh, sigmoid, and Leaky ReLU. The learning rate is 0.01, and the model is trained for 50 epochs. The hyperparameter settings of the GRU are shown in Table 2, and the cross-entropy loss function is adopted as the objective function for GRU training. The GRU is trained on the feature data supplied by the different CNNs.
Table 2
The optimal hyper-parameters used for our GRU model.
Hyper-parameters | Optimal values |
Hidden layer size | 32 |
Num of layers | 2 |
Dropout | 0.2 |
Activation function | tanh, sigmoid, Leaky ReLU |
4. Experiments and Result Analysis
4.1. Evaluation Metrics
To evaluate the performance of the detection method against network attacks, we use five indicators: Accuracy, Precision, Recall, F1 value, and false alarm rate (false positive rate, FPR). They are computed from the true/false positive and negative counts defined in Table 3.
Table 3
Relationship matrix between true value and predicted value.
 | Predicted positive | Predicted negative |
Actual positive | TP | FN |
Actual negative | FP | TN |
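The five indicators can be computed directly from the four counts in Table 3. The sketch below uses the standard textbook definitions of these metrics; the function name and the zero-division handling are illustrative choices, not part of the original method.

```python
def detection_metrics(tp, fn, fp, tn):
    """Compute the five indicators from confusion-matrix counts.

    tp/fn: abnormal samples detected / missed
    fp/tn: normal samples flagged / correctly passed
    """
    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fn + fp + tn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "fpr": ratio(fp, fp + tn),  # false alarm rate
    }


# Example with made-up counts (not experimental data):
metrics = detection_metrics(tp=90, fn=10, fp=5, tn=95)
```

For these made-up counts the accuracy is 185/200 = 0.925, the recall is 0.9, and the false alarm rate is 5/100 = 0.05.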
4.2. Experimental Environment
The server used in this experiment is configured with a Core i9-12900F CPU, 128 GB RAM, an NVIDIA RTX 3090 GPU, CUDA 11.2, and PyTorch 1.8. Five hosts of different models serve as controllers; each runs the POX controller and Mininet on Ubuntu 16.04 LTS to simulate an SDN subnet, and together with the server they form a distributed SDN experimental environment with a master-slave structure. The server is connected to the controllers through routers and switches. The configuration information of each host is shown in Table 4.
Table 4
Configuration information of each host.
Name | CPU model (frequency) | Memory capacity (frequency) |
Server | Intel Core i9-12900F (2.4 GHz) | 32 GB (4800 MHz) |
Controller 1 | Intel Core i7-8750 (2.21 GHz) | 32 GB (2666 MHz) |
Controller 2 | Intel Core i7-12700H (2.3 GHz) | 32 GB (4800 MHz) |
Controller 3 | Intel Core i7-8750 (2.21 GHz) | 16 GB (2666 MHz) |
Controller 4 | AMD Ryzen R7-4800H (2.9 GHz) | 16 GB (3200 MHz) |
Controller 5 | Intel Core i7-8565U (1.80 GHz) | 16 GB (2133 MHz) |
4.3. Datasets and Data Preprocessing
Our research uses the InSDN dataset as experimental data. The InSDN dataset was released in 2020 by researchers from University College Dublin, Ireland, specifically for traffic analysis experiments in the SDN environment. It provides the original data stream files (.pcap files) and the feature files (.csv files) produced by the Flowmeter tool. According to traffic type and target host, the InSDN dataset can be divided into three groups. The first group includes only normal traffic, generated by commonly used protocols such as HTTP, HTTPS, email, SSH, and DNS, totaling 68,424 instances. The second group consists of attack traffic targeting the Metasploitable2 server, generated by five attacks: DoS, DDoS, Brute force, Probe, and U2R, totaling 138,722 instances. The third group is the attack traffic inside the Open vSwitch (OVS) server, which includes traffic generated by six attacks: BotNet, Brute force, DoS, DDoS, Probe, and Web-Attack, totaling 136,743 instances. To reduce the computational cost of abnormal detection and minimize resource consumption on the controllers, following the research results of reference [47], we select a subset of 48 features from the more than 80 features in the .csv dataset files as experimental data for model training and detection. The feature subset is shown in Table 5.
Table 5
Feature subsets of InSDN datasets.
No. | Feature name |
1 | Protocol |
2 | Flow duration |
3 | Total fwd packet |
4 | Total bwd packets |
5 | Total length of fwd packet |
6 | Total length of bwd packet |
7 | Fwd packet length min |
8 | Fwd packet length max |
9 | Fwd packet length mean |
10 | Fwd packet length std |
11 | Bwd packet length min |
12 | Bwd packet length max |
13 | Bwd packet length mean |
14 | Bwd packet length std |
15 | Flow Bytes/s |
16 | Flow packets/s |
17 | Flow IAT mean |
18 | Flow IAT std |
19 | Flow IAT max |
20 | Flow IAT min |
21 | Fwd IAT min |
22 | Fwd IAT max |
23 | Fwd IAT mean |
24 | Fwd IAT std |
25 | Fwd IAT total |
26 | Bwd IAT min |
27 | Bwd IAT max |
28 | Bwd IAT mean |
29 | Bwd IAT std |
30 | Bwd IAT total |
31 | Fwd header length |
32 | Bwd header length |
33 | FWD packets/s |
34 | Bwd packets/s |
35 | Packet length min |
36 | Packet length max |
37 | Packet length mean |
38 | Packet length std |
39 | Packet length variance |
40 | Average packet size |
41 | Active min |
42 | Active mean |
43 | Active max |
44 | Active std |
45 | Idle min |
46 | Idle mean |
47 | Idle max |
48 | Idle std |
To increase the generalization ability of the detection model, we did not include the source IP, destination IP, source port number, or destination port number in the feature set. IP addresses and port numbers usually differ greatly between network environments, so using them as training features would inevitably limit the application range of the model, which is inconsistent with our goal of a model with stronger generalization ability.
To eliminate the adverse effects of inconsistent feature magnitudes on the detection results, Min-Max normalization is used to scale each feature so that its values fall within a fixed range. The calculation method is shown in the following:
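A minimal sketch of per-feature Min-Max scaling is given here, assuming the common target range [0, 1], that is, x' = (x - x_min) / (x_max - x_min). The target range and the handling of constant columns are assumptions; the text above only states that values are scaled into a fixed range.

```python
def min_max_scale(column):
    """Scale one feature column into [0, 1] (assumed target range)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant feature carries no information; map it to 0.0
        # rather than dividing by zero (illustrative choice).
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]


# Example with made-up flow durations (not taken from InSDN):
scaled = min_max_scale([2.0, 4.0, 10.0])  # -> [0.0, 0.25, 1.0]
```

In practice the minimum and maximum would be taken from the training set only and reused for the test set, so that no test statistics leak into training.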
4.4. Efficiency
Computational complexity: We analyze the computational complexity of the cloud server and controllers for DCNN-GRU and compare it with the centralized detection method, as shown in Table 6. In the DCNN-GRU framework, the computational complexity of the cloud server is determined by the GRU architecture. In contrast, the computational complexity of each controller depends primarily on the CNN architecture.
Table 6
Computational complexity.
Communication | Detection algorithm | Computational cost |
Cloud | Centralized detection | |
DCNN-GRU | ||
SDN | Centralized detection | — |
DCNN-GRU |
For the cloud server, upon receiving the feature stream, the GRU within the cloud server needs to perform
For the SDN controller, each CNN requires
The total computational complexity of the DCNN-GRU is
[figure(s) omitted; refer to PDF]
Communication bandwidth cost: Table 7 shows the link bandwidth cost between centralized detection and our DCNN-GRU. In the DCNN-GRU, there are the following types of communications between the cloud server and the SDN controllers:
• SDN controller ⟶ Cloud server. Each SDN controller sends the extracted feature information to the cloud server during an epoch, with a size of
• Cloud server ⟶ SDN controller. For each feature stream received by the cloud server, the cloud server needs to calculate the error term and send it to the corresponding SDN controller. The size of the error term is
Table 7
Communication cost.
Communication | Centralized detection | DCNN-GRU |
SDN-> Cloud | ||
Cloud-> SDN | — |
In the centralized detection approach, the controller needs to collect a large number of raw data samples and send them to the cloud server, with a communication cost of
Storage cost: Table 8 shows the memory complexity of centralized detection and DCNN-GRU. The memory complexity on the cloud server and various SDN controllers is detailed as follows:
• Cloud server. The memory on the cloud server depends on the size of the received streams and the GRU, so the memory used on the cloud server is
• SDN controller. The memory on each distributed SDN controller mainly depends on the size of the training data, as well as the CNN, so the memory used on the SDN is
Table 8
Storage cost.
Communication | Algorithm | Memory cost |
Cloud | Centralized detection | |
DCNN-GRU | ||
SDN | Centralized detection | |
DCNN-GRU |
4.5. Results and Analysis
4.5.1. Experiment of Lightweight Detection Agent
First, the experiment uses only the lightweight detection agent based on the CNN architecture. Considering the differences among the sub-domain environments of distributed SDN networks, we randomly select samples of each type from the original dataset at a ratio of 7:3 and divide them into a training set and a test set. The sample distribution after the division is shown in Table 9. The U2R samples are excluded because they are too sparse (only 17) to be used for training and testing.
Table 9
Sample distribution of InSDN dataset after division.
Sample name | Training set | Test set |
DDoS | 85,359 | 36,583 |
DoS | 37,531 | 16,085 |
Probe | 68,690 | 29,439 |
Brute-Force | 984 | 422 |
Web-Attack | 134 | 58 |
BotNet | 115 | 49 |
Normal | 47,897 | 20,527 |
Total | 240,710 | 103,163 |
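The per-class counts in Table 9 are consistent with applying the 7:3 ratio to each class separately and rounding the training share to the nearest sample. The sketch below illustrates such a stratified split. The per-class rounding rule, the fixed seed, and the function name are assumptions made for illustration, since the text only states that samples are randomly selected at a ratio of 7:3.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_tenths=7, seed=0):
    """Split sample indices per class at train_tenths : (10 - train_tenths)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Integer round-half-up of (train_tenths / 10) * class size.
        n_train = (train_tenths * len(indices) + 5) // 10
        train.extend(indices[:n_train])
        test.extend(indices[n_train:])
    return train, test


# Class sizes are the Table 9 row sums for the two rarest classes.
labels = ["BotNet"] * 164 + ["Web-Attack"] * 192
train_idx, test_idx = stratified_split(labels)
botnet_train = sum(1 for i in train_idx if labels[i] == "BotNet")  # 115
```

Splitting per class rather than over the whole dataset keeps rare categories such as BotNet and Web-Attack represented in both sets.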
We extract samples from the feature subset five times and divide them into five groups of training set and test set data. Each lightweight detection agent carries out complete training and detection experiments with one group of datasets. Table 10 shows the experimental results of five lightweight detection agents.
Table 10
Abnormal traffic detection results of the lightweight detection agents.
Agent number | Accuracy | Precision | Recall | F1 | FPR |
first_CNN | 0.9401 | 0.9791 | 0.9454 | 0.9619 | 0.0811 |
second_CNN | 0.9506 | 0.9743 | 0.9637 | 0.9690 | 0.1023 |
third_CNN | 0.9438 | 0.9764 | 0.9528 | 0.9645 | 0.0926 |
fourth_CNN | 0.9089 | 0.9741 | 0.9105 | 0.9412 | 0.0974 |
fifth_CNN | 0.9544 | 0.9779 | 0.9649 | 0.9714 | 0.0877 |
Mean value | 0.9396 | 0.9764 | 0.9474 | 0.9616 | 0.0922 |
As can be seen, the recall rate of our lightweight detection agents on abnormal traffic samples ranges from 0.9105 to 0.9649, with an average of 0.9474, which shows that the proposed lightweight detection agent can recognize abnormal samples well with few computing resources. However, this detection method also shows a high false positive rate, ranging from 0.0811 to 0.1023 with an average of 0.0922, which would have a negative impact on the normal operation of the SDN network. Therefore, the detection results of the lightweight agents cannot be used as the ultimate basis for network security protection, and a more accurate, fine-grained detection method is needed.
4.5.2. Experiment of Deep Detection Model
In this stage of the experiment, each lightweight agent extracts the features of the sample data in advance, and the deep feature extraction and sample classification are completed by the deep detection module. We fix the parameters of the trained lightweight detection agents and train and test only the GRU-based deep detection module on the main controller. The datasets used for training and testing are randomly selected according to the method described in Section 4.3.
a. To test the effectiveness of the proposed method, we use only the feature samples generated by the trained No. 1 lightweight agent to train the GRU, and use the test set to evaluate the abnormal detection performance of the trained CNN-GRU model. To avoid the bias that may arise because the lightweight agents were trained on randomly sampled data in the first stage, we trained CNN-GRU five times and updated only the GRU parameters. The data used in each training run were randomly selected from the original feature subset in proportion. Figure 8 compares the abnormal detection results of CNN-GRU with those of the CNN lightweight detection agent alone. The detection performance of first_CNN-GRU is better than that of first_CNN alone. The accuracy improved from 0.9401 to 0.9816, an increase of 4.41%, and the recall rate increased from 0.9454 to 0.9831, an increase of 3.99%. The precision and F1 value also increased, by 1.51% and 2.75%, respectively. The false alarm rate that we care about dropped significantly, from 0.0811 to 0.0244, a decrease of 69.97%. This shows that the CNN-GRU joint detection method is effective in finding abnormal samples in traffic data and can obtain more latent features of the data than a single detection mode, thus achieving better detection performance than CNN alone.
b. To further understand the detection performance of the proposed method in different network environments, the remaining four CNN lightweight agents are each paired with a GRU and trained and tested following the method of Experiment A. The detection results of each CNN and GRU combination on the various sample types are recorded in Table 11.
As can be seen from Table 11, the combination of CNN lightweight detection agents trained with different data and GRU will produce different detection effects. For example, the first_CNN-GRU model has a better detection effect on abnormal samples generated by DDoS and Probe attacks; the second_CNN-GRU model is good at detecting samples generated by DoS attacks and BotNet attacks. The fourth_CNN-GRU and fifth_CNN-GRU have excellent detection performance for Brute-Force and Web-Attack samples, respectively. This is because different training data are used in modeling CNN, which makes the trained CNN biased toward a certain kind of samples, which leads to the preference of the detection results combined with GRU. Overall, the combination of CNN and GRU can extract more abundant feature information from traffic samples, which helps to achieve fine-grained classification of sample categories and shows better detection results than using the CNN lightweight agent alone.
c. Comparison of different numbers of CNN detection results.
At this stage, we discuss how combinations of different numbers of CNNs with the GRU affect the detection of abnormal samples. Since the combination of first_CNN and GRU has already been tested, we next combine first_CNN, second_CNN, and GRU to analyze the feature data mapped to the server by first_CNN and second_CNN at the same time. We call this model DCNN-GRU-2. Similarly, the combined model constructed from first_CNN, second_CNN, third_CNN, and GRU is DCNN-GRU-3, the model constructed from first_CNN, second_CNN, third_CNN, fourth_CNN, and GRU is DCNN-GRU-4, and DCNN-GRU-5 is defined in the same way. The five models are trained on the same data for 50 epochs. During training, the CNN parameters remain fixed and only the GRU parameters are updated. During detection, the test set is divided into equal parts according to the number of CNNs, and each part is input into the lightweight agent of one SDN controller. Each CNN only performs feature extraction, and the generated feature data is input into the GRU for in-depth analysis and detection. The experimental data are the same as those used in Experiment B. Figure 9 shows the abnormal detection performance of the five models after training and compares it with the detection performance of CNN alone.
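The division of the test set among the agents can be sketched as follows. This is an illustration only: the function name is hypothetical, and giving the leftover samples to the first agents when the set does not divide evenly is an assumption, since the text only says the test set is divided into equal parts.

```python
def partition_equally(samples, num_agents):
    """Split samples into num_agents contiguous, near-equal parts."""
    base, extra = divmod(len(samples), num_agents)
    parts, start = [], 0
    for agent in range(num_agents):
        # The first `extra` agents receive one additional sample.
        size = base + (1 if agent < extra else 0)
        parts.append(samples[start:start + size])
        start += size
    return parts


# The test set in Table 9 has 103,163 samples; DCNN-GRU-5 uses 5 agents.
test_set = range(103163)
sizes = [len(part) for part in partition_equally(test_set, 5)]
# sizes == [20633, 20633, 20633, 20632, 20632]
```

Each part would then be passed through one frozen CNN agent, and the resulting feature vectors would be sent to the GRU on the cloud server.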
[figure(s) omitted; refer to PDF]
Table 11
Precision, recall, and F1-score of the different CNN-GRU models on the InSDN subset.
Model | Results | DDoS | DoS | Probe | Brute-Force | Web-Attack | BotNet |
first_CNN-GRU | Precision | 0.9932 | 0.9931 | 0.9958 | 0.9513 | 0.9643 | 1.0000 |
first_CNN-GRU | Recall | 0.9784 | 0.9857 | 0.9885 | 0.9265 | 0.9310 | 0.8980 |
first_CNN-GRU | F1-score | 0.9857 | 0.9894 | 0.9921 | 0.9388 | 0.9474 | 0.9462 |
second_CNN-GRU | Precision | 0.9804 | 0.9937 | 0.9756 | 0.9745 | 0.9206 | 0.9792 |
second_CNN-GRU | Recall | 0.9590 | 0.9876 | 0.9796 | 0.9052 | 1.0000 | 0.9592 |
second_CNN-GRU | F1-score | 0.9696 | 0.9906 | 0.9776 | 0.9386 | 0.9587 | 0.9691 |
third_CNN-GRU | Precision | 0.9831 | 0.9811 | 0.9790 | 0.9692 | 0.9344 | 0.9245 |
third_CNN-GRU | Recall | 0.9535 | 0.9689 | 0.9830 | 0.8934 | 0.9828 | 1.0000 |
third_CNN-GRU | F1-score | 0.9681 | 0.9750 | 0.9810 | 0.9297 | 0.9580 | 0.9608 |
fourth_CNN-GRU | Precision | 0.9807 | 0.9782 | 0.9890 | 0.9951 | 1.0000 | 0.9038 |
fourth_CNN-GRU | Recall | 0.9727 | 0.9751 | 0.9864 | 0.9526 | 0.9655 | 0.9592 |
fourth_CNN-GRU | F1-score | 0.9767 | 0.9767 | 0.9877 | 0.9734 | 0.9825 | 0.9307 |
fifth_CNN-GRU | Precision | 0.9671 | 0.9723 | 0.9870 | 1.0000 | 1.0000 | 1.0000 |
fifth_CNN-GRU | Recall | 0.9645 | 0.9813 | 0.9810 | 0.8815 | 0.9828 | 0.9184 |
fifth_CNN-GRU | F1-score | 0.9658 | 0.9768 | 0.9840 | 0.9370 | 0.9913 | 0.9574 |
Note: The bold values indicate the optimal values obtained by the model when detecting the corresponding types of anomalies.
[figure(s) omitted; refer to PDF]
The results show that detection performance does not improve or degrade monotonically as the number of distributed CNNs increases, but fluctuates to some extent. Although DCNN-GRU-3 performs best among these models, this does not mean that DCNN-GRU-3 is the best model in general: the result holds only for the current dataset, and the detection ability on other datasets is unknown. The detection model therefore needs to be deployed according to the actual structure of the SDN network. What is clear is that the DCNN-GRU detection model can effectively detect abnormal samples in traffic data and that it outperforms the CNN model with a single architecture.
In addition, we count the classification results obtained by the cloud server for the feature data supplied by the CNN lightweight agent of each controller, as shown in Figure 10. Some differences remain in how each CNN-GRU combination classifies traffic. For example, first_CNN-GRU is excellent at detecting DDoS and DoS samples but ranks last on Probe samples, whereas second_CNN-GRU has the highest detection rate for Probe samples but performs less well on other samples. Generally speaking, however, the cloud server performs well on the subnets where the controllers are located. We also compare these results with those of the individual training and testing in Experiment A, as shown in Table 12.
[figure(s) omitted; refer to PDF]
Table 12
Optimal classification samples of each CNN-GRU model in separate detection and distributed detection.
Model | Separate detection | Distributed detection |
first_CNN-GRU | DDoS, Probe | DDoS, DoS, Brute-Force |
second_CNN-GRU | DoS, BotNet | Probe, BotNet |
third_CNN-GRU | Web-Attack | Web-Attack |
fourth_CNN-GRU | Brute-Force | Web-Attack, BotNet |
fifth_CNN-GRU | Web-Attack | Web-Attack, Brute-Force |
The performance of a given combination is not consistent between separate detection and distributed detection. For example, first_CNN-GRU detects DDoS and Probe samples well in separate detection, but in distributed detection it detects DDoS, DoS, and Brute-Force samples well and has the worst detection performance on Probe samples. This is because, after the joint training of several CNNs and the GRU, the GRU is no longer optimized only for the data provided by one CNN but seeks a common optimal solution over the feature information provided by several CNNs. As a result, some sample types that a model originally detected well are detected less well in distributed detection.
In addition, we recorded the training and testing time of all models, as shown in Figure 11. In general, the time consumed by model detection depends on the size of the dataset, the type of neural network, the complexity of the model, and the experimental environment. The training time of DCNN-GRU-5 is the longest, but its testing time is the shortest. During training, each CNN receives the same amount of training data, so DCNN-GRU-5 produces the most feature data and takes the longest to update the GRU parameters. During testing, the total number of samples is the same for every model, and DCNN-GRU-5 has five feature-preprocessing entry points, so it completes feature pre-extraction more quickly. Moreover, because the GRU runs on a high-performance server that can process large amounts of data, the overall testing time of DCNN-GRU-5 is the lowest. This distributed detection design reduces the communication and resource requirements of abnormal detection, so that running the lightweight detection agent does not place excessive load on the SDN subnet. It also makes the most of the extracted features, so that the cloud server hosting the main controller avoids the heavy computing cost of centrally processing large-scale traffic data, which balances the detection load across the entire SDN network. Our proposed method has a clear advantage in detection time when the data scale is the same.
[figure(s) omitted; refer to PDF]
4.5.3. Comparison With Other Detection Methods
We compare our proposed detection method with several traditional abnormal traffic detection methods in SDN, including V-NKDE [25], CNN-SoftMax [35], CNN-LSTM [36], LSTM-AE-OC-SVM [42], and EGBAD [46]. Compared with CNN-SoftMax, our method adds a GRU structure. It shares a similar neural network architecture with CNN-LSTM, but we use GRU instead of LSTM and implement a distributed detection mechanism. It differs from LSTM-AE-OC-SVM not only in the type of recurrent structure used but also in the classification approach: we do not employ an SVM for classification and instead rely solely on a classification function to categorize the samples. This approach significantly reduces the number of model parameters and enhances computational efficiency. Additionally, V-NKDE is a detection method that integrates multiple machine learning techniques, and EGBAD also employs a distributed detection approach. We trained each model and performed abnormal detection on the divided dataset, in the format required by each model. Table 13 shows the performance comparison between our method and the traditional detection methods. Our method achieves an accuracy of 0.9816, a precision of 0.9939, a recall of 0.9831, an F1-score of 0.9884, and a false positive rate of only 0.0244 in the detection experiments on the InSDN dataset. Compared to traditional detection methods, our approach has higher detection accuracy and a lower false positive rate. For instance, the recall rate for detecting abnormal samples is improved by 0.12% compared to V-NKDE, by 1.79% compared to CNN-SoftMax, by 1.38% compared to CNN-LSTM, by 0.87% compared to LSTM-AE-OC-SVM, and by 0.13% compared to EGBAD. There are also significant improvements in other accuracy metrics.
Table 13
Detection accuracy of different detection methods.
Model | Accuracy | Precision | Recall | F1 | FPR |
V-NKDE | 0.9800 | 0.9931 | 0.9818 | 0.9875 | 0.0273 |
CNN-SoftMax | 0.9529 | 0.9772 | 0.9637 | 0.9704 | 0.0906 |
CNN-LSTM | 0.9558 | 0.9784 | 0.9661 | 0.9722 | 0.0857 |
LSTM-AE-OC-SVM | 0.9606 | 0.9809 | 0.9697 | 0.9753 | 0.0760 |
DCNN-GRU | 0.9816 | 0.9939 | 0.9831 | 0.9884 | 0.0244 |
4.5.4. Detection Performance on Different Datasets
To verify the generalization ability of the proposed method, we have trained the DCNN-GRU model completely on various datasets and evaluated its performance using its test sets. The datasets include:
UNSW-NB15 [47]: The Australian Centre for Cyber Security produced this dataset. It includes typical attack cases such as DoS, Fuzzers, Worms, and Shellcode. We use a subset of this dataset.
CIC-IDS2017 [48]: The Canadian Institute for Cybersecurity (CIC) produced this dataset in 2017. Attack types such as Brute-Force, Port scanning, BotNets, DoS, DDoS, and networking are included in this dataset.
CSE-CIC-IDS2018 [49]: A collaborative project between the Communications Security Establishment (CSE) and the Canadian Institute for Cybersecurity (CIC). The dataset includes different attack scenarios: Brute-Force, Heartbleed, BotNet, DoS, DDoS, cyberattacks, infiltration of the network from inside, etc.
CIC-DDoS2019 [50]: Created by Sharafaldin et al., this dataset contains benign traffic and the most up-to-date common DDoS attacks, similar to real-world data. Different modern reflective DDoS attacks such as PortMap, NetBIOS, LDAP, MSSQL, UDP, UDP-Lag, SYN, NTP, DNS, and SNMP are included in this dataset. We extract 24,000 cases from each of 10 kinds of attack samples (240,000 in total), combine them with 240,000 normal samples to form a new dataset, and randomly select 70% as the training set and the remaining 30% as the test set.
CIC-IDS 2017, CSE-CIC-IDS 2018, and CIC-DDoS 2019 all use 48 feature data to construct datasets according to the method in Section 4.3. For UNSW-NB 15, 48 features other than tags are used to construct experimental data. The data design of each dataset is shown in Table 14.
Table 14
Data design of various datasets.
Dataset | Attack type | Original Features | Use features | Normal | Abnormal | Training | Testing | Abnormal samples (%) |
UNSW-NB15 | DoS, fuzzers, port scans, reconnaissance, worms, shellcode, exploits, generic, backdoors | 49 | 48 | 93,000 | 164,673 | 82,332 | 175,341 | 63.91 |
CIC-IDS2017 | Brute-Force, portscan, BotNet, Dos, DDoS, Web, infiltration | 83 | 48 | 416,322 | 288,923 | 493,670 | 211,575 | 40.97 |
CSE-CIC-IDS2018 | Brute-Force, portscan, SlowHTTPTest, Dos, LOIC, SQL injection, infiltration, DDoS, BotNet, Web-Attack | 83 | 48 | 1,061,827 | 240,635 | 911,723 | 390,739 | 18.47 |
CIC-DDoS2019 | MSSQL, NetBIOS, NTP, SNMP, SSDP, SYN, TFTP, UDP, UDP-Lag, Web DDoS, DNS, LDAP | 87 | 48 | 240,000 | 240,000 | 336,000 | 144,000 | 50.00 |
Figure 12 is obtained by fully training the model on each dataset and evaluating its detection performance on the corresponding test set. It records the detection ability of DCNN-GRU models of different scales on each dataset. Each model performs well on the Precision index for UNSW-NB15, which is due to the large proportion of abnormal samples in UNSW-NB15. For CSE-CIC-IDS2018, the detection results of the various models differ considerably. For example, the recall rate of abnormal samples is 0.9429 for 2CNN-GRU, while it is 0.9823 for 5CNN-GRU, which is the maximum of all detection indicators. The reason is that abnormal samples account for only 18.47% of CSE-CIC-IDS2018, and this imbalance introduces some randomness into the detection results. Although there are gaps between the models on individual datasets, the overall detection results of the DCNN-GRU series are satisfactory. Except for the recall rate of 2CNN-GRU on the CIC-IDS2017 dataset below 95%, all other detection indicators are above 95%. This shows that the proposed detection method adapts well to traffic data with different network backgrounds in the SDN environment and that the model has strong generalization ability.
[figure(s) omitted; refer to PDF]
Additionally, we have recorded the detection time expenditure of DCNN-GRU on four datasets, as shown in Figure 13.
[figure(s) omitted; refer to PDF]
It can be observed from the figure that DCNN-GRU takes 96 s for detection on the CIC-DDoS2019 dataset, 266 s on the UNSW-NB15 dataset, 657 s on the CIC-IDS2017 dataset, and 936 s on the CSE-CIC-IDS2018 dataset. This indicates that DCNN-GRU maintains high detection efficiency when processing large-scale datasets.
5. Conclusions
This manuscript proposes an abnormal traffic detection method based on the distributed deep learning framework DCNN-GRU for the large-scale SDN environment. The detection task is completed through the cooperation of the lightweight detection agents and the deep detection module. Each lightweight detection agent is deployed as a CNN-based feature extractor on the host (or server) where an SDN controller is located and pre-extracts the traffic features of that controller's subnet. The extracted feature data is then mapped over the communication link to the deep detection module hosted on the cloud server. The deep detection module, based on the improved GRU framework, performs further feature extraction on the pre-extracted feature data and accurately identifies abnormal flows. In this way, the feature extraction task of abnormal traffic detection in a multicontroller large-scale SDN is assigned to the subnets, and the server hosted in the cloud completes the final detection task. This distributed design avoids the excessive resource consumption caused by concentrating the large-scale network abnormal detection task on a single detection host and effectively avoids creating system bottleneck nodes. In addition, transmitting only the pre-extracted feature data occupies less bandwidth than transmitting the original traffic data, which alleviates the communication pressure and helps maintain the normal operation of the entire SDN network. The experiments show that the proposed method can detect the abnormal samples in the experimental data in a timely and accurate manner and has higher detection accuracy and a lower false positive rate than the traditional detection methods.
As future work, we are exploring solutions for complex and dynamic SDNs to address the challenges of emerging attack types and complex attack samples in real networks such as the IoT and vehicular networks. We are particularly focusing on technologies that can detect attacks spanning multiple network domains, especially those that can identify and respond to complex attack chains that span numerous subnets or network slices, to enhance the generalization capability and adaptability of the detection methods. This will strengthen the security and stability of SDN networks in the face of evolving cyber threats.
Funding
This work was jointly supported by the National Natural Science Foundation of China (No. 62102422) and the Scientific and Technological Key Project of Henan Province (No. 242102211070).
[1] B. Alhijawi, S. Almajali, H. Elgala, H. Bany Salameh, M. Ayyash, "A Survey on DoS/DDoS Mitigation Techniques in SDNs: Classification, Comparison, Solutions, Testing Tools and Datasets," Computers & Electrical Engineering, vol. 99, DOI: 10.1016/j.compeleceng.2022.107706, 2022.
[2] M. P. Singh, A. Bhandari, "New-flow Based DDoS Attacks in SDN: Taxonomy, Rationales, and Research Challenges," Computer Communications, vol. 154, pp. 509-527, DOI: 10.1016/j.comcom.2020.02.085, 2020.
[3] J. Vergara, C. Garzón, J. F. Botero, "A Hybrid Strategy for DoS Attacks Detection and Mitigation on SDN Enabled Real Scenarios," Lecture notes in Networks and Systems, pp. 705-714, DOI: 10.1007/978-981-99-3091-3_58, 2023.
[4] J. Bhayo, S. A. Shah, S. Hameed, A. Ahmed, J. Nasir, D. Draheim, "Towards a Machine Learning-Based Framework for DDOS Attack Detection in Software-Defined IoT (SD-IoT) Networks," Engineering Applications of Artificial Intelligence, vol. 123 no. Part C, DOI: 10.1016/j.engappai.2023.106432, 2023.
[5] J. Singh, S. Behal, "Detection and Mitigation of DDoS Attacks in SDN: A Comprehensive Review, Research Challenges and Future Directions," Computer Science Review, vol. 37, DOI: 10.1016/j.cosrev.2020.100279, 2020.
[6] X. Duan, Yu Fu, K. Wang, L. I. Bin, "LDoS Attack Detection Method Based on Simple Statistical Features," Journal on Communications, vol. 43, 2022.
[7] T. Jafarian, M. Masdari, A. Ghaffari, K. Majidzadeh, "A Survey and Classification of the Security Anomaly Detection Mechanisms in Software Defined Networks," Cluster Computing, vol. 24 no. 2, pp. 1235-1253, DOI: 10.1007/s10586-020-03184-1, 2020.
[8] G. Oluchi Anyanwu, C. I. Nwakanma, J. M. Lee, D. S. Kim, "Optimization of RBF-SVM Kernel Using Grid Search Algorithm for DDoS Attack Detection in SDN-Based VANET," IEEE Internet of Things Journal, vol. 10 no. 10, pp. 8477-8490, DOI: 10.1109/jiot.2022.3199712, 2023.
[9] K. Wang, Yu Fu, X. Duan, T. Liu, "Detection and Mitigation of DDoS Attacks Based on Multi-Dimensional Characteristics in SDN," Scientific Reports, vol. 14 no. 1, DOI: 10.1038/s41598-024-66907-z, 2024.
[10] W. Jiang, H. Han, M. He, W. Gu, "ML-Based Pre-deployment SDN Performance Prediction with Neural Network Boosting Regression," Expert Systems with Applications, vol. 241, DOI: 10.1016/j.eswa.2023.122774, 2024.
[11] M. Cheng, Q. Li, J. Lv, W. Liu, J. Wang, "Multi-Scale LSTM Model for BGP Anomaly Classification," IEEE Transactions on Services Computing, vol. 14 no. 3, pp. 765-778, DOI: 10.1109/tsc.2018.2824809, 2021.
[12] W. Jiang, "Graph-based Deep Learning for Communication Networks: A Survey," Computer Communications, vol. 185, pp. 40-54, DOI: 10.1016/j.comcom.2021.12.015, 2022.
[13] R. Kumar, N. Agrawal, "Software Defined Networks (SDNs) for Environmental Surveillance: A Survey," Multimedia Tools and Applications, vol. 83 no. 4, pp. 11323-11365, DOI: 10.1007/s11042-023-15729-8, 2024.
[14] K. Bavani, M. P. Ramkumar, E. G. S. R. Selvan, "Statistical Approach Based Detection of Distributed Denial of Service Attack in A Software Defined Network," International Conference on Advanced Computing, pp. 380-385.
[15] K. Jia, J. Wang, F. Liu, "DDoS Detection and Mitigation Mechanism in SDN Environment," Journal of Information Security, vol. 6 no. 01, pp. 17-31, 2021.
[16] R. Li, B. Wu, "Early Detection of DDoS Based on φ-Entropy in SDN Networks," 2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), vol. 1, pp. 731-735, 2020.
[17] A. Mishra, N. Gupta, B. B. Gupta, "Defense Mechanisms against DDoS Attack Based on Entropy in SDN-Cloud Using POX Controller," Telecommunication Systems, vol. 77 no. 1, pp. 47-62, DOI: 10.1007/s11235-020-00747-w, 2021.
[18] S. M. Mousavi, M. St-Hilaire, "Early Detection of DDoS Attacks against Software Defined Network Controllers," Journal of Network and Systems Management, vol. 26 no. 3, pp. 573-591, DOI: 10.1007/s10922-017-9432-1, 2017.
[19] N. Do Van, L. D. Huy, C. Q. Truong, B. T. Ninh, D. T. Thai Mai, "Applying Dynamic Threshold in SDN to Detect DDoS Attacks," 2022 International Conference on Advanced Technologies for Communications (ATC), pp. 344-349, DOI: 10.1109/atc55345.2022.9943031.
[20] J. Cui, J. Zhang, J. He, H. Zhong, Y. Lu, "DDoS Detection and Defense Mechanism for SDN Controllers with K-Means," IEEE/ACM International Conference Utility and Cloud Computing, pp. 394-401, DOI: 10.1109/ucc48980.2020.00062.
[21] M. Zolotukhin, S. Kumar, T. Hamalainen, "Reinforcement Learning for Attack Mitigation in SDN-Enabled Networks," 2020 6th IEEE Conference on Network Softwarization (NetSoft), pp. 282-286, DOI: 10.1109/netsoft48620.2020.9165383, 2020.
[22] M. N. Jasim, M. T. Gaata, "K-means Clustering-Based Semi-supervised for DDoS Attacks Classification," Bulletin of Electrical Engineering and Informatics, vol. 11 no. 6, pp. 3570-3576, DOI: 10.11591/eei.v11i6.4353, 2022.
[23] L. Boero, M. Marchese, S. Zappatore, "Support Vector Machine Meets Software Defined Networking in IDS Domain," 2017 29th International Teletraffic Congress (ITC 29), vol. 3, pp. 25-30, DOI: 10.23919/itc.2017.8065806, 2017.
[24] Q. Cheng, C. Wu, H. Zhou, "Machine Learning Based Malicious Payload Identification in Software-Defined Networking," Journal of Network and Computer Applications, vol. 192, DOI: 10.1016/j.jnca.2021.103186, 2021.
[25] O. E. Tayfour, M. N. Marsono, "Collaborative Detection and Mitigation of DDoS in Software-Defined Networks," The Journal of Supercomputing, vol. 77 no. 11, pp. 13166-13190, DOI: 10.1007/s11227-021-03782-9, 2021.
[26] J. Li, Z. Zhao, R. Li, H. Zhang, "AI-Based Two-Stage Intrusion Detection for Software Defined IoT Networks," IEEE Internet of Things Journal, vol. 6 no. 2, pp. 2093-2102, DOI: 10.1109/jiot.2018.2883344, 2019.
[27] A. Sebbar, K. Zkik, Y. Baddi, M. Boulmalf, M. D. E. C. E. Kettani, "MitM Detection and Defense Mechanism CBNA-RF Based on Machine Learning for Large-Scale SDN Context," Journal of Ambient Intelligence and Humanized Computing, vol. 11 no. 12, pp. 5875-5894, DOI: 10.1007/s12652-020-02099-4, 2020.
[28] A. Banitalebi Dehkordi, M.R. Soltanaghaei, F. Z. Boroujeni, "The DDoS Attacks Detection through Machine Learning and Statistical Methods in SDN," The Journal of Supercomputing, vol. 77 no. 3, pp. 2383-2415, DOI: 10.1007/s11227-020-03323-w, 2020.
[29] M. F. Akbaş, C. Güngör, E. Karaarslan, "Usage of Machine Learning Algorithms for Flow Based Anomaly Detection System in Software Defined Networks," Advances in Intelligent Systems and Computing, pp. 1156-1163, DOI: 10.1007/978-3-030-51156-2_135, 2020.
[30] S. Dev, A. D. Jurcut, "Machine-Learning Techniques for Detecting Attacks in SDN," 2019 IEEE 7th International Conference on Computer Science and Network Technology (ICCSNT), arXiv:1910.00817, pp. 277-281.
[31] P. T. Dinh, M. Park, "BDF-SDN: A Big Data Framework for DDoS Attack Detection in Large-Scale SDN-Based Cloud," 2021 IEEE Conference on Dependable and Secure Computing (DSC), DOI: 10.1109/dsc49826.2021.9346269, 2021.
[32] W. Li, X. Guo, Y. Yuan, "Novel Scenes & Classes: Towards Adaptive Open-Set Object Detection," ICCV, pp. 15734-15744, 2023.
[33] F. Mohades Deilami, H. Sadr, M. Tarkhan, "Contextualized Multidimensional Personality Recognition Using Combination of Deep Neural Network and Ensemble Learning," Neural Processing Letters, vol. 54 no. 5, pp. 3811-3828, DOI: 10.1007/s11063-022-10787-9, 2022.
[34] Z. Khodaverdian, H. Sadr, S. A. Edalatpanah, M. Nazari, "An Energy Aware Resource Allocation Based on Combination of CNN and GRU for Virtual Machine Selection," Multimedia Tools and Applications, vol. 83 no. 9, pp. 25769-25796, DOI: 10.1007/s11042-023-16488-2, 2023.
[35] J. Kim, J. Kim, H. Kim, M. Shim, E. Choi, "CNN-based Network Intrusion Detection against Denial-Of-Service Attacks," Electronics, vol. 9, DOI: 10.3390/electronics9060916, 2020.
[36] I. I. Kurochkin, S. S. Volkov, "Using GRU Based Deep Neural Network for Intrusion Detection in Software-Defined Networks," IOP Conference Series: Materials Science and Engineering, vol. 927 no. 1, DOI: 10.1088/1757-899x/927/1/012035, 2020.
[37] M. Abdallah, N.-An Le-Khac, H. Jahromi, A. Delia Jurcut, "A Hybrid CNN-LSTM Based Approach for Anomaly Detection Systems in SDNs," Availability, Reliability and Security, vol. 34, 2021.
[38] Q. Niyaz, W. Sun, A. Y. Javaid, "A Deep Learning Based DDoS Detection System in Software-Defined Networking (SDN)," EAI Endorsed Transactions on Security and Safety, arXiv:1611.07400, 2017.
[39] S. Maeda, A. Kanai, S. Tanimoto, T. Hatashima, K. Ohkubo, "A Botnet Detection Method on SDN Using Deep Learning," 2019 IEEE International Conference on Consumer Electronics (ICCE), DOI: 10.1109/icce.2019.8662080, 2019.
[40] D. Javeed, T. Gao, M. T. Khan, "SDN-enabled Hybrid DL-Driven Framework for the Detection of Emerging Cyber Threats in IoT," Electronics, vol. 10, DOI: 10.3390/electronics10080918, 2021.
[41] S. Khan, A. Akhunzada, "A Hybrid DL-Driven Intelligent SDN-Enabled Malware Detection Framework for Internet of Medical Things (IoMT)," Computer Communications, vol. 170, pp. 209-216, DOI: 10.1016/j.comcom.2021.01.013, 2021.
[42] M. P. Novaes, L. F. Carvalho, J. Lloret, M. L. Proença, "Adversarial Deep Learning Approach Detection and Defense Against DDoS Attacks in SDN Environments," Future Generation Computer Systems, vol. 125, pp. 156-167, DOI: 10.1016/j.future.2021.06.047, 2021.
[43] A. AlEroud, G. Karabatis, "SDN-GAN: Generative Adversarial Deep NNs for Synthesizing Cyber Attacks on Software Defined Networks," Lecture Notes in Computer Science, pp. 211-220, DOI: 10.1007/978-3-030-40907-4_23, 2020.
[44] M. Said Elsayed, Nhien-An Le-Khac, S. Dev, A. D. Jurcut, "Network Anomaly Detection Using LSTM Based Autoencoder," International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems, pp. 37-45.
[45] J. Shu, L. Zhou, W. Zhang, X. Du, M. Guizani, "Collaborative Intrusion Detection for VANETs: A Deep Learning-Based Distributed SDN Approach," 2021.
[46] D. A. Ezeh, J. de Oliveira, "An SDN Controller-Based Framework for Anomaly Detection Using a GAN Ensemble Algorithm," Infocommunications Journal, vol. 15 no. 2, pp. 29-36, DOI: 10.36244/icj.2023.2.5, 2023.
[47] P. Krishnan, S. Duttagupta, K. Achuthan, "VARMAN: Multi-Plane Security Framework for Software Defined Networks," Computer Communications, vol. 148, pp. 215-239, DOI: 10.1016/j.comcom.2019.09.014, 2019.
[48] N. Moustafa, J. Slay, G. Creech, "Novel Geometric Area Analysis Technique for Anomaly Detection Using Trapezoidal Area Estimation on Large-Scale Networks," IEEE Transactions on Big Data, vol. 5 no. 4, pp. 481-494, DOI: 10.1109/tbdata.2017.2715166, 2019.
[49] I. Sharafaldin, A. H. Lashkari, A. A. Ghorbani, "Toward Generating A New Intrusion Detection Dataset and Intrusion Traffic Characterization," International Conference on Information Systems Security and Privacy, pp. 108-116.
[50] I. Sharafaldin, A. H. Lashkari, S. Hakak, A. A. Ghorbani, "Developing Realistic Distributed Denial of Service (DDoS) Attack Dataset and Taxonomy," International Carnahan Conference on Security Technology, DOI: 10.1109/ccst.2019.8888419.
Copyright © 2025 Xueyuan Duan et al. International Journal of Intelligent Systems published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution License (the “License”), which permits use, distribution and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0/
Abstract
Centralized, single-architecture abnormal traffic detection in a Software Defined Network (SDN) consumes massive computational and network resources and may degrade the network's quality of service. To address this, this paper proposes a large-scale abnormal traffic detection method for SDN based on a Distributed Convolutional Neural Network and Gated Recurrent Unit (DCNN-GRU) architecture. The method deploys a lightweight CNN-based detection agent on each controller to perform preliminary traffic feature extraction. The extracted feature data are then fed into a GRU-based deep detection model hosted in the cloud for collaborative training, which completes the final anomaly detection task. Because feature extraction is distributed across multiple controllers, the cloud server only needs to relearn and classify the extracted feature data, which costs less than extracting feature information directly from the original traffic data and occupies less bandwidth than transmitting complete data packets. Experiments show that the method achieves an anomaly detection accuracy of 0.9939, a recall of 0.9831, and a false alarm rate of only 0.0244, yielding higher precision and a lower false alarm rate than traditional detection methods.
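As a reading aid for the reported figures, the following minimal sketch computes accuracy, precision, recall, and false alarm rate from binary predictions. It assumes the standard confusion-matrix definitions with "abnormal" as the positive class; this section does not state the exact formulas used in the paper, and the names `ConfusionCounts`, `confusion_counts`, and `detection_metrics` as well as the toy labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts; 'abnormal' is the positive class."""
    tp: int  # abnormal traffic flagged as abnormal
    fp: int  # normal traffic flagged as abnormal (false alarm)
    tn: int  # normal traffic passed as normal
    fn: int  # abnormal traffic missed


def confusion_counts(y_true, y_pred):
    """Count outcomes for labels encoded as 1 = abnormal, 0 = normal."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    """Return 0.0 instead of raising when a class is absent."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(c):
    """Standard definitions (assumed, not taken from the paper)."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "precision": _ratio(c.tp, c.tp + c.fp),
        "recall": _ratio(c.tp, c.tp + c.fn),
        "false_alarm_rate": _ratio(c.fp, c.fp + c.tn),
    }


if __name__ == "__main__":
    # Toy labels, invented purely for illustration.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
    for name, value in detection_metrics(confusion_counts(y_true, y_pred)).items():
        print(f"{name}: {value:.4f}")
```

Under these definitions, the false alarm rate is the share of normal traffic that is wrongly flagged, so it can remain low even when abnormal samples are rare, whereas accuracy alone can be misleading on imbalanced traffic.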
Details
1 College of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, Henan, China; Department of Information Security, Naval University of Engineering, Wuhan 430033, Hubei, China; Henan Key Laboratory of Analysis and Applications of Education Big Data, Xinyang Normal University, Xinyang 464000, China
2 School of Information and Communication Engineering, Xinyang Vocational and Technical College, Xinyang 464000, Henan, China
3 Department of Information Security, Naval University of Engineering, Wuhan 430033, Hubei, China
4 Department of Operational Research and Programming, Naval University of Engineering, Wuhan 430033, China
5 Department of Economic Crime Investigation, Henan Police College, Zhengzhou 450046, Henan, China