Introduction analysis
The Internet of Things (IoT) comprises a set of interconnected devices that observe and collect data from the environment and communicate via the Internet [1]. It consists of sensors spanning sectors such as smart homes, manufacturing, transportation, and complex machines, which improve overall communication in various real-time applications [2]. As the number of devices increases, the network faces cybersecurity issues such as distributed denial-of-service (DDoS) attacks [3]. A DDoS attack is typically launched from a botnet that targets the network and floods it with requests, disrupting network traffic. These issues occur frequently because IoT devices lack security features, have low capacity, and run on low power, which increases the number of DDoS attacks during data transmission [4]. Detecting and mitigating such attacks is challenging due to the large number of IoT devices, their diversity, the variety of communication protocols, and real-time processing requirements. Numerous DDoS detection methods are in use, but they often lack the scalability, accuracy, and privacy protection [5] required in the IoT networking environment. In centralized DDoS detection systems, sensitive information may be endangered by the need to aggregate massive traffic volumes from multiple devices. The devices may also be sending sensitive or proprietary data, and sending that data to a centralized server for analysis exposes it to privacy risks or unauthorized access. Current DDoS detection solutions [6] are insufficient to tackle the complexity of IoT networks. The existing techniques are based on measuring traffic trends from a central point of contact. Conventional methods [7] cannot offer a complete representation of the growing attack vectors due to the dynamic nature of the network. Moreover, the broad spectrum of capabilities of IoT devices results in a range of traffic behaviors, which makes attacks even harder to detect.
Denial-of-service attacks [8] are likely to employ advanced methods that mimic regular traffic shapes in IoT environments. The scalability of DDoS detection methods is another main issue. The density of connected devices in IoT networks raises the computational burden on the centralized detection system [9], which can become a bottleneck. Most existing systems cannot scale smoothly to the ever-increasing number of IoT devices without performance degradation. Traditional DDoS detection and mitigation methods [10] ignore context when using simple traffic analysis, leading to many false positives, which reduce IoT system performance. This research introduces a new framework, GraphFedAI, to address these challenges.
The GraphFedAI framework integrates graph neural networks (GNNs) [11] and federated learning (FL) [12] to maximize the system’s robustness, scalability, accuracy, and overall efficiency when predicting DDoS attacks. The framework uses FL to train the neural model because the network must take every node in the IoT system into account while preserving privacy when sharing information [13]. Training is performed locally, eliminating the need to transmit sensitive details to the servers. FL uses the computational resources available during training and speeds up the training process by avoiding centralized traffic aggregation [14]. The local optimization in FL spans many IoT devices, which supports scalability and eliminates intermediate attacks. In addition, the graph network is topology-aware when generating the graph for every input. The topology maintains the interconnections between nodes using traffic or weight values, which improves understanding of network traffic and reduces anomalies [15]. Finally, the framework uses the anomaly score, the model prediction, and a combination of these scores to make the final decision regarding a DDoS attack [16]. The framework is implemented in Python with the CICIoT-2023 dataset [17] and produces effective results when predicting DDoS attacks.
Existing federated graph learning methods face critical challenges in dynamic IoT environments, including their reliance on static graph construction, centralized aggregation, and basic graph neural network layers that fail to capture temporal and structural variations in traffic behavior. These limitations reduce scalability, adaptability, and detection accuracy, especially when data is non-independent and identically distributed (non-IID) or partially missing. To address this research gap, the GraphFedAI framework integrates four core innovations: first, session-based dynamic graph modeling to capture real-time device interactions; second, interpolation techniques to restore missing temporal patterns; third, Pearson correlation-based feature selection to reduce redundancy and false alarms; and fourth, a federated multi-task ensemble approach that enables decentralized and privacy-aware learning. Unlike existing models, GraphFedAI preserves data locality, scales efficiently in heterogeneous IoT networks, and remains robust under sparse data conditions. The main research objectives are as follows.
To improve DDoS detection accuracy by incorporating FL with the graph neural model.
To manage the scalability and robustness of the system for the large number of devices in the IoT environment.
To reduce the false positive rate (FPR) when distinguishing normal traffic from DDoS attacks by integrating data interpolation and correlation analysis.
Research workflow summary
Figure 1 illustrates a structured research workflow for DDoS detection in IoT systems, represented through interconnected arrows. The diagram is divided into four key stages.
Fig. 1 [Images not available. See PDF.]
Illustration of research workflow.
The manuscript is organized as follows: Section "Research synthesis" reviews existing research on DDoS detection in IoT systems. Section "GraphFedAI framework" explains the working process of the GraphFedAI framework for detecting DDoS attacks, and the system’s efficiency is evaluated in Section "Research outcomes". Finally, the conclusion is given in Section "Conclusion".
Research synthesis
Almaraz-Rivera, J. G. et al. [18] recommended deep learning techniques to identify DDoS attacks in the application and transport layers. The authors use the Bot-IoT dataset and overcome its class imbalance issues while developing the intrusion detection process. During the analysis, timestep details are obtained and processed with the help of three feature sets to reduce feature dependencies. Finally, multiclass and binary classification is performed to predict intruder activities. Although the learning model recognizes intruders with high accuracy, the system should also consider IoT interconnection environments to analyze anomalies in various scenarios.
Mihoub, A. et al. [19] introduced the Looking Back-enabled Machine Learning (LBML) approach to identify and mitigate DDoS attacks in IoT. This work develops detection and mitigation systems by incorporating machine learning techniques. Attacks are predicted from the packet types, ensuring fine-grained detection. In addition, mitigation countermeasures are used to improve the overall performance of the system. The system is evaluated on the Bot-IoT dataset, and the looking-back idea is used to enhance the overall detection accuracy. However, the system focuses on the dynamic environment to predict intruders and reduce security difficulties.
Sharif, H. et al. [20] utilized XGBoost and AdaBoost techniques (XG-Adaboost) with a looking-back approach to detect DDoS attacks. This study uses the CICDDoS2019 dataset to analyze intruder activities. The gathered information is processed by boosting techniques that identify abnormal activities, and new data is frequently monitored to detect DDoS attacks. The main intention of this work is to ensure security and integrity during communication in IoT systems. However, the system requires additional effort to handle other cyber threats.
Yousuf, O., & Mir, R. N. [21] introduced Live Capture Neural Networks (LCNN) to identify DDoS attacks in IoT. The LCNN approach is based on recurrent neural model concepts and is developed in a software-defined network environment. The collected data is processed with a sequence of layers and a novel activation function that predicts intruder activities with a maximum recognition rate. The system was implemented using 177 instances to ensure a high accuracy value and a minimum error rate. The developed system, which was implemented with the OpenDaylight controller, produces effective results when detecting DDoS attacks.
Ur Rehman et al. [22] proposed Gated Recurrent Unit (GRU) networks to predict cyberattacks such as DDoS. This study uses the CICDDoS2019 dataset to analyze the various attacks that occur during data transmission. The gathered data is pre-processed to derive the features. The extracted features are balanced, and the balanced data is fed into the GRU, which recognizes DDoS attacks with maximum recognition accuracy.
Alkahtani, H., & Aldhyani, T. H. [23] introduced convolutional neural networks combined with Long Short-Term Memory networks to identify botnet attacks in IoT applications. This work intends to predict botnet attacks such as Mirai and BASHLITE from commercial IoT devices. The study uses the N-BaIoT dataset, which is explored using the hybridized network to identify malicious and benign patterns with a maximum recognition accuracy of 90.88%.

Aisha et al. (2024) detect DDoS attacks in IoT networks by applying K-nearest neighbors with neural networks. The study uses the MQTT-IoT-IDS2020 dataset to create intrusion detection systems in IoT. This work intends to handle security issues and suspicious activities and to improve privacy and integrity. A hyperplane is used to separate normal and abnormal activities; during this procedure, the neural network is incorporated to train the system and improve the overall intruder detection accuracy. However, the system requires optimization techniques to handle dynamic situations.
Hedyeh Nazari et al. [30] suggested the Privacy-Preserving Provenance Graph Neural Network (P3GNN) for advanced persistent threat detection in Software Defined Networking (SDN). Using unsupervised learning, P3GNN examines provenance graphs for operational trends and flags any discrepancies that might indicate a security violation. Its main characteristic is that it strengthens data security and gradient integrity during collaborative learning by integrating FL with homomorphic encryption. This addresses the serious problem of data privacy in collaborative learning environments. In addition to improving security analysis, the key advances of P3GNN are a comprehensive picture of attack pathways and the detection of abnormalities at the node level inside provenance graphs. The model may also detect zero-day threats by learning typical operating patterns through its unsupervised learning capacity. In the empirical assessment on the DARPA TCE3 dataset, P3GNN achieved an accuracy of 0.93 and a low false positive rate of 0.06.
Abbas Yazdinejad et al. [31] proposed a privacy-preserving federated learning model against model poisoning attacks. Using a Gaussian Mixture Model and Mahalanobis distance for Byzantine-tolerant aggregation, the method incorporates an internal auditor that distinguishes between benign and malicious gradients by evaluating encrypted gradient similarity and distribution. The suggested paradigm minimizes computational and communication costs while ensuring secrecy via Additive Homomorphic Encryption. The model outperforms state-of-the-art tactics and encryption methods regarding accuracy and privacy, including Fully Homomorphic Encryption and Two-Trapdoor Homomorphic Encryption. The suggested paradigm identifies maliciously encrypted non-IID gradients with low computational and communication overhead.
Abbas Yazdinejad et al. [32] recommended the Auditable Privacy-Preserving Federated Learning (AP2FL) framework for electronics in healthcare. AP2FL guarantees secure client-side and server-side training and aggregation procedures via Trusted Execution Environments, thereby reducing the risk of data leakage. By incorporating the Active Personalized Federated Learning model and Batch Normalization approaches, the framework centralizes user updates and finds data similarities to handle non-IID data. In addition, AP2FL includes an auditing system that shows how each client contributed to the FL process, which helps update the global model to account for different kinds and distributions of data. It guarantees that the FL process is honest, open, fair, and strong. The study shows that the AP2FL model eliminates privacy leakage better than the state-of-the-art technique.
Abdul Mazid et al. [33] presented a federated learning based intrusion detection approach with privacy preservation. The suggested method finds possible intrusions and outliers using bidirectional recurrent neural network models. The technique guarantees network efficiency and data privacy by keeping data locally on IoT devices and exchanging only the learned model weights with the FL central server. Integrating a voting ensemble procedure to combine updates from many sources achieves a high level of accuracy for the global ML model. The test results show that the method identifies possible breaches in IoT networks with improved accuracy and data privacy.
Abdul Mazid et al. [34] introduced improved intrusion detection for securing IoT networks using principal component analysis and a Convolutional Neural Network (CNN). Using 1D-CNN, 2D-CNN, and 3D-CNN algorithms trained on the Edge-IIoTset and NSL-KDD benchmark datasets, the method differentiates between packets that pose a danger and those that do not. The experimental results show that the framework considerably improves accuracy, precision, recall, and F1-score compared to pre-existing models for binary and multiclass classification. On the Edge-IIoTset dataset, the binary classification models achieved an average accuracy of 99.76%, precision of 99.79%, recall of 99.89%, and F1-score of 99.85%. On the NSL-KDD dataset, the models achieved an accuracy of 99.20%, precision of 98.07%, recall of 97.95%, and F1-score of 97.71%. For multiclass classification on the Edge-IIoTset dataset, the model achieved an average accuracy of 99.41%, a precision of 98.61%, a recall of 98.49%, and an F1-score of 98.56%.
The research analysis of different scholars is illustrated in Table 1 below.
Table 1. Summary of research analysis.
Reference | Research insights | Technical description | Findings | Challenges |
---|---|---|---|---|
Hayder et al., 2024 [24] | To enhance the security of IoT systems by detecting proactive DDoS attacks | Combines random forest and principal component analysis to reduce dimensionality issues when classifying abnormal activities | The Bot-IoT dataset is used to implement the study; accuracy above 95% | Difficulties in real-time data analysis and zero-attack interpretations |
Animesh et al., 2024 [25] | To develop DDoS attack detection systems that reduce intermediate activities and counter adversarial activities | Hybridizes ensemble techniques such as Random Forest and XGBoost to improve DDoS detection accuracy | A robust dataset is used to train the system; accuracy above 95% | Scalability issues and exposure to other attacks when sharing information |
Hani et al., 2024 [26] | To analyze software-defined networks for identifying DDoS attacks for improved security | Hybridized deep learning networks such as Convolution Networks, Gated Recurrent Networks, and Deep Learning approaches | Real-world and synthetic datasets are used to analyze system efficiency; the hybridized approach attains above 90% accuracy | Requires additional security mechanisms to improve overall security in SDN |
Lotfi Mhamdi et al., 2020 [27] | To improve the DDoS detection rate in SDN to ensure data security | Stacked Autoencoder and One-Class Support Vector Machine for identifying DDoS | The CICIDS2017 dataset is used to evaluate system efficiency; accuracy above 95% | Requires lightweight security mechanisms to handle other attacks in SDN |
Lotfi Mhamdi et al., 2020 [28] | To enhance DDoS attack detection accuracy and system security in IoT | Integrated Deep Learning and Convolutional Neural Networks to improve DDoS detection accuracy | The BoTNet dataset is used; accuracy above 92% | High false positive rate; requires continuous evolution to minimize DDoS attacks |
Research insights
The studies reviewed above show that DDoS attacks reduce the efficiency of IoT systems and raise security and privacy issues. Several machine learning approaches, such as support vector machines, K-nearest neighbors, convolutional neural networks, and deep learning networks, are used effectively to explore abnormal activities in the IoT environment. However, the conventional methods face scalability and privacy issues in the dynamic IoT environment, which create computation overhead and complexity. The IoT environment therefore requires robust learning and computational techniques to improve the overall privacy and scalability of IoT systems. For this reason, this study uses integrated approaches, namely FL and graph-based AI modeling, to attain privacy and security in IoT.
GraphFedAI framework
Research goal
GraphFedAI provides a topology-aware, scalable, and privacy-preserving solution for identifying DDoS attacks in IoT systems. GraphFedAI integrates an FL model with a graph neural network to overcome the difficulties of managing privacy and security in heterogeneous IoT systems. The framework provides a collaborative model that trains the system without compromising privacy. The graph network explores the topological, spatial, temporal, and relational features and the communication patterns of IoT devices to recognize abnormal activities. To ensure network adaptability, this dual approach identifies distributed and complex attack patterns such as multi-point DDoS attacks. Hence, the ultimate goal of this work is to create an effective framework that detects DDoS attacks in IoT systems while ensuring scalability and privacy.
Framework design
This section discusses GraphFedAI-based DDoS attack detection, in which IoT devices gather information that is processed by data cleaning, training, and classification models to distinguish normal traffic from DDoS attacks. The overall structure of GraphFedAI is shown in Fig. 2.
Fig. 2 [Images not available. See PDF.]
GraphFedAI framework design.
The main objective of the GraphFedAI framework is to minimize the loss function while detecting DDoS attacks in IoT systems. This objective is achieved by training the graph neural network through the FL model, which ensures the system’s privacy and scalability while exploiting the associations in the graph structure. The global loss is defined in Eq. (1).
Eq. (1) [Equation not available. See PDF.]
In Eq. (1), the regularization term limits overfitting while the input data from IoT devices is explored. The input data is examined by the framework, which generates a graph to identify the relationships between features; this helps to minimize the loss and generalize the framework. The global loss is computed from the local loss of the i-th IoT device, the regularization weights, and the number of IoT devices contributing to the environment. The graph is generated from the CICIoT-2023 dataset, which is pre-processed to eliminate irrelevant information. The generated graph consists of nodes, edges, and a feature matrix, which represent the IoT devices, the communication links, and attributes such as flow statistics, packet size, and traffic volume. Each node label identifies the IoT device state as either normal or malicious.
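For orientation, one common way to write a regularized federated objective that matches the description above is sketched below. The symbols are assumptions introduced here for illustration, since the original notation is not reproduced in this text: $N$ is the number of participating IoT devices, $\mathcal{L}_i$ the local loss of the $i$-th device on its local graph $G_i$, $\theta$ the shared GNN parameters, and $\lambda$ the regularization weight. The published Eq. (1) may weight or normalize these terms differently.

```latex
\min_{\theta}\; \mathcal{L}(\theta)
  \;=\; \frac{1}{N}\sum_{i=1}^{N} \mathcal{L}_i(\theta;\, G_i)
  \;+\; \lambda\,\lVert \theta \rVert_2^2
```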
The use of GNNs is an important consideration when dealing with various forms of DDoS attacks, such as volumetric, protocol, or application-layer attacks, because it allows the model to reflect the complex and dynamic interactions of IoT devices accurately. Because they model the IoT network topology directly, GNNs can adapt to different attack patterns by learning regular node interactions and detecting attack-induced deviations. This adaptability lets the model retain high accuracy even when attack techniques change or attack intensities vary. Data augmentation is also used during training to expose the model to attack scenarios of different sizes, timings, and methodologies. For the model to generalize well to various IoT installations and attack vectors, it must avoid overfitting to particular attack types or environmental variables. To make the model even more resilient, we apply multi-task learning and ensemble learning, which train several models or tasks simultaneously to identify different kinds of attacks and network abnormalities. The insights from the various models or tasks, each focusing on certain attack patterns, strengthen the system’s ability to withstand unknown or new attack techniques. Integrating FL equips the model to manage the distributed nature of IoT devices, allowing model training across devices without centralizing data. By including a wide range of data points, the model becomes more resistant to changes in device settings, network conditions, and attack types. FL also improves security because its decentralized design reduces the likelihood of data leakage and removes the single point of failure.
Data compilation
The first step is data compilation, in which the data is prepared for identifying DDoS attacks during transmission. Initially, missing values in the dataset are filled by interpolation to ensure data continuity. Each missing value is replaced according to Eq. (2), which computes the missing value at a given index of a feature. The computation uses the present and previous values to determine the replacement.
Eq. (2) [Equation not available. See PDF.]
In Eq. (2), an indicator takes the value 1 if the entry is missing and 0 otherwise, and the replacement is computed from the values at the present and previous time steps. This pre-processing is applied to runs of up to 15 successive missing values to maintain data continuity. Encoding then converts the categorical attributes into numerical values so that they are compatible with the detection model. The encoding is done using Eq. (3).
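The gap-filling step described above can be sketched as follows. This is a minimal illustration, not the framework's implementation: it assumes linear interpolation between the nearest observed neighbours, and it assumes that runs longer than 15 values, or runs touching either end of the series, are left unfilled. The function name `fill_missing` and the `max_gap` parameter are hypothetical.

```python
from typing import List, Optional


def fill_missing(values: List[Optional[float]], max_gap: int = 15) -> List[Optional[float]]:
    """Linearly interpolate runs of None that are at most max_gap long.

    Assumption: a run is filled only when it has an observed value on both
    sides; longer runs and runs at the edges are left as None.
    """
    out = list(values)
    n = len(out)
    i = 0
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i                       # first missing index of the run
        while i < n and out[i] is None:
            i += 1
        end = i                         # first index after the run
        gap = end - start
        if start == 0 or end == n or gap > max_gap:
            continue                    # no neighbour on one side, or run too long
        left, right = out[start - 1], out[end]
        step = (right - left) / (gap + 1)
        for k in range(gap):
            out[start + k] = left + step * (k + 1)
    return out


filled = fill_missing([1.0, None, None, 4.0])
```

Here the two missing entries between 1.0 and 4.0 are replaced by evenly spaced values, while a run of 16 or more missing entries would be returned unchanged.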
Eq. (3) [Equation not available. See PDF.]
In Eq. (3), each categorical attribute is assigned to its feature class by a characteristic (indicator) function, from which the encoded variable is computed. The indicator returns 1 if the attribute value belongs to the category being checked and 0 otherwise. Consider a feature that takes values in several categories. Each value is encoded as a binary vector by evaluating the indicator for every category: the entry for the matching category is set to 1 and all other entries are set to 0. The generalized encoding process is illustrated in Table 2.
Table 2. Representation of the encoded categorical attribute.
Data point | Category 1 | Category 2 | Category 3 |
---|---|---|---|
0 | True | False | False |
1 | False | True | False |
2 | False | False | True |
3 | False | True | False |
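The encoding pattern in Table 2 can be reproduced with a short one-hot sketch. The category names below are hypothetical placeholders, because the original column labels are not available in this text; only the True/False pattern is taken from the table.

```python
from typing import Dict, List, Sequence


def one_hot(values: Sequence[str], categories: Sequence[str]) -> List[List[bool]]:
    """Encode each value as a boolean vector with one True entry per row."""
    index: Dict[str, int] = {c: j for j, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [False] * len(categories)
        row[index[v]] = True            # raises KeyError for an unseen category
        rows.append(row)
    return rows


# Hypothetical category names; data points 0..3 follow the pattern of Table 2.
categories = ["class_a", "class_b", "class_c"]
samples = ["class_a", "class_b", "class_c", "class_b"]
encoded = one_hot(samples, categories)
```

Each row of `encoded` corresponds to one data point, and exactly one entry per row is True.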
Table 2 shows the output of the indicator function when it is applied to the attack type, following the condition described in Eq. (3). The output is the binary vector for each input. Next, the Pearson correlation coefficient is computed to minimize overfitting and improve interpretability. The coefficient identifies the association between features, which helps keep the DDoS prediction system stable. The computed value lies between -1 and +1, where +1 indicates a perfect positive linear relationship, -1 indicates a perfect negative linear relationship, and 0 indicates no linear relationship.
The Pearson correlation coefficient is used to identify and remove highly correlated features (threshold > 0.9), which often introduce redundancy and noise. In the context of DDoS detection, this improves model generalization by reducing overfitting and focusing on more discriminative traffic features. It also lowers false positive rates by minimizing misleading feature combinations that resemble attack-like behavior in benign traffic patterns.
The coefficient is computed using Eq. (4).
Eq. (4) [Equation not available. See PDF.]
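For reference, the Pearson coefficient has a standard textbook form, given below. The symbols are assumptions for illustration ($x_k$ and $y_k$ are the values of two features over $n$ records, and $\bar{x}$, $\bar{y}$ are their means); the notation of the published Eq. (4) may differ.

```latex
r_{xy} \;=\;
\frac{\sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y})}
     {\sqrt{\sum_{k=1}^{n} (x_k - \bar{x})^2}\;\sqrt{\sum_{k=1}^{n} (y_k - \bar{y})^2}},
\qquad -1 \le r_{xy} \le 1
```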
According to Eq. (4), the relationship between features is computed from the correlation coefficient, and a threshold value is selected to identify highly correlated features. The correlation computation is used to minimize multicollinearity and redundancy when classifying traffic. The CICIoT-2023 dataset has different features gathered from device interaction, sensor data, and traffic. It contains categorical data (attack type) and continuous data (packet counts, packet size, and transmission time). These features are explored using Eq. (4) to identify the highly correlated ones. First, the correlation matrix of the features is estimated, as defined in Eq. (5).
Eq. (5) [Equation not available. See PDF.]
In Eq. (5), each entry of the matrix is the correlation between a pair of features; the diagonal entries equal 1 because every feature is perfectly correlated with itself. Each off-diagonal value is then compared with the threshold value of 0.95 to minimize redundancy. For example, suppose the correlation computed for a pair of features using Eq. (5) is 0.98. This pair has a strong positive correlation because 0.98 is greater than the threshold value. Because the two features are highly correlated, one of them is eliminated from the list and the other is retained. The choice of which feature to keep is based on feature relevance, and a few features are removed from the feature set in this way to minimize multicollinearity. The remaining features form the final feature set. The graphical analysis of the data compilation is shown in Fig. 3.
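The threshold-based pruning described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the feature names and values are invented, and when two features exceed the 0.95 threshold the sketch keeps whichever appears first, whereas the text selects by feature relevance.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Standard Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0                      # constant feature: treated as uncorrelated here
    return cov / (sx * sy)


def drop_correlated(features: Dict[str, List[float]], threshold: float = 0.95) -> List[str]:
    """Keep a feature only if |r| with every already-kept feature is <= threshold."""
    kept: List[str] = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: byte_count is almost a multiple of packet_count.
toy = {
    "packet_count": [1.0, 2.0, 3.0, 4.0, 5.0],
    "byte_count": [2.0, 4.0, 6.0, 8.0, 10.5],
    "duration": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = drop_correlated(toy)
```

In this toy example `byte_count` is removed because its correlation with `packet_count` exceeds the threshold, while `duration` is retained.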
Fig. 3 [Images not available. See PDF.]
Graphical analysis of data compilation.
Figure 3 illustrates the graphical analysis of data compilation using heatmaps, which show the relationships between the features used to identify DDoS attacks. The heatmap shows the correlation between feature pairs, with values ranging between -1 and +1. The correlation computation used to identify the relationships between features is shown in Fig. 3c. According to the correlation values, the features are reduced, and the relevant features are shown in Fig. 3d. In the computation, some feature pairs have a moderate relationship and others have a weaker correlation. The diagonal values indicate that each feature is perfectly correlated with itself. This correlation analysis minimizes overfitting and improves generalizability and interpretability when classifying traffic.
Federated learning in GraphFedAI framework
The GraphFedAI framework uses the FL model to train the graph network and improve the DDoS detection rate. FL is a collaborative learning model that trains the neural network while preserving data security and privacy. FL trains the network on every local dataset with its own parameters and transmits the updated parameters, obtained from the local objective function, to the server. The update is computed according to Eq. (7). The outputs of the devices are aggregated by weighted averaging, which protects sensitive data and privacy. FL has three components: a central aggregator, edge servers, and IoT devices, which together optimize the model and process the IoT traffic data to reduce anomalous activity. The first component is the IoT devices, which are local endpoints that gather information by observing the environment. The collected information is sent to neighboring IoT devices or servers over established communication links. These IoT interactions form graphs. Each node participating in a local network represents an IoT device, and every node has edges through which it interacts with other nodes or servers. An edge represents a pairwise communication and carries a weight value for the interaction. The node features are arranged in matrix form: each row corresponds to a node and holds that node’s feature vector. The feature matrix contains local information about the IoT devices, which helps to understand network and node behavior in IoT systems. For example, the interactions among a set of IoT sensors form a graph whose nodes and edges describe how the devices communicate. These devices collect the features, and in the resulting matrix column 1 and column 2 hold the two collected features for each node. The generated graph is explored using the classifier to identify DDoS attacks. After the graph is formed, the IoT devices are trained using the GNN to analyze the graph data.
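The graph construction described above (devices as nodes, pairwise communication as weighted edges, one feature row per node) can be sketched as follows. The flow tuple layout, the device names, and the two chosen features (volume sent and volume received) are assumptions made for illustration; the framework's actual features come from the CICIoT-2023 attributes.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Flow = Tuple[str, str, float]  # (source device, destination device, traffic volume)


def build_graph(flows: List[Flow]):
    """Return (nodes, weighted edges, feature matrix) for a list of flows."""
    nodes = sorted({device for src, dst, _ in flows for device in (src, dst)})
    edges: Dict[Tuple[str, str], float] = defaultdict(float)
    sent: Dict[str, float] = defaultdict(float)
    received: Dict[str, float] = defaultdict(float)
    for src, dst, volume in flows:
        edges[(src, dst)] += volume     # edge weight = accumulated traffic on the link
        sent[src] += volume
        received[dst] += volume
    # One row per node; column 1 = volume sent, column 2 = volume received.
    features = [[sent[n], received[n]] for n in nodes]
    return nodes, dict(edges), features


nodes, edges, features = build_graph(
    [("camera", "hub", 10.0), ("camera", "hub", 5.0), ("hub", "tv", 2.0)]
)
```

Repeated flows between the same pair of devices are merged into a single edge whose weight is the total traffic, which is one simple way to obtain the edge weights mentioned in the text.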
The graphical structure of GNN is shown in Fig. 4.
Fig. 4 [Images not available. See PDF.]
Graphical structure of graph neural networks.
During this process, every IoT device is trained separately, and the information is aggregated when predicting DDoS attacks. The main purpose of GNN training is to learn graph embeddings that capture the pattern and structure of the IoT network for classifying traffic as normal or malicious. The overall GNN training process is described by Eq. (6).
Eq. (6) [Equation not available. See PDF.]
The graph of each IoT device is processed by the GNN to obtain a prediction from the embedding of every node. Each embedding is obtained from aggregated information, where the aggregation is performed by message passing over the neighboring nodes. During this process, the embeddings from the previous layer are used; the aggregated value is transformed with trainable weights and passed through the ReLU activation function. The final embedding is fed into the output layer to predict the node state, and training continues until the loss reaches its minimum value (Eq. 7).
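A single message-passing layer of the kind described above can be sketched in plain Python. This sketch assumes mean aggregation over each node and its neighbours followed by a linear transform and ReLU; the aggregator, the inclusion of the node's own embedding, and all names are assumptions, and the layer used in the framework may differ.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Vector) -> Vector:
    return [max(0.0, x) for x in v]


def matvec(weights: Matrix, v: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def gnn_layer(h: Dict[str, Vector], neighbors: Dict[str, List[str]], weights: Matrix) -> Dict[str, Vector]:
    """One layer: mean of self and neighbour embeddings, then ReLU(W @ mean)."""
    out: Dict[str, Vector] = {}
    for node, h_self in h.items():
        group = [h_self] + [h[u] for u in neighbors.get(node, [])]
        mean = [sum(column) / len(group) for column in zip(*group)]
        out[node] = relu(matvec(weights, mean))
    return out


# Two connected devices with 2-dimensional embeddings and an identity weight matrix.
h0 = {"a": [1.0, -3.0], "b": [3.0, 1.0]}
adjacency = {"a": ["b"], "b": ["a"]}
identity = [[1.0, 0.0], [0.0, 1.0]]
h1 = gnn_layer(h0, adjacency, identity)
```

With these inputs both nodes average to `[2.0, -1.0]`, and ReLU clips the negative component to zero.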
Eq. (7) [Equation not available. See PDF.]
Equation (7) is used to reduce the loss, which indicates how effectively the GNN recognizes DDoS attacks. The loss is computed by comparing the predicted label with the true label for each output. According to this loss, the network parameters are updated, and the update is computed as the difference between the parameters after and before local training. This update is transferred to the edge servers for aggregation. Local training preserves privacy and keeps learning decentralized, which directly affects the system’s scalability and communication overhead. The edge server performs its optimization on the updates received from the IoT devices, where each update is obtained from a gradient step. As stated, the optimization process minimizes the loss function defined in Eq. (8).
Eq. (8) [Equation not available. See PDF.]
The collected updates are aggregated into the global model. The aggregation covers all participating IoT devices. From the aggregated information, the edge server minimizes the loss for every device using the cross-entropy loss function. The GNN parameters are then optimized using the aggregated update, as defined in Eq. (9).
Eq. (9) [Equation not available. See PDF.]
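The weighted averaging and the learning-rate update mentioned in this section can be illustrated with a FedAvg-style sketch. Weighting each device by its local sample count and moving the global parameters toward the aggregate with a server learning rate are assumptions borrowed from common federated averaging practice; the text does not state the exact weights used, and the function names are hypothetical.

```python
from typing import List, Sequence


def weighted_average(client_params: Sequence[Sequence[float]], client_sizes: Sequence[int]) -> List[float]:
    """Average parameter vectors, weighting each client by its sample count."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[j] * size for params, size in zip(client_params, client_sizes)) / total
        for j in range(dim)
    ]


def server_update(global_params: Sequence[float], aggregated: Sequence[float], lr: float = 1.0) -> List[float]:
    """Move the global parameters toward the aggregate by a step of size lr."""
    return [g + lr * (a - g) for g, a in zip(global_params, aggregated)]


# Two devices with 1 and 3 local samples respectively.
aggregate = weighted_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
new_global = server_update([0.0, 0.0], aggregate, lr=0.5)
```

With `lr = 1.0` the server simply adopts the weighted average; smaller values damp the update.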
The edge server updates the global parameters using the learning rate and the aggregated parameter values. The refined parameters are then forwarded to the IoT devices. This update is performed for all IoT devices to minimize global data transfer and maximize overall performance when predicting DDoS attacks. The FL process ensures the GraphFedAI framework’s robustness in the IoT environment: it allows the GNN to work under different conditions, such as varying legitimate traffic patterns, attack formats, and network conditions. The improvement of GraphFedAI is evaluated before and after applying FL; the results are shown in Table 3.

Missing or partial data, often caused by network interruptions or sensor failures, is handled using interpolation in real-world IoT systems. Interpolation estimates the missing values between known data points so that the model can train on a more complete dataset and avoid gaps that might lower its performance. For instance, when data points are missing because of sporadic transmission, interpolation approximates these values by analyzing patterns in nearby data points. This keeps the data consistent and uninterrupted, which the model needs in order to identify anomalous behavior during DDoS attacks. The connections between the dataset’s characteristics, such as packet frequency, device communication patterns, and traffic volume, are evaluated using correlation analysis. By determining the correlation between several variables, the model can distinguish between regular network activity and abnormal activity caused by DDoS attacks. During an attack, features that typically exhibit strong correlations may show weak correlations or unexpected patterns, which helps to identify outliers that may suggest harmful behavior. Correlation analysis lets the model focus on the most important features, which improves the accuracy and efficacy of the DDoS detection mechanism by picking up on small deviations from typical network patterns.
Table 3. Improvement of FL in detection performance.
IoT device | Attack type | Before FL (%) | After FL (%) | | | Improvement |
---|---|---|---|---|---|---|
1 | UDP | 72.34 | 89.46 | 93.27 | 96.14 | 20.616 ± 0.616 |
2 | SYN Flood | 75.27 | 92.46 | 95.47 | 97.467 | 19.862 ± 0.862 |
3 | HTTP Flood | 73.37 | 90.36 | 94.53 | 96.35 | 20.376 ± 0.376 |
4 | ICMP Flood | 76.46 | 91.37 | 93.58 | 97.67 | 17.746 ± 0.746 |
5 | DNS Flood | 75.35 | 93.78 | 96.42 | 98.35 | 20.833 ± 0.833 |
6 | ACK Flood | 74.56 | 90.24 | 94.26 | 97.24 | 19.353 ± 0.353 |
7 | Smurf attack | 77.57 | 89.35 | 93.37 | 95.67 | 15.226 ± 0.226 |
8 | HTTP Flood | 73.47 | 90.57 | 94.75 | 98.24 | 21.05 ± 0.05 |
9 | SYN Flood | 72.46 | 87.59 | 90.34 | 93.26 | 17.93 ± 0.93 |
10 | Slowloris | 70.47 | 92.32 | 94.85 | 97.02 | 24.26 ± 0.26 |
Global Model | 73.47 | 93.68 | 96.24 | 98.23 | 22.58 ± 0.58 |
Table 3 illustrates the improvement achieved by FL in GraphFedAI when predicting DDoS attacks of various types, such as SYN flood, UDP flood, and HTTP flood. The IoT devices are trained locally and their values are aggregated, which helps every device learn the attack types and improves the detection rate and robustness. The aggregated values identify high-risk regions, such as spoofing and flooding, as well as low-risk regions, which helps to predict DDoS attacks with maximum accuracy. The frequent update process helps the system adapt to attack patterns and IoT conditions.
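The aggregation and update step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard sample-size-weighted Federated Averaging rule, and the function names `fedavg` and `global_step`, the learning rate, and the toy parameter vectors are hypothetical and do not come from the paper.

```python
def fedavg(client_params, client_sizes):
    """Sample-size-weighted average of per-client parameter vectors."""
    total = sum(client_sizes)
    n_params = len(client_params[0])
    return [
        sum(p[i] * n for p, n in zip(client_params, client_sizes)) / total
        for i in range(n_params)
    ]


def global_step(global_params, aggregated, lr):
    """Move the global parameters toward the aggregate with step size lr.

    Assumption: a simple interpolation-style update; the paper's exact
    update rule is not reproduced here.
    """
    return [g + lr * (a - g) for g, a in zip(global_params, aggregated)]


# Toy example: three clients, two parameters each (illustrative values).
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [100, 100, 200]  # hypothetical local sample counts

agg = fedavg(clients, sizes)
new_global = global_step([0.0, 0.0], agg, lr=0.5)
print(agg, new_global)
```

Clients with more local samples pull the aggregate further toward their own parameters, which is the usual motivation for weighting by sample count.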
Decision-making of DDoS attack detection in IoT
The final stage of the GraphFedAI framework is DDoS attack detection which uses the FL to train the GNN that are aggregated from . explores the traffic from which is defined as . The overall decision-making process computation is described in Fig. 5.
Fig. 5 [Images not available. See PDF.]
Representation of DDoS decision-making analysis.
The GNN processes the in the message passing stage that aggregate the message from layer , the learning function and edge features . This value is utilized for updating the node representation using . The values are aggregated over the layers, and the prediction is performed as . During this process, network parameters such as is used to improve the overall prediction accuracy. According to this process, status is predicted based on the probability distribution of of input . Then, the output is defined for the node is computed is defined as . The entire behavior is gathered and analyzed according to the threshold analysis and is computed as . The indicator function in is true, it has a value of 1, and the analysis uses the total number of in . Suppose the value exceeds value, and then the system shows a flag that affected by DDoS attacks. Then, the condition is defined as . The discussed framework analyzes the high-density area to identify in because the attack patterns are indicative. Therefore, the intricate environment is explored with the help of attention concept of embedding nodes which predicts the relationship of . From the relationship, decision is taken by computing the anomaly score . Then, the overall decision-making of is carried out according to Eq. (10).
10
11
Equations (10) and (11) determine whether a DDoS attack is present. If the anomaly score exceeds its threshold, the traffic is flagged for a possible DDoS attack. Likewise, if the probability predicted by the GNN exceeds its threshold, the traffic is considered affected by a DDoS attack. When both conditions are satisfied, the network is declared infected by a DDoS attack. The efficiency of the decision process is evaluated under three conditions, and the results obtained for various traffic rates in the IoT environment are shown in Fig. 6.
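As a rough illustration of the two-condition decision described here, the following Python sketch combines an anomaly-score check with a GNN-probability check. The threshold values, the class name `Thresholds`, the function name `decide`, and the three output labels are assumptions made for the example, not values or names from the paper.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    anomaly: float = 0.5      # hypothetical threshold on the anomaly score
    probability: float = 0.5  # hypothetical threshold on the GNN attack probability


def decide(anomaly_score, attack_probability, th=None):
    """Combine the two threshold checks into a single decision label."""
    th = th or Thresholds()
    score_alert = anomaly_score > th.anomaly
    prob_alert = attack_probability > th.probability
    if score_alert and prob_alert:
        return "attack"   # both conditions hold
    if score_alert or prob_alert:
        return "alert"    # only one condition holds
    return "normal"


print(decide(0.9, 0.8), decide(0.9, 0.1), decide(0.1, 0.2))
```

Keeping the two checks separate makes it easy to evaluate each condition on its own or in combination.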
Fig. 6 [Images not available. See PDF.]
Analysis of different decision-making conditions.
Figure 6 shows that analysis of the GraphFedAI framework using different parameters, such as . uses various features such as to explore the DDoS attacks in the IoT environment. uses that identifies the relationship between which helps to classify effectively under different conditions. The introduced framework combines these conditions to identify the DDoS with minimum false negative and positive rates. The effective utilization of these factors improves up to 98.7% for , 98.4% for , 97.9% of and 98% for features. The effective utilization of the FL on the GNN training process improves the overall system scalability, robustness, and DDoS detection accuracy. Therefore, the extracted features from the pre-processed data improve the overall and ensures the system’s robustness and scalability in IoT. Algorithm 1 shows the pseudocode of GraphFedAI for DDoS Detection in IoT Networks.
The GraphFedAI framework builds the graph by extracting communication flows between IoT devices from the CIC-IoT-2023 dataset. To guarantee that each device is represented consistently in each session, a combination of source/destination IP addresses, MAC addresses, and protocol types identifies each device as a node. If two nodes exchange a communication flow (TCP, UDP, or ICMP) within a sliding time window of 60 s, an edge is formed between them. Edge weights, which define the link between nodes, are assigned from three aggregated flow-level metrics: (1) packet frequency, the number of packets exchanged between the two devices during the interval; (2) communication duration, the timestamp difference between the first and last packet; and (3) average payload size, the total bytes transferred divided by the number of packets.
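The edge construction described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it uses non-overlapping 60 s windows instead of a true sliding window, treats edges as undirected, and the packet tuple layout, the device names, and the function name `build_edges` are all hypothetical.

```python
from collections import defaultdict

WINDOW_S = 60  # window length taken from the text


def build_edges(packets, window_s=WINDOW_S):
    """Group packets per (window, device pair) and compute three edge metrics.

    packets: iterable of (timestamp_s, src_id, dst_id, payload_bytes).
    Assumption: windows are non-overlapping here for brevity.
    """
    groups = defaultdict(list)
    for ts, src, dst, size in packets:
        window = int(ts // window_s)
        pair = tuple(sorted((src, dst)))  # undirected edge key
        groups[(window, pair)].append((ts, size))

    edges = {}
    for key, items in groups.items():
        times = [t for t, _ in items]
        sizes = [s for _, s in items]
        edges[key] = {
            "packet_frequency": len(items),
            "duration_s": max(times) - min(times),
            "avg_payload_bytes": sum(sizes) / len(sizes),
        }
    return edges


# Toy traffic between two hypothetical devices.
packets = [
    (0.0, "cam", "hub", 100),
    (10.0, "hub", "cam", 300),
    (59.0, "cam", "hub", 200),
    (61.0, "cam", "hub", 50),
]
edges = build_edges(packets)
for key, weights in sorted(edges.items()):
    print(key, weights)
```

The first three packets fall into the same window and form one edge, while the packet at 61 s starts a new window and therefore a separate edge record.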
Temporal window construction and interpolation
To address missing or irregular values in the time-series input features, linear interpolation is applied across each 14-day sliding window. Let $x_t$ denote a missing feature value at time $t$. The interpolated value is computed as:
$$x_t = x_{t_1} + \frac{t - t_1}{t_2 - t_1}\left(x_{t_2} - x_{t_1}\right) \tag{12}$$
In Eq. (12), $x_{t_1}$ and $x_{t_2}$ are the nearest known values before and after the missing timestamp $t$, observed at times $t_1$ and $t_2$. This interpolation ensures temporal continuity in both node attributes and graph edge weights, which is essential for reliable learning of behavior sequences. Without this smoothing step, discontinuities in the data introduce abrupt changes in the input graph structure, weakening the performance of the LSTM and attention layers during training.
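A minimal Python sketch of this step is given below: each missing value is filled by linear interpolation between the nearest known values before and after it. The function name `interpolate_missing`, the use of `None` to mark gaps, and the choice to leave a gap unfilled when a neighbour is missing on one side are assumptions for illustration; timestamps are assumed to be sorted in ascending order.

```python
def interpolate_missing(times, values):
    """Fill None entries by linear interpolation between nearest known neighbours.

    Assumption: `times` is sorted ascending and aligned with `values`.
    """
    known = [(t, v) for t, v in zip(times, values) if v is not None]
    filled = []
    for t, v in zip(times, values):
        if v is not None:
            filled.append(v)
            continue
        before = [(tk, vk) for tk, vk in known if tk < t]
        after = [(tk, vk) for tk, vk in known if tk > t]
        if before and after:
            (t1, x1), (t2, x2) = before[-1], after[0]
            filled.append(x1 + (t - t1) / (t2 - t1) * (x2 - x1))
        else:
            # No known neighbour on one side: keep the gap instead of extrapolating.
            filled.append(None)
    return filled


times = [0, 1, 4, 5]
values = [0.0, None, 8.0, None]
result = interpolate_missing(times, values)
print(result)
```

The value at time 1 lies a quarter of the way between the known points at times 0 and 4, so it is filled with a quarter of the difference; the trailing gap has no later neighbour and is left as is.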
Research outcomes
This section explores the outcomes of the GraphFedAI framework while detecting DDoS attacks in an IoT environment. The framework uses various IoT devices to gather the traffic information that is processed continuously to create the graph. From the generated graph, the relationship between the features is analyzed according to the node and edge concept. The relation analysis predicts the normal and attack data by optimizing graph neural networks locally. The optimized results are aggregated to make decisions regarding the attacks. This process ensures robustness and scalability while managing privacy and security in an IoT environment.
To comply with real-time IoT constraints and balance model convergence against network latency, the GraphFedAI framework synchronizes model parameters after 5 local training epochs per client and starts global aggregation at predetermined intervals of 10 s. Each edge device uses GNN operations to update its local graph-based model, and the generated gradients are compressed before being sent in order to lower communication costs. On average, the communication payload is reduced by 48.1% per round using 8-bit fixed-point quantization and top-1% gradient sparsification in conjunction with entropy-based encoding such as Huffman coding. The communication protocol, MQTT over TLS, provides low-overhead, secure, and dependable data transmission across diverse IoT devices. Federated Averaging (FedAvg) is used for central aggregation, and its aggregation frequency was tuned empirically through ablation tests on the CIC-IoT-2023 dataset. Experimental evaluations across 50 simulated edge nodes under constrained bandwidth (64–512 Kbps) validated the operational feasibility of the framework in distributed real-time IoT networks. The results show that this synchronization and aggregation strategy reduces total communication cost by 33.7% and improves convergence time by 29.4% without compromising detection accuracy.
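The gradient compression mentioned above can be illustrated with a small Python sketch of top-k sparsification followed by 8-bit quantization. It is a simplified stand-in, not the framework's code: it uses a single shared scale with signed 8-bit integers, omits the Huffman coding stage, and the function names and toy gradient are hypothetical.

```python
def sparsify_top_k(grad, fraction):
    """Keep only the largest-magnitude entries; return {index: value}."""
    k = max(1, round(len(grad) * fraction))
    top = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    return {i: grad[i] for i in sorted(top)}


def quantize_8bit(sparse):
    """Map kept values to integers in [-127, 127] using one shared scale."""
    scale = max(abs(v) for v in sparse.values()) / 127 or 1.0
    return {i: round(v / scale) for i, v in sparse.items()}, scale


def dequantize(quantized, scale, length):
    """Rebuild a dense gradient; dropped entries become zero."""
    dense = [0.0] * length
    for i, q in quantized.items():
        dense[i] = q * scale
    return dense


# Toy gradient with 200 entries, three of them non-zero (illustrative values).
grad = [0.0] * 200
grad[5], grad[17], grad[40] = -2.54, 1.28, 0.5

sparse = sparsify_top_k(grad, fraction=0.01)  # top 1% -> 2 entries
quantized, scale = quantize_8bit(sparse)
restored = dequantize(quantized, scale, len(grad))
print(quantized, round(scale, 4))
```

Only the kept indices, their 8-bit codes, and the scale need to be transmitted, which is where the payload reduction comes from; the smaller entry at index 40 is dropped and restored as zero on the receiving side.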
Operational context
The GraphFedAI framework integrates GNNs and an FL model to analyze IoT data. The framework is implemented in Python, which supports the machine learning algorithms and FL libraries that help manage privacy and security while sharing data in an IoT environment. The implementation uses the CICIoT-2023 dataset (https://www.unb.ca/cic/datasets/iotdataset-2023.html) to predict DDoS attacks. The dataset was collected using 105 IoT devices, and 33 attacks were executed in the IoT topology. The attacks fall into seven categories: DoS, DDoS, web-based, Recon, Spoofing, Brute Force, and Mirai. This work concentrates on the DDoS portion of the dataset to exclude intermediate access and other security factors. In the dataset, 33,984,560 rows belong to DDoS attacks, with attributes such as ACK, UDP flood, and Slowloris. The details of the DDoS attack attributes are displayed in Fig. 7.
Fig. 7 [Images not available. See PDF.]
Representation of attributes and row values in the CICIoT-2023 dataset.
The GraphFedAI system will be tested in several studies to see how well it fares against poisoning attempts and adversarial manipulation in GNNs and Federated Learning (FL). Data poisoning, model poisoning, and graph poisoning attacks will be replicated in these tests. Data poisoning attacks involve malicious clients injecting misleading or incorrect training data, model poisoning attacks involve adversarial gradients shared during aggregation to corrupt the global model, and graph poisoning attacks involve attackers manipulating the graph structure by adding fake edges or altering edge weights. Attackers may confound the GNN’s learning process by manipulating node characteristics, another challenge the system will undergo. Pre- and post-attack comparisons of performance indicators, including accuracy, detection precision, and F1 score, will be used to evaluate the system’s resistance. This study will build defense methods to address these weaknesses, including adversarial training, gradient clipping, and secure aggregation.
Empirical findings
The gathered information is processed and analyzed with the interpolation method to eliminate irrelevant information from the dataset. Missing-value replacement and correlation analysis identify the relationships between features and improve the overall DDoS detection rate. The system’s efficiency is evaluated using detection accuracy, precision, recall, latency, scalability, robustness, and resource efficiency. GraphFedAI is compared with four existing techniques, Looking Back-enabled Machine Learning (LBML)19, XGBoost and AdaBoost (XG-Adaboost)20, Live Capture Neural Networks (LCNN)21, and Gated Recurrent Unit (GRU)22, to assess its effectiveness. These algorithms were selected because they effectively recognize attacks on IoT systems and ensure security while transferring information. The results obtained are shown in Table 4.
Implementing FL and GNNs on lightweight systems such as Raspberry Pi or ESP32 is significantly hindered by edge resource restrictions, such as low memory, processing capacity, and bandwidth. An extensive performance assessment of these devices will be carried out to assess deployment practicality, with emphasis on memory utilization, CPU/GPU load, and communication bandwidth. Model compression methods such as weight pruning, quantization, and knowledge distillation are used to decrease the computing needs and model size. Optimizing the local training technique will further ensure an efficient distribution of the computational burden among the available edge resources. Raspberry Pi and ESP32, two resource-limited edge devices, will be used to investigate how these improvements affect model performance in real-world settings and to quantify the trade-offs between model accuracy and the limits of the edge devices.
Table 4. Analysis of GraphFedAI.
Methods | Conditions |
|
|
|
|
---|---|---|---|---|---|
GraphFedAI |
| 98.7 ± 0.109 | 98.2 ± 0.109 | 98.3 ± 0.109 | 24.2 ± 3.2 |
| 98.4 ± 0.09 | 98.3 ± 0.09 | 98.5 ± 0.09 | 26.3 ± 3.1 | |
| 97.9 ± 0.10 | 98.0 ± 0.10 | 97.8 ± 0.10 | 28.3 ± 3.5 | |
| 98.6 ± 0.087 | 98.3 ± 0.087 | 98.2 ± 0.087 | 27.3 ± 4.1 | |
LBML19 |
| 94.5 ± 0.28 | 93.8 ± 0.28 | 94 ± 0.28 | 45 ± 3.5 |
| 93.9 ± 0.263 | 94 ± 0.263 | 93.8 ± 0.263 | 46.22 ± 3.7 | |
| 94.15 ± 0.13 | 94.1 ± 0.13 | 94.2 ± 0.13 | 44.23 ± 3.2 | |
| 94.1 ± 0.23 | 94.3 ± 0.23 | 93.9 ± 0.23 | 47.2 ± 3.8 | |
XG-Adaboost20 |
| 95.6 ± 0.24 | 94.7 ± 0.24 | 94.9 ± 0.24 | 40 ± 4.2 |
| 94.65 ± 0.187 | 94.6 ± 0.187 | 94.7 ± 0.187 | 41.3 ± 4.5 | |
| 94.7 ± 0.34 | 94.8 ± 0.34 | 94.6 ± 0.34 | 42.5 ± 4.02 | |
| 94.9 ± 0.12 | 94.9 ± 0.12 | 94.9 ± 0.12 | 39.24 ± 4.1 | |
LCNN21 |
| 97.2 ± 0.03 | 96.5 ± 0.03 | 96.8 ± 0.03 | 32 ± 3.5 |
| 96.65 ± 0.02 | 96.7 ± 0.02 | 96.6 ± 0.02 | 33.2 ± 3.2 | |
| 96.65 ± 0.04 | 96.4 ± 0.04 | 96.9 ± 0.04 | 35.22 ± 3.01 | |
| 96.75 ± 0.102 | 96.8 ± 0.102 | 96.7 ± 0.102 | 32.18 ± 3.45 | |
GRU22 |
| 96.8 ± 0.123 | 96.1 ± 0.123 | 96.3 ± 0.123 | 30 ± 3.34 |
| 96.35 ± 0.13 | 96.2 ± 0.13 | 96.5 ± 0.13 | 29.3 ± 2.98 | |
| 96.62 ± 0.25 | 96.34 ± 0.25 | 96.9 ± 0.25 | 30.3 ± 3.23 | |
| 97.02 ± 0.34 | 96.89 ± 0.34 | 97.23 ± 0.34 | 31.39 ± 3.53 |
Table 4 illustrates the analysis of GraphFedAI under different conditions and shows how the system predicts attacks with minimum delay and maximum accuracy. The missing values in the dataset are analyzed and replaced, which helps to eliminate overfitting. The class variable is then checked during the encoding procedure, which determines the relationship between the features. These initial computations improve the overall prediction rate and reduce false negatives and false positives. The graph is then constructed to identify the interaction between the nodes, and the GraphFedAI framework detects the attack behavior from it. The parameter values are updated frequently, which directly affects the prediction of DDoS attacks in an IoT environment. In addition, aggregating the nodes in the FL process reduces latency because of global synchronization and local optimization while training the GNN. The frequent aggregation and updating reduce the communication overhead in IoT systems. During classification, the GraphFedAI framework manages the scalability factors, namely processing speed, computational resources, memory usage, and training time. The scalability results obtained are shown in Table 5.
Incorporating L2 regularization (weight decay) to limit the size of model weights and using dropout layers to randomly deactivate certain neurons during training will reduce overfitting and increase generalization by forcing the model to learn more robust features. Early stopping will also be implemented to monitor the model’s performance on a validation set; training will be stopped if performance stagnates or worsens. Model averaging will be used when the local models of the FL scheme are combined, to avoid overfitting to outliers in the local data. This will ensure that no local model has an outsized impact on the global model.
Table 5. Scalability analysis of GraphFedAI.
Methods |
(MB) |
|
|
|
|
| |||
---|---|---|---|---|---|---|---|---|---|
|
|
|
| ||||||
GraphFedAI | 350 | 70% CPU | 145,000 pps | 190 s | Baseline | ||||
LBML19 | 420 | 85% CPU | 135,000 pps | 320 s | −20 | −21.43 | 6.90 | −68.24 | −15% |
XGBoost20 | 480 | 80% CPU | 138,000 pps | 350 s | −37.14 | −14.29 | 4.83 | −84.21 | −32.63% |
LCNN21 | 400 | 75% CPU | 140,000 pps | 250 s | −14.29 | −7.14 | −3.45 | −31.58 | −19.47% |
GRU22 | 450 | 80% CPU | 138,000 pps | 270 s | −28.57 | −14.29 | 4.83 | −42.11 | −28.42% |
Table 5 shows the scalability analysis of the GraphFedAI framework while detecting DDoS attacks. The introduced framework trains faster than the other methods because the learning is distributed across different IoT devices. The FL concept uses message passing and neighboring analysis, which improves the overall understanding of IoT data and directly influences the processing speed. The frequent computation improves model training while consuming minimal resources. GraphFedAI is the fastest, with 145,000 pps, suggesting better throughput because FL distributes the handling of traffic. Due to the distributed architecture of GraphFedAI, its CPU usage stays around 70%, which is optimal for the system. The performance benchmark metrics (high processing speed, low memory usage, low computational resource usage, and fast training time) demonstrate that GraphFedAI achieves the best overall scalability. On the other hand, LBML, XG-Adaboost, and GRU have a negative scalability improvement because they take longer to train and use more resources. In addition, the robustness of the system is further evaluated, and the results are shown in Table 6.
Table 6. Robustness analysis of GraphFedAI.
DDoS Attack Type |
| ||||
---|---|---|---|---|---|
GraphFedAI | LBML19 | XGBoost20 | LCNN21 | GRU22 | |
SYN Flood | (98.7%) | (94.5%) | (95.6%) | (97.2%) | (96.8%) |
UDP Flood | (98.6%) | (94.8%) | (95.0%) | (97.5%) | (96.5%) |
ICMP Flood | ↑ (98.8%) | ↔ (94.6%) | ↔ (94.7%) | ↑ (97.0%) | ↑ (96.3%) |
HTTP Flood | ↑(99.0%) | ↑(94.2%) | ↔ (95.3%) | ↑ (97.4%) | ↑ (96.7%) |
RSTFIN Flood | ↑ (98.5%) | ↔ (94.7%) | ↔ (95.5%) | ↑ (97.2%) | ↑ (96.4%) |
PSHACK Flood | ↑ (98.7%) | ↔ (94.4%) | ↔ (95.2%) | ↑ (97.3%) | ↑ (96.6%) |
UDP Fragmentation | ↑ (98.6%) | ↔ (94.9%) | ↔ (95.4%) | ↑ (97.6%) | ↑ (96.8%) |
ICMP Fragmentation | ↑ (98.7%) | ↔ (94.3%) | ↔ (94.8%) | ↑ (97.1%) | ↑ (96.5%) |
TCP Flood | ↑ (98.8%) | ↔ (94.7%) | ↔ (95.1%) | ↑ (97.3%) | ↑ (96.7%) |
SynonymousIP Flood | ↑ (98.6%) | ↔ (94.5%) | ↔ (95.2%) | ↑ (97.2%) | ↑ (96.4%) |
* -very high, -high and moderate.
GraphFedAI is the most robust of all the models, achieving stable accuracy across a broad spectrum of DDoS attacks (Table 6). The LBML, XGBoost, LCNN, and GRU methods show moderate to high performance but are highly sensitive to attacks such as SYN Flood, HTTP Flood, and TCP Flood. The matrix shows that GraphFedAI performs better than the other methods across diverse DDoS attack instances, which supports real-world deployment in dynamic and heterogeneous network environments. GraphFedAI should also maintain a low FPR while improving detection; the results obtained are shown in Fig. 8.
Fig. 8 [Images not available. See PDF.]
False positive rate analysis.
Figure 8 shows that GraphFedAI performs best with respect to FPR in all four aspects. Based on the analysis described in Fig. 4 and compared with LBML, XG-Adaboost, LCNN, and GRU, GraphFedAI keeps the FPR below 0.025 across all columns. In particular, the FPR of GraphFedAI remains stable under the different conditions and is lower than that of the other methods. GraphFedAI also benefits from processing large datasets in a distributed manner, which can enhance generalization and reduce misclassifications. These metrics emphasize its strength and scalability for real-time detection of DDoS activity in dynamic IoT networks.
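For reference, the false positive rate and the other detection metrics used in this section can be computed from confusion-matrix counts as in the Python sketch below. The function name `detection_metrics` and the counts in the example are made up for illustration and are not results from the paper.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return standard binary-detection metrics from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # FPR: share of benign samples wrongly flagged as attacks.
        "fpr": fp / (fp + tn) if fp + tn else 0.0,
    }


# Hypothetical counts: 100 attack samples and 100 benign samples.
m = detection_metrics(tp=90, fp=2, tn=98, fn=10)
print(m)
```

In this toy case 2 of 100 benign samples are misclassified, giving an FPR of 0.02, which would sit below the 0.025 level discussed above.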
Ablation study
The ablation study will systematically evaluate the impact of different components of the model, including interpolation, feature encoding, correlation analysis, and attention mechanisms, by selectively removing or modifying each component and observing the change in performance. The following experiments will be conducted:
Without interpolation
The model will be trained without the interpolation step, which is typically used to handle missing or sparse data in time series or spatial data. By removing interpolation, we can evaluate how crucial this step is for maintaining data continuity and improving model accuracy.
Hypothesis:
The removal of interpolation may lead to decreased performance in scenarios with missing or incomplete data, which is common in real-world IoT applications.
Without feature encoding
Feature encoding, which transforms raw input data into a suitable format for model processing, will be excluded in another experiment. This will show how much the feature encoding step contributes to preserving the essential information and patterns within the data.
Hypothesis:
Without feature encoding, the model may struggle to capture complex relationships within raw data, leading to lower prediction accuracy and slower convergence.
Without correlation analysis
The correlation analysis component, which reduces redundancy by identifying and removing correlated features, will be disabled. This test will help assess the impact of feature redundancy on model performance and whether correlation filtering is essential for effective learning.
Hypothesis:
Removing correlation analysis may result in poorer performance due to the model training on redundant or irrelevant features, causing overfitting or slower convergence.
Without attention mechanism
The model will be tested without the attention mechanism, which assigns different weights to features based on their importance. This experiment will assess how crucial the attention mechanism is in helping the model focus on critical aspects of the input data and improve performance in more complex scenarios.
Hypothesis:
Removing attention mechanisms will likely reduce model efficiency, as it may fail to prioritize the most relevant features, especially in heterogeneous or noisy IoT data.
Figure 9 shows GraphFedAI’s performance under clean graph, edge injection, and edge deletion situations. False positive rate (FPR) is on the right, accuracy and F1-score on the left. GraphFedAI performs well even when hostile alterations are made to the graph topology, such as introducing deceptive edges or eliminating critical connections. After a minor fall in accuracy and F1-score, both metrics remain above 96% and the FPR around 2%. These findings show that GraphFedAI can withstand topological perturbations, making it suited for IoT contexts with missing or altered data.
Fig. 9 [Images not available. See PDF.]
Resilience to adversarial graph perturbations.
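The two perturbations evaluated here, edge injection and edge deletion, can be simulated with a short Python sketch. This is an illustrative setup only: the functions `inject_edges` and `delete_edges`, the undirected edge representation as sorted tuples, the seed, and the toy graph are assumptions rather than the authors' test harness.

```python
import random


def inject_edges(edges, nodes, n_new, rng):
    """Return a copy of the edge set with n_new previously absent edges added.

    Assumption: `nodes` is sorted, so each pair (a, b) satisfies a < b.
    """
    candidates = [
        (a, b)
        for i, a in enumerate(nodes)
        for b in nodes[i + 1:]
        if (a, b) not in edges
    ]
    return set(edges) | set(rng.sample(candidates, n_new))


def delete_edges(edges, n_drop, rng):
    """Return a copy of the edge set with n_drop randomly chosen edges removed."""
    return set(edges) - set(rng.sample(sorted(edges), n_drop))


nodes = ["a", "b", "c", "d"]
edges = {("a", "b"), ("b", "c")}
rng = random.Random(0)  # fixed seed so the perturbation is repeatable

injected = inject_edges(edges, nodes, n_new=2, rng=rng)
deleted = delete_edges(edges, n_drop=1, rng=rng)
print(sorted(injected), sorted(deleted))
```

Running the detector on the clean, injected, and deleted variants of the same graph gives the kind of comparison summarized in Fig. 9.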
Table 7 shows the ablation findings for the interpolation, compression, federated learning (FL), and graph neural network (GNN) modules of the GraphFedAI framework. DDoS detection performance is assessed by removing or replacing each module. The full GraphFedAI model has the lowest false positive rate (1.2%) and the highest accuracy (98.7%). Without interpolation, accuracy drops by 3.8 percentage points, and without compression, communication cost rises by 41.3%. Disabling FL reduces performance under non-IID conditions. Replacing the GNN with an MLP reduces detection accuracy, which demonstrates the utility of graph-based modeling.
Table 7. Ablation study results for GraphFedAI modules.
Model variant | Accuracy (%) | F1-Score (%) | False positive rate (FPR) | Communication overhead | Observed impact |
---|---|---|---|---|---|
Full GraphFedAI | 98.7 | 98.4 | 1.2% | Moderate | Optimal performance with all modules integrated |
Without interpolation | 94.9 | 94.1 | 3.8% | Moderate | Drop in accuracy due to missing temporal graph continuity |
Without compression module | 98.2 | 97.8 | 1.6% | High (+ 41.3%) | Increased resource usage; minor accuracy drop |
Centralized GNN (No federated learning) | 95.7 | 94.9 | 3.2% | Very High | Reduced privacy and scalability under non-IID conditions |
MLP instead of GNN | 92.2 | 91.5 | 5.4% | Low | Poor spatial modeling; failed to capture graph structure |
Table 8 presents a comparative analysis of GraphFedAI against recent DDoS detection models. It highlights differences in learning techniques, accuracy, scalability, and privacy. GraphFedAI outperforms existing models, including FL-IDPP and PCA-CNN, by achieving superior accuracy, scalability in non-IID settings, and strong privacy preservation.
Table 8. Comparative evaluation of GraphFedAI with recent DDoS detection approaches (2023–2025).
Model | Reference | Year | Learning method | Accuracy (%) | F1-score (%) | Scalability | Privacy preservation |
---|---|---|---|---|---|---|---|
PortMap detection (CICDDoS2019) | 20 | 2024 | Machine Learning (RF, SVM) | 91.2 | 90.8 | Medium | No |
KNN-neural network hybrid | 24 | 2024 | KNN + Neural Network | 93.5 | 93.0 | Low | No |
RF-PCA-ANN hybrid | 25 | 2024 | Random Forest + PCA + ANN | 94.8 | 94.2 | Medium | No |
Ensemble detection (IEEE ISCS) | 26 | 2024 | Ensemble Learning | 95.3 | 94.8 | Medium | No |
Hybrid DL for DoS in SDN | 27 | 2023 | Hybrid Deep Learning (CNN + RNN) | 96.0 | 95.7 | High (SDN-specific) | No |
Hybrid SAE + Checkpoint DL | 29 | 2023 | Stacked Autoencoder + Checkpoint Net | 95.4 | 94.9 | Medium | No |
P3GNN (APT detection in SDN) | 30 | 2024 | Graph Neural Network | 96.8 | 96.2 | High | Partial |
Federated learning against poisoning | 31 | 2024 | Robust Privacy-Preserving FL | 97.3 | 96.9 | High | Yes |
AP2FL (Healthcare) | 32 | 2023 | Auditable Privacy-Preserving FL | 95.9 | 95.1 | Medium | Yes |
FL-IDPP | 33 | 2025 | Federated Learning + DNN | 94.1 | 93.5 | Medium | Yes |
PCA-CNN | 34 | 2024 | PCA + CNN | 91.6 | 91.0 | Low | No |
GraphFedAI (Proposed) | Proposed | – | Federated Learning + GNN + Interpolation | 98.7 | 98.4 | High (non-IID capable) | Yes |
Table 9 shows GraphFedAI’s scalability from 25 to 200 IoT nodes. Detection accuracy, F1-score, false positive rate, communication overhead per training cycle, and average local training time are evaluated. GraphFedAI consistently detects on all scales with accuracy above 97% and false positive rates below 2%. Communication overhead rises with node count, while compression keeps it low. Since the average client training time grows gradually, the framework is suitable for large-scale, decentralised IoT systems.
Table 9. Scalability evaluation of GraphFedAI from 25 to 200 IoT nodes.
Node Count | Accuracy (%) | F1-score (%) | False positive rate (%) | Communication overhead (MB/round) | Avg. training time (s) |
---|---|---|---|---|---|
25 | 98.6 | 98.2 | 1.3 | 7.2 | 3.1 |
50 | 98.3 | 97.9 | 1.4 | 12.4 | 3.8 |
100 | 97.9 | 97.5 | 1.5 | 20.1 | 4.5 |
150 | 97.4 | 97.1 | 1.6 | 26.8 | 5.0 |
200 | 97.2 | 96.9 | 1.7 | 31.5 | 5.7 |
To evaluate the practical deployability of GraphFedAI, we benchmarked its CPU, GPU, and memory utilization against two lightweight DDoS detection frameworks: FL-IDPP33 and PCA-CNN34. Tests were conducted on Raspberry Pi 4B (4 GB RAM, ARM Cortex-A72) and NVIDIA Jetson Nano (4 GB RAM, Maxwell GPU).
Table 10 summarizes the average per-device resource usage during training. Despite including GNN layers, multi-task loss, and interpolation logic, GraphFedAI operates within the hardware limits of edge devices. Model compression and sparse updates reduce upload payloads during FL rounds. The marginal resource overhead is justified by the accuracy gains of 4.6–7.1 percentage points over the baseline models.
Table 10. Average per-device resource usage during training.
Model | CPU Usage (%) | GPU usage (%) | Memory usage (MB) | Avg. upload per round (MB) |
---|---|---|---|---|
PCA-CNN34 | 38.5 | 0 | 295 | 5.8 |
FL-IDPP33 | 42.2 | 8.4 | 320 | 7.2 |
GraphFedAI | 48.9 | 13.1 | 479 | 9.3 |
Conclusion
This paper examined the GraphFedAI framework for analyzing and detecting DDoS attacks in IoT environments. Information is collected from IoT devices, and missing values are replaced using interpolation. The relationships in the data are then examined to construct the graph used to understand IoT device behavior. FL is applied to learn the node behavior and train the GNN, which improves prediction accuracy. Message passing, aggregation, and local optimization are incorporated to identify abnormal activities. Neighboring nodes are analyzed and aggregated for every feature to predict the output, and the aggregated values are updated frequently to minimize the overall prediction loss. The system uses the CICIoT-2023 dataset and is implemented in Python, achieving 98.7% to 99% accuracy across different attack types. In addition, the FL process minimizes the deviation between outputs and FPRs, indicating that the framework maintains robustness and scalability in an IoT environment.
Limitation and future work
The GraphFedAI framework detects DDoS attacks under federated learning with enhanced scalability and accuracy and with fewer false positives. However, its primary limitations are the computational overhead incurred on the graph during FL synchronization and the vulnerability of the graph topology to adversarial attacks. In future work, optimizing communication cost in practical FL frameworks and adding robust protection against adversarial graph manipulation would enable smooth application to highly dynamic, large-scale IoT scenarios. The framework’s utility can also be improved by extending it to multi-task settings, allowing simultaneous detection and prediction of anomalies.
Acknowledgements
The authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University for funding this work through Large Research Project under grant number RGP2/379/46. This research was supported by the Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R259), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Author contributions
Mohd Anjum: Conceptualization, Methodology, Software, Writing – Original Draft, Writing – Review & Editing. Ashit Kumar Dutta: Methodology, Resources, Writing – Review & Editing, Visualization, Funding acquisition. Ali Elrashidi: Conceptualization, Methodology, Resources, Writing – Original Draft, Writing – Review & Editing, Funding acquisition. Sana Shahab: Conceptualization, Methodology, Software, Data Curation, Writing – Original Draft, Writing – Review & Editing, Visualization, Funding acquisition. Asma Aldrees: Methodology, Validation, Formal analysis, Resources, Writing – Review & Editing, Visualization, Funding acquisition. Zaffar Ahmed Shaikh: Methodology, Validation, Formal analysis, Data Curation. Abeer Aljohani: Methodology, Validation, Formal analysis, Resources, Data Curation, Funding acquisition.
Funding
This work was supported by the Researchers Supporting Project Number (MHIRSP2024005) Almaarefa University, Riyadh, Saudi Arabia. The authors extend their appreciation to the support provided by the University of Business and Technology, Jeddah, Saudi Arabia and the Taibah University, Medina, Saudi Arabia.
Data availability
The data used to support the study’s findings are publicly available at https://www.unb.ca/cic/datasets/iotdataset-2023.html.
Declarations
Competing interests
The authors declare no competing interests.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
1. Gupta, BB; Quamara, M. An overview of Internet of Things (IoT): Architectural aspects, challenges, and protocols. Concurr. Comput.: Pract. Exp.; 2020; 32,
2. Chataut, R; Phoummalayvane, A; Akl, R. Unleashing the power of IoT: A comprehensive review of IoT applications and future prospects in healthcare, agriculture, smart homes, smart cities, and industry 4.0. Sensors; 2023; 23,
3. Kumari, P; Jain, AK. A comprehensive study of DDoS attacks over IoT network and their countermeasures. Comput. Secur.; 2023; 127, 103096.
4. Alazab, A; Khraisat, A; Singh, S; Bevinakoppa, S; Mahdi, OA. Routing attacks detection in 6lowpan-based Internet of things. Electronics; 2023; 12,
5. Bagchi, S; Abdelzaher, TF; Govindan, R; Shenoy, P; Atrey, A; Ghosh, P; Xu, R. New frontiers in IoT: Networking, systems, reliability, and security challenges. IEEE Internet Things J.; 2020; 7,
6. Al-Hadhrami, Y; Hussain, FK. DDoS attacks in IoT networks: A comprehensive systematic literature review. World Wide Web; 2021; 24,
7. Kadri, MR; Abdelli, A; Othman, JB; Mokdad, L. Survey and classification of Dos and DDos attack detection and validation approaches for IoT environments. Internet Things; 2024; 25, 101021.
8. Bhayo, J; Jafaq, R; Ahmed, A; Hameed, S; Shah, SA. A time-efficient approach toward DDoS attack detection in IoT network using SDN. IEEE Internet Things J.; 2021; 9,
9. Li, Q; Huang, H; Li, R; Lv, J; Yuan, Z; Ma, L et al. A comprehensive survey on DDoS defense systems: New trends and challenges. Comput. Netw.; 2023; 233, 109895.
10. Asharf, J; Moustafa, N; Khurshid, H; Debie, E; Haider, W; Wahab, A. A review of intrusion detection systems using machine and deep learning in Internet of things: Challenges, solutions and future directions. Electronics; 2020; 9,
11. Gao, C; Zheng, Y; Li, N; Li, Y; Qin, Y; Piao, J et al. A survey of graph neural networks for recommender systems: Challenges, methods, and directions. ACM Trans. Recomm. Syst.; 2023; 1,
12. Li, L; Fan, Y; Tse, M; Lin, KY. A review of applications in federated learning. Comput. Ind. Eng.; 2020; 149, 106854.
13. Yin, L; Feng, J; Xun, H; Sun, Z; Cheng, X. A privacy-preserving federated learning for multiparty data sharing in social IoTs. IEEE Trans. Netw. Sci. Eng.; 2021; 8,
14. Ma, J; Wu, F. Learning to coordinate traffic signals with adaptive network partition. IEEE Trans. Intell. Trans. Syst.; 2023; 25,
15. Wu, Y; Dai, HN; Tang, H. Graph neural networks for anomaly detection in industrial Internet of Things. IEEE Internet Things J.; 2021; 9,
16. Protogerou, A; Papadopoulos, S; Drosou, A; Tzovaras, D; Refanidis, I. A graph neural network method for distributed anomaly detection in IoT. Evol. Syst.; 2021; 12,
17. Begum, US. Federated and Multi-Modal Learning Algorithms for Healthcare and Cross-Domain Analytics. PatternIQ Min.; 2024; 1,
18. Almaraz-Rivera, JG; Perez-Diaz, JA; Cantoral-Ceballos, JA. Transport and application layer DDoS attacks detection to IoT devices by using machine learning and deep learning models. Sensors; 2022; 22,
19. Mihoub, A; Fredj, OB; Cheikhrouhou, O; Derhab, A; Krichen, M. Denial of service attack detection and mitigation for Internet of things using looking-back-enabled machine learning techniques. Comput. Electr. Eng.; 2022; 98, 107716.
20. Sharif, H; Usman, S; Hasnain, M. A Machine Learning Based Approach for the Detection of DDoS Attacks on Internet of Things Using CICDDoS2019 Dataset-PortMap. Lahore Garrison University Research Journal of Computer Science and Information Technology; 2024; 8(2).
21. Yousuf, O; Mir, RN. DDoS attack detection in Internet of Things using recurrent neural network. Comput. Electr. Eng.; 2022; 101, 108034.
22. Ur Rehman, S; Khaliq, M; Imtiaz, SI; Rasool, A; Shafiq, M; Javed, AR et al. DIDDOS: An approach for detection and identification of Distributed Denial of Service (DDoS) cyberattacks using Gated Recurrent Units (GRU). Future Gener. Comput. Syst.; 2021; 118, pp. 453-466.
23. Alkahtani, H; Aldhyani, TH. Botnet Attack Detection by Using CNN-LSTM Model for Internet of Things Applications. Secur. Commun. Netw.; 2021; 2021,
24. Gide, AI; Mu’azu, AA. A Real-Time Intrusion Detection System for DoS/DDoS Attack Classification in IoT Networks Using KNN-Neural Network Hybrid Technique. Babylonian J. Int. things; 2024; [DOI: https://dx.doi.org/10.58496/bjiot/2024/008]
25. Jalo, H; Heydarian, M. A Hybrid Technique Based on RF-PCA and ANN for Detecting DDoS Attacks IoT. InfoTech Spectr.: Iraqi J. Data Sci.; 2024; [DOI: https://dx.doi.org/10.51173/ijds.v1i1.9]
26. Srivastava, A; Tiwari, S; Kumar, D; Garg, N. Finding of DDoS Attack in IoT-Based Networks Using Ensemble Technique. 2024; [DOI: https://dx.doi.org/10.1109/iscs61804.2024.10581044]
27. Hani, E; Derya, Y-K. Hybrid Deep Learning Approach for Automatic Dos/DDoS Attacks Detection in Software-Defined Networks. Appl. Sci.; 2023; [DOI: https://dx.doi.org/10.3390/app13063828]
28. Mhamdi, L; McLernon, DC; El-moussa, F; Zaidi, SAR; Ghogho, M; Tang, T. A Deep Learning Approach Combining Autoencoder with One-class SVM for DDoS Attack Detection in SDNs. 2020; [DOI: https://dx.doi.org/10.1109/COMNET47917.2020.9306073]
29. Mousa, AK; Abdullah, MN. An improved deep learning model for DDoS detection based on hybrid stacked autoencoder and checkpoint network. Future Int.; 2023; 15,
30. Nazari, H; Yazdinejad, A; Dehghantanha, A; Zarrinkalam, F; Srivastava, G. P3GNN: A Privacy-Preserving Provenance Graph-Based Model for APT Detection in Software Defined Networking. arXiv preprint arXiv:2406.12003; 2024.
31. Yazdinejad, A; Dehghantanha, A; Karimipour, H; Srivastava, G; Parizi, RM. A robust privacy-preserving federated learning model against model poisoning attacks. IEEE Trans. Inf. Forensics Secur.; 2024.
32. Yazdinejad, A; Dehghantanha, A; Srivastava, G. AP2FL: Auditable privacy-preserving federated learning framework for electronics in healthcare. IEEE Trans. Consum. Electron.; 2023; 70,
33. Mazid, A; Kirmani, S; Manaullah; Yadav, M. FL-IDPP: A Federated Learning Based Intrusion Detection Approach With Privacy Preservation. Trans. Emerg. Telecommun. Technol.; 2025; 36,
34. Mazid, A; Kirmani, S; Abid, M. Enhanced intrusion detection framework for securing IoT network using principal component analysis and CNN. Inf. Secur. J.: A Glob. Perspect.; 2024; pp. 1-21.
© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
The Internet of Things (IoT) consists of physical objects and devices embedded with network connectivity, software, and sensors to collect and transmit data. Its growth has led to various security and privacy issues, including distributed denial-of-service (DDoS) attacks. Conventional attack detection methods face significant challenges related to privacy, scalability, and adaptability due to the dynamic nature of IoT environments. To address these limitations, this research proposes GraphFedAI, a novel framework that integrates adaptive session-based graph modeling, Pearson correlation-guided feature selection, interpolation-aware graph neural network (GNN) training, and federated learning (FL) to enable robust, scalable, and privacy-preserving DDoS detection in heterogeneous IoT networks. The framework represents the IoT network as dynamic graphs in which communication patterns among devices are modeled as edges that evolve over time. GNNs are used to extract both temporal and structural features from these graphs, thereby enhancing the accuracy of DDoS detection. FL is incorporated to maintain data privacy by training models locally on each device without sharing raw data. This integration also ensures system scalability, as FL adapts training based on localized network topology. The system is evaluated on the CIC-IoT-2023 dataset, demonstrating high detection accuracy, low false positive rates, and strong resilience under dynamic IoT conditions.
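Two of the components named above, Pearson correlation-guided feature selection and federated aggregation of locally trained models, can be illustrated with a minimal sketch. It is not the GraphFedAI implementation: the function names, the correlation cutoff of 0.3, and the use of sample-weighted averaging (in the style of FedAvg) as the aggregation rule are all assumptions made for illustration, since the abstract does not specify them.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(vx * vy)


def select_features(columns, labels, threshold=0.3):
    """Keep the features whose |r| with the label reaches the threshold.

    `columns` maps a feature name to its list of values. The threshold
    is a hypothetical cutoff, not a value taken from the paper.
    """
    return [name for name, values in columns.items()
            if abs(pearson(values, labels)) >= threshold]


def fed_avg(client_updates):
    """Sample-weighted average of client weight vectors.

    `client_updates` is a list of (weights, n_samples) pairs. Only the
    weights leave each client; the raw traffic data stays local.
    The aggregation rule is an assumption (FedAvg-style averaging).
    """
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    return [sum(w[i] * n for w, n in client_updates) / total
            for i in range(dim)]
```

In this sketch each device would call `select_features` on its own traffic features, train locally, and send only its weight vector and sample count to the aggregator, which combines them with `fed_avg`.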
Details
1 Department of Computer Engineering, Aligarh Muslim University, 202002, Aligarh, India (ROR: https://ror.org/03kw9gc02) (GRID: grid.411340.3) (ISNI: 0000 0004 1937 0765)
2 Department of Computer Science and Information Systems, College of Applied Sciences, AlMaarefa University, Ad Diriyah, 13713, Riyadh, Saudi Arabia (ROR: https://ror.org/00s3s5518) (ISNI: 0000 0004 9360 4152)
3 Electrical Engineering Department, University of Business and Technology, 21432, Jeddah, Saudi Arabia (ROR: https://ror.org/05tcr1n44) (GRID: grid.443327.5) (ISNI: 0000 0004 0417 7612)
4 Department of Business Administration, College of Business Administration, Princess Nourah Bint Abdulrahman University, PO Box 84428, 11671, Riyadh, Saudi Arabia (ROR: https://ror.org/05b0cyh02) (GRID: grid.449346.8) (ISNI: 0000 0004 0501 7602)
5 Department of Informatics and Computer Systems, College of Computer Science, King Khalid University, 61421, Abha, Saudi Arabia (ROR: https://ror.org/052kwzs30) (GRID: grid.412144.6) (ISNI: 0000 0004 1790 7100)
6 Department of Computer Science and Information Technology, Benazir Bhutto Shaheed University Lyari, 75660, Karachi, Pakistan (ROR: https://ror.org/02zwhz281) (GRID: grid.449433.d) (ISNI: 0000 0004 4907 7957); School of Engineering, École Polytechnique Fédérale de Lausanne, 1015, Lausanne, Switzerland (ROR: https://ror.org/02s376052) (GRID: grid.5333.6) (ISNI: 0000 0001 2183 9049)
7 Department of Computer Science and Informatics, Applied College, Taibah University, 42353, Madinah, Saudi Arabia (ROR: https://ror.org/01xv1nn60) (GRID: grid.412892.4) (ISNI: 0000 0004 1754 9358)