This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. Introduction
With the rapid development of mobile devices, large amounts of data are generated in real time. At the same time, rising public concern about data security [1] makes users reluctant to share their private data, ultimately creating data silos. The most straightforward way to address data silos is to collect the data centrally and process, cleanse and model it in a unified manner. In most cases, however, data leakage occurs at the collection and processing stage, which is unacceptable. Traditional machine learning methods mostly train the model in a centralised way, which requires all the training data to be stored on a central server. This approach carries the risk of privacy leakage and thus fails to safeguard users' privacy. In contrast to traditional machine learning, federated learning (FL) is a distributed training method. Specifically, in FL, users (also known as workers or clients) share models trained on local data rather than their private data [2]. Since it is difficult for attackers to recover the source data from the model, this satisfies the privacy requirements of clients.
FL can be used for applications such as improving health care, making smartphones smarter and protecting privacy. It can also help reduce the environmental impact of data centres and enable more inclusive and diverse data sources. Despite the applications and advantages of FL in various fields, it also faces serious challenges. One of these challenges is security. Although the clients participating in FL training are not required to upload private data, the distributed nature of FL means there is no guarantee that all clients participate in training honestly. Dishonest clients, also called attackers, use different types of attacks to degrade the performance of the final model. One type of attack is the data-poisoning attack, which prevents FL's central server from completing its objective. For instance, the authors in [3] designed a data-poisoning attack based on an inverted loss function that reverses the benign model, while the authors in [4] crafted a bandit-based attack area UCB (AR-UCB) algorithm to perform dynamic data-poisoning attacks, thereby hindering the FL central server's ability to fulfil its objective. Another challenge is fairness. In general, models that are favoured by the server or trained on larger datasets are given higher importance in the aggregation process. As a result, other clients, also called workers, rarely or never get to join the FL task.
There have been many research attempts to address these challenges. Previous work has proposed methods for fending off data-poisoning attacks and thereby selecting reliable clients. Most of these studies use trust models to assess client reliability, enabling the server to select credible clients. Reference [5] proposed a reputation-aware FL client selection method based on stochastic integer programming. Reference [6] utilizes deep reinforcement learning to dynamically select clients based on reputation. Other research filters out attackers by monitoring users' behaviour, so that only normal users participate in global aggregation to complete FL tasks. Reference [7] identifies attackers by computing client-wise angle similarity over the clients' last-layer gradients. Reference [8] utilizes pairwise cosine similarity, a clustering mechanism and a filtering strategy to filter out malicious updates. Reference [9] introduced an algorithm for accurately detecting malicious model updates to find malicious workers. Reference [10] introduces a fairness-aware incentive mechanism in federated learning that promotes both aggregate and reward fairness. Reference [11] exploits incentive mechanisms to fairly reward clients and thereby attract reliable and efficient clients. Reference [12] utilizes deep reinforcement learning algorithms to achieve fair model allocation in federated learning. Other studies have proposed new frameworks or algorithms to ensure fairness. In [13], an asynchronous FL framework with adaptive client selection ensures long-term fairness for clients. Reference [14] proposed an adaptive fairness learning algorithm that adjusts the fairness coefficients through local model updates and ultimately improves the generalisation of the global model. However, without a dynamic mechanism, it is difficult for the server to maximise both its security and its fairness to the participants.
Reinforcement learning (RL) is a pivotal field in machine learning that focuses on finding the best action for an agent to maximise the reward it receives in a complex, uncertain environment [15]. However, RL faces challenges such as the curse of dimensionality, the balance between exploration and exploitation and sparse rewards. To address these issues, some researchers have combined deep learning with RL, an approach called deep reinforcement learning (DRL). DRL integrates the feature-extraction capability of deep learning with the decision-making capability of RL, allowing it to solve complex real-world problems more effectively. Moreover, DRL has effectively tackled key challenges within the FL domain, such as solving the problem of online resource allocation in the vehicular fog network [16], achieving adaptive and efficient communications and credible data interactions [17] and incentivizing users to participate in model training over time [18].
In this article, we propose an approach based on the deep deterministic policy gradient (DDPG) to select reliable clients fairly. DDPG is a representative DRL algorithm that can handle high-dimensional observation spaces and continuous action spaces [19]. It fits our approach, which considers a multitude of factors and potential actions in a continuous space when selecting clients. In the proposed scheme, clients with a high security-fairness value are selected to participate in federated learning. This value is calculated from the client's security and fairness scores together with the weighting strategy. The weighting strategy is determined by DDPG based on the current environment. Specifically, DDPG uses an actor-critic network framework to handle continuous and complex action spaces and find the optimal weighting strategy [19]. In addition, techniques such as experience replay, target networks and noise-based exploration are employed to give the agent exploration capabilities and avoid falling into a local optimum. Using this approach to intelligently determine the weighting strategy, and thereby select clients, minimizes the adverse impact of unreliable clients on the global model. At the same time, the system selects reliable clients so that they can fairly participate in the federated learning process. The main contributions of this paper are listed as follows:
• We calculate the security score by using a beta distribution function to assess the trustworthiness of a local client. In addition, the fairness score is derived from the historical participation of local clients in global aggregation, evaluating their equitable involvement.
• We introduce an adaptive weighting strategy for the security-fairness value, leveraging DDPG. This approach empowers the FL system to equitably select reliable clients while mitigating the impact of unreliable ones.
• The final experimental results show that our approach enables the system to fairly select reliable clients, enhancing the security and fairness of the FL training process.
This article is structured as follows. In Section 2, we provide an overview of the existing literature on FL security and fairness. The system model is presented in Section 3. Then, we explain our proposed weighting strategy for security fairness based on DDPG in Section 4. After that, the performance results of our proposed model are analysed in Section 5. Finally, a conclusion is proposed in Section 6.
2. Related Work
FL has been extensively studied in recent work. Reference [20] proposed an adaptive client selection framework to improve convergence performance. In [21], a data-type, resource and time-aware protocol is proposed for selecting FL clients to reduce client dropout, model convergence time, the number of information exchanges and the time required to reach a certain accuracy. Reference [22] proposed selecting clients based on client evaluation accuracy to achieve a specific accuracy improvement. This method focuses only on the accuracy metric and may not perform optimally in heterogeneous client environments where accuracy varies widely. In [23], the authors propose a two-stage secure aggregate sparsification algorithm, which achieves privacy guarantees, convergence and performance improvements in FL by having the client apply a pairwise multiplicative random mask to the sparse subnetwork. In [24], the researchers introduce a framework for graph-based client selection to accommodate heterogeneity in FL; however, its effectiveness heavily relies on accurate graph representations of client relationships. Reference [25] proposed selecting clients based on predictions of dynamic network conditions and the quality of training data to tackle the challenges posed by high system heterogeneity in time-sensitive FL scenarios. This approach relies on accurate predictions of network conditions and data quality, which may not always be achievable or reliable in the real world.
Some researchers have used cryptographic authentication to screen out attackers and ensure the security of FL. In [26], the researchers validate the local model using cryptographic primitives and compare the proof result with the list of aggregated models; if the intersection length between the two crosses a threshold, the system is considered to contain an attacker. Reference [27] introduced a privacy-preserving scheme for FL in edge computing. The authors start with a streamlined protocol employing shared secrets and weighted masks to safeguard gradient privacy, enhancing defences against device dropout and collusion. In addition, they propose an algorithm utilizing digital signatures and hash functions to guarantee message integrity, consistency and resilience against replay attacks. Finally, they suggest a periodic averaging training strategy to enhance overall operational efficiency. Reference [28] proposed the use of cryptographic primitives, including masks and homomorphic encryption, to prevent privacy leakage. In [29], the authors propose an encryption scheme to defend against backdoor attacks, which employs adaptive local differential privacy techniques and compressed sensing. Reference [30] developed a comprehensive user behaviour model utilizing various attributes; the model incorporates direct trust evidence and recommended trust data to improve the accuracy of trust assessment in the presence of dynamic behavioural changes. In [31], a protocol based on BLS signatures and multiparty security is designed to verify the integrity of parameters and the correctness of results. However, this approach requires significant computational resources and communication costs.
Several solutions address FL security by employing various techniques and analyses to detect attackers. Reference [32] designed a model based on convolutional neural networks to classify normal and abnormal users, and uses a blockchain-integrated cryptography-based FL technique to remove anomalous users from the database. Training such a classification model requires a large amount of labelled data, and the way anomalous users are removed may cause the system to lose its ability to trace and diagnose problems. Reference [33] proposed covert-communication-based FL to defend against attacks; however, transmissions with this method are unstable and susceptible to sniffing. Reference [34] introduced a temporal convolutional generative network for semisupervised learning on partially labelled data to achieve network attack detection. In [35], an intrusion detection method is devised using a semisupervised FL scheme with knowledge distillation, which exploits unlabelled data to accomplish intrusion detection. These methods may be limited by device resources.
Several researchers have focused on allowing servers to fairly select clients to participate in the federated learning process. In [36], the researchers proposed a weight selection algorithm that integrates training accuracy and frequency to measure weights, ensuring fair client aggregation on the server; however, the method is limited to horizontal FL. In [37], the central server combines the local loss, data size, computation power, resource demand and last update time of local clients into an overall index and then selects a group of workers in each round to participate in the FL task; however, how to update the weight parameters is not considered. Reference [38] presents a novel optimisation objective function that makes each local model contribute fairly (though not necessarily equally) to the training of the global model; the objective consists of a fairness term and a training-loss reduction term. In [39], the authors suggested transforming the training of an individually fair FL model into an adversarial training approach, which ultimately improves both individual fairness and group fairness. In [40], the researchers quantified fairness in federated learning using the Gini coefficient and proposed fairness interventions in the data-fitting phase; in addition, they integrated a penalty term into the FL objective function to balance model performance and fairness. Nevertheless, these studies do not consider attackers and assume by default that the FL process is safe.
In [41], the authors introduced a registry based on smart contracts for tracking and recording local data. In addition, they proposed an algorithm for sampling a weighted fair training dataset, aiming to improve the fairness of the model. Reference [42] addressed fairness issues by controlling the 3D trajectory, transmission power and scheduling time of unmanned aerial vehicles (UAVs) for task offloading by mobile ground users; one UAV in each UAV pair acts as a jammer that suppresses eavesdroppers. These two studies propose separate components that address the security and the fairness of FL, respectively.
3. System Model
To address the security and fairness challenges in FL, a model based on a DDPG client selection strategy is proposed to overcome the influence of unreliable clients and fairly select reliable clients to participate in global model training. First, the FL client downloads the global model from the server. Then, the client trains and updates the model using its local dataset. Next, the client uploads the trained local model to the server. After that, the server determines the reliability of the local model using an attack detection method and calculates the security score for each client. The server then calculates the fairness score based on the clients' past participation in the global aggregation. Subsequently, the server uses the DRL algorithm to select a weighting strategy for the security-fairness value and computes the security-fairness value for each client. Finally, the server selects the local models uploaded by the clients with higher security-fairness values for aggregation to complete one global model update. The following subsections discuss the proposed DRL-based weighting strategy for the security-fairness value. The model for executing the DDPG-based client selection strategy is shown in Figure 1.
[figure(s) omitted; refer to PDF]
3.1. FL Model
In this section, FL is briefly introduced. FL is a distributed machine learning paradigm that trains a global model by aggregating locally trained models from multiple clients. Clients are not required to share their local data, only their trained models, which protects their privacy. In each FL round, a client downloads the global model from the parameter server and trains a local model on its own data. Then, the client uploads the trained model to the parameter server. After that, the server selects uploaded local models randomly or according to specific requirements, aggregates these local models and sends the aggregated global model back to the clients to start the next round of training. This training process continues until the global model satisfies the desired conditions.
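As a minimal sketch of one such round, the snippet below assumes simple FedAvg-style averaging over the selected local models; the helper functions local_train and select and their signatures are illustrative placeholders, not taken from the paper.

```python
import numpy as np

def federated_round(global_weights, clients, local_train, select):
    """One FL round: broadcast the global model, train locally, select, aggregate."""
    local_updates = []
    for c in clients:
        # Each client starts from the current global model and trains on its own data.
        local_updates.append(local_train(c, [w.copy() for w in global_weights]))
    # The server picks a subset of the uploaded models (randomly or by some criterion).
    chosen = select(clients, local_updates)
    # FedAvg-style aggregation: element-wise average of the chosen local models.
    return [np.mean([local_updates[i][k] for i in chosen], axis=0)
            for k in range(len(global_weights))]
```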
This work considers
Correspondingly, the client will download the global model parameters
The stochastic gradient descent (SGD) algorithm is employed as the training algorithm for FL in this paper. In each iteration, SGD randomly selects a batch of training instances, calculates the gradient of the batch concerning the current model parameters
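The notation above is not fully reproduced, but the local update itself can be sketched as a plain mini-batch SGD loop; the learning rate, batch size and grad_fn interface below are illustrative assumptions.

```python
import numpy as np

def local_sgd(w, data, grad_fn, lr=0.01, batch_size=32, epochs=1, seed=0):
    """Mini-batch SGD: w <- w - lr * gradient of the loss on each sampled batch."""
    rng = np.random.default_rng(seed)
    for _ in range(epochs):
        idx = rng.permutation(len(data))
        for start in range(0, len(data), batch_size):
            batch = [data[i] for i in idx[start:start + batch_size]]
            w = w - lr * grad_fn(w, batch)  # gradient w.r.t. the current parameters
    return w
```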
3.2. Adversary Model
In this work, we investigate the possibility that participating clients may intentionally or unintentionally submit malicious or unreliable models, thus undermining the integrity and effectiveness of federated learning. The reasons for this phenomenon may stem from limitations in the computational resources or expertise of the clients, resulting in models that exhibit poor generalization and even contain biases. In addition, unreliable clients may deliberately inject corrupted or manipulated models to undermine the trustworthiness and dependability of the federated learning system. The parameters of these unreliable models, denoted as
This study specifically addresses the threat of data poisoning attacks in federated learning. Data poisoning refers to the injection of maliciously forged data points into the training dataset to corrupt the integrity and quality of the training model. The common data poisoning attacks are label-flipping attack, clean-label attack and backdoor attack. In this work, we consider label-flipping attacks to undermine the integrity of model predictions. Unlike traditional adversarial attacks that modify the input data, label flipping focuses on subverting the model by tampering with the ground truth labels during the training phase. In this adversarial approach, a malicious attacker strategically changes the labels associated with the training instances to inject misinformation during the learning process. This intentional mislabelling causes the model to learn incorrect patterns and associations, thus corrupting the model, which exhibits poor generalisation and vulnerability to misclassification of unseen data.
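To make the label-flipping attack concrete, the following sketch flips a fraction of training labels to an incorrect class; the flip rule (y → y + 1 mod C) and the flip fraction are illustrative assumptions, with the fraction loosely corresponding to attack intensity.

```python
import numpy as np

def flip_labels(labels, num_classes=10, flip_fraction=1.0, seed=0):
    """Label-flipping poisoning: replace a fraction of true labels with wrong ones."""
    rng = np.random.default_rng(seed)
    poisoned = labels.copy()
    idx = rng.choice(len(labels), size=int(flip_fraction * len(labels)), replace=False)
    # Map label y to (y + 1) mod num_classes so the new label is always incorrect.
    poisoned[idx] = (poisoned[idx] + 1) % num_classes
    return poisoned
```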
3.3. Adaptive Attacks
Adaptive attacks involve an attacker dynamically adjusting their strategy based on system feedback to enhance both the effectiveness and stealth of the attack. In FL, the risk of such attacks is particularly high due to the distributed nature of model training across multiple clients. Attackers can monitor model performance during iterations and identify the optimal moment to upload a malicious model, thereby biasing the global model towards incorrect decisions. These adaptive attacks jeopardize the robustness and security of the FL system, rendering the entire training process vulnerable and degrading overall model performance, especially if client selection is poor. Moreover, attackers may further compromise user privacy by analysing model updates to infer sensitive client data. Therefore, effective strategies are essential to ensure the security of FL systems.
This paper proposes a DDPG-based dynamic weighting strategy that adjusts the influence of each client during global model training by evaluating their current performance. The goal is to effectively mitigate the impact of potential attackers and ensure the model remains efficient and accurate in the face of adversarial threats. Specifically, our approach continuously monitors the performance of all participating clients in real-time to assess their reliability throughout the training process. For instance, we utilize historical performance data and real-time feedback to prioritize clients with stable and trustworthy performance, assigning them greater weight. In addition, extra weights are allocated to clients who participate in training less frequently, ensuring that more reliable clients contribute to the global model’s training. This way, even if some clients are attacked or underperform, their negative impact on the global model is significantly reduced. Through this dynamic adjustment and selection mechanism, our approach not only enhances the robustness of the model but also strengthens the overall system’s resilience against adaptive attacks.
3.4. Problem Formulation
Since the parameter server selects only a limited number of local model parameters for global aggregation in each FL round, we introduce a security-fairness value to evaluate the clients to help the server select local clients. This security-fairness value is obtained by weighting the client’s security and fairness scores, and we considered the formulae provided by [43]. In the proposed scheme, the client’s security-fairness weighting function is given by
As shown in equation (5), the client's security-fairness value is calculated from its security score and fairness score. In this article, a beta-distribution-based security scoring system is used to evaluate the reliability of each client. Using the beta distribution for the security score provides flexibility in modelling safety probabilities, allowing the confidence level of different clients to be represented and adjusted accurately within the model. The client's fairness score is calculated from the number of times the client has participated in the global model aggregation: the more often a client participates in the aggregation, the lower its fairness score.
Trade-off parameter
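Since the exact form of equation (5) is not reproduced above, the following is only a hedged reading of the weighting: a convex combination of the two scores, with the trade-off coefficient produced by the DDPG agent in each round.

```python
def security_fairness_value(security_score, fairness_score, lam):
    """Hedged sketch of the weighting in equation (5): a convex combination of
    the two scores, where lam in [0, 1] is the strategy chosen by the DDPG agent."""
    return lam * security_score + (1.0 - lam) * fairness_score
```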
4. DDPG-Based FL Framework
4.1. Security Score Update Policy Based on Beta Distribution
To facilitate the expression and updating of the security score, the beta distribution is utilized to represent the client’s security score, where the beta distributions can be expressed as
Upon each upload of a local model by a client, the global model undergoes an impact, categorized into both positive and negative effects. The specific distinction is described in the following. Suppose the client
Based on equation (10), the security score of client
In each round of FL, the central server evaluates the local models uploaded by each client. A small portion of the test set owned by the server is used to assess the local model. Based on the performance of the local and global models on the test set, we determine the impact that the local model would have. We first define the following quantity
During the training process of FL, if
On the contrary, the parameter is updated as follows:
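The update equations themselves are not shown above. As a hedged sketch of a standard beta-reputation update consistent with this description (the exact rule in the paper may differ), the server can keep two counters per client and take the Beta mean as the security score:

```python
class BetaSecurityScore:
    """Beta-distribution reputation: alpha counts positive impacts on the global
    model, beta counts negative impacts; the score is the mean of Beta(alpha, beta)."""
    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha, self.beta = alpha, beta

    def update(self, positive_impact: bool):
        # Increment alpha after a helpful local model, beta after a harmful one.
        if positive_impact:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    def score(self) -> float:
        return self.alpha / (self.alpha + self.beta)
```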
4.2. Fairness Score Update Policy
During FL, servers need to consider fairness scores to ensure that ordinary clients participate in the global model aggregation phase. Without such a mechanism, there is a risk of biased model updating, where some clients that are favoured by the server or have better resources dominate the learning process, while others are marginalised. This would result in a trained global model that would hardly satisfy the optimal performance of all clients but would only converge to the optimal performance of a centrally trained model. The fact that the server takes fairness into account helps to promote inclusiveness and diversity in the global model, which ultimately enhances the robustness and representativeness of FL.
In this article, the number of times a client has participated in the global model aggregation process is taken into account in the fairness score. The client's fairness score is updated every time a global model update is performed. The client
When the server selects the local model uploaded by client
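The precise fairness-score formula is likewise not reproduced above; the sketch below only illustrates the stated behaviour (the score decreases as a client is selected more often), using 1/(1 + n) as an assumed decreasing function.

```python
def fairness_score(selection_count: int) -> float:
    """Illustrative decreasing function: rarely selected clients get higher scores."""
    return 1.0 / (1.0 + selection_count)

def record_selection(counts: dict, selected_ids) -> dict:
    """After each global aggregation, increment the count of every selected client."""
    for cid in selected_ids:
        counts[cid] = counts.get(cid, 0) + 1
    return counts
```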
4.3. DDPG for Security-Fairness Value
If the server only focuses on the security of the client, it ignores the risk of biased model updates. If the server only prioritises fairness, it will allow malicious attackers to succeed in their goals. This requires the server to weigh these two aspects. Specifically, the server finds it challenging to compute the optimal coefficient
In our model, the server in FL acts as an agent of DDPG, which not only collects feedback from the environment but also interacts with the environment to determine the optimal factor
1. DRL agent: the parameter server in the FL system.
2. Environment: the FL system with the DDPG-Enhanced SC model.
3. State space: the state space consists of the previous weighting strategy, the global model and information from all clients. More specifically, the server defines the state as follows:
where
4. Action space: in each round in FL, the server selects an action
5. Reward: the essence of the reward function
The objective of the agent, i.e. the server, is to find an optimal strategy
Algorithm 1: DDPG-based security-fairness value update for FL.
Input: current state
Output: weighting strategy
1. Initialize the global model parameters
2. Initialize the parameters of the actor network
3. Initialize the parameters of the target actor network
4. for
5. Initialize exploration noise
6. With probability
7. Otherwise select an action
8. Execute action
9. Store transition
10. Sample a random minibatch of transition from
11. Update
12. Update
13. Every certain steps, update
14. end for
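Putting Algorithm 1 into code form, the loop below is a hedged sketch of the interaction between the server (DDPG agent) and the FL environment; the env, agent and buffer interfaces, the noise level and the reward semantics are assumptions rather than the paper's exact implementation.

```python
import numpy as np

def ddpg_fl_training(env, agent, buffer, rounds=200, batch_size=32, noise_std=0.1):
    """Each FL round: pick a weighting strategy with exploration noise, let the FL
    environment select clients and aggregate, then learn from stored transitions."""
    state = env.reset()
    for t in range(rounds):
        # Deterministic policy output plus Gaussian exploration noise, clipped to [0, 1].
        action = np.clip(agent.act(state) + np.random.normal(0.0, noise_std), 0.0, 1.0)
        # The environment computes security-fairness values with this weighting,
        # selects clients, aggregates the global model and returns a reward.
        next_state, reward = env.step(action)
        buffer.store(state, action, reward, next_state)
        if len(buffer) >= batch_size:
            agent.learn(buffer.sample(batch_size))  # update actor, critic and targets
        state = next_state
```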
The DDPG algorithm consists of four neural networks, the actor network
The update process of the DDPG algorithm focuses on updating the parameters of the actor network and critic network. During the training phase, the agent samples a batch of data from the replay buffer. Suppose a piece of data is
After that, the
Compared with the critic network, the actor network parameters are updated more simply. After using the actor network to compute the action
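For completeness, the following TensorFlow sketch shows a standard DDPG update step: a TD-target loss for the critic, the deterministic policy gradient for the actor and soft (Polyak) updates of the target networks. It assumes Keras models with the input layouts described above, and the soft update shown here is a common variant of the periodic target update mentioned in Algorithm 1.

```python
import tensorflow as tf

def ddpg_update(actor, critic, target_actor, target_critic,
                batch, actor_opt, critic_opt, gamma=0.99, tau=0.005):
    states, actions, rewards, next_states = batch
    rewards = tf.reshape(rewards, (-1, 1))

    # Critic: minimise the MSE between Q(s, a) and the bootstrapped TD target.
    with tf.GradientTape() as tape:
        target_q = rewards + gamma * target_critic(
            [next_states, target_actor(next_states)])
        critic_loss = tf.reduce_mean(tf.square(critic([states, actions]) - target_q))
    critic_opt.apply_gradients(zip(
        tape.gradient(critic_loss, critic.trainable_variables),
        critic.trainable_variables))

    # Actor: ascend the critic's estimate of Q(s, actor(s)).
    with tf.GradientTape() as tape:
        actor_loss = -tf.reduce_mean(critic([states, actor(states)]))
    actor_opt.apply_gradients(zip(
        tape.gradient(actor_loss, actor.trainable_variables),
        actor.trainable_variables))

    # Soft (Polyak) update of the target networks.
    for target, source in ((target_actor, actor), (target_critic, critic)):
        for tv, sv in zip(target.variables, source.variables):
            tv.assign(tau * sv + (1.0 - tau) * tv)
```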
5. Performance Evaluation
5.1. Simulation Settings
In this section, the performance and reliability of the DDPG-based client selection strategy are evaluated through simulation. The simulation environment is an Nvidia GeForce RTX 3060 GPU running on Windows 11. The framework presented in this paper is developed in Python 3 with TensorFlow. The experiments were conducted on the MNIST and Fashion-MNIST datasets, each of which consists of 6000 training examples and 1000 test examples per class for the 10 classes labelled 0–9, where each example is a 28 × 28 grayscale image.
In this simulation experiment, the model iteratively trained on the MNIST dataset is used as the FL environment. For each local client, a convolutional neural network (CNN) is used for simulation training in this paper. In each training iteration, the learning rate is set to 0.01 and the batch size to 32. In this work, the FL model runs for 200 global training iterations and each client has 600 samples. In addition, the attacker's malicious data are generated in advance by corrupting the correspondence between training samples and their labels. In particular, we consider both low-intensity and high-intensity data-poisoning attacks.
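The paper does not spell out the CNN architecture, so the model below is only an assumed small CNN for 28 × 28 grayscale inputs, compiled with the stated SGD optimizer and learning rate of 0.01. Each simulated client would train such a model on its 600 local samples with a batch size of 32 before uploading the weights.

```python
import tensorflow as tf

def build_local_cnn(num_classes=10):
    """Illustrative small CNN for 28x28 grayscale inputs; the exact architecture
    used in the paper is not specified, so this layout is an assumption."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(28, 28, 1)),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```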
For the final global model of FL to take all reliable clients into account and to reduce the impact of any test-set preference for a particular client, the proposed framework considers both the reputation value of a client (i.e., the performance of the models that the client has uploaded) and the number of times the client has been selected to participate in the global aggregation process. The trade-off parameter
The DDPG algorithm used in the proposed framework involves four neural networks. The critic network evaluates and provides feedback on the consequences of the action taken by the actor network. The target actor network and the target critic network ensure the stability of the agent during training. The actor network consists of an input layer, two hidden layers and an output layer, where the number of units in the input layer equals the number of factors contained in each state. In the critic network, the numbers of units of the two hidden layers are equal to the numbers of states and actions, respectively. The ReLU function is used as the activation function for the hidden layers in all networks. The output layer of the actor network is a sigmoid layer, which limits the action space to between 0 and 1, whereas the critic output layer has no activation function. The specific parameters of the DDPG algorithm are shown in Table 1, and a sketch of these networks is given after the table.
Table 1
Simulation parameters for the weighting strategy for security-fairness value based on DDPG for FL.
Parameter | Value |
Replay memory size | 10,000 |
Batch size | 32 |
Optimizer | SGD |
Activation function | ReLU |
Learning rate of actor | 0.001 |
Learning rate of critic | 0.001 |
Discount factor | 0.99 |
Security score to fairness score mapping (a) | 100 |
Number of clients (K) | 5, 10 |
Number of attackers (M) | 2, 4 |
Number of FL iterations (T) | 200 |
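Following the architecture described above and the hyperparameters in Table 1, the actor and critic networks might be built as below; the hidden-layer widths are illustrative assumptions, since only their relationship to the state and action dimensions is stated in the text.

```python
import tensorflow as tf

def build_actor(state_dim, action_dim=1):
    """Actor: two ReLU hidden layers, sigmoid output so the weighting
    strategy lies in [0, 1]; hidden sizes are illustrative."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(state_dim,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(action_dim, activation="sigmoid"),
    ])

def build_critic(state_dim, action_dim=1):
    """Critic: takes the state and action, two ReLU hidden layers,
    linear output for the Q-value estimate."""
    state_in = tf.keras.layers.Input(shape=(state_dim,))
    action_in = tf.keras.layers.Input(shape=(action_dim,))
    x = tf.keras.layers.Concatenate()([state_in, action_in])
    x = tf.keras.layers.Dense(64, activation="relu")(x)
    x = tf.keras.layers.Dense(64, activation="relu")(x)
    q = tf.keras.layers.Dense(1)(x)
    return tf.keras.Model([state_in, action_in], q)
```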
5.2. Performance Analysis
In this section, we analyse in detail the performance of the proposed DDPG-based client selection strategy. Classifying handwritten digits in the MNIST dataset is considered as the FL task, and performance is evaluated by observing how well this task is completed.
Figure 2 compares the global model performance of the client selection strategy proposed in this paper, i.e., relying on DDPG to compute the trade-off parameter
[figure(s) omitted; refer to PDF]
Next, the number of clients
Moreover, Figure 3 exhibits the completion of the FL tasks with high-intensity poisoning attacks by the three trade-off parameter
[figure(s) omitted; refer to PDF]
Figure 4 compares the global model performance of different trade-off parameters
[figure(s) omitted; refer to PDF]
Since our proposed client selection strategy based on DDPG performs similarly in low-intensity and high-intensity data-poisoning attacks, we only explored the case of low-intensity data poisoning attacks in the experiment of changing the proportion of attackers. Figure 5 illustrates the effect of increasing the number of attackers
[figure(s) omitted; refer to PDF]
Figure 6 shows which clients the server selected to participate in the aggregation using the proposed DDPG-based client selection strategy. Figure 6(a) shows the case where the worker
[figure(s) omitted; refer to PDF]
6. Conclusion
In this paper, we introduce a novel client selection strategy for FL, aiming to reduce the risk that clients submit unreliable or malicious models that cause the FL task to fail. In addition, the proposed strategy can address the fairness issues arising from disparities in server preferences and client resources during global model aggregation. The core of this client selection strategy is to introduce a security-fairness value to comprehensively evaluate client reliability and participation. The security-fairness value is calculated from the current weighting strategy, the security score and the fairness score. The security score is computed from the historical performance dynamics captured by a beta distribution, while the fairness score quantifies the frequency of client involvement in the aggregation process. Utilizing a DDPG-based weighting strategy, the proposed scheme dynamically weighs these two scores to defend against malicious attacks and promote the fair participation of reliable clients. The experimental results validate the effectiveness of our method and establish a new standard for secure and fair client selection in FL systems.
Funding
This work was supported by the Natural Science Foundation of Jiangxi Province of China (No. 20242BAB25066), and the National Nature Science Foundation of China (No. 61962022, 62062034 and 62172160).
[1] Y. Xie, H. Wang, B. Yu, C. Zhang, "Secure Collaborative Few-Shot Learning," Knowledge-Based Systems, vol. 203,DOI: 10.1016/j.knosys.2020.106157, 2020.
[2] B. McMahan, E. Moore, D. Ramage, S. Hampson, B. A. Y. Arcas, "Communication-efficient Learning of Deep Networks From Decentralized Data," Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, vol. 54, pp. 1273-1282, 2017.
[3] P. Gupta, K. Yadav, B. B. Gupta, M. Alazab, T. R. Gadekallu, "A Novel Data Poisoning Attack in Federated Learning Based on Inverted Loss Function," Computers & Security, vol. 130,DOI: 10.1016/j.cose.2023.103270, 2023.
[4] S. Wang, Q. Li, Z. Cui, J. Hou, C. Huang, "Bandit-Based Data Poisoning Attack Against Federated Learning for Autonomous Driving Models," Expert Systems with Applications, vol. 227,DOI: 10.1016/j.eswa.2023.120295, 2023.
[5] X. Tan, W. C. Ng, W. Y. B. Lim, Z. Xiong, D. Niyato, H. Yu, "Reputation-Aware Federated Learning Client Selection Based on Stochastic Integer Programming," IEEE Transactions on Big Data,DOI: 10.1109/tbdata.2022.3191332, 2024.
[6] S. Ben Saad, B. Brik, A. Ksentini, "Toward Securing Federated Learning against Poisoning Attacks in Zero Touch B5G Networks," IEEE Transactions on Network and Service Management, vol. 20 no. 2, pp. 1612-1624, DOI: 10.1109/tnsm.2023.3278838, 2023.
[7] N. M. Jebreel, J. Domingo-Ferrer, "FL-defender: Combating Targeted Attacks in Federated Learning," Knowledge-Based Systems, vol. 260,DOI: 10.1016/j.knosys.2022.110178, 2023.
[8] X. Xiao, Z. Tang, L. Yang, Y. Song, J. Tan, K. Li, "FDSFL: Filtering Defense Strategies toward Targeted Poisoning Attacks in IIoT-Based Federated Learning Networking System," IEEE Network, vol. 37 no. 4, pp. 153-160, DOI: 10.1109/mnet.004.2200645, 2023.
[9] J. Le, D. Zhang, X. Lei, L. Jiao, K. Zeng, X. Liao, "Privacy-Preserving Federated Learning With Malicious Clients and Honest-But-Curious Servers," IEEE Transactions on Information Forensics and Security, vol. 18, pp. 4329-4344, DOI: 10.1109/tifs.2023.3295949, 2023.
[10] Z. Shi, L. Zhang, Z. Yao, "FedFAIM: A Model Performance-Based Fair Incentive Mechanism for Federated Learning," IEEE Transactions on Big Data,DOI: 10.1109/tbdata.2022.3183614, 2024.
[11] L. Gao, L. Li, Y. Chen, W. Zheng, C. Xu, M. Xu, "FIFL: A Fair Incentive Mechanism for Federated Learning," Proceedings of the 50th International Conference on Parallel Processing, 2021.
[12] T. Wan, X. Deng, W. Liao, N. Jiang, "Enhancing Fairness in Federated Learning: A Contribution-Based Differentiated Model Approach," International Journal of Intelligent Systems, vol. 2023,DOI: 10.1155/2023/6692995, 2023.
[13] H. Zhu, Y. Zhou, H. Qian, Y. Shi, X. Chen, Y. Yang, "Online Client Selection for Asynchronous Federated Learning With Fairness Consideration," IEEE Transactions on Wireless Communications, vol. 22 no. 4, pp. 2493-2506, DOI: 10.1109/twc.2022.3211998, 2023.
[14] Y. Cong, J. Qiu, K. Zhang, "Ada-FFL: Adaptive Computing Fairness Federated Learning," CAAI Transactions on Intelligence Technology, vol. 9 no. 3, pp. 573-584, DOI: 10.1049/cit2.12232, 2024.
[15] R. S. Sutton, A. G. Barto, Reinforcement Learning: An Introduction, 2018.
[16] B. Jamil, H. Ijaz, M. Shojafar, K. Munir, "IRATS: A DRL-Based Intelligent Priority and Deadline-Aware Online Resource Allocation and Task Scheduling Algorithm in a Vehicular Fog Network," Ad Hoc Networks, vol. 141,DOI: 10.1016/j.adhoc.2023.103090, 2023.
[17] Y. Lin, Z. Gao, H. Du, "Drl-Based Adaptive Sharding for Blockchain-Based Federated Learning," IEEE Transactions on Communications, vol. 71 no. 10, pp. 5992-6004, DOI: 10.1109/tcomm.2023.3288591, 2023.
[18] L. Wu, S. Guo, Z. Hong, Y. Liu, W. Xu, Y. Zhan, "Long-Term Adaptive VCG Auction Mechanism for Sustainable Federated Learning with Periodical Client Shifting," IEEE Transactions on Mobile Computing, vol. 23 no. 5, pp. 6060-6073, DOI: 10.1109/tmc.2023.3317063, 2024.
[19] T. P. Lillicrap, J. J. Hunt, A. Pritzel, "Continuous Control with Deep Reinforcement Learning," Proceedings of the International Conference on Learning Representations (ICLR), 2016.
[20] Z. Jiang, Y. Xu, H. Xu, Z. Wang, C. Qian, "Heterogeneity-Aware Federated Learning with Adaptive Client Selection and Gradient Compression," IEEE INFOCOM 2023-IEEE Conference on Computer Communications, 2023.
[21] M. Panigrahi, S. Bharti, A. Sharma, "FedDCS: A Distributed Client Selection Framework for Cross Device Federated Learning," Future Generation Computer Systems, vol. 144, pp. 24-36, DOI: 10.1016/j.future.2023.02.001, 2023.
[22] M. A. P. Putra, A. R. Putri, A. Zainudin, D.-S. Kim, J.-M. Lee, "ACS: Accuracy-Based Client Selection Mechanism for Federated Industrial IoT," Internet of Things, vol. 21,DOI: 10.1016/j.iot.2022.100657, 2023.
[23] J. Zhang, X. Li, W. Liang, P. Vijayakumar, F. Alqahtani, A. Tolba, "Two-Phase Sparsification With Secure Aggregation for Privacy-Aware Federated Learning," IEEE Internet of Things Journal, vol. 11 no. 16, pp. 27112-27125, DOI: 10.1109/jiot.2024.3400389, 2024.
[24] T. Chang, L. Li, M. Wu, W. Yu, X. Wang, C. Xu, "GraphCS: Graph-Based Client Selection for Heterogeneity in Federated Learning," Journal of Parallel and Distributed Computing, vol. 177, pp. 131-143, DOI: 10.1016/j.jpdc.2023.03.003, 2023.
[25] B. Chen, N. Ivanov, G. Wang, Q. Yan, "Dynamicfl: Balancing Communication Dynamics and Client Manipulation for Federated Learning," 2023 20th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON), pp. 312-320, 2023.
[26] J. Guo, H. Li, F. Huang, "ADFL: A Poisoning Attack Defense Framework for Horizontal Federated Learning," IEEE Transactions on Industrial Informatics, vol. 18 no. 10, pp. 6526-6536, DOI: 10.1109/tii.2022.3156645, 2022.
[27] R. Wang, J. Lai, Z. Zhang, X. Li, P. Vijayakumar, M. Karuppiah, "Privacy-Preserving Federated Learning for Internet of Medical Things Under Edge Computing," IEEE Journal of Biomedical and Health Informatics, vol. 27 no. 2, pp. 854-865, DOI: 10.1109/jbhi.2022.3157725, 2023.
[28] L. Zhang, J. Xu, P. Vijayakumar, P. K. Sharma, U. Ghosh, "Homomorphic Encryption-Based Privacy-Preserving Federated Learning in IoT-Enabled Healthcare System," IEEE Transactions on Network Science and Engineering, vol. 10 no. 5, pp. 2864-2880, DOI: 10.1109/tnse.2022.3185327, 2023.
[29] Y. Miao, R. Xie, X. Li, Z. Liu, K.-K. R. Choo, R. H. Deng, "Efficient and Secure Federated Learning Against Backdoor Attacks," IEEE Transactions on Dependable and Secure Computing, vol. 21 no. 5, pp. 4619-4636, DOI: 10.1109/tdsc.2024.3354736, 2024.
[30] J. Guo, Z. Liu, S. Tian, "TFL-DT: A Trust Evaluation Scheme for Federated Learning in Digital Twin for Mobile Networks," IEEE Journal on Selected Areas in Communications, vol. 41 no. 11, pp. 3548-3560, DOI: 10.1109/jsac.2023.3310094, 2023.
[31] H. Gao, N. He, T. Gao, "SVerifl: Successive Verifiable Federated Learning With Privacy-Preserving," Information Sciences, vol. 622, pp. 98-114, DOI: 10.1016/j.ins.2022.11.124, 2023.
[32] J. A. Alzubi, O. A. Alzubi, A. Singh, M. Ramachandran, "Cloud-IIoT-Based Electronic Health Record Privacy-Preserving by CNN and Blockchain-Enabled Federated Learning," IEEE Transactions on Industrial Informatics, vol. 19 no. 1, pp. 1080-1087, DOI: 10.1109/tii.2022.3189170, 2023.
[33] Y.-A. Xie, J. Kang, D. Niyato, "Securing Federated Learning: A Covert Communication-Based Approach," IEEE Network, vol. 37 no. 1, pp. 118-124, DOI: 10.1109/mnet.117.2200065, 2023.
[34] M. Abdel-Basset, N. Moustafa, H. Hawash, "Privacy-Preserved Cyberattack Detection in Industrial Edge of Things (IEoT): A Blockchain-Orchestrated Federated Learning Approach," IEEE Transactions on Industrial Informatics, vol. 18 no. 11, pp. 7920-7934, DOI: 10.1109/tii.2022.3167663, 2022.
[35] R. Zhao, Y. Wang, Z. Xue, T. Ohtsuki, B. Adebisi, G. Gui, "Semisupervised Federated-Learning-Based Intrusion Detection Method for Internet of Things," IEEE Internet of Things Journal, vol. 10 no. 10, pp. 8645-8657, DOI: 10.1109/jiot.2022.3175918, 2023.
[36] W. Huang, T. Li, D. Wang, S. Du, J. Zhang, T. Huang, "Fairness and Accuracy in Horizontal Federated Learning," Information Sciences, vol. 589, pp. 170-185, DOI: 10.1016/j.ins.2021.12.102, 2022.
[37] A. Sultana, M. M. Haque, L. Chen, F. Xu, X. Yuan, "Eiffel: Efficient and Fair Scheduling in Adaptive Federated Learning," IEEE Transactions on Parallel and Distributed Systems, vol. 33 no. 12, pp. 4282-4294, DOI: 10.1109/tpds.2022.3187365, 2022.
[38] S. M. Hosseini, M. Sikaroudi, M. Babaie, H. R. Tizhoosh, "Proportionally Fair Hospital Collaborations in Federated Learning of Histopathology Images," IEEE Transactions on Medical Imaging, vol. 42 no. 7, pp. 1982-1995, DOI: 10.1109/tmi.2023.3234450, 2023.
[39] J. Li, T. Zhu, W. Ren, K.-K. Raymond, "Improve Individual Fairness in Federated Learning via Adversarial Training," Computers & Security, vol. 132,DOI: 10.1016/j.cose.2023.103336, 2023.
[40] X. Li, S. Zhao, C. Chen, Z. Zheng, "Heterogeneity-Aware Fair Federated Learning," Information Sciences, vol. 619, pp. 968-986, DOI: 10.1016/j.ins.2022.11.031, 2023.
[41] S. K. Lo, Y. Liu, Q. Lu, "Toward Trustworthy AI: Blockchain-Based Architecture Design for Accountability and Fairness of Federated Learning Systems," IEEE Internet of Things Journal, vol. 10 no. 4, pp. 3276-3284, DOI: 10.1109/jiot.2022.3144450, 2023.
[42] R. Karmakar, G. Kaddoum, O. Akhrif, "A Novel Federated Learning-Based Smart Power and 3D Trajectory Control for Fairness Optimization in Secure UAV-Assisted MEC Services," IEEE Transactions on Mobile Computing, vol. 23 no. 5, pp. 4832-4848, DOI: 10.1109/tmc.2023.3298935, 2024.
[43] Z. Song, H. Sun, H. H. Yang, X. Wang, Y. Zhang, T. Q. S. Quek, "Reputation-Based Federated Learning for Secure Wireless Networks," IEEE Internet of Things Journal, vol. 9 no. 2, pp. 1212-1226, DOI: 10.1109/jiot.2021.3079104, 2022.
Copyright © 2024 Tao Wan et al. This is an open access article distributed under the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Federated learning (FL) is a machine learning technique in which a large number of clients collaborate to train models without sharing private data. However, FL's integrity is vulnerable to unreliable models; for instance, data-poisoning attacks can compromise the system. In addition, system preferences and resource disparities preclude fair participation by reliable clients. To address these challenges, we propose a novel client selection strategy that introduces a security-fairness value to measure client performance in FL. This value is a composite metric that combines a security score and a fairness score. The former is dynamically calculated from a beta distribution reflecting past performance, while the latter considers the client's participation frequency in the aggregation process. A weighting strategy based on the deep deterministic policy gradient (DDPG) determines how these two scores are combined. Experimental results confirm that our method effectively selects reliable clients in a fair manner and maintains the security and fairness of the FL system.