Introduction
In recent years, the proliferation of smart devices has driven the rapid development of the Internet of Things (IoT), which has significantly transformed everyday life and work. Applications such as smart homes, health monitoring, and intelligent transportation have greatly improved convenience [1]. However, the large-scale deployment of smart devices has introduced various challenges, particularly high-frequency interference between devices, increased computational demands, and the need for low-latency processing [2]. Because many devices struggle to meet these stringent requirements, mobile edge computing (MEC) has emerged as a promising solution [3].
The core principle of MEC is to deploy computing resources at the edge of the network, relocating computational capabilities nearer to data sources and terminal devices. By positioning computing resources at locations such as base stations, edge servers, or cloud edge nodes, MEC enables computation to occur in proximity to data sources, thereby reducing the distance and time required for data transmission [4]. This edge computing model not only minimizes communication latency but also enhances service efficiency. MEC also supports task offloading between terminal devices and edge nodes, which helps balance computational workloads [5]. Through these mechanisms, MEC achieves faster, more real-time data processing and service delivery, providing low-latency, high-bandwidth computational support for various application scenarios. Given these advantages, MEC has garnered significant attention and is widely studied in industry [6].
Despite the significant advantages of MEC, the system faces challenges such as signal attenuation and uneven coverage, which degrade communication quality. The complexity and dynamic nature of signal propagation further increase communication delays, hindering MEC’s practical application. To address these issues, intelligent reflecting surface (IRS) technology [7, 8] has been proposed as an effective solution. Strategically deploying IRS in wideband cognitive radio networks helps mitigate signal attenuation and uneven coverage [9–12], thereby improving the communication quality and reducing latency in MEC systems [13–15]. Specifically, an IRS-aided MEC system was proposed in [10], where the user-side transmission power, the base station-side receive beamforming vector, the computation resource allocation, and the IRS reflection coefficients were jointly optimized to minimize the system’s overall transmission energy consumption. In [11], researchers investigated an IRS-assisted wireless-powered MEC system operating in OFDM environments with the aim of minimizing the overall energy consumption. Joint optimization was conducted over the power distribution of wireless energy transfer signals, the local computing frequencies of wireless devices, subband device collaboration, the power allocation for computation offloading, and the reflection coefficients of the IRS. In addition, [12] investigated the security-aware computation offloading challenge in IRS-assisted MEC networks.
The integration of UAVs with MEC systems also shows great potential for improving system efficiency, bringing greater flexibility, faster response times, and stronger adaptability by replacing traditional base stations [16–18]. Research on integrating UAVs into IRS-assisted MEC networks is increasingly prominent [19, 20, 22]. For instance, the authors in [16] investigated an IRS-assisted UAV-enabled MEC system, proposing a joint optimization of bit allocation, transmission power, phase shifts, and UAV trajectory to maximize system efficiency. Simulation results indicate that the UAV-based solution significantly improves efficiency. In [17], the authors investigated a dynamic MEC system assisted by both UAVs and IRS, and the simulation results further affirmed the advantages of incorporating UAVs into the system. Moreover, the scarcity of spectrum resources is a potential issue in MEC networks, and adopting cognitive radio (CR) technology can effectively address this problem. Research on utilizing cognitive radio technology to enhance spectrum efficiency continues to emerge [23–25]. In [23], the authors proposed an innovative IRS-assisted spectrum sharing network. The transmission power of the users within the cognitive network and the reflection coefficients of the IRS are jointly optimized to maximize the achievable rate of the secondary users while meeting the signal-to-interference-plus-noise ratio requirements of the primary users. In [26], the authors investigated secure resource allocation in an IRS-assisted CR network that employs the opportunistic spectrum access paradigm. Additionally, the work in [27] discusses an IRS-assisted wideband CR network based on the sensing-based spectrum sharing paradigm. The simulations in these studies consistently show that CR technology can remarkably enhance spectrum efficiency.
However, traditional MEC systems primarily focus on optimizing physical layer parameters, such as power allocation and resource scheduling, overlooking the inherent semantic meaning of transmitted data and user tasks. Semantic communication, an emerging communication paradigm, has attracted significant attention [21, 28, 29]. The core concept of semantic communication is to take the semantic content of the information as the transmission goal, focusing on delivering the real meaning of the information rather than bit-level precision [30–33], so as to use communication resources efficiently. Therefore, combining semantic communication with UAV edge computing and cognitive radio technology is of great significance. On the one hand, UAV edge computing scenarios usually face a conflict between limited resources and huge data volumes, and semantic communication can effectively reduce the bandwidth demand of data transmission by extracting and transmitting key information [34–36], thus enhancing communication efficiency. On the other hand, cognitive radio technology provides flexibility in complex communication environments by dynamically sensing and allocating spectrum resources, but traditional communication methods use these spectrum resources with some redundancy [37–39], which semantic communication can help reduce.
In summary, semantic-aware IRS/UAV-enabled MEC can take full advantage of the high efficiency of semantic communication, the real-time advantage of UAV edge computing, and the intelligent adaptability of cognitive radio. This integration offers crucial theoretical and technical support for achieving real-time operation, enhancing robustness, and optimizing resource management within the UAV network in complex environments. In the context of future intelligent communication systems, it is essential for addressing the challenges posed by resource constraints and complex environmental conditions. Furthermore, the exploration of semantic communication in wideband cognitive radio networks offers a new perspective for resource allocation and task scheduling in spectrum-sharing networks.
In this paper, we investigate a semantic-aware IRS/UAV-assisted MEC in wideband cognitive radio system for the first time, incorporating semantic utility into the optimization framework. We jointly optimize the flight paths of both primary and secondary UAVs, subcarrier allocation, the reflection coefficients of the IRS, task offloading ratios, task priorities, and contextual relevance to maximize a weighted combination of system energy efficiency and semantic utility. By balancing physical and semantic objectives, the system dynamically adapts to the diverse user demands and task requirements characteristic of wideband cognitive radio networks. This dynamic adaptation is essential for enabling joint communication and sensing capabilities in such high-frequency environments. Moreover, because the formulated problem is non-convex, we utilize a deep reinforcement learning algorithm based on dueling deep Q-network and twin delayed deep deterministic policy gradient (DDQN-TD3). The main contributions of this paper are as follows:
This work is the first to explore semantic-aware IRS/UAV-assisted MEC wideband cognitive radio systems. By integrating semantic utility, including task priorities and contextual relevance, into the optimization framework, the system achieves a balance between energy efficiency and semantic utility. The proposed approach jointly optimizes the UAVs’ flight paths, subcarrier allocation, IRS reflection coefficients, task offloading ratios, task priorities, and contextual relevance while satisfying the maximum interference constraints imposed on primary users.
Due to the strong coupling among the optimization variables, the problem formulated in this paper is non-convex. To solve this problem, we employ the DDQN-TD3 algorithm. In addition to tackling the problem at hand, the proposed DDQN-TD3 algorithm addresses the challenge of handling a mixed action space. Specifically, the DDQN algorithm is employed for discrete action spaces, while the TD3 algorithm is applied to continuous action spaces.
Simulation results indicate that the proposed approach, involving the IRS/UAV-assisted MEC in wideband cognitive radio system, significantly improves overall system performance in comparison with the baseline schemes. Furthermore, the introduced DDQN-TD3 algorithm demonstrates effective convergence and achieves notable optimization results for the given problem.
Methods
Fig. 1 The proposed IRS/UAV-assisted MEC with wideband cognitive radio
This section considers the IRS/UAV-aided MEC in a wideband cognitive radio network. The network comprises a primary UAV (P-UAV), a secondary UAV (S-UAV), L primary users (PUs), K secondary users (SUs), and an IRS, as depicted in Figure 1. The P-UAV, S-UAV, PUs, and SUs are equipped with single antennas and computing resources, whereas the IRS comprises M reflecting elements. Let sets , , , and represent the collections of PUs, SUs, reflecting elements of the IRS, and subcarriers, respectively. To enhance system intelligence, a semantic layer is introduced. This layer extracts high-level semantic information from user requests and task data, such as user intent, task urgency, and data importance. This information guides resource allocation and decision-making processes to achieve a more efficient and user-oriented MEC system.
Semantic coding model
In the process of data transmission and processing, the main challenges include the accumulation of redundant information, limitations in storage space and bandwidth, and high consumption of computational resources. Traditional data compression methods typically rely on removing detailed information to reduce data size; however, this approach could result in the loss of crucial information, affecting the semantic integrity of the data. To address this challenge, semantic compression has emerged. Semantic compression techniques eliminate redundant information and unnecessary details while preserving the core semantic content of the data, significantly improving storage and transmission efficiency. This approach not only optimizes the use of storage space and bandwidth but also reduces the consumption of computational resources, while effectively maintaining the accuracy and integrity of the data semantics.
Task data are semantically encoded at the user side and decoded at the edge computing servers (ECSs) in this paper. This model decreases the volume of data that needs to be transmitted, leading to more efficient processing, lower energy consumption, and faster response times for computationally heavy tasks. The semantic encoding process starts at the user device, where a semantic encoder, denoted as , generates a compressed version of the task data in a semantic form at the SU k. The transformation from raw task data to its semantic representation is given by
1
where represents the raw task data at the SU k, and is the compressed semantic data.
In the actual scenario, the data may be disturbed by noise during transmission. Thus, for the SU k, the received semantic information can be represented as . Let represent the task result after the ECS processes the semantically decoded data. The result of the task, , can be expressed as
2
where is the task processing function at the ECS. The loss function is given by
3
where N is the total size of data.
Channel model
In this article, the UAVs fly at a constant altitude, denoted as H. Moreover, the primary UAV and the secondary UAV fly between their predefined stopping points (SPs), denoted as and , respectively. The coordinates of the P-UAV at SP p and the S-UAV at SP s are represented by and , respectively. Furthermore, the coordinates of the l-th PU are represented by , the coordinates of the k-th SU are represented by , and the coordinates of the IRS are represented by . On the basis of the foregoing, the distances from the P-UAV at SP p to the S-UAV at SP s, from the k-th SU to the S-UAV at SP s, from the k-th SU to the P-UAV at SP p, from the l-th PU to the P-UAV at SP p, and from the l-th PU to the S-UAV at SP s are denoted as
4a
4b
4c
4d
4e
Taking into account the distances mentioned above, the channel gains from the P-UAV at SP p to the S-UAV at SP s, from the k-th SU to the S-UAV at SP s, from the k-th SU to the P-UAV at SP p, from the l-th PU to the P-UAV at SP p, and from the l-th PU to the S-UAV at SP s are denoted as
5a
5b
5c
5d
5e
where the symbol is the channel power gain at 1 m reference distance, and denotes the exponent for path loss.
Additionally, there are reflection links in the system. The distances from the k-th SU to the IRS, from the l-th PU to the IRS, from the IRS to the S-UAV at SP s, and from the IRS to the P-UAV at SP p can be, respectively, denoted as
6a
6b
6c
6d
Based on the above information, the channel gains from the k-th SU to the IRS, from the l-th PU to the IRS, from the IRS to the S-UAV at SP s, and from the IRS to the P-UAV at SP p can be expressed as
7a
7b
7c
7d
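The distance and channel-gain expressions in Eqs. (4)-(7) are not legible in this copy, so the following is only a minimal sketch of the kind of computation they describe. It assumes the common distance-based power-law model, in which the gain equals the reference gain at 1 m multiplied by the distance raised to the negative path-loss exponent. The function names, coordinates, and parameter values are hypothetical and are not taken from the paper.

```python
import math

def distance_3d(a, b):
    """Euclidean distance between two (x, y, z) points, in metres."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def path_loss_gain(d, beta0=1e-3, alpha=2.0):
    """Assumed power-law gain: beta0 is the gain at the 1 m reference
    distance and alpha is the path-loss exponent."""
    return beta0 * d ** (-alpha)

# Hypothetical geometry: UAV at constant altitude H, user on the ground.
H = 100.0
s_uav = (50.0, 50.0, H)      # S-UAV position at one stopping point
su_k = (80.0, 90.0, 0.0)     # position of one secondary user
irs = (60.0, 60.0, 20.0)     # IRS position

# Direct link SU -> S-UAV and the two hops of the reflected link.
g_direct = path_loss_gain(distance_3d(su_k, s_uav))
g_su_irs = path_loss_gain(distance_3d(su_k, irs))
g_irs_uav = path_loss_gain(distance_3d(irs, s_uav))
```

The same two helpers cover every link listed above, since each link differs only in its pair of endpoints. The per-element phase terms of the IRS are deliberately left out of this sketch.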
Spectrum sensing
The flight paths of the P-UAV and the S-UAV are represented as and , where denotes the UAV’s starting position, denotes the position during spectrum sensing, denotes the position during the task offloading phase, and denotes the SP of the UAV after completing the task, which is the endpoint of the UAV.
In the spectrum sensing phase, the secondary network detects the spectrum usage of the primary network. The detection results can be classified into two states: subcarrier c is either occupied (represented as ) or idle (represented as ). When the subcarrier is not in use, it can be used by the secondary network to offload tasks and to transmit information. If the subcarrier is detected as being in use, the secondary network refrains from any operation on that subcarrier. Expressions representing the detection of these two states are, respectively, denoted as
8a
8b
where denotes the additive white Gaussian noise at the location of S-UAV. Let denote the reflection coefficient at the IRS, where represents the reflection amplitude of the m-th reflecting element, while denotes the corresponding reflection phase shift. In addition, P is the transmission power of the P-UAV. When the P-UAV detects that subcarrier c is in use, it transmits a signal with power P to the secondary UAV on subcarrier c.
When detecting the usage status of subcarriers, we express the probabilities of detection and false alarm as
9a
9b
where N signifies the quantity of antennas at the S-UAV, represents the sensing time, and represents the sampling frequency. Meanwhile, denotes the signal-to-noise ratio of the signal sent from the P-UAV to the S-UAV on a subcarrier c. Simultaneously, represents the detection threshold for c-th subcarrier and can be denoted as
10
where represents the desired detection probability.
Mobile edge computing
First, the S-UAV detects the occupancy status of the subcarrier. If subcarrier c is detected as unoccupied, the SU can utilize this subcarrier for task offloading. Due to the presence of the IRS, the information transmission in this paper includes both direct links and reflective links. During task offloading, the signal sent by the SU k to the S-UAV using subcarrier c, with the existence of the IRS, can be expressed as
11
where represents the task offloading signal transmitted by the SU k on subcarrier c, denotes the transmission power for computation offloading by the SU k. Additionally, indicates allocation status of subcarrier, where signifies that subcarrier c is allocated to the SU k for computation offloading, otherwise not.
However, inaccuracies in the sensing process may lead the secondary network to misjudge the occupancy status of subcarrier c. In such a scenario, the interference generated by the primary user on subcarrier c to the secondary UAV is expressed as
12
where signifies the transmission power for computation offloading of the PU l, and denotes the task-bearing signal transmitted by the PU l on subcarrier c.
In the above two scenarios, the signal-to-interference-plus-noise ratio (SINR) is written as
13a
13b
The probabilities of the above two scenarios occurring can be expressed as
14a
14b
where denotes the probability that the subcarrier is idle, and represents the probability that the subcarrier is in an active state. Then, the achievable rate of the SU k can be expressed as
15
Therefore, the total rate that can be achieved by the secondary users can be represented by
16
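Since Eqs. (9)-(16) are not legible in this copy, the following is a hedged sketch of how sensing outcomes typically feed into the achievable rate in sensing-based spectrum sharing. It assumes the standard single-antenna energy-detection result, in which the false alarm probability follows from the target detection probability, the sensing time, the sampling frequency, and the received SNR. It also assumes that the rate is the probability-weighted sum over the two scenarios described above (idle and correctly detected, occupied but missed). The multi-antenna form used in the paper may differ, and all function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def q_inv(p):
    """Inverse of Q(x) for 0 < p < 1."""
    return _STD_NORMAL.inv_cdf(1.0 - p)

def false_alarm_prob(target_pd, snr, sensing_time, sampling_freq):
    """Assumed energy-detector relation: false alarm probability when the
    threshold is chosen to meet the target detection probability."""
    n_samples = sensing_time * sampling_freq
    arg = math.sqrt(2.0 * snr + 1.0) * q_inv(target_pd) + math.sqrt(n_samples) * snr
    return q_func(arg)

def expected_rate(p_idle, pd, pf, sinr_idle, sinr_busy):
    """Spectral efficiency (bits/s/Hz) averaged over the two scenarios in
    which the secondary user actually transmits."""
    w_idle = p_idle * (1.0 - pf)            # idle and detected as idle
    w_missed = (1.0 - p_idle) * (1.0 - pd)  # occupied but detected as idle
    return (w_idle * math.log2(1.0 + sinr_idle)
            + w_missed * math.log2(1.0 + sinr_busy))

# Hypothetical numbers: longer sensing lowers the false alarm probability.
pf_short = false_alarm_prob(0.9, snr=0.01, sensing_time=1e-3, sampling_freq=6e6)
pf_long = false_alarm_prob(0.9, snr=0.01, sensing_time=1e-2, sampling_freq=6e6)
rate = expected_rate(0.6, 0.9, pf_long, sinr_idle=10.0, sinr_busy=1.0)
```

Summing `expected_rate` over the subcarriers allocated to each SU would give a total in the spirit of Eq. (16), under the same assumptions.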
Latency model
Firstly, the flight time of the primary UAV and the secondary UAV can be expressed as
17a
17b
where v represents the flight speed of the UAV. After one UAV reaches a specific SP, the other UAV must also reach its designated SP before the current stage of work can continue. Therefore, the waiting time of the UAV can be expressed as
18
In addition, assuming that the size of task data processed by k-th SU is , the total offloading delay of the SUs can be expressed as
19
where is the percentage of data that is offloaded from SU k to the S-UAV.
The computation delay at the SU and the S-UAV is represented by
20a
20b
where and represent the number of CPU cycles required to process 1 bit of data at the users and the UAV, and and denote the computational capacity of the users and the UAV, respectively. Therefore, the total latency of the proposed system is
21
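The latency components described in this subsection can be sketched as follows. Because Eqs. (17)-(21) are not legible in this copy, the forms below are assumptions inferred from the surrounding definitions: flight time is distance over speed, offloading delay is offloaded bits over rate, and computation delay is bits times cycles per bit over CPU frequency. The rule that combines them into the total latency is not reproduced, and all names and numbers are hypothetical.

```python
import math

def flight_time(start, end, speed):
    """Time to fly in a straight line between two stopping points."""
    return math.dist(start, end) / speed

def waiting_time(t_primary, t_secondary):
    """The UAV that arrives first waits for the other one."""
    return abs(t_primary - t_secondary)

def offloading_delay(task_bits, offload_ratio, rate_bps):
    """Time to transmit the offloaded share of the task."""
    return offload_ratio * task_bits / rate_bps

def local_compute_delay(task_bits, offload_ratio, cycles_per_bit, cpu_hz):
    """Time for the user to process the share kept locally."""
    return (1.0 - offload_ratio) * task_bits * cycles_per_bit / cpu_hz

def edge_compute_delay(task_bits, offload_ratio, cycles_per_bit, cpu_hz):
    """Time for the S-UAV to process the offloaded share."""
    return offload_ratio * task_bits * cycles_per_bit / cpu_hz

# Hypothetical example: 1 Mbit task, 60% offloaded over a 2 Mbit/s link.
t_p = flight_time((0.0, 0.0), (300.0, 400.0), speed=10.0)
t_s = flight_time((0.0, 0.0), (120.0, 160.0), speed=10.0)
t_wait = waiting_time(t_p, t_s)
t_off = offloading_delay(1e6, 0.6, 2e6)
t_loc = local_compute_delay(1e6, 0.6, 1000.0, 1e9)
t_edge = edge_compute_delay(1e6, 0.6, 1000.0, 5e9)
```

Keeping each component as a separate function makes it easy to test alternative ways of combining them, for example treating local and offloaded processing as parallel or as sequential.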
Energy consumption model
This paper concentrates on the energy consumption of UAVs, beginning with the energy usage associated with UAV flight, as expressed by
22
where represents the power required during UAV flight.
Furthermore, when the UAV is engaged in tasks such as task offloading, task computation, and spectrum sensing, the UAV remains in a hovering state. Consequently, the hover energy consumption of the UAV can be expressed as
23
where represents the power required for the UAV during hovering. Hence, the overall energy consumption of the UAV can be formulated as
24
In conclusion, the energy efficiency of the system can be expressed as
25
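As a small worked sketch of the energy model, the snippet below assumes that flight and hover energy are each the product of the corresponding power and duration, and that energy efficiency is the ratio of the total achievable rate to the total UAV energy. Eqs. (22)-(25) are not legible in this copy, so the exact normalization and units used in the paper may differ. All names and numbers are hypothetical.

```python
def uav_energy(p_fly, t_fly, p_hover, t_hover):
    """Total UAV energy in joules: flight energy plus hover energy."""
    return p_fly * t_fly + p_hover * t_hover

def energy_efficiency(total_rate, total_energy):
    """Assumed definition: achievable rate per unit of consumed energy."""
    if total_energy <= 0.0:
        raise ValueError("total_energy must be positive")
    return total_rate / total_energy

# Hypothetical example: 50 s of flight at 120 W, then 20 s of hovering at 100 W.
e_total = uav_energy(p_fly=120.0, t_fly=50.0, p_hover=100.0, t_hover=20.0)
ee = energy_efficiency(total_rate=4.0, total_energy=e_total)
```

In this example the UAV spends 6000 J flying and 2000 J hovering, so shortening the hover phase (for instance through faster offloading) directly raises the efficiency figure.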
Semantic utility model
To enhance system intelligence and user satisfaction, this paper introduces a semantic utility model to evaluate how well task scheduling and resource allocation meet user semantic demands. Semantic utility quantifies the importance of tasks, their completion status, and their alignment with allocated resources, and can be defined as the weighted sum of the semantic contributions of multiple tasks, given by
26
where is the total number of tasks at the current time step; is the weight of task , reflecting its significance (e.g., urgency or user priority); is the priority of task , ranging from ; is the completion rate of task , representing whether the task is completed on time, which can be calculated as the ratio of completed data to the total required data, ranging from ; and is the contextual relevance of task , indicating the match between the task and the allocated resources or network state, ranging from .
Problem formulation
In this paper, we propose the use of IRS/UAV-assisted MEC in wideband cognitive radio networks. Our goal is to maximize the overall system performance by considering both energy efficiency and semantic utility. To achieve this, we collaboratively optimize the flight paths of both the P-UAV and S-UAV and , subcarrier allocation , reflection coefficients , task offloading ratios , and semantic utility factors, including task priorities and contextual relevance . The problem framework of our proposed approach is outlined as follows.
27a
27b
27c
27d
27e
27f
27g
27h
27i
where is a balancing parameter that adjusts the weight between energy efficiency and semantic utility. C1 is the constraint on the maximum false alarm probability. C2 represents the maximum interference tolerated by the primary user for information transmission. C3 denotes the constraint on the IRS reflection phase shift. C4 represents the range constraint on the task offloading ratio. C5 signifies the subcarrier allocation status. C6 indicates that the UAVs depart from a predefined starting point and return to the termination point. C7 introduces constraints on the semantic utility factors, ensuring their validity and proper range. C8 is the maximum loss function constraint.
Resource optimization scheme
Due to the strong coupling of optimization variables in the proposed problem framework, it is inherently non-convex, making conventional solution methods challenging. Compared to traditional convex optimization techniques, deep reinforcement learning offers several advantages in solving non-convex problems. First, its high flexibility allows it to adapt to complex problem structures and dynamic environments. This adaptability is particularly beneficial for capturing the nonlinear and intricate characteristics of non-convex problems. Secondly, deep reinforcement learning possesses strong learning capabilities, continuously improving performance through experience. When tackling complex non-convex problems, it can uncover implicit patterns and approximate optimal solutions more effectively. Lastly, its neural network architecture is well-suited for handling high-dimensional and large-scale data, enabling a more comprehensive representation of non-convex problems. Given these advantages, a deep reinforcement learning algorithm is employed to address the proposed problem framework.
Problem transformation
First, we transform the proposed problem framework into a Markov Decision Process (MDP). We treat the network as the environment and the controller on the UAV as the agent. Additionally, the Markov Decision Process requires a clear definition of the state space, action space, transition probabilities, and reward function.
State space: The state space is a collection that describes the system’s states. In this study, the state space at time slot t includes the following elements: the channel state information of all channels in time slot t, the energy efficiency in time slot , the semantic utility in time slot , and the action space in time slot , which can be represented in the form of .
Action space: The set of all conceivable actions that the agent can undertake constitutes the action space. In this study, the action space at time slot t includes both discrete and continuous actions, making it a hybrid action space. Discrete actions cover the flight paths of both the P-UAV and S-UAV and subcarrier allocation, denoted as . Continuous actions encompass the IRS reflection coefficients, the task offloading ratio, task priorities and contextual relevance, denoted as . Therefore, the action space at time slot t can be represented as .
Transition probability: The probability of transition, often represented as , denotes the likelihood of the environment transitioning to the subsequent state based on the current state and action .
Reward function: In this study, the reward function should include metrics of system performance to encourage the agent to take actions that benefit the whole system. Since this paper aims to maximize the overall system performance, the reward function can be expressed as
28
where and are adjustable parameters, controlling the trade-off between energy efficiency and semantic utility .
Problem optimization based on DDQN-TD3 algorithm
Dueling DQN (DDQN) is an algorithm for estimating value functions. The algorithm exhibits several advantages when dealing with discrete action spaces. Firstly, by decoupling the value of a state from the advantage of each action, DDQN gains a better understanding of how each action contributes to the state value. This allows the model to more effectively learn the relative importance of different actions. Secondly, by independently estimating state values and action advantages, DDQN can more efficiently learn various aspects of a task, contributing to enhanced learning efficiency, especially in environments with a large number of actions. Finally, DDQN offers more stable training by mitigating instability during the training process. Through the separation of value and advantage, it adeptly handles variations in the action value function, resulting in smoother and more reliable training.
The structure of DDQN is a specialized deep neural network architecture that is primarily composed of two components: the Value Network and the Advantage Network. These two networks collaborate to estimate the Q-values for each action. Through the combination of these networks, the Q-values of DDQN can be expressed as
29
where , , and represent the neural network parameters of the Q-value network, the state value network, and the advantage network, respectively. Meanwhile, denotes the size of the action space, and represents the average value of all action advantages.
The update of Q-network parameters in the DDQN algorithm involves the use of a loss function. Specifically, the loss function is computed as the sum of two components: the mean squared error between the predicted Q-values and the target Q-values, and the mean squared error between the predicted advantage values and the calculated advantage targets. Assuming P tuples are sampled from the experience replay buffer, the loss function can be written as
30
where represents the discount factor, and denotes the neural network parameters of the target Q-value network.
The TD3 (Twin Delayed DDPG) algorithm offers several advantages for handling continuous action spaces. Firstly, TD3 maintains two Q-value networks and uses the smaller of their estimates, which helps mitigate the overestimation problem. This architecture enhances the algorithm’s robustness and reliability, contributing to improved learning efficiency. Secondly, TD3 applies target policy smoothing: clipped random noise is added to the actions of the target policy network, which regularizes the value estimates and further enhances the algorithm’s robustness.
The structure of the TD3 algorithm comprises two main components: the Actor network and the Critic network. Within the Actor-Critic framework, TD3 employs a twin Q-network structure, consisting of two independent Critic networks dedicated to estimating two Q-values, thereby mitigating the impact of overestimation. Additionally, an Actor network and its target counterpart are used for learning the policy. Furthermore, the algorithm utilizes target networks to enhance learning stability through soft updates. The target Q-value in time slot t for the TD3 algorithm can be expressed as
31
where represents the neural network parameters of the target evaluation network. In addition, the min operation in the above expression selects the smaller of the two target Q-values, mitigating the issue of overestimation.
The update of the critic network parameters involves the minimization of a loss function, which can be expressed as
32
where represents the neural network parameters of the critic network.
In the TD3 algorithm, the Actor network parameters are updated by maximizing the Q-value. Specifically, the goal of the Actor network is to learn a policy that maximizes the estimated Q-value for the chosen actions. To achieve this objective, the update loss function for the Actor network can be expressed as
33
where denotes the actor network parameters.
In summary, the algorithm starts with an initialization phase. Subsequently, taking the state of the proposed network as input, the agent produces actions that are executed in the environment to obtain reward values, and the state transitions to a new one. Tuples generated in this process are stored in the experience replay buffer. Once a sufficient number of tuples have been accumulated in the experience replay buffer, the system begins training. In each training iteration, a random batch of P tuples is sampled from the experience replay buffer to compute the loss function, and the parameters of the neural networks are updated accordingly. The parameters of the target networks are updated using a soft update strategy. This entire process aims to optimize the algorithm’s performance, enabling it to better adapt and improve decision-making processes in the environment.
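The building blocks named in this subsection can be illustrated with scalar arithmetic. The sketch below shows the dueling aggregation of a state value and action advantages, the clipped double-Q target with target policy smoothing, and the soft target update. It uses plain Python numbers instead of neural networks. The function names, hyperparameter values, and the way the discrete and continuous parts are paired into one hybrid action are assumptions made for illustration, not the authors' implementation.

```python
import random

def dueling_q_values(state_value, advantages):
    """Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]

def smoothed_target_action(action, sigma=0.2, clip=0.5, low=0.0, high=1.0, rng=random):
    """Target policy smoothing: add clipped Gaussian noise, then clip to bounds."""
    noise = max(-clip, min(clip, rng.gauss(0.0, sigma)))
    return max(low, min(high, action + noise))

def td3_target(reward, gamma, q1_next, q2_next, done=False):
    """Clipped double-Q target: use the smaller of the two target critics."""
    if done:
        return reward
    return reward + gamma * min(q1_next, q2_next)

def soft_update(target_params, online_params, tau=0.005):
    """Move each target parameter a small step toward the online parameter."""
    return [tau * w + (1.0 - tau) * wt for w, wt in zip(online_params, target_params)]

def hybrid_action(q_values, continuous_action):
    """Pair the greedy discrete choice with the actor's continuous output."""
    discrete = max(range(len(q_values)), key=q_values.__getitem__)
    return discrete, continuous_action

# Hypothetical numbers for one decision step.
q = dueling_q_values(1.0, [1.0, 2.0, 3.0])
action = hybrid_action(q, smoothed_target_action(0.4))
y = td3_target(reward=1.0, gamma=0.9, q1_next=2.0, q2_next=3.0)
new_target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1)
```

In a full implementation, the value and advantage terms would come from the two heads of the dueling network, and `q1_next` and `q2_next` from the two target critics evaluated at the smoothed target action.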
Results and discussion
To illustrate the superior performance of our proposed network approach, we present simulation results in this section. Our simulation parameter settings were referenced from the literature [22, 27] to ensure the rationality and reliability of our parameter choices.
Algorithm 1 DDQN-TD3 based intelligent optimization scheme for the proposed IRS/UAV-assisted MEC in wideband cognitive radio system
Our simulation parameters are as follows: the number of PUs and SUs is denoted as . The sampling frequency is MHz, and the probability of subcarriers being in the idle state is , while the probability of subcarriers being occupied is . Additionally, the system tolerates a maximum false alarm probability of . The number of reflecting elements of the IRS is . The UAV’s flight speed is , and the power for UAV hovering and flying is . The computational capabilities of secondary users and UAV are cycles/bit and cycles/bit, respectively.
In the simulation plots, the term “Proposed method” represents the approach utilized in this paper. “Random phase” denotes the scenario where IRS reflection coefficients are not optimized, and they are randomly generated. “Without IRS” signifies a condition where IRS is not employed in the optimization process, with other conditions remaining unchanged.
Fig. 2 The convergence performance of the proposed intelligent methodology
Figure 2 shows the convergence performance of the proposed DDQN-TD3-based optimization algorithm by plotting the smoothed cumulative rewards over 15,000 training episodes. The x-axis denotes the episode number, and the y-axis indicates the smoothed reward, which serves as an indicator of the agent’s policy quality during training. In the early training stage (approximately before episode 1000), the smoothed reward increases rapidly from about 0.18 to 0.35, demonstrating that the agent is quickly learning effective policies to improve system performance. After approximately 1500 episodes, the reward curve enters a relatively stable phase, where the values fluctuate within the range of 0.35 to 0.48. This fluctuation is expected in reinforcement learning due to the trade-off between exploration and exploitation. The overall reward trend remains stable, indicating that the proposed algorithm achieves effective convergence. The results also validate the algorithm’s ability to handle complex hybrid action spaces (including both discrete and continuous variables), and adapt to the high-dimensional state-action dynamics in the semantic-aware IRS/UAV-assisted MEC system. This convergence behavior confirms the robustness and learning capability of the DDQN-TD3 framework in optimizing semantic utility and energy efficiency under dynamic network conditions.
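The smoothing method behind the reward curve is not stated, so the following is only one plausible way to produce such a curve: a trailing moving average over per-episode rewards. The window length and the reward values are hypothetical.

```python
def moving_average(values, window):
    """Trailing moving average; early entries average over fewer samples."""
    if window <= 0:
        raise ValueError("window must be positive")
    smoothed = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        smoothed.append(running / min(i + 1, window))
    return smoothed

# Hypothetical per-episode rewards.
episode_rewards = [1.0, 2.0, 3.0, 4.0]
smoothed_rewards = moving_average(episode_rewards, window=2)
```

A longer window gives a smoother curve but delays the point at which a change in the underlying policy becomes visible.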
Fig. 3. The flight path of both the P-UAV and S-UAV
Figure 3 illustrates the optimized flight trajectories of both the P-UAV and S-UAV in the proposed IRS/UAV-assisted MEC system. The solid red line represents the P-UAV’s trajectory, while the dashed red line represents that of the S-UAV. The positions of PUs, SUs, and the IRS are also marked. The UAVs begin their missions from designated starting points and proceed through three operational phases: spectrum sensing, task offloading, and return. During the spectrum sensing phase, the P-UAV actively adjusts its position in coordination with the S-UAV’s movement to ensure optimal LoS sensing. This coordination enhances the sensing accuracy and channel estimation quality. The S-UAV, in turn, aligns its path to stay within effective sensing range of both the IRS and user terminals. It is noteworthy that despite the presence of the IRS (green triangle), the UAVs do not significantly deviate toward it during the sensing phase. This indicates that the direct channel between UAVs is sufficiently strong, and additional IRS reflection is unnecessary at this stage. In the task offloading phase, the S-UAV strategically relocates to a position that is closer to both the SUs and the IRS. This placement facilitates improved communication links via both direct and reflected paths, maximizing signal quality and offloading efficiency. The final phase concludes with both UAVs returning to their predefined endpoints. Overall, the figure highlights the dynamic adaptability of the UAVs’ flight paths, jointly optimized to balance sensing accuracy, communication efficiency, and energy consumption.
Fig. 4. The system energy efficiency exhibits variations with changes in the transmit power of the SU
Figure 4 depicts the variation in system energy efficiency (in bits/Joule/Hz) as a function of the SU transmit power, ranging from 0 dBm to 30 dBm. The results compare three scenarios: the proposed IRS-assisted method with optimized reflection coefficients, a baseline with random IRS phase shifts, and a system without IRS deployment. As the SU transmit power increases, the energy efficiency improves across all schemes. This is expected, since higher transmit power improves signal quality and thus the achievable data rate, which contributes to better energy utilization. The proposed method consistently outperforms both baseline schemes. At 30 dBm, it achieves an energy efficiency of approximately 0.66 bits/Joule/Hz, which is over 50% higher than the “Without IRS” case and more than double that of the “Random phase” scenario. Interestingly, the system with random IRS phases performs worse than the system without IRS, particularly at higher power levels. This indicates that improperly configured IRS elements can introduce additional interference or misalignment, thereby reducing the overall system efficiency, and it highlights the importance of precise IRS phase optimization in realizing the full potential of the IRS. Overall, Figure 4 validates that the proposed intelligent optimization framework leverages the IRS to enhance energy efficiency and underscores the critical role of phase control in IRS-based systems.
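The unit bits/Joule/Hz is consistent with dividing a spectral efficiency (bits/s/Hz) by a consumed power (W). The following minimal sketch assumes that definition, together with a Shannon-type rate. The exact energy-efficiency expression and power model used in the simulations are not restated in this section, so all function names here are hypothetical.

```python
import math


def dbm_to_watts(p_dbm: float) -> float:
    """Convert power from dBm to watts: P[W] = 10^((P[dBm] - 30) / 10)."""
    return 10.0 ** ((p_dbm - 30.0) / 10.0)


def spectral_efficiency(snr_linear: float) -> float:
    """Shannon spectral efficiency in bits/s/Hz for a linear SNR."""
    return math.log2(1.0 + snr_linear)


def energy_efficiency(rate_bps_per_hz: float, total_power_w: float) -> float:
    """Energy efficiency in bits/Joule/Hz: rate divided by consumed power.

    total_power_w is assumed to include every power term being accounted
    for (for example transmit, hovering, and flying power).
    """
    if total_power_w <= 0.0:
        raise ValueError("total power must be positive")
    return rate_bps_per_hz / total_power_w


def relative_gain_percent(proposed: float, baseline: float) -> float:
    """Relative improvement of `proposed` over `baseline`, in percent."""
    return 100.0 * (proposed - baseline) / baseline
```

For orientation, the end points of the plotted range, 0 dBm and 30 dBm, correspond to 1 mW and 1 W, and a gain of 50% over a baseline corresponds to an energy-efficiency ratio of 1.5.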
Fig. 5. The system energy efficiency exhibits variations with changes in the number of reflective elements for IRS
Figure 5 illustrates the relationship between system energy efficiency and the number of IRS reflecting elements. The proposed scheme achieves higher energy efficiency than the baseline schemes, and its energy efficiency improves further as the number of reflecting elements increases. This can be attributed to the larger number of reflecting elements, which enables the IRS to perform more precise beamforming and steer the electromagnetic wave propagation more effectively. Moreover, because the proposed scheme carefully adjusts the phase of each reflecting element, signals are directed more efficiently toward the target area, which enhances signal strength in the desired region and improves overall system performance. With random phase shifts, by contrast, increasing the number of IRS reflecting elements has minimal impact on performance, and the overall system performance is slightly worse than in the scenario without IRS, further highlighting the potential negative effects of random phase shifts.
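The contrast between optimized and random phases can be illustrated with a simplified narrowband, single-antenna model in which the received signal is the direct path plus the sum of the paths reflected by each IRS element. This is a sketch under stated assumptions (unit-amplitude reflection, unit-amplitude channel coefficients, hypothetical function names), not the channel model or the optimization used in the paper. With co-phased elements the reflected power gain grows as N², whereas with random phases it grows only as N on average.

```python
import cmath
import math
import random
from typing import List, Sequence, Tuple


def unit_channels(n: int, rng: random.Random) -> List[complex]:
    """Draw n unit-amplitude channel coefficients with uniform random phase."""
    return [cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n)]


def aligned_phases(h: Sequence[complex], g: Sequence[complex],
                   direct: complex = 0j) -> List[float]:
    """Phase shifts that co-phase each cascaded path h[n]*g[n] with the direct path."""
    ref = cmath.phase(direct) if direct != 0 else 0.0
    return [ref - cmath.phase(hn * gn) for hn, gn in zip(h, g)]


def channel_gain(h: Sequence[complex], g: Sequence[complex],
                 phases: Sequence[float], direct: complex = 0j) -> float:
    """Power gain |direct + sum_n h[n] * exp(j*theta[n]) * g[n]|^2."""
    reflected = sum(hn * cmath.exp(1j * th) * gn
                    for hn, th, gn in zip(h, phases, g))
    return abs(direct + reflected) ** 2


def compare(n: int, seed: int = 0) -> Tuple[float, float]:
    """Return (aligned gain, random-phase gain) for n elements, no direct path."""
    rng = random.Random(seed)
    h, g = unit_channels(n, rng), unit_channels(n, rng)
    random_phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    return (channel_gain(h, g, aligned_phases(h, g)),
            channel_gain(h, g, random_phases))
```

Calling `compare(n)` for increasing `n` shows the aligned gain growing quadratically, while the random-phase gain fluctuates around a much smaller value. This is consistent with the observation that adding elements helps only when the phases are optimized.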
Conclusion
In this paper, we propose the semantic-aware optimization of IRS/UAV-assisted MEC in wideband cognitive radio networks, aiming to enhance overall system performance by jointly considering energy efficiency and semantic utility. We achieve this by collaboratively optimizing the flight paths of both the P-UAV and S-UAV, subcarrier allocation, IRS reflection coefficients, task offloading ratios, and semantic utility factors such as task priorities and contextual relevance. Our proposed framework addresses these optimization objectives comprehensively. The simulation results validate the effectiveness of our approach, demonstrating significant improvements in both energy efficiency and semantic utility, with the adopted optimization algorithm efficiently handling the complex mixed action spaces.
Acknowledgements
The authors extend their appreciation to Henan Province for funding this work through the Higher Education Young Backbone Teacher Training Program of Henan Province, “Study for Airborne Gravimetry Solution Technologies based on Adaptive Kalman Filtering” (Grant NO. 2024GGJS157) and the Henan Province Science and Technology Key Project, “Study for RIS technologies in 6G Wireless Communications.”
Author contributions
Wei Zheng carried out the system modeling study, optimization algorithms, and algorithm implementation, and participated in drafting the manuscript. Pengshan Ren carried out the survey work, formula derivation, and writing. Qing Li participated in the optimization model design and the study of optimization algorithms. All authors read and approved the final manuscript.
Funding
This research was supported by the Higher Education Young Backbone Teacher Training Program of Henan Province, “Study for Airborne Gravimetry Solution Technologies based on Adaptive Kalman Filtering” (Grant NO. 2024GGJS157) and the Henan Province Science and Technology Key Project, “Study for RIS technologies in 6G Wireless Communications” (Grant NO. 252102210237).
Data availability
Data sharing is not applicable to this study.
Declarations
Conflict of interest
The authors declare no conflicts of interest related to this research.
Abbreviations
IoT: Internet of things
UAV: Unmanned aerial vehicle
IRS: Intelligent reflecting surface
MEC: Mobile edge computing
DDQN-TD3: Double deep Q-network and twin delayed deep deterministic policy gradient
CR: Cognitive radio
P-UAV: Primary unmanned aerial vehicle
S-UAV: Secondary unmanned aerial vehicle
PU: Primary user
SU: Secondary user
Edge computing server
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Xie, H; Qin, Z. A lite distributed semantic communication system for Internet of Things. IEEE J. Select. Areas Commun.; 2020; 39,
2. Du, H; Wang, J; Niyato, D et al. Rethinking wireless communication security in semantic Internet of Things. IEEE Wirel. Commun.; 2023; 30,
3. Z. Kaleem, F. A. Orakzai, W. Ishaq, et al. Emerging trends in UAVs: From placement, semantic communications to generative AI for mission-critical networks. IEEE Transactions on Consumer Electronics, 2024
4. A. Sharma, and C. Diwakar, Future aspects on MEC (Mobile Edge Computing): Offloading Mechanism, in: 2021 6th International Conference on Signal Processing, Computing and Control, 2021
5. J. Liu, Future smart mobile edge computing technology in mobile communication networks, in: IEEE Conference on Telecommunications, Optics and Computer Science, 2021
6. Huo, Y; Liu, Q; Gao, Q; Wu, Y; Jing, T. Joint task offloading and resource allocation for secure OFDMA-based mobile edge computing systems. Phys. Commun.; 2024; 153, 103342.
7. Wang, L; Yang, F; Chen, Y; Lai, S; Wu, W. Intelligent resource allocation for transmission security on IRS-assisted spectrum sharing systems with OFDM. Phys. Commun.; 2023; 58, 102013. [DOI: https://dx.doi.org/10.1016/j.phycom.2023.102013]
8. Wang, L; Wu, W; Zhou, F; Wu, Q; Dobre, OA; Quek, TQS. Hybrid hierarchical DRL enabled resource allocation for secure transmission in multi-IRS-assisted sensing-enhanced spectrum sharing networks. IEEE Trans. Wirel. Commun.; 2024; 23,
9. K. Liu, F. Lin, Y. Zhao and J. Zhang, Deep Reinforcement Learning Optimization Algorithm Designed for IRS-Assisted Edge Computing, in: IEEE 6th International Conference on Electronic Information and Communication Technology, 2023
10. B. Wang, R. Liu, Y. Li, C. Ding, J. Wang and H. Zhang, Joint optimization of transmission and computing resource in IRS-assisted mobile edge computing system, In: IEEE Wireless Communications and Networking Conference, 2022
11. Bai, T; Pan, C; Ren, H; Deng, Y; Elkashlan, M; Nallanathan, A. Resource allocation for intelligent reflecting surface aided wireless powered mobile edge computing in OFDM systems. IEEE Trans. Wirel. Commun.; 2021; 20,
12. M. Wu, W. Chen, K. Li, and L. Qian, Secure computation offloading for IRS-assisted mobile edge computing networks, in: IEEE/CIC International Conference on Communications in China, 2023
13. Z. Wu, H. Zhang, X. Liu, L. Li, and H. Li, IRS Empowered MEC system with computation offloading, reflecting design and beamforming optimization, IEEE Transactions on Communications, 2024, to be published
14. Zhao, S; Liu, Y; Gong, S; Gu, B; Fan, R; Lyn, B. Computation offloading and beamforming optimization for energy minimization in wireless-powered IRS-assisted MEC. IEEE Internet of Things J.; 2023; 10,
15. P. Chen, B. Lyu, S. Gong, H. Guo, J. Jiang, and Z. Yang, Computational rate maximization for IRS-assisted full-duplex wireless-powered MEC systems, IEEE Transactions on Vehicular Technology, 2023, to be published
16. F. Wang, and X. Zhang, IRS/UAV-Based edge-computing and traffic offloading over 6G THz mobile wireless networks, In: IEEE International Conference on Communications, 2023
17. Shnaiwer, YN; Kaneko, M. Minimizing IoT energy consumption by IRS-aided UAV mobile edge computing. IEEE Netw. Lett.; 2023; 5,
18. Zhang, Y; Li, J; Mu, G; Chen, X. Deep reinforcement learning enabled UAV-IRS-assisted secure mobile edge computing network. Phys. Commun.; 2023; 61, 102173. [DOI: https://dx.doi.org/10.1016/j.phycom.2023.102173]
19. Qin, X; Song, Z; Hou, T; Yu, W; Wang, J; Sun, X. Joint optimization of resource allocation, phase shift, and UAV trajectory for energy-efficient RIS-assisted UAV-enabled MEC systems. IEEE Trans. Green Commun. Netw.; 2023; 7,
20. Jiang, F; Peng, Y; Wang, K; Dong, L; Yang, K. MARS: a DRL-based multi-task resource scheduling framework for UAV with IRS-assisted mobile edge computing system. IEEE Trans. Cloud Comput.; 2023; 11,
21. Y. Li, F. Zhou, L. Yuan, et al. Cognitive semantic communication: a new communication paradigm for 6G, IEEE Communications Magazine, 2025
22. Asim, M; Elaffendi, M; Abd El-Latif, AA. Multi-IRS and multi-UAV-assisted MEC system for 5G/6G networks: efficient joint trajectory optimization and passive beamforming framework. IEEE Trans. Intell. Transp. Syst.; 2023; 24,
23. Guan, X; Wu, Q; Zhang, R. Joint power control and passive beamforming in IRS-assisted spectrum sharing. IEEE Commun. Lett.; 2020; 24,
24. Altrad, O; Muhaidat, S; Al-Dweik, A; Shami, A; Yoo, PD. Opportunistic spectrum access in cognitive radio networks under imperfect spectrum sensing. IEEE Trans. Vehic. Technol.; 2014; 63,
25. Wei, H; Lang, J. Dynamic resource allocation in IRS-assisted UAV wideband cognitive radio networks: A DDQN-TD3 approach. Phys. Commun.; 2024; 63, 102284. [DOI: https://dx.doi.org/10.1016/j.phycom.2024.102284]
26. Z. Wang, W. Wu, F. Zhou, B. Wang and Q. Wu, Secure resource allocation for IRS-assisted CRNs under opportunistic spectrum access, In: 2023 International Conference on Ubiquitous Communication, 2023
27. Y. Wu, F. Zhou, Q. Wu, Y. Huang, and R. Q. Hu, Resource allocation for IRS-assisted sensing-enhanced wideband CR networks, In: Proc. IEEE International Conference on Communications, 2021
28. Ding, G; Liu, S; Yuan, J; Yu, G. Joint URLLC traffic scheduling and resource allocation for semantic communication systems. IEEE Trans. Wirel. Commun.; 2024; 23,
29. Liu, C; Guo, C; Yang, Y et al. OFDM-based digital semantic communication with importance awareness. IEEE Trans. Commun.; 2024; 72,
30. L. Wang, W. Wu, F. Zhou, Z. Qin, Q. Wu, IRS-enhanced secure semantic communication networks: Cross-layer and context-awared resource allocation, IEEE Transactions on Wireless Communications, to be published
31. Xie, H; Qin, Z; Li, GY; Juang, B-H. Deep learning enabled semantic communication systems. IEEE Trans. Signal Process.; 2021; 69, pp. 2663-2675. [DOI: https://dx.doi.org/10.1109/TSP.2021.3071210]
32. L. Wang, W. Wu, F. Tian, H. Hu, Intelligent resource allocation for UAV-enabled spectrum sharing semantic communication networks, in: 2023 IEEE 23rd International Conference on Communication Technology (ICCT), 2023, pp. 1359-1363
33. Weng, Z; Qin, Z. Semantic communication systems for speech transmission. IEEE J. Select. Areas Commun.; 2021; 39,
34. Farshbafan, MK; Saad, W; Debbah, M. Curriculum learning for goal-oriented semantic communications with a common language. IEEE Trans. Commun.; 2023; 71,
35. L. Wang, W. Wu, F. Zhou, F. Tian, Q. Wu, W. Saad, A unified hierarchical semantic knowledge base for multi-task semantic communication, in: IEEE International Conference on Communications(ICC), 2024, pp. 2937-2943
36. H. Tong, H. Li, H. Du, Z. Yang, C. Yin, D. Niyato, Multimodal semantic communication for generative audio-driven video conferencing, IEEE Wirel. Commun. Lett.
37. Du, H; Wang, J; Niyato, D; Kang, J; Xiong, Z; Zhang, J; Shen, X. Semantic communications for wireless sensing: Ris-aided encoding and self-supervised decoding. IEEE J. Select. Areas Commun.; 2023; 41,
38. S. E. Trevlakis, N. Pappas, A.-A. A. Boulogeorgos, Towards natively intelligent semantic communications and networking, IEEE Open Journal of the Communications Society (2024)
39. L. Wang, W. Wu, F. Zhou, Intelligent resource allocation for IRS-assisted sensing-enhanced secure communication CRNs, International Conference on Ubiquitous Communication (Ucom), 2023, pp. 344-349
© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
The efficient integration of communication and computation in the Internet of Things (IoT) presents new opportunities for enhancing system performance, but it still faces challenges such as interference management, resource allocation, and task scheduling. To address these issues, this paper proposes a semantic-aware intelligent optimization framework that combines unmanned aerial vehicles (UAVs) and an intelligent reflecting surface (IRS) with mobile edge computing (MEC) to enhance communication quality and semantic awareness in wideband cognitive radio networks. The framework incorporates semantic information to achieve more efficient task scheduling and resource allocation. In particular, it jointly optimizes UAV trajectories, subcarrier allocation, IRS reflection coefficients, task offloading ratios, task priorities, and contextual relevance to maximize semantic utility and system energy efficiency while dynamically satisfying task demands. Furthermore, to tackle the non-convexity caused by highly coupled optimization variables, we employ a deep reinforcement learning algorithm based on double deep Q-network and twin delayed deep deterministic policy gradient (DDQN-TD3). Simulation results demonstrate that the proposed approach significantly outperforms baseline schemes by better aligning with user priorities, task requirements, and contextual awareness, which leads to improved task completion rates and semantic utility. The approach thus provides an innovative optimization solution for wideband cognitive radio networks.
Details
1 Henan Institute of Technology, School of Electronic and Information Engineering, Xinxiang, China (GRID:grid.503012.5); Xinxiang Key Laboratory of Signal and Information, Xinxiang, China (GRID:grid.503012.5)
2 Data Center of Jiangsu Provincial Administration for Market Regulation, Xicheng District, China (GRID:grid.503012.5)