Semantic aware intelligent optimization for

Introduction

In recent years, the rapid development of the Internet of Things (IoT) has been driven by the proliferation of smart devices, which has significantly transformed everyday life and work. Applications such as smart homes, health monitoring, and intelligent transportation have greatly improved convenience [1]. However, the large-scale deployment of smart devices has introduced various challenges, particularly in terms of high-frequency interference between devices, increased computational demands, and the need for low-latency processing [2]. Given that many devices are struggling to meet these stringent requirements, the utilization of mobile edge computing (MEC) technology has turned out to be a promising solution to address these challenges [3].

The core principle of MEC is to deploy computing resources at the edge of the network, relocating computational capabilities nearer to data sources and terminal devices. By positioning computing resources at locations such as base stations, edge servers, or cloud edge nodes, MEC enables computation to occur in proximity to data sources, thereby reducing the distance and time required for data transmission [4]. This edge computing model not only reduces communication latency but also enhances service efficiency. MEC also supports task offloading, allowing computation tasks to be offloaded between terminal devices and edge nodes, which helps balance computational workloads [5]. Through these mechanisms, MEC achieves faster, more real-time data processing and service delivery, providing low-latency, high-bandwidth computational support for various application scenarios. Given these advantages, MEC has garnered significant attention and is widely studied in industry [6].

Despite the significant advantages of MEC, the system faces challenges such as signal attenuation and uneven coverage, which degrade communication quality. The complexity and dynamic nature of signal propagation further increase communication delays, hindering the practical application of MEC. To address these issues, intelligent reflecting surface (IRS) technology [7, 8] has been proposed as an effective solution. Strategically deploying IRS in wideband cognitive radio networks helps mitigate signal attenuation and uneven coverage [9–12], thereby improving the communication quality and reducing latency in MEC systems [13–15]. Specifically, in [10], an IRS-aided MEC system was proposed, in which the user-side transmission power, the base station-side receive beamforming vector, the computation resource allocation, and the IRS reflection coefficients are jointly optimized to minimize the system’s overall transmission energy consumption. In [11], researchers investigated an IRS-assisted wireless-powered MEC system operating in OFDM environments in order to minimize the overall energy consumption. Joint optimization is conducted for the power distribution of wireless energy transfer signals, the local computing frequencies of wireless devices, subband device collaboration, the power allocation in computation offloading, and the reflection coefficients of the IRS. In addition, [12] investigated the security-aware computation offloading challenge in IRS-assisted MEC networks.

The integration of UAVs with MEC systems also shows great potential for improving system efficiency, bringing greater flexibility, faster response times, and stronger adaptability by replacing traditional base stations [16–18]. Research on integrating UAVs into IRS-assisted MEC networks is increasingly prominent [19, 20, 22]. For instance, the authors in [16] investigated an IRS-assisted UAV-enabled MEC system, proposing a joint optimization of bit allocation, transmission power, phase shifts, and UAV trajectory to maximize system efficiency. Simulation results indicate that the UAV-based solution significantly improves efficiency. In [17], the authors investigated a dynamic MEC system assisted by both UAVs and IRS. Simulation results further affirmed the advantages of incorporating UAVs into the system. Moreover, in MEC networks, the scarcity of spectrum resources is a potential issue, and adopting cognitive radio (CR) technology can effectively address this problem. Research on utilizing cognitive radio technology to enhance spectrum efficiency continues to emerge [23–25]. In [23], the authors proposed a spectrum sharing network assisted by IRS. They jointly optimize the transmission power of the users within the cognitive network and the reflection coefficients of the IRS to maximize the achievable rate of the secondary users while meeting the signal-to-interference-plus-noise ratio requirements of the primary users. In [26], the authors investigate secure resource allocation in an IRS-assisted CR network that employs the opportunistic spectrum access paradigm. Additionally, the work in [27] discusses an IRS-assisted wideband CR network based on the sensing-based spectrum sharing paradigm. The simulations in the above studies have consistently shown that CR technology can remarkably enhance spectrum efficiency.

However, traditional MEC systems primarily focus on optimizing physical layer parameters, such as power allocation and resource scheduling, overlooking the inherent semantic meaning of transmitted data and user tasks. Semantic communication, an emerging communication paradigm, has attracted significant attention [21, 28, 29]. The core concept of semantic communication is to take the semantic content of the information as the transmission goal, focusing on delivering the real meaning of the information instead of bit-level precision [30–33], so as to use communication resources efficiently. Therefore, combining semantic communication with UAV edge computing and cognitive radio technology is of great significance. On the one hand, UAV edge computing scenarios usually face a conflict between limited resources and huge data volumes, and semantic communication can effectively reduce the bandwidth demand of data transmission by extracting and transmitting key information [34–36], thus enhancing communication efficiency. On the other hand, cognitive radio technology provides flexibility for complex communication environments by dynamically sensing and allocating spectrum resources, but traditional communication methods have some redundancy in the utilization of spectrum resources [37–39].

In summary, semantic-aware IRS/UAV-enabled MEC can take full advantage of the high efficiency of semantic communication, the real-time advantage of UAV edge computing, and the intelligent adaptability of cognitive radio. This integration offers crucial theoretical and technical support for achieving real-time operation, enhancing robustness, and optimizing resource management within the UAV network in complex environments. In the context of future intelligent communication systems, it is essential for addressing the challenges posed by resource constraints and complex environmental conditions. Furthermore, the exploration of semantic communication in wideband cognitive radio networks offers a new perspective for resource allocation and task scheduling in spectrum-sharing networks.

In this paper, we investigate a semantic-aware IRS/UAV-assisted MEC system in a wideband cognitive radio network for the first time, incorporating semantic utility into the optimization framework. The optimization jointly considers the flight paths of both the primary and secondary UAVs, subcarrier allocation, the reflection coefficients of the IRS, task offloading ratios, task priorities, and contextual relevance to maximize a weighted combination of system energy efficiency and semantic utility. By balancing physical and semantic objectives, the system dynamically adapts to the diverse user demands and task requirements characteristic of wideband cognitive radio networks. This dynamic adaptation is essential for enabling joint communication and sensing capabilities in such high-frequency environments. Moreover, because the formulated problem is non-convex, we utilize a deep reinforcement learning algorithm based on the dueling deep Q-network and the twin delayed deep deterministic policy gradient (DDQN-TD3). The main contributions of this paper are as follows:

This is the first work to explore semantic-aware IRS/UAV-assisted MEC in wideband cognitive radio systems. By integrating semantic utility, including task priorities and contextual relevance, into the optimization framework, the system achieves a balance between energy efficiency and semantic utility. The proposed approach jointly optimizes the UAVs’ flight paths, subcarrier allocation, IRS reflection coefficients, task offloading ratios, task priorities, and contextual relevance while satisfying the maximum interference constraints imposed on primary users.
Due to the strong coupling between the optimization variables, the problem formulated in this paper is non-convex. To solve this problem, we employ the DDQN-TD3 algorithm, which also addresses the challenge of handling a mixed action space. Specifically, the DDQN algorithm is employed for the discrete action space, while the TD3 algorithm is applied to the continuous action space.
Simulation results indicate that the proposed approach, involving the IRS/UAV-assisted MEC in wideband cognitive radio system, significantly improves overall system performance in comparison with the baseline schemes. Furthermore, the introduced DDQN-TD3 algorithm demonstrates effective convergence and achieves notable optimization results for the given problem.

The remainder of the paper is organized as follows. Section II presents the system model and the problem formulation. Section III describes the algorithms employed to solve the proposed problem. Section IV provides the simulation results. Finally, Section V concludes the paper.

Methods

Fig. 1. The proposed IRS/UAV-assisted MEC with wideband cognitive radio [see PDF for image]

This section considers the IRS/UAV-aided MEC in a wideband cognitive radio network. The network comprises a primary UAV (P-UAV), a secondary UAV (S-UAV), L primary users (PUs), K secondary users (SUs), and an IRS, as depicted in Figure 1. The P-UAV, S-UAV, PUs, and SUs are equipped with single antennas and computing resources, whereas the IRS comprises M reflecting elements. Let the sets $\mathcal{L} = \{1, \dots, L\}$ , $\mathcal{K} = \{1, \dots, K\}$ , $\mathcal{M} = \{1, \dots, M\}$ , and $\mathcal{C} = \{1, \dots, C\}$ represent the collections of PUs, SUs, reflecting elements of the IRS, and subcarriers, respectively. To enhance system intelligence, a semantic layer is introduced. This layer extracts high-level semantic information from user requests and task data, such as user intent, task urgency, and data importance. This information guides resource allocation and decision-making processes to achieve a more efficient and user-oriented MEC system.

Semantic coding model

In the process of data transmission and processing, the main challenges include the accumulation of redundant information, limitations in storage space and bandwidth, and high consumption of computational resources. Traditional data compression methods typically rely on removing detailed information to reduce data size; however, this approach could result in the loss of crucial information, affecting the semantic integrity of the data. To address this challenge, semantic compression has emerged. Semantic compression techniques eliminate redundant information and unnecessary details while preserving the core semantic content of the data, significantly improving storage and transmission efficiency. This approach not only optimizes the use of storage space and bandwidth but also reduces the consumption of computational resources, while effectively maintaining the accuracy and integrity of the data semantics.

Task data are semantically encoded at the user side and decoded at the edge computing servers (ECSs) in this paper. This model decreases the volume of data that needs to be transmitted, leading to more efficient processing, lower energy consumption, and faster response times for computationally heavy tasks. The semantic encoding process starts at the user device, where a semantic encoder, denoted as $U_{k}$ , generates a compressed version of the task data in a semantic form at the SU k. The transformation from raw task data $J_{k}$ to its semantic representation $J_{k}^{s}$ is given by

\begin{matrix} J_{k}^{s} = U_{k} (J_{k}), \end{matrix}

where $J_{k}$ represents the raw task data at the SU k, and $J_{k}^{s}$ is the compressed semantic data.

In practice, the data may be corrupted by noise during transmission. Thus, for the SU k, the received semantic information can be represented as $J_{k}^{r}$ . Let $T_{k}^{r}$ denote the task result obtained after the ECS processes the semantically decoded data, which can be expressed as

\begin{matrix} T_{k}^{r} = G_{k} (J_{k}^{r}), \end{matrix}

where $G_{k}$ is the task processing function at the ECS. The loss function is given by

\begin{matrix} L_{k} = \frac{1}{N} \sum_{i = 1}^{N} {(J_{k, i} - T_{k, i}^{r})}^{2} \end{matrix}

where N is the total size of data.
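The loss above is a mean squared error between the raw task data and the recovered task result. A minimal sketch follows, assuming both are available as equal-length lists of numbers; the function name `semantic_loss` and the sample values are illustrative and not taken from the paper.

```python
def semantic_loss(raw, recovered):
    """Mean squared error between raw task data J_k and the task result T_k^r."""
    if not raw or len(raw) != len(recovered):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(raw)  # N: total size of the data
    return sum((j - t) ** 2 for j, t in zip(raw, recovered)) / n


# Hypothetical sample data: two of the four entries are off by 0.5.
raw_data = [1.0, 2.0, 3.0, 4.0]
task_result = [1.0, 2.5, 2.5, 4.0]
loss = semantic_loss(raw_data, task_result)
```

A loss of zero means the task result reproduces the raw data exactly; larger values indicate a greater loss of semantic fidelity.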

Channel model

In this article, the UAVs fly at a constant altitude, denoted as H. Moreover, the primary UAV and the secondary UAV fly between their predefined stopping points (SPs), denoted as $Q^{p} = [1, \dots, p, \dots, P]$ and $Q^{s} = [1, \dots, s, \dots, S]$ , respectively. The coordinates of the P-UAV at SP p and the S-UAV at SP s are represented by $q_{p} = (x_{p}, y_{p}, z_{p})$ and $q_{s} = (x_{s}, y_{s}, z_{s})$ , respectively. Furthermore, the coordinates of the l-th PU are represented by $q_{l} = (x_{l}, y_{l}, 0)$ , the coordinates of the k-th SU are represented by $q_{k} = (x_{k}, y_{k}, 0)$ , and the coordinates of the IRS are represented by $q_{I} = (x_{I}, y_{I}, z_{I})$ . Accordingly, the distances from the P-UAV at SP p to the S-UAV at SP s, from the k-th SU to the S-UAV at SP s, from the k-th SU to the P-UAV at SP p, from the l-th PU to the P-UAV at SP p, and from the l-th PU to the S-UAV at SP s are denoted as

\begin{matrix} d_{U}^{p, s} & = \sqrt{{(x_{p} - x_{s})}^{2} + {(y_{p} - y_{s})}^{2} + {(z_{p} - z_{s})}^{2}}, \end{matrix}

\begin{matrix} d_{kS}^{s} & = \sqrt{{(x_{s} - x_{k})}^{2} + {(y_{s} - y_{k})}^{2} + {z_{s}}^{2}}, \end{matrix}

\begin{matrix} d_{kP}^{p} & = \sqrt{{(x_{p} - x_{k})}^{2} + {(y_{p} - y_{k})}^{2} + {z_{p}}^{2}}, \end{matrix}

\begin{matrix} d_{lP}^{p} & = \sqrt{{(x_{p} - x_{l})}^{2} + {(y_{p} - y_{l})}^{2} + {z_{p}}^{2}}, \end{matrix}

\begin{matrix} d_{lS}^{s} & = \sqrt{{(x_{s} - x_{l})}^{2} + {(y_{s} - y_{l})}^{2} + {z_{s}}^{2}} . \end{matrix}

Taking into account the distances mentioned above, the channel gains from the P-UAV at SP p to the S-UAV at SP s, from the k-th SU to the S-UAV at SP s, from the k-th SU to the P-UAV at SP p, from the l-th PU to the P-UAV at SP p, and from the l-th PU to the S-UAV at SP s are denoted as

\begin{matrix} h_{U}^{p, s} = \sqrt{α {(d_{U}^{p, s})}^{- β}}, \end{matrix}

\begin{matrix} h_{kS}^{s} = \sqrt{α {(d_{kS}^{s})}^{- β}}, \end{matrix}

\begin{matrix} h_{kP}^{p} = \sqrt{α {(d_{kP}^{p})}^{- β}}, \end{matrix}

\begin{matrix} h_{lP}^{p} = \sqrt{α {(d_{lP}^{p})}^{- β}}, \end{matrix}

\begin{matrix} h_{lS}^{s} = \sqrt{α {(d_{lS}^{s})}^{- β}}, \end{matrix}

where $α$ is the channel power gain at the 1 m reference distance, and $β$ denotes the path loss exponent.
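The direct-link gains above depend only on the Euclidean distance between two nodes, the reference gain $α$, and the path loss exponent $β$. A minimal sketch follows; the coordinates and parameter values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math


def channel_gain(pos_a, pos_b, alpha, beta):
    """Amplitude gain sqrt(alpha * d^(-beta)) between two 3-D positions."""
    d = math.dist(pos_a, pos_b)  # Euclidean distance in metres
    return math.sqrt(alpha * d ** (-beta))


# Hypothetical example: S-UAV at 120 m altitude, SU on the ground.
s_uav = (0.0, 0.0, 120.0)
su_k = (30.0, 40.0, 0.0)
d_ks = math.dist(s_uav, su_k)           # sqrt(30^2 + 40^2 + 120^2) = 130 m
h_ks = channel_gain(s_uav, su_k, alpha=1e-3, beta=2.0)
```

Because the users lie on the ground plane, the vertical term reduces to the UAV altitude, which matches the $z_{s}^{2}$ and $z_{p}^{2}$ terms in the distance expressions.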

Additionally, there are reflection links in the system. The distances from the k-th SU to the IRS, from the l-th PU to the IRS, from the IRS to the S-UAV at SP s, and from the IRS to the P-UAV at SP p can be, respectively, denoted as

\begin{matrix} d_{kI} & = \sqrt{{(x_{I} - x_{k})}^{2} + {(y_{I} - y_{k})}^{2} + {z_{I}}^{2}}, \end{matrix}

\begin{matrix} d_{lI} & = \sqrt{{(x_{I} - x_{l})}^{2} + {(y_{I} - y_{l})}^{2} + {z_{I}}^{2}}, \end{matrix}

\begin{matrix} d_{IS}^{s} & = \sqrt{{(x_{s} - x_{I})}^{2} + {(y_{s} - y_{I})}^{2} + {(z_{s} - z_{I})}^{2}}, \end{matrix}

\begin{matrix} d_{IP}^{p} & = \sqrt{{(x_{p} - x_{I})}^{2} + {(y_{p} - y_{I})}^{2} + {(z_{p} - z_{I})}^{2}} . \end{matrix}

Based on the above information, the channel gains from the k-th SU to the IRS, from the l-th PU to the IRS, from the IRS to the S-UAV at SP s, and from the IRS to the P-UAV at SP p can be expressed as

\begin{matrix} h_{kI} & = \sqrt{α {(d_{kI})}^{- β}} {[1, e^{- j \frac{2 π}{λ} d φ_{kI}}, \dots, e^{- j \frac{2 π}{λ} (M - 1) d φ_{kI}}]}^{T}, \end{matrix}

\begin{matrix} h_{lI} & = \sqrt{α {(d_{lI})}^{- β}} {[1, e^{- j \frac{2 π}{λ} d φ_{lI}}, \dots, e^{- j \frac{2 π}{λ} (M - 1) d φ_{lI}}]}^{T}, \end{matrix}

\begin{matrix} h_{IS}^{s} & = \sqrt{α {(d_{IS}^{s})}^{- β}} {[1, e^{- j \frac{2 π}{λ} d φ_{IS}^{s}}, \dots, e^{- j \frac{2 π}{λ} (M - 1) d φ_{IS}^{s}}]}^{T}, \end{matrix}

\begin{matrix} h_{IP}^{p} & = \sqrt{α {(d_{IP}^{p})}^{- β}} {[1, e^{- j \frac{2 π}{λ} d φ_{IP}^{p}}, \dots, e^{- j \frac{2 π}{λ} (M - 1) d φ_{IP}^{p}}]}^{T} . \end{matrix}
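To illustrate how the reflection links combine, the sketch below builds two array-response vectors and evaluates a cascaded link of the form $(h_{IS})^{H} ϕ h_{kI}$. It rests on several assumptions not stated in the text: the IRS is treated as a uniform linear array with one vector entry per reflecting element, $d$ is the element spacing, $λ$ the carrier wavelength, $φ$ a direction term, and the reflection coefficients act element-wise (as a diagonal matrix). All numeric values are hypothetical.

```python
import cmath
import math

ALPHA, BETA = 1e-3, 2.2        # hypothetical reference gain and path loss exponent
M = 8                          # number of reflecting elements
WAVELENGTH, SPACING = 0.1, 0.05


def array_response(num_elements, spacing, wavelength, phi):
    """[1, e^{-j(2pi/lambda) d phi}, ..., e^{-j(2pi/lambda)(M-1) d phi}]."""
    step = 2.0 * math.pi / wavelength * spacing * phi
    return [cmath.exp(-1j * m * step) for m in range(num_elements)]


def reflection_link(dist, phi):
    """Path-loss-scaled array response for one IRS link."""
    scale = math.sqrt(ALPHA * dist ** (-BETA))
    return [scale * r for r in array_response(M, SPACING, WAVELENGTH, phi)]


def effective_channel(h_direct, h_irs_uav, coeffs, h_user_irs):
    """Direct link plus cascaded link h_IS^H * diag(coeffs) * h_kI."""
    reflected = sum(a.conjugate() * c * b
                    for a, c, b in zip(h_irs_uav, coeffs, h_user_irs))
    return h_direct + reflected


h_kI = reflection_link(50.0, 0.3)    # SU k -> IRS
h_IS = reflection_link(80.0, -0.5)   # IRS -> S-UAV

# Unit-amplitude coefficients whose phases cancel each per-element phase,
# so that all M reflected terms add coherently.
aligned = [cmath.exp(-1j * cmath.phase(a.conjugate() * b))
           for a, b in zip(h_IS, h_kI)]
h_eff = effective_channel(0.0, h_IS, aligned, h_kI)
```

With phase-aligned coefficients the reflected amplitude grows linearly in the number of elements, which is the usual motivation for optimizing the IRS phase shifts.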

Spectrum sensing

The flight paths of the P-UAV and the S-UAV are represented as $q_{A} = \{q_{A}^{1}, q_{A}^{2}, q_{A}^{3}, q_{A}^{4}\}$ and $q_{B} = \{q_{B}^{1}, q_{B}^{2}, q_{B}^{3}, q_{B}^{4}\}$ , respectively, where $q^{1}$ denotes the UAV’s starting position, $q^{2}$ denotes its position during spectrum sensing, $q^{3}$ denotes its position during the task offloading phase, and $q^{4}$ denotes the SP reached after completing the task, which is the endpoint of the UAV.

In the spectrum sensing phase, the secondary network detects the spectrum usage of the primary network. The detection results can be classified into two states: subcarrier c is either occupied (represented as $H_{c}^{1}$ ) or idle (represented as $H_{c}^{0}$ ). When the subcarrier is idle, the secondary network can use it to offload tasks and transmit information. If the subcarrier is detected as being in use, the secondary network refrains from any operation on that subcarrier. The received signals under these two states are, respectively, denoted as

\begin{matrix} H_{c}^{0} : y_{s, c} = n_{s}, \end{matrix}

\begin{matrix} H_{c}^{1} : y_{s, c} = (h_{U}^{q_{A}^{2}, q_{B}^{2}} + {(h_{IS}^{q_{B}^{2}})}^{H} ϕ h_{IP}^{q_{A}^{2}}) \sqrt{P} x + n_{s}, \end{matrix}

where $n_{s}$ denotes the additive white Gaussian noise at the location of the S-UAV. Let $ϕ = \{α_{1} e^{j θ_{1}}, α_{2} e^{j θ_{2}}, \dots, α_{M} e^{j θ_{M}}\}$ denote the reflection coefficients at the IRS, where $α_{m}$ represents the reflection amplitude of the m-th reflecting element, while $θ_{m}$ denotes the corresponding reflection phase shift. In addition, P is the transmission power of the P-UAV. When the P-UAV detects that subcarrier c is in use, it transmits a signal with power P to the secondary UAV on subcarrier c.

When detecting the usage status of the subcarriers, the detection probability and the false-alarm probability are expressed as

\begin{matrix} P_{d, c} & = Q (\frac{η_{c} - σ_{s}^{2} (N + γ_{c})}{(σ_{s}^{2} / \sqrt{τ f_{s}}) \sqrt{2 γ_{c} + N}}), \end{matrix}

\begin{matrix} P_{f, c} & = Q (\frac{η_{c} - σ_{s}^{2} N}{(σ_{s}^{2} / \sqrt{τ f_{s}}) \sqrt{N}}), \end{matrix}

where N signifies the number of antennas at the S-UAV, $τ$ represents the sensing time, and $f_{s}$ represents the sampling frequency. Meanwhile, $γ_{c}$ denotes the signal-to-noise ratio of the signal sent from the P-UAV to the S-UAV on subcarrier c, and $η_{c}$ represents the detection threshold for the c-th subcarrier, which can be written as

\begin{matrix} η_{c} = (Q^{- 1} (\bar{P_{d}}) \sqrt{\frac{2 γ_{c} + N}{τ f_{s}}} + γ_{c} + N) σ_{s}^{2}, \end{matrix}

where $\bar{P_{d}}$ represents the desired detection probability.
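The three sensing expressions above are linked: setting the threshold from a target detection probability and substituting it back into the detection formula should return that target. The sketch below checks this round trip numerically, assuming $Q$ is the standard Gaussian tail function and $σ_{s}^{2}$ is the noise power at the S-UAV. Function names and parameter values are illustrative only.

```python
import math
from statistics import NormalDist


def q_function(x):
    """Gaussian tail probability Q(x) = P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def q_inverse(p):
    """Inverse of the Q-function for 0 < p < 1."""
    return NormalDist().inv_cdf(1.0 - p)


def detection_threshold(target_pd, snr, n_ant, tau, fs, noise_power):
    """Threshold eta_c that achieves the desired detection probability."""
    spread = math.sqrt((2.0 * snr + n_ant) / (tau * fs))
    return (q_inverse(target_pd) * spread + snr + n_ant) * noise_power


def detection_probability(eta, snr, n_ant, tau, fs, noise_power):
    num = eta - noise_power * (n_ant + snr)
    den = (noise_power / math.sqrt(tau * fs)) * math.sqrt(2.0 * snr + n_ant)
    return q_function(num / den)


def false_alarm_probability(eta, n_ant, tau, fs, noise_power):
    num = eta - noise_power * n_ant
    den = (noise_power / math.sqrt(tau * fs)) * math.sqrt(n_ant)
    return q_function(num / den)


# Hypothetical setting: 1 ms sensing time, 1 MHz sampling, linear SNR of 0.05.
params = dict(n_ant=1, tau=1e-3, fs=1e6, noise_power=1.0)
eta_c = detection_threshold(0.9, snr=0.05, **params)
p_d = detection_probability(eta_c, snr=0.05, **params)
p_f = false_alarm_probability(eta_c, **params)
```

For a fixed target detection probability, a longer sensing time or a higher sampling frequency increases the product $τ f_{s}$ and lowers the false-alarm probability, at the cost of additional sensing latency.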

Mobile edge computing

First, the S-UAV detects the occupancy status of the subcarrier. If subcarrier c is detected as unoccupied, the SU can utilize this subcarrier for task offloading. Due to the presence of the IRS, the information transmission in this paper includes both direct links and reflective links. During task offloading, the signal received at the S-UAV from the SU k on subcarrier c, in the presence of the IRS, can be expressed as

\begin{matrix} y_{k, c} = ρ_{k, c} (h_{kS}^{q_{B}^{3}} + {(h_{IS}^{q_{B}^{3}})}^{H} ϕ h_{kI}) \sqrt{P_{k}} s_{k}^{c} + n_{s}, \end{matrix}

where $s_{k}^{c}$ represents the task offloading signal transmitted by the SU k on subcarrier c, and $P_{k}$ denotes the transmission power for computation offloading by the SU k. Additionally, $ρ_{k, c}$ indicates the allocation status of the subcarrier, where $ρ_{k, c} = 1$ signifies that subcarrier c is allocated to the SU k for computation offloading, and $ρ_{k, c} = 0$ otherwise.

However, potential inaccuracies in the sensing process may lead the secondary network to misjudge the occupancy status of subcarrier c. In such a scenario, the interference generated by the primary users on subcarrier c at the secondary UAV is expressed as

\begin{matrix} I_{k, c} = \sum_{l = 1}^{L} ρ_{l, c} (h_{lS}^{q_{B}^{3}} + {(h_{IS}^{q_{B}^{3}})}^{H} ϕ h_{lI}) \sqrt{P_{l}} x_{l}^{c}, \end{matrix}

where $P_{l}$ signifies the transmission power for computation offloading of the PU l, and $x_{l}^{c}$ denotes the task-bearing signal transmitted by the PU l on subcarrier c.

For the two scenarios above (the subcarrier is correctly detected as idle, or an occupied subcarrier is misjudged as idle), the signal-to-interference-plus-noise ratio (SINR) is written as

13a

\begin{matrix} \mathrm{SINR}_{k, c}^{0} = ρ_{k, c} \frac{{| h_{kS}^{q_{B}^{3}} + {(h_{IS}^{q_{B}^{3}})}^{H} ϕ h_{kI} |}^{2} P_{k}}{σ_{s}^{2}}, \end{matrix}

13b

\begin{matrix} \mathrm{SINR}_{k, c}^{1} = ρ_{k, c} \frac{{| h_{kS}^{q_{B}^{3}} + {(h_{IS}^{q_{B}^{3}})}^{H} ϕ h_{kI} |}^{2} P_{k}}{\sum_{l = 1}^{L} ρ_{l, c} {| h_{lS}^{q_{B}^{3}} + {(h_{IS}^{q_{B}^{3}})}^{H} ϕ h_{lI} |}^{2} P_{l} + σ_{s}^{2}} . \end{matrix}

The probabilities of the above two scenarios occurring can be expressed as

14a

\begin{matrix} ψ_{c}^{0} = Pr (H_{c}^{0}) (1 - P_{f, c}), \end{matrix}

14b

\begin{matrix} ψ_{c}^{1} = \Pr (H_{c}^{1}) (1 - P_{d, c}), \end{matrix}

where $\Pr (H_{c}^{0})$ denotes the probability that the subcarrier is idle, and $\Pr (H_{c}^{1})$ represents the probability that the subcarrier is in an active state. Then, the achievable rate of the SU k can be expressed as

\begin{matrix} R_{k} = \sum_{c = 1}^{C} ρ_{k, c} [ψ_{c}^{0} \log (1 + \mathrm{SINR}_{k, c}^{0}) + ψ_{c}^{1} \log (1 + \mathrm{SINR}_{k, c}^{1})] . \end{matrix}

Therefore, the total rate that can be achieved by the secondary users can be represented by

\begin{matrix} R_{s} = \sum_{k = 1}^{K} R_{k} . \end{matrix}
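The rate expressions weight each subcarrier's rate by the probability of the corresponding sensing outcome. A minimal sketch is given below; it assumes a base-2 logarithm (the text does not state the base) and uses hypothetical probabilities and SINR values.

```python
import math


def outcome_probabilities(pr_idle, p_false_alarm, p_detect):
    """psi^0 = Pr(H0)(1 - P_f) and psi^1 = Pr(H1)(1 - P_d) for one subcarrier."""
    psi_idle = pr_idle * (1.0 - p_false_alarm)
    psi_missed = (1.0 - pr_idle) * (1.0 - p_detect)
    return psi_idle, psi_missed


def achievable_rate(alloc, psi_idle, psi_missed, sinr_idle, sinr_missed):
    """Rate of one SU summed over its allocated subcarriers (log base 2 assumed)."""
    rate = 0.0
    for c, rho in enumerate(alloc):
        if rho == 0:
            continue  # subcarrier c is not allocated to this SU
        rate += (psi_idle[c] * math.log2(1.0 + sinr_idle[c])
                 + psi_missed[c] * math.log2(1.0 + sinr_missed[c]))
    return rate


# Hypothetical two-subcarrier example; only subcarrier 0 is allocated.
psi0, psi1 = outcome_probabilities(pr_idle=0.8, p_false_alarm=0.1, p_detect=0.9)
r_k = achievable_rate(
    alloc=[1, 0],
    psi_idle=[psi0, psi0],
    psi_missed=[psi1, psi1],
    sinr_idle=[3.0, 3.0],      # interference-free SINR
    sinr_missed=[1.0, 1.0],    # SINR under primary-user interference
)
```

Summing `r_k` over all SUs gives the total secondary rate $R_{s}$.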

Latency model

Firstly, the flight time of the primary UAV and the secondary UAV can be expressed as

17a

\begin{matrix} T_{P}^{fly} = \frac{\sum_{i = 1}^{3} \| q_{A}^{i + 1} - q_{A}^{i} \|}{v}, \end{matrix}

17b

\begin{matrix} T_{S}^{fly} = \frac{\sum_{i = 1}^{3} \| q_{B}^{i + 1} - q_{B}^{i} \|}{v}, \end{matrix}

where v represents the flight speed of the UAVs. After one UAV reaches a given SP, the current stage of work can continue only once the other UAV has reached its designated SP. Therefore, the waiting time of the UAV can be expressed as

\begin{matrix} T^{wait} = \frac{\sum_{i = 1}^{3} ( \| q_{B}^{i + 1} - q_{B}^{i} \| - \| q_{A}^{i + 1} - q_{A}^{i} \| )}{v} . \end{matrix}

In addition, assuming that the size of the task data processed by the k-th SU is $D_{k}$ , the total offloading delay of the SUs can be expressed as

\begin{matrix} T^{off} = \sum_{k = 1}^{K} \frac{δ_{k} D_{k}}{R_{k}}, \end{matrix}

where $δ_{k}$ is the fraction of data that is offloaded from SU k to the S-UAV.

The computation delay at the SU and the S-UAV is represented by

20a

\begin{matrix} T_{com}^{loc} = \sum_{k = 1}^{K} \frac{(1 - δ_{k}) D_{k} f^{user}}{F^{user}}, \end{matrix}

20b

\begin{matrix} T_{com}^{UAV} = \sum_{k = 1}^{K} δ_{k} D_{k} \frac{f^{UAV}}{F^{UAV}}, \end{matrix}

where $f^{user}$ and $f^{UAV}$ represent the number of CPU cycles required to process 1 bit of data at the users and the UAV, respectively, and $F^{user}$ and $F^{UAV}$ denote the computational capacities of the users and the UAV, respectively. Therefore, the total latency of the system proposed in this paper is

\begin{matrix} T^{total} = \max \{T_{P}^{fly}, T_{S}^{fly}\} + \max \{T_{com}^{loc}, T_{com}^{UAV}\} + T^{off} + τ . \end{matrix}
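The latency terms can be assembled step by step, as in the sketch below. It assumes each flight path is a list of four waypoints traversed in order at constant speed; all waypoints, data sizes, rates, and CPU parameters are hypothetical and were picked to give round numbers.

```python
import math


def flight_time(path, speed):
    """Total time to fly the consecutive legs of a waypoint list."""
    legs = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    return legs / speed


def offload_delay(ratios, sizes, rates):
    """Sum over SUs of offloaded bits divided by the achievable rate."""
    return sum(d * s / r for d, s, r in zip(ratios, sizes, rates))


def local_delay(ratios, sizes, cycles_per_bit, capacity):
    return sum((1.0 - d) * s * cycles_per_bit / capacity
               for d, s in zip(ratios, sizes))


def uav_delay(ratios, sizes, cycles_per_bit, capacity):
    return sum(d * s * cycles_per_bit / capacity for d, s in zip(ratios, sizes))


def total_latency(t_fly_p, t_fly_s, t_loc, t_uav, t_off, tau):
    """Flight and computation run in parallel pairs, so each pair contributes its max."""
    return max(t_fly_p, t_fly_s) + max(t_loc, t_uav) + t_off + tau


# Hypothetical paths (metres) and a speed of 100 m/s.
path_p = [(0, 0, 100), (300, 400, 100), (600, 800, 100), (0, 0, 100)]
path_s = [(0, 0, 100), (0, 750, 100), (0, 1250, 100), (0, 0, 100)]
t_p = flight_time(path_p, 100.0)   # 2000 m -> 20 s
t_s = flight_time(path_s, 100.0)   # 2500 m -> 25 s

ratios = [0.5, 0.25]               # delta_k
sizes = [8e6, 4e6]                 # D_k in bits
rates = [4e6, 2e6]                 # R_k in bit/s
t_off = offload_delay(ratios, sizes, rates)
t_loc = local_delay(ratios, sizes, cycles_per_bit=1000, capacity=1e9)
t_uav = uav_delay(ratios, sizes, cycles_per_bit=1000, capacity=5e9)
t_total = total_latency(t_p, t_s, t_loc, t_uav, t_off, tau=0.5)
```

In this example local computation dominates the computing stage, so raising the offloading ratios would shorten the total latency until the UAV-side delay or the offloading delay becomes the bottleneck.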

Energy consumption model

This paper concentrates on the energy consumption of UAVs, beginning with the energy usage associated with UAV flight, as expressed by

\begin{matrix} E^{fly} = P^{fly} (T_{P}^{fly} + T_{S}^{fly}), \end{matrix}

where $P^{fly}$ represents the power required during UAV flight.

Furthermore, when the UAV is engaged in tasks such as task offloading, task computation, and spectrum sensing, the UAV remains in a hovering state. Consequently, the hover energy consumption of the UAV can be expressed as

\begin{matrix} E^{hover} = P^{hover} (\max \{T_{com}^{loc}, T_{com}^{UAV}\} + T^{off} + T^{sense} + T^{wait}), \end{matrix}

where $P^{hover}$ represents the power required for the UAV during hovering. Hence, the overall energy consumption of the UAV can be formulated as

\begin{matrix} E^{total} = E^{fly} + E^{hover} . \end{matrix}

Accordingly, the energy efficiency of the system can be expressed as

\begin{matrix} Φ_{e} = \frac{R_{s}}{E^{total}} . \end{matrix}
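As a worked illustration of the energy model, the sketch below evaluates the flight energy, hover energy, and resulting energy efficiency. The power levels, durations, and sum rate are hypothetical values, not figures from the paper.

```python
def uav_energy(p_fly, p_hover, t_fly_p, t_fly_s,
               t_loc, t_uav, t_off, t_sense, t_wait):
    """Total UAV energy: flight energy plus hover energy."""
    e_fly = p_fly * (t_fly_p + t_fly_s)
    e_hover = p_hover * (max(t_loc, t_uav) + t_off + t_sense + t_wait)
    return e_fly + e_hover


def energy_efficiency(sum_rate, e_total):
    """Ratio of the total secondary rate R_s to the total UAV energy."""
    return sum_rate / e_total


# Hypothetical values: powers in watts, times in seconds, rate in bit/s.
e_total = uav_energy(p_fly=100.0, p_hover=80.0,
                     t_fly_p=20.0, t_fly_s=25.0,
                     t_loc=7.0, t_uav=1.0, t_off=1.5,
                     t_sense=0.5, t_wait=5.0)
phi_e = energy_efficiency(sum_rate=2.81e6, e_total=e_total)
```

Here the flight term (4500 J) outweighs the hover term (1120 J), which suggests why the flight paths are among the optimization variables.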

Semantic utility model

To enhance system intelligence and user satisfaction, this paper introduces a semantic utility model to evaluate how well task scheduling and resource allocation meet user semantic demands. Semantic utility quantifies the importance of tasks, their completion status, and their alignment with allocated resources, and can be defined as the weighted sum of the semantic contributions of multiple tasks, given by

\begin{matrix} Φ_{s} = \sum_{i = 1}^{N} β_{i} \cdot (P_{i} \cdot C_{i} \cdot R_{i}), \end{matrix}

where $N$ is the total number of tasks at the current time step; $β_{i}$ is the weight of task $i$ , reflecting its significance (e.g., urgency or user priority); $P_{i}$ is the priority of task $i$ , taking values in $[0, 1]$ ; $C_{i}$ is the completion rate of task $i$ , representing whether the task is completed on time, which can be calculated as the ratio of completed data to the total required data and takes values in $[0, 1]$ ; and $R_{i}$ is the contextual relevance of task $i$ , indicating the match between the task and the allocated resources or network state, taking values in $[0, 1]$ .
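The semantic utility is a weighted sum of per-task products, so a task contributes nothing if any of its three factors is zero. A minimal sketch follows, with tasks represented as hypothetical `(weight, priority, completion, relevance)` tuples.

```python
def semantic_utility(tasks):
    """Weighted sum of priority * completion rate * contextual relevance."""
    total = 0.0
    for weight, priority, completion, relevance in tasks:
        for value in (priority, completion, relevance):
            if not 0.0 <= value <= 1.0:
                raise ValueError("priority, completion and relevance must lie in [0, 1]")
        total += weight * priority * completion * relevance
    return total


# Hypothetical tasks: an urgent, half-finished task and a routine, finished one.
tasks = [
    (2.0, 1.0, 0.5, 0.5),   # contributes 2.0 * 1.0 * 0.5 * 0.5 = 0.5
    (1.0, 0.5, 1.0, 0.5),   # contributes 1.0 * 0.5 * 1.0 * 0.5 = 0.25
]
phi_s = semantic_utility(tasks)
```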

Problem formulation

In this paper, we propose the use of IRS/UAV-assisted MEC in wideband cognitive radio networks. Our goal is to maximize the overall system performance by considering both energy efficiency and semantic utility. To achieve this, we jointly optimize the flight paths of the P-UAV and the S-UAV, $t_{p}$ and $t_{s}$ , the subcarrier allocation $ρ$ , the reflection coefficients $ϕ$ , the task offloading ratios $δ$ , and the semantic utility factors, namely the task priorities $P_{i}$ and the contextual relevance $R_{i}$ . The problem is formulated as follows.

27a

\begin{matrix} P : & \max_{t_{p}, t_{s}, ρ, ϕ, δ, P_{i}, R_{i}} η Φ_{e} + (1 - η) Φ_{s} \end{matrix}

27b

\begin{matrix} \mathrm{s.t.} & C 1 : P_{f, c} \leq \tilde{P_{f}}, \end{matrix}

27c

\begin{matrix} C 2 : \sum_{c = 1}^{C} \sum_{k = 1}^{K} ρ_{k, c} ρ_{q, c} ψ_{c}^{1} P_{k} {| h_{kP}^{q_{A}^{3}} + {(h_{IP}^{q_{A}^{3}})}^{H} ϕ h_{kI} |}^{2} \leq P_{I}^{max}, \forall q \in Q, \end{matrix}

27d

\begin{matrix} C 3 : θ_{m} \in [0, 2 π), \forall m \in \mathcal{M}, \end{matrix}

27e

\begin{matrix} C 4 : δ \in [0, 1], \end{matrix}

27f

\begin{matrix} C 5 : \sum_{c = 1}^{C} ρ_{i, c} \leq 1, ρ_{i, c} \in \{0, 1\}, \forall i \in \{K, Q\}, \end{matrix}

27g

\begin{matrix} C 6 : q_{i}^{1} = q_{0}, q_{i}^{4} = q_{F}, q_{i} = \{q_{A}, q_{B}\}, \end{matrix}

27h

\begin{matrix} C 7 : P_{i} \in [0, 1], R_{i} \in [0, 1], \forall i \in N, \end{matrix}

27i

\begin{matrix} C 8 : L_{k} \leq L_{\min}, \forall k \in \mathcal{K}, \end{matrix}

where $η$ is a balancing parameter that adjusts the weight between energy efficiency and semantic utility. C1 limits the false-alarm probability to at most $\tilde{P_{f}}$ . C2 represents the maximum interference tolerated by the primary user for information transmission. C3 denotes the constraint on the IRS reflection phase shifts. C4 represents the range constraint on the task offloading ratio. C5 signifies the subcarrier allocation status. C6 indicates that the UAVs depart from a predefined starting point and finish at the predefined termination point. C7 constrains the semantic utility factors to their valid range. C8 bounds the loss function of each SU.
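As a rough illustration of the formulation, the sketch below evaluates the weighted objective and checks only the simple range and binary constraints (C3, C4, C5, and C7). Constraints C1, C2, C6, and C8 depend on the sensing, channel, trajectory, and loss models and are deliberately left out. All names and values are hypothetical.

```python
import math


def objective(eta, energy_eff, sem_utility):
    """Weighted objective eta * Phi_e + (1 - eta) * Phi_s."""
    return eta * energy_eff + (1.0 - eta) * sem_utility


def satisfies_box_constraints(phases, offload_ratios, alloc, priorities, relevances):
    """Check C3, C4, C5 and C7 only; the remaining constraints are not modeled here."""
    c3 = all(0.0 <= theta < 2.0 * math.pi for theta in phases)
    c4 = all(0.0 <= delta <= 1.0 for delta in offload_ratios)
    # Each row of alloc holds one user's binary subcarrier indicators.
    c5 = all(all(rho in (0, 1) for rho in row) and sum(row) <= 1 for row in alloc)
    c7 = all(0.0 <= v <= 1.0 for v in list(priorities) + list(relevances))
    return c3 and c4 and c5 and c7


value = objective(eta=0.5, energy_eff=2.0, sem_utility=0.75)
feasible = satisfies_box_constraints(
    phases=[0.0, math.pi / 2, math.pi],
    offload_ratios=[0.5, 0.25],
    alloc=[[1, 0, 0], [0, 0, 1]],
    priorities=[1.0, 0.5],
    relevances=[0.5, 0.5],
)
# A user holding two subcarriers violates C5.
infeasible = satisfies_box_constraints(
    phases=[0.0], offload_ratios=[0.5], alloc=[[1, 1, 0]],
    priorities=[1.0], relevances=[0.5],
)
```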

Resource optimization scheme

Due to the strong coupling of optimization variables in the proposed problem framework, it is inherently non-convex, making conventional solution methods challenging. Compared to traditional convex optimization techniques, deep reinforcement learning offers several advantages in solving non-convex problems. First, its high flexibility allows it to adapt to complex problem structures and dynamic environments. This adaptability is particularly beneficial for capturing the nonlinear and intricate characteristics of non-convex problems. Secondly, deep reinforcement learning possesses strong learning capabilities, continuously improving performance through experience. When tackling complex non-convex problems, it can uncover implicit patterns and approximate optimal solutions more effectively. Lastly, its neural network architecture is well-suited for handling high-dimensional and large-scale data, enabling a more comprehensive representation of non-convex problems. Given these advantages, a deep reinforcement learning algorithm is employed to address the proposed problem framework.

Problem transformation

First, we transform the proposed problem into a Markov Decision Process (MDP). We treat the network as the environment and the controller on the UAV as the agent. The MDP requires a clear definition of the state space, action space, transition probabilities, and reward function.

State space: The state space is the collection of all states of the system. In this study, the state at time slot t includes the following elements: the channel state information of all channels in time slot t, the energy efficiency in time slot $t - 1$ , the semantic utility in time slot $t - 1$ , and the action taken in time slot $t - 1$ , which can be represented as $s^{t} = \{H, Φ_{e}^{t - 1}, Φ_{s}^{t - 1}, a^{t - 1}\}$ .

Action space: The set of all actions that the agent can take constitutes the action space. In this study, the action at time slot t includes both discrete and continuous components, making the action space hybrid. Discrete actions cover the flight paths of the P-UAV and the S-UAV and the subcarrier allocation, denoted as $a_{d}^{t} = \{q_{A}, q_{B}, ρ\}$ . Continuous actions encompass the IRS reflection coefficients, the task offloading ratios, the task priorities, and the contextual relevance, denoted as $a_{c}^{t} = \{ϕ, δ, P_{i}, R_{i}\}$ . Therefore, the action at time slot t can be represented as $a^{t} = \{a_{d}^{t}, a_{c}^{t}\}$ .

Transition probability: The transition probability, represented as $\Pr (s^{t + 1} | s^{t}, a^{t})$ , denotes the likelihood of the environment transitioning to the subsequent state $s^{t + 1}$ given the current state $s^{t}$ and action $a^{t}$ .

Reward function: In this study, the reward function should include metrics of system performance to encourage the agent to take actions that benefit the whole system. Since this paper aims to maximize the overall system performance, the reward function can be expressed as

\begin{matrix} r = λ_{1} Φ_{e} + λ_{2} SU, \end{matrix}

where $λ_{1}$ and $λ_{2}$ are adjustable parameters that control the trade-off between energy efficiency and semantic utility.
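As a minimal sketch of how the hybrid action and the weighted reward described above could be represented, the Python below groups the discrete and continuous components into one record and combines an energy-efficiency value with a semantic-utility value. The class name, field names, default weights, and numeric values are illustrative assumptions and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class HybridAction:
    """Hybrid action a^t = {a_d^t, a_c^t}; all field names are illustrative."""
    # Discrete part a_d^t: UAV waypoints and subcarrier allocation.
    p_uav_waypoint: Tuple[int, int]
    s_uav_waypoint: Tuple[int, int]
    subcarrier_allocation: List[int]
    # Continuous part a_c^t: IRS phases, offloading ratio, priorities, relevance.
    irs_phases: List[float]
    offloading_ratio: float
    task_priorities: List[float]
    contextual_relevance: List[float]


def reward(energy_efficiency: float, semantic_utility: float,
           lam1: float = 0.5, lam2: float = 0.5) -> float:
    """Weighted reward r = lam1 * Phi_e + lam2 * SU (the weights are assumed values)."""
    return lam1 * energy_efficiency + lam2 * semantic_utility
```

Raising `lam1` relative to `lam2` would push the agent toward energy efficiency at the expense of semantic utility, and vice versa.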

Problem optimization based on DDQN-TD3 algorithm

Dueling DQN (DDQN) is an algorithm for estimating value functions. The algorithm exhibits several advantages when dealing with discrete action spaces. Firstly, by decoupling the value of a state from the advantage of each action, DDQN gains a better understanding of how each action contributes to the state value. This allows the model to more effectively learn the relative importance of different actions. Secondly, by independently estimating state values and action advantages, DDQN can more efficiently learn various aspects of a task, contributing to enhanced learning efficiency, especially in environments with a large number of actions. Finally, DDQN offers more stable training by mitigating instability during the training process. Through the separation of value and advantage, it adeptly handles variations in the action value function, resulting in smoother and more reliable training.

The structure of DDQN is a specialized deep neural network architecture that is primarily composed of two components: the Value Network and the Advantage Network. These two networks collaborate to estimate the Q-values for each action. Through the combination of these networks, the Q-values of DDQN can be expressed as

\begin{matrix} Q (s^{t}, a^{t} ; θ_{Q}) = V (s^{t} ; θ_{V}) + \left( A (s^{t}, a^{t} ; θ_{A}) - \frac{1}{| A |} \sum_{a^{'}} A (s^{t}, a^{'} ; θ_{A}) \right), \end{matrix}

where $θ_{Q}$, $θ_{V}$, and $θ_{A}$ represent the neural network parameters of the Q-value network, the state value network, and the advantage network, respectively. Meanwhile, $| A |$ denotes the size of the action space, and $\frac{1}{| A |} \sum_{a^{'}} A (s^{t}, a^{'} ; θ_{A})$ represents the average value of all action advantages.
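The aggregation of the state value and the mean-centred advantages can be illustrated with a few lines of Python. This is only a sketch for a single state with plain floats standing in for the network outputs; the function name and the example numbers are assumptions.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) by subtracting the mean advantage."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + (adv - mean_advantage) for adv in advantages]


# Example: V(s) = 2.0 and three discrete actions.
q_values = dueling_q_values(2.0, [1.0, 0.0, -1.0])
print(q_values)  # [3.0, 2.0, 1.0]
```

Because the mean advantage is subtracted, adding a constant to every advantage leaves the Q-values unchanged, so the average of the Q-values over actions equals the state value.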

The update of Q-network parameters in the DDQN algorithm involves the use of a loss function. Specifically, the loss function is computed as the sum of two components: the mean squared error between the predicted Q-values and the target Q-values, and the mean squared error between the predicted advantage values and the calculated advantage targets. Assuming P tuples are sampled from the experience replay buffer, the loss function can be written as

\begin{matrix} L (θ_{Q}) = \frac{1}{P} \sum_{n = 1}^{P} {\left( Q (s^{n}, a^{n} ; θ_{Q}) - \left( r + γ Q (s^{n + 1}, a^{n + 1} ; θ_{Q}^{'}) \right) \right)}^{2}, \end{matrix}

where $γ$ represents the discount factor, and $θ_{Q}^{'}$ denotes the neural network parameters of the target Q-value network.
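To make the loss expression concrete, the following sketch evaluates it for a small batch, with lists of floats standing in for the outputs of the online and target networks. The function name, argument names, and the default discount factor are assumptions made for illustration.

```python
from typing import Sequence


def td_loss(q_pred: Sequence[float], rewards: Sequence[float],
            q_next_target: Sequence[float], gamma: float = 0.99) -> float:
    """Mean squared error between Q(s^n, a^n) and r + gamma * Q'(s^{n+1}, a^{n+1})."""
    batch_size = len(q_pred)
    squared_errors = [
        (q - (r + gamma * q_next)) ** 2
        for q, r, q_next in zip(q_pred, rewards, q_next_target)
    ]
    return sum(squared_errors) / batch_size


# Two sampled tuples: the first prediction matches its target, the second is off by 1.
loss = td_loss([1.0, 2.0], [0.5, 1.0], [1.0, 0.0], gamma=0.5)
print(loss)  # 0.5
```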

The TD3 (Twin Delayed DDPG) algorithm offers several advantages for handling continuous action spaces. Firstly, TD3 maintains two Q-value networks and uses the smaller of their estimates, which helps mitigate the overestimation problem. This architecture enhances the algorithm’s robustness and reliability, contributing to improved learning efficiency. Secondly, TD3 adds clipped random noise to the actions of the target policy network, a technique known as target policy smoothing. This smooths the value estimate over nearby actions, further enhancing the algorithm’s robustness.

The structure of the TD3 algorithm comprises two main components: the Actor network and the Critic network. Within the Actor-Critic framework, TD3 employs a twin Q-network structure, consisting of two independent Critic networks that each estimate a Q-value, thereby mitigating the impact of overestimation. An Actor network learns the policy, and the Actor and both Critics have target counterparts that are refreshed through soft updates to enhance learning stability. The target Q-value in time slot t for the TD3 algorithm can be expressed as

\begin{matrix} y^{t} = r^{t} + γ \min_{i = 1, 2} Q^{'} (s^{t + 1}, a^{t + 1} ; θ_{ci}^{'}), \end{matrix}

where $θ_{ci}^{'}$ represents the neural network parameters of the target evaluation network. In addition, the $\min_{i = 1, 2} Q^{'}$ operation in the above expression selects the smaller of the two target Q-values, mitigating the issue of overestimation.
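A small Python sketch of this target computation is given below for a scalar action. It combines the minimum over the two target critics with the random noise added to the target policy's action. The callables, the noise scale, the clipping range, and the action bounds are all illustrative assumptions; the paper does not state these values.

```python
import random
from typing import Callable, Optional


def td3_target(reward: float, next_state: float,
               target_actor: Callable[[float], float],
               target_critic_1: Callable[[float, float], float],
               target_critic_2: Callable[[float, float], float],
               gamma: float = 0.99, noise_std: float = 0.2, noise_clip: float = 0.5,
               action_low: float = -1.0, action_high: float = 1.0,
               rng: Optional[random.Random] = None) -> float:
    """Return y = r + gamma * min_i Q'_i(s', a') with a smoothed target action a'."""
    rng = rng or random.Random()
    # Clipped Gaussian noise on the target action (assumed scale and clip range).
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    next_action = max(action_low, min(action_high, target_actor(next_state) + noise))
    # Take the smaller of the two target critic estimates to curb overestimation.
    q_min = min(target_critic_1(next_state, next_action),
                target_critic_2(next_state, next_action))
    return reward + gamma * q_min
```

With constant critics returning 2.0 and 1.0, a reward of 1.0, and `gamma=0.9`, the target is 1.0 + 0.9 × 1.0 = 1.9 regardless of the noise, since the smaller estimate is always selected.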

The update of the critic network parameters involves the minimization of a loss function, which can be expressed as

\begin{matrix} L (θ_{ci}) = \frac{1}{P} \sum_{n = 1}^{P} {\left( y^{n} - Q (s^{n}, a^{n} ; θ_{ci}) \right)}^{2}, \end{matrix}

where $θ_{ci}$ represents the neural network parameters of the critic network.

In the TD3 algorithm, the parameters of the Actor network are updated by maximizing the Q-value. Specifically, the goal of the Actor network is to learn a policy that maximizes the estimated Q-value for the chosen actions. To achieve this objective, the update loss function for the Actor network can be expressed as

\begin{matrix} J (θ_{a}) = - \frac{1}{P} \sum_{n = 1}^{P} \min_{i = 1, 2} Q (s^{n}, π (s^{n} | θ_{a}) | θ_{ci}), \end{matrix}

where $θ_{a}$ denotes the actor network parameters.
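The actor objective can be evaluated numerically as in the sketch below, which follows the expression as written and takes the minimum over the two critics for each sampled state. Plain Python callables stand in for the actor and critic networks, and the function and argument names are assumptions. In practice a gradient step on the actor parameters would be taken with an automatic-differentiation library, which is omitted here.

```python
from typing import Callable, Sequence


def actor_loss(states: Sequence[float],
               actor: Callable[[float], float],
               critic_1: Callable[[float, float], float],
               critic_2: Callable[[float, float], float]) -> float:
    """Negative mean of min_i Q_i(s^n, pi(s^n)) over the sampled batch."""
    total = 0.0
    for state in states:
        action = actor(state)
        total += min(critic_1(state, action), critic_2(state, action))
    return -total / len(states)


# Toy stand-ins: pi(s) = 2s, Q_1(s, a) = s + a, Q_2(s, a) = a.
loss = actor_loss([1.0, 2.0], lambda s: 2.0 * s, lambda s, a: s + a, lambda s, a: a)
print(loss)  # -3.0
```

Minimizing this quantity is equivalent to maximizing the estimated Q-value of the actions chosen by the policy.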

In summary, the algorithm starts with an initialization phase. Subsequently, the agent takes the current network state as input, and the actions it produces are applied to the environment to obtain a reward and move the environment to a new state. The tuples generated in this process are stored in the experience replay buffer. Once a sufficient number of tuples have accumulated in the buffer, training begins. In each training iteration, a random batch of P tuples is sampled from the experience replay buffer to compute the loss functions, and the parameters of the neural networks are updated accordingly. The parameters of the target networks are updated using a soft update strategy. Repeating this process allows the agent to progressively improve its decisions in the environment.
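Two building blocks of this training loop, the experience replay buffer and the soft target update, can be sketched in Python as follows. The class and function names, the buffer capacity, and the soft-update rate `tau` are assumptions for illustration; network parameters are represented as plain lists of floats.

```python
import random
from collections import deque
from typing import List, Optional, Tuple

Transition = Tuple[object, object, float, object]  # (state, action, reward, next_state)


class ReplayBuffer:
    """Fixed-capacity buffer that discards the oldest tuples when full."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        self._storage: deque = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward: float, next_state) -> None:
        self._storage.append((state, action, reward, next_state))

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement of P = batch_size tuples.
        return self._rng.sample(list(self._storage), batch_size)

    def __len__(self) -> int:
        return len(self._storage)


def soft_update(target_params: List[float], online_params: List[float],
                tau: float = 0.005) -> List[float]:
    """Return tau * online + (1 - tau) * target for each parameter (tau is assumed)."""
    return [tau * w + (1.0 - tau) * w_t for w_t, w in zip(target_params, online_params)]
```

A small `tau` makes the target networks track the online networks slowly, which is what gives the soft update its stabilizing effect.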

Results and discussion

To illustrate the superior performance of our proposed network approach, we present simulation results in this section. Our simulation parameter settings were referenced from the literature [22, 27] to ensure the rationality and reliability of our parameter choices.

[See PDF for image]

Algorithm 1

DDQN-TD3 based intelligent optimization scheme for the proposed IRS/UAV-assisted MEC in wideband cognitive radio system

Our simulation parameters are as follows. The numbers of PUs and SUs are $L = K = 4$. The sampling frequency is $f_{s} = 6$ MHz, the probability that a subcarrier is idle is $Pr (H_{c}^{0}) = 0.8$, and the probability that it is occupied is $Pr (H_{c}^{1}) = 0.2$. The system tolerates a maximum false alarm probability of $\bar{P_{f}} = 0.1$. The IRS has $M = 30$ reflective elements. The UAV’s flight speed is $v = 20$ m/s, and the power for UAV hovering and flying is $P^{hover} = P^{fly} = 1000$ W. The computational capabilities of the secondary users and the UAV are $F^{user} = 1000$ cycles/bit and $F^{UAV} = 2000$ cycles/bit, respectively.

In the simulation plots, the term “Proposed method” represents the approach utilized in this paper. “Random phase” denotes the scenario where IRS reflection coefficients are not optimized, and they are randomly generated. “Without IRS” signifies a condition where IRS is not employed in the optimization process, with other conditions remaining unchanged.

[See PDF for image]

Fig. 2

The convergence performance of the proposed intelligent methodology

Figure 2 shows the convergence performance of the proposed DDQN-TD3-based optimization algorithm by plotting the smoothed cumulative rewards over 15,000 training episodes. The x-axis denotes the episode number, and the y-axis indicates the smoothed reward, which serves as an indicator of the agent’s policy quality during training. In the early training stage (approximately before episode 1000), the smoothed reward increases rapidly from about 0.18 to 0.35, demonstrating that the agent is quickly learning effective policies to improve system performance. After approximately 1500 episodes, the reward curve enters a relatively stable phase, where the values fluctuate within the range of 0.35 to 0.48. This fluctuation is expected in reinforcement learning due to the trade-off between exploration and exploitation. The overall reward trend remains stable, indicating that the proposed algorithm achieves effective convergence. The results also validate the algorithm’s ability to handle complex hybrid action spaces (including both discrete and continuous variables), and adapt to the high-dimensional state-action dynamics in the semantic-aware IRS/UAV-assisted MEC system. This convergence behavior confirms the robustness and learning capability of the DDQN-TD3 framework in optimizing semantic utility and energy efficiency under dynamic network conditions.

[See PDF for image]

Fig. 3

The flight path of both the P-UAV and S-UAV

Figure 3 illustrates the optimized flight trajectories of both the P-UAV and S-UAV in the proposed IRS/UAV-assisted MEC system. The solid red line represents the P-UAV’s trajectory, while the dashed red line represents that of the S-UAV. The positions of PUs, SUs, and the IRS are also marked. The UAVs begin their missions from designated starting points and proceed through three operational phases: spectrum sensing, task offloading, and return. During the spectrum sensing phase, the P-UAV actively adjusts its position in coordination with the S-UAV’s movement to ensure optimal LoS sensing. This coordination enhances the sensing accuracy and channel estimation quality. The S-UAV, in turn, aligns its path to stay within effective sensing range of both the IRS and user terminals. It is noteworthy that despite the presence of the IRS (green triangle), the UAVs do not significantly deviate toward it during the sensing phase. This indicates that the direct channel between UAVs is sufficiently strong, and additional IRS reflection is unnecessary at this stage. In the task offloading phase, the S-UAV strategically relocates to a position that is closer to both the SUs and the IRS. This placement facilitates improved communication links via both direct and reflected paths, maximizing signal quality and offloading efficiency. The final phase concludes with both UAVs returning to their predefined endpoints. Overall, the figure highlights the dynamic adaptability of the UAVs’ flight paths, jointly optimized to balance sensing accuracy, communication efficiency, and energy consumption.

[See PDF for image]

Fig. 4

The system energy efficiency exhibits variations with changes in the transmit power of the SU

Figure 4 depicts the variation in system energy efficiency (in bits/Joule/Hz) as a function of the SU transmit power, ranging from 0 dBm to 30 dBm. The results compare three scenarios: the proposed IRS-assisted method with optimized reflection coefficients, a baseline with random IRS phase shifts, and a system without IRS deployment. As the SU transmit power increases, the energy efficiency improves across all schemes. This is expected since higher transmit power leads to improved signal quality and thus higher achievable data rates, which contributes to better energy utilization. The proposed method consistently outperforms both baseline schemes. At 30 dBm, it achieves an energy efficiency of approximately 0.66 bits/Joule/Hz, which is over 50% higher than the “Without IRS” case and more than double that of the “Random phase” scenario. Interestingly, the system with random IRS phases performs worse than the system without IRS, particularly at higher power levels. This indicates that improperly configured IRS elements can introduce additional interference or misalignment, thereby reducing the overall system efficiency. It highlights the importance of precise IRS phase optimization in realizing its full potential. Overall, Figure 4 validates the effectiveness of the proposed intelligent optimization framework that can effectively leverage IRS to enhance energy efficiency, and also emphasizes the critical role of phase control in IRS-based systems.
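As a quick arithmetic check of the figures quoted above, the Python below converts the endpoints of the transmit-power axis from dBm to watts and derives the upper bounds on the two baseline values that the stated ratios imply. These bounds follow only from the sentences above and are not values read from the figure; the function and variable names are illustrative.

```python
def dbm_to_watts(power_dbm: float) -> float:
    """Convert a power level in dBm to watts: P[W] = 10^(P[dBm] / 10) / 1000."""
    return 10 ** (power_dbm / 10.0) / 1000.0


ee_proposed = 0.66  # bits/Joule/Hz at 30 dBm, as quoted in the text

# "Over 50% higher than Without IRS" implies Without IRS < 0.66 / 1.5.
without_irs_upper_bound = ee_proposed / 1.5
# "More than double Random phase" implies Random phase < 0.66 / 2.
random_phase_upper_bound = ee_proposed / 2.0

print(dbm_to_watts(0.0), dbm_to_watts(30.0))
print(round(without_irs_upper_bound, 2), round(random_phase_upper_bound, 2))
```

The swept range of 0 dBm to 30 dBm therefore corresponds to 1 mW to 1 W, and the two statements are mutually consistent with the random-phase case lying below the case without IRS.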

[See PDF for image]

Fig. 5

The system energy efficiency exhibits variations with changes in the number of reflective elements for IRS

Figure 5 illustrates the relationship between system energy efficiency and the number of IRS reflecting elements. From the figure, it is evident that our proposed scheme achieves higher energy efficiency than the baseline schemes. Additionally, as the number of reflecting elements increases, our scheme demonstrates further improvement in energy efficiency. This can be attributed to the increased number of reflecting elements, which enables the IRS to perform more precise beamforming and steer the electromagnetic wave propagation more effectively. Moreover, the proposed scheme carefully adjusts the phase of the reflecting elements, so that signals are directed more efficiently toward the target area. This shows that our proposed scheme enhances signal strength in the desired region and improves overall system performance. In the case of random phase shifts, by contrast, increasing the number of IRS reflecting elements has little effect on performance, and the overall system performance is slightly worse than in the scenario without IRS, further highlighting the potential negative effects of random phase shifts on the system.
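The contrast between optimized and random phase shifts can be illustrated with a toy model that is much simpler than the system studied here. The sketch assumes every reflected path has unit amplitude and a uniformly random phase, ignores the direct link, noise, and energy consumption, and compares the received reflected power when the IRS phases cancel the channel phases with the power obtained from random IRS phases. All names and the trial count are assumptions.

```python
import cmath
import math
import random
from typing import List, Tuple


def reflected_power(channel_phases: List[float], irs_phases: List[float]) -> float:
    """Power of the sum of unit-amplitude paths with phase (channel + IRS shift)."""
    total = sum(cmath.exp(1j * (c + p)) for c, p in zip(channel_phases, irs_phases))
    return abs(total) ** 2


def compare(num_elements: int, trials: int = 2000, seed: int = 0) -> Tuple[float, float]:
    """Average reflected power with aligned phases versus random phases."""
    rng = random.Random(seed)
    aligned_sum, random_sum = 0.0, 0.0
    for _ in range(trials):
        channel = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_elements)]
        # Aligned: each IRS phase cancels the corresponding channel phase.
        aligned_sum += reflected_power(channel, [-c for c in channel])
        # Random: IRS phases drawn independently of the channel.
        random_irs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_elements)]
        random_sum += reflected_power(channel, random_irs)
    return aligned_sum / trials, random_sum / trials


aligned, unaligned = compare(num_elements=30)
```

In this toy model the aligned power grows with the square of the number of elements (900 for 30 elements), while the random-phase power grows only linearly on average (about 30), which is consistent with the observation that adding elements helps little unless the phases are optimized.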

Conclusion

In this paper, we propose the semantic-aware optimization of IRS/UAV-assisted MEC in wideband cognitive radio networks, aiming to enhance overall system performance by jointly considering energy efficiency and semantic utility. We achieve this by collaboratively optimizing the flight paths of both the P-UAV and S-UAV, subcarrier allocation, IRS reflection coefficients, task offloading ratios, and semantic utility factors such as task priorities and contextual relevance. Our proposed framework addresses these optimization objectives comprehensively. The simulation results validate the effectiveness of our approach, demonstrating significant improvements in both energy efficiency and semantic utility, with the adopted optimization algorithm efficiently handling the complex mixed action spaces.

Acknowledgements

The authors extend their appreciation to Henan Province for funding this work through the Higher Education Young Backbone Teacher Training Program of Henan Province, “Study for Airborne Gravimetry Solution Technologies based on Adaptive Kalman Filtering” (Grant NO. 2024GGJS157) and the Henan Province Science and Technology Key Project, “Study for RIS technologies in 6G Wireless Communications.”

Author contributions

Wei Zheng carried out the system modeling study, the optimization algorithms, and the algorithm implementation, and participated in drafting the manuscript. Pengshan Ren carried out the survey work, formula derivation, and writing. Qing Li participated in the optimization model design and the study of optimization algorithms. All authors read and approved the final manuscript.

Funding

This research was supported by the Higher Education Young Backbone Teacher Training Program of Henan Province, “Study for Airborne Gravimetry Solution Technologies based on Adaptive Kalman Filtering” (Grant NO. 2024GGJS157) and the Henan Province Science and Technology Key Project, “Study for RIS technologies in 6G Wireless Communications” (Grant NO. 252102210237).

Data availability

Data sharing is not applicable to this study.

Declarations

Conflict of interest

The authors declare no conflicts of interest related to this research.

Abbreviations

IoT: Internet of things

UAV: Unmanned aerial vehicle

IRS: Intelligent reflecting surface

MEC: Mobile edge computing

DDQN-TD3: Double deep Q-network and twin delayed deep deterministic policy gradient

CR: Cognitive radio

P-UAV: Primary unmanned aerial vehicle

S-UAV: Secondary unmanned aerial vehicle

PU: Primary user

SU: Secondary user

ECS: Edge computing server

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Xie, H; Qin, Z. A lite distributed semantic communication system for Internet of Things. IEEE J. Select. Areas Commun.; 2020; 39, 1 pp. 142-153. [DOI: https://dx.doi.org/10.1109/JSAC.2020.3036968]

2. Du, H; Wang, J; Niyato, D et al. Rethinking wireless communication security in semantic Internet of Things. IEEE Wirel. Commun.; 2023; 30, 3 pp. 36-43. [DOI: https://dx.doi.org/10.1109/MWC.011.2200547]

3. Z. Kaleem, F. A. Orakzai, W. Ishaq, et al. Emerging trends in UAVs: From placement, semantic communications to generative AI for mission-critical networks. IEEE Transactions on Consumer Electronics, 2024

4. A. Sharma, and C. Diwakar, Future aspects on MEC (Mobile Edge Computing): Offloading Mechanism, in: 2021 6th International Conference on Signal Processing, Computing and Control, 2021

5. J. Liu, Future smart mobile edge computing technology in mobile communication networks, in: IEEE Conference on Telecommunications, Optics and Computer Science, 2021

6. Huo, Y; Liu, Q; Gao, Q; Wu, Y; Jing, T. Joint task offloading and resource allocation for secure OFDMA-based mobile edge computing systems. Phys. Commun.; 2024; 153, 103342.

7. Wang, L; Yang, F; Chen, Y; Lai, S; Wu, W. Intelligent resource allocation for transmission security on IRS-assisted spectrum sharing systems with OFDM. Phys. Commun.; 2023; 58, 102013. [DOI: https://dx.doi.org/10.1016/j.phycom.2023.102013]

8. Wang, L; Wu, W; Zhou, F; Wu, Q; Dobre, OA; Quek, TQS. Hybrid hierarchical DRL enabled resource allocation for secure transmission in multi-IRS-assisted sensing-enhanced spectrum sharing networks. IEEE Trans. Wirel. Commun.; 2024; 23, 6 pp. 6330-6346. [DOI: https://dx.doi.org/10.1109/TWC.2023.3330999]

9. K. Liu, F. Lin, Y. Zhao and J. Zhang, Deep Reinforcement Learning Optimization Algorithm Designed for IRS-Assisted Edge Computing, in: IEEE 6th International Conference on Electronic Information and Communication Technology, 2023

10. B. Wang, R. Liu, Y. Li, C. Ding, J. Wang and H. Zhang, Joint optimization of transmission and computing resource in IRS-assisted mobile edge computing system, In: IEEE Wireless Communications and Networking Conference, 2022

11. Bai, T; Pan, C; Ren, H; Deng, Y; Elkashlan, M; Nallanathan, A. Resource allocation for intelligent reflecting surface aided wireless powered mobile edge computing in OFDM systems. IEEE Trans. Wirel. Commun.; 2021; 20, 8 pp. 5389-5407. [DOI: https://dx.doi.org/10.1109/TWC.2021.3067709]

12. M. Wu, W. Chen, K. Li, and L. Qian, Secure computation offloading for IRS-assisted mobile edge computing networks, in: IEEE/CIC International Conference on Communications in China, 2023

13. Z. Wu, H. Zhang, X. Liu, L. Li, and H. Li, IRS Empowered MEC system with computation offloading, reflecting design and beamforming optimization, IEEE Transactions on Communications, 2024, to be published

14. Zhao, S; Liu, Y; Gong, S; Gu, B; Fan, R; Lyn, B. Computation offloading and beamforming optimization for energy minimization in wireless-powered IRS-assisted MEC. IEEE Internet of Things J.; 2023; 10, 22 pp. 19466-19478. [DOI: https://dx.doi.org/10.1109/JIOT.2023.3265011]

15. P. Chen, B. Lyu, S. Gong, H. Guo, J. Jiang, and Z. Yang, Computational rate maximization for IRS-assisted full-duplex wireless-powered MEC systems, IEEE Transactions on Vehicular Technology, 2023, to be published

16. F. Wang, and X. Zhang, IRS/UAV-Based edge-computing and traffic offloading over 6G THz mobile wireless networks, In: IEEE International Conference on Communications, 2023

17. Shnaiwer, YN; Kaneko, M. Minimizing IoT energy consumption by IRS-aided UAV mobile edge computing. IEEE Netw. Lett.; 2023; 5, 1 pp. 16-20. [DOI: https://dx.doi.org/10.1109/LNET.2022.3222452]

18. Zhang, Y; Li, J; Mu, G; Chen, X. Deep reinforcement learning enabled UAV-IRS-assisted secure mobile edge computing network. Phys. Commun.; 2023; 61, 102173. [DOI: https://dx.doi.org/10.1016/j.phycom.2023.102173]

19. Qin, X; Song, Z; Hou, T; Yu, W; Wang, J; Sun, X. Joint optimization of resource allocation, phase shift, and UAV trajectory for energy-efficient RIS-assisted UAV-enabled MEC systems. IEEE Trans. Green Commun. Netw.; 2023; 7, 4 pp. 1778-1792. [DOI: https://dx.doi.org/10.1109/TGCN.2023.3287604]

20. Jiang, F; Peng, Y; Wang, K; Dong, L; Yang, K. MARS: a DRL-based multi-task resource scheduling framework for UAV with IRS-assisted mobile edge computing system. IEEE Trans. Cloud Comput.; 2023; 11, 4 pp. 3700-3712. [DOI: https://dx.doi.org/10.1109/TCC.2023.3307582]

21. Y. Li, F. Zhou, L. Yuan, et al. Cognitive semantic communication: a new communication paradigm for 6G, IEEE Communications Magazine, 2025

22. Asim, M; Elaffendi, M; Abd El-Latif, AA. Multi-IRS and multi-UAV-assisted MEC system for 5G/6G networks: efficient joint trajectory optimization and passive beamforming framework. IEEE Trans. Intell. Transp. Syst.; 2023; 24, 4 pp. 4553-4564. [DOI: https://dx.doi.org/10.1109/TITS.2022.3178896]

23. Guan, X; Wu, Q; Zhang, R. Joint power control and passive beamforming in IRS-assisted spectrum sharing. IEEE Commun. Lett.; 2020; 24, 7 pp. 1553-1557. [DOI: https://dx.doi.org/10.1109/LCOMM.2020.2979709]

24. Altrad, O; Muhaidat, S; Al-Dweik, A; Shami, A; Yoo, PD. Opportunistic spectrum access in cognitive radio networks under imperfect spectrum sensing. IEEE Trans. Vehic. Technol.; 2014; 63, 2 pp. 920-925. [DOI: https://dx.doi.org/10.1109/TVT.2013.2281334]

25. Wei, H; Lang, J. Dynamic resource allocation in IRS-assisted UAV wideband cognitive radio networks: A DDQN-TD3 approach. Phys. Commun.; 2024; 63, 102284. [DOI: https://dx.doi.org/10.1016/j.phycom.2024.102284]

26. Z. Wang, W. Wu, F. Zhou, B. Wang and Q. Wu, Secure resource allocation for IRS-assisted CRNs under opportunistic spectrum access, In: 2023 International Conference on Ubiquitous Communication, 2023

27. Y. Wu, F. Zhou, Q. Wu, Y. Huang, and R. Q. Hu, Resource allocation for IRS-assisted sensing-enhanced wideband CR networks, In: Proc. IEEE International Conference on Communications, 2021

28. Ding, G; Liu, S; Yuan, J; Yu, G. Joint URLLC traffic scheduling and resource allocation for semantic communication systems. IEEE Trans. Wirel. Commun.; 2024; 23, 7 pp. 7278-7290. [DOI: https://dx.doi.org/10.1109/TWC.2023.3339239]

29. Liu, C; Guo, C; Yang, Y et al. OFDM-based digital semantic communication with importance awareness. IEEE Trans. Commun.; 2024; 72, 10 pp. 6301-6315. [DOI: https://dx.doi.org/10.1109/TCOMM.2024.3397862]

30. L. Wang, W. Wu, F. Zhou, Z. Qin, Q. Wu, IRS-enhanced secure semantic communication networks: Cross-layer and context-awared resource allocation, IEEE Transactions on Wireless Communications, to be published

31. Xie, H; Qin, Z; Li, GY; Juang, B-H. Deep learning enabled semantic communication systems. IEEE Trans. Signal Process.; 2021; 69, pp. 2663-2675. [DOI: https://dx.doi.org/10.1109/TSP.2021.3071210]

32. L. Wang, W. Wu, F. Tian, H. Hu, Intelligent resource allocation for UAV-enabled spectrum sharing semantic communication networks, in: 2023 IEEE 23rd International Conference on Communication Technology (ICCT), 2023, pp. 1359-1363

33. Weng, Z; Qin, Z. Semantic communication systems for speech transmission. IEEE J. Select. Areas Commun.; 2021; 39, 8 pp. 2434-2444. [DOI: https://dx.doi.org/10.1109/JSAC.2021.3087240]

34. Farshbafan, MK; Saad, W; Debbah, M. Curriculum learning for goal-oriented semantic communications with a common language. IEEE Trans. Commun.; 2023; 71, 3 pp. 1430-1446. [DOI: https://dx.doi.org/10.1109/TCOMM.2023.3236671]

35. L. Wang, W. Wu, F. Zhou, F. Tian, Q. Wu, W. Saad, A unified hierarchical semantic knowledge base for multi-task semantic communication, in: IEEE International Conference on Communications(ICC), 2024, pp. 2937-2943

36. H. Tong, H. Li, H. Du, Z. Yang, C. Yin, D. Niyato, Multimodal semantic communication for generative audio-driven video conferencing, IEEE Wirel. Commun. Lett.

37. Du, H; Wang, J; Niyato, D; Kang, J; Xiong, Z; Zhang, J; Shen, X. Semantic communications for wireless sensing: Ris-aided encoding and self-supervised decoding. IEEE J. Select. Areas Commun.; 2023; 41, 8 pp. 2547-2562. [DOI: https://dx.doi.org/10.1109/JSAC.2023.3288231]

38. S. E. Trevlakis, N. Pappas, A.-A. A. Boulogeorgos, Towards natively intelligent semantic communications and networking, IEEE Open Journal of the Communications Society (2024)

39. L. Wang, W. Wu, F. Zhou, Intelligent resource allocation for IRS-assisted sensing-enhanced secure communication CRNs, International Conference on Ubiquitous Communication (Ucom), 2023, pp. 344-349

© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

The efficient integration of communication and computation in the internet of things (IoT) presents new opportunities for enhancing system performance but still faces challenges such as interference management, resource allocation and task scheduling. To address these issues, this paper proposes a semantic-aware intelligent optimization framework that combines unmanned aerial vehicles (UAVs) and intelligent reflecting surface (IRS) with mobile edge computing (MEC) to enhance communication quality and semantic awareness in wideband cognitive radio networks. The proposed semantic-aware optimization framework incorporates semantic information to achieve more efficient task scheduling and resource allocation. Particularly, the proposed optimization framework jointly optimizes UAV trajectories, subcarrier allocation, IRS reflection coefficients, task offloading ratios, task priorities and contextual relevance to maximize semantic utility and system energy efficiency while dynamically ensuring task demands. Furthermore, to tackle the non-convexity caused by highly coupled optimization variables, we employ a deep reinforcement learning algorithm based on double deep Q-network and twin delayed deep deterministic policy gradient (DDQN-TD3). Simulation results demonstrate that the proposed approach significantly outperforms baseline schemes by better aligning with user priorities, task requirements, and contextual awareness, leading to improved task completion rates and semantic utility, providing an innovative optimization solution for wideband cognitive radio networks.

Details

Title: Semantic aware intelligent optimization for IRS/UAV-enabled MEC in wideband cognitive radio networks

Author: Zheng, Wei¹; Ren, Pengshan¹; Li, Qing²

¹ Henan Institute of Technology, School of Electronic and Information Engineering, Xinxiang, China (GRID:grid.503012.5); Xinxiang Key Laboratory of Signal and Information, Xinxiang, China (GRID:grid.503012.5)
² Data Center of Jiangsu Provincial Administration for Market Regulation, Xicheng District, China (GRID:grid.503012.5)

Publication year: 2025

Publication date: Dec 2025

Publisher: Springer Nature B.V.

ISSN: 16871472

e-ISSN: 16871499

Source type: Scholarly Journal

Language of publication: English

DOI: https://doi.org/10.1186/s13638-025-02478-5

ProQuest document ID: 3227159709
