Content area
Though the widespread use of multi-UAV systems offers significant tactical and operational advantages, achieving efficient and secure collaborative planning remains a critical challenge in dynamic threat environments. Traditional methods struggle to balance path optimization with threat avoidance, particularly in fluctuating environments where UAVs must adapt to changing threats. To address this, an enhanced Grey Wolf Optimization (GWO) algorithm is proposed for multi-UAV collaborative planning in dynamic threat zones. Our research integrates a priori knowledge of threat zone locations, speeds, and directions with real-time data on the UAVs position relative to the threat zones to effectively manage dynamic threat zones, allowing UAVs to dynamically decide whether to navigate around or through these zones, thus significantly reducing trajectory costs. To further improve search efficiency and solution quality, strategies such as greedy initialization and K-means clustering are incorporated, enhancing the algorithms multi-objective optimization capabilities. Experimental results demonstrate that the dynamic threat zone crossing strategy significantly reduces trajectory costs compared to the traditional bypass strategy. Furthermore, the enhanced GWO algorithm outperforms both the traditional GWO and MP-GWO algorithms in terms of trajectory cost and convergence accuracy. Our approach provides novel insights and methodologies for the advancement of multi-UAV collaborative trajectory planning, while extending the applicability of the GWO algorithm in complex environments
Introduction
With the increasing complexity of contemporary warfare, there is a growing preference for unmanned aerial vehicle (UAV) swarm collaborative combat. Collaborative trajectory planning for multi-UAV operations has emerged as a key research focus within collaborative operations. Trajectory planning in a multi-UAV system involves allocating targets and considering the constraints between UAVs. It requires situational awareness of the operational environment to plan a cost-effective flight path from the starting point to the target point for each UAV. This enables the UAV to successfully complete its mission and enhances the overall survivability and combat effectiveness of the multi-UAV system [1, 2, 3–4].
In UAV formation coordination, the emergence of dynamic threat zones places higher demands on trajectory planning. These threat zones may be enemy air defense systems, moving targets, or other threat sources, and their locations and ranges change over time, causing UAVs to face great uncertainty in performing their missions. Therefore, how to effectively plan the flight path of UAVs to avoid these dynamic threat zones while efficiently accomplishing the mission has become a hot topic of current research. Existing methods for handling dynamic threat zones can be categorized into the following two types, one is completely unknown dynamic threat zones, where the UAV may encounter threat zones at any time during flight and lacks a priori knowledge of the location and motion patterns of the threat zones, and can only rely on sensor data for real-time detection and obstacle avoidance. For example, Yang [5] proposed an intelligent dynamic UAV trajectory planning method aimed at solving the problem of unknown environments and intermittent target loss, and Peng [6] investigated a dynamic path planning method to track changing targets. The other is a completely known dynamic threat zone, where the UAV can obtain the precise location and movement pattern of the threat zone for more accurate trajectory planning. For example, Tordesillas [7] describes a perceptual trajectory planning system that uses imitation learning to navigate dynamic environments while minimizing the presence of obstacles in the field of view of the onboard camera; Yousefi [8] estimates the movement paths of obstacles by combining path linearization techniques with existing high-level methods for trajectory planning. However, in practice, UAVs often face scenarios with partially known information. Specifically, UAVs may have some a priori information about dynamic threat zones, such as the approximate activity range, movement speed, and movement direction of the threat zone. This a priori information can help the UAV get a preliminary understanding of the dynamic characteristics of the hazardous area, but it cannot accurately determine its specific location and real-time behavior. Therefore, the UAV needs to combine this a priori knowledge with real-time acquired data, such as the UAV’s own positional information and the real-time positional information of the dynamic hazardous area, to calculate the relative position and movement trend to the hazardous area in real time, so as to determine whether to choose to bypass the hazardous area or go through the hazardous area directly. This decision-making approach, which combines a priori information and real-time data, enables UAVs to realize more flexible and intelligent trajectory planning in complex environments, ensuring a balance between flight safety and mission efficiency. Compared with the simple bypass-avoidance strategy, crossing the dynamic threat zone can significantly reduce the trajectory cost and improve the mission execution efficiency. Our research focus on zones with partially known knowledge and proposes an innovative solution to the challenge of trajectory planning.
Despite the broad application prospects of UAVs in both military and civil fields, the current research on UAV formation collaborative trajectory planning is still in the exploratory stage, which mainly focuses on the synergy problem among UAV swarm members [9]. Xu [10] proposed a joint multi-timescale optimization algorithm with fast fitness; Fu et al. [11] proposed a method based on the Improved Cheetah Optimization (MCO) algorithm for collaborative multi-UAV trajectory planning, which solves the difficult problem of UAV trajectory planning in complex 3D environments; Seong [12] proposed a multi-intelligent body reinforcement learning method for flexible path planning using multiple UAVs for a data collection task; Yan [13] proposed a non-dominated sorting genetic algorithm II (DTNSGAII) with dynamic topology for trajectory planning. In summary, although existing studies have solved the UAV formation trajectory planning problem to a certain extent, the following problems generally exist: firstly, the existing algorithms lack effective guidance in the initial solution generation stage, resulting in poor quality of the initial trajectory, which in turn affects the efficiency of the subsequent optimization; secondly, most algorithms are prone to fall into the local optimum during the optimization process, and are unable to take into account multiple optimization indexes at the same time, which results in the diversity and quality of the final solution being limited. The diversity and quality of the final solution are limited. These problems not only affect the collaborative combat effectiveness of UAV formations, but also limit the practical application prospects of UAVs in complex battlefield environments. Therefore, how to improve the quality of the initial solution and guide the algorithm to converge to the global optimal solution while ensuring the efficiency of the trajectory planning has become a key problem to be solved.
To address the above problems, an enhanced GWO algorithm for Gray Wolf Optimization (GWO) is proposed in this study to address the complexity and uncertainty in multi-UAV collaborative trajectory planning. Gray Wolf Optimization (GWO) [14] is a meta-heuristic algorithm proposed by Mirjalili, which mimics the leadership hierarchy and hunting mechanism of gray wolves in nature.The GWO algorithm has the advantages of a simple structure, fewer parameters to be adjusted, higher stability, and better avoidance of falling into local optimality [15, 16]. The existing GWO model has been widely used in the planning field [17, 18–19], but the existing GWO algorithm has two main problems, one is that the quality of the initial solution is not high enough to reflect the actual trajectory demand, which leads to the algorithm needing to spend more time to correct these unreasonable paths during the search process, which in turn affects the efficiency of the subsequent optimization; and the other is that the GWO algorithm in the trajectory planning process. It is difficult to effectively screen out individuals that are excellent in multiple metrics, resulting in the diversity and quality of the final solution being limited. Therefore, in UAV formation collaborative trajectory planning, how to improve the quality of the initial solution and guide the algorithm to converge toward the global optimal solution has become a key demand for enhanced GWO algorithm, and traditional algorithms such as GWO tend to be unreasonable in terms of initial solution and insufficient in terms of multi-indicator optimization due to the random initialization and single fitness function. There is no effective algorithm to solve these two problems at the same time, therefore, in this work, we enhanced GWO algorithm for multi-UAV collaborative trajectory planning by proposing greedy idea-based initialization and K-means clustering analysis, which significantly enhances the planning efficiency and the quality of the solution.
The main contributions made in this work are:
Improvement of dynamic threat zone processing method. In partial-info cases, UAV fuses a priori and real-time data. It makes smarter avoid/pass decisions than traditional methods, slashing flight trajectory costs.
Enhanced the gray wolf optimization algorithm. Employ the greedy concept for high-quality initial solutions, aligning with real needs and upping optimization efficiency. Use K-means clustering to split trajectories into subpopulations. Determine top solutions via fitness—value updates. Dodge local optima and balance indices, boosting planning efficiency and quality.
[See PDF for image]
Fig. 1
Blueprint
Dynamic Threat Zone Model
The motion modeling of dynamic threat regions is a core focus of the work. The dynamic threat region is defined as a spherical zone, characterized by a constant geometry and size during its motion, while its position and motion state change over time. Specifically, the threat region is spherical, with a fixed radius that indicates its influence range. This spherical design simplifies the collision detection problem in trajectory planning, allowing UAVs to determine whether they have entered the threat zone by calculating the distance from the center of the sphere. Additionally, the design enhances the efficiency and reliability of detection and avoidance algorithms in all directions.The characteristics of dynamic threat regions extend beyond their fixed geometries; they also encompass movement patterns and changing trends. we assumed that the threat zone is initially positioned at a certain altitude and circles around a central point at a constant speed. During this circling process, the threat zone gradually ascends, reaches a specific altitude, and then descends back to its initial altitude, entering a new cycle of upward and downward motion. This motion pattern simulates behaviors that a UAV may exhibit while performing specific tasks. The movement speed and hovering radius of the threat zone significantly influence the challenges faced in UAV trajectory planning; a faster moving threat zone with a larger hovering radius increases the difficulty of obstacle avoidance.
The detection and prediction of dynamic threat zones by UAVs are crucial technologies for effective trajectory planning. We assumed that the UAV acquires real-time position information of the threat zone through sensors and determines whether trajectory adjustments are necessary to avoid collisions. Specifically, the UAV measures the distance between its current position and the center of the threat zone upon reaching each new waypoint. If this distance is less than the radius of the threat zone, the UAV is deemed to have entered the threat zone, triggering the obstacle avoidance mechanism.To enhance the success rate of obstacle avoidance, the UAV also predicts the motion trend of the threat region. It is assumed that the motion of the threat region follows its current state, allowing for the prediction of its future position. By planning a detour path in advance, the UAV can effectively avoid collisions with the threat region. However, if the UAV detects that an enemy UAV is far away when approaching the dynamic threat region, it may opt to traverse the threat zone directly to improve mission efficiency.
Our research proposes a staged obstacle avoidance strategy to ensure that UAVs can safely and efficiently navigate threat regions. The strategy consists of the following steps:
Threat zone detection: Upon arriving at a new waypoint, the UAV measures the distance to the center of the threat zone in real time. If the distance is less than the radius, the UAV triggers the obstacle avoidance mechanism.
Motion trend prediction: The UAV predicts the future position of the threat zone based on its current motion state. The prediction model assumes that the motion in the threat zone follows the current speed and direction, and calculates its position at a certain moment in the future based on this.
Trajectory adjustment: The UAV adjusts its current trajectory to avoid the threat zone based on the predicted future position, employing strategies such as re-selecting waypoints, changing flight direction, or altering altitude. Specifically, the UAV will choose a detour path so that it always maintains a safe distance from the threat zone during the obstacle avoidance process.
Real-time updates: Given the dynamic nature of the threat zone, the UAV continuously updates its obstacle avoidance strategy during flight, re-detecting the threat zones location at each waypoint to ensure effective and safe navigation.
Problem Description and Assumptions
In our research, we focus on incorporating dynamic threat regions into the trajectory planning model as a core constraint in the planning process. The objective of the UAV is to navigate from an initial position to a target position while effectively avoiding these dynamic threat regions. We model the trajectory planning problem as a constrained optimization problem, aiming to minimize the total cost of the trajectory, which includes factors such as path length, energy consumption, and time, while adhering to the UAV’s flight constraints (e.g., range, altitude, and time) and obstacle avoidance requirements.Specifically, a UAV’s trajectory is represented as a series of connected waypoints, each indicating the UAV’s position at a specific moment. The locations and movement patterns of dynamic threat regions are treated as additional constraints that must be updated in real-time during the trajectory planning process. By integrating information about dynamic threat regions into the trajectory planning model, we transition from a static obstacle avoidance problem to a dynamic one, thereby increasing the complexity and challenge of the problem.
We establish a three-dimensional battlefield environment defined by dimensions , where each UAV in the swarm is assigned a specific attack target. The UAV swarm is denoted as U, the set of corresponding targets is T, and the collection of threat zones is M, with each threat zone represented as a sphere in the three-dimensional coordinate system.The UAV swarm set is , the set of targets corresponding to UAVs is and the collection of threat zones is , each of which is spherical in the 3-dimensional coordinate system.
For each UAV, the trajectory from the departure point to the target point is composed of a series of trajectory points connected sequentially, with a flight segment between two adjacent trajectory points . Let the total flight range of a single UAV be , the speed be , the flight height be , the flight time be , the flight starting point be , the target point be and the track point is , where . So, the trajectory of can be expressed as , where F(q) is the constraint and q is the constraint parameter. Figure 2 illustrates a simulation of the UAV swarm battlefield environment, showcasing the integration of dynamic threat regions into the trajectory planning framework.
[See PDF for image]
Fig. 2
UAV swarm battlefield environment
UAV Swarms Constraint
Range Constraint
The UAV has a maximum range for a single flight. Let the flight distance of each segment of the UAVs trajectory be denoted as and let the total number of flight segments be n. The UAV must ensure that the total flight range does not exceed the maximum safe distance, which guarantees a safe return. Therefore, the UAVs total flight distance must satisfy the following constraint:
1
where is the maximum safe flight range.Flight Spacing Constraint
The flight spacing constraint ensures that there is no collision between UAVs in the swarm. The minimum distance between any two UAVs must not be less than the minimum safe flight distance. If the distance between UAVs is less than the minimum safe distance, a collision is considered to occur. The constraint can be expressed as:
2
where is the minimum distance between the UAV and the UAV trajectory point, is the specified minimum safe flight distance between UAVs.Flight Altitude Constraints
The UAV’s flight altitude must be constrained within a certain range to avoid vulnerabilities to ground-based threats and to prevent collisions with obstacles such as hillsides. Additionally, flying at high altitudes may increase the cost of the mission. Let the maximum altitude be and the minimum altitude be . Therefore, the flight altitude constraint is:
3
Time Constraint
The UAV needs to reach the target within a specified time. Suppose the task is required to be completed within and hence the time constraint is:
4
Mathematical Model in UAV Path Planning
Stand-Alone Trajectory Cost
Multiple indicators are used to judge the quality of UAV trajectory. The trajectory cost, also known as adaptability, primarily comprises fuel consumption cost, flight altitude cost, collision cost, and threat zone cost [20, 21–22]. The formula for the trajectory cost of a single UAV is as follows. we only consider the trajectory cost when the UAV attacking the target The trajectory cost of the return journey after the attack is not considered for the time being.
Total cost of fuel consumption:
5
Total cost of flight altitude:6
Total cost of the threat zone:7
Total cost of terrain:8
Total cost of collision:9
Total cost of time:10
The total cost of a single UAV is shown in Eq. 11:11
In the above equation: is the altitude of the segment of the trajectory; denotes the fuel cost function with respect to length and flight altitude in segment i of the trajectory; denotes the flight altitude cost function associated with the flight altitude in segment i of the trajectory; denotes the hazard cost function associated with the hazard level of the voyage in the segment of the trajectory; denotes the terrain cost function associated with the number of terrain collisions in trajectory segment i; denotes the collision cost function associated with the number of collisions in segment i; denotes the time–cost function of flight time for flight segment i; denotes the total fuel cost of the UAV; represents the total cost of the flight altitude of the UAV; denotes the total cost of the threat zone for the UAV; denotes the total cost of the collision for the UAV; represents the total cost of time for the UAV. , , , , , are the weighting factors and satisfy, .Multi-aircraft Trajectory Planning
The formula for calculating the trajectory cost of multiple UAVs in formation is obtained by summing the individual costs of each UAV, weighted by specific factors. The total trajectory cost for a fleet of UAVs is therefore expressed as:
12
where N is the number of UAVs and ,,,,, are the weights of each cost.Altitude and Fuel Consumption: Altitude changes during UAV flight significantly affect fuel consumption, particularly during ascent and descent. The squared form of the altitude change is used to more accurately capture energy consumption, especially during sharp altitude variations. When a UAV ascends from a low to a high altitude, it must overcome gravitational forces, which requires the engine to perform work. Additionally, as altitude increases, the air density decreases, which in turn raises the thrust requirements, causing the engine to consume more fuel to maintain sufficient power. Conversely, during descent, although the UAV can utilize gravity to assist in the downward motion, the engine must still provide thrust to maintain control and ensure a safe, stable flight attitude.In general, the longer the total flight distance, the higher the fuel consumption. As a result, the total flight distance is considered a key factor in the trajectory cost calculation. When the UAV’s flight speed is expressed as the average speed over the total range, the fuel cost function for a single UAV can be represented as a function of altitude change and a fixed proportion of the distance traveled. This can be written as:
13
14
15
In the above equation: , are weighting coefficients used to adjust the influence of different factors on the cost function. where is the fuel consumption of one kilometer, k is the waypoint, , , are the abscissa, ordinate, and vertical coordinates of the waypoint, respectively.If the UAV flies too high or too low, the cost will increase by the function :
16
where , are increasing factors.There are two types of threat zones considered: static threat zones and dynamic threat zones caused by enemy UAVs. When a UAV passes through a static threat zone, the threat cost increases according to a specific function. This can be expressed as:
17
18
19
where is the threat level in the threat zone; and are the center coordinate of the hazard zone. If is less than the radius of the threat zone, then it means that the UAV has flown into the threat zone.If the UAV enters the enemy UAV’s threat zone, the threat cost will be increased by the function, i.e.,
20
In the above equation: is the threat attenuation coefficient, between [0, 1], is the threat factor of the enemy UAV, is the distance from our UAV to the enemy UAV, is the jamming radius of the enemy UAV, and is the detection radius of the enemy UAV.If the UAV collides with the terrain, it will increase the cost through the function, i.e.,
21
In the above equation: is the terrain height.If the UAVs collide with each other, the cost is increased by the function, i.e.,
22
If the UAV takes too long to reach the target, the cost is increased by the function:23
Trajectory Evaluation by fitness Degree: The cost of the UAV’s flight trajectory is evaluated based on its fitness degree. The smaller the value of the fitness degree, the lower the cost of the UAV’s flight trajectory. Therefore, the fitness degree is considered a crucial reference index for comparing the performance of different algorithms. The fitness function for evaluating the UAV’s trajectory is defined as:24
GWO Algorithm and Enhanced GWO Algorithm
GWO Algorithm
The Gray Wolf Optimization (GWO) algorithm achieves the optimization objective by simulating the hierarchical structure and hunting behavior of a grey wolf population. The algorithm uses four types of wolves , , and to model the social hierarchy. Initially, the gray wolf population is randomly generated in the search space, and the optimal, sub-optimal, and third-best solutions of fitness values are assigned to the -wolf, -wolf, and -wolf, respectively, while the remaining solutions are considered -wolves. The -wolf, -wolf and -wolf are responsible for guiding the hunt, while the -wolves follow the guidance of the first three wolves.
In their search for prey, gray wolves approach it gradually. The mathematical model of this behavior is represented as follows:
25
26
27
28
where t is the current number of iterations; A and C denote coefficient vectors; represents the position of the prey, X(t) is the current position vector of the gray wolf. a drops linearly from 2 to 0 throughout the iteration and , are random vectors in .Gray wolves are able to identify the potential location of their prey, and the hunt is primarily guided by the -wolf. However, the exact location of the prey remains unknown to the wolves. To simulate the hunting behavior, it is assumed that the -wolf, -wolf, and -wolf have more information about the potential prey location than the -wolves. During each iteration, the three best solutions are recorded, and the -wolves are forced to update their positions based on the guidance provided by the -wolf, -wolf, and -wolf, according to the following equation:
29
30
31
where , , represent the position vector of -wolf, -wolf and -wolf respective in the current iteration; , , are the random vectors; , , represent the distance between other individuals and -wolf, -wolf and -wolf; denotes the updated position vector.Once a new wolf population is generated, boundary control is applied to the wolf population, completing one iteration. These steps are repeated until the termination conditions are met, typically when the maximum number of iterations is reached, at which point the best solution is output.
Despite its effectiveness, the GWO algorithm has certain drawbacks when applied to 3D UAV collaborative trajectory planning problems. Specifically, when the GWO algorithm initializes the population randomly, the initial distribution of individuals may be disordered, leading to low adaptability. This often results in the algorithm prematurely converging to local optima, especially when there are multiple optimization metrics involved.To address these limitations, several enhancements have been introduced to improve the GWO algorithms performance in UAV trajectory planning.
Design of Enhanced GWO Algorithm
Improvements to GWO for UAV Collaborative Trajectory Planning:
First, to improve the fitness of individual UAVs and reduce the likelihood of the population getting trapped in local optima, a greedy initialization method is introduced. This method restricts the search range of each individual such that the lower bound of the range is defined by the coordinate of the previous waypoint, and the upper bound is defined by the target coordinate. This ensures that the distribution of individual waypoints is approximately aligned with the desired flight path direction. However, this method reduces the diversity of the population. To mitigate this issue, a mutation strategy is introduced. During initialization, each individual decides whether to adopt the greedy initialization or random initialization based on a mutation parameter m. If m is smaller than a predefined threshold, the individual uses the greedy initialization; otherwise, it follows the random initialization method. The greedy initialization procedure is as follows:
32
33
In the above equation, , are the horizontal coordinates of the two neighboring waypoints in an individual, , are the vertical coordinates of the two neighboring waypoints in an individual, , are the horizontal and vertical coordinates of the coordinates of the target point; and U is a function of the limitations to the waypoint coordinates.Secondly, for collaborative trajectory planning, K-means clustering is applied to analyze UAV flight paths. The K-means algorithm helps evaluate the population of UAV trajectories independently across multiple metrics. Based on the results, highly adapted individuals referred to as "good genes" are filtered out. These individuals exhibit strong performance across various metrics and are retained for further optimization or evolution. The process of cluster analysis is as follows:
Determine the number of clusters: we consider five factors influence UAV trajectories fuel consumption, flight altitude, collision risk, threat zone exposure, and time. Consequently, the number of clusters is set to k=5;
Then, the distance of each data point from each clustering center is calculated by Eq. 34, and each data point is assigned to the nearest clustering center, which means that UAV trajectories are divided into five populations;
Recalculate the center of each clustering through Eq. 35;
Iterate until the maximum number of iterations is reached, which is specified as 5 times in this work;
The result of the final clustering is the evaluation result of each index.
34
where x is the data point, is the clustering center, d is the dimension of the data, and are the values of x, on the dimension, respectively.35
In the above equation, is the set of data points for the cluster and is the number of data points in that set.Process of the Enhanced GWO Algorithm
The process of the enhanced Gray Wolf Optimization (GWO) algorithm is outlined as follows (Fig. 3).
Set the size of the gray wolf population and the maximum number of iterations. Initialize the population using a method based on greedy principles, combined with a mutation strategy. The positions of the -wolf, -wolf, -wolf and -wolves are initialized based on the aforementioned strategy.
Perform K-means clustering on the track populations to group the UAV trajectories into distinct clusters. This helps to identify and retain the most adapted individuals based on their performance across multiple metrics;
For each individual gray wolf in the subpopulation, calculate the fitness value according to Eq. 24. This step evaluates the performance of each solution based on the predefined metrics;
Compare the optimal fitness values in each sub-population with the fitness values of the -wolf, -wolf and -wolf. Screen out the most optimal solution , the optimal solution and the suboptimal solution of each subpopulation during the current iteration, and record the corresponding waypoint positions of them;
Update each gray wolf individuals position in the subpopulation according to Eq. 31;
For the three best solutions in each subpopulation, calculate the parameters a, A and C according to Eqs. 27 and 28. These parameters guide the movement of the wolves in the search space;
If the preset maximum number of iterations is reached, output the best collaborative trajectory along with the corresponding optimal fitness value. Otherwise, return to Step 2 for the next iteration.
[See PDF for image]
Fig. 3
Flowchart of the enhanced GWO algorithm
[See PDF for image]
Algorithm 1
Enhanced GWO algorithm pseudocode.
Simulation Experiment
Simulation Environment
This experiment uses MATLAB for a simulation. Suppose there is a formation of 5 unmanned aircraft attacking 5 targets. The starting coordinates of the UAV and the coordinates of the targets are shown in Table 2. A total of 9 dangerous zones are set, including radar scanning zones, missile zones, ground artillery zones, meteorological impact zones, dynamic hazard zones, etc. We assume that the number of iterations is 80. The maximum flight range of the UAV is 1.5 times the initial distance, the flight speed is , the flight altitude is in the range of and the safe distance is 0.5km. The fuel cost ; flight altitude cost ; threat zone cost ;topographic costs ; collision cost and time cost .The experimental environment and parameters are summarised in Table 1.
Table 1. Experimental environment and parameters
Parametric | Value/description |
|---|---|
Simulation Platform | MATLAB |
Number of UAVs | 5 |
Number of targets | 5 |
Number of threat zones | 9 |
Maximum number of iterations | 80 |
Safe flight distance | 0.5 km |
To compare the MP-GWO algorithm with both the standard GWO algorithm and the Enhanced GWO algorithm, the parameter settings used are consistent across all three algorithms.
Table 2. Coordinates of the UAV and target of attack
UAV coordinates | Target coordinates |
|---|---|
(1.5,1.5,15) | (20,20,4) |
(0,1,5,15) | (16,20,5) |
(1.5,0,15) | (20,6,5) |
(0,3,15) | (18,20,4.5) |
(3,0,15) | (20,18,4.5) |
Experimental Analysis
Feasibility Test for Multi-UAV Collaborative Path Planning
The enhanced Gray Wolf Optimization (GWO) algorithm was evaluated through simulation experiments. Figure 4 presents the simulation results of the enhanced GWO algorithm, while Fig. 5 shows the fitness curve. In Fig. 4, the circle indicates the starting point of the UAV and the pentagram indicates the end of the trajectory. The green regions represent radar threat zones, the blue zone corresponds to missile threat zones, ground air defense artillery, and weather threats, while the light blue zone depicts the terrain, with terrain height randomly generated. The yellow zone represents the dynamic threat zone, which corresponds to the threat posed by enemy UAVs that are dynamically moving and circling. The results demonstrate that all five UAVs successfully reach the target zone. In Fig. 5, the fitness curve illustrates the change in the adaptability function as the number of iterations increases. As the number of iterations progresses, the adaptability steadily decreases and eventually stabilizes at a fixed value. This suggests that the UAVs’ paths have been optimized, with the cost of the trajectory minimizing, thereby confirming the effectiveness of the proposed algorithm.
[See PDF for image]
Fig. 4
UAV swarm trajectory
[See PDF for image]
Fig. 5
Fitness curves
Table 3 shows the parameters such as flight speed, flight distance, and the number of collisions for each UAV. All parameters fall within acceptable ranges, indicating that the UAVs are operating with minimal damage and their performance is well within safe operational limits.
Table 3. Parameters of the UAVs
UAV1 | UAV2 | UAV3 | UAV4 | UAV5 | |
|---|---|---|---|---|---|
Number of waypoints | 30 | 26 | 24 | 26 | 24 |
Flight distance | 29.51km | 29.09km | 27.05km | 28.03km | 27.74km |
Flight time | 124.69s | 133.68s | 134.45s | 119.18s | 128.23s |
Flight speed | 237 m/s | 218 m/s | 291 m/s | 235 m/s | 216 m/s |
Number of collisions | 0 | 0 | 0 | 0 | 0 |
System Performance Test Experiment
In practical application, the parameter selection of the algorithm is crucial for its performance, different parameter settings will significantly affect the algorithm’s performance in terms of solution quality, convergence speed, fitness value, and various cost indicators, etc. To clarify the effect of different parameter settings and provide a theoretical basis for parameter tuning in practical application, experiments on different greedy initialization parameters and different weight coefficients are carried out. The experimental results are shown in Table 4.
Table 4. Parameters of the UAVs
E-GWO | MP-GWO | GWO | |
|---|---|---|---|
ep = 0.2 | 29.82 | 32.42 | 34.77 |
ep = 0.5 | 12.42 | 15.94 | 17.66 |
ep = 0.8 | 31.18 | 33.21 | 35.06 |
= 0.05, = 0.05, | 12.42 | 15.94 | 17.66 |
= 0.15, = 0.15 | |||
= 0.30, = 0.30. | |||
= 0.15, = 0.15 | 19.21 | 20.07 | 26.58 |
= 0.15, = 0.15 | |||
= 0.20, = 0.20. |
The experiments conclude that in terms of greedy initialization parameters, the threshold 0.5 is the optimal choice to balance the solution quality and diversity, which is applicable to most dynamic threat scenarios; if the task emphasizes fast convergence, the threshold 0.8 can be chosen, but it needs to be combined with the diversity enhancement strategy. If there are multiple local optima in the problem, the threshold 0.2 can avoid precocious convergence through diversity. In terms of weight coefficients, the weights need to be dynamically adjusted according to the task requirements, and in the experiments of this manuscript, the collision prevention and time weights are relatively important.
Comparative Analysis of Dynamic Threat Zone Bypassing and Penetration Strategies
We compared two strategies for UAVs navigating through dynamic hazard zones: one where the UAV bypasses the dynamic threat zone entirely, and another where the UAV performs an in-depth analysis of the threat zone’s location and, based on the analysis, decides whether to bypass or penetrate the dynamic hazard zone. The enhanced GWO algorithm was used to evaluate both strategies, and the fitness convergence values were compared.
Table 5 summarizes the results after 20 simulation runs for each of the two strategies. Under the bypass strategy, the fitness convergence value was 17.91, whereas the strategy that combines bypassing with penetration analysis resulted in a lower fitness convergence value of 12.42. This indicates that by analyzing the location of dynamic threat zones and making informed decisions about whether to bypass or penetrate, the UAVs can optimize their paths more effectively, leading to better performance. Both strategies required 70 iterations to converge, indicating that the number of iterations needed to reach the optimal solution was the same for both strategies. Also, the AVOID OR THROUGH strategy takes longer to run due to the fact that it optimises more steps. Furthermore, there were no in-flight collisions in any of the 20 simulation runs for either strategy, ensuring that flight safety was maintained throughout.
Table 5. Comparison of the two strategies
Strategy | Convergence value of fitness | Number of iterations | Running Time | Number of flight collisions |
|---|---|---|---|---|
Avoid | 17.91 | 70 | 20.31 | 0 |
Avoid or Through | 12.42 | 70 | 26.06 | 0 |
Ablation Experiments
To verify the effectiveness of the two improvement strategies, greedy initialization and K-means clustering, in the enhanced GWO algorithm. In the optimization process of the algorithm, whether the new improvement strategies can really enhance the performance of the algorithm, to what extent and whether there is synergy between different strategies need to be verified through rigorous experiments, so as to provide a reliable theoretical basis for the algorithm in the practical application, details are shown in Table 6.
Table 6. Comparison of the two strategies
Group | Initial fitness | Number of iterations | fitness value |
|---|---|---|---|
E-GWO | 25.3 | 70 | 12.41 |
Greedy initialization method | 26.7 | 85 | 15.23 |
K-means clustering | 34.5 | 75 | 18.67 |
GWO | 40.2 | 90 | 17.66 |
The mean value of initial solution fitness for the greedy initialization only group 26.7 is significantly better than that of the no improvement strategy group 40.2. This indicates that the greedy initialization strategy is able to generate initial paths that are closer to the global optimum, which gives the algorithm a relatively good starting point at the initial stage, reduces ineffective searching in the worse solution space, and demonstrates a positive impact on the algorithm’s performance from the initial stage. The number of convergence iterations for this group is higher than that of the complete augmented GWO group. This is because in the absence of K-means clustering, the algorithm can only rely on its own iterations to correct the paths without the aid of K-means clustering to filter and optimize the population, resulting in more iterations being required to reach a stable fitness value, suggesting that although greedy initialization provides a better initial solution, it requires the cooperation of other strategies during subsequent optimization.
The E-GWO group has the fastest convergence, the lowest final fitness value, and balanced sub-costs. This suggests that the two strategies, greedy initialization and K-means clustering, can fully exploit their respective advantages when they work in tandem. Greedy initialization provides high-quality initial solutions and reduces ineffective searches, while K-means clustering filters and optimizes the population in subsequent iterations and improves the multi-objective balance, and the two work together to significantly improve the global optimization ability of the algorithm, which makes the algorithm both efficient and robust in dynamic threat scenarios.
Algorithm Comparison
To evaluate the performance of the enhanced GWO algorithm, it was compared with the Multi-population GWO (MP-GWO) and standard GWO algorithms. The comparison was based on running time, fitness convergence value, number of iterations required to achieve optimal convergence, and the number of flight collisions.
Table 7 presents the simulation results after 20 runs for each algorithm. The enhanced GWO algorithm achieved the lowest fitness convergence value of 12.42, compared to 15.94 for MP-GWO and 17.66 for standard GWO. This demonstrates that the enhanced GWO algorithm outperforms both MP-GWO and GWO in optimizing the objective function, as indicated by the lower fitness value.Additionally, the enhanced GWO algorithm required fewer iterations to reach convergence, with 70 iterations compared to 73 for MP-GWO and 75 for standard GWO. This suggests that the enhanced GWO algorithm is more efficient in reaching the optimal solution, requiring less computational effort or time.All three algorithms resulted in zero flight collisions, indicating that the UAVs successfully avoided any collisions in all simulation runs, ensuring safe flight operations.
Figures 6 and 7 show the fitness curves of the MP-GWO algorithm and the GWO algorithm. Through the comparison of Figs. 5, 6 and 7, it can be seen that the convergence speed of the Enhanced GWO algorithm is faster, and the convergence accuracy is higher. Moreover, due to the greedy algorithm used for trajectory initialization, the fitness of the Enhanced GWO algorithm in the initial iteration is significantly lower compared to the other two algorithms. This highlights the advantage of utilizing the greedy algorithm for initialization.
[See PDF for image]
Fig. 6
MP-GWO fitness curve
[See PDF for image]
Fig. 7
GWO fitness curve
Table 7. Comparison of simulation results of three algorithms
Arithmetic | Convergence value of fitness | Number of iterations | Running Time | Number of flight collisions |
|---|---|---|---|---|
Enhanced GWO | 12.42 | 70 | 26.06 | 0 |
MP-GWO | 15.94 | 73 | 26.66 | 0 |
GWO | 17.66 | 75 | 20.41 | 0 |
Discussion
In this study, a solution based on the enhanced Gray Wolf optimization algorithm is proposed to address the challenge of multi-UAV collaborative trajectory planning in complex dynamic threat environments. The experimental results show that the constructed spherical dynamic threat model can effectively characterize the dynamic fluctuation characteristics of the threat region (e.g., periodic upward/downward movement), which, combined with the introduction of the staged obstacle avoidance strategy, significantly improves the response capability of the UAV community to dynamic threats.
The algorithm uses a combination of greedy initialization and mutation strategies to generate initial UAV swarm trajectory points. The algorithm then divides the UAV trajectories into subgroups for cluster analysis, calculates the fitness value of each subgroup, and continuously adjusts the positions of the individuals according to the optimal, sub-optimal, and tri-optimal solutions. The iterative process continues until the maximum number of iterations is reached or until convergence occurs when the fitness value is stable. Experimental results show that the proposed method not only reduces the operation cost and improves the efficiency, but also greatly reduces the risk of UAV damage. In terms of performance evaluation, the enhanced GWO algorithm outperforms similar algorithms with a goodness-of-fit convergence value of 12.42, which is significantly better than the 15.94 and 17.66 achieved by the MP-GWO algorithm and the standard GWO algorithm, respectively, confirming the superiority of the enhanced GWO in optimizing the trajectory planning of multi-UAV systems.
However, the algorithm still has some limitations in practical application scenarios. On the one hand, the spherical dynamic threat area model is a simplified simulation; the actual threat environment may be more complex and variable, and the model may not reflect the real situation completely and accurately. On the other hand, the experiment is conducted in a specific simulation environment, and the effectiveness and adaptability of the algorithm need to be further verified for different scenario characteristics, such as complex meteorological conditions, dynamic differences of UAVs, and real-time communication interference. In addition, the parameter settings of the algorithm have a large impact on the final results, the current parameter optimization process is more dependent on experience, and the adaptive adjustment ability of parameters under different types of mission scenarios needs to be studied in depth.
Conclusion
The enhanced GWO algorithm proposed in this study effectively addresses the challenge of collaborative multi-UAV trajectory planning in complex dynamic threat environments. By integrating real-time UAV detection, motion trend prediction and phased obstacle avoidance strategies, combining greedy initialization and mutation strategies to generate initial trajectory points, and employing cluster analysis and iterative optimization, the algorithm achieves safe and efficient obstacle avoidance and improves the overall trajectory planning performance.
The experimental results show that the enhanced GWO algorithm performs well in reducing the operation cost, improving the efficiency and reducing the risk of UAV damage, and outperforms similar algorithms in performance evaluation with better fit convergence values, confirming its superiority in optimizing the trajectory planning of multi-UAV systems.
Despite the results achieved in this study, limitations still exist. The simplicity of the spherical dynamic threat region model, the specificity of the experimental environment, and the lack of adaptive tuning capability of the algorithm parameters need to be further addressed in future research. Overall, this study lays a solid foundation for the development of multi-UAV cooperative trajectory planning technology, and the proposed innovative improvements and experimental validation provide valuable references for research and application in related fields.
Acknowledgements
The authors would like to thank my supervisor for your invaluable guidance and support throughout the research process, and the journal’s editors and anonymous reviewers who generously provided detailed comments and suggestions that helped us improve the work.
Author Contributions
All authors contributed to the study conception and design. Conceptualization was performed by Zihan Zhou and Yanhong Guo. Data curation was performed by Yitao Wang. Validation was performed by Haoran Gong. Visualization was performed by Jingfan Lyu. The original draft of the manuscript was written by Zihan Zhou and all authors commented on previous versions of the manuscript. All authors review and approved the final manuscript.
Data Availability
Data sets generated during the current study are available from the corresponding author upon reasonable request. These data were used under license from the current study and are therefore not publicly available.
Declarations
Conflict of interest
The authors did not receive support from any organization for the submitted work. All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
Ethics approval
Ethics approval and consent to participate
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Sud, A; Andersen, E; Curtis, S; Lin, MC; Manocha, D. Real-time path planning in dynamic virtual environments using multiagent navigation graphs. IEEE Trans. Visual. Comput. Graph.; 2008; 14,
2. Qin, B; Zhang, D; Tang, S; Wang, M. Distributed grouping cooperative dynamic task assignment method of uav swarm. Appl. Sci. Basel; 2022; [DOI: https://dx.doi.org/10.3390/app12062865]
3. Wang, C; Wu, A; Hou, Y; Liang, X; Xu, L; Wang, X. Optimal deployment of swarm positions in cooperative interception of multiple uav swarms. Digit. Commun. Netw.; 2023; 9,
4. Wen, L; Zhen, Z; Wan, T; Hu, Z; Yan, C. Distributed cooperative fencing scheme for uav swarm based on self-organized behaviors. Aerosp. Sci. Technol.; 2023; [DOI: https://dx.doi.org/10.1016/j.ast.2023.108327]
5. Yang, Z; Yan, S; Ming, C; Wang, X. Intelligent dynamic trajectory planning of uavs: Addressing unknown environments and intermittent target loss. Drones; 2024; 5, 8.
6. Savkin, AV; Huang, H. Asymptotically optimal path planning for ground surveillance by a team of uavs. IEEE Syst. J.; 2022; 16,
7. Tordesillas, J; How, JP. Deep-panther: Learning-based perception-aware trajectory planner in dynamic environments. IEEE Robot. Autom. Lett.; 2023; 8,
8. Yousefi, SMM; Mohseni, SS; Dehbovid, H; Ghaderi, R. A practical approach to tracking estimation using object trajectory linearization. Int. J. Comput. Intell. Syst.; 2024; 17, 1. [DOI: https://dx.doi.org/10.1007/s44196-024-00579-5]
9. Kownacki, C. Control and position tracking for uavs. Appl. Sci. Basel; 2024; 14, 5. [DOI: https://dx.doi.org/10.3390/app14051909]
10. Wang, K; Zhang, X; Duan, L; Tie, J. Multi-uav cooperative trajectory for servicing dynamic demands and charging battery. IEEE Trans. Mob. Comput.; 2023; 22,
11. Fu, Y; Yang, S; Liu, B; Xia, E; Huang, D. Multi-uav cooperative trajectory planning based on the modified cheetah optimization algorithm. Entropy; 2023; 25, 9.4654088 [DOI: https://dx.doi.org/10.3390/e25091277]
12. Seong, M; Jo, O; Shin, K. Multi-uav trajectory optimizer: A sustainable system for wireless data harvesting with deep reinforcement learning. Eng. Appl. Artif. Intell.; 2023; [DOI: https://dx.doi.org/10.1016/j.engappai.2023.105891]
13. Yan, S; Cai, K; Swaid, M. A multi-objective evolutionary algorithm with dynamic topology and its application to network-wide flight trajectory planning. Int. J. Comput. Intell. Syst.; 2017; 10,
14. Mirjalili, S; Mirjalili, SM; Lewis, A. Grey wolf optimizer. Adv. Eng. Softw.; 2014; 69, pp. 46-61. [DOI: https://dx.doi.org/10.1016/j.advengsoft.2013.12.007]
15. Gupta, S; Deep, K. Enhanced leadership-inspired grey wolf optimizer for global optimization problems. Eng. Comput.; 2020; 36,
16. Liu, Y; As’arry, A; Hassan, MK; Hairuddin, AA; Mohamad, H. Review of the grey wolf optimization algorithm: variants and applications. Neural Comput. Appl.; 2024; 36,
17. Jamshidi, V; Nekoukar, V; Refan, MH. Real time uav path planning by parallel grey wolf optimization with align coefficient on can bus. Cluster Comput.; 2021; 24,
18. Liu, X; Shao, P; Li, G; Ye, L; Yang, H. Complex hilly terrain agricultural uav trajectory planning driven by grey wolf optimizer with interference model. Appl. Soft Comput.; 2024; [DOI: https://dx.doi.org/10.1016/j.asoc.2024.111710]
19. Radmanesh, M; Kumar, M; Sarim, M. Grey wolf optimization based sense and avoid algorithm in a Bayesian framework for multiple UAV path planning in an uncertain environment. Aerosp. Sci. Technol.; 2018; 77, pp. 168-179. [DOI: https://dx.doi.org/10.1016/j.ast.2018.02.031]
20. Dewangan, RK; Shukla, A; Godfrey, WW. Three dimensional path planning using grey wolf optimizer for uavs. Appl. Intell.; 2019; 49,
21. Zhang, W; Zhang, S; Wu, F; Wang, Y. Path planning of uav based on improved adaptive grey wolf optimization algorithm. IEEE Access; 2021; 9, pp. 89400-89411. [DOI: https://dx.doi.org/10.1109/ACCESS.2021.3090776]
22. Zhang, Z; Li, Y; Sun, Q; Huang, Y. A novel waypoint guidance and adaptive evolution strategy for unmanned aerial vehicle 3d route planning. J. Frank Inst. Eng. Appl. Math.; 2023; 360,
© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.