Optimal Stopping and Loading Rules Considering

1. Introduction

Existing reliability models concentrate primarily on a system’s capacity to perform specific tasks under specified operating conditions and time constraints. Among the associated measures, the probability of mission success (completing a mission under specified conditions) is an essential index in reliability research [1,2,3,4,5]. However, in engineering practice, when system failure may result in severe consequences or enormous losses, system safety may be more crucial than completing the task. It is therefore important to design appropriate operating strategies that improve the operational efficiency of these systems [6,7,8,9,10,11,12]. For safety-critical systems, stopping the mission and starting a rescue action can improve system safety and thus reduce the probability of casualties and significant economic losses. When a predetermined condition is met, the task is stopped and a rescue scheme is launched to save the system [13]. For instance, when an operating airplane sustains a certain level of damage from a lightning strike, it can promptly stop the mission and initiate rescue operations to prevent loss of the aircraft and human injury or death. Levitin et al. [14] proposed two indexes for evaluating a system operating in a hostile environment: mission reliability and system safety. Mission reliability is the probability that the mission is finished within a specified time, whereas system safety is the probability that the system experiences no catastrophic failure during the mission. By balancing these two indexes, the mission-stopping threshold can be chosen to reduce the total cost during mission execution [15].

In addition to mission abort, the loading level of a safety-critical system is another important factor influencing system safety [16]. For example, using tools at higher cutting speeds increases wear rates, and charging electric vehicles at higher rates accelerates battery degradation. A higher loading level speeds up mission progress but also increases the degradation rate. To balance the degradation rate against mission progress, the optimization of loading strategies for safety-critical systems is drawing increasing attention.

Although there has been substantial theoretical progress in mission-stopping modeling and loading-level optimization, the following three topics remain unexplored. First, existing stopping procedures generally assume that a mission is stopped if the state of the system deteriorates beyond an acceptable threshold. However, in engineering practice, the failure process of the system may be actively controlled by modifying the loads, allowing the decision-maker to control the degradation before the system hits the stopping threshold. This effect can be achieved by adjusting the system load before mission termination. As an example of its practical engineering application, the workload of road systems can be changed by controlling the density and speed of passing vehicles. Second, degradation-based stopping methods for multi-attempt missions are still under-explored. Multiple attempts can increase mission reliability and system safety, so policies regarding loading and mission stopping should take into account the possibility of multiple tries. Third, the joint optimization of loading and stopping rules has not been studied.

To fill the research gaps, this study proposes degradation-based loading and stopping rules according to the degradation level and time in mission, which represents a significant step forward in the state-of-the-art of mission-stopping modeling. The decision maker adjusts the loading level dynamically in each attempt. In the event that the system reaches the stopping condition, the rescue operation should be initiated, and the task stops. If the rescue is successful, a new attempt is made until system failure or the time limit has expired, whichever comes first. Finally, optimization models are presented, along with an analysis of numerical examples and a sensitivity analysis of important parameters, both of which are applied to real-world engineering scenarios such as the cloud computing system.

The remaining sections of this work are organized as follows. Section 2 presents a literature review. In Section 3, dynamic loading and mission-stopping strategies are developed for systems subject to a monotone deterioration process under two different types of task success criteria. In Sections 4 and 5, we assess the mission reliability and the system safety under these two types of task success criteria. In Section 6, we investigate the optimal loading and stopping rules. An application example of the proposed models is provided in Section 7. Finally, we summarize our findings and outline possible avenues for further study.

2. Literature Review

For many engineering systems, such as drones [17] and chemical reactors [18], system survival is more important than task success because system failure can lead to huge economic losses or even casualties. When the failure risk becomes high enough, it is necessary to terminate the task and start the rescue program to save the system [19]. For example, when multiple-engine drones perform maintenance tasks on high-voltage grids, external shocks can cause certain damage to the engines, leading to engine failure or even damage to high-voltage grids. Therefore, when the number of failed engines reaches a certain level, it is necessary to terminate the task to avoid economic losses and safety hazards [20].

According to the system’s failure rules, determining appropriate mission-stopping conditions is the fundamental step to successfully balance the mission reliability and system safety indices. Due to its high application value, the research on mission-stopping strategies and related optimization models is gaining increasing interest. Numerous models have been built to examine the impact of mission-stopping techniques on the system operation process since the publication of the seminal paper by Myers [21]. The failure risk of safety-critical systems is primarily attributable to internal deterioration and hostile external environments. For systems with internal degradation, Zhao et al. [22] studied the multi-criteria mission-stopping policy using degradation and mission time. In [23,24,25], a mission-stopping plan taking into account a two-stage failure process, including a normal stage and a defect stage, is formulated depending on the degradation and the time in the defect stage. The former terminated the task when the level of system degradation exceeded the threshold, while the latter terminated the task when the fault stage length exceeded the threshold. Additionally, the two suspension mechanisms that strike a balance between task reliability and system survival probability are examined.

For systems operating in a shock environment, mission-stopping rules that consider shock damage have attracted considerable attention. Depending on the architecture of the system and the operating environment, numerous policies for optimally stopping the mission are provided. In [26], the optimal mission-stopping rules of binary-state systems in a shock environment were studied. Attempting to strike a balance between task reliability and system safety, Cha et al. [27] used the number of minimal repairs as the decision-making parameter of the task-stopping approach for some repairable systems. In addition to two-state systems, multi-state systems have also attracted increasing attention. The number of shocks encountered by the system was used as the task-stopping condition in [28,29,30]. The optimal mission-stopping strategies of balanced systems are studied in [31,32]. The above research considers only one-attempt missions. Nevertheless, in many cases, systems can try to accomplish the task multiple times [33] if the mission is crucial and there are no stringent constraints on time and resources. Task stopping in multiple attempts was first proposed by Levitin et al. [34]. The mission-stopping rules in the case of multiple attempts are studied for single-component systems [15,30,35] and multi-component systems.

It is worth noting that all of the above-mentioned models concentrated on the optimization of task termination strategies. Based on this, the joint optimization of task termination and other operational policies that may affect task reliability and system survivability has attracted widespread attention. According to the characteristics of thermal storage system, Levitin et al. studied the joint optimization of task termination strategies and component activation sequence of warm standby systems [36], as well as the joint optimization with the component load level [37]. In multi-state warm standby systems, task termination strategies can be jointly optimized with protective policies [19]. In addition to the warm standby systems, Peng [38] also studied the optimal routing plan and task termination strategy joint optimization problem of UAVs. In many practical applications, safety-critical systems need to complete a certain number of sub-tasks to make the whole task successful. Therefore, the joint optimization problem of sub-task allocation among units and task termination strategies [35] should be considered. The paper [39] studied the joint optimization problem of dynamic task termination strategies and inspection intervals. In existing research, when considering maintenance problems, it is assumed that the amount of maintenance resources is always sufficient [40,41,42,43,44]. However, due to cost restrictions, the number of spare parts may be limited [45]. Zhao et al. [46] used recursion algorithms to study the optimal task termination and spare parts allocation strategy considering partial task loss.

Despite the significant research on task termination and loading optimization, the existing literature mainly focuses on the design and optimization of a single task termination strategy. In the practical operation of engineering systems, however, multiple operational strategies are usually combined to reduce the system failure risk; loading level optimization, for example, can effectively extend the system lifetime and improve reliability and safety. The joint optimization of loading and stopping rules for multi-attempt missions has not been studied. Furthermore, the existing literature on loading is mainly devoted to reliability modeling of loading-degradation dependence and ignores the optimization of the loading level. To further advance the modeling of task termination and loading optimization, this study considers the joint optimization of loading and stopping rules.

3. Problem Modeling

3.1. Deterioration Modeling

To characterize the system failure risk, the deterioration evolution of the system under consideration is modeled first. The deterioration is influenced by both its age and its loading level. Given system age u and loading x, the degradation process is denoted by $\{G (u, x), u \geq 0\}$ with monotonically increasing degradation paths to reflect stochastically growing practical deterioration processes such as wear and crack. It is shown that the inverse Gaussian process is a limiting compound Poisson process with different jump size distributions [47]. Such property justifies the use of an inverse Gaussian process to model systems with monotone degradation paths. To this end, this study assumes that $\{G (u, x), u \geq 0\}$ follows the inverse Gaussian process because of the nice mathematical features and physical implications of the inverse Gaussian process. In accordance with the inverse Gaussian process property, $G (u, x)$ has independent increments that follow the inverse Gaussian distribution. For $u < v$ , the degradation increment in the time interval $(u, v)$ , $G (v, x) - G (u, x)$ follows an inverse Gaussian distribution, i.e.,

(1) $G (v, x) - G (u, x) \sim IG (Λ (v, x) - Λ (u, x), η (x) {[Λ (v, x) - Λ (u, x)]}^{2}), \forall v > u,$

where the volatility parameter is denoted by $η (x)$ , and the function $Λ (u, x)$ is a monotone increasing function with $Λ (0, x) = 0$ . This research makes the simplifying assumption that $Λ (u, x)$ is linear, i.e., $Λ (u, x) = u m (x)$ , where $m (x)$ is the function indicating the influence of loading on degradation. In this case, by Equation (1), $G (u, x)$ follows the inverse Gaussian distribution $IG (u m (x), η (x) {(u m (x))}^{2})$ , whose probability density function (PDF) is given as

(2) $f_{G (u, x)} (g) = \sqrt{\frac{η (x) {(u m (x))}^{2}}{2 π g^{3}}} exp [- \frac{η (x) {(g - (u m (x)))}^{2}}{2 g}],$

and the cumulative distribution function (CDF) is

(3) $F_{G (u, x)} (g) = Φ [\sqrt{\frac{η (x)}{g}} (g - u m (x))] + e^{2 η (x) u m (x)} Φ [- \sqrt{\frac{η (x)}{g}} (g + u m (x))],$

where $Φ (\cdot)$ is the standard normal CDF.
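Equations (2) and (3) can be evaluated directly. The following is a minimal sketch using only the Python standard library; the function names and the numerical values are illustrative assumptions and are not taken from the paper. The factor $e^{2 η (x) u m (x)}$ can overflow for large arguments, so the sketch is only suitable for moderate parameter values.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def ig_pdf(g, u, m_x, eta_x):
    """Equation (2): density of G(u, x) at g > 0, with mean u * m(x)."""
    mean = u * m_x
    scale = math.sqrt(eta_x * mean**2 / (2.0 * math.pi * g**3))
    return scale * math.exp(-eta_x * (g - mean) ** 2 / (2.0 * g))

def ig_cdf(g, u, m_x, eta_x):
    """Equation (3): P(G(u, x) <= g) for g > 0."""
    mean = u * m_x
    a = math.sqrt(eta_x / g)
    return (std_normal_cdf(a * (g - mean))
            + math.exp(2.0 * eta_x * mean) * std_normal_cdf(-a * (g + mean)))

# Illustrative values only: age u, loading effect m(x), volatility eta(x).
u, m_x, eta_x = 2.0, 1.5, 2.0
print(ig_cdf(3.0, u, m_x, eta_x))
```

Integrating `ig_pdf` numerically over $(0, g)$ and comparing the result with `ig_cdf` is a quick consistency check of the two formulas.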

Failure of the system happens when the level of deterioration is greater than a certain threshold, denoted by h. As a consequence of this, when the loading level x is taken into consideration, the failure time $U (h, x)$ can be defined as the initial hitting time of the degradation process $G (u, x)$ in relation to the failure threshold h. The CDF of the function $U (h, x)$ can be expressed as

(4) $\begin{matrix} F_{U (h, x)} (u) & = P (G (u, x) > h) \\ = Φ [\sqrt{\frac{η (x)}{h}} (u m (x) - h)] - e^{2 η (x) u m (x)} Φ [- \sqrt{\frac{η (x)}{h}} (h + u m (x))], \end{matrix}$

and the PDF of $U (h, x)$ is given as

(5) $\begin{matrix} f_{U (h, x)} (u) & = \frac{d F_{U (h, x)} (u)}{d u} \\ = \sqrt{\frac{η (x)}{h}} [ϕ (\sqrt{\frac{η (x)}{h}} (u m (x) - h)) + e^{2 η (x) u m (x)} ϕ (- \sqrt{\frac{η (x)}{h}} (u m (x) + h))] \\ - 2 η (x) e^{2 η (x) u m (x)} Φ (- \sqrt{\frac{η (x)}{h}} (u m (x) + h)), \end{matrix}$

where $ϕ (\cdot)$ is the standard normal PDF.
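The first-passage CDF in Equation (4) and its derivative can be checked numerically. The sketch below is a hedged illustration with hypothetical function names and parameter values. Applying the chain rule to Equation (4) produces an overall factor $m (x)$ in the derivative with respect to u; the sketch keeps that factor, so it coincides with the bracketed expression in Equation (5) when $m (x) = 1$.

```python
import math

def std_normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def first_passage_cdf(u, h, m_x, eta_x):
    """Equation (4): P(U(h, x) <= u) = P(G(u, x) > h)."""
    a = math.sqrt(eta_x / h)
    mean = u * m_x
    return (std_normal_cdf(a * (mean - h))
            - math.exp(2.0 * eta_x * mean) * std_normal_cdf(-a * (h + mean)))

def first_passage_pdf(u, h, m_x, eta_x):
    """Derivative of Equation (4) in u; the leading m_x is the chain-rule factor."""
    a = math.sqrt(eta_x / h)
    mean = u * m_x
    growth = math.exp(2.0 * eta_x * mean)
    bracket = a * (std_normal_pdf(a * (mean - h))
                   + growth * std_normal_pdf(-a * (mean + h)))
    return m_x * (bracket - 2.0 * eta_x * growth * std_normal_cdf(-a * (mean + h)))

# Compare the analytic density with a central finite difference of the CDF.
u, h, m_x, eta_x, d = 2.0, 3.0, 1.5, 2.0, 1e-5
numeric = (first_passage_cdf(u + d, h, m_x, eta_x)
           - first_passage_cdf(u - d, h, m_x, eta_x)) / (2.0 * d)
print(first_passage_pdf(u, h, m_x, eta_x), numeric)
```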

3.2. Loading and Stopping Policies

Based on the deterioration characteristics, this section proposes dynamic loading and stopping policies. The considered system must remain operational for $τ$ prior to the required deadline $\hat{τ}$ $(τ < \hat{τ})$ to accomplish the mission. By the required deadline, the system can try to accomplish the task multiple times. Let K be the maximum number of allowed tries. We discuss the following two typical types of MSR. MSRI: the continuous operational time must surpass a threshold; MSRII: the cumulative operating time must exceed a threshold.

SS is the probability that no catastrophic failure takes place while the task is being carried out. To improve the SS of the system under consideration, a task may be aborted at an inspection if the level of degradation exceeds a given level, and a rescue process of duration $φ (u)$ is then initiated, where u is the time at which the task is aborted. Let $ε$ denote the time beyond which completing the task takes less time than the rescue operation; specifically, $φ (u) + u > τ, \forall u > ε$ . Therefore, if $u > ε$ , the task is not halted.
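For a given rescue-duration function, the time $ε$ can be found numerically. The sketch below assumes that $u + φ (u)$ is continuous and increasing in u, which the text does not state explicitly, and uses a hypothetical linear rescue function purely for illustration.

```python
def latest_stop_time(tau, rescue_duration, tol=1e-9):
    """Return epsilon, the root of u + rescue_duration(u) = tau on [0, tau].

    Assumes u + rescue_duration(u) is continuous and increasing in u.
    """
    if rescue_duration(0.0) >= tau:
        return 0.0  # a rescue never finishes sooner than the mission itself
    lo, hi = 0.0, tau
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid + rescue_duration(mid) > tau:
            hi = mid  # stopping at mid would already end after tau
        else:
            lo = mid
    return lo

# Hypothetical rescue function phi(u) = 1 + 0.5 u and mission duration tau = 10.
eps = latest_stop_time(10.0, lambda u: 1.0 + 0.5 * u)
print(eps)  # close to 6, since 6 + (1 + 3) = 10
```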

At each try, the stopping decision is governed by a dynamic degradation level. To be explicit, in the k-th try, the thresholds for degradation are given by $h_{k}$ . Let $U (h_{k}, x)$ be the random time from the beginning of the k-th try to the halting instant if threshold $h_{k}$ is taken under loading x, which is the first passage time of $G (u, x)$ with respect to the threshold $h_{k}$ . By Equation (4), the distribution function of $U (h_{k}, x)$ , $F_{U (h_{k}, x)} (u)$ , is given as

(6) $F_{U (h_{k}, x)} (u) = Φ [\sqrt{\frac{η (x)}{h_{k}}} (u m (x) - h_{k})] - e^{2 η (x) u m (x)} Φ [- \sqrt{\frac{η (x)}{h_{k}}} (h_{k} + u m (x))] .$
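Because the degradation paths are monotone, the event $U (h_{k}, x) \leq u$ is the same as $G (u, x) > h_{k}$ , so Equation (6) can be cross-checked by sampling $G (u, x)$ directly. The sketch below does this with the Michael-Schucany-Haas transformation for inverse Gaussian variates; the sampler choice, function names, and parameter values are assumptions made for illustration and are not part of the paper.

```python
import math
import random

def std_normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def sample_inverse_gaussian(mean, shape, rng):
    """One IG(mean, shape) draw via the Michael-Schucany-Haas transformation."""
    y = rng.gauss(0.0, 1.0) ** 2
    x = (mean + mean**2 * y / (2.0 * shape)
         - mean / (2.0 * shape) * math.sqrt(4.0 * mean * shape * y + (mean * y) ** 2))
    return x if rng.random() <= mean / (mean + x) else mean**2 / x

def stop_probability_exact(u, h_k, m_x, eta_x):
    """Equation (6): probability that threshold h_k is reached by time u."""
    a = math.sqrt(eta_x / h_k)
    mean = u * m_x
    return (std_normal_cdf(a * (mean - h_k))
            - math.exp(2.0 * eta_x * mean) * std_normal_cdf(-a * (h_k + mean)))

def stop_probability_mc(u, h_k, m_x, eta_x, n_runs=20000, seed=1):
    """Monte Carlo estimate of P(G(u, x) > h_k) with G(u, x) ~ IG(um, eta (um)^2)."""
    rng = random.Random(seed)
    mean = u * m_x
    shape = eta_x * mean**2
    hits = sum(sample_inverse_gaussian(mean, shape, rng) > h_k for _ in range(n_runs))
    return hits / n_runs

# Illustrative values only.
print(stop_probability_exact(2.0, 2.5, 1.0, 2.0), stop_probability_mc(2.0, 2.5, 1.0, 2.0))
```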

4. Risk Analysis under MSRI

In this section, we will analyze the suggested dynamic stopping rules in terms of their effects on mission reliability and system survivability for MSRI. Due to the complicated degradation process involving multiple attempts, we employ a numerical technique based on event transitions to assess the probability of a successful mission and system survival.

4.1. Mission Reliability under MSRI

MSRI stipulates that the system must operate without interruption for longer than a given threshold $τ$ $(τ < \hat{τ})$ . Let $R_{k}$ denote the random remaining time for mission execution before the k-th try, and let $α_{k} (r | h_{k}, x_{k})$ denote the PDF of $R_{k}$ given the halting threshold $h_{k}$ and the loading level $x_{k}$ . When a new system is put into operation at time 0, the remaining mission execution time is $\hat{τ}$ . Therefore, by the definition of $α_{k} (r | h_{k}, x_{k})$ , the probability mass function of $R_{0}$ is

(7) $α_{0} (r | h_{0}, x_{0}) = \left\{\begin{matrix} 1, & r = \hat{τ} \\ 0, & \text{otherwise} \end{matrix}\right.$

Time spent on the $(k - 1)$ -th try is $U (h_{k - 1}, x_{k - 1}) + φ [U (h_{k - 1}, x_{k - 1})]$ , given the halting threshold in the $(k - 1)$ -th try, $h_{k - 1}$ . At the outset of the k-th mission, the remaining time for its execution is thus expressed as

(8) $R_{k} = R_{k - 1} - U (h_{k - 1}, x_{k - 1}) - φ [U (h_{k - 1}, x_{k - 1})] .$
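As a small worked example of Equation (8), the sketch below updates the remaining time after each stopped and rescued try. The stop times and the linear rescue function are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def remaining_times(deadline, stop_times, rescue_duration):
    """Apply Equation (8) repeatedly: R_k = R_{k-1} - u_{k-1} - phi(u_{k-1})."""
    remaining = [deadline]
    for u in stop_times:
        remaining.append(remaining[-1] - u - rescue_duration(u))
    return remaining

# Deadline 30, tries stopped after 4 and 5 time units, phi(u) = 1 + 0.5 u.
print(remaining_times(30.0, [4.0, 5.0], lambda u: 1.0 + 0.5 * u))
# -> [30.0, 23.0, 14.5]
```

The first try consumes 4 + 3 = 7 time units and the second 5 + 3.5 = 8.5, which leaves 14.5 time units for any further try.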

If the mission is stopped at time u and the system survives the rescue operation, then the remaining time before the $(k - 1)$ -th try is $r + u + φ (u)$ . Thus, the probability density function $α_{k} (r | h_{k}, x_{k})$ can be obtained iteratively as

(9) $\begin{matrix} α_{k} (r | h_{k}, x_{k}) & = \int_{0}^{ε} α_{k - 1} (r + u + φ (u) | h_{k - 1}, x_{k - 1}) \\ \times P (G (u + φ (u), x_{k - 1}) < h | G (u, x_{k - 1}) = h_{k - 1}) f_{U (h_{k - 1}, x_{k - 1})} (u) d u . \end{matrix}$

Because the inverse Gaussian process has stationary increments, $G (u + φ (u), x_{k - 1}) - G (u, x_{k - 1})$ follows $I G (φ (u) m (x_{k - 1}), η (x_{k - 1}) {[φ (u) m (x_{k - 1})]}^{2})$ . Then, by the CDF in Equation (3), we have

(10) $\begin{matrix} P (G (u + φ (u), x_{k - 1}) < h | G (u, x_{k - 1}) = h_{k - 1}) \\ = P (G (u + φ (u), x_{k - 1}) - G (u, x_{k - 1}) < h - h_{k - 1} | G (u, x_{k - 1}) = h_{k - 1}) \\ = Φ [\sqrt{\frac{η (x_{k - 1})}{h - h_{k - 1}}} (h - h_{k - 1} - φ (u) m (x_{k - 1}))] \\ + e^{2 η (x_{k - 1}) φ (u) m (x_{k - 1})} Φ [- \sqrt{\frac{η (x_{k - 1})}{h - h_{k - 1}}} (h - h_{k - 1} + φ (u) m (x_{k - 1}))] . \end{matrix}$
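The rescue success probability is the inverse Gaussian CDF of Equation (3) evaluated at the remaining margin $h - h_{k - 1}$ , with the rescue duration $φ (u)$ playing the role of the elapsed time. The sketch below illustrates this under hypothetical parameter values; the function names and the constant rescue durations are assumptions.

```python
import math

def std_normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def rescue_success_probability(u, h, h_stop, m_x, eta_x, rescue_duration):
    """P(G(u + phi(u)) < h | G(u) = h_stop), using the IG CDF of Equation (3)."""
    margin = h - h_stop              # degradation the system can still absorb
    mean = rescue_duration(u) * m_x  # expected degradation during the rescue
    a = math.sqrt(eta_x / margin)
    return (std_normal_cdf(a * (margin - mean))
            + math.exp(2.0 * eta_x * mean) * std_normal_cdf(-a * (margin + mean)))

# Longer rescues leave more time for the degradation to cross h.
for duration in (1.0, 2.0, 3.0):
    p = rescue_success_probability(1.0, 5.0, 3.0, 1.0, 2.0, lambda u, d=duration: d)
    print(duration, round(p, 3))
```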

Using Equation (5), $f_{U (h_{k - 1}, x_{k - 1})} (u)$ can be given as

(11) $\begin{matrix} f_{U (h_{k - 1}, x_{k - 1})} (u) & = \sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} [\begin{matrix} ϕ (\sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} (u m (x_{k - 1}) - h_{k - 1})) \\ + e^{2 η (x_{k - 1}) u m (x_{k - 1})} ϕ (- \sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} (u m (x_{k - 1}) + h_{k - 1})) \end{matrix}] \\ - 2 η (x_{k - 1}) e^{2 η (x_{k - 1}) u m (x_{k - 1})} Φ (- \sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} (u m (x_{k - 1}) + h_{k - 1})) . \end{matrix}$

Then the probability density function $α_{k} (r | h_{k}, x_{k})$ can be recursively derived as

(12) $\begin{matrix} α_{k} (r | h_{k}, x_{k}) & = \int_{0}^{ε} α_{k - 1} (r + u + φ (u) | h_{k - 1}, x_{k - 1}) \\ \times \{\begin{matrix} Φ [\sqrt{\frac{η (x_{k - 1})}{h - h_{k - 1}}} (h - h_{k - 1} - φ (u) m (x_{k - 1}))] \\ + e^{2 η (x_{k - 1}) φ (u) m (x_{k - 1})} Φ [- \sqrt{\frac{η (x_{k - 1})}{h - h_{k - 1}}} (h - h_{k - 1} + φ (u) m (x_{k - 1}))] \end{matrix}\} \\ \times \{\begin{matrix} \sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} [\begin{matrix} ϕ (\sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} (u m (x_{k - 1}) - h_{k - 1})) \\ + e^{2 η (x_{k - 1}) u m (x_{k - 1})} ϕ (- \sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} (u m (x_{k - 1}) + h_{k - 1})) \end{matrix}] \\ - 2 η (x_{k - 1}) e^{2 η (x_{k - 1}) u m (x_{k - 1})} Φ (- \sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} (u m (x_{k - 1}) + h_{k - 1})) \end{matrix}\} d u . \end{matrix}$

If the task is completed on the k-th try before the time $\hat{τ}$ , the task is considered successful according to MSRI. In this case the mission is not stopped and the system survives the k-th attempt: the rescue initiation time in the k-th try, $U (h_{k}, x_{k})$ , is larger than $ε$ , and the system lifetime $U (h, x_{k})$ is larger than the task duration $τ$ , i.e., $U (h_{k}, x_{k}) > ε$ and $U (h, x_{k}) > τ$ . Then, under MSRI, the probability of completing the task on the k-th try is

(13) $\begin{matrix} R_{I, k} (h_{k}, x_{k}) = \int_{τ}^{\hat{τ}} P (U (h_{k}, x_{k}) > ε, U (h, x_{k}) > τ) α_{k} (r | h_{k}, x_{k}) d r . \end{matrix}$

According to the multi-criteria stopping strategy that has been proposed, the probability that the mission will not be stopped at the k-th try and that the system would survive the mission is provided as

(14) $\begin{matrix} P (U (h_{k}, x_{k}) > ε, U (h, x_{k}) > τ) & = \int_{0}^{h_{k}} P (U (h, x_{k}) > τ | G (ε, x_{k}) = g) f_{G (ε, x_{k})} (g) d g \\ & = \int_{0}^{h_{k}} P (G (τ, x_{k}) < h | G (ε, x_{k}) = g) f_{G (ε, x_{k})} (g) d g \\ & = \int_{0}^{h_{k}} P (G (τ, x_{k}) - G (ε, x_{k}) < h - g) f_{G (ε, x_{k})} (g) d g . \end{matrix}$

By the properties of the IG process, the degradation increment over the time interval $(ε, τ)$ follows the inverse Gaussian distribution $I G ((τ - ε) m (x_{k}), η (x_{k}) {[(τ - ε) m (x_{k})]}^{2})$ . Based on the distribution function of the deterioration increment in Equation (3), we have

(15) $\begin{matrix} P (G (τ, x_{k}) - G (ε, x_{k}) < h - g) \\ = Φ [\sqrt{\frac{η (x_{k})}{h - g}} (h - g - (τ - ε) m (x_{k}))] + e^{2 η (x_{k}) (τ - ε) m (x_{k})} Φ [- \sqrt{\frac{η (x_{k})}{h - g}} (h - g + (τ - ε) m (x_{k}))] . \end{matrix}$

The likelihood that the mission continues without being halted and that the system remains operational throughout the mission can be calculated using Equations (14) and (15) as

(16) $\begin{matrix} P (U (h_{k}, x_{k}) > ε, U (h, x_{k}) > τ) \\ = \int_{0}^{h_{k}} \{\begin{matrix} Φ [\sqrt{\frac{η (x_{k})}{h - g}} (h - g - (τ - ε) m (x_{k}))] \\ + e^{2 η (x_{k}) (τ - ε) m (x_{k})} Φ [- \sqrt{\frac{η (x_{k})}{h - g}} (h - g + (τ - ε) m (x_{k}))] \end{matrix}\} f_{G (ε, x_{k})} (g) d g . \end{matrix}$
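The integral over the degradation level g at time $ε$ has no closed form, but it is one-dimensional and easy to approximate. The sketch below uses a plain midpoint rule; the quadrature scheme, function names, and parameter values are illustrative assumptions rather than the numerical method used in the paper.

```python
import math

def std_normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def ig_pdf(g, mean, eta_x):
    """Density of an IG increment with the given mean (Equation (2) form)."""
    scale = math.sqrt(eta_x * mean**2 / (2.0 * math.pi * g**3))
    return scale * math.exp(-eta_x * (g - mean) ** 2 / (2.0 * g))

def ig_cdf(g, mean, eta_x):
    """CDF of an IG increment with the given mean (Equation (3) form)."""
    a = math.sqrt(eta_x / g)
    return (std_normal_cdf(a * (g - mean))
            + math.exp(2.0 * eta_x * mean) * std_normal_cdf(-a * (g + mean)))

def continue_and_survive_probability(h, h_k, eps, tau, m_x, eta_x, n=2000):
    """Midpoint-rule value of P(U(h_k, x) > eps, U(h, x) > tau)."""
    step = h_k / n
    total = 0.0
    for i in range(n):
        g = (i + 0.5) * step                               # level at time eps
        survive = ig_cdf(h - g, (tau - eps) * m_x, eta_x)  # no failure on (eps, tau)
        total += survive * ig_pdf(g, eps * m_x, eta_x)
    return total * step

# Illustrative values: h = 8, h_k = 3, eps = 2, tau = 4, m(x) = 1, eta(x) = 2.
print(continue_and_survive_probability(8.0, 3.0, 2.0, 4.0, 1.0, 2.0))
```

When the failure threshold h is very large, the survival factor tends to one and the result approaches $F_{G (ε, x_{k})} (h_{k})$ , which gives a simple sanity check.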

Substituting Equation (16) into Equation (13), the probability that the task is finished on the k-th try under MSRI is calculated as

(17) $\begin{matrix} R_{I, k} (h_{k}, x_{k}) & = \int_{τ}^{\hat{τ}} \int_{0}^{h_{k}} \{\begin{matrix} Φ [\sqrt{\frac{η (x_{k})}{h - g}} (h - g - (τ - ε) m (x_{k}))] \\ + e^{2 η (x_{k}) (τ - ε) m (x_{k})} Φ [- \sqrt{\frac{η (x_{k})}{h - g}} (h - g + (τ - ε) m (x_{k}))] \end{matrix}\} \\ \times f_{G (ε, x_{k})} (g) α_{k} (r | h_{k}, x_{k}) d g d r . \end{matrix}$

It is important to note that the number of tries that can be made before a mission is considered complete cannot exceed the allowed number. Using the law of total probability, the probability that a task will be completed successfully according to MSRI can be calculated as follows

(18) $\begin{matrix} R_{I} (h_{1 \times K}, x_{1 \times K}) = \sum_{k = 1}^{K} R_{I, k} (h_{k}, x_{k}) \\ = \sum_{k = 1}^{K} \int_{τ}^{\hat{τ}} \int_{0}^{h_{k}} \{\begin{matrix} Φ [\sqrt{\frac{η (x_{k})}{h - g}} (h - g - (τ - ε) m (x_{k}))] \\ + e^{2 η (x_{k}) (τ - ε) m (x_{k})} Φ [- \sqrt{\frac{η (x_{k})}{h - g}} (h - g + (τ - ε) m (x_{k}))] \end{matrix}\} f_{G (ε, x_{k})} (g) α_{k} (r | h_{k}, x_{k}) d g d r . \end{matrix}$
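The multi-attempt reliability can also be approximated by simulation, which is useful as an independent check of the recursive formulas. The sketch below is a simplified Monte Carlo illustration and not the event-transition method of this section. It rests on several assumptions that should be read as such: each try starts from zero degradation, the degradation is observed on a time grid of width dt (with $τ$ and $ε$ multiples of dt), the rescue duration is strictly positive, and the functions m, eta, and phi passed in are hypothetical choices.

```python
import math
import random

def sample_inverse_gaussian(mean, shape, rng):
    """One IG(mean, shape) draw via the Michael-Schucany-Haas transformation."""
    y = rng.gauss(0.0, 1.0) ** 2
    x = (mean + mean**2 * y / (2.0 * shape)
         - mean / (2.0 * shape) * math.sqrt(4.0 * mean * shape * y + (mean * y) ** 2))
    return x if rng.random() <= mean / (mean + x) else mean**2 / x

def degrade(level, duration, m_x, eta_x, rng):
    """Add one increment with mean duration * m(x), as in Equation (1)."""
    mean = duration * m_x
    return level + sample_inverse_gaussian(mean, eta_x * mean**2, rng)

def run_attempt(rng, tau, eps, h, h_k, m_x, eta_x, phi, dt):
    """Simulate one try and return (outcome, time consumed)."""
    level = 0.0  # assumption: each try starts from zero degradation
    for i in range(1, int(round(tau / dt)) + 1):
        t = i * dt
        level = degrade(level, dt, m_x, eta_x, rng)
        if level >= h:
            return "failed", t
        if level >= h_k and t <= eps:
            level = degrade(level, phi(t), m_x, eta_x, rng)  # rescue phase
            return ("rescued" if level < h else "failed"), t + phi(t)
    return "completed", tau

def simulate_msri(K, tau, deadline, eps, h, stops, loads, m, eta, phi,
                  n_runs=2000, dt=0.25, seed=0):
    """Monte Carlo estimate of the MSRI mission success probability."""
    successes = 0
    for run in range(n_runs):
        rng = random.Random(seed + run)  # one reproducible stream per run
        remaining = deadline
        for k in range(K):
            if remaining < tau:
                break  # too little time left for another try
            outcome, used = run_attempt(rng, tau, eps, h, stops[k],
                                        m(loads[k]), eta(loads[k]), phi, dt)
            if outcome == "completed":
                successes += 1
            if outcome != "rescued":
                break
            remaining -= used
    return successes / n_runs

# Hypothetical inputs: m(x) = x, constant eta, phi(u) = 1 + 0.5 u.
args = dict(tau=4.0, deadline=12.0, eps=2.0, h=8.0, stops=[3.0, 3.0],
            loads=[1.0, 1.0], m=lambda x: x, eta=lambda x: 2.0,
            phi=lambda u: 1.0 + 0.5 * u)
print(simulate_msri(K=1, **args), simulate_msri(K=2, **args))
```

Because every run uses its own random stream, the first try is identical for K = 1 and K = 2, so allowing a second try can only add successes in this sketch.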

4.2. System Survivability under MSRI

The system survives only if it completes the mission or a rescue operation. Consequently, system survivability equals the sum of mission reliability and the chance of rescue success. If the system survives after k tries before $\hat{τ}$ , then (1) the time left before the k-th try, s, is greater than $τ$ ; (2) the mission is halted at the k-th attempt and the rescue operation is successful, i.e., $U (h_{k}, x_{k}) < ε$ and $U (h_{k}, x_{k}) + φ (U (h_{k}, x_{k})) < U (h, x_{k})$ ; and (3) the remaining mission time after the k-th rescue is less than $τ$ , so that no further try is made, i.e., $s - (U (h_{k}, x_{k}) + φ (U (h_{k}, x_{k}))) < τ$ . Thus, the likelihood of system survival after k trials is given by

(19) $\begin{matrix} S_{I, k} (h_{k}, x_{k}) = \int_{τ}^{\hat{τ}} P (U (h_{k}, x_{k}) < ε, s - τ < U (h_{k}, x_{k}) + φ (U (h_{k}, x_{k})) < U (h, x_{k})) α_{k} (s | h_{k}, x_{k}) d s . \end{matrix}$

Given the remaining mission execution time s before the k-th try, using the property of independent and stationary increments of the inverse Gaussian process, the probability that the system survives k tries is given as

(20) $\begin{matrix} P (U (h_{k}, x_{k}) < ε, s - τ < U (h_{k}, x_{k}) + φ (U (h_{k}, x_{k})) < U (h, x_{k})) \\ = \int_{ψ}^{ε} P (G (u + φ (u), x_{k}) < h | G (u, x_{k}) = h_{k}) f_{U (h_{k}, x_{k})} (u) d u, \end{matrix}$

where $ψ$ is determined by $s - ψ - φ (ψ) = τ$ . Analogously to Equation (10), the conditional probability that the rescue operation succeeds is given by

(21) $\begin{matrix} P (G (u + φ (u), x_{k}) < h | G (u, x_{k}) = h_{k}) \\ = Φ [\sqrt{\frac{η (x_{k})}{h - h_{k}}} (h - h_{k} - φ (u) m (x_{k}))] + e^{2 η (x_{k}) φ (u) m (x_{k})} Φ [- \sqrt{\frac{η (x_{k})}{h - h_{k}}} (h - h_{k} + φ (u) m (x_{k}))] . \end{matrix}$

By Equation (21), the likelihood that a system would survive k trials according to MSRI in Equation (19) is given by

(22) $\begin{matrix} S_{I, k} (h_{k}, x_{k}) & = \int_{τ}^{\hat{τ}} \int_{ψ}^{ε} \{\begin{matrix} Φ [\sqrt{\frac{η (x_{k})}{h - h_{k}}} (h - h_{k} - φ (u) m (x_{k}))] \\ + e^{2 η (x_{k}) φ (u) m (x_{k})} Φ [- \sqrt{\frac{η (x_{k})}{h - h_{k}}} (h - h_{k} + φ (u) m (x_{k}))] \end{matrix}\} \\ \times \{\begin{matrix} \sqrt{\frac{η (x_{k})}{h_{k}}} [\begin{matrix} ϕ (\sqrt{\frac{η (x_{k})}{h_{k}}} (u m (x_{k}) - h_{k})) \\ + e^{2 η (x_{k}) u m (x_{k})} ϕ (- \sqrt{\frac{η (x_{k})}{h_{k}}} (u m (x_{k}) + h_{k})) \end{matrix}] \\ - 2 η (x_{k}) e^{2 η (x_{k}) u m (x_{k})} Φ (- \sqrt{\frac{η (x_{k})}{h_{k}}} (u m (x_{k}) + h_{k})) \end{matrix}\} α_{k} (s | h_{k}, x_{k}) d u d s . \end{matrix}$

Because the events that the system survives after exactly k attempts are mutually exclusive for different k, the system’s survivability according to MSRI can be calculated as

(23) $\begin{matrix} S_{I} (h_{1 \times K}, x_{1 \times K}) = \sum_{k = 1}^{K} S_{I, k} (h_{k}, x_{k}) \\ = \sum_{k = 1}^{K} \int_{τ}^{\hat{τ}} \int_{ψ}^{ε} \{\begin{matrix} Φ [\sqrt{\frac{η (x_{k})}{h - h_{k}}} (h - h_{k} - φ (u) m (x_{k}))] \\ + e^{2 η (x_{k}) φ (u) m (x_{k})} Φ [- \sqrt{\frac{η (x_{k})}{h - h_{k}}} (h - h_{k} + φ (u) m (x_{k}))] \end{matrix}\} \\ \times \{\begin{matrix} \sqrt{\frac{η (x_{k})}{h_{k}}} [\begin{matrix} ϕ (\sqrt{\frac{η (x_{k})}{h_{k}}} (u m (x_{k}) - h_{k})) \\ + e^{2 η (x_{k}) u m (x_{k})} ϕ (- \sqrt{\frac{η (x_{k})}{h_{k}}} (u m (x_{k}) + h_{k})) \end{matrix}] \\ - 2 η (x_{k}) e^{2 η (x_{k}) u m (x_{k})} Φ (- \sqrt{\frac{η (x_{k})}{h_{k}}} (u m (x_{k}) + h_{k})) \end{matrix}\} α_{k} (s | h_{k}, x_{k}) d u d s . \end{matrix}$

5. Mission Reliability and System Survivability under MSRII

Taking into account the dynamic loading and stopping policies, this section calculates the mission reliability and the system survivability under MSRII. A recursive approach is used to evaluate mission reliability and system survivability.

5.1. Mission Reliability under MSRII

Under MSRII, the cumulative operational time must surpass a specified threshold $τ$ $(τ < \hat{τ})$ . Let $R_{k}$ and $W_{k}$ represent the remaining time for mission execution and the cumulative operational time prior to the k-th attempt, respectively. Let ${\tilde{α}}_{k} (r, w | {\tilde{h}}_{k}, {\tilde{x}}_{k})$ be the joint PDF of $R_{k}$ and $W_{k}$ given the stopping threshold ${\tilde{h}}_{k}$ and the loading level ${\tilde{x}}_{k}$ . A new system begins operation with remaining task execution time $\hat{τ}$ and cumulative operating time 0 prior to the first attempt. According to the definition of ${\tilde{α}}_{k} (r, w | {\tilde{h}}_{k}, {\tilde{x}}_{k})$ , the joint probability mass function of $R_{0}$ and $W_{0}$ may be expressed as

(24) ${\tilde{α}}_{0} (r, w | {\tilde{h}}_{0}, {\tilde{x}}_{0}) = \left\{\begin{matrix} 1, & r = \hat{τ}, w = 0 \\ 0, & \text{otherwise} \end{matrix}\right.$

Let ${\tilde{α}}_{k - 1} (r, w | {\tilde{h}}_{k - 1}, {\tilde{x}}_{k - 1})$ denote the joint probability density function of the remaining time and the cumulative operating time prior to the $(k - 1)$ -th try. If the mission is halted at time u, the remaining mission execution time and the cumulative operating time before the $(k - 1)$ -th attempt are $r + u + φ (u)$ and $w - u$ , respectively. Thus, ${\tilde{α}}_{k} (r, w | {\tilde{h}}_{k}, {\tilde{x}}_{k})$ can be obtained recursively as follows

(25) $\begin{matrix} {\tilde{α}}_{k} (r, w | {\tilde{h}}_{k}, {\tilde{x}}_{k}) = \int_{0}^{r_{k - 1}} {\tilde{α}}_{k - 1} (r + u + φ (u), w - u | {\tilde{h}}_{k - 1}, {\tilde{x}}_{k - 1}) \\ \times P (G (u + φ (u), {\tilde{x}}_{k - 1}) < h | G (u, {\tilde{x}}_{k - 1}) = {\tilde{h}}_{k - 1}) f_{U ({\tilde{h}}_{k - 1}, {\tilde{x}}_{k - 1})} (u) d u \\ = \int_{0}^{r_{k - 1}} {\tilde{α}}_{k - 1} (r + u + φ (u), w - u | {\tilde{h}}_{k - 1}, {\tilde{x}}_{k - 1}) P (G (φ (u), {\tilde{x}}_{k - 1}) < h - {\tilde{h}}_{k - 1}) f_{U ({\tilde{h}}_{k - 1}, {\tilde{x}}_{k - 1})} (u) d u . \end{matrix}$

On the basis of the distribution of the inverse Gaussian process and the PDF of the inverse Gaussian process’s first passage time, we have

(26) $\begin{matrix} {\tilde{α}}_{k} (r, w | {\tilde{h}}_{k}, {\tilde{x}}_{k}) & = \int_{0}^{r_{k - 1}} {\tilde{α}}_{k - 1} (r + u + φ (u), w - u | {\tilde{h}}_{k - 1}, {\tilde{x}}_{k - 1}) \\ \times \{\begin{matrix} Φ [\sqrt{\frac{η ({\tilde{x}}_{k - 1})}{h - {\tilde{h}}_{k - 1}}} (h - {\tilde{h}}_{k - 1} - φ (u) m ({\tilde{x}}_{k - 1}))] \\ + e^{2 η ({\tilde{x}}_{k - 1}) φ (u) m ({\tilde{x}}_{k - 1})} Φ [- \sqrt{\frac{η ({\tilde{x}}_{k - 1})}{h - {\tilde{h}}_{k - 1}}} (h - {\tilde{h}}_{k - 1} + φ (u) m ({\tilde{x}}_{k - 1}))] \end{matrix}\} \\ \times \{\begin{matrix} \sqrt{\frac{η ({\tilde{x}}_{k - 1})}{{\tilde{h}}_{k - 1}}} [\begin{matrix} ϕ (\sqrt{\frac{η ({\tilde{x}}_{k - 1})}{{\tilde{h}}_{k - 1}}} (u m ({\tilde{x}}_{k - 1}) - {\tilde{h}}_{k - 1})) \\ + e^{2 η ({\tilde{x}}_{k - 1}) u m ({\tilde{x}}_{k - 1})} ϕ (- \sqrt{\frac{η ({\tilde{x}}_{k - 1})}{{\tilde{h}}_{k - 1}}} (u m ({\tilde{x}}_{k - 1}) + {\tilde{h}}_{k - 1})) \end{matrix}] \\ - 2 η ({\tilde{x}}_{k - 1}) e^{2 η ({\tilde{x}}_{k - 1}) u m ({\tilde{x}}_{k - 1})} Φ (- \sqrt{\frac{η ({\tilde{x}}_{k - 1})}{{\tilde{h}}_{k - 1}}} (u m ({\tilde{x}}_{k - 1}) + {\tilde{h}}_{k - 1})) \end{matrix}\} d u . \end{matrix}$

Under MSRII, the following conditions are met if the task succeeds after k tries by time $\hat{τ}$ : (1) the remaining mission execution time before the k-th try must exceed the required mission completion time; (2) the cumulative mission time must exceed $τ$ after the k-th try. There are two possible scenarios for task success. In scenario 1, the cumulative mission time reaches $τ$ before the halting time $U ({\tilde{h}}_{k}, {\tilde{x}}_{k})$ . In scenario 2, the cumulative mission time is less than $τ$ at the halting time $U ({\tilde{h}}_{k}, {\tilde{x}}_{k})$ but reaches $τ$ before system failure. The probability of task success after k tries according to MSRII is then expressed as a function of the halting thresholds and loading levels:

(27) $\begin{matrix} R_{I I, k} ({\tilde{h}}_{k}, {\tilde{x}}_{k}) & = \int_{τ - r}^{\hat{τ}} \int_{0}^{τ} P (U ({\tilde{h}}_{k}, {\tilde{x}}_{k}) > ε, U (h, {\tilde{x}}_{k}) > τ - w) {\tilde{α}}_{k} (r, w | {\tilde{h}}_{k}, {\tilde{x}}_{k}) d w d r \\ & = \int_{τ - r}^{\hat{τ}} \int_{0}^{τ} P (U ({\tilde{h}}_{k}, {\tilde{x}}_{k}) > max (ε, τ - w)) {\tilde{α}}_{k} (r, w | {\tilde{h}}_{k}, {\tilde{x}}_{k}) d w d r \\ & + \int_{τ - r}^{\hat{τ}} \int_{0}^{τ} P (ε < U ({\tilde{h}}_{k}, {\tilde{x}}_{k}) < τ - w, U (h, {\tilde{x}}_{k}) > τ - w) {\tilde{α}}_{k} (r, w | {\tilde{h}}_{k}, {\tilde{x}}_{k}) d w d r . \end{matrix}$

The first term in Equation (27) represents the probability that the mission is finished before the stopping threshold is reached. Using the first hitting time distribution of the inverse Gaussian process, we obtain

(28) $\begin{matrix} P (U ({\tilde{h}}_{k}, {\tilde{x}}_{k}) > max (ε, τ - w)) = 1 - Φ [\sqrt{\frac{η ({\tilde{x}}_{k})}{{\tilde{h}}_{k}}} (max (ε, τ - w) m ({\tilde{x}}_{k}) - {\tilde{h}}_{k})] \\ + e^{2 η ({\tilde{x}}_{k}) max (ε, τ - w) m ({\tilde{x}}_{k})} Φ [- \sqrt{\frac{η ({\tilde{x}}_{k})}{{\tilde{h}}_{k}}} ({\tilde{h}}_{k} + max (ε, τ - w) m ({\tilde{x}}_{k}))] . \end{matrix}$
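As a numerical illustration of Equation (28), the survival function of the first hitting time can be evaluated with the standard library alone. The sketch below assumes the form of Equation (28), with mean degradation $t \, m (x)$ accumulated by time $t$. The function names and the parameter values in the usage line are hypothetical and are not taken from the paper.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def hitting_time_survival(t: float, level: float, m: float, eta: float) -> float:
    """P(U(level) > t) in the form of Equation (28).

    t     : elapsed time in the attempt (stands in for max(eps, tau - w))
    level : degradation level to be reached (the stopping threshold)
    m     : mean degradation per unit time at the chosen loading, m(x)
    eta   : inverse Gaussian parameter at the chosen loading, eta(x)
    """
    if level <= 0.0:
        raise ValueError("level must be positive")
    lam = t * m  # mean degradation accumulated by time t
    scale = math.sqrt(eta / level)
    first = 1.0 - std_normal_cdf(scale * (lam - level))
    tail = std_normal_cdf(-scale * (level + lam))
    # exp(2*eta*lam) can overflow while the normal tail underflows, so the
    # product is formed in log space; it is treated as 0 if the tail is 0.
    second = math.exp(2.0 * eta * lam + math.log(tail)) if tail > 0.0 else 0.0
    # Clamp to [0, 1] to absorb floating-point rounding.
    return min(1.0, max(0.0, first + second))


# Hypothetical values: threshold 20, mean rate 1.5 per hour, eta = 1.
print(hitting_time_survival(t=10.0, level=20.0, m=1.5, eta=1.0))
```

At $t = 0$ the two terms sum to one, which is a quick sanity check on any implementation of this formula.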

The second component of Equation (27) represents the likelihood that the mission will be completed after reaching the stopping threshold, which can be expressed as

(29) $\begin{matrix} P (ε < U ({\tilde{h}}_{k}, {\tilde{x}}_{k}) < τ - w, U (h, {\tilde{x}}_{k}) > τ - w) \\ = \int_{ε}^{τ - w} P \{G (τ - w, {\tilde{x}}_{k}) - G (ε, {\tilde{x}}_{k}) < h - {\tilde{h}}_{k}\} f_{U ({\tilde{h}}_{k}, {\tilde{x}}_{k})} (u) d u \\ = \int_{ε}^{τ - w} \{\begin{matrix} Φ [\sqrt{\frac{η ({\tilde{x}}_{k})}{h - {\tilde{h}}_{k}}} (h - {\tilde{h}}_{k} - (τ - w - ε) m ({\tilde{x}}_{k}))] \\ + e^{2 η ({\tilde{x}}_{k}) (τ - w - ε) m ({\tilde{x}}_{k})} Φ [- \sqrt{\frac{η ({\tilde{x}}_{k})}{h - {\tilde{h}}_{k}}} (h - {\tilde{h}}_{k} + (τ - w - ε) m ({\tilde{x}}_{k}))] \end{matrix}\} f_{U ({\tilde{h}}_{k}, {\tilde{x}}_{k})} (u) d u . \end{matrix}$

On the basis of the expressions in Equations (28) and (29), the chance that the task will be completed on the k-th attempt under MSRII is given as follows

(30) $\begin{matrix} R_{I I, k} ({\tilde{h}}_{k}, {\tilde{x}}_{k}) = \int_{τ - r}^{\hat{τ}} \int_{0}^{τ} \{\begin{matrix} 1 - Φ [\sqrt{\frac{η ({\tilde{x}}_{k})}{{\tilde{h}}_{k}}} (max (ε, τ - w) m ({\tilde{x}}_{k}) - {\tilde{h}}_{k})] \\ + e^{2 η ({\tilde{x}}_{k}) max (ε, τ - w) m ({\tilde{x}}_{k})} Φ [- \sqrt{\frac{η ({\tilde{x}}_{k})}{{\tilde{h}}_{k}}} ({\tilde{h}}_{k} + max (ε, τ - w) m ({\tilde{x}}_{k}))] \end{matrix}\} \\ \times {\tilde{α}}_{k} (r, w | {\tilde{h}}_{k}, {\tilde{x}}_{k}) d w d r \\ + \int_{τ - r}^{\hat{τ}} \int_{0}^{τ} \int_{ε}^{τ - w} \{\begin{matrix} Φ [\sqrt{\frac{η ({\tilde{x}}_{k})}{h - {\tilde{h}}_{k}}} (h - {\tilde{h}}_{k} - (τ - w - ε) m ({\tilde{x}}_{k}))] \\ + e^{2 η ({\tilde{x}}_{k}) (τ - w - ε) m ({\tilde{x}}_{k})} Φ [- \sqrt{\frac{η ({\tilde{x}}_{k})}{h - {\tilde{h}}_{k}}} (h - {\tilde{h}}_{k} + (τ - w - ε) m ({\tilde{x}}_{k}))] \end{matrix}\} \\ \times {\tilde{α}}_{k} (r, w | {\tilde{h}}_{k}, {\tilde{x}}_{k}) \\ \times \{\begin{matrix} \sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} [\begin{matrix} ϕ (\sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} (u m (x_{k - 1}) - h_{k - 1})) \\ + e^{2 η (x_{k - 1}) u m (x_{k - 1})} ϕ (- \sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} (u m (x_{k - 1}) + h_{k - 1})) \end{matrix}] \\ - 2 η (x_{k - 1}) e^{2 η (x_{k - 1}) u m (x_{k - 1})} Φ (- \sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} (u m (x_{k - 1}) + h_{k - 1})) \end{matrix}\} d u d r d w . \end{matrix}$

Because the events that the task is completed at different attempts are mutually exclusive, the law of total probability gives the mission reliability under MSRII as a function of the stopping thresholds and loading levels

(31) $\begin{matrix} R_{I I} ({\tilde{h}}_{1 \times K}, {\tilde{x}}_{1 \times K}) = \sum_{k = 1}^{K} R_{I I, k} ({\tilde{h}}_{k}, {\tilde{x}}_{k}) \\ = \sum_{k = 1}^{K} \int_{τ - r}^{\hat{τ}} \int_{0}^{τ} \{\begin{matrix} 1 - Φ [\sqrt{\frac{η ({\tilde{x}}_{k})}{{\tilde{h}}_{k}}} (max (ε, τ - w) m ({\tilde{x}}_{k}) - {\tilde{h}}_{k})] \\ + e^{2 η ({\tilde{x}}_{k}) max (ε, τ - w) m ({\tilde{x}}_{k})} Φ [- \sqrt{\frac{η ({\tilde{x}}_{k})}{{\tilde{h}}_{k}}} ({\tilde{h}}_{k} + max (ε, τ - w) m ({\tilde{x}}_{k}))] \end{matrix}\} \\ \times {\tilde{α}}_{k} (r, w | {\tilde{h}}_{k}, {\tilde{x}}_{k}) d w d r \\ + \int_{τ - r}^{\hat{τ}} \int_{0}^{τ} \int_{ε}^{τ - w} \{\begin{matrix} Φ [\sqrt{\frac{η ({\tilde{x}}_{k})}{h - {\tilde{h}}_{k}}} (h - {\tilde{h}}_{k} - (τ - w - ε) m ({\tilde{x}}_{k}))] \\ + e^{2 η ({\tilde{x}}_{k}) (τ - w - ε) m ({\tilde{x}}_{k})} Φ [- \sqrt{\frac{η ({\tilde{x}}_{k})}{h - {\tilde{h}}_{k}}} (h - {\tilde{h}}_{k} + (τ - w - ε) m ({\tilde{x}}_{k}))] \end{matrix}\} \\ \times {\tilde{α}}_{k} (r, w | {\tilde{h}}_{k}, {\tilde{x}}_{k}) \\ \times \{\begin{matrix} \sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} [\begin{matrix} ϕ (\sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} (u m (x_{k - 1}) - h_{k - 1})) \\ + e^{2 η (x_{k - 1}) u m (x_{k - 1})} ϕ (- \sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} (u m (x_{k - 1}) + h_{k - 1})) \end{matrix}] \\ - 2 η (x_{k - 1}) e^{2 η (x_{k - 1}) u m (x_{k - 1})} Φ (- \sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} (u m (x_{k - 1}) + h_{k - 1})) \end{matrix}\} d u d r d w . \end{matrix}$

5.2. System Survivability under MSRII

Because multiple attempts are allowed, if the system survives the mission after k attempts and no further attempts are made before the time threshold $\hat{τ}$ , we can conclude that (1) the task is terminated at the k-th attempt and the rescue succeeds, i.e., $U ({\tilde{h}}_{k}, x_{k}) < ε$ and $U ({\tilde{h}}_{k}, x_{k}) + φ (U ({\tilde{h}}_{k}, x_{k})) < U (h, x_{k})$ ; (2) the time remaining after the k-th rescue operation is less than the mission time still to be completed, i.e., $r - φ (U ({\tilde{h}}_{k}, x_{k})) < τ - w$ . The likelihood that the system will still be operational after k attempts under MSRII is given by the product of the probability density functions of $S_{k}$ and $U_{k}$ .

(32) $\begin{matrix} S_{I I, k} ({\tilde{h}}_{k}, {\tilde{x}}_{k}) & = \int_{τ - u}^{\hat{τ}} \int_{0}^{τ} P (\begin{matrix} U ({\tilde{h}}_{k}, {\tilde{x}}_{k}) < ε, U ({\tilde{h}}_{k}, {\tilde{x}}_{k}) + φ (U ({\tilde{h}}_{k}, {\tilde{x}}_{k})) < U (h, {\tilde{x}}_{k}), \\ r - φ (U ({\tilde{h}}_{k}, {\tilde{x}}_{k})) < τ - w \end{matrix}) \\ \times {\tilde{α}}_{k} (r, w | {\tilde{h}}_{k}, {\tilde{x}}_{k}) d r d w . \end{matrix}$

From the stationary and independent increment property and the degrading increment distribution function in Equation (4), it follows that

(33) $\begin{matrix} P (U ({\tilde{h}}_{k}, {\tilde{x}}_{k}) < ε, U ({\tilde{h}}_{k}, {\tilde{x}}_{k}) + φ (U ({\tilde{h}}_{k}, {\tilde{x}}_{k})) < U (h, {\tilde{x}}_{k}), r - φ (U ({\tilde{h}}_{k}, {\tilde{x}}_{k})) < τ - w) \\ = \int_{ψ}^{ε} P \{U ({\tilde{h}}_{k}, {\tilde{x}}_{k}) + φ (U ({\tilde{h}}_{k}, {\tilde{x}}_{k})) < U (h, {\tilde{x}}_{k}) | U ({\tilde{h}}_{k}, {\tilde{x}}_{k}) = u\} f_{U ({\tilde{h}}_{k}, {\tilde{x}}_{k})} (u) d u \\ = \int_{ψ}^{ε} P \{G (φ (u), {\tilde{x}}_{k}) < h - {\tilde{h}}_{k}\} f_{U ({\tilde{h}}_{k}, {\tilde{x}}_{k})} (u) d u, \end{matrix}$

where $ψ$ satisfies $r - φ (ψ) = τ - w$. Using the inverse Gaussian process distribution, we obtain

(34) $\begin{matrix} P (U ({\tilde{h}}_{k}, {\tilde{x}}_{k}) < ε, U ({\tilde{h}}_{k}, {\tilde{x}}_{k}) + φ (U ({\tilde{h}}_{k}, {\tilde{x}}_{k})) < U (h, {\tilde{x}}_{k}), r - φ (U ({\tilde{h}}_{k}, {\tilde{x}}_{k})) < τ - w) \\ = \int_{ψ}^{ε} \{\begin{matrix} Φ [\sqrt{\frac{η ({\tilde{x}}_{k})}{h - {\tilde{h}}_{k}}} (h - {\tilde{h}}_{k} - φ (u) m ({\tilde{x}}_{k}))] \\ + e^{2 η ({\tilde{x}}_{k}) φ (u) m ({\tilde{x}}_{k})} Φ [- \sqrt{\frac{η ({\tilde{x}}_{k})}{h - {\tilde{h}}_{k}}} (h - {\tilde{h}}_{k} + φ (u) m ({\tilde{x}}_{k}))] \end{matrix}\} f_{U ({\tilde{h}}_{k}, {\tilde{x}}_{k})} (u) d u . \end{matrix}$
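For a concrete case, assume the linear rescue time $φ (t) = 0.5 t$ that is used in the numerical example of Section 7. The defining relation for $ψ$ then has a closed-form solution:

$r - 0.5 ψ = τ - w \;\Rightarrow\; ψ = 2 (r - τ + w) ,$

so the integral over $u$ in Equation (34) covers the stopping times $2 (r - τ + w) < u < ε$ and is empty whenever $2 (r - τ + w) \geq ε$. This is only an illustration under the assumed linear $φ$; for a nonlinear rescue time, $ψ$ would generally have to be found numerically.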

Similarly, the system survivability under MSRII can be calculated using the law of total probability as follows

(35) $\begin{matrix} S_{I I} ({\tilde{h}}_{1 \times K}, {\tilde{x}}_{1 \times K}) = \sum_{k = 1}^{K} \int_{τ - u}^{\hat{τ}} \int_{0}^{τ} \int_{ψ}^{ε} \{\begin{matrix} Φ [\sqrt{\frac{η ({\tilde{x}}_{k})}{h - {\tilde{h}}_{k}}} (h - {\tilde{h}}_{k} - φ (u) m ({\tilde{x}}_{k}))] \\ + e^{2 η ({\tilde{x}}_{k}) φ (u) m ({\tilde{x}}_{k})} Φ [- \sqrt{\frac{η ({\tilde{x}}_{k})}{h - {\tilde{h}}_{k}}} (h - {\tilde{h}}_{k} + φ (u) m ({\tilde{x}}_{k}))] \end{matrix}\} \\ \times f_{U ({\tilde{h}}_{k}, {\tilde{x}}_{k})} (u) {\tilde{α}}_{k} (r, w | {\tilde{h}}_{k}, {\tilde{x}}_{k}) d u d r d w \\ + \int_{τ - r}^{\hat{τ}} \int_{0}^{τ} \{\begin{matrix} 1 - Φ [\sqrt{\frac{η ({\tilde{x}}_{k})}{{\tilde{h}}_{k}}} (max (ε, τ - w) m ({\tilde{x}}_{k}) - {\tilde{h}}_{k})] \\ + e^{2 η ({\tilde{x}}_{k}) max (ε, τ - w) m ({\tilde{x}}_{k})} Φ [- \sqrt{\frac{η ({\tilde{x}}_{k})}{{\tilde{h}}_{k}}} ({\tilde{h}}_{k} + max (ε, τ - w) m ({\tilde{x}}_{k}))] \end{matrix}\} {\tilde{α}}_{k} (r, w | {\tilde{h}}_{k}, {\tilde{x}}_{k}) d w d r \\ + \int_{τ - r}^{\hat{τ}} \int_{0}^{τ} \int_{ε}^{τ - w} \{\begin{matrix} Φ [\sqrt{\frac{η ({\tilde{x}}_{k})}{h - {\tilde{h}}_{k}}} (h - {\tilde{h}}_{k} - (τ - w - ε) m ({\tilde{x}}_{k}))] \\ + e^{2 η ({\tilde{x}}_{k}) (τ - w - ε) m ({\tilde{x}}_{k})} Φ [- \sqrt{\frac{η ({\tilde{x}}_{k})}{h - {\tilde{h}}_{k}}} (h - {\tilde{h}}_{k} + (τ - w - ε) m ({\tilde{x}}_{k}))] \end{matrix}\} {\tilde{α}}_{k} (r, w | {\tilde{h}}_{k}, {\tilde{x}}_{k}) \\ \times \{\begin{matrix} \sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} [\begin{matrix} ϕ (\sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} (u m (x_{k - 1}) - h_{k - 1})) \\ + e^{2 η (x_{k - 1}) u m (x_{k - 1})} ϕ (- \sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} (u m (x_{k - 1}) + h_{k - 1})) \end{matrix}] \\ - 2 η (x_{k - 1}) e^{2 η (x_{k - 1}) u m (x_{k - 1})} Φ (- \sqrt{\frac{η (x_{k - 1})}{h_{k - 1}}} (u m (x_{k - 1}) + h_{k - 1})) \end{matrix}\} d u d r d w . \end{matrix}$

6. Optimizing the Stopping Thresholds and Loading

Mission reliability increases with the stopping thresholds, whereas system survivability decreases because the failure risk during task execution grows. The stopping thresholds and loading levels should therefore be chosen to balance mission reliability against system survivability. This study formulates the optimization problem using the commonly employed expected-cost criterion, which incorporates the task failure cost and the system failure cost. Let $c_{m}$ and $c_{s}$ represent the cost of task failure and system failure, respectively. Based on the mission reliability and system survivability formulations, the expected total cost during task execution under MSRI can be expressed as

(36) $E (C_{I} (h_{1 \times K}, x_{1 \times K})) = c_{m} (1 - R_{I} (h_{1 \times K}, x_{1 \times K})) + c_{s} (1 - S_{I} (h_{1 \times K}, x_{1 \times K})) .$

The expected total cost during task execution under MSRII can be expressed as

(37) $E (C_{I I} ({\tilde{h}}_{1 \times K}, {\tilde{x}}_{1 \times K})) = c_{m} (1 - R_{I I} ({\tilde{h}}_{1 \times K}, {\tilde{x}}_{1 \times K})) + c_{s} (1 - S_{I I} ({\tilde{h}}_{1 \times K}, {\tilde{x}}_{1 \times K})) .$
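The cost criterion is straightforward to evaluate once mission reliability and system survivability are available. The sketch below shows the cost formula and a brute-force search over a small grid of (stopping threshold, loading level) pairs for a single try. The functions `toy_reliability` and `toy_survivability` are hypothetical placeholders standing in for the integrals derived above, and the grid values are illustrative only.

```python
import math
from itertools import product
from typing import Callable, Iterable, Tuple

Policy = Tuple[float, float]  # (stopping threshold, loading level)


def expected_cost(reliability: float, survivability: float,
                  c_m: float, c_s: float) -> float:
    """Expected total cost: c_m * (1 - R) + c_s * (1 - S)."""
    return c_m * (1.0 - reliability) + c_s * (1.0 - survivability)


def grid_search(thresholds: Iterable[float], loadings: Iterable[float],
                reliability_fn: Callable[[float, float], float],
                survivability_fn: Callable[[float, float], float],
                c_m: float, c_s: float) -> Tuple[Policy, float]:
    """Return the grid point with the smallest expected cost."""
    best_policy, best_cost = None, float("inf")
    for policy in product(thresholds, loadings):
        cost = expected_cost(reliability_fn(*policy),
                             survivability_fn(*policy), c_m, c_s)
        if cost < best_cost:
            best_policy, best_cost = policy, cost
    return best_policy, best_cost


# Hypothetical placeholders, NOT the paper's formulas: reliability rises
# and survivability falls as the threshold and the loading increase.
def toy_reliability(h: float, x: float) -> float:
    return 1.0 - math.exp(-0.1 * h * x)


def toy_survivability(h: float, x: float) -> float:
    return math.exp(-0.002 * h * x)


best_policy, best_cost = grid_search(
    thresholds=[10.0, 15.0, 20.0, 25.0], loadings=[0.5, 1.0, 1.5],
    reliability_fn=toy_reliability, survivability_fn=toy_survivability,
    c_m=120.0, c_s=1200.0)
print(best_policy, round(best_cost, 2))
```

An exhaustive search is used here only because the illustrative grid is tiny; with K tries the policy space grows quickly and a more economical search would be needed.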

To evaluate the recursive expressions for mission reliability and system survivability, the following discretized forward procedure is constructed. Rather than solving the backward equations presented above, this forward procedure computes mission reliability and system survivability more conveniently. The pseudo-code of this procedure is given in Algorithm 1.

Algorithm 1: Recursive algorithm to determine the mission reliability and system survivability

Set $R_{I, 0} (h_{0}, x_{0}) = 0, S_{I, 0} (h_{0}, x_{0}) = 0, α_{0} (r | h_{0}, x_{0}) = 1$ (the initial value of mission reliability and system survivability)
Set $d r = (\hat{τ} - τ) / M, d s = (\hat{τ} - τ) / \hat{M}, d h = h / \tilde{M}$
Set $α_{0} (r | h_{0}, x_{0}) = 1$ (the initial value of $α_{k} (r | h_{k}, x_{k})$)
For $k = 1, 2, \ldots, N$
  For $r = τ, τ + d r, \ldots, \hat{τ} - d r, \hat{τ}$
    For $h_{k} = 0, d h, \ldots, h - d h, h$
      For $x_{k} = τ, τ + d s, \ldots, \hat{τ} - d s, \hat{τ}$
        Obtain $α_{k} (r | h_{k}, x_{k})$ based on Equation (12);
        Add $P (U (h_{k}, x_{k}) > ε, U (h, x_{k}) > τ) α_{k} (r | h_{k}, x_{k}) d r$ to $R_{I, k - 1} (h, x)$;
        Obtain $P (U (h_{k}, x_{k}) < ε, t - τ < U (h_{k}, x_{k}) + φ (U (h_{k}, x_{k})) < U (h, x_{k}))$ based on Equation (14);
        Add $P (U (h_{k}, x_{k}) < ε, t - τ < U (h_{k}, x_{k}) + φ (U (h_{k}, x_{k})) < U (h, x_{k})) α_{k} (s | h_{k}, x_{k}) d s$ to $S_{I, k - 1} (h, x)$.
      End
    End
  End
End
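As an independent cross-check of a recursion such as Algorithm 1, mission reliability and system survivability can also be estimated by simulation. The following is a minimal Monte Carlo sketch for MSRI under stated assumptions; it is not the paper's procedure. Degradation is simulated on a time grid with inverse Gaussian increments (Michael-Schucany-Haas sampler), the system is as good as new after a successful rescue, degradation continues at the same rate during rescue, and a further try is made only if the remaining time is at least $τ$. All function names and parameter values are hypothetical.

```python
import math
import random


def sample_ig(rng: random.Random, mean: float, shape: float) -> float:
    """Draw from IG(mean, shape) with the Michael-Schucany-Haas method."""
    y = rng.gauss(0.0, 1.0) ** 2
    w = mean * y / (2.0 * shape)
    # Smaller root of the MSH quadratic, in a cancellation-free form.
    x = mean / (1.0 + w + math.sqrt(w * w + 2.0 * w))
    return x if rng.random() <= mean / (mean + x) else mean * mean / x


def ig_increment(rng, rate, eta, duration):
    """Degradation over `duration`: IG(lam, eta * lam**2), lam = rate * duration."""
    lam = rate * duration
    return sample_ig(rng, lam, eta * lam * lam) if lam > 0.0 else 0.0


def simulate_mission(rng, h_fail, thresholds, loadings, tau, tau_hat, eps,
                     rate_fn, eta, phi, dt=0.25):
    """Simulate one mission under MSRI; return (mission_success, system_survived)."""
    remaining = tau_hat
    steps = round(tau / dt)
    for h_abort, x in zip(thresholds, loadings):
        if remaining < tau:              # no time left for a full try
            return False, True
        level, aborted_at = 0.0, None
        for i in range(1, steps + 1):
            level += ig_increment(rng, rate_fn(x), eta, dt)
            if level >= h_fail:          # system failure during the task
                return False, False
            if level >= h_abort and i * dt < eps:
                aborted_at = i * dt      # stop the task and start the rescue
                break
        if aborted_at is None:           # task ran for the full duration tau
            return True, True
        rescue = phi(aborted_at)
        level += ig_increment(rng, rate_fn(x), eta, rescue)
        if level >= h_fail:              # system failure during the rescue
            return False, False
        remaining -= aborted_at + rescue
    return False, True                   # all tries used, system intact


def estimate(n_runs, seed=0, **kwargs):
    """Monte Carlo estimates of (mission reliability, system survivability)."""
    rng = random.Random(seed)
    results = [simulate_mission(rng, **kwargs) for _ in range(n_runs)]
    reliability = sum(ok for ok, _ in results) / n_runs
    survivability = sum(alive for _, alive in results) / n_runs
    return reliability, survivability


# Hypothetical parameters, chosen only to make the example run.
params = dict(h_fail=30.0, thresholds=(20.0, 20.0), loadings=(1.0, 1.0),
              tau=15.0, tau_hat=50.0, eps=10.0,
              rate_fn=lambda x: 1.5 * x, eta=1.0, phi=lambda t: 0.5 * t)
print(estimate(2000, **params))
```

Because threshold crossings are only detected at grid times, the estimates carry a discretization bias in addition to sampling error; a smaller `dt` reduces the former and more runs reduce the latter.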

7. Numerical Example

7.1. Background

In this section, we apply the newly established models to a cloud computing system, which consists of many hardware and software resources and can be used in numerous contexts. The system uses a network of remote servers to execute computations in concert with a group of virtual machines. The cloud computing system fails when its degradation exceeds a threshold of 30, which in turn leads to data destruction. The set of adjustable loading levels is $\{0.5, 1, 1.5\}$ . Given loading x, the deterioration process follows a homogeneous inverse Gaussian process with distribution $IG (10 u x, 1.3 x u^{2})$ , which captures the monotone degradation behavior. Assume that the decision-maker has 50 h to complete the computing assignment and that one computing task lasts 15 h. Each time the system degrades beyond a certain threshold, the computing task is terminated and a rescue is attempted, which takes time $φ (t) = 0.5 t$ . It follows that the maximal time in the task at which the task can still be terminated is 10 h: if a computing task has already run for 10 h, it will not be suspended, because finishing it takes no longer than the rescue. We employ a numerical integration method to evaluate the cloud computing system’s mission reliability and system survivability, and then examine the dynamic policy that yields the optimal loading and stopping rules.
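The 10 h figure is consistent with reading $ε$ as the time at which the rescue duration equals the remaining task time; this reading is an interpretation of the example, not a definition quoted from the model. With $φ (t) = 0.5 t$ and $τ = 15$ h,

$φ (ε) = τ - ε \;\Rightarrow\; 0.5 ε = 15 - ε \;\Rightarrow\; ε = 10 \text{ h} .$

After 10 h in the task, the remaining 5 h or less of work is shorter than the 5 h or more that a rescue would take.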

7.2. Optimal Termination Policies

In this section, we analyze the optimal task termination and loading rules under different mission success requirements. Furthermore, we explore how the optimal solution shifts with the allowable time and the task duration. It is assumed that a task failure costs 120 and a system failure costs 1200. The optimal termination and loading actions under MSRI are shown in Table 1 for a range of allowable times, task durations, and maximal numbers of tries. Table 1 shows that, for a fixed task duration and number of attempts, the termination threshold is non-decreasing in the allowable time. A probable reason is that, when only a limited amount of time remains, the initial few attempts should be terminated sooner in order to preserve time for the subsequent rescue procedure and later tries. With more allowable time, it is preferable to postpone stopping a task so that the task success probability improves. According to Table 1, the termination threshold also decreases with task duration for a constant number of attempts, since termination should be delayed until a later stage when the mission is short. When the task duration increases, it is optimal to stop early to maximize the system’s chances of survival. Given the maximum time and task duration, the task termination threshold is decreasing in the number of attempts, which saves rescue time for future tries.

Table 1 also shows that the optimal loading level is non-decreasing in the allowable time for a given task duration and number of attempts. This is because, when sufficient time is available for task execution, the decision-maker can lower the loading to reduce the system failure probability. The loading level is likewise non-decreasing in the number of attempts, given the allowable time and task duration: because the remaining time decreases as the number of attempts increases, the decision-maker should adopt a higher loading to increase the task success probability.

Table 2 displays the optimal termination and loading actions under MSRII and how they shift with the allowable time and task duration. Comparing Table 1 and Table 2 shows that the optimal loading level drops under MSRII because completed work accumulates across tries. It is therefore preferable to adopt a lower loading level in order to reduce the expected total cost. The termination threshold under MSRII is also lower than that under MSRI. MSRII allows the work completed during multiple tries to accumulate, which improves the mission success probability. As a result of the improved mission success probability, the task under MSRII can be terminated sooner.

We further consider several intuitive heuristic stopping and loading policies and compare their cost-saving performance against the optimal policy. Static stopping and dynamic loading policy (SSDL): the stopping decision is governed by a fixed threshold, and the loading depends on degradation and time in mission. Dynamic stopping and static loading policy (DSSL): the stopping decision depends on degradation and time in mission, and the loading is fixed. Static stopping and static loading policy (SSSL): the stopping and loading decisions are both independent of degradation and time in mission. Static loading policy (SL): under this benchmark policy, the loading is fixed and the mission is never stopped. The performance of these stopping and loading rules is reported in Table 3 as the percentage cost increase over the optimal policy for several mission failure costs. Among the heuristic policies, SSDL performs best due to its effective use of health condition and time in mission. As the mission failure cost increases, the performance of the static stopping and loading rules deteriorates, indicating that the superiority of the optimal policy is more prominent with higher penalty costs. The cost-saving performance of the SL policy improves with increasing mission failure cost because continuing the mission is more likely to be the preferred choice when mission failure is more expensive.
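The trends described above can be checked directly against the figures in Table 3. The short script below copies the percentage cost increases from the table and verifies the ranking of the heuristic policies and the direction of each trend; the variable and function names are illustrative.

```python
# Percentage cost increase over the optimal policy, copied from Table 3.
mission_failure_costs = [120, 240, 360, 480]
cost_increase = {
    "SSDL": [29, 40, 47, 58],
    "DSSL": [36, 51, 60, 67],
    "SSSL": [53, 62, 69, 80],
    "SL": [92, 90, 88, 83],
}


def is_increasing(values):
    """True if every value is strictly larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


def is_decreasing(values):
    """True if every value is strictly smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


# Rank the policies from smallest to largest cost increase for each c_m.
for j, c_m in enumerate(mission_failure_costs):
    ranking = sorted(cost_increase, key=lambda name: cost_increase[name][j])
    print(c_m, ranking)  # ['SSDL', 'DSSL', 'SSSL', 'SL'] for every c_m

# Direction of the trend as the mission failure cost grows.
for name, values in cost_increase.items():
    trend = "increasing" if is_increasing(values) else "decreasing"
    print(name, trend)
```

The ranking is the same at every cost level, and SL is the only policy whose gap to the optimal policy narrows as $c_{m}$ grows.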

8. Conclusions

This study explores a multiple-attempt task termination approach for two distinct types of mission success requirements, drawing on the practical engineering background of the system operation process and the features of tasks. Attempts can be repeated until the assignment is completed within the allotted time. The decision to abort is made in each try based on the amount of degradation and the time in mission. When the system degradation reaches the stopping threshold, the rescue operation is initiated and the current attempt ends. If the rescue is successful, the system returns to its initial, ideal state and tries again until either the task is finished, the system fails, or the maximum number of tries is reached. Mission reliability and system survivability over multiple tries are obtained from the recursive formulas that characterize the system state transition process. Finally, the models are illustrated with a numerical case study of a cloud computing system.

Several avenues remain for future research. First, the system considered here is subject only to internal degradation; the external shock environment is an additional factor affecting the risk of safety-critical systems and could be studied in the future. Second, the study assumes that the system is restored to a perfect condition after a successful rescue. In real-world engineering situations, however, the system may be only partially restored after being rescued, so the case of imperfect rescue deserves investigation. The joint optimization of the stopping and rescue policies is another avenue of investigation beyond the scope of this paper.

Data Availability Statement

No data is used in the study.

Acknowledgments

The author is grateful to editors and anonymous reviewers whose comments greatly improved the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Tables

Table 1

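Optimal termination thresholds and loading levels under MSRI. Each entry lists (stopping threshold, loading level) pairs for successive tries, separated by semicolons.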

(Allowable Time $\hat{τ}$ , Task Duration $τ$ )	Maximal Tries K = 1	Maximal Tries K = 2	Maximal Tries K = 3
(50 h, 15 h)	$(20.6, 0.5)$	$(21.0, 0.5; 19.7, 0.5)$	$(21.5, 0.5; 20.3, 0.5; 19.3, 1.0)$
(55 h, 15 h)	$(20.6, 0.5)$	$(21.5, 0.5; 20.7, 0.5)$	$(21.7, 0.5; 20.8, 0.5; 19.8, 1.0)$
(60 h, 15 h)	$(20.6, 0.5)$	$(22.5, 0.5; 22.0, 1.0)$	$(21.8, 0.5; 21.0, 0.5; 20.5, 1.0)$
(65 h, 15 h)	$(20.6, 0.5)$	$(23.3, 0.5; 23.0, 1.0)$	$(22.5, 0.5; 21.5, 0.5; 21.1, 1.0)$
(70 h, 15 h)	$(20.6, 0.5)$	$(24.1, 0.5; 23.1, 0.5)$	$(22.8, 0.5; 22.3, 0.5; 21.5, 1.0)$
(60 h, 15 h)	$(19.6, 0.5)$	$(22.8, 0.5; 22.3, 0.5)$	$(22.3, 0.5; 21.8, 0.5; 21.0, 1.0)$
(60 h, 20 h)	$(20.6, 0.5)$	$(22.5, 0.5; 22.0, 0.5)$	$(21.8, 0.5; 21.0, 0.5; 20.5, 1.0)$
(60 h, 25 h)	$(21.5, 1.0)$	$(21.8, 1.0; 21.8, 1.0)$	$(20.3, 1.0; 19.1, 1.0; 19.0, 1.5)$
(60 h, 30 h)	$(22.0, 1.0)$	$(21.3, 1.0; 20.5, 1.0)$	$(19.3, 1.0; 18.3, 1.0; 18.0, 1.5)$

Table 2

Optimal termination and loading policies under MSRII.

(Allowable Time $\hat{τ}$ , Task Duration $τ$ )	Maximal Tries K = 1	Maximal Tries K = 2	Maximal Tries K = 3
(50 h, 12 h)	$(20.6, 0.5)$	$(20.1, 0.5; 18.6, 0.5)$	$(20.5, 0.5; 19.5, 0.5; 18.8, 0.5)$
(55 h, 12 h)	$(20.6, 0.5)$	$(20.6, 0.5; 20.0, 0.5)$	$(20.8, 0.5; 20.1, 0.5; 18.8, 0.5)$
(60 h, 12 h)	$(20.6, 0.5)$	$(21.8, 0.5; 21.1, 0.5)$	$(21.0, 0.5; 20.5, 0.5; 20.0, 0.5)$
(65 h, 12 h)	$(20.6, 0.5)$	$(22.6, 0.5; 22.3, 0.5)$	$(21.6, 0.5; 20.6, 0.5; 20.3, 0.5)$
(70 h, 12 h)	$(20.6, 0.5)$	$(23.5, 0.5; 22.6, 0.5)$	$(22.0, 0.5; 21.6, 0.5; 20.6, 0.5)$
(60 h, 10 h)	$(19.6, 0.5)$	$(22.0, 0.5; 21.6, 0.5)$	$(21.5, 0.5; 21.0, 0.5; 20.3, 0.5)$
(60 h, 12 h)	$(20.6, 0.5)$	$(21.3, 0.5; 21.3, 0.5)$	$(21.0, 10.4; 20.5, 9.7; 20.0, 9.3)$
(60 h, 14 h)	$(21.5, 0.5)$	$(21.0, 0.5; 20.5, 0.5)$	$(19.5, 0.5; 18.7, 0.5; 18.3, 0.5)$
(60 h, 16 h)	$(22.0, 1.0)$	$(20.2, 0.5; 19.8, 0.5)$	$(18.7, 0.5; 17.7, 0.5; 17.3, 0.5)$

Table 3

Comparison of different policies with varying mission failure costs.

	$c_{m}$ = 120	$c_{m}$ = 240	$c_{m}$ = 360	$c_{m}$ = 480
	Cost Increase %	Cost Increase %	Cost Increase %	Cost Increase %
Optimal Policy	-	-	-	-
SSDL Policy	29	40	47	58
DSSL Policy	36	51	60	67
SSSL Policy	53	62	69	80
SL Policy	92	90	88	83

References

1. Yu, H.; Wu, X.; Wu, X. An extended object-oriented petri net model for mission reliability evaluation of phased-mission system with time redundancy. Reliab. Eng. Syst. Saf.; 2020; 197, 106786. [DOI: https://dx.doi.org/10.1016/j.ress.2019.106786]

2. Wang, G.; Peng, R.; Xing, L. Reliability evaluation of unrepairable k-out-of-n: G systems with phased-mission requirements based on record values. Reliab. Eng. Syst. Saf.; 2018; 178, pp. 191-197. [DOI: https://dx.doi.org/10.1016/j.ress.2018.06.009]

3. Qiu, Q.; Cui, L. Reliability evaluation based on a dependent two-stage failure process with competing failures. Appl. Math. Model.; 2018; 64, pp. 699-712. [DOI: https://dx.doi.org/10.1016/j.apm.2018.07.039]

4. Zhao, J.; Si, S.; Cai, Z.; Guo, P.; Zhu, W. Mission success probability optimization for phased-mission systems with repairable component modules. Reliab. Eng. Syst. Saf.; 2020; 195, 106750. [DOI: https://dx.doi.org/10.1016/j.ress.2019.106750]

5. Levitin, G.; Xing, L.; Johnson, B.W.; Dai, Y. Mission reliability, cost and time for cold standby computing systems with periodic backup. IEEE Trans. Comput.; 2015; 64, pp. 1043-1057. [DOI: https://dx.doi.org/10.1109/TC.2014.2315644]

6. Tian, G.; Fathollahi-Fard, A.M.; Ren, Y.; Li, Z.; Jiang, X. Multi-objective scheduling of priority-based rescue vehicles to extinguish forest fires using a multi-objective discrete gravitational search algorithm. Inf. Sci.; 2022; 608, pp. 578-596. [DOI: https://dx.doi.org/10.1016/j.ins.2022.06.052]

7. Fathollahi-Fard, A.M.; Hajiaghaei-Keshteli, M.; Tian, G.; Li, Z. An adaptive Lagrangian relaxation-based algorithm for a coordinated water supply and wastewater collection network design problem. Inf. Sci.; 2020; 512, pp. 1335-1359. [DOI: https://dx.doi.org/10.1016/j.ins.2019.10.062]

8. Tian, G.; Zhang, C.; Fathollahi-Fard, A.M.; Li, Z.; Zhang, C.; Jiang, Z. An enhanced social engineering optimizer for solving an energy-efficient disassembly line balancing problem based on bucket brigades and cloud theory. IEEE Trans. Ind. Inform.; 2022; [DOI: https://dx.doi.org/10.1109/TII.2022.3193866]

9. Fathollahi-Fard, A.M.; Dulebenets, M.A.; Tian, G.; Hajiaghaei-Keshteli, M. Sustainable supply chain network design. Environ. Sci. Pollut. Res.; 2022; [DOI: https://dx.doi.org/10.1007/s11356-022-18956-y]

10. Moosavi, J.; Fathollahi-Fard, A.M.; Dulebenets, M.A. Supply chain disruption during the COVID-19 pandemic: Recognizing potential disruption management strategies. Int. J. Disaster Risk Reduct.; 2022; 75, 102983. [DOI: https://dx.doi.org/10.1016/j.ijdrr.2022.102983]

11. Pasha, J.; Nwodu, A.L.; Fathollahi-Fard, A.M.; Tian, G.; Li, Z.; Wang, H.; Dulebenets, M.A. Exact and metaheuristic algorithms for the vehicle routing problem with a factory-in-a-box in multi-objective settings. Adv. Eng. Inform.; 2022; 52, 101623. [DOI: https://dx.doi.org/10.1016/j.aei.2022.101623]

12. Fathollahi-Fard, A.M.; Dulebenets, M.A.; Hajiaghaei-Keshteli, M.; Tavakkoli-Moghaddam, R.; Safaeian, M.; Mirzahosseinian, H. Two hybrid meta-heuristic algorithms for a dual-channel closed-loop supply chain network design problem in the tire industry under uncertainty. Adv. Eng. Inform.; 2021; 50, 101418. [DOI: https://dx.doi.org/10.1016/j.aei.2021.101418]

13. Levitin, G.; Finkelstein, M. Optimal mission abort policy with multiple shock number thresholds. Proc. Inst. Mech. Eng. Part O J. Risk Reliab.; 2018; 232, pp. 607-615. [DOI: https://dx.doi.org/10.1177/1748006X17751496]

14. Levitin, G.; Xing, L.; Dai, Y. Mission abort policy for systems with observable states of standby components. Risk Anal.; 2020; 40, pp. 1900-1912. [DOI: https://dx.doi.org/10.1111/risa.13532]

15. Levitin, G.; Finkelstein, M.; Xiang, Y. Optimal mission abort policies for repairable multistate systems performing multi-attempt mission. Reliab. Eng. Syst. Saf.; 2021; 209, 107497. [DOI: https://dx.doi.org/10.1016/j.ress.2021.107497]

16. Zhao, X.; Li, R.; Cao, S.; Qiu, Q. Joint modeling of loading and mission abort policies for systems operating in dynamic environments. Reliab. Eng. Syst. Saf.; 2023; 230, 108948. [DOI: https://dx.doi.org/10.1016/j.ress.2022.108948]

17. Yang, L.; Chen, Y.; Qiu, Q.; Wang, J. Risk control of mission-critical systems: Abort decision-makings integrating health and age conditions. IEEE Trans. Ind. Inform.; 2022; 18, pp. 6887-6894. [DOI: https://dx.doi.org/10.1109/TII.2022.3141416]

18. Qiu, Q.; Cui, L.; Wu, B. Dynamic mission abort policy for systems operating in a controllable environment with self-healing mechanism. Reliab. Eng. Syst. Saf.; 2020; 203, 107069. [DOI: https://dx.doi.org/10.1016/j.ress.2020.107069]

19. Zhao, X.; Chai, X.; Sun, J.; Qiu, Q. Joint optimization of mission abort and protective device selection policies for multistate systems. Risk Anal.; 2022; 42, pp. 2823-2834. [DOI: https://dx.doi.org/10.1111/risa.13869]

20. Qiu, Q.; Cui, L.; Gao, H.; Yi, H. Optimal allocation of units in sequential probability series systems. Reliab. Eng. Syst. Saf.; 2018; 225, pp. 351-363. [DOI: https://dx.doi.org/10.1016/j.ress.2017.09.011]

21. Myers, A. Probability of Loss Assessment of Critical k-Out-of-n: G Systems Having a Mission Abort Policy. IEEE Trans. Reliab.; 2009; 58, pp. 694-701. [DOI: https://dx.doi.org/10.1109/TR.2009.2026807]

22. Zhao, X.; Fan, Y.; Qiu, Q.; Chen, K. Multi-criteria mission abort policy for systems subject to two-stage degradation process. Eur. J. Oper. Res.; 2021; 295, pp. 233-245. [DOI: https://dx.doi.org/10.1016/j.ejor.2021.02.043]

23. Qiu, Q.; Cui, L. Gamma process based optimal mission abort policy. Reliab. Eng. Syst. Saf.; 2019; 190, 106496. [DOI: https://dx.doi.org/10.1016/j.ress.2019.106496]

24. Qiu, Q.; Cui, L. Optimal mission abort policy for systems subject to random shocks based on virtual age process. Reliab. Eng. Syst. Saf.; 2019; 189, pp. 11-20. [DOI: https://dx.doi.org/10.1016/j.ress.2019.04.010]

25. Qiu, Q.; Maillart, L.M.; Prokopyev, O.A.; Cui, L. Optimal Condition-Based Mission Abort Decisions. IEEE Trans. Reliab.; 2022; [DOI: https://dx.doi.org/10.1109/TR.2022.3172377]

26. Levitin, G.; Finkelstein, M. Optimal mission abort policy for systems in a random environment with variable shock rate. Reliab. Eng. Syst. Saf.; 2018; 169, pp. 11-17. [DOI: https://dx.doi.org/10.1016/j.ress.2017.07.017]

27. Cha, J.H.; Finkelstein, M.; Levitin, G. Optimal mission abort policy for partially repairable heterogeneous systems. Eur. J. Oper. Res.; 2018; 271, pp. 818-825. [DOI: https://dx.doi.org/10.1016/j.ejor.2018.06.032]

28. Levitin, G.; Xing, L.; Dai, Y. Optimal mission aborting in multistate systems with storage. Reliab. Eng. Syst. Saf.; 2022; 218, 108086. [DOI: https://dx.doi.org/10.1016/j.ress.2021.108086]

29. Levitin, G.; Xing, L.; Dai, Y. Mission aborting and system rescue for multi-state systems with arbitrary structure. Reliab. Eng. Syst. Saf.; 2022; 219, 108225. [DOI: https://dx.doi.org/10.1016/j.ress.2021.108225]

30. Levitin, G.; Finkelstein, M.; Xiang, Y. Optimal multi-attempt missions with cumulative effect. Reliab. Eng. Syst. Saf.; 2020; 203, 107091. [DOI: https://dx.doi.org/10.1016/j.ress.2020.107091]

31. Zhao, X.; Chai, X.; Sun, J.; Qiu, Q. Joint optimization of mission abort and component switching policies for multistate warm standby systems. Reliab. Eng. Syst. Saf.; 2021; 212, 107641. [DOI: https://dx.doi.org/10.1016/j.ress.2021.107641]

32. Wang, J.; Qiu, Q.; Wang, H.; Lin, C. Optimal condition-based preventive maintenance policy for balanced systems. Reliab. Eng. Syst. Saf.; 2021; 211, 107606. [DOI: https://dx.doi.org/10.1016/j.ress.2021.107606]

33. Qiu, Q.; Kou, M.; Chen, K.; Deng, Q.; Kang, F.; Lin, C. Optimal stopping problems for mission oriented systems considering time redundancy. Reliab. Eng. Syst. Saf.; 2021; 205, 107226. [DOI: https://dx.doi.org/10.1016/j.ress.2020.107226]

34. Levitin, G.; Finkelstein, M.; Huang, H.Z. Optimal abort rules for multiattempt missions. Risk Anal.; 2019; 39, pp. 2732-2743. [DOI: https://dx.doi.org/10.1111/risa.13371]

35. Levitin, G.; Finkelstein, M.; Xiang, Y. Optimal abort rules and subtask distribution in missions performed by multiple independent heterogeneous units. Reliab. Eng. Syst. Saf.; 2020; 199, 106920. [DOI: https://dx.doi.org/10.1016/j.ress.2020.106920]

36. Levitin, G.; Xing, L.; Dai, Y. Mission abort policy in heterogeneous nonrepairable 1-out-of-N warm standby systems. IEEE Trans. Reliab.; 2017; 67, pp. 342-354. [DOI: https://dx.doi.org/10.1109/TR.2017.2740330]

37. Levitin, G.; Xing, L.; Dai, Y. Co-optimization of state dependent loading and mission abort policy in heterogeneous warm standby systems. Reliab. Eng. Syst. Saf.; 2018; 172, pp. 151-158. [DOI: https://dx.doi.org/10.1016/j.ress.2017.12.010]

38. Peng, R. Joint routing and aborting optimization of cooperative unmanned aerial vehicles. Reliab. Eng. Syst. Saf.; 2018; 177, pp. 131-137. [DOI: https://dx.doi.org/10.1016/j.ress.2018.05.004]

39. Zhao, X.; Sun, J.; Qiu, Q.; Chen, K. Optimal inspection and mission abort policies for systems subject to degradation. Eur. J. Oper. Res.; 2021; 292, pp. 610-621. [DOI: https://dx.doi.org/10.1016/j.ejor.2020.11.015]

40. Yang, A.; Qiu, Q.; Zhu, M.; Cui, L.; Chen, W.; Chen, J. Condition-based maintenance strategy for redundant systems with arbitrary structures using improved reinforcement learning. Reliab. Eng. Syst. Saf.; 2022; 225, 108643. [DOI: https://dx.doi.org/10.1016/j.ress.2022.108643]

41. Qiu, Q.; Liu, B.; Lin, C.; Wang, J. Availability analysis and maintenance optimization for multiple failure mode systems considering imperfect repair. Proc. Inst. Mech. Eng. Part O J. Risk Reliab.; 2021; 235, pp. 982-997. [DOI: https://dx.doi.org/10.1177/1748006X211012792]

42. Qiu, Q.; Cui, L.; Dong, Q. Preventive maintenance policy of single-unit systems based on shot-noise process. Qual. Reliab. Eng. Int.; 2019; 35, pp. 550-560. [DOI: https://dx.doi.org/10.1002/qre.2420]

43. Shang, L.; Qiu, Q.; Wang, X. Random periodic replacement models after the expiry of 2D-warranty. Comput. Ind. Eng.; 2022; 164, 107885. [DOI: https://dx.doi.org/10.1016/j.cie.2021.107885]

44. Wang, J.; Qiu, Q.; Wang, H. Joint optimization of condition-based and age-based replacement policy and inventory policy for a two-unit series system. Reliab. Eng. Syst. Saf.; 2021; 205, 107251. [DOI: https://dx.doi.org/10.1016/j.ress.2020.107251]

45. Zhao, X.; Lv, Z.; Qiu, Q.; Wu, Y. Designing two-level rescue depot location and dynamic rescue policies for unmanned vehicles. Reliab. Eng. Syst. Saf.; 2023; 233, 109119. [DOI: https://dx.doi.org/10.1016/j.ress.2023.109119]

46. Zhao, X.; Dai, Y.; Qiu, Q.; Wu, Y. Joint optimization of mission aborts and allocation of standby components considering mission loss. Reliab. Eng. Syst. Saf.; 2022; 225, 108612. [DOI: https://dx.doi.org/10.1016/j.ress.2022.108612]

47. Ye, Z.S.; Chen, N. The inverse Gaussian process as a degradation model. Technometrics; 2014; 56, pp. 302-311. [DOI: https://dx.doi.org/10.1080/00401706.2013.830074]

© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Numerous engineering systems gradually deteriorate due to internal stress caused by the working load. Because the deterioration process is directly related to the workload, decision-makers can manage system deterioration by adjusting the workload. Mission stopping, one of the most effective ways to control the risk of system malfunction, has been studied extensively. However, most existing research on mission stopping ignores the effect of working loads on the internal deterioration of safety-critical systems. This work examines the optimal joint loading and stopping rules for systems subject to internal degradation under two types of mission success requirements (MSR). The problem is formulated with a recursive algorithm that minimizes the expected cost over the mission. Mission reliability and system safety are assessed, and the optimal loading and stopping rules are investigated. The established models are illustrated by practical examples, and a comprehensive policy comparison and parameter sensitivity analysis are conducted with respect to the allowable mission time, the mission duration, and the number of mission attempts. The findings indicate that dynamic load level adjustment has a substantial effect on system deterioration and on the expected long-term costs. Several managerial implications for the joint design of load adjustment and abort implementation are obtained to support decision-making.

Details

Title

Optimal Stopping and Loading Rules Considering Multiple Attempts and Task Success Criteria

Author

Wu, Yaguang

First page

1065

Publication year

2023

Publication date

2023

Publisher

MDPI AG

e-ISSN

22277390

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.3390/math11041065

ProQuest document ID

2779495675
