Fault-tolerant control for nonlinear time-delay systems using neural network observers

Abstract

This article addresses the problem of fault-tolerant control in nonlinear time-delay systems using adaptive dynamic programming. An adaptive neural network observer is developed to estimate unknown dynamics, system states, and actuator faults. This observer is then transformed into an augmented structure for optimal fault-tolerant control problem. The gains of this observer are determined by solving a linear matrix inequality. A new value function index is introduced to account for time-delay states, and control law is derived associated with this novel value function. The Hamilton–Jacobi–Bellman equation for this value function is solved via a critic neural network. Lyapunov functional analysis demonstrates that the closed-loop system remains uniformly ultimately bounded. Simulation results validate the proposed fault tolerant approach. The key contribution of this paper lies in incorporating time-delay states into the adaptive dynamic programming value function in the presence of actuator faults.

Full text

Translate

Turn on search term navigation

Introduction

In contemporary industrial applications and technological systems, control approaches must consider system safety and reliability. This is especially critical in safety-focused applications, where mission success and the protection of lives and the environment are essential. Consequently, the importance of guaranteeing system safety and reliability has reached unprecedented levels [1, 2]. Since modern systems consist of many components, such as sensors and mechanical actuators, then they may fail over time. Failures can lead to system instability and degrade control performance in systems. Actuator failures, in particular, are especially critical because they can significantly impair control performance due to their unpredictable actions. To address this issue in controller design, fault-tolerant control (FTC) strategies should be implemented [3, 4].

The primary goal of a FTC approach is to adjust the controller structure so the system can manage component failures without compromising the mission or endangering users and the public. This often requires incorporating dynamic fault estimation into the nominal control input, which presents significant challenges in ensuring system stability with the modified control input [5].

FTC methods are generally divided into two categories: passive and active. Passive FTC methods rely on robust control theory, maintaining a constant controller structure and tolerating only a limited number of predefined conditions and assumptions. While passive approaches are more conservative, they are easier to design and implement. In contrast, active FTC methods adapt to faults by re-configuring the controller based on fault detection and isolation information. These approaches are less conservative and impose fewer conditions on the system, but proving stability and achieving acceptable control performance with active methods is more challenging. Great works on both active and passive fault approaches have been published; the reader can refer to [6, 7–8] for more details.

This paper introduces an active FTC approach for nonlinear time-delay systems using adaptive dynamic programming (ADP). Various methods for addressing active FTC problems have been published, including adaptive control approaches [9, 10–11], model predictive control [12, 13], and reinforcement learning with ADP algorithms [14, 15]. Among these various techniques, ADP algorithm is particularly effective for FTC challenges in nonlinear systems. An ADP approach is a powerful method for addressing optimization and optimal control problems by utilizing the principle of optimality [16]. For example, in work [17], a novel cost function is developed to address challenges in solving optimal tracking control problems for certain nonlinear systems with known dynamics using ADP. This innovative approach introduces a specific cost function related to tracking errors and their derivatives, eliminating the need for the traditional zero equilibrium assumption and simplifying the controller design process for nonlinear systems with nonzero equilibrium. ADP, an intelligent control method, has gained significant interest in both academia and industry since its proposal [18].

In [19], the authors proposed an online fault compensation control strategy using ADP algorithm to achieve optimal control in nonlinear systems with actuator faults. This method notably avoids using an action neural network. Expanding on this, [20] introduces an observer to enhance the value function. Unlike the approach in [19], the improved value function in [20] simultaneously addresses both faults and control, providing a more integrated solution.

Building on the findings from [19] and [20], the authors in [21] developed an FTC approach that tackles both actuator and sensor faults using neuro-dynamic programming. In [22], an optimal FTC method for a class of unknown nonlinear discrete-time systems with actuator faults was introduced, employing adaptive critic design. The study in [23] utilized a sliding-mode control technique integrated with a ADP algorithm to solve the optimal FTC problem for affine nonlinear systems. However, the works in [21, 22, 14, 19, 20], and [23] did not address nonlinear systems with time delays.

Consequently, a primary motivation of our work is to develop a fault tolerant method using ADP algorithm for nonlinear systems with time delays. Nearly all the previously mentioned works depend on information from nominal systems and assume the availability of system states. Observers, particularly learning observers, are crucial and play an important role in designing control approaches. Then, another motivation is to develop a fault-tolerant method using the ADP algorithm for nonlinear time-delay systems, based on a neural network observer.

In [24], a fault identification approach for partially unknown nonlinear systems is proposed using a combination of the deterministic learning method and the adaptive high gain observer. However, this paper does not consider time delay or adaptive dynamic programming. Building on these results, [25] proposes an adaptive neural observer-based FTC approach for a class of nonlinear systems using ADP. Most works utilizing neural network observers focus on fault estimation and detection, and methods for neural network observer-based FTC are scarce.

In all the aforementioned works, nonlinear time-delay systems were not considered. Time delays are critical in many engineering systems. Industrial processes, both linear and nonlinear, often involve time delays. In control terminology, these delays are categorized as constant time delays and time-varying delays. Constant time delays, common in slow-changing processes like chemical reactors, represent the maximum tolerable delay. Time-varying delays, typical in fast-paced systems like power systems and mobile robot teams, can cause instability or reduced performance. For example, in distributed optimization problems, delays can increase the time needed for systems to converge to an optimal solution [26]. To mitigate these effects, researchers incorporate delay considerations to ensure the stability of their controllers [27]. A widely used approach for this purpose is based on Lyapunov-Krasovskii methods [28].

Addressing time-delay systems in ADP is particularly challenging due to their infinite-dimensional nature, which complicates their solution. Analyzing systems with time delays is significantly more complex than analyzing systems without delays. Currently, there is no specific method for handling nonlinear time-delay systems. For more information, see chapters 4 and 5 of [29].

Based on the discussions and to the best of our knowledge, the issue of FTC based on neural network observer for nonlinear time-delay systems using an ADP algorithm has not been previously studied. The main contributions of this paper are as follows:

An adaptive neural network observer for nonlinear time-delay systems is designed to learn the unknown system dynamics and actuator faults. This observer is integrated into a FTC approach, enhancing its adaptability to stochastic faults through active intervention.
An novel value function for neural network observer-based FTC problem using ADP algorithm: Unlike existing ADP algorithms for fault tolerant in nonlinear systems, this paper specifically targets nonlinear time-delay systems. We introduce a new value function for our ADP algorithm, which integrates state time delays into the dynamics of nonlinear systems. Compared to [19, 24, 25] and [20], our method includes time delays in the control law updates and considers their impact to ensure the stability of the proposed approach.
Stability analysis using bilinear matrix inequality (BMI): We introduce a BMI to consider the impact of time delay on the stability of the proposed method. This BMI helps analyze the effect of constant time delay on the stability of the FTC problem based on the ADP algorithm for nonlinear time-delay systems.

The remainder of this paper is structured as follows. Section 2 presents the problem statement and outlines the control objectives. It also discusses the construction of a neural network observer, the development of a performance index function, and the creation of a fault-tolerant approach using adaptive techniques to address the FTC problem for a class of partially unknown nonlinear time-delay systems. Section 3 provides simulation examples to demonstrate the effectiveness of the proposed scheme. Finally, Sect. 4 concludes the paper.

Notation: The symbols $R$ , $R^{n}$ , and $R^{n \times m}$ denote the sets of all real numbers, n-dimensional real vectors, and $n \times m$ real matrices, respectively. The notation $∥.∥$ represents the Euclidean vector norm. The functions ${\bar{λ}}_{min} (.)$ and ${\bar{λ}}_{max} (.)$ indicate the minimum and maximum eigenvalues, respectively.

Mathematical formulation

Consider a class of partially unknown nonlinear time-delay systems represented as follows:

\begin{matrix} \dot{z} (t) = A z (t) + A_{d} z (t - d) + B u (t) + f (z (t)) \\ + D_{f} a (t), \\ y (t) = C z (t), \\ z (t) = h (t), t \in [- d, 0], \end{matrix}

the variables

z (t) \in R^{n}

and

z (t - d) \in R^{n}

denote the state vectors of the system at time

t

and at a delayed time

t - d

, respectively. The vector

y (t) \in R^{p}

represents the measured output, while

u (t) \in R^{m}

signifies the control input vector. The term

a (t) \in R^{j}

denotes the unknown actuator fault. The matrices

A \in R^{n \times n}

B \in R^{n \times m}

C \in R^{p \times n}

, and

D_{f} \in R^{n \times j}

are known real and constant. The term h(t) represents the historical state of the system and d is time-delay constant. The pair

(A, B)

is controllable, and the pair

(A, C)

is observable. The function

f (z (t)) \in R^{n}

is an unknown nonlinear function with Lipschitz continuity.

The following assumption must be defined to proceed with the proposed method in this paper.

Assumption 1

The partial derivative of the approximation error in the critic neural network, denoted as $\nabla ϵ_{c} (x)$ , the actuator fault $a (t)$ , and the function $f (z (t))$ are all constrained by their respective norms. Specifically, the norm of the gradient $\nabla ϵ_{c} (z (t))$ is bounded by $\nabla {\bar{ϵ}}_{c}$ , the norm of the function $f (z (t))$ is bounded by $f_{1}$ , and the norm of the actuator fault $a (t)$ is bounded by $f_{2}$ . Here, $\nabla {\bar{ϵ}}_{c}$ , $f_{1}$ , and $f_{2}$ are positive constants.

In Eq. (1), the dynamics $f (z (t))$ and the fault $a (t)$ are unknown. To handle these unknown nonlinear terms and the actuator fault, we will use a critic neural network approximation.

Radial basis function neural networks

Radial basis function (RBF) networks have the capability to approximate any function universally. This approximation can be expressed as $F^{T} (Z) W$ , where $F (Z)$ represents a vector of RBF neurons, $W$ denotes a vector of network weights, and $Z$ is the input vector. Let $ψ$ be a trajectory confined within a bounded set, and let $C_{ζ}$ denote the localized area along this trajectory $ψ$ . According to [30], by leveraging the neurons within $C_{ζ}$ , for any smooth nonlinear function $ϕ (Z)$ and for any given tolerance $ϵ > 0$ , there exists an optimal weight vector $W_{ζ}^{*}$ such that:

\begin{matrix} ϕ (Z) = F_{ζ}^{T} (Z) W_{ζ}^{*} - ϵ, Z \in ψ \end{matrix}

where

F_{ζ} (Z)

denotes a vector of RBF neurons within

C_{ζ}

Control objective

The aim of this paper is to develop a control strategy using neural network observers and the ADP algorithm for system (1). This strategy is designed to ensure stability despite actuator faults and time delays. The neural network observer allows for the estimation of system states and unknown dynamics. We consider the unknown nonlinear function $f (z (t))$ and the fault $a (t)$ as a combined term, expressed as $φ (z) = f (z (t)) + D_{f} a (t)$ , and assume the detectable condition is met. The control laws are formulated to incorporate the influence of time delays, guaranteeing system stability. In particular, the control law accounts for the integral of the time-delay history. We propose a new value function that includes the history of time delays.

Adaptive neural network observer

Neural network observers excel in handling nonlinear dynamics, a key advantage over traditional linear observers. While linear observers are limited to small ranges around specific operating points, neural networks can approximate complex, nonlinear functions. This allows them to learn nonlinear dynamics from data, resulting in more accurate state estimation across a wider range of operating conditions. Such capability is particularly valuable in fault-tolerant control problems, where the ability to detect and isolate faults in nonlinear systems is critical for maintaining system stability and performance [24]. In [31], fuzzy logic systems are employed to model the unknown nonlinear dynamics of the pure-feedback system. These systems are particularly well-suited for this task because they can represent complex, nonlinear relationships through a set of simple "if-then" rules derived from expert knowledge or data. However, while fuzzy logic systems are flexible, their rule base can be relatively static unless specifically designed to adapt. This static nature can limit their ability to learn complex, high-dimensional patterns compared to neural networks, which can automatically learn representations from data.

Based on (2), an adaptive neural network observer for system (1) is designed as follows:

\begin{matrix} \dot{\hat{z}} (t) = A \hat{z} (t) + A_{d} \hat{z} (t - d) + B u (t) + F^{T} (y) {\hat{W}}_{o} \\ + Λ (y (t) - C \hat{z} (t)), \\ \hat{y} (t) = C \hat{z} (t), \end{matrix}

and the neural network identifier as

\begin{matrix} {\dot{\hat{W}}}_{o} (t) = Ω F (y (t)) L (y (t) - C \hat{z} (t)) - δ Ω {\hat{W}}_{o} (t), \end{matrix}

in which

\hat{z} (t) \in R^{n}

and

\hat{y} (t) \in R^{n}

represent the estimates of the system state z(t) and the output measurements y(t), respectively. The observer gain matrices are

Λ \in R^{n \times p}

and

L \in R^{j \times p}

. The parameters

δ > 0

and

Ω \in R^{j \times j} > 0

are defined. The term

F^{T} (y (t)) \in R^{n \times k}

, and

\hat{W} \in R^{k}

is the estimate of

W

. The RBF neural network

F^{T} (y (t)) \hat{W}

is used to approximate the unknown function

φ (z (t))

Remark 1

Existing NN-based control methods rely on adaptive techniques, with NN weight update laws designed using a Lyapunov function. However, this approach only guarantees convergence of NN weights, while other NN parameters like widths and centers remain fixed and cannot be updated online. In contrast, gradient descent (GD)-based NN techniques can achieve better approximation performance by updating all NN parameters to satisfactory values through cost function minimization via the GD algorithm. In [32], a novel control strategy for nonlinear systems with unknown dynamics is introduced, utilizing a neural network-based approach that combines gradient descent (GD) with the Barzilai-Borwein (BB) method. This GD-BB-based control algorithm enhances adaptability by updating neural network parameters and learning rates online, simplifying the design process and reducing the need for manual tuning. Thus, using the proposed method in [32] for estimating unknown nonlinear dynamics could be a very interesting approach for fault-tolerant control problems.

For simplicity, the notation t is omitted in the equations that follow.

Let’s define the state observation error as $\tilde{z} = \hat{z} - z$ and the neural network identifier observation error as $\tilde{W} = \hat{W} - W$ . Then, based on Eqs. (1) and (3), we obtain the following.

\begin{matrix} \dot{\tilde{z}} = (A - Λ C) \tilde{z} + A_{d} \tilde{z} (t - d) + F^{T} (y) {\tilde{W}}_{o} + ε . \end{matrix}

The following theorem is presented to show that the state observation error (5) is uniformly ultimately bounded.

Theorem 1

Given the provided observer (3) and Assumption 1, if there exist symmetric positive definite matrices E and $Q_{d} \in R^{n \times n}$ , positive scalar $λ_{1}$ , and matrix $Y$ such that the following LMIs hold:

\begin{matrix} Π = [\begin{matrix} - Π_{11} & - E A_{d}^{T} & 0 \\ * & Q_{d} & 0 \\ * & * & δ \end{matrix}] > 0, \end{matrix}

where

\begin{matrix} λ_{3} = δ {\bar{W}}^{T} \bar{W}, \\ Π_{11} = A^{T} E - C^{T} Y^{T} + E A - Y C + \frac{1}{λ_{1}} E E^{T} + Q_{d}, \end{matrix}

the observer gain

Λ

and

L

are obtained as

Λ = E^{- 1} Y

and

L = E^{T} C^{- 1}

. Consequently, the state estimation error

\tilde{z}

and the identifier estimation error

\tilde{W}

are uniformly ultimately bounded.

Proof

The following Lyapunov function candidate is proposed:

\begin{matrix} \sum = \sum_{1} + \sum_{2} + \sum_{3}, \end{matrix}

where

\begin{matrix} \sum_{1} = {\tilde{z}}^{T} E \tilde{z}, \\ \sum_{2} = \int_{t - d}^{t} {\tilde{z}}^{T} (τ) E \tilde{z} (τ) d τ \\ \sum_{3} = {\tilde{W}}^{T} Ω^{- 1} \tilde{W} . \end{matrix}

Given (5) and (7), we derive the following:

\begin{matrix} {\sum^{˙}}_{1} & = {\dot{\tilde{z}}}^{T} E \tilde{z} + {\tilde{z}}^{T} E \dot{\tilde{z}} \\ = {\tilde{z}}^{T} ({(A - Λ C)}^{T} E + E (A - Λ C)) \tilde{z} \\ + {\tilde{z}}^{T} (t - d) A_{d}^{T} E \tilde{z} + 2 {\tilde{W}}^{T} F (y) E \tilde{z} + 2 {\tilde{z}}^{T} E ε, \end{matrix}

\begin{matrix} {\sum^{˙}}_{2} & = {\tilde{z}}^{T} Q_{d} \tilde{z} - {\tilde{z}}^{T} (t - d) Q_{d} \tilde{z} (t - d), \end{matrix}

\begin{matrix} {\sum^{˙}}_{3} & = 2 {\tilde{W}}^{T} Ω^{- 1} \dot{\hat{W}} = 2 {\tilde{W}}^{T} Ω^{- 1} (Ω F^{T} (y) \\ \times L (y - C \hat{z}) - δ Ω \hat{W}) \\ = - 2 {\tilde{W}}^{T} F^{T} (y) L \tilde{z} - 2 {\tilde{W}}^{T} δ \hat{W}, \end{matrix}

In (9) and using Young’s inequality, we obtain the following:

\begin{matrix} 2 {\tilde{z}}^{T} E ε \leq \frac{1}{λ_{1}} {\tilde{z}}^{T} E E^{T} \tilde{z} + λ_{1} ε^{T} ε . \end{matrix}

First, let us examine the scenario where the neural network approximation error

ε = 0

. Then, we derive the following result:

\begin{matrix} \sum^{˙} = {\sum^{˙}}_{1} + {\sum^{˙}}_{2} + {\sum^{˙}}_{3} \leq {\tilde{z}}^{T} ({(A - Λ C)}^{T} E \\ + E (A - Λ C)) \tilde{z} + {\tilde{z}}^{T} (t - d) A_{d}^{T} E \tilde{z} \\ + 2 {\tilde{W}}^{T} F (y) E \tilde{z} + \frac{1}{λ_{1}} {\tilde{z}}^{T} E E^{T} \tilde{z} + {\tilde{z}}^{T} Q_{d} \tilde{z} \\ - {\tilde{z}}^{T} (t - d) Q_{d} \tilde{z} (t - d) - 2 {\tilde{W}}^{T} F^{T} (y) \\ \times L C \tilde{z} - 2 {\tilde{W}}^{T} δ \hat{W} . \end{matrix}

Utilizing Young’s inequality and Assumption 1, we derive the following result:

\begin{matrix} - 2 {\tilde{W}}^{T} δ \hat{W} = - 2 δ {\tilde{W}}^{T} (\tilde{W} + W) \\ = - 2 δ {\tilde{W}}^{T} \tilde{W} - 2 δ {\tilde{W}}^{T} W \leq - δ {\tilde{W}}^{T} \tilde{W} \\ - δ W^{T} W \leq - δ {\tilde{W}}^{T} \tilde{W} - δ {\bar{W}}^{T} \bar{W}, \end{matrix}

here

\bar{W}

denotes the upper bound of

W

, that is

∥W∥ \leq ∥\bar{W}∥

. By assuming

E F (y) = C^{T} L^{T} F (y)

and defining the variable

η^{T} = [{\tilde{z}}^{T}, {\tilde{z}}^{T} (t - d), {\tilde{W}}^{T}]

\begin{matrix} \dot{Σ} \leq - η^{T} Π η - λ_{3} \end{matrix}

where

λ_{3} = δ {\bar{W}}^{T} \bar{W}

and

Π

is defined in (6). According to (15), if

Π > 0

, then, it leads to

\dot{Σ} \leq 0

In other words, $\tilde{z}$ and $\tilde{W}$ will remain bounded over time. Thus, the proof is concluded. $□$

Inspiring [24], Theorem 1 is obtained and the estimation errors for the states converge to zero when $ε ≃ 0$ . Thus, the convergence of the state estimate error $z$ and the neural network weight estimate errors $W$ can be assured. By using the result of Theorem 1, the identification error systems (3) and (4) are uniformly ultimately bounded, $z$ and $W$ are also bounded.

Designing optimal controller based on neural network observer

From the designed adaptive neural network observer (5), it is clear that system (1) is replaced by (3), thus transforming (1) into:

\begin{matrix} \dot{\hat{z}} = A \hat{z} + A_{d} \hat{z} (t - d) + B u (\hat{z}) + F^{T} (y) \hat{W} + g (\hat{z}) \\ \hat{y} = C \hat{z}, \end{matrix}

where

g (\hat{z}) = Λ (y - C \hat{z})

If an optimal control policy $u (\hat{z})$ is found for Eq. (16), it guarantees the convergence of system (1). This paper presents an augmented system to reframe the problem of fault tolerant as an optimal regulation problem. By merging Eqs. (4) and (16), the following observer system is provided as follows:

\begin{matrix} [\begin{matrix} \dot{\hat{z}} \\ {\dot{\hat{W}}}_{o} \end{matrix}] = [\begin{matrix} A & {F (y)}^{T} \\ 0 & - δ Ω \end{matrix}] [\begin{matrix} \hat{z} \\ {\hat{W}}_{o} \end{matrix}] + [\begin{matrix} A_{d} & 0 \\ 0 & 0 \end{matrix}] \times \\ [\begin{matrix} \hat{z} (t - d) \\ {\hat{W}}_{o} (t - d) \end{matrix}] + [\begin{matrix} B \\ 0 \end{matrix}] u (\hat{z}) + [\begin{matrix} g (\hat{z}) \\ Ω F (y) L (y - C \hat{z}) \end{matrix}], \end{matrix}

introduce the augmented variables and matrices as:

\begin{matrix} η = [\begin{matrix} \hat{z} \\ {\hat{W}}_{o} \end{matrix}], \bar{A} = [\begin{matrix} A & {F (y)}^{T} \\ 0 & - δ Ω \end{matrix}] \\ {\bar{A}}_{d} = [\begin{matrix} A_{d} & 0 \\ 0 & 0 \end{matrix}], \bar{B} = [\begin{matrix} B \\ 0 \end{matrix}], \\ G (\bar{M} ξ) = [\begin{matrix} g (\hat{z}) \\ Ω F (y) L (y - C \hat{z}) \end{matrix}], \bar{M} = [I, 0] . \end{matrix}

Then, system (17) can be represented as

\begin{matrix} \dot{η} = \bar{A} η + {\bar{A}}_{d} η (t - d) + B u (η) + G (\bar{M} η) . \end{matrix}

Next, we will derive the control law, assuming that the control input

u (η) \in Ψ (Ω)

is considered admissible.

Controller formulation

The proposed control strategy leverages the ADP framework to solve optimal control problems using neural network observer. We employ a critic neural network to approximate the value function associated with the Hamilton–Jacobi–Bellman equation, training it to minimize temporal difference error. This approach, supported by Lyapunov functional analysis, ensures the convergence of the system states and neural network weights, thereby guaranteeing the stability and optimality of the control strategy.

System (1) is controllable on the set $φ \in R^{n}$ . Therefore, the goal is to find a feedback control policy $u (η)$ , that minimizes the infinite horizon performance index function, expressed as:

\begin{matrix} V (η) = & \int_{t}^{\infty} (P (η, u) + \int_{τ - d}^{τ} η^{T} (q) {\bar{Q}}_{d} η (q) d q) d τ, \end{matrix}

here, the utility function

P (η (τ), u (τ))

is given by

$P (η, u) = η^{T} \bar{Q} η + u^{T} R u$ . $\bar{Q} = [\begin{matrix} Q & 0 \\ 0 & 0 \end{matrix}],$ $Q \in R^{n \times n},$ ${\bar{Q}}_{d} = [\begin{matrix} Q_{d} & 0 \\ 0 & 0 \end{matrix}], Q_{d} \in R^{n \times n}$ and $R \in R^{m \times m}$ are positive definite matrices.

Consider $U (η, u)$ , defined as: $\begin{matrix} U (η, u) = P (η, u) + \int_{τ - d}^{τ} η^{T} (q) {\bar{Q}}_{d} η (q) d q . \end{matrix}$ This function satisfies $U (0, 0) = 0$ and is nonnegative for all $η$ and $u$ . The expression (19) takes into account time delays.

Next, we will prove that using the new value function (19) to design a controller can stabilize the closed-loop system.

Let $V^{*} (η)$ be the optimal value function corresponding to $V (η)$ , defined as follows:

\begin{matrix} V^{*} (η) = min_{u \in φ} V (η) . \end{matrix}

By applying the Bellman optimality principle, we can determine the gradient of

V^{*} (η)

. This gradient, denoted as

\nabla V^{*} (η) = \frac{\partial V^{*} (η)}{\partial η}

, satisfies the following equation:

\begin{matrix} min_{u \in φ} H (η, \nabla V^{*} (η)) = 0, \end{matrix}

where

H (η, \nabla V^{*} (η))

is known as the Hamiltonian function [33, 34]. The Hamiltonian function associated with the cost function (19) is given by:

\begin{matrix} H (η, & \nabla V^{*} (η), u) = U (η, u) + {(\nabla V^{*} (η))}^{T} \dot{η} \\ = U (η, u) + {(\nabla V^{*} (η))}^{T} (\bar{A} \hat{η} + {\bar{A}}_{d} \hat{η} (t - d) \\ + G (\bar{M} η)) + {(\nabla V^{*} (η))}^{T} B u (\hat{η}) \\ = P (η, u) + \int_{τ - d}^{τ} η^{T} (q) Q_{d} η (q) d q \\ + {(\nabla V^{*} (η))}^{T} (\bar{A} \hat{η} + {\bar{A}}_{d} \hat{η} (t - d) \\ + G (\bar{M} η)) + {(\nabla V^{*} (η))}^{T} B u (\hat{η}) . \end{matrix}

Our goal is to minimize Eq. (22) by combining Eqs. (22) and (21) to derive the optimal control law, denoted as

u^{*}

. This optimal control law can be obtained from

\frac{\partial H (η, \nabla V^{*} (η), u)}{\partial u} = 0

as follows:

\begin{matrix} u^{*} = - \frac{1}{2} R^{- 1} B^{T} \nabla V^{*} (η) . \end{matrix}

From (23), a simple transformation yields:

\begin{matrix} {(\nabla V^{*} (η))}^{T} B = 2 {u^{*}}^{T} R . \end{matrix}

Equation (24) will be needed later.

By substituting Eq. (23) in Eq. (22), the HJB equation can be rewritten as follows:

\begin{matrix} η^{T} Q η - \frac{1}{4} {(\nabla V^{*} (η))}^{T} B^{T} R B \nabla V^{*} (η) \\ + \int_{τ - d}^{τ} η^{T} (q) Q_{d} η (q) d q + {(\nabla V^{*} (η))}^{T} (\bar{A} \hat{η} \\ + {\bar{A}}_{d} \hat{η} (t - d) + G (\bar{M} η)) = 0 . \end{matrix}

It is essential to show that the control law derived in Eq. (23) can stabilize the system described by Eq. (1). This is established in the following theorem.

Theorem 2

Consider system (1), Theorem 1, and the control protocol given by Eq. (23). The control strategy outlined in Eq. (23) can ensure that nonlinear time-delay system (1) remains uniformly ultimately bounded, provided that there exist positive definite matrices $\bar{Q}$ and ${\bar{Q}}_{d}$ , along with free-weighting matrices $M$ and $N$ , such that the following BMI is satisfied:

\begin{matrix} \bar{Υ} = [\begin{matrix} R & 0 & 0 & 0 \\ * & \bar{Q} - M_{d} & - M N_{d} & - M \\ * & * & - N_{d} & - N \\ * & * & * & 0 \end{matrix}] \geq 0, \end{matrix}

in which

\begin{matrix} M_{d} = d M {\bar{Q}}_{d}^{- 1} M^{T}, \\ M N_{d} = d M {\bar{Q}}_{d}^{- 1} N^{T}, \\ N_{d} = d N {\bar{Q}}_{d}^{- 1} N^{T} . \end{matrix}

Proof

The following Lyapunov function is considered:

\begin{matrix} L (η) = V^{*} (η) . \end{matrix}

Based on the definition of

V^{*} (η)

, we have

V^{*} (η) > 0

when

η \neq 0

and

V^{*} (η) = 0

when

η = 0

. This indicates that

V^{*} (η)

is positive definite, which consequently implies that

L (η)

is also positive definite. Following this, the derivative of the Lyapunov function (27) is evaluated along the trajectory of

\dot{η} = \bar{A} \hat{η} + {\bar{A}}_{d} \hat{η} (t - d) + B u (\hat{η}) + G (\bar{M} η)

, yielding the following expression:

\begin{matrix} \dot{L} (η) = & {(\nabla V^{*} (η))}^{T} \dot{η} \\ = & {(\nabla V^{*} (η))}^{T} (\bar{A} \hat{η} + {\bar{A}}_{d} \hat{η} (t - d) + B u (\hat{η}) \\ + G (\bar{M} η)) . \end{matrix}

Calling (22), we obtain:

\begin{matrix} {(\nabla V^{*} (η))}^{T} (\bar{A} \hat{η} + {\bar{A}}_{d} \hat{η} (t - d) + G (\bar{M} η)) = - P (η, u) \\ - \int_{t - d}^{t} η^{T} (q) {\bar{Q}}_{d} η (q) d q - {(\nabla V^{*} (η))}^{T} B u . \end{matrix}

By substituting (29) and (23) in (28), Eq. (28) can be reformulated as follows:

\begin{matrix} \dot{L} (η) = - η^{T} \bar{Q} η - u^{*^{T}} R u^{*} - \int_{t - d}^{t} η^{T} (q) {\bar{Q}}_{d} η (q) d q . \end{matrix}

By using Jensen’s inequality [28], let us define the free-weighting matrices

M

and

N

. The term

- \int_{t - d}^{t} η^{T} (q) {\bar{Q}}_{d} η (q)

d q

in (30) can be expressed as follows:

\begin{matrix} - \int_{t - d}^{t} η^{T} (q) {\bar{Q}}_{d} η (q) d q = 2 η^{T} M \int_{t - d}^{t} η^{T} (q) d q \\ + 2 η^{T} (t - d) N \int_{t - d}^{t} η^{T} (q) d q - \int_{t - d}^{t} (η^{T} (q) M \\ + η^{T} (t - d) N + η^{T} (q) {\bar{Q}}_{d}) {\bar{Q}}_{d}^{- 1} (η^{T} (q) M \\ + η^{T} (t - d) N + η^{T} (q) {\bar{Q}}_{d})^{T} d q + \int_{t - d}^{t} (η^{T} (q) M \\ + η^{T} (t - d) N) {\bar{Q}}_{d}^{- 1} (η^{T} (q) M + η^{T} (t - d) N)^{T} d q \\ \leq ξ^{T} Υ ξ, \end{matrix}

in which

\begin{matrix} ξ^{T} = [η^{T}, η^{T} (t - d), \int_{t - d}^{t} η^{T} (q) d q], \\ Υ = [\begin{matrix} M_{d} & M N_{d} & M \\ * & N_{d} & N \\ * & * & 0 \end{matrix}] . \end{matrix}

Referring to (30) and applying (31), we get:

\begin{matrix} \dot{L} (η) = - η^{T} \bar{Q} η - u^{*^{T}} R u^{*} - \int_{t - d}^{t} η^{T} (q) {\bar{Q}}_{d} η (q) d q \\ \leq - η^{T} \bar{Q} η - u^{*^{T}} R u^{*} + η^{T} Υ η \\ \leq - ({\bar{ξ}}^{T} \bar{Υ} \bar{ξ}), \end{matrix}

where

\begin{matrix} {\bar{ξ}}^{T} = [u^{T}, η^{T}, η^{T} (t - d), \int_{t - d}^{t} η^{T} (q) d q], \\ \bar{Υ} = [\begin{matrix} R & 0 & 0 & 0 \\ * & \bar{Q} - M_{d} & - M N_{d} & - M \\ * & * & - N_{d} & - N \\ * & * & * & 0 \end{matrix}] . \end{matrix}

According to (32), if

\bar{Υ} \geq 0

, then

- ({\bar{ξ}}^{T} \bar{Υ} \bar{ξ}) < 0

holds, leading to

\dot{L} (η) < 0

. This proves that the system (1) is asymptotically stable, thus completing the proof.

□

It is essential to delve deeper into the discussion regarding the BMI (26). As we are aware, solving this BMI is a challenging task. To address this difficulty and analyze the permissible delays, we can employ the following expression to assess the feasibility of the BMI:

\begin{matrix} {\bar{λ}}_{max} (R) {∥u^{*}∥}^{2} - {\bar{λ}}_{max} (Π) {∥η∥}^{2} = 0 \end{matrix}

where

\begin{matrix} Π = [\begin{matrix} \bar{Q} - M_{d} & - M N_{d} & - M \\ * & - N_{d} & - N \\ * & * & 0 \end{matrix}] \end{matrix}

Remark 2

The Lyapunov-Krasovskii function, $\int_{t - d}^{t} η^{T} (q)$ ${\bar{Q}}_{d} η (q) d q$ , is used for analyzing the stability of time-delay systems and is incorporated into the value function (19). Its purpose in the cost function is to account for the effects of time delays on the system, allowing this information to be used to update the control law. As demonstrated in Theorem 2, integrating the Lyapunov-Krasovskii function with time delays into the cost function helps ensure the system’s stability.

According to (26), $\bar{Υ}$ incorporates the time delay $d$ . The BMI condition $\bar{Υ} \geq 0$ helps identify the permissible delays. In other words, since the feasibility of $\bar{Υ} \geq 0$ is dependent on $d$ , we can determine the maximum allowable delays. In the subsequent stages, a single-layer neural network to tackle the task of solving the HJB equation is provided.

Neural network algorithm

We know that neural networks are great at handling complex functions. Since the performance index function (19) is usually quite complicated and does not have an exact mathematical expression, we will use a neural network to estimate it. In this paper, we use a simple one-layer neural network to do this. Then, following [35], $V (η)$ is expressed as

\begin{matrix} V (η) = W_{c}^{T} S (η) + ε (η), \end{matrix}

here, consider

S (η)

as the activation function, where

R^{c}

signifies the real-valued space with dimension

c

W_{c}

denotes the optimal weight vector, and

c

represents the number of neurons in the hidden layer.

ε (η)

stands for the approximation error of the neural network. The gradient of Eq. (35) concerning

η

can then be expressed as:

\begin{matrix} \nabla V (η) = {(\nabla S (η))}^{T} W_{c} + \nabla ε (η), \end{matrix}

let

\nabla S (η) = \frac{\partial S (η)}{\partial η} \in R^{c \times n}

denote the gradient of the activation function, and

\nabla ε (η)

represent the gradient of the approximation error.

When we substitute Eq. (36) into (21), we get:

\begin{matrix} 0 = U (η, u) + ({(\nabla S (η))}^{T} W_{c} + \nabla ε (η)) \dot{η} . \end{matrix}

So, the Hamiltonian can be stated as:

\begin{matrix} H (η, u, W_{c}) & = U (η, u) + (W_{c}^{T} \nabla S (η)) \dot{η} \\ = - \nabla ε (η) \dot{η} ≜ e_{rH} . \end{matrix}

In this context,

e_{rH}

signifies the residual error stemming from the neural network approximation. Since the ideal weight vector

W_{c}

is unknown, we utilize the critic neural network to estimate

V (η)

as:

\begin{matrix} \hat{V} (η) = W_{c}^{T} S (η) . \end{matrix}

As a result, we can formulate the gradient of

\hat{V} (η)

as:

\begin{matrix} \nabla \hat{V} (η) = {(\nabla S (η))}^{T} {\hat{W}}_{c}, \end{matrix}

hence, we can derive the approximate Hamiltonian as:

\begin{matrix} H (η, u, {\hat{W}}_{c}) & = U (η, u) + (W_{c}^{T} \nabla S (η)) \dot{η} \\ ≜ e_{r} . \end{matrix}

We define the weight approximation error as

{\tilde{W}}_{c} = W_{c} - {\hat{W}}_{c}

. Utilizing (41) and (38), we obtain:

\begin{matrix} e_{r} = e_{rH} - {\tilde{W}}_{c}^{T} \nabla S (η) \dot{η} . \end{matrix}

We can revise the weight approximation error as:

\begin{matrix} {\dot{\tilde{W}}}_{c} = - {\dot{\hat{W}}}_{c} = L_{r} (e_{rH} - {\tilde{W}}_{c}^{T} \nabla S (η) \dot{η}) \nabla S (η) \dot{η} . \end{matrix}

To update the weight vector

{\hat{W}}_{c}

of the critic neural network, we minimize the objective function

E_{c} = \frac{1}{2} e_{r}^{T} e_{r}

using the normalized gradient algorithm. The weight vector

{\hat{W}}_{c}

should be updated as follows:

\begin{matrix} {\dot{\hat{W}}}_{c} = - L_{r} e_{r} \nabla S (η) \dot{η}, \end{matrix}

in this expression,

L_{r} > 0

denotes the learning rate of the critic neural network. Therefore, considering (23) and (35), we can describe the ideal control policy as:

\begin{matrix} u (η) = - \frac{1}{2} R^{- 1} B^{T} ({(\nabla S (η))}^{T} W_{c} + \nabla ε (η)), \end{matrix}

and it can be estimated as:

\begin{matrix} \hat{u} (η) = - \frac{1}{2} R^{- 1} B^{T} {(\nabla S (η))}^{T} {\hat{W}}_{c} . \end{matrix}

The control policy (46) solely relies on the critic neural network. This means that updating the weight vector of the critic neural network using (44) eliminates the need to train the action neural network. As a result, this approach is not only feasible but also computationally efficient for implementation.

Theorem 3

If the weights of the critic neural network are updated according to (43) for the system described in (18), the error in weight approximation is uniformly ultimately bounded.

Proof

For this theorem, we used the following Lyapunov candidate:

\begin{matrix} Γ_{2} = \frac{1}{2 L_{r}} {\tilde{W}}_{c}^{T} {\tilde{W}}_{c} . \end{matrix}

The time derivative of (47) is

\begin{matrix} {\dot{Γ}}_{2} & = \frac{1}{2 L_{r}} {\tilde{W}}_{c}^{T} {\dot{\tilde{W}}}_{c} = {\tilde{W}}_{c}^{T} (e_{rH} - {\tilde{W}}_{c}^{T} \nabla S (η) \dot{η}) \\ \times \nabla S (η) \dot{η} = {\tilde{W}}_{c}^{T} e_{rH} \nabla S (η) \dot{η} - {∥{\tilde{W}}_{c}^{T}, \nabla, S, (η), \dot{η}∥}^{2} \\ \leq \frac{1}{2} e_{rH}^{2} - \frac{1}{2} {∥{\tilde{W}}_{c}^{T}, \nabla, S, (η), \dot{η}∥}^{2} . \end{matrix}

Hence,

{\dot{Γ}}_{2} < 0

holds when

{\tilde{W}}_{c}

belongs to the compact set defined by

‖ {\tilde{W}}_{c} ‖ \leq ‖ \frac{e_{rH}}{θ_{1}} ‖

, assuming

‖ \nabla S (z) \dot{z} ‖ \leq θ_{1}

, where

θ_{1}

is a positive constant. Then, we can conclude that the weight approximation error is upper uniformly bounded, thus completing the proof.

□

The ADP algorithm based on neural network observer associated with the approach proposed in this paper is detailed in Algorithm 1.

[See PDF for image]

Algorithm 1

Fault tolerant for a class of nonlinear time delay systems using ADP

Remark 3

For the computational requirements and efficiency of using a critic neural network in real-time applications, several factors need to be considered. These include the complexity of the neural network, the hardware capabilities, and the specific requirements of the real-time system. Real-time applications require low-latency responses, so the neural network should be optimized to produce outputs within the application’s time constraints.

Simulation numerical examples

This section presents numerical simulation examples to illustrate the effectiveness of the proposed method.

The simulation results were obtained using Julia programming, and the reader can download them from this link

Example: Chemical Reaction Tank

The chemical reactor systems consist of multiple chemical reactors and stirring bars, which introduce delays into the systems. The state-space expression of the chemical reaction is given as follows:

\begin{matrix} {\dot{z}}_{1} (t) = - (\frac{1}{T_{11}} + C_{11}) z_{1} (t) + (\frac{1 - F_{12}}{V_{11}}) z_{2} (t) \\ + M_{11} sin (z_{1} (t)), \\ {\dot{z}}_{2} (t) = - (\frac{1}{T_{12}} + C_{12}) z_{2} (t) + (\frac{F_{12}}{V_{12}}) z_{2} (t - d) \\ + u (t) + a (t), \end{matrix}

The chemical reaction parameters have been set as follows:

T_{11} = 50

T_{12} = 50

C_{11} = 3

C_{12} = 3

M_{11} = 0.1

M_{12} = 0.1

F_{11} = 0.5

F_{12} = 0.5

V_{11} = 0.5

, and

V_{12} = 0.5

According to (49): $\begin{matrix} C = [\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix}], D_{f} = [\begin{matrix} 0 \\ 1 \end{matrix}], B = [\begin{matrix} 0 \\ 1 \end{matrix}], \end{matrix}$ We assume that the actuator fault dynamic is as follows: $\begin{matrix} a (t) = \{\begin{matrix} 0, & 0 s \leq t < 10 s, \\ 0.5 sin (t / 2 π), & 10 s \leq t \leq 20 s . \end{matrix}) \end{matrix}$ In this example, the initial values are chosen as $z (0) = [0.95 ; - 0.8]$ , and other parameters are set as follows: $learning rate = 0.01$ , $R = I$ , and $Q = I_{2}$ . The observer’s parameters are obtained from (6) as follows: $\begin{matrix} L = [\begin{matrix} 0.4941 & - 0.0385 \\ - 0.0645 & 0.0091 \end{matrix}], Λ = [\begin{matrix} 0.026 & 0.1695 \\ 0.4417 & 12.7558 \end{matrix}] \end{matrix}$ An adaptive neural network observer is designed to estimate the system states and the unknown function. The RBF network employs $M = 11 \times 11 = 121$ nodes, with neuron centers distributed across the range $[- 1.2, 1.2] \times [- 1.2, 1.2]$ . The initial weight matrix is set to ${\hat{W}}_{o} = 0$ . The time delay is assumed to be $d = 0.3 s$ . The weight vector in critic network learning is denoted as ${\hat{W}}_{c} = {[{\hat{W}}_{c 1}, {\hat{W}}_{c 2}, {\hat{W}}_{c 3}]}^{T}$ , with its initial values given by $[{\hat{W}}_{c 1}, {\hat{W}}_{c 2}] = {[1, 1]}^{T}$ . The dimension of ${\hat{W}}_{c 3}$ is determined by $F^{T} (y) \in R^{n \times N^{2}}$ . Additionally, the activation function for the critic neural network is specified as $σ_{c} (z) = [{\hat{x}}_{1}^{2}, {\hat{x}}_{1} {\hat{x}}_{2}, {\hat{x}}_{1} \hat{W}, {\hat{x}}_{2}^{2}, {\hat{x}}_{2} \hat{W}, {\hat{W}}^{2}]$ . The network comprises $c = 2 + 121 = 123$ neurons, and the critic weight vectors are represented by ${\hat{W}}_{c} = [W_{c 1}, \dots, W_{c 123}]$ .

The simulation results are shown in Figs. 1, 2, 3, 4 and 5. Figure 1 illustrates that the states of the nonlinear system (1) are stabilized under the control policy (46). When an actuator fault happens at $t = 10 s$ , there is a noticeable change in the system states, but the controller still performs well. This indicates that the system, with control policy (46), can handle actuator faults. Figure 2 displays the control input (46) during simulation time. In our theoretical development, we focused on ensuring robustness and fault tolerance, which may sometimes introduce oscillations due to the conservative nature of the design. However, in practical implementations, these oscillations can be mitigated through damping techniques, filtering, and careful parameter tuning to ensure smooth control signals.

Figure 3 shows the performance of fault estimation and unknown dynamics, demonstrating that the proposed ADP-based approach can accurately and effectively estimate faults. Furthermore, Fig. 4 presents the critic weight vectors, indicating their convergence. For plotting ${\hat{W}}_{o}$ , we used the norm $|W|$ due to the large size of ${\hat{W}}_{o}$ , which is a vector with 121 columns.

[See PDF for image]

Fig. 1

System states and observer estimations

[See PDF for image]

Fig. 2

The control input (46)

[See PDF for image]

Fig. 3

The fault and unknown dynamic estimation performance

[See PDF for image]

Fig. 4

The weights of the critic neural network

The value of the cost function (19) is shown in Fig. 5. The integral part, which is related to the time delay, is also depicted in Fig. 5, indicating that when the states of the system converge to the equilibrium point, the rate of change of the cost function becomes constant.

[See PDF for image]

Fig. 5

The value of cost function using the proposed method

Conclusion

In this paper, we have proposed a FTC method for partially unknown nonlinear systems with time delays, employing adaptive dynamic programming. The neural network observer was implemented to estimate states, unknown dynamics, and actuator fault of the system. The observer gains were determined by solving a linear matrix inequality. A new value function index was introduced to account for time-delay states, and control laws were derived based on this value function. The Hamilton–Jacobi–Bellman equation for this value function was solved using a critic neural network. Lyapunov functional analysis demonstrated that the closed-loop system is uniformly ultimately bounded.

A promising future direction would be to extend our adaptive neural network observer and time-delay inclusive ADP approach to decentralized control scenarios for interconnected nonlinear systems with multiple fault types, as investigated in [32, 36].

Acknowledgements

None.

Author Contributions

Farshad Rahimi contributed to formulation, writing, simulation, methodology, mathematics, review, and editing.

Funding

The authors claim that there are no funds, grants, or other support devoted to the preparation of this manuscript.

Data availability

The simulation results were obtained using Julia programming, and the reader can download them from [this link](https://farshad-rahimi.github.io/FarshadRahimi//Learning%20Julia/).

Declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

1. Ni, Q; Ding, F; Zhan, Z; Liu, J; Li, X; Zhao, Z. A fault diagnosis and measurement method for temperature measurement circuits in electric traction drive system. Measurement; 2024; 227, 114250. [DOI: https://dx.doi.org/10.1016/j.measurement.2024.114250]

2. Isapour, S; Tavakoli-Kakhki, M. Rough neural network based data-driven model-free adaptive fault-tolerant control for discrete-time nonlinear systems. Eng Appl Artif Intell; 2024; 134, 108651. [DOI: https://dx.doi.org/10.1016/j.engappai.2024.108651] 07964803

3. Gao, Z; Cecati, C; Ding, SX. A survey of fault diagnosis and fault-tolerant techniques-part i: Fault diagnosis with model-based and signal-based approaches. IEEE Trans Industr Electron; 2015; 62, 6 pp. 3757-3767. [DOI: https://dx.doi.org/10.1109/TIE.2015.2417501] 1365.94571

4. Li, H; Yang, C; Hu, Y; Zhao, B; Zhao, M; Chen, Z. Fault-tolerant control for current sensors of doubly fed induction generators based on an improved fault detection method. Measurement; 2014; 47, pp. 929-937. [DOI: https://dx.doi.org/10.1016/j.measurement.2013.10.021] 1313.03023

5. Amin, AA; Hasan, KM. A review of fault tolerant control systems: advancements and applications. Measurement; 2019; 143, pp. 58-68. [DOI: https://dx.doi.org/10.1016/j.measurement.2019.04.083] 1227.97065

6. Jia, F; Cao, F; He, X. Active fault-tolerant control against intermittent faults for state-constrained nonlinear systems. IEEE Trans Syst, Man, Cybern: Syst; 2024; 54, pp. 2389-2401. [DOI: https://dx.doi.org/10.1109/TSMC.2023.3344292] 07966208

7. Hu, J; Sha, Y; Yao, J. Dual neural networks based active fault-tolerant control for electromechanical systems with actuator and sensor failure. Mech Syst Signal Process; 2023; 182, 109558. [DOI: https://dx.doi.org/10.1016/j.ymssp.2022.109558] 1137.86312

8. Ke, C; Cai, K-Y; Quan, Q. Uniform passive fault-tolerant control of a quadcopter with one, two, or three rotor failure. IEEE Trans Robot; 2023; 39, pp. 4297-4311. [DOI: https://dx.doi.org/10.1109/TRO.2023.3297048] 1548.93088

9. Shen, Q; Jiang, B; Shi, P. Fault diagnosis and fault-tolerant control based on adaptive control approach; 2017; Berlin, Springer: 1417.9302491

10. Zhang, X-N; Li, X-J. Adaptive fault-tolerant control for a class of stochastic nonlinear systems with multiple sensor faults. Int J Syst Sci; 2020; 51, 12 pp. 2217-2237.4141434 [DOI: https://dx.doi.org/10.1080/00207721.2020.1793231] 1483.93328

11. Shen, Q; Shi, P; Lim, CP. Fuzzy adaptive fault-tolerant stability control against novel actuator faults and its application to mechanical systems. IEEE Trans Fuzzy Syst; 2024; 32, pp. 2331-2340. [DOI: https://dx.doi.org/10.1109/TFUZZ.2023.3343403] 07902394

12. Camacho EF, Alamo T, Peña DM (2010) Fault-tolerant model predictive control. In: 2010 IEEE 15th Conference on Emerging Technologies & Factory Automation (ETFA 2010), IEEE, pp. 1–8

13. Bavili, RE; Khosrowjerdi, MJ et al. Active fault tolerant controller design using model predictive control. J Control Eng Appl Inf; 2015; 17, 3 pp. 68-76.1312.93047

14. Wang, L; Qi, R; Jiang, B. Adaptive fault-tolerant optimal control for hypersonic vehicles with state constrains based on adaptive dynamic programming. J Franklin Inst; 2024; 361, 8 106833.4733861 [DOI: https://dx.doi.org/10.1016/j.jfranklin.2024.106833] 1539.93099

15. Zhao, Y; Wang, H; Xu, N; Zong, G; Zhao, X. Reinforcement learning-based decentralized fault tolerant control for constrained interconnected nonlinear systems. Chaos, Solitons & Fractals; 2023; 167, 113034.4525496 [DOI: https://dx.doi.org/10.1016/j.chaos.2022.113034] 1452.65068

16. Bardi, M; Dolcetta, IC et al. Optimal control and viscosity solutions of Hamilton–Jacobi–Bellman equations; 1997; Berlin, Springer: [DOI: https://dx.doi.org/10.1007/978-0-8176-4755-1] 0890.4901112

17. Wang, T; Wang, Y; Yang, X; Yang, J. Further results on optimal tracking control for nonlinear systems with nonzero equilibrium via adaptive dynamic programming. IEEE Trans Neural Netw Learn Syst; 2021; 34, 4 pp. 1900-1910.4573877 [DOI: https://dx.doi.org/10.1109/TNNLS.2021.3105646] 07886811

18. Liu, D; Xue, S; Zhao, B; Luo, B; Wei, Q. Adaptive dynamic programming for control: a survey and recent advances. IEEE Trans Syst, Man, and Cybernet: Syst; 2020; 51, 1 pp. 142-160. [DOI: https://dx.doi.org/10.1109/TSMC.2020.3042876] 1480.35332

19. Zhao, B; Liu, D; Li, Y. Online fault compensation control based on policy iteration algorithm for a class of affine non-linear systems with actuator failures. IET Control Theory & Appl; 2016; 10, 15 pp. 1816-1823.3587316 [DOI: https://dx.doi.org/10.1049/iet-cta.2015.1105] 1197.35332

20. Zhao, B; Liu, D; Li, Y. Observer based adaptive dynamic programming for fault tolerant control of a class of nonlinear systems. Inf Sci; 2017; 384, pp. 21-33. [DOI: https://dx.doi.org/10.1016/j.ins.2016.12.016] 1432.93185

21. Zeng, C; Zhao, B; Liu, D. Fault tolerant control for a class of nonlinear systems with multiple faults using neuro-dynamic programming. Neurocomputing; 2023; 553, 126502. [DOI: https://dx.doi.org/10.1016/j.neucom.2023.126502] 1432.93185

22. Wang, Z; Liu, L; Wu, Y; Zhang, H. Optimal fault-tolerant control for discrete-time nonlinear strict-feedback systems based on adaptive critic design. IEEE Trans Neural Netw Learn Syst; 2018; 29, 6 pp. 2179-2191.3811407 [DOI: https://dx.doi.org/10.1109/TNNLS.2018.2810138] 1437.93129

23. Fan, Q-Y; Yang, G-H. Adaptive fault-tolerant control for affine non-linear systems based on approximate dynamic programming. IET Control Theory & Appl; 2016; 10, 6 pp. 655-663.3469069 [DOI: https://dx.doi.org/10.1049/iet-cta.2015.1081] 1412.47213

24. Chen, T; Zeng, C; Wang, C. Fault identification for a class of nonlinear systems of canonical form via deterministic learning. IEEE Trans Cybern; 2021; 52, 10 pp. 10957-10968. [DOI: https://dx.doi.org/10.1109/TCYB.2021.3072645] 1497.53121

25. Zeng C, Zhao B, Liu D (2023) Adaptive neural network observer-based fault tolerant control for partially unknown nonlinear systems via adaptive dynamic programming. In: 2023 IEEE 13th International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER), IEEE, pp. 212–216

26. Rahimi, F; Mahboobi Esfanjani, R. Estimating tolerable communication delays for distributed optimization problems in control of heterogeneous multi-agent systems. IET Control Theory & Appl; 2024; 18, 5 pp. 626-639.4736452 [DOI: https://dx.doi.org/10.1049/cth2.12595] 1323.93055

27. Zhao, W; Zhang, H; Yu, W. Distributed optimal output synchronization of heterogeneous multiagent systems with communication delays. Int J Robust Nonlinear Control; 2024; 34, pp. 7821-7836.4773657 [DOI: https://dx.doi.org/10.1002/rnc.7369] 1543.93021

28. Fridman E (2014) Introduction to time-delay systems: analysis and control. In: Systems & control: foundations & applications. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-09393-2. Website: https://link.springer.com/book/10.1007/978-3-319-09393-2

29. Zhang, H; Liu, D; Luo, Y; Wang, D. Adaptive dynamic programming for control: algorithms and stability; 2012; Berlin, Springer: 1279.49017

30. Wang, C; Hill, DJ. Deterministic learning and rapid dynamical pattern recognition. IEEE Trans Neural Networks; 2007; 18, 3 pp. 617-630. [DOI: https://dx.doi.org/10.1109/TNN.2006.889496] 1247.93001

31. Wang, Y; Wang, T; Li, C; Yang, J. Neural network-based optimal fault-tolerant control for interconnected nonlinear systems with actuator failures. IEEE Trans Emerg Top Comput Intell; 2024; 8, pp. 1828-1840. [DOI: https://dx.doi.org/10.1109/TETCI.2024.3358981] 1369.93680

32. Wang, Y; Wang, T; Yang, X; Yang, J. Gradient descent-Barzilai Borwein-based neural network tracking control for nonlinear systems with unknown dynamics. IEEE Trans Neural Netw Learn Syst; 2021; 34, 1 pp. 305-315.4551072 [DOI: https://dx.doi.org/10.1109/TNNLS.2021.3093877] 1377.28003

33. Al-Tamimi, A; Lewis, FL; Abu-Khalaf, M. Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof. IEEE Trans Syst, Man, and Cybernet, Part B (Cybernetics); 2008; 38, 4 pp. 943-949. [DOI: https://dx.doi.org/10.1109/TSMCB.2008.926614] 1210.90170

34. Lewis, FL; Vrabie, D; Syrmos, VL. Optimal control; 2012; Hoboken, John Wiley & Sons: [DOI: https://dx.doi.org/10.1002/9781118122631] 1284.49001

35. Zhang, L; Zhao, X; Zhao, N. Real-time reachable set control for neutral singular Markov jump systems with mixed delays. IEEE Trans Circuits Syst II Express Briefs; 2021; 69, 3 pp. 1367-1371.1369.05221

36. Rahimi, F; Rezaei, H. A distributed fault estimation approach for a class of continuous-time nonlinear networked systems subject to communication delays. IEEE Control Syst Lett; 2021; 6, pp. 295-300.4454081 [DOI: https://dx.doi.org/10.1109/LCSYS.2021.3071478] 1476.37013

Word count: 5400

Show less

Fault-tolerant control for nonlinear time-delay systems using neural network observers

Content area

Abstract

Full text