1. Introduction
Deep learning has achieved breakthroughs in many scientific fields and impacts data analysis, decision making, and pattern recognition. Recently, deep learning methods have been applied to solve partial differential equations (PDEs); in particular, physics-informed neural networks (PINNs) [1,2] have been widely used for this purpose. The main idea is to represent the solution of a PDE with a neural network and to optimize its parameters under the constraint of a physics-informed loss computed via automatic differentiation (AD). In the last few years, PINNs have been employed to solve PDEs from different fields, including problems in mechanical engineering, geophysics [3], vascular fluid dynamics [4,5], and biomedicine [6].
To further enhance the accuracy and efficiency of PINNs, a series of extensions to the original formulation of Raissi et al. [1,7,8,9,10,11] have been proposed. From the perspective of data augmentation, re-sampling methods adaptively change the distribution of residual points during training [7,8], which helps improve the accuracy of PINNs on stiff PDEs. The standard loss function in PINNs is the mean squared error (MSE), which is not always suitable for training when solving PDEs [12,13]. In [9], a scheme for adjusting the weights of the different loss terms was proposed to mitigate gradient pathologies. Causal-PINNs [10] address time-dependent PDEs by adaptively adjusting the loss weights over the temporal domain. Regularization on the differential forms of the PDE has also been shown to improve accuracy compared with the original PINNs [14]. In addition, the architecture of a PINN [9,11,15,16] greatly influences the final prediction, and some works [15,17] have focused on embedding methods, which enhance the input features and can also be combined with soft/hard boundary enforcement [18].
PINNs based on fully connected neural networks are widely used to solve PDEs, since the derivatives appearing in the PDE can be computed directly by automatic differentiation (AD). Other architectures have also been used to solve PDEs, e.g., CNNs [19] and UNets [20]; however, these require a finite-difference approach to approximate the derivatives. Bayesian neural networks (BNNs) [21] and generative adversarial networks (GANs) [22] have likewise been applied to PDE problems. Despite this variety, fully connected feed-forward neural networks remain the most widely used architecture in PINNs, and their hyper-parameters, such as depth, width, and the connections between hidden layers, greatly influence the final results. In [16], adjusting the width and depth of fully connected neural networks (FCNNs) changed the accuracy of PINNs. In [11], a ResNet block was used to enhance the connections between hidden layers and performed better than FCNNs for parameter identification in the Navier–Stokes equations. In [9], a modified neural network was proposed that projects the input variables into a high-dimensional feature space and fuses them through an attention-like mechanism. All these cases show that the chosen architecture is essential to the final predictions of PINNs.
In this paper, we build on the self-adaptive weighting idea of [12]. The main idea of such an adaptive scheme is to attach a bounded trainable weight to each residual point in the residual loss and to adaptively update these pointwise weights during training. The architecture also influences the predicted results and is essential for improving the accuracy of PINN methods. Our contributions in this paper are summarized below:
We develop a constrained self-adaptive physics-informed neural network (cSPINN), which achieves better accuracy in our numerical experiments. Meanwhile, the dynamics of the residual weights change more steadily during training.
To better capture solutions with sharp transitions in the physical domain, we develop a ResNet block-enhanced modified MLP architecture; its identity mappings also help mitigate the vanishing-gradient problem, even for deep architectures.
2. Related Works
In this section, we first introduce the model problem, and then provide a brief overview of physics-informed neural networks (PINNs) for solving forward partial differential equations (PDEs).
2.1. Model Problem
In this subsection, we introduce the model problem. Given the spatial domain $\Omega \subset \mathbb{R}^d$ with boundary $\partial\Omega$ and the temporal domain $t \in [0, T]$, we consider a partial differential equation (PDE) of the general form:
$\mathcal{N}[u](t, x) = 0, \quad (t, x) \in (0, T] \times \Omega,$ (1a)
$u(0, x) = g(x), \quad x \in \Omega,$ (1b)
$\mathcal{B}[u](t, x) = 0, \quad (t, x) \in (0, T] \times \partial\Omega,$ (1c)
where $u(t, x)$ is the unknown solution and $\mathcal{N}[\cdot]$ is a general differential operator, which may include any combination of linear and non-linear terms of temporal and spatial derivatives. The corresponding initial condition at $t = 0$ is given by $g(x)$. In the above, $\mathcal{B}[\cdot]$ is a boundary operator, which could represent Neumann, Robin, or periodic boundary conditions, and enforces the desired condition on the boundary domain $\partial\Omega$.
2.2. PINNs Formulation
To solve the PDE via PINNs [1], we construct a neural network $u_\theta(t, x)$, taking the spatial and temporal inputs $(t, x)$ and having trainable parameters $\theta$, to approximate the solution $u(t, x)$. We then train a physics-informed model by minimizing the following loss function:
$\mathcal{L}(\theta) = \lambda_r \mathcal{L}_r(\theta) + \lambda_{ic} \mathcal{L}_{ic}(\theta) + \lambda_{bc} \mathcal{L}_{bc}(\theta),$ (2)
where
$\mathcal{L}_r(\theta) = \frac{1}{N_r} \sum_{i=1}^{N_r} \left| \mathcal{N}[u_\theta](t_r^i, x_r^i) \right|^2,$ (3a)
$\mathcal{L}_{ic}(\theta) = \frac{1}{N_{ic}} \sum_{i=1}^{N_{ic}} \left| u_\theta(0, x_{ic}^i) - g(x_{ic}^i) \right|^2,$ (3b)
$\mathcal{L}_{bc}(\theta) = \frac{1}{N_{bc}} \sum_{i=1}^{N_{bc}} \left| \mathcal{B}[u_\theta](t_{bc}^i, x_{bc}^i) \right|^2.$ (3c)
Here, $\mathcal{L}_r$, $\mathcal{L}_{ic}$, and $\mathcal{L}_{bc}$ are the loss terms due to the PDE residual in the physical domain, the initial condition, and the boundary condition, respectively. We use $u_\theta$ to denote the output of the neural network, which is parameterized by $\theta$. The weights $\lambda_r$, $\lambda_{ic}$, and $\lambda_{bc}$ influence the convergence rate of the different loss components and the final accuracy of PINNs [9,12]; hence, it is important to use an appropriate weighting strategy during training. The points $\{(t_r^i, x_r^i)\}_{i=1}^{N_r}$, $\{x_{ic}^i\}_{i=1}^{N_{ic}}$, and $\{(t_{bc}^i, x_{bc}^i)\}_{i=1}^{N_{bc}}$ are training points inside the domain, points on the initial slice, and points on the boundary, respectively. In this paper, we use the relative $L^2$ error between the PINN prediction and the reference solution, defined as
$\epsilon = \frac{\left( \sum_{i=1}^{N_{test}} \left| u_\theta(t^i, x^i) - u(t^i, x^i) \right|^2 \right)^{1/2}}{\left( \sum_{i=1}^{N_{test}} \left| u(t^i, x^i) \right|^2 \right)^{1/2}},$
where $u$ is the reference solution and $u_\theta$ is the neural network prediction for a set of $N_{test}$ testing points $\{(t^i, x^i)\}_{i=1}^{N_{test}}$.
3. Formulation of Constrained Self-Adaptive PINNs (cSPINNs)
In this section, we first present the constrained self-adaptive weighting scheme for PINNs, which adaptively adjusts the weights of the residual points during training. We then propose a modified network architecture enhanced by ResNet blocks to further improve the performance of cSPINNs.
3.1. Constrained Self-Adaptive Weighting Scheme
In PINNs, the residual loss $\mathcal{L}_r$ enforces the network to satisfy the governing equation at the sample points inside the domain, i.e., $\mathcal{N}[u_\theta](t_r^i, x_r^i) \approx 0$. However, in the formulation of $\mathcal{L}_r$ above, every residual point receives the same weight, so PINNs cannot focus on the areas that are difficult to learn during training (as shown in Figure 1). One effective way to overcome this is to attach an individually adjustable weight to each residual point, according to the distribution of the residual over the physical domain, and to automatically raise the weights of inner points with relatively high loss values. We can formulate such a self-adaptive adjustment as a min–max optimization problem, i.e.,
$\min_{\theta} \max_{\lambda_r} \; \mathcal{L}(\theta, \lambda_r),$ (4a)
$\text{s.t.} \quad \lambda_r^i \ge 0, \quad \frac{1}{N_r} \sum_{i=1}^{N_r} \lambda_r^i \le C, \quad i = 1, \dots, N_r,$ (4b)
where $\lambda_r = (\lambda_r^1, \dots, \lambda_r^{N_r})$ are trainable adaptive weights for the corresponding residual points $\{(t_r^i, x_r^i)\}_{i=1}^{N_r}$, and $C$ is a constant that constrains the range of the weights. Here, we set $C$ according to the expectation of the weights in the original PINNs. The loss function is formulated as $\mathcal{L}(\theta, \lambda_r) = \mathcal{L}_r^{\lambda}(\theta, \lambda_r) + \lambda_{ic} \mathcal{L}_{ic}(\theta) + \lambda_{bc} \mathcal{L}_{bc}(\theta)$, where $\mathcal{L}_r^{\lambda}(\theta, \lambda_r) = \frac{1}{N_r} \sum_{i=1}^{N_r} \lambda_r^i \left| \mathcal{N}[u_\theta](t_r^i, x_r^i) \right|^2$ is a weighted residual loss and the other terms are the same as in the original formulation. The inner optimization of the min–max problem above is easy to solve: select the residual point with the largest residual loss, attach a weight of value $C$ to it, and set the weights of the other points to zero. However, with such a strategy PINNs would optimize only a single point in each training iteration, which works against balancing the different residual points. In [12], McClenny et al. proposed a self-adaptive method that approximates the inner maximization with a single gradient-ascent step, so that different residual points receive appropriate weights during training. They update the weights $\lambda_r$ during training as:
$\lambda_r^{k+1} = \lambda_r^{k} + \eta_k \nabla_{\lambda_r} \mathcal{L}(\theta^k, \lambda_r^{k}),$
where $\eta_k$ is the learning rate at iteration $k$. However, $\lambda_r$ is then an unbounded weight vector during training, which means that the training of PINNs can become unstable because of the rapidly changing weights. We modify the updating rule of $\lambda_r$ as:
$\tilde{\lambda}_r^{k+1} = \lambda_r^{k} + \eta_k \nabla_{\lambda_r} \mathcal{L}(\theta^k, \lambda_r^{k}),$ (5a)
$\hat{\lambda}_r^{k+1, i} = C \, \frac{\tilde{\lambda}_r^{k+1, i}}{\frac{1}{N_r} \sum_{j=1}^{N_r} \tilde{\lambda}_r^{k+1, j}}, \quad i = 1, \dots, N_r,$ (5b)
$\lambda_r^{k+1} = \alpha \, \lambda_r^{k} + (1 - \alpha) \, \hat{\lambda}_r^{k+1},$ (5c)
where $\lambda_r^{k, i}$ denotes the weight of the $i$-th residual point $(t_r^i, x_r^i)$ at training iteration $k$, $\tilde{\lambda}_r^{k+1}$ is an intermediate variable before normalization, and $\alpha \in [0, 1]$ controls the weighted sum. In other words, we first normalize $\tilde{\lambda}_r^{k+1}$ and then obtain the final $\lambda_r^{k+1}$ as a weighted sum of the previous weights at iteration $k$ and the normalized weights of the current iteration. The framework of cSPINNs is shown in Figure 2.
3.2. ResNet Block-Enhanced Modified Network
Continuous improvement of network architectures is one of the drivers of the development of deep learning and its applications. For example, convolutional neural networks [23,24,25] were designed for, and are widely used in, computer vision tasks such as image classification, image segmentation, and object recognition. Similarly, recurrent neural networks and their variants [26,27,28] show great performance in natural language processing and sequential modeling because of their ability to capture long-term dependencies in sequences. In [9], a modified MLP framework was proposed to better capture the solutions of complex PDEs and has since been widely used. PINNs with ResNet blocks have also been used to improve the representational capacity of the network when solving PDEs [11]. Inspired by these ideas, we propose a ResNet block-enhanced modified MLP to better represent the solution of PDEs. The ResNet block-enhanced modified network is defined as follows:
$U = \sigma(W_U X + b_U), \quad V = \sigma(W_V X + b_V),$ (6a)
$H^1 = \sigma(W^1 X + b^1),$ (6b)
$Z^k = \mathcal{F}(H^k, W^k) + H^k, \quad k = 1, \dots, L,$ (6c)
$H^{k+1} = (1 - Z^k) \odot U + Z^k \odot V, \quad k = 1, \dots, L,$ (6d)
$f_\theta(X) = W^{L+1} H^{L+1} + b^{L+1},$ (6e)
Here, $\sigma$ is an activation function, $\odot$ denotes element-wise multiplication, $f_\theta$ is the final output of the network, and the updating rule for $Z^k$ follows residual learning, which was first proposed in [23] and has achieved great success. We have denoted it above, i.e.,
$Z = \mathcal{F}(X, W) + X.$ (7)
More specifically, $X$ and $Z$ are the input and output of the ResNet block, respectively. $\mathcal{F}$ is an operation consisting of a fully connected layer followed by an activation function, which can be written as $\mathcal{F}(X, W) = \sigma(W X + b)$ for input $X$, hidden-layer parameters $W$ and $b$, and activation function $\sigma$. The element-wise addition $\mathcal{F}(X, W) + X$ is implemented by a shortcut connection. The effectiveness of such a block in PINNs was demonstrated in [11], and here we use it as a feature-enhancing sub-structure of our network. The features fused by the shortcut connection are fed into the update of the hidden layers through element-wise multiplication with $U$ and $V$, as shown in Figure 3. Compared with a plain fully connected neural network, our architecture enhances the representational ability of the hidden layers with ResNet blocks, which makes it easier for the network to learn the desired solution. Meanwhile, the inputs are embedded from the low-dimensional input space into a higher-dimensional feature space and fused by the attention-like mechanism during the forward pass. A minimal sketch of this architecture is given below.
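To make the construction concrete, the following PyTorch sketch shows one possible realization of Equations (6) and (7) and Figure 3. The class names, the placement of the activation inside the ResNet block, and the default hyper-parameters are illustrative assumptions rather than the exact implementation used in this paper.

```python
import torch
import torch.nn as nn

class ResNetBlock(nn.Module):
    """One reading of Eq. (7): Z = F(X, W) + X, with F a fully connected layer plus activation."""
    def __init__(self, width, act):
        super().__init__()
        self.fc = nn.Linear(width, width)
        self.act = act

    def forward(self, x):
        return self.act(self.fc(x)) + x  # shortcut connection (identity mapping)

class ResMNet(nn.Module):
    """ResNet block-enhanced modified MLP, following the structure of Eq. (6)."""
    def __init__(self, in_dim=2, width=64, n_blocks=4, out_dim=1):
        super().__init__()
        self.act = nn.Tanh()
        self.inp = nn.Linear(in_dim, width)   # produces H^1
        self.U = nn.Linear(in_dim, width)     # embeds inputs into a high-dimensional feature space
        self.V = nn.Linear(in_dim, width)
        self.blocks = nn.ModuleList(ResNetBlock(width, self.act) for _ in range(n_blocks))
        self.out = nn.Linear(width, out_dim)

    def forward(self, x):
        u = self.act(self.U(x))
        v = self.act(self.V(x))
        h = self.act(self.inp(x))
        for block in self.blocks:
            z = block(h)                      # ResNet block: F(h) + h
            h = (1.0 - z) * u + z * v         # fuse features with U and V (element-wise)
        return self.out(h)
```

The identity shortcut inside each block keeps gradients flowing through the stack, which is the property we rely on for deeper configurations.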
4. Numerical Experiments
We demonstrate the performance of the proposed cSPINNs by solving several PDE problems. In all of the examples, we use the ResNet block-enhanced modified network with the tanh function as the activation $\sigma$. The architecture has 2 input neurons and consists of 4 ResNet blocks, each with a width of 64. The output layer contains a single neuron for the output/solution of the PDE. A minimal sketch of one training step with the constrained self-adaptive residual weights is given below.
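The sketch below illustrates, in PyTorch, how the constrained self-adaptive weights of Section 3.1 could enter a single optimization step. The constant C, the mixing coefficient alpha, the weight step size lr_lam, and the helper names (csa_train_step, residual_fn) are illustrative assumptions and not the settings used in the experiments.

```python
import torch

def csa_train_step(model, residual_fn, x_r, lam, optimizer, C=1.0, alpha=0.9, lr_lam=1e-2):
    """One illustrative cSPINN step: update the residual weights, then descend on the network.

    lam is a tensor of shape [N_r] holding the per-point weights (e.g., initialized to C);
    residual_fn returns the pointwise PDE residual at the collocation points x_r.
    """
    r2 = residual_fn(model, x_r).pow(2).reshape(-1)   # squared pointwise residuals

    # Constrained self-adaptive update, following the three steps of Eq. (5):
    # (5a) gradient-ascent-like increase proportional to the squared residual,
    # (5b) normalization so that the mean of the weights stays at C,
    # (5c) weighted sum with the previous weights, which keeps the dynamics stable.
    with torch.no_grad():
        lam_tilde = lam + lr_lam * r2
        lam_hat = C * lam_tilde / lam_tilde.mean()
        lam.copy_(alpha * lam + (1.0 - alpha) * lam_hat)

    loss = (lam * r2).mean()      # weighted residual loss; add IC/BC losses here as in Eq. (2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```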
4.1. 1D Allen–Cahn Equation
The Allen–Cahn equation is a stiff PDE whose solution exhibits sharp, time-dependent transitions. We consider the following 1D Allen–Cahn problem:
$u_t - 0.0001 u_{xx} + 5 u^3 - 5 u = 0, \quad t \in [0, 1], \; x \in [-1, 1],$ (8a)
$u(0, x) = x^2 \cos(\pi x),$ (8b)
$u(t, -1) = u(t, 1),$ (8c)
$u_x(t, -1) = u_x(t, 1).$ (8d)
We used the same physics parameters for the Allen–Cahn equation as in [7] to allow a direct comparison of results. For this problem, we used the ResNet block-enhanced modified network described above to better fit the sharp transition. To implement cSPINNs for the Allen–Cahn equation, the following loss terms were used:
The constrained self-adaptive loss for the residual of the governing equation:
$\mathcal{L}_r^{\lambda}(\theta, \lambda_r) = \frac{1}{N_r} \sum_{i=1}^{N_r} \lambda_r^i \left| r_\theta(t_r^i, x_r^i) \right|^2,$ (9a)
$r_\theta(t, x) = \partial_t u_\theta - 0.0001 \, \partial_{xx} u_\theta + 5 u_\theta^3 - 5 u_\theta.$ (9b)
Mean squared loss on the initial condition:
$\mathcal{L}_{ic}(\theta) = \frac{1}{N_{ic}} \sum_{i=1}^{N_{ic}} \left| u_\theta(0, x_{ic}^i) - g(x_{ic}^i) \right|^2,$ (10a)
$g(x) = x^2 \cos(\pi x).$ (10b)
Mean squared loss on the boundary condition:
$\mathcal{L}_{bc}(\theta) = \mathcal{L}_{bc}^{(1)}(\theta) + \mathcal{L}_{bc}^{(2)}(\theta),$ (11a)
$\mathcal{L}_{bc}^{(1)}(\theta) = \frac{1}{N_{bc}} \sum_{i=1}^{N_{bc}} \left| u_\theta(t_{bc}^i, -1) - u_\theta(t_{bc}^i, 1) \right|^2,$ (11b)
$\mathcal{L}_{bc}^{(2)}(\theta) = \frac{1}{N_{bc}} \sum_{i=1}^{N_{bc}} \left| \partial_x u_\theta(t_{bc}^i, -1) - \partial_x u_\theta(t_{bc}^i, 1) \right|^2.$ (11c)
Here, $u_\theta$ is the prediction of the neural network, and we sampled $N_r$ = 25,600 residual points, together with $N_{bc}$ boundary points and $N_{ic}$ points on the initial condition. We used the Adam optimizer for 10,000 epochs followed by the L-BFGS optimizer for 1000 epochs to train the network. During training, we set the boundary weight $\lambda_{bc}$ and the initial weight $\lambda_{ic}$ so as to help expedite convergence. Figure 4 shows the numerical results of constrained self-adaptive PINNs (cSPINNs) compared with the reference solution obtained with the Chebfun method [29], together with the training loss history. The relative $L^2$ error was 1.472 × 10, which is better than the time-adaptive approach in [7] and the original PINNs [1]. A sketch of the Allen–Cahn residual computed via automatic differentiation is given below.
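The residual term $r_\theta$ in Equation (9b) can be evaluated with automatic differentiation. The sketch below shows one way to do this in PyTorch; the coefficients follow the standard benchmark of [7] used above, and the function name is an illustrative assumption.

```python
import torch

def allen_cahn_residual(model, t, x):
    """r = u_t - 0.0001*u_xx + 5*u^3 - 5*u, computed via automatic differentiation."""
    t = t.clone().requires_grad_(True)
    x = x.clone().requires_grad_(True)
    u = model(torch.cat([t, x], dim=1))
    ones = torch.ones_like(u)
    u_t = torch.autograd.grad(u, t, ones, create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, ones, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, torch.ones_like(u_x), create_graph=True)[0]
    return u_t - 1e-4 * u_xx + 5.0 * u**3 - 5.0 * u

# Usage: t_r, x_r are column tensors of shape [N_r, 1] sampled in [0, 1] x [-1, 1]
# r = allen_cahn_residual(model, t_r, x_r)
```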
4.2. 1D Viscous Burgers’ Equation
The viscous Burgers' equation is widely used in various areas of applied mathematics, such as fluid mechanics, traffic flow, and gas dynamics. The 1D viscous Burgers' equation can be written as:
$u_t + u u_x - \frac{0.01}{\pi} u_{xx} = 0, \quad t \in [0, 1], \; x \in [-1, 1],$ (12a)
$u(0, x) = -\sin(\pi x),$ (12b)
$u(t, -1) = u(t, 1) = 0.$ (12c)
To allow a direct comparison of results, we used the same physics parameters for the Burgers' equation as in [1]. To implement the cSPINN scheme for the Burgers' equation, the following loss terms were used, as described in the formulation of cSPINNs:
The constrained self-adaptive loss for the residual of the governing equation:
$\mathcal{L}_r^{\lambda}(\theta, \lambda_r) = \frac{1}{N_r} \sum_{i=1}^{N_r} \lambda_r^i \left| r_\theta(t_r^i, x_r^i) \right|^2,$ (13a)
$r_\theta(t, x) = \partial_t u_\theta + u_\theta \, \partial_x u_\theta - \frac{0.01}{\pi} \partial_{xx} u_\theta.$ (13b)
Mean squared loss on the initial condition:
$\mathcal{L}_{ic}(\theta) = \frac{1}{N_{ic}} \sum_{i=1}^{N_{ic}} \left| u_\theta(0, x_{ic}^i) - g(x_{ic}^i) \right|^2,$ (14a)
$g(x) = -\sin(\pi x).$ (14b)
Mean squared loss on the boundary conditions:
$\mathcal{L}_{bc}(\theta) = \mathcal{L}_{bc}^{(1)}(\theta) + \mathcal{L}_{bc}^{(2)}(\theta),$ (15a)
$\mathcal{L}_{bc}^{(1)}(\theta) = \frac{1}{N_{bc}} \sum_{i=1}^{N_{bc}} \left| u_\theta(t_{bc}^i, -1) \right|^2,$ (15b)
$\mathcal{L}_{bc}^{(2)}(\theta) = \frac{1}{N_{bc}} \sum_{i=1}^{N_{bc}} \left| u_\theta(t_{bc}^i, 1) \right|^2.$ (15c)
Here, we trained the network with the constrained self-adaptive scheme, and $u_\theta$ is the prediction of the neural network. In this case, we sampled $N_r$ = 25,600 residual points, together with $N_{bc}$ boundary points and $N_{ic}$ points on the initial condition, and we set fixed weights for the initial-condition and boundary-condition losses. Training was performed with 10,000 Adam iterations followed by 1000 L-BFGS epochs; a sketch of this two-stage schedule is given below. The predicted solution of cSPINNs and the loss history are shown in Figure 5. Despite a sharp transition in the center of the domain, the cSPINN solution remained accurate over the whole domain, yielding a relative $L^2$ error of 4.796 × 10.
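All experiments in this section use the same two-stage optimization: Adam first, followed by L-BFGS. The sketch below illustrates that schedule in PyTorch; the learning rate and the L-BFGS settings are illustrative assumptions, while the epoch counts match those reported here.

```python
import torch

def train_two_stage(model, total_loss_fn, adam_epochs=10_000, lbfgs_iters=1_000):
    """Stage 1: Adam on the full physics-informed loss; Stage 2: L-BFGS refinement."""
    adam = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(adam_epochs):
        adam.zero_grad()
        loss = total_loss_fn(model)      # residual + initial + boundary terms, Eq. (2)
        loss.backward()
        adam.step()

    lbfgs = torch.optim.LBFGS(model.parameters(), max_iter=lbfgs_iters,
                              line_search_fn="strong_wolfe")

    def closure():
        lbfgs.zero_grad()
        loss = total_loss_fn(model)
        loss.backward()
        return loss

    lbfgs.step(closure)                  # full-batch quasi-Newton refinement
    return model
```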
4.3. 2D Helmholtz Equation
The Helmholtz equation is widely used to describe wave propagation and can be formulated as follows:
$\Delta u(x, y) + k^2 u(x, y) = q(x, y), \quad (x, y) \in \Omega,$ (16a)
$u(x, y) = 0, \quad (x, y) \in \partial\Omega,$ (16b)
where $\Delta = \partial_{xx} + \partial_{yy}$ is the Laplace operator, $k$ is the wavenumber, and
$q(x, y) = \left( k^2 - (a_1 \pi)^2 - (a_2 \pi)^2 \right) \sin(a_1 \pi x) \sin(a_2 \pi y)$ (17)
is a forcing term that results in the closed-form analytical solution
$u(x, y) = \sin(a_1 \pi x) \sin(a_2 \pi y).$ (18)
The exact solution above is the same as in [9], which allows a direct comparison of results. The loss terms are as follows:
- The constrained self-adaptive loss for the residual of the governing equation:
$\mathcal{L}_r^{\lambda}(\theta, \lambda_r) = \frac{1}{N_r} \sum_{i=1}^{N_r} \lambda_r^i \left| r_\theta(x_r^i, y_r^i) \right|^2,$ (19a)
$r_\theta(x, y) = \Delta u_\theta(x, y) + k^2 u_\theta(x, y) - q(x, y).$ (19b)
- Mean squared loss on the boundary conditions:
$\mathcal{L}_{bc}(\theta) = \mathcal{L}_{bc}^{(1)} + \mathcal{L}_{bc}^{(2)} + \mathcal{L}_{bc}^{(3)} + \mathcal{L}_{bc}^{(4)},$ (20a)
$\mathcal{L}_{bc}^{(1)} = \frac{1}{N_{bc}} \sum_{i=1}^{N_{bc}} \left| u_\theta(x_{\min}, y_{bc}^i) \right|^2,$ (20b)
$\mathcal{L}_{bc}^{(2)} = \frac{1}{N_{bc}} \sum_{i=1}^{N_{bc}} \left| u_\theta(x_{\max}, y_{bc}^i) \right|^2,$ (20c)
$\mathcal{L}_{bc}^{(3)} = \frac{1}{N_{bc}} \sum_{i=1}^{N_{bc}} \left| u_\theta(x_{bc}^i, y_{\min}) \right|^2,$ (20d)
$\mathcal{L}_{bc}^{(4)} = \frac{1}{N_{bc}} \sum_{i=1}^{N_{bc}} \left| u_\theta(x_{bc}^i, y_{\max}) \right|^2,$ (20e)
where $\mathcal{L}_{bc}^{(1)}, \dots, \mathcal{L}_{bc}^{(4)}$ collect the contributions of the four edges of the rectangular domain.
Here, we solved the problem with the same wavenumbers $a_1$ and $a_2$ as reported in [9] to allow a direct comparison with their results. In this case, the ResNet block-enhanced modified network was trained with 10,000 Adam epochs and 1000 L-BFGS epochs. As for the training points, we sampled $N_r$ = 10,000 residual points and $N_{bc}$ = 400 boundary points (100 points per boundary). The prediction results of cSPINNs are shown in Figure 6. Finally, we achieved a relative $L^2$ error of 1.626 × 10, which exhibits better performance than the learning-rate annealing weighting scheme proposed in [9] and the self-adaptive PINNs proposed in [12]. Meanwhile, our method requires less computational cost, due to the stability of the design of the self-adaptive weights. A sketch of the error evaluation against the analytical solution is given below.
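Since the Helmholtz problem has a closed-form solution, the relative $L^2$ error against Equation (18) can be evaluated on a dense test grid. The sketch below shows one way to do this; the wavenumbers a1 = 1 and a2 = 4 are assumptions borrowed from the benchmark in [9] (the exact values used here are not restated in this excerpt), and the domain and grid resolution are illustrative.

```python
import torch

def relative_l2_error(u_pred, u_ref):
    """Relative L2 error between the prediction and the reference on a set of test points."""
    return torch.linalg.norm(u_pred - u_ref) / torch.linalg.norm(u_ref)

# Illustrative evaluation grid on [-1, 1]^2; a1 = 1, a2 = 4 follow the benchmark of [9].
a1, a2 = 1.0, 4.0
xs = torch.linspace(-1.0, 1.0, 256)
X, Y = torch.meshgrid(xs, xs, indexing="ij")
pts = torch.stack([X.flatten(), Y.flatten()], dim=1)
u_exact = torch.sin(a1 * torch.pi * pts[:, :1]) * torch.sin(a2 * torch.pi * pts[:, 1:2])

# u_pred = model(pts)                      # trained cSPINN evaluated on the grid
# print(relative_l2_error(u_pred, u_exact).item())
```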
4.4. 2D Poisson Equation on Different Geometries
Poisson's equation is an elliptic partial differential equation widely used to describe potential fields. The 2D Poisson problem can be written as follows:
$\Delta u(x, y) = f(x, y), \quad (x, y) \in \Omega,$ (21)
$u(x, y) = g(x, y), \quad (x, y) \in \partial\Omega.$ (22)
To further demonstrate the performance of cSPINNs, we used a periodic exact solution and obtained $f$ and $g$ directly from it as:
$f(x, y) = \Delta u_{exact}(x, y), \quad (x, y) \in \Omega,$ (23)
$g(x, y) = u_{exact}(x, y), \quad (x, y) \in \partial\Omega,$ (24)
where $u_{exact}$ denotes the chosen periodic exact solution.
Then, we had the following loss terms
- The constrained self-adaptive loss for the residual of the governing equation:
$\mathcal{L}_r^{\lambda}(\theta, \lambda_r) = \frac{1}{N_r} \sum_{i=1}^{N_r} \lambda_r^i \left| r_\theta(x_r^i, y_r^i) \right|^2,$ (25a)
$r_\theta(x, y) = \Delta u_\theta(x, y) - f(x, y).$ (25b)
- Mean squared loss on the boundary conditions (taking the rectangular area as an example):
$\mathcal{L}_{bc}(\theta) = \mathcal{L}_{bc}^{(1)} + \mathcal{L}_{bc}^{(2)} + \mathcal{L}_{bc}^{(3)} + \mathcal{L}_{bc}^{(4)},$ (26a)
$\mathcal{L}_{bc}^{(1)} = \frac{1}{N_{bc}} \sum_{i=1}^{N_{bc}} \left| u_\theta(x_{\min}, y_{bc}^i) - g(x_{\min}, y_{bc}^i) \right|^2,$ (26b)
$\mathcal{L}_{bc}^{(2)} = \frac{1}{N_{bc}} \sum_{i=1}^{N_{bc}} \left| u_\theta(x_{\max}, y_{bc}^i) - g(x_{\max}, y_{bc}^i) \right|^2,$ (26c)
$\mathcal{L}_{bc}^{(3)} = \frac{1}{N_{bc}} \sum_{i=1}^{N_{bc}} \left| u_\theta(x_{bc}^i, y_{\min}) - g(x_{bc}^i, y_{\min}) \right|^2,$ (26d)
$\mathcal{L}_{bc}^{(4)} = \frac{1}{N_{bc}} \sum_{i=1}^{N_{bc}} \left| u_\theta(x_{bc}^i, y_{\max}) - g(x_{bc}^i, y_{\max}) \right|^2,$ (26e)
where $x_{\min}, x_{\max}, y_{\min}, y_{\max}$ denote the edges of the rectangular domain.
In this case, we first tested the performance of cSPINNs on a rectangular domain. We sampled $N_r$ = 10,000 residual points in the inner domain and $N_{bc}$ = 1000 boundary points distributed on the boundary. The ResNet block-enhanced modified network was trained with 10,000 Adam epochs and 1000 L-BFGS epochs. Meanwhile, different geometries, including circular, triangular, and pentagonal domains, were also tested to demonstrate the advantages of cSPINNs. The errors between the predicted solution and the reference solution on the different geometries are shown in Table 1. It is worth noting that we magnified the loss value by a constant c = 10,000, because the true values of the solution are relatively small (the maximum value of the exact solution is about 0.003), to ensure a normal backward gradient during training. For the irregular domains, we sampled the same number of points as for the rectangular domain; a sketch of one way to generate such points is given below. We found that cSPINNs achieved good performance in this problem, as shown in Figure 7, whereas the original PINNs failed, as shown in Figure 1.
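One simple way to generate a fixed number of residual points inside an arbitrary 2D geometry is rejection sampling from a bounding box; the paper does not restate its sampling procedure here, so the predicates and point counts below are purely illustrative.

```python
import torch

def sample_inside(n, inside_fn, lo=-1.0, hi=1.0):
    """Rejection sampling: draw uniform candidates in the bounding box and keep those inside."""
    kept = []
    while sum(p.shape[0] for p in kept) < n:
        cand = lo + (hi - lo) * torch.rand(4 * n, 2)
        kept.append(cand[inside_fn(cand)])
    return torch.cat(kept)[:n]

# Illustrative geometry predicates: a unit disk and a triangle with vertices (-1,-1), (1,-1), (0,1).
in_circle = lambda p: (p ** 2).sum(dim=1) <= 1.0
in_triangle = lambda p: (p[:, 1] >= -1.0) & (p[:, 1] <= 2 * p[:, 0] + 1) & (p[:, 1] <= -2 * p[:, 0] + 1)

x_r_circle = sample_inside(10_000, in_circle)
x_r_triangle = sample_inside(10_000, in_triangle)
```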
Moreover, we provide comparison results between cSPINNs and the reference solution on a second domain, which is hard for PINNs to solve due to the high frequency of the solution, as shown in Figure 8. The relative errors between the predicted and the exact solution obtained by cSPINNs on the different geometries are shown in Table 2. To further test the performance of cSPINNs, we also provide numerical results on the L-shaped domain, a classic concave geometry. In this case, we set $f$ and $g$ to allow a direct comparison with PINNs, as in [30]. The loss terms are as follows:
The constrained self-adaptive loss for the residual of the governing equation:
$\mathcal{L}_r^{\lambda}(\theta, \lambda_r) = \frac{1}{N_r} \sum_{i=1}^{N_r} \lambda_r^i \left| r_\theta(x_r^i, y_r^i) \right|^2,$ (27a)
$r_\theta(x, y) = \Delta u_\theta(x, y) - f(x, y).$ (27b)
Mean squared loss on the boundary condition:
$\mathcal{L}_{bc}(\theta) = \frac{1}{N_{bc}} \sum_{i=1}^{N_{bc}} \left| u_\theta(x_{bc}^i, y_{bc}^i) - g(x_{bc}^i, y_{bc}^i) \right|^2,$ (28a)
$(x_{bc}^i, y_{bc}^i) \in \partial\Omega_L, \quad i = 1, \dots, N_{bc},$ (28b)
where $\partial\Omega_L$ denotes the boundary of the L-shaped domain.
We tested the performance of cSPINNs on the L-shaped domain and show the results in Figure 9. The maximum point-wise error was about 6 × 10 and the relative $L^2$ error was 4.257 × 10. In [30], PINNs achieved accurate results with a maximum point-wise error of about 0.02 for the same case on the L-shaped domain; in [31], hp-VPINNs were also tested on this case and achieved a maximum point-wise error of about 0.02. Therefore, cSPINNs also perform well even on such a concave geometry. A sketch of one way to sample collocation points in an L-shaped domain is given below.
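For completeness, the sketch below samples collocation points in an L-shaped domain by rejection from its bounding square. The removed quadrant and the number of points are illustrative assumptions; the exact geometry used in [30] may differ.

```python
import torch

def sample_l_shape(n):
    """Uniform points in the square [-1, 1]^2 with the quadrant (0, 1] x (0, 1] removed."""
    kept = []
    while sum(p.shape[0] for p in kept) < n:
        cand = -1.0 + 2.0 * torch.rand(4 * n, 2)          # candidates in the bounding square
        inside = ~((cand[:, 0] > 0) & (cand[:, 1] > 0))   # drop the removed quadrant
        kept.append(cand[inside])
    return torch.cat(kept)[:n]

x_r = sample_l_shape(10_000)   # illustrative number of residual points
```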
4.5. High Dimensional Poisson Equation
Consider Poisson’s equation in high dimension (d = 10):
$-\Delta u(x) = f(x), \quad x \in \Omega \subset \mathbb{R}^{10},$ (29)
$u(x) = g(x), \quad x \in \partial\Omega.$ (30)
The exact solution of this problem is known in closed form, and we computed the error of cSPINNs against this exact solution.
- The constrained self-adaptive loss for the residual of the governing equation:
$\mathcal{L}_r^{\lambda}(\theta, \lambda_r) = \frac{1}{N_r} \sum_{i=1}^{N_r} \lambda_r^i \left| r_\theta(x_r^i) \right|^2,$ (31a)
$r_\theta(x) = -\Delta u_\theta(x) - f(x).$ (31b)
- Mean squared loss on the boundary condition:
$\mathcal{L}_{bc}(\theta) = \frac{1}{N_{bc}} \sum_{i=1}^{N_{bc}} \left| u_\theta(x_{bc}^i) - g(x_{bc}^i) \right|^2,$ (32a)
$x_{bc}^i \in \partial\Omega, \quad i = 1, \dots, N_{bc}.$ (32b)
We computed the relative $L^2$ error between the cSPINN solution and the exact solution. The relative error of cSPINNs was 1.028 × 10, which is smaller than that of the Deep Ritz method [32] (about 0.4%). In this case, we sampled $N_r$ = 1000 residual points in the inner domain and $N_{bc}$ = 100 boundary points distributed on the boundary. The ResNet block-enhanced modified network was trained with 20,000 Adam epochs and 1000 L-BFGS epochs. The training loss history of cSPINNs is shown in Figure 10. A sketch of how the high-dimensional Laplacian needed for the residual can be computed via automatic differentiation is given below. Finally, we also provide the relative computational cost of cSPINNs and PINNs for the different cases in Table 3. When measuring the cost, we configured cSPINNs and PINNs with the same network depth, width, and number of training epochs for a fair comparison.
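In ten dimensions, the Poisson residual requires the Laplacian of the network output. The sketch below computes it with nested automatic differentiation in PyTorch; the source term f and the unit-hypercube sampling in the usage comment are placeholders, since the exact problem data are not restated in this excerpt.

```python
import torch

def laplacian(model, x):
    """Laplacian of the scalar network output with respect to a d-dimensional input."""
    x = x.clone().requires_grad_(True)
    u = model(x)                                                        # shape [N, 1]
    (grad_u,) = torch.autograd.grad(u.sum(), x, create_graph=True)      # shape [N, d]
    lap = torch.zeros_like(u)
    for i in range(x.shape[1]):                                         # one pass per dimension
        (g2,) = torch.autograd.grad(grad_u[:, i].sum(), x, create_graph=True)
        lap = lap + g2[:, i : i + 1]                                    # accumulate d^2 u / dx_i^2
    return lap

# Usage (placeholders): 1000 residual points in a 10-dimensional unit hypercube
# x_r = torch.rand(1000, 10)
# residual = -laplacian(model, x_r) - f(x_r)   # Poisson residual, Eq. (31b)
```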
At the end of this section, we also tested the impact of three architectures: the multilayer perceptron (MLP), the modified multilayer perceptron (MMLP), and the ResNet block-enhanced modified network (ResMNet). During the test, the depth and width of all networks were fixed at 6 and 128, respectively. We also provide results demonstrating the effectiveness of the proposed constrained self-adaptive weighting scheme (cSA) compared to the standard $L^2$ loss. As can be seen in Table 4, the ResMNet yielded the highest accuracy compared with the MLP and the MMLP. Therefore, the constrained self-adaptive weighting scheme (cSA) together with the ResNet block-enhanced modified network (ResMNet) is the configuration adopted in cSPINNs.
5. Conclusions and Future Work
In this paper, we proposed constrained self-adaptive PINNs (cSPINNs), which adaptively adjust the weights of individual residual points and become more robust during training thanks to the bounded weights. Meanwhile, a ResNet block-enhanced modified neural network was also proposed to enhance the predictive ability of PINNs.
We demonstrated the effectiveness of our method in solving various PDEs, including the Allen–Cahn equation, the Burgers' equation, the Poisson equation, and the Helmholtz equation. Our method showed good performance in all the cases considered and outperformed PINNs, especially on the Poisson equation with a periodic solution, regardless of the geometry of the computational domain. Even with a sharp transition in the physical domain, cSPINNs remained robust when solving the Allen–Cahn equation, which is difficult for the original PINNs. Compared with PINNs, cSPINNs improve accuracy and can be implemented with just a few lines of code, which makes it possible to combine our method with other extensions of PINNs to further improve performance.
The constrained self-adaptive weighting scheme attaches higher weights to regions that are difficult to learn during training, which makes it possible to solve complicated problems. In this paper, we provided numerical results for cSPINNs on the 10D Poisson equation and achieved better performance than the Deep Ritz method. In the future, we will further generalize cSPINNs to solve higher-dimensional PDEs and multi-physics problems.
Methodology, G.Z.; Software, G.Z., H.Y., G.P., Y.D. and F.Z.; Writing — original draft, G.Z.; Writing — review & editing, G.Z.; Project administration, G.Z., H.Y. and Y.C. All authors have read and agreed to the published version of the manuscript.
The authors declare no conflict of interest.
Figure 1. PINNs results for solving Allen–Cahn Equation (The first row) and 2D Poisson’s Equation (The second row).
Figure 2. The framework of constrained self-adaptive physics-informed neural networks. A neural network outputs the solution u of the PDE; we then compute the derivatives by automatic differentiation and obtain the losses of the initial conditions, boundary conditions, and governing equation. The parameters of the neural network and the constrained self-adaptive weights for the residual points are updated at the same time by gradient-based optimization.
Figure 3. The ResNet block-enhanced modified architecture for physics-informed neural networks.
Figure 4. cSPINNs results (with the ResNet block-enhanced modified neural network) for solving the 1D Allen–Cahn equation.
Figure 5. cSPINNs results (with ResNet block-enhanced modified neural network) for solving 1D viscous Burgers equation.
Figure 6. cSPINNs results (with ResNet block-enhanced modified neural network) for solving the 2D Helmholtz equation with an exact solution.
Figure 7. cSPINNs results (with ResNet block-enhanced modified neural network) for solving the 2D Poisson equation with the periodic exact solution.
Figure 8. cSPINNs results (with ResNet block-enhanced modified neural network) for solving the 2D Poisson equation with the higher-frequency exact solution.
Figure 9. cSPINNs results (with ResNet block-enhanced modified neural network) for solving the L-shaped Poisson equation with the reference solution.
Figure 10. cSPINNs training loss for the high-dimensional Poisson equation (with the ResNet block-enhanced modified neural network).
Table 1. The 2D Poisson equation: relative $L^2$ errors on different geometries.

| Geometry | Relative $L^2$ error (cSPINNs) | Relative $L^2$ error (PINNs) |
|---|---|---|
| Rectangular area | 4.844 × 10 | NaN |
| Circular area | 4.349 × 10 | 2.258 × 10 |
| Triangle area | 3.375 × 10 | 8.735 × 10 |
| Pentagon area | 4.304 × 10 | 1.527 × 10 |
Table 2. The 2D Poisson equation with the higher-frequency solution: relative $L^2$ errors on different geometries.

| Geometry | Relative $L^2$ error (cSPINNs) | Relative $L^2$ error (PINNs) |
|---|---|---|
| Rectangular | 2.522 × 10 | NaN |
| Circular | 1.748 × 10 | 1.077 × 10 |
| Triangular | 4.154 × 10 | 1.418 × 10 |
| Pentagonal | 2.329 × 10 | 6.309 × 10 |
Table 3. The relative computational cost of solving different PDEs with cSPINNs and PINNs.

| Equation | PINNs | cSPINNs |
|---|---|---|
| 1D Allen–Cahn | 1.000 | 2.123 |
| 1D Burgers | 1.000 | 2.346 |
| 2D Helmholtz | 1.000 | 2.121 |
| 2D Poisson | 1.000 | 2.101 |
| HD Poisson | 1.000 | 2.709 |
Table 4. Relative $L^2$ errors for different combinations of loss scheme and network architecture.

| Equation | L2 loss + MLP | L2 loss + ResMNet | cSA + MMLP | cSA + ResMNet |
|---|---|---|---|---|
| 1D Allen–Cahn | 9.63 × 10 | 2.38 × 10 | 1.61 × 10 | 1.16 × 10 |
| 1D Burgers | 6.70 × 10 | 4.68 × 10 | 3.80 × 10 | 2.70 × 10 |
| 2D Helmholtz | 4.69 × 10 | 3.61 × 10 | 1.01 × 10 | 9.50 × 10 |
| 2D Poisson (Rectangular) | 3.58 × 10 | 1.64 × 10 | 1.12 × 10 | 1.94 × 10 |
| 2D Poisson (L-shaped) | 5.65 × 10 | 4.82 × 10 | 4.26 × 10 | 3.78 × 10 |
| HD Poisson (d = 10) | 1.33 × 10 | 1.01 × 10 | 1.25 × 10 | 8.62 × 10 |
References
1. Raissi, M.; Perdikaris, P.; Karniadakis, G. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys.; 2019; 378, pp. 686-707. [DOI: https://dx.doi.org/10.1016/j.jcp.2018.10.045]
2. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics Informed Deep Learning (Part II): Data-driven Discovery of Nonlinear Partial Differential Equations. arXiv; 2017; arXiv: 1711.10566
3. Song, C.; Alkhalifah, T.; Waheed, U.B. Solving the frequency-domain acoustic VTI wave equation using physics-informed neural networks. Geophys. J. Int.; 2021; 225, pp. 846-859. [DOI: https://dx.doi.org/10.1093/gji/ggab010]
4. Zheng, X.; Yazdani, A.; Li, H.; Humphrey, J.D.; Karniadakis, G.E. A three-dimensional phase-field model for multiscale modeling of thrombus biomechanics in blood vessels. PLoS Comput. Biol.; 2020; 16, e1007709. [DOI: https://dx.doi.org/10.1371/journal.pcbi.1007709] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/32343724]
5. Yin, M.; Zheng, X.; Humphrey, J.D.; Karniadakis, G.E. Non-invasive inference of thrombus material properties with physics-informed neural networks. Comput. Methods Appl. Mech. Eng.; 2021; 375, 113603. [DOI: https://dx.doi.org/10.1016/j.cma.2020.113603]
6. Sahli Costabal, F.; Yang, Y.; Perdikaris, P.; Hurtado, D.E.; Kuhl, E. Physics-informed neural networks for cardiac activation mapping. Front. Phys.; 2020; 8, 42. [DOI: https://dx.doi.org/10.3389/fphy.2020.00042]
7. Wight, C.L.; Zhao, J. Solving Allen–Cahn and Cahn–Hilliard Equations Using the Adaptive Physics Informed Neural Networks. arXiv; 2020; arXiv: 2007.04542
8. Nabian, M.A.; Gladstone, R.; Meidani, H. Efficient training of physics-informed neural networks via importance sampling. arXiv; 2021; arXiv: 2104.12325[DOI: https://dx.doi.org/10.1111/mice.12685]
9. Wang, S.; Teng, Y.; Perdikaris, P. Understanding and Mitigating Gradient Flow Pathologies in Physics-Informed Neural Networks. SIAM J. Sci. Comput.; 2021; 43, pp. A3055-A3081. [DOI: https://dx.doi.org/10.1137/20M1318043]
10. Wang, S.; Sankaran, S.; Perdikaris, P. Respecting causality is all you need for training physics-informed neural networks. arXiv; 2022; arXiv: 2203.07404
11. Cheng, C.; Zhang, G.T. Deep Learning Method Based on Physics Informed Neural Network with Resnet Block for Solving Fluid Flow Problems. Water; 2021; 13, 423. [DOI: https://dx.doi.org/10.3390/w13040423]
12. McClenny, L.; Braga-Neto, U. Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv; 2020; arXiv: 2009.04544
13. Wang, C.; Li, S.; He, D.; Wang, L. Is L2 Physics-Informed Loss Always Suitable for Training Physics-Informed Neural Network?. arXiv; 2022; arXiv: 2206.02016
14. Yu, J.; Lu, L.; Meng, X.; Karniadakis, G.E. Gradient-enhanced physics-informed neural networks for forward and inverse PDE problems. Comput. Methods Appl. Mech. Eng.; 2022; 393, 114823. [DOI: https://dx.doi.org/10.1016/j.cma.2022.114823]
15. Cai, W.; Li, X.; Liu, L. A Phase Shift Deep Neural Network for High Frequency Approximation and Wave Problems. SIAM J. Sci. Comput.; 2020; 42, pp. A3285-A3312. [DOI: https://dx.doi.org/10.1137/19M1310050]
16. Wang, Y.; Han, X.; Chang, C.Y.; Zha, D.; Braga-Neto, U.; Hu, X. Auto-PINN: Understanding and Optimizing Physics-Informed Neural Architecture. arXiv; 2022; arXiv: 2205.13748
17. Wang, S.; Wang, H.; Perdikaris, P. On the eigenvector bias of Fourier feature networks: From regression to solving multi-scale PDEs with physics-informed neural networks. Comput. Methods Appl. Mech. Eng.; 2021; 384, 113938. [DOI: https://dx.doi.org/10.1016/j.cma.2021.113938]
18. Dong, S.; Ni, N. A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. J. Comput. Phys.; 2021; 435, 110242. [DOI: https://dx.doi.org/10.1016/j.jcp.2021.110242]
19. Gao, H.; Sun, L.; Wang, J.X. PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. J. Comput. Phys.; 2021; 428, 110079. [DOI: https://dx.doi.org/10.1016/j.jcp.2020.110079]
20. Wandel, N.; Weinmann, M.; Klein, R. Learning Incompressible Fluid Dynamics from Scratch - Towards Fast, Differentiable Fluid Models that Generalize. arXiv; 2021; arXiv: 2006.08762
21. Yang, L.; Meng, X.; Karniadakis, G.E. B-PINNs: Bayesian physics-informed neural networks for forward and inverse PDE problems with noisy data. J. Comput. Phys.; 2021; 425, 109913. [DOI: https://dx.doi.org/10.1016/j.jcp.2020.109913]
22. Yang, L.; Zhang, D.; Karniadakis, G.E. Physics-Informed Generative Adversarial Networks for Stochastic Differential Equations. SIAM J. Sci. Comput.; 2020; 42, pp. A292-A317. [DOI: https://dx.doi.org/10.1137/18M1225409]
23. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Las Vegas, NV, USA, 27–30 June 2016.
24. Tan, M.; Le, Q. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. Proceedings of the 36th International Conference on Machine Learning; Long Beach, CA, USA, 9–15 June 2019.
25. Chen, J.; Lu, Y.; Yu, Q.; Luo, X.; Adeli, E.; Wang, Y.; Lu, L.; Yuille, A.L.; Zhou, Y. TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation. arXiv; 2021; arXiv: 2102.04306
26. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput.; 1997; 9, pp. 1735-1780. [DOI: https://dx.doi.org/10.1162/neco.1997.9.8.1735] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/9377276]
27. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is All you Need. Proceedings of the Advances in Neural Information Processing Systems; Long Beach, CA, USA, 4–9 December 2017.
28. Su, Y.; Zhao, Y.; Niu, C.; Liu, R.; Sun, W.; Pei, D. Robust anomaly detection for multivariate time series through stochastic recurrent neural network. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; Anchorage, AK, USA, 4–8 August 2019.
29. Berland, H.; Skaflestad, B.; Wright, W.M. EXPINT—A MATLAB package for exponential integrators. ACM Trans. Math. Softw.; 2007; 33, 4-es. [DOI: https://dx.doi.org/10.1145/1206040.1206044]
30. Lu, L.; Meng, X.; Mao, Z.; Karniadakis, G.E. DeepXDE: A deep learning library for solving differential equations. SIAM Rev.; 2021; 63, pp. 208-228. [DOI: https://dx.doi.org/10.1137/19M1274067]
31. Kharazmi, E.; Zhang, Z.; Karniadakis, G.E. hp-VPINNs: Variational physics-informed neural networks with domain decomposition. Comput. Methods Appl. Mech. Eng.; 2021; 374, 113547. [DOI: https://dx.doi.org/10.1016/j.cma.2020.113547]
32. Weinan, E.; Yu, B. The Deep Ritz Method: A Deep Learning-Based Numerical Algorithm for Solving Variational Problems. Commun. Math. Stat.; 2018; 6, pp. 1-12. [DOI: https://dx.doi.org/10.1007/s40304-018-0127-z]
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Abstract
Physics-informed neural networks (PINNs) have been widely adopted to solve partial differential equations (PDEs) and to simulate physical systems. However, the accuracy of PINNs does not always meet the needs of industry and degrades severely, especially when the PDE solution has sharp transitions. In this paper, we propose a ResNet block-enhanced network architecture to better capture such transitions. Meanwhile, a constrained self-adaptive PINN (cSPINN) scheme is developed to shift the PINN's focus toward the areas of the physical domain that are difficult to learn. To demonstrate the performance of our method, we present numerical experiments on the Allen–Cahn equation, the Burgers' equation, and the Helmholtz equation. We also show results of solving the Poisson equation using cSPINNs on different geometries to demonstrate their strong geometric adaptivity. Finally, we provide the performance of cSPINNs on a high-dimensional Poisson equation to further demonstrate the ability of our method.
Details
1 Department of Mathematics, Faculty of Science and Technology, University of Macau, Macau 999078, China; SandGold AI Research, Guangzhou 510006, China
2 SandGold AI Research, Guangzhou 510006, China; College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510006, China
3 Department of Mathematics, Faculty of Science and Technology, University of Macau, Macau 999078, China
4 SandGold AI Research, Guangzhou 510006, China; Faculty of Innovation Engineering, Macau University of Science and Technology, Macau 999078, China