Approximating Partial Differential Equations with

Full text

Turn on search term navigation

1. Introduction

PDEs, as a powerful mathematical tool in classical physics, enable the comprehensive depiction, simulation, and modeling of various dynamic phenomena, including heat flow, diffusion processes, wave propagation, and the motion of fluid substances [1], owing to their inherent capability to capture the fundamental characteristics of intricate systems. By solving PDEs, we can obtain insights into the behavior of the system and make predictions about its future evolution. However, obtaining the analytical solutions of PDEs is challenging. To address this issue, conventional numerical methods, like the finite volume method (FVM) [2], the finite difference method (FDM) [3], and the finite element method (FEM) [4], are utilized to approximate the solutions. Despite the strength and rigor of these approaches, they may experience an exponential rise in computational complexity with grid refinement as well as the “curse of dimensionality” as the independent variable’s dimension increases.

With the rapid development of deep learning (DL)—a form of machine learning (ML) that utilizes NNs with multiple hidden layers [5]—it has emerged as a new paradigm in scientific computing due to its universal approximation [6] and expressive capabilities. Solving PDEs via DL has gained significant momentum. Recent research has shown that DL holds promise in building meta-models for efficiently predicting dynamic systems. By training deep neural networks (DNNs), especially with Artificial Intelligence (AI)-based approaches, systems or even families of PDEs can be expressed, leading to exponential advancements in computing efficiency during practical usage. Unlike traditional numerical methods, DNN-based approaches can simultaneously approximate target solutions in multiple dimensions, including the time dimension. They can learn direct mappings between physical states and spatial/temporal coordinates without the need for repeated iterations at each time step [7].

Currently, an increasing number of researchers are utilizing DL methods to investigate PDEs [6,8]. Among the various studies, an important contribution that cannot be overlooked is the PINNs proposed by Raissi et al. [9]. This NN model enhances the performance of the NN model by taking into account the physical laws found in PDEs and encoding them into the NN as regularization terms. Researchers have been paying more attention to PINNs lately, and they are gradually applying them to a wider range of research fields [10]. Although PINNs can be trained with little to no ground truth data, they often fail to effectively generalize to domains that were not encountered during training [11]. However, CNN can learn the inherent laws of the data and thus have better generalization ability for new data. Furthermore, the structure and characteristics of CNN facilitate the solution of PDE. For example, CNN can simplify the solution of complex PDE by reducing the dimension, leverages powerful nonlinear approximation capabilities to handle intricate initial and boundary conditions (ICs/BCs), and, when integrated with parallel computing and GPU acceleration, significantly improves the computational efficiency of PDE solutions. Moreover, CNN can employ multi-input methods to manage multi-variable problems, and, through end-to-end learning, it minimizes the need for manual intervention and the complexity of manual settings [9,12]. Nevertheless, CNN relies on large amounts of training data.

In this paper, we propose PiLMWs-CNN for deriving continuous solutions that can be trained with a physics-informed loss only. This method merges the benefits of (i) utilizing PINNs to eliminate the necessity for extensive training data [9,13] and (ii) capitalizes on the enhanced inference speed and improved generalization abilities provided by CNN [14]. The main contributions of our work are as follows:

•. Construct a set of new standard orthogonal compact supported LMWs.
•. Prove the compact support property of LMWs integrated twice.
•. Propose a novel network called PiLMWs-CNN to obtain LMWs coefficients (LMWCs) for approximating PDEs.

The rest of the paper is organized as follows. In Section 2, we conduct a survey of related works on solving PDEs using DNNs. We analyze and discuss these works from three different perspectives: physics-informed methods, neural operators’ (NO) learning, and the advantages and applications of combining wavelets with CNNs. In Section 3, we provide the research methods of the paper. First, we conduct a feasibility analysis on combining the Legendre wavelet bases with CNNs to solve PDEs, including the role of compactness and orthogonality. Next, we provide relevant mathematical foundations, including the construction of the LMWs. Then, we give the network architectures and pipeline used in this study. In Section 4, we apply this method to approximate the wave equation, fluid equation, and heat conduction equation, with a particular focus on the loss of physical information and the stability of the solution. In Section 5, we give the conclusions.

2. NN Works for Solving PDEs

In this section, we will discuss different approaches to solve PDEs using NNs based on various mathematical foundations.

2.1. Physics-Informed Methods

Physics-informed methods emerged in the context of PINNs, which were introduced by Raissi et al. [9]. The aim is to train these NNs to solve supervised learning tasks while adhering to any given physical laws described by general PDEs. Such NNs are constrained to respect any symmetries, invariances, or conservation principles originating from the physical laws that govern the observed data, as modeled by general time-dependent and nonlinear PDEs. Based on the automatic differentiation algorithm, PINNs effectively solved well-posed PDEs and ODEs, such as Burgers’ Equation and the Allen–Cahn Equation.

The PINNs framework is a data-driven method to solve differential equations and complicated pre-processing modeling is not required, unlike FEM [4]. Physics-informed approaches compute losses implicitly—usually by taking derivatives of an implicit field description. This approach facilitates efficient training of implicit neural networks, producing continuous solutions that can be evaluated at any spatiotemporal point. However, implicit neural networks tend to lack generalization capabilities in novel domains and often necessitate network retraining [11].

This type of method generally involves using PDEs as the constraints on training DNNs directly [9,15]. In numerous practical applications, constraints are often imposed by generalized prior knowledge, which take the form of differential equations (DEs) and are incorporated into the model training as the components of the loss function [16]. More recently, Raissi et al. [9] introduced the concept of PINNs, which have demonstrated their potential in various scenarios [17]. There are also specialized DL libraries, like DeepXDE [16], developed specifically for solving DEs.

Treating PDEs as training constraints was initially referred to as “soft constraints” [18], as the knowledge in the form of PDEs only enforces the correctness of the predicted results without guiding the optimization process. Additionally, the modeling process takes into account additional constraints in the form of ICs/BCs. Unlike conventional numerical approaches that rely on discretizing grids [19], utilizing PDE-based constraints to train DNNs ensures continuous solutions over the entire research domain, leading to highly precise results. PINNs are particularly suitable when the underlying physics of the problem is known beforehand, as they exploit the governing laws of the physical processes during training.

These methods leverage the idea that multilayer feedforward NNs (FFNNs) can serve as universal approximators for high-dimensional functions [20]. Both DNNs and solving DEs share a common objective: to achieve accurate approximations. Whether it is the FEM using basis functions to approximate solutions within a unit or DNNs with complex structures fitting high-dimensional functions, the computed or predicted values at each unit center or sampling point should be extremely close to the actual values. The goal for DNNs is to minimize the loss function value, or the residual, which should ideally approach zero. In the numerical approach, this residual represents the error in the computed solution.

However, data-driven algorithms are often the only practical choice for natural processes because the underlying physics governing them is unknown. These algorithms do not generalize beyond the distribution of the training data, even though they do not ensure the preservation of the problem’s governing physics [21].

2.2. Neural Operators’ Learning

In response to the challenges mentioned above, the concept of NO was introduced by Lu et al. [22]. NOs are designed to learn the relationship between two infinite-dimensional function spaces. After training, these NOs can accurately predict the solution for any input function presented to them. In comparison to conventional PDE solvers, NOs offer significant computational advantages. These NOs are based on the concept of universal operator approximation, as proposed by Chen et al. [23], which is similar to the theory of universal function approximation. Kovachki et al. [24] have conducted rigorous mathematical analysis and provided theoretical guarantees for the effectiveness of NOs.

By employing this methodology, DL can successfully obtain mesh-free, infinite-dimensional linear and nonlinear operators. Chen et al. [23] developed an initial prototype of NO methods based on operator theory, drawing inspiration from linear algebra and functional analysis. Currently, there are three popular instances in this area, which are Deep Operator Network (DeepONet) [25], Graph Neural Operator (GNO) [26], and Fourier Neural Operator (FNO) [27]. The DeepONet architecture comprises two NNs: the branch NN and the trunk NN. The branch NN handles the input function, while the trunk NN is responsible for generating the output function at a specific sensor point. In DeepONet, the output is derived by taking the inner product of these NNs. The GNO focuses on learning mappings in infinite-dimensional spaces by combining nonlinear activation functions with a specific category of integral operators. Nevertheless, GNO may become unstable as the number of hidden layers increases. These advanced networks possess exceptional feature representation capabilities, and, when combined with the idea of approximating functionals using the prototype network, the NOs become infused with boundless vitality. The FNO is an innovative approach inspired by the well-known Fourier transform (FT). Building upon the GNO, the FNO was introduced as an enhanced method to learn network parameters in Fourier space [27]. The FNO employs the fast FT (FFT) to perform spectral decomposition on the input signal, subsequent to calculating the convolution integral kernel in the Fourier domain. One notable limitation of FNO stems from the frequency localization of basis functions in FFTs, which hinders its ability to provide spatial resolution [28]. Consequently, FNO’s performance may be adversely affected when dealing with complex BCs.

An alternative method to tackle this issue is to employ wavelets, which exhibit both spatial and frequency localization properties [28]. By incorporating spatial information, wavelets demonstrate enhanced ability in handling signals with discontinuities and spikes, thus surpassing the pattern-learning capabilities of FFTs when specifically dealing with image data. Wavelets have found applications in various domains, such as the compression of fingerprints, iris recognition, the denoising of signals, the analysis of motion, and the detection of faults, among other areas. The literature [29] highlighted the utilization of wavelets in NNs. Gupta et al. [30] proposed an operator that utilizes MW decomposition (MWD). This approach involves the use of four distinct NNs to calculate wavelet kernels. Specifically, the network architecture includes a fully connected NN (FNN), a CNN, and a Fourier integral layer similar to FNO. As the FNN represents the coarse scale, the combination of the latter three networks computes the detailed wavelet coefficients. As a result, MWD can be considered an improvement over FNO. Given that the solution of a PDE is obtained by calculating the inverse operator map between the input and the solution space, Gupta et al. aimed to transform the problem of learning PDEs into a domain where a more compact representation of the operator can be obtained. In order to achieve this, they proposed an approach for learning the operator, based on MWs, which allows for compression of the associated kernel. By explicitly incorporating the inverse MW filters, they learned the projection of the kernel onto fixed MW polynomial bases. The use of MWs offers a natural basis for the MW subspace, using OPs with respect to a specific measure and employing suitable scale/shifts to capture locality at different resolutions. Tripura et al. [31] introduced an NO, termed Wavelet Neural Operator (WNO), which utilizes spectral decomposition to learn mappings between infinite-dimensional function spaces. The proposed WNO demonstrates its effectiveness in dealing with highly nonlinear families of PDEs that involve discontinuities and abrupt changes in both the solution domain and boundary.

2.3. Wavelets and CNN

It is essential to highlight that the wavelet transform (WT) is a signal processing technique that exhibits the properties of localization in both the time and frequency domain. This attribute makes it a powerful tool for analyzing local features [32]. Consequently, Li et al. [33] developed a deep wavelet NN (DWNN) based on the PINNs approach. The employment of wavelets enables the extraction of multi-scale and detailed features, resulting in enhanced performance in solving PDEs.

However, it is well known that wavelet functions are not simultaneously symmetric, orthogonal, and compact supported. These challenges are overcome by LMWs, also referred to as more general, vector-valued polynomial types of wavelets [34], which simultaneously have all three of these properties. This leads to high smoothness and high approximation order and suggests that LMWs can outperform scalar wavelets in a variety of applications [35].

Therefore, some researchers have focused on investigating how to solve PDEs using the Legendre wavelets method. It is worth mentioning that Legendre wavelets have both spectral accuracy and orthogonality, as well as other properties mentioned about wavelets [36]. For example, the wavelet technique can transform complex problems into a system of algebraic equations. Heydari et al. [37] presented a numerical technique utilizing two-dimensional Legendre wavelets to solve fractional PDEs with Dirichlet BCs. The method involved using operational matrices for fractional integration and derivatives to obtain numerical solutions to the problems under consideration. The approach presented by Abbas et al. [38] primarily focused on converting the fundamental DEs into integral equations via integration. This is accomplished via an approximate representation of the diverse signals embedded in the equations using truncated orthogonal series $φ (x)$ . Subsequently, the operation matrix of integration, denoted as P, is employed to remove the integral operations.

Meanwhile, we find in the research that there are many applications for the combination of wavelets and CNN. It is discovered that a CNN can be considered a simplified version of multiresolution analysis (MRA). This revelation highlights the fact that conventional CNNs fail to capture a substantial portion of the spectral information accessible via MRA. Therefore, Fujieda et al. [39] introduced the idea of wavelet CNNs, which combined MRA and CNNs within a single model. This strategy enables the exploitation of spectral information that is frequently neglected in conventional CNNs but is essential in various image processing tasks. Zhao et al. [40] suggested a Wavelet-Attention CNN (WA-CNN) for image classification purposes. They explored Discrete WT (DWT) in the frequency domain and devised a novel WA block which solely implemented attention in the high-frequency domain. DWT exhibits superior down-sampling information quality in the image processing domain, thereby significantly minimizing the loss of down-sampling information in CNN. Guo et al. [41] developed a method for intelligent fault diagnosis of rolling bearings, utilizing WT and deformable CNN. Their research highlights the significance of selecting appropriate wavelet bases in the WT. Different wavelet bases have a substantial influence on the resulting time-frequency map. In the context of CNNs, WT is developed and employed for breast cancer detection [42]. The adaptive wavelet pooling layers, as proposed by the Wolter et al. [43], make use of fast WT (FWT) to lower the feature resolution. Through FWT, the input features are decomposed into various scales, resulting in a reduction in feature dimensions by eliminating the fine-scale subbands. This approach introduces additional flexibility by optimizing the wavelet basis functions and weighting coefficients at different scales, thus avoiding high repetition.

The findings above indicate that CNNs exhibit strong performance when handling multidimensional data, such as images and speech signals. In fact, these networks have been employed in PINN, primarily within the domain of image processing. Inspired by the above studies, we consider combining LMWs with CNN to approach PDEs, conducting experiments on three different types of equations to validate the effectiveness and accuracy of this approach.

3. Method

3.1. Feasibility Analysis

By integrating LMWs with CNN for solving PDEs, this approach capitalizes on four key advantages, thereby enhancing the modeling and analysis capabilities in PDE solving and delivering more precise predictions and solutions.

Multiscale Representation: PDEs often involve spatial features at different scales. By using the LMWs, it provides the capability of multiscale representation, which can better capture the features of the input data at different scales. This is crucial for analyzing and solving PDEs involving multiscale spatial features.
Local Correlation Modeling: The mathematical models in PDEs often assume that the system’s behavior is influenced by its local neighborhood. The LMWs have good properties in capturing the local correlation of input data effectively. By using convolution operations in CNN, it can leverage this local correlation and perform local feature extraction on the input data, which helps in describing the local behavior in PDEs more accurately. Bedford et al. [44] prove rigorously that good approximation “locally” guarantees good approximation globally.
Translation Invariance: Analysis of spatial translation invariance is often required for input data in PDEs. The translation invariance property of LMWs ensures that the translation of input signals does not affect the representation of their MW coefficients (MWCs). In CNN, convolution operations can preserve the translation invariance of input data, better meeting the requirement of translation invariance in PDEs.
Explanation and Interpretability: LMWs have good explanatory and interpretable properties, providing insights into the features and structure of PDEs. By combining LMWs with CNN, we can leverage the explanatory power of MWs and the learning capability of CNN to better understand and interpret the feature representations of PDEs. In fact, Restricted Boltzmann machines (RBMs) [45] and other variants of DL models, such as deep belief networks and autoencoders, have a clear advantage in providing explanations and interpretability.

Furthermore, the compact support property of LMWs offers several advantages [46,47] when applied to solving PDEs with NNs.

Reduced computational cost: The compact support of LMWs enables a smaller set of wavelet coefficients to be considered, thereby reducing the computational burden associated with representing the solution as a wavelet series;
Localized feature extraction: LMWs are well suited for extracting localized features within the solution, facilitating the NN’s ability to comprehend the underlying structure of the PDE;
Regularization effect: LMWs’ compact support property inherently creates a regularization effect in the NN, which reduces overfitting and improves the generalization performance;
Adaptability to different scales: LMWs readily adapt to various scales by adjusting the wavelet coefficients, permitting the NN to capture features at different resolutions;
Robustness to noise: LMWs’ compact support enhances their resilience to noise and other imperfections in the input data, enabling more accurate solutions to be obtained when employed in conjunction with NNs.

Based on the aforementioned analysis, a strong correlation between LMWs, CNNs, and PDEs becomes apparent. Notably, the compact support property of LMWs offers numerous advantages when resolving PDEs via NNs. These distinctive features render the integration of wavelets and CNNs particularly appealing for PDE solutions, providing the authors with a promising avenue to approximate PDEs. Consequently, it piques the authors’ curiosity and motivates further exploration and in-depth experimentation with this approach.

3.2. Foundations of Multiwavelet Bases’ Construction in Mathematics

Families of functions $φ_{a, b} (x)$ ,

$φ_{a, b} (x) = {|a|}^{- 1 / 2} φ (\frac{x - b}{a}), a, b \in R, a \neq 0,$

obtained from a single function

φ

via dilation and translation that serve as the basis for

L^{2} (R)

, are known as wavelets [48].

In the following, we will provide an overview of the properties of the MW bases developed in [49] and introduce the necessary notations.

3.2.1. MW Bases

In this subsection, we introduce a new class of wavelet-like bases, referred to as MW bases, which enable sparse representations of smooth integral operators over finite intervals. Moreover, MW bases have orthogonality, compact support, vanishing moments, and other wavelet base properties. The basis functions do not overlap on a specific scale and are arranged into small groups of multiple functions (thus, MWs) sharing the same support. One notable benefit of this construction is its simplicity.

1. One-Dimensional Construction

Our focus will initially be limited to $[0, 1] \subset R$ , and we now proceed to establish a basis for $L^{2} [0, 1]$ . Each basis consists of dilates and translates of a finite set of functions $φ_{1}, φ_{2}, \dots, φ_{n}$ . Specifically, these bases consist of orthonormal systems

(1) $φ_{l, i}^{k} = 2^{i / 2} φ_{l} (2^{i} x - k), l = 1, 2, \dots, n; i, k \in Z,$

where the functions

φ_{1}, φ_{2}, \dots, φ_{n}

are piecewise polynomials: they become zero outside the range of

[0, 1]

and possess vanishing moments, making them orthogonal to low-order polynomials,

(2) $\int_{0}^{1} φ_{l} (x) x^{i} d x = 0, i = 0, 1, \dots, n - 1 .$

In addition, we implement the MRA approach and suppose $n \in N^{+}$ ; for $i = 0, 1, 2, \dots$ , we define a space $V_{i}^{n}$ of piecewise polynomial functions,

(3) $\begin{matrix} V_{i}^{n} = {f : the restriction of f to the interval (2^{- i} k, 2^{- i} (k + 1)) is \\ a polynomial of degree less than n, for k = 0, 1, \dots, 2^{i} - 1, \\ and f vanishes elsewhere} . \end{matrix}$

It is clear that the space $V_{i}^{n}$ has a dimension of $2^{i} n$ and

(4) $V_{0}^{n} \subset V_{1}^{n} \subset \dots \subset V_{i}^{n} \subset \dots .$

For $i = 0, 1, 2 \dots$ , we define the $2^{i} n$ -dimensional space (the MW subspace) $W_{i}^{n}$ to be the orthogonal complement of $V_{i}^{n}$ in $V_{i + 1}^{n}$ ,

(5) $V_{i}^{n} \oplus W_{i}^{n} = V_{i + 1}^{n}, V_{i}^{n} ⊥ W_{i}^{n},$

therefore, we can inductively derive the decomposition

(6) $V_{i}^{n} = V_{0}^{n} \oplus W_{0}^{n} \oplus W_{1}^{n} \oplus \dots \oplus W_{i - 1}^{n} .$

Suppose that functions $φ_{1}, φ_{2}, \dots, φ_{n}$ form an orthogonal basis for $W_{0}^{n}$ , that is $W_{0}^{n} = linear span {φ_{1}, φ_{2}, \dots, φ_{n}}$ . More generally,

(7) $\begin{matrix} W_{i}^{n} & = linear span {φ_{l, i}^{k} : φ_{l, i}^{k} (x) \\ = 2^{i / 2} φ_{l} (2^{i} x - k), l = 1, \dots, n, i = 0, 1, 2, \dots, k = 0, 1, \dots, 2^{i} - 1} . \end{matrix}$

Since $W_{0}^{n}$ is orthogonal to $V_{0}^{n}$ , the first n moments of $φ_{1}, φ_{2}, \dots, φ_{n}$ vanish,

(8) $\int_{0}^{1} φ_{l} (x) x^{i} d x = 0, \begin{matrix} \end{matrix} i = 0, 1, \dots n - 1 .$

The $2 n$ -dimensional space $W_{1}^{n}$ is spanned by $2 n$ orthogonal functions $φ_{1} (2 x), \dots, φ_{n} (2 x), φ_{1} (2 x - 1), \dots, φ_{n} (2 x - 1)$ , of which n are supported on the interval $[0, 1 / 2]$ and n on $[1 / 2, 1]$ . Generally, the space $W_{i}^{n}$ is spanned by $2^{i} n$ functions obtained from $φ_{1}, φ_{2}, \dots, φ_{n}$ via translation and dilation. There is some freedom in choosing the functions $φ_{1}, φ_{2}, \dots, φ_{n}$ , subject to the restriction of orthogonality, where we can uniquely determine them by imposing normality and further vanishing moments, except for sign.

2. Completeness of one-dimensional construction

We define the space

(9) $V^{n} = ⋃_{i = 0}^{\infty} V_{i}^{n},$

and observe that

\bar{V^{n}} = L^{2} [0, 1]

. Here the closure

\bar{V^{n}}

is defined with respect to the

L^{2}

-norm,

| | f | | = {⟨f, f⟩}^{1 / 2}

, where the inner product

⟨f, g⟩ = \int_{0}^{1} f (x) g (x) d x

Let ${ϕ_{1}, ϕ_{2}, \dots, ϕ_{n}}$ be an orthonormal basis for $V_{0}^{n}$ ; in accordance with (6)–(9), the set $B_{n} = {ϕ_{1}, ϕ_{2}, \dots, ϕ_{n}} \cup {φ_{l, i}^{k}}$ forms a complete orthonormal basis for $L^{2} [0, 1]$ . We refer to $B_{n}$ as the MW bases of order n for $L^{2} [0, 1]$ .

3.2.2. Multiple-Dimensional Construction

For any positive integer d, the bases’ construction for $L^{2} [0, 1]$ can be extended to some other function spaces as well, such as $L^{2} {[0, 1]}^{d}$ . The basis for $L^{2} {[0, 1]}^{2}$ , which serves as an example of the construction for any finite-dimensional space, is now given in order to outline this extension. We establish the space

(10) $V_{i}^{n, 2} = V_{i}^{n} \times V_{i}^{n}, i = 0, 1, 2, \dots,$

where

V_{i}^{n}

is defined by (3). Furthermore, we define

W_{i}^{n, 2} = W_{i}^{n} \times W_{i}^{n}

to be the orthogonal complement of

V_{i}^{n, 2}

V_{i + 1}^{n, 2}

(11) $V_{i}^{n, 2} \oplus W_{i}^{n, 2} = V_{i + 1}^{n, 2}, V_{i}^{n, 2} ⊥ W_{i}^{n, 2} .$

$W_{0}^{n, 2}$ is then the space generated via the orthonormal basis

(12) ${ϕ_{i} (x) φ_{j} (y), φ_{i} (x) ϕ_{j} (y), φ_{i} (x) φ_{j} (y) : i, j = 1,2, \dots, n} .$

For each basis element $v (x, y)$ of these $3 n^{2}$ elements, it has no projection on any low-order polynomial,

(13) $\int_{0}^{1} \int_{0}^{1} v (x, y) x^{i} y^{j} d x d y = 0, \begin{matrix} \end{matrix} i, j = 0, 1, \dots, n - 1 .$

Dilations and translations of the $v (x, y)$ span the space $W_{i}^{n, 2}$ ; low-order polynomials ${ϕ_{i} (x) ϕ_{j} (y) :$ $i, j = 1, 2, \dots, n}$ are also included in the basis of $L^{2} {[0, 1]}^{2}$ , which is made up of these functions.

3.3. Construct LMWs

In this part, our primary focus is the development of MW bases based on Legendre polynomials, which we refer to as LMWs. We present detailed construction procedures and provide the necessary proofs to support our approach.

3.3.1. Legendre Polynomials

Legendre polynomials are polynomials defined on the $[- 1, 1]$ and have a recursive formula [50]:

(14) $\begin{matrix} p_{0} (x) = 1, p_{1} (x) = x, \\ p_{n + 1} (x) = \frac{(2 n + 1) x p_{n} (x) - n p_{n - 1} (x)}{n + 1}, n = 1, 2, \dots . \end{matrix}$

In order to be consistent with the previous knowledge, we consider a transformation to fix the discussion interval at $[0, 1]$ and construct the standard orthogonal basis of $L^{2} [0, 1]$ . Finally, a similar transformation can be performed to return to the desired interval. The implementation details can be found in the literature [51]. Therefore, according to $()$ , Legendre polynomials on $[0, 1]$ can be expressed as follows via translation and expansion transformations:

(15) $\begin{matrix} p_{0} (x) = 1, p_{1} (x) = 2 x - 1, \\ p_{n + 1} (x) = \frac{(2 n + 1) (2 x - 1) p_{n} (x) - n p_{n - 1} (x)}{n + 1}, n = 1, 2, \dots . \end{matrix}$

Further, $q_{l} (x) = {\{\sqrt{2 l + 1} p_{l} (x)\}}_{l = 0}^{\infty}$ is an orthonormal basis in $L^{2} [0, 1]$ .

3.3.2. LMWs

Now, we use Legendre polynomials $q_{l} (x) (l = 0, 1, 2, 3)$ as an LMW multiscaling function [52] to construct the Legendre mother multiwavelet [53] as

(16) $φ_{l} (x) = \sum_{j = 0}^{3} (h_{l j} q_{j} (2 x) + g_{l j} q_{j} (2 x - 1)), l = 0, 1, 2, 3 .$

By the properties of vanishing moments and standard orthonormality, $φ_{l} (x)$ can be uniquely determined. Figure 1 shows the plots of these four Legendre mother multiwavelets.

Then, we construct the LMWs by translating and dilating the mother multiwavelet $φ_{l} (x)$ and they can be expressed as

(17) $φ_{l, i}^{k} (x) = 2^{i / 2} φ_{l} (2^{i} x - k), i \in N^{+}, k = 0, 1, \dots, 2^{i} - 1, l = 0, 1, 2, 3 .$

In addition, $φ_{l, i}^{k} (x)$ is compact supported on the interval $[k / 2^{i}, (k + 1) / 2^{i}]$ , and ${q_{l}} ⋃ {φ_{l, i}^{k}}$ forms a complete orthonormal basis for $L^{2} [0, 1]$ .

3.3.3. Compact Support

Given the common use of second-order PDEs in physics and engineering [54], we aim to solve them in the $W_{2} [0, 1]$ [55] by integrating $φ_{l, i}^{k} (x)$ twice. Then, we denote

(18) $J_{0}^{2} φ_{l, i}^{k} (x) ≜ \int_{0}^{x} \int_{0}^{x} φ_{l, i}^{k} (t) d t^{2} .$

Lemma 1.

$J_{0}^{2} φ_{l, i}^{k} (x) = \int_{0}^{x} (x - t) φ_{l, i}^{k} (t) d t$ .

Proof.

Let $F (x) = J_{0}^{2} φ_{l, i}^{k} (x)$ , $G (x) = \int_{0}^{x} (x - t) φ_{l, i}^{k} (t) d t$ , $f (t, x) = (x - t) φ_{l, i}^{k} (t)$ . From (18), one can obtain $F (0) = G (0) = 0$ . Furthermore, $F^{'} (x) = \int_{0}^{x} φ_{l, i}^{k} (t) d t$ . Based on the generalization form of the upper limit function of integrals from the mathematical analysis lecture notes [56], we have

(19) $\begin{matrix} G^{'} (x) & = \int_{0}^{x} \frac{\partial f (t, x)}{\partial x} d t + f (x, x) x^{'} - f (0, x) 0^{'} \\ = \int_{0}^{x} \frac{\partial f (t, x)}{\partial x} d t + f (x, x) \\ = \int_{0}^{x} φ_{l, i}^{k} (t) d t . \end{matrix}$

That is $F^{'} (x) = G^{'} (x)$ . By the fundamental theorem of calculus, $F (x) = G (x)$ , the theorem holds true. □

Theorem 1.

$J_{0}^{2} φ_{l, i}^{k} (x)$ is compact supported on $[\frac{k}{2^{i}}, \frac{k + 1}{2^{i}}]$ .

Proof.

For $J_{0}^{2} φ_{l, i}^{k} (x) = \int_{0}^{x} (x - t) φ_{l, i}^{k} (t) d t$ , if $x < \frac{k}{2^{i}}$ , $J_{0}^{2} φ_{l, i}^{k} (x) = 0;$ if $x > \frac{k + 1}{2^{i}}$ ,

(20) $J_{0}^{2} φ_{l, i}^{k} (x) = \int_{\frac{k}{2^{i}}}^{\frac{k + 1}{2^{i}}} (x - t) φ_{l, i}^{k} (t) d t = 2^{\frac{i}{2}} \int_{\frac{k}{2^{i}}}^{\frac{k + 1}{2^{i}}} (x - t) φ_{l} (2^{i} t - k) d t$

Let $2^{i} t - k = s,$ then $t = \frac{s + k}{2^{i}}, d t = \frac{1}{2^{i}} d s$ . So, the transformation of (19) is as follows,

(21) $\begin{matrix} J_{0}^{2} φ_{l, i}^{k} (x) & = 2^{- \frac{i}{2}} \int_{0}^{1} (x - \frac{s + k}{2^{i}}) φ_{l, 0}^{0} (s) d s \\ = 2^{- \frac{i}{2}} \int_{0}^{1} (x - \frac{s + k}{2^{i}}) φ_{l} (s) d s \\ = 0 \end{matrix}$

The theorem holds true. □

3.3.4. Function Approximation

We employ the tensor product of multiple one-dimensional ( $1 D$ ) basis functions to derive basis functions in multiple dimensions [57], building upon Equation (17) discussed earlier. To simplify the expression, we still use $φ_{l, i}^{p}$ as a basis function in $1 D$ so that we have the following equation:

(22) $φ_{l m n, i j k}^{p, q, r} (x, y, t) = φ_{l, i}^{p} (x) φ_{m, j}^{q} (y) φ_{n, k}^{r} (t) .$

With discrete MWCs $c_{\hat{x}, \hat{y}, \hat{t}}^{p, q, r}$ on a grid $\hat{x}, \hat{y}, \hat{t} \in \hat{X} \times \hat{Y} \times \hat{T}$ , we obtain a continuous LMW $\tilde{u} (x, y, t)$ as follows:

(23) $\tilde{u} (x, y, t) = \sum_{\begin{matrix} p, q, r \in [0, 2^{i} - 1] \times [0, 2^{j} - 1] \times [0, 2^{k} - 1] \\ \hat{x}, \hat{y}, \hat{t} \in \hat{X} \times \hat{Y} \times \hat{T} \end{matrix}} c_{\hat{x}, \hat{y}, \hat{t}}^{p, q, r} φ_{l m n, i j k}^{p, q, r} (x - \hat{x}, y - \hat{y}, t - \hat{t}) .$

Our goal is to find MWCs $c_{\hat{x}, \hat{y}, \hat{t}}^{p, q, r}$ such that $\tilde{u} (x, y, t)$ as nearly as possible resembles the PDE solution. By taking the corresponding derivatives of the LMWs, one can directly obtain the partial derivatives of $\tilde{u}$ with respect to $x, y, t$ .

3.4. Neural Network Architecture

Figure 2 showcases the network architectures employed for the incompressible N-S equation and the damped wave equation. The domain’s occupancy grid at a timepoint t is represented by $Ω^{t}$ and $u_{d}^{t}$ contains Dirichlet BCs. We utilize a U-Net architecture [11] internally for the N-S equation; the damped wave equation, however, only required a basic three-layer CNN. These networks were employed to calculate residuals $(Δ c_{i j})$ , which are then added to $c_{i j}^{t}$ to derive the MWCs of the next time-step $c_{i j}^{t + d t}$ . Thus, we mean normalize the coefficients to zero, that is, $\sum_{\hat{x}, \hat{y}} c_{\hat{x}, \hat{y}, \hat{t}}^{0, 0, 0} = 0$ .

Pipeline

As depicted in Figure 3, PiLMWs-CNN employs a CNN (PDE Model) to map discrete LMWCs and BCs from a timepoint $\hat{t}$ to LMWCs at a subsequent timepoint $\hat{t} + d t$ . By iteratively applying the PDE Model on the LMWCs, the simulation can be propagated in time. Efficient computation of continuous LMWs is achievable using transposed convolutions (refer to (23)) with LMWs, as illustrated in Figure 4.

Training Procedure

As demonstrated in Figure 5, our method adopts a training process similar to that in reference [11]. Initially, we create a randomized training pool consisting of domains and LMWCs. Since all LMWCs can be set to zero at first, ground truth data is not required. The PDE model then uses the random minibatch (batch size = 50) that we extracted from the training pool to predict the LMWCs for the next time step. Next, with the goal of optimizing the PDE model weights via gradient descent, we compute a physics-informed loss inside the volume spanned by the LMWCs of the minibatch and the predicted LMWCs. We employ the Adam optimizer (learning rate = $10^{- 4}$ ) for this purpose. We update the training dataset with the newly predicted LMWCs during the training process to gradually replenish the pool with more realistic training data. From time to time, we reset all of the LMWCs in the training pool to zero, which enables us to learn the warm-up phase from the ground up.

Physics-Informed Loss

Drawing parallels with reference [11], the PiLMWs-CNN approach integrates the benefits of both physics-constrained methodologies [48] and physics-informed strategies [9]. This method shows the ability to generalize to new domain geometries and yields continuous solutions that avoid the drawbacks of a discretized loss function based on finite differences by using a CNN to manage LMWCs, which can be thought of as a discrete hidden latent representation for a continuous implicit field description based on LMWs. We aim to minimize the integrals of the squared residuals of the PDEs over the domain/domain boundaries and time steps by optimizing the LMWCs. We uniformly randomly choose points inside the designated integration domains in order to compute these integrals. Within the domain, we require two loss terms for the damped wave Equation (27):

(24) $L_{u} = \int_{Ω} \int_{\hat{t}}^{\hat{t} + d t} {∥\partial_{t} u - v∥}^{2}$

(25) $L_{v} = \int_{Ω} \int_{\hat{t}}^{\hat{t} + d t} {∥\partial_{t} v - k Δ u + γ v∥}^{2}$

The boundary loss term is:

(26) $L_{b} = \int_{\partial Ω} \int_{\hat{t}}^{\hat{t} + d t} {∥u - u_{d}∥}^{2}$

For the integral to be computed, it is imperative that the residuals have bounded variation.

4. Results

The baseline for our work is reference [11], and the main network architecture and settings are kept consistent with the original paper. However, for the interpolation part, we made improvements using our proposed PiLMWs-CNN and conducted experiments and comparisons on the damped wave equation, the incompressible N-S equation, and the two-dimensional heat conduction equations. All the results were performed on an Nvidia DGX V100 32G. Through quantitative evaluation and stability analysis, it is evident that our method achieves higher accuracy and provides better approximation for both the governing equations and BCs. The figures below are displayed in TensorBoard after training.

Example 1.

The Damped Wave Equation appears as a mathematical model in biology and physics [58]. It is important in geophysics, general relativity, quantum mechanics, and plasma physics. We consider the following equation with constant positive damping:

(27) $\begin{matrix} \partial_{t} u & = v, & in Ω, \\ \partial_{t} v & = k Δ u - γ v, & in Ω, \\ u & = u_{d}, & on \partial Ω, \\ u & = u_{0}, v = v_{0}, & in Ω, t = 0 . \end{matrix}$

Here, k is the stiffness constant, and $γ$ is a damping constant. Wave equation solutions are essential for understanding the concepts in fluid dynamics, optics, gravitational physics, and electromagnetism. Moreover, for large damping constants, $γ$ , converges towards the Laplace equation solution [11].

Table 1 compares the losses of our method with that of reference [11]. We trained with LMWs $J_{0}^{2} φ_{l, i}^{k} (x)$ (Figure 4) in the spatial dimensions and observed a much better performance. From the result of $L_{v}$ , it can be seen that our method follows the physical laws well.

As shown in Figure 6, our method produces stable results for the damped wave equation.

Example 2.

The Incompressible N-S equations are a set of PDEs that describe the movement of viscous fluids. These equations are derived by applying Newton’s second law of motion to an incompressible Newtonian fluid. The resulting equation is known as the N-S equation written in vector notation:

(28) $\begin{matrix} \nabla \cdot \vec{u} & = 0, & in Ω, \\ ρ (\partial_{t} \vec{u} + \vec{u} \cdot \nabla \vec{u}) & = μ Δ \vec{u} - \nabla p + \vec{f}, & in Ω . \end{matrix}$

The equations take into account factors such as the dynamic viscosity $μ$ , pressure p, fluid velocity vector $\vec{u}$ , and the fluid’s density $ρ$ . They can be used to model a wide range of phenomena, like the flow of water in a pipe, blood in an artery, air over an airplane wing, weather patterns in the atmosphere, and even the flow of stars in a galaxy. In our tests, we do not take into account the external factors $\vec{f}$ . For given initial conditions ${\vec{u}}_{0}$ , $p_{0}$ and BCs, these pair of equations need to be resolved. We are taking into account the Dirichlet BC, where the velocity field $\vec{u}$ is set to ${\vec{u}}_{d}$ at the boundaries of the domain $\partial Ω .$

We conducted an extensive study on the stability of PiLMWs-CNN when applied to the incompressible N-S equations. The investigation involved hundreds of iterations on the DFG benchmark [59] problem at Re = 100 (Re represents the Reynolds number). Figure 7 shows the curves of $E [| \nabla \cdot \vec{u} |]$ and $L_{p}$ , where $L_{p}$ is a momentum loss term defined as

(29) $L_{p} = \int_{Ω} \int_{\hat{t}}^{\hat{t} + d t} ‖ ρ (\partial_{t} \vec{u} + \vec{u} \cdot \nabla \vec{u}) - μ Δ \vec{u} + \nabla p ‖^{2}$

The results clearly demonstrate the superiority of our approach in terms of both accuracy and stability. This significant improvement highlights the effectiveness of our method over the existing reference [11]. Furthermore, following the same approach as implicit PINNs and Spline nets, which involves computing losses based on physics-informed losses, PiLMWs-CNN calculated $E [| \nabla \cdot \vec{u} |]$ and $L_{p}$ and achieved lower loss values, as shown in Table 2. This demonstrates the enhanced effectiveness of our method in approximating the solutions of PDEs.

Since the proposed method in this paper utilizes physics-informed loss computation during model training, which is primarily based on the residual concept, it aligns with the idea of the $ε$ -best approximate solution [51] or $ε$ -approximate approach [60] that have been proposed in recent years for numerical solutions of PDEs and fractional DEs (FDEs). Therefore, we intend to apply the PiLMWs-CNN method to two different two-dimensional heat conduction equations to observe its approximation performance.

Example 3.

The two-dimensional heat conduction equation [51] is a PDE that governs the thermal conduction and heat transfer within a medium. Mathematically, it is defined as:

(30) $\begin{matrix} \partial_{t} u & = \partial_{x^{2}} u + \partial_{y^{2}} u, (t, x, y) \in [0, 1] \times [0, 1] \times [0, 1], \\ u (0, x, y) & = \sin (π x) \sin (π y), \\ u (t, 0, y) & = u (t, 1, y) = u (t, x, 0) = u (t, x, 1) = 0 . \end{matrix}$

The exact solution is $u (t, x, y) = e^{- 2 π^{2} t} \sin (π x) \sin (π y)$ , which is not involved in the training process. It is only used to compare the absolute errors at given points with the approximate solution obtained via our algorithm. We conducted the experiment for Example 3 using the same network structure as the damped wave equation. $L_{boundary}$ represents the boundary loss, calculated in a manner similar to Equation (26), while $L_{heat}$ signifies the equation loss, calculated similarly to Equation (25). From Table 3, it can be observed that the absolute errors computed by our method (right) are smaller. Figure 8 demonstrates a progressive reduction in the boundary loss of the heat conduction equation (reaching $10^{- 9}$ ) as the iterations progress, while the equation’s loss stabilizes (reaching $10^{- 6}$ ). This indicates that our method is capable of more accurately capturing the physical laws of the equation.

Example 4.

The following two-dimensional heat conduction equation [61] is a materially homogeneous system:

(31) $\begin{matrix} \partial_{t} u & = Δ u, (x, y) \in [0, 1] \times [0, 1], \\ u (0, x, y) & = x^{2} + y^{2} / 2, \\ u (t, 0, y) & = y^{2} / 2 + 3 t, u (t, 1, y) = 1 + y^{2} / 2 + 3 t, \\ u (t, x, 0) & = x^{2} + 3 t, u (t, x, 1) = x^{2} + 1 / 2 + 3 t . \end{matrix}$

Here u is the temperature function, the thermal diffusivity is one, and the external heat source term is uniformly zero. It is easy to find the analytical solution $u (t, x, y) = x^{2} + y^{2} / 2 + 3 t$ . We use the same grid as case study 2 [61]. However, case study 2 treats error as a function of the time-step size h, while we compute the error using NNs and display the loss curve via TensorBoard. As a result, our horizontal axis represents steps (iterations), which is essentially equivalent to time [11]. The errors are examined and compared at final time $t_{f i n} = 0.1$ . From Figure 9, it can be observed that the losses stabilize at around $10^{- 4}$ , while the better results of errors in case study 2 [61] stabilize at around $10^{- 1}$ .

5. Conclusions

This paper presents an approach to approximate PDEs by training a continuous PiLMWs-CNN solely utilizing physics-informed loss functions. A feasibility analysis of this method was conducted, properties of LMWs were analyzed, and their advantages in solving PDEs were discussed. Furthermore, the construction theory and methods of LMWs were also provided. The PiLMWs-CNN utilizes the ability of CNN in fast modeling complex systems in solving PDEs to process LMWCs via CNN, training the network using physics-informed loss function. The experimental results show that this method is more effective, more accurate, and yields more stable solutions.

In the future, we will conduct further research on the application of the algorithm proposed in this paper for solving FDEs. This research will encompass the exploration of solution methods and network structures for constant-order, variable-order, and fractional PDEs. In addition to accuracy and stability measures, we will also investigate and experiment with other evaluation metrics. For instance, in the context of the previously mentioned damping wave equation, we will examine the doppler effect, wave reflections, as well as the drag and lift coefficients at different Reynolds numbers in the N-S equation. This assessment from a physical application perspective will provide a deeper evaluation of the algorithm’s performance and facilitate a gradual optimization of both the algorithm and the network.

Furthermore, a multigrid LMWs’ CNN might be contemplated for the further refinement of solutions at the boundary layers. Our method’s fully differentiable feature might also be useful in situations involving gradient-oriented shape optimization, optimal control, reinforcement learning, and sensitivity analysis. We firmly believe that the performance of upcoming ML-based PDE solvers with generalization capabilities will be positively impacted by the shift from explicit physics-constrained losses to implicit physics-informed losses on continuous fields, supported by discrete latent descriptions like LMWCs.

Author Contributions

Conceptualization, Y.W.; methodology, Y.W.; validation, C.Y.; formal analysis, Y.W., W.W.; investigation, Y.W.; resources, H.S.; writing—original draft preparation, Y.W.; writing—review and editing, W.W. and R.Z.; visualization, C.Y.; supervision, W.W.; funding acquisition, Y.W., W.W. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Figures and Table

Figure 1. Legendre mother multiwavelets.

View Image - Figure 2. Network structures used for the incompressible N-S equation (a) and damped wave equation (b). The number of channels for each layer is indicated by the numbers beneath the blue bars.

Figure 2. Network structures used for the incompressible N-S equation (a) and damped wave equation (b). The number of channels for each layer is indicated by the numbers beneath the blue bars.

View Image - Figure 3. PDE model pipeline using LMWs. Samples for training and evaluation can be obtained at any point in space and time due to the continuous nature of the solution.

Figure 3. PDE model pipeline using LMWs. Samples for training and evaluation can be obtained at any point in space and time due to the continuous nature of the solution.

Figure 4. LMWs [Forumla omitted. See PDF.] are extended as even or odd functions to the interval [Forumla omitted. See PDF.] and zero elsewhere.

Figure 5. Training cycle.

View Image - Figure 6. Stability of solution for damped wave equation. (a) [Forumla omitted. See PDF.]; (b) [Forumla omitted. See PDF.]; and (c) [Forumla omitted. See PDF.].

Figure 6. Stability of solution for damped wave equation. (a) [Forumla omitted. See PDF.]; (b) [Forumla omitted. See PDF.]; and (c) [Forumla omitted. See PDF.].

View Image - Figure 7. Stability of PiLMWs-CNN when tackling the N-S equation on the DFG benchmark problem at Re = 100. (a) [Forumla omitted. See PDF.]; (b) [Forumla omitted. See PDF.].

Figure 7. Stability of PiLMWs-CNN when tackling the N-S equation on the DFG benchmark problem at Re = 100. (a) [Forumla omitted. See PDF.]; (b) [Forumla omitted. See PDF.].

Figure 8. Stability of PiLMWs-CNN while solving Ex. 4.3. (a) [Forumla omitted. See PDF.]; (b) [Forumla omitted. See PDF.].

Figure 9. Stability of PiLMWs-CNN while solving Ex. 4.4.

Table 1

Quantitative results of wave equation.

Methods		$L_{u}$	$L_{v}$	$L_{b}$
Spline Net [11]	$l, m = 1$	8.511 $\times 10^{- 2}$	1.127 $\times 10^{- 2}$	1.425 $\times 10^{- 3}$
Spline Net [11]	$l, m = 2$	5.294 $\times 10^{- 2}$	6.756 $\times 10^{- 3}$	1.356 $\times 10^{- 3}$
PiLMWs-CNN (ours)		5.366 $\times 10^{- 9}$	1.863 $\times 10^{- 5}$	1.024 $\times 10^{- 3}$

Table 2

Quantitative outcomes for Spline Net, implicit PINN, and PiLMWs-CNN at Re = 100.

Methods	$E [\| \nabla \cdot \vec{u} \|]$	$L_{p}$
implicit PINN	1.855 $\times 10^{- 3}$	1.465 $\times 10^{- 6}$
Spline Net [11]	7.492 $\times 10^{- 4}$	9.04 $\times 10^{- 4}$
PiLMWs-CNN (ours)	5.967 $\times 10^{- 6}$	2.506 $\times 10^{- 6}$

Table 3

The absolute errors $| u - u_{n} |$ for Example 3 ( $t = 1$ ). The upper subtable is computed via Shifted-Legendre orthonormal method [51], and the bottom is computed via PiLMWs-CNN (ours).

	$y_{i}$	0.1	0.3	0.5	0.7	0.9
$x_{i}$		0.1	0.3	0.5	0.7	0.9
0.1		7.414 $\times 10^{- 6}$	1.963 $\times 10^{- 5}$	2.421 $\times 10^{- 5}$	1.963 $\times 10^{- 5}$	7.414 $\times 10^{- 6}$
0.3		1.963 $\times 10^{- 5}$	5.130 $\times 10^{- 5}$	6.347 $\times 10^{- 5}$	5.130 $\times 10^{- 5}$	1.963 $\times 10^{- 5}$
0.5		2.421 $\times 10^{- 5}$	6.347 $\times 10^{- 5}$	7.839 $\times 10^{- 5}$	6.347 $\times 10^{- 5}$	2.421 $\times 10^{- 5}$
0.7		1.963 $\times 10^{- 5}$	5.130 $\times 10^{- 5}$	6.347 $\times 10^{- 5}$	5.130 $\times 10^{- 5}$	1.963 $\times 10^{- 5}$
0.9		7.414 $\times 10^{- 6}$	1.963 $\times 10^{- 5}$	2.421 $\times 10^{- 5}$	1.963 $\times 10^{- 5}$	7.414 $\times 10^{- 6}$
	$y_{i}$	0.1	0.3	0.5	0.7	0.9
$x_{i}$		0.1	0.3	0.5	0.7	0.9
0.1		3.822 $\times 10^{- 8}$	3.780 $\times 10^{- 8}$	3.765 $\times 10^{- 8}$	3.780 $\times 10^{- 8}$	3.822 $\times 10^{- 8}$
0.3		3.780 $\times 10^{- 8}$	3.672 $\times 10^{- 8}$	3.631 $\times 10^{- 8}$	3.672 $\times 10^{- 8}$	3.780 $\times 10^{- 8}$
0.5		3.765 $\times 10^{- 8}$	3.631 $\times 10^{- 8}$	3.580 $\times 10^{- 8}$	3.631 $\times 10^{- 8}$	3.765 $\times 10^{- 8}$
0.7		3.780 $\times 10^{- 8}$	3.672 $\times 10^{- 8}$	3.631 $\times 10^{- 8}$	3.672 $\times 10^{- 8}$	3.780 $\times 10^{- 8}$
0.9		3.822 $\times 10^{- 8}$	3.780 $\times 10^{- 8}$	3.765 $\times 10^{- 8}$	3.780 $\times 10^{- 8}$	3.555 $\times 10^{- 8}$

References

1. Wang, H.; Zhao, Z.; Tang, Y. An effective few-shot learning approach via location-dependent partial differential equation. Knowl. Inf. Syst.; 2020; 62, pp. 1881-1901. [DOI: https://dx.doi.org/10.1007/s10115-019-01400-y]

2. Eymard, R.; Gallouët, T.; Herbin, R. Finite volume methods. Handb. Numer. Anal.; 2000; 7, pp. 713-1018.

3. Zhang, Y. A finite difference method for fractional partial differential equation. Appl. Math. Comput.; 2009; 215, pp. C524-C529. [DOI: https://dx.doi.org/10.1016/j.amc.2009.05.018]

4. Taylor, C.A.; Hughes, T.J.; Zarins, C.K. Finite element modeling of blood flow in arteries. Comput. Methods Appl. Mech. Eng.; 1998; 158, pp. 155-196. [DOI: https://dx.doi.org/10.1016/S0045-7825(98)80008-X]

5. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature; 2015; 521, pp. 436-444. [DOI: https://dx.doi.org/10.1038/nature14539] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/26017442]

6. Wu, K.; Xiu, D. Data-driven deep learning of partial differential equations in modal space. J. Comput. Phys.; 2020; 408, 109307. [DOI: https://dx.doi.org/10.1016/j.jcp.2020.109307]

7. Huang, S.; Feng, W.; Tang, C.; Lv, J. Partial Differential Equations Meet Deep Neural Networks: A Survey. arXiv; 2022; arXiv: 2211.05567

8. Khoo, Y.; Lu, J.; Ying, L. Solving parametric PDE problems with artificial neural networks. Eur. J. Appl. Math.; 2021; 32, pp. 421-435. [DOI: https://dx.doi.org/10.1017/S0956792520000182]

9. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys.; 2019; 378, pp. 686-707. [DOI: https://dx.doi.org/10.1016/j.jcp.2018.10.045]

10. Tartakovsky, A.; Marrero, C.O.; Perdikaris, P.; Tartakovsky, G.; Barajas-Solano, D. Physics-Informed Deep Neural Networks for Learning Parameters and Constitutive Relationships in Subsurface Flow Problems. Water Resour. Res.; 2020; 56, e2019WR026731. [DOI: https://dx.doi.org/10.1029/2019WR026731]

11. Wandel, N.; Weinmann, M.; Neidlin, M.; Klein, R. Spline-pinn: Approaching pdes without data using fast, physics-informed hermite-spline cnns. Proc. AAAI Conf. Artif. Intell.; 2022; 36, pp. 8529-8538. [DOI: https://dx.doi.org/10.1609/aaai.v36i8.20830]

12. Beck, C.; Hutzenthaler, M.; Jentzen, A.; Kuckuck, B. An overview on deep learning-based approximation methods for partial differential equations. Discret. Contin. Dyn. Syst.-Ser. B.; 2023; 28, pp. 3697-3746. [DOI: https://dx.doi.org/10.3934/dcdsb.2022238]

13. Jin, X.; Cai, S.; Li, H.; Karniadakis, G.E. NSFnets (Navier-Stokes flow nets): Physics-informed neural networks for the incompressible Navier-Stokes equations. J. Comput. Phys.; 2021; 426, 109951. [DOI: https://dx.doi.org/10.1016/j.jcp.2020.109951]

14. Wandel, N.; Weinmann, M.; Klein, R. Teaching the incompressible Navier–CStokes equations to fast neural surrogate models in three dimensions. Phys. Fluids; 2021; 33, 047117. [DOI: https://dx.doi.org/10.1063/5.0047428]

15. Lagaris, I.E.; Likas, A.; Fotiadis, D.I. Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans. Neural Netw.; 1998; 9, pp. 987-1000. [DOI: https://dx.doi.org/10.1109/72.712178] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/18255782]

16. Lu, L.; Meng, X.; Mao, Z.; Karniadakis, G.E. DeepXDE: A deep learning library for solving differential equations. SIAM Rev.; 2021; 63, pp. 208-228. [DOI: https://dx.doi.org/10.1137/19M1274067]

17. Meng, X.; Karniadakis, G.E. A composite neural network that learns from multi-fidelity data: Application to function approximation and inverse PDE problems. J. Comput. Phys.; 2020; 401, 109020. [DOI: https://dx.doi.org/10.1016/j.jcp.2019.109020]

18. Lu, L.; Pestourie, R.; Yao, W.; Wang, Z.; Verdugo, F.; Johnson, S.G. Physics-informed neural networks with hard constraints for inverse design. SIAM J. Sci. Comput.; 2021; 43, pp. B1105-B1132. [DOI: https://dx.doi.org/10.1137/21M1397908]

19. Cuomo, S.; Di Cola, V.S.; Giampaolo, F.; Raissi, M.; Piccialli, F. Scientific machine learning through physics—Cinformed neural networks: Where we are and what’s next. J. Sci. Comput.; 2022; 92, 88. [DOI: https://dx.doi.org/10.1007/s10915-022-01939-z]

20. Hornik, K.; Stinchcombe, M.; White, H. Multilayer feedforward networks are universal approximators. Neural Netw.; 1989; 2, pp. 359-366. [DOI: https://dx.doi.org/10.1016/0893-6080(89)90020-8]

21. Lu, L.; Meng, X.; Cai, S.; Mao, Z.; Goswami, S.; Zhang, Z.; Karniadakis, G.E. A comprehensive and fair comparison of two neural operators (with practical extensions) based on fair data. Comput. Methods Appl. Mech. Eng.; 2022; 393, 114778. [DOI: https://dx.doi.org/10.1016/j.cma.2022.114778]

22. Lu, L.; Jin, P.; Karniadakis, G.E. Deeponet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators. arXiv; 2019; arXiv: 1910.03193

23. Chen, T.; Chen, H. Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems. IEEE Trans. Neural Netw.; 1995; 6, pp. 911-917. [DOI: https://dx.doi.org/10.1109/72.392253] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/18263379]

24. Kovachki, N.; Lanthaler, S.; Mishra, S. On universal approximation and error bounds for Fourier neural operators. J. Mach. Learn. Res.; 2021; 22, pp. 13237-13312.

25. Lu, L.; Jin, P.; Pang, G.; Zhang, Z.; Karniadakis, G.E. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nat. Mach. Intell.; 2021; 3, pp. 218-229. [DOI: https://dx.doi.org/10.1038/s42256-021-00302-5]

26. Anandkumar, A.; Azizzadenesheli, K.; Bhattacharya, K.; Kovachki, N.; Li, Z.; Liu, B.; Stuart, A. ICLR 2020 Workshop on Integration of Deep Neural Models and Differential Equations. 2020; Available online: https://openreview.net/forum?id=fg2ZFmXFO3 (accessed on 27 February 2020).

27. Li, Z.; Kovachki, N.B.; Azizzadenesheli, K.; Bhattacharya, K.; Stuart, A.; Anandkumar, A. Fourier Neural Operator for Parametric Partial Differential Equations. Int. Conf. Learn. Represent.; 2020; Available online: https://openreview.net/forum?id=c8P9NQVtmnO (accessed on 8 January 2021).

28. Bachman, G.; Narici, L.; Beckenstein, E. Fourier and Wavelet Analysis; Springer: New York, NY, USA, 2000.

29. Shervani-Tabar, N.; Zabaras, N. Physics-constrained predictive molecular latent space discovery with graph scattering variational autoencoder. arXiv; 2020; arXiv: 2009.13878

30. Gupta, G.; Xiao, X.; Bogdan, P. Multiwavelet-based operator learning for differential equations. Adv. Neural Inf. Process. Syst.; 2021; 34, pp. 24048-24062.

31. Tripura, T.; Chakraborty, S. Wavelet neural operator for solving parametric partial differential equations in computational mechanics problems. Comput. Methods Appl. Mech. Eng.; 2023; 404, 115783. [DOI: https://dx.doi.org/10.1016/j.cma.2022.115783]

32. Zainuddin, Z.; Pauline, O. Modified wavelet neural network in function approximation and its application in prediction of time-series pollution data. Appl. Soft Comput.; 2011; 11, pp. 4866-4874. [DOI: https://dx.doi.org/10.1016/j.asoc.2011.06.013]

33. Li, Y.; Xu, L.; Ying, S. DWNN: Deep Wavelet Neural Network for Solving Partial Differential Equations. Mathematics; 2022; 10, 1976. [DOI: https://dx.doi.org/10.3390/math10121976]

34. Alpert, B.; Beylkin, G.; Gines, D.; Vozovoi, L. Adaptive solution of partial differential equations in multiwavelet bases. J. Comput. Phys.; 2002; 182, pp. 149-190. [DOI: https://dx.doi.org/10.1006/jcph.2002.7160]

35. Keinert, F. Wavelets and MultIWAVElets; CRC Press: Boca Raton, FL, USA, 2003.

36. Goedecker, S. Wavelets and their application for the solution of Poisson’s and Schrödinger’s equation. Multiscale Simul. Methods Mol. Sci.; 2009; 42, pp. 507-534.

37. Heydari, M.H.; Hooshmandasl, M.R.; Mohammadi, F. Legendre wavelets method for solving fractional partial differential equations with Dirichlet boundary conditions. Appl. Math. Comput.; 2014; 234, pp. 267-276. [DOI: https://dx.doi.org/10.1016/j.amc.2014.02.047]

38. Abbas, Z.; Vahdati, S.; Atan, K.A.; Long, N.N. Legendre multi-wavelets direct method for linear integro-differential equations. Appl. Math. Sci.; 2009; 3, pp. 693-700.

39. Fujieda, S.; Takayama, K.; Hachisuka, T. Wavelet convolutional neural networks. arXiv; 2018; arXiv: 1805.08620

40. Zhao, X.; Huang, P.; Shu, X. Wavelet-Attention CNN for image classification. Multimed. Syst.; 2022; 28, pp. 915-924. [DOI: https://dx.doi.org/10.1007/s00530-022-00889-8]

41. Guo, J.; Liu, X.; Li, S.; Wang, Z. Bearing intelligent fault diagnosis based on wavelet transform and convolutional neural network. Shock. Vib.; 2020; 2020, pp. 1-14. [DOI: https://dx.doi.org/10.1155/2020/6380486]

42. Onjun, R.; Sriwichai, K.; Dungkratoke, N.; Kaennakham, S. Wavelet Pooling Scheme in the Convolution Neural Network (CNN) for Breast Cancer Detection. Machine Learning and Artificial Intelligence; IOS Press: Amsterdam, The Netherlands, 2022; pp. 72-77.

43. Wolter, M.; Garcke, J. Adaptive wavelet pooling for convolutional neural networks. Proceedings of the International Conference on Artificial Intelligence and Statistics; Virtual, 13–15 April 2021; pp. 1936-1944.

44. Zauderer, E. Partial Differential Equations of Applied Mathematics; John Wiley & Sons.: Hoboken, NJ, USA, 2011.

45. Fischer, A.; Igel, C. An introduction to restricted Boltzmann machines. Proceedings of the Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications: 17th Iberoamerican Congress, CIARP 2012; Buenos Aires, Argentina, 3–6 September 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 14-36.

46. Grafakos, L. Classical and Modern Fourier Analysis; Springer: New York, NY, USA, 2004.

47. Li, B.; Chen, X. Wavelet-based numerical analysis: A review and classification. Finite Elem. Anal. Design; 2014; 81, pp. 14-31. [DOI: https://dx.doi.org/10.1016/j.finel.2013.11.001]

48. Zhu, Y.; Zabaras, N.; Koutsourelakis, P.S.; Perdikaris, P. Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data. J. Comput. Phys.; 2019; 394, pp. 56-81. [DOI: https://dx.doi.org/10.1016/j.jcp.2019.05.024]

49. Alpert, B.K. A class of bases in L ² for the sparse representation of integral operators. SIAM J. Math. Anal.; 1993; 24, pp. 246-262. [DOI: https://dx.doi.org/10.1137/0524016]

50. Agarwal, R.P.; O’ Regan, D. Legendre polynomials and functions. Ordinary and Partial Differential Equations: With Special Functions, Fourier Series, and Boundary Value Problems; Springer Science & Business Media: New York, NY, USA, 2009; pp. 47-56.

51. Mei, L.; Wu, B.; Lin, Y. Shifted-Legendre orthonormal method for high-dimensional heat conduction equations. AIMS Math.; 2022; 7, pp. 9463-9478. [DOI: https://dx.doi.org/10.3934/math.2022525]

52. Chatrabgoun, O.; Parham, G.; Chinipardaz, R. A Legendre multiwavelets approach to copula density estimation. Stat. Pap.; 2017; 58, pp. 673-690. [DOI: https://dx.doi.org/10.1007/s00362-015-0720-0]

53. Khellat, F.; Yousefi, S.A. The linear Legendre mother wavelets operational matrix of integration and its application. J. Frankl. Inst.; 2006; 343, pp. 181-190. [DOI: https://dx.doi.org/10.1016/j.jfranklin.2005.11.002]

54. Hellwig, G. Partial Differential Equations: An Introduction; Springer: Berlin/Heidelberg, Germany, 2013.

55. Zhang, R.; Lin, Y. A new algorithm of boundary value problems based on improved wavelet basis and the reproducing kernel theory. Math. Methods Appl. Sci.; 2023; pp. 1-11. [DOI: https://dx.doi.org/10.1002/mma.9640]

56. Liu, Y.; Fu, P.; Liu, W.; Lin, D. Lecture Notes on Mathematical Analysis; 4th ed. Higher Education Press: Beijing, China, 2003.

57. Yamada, M. Wavelets: Applications. Encycl. Math. Phys.; 2006; pp. 420-426. [DOI: https://dx.doi.org/10.1016/B0-12-512666-2/00242-X]

58. Bedford, T.; Daneshkhah, A.; Wilson, K.J. Approximate uncertainty modeling in risk analysis with vine copulas. Risk Anal.; 2016; 36, pp. 792-815. [DOI: https://dx.doi.org/10.1111/risa.12471]

59. The CFD Benchmarking Project. 2021; Available online: http://www.mathematik.tu-dortmund.de/~featflow/en/benchmarks/cfdbenchmarking.html (accessed on 8 September 2021).

60. Wang, Y.; Wang, W.; Mei, L.; Lin, Y.; Sun, H. An ε-Approximate Approach for Solving Variable-Order Fractional Differential Equations. Fractal Fract.; 2023; 7, 90. [DOI: https://dx.doi.org/10.3390/fractalfract7010090]

61. Saleh, M.; Nagy, Á.; Kovács, E. Construction and investigation of new numerical algorithms for the heat equation: Part III. Multidiszciplináris Tudományok; 2020; 10, pp. 349-360. [DOI: https://dx.doi.org/10.35925/j.multi.2020.4.38]

Word count: 8071

Show less

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Translate

The purpose of this paper is to leverage the advantages of physics-informed neural network (PINN) and convolutional neural network (CNN) by using Legendre multiwavelets (LMWs) as basis functions to approximate partial differential equations (PDEs). We call this method Physics-Informed Legendre Multiwavelets CNN (PiLMWs-CNN), which can continuously approximate a grid-based state representation that can be handled by a CNN. PiLMWs-CNN enable us to train our models using only physics-informed loss functions without any precomputed training data, simultaneously providing fast and continuous solutions that generalize to previously unknown domains. In particular, the LMWs can simultaneously possess compact support, orthogonality, symmetry, high smoothness, and high approximation order. Compared to orthonormal polynomial (OP) bases, the approximation accuracy can be greatly increased and computation costs can be significantly reduced by using LMWs. We applied PiLMWs-CNN to approximate the damped wave equation, the incompressible Navier–Stokes (N-S) equation, and the two-dimensional heat conduction equation. The experimental results show that this method provides more accurate, efficient, and fast convergence with better stability when approximating the solution of PDEs.

Details

Title

Approximating Partial Differential Equations with Physics-Informed Legendre Multiwavelets CNN

Author

Wang, Yahong¹; Wang, Wenmin²

; Cheng, Yu³; Sun, Hongbo⁴; Zhang, Ruimin⁴

¹ School of Computer Science and Engineering, Macau University of Science and Technology, Macau 999078, China; [email protected]; Zhuhai Campus, Beijing Institute of Technology, Zhuhai 519088, China; [email protected] (H.S.); [email protected] (R.Z.)
² School of Computer Science and Engineering, Macau University of Science and Technology, Macau 999078, China; [email protected]
³ School of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, China; [email protected]
⁴ Zhuhai Campus, Beijing Institute of Technology, Zhuhai 519088, China; [email protected] (H.S.); [email protected] (R.Z.)

First page

Publication year

2024

Publication date

2024

Publisher

MDPI AG

e-ISSN

25043110

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.3390/fractalfract8020091

ProQuest document ID

2930946555

Approximating Partial Differential Equations with Physics-Informed Legendre Multiwavelets CNN

Jump to:

Full text

Abstract

Details

Suggested sources