1. Introduction
A polynomial matrix equation of degree m is an equation of the type
$A_m X^m + A_{m-1} X^{m-1} + \dots + A_1 X + A_0 = 0_n$, (1)
where $0_n$ denotes the null square matrix of order n, and $A_i$ and $X$ are square matrices of order n, i.e., $A_i, X \in \mathbb{C}^{n\times n}$, where $i = 0, 1, \dots, m$. The matrices $A_i$ are given, and X is the unknown matrix to be solved for. In analogy with the theory of polynomial equations, polynomial matrix equations have a noncommutative analog of the Vieta theorem [1]. The solution of (1) is fundamental in the analysis of queuing problems modeled by Markov chains [2], as well as in the interplay between Toeplitz matrices and polynomial computations (see [3] and the references therein).
A particular problem of (1) that has been examined by many researchers is the calculation of the m-th roots of a given matrix [4,5,6,7]:
$X^m = A$.
Another particular problem arising from (1) is to consider X a scalar matrix, i.e., $X = \lambda I$, thus (1) reduces to the so-called lambda matrices $A_m \lambda^m + A_{m-1}\lambda^{m-1} + \dots + A_0$. When $m = 2$, the quadratic matrix polynomial arises in a natural way in the study of damped vibrating systems [8]. Matrix polynomials of arbitrary degree m are studied in [9]. Newton's method [10] has been proposed to numerically solve (1). However, the solution obtained depends on the initial iterate. Also, the solution has to be a simple root in order to obtain quadratic convergence. In addition, this method has a high computational cost per iteration. Therefore, this method is not considered further here.
When X is a diagonalizable matrix, the solutions of (1) can be obtained for arbitrary m [11]. The maximum finite number of diagonalizable solutions is $\binom{nm}{n}$. Next, we describe this method. Let $v_i$ be the eigenvectors of the matrix X, and $\lambda_i$ the corresponding eigenvalues, such that
$X v_i = \lambda_i v_i, \qquad i = 1, \dots, n$. (2)
According to (1) and (2), we have $\left(A_m X^m + \dots + A_1 X + A_0\right) v_i = \left(A_m \lambda_i^m + \dots + A_1 \lambda_i + A_0\right) v_i = 0$. Therefore, the eigenvalues of X satisfy the polynomial equation
$\det\!\left(A_m \lambda^m + A_{m-1}\lambda^{m-1} + \dots + A_1 \lambda + A_0\right) = 0$, (3)
and the eigenvectors satisfy
$\left(A_m \lambda_i^m + A_{m-1}\lambda_i^{m-1} + \dots + A_1 \lambda_i + A_0\right) v_i = 0$. (4)
Let P be the following matrix: $P = (v_1 \; v_2 \; \cdots \; v_n)$, where the eigenvectors are set in columns in matrix P, and $D = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$. If X is a diagonalizable matrix, then
$X = P D P^{-1}$. (5)
The above method has been coded in MATLAB, and it is available online. It has two main drawbacks:
It only works when X is a diagonalizable matrix;
Even when X is diagonalizable, it is computationally expensive. Note that we have to solve in (3) a polynomial equation of degree at most $nm$. Also, for each eigenvalue $\lambda_i$, we have to solve in (4) for the corresponding eigenvectors $v_i$, but many of these eigenvectors are redundant. Finally, there are many possibilities to construct the factorization in (5) in order to obtain all the different solutions for X. Again, many of these possibilities are redundant.
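For illustration, a minimal MATLAB sketch of this diagonalization approach for the quadratic case ($m = 2$) could look as follows. This is our own sketch, not the code referenced by the authors; the function name diag_solutions and the use of polyeig and nchoosek are our choices.

% Diagonalizable solutions of A2*X^2 + A1*X + A0 = 0 via (2)-(5). Illustrative sketch only.
function sols = diag_solutions(A2, A1, A0)
    n = size(A0, 1);
    [V, e] = polyeig(A0, A1, A2);          % eigenpairs of (A0 + t*A1 + t^2*A2)*v = 0, cf. (3)-(4)
    sols = {};
    for idx = nchoosek(1:numel(e), n).'    % every choice of n eigenpairs
        lam = e(idx);
        P   = V(:, idx);
        if all(isfinite(lam)) && rank(P) == n   % keep n independent, finite eigenpairs
            sols{end+1} = P * diag(lam) / P;    %#ok<AGROW>  X = P*D*P^{-1}, Equation (5)
        end
    end
end

Note that the number of candidate index sets grows combinatorially with n and m, which is one source of the computational cost mentioned above.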
The aim of this paper is to propose a very simple heuristic method to obtain solutions of (1) when the matrices $A_i$ in (1) are scalar matrices. This heuristic method always works when the order of the matrices is $n = 2$, and it is not difficult to program. In the recent literature, we have found another approach to solving this kind of polynomial matrix equation with scalar coefficients [12], which is based on an interpolation method. Although this interpolation method is much more involved, we have coded it in MATLAB in order to compare it with our heuristic method. In fact, all the examples presented in this paper have been computed using a MATLAB code available online.
This paper is organized as follows. Section 2 describes the heuristic method for square matrices of arbitrary order n when the matrix B can be expressed as a linear combination of the identity matrix and a nilpotent, involutive, or idempotent matrix. Whenever this decomposition is possible, an algorithm to find it is described. Section 3 deals with the particular case $n = 2$, since the aforementioned decomposition for the matrix B is always possible in this case. When B is a scalar matrix, we sometimes have an infinite number of solutions by virtue of the Cayley–Hamilton theorem. We explicitly calculate this kind of solution for $m = n = 2$. As an application, some examples are given from graph theory. Finally, we present our conclusions in Section 4.
2. Polynomial Matrix Equations of Arbitrary Order
Consider the polynomial matrix equation in X of degree m:
$a_m X^m + a_{m-1} X^{m-1} + \dots + a_1 X + a_0 I = B$, (6)
where $a_0, a_1, \dots, a_m \in \mathbb{C}$ are scalar coefficients, and X and B are square matrices of order n, i.e., $X, B \in \mathbb{C}^{n\times n}$. Assume that B admits the following decomposition:
$B = pI + qN$, (7)
where $p, q \in \mathbb{C}$, N is an idempotent, involutive, or nilpotent matrix with index 2, and I is the identity matrix, i.e.,
$N^2 = N$, $\quad N^2 = I$, or $\quad N^2 = 0_n$, respectively.
Note that if N is a nilpotent matrix, qN is also a nilpotent matrix; thus, in this case, we will consider the decomposition
$B = pI + H$, with $H^2 = 0_n$. (8)
We look for solutions of the form
$X = \mu I + \lambda N$, (9)
or
$X = \mu I + \lambda H$, (10)
where $\mu, \lambda \in \mathbb{C}$.
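For later use, the powers of the trial forms (9) and (10) admit the following closed expressions (a short derivation we add here for clarity; it follows from the binomial theorem together with $N^2 = N$, $N^2 = I$, or $H^2 = 0_n$, and it underlies the substitutions carried out in the proofs of Theorems 1–3 below). For every $k \geq 1$,
$N^2 = N: \quad (\mu I + \lambda N)^k = \mu^k I + \left[(\mu+\lambda)^k - \mu^k\right] N$,
$N^2 = I: \quad (\mu I + \lambda N)^k = \tfrac{1}{2}\left[(\mu+\lambda)^k + (\mu-\lambda)^k\right] I + \tfrac{1}{2}\left[(\mu+\lambda)^k - (\mu-\lambda)^k\right] N$,
$H^2 = 0_n: \quad (\mu I + \lambda H)^k = \mu^k I + k\,\mu^{k-1}\lambda\, H$.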
2.1. The Heuristic Decomposition
We want to know when the heuristic decomposition given in (7) is possible. Notice that the idempotent decomposition is always possible when B is a scalar matrix, $B = bI$, when $q = 0$ and N is arbitrary (say $N = I$). When B is not a scalar matrix, from (7), we have
$qN = B - pI$,
and
$q^2 N^2 = (B - pI)^2 = B^2 - 2pB + p^2 I$; (11)
thus, defining
$C := B^2$, (12)
we have that
$C = \alpha B + \beta I$, (13)
where, for the idempotent case ($N^2 = N$),
$\alpha = 2p + q, \qquad \beta = -p\,(p + q)$, (14)
and, for the involutive case ($N^2 = I$),
$\alpha = 2p, \qquad \beta = q^2 - p^2$. (15)
According to (11), if $q = 0$ (and B is not a scalar matrix), the heuristic decomposition (7) does not exist. In this case, we can look for a nilpotent decomposition (8), since then $(B - pI)^2 = 0_n$. Therefore, if (13) is satisfied for some calculated parameters $\alpha$ and $\beta$, then the decomposition (7) or (8) can be constructed. In any case, we have to calculate $\alpha$ and $\beta$ from (13). We will see in Section 3 that this is always possible when $n = 2$. When $n > 2$, this is not always possible. However, when $n > 2$, we can look for heuristic solutions by calculating $\alpha$ and $\beta$ in (13) from some matrix elements of $B^2$, and then checking whether (13) holds for these calculated parameters $\alpha$ and $\beta$. If this is so, the heuristic decomposition is possible. This method has been coded in MATLAB, and it is available online.
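As an illustration of this check, the following MATLAB sketch fits α and β in (13) and verifies whether (13) holds (a sketch of ours, not the authors' code; fitting by least squares over all matrix elements and the tolerance are our own choices, whereas the text above only requires some matrix elements):

% Check whether B^2 is a linear combination of B and I, Equation (13).
% Illustrative sketch only. For a scalar matrix B the fit is rank-deficient;
% that case is handled separately in the text.
function [alpha, beta, ok] = check_heuristic(B)
    n  = size(B, 1);
    I  = eye(n);
    ab = [B(:), I(:)] \ reshape(B^2, [], 1);   % least-squares fit of B^2 = alpha*B + beta*I
    alpha = ab(1);
    beta  = ab(2);
    res = norm(B^2 - alpha*B - beta*I, 'fro'); % residual of (13)
    ok  = res < 1e-10 * max(1, norm(B, 'fro')^2);
end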
2.2. Idempotent Case

Theorem 1. Consider the polynomial matrix Equation (6), i.e.,
$a_m X^m + a_{m-1} X^{m-1} + \dots + a_1 X + a_0 I = B$.
If B admits an idempotent decomposition (7), i.e.,
$B = pI + qN$,
where N is an idempotent matrix, then the solutions are of the form
$X = \mu I + \lambda N$,
where μ satisfies the polynomial equation
$\sum_{k=0}^{m} a_k \mu^k = p$,
and, for each solution of μ, λ satisfies the polynomial equation
$\sum_{k=0}^{m} a_k \left[(\mu + \lambda)^k - \mu^k\right] = q$.
Proof. Insert (7) and (9) into (6), and consider that N is idempotent (so that $N^k = N$ for every $k \geq 1$), to obtain
$\sum_{k=0}^{m} a_k (\mu I + \lambda N)^k = \sum_{k=0}^{m} a_k \left\{ \mu^k I + \left[(\mu+\lambda)^k - \mu^k\right] N \right\} = pI + qN$.
Therefore,
$\sum_{k=0}^{m} a_k \mu^k = p$, (16)
$\sum_{k=0}^{m} a_k \left[(\mu+\lambda)^k - \mu^k\right] = q$. (17)
Taking into account the discrete Heaviside function, defined as $\theta(j) = 0$ for $j = 0$ and $\theta(j) = 1$ for $j \geq 1$, rewrite (17) as follows:
$\sum_{k=0}^{m} \sum_{j=0}^{k} a_k \binom{k}{j}\, \theta(j)\, \mu^{k-j} \lambda^{j} = q$. (18)
□
Notice that from (16) we have a maximum of m different solutions for μ. According to (18), for each solution of μ, we have a maximum of m solutions for λ. Therefore, we have a maximum of $m^2$ different pairs $(\mu, \lambda)$ and, according to (9), a maximum of $m^2$ solutions for X.
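As an illustration, a minimal MATLAB sketch of this construction follows (this is our own sketch, not the authors' published code; the coefficient ordering a = [a_0 a_1 ... a_m], the function name solve_idempotent, and the use of roots are our choices). It exploits the fact that, by (16)–(18), μ is a root of $P(x) = p$ and $\mu + \lambda$ is a root of $P(x) = p + q$, where $P(x) = a_m x^m + \dots + a_1 x + a_0$.

% Solutions X = mu*I + lambda*N of sum_k a_k X^k = p*I + q*N, N idempotent (Theorem 1).
% a = [a_0 a_1 ... a_m] holds the scalar coefficients of (6). Illustrative sketch only.
function sols = solve_idempotent(a, p, q, N)
    I = eye(size(N, 1));
    c = fliplr(a(:).');                        % descending coefficients of P(x)
    e = zeros(size(c)); e(end) = 1;            % selects the constant term
    mus = roots(c - p*e);                      % P(mu) = p, Equation (16)
    nus = roots(c - (p + q)*e);                % P(mu + lambda) = p + q, from (16)-(18)
    sols = {};
    for mu = mus.'
        for nu = nus.'
            sols{end+1} = mu*I + (nu - mu)*N;  %#ok<AGROW>  Equation (9), lambda = nu - mu
        end
    end
end

For Example 1 below, one would call this sketch with a = [0 1 1] (so that $P(R) = R^2 + R$) and the computed decomposition parameters p, q, and N.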
Example 1. Consider a matrix B whose matrix elements denote the number of paths in one or two steps in a graph; calculate the adjacency matrix R of the graph.
According to Chapter 2 in [13], the adjacency matrix R satisfies the quadratic matrix equation $R^2 + R = B$.
The MATLAB 2023b code developed to find heuristic decompositions yields p, q, and an idempotent matrix N such that $B = pI + qN$. Applying the method described in Theorem 1, we obtain four different solutions. Since the adjacency matrix can only contain 0 or 1 as matrix elements, only one of the four candidates is the sought adjacency matrix R. It is worth noting that we obtain the same set of solutions by using both the diagonalization and interpolation methods. However, when performing 100 tests, the diagonalization method is ≈3500 times slower on average than the idempotent method, and the interpolation method is ≈1000 times slower on average than the idempotent method.

Example 2. Calculate the square root matrix of the following matrix:
The MATLAB code developed to find heuristic decompositions yields p, q, and an idempotent matrix N for the given matrix. Applying the method described in Theorem 1, we obtain two different solutions:
(19)
It is worth noting that we also obtain the solution (19) using the diagonalization method. However, if we apply the interpolation method to this problem, we obtain a “division by zero” error. It is also worth noting that none of the methods is able to obtain the infinite sets of solutions that exist in this example.

2.3. Involutive Case
Theorem 2. Consider the polynomial matrix Equation (6), i.e.,
$a_m X^m + a_{m-1} X^{m-1} + \dots + a_1 X + a_0 I = B$.
If B admits an involutive decomposition (7), i.e.,
$B = pI + qN$,
where N is an involutive matrix, then the solutions are of the form
$X = \mu I + \lambda N$,
where
$\mu = \dfrac{x_i^{+} + x_j^{-}}{2}, \qquad \lambda = \dfrac{x_i^{+} - x_j^{-}}{2}, \qquad i, j = 1, \dots, m$,
being $x_i^{\pm}$ the m roots of the polynomial
$\sum_{k=0}^{m} a_k x^k = p \pm q$.
Proof. Insert (7) and (9) into (6), and consider that N is involutive (so that $N^k = I$ for k even and $N^k = N$ for k odd), to obtain
$\sum_{k=0}^{m} a_k (\mu I + \lambda N)^k = \sum_{k=0}^{m} a_k \left\{ \dfrac{(\mu+\lambda)^k + (\mu-\lambda)^k}{2}\, I + \dfrac{(\mu+\lambda)^k - (\mu-\lambda)^k}{2}\, N \right\} = pI + qN$.
Therefore,
$\sum_{k=0}^{m} a_k \dfrac{(\mu+\lambda)^k + (\mu-\lambda)^k}{2} = p$, (20)
$\sum_{k=0}^{m} a_k \dfrac{(\mu+\lambda)^k - (\mu-\lambda)^k}{2} = q$. (21)
Sum up (20) and (21) to obtain
$\sum_{k=0}^{m} a_k (\mu+\lambda)^k = p + q$. (22)
Also, subtracting (21) from (20), we have
$\sum_{k=0}^{m} a_k (\mu-\lambda)^k = p - q$. (23)
Now, denote as $x_i^{\pm}$, $i = 1, \dots, m$, the m roots of the polynomial
$\sum_{k=0}^{m} a_k x^k = p \pm q$. (24)
According to (22) and (23), we have $\mu + \lambda = x_i^{+}$ and $\mu - \lambda = x_j^{-}$, for $i, j = 1, \dots, m$; thus $\mu = \dfrac{x_i^{+} + x_j^{-}}{2}$ and $\lambda = \dfrac{x_i^{+} - x_j^{-}}{2}$. Hence, we have a maximum of $m^2$ different solutions. □
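Again as an illustration (ours, not the authors' code), a MATLAB sketch of the involutive case pairs the roots of the two polynomials in (22) and (23); the naming conventions are the same as in the idempotent sketch above.

% Solutions X = mu*I + lambda*N of sum_k a_k X^k = p*I + q*N, N involutive (Theorem 2).
% a = [a_0 a_1 ... a_m]. Illustrative sketch only.
function sols = solve_involutive(a, p, q, N)
    I  = eye(size(N, 1));
    c  = fliplr(a(:).');                  % descending coefficients of P(x)
    e  = zeros(size(c)); e(end) = 1;
    xp = roots(c - (p + q)*e);            % roots of P(x) = p + q, Equation (22)
    xm = roots(c - (p - q)*e);            % roots of P(x) = p - q, Equation (23)
    sols = {};
    for r = xp.'
        for s = xm.'
            sols{end+1} = ((r + s)/2)*I + ((r - s)/2)*N;  %#ok<AGROW>  mu and lambda as in Theorem 2
        end
    end
end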
Example 3. Consider a matrix B whose matrix elements denote the number of paths in one or two steps in a graph; calculate the adjacency matrix R of the graph.
The adjacency matrix R satisfies the quadratic matrix equation $R^2 + R = B$.
The MATLAB code developed to find heuristic decompositions yields p, q, and an involutive matrix N such that $B = pI + qN$. Applying the method described in Theorem 2, we obtain four different solutions. However, the adjacency matrix can only contain 0 or 1 as matrix elements; therefore, only one of the four candidates is the sought adjacency matrix R. It is worth noting that we obtain the same set of solutions by using both the diagonalization and interpolation methods. However, performing 100 tests, the diagonalization method is ≈3900 times slower on average than the involutive method, and the interpolation method is ≈1100 times slower on average than the involutive method.

2.4. Nilpotent Case
Theorem 3. Consider the polynomial matrix Equation (6), i.e.,
$a_m X^m + a_{m-1} X^{m-1} + \dots + a_1 X + a_0 I = B$.
If B admits a nilpotent decomposition (8), i.e.,
$B = pI + H$,
where H is a nilpotent matrix with index 2, then the solutions are of the form
$X = \mu I + \lambda H$,
where μ satisfies the polynomial equation
$\sum_{k=0}^{m} a_k \mu^k = p$,
and, for each solution of μ, λ is calculated as
$\lambda = \left(\sum_{k=1}^{m} k\, a_k\, \mu^{k-1}\right)^{-1}$.

Proof. Insert (8) and (10) into (6), and consider that H is nilpotent with index 2 ($H^2 = 0_n$), to obtain
$\sum_{k=0}^{m} a_k (\mu I + \lambda H)^k = \sum_{k=0}^{m} a_k \left(\mu^k I + k\,\mu^{k-1}\lambda\, H\right) = pI + H$.
Therefore,
$\sum_{k=0}^{m} a_k \mu^k = p$, (25)
$\lambda \sum_{k=1}^{m} k\, a_k\, \mu^{k-1} = 1$. (26)
□
Notice that we have one solution for λ for each solution of μ; thus, according to (10), we have a maximum of m different solutions for X. However, if
$\sum_{k=1}^{m} k\, a_k\, \mu^{k-1} = 0$,
then, according to (26), λ does not exist. Therefore, we have to eliminate these cases as solutions of X.
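A corresponding MATLAB sketch for the nilpotent case (again only an illustration, with our own naming) also discards the values of μ for which λ does not exist:

% Solutions X = mu*I + lambda*H of sum_k a_k X^k = p*I + H, H nilpotent of index 2 (Theorem 3).
% a = [a_0 a_1 ... a_m]. Illustrative sketch only.
function sols = solve_nilpotent(a, p, H)
    I  = eye(size(H, 1));
    c  = fliplr(a(:).');                   % descending coefficients of P(x)
    e  = zeros(size(c)); e(end) = 1;
    dc = polyder(c);                       % coefficients of P'(x)
    sols = {};
    for mu = roots(c - p*e).'              % P(mu) = p, Equation (25)
        dP = polyval(dc, mu);
        if abs(dP) > 1e-12                 % otherwise lambda does not exist, cf. (26)
            sols{end+1} = mu*I + (1/dP)*H; %#ok<AGROW>  lambda = 1/P'(mu)
        end
    end
end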
Example 4. Solve the following quadratic matrix equation:
where
The MATLAB code developed to find heuristic decompositions yields p and H so that $B = pI + H$, where H is a nilpotent matrix with index 2. Applying the method described in Theorem 3, we obtain two different solutions. It is worth noting that the interpolation method provides the same set of solutions for X, but since X is not a diagonalizable matrix, the diagonalization method does not provide a solution. Despite this fact, performing 100 tests, the diagonalization method is ≈500 times slower on average than the nilpotent heuristic method. Also, the interpolation method is ≈1500 times slower on average than the nilpotent method.

Example 5. Compute a matrix A that satisfies the following equation:
(27)
Here, the matrix B in (27) is a scalar matrix. Since $I^2 = I$ (i.e., the identity matrix is idempotent as well as involutive), B admits infinitely many idempotent or involutive decompositions.
Also, B admits a nilpotent decomposition. All the heuristic methods, as well as the diagonalization and interpolation methods, yield the same set of solutions:
(28)
The computational performance of all the heuristic methods is very similar. Nevertheless, when performing 100 tests, the diagonalization method is ≈1200 times slower than the heuristic method on average, and the interpolation method is ≈9000 times slower than the heuristic method on average. Note that the solutions given in (28) are “trivial”, since they are based on a straightforward factorization of (27).

3. Polynomial Matrix Equation of Order 2
The main drawback of the examples presented above is that the matrix B has to admit an idempotent, involutive, or nilpotent decomposition. In general, this is not always possible, but in the particular case of $n = 2$, i.e., $B \in \mathbb{C}^{2\times 2}$, the decomposition given in (7) or (8) for B is always possible. Next, we consider how to calculate the corresponding decomposition.
3.1. Idempotent Decomposition
For $n = 2$, writing $B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}$, (13) reads as
$\begin{pmatrix} b_{11}^2 + b_{12}b_{21} & b_{12}(b_{11}+b_{22}) \\ b_{21}(b_{11}+b_{22}) & b_{22}^2 + b_{12}b_{21} \end{pmatrix} = \alpha \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} + \beta \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. (29)
From the first and last matrix elements of (29), we have
$b_{11}^2 + b_{12}b_{21} = \alpha\, b_{11} + \beta, \qquad b_{22}^2 + b_{12}b_{21} = \alpha\, b_{22} + \beta$. (30)
If $b_{11} \neq b_{22}$, we calculate, subtracting the two equations in (30),
$\alpha = b_{11} + b_{22}$. (31)
However, if $b_{11} = b_{22}$, we can solve for α considering the second matrix element in (29). Therefore, if $b_{11} = b_{22}$ and $b_{12} \neq 0$,
$\alpha = b_{11} + b_{22} = 2\,b_{11}$. (32)
Nevertheless, if $b_{11} = b_{22}$ and $b_{12} = 0$, we can solve for α considering the third matrix element in (29). Therefore, if $b_{11} = b_{22}$, $b_{12} = 0$, and $b_{21} \neq 0$,
$\alpha = b_{11} + b_{22} = 2\,b_{11}$. (33)
Nonetheless, if $b_{11} = b_{22}$, $b_{12} = 0$, and $b_{21} = 0$, the matrix B is a scalar matrix, thus the decomposition in (7) is trivial, for instance,
$p = 0, \qquad q = b_{11}, \qquad N = I$. (34)
Finally, if, in addition, $b_{11} = 0$, the matrix B is the null matrix, thus the decomposition is also trivial,
$p = 0, \qquad q = 0, \qquad N \text{ arbitrary}$. (35)
Once α is calculated, we have from (30)
$\beta = b_{11}^2 + b_{12}b_{21} - \alpha\, b_{11} = b_{12}b_{21} - b_{11}b_{22}$. (36)
Now, according to (14), we have that p and q can be solved as
$p = \dfrac{\alpha \pm \sqrt{\alpha^2 + 4\beta}}{2}, \qquad q = \alpha - 2p$. (37)
Therefore, from (7), the idempotent matrix is
$N = \dfrac{1}{q}\,(B - pI)$. (38)
Note that from (37) we may obtain two different decompositions. It will be remembered that if $q = 0$, the decomposition is not possible, since N does not exist, according to (38). In summary, we have proved the following result: if B admits an idempotent decomposition (i.e., $B = pI + qN$, where N is an idempotent matrix), then the parameters p and q of such a decomposition can be calculated from the matrix elements of B; however, if the calculation of p yields $q = \alpha - 2p = 0$, and B is neither a scalar nor a null matrix, such a decomposition is not possible.
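For instance, a MATLAB sketch of this computation (ours, not the authors' code) uses $\alpha = \operatorname{tr} B$ and $\beta = -\det B$, which for $n = 2$ is equivalent to (31)–(36):

% Idempotent decomposition B = p*I + q*N of a 2x2 matrix B, following (29)-(38).
% Illustrative sketch only; ok = false signals q = 0, i.e., no such decomposition.
function [p, q, N, ok] = idempotent_decomp_2x2(B)
    alpha = trace(B);                  % Equations (31)-(33)
    beta  = -det(B);                   % Equation (36)
    d = sqrt(alpha^2 + 4*beta);
    p = (alpha + d)/2;                 % one of the two roots in (37)
    q = alpha - 2*p;                   % q = -d, from (14)
    ok = abs(q) > 1e-12;
    if ok
        N = (B - p*eye(2)) / q;        % Equation (38); N satisfies N^2 = N
    else
        N = [];                        % B = p*I + H with H nilpotent; see Section 3.3
    end
end

The second decomposition of (37) is obtained by taking p = (alpha - d)/2 instead.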
3.2. Involutive Decomposition

For the involutive case, (13), taking into account (15), reads as
$\begin{pmatrix} b_{11}^2 + b_{12}b_{21} & b_{12}(b_{11}+b_{22}) \\ b_{21}(b_{11}+b_{22}) & b_{22}^2 + b_{12}b_{21} \end{pmatrix} = 2p \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} + \left(q^2 - p^2\right)\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, (39)
thus we can apply the same results as before for α and β (recalling that B is neither a scalar matrix nor the null matrix). Thereby,
$\alpha = 2p = b_{11} + b_{22}$, (40)
and
$\beta = q^2 - p^2 = b_{12}b_{21} - b_{11}b_{22}$. (41)
When B is a scalar matrix, the decomposition is trivial; thereby, for $B = b_{11} I$,
$p = 0, \qquad q = b_{11}, \qquad N = I$. (42)
Also, when B is the null matrix, the decomposition is trivial as well,
$p = 0, \qquad q = 0, \qquad N = I$. (43)
According to (15), we have
$p = \dfrac{\alpha}{2} = \dfrac{b_{11}+b_{22}}{2}, \qquad q = \pm\sqrt{\beta + p^2} = \pm\dfrac{\sqrt{\alpha^2 + 4\beta}}{2}$, (44)
and the involutive matrix is given by
$N = \dfrac{1}{q}\,(B - pI)$. (45)
Note that from (44) we may obtain two different decompositions. It will be remembered that if $q = 0$, the decomposition is not possible, since N does not exist, according to (45).

Theorem 4. If a matrix $B \in \mathbb{C}^{2\times 2}$ cannot be decomposed as $B = pI + qN$, where $p, q \in \mathbb{C}$, $q \neq 0$, being N an idempotent or involutive matrix, and I the identity matrix, then B is of the form
$B = pI + H$, (46)
where H is a non-null nilpotent matrix with index 2.

Proof. As previously mentioned, the idempotent or involutive decomposition is not possible when $q = 0$ in (38) or (45), respectively. According to (37) and (44), the latter occurs when $\alpha^2 + 4\beta = (b_{11} - b_{22})^2 + 4\,b_{12}b_{21} = 0$. Consequently, according to (29) or (39), B has to satisfy
$\left(B - \dfrac{b_{11}+b_{22}}{2}\, I\right)^{2} = 0_2$. (47)
Therefore, $B - \dfrac{b_{11}+b_{22}}{2}\, I$ is a nilpotent matrix with index 2, so that we have the following cases:
$B - \dfrac{b_{11}+b_{22}}{2}\, I = 0_2$, thus B is a scalar matrix. However, according to (34) and (42), this case is an exception in both decompositions, where $\alpha^2 + 4\beta = 0$, but the decomposition is possible.
$B - \dfrac{b_{11}+b_{22}}{2}\, I = H$, where H is a non-null nilpotent matrix with index 2. This is the case given by (46), as we wanted to prove.
□
3.3. Nilpotent Decomposition
According to Theorem 4, if B does not admit an idempotent nor an involutive decomposition, then B admits a nilpotent decomposition. In this case, from (8) we have
$H = B - pI$. (48)
Note that if H is the null matrix, then $B = pI$, i.e., a scalar matrix, so that the decomposition can be obtained by using the idempotent or the involutive decomposition. Consequently, we will consider that H is a non-null nilpotent matrix with index 2. Also, note that H cannot be a scalar matrix, since non-null scalar matrices are not nilpotent. In summary, since $H \neq 0_2$, we determine that B is neither a scalar nor a null matrix. Therefore, α and β can be calculated with (40) and (41), and, according to (48), we have
$H = B - pI$, where $p = \dfrac{b_{11}+b_{22}}{2}$.
Note that this method generalizes the methods given in the existing literature to compute the square root of a matrix [14], i.e., to solve the equation $X^2 = B$. Unlike the square root of a scalar, the square root of a matrix may not exist. For example, we may verify with the proposed heuristic algorithm that a non-null nilpotent matrix with index 2
has no square root, in agreement with [15]. It is worth noting that the diagonalization method does not yield any solutions either. However, the diagonalization method does not clarify whether X exists or not. Also, the interpolation method yields a “division by zero” error.
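As an illustration of this existence check (a sketch of ours, with our own function name; not the authors' code), the nilpotent decomposition of Section 3.3 combined with Theorem 3 decides whether $X^2 = B$ is solvable in this degenerate case:

% Square root of a 2x2 matrix B satisfying (b11-b22)^2 + 4*b12*b21 = 0 (nilpotent branch).
% Here B = p*I + H with H = B - p*I nilpotent of index 2, and Theorem 3 is applied
% with a = [0 0 1], so that P(mu) = mu^2 and lambda = 1/P'(mu) = 1/(2*mu).
function [X, exists] = sqrt_nilpotent_case(B)
    p = trace(B)/2;                    % alpha = 2p = b11 + b22, Equation (40)
    H = B - p*eye(2);                  % Equation (48); H^2 = 0 in this case
    if abs(p) > 1e-12
        mu = sqrt(p);                  % mu^2 = p, Equation (25)
        X  = mu*eye(2) + (1/(2*mu))*H; % lambda = 1/(2*mu), Equation (26)
        exists = true;
    else
        X = []; exists = false;        % p = 0: B is a non-null nilpotent matrix, no square root
    end
end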
3.4. Singular Case: Infinite Number of Solutions

As was mentioned in Example 5, if B is a scalar matrix, then B admits infinitely many idempotent or involutive decompositions. However, following the method described above, all these decompositions yield “trivial” solutions. Nevertheless, according to the Cayley–Hamilton theorem, if the order of the matrices equals the degree of the polynomial matrix equation, i.e., $m = n$, there are infinitely many solutions. Next, we calculate all these solutions for the case $n = 2$. For this purpose, consider the quadratic polynomial equation (thus $m = n = 2$):
$a_2 X^2 + a_1 X + a_0 I = B = b\, I, \qquad b \in \mathbb{C}$. (49)
The characteristic polynomial of the matrix X is $\lambda^2 - \operatorname{tr}(X)\,\lambda + \det(X)$. According to the Cayley–Hamilton theorem,
$X^2 - \operatorname{tr}(X)\, X + \det(X)\, I = 0_2$. (50)
Consider
$X = \begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \end{pmatrix}$,
and compare (49) with (50) to obtain
$x_{11} + x_{22} = -\dfrac{a_1}{a_2}$, (51)
$x_{11}x_{22} - x_{12}x_{21} = \dfrac{a_0 - b}{a_2}$. (52)
Multiply (51) by $-x_{11}$ and add the result to (52) to arrive at
$x_{12}x_{21} = -x_{11}^2 - \dfrac{a_1}{a_2}\,x_{11} - \dfrac{a_0 - b}{a_2}$. (53)
According to (51) and (53), we obtain $x_{22}$ and $x_{21}$ in terms of the free parameters $x_{11}$ and $x_{12}$. Finally, we obtain the following infinite set of solutions:
$X = \begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \end{pmatrix}, \qquad x_{11} \in \mathbb{C}, \; x_{12} \in \mathbb{C}\setminus\{0\}$, (54)
where
$x_{22} = -\dfrac{a_1}{a_2} - x_{11}$, (55)
$x_{21} = \dfrac{1}{x_{12}}\left(-x_{11}^2 - \dfrac{a_1}{a_2}\,x_{11} - \dfrac{a_0 - b}{a_2}\right)$. (56)
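The following MATLAB snippet (ours, for illustration only) builds one member of the family (54)–(56) for given free parameters and lets one verify that it satisfies (49):

% One singular solution of a2*X^2 + a1*X + a0*I = b*I for free x11 and x12 ~= 0,
% following (54)-(56). Illustrative sketch only.
function X = singular_solution(a2, a1, a0, b, x11, x12)
    x22 = -a1/a2 - x11;                                % Equation (55)
    x21 = (-x11^2 - (a1/a2)*x11 - (a0 - b)/a2) / x12;  % Equation (56)
    X   = [x11, x12; x21, x22];
end

% Quick check for the square root of the identity (a2 = 1, a1 = a0 = 0, b = 1):
% X = singular_solution(1, 0, 0, 1, 0.3, 2);  norm(X^2 - eye(2))   % approximately 0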
We can generalize the above result to singular polynomial equations of the form
$a_m X^m + a_{m-1}X^{m-1} + \dots + a_1 X + a_0 I = 0_2$, (57)
where the scalar polynomial $a_m x^m + \dots + a_1 x + a_0$ factorizes into quadratic factors,
$a_m x^m + \dots + a_1 x + a_0 = a_m \prod_{i=1}^{k}\left(x^2 + \beta_i x + \gamma_i\right)$.
Notice that (57) can be reduced to (49) as follows: since the factors commute, every X satisfying $X^2 + \beta_i X + \gamma_i I = 0_2$ for some i is a solution of (57), where now we have k different solutions for the pair $\left(\operatorname{tr}X, \det X\right)$, and hence k infinite families of singular solutions.

Example 6. Calculate the square root of the identity matrix of order $n = 2$.
We have to solve the polynomial matrix equation
$X^2 = I$.
According to Theorem 1, we obtain the two “trivial” solutions
$X = \pm I$. (58)
However, according to the method described above, we also obtain the singular solutions
$X = \begin{pmatrix} x_{11} & x_{12} \\ \dfrac{1 - x_{11}^2}{x_{12}} & -x_{11} \end{pmatrix}, \qquad x_{11} \in \mathbb{C}, \; x_{12} \in \mathbb{C}\setminus\{0\}$. (59)
Note that the regular solutions obtained in (58) are not contained in (59). When performing 100 tests, the diagonalization and interpolation methods compute the solution in similar times on average. However, it is worth noting that the diagonalization and interpolation methods do not calculate the singular solutions (59).

Example 7. Calculate the solutions of the following polynomial matrix equation of order $n = 2$:
(60)
Note that (60) can be rewritten as
(61)
thus
(62)
Therefore, according to the method described above, the singular solutions of (62) are obtained from each quadratic factor, as in (54)–(56), and the regular ones are
(63)
Again, the diagonalization and the interpolation methods calculate the regular solutions, but not the singular ones. Moreover, the interpolation method exhibits a slight computational error in the last two regular solutions of (63). After 10 tests, the computational performance of the diagonalization and the heuristic methods is quite similar on average. Nonetheless, the interpolation method is ≈12 times slower on average than the heuristic method. This is quite significant, since the interpolation method does not provide any of the singular solutions, as mentioned before.

3.5. General Case
According to the above sections, we propose the following procedure to solve a polynomial matrix equation
$a_m X^m + a_{m-1}X^{m-1} + \dots + a_1 X + a_0 I = B$ (64)
of order $n = 2$:
1. Attempt the decomposition $B = pI + qN$, where N is an idempotent or an involutive matrix, according to Section 3.1 or Section 3.2. If the decomposition is successful, apply Theorem 1 (if N is idempotent) or Theorem 2 (if N is involutive) in order to solve (64).
2. If the idempotent or the involutive decomposition is not successful, then perform the decomposition $B = pI + H$ (being H a nilpotent matrix with index 2) according to Section 3.3. Apply Theorem 3 to solve (64).
3. Check whether there are singular cases, i.e., a polynomial equation of the form (57). If this is so, we have an infinite number of extra solutions, given by (54)–(56).
A MATLAB sketch of the main steps of this procedure is given below. The full algorithm has been coded in MATLAB, and it is available at
https://shorturl.at/oHN15 (accessed on 19 March 2024).
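For completeness, a compact MATLAB sketch of steps 1 and 2 of this procedure follows. It is not the published code: it reuses the helper functions sketched in the previous sections (whose names are our own), follows the idempotent route of Theorem 1 in step 1 (the involutive route of Theorem 2 would be equally valid), and does not include the singular check of step 3.

% Steps 1-2 for n = 2: try the idempotent, then the nilpotent decomposition of B,
% and solve (64) with the matching theorem. a = [a_0 a_1 ... a_m]. Sketch only.
function sols = solve_poly_matrix_2x2(a, B)
    alpha = trace(B);
    beta  = -det(B);
    d     = sqrt(alpha^2 + 4*beta);
    if abs(d) > 1e-12                          % q ~= 0: idempotent/involutive case
        p = (alpha + d)/2;  q = alpha - 2*p;   % Equation (37)
        N = (B - p*eye(2))/q;                  % Equation (38); here N^2 = N
        sols = solve_idempotent(a, p, q, N);   % Theorem 1
    else                                       % q = 0: nilpotent case, Section 3.3
        p = alpha/2;
        H = B - p*eye(2);                      % Equation (48)
        if norm(H, 'fro') < 1e-12
            error('B is a scalar matrix: see the singular case of Section 3.4.');
        end
        sols = solve_nilpotent(a, p, H);       % Theorem 3
    end
end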
Despite the fact that this algorithm has been developed for polynomial matrix equations with scalar coefficients, i.e., (6), we can use it when this is not the case. To illustrate this, consider the following example.
Example 8. Solve the following polynomial matrix equation:
(65)
The diagonalization method yields only the following diagonal solutions:
However, we can obtain other solutions with the heuristic method. Indeed, with a suitable change of variable, (65) can be rewritten as
(66)
Now, recast (66) as follows:
(67)
Now, apply the partial sum of a matrix geometric series, which is given by
$I + X + X^2 + \dots + X^{j} = \left(X^{j+1} - I\right)\left(X - I\right)^{-1}$ (provided that $X - I$ is nonsingular);
thus, (67) reduces to
(68)
Applying the heuristic method, we obtain two new non-diagonalizable solutions:
(69)
Solving (68) with the interpolation method, we obtain the same set of solutions given in (69).

4. Conclusions
We have derived a heuristic method to solve polynomial matrix equations of the type $a_m X^m + a_{m-1} X^{m-1} + \dots + a_1 X + a_0 I = B$, where the coefficients $a_k$ are scalars and X and B are square matrices of order n, as long as B admits an idempotent, involutive, or nilpotent decomposition. Whenever this decomposition is possible, we have described an algorithm to find it. Moreover, we have proved that this decomposition is always possible when $n = 2$. Also, for square matrices of order $n = 2$, we have described an algorithm that calculates the solutions. Further, the algorithm has the capacity to determine the nonexistence of a solution. In addition, when B is a scalar matrix and $m = n = 2$, the algorithm computes singular solutions (i.e., infinite sets of solutions), if any exist.
We have compared the proposed heuristic method with other methods found in the existing literature, such as the diagonalization and interpolation methods. It turns out that the heuristic method is usually considerably faster than the diagonalization or the interpolation methods (see Examples 1, 3, and 4). Also, when the diagonalization method fails (see Example 4 and Remark 3), we do not know whether the solution does not exist or the solution is simply not diagonalizable. Further, the interpolation method sometimes fails with a “division by zero” error (see Example 2 and Remark 3), and we do not know whether the solution exists or not. Consequently, we have shown some examples for which the proposed heuristic method is able to compute solutions even though the diagonalization or the interpolation methods fail. Moreover, the diagonalization and interpolation methods are not able to compute singular solutions, as the heuristic method does for $m = n = 2$ (see Examples 6 and 7). The main strength of the diagonalization method is its ability to find solutions of (1) when the matrix coefficients $A_i$ are not scalar matrices, unlike the interpolation or the heuristic methods. However, Example 8 shows how to compute non-diagonalizable solutions with the proposed heuristic method when some of the coefficients in the polynomial matrix equation are not scalar matrices.
In a future study, we intend to investigate whether the proposed algorithm provides all the possible solutions for $n = 2$. Finally, it is worth noting that all the examples have been computed by using a MATLAB code available online.
Author Contributions: Conceptualization, J.L.G.-S. and F.S.L.; Methodology, J.L.G.-S. and F.S.L.; Writing—original draft, J.L.G.-S. and F.S.L.; Writing—review and editing, J.L.G.-S. and F.S.L. All authors have read and agreed to the published version of the manuscript.
Data Availability Statement: Data are contained within the article.
Conflicts of Interest: The authors declare no conflicts of interest.
References
1. Connes, A.; Schwarz, A. Matrix Vieta theorem revisited. Lett. Math. Phys.; 1997; 39, pp. 349-353. [DOI: https://dx.doi.org/10.1023/A:1007373114601]
2. Bini, D.A.; Latouche, G.; Meini, B. Solving matrix polynomial equations arising in queueing problems. Linear Algebra Its Appl.; 2002; 340, pp. 225-244. [DOI: https://dx.doi.org/10.1016/S0024-3795(01)00426-8]
3. Bini, D.A.; Gemignani, L.; Meini, B. Computations with infinite Toeplitz matrices and polynomials. Linear Algebra Its Appl.; 2002; 343, pp. 21-61. [DOI: https://dx.doi.org/10.1016/S0024-3795(01)00341-X]
4. Iannazzo, B. On the Newton method for the matrix p-th root. SIAM J. Matrix Anal. Appl.; 2006; 28, pp. 503-523. [DOI: https://dx.doi.org/10.1137/050624790]
5. Choudhry, A. Extraction of nth roots of 2 × 2 matrices. Linear Algebra Its Appl.; 2004; 387, pp. 183-192. [DOI: https://dx.doi.org/10.1016/j.laa.2004.02.010]
6. Psarrakos, P. On the m-th roots of a complex matrix. Electron. J. Linear Algebra; 2002; 9, pp. 32-41. [DOI: https://dx.doi.org/10.13001/1081-3810.1071]
7. Lakić, S. On the computation of the matrix k-th root. ZAMM-J. Appl. Math. Mech.; 1998; 78, pp. 167-172. [DOI: https://dx.doi.org/10.1002/(SICI)1521-4001(199803)78:3<167::AID-ZAMM167>3.0.CO;2-R]
8. Gohberg, I.; Lancaster, P.; Rodman, L. Quadratic matrix polynomials with a parameter. Adv. Appl. Math.; 1986; 7, pp. 253-281. [DOI: https://dx.doi.org/10.1016/0196-8858(86)90036-9]
9. Gohberg, I.; Lancaster, P.; Rodman, L. Matrix Polynomials; Springer: Berlin/Heidelberg, Germany, 2005.
10. Kratz, W.; Stickel, E. Numerical solution of matrix polynomial equations by Newton’s method. IMA J. Numer. Anal.; 1987; 7, pp. 355-369. [DOI: https://dx.doi.org/10.1093/imanum/7.3.355]
11. Fuchs, D.; Schwarz, A. Matrix Vieta theorem. arXiv; 1994; arXiv: math/9410207
12. Petraki, D.; Samaras, N. Solving the n-th degree polynomial matrix equation. J. Interdiscip. Math.; 2021; 24, pp. 1079-1092. [DOI: https://dx.doi.org/10.1080/09720502.2019.1706863]
13. Biggs, N. Algebraic Graph Theory; Cambridge University Press: Cambridge, UK, 1993; 67.
14. Deadman, E.; Higham, N.J.; Ralha, R. Blocked Schur algorithms for computing the matrix square root. Proceedings of the International Workshop on Applied Parallel Computing; New Orleans, LA, USA, 26 February 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 171-182.
15. Björck, Å.; Hammarling, S. A Schur method for the square root of a matrix. Linear Algebra Its Appl.; 1983; 52, pp. 127-140. [DOI: https://dx.doi.org/10.1016/0024-3795(83)90010-1]
Abstract
We propose a heuristic method to solve polynomial matrix equations of the type $a_m X^m + a_{m-1} X^{m-1} + \dots + a_1 X + a_0 I = B$, where $a_k$ are scalar coefficients and X and B are square matrices of order n.