Application of a Generalized Secant Method to Nonlinear Equations with Complex Roots

1. Introduction

Let $α$ be the solution to the equation $f (x) = 0 .$ An effective iterative method used for solving this equation that makes direct use of $f (x)$ (but no derivatives of $f (x)$ ) is the secant method that is discussed in many books on numerical analysis. See, for example, Atkinson [1], Dahlquist and Björck [2], Henrici [3], Ralston and Rabinowitz [4], and Stoer and Bulirsch [5]. See also the recent note [6] by the author, in which the treatment of the secant method and those of the Newton–Raphson, regula falsi, and Steffensen methods are presented in a unified manner.

Recently, this method was generalized by the author in [7] as follows: Starting with $x_{0}, x_{1}, \dots, x_{k}$, which are $k + 1$ initial approximations to $α$, we generate a sequence of approximations $\{x_{n}\}$ via the recursion

(1) $x_{n + 1} = x_{n} - \frac{f (x_{n})}{p_{n, k}^{'} (x_{n})}, n = k, k + 1, \dots,$

$p_{n, k}^{'} (x)$ being the derivative of the polynomial $p_{n, k} (x)$ that interpolates $f (x)$ at the points $x_{n}, x_{n - 1}, \dots, x_{n - k}$. (Thus, $p_{n, k} (x)$ is of degree k.) Clearly, the case $k = 1$ is simply the secant method. In [7], we also showed that, provided $x_{0}, x_{1}, \dots, x_{k}$ are sufficiently close to $α$, the method converges with order $s_{k}$, that is,

$\lim_{n \to \infty} \frac{| x_{n + 1} - α |}{| x_{n} - α |^{s_{k}}} = C \neq 0$

for some constant C, and that $1 < s_{k} < 2$. (We call $s_{k}$ the order of convergence of the method, or the order of the method for short.) Here $s_{k}$ is the only positive root of the polynomial $s^{k + 1} - \sum_{i = 0}^{k} s^{i}$. We also have that

$\frac{1 + \sqrt{5}}{2} = s_{1} < s_{2} < s_{3} < \dots < 2; lim_{k \to \infty} s_{k} = 2 .$

Actually, rounded to four significant figures,

$s_{1} \dot{=} 1.618, s_{2} \dot{=} 1.839, s_{3} \dot{=} 1.928, s_{4} \dot{=} 1.966, s_{5} \dot{=} 1.984, s_{6} \dot{=} 1.992, s_{7} \dot{=} 1.996, etc .$
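The values above can be reproduced numerically. The following is a minimal sketch, assuming plain bisection on the interval $[1, 2]$ (where the polynomial changes sign); the function name `order_of_convergence` and the tolerance `tol` are illustrative choices, not part of the original work.

```python
def order_of_convergence(k, tol=1e-14):
    """Approximate the positive root s_k of s**(k+1) - sum_{i=0}^{k} s**i.

    Bisection on [1, 2]: the polynomial equals -k at s = 1 and +1 at s = 2.
    """
    def g(s):
        return s ** (k + 1) - sum(s ** i for i in range(k + 1))

    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid          # root lies to the right of mid
        else:
            hi = mid          # root lies to the left of (or at) mid
    return 0.5 * (lo + hi)


# Orders for k = 1, ..., 7, printed to four significant figures.
orders = {k: order_of_convergence(k) for k in range(1, 8)}
for k, s in orders.items():
    print(f"s_{k} = {s:.4g}")
```

Bisection is used here only because it needs no derivative and no starting guess beyond the bracket; any standard root finder would serve equally well.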

Note that to compute $x_{n + 1}$ we need knowledge of only $f (x_{n}), f (x_{n - 1}), \dots, f (x_{n - k})$ , and because $f (x_{n - 1}), \dots, f (x_{n - k})$ have already been computed, $f (x_{n})$ is the only new quantity to be computed. Thus, each step of the method requires $f (x)$ to be computed only once. From this, it follows that the efficiency index of this method is simply $s_{k}$ and that this index approaches 2 by increasing k even moderately, as can be concluded from the values of $s_{1}, \dots, s_{7}$ given above.

In this work, we consider the application of this method to simple complex roots of a function $f (z)$, where z is the complex variable. Let us denote a real or complex root of $f (z)$ by $α$ again; that is, $f (α) = 0$ and $f^{'} (α) \neq 0$. Thus, starting with $z_{0}, z_{1}, \dots, z_{k}$, which are $k + 1$ initial approximations to $α$, we generate a sequence of approximations $\{z_{n}\}$ via the recursion

(2) $z_{n + 1} = z_{n} - \frac{f (z_{n})}{p_{n, k}^{'} (z_{n})}, n = k, k + 1, \dots,$

$p_{n, k}^{'} (z)$ being the derivative of the polynomial $p_{n, k} (z)$ that interpolates $f (z)$ at the points $z_{n}, z_{n - 1}, \dots, z_{n - k}$. As in [7], we can use Newton’s interpolation formula to generate $p_{n, k} (z)$ and $p_{n, k}^{'} (z)$. Thus

(3) $p_{n, k} (z) = f (z_{n}) + \sum_{i = 1}^{k} f [z_{n}, z_{n - 1}, \dots, z_{n - i}] \prod_{j = 0}^{i - 1} (z - z_{n - j})$

and

(4) $p_{n, k}^{'} (z_{n}) = f [z_{n}, z_{n - 1}] + \sum_{i = 2}^{k} f [z_{n}, z_{n - 1}, \dots, z_{n - i}] \prod_{j = 1}^{i - 1} (z_{n} - z_{n - j}) .$

Here, $g [ζ_{0}, ζ_{1}, \dots, ζ_{m}]$ is the divided difference of order m of the function $g (z)$ over the set of points $\{ζ_{0}, ζ_{1}, \dots, ζ_{m}\}$ and is a symmetric function of these points. For details, we refer the reader to [7].

As proposed in [7], we generate the $k + 1$ initial approximations as follows: We choose the approximations $z_{0}, z_{1}$ first. We then generate $z_{2}$ by applying our method with $k = 1$ (that is, with the secant method). Next, we apply our method to $z_{0}, z_{1}, z_{2}$ with $k = 2$ and obtain $z_{3}$ , and so on, until we have generated all $k + 1$ initial approximations, via

(5) $z_{n + 1} = z_{n} - \frac{f (z_{n})}{p_{n, n}^{'} (z_{n})}, n = 1, 2, \dots, k - 1 .$
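The recursion (2), the derivative formula (4), and the start-up procedure (5) can be sketched in a few lines of Python. This is an illustrative double-precision sketch only, not the code used for the computations reported in this paper. The names `divided_differences`, `interp_derivative`, and `generalized_secant`, the stopping tolerance `tol`, and the cap `max_iter` are assumptions introduced here; the test function and starting values are those of the cubic treated in Section 3.

```python
def divided_differences(zs, fs):
    """Return [f[z_0], f[z_0,z_1], ..., f[z_0,...,z_m]] for distinct points zs."""
    coef = list(fs)
    m = len(zs)
    for order in range(1, m):
        # Work from the bottom up so lower-order entries are still available.
        for i in range(m - 1, order - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (zs[i] - zs[i - order])
    return coef


def interp_derivative(zs, fs):
    """Derivative at zs[0] of the interpolating polynomial, as in formula (4)."""
    coef = divided_differences(zs, fs)
    deriv, prod = 0, 1
    for i in range(1, len(zs)):
        deriv += coef[i] * prod      # prod = product of (z_n - z_{n-j}), j = 1..i-1
        prod *= zs[0] - zs[i]
    return deriv


def generalized_secant(f, z0, z1, k=2, tol=1e-14, max_iter=50):
    """Iterate (2), growing the interpolation degree from 1 up to k as in (5).

    Returns the list of all iterates z_0, z_1, z_2, ...
    """
    zs = [complex(z1), complex(z0)]          # newest point first
    fs = [f(zs[0]), f(zs[1])]
    history = [complex(z0), complex(z1)]
    for _ in range(max_iter):
        if fs[0] == 0:                       # exact root reached
            break
        step = fs[0] / interp_derivative(zs, fs)
        z_new = zs[0] - step
        history.append(z_new)
        zs.insert(0, z_new)
        fs.insert(0, f(z_new))               # the only new evaluation of f
        del zs[k + 1:]                       # keep at most k + 1 points
        del fs[k + 1:]
        if abs(step) <= tol * max(1.0, abs(z_new)):
            break
    return history


f = lambda z: z ** 3 - 8
iterates = generalized_secant(f, 2j, -2 + 2j, k=2)
root = iterates[-1]
print(root)
```

The sketch stores the points newest-first so that `zs[0]` is always $z_{n}$, which keeps the product in (4) a simple running product. It stops once the step is negligible, because in fixed-precision arithmetic nearly coincident points would otherwise make the divided differences unreliable.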

Remark 1.

1. Instead of choosing $z_{1}$ arbitrarily, we can generate it as $z_{1} = z_{0} + f (z_{0})$, as suggested in Brin [8], which is quite sensible since $f (z)$ is small near the root α. We can also use the method of Steffensen, which uses only $f (z)$ and no derivatives of $f (z)$, to generate $z_{1}$ from $z_{0}$; thus,
$z_{1} = z_{0} - \frac{{[f (z_{0})]}^{2}}{f (z_{0} + f (z_{0})) - f (z_{0})} .$
2. It is clear that, in case $f (z)$ takes on only real values along the $\operatorname{Re} z$ axis and we are looking for nonreal roots of $f (z)$, at least one of the initial approximations must be chosen to be nonreal.
3. We would like to mention that Kogan, Sapir, and Sapir [9] have proposed another generalization of the secant method for simple real roots of nonlinear equations $f (x) = 0$ that resembles our method described in (1). In the notation of (1), this method produces a sequence of approximations $\{x_{n}\}$ via
(6) $x_{n + 1} = x_{n} - \frac{f (x_{n})}{p_{n, n}^{'} (x_{n})}, \quad n = 1, 2, \dots,$
starting with arbitrary $x_{0}$ and $x_{1}$, and it is of order 2. Note that, in (6), $p_{n, n} (x)$ interpolates $f (x)$ at the points $x_{0}, x_{1}, \dots, x_{n}$, hence is of degree n, which tends to infinity. In (1), $p_{n, k} (x)$ is of degree k, which is fixed.
4. Yet another generalization of the secant method for finding simple real roots of $f (x)$ was recently given by Nijmeijer [10]. This method too requires no derivative information, requires one evaluation of $f (x)$ per iteration, and has the same order of convergence as our method. It follows an idea of applying a convergence acceleration method, such as Aitken’s $Δ^{2}$-process, to approximations obtained from the secant method, as proposed by Han and Potra [11]. Because Nijmeijer’s method is not based on polynomial interpolation, however, it is completely different from our method. For Aitken’s $Δ^{2}$-process, see [1,2,3,4,5]. See also [12] (Chapter 15) by the author.
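The two ways of generating $z_{1}$ from $z_{0}$ mentioned in item 1 of Remark 1 can be written down directly. The sketch below is illustrative only; the function names `brin_start` and `steffensen_start` and the linear test function are assumptions introduced here. A linear function is used because one Steffensen step is then exact, which makes the formula easy to check.

```python
def brin_start(f, z0):
    """Second starting value z1 = z0 + f(z0), as attributed to Brin [8]."""
    return z0 + f(z0)


def steffensen_start(f, z0):
    """Second starting value from one Steffensen step (no derivatives of f)."""
    f0 = f(z0)
    return z0 - f0 * f0 / (f(z0 + f0) - f0)


# Hypothetical linear test function f(z) = a*z - b with complex coefficients.
a, b = 1 + 2j, 3 - 1j
f = lambda z: a * z - b

z0 = 0.5j
z1_brin = brin_start(f, z0)
z1_steff = steffensen_start(f, z0)   # equals the root b/a for linear f
print(z1_brin, z1_steff)
```

Both choices cost exactly one or two evaluations of $f (z)$ and produce a nonreal $z_{1}$ whenever $z_{0}$ or $f (z_{0})$ is nonreal.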

In the next section, we analyze the local convergence properties of the method as it is applied to complex roots. We show that the analysis of [7] can be extended to the complex case following some clever manipulation. We prove that the order $s_{k}$ of the method is the same as that we discovered in the real case. In Section 3, we provide two numerical examples to confirm the results of our convergence analysis.

2. Local Convergence Analysis

We now turn to the analysis of the sequence $\{z_{n}\}_{n = 0}^{\infty}$ that is generated via (2). Our treatment covers all $k \geq 1$.

In our analysis, we will make use of the Hermite–Genocchi formula, which provides an integral representation for divided differences (for a proof of this formula, see, for example, Atkinson [1]). Even though this formula is usually stated for functions defined on real intervals, it is easy to verify (see Filipsson [13], for example) that it also applies to functions defined in the complex plane under proper assumptions. Thus, provided $g (z)$ is analytic on E, a bounded closed convex set in the complex plane, and provided $ζ_{0}, ζ_{1}, \dots, ζ_{m}$ are in E, there holds

(7) $g [ζ_{0}, ζ_{1}, \dots, ζ_{m}] = \underset{S_{m}}{\int \dots \int} g^{(m)} (t_{0} ζ_{0} + t_{1} ζ_{1} + \dots + t_{m} ζ_{m}) d t_{1} \dots d t_{m}, t_{0} = 1 - \sum_{i = 1}^{m} t_{i} .$

Here $S_{m}$ is the m-dimensional simplex defined as

(8) $S_{m} = \{(t_{1}, \dots, t_{m}) \in \mathbb{R}^{m} : t_{i} \geq 0, \ i = 1, \dots, m, \ \sum_{i = 1}^{m} t_{i} \leq 1\} .$

We note that (7) holds whether the $ζ_{i}$ are distinct or not. We also note that $g [ζ_{0}, ζ_{1}, \dots, ζ_{m}]$ is a symmetric and continuous function of its arguments.
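As a quick illustration of (7), the simplest case $m = 1$ can be worked out by hand. The display below is a sketch added for orientation, under the same assumptions on $g (z)$ and E as above; it shows that (7) reduces to the familiar difference quotient when the two points are distinct and to the derivative when they coincide.

```latex
% Case m = 1 of (7): here t_0 = 1 - t_1 and S_1 = [0, 1].
g[\zeta_0, \zeta_1]
  = \int_0^1 g'\bigl((1 - t_1)\,\zeta_0 + t_1\,\zeta_1\bigr)\, dt_1
  = \frac{g(\zeta_1) - g(\zeta_0)}{\zeta_1 - \zeta_0}
  \quad (\zeta_1 \neq \zeta_0),
\qquad
g[\zeta_0, \zeta_0] = \int_0^1 g'(\zeta_0)\, dt_1 = g'(\zeta_0).
```

The first equality follows because the integrand, multiplied by $ζ_{1} - ζ_{0}$, is the derivative with respect to $t_{1}$ of $g ((1 - t_{1}) ζ_{0} + t_{1} ζ_{1})$.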

By the conditions we have imposed on $g (z)$ , it is easy to see that the integrand $g^{(m)} (\sum_{i = 0}^{m} t_{i} ζ_{i})$ in (7) is always defined because $\sum_{i = 0}^{m} t_{i} ζ_{i}$ is in the set E and $g (z)$ is analytic on E. This is so because, by (7) and (8),

$(t_{1}, \dots, t_{m}) \in S_{m} \Rightarrow t_{i} \geq 0, i = 0, 1, \dots, m, and \sum_{i = 0}^{m} t_{i} = 1,$

which implies that $\sum_{i = 0}^{m} t_{i} ζ_{i}$ is a convex combination of $ζ_{0}, ζ_{1}, \dots, ζ_{m}$, hence is in the set $C = \mathrm{conv} \{ζ_{0}, ζ_{1}, \dots, ζ_{m}\}$, the convex hull of the points $ζ_{0}, ζ_{1}, \dots, ζ_{m}$, and $C \subseteq E$. Consequently, taking moduli on both sides of (7), we obtain, for all $ζ_{i}$ in E,

(9) $\begin{aligned} | g [ζ_{0}, ζ_{1}, \dots, ζ_{m}] | & \leq \underset{S_{m}}{\int \dots \int} | g^{(m)} (\sum_{i = 0}^{m} t_{i} ζ_{i}) | \, d t_{1} \dots d t_{m} \\ & \leq \frac{∥ g^{(m)} ∥}{m!}, \quad ∥ g^{(m)} ∥ = \max_{z \in E} | g^{(m)} (z) | . \end{aligned}$

In addition, since $\sum_{i = 0}^{m} t_{i} = 1$ in (7), as $ζ_{i} \to \hat{ζ}$ for all $i = 0, 1, \dots, m$, there hold $\sum_{i = 0}^{m} t_{i} ζ_{i} \to \hat{ζ}$ and $g^{(m)} (\sum_{i = 0}^{m} t_{i} ζ_{i}) \to g^{(m)} (\hat{ζ})$, and hence

(10) $\lim_{\substack{ζ_{i} \to \hat{ζ} \\ i = 0, 1, \dots, m}} g [ζ_{0}, ζ_{1}, \dots, ζ_{m}] = g [\underbrace{\hat{ζ}, \hat{ζ}, \dots, \hat{ζ}}_{m + 1 \text{ times}}] = \frac{g^{(m)} (\hat{ζ})}{m!} .$

In (9) and (10), we have also invoked the fact that (see [14] (p. 346), for example)

$\underset{S_{m}}{\int \dots \int} d t_{1} \dots d t_{m} = \frac{1}{m!} .$

We will make use of these in the proof of our main theorem that follows. This theorem and its proof are almost identical to those given in [7] once we take into account, where and when needed, the fact that we are now working in the complex plane. For convenience, we provide all the details of the proof.

Theorem 1.

Let α be a simple root of $f (z)$ , that is, $f (α) = 0$ , but $f^{'} (α) \neq 0$ . Let $B_{r}$ be the closed disk of radius r containing α as its center, that is,

(11) $B_{r} = \{z \in \mathbb{C} : | z - α | \leq r\} .$

Let $f (z)$ be analytic on $B_{r}$ . Choose a positive integer k and let $z_{0}, z_{1}, \dots, z_{k}$ be distinct initial approximations to α. Generate $z_{k + 1}, z_{k + 2}, \dots$ via

(12) $z_{n + 1} = z_{n} - \frac{f (z_{n})}{p_{n, k}^{'} (z_{n})}, n = k, k + 1, \dots,$

where $p_{n, k} (z)$ is the polynomial of interpolation to $f (z)$ at the points $z_{n}, z_{n - 1}, \dots, z_{n - k}$ . Then, provided $z_{0}, z_{1}, \dots, z_{k}$ are in $B_{r}$ and sufficiently close to α, we have the following cases:

1. If $f^{(k + 1)} (α) \neq 0$, the sequence $\{z_{n}\}$ converges to α, and
(13) $\lim_{n \to \infty} \frac{ϵ_{n + 1}}{\prod_{i = 0}^{k} ϵ_{n - i}} = \frac{{(- 1)}^{k + 1}}{(k + 1)!} \frac{f^{(k + 1)} (α)}{f^{'} (α)} \equiv L; \quad ϵ_{n} = z_{n} - α \ \forall n .$
The order of convergence is $s_{k}$, $1 < s_{k} < 2$, where $s_{k}$ is the only positive root of the polynomial $g_{k} (s) = s^{k + 1} - \sum_{i = 0}^{k} s^{i}$ and satisfies
(14) $2 - 2^{- k - 1} e < s_{k} < 2 - 2^{- k - 1} \ \text{for} \ k \geq 2; \quad s_{k} < s_{k + 1}; \quad \lim_{k \to \infty} s_{k} = 2,$
e being the base of natural logarithms, and
(15) $\lim_{n \to \infty} \frac{| ϵ_{n + 1} |}{| ϵ_{n} |^{s_{k}}} = {| L |}^{(s_{k} - 1) / k},$
which also implies that
(16) $s_{k} = \lim_{n \to \infty} \frac{\log | ϵ_{n + 1} / ϵ_{n} |}{\log | ϵ_{n} / ϵ_{n - 1} |} .$
2. If $f (z)$ is a polynomial of degree at most k, the sequence $\{z_{n}\}$ converges to α, and
(17) $\lim_{n \to \infty} \frac{ϵ_{n + 1}}{ϵ_{n}^{2}} = \frac{f^{''} (α)}{2 f^{'} (α)}; \quad ϵ_{n} = z_{n} - α \ \forall n .$
Thus, $\{z_{n}\}$ converges with order 2 if $f^{''} (α) \neq 0$, and with order greater than 2 if $f^{''} (α) = 0$.

Proof.

We start by deriving a closed-form expression for the error in $z_{n + 1}$ . Subtracting $α$ from both sides of (12), and noting that

$f (z_{n}) = f (z_{n}) - f (α) = f [z_{n}, α] (z_{n} - α),$

we have

(18) $z_{n + 1} - α = (1 - \frac{f [z_{n}, α]}{p_{n, k}^{'} (z_{n})}) (z_{n} - α) = \frac{p_{n, k}^{'} (z_{n}) - f [z_{n}, α]}{p_{n, k}^{'} (z_{n})} (z_{n} - α) .$

We now note that

(19) $p_{n, k}^{'} (z_{n}) - f [z_{n}, α] = \{p_{n, k}^{'} (z_{n}) - f^{'} (z_{n})\} + \{f^{'} (z_{n}) - f [z_{n}, α]\},$

and that

(20) $f^{'} (z_{n}) - p_{n, k}^{'} (z_{n}) = f [z_{n}, z_{n}, z_{n - 1}, \dots, z_{n - k}] \prod_{i = 1}^{k} (z_{n} - z_{n - i})$

and

(21) $f^{'} (z_{n}) - f [z_{n}, α] = f [z_{n}, z_{n}] - f [z_{n}, α] = f [z_{n}, z_{n}, α] (z_{n} - α) .$

Note that (20) can be obtained by starting with the divided difference representation of $f (z) - p_{n, k} (z)$, namely,

$f (z) - p_{n, k} (z) = f [z, z_{n}, z_{n - 1}, \dots, z_{n - k}] \prod_{i = 0}^{k} (z - z_{n - i}),$

and by computing $\lim_{z \to z_{n}} [f (z) - p_{n, k} (z)] / \prod_{i = 0}^{k} (z - z_{n - i})$ via L’Hôpital’s rule.

For simplicity of notation, let

(22) $- f [z_{n}, z_{n}, z_{n - 1}, \dots, z_{n - k}] = {\hat{D}}_{n} \quad \text{and} \quad f [z_{n}, z_{n}, α] = {\hat{E}}_{n},$

and rewrite (19) and (20) as

(23) $\begin{matrix} p_{n, k}^{'} (z_{n}) - f [z_{n}, α] = {\hat{D}}_{n} \prod_{i = 1}^{k} (ϵ_{n} - ϵ_{n - i}) + {\hat{E}}_{n} ϵ_{n}, \end{matrix}$

(24) $\begin{matrix} p_{n, k}^{'} (z_{n}) = f^{'} (z_{n}) + {\hat{D}}_{n} \prod_{i = 1}^{k} (ϵ_{n} - ϵ_{n - i}) . \end{matrix}$

Substituting these into (18), we finally obtain

(25) $ϵ_{n + 1} = C_{n} ϵ_{n}; C_{n} \equiv \frac{p_{n, k}^{'} (z_{n}) - f [z_{n}, α]}{p_{n, k}^{'} (z_{n})} = \frac{{\hat{D}}_{n} \prod_{i = 1}^{k} (ϵ_{n} - ϵ_{n - i}) + {\hat{E}}_{n} ϵ_{n}}{f^{'} (z_{n}) + {\hat{D}}_{n} \prod_{i = 1}^{k} (ϵ_{n} - ϵ_{n - i})} .$

We now prove that convergence takes place. First, let us assume without loss of generality that $f^{'} (z) \neq 0$ for all $z \in B_{r}$, and set $m_{1} = \min_{z \in B_{r}} | f^{'} (z) | > 0$. (This is possible since $α \in B_{r}$ and $f^{'} (α) \neq 0$, and we can choose r as small as we wish to also guarantee $m_{1} > 0$.) Next, let $M_{s} = \max_{z \in B_{r}} | f^{(s)} (z) | / s!$, $s = 1, 2, \dots .$ Thus, assuming that $\{z_{n}, z_{n - 1}, \dots, z_{n - k}\} \subset B_{r}$ and noting that $B_{r}$ is a convex set, we have by (9) that

$| {\hat{D}}_{n} | \leq M_{k + 1}, \quad | {\hat{E}}_{n} | \leq M_{2}, \quad \text{because} \ \{α, z_{n}, z_{n - 1}, \dots, z_{n - k}\} \subset B_{r} .$

Next, choose the ball $B_{t / 2}$ sufficiently small (with $t / 2 \leq r$) to ensure that $m_{1} > 2 M_{k + 1} t^{k} + M_{2} t / 2$. It can now be verified that, provided $z_{n}, z_{n - 1}, \dots, z_{n - k}$ are all in $B_{t / 2}$, there holds

$\begin{aligned} | C_{n} | & \leq \frac{M_{k + 1} \prod_{i = 1}^{k} | ϵ_{n} - ϵ_{n - i} | + M_{2} | ϵ_{n} |}{m_{1} - M_{k + 1} \prod_{i = 1}^{k} | ϵ_{n} - ϵ_{n - i} |} \\ & \leq \frac{M_{k + 1} \prod_{i = 1}^{k} (| ϵ_{n} | + | ϵ_{n - i} |) + M_{2} | ϵ_{n} |}{m_{1} - M_{k + 1} \prod_{i = 1}^{k} (| ϵ_{n} | + | ϵ_{n - i} |)} \leq \bar{C}, \end{aligned}$

where

$\bar{C} \equiv \frac{M_{k + 1} t^{k} + M_{2} t / 2}{m_{1} - M_{k + 1} t^{k}} < 1 .$

Consequently, by (25), $| ϵ_{n + 1} | \leq \bar{C} | ϵ_{n} | < | ϵ_{n} |$, which implies that $z_{n + 1} \in B_{t / 2}$, just like $z_{n}, z_{n - 1}, \dots, z_{n - k}$. Therefore, if $z_{0}, z_{1}, \dots, z_{k}$ are chosen in $B_{t / 2}$, then $| C_{n} | \leq \bar{C} < 1$ for all $n \geq k$, hence $\{z_{n}\} \subset B_{t / 2}$ and $\lim_{n \to \infty} z_{n} = α$.

As for (13) when $f^{(k + 1)} (α) \neq 0$ , we proceed as follows: By the fact that ${lim}_{n \to \infty} z_{n} = α$ , we first note that, by (20) and (21),

(26) $lim_{n \to \infty} p_{n, k}^{'} (z_{n}) = f^{'} (α) = lim_{n \to \infty} f [z_{n}, α],$

and thus $\lim_{n \to \infty} C_{n} = 0$. This means that $\lim_{n \to \infty} (ϵ_{n + 1} / ϵ_{n}) = 0$ and, equivalently, that $\{z_{n}\}$ converges with order greater than 1. As a result,

$lim_{n \to \infty} (ϵ_{n} / ϵ_{n - i}) = 0 for all i \geq 1,$

and

$ϵ_{n} / ϵ_{n - i} = o (ϵ_{n} / ϵ_{n - j}) as n \to \infty, for j < i .$

Consequently, expanding in (25) the product $\prod_{i = 1}^{k} (ϵ_{n} - ϵ_{n - i})$, we have

(27) $\begin{aligned} \prod_{i = 1}^{k} (ϵ_{n} - ϵ_{n - i}) & = \prod_{i = 1}^{k} (- ϵ_{n - i} [1 - ϵ_{n} / ϵ_{n - i}]) \\ & = {(- 1)}^{k} (\prod_{i = 1}^{k} ϵ_{n - i}) [1 + O (ϵ_{n} / ϵ_{n - 1})] \quad \text{as} \ n \to \infty . \end{aligned}$

Substituting (27) into (25), and defining

(28) $D_{n} = \frac{{\hat{D}}_{n}}{p_{n, k}^{'} (z_{n})}, E_{n} = \frac{{\hat{E}}_{n}}{p_{n, k}^{'} (z_{n})},$

we obtain

(29) $ϵ_{n + 1} = {(- 1)}^{k} D_{n} (\prod_{i = 0}^{k} ϵ_{n - i}) [1 + O (ϵ_{n} / ϵ_{n - 1})] + E_{n} ϵ_{n}^{2} as n \to \infty .$

Dividing both sides of (29) by $\prod_{i = 0}^{k} ϵ_{n - i}$, and defining

(30) $σ_{n} = \frac{ϵ_{n + 1}}{\prod_{i = 0}^{k} ϵ_{n - i}},$

we have

(31) $σ_{n} = {(- 1)}^{k} D_{n} [1 + O (ϵ_{n} / ϵ_{n - 1})] + E_{n} σ_{n - 1} ϵ_{n - k - 1} as n \to \infty .$

Now, by (10), (22), and (26),

(32) $lim_{n \to \infty} D_{n} = - \frac{1}{(k + 1)!} \frac{f^{(k + 1)} (α)}{f^{'} (α)}, lim_{n \to \infty} E_{n} = \frac{f^{(2)} (α)}{2 f^{'} (α)} .$

Because $\lim_{n \to \infty} D_{n}$ and $\lim_{n \to \infty} E_{n}$ are finite, and because $\lim_{n \to \infty} (ϵ_{n} / ϵ_{n - 1}) = 0$ and $\lim_{n \to \infty} ϵ_{n - k - 1} = 0$, it follows that there exist a positive integer N and positive constants $β < 1$ and D, with $| E_{n} ϵ_{n - k - 1} | \leq β$ when $n > N$, for which (31) gives

(33) $| σ_{n} | \leq D + β | σ_{n - 1} | for all n > N .$

Using (33), it is easy to show that

$| σ_{N + s} | \leq D \frac{1 - β^{s}}{1 - β} + β^{s} | σ_{N} |, s = 1, 2, \dots,$

which, by the fact that $β < 1$, implies that $\{σ_{n}\}$ is a bounded sequence. Making use of this fact, we have $\lim_{n \to \infty} E_{n} σ_{n - 1} ϵ_{n - k - 1} = 0$. Substituting this into (31), and invoking (32), we next obtain $\lim_{n \to \infty} σ_{n} = {(- 1)}^{k} \lim_{n \to \infty} D_{n} = L$, which is precisely (13).

That $s_{k}$ , the order of the method, as defined in the statement of the theorem, satisfies (14) and (15) follows from Traub [15] (Chapter 3). We provide a simplified treatment of this topic in Appendix A.

This completes the proof of part 1 of the theorem.

When $f (z)$ is a polynomial of degree at most k, we first observe that $f^{(k + 1)} (z) = 0$ for all z, which implies that $p_{n, k} (z) = f (z)$ for all z, hence also $p_{n, k}^{'} (z) = f^{'} (z)$ for all z. Therefore, we have that $p_{n, k}^{'} (z_{n}) = f^{'} (z_{n})$ in the recursion of (12). Consequently, (12) becomes

$z_{n + 1} = z_{n} - \frac{f (z_{n})}{f^{'} (z_{n})}, n = k, k + 1, \dots,$

which is the recursion for the Newton–Raphson method. Thus, (17) follows. This completes the proof of part 2 of the theorem. □
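The key step in part 2, namely that $p_{n, k}^{'} (z_{n}) = f^{'} (z_{n})$ when $f (z)$ is a polynomial of degree at most k, is easy to check numerically. The following sketch does so for $k = 2$; the quadratic, the three interpolation points, and the function name `interp_derivative` are hypothetical choices made here purely for illustration.

```python
def interp_derivative(zs, fs):
    """Derivative at zs[0] of the polynomial interpolating fs at zs, via (4)."""
    coef = list(fs)
    m = len(zs)
    # Newton divided differences: coef[i] becomes f[zs[0], ..., zs[i]].
    for order in range(1, m):
        for i in range(m - 1, order - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (zs[i] - zs[i - order])
    deriv, prod = 0, 1
    for i in range(1, m):
        deriv += coef[i] * prod
        prod *= zs[0] - zs[i]
    return deriv


# Hypothetical quadratic (degree 2 <= k = 2) and three arbitrary distinct points.
f = lambda z: z ** 2 + (1 + 1j) * z + 3
fprime = lambda z: 2 * z + (1 + 1j)

zs = [0.3 + 0.2j, 1 - 1j, -0.5 + 2j]
gap = abs(interp_derivative(zs, [f(z) for z in zs]) - fprime(zs[0]))
print(gap)   # expected to be at rounding-error level
```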

3. Numerical Examples

In this section, we present two numerical examples that we treated with our method. Our computations were done in quadruple-precision arithmetic (approximately 35-decimal-digit accuracy). Note that in order to verify the theoretical results concerning iterative methods with order greater than unity, we need to use computer arithmetic of high precision (preferably, of variable precision, if available) because the number of correct significant decimal digits in the $z_{n}$ increases dramatically from one iteration to the next as we are approaching the solution.

In both examples below, we take $k = 2$ . We choose $z_{0}$ and $z_{1}$ and compute $z_{2}$ using one step of the secant method, namely,

(34) $z_{2} = z_{1} - \frac{f (z_{1})}{f [z_{0}, z_{1}]} .$

Following that, we compute $z_{3}, z_{4}, \dots,$ via

(35) $z_{n + 1} = z_{n} - \frac{f (z_{n})}{f [z_{n}, z_{n - 1}] + f [z_{n}, z_{n - 1}, z_{n - 2}] (z_{n} - z_{n - 1})}, \quad n = 2, 3, \dots .$

In our examples, we have carried out our computations for several sets of $z_{0}, z_{1}$, and we have observed essentially the same behavior that we observe in Table 1 and Table 2.

Example 1.

Consider $f (z) = 0$ , where $f (z) = z^{3} - 8$ , whose solutions are $α_{r} = 2 e^{i 2 π r / 3}$ , $r = 0, 1, 2$ . We would like to obtain the root $α_{1} = 2 e^{i 2 π / 3} = - 1 + i \sqrt{3}$ . We chose $z_{0} = 2 i$ and $z_{1} = - 2 + 2 i$ . The results of our computations are given in Table 1.

From (13) and (16) in Theorem 1, we should have

$lim_{n \to \infty} \frac{ϵ_{n + 1}}{ϵ_{n} ϵ_{n - 1} ϵ_{n - 2}} = \frac{{(- 1)}^{3}}{3!} \frac{f^{‴} (α_{1})}{f^{'} (α_{1})} = \frac{1}{24} (1 - i \sqrt{3}) = 0.04166 \dots - i 0.07216 \dots$

and

$lim_{n \to \infty} \frac{log | ϵ_{n + 1} / ϵ_{n} |}{log | ϵ_{n} / ϵ_{n - 1} |} = s_{2} = 1.83928 \dots,$

and these seem to be confirmed in Table 1. Furthermore, in infinite-precision arithmetic, $z_{9}$ should have close to 60 correct significant figures; we do not see this in Table 1 due to the fact that the arithmetic we have used to generate Table 1 can provide an accuracy of at most 35 digits.
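The last column of Table 1 can be recomputed from the tabulated values of $| ϵ_{n} |$ by applying the ratio in (16) directly. The sketch below does this; the helper name `observed_order` is an assumption introduced here, and because the inputs carry only four significant figures, the results can differ from the table in the last printed digit.

```python
import math

# |eps_n| for n = 0, ..., 8, copied from Table 1 (four significant figures).
abs_eps = [1.035e+00, 1.035e+00, 4.808e-01, 6.979e-02, 4.355e-03,
           1.591e-05, 5.223e-10, 2.967e-18, 2.083e-33]


def observed_order(e, n):
    """Finite-n version of (16): log|e_{n+1}/e_n| / log|e_n/e_{n-1}|."""
    return math.log(e[n + 1] / e[n]) / math.log(e[n] / e[n - 1])


# n = 1 is skipped: |eps_1| = |eps_0| to the printed accuracy, so the
# denominator log would vanish.
estimates = {n: observed_order(abs_eps, n) for n in range(2, 8)}
for n, s in estimates.items():
    print(n, round(s, 3))
```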

Example 2.

Consider $f (z) = 0$ , where $f (z) = sin (i z) - cos z$ . $f (z)$ has infinitely many roots $α_{r} = (1 - i) (π / 4 + r π)$ , $r = 0, \pm 1, \pm 2, \dots$ . We would like to obtain the root $α_{0} = (1 - i) π / 4$ . We chose $z_{0} = 1.5 - i 1.3$ and $z_{1} = 0.6 - i 0.5$ . The results of our computations are given in Table 2.

From (13) and (16) in Theorem 1, we should have

$\lim_{n \to \infty} \frac{ϵ_{n + 1}}{ϵ_{n} ϵ_{n - 1} ϵ_{n - 2}} = \frac{{(- 1)}^{3}}{3!} \frac{f^{'''} (α_{0})}{f^{'} (α_{0})} = - \frac{i}{6} = - i 0.16666 \dots$

and

$lim_{n \to \infty} \frac{log | ϵ_{n + 1} / ϵ_{n} |}{log | ϵ_{n} / ϵ_{n - 1} |} = s_{2} = 1.83928 \dots,$

and these seem to be confirmed in Table 2. Furthermore, in infinite-precision arithmetic, $z_{8}$ should have close to 50 correct significant figures; we do not see this in Table 2 due to the fact that the arithmetic we have used to generate Table 2 can provide an accuracy of at most 35 digits.
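Both the root $α_{0}$ and the limit $- i / 6$ quoted for Example 2 can be checked in ordinary double precision. The sketch below uses Python’s standard `cmath` module; the derivative expressions `d1` and `d3` were obtained here by differentiating $f (z) = sin (i z) - cos z$ by hand and are not taken from the original text.

```python
import cmath

alpha0 = (1 - 1j) * cmath.pi / 4

f = lambda z: cmath.sin(1j * z) - cmath.cos(z)
d1 = lambda z: 1j * cmath.cos(1j * z) + cmath.sin(z)   # f'(z)
d3 = lambda z: 1j * cmath.cos(1j * z) - cmath.sin(z)   # f'''(z)

residual = abs(f(alpha0))                           # should be near zero
L = (-1) ** 3 / 6 * d3(alpha0) / d1(alpha0)         # right-hand side of (13), k = 2
print(residual, L)
```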

Remark 2.

In relation to the examples we have just presented, we would like to discuss the issue of estimating the relative errors $| ϵ_{n} / α |$ in the $z_{n}$ . This should help the reader when studying the numerical results included in Table 1 and Table 2. Starting with (13) and (15), we first note that, for all large n,

$| ϵ_{n + 1} | \approx {| L |}^{(s_{k} - 1) / k} {| ϵ_{n} |}^{s_{k}} .$

Therefore, assuming also that $α \neq 0$, we have

$| ϵ_{n + 1} / α | \approx D {| ϵ_{n} / α |}^{s_{k}}, \quad D = {({| L |}^{1 / k} | α |)}^{s_{k} - 1} .$

Now, if $z_{n}$ has $q > 0$ correct significant figures, we have $| ϵ_{n} / α | = O (10^{- q})$ . If, in addition, $D = O (10^{r})$ for some r, then we will have

$| ϵ_{n + 1} / α | \approx O (10^{r - q s_{k}}) .$

For simplicity, let us consider the case $r = 0$ , which is practically what we have in the two examples we have treated. Then $z_{n + 1}$ has approximately $q s_{k}$ correct significant decimal digits. That is, if $z_{n}$ has q correct significant decimal digits, then, due to the fact that $s_{k} > 1$ , $z_{n + 1}$ will have $s_{k}$ times as many correct significant decimal digits as $z_{n}$ .
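The estimate in Remark 2 can be tried on the data of Example 1. The sketch below takes $| ϵ_{5} |$ and $| ϵ_{6} |$ from Table 1, $| L | = 2 / 24$ from the limit computed in Example 1, and $s_{2} = 1.83928$ as quoted in the text; the variable names are illustrative only. It predicts the number of correct digits in $z_{6}$ from that in $z_{5}$ and compares it with the tabulated value.

```python
import math

k = 2
s2 = 1.83928            # order for k = 2, as quoted in the text
alpha_abs = 2.0         # |alpha_1| = |-1 + i sqrt(3)|
L_abs = 2.0 / 24.0      # |L| = |(1 - i sqrt(3)) / 24|

# D = (|L|^{1/k} |alpha|)^{s_k - 1}, so that r = log10(D).
D = (L_abs ** (1.0 / k) * alpha_abs) ** (s2 - 1.0)

eps5, eps6 = 1.591e-05, 5.223e-10           # |eps_5|, |eps_6| from Table 1

q5 = -math.log10(eps5 / alpha_abs)          # correct digits in z_5
q6_predicted = q5 * s2 - math.log10(D)      # from |eps_6/alpha| ~ D |eps_5/alpha|^{s_2}
q6_observed = -math.log10(eps6 / alpha_abs)

print(D, q5, q6_predicted, q6_observed)
```

For this example D is of order one, which is consistent with treating $r = 0$ in the discussion above.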

Funding

This research received no external funding.

Acknowledgments

The author would like to thank Tamara Kogan for drawing his attention to the paper [9] mentioned in the Introduction.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A

Before ending, we would like to provide a brief treatment of the order of convergence of our method stated in (14) and (15) by considering

$\frac{ϵ_{n + 1}}{\prod_{i = 0}^{k} ϵ_{n - i}} = L \forall n \Leftrightarrow ϵ_{n + 1} = L \prod_{i = 0}^{k} ϵ_{n - i} \forall n,$

instead of (13). We will show that $| ϵ_{n + 1} | = Q | ϵ_{n} |^{s_{k}}$ is possible if $s_{k}$ is a solution to the polynomial equation $s^{k + 1} = \sum_{i = 0}^{k} s^{i}$ and $Q = {| L |}^{(s_{k} - 1) / k}$. (For a more detailed treatment, we refer the reader to [15] (Section 3.3).)

We start by expressing all $| ϵ_{n - i} |$ in terms of $| ϵ_{n} |$ . We have

$| ϵ_{n - i} | = \frac{| ϵ_{n} |^{1 / s_{k}^{i}}}{Q^{m_{i}}}, m_{i} = \sum_{j = 1}^{i} \frac{1}{s_{k}^{j}}, i = 1, 2, \dots .$

Substituting this into $| ϵ_{n + 1} | = | L | \prod_{i = 0}^{k} | ϵ_{n - i} |$, we obtain

$Q {| ϵ_{n} |}^{s_{k}} = | L | | ϵ_{n} | \prod_{i = 1}^{k} \frac{| ϵ_{n} |^{1 / s_{k}^{i}}}{Q^{m_{i}}} = \frac{| L |}{Q^{M}} {| ϵ_{n} |}^{ρ}; \quad ρ = \sum_{i = 0}^{k} \frac{1}{s_{k}^{i}}, \quad M = \sum_{i = 1}^{k} m_{i} .$

Of course, this is possible when $s_{k} = ρ$ and $Q^{M + 1} = | L |$ .

Now, the requirement that $s_{k} = ρ$ is the same as $s_{k}^{k + 1} = \sum_{i = 0}^{k} s_{k}^{i}$ , which implies that the order $s_{k}$ should be a root of the polynomial

$g_{k} (s) = s^{k + 1} - \sum_{i = 0}^{k} s^{i} = \frac{s^{k + 2} - 2 s^{k + 1} + 1}{s - 1} .$

By Descartes’ rule of signs, $g_{k} (s)$ has only one positive root, which we denote by $\tilde{s}$. Since $g_{k} (1) = - k < 0$ and $g_{k} (2) = 1 > 0$, we have that $1 < \tilde{s} < 2$. The remaining k roots of $g_{k} (s)$ are the zeros of the polynomial $\tilde{g} (s) = g_{k} (s) / (s - \tilde{s}) = \sum_{j = 0}^{k} c_{j} s^{j}$, the $c_{j}$ satisfying $\tilde{s} c_{0} = 1$ and $\tilde{s} c_{j} - c_{j - 1} = 1$, $j = 1, \dots, k,$ hence

$c_{j} = \frac{1}{\tilde{s}} \sum_{i = 0}^{j} \frac{1}{{\tilde{s}}^{i}}, \quad j = 0, 1, \dots, k \quad \Rightarrow \quad 0 < c_{0} < c_{1} < \dots < c_{k - 1} < c_{k} = 1 .$

Therefore, by the Eneström–Kakeya theorem, all k roots of $\tilde{g} (s)$ are in the unit disk. We thus conclude that $\tilde{s} = s_{k}$ since we already know that the order of our method is greater than 1. (For Descartes’ rule of signs and the Eneström–Kakeya theorem, see, for example, Henrici [16] (pp. 442, 462).)

Next, we note that $g_{k} (s) = s g_{k - 1} (s) - 1$. Therefore, $g_{k - 1} (s_{k - 1}) = 0$ implies $g_{k} (s_{k - 1}) = - 1 < 0$, which, along with $g_{k} (2) = 1 > 0$, implies that $s_{k - 1} < s_{k} < 2$. Therefore, the sequence $\{s_{k}\}_{k = 1}^{\infty}$ is monotonically increasing and is bounded from above by 2. Consequently, $\lim_{k \to \infty} s_{k} = \hat{s}$ exists and $\hat{s} \leq 2$. Now,

$g_{k} (s_{k}) = 0 \Rightarrow s_{k}^{k + 2} - 2 s_{k}^{k + 1} + 1 = 0 \Rightarrow s_{k}^{2} - 2 s_{k} = - \frac{1}{s_{k}^{k}} .$

Upon letting $k \to \infty$ on both sides (the right-hand side tends to zero because $s_{k} \geq s_{1} > 1$), we obtain ${\hat{s}}^{2} - 2 \hat{s} = 0$, which gives $\hat{s} = 2$ since $\hat{s} > 1$.

The expression given for M can be simplified considerably as we show next. First, it is easy to verify that

$M = \sum_{i = 1}^{k} \frac{k - i + 1}{s_{k}^{i}} = \frac{1}{s_{k}^{k}} \sum_{i = 1}^{k} i s_{k}^{i - 1} .$

Next,

$s_{k}^{k} M = (\frac{d}{d s} \sum_{i = 0}^{k} s^{i}) |_{s = s_{k}} = (\frac{d}{d s} \frac{s^{k + 1} - 1}{s - 1}) |_{s = s_{k}} = \frac{(k + 1) s^{k} (s - 1) - (s^{k + 1} - 1)}{{(s - 1)}^{2}} |_{s = s_{k}} .$

Since $s^{k + 1} - 1 = (s - 1) \sum_{i = 0}^{k} s^{i}$, this becomes

$s_{k}^{k} M = \frac{(k + 1) s^{k} - \sum_{i = 0}^{k} s^{i}}{s - 1} |_{s = s_{k}} = \frac{k s_{k}^{k} - \sum_{i = 0}^{k - 1} s_{k}^{i}}{s_{k} - 1} .$

Now, by the fact that $g_{k} (s_{k}) = 0$, we have $\sum_{i = 0}^{k - 1} s_{k}^{i} = s_{k}^{k + 1} - s_{k}^{k}$. Consequently,

$M = \frac{k - (s_{k} - 1)}{s_{k} - 1} \Rightarrow M + 1 = \frac{k}{s_{k} - 1},$

which is the required result.
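The identity $M + 1 = k / (s_{k} - 1)$ can also be confirmed numerically by computing M from its definition as a double sum. The following sketch does this for $k = 1, \dots, 7$; the function names `s_root` and `M_from_definition`, the bisection tolerance, and the range of k are assumptions made here for illustration.

```python
def s_root(k, tol=1e-14):
    """Bisection on [1, 2] for the positive root of s**(k+1) - sum_{i=0}^{k} s**i."""
    g = lambda s: s ** (k + 1) - sum(s ** i for i in range(k + 1))
    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)


def M_from_definition(k, s):
    """M = sum_{i=1}^{k} m_i, with m_i = sum_{j=1}^{i} s**(-j)."""
    return sum(sum(s ** (-j) for j in range(1, i + 1)) for i in range(1, k + 1))


# Difference between M + 1 (from the definition) and k / (s_k - 1).
gaps = {}
for k in range(1, 8):
    s = s_root(k)
    gaps[k] = abs(M_from_definition(k, s) + 1 - k / (s - 1))
print(gaps)
```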

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Tables

Table 1

Results obtained by applying the generalized secant method with $k = 2$ , as shown in (34) and (35), to the equation $z^{3} - 8 = 0$ , to compute the root $α_{1} = - 1 + i \sqrt{3}$ . The entries denoted “**” mean that the limit of the extended-precision arithmetic has been reached.

n	$| ϵ_{n} |$	$\frac{ϵ_{n + 1}}{ϵ_{n} ϵ_{n - 1} ϵ_{n - 2}}$	$\frac{\log | ϵ_{n + 1} / ϵ_{n} |}{\log | ϵ_{n} / ϵ_{n - 1} |}$
0	$1.035 D + 00$	-	-
1	$1.035 D + 00$	-	-
2	$4.808 D - 01$	$- 8.972 D - 02$ $+ i 1.015 D - 01$	2.516
3	$6.979 D - 02$	$1.224 D - 01$ $- i 2.727 D - 02$	1.437
4	$4.355 D - 03$	$1.009 D - 01$ $- i 4.079 D - 02$	2.023
5	$1.591 D - 05$	$4.561 D - 02$ $- i 9.794 D - 02$	1.839
6	$5.223 D - 10$	$3.793 D - 02$ $- i 7.268 D - 02$	1.839
7	$2.967 D - 18$	$3.741 D - 02$ $- i 7.579 D - 02$	1.838
8	$2.083 D - 33$	**	**
9	$0.000 D + 00$	**	**

Table 2

Results obtained by applying the generalized secant method with $k = 2$ , as shown in (34) and (35), to the equation $sin (i z) - cos z = 0$ , to compute the root $α_{0} = (1 - i) π / 4$ . The entries denoted “**” mean that the limit of the extended-precision arithmetic has been reached.

n	$| ϵ_{n} |$	$\frac{ϵ_{n + 1}}{ϵ_{n} ϵ_{n - 1} ϵ_{n - 2}}$	$\frac{\log | ϵ_{n + 1} / ϵ_{n} |}{\log | ϵ_{n} / ϵ_{n - 1} |}$
0	$6.608 D - 01$	-	-
1	$3.403 D - 01$	-	-
2	$1.341 D - 01$	$3.163 D - 01$ $+ i 1.397 D - 01$	2.743
3	$1.043 D - 02$	$1.466 D - 01$ $- i 1.846 D - 01$	1.774
4	$1.122 D - 04$	$- 2.943 D - 03$ $- i 1.117 D - 01$	1.934
5	$1.755 D - 08$	$9.223 D - 03$ $- i 1.614 D - 01$	1.766
6	$3.320 D - 15$	$- 7.686 D - 04$ $- i 1.658 D - 01$	1.857
7	$1.084 D - 27$	**	**
8	$9.630 D - 35$	**	**

© 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

The secant method is a very effective numerical procedure used for solving nonlinear equations of the form $f (x) = 0$ . In a recent work (A. Sidi, Generalization of the secant method for nonlinear equations. Appl. Math. E-Notes, 8:115–123, 2008), we presented a generalization of the secant method that uses only one evaluation of $f (x)$ per iteration, and we provided a local convergence theory for it that concerns real roots. For each integer k, this method generates a sequence ${x_{n}}$ of approximations to a real root of $f (x)$ , where, for $n \geq k$ , $x_{n + 1} = x_{n} - f (x_{n}) / p_{n, k}^{'} (x_{n})$ , $p_{n, k} (x)$ being the polynomial of degree k that interpolates $f (x)$ at $x_{n}, x_{n - 1}, \dots, x_{n - k}$ , the order $s_{k}$ of this method satisfying $1 < s_{k} < 2$ . Clearly, when $k = 1$ , this method reduces to the secant method with $s_{1} = (1 + \sqrt{5}) / 2$ . In addition, $s_{1} < s_{2} < s_{3} < \dots,$ such that ${lim}_{k \to \infty} s_{k} = 2$ . In this note, we study the application of this method to simple complex roots of a function $f (z)$ . We show that the local convergence theory developed for real roots can be extended almost as is to complex roots, provided suitable assumptions and justifications are made. We illustrate the theory with two numerical examples.

Details

Title

Application of a Generalized Secant Method to Nonlinear Equations with Complex Roots

Author

Sidi, Avram

First page

169

Publication year

2021

Publication date

2021

Publisher

MDPI AG

e-ISSN

20751680

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.3390/axioms10030169

ProQuest document ID

2576380506
