In this section we investigate the order of convergence of functional iteration schemes and, as a means of obtaining rapid convergence, rediscover Newton’s method. We also consider ways of accelerating the convergence of Newton’s method in special circumstances. First, however, we need a new procedure for measuring how rapidly a sequence converges.
Order of Convergence
Definition 2.7
Suppose $\{p_n\}_{n=0}^{\infty}$ is a sequence that converges to $p$, with $p_n \neq p$ for all $n$. If positive constants $\lambda$ and $\alpha$ exist with
\[\lim_{n \to \infty} \frac{|p_{n+1} - p|}{|p_n - p|^\alpha} = \lambda,\]
then $\{p_n\}_{n=0}^{\infty}$ converges to $p$ of order $\alpha$, with asymptotic error constant $\lambda$.
An iterative technique of the form $p_n = g(p_{n-1})$ is said to be of order $\alpha$ if the sequence $\{p_n\}_{n=0}^{\infty}$ converges to the solution $p = g(p)$ of order $\alpha$.
In general, a sequence with a higher order of convergence converges more rapidly than a sequence with a lower order. The asymptotic error constant affects the speed of convergence but not to the extent of the order. Two cases of order are given special attention.
- If $\alpha = 1$ and $\lambda < 1$, the sequence is linearly convergent.
- If $\alpha = 2$, the sequence is quadratically convergent.
The next illustration compares a linearly convergent sequence to one that is quadratically convergent. It shows why we try to find methods that produce higher-order convergent sequences.
Illustration
Suppose that $\{p_n\}_{n=0}^{\infty}$ is linearly convergent to 0 with
\[\lim_{n \to \infty} \frac{|p_{n+1}|}{|p_n|} = 0.5\]
and that $\{\tilde{p}_n\}_{n=0}^{\infty}$ is quadratically convergent to 0 with the same asymptotic error constant,
\[\lim_{n \to \infty} \frac{|\tilde{p}_{n+1}|}{|\tilde{p}_n|^2} = 0.5.\]
For simplicity we assume that for each $n$ we have
\[\frac{|p_{n+1}|}{|p_n|} \approx 0.5 \quad \text{and} \quad \frac{|\tilde{p}_{n+1}|}{|\tilde{p}_n|^2} \approx 0.5.\]
For the linearly convergent scheme, this means that
\[|p_n - 0| = |p_n| \approx 0.5|p_{n-1}| \approx (0.5)^2 |p_{n-2}| \approx \cdots \approx (0.5)^n |p_0|,\]
whereas the quadratically convergent procedure has
\[|\tilde{p}_n - 0| = |\tilde{p}_n| \approx 0.5|\tilde{p}_{n-1}|^2 \approx (0.5)[0.5|\tilde{p}_{n-2}|^2]^2 = (0.5)^3|\tilde{p}_{n-2}|^4\]
\[\approx (0.5)^3\left[0.5|\tilde{p}_{n-3}|^2\right]^4 = (0.5)^{7}|\tilde{p}_{n-3}|^8\]
\[\approx \cdots \approx (0.5)^{2^n - 1}|\tilde{p}_0|^{2^n}.\]
Table 2.7 illustrates the relative speed of convergence of the sequences to 0 if $|p_0| = |\tilde{p}_0| = 1.$
The quadratically convergent sequence is within $10^{-38}$ of 0 by the seventh term. At least 126 terms are needed to ensure this accuracy for the linearly convergent sequence.
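A few lines of Python make this comparison concrete. The following minimal sketch (an illustrative addition, not part of the text) simply evaluates the two bounds $(0.5)^n$ and $(0.5)^{2^n - 1}$ for $n = 1, \dots, 7$, reproducing the comparison behind Table 2.7:

```python
# Error bounds when |p_0| = |p~_0| = 1: the linearly convergent sequence
# satisfies |p_n| ~ (0.5)^n, while the quadratically convergent one
# satisfies |p~_n| ~ (0.5)^(2^n - 1).
for n in range(1, 8):
    print(f"n = {n}   linear: {0.5**n:.6e}   quadratic: {0.5**(2**n - 1):.6e}")
```

The last line printed shows the quadratic bound already below $10^{-38}$ at $n = 7$, while the linear bound is still about $7.8 \times 10^{-3}$.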
Quadratically convergent sequences are expected to converge much more quickly than those that converge only linearly, but the next result implies that an arbitrary technique that generates a convergent sequence generally does so only linearly.
Theorem 2.8
Let $g \in C[a,b]$ be such that $g(x) \in [a,b]$, for all $x \in [a,b]$. Suppose, in addition, that $g'$ is continuous on $(a,b)$ and a positive constant $k < 1$ exists with
\[|g'(x)| \leq k, \quad \text{for all } x \in (a,b).\]
If $g'(p) \neq 0$, then for any number $p_0 \neq p$ in $[a,b]$, the sequence
\[p_n = g(p_{n-1}), \quad \text{for } n \ge 1,\]
converges only linearly to the unique fixed point $p$ in $[a,b]$.
Proof.
We know from the Fixed-Point Theorem 2.4 in Section 2.2 that the sequence converges to $p$. Since $g'$ exists on $(a,b)$, we can apply the Mean Value Theorem to $g$ to show that for any $n$,
\[p_{n+1} - p = g(p_n) - g(p) = g'(\xi_n)(p_n - p),\]
where $\xi_n$ is between $p_n$ and $p$. Since $\{p_n\}_{n=0}^{\infty}$ converges to $p$, we also have $\{\xi_n\}_{n=0}^{\infty}$ converging to $p$. Since $g'$ is continuous on $(a,b)$, we have
\[\lim_{n \to \infty} g'(\xi_n) = g'(p).\]
Thus
\[\lim_{n \to \infty} \frac{p_{n+1} - p}{p_n - p} = \lim_{n \to \infty} g'(\xi_n) = g'(p) \quad \text{and} \quad \lim_{n \to \infty} \frac{|p_{n+1} - p|}{|p_n - p|} = |g'(p)|.\]
Hence, if $g'(p) \neq 0$, fixed-point iteration exhibits linear convergence with asymptotic error constant $|g'(p)|$.
$\square$
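The conclusion of Theorem 2.8 is easy to observe numerically. The following minimal Python sketch uses the illustrative choice $g(x) = \cos x$ (our example, not from the text), whose fixed point is $p \approx 0.7390851$; the successive error ratios settle near $|g'(p)| = |\sin p| \approx 0.6736$, the predicted asymptotic error constant:

```python
import math

# Fixed-point iteration for g(x) = cos x; the fixed point is p ~ 0.7390851.
# Theorem 2.8 predicts linear convergence with asymptotic error constant
# |g'(p)| = |sin(p)| ~ 0.6736.
p_true = 0.7390851332151607
p = 1.0
for n in range(1, 16):
    p_new = math.cos(p)
    print(f"n = {n:2d}   p_n = {p_new:.10f}   "
          f"ratio = {abs(p_new - p_true) / abs(p - p_true):.6f}")
    p = p_new
```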
Theorem 2.8 implies that higher-order convergence for fixed-point methods of the form $g(p) = p$ can occur only when $g'(p) = 0.$ The next result describes additional conditions that ensure the quadratic convergence we seek.
Theorem 2.9
Let $p$ be a solution of the equation $x = g(x)$. Suppose that $g'(p) = 0$ and $g''$ is continuous with $|g''(x)| < M$ on an open interval $I$ containing $p$. Then there exists a $\delta > 0$ such that, for $p_0 \in [p - \delta, p + \delta]$, the sequence defined by $p_n = g(p_{n-1})$, when $n \ge 1$, converges at least quadratically to $p$. Moreover, for sufficiently large values of $n$,
\[|p_{n+1} - p| < \frac{M}{2} |p_n - p|^2.\]
Proof.
Choose $k$ in $(0,1)$ and $\delta > 0$ such that the interval $[p - \delta, p + \delta]$ is contained in $I$ and $|g'(x)| \le k$ on this interval; such a $\delta$ exists because $g'(p) = 0$ and $g'$ is continuous on $I$. Since $|g'(x)| \le k < 1$, the argument used in the proof of Theorem 2.6 in Section 2.3 shows that the terms of the sequence $\{p_n\}_{n=0}^{\infty}$ are contained in $[p - \delta, p + \delta]$. Expanding $g(x)$ in a linear Taylor polynomial for $x \in [p - \delta, p + \delta]$ gives
\[g(x) = g(p) + g'(p)(x - p) + \frac{g''(\xi)}{2} (x - p)^2,\]
where $\xi$ lies between $x$ and $p$. The hypotheses $g(p) = p$ and $g'(p) = 0$ imply that
\[g(x) = p + \frac{g''(\xi)}{2} (x - p)^2.\]
In particular, when $x = p_n$,
\[p_{n+1} = g(p_n) = p + \frac{g''(\xi_n)}{2} (p_n - p)^2,\]
with $\xi_n$ between $p_n$ and $p$. Thus,
\[p_{n+1} - p = \frac{g''(\xi_n)}{2} (p_n - p)^2.\]
Since $|g'(x)| \le k < 1$ on $[p - \delta, p + \delta]$ and $g$ maps $[p - \delta, p + \delta]$ into itself, it follows from the Fixed-Point Theorem that $\{p_n\}_{n=0}^{\infty}$ converges to $p$. But $\xi_n$ is between $p$ and $p_n$ for each $n$, so $\{\xi_n\}_{n=0}^{\infty}$ also converges to $p$, and
\[\lim_{n \to \infty} \frac{|p_{n+1} - p|}{|p_n - p|^2} = \frac{|g''(p)|}{2}.\]
This result implies that the sequence $\{p_n\}_{n=0}^{\infty}$ is quadratically convergent if $g''(p) \neq 0$ and of higher-order convergence if $g''(p) = 0.$
Because $g''$ is continuous and strictly bounded by $M$ on the interval $[p - \delta, p + \delta]$, this also implies that, for sufficiently large values of $n$,
\[|p_{n+1} - p| < \frac{M}{2} |p_n - p|^2.\]
$\square$
Theorems 2.8 and 2.9 tell us that our search for quadratically convergent fixed-point methods should point in the direction of functions whose derivatives are zero at the fixed point. That is:
- For a fixed-point method to converge quadratically, we need both $g(p) = p$ and $g'(p) = 0.$
The easiest way to construct a fixed-point problem associated with a root-finding problem $f(x) = 0$ is to add or subtract a multiple of $f(x)$ from $x$. Consider the sequence
\[p_n = g(p_{n-1}), \quad \text{for } n \ge 1,\]
for $g$ in the form
\[g(x) = x - \phi(x) f(x),\]
where $\phi$ is a differentiable function that will be chosen later.
For the iterative procedure derived from $g$ to be quadratically convergent, we need to have $g'(p) = 0$ when $f(p) = 0$. Because
\[g'(x) = 1 - \phi'(x)f(x) - \phi(x)f'(x),\]
and $f(p) = 0$, we have
\[g'(p) = 1 - \phi'(p)f(p) - \phi(p)f'(p) = 1 - \phi'(p) \cdot 0 - f'(p)\phi(p) = 1 - f'(p)\phi(p),\]
and $g'(p) = 0$ if and only if $\phi(p) = 1 / f'(p)$.
If we let $\phi(x) = 1 / f'(x)$, then we will ensure that $\phi(p) = 1 / f'(p)$ and produce the quadratically convergent procedure
\[p_n = g(p_{n-1}) = p_{n-1} - \frac{f(p_{n-1})}{f'(p_{n-1})}.\]
This, of course, is simply Newton’s method. Hence
- If $f(p) = 0$ and $f'(p) \neq 0$, then for starting values sufficiently close to $p$, Newton’s method will converge at least quadratically.
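A minimal Python sketch of Newton's method (the helper name `newton` is ours, added for illustration) applied to $f(x) = x^3 + 4x^2 - 10$, which has a simple zero near $1.36523001$, shows the expected quadratic behavior: the number of correct digits roughly doubles at each step.

```python
def newton(f, fprime, p0, n_steps=5):
    """Newton's method: p_n = p_{n-1} - f(p_{n-1}) / f'(p_{n-1})."""
    p = p0
    for n in range(1, n_steps + 1):
        p = p - f(p) / fprime(p)
        print(f"n = {n}   p_n = {p:.10f}")
    return p

# f has a simple zero, so the iterates gain roughly twice as many
# correct digits per step.
newton(lambda x: x**3 + 4*x**2 - 10, lambda x: 3*x**2 + 8*x, p0=1.5)
```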
Multiple Roots
In the preceding discussion, the restriction was made that $f'(p) \neq 0$, where $p$ is the solution to $f(x) = 0$. In particular, Newton’s method and the Secant method will generally give problems if $f'(p) = 0$ when $f(p) = 0$. To examine these difficulties in more detail, we make the following definition.
Definition 2.10
A solution $p$ of $f(x) = 0$ is a zero of multiplicity $m$ of $f$ if for $x \neq p$, we can write $f(x) = (x - p)^m q(x)$, where $\lim_{x \to p} q(x) \neq 0.$
In essence, $q(x)$ represents that portion of $f(x)$ that does not contribute to the zero of $f$. The following result gives a means to easily identify simple zeros of a function, those that have multiplicity one.
Theorem 2.11
The function $f \in C^1[a,b]$ has a simple zero at $p$ in $(a,b)$ if and only if $f(p) = 0$, but $f'(p) \neq 0.$
Proof.
If $f$ has a simple zero at $p$, then $f(p) = 0$ and $f(x) = (x - p)q(x)$, where $\lim_{x \to p} q(x) \neq 0.$ Since $f \in C^1[a,b]$,
\[f'(p) = \lim_{x \to p} f'(x) = \lim_{x \to p} \left[ q(x) + (x - p)q'(x) \right] = \lim_{x \to p} q(x) \neq 0.\]
Conversely, if $f(p) = 0$, but $f'(p) \neq 0$, expand $f$ in a zeroth Taylor polynomial about $p$. Then
\[f(x) = f(p) + f'(\xi(x))(x - p) = (x - p)f'(\xi(x)),\]
where $\xi(x)$ is between $x$ and $p$. Since $f \in C^{1}[a, b]$,
\[\lim_{x \to p} f'(\xi(x)) = f'\left(\lim_{x \to p} \xi(x)\right) = f'(p) \neq 0.\]
Letting $q = f' \circ \xi$ gives $f(x) = (x - p) q(x)$, where $\lim_{x \to p} q(x) \neq 0$. Thus $f$ has a simple zero at $p$.
$\square$
The following generalization of Theorem 2.11 is considered in Exercise 12.
Theorem 2.12
The function $f \in C^{m}[a, b]$ has a zero of multiplicity $m$ at $p$ in $(a, b)$ if and only if
\[0 = f(p) = f'(p) = f''(p) = \cdots = f^{(m-1)}(p),\]
but
\[f^{(m)}(p) \neq 0.\]
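Theorem 2.12 also suggests a direct computational test for multiplicity: differentiate until the value at $p$ stops vanishing. Here is a sketch using the sympy library (the helper `multiplicity` is ours, not a library routine):

```python
import sympy as sp

def multiplicity(f, x, p, max_order=10):
    """Smallest m with f(p) = f'(p) = ... = f^(m-1)(p) = 0 but f^(m)(p) != 0,
    per Theorem 2.12; returns None if every tested derivative vanishes."""
    g = f
    for m in range(max_order + 1):
        if sp.simplify(g.subs(x, p)) != 0:
            return m
        g = sp.diff(g, x)
    return None

x = sp.symbols('x')
print(multiplicity(sp.exp(x) - x - 1, x, 0))     # 2, as in Example 1 below
print(multiplicity((x - 1)**3 * (x + 2), x, 1))  # 3
```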
The result in Theorem 2.12 implies that an interval about $p$ exists where Newton's method converges quadratically to $p$ for any initial approximation $p_0 \approx p$, provided that $p$ is a simple zero. The following example shows that quadratic convergence might not occur if the zero is not simple.
Example 1
Let $f(x) = e^{x} - x - 1$. (a) Show that $f$ has a zero of multiplicity 2 at $x = 0$. (b) Show that Newton's method with $p_0 = 1$ converges to this zero but not quadratically.
Solution
(a) We have
\[f(x) = e^{x} - x - 1, \qquad f'(x) = e^{x} - 1, \qquad \text{and} \qquad f''(x) = e^{x},\]
so
\[f(0) = e^{0} - 0 - 1 = 0, \qquad f'(0) = e^{0} - 1 = 0, \qquad \text{and} \qquad f''(0) = e^{0} = 1.\]
Theorem 2.12 implies that $f$ has a zero of multiplicity 2 at $x = 0$.
(b) The first two terms generated by Newton's method applied to $f$ with $p_0 = 1$ are
\[p_1 = p_0 - \frac{f(p_0)}{f'(p_0)} = 1 - \frac{e - 2}{e - 1} \approx 0.58198,\]
and
\[p_2 = p_1 - \frac{f(p_1)}{f'(p_1)} = 0.58198 - \frac{0.20760}{0.78957} \approx 0.31906.\]
The first sixteen terms of the sequence generated by Newton's method are shown in Table 2.8. The sequence is clearly converging to 0, but not quadratically. The graph of $f$ is shown in Figure 2.12.
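The iteration is easy to reproduce; the following short Python sketch (ours, for illustration) prints the iterates together with the ratios $|p_n|/|p_{n-1}|$, which settle near $0.5$, the linear behavior visible in Table 2.8:

```python
import math

# Newton's method for f(x) = exp(x) - x - 1, whose zero at x = 0
# has multiplicity 2, starting from p_0 = 1.
p = 1.0
for n in range(1, 14):
    p_new = p - (math.exp(p) - p - 1) / (math.exp(p) - 1)
    print(f"n = {n:2d}   p_n = {p_new:.8e}   ratio = {abs(p_new / p):.5f}")
    p = p_new
```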


One method of handling the problem of multiple roots of a function $f$ is to define
\[\mu(x) = \frac{f(x)}{f'(x)}.\]
If $p$ is a zero of $f$ of multiplicity $m$ with $f(x) = (x - p)^{m} q(x)$, then
\[\mu(x) = \frac{(x - p)^{m} q(x)}{m(x - p)^{m-1} q(x) + (x - p)^{m} q'(x)}=\frac{(x - p) q(x)}{mq(x) + (x - p) q'(x)}\]
also has a zero at $p$. However, $q(p) \neq 0$, so
\[\frac{q(p)}{mq(p) + (p - p)q'(p)} = \frac{q(p)}{mq(p)} = \frac{1}{m} \neq 0,\]
and $p$ is a simple zero of $\mu(x)$. Newton’s method can then be applied to $\mu(x)$ to give
\[g(x) = x - \frac{\mu(x)}{\mu'(x)} = x - \frac{f(x)/f'(x)}{\left([f'(x)]^{2} - f(x)f''(x)\right)/[f'(x)]^{2}},\]
which simplifies to
\[g(x) = x - \frac{f(x) f'(x)}{[f'(x)]^{2} - f(x) f''(x)}. \tag{2.13}\]
If $g$ has the required continuity conditions, functional iteration applied to $g$ will be quadratically convergent regardless of the multiplicity of the zero of $f$. Theoretically, the only drawback to this method is the additional calculation of $f''(x)$ and the more laborious procedure of calculating the iterates. In practice, however, multiple roots can cause serious round-off problems because the denominator of (2.13) consists of the difference of two numbers that are both close to 0.
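A direct implementation of iteration (2.13) might look like the following Python sketch (ours, under the assumption that $f$, $f'$, and $f''$ are available as functions), shown applied to the function of Example 1:

```python
import math

def modified_newton_step(f, fp, fpp, p):
    """One step of Eq. (2.13): p - f(p) f'(p) / ([f'(p)]^2 - f(p) f''(p))."""
    fx, fpx, fppx = f(p), fp(p), fpp(p)
    return p - fx * fpx / (fpx**2 - fx * fppx)

f   = lambda x: math.exp(x) - x - 1
fp  = lambda x: math.exp(x) - 1
fpp = lambda x: math.exp(x)

p = 1.0
for n in range(1, 6):
    p = modified_newton_step(f, fp, fpp, p)
    print(f"n = {n}   p_n = {p:.10e}")
# As p_n approaches 0, the numerator and denominator of (2.13) both
# approach 0, so round-off eventually limits the attainable accuracy.
```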
Example 2
In Example 1 it was shown that $f(x) = e^{x} - x - 1$ has a zero of multiplicity 2 at $x = 0$ and that Newton’s method with $p_{0} = 1$ converges to this zero but not quadratically. Show that the modification of Newton’s method as given in Eq. (2.13) improves the rate of convergence.
Solution
Modified Newton’s method gives
\[p_{1} = p_{0} - \frac{f(p_{0}) f'(p_{0})}{[f'(p_{0})]^{2} - f(p_{0}) f''(p_{0})}= 1 - \frac{(e - 2)(e - 1)}{(e - 1)^{2} - (e - 2)e} \approx -2.3421061 \times 10^{-1}.\]
This is considerably closer to 0 than the first term using Newton's method, which was 0.58198. Table 2.9 lists the first five approximations to the double zero at $x = 0$. The results were obtained using a system with ten digits of precision. The relative lack of improvement in the last two entries is due to the fact that, on this system, both the numerator and the denominator of (2.13) approach 0. Consequently, there is a loss of significant digits of accuracy as the approximations approach 0.
The following illustration shows that modified Newton's method converges quadratically even in the case of a simple zero.
Illustration
In Section 2.2 we found that a zero of $f(x) = x^{3} + 4x^{2} - 10$ is $p \approx 1.36523001$. Here we will compare the convergence to this zero using both Newton's method and the modified Newton's method given in Eq. (2.13). Let
(i) $p_n = p_{n-1} - \dfrac{p_{n-1}^3 + 4p_{n-1}^2 - 10}{3p_{n-1}^2 + 8p_{n-1}}$ from Newton’s method
and, from the Modified Newton’s method given by Eq. (2.13),
(ii) $p_n = p_{n-1}-\dfrac{(p_{n-1}^3 + 4p_{n-1}^2 - 10)(3p_{n-1}^2 + 8p_{n-1})}{(3p_{n-1}^2 + 8p_{n-1})^2 - (p_{n-1}^3 + 4p_{n-1}^2 - 10)(6p_{n-1} + 8)}.$
With $p_0 = 1.5$, we have
Newton’s method
\[p_1 = 1.3733333, \quad p_2 = 1.36526201, \quad \text{and} \quad p_3 = 1.36523001.\]
Modified Newton’s method
\[p_1 = 1.35689898, \quad p_2 = 1.36519585, \quad \text{and} \quad p_3 = 1.36523001.\]
Both methods are rapidly convergent to the actual zero, which is given by both methods as $p_3$. Note, however, that in the case of a simple zero the original Newton’s method requires substantially less computation.
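Both iterations are simple to reproduce; a short Python sketch (ours, for illustration) that generates the values above is:

```python
f   = lambda x: x**3 + 4*x**2 - 10
fp  = lambda x: 3*x**2 + 8*x
fpp = lambda x: 6*x + 8

p_newton = p_mod = 1.5
for n in range(1, 4):
    p_newton = p_newton - f(p_newton) / fp(p_newton)            # iteration (i)
    p_mod = p_mod - (f(p_mod) * fp(p_mod)
                     / (fp(p_mod)**2 - f(p_mod) * fpp(p_mod)))  # iteration (ii)
    print(f"n = {n}   (i) {p_newton:.8f}   (ii) {p_mod:.8f}")
```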
Maple contains Modified Newton's method, as described in Eq. (2.13), in its NumericalAnalysis package. The options for this command are the same as those for the Bisection method. To obtain results similar to those in Table 2.9, we can use
with(Student[NumericalAnalysis])
f := exp(x) - x - 1
ModifiedNewton(f, x = 1.0, tolerance = 10^(-10), output = sequence, maxiterations = 20)
Remember that there is sensitivity to round-off error in these calculations, so you might need to reset Digits in Maple to get the exact values in Table 2.9.