53. Newton’s method

We consider the non-linear equation

\[ F(x) = 0, \]

where \(F : {\mathbb R}^n \rightarrow {\mathbb R}^n\) is a smooth function. In general, it is not clear whether a solution exists, nor whether it is unique.

Newton’s method is an iterative solution method:

Given some initial guess \(x^0\), compute the sequence:

\[ x^{k+1} = x^k - F^\prime(x^k)^{-1} F(x^k) \]

Here, \(F^\prime(x^k)\) is the Jacobian matrix of \(F\) at the point \(x^k\).

If the initial guess \(x^0\) is sufficiently close to a regular solution \(x^\ast\), i.e. a solution where \(F^\prime(x^\ast)\) is regular, then Newton’s method converges to \(x^\ast\). The convergence is very fast, namely of quadratic order:

\[ \| x^{k+1} - x^\ast \| \leq c \| x^k - x^\ast \|^2 \]

The main idea is to replace the non-linear problem by a sequence of linearized problems, using the first-order approximation \(F(x^k + \Delta x) \approx F(x^k) + F^\prime(x^k) \Delta x\). Setting the right-hand side to zero gives the Newton update \(\Delta x = - F^\prime(x^k)^{-1} F(x^k)\).
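A minimal sketch of the iteration in Python/NumPy, assuming the Jacobian is available as a function; the example system and the tolerance are illustrative choices only:

```python
import numpy as np

def newton(F, dF, x0, tol=1e-12, maxit=25):
    """Newton iteration x^{k+1} = x^k - F'(x^k)^{-1} F(x^k)."""
    x = x0.copy()
    for k in range(maxit):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        # solve the linearized problem F'(x^k) dx = -F(x^k)
        x += np.linalg.solve(dF(x), -Fx)
    return x, k

# example: intersection of the unit circle with the parabola y = x^2
F  = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[1] - x[0]**2])
dF = lambda x: np.array([[2*x[0], 2*x[1]], [-2*x[0], 1.0]])
x, its = newton(F, dF, np.array([1.0, 1.0]))
print(x, its)
```

Printing \(\| x^k - x^\ast \|\) inside the loop shows the number of correct digits roughly doubling in each step, as predicted by the quadratic convergence estimate.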

53.1. Minimization problem

Now we consider the minimization problem

\[ \min_{x \in {\mathbb R}^n} J(x) \]

with the function

\[ J : {\mathbb R}^n \rightarrow {\mathbb R} \]

If the function \(J\) is smooth, then a minimizer \(x^\ast\) satisfies the first-order necessary condition:

\[ J^\prime(x^\ast) = 0 \]

This is a non-linear equation for the function

\[ F(\cdot) := J^\prime (\cdot) : {\mathbb R}^n \rightarrow {\mathbb R}^n \]

Applying Newton’s method for \(F(x) = 0\) with \(F = J^\prime\) yields the iteration for the minimization problem:

\[ x^{k+1} = x^{k} - J^{\prime \prime}(x^k)^{-1} J^\prime(x^k) \]

The symmetric matrix \(J^{\prime \prime}(x^k)\) is the \(n \times n\) Hessian matrix at the point \(x^k\), and \(J^\prime\) is the gradient of \(J\).
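The same loop written for the minimization problem, again a sketch assuming gradient and Hessian are given in closed form; the Rosenbrock function is only a standard test example:

```python
import numpy as np

def newton_min(grad, hess, x0, tol=1e-12, maxit=50):
    """Newton's method for J'(x) = 0:  x <- x - J''(x)^{-1} J'(x)."""
    x = x0.copy()
    for k in range(maxit):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x -= np.linalg.solve(hess(x), g)
    return x, k

# standard test example: Rosenbrock function J(x,y) = (1-x)^2 + 100 (y - x^2)^2
grad = lambda x: np.array([-2*(1 - x[0]) - 400*x[0]*(x[1] - x[0]**2),
                           200*(x[1] - x[0]**2)])
hess = lambda x: np.array([[2 - 400*(x[1] - 3*x[0]**2), -400*x[0]],
                           [-400*x[0], 200.0]])
x, its = newton_min(grad, hess, np.array([-1.0, 1.0]))
print(x, its)    # minimizer (1, 1)
```

Since the plain iteration converges only locally, practical implementations add a damping parameter or line search to enlarge the region of convergence.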

53.2. Nonlinear variational problems

We extend the concept of bilinear forms to non-linear forms

\[ A(\cdot, \cdot) : V \times V \rightarrow {\mathbb R}, \]

where \(A(.,.)\) is non-linear in the first argument, but still linear in the second argument. With \(f \in V^\ast\), one searches for a solution of

\[ A(u,v) = f(v) \qquad \forall \, v \in V \]

The Lax-Milgram lemma and Cea’s lemma can be extended under the following hypotheses:

  • continuous is replaced by Lipschitz continuous: \( | A(u_1,v) - A(u_2,v) | \leq \alpha_2 \, \| u_1 - u_2 \|_V \, \| v \|_V \)

  • coercive is replaced by monotone, which means: \( A(u_1, u_1-u_2) - A(u_2, u_1-u_2) \geq \alpha_1 \, \| u_1 - u_2 \|_V^2 \)
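As an illustration (a standard model example, not a unique choice), consider the semilinear form

\[ A(u,v) = \int_\Omega \nabla u \cdot \nabla v + u^3 v \, dx, \qquad V = H_0^1(\Omega), \]

which corresponds to the equation \(-\Delta u + u^3 = f\). Since \((a^3 - b^3)(a - b) = (a-b)^2 (a^2 + ab + b^2) \geq 0\) for all real \(a, b\), there holds

\[ A(u_1, u_1-u_2) - A(u_2, u_1-u_2) \geq \| \nabla (u_1 - u_2) \|_{L_2}^2 \geq \alpha_1 \, \| u_1 - u_2 \|_V^2, \]

where the last step uses the Friedrichs inequality, so the form is monotone. Global Lipschitz continuity fails for this form, but it holds on bounded subsets of \(V\) (for \(\Omega \subset {\mathbb R}^d\), \(d \leq 3\), via the embedding \(H^1 \hookrightarrow L^6\)), which is a commonly used local variant of the hypothesis above.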

53.3. Finite element discretization

As for linear problems, one searches for an approximate solution

\[ u_h(x) = \sum_{i=1}^N u_i \varphi_i(x) \]

satisfying the variational formulation

\[ A( \sum_i u_i \varphi_i, \varphi_j) = f(\varphi_j) \qquad \forall \, j : 1 \leq j \leq N \]

Since \(A(.,.)\) is not linear in the first argument, we cannot transform the problem into a linear system. Instead, we obtain a non-linear equation

\[ F(u) = 0 \]

for the function \(F : {\mathbb R}^N \rightarrow {\mathbb R}^N\) with

\[ F_j(u) = A(\sum_i u_i \varphi_i, \varphi_j) - f(\varphi_j) \]

Now we can apply Newton’s method to solve this non-linear equation in \({\mathbb R}^N\), as sketched below.
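To make the assembly of \(F\) and its Jacobian \(\partial F_j / \partial u_k\) concrete, here is a sketch for the 1D model problem \(-u'' + u^3 = f\) on \((0,1)\) with homogeneous Dirichlet conditions and piecewise linear elements; mesh size, right-hand side, and the midpoint quadrature are illustrative choices:

```python
import numpy as np

# 1D model problem -u'' + u^3 = f on (0,1), u(0) = u(1) = 0,
# with M piecewise linear elements on a uniform mesh
M = 50
h = 1.0 / M
f = lambda x: 10.0

def assemble(u):
    """F_j(u) = A(u_h, phi_j) - f(phi_j) and its Jacobian dF_j/du_k."""
    ub = np.concatenate(([0.0], u, [0.0]))   # attach boundary values
    F = np.zeros(M + 1)
    J = np.zeros((M + 1, M + 1))
    for e in range(M):                        # loop over elements [x_e, x_{e+1}]
        xm = (e + 0.5) * h                    # element midpoint
        um = 0.5 * (ub[e] + ub[e + 1])        # u_h at the midpoint
        du = (ub[e + 1] - ub[e]) / h          # constant slope of u_h
        for j, dphij in ((e, -1.0/h), (e + 1, 1.0/h)):
            # stiffness term integrated exactly, u^3 and load by midpoint rule
            F[j] += h * du * dphij + h * 0.5 * (um**3 - f(xm))
            for k, dphik in ((e, -1.0/h), (e + 1, 1.0/h)):
                J[j, k] += h * dphij * dphik + h * 0.25 * 3.0 * um**2
    return F[1:-1], J[1:-1, 1:-1]             # keep interior dofs only

u = np.zeros(M - 1)                           # initial guess u_h = 0
for it in range(20):                          # Newton loop from above
    F, J = assemble(u)
    if np.linalg.norm(F) < 1e-12:
        break
    u -= np.linalg.solve(J, F)
print(it, u.max())
```

Note that the Jacobian is assembled with the same quadrature as \(F\) itself, so it is the exact derivative of the discrete residual, and the quadratic convergence of the Newton loop is preserved.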