This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
, IR
+
j
IR
In the second integral we can make the substitution and then call
x x.
Then we have
D = fIRVp(x)[O(x+uox)-,p(x)]u(x)TEuu(x+uox)dx
-
f1R (x+uax)[ip(x+uox)- (x)]u(x+uAx)TE11u(x)dx.
By the symmetry of
Eu,
u(x)TEUu(x+PAx)
=
u(x+uex)TEUU(x)
and therefore, D = -jR[V+(x+pAx)-$(x)]2u(x)TEUu(x+uax)dx I
IDI
2
< L24k2(Ax)211E V112 IIuIIIn
the second case, we have D = j Y(x)[(x+u0x)-*(x)]u(x)TFUU(x+uox)dx IR
-
f1R
u(x-uox)dx.
x-pAx
Fourier transforms of difference methods
9.
165
Using the fact that u(x)TFUU(x+uox)
=
-u(x+uox)TFuu(x)
and making the same substitution, we get D =
-!IIt
IDI
< L24k2(ax)2 IIFUHz IIu1IZ.
[iV(x+uAx)- (x)]Iu(x)TFUU(x+uAx)dx
We set
Proof of Theorem 9.34: 1/2
y = h
and
8 = 1/y = hl/2
h < ho = (A/6k)2, then
If
6kytx = 6kyh/A <
1
and
2ktx = 2kh/A < 8.
Thus the conditions of Lemmas 9.38 and 9.39 are satisfied. We can approximate the matrices [-38,38]
and
on the interval
Du(x)
with a linear combination of the matrices
Du(-38)
Du(38), namely,
Du(x) = s[(38-x)DU(-38)+(38+x)DU(38)] + Z(x,u,8) Since the second derivatives of the elements of bounded independently of The constant the method
M2 MD.
x, we have
On the smaller interval (38 ± x) ? 16.
The functions V+1(x)
=
[(38-x)/68]1/2
2(x) = [(38+x)/68]1/2
are
IIZ(x,u,8)112 < M28
depends only on the functions
68
D
2 .
Du, i.e., on
[-28,28], we have
166
INITIAL VALUE PROBLEMS
I.
are continuously differentiable on this interval.
ute values of the first derivatives of bounded by
*1
and
The absolare
*2
x,x a [-28, 28], j = 1,2, it follows
1/8.
For
(x)
j(x)I < sIx xI.
that
The above interpolation formula for
(9.40)
D(x)
can now be written
as
Du(x) _ iP1(x)2D11 (-38) + * 2(x)2Du(38) + 2(x,v,B), u = -2k(1)2k,
For fixed but arbitrary v = --(1)-
x c [-28, 28].
u e L2(R,IRn)
we define
(cf. Lemmas 9.37 and 9.38).
uv =
We have
Support(uv) C [(v-1)8, (v+1)8], and hence by Lemma 9.39 together with equation (9.40),
I
I
I
I
o,Q38(h)(uo)>'
< [(4k+1)M282 + 2K8-2(0x)2] IIuoIIZ _ (2kM2 + 2K/A2)h IIuoIIZ = M3h IIuoIIZ. The scalar products
9.
Fourier transforms of difference methods
Analogously, we have for all
v = --(1)-
167
that
Since
-M1y2L2 (Ax) 2 IIuII2 - M3h
>
y2(Ax)2
=
h/A2
v=-.
H uV II2.
and since by Lemma 9.37(3),
IIuv112 =IIuII2 we have that
> -M4h IIuII2 where
M4 = M1L2/a2 + M3. Applying Lemma 9.35 yields
> -M4h
IIuII2 - M5h IIuII2
-
(l+M6h)1/2 < 1+M7h.
We will not give an application of Theorem 9.34 at this time. In the next section, we will return to initial value problems in several variables, and there we will use the theorem to help show that the generalization of the Lax-Wendroff method to variable coefficients is stable.
There does not seem to
be any simpler means of establishing the stability of that method.
168
INITIAL VALUE PROBLEMS
I.
Initial value problems in several space variables
10.
So far we have only investigated partial differential equations in one time variable, t and one space variable, x.
For pure initial value problems, the main results of the previous chapter can be extended effortlessly to partial differ]Rm+l with one time variable
ential equations in space variables m >
x1,
xm.
t
explain the situation for
m
We have avoided the case
until now for didactic and notational reasons.
1
and
We will
in this section with the
m > 1
aid of typical examples.
Initial boundary value problems, in contrast to pure initial value problems, are substantially more complicated when
m >
1.
The additional difficulties, which we cannot
discuss here, arise because of the varying types of boundaries.
The problems resemble those which arise in the study of
boundary value problems (undertaken in Part II).
Throughout this chapter we will use the notation m
x = (xl,...,xm) eIR
y = (YI,...ym) eIRm m
<x,y> =
dx = dxl...dxm,
I
xuYU'
u=1
In addition, we introduce the multi-indices e a
s = (sl) .... s
The translation operator replaced by e{u)
m
TAx of
IR1
different operators
_ (eemu)),
e(u) = 6 uv$
(cf. Definition 6.2) is Tku
in
IRm:
u = l(1)m u,v = 1(l)m
V
Tku(x) = x + ke(u),
m
m x e IR,
k eIR,
u = l(1)m.
10.
Problems in several space variables
For all
169
let
fE
x E Rm.
Tku(f)(x) = f(Tku(x)),
With this definition, the translation operators become bijective continuous linear mappings of
L2(IR m,tn)
For all
They commute with each other.
v e 7l
into itself.
we have
Tku = Tvk,u Let
have bounded spectral norm
B E
II B (x) 112 .
The map
f(') - B(.)f(.) is a bounded linear operator in relations for
Tku
and
B
L2ORm,¢n).
The commutativity
are
Tku°B(x) = B(Tku(x))Tku B(x)'Tku
In many cases, B
Tku0B(T-ku(x)). =
will satisfy a Lipschitz condition
IIB(x)-B(Y) 112 < L I1x-Y112. Then, IIB(x)°Tk1i -Tku°B(x)112 < L1kj.
For
k cIR
and arbitrary multi-indices m s
Tk =
su
fl Tku
.
Vj=l
The difference method MD = {C(h)Ih e(0,ho]}
can now be written in the form
s, we define
170
INITIAL VALUE PROBLEMS
I.
C(h) _ (I B5(x,h)Tk)
(E A5(x,h)Tk).
All sums, here and henceforth, extend only over finitely many Also we assume that for all
multi-indices.
s, x, and
h,
As(x,h),Bs(x,h) E MAT(n,n,IR) k = h/A
A IR+.
where
k =
or
Analogously to Definition 9.12, we can assign to each difference method an amplification matrix exp(ik<s,y>)Bs(x,h))-1(1
G(h,y,x) =
exp(ik<x,y>)As(x,h)).
(X
s If the matrices of
s As(x,h)
and
B5(x,h)
are all independent
x, we speak of a method with space-free coefficients.
Then we abbreviate As(x,h), Bs(x,h), G(h,y,x) to
As(h), Bs(h), G(h,y). The stability of a method with space-free coeff-
icients can again be determined solely on the basis of the amplification matrix
G(h,y).
Theorems 9.13 and 9.15 extend
word for word to the Banach spaces placed by
case the
m = 1.
if
IF
is re-
Theorems 9.16, 9.31, and 9.34 also carry over
IItm.
in essence.
L2(Rm,cn)
All the proofs are almost the same as for the Basically, the only additional item we need is
m-dimensional Fourier transform, which is defined for
all f e L2
by (2")-m/2
a,(y) = 7n_(f)
f(x)exp(-i<x,y>)dx
rIl
xll2
= lim a Viw
Problems in several space variables
10.
171
The limit is taken with respect to the topology of As in the case
L2(Iltm,n).
m = 1, we have: is bijective.
(1)
jn
(2)
11-9n11 = II_V -11I = I.
(3)
Fn(Tk(f))(')
=
For differential equations with constant coefficients, the best stability criteria are obtained from the amplification matrix.
Even when the coefficients are not constant,
this route is still available in certain cases, for example, with hyperbolic systems.
Also, one can define
positive definite methods.
positive and
They are always stable.
For positive definite methods, we need (1)
C(h)
I As(x)TS
with
k = h/a.
S
(2)
1
=
E A5(x) s
(3)
All matrices
As(x)
are real, symmetric, and
positive semidefinite. (4)
For all multi-indices
s
and all
x,y EIRm
we
have
IIAs(x) -As (Y) 112
L > 0.
We consider positive methods only in the scalar case If
m = 1.
m > 1, they are of little significance for systems of
differential equations.
This is due to Condition (3) of De-
finition 8.4, which implies that the coefficients of the difference operators commute.
For
m > 1, the coefficients of
most systems of differential equations do not commute, and
172
INITIAL VALUE PROBLEMS
I.
hence neither do the coefficients of the difference operators. The positive methods occur in the Banach space B = If e
lim
If(x)j
= 01
11xL -Here the norm is the maximum norm. also defined in this space. (1)
e(x,h)C(h) _
k= h/h (2)
The operators
For positive
are
methods, we need
as(x,h)Tk + I bs(x,h)Tk0C(h)
k= h X
or
Tk
where
e(x,h) = E [as(x,h) + bs(x,h)],
A e1R+.
x e
he(0,ho]
s
(3)
as ,bs a CO(JRm, 1R) as(x,h) > 0,
bs(x,h) > 0
E as(x,h) > 1. S
For
m > 1, the so-called product methods occupy a
special place.
methods for
They arise from the "multiplication" of Their stability follows directly from
m = 1.
the stability of the factors.
More precisely, we have the
following.
Theorem 10.1:
MD,u
Let
B
be a Banach space and
{Cu(h)lh c (O,ho]},
a family of difference methods for properly posed problems.
u = 1(1)m m
(possibly different)
The difference method
MD = {C(h)lh a (O,ho]} is defined by
C(h) = C1(h)C2(h) ... Cm(h). MD
is stable if one of the following two conditions is
Problems in several space variables
10.
173
satisfied. (1)
For fixed
h e (0,h01, the operators
C}.(h), u =
1(l)m, commute. (2)
There exists a
such that
K > 0
p = 1(1)m,
IICp(h)II < 1+Kh,
h e (O,h0J.
If (1) holds, we can write
Proof:
m
IIC(h)nil < Since each of the
m
p n IIC(h)nil
p=1
factors is bounded, so is the product.
If (2) holds, we have the inequalities IIC(h)nil
< (1+Kh) mn < exp(mKT).
We now present a number of methods for the case In all the examples, A = h/k
or
a = h/k2, depending on the
order of the differential equation. Example 10.2:
Differential equation: m
ut(x,t) =
E
ap[ap(x)apu(x,t)],
ap = a/ax
p=1
where and
ap a C1(IRm, IR) , 0 < S < ap (x) v = 1(1)m. Iavap(x)I < K,
Method: C(h) =
[I-(l-a)AH]-lo[I+aAH]
where m
(a (X + 2kep)(Tkp-I) + ap (x- 2kep)(Tku-I)]
H = p=l
and
a e [0,1].
m > 1.
174
I.
INITIAL VALUE PROBLEMS
Amplification matrix: G(h,y,x) = [l+aaH]/[1-(1-a)Xi} where m
H =
I (au(x+ zkeu)[exp(ikyu)-1] u=1
+ au (x- Zkeu)[exp(-ikyu)-1]}. For
2mKaa <
1
the method is positive, and hence stable.
Subject to the
usual regularity conditions, the global error is at most + 0(k2)
for
a + 1/2
0(h2) + 0(k2)
for
a = 1/2
0(h)
If all
(Crank-Nicolson Method).
are constant, then m H -2 a (1-cos ky ) > -4mK.
au
u =1
-
u
u
Precisely when 2mK(2a-1)d <
1
we have JG(h,y)J <
1
and hence stability.
Theoretically speaking, there is nothing new in the case
m >
1
that wasn't contained in the case
m = 1.
Prac-
tically speaking, the implicit methods (a < 1) with few restricting stability conditions are very time consuming for m > 1.
A large system of linear equations has to be solved
for each time interval.
For
m = 1, the matrix of the system
is triangular, and five arithmetic operations per lattice point and time interval are required for the solution.
Thus
the total effort required by an implicit method is not very large in the case
m = 1.
For
m > 1, the matrix of the
10.
Problems in several space variables
175
system of equations is no longer triangular.
Even with an
optimal ordering of the lattice points, we get a band matrix where the width of the band grows as the number of lattice The solution of the system then requires
points increases.
considerable effort.
a
Example 10.3:
Differential equation: ut(xi,x2,t) = aailu(xl,x2,t) + 2bala2u(xl)x2,t) +
where
a > 0, c > 0, and
ca22u(xl,x2,t)
ac > b2.
Method:
C(h) = [I-(1-a)XH]-lo[I+aaH] a e [0,1]
where
and
H = a(Tkl+Tki-2I) + 1b(Tkl-T-1)(Tk2-Tk2
c(Tk2+Tk2-2I).
+
Amplification matrix: G(h,y) =
[1+aaH]/[1 (1 a)aH]
where
H = -2a[1-cos(kyl)]
2c[l-cos(ky2)].
-
The differential equation differs from the one in the previous example in the term and
2ba1a2u(xl,x2,t).
Since
ac >b2, it is nevertheless parabolic.
method is never positive, regardless of
a
a > 0, c > 0,
When and
b # 0, the X.
A sta-
bility criterion is obtainable only through the amplification matrix.
We set
wl = 2ky1
and
w2 = 2 ky2, and get
176
INITIAL VALUE PROBLEMS
I.
2w
2
H = -4a sin wI-8b sin wl sin w2 cos wl cos w2-4c sin w2 _ -4(a+c)+4a cos2wl-8b sin wl sin w2 cos wl cos w2 + 4c cos`w2. wI + Rn/2
Also, for
and
w2 + Rn/2 (R EZZ), let
c = sgn(sin wI sin w2) n = sgn(cos wl cos w2).
Thus we obtain two representations for
H = -4(/ sin wl - eT sin w2) -8[/aclsin w1 sin w21
H,
2
+ b sin 111 sin w2 cos wI cos w2]
and
H = -4(a+c) + 4(T cos w1 - n/E cos w2) 2 +8[V a7 1cos 11 cos w21-b sin wl sin w2 cos w1 cos w2]. Since
Jbi
< ac, the first term in the square brackets is
always the dominant one.
Hence,
-4(a+c) < H < 0.
Equality can occur at both ends of the expression. 2(a+c)(2a-1)A <
we have
IG(h,y)l
dent of
b.
exist
wl
< 1.
w2
such that
method is unstable for all Example 10.4:
1
This stability condition is indepen-
On the other hand, and
For
a
if
b2 > ac, there always
H > 0 and
IG(h,y)l >
and A.
o
ADI-method.
Differential equation:
ut(xi,x2)t) -a[a11u(x1,x2,t) + a22u(xl,x2,t)] + b1a1u(xl,x2,t) + b2a2u(xl,x2,t)
1.
The
Problems in several space variables
10.
where
a > 0
and
177
bl,b2 cIR.
Method:
C(h) = CI(h/2)oC2(h/2) Cp(h/2) = [Io [1+
? A(Tkp-2I+Tkp)-
4bpka(Tkp-Tkp)]
Zaa(Tko-2I+Tka)+ 4boka(Tka-Tk1)],
p = 1,2 Amplification matrix:
and
a = 3-p.
G(h,y) = GI(h,y)G2(h,y)
GP (h,y)
1-aa(l-coswp) + 1ibp/ sin wp 1+aa(l-coswp)
-
Zibp vET sin wp
and
wl = kyl
and
w2 = ky2.
The abbreviation ADI stands for Alternating Direction ImpZicit method.
The first ADI method was described by Peaceman-
Rachford (1955).
The method is of great practical signifi-
cance for the following reasons. fractions
lbpI/a
is very large.
ilities, one must then choose
k
Suppose that one of the To avoid practical instabvery small.
Otherwise, one
immediately encounters difficulties such as those in Example 9.21.
With an explicit method, the stability condition
(2maa < 1) demands
ah < 4 k2.
Hence
h
has to be chosen extremely small.
cit methods allow
h
Although impli-
to be chosen substantially larger, one
nevertheless has to solve very large systems of equations because the lattice point separation
k
this is also true for ADI methods.
The difference with other
is small.
Of course,
178
I.
INITIAL VALUE PROBLEMS
implicit methods is in the structure of the system of equations.
In each of the factors
C1
and
C2, the systems of
equations decompose into independent subsystems for the latx2 s constant
tice points
xl a constant, and the mat-
and
rices of the subsystems are triangular.
Only five to eight
arithmetic operations per lattice point and half time interval are required, and that is a justifiable effort. Note that
does not belong to
G1
C1.
The factors of
the amplification matrix are exchanged in the representation of the method.
Such a representation of the amplification
matrix is only possible with constant coefficients. In practice, one deals mostly with initial boundary value problems.
Stability then depends also on the nature of
Thus we must caution that the following remark
the region.
Rather
is directly applicable only to rectangular regions.
different results can occur when the region is not rectangular or the differential equations do not have constant coefficients.
We have [1-aa(l-cos wp)]2 + 4bp ha sin2w p
<
2
[l+aa(l-cos wp)]2 + Abp hA sin2wp and hence < 1.
IG(h,Y)I C
is stable for all
are unstable for large
A, although the factors A.
C1
and
C2
In order to solve the triangular
system of equations without pivoting, we need
-aa
>
-Ibplka.
This means k < 2 min(a/Ib1I,
If additionally, as < 1, then tice, it suffices to limit
k;
a/Ib2I) is also positive.
C
h
In prac-
can be chosen arbitrarily
Problems in several space variables
10.
179
0
large.
Example 10.5:
Differential equation:
ut(xl,x2,t) = iaa1a2u(xl,x2,t),
a cIR - {0}.
Method: C(h)
where
a c
[I-(l-a)XH]-lo[I+aAH]
[0,1]
and
H
T_
1
Amplification matrix: G(h,y) =
where
wl = ky1
1-iaaX sin wl sin w2 +ia sin wl sin w2 -a and
The differential equation is
w2 = ky2.
sometimes called a pseudo-parabolic equation.
It corresponds
to the real system a
at
ul(x1,x2,t) =-aala2u2(xl.x2,t)
aT u2(xl'x2,t)
It follows that for
a
_aa1a2u1(x1,x2,t).
u c C4(IR,4)
2
ul(xl,x2,t) _ -a22'11a22u1(xi,x2,t).
Solutions of the differential equation can be computed with Fourier transforms, analogously to Example 9.9. is formally the method of Example 10.3.
but is stable for Example 10.6:
a < 1/2.
The method
It is not positive,
a
Product method for symmetric hyperbolic systems.
Differential equation:
ut(x1,x2,t) = A1(x)alu(xl,x2,t) + A2(x)a2u(xl,x2,t).
180
I.
INITIAL VALUE PROBLEMS
where
Au E C2(IR2,MAT(n,n, ]R))
A(x) symmetric,
P(AV(x))
IIAU(x)-AU(x) 112 < L IIx-xII2,
bounded
x,k c IR2,
u = 1,2.
Method:
C(h) = 4{[I+XAl(x)][I+XA2(x)]Tkl'Tk2 + [I+AA1(x)][I-AA2(x)ITkl°Tk2
+ [I-XA1(x)][I+AA2(x)]Tkl°Tk2 + [I-AA1(x)][I-AA 2(x)]Tkl°Tk2}.
The method can also be derived from the Friedrichs method for m = 1.
To see this, consider the two systems of differential
equations
ut(x1,x2,t) = A1(x)81u(xl,x2,t) ut(xl,x2,t) = A2(x)82u(x1,x2,t).
In the first system, there is no derivative with respect to x2, and in the second, there is none with respect to
x1.
Thus, each system can be solved with the Friedrichs method. The variables of parameters.
x2
and
xi, respectively, only play the role
The methods are
Cu (h) = 2 [I+AAU(x)]Tkp
For
A sup IIAu(x)II <
1
+ 2 [I-XAV(x)]Tku.
the methods are positive definite.
By Theorem 8.12, there is then a 11c11 (h)II < 1+Kh,
K > 0
u = 1,2,
By Theorem 10.1, the product is stable.
such that h c (0,h0].
Problems in several srace variables
10.
181
C(h) = C1(h)°C2(h)
{[I+AA1(x)][I+XA,(x+kel)]Tkl°Tk2
+
T_
I
+ [I-AA1(x)][I+AA2(x-kel)]Tk1,Tk2 + [I-AA1(x)][I-AA2(x-kel)]Tkl°Tk2}. C
and
C
agree up to terms of order
0(h).
Hence
C
is
also stable for max
A
sup
p(Au (x)) < 1.
U=1,2 x EIR2 The consistency of a product method also follows immediately from the consistency of the factors.
We would like to demon-
strate this fact by means of this example.
Let
C2OR2,¢n).
u e
We have h-I[C(h)-I](u)
h-I[C1(h)-I](u) + h-1[C2(h)-I](u)
= +
h{h-2[C1(h)-I]0[C2(h)-I](u)}.
Since the Friedrichs method is consistent, the summands on the right
side are approximations for A1(x)a1u(x,t), A2(x)a2u(x,t)
and
hA1(x)a1[A2(x)a2u(x,t)]. Thus, up to
0(h), the left side is an approximation for A1(x)alu(x,t) + A2(x)a2u(x,t).
This establishes consistency for replaced by
C.
C.
For simplicity, C
This doesn't affect consistency, since
was
182
INITIAL VALUE PROBLEMS
I.
A [C(h)-C(h)] (u) T-
1
+ [I-?LA 1(x)][A2(x-kel)-A2(x)]Tkl0[Tk2-Tk2](u).
The difference is obviously of order h-1[C(h)-I](u)
Usually matrix
C
AI(x)A2(x)
remedied. A2
and
C
Let
C*
as well as
-
0(h2).
It follows that
h-1[C(h)-I](u) = 0(h).
are not positive definite, because the is not symmetric.
This deficiency can be
be formed from
by exchanging
Tkl
and
is positive definite.
Tk2.
C
Then the method
1
(C + C*)/2
Further details
on product methods can be found in Janenko (1971).
a
m-dimensional Friedrichs method.
Differential equation: m
ut(x,t) _
I
Au(x)3uu(x,t)
u=1
where Au e C2clRm,MAT(n,n,]R)) p(AU(x))
symmetric
bounded
(x)-A11 (R) II < L IIx-RII2,
11A
x,k a IItm,
u = 1(1)m.
11
Method: C(h) = I +
-1 Au(x)(Tku-Tku)
2A
and
All of the methods mentioned here are
too complicated for practical considerations.
Example 10.7:
A
m
+ Zm
u-1
E (Tku-2I+Tku) u=1
with r e IR. Amplification matrix: m
G(h,y,x) = (1-r)I + Ai
m
E A (x)sin w u=1
u
+ r I cos w E u m u u=1
183
Problems in several space variables
10.
where
mu = kyu
u = 1(1)m.
for
The differential equation
constitutes a symmetric hyperbolic system. The case
found in Mizohata (1973).
m = 2
The theory can be
was covered in
the preceding example.
The m-dimensional wave equation m
bu(x)au[bu(x)auv(x,t)]
vtt(x,t) u=1
bu a C2 (Rm, IIt+) can be reduced to such a system by means of the substitution
u(x,t) _ (vt(x,t), bi(x)alv(x,t),...,bm(x)amv(x,t)) In this special case, the coefficients of the system are elements of
MAT(m+1,m+1,7R): Au(x)
(a0T)(x))
where
aQT)(x)
_
b(x)
for
a =
bu(x)
for
T
and
T = u+l
and
a = u+l
otherwise.
0
For
1
m = r = 1, this obviously is the Friedrichs method preFor
viously considered.
m = 2, this is simpler than the
product method of Example 10.6.
For
m > 2, the m-dimensional
Friedrichs method is substantially simpler than the product methods which can be created for these cases. C
is consistent for all
we skip the proof. r e (0,1]
and
C
a elR+
and all
r CIR, but
is positive definite exactly when
a max
sup
u=1(1)m x e1Rm
-
p(A (x)) < r/m, u
for it is exactly under these conditions that all the matrices
184
INITIAL VALUE PROBLEMS
I.
(1-r)I, AA
p
are positive semidefinite.
c IR and
-AA
and
+ rI m
p
Ap(x) = CI, p = l(1)m,
For
it follows for
r = 1
+ rI m
mp = n/2
that
IIG(h,y,x) 112 = Acm. By Theorem 9.31, the stability condition, at least in this special case, agrees with the condition under which the method is positive definite.
However, there are also cases
in which the method is stable but not positive definite.
We want to compare the above condition on the Friedrichs method for
m = 2, r - 1
the product method. max
h
with our stability condition for
The former is sup
p=1(1)2 x EIR2
-
p(A (x)) < k/2 p
and the latter, h
max
sup
p(A (x)) < k.
p=l (1) 2 x d R2
U
However, one also has to take the separation of the lattice points into account. (see Figure 10.8).
They are
and
respectively
For the product method, the ratio of the
O
O
O
O
k
k
0-----=-O
0-k
O Friedrichs method
product method Figure 10.8
-
10.
185
Problems in several space variables
maximum possible time increment to this separation never-
/.
theless is greater by a factor of
The product method
provides a better approximation for the domain of determinancy of the differential equation.
That is the general ad-
vantage of the product method, and guarantees it some attenIt is also called optimally stable, which is a way of
tion.
saying that its stability condition is the Courant-FriedrichsLewy condition.
o
Lax-Wendroff-Richtmyer method.
Example 10.9:
Differential equation as in Example 10.7, with the additional conditions
Au c C3(gtm,MAT(n,n, ]R))
IIaQaTAU(x)II2 bounded, u = 1(1)m, o = 1(1)m, T = 1(1)m. Method: m
C(h) =
I
+ So [I +
ZS
+ 2m
)
1
with r e IR S =
(Tku-2I+Tku)l
and Ap(x)(Tku-Tku).
Za
u=1
For
m = r = 1
and
Au(x) = A =_ constant, we have the ordin-
ary Lax-Wendroff method (cf. Example 9.26), for then, with Tk = Tkl, XA(Tk - Tk
S = 2
C(h) = I+ ?AA(Tk-Tkl)+ 1X2A2(Tk-2I+Tk2)+ 4AA(T2-Tk2) ZAA(Tk-Tk1) C(h) =
I
+ 4XA(T2-Tk2) +
1A2A2(Tk-2I+Tk2).
186
INITIAL VALUE PROBLEMS
I.
Replacing
and
k
a/2
by
yields the expression
X
AA(Tk-Tkl) + 2 a2A2(Tk-2I+Tkl).
+
I
by
2k
2
In any case, when
r = 1, C
only contains powers
Tk
with
m
even sums
s
u=1 u
Figure 10.10 shows which lattice points
.
0 O
0
G
k
)E
G-44- 0
O
aE
c
0 Figure 10.10
are used to compute only when
r # 1.
for
C
m = 2.
The points
*
are used
The Lax-Wendroff-Richtmyer method has or-
der of consistency
0(h2).
It is perhaps the most important
method for dealing with symmetric hyperbolic systems. choice
r e (0,1)
The
is sometimes to be recommended for gener-
alizations to nonlinear problems. We present a short sketch of the consistency proof. It follows from m
AU(x)auu(x,t)
ut(x,t) u=1
that
m
m
utt(x,t) _ u=1
For
Av(x)avu(x,t)].
Av(x)au[ v=1
u e C3(IRm,4n), one shows sequentially that 0
Su(x,t) = h E Au (x)a u(x,t) + 0(h3) V
U=1
ZS2u(x,t) =
2h2
E
u-1
A11 (x)ap[ E Av(x)avu(x,t)] + 0(h3) v=1
Problems in several space variables
10.
187
m (Tku-2I+Tku)u(x,t)
2m S
1
1
hk2
E A (x)93 u(x t) + O(h3 ). I uvv u=1 v=1 u '
Altogether, this yields C(h)u(x,t) = u(x,t) + hut(x,t) + Zh2utt(x,t) + O(h3) = u(x,t+h) + O(h3).
We want to derive sufficient stability criteria with the aid of the Lax-Nirenberg Theorem 9.34 (cf. Meuer 1972).
However,
In reducing
the theorem is not directly applicable.
to
C
the normal form B5(x,h)Tk
C(h) _ s
one ordinarily obtains coefficients depend on
h.
For example, for
S2 = 1X2A(x)A(x+g)T2 -
Bs(x,h)
which actually
m = 1, we have
[4A2A(x)A(x+g)
+
4A2A(x)A(x-g)]I
+ 4A2A(x)A(x-g)Tk2 where
A = A1, g = kel, and
But the operator
Tk = Tkl.
Bs(x,0)Tk
C*(h) _ s
has coefficients which are independent of
h.
One easily
shows: (1)
IIC(h)
-
C*(h)II2 = O(h).
Thus
C
both stable or both unstable. (2)
For every
II [C(h) Hence
-
u c Co(1Rmn)
C* (h) ] (u) II2
=
we have
O(h2)
.
C* is at least first order consistent.
and
C*
are
188
INITIAL VALUE PROBLEMS
I.
has amplification matrix
C*
(3)
m
G*(h,y,x) =
+ S((1-r)I + 2 + mI
I
cos w V=1
wV = kyV
where
(V = l(1)m) and m
s = is
E
AV(x)sin wV.
u=1
For
m = 1, we have C(h)
- C*(h) = 8A2A(x)[A(x+g) -
+
8A2A(x)[A(x+g) + A(x-g)
- A(x)]T2 - 2A(x)]I
IX2A(x)[A(x-g) - A(x)IT k. 2
(1) follows immediately.
The proof of (2) depends on the dif-
ferences C(h)
C*(h) = 4X2gA(x)A'(x)[Tk-T-2] + 0(h2).
-
m > 1, we leave this to the reader.
For
Now we can apply the Lax-Nirenberg Theorem 9.34 to
C*.
Then it suffices for stability that
IIG*(h,y,x) 112 < 1. By Theorem 9.31 this condition is also necessary. H
be the product of
matrix
G*(h,y,x)
Now let
with the Hermite transposed
(G*)H, in
P =
AV(x) sin wV
E
V=1
and let
m r1
I
m
=
cos w
V=1
We have of
that
-1 < n <
A, r, and
h.
1.
r1
assumes all these values independently
It follows from the Schwartz inequality
Problems in several space variables
10.
189
2I
m
Cosw
n2 < m u=1
may be represented as follows:
H
H = [IH = P
2X2P2-iX(l-r+rn)P][I-
(I- ZX2P2)2 + X2(1-r+rn)2P2.
is real and symmetric.
real.
ZX2P2+iX(l-r+rn)P]
The eigenvalues
To every eigenvalue
eigenvalue
of
a
H.
1.
of
P
of
P
are
there corresponds an
Thus it is both necessary and suffici-
ent for the stability of greater than
a
a
and
C*
C
that
&
never be
Hence we must examine the following inequal-
ity:
a = (1
-
X2a2)2 + X2(1-r+rn)2a2 < 1.
a = 0, this is always satisfied.
For
if all the matrices
Au(x)
is always zero only
a
are zero everywhere.
trivial case, one has stability for all
X
and
In this Y.
In all
other cases, we can restrict ourselves to those combinations of
x
wu
and
with
p(P) > 0, and consider an equivalent
set of inequalities: 4X2p(P)2 + (1-r+rn) 2 < 1. For
r < 0
or
contradictory. necessary.
r > 1, n < -1/r, this inequality is selfIn the nontrivial cases, then, r E (0,1]
For these
r, the inequalities can be converted
to the equivalent inequalities
4r
We now set
is
X2p(P)2 + n2-(1-r) (1-n)2 < 1.
190
INITIAL VALUE PROBLEMS
I.
max 11=1(1)m
K =
p(A (x))
sup
x elRm
and assert that r e (0,1]
ZAK <
and
is sufficient for the stability of let
w
21
m C*
(10.12)
.
and
C.
be an arbitrary eigenvector of a matrix
"w'12 = 1
and
To see this, P
with
= aw.
P (w)
m
[wTAp(x)w]sin wu.
a = u=1
Again we apply the Schwartz inequality to obtain m
m
[wTA (x)w]2
a2 <
u
u=1
sin2w u=1
u
m
a2 < m K2
sin2wu
I
u=1 m
p(P)2 <
in
K2 I sin2wu u=1 in
A 2p(P)2 < m
E
sinwu
u=1
4rA2p(P)2 + n2 <
1.
This inequality is somewhat stronger than (10.11).
There remains the question whether stability condition (10.12) is at all realistic. matrices
Au(x)
The answer is that whenever the
have some special structure, it is worthwhile
to refer back to the necessary and sufficient condition (10.11). tion.
A well-known example is the generalized wave equa-
As noted in Example 10.7, for this equation we have Au(x)
(aar)())
Problems in several space variables
10.
191
where
a
'
for a = 1, T = p+l and
(x) = by (x)
T= 1,
a = p+l
otherwise.
a6T)(x) = 0 Letting
K=
max sup p=1(1)m x elRm
bp (x)
we also have max
K =
sup
p(A (x)). P
pal (1)m x EIItm But in contrast to the above, m
p(P)2 < KZ
sin2w
Y
U=1
With the help of (10.11), one obtains a condition which is
better by a factor of V: r c
and
(0,1]
ZXK <
.
The same weakening of the stability condition (factor Am-)
is
also possible for the m-dimensional Friedrichs method, in the case of the generalized wave equation. So far we have ignored general methods for which there are different spacings rections
ep.
of the lattice points in the diX = h/k
Instead of
then have possibly ap =h/kp
kp
A = h/k2, one could
or
different step increment ratios
m
or A,, = h/k2.
Such methods have definite practi-
11
Now one can obtain
cal significance.
kl = k2 =
...
= km
with the coordinate transformation xp =
apxp
where
ap
> 0, p = l(l)m
This transformation changes the coefficients of the differential equation.
They are multiplied by
ap
or
a2
or
192
In many cases, the following approach has proved use-
auaV. ful.
INITIAL VALUE PROBLEMS
I.
First transform the coordinates so that the coeffici-
ents mapped into each other by the change of variables are nearly the same.
For a symmetric hyperbolic system this means
p(A (x)) _ sup 1 x e lRm
...
= sup
p(A (x)). M(x)).
x C 1Rm
Then choose the increments independent of ponds to a method with
ku = k/ou
V.
This corres-
in the original coordinate
system.
11.
Extrapolation methods
All of the concrete examples of difference methods which we have discussed so far have been convergent of first or second order.
Such simple methods are actually of great
significance in practice.
This will come as a great surprise
to anyone familiar with the situation for ordinary differential equations, for there in practice one doesn't consider methods of less than fourth order convergence. High precision can only be achieved with methods of high order convergence.
This is especially true for partial
differential equations.
Consider a method with
variables, of k-th order, and with
m
space
h/4x = A a constant.
the computational effort for a fixed time interval
Then
[0,T]
is
O(h-m-1-e)
For explicit methods, e = 0, while for implicit
methods, e >
0
at times.
The latter depends on the amount
of effort required to solve the system of equations. case, m+l+c >
2.
In any
To improve the precision by a factor of
thus is to multiply the computational effort by a factor of q(m+l+e)/k
q
11.
Extrapolation methods
193
In solving a parabolic _'ifferential equation we have
as a rule that 0(h
-m/2 -
1
-
h/(Ax) 2 = c)
A
The growth law
- -or.stant.
for the computational effort appears more
However, a remainder of O(hk) + O((tx)k) _ q(m+2+2e)/k. q = q(m+2+2e)/2k implies q = is only
favorable.
O(hk/2)
achieved with a remainder
O(hk)
0((Ax)2k)
+
=
O(hk).
How then is one to explain the preference for simpler methods in practice?
There are in fact a number of import-
ant reasons for this, which we will briefly discuss. (1)
involved.
In many applications, a complicated geometry is The boundary conditions (and sometimes, insuffici-
ently smooth coefficients for the differential equations) lead to solutions which are only once or twice differentiable. Then methods of higher order carry no advantage.
For ordin-
ary differential equations, there is no influence of geometry or of boundary conditions in this sense; with several space variables, however, difficulties of this sort become dominant. (2)
The stability question is grounds enough to re-
strict oneself to those few types of methods for which there A method which is stable
is sufficient experience in hand.
for a pure initial value problem with equations with arbitrarily often differentiable coefficients, may well lose this stability in the face of boundary conditions, less smooth coefficients, or nonlinearities.
In addition, stability is a
conclusion based on incrementations quite unclear how
h
0
h < h
-
.
o
It is often
depends on the above named influences.
In this complicated theoretical situation, practical experience becomes a decisive factor. (3)
The precision demanded by engineers and physicists
194
I.
is often quite modest.
INITIAL VALUE PROBLEMS
This fact is usually unnoticed in the
context of ordinary differential equations, since the computing times involved are quite insignificant.
As a result, the
question of precision demanded is barely discussed.
As with
the evaluation of simple transcendental functions, one simply uses the mantissa length of the machine numbers as a basis for precision.
The numerical solution of partial differential
equations, however, quickly can become so expensive, that the engineer or physicist would rather reduce the demands for This cost constraint may well be relaxed with
precision.
future technological progress in hardware.
These arguments should not be taken to mean that higher order convergence methods have no future.
Indeed one
would hope that their significance would gradually increase. The derivation of such methods is given a powerful assist by extrapolation methods.
We begin with an explanation of the
basic procedure of these methods.
In order to keep the for-
mulas from getting too long, we will restrict ourselves to problems in and
1R2, with one space and one time variable, x
t.
The starting point is a properly posed problem and a corresponding consistent and stable difference method. solutions for considered.
noted by
h.
s-times differentiable initial functions are The step size of the difference method is deThe foundation of all extrapolation methods is
the following assumption: Assumption:
Only
The solutions
w(x,t,h)
method have an asymptotic expansion
of the difference
Extrapolation methods
11.
r-1
w(x,t,h)
195
y
T,(x,t)h j + p(x,t,h),
=
(x,t)
E G,
v=0
h e (O,h01
where
r >
and
2
11p(x,t,h) JI Tv
= 0(hy r),
G + ¢n,
:
v = 0(l)r-1
0 = Yo < Y1 To
(x,t) a G, h e (O,ho]
.
< Yr.
is the desired exact solution of the problem.
a
We begin with a discussion of what is called global extrapolation.
method for
r
For this, one carries out the difference different incrementations
for the entire time interval. dependent of each other. tk/hj c2Z
for all
j
= 1(1)r, each
computations are in-
r
For each level
t = tk, where
= 1(1)r, one can now form a linear com-
w(x,tk,hl,...,hr)
bination
The
hj, j
of the quantities
w(x,tk,hj)
so that w(x,tk,hip.... hr) = T0(x,y) + R.
Letting
by = qvh, v = 1(1)r, and letting
h
converge to
zero, we get
R = 0(hlr) w
is computed recursively:
Tj,o = w(x,tk,hj+l), T.
j
T J.,v-1 B Jv[T J.
= 0(1)r-l l,v-1-T J.,v-1 ]'
J ,v=
1(1)r-1,
v
j
= v(1)r-1
w(x,tk,hl,...,hr) = Tr-l,r-1' In general the coefficients ways on the step sizes
hj
8jv cIR
depend in complicated
and the exponents
yv.
In the
196
I.
INITIAL VALUE PROBLEMS
following two important special cases, however, the computation is relatively simple. Case 1:
hi = lhj 1, _
= 2(1)r,
Yv
Y 2
v-1
Yv = vy, y > 0, v = 1(1)r, hj
6jv
arbitrary
1
Sjv Case 2:
j
=
arbitrary
1
(h.
Y
-Jh
-1
J
l
The background can be found in Stoer-Bulirsch, 1980, Chapter 2, and Grigorieff (1972), Chapter 5.
This procedure, by the way,
is well-known for Romberg and Bulirsch quadrature and mid-
point rule extrapolation for ordinary differential equations (cf. Stoer-Bulirsch 1980).
In practice, the difference method is only carried out for finitely many values of sible for those
x
Extrapolation is then pos-
x.
which occur for all increments
The
h j*
case
hj/(tx)2 = constant
ratios of the
hj's
presents extra difficulties.
The
are very important, both for the size of
the remainder and the computational effort.
For solving hy-
perbolic differential equations one can also use the Romberg or the Bulirsch sequence. Romberg sequence: hj = h/2j-l,
j
= 1(1)r.
Bulirsch sequence: hl = h, h2j = h/2J,
h2j±1 = h/(3.2J 1),
j > 1.
Because of the difficulties associated with the case hj/(Ax)e - constant, it is wise to use a spacing of the
(Ax)j
11.
Extrapolation methods
197
based on these sequences for solving parabolic differential equations.
In principle, one could use other sequences for
global extrapolation, however.
Before applying an extrapolation method, we ask ourselves two decisive questions: expansion?
Does there exist an asymptotic
What are the exponents
would be optimal.
yv?
Naturally
yv= 2v
Usually one must be satisfied with yv = v.
In certain problems, nonintegral exponents can occur.
In
general the derivation of an asymptotic expansion is a very difficult theoretical problem.
This is true even for those
cases where practical experience speaks for the existence of such expansions.
However, the proofs are relatively simple
for linear initial value problems without boundary conditions.
As an example we use the problem ut(x,t) = A(x)ux(x,t) + q(x,t),
u(x,O)
x SIR, t c (0,T)
x SIR.
4 (X)'
The conditions on the coefficient matrix have to be quite strict.
We demand
A e C (IR, MAT(n,n,IR)) A(x)
real and symmetric,
IIA(x)-A(R) Ij < L2Ix-XI , Let the
w(x,t,h)
IjA(x)II < L1
x,R a IR.
be the approximate values obtained with
the Friedrichs method.
Let a fixed
A = h/Ax > 0
be chosen
and let A sup
p(A(x)) < 1.
x SIR The method is consistent and stable in the Banach space L2OR,4n)
(cf. Example 8.9).
In the case of an inhomogeneous
198
INITIAL VALUE PROBLEMS
I.
equation, we use the formula w(x,t+h,h) = 2[I+XA(x)]w(x+Ax,t,h) +
Theorem 11.1:
Let
Z[I-XA(x)]w(x-Ax,t,h) + hq(x,t).
r e 1N, 0 e Co (R, IRn)
h c (O,h0]
Then it is true for all
[O,T],IRn).
q c Co (IR x
and
that
r-l
w(x,t,h) _
TV(x,t)hV + p(x,t,h),
I
v=0
x cIR, t e [O,T],
t/h c ZZ
TV e co(R x (0,T1, ]Rn) O(hr)
uniformly in
t.
Since there is nothing to prove for
Proof:
pose that
r = 1, we sup-
We use the notation
r > 1.
V = Co(dt, IRT),
W = Co(JR x
[0,T], IRn).
The most important tool for the proof is the fact that for $ c V
q c W, the solution
and
longs to
W.
u
of the above problem be-
This is a special case of the existence and
uniqueness theorems for linear hyperbolic systems (cf., e.g., Mizohata 1973).
For arbitrary
v e W, we examine the differ-
ence quotients
Q1(v)(x,t,h) = h-1{v(x,t+h)
-
Z[v(x+Ox,t)+v(x-Ax,t)]}
Q2(v)(x,t,h) = (2ox)-1{v(x+Ax,t)-v(x-Ax,t)} Q(v) = Q1(v) - A(x)Q2(v)
Although
w(x,t,h)
apply
to
Q
is only defined for
t/h c2Z, one can
w:
q(x,t), x cIR, tc[0,T], t/h cZZ, hc(O,h01.
11.
Extrapolation methods
For
v e W, Q1(v)
and
199
can be expanded separately
Q2(v)
with Taylor's series
Q(v) (x,t,h) = vt(x,t) - A(x)vx(x,t) s
+
hv-1DV(v)(x,t)
+ hsZ(x,t,h).
s
v2 Here s
s
is arbitrary.
c IN
vanishes.
The operators
operators containing order
We have
v.
For fixed
h,
x, t, and
h.
A(x)
For
The quantities
2
to
DV, v = 2(1)00, are differential
as well as partial derivatives of
DV(v) c W. e W.
s = 1, the sum from
The support of
Z(x,t,h)
Z
is bounded.
is bounded for all
tv e W, v = 0(1)r-1
are defined re-
cursively:
v=0: a to(x,t) = A(x)Bz T0(x,t)+q(x,t)
te
x e IR,
To(x,0) = fi(x)
[0,T]
V-1 v>0:
8t TV(x,t) = A(x)8z TV(x,t)-uIODV+1-u(tu)(x,t)
t e [0,T]
x e IR, TV(x,0) =
It follows that tients
Q(TV)
0
TV E W, v = 0(1)r-1.
The difference quo-
yield 2r-1
U(T0)(x,t)+h2r-1z0(x,t,h)
hµ-'D
Q(T0)(x,t,h) = q(x,t)+ u=2
2r-2v-1
V-1
E Dv+1
Q(t)(x,t,h)
u(tu)(x,t)+
+
h2r-2v-lz(x,t,h),
In the last equation, the sum from when
v = r-l.
I
h11- D
u=2
P=O
2
v = 1(1)r-1. to
2r-2v-1
vanishes
Next the v-th equation is multiplied by
hV
INITIAL VALUE PROBLEMS
206
I.
and all the equations are added.
Letting
r-2
u=0
v r-v u-1
h
h
h u-1 D u (t
h
v=0
u=r-v+l
r-1 h
+
u
v 2r-2v-1
r-2 F
V D (T) (x,t)
u=2
v=O
+
Dv+1-u(ru) (x,t)
I
11v
v=l
+
we get
v-1
r-l
Q('1) (x,L,1i) = q(x,t)- E
T = Ervhv
2r v 1
Zv
) (x,t)
(x,t,h).
v=0
The first two double sums are actually the same, except for sign.
To see this, substitute
in the second,
v+u-l
obtaining r-1
r-2
I
v=0 1=v+1
hVD=_v+1 (TV)(x,t)
Then change the order of summation: r-1
u
h
u=1
u-1 1
Du+l-v(TV)(x,t)
v=0
Now the substitution
(u,v)
-
(v,u)
yields the first double
sum.
While the first two terms in this representation of Q(T)
cancel, the last two contain a common factor of
hr.
Thus we get
Q(T)(x,t,h) = q(x,t) + hrZ(x,t,h), x E IR,
Z
t
c
[0,T], t+h E [0,T],
has the same properties as
ous for fixed h c (0,h0]. tion
Z
v
h, bounded for all The quanity
T-W
:
h e (0,h01.
bounded support, continux E IR,
t
e [0,T], and
satisfies the difference equa-
11.
Extrapolation methods
Q(T)(x,t,h) r(x,0,h)
Thus, r-w
-
201
Q(w)(x,t,h) = hrZ(x,t,h)
- w(x,0,h) = 0.
is a solution of the Friedrichs method with initial
function
and inhomogeneity
0
hrZ(x,t,h).
It follows from
the stability of the method and from t/h e2Z
and
h e (0,ho], that for these
IIT(',t,h)
for
L t
and
h,
o
-
From the practical point of view, the restriction to functions
and
q
with compact support is inconsequential
because of the finite domain of dependence of the differential equation and the difference method.
Only the differen-
tiability conditions are of significance. do not have a finite dependency domain. V
W
and
Parabolic equations The vector spaces
are therefore not suitable for these differential
equations.
However, they can be replaced by vector spaces of
those functions for which sup
1 0 )(x)xkI <
j
= 0(1)s, k = l(1)m
x dIR sup 1(ax)jq(x,t)xkl < =,
j
= 0(1)s, k = l(l)oo, t = [0,T].
x c 1R s e 1N
suitable but fixed.
These spaces could also have been used in Theorem 11.1.
The
proof of a similar theorem for the Courant-Isaacson-Rees method would founder, for the splitting is not differentiable in
A(x) = A+(x)
x, i.e., just because
A(x)
- A-(x) is
arbitrarily often differentiable, it does not follow that this is necessarily so for
A+(x)
and
A_(x).
Global extrapolation does not correspond exactly to
202
INITIAL VALUE PROBLEMS
I.
the model of midpoint rule extrapolation for ordinary differential equations, for there one has a case of local extrapolation.
Although the latter can be used with partial differ-
ential equations only in exceptional cases, we do want to present a short description of the method here. h = nlhl = n2h2 =
.
= nrhr,
nj
c IN,
Let
j
= 1(1)r.
At first the difference method is only carried out for the interval
For
[O,h].
tions for
T
0
t = h, there are then
r
approxima-
With the aid of the Neville
available.
scheme, a higher order approximation for
t = h
is computed.
The quantities obtained through this approximation then become the initial values for the interval
[h,2h].
There are
two difficulties with this: (1)
points
When the computation is based on finitely many
x, the extrapolation is only possible for those
which are used in all
means that for Since
j
computations.
r
= 1(1)r, the same
A = hj/(ox)j a constant
for the larger increments
hj
or
x
Practically, this
x-values must be used. A = hj/(Ax)e a constant,
the method has to be carried
out repeatedly, with the lattice shifted in the
x-direction.
This leads to additional difficulties except for pure initial value problems.
In any case, the computational effort is in-
creased by this. (2)
Local extrapolation of a difference method is a
new difference method.
Its stability does not follow from
the stability of the method being extrapolated.
Frequently
the new method is not stable, and then local extrapolation is not applicable.
Occasionally so-called weakly stable methods
Extrapolation methods
11.
203
arise, which yield useful results with
h
values that are
Insofar as stability is present, this must be
not too small.
demonstrated independently of the stability of the original Local extrapolation therefore is a heuristic method
method.
in the search for higher order methods. The advantages of local over global extrapolation, however, are obvious.
For one thing, not as many intermedi-
ate results have to be stored, so that the programming task For another, the step size
is simplified.
in the interval
can be changed
h
The Neville scheme yields good in-
[0,T].
formation for the control of the step size.
In this way the
method attains a greater flexibility, which can be exploited to shorten the total computing time.
As an example of local extrapolation, we again examine the Friedrichs method above.
for the problem considered
C(h)
The asymptotic expansion begins with
hT1(x,y) + h2T2(x,y).
Let
r = 2, hl = h, and
T0(x,y) + h2 = h/2.
Then
E2(h) = 2(C(h/2))2 - C(h) is a second order method. Let
Ax = h/A C(h) =
g = Ax/2.
2[I+AA(x)]Tg +
2[I+XA(x)]Tg +
C(h/2) = 2(C(h/2))2
and
We check to see if it is stable. Then
2[I-AA(x)]Tg2
Z[I-XA(x)]T-l
2[I+AA(x)][I+AA(x+g)]T2
= +
2[I+AA(x)][I-AA(x+g)]T0
+ 2[I-AA(x)][I+AA(x-g)]T0 +
2[I-AA(x)][I-AA(x-g)]T92
204
1.
INITIAL VALUE PROBLEMS
E2(h) = ZA[I+AA(x)]A(x+g)T2 + I- ZX[I+XA(x)]A(x+g)+ ?A[I-AA(x)]A(x-g) -
ZA[I-XA(x)]A(x-g)Tg2.
By Theorem 5.13, terms of order stability.
method with
Therefore
E2(h)
0(h)
have no influence on
is stable exactly when the
E2(h), created by replacing
A(x+g)
A(x-g)
and
A(x), is stable:
E2(h) = ZA[I+AA(x)]A(x)T9+I-X2A(x)2- ZA[I-XA(x)]A(x)Tg2 = 1+ 2XA(x)(TAx-TA1)+ ZX2A(x)2(TAx-2I+TAX). For
A(x) ° constant, E2(h)
Example 9.26). A(x)
is the Lax-Wendroff method (cf.
This method is stable for
Xp(A(x)) < 1.
If
is real, symmetric, and constant, it even follows that
II E2 (h)112 < 1. With the help of Theorem 9.34 (Lax-Nirenberg) we obtain a sufficient stability condition for nonconstant and
E2(h)
A.
E2(h)
are stable under the following conditions:
(1)
A E C2(IR,MAT(n,n, IR))
(2)
A(x)
(3)
The first and second derivatives of
(4)
Ap(A(x)) < 1,
is always symmetric A
are bounded
x EIR.
By Theorem 9.31, Condition (4) is also necessary for stability.
In the constant coefficient case, E2(h)
with the special case Example 10.9.
m = 1, r = 1
of method
coincides C(h)
of
Both methods have the same order of consistency
11.
Extrapolation methods
205
and the same stability condition, but they are different for nonconstant
A.
The difference E2(h)-(C(h/2))2 = (C(h/2))2-C(h)
gives a
good indication of order of magnitude of the local error. can use it for stepwise control.
One
In this respect, local
extrapolation of the Friedrichs method has an advantage over direct application of the Lax-Wendroff method. The derivation of
E2(h)
the amplification matrix of
can also be carried through
C(h).
C(h/2)
has amplifica-
tion matrix G(h/2,y,x) = cos w = yg = 2 yAx.
It follows that
H2(h,y,x) = 2G(h/2,y,x)2 - G(h,y,x) 2,12
sin 2w-A(x)
iasin 2w-A(x).
That is the amplification matrix of
Through further ex-
E2.
trapolation, we will now try to derive a method
E3
of third
order consistency: E3(2h) =
E2(h) 2
-
3 E2(2h).
3
Consistency is obvious, since there exists an asymptotic expansion.
We have to investigate the amplification matrix
H3(2h,y,x) =
H2(h,y,x)2
-
3 H2(2h,y,x)
3
Let
p
be an eigenvalue of XA(x), and
ponding eigenvalues of Then
n2,n2,3
H2(h,y,x), H2(2h,y,x), and
the corresH3(2h,y,x).
206
INITIAL VALUE PROBLEMS
I.
n2 = 1-2w2u2
2 w4u2 +
+ i[2wu
w3u]
-
+
O(Iw15)
-
8w3u3)
3 n2 = 1-8w211 2 + 30 w4u2 +
4w41j 4
+ i[4wu -
8
w3u
+
O(Iw15)
3
n2 = 1-8w2u2 + 32 w4u2 + i[4wu - 32 w3u] + o(Iw15) n3 = 1-8w2u2
+ 3
w4u2
16 w4 u4
+
+ i[4wp In312
= 1
-
32 w3u3) + O(IwIS)
w4(u2-u4) + O(IwIS).
+
3 For stability it is necessary that IuI
> 1.
On the other hand, for H2(2h,y,x) =
In3I < 1, that is,
w = n/2 we have
I
H2(h,y,x) = I-2A2A(x)2
n3 = 1 + 3 (u4-u2) and hence the condition
IuI
< 1.
Thus
if by chance all of the eigenvalues of 0
or
-1, for all
x eIR.
E3 XA(x)
is stable only are
+1
or
In this exceptional case, the
Friedrichs method turns into a characteristic method, and thus need not concern us here.
For characteristic methods, local extrapolation is almost always possible as with ordinary differential tions.
present.
This is mostly true even if boundary conditions are The theoretical background can be found in Hackbusch
(1973), (1977).
PART II. BOUNDARY VALUE PROBLEMS FOR ELLIPTIC DIFFERENTIAL EQUATIONS
12.
Properly posed boundary value problems Boundary value problems for elliptic differential equa-
tions are of great significance in physics and engineering.
They arise, among other places, in the areas of fluid dynamics, electrodynamics, stationary heat and mass transport (diffusion), statics, and reactor physics (neutron transport).
In
contrast to boundary value problems, initial value problems for elliptic differential equations are not properly posed as a rule (cf. Example 1.14).
Within mathematics itself the theory of elliptic differential equations appears in numerous other areas.
For a
long time the theory was a by-product of the theory of functions and the calculus of variations.
To this day variational
methods are of great practical significance for the numerical solution of boundary value problems for elliptic differential equations.
Function theoretical methods can frequently be
used to find a closed solution for, or at least greatly simplify, planar problems.
The following examples should clarify the relationship 207
208
BOUNDARY VALUE PROBLEMS
II.
between boundary value problems and certain questions of function theory and the calculus of variations. G
Throughout,
will be a simply connected bounded region in
continuously differentiable boundary
IR2
with a
aG.
EuZer differential equation from the calculus
Example 12.1:
of variations.
Find a mapping
u: G -+]R
which satisfies the
following conditions: (1)
is continuous on
u
entiable on
and continuously differ-
G
G.
(2)
u(x,y) = (x,y)
(3)
u
for all
(x,y) E aG.
minimizes the integral
I[w] = If
[a1(x,Y)wx(x,y)2
+
a2(x,y)wy(x,Y)2
G +
c(x,y)w(x,y)2
2q(x,y)w(x,y)]dxdy
-
in the class of all functions
Here
al,a2 a
C1 (G, ]R)
,
c,q c
w
satisfying (1) and (2).
C1 (G, IR)
al(x,y) > a >
a2 (x,y) > a > c(x,Y) > 0.
,
and ip E C1 (aG, IR)
with
0
0
(x,y)
E
It is known from the calculus of variations that this problem has a uniquely determined solution (cf., e.g., GilbargTrudinger 1977, Ch. 10.5). u
In addition it can be shown that
is twice continuously differentiable on
G
and solves the
following boundary value problem: -[al(x,y)ux]x -
[a2(x,Y)uy]y + c(x,y)u = q(x,y), (x,y) E G
u(x,y) = 'P(x,y),
(x,y)
a
G.
(12.2)
12.
Properly posed boundary value problems
209
The differential equation is called the Euler differential equation for the variational problem.
Its principal part is
-aluxx - a2uyy.
The differential operator 2
a2
__7 ax
ay
2
is called the Laplace operator (Laplacian).
In polar coor-
dinates,
x=rcos0 y = r sin it looks like a2
Dr
1
+
a
r 3r
1
+
a2
r 7 ao2
The equation -°u(x,y) = q(x,y)
is called the Poisson equation and -°u(x,y) + cu(x,y) = q(x,y),
c = constant
is called the Helmholtz equation.
With boundary value problems, as with initial value problems, there arises the question of whether the given problem is uniquely solvable and if this solution depends continuously on the preconditions.
In Equation (12.2) the
preconditions are the functions
and
q
ip.
Strictly speak-
ing, one should also examine the effect of "small deformations" of the boundary curve.
Because of the special prob-
lems this entails, we will avoid this issue.
For many bound-
ary value problems, both the uniqueness of the solution and its continuous dependence on the preconditions follows from
210
BOUNDARY VALUE PROBLEMS
II.
the maximum-minimum principle (extremum principle). Maximum-minimum principle.
Theorem 12.3:
q(x,y) > 0 (q(x,y) < 0) for all
and
every nonconstant solution
If
c(x,y) > 0
(x,y) c G, then
of differential equation (12.2)
u
assumes its minimum, if it is negative (its maximum, if it is positive) on
DG
and not in
G.
A proof may be found in Hellwig 1977, Part 3, Ch. 1.1.
Let boundary value problem (12.2) with
Theorem 12.4:
c(x,y) > 0 (1)
for all
(x,y)
e G
be given.
Then
It follows from q(x,y) > 0,
(x,y) E U
i4(x,y) > 0,
(x,y) e DG
u(X,y) > 0,
(x,y) E G.
and
that
(2)
Iu(x,y)I
There exists a constant
<
max (X,y)eDG
K
K > 0
such that
max Iq(X,Y)l, (X,y)cG (x,y) E
The first assertion of the theorem is a reformulation of the maximum minimum principle which in many instances is more easily applied.
The second assertion shows that the boundary
value problem is properly posed in the maximum norm. Proo
:
(1) follows immediately from Theorem 12.3.
(2), we begin by letting w(x,y) = ' + (exp($ ) - exp(sx))Q where
To prove
12.
Properly posed boundary value problems
'1'
=
max lb(X,Y)1, (x,y)c G const. > 0,
a
211
max jq(x,Y)j (x,y)EG
Q =
a const.
>
max (x,y)EG
Further, let maxc_ {1aX al(x,Y)I, c(x,Y)}.
M
(X,y)
Without loss of generality, we may suppose that the first component, x, is always nonnegative on
Since
G.
a1(x,y) > a,
we have
r(x,y) _ -[al(x,Y)wx(x,Y)]x -
[a 2(x,Y)wy(x,Y)]y
+ c(x,Y)w(x,Y)
= Q exp(Bx)[al(x,Y)s2 +
+ c(x,Y) [Q exp(SC) +
s
ax ai(x,Y) - c(x,Y)J
Y']
> Q exp(Bx)[as2 - M0+1)).
Now choose
a
so large that as2 - M(0+1) > 1.
It follows that r(x,Y) I Q,
(x,y)
E G.
In addition,
w(x,y) >
'l,
(x,y) E 9G.
From this it follows that q(x,y) + r(x,y) > 0 (X,y)
q(x,y)
E G
- r(x,y) < 0
u(x,Y) + w(x,Y) = V'(x,Y)
+ w(x,Y) > 0
- w(x,Y) = i,(x,Y)
- w(x,Y) < 0
(x,y) E U(x,Y)
G.
212
BOUNDARY VALUE PROBLEMS
II.
Together with (1) we obtain u(x,y) + w(x,y) > 0 u(x,y)
- W(X,Y) < 0
which is equivalent to (x,y) e G.
Iu(x,y)I < W(x,Y),
u, and its continu-
To check the uniqueness of the solution ous dependence on the preconditions ferent solution
u
and
for preconditions
Theorem 12.4(2), for Iu(x,Y)
1
- u(x,Y)I <
(x,y)
q, pick a dif-
and
q.
From
c G, we obtain the inequality
max (x,y)e3G + K
max _Iq(X,Y)
q(x,Y)I
(x ,y) eG This implies that the solution
is uniquely determined
u
and depends continuously on the preconditions Example 12.5:
El
i
and
q.
Potential equation, harmonic functions.
Boundary value problem: tu(x,y) = 0,
(x,y)
(x,y)
u(x,Y) _ (x,Y), Here
i e C0(3G, ]R).
e G e 8G.
As a special case of (12.2), this prob-
lem has a uniquely determined solution which depends continuously on the boundary condition
P.
The homogeneous differ-
ential equation tu(x,y) = 0 is called the potential equation.
Its solutions are called
Properly posed boundary value problems
12.
213
Harmonic functions are studied care-
harmonic functions.
fully in classical function theory (cf. Ahlfors 1966, Ch.
Many of these function theoretical results were
4.6).
extended later and by different methods to more general differential equations and to higher dimensions.
In this, the
readily visualized classical theory served as a model.
We
will now review the most important results of the classical theory. Let
(1)
be a holomorphic mapping.
f(z)
f(z), Re(f(z)), and
Im(f(z))
Then
f(z),
are all harmonic functions.
Every function which is harmonic on an open set
(2)
is real analytic, i.e., at every interior point of the set it has a local expansion as.a uniformly convergent power series in
x
and
y.
(3)
When the set
G
is the unit disk, the solution
of the boundary value problem for the potential equation can be given by means of the Poisson integral formula r
2
r27r
1-r
JO
1
2
dm
for r<1
2r
u(x,y) = for r=1.
are the polar coordinates of
Here
(x,y).
The
Poisson integral formula is a simple consequence of the Cauchy integral formula. (4)
The Poisson integral formula leads to the expres-
sion
rv[av
u(x,Y) = 012 + V=1
where
sv
214
II.
BOUNDARY VALUE PROBLEMS
f2w
(cos ¢, sin 4)cos(v4)dc
av = n 0
2n
p(cos 4, sin 4)sin(v4)d4.
Sv = 1 j 0
it
The functions
rvcos(v4), rvsin(v4)
monic functions.
are the simplest har-
Thus the above expansion of
u(x,y)
is
analogous to the power series expansion for holomorphic functions.
The potential equation is invariant with respect
(5)
to one-to-one holomorphic transformations.
Thus one need
consider the boundary value problem for the potential equation only on the unit disk, since by the Riemann mapping theorem, every simply connected region with at least two boundary points can be mapped onto the unit disk conformally (i.e., globally one-to-one and holomorphically).
It follows from the Schwarz reflection principle
(6)
that at every boundary point where the boundary curve and the boundary function solution
u
p
8G
are both real analytic, the
is also real analytic.
At these points, u
can
be continued across the border.
The conformal mappings of a simply connected region onto the unit circle can be given in closed or almost closed form for a great number of regions. (5)
As a result, conclusion
is of considerable practical significance.
It is fre-
quently worthwhile to map regions with a complicated border onto the unit disk or onto some other simple region, such as a rectangle.
Unfortunately, the Riemann mapping theorem
has no generalization to higher dimensions.
The exploitation
of conformal mappings is thus restricted to the plane.
Differ-
12.
Properly posed boundary value problems
215
ential equations differing from the potential equation are not in general invariant with respect to conformal maps.
However, it is usually easy to specify the differential equation for the transformed function.
In executing the trans-
formation, the Wirtinger calculus has proved itself to be of use, and we briefly describe it now. Instead of the (mutually independent) coordinates and
y, we consider the (mutually dependent) complex co-
ordinates
z
z, where
and
z = x + iy
,
z=x
x = 2 (z+z)
,
y = ai (z z) . a/az
The differential operators a
az = a
=
1
a
2
ax
1
a ax
2
+
1
a
1
a
and
-
iy,
a/az are defined by
2i ay
_
2i ay
3'F
Conversely, we have a
ax
=
a
az
+
a
3z
y = i(k -
a
az
).
The potential equation now assumes the form
Au(x,y) = 4 a2 u z
z
= 0.
azaz
A function
f(z) = f(z,z) = a(z,z) + ib(z,z) is holomorphic exactly when it satisfies the differential equation
x
216
BOUNDARY VALUE PROBLEMS
II.
of (z, z) az
=
0
This equation is just another form of the
on an open set.
Cauchy-Riemann differential equations ax(x,Y) = by(x,Y),
ay(x,Y) = -bx(x,Y) For a holomorphic function aaf(Z)
= f'(z)
af(z)
3TT7Z
az
=
af(z)
_
w = f(z)
onto the region
,
z
z
2
azaz
_
aZ .
af'(z)
a= f
aZ
a
aw
az
= f (z)f
G
Then it follows from
G*.
aw
az
=
af(z) = 0
be a conformal mapping of the region
af(z) a+ T(Z
a
_
az
az Now let
it is further true that
f(z)
aw
(z) () aw
a2 + at z + f, (z)(af(z) awaw L az az
2
awaw
a2
z
awaw'
that a2
a2
awaw
fl(z)fl(z) azaz
With the help of this equation one easily transforms differential equations of the form -Au(x,y) = H(x,y,u) or -
4 au 2
z,z)
azaz
= H(z,z,u).
12.
217
Properly posed boundary value problems
First boundary value problem for the Poisson
Example 12.6: equation.
-Au(x,y) = q(x,y),
(x,y) (x,y)
u(x,Y) = Vi(x,Y),
EG
E aG.
In many algorithms it is assumed that either
ip(x,y) = 0
The general case can usually be reduced to these
q(x,y) ° 0.
special cases by means of a substitution: able to a function
be extend-
let
and let
$ E C2(G,IR)
u(x,Y) = u(x,Y)
i(x,Y)
-
q(x,Y) = q(x,y) +
We then obtain the new problem (x,y) E G
-Ai(X,Y) = q(x,Y), u(x,y) =
0
(x,y)
,
c 9G.
If, on the other hand,
_
k
q(x,y) = P(z,z) _
auv
I
z11-V
z
u,v=0
then one can define
a
k u(x,y)
= u(x,y) +
(x, Y) =
+
u+l)(v+1
4
(x, Y) + q
E
zu lZV+1
17
li,v=0
i
(u l) (v 1
V,v=O Pi
or
zu lzv 1
is the solution of the problem au(X,Y) = 0, u(x,Y) = I1(X,y),
(x,y) E G (X,y) E 9G.
0
218
BOUNDARY VALUE PROBLEMS
II.
Example 12.7:
Third boundary value problem.
-Au(x,y) + au(x,y) = q(x,y), 3U3(n)Y).+
Here
Su(x,Y)
(x,y) e G
= (x,Y),
(x,y)
E Co (3G, IR) , q e Co (G, IR)
a, S E 7R,
c
G.
is the
and
derivative in the direction of the outward normal of
8G.
We know, from the theory of partial differential equations (cf., e.g., Walter 1970, Appendix), that: (1)
Whenever the real numbers
a,s
satisfy the rela-
tions
a > 0,
0 > 0,
a+B > 0
the problem has a unique solution. tinuously on the preconditions
The solution depends con-
q(x,y)
is a valid monotone principle: q(x,y) > implies
and
*(x,y). and
0
4i(x,y)
There > 0
u(x,y) > 0. (2)
If
a = 0 = 0, then
a solution whenever uniquely solvable.
u(x,y)
is.
u(x,y) + c, c = constant, is Therefore the problem is not
However, in certain important cases, it
can be reduced to a properly posed boundary value problem of the first type.
To this end, we choose
gl(x,y)
and
g2(x,y)
so that
3x gl(x,Y) + ay g2(x,Y) = q(x,y).
The differential equation can then be written as a first order system:
-ux(x,Y) + vy(x,Y) = gl(x,Y), -uy(x,Y) v
- vx(x,Y) = g2(x,Y)
is called the conjugate function for
u .
If
q e C1(G,IR),
Properly posed boundary value problems
12.
v
219
satisfies the differential equation - v(X,Y) = g(x,Y) = ax g2(x,Y)
-
y gl(x,y).
We now compute the tangential derivative of point.
Let
(wl,w2)
the outward normal.
v
at a boundary
be the unit vector in the direction of Then
is the corresponding tan-
(-w2,wl)
gential unit vector, with the positive sense of rotation. -w2vX(X,Y) + wlvy(X,Y)
= -w2[-uy(x,y)-g2(x,y)] + wl[ux(x,Y)+gl(x,Y)]
= (X,Y) + wlgl(x,Y) + w2g2(x,Y) _ 'P(X,Y) thus is computable for all boundary points
'P(x,y)
given
'P(x,y), gl(x,y), and
g2(x,y).
(x,y),
Since the function
v
is unique, we obtain the integral condition ds = arc length along
faG (x,y)ds = 0,
G.
If the integrability condition is not satisfied, the original problem is not solvable. obtain a
E
Cl(aG,]R)
Otherwise, one can integrate
P
to
with
4)
a s
'P
is only determined up to a constant.
Finally we obtain
the following boundary value problem of the first type for v: -AV(X,y) = g(X,Y), v(x,Y) = T(X,Y),
One recomputes tem.
u
from
v
(X,y) E G (x,y) E
G.
through the above first order sys-
However, this is not necessary in most practical in-
stances (e.g., problems in fluid dynamics) since our interest
220
II.
is only in the derivatives of a < 0
For
(3)
BOUNDARY VALUE PROBLEMS
u.
a < 0, the problem has unique
or
solutions in some cases and not in others.
a = 0, -a = v eIN, q = 0, and
5
For example, for
0, one obtains the family
of solutions
y eIR
u(x,y) =
x = r cos , y = r sin Q.
r2 = x2+y2,
Thus the problem is not uniquely solvable.
In particular,
there is no valid maximum-minimum principle. Example 12.8: geneous plate.
o
Biharmonie equation; load deflection of a homoThe differential equation
06u(x,y) = u
xxxx
+ 2u
xxyy
+ u = 0 yyyy
is called the biharmonie equation.
As with the harmonic equa-
tion, its solutions are real analytic on every open set.
The
deflection of a homogeneous plate is described by the differential equation MMu(x,y) = q(x,y),
(x,y) c G
with boundary conditions u(x,y) _ *1(x,y) (x,y) c 3G
(1)
(x,y) c DG.
(2)
-Du(x,y) = Yx,y) or
u(x,y) = *3(x,y) auan,y) _ 4(x,y)
Here
q c C°(U,IR), ip 1
c
C2(3G,IR), *2,'P4 a C°(3G,IR), and
12.
Properly posed boundary value problems
3 E C1 (BG,IR).
221
The boundary conditions (1) and (2) depend In the first case,
on the type of stress at the boundary.
the problem can be split into two second-order subproblems: -Av(x,y) = q(x,y),
(x,y) e G
(a)
v(x,y) = Yx,y),
(x,y)
e aG
and -tU(X,y) = V(X,y),
(X,y) C G
(b)
u(x,y) = P1(x,y),
(x,y)
c
G.
As special cases of (12.2), these problems are both properly posed, since the maximum minimum principle applies.
All prop-
erties--especially the monotone principle--carry over immediately to the fourth-order equation with boundary conditions (1).
To solve the split system (a),
t'I E C° (BG,IR) problem (2)
(b), it suffices to have
instead of l e C2 (aG,IR). Boundary value
is also properly posed, but unfortunately it can-
not be split into a problem with two second-order differential equations.
Thus both the theoretical and the numerical treat-
ment are substantially more complicated.
There is no simple
monotone principle comparable to Theorem 12.4(1). The variation integral belonging to the differential equation AAu(x,y) = q(x,y) is
I[w] = ff
[(Aw(x,y))2 - 2q(x,y)w(x,y)]dx dy.
a The boundary value problem is equivalent to the variation problem
I [u] =min {I [w] with
I
w e W}
222
BOUNDARY VALUE PROBLEMS
II.
W = (w C C2(G,IR)
I
w
satisfies boundary cond. (1)}
or
W = {w C C1(G, IR) n C2(G, IR)
I
w
satisfies boundary cond. (2)}.
It can be shown that
differentiable in
u
G.
is actually four times continuously o
Error estimates for numerical methods typically use higher derivatives of the solution problem.
u
of the boundary value
Experience shows that the methods may converge ex-
tremely slowly whenever these derivatives do not exist or are This automatically raises the question of the
unbounded.
existence and behavior of the higher derivatives of
u.
Matters are somewhat simplified by the fact that the solution will be sufficiently often differentiable in
G
if the bound-
ary of the region, the coefficients of the differential equation, and the boundary conditions are sufficiently often differentiable.
In practice one often encounters regions with
corners, such as rectangles
G = (a,b) x (c,d) or L-shaped regions G = (-a,a)
x (O,b) U (O,a) x (-b,b).
The boundaries of these regions are not differentiable, and therefore the remark just made is not relevant.
We must first
define continuous differentiability for a function on the boundary of such a region. set
U dIR2
properties: G.
and a function
defined
*
There should be an open
f C C1(U,]R)
with the following
(1) 3G c U, and (2) T = restriction of
f
to
Higher order differentiability is defined analogously.
Properly posed boundary value problems
12.
223
For the two cornered regions mentioned above, this definition is equivalent to the requirement that the restriction of
to each closed side of the region be sufficiently often
*
continuously differentiable. Poisson equation on the square.
Example 12.9:
-Au(x,y) = q(x,y),
(x,y)
c G = (0,1) x (0,1)
u(x,Y) = i(x,Y),
(x,y)
a aG.
v = 1(1)k
u c C2k(G,]R), then for
'Whenever
(-1)v-1(ay)2vu(x,Y)
(DX)2vu(x,Y) +
(-1)v- j-1
x)2j(
(
y)2v-2j
2 ]Au(x,Y)
v=o j Let
(xo,yo)
let
*
be one of the corner points of the square and 2k-times continuously differentiable.
be
left side of the equation at the point mined by and
alone.
*
(xo,yo)
Then the is deter-
We have the following relations between
q:
*xx(xo,Yo) + 4) yy(xo,Yo) = -q(xo,Yo) IPxxxx(xo,Yo)
-
Ip
-gxx(xo,Yo) + gyy(xo,Yo)
YYYY(xo,Yo)
etc.
does not belong to
When these equations are false, u
On the other hand a more careful analysis will
C2k(G,]R).
show that
u
does belong to
tions are satisfied and
q
C2k(G,]R) and
p
if the above equa-
are sufficiently often
differentiable.
The validity of the equations can be enforced through
224
II.
BOUNDARY VALUE PROBLEMS
the addition of a function with the "appropriate singularity". v = 1(l)-, let
For
v
Im(z2vlog
vv(x,Y) = 2(-1)
log z = log r+i4 For
x > 0
and
where
y > 0
z)
r = IzI, 4 = argIzI,
-n < 4
< n.
we have
vv(x,0) = 0 y2v vv(O,Y) = Set
cpv = xx(li,v)+'pyy(u,v)+q(p,v), u = 0,1 and v = 0,1
i
u(x,Y) = u(x,Y) + n
V+(x,Y) = V'(x,Y)
+
n
1
1
1
1
2
cpv Im(zpvlog zpv)
p=0 v=o 1
1
E
E
2
cpv Im(zpv log zpv)
p=0 v=0
where z00 = z,
z10 = -i(z-l),
z01 = i(z-i),
zll = -(z-i-1).
The new boundary value problem reads -au(x,y) = q(x,y),
u(x,Y) = kx,Y),
We have
u e C2 (G, IR)
(x,y)
e G
(x,y) c DG.
.
The problem -Eu(x,Y) = 1,
u(x,Y) = 0,
(x,Y) e G (x,y)
c DG
has been solved twice, with the simplest of difference methods (cf. Section 13), once directly, and once by means of u.
Table 12.10 contains the results for increments
the points
(a,a).
h
and at
The upper numbers were computed directly
Properly posed boundary value problems
12.
225
with the difference method, and the lower numbers with the given boundary correction.
a
1/2
h
1/32
1/8
1/128
0.7344577(-l)
0.1808965(-1)
0.7370542(-l)
0.1821285(-l)
0.7365719(-l) 0.7367349(-1)
0.1819750(-1)
0.1993333(-2)
0.1820544(-1)
0.1999667(-2)
1/256 0.7367047(-1)
0.1820448(-1)
0.1999212(-2)
0.1784531(-3)
0.7367149(-1)
0.1820498(-1)
0.1999622(-2)
0.1788425(-3)
1/16
1/64
Table 12.10
h
a
1/64
1/2
1/128
1/32
1/8
0.736713349(-1)
0.182048795(-l)
0.199888417(-2)
0.736713549(-1)
0.182049484(-1)
0.199961973(-2)
1/256 0.736713532(-1)
0.182049475(-1)
0.199961516(-2)
0.178796363(-3)
0.736713533(-1)
0.182049478(-1)
0.199961941(-2)
0.178842316(-3)
Table 12.11
Table 12.11 contains the values extrapolated from the preceding computations. pure
Extrapolation proceded in the sense of a
h2-expansion:
wh(a,a) =
3[4 uh(a,a)
With the exception of the point
- u2h(a,a)]
(1/128,1/128), the last line
is accurate to within one unit in the last decimal place.
the exceptional point, the error is less than 100 units of the last decimal.
The values in the vicinity of the
At
226
II.
BOUNDARY VALUE PROBLEMS
corners are particularly difficult to compute. that the detour via
and
'
is worthwhile.
u
It is clear Incidentally,
these numerical results provide a good example of the kind of accuracy which can be achieved on a machine with a mantissa length of 48 bits.
With boundary value problems, round-
ing error hardly plays a role, because the systems of equations are solved with particularly nice algorithms. Example 12.12:
Poisson equation on a nonconvex region with
corners.
(x,y) E G
-ou(x,y) = q(x,y),
u(x,y) _ (x,y), Ga = {(x,y) cIR2
1
(x,y) s DGa and
x2+y2 < 1 y
Figure 12.13
jyj
for
> x tan Z} a e (r,2a).
Properly posed boundary value problems
12.
227
The region (Figure 12.13) has three corners (0,0), (cos a/2, sin a/2), (cos a/2, -sin a/2).
The interior angles are a,
n/2,
n/2.
The remarks at 12.9 apply to the right angles. interior angle of
arise.
u
a > n
But at the
other singularities in the derivatives
Let
t (x,y) = Re(zn/a) = Re exp[(n/a)log z]
log z = log r +
7r
<
q(x,y) = 0.
Then u(x,y) = Re(zTr /a ),
and for
a = 3n/2, this is
even the first derivatives of
(x,y) _
0
sin a/2)
q(x,y) _ 0
u
on the intervals from and from
(0,0)
Obviously not
u(x,y) = Re(z2/3 ).
to
are bounded in (0,0)
G.
Here
(cos a/2,
to
(cos a/2, -sin a/2).
Since
also, the singularity has nothing to do with the
derivatives of
W
or
q
It arises from
at the point (0,0).
the global behavior of the functions.
It is not possible to
subtract a function with the "appropriate singularity" in advance. ificance.
Problems of this type are of great practical signIn the Ritz method (cf. §14) and the collocation
methods (cf. §16) one should use special initial functions to take account of these types of solutions.
a
The following two examples should demonstrate that boundary value problems for parabolic and hyperbolic differ-
228
II.
BOUNDARY VALUE PROBLEMS
ential equations are either not solvable or not uniquely solvable.
Boundary value problem for the heat equation.
Example 12.14:
uy(x,Y) = uxx(x,Y),
(x,y)
(x,y) E 3G
u(x,y) = p(x,Y), where
i e C°(3G,IR).
determined.
E G
The boundary value problem is over-
For example, let
G =
(0,1) X (0,1).
Then the
initial boundary value problem already is properly posed. Therefore the
set of all boundary values for which the prob-
lem is solvable cannot lie entirely in the set of all boundary values.
For regions with continuously differentiable
boundary there are similar consequences which we will not enter into here.
Example 12.15:
o
Boundary value problem for the wave equation. (x,y) e G
uxx(x,y) - uyy(x,Y) = 0,
(x,y) e 3G
u(x,y) _ 'D(x,Y),
where
$ c C°(3G,]R).
This problem also is not properly posed.
We restrict ourselves to two simple cases.
Let
G = Q1 = (0,1) x (0,1) or
G = Q2 _ {(x,Y) E]R2
1
ri -
Ixl
> y >
I x I ).
The two regions differ in that the boundary of of characteristics while the boundary of cides with the characteristics.
Ql
Q2
consists
nowhere coin-
According to Example 1.9,
the general solution for the wave equation has the representa-
13.
tion
Difference methods
r(x+y) + s(x-y).
229
If
u(x,y)
is a solution for
G = Q1,
then so is
u(x,y) + cos[2n(x+y)]-cos[2n(x-y)] = u(x,y)
-
2 sin(2nx)sin(2ny).
The problem therefore is not uniquely solvable. G = Q2, r
and
s
In case
can be determined merely from the condi-
tions on two neighboring sides of the square (characteristic initial value problem) and therefore the problem is overdetermined. 13.
e
Difference methods
In composing difference methods for initial value problems, the major problem lies in finding a consistent method (of higher order, preferably) which is also stable.
For
boundary value problems, this problem is of minor significance, since the obvious consistent difference methods are stable as a rule.
In particular, with boundary value problems
one does not encounter difficulties of the sort corresponding
to the limitations on the step size ratio h/Ax
or
h/(Ax)2
encountered with initial value problems.
We consider boundary value problems on bounded regions. Such regions are not invariant under applications of the translation operators.
The difference operators are defined,
therefore, only on a discrete subset of the region--the lattice.
In practice one proceeds in the same manner with
initial value problems, but here, even in theory we will dispense with the distinctions, and start with the assumption that the difference operators are defined on the same Banach space as the differential operators.
230
II.
BOUNDARY VALUE PROBLEMS
From the practical point of view, the real difficulty with boundary value problems lies in the necessity of solving large systems of linear or even nonlinear equations for each We will consider this subject extensively in the
problem.
The systems of equations which
third part of this book.
arise with boundary value problems are rather specialized in But they barely differ from the systems which
the main.
arise with implicit methods for the solution of initial value problems.
Error estimation is the other major area of concern in a treatment of boundary value problems. In this chapter, G
will always be a bounded region
(an open, bounded, and connected set) in boundary of
G
We denote the
JR2.
Let
by
r.
rr :
C°(-d, IR)
C°(r, IR)
be the natural map which assigns to each function r, called the
its restriction to the boundary
u e C°(G,IR)
boundary restriction map.
In
C°(G,]R)
and
C°(r,IR)
we
use the norms
max
Iu(x,y)
max
(x, y)
(x,y) cG and
(x,y)cr
Both spaces are Banach spaces, and
map with
JJrr11
is a continuous linear
= 1.
Definition 13.1: in
rr
A finite set
M c G
is called a Zattice
It has mesh size
G.
M
2
max
min
(x,y)EG (u,v)erUM
11 (x,y) -
(u,v)%.
Difference methods
13.
231
The space of all lattice functions
C° (M, ]R)
f:M +]R
we denote by
With the norm
.
11f11- =
If(x,y) I,
max (x, Y) EM
becomes a finite dimensional Banach space.
C°(M,]R)
The
natural map
rM:C°(G, ]R)
-
C°(M, ]R)
is called the lattice restriction map.
{(x,y) E G
x = ph, y = vh with p,v E 7l}, 0 < h < h 0
I
is called the standard lattice in if
h
Obviously and
It has mesh size
G.
is chosen sufficiently small.
0
the space
rM
C°(M,IR)
h
0
is linear, continuous, and surjective,
If the points of
lirMil = 1.
The lattice
M
are numbered arbitrarily,
can be identified with
1R
(n = number
M) by means of the isomorphism
of points in
f <-> (f(xl,yl).... ,f(xn,yn))
Thus it is possible to consider differentiable maps F
:
C°(M, IR) + C°(M, ]R)
.
In this chapter we will consider only the following problem together with a few special cases. Problem 13.2:
Here
L
Lu(x,y) = q(x,y),
(x,y) E G
u(x,y) = (x,y),
(x,y) e r.
is always a semilinear uniformly elliptic second-
232
BOUNDARY VALUE PROBLEMS
II.
order differential operator of the form Lu = -a11uxx - 2a12uxy - a22uyy -b1ux - b2uy + H(x,y,u),
where
all, a12, a22, b1, b2
H e CC(G x IR, IR) , Furthermore, for all
(x,y) e G and all
H(x,y,O) = 0, 0,
all >
a11a22 - a12 >
tion of the problem.
z cIR, let
Hz(x,y,z) > 0,
u c C°(G,IR) n C2(G,IR), u
If
eye C° (P, IR) .
q c Co (G, IR) ,
0.
is called the classical solu-
o
The next definition contains the general conditions on a difference method for solving 13.2. Definition 13.3:
A sequence D = {(Mj,Fj,Rj)
I
= l(1)oo}
j
is called a difference method for Problem 13.2 if the following three conditions are satisfied: (1)
hj
=
IMjj (2)
The
M.
are lattices in
converging to zero. The
Fj
are continuous maps
c° (r , ]R) X Co (Mj , IR) For each fixed
* c C0(r,IR), all
differentiable maps of (3)
with mesh sizes
G
The
Rj
Co(MjIR)
C° (Mj , IR) . Fj(p, ) to
are continuously
C°(Mj,IR).
are continuous linear maps
13.
Difference methods
C° (Mj
The method
233
-> C° (Mj , ]R) .
, IR)
is called consistent if the following condi-
D
tion is satisfied:
There exists an
(4)
for all
2
with the property that
u e Cm(G, IR) , lim J I F . j _.,,
Here
m >
rj = rM j
,
J
( * , r (u)) - R (r (q) )IIm = 0. J J
1 = rr(u), and
q(x,y) = I,u(x,y)
for all
(x,y) e G. The method
D
is called stable if the following condition
is satisfied:
There exist
(5)
K > 0, K > 0
and
jo e14
with the
following properties:
JIF( ,wj)-Fj(y,wj)11m > K11wj-W,Yi_
IIRj (wj)-Rj (wj)II < KlIwj- W.II V'
C C°(r, 1R), j= jo(l)°°, wj,Wj e C°(Mj, IR).
Example 13.4: problem.
The standard discretization of the model
We consider a consistent and stable difference
method for the model problem -Au(x,Y) = q(x,Y),
u(x,Y) = (x,Y) , For
j
= l(1)= MJ
a
:
(x,y)
e G =
(0,1)2
(x,yj e r.
we set
standard lattice with mesh size
2-J
F('Y,w)(x,Y) = 2 (4w(x,Y)-w(x+hjY)-wj(x-hj,Y) hj
wj(x,Y+h.)-wj(x,Y-h.))
234
BOUNDARY VALUE PROBLEMS
II.
R
(x, y) = wi (x, y)
(x,y) a Mj.
w a C° (Mj , IR) , Here
( x.Y)
-
wi (x,y)
when
(x,y) a M)
rp(x,y)
when
(x,y) e r.
.l
The proof of the consistency condition, (4), we leave to the Stability, (5), follows from Theorem 13.16 below.
reader.
The eigenvalues and eigenfunctions of the linear maps
Fi (0,-): C°(Mi ,
-+ C°(Mi ,
IR)
can be given in closed form.
IR)
One easily checks that the
functions
vuv(x,y) =
(x,y)
a M
u,\) = 1(1)2j-l
are linearly independent eigenfunctions.
The corresponding
eigenvalues are
auv = 7[2 - cos (uirh) - cos (virh)
h=h
h Since lattice
Mj
consists of
tive and lie in the interval
.
points, we have a
(23-1)2
complete system of eigenfunctions.
i
All eigenvalues are posiwhere
[all''mm]
m = 2j-l.
We have
A11 = Z [l - cos (nh) ]
=
2Tr2
-
1Tr4h2 + O(h4)
h
amm = -7[1 + cos(nh)] = 2 h h
2712
+ 6n4h2 + 0 (h4).
With an arbitrary numbering of the lattice points, there are real symmetric matrices
Ai
for the maps
Fi
With
13.
Difference methods
235
respect to the spectral norn:
they satisfy the condi-
tions
mm A11
The functions
1
+ cos(rh
1
-
cos (,rh)
+ 0(h4)
2
(,-h) -
3
vuv, regarded as functions in
IR2,
eigenfunctions of the differential operator -Avll(x,Y) = 21T 2v11(x,Y),
Since the functions
v,v
For example,
-A.
(x,y) e ]R
are also
2
vanish on the boundary of
(0,1)2,
they are also eigensolutions of the boundary value problem. Now let
D
a
be an arbitrary difference method for
solving Problem 13.2.
An approximation
the exact solution
of 13.2 is obtained, when possible,
u
wj
e C°(Mj,IR)
for
from the finitely many difference equations
Fj
(x, y) = Rj ( r . (q) ) (x,Y) ,
Thus our first question is: in the finitely many unknowns tion?
(x,Y) e Mj .
Does the system of equations wj(x,y)
have a unique solu-
For stable methods, a positive answer is supplied by
the following theorem. Theorem 13.5:
Let
F c Cl(Rn, Rn)
and
Then the
K > 0.
following two conditions are equivalent:
(1)
IIF(x)-F(x)II > KIIx-RII,
(2)
F
is bijective.
x,i a Rn
The inverse map
Q
is continu-
ously differentiable and
IIQ(x)-Q(X)II _
x,i c Rn.
IIx-RII,
x Proof that (1) implies (2):
Let
F'(x)
be the Jacobian of
236
F.
BOUNDARY VALUE PROBLEMS
II.
We show that
is regular for all
F'(x)
x0,y0 e Rn
then there would exist
F'(x0)y0 =
and
0
This means that the directional derivative at the
yo # 0.
point
with
For if not,
x.
in the direction
x0
is zero:
y0
lim n IIF(xo+hy0)-F(x0)II = 0.
IhI- 0 Thus there exists an
h
such that
> 0
0
h IIF(x0+h0y0 )
-
F(x0)II < KIIyOII
0
or
IIF(xo+hoyo) This contradicts (1).
-
F(x0)II < Kllhoyo II
Therefore
F'(x)
is regular every-
where.
F(x) = F(Z)
is injective.
F
Since
once by virtue of (1).
F'(x)
implies
x = :
at
is always regular, it
follows from the implicit function theorem that the inverse map
Q
is continuously differentiable and that
open mapping. F(Rn)
It maps open sets to open sets.
F
In particular,
is surjective.
be an arbitrary but fixed vector.
IIF(x)
-
F(0)II _ KIIxII
IIF(x)-x011 + Ilxo-F(0)II For all
is an
is an open set.
We must still show that x0 e]Rn
F
x
KIIxII
outside the ball
E = {x a mn
I
Ilxll _ 2IIx0-F(0)II/K}
we have
d(x) = IIF(x)-xoll > IIF(0)-x011
Let
By (1) we have
13.
Difference methods
237
Therefore there exists an
with
x1 e E
d(x1) < d(x), x c 1R'.
On the other hand,
d(x1) = Since
F(Rn)
inf
n II y-xo II
ycF(R )
is open, it follows that
is surjective.
Thus
x0 a F(1Rn).
F
It also follows from (1) that
IIx-RII = IIF(Q(x))-F(Q(R))II
KIIQ(x)-Q(R) II
This completes the proof of (2). Proof that (2) implies (1):
x,R a Rn.
Let
It follows by
virtue of (2) that
IIx-RII =
IIQ(F(x))-Q(F(X))II_
Theorem 13.6:
Let
KIF(x)-F(R) II
0
be a consistent and stable difference
D
method for Problem 13.2 and let
m, jo
constants as in Definition 13.3.
we define the lattice functions
K > 0
IN, and
For arbitrary
be
u e C2(G, IR)
wj, j = jo(l)m, to be the
solutions of the difference equations
F- (V ,wj) = Rj (r (q)). Here
= rr(u)
and
q = Lu.
Then we have:
Ilrj(u) wjIImJIFJ('Y,rj(u))-Rj(rj(q))II.,
(1)
j = jo(1)m. If
(2)
u e Cm(G, R), then
lim IIrj (u) -wjIIm = 0. j +00
Proof:
$
depends only on
13.5, the maps
F- (*,-) 3
We have
u
and not on
j.
By Theorem
have differentiable inverses
Q3.
238
BOUNDARY VALUE PROBLEMS
11.
rj (u) = Qj (Fj
(u)) )
wi = Qj(Rj(rj(q)))
"jrj (u) -w.'<_
ll Qj (Fj (V+,rj (u))) -Qj (Rj (rj (q)) )%.
(1) follows from Theorem 13.5(2) and (2) follows from (1) and Definition 13.3(4).
o
In Problem 13.2, q
and
y,
are given.
All conver-
gence conditions which take account of the properties of the exact solution
are of only relative utility.
u
Unfortunat-
ely, it is very difficult to decide the convergence question simply on the basis of a knowledge of less one knows that for fixed
q
and
P.
Neverthe-
p c C°(r,IR), the set of
q
for which the difference method converges is closed in
C°(U, IR)
.
Theorem 13.7:
Let
be a consistent and stable difference
D
method for Problem 13.2, let S
{q e C°(, IR)
I
c C°(r,IR)
1P
there exists a
and let
u c C2 (U, ]R)
such that rr(u) q = Lu and Further let
lim lIrj(u)-wj11m= 0).
j -.
q E S and
Fj (0', 4i ) = Rj (rj (q)) , Then there exists a
j
u c C°(U,IR)
such that
0.
1im Note that the function
= jo(l)..
u
need not necessarily be the clas-
sical solution of the boundary value problem.
Difference methods
13.
Proof:
239
Let
q(l),q(2) E
=
q(1)
Lu(1),q(2)
=
Lu(2),
R.(rj(q(1))),
j = jo(1)°°
Rj(rj(q(2))),
j
= jo(1)-.
Then:
Il rj (u(1)) -rj (u(2) Al.
Ilrjlull) )-w(1)Ilm
+
Let Q. j = jo(l)W, again be the inverse functions of and
and
K
tion 13.3(S).
the constants from stability condi-
K'
It follows from Theorem 13.5 that:
II rj (u(1) -u(2) )II W
Il rj (u(2) ) -wJ 2)II
Il
+ K llq(1)-q(2)II,,. In passing to the limit ity converges to
Ilull)
the left side of the inequal-
u(2)11, while the mesh
-
IMJI
con-
On the right, the first two summands converge
verges to zero.
to zero by hypothesis.
,lull)
j - ,,,
-
All this means that
u(2)11
llq(1)
-
Thus corresponding to the Cauchy sequence {q(v)
E S0
1
V = 1(l)m}
q(2)11..
240
BOUNDARY VALUE PROBLEMS
II.
there is a Cauchy sequence
{u(v)
E C0 (G,IR)
v = l(l)m}.
I
Let
lim q(v), v+m
Then for
v = l(l)m
u = lim u(v) v->m
we have the inequalities
II rj (u) -wjll <
1 1 rj (u-u(v) ) I l m +
Il rj (u(v))
11w'jv) -wj 1L
< IIu-u(V)IIL + For
E> 0
K
there is a
vo e 1N
Ilq(v)-qli,;
with
IIii-u(v°)IIm < 3 114-q(v0 )lIm
< 3 -K
K
For this
we choose a
v°
jl E N
IIrj(u(v°))-wj(v°)IIm < E Altogether then, for
j
> jl
II rj (U) -wjll < E.
such that
j = jl(1)m.
we have a
For the most important of the difference methods for elliptic differential equations, stability follows from a monotone principle.
The first presentation of'this relation-
ship may be found in Gerschgorin 1930.
The method was then
expanded extensively by Collatz 1964 and others. The monotone principle just mentioned belongs to the theory of semi-ordered vector spaces. concepts.
Let
0
We recall some basic
be an arbitrary set and
V
a vector space
13.
Difference methods
of elements
f:St -IR.
241
V
In
there is a natural semiorder
f < g -1f(x) < g(x), x e 0). The following computational rules hold: f < f
f< g, g< f f< g, g< h f < g, X c IR+ f
.
f= g f< h of < Ag
-
-g < -f 0 < f+g.
We further define
Ifl(x) = lf(x)I From this it follows that
Ifl,
0< When
is a finite set or when
c
f c V
f < Ifl. 12
is compact and all
are continuous,
jjfjj_= max lf(x) xE12
exists.
Obviously, 11 if 111-
lif 11.e
We use this semiorder for various basic sets V
0
{1,2,...,n}
Btn
{l,2,...,m}x{1,2,...,n}
MAT(m,n,IR)
Lattice M
C°(M, ]R)
G
C° (G, IR)
.
12, including
H.
242
A E MAT(n,n,IR)
Definition 13.8:
BOUNDARY VALUE PROBLEMS
is called an
with the following pro-
A = D - B
there exists a splitting
M-matrix if
perties:
B
is a regular, diagonal matrix; the diagonal of
D
(1)
is identically zero.
D > 0, B > 0. A-1 > 0.
(2) (3)
Theorem 13.9:
A = D - B
Let
MAT(n,n,IR), where A
D
and
B
be a splitting of
A c
satisfy 13.8(1) and (2).
is an M-matrix if and only if
p(D- IB) < 1
Then
(p = spectral
radius).
Proof:
Then the series
p(D-1B) < 1.
Let
(D-'B)"'
S
V=0
converges and
S > 0.
Obviously,
(I-D 1B)S = S(I-D-1B) = I, A-1 > 0
Conversely, let D-1B
with
x
A-1
and let
=
A
SD-I
> 0.
be an eigenvalue of
the corresponding eigenvector.
Then we have
the following inequalities: ID- IBxI < D-1BIxl
lXIlxi =
(I-D-1B)lxl < (l-lal)ixI (D-B)Ixl < (l-lal)Dlxi < (1-IXI)A-l Dlxl.
lxi
Since
x # 0, A-I > 0, and
plies that
IA!
<
1
and
The eigenvalues of
D > 0, the last inequality imp(D-1B) < 1. D-1B
o
can be estimated with the
help of Gershgorin circles (cf. Stoer-Bulirsch 1980).
For
Difference methods
13.
243
this let A = {aij
i = l(1)n, j = l(1)n}.
I
One obtains the following sufficient conditions for P(D-1B) < 1:
Condition 13.10:
A
is diagonal dominant, i.e.
n Jai'j
E
j=1 j#i
Condition 13.11:
A
i = l(1)n.
Iaiil,
<
is irreducible diagonal dominant, i.e.,
n Iai'j
E
A
i = l(1)n,
Jaiil,
j=1 j+i
is irreducible and there exist
r c {0,1,...,n}
such that
n I
Iarri.
1 j
+r
Definition 13.12: ping
F:V1 + V2
Let
and
V1
be semiordered.
V2
is called
isotonic
if
f < g
F(f)
antitonic
if
f < g
F(g) < F(f)
inverse isotonic
if
F(f)
for all
A map-
< F(g)
< F(g) . f < g
f,g e VI.
Definition 13.13:
Let
V
be the vector space whose elements
consist of mappings
f:S2 +IR.
diagonal if for all
f,g e V
Then
F:V + V
and all
x E 0
f(x) = g(x) - F(f) (x) = F(g) (x) .
is called
it is true that:
a
In order to give substance to these concepts, we consider the affine maps
F:x - Ax+c, where
A c MAT(n,n, R)
and
244
BOUNDARY VALUE PROBLEMS
II.
Then we have:
c c IRn. A>
0
F
-A >
0
F
isotonic antitonic
A
an M-matrix
A
diagonal matrix
F
inverse isotonic
F
diagonal
A > 0, regular
F
diagonal, isotonic, and
diagonal matrix
inverse isotonic.
A mapping
t.»
F:IRn ]R"
is diagonal if it can be written as
follows:
yi = fi(xi),
= l(1)n.
i
The concepts of isotonic, antitonic, inverse isotonic, and diagonal were originally defined in Ortega-Rheinboldt 1970. Equations of the form
F(f) = g
with
F
inverse isotonic
are investigated thoroughly in Collatz 1964, where he calls them of monotone type. Theorem 13.14:
Let
A c MAT(n,n,IR)
be an M-matrix and let
F:IRn IR"
be diagonal and isotonic.
F: 1Rn y 1R
defined by
Then the mapping
F(x) = Ax + F(x),
x e IRn
is inverse isotonic and furthermore
IIF(x) Proof: y = F(x)
Since
F
- F(x)II ?
II x-AI
-1
11-
is diagonal, one can write the equation
componentwise as follows: yi = fi(xi),
For fixed but arbitrary
i
= l(1)n.
x = (x1,...,xn) cIRn
and
Difference methods
13.
R = (il) ...,xn) e]Rn
245
we define, for f.(xi)
i = 1(1)n,
fl(Ri) if
xl
-
zi # xi
xl
otherwise
1
E = diag(eii). F
isotonic implies
F(x) Let
A = D
-
B
E > 0.
In addition,
F(i) = E(x-i).
-
be a splitting of
A
as in Definition 13.8.
It follows from
F(x) = AX + F(x) = y F(x) = Ax + F(x) = y that
- F(i) = (D+E-B)(x-i).
F(x)
Since
S=
I (D-1B)' > 0
v=0
converges, [(D+E)-IB]v > 0
T = 0
certainly converges.
The elements in the series are cer-
tainly no greater than the elements in preceding series. Therefore, I
-
(D+E)
1B
is regular, and [I-(D+E)-1B]-I
D+E-B
is also an M-matrix.
and this holds for all inverse monotone.
= T > 0.
We have
x,i c1Rn.
x < i
for
This shows that
F(x) F
< F(R), is
246
BOUNDARY VALUE PROBLEM
II.
In addition we have
ilx-RII ° II (D+E-B)1(F(x)-F(R)]
or
II
II
Ilg(x)-FcR)II
IIT(D+E)-III
The row sum norm of the matrix T(D+E)-1
((D+E)-1131'}(D+E)-
{
'=0
is obviously no greater than the norm of SD
1
=
{ Z (D 1B]'}D 1 = A 1. '=0
This implies that
IIT(D+E)-1II. < IIA-1II-
IIx-xII
IIF(x) Theorem 13.15:
IIA-1IIm
Hypotheses:
(a)
A E MAT(n,n,]R)
(b)
F: ]Rn -r IRn F(x)
o
.
is an
M-matrix
is diagonal and isotonic,
= Ax + F(x)
(c)
v EIRn, v > 0, Av > z = (1,...,1) EIRm
(d)
we]Rn,IIF(w)II_<1.
Conclusions: (1)
It is true for all
IIF(x)-F(z)IIm_
x,R e]R
IIx-RII. IIvIIL
that
Difference methods
13.
(2)
Proof:
F(O) = 0
For all
implies
x c1Rn
A-1 > 0
< v.
lwl
it follows from
Av > z
that
llxll, z <
Ixl Since
247
it follows that
A- Ilxl
Ilxll.v
IIA- Ixlim_
IIA 11x1 IL_ Ilxll. Ilvil.
IIA-III, < Ilvll. Combining this with Theorem 13.14 yields conclusion (1):
IIF(x)-F(X)II. >
llx-xlim
11vil
x,x a IRn ,
For the proof of (2) we need to remember that tonic and
F(-x)
is antitonic.
-z < F(w) <
F(0) = 0
F(x)
is iso-
implies that
z
-Av < F(w) < Av -Av+F(-v) < F(w) < Av+F(v)
-k-v) < F(w) < F(v). Since
F
is inverse isotonic, it follows that
-v < w < v.
a
We conclude our generalized considerations of functions on semiordered vector spaces with this theorem, and return to the topic of difference methods.
In order to lend some
substance to the subject, we assume that the points of lattice
Mj
have been enumerated in some way from
1
We will not distinguish between a lattice function wj
e C°(Mj,IR)
and the vector
to
nj.
248
BOUNDARY VALUE PROBLEMS
II.
[wj(x1,y1),...,wj(xn'yn)) a J
J
Thus, for each linear mapping
F: C° (Mj , IR)
A e MAT(nj,nj,IR)
is a matrix
,Rn j. -).
Co (Mj, IR)
and vice versa.
there
This matrix
depends naturally on the enumeration of the lattice points. "A > 0"
However, properties such as matrix" or
is a diagonal
"A
or
is an M-matrix" either hold for every enumera-
"A
tion or for none.
The primary consequence of these monotoni-
city considerations is the following theorem. Let
Theorem 13.16:
D = {(Mj,Fj,Rj)
I
j
be a dif-
= l(1)oo}
ference method satisfying the following properties:
(1)
F(,wj) = Fl)(wj) + F2)(wj) C C° (r, IR) ,
IIRJ jj wj
-
F3)('U),
wj e Co (Mj , IR) .
(2)
F1)
is a linear mapping having an M-matrix.
(3)
F2)
is diagonal and isotonic, and
(4)
Fj3)
and
< K.
Rj
are linear and isotonic, and
Also it is true for all
e C°(Mj,IR)
with
'p
F2)(0) = 0.
e C°(r, IR) and wj > 1 that
>1
'P
and
F3)('p) + Rj(wj) > (1,...,1) (5)
The method
{(Mj,Fl)-F3),Rj)
consistent if the function
H
I
j
= 1(l)°°}
is
in Problem 13.2 is identically
zero.
Conclusion: Remark 13.17:
D
is stable.
The individual summands of F. Rj
as a rule
correspond to the following terms of a boundary value problem:
Difference methods
13.
249
Boundary value problem
Difference method
Lu(x,y)
- H(x,y,u(x,y))
H(x,y,u(x,y)) P(x,Y)
q(x,y).
R
Since
must be isotonic, Hz(x,y,z)
can never be nega-
If this fails, the theorem is not applicable.
tive.
Consistency as in 13.3(4) can almost always be obtained by multiplying the difference equations with a sufficiently high power of
hj
=
jMjI.
The decisive question is
whether stability survives this approach.
Condition (4) of
Theorem 13.16 is a normalization condition which states precisely when such a multiplication is permissible. points
(x,y)
At most
of the lattice it is the rule that 0.
Such points we call boundary-distant
points.
Among other things, 13.16(4) implies that it follows
from
>
wi
1
that for all boundary-distant points
(x,y),
Ri (wi )(x,y) > I.
In practice, one is interested only in the consistent methods D.
But stability follows from consistency alone for
and
0.
H
=_
0
In general, it suffices to have
isotonic.
In Example 13.4 we have
0, R
the identity
and
h {4wi(x,Y)-w)(x+hj,y)-w)(x-hj,y) J
wi (x,Y+h))-w.(x,y-hi ))
250
BOUNDARY VALUE PROBLEMS
II.
if *(x,y) if
(x,y) e M3 (x,y) E r.
f wj (X, Y)
(x, y)
1
has a corresponding symmetric, irreducible, diagonal dominant matrix (cf. Condition 13.11). Since
of Theorem 13.16 is satisfied. Since
is satisfied.
Thus condition (2) 0, condition (3)
is the identity, one can choose
R i
K = 1
Also when
in (4).
F 3) (P) > 0 ,
w)
>
and
1
R) (w ) > I.
Therefore (4) is also satisfied.
obtained for
Consistency (5) is easily
from a Taylor expansion of the differ-
m = 4 For
ence equations.
t > 1, then obviously
u e C4(G,IR)
one obtains
JI F ( ,r (u)) -r (q)JI_ < Kh where 16
max (x,y)cG
1).
lu
yyyy
Thus the method is stable by Theorem 13.16.
o
Theorem 13.16 is reduced to Theorem 13.15 with the aid of two lemmas. Lemma 13.18:
There exists an
s
e C-(G,IR)
with
s(x,y) >
and
Ls(x,y) = Ls(x,y) - H(x,y,s(x,y)) > 1, Proof:
For all
c
(x,y) c G let
all(x,y) > K1 > 0,
lbl(x,y)) e K2,
We set 3K2
a =
and show that
(x,y)
K1,
B1+B2
B
= -2
Bl < x < B2
0
13.
Difference methods
251
- cosh[a(x-B))}/(2aK2)
s(x,y) _ {cosh[a(62-b)]
is a function with the desired properties. it follows from
ix-al
< 62-8
First of all,
that
cosh[a(x-B)] < cosh[a(02-8)], and from this, that
s(x,y) > 0.
Since
s
depends only on
x, we have
Ls = -a11sxx ZK
blsx
blsinh[a(x-B)].
allcosh[a(x-B)] + 2K 1
2
Since it is always the case that jsinh[a(x-B)]j < cosh[a(x-6)],
it is also true that Ls(x,y) > cosh[a(x-B)] > 1. Remark 13.19:
The function
error estimates, since sible. case.
s
s
o
plays a definite role in
should then be as small as pos-
Our approach was meant to cover the most general In many specific cases there are substantially smaller
functions of this type, as the following three examples demonstrate. bl(x,Y) = 0:
s(x,Y) =
bl(x,Y) > 0:
s(x,Y) = 2K (x-B1)(2 $2-B1-x) 1
L = -a, G = unit circle: s(x,y) = In the last example, the choice of
s
2K1(x-Bl)(62-x)
q(1-x2-y2).
is optimal, since
there are no smaller functions with the desired properties. In the other two examples, one may possibly obtain more
252
II.
K1, al, and
advantageous constants and
BOUNDARY VALUE PROBLEMS
by exchanging
82
x
a
y.
There exists a v e C°(G, IR)
Lemma 13.20:
and a jo c 1N
such that v(x,y) > 0,
(X,y)
c
(1,...,1), Proof:
We choose the
s
v(x,y) > 2,
The function
=
of Lemma 13.18 and define
It is obviously true that
v = 2s + 2.
J
Lv(x,y) > 2
for
v e C (6,1R), and (x,y)
e G.
is a solution of the boundary value problem
v
13.2 with = rr(v) > 2,
H ' 0,
q(x,y) = Lv(x,Y)
Insofar as the method
j
I
> 2.
= 1(1)-}
is
consistent with respect to this problem, we have
lim
0.
j-
We now choose
j
F
)
and
q > 1
Rj
0
so large that for all
(rj (v)) -F3 )
j
> jo
we have
-Rj (rj (q)) II, < I.
are linear and isotonic.
For
i >
1
and
we have
R(r(q)) > Since we actually have
* > 2
and
q > 2, it follows that
Rj(rj(q)) > (2,...,2) and hence that
13.
Difference methods
253
Remark 13.21:
Instead of
tually proved
v e CW(G,1k)
(1,...,1).
o
v e C°(G,1k)
and
and
v >
v > 0, we ac-
However, the condi-
2.
Since one
tions of the lemma are sufficient for the sequel.
is again interested in the smallest possible functions of this type, constructions other than the one of our proof These other methods need only yield a continu-
could be used.
ous function
v > 0.
o
We choose
Proof of Theorem 13.16:
Then we can apply Theorem 13.15.
v
as in Lemma 13.20.
The quantities are related
as follows:
Theorem 13.15
Theorem 13.16
Fc1)
A
F(2)
F
J
J
rj (v) F(l)
v +
FJ2)
J
F
J
w
0
For
j
it follows from Theorem 13.15(1) that:
> jo
(w j)-Fj1) >
lIvIi,
Ilwj-wjII, I1rj(v)II-
>
IIwj -w,II
-
IIvIi
does not depend on
ity in 13.3(5) with
equivalent to
II R3 II
(i .)IIm
This proves the first inequal-
j.
K = 1/IIvII,.
< K.
wj,wj e C (Mj, D2).
The second inequality is
o
In view of the last proof, one may choose in Definition 13.3(5).
Here
v
K = 1/IIvII,,
is an otherwise arbitrary
function satisfying the properties given in Lemma 13.20.
254
II.
BOUNDARY VALUE PROBLEMS
Conclusion (1) of Theorem 13.6 yields the error estimate
Ilrj(u)-wjII,, ` IIvII. IIFj(p,rj(u))-Rj(rj(q))II0.
j
= jo(l)o.
Here is the exact solution of the boundary value problem
u wj
is the solution of the difference equation
F(l,r.(u)) - Rj(rj(q))
is the local error
is a bounding function (which depends only on
v
The inequality can be sharpened to a pointwise estimate with the help of conclusion (2) of Theorem 13.15. points
(x,y)
and
c Mj
j
= j0(1)-
For all lattice
we have
Iu(x,Y)-wj (x,Y) I : v(x,Y)IIFj (*,rj (u))-Rj (rj (q))II_. In many important special cases, e.g., the model problem (Example 13.4), Rj
is the identity.
A straightforward
modification of the proof of Lemma 13.20 then leads to the following result: s > 0
and
exists a
e >
let
Ls(x,y) > 1 jl c 1N
0
and let
s
e Cm(G,]R)
(cf. Lemma 13.18).
with
Then there
such that
Iu(x,Y) -wj (x,Y) I < (1+e) S (x,Y)II Fj (,P,rj (u)) -rj (q))11_,
j = jl(1)-. In the model problem s(x,y) = 4 x(1-x) + 4 y(1-y) is such a function.
independently of
c.
Here one can actually choose
jl = 1,
It therefore follows that
Iu(x,y) -w3 (x,y) I < s ( x , y ) I I F j (,P,rj (u) ) -rj (q)II_,
j
= l(l)-.
We will now construct several concrete difference methods. Let
Difference methods
13.
e(1)
(11,
_
e(2)
(O),
=
0
255
if
Ih X
V
v = 1(1)4
let:
(x,y)+ae(v)
e G
Now for
_
(0). `
0
v = 1(1)4
with
for all
a
1
we associate
(x,y) c G
Nv(x,y,h) c G
four neighboring points
e(4)
`(-11,
With each point
(cf. Figure 13.22).
h > 0.
e(3) _
1
c
and
[0,h]
=
min {A >
(x,y)+xe(v)
0
c r}
otherwise
Nv(x,y,h) = (x,y) + ave(v)
dv(x,y,h) = II (x,Y) - Nv(x,Y,h)II2 = AvIIe(v)II2. Obviously we have 0 < dv(x,y,h) < h,
v = 1(1)4.
By Definition 13.1, the standard lattice with mesh size
h
is
Mh = {(x,y) e G
I
x = yh, y = vh where
u,v e 2Z},
0 < h < ho. For
(x,y)
long to
all the neighboring points
c Mh
Mh
or
Nv(x,y,h)
P.
Lip(2)(G,IR).
For brevity, we introduce the notation This is a subspace of every
be-
C&(G,IR)
f e Lip(Q)(G,IR) a''+Vf
there exists an
au+Vf ayv(x,Y)
-
axu ay v
ax u
(x,y)
E G,
defined as follows:
(x,y)
L >
0
<_ L II(x,y) - (X,Y)Ij E G,
Obviously,
IR) c Lip(2') (G, IR) .
u+v
Q
for
such that
H.
256
BOUNDARY VALUE PROBLEMS
e(2) 1
Figure 13.22.
Direction vectors for the difference method
The next lemma contains a one-dimensional difference equation which we will use as the basis for the difference methods in Lemma 13.23:
Let
a > 0
a,u e
and
C2n
Suppose further that there is a positive
C3
constant
L
such that for all
t,s c (-S,8)
and
v = 0(1)3
the following inequalities are valid:
a(") (t) I < L,
Iu(") (t) I ' L,
a(") (t)-a(")(s)I < Lit -sI,
Iu(")(t)-u(")(s)I < LIt-sI.
Then it is true for all
hl,h2 e (0,6]
that
h1h 22 1h2 {h 2a(Zhl)[u(hl)-u(0)]+hla(- Zh2)[u(-h2)-u(0)]) +
h
a(0)u"(0)+a'(0)u'(0) +
1-
2
[4a(0)u"'(0)+6a'(0)u"(0)
+ 3a"(0)u'(0)] + R where
G.
13.
Difference methods
We examine the function
Proof:
f(S)
= h1h22..1+h2
- h2a(shl)u(0) The
257
[h2a(Zhl)u(shl) + hla(- .h2)u(-sh2)
- hla(- Zh2)u(0)],
s
[0,1].
e
v-th derivatives, v = 0(1)3, are
f(v)(s)
h1h22 1+h2
2
h1h2(hl+h2
11=0
u(0) 2v +
v (v) 2-hl (s ) hlh2a
(-1)vhlh2a(v)(-
2h2)].
It follows that
f(0) = f'(0) = 0 f"(0) = 2a(0)u"(0) + 2a'(O)u'(O) f"'(0)
(h1-h2)[2a(0)u"'(0)+3a'(0)u"(0) + Za"(0)u'(0)].
By Taylor's Theorem,
f(l) = f(0) + f'(0) + 2f"(0) + 6f"'(0) +
6[f"l(e)-fil,(0)
0 < e < 1. The conclusion follows, with
R = 6[f"'(8)
show that 3
41 2 L 24
But this inequality follows from
3
l+h2 TT72
once we
258
BOUNDARY VALUE PROBLEMS
II.
Ia(u)(Zhl)u(v-u)(shl) <
Ia(u)(?hl)I
+
Iu(v-u)
(0) l
au(0)u(v-u)(0)I -
u(v-P) (0)
Iu(v-u)(shl)
-
la(u) ( hl)
a(u) (0) I < . L2hl
-
and, similarly, Iau(- 'h2)u(v-u) (-sh2)
< 2 L2h2.
a
a,u a C4a constant
Whenever
Remark 13.24:
au(0)u(v-11)(0)J
-
with the desired properties always exists.
L
A convenient
choice is L =
Example 13.25:
max v=0(1)4 te[-s,a]
(Ia(v)(t)I, Iu(v)(t)I)
Standard Five Point Method.
Differential operator: Lu = -[aluxlx
where
-
[a2uyly + H(x,y,u)
al,a2 a C(G, IR) , H c C-(G x ]R, IR) , and al(x,y) > 0,
H(x,y,0) = 0,
a2(x,y) > 0
(x,y) c G,
z
e IR.
Hz(x,y,z) > 0
Lattice: 2-(0+£),
h. = Mj:
A point
j
= 1(l)-, t
sufficiently large, but fixed
standard lattice with mesh size
(x,y) e Mj
boring points
h..
is called boundary-distant if all neigh-
Nv(x,y,h
belong to
G; otherwise it is
called boundary-close.
Derivation of the difference equations:
At the boundary-
distant lattice points, the first two terms of
Lu(x,y)
are
Difference methods
13.
259
replaced, one at a time, with the aid of Lemma 13.23. hi, wi, and
abbreviate
ni, merely writing
{al(x+Zh,Y)[w(x,Y)
h, w, and
n:
- w(x+h,y)]
al(x-?h,Y)[w(x,Y)
- w(x-h,y)]
+ a2(x,Y+Zh)[w(x,Y)
- w(x,y+h)]
+ a2(x,Y-Zh)[w(x,Y)
- w(x,Y-h)])
+
We
+ H(x,Y,w(x,Y)) = Q(x,Y) If one replaces
by the exact solution
w
O(h2).
of the boundary
u c Lip(3)(G,]R), the local error will
value problem, where be
u
An analogous procedure at the boundary-close
lattice points yields E1KV(x,Y)[w(x,Y) - w(NV(x,Y,h))]
+ E2KV(x,Y)w(x,Y) + H(x,y,w(x,y))
= q(x,y) + E2KV(x,Y)p(NV(x,Y,h)) where
2a(XV,YV) KV(x,Y) =
du x,Y,h +
dv(x,Y,
u=
1
v = 1,3
2
v = 2,4
I
+2 (x, y, )l
(x,Y) + -11-dV(x,Y,h)e(v)
In the sums
El
and
E2, v
runs through the subsets of
{1,2,3,4}: El:
all
v
with
NV(x,y,h) c G
E2:
all
v
with
NV(x,y,h) e r.
260
BOUNDARY VALUE PROBLEMS
II.
Formally, the equations for the boundary-distant points are special cases of the equations for the boundary-close points. However, they differ substantially with respect to the local error.
In applying Lemma 13.23 at the boundary-close points,
one must choose h1 = dl(x,y,h)
for the first summand of
h2 = d3(x,y,h)
Lu(x,y), and
h1 = d2(x,y,h) for the second.
and
and
h2 = d4(x,y,h)
The local error contains the remainder
R
and also the additional term hi-h
[4a(0)u"'(0) + 6a'(0)u"(0) + 3a"(0)u'(0)].
12
Altogether there results an error of may be reduced to
O(h3)
O(h).
However, this
by a trick (cf. Gorenflo 1973).
Divide the difference equations at the boundary-close points by
b(x,y) = E2KV(x,Y) The new equations now satisfy the normalization condition (4) of Theorem 13.16, since for
p > 1
and
q > 1
it is ob-
viously true that [q(x,Y) + E2Kv(x,Y)'U(x,Y)]/E2Kv(x,Y) > 1.
At the boundary-distant points such an "optical" improvement of the local error is not possible.
is
O(h2)
Therefore the maximum
.
We can now formally define (cf. Theorem 13.16) the difference operators:
13.
Difference methods
261
1
whenever
(x,y)
is boundary-distant
E2Kv(x,y)
whenever
(x,y)
is boundary-close
b(x,y) ll
4
Kv(x,Y)w(x,Y) V=1
E2Kv(x,Y)w(Nv(x,Y,h))]/b(x,Y)
-
H(x,y,w(x,y))/b(x,y)
Ri (ri (q))(x,Y) = q(x,y)/b(x,y).
there is a matrix
For
B-1A; B
is a diagonal matrix
b(x,y), whereas the particular
with diagonal elements
A
naturally also depends on the enumeration of the lattice points.
In practice, there are two methods of enumeration
which have proven themselves to be of value: (1)
Enumeration by columns and rows:
(x,y)
precedes
(z,y)
if one of the following conditions is satisfied: x < x,
(a)
(b)
x = z
and
With this enumeration, the matrix
A
y < y.
becomes block tridia-
gone1: D1
-S1
1
D2
-S2
2
D3
A =
-Sk
The matrices
Du
are quadratic and tridiagonal.
Their dia-
gonal is positive, and all other elements are nonpositive. The matrices
S11
and
SP
are nonnegative.
262
(2)
II.
BOUNDARY VALUE PROBLEMS
Enumeration by the checkerboard pattern:
lattice
Divide the
into two disjoint subsets (the white and black
Mj
squares of a checkerboard): Mil) = {(uh,vh) c M.
u+v
even}
{(uh,vh) a Mj
u+v
odd}.
The elements of
Mil)
In each of these subsets, we use the column
second.
of
are enumerated first, and the elements
and row ordering of (1). D1
The result is a matrix of the form -S
A = -9
D1
and
are quadratic diagonal matrices with positive
D2
diagonals.
D2
S
and
S
are nonnegative matrices.
In Figures 13.26 and 13.27 we have an example of the two enumerations.
Figure 13.26.
Enumeration by columns and rows
13.
Difference methods
Figure 13.27.
263
Enumeration on the checkerboard pattern
We will now show that
B-1A
and
A
are M-matrices.
It is
obvious that: 4
(1)
app =
p = 1(1)n,
Kv(x,y) > 0,
(x,y)
c
V=1 (2)
apa = -Kv(x,y)
<
or
0
apo =
0
for all
p,a
with
a
n (3)
app
>
I
lapa1,
p = 1(1)n.
a=1 a+p (4)
For each row
(ap1,...,apn), belonging to a boundary-
close point, n a pp >
E
IapaI.
a=1
a#p (5)
apa = A
In case
0
implies
aap =
matrix
for
p,a = 1(1)n.
is irreducible, it is even irreducible diagonal
dominant, by (1) through (4)
wise, A
0
(cf. condition 13.11).
Other-
is reducible, and by (5) there exists a permutation P
such that
264
BOUNDARY VALUE PROBLEMS
II.
A PAP
1
1
A2
=
l®
1
Av, v = 1(1)L
The matrices Each matrix
A
are quadratic and irreducible.
has at least one row which belongs to a
AV
boundary-close point.
Hence all of these matrices are ir-
reducible diagonal dominant, and thus quently, A
Conse-
is also an M-matrix.
For certain G = (0,1)
M-matrices.
x (0,1)
h
or
and certain simple regions (e.g.
G= {(x,y) E (0,1) x (0,1)
h = 1/m] it will be the case that
dv(x,y,h) = h.
x+y < 1},
I
When this
condition is met, we have the additional results: (6)
Kv(x,y,h) = Ku(Nv(x,y,h),h)
where
u-1 = (v+1)mod 4, (x,y)
(7)
apo = aop
(8)
A
(9)
B-IA
for
c M..
p,a = 1(1)n.
is positive definite. B-1/2AB-1/2
is similar to
and therefore has
positive eigenvalues only.
Of the conditions of Theorem 13.16 we have shown so far that (2)
(B- 1
A
is an M-matrix), (4) (normalization condition),
and (5) (consistency) are satisfied. H(x,y,w(x,y))/b(x,y) is trivially diagonal and isotonic. also satisfied.
Thus condition (3) is
Therefore, the method is stable.
In the following examples we restrict ourselves to the region
G = (0,1) x (0,1); for the lattice
M
we always
choose the standard lattice with mesh width h = h.
= 2-j.
In
13.
Difference methods
265
this way we avoid all special problems related to proximity In principle, however, they could be
to the boundary.
solved with methods similar to those in Example 13.25.
For
brevity's sake, we also consider only linear differential operators without the summand
Then the sumWhen
drops out of the difference operator.
mand (x,y)
H(x,y,u(x,y)).
c
w(x,y)
P, we use
for
Differential operator:
Example 13.28:
Lu = -a11uxx
a22uyy -
b u 1
x
- b2uy.
Coefficients as in Problem 13.2. Difference equations:
h2{[all(x,Y)+ch][2w(x,Y)-w(x+h,y)-w(x-h,Y)] [a22(x,Y)+ch][2w(x,Y)-w(x,y+h)-w(x,y-h)]}
+
Zh{bl(x,Y)[w(x+h,Y)-w(x-h,Y)]
-
+ b2(x,Y)[w(x,y+h)-w(x,y-h)]}
= q(x,Y) Here
When when
c
> 0
is an arbitrary, but fixed, constant.
u E Lip(3)(G,IR), we obtain a local error of
c = 0,
and
can be given by
0(h)
when
an M-matrix.
c > 0.
For small
h,
The necessary and sufficient
conditions for this are flbl(x,Y)l < all(x,Y) + ch,
(x,Y)
e M
Zjb2(x,Y)l < a22(x,Y) + ch,
(x,Y)
a M
which is equivalent to
0(h2)
266
BOUNDARY VALUE PROBLEMS
II.
2[Ibl(x,Y)I-2c]
E Mj
(x,y)
< all(x,Y),
2[Ib2(x,y)I-2c] < a22(x,y),
(x,y) E M3.
If one of the above conditions is not met, the matrix may possibly be singular.
Therefore these inequalities must be
satisfied in every case.
local error, and for h c (0,h0]. lb2I
For
For
c = 0, one obtains the smaller
c > 0, the larger stability interval
In the problems of fluid dynamics, Ib1I
are often substantially larger than
all
and
or a22.
c > 0, we introduce a numerical viscosity (as with the
Friedrichs method, cf. Ch. 6). in many other ways as well.
This could be accomplished
One can then improve the
global error by extrapolation.
o
Differential operator:
Example 13.29:
as in Example 13.28.
Difference equations: h2{all(x,Y)(2w(x,Y)-w(x+h,Y)-w(x-h,y)l
+ -
Here
D1
and
D2
a22(x,y)[2w(x,y)-w(x,y+h)-w(x,y-h)]}
h{D1(x,y) + D2(x,Y)} = q(x,y).
are defined as follows, where
(x,y)
`bl(x,Y) [w(x+h,Y)-w(x,Y)]
for
b 1(x,y) > 0
bl(x,y) [w(x,y)-w(x-h,y)J
for
bl(x,y)
1b2(x,y) [w(x,y+h)-w(x,y)]
for
b2(x,y) > 0
b2(x,y) [w(x,y)-w(x,y-h)]
for
b2(x,y) <
c M.,
Dl(x,Y) <
0
D2(x,y)
F3l)
is given by an M-matrix for arbitrary
h > 0.
0.
This
is the advantage of this method with one-sided difference quotients to approximate the first derivatives.
The local
13.
Difference methods
error is
0(h)
u e Lip(3)(G,IR).
for
sible only if
and
bI
267
b2
Extrapolation is pos-
do not change in sign.
Note
the similarity with the method of Courant, Isaacson, and Rees (cf. Ch. 6).
o
Differential operator:
Example 13.30:
Lu = -aAu - 2buxy
where
satisfy
a,b c C _(_G, ]R)
a(x,y) > 0, a(x,y)2
-
(X,y)
CG
b(x,y)2 > 0.
Difference equations:
{a(x,y)[2w(x,y)-w(x+h,y+h)-w(x-h,y-h)] 2h
+ a(x,Y)[2w(x,Y)-w(x+h,y-h)-w(x-h,y+h)l - b(x,Y)[w(x+h,Y+h)-w(x-h,y+h)-w(x+h,y-h)+w(x-h,y-h)]} = q(x,y).
When Ib(x,Y)I < a(x,y) ,
(x,Y) c Mi
one obtains an M-matrix independent of
h.
However, the dif-
ferential operator is uniformly elliptic only for Ib(x,Y)I < a(x,Y)
When
b(x,y)
__
,
(x,Y) e G.
0, the system of difference equations splits
into two linear systems of equations, namely for the points (ph,vh)
where
p + v
is even
(ph,vh)
where
p + v
is odd.
and
BOUNDARY VALUE PROBLEMS
[I.
268
One can then restrict oneself to solving one of the systems. The local error is of order
0(h2)
for
u e Lip(3)(U,IR).
o
MuZtipZace method.
Example 13.31:
Differential operator: Lu(x,y) = -Au(x,y).
Difference equations: {5w(x,Y)-[w(x+h,Y)+w(x,y+h)+w(x-h,Y)+w(x,Y-h)] h -
4[w(x+h,Y+h)+w(x-h,y+h)+w(x-h,Y-h)+w(x+h,Y-h)]}
= q(x,y) + S[q(x+h,y)+q(x,y+h)+q(x-h,y)+q(x,y-h)].
The local error is
0(h4)
for
13.16 is applicable because
u c Lip(5)(G, Ill).
Theorem
always has an M-matrix.
The natural generalization to more general regions leads to a method with a local error of
0(h3).
More on other methods
of similar type may be found in Collatz 1966.
o
So far we have only considered boundary value problems of the first type, i.e., the functional values on
t
were
Nevertheless, the method also works with certain
given.
other boundary value problems. Boundary value problem:
Example 13.32:
-Eu(x,Y) = q(x,y),
(x,Y)
u(x,y) = P(x,y),
(x,y)c r
u(0,Y)
where fixed.
ii
-
and
0'ux(0,Y)
4
E G = (0,1) x (0,1) and
x +
0
= 0(y), y E (0,1)
are continuous and bounded and
a > 0
is
13.
Difference methods
269
Lattice: A.:
the standard lattice
with mesh width h=hi2
M3 .
3
(0,µh), p = 1(1)2j-l.
combined with the points Difference equations:
For the points in
M. n (0,1) x (0,1), we use the same equa-
tions as for the model problem (see Example 13.4). u e Lip(3)(G,IR)
y = ph, u = 1(1)2j-1, and
For
we have
u(h,y) = u(O,Y) + hux(0,Y) + 1h2uxx(0,Y) + 0(h3) u(O,Y) + hux(0,Y) -h2uyy(O,y)
If we replace
-
Zh2[q(O,Y)+uyy(0,Y)] + 0(h3).
by
2u(O,Y) - u(0,y+h)
- u(0,y-h) + 0(h3)
we obtain u(h,y) = 2u(0,y)
Zu(0,y+h)
-
+ hu x(0,Y)
-
-
Zu(0,y-h)
Zh2q(O,Y) + 0(h3)
u x(O,Y) =
2h[2uCh,Y)+u(O,y+h)+u(O,Y-h)-4u(O,Y)]
+ Zhq(O,Y) + 0(h2).
This leads to the difference equation - a[2u(h,Y)+u(O,Y+h)+u(O,Y-h)l}
h{(2h+4a)u(O,Y)
(y)
Since
+
Zhq(0,Y)
a > 0, the corresponding matrix is an M-matrix.
theorem similar to Theorem 13.16 holds true. converges like tion by possible.
0(h2).
The method
If one multiplies the difference equa-
1/a, the passage to the limit o
A
a - -
is immediately
270
14.
II.
BOUNDARY VALUE PROBLEMS
Variational methods
We consider the variational problem I[u]
= min{I[wl
I
w e W},
(14.1)
where I[w]
= fi [a1w2 + a2wy + 2Q(x,y,w)Idxdy. G
Here
G
is to be a bounded region in
integral theorem is applicable, and
Q F C2(G x ]R, ]R)
to which the Gauss
al,a2 a C1(G,IR), and
where
al(x,y) > a > 0,
a2(x,y) > a > 0,
0 < QzZ(x,y,z) < d,
The function space below.
IR2
W
(x,y)
e G,
z aIR.
will be characterized more closely
The connection with boundary value problems is es-
tablished by the following theorem (cf., e.g., GilbargTrudinger 1977, Ch. 10.5).
Theorem 14.2:
is a solu-
A function u e C2(G, IR) fl C°(G, ]R)
tion of the boundary value problem -[alux]x -
(a2uyly + Qz(x,y,u) = 0,
(x,y) e G
(14.3)
u(x,y) = 0,
(x,y)
e DG
if and only if it satisfies condition (14.1) with
W = {w a C2(G, IR)
fl
C°(-a, IR)
I
w(x,y) = 0 for all (x,y) e 8G}.
In searching for the minimum of the functional
I[w],
it has turned out to be useful to admit functions which are not everywhere twice continuously differentiable.
In practice
one approximates the twice continuously differentiable solutions of the boundary value problem (14.3) with piecewise once
14.
Variational methods
271
continuously differentiable functions, e.g. piecewise polyThen one only has to make sure that the functions
nomials.
are continuous across the boundary points.
We will now focus on the space in which the functional I[w]
will be considered. K(G,IR)
Let
Definition 14.4:
w e C°(G,IR)
functions
such that:
(1)
w(x,y) = 0,
(2)
w
(x,y) e aG.
is absolutely continuous, both as a function with
x
of
with
y
y
held fixed, and as a function of
held fixed.
x
w. e L2(G, ]R).
wx,
(3)
be the vector space of all
We define the following norm (the Sobolev norm) on
K(G,]R):
2
1IwIIH =
[If (w2 + wx + wy )dxdy]l/2 G
We denote the closure of the space H(G,]R).
this norm by
We can extend setting
plies that
w
with respect to
a
continuously over all of
w
w(x,y) = 0
K(G,]R)
outside of
G.
]R2
by
Then condition (2) im-
is almost everywhere partially differentiable, (a,b) c]R2
and that for arbitrary
(cf. Natanson 1961, Ch.
IX) : rx
wx(t,y)dt
w(x,y) = J
a
(x, Y)
e IR2
rY =
J
wy(x,t)dt.
The following remark shows that variational problem (14.1) can also be considered in the space H(G,]R).
II.
272
Remark 14.5:
Let
BOUNDARY VALUE PROBLEMS
u e C2(G, IR) n C°(G,IR)
be a solution of
Then we have
problem (14.3).
= min{I[w]
I[u]
When the boundary
3G
w e H(G, IR)}.
I
is sufficiently smooth, the converse
For example, it is enough that
also holds.
be piece-
2G
wise continuously differentiable and all the internal angles of the corners of the region be less than
2n.
o
The natural numerical method for a successive approximation of the minimum of the functional
I[w]
is the
Ritz method: Choose
linearly independent functions
n
v = 1(1)n, from the space
K(G, IR).
n-dimensional vector space
Vn.
minimum of the functionals
I[w]
I[v]
Each the
V
I
These will span an
Then determine
v e Vn, the
in V:
w e Vn}.
can be represented as a linear combination of
w e Vn f
= min{I[w]
fv,
:
n
w(x,y) =
I
Bvfv(x,Y)
v=1
In particular, we have n
v(x,Y) =
I
cvfv(x,Y),
v=1 I[w]
= I(Sl,...,8n).
From the necessary conditions
2c
(cl,...,cn)
= 0,
v = 1(1)n
v
one obtains a system of equations for the coefficients
cv:
Variational methods
14.
fG[a,(fv)x
'I c(fx
u=1
273
E cu(fu)Y (14.6) + a2(fv)Y u=1 n E cufu)]dxdy = 0, v = 1(1)n. fvQz(x,y,
+
p=1
Whenever the solution
of the boundary value problem
u
(14.3) has a "good" approximation by functions in can expect the error
to be "small" also.
u - v
Vn, one
Thus the
effectiveness of the method depends very decidedly on a suitable choice for the space
Vn.
These relationships will be
investigated carefully in a later part of the chapter.
Now
we will consider the practical problems which arise in solvIt will turn out
ing the system of equations numerically.
that the choice of a special basis for
Vn
is also important.
In the following we will generally assume that is of the special form
Q(x,y,z)
Q(x,Y,z) = 2 a(x,Y)z2 - q(x,y)z, where
a(x,y) >
0
for
(x,y)
e G.
In this case, the system
of equations (14.6) and the differential equation (14.3) are The system of equations has the form
linear.
A c = d where
A = (auv), c = (c1,...,cn)1, and
d = (dl,...,dn)T
with
auv = Gf[al(fu)x(fv)x + a2(fu)y(fv)y + afufv]dxdY, du = If qfu dxdy. G
A
is symmetric and positive semidefinite.
tions
fv
definite.
are linearly independent, A Therefore, v
Since the func-
is even positive
is uniquely determined.
We begin with four classic choices of basis functions
274
II.
BOUNDARY VALUE PROBLEMS
fV, which are all of demonstrated utility for particular problems: (1) (2)
xkyR
monomials
products of orthogonal polynomials
gk(x)gZ(y)
I sin(kx) sin(Ry) (3)
sin(kx)cos(iy)
trigonometric monomials
:
Icos(kx)cos(iy) (4)
Bk(x)BR(y)
products of cardinal splines.
If the functions chosen above do not vanish on
8G, they
must be multiplied by a function which does vanish on and is never zero on
G.
It is preferable to choose basis
functions at the onset which are zero on if
aG
G.
For example,
G = (0,1)2, one could choose
x(1-x)Y(1-y),
x2(1-x)y(l-y),
x(1-x)y2(1-y),
x2(1-x)y2(1-y),
or sin(Trx) sin(Try) ,
sin(2rrx) sin(Try) ,
sin(rx) sin(2iry) , sin(2nx)sin(2rry)
For
G = {(r cos ¢, r sin 0)
1
r e
[0,1),
a good
c
choice is: r2-1,
(r2-1)sin ,
(r2-1)cos 0,
(r2-1)sin 20, (r2-1)cos 20.
Usually choice (2)
is better than (1), since one ob-
tains smaller numbers off of the main diagonal
of
A.
The
system of equations is then numerically more stable.
For
periodic solutions, however, one prefers choice (3).
Choice
(4)
is particularly to be recommended when choices (1)-(3)
give a poor approximation to the solution.
14.
Variational methods
27S
A shared disadvantage of choices (l)-(4) is that the A
matrix compute tions.
is almost always dense. n(n+3)/2
As a result, we have to
integrals in setting up the system of equa-
The solution then requires tedious general methods The com-
such as the Gauss algorithm or the Cholesky method.
putational effort thus generally grows in direct proportion with
n3.
One usually chooses
n < 100.
The effort just described can be reduced by choosing initial functions with smaller support. fufvo
(f11 )x(fv)x.
The products
(fu)y(fv)y
will differ from zero only when the supports of have nonempty intersection. are zero.
A
fu
In all other cases, the
fv
and auv
In this case, specialized, faster
is sparse.
methods are available to solve the system of equations. Estimates of this type are called finite element methods. The expression "finite element" refers to the support of the initial functions.
In the sequel we present a few simple
examples.
Example 14.7:
Linear polynomials on a triangulated region.
We assume that the boundary of our region is a polygonal line. Then we may represent
as the union of
G
AP, as in Figure 14.8.
N
closed triangles
It is required that the intersection
of two arbitrary distinct triangles be either empty or consist of exactly one vertex or exactly one side. tices of the triangles be denoted by
&v.
which do not belong to
Let them be enumerated from
We then define functions rules:
AP
Those ver-
1
2G, will to
n.
fv, v = 1(1)n, by the following
276
Triangulation of a region
Figure 14.8.
(1)
fv e C°(G, IR)
(2)
fv
restricted to
nomial in
IR2,
(3)
fvW')
(4)
fv(x,y) = 0
The functions (4).
BOUNDARY VALUE PROBLEMS
11.
is a first degree poly-
Op
p = 1(1)N.
dvu for
(x,y)
c 3G.
are uniquely determined by properties (1)-
fv
They belong to the space
fv
vanishes
which does not contain vertex
AP
on every triangle
K(G,IR), and
CV.
If the triangulation is such that each vertex v belongs to at most
k
triangles, then each row and column of
contain at most
k +
1
A
will
elements different from zero.
In the special case v = (rvh, svh)T,
rv,sv eZZ
we can give formulas for the basis functions
fv.
The func-
tions are given in the various triangles in Illustration 14.9.
The coefficients for matrix
this.
We will demonstrate this for the special differential
A
equation -Du(x,y) = q(x,y)
Thus we have
a1 = a2 = 1, a
=_
0, and
can be computed from
(
Variational methods
14.
0
0
0
277
/0 /0 0
0
0
1-rv+sv
0
l+rv x
+h 1-rv+
0
h
l+s
0
x
l+rv-sv-
0
0
0
V V0 Figure 14.9.
svh
v
1 sv+h
0
0
0
0
0
0
r h v
Initial functions triangulation
fv for a special
auv = It [(fu)x(fv)x + (fu)y(fv)y]dxdy Since
(fo)x
and
(fa)y
are
1/h, -1/h, or
0,
depending
on the triangle, it follows that 4
for p = v
-1
for
sv = su
and
rv = ru+1
or
rv = ru-1
-1
for
rv = ru
and
sv = su+1
or
S. = su-
0
otherwise.
In this way we obtain the following "five point difference
278
II.
BOUNDARY VALUE PROBLEMS
operator" which is often also called a difference star: 0
-1
0
-1
4
-1
0
-1
0
Tk,i
are the translation operators from Chapter
Here the
= 41
-
(Th,l + T_ h,1 + Th,2 + T_ h,2)'
10.
The left side of the system of equations is thus the same for this finite element method as for the simplest difference method (cf. Ch. 13).
On the right side here, how-
ever, we have the integrals
du = If gfudxdy while in the difference method we had h2q(r11 h, suh)
In practice, the integrals will be evaluated by a sufficiently accurate quadrature formula.
In the case at hand the follow-
ing formula, which is exact for first degree polynomials (cf., e.g. Witsch 1978, Theorem 5.2), is adequate: If g(x,y)dxdy z h2 [6g(0,0) + 6g(h,0) + 1 (O,h) where
is the triangle with vertices
A
Since the
fu
(0,0), (h,0), (O,h).
will be zero on at least two of the three
vertices, it follows that 2
du
a
6 (+l+l+l+l+l+l)q(ruh,suh) = h2q(ruh,suh).
Example 14.10:
Linear product approach on a rectangular is the union of
N
closed
rectangles with sides parallel to the axes, so that
may
subdivision.
We assume that
G
14.
Variational methods
279
Figure 14.11.
Subdivision into rectangles
be subdivided as in Figure 14.11.
We require that the inter-
section of two arbitrary, distinct rectangles be either empty or consist of exactly one vertex or exactly one side. denote by
CV
(v = 1(1)n)
We
those vertices of the rectangles
p which do not belong to
Then we define functions
G.
fv
by the following rule:
(1)
fv e G°(G, IR)
(2)
fv
restricted to
op
is the product of two
first degree polynomials in the independent variables and
x
y. (3)
fv(u)
(4)
fv(x,y) =
As in Example
svv 0
for
(x,y)
c
G.
14.7, the functions
fv
are uniquely
determined by properties (1)-(4), and belong to the space K(G,IR).
Each
fv
with common vertex
vanishes except on the four rectangles
v.
Thus each row and column of
at most nine elements which differ from zero. In the special case Ev = (rvh, svh)T,
rv,sv C 2Z
A
has
H.
280
BOUNDARY VALUE PROBLEMS
we can again provide formulas for the basis functions
fv,
namely: (1-Ifi -rvI)(1-Ih -svI)
for Ih-rvl
0
otherwise.
< 1,
< 1
Ih-svI
fv
We can compute the partial derivatives of the
fv
on the
interiors of the rectangles: -1Fi(1
I
(1
I
S
for
sv
for -1 <
0 < N-rv <
<
1
h-rv < 0,
Ih-svI <
1
<
h-sv < 1,
Iih-rI
<
1
for -1 <
h-sv < 0,
Ih-rvI
< 1.
1,
IK-svI
otherwise
0
for
0
otherwise.
0
The coefficients of the matrix
A
can be derived from this.
We restrict ourselves to the Poisson equation -Au(x,y) = q(x,y).
By exploiting symmetries, we need consider only four cases in computing the integrals: (1)
Iru- rvI
> 1
(2)
u = v:
auu = h2
(3)
rv = ru + 1
or
Isu- svI
2
h
auv = ri JO
h (h to f0[(1
and rrh
> 1
[-(l
auv =
:
h)2 + (1-
0
)2]dxdy
=
3
sv = su + 1:
)(1 (1
)) (l
)(1 (1
K))]dxdy='3
I
(4)
rv = ru + auv =
h
2
J
h
and
1
h
t (1){1-)+(1
j
0
sv = su:
0
fi)(l-(l-
h))ldxdy
= -7.
Variational methods
14.
281
We obtain the difference star _
1
3
1
3
1
3
1
1
1
8 31
_ 13
8 + 3
3[Th,1+T-h,l+Th,2+T h,2
1
_
3
3
+
The integrals
dU
(Th,l
-h,l)(Th,2+T-h,2))'
can be evaluated according to the formula 2
tt g(x,y)dxdy x[g(0,0) + g(h,0) + g(O,h) + g(h,h)), a
where (h,h).
is the rectangle with vertices
o
(0,0), (h,0), (O,h),
Therefore,
du = h2q(ruh, suh). Example 14.12:
a
Quadratic polynomial approach on a triangu-
lated region (cf. Zlamal 1968). lated as in Example 14.7.
Let region
G
be triangu-
We will denote the vertices of the
triangles
AP
and the midpoints of those sides which do not
belong to
aG
by
We define functions
Ev.
Let these be numbered from
f
(1)
fv a C°(G, IR)
(2)
fv(x,y)
v = 1(1)n
restricted to
1
to
n.
by the following rule:
Op
is a second degree
polynomial, p = 1(1)N.
(3)
fv(O')
(4)
fv(x,y) = 0
avu for
(x,y)
a aG.
As in the previous examples, the functions determined by properties (1)-(4). of a triangle, fv able.
fv
are uniquely
Restricted to one side
is a second degree polynomial of one vari-
Since three conditions are imposed on each side of a
282
II.
is continuous in
triangle, fv
G, and hence belongs to
It vanishes on every triangle which does not con-
K(G, IR).
tain
BOUNDARY VALUE PROBLEMS
CV.
a
With a regular subdivision of the region, most finite element methods lead to difference formulas.
For the pro-
grammer, the immediate application of difference equations is simpler.
However, the real significance of finite ele-
ment methods does not depend on a regular subdivision of the The method is so flexible that the region can be
region.
divided into arbitrary triangles, rectangles, or other geoIn carrying out the division, one can let
metric figures.
oneself be guided by the boundary and any singularities of the solution.
Inside of the individual geometric figures it
is most certainly possible to use higher order approximations (such as polynomials of high degree or functions with special
In these cases, the reduction to difference
singularities).
The programming required by
formulas will be too demanding.
such flexible finite element methods is easily so extensive as to be beyond the capacities of an individual programmer. In such cases one usually relies on commercial software packages.
We now turn to the questions of convergence and error estimates for the Ritz method. Definition 14.13:
Special inner products and norms.
quantities uv dxdy
2 =
f G
I =
ff[aluxvx + a u v G
2
y y
+ ouv]dxdy
The
14.
Variational methods
are inner products on
283
K(G,]R).
2,
II uI12 -
They induce norms
I.
II uIII =
The following theorem will show how the norms
and
can be compared to each other on
11.16
There exist constants
Theorem 14.14:
for all
11.1111
K(G, ]R) .
Y1,Y2 > 0
such that
u e K(G, ]R) : (1)
Y1l1uIII
(2)
IIUII2 < IIuIIH
(3)
1Iu1I2
Y211uIII
11u11H
Y211uIII.
The second inequality is trivial, and the third
Proof:
follows from the second and the first. first is as follows. show that
CI(G,]R)
The proof of the
Analogously to Theorem 4.10(1) we can is dense in
K(G,]R)
with respect to
Thus it suffices to establish the inequalities for
11.16.
u c Co(G,]R).
We begin by showing that there exists a con-
Yo > 0
stant
such that
Gf u2dxdy < Yo Gf (u2 + uy) dxdy, Let
11.1121
[-a,a]
x
[-a,a]
be a square containing
denote that continuous extension of (-a,a]
x
[-a,a]
u C Co (G, ]R) . Let
G.
u c Co(G,]R)
which vanishes outside of
G.
u
to
It follows
that t
u(t,y) = J- ux(x,y)dx. a
Applying the Schwartz inequality we obtain u(t1Y)2 < (t+a)jtaux(x,Y)2dx < 2a 1a aii(x,Y)2dx.
284
BOUNDARY VALUE PROBLEMS
II.
It follows from this that
1a u(t,y)2dy < 2aJa -a
a Ja -a
ux(x,y)2dxdy,
ra
ffu2dxdy
=
4a2fa
-a1a-a f
G
ra
u2dxdy <
u2dxdy
11
all
-a x
< 4a2JG fa(a2 + uy)dxdy. Setting
establishes our claim.
yo = 4a2
We now set =
cc
min
{min[al(x,y), a2(x,y)]}
max
{max[al(x,y), a2(x,y), o(x,y)]}
(x,y) eG Yo =
(x,y) eG
and use the above result to obtain the estimates 2 Ilu<2 (l+Yo) ff (ux+u)dxdy < y G
2
IIuIII < Yo
l+Y
o IIuII2
a
11U12
Inequality (1) then follows by letting
Yl = Let
{uv
1-77
I
v
in
{I
H(G, ]R)
K(G,IR)
Y2 = and
(1+
o
{vv
I
v = l(1)'}
be
which converge to elements
with respect to the norm
v = l(1)°°}
I
and
v = l(1)°°}
Cauchy sequences in
and
-
II
is a Cauchy sequence in
IIH.
Then
]R, for it
follows from the Schwarz inequality and Theorem 14.14 that I
VI-
I1uv-uu1II IIvvIII + Ilvu'vVIII IIuuIII
< yl 2 (I l
uv -uu IIH I I vv "H + 11V u -vV
H
II
I I uu lIH) .
u
14.
Variational methods
285
1 = lim
If we define
and
Theorem 14.14 holds trivially for all The space norms
However, this is not the case with
respect to the norm
II'112, as rather simple counterexamples
There is no inequality of the form
will show.
IIulIH < Y3IIujl2
Convergence for the Ritz method is first es-
tablished for the norm and
II'112
Theorem 14.15:
Let
II'III, and convergence with respect
then follow from the theorem. u c H(G,]R)
I[u] = min{I[w]
w e H(G,IR)
and let
Proof:
u e H(G,IR).
is closed with respect to the
H(G,IR)
and
to
For I[u]
I
be such that
w e H(G,IR)}
be arbitrary.
Then we have:
I = 2
(14.16)
I [u+w] = I [u] + 11W112
(14.17)
A e]R = I
it follows that 22
I[u+aw) = I
-
22
= I[u) + 2a(1 - <w,q>2) + Since
1,
IIull1 =
A2<w,w>1.
is the minimum of the variation integral, the ex-
u
pression in the parentheses in the last equality must be Otherwise, the difference
zero.
sign as
with
A
changes sign.
I[u+Aw]
- I[u]
will change
The second conclusion follows
A = 1.
It is also possible to derive equation (14.16) directly from the differential equation (14.3).
For
286
BOUNDARY VALUE PROBLEMS
II.
a(x,Y)z2 - q(x,y)z
Q(x,Y,z) =
Z we multiply (14.3) by an arbitrary function (test function) and integrate over
w e K(G,]R)
G:
(a2uy)y + au-qlw dxdy = 0.
ff[-(alux)x G
It follows from the Gauss integral theorem that ff[aluxwx + a2uywy + auw]dxdy = This is equation (14.16).
Gf qw dxdy.
It is called the weak form of dif-
ferential equation (14.3).
With the aid of the Gauss inte-
gral theorem, it can also be derived immediately from similar differential equations which are not Euler solutions of a variational problem.
Ac = d
The system of equations
can also be obtained
This process is called the GaZerkin
by discretizing (14.16). method: Let
fv, v = l(1)n, be the basis of a finite dimen-
sional subspace
Vn
We want to find an approxi-
K(G,]R).
of
mation n
v(x,Y) =
E cvfv(x,Y)
v=1 such that
u
=
> I
u
>
2,
u = 1(1)n.
As in the Ritz method it follows that auv =
and
du =
2
A derivation of this type has the advantage of being applicable to more general differential equations.
We prefer to
proceed via variational methods because the error estimates follow directly from (14.17).
14.
Variational methods
Theorem 14.18:
Let
K(G, IR) .
be an n-dimensional subspace of
Vn
Let
287
u e H(G, ]R)
IM = min{I [w]
I
v c Vn be such that
and
w e H(G, IR) },
I (v] = min(I [w]
I
w e Vn}.
Then it is true that
Here
(1)
IN] < I[VI
(2)
(Iu-v112 < Y211u-viII < Y2 min
11u-v*11I
is the positive constant from Theorem 14.14.
Y2
Proof:
v*eVn
Inequality (1) is trivial.
It follows from this,
with the help of Theorem 14.15, that for every
Ii u-v (j 2 = I [v]
I (U]
-
<
I [v* ]
-
I [U]
_ ii u-v* Iii
The conclusion follows from Theorem 14.14. Thus the error
11u-vi12
if there is some approximation for which
11u-v*jjI
is small.
mation in the mean to
u
v* e Vn,
a
in the Ritz method is small v* a Vn
of the solution
u
This requires a good approxi-
and the first derivatives of
u.
Nevertheless, Theorem 14.18 is not well suited to error estimates in practice, because the unknown quantity
u
continues to appear on the right sides of the inequalities in (2).
However, the following theorem makes it possible to
obtain an a posteriori error estimate from the computable defect of an approximate solution. Theorem 14.19:
Let
u e C2(G,IR) n C°(G,]R)
of boundary value problem (14.3).
be a solution
Let the boundary of
G
consist of finitely many segments of differentiable curves. Further let v(x,y) = 0
v e C2(G,]R) for all
(x,y)
be an arbitrary function with e DG
and let
288
II.
BOUNDARY VALUE PROBLEMS
a a a a wx(al az) - ay(a2 ay)
L=
+
Then it is true that II u-v II2 < YZII Lv-q112. Here
is the positive constant from Theorem 14.14.
Y2
Proof:
Let
c(x,y) = u(x,y)
G
Since
is square integrable on
q(x,y), Lu(x,y) ishes on
- v(x,y).
Lu(x,y) _ Since
G.
van-
a
BG, it follows from the Gauss integral theorem that
cLcdxdy = f f I (ale2x + a2ey + cc2)dxdy
=11E,12
'
It follows from Theorem 14.14 and the Schwartz inequality that
IIEII2 < YZIIEIII < Y2 IIE2IIIILEII2
0
We see from the estimate in the theorem that the error will be small in the sense of norm
11112
if
v
is a twice
continuously differentiable approximation of solution for then
depends on
good constants and
Of course the quality of the estimate
Lv z q. Y2.
u,
This shows how important it is to determine Y2
for a region
G
and functions
al
a2.
One further difficulty arises from the fact that the Ritz method normally produces an approximation from instead of from
C2(G,IR).
vented as follows.
K(G,IR)
This difficulty can be circum-
First cover
G
with a lattice and com-
pute the functional values of the approximation on this lattice with the Ritz method.
Then obtain a smooth approxi-
mation by using a sufficiently smooth interpolation between the functional values
v(Ep)
at the lattice points
Cp.
14.
Variational methods
289
Unfortunately, bilinear interpolation is out of the question because it does not yield a twice continuously differentiable A two dimensional generalization of spline inter-
function.
polation is possible, but complicated. interpolation is simpler.
The so-called Hcrmite
We will consider it extensively
in the next chapter.
Up to now we have assumed that form
In the following, let
1 az2 - qz.
function in
Q
C2(G x IR, ]R)
QZ(X,Y,Z) > 0,
has the special Q
be an arbitrary
with (x,y) a G, z cIR.
0 < QZZ(X,Y,z) < b,
Then one has the following generalizations of Theorems 14.15 and 14.18.
Theorem 14.20:
Let
u e K(G,]R)
I [u] = min{I [w] and let
v e K(G,IR)
I
be such that
w c K(G, ]R) } Then it is the case that
be arbitrary.
ff[aluxvx + a2uyvy + Qz(x,y,u)v]dxdy = 0,
I[u+v] = I[u] + ff[alvx+a2v2+Qzz(x,y,u+0v)v2]dxdy, 0<8<1. y G Theorem 14.21:
K(G, R). I[u]
Let
Vn
be an n-dimensional subspace of
Further let u e K(G, IR)
= min{I[w]
I
w c K(G,]R)}
v e Vn be such that
and and
Then there exists a positive constant
I[v] = min{I[w] y2
such that
I
w eVn}.
290
BOUNDARY VALUE PROBLEMS
II.
(1)
I [u]
<
(2)
jl u-vuj
2 < Y2 min
I [v] v*eVn
Y2
G
a2(uy-vy)2
+
The constant
{ff (al (ux-v*) 2
does not depend on
+
6(u-v*)2]dxdy}1/2. Q
or on
d.
Theorems 14.20 and 14.21 are proven analogously to Theorems 14.15 and 14.18.
Inequality (2) of Theorem 14.21
implies that convergence of the Ritz method for semilinear
differential equations is hardly different from convergence for linear differential equations. 15.
Hermite interpolation and its application to the Ritz method
We will present the foundations of global and piecewise Hermite interpolation in this section.
This interpola-
tion method will aid us in smoothing the approximation functions and also in obtaining a particularly effective Ritz method.
In the interest of a simple presentation we will
dispense with the broadest attainable generality, and instead endeavor to explain in detail the more typical approaches. We begin with global Hermite interpolation for one independent variable. Theorem 15.1: (1)
that
m c N
and
f
c
Cm-1([a,b],]R).
There exists exactly one polynomial
deg fm < 2m-1
fmu) (a) fm
Let
fm
Then:
such
and
= f (u) (a) ,
fm(11) (b)
= f (u) (b) , u = 0 (1) m -
is called the Hermite interpolation polynomial for
1. f.
291
Hermite interpolation and the Ritz method
15.
If
(2)
is actually
f
[a,b], then the function
entiable on
µ = 0(1)2m-1, has at least v = 1(1)2m - µ.
2m - µ
f(u)
zeros
fmu), for
-
xµv
[a,b],
in
Here each zero is counted according to multi-
For each
plicity.
2m-times continuously differ-
x e
there exists a
[a,b]
9 e (a,b)
such that the following representation holds: f(u)(x)
The
xµv
fmu)(x)
=
f(mmu 9
+
2fl µ(x-xuv), v=1
u = 0(1)2m-1.
(µ fixed) ordered by size are given by
xuv
ia
for
v = l(1)m -
b
for
v = m+1(1)2m - u.
u
We have the inequality
II
f(u)
_
fmu)IIm
< cmu(b-a) 2m-"II f
0(1) 2m-1.
(2m)II.,
where
mm m-u m-u 77-M (2m-µ)
_-P
1
2m-µ
for
µ = 0(1)m-1
for
µ = m(1)2m-1.
mu 1
(2m-p)
This theorem can be generalized when continuously differentiable on that case, an estimate for
[a,b]
is only
f
with
IIf(µ)-fmµ)II
Swartz-Varga 1972, Theorem 6.1.
For
1-times
0 < R < 2m.
In
can be found in
k < m-1, we require in
(1) that: fmu)(a)
The constants
=
cmµ
fmu)(b)
= 0,
u = i+1(1)m-1.
are not optimal.
Through numerical compu-
tations, Lehmann 1975 obtained improved values of small
m
(cf. Table 15.2).
cmµ
for
7
6
5
4
3
2
1
0
y
cmu
cmu
S.oooo00o0000E-1
l.oooooooooooE o
1.o71428S7143E-1
5.oooooooo000E-1
1.19o47619o48E-2
1.66666666667E-1
1.oooooooooooE o 5.ooo000oooooE-1
5.95238095238E-4
4.16666666667E-2
5.oooooooooooE-1
1.oooooooooooE-1
2.45o7619o282E-5
6.82666666667E-4
3.1oo1984127oE-6
8.33333333333E-3
1.66666666667E-1
1.oooooooooooE o
5.oooooooooooE-1
5.2o833333333E-4
3.o483158o552E-5
4.3945312SoooE-3
5.oooooooooooE-1
8.33333333333E-2
3.689522oo589E-7
1.66527864535E-6
2.8800oooooooE-4
7.453559925ooE-5
9.68812oo3968E-8
9.68812oo3968E-8
m=4
m = 1,2,3,4.
2.17013888889E-5
2.17ol3888889E-5
m=3
(lower entry) for
8.o1875373875E-3
2.4691358o247E-2
1.ooo000000ooE o
5.oooo0000000E-1
2.6o416666667E-3
1.25000ooooooE-1
2.6o416666667E-3
m=2
(upper entry) and
1.2SoooooooooE-1
m=1
TABLE 15.2:
15.
293
Hermite interpolation and the Ritz method
The conditions on
Proof of (1):
equations for the
fmu)
create
coefficients of polynomial
2m
linear
2m
fm.
If
the determinant of the system of equations were zero, then for certain right sides there would be two different polynomials
and
fm
polynomial with
fm - fm
Then
fm.
would be a nonvanishing
zeros and degree < 2m-1.
2m
Since that is
a contradiction, the system of equations must have a unique solution.
Proof of (2): city
The difference
f -
fm
a
and
at each of the points
m
total of
b, and therefore a
It then follows from the generalized
zeros.
2m
has a zero of multipli-
Rolle's Theorem that the derivatives (f-fm)(u)
have at least xuv first
zeros on
2m-u
u = 0(1)2m-1
We denote these by
[a,b].
and order them by size with respect to
equal to
a, and the last
zeros are equal to
m-u b.
Obviously the
v.
m-u
are
Now we consider the function
Oq(x) = f(u)(x)
-
fmu)(x)
2m-u -
q
(x-xuv
II
V=1
for fixed xo a [a,b]
p
e {0,l,...,2m-1}
and
with v = 1(1)2m-p
xo + xuv,
one can then choose a zero.
Then
For a fixed
q e]R.
q e]R
such that
has at least
0q(x)
is equal to
mq(xo)
2m-u+l
zeros in
[a,b].
We again appeal to the generalized Rolle's Theorem to conclude that
0q(2m-u)
(x)
has at least one zero
Then it follows from
q(xo) = 0,
q(2m-u) (g) = 0
8
in
(a,b).
294
II.
BOUNDARY VALUE PROBLEMS
that f(2m)(0)
- q(2m-1j) = 0,
f(u) (xo) When
xo
f (mmu
e (a,b).
6
(xo-xuv)
2mIT
6
v=l
= 0.
xuv, the last equation holds
is one of the zeros
for arbitrary x
fmu) (xo)
-
Therefore it holds for all
The equation, together with
c [a,b].
Ix-xuvI < b-a
immediately implies the inequality for f(u) fmwhere p = 0(1)m-1, we can split the product.
When
p =m(1)2m-1.
We have (x-a)m-u(b-x)m
2m-Vi
Ix-xuvI
II
for
x
c [a,(a+b)/21
for
x
e ((a+b)/2,b1.
<
v=1
(b-x)m-u
1(x-a) m
We want to find the extrema of y(x) = (x-a)
m-u(b-x)m
x c (a,(a+b)/2) .
We have (m-P)(z-a)m-u-1(b-z)m-m(x-a)m
y'(x) _
exactly when
z =
R.
x-a = (m-u)(b-a)/(2m-u)
= 0
The function
[ma + (m-u)b]/(2m-u).
assumes its maximum at
u(b-z)m-1
y(x)
Since and
b-x` = m(b-a)/(2m-p)
it follows that y(X) =
m m m(2m-u)
M-11 (b-a)2m-u
m-u
The considerations for
x e ((a+b)/2,b]
inequality follows for
u = 0(1)m-1.
are similar. c
The
29S
Hermite interpolation and the Ritz method
15.
Suppose a fixed
Then the
has been chosen.
e [a,b]
x
assignment
A
f -* f(u)(x)
fmu)(x),
-
defines a linear functional on
u = 0(1)2m-1 It vanishes
C2m([a,b],]R).
on the set of all polynomials of degree less than
The
2m.
functional can be represented explicitly with the aid of a Peano kernel.
Definition 15.3:
m eIN, x,t c [a,b]
Let
(x-t) (x-t)+m-1
g(x,t) =
2m-1
g(x,t)
of
t,
by
for
x > t
for
x < t.
= 0
For fixed
and
we denote the Hermite interpolation polynomial gm(x,t).
We set
Gm(x,t) = g(x,t) - gm(x,t).
Then
al'Gm/axu
is called the Peano kernel of
Au.
e
The coefficients of the Hermite interpolation polygm(x,t)
nomial
are functions of
t
which can be repreTherefore,
sented explicitly with the aid of Cramer's Rule. gm e C
2m-2
tion in
Since
([a,b] x [a,b],IR). C2m-2([a,bl
g(x,t)
is also a func-
x [a,bl,]R), the same is true for
Gm(x,t) . Theorem 15.4:
Let
f
e
C2m([a,b],]R)
Hermite interpolation polynomial for x
e [a,b]
f(p) (x)
and let
fm
be the
Then for all
f.
we have the representation: - fmp) (x) =
rb m
T
1a
f(2m) (t)
all
m(x,t)dt,
ax
p = 0(1)2m-1.
296
BOUNDARY VALUE PROBLEMS
II.
We begin by showing that
Proof:
b
m(x)
=
f
J
(2m)
(t)Gm(x,t)dt
a
is a solution of the following boundary value problem: 0(2m)(x)
f(2m)(X)
= (2m-l)!
(15.5)
0(v)(a)
=
(u)(b) = 0,
µ = 0(1)m-l.
will then be the Green's function for the
Gm(x,t)/(2m-1)!
boundary value problem (cf., e.g. Coddington-Levinson 1955).
Since
Gm a
C2m-2
([a,bJ
x
[a,b],]R), it follows that
(2m-2)-times continuously differentiable on
is
[a,b].
We have o
(2m-2) (x) =
b r
2m 2
f(2m)(t)a
m (x,t)dt
ax
a
2m-2
f(2m) (t) axZm-Z G M(x,t)dt Jxa rb +
2m-2
f(2m)(t)Gm(x,t)dt.
x
For
x # t, g(x,t), gm(x,t), and hence
It follows that
arbitrarily often differentiable. 10
e
C2m-1([a,b],]R).
(2m-1)(x) _
Differentiation yields
(a f(2m)(t)a2(x,t)dt m + f
respect to
m(x,x-0)
a
m-7Gm(x,t)dt
(t) a
ax -
Since the
(x)
2m-1
b
Jx f(2m)
+
a2m-2
(21m)
Z
J
+
f
(2m)
2m-2 (x)
:xzmm(x,x+0).
(2m-2)-th partial derivative of x
is continuous in
integral terms remain.
are all
Gm(x,t)
x
and
Gm(x,t)
with
t, only the two
As above, it follows that
15.
0
297
Hermite interpolation and the Ritz method
e C2m([a,b],]R)
(2m)
and
m(x,t)dt + f(2m) (x)
ax f(2m)a
(x) =
ax
fax
-Gm(x,t)dt + Jb f(2m)(t)a-ax
a 2Zm=iGm(x,x-0) ax
f(2m)(x)3 2m2m1lGm(x,x+0).
-
axr
x
We have a2m-1 ax
J(2m-l)!
IM--_79 (x,t) = 0
a2m 2 g(x,t) =
ugm(x,t) ax
and
x >
for
x < t
is continuous in
and
x
t
for
u = 0(1)2m-1,
Combining all this, we obtain
= 0.
mm(x,t)
ax
t
x # t
am
all
and
for
0
for
a2m 1
a2m-1
-0)
-
a 2m l-m(x'x+0) _ (2m-1):
a
a2m
a XTM_Gm
(x,t) = 0,
x # t.
From this it follows that (2m)
(x) = (2m-l)!
In addition, it follows for tion of
Gm(x,t)
m
u = 0(1)m-1
from the construc-
that
0(u) (a) Thus
f(2m)(X)
=
0(u) (b) = 0.
is a solution of boundary value problem (15.5).
The
function (2m-l)![f(x)
- fm(x)]
is obviously also a solution of (15.5).
Since the boundary
value problem has a unique solution, it follows that f(x)
-
fm(x) _ (2m-l)! m(x) _
1
(2m-1
fb f(2m) (t)Gm(x,t)dt. Ja
298
BOUNDARY VALUE PROBLEMS
II.
Differentiating this and substituting the derivatives of O(x)
obtained farther above yields the conclusion.
Example 15.6:
g, gm, and
Gm
for
a
m = 1,2,3.
Case m = 1:
x-b-ab-t
gl(x,t) -
g(x,t) = (x-t)+,
(b-x4 t-a
for
x > t
(b-t)(x-a) b-a
for
x < t
G1(x,t) -
t-a
for x > t
b-t
for
Glx(x,t) _
GIx(x,x+0) _ X-a
Glx(x,x-0)
E --a
x < t
+ F = 1.
Case m = 2:
g(x,t) = (x-t)+ (b-t
2
x-a
2
(2(b-t)(x-a) + 3(b-a)(t-x)]
92(x,t)
(b-a)
b-x 2 t-a
2
[2(b-x)(t-a)+3(b-a)(x-t)] for x>t
(b-a)
G2(x,t) _
(b t)2 x- a 2
[2(b-t)(x-a)+3(b-a)(t-x)] for x
(b-a)
t-a
Z
2{-2(b-x)[2(b-x)(t-a)+3(b-a)(x-t)]
(b-a)
+ (b-x)2(3b-a-2t)}
Glx(x,t) =
for x > t
({2(x-a)[2(b-t)(x-a)+3(b-a)(t-x)]
(b-a) +
(x-a)2(3a-b-2t)}
for x < t
15.
299
Hermite interpolation and the Ritz method
(t-a) 2{2[2(b-x)(t-a)+3(b-a)(x-t)] (b-a)
4(b-x)(3b-a-2t)} for x > t G2xx(x,t)
(b-t 2{2(2(b-t)(x-a)+3(b-a)(t-x)] (b-a)
3
+ 4(x-a)(3a-b-2t)} for x < (t-a 2 6(3b-a-2t)
for
x >
for
x < t
t
t
(b-a)
G2xxv(x,t) = b-t
2
6(3a-b-2t)
(ba) G2xxx(x,x-0)
- G2xxx(x,x+0) = 6.
Case m = 3: g(x,t) = (x-t)+ (b-t)3(x-a 3 {5(b-a)(t-x)[2(b-a)(t-x)+3(x-a)(b-t)]
g3(x,t)
(b-a)
+ 6(x-a)2(b-t)2}
G3(x, t)
=
J (x-t)
5
- g3(x,t)
for
x > t
- g3(x,t)
for
x < t.
Theorem 15.4 immediately yields an estimate for the interpolation error with respect to the norm Theorem 15.7:
Let
cmu holds true.
2
r rl rl
m-1
Here
o
Gm(x,t)
tion 15.3 for the interval computed for small by Lehmann 1975.
Then the inequality
< cmu(b-a) 2m-u IIf(2m)I12,
IIf(1j)-fmu)II2
where
f e C2m([a,b],]R).
11-112-
m.
all in [-i
ax
u - 0(1)2m-1
t) ] 2 dx dt I
1/2
M1
is the function (0,1].
Gm
The constants
from Definicmu
can be
The values in Table 15.8 were obtained
2.24457822314E-5 2.77638992969E-4
4.27311575545E-1
4.24705992865E-3 5.37215309350E-2 1
4.21294644506E-11 5.08920680460E-3 5.92874650749E-2
4.45212852385E-4
4.87950036474E-2
4.14039335605E-1
2
3
4
5
6
7
2.38010180208E-6
3.19767674247E-7
6.56734371321E-5
7.27392967453E-3
4.08248290464E-1
7.175679561o6E-8
m=4
1
1.63169843917E-5
m=3
m = 1,2,3,4.
2.01633313311E-3
for
1.05409255339E-1
m=2
cmu
o
m=1
TABLE 15.8.
15.
301
Hermite interpolation and the Ritz method
Proof:
From Theorem 15.4 and an application of the Cauchy-
Schwarz Inequality we obtain
f(u) (x)
-
fmu) 1
(x)
_7
((2m-1)!)
12
(
b
b If(2m)(t)]2dt J [auGm(x,t)]2dt.
a
a ax
By integration, this becomes < cmu(b-a) Ja [f(2m)(t)32dt
Jalf(u)(x)-fmu)(x)12dx where cmu(b-a) =
T_ L
b b u 1/2 {J J (a uGm(x,t)]2dt dx} a a 2x
2m-1
Every interval can be mapped onto transformation.
by an affine
(0,1]
With that substitution, we get (b-a)2m-u
cm11(b-a) _
cm11(1)
Letting 1
1
1
cmu = amu(1)
1/2
u
(2m-1)! {Jo1o [aa u(x,t)]2dx dt}
yields the desired conclusion.
a
The polynomials of degree less than or equal to form a 2m-dimensional vector space. (l,x,...,x
2m-1 )
2m-1
The canonical basis
is very impractical for actual computations
with Hermite interpolation polynomials.
Therefore we will
define a new basis which is better suited to our purposes. Definition 15.9: space.
Basis of the 2m-dimensional polynomial
The conditions S(11),m(0) a,X
= 6112
6a6
(a,6 = 0,1,
u,R = 0(1)m-l)
302
BOUNDARY VALUE PROBLEMS
II.
define a basis
{SI
I
(x)
R = 0(l)m-1}
a = 0,1;
of the 2m-dimensional space of polynomials of degree less than or equal to
2m-1.
o
It is easily checked that the Hermite interpolation polynomial
f e Cm-1([a,bl,]R)
for a function
fm
has the
following representation: M-1
fm(x) =
I
(b-a)t[f(-')(a)SO,1,m(b-a) (15.10)
Z=0
This corresponds to the Lagrange interpolation formula for ordinary polynomial interpolation. Sa ,
R , m(x)
explicitly for
Table 15.11 gives the
m = 1,2,3.
In order to attain
great precision it is necessary to use Hermite interpolation formulas of high degree.
This can lead to the creation of
numerical instabilities.
To avoid this, we pass from global
interpolation over the interval
[a,b]
to piecewise inter-
polation with polynomials of lower degree. partitioning
[a,b]
intermediate points.
into
n
We do this by
subintervals, introducing
n-1
The interpolation function is pieced
together from the Hermite interpolation polynomials for each subinterval.
Theorem 15.12:
Let
m,n c IN,
f
e
Cm-1([a,b],]R)
and let
a = xo < x1 < ...... < xn-1 < xn - b be a partition of the interval and
i = 0(1)n-l
we define
[a,b].
For
x
E [xi,xi+1]
Hermite interpolation and the Ritz method
15.
TABLE 15.11:
S
m = 1,2,3.
for
(x)
m
Sa
m(x)
a
k
0
0
1
1-x
1
0
1
x
0
0
2
1
0
1
2
x - 2x2 + x3
1
0
2
1
1
2
0
0
3
1
0
1
3
x - 6x3 + 8x4 - 3x5
0
2
3
2
1
0
3
10x3
1
1
3
-
1
2
3
-x3
t
+ 2x3
3x2
-
3x2
1x2
2x3
-
x2 + x3
-
-
l0x3
+
-
3x3 + 3x4
_
6x5
1x5
$
Y
2
15x4 + 6x5
-
4x3 + 7x4 -
15x4
-
3x5
x4 + 2x5
M-1
x-x i
xi) [f (t) (xi)SO,R,m(
fm(x) =
303
kI0(xi+1
l
xi+1 xil
x-x. + f( )(xi+1)S1,L,m(x1+11xi)]
Then
fm
is the Hermite interpolation polynomial for
each subinterval
[xi,xi+l]
(cf. Theorem 15.1(1)).
(m-1)-times continuously differentiable on f
[a,b].
f
fm
is
Whenever
is actually 2m-times continuously differentiable on
the following inequalities hold:
on
[a,b],
304
II
f(u) _ fmu)II*
<
muh2m-u
II f(u) _ fmu)II 2 <
15.7.
cmu
and
cmu
We denote by
it f (2m)IIm
cmuh2m-u
II f (2m)II
h = max{xi+l-xi Here
BOUNDARY VALUE PROBLEMS
II.
i
I
2
= 0(1)n-l},
u = 0(1)2m-1.
are the constants from Theorems 15.1 and II'IIm
the norm obtained from
11-11_ by
considering only one-sided limits at the partitioning points
xi . The proof follows immediately from (15.10) and Theorems 15.1 and 15.7.
Our two-fold goal is to use global and piece-
wise Hermite Interpolation both for smoothing the approximation functions in two independent variables and for obtaining Therefore, we will
a special Ritz Method in two dimensions.
generalize global and piecewise Hermite Interpolation to two variables.
We follow the approach of Simonsen 1959 and
Stancu 1964 (cf. also Birkhoff-Schultz-Varga 1968). basic region, we choose
[0,1]
[0,1]
x
As our
instead of an arbit-
rary rectangle, thereby avoiding unnecessary complications in our presentation. Definition 15.13:
m c N.
Let
We define
H(m)
to be the
vector space generated by the set p,q polynomials of degree less than or equal I
to 2m-l}.
This space has dimension
Sa k m(x)SB,i m(y) constitute a basis.
o
4m2.
The functions
a,s = 0,1;
k,i = 0(1)m-1
15.
305
Hermite interpolation and the Ritz method
Remark 15.14:
f e H(m)
if and only if 2m-1 2m-1
f(x,y) =
a..xlyl. j=0
i=0
We can impose
conditions on the interpolation.
4m2
properties demanded of the
Sa
The
by Definition 15.9
2 m
require
exay(Sa,k,m(Y)SS,i,m(6)) a,s,y,6 = 0,1; If, instead of
uk6vk6 ay 6$6
u,v,k,k = 0(1)m-1.
H(m), we choose the set of polynomials in two
variables of degree less than or equal to becomes substantially more difficult. space of dimension
m(2m+1).
m = 1
Then we have a vector
At each of the four vertices of
the unit square we will need to give Even in the case
2m-1, the theory
m(2m+l)
conditions.
we run into difficulties, since we
cannot prescribe the four functional values at the four vertices.
We avoid this difficulty by choosing the interpolation
polynomials from Theorem 15.15:
H(m).
Foi each sequence
{Ca,O,k,t cIR
I
a,O = 0,1;
k,k = 0(1)m-l}
there exists exactly one polynomial akak
ax k ay
X
f(a,6) =
f(x,y) =
ca,a,k,k
1
m-1
E
E
a,s=O k,k=0
H(m)
a,s = 0,1;
such that
k,k = 0(1)m-1.
Ca,d,k,tSa,k,m(x)SB,R,m(Y)
See Remark 15.14 for the proof. special basis for
f e H(m)
Table 15.16 gives the
explicitly for
m = 1,2.
306
II.
Basis of
TABLE 15.16:
BOUNDARY VALUE PROBLEMS
H(m)
m = 1,2.
for
R
m
Sa,k,m(x)
SS'R'm(Y)
0
0
1
(1-x)
(1-y)
1
1
0
1
(I-X)
y
0
1
0
0
1
x
(1-y)
1
0
1
1
0
1
x
y
0
0
2
0
0
2
(1-3x2+2x3)
(1-3y2+2y 3)
0
0
2
0
1
2
(1-3x2+2x3)
(y-2y 2+Y3)
0
1
2
0
0
2
(x-2x2+x3)
(1-3y2+2)r 3)
0
1
2
0
1
2
(x-2x2+x3)
0
0
2
1
0
2
(1-3x2+2x3)
(3y2- 2y 3)
0
0
2
1
1
2
(1-3x2+2x3)
(-y2+y3)
0
1
2
1
0
2
(x-2x2+x3)
(3y2-2y3)
0
1
2
1
1
2
(x-2x2+x3)
(-Y2+Y3)
1
0
2
0
0
2
(3x2-2x3)
(1-3y2+2y3)
1
0
2
0
1
2
(3x2-2x3)
1
1
2
0
0
2
(-x2+x3)
(1-3y2+2y3)
1
1
2
0
1
2
(-x2+x3)
(Y-2Y2+Y3)
1
0
2
1
0
2
(3x2-2x3)
(3y2-2y3)
1
0
2
1
1
2
(3x2-2x3)
(-y2+y3)
1
1
2
1
0
2
(-x2+x3)
1
1
2
1
1
2
(-x2+x3)
a
k
m
0
0
1
0
0
1
.
'
'
(Y-2Y2+Y 3)
(Y-2y
2+Y3)
(3y2-2y3) (-Y2+Y3)
Hermite interpolation and the Ritz method
15.
307
The following theorem uses the Peano kernel to obtain a representation of the error in a Hermite interpolation. f c C4m([0,1] x
Let
Theorem 15.17:
[0,1],
We define
IR).
by the condition
fm c H(m) au+v
au+v
(a4) = axuayv m f
Then for
we have
u,v = 0(1)2m-1
u+2m
2m+v
u+v
ax"ayv(f-fm)II 2_mull azgymv
II
u,v = 0(1)m-1.
a,3 = 0,1;
xuayvf(a,6),
a
f II2 + cmv'I
axu a4m
+
Proof:
We first show
au+v
p,v = 0(1)m-1
for
f 112
ax
that
"+v axuayvfm(E,n)
axuayv
2m+v
1
axaayf(t'n)a m(E,t)dt
(2m-1)
o 1
+
(15.18)
au+2m
(
f(,,s)
0 ax ay
[(2m-1):]
av
4m
2ay
2
Vm(n,s)ds}
ax
flfl
Gm
cmucmvll
f(t,s)
00 ax
au
av
ax
ax
is the function from Definition 15.3 corresponding to the
interval
[0,1].
We begin by assuming that
f
sented as a product, f(x,y) = p(x)q(y), where C2m([0,1],IR).
Then
p
and
q
can be repre-
p,q c
can be approximated indivi-
dually by means of the one dimensional Hermite interpolation. By Theorem 15.4 it is true for all
p(I) M = p(u) ( gmv) (n)
=
)
q(v) (n)
rl p(2m)
1 -
-
(2 m
,n c [0,1]
1
2m11
0
rl
Jo
(t)
-1-11-Gm
ax
that
(&,t)dt
Im(n,s)ds. q(2m) (s) av ax
BOUNDARY VALUE PROBLEMS
II.
308
Multiplying the two equations together yields
pmu)
(Q qmv) (n) = p(u) (E)q(v) (n) 11
(2m
!
1 p(2m)(t)q(v)(n)a-11 J0
u mt)dt
ax
+
(1p(U)(E)q(2m)(s)a
G(n,s)ds}
ax"
0
(2m)
+
1111 p
((2m-1)00
(t) q
all
(2m) (s)
m(E,t)am(n,s)dtds.
ax
ax
By means of the identification
u+v a
ax"ayv
f(C,n) = p(u) Oq(v) (n)
u+v ax''ayvm(E,n)
=
(n)
mu)
we obtain the conclusion for the special case of p(x)q(y).
f(x,y) _
Since the formula is linear, it will also hold
for all arbitrary linear combinations of such products. includes all polynomials of arbitrarily high degree.
This
There
is a theorem in approximation theory which says that every function
f
satisfying the differentiability conditions of
the hypotheses (including the 4m-th order partial derivatives), can be approximated uniformly by a sequence of polynomials. Therefore Equation (15.18) also holds for arbitrary f e C4m([0,1]
x
[0,1],IR).
The Schwarz integral inequality,
applied to (15.18), immediately yields the conclusion.
o
We now pass from global Hermite interpolation on the unit square to piecewise Hermits interpolation.
For this we
allow regions whose closure is the union of closed squares ap
with sides parallel to the axes and of length
require that the intersection
op n oo
(p # a)
h.
We
be either
empty or consist of exactly one vertex or exactly one side.
15.
309
Hermite interpolation and the Ritz method
As in the one dimensional case, we construct the interpolation function by piecing together the Hermite interpolation polynomials for the various squares
All of the con-
op.
cepts and formulas developed for the unit square carry over immediately to squares
aP
Theorem 15.19:
be a region in
Let
properties and let
G
with sides of length 1R2
with the above
We define
f c C4m(G,]R).
h.
fm:G
by
IR
the two conditions: op
restricted to
fm
(2)
At the vertices of square a''+v
uayvf(x'Y),
ax is
fm
For
u+v 2
<
Proof:
mu
it is true that
2m+v
cmuh2m-
cmvh2m-
II2 +
II
G.
is arbitrarily often
u,v = 0(1)2m-1
+ cmycmvh c
u,v = 0(1)m-1.
(m-l)-times continuously differentiable in
differentiable.
Here the
it is true that
ax
In the interiors of the squares, fm
II ax ayv(f fm)II
oP
au+v
uay vfmx.Y) = Then
H(M).
is a polynomial in
(1)
4m-u-v
a2m ax---2_m
u+2m II axj
f II2
f II2 .
II
are the constants from Theorem 15.7.
Along the lines joining two neighboring vertices, fm
and the partial derivatives of already determined by the values vertices of the given side.
fm
through order
m-i
a" vf(x,y)/ax"ay"
at the two
From this it follows that
(m-l)-times continuously differentiable in
are
fm
is
G.
With the aid of Theorem 15.7, the inequality of Theorem 15.17 can immediately be carried over to the case of a square with sides parallel to the axes and of length
h:
310
{cmuh2m-uIf. (a2may
[apaY(f-fm)] 2dxdy <
Jo
BOUNDARY VALUE PROBLEMS
II.
f)2dxdy,1/2
P
P
ch2m-v1I +
mv
+ c
mucmy
(aXa2mf)2dxdy]1/2 Y
h4m-upv((
2ma2mf 2dxd ]1/2}2 o (3x y ) y P
Summing over
p
and applying the Minkowski inequality for
sums (cf. e.g. Beckenbach-Bellmann 1971, Ch. 1.20) gives the inequality for the norms.
a
In the sequel, we will explain how the global and
piecewise Hermite interpolation polynomials can be used as initial functions for the Ritz method (cf. Chapter 14).
We
first consider the one-dimensional problem I(u) = min{I(w)
I
w e W}
where 1
1(w) = J [a(x)w'(x)2 + 2Q(x,w(x))}dx 0
W = {w E C2([0,11, ]R)
I
w(0) = w(l) = 0}.
An actual execution of the Ritz method requires us to choose a finite dimensional subspace
Vn
of
W, and a particular
basis for this subspace.
Example 15.20:
Basis of a one-dimensional Ritz method with
global Hermite interpolation.
{Sa,m(x) as a basis.
I
The functions
m >
Let
a = 0,1;
2,
space generated has dimension
x = 0
2m-2.
Choose
= l(1)m-l}
S a,0,m(x)
they do not vanish at the points
2.
are discarded because and a
x = 1.
The
15.
Hermite interpolation and the Ritz method
Example 15.21:
Basis of a one dimensional Ritz method with
piecewise Hermite interpolation. For
b eJR
311
and
Let
n e 1N
and
h = 1/n.
R = 0(1)m-1, define
Tb,R,m(x)
The restrictions of the m(n+l)-2
Tb,R,m(x)
to
yield
[0,1]
basis functions, for the following combination of
indices:
b = 0,1:
R = 1(1)m-1
b = h,2h,...,(n-1)h: R = 0(1)m-1. On
[0,1], the basis functions are (m-l)-times continuously
differentiable and only differ from zero on one or two of the subintervals of the partition
0 < h < 2h < of the interval
The 15.22.
... < (n-l)h < nh = 1 They all vanish at
[0,1].
Tb,R,m(x)
are given for
and
x = 0
m = 1,2,3
in Table
They are graphed in Figures 15.23 and 15.24.
a
We will now discuss the two-dimensional variational problem with I[w] = If [a 1 W2 + a 2w
2
+ 2Q(x,y,w)]dxdy
G
W = {w a C2(G,IR) n C°(G, IR)
I
w(x,y) = 0 for (x,y) a 8G}.
This is the same problem as in Chapter 14, with the same conditions on
al, a2, and
Q.
For an actual execution of
312
BOUNDARY VALUE PROBLEMS
II.
TABLE 15.22:
Tb
for
R m(x)
for
m = 1,2,3.
R
m
0
1
0
2
1
2
h sgn(x-b)(z-2z2+z3)
10
3
1
1
3
2
3
For
Ix-bi
T
b,R,m
(x)
1
Ix-bi
1
-
-
3z2 + 2z3
z
Ix-bl
z
l0z3 + 15z4
-
< h,
-
6z5
h sgn(x-b)(z-6z 3+8z4_3z5) h22
(z2-3z3+3z4-z5)
> h, Tb,R,m(x) = 0.
the Ritz method, we must choose a finite dimensional subspace
Vn
W
of
Example 15.25:
and a particular basis of this subspace.
Basis of a two-dimensional Ritz method with
global Hermite interpolation. m > 2.
Let
G =
[0,1]
x
[0,1]
and
Choose
{Sa k m(x)SS,R,m(Y) as a basis.
I
a,8 = 0,1; k,R = 1(1)m-l}
The space generated has dimension
Example 15.26:
4(m-1)2.
0
Basis of a two-dimensional Ritz method with
piecewise Hermite interpolation.
Let
G
be a region in
]R2
which satisfies the subdivision properties of Theorem 15.19.
15.
Hermite interpolation and the Ritz method
m = 1 1,0
m =
b-h
b
b+h
b-h
b
b+h
2
0,15
b+h
b-h
Figure 1 5 . 2 3 :
Tb
k
m(x)/hx
for
m = 1,2
313
314
II.
m =
BOUNDARY VALUE PROBLEMS
3
1,0
Tb,0,3 b-h
Figure 15.24:
b
Tb
b+h
m(x)/h91
i
for
m = 3
15.
Hermite interpolation and the Ritz method
Let
E
op
315
denote the set of all vertices of all of the squares
of the partition.
Further, let
m,n c N, and
h = 1/n.
We define the basis functions to be the restrictions of the functions
Tb,k,m(x)Tb'R,m(Y)
G
to
for the following combination of indices:
(b,b) e E
fl
G
e E
f1
2G
(b,b)
k,2 = 0(1)m-1;
and and
k > 1
whenever
(b,b+h)
c DG
or
(b,b-h)
c 8G
> 1
whenever
(b+h,b)
a 8G
or
(b-h,b)
E 8G.
P.
The basis functions belong to
Cm-1(G,IR).
They vanish on
m = 1, we obtain the basis already discussed in
For
3G.
k,t = 0(1)m-1, but
Example 14.10, if the subdivision assumed there agrees with the one prescribed here.
Thus piecewise Hermite interpola-
tion supplies a generalization of this method to The matrix
A = (auv)
m > 1.
of the Ritz method for the
basis chosen here can be given explicitly for the special case
Q(x,y,z) = q(x,y)z + 1-a(x,y)z2, a(x,y) > 0 for (x,y)
c G
by:
auv =
[al(x,Y)Tb,k'm(x)Tb,R,m(Y)Tb*k*,m(x)Tb*'R*,m(Y) +
a(x,Y)Tb,k,m(x)Tb,.,,m(Y)Tb*,k*,m(x)Tb*,R*'m(Y)]dxdy. In this case
Tb,k,m(x)Tb,t,m(Y) is the
u-th basis function, and
316
II.
BOUNDARY VALUE PROBLEMS
Tb*,k*,m(x)Tb*,IC*,m(Y)
v-th basis function.
is the
11 (b,b)
-
The integrals vanish whenever
(b*,b*)II 2
>
Therefore, each row and column of matrix elements which differ from zero.
9m2
A
has at most
At most four squares
contribute to the integrals.
Theorem 14.18 supplies the inequalities
Y211u-wl'I < Y2I!u-w*III.
11u-w112
Here we have u
solution of the variation problem in space
w
Ritz approximation from space
W
V u
w*
arbitrary functions from
:
Vu.
We have the additional inequality: II u-w*II 2
max
<
_ a 1(x,Y)II (u-w*) xI122
I - (x,y)EG max
+
(x,y)c max
+
aI(x,Y)lI (u-w*) y112
_ a(x,Y) IIu-w*II2. 2
(x,y)EG
In our problem we can apply Theorem 15.19 to choose
w*
so
that
IIu-w*112 < Moh2m
II(u-w*) 12 < II (u-w*)YII2 < The numbers of
u
M0, M1, and
and on
Mlh2m-1
M2h2m-1
M2
m, but not on
depend only on the derivatives h.
Altogether it follows that
the Ritz method has convergence order
0(h2m-1 ):
16. Collocation methods and boundary integral methods
317
1Iu-w1I2 < Mh2m 1. In many practical cases it has been observed that the convergence order is actually
O(h2m).
The explanation for this
behavior can usually be found in the fact that the error must be a function of 16.
h2
for reasons of symmetry.
o
Collocation methods and boundary integral methods Collocation methods are based on a very simple con-
cept, which we will explain with the aid of the following boundary value problem (cf. Problem 13.2): Lu(x,y) = q(x,y),
(x,y)
e G
u(x,Y) = (x,Y),
(x,Y)
e r.
(16.1)
Once again, G
is a bounded region with boundary
r
and
L
is a linear, uniformly elliptical, second order differential operator of the form Lu = -alluxx - 2a12uxy - b
1
u
x
- b 2u y
a22uyy
+ gu
where all, a12, a22, b1, b2, g e C_(G, IR) and
q e C°(G, IR), Further it is true for all
a11a22 - a12 >
C° (r, IR) that
(x,y) e
0,
all >
0,
g > 0.
The execution of the method presupposes that we are given: (1) j
n
= 1(1)n, from
linearly independent basis functions C2(G,IR).
vj,
318
n
(2)
different collocation points
of these, the first ing
BOUNDARY VALUE PROBLEMS
II.
n2 = n - n1
are to belong to
n1
are to lie in
The solution
(xk,yk) e G;
G, and the remain-
r.
of boundary value problem (16.1)
u
will now be approximated by a linear combination functions
w
of the
vj, where we impose the following conditions on w: k = 1(1)n1
Lw(xk,yk) = q(xk,yk),
(16.2)
k = nl + 1(1)n.
w(xk,yk) _ (xk,yk), In view of the fact that n
w(x,y) =
cj e IR
E cjv.(x,y), j=1
the substitute problem (16.2) is concerned with the system of linear equations: n
k = 1(1)n
8k,
akj =
Lvj(xk,yk)
for
k < nl
vj(xk,yk)
for
k > nl
q(xk,yk)
for
k < nl
(16.3)
$I,
for k > n
(xk'yk)
In many actual applications, the system of equations can be simplified considerably by a judicious choice of the
vj.
It is often possible to arrange matters so that either the
differential equation or the boundary conditions are satisfied exactly by the functions (A)
Boundary collocation: Lvj(x,y) = 0,
All
(xk,yk)
must lie in
We distinguish:
vj.
We have
j
q _ 0
and
= 1(1)n, (x,y) E
r, i.e. nl = 0, n2 = n.
G.
16. Collocation methods and boundary integral methods
(B)
Interior collocation:
We have
j
vj (x, Y) = 0 , All
(xk,yk)
must lie in
i ° 0
319
and
1 (1) n, (x,y) c r.
=
G, i.e. nl = n, n2 = 0.
The system of equations (16.3) does not always have a unique solution. rarily large.
When it does, the solution can be arbit-
A priori conclusions about the error
u-w
only be drawn on the basis of very special hypotheses. is the weakness of collocation methods.
can
This
It is therefore
essential to estimate the error a posteriori.
Nevertheless,
collocation methods with a posteriori error estimation frequently are superior to all other methods with respect to effort and accuracy.
Error estimates can be carried out in the norm as explained in Section 14 (cf. Theorem 14.19).
However,
this seems unduly complicated in comparison with the simplicity of collocation methods.
Therefore, one usually premonotone principles.
fers to estimate errors with the aid of
We wish to explain these estimates for the cases of boundary collocation and interior collocation. c = u-w,
r = q-Lw,
To this end, let = iy-w.
Then we have Lc(x,y) = r(x,y),
(x,y)
c G
c(x,Y) = O(x,Y),
(x,Y)
c r.
(16.4)
(A)
For boundary collocation, we have
r(x,y) = 0.
It
follows from the maximum-minimum principle (cf. Theorem 12.3) that for all
(x,y) c G:
320
BOUNDARY VALUE PROBLEMS
II.
min max {q(x,y),0} < e(x,y) < (x,y)Er (x,y)er
Thus it suffices to derive estimates for . We assume that r
consists of only finitely many twice continuously dif-
ferentiable curves
rR.
Each arc then has a parametric
representation
t E [0,1].
(x,y) = [c1(t),F,2(t) ], We set
fi(t)
= wl(t),E2(t))
finitely many points h = 1/m.
tj
= jh,
and compute j
= 0(1)m, where
Then it is obviously true for all < h max - 2 te[0,1]
min j=0(1)m
for the
;
t e
m c 1N [0,1]
and that
'(t)
can be interpolated linearly between the points
tj.
The
interpolation error will be at most h2 4
max te[0,1]
"(t)
Combining this and letting dl = Zh
max tE[0,1)
we have, either for min
m(x,y) =
For small
or for
min
fi(t)
v = 2, that >
ta[0,1]
c(x,y) =
max
;(t) <
te[0,1]
(x,y)er91
max
h2
te[0,1]
v = 1
(x,y)erx
max
d2 =
h, coarse estimates for
min j=0(1)m
_0(t.-) -d
max ;(tj)+dv j=0(1)m or
4"
suffice.
Since v
(v)(t)
n
v
dtv vj(1(t),C2(t))
16. Collocation methods and b2undary integral methods
the quantities (B)
cj,
j
321
= l(l)n, are the deciding factors.
For interior collocation, we have
Lemma 13.18, there exists a
4(x,y) = 0.
u e C2(G,IR)
By
such that
Lw(x,y) > 1,
(x,y)
e G
W(x,Y) > 0,
(X,Y)
E G.
We set
max _I r(x,y) I
=
(X,Y) EG
II r II
and obtain
L(e+aw) (x,Y) > (e+Rw)(x,Y)
>
0,
(x, Y) E G
0,
(X,Y)
E F.
It follows from the monotone principle that (E+aW)(x,Y) > 0,
(x,y)
E
E(x,Y) > -aW(x,Y),
(x,y)
E G.
L(E-aw)(x,y) < 0,
(x,y)
e G
(e-aw)(x,Y) < 0,
(X,Y)
E
E(X,Y) < aw(x,Y),
(X,Y)
E
Analogously, one obtains
P
Combining this leads to
IIEII _ aIIWII, . Thus the computation of the error is reduced to a computation of n
max Iq(x,y) (x,y)EG
-
c.Lv.(x,y)I. j=1 >
II.
322
BOUNDARY VALUE PROBLEMS
We next want to consider three examples of boundary collocaIn all cases the differential equation will
tion in detail. be
ou(x,y) = 0. Example 16.5:
Let
We use the polar
be the unit circle.
G
coordinates
x = r cos t y = r sin t
r E [0,1], t
E
[0,2'R)
and let
vj(x,y) = rj-lexp[i(j-1)t],
j
= 1(l)n
h = 2,r/n (xk,yk) = (cos[(k-1)h], sin[(k-1)h]), Since the functions ents
cj
vj
k = l(l)n.
are complex-valued, the coeffici-
will also be complex.
Naturally, one can split the
entire system into real and imaginary parts.
Then this ap-
proach fits in with the general considerations above.
The
system of equations (16.3) can be solved with the aid of a fast Fourier transform Example 16.6:
Let
G
(cf. Stoer-Bulirsch 1980).
be the annulus
G = {(x,Y) E I R 2
For even
n = 2n-2
o
x2+y2 1
0 < rl <
<
r2}.
we set
vl(x,y) = log r
vj(x,y) = r3-nexp[i(j-n)t),
j = 2(1)n
h = 4n/n
(xk,yk) = (rlcos[(k-1)h], rlsin[(k-1)h]), k = 1(1)n-1
16. Collocation methods and boundary integral methods
(xk,yk) = (r2cos[(k-n)h), r2sin((k-n)h]), All functions fi(l)n
vj
k = n(l)n.
are bounded on the annulus.
For
j
=
they correspond to the basis functions of the previous One cannot dispense with the functions
example.
because the region is not simply connected.
cannot be approximated by
vn_1
323
v1
through
vn
vl,"',vn-1 through One com-
vn.
putes, e.g., that 12Tr
(vl(r2cos t, r2sin t) - vl(rlcos t, rlsin t)]dt n
r
r = 21log (7) > 0
2n
)I
[vj(r2cos t, r2sin t) - vj(rlcos t, rlsin t)]dt = 0,
f0
= 2(1)n.
j
This example shows that in each case a thorough theoretical examination of the problem is essential. Example 16.7: technology.
This example has its origins in nuclear It has to do with flow through a porous body.
Let the space coordinates be
(x,y,s).
pressure, u, does not depend on is given by
Au(x,y) = 0.
s.
= = {(x,Y) a IR
u,v = {(x,Y) r1
and
r2
a IR
2
We assume that the
Then a good approximation
Cylindrical channels are bored
through the body, parallel to the
Here
a
s-axis:
(x-2p)2 + (y-2v)2 < r2)
U,v c ZZ. 2
(x-2p-1)2 +
(y-2,v
_1)2 < r2}
are fixed numbers with
0 < r1 < 1 ,
0 < r2 < 1, rl+r2 < /.
Figure 16.8 depicts a section of this region for
r2 = 1/2.
r1 = 1/4
and
324
II.
Figure 16.8:
A region in Example 16.7
In each of the channels
while in each of the
I
Jµ2v
rl
there is a pressure of
µ ,v
there is a pressure of
flow thus goes from channels
monotonically with
BOUNDARY VALUE PROBLEMS
and
I
r2.
to channels
1,
-1.
The
and increases
J
Using the symmetries, one
can reduce the problem to a restricted region
G
Figure 16.9).
or
On the solid lines, u(x,y) =
1
on the dashed lines, the normal derivative of
u
(cf.
u(x,y) = -1; is zero.
In this form, the problem can be solved with a difference method.
The exact solution
problem is doubly periodic.
u(x,y)
of the boundary value
For if
(x,y)
lies in the region
between the channels, we have u(x+2,y) = u(x,y+2) = u(x,y).
Collocation methods and boundary integral methods
16.
Au
0
325
0
an =
u = x
au = 0
an
Figure 16.9:
A region in Example 16.7.
Therefore it is natural to approximate the simplest doubly periodic functions (cf.
with periods
functions, the Weierstrass P-
Magnus-Oberhettinger-Soni 1966, Ch. 10.5).
z = x+iy.
Let
with the help of
u
We denote the Weierstrass 2
and
2i
by
p(z).
P-function
The function is meromor-
phic, with a pole of second order at the points and with a zero of second order at
2u + 2vi
2u+l + (2v+l)i, where
u,v E. The poles and zeros thus are at the centers of the channels.
Therefore one can choose the basis functions
v. J
for the collocation method from the following set: 1,
log1p(z)I,
Re[p(z)J], Im(p(z)J],
j
ell
Because of all the symmetries, the set 1,
suffices.
loglp(z)21, Re [P(z)2j]
We use the trial function
w(x,Y) = Y loglp(z)2I +
c.Re(p(z)2J)
j=-R
J
-
{o}.
326
BOUNDARY VALUE PROBLEMS
II.
and must determine the
n = 29+2
unknowns
Y, c-9' C_ X+11- .'C L'
n collocation points; k+l
To do this, we use the ary of
Io o
and
on the boundary of
k+l
J
on the bound-
oo, '
rlexp((k-1))
for
k = 1(1)i+l
l+i+r2exp(1(k-k-2))
for
k = 9+2(1)n.
xk+iyk =
The linear system of equations is 1
for
k = 1(1)9+1
-1
for
k = 9+2(1)n.
_
w(xk,yk)
Table 16.10 gives the values of of
r1
and
r2
10-3/10-4/10-5.
9
for several combinations
necessary to obtain a precision of When the difference
/2-rl-r2
is not too
small, one can obtain relatively high precision even with small
L.
The flow depends only on
the computed values
y
for
y.
i = 1,3,5,7.
Table 16.11 contains
The computing time
to obtain the solution of the linear system is small in all of these cases compared to the effort required to estimate the error.
The efficiency of a collocation method usually is
highly dependent on the choice of collocation points.
The
situation is reminiscent of polynomial interpolation or numerical quadrature.
Only there exists much less research into
the optimal choice of support points for collocation methods. The least squares method is a modification of the collocation method in which the choice of collocation points is not quite as critical.
basis functions
vi
In these procedures, one chooses and
n > m
collocation points
m (xk,yk)'
16.
Collocation methods and boundary integral methods
TABLE 16.10:
r2
r
R1/R2/R3
for accuracies of
1/8
3/8
5/8
7/8
10-3/10-4/10-5
1
1/8
1/1/1
1/1/2
2/2/3
3/4/5
3/8
1/1/2
1/1/2
2/2/3
4/6/(9?)
5/8
2/2/3
2/2/3
3/4/(9?)
7/8
3/4/5
4/6/(9?)
TABLE 16.11:
r2
r
1/8
y
for
k = 1,3,5,7
3/8
5/8
7/8
1
1/8
0.13823 0.13823 0.13823 0.13823
0.19853 0.19853 0.19853 0.19853
0.25011 0.25003 0.25003 0.25003
0.32128 0.31853 0.31852 0.31852
3/8
0.19853 0.19853 0.19853 0.19853
0.35218 0.35217 0.35217 0.35217
0.55542 0.55495 0.55495 0.55495
1.10252 1.06622 1.06599 1.06599
5/8
0.25011 0.25003 0.25003 0.25003
0.55542 0.55495 0.55495 0.55495
1.34168 1.32209 1.32207 1.32208
7/8
0.32128 0.31853 0.31852 0.31852
1.10252 1.06622 1.06599 1.06599
327
328
II.
of these are to lie in
Again, n1
BOUNDARY VALUE PROBLEMS
n2
G, and
in
r.
Condi-
tion (16.2) is replaced by: 11
6k(Lw(xk,Yk)
-
q(xk,Yk)l2
k=1
(16.12) n
6k(w(xk2Yk)
+
-
1p(xk,Yk)l2 = Mini
k=n1+1
Here the
6k >
are given weights and
0
m
w(x,y) = jIlcjvj(x,Y) Because of these conditions, the coefficients
cj,
j
= 1(1)m,
can be computed as usual with balancing calculations (cf. StoerBulirsch 1980, Chapter 4.8).
Only with an explicit case at
hand is it possible to decide if the additional effort (relative to simple collocation) is worthwhile.
For
n = m, one
simply obtains the old procedure.
Occasionally there have been attempts to replace condition (16.12) with max{
6kILw(xk,Yk) max k=1(1)n1
- q(xk,Yk)I,
max 6klw(xk,Yk) k=n1+1(1)n
- *(xk,Yk)I} = Min!
(minimization in the Chebyshev sense).
Experience has demon-
strated that this increases the computational effort tremenConsequently, any advantages with respect to the pre-
dously.
cision attainable become relatively minor.
We next discuss a boundary integral method for solving Problem (16.1), with region
G
unit disk
L = A, q = 0, and
0 e C1(r,IR).
The
is to be a simply-connected subset of the closed IzI
<
1,
z
e 4, with a continuously differentiable
16. Collocation methods and boundary integral methods
boundary
r.
329
The procedure we are about to describe repre-
sents only one of several possibilities.
be a parametrization of
C e Cl([0,2T],r)
Let
without double points and with
(O) =
and
C(27T)
r
i1 +
2 >
Consider the trial function
(2n u(z) =
z
J
(16.13)
c G.
0
If
p
is continuous, u c C0(G,IR)
(cf. e.g. Kellog 1929).
By differentiating, one shows in addition that monic in
G.
is har-
u
The boundary condition yields 2n
p(t)logjz-C(t)jdt = (z),
z
e F.
(16.14)
0
This is a linear Fredholm integral equation of the first kind with a weakly singular kernel. determined solution
p
There exists a uniquely The numeri-
(cf. e.g. Jaswon 1963).
cal method uses (16.14) to obtain first an approximation of
p
at the discrete points
tj
= 27r(j-1)/n,
j
Next (16.13) is used to obtain an approximation u(z)
for arbitrary
z
u
= 1(1)n. u(z)
of
c G.
The algorithm can be split into two parts, one dependent only on
r
and
E, and the other only on .
(A)
Boundary dependent part:
(1)
Computation of the weight matrix
W = (wjk)
quadrature formulas JTrf(t)loglzj-C(t)Idt °
zj = C(tj),
= k11wjkf(tk) + R(f)
j = 1(1)n
for
n
0.
330
BOUNDARY VALUE PROBLEMS
II.
R(fv) = 0
The matrix
fv(t) =
for
11
v = 1
cos(2 t)
v = 2(2)n
sin(t)
v = 3(2)n.
Therefore
is regular.
(fv(tj))
W
is uniquely
determined.
Most of the computation is devoted to determin-
ing the
integrals
n2 f2s
fv(t)log;z;-E(t)ldt,
v,j = 1(1)n.
1-
Triangulation of
(2)
algorithm or into
W
into
W = QR
W = LU
using the Gauss
using the Householder transforma-
tions. (B)
Boundary value dependent part:
(1)
Computation of
u(tk)
from the system of equations
n
wjku(tk) = (zj),
kIl
Since
W = LU
W = QR, only
or
j
O(n2)
= l(1)n.
operations are re-
quired for this.
Computation of
(2)
u(z)
integrand is a continuous
for
z e G
from (16.13).
2n-periodic function.
The
It seems
natural to use a simple inscribed trapezoid rule with partition points
tj,
j
u(z) =
= 1(1)n:
2n
(16.15)
1u(tk)log1z-E(tk)I.
k= If
z
does not lie in the vicinity of
yields good approximations for For boundary-close
z,
r,
(16.15) actually
u(z).
-loglz - g(t)I
extremely large on a small part of the interval (16.15) is useless.
becomes [0,27T].
Then
The following procedure improves the re-
sults by several decimal places in many cases.
But even this
16. Collocation methods and boundary integral methods
331
approach fails when the distances from the boundary are very small. Let
A(t)
boundary values uc(z) = c +
be that function Then, for
i ° 1.
which results from
u(t)
c e1R,
n
2n
(16.16)
I
k=1
are also approximations to
u(z).
It is best to choose
c
so that u(tR)
whenever a(t)
- ca(tR) = 0
is minimal.
Since the computation of
can proceed independently of the boundary values ,
the effort in (16.15) is about the same as in (16.16). each functional value operations.
one needs
u(z)
0(n)
For
arithmetic
The method is thus economical when only a few
functional values
are to be computed.
u(z)
In the following example, we present some numerical results: ,P(z)
= Re[exp(z)] = exp(x)cos(y)
al(t) = 0.2 cos(t) + 0.3 cos(2t)
-
0.3
E2(t) = 0.7[0.5 sin(t-0.2)+0.2 sin(2t)-0.l sin(4t)] + 0.1.
The region in question is the asymmetrically concave one shown in Figure 16.17.
The approximation
u
was computed on
the rays 1, 2, and 3 leading from the origin to the points E(0), l;(n), and
points.
E(5ii/3).
R
is the distance to the named
Table 16.18 contains the absolute error resulting
from the use of formula (16.15) (without boundary correc-
tion); Table 16.19 gives the corresponding values obtained
332
II.
BOUNDARY VALUE PROBLEMS
from formula (16.16) (with boundary correction).
We note
that the method has no definitive convergence order.
FIgure 16.17.
Asymetrically concave region
n
n
1.9E-2 3.3E-3
8.3E-4
2.SE-3
1.9E-7
1.1E-10
96
5.0E-10 2.2E-7
3.3E-6
9.8E-5
2.4E-5
1.1E-6
1.7E-12
4.7E-12
1.3E-12
4.0E-5
4.4E-8
2.6E-7
1.2E-7
2.8E-4
9.6E-4
2.0E-3 3.4E-4
5.1E-4
4.6E-3
7.4E-3 1.3E-2
9.4E-3 7.5E-5
1/128 1/32
2.4E-5
1/8
1.4E-2
1/128
1.SE-4
1/32
Ray 3
2.1E-2
1/8
Ray 2
Absolute error when computing with boundary correction
3.SE-6
4.6E-12
96
2.4E-9
5.4E-7
48
4.0E-4
4.7E-3
4.3E-5
1.6E-4
1.1E-4
24
1/128
1.9E-6
3.9E-3
3.1E-3
12
Ray 1
1/32
TABLE 16.19:
2.4E-12
Absolute error when computing without boundary correction
1/8
R
TABLE 16.18:
6.7E-2
1.4E-4
2.2E-3
1.9E-7
7.0E-6
1.8E-2
3.0E-5
5.5E-7
48
1.0E-6
1.1E-2
S.SE-3
1.9E-1
6.9E-2
2.5E-2
7.0E-3
2.2E-4
1.3E-2
1.2E-4
8.1E-2
4.3E-3
7.SE-4 2.2E-5
9.8E-3
2.8E-1
5.5E-2
2.6E-2
24
1/128
12
1/32
1/8
1/128
1/32
1/8
1/128
1/32
R
1/8
Ray 3
Ray 2
Ray 1
U4 LA
w
PART III. SOLVING SYSTEMS OF EQUATIONS
17.
Iterative methods for solving systems of linear and nonlinear equations When we discretize boundary value problems for linear
(nonlinear) elliptic differential equations, we usually ob-
tain systems of linear (nonlinear) equations with a great many unknowns.
The same holds true for the implicit discreti-
zation of initial boundary value problems for parabolic differential equations.
For all practical purposes, the utility
of such a discretization is highly dependent on the effectiveness of the methods for solving systems of equations. In the case of systems of linear equations, one distinguishes between direct and iterative methods.
Aside from
rounding errors, the direct methods lead to an exact solution in finitely many steps (e.g. Gauss algorithm, Cholesky method, reduction method).
Iterative methods construct a
sequence of approximations, which converge to the exact solution (e.g. total step method, single step method, overrelaxation method).
These are ordinarily much simpler to
program than the direct methods.
334
In addition, rounding errors
17.
335
Iterative methods
play almost no role.
However, in contrast to direct methods
fitted to the problem (e.g. reduction methods), they require so much computing time that their use can only be defended when the demands for precision are quite modest.
When using
direct methods, one must remain alert to the fact that minimally different variants of a method can have entirely different susceptibilities to rounding errors.
We have only iterative methods for solving systems of non-linear equations.
Newton's method (together with a few
variants) occupies a special position. only a few iterations.
It usually requires
At each stage, we have to solve a
system of linear equations.
Experience shows that a quick
direct method for solving the linear system is a. necessary adjunct to Newton's method.
An iterative method for solving
the linear equations arising in a Newton's method is not to be recommended.
It is preferable instead to apply an itera-
tive method directly to the original non-linear system.
The
Newton's method/direct method combination stands to nonlinear'systems as direct methods to linear systems.
However,
the application is limited by the fact that frequently the linear systems arising at the steps of Newton's method are too complicated for the fast direct methods.
This section will serve as an introduction to the general theory of nonlinear iterative methods.
A complete treat-
ment may be found, e.g., in Ortega-Rheinboldt 1970. In the following two sections, we examine overrelaxation methods (SOR) for systems of linear and nonlinear equations.
After that, we consider direct methods. Let
F : G c
to find a zero
1n y 1n
x* e G
of
be a continuous function.
We want
F, i.e. a solution of the equation
336
SOLVING SYSTE'1S OF EQUATIONS
111.
F(x)
lying in
=
(17.1)
0
In functional analysis, one obtains a number of
G.
sufficient conditions for the existence of such a zero. Therefore, we will frequently assume that a zero
x* E G
exists, and that there exists a neighborhood of F
We further demand that
has no other zeros.
in which
x* G
be an
open set.
Iterative methods for determining a zero of
F
are
based on a reformulation of (17.1) as an equivalent fixed point problem, x = T(x),
so that
x*
point of
is a zero of
T.
T(x(v-1)),
=
One expects the sequence if the initial point
proximation to case.
exactly when
F
is a fixed
x*
Then we set up the following iteration: x(")
x*
(17.2)
x*.
{x(v)
x(0)
I
v = 1(1)-. v = 0(1)oo}
(17.3)
to converge to
is a sufficiently close ap-
But this is by no means true in every
In addition to the question of convergence of the
sequence, we naturally must give due consideration to the speed of the convergence, and to the simplicity, or lack thereof, of computing
T.
Before we begin a closer theoreti-
cal examination of these matters, we want to transform Equation (17.1) into the equivalent fixed point problem for a special case which frequently arises in practice. Suppose that the mapping
Example 17.4:
into a sum, F(x) = R(x) + S(x), in which dependent on
x
and
R
can be split
F S
is only "weakly"
is constructively invertible.
By
the latter we mean that there exists an algorithm which is
337
Iterative methods
17.
realizable with respect to computing time, memory storage demand, and rounding error sensitivity, and for which the R(y) = b
equation
neighborhood of R
can be solved for all
-S(x*).
in a certain
b
Such is the case, for example, when
is a linear map given by a nonsingular diagonal matrix or
by a tridiagonal symmetric and positive definite matrix. we set
then equation
T = R- lo(-S)
to the fixed point problem fore
F(x) = 0
x = T(x).
When
is equivalent S, and there-
also, depends only weakly on the point, one can ex-
T
pect the iterative method (17.3) to converge to
x*
ficiently close approximations
o
Definition 1 7 . 5 :
Let
x(0)
T : G c IRn -' 1R n
a fixed point of
x* e G
sequence (17.3) for
x(0)
The fixed point
x*.
an interior point of
x*.
I(T,x*)
of
II
II
T
in
1R'
A point
y e G
x*, if the and converges
G
is called attractive if it is
x*
The iteration (17.3) is
I(T,x*). x*
The mapping
is attractive. a e
is called contracting if there exists an
norm
for suf-
be a mapping and
remains in
= y
called locally convergent if T
of
T, i.e. T(x*) = x*.
belongs to the attractive region
to
If
[0,1)
and a
such that
IIT(x) - T(Y)IIT < allx-YIIT
x,y e G.
0
Every contraction mapping is obviously continuous. Theorem 17.6:
Let
T:G c IRn
-
iRn
be a contraction mapping.
Then it is true that: (1)
T
has at most one fixed point
x* c G.
x*
is
attractive. (2)
x*.
In case
G =]R
,
there is exactly one fixed point
n Its attractive region is all of ]R.
338
SOLVING SYSTEMS OF EQUATIONS
III.
Proof of (1): Since
Let
and
x*
be two fixed points of
y*
is contracting, there is an
T
a e
T.
such that
[0,1)
IIx* Y*IIT = IIT(x*) T(y*)IIT < allx*-y*IIT. It follows that
x* = y*.
We now choose
r eIR+
so small
that the closed ball KT
lies entirely in
r = {y e Itn
I
IIx*-yIIT < r}
It follows for all
G.
z
e KT r
that
IIT(z)-T(x*)IIT < allz-x*IIT < r. Therefore
T
maps the ball x(v)
is defined for
KT r
T(x(v-1)),
and satisfies the inequality
IIx(v)-x*IIT < avllx(0)-x*IIT < avr, Ix(v)
It follows that the sequence to
I
v = 1(1)00.
v = 0(1)00}
converges
x*.
Proof of (2): T
The sequence
v = 1(1)00
=
a KT r
x(0)
into itself.
Let
x(0) e]Rn
is contracting there is an
v = 0(1)00
be chosen arbitrarily. a e
Since
so that for
[0,1)
it is true that
Iix(v+l) -x(v)IIT = IIT(x(v))
-T(x(v-1)
allx(v)-x(v-1)IIT <
)IIT
avllx(1)-x(0)II T'
From this it follows with the help of the triangle inequality that for all
v,p E IN,
17.
339
Iterative methods
v+
Ilx(v+u)-x(,))IIT
K=V
K=V
laK)ilx(1)-x(0)1IT
lva lix(1) _x(0)iiT This says that
(x(v)
v = 0(1)oo}
I
Its limit value we denote by
is a Cauchy sequence.
Since every contraction
x*.
mapping is continuous, it follows that x* = lim x(v) = lim T(x(v)) = T(x*).
Therefore
x*
is a fixed point of
region of
x*
is all of
Theorem 17.7: let
A
Let
a
IRn,
be a real
be an affine
T(x) = Ax + b
The attractive
T.
n x n
matrix, b eIRn
mapping.
tracting if and only if the spectral radius than 1.
In that case, T
attractive region Proof:
Then p(A)
T
is con-
is less
has exactly one fixed point. is all of
I(T,x*)
and
Its
IRn.
The conclusion follows at once from Lemma 9.17 and
Theorem 17.6.
a
Theorem 17.8:
Let
fixed point
T:G c]Rn -IRn
x* a G.
Let
T
be a map which has one
be differentiable at
x*.
Then
p(T'(x*)) < 1,
implies that
x*
is an attractive fixed point.
The proof is obtained by specializing the following theorem and making use of Lemma 9.17.
Theorem 17.8 says that the local convergence of an iterative method (17.3) for a differentiable map depends in a simple way on the derivative.
Thus differentiable nonlinear
340
SOLVING SYSTEMS OF EQUATIONS
III.
maps behave like linear maps with respect to local converThis conclusion can be extended to nondifferentiable
gence.
maps which are piecewise differentiable. Theorem 17.9:
T:G c]Rn {1Rn
Let
fixed point
x* a G.
there is an
m e N
Let
be continuous at
T
which are differentiable at there is an
s
r = l(1)m,
Suppose that for each
x*.
e {1,...,m}
such that
T(x) = T5(x).
Suppose further that there is a vector norm the corresponding matrix norm
r = 1(1)m.
is an attractive fixed point.
x*
Since
Proof:
are continuous at
Tr
and
T
r e {l,...,m}
for each
T
x*, we have,
the alternative
(1)
T(x*) = Tr(x*), or
(2)
There exists a neighborhood
and
for which
satisfies
IITT(x*)IIT < 1, Then
Suppose
x*.
and maps
Tr :G c IRn 3 ]R"
x e G
be a map which has one
Ur
of
x*
in which
never agree.
Tr
Since we are only interested in the local behavior of may disregard all
r
T, we
for which statement (2) is true.
There-
fore, without loss of generality, we may suppose that statement (1) is true for all Since the maps exists a with
>
6
IIYIIT <
0 6
r.
Tr
are differentiable at
for every and all
c >
0
such that, for all
r E {1,...,m}
IITr(x*+y) x* Tr(x*)YIIT
x*, there
it is true that
EIIYIIT.
y
341
Iterative methods
17.
It follows for
r = 1(1)m
that
IITr(x*+Y)-x*IIT _ (1ITr(x*)IIT+E) IIYIIT. Now we may choose
so small that it is true for all
c
r
that
11 T r' (x *)IIT + c< Y< 1. For every initial vector
satisfying
x(o)
IIx(0) -x*IIT <
6
it then follows that
Iix(")-x*IIT Therefore
V = l(1)m.
Yv
is an attractive fixed point.
x*
o
In addition to the previously considered single step method T(x(v-1))
x(v)
=
practical application also make use of two step methods (or multistep methods) X(V)
=
T(x(v 1),x(v 2)).
These do not lead to any new theory, since one can define a
mapping
T: IR2n ; R2n by setting xl T
T(xl,x2) =
x2
xl
which results in the single step method
x(v)
T
=
T(x(v 1)).
1x(v) x(v)
x(v-l)
is then significant for convergence questions.
Of course
342
III.
SOLVING SYSTEMS OF EQUATIONS
this transformation is advisable only for theoretical considerations.
We are now ready to apply the theorems at hand to Newton's method. Lemma 17.10: zero at
We start with a lemma to help us along.
be a mapping which has a
and is differentiable at
x* e G
mapping from
F:G c]Rn +IRn
Let
J
be a
which is continuous at
MAT(n,n,IR)
to
G
Let
x*.
x*.
Then the mapping
T(x) = x - J(x)F(x) is differentiable at
x*, with Jacobian matrix
T'(x*) = Proof: all
For every
y,z a Rn
e
> 0
satisfying
I
- J(x*)F'(x*).
there exists a IIY112
6
>
0
so that for
< d, it is true that
IIF(x*+Y)-F(x*)-F' (x*)YII2 =IIF(x*+Y)-F'(x*)Y112 <_ e11Y112 II [J (x*+Y) - J(x*) 12112_ all 2112 . This leads to the inequalities IT(x*+y)-T(x*)-[I-J(x*)F'(x*))Y112
= IIJ(x*+y)F(x*+Y) -J(x*)F' (x*)YII2 < II [J(x*+Y)-J(x*))F(x*+Y)II2 + IIJ (x*) [F(x*+Y)-F' (x*)Y1112 _ EIIF(x*+Y)112 + IIJ(x*)112.E:IIAl2 _<
C11Y112 (e+11 F' (X*) 112
+ IIJ(x*)112 ) Example 17.11:
Newton's method and variations.
is to find a zero F
x*
of the mapping
F:G c]Rn
The problem IR
n, where
is continuously differentiable in a neighborhood of
x*
17.
Iterative methods
343
and has a regular Jacobian matrix there.
Then the basic fixed
point problem underlying Newton's method is: x = T(x) = x - J(x)F(x),
By Lemma 17.10, T
where
J(x) = F'(x)-1.
is differentiable at
and has Jacobian
x*
T'(x*) = I-J(x*)F'(x*) = I-F'(x*)-1F'(x*) = 0. This means that
p(T'(x*)) =
0.
By Theorem 17.8, Newton's
method converges for all initial values which lie sufficiently close to
x*.
Theorem 17.8 and Lemma 17.10 also establish that the fixed point
x*
remains attractive when
is not the
J(x)
inverse of the Jacobian, but is merely an approximation thereto, since local convergence only demands p(T'(x*)) = p(I-J(x*)F'(x*)) < 1.
This is of considerable practical significance, since frequently considerable effort would be required to determine the Jacobian and its inverse exactly.
It is also noteworthy
that, by Lemma 17.10, it is not necessary for It suffices to have
be differentiable. x*.
F
J
itself to
differentiable at
The following computation establishes how far
deviate from the inverse of the Jacobian.
be a perturbation matrix and let
Thus we let
J(x) = C[F'(x))-1.
may
J(x) C
Then by
Lemma 17.10 we have
T'(x*) = I-J(x*)F'(x*) = I-C[F'(x*)]-IF'(x*) = I-C. By Theorem 17.8, the iteration converges locally for p(I-C) < 1. for
For the special case
A e (0,2).
o
C = XI, we have convergence
344
SOLVING SYSTEMS OF EQUATIONS
III.
The following two theorems will give a more precise concept of the attractive regions for Newton's method and for a simplified method.
We suppose we are given the following
situation:
G cIRn
X(0)
convex,
K = {x a JR"
IIx-x(0)II < ro},
I
F c C1(G,IRn), IIA-111,
a=
E G
A = F'(x(0))
KeG regular
n = IIA-1F(x(O))II
Newton-Kantorovich.
Theorem 17.12: Hypotheses:
(a)
IIF' (x) -F' (y) II < YII x-y II
(b)
0 < a = 8Yn < 1/2
(c)
rl = 2n/(1+V-1-2a) < ro.
X,y e G
Conclusions: (1)
The sequence x(v+l)
remains in
x(v)
F'(x(v))-1F(x(v)), -
and converges to
K
(2)
IIx(v)-x*II <
(3)
x*
K2 = G If
=
x* e K.
r12-v(laa)(2v-l)
is the only zero of fl
{x eJRn
I
v = 0(1)m
F
11x-x(0)II < r2
v = 0(1)°x. in
(1+)/(BY)}.
a << 1/2, the sequence converges very quickly, by (2).
In a practical application of the method, after a few steps there will be only random changes per step.
These arise
because of inevitable rounding errors.
The theorem permits
an estimate on the error
For this one takes
IIx*
-
x(v)II.
17.
X(V)
Iterative methods
34S
as the initial value
computes upper bounds for error
IIx*
For
-
y,a,n
is at most
x(0)II
for a new iteration, and
x(0)
and
For
rl = 2n/(l +
a < 1/2, the ).
a = 1/2, it is possible that convergence is only
The following example in
linear.
a.
II21
shows that this case
can actually occur: 2
f(x) _
-
B+X-
YYX
,
n > 0, a > 0,
y> 0
We have If'(x)-f'(Y)I = Ylx-y1
1/If'(o)I = a If(o)1/1f'(o)1 = n. For
a < 1/2, f
and
r2 = 2n/(1-/1--2a).
When
a > 1/2, f
has two different real zeros, r1 = 2n/(l+/) When
a = 1/2, they become the same.
has no real zeros (see Figure 17.13).
The example is so chosen that convergence of Newton's method is worse for no other
f.
The proof of Theorem 17.12 is
grounded on this idea. f
Figure 17.13.
Typical graph for
a < 1/2.
346
III.
F'(x(v)) F'(x*)
SOLVING SYSTEMS OF EQUATIONS
is always regular.
However, for
a = 1/2,
can be singular (as in our example).
We will use the following three lemmas in the proof of Theorem 17.12.
In addition to assuming all the hypotheses of
Theorem 17.12, we also make the definitions:
AV = F' (x(v)).
Sv = IIAv1II ,
av = SvnvY,
nv = IIAv1F(x(v))II ,
PV = 2nv/(1+Vl Za
V).
Naturally these definitions only make sense if we also assume E G, AV
x(v)
is regular, and
av < 1/2.
Therefore we will
restrict ourselves temporarily to the set v >
M
for which these hypotheses are true.
0
of integers At this point it
is not at all clear that there are any integers besides which belong to
M.
However, it will later turn out that
contains all the positive integers. Lemma 17.14:
If
$v+l
v/(1-av).
Proof:
Since
x(v+l) E G, then
YIIx(v)-x(v+l)II
IIAv-Av+1II
Av+l
is regular and
= Yn v
we have IIAvl(Av-Av+1)II < av < 1/2.
Therefore, we have convergence for the series [AV'(Av-Av+1)]u.
S = u=0
We have [I-Avl(Av-Av+1)]S = I,
S[I-AV1(Av-Av+1)] = I.
0
M
347
Iterative methods
17.
The matrix inside the square brackets is therefore regular, and its inverse is
But then
S.
Av+1 = Av[I-AV'(Av-Av+1)) is also regular.
For the norm of the inverses we have the
inequalities
0,+1 <_ Lemma 17.15:
IISII
A-
'
E a11av = 0V/(1 -av) .
<
V
u=o
x(v+l) c G, then
If
We have shown above that
Proof:
IIAv+1F(x(v+l)
nv+l °
IIA-IF(x(v+l)
nv+1 < Znvav/(1-av
< IIsII '
)II
= SA 1
Av-+1
and
IIA-1F(x(v+l) )AI
V
)11/(I-av).
It remains to show that IIA-1F(x(v+I))II
- 1 avnv'
V
(1-t)x(v)+tx(v+1),
fi(t)
=
R(t) = A-1F($(t)) + (1-t)(x(v+1)-x(u)) Since
G
is convex, fi(t)
remains in
G.
We clearly obtain
the following: x(v+I)
$(0) = x(v)
O(1)
R(0) = 0,
R(1) = A-1F(x(v+1)),
=
'(t) = x(v+1)-x(v), R'(t) _ [A-1F'(c(t))-I](x(v+1)-x(v)), Since
R'(0) = 0.
348
SOLVING SYSTEMS OF EQUATIONS
III.
IIF'(0(t))-AvII ° IIF'(O(t))-F'(a(o))II IIF'(0(t))-A'vII
Ytllx(v+l)-x(v)II
r1
YIIm(t)-x(v)II
=
we obtain the following estimate for
IIR' (t)II
IIR'(t)II:
0 ytllx(v+1)-x(v)II2 = svyn2t = avnvt.
<
From this we obtain the desired inequality,
IIAv F(x (v+1)
1
)II
= IIR(1)Il
=
II
R'(t)dtll <
avnv.
0
I
When
we set
x(v+l) a G, 6 =
(cf. Lemma 17.14)
> Sv+l
6v/(l-av)
a = SYnv+l > av+1 Then the last lemma implies that
av+1 < a <
a
1
2
= < av < 1/2.
(1 V)
It follows that if Lemma 17.16: Proof:
v e M x(v+l)
If
and
x(v+l) c G, then
e G, then
nv + pv+l <- pv'
Let
1 1 a >
Y
pv+l
From the inequality 2
a
av
1
2
(1 av) 2
it follows that av
1-
1-,/l- 26 <
Since
1-av
v+l e M.
1-av
B(1-av) = sv, it further follows that
Iterative methods
17.
141-2a
349
aV
1-
<
sVY
SY
PV+l <
SVY
< PV - nV .
o
u < v
and
Proof of Theorem 17.12:
If
v e M, then by Lemma
17.16, v
T=p T + PV+l
Since
Pug
po = r1 < ro, we also have V
TI nT + PV+l
IIx(v+I)-X(0)11 _ It follows that 17.15, that
E IIx(T+I)-x(T)II < rI-PV+I
T=0
x(v+l) e K e G
AV+l
and, from Lemmas 17.14 and
is regular and
this implies that integers
ro
rl
Thus the set
v+l a M.
x(V+l) e K
for
Ilx(v+I)-x(°)II_ rl it follows that either all such that
contains all
and
nV # 0
aV
v = 0(1)m.
x(V+l) e K, or there is a first
x(u+2) =
=
This implies
x(v+3)
_
0, we also have
nV + Pv+l = nv < 2n,/(l+/1 « ) nv
PV
pv+1
V
TI°nT + Pv+l < Po = rl
IIx(v+I)
xvl
-
x(O)II
e K.
Since
Pv+l
pV+1 = 0, i.e. nV+1 = 0. x(v+l)
Since
M
v > 0.
Next we show that
v
Altogether,
aV+l < aV.
< rl
ro
.
350
SOLVING SYSTEMS OF EQUATIONS
III.
Hence in this case, too, all
x(v)
a K.
Next we establish the error estimate (2).
We begin
with the inequality 2
0'v
(12
av+l < 2
V
It implies that 1 2-4a v a
>
v+l
-
1
2
<
1-av+1
-
av.
From
n v+l
1-2av+(1-av)72
22-4 2
4
av
v-1
(1-a ao
T
.
(1-av-1) (2v)
C1-a)
1-av
2
av
_
aV
aV+l v+l
(1-a V) 2
av+1
V
a
1
< 2 nvav/(1-av)
(2v)
{1-a>
_
(cf. Lemma 17.15) we finally
obtain (2v+2v-1)
1 \1(2
nv+l - 2(laa/
nv
< 2- v
nv -
nv-1 < ..
(2-1) v
a
1-a
PV = 2nv/(i+
PV < r12-v(1aa)
The sequence
(10-10
x(v)
no
)
<
(2v-l)
therefore converges to
can be estimated by conclusion (2).
Since
continuous, it also follows that F(x*) = -F'(x*)(x* - x*) = 0.
x* a K. F
and
1fx(v)-x*IF F'
are
17.
Iterative methods
351
For the proof of conclusion (3) we refer to Ortega-Rheinboldt (1970).
C3
The Jacobian
F'(x(\))
step of Newton's method.
Frequently this computation in-
Therefore it may be advantageous
volves considerable effort. to replace
must be computed anew at each
by a fixed, nonsingular matrix
F'(x(v))
which does not differ too greatly from
F'(x(v)).
cedure is especially advantageous whenever algebraic structure than the Jacobians.
B
B
This pro-
has a simpler
It is conceivable
that linear systems of equations involving
B
may be solved
with the aid of special direct methods (e.g. the reduction method of Schr6der/Trottenberg or the Buneman algorithm), while the corresponding systems involving the Jacobians F'(x(v))
are too complicated for such an approach.
We describe this situation briefly:
G c Itn
convex,
K = {x c IRn
x(°)
cG
11x-x(O)II I
< ro},
F c C1(G, IRn), B E MAT(n,n, ]R) s = JIB-111 , d
= IIB -
KcG
regular
n = IIB-1F(x(0))II
F'(x(°))II
Theorem 17.17: Hypotheses:
(a)
IIF'(x)-F'(y)II
(b)
Sd < 1
(c)
a = any/(1-66)2 < 1/2
(d)
rl = 2n/[(1+ 1-2a)(1-sS)] < ro
yJJx-yII
x,y c G
352
SOLVING SYSTEMS OF EQUATIONS
III.
Conclusions: (1)
The sequence x(v+l)
remains in (2)
=
x(v)
B-1F(x(v)),
-
K and converges to
v = 0(l)-
x* e K.
It is true that II x
(v+2)
-x(v+1)11 < cli x(v+1) -x(v)11
where c = Bd +
(1-B6) < 1.
2
1+ 41 --2 o,
The theorem contains two interesting special cases: (A)
is an affine map, i.e.
F
y = a = 0, rl = n/(1-86)
and
c = 86. (B)
B = F'(x(0)),6 = 0, y > 0.
The conditions (a), (c), and
(d) are then precisely the hypotheses of the preceding theorem.
Our convergence conditions for the simplified method
thus are the same conditions as for Newton's method. Conclusion (2) of the theorem is of interest first of all for large
v.
better estimates.
For the first few iterations, there are Thus, we have
x(2)-x(1)I a(l-86)2.
a6 +
where For
c1Ix11)-x10) 11
2
a = 1/2, we have
c = 1, independently of
In
6.
fact in such cases the convergence of the method is almost arbitrarily bad.
This can be seen with an example from
Let
f(x) = x2,
x(0)
= 1,
B = f'(1) = 2.
The constants in this example are:
JRl.
17.
353
Iterative methods
R = 1/2,
n = 1/2,
y = 2,
a = 1/2.
6 = 0,
This leads to the sequence x(v+l) =
1(x(v))2
x(v)
=
2
x(v)(1
-
1 x(v)). 2
which converges to zero very slowly. In practice one can apply the method when and
a << 1/2.
In these cases,
c 1 R6 + a/2
Table 17.18 shows the effect of larger
a
a6 << 1
and
c - 66 + a.
or larger
$6.
TABLE 17.18
a
R6
C
R6+ T,
c
R6+a
1/4
1/2
0.531
0.625
0.646
0.750
1/4
1/4
0.320
0.375
0.470
0.500
1/4
0
0.125
0.125
0.293
0.250
1/8
1/2
0.516
0.563
0.567
0.625
1/8
1/4
0.285
0.313
0.350
0.375
1/8
0
0.063
0.063
0.134
0.125
1/16
1/2
0.508
0.531
0.532
0.563
1/16
1/4
0.268
0.281
0.298
0.313
1/16
0
0.031
0.031
0.065
0.063
The proof of Theorem 17.17 runs on a course roughly parallel to that of the proof of Theorem 17.12. based on three lemmas.
nV = JIB- 1F(x(V))II, av = Rynv/(1-8oV)2, v
runs through the set
which it is true that
Once again, it is
We make the following definitions:
6V = JIB -
F'(x(V))II.
PV = 2nV/[(1+
M x(v)
)(1 R6V)1
of all nonnegative integers for c G, Rdv < 1, and
av < 1/2.
354
SOLVING SYSTEMS OF EQUATIONS
III.
Lemma 17.19:
Let
x(v+l) c G
and
[Bdv + av(1-B6v)2]/B.
Then B'5v+l < Ba <
1
1-Bdv+l > 1-Bd = [1-av(1-66 v))(1-B6 v) > 0.
Proof:
We have
BSv+l = BIB-F' (x(v+l))II < B6v+B1!F'(x(v+l))-F' (x(v)) 11 BS v+1 < B6 v+3YIIx(v+1)-x(v)II = BSv+BYnv BSv+l < 66v+av(1-B6 v)2 = BS
2(1+6252)-(2 -av)(1-Bav)2 <
Z(1+a2&)
1.
<
From this it follows that 1-B6 v+l > 1-0 = [1-av(1-BS,))(1-BSv). Lemma 17.20:
x(v+l)
If
o
c G, then
av(1-66 )2]nv.
nv+l < [BSv + 2
Proof:
As in the proof of Lemma 17.15, for
t
e
[0,1]
4(t) = (1-t)x(v) + tx(v+l)
R(t) = B-IFWt)) + (1-t)(x(v+1)-x(v) It follows that x(v+1)
q(0) = x(v),
0(1) =
R(0) = 0,
R(l) = B-1F(x(v+l)),
V (t)
=
x(v+l)
-
x(v)
R'(t) = B[F'((t))-B ](x(v+1)
x(v))
By hypothesis (a) of Theorem 17.17 it follows that
we set
Iterative methods
17.
IIF'(O(t))-BII
355
IIF'(x(v))-BII
+
YIIm(t)-x(v) II
IIF'(a(t))-BII < 6v + Ynvt. Therefore we have
IIR'(t)II < a(6v + Ynvt)nv and finally
nv+l = IIR(1)II
=
1110 R'(t)dtII 1
,V(1-a6v)2Inv.
nv+l < (a6v + 2 aYnv)nv = [a6v + 2
With
aYnv+1(1-06)2 a > av+l
the last two lemmas yield v (1-a6)2 v
aYn[a6v + v <
[1-av(1-a6v)]2(1-66 v)
Since [1-av(1-a6v)]2 = 1-2a(l-a6v)[l> 1-(1-86,,)[1-
v(1-a6v v(1-a6v)]
a6v + -12av(1-a6v)2
we have av+l < a < av < 1/2.
It follows that when
v e M
and
x(v+l) c G, then also
V+l a M.
Lemma 17.21:
If
x(v+l)
a G, then
nv + Pv+l
<<
pv'
SOLVING SYSTEMS OF EQUATIONS
III.
356
Proof: Case 1:
nv = 0.
Case 2:
y = 0.
Then
x(v+l)
is an affine map.
F
av = av+l = 0,
Then
Therefore:
Pv+1 = nv+l/(1-Bdv+l) < vdvpv nv + Pv+1
av = BYnv/(1-Bdv)2 > 0. 2nv+l/[(1+/&)(1-6d))
P =
pv+l = PV = 0.
nv+1 < 06VnV,
6V = 6v+1'
PV = nv/(1-BdV),
Case 3:
and
x(v)
=
nv + 66VPV = PV. Let Sd)
=
ay
Lemma 17.20 implies that
p > PV+l.
nv+1 <- [66v + Zav(1-Bdv)2Inv.
Multiplying by
yields
By
a(l-Ba)2 <
[B6V + 2 av(1-B6v)2]aV(1-Bdv)2.
We have 1 Bd =
[1 av(1 Bdv)](1 66v)
and therefore,
a[1-av(1-66V)]2 2136
2a <
V
[Bdv +
<
l-1
(1-S6v)2]av
+a (1-136 v)2 v a v
[1-av(1-B6 V)]
1-av(1-B6v)-
1
BS BY
1-av(1-B6v) By
(l-66v)
yields
(1-y)(1-4) BY
<
-
1-aV(1-Bdv) By
41-77 (1-Bdv).
17.
Iterative methods
357
The left side is a different representation of
p.
There-
fore, we have shown that 1-av(1-66v)-
Now the right side of the inequality is equal to Proof of Theorem 17.17:
If
v e M, then
nv < PV
pv - nu.
and by
Lemma 17.21, v-1
uI0
nu+pv < po = r1 < ro
This implies that
x(v+l) a K.
iix(v+l) -x(0)Il < r 0 , The lemmas also imply that fore contains all The sequence
v > 0.
{x(v)
I
86v+1 < 1, av+l < 1/2. x(v)
remains in
v = 0(1)OD}
M
there-
K.
has at most one limit point
x* a K, since
Ix(u+l)
E
u=0 is bounded above by mate (2).
v = 0(1)-
n
u=0 r0.
It remains to prove the error esti-
By Lemma 17.20, we have
nv+l ` [06v + 2
V(1-86V1
2
In
or
Iix(v+2)-x(v+l)II < (86v We can get a bound on condition for
86v
+ ?av(1-86v)2]IIx(v+1)-x(v) with the aid of the Lipschitz
F':
86v < 060 +BIIF'(x(V))-F1(x(0)A < 860+8YIIx(v)-x(0)11 It follows from
358
III.
v-l u=o
x(u+1)-x(0)II + p
v-l
n +p
SOLVING SYSTEMS OF EQUATIONS
u=o
u
1
that
$6v < B6o + Syr1-SYpv < B6o + BYrI-Syn
(1+vrl
$6V < B6 +
)(1-86)]-av(1-B6v)2 2a
66v + 2 v(1-Sdv)2 < B6 +
(1-8s) = c.
o
1+ vrl-- -2a
We follow this discussion of Newton's method with a definition of generalized single step methods.
The starting
point is an arbitrary iterative method x(v)
T(x(v-1))
=
In an actual computation, the components of
x(v)
x(v-1).
xiv), i = 1(1)n,
are computed sequentially from the components of Therefore it is advantageous, in looking at the
right side of the equation, to
use those components of
x(v)
which are already known, instead of the corresponding components of
x(v-1).
In practice this means that the compon-
immediately replace those of
ents of
x(v)
memory.
This not only saves memory, but simplifies the pro-
gram.
x(v-1)
in
In many important cases (see Varga, 1962, Ch. 3), the
new method converges better than the original method.
The
new method is called a single step method in contrast to the original total step method.
By defining a modified operator
T, one can again regard a single step method as a total step method.
Before defining the operator important special case.
T, we will look at an
17.
Iterative methods
Example 17.22:
359
Let
T(x) = (L+U)x + b where
b e]Rn, L
is a strictly lower
n x n triangular
matrix (diagonal identically zero), and upper
is a strictly
U
n x n triangular matrix (diagonal identically zero).
The corresponding total step method x(v)
=
T(x(v-1))
(L+U)x(v-1) + b
=
is also known as the Jacobi method, and by Theorem 17.7, converges for
p(L+U)
<
In view of our discussion above,
1.
the single step method may be characterized by the rule x(v)
=
Lx(v) + Ux(v-1) + b.
This is also called the Gauss-Seidel method.
We can again
rewrite it formally as a total step method: x(v)
(I-L)-1(Ux(v-1)+b). =
By Theorem 17.7, the method converges for
p((I-L)-l U)
<
1.
In the following formulation the similarity between the two methods will become clearer. Jacobi method: x(v)
x(v-1) =
[(I-L-U)x(v-1)-b].
-
Gauss-Seidel method: x(v)
Definition 17.23:
(I-L)-1[(I-L-U)x(v-1)-b].
x(v-1)
=
Let
-
T:
G c ]n +]Rn
a
be the fixed point
operator of some total step method, with component mappings ti(YI.Y2....PYn),
i = 1(1)n.
360
III.
We define the components
ti
SOLVING SYSTEMS OF EQUATIONS
of a mapping
-]R'
T:G
IRn
recursively by the rule ti(yl,y2,.... yn) f
= ti(wl,w2,...,wn)
tj(yl)y2,.... yn)
for
yj
otherwise.
<
j
i
wj
Then
x(v)
T(x(v-1)), v = l(l)m
=
method corresponding to x(v)
T.
=
defines the single step
We frequently use the notation
T(x(v-1)/x(v)).
This is to be interpreted as saying that puted with the aid of the mapping
from
T
is to be com-
x(v)
x(v-1);
however,
insofar as they are already known, the components of are to be used in the computation.
x(v)
o
The following theorem focuses on some significant connections between total and single step methods. Theorem 17.24: Let
x*
Let
T
and
be a fixed point of
be as in Definition 17.23.
T T.
Then the following are true:
(1)
x*
is a fixed point of
(2)
If
T
continuous at (3)
is continuous at
T.
x*, then
is also
T
x*.
If
T
differentiable at
is differentiable at x*.
x*, then
Let the Jacobian of
T
at
is also
T
x*
be
partitioned as follows:
T' (x*) = D - R - S where
D =
(dij)
is a diagonal matrix, R = (rij)
strictly lower triangular matrix, and strictly upper triangular matrix.
S = (sij)
is a is a
Then the Jacobian of
T
17.
at
361
Iterative methods
x*
can be decomposed as follows: T'(x*) = (I+R)-I(D-S).
Proof:
Conclusions (1) and (2) follow immediately from the
definition of
By the recursive definition of
T.
ti, we
have that at the fixed point, i-1 a .t. (x*) =
:
ati(x*)ajt4(x*),
j
au ti(x*)ajtu(x*) + ajti(x*),
j > i.
<
i
u=1 i-1
a(x*) = u=1
In both cases, this means that i-1
riuajtu(x*) + dij -sij,
ajti(x*)
i,j = l(1)n.
V=1
It follows from this that T'(x*) _ -RT'(x*) + D -
S
T'(x*) _ (I+R) I(D-S).
o
The method considered in Example 17.11 had the form The corresponding single step method is
T(x) = x - J(x)F(x). x(v)
=
x(v-l)
J(x(v-1)/x(v))F(x(v-1)/x(v)). -
The Jacobian for this method can be determined with the aid of Theorem 17.24 for a special case.
This will also be of
significance in Sec. 19, when we develop the SOR method for nonlinear systems of equations. Theorem 17.25: 17.10.
Let
F:G c1Rn ;1Rn
Suppose further that
Let the Jacobian of
F
J(x)
and
J
be as in Lemma
is a diagonal matrix.
at the point
F'(x*) = D* - R* - S*
x*
be partitioned as
362
III.
where
is a diagonal matrix, R*
D*
triangular matrix, and matrix.
SOLVING SYSTEMS OF EQUATIONS
S*
is a strictly lower
is a strictly upper triangular
Then the Jacobian of the single step method x(v-l)
x(v)
J(x(v-1)/x(v))F(x(v-1)/x(v))
=
at the point
-
may be represented as follows:
x*
T'(x*) _ =
[I-J(x*)R*] 1[I-J(x*)F'(x*)-J(x*)R*] I
-
[I-J(x*)R*]-1J(x*)F'(x*).
By Lemma 17.10, the Jacobian of the corresponding
Proof:
total step method can be represented as T'(x*) = I-J(x*)F'(x*) = I-J(x*)(D*-R*-S*) [I-J(x*)D*] + J(x*)R* + J(x*)S*.
=
into diagonal, lower,
The last sum is a splitting of
T'(x*)
and upper triangular matrices.
Applying Theorem 17.24
yields:
T'(x*) = [I-J(x*)R*] 1[I-J(x*)D* + J(x*)S*]
Remark 17.26: x(v)
=
[I-J(x*)R*] - [I-J(x*)F' (x*) - J(x*)R*]
=
I -
[I-J(x*)R*] 1J(x*)F'(x*).
Instead of the iterative method x(v-1)
=
J(x(v-1)/x(v))F(x(v-1)/x(v)) -
one occasionally also uses the method x(v)
=
x(v-l)
J(x(v-1))F(x(v-1)/x(v)). -
One can show that the two methods have the same derivative at the fixed point.
Therefore one has local convergence for
18.
Overrelaxation methods for linear systems
both methods or for neither.
This is not to say that the
attractive regions are the same.
18.
363
o
Overrelaxation methods for systems of linear equations In this section we will discuss a specialized itera-
tive method for the numerical solution of large systems This is the method of overrelaxation
of linear equations.
developed by Young (cf. Young 1950, 1971).
It is very popu-
lar, for with the same programming effort as required by the Gauss-Seidel method (see below or Example 17.22), one obtains substantially better convergence in many important cases. Definition
Gauss-SeideZ method, successive overrelaxa-
18.1:
tion (SOR) method.
A c MAT(n,n,IR)
Let
be regular and let
The splitting
b e IRn.
A = D - R - S
is called the triangular splitting of
A
if the following
hold true: R
is a strictly lower triangular matrix
S
is a strictly upper triangular matrix
D
is a regular matrix
L = D-1R
is a strictly lower triangular matrix
U = D-1S
is a strictly upper triangular matrix.
To solve the equation (1)
Ax = b, we define the iterative methods;
Gauss-Seidel method: x(v) =
D-1(Rx(v)+Sx(v-1)+b)
=
Lx(v)+Ux(v-1)+D-lb,
or x(v)
=
x(v-1)-D-1(Dx(v-1)-Rx(v)-Sx(v-1)-b),
v = 1(1)x°.
364
(2)
SOLVING SYSTEMS OF EQUATIONS
III.
Successive overreZaxation or SOR method: X(V)
x(v-1)-wD-1(Dx(v-1)-Rx(v)-Sx(v-1)-b)
=
V = 1 (1)00 x(v-1)-w(x(v-1)-Lx(v)-Ux(v-1)-D-lb).
=
where
w eIR
is called the relaxation parameter.
In the splitting
A = D - R
In that case, L
diagonal matrix.
triangular matrices regardless.
-
S, D
and
U
may possibly be a are strictly
Our definition, however,
also encompasses the possibility that the
D
contains more than simply the diagonal of
A.
When
D
o
in the method
is a diagonal matrix, then the methods can
also be described by:
When
D-1(Ax(v-l)/x(v)-b)
x(v-1)
X(V)
=
X(V)
=
-
x(v-1)
wD-1(Ax(v-1)/x(v)-b). -
w = 1, the successive overrelaxation method is the
same as the Gauss-Seidel method.
When
w > 1, the changes
in each iterative step are greater than in the Gauss-Seidel method.
This explains the description as overrelaxation.
However, it is also used for
w < 1.
For
w > 1, convergence
in many important cases is substantially better than for w = 1
(cf. Theorem 18.11).
From a theoretical viewpoint it is useful to rewrite these methods as equivalent total step methods. lemma is useful to this end.
The following
The transformation is without
significance for practical computations. Lemma 18.2:
Let
A e MAT(n,n,IR)
n and let let b e IR,
have a triangular splitting,
365
Overrelaxation methods for linear systems
18.
I-w(D-wR)-1A
=
(I-wL)-1[(l-w)I+wU].
=
Then the method x(v)
= Y x(v-1)
yields the same sequence x(v)
Proof:
I
w(D-wR)-lb,
V = 1(1)-
as the SOR method
x(v)
x(v-1)-wD-1(Dx(v-1)-Rx(v)-Sx(v-1)-b),
=
v = 1(1)co.
The SOR method is easily reformulated as (I-wL)x(v)
Since
+
(1-w)Ix(v-1)+wUx(v-1)+wD-lb. =
is a strictly lower triangular matrix, the matrix
L
is invertible.
- wL
x(v)
It follows that
(I-wL)-1[(1-w)I+wU]x(v-1)
=
+
w(D-wR)-1b.
Further reformulation yields: (I-wL)-1[(1-w)I+wU]
(I-wL)-1D-1D[(l-w)I+wU]
_
_
(D-wR)-1[(1-w)D+w(D-R-A)]
=
I
w(D-wR)-1A
= _Vw.
-
=
(D-wR)-1[(D-wR)-wA]
o
The following theorem restricts the relaxation parameter
p(°) of
to the interval
w
(0,2), since the spectral radius
is greater than or equal to
1
for all other values
w, and for the method to converge, one needs
p(_V) < 1,
by Theorem 17.7. Theorem 18.3:
Under the hypotheses and definitions of Lemma
18.2 it follows that:
(1)
det(.) = (1-w)n
(2)
p(`-tw)
> 11-wl.
366
III.
SOLVING SYSTEMS OF EQUATIONS
Lemma 18.2 provides the representation
Proof:
_V = (I-wL)-1[(1-w)I+wU1.
Y is thus the product of two triangular matrices.
Since
the determinant of a triangular matrix is the product of the diagonal elements, we have det(I-wL)
1
= 1/det(I-wL) = 1
det[(1-w)I+wU] = (1-w)n.
Conclusion (1) follows from the determinant multiplication theorem.
For the proof of (2) we observe that the determinant of a matrix is the product of the eigenvalues.
By (1) how-
ever, the size of at least one of the eigenvalues is greater than or equal to
11-w1.
a
The next theorem yields a positive result on the convergence of the SOR method.
However, there is the substan-
tial condition that the matrix definite.
A
be symmetric and positive
This is satisfied for many discretizations of dif-
ferential equations (cf. Sections 13 and 14). "D
when
The condition
is symmetric and positive definite" is not necessary, D
is a diagonal matrix.
The diagonal of a symmetric
positive definite matrix is always positive definite. when
D
contains more than the true diagonal of
usually true in most applications that
D
A, it is
is still symmetric
positive definite. Theorem 18.4:
A e MAT(n,n,IR)
Even
ostrowski (cf. Ostrowski 1954).
Let
have a triangular splitting and let
18.
367
Overrelaxation methods for linear systems
=
w
I
-
We further require that:
<
IIA1/2.A-1/2II2
(2) p(Yw)
ing in the vector norm
w e (0,2).
Then it is true that:
1
= spectral norm)
(II
T(x) _
II2
x+c (c aIRn)
are contract-
(xTAx)1/2.
IIxIIA =
For each sequence
(4)
are symmetric and
D
1.
<
All mappings
(3)
and
A
(a)
positive definite, and (b)
(1)
w(D-wR)-1A.
{wi
I
with
i e N}
wi e (0,2)
and
lim sup wi <
lim inf wi > 0,
2,
i ym
i -).W
we have
lim ll w.V i->w
i-1
...
1 I12 = 0. 1
denotes the symmetric positive definite matrix
In (1), Al/2
whose square is Proof of (1):
A.
Let
M = (D-wR)/w =
D-R. m
Then A1/2.`L°A-1/2
B =
W
=
I
-
A1/2M-1A1/2.
Therefore BBT
=
(I-A1/2M
1A1/2)(I-A1/2(MT)-lAl/2) I-Al/2M-1(MT+M-A)(MT)-lAl/2.
=
The parenthetic expression on the right can further be rewritten as MT+M-A
=
7D-RT + =D-R-D+R+RT
=
(- -1)D.
This matrix is symmetric and positive definite.
Therefore it
368
SOLVING SYSTEMS OF EQUATIONS
III.
has a symmetric and positive definite root, which we denote by
It follows that
C.
The matrices
BBT
and
BBT + UUT UUT
=
where
I
U = Al/2M-1C.
are symmetric, positive semi-
definite, and also simultaneously diagonalizable. each eigenvalue UUT
of
A
such that
BBT
A + u = 1.
are nonnegative.
Since
there is an eigenvalue All the eigenvalues
U, and therefore
regular, it further follows that values
A
p > 0.
UUT
A
u
of
and
u
also, is
Thus all the eigen-
satisfy the inequality
BBT
of
Thus for
0 <
A
< 1.
It
follows that IA1/2WA-1/2II2
= IIBII.7
= [p(BBT)]1/2 < 1.
Since the matrices
Proof of (2):
they have the same eigenvalues.
B
and W are similar,
From (1)
it follows that
pW) = p(B) < IIBII2 < 1.
Proof of (3):
We have
IIT(x)-T(Y)IIA=II1`w°(xY)IIA= [W (xY)]TA[W(xY)} TA1/2(BTB)A1/2(x-y).
_ (x-Y) T(. )TA w(x-Y) = (x-Y)
Here
B = Al/2W _VA- 1/2
of (1).
BBT
is the matrix already used in the proof
BTB
and
have the same eigenvalues.
the proof of (1), the largest eigenvalue of the inequality Am = max{A
I
A
eigenvalue of
BTB} < 1.
Altogether it follows that
IIT(x)-T(Y)IIA < am(x-y)TA(x-Y) = Amllx-YII2
IIT(x) T(Y)IIA < m IIx-ylIA.
BTB
Thus, by satisfies
18.
Overrelaxation methods for linear systems
Proof of (4): w e (0,2)
369
From the proof of (3) we have for all
that m<1.
Here both "'1A norm
and m depend in general on
however does not depend on
11-11A
as a function of
The
w.
We regard
w.
II
This function is continuous and attains
w.
its maximum in every interval
where
[a,b]
0 < a < b < 2.
Then
a= The endpoints sequence
{wi
i
< 1.
I I-VII A
are chosen so that all elements of the
a,b I
max
we[a,b]
lie in the interval
e 1N)
[a,b].
This
leads to the inequality
i cIN.
11A 1
i-1
1
It follows that
1 im 11 L i-,_
1
i-1
....mow 11A 1
=
0.
Because of the equivalence of norms for finite dimensional vector spaces, the conclusion also obtains for the norm
11.112.
We follow the proof with some remarks intended to increase understanding of the theorem. Remark 18.5:
One knows from examples that in general the
spectral norm
[P(w of 9 wis not less than 1.
Thus the convergence of the SOR
method is not necessarily monotonic in the norm
11.112
How-
ever, convergence is always monotonic in the vector norm
"IA , by (3)_
370
SOLVING SYSTEMS OF EQUATIONS
III.
The significance of conclusion (4) is that the
Remark 18.6:
SOR method will also converge when
w
changes from one step
Such nonstationary methods are by no means un-
to the next.
In a manner similar to the proof of (4)
usual in practice.
one can also prove convergence for the method when the matrix D
is changed from one step to the next.
to remain within the fixed set
It is only necessary
DU < D < D0, where the
inequality signs are to be understood as applying componentwise.
Finally, convergence is even assured when the sequence
of equations and unknowns is permuted from step to step (see Theorem 19.13).
o
The SOR method, like every single step method,
Remark 18.7:
is by definition substantially dependent on the sequential ordering of the equations.
But it is worth noting that
hypothesis (a) of Theorem 18.4 remains true when the ordering of the equations is changed. tation matrix.
When
A
definite, then so are
Let
and
PAP 1
D
and
P
be an arbitrary permu-
are symmetric and positive PDP -l.
Thus convergence
of the method is assured, independent of the ordering of the equations, for symmetric and positive definite matrices.
The
speed of convergence, however, is dependent on the ordering In certain special cases, it is possible
of the equations.
to characterize particularly favorable orderings (see Young's Theorem 18.11).
In the worst cases, with the least favorable
orderings, one needs twice as many iterations. Definition 18.8:
A matrix
B e MAT(n,n,IR)
o
is called
weakly cyclic of index 2, if there exists a permutation matrix P, a matrix
B1 c MAT(q,n-q,]R), and a matrix
B2 c MAT(n-q,q, IR)
such that
Overrelaxation methods for linear systems
18.
371
P B P-1 = B2 B
is called consistently ordered if the eigenvalues of the
matrix B
(a c , a # 0)
aL + aU
do not depend on
a, where
is to be split into B = L + U
with
L
a strictly lower triangular matrix and
strictly upper triangular matrix. Example 18.9:
If
U
a
o
already has the form
B
r
0
B1
B2
0
B =
then
B
is weakly cyclic of index 2 and consistently ordered.
In the proof, we use block notation, beginning with 0
B1
B2
0
B x = x2
x2
It follows that Bix2 = ax1,
B2x1 = ax2.
We let 0
0
L=
,
B2
0
B1
0
0
U=
0
and obtain xl
xl
0
0
ax2
aB2
0
ax2
372
SOLVING SYSTEMS OF EQUATIONS
III.
Example 18.10:
When
is block tridiagonal (cf. Sec. 13)
B
with vanishing diagonal blocks, then
is weakly cyclic of
B
index 2 and consistently ordered.
In the following proof, we restrict ourselves to tridiagonal matrices.
We choose that permutation which places
all odd numbered rows and columns at the beginning, and has all even numbered rows and columns following. for
n = 5, the permutation matrix 1
0 0 0
P =
0
P
is
0 0 0
0
0
0
1
0
0
0
0
1
1
0
0
0
0
1
0 0
For example,
From this we get that B1
0
PBP-I
= B2
It only remains to show that the original ordering.
B = L + U
1
0
is consistently ordered in
B
x clRn
Let
for the eigenvalue
be the eigenvector of
A:
bi,i-lx i-1 + bi,i+1xi+l = Ax i.
It follows that ai-lbi,i-1xi-1
ai-lb1,l+lxi+1 +
(abi,i-1)(ai-2xi-1)
=
Aai-1x1
Alai-1x
+ (1bi,i+l)(alxi+l) =
This means that the vector (xi,ax2....
is an eigenvector of
al, +
aU
,an-lxn)T
for the same eigenvalue
A.
o
18.
373
Overrelaxation methods for linear systems
We now come to the theorem of Young.
Its significance
lies in the fact that it accurately describes the behavior of the function
in the interval
p(Yw)
Such infor-
(0,2).
mation is important for the determination of an
for which
w
the convergence of the SOR method is more rapid. Theorem 18.11:
A e MAT(n,n,]R)
Let
Young.
gular splitting and let U = I-w(D-wR)-1A. B = D-I(R+S) = L+U, where
and
8 = P(B)
have a trian-
Let the matrix
wb = 2/[1+(1-82)1/2]
be weakly cyclic of order 2 and consistently ordered. further hypothesize that: and
8
< 1, or (b) A
definite.
(1)
82
(a) All eigenvalues of
and
D
We
are real
B
are symmetric and positive
Then it follows that:
= P(.") < 1. w c (0,2)
P(ub) = wb-1,
(2)
1-w+2w282+w8+
1-w+4w282
for
w c (O,wb)
for
w e [wb,2).
p (Y)
l
(3) (4)
W-1
w < wb, p(V)
For
simple, if
a
is an eigenvalue of . It is
is a simple eigenvalue of
values of _w, for
B.
All other eigen-
w < wb, are less in absolute value.
w > wb, all eigenvalues of m have magnitude Proof:
For
w - 1.
We derive the proof from a series of intermediate
conclusions: (i)
All eigenvalues of
B
are real.
If condition
(a) does not hold, then by (b) all the matrices are symmetric and positive definite. $ = D-1/2(D-A)D-1/2
=
Then
D-1/2(R+S)D-1/2
A
and
D
374
III.
SOLVING SYSTEMS OF EQUATIONS
is also symmetric and hence has only real eigenvalues. B
and
are similar, B
B
If
(ii)
Since
too has only real eigenvalues.
is an eigenvalue of
p
has the same eigenvalues as For arbitrary
(iii)
and ±,a (L+U) clear for
-p.
(-1)-1U
z =
B.
the matrices
z,w e 1
have the same eigenvaZues. 0
or
w = 0, for then
upper or lower triangular matrix. So now let
z # 0
and
zL + wU
The assertion is
zL + wU
is a strictly
Its eigenvalues are all
w # 0.
zL + wU = Y/'z-w[(z/w)1/2L +
Since
B, then so is
is consistently ordered,
B
-B = -L +
zero.
Since
Then we can rearrange
(z/w)-1/2U].
is consistently ordered, the square-bracketed ex-
B
pression has the same eigenvalues as
L + U.
In view of (ii),
the conclusion follows. (iv)
It is true for arbitrary
z,w,y e
that:
det(yI-zL-wU) = det(yI±I(L+U)). The determinant of a matrix is equal to the product of its eigenvalues. (v)
w e (0,2)
For
and
A c ¢
it is true that:
det((A+w-1)I±w I). It follows from the representation =
(I-wL)-1[(1-w)I+wU]
that
det(AI-5) = det(AI-(I-wL)-1[(l-w)I+wU]) =
det((I-wL)-1(AI-awL-(1-w)I-wU)).
375
Overrelaxation methods for linear systems
18.
it further follows that
det(I-wL) = 1
Since
det(AI-.) = det(AI-AwL-(l-w)I-wU) = det((A+w-1)I-AwL-wU).
This, together with (iv) yields the conclusion. B = p(B) = 0
(vi)
implies that for all
w c (0,2),
Since the determinant of a matrix is the product of its eigenvalues, it follows from (v) that for
p(B) = 0,
n II
(A-Ar) = ()L+w-1)n
r=1
Here the
i = l(1)n, are the eigenvalues of .V.
Ai,
The
conclusion follows immediately. (vii)
Let
w e (0,2), µ e IR
and
A
c 4, A
Further
0.
let (A+W-l)2 Then
p
is an eigenvalue of
value of W.
=
B
Aw2u2.
exactly when
is an eigen-
A
The assertion follows with the aid of (v):
det(AI-.) = det(±wµTI±wTB) _ (wT)ndet(±uI±B) . We are now ready to establish conclusions (1) By (vii), u # 0
Proof of (1):
and only if (a) implies
p2 S2
-
(4):
is an eigenvalue of
is an eigenvalue of .l.
Thus
S2
B
if
= p("5).
< 1, and (b), by Theorem 18.4(2), implies
P(Y1) < 1. Proof of (2):
The conclusion p(.f) > p(. )
follows from
b
considering the graph of the real valued function p(-W), defined in (3), over the interval Remark 18.13).
(0,2)
f(w) _
(cf. also
SOLVING SYSTEMS OF EQUATIONS
III.
376
We solve the equation
Proof of (3) and (4): (a+w-1)2
-
given in (vii) for
Aw2p2 = 0
W2112)
A2-2A(1-w+
+ (w-1)2 =
x:
0
2 A
w2µ2 + wu(1-w
= 1-w +
For
w2p2)1/2.
+ 4
2
[wb,2), the element under the radical is non-positive
w c
for all eigenvalues
of
p
B:
W2s2
w2p2 < 1-w +
1-w + 4
< 0.
4
Therefore it is true for all eigenvalues 2 1 .2112) 2 2 2 +
1
w p (1-w +
2
that
of
A
4
2
wp
2
(w-1)
2
= w-1. It follows that p(`.) W We now consider the case too there can exist eigenvalues
w e (0,wb). of
p
B
In this case
for which the ex-
pression inside the above radical is non-positive. corresponding eigenvalues JAl
= w-1.
of S we again have
However, there is at least one eigenvalue of
B
p = $) for which the expression under the radical is
(namely
positive. B.
A
For the
We consider the set of all of these eigenvalues of
The corresponding eigenvalues
are real.
of
A
positive root gives the greater eigenvalue.
The
For
u > 2(lw-11)1/2/w
the function 1-w +
1 _I
grows monotonically with p = a.
W2,12)1/2
w2 p2 + wp(1-w +
It follows that
4
p.
The maximum is thus obtained for
Overrelaxation methods for linear systems
18.
P(yw) = 1-w + 1 w2s2 + ws[1 w +
377
w2s2]1/2, 4
p(.
W)
is an eigenvalue of
also implies that whenever
is a simple eigenvalue of
p( W)
a = p(B)
The monotonicity
by (vii).
.
is a simple eigenvalue of
other eigenvalues of ./
are smaller.
o
In the literature, the matrix
Remark 18.12:
2-cyclic whenever
A
is called
is weakly cyclic of index 2.
B
allows matrices other than the true diagonal of matrix
D, then
B
All of the
B.
depends not only on
particular choice of
If one
for the
A
A, but also on the
Therefore it seemed preferable to
D.
us to impose the hypotheses directly on matrix
a
B.
Conclusion (1) of Young's Theorem means that
Remark 18.13:
the Gauss-Seidel method converges asymptotically twice as fast as the Jacobi method. convergence for
w = wb
ally greater than for
For the SOR method, the speed of
in many important cases is substantiw = 1
(cf. Table 18.20 in Example
18.15). In (3) the course of the function exactly for
w c (0,2).
is described
A consideration of the graph shows
that the function decreases as the variable increases from to
wb.
On the interval Figure 18.14). known.
w + wb
The limit of the derivative as
-
0
is
0
(wb,2), the function increases linearly (see wb
is easily computed when
S = p(B)
is
However that situation arises only in exceptional
cases at the beginning of the iteration.
As a rule, wb
will
be determined approximately in the course of the iteration. We start the iteration with an initial value
X(V) _ Y x(v-l) + 0
wo(D-woR)-lb.
wo a
[l,wb):
378
SOLVING SYSTEMS OF EQUATIONS
III.
1
i W wb
1
Figure 18.14.
For a solution
2
Typical behavior of p(.VW)
of the system of equations we have
x*
x* = - x* + W0 (D-w0R)
lb.
0
It follows that x(v)-x*
(x(v-1)-x*)
= W
= W
O
By (4), ao = p(W ) 8 = p(B)
)
is an eigenvalue of
0
one if
)v-1(x(1)-x* 0
W 0,
is a simple eigenvalue of
and a simple This occurs,
B.
by a theorem of Perron-Frobenius (cf. Varga 1962), whenever, e.g., the elements of irreducible.
B
are non-negative and
We now assume that )'0
of W with eigenvector
e.
B
is
is a simple eigenvalue
Then the power method can be
0
used to compute an approximation of For sufficiently large x(v) x* z aye, It follows that
v
p(B).
it holds that:
x(v+l)
x* z X0ave,
x(v+2)-x* = aoave.
18.
Overrelaxation methods for linear systems
379
x(v+2)-x(v+l) z (a0-1)A0ae x(v+I)-x(v) z (ao-1)ave x(v+2)-x(v+1)112
a. o 2
(v+1) -X(V)112
11x
The equation (ao+wo-1)2
aowo62
=
makes it possible to determine an approximate value 82.
Next compute
wb
82
for
from the formula
wb = 2/[1+(1-52)l/2]
and then continue the iteration with The initial value
w0
wb.
must be distinctly less than
wb, for otherwise the values of the eigenvalues of Yw
will 0
be too close together (cf. the formula in 18.11(3)) and the power method described here will converge only very But it is preferable to round up
slowly.
function w < wb
p(.9)
grows more slowly for
(cf. Figure 18.14).
difference
2-1b
wb, since the
w > wb
than for
It is worthwhile to reduce the
by about ten percent.
o
In the following example we compare in an important special case the speed of convergence of the Jacobi, GaussSeidel, and SOR methods for Example 18.15:
w = wb.
Sample Problem.
The five-point discretization
of the problem tu(X,y) = q(x,y), u(x,y) = V+(x,y),
(X,y) C G = (0,1)2 (x,y) c 3G.
380
SOLVING SYSTEMS OF EQUATIONS
III.
leads to a linear system of equations with coefficient matrix 1
A=
A
I
I.
A,
e MAT(N2,N2, ]R). I
Here we have (
-4
1
11,
-4
e MAT(N,N, IR) 4
1
N+l = 1/h.
A, as we know from Section 13, are
The eigenvalues of
Let
where
A
avu = -2(2-cos vhn - cos phn),
v,y = 1(1)N.
be partitioned triangularly into
A = D - R - S,
D = -41.
The iteration matrix D-1(R+S) = D-1(D-A)
of the Jacobi method has eigenvalues p(D-1(R+S))
I-D-1A
=
1
+
4
= cos hn = 1- 2 h2n2
Therefore
A
.
+
O(h4).
By Theorem 18.11 (Young) we further obtain
P()
2
=
cos2hn
=
1-h2n2
+ O(h4)
wb = 2/(l+./l-$ ) = 2/(l+sin hn) P(mob )
_ Wb-1 = 1-2h7T + O(h2).
Table 18.16 contains step sizes.
g, p(S), urb, and
p( mob)
for different
18.
381
Overrelaxation methods for linear systems
Spectral radii and
TABLE 18.16:
h
a
)
P(`Sz
wb.
P( W )
Wb
b
1/8
0.92388
0.85355
1.4465
0.44646
1/16
0.98079
0.96194
1.6735
0.67351
1/32
0.99519
0.99039
1.8215
0.82147
1/64
0.99880
0.99759
1.9065
0.90650
1/128
0.99970
0.99940
1.9521
0.95209
1/256
0.99993
0.99985
1.9758
0.97575
Now let
e(v)
=
be the absolute error of the
x(v)-x*
v-th
approximation of an iterative method x(v+l)
Here let Since
Mx(v) + C.
be an arbitrary matrix and let
M
e(v)
=
=
Mve(°)
and
lim
IIMVIIl/v
'V-.W
there is for each
IIMvII
n
.
x* = Mx* + c.
>
0
a
v
0
= P(M), eIN, such that
(P(M)+n)V, V > v
-
1
o
k(V)II < (P(M)+n)" IIe(°)II.
The condition 11e(m)II
< elle(°)II
thus leads to the approximation formula
m=
to
e
log P(M)
(18.17)
which is sufficiently accurate for practical purposes. In summary we obtain the following relations for the iteration numbers of the methods considered above:
382
SOLVING SYSTEMS OF EQUATIONS
III.
mJ c lo
Jacobi:
C
-h Tr /2 e
ml Z 1o
Gauss-Seidel (w=1):
(18.18)
-h it
SOR (w=wb
mw
=
1
b
Here the exact formulas for the spectral radii were replaced by the approximations given above.
The Jacobi method thus
requires twice as many iterations as the Gauss-Seidel method in order to obtain the same degree of accuracy.
one frequently requires that
In practice,
Since
1/1000.
log 1/1000 = -6.91, we get ml z
6.91 h-2 = 0.7/h2 Tr
m
wb
ml/mw
z 0.64/h.
Table 18.20 contains
ml, h2ml,
various step sizes.
(18.19)
6.91 h-1 - 1.1/h
mwb, hmwb , and
ml/mwb
for
These values were computed using Formula
(18.17) and exactly computed spectral radii.
One sees that
the approximate formulas (18.18) and (18.19) are also accurate enough. TABLE 18.20:
h
Step sizes for reducing the error to
m1
2
h ml
mw
hmw
b
1/1000.
m1/m b
b
43
0.682
8
1.071
5
178
0.695
17
1.092
10
1/32
715
0.699
35
1.098
20
1/64
2865
0.700
70
1.099
40
1/128
11466
0.700
140
1.099
81
/256
45867
0.700
281
1.099
162
1/8
1/16
19.
Overrelaxation methods for nonlinear systems
383
For each iterative step, the Jacobi and Gauss-Seidel methods require
4N2
The SOR method, in
floating point operations.
contrast, requires
From (18.19) we get
operations.
7N2
as the total number of operations involved (e = 1/1000): Jacobi:
1.4.4N4 z 6N4
Gauss-Seidel (w=1):
0.7.4N4 z 3N4
SOR (w=wb):
1.1.7N3 = 8N3.
The sample problem is particularly suited to a theoretical comparison of the three iterative methods.
Practical experi-
ence demonstrates that these relations do not change signifiHowever, there exist sub-
cantly in more complex situations.
stantially faster direct methods for solving the sample problem (cf. Sections 21, 22).
SOR is primarily recommended,
therefore, for non-rectangular regions, for differential equations with variable coefficients, and for certain nonlinear differential equations. 19.
o
Overrelaxation methods for systems of nonlinear equations In this chapter we extend SOR methods to systems of non-
linear equations.
The main result is a generalization of
Ostrowski's theorem, which assures the global convergence of SOR methods and some variants thereof. In the following we let
G
denote an open subset of
In Definition 19.1: tions.
Let
SOR method for nonlinear systems of equa-
F c C1(G,IRn), and let
an invertible diagonal
D(x).
F
have a Jacobian with
Then we define the SOR method
384
SOLVING SYSTEMS OF EQUATIONS
III.
for solving the nonlinear equation
F(x) = 0
by generalizing
the method in Definition 18.1:
X(O) e G x(v-1)-wD-1(x(v-1)/x(v))F(x(v-1)/x(v))
X(V)
t(x(v-1)
=
w e (0,2),
v = l(1)-.
(19.2)
Ortega-Rheinholdt 1970 calls this the single-step SOR Newton method. If
of
T.
has a zero
F
x* E G, then
is a fixed point
x*
This immediately raises the following questions: attractive?
(1)
When is
(2)
How should the relaxation parameter
(3)
Under which conditions is the convergence of the method
x*
w
be chosen?
global, i.e., when does it converge for all initial values (4)
x(0) a G?
To what extent can the substantial task of computing the partial derivatives of
(5)
be avoided?
F
Do there exist similar methods for cases where
F
is
not differentiable?
The first and second questions can be answered immediately with the help of Theorems 17.8 and 17.25. Theorem 19.3:
Let the Jacobian of
F
at the point
x*
be
partitioned triangularly (cf. Definition 18.1) into F'(x*) = D* matrix.
-
R*
- S*, where
D*
is an (invertible) diagonal
Then p(I-w[D*-wR*1-1F'(x*)) < 1,
implies that
x*
is attractive.
19.
Overrelaxation methods for nonlinear systems
Proof:
385
By Theorem 17.25 we have I-[I-w(D*)-1R*]-lw(D*)-1F'(x*)
T'(x*) =
I-w[D*-wR*]-1F'(x*).
=
The conclusion then follows from Theorem 17.8.
o
The SOR method for nonlinear equations has the same convergence properties locally as the SOR method for linear equations.
The matrix - w[D*-wR*]
I
1F'(x*)
indeed corresponds to the matrix ..
of Lemma 18.2.
Thus
the theorems of Ostrowski and Young (Theorems 18.4, 18.11), with respect to local convergence at least, carry over to the nonlinear case.
The speed of convergence corresponds asymp-
totically, i.e. for the linear case.
the optimal
v 4 -, to the rate of convergence for
Subject to the corresponding hypotheses, can be determined as in Remark 18.13.
w
sufficiently accurate initial value
x(O)
If a
is available for
the iteration, the situation is practically the same as for linear systems.
This also holds true for the easily modified
method (cf. Remark 17.26) X(V)
wD-1(x(v-l))F(x(v-1)/x(v)).
x(v-l)
=
-
The following considerations are aimed at a generalization of Ostrowski's theorem.
Here convergence will be estab-
lished independently of Theorem 17.8.
The method (19.2) will be generalized one more time, so that it will no longer be necessary to compute the diagonal of the Jacobian
F'(x).
The hypothesis
"F
differenti-
386
SOLVING SYSTEMS OF EQUATIONS
III.
able" can then be replaced by a Lipschitz condition.
Then
questions (4) and (5) will also have a positive answer.
In
an important special case, one even obtains global convergence. Definition 19.4:
A mapping
F e C°(G,)Rn)
gradient mapping if there exists a F(x)T = '(x), x c G.
We write
e
is called a such that
C1(G,IR1)
F = grad
o
In the special case of a simply connected region
G,
the gradient mappings may be characterized with the aid of a well-known theorem of Poincare (cf. Loomis-Steinberg 1968, Ch. 11.5).
Theorem 19.5:
Let
G
be a simply connected region
Then
F
is a gradient mapping if and
Poincare.
and let
F e C1(G,IRn).
only if
F'(x)
is always symmetric.
Our interest here is only in open and convex subsets of
IRn, and these are always simply connected.
then, we always presuppose that
set of
a c (0,1)
Let
4
0: G +]R1
and for all
x,y E G
(1-a)4(y)
- (ax + (1-a)y).
is called, respectively, a
convex function
if
r(x,y,a)
strictly convex function
if
r(x,y,a) > 0,
uniformly convex function
if
r(x,y,a) > ca(l-a)jjx-y,j
for all c
and
let
r(x,y,a) = Then
is an open, convex sub-
G
IRn.
Definition 19.6: all
In the sequel
x,y e G
with
>
x # y, and for all
0,
a e (0,1).
is a positive constant which depends only on
4.
2,
Here 0
19.
Overrelaxation methods for nonlinear systems
387
The following theorem characterizes the convexity properties of
with the aid of the second partial deriva-
tives.
Theorem 19.7:
A function 41 EC2(G,1R1)
is convex, strictly
convex, or uniformly convex, if and only if the matrix of the second partial derivatives of ing inequalities for all
x e G
0
satisfies the follow-
and all nonzero
z e]Rn
respectively,
zTA(x)z > 0
(positive semidefinite)
zTA(x)z >
(positive definite)
0
zTA(x)z > czTz
Here Proof:
(uniformly positive definite in x and z).
depends only on
c > 0
A, not on
x
or
z.
x,y e G, x # y, we define
For
p(t) = r(x,x+t(y-x),a),
t e [0,1).
Then we have P(t) = a41(x)
and
+ (l-a)$(x+t(y-x))
p(O) = 0, p'(0) = 0.
PM =
- 41(x+t(1-a)(y-x))
It follows that
(1
(1-s)p"(s)ds
J
0
PM = (1-a)J (1-s)(y-X) TA(x+s(y-x))(y-x)ds 0
(1-s)(y-x) TA(x+s(1-a)(y-x))(y-x)ds.
(1-a) 2 0
In the second integral, we can make the substitution s = (1-a)s, and then call integrals:
s'
again
A(x)
s, and combine the
III.
388
SOLVING SYSTEMS OF EQUATIONS
1
P(1) = j T(s)(Y-x)TA(x+s(Y-x))(Y-x)ds 0
where
Jas (1-a)(1-s)
for
0 < s < 1-a
for
1-a < s < 1.
The mean value theorem for integrals then provides a suitable 6
for which
c (0,1)
a(1-a)(Y-x)TA(x+6(Y-x))(Y-x).
r(x,Y,(x) = P(1) =
2 The conclusion of the theorem now follows easily from Definition 19.6.
o
is only once continuously differentiable, the
c
If
convexity properties can be checked with the aid of the first derivative. Theorem 19.8:
c c C1(G,IR1), F = grad 4, and
Let
p(x,y) = [F(Y)-F(x)]T(Y-x). Then
0
is convex, strictly convex, or uniformly convex, if
and only if
p(x,y)
satisfies the following inequalities,
respectively, p(x,Y) > 0,
p(x,Y) > 0,
p(x,Y) > c*IIY-x112 Here
c* > 0
Proof: t e
depends only on
Again, let
[0,1].
F.
p(t) = r(x,x+t(y-x),a), x,y c G, x # y,
Then we have
p(t) = a1(x) + (1-a)4(x+t(Y-x)) - (x+t(1-a)(Y-x)) and
Overrelaxation methods for nonlinear systems
19.
389
(1
p(1) = r(x,y,a) =
p'(t)dt
J
0 1
[F(x+t(y-x))-F(x+t(1-a)(y-x))] T(y-x)dt.
_ (1-a) 0
It remains to prove that the inequalities in Theorem 19.8 and Definition 19.6 are equivalent.
We content ourselves
with a consideration of the inequalities related to uniform convexity.
Suppose first that always
P(x,Y) > c*IIY-x112. Then it follows that P(x+t(1-a)(Y-x),x+t(Y-x)) > c*a2t2IIY xll2 IF(x+t(y-x))-F(x+t(1-a)(y-x))] T(y-x) > c*atIIY-xll2
r(x,y,a) >
c*a(1-a)IIY-x442. 2
The quantity here.
c
in Definition 19.6 thus corresponds to
1-2c
Now suppose that always
r(x,Y,a) > ca(1-a)IIY-xI12
Then it follows that aO(x)+(1-a)0(Y) > (x+
a)(Y-x))+ca(1-a)I1Y-x12
O(x+(1-(x)(ya-x))-4(x) + callY-xll2. Since this inequality holds for all the limit
a -
a c (0,1), by passing to
1, we obtain
m(Y) fi(x) > F(x)T(Y-x) + clly-x112. Analogously, we naturally also obtain
390
SOLVING SYSTEMS OF EQUATIONS
III.
(x)-4(y) > F(Y)T(x-Y) + cIIy-xlI2.
Adding these two inequalities yields 0 > -[F(Y)-F(x))T(Y-x) + 2cjjY-x,12.
0
The following theorem characterizes the solution set F(x) = 0
of the equation
for the case where
is the
F
gradient of a convex map. Theorem 19.9: F = grad 0.
Let
m E C1(G,1R1)
Then:
The level sets
(1)
convex for all
is a zero of
global minimum at If
(3)
one zero (4)
x*
N(y,q) = {x e G
O(x) < yl
I
are
y eIR.
x*
(2)
be convex and let
x*.
in
assumes its
0
The set of all zeros of
is strictly convex, then
0
If
exactly when
F
F
is convex.
F
has at most
G.
is uniformly convex and
0
has exactly one zero
G =IRn, then
F
x*, and the inequality
c*IIx*112 < !IF(0)112. is valid, where Proof of (1):
is the constant from Theorem 19.8.
c*
Let
a c (0,1)
follows from the convexity of
y,x c N(y,1).
and 0
Then it
that
4,(ax+(1-a)y) < x (x)+(1-a)0(Y) < ay+(l-a)y = Y.
Thus
ax + (1-a)y
Proof of (2):
Let
also belongs to x*
N(y,o).
be a zero of
F
and let
x E G,
x # x*, be arbitrary.
By the mean value theorem of differ-
entiation, there is a
A e (0,1)
such that
19.
Overrelaxation methods for nonlinear systems
O(x) _ O(x*)
391
[F(x*+A(x-x*))] T(x-x*).
+
It follows from Theorem 19.8 that p(x*,x*+X(x-x*)) = [F(x*+X(x-x*))] TX(x-x*) > 0.
Thus we obtain cp(x)
Therefore
o(x*) > 0.
-
is a global minimum of
x*
sion is trivial, since the set of all zeros of
particular zero of
If
The reverse conclu-
It follows from (1) that
is convex, for if
F
is a
x*o
F, then
{x* e G
Proof of (3):
is open.
G
4>.
F(x*) = 0} = N(4>(xo),4>).
is a strictly convex function, then in
4>
the above proof we have the stronger inequality p(x*,x+A(x-x*)) =
[F(x*+X(x-x*))] TX(x-x*) > 0.
It follows that 4>(x)
Therefore
> (x*) .
is the only point at which
x*
can assume the
global minimum.
By (3), F
Proof of (4): to show that examine
0
x = bt,
F
has at most one zero.
has at least one zero.
Thus we need
To this end we
along the lines
t e ]R,
b c IRn
fixed with
JjblJ 2 =
1.
There it is true that t
4>(bt)
= (0) + F(0)Tbt +
[F(bs)-F(0)]Tbds.
i
0
392
SOLVING SYSTEMS OF EQUATIONS
III.
Since
[F(bs)-F(0)]Tbs > c*s2 it follows that 0(bt)
> 0(0) + F(0)Tbt
c*t2.
+ 2
with
x = bt
For all
Iti
=
11x112
> 211F(0)112/c* ¢(x) > 4(0).
this inequality means that
Therefore
0
as-
sumes its minimum in the ball
{x a ]Rn
1
11x112
<
711F (0)112/c*}'
This minimum is a global minimum in all of fore has at least one zero
x*
even establish the inequality
7Rn.
F
in the above ball.
there-
One can
c*IIx*112 < IIF(0)112, for we have
[F(x*)-F(0)]T(x*-0) > c*IIx*-0112
F(0)Tx* > c*11x*112 o
IIF(0)112_ c*11x*112. Definition 19.10:
Let
,
F = grad and
G* c G
C1(G,]R1)
e
be strictly convex,
(fl,f2,...,fn),
a convex set with at least one interior point.
Then we define h
mid(F,G*) = inf Here the
0
,
and
,
fi(x+he
i
e(j) h E I R
j
= 1(1)n.
)-fi(x)
are the unit vectors parallel to the axes in and
x E I R n
run through all possible
Overrelaxation methods for nonlinear systems
19.
393
combinations satisfying x e G*,
x+he(j)
h# 0.
e G*,
Further let MID(F,G*) = diag(midJ(F,G*)). The notation
MID
Since
G*
stands for "maximal inverse diagonal".
o
has at least one interior point, the infi-
mum is always formed over a nonempty set of combinations of x
and
Further, we always have
h.
[F(x+he(J))
F(x)]The(J) > 0
-
[fJ(x+he(J))
- fJ(x)]h >
0
and therefore, also h
J))-fi(x)
fJ(x
> 0
Nevertheless, the infimum of these quantities can be zero. When
0
is uniformly convex, midJ(F,G*)
bound which is independent of [F(x+he(J))
-
has an upper
It follows from
G*.
F(x)]The(J) > c*I1he(J)II22
that
fJ(x+he(J)) - fJ(x) > c*h midJ(F,G*) < 1/c*. In the sequel, a lower bound for ant than an upper bound. for
midJ(F,G*)
is more import-
That requires Lipschitz conditions
fJ.
Theorem 19.11: inequalities
If, for all
x e G*
with
x+he(j) c G*, the
394
SOLVING SYSTEMS OF EQUATIONS
III.
Ifj(x+he(j))-fj(x)I < IhILj, hold, then
MID(F,G*)
Lj
is positive definite and
midj(F,G*) > 1/Lj,
The proof is trivial. Theorem 19.12: and let
G*
Let
j
= 1(1)n.
0
be twice continuously differentiable
4
be compact.
Then
midj(F,G*) = 1/max Proof:
j = 1(1)n
0,
>
j
xEG*
= 1(1)n.
Since
fj(x+he(j))-fj(x)
=
a.4(x+he(j))-aj4(x)
<
xEG*
it follows from the previous theorem that in any case midj(F,G*) > 1/max xec
For the proof of the reverse inequality, we use the definition of the partial derivatives:
a.O(x+he(j))-aO(x)
urn
j
h-0
h lim h-*0 aj4(x+he(3))-aj4(x) h#0
mid.(F,G*) = inf
2
aJ
=
.4(x)
1/a?.4(x)
=
jj
< 1/3
h
aj4(y+he j
)-aj4(y)
2
.4(W).
o
jj
In the previous section, we considered the iterative solution of linear systems of equations.
We want to explain
briefly how these fit in here in the case of real, symmetric, and positive definite matrices.
Let
F(x) = Ax - b
be an
affine map with a real, symmetric, and positive definite
19.
Overrelaxation methods for nonlinear systems
matrix
395
The map is the gradient of the function
A.
XTAx
(x) =
-
bTx.
Z By Theorem 19.7, 0
is uniformly convex.
The constant
of that theorem is the smallest eigenvalue of tion
p(x,y)
A.
c
The func-
of Theorem 19.8 is p(x,Y) = (Y-X) TA(Y-x).
Thus
c*
is also the smallest eigenvalue of
A.
Finally by
Theorem 19.12 midj(F,G*) = 1/ajj,
These considerations hold for all The next example shows that
j
= l(1)n.
G*.
MID(F,G*)
can be posi-
tive definite even in the case of a nondifferentiable func-
tion F.
Let
¢ c C1( IR, IR)
with
2x2
for
x > 0
l x2
for
x < 0
4x
for
2x
for
(x) and
F(x) = Then
MID(F,IR1) = 1/4.
The next theorem presupposes that G c]Rn
is open and convex,
e C1(G, 1n),
F = grad 0 = (fl,f2,...,fn), Q e MAT(n,n,IR)
Q = diag (qj).
is a positive definite diagonal matrix,
396
SOLVING SYSTEMS OF EQUATIONS
III.
Theorem 19.13:
Hypotheses:
is strictly convex.
(a)
0
(b)
There is a
y e R
G* _ {x c G 14(x) < y}
with a compact level set
which consists of more than one
point. (c)
qj
< 2 midj(F,G*),
j
= 1(1)n.
Conclusions: (1)
x*
There is exactly one
is an interior point of (2)
Either
x = x*
x* e G
with
F(x*) = 0.
G*.
(y) < $(x)
or
for
y = x-QF(x/y). (3)
Every sequence arbitrary
x(0) a G*
x(u+l) =
converges to
x(v)-QF(x(v)/x(v+l)),
u = 0(1)co
x*.
Proof of (1):
Since
on
G*.
is larger outside of
an
x* a G*
O(x)
G*
is compact,
assumes its minimum G*.
Therefore there is
such that m(x*) = min 4(x), xeG
is strictly convex.
Therefore there is at most one point
with these properties.
Since
besides
and
x*, O(x*) < y
Proof of (2):
naturally into
F(x*) = 0.
G*
x*
The computation n
contains other points is an interior point of
y = x-QF(x/y)
individual steps.
x G*.
can be split
With each of these
We call individual steps, only one component is changed. y(1), y(2)' ., y(n) = y these intermediate results y(0) = x,
The individual computations are
19.
Overrelaxation methods for nonlinear systems
y(J)
=
y(J-1)
a
-
397
.e(J) J
where
= qjfi (y(J-1)).
Aj
Since
qj > 0, either
Aj
# 0
In the first case we define
fi (y(J-l)) = 0.
t e [0,1]
YO-l) - tai e(J) y(t) = p(t) = O(Y(t))
This leads to p'(t) = -ai fj(Y(t))
p'(0) = -Aifi (Y(0)) _ -qjfj(Y(0))2 <
0
p' (t) -P' (0) = ai [fi (Y(0)) -fi (Y(t)) ] For
Here we need hypothesis (c).
t > 0
and
y(t) c G*,
this hypothesis implies:
2ta.
(Y(O)) -fj Y t)
q] <
2tifj(Y(0M 1
<
jyt
iy
IP'(t)-p'(0)l < 2t qjfj(Y(0))2. Since t
P(t) = P(0) + P'(0)t + J [P'(s)-P'(0)]ds 0
the last inequality leads to p(t) < p(O)
-
qjfj(Y(0))2t + qjfi (Y(0))2t2
(Y(t)) < O(y(J-1))
-
t(1-t)qjfj(Y(0))2
m(Y(t)) < 4,(Y(J-l)) < Y. Thus
y(t)
we have
cannot leave the set
G*.
For all
t c (0,1]
or
SOLVING SYSTEMS OF EQUATIONS
III.
398
(Y(t)) < O(Y(j-1)) and in particular, 0(Y(J))
Thus, either
fi (y(3-1))
<
(y(J-l)) (y) < ¢(x).
is always zero or
(2) is proven.
The sequence
Proof of (3):
{O(x(v))
I
v = 0(1)")
converges,
for by (2), O(x(v-1))
Let
> 0(x(v))
0(x(v+1)) >
>
> (x*).
...
be an arbitrary limit point of the sequence
x**
It follows from continuity considerations that
(x**), where means that
4(y**) =
By (2) however, this
y** = x** - QF(x**/y**).
F(x**) = 0
{x(v)}.
and hence by (1) that
x** = x*.
The only possible limit point of the sequence {x(v)
v = 1(1)")
I
in the compact set
the sequence is convergent.
G*
x*.
is
Thus
o
This last theorem requires a few clarifying remarks. Remark 19.14:
0 e C1(]R ,IR)
If
every level set
{x EIRn
of Theorem 19.9).
x(0)
a IR
.
Remark 19.15:
< y}
is compact (cf. proof
The iteration therefore converges for all
o If
there is no matrix for
O(x)
is uniformly convex, then
MID(F,G*) Q
is not positive definite, then
with the stated properties.
m e C2(G,IR1), MID(F,G*)
However,
is always positive definite.
The same is also true when the components of
F
Lipschitz conditions (cf. Theorem 19.11).
o
satisfy
Overrelaxation methods for nonlinear systems
19.
In practice, the starting point is usually
Remark 19.16: and not
Since the Jacobian
0.
399
metric, there exists a function in addition, F'(x)
F'(x) 0
of
F
is always sym-
F = grad .
with
is
It is most difficult to establish that
hypothesis (b) of Theorem 19.13 is satisfied.
MID(F,G*)
be determined from bounds on the diagonal elements of In the actual iteration one needs only Remark 19.17:
If,
is always positive definite, then
even strictly convex.
F,
F
and not
The first convergence proof
can
F'(x).
0.
o
for an SOR method
for convex maps was given by Schechter 1962.
In Ortega-
Rheinboldt 1970, Part V, several related theorems are proven. Practical advice based on actual executions can be found, among other places, in Meis 1971 and Meis-Tornig 1973.
o
We present an application of Theorem 19.13 in the form of the following example. Example 19.18:
Let
G
be a rectangle in
]R2
with sides
parallel to the axes, and let a boundary value problem of the first kind be given on this rectangle for the differential equation -(alux)x - (a2uy)y + H(x,y,u) = 0.
The five point difference scheme (see Section 13) leads, for fixed
h, to the system of equations
F(w) = Aw + H(w) = where
A e MAT(n,n,]R)
0
is symmetric and positive definite,
w = (wl,w2,...,wn), (xj,yj ) = lattice points of the dis-
cretization, and A
H(w) = (H(xl,y1,w1),...,H(xn,yn,wn)).
can be split into
400
SOLVING SYSTEMS OF EQUATIONS
III.
-R-
A=D where
D
RT,
is a diagonal matrix, R
is a strictly lower tri-
angular matrix with nonnegative entries, and weakly cyclic of index 2.
pj(Z) =
D-1(R+RT)
is
Let
10z
H(xj,yj2)di
P(w) = (P1(wl),...,Pn(wn)) 4(w) =
wTAw + P(w). 2
Then obviously
F(w) = grad ¢.
Under the hypotheses
0 < H7(x,y,z) < 6,
0
is uniformly convex in
IRn
(x,y)
and
MID(F,G*)
definite for every compact convex set lar, for
j
c G,
z cIR
is positive
G* cfRn.
In particu-
= 1(1)n: 2
a2 0 (w)
= ajj + HZ(xj) yj,w)
ajj <
aj
1/a
> midj(F,G*) > 1/(ajj+6).
ajj + d
ii
ajj
contains the factor
ajj >> 6.
h, therefore,
For small
1/h2.
Every sequence arbitrary
w(0)
w(v+1)
=
w(v) - Q(Aw(v-1)/w(v)
+
H(w(v-1)
converges, if 0
The condition strictive.
< qj
< 2/(ajj+d).
Hz(x,y,u(x,y)) <
6
appears to be very re-
In most cases, however, one has a priori know-
ledge of an estimate
401
Overrelaxation methods for nonlinear systems
19.
a < U(x,y) < a,
(x,y)
e G.
When the maximum principle applies, e.g. for
H(x,y,0) = 0,
this estimate follows at once from the boundary values. H
then only on
is significant
G x
The function can
[a,8].
be changed outside this set, without changing the solution This change can be undertaken
of the differential equation. H c C1(G x]R,]R)
so that
Hz(x,y,z)
and
are bounded.
will demonstrate this procedure for the case If
We
H(x,y,z) = ez.
a < u(x,y) < a, one defines
f
H*(z) =
< a
ea(z-a+l)
for
z
ez
for
a <
es(z S+1)
for
z > S.
z
< R
It follows that ea < H*'(z)
<
e8
and 1
< mid.(F,G*) <
1
J
.+es
a J.J
- a J.J.+ea
One begins the iteration with
Q= w diag(
1
aii +e
0< w < 2.
s)
It may be possible, in the course of the computation, to replace
a
estimating
with a smaller number. a
and
of the results.
Should one have erred in
a, one can correct the error on the basis
The initial bounds on
a
and
To speed convergence,
do not have to be precisely accurate.
one is best off in the final phase to chose for approximation of w diag
(
1 aJJ+ai Hi (w)
therefore
S
),
o
Q
an
402
20.
III.
SOLVING SYSTEMS OF EQUATIONS
Band width reduction for sparse matrices When boundary value problems are solved with differ-
ence methods or finite element methods, one is led to systems of equations which may be characterized by sparse That is, on
matrices.
each row of these matrices there
are only a few entries different from zero.
The distribution
of these entries, however, varies considerably from one problem to another.
In contrast, for the classical Ritz method and with many collocation methods, one usually finds matrices which are fuZZ or almost full; their elements are almost all different from zero.
Corresponding to the different types of matrices are different types of algorithms for solving linear systems of equations.
We would like to differentiate four groups of
direct methods.
The first group consists of the standard elimination methods of Gauss, Householder, and Cholesky, along with their numerous variants.
At least for full matrices, there are as
a rule no alternatives to these methods. too are seldom better.
Iterative methods
For sparse matrices, the computational
effort of the standard methods is too great. ber of equations is
For if the num-
n, then the required number of additions
and multiplications for these methods is always proportional to
n3
The second group of direct methods consists of the specializations of the standard methods for band matrices. In this section we treat Gaussian elimination for band matrices.
Two corresponding FORTRAN programs may be found in
20.
Band width reduction for sparse matrices
Appendix S.
403
The methods of Householder and Cholesky may be
adapted in similar ways.
The number of computations in this
group is proportional to
nw2, where
w
is the band width of
the matrix (cf. Definition 20.1 below).
The third group also consists of a modification of the Its distinguishing characteristic is the
standard methods.
manner of storing the matrix.
Only those entries different
from zero, along with their indices, are committed to memory. The corresponding programs are relatively complicated, since the number of nonzero elements increases during the computation.
In many cases, the matrix becomes substantially filled.
Therefore it is difficult to estimate the number of computations involved.
We will not discuss these methods further.
A survey of these methods can be found in Reid 1977. The fourth group of direct methods is entirely independent of those discussed so far.
Its basis lies in the
special properties of the differential equations associated with certain boundary value problems.
These methods thus can
be used only to solve very special systems.
methods differ greatly from each other.
Further, the
In the following two
sections we consider two typical algorithms of this type. Appendix 6 contains the FORTRAN program for what is known as the Buneman algorithm.
The computational effort for the most
important methods in this group is proportional to merely n log n
or
methods.
n.
These are known, therefore, as fast direct
Please note that
n
is defined differently in
Sections 21 and 22.
We are now ready to investigate band matrices.
404
SOLVING SYSTEMS OF EQUATIONS
III.
Definition 20.1:
A = (aij) a MAT(n,n,Q)
Let
and
A # 0.
Then
w = 1 + 2 max{dld = li-jl where
is called the band width of the band width to be 1.
A matrix for which matrix.
aij ¢ 0 or aji # 0}
For a zero matrix, we define
A. o
w << n
will be called a band
Thus we do not have a precise mathematical concept
in mind; the precise term is band width. examples, an
x
In the following
represents a nonzero element, and a blank
represents a zero.
A diagonal matrix has a band width diagonal matrix has band width
w = 3.
w = 1, and a tri-
A matrix of the follow-
ing type has band width 5:
Every full matrix has the maximal band width of
2n-1.
How-
ever, many sparse matrices also have this bandwidth, e.g. the matrix:
x
x
x
20.
Band width reduction for sparse matrices
be a matrix with band width less
A e
Now let
than or equal to the matrix
405
w, w
A,
Every linear system containing
odd.
n
i = l(l)n
= bit
JI aijxj
is equivalent to the system w j
where
Iajixi+j-k-1 l = aw+l,i' and
k = (w-l)/2
if
j
< w, 1 < i+j-k-1 < n
bi
if
j
= w+l
0
otherwise.
ai
aji =
For
p <
1
or
i+j-k-1
p > n, set
(cf. Figure 20.2).
0
a21
form a matrix
w << n, A
For
less core memory than
0
xp = 0.
aji
The quantities
0
i = l(l)n
A
a31
and
A c MAT(w+l,n',¢)
occupies substantially The ratio is
b.
(w+l)/(n+l).
a42
a53
a64
a75
a86
a97
a32
a43
a54
a65
a76
a87
a98
a22
a33
a44
a55
a66
a77
a88
a99
a12
a23
a34
a45
a56
a67
a78
a89
0
a13
a24
a35
a46
a57
a68
a79
0
0
b2
b3
b4
b5
b6
b7
b8
b9
and
n = 9
all
[b1
Figure 20.2.
A
for
w =
5
Gaussian elimination without pivot search does not lead to an increase in band width.
The same holds true if the search
for the pivot elements is restricted to the row at hand, i.e. if there are only column exchanges.
Thus the algorithm can
406
SOLVING SYSTEMS OF EQUATIONS
III.
run its course entirely within
A.
The number of computa-
tions is, as previously mentioned, proportional to
nw2.
The Ritz method leads to positive definite Hermitian matrices.
A pivot search thus is unnecessary, as it also
is with most difference equations.
The number of computations
without pivot search is
n(w2 + 3w
-
2).
Appendix S contains two FORTRAN programs, for
w = 3
and
w > 3. In many cases the band width of a matrix
A
can be
reduced substantially through a simultaneous exchange of rows and columns.
For example, we can convert x
x
into
rx x
through an exchange of rows 2 and 5 and columns 2 and S.
Such a simultaneous exchange of rows and columns is a similarity transformation with a permutation matrix.
In the
example at hand, we have the permutation (1,2,3,4,5) - (1,5,3,4,2),
which we abbreviate to
(1,5,3,4,2).
Slightly more complicated
20.
Band width reduction for sparse matrices
407
is the example x
x x x x x x x x x x x
x
The permutation
(1,3,5,6,4,2)
leads to the matrix
The band width has been reduced from 11 to S.
A further re-
duction of band width is not possible. For each matrix transforms
A
A
there exists a permutation which
into a matrix with minimal band width.
permutation is not uniquely determined, as a rule. nately there is no algorithm for large
n
This
Unfortu-
which finds a
permutation of this type within reasonable bounds of computation.
The algorithms used in practice produce permutations
which typically have a band width close to the minimal band width.
There are no theoretical predictions which indicate
how far this band width deviates from the minimal one. Among the numerous approximating algorithms of this type, the algorithm of Cuthill-McKee 1969 and a variation of Gibbs-Poole-Stockmeyer 1976 are used the most.
Both algorithms
are based on graph theoretical considerations.
Sometimes the
first algorithm provides the better result, and other times, the second.
The difference is usually minimal.
The second
408
SOLVING SYSTEMS OF EQUATIONS
III.
algorithm almost always requires less computing time. that reason, we want to expand on it here.
For
The correspond-
ing FORTRAN program is to be found in Appendix S.
The band widths of and
A = (aij) are the same.
B = (IaijI
+
Jajij)
For an arbitrary permutation matrix
P, the
band widths of P-1AP
remain the same.
Thus we may assume that
The diagonal elements of width.
P-1BP
and
A
A
is symmetric.
have no significance for band
So, without loss of generality, we can let the dia-
gonal entries be all ones.
The i-th and j-th row of aij # 0.
We write
zi - zj
A =
or
are called connected if
zj
- zi.
Thus, for
x x x x x x
we have
A
x
zl - zl,zl - z2,zl - z3,z2 - zl,z2 - z2,z3 - zl,z3 - z3.
This leads to the picture
This can be regarded as an undirected graph with numbered knots.
The rows of the matrix are the knots of the graph.
Conversely, the plan of the matrix can be reconstructed from a graph.
Thus, the graph
20.
Band width reduction for sparse matrices
409
yields the matrix
A =
Definition 20.3: (G,-)
is called an undirected graph if the
following hold: (1)
G
is a nonempty set.
(2)
-
is a relation between certain knots
The elements are called
knots.
Notation: "g - h" (3)
or
"g
and
and
h.
are connected".
h
g e G, g - g.
For all
g
g - h
always implies
h - g. A knot p+l
g
has degree
knots in
bitrary knots
r 6N and
(G,-)
is connected with exactly
g
A graph is called connected if, for ar-
G. g
if
p
and
ki e G,
i
in
h
there always exists an
G
= 0(1)r, with the properties:
(4)
ko = g, kr = h.
(5)
ki 1 - ki
Let
a
for
i
= 1(1)r.
o
be an arbitrary nonempty subset of
is also a graph (subgraph of
connected, we simply say that Definition 20.4:
(G,-)).
If
G.
(G,-)
where
is called the band width of the graph 0
(G,-)
be
G = {g1,g21" "gn}'
w = 1 + 2 max{dld = ji-jj
the given numbering.
is
is connected.
G
Let the knots of a finite graph
numbered arbitrarily, i.e. let
Then
gi - gj}
(G,-)
with respect to
410
SOLVING SYSTEMS OF EQUATIONS
III.
The problem now becomes one of finding a numbering of the knots for which the band width sible.
is as small as pos-
As a first step, we separate the graph into levels. L = (L1,L2,...9Lr)
Definition 20.5:
structure of the graph
disjoint subsets of For
(2)
is called the level
when the following hold true:
(G,-)
Li, i = 1(1)r, (levels) are nonempty
The sets
(1)
r
w
G.
Their union is
g c Li, h c Lj
and
G.
g - h, li-ji
is called the depth of the level structure.
<
1.
Its width
k
is the number of elements in the level with the most elements. D Figure 20.6 shows how to separate a graph into levels.
L1
Figure 20.6.
Theorem 20.7: of
(G,-)
L3
L2
Let
of width
L4
L5
Separating a graph into levels
be a level structure
L = (L11L21...,Lr) k.
Then there exists a numbering of the
knots such that the band width of the graph with respect to this numbering is less than or equal to Proof: of
that
First number the elements of
L2, etc. li-jJ
For < 1,
gu - 9V , Iu-vl
4k-1.
L1, then the elements
gu c Li, and
gv c Lj
< 2k-1, and hence, w < 4k-1.
it follows 0
20.
Band width reduction for sparse matrices
411
This theorem gives reason for constructing level strucUsually constructing
tures with the smallest width possible.
level structures of the greatest possible depth leads to the same end.
For every knot
Theorem 20.8:
g
of a connected graph there
exists exactly one level structure R(g) = (LI,L2)...,Lr) satisfying: (1)
LI = {g}.
(2)
For every
with
k e Li 1
with
h e Li
i > 1, there is a
This level structure is called the
k - h.
level structure with root The proof is trivial.
g. o
In the following, let the graph
(G,-)
be finite
In actuality, the connected components can
and connected.
The algorithm for band width re-
be numbered sequentially.
duction begins with the following steps: (A)
Choose a knot
(B)
Compute
of minimal degree.
g e G
R(g) _ (LI9L2,...,Lr).
last level (C)
Set j = 1.
(D)
Compute
Lr
be
kj,
j
= l(1)p.
R(kj) _ (M1,M2,...,Ms)
level structure
R(kj).
Let the elements of the
If
s
and set > r, set
mj = width of g = kj
and
return to (B). < p, increase
(E)
If
(F)
Choose set
j
j
e {1,...,p}
h = kj.
j
by 1 and repeat (D). so that
mj
is minimal.
Then
412
III.
SOLVING SYSTEMS OF EQUATIONS
The first part of the algorithm thus determines two knots and
h
g
and the corresponding level structures R(h) = (M1,M2,...,M5).
R(g) _ (L1,L2,...,Lr), Lemma 20.9:
The following are true:
(1)
s = r.
(2)
h e Lr.
(3)
g e Mr.
(4)
Li c U Mr+l_j = Ai,
i
i = 1(1)r.
j=1 (5)
Mi c U Lr+l-j = Bi,
i = 1(1)r.
j=1
Proof:
For
Conclusion (2) is trivial.
the empty set.
in
Mi
Mi+l
All knots in
(cf. Theorem 20.8(2)).
Mi
s
let
M1 = {h} c Lr.
be
M.
is a subset of
are connected with knots
Lu_1, Lu, and
Bi, we have
Conclusion (5) implies that (cf. Step (D)), s < r.
Now the
Since the elements of
are connected only to elements in since
>
We then prove Conclusion (5) by induction.
By (2), the induction begins with induction step.
i
Lu
Lu+l, and
Mi+l c Bi+l.
s > r.
By construction
This proves Conclusion (1).
From (5)
we obtain M. c B. i
i
r-l i Ul
r-l
M. c
B.
Ul
i
=
Br-1
G-L
L1 C Mr g e Mr.
This is conclusion (3).
The proof of (4) is analogous to the
proof of (5), in view of (3).
13
20.
Band width reduction for sparse matrices
413
The conclusions of the lemma need to be supplemented In most practical applications it turns out
by experience. that:
The depth
(6)
and
R(h)
r
of the two level structures
R(g)
is either the maximal depth which can be achieved
for a level structure of
(G,-)
or a good approximation
thereto. (7)
Li n Mr+l_i, i = 1(1)n, in many cases
The sets
contain most of the elements of
L.
U
Mr+l_i.
The two level
structures thus are very similar, through seldom identical. (8)
The return from step (D) of the algorithm to step
(B) occurs at all in only a few examples, and then no more frequently than once per graph.
This observation naturally
is of great significance for computing times.
In the next part of the algorithm, the two level structures
and
R(g)
R(h)
are used to construct a level
structure S = (Sl,S2,....sr)
of the same depth and smallest possible width. k e G
cess, every knot
is assigned to a level
In the proSi
by one
of the following rules: Rule 1:
If
k c Li, let
Rule 2:
If
k e Mr+l-i, let
For the elements of
k c Si. k c Si.
Li n Mr+l-i, the two rules have the
So in any case, L. n Mr+1_i e Si.
same result.
The set
r
V=G splits into
p
-
U (Li n Mr +1 i)
i=1
connected components,
V1,V2,...,Vp.
SOLVING SYSTEMS OF EQUATIONS
III.
414
Unfortunately it is not possible to use one of the Rules 1 and 2 independently of each other for all the elements of
V.
Such an approach would generally not lead to a level structure
S.
But there is
Lemma 20.10:
the same
V
If in each connected component of
rule (either always 1 or always 2) is applied constantly, then
We leave the proof to the
is a level structure.
S
reader.
In the following we use elements of the set K2
the width of
Let
T.
to denote the number of
ITI
K1
be the width of
The second part of the algorithm
R(h).
consists of four separate steps for determining (G)
Compute
Si = L. n Mr+l-i,
and determine
S:
K2, set
and
KI
V.
V
If
i = 1(1)r
is the empty set, this part of
the algorithm is complete (and continue at (K)). wise split Order the (H)
Set v = 1.
(I)
Expand all from
Vv
into connected components
V
so that always
Vi
Si
Si,
to
by rule 1.
IVj+1I
<
Vj,
j
= l(1)p.
IVjI,
j
= 1(1)p-l.
Expand
Si
to
Si,
i
by rule 2.
Vv
= 1(1)r,
Compute
K3 = max{1Si1
Ji = 1(1)r
where
Si # Si}
K4 = max{jSij
Ji = 1(1)r
where
Si # Si}.
K3 < K4
otherwise set
or if
Other-
i = 1(1)r, by including the knots
by including the knots from
If
and
R(g)
K3 = K4
and
Si = Si, i = 1(1)r.
K1 < K2, set
Si = Si;
20.
Band width reduction for sparse matrices
(J)
For
v < p, increase
v
415
by 1 and repeat (I).
This completes the computation of the level structure S. S
Figure 20.11 shows the level structures The knots in
for a graph.
this case knots.
V
are denoted by
In
x.
consists of two components, each having three
For the left component rule 2 was used, and for the
right, rule 1. of 3.
G-V
R(g), R(h), and
In this way, one obtains the optimal width
The second part of the algorithm consumes the greater
part of the computing time.
This is especially so when
has many components.
Figure 20.11.
Level structures
R(g), R(h), and
S
V
416
SOLVING SYSTEMS OF EQUATIONS
III.
Finally, in the third and last part of the algorithm, one starts with the level structure
and derives the numbering
S
of the graph: (K)
Set
p = 1.
(L)
Let
k
Go to (N).
run through all the elements of
order of the numbering. k c Sp
For fixed
which are connected to
yet been numbered
S
in the
p-1
k, number all
and which have not
k
so that the degree of
k
does not
decrease. (M)
Let
k
run through all the elements of
Sp
which have
already been numbered, in the order of the numbering. For fixed
k, number all
k e Sp
which have not been
numbered and which are connected to degree of (N)
k
Sp
which are unnumbered,
search for an unnumbered element of
(0)
Increase
so that the
does not decrease.
If there remain elements of
degree.
k
Sp
of minimal
Assign it the next number and return to (M). p
by
1
and, if
p < r, return to (L).
This last step also completes the numbering.
In some
cases, the result can be improved through the use of an iterative algorithm due to Rosen 1968.
Lierz 1976 contains
a report on practical experience with the algorithm. All algorithms for band width reduction are useful primarily for sparse matrices which are not too large and irregularly patterned.
These arise primarily in finite ele-
ment approaches where the geometry is complicated. magnitudes are on the order of
n = 1000
and
Typical
w = 50.
21.
Buneman Algorithm
21.
Buneman Algorithm Let
417
be a rectangle with sides parallel to the axes,
G
and let the following problem be posed on
G:
Au(x,y) = q(x,y),
(x,y)
E G
u(x,y) = p(x,y),
(x,y)
E DG.
This problem is to be solved with a five point difference method.
We will assume that the distance separating neigh-
boring lattice points from each other is always equal to (cf. Section 13). function
w
h
The difference equations for the discrete
are then uniformly the same for all internal
lattice points:
w(x+h,y) + w(x-h,y) + w(x,y+h) + w(x,y-h)
-
4w(x,y)
= h2q(x,y) (x+h,y), (x-h,y), (x,y+h), or
If one of the points
should lie in the boundary of the rectangle, w placed by the boundary value
is to be re-
at that point.
the points by rows or columns.
(x,y-h)
We number
In either case, we obtain a
linear system of equations of the following type: Mw = z r A
M =
I
I
` A,
E MAT(pn,pn, 1R) I
`A (21.1)
A = 1
-1
I1
1
'4
F MAT(n,n, IR)
SOLVING SYSTEMS OF EQUATIONS
III.
418
wl
W =
Z
=
w.,z. cJRn
;
(i=1(1)p).
L zP J
wP J The inhomogeneity
z
contains
as well as the bound-
h2q(x,y)
(x,y).
ary value
0. Buneman has developed a simple, fast algorithm for solving (21.1) for the case where
p = 2k+1
-
k e N.
1,
wo = w k+l = 0.
In the following, we always let
Three conse-
2
cutive components of w.
J-2
+ Aw. j-1 w.
will satisfy the block equations:
w
= z. J-1
+
'jJ
+ Aw. + w.
J-1
= Z.
j+l
J
J
w. +Aw. j+2 J J+1 + w j+2
Multiply the middle equation by equations.
When restricted to
j
z
j+1*
-A
and then add the three
=
2(2)2k+1-2, this results
in - Azj. wj-2 + (21-A2)w.+wj+2 = zj-I + zj+l
This represents even index.
2k-1
(21.2)
block equations for the unknowns of
After solving this reduced system, we can deter-
mine the remaining unknowns of odd index by solving the ndimensional system Awj = zj
- wj_1 - wj+l,
j
=
1(2)2k+I-1
(21.3)
The reduction process described here can now be applied again, to the
2k-1
block equations (21.2), since these have the
same structure as (21.1).
Using the notation
Buneman Algorithm
21.
A(1) = 21-A z.
1
419
2 2(2)2k+1-2
Azj,
+ zj+1
=
j
we obtain the twice-reduced system of block equations w.
4
+ [2I-(A(l))21w+w J
=
j+4 +4
j,
z(1)+z(1)-A(1)z(1) J-2
What we have here is a system with
J+2
2k-1-1
=4(4)2k+l-4. j=4(4)2
block equations.
After this system is solved, the remaining unknowns of even index can be found as the solution of the n-dimensional system A(1)wj
2(4)2k+1-2.
wj
=
2
- wj+2,
=
j
The remaining unknowns of odd index are then computed via (21.3).
The reduction process we have described can obviously be repeated again and again, until after
reduction steps
k
only one system, with a single block equation, remains to be solved.
After that comes the process of substituting the un-
knowns already computed in the remaining systems of equations. The entire method can be described recursively as follows.
Let
A(0) = A z30)
Then, for
1(1)2k+1
= 1P
=
j
r = 1(1)k, define
A(r)
= 21
z(r)
=
(A(r-1))2 -
z(r-1) j-2r-1
+ z(r-1) j+2r-1
A(r-1) -
z(r-l) J
J
j
Then, for
=
2r(2r)2k+1-2r
r = k(-1)0, solve successively
,
(21.4)
420
III.
A(r)w.
=
w
w
z(r)
j+2r
J
SOLVING SYSTEMS OF EQUATIONS
j=2r(2r+l)2k+1-2r.
r,
(21.5)
j-2
This method is known as cyclic odd/even reduction, and frequently, simply as the COR algorithm. matrices
grows with
A(r)
The band width of the
r, so that a direct determination
of the matrices requires extensive matrix multiplication.
For
this reason, they are explicitly represented instead as the product of simpler matrices in the following. The recursion formula (21.4) implies that polynomial
of degree
P r
2r
in
A(r)
is a
A:
2
2
A(r) = P r(A) 2
r-1
2J 2( r)A,
=
j=0
e IR,
r = 1(1)k.
J
The highest order coefficient is
= -1. c(r,) r 2
We want to represent the polynomial as a product of linear factors.
To do that, we need to know its zeros.
found most easily via the substitution
These are
_ -2 cos 9.
From
(21.4) we obtain the functional equation
P2r(E) =
2
-
LP 2r-1(E) ] 2
which reduces inductively to P r(E) = -2 cos(2rO).
The zeros of
P r(E)
Ej
The matrices
=
A(r)
of linear factors:
thus are
2 cos('r+l},
j
= 1(1)2x.
therefore can be represented as a product
Buneman Algorithm
21.
421
r=0
A, A
(r)
-t =
1
(21.6) 2r
\
J01IA
=L
+
2 cos (2j-l nri], +T
r=1(1)k.
This factorization can be used both in the reduction phase (21.4) for computing
and in the following solution
phase (21.5) for determining
In this way, the matrix
wj.
multiplication as well as the systems of equations are limited to tridiagonal matrices of a special kind.
This last described method is called cyclic odd/even reduction with factorization, and frequently is simply known as the CORF algorithm.
Various theoretical and practical investigations have demonstrated that the COR and CORF algorithms described here are numerically unstable (cf. e.g. Buzbee-Golub-Nielson 1970 and Schr6der-Trottenberg-Reutersberg 1976).
Buneman has
developed a stabilization (cf. e.g. Buzbee-Golub-Nielson 1970) which is based on a mathematically equivalent reformulation from (21.4).
for computing the vectors Let
1(1)2k+1-1.
zj,
0,
r = 1(1)k, define
Then, for p(r)
=
J
(A(r-1))-l(p(r-l)
p(r-l) ]
q (r)
=
=
j
j-2
q(r-1) j-2r-1
+ q(r-1) j+2r-1
-
r- 1
2p (r)
+
p(r-1) j+2r- 1
-
q(r-1))
(21.7)
2r(2r)2k+1
j
=
The sequences
-
2r
per), qtr), and
are related as follows:
422
SOLVING SYSTEMS OF EQUATIONS
III.
A(r)pJr) + q,r) (21.8)
r = 0(1)k,
=
j
2r(2r)2k+1
The proof follows by induction on tion is immediate, since
2r
r = 0, the asser-
For
r.
0, and
q(0) = zj
Now suppose the conclusion (21.8) holds for step
r - r+l
z(0)
The induction
r.
is:
A(r 1)p(r+l)+q (r+l) J
=
= 2p(r+1)-(A(r))2p(r+l +q (r) +q (r) -2p(r+l) J J j-2r j+2r
J
-(A(r))2pjr)+A(r)(p(r)+p(r)r_gjr))+q(r)+q(r)
_
j-2
r
r
j+2
r
j-2
j+2
A(r)p(r)r+q(r)r+A(r)p(r)r+q(r)r j-2
j-2 =
z(r)r +z(r) r -A(r)z(r) J-2
J
j+2
j+2 j+2 z(r+l). =
The Buneman algorithm can be summarized as follows: 1.
Reduction phase:
Compute the sequences
per)
and
qtr)
from (21.7): 0,
J
1(1)2k+1-1
z. J
J
A(r-l)(p(r-l)_p(r))
=
J
=
p(r-T)1 + p(r-T)1
q(r-1) -
J
P(r) J (r) gj
=
2.
j-2
p(r-1)
(p(r-1)
`J
-
J +
(r-1)
= qj-2r-1
r = 1(1)k,
J+2
j
_
(r-1)
p(r)) J 2
qj+2r-1
=
Solution phase:
2r(2r)2k+1
(21.9)
J
(r)
pj -
2r
Solve the system of equations (21.5),
using the relation (21.8):
21.
Buneman Algorithm
423
- w
qtr)
j+2
wj = p r)
+
- w
r
j-2
r
(wj -p r) )
r = k(-1)0,
(21.10) 2r(2r+1)2k+1
=
j
-
2r.
In both the reduction and solution phase we have systems of equations with coefficient matrix
A(r), and these systems
are solved by using the factorization (21.6).
system it is necessary to solve
Thus, for each
systems in the special
2r
symmetric, diagonal dominant, tridiagonal matrix
r c-4 1
`1
c-4
-2 < c = 2 cos(2 r+i ir) < 2. 2
The vectors
p30)
q30), which are set equal to zero and
and
zj, respectively, at the beginning of the reduction phase, take up a segment of length
pn =
(2k+1-1)n
in memory.
cause of the special order of computation, the vectors and
qtr)
can overwrite
and
Beper)
in their respec-
tive places, since the latter quantities are no longer needed. Similarly, in the solution phase, the successively computed solution vectors
wj
can overwrite the
qtr)
in their cor-
responding places. For
k = 2
and
p = 7, the sequence of computation
and memory content in the reduction and solution phases is given in Table 21.11.
r
0
1
2
2
1
0
Step
1
2
3
1
2
3
,
3
p (0)
,
S
P (0)
7
w2, w 4, w6
a (0) ,q 3(0) ,q (0) ,a (C) 1 5 7
1
p (0)
w4
a 2(1) ,q 6(1)
2
PM
q42)
4
,0.
w 1'
w 3'
w2'' 6
W4
p ( 2)
P6(1)
x 4(2)
a 2(1) ,q 4(1) ,q 6(1)
,
P 4(2)
P 4(1) ,6 P (1)
P2(1) p
P (0)
q2(1) 0.
w S'
(1) 4
4 ,
w
j= 1(1)7
a 1(0) ,a 2(0) ,a (0) ,a 4(0) ,a (0) ,a 6(0) >a 7(0) 3 5
z.
P2
3
P 1(0) ,P 2(0) ,P 3 (0) ,P 4(0) ,P 5(0) ,P 6(0) ,P 7(0)
3
initialization: p(.0 )=0, q(0)=
computes
7
x (1) 6
6
{
,
P 2(1)
,2 a (1)
2
,3 P (0)
,a (0) 3
3
,
6
,a S(0) ,a (1) (1) 4 6
5
P 4(2) ,P 5(0) ,P 6(1)
, 0.
4
,
,
P (0) 7
a (0) 7
P7
w
1'
w
2'
w
3'
w
4'
w S
'w61w7
7
,g2 1) ,g 30) ,w4,q 50) ,q 61) ,q 0)
q1(0) ,w2,g3(0) ,w 4,g5(0) ,w6, g7(0)
qi 0)
a 1(0) ,a 2(1) ,a (0) ,a (2) ,a 5(0) ,q 6(1) ,a 7(0) 3 4
P 1(0)
a 1(0)
1
memory content at end of step
Stages of the computation for the reduction phase (upper part) and the solution phase (lower part)
requires
Table 21.11:
Buneman Algorithm
21.
425
The Buneman algorithm described here thus requires twice as many memory locations as there are discretization points.
There exists a modification, (see Buzbee-Golub-
Nielson 1970), which uses the same number of locations.
How-
ever, the modification requires more extensive computation. We next determine the number of arithmetic operations in the Buneman algorithm for the case of a square p = n.
G, i.e.
Solving a single tridiagonal system requires
6n-5
operations when a specialized Gaussian elimination is used. For the reduction phase, noting that p = n =
2k+l
k+l = log2(n+1),
1,
-
we find the number of operations to be k
[6n+2r 1(6n-5)]
I
I
r=l j eMr where 2r(2r)2k+l-2r}
Mr =
{j
2k+l-r
and cardinality(Mr) = k
k+1 r
2(2 1)[6n+2
=
-
r -l
1.
Therefore,
(6n-5)] = 3n1og2(n+l)
r=1
+ 0[n log2(n+l)].
Similarly, for the solution phase we have k
J_ [3n+2r(6n-5)]
I
r=0 j eMr where Mr =
{i
and cardinality(:Nr) = 2k k 1
r=0
2k r[3n+2r(6n-5)]
=
r
=
2r(2r+l)2k+1-2r} Therefore,
3n21og2(n+l)+3n2+0[n log2(n+l)].
426
SOLVING SYSTEMS OF EQUATIONS
III.
Altogether, we get that the number of operations in the Buneman algorithm is 6n21og2(n+l) + 3n2 + O[n log2(n+l)].
At the beginning of the method, it is necessary to compute values of cosine, but with the use of the appropriate recursion formula, this requires only
0(n)
operations, and thus
can be ignored.
Finally, Appendix 6 contains a FORTRAN program for the Buneman algorithm. 22.
The Schroder-Trottenberg reduction method We begin by way of introduction with the ordinary dif-
ferential equation
-u"(x) = q(x). The standard discretization is 2u(x)-u(x+h)-u(x-h)
= q(x)
h2
for which there is the alternative notation (2I-Th-ThI)u(x) Here
I
=
denotes the identity and
h2q(x).
Th
the translation opera-
tor defined by Thu(x) = u(x+h).
Multiplying equation (22.1) by (2I+Th+Th1)
yields (2I-Th-Th2)u(x)
=
(22.1)
h2(21+Th+Th1)q(x)
The Schroder-Trottenberg reduction method
22.
427
Set
q0(x) = q(x),
ql(x) = (21+Th+Th1)go(x)
and use the relations Th = Tjh,
ThJ
=
This simple notational change leads to the simply reduced equation (2I-T2h-T-h1 )u(x)
=
h2g1(x).
This process can be repeated arbitrarily often.
The m-fold
reduced equations are (2I-Tkh-Tkh)u(x) = h2gm(x)
where
k = 2m
(22.2)
and
qm(x) _ (21+TRh+Tth)gm-1(x),
R =
2m-1
The boundary value problem -u"(x) = q(x),
u(O) = A,
can now be solved as follows. into
2n
u(l) = B
Divide the interval
subintervals of length
h = 1/2n.
Then compute
sequentially ql(x)
at the points
2h, 4h, ..., 1-2h
q2(x)
at the points
4h, 8h, ..., 1-4h
qn-1(x) The
at the point
2n-lh = 1/2.
(n-l)-fold reduced equation
[0,1]
428
SOLVING SYSTEMS OF EQUATIONS
III.
(2I-T n 1 -T n-1 )u(1/2) = h2gn 1(1/2)
h
2
to be determined immediately, since the
u(1/2)
allows
values of
u
h
2
are known at the endpoints
and
0
Simi-
1.
larly, we immediately obtain
u(1/4)
(n-2)-fold reduced equation.
Continuing with the method
and
u(3/4)
from the
described leads successively to the functional values of at the lattice points
jh,
u
= 1(1)2n-1.
j
In the following we generalize the one dimensional reduction method to differential or difference equations with constant coefficients. Definition 22.3:
Gh = {vh
Let
I
v e a}, h > 0.
Then we call
r
av Th
Sh =
av c R
v=-r
a one-dimensional difference star on the lattice called symmetric if a2v =
odd (even) if
a v = av 0
for
v = 1(1)r.
(a2v+1 = 0) for all
-r < 2v < r
(-r < 2v+l < r).
the sum for
Sh
v
Gh.
Sh
Sh
is
is called
with
In Schroder-Trottenberg 1973,
is simply abbreviated to [a-ra-r+l .....* ar-lar]h.
o
Each difference star obviously can be split into a sum
Sh = Ph + Qh, where Since
Ph T
2hv
is an even difference star, and
Qh
is an odd one.
v
= T 2h, the even part can also be regarded as the
difference star of the lattice
G2h = {20
I
v e22}.
The reduction step now looks like this:
The Schroder-Trottenberg reduction method
22.
Shu(x) = (Ph+Qh)u(x)
=
429
h2q(x)
(Ph-Qh)(Ph+Qh)u(x) = (Ph-Qh)u(x) = h2(Ph-Qh)q(x) Ph
Since
obviously is an even difference star, it can
Q2
-
be represented as a difference star on the lattice denote it by
We
G2h.
We further define
R2h.
ql(x) = (Ph-Qh)q(x) Thus one obtains the simply reduced equation R2hu(x) = h2g1(x).
In the special case Sh
1Th1 + a o Th + a1Th
one obtains 2 o
2
Ph = aoTh 2
Q2 = a2 Th2 + 2a-la1Th + a T R2h =
2-T-1 2h +
2_
2
- aIT2h.
The reduction process can now be carried on in analogy with (22.2).
In general, the difference star will change from
step to step.
The number of summands, however, remains con-
stant.
The reduction method described also surpasses the Gauss algorithm in one dimension in numerical stability. main concern however is with two dimensional problems.
Our
We
begin once again with a simple example.
The standard discretization of the differential equation
430
SOLVING SYSTEMS OF EQUATIONS
III.
-Au(x) = q(x),
x c IR 2
is
4u(x)-u(x+he1)-u(x-hel)-u(x+he2)-u(x-he2) = h2q(x).
The translation operators Tv,hu(x) = u(x+hev)
lead to the notation (4I-Tl,h-T- 1h-T2,h-T-1 )u(x) = h2q(x) I, 2,h
Multiplying the equation by (41+T1,h+T11h+TZ
T_
1
yields T_
{16I
1
T2,h
-
-
T12h-T2
h2(4I+Tl,h+T11h+T2
h-T2_
h+T21h)q(x)
We set
ql(x) = (41+Tl,h+T11h+T2,h+T21h)q(x)
and obtain the simply reduced equations {121-2(T
l,hT2,h+Tl,hT2,h+TllhT2,h+Tl,hT2,h)
h+T12h+T2 h+T22h))u(x)
=
h2g1(x),
(T121
In contrast to the reduction process for a one dimensional difference equation, here the number of summands has increased from
5
to
spread farther apart.
9.
The related lattice points have The new difference star is an even
polynomial in the four translation operators
Tl,h' Tllh'
The Schroder-Trottenberg reduction method
22.
®
431
0 8
Figure 22.4:
Related lattice points after one reduction step.
T2 h,
and
The related lattice points are shown in
TZlh.
Figure 22.4 with an "x".
Such an even polynomial can be
rewritten as a polynomial in two 'larger' translation operators.
Let e3 = (el + e2)/./-2-
e 4 = (e2
(22.5)
- el)/17
be the unit vectors rotated by
it/4
counterclockwise, and
let
k = h"T T3,ku(x) = u(x+ke3) = u(x+hel+he2) T4 ku(x) = u(x+ke4) = u(x+he2-hel). Then we have T3,k =
Tl,h T2,h -1
-1
-1
(22.6)
-1
Tl,h T2,h.
The simply reduced equation therefore can be written in the following form:
432
SOLVING SYSTEMS OF EQUATIONS
III.
{12I-2(T3,k+T4,k+T31k+T41k)-(T3,kT4'k+T3'kT-1k+T-1kT4,k + T3IkT41k)u(x) = h2g1(x).
The difference star on the left side once again can be split into an even part, 121
(T3,kT4,k+T3,kT41k+T31kT4,k+T31kT41k)
and an odd part,
2(T+TT 3,k 4,k+ and reduced anew.
3,1k+T- 1
4,k)
One then obtains a polynomial in the
translation operators
T1,2h, T1,2h' T2,2h, T2,2h.
Thus, beginning with a polynomial in -1
-1
Tl,h' Tl,h' T2,h' T2,h we have obtained, after one reduction step, a polynomial in
-1
T3,h/' T3,hr' T4,h 2
-1 T4,hV2
and after two reduction steps, a polynomial in -1
1
T1,2h' T1,2h' T2,2h' T2,2h In particular, this means that the and
e2 -> e4
r/4
rotation
e1
e3
has been undone after the second reduction.
This process as described can be repeated arbitrarily often. The amount of computation required, however, grows substantially with the repetitions.
For this reason we have not
carried the second reduction step through explicitly. We now discuss the general case of two-dimensional reduction methods for differential and difference equations
The Schroder-Trottenberg reduction method
22.
with constant coefficients.
433
We preface this with some gen-
eral results on polynomials and difference stars. Definition 22.7: p
variables
be a real polynomial in
P(xl,...,xp)
Let
xi,...,xp.
is called even if
P
P(-x1,...,-xp) = P(xl,...,xp and
is called odd if
P
P(-xl,...,-xp) _ -P(xl,...,xp It is obvious that the product of two even or of two odd polynomials is even, and that the product of an even polyFurther, every polynomial can
nomial with an odd one is odd.
be written as the sum of one even and one odd polynomial. Lemma 22.8:
Let
P(xl,x2,x3,x4)
there exist polynomials
and
P
P(x,x 1,y,Y 1) = P(xy,(xy) P(xY,(xY)
.,-
i+j
l,yx l,xy 1)
=
P
Then
with the properties:
l,yx l,xy 1)
P(x2,x 2,y2,y-2).
We have
Proof:
Since
be an even polynomial.
P
--1 --
-
S
iE7l
aijx iY J
aij
,
E
pl
is even, we need only sum over those i,j For such
even.
x1Y3
---11
=
i,j
(xy)r(yx-1)s,
E 7l
with
we have r = (i+j)/2,
s
= (j-i)/2.
We obtain P(x,x-1,Y,Y-1)
I
r,sE ZZ
r+s(xY)r(Yx-1)s
ar-s ,
and therefore, the first part of the conclusion.
Similarly,
434
SOLVING SYSTEMS OF EQUATIONS
III.
we obtain 1)
l,yx-l,xy
P(xy,(xy)
=
a..(xy)1(yx
X
1)j
i.,jE ZZ
(xy)1(yx-1 ) j
=
P(xy,(xy) l,yx
x2ry2s, l,xy 1)
ar+s,s
rx2ry2s 0
r,sE ZZ h > 0, define the lattice
For fixed
Definition 22.9:
s = (i+j)/2
r = (i-j)/2,
Gv
by
{xe7R2Ix=2v/2h(iel+je2); i,jeZZ}
when v is even
{xe ]R2Ix=2v/2h(ie3+je4); i,jeZl}
when v is odd.
Gv = .
A difference star Ti
k,
on the lattice
Sv
Gv
T2-1k
for even
T4-1k
for odd
Tllk, T2 k,
is a polynomial in v
or in
T3 k, T31k, T4 k, Here
k = 2v/2h.
Sv
is called even (odd) if the correspond-
ing polynomial is even (odd). Go = Gl = G2 D ...
p
O
.
0
It follows that
Figure 22.10 shows
@
p
Go, G1, G2, and
Go
:
G1
:
G2
:
G3
Figure 22.10:
v.
.
0 X
D
A representation of lattices Go, Gl, G2 and G3.
G3.
22.
The Schr6der-Trottenberg reduction method
Theorem 22.11: Gv.
be partitioned into a sum
Sv
Let
S
where
be a difference star on the lattice
Sv
Let
v = Pv
Qv
is an even difference star and
PV
435
ference star.
Qv
is an odd dif-
Then
Sv+1 = RV - Qv)Sv may be regarded as a difference star on the coarser lattice Gv+l c Gv. Proof:
The difference star S
is even since
P2
v+l
2
= P2v
and
- Qv
are even.
Q2
Now let
By Lemma 22.8, there exists a polynomial Sv+l(Tl,k,T-
1
k,T2,k,T-
Sv+l
v
be even.
with
1
2, k)
l,
1,T2,kT1'k,Tl,kT2,k). Sv+1(Tl,kT2,k,(Tl,kT2,k) Since
1
T l,k
one obtains, where
_ T4,k,
m = k/ = 2(v+l)/2h,
Sv+l(Tl,k,Tl1k,T2,k,T2 k) = 9v+l(T3,m,T3,mT4,m,T4,m).
The right side of this equation is a difference star on the lattice
Gv+l.
For odd
taking note of (22.6):
v, one obtains, correspondingly and
SOLVING SYSTEMS OF EQUATIONS
III.
436
-1
-1
Sv+l(Tl,RT2,L,(Tl,kT2,k)
-1
-1
-1
,T1,kT2,k,Tl,kT2,k)
Sv+1(T1,k,T1 2 k,T2,k,T2,2) = Sv+l(Tl,m,Tl,m,T2,m,T2,m) 2
k = k//i,
m = 2k = k/ = 2(v+l)/2h.
a
We may summarize these results as follows: Reduction step for equations with constant coefficients: initial equation:
Svu(x) = h2gv(x),
partition:
Sv = pv + Qv,
compute:
qv+l(x) _ (PV-Qv)gv(x),
x E Gv+l
reduced equation:
Sv+lu(x) = h2gv+1(x),
x e Gv+1'
Pv
x E Gv even, Q.
odd
The reduction process can be repeated arbitrarily often. The difference stars then depend only on the differential equation and not on its right side.
Thus, for a particular
type of differential equation, the stars can be computed once for all time and then stored.
Our representation of the difference stars becomes rather involved for large
f
v.
aiiT iI.
Instead of
k . Tj2.k
for
v
even
for
v
odd
Sv =
i
j
1i T 3,k T 4,k i,3EZZ a
k = 2v/2h
we can use the Schroder-Trottenberg abbreviation and write
22.
The Schroder-Trottenberg reduction method
-1,1
437
......
a0,1
a1,1
a0,-1
al,-1 ......
Sv
a-1,-1
V
With the differential equation
-Au(x) = q(x), the standard discretization leads to the star
So =
0
-1
0
-1
4
-1
0
-1
0 0
The first three reductions are:
S1 =
-1
12
-2
-1
-2
-1 1
1
0
1
0
0
0
-2
-32
-2
0
1
-32
132
-32
1
0
-2
-32
-2
0
0
1
0
0
S3 =
-2
-2
0
S2 =
r
-1
0J 2
l
-4
-4
-752
-2584
-752
6
-2584
13348
-2584
6
-4
-752
-2584
-752
-4
1
-4
6
-4
The even components
6
PV
-4
of the stars
1 "1 -4
1J3 SV
are:
438
SOLVING SYSTEMS OF EQUATIONS
III.
PO =
P2
0
0
0
0
4
0
0
0
0
0
0
0
1
0
-2
0
-2
0
0
132
0
1
0
-2
0
-2
0
0
0
1
0
0
0
1
6 0
1
0
-752 0
-752 0
6 0
13348 0
6
-1
0
-1
0
12
0
-1
0
-1
1
0
0
0 3
,
1
1
P
P1
-752
2
0
0
6
-752
0
0
1
3
When the reduction method is applied to boundary value problems for elliptic differential equations, a difficulty arises in that the formulas are valid only on a restricted region whenever the lattice point under consideration is sufficiently far from the boundary.
In certain special cases, however, one
is able to evade this constraint. Let the unit square G = (0,1) x (0,1) be given.
Let the boundary conditions either be periodic, i.e. u(O,x2) = u(1,x2) u(x1,0) = u(xl,l) x1,x2 e [0,1] alu(0,x2) = a1u(1,x2) a2u(xl,o) = a2u(x1,l),
or homogeneous, i.e.
u(xl,x2) =
0
for
(xl,x2) e
DG.
Let
the difference star approximating the differential equation be a symmetric nine point formula:
22.
The Schroder-Trottenberg reduction method
SO
FY
9
3
a
Y
439
Y S
Y
10'
The extension of the domain of definition of the difference equation S0u(x) = q0(x) to
x E Go
is accomplished in these special cases by a con-
tinuation of the lattice function
to all of
u(x)
the case of periodic boundary conditions, u(x)
are continued periodically on boundary conditions, u(x)
and
G0.
and
Go.
In
q0(x)
In the case of homogeneous
q0(x)
are continued anti-
symmetrically with respect to the boundaries of the unit square:
u(-xl,x2) _ -u(xl,x2),
g0(-xl,x2) = -g0(x1,x2)
u(x1,-x2) = -u(xl,x2),
go(x1,-x2) = -g0(x1,x2)
u(l+xl,x2) _ -u(1-xl)x2), q0(1+x1,x2) = -q0(l-xl,x2) u(xl,l+x2) = -u(xl,l-x2), 90(x1,1+x2) = -go(xl,l-x2) g0(x1,x2) =
0
for
(xl,x2) e @G.
This leads to doubly periodic functions.
Figure 22.12 shows
their relationship.
Figure 22.12:
Antisymmetric continuation.
The homogeneous boundary conditions and the symmetry of the difference star assure the validity of the extended difference equations at the boundary points of
G, and therefore,
440
SOLVING SYSTEMS OF EQUATIONS
III.
on all of
An analogous extension of the exact solution
Go.
of the differential equation, however, is not normally possible, since the resulting function will not be twice differentiable.
We present an example in which we carry out the reduction method in the case of a homogeneous boundary condition and for
h = 1/n
and
After three reduction steps,
n = 4.
we obtain a star, S3, which is a polynomial in the translation operators 1 -1 T3,k,T3,k,T4,k,T4,k,
k=
2
3/2
h = 2hI.
The corresponding difference equation holds on the lattice G3
(cf. Figure 22.10).
Because of the special nature of the
u, it only remains to satisfy the equation
extension of
S3u(1/2,1/2) = 83(1/2,1/2).
By appealing to periodicity and symmetry, we can determine all summands from
u(1/2,1/2), u(0,0), u(1,0), u(0,1), and
But the values of u(1/2,1/2)
u
on the boundary are zero.
can be computed immediately.
sulting from the difference star
S2
u(1,1).
Thus
The equations re-
on the lattice
longer contain new unknowns (cf. Figure 22.10).
G2
no
The equations
S1u(1/4,1/4) = gl(1/4,1/4),
Slu(3/4,1/4) = gl(3/4,1/4),
S1u(1/4,3/4) = g1(1/4,3/4),
Slu(3/4,3/4) = g1(3/4,3/4),
for the values of
u
at the lattice points
(3/4,1/4), (1/4,3/4), and
(3/4,3/4)
otherwise only the boundary values of determined value
u(1/2,1/2)
(1/4,1/4),
are still coupled; u
and the by now
are involved.
Thus the
4 x 4
The Schroder-Trottenberz reduction method
22.
system can be solved.
441
As it is strictly diagonally dominant,
it can be solved, for example, in a few steps (about 3 to 5) with SOR.
All remaining unknowns are then determined from
S0u(1/2,1/4) = q0(1/2,1/4), S0u(1/4,1/2) = q0(1/4,1/2), S0u(3/4,1/2) = 80(3/4,1/2), S0u(1/2,3/4) = g0(1/2,3/4).
In all cases arising in practice, this system too is strictly diagonally dominant.
The method of solution described can
generally always be carried out when say
n = 2m, m e 1N, h = 1/n.
n
is a power of
2,
The (2m-l)-times reduction
equation S2m-lu(1/2,1/2)
= q2m-1(1/2,1/2) u(1/2,1/2).
is then simply an equation for
The values of
u
at the remaining lattice points then follow one after another from strictly diagonally dominant systems of equations. By looking at the difference stars
S1, S2, S3
formed
from -1
0
-1
4
-1
0
-1
0
0
S
0
=
one can see that the number of coefficients differing from In addition, the coefficients differ
zero increases rapidly.
greatly in order of magnitude.
This phenomenon generally can
be observed with all following
SV, and it is independent of
the initial star in all practical cases.
As a result, one
typically does not work with a complete difference star SV, but rather with an appropriately truncated star. a truncation parameter
a
Thus, after
has been specified, all coeffici-
442
SOLVING SYSTEMS OF EQUATIONS
III.
of
ents
with
SV
IaijI
< a1aool
are replaced by
aiJ
zeros.
For sufficiently small
this has no great influ-
a
ence on the accuracy of the computed approximation values of 10-g
u.
As one example, the choice of
for the case of
a =
the initial star
S
=
0
0
1
1
4
-1
0
-1
0
with
leads to the discarding of all coefficients a1J
Iii
+
Iii
> 4,
lii
>
3,
Iii
> 3.
In conclusion, we would like to compare the number of operations required for the Gauss-Seidel method (single-step method), the SOR interation, the Buneman algorithm, and the reduction method.
We arrived at Table 22.13 by restricting
ourselves to the Poisson equation on the unit square (model problem) together with the classic five point difference formula.
All lower order terms are discarded.
N2
denotes the
number of interior lattice points, respectively the number of unknowns in the system.
The computational effort for the
iterative method depends additionally on the factor which the initial error is to be reduced.
e
by
For the reduction
method, we assume that all stars are truncated by
o = 10
8.
More exact comparisons are undertaken in Dorr 1970 and Schroder-Trottenberg 1976. marks on computing times.
Appendices 4 and 6 contain reIn looking at this comparison, note
that iterative methods are also applicable to more general problems.
22.
The Schroder-Trottenberg reduction method
Gauss-Seidel
443
2 N4Ilog ej Tr
SOR
21
N3Ilog e
Buneman
[61og2(N+1)+3]N2
Reduction
36 N2
Table 22.13:
The number of operations for the model problem
APPENDICES: FORTRAN PROGRAMS
Appendix 0:
Introduction.
The path from the mathematical formulation of an algorithm to its realization as an effective program is often difficult.
We want to illustrate this propostion with six
typical examples from our field.
The selections are intended
to indicate the multiplicity of methods and to provide the reader with some insight into the technical details involved. Each appendix emphasizes a different perspective:
computa-
tion of characteristics (Appendix 1), problems in nonlinear implicit difference methods (Appendix 2), storage for more than two independent variables (Appendix 3), description of arbitrary regions (Appendix 4), graph theoretical aids (Appendix 5), and a comprehensive program for a fast direct method (Appendix 6).
Some especially difficult questions,
such as step size control, cannot be discussed here.
As an aid to readability we have divided each problem into a greater number of subroutines than is usual.
This is
an approach we generally recommend, since it greatly simplifies the development and debugging of programs. 444
Those who
Appendix 0:
Introduction
445
are intent on saving computing time can always create a less structured formulation afterwards, since it will only be necessary to integrate the smaller subroutines.
However, with
each modification, one should start anew with the highly structured original program.
An alternative approach to re-
duced computing time is to rewrite frequently called subroutines in assembly language.
This will not affect the
readability of the total program so long as programs equivalent in content are available in a higher language.
The choice of FORTRAN as the programming language was a hard one for us, since we prefer to use PL/l or PASCAL. However, FORTRAN is still the most widespread language in the technical and scientific domain.
The appropriate compiler
is resident in practically all installations.
Further,
FORTRAN programs generally run much faster than programs in the more readable languages, which is a fact of considerable significance in the solution of partial differential equations.
The programs presented here were debugged on the CDCCYBER 76 installation at the University of Koln and in part on the IBM 370/168 installed at the nuclear research station Julich GmbH.
They should run on other installations without
any great changes.
We have been careful with all nested loops, to avoid any unnecessary interchange of variables in machines with virtual memory or buffered core memory. for example, the loop DO 10 I = 1,100
DO 10 J = 1,100 10
A(J,I) =
0
In such installations,
APPENDICES
446
is substantially faster than DO 10 I = 1,100
DO 10 J = 1,100
10
A(I,J) = 0.
This is so because when FORTRAN is used, the elements of
A
appear in memory in the following order:
A(1,l), A(2,1), ..., A(100,1), A(1,2), A(2,2),
..
For most other programming languages, the order is the contrary one:
A(l,l), A(1,2), ..., A(1,100), A(2,1), A(2,2),
...
.
For this reason, a translation of our programs into ALGOL, PASCAL, or PL/l requires an interchange of all indices. There is no measure of computing time which is independent of the machine or the compiler.
If one measures the
time for very similar programs running on different installations, one finds quite substantial differences.
We have ob-
served differences with ratios as great as 1:3, without any plausible explanations to account for this.
It is often a
pure coincidence when the given compiler produces the optimal translation for the most important loops..
Therefore, we use
the number of equally weighted floating point operations as a measure of computing time in these appendices.
This count
is more or less on target only with the large installations. On the smaller machines, multiplication and division consume substantially more time than addition and subtraction.
Appendix 1:
Method of Massau
Appendix 1:
Method of Massau
447
This method is described in detail in Section 3.
It
deals with an initial value problem for a quasilinear hyperbolic system of two equations on a region
where
in
G
uy = A(x,y,u)ux + g(x,y)u),
(x,y)
e G
u(x,0) _ ip(x,0),
(x,0)
e G
A e CI(G x G, MAT(2,2,]R)), and
is an arbitrary subset of
at the equidistant points belong to
g e C1(G x G,IR2).
1R2.
(x) _ (i1(x),Iy2(x))
The initial values
1R2:
are given
(xi,0), insofar as these points
G:
+ 2h(i-nl),
xi = xn
i = nl(1)n2.
1
The lattice points in this interval which do not belong to G
must be marked by
B(I) = .FALSE..
Throughout the complete computation, U(*,I)
contains
the following four components: ul(xi,yi),
u2(xi,yi),
xi,
yi.
The corresponding characteristic coordinates (cf. Figure 3.10) are:
SIGMA = SIGMAO + 2hi,
TAU.
At the start, the COMMON block MASS has to be filled:
Ni = n1 N2 = n2
H2 = 2h SIGMAO = xn
-
1
TAU = 0.
2hn1
APPENDICES
448
For those
for which
i
B(I) =
belongs to
(xi,0)
G, we set
TRUE.
U(1,I) = 01(xi) U(2,I) = 02(xi)
U(3,I) = xi U(4,I) = 0,
and otherwise, simply set B(I) =
FALSE.
Each time MASSAU is called, a new level of the characteristic lattice (TAU = h, TAU = 2h, ...) is computed. also alters N1, N2, and SIGMAO.
The program
The number of lattice points
in each level reduces each time by at least one. N2 < Ni.
At the end,
The results for a level can be printed out between
two calls of MASSAU.
To describe
A
and
g, it is necessary in each con-
crete case to write two subroutines, MATRIX and QUELL. initial parameter is a vector u1(x,Y),
The
of length 4, with components
U
u2(x,Y),
x, Y
Program MATRIX sets one more logical variable, L, as follows: L =
tion
FALSE. if G x G
lies outside the common domain of defini-
U
of
A
and
g; L =
TRUE. otherwise.
To determine the eigenvalues and eigenvectors of MASSAU calls the subroutine EIGEN.
A,
Both programs contain a
number of checks: (1)
Are
(2)
Are the eigenvalues of
(3)
Is the orientation of the (x,y)-coordinate system and of the
A
and
g
defined? A
(o,2)-system the same?
real and distinct?
Appendix 1:
Method of Massau
449
The lattice point under consideration is deleted as necessary B(I) = .FALSE.).
(by setting
Consider the example
2u2
u1
A=
,
g = 0, * =
10exsin(2lrx) ,
ul
u2/2
G = (0,2) x]R.
1
Figures 1 and 2 show the characteristic net in the system and the (x,y)-system.
For
(o,T)-
2h = 1/32, we have the
sector
0.656250 < o < 1.406250 0.312500 < T < 0.703125
20th to 45th level.
In the (x,y)-coordinate system, different characteristics of the type
a + T = constant will intersect each other, so in
this region the computation cannot be carried out.
If one
drops the above checks, one obtains results for the complete In the region where the solution
"determinancy triangle".
exists, there cannot be any intersection of characteristics in the same family.
The method of Massau is particularly well suited to global extrapolation, as shown with the following example: 0
A =
0
1
2 2
1-u2 1
2u1u
,
g =
4ule
2x
G = (0,1) x IR. The exact solution is u(x,y) = 2ex cos y
The corresponding program and the subroutine MATRIX and QUELL are listed below.
0.35
0.40
0. 50
0 .60
0 .70
0.75
0.80
0.7
0.8
Figure 1:
0'.9
1 .0
1.1
Characteristic net in the (x,y)-coordinate system
1.2
1.3
452
APPENDICES
Table 3 contains the results for
(a,T) = (1/2,1/2).
The discrete solution has an asymptotic series of the form T0(x,Y) + T1(x,Y)h + T2(x,Y)h2 +
The results after the first and second extrapolation are found in Tables 4 and S.
The errors
Du1
and
Au2
in all
three tables were computed from the values in a column: Dul = 2exsin y - ul,
Au2 = 2excos y - u2.
H2
1/32
1/64
x
1.551750
1.582640
1.598355
1.606317
y
1.122827
1.138263
1.147540
1.152624
Ul
8.804323
9.004039
9.104452
9.154850
U2
3.361230
3.664041
3.838782
3.932681
AU1 -0.296282
-0.165040
-0.087380
-0.044999
0.727335
0.416843
0.223264
0.115574
AU2
Table 3:
1/128
1/256
Results for (o,T) = (1/2,1/2).
1.614279 1.157708
1.613530 1.153699
1.614070
y Ul
9.203755
9.204865
U2
3.966852
4.013523
9.205248 4.026580
tU1
-0.023579
-0.007085
-0.001948
AU2
0.100843
0.027710
0.007299
x
Table 4:
1.156817
Results after the first extrapolation
Appendix 1:
Method of Nassau
453
x
1,614250
1.614349
y
1.157856
1.158005
U1
9.205235
9.205376
U2
4.029080
4.030932
LU1
-0.001605
-0.000234
AU2
0.003320
0.000496
Table 5:
Results after the second extrapolation
SUBROUTINE MASSAU C
C C C C
C C C C C
VARIABLES OF THE COMMON BLOCK U(1,I) AND U(2,I) COMPONENTS OF U, U(3,I)=X, U(4,I)=Y, WHERE MI.LE.I.LE.N2. B(I)=.FALSE. MEANS THAT THE POINT DOES NOT BELONG TO THE GRID. THE CHARACTERISTIC COORDINATES OF THE POINT (U(3,I),U(4,I)) ARE (SIGMAO+I*H2,TAU). THE BLOCK MUST BE INITIALIZED BY THE MAIN PROGRAMME.
C
REAL U(4,500),SIGMAO,TAU,H2 INTEGER N1,N2 LOGICAL B(500) COMMON /MASS/ U,SIGMAO,TAU,H2,N1,N2,B C
REAL E1(2,2),E2(2,2),LAMBI(2),LAMB2(2),G1(2),G2(2) REAL CI,C2,D,XO,X1,X2,YO,Y1,Y2,RD REAL DXOI,DX21,DY01,DY02,DY21,UI1,U12 INTEGER N3,N4,I LOGICAL L,LL C
N3=N2-N1 10 IF(N3.LE.0) RETURN IF(B(N1)) GOTO 20 11 N1=N1+1 N3=N3-1 GOTO 10 20 LL=.FALSE. N4=N2-1 C C
BEGINNING OF THE MAIN LOOP
C
DO 100 I=N1,N4 IF(.NOT.B(I+1)) GOTO 90 IF(LL) GOTO 30 IF(.NOT.B(I)) GOTO 90 CALL EIGEN(U(1,I),E1,LAMBI,L) IF(.NOT.L) GOTO 90 CALL OUELL(U(1,I),G1)
APPENDICES
454
30 CALL EIGEN(U(1,I+1),E2,LAMB2,L) IF(.NOT.L) GOTO 90 CALL QUELL(U(1,I+1),G2) C C C
SOLUTION OF THE FOLLOWING EQUATIONS (XO-X1)+LAMBI(1)*(YO-Y1)=O (XO-X1)+LAMB2(2)*(YO-Y1)=(X2-X1)+LAMB2(2)*(Y2-Y1)
C
C
CI=LAMBI (I) C2=LAI4B2 (2) D=C2-C1 IF(D.LT.1.E-6*AMAX1(ABS(C1),ABS(C2))) GOTO 80 X1=U(3,I) X2=U(3,I+1)
Y1=U (4, I) Y2=U(4,I+1) DX21=X2-X1 DY21=Y2-Y1 RD=(DX21+C2*DY21)/D DXOI=-C1*RD DYOI=RD XO=XI+DXOI Y0=Y1+DYO1 DYOZ=YO-Y2 C
C C C
CHECK WHETHER THE TRANSFORMATION FROM (SIGMA,TAU) TO (X,Y) IS POSSIBLE IF((DX21*DYOI-DXOI*DYZI).LE.O.) GOTO 80 SOLUTION OF THE FOLLOWING EQUATIONS E1(1,1)*(U(1,I)-U11)+E1(1,2)*(U(2,I)-U12)= DYOI*(El(1,1)*G1(1)+E1(1,2)*G1(2)) E2(2,1)*(U(1,I)-U11)+E2(2,2)*(U(2,I)-U12)= E2(2,1)*(DY02*G2(1)+U21-U11)+E2(2,2)*(DYOZ*62(2)+U22-U12) U11=OLD U12=OLD U21=OLD U22=OLD
VALUE VALUE VALUE VALUE
OF OF OF OF
U(1,I) U(2,I) U(1,I+1) U(2,I+1)
D=E1(1,1)*E2(2,2)-E2(2,1)*E1(1,2) IF(ABS(D).LT.1.E-6) GOTO 80 U11=U(1,I) U12=U(2,I) C1=DYO1*(El(1,1)*G1(1)+El(1,2)*61(2)) C2=E2(2,1)*(DY02*62(1)+U(1,I+1)-UI1) + F E2(2,2)*(DY02*G2(2)+U(2,I+1)-U12) U(1,I)=U11+(C1*E2(2,2)-C2*E1(1,2))/D U(2,I)=U12+(E1(1,1)*C2-E2(2,1)*C1)/D C
C
U(3,I)=XO U(4,I)=YO
Appendix 1:
Method of Massau
455
70 LAMB1(1)=LAMBZ(1)
El (1,1)=E2(1,1) El (1,2)=E2(1,2) 61(1)=G2(l) G1 (2)=G2(2) LL=.TRUE.
GOTO 100 80 B(I)=.FALSE.
GOTO 70 90 B(I)=.FALSE. LL=.FALSE. 100 CONTINUE C C C
END OF THE MAIN LOOP B(N2)=.FALSE. 110 N2=N2-1 IF(.NOT.B(N2).AND.N2.GT.N1) GOTO 110 SIGMAO=SIGMAO+H2*0.5 TAU=TAU+H2*0.5 RETURN END
SUBROUTINE EIGEN(U,E,LAMBDA,L) C
REAL U(4),E(2,2),LAMBDA(2) LOGICAL L C C C C C C C C
INPUT PARAMETERS U CONTAINS U(1),U(2),X,Y OUTPUT PARAMETERS EIGENVALUES LAMBDA(1).LT.LAMBDA(2) MATRIX E (IN THE TEXT DENOTED BY E**-1) L=.FALSE. INDICATES THAT THE COMPUTATION IS NOT POSSIBLE
C
REAL A(2,2),C,D,C1,C2,C3,C4 LOGICAL SW L=.TRUE. CALL
IF(.NOT.L) RETURN C C
COMPUTATION OF THE EIGENVALUES OF A
APPENDICES
456
C
C=A(1,1)+A(2,2) D=A(1,1)-A(2,2) D=D*D+4.*A(1,2)*A(2,1) IF(D.LE.O) GO TO 101 D=SQRT(D) IF(D.LT.1.E-6*ABS(C)) GO TO 101 LAMBDA(1)=0.5*(C-D) LA14BDA(2)=0.5*(C+D) C C C C C C C
SOLUTION OF THE FOLLOWING HOMOGENEOUS EQUATIONS E(1,1)*(A(1,1)-LAMBDA(1))+E(1,2)*A(2,1)=O E(1,1)*A(1,2)+E(1,2)*(A(2,2)-LAMBDA(1))=0 E(2,1)*(A(1,1)-LAMBDA(2))+E(2,2)*A(2,1)=O E(2,1)*A(1,2)+E(2,2)*(A(2,2)-LAMBDA(2))=0 C=LAMBDA(1) SW=.FALSE. 10 C1=ABS(A(1,1)-C) C2=ABS(A(2,1)) C3=ABS(A(1,2)) C4=ABS(A(2,2)-C) IF(AMMAX1(C1,C2).LT.AMAXI(C3,C4)) GO TO 30 IF(C2.LT.C1) GO TO 20 C1=1. C2=(C-A(1,1))/A(2,1) GO TO 50 20 C2=1. C1=A(2,1)/(C-A(1,1)) GO TO 50 30 IF(C3.LT.C4) GO TO 40 C2=1.
C1= (C-A (2,2) )/A (1,2) 60 TO 50 40 C1=1. C2=A(1,2)/(C-A(2,2))
50 IF(Sl!) GO TO 60 E(1,1)=C1 E(1,2)=C2 C=LAMBDA(2) SU=.TRUE. GO TO 10 60 E(2,1)=C1 E(2,2)=C2 RETURN 101 L=.FALSE. RETURN END
Appendix 1:
Method of Massau
EXAMPLE (MENTIONED IN THE TEXT)
MAIN PROGRAMME: C C
DESCRIPTION OF THE COMMON BLOCK IN THE SUBROUTINE MASSAU REAL U(4,500),SIGMAO,TAU,H2 INTEGER N1,N2 LOGICAL B(500) COMMON /MASS/ U,SIGMAO,TAU,H2,NI,N2,B
C
REAL X,DUI,DU2,SIGMA INTEGER I,J C
C
INITIALIZATION OF THE COMMON BLOCK
C
TAU=O. NI=1 N2=65
/32.
P_
.--
_ .*ATAN(1.)
SIGMA.:?--=-H2
X=O. DO 10 I=1,N2
U(1,I)=0.1*SIN(2.*PI*X)*EXP(X) U(2,I)=1.
U(3,I)=X
U(4,I)=0. B(I)=.TRUE. 10 X=X+H2 C C C
LOOP FOR PRINTING AND EXECUTING THE SUBROUTINE DO 40 I=1,65 DO 39 J=N1,N2 IF(.NOT.B(J)) GOTO 39 SIGMA=SIGMA0+J*H2 WRITE(6,49) SIGMA,TAU,U(3,J),U(4,J),U(1,J),U(2,J) 39 CONTINUE WRITE(6,50)
C
IF(N2.LE.N1) STOP CALL. MASSAU
40 CONTINUE STOP C
49 FORMAT(IX,2F8.5,IX,6F13.9) 50 FORMAT(IHI) END
4S7
APPENDICES
458
SUBROUTINES:
SUBROUTINE QUELL(U,G) C C C C
INPUT PARAMETER U CONTAINS U(1),U(2),X,Y OUTPUT PARAMETERS ARE G(1),G(2)
REAL U(4),G(2) G(1)=0. G(2)=0. RE".` 'RN END
SUBROUTINE MATRIX(U,A,L) C C C C C
INPUT PARAMETER U CONTAINS U(1),U(2),X,Y OUTPUT PARAMETERS ARE THE MATRIX A AND L L=.TRUE. IF U BELONGS TO THE DOMAIN OF THE COEFFICIENT MATRIX A AND OF THE TERM G. OTHERWISE, L=.FALSE.
C
REAL U(4),A(2,2) LOGICAL L C
REAL U1,U2 C
U1=U (1) U2=U (2)
L=.TRUE. A(1,1)=-U1 A(1,2)=-2.*U2 A(2,1)=-O.S*U2 A(2,2)=-U1 RETURN END
Appendix 2:
Nonlinear implicit
Appendix 2:
Total implicit difference method for solving a
difference method
459
nonlinear parabolic differential equation.
The total implicit difference method has proven itself useful for strongly nonlinear parabolic equations.
With it
one avoids all the stability problems which so severely complicate the use of other methods.
In the case of one (space)
variable, the amount of effort required to solve the system of equations is often overestimated.
The following programs solve the problem ut = a(u)uxx - q(u),
x e (r,s), t > 0
u(x,0) = $(x),
x c [r,s]
u(r,t) _ ar(t), u(s,t) = 4s(t),
t > 0.
Associated with this is the difference method u(x,t+h)-u(x,t) = Xa(u(x,t+h))[u(x+&x,t+h)+u(x-Ax,t+h) -
where
2u(x,t+h)]
Ax > 0, h > 0, and
- hq(u(x,t+h))
A =
ox = (s-r)/(n+l),
x = r + jAx,
fixed
n e 1N
j
= l(l)n
this becomes a nonlinear system in solved with Newton's method.
When specialized to
h/(Ax)2.
n
unknowns. It is
For each iterative step, we
have to solve a linear system with a tridiagonal matrix. linear equations are
aljuj_1 + a2juj + a3juj+1 = a4j, where
j = l(1)n
The
APPENDICES
460
alj = -aa(uj)
aaI(uj)[uj+l+uj-l-2uj]+hgI(uj
a2j
= 1+2aa(uj
aij
= -Xa(uj)
a4j
= uj-[1+2Aa(uj)]uj+aa(uj)[uj+l+uj-1)-hq(uj)
uj = solution of the difference equation at the point (r+jAx,t+h).
uj = corresponding Newton approximation for u(r+jtx,t+h). When this sytem has been solved, the by
uj
+ uj; the
aij
are replaced
uj
are recomputed; etc., until there is
no noticeable improvement in the
uj.
Usually two to four
Newton steps suffice.
Since the subscript 0 is invalid in FORTRAN, the quantities
u(x+jAx,t)
are denoted in the programs by
For the same reason, the Newton approximation
uj
U(J+1).
is called
Ul(J+1).
The method consists of eight subroutines:
HEATTR, AIN, RIN, GAUBD3, ALPHA, DALPHA, QUELL, DQUELL. HEATTR is called once by the main program for each time increment.
Its name is an abbreviation for heat transfer.
other subroutines are used indirectly only.
The
The last four
subroutines must be rewritten for each concrete case.
They
are REAL FUNCTIONs with one scalar argument of REAL type, which describe the functions
a(u), a'(u), q(u), and
q'(u).
The other subroutines do not depend on the particulars of the problem.
AIN computes the coefficients
linear system of equations.
aij
of the
GAUBD3 solves the equations.
This program is described in detail along with the programs
Appendix 2:
Nonlinear implicit difference method
dealing with band matrices in Appendix S. alj, a2j, and
a3j
Newton's step.
461
The coefficients
are recomputed only at every third
In the intervening two steps, the old values
are reused, and the subroutine RIN is called instead of AIN. RIN only computes
a4j.
Afterwards, GAUBD3 runs in a simpli-
For this reason, the third variable is .TRUE..
fied form.
We call these iterative steps abbreviated Newton's steps. Before HEATTR can be called the first time, it is necessary to fill the COMMON block /HEAT/: N = n DX = Ax = (s-r)/(n+l) U(J+1)
4(r+jIx)
j
= 0(1)n+l
H = h. H
and
can be changed from one time step to another. u(s,t)
depend on
boundary values
If
u(r,t)
t, it is necessary to set the new
U(l) _ r(t+h)
and
U(N+2) = 0s(t+h)
be-
fore each call of HEATTR by the main program.
An abbreviated Newton's step uses approximately 60% of the floating point operations of a regular Newton's step: (1)
(2)
Regular Newton's step: n
calls of ALPHA, DALPHA, QUELL, DQUELL
21n+4
operations in AIN
8n-7
operations in GAUBD3
4n
operations in HEATTR.
Abbreviated Newton's step: n
calls of ALPHA, QUELL
10n+3
operations in RIN
5n-4
operations in GAUBD3
4n
operations in HEATTR.
APPENDICES
462
This sequence of different steps--a regular step followed by two abbreviated steps--naturally is not optimal in every Our
single case.
error test for a relative accuracy of If so desired, it suffices to
10-5 is also arbitrary.
make the necessary changes in HEATTR, namely at IF(AMAX.LE.0.00001*UMAX) GO TO 70 and
IF(ITERI.LT.3) GO TO 21.
As previously noted, two to four Newton's iterations usually suffice.
This corresponds to four to eight times
this effort with a naive explicit method.
If
u
and
a(u)
change substantially, the explicit method allows only extremely small incrementat.ions
h.
This can reach such extremes
that the method is useless from a practical standpoint. ever, if
q1(u)
one should have
How-
< 0, then even for the total implicit method hq'(u) > -1, i.e. h < 1/lq'(u)l.
For very large
n, to reduce the rounding error in
AIN and RIN we recommend the use of double precision when executing the instruction
A(4,J)=U(J+1)-(1.+LAMBD2*AJ)*UJ+LAMBDA*AJ* *
(U1(J+2)+U1(J))-H*QJ.
This is done by declaring DOUBLE PRECISION LAMBDA, LAMBD2, AJ, UJ, U12, U10 and replacing the instructions above by the following three instructions:
Appendix 2:
Nonlinear implicit difference method
463
U12 = Ul(J+2) U10 = U1(J)
A(4,J) = U(J+1)-(1.+LAMBD2*AJ)*UJ +
+LAMBDA*AJ*(U12+UlO).
All remaining floating point variables remain REAL. other than AIN and RIN do not have to be changed.
Programs
APPENDICES
464
SUBROUTINE HEATTR(ITER) C C C C
U(I) VALUES OF U AT X=XO+(I-1)*DX, I=1(1)N+2 U(1), U(N+2) BOUNDARY VALUES H STEP SIZE WITH RESPECT THE TIME COORDINATE
C
REAL U(513),H,DX INTEGER N COMMON/HEAT/U, H,DX,N C
REAL U1(513),AJ,UJ,AMAX,UMAX,A(4,511) INTEGER ITER,I,ITERI,N1,N2,J N1=N+1 N2=N+2 C
C
FIRST STEP OF THE NEWTON ITERATION
C
CALL AIN(A,U) CALL GAUBD3(A,N,.FALSE.) DO 20 J=2,N1 20 U1(J)=U(J)+A(4,J-1)
UI (1)=U(1)
U1(N2)=U(N2) ITER=1 ITER1=1 C C C
STEP OF THE MODIFIED NEWTON ITERATION 21 CALL RIN(A,U1) CALL GAUBD3(A,N,.TRUE.) GO TO 30
C C C
STEP OF THE USUAL NEWTON ITERATION 25 CALL AIN(A,U1) CALL GAUBD3(A,N,.FALSE.) ITER1=0.
C C C
RESTORING AND CHECKING 30 AHAX=O. UHAX=O. DO 40 J=2,N1 AJ=A(4,J-1) UJ=U1(J)+AJ AJ=ABS(AJ) IF(AJ.GT.AMAX) AMAX=AJ U1(J)=UJ UJ=ABS(UJ) IF(UJ.GT.UMAX) UMAX=UJ 40 CONTINUE
C
Appendix 2:
Nonlinear implicit difference method
465
C
ITER=ITER+1
ITERI=ITER1+1 IF(AMAX.LE.0.00001*UMAX) GO TO 70 IF(ITER.GT.20) GO TO 110 IF(ITERI.LT.3) GO TO 21 GO TO 25 C C
U=U1
C
70 DO 80 J=2,N1
80 U(J)=U1(J) RETURN C C
110 WRITE(6,111) 111 FO.RMAT(15H NO CONVERGENCE) STOP END
SUBROUTINE AIN(A,U1) C C C C
EVALUATION OF THE COEFFICIENTS AND OF THE RIGHT-HAND SIDE OF THE SYSTEM OF LINEAR EQUATIONS REAL A(4,511),U1(513)
C
C C
COMMON BLOCK COMPARE HEATTR REAL U(513),H)DX INTEGER N
C
REAL LAMBDA,LAMBD2,LAMBDM,UJ,AJ,DAJ,QJ,DQJ INTEGER J REAL Z LAMBDA=H/(DX*DX) LAHBD2=2.*LAMBDA
DO 10 J=1,N UJ=U1(J+1) AJ=ALPHA(UJ) DAJ=DALPHA(UJ) QJ=QUELL(UJ) DQJ=DQUELL(UJ) 2=LAi1BDH*AJ
466
APPENDICES
A(1,J)=Z *
A(2,J)=1.+LAMBD2*(AJ+DAJ*UJ)-LAMBDA*DAJ* (U1(J+2)+U1(J))+H*DQJ
A (3,J)=Z
A(4,J)=U(J+1)-(1.+LAMBD2*AJ)*UJ+LAMBDA*AJ* (U1(J+2)+U1(J))-H*QJ 10 CONTINUE RETURN *
END
SUBROUTINE RIN(A,U1) C
C C C
EVALUATION OF THE RIGHT-HAND SIDE OF THE LINEAR EQUATIONS REAL A(4,511),U1(513)
C C
COMMON BLOCK COMPARE HEATTR
C
REAL U(513),H,DX INTEGER N COMMON/HEAT/U,H,DX,N C
REAL LAMBDA LAMBDZ,UJ,AJ,QJ INTEGER J LAMBDA=H/(DX*DX) LAMB02=2.*LAMBDA DO 10 J=1,N UJ=U1(J+1) AJ=ALPHA(UJ) QJ=QUELL(UJ) A(4,J)=U(J+1)-(1.+LAMBD2*AJ)*UJ+LAMBDA*AJ* * (U1(J+2)+U1(J))-H*QJ 10 CONTINUE RETURN END
Appendix 2:
Nonlinear implicit difference method
EXAMPLE
MAIN PROGRAMME:
REAL U(513),H,DX INTEGER N COMMON/HEAT/U,H,DX,N REAL PI,T INTEGER I,J,ITER PI=4.*ATAN(1.) C
N=7 DX=1 ./8. H=1 ./64.
DO 10 1=1,9 10 U(I)=SIN(PI*DX*FLOAT(I-1)) C
T=O.
DO 20 I=1,10 CALL HEATTR(ITER) WRITE(6,22) ITER T=T+H WRITE(6,21) T WRITE(6,21)(U(J),J=1,9) 20 CONTINUE STOP C
21 FORMAT(1X,9F12.9/1X,9F12.9) 22 FORMAT(6H ITER=,I2) END
467
468
SUBROUTINES:
REAL FUNCTION ALPHA(U) REAL U ALPHA=(1.-0.5*U)/(4.*ATAN(1.))**2 RETURN END
REAL FUNCTION DALPHA(U) REAL U DALPHA=-.5/(4.*ATAN(1.))**2 RETURN END
REAL FUNCTION QUELL(U) REAL U QUELL=U*U*0.5 RETURN END
REAL FUNCTION DQUELL(U) REAL U DQUELL= U RETURN END
APPENDICES
Appendix 3:
Lax-Wendroff-Richtmyer method
Appendix 3:
Lax-Wendroff-Richtmyer method for the case of
469
two space variables.
The subroutines presented here deal with the initial value problem ut = A 1 u
x
+ A2uy + Du + q,
u(x,y,0) = (X,y), Here
x,y e1R, t > 0
x,y e 1R.
A1,A2,D c C1(IR2,MAT(n,n,IR)), D(x,y) = diag(dii (x,y)),
q e C1(ii2 x [0,-),]R n).
properly posed in
We require that the problem be
L2(IR2,cn).
are always symmetric, and
that (1) A1(x,y), A2(x,y) (2) Al, A2, D, q
For this it suffices, e.g.,
have compact support.
Because of the terms
Du + q, the differential equation
in the problem considered here is a small generalization of Example 10.9.
There is one small difficulty in extending the
difference scheme to this slightly generalized differential equation.
One can take care of the terms
Du + q
in the
differential equation with the additional summand h[D(x,y)u(x,y,t) + q(x,y,t)]
or better yet, with h{ZD(x,y)[u(x,y,t) + u(x,y,t+h)] + q(x,y,t+ 2h)} This creates no new stability problems (cf. Theorem 5.13). Consistency is also preserved.
However, the order of consis-
tency is reduced from 2
The original consistency proof
to 1.
considered only solutions of the differential equation ("consistency in the class of solutions") and not arbitrary sufficiently smooth functions; here we have a different differential equation.
470
APPENDICES
We use the difference method u(x,y,t+h) = [I- ZD(x,Y)] 1[I+S(h)oK(h)+ + h(I- ZD(x,Y)] lq(x,Y,t+
D(x,Y)]u(x,Y,t)
-T)+2(I-ZD(x,Y)) 1S(h)q(x,Y,t)
where
K(h) = ZS(h) + (4I+8D(x,Y)) (TA,1+TA,2+TA1+T012)
S(h) = ZA[A1(x,Y)(TA'1-Tpl1) + A2(x,Y)(TA 2-TA12)1. For
D(x,y) = 0
and
method (r = 1). every case.
q(x,y,t) = 0, this is the original
But then the order of convergence is 2 in
Naturally, this presupposes that the coeffici-
ents, inhomogeneities, and solution are all sufficiently often differentiable.
The computation procedes in two steps. 1st Step
(SUBROUTINE STEP1):
v(x,y) = K(h)u(x,y,t) + 2nd Step
hq(x,y,t).
(SUBROUTINE STEP2):
u(x,y,t+h) = {[I+
[I
-
ZD(x,Y)]-1o
ZD(x,Y)lu(x,Y,t)+S(h)v(x,Y)+hq(x,Y,t+ 11)1.
The last instruction is somewhat less susceptible to rounding error in the following formulation:
u(x,y,t+h) = u(x,y,t) +
[I-
ZD(x,Y)]
to
{S(h)v(x,y)+h[D(x,y)u(x,y,t)+q(x,y,t+ 2)]}. If
u(x,y,t)
is given at the lattice points
(x,Y) = (uo,vt)
(p,v e2z, p+v
even)
Appendix 3:
then
Lax-Wendroff-Richtmeyer method
v(x,y)
can be computed at the following points: p,v e22, p+v
(x,Y) = (PA,vt),
From these values and the old values u(x,y,t+h)
471
odd.
u(x,y),
one obtains
at the points p,v c
(x,y) = (pt,vt),
p+v
,
even.
This completes the computation for one time increment. If steps 1 and 2 follow each other directly, then and
Therefore, we divide each time
have to be stored.
v
u
step into substeps, in which the u-values are computed only for the lattice points on a line x + y = 2aA = constant.
For this one only needs the v-values for x + y = (2a-1)A (as shown in Figure 1).
and
x + y = (2a+1)a
At first, only these v-values are
stored, and in the next substep, half of these are overThus we alternately compute
written. on a line. a line.
v
on a line and
SUBROUTINE STEP1 computes only the
STEP2 does compute all of
v
u
values on
u, but in passing from
one line to the next, STEP2 calls STEP1 to compute
v.
The program starts with the lattice points in the square t(x,y)
-1 < x+y <
1
and
-1 < x-y < U.
Because of the difference star of the Lax-Wendroff method, we lose fewer lattice points at the boundary per time step than with an equally large square with sides parallel to
APPENDICES
472
+: v-lattice :
1\ 1\ "\ .
+
N+N N
+
+
\
+
+
x+y = (2a+l)A x+y = 2au x+y = (2a-l)A
+
Figure 1:
Lattice points for
We set
the axes.
u-lattice
m m = 2 0
and
m0 = 3, m = 8, = 1/8.
A = 1/m.
Altogether, the solution of the initial value problem requires six subroutines: INITIO, STEP1, STEP2, PRINT, COEFF, FUNC.
The last two subroutines have to be rewritten for each application.
COEFF computes the matrices
inhomogeneity
q.
Al, A2, and
D
and the
FUNC provides the initial values.
The first program called by the user is INITIO. defines the lattice, the number of components of
It
u, computes
the initial values with the help of FUNC, and enters all this in COMMON.
Additionally, INITIO computes DMAX = max(O,dii (x,y)).
When calling STEP2, one should choose
h
no larger than
Appendix 3:
Lax-Wendroff-Richtmeyer method
473
In each time step, STEP2 is called first to carry
1.0/DMAX.
out the computation, and then PRINT is called to print out the results.
After remain.
s2
time steps, only
Thus, at most
m/2
(m+l-2s2)2
lattice points
steps can be carried out.
an incorrect call of INITIO or of STEP2, IERR >
0
After
is set
as follows: in INITIO: IERR = 1:
m
outside the boundaries
IERR = 2:
n0
outside the boundaries.
s0
too large.
0
.
in STEP2: IERR = 1:
STEP1 and STEP2 each contain only one computation intensive loop:
STEP1
STEP2
DO 100 K=1,MS
DO 100 J=J1,J2
100 Y=Y-DELTA
100 Y1=YI+DELTA
In the following accounting of the number of floating point calls we ignore all operations outside these loops, with the exception of calls of STEP1 in STEP2. STEP1: (m-2s2)
calls of COEFF
(m-2s2)(4n2+12n+2)
operations.
STEP2: (m-2s2)
calls of STEP1
(m-2s2-1)2
calls of COEFF
(m-2s2-1)2(4n2+lln+2)
operations
474
APPENDICES
Each time step therefore consumes approximately 2(m-2s2)2
calls of COEFF
2(m-2s 2)2(4n2+lln)
operations.
The total effort required for all
m/2
time steps thus is
a
calls of COEFF
a(4n2+lln)
operations
where m/2 a = 8
u2
= 3 (m+l) (m+2)
u=i
If the matrices
A
and
1
contain many zeros (as in the
A2
wave equation for example) then the term substantially.
can be reduced
4n2
To accomplish this it is enough, in STEP1 and
STEP2, to reprogram only the loops beginning with DO 20 LL = 1,N.
If enough memory is available, Al, A2, and
can be com-
D
puted in advance, and CALL COEFF can be replaced by the appropriate reference.
If
is t-dependent, however, it will
q
have to be computed for each time step.
In this way, the
computing time can be reduced to a tolerable level in many concrete cases.
For the case of the wave equation
A1(x,y) =
0
1
0
1
0
0
0
0 0
1 ,
A2(x,y) =
( 0
0
1
0
0 0
0
I
11
1
0
q(x,y,t) = 0
D(x,y) = 0,
we have tried to verify experimentally the theoretical stability bound
A <
vr2-.
The initial values
Appendix 3:
Lax-Wendroff-Richtmeyer method
475
0
cos x cos y
O(x,y)
have the exact solution -sin t(sin x + sin y) cos t cos x cos t cos y
u(x,y,t) =
We chose
m0 = 7, m = 128, A = 1/128, h = A
A = 1.3(0.1)1.7.
where
After 63 steps, we compared the numerical
results with the exact solutions at the remaining 9 lattice points.
The absolute error for the first component of
u
is generally smaller than the absolute error for the other components (cf. Table 2). ceable until
The instability is not really noti-
A = 1.7, where it is most likely due to the
still small number of steps.
max. absolute error 2nd & 3rd comp. 1st comp. < 1.5
1.0
10-7
4.0
10-6
1.6
4.4
10-5
5.3
10-5
1.7
2.0
100
1.2
100
Table 2
Nevertheless, the computations already become problematical with
A > 1.5.
A multiplication of
creates a perturbation, for
A = 1.3
and
h
by 1 + 10-12
A = 1.4, of the
same order of magnitude as the perturbation of
h.
For
A = 1.5, however, the relative changes in the results are greater up to a factor of 1000, and for
A = 1.6, this
amplification can reach 109 for some points.
We have tested consistency among other places in an
476
APPENDICES
example wherein Al
and
A
and
1
A2, as well as the diagonal elements of In this case, q
sentially space-dependent. y, and
are full, and all elements of
A2
t.
D, are es-
depends on
x,
The corresponding subroutines COEFF and FUNC are
listed below.
The initial value problem has the exact solu-
tion
cosx+cosy u(x,y,t) = e- t
cos x + sin y
The computation was carried out for the four examples: (1)
A = 1/8,
h = 1/32,
A = 1/4,
s2 = 1
(2)
A = 1/16,
h = 1/64,
A = 1/4,
s2 =
(3)
A = 1/32,
h = 1/128,
A = 1/4,
s2 = 4
(4)
0 = 1/64,
h = 1/256,
A = 1/4,
s2 = 8.
2
The end results thus all belong to the same time
T = s2h =
Therefore, at lattice points with the same space co-
1/32.
ordinates, better approximations can be computed with the aid of a global extrapolation.
Our approach assumes an asymptotic
expansion of the type
T0(x,y) + h2T2(x,y) + h3T3(x,y) + h4T4(x,y) +
.
The first and third extrapolation do in fact improve the The summand
results substantially.
h3T3(x,y)
very small in our example relative to the terms and
4
h T4(x,y).
should be h2T2(x,y)
The absolute error of the unextrapolated 10-3
values decreases with
h
from about
to 10 5.
After the
third extrapolation, the errors at all 49 points (and for both components of
u) are less than 10-9.
Appendix 3:
Lax-Wendroff-Richtmeyer method
477
We do not intend to recommend the Lax-WendroffRichtmyer method as a basis for an extrapolation method as a result of these numerical results.
For that it is too com-
plicated and too computation intensive.
However, global
extrapolation is a far-reaching method for testing a program for hidden programming errors and for susceptibility to rounding error.
APPENDICES
478
SUBROUTINE INITIO (COEFF,FUNC,TO,MO,NO,DMAX,IERR) C C C C
C C C
FOR THE DESCRIPTION OF COEFF COMPARE STEP2. THE SUBROUTINE FUNC YIELDS THE INITIAL VALUES F(N) AT THE POINTS X,Y. THE USER HAS TO DECLARE THIS SUBROUTINE AS EXTERNAL. T=TO, N=NO, M=2**MD, FOR DMAX COMPARE TEXT.
INTEGER I,IERR,I1,I2,J,MMAXI,MO,NN,NO REAL DMAX,MINUS,TO,XO,XI,YO,Y1 REAL Al(4,4),A2(4,4),D(4),Q(4),F(4) C C C C C C C C C C
C C
C C C C C C
C C
C C C C C C C
C C
MEANING OF THE VARIABLES OF THE COMMON BLOCK M
NUMBER OF THE PARTS OF THE INTERVAL (0,1),
DELTA MMAX
=1./M, UPPER BOUND FOR M,
NUMBER OF THE COMPONENTS OF THE SOLUTION (1.LE.N.LE.4), S2 NUMBER OF CALLS OF STEP2 DURING THE EXECUTION OF INITIO, T TIME AFTER S2 STEPS, H STEP SIZE WITH RESPECT TO THE TIME, LAMBDA =H/DELTA (LAMBDA.GT.O), U SOLUTION. U(*,I,J) BELONGS TO THE POINT X=DELTA*(J+I-MMAX-2) Y=DELTA*(J-I), V INTERMEDIATE VALUES (COMPARE TEXT) V(*,2,I) BELONGS TO THE POINT X=DELTA*(J+I-MMAX-1) Y=DELTA*(J-I) J IS THE RESPECTIVE PARAMETER OF STEP1 V(*,1,I) BELONGS TO THE POINT X1=X-DELTA Y1=Y-DELTA MMAX AND THE BOUNDS OF THE ARRAYS U(4,DIM2,DIM2) AND V(4,2,DIM1) ARE RELATED AS FOLLOWS MMAX DIM2 DIM N
32
32
33
64 128
64 128
129
...
...
...
65
C
INTEGER MMAX,M,N,S2 REAL U(4,65,65),V(4,2,64),H,DELTA,LAMBDA,T COMMON U,V,H,DELTA,LAMBDA,T,MMAX,M,N,SZ DATA MINUS /-1.E5O/ MMAX=64 C
MMAXI=MMAX+1 M=2**M0 IF( MO.LT.1 OR. M GT.MMAX ) GOTO 998 IF( NO.LT.1 OR. NO.GT.4 ) GOTO 997
Appendix 3:
C C C C
Lax-Wendroff-Richtmeyer method
479
SET V(*,2,*)=O AND ASSIGN MINUS INFINITY (HERE -1E50) TO U(*,*,*). DO 10 J=1,MMAX DO 10 NN=1,N 10 V(NN,2,J)=O. 00 20 I=1,MMAXI 00 20 J=1,MMAXI 20 U(1,I,J)=MINUS
C
30
40
997
998
T=TO N=NO S2=0 DMAX=O. IERR=O DELTA=1./FLOAT(M) I1=(MMAX-M)/2+1 I2=I1+M X0=-1. YO=0. DO 40 J=I1,I2 X1=X0 Y1=Y0 DO 30 I=I1,I2 CALL FUNC (X1,Y1,F) CALL COEFF (X1,Y1,TO,A1,A2,D,Q) X1=XI+DELTA Y1=Y1-DELTA DO 30 NN=1,N U(NN,1,J)=F(NN) IF( D(NN).GT.DMAX ) DMAX=D(NN) CONTINUE XO=XO+DELTA YO=YO+DELTA RETURN IERR=1 RETURN IERR=2 RETURN END
APPENDICES
480
SUBROUTINE STEPI (COEFF,J) INTEGER I1,I2,J,J1,J2,K,L,LL,MS REAL H2,H8,LAM4,SUM,X,Y REAL A1(4,4),A2(4,4),D(4),Q(4),UX(4),UY(4) C C C
FOR VARIABLES IN COMMON COMPARE INITIO INTEGER MMAX,M,N,S2 REAL U(4,65,65), V(4,2,64),H,DELTA,LAMBDA,T COMMON U,V,H,DELTA,LAMBDA,T,MMAX,M,N,S2
C
H2=H*.5 HS=H*.125 LAM4=LAMBDA*.25 MS=M-2*S2 I1=(MMAX-MS)/2+1 J1=J I2=I1+1
J2=J1+1 DO 10 K=1,MS DO 10 L=1,N 10 V(L,1,K)=V(L,2,K) X=DELTA*FLOAT(JI+II-MMAX-1) Y=DELTA*FLOAT(JI-I1) DO 100 K=1,MS DO 15 LL=1,N UX(LL)=U(LL,I2,J2)-U(LL,I1,J1) 15 UY(LL)=U(LL,I1,J2)-U(LL,I2,J1) CALL COEFF (X,Y,T,A1,A2,D,Q) DO 30 L=1,N SUM=0. 20 + +
30
DO 20 LL=1,N SUM=SUM+A1(L,LL)*UX(LL)+A2(L,LL)*UY(LL) V(L,2,K)=LAM4*SUM+H2*Q(L)+ (0.25+H8*D(L))*(U(L,I2,J2)+U(L,I1,J1)+ U(L,I1,J2)+U(L,I2,J1)) CONTINUE I1=I1+1 I2=I2+1
100
X=X+DELTA Y=Y-DELTA RETURN END
Appendix 3:
Lax-Wendroff-Richtmeyer method
SUBROUTINE
481
STEP2 (COEFF,HO,IERR)
THE SUBROUTINE COEFF EVALUATES THE COEFFICIENTS Al, A2, 0 AND THE SOURCE TERM Q OF THE DIFFERENTIAL EQUATIONS. COEFF IS TO BE DECLARED AS EXTERNAL. Al(N,N), A2(N,N), D(N) MAY DEPEND ON X AND Y, Q(N) MAY DEPEND ON X,Y, AND T. HO IS THE SIZE WITH RESPECT TO TIME. HO MAY CHANGE FROM ONE CALL STEP2 TO THE NEXT CALL ACCORDING TO THE STABILITY CONDITION. EXTERNAL COEFF INTEGER I,IERR,II,I2,J,J1,J2,K,KK,L,LL,MS REAL HO,H2,LAMM2,MINUS,SUM,T2,X,X1,Y,Y1 REAL A1(4,4),A2(4,4),D(4),Q(4),VX(4),VY(4) C C C
FOR VARIABLES IN COMMON COMPARE INITIO INTEGER MMAX,M,N,S2 I*gAL U(4,65,65),V(4,2,64),H,DELTA,LAMBDA,T COMMON U,V,H,DELTA,LAMBDA,T,MMAX,M,N,S2 DATA MINUS /-1.E50/
C
MS=M-2*S2 IF( MS.LT.1 ) GOTO 99 IERR=O H=HO LAI.IBDA=H/DELTA
LAM2=LAMBDA*.5 H2=H*.5 T2=T.+H2
I1=(MMAX-MS)/2+1 I2=I1+MS J1=I1+1 J2=I2-1 CALL STEP1 (COEFF,I1) X1=DELTA*FLOAT(I1+I1-MMAX) Y1=0. C
DO 100 J=JI,J2 X=Xl Y=Y1 K=1
15
KK=2 CALL $TEPI (COEFF,J) DO 50 I=Jl,J2 DO 15'LL=1,N VX(LL)=V(LL,2,KK)-V(LL,1,K ) VY(LL)=V(LL,2,K )-V(LL,1,KK) CALL COEFF (X,Y,T2,A1,A2,D,Q) DO 30 L=1,N
SUMO.
482
APPENDICES
20
/ 30
50
DO 20 LL=1,N SUM=SUM+A1(L,LL)*VX(LL)+A2(L,LL)*VY(LL) U(L,I,J)=U(L,I,J)+(LAM2*SUM+H*(D(L)*U(L,I,J)+0(L)))/ (l.-H2*0(L)) CONTINUE X=X+DELTA Y=Y-DELTA K=K+1 KK=KK+1
X1=XI+DELTA Yl=Yl+DELTA DO 110 J=I1,I2 U(1,I1,J)=MINUS U(1,I2,J)=MINUS U(1,J,II)=MINUS 110 U(1,J,I2)=MINUS T=T+H 100
S2-S2+1 RETURN
99 IERR=1 RETURN END
SUBROUTINE PRINT INTEGER I,J,L,MMAXI REAL MINUS,X,Y C C C
FOR VARIABLES IN COMMON COMPARE INITIO INTEGER MMAX,M,N,S2 REAL U(4,65,65),V(4,2,64),H,DELTA,LAMBDA,T COMMON U,V,H,OELTA,LAMBDA,T,MMAX,M,N,S2 DATA MINUS /-1.ES0/
C
MMAXI=MMAX+1 DO 30 J=1,MMAXI DO 20 I=1,MMAXI IF( U(1,I,J).LE.MINUS ) GOTO 20 X=DELTA*FLOAT(J+I-MMAX-2) Y=DELTA*FLOAT(J-I) DO 10 L=1,N 10 WRITE(6,800) L,I,J,U(L,I,J),X,Y CONTINUE 20 30 CONTINUE RETURN 800 FORMAT(IH ,IOX,2HU(,I2,IH,,I2,IH,,I2,IH),5X,E20.14, F 5X,2HX=,F10.6,2X,2HY=,F1O.6) END
Appendix 3:
Lax-Wendroff-Richtmeyer method
483
EXAMPLE (MENTIONED IN THE TEXT)
SUBROUTINE COEFF (X,Y,T,A1,A2,D,Q) REAL Al(4,4),A2(4,4),D(4),Q(4) SINX=SIN(X) COSY=COS(Y) CXI=COS(X)+1. CX2=CXI+1. SYI=SIN(Y)-1. SSI=SYI*SYl-1. C C C C C
I SIN(X) AI=I
, COS(X)+1 I I
I COS(X)+I, COS(X)+2 I
I COS(Y) A2=I
I SIN(Y)-l, SIN(Y)*(SIN(Y)-2)
Al(1,1)=SINX
Al (1,2)=CX1 Al(2,1)=CX1 Al(2,2)=CX2 A2(1,1)=COSY A2(1,2)=SYI A2(2,1)=SYI A2(2,2)=SSI
D(1)=0.
D(2)=SY1-CX1 Q(1)=0. 0(2)=-EXP(-T)*(COSY*SS1-SINX*CX2) RETURN END
SUBROUTINE FUNC (X,Y,F) REAL F(4) F(1)=SIN(X)+COS(Y) F(2)=COS(X)+SIN(Y) RETURN END
, SIN(Y)-l
I I I
484
APPENDICES
Appendix 4:
Difference methods with SOR for solving the Poisson equation on nonrectangular regions. G c]R2
Let
be a bounded region and let
Au(x,Y) = q(x,y) u(x,Y) = lp(x,Y)
(x,y) c G (x,y) e 3G.
Furthermore, let one of the following four conditions be satisfied: (1)
G c (-1,1) x (-1,+1) = Q1
(2)
G c (-1,3) x (-1,+1) = Q2 G,q,P
(3)
x = 1.
G c (-1,+1) x (-1,3) = Q3 G,q,P
(4)
are symmetric with respect to the line
(2)
are symmetric with respect to the line
y = 1.
G c (-1,3) x (-1,3) = Q4 G,q,*
are symmetric with respect to the lines
x=1
and
y = 1.
The symmetry conditions imply that the normal derivative of u
vanishes on the lines of symmetry.
This additional bound-
ary condition results in a modified boundary value problem for
u
on the region
(-1,1) x (-1,1)
fl
G.
The program uses the five point difference formula of Section 13.
The linear system of equations is solved by SOR
(cf. Section 18).
Because of the symmetries, computation is
restricted to the lattice points in the square
[-1,1] x [-1,11.
This leads to a substantial reduction in computing time for each iteration.
The optimal overrelaxation parameter w b
the number of required iterations
m, however, remain as
large as with a computation over the entire region.
Altogether, nine subroutines are used:
and
Appendix 4:
Poisson eq::ation on nonrectangular regions
48S
POIS, SOR, SAVE, QNCRM, NEIGHB, CHARDL, CHAR, QUELL, BAND.
The last three named programs depend on the concrete problem and describe
G, q, and
Formally, we have REAL FUNCTIONs
gyp.
of two arguments of type REAL. tion of the region
CHAR is a characteristic func-
G:
if
(X,Y)
e G
= 0
if
(X,Y)
e 8G
< 0
otherwise.
>
CHAR(X,Y)
0
This function should be continuous, but need not be differentiable.
If
ABS(CHAR(X,Y))
LT. 1.E-4
it is assumed that the distance from the point to the boundary
G
is at most 10-3.
Each region is truncated by the
program so as to lie in the appropriate rectangle i e {1,2,3,4}.
CHAR(X,Y) = 1
For
Qi,
G = (-1,1) x (-1,1), therefore, If a region
suffices.
(union) of two regions
GI
and
G2, then the minimum (maxi-
mum) of the characteristic functions of characteristic function for
is the intersection
G
G1
and
G2
is a
G.
POIS is called by the main program.
The first two
parameters are the names of the function programs RAND and QUELL.
The name of CHAR is fixed.
The remaining parameters
of POIS are BR, BO
M, EPSP, OMEGAP BR = TRUE. x
BO =
=1 TRUE. -- y = 1
is a line of symmetry is a line of symmetry.
486
APPENDICES
The mesh of the lattice is
H = l./2**M.
EPSP is the absolute
error up to which the iteration is continued. the computation defaults to 10-3. tion parameter.
OMEGAP is the overrelaxa-
When OMEGAP = 0., wb
cally by the program. Note that does not depend on
or
q
4).
When EPSP = 0.,
wb
is determined numeri-
does depend on
G, but
The numerical approximation
It improves as EPSP gets smaller.
also depends on EPSP.
In
case OMEGAP = 0, POIS should be called sequentially with M
2,
3, 4,
...
.
The program then uses as its given initial
value for determining
the approximate value from the
wb
preceding coarser lattice.
OMEGAP remains zero.
In each iteration, SOR uses
7e+llf
floating point
operations, where e
:
f
:
number of boundary distant lattice points number of boundary close lattice points.
In the composition of the system of equations, the following main terms should be distinguished: calls of QUELL
:
proportional to
1/H**2
calls of RAND
:
proportional to
1/H
calls of CHAR
:
proportional to
Ilog EPSP /H.
The program is actually designed for regions which are not rectangular.
For rectangles, the Buneman algorithm (cf.
Appendix 6) is substantially faster.
It is nevertheless en-
ticing to compare the theoretical results for the model problem (Example 18.15) with the numerical results. Lu(x,y) = -20 u(x,y) = 0
Let
in
G = (0,1) x (0,1)
on
3G.
Appendix 4:
Poisson equation on nonrectangular regions
Since the iteration begins with initial error
u(x,y) E 0, the norm of the
is coarsely approximated by
IIe(0)112
The
1.
following results were obtained first of all with EPSP 1./1000.
487
=
The error reduction of the iterative method is thus
about 1/1000.
Table 1 contains the theoretical values of the numerical approximations
wb.
and
wb
The approximations, as
suspected, are always too large.
wb
h
Table 1.
mb
1/8
1.447
1.527
1/16
1.674
1.721
1/32
1.822
1.847
1/64
1.907
1.920
1/128
1.952
1.959
and its numerical approximations
wb
Table 2 contains the number of iterations and the computing times w = wb, and
w = wb.
t1, t2, and
ml, m2, and
m3,
t3, for OMEGAP = w = 0,
Column 2 contains the theoretically
required number of steps
mw
from Example 18.15.
ml
and
b
ti cab.
describe the computational effort involved in determining The times were measured on a CDC CYBER 76 system (in
units of 1 second). h
mwb
ml
m2
m3
tl
t2
t3
8
22
14
14
0.021
0.016
0.016
1/16
17
36
28
32
0.098
0.077
0.085
1/32
35
52
56
64
0.487
0.506
0.751
1/64
70
132
112
128
4.484
3.785
4.298
1/8
Table 2.
Iterative steps and computing time for EPSP = 10-3
488
APPENDICES
Surprisingly, the number of iterations is smaller for than for
w = wb.
This does not contradict the theory, since
the spectral radius gence behavior. EPSP = 10-9.
w = wb
p(ew)
only describes asymptotic conver-
In fact, the relationship reverses for
Table 3 contains the number, Amw b, Amt, Am3, of
additional iterations required to achieve this degree of accuracy.
Amw
:
theoretical number, as in Example 18.5
b
Amt
computation with
w = wb
Am3
computation with
w = wb.
In both cases, i.e. for EPSP = 10-3 and EPSP = 10-9, it is our experience that w = wb + (2-wb)/50 is better than
wb.
h
Table 3.
Amw
Am2
LN m3
1/8
17
20
20
1/16
35
44
36
1/32
70
88
72
1/64
141
176
144
Additional iterations for EPSP = 10-9
Appendix 4:
Poisson equation on nonrectangular regions
489
SUBROUTINE POIS(RAND,QUELL,BR,BO,M,EPSP,OMEGAP) C C
PARAMETERS OF THE SUBROUTINE
C
C C
REAL EPSP,OMEGAP INTEGER M LOGICAL BR,BO RAND AND QUELL ARE FUNCTIONS FOR THE BOUNDARY VALUES AND THE SOURCE TERM, RESPECTIVELY.
C C C
VARIABLES OF THE COMMON BLOCK REAL W(66,66),W1(65,65),WPS(65),Q(65,65),COEFF(1600) INTEGER NR(66,66),MITTE,N1,N2,ITER,NSYM,L1(65),L2(65),NO EQUIVALENCE (L1(1),W1(1,1)),(L2(1),Q(1,1)) EQUIVALENCE (NO,W1(1,1)),(MITTE,Q(1,1)),(WPS(1),W1(1,2)) COMMON W,W1,Q,COEFF,NR COMMON N1,N2,ITER,NSYM
C
C
LOCAL VARIABLES
C
REAL D(4),PUNKT,STERN,BET2,EPS,EPSI,EPS2,H,H2, OALT,OMEGAB,OMEGA,X,XN,Y,Z1,Z2,Z3 INTEGER I, J,K,K1,K2,LCOEFF,N,NN,N3,N4, MALT,MITTEO,MITTE1,MMAX,LCMAX LOGICAL BRN,BON,HBIT DATA PUNKT/1H./ DATA STERN/1H*/ DATA MALT /0/ C
C C
MEANING OF THE VARIABLES
C C C C C
C
W(I,J)
W1(I,J), WPS (I) COEFF(L)
C C
C C C C C C
C C C C
Q(I,J)
VALUE OF THE UNKNOWN FUNCTION AT (X,Y), WHERE X=(I-MITTE)*H, Y=(J-MITTE)*H AUXILIARY STORAGE HERE THE COEFFICIENTS OF THE DIFFERENCE EQUATIOI BELONGING TO A POINT NEAR THE BOUNDARY ARE STORED. COEFF(L), COEFF(L+1), COEFF(L+2), AND COEFF(L+3) ARE RELATED TO ONE POINT. RIGHT-HAND SIDE OF THE DIFFERENCE EQUATION
THE INTERIOR POINTS OF THE REGION SATISFY THE INEQUALITIES N1.GT.1 N1.LE.J.LE.N2 L1(J).LE.I.LE.L2(J) THIS SET OF INDICES MAY ALSO CONTAIN OUTER POINTS INDICATED BY NR(I,J)=-1. L1(J) IS EVEN IN ORDER TO SIMPLIFY THE RED-BLACK ORDERING OF THE SOR ITERATION. N1,N2, L1(J), L2(J),
490
APPENDICES
THE ARRAY BOUNDS OF (1) W,NR, OF (2) W1,WPS,Q,L1,L2, AND OF (3) COEFF CAN BE CHANGED SIMULTANEOUSLY WITH MMAX: MMAX=4 BOUNDS (1) (2) 34 33 (3) 800 MMAX=5 (1) (2) BOUNDS 66 65 (3) 1600 MMAX=6 BOUNDS (1) 130 (3) 3200 (2) 129 THE REAL VARIABLES MAY BE REPLACED BY DOUBLE PRECISION VARIABLES.
N
DISTANCES OF A BOUNDARY CLOSE POINT FROM THE NEIGHBOURING POINTS. DISTANCES OF THE BOUNDARY DISTANT POINTS (GRID SIZE) H=1./2**M =2**M
H2 NN
=H*H =N/2
MALT
=0 IN THE CASE OF THE FIRST RUN OF THIS SUBROUTINE; OTHERWISE, MALT COINCIDES WITH M FROM THE FOREGOING RUN. OMEGAS OF THE LAST RUN; OTHERWISE UNDEFINED =.TRUE. IN THE CASE OF SYMMETRIE WITH RESPECT TO THE LINE X=1 =.TRUE. IN THE CASE OF SYMMETRIE WITH RESPECT TO THE LINE Y=1 =.NOT.BR =.NOT.BO RELATIVE ACCURACY. THE SOR ITERATION IS CONTINUED UNTIL EPS IS REACHED. RELATIVE ACCURACY FOR DETERMINING THE DIFFERENCE 2-OMEGAB PRELIMINARY SOR PARAMETER THAT IS USED FOR COMPUTING OMEGAB (OMEGA. LT.OMEGAB) NUMBER OF STEPS OF THE SOR ITERATION LENGTH OF THE LINES OF SYMMETRIE IN W(I,J)
D(K) H M
OALT BR BO
BRN BON EPS EPSI
OMEGA ITER NSYM
IF THE PARAMETERS "EPSP" AND "OMEGAP" EQUAL ZERO, THE PROGRAMME DEFINES "EPS=0.001" AND COMPUTES THE OPTIMAL "OMEGAB". IN THE CASE OF "OMEGAP.GT.O.", THE PARAMETER OMEGAB=OMEGAP IS USED DURING THE WHOLE ITERATION.
COMPONENTS OF REAL ARRAYS AND INTEGER VARIABLES EQUATED BY AN EQUIVALENCE STATEMENT ARE USED ONLY AS INTEGERS. OMEGAB=OMEGAP EPS=EPSP MMAX=S LCMAX=50*(2**MMAX) MITT E0=2**MMAX MITTE =MITTEO+1 MITTEI=MITTE +1
Appendix 4:
C
C C C
C
C
Poisson equation on nonrectangular regions
491
M MUST SATISFY 2.LE.M.LE.MMAX IF(2.LE.M AND. M.LE.MMAX) GO TO I PRINT 97, M,MMAX 97 FORMAT(4H1 M=,II,11H NOT IN (2 ,,II,IH)) STOP 1 N=2**M PRELIMINARY "OMEGA" IS I.. ONLY IN THE CASE OF "M=MALT+1" THE VALUE OF "OMEGAB" OF THE FOREGOING RUN IS ASSIGNED TO "OMEGA". OMEGA=1. IF(M.EQ.MALT+I) OMEGA=OALT MALT=M IF(EPS.LE.O.) EPS=0.001 EPS1=-1./ALOG(EPS) EPS2=0.1*EPS THE NUMBER NO OF BISECTION STEPS IN NACHB IS ABOUT -LOG2(EPS). NO=-1.5*ALOG(EPS)+0.5 ITER=O NN=N/2 IF(BR.OR.BO) NN=N
XN = N H = I./XN H2 = H*H Ni = MITTEI-N N2 = MITTE +N N3=N1-1 N4=N2+1
C C C
3
BRN=.NOT.BR BON=.NOT.BO N2R=N2 N20=N2 IF(BRN) N2R=N2R-1 IF(BON) N20=N20-1 THE VALUES 0. AND -1 ARE ASSIGNED TO "W" AND "NR", RESP., AT ALL POINTS OF THE SQUARE -1..LE.Y.LE.+1. -1..LE.X.LE.+I. DO 3 J=N3,N4 DO 3 I=N3,N4 W(I,J)=0. NR(I,J)=-1
C C C C
C C C
THE FOLLOWING NESTED LOOP DETERMINES ALL INTERIOR POINTS. AT THE SAME TIME "KI", "K2", "L1", "L2" ARE DEFINED SUCH THAT KI.LE.J.LE.K2 LI(J).LE.I.LE.L2(J) HOLDS FOR ALL INTERIOR POINTS. KI=N2 K2=N1
492
APPENDICES
DO 5 J=NI,N20 Y = J-MITTE
Y = Y*H LI(J)=N2 L2 (J)=NI
DO 4 I=NI,N2R X = I-MITTE X = X*H (X,Y) IS AN INTERIOR POINT IF "CHAR(X,Y).GT.O.". IN ORDER TO AVOID TOO SMALL DISTANCES FROM THE BOUNDARY, "CHAR(X,Y).GT.EPS2" IS REQUIRED FOR INTERIOR POINTS. THEREFORE IT IS PERMITTED THAT SOME POINTS HAVE A DISTANCE GREATER THAN "H" FROM THE BOUNDARY. WE ALLOW A DISTANCE UP TO 1.1*H. THIS HARDLY INFLUENCES THE ACCURACY. LE. EPS2) GOTO 4 IF (CHAR(X,Y) KI=MINO(KI,J) K2=MAXO(K2,J)
C C C C C
C
L1 (J)=NIND(LI(J) ,I) C C C
C
C C C C C C C
L2 (J) =MAXO (L2 (J) , I) "NR=O" AND "Q=QUELL(X,Y)*H*H" IS USED AT INTERIOR POINTS. THESE VALUES WILL BE CHANGED ONLY AT POINTS NEAR THE BOUNDARY. NR(I,J) = 0 Q(I,J) = QUELL(X,Y)*H2 4 CONTINUE NR(NSYM+I, J)=NR(NSYM-1, J) ROUNDING OFF L1(J) TO THE PRECEDING EVEN INTEGER. L1(J)=L1(J)-MOD (L1(J),2) 5 CONTINUE DO 6 I=NI,N2R HR(I,NSYN+1)=NR(I,NSYM-1) CONTINUE 6
NEW PAGE. AFTERWARDS THE GRID IS PRINTED OUT PROVIDED THAT M.LE.5. "STAR"=INTERIOR POINT "PERIOD"=EXTERIOR POINT "X" IS ORIENTATED FROM THE LEFT TO THE RIGHT, "Y" FROM BOTTOM TO TOP. IF (M GT. 5) GO TO 10 PRINT99 99 FORMAT (IHI) J=N20
DO 9 K = NI,N20 DO 7 I = NI,N2R WPS(I) = PUNKT IF(NR(I,J).GE.O) WPS(I) = STERN 7 CONTINUE PRINT 8, (WPS(I), I=NI,N2R) 8
.FORMAT(3X,64A2)
9
CONTINUE
J = J-1
Appendix 4:
C
Poisson equation on nonrectangular regions
493
IN THE CASE OF "K1.GT.K2" THERE ARE NO INTERIOR POINTS. IF(K1.LE.K2) GOTO 10 PRINT 98 98 FORMAT(30H1 THERE ARE NO INTERIOR POINTS) STOP
C C C
HENCEFORTH THE INTERIOR POINTS SATISFY N1.LE.J.LE.N2 L1(J).LE.I.LE.L2(J) 10 N1=K1 N2=K2
C C C C
C C C C
IT FOLLOWS THE DETERMINATION OF THE BOUNDARY CLOSE POINTS AND THE COMPUTATION OF THE COEFFICIENTS OF THE DIFFERENCE EQUATION IN THESE POINTS. THE COEFFICIENTS ARE ASSIGNED TO COEFF(LCOEFF), COEFF(LCOEFF+1), COEFF(LCOEFF+2), COEFF(LCOEFF+3)
C
LCOEFF = 1 DO 30 J=N1,N2 Y = J-MITTE Y = Y*H
K1=L1 (J) K2=L2 (J) C C C
C C C
C C C C
DO 29 I=K1,K2 NO CHECKS FOR EXTERIOR POINTS. IF (NR(I,J) LT. 0) GOTO 29 THE SUBROUTINE "NEIGHB" DEFINES THE ARRAY "D". "D(1),D(2),D(3),D(4) = DISTANCES FROM THE NEIGHBOURS" FOR BOUNDARY CLOSE POINTS; "D(1)=-H", OTHERWISE. CALL NEIGHB(D, I, J, H) ONLY BOUNDARY CLOSE POINTS ARE TREATED IN THE FOLLOWING IF (D(1).LT.O.) GO TO 29 IF "LCOEFF.GT.LCMAX" THE ARRAY "COEFF" IS FILLED. THE PROGRAMME MUST BE TERMINATED. LCMAX (=LENGTH OF LCOEFF) CAN BE ENLARGED. IN THIS CASE THE COMMON BLOCKS ARE TO BE CHANGED. IF(LCOEFF.GT.LCMAX) GOT0 100 X = I-MITTE
X = X*H Z1=D (1)+D(3)
Z2=D(2)+D(4) Z3 = 1./(Z1*D(1))+1./(Z2*D(2))+1./(Z1*D(3))+1./(Z2*D(4)) Q(I,J) = Q(I,J)*2./(Z3*H2) Z1=Z1*Z3 Z2=Z2*Z3 HBIT=.TRUE.
APPENDICES
494
11
12 13 14
15
16 17
18 19 20
21
22
29 30
IF (NR(I+1,J)) 11, 12, 12 Q(I,J) = Q(I,J) - 4./(D(1)*Z1)*RAND(X+D(1),Y) COEFF(LCOEFF) = 0. HBIT=HBIT.AND.D(1).EQ.H GOTO 13 COEFF(LCOEFF) = 4./(D(1)*Z1) IF (NR(I,J+1)) 14, 15, 15 Q(I,J) = Q(I,J) - 4./(D(2)*Z2)*RAND(X,Y+D(2)) COEFF(LCOEFF+1) = 0. HBIT=HBIT.AND.D(2).EQ.H GOTO 16 COEFF(LCOEFF+1) = 4./(D(2)*Z2) IF (NR(I-1,J)) 17, 18, 18 Q(I,J) = Q(I,J) - 4./(D(3)*Z1)*RAND(X-D(3),Y) COEFF(LCOEFF+2) = 0. HBIT=HBIT.AND.D(3).EQ.H GOT0 19 COEFF(LCOEFF+2) = 4./(D(3)*Z1) IF (NR(I,J-1)) 20, 21, 21 Q(I,J) = Q(I,J) - 4./(D(4)*Z2)*RAND(X,Y-D(4)) COEFF(LCOEFF+3) = 0. HBIT=HBIT.AND.D(4).EQ.H GOTO 22 COEFF(LCOEFF+3) = 4./(D(4)*Z2) NR(I,J) = 0 IF(HBIT) GOTO 29 NR(I,J) = LCOEFF LCOEFF = LCOEFF + 4 CONTINUE CONTINUE
C C
LCOEFF = LCOEFF/4 PRINT 40,LCOEFF 40 FORMAT(IX/ 34H NUMBER OF BOUNDARY CLOSE POINTS, 14H (D(L).NE.H) _, 14) C C C C
THE NEXT LOOP ENDING WITH STATEMENT NUMBER "59" COMPUTES THE OPTIMAL "OMEGAB". THE COMPUTATION IS OMITTED IF
C
"OIIEGAB.GT.O.".
C C
C C C
C
IF(OHEGAB.GT.O.) GOTO 60 "OMEGAB" WILL BE IMPROVED ITERATIVELY, STARTING WITH A UNSUITABLE VALUE. OMEGAB=2. AT FIRST "NN" STEPS OF THE SOR ITERATION ARE EXECUTED. AFTER NEARLY NN STEPS THE INFLUENCE OF THE BOUNDARY VALUES BEARS UPON THE MIDDLE OF THE REGION. 31 DO 32 I = 1,NN 32 CALL SOR(OMEGA) "W1 =W"
CALL SAVE
Appendix 4:
C C
C C C C C C
C C C
C
C
Poisson equation on nonrectangular regions
495
A FURTHER STEP OF THE SOR ITERATION. CALL SOR(OMEGA) "Z1"=SUMME"(W1(I,J)-W(I,J))**2" CALL QNORM(Z1) CALL SAVE CALL SOR(OMEGA) CALL QNORM(Z2) IF THE COMPONENT OF THE INITIAL ERROR BELONGING TO THE LARGEST EIGENVALUE OF THE MATRIX IS TOO SMALL, THE STARTING VALUES MUST BE CHANGED. IF(Z2.GE.Z1) GOTO 110 "Z3"=APPROXIMATION TO THE SPECTRAL RADIUS OF THE ITERATION MATRIX OF THE SOR ITERATION WITH PARAMETER "OMEGA". Z3 = SQRT(Z2/Z1) "BET2"=APPROXIMATION TO THE SQUARED SPECTRAL RADIUS OF THE ITERATION MATRIX OF THE JACOBI ITERATION. BET2 = (Z3+OI4EGA-1.)**2/(Z3*OMEGA**2) "Z3"=NEW APPROXIMATION OF "OMEGAB" Z3 = 2./(1.+SQRT(1.-BET2)) THE DIFFERENCE "2.-OMEGAS" IS TO BE DETERMINED UP TO THE RELATIVE ACCURACY "EPS1". IN CASE OF WORSE ACCURACY THE WHOLE PROCESS IS TO BE REPEATED WITH "OMEGAB=Z3". IF (ABS(Z3-01IEGAB) LT. (2.-23)*EPSI) GO TO 59 OHEGAB=Z3 GOTO 31
C C C
SINCE IT IS MORE ADVANTAGEOUS TO USE A LARGER THAN A SMALLER VALUE, THE APPROXIMATION OF "OMEGAB" IS SOMEWHAT ENLARGED. THEN "OMEGAB" IS ROUNDED UP. 59 Z3=Z3+EPS1*(2.-Z3)+16.
C C
"OMEGAS" IS ASSIGNED TO "OALT". THIS VALUE IS KEPT SINCE IT CAN BE USED IN A FOLLOWING RUN WITH "M=M+l". OALT=OMEGAB PRINTING OF "OMEGAS" AND OF THE NUMBER OF ITERATION STEPS NEEDED UP TO NOW. PRINT 61, OMEGAS, ITER 61 FORMAT (1X/ 11H OMEGAS =,F6.3,7H ITER =,I4) 62 FORMAT(29H TOTAL NUMBER OF ITERATIONS,I6/1H
C C
C C C C
C C
"W1=W"
60 CALL SAVE NN STEPS OF THE SOR ITERATION DO 80 I = 1,NN 80 CALL SOR(OMEGAB) CHECK OF ACCURACY. THE ACCURACY IS NEARLY INDEPENDENT OF "H" AND "H", SINCE "NN" ITERATIONS ARE USED. DO 90 J = N1,N2 K1=L1(J) K2=L2(J) DO 90 I = K1,K2 IF (ABS(W1(I,J)-W(I,J)) GT. EPS) GOTO 60 90 CONTINUE
496
APPENDICES
C
C
PRINT-OUT OF "W" AND OF THE TOTAL NUMBER OF SOR ITERATIONS.
C
IF (M.GT.3) PRINT 99 PRINT 62, ITER
N3 = N3 + I J = N2
Y=(N2-MITTE)*H DO 93 K=N1,N2 PRINT 92, Y,(W(I,J),I=N3,N2R) FORHAT(1X,F9.7,4X,8F7.4,4X,8F7.4/7(14X,8F7.4,4X,8F7.4/)) 92 J = J - 1 Y=Y-H 93 CONTINUE RETURN C
100 PRINT 101 101 FORMAT(33H STOP
TOO MANY BOUNDARY CLOSE POINTS)
C
C
CHANGE OF STARTING VALUES AT THE INTERIOR POINTS.
C
110 DO 120 J=N1,N2 K1= L1(J) K2=L2(J) DO 119 I=K1,K2
IF(NR(I,J).LT.O) GO TO 119 V(I,J)=W(I,J)-1. CONTINUE 119 120 CONTINUE GOTO 31 END
Appendix 4:
Poisson equation on nonrectangular regions
497
SUBROUTINE SOR(OMEGA) C
THE SUBROUTINE "SOR" PERFORMS ONE ITERATION STEP. THE NUMBER OF ITERATIONS IS COUNTED BY "ITER". THE RED-BLACK ORDERING IS USED FOR THE INTERIOR POINTS. SINCE "L1(J)" IS EVEN, M=1 , MOD(I+J,2)=MOD(N1,2) HOLDS FOR THE FIRST RUN, WHILE M=2 , MOD(I+J,2)=MOD(N1+1,2) HOLDS FOR THE SECOND RUN.
C C C C
C C C C C
PARAMETER
C C
REAL OMEGA C
VARIABLES OF THE COMMON BLOCK
C C
REAL W(66,66),W1(65,65),WPS(65),Q(65,65),COEFF(1600) INTEGER NR(66,66),MITTE,NI,N2,ITER,NSYM,L1(65),L2(65),NO EQUIVALENCE (L1(1),W1(1,1)),(L2(1),Q(1,1)) EQUIVALENCE (NO,W1(1,1)),(MITTE,Q(1,1)) COI414ON W,W1,Q,COEFF,NR
COMMON N1,N2,ITER,NSYM C C
LOCAL VARIABLES
C
REAL OM,OM1 INTEGER I,J,K,K2,L,M,N C
ITER=ITER+1 OM = OMEGA*0.25 OMI=1.-OMEGA I.1
=I
N = 0 5 DO 50 J=N1,N2 K=LI(J)+N K2=L2(J) DO 40 I = K,K2,2 IF (NR(I,J)) 40, 10, 20 10 U(I,J) = W(I,J)*OMI + OM*(W(I+1,J)+W(I,J+1)+W(I-1,J) +
+W (I, J-1) -Q (10 J)
GOTO 40 20 L = NR(I,J) W(I,J) = W(I,J)*OM1 + OM*(COEFF(L)*W(I+1,J)+COEFF(L+1)* * W(I,J+1)+COEFF(L+2)*W(I-1,J)+COEFF(L+3)*W(I,J-1)-Q(I,J)) 40 CONTINUE W(NSYM+1,J)=W(NSYM-1,J) N = 1-N 50 CONTINUE IF(N2.LT.NSYM) GOTO 54 K=L1(NSYM-1) K2=L2(NSYM-1)
498
APPENDICES
DO 53 I=K,K2 W(I,NSYM+1)=W(I,NSYM-1) 53 CONTINUE 54 IF (M-2) 55,56,56
55 N = 1
M = 2 GOTO 5 56 RETURN END
SUBROUTINE NEIGHB(D, I, J, H) C C C
C C C
C C
"NEIGHB" COMPUTES THE DISTANCE OF THE INTERIOR POINT (X,Y) WITH X=(I-MITTE)*H Y=(J-MITTE)*H
FROM THE NEIGHBOURS. IN THE CASE OF A BOUNDARY DISTANT POINT, THE RESULT IS
D(1)=-H , D(2)=D(3)=D(4)=H.
C C C C
FOR BOUNDARY CLOSE POINTS FOUR POSITIVE NUMBERS ARE COMPUTED. THE DISTANCE FROM THE BOUNDARY IS DETERMINED BY A BISECTION METHOD, SINCE THE CHARACTERISTIC FUNCTION OF THE REGION IS NOT REQUIRED TO BE DIFFERENTIABLE.
C C C
PARAMETERS
REAL D(4),H INTEGER I,J C C C
VARIABLES OF THE COMMON BLOCKS REAL W(66,66),WI(65,65),WPS(65),Q(65,65),COEFF(1600) INTEGER NR(66,66),MITTE,NI,N2,ITER,NSYM,L1(65),L2(65),NO EQUIVALENCE (L1(1),W1(1,1)),(L2(1),Q(1,1)) EQUIVALENCE (NO,WI(1,1)),(MITTE,Q(1,1)) CON11ON W,W1,Q,COEFF,NR COMMON N1,N2,ITER,NSYM
C C C
LOCAL VARIABLES
REAL DELTA,H1,H11,X,Y INTEGER I1(4),J1(4),I2,J2,K,L LOGICAL B DATA II/1,0,-1,O/ DATA J1/0,1,0,-1/
Appendix 4:
Poisson equa is
cr. 'crrectangular regions
499
C
B = TRUE. H1=H/8. X = (I-MITTE)*H Y = (J-MITTE)*H DO 100 K = 1,4
12 = I + II (K) J2 = J + JI (K)
IF (NR(I2,J2).GE.O.) GO TO 90 B = FALSE. H11 = HI DELTA = HI C C
10
C
C
11
15
20 C
C C C C
80
90 100
IN THIS LOOP THE SIGN OF "CHARDL" IS CHECKED IN STEPS OF "H/8" IN ORDER TO DETECT THE FIRST CHANGE OF SIGN DO 10 L=1,9 IF (CHARDL(K, DELTA, X, Y).LT.O.) GO TO 11 DELTA = DELTA + H11 DELTA=H GO TO 80 HERE THE BISECTION METHOD STARTS. "NO" IS THE REQUIRED NUMBER OF STEPS. IT IS DEFINED IN POIS. H11 = H11*0.5 DELTA = DELTA - H11 DO 20 L = 1,NO H11 = H11*0.5 IF (CHARDL(K, DELTA, X, Y).LE.O.) GO TO 15 DELTA = DELTA + H11 GO TO 20 DELTA = DELTA - H11 CONTINUE IF(DELTA.GT.1.1*H) DELTA=H THE RESULTING DISTANCE MAY BE SOMEWHAT LARGER THAN "H". BUT NOTE THE BOUNDARY POINT IS LOCATED IN ABS(X).LE.1. ABS(Y).LE.1. SINCE EVERY REGION IS CUT IN THIS WAY. IF (DELTA.LE.H) GO TO 80 IF (X + I1(K)*DELTA.GT.1. OR. X + I1(K)*DELTA.LT.-I.) DELTA = H OR. IF (Y + J1(K)*DELTA.GT.1. Y + J1(K)*DELTA.LT.-1.) DELTA = H D(K) = DELTA GO TO 100 D(K) = H CONTINUE IF (B) D(1) _ -H RETURN
END
      REAL FUNCTION CHARDL(K,DELTA,X,Y)
C
C     "CHARDL" TRANSFORMS THE ARGUMENTS OF "CHAR" AS SUITED TO
C     "NEIGHB". "CHAR" SHOULD BE PROGRAMMED AS SIMPLE AS
C     POSSIBLE TO ENABLE EASY CHANGES.
C
C     PARAMETERS
C
      REAL DELTA,X,Y
      INTEGER K
C
C     LOCAL VARIABLES
C
      REAL X1(4),Y1(4),X2,Y2
      DATA X1/1.,0.,-1.,0./
      DATA Y1/0.,1.,0.,-1./
C
      X2 = X + DELTA*X1(K)
      Y2 = Y + DELTA*Y1(K)
      CHARDL = CHAR(X2,Y2)
      RETURN
      END
      SUBROUTINE SAVE
C
C     "SAVE" STORES "W" ON "W1"
C
C     VARIABLES OF THE COMMON BLOCKS
C
      REAL W(66,66),W1(65,65),WPS(65),Q(65,65),COEFF(1600)
      INTEGER NR(66,66),MITTE,N1,N2,ITER,NSYM,L1(65),L2(65),NO
      EQUIVALENCE (L1(1),W1(1,1)),(L2(1),Q(1,1))
      EQUIVALENCE (NO,W1(1,1)),(MITTE,Q(1,1))
      COMMON W,W1,Q,COEFF,NR
      COMMON N1,N2,ITER,NSYM
C
C     LOCAL VARIABLES
C
      INTEGER I,J,K1,K2
C
      DO 10 J=N1,N2
      K1=L1(J)
      K2=L2(J)
      DO 10 I=K1,K2
   10 W1(I,J)=W(I,J)
      RETURN
      END
      SUBROUTINE QNORM(Z)
C
C     "QNORM" COMPUTES THE SQUARED EUCLIDEAN NORM OF THE
C     DIFFERENCE:  Z = SUM OF (W1(I,J)-W(I,J))**2
C
C     PARAMETER
C
      REAL Z
C
C     VARIABLES OF THE COMMON BLOCKS
C
      REAL W(66,66),W1(65,65),WPS(65),Q(65,65),COEFF(1600)
      INTEGER NR(66,66),MITTE,N1,N2,ITER,NSYM,L1(65),L2(65),NO
      EQUIVALENCE (L1(1),W1(1,1)),(L2(1),Q(1,1))
      EQUIVALENCE (NO,W1(1,1)),(MITTE,Q(1,1))
      COMMON W,W1,Q,COEFF,NR
      COMMON N1,N2,ITER,NSYM
C
C     LOCAL VARIABLES
C
      REAL SUM
      INTEGER I,J,K1,K2
C
      SUM=0.
      DO 10 J=N1,N2
      K1=L1(J)
      K2=L2(J)
      DO 9 I=K1,K2
      IF (NR(I,J).LT.0) GOTO 9
      SUM = SUM + (W1(I,J)-W(I,J))**2
    9 CONTINUE
   10 CONTINUE
      Z=SUM
      RETURN
      END
EXAMPLE:

      REAL FUNCTION CHAR(X,Y)
C
C     ELLIPSE
C
      Z1=X/0.9
      Z2=Y/0.6
      CHAR=1.-Z1*Z1-Z2*Z2
      RETURN
      END

      REAL FUNCTION QUELL(X,Y)
C
      QUELL=4.
      RETURN
      END

      REAL FUNCTION RAND(X,Y)
C
      RAND=1.
      RETURN
      END
Appendix 5:  Programs for band matrices.
We present programs for the following methods:

GAUBD3:  Gaussian elimination without pivot search for
         tridiagonal matrices.
GAUBD:   Gaussian elimination without pivot search for band
         matrices of arbitrary band width w >= 3.
REDUCE:  Band width reduction by the Gibbs-Poole-Stockmeyer
         method. (This includes the subroutines LEVEL, KOMPON,
         SSORT1, SSORT2.)

In the programs we use K = (w-1)/2 as a measure of band width
(cf. Definitions 20.1 and 20.4); for a tridiagonal matrix, for
example, w = 3 and K = 1.

We first consider calls of GAUBD3 and GAUBD. A is the matrix A
from Section 20 (cf. also Figure 20.2), N >= 2 is the number of
equations, and K < N is the measure of band width just mentioned.
If the only change since the last call of the program is on the
right side of the system of equations A(4,*) [or A(2*K+2,*)],
set B = .TRUE.; otherwise, B = .FALSE.. After the call of the
program, the solution vector is in A(4,*) [or A(2*K+2,*)]. For
K > 10, the statement for A in GAUBD has to be replaced by
REAL A(m,N). Here m is some number greater than or equal to
2*K+2.

The number of floating point operations in one call of GAUBD3
or GAUBD is:

    GAUBD3    B = .FALSE.:   8N - 7
              B = .TRUE.:    5N - 4
    GAUBD     B = .FALSE.:   (2K² + 5K + 1)(N-1) + 1
              B = .TRUE.:    (4K + 1)(N-1) + 1.
The program REDUCE contains four explicit parameters:

    N:     number of rows in the matrix
    M:     the number of matrix elements different from zero
           above the main diagonal
    KOLD:  K before band width reduction
    KNEW:  K after band width reduction.

N and M are input parameters, and KOLD and KNEW are output
parameters. The pattern of the matrix is described by the
vector A in the COMMON block. Before a call, one enters here
the row and column indices of the various matrix elements
different from zero above the main diagonal. The entry order
is: row index, corresponding column index, row index,
corresponding column index, etc.; altogether there are M pairs
of this sort. REDUCE writes into this vector that permutation
which leads to band width reduction. Either KOLD > KNEW or
KOLD = KNEW. In the second case, the permutation is the
identity, since no band width reduction can be accomplished
with this program.
The next example should make the use of REDUCE more explicit.
The pattern of the matrix is given as

    x x       x x
    x x x     x x x
      x x x     x x x
        x x x     x x x
          x x       x x
    x x       x x       x x
    x x x     x x x     x x x
      x x x     x x x     x x x
        x x x     x x x     x x x
          x x       x x       x x
              x x       x x
              x x x     x x x
                x x x     x x x
                  x x x     x x x
                    x x       x x
This corresponds to the graph in Figure 20.11.
Input:
    N = 15,  M = 38,
    A = 1,2,1,6,1,7,2,3,2,6,2,7,2,8,3,4,3,7,3,8,3,9,
        4,5,4,8,4,9,4,10,5,9,5,10,6,7,6,11,6,12,7,8,7,11,
        7,12,7,13,8,9,8,12,8,13,8,14,9,10,9,13,9,14,9,15,
        10,14,10,15,11,12,12,13,13,14,14,15.

Output:
    KOLD = 6,  KNEW = 4,
    A = 1,4,7,10,13,2,5,8,11,14,3,6,9,12,15.
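A driver for this example might be set up as follows (our own
sketch, not part of the original listings; the COMMON layout is
copied from the declarations of REDUCE below, and the index
pairs are copied from a local array P, since blank COMMON
cannot be initialized by DATA statements):

C     DRIVER FOR THE EXAMPLE ABOVE (SKETCH)
      INTEGER A(4096),VEC(650),IND(651,8)
      COMMON A,VEC,IND
      INTEGER P(76),N,M,KOLD,KNEW,I
      DATA P/1,2,1,6,1,7,2,3,2,6,2,7,2,8,3,4,3,7,3,8,3,9,
     *       4,5,4,8,4,9,4,10,5,9,5,10,6,7,6,11,6,12,7,8,7,11,
     *       7,12,7,13,8,9,8,12,8,13,8,14,9,10,9,13,9,14,9,15,
     *       10,14,10,15,11,12,12,13,13,14,14,15/
      N=15
      M=38
      DO 10 I=1,2*M
   10 A(I)=P(I)
      CALL REDUCE(N,M,KOLD,KNEW)
      WRITE(6,20) KOLD,KNEW,(A(I),I=1,N)
   20 FORMAT(' KOLD =',I3,',  KNEW =',I3/' A =',15I4)
      STOP
      END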
The program declarations are sufficient for N <= NMAX = 650 and
M <= MMAX = 2048. For larger N or M, only the bounds of the
COMMON variables A, VEC, IND, LIST, GRAD, and NR have to be
changed. However, N must be less than 10,000 in any case. On
IBM or Siemens installations with data types INTEGER*2 and
INTEGER*4, two bytes suffice for IND, LIST, GRAD, and NR. All
other variables should be INTEGER*4.

For a logical run of the program, it is immaterial whether the
graph is connected or not. However, if the graph decomposes
into very many connected components (not counting knots of
degree zero), the computing times become extremely long. We
have attached no special significance to this fact, since the
graph in most practical cases has only one or two connected
components.

REDUCE is described in nine sections.

Section 1: Computation of KOLD and numbering of the degree zero
knots. NUM contains the last number given.

Section 2: Building a data base for the following sections.
During this transformation of the input values, matrix elements
entered in duplicate are eliminated. The output is the knots
A(J), J = LIST(I) to LIST(I+1)-1, which are connected to the
knot I. They are ordered in order of increasing degree. GRAD(I)
gives the degree of knot I. NR(I) is the new number of the
knot, or is zero if the knot does not yet have a new number.
In our example, after Section 2 we obtain:

A    = 2,6,7,  1,3,6,7,8,  2,4,7,8,9,  5,3,10,8,9,  4,10,9,
       1,11,2,12,7,  1,11,2,6,13,3,12,8,  2,3,4,13,14,12,7,9,
       15,5,3,14,4,13,10,8,  5,15,4,14,9,  6,12,7,  11,6,13,7,8,
       12,14,7,8,9,  15,10,13,8,9,  10,14,9.
LIST = 1,4,9,14,19,22,27,35,43,51,56,59,64,69,74,77.
GRAD = 3,5,5,5,3,5,8,8,8,5,3,5,5,5,3.
NR   = 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.
If the graph (disregarding knots of degree zero) consists of
several connected components, Sections 3 through 8 will run the
corresponding number of times.

Section 3: Steps (A) and (B) of the algorithm, computation of
κ1.

Section 4: Steps (C) through (F), computation of κ2. The return
from (D) to (B) contains the instruction

    IF(DEPTHB.GT.DEPTHF) GOTO 160

Section 5: Preliminary enumeration of the elements in the
levels Si (Step (G)).
Section 6: Determination and sorting of the components Vi
(Step (G)).
Section 7: Steps (H) through (J). The loop on ν begins with

    DO 410 NUE = 1, K2

Section 8: Steps (K) through (O). Steps (L) and (M) are
combined in the program. The loop on L ends with

    IF(L.LE.DEPTHF) GOTO 450

Section 9: Computation of KNEW and transfer of the new
enumeration from NR to A.
LEVEL computes one level structure with the root START (cf.
Theorem 20.8); KOMPON computes the components of V (cf. Lemma
20.10), beginning with an arbitrary starting element. SSORT1
and SSORT2 are sorting programs. To save time, we use a method
of Shell 1959 (cf. also Knuth 1973), but this can be replaced
just as easily with Quicksort.
Section 7 determines the amount of working memory required. If
the graph is connected and the return from Step (D) to Step (B)
occurs at most once, then the computing time is

    O(c1·n) + O((c1·n)^(5/4)) + O(c1·c2·n)

where

    n  = number of knots in the graph
    c1 = maximum degree of the knots
    c2 = maximum number of knots in the last level of R(G).

The second summand contains the computing time for the sorts.
If Quicksort is used, this term becomes O(c1·n·log(c1·n)) in
the mean (statistically). The third summand corresponds to
Section 4 of the program.
Suppose a boundary value problem in R² is to be solved with a
difference method or a finite element method. We consider the
various systems of equations which result from a decreasing
mesh size h of the lattice. Then it is usually true that

    n = O(1/h²),  c1 = O(1),  c2 = O(1/h),
    K = O(1/h)  (band width measure).

The computing time for REDUCE thus grows at most in proportion
to 1/h³, and for GAUBD, to 1/h⁴.

The program was tested with 166 examples. Of these, 28 are more
or less comparable, in that they each had a connected graph,
the number of knots was between 900 and 1000, and M was between
1497 and 2992. For this group, the computing time on a
CDC CYBER 76 varied between 0.16 and 0.37 seconds.
      SUBROUTINE GAUBD3(A,N,B)
      REAL A(4,N)
      INTEGER N
      LOGICAL B
C
C     SOLUTION OF A SYSTEM OF LINEAR EQUATIONS WITH TRIDIAGONAL
C     MATRIX. THE I-TH EQUATION IS
C     A(1,I)*X(I-1)+A(2,I)*X(I)+A(3,I)*X(I+1)=A(4,I)
C     ONE TERM IS MISSING FOR THE FIRST AND LAST EQUATION.
C     THE SOLUTION X(I) WILL BE ASSIGNED TO A(4,I).
C
      REAL Q
      INTEGER I,I1
C
      IF(N.LE.1)STOP
      IF(B) GOTO 20
      DO 10 I=2,N
      Q=A(1,I)/A(2,I-1)
      A(2,I)=A(2,I)-A(3,I-1)*Q
      A(4,I)=A(4,I)-A(4,I-1)*Q
   10 A(1,I)=Q
      GOTO 40
C
   20 Q=A(4,1)
      DO 30 I=2,N
      Q=A(4,I)-A(1,I)*Q
   30 A(4,I)=Q
C
   40 Q=A(4,N)/A(2,N)
      A(4,N)=Q
      I1=N-1
      DO 50 I=2,N
      Q=(A(4,I1)-A(3,I1)*Q)/A(2,I1)
      A(4,I1)=Q
   50 I1=I1-1
      RETURN
      END
      SUBROUTINE GAUBD(A,N,K,B)
      REAL A(22,N)
      INTEGER N,K
      LOGICAL B
C
C     SOLUTION OF A SYSTEM OF LINEAR EQUATIONS WITH BAND
C     MATRIX. THE I-TH EQUATION IS
C     A(1,I)*X(I-K)+A(2,I)*X(I-K+1)+...+A(K+1,I)*X(I)+...+
C     A(2*K,I)*X(I+K-1)+A(2*K+1,I)*X(I+K) = A(2*K+2,I)
C     FOR I=1(1)K AND I=N-K+1(1)N SOME TERMS ARE MISSING.
C     THE SOLUTION X(I) WILL BE ASSIGNED TO A(2*K+2,I).
C
      REAL Q
      INTEGER K1,K2,K21,K22,I,II,II1,J,JJ,J1,JK,L,LL,LLL
C
      IF((K.LE.0).OR.(K.GE.N))STOP
      K1=K+1
      K2=K+2
      K21=2*K+1
      K22=K21+1
      IF(B) GO TO 100
C
      JJ=K21
      II=N-K+1
      DO 20 I=II,N
      DO 10 J=JJ,K21
   10 A(J,I)=0.
   20 JJ=JJ-1
      DO 50 I=2,N
      II=I-K
      DO 40 J=1,K
      IF(II.LE.0) GO TO 40
      Q=A(J,I)/A(K1,II)
      J1=J+1
      JK=J+K
      LLL=K2
      DO 30 L=J1,JK
      A(L,I)=A(L,I)-A(LLL,II)*Q
   30 LLL=LLL+1
      A(K22,I)=A(K22,I)-A(K22,II)*Q
      A(J,I)=Q
   40 II=II+1
   50 CONTINUE
      GO TO 200
C
  100 DO 150 I=2,N
      II=I-K
      DO 140 J=1,K
      IF(II.LE.0) GO TO 140
      A(K22,I)=A(K22,I)-A(K22,II)*A(J,I)
  140 II=II+1
  150 CONTINUE
C
  200 A(K22,N)=A(K22,N)/A(K1,N)
      II=N-1
      DO 250 I=2,N
      Q=A(K22,II)
      JJ=II+K
      IF(JJ.GT.N) JJ=N
      II1=II+1
      LL=K2
      DO 240 J=II1,JJ
      Q=Q-A(LL,II)*A(K22,J)
  240 LL=LL+1
      A(K22,II)=Q/A(K1,II)
  250 II=II-1
      RETURN
      END
      SUBROUTINE REDUCE(N,M,KOLD,KNEW)
C
C     PROGRAMME FOR REDUCING THE BANDWIDTH OF A SPARSE SYMMETRIC
C     MATRIX BY THE METHOD OF GIBBS, POOLE, AND STOCKMEYER.
C
C     INPUT
C
C     N                 NUMBER OF ROWS
C     M                 NUMBER OF NONVANISHING ENTRIES ABOVE THE
C                       DIAGONAL
C     A(I), I=1(1)2*M   INPUT VECTOR CONTAINING THE INDICES OF
C                       THE NONVANISHING ENTRIES ABOVE THE
C                       DIAGONAL. THE INDICES ARE ARRANGED IN
C                       THE SEQUENCE I1, J1, I2, J2, I3, J3, ...
C
C     OUTPUT
C
C     A(I), I=1(1)N     NEW NUMBERS OF I-TH ROW AND COLUMN.
C     KOLD              BANDWIDTH OF THE INPUT MATRIX
C     KNEW              BANDWIDTH AFTER PERMUTATION OF THE INPUT
C                       MATRIX ACCORDING TO A(I), I=1(1)N
C
C     THE ARRAY BOUNDS MAY BE CHANGED, PROVIDED THAT
C     NMAX.LT.10000
C     A(2*MMAX), VEC(NMAX), IND(NMAX+1,8), LIST(NMAX+1),
C     GRAD(NMAX), NR(NMAX)
C
      INTEGER N,M,KOLD,KNEW
C
      INTEGER A(4096),VEC(650)
      INTEGER IND(651,8),LIST(651),GRAD(650),NR(650)
      COMMON A,VEC,IND
      EQUIVALENCE (LIST(1),IND(1,1)),(GRAD(1),IND(1,2)),
     *            (NR(1),IND(1,3))
C
      INTEGER NMAX,MMAX,NN,M2,N1,NUE,N1M,NUM,IS,OLD,NEW
      INTEGER F,L,L1,L2,L10,I,J,III,K,KNP1,K1,K1N,K2,K2P1
      INTEGER G,H,START,DEPTHF,DEPTHB,LEVWTH,GRDMIN,C,C2
      INTEGER KAPPA,KAPPA1,KAPPA2,KAPPA3,KAPPA4
      INTEGER IND1,IND2,INDJ2,INDJ5,INDJ6,INDI7,INDI8,VECJ
      DATA C/10000/
      C2=2*C
      NMAX=650
      MMAX=2048
      IF(N.LT.2.OR.N.GT.NMAX.OR.M.GT.MMAX) STOP
C
C     SECTION 1
C
      M2=M+M
      KOLD=0
      KNEW=N
      DO 10 I=1,8
      DO 10 J=1,N
   10 IND(J,I)=0
      IF(M.EQ.0) GOTO 680
      DO 15 I=1,M2,2
      J=IABS(A(I)-A(I+1))
      IF(J.GT.KOLD) KOLD=J
   15 CONTINUE
C
      DO 20 I=1,M2
      K1=A(I)
   20 IND(K1,7)=1
      NUM=1
      DO 30 I=1,N
      IF(IND(I,7).GT.0) GOTO 30
      NR(I)=NUM
      NUM=NUM+1
   30 CONTINUE
C
C     SECTION 2 (NEW DATA STRUCTURE)
C
      DO 40 I=1,M2,2
      K1=A(I)
      K2=A(I+1)
      A(I)=K1+C*K2
   40 A(I+1)=K2+C*K1
      CALL SSORT1(1,M2)
      J=1
      OLD=A(1)
      DO 70 I=2,M2
      NEW=A(I)
      IF(NEW.GT.OLD) J=J+1
      A(J)=NEW
   70 OLD=NEW
      M2=J
      IND(1,2)=1
      J=1
      L10=A(1)/C
      DO 90 I=1,M2
      K=A(I)
      L1=K/C
      L2=K-L1*C
      A(I)=L2
      IF(L1.EQ.L10) GOTO 90
      L10=L1
      J=J+1
      IND(J,2)=I
   90 CONTINUE
      IND(J+1,2)=M2+1
      LIST(1)=1
      J=1
      DO 110 I=1,N
      IF(IND(I,7).GT.0) J=J+1
  110 LIST(I+1)=IND(J,2)
      DO 120 I=1,N
  120 GRAD(I)=LIST(I+1)-LIST(I)
      DO 130 I=1,N
      F=LIST(I)
      L=LIST(I+1)-1
  130 CALL SSORT2(A,2,F,L)
C
C     SECTION 3 (COMPUTATION OF R(G))
C     STEPS (A) AND (B), COMPUTATION OF KAPPA 1
C     IND(I,7)   LEVEL NUMBER OF R(G)
C     VEC(I)     ELEMENTS OF THE LAST LEVEL
C
  140 GRDMIN=N
      DO 150 I=1,N
      IF(NR(I).GT.0) GOTO 150
      IF(GRDMIN.LE.GRAD(I)) GOTO 150
      START=I
      GRDMIN=GRAD(I)
  150 CONTINUE
C
  160 G=START
      NN=N
      CALL LEVEL(G,NN,DEPTHF,K1,KAPPA1)
      J=NN-K1
      DO 180 I=1,K1
      III=I+J
  180 VEC(I)=IND(III,6)
      DO 190 I=1,N
  190 IND(I,7)=IND(I,8)
C
C     SECTION 4 (COMPUTATION OF R(H))
C     STEPS (C) TO (F), COMPUTATION OF KAPPA 2
C     IND(I,8)   LEVEL NUMBERS OF R(H)
C
      LEVWTH=N
      DO 210 I=1,K1
      START=VEC(I)
      N1=N
      CALL LEVEL(START,N1,DEPTHB,K1N,KAPPA2)
      IF(DEPTHB.GT.DEPTHF) GOTO 160
      IF(KAPPA2.GE.LEVWTH) GOTO 210
      LEVWTH=KAPPA2
      VECJ=I
  210 CONTINUE
      H=VEC(VECJ)
      N1=N
      CALL LEVEL(H,N1,DEPTHB,K1N,KAPPA2)
C
C     SECTION 5 (PRELIMINARY NUMBERING OF THE ELEMENTS OF S(I))
C     STEP (G)
C     IND(I,4)   PRELIMINARY NUMBER OF ELEMENTS OF S(I)
C     IND(I,5)   LEVEL NUMBERS FOR NODES WITH SAME NUMBERING;
C                ZERO OTHERWISE
C
      DO 230 I=1,N
      IND(I,4)=0
  230 IND(I,5)=0
      J=0
      KNP1=DEPTHF+1
      DO 260 I=1,N
      INDI8=IND(I,8)
      IF(INDI8.EQ.0) GOTO 260
      K2=KNP1-INDI8
      IF(IND(I,7).NE.K2) GOTO 250
      IND(K2,4)=IND(K2,4)+1
      K2=-K2
      J=J+1
  250 CONTINUE
      IND(I,5)=K2
  260 CONTINUE
C
C     SECTION 6 (DETERMINATION AND SORTING OF V(I))
C     STEP (G)
C     VEC(I)   STARTING VALUES OF V(I) SORTED WITH RESPECT TO
C              IABS(V(I))
C
      K2=0
      IF(J.EQ.NN) GOTO 412
      DO 290 I=1,N
      IF(IND(I,5).LE.0) GOTO 290
      START=I
      CALL KOMPON(START,N1)
      K2=K2+1
      VEC(K2)=START
      IND(START,8)=N1
  290 CONTINUE
      DO 310 I=1,N
      IF(IND(I,5).LT.-C) IND(I,5)=IND(I,5)+C2
  310 CONTINUE
      CALL SSORT2(VEC,8,1,K2)
      N1M=VEC(K2)
      N1M=IND(N1M,8)
      DO 315 I=1,K2
  315 IND(I,8)=VEC(I)
C
C     SECTION 7 (COMPUTATION OF S)
C     STEPS (H) TO (J)
C     IND(I,4)   NUMBER OF ELEMENTS OF S(I)
C     IND(I,5)   LEVEL NUMBER OF S
C     IND(I,6)   ALL NODES V(I)
C
      DO 319 J=1,DEPTHF
  319 VEC(J)=0
      K2P1=K2+1
      DO 410 NUE=1,K2
      III=K2P1-NUE
      START=IND(III,8)
      CALL KOMPON(START,N1)
      IND1=7
      IS=0
  325 DO 330 J=1,N1
      INDJ6=IND(J,6)
      III=IND(INDJ6,IND1)+IS
      VEC(III)=VEC(III)+1
  330 IND(J,2)=III
      KAPPA=0
      DO 340 J=1,N1
      INDJ2=IND(J,2)
      III=VEC(INDJ2)
      IF(III.LE.0) GOTO 340
      III=III+IND(INDJ2,4)
      VEC(INDJ2)=0
      IF(III.GT.KAPPA) KAPPA=III
  340 CONTINUE
      IF(IS.GT.0) GOTO 346
      KAPPA3=KAPPA
      IND1=5
      IS=C2
      GOTO 325
C
  346 KAPPA4=KAPPA
      IF(KAPPA3-KAPPA4) 350,347,380
  347 IF(KAPPA1.GT.KAPPA2) GOTO 380
  350 DO 370 J=1,N1
      INDJ6=IND(J,6)
      III=IND(INDJ6,7)
      IND(INDJ6,5)=III-C2
  370 IND(III,4)=IND(III,4)+1
      GOTO 410
  380 DO 400 J=1,N1
      INDJ2=IND(J,2)
  400 IND(INDJ2,4)=IND(INDJ2,4)+1
  410 CONTINUE
      DO 411 J=1,N1M
  411 GRAD(J)=LIST(J+1)-LIST(J)
  412 DO 415 I=1,N
      IF(IND(I,5).LT.-C) IND(I,5)=IND(I,5)+C2
  415 IND(I,5)=IABS(IND(I,5))
C
C     SECTION 8 (NUMBERING)
C     STEPS (K) TO (O)
C
      DO 420 I=1,N
  420 VEC(I)=I
      CALL SSORT2(VEC,5,1,N)
      IND1=1
      L=1
      OLD=0
      NEW=1
      IND(1,7)=G
      NR(G)=NUM
      NUM=NUM+1
C
  450 I=0
C
  460 I=I+1
      IF(I.GT.NEW) GOTO 490
  470 INDI7=IND(I,7)
      L1=LIST(INDI7)
      L2=LIST(INDI7+1)-1
      DO 480 J=L1,L2
      START=A(J)
      IF(NR(START).GT.0) GOTO 480
      IF(IND(START,5).NE.L) GOTO 480
      NR(START)=NUM
      NUM=NUM+1
      NEW=NEW+1
      IND(NEW,7)=START
  480 CONTINUE
      GOTO 460
C
  490 IF(NEW-OLD.GE.IND(L,4)) GOTO 510
      GRDMIN=N
      IND2=IND1
      DO 500 J=IND1,N
      VECJ=VEC(J)
      INDJ5=IND(VECJ,5)
      IF(INDJ5-L) 499,491,501
  491 IF(NR(VECJ).GT.0) GOTO 500
      IF(GRAD(VECJ).GE.GRDMIN) GOTO 500
      GRDMIN=GRAD(VECJ)
      START=VECJ
      GOTO 500
  499 IND2=J+1
  500 CONTINUE
  501 IND1=IND2
      NR(START)=NUM
      NUM=NUM+1
      NEW=NEW+1
      IND(NEW,7)=START
      GOTO 470
C
  510 NEW=NEW-OLD
      DO 520 I=1,NEW
      III=I+OLD
  520 IND(I,7)=IND(III,7)
      OLD=NEW
      L=L+1
      IF(L.LE.DEPTHF) GOTO 450
      IF(NUM.LE.N) GOTO 140
C
C     SECTION 9 (COMPUTATION OF KNEW)
C
      KNEW=0
      DO 670 I=1,N
      N1=NR(I)
      L1=LIST(I)
      L2=LIST(I+1)-1
      IF(L1.GT.L2) GOTO 670
      DO 660 J=L1,L2
      K=A(J)
      III=IABS(N1-NR(K))
      IF(III.GT.KNEW) KNEW=III
  660 CONTINUE
  670 CONTINUE
  680 IF(KOLD.GT.KNEW) GOTO 700
      KNEW=KOLD
      DO 690 I=1,N
  690 A(I)=I
      RETURN
  700 DO 710 I=1,N
  710 A(I)=NR(I)
      RETURN
      END
      SUBROUTINE LEVEL(START,NN,DEPTH,K3,WIDTH)
C
C     GENERATION OF THE LEVELS R(START)
C     DEPTH   DEPTH OF THE LEVELS
C     K3      NUMBER OF NODES IN THE LAST LEVEL
C     WIDTH   WIDTH OF THE LEVELS
C     NN      NUMBER OF ASSOCIATED NODES
C
      INTEGER START,NN,DEPTH,K3,WIDTH
      INTEGER A(4096),VEC(650)
      INTEGER IND(651,8),LIST(651),GRAD(650),NR(650)
      COMMON A,VEC,IND
      EQUIVALENCE (LIST(1),IND(1,1)),(GRAD(1),IND(1,2)),
     *            (NR(1),IND(1,3))
      INTEGER J,I,BEG,END,N1,K,K2,LBR,STARTN,AI,L1,L2
      J=NN
      DO 1 I=1,J
    1 IND(I,8)=0
      BEG=1
      N1=1
      K=1
      LBR=1
      K2=1
      IND(1,6)=START
      IND(START,8)=1
C
    3 K=K+1
      END=N1
      DO 10 J=BEG,END
      STARTN=IND(J,6)
      L1=IND(STARTN,1)
      L2=IND(STARTN+1,1)-1
      DO 5 I=L1,L2
      AI=A(I)
      IF(IND(AI,8).NE.0) GOTO 5
      IND(AI,8)=K
      N1=N1+1
      IND(N1,6)=AI
    5 CONTINUE
   10 CONTINUE
      K3=K2
      K2=N1-END
      IF(LBR.LT.K2) LBR=K2
      BEG=END+1
      IF(K2.GT.0) GOTO 3
C
      DEPTH=K-1
      WIDTH=LBR
      NN=N1
      RETURN
      END
      SUBROUTINE KOMPON(START,N1)
C
C     COMPUTATION OF THE COMPONENT V(I) CONTAINING "START"
C     N1         NUMBER OF INVOLVED NODES
C     IND(I,6)   ALL NODES V(I)
C
      INTEGER START,N1
      INTEGER A(4096),VEC(650)
      INTEGER IND(651,8),LIST(651),GRAD(650),NR(650)
      COMMON A,VEC,IND
      EQUIVALENCE (LIST(1),IND(1,1)),(GRAD(1),IND(1,2)),
     *            (NR(1),IND(1,3))
      INTEGER AI,I,K2,L1,L2,STARTN,J,BEG,END,C,C2
      DATA C/10000/
      C2=2*C
      BEG=1
      N1=1
      IND(START,5)=IND(START,5)-C2
      IND(1,6)=START
C
    3 END=N1
      DO 10 J=BEG,END
      STARTN=IND(J,6)
      L1=IND(STARTN,1)
      L2=IND(STARTN+1,1)-1
      DO 5 I=L1,L2
      AI=A(I)
      IF(IND(AI,5).LT.0) GOTO 5
      IND(AI,5)=IND(AI,5)-C2
      N1=N1+1
      IND(N1,6)=AI
    5 CONTINUE
   10 CONTINUE
      K2=N1-END
      BEG=END+1
      IF(K2.GT.0) GOTO 3
      RETURN
      END
      SUBROUTINE SSORT1(F,L)
C
C     SORTING OF A FROM A(F) TO A(L)
C
      INTEGER F,L
      INTEGER A(4096),VEC(650)
      INTEGER IND(651,8),LIST(651),GRAD(650),NR(650)
      COMMON A,VEC,IND
      EQUIVALENCE (LIST(1),IND(1,1)),(GRAD(1),IND(1,2)),
     *            (NR(1),IND(1,3))
      INTEGER N2,S,T,LS,IS,JS,AH,I,J
      IF(L.LE.F) RETURN
      N2=(L-F+1)/2
      S=1023
      DO 100 T=1,10
      IF(S.GT.N2) GOTO 90
      LS=L-S
      DO 20 I=F,LS
      IS=I+S
      AH=A(IS)
      J=I
      JS=IS
    5 IF(AH.GE.A(J)) GOTO 10
      A(JS)=A(J)
      JS=J
      J=J-S
      IF(J.GE.F) GOTO 5
   10 A(JS)=AH
   20 CONTINUE
   90 S=S/2
  100 CONTINUE
      RETURN
      END
      SUBROUTINE SSORT2(VEC,K,F,L)
C
C     SORTING OF VEC FROM I=F TO I=L SO THAT IND(VEC(I),K) IS
C     WEAKLY INCREASING
C
      INTEGER F,L,VEC(650),K
      INTEGER A(4096),VECC(650)
      INTEGER IND(651,8),LIST(651),GRAD(650),NR(650)
      COMMON A,VECC,IND
      EQUIVALENCE (LIST(1),IND(1,1)),(GRAD(1),IND(1,2)),
     *            (NR(1),IND(1,3))
      INTEGER N2,S,T,LS,IS,JS,AH,GH,AJ,I,J
      IF(L.LE.F) RETURN
      N2=(L-F+1)/2
      S=63
      DO 100 T=1,6
      IF(S.GT.N2) GOTO 90
      LS=L-S
      DO 20 I=F,LS
      IS=I+S
      AH=VEC(IS)
      GH=IND(AH,K)
      J=I
      JS=IS
    6 AJ=VEC(J)
      IF(GH.GE.IND(AJ,K)) GOTO 10
      VEC(JS)=VEC(J)
      JS=J
      J=J-S
      IF(J.GE.F) GOTO 6
   10 VEC(JS)=AH
   20 CONTINUE
   90 S=S/2
  100 CONTINUE
      RETURN
      END
Appendix 6:  The Buneman algorithm for solving the Poisson
equation.
Let G = (-1,1) x (-1,1). We want to find a solution for the
problem

    Δu(x,y) = q(x,y),   (x,y) ∈ G
     u(x,y) = φ(x,y),   (x,y) ∈ ∂G.

Depending on the concrete problem, two REAL FUNCTIONs, QUELL
and RAND, have to be constructed to describe q and φ (see the
example at the end below).
The parameter list for the subroutine BUNEMA includes only the
parameter K in addition to the names for QUELL and RAND (which
require an EXTERNAL declaration). We have H = 1./2**K, where H
denotes the distance separating the lattice points. After the
program has run, the COMMON domain will contain the computed
approximations. All further details can be discerned from the
program comments.

BUNEMA calls the subroutine GLSAR. This solves the system of
equations

    A(r)x = b

by means of a factorization of A(r) (cf. Sec. 21). When first
used, GLSAR calls the subroutine COSVEC, which computes a
series of cosine values via a recursion formula. GLSAR also
calls on the routine GAUBDS. The latter solves special
tridiagonal systems of equations via a modified Gaussian
algorithm (LU-splitting). Program BUNEMA contains 50 executable
FORTRAN instructions, and the other subroutines another 42
instructions. In order to make the program easy to read, we
have restricted ourselves to the case G = (-1,1) x (-1,1).
However, it can be rewritten for a rectangular region without
any difficulties.
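A typical call might look as follows (our own sketch, not part
of the original listings; note the EXTERNAL declaration
required for QUELL and RAND, and the COMMON layout copied from
BUNEMA):

C     SKETCH OF A MAIN PROGRAM FOR BUNEMA
      EXTERNAL RAND,QUELL
      REAL W(63,65),P(63,63)
      COMMON W,P
      INTEGER K
      K=5
      CALL BUNEMA(RAND,QUELL,K)
C     FOR K=5 WE HAVE H=1/32; W(32,33) APPROXIMATES U(0,0),
C     SINCE X=(I-32)*H AND Y=(J-33)*H.
      WRITE(6,10) W(32,33)
   10 FORMAT(' APPROXIMATION AT THE ORIGIN:',E15.7)
      STOP
      END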
We want to discuss the numerical results by means of the
following examples:

    Δu(x,y) = 0,              (x,y) ∈ G
     u(x,y) = 1,              (x,y) ∈ ∂G                     (1)
    exact solution: u(x,y) = 1

    Δu(x,y) = -2π² sin(πx) sin(πy),   (x,y) ∈ G
     u(x,y) = 0,              (x,y) ∈ ∂G                     (2)
    exact solution: u(x,y) = sin(πx) sin(πy).

The first example is particularly well suited to an examination
of the numerical stability of the method, since the
discretization error is zero. Thus the errors measured are
exclusively those arising from the solving of the system of
equations, either from the method or from rounding.
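For Example (1), the two problem functions reduce to constants;
a possible coding (ours, in the same style as the functions for
Example (2) reproduced at the end of this appendix) is:

      REAL FUNCTION QUELL(X,Y)
      REAL X,Y
      QUELL=0.0
      RETURN
      END

      REAL FUNCTION RAND(X,Y)
      REAL X,Y
      RAND=1.0
      RETURN
      END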
The computation was carried out on a CDC CYBER 76 (mantissa
length 48 bits for REAL) and on an IBM 370/168 (mantissa
lengths 21-24 bits for REAL*4 and 53-56 bits for REAL*8).
Table 1 contains the maximal absolute error for the computed
approximations. Here

    H  = 1/2**K = 2/(N+1)
    N  : number of lattice points in one direction
    N² : dimension of the system of equations
Example    N     CYBER 76       370/168        370/168
                 REAL           REAL*4         REAL*8

  (1)        3   0.71E-14       0.77E-6        0.11E-15
             7   0.43E-13       0.77E-6        0.54E-15
            15   0.23E-12       0.10E-4        0.72E-15
            31   0.58E-12       0.55E-4        0.15E-13
            63   0.15E-11       0.45E-3        0.85E-13
           127   0.68E-11       0.14E-2        0.30E-12

  (2)        3   0.23E0         0.23E0         0.23E0
             7   0.53E-1        0.53E-1        0.53E-1
            15   0.13E-1        0.13E-1        0.13E-1
            31   0.32E-2        0.32E-2        0.32E-2
            63   0.80E-3        0.69E-3        0.80E-3
           127   0.20E-3        0.31E-3        0.20E-3

                 Table 1.  Absolute error
Since the extent of the system of equations grows as N², one
would expect a doubling of N in Example (1) to lead to a
fourfold increase in the rounding error. This is almost exactly
what happened in the computation on the CYBER 76, averaged over
all N. On the IBM, the mean increase in error per step was
somewhat greater.

The values for Example (1) and Example (2) show that for a
REAL*4 computation on the IBM 370, N should in no case be
chosen larger than 63. For greater mantissa lengths, there is
no practical stability bound on either machine.

Table 2 contains the required computing times, exclusive of the
time required to prepare the right side of the system of
equations. In the compilation (with the exception of the G1
compiler), the parameter OPT = 2 was used.
Machine        CYBER 76    370/168       370/168       370/168
computation    REAL        REAL*4        REAL*8        REAL*4
compiler       FTN         H-Extended    H-Extended    G1

N =  31        0.03        0.04          0.04          0.07
     63        0.13        0.19          0.22          0.31
    127        0.55        0.85          1.04          1.44

           Table 2.  Computing times in seconds
      SUBROUTINE BUNEMA(RAND,QUELL,K)
C
C     PARAMETERS
C     FUNCTIONS RAND, QUELL
C
      INTEGER K
C
C     VARIABLES OF THE COMMON BLOCKS
C
      REAL W(63,65),P(63,63),Q(63,65)
      EQUIVALENCE (W(1,1),Q(1,1))
      COMMON W,P
C
C     LOCAL VARIABLES
C
      INTEGER I,J,J1,J2,J3,J4,J5,K1,K2,KMAX,R,R1,R2
      REAL B(63),X,Y,H,H2
C
C     MEANING OF THE VARIABLES
C
C     H      DISTANCE OF THE LATTICE POINTS, H=1/2**K. THE
C            PROGRAMME IS TERMINATED IN CASE OF K.LT.1.
C     H2     =H**2
C     K1     =2**(K+1)-1
C     K2     =2**(K+1)
C     P,Q    COMPARE DESCRIPTION OF THE METHOD. AS STARTING
C            VALUES, ZEROS ARE ASSIGNED TO P AND THE RIGHT-HAND
C            SIDE OF THE SYSTEM IS ASSIGNED TO Q.
C     W      AFTER PERFORMING THE PROGRAMME, W(I,J) CONTAINS AN
C            APPROXIMATION TO THE SOLUTION U(X,Y) AT THE
C            INTERIOR GRID POINTS. (I,J) AND (X,Y) ARE RELATED
C            BY
C            X=(I-2**K)*H,   I=1(1)K1
C            Y=(J-1-2**K)*H, J=2(1)K2.
C            TO SIMPLIFY THE SOLUTION PHASE OF THE PROGRAMME,
C            W(*,1) AND W(*,K2+1) ARE INITIALIZED BY ZEROS.
C            W AND Q ARE EQUIVALENCED. THERE IS NO IMPLICIT USE
C            OF THIS IDENTITY. DURING THE SOLUTION PHASE, THOSE
C            COMPONENTS OF W ARE DEFINED SUCCESSIVELY THAT ARE
C            NO LONGER NEEDED IN Q.
C     B      AUXILIARY STORAGE
C
      KMAX=5
C
C     ARRAY BOUNDS
C
C     THE LENGTHS OF THE ARRAYS W(DIM1,DIM2), P(DIM1,DIM1),
C     Q(DIM1,DIM2), AND B(DIM1) CAN BE CHANGED TOGETHER WITH
C     KMAX. IT IS
C     DIM1=2**(KMAX+1)-1
C     DIM2=2**(KMAX+1)+1.
C     ACCORDINGLY, THE LENGTHS OF THE ARRAYS COS2(DIM1) IN
C     SUBROUTINE GLSAR AND A(DIM1) IN SUBROUTINE GAUBDS MUST BE
C     CHANGED. THE PROGRAMME TERMINATES IF K.GT.KMAX.
C
      IF((K.LT.1).OR.(K.GT.KMAX))STOP
      K2=2**(K+1)
      K1=K2-1
      H=1.0/FLOAT(2**K)
      H2=H**2
C
C     ASSIGN ZEROS TO P AND TO PARTS OF W
C
      DO 10 J=1,K1
      W(J,1)=0.0
      W(J,K2+1)=0.0
      DO 10 I=1,K1
   10 P(I,J)=0.0
C
C     STORE THE RIGHT-HAND SIDE OF THE SYSTEM ON Q
C
      Y=-1.0
      DO 120 J=2,K2
      Y=Y+H
      X=-1.0
      DO 110 I=1,K1
      X=X+H
  110 Q(I,J)=H2*QUELL(X,Y)
      Q(1,J)=Q(1,J)-RAND(-1.0,Y)
  120 Q(K1,J)=Q(K1,J)-RAND(1.0,Y)
      X=-1.0
      DO 130 I=1,K1
      X=X+H
      Q(I,2)=Q(I,2)-RAND(X,-1.0)
  130 Q(I,K2)=Q(I,K2)-RAND(X,1.0)
C
C     REDUCTION PHASE, COMPARE EQ. (21.9) OF THE DESCRIPTION
C
      DO 230 R=1,K
      J1=2**R
      J2=K2-J1
      DO 230 J=J1,J2,J1
      J3=J-J1/2
      J4=J+J1/2
      DO 210 I=1,K1
  210 B(I)=P(I,J3)+P(I,J4)-Q(I,J+1)
      CALL GLSAR(R-1,B,K1,K)
      DO 220 I=1,K1
      P(I,J)=P(I,J)-B(I)
  220 Q(I,J+1)=Q(I,J3+1)+Q(I,J4+1)-2.0*P(I,J)
  230 CONTINUE
C
C     SOLUTION PHASE, COMPARE EQ. (21.10) OF THE DESCRIPTION
C
      R2=K+1
      DO 330 R1=1,R2
      R=R2-R1
      J1=2**R
      J2=K2-J1
      J3=2*J1
      DO 330 J=J1,J2,J3
      J4=J+1+J1
      J5=J+1-J1
      DO 310 I=1,K1
  310 B(I)=Q(I,J+1)-W(I,J4)-W(I,J5)
      CALL GLSAR(R,B,K1,K)
      DO 320 I=1,K1
  320 W(I,J+1)=P(I,J)+B(I)
  330 CONTINUE
      RETURN
      END
      SUBROUTINE GLSAR(R,B,N,K)
      INTEGER R,N,K
      REAL B(N)
C
C     SOLUTION OF THE SYSTEM A(R)*X=B BY FACTORING OF A(R)
C     (COMPARE EQ. (21.6) OF THE DESCRIPTION). A(R) IS DEFINED
C     RECURSIVELY BY
C     A(R)=2I-(A(R-1))**2, R=1(1)K
C     WITH A(0)=A=(AIJ), I,J=1(1)N
C     AND
C            -4  IF I=J
C     AIJ =   1  IF I=J+1 OR I=J-1
C             0  OTHERWISE
C
C     THE PROGRAMME FAILS IF N.LT.2; OTHERWISE IT TERMINATES
C     WITH B=X.
C
      INTEGER FICALL,J,J1,J2,JS
      REAL COS2(63)
      DATA FICALL/0/
C
      IF(R.EQ.0) GOTO 30
C
C     THE SUBROUTINE COSVEC IS ONLY CALLED IF THE VECTOR COS2
C     IS NOT YET COMPUTED FOR THE ACTUAL VALUE OF K.
C
      IF(FICALL.EQ.K) GOTO 1
      CALL COSVEC(K,COS2)
      FICALL=K
C
    1 DO 10 J=1,N
   10 B(J)=-B(J)
      J1=2**(K-R)
      J2=(2**(R+1)-1)*J1
      JS=2*J1
C
C     BECAUSE OF COS2(J)=2*COS(J*PI/2**(K+1))=2*COS(I*PI)
C     THE DOMAIN OF THE INDEX I IS
C     I=2**(-R-1) (2**(-R)) 1-2**(-R-1).
C
      DO 20 J=J1,J2,JS
   20 CALL GAUBDS(COS2(J),B,N)
      GOTO 40
C
   30 CALL GAUBDS(0.0,B,N)
C
   40 RETURN
      END
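The loop DO 20 J=J1,J2,JS realizes the factorization of A(R)
into tridiagonal factors. In our notation (a summary of the
situation described in Section 21, not part of the original
comments), the recursion A(r) = 2I - (A(r-1))**2 implies, for
r >= 1,

    A(r) = -(A + 2cos(θ1)I)(A + 2cos(θ2)I)···(A + 2cos(θs)I),
    θj = (2j-1)π/2^(r+1),  j = 1(1)s,  s = 2^r.

Since COS2(J) = 2cos(Jπ/2^(K+1)) and J runs from 2^(K-R) to
(2^(R+1)-1)·2^(K-R) in steps of 2^(K-R+1), the values passed to
GAUBDS are exactly the numbers 2cos(θj); the initial sign
change B = -B accounts for the leading factor -1.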
      SUBROUTINE GAUBDS(C,B,N)
      INTEGER N
      REAL C,B(N)
C
C     SOLUTION OF A LINEAR SYSTEM WITH SPECIAL TRIDIAGONAL
C     MATRIX:
C     X(I-1)+(-4+C)*X(I)+X(I+1)=B(I), I=1(1)N,
C     WHERE X(0) AND X(N+1) ARE VANISHING. THE PROGRAMME
C     TERMINATES WITH B=X. IN CASE OF N.LT.2 THIS SUBROUTINE
C     FAILS. IF N.GT.63 THE ARRAY BOUNDS MUST BE ENLARGED.
C
      INTEGER I,I1
      REAL Q,C4,A(63)
C
      C4=C-4.0
      A(1)=C4
      DO 10 I=2,N
      Q=1.0/A(I-1)
      A(I)=C4-Q
   10 B(I)=B(I)-B(I-1)*Q
      Q=B(N)/A(N)
      B(N)=Q
      I1=N-1
      DO 20 I=2,N
      Q=(B(I1)-Q)/A(I1)
      B(I1)=Q
   20 I1=I1-1
      RETURN
      END
      SUBROUTINE COSVEC(K,COS2)
      INTEGER K
      REAL COS2(1)
C
C     COMPUTATION OF
C     COS2(J)=2*COS(J*PI/2**(K+1)), J=1(1)2**(K+1)-1
C     BY MEANS OF RECURSION AND REFLECTION. THE PROGRAMME FAILS
C     FOR K.LT.1.
C
      INTEGER K2,J,J1
      REAL DC,T,CV,PI4
C
      K2=2**K
      J1=2*K2-1
      PI4=ATAN(1.0)
      COS2(K2)=0.0
      DC=-4.0*SIN(PI4/FLOAT(K2))**2
      K2=K2-1
      T=DC
      CV=2.0+DC
      COS2(1)=CV
      COS2(J1)=-CV
      DO 10 J=2,K2
      J1=J1-1
      DC=T*CV+DC
      CV=CV+DC
      COS2(J)=CV
   10 COS2(J1)=-CV
      RETURN
      END
EXAMPLE (MENTIONED IN THE TEXT):

      REAL FUNCTION QUELL(X,Y)
      REAL X,Y
      DATA PI/3.14159265358979/
      QUELL=-2.0*PI*PI*SIN(PI*X)*SIN(PI*Y)
      RETURN
      END

      REAL FUNCTION RAND(X,Y)
      REAL X,Y
      RAND=0.0
      RETURN
      END
BIBLIOGRAPHY

Abramowitz, M., Stegun, I. A.: Handbook of Mathematical Functions, New York: Dover Publications 1965.

Ahlfors, L. V.: Complex Analysis, New York-Toronto-London: McGraw-Hill 1966.

Ansorge, R., Hass, R.: Konvergenz von Differenzenverfahren für lineare und nichtlineare Anfangswertaufgaben, Lecture Notes in Mathematics, Vol. 159, Berlin-Heidelberg-New York: Springer 1970.

Beckenbach, E. F., Bellman, R.: Inequalities, Berlin-Heidelberg-New York: Springer 1971.

Birkhoff, G., Schultz, M. H., Varga, R. S.: "Piecewise hermite interpolation in one and two variables with applications to partial differential equations," Numer. Math., 11, 232-256 (1968).

Busch, W., Esser, R., Hackbusch, W., Herrmann, U.: "Extrapolation applied to the method of characteristics for a first order system of two partial differential equations," Numer. Math., 24, 331-353 (1975).

Buzbee, B. L., Golub, G. H., Nielson, C. W.: "On direct methods for solving Poisson's equations," SIAM J. Numer. Anal., 7, No. 4, 627-656 (1970).

Coddington, E. A., Levinson, N.: Theory of Ordinary Differential Equations, New York-Toronto-London: McGraw-Hill 1955.

Collatz, L.: Funktionalanalysis und Numerische Mathematik, Berlin-Göttingen-Heidelberg: Springer 1964.

Collatz, L.: The Numerical Treatment of Differential Equations, Berlin-Heidelberg-New York: Springer 1966.

Courant, R., Friedrichs, K. O., Lewy, H.: "Über die partiellen Differenzengleichungen der mathematischen Physik," Math. Ann., 100, 32-74 (1928).

Cuthill, E., McKee, J.: "Reducing the bandwidth of sparse symmetric matrices," Proc. 24th ACM National Conference, 157-172 (1969).

Dieudonné, J.: Foundations of Modern Analysis, New York-London: Academic Press 1960.

Dorr, F. W.: "The direct solution of the discrete Poisson equation on a rectangle," SIAM Rev., 12, No. 2, 248-263 (1970).

Fox, L. (Editor): Numerical Solution of Ordinary and Partial Differential Equations, London: Pergamon Press 1962.

Friedman, A.: Partial Differential Equations, New York: Holt, Rinehart and Winston 1969.
Friedrichs, K. O.: "Symmetric hyperbolic linear differential equations," Comm. Pure Appl. Math., 7, 345-392 (1954).

Gerschgorin, S.: "Fehlerabschätzungen für das Differenzenverfahren zur Lösung partieller Differentialgleichungen," ZAMM, 10, 373-382 (1930).

Gibbs, N. E., Poole, W. G., Stockmeyer, P. K.: "An algorithm for reducing the bandwidth and profile of a sparse matrix," SIAM J. Numer. Anal., 13, 236-250 (1976).

Gilbarg, D., Trudinger, N. S.: Elliptic Partial Differential Equations of Second Order, Berlin-Heidelberg-New York: Springer 1977.

Gorenflo, R.: "Über S. Gerschgorins Methode der Fehlerabschätzung bei Differenzenverfahren." In: Numerische, insbesondere approximationstheoretische Behandlung von Funktionalgleichungen, Lecture Notes in Mathematics, Vol. 333, 128-143, Berlin-Heidelberg-New York: Springer 1973.

Grigorieff, R. D.: Numerik gewöhnlicher Differentialgleichungen, Studienbücher, Bd. 1, Stuttgart: Teubner 1972.

Hackbusch, W.: Die Verwendung der Extrapolationsmethode zur numerischen Lösung hyperbolischer Differentialgleichungen, Universität zu Köln: Dissertation 1973.

Hackbusch, W.: "Extrapolation to the limit for numerical solutions of hyperbolic equations," Numer. Math., 28, 455-474 (1977).

Hellwig, G.: Partial Differential Equations, Stuttgart: Teubner 1977.

Householder, A. S.: The Theory of Matrices in Numerical Analysis, New York: Blaisdell 1964.

Janenko, N. N.: The Method of Fractional Steps, Berlin-Heidelberg-New York: Springer 1971.

Jaswon, M. A.: "Integral equation methods in potential theory I," Proc. R. Soc., Vol. A 275, 23-32 (1963).

Kellogg, O. D.: Foundations of Potential Theory, Berlin: Springer 1929.

Knuth, D. E.: The Art of Computer Programming, Reading, Massachusetts: Addison-Wesley 1973.

Kreiss, H. O.: "On difference approximations of the dissipative type for hyperbolic differential equations," Comm. Pure Appl. Math., 17, 335-353 (1964).

Lax, P. D.: "On the stability of difference approximations to solutions of hyperbolic equations with variable coefficients," Comm. Pure Appl. Math., 14, 497-520 (1961).

Lax, P. D., Wendroff, B.: "On the stability of difference schemes," Comm. Pure Appl. Math., 15, 363-371 (1962).

Lax, P. D., Nirenberg, L.: "On stability for difference schemes; a sharp form of Gårding's inequality," Comm. Pure Appl. Math., 19, 473-492 (1966).

Lehmann, H.: Fehlerabschätzungen in verschiedenen Normen bei eindimensionaler und zweidimensionaler Hermite-Interpolation, Universität zu Köln: Diplomarbeit 1975.

Lierz, W.: Lösung von großen Gleichungssystemen mit symmetrischer schwach besetzter Matrix, Universität zu Köln: Diplomarbeit 1975.

Loomis, L. H., Sternberg, S.: Advanced Calculus, Reading, Massachusetts: Addison-Wesley 1968.

Magnus, W., Oberhettinger, F., Soni, R. P.: Formulas and Theorems for the Special Functions of Mathematical Physics, New York: Springer 1966.

Meis, Th.: "Zur Diskretisierung nichtlinearer elliptischer Differentialgleichungen," Computing, 7, 344-352 (1971).

Meis, Th., Törnig, W.: "Diskretisierungen des Dirichletproblems nichtlinearer elliptischer Differentialgleichungen." In: Methoden und Verfahren der Mathematischen Physik, Bd. 8, Herausgeber: B. Brosowski und E. Martensen, Mannheim-Wien-Zürich: Bibliographisches Institut 1973.

Meuer, H. W.: Zur numerischen Behandlung von Systemen hyperbolischer Anfangswertprobleme in beliebig vielen Ortsveränderlichen mit Hilfe von Differenzenverfahren, Technische Hochschule Aachen: Dissertation 1972.

Mizohata, S.: The Theory of Partial Differential Equations, Cambridge: University Press 1973.

Natanson, I. P.: Theorie der Funktionen einer reellen Veränderlichen, Berlin: Akademie-Verlag 1961.

Ortega, J. M., Rheinboldt, W. C.: Iterative Solution of Nonlinear Equations in Several Variables, New York-London: Academic Press 1970.

Ostrowski, A. M.: "On the linear iteration procedures for symmetric matrices," Rend. Math. e Appl., 14, 140-163 (1954).

Peaceman, D. W., Rachford, H. H.: "The numerical solution of parabolic and elliptic differential equations," J. Soc. Indust. Appl. Math., 3, 28-41 (1955).

Perron, O.: "Über Existenz und Nichtexistenz von Integralen partieller Differentialgleichungssysteme im reellen Gebiet," Mathem. Zeitschrift, 27, 549-564 (1928).

Petrovsky, I. G.: Lectures on Partial Differential Equations, New York-London: Interscience Publishers 1954.

Reid, J. K.: "Solution of linear systems of equations: direct methods (general)." In: Sparse Matrix Techniques, Lecture Notes in Mathematics, Vol. 572, Editor: V. A. Barker, Berlin-Heidelberg-New York: Springer 1977.

Richtmyer, R. D., Morton, K. W.: Difference Methods for Initial Value Problems, New York-London-Sydney: Interscience Publishers 1967.

Rosen, R.: Matrix Bandwidth Minimization, Proc. 23rd ACM National Conf., 585-595 (1968).

Sauer, R.: Anfangswertprobleme bei partiellen Differentialgleichungen, Berlin-Göttingen-Heidelberg: Springer 1958.

Schechter, S.: "Iteration methods for nonlinear problems," Trans. Amer. Math. Soc., 104, 179-189 (1962).

Schröder, J., Trottenberg, U.: "Reduktionsverfahren für Differenzengleichungen bei Randwertaufgaben I," Numer. Math., 22, 37-68 (1973).

Schröder, J., Trottenberg, U., Reutersberg, H.: "Reduktionsverfahren für Differenzengleichungen bei Randwertaufgaben II," Numer. Math., 26, 429-459 (1976).

Shell, D. L.: "A highspeed sorting procedure," Comm. ACM, 2, No. 7, 30-32 (1959).

Simonson, W.: "On numerical differentiation of functions of several variables," Skand. Aktuarietidskr., 42, 73-89 (1959).

Stancu, D. D.: "The remainder of certain linear approximation formulas in two variables," SIAM J. Numer. Anal., Ser. B, 1, 137-163 (1964).

Stoer, J., Bulirsch, R.: Introduction to Numerical Analysis, New York-Heidelberg-Berlin: Springer 1980.

Swartz, B. K., Varga, R. S.: "Error bounds for spline and L-spline interpolation," J. of Appr. Th., 6, 6-49 (1972).

Thomée, V.: "Spline approximation and difference schemes for the heat equation." In: The Mathematical Foundations of the Finite Element Method with Applications to Partial Differential Equations, Edited by K. Aziz, New York-London: Academic Press 1972.

Törnig, W., Ziegler, M.: "Bemerkungen zur Konvergenz von Differenzenapproximationen für quasilineare hyperbolische Anfangswertprobleme in zwei unabhängigen Veränderlichen," ZAMM, 46, 201-210 (1966).

Varga, R. S.: Matrix Iterative Analysis, Englewood Cliffs: Prentice-Hall 1962.

Walter, W.: Differential and Integral Inequalities, Berlin-Heidelberg-New York: Springer 1970.

Whiteman, J. R. (Editor): The Mathematics of Finite Elements and Applications, London-New York: Academic Press 1973.

Whiteman, J. R. (Editor): The Mathematics of Finite Elements and Applications II, London-New York-San Francisco: Academic Press 1976.

Widlund, O. B.: "On the stability of parabolic difference schemes," Math. Comp., 19, 1-13 (1965).

Witsch, K.: "Numerische Quadratur bei Projektionsverfahren," Numer. Math., 30, 185-206 (1978).

Yosida, K.: Functional Analysis, Berlin-Heidelberg-New York: Springer 1978.

Young, D. M.: Iterative Methods for Solving Partial Differential Equations of Elliptic Type, Harvard University: Thesis 1950.

Zlamal, M.: "On the finite element method," Numer. Math., 12, 394-409 (1968).
Textbooks

Ames, W. F.: Numerical Methods for Partial Differential Equations, New York-San Francisco: Academic Press 1977.

Ansorge, R.: Differenzenapproximationen partieller Anfangswertaufgaben, Stuttgart: Teubner 1978.

Ciarlet, Ph. G.: The Finite Element Method for Elliptic Problems, Amsterdam: North-Holland 1978.

Collatz, L.: The Numerical Treatment of Differential Equations, Berlin-Heidelberg-New York: Springer 1966.

Forsythe, G. E., Rosenbloom, P. C.: Numerical Analysis and Partial Differential Equations, New York: John Wiley 1958.

Forsythe, G. E., Wasow, W. R.: Finite Difference Methods for Partial Differential Equations, New York-London: John Wiley 1960.

John, F.: Lectures on Advanced Numerical Analysis, London: Gordon and Breach 1967.

Marchuk, G. I.: Methods of Numerical Mathematics (Applications of Mathematics, Vol. 2), New York-Heidelberg-Berlin: Springer 1975.

Marsal, D.: Die numerische Lösung partieller Differentialgleichungen in Wissenschaft und Technik, Mannheim-Wien-Zürich: Bibliographisches Institut 1976.

Mitchell, A. R.: Computational Methods in Partial Differential Equations, London-New York-Sydney-Toronto: John Wiley 1969.

Ortega, J. M., Rheinboldt, W. C.: Iterative Solution of Nonlinear Equations in Several Variables, New York-London: Academic Press 1970.

Ralston, A., Wilf, H. S.: Mathematical Methods for Digital Computers, New York: John Wiley 1960.

Richtmyer, R. D., Morton, K. W.: Difference Methods for Initial Value Problems, New York-London-Sydney: Interscience Publishers 1967.

Schwarz, H. R.: Methode der finiten Elemente, Stuttgart: Teubner 1980.

Smith, G. D.: Numerical Solution of Partial Differential Equations, London: Oxford University Press 1969.

Stoer, J., Bulirsch, R.: Introduction to Numerical Analysis, New York-Heidelberg-Berlin: Springer 1980.

Strang, G., Fix, G. J.: An Analysis of the Finite Element Method, Englewood Cliffs: Prentice-Hall 1973.

Varga, R. S.: Matrix Iterative Analysis, Englewood Cliffs: Prentice-Hall 1962.

Wachspress, E. L.: Iterative Solution of Elliptic Systems, Englewood Cliffs: Prentice-Hall 1966.

Young, D. M.: Iterative Solution of Large Linear Systems, New York-London: Academic Press 1971.

Zienkiewicz, O. C.: The Finite Element Method in Engineering Science, London: McGraw-Hill 1971.
INDEX

Alternating direction implicit (ADI) method, 176, 177
Amplification matrix, 128
Antitonic, 243
Asymptotic expansion, 194
Attractive region, 337

Banach fixed point theorem, 53
Banach spaces, 40ff.
Band matrices, 175, 404, 503
Band width reduction, 402ff.
Biharmonic equation, 220
Boundary collocation, 318
Boundary distant points, 249, 258
Boundary integral methods, 328ff.
Boundary value problems, 207ff.
Bulirsch sequence, 196
Buneman algorithm, 417ff., 522ff.

Calculus of variations, 207, 208 (see also variational methods)
Cauchy-Riemann equations, 18, 216
Cauchy sequence, 41
Characteristic direction, 23, 27
Characteristic methods, 31ff., 88, 89
Characteristics, 5, 7, 19ff., 24, 27
Classical solution, 3, 232
Collocation methods, 317ff.
Collocation points, 318
Consistency, 61, 70, 233
  order of, 65
Contraction mapping, 337
Contraction theorem, 53
Convergence, 62, 70
  order of, 65
Convex function, 386
COR algorithm, 420
CORF algorithm, 421
Courant-Friedrichs-Lewy condition, 89
Courant-Isaacson-Rees method, 85, 111, 118, 143
Crank-Nicolson method, 75, 137, 138, 174
Cyclic odd/even reduction (COR) algorithm, 420
  with factorization (CORF), 421

Definite, 21
Derivative in a Banach space, 50
Diagonal dominant, 243
Difference methods
  in Banach spaces, 61
  for boundary value problems, 229ff.
  Fourier transforms of, 119ff.
  with positivity properties, 97ff.
  stability of, 55ff.
Difference operator, 70
Difference star, 278, 428, 434
  truncated, 441
Direct methods, 402, 403
Domain of dependence, 8, 87, 88
Domain of determinancy, 8, 87, 88

Elimination methods, 402
Elliptic equations
  definition, 22, 26, 29
  methods for solving, 207ff.
Euler equation, 208
Explicit method, 75
Extrapolation methods, 192ff.

Finite element methods, 275
Fixed point, 53, 336
FORTRAN, 444
Fourier integral, 122
Fourier series, 121
Fourier transforms, 13
  of difference methods, 119ff.
  m-dimensional, 170
Friedrichs method, 82, 142, 180, 197, 203
  m-dimensional, 182
Friedrichs theorem, 116

Galerkin method, 286
Gauss-Seidel method, 359, 363
Gaussian elimination, 402, 503
Generalized solution, 56, 90
Gibbs-Poole-Stockmeyer algorithm, 407, 503ff.
Global extrapolation, 195, 449, 476
Green's function, 296

Heat equation, 12, 14, 107, 137, 141, 228
  nonlinear, 16, 459ff.
Helmholtz equation, 209, 218
Hermite interpolation, 290ff.
  piecewise, 302
  two-variable, 304
Hyperbolic equations
  definition, 22, 26, 29
  methods for solving, 31ff.

Implicit method, 75
Initial boundary value problems, 10
Initial value problems, 1ff.
  in Banach spaces, 55
  inhomogeneous, 89ff.
  in several space variables, 168ff.
Integral in a Banach space, 52
Interior collocation, 319
Irreducible diagonal dominant, 243
Isotonic, 243
Iterative methods for systems of equations, 334ff.

Jacobi method, 359

Kreiss theorem, 67, 119

Laplacian, 209
Lattice, 230
Lattice function, 69
Lax-Nirenberg theorem, 157
Lax-Richtmyer theory, 40ff.
Lax-Richtmyer theorem, 62
Lax-Wendroff method, 144, 204
Lax-Wendroff-Richtmyer method, 185, 469ff.
Level structure, 410
Linear operator, 47
Lipschitz condition, 1
Local extrapolation, 202, 203
Local stability, 154
Locally convergent iteration, 337

Massau's method, 36, 447ff.
Maximum-minimum principle, 210
Mesh size, 230
M-matrix, 242
Monotone type, equations of, 244
Multi-index, 168
Multiplace method, 268
Multistep methods, 341

Negative definite, 21
Neville scheme, 203
Newton-Kantorovich theorem, 344
Newton's method, 335, 342
Norm, 40
  of linear operator, 47
  Sobolev, 271
Numerical viscosity, 87, 266

Optimally stable, 89
Order of consistency, 65
Order of convergence, 65
Ostrowski theorem, 366
Overrelaxation methods
  for linear systems, 363ff.
  for nonlinear systems, 383ff.

Parabolic equations
  definition, 22, 29
  in the sense of Petrovski, 17, 124, 152
Peano kernel, 295
Poincaré theorem, 386
Poisson equation, 209, 217, 223, 226, 280, 379, 484, 522ff.
Poisson integral formula, 213
Positive definite, 21, 387
  difference method, 97, 115
Positive difference methods, 97ff.
Potential equation, 18, 212
Product method, 179
Properly posed
  boundary value problems, 207ff.
  initial value problems, 1ff., 55

Quadratic form, 21
Quasilinear, 20, 26, 30

Relaxation parameter, 364
Ritz method, 272, 310ff.
Romberg sequence, 196

Schröder-Trottenberg method, 426ff.
Schrödinger equation, 17
Semilinear, 20, 26
Semiorder, 241
Single step method, 341, 358
Sobolev norm, 271
SOR, see successive overrelaxation
SOR-Newton method, 384
Solution operator, 56, 57
Sparse matrix, 275, 402
Stability of difference methods, 55ff., 103, 116, 172, 233
  definition, 62, 70
Standard lattice, 231
Strongly finite difference method, 69
Successive overrelaxation (SOR) method
  for linear systems, 363ff.
  for nonlinear systems, 383ff.

Total step method, 358
Totally implicit method, 75, 80, 459ff.
Translation operator, 74
Triangulation, 275
Tridiagonal matrices, 404, 503
Type of a partial differential equation, 19ff.

Uniform boundedness, 50
Uniformly elliptic, 22
Uniformly hyperbolic, 22

Variational methods, 270ff.
Von Neumann condition, 131

Wave equation, 11, 111, 147, 228, 474
  generalized, 183, 190
Weakly cyclic, 370
Weakly stable, 135
Weierstrass approximation theorem, 44
Weierstrass P-functions, 325
Wirtinger calculus, 215

Young's theorem, 373