This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
, IR
+
j
IR
In the second integral we can make the substitution and then call
x x.
Then we have
D = fIRVp(x)[O(x+uox)-,p(x)]u(x)TEuu(x+uox)dx
-
f1R (x+uax)[ip(x+uox)- (x)]u(x+uAx)TE11u(x)dx.
By the symmetry of
Eu,
u(x)TEUu(x+PAx)
=
u(x+uex)TEUU(x)
and therefore, D = -jR[V+(x+pAx)-$(x)]2u(x)TEUu(x+uax)dx I
IDI
2
< L24k2(Ax)211E V112 IIuIIIn
the second case, we have D = j Y(x)[(x+u0x)-*(x)]u(x)TFUU(x+uox)dx IR
-
f1R
u(x-uox)dx.
x-pAx
Fourier transforms of difference methods
9.
165
Using the fact that u(x)TFUU(x+uox)
=
-u(x+uox)TFuu(x)
and making the same substitution, we get D =
-!IIt
IDI
< L24k2(ax)2 IIFUHz IIu1IZ.
[iV(x+uAx)- (x)]Iu(x)TFUU(x+uAx)dx
We set
Proof of Theorem 9.34: 1/2
y = h
and
8 = 1/y = hl/2
h < ho = (A/6k)2, then
If
6kytx = 6kyh/A <
1
and
2ktx = 2kh/A < 8.
Thus the conditions of Lemmas 9.38 and 9.39 are satisfied. We can approximate the matrices [-38,38]
and
on the interval
Du(x)
with a linear combination of the matrices
Du(-38)
Du(38), namely,
Du(x) = s[(38-x)DU(-38)+(38+x)DU(38)] + Z(x,u,8) Since the second derivatives of the elements of bounded independently of The constant the method
M2 MD.
x, we have
On the smaller interval (38 ± x) ? 16.
The functions V+1(x)
=
[(38-x)/68]1/2
2(x) = [(38+x)/68]1/2
are
IIZ(x,u,8)112 < M28
depends only on the functions
68
D
2 .
Du, i.e., on
[-28,28], we have
166
INITIAL VALUE PROBLEMS
I.
are continuously differentiable on this interval.
ute values of the first derivatives of bounded by
*1
and
The absolare
*2
x,x a [-28, 28], j = 1,2, it follows
1/8.
For
(x)
j(x)I < sIx xI.
that
The above interpolation formula for
(9.40)
D(x)
can now be written
as
Du(x) _ iP1(x)2D11 (-38) + * 2(x)2Du(38) + 2(x,v,B), u = -2k(1)2k,
For fixed but arbitrary v = --(1)-
x c [-28, 28].
u e L2(R,IRn)
we define
(cf. Lemmas 9.37 and 9.38).
uv =
We have
Support(uv) C [(v-1)8, (v+1)8], and hence by Lemma 9.39 together with equation (9.40),
I
I
I
I
o,Q38(h)(uo)>'
< [(4k+1)M282 + 2K8-2(0x)2] IIuoIIZ _ (2kM2 + 2K/A2)h IIuoIIZ = M3h IIuoIIZ. The scalar products
9.
Fourier transforms of difference methods
Analogously, we have for all
v = --(1)-
167
that
Since
-M1y2L2 (Ax) 2 IIuII2 - M3h
>
y2(Ax)2
=
h/A2
v=-.
H uV II2.
and since by Lemma 9.37(3),
IIuv112 =IIuII2 we have that
> -M4h IIuII2 where
M4 = M1L2/a2 + M3. Applying Lemma 9.35 yields
> -M4h
IIuII2 - M5h IIuII2
-
(l+M6h)1/2 < 1+M7h.
We will not give an application of Theorem 9.34 at this time. In the next section, we will return to initial value problems in several variables, and there we will use the theorem to help show that the generalization of the Lax-Wendroff method to variable coefficients is stable.
There does not seem to
be any simpler means of establishing the stability of that method.
168
INITIAL VALUE PROBLEMS
I.
Initial value problems in several space variables
10.
So far we have only investigated partial differential equations in one time variable, t and one space variable, x.
For pure initial value problems, the main results of the previous chapter can be extended effortlessly to partial differ]Rm+l with one time variable
ential equations in space variables m >
x1,
xm.
t
explain the situation for
m
We have avoided the case
until now for didactic and notational reasons.
1
and
We will
in this section with the
m > 1
aid of typical examples.
Initial boundary value problems, in contrast to pure initial value problems, are substantially more complicated when
m >
1.
The additional difficulties, which we cannot
discuss here, arise because of the varying types of boundaries.
The problems resemble those which arise in the study of
boundary value problems (undertaken in Part II).
Throughout this chapter we will use the notation m
x = (xl,...,xm) eIR
y = (YI,...ym) eIRm m
<x,y> =
dx = dxl...dxm,
I
xuYU'
u=1
In addition, we introduce the multi-indices e a
s = (sl) .... s
The translation operator replaced by e{u)
m
TAx of
IR1
different operators
_ (eemu)),
e(u) = 6 uv$
(cf. Definition 6.2) is Tku
in
IRm:
u = l(1)m u,v = 1(l)m
V
Tku(x) = x + ke(u),
m
m x e IR,
k eIR,
u = l(1)m.
10.
Problems in several space variables
For all
169
let
fE
x E Rm.
Tku(f)(x) = f(Tku(x)),
With this definition, the translation operators become bijective continuous linear mappings of
L2(IR m,tn)
For all
They commute with each other.
v e 7l
into itself.
we have
Tku = Tvk,u Let
have bounded spectral norm
B E
II B (x) 112 .
The map
f(') - B(.)f(.) is a bounded linear operator in relations for
Tku
and
B
L2ORm,¢n).
The commutativity
are
Tku°B(x) = B(Tku(x))Tku B(x)'Tku
In many cases, B
Tku0B(T-ku(x)). =
will satisfy a Lipschitz condition
IIB(x)-B(Y) 112 < L I1x-Y112. Then, IIB(x)°Tk1i -Tku°B(x)112 < L1kj.
For
k cIR
and arbitrary multi-indices m s
Tk =
su
fl Tku
.
Vj=l
The difference method MD = {C(h)Ih e(0,ho]}
can now be written in the form
s, we define
170
INITIAL VALUE PROBLEMS
I.
C(h) _ (I B5(x,h)Tk)
(E A5(x,h)Tk).
All sums, here and henceforth, extend only over finitely many Also we assume that for all
multi-indices.
s, x, and
h,
As(x,h),Bs(x,h) E MAT(n,n,IR) k = h/A
A IR+.
where
k =
or
Analogously to Definition 9.12, we can assign to each difference method an amplification matrix exp(ik<s,y>)Bs(x,h))-1(1
G(h,y,x) =
exp(ik<x,y>)As(x,h)).
(X
s If the matrices of
s As(x,h)
and
B5(x,h)
are all independent
x, we speak of a method with space-free coefficients.
Then we abbreviate As(x,h), Bs(x,h), G(h,y,x) to
As(h), Bs(h), G(h,y). The stability of a method with space-free coeff-
icients can again be determined solely on the basis of the amplification matrix
G(h,y).
Theorems 9.13 and 9.15 extend
word for word to the Banach spaces placed by
case the
m = 1.
if
IF
is re-
Theorems 9.16, 9.31, and 9.34 also carry over
IItm.
in essence.
L2(Rm,cn)
All the proofs are almost the same as for the Basically, the only additional item we need is
m-dimensional Fourier transform, which is defined for
all f e L2
by (2")-m/2
a,(y) = 7n_(f)
f(x)exp(-i<x,y>)dx
rIl
xll2
= lim a Viw
Problems in several space variables
10.
171
The limit is taken with respect to the topology of As in the case
L2(Iltm,n).
m = 1, we have: is bijective.
(1)
jn
(2)
11-9n11 = II_V -11I = I.
(3)
Fn(Tk(f))(')
=
For differential equations with constant coefficients, the best stability criteria are obtained from the amplification matrix.
Even when the coefficients are not constant,
this route is still available in certain cases, for example, with hyperbolic systems.
Also, one can define
positive definite methods.
positive and
They are always stable.
For positive definite methods, we need (1)
C(h)
I As(x)TS
with
k = h/a.
S
(2)
1
=
E A5(x) s
(3)
All matrices
As(x)
are real, symmetric, and
positive semidefinite. (4)
For all multi-indices
s
and all
x,y EIRm
we
have
IIAs(x) -As (Y) 112
L > 0.
We consider positive methods only in the scalar case If
m = 1.
m > 1, they are of little significance for systems of
differential equations.
This is due to Condition (3) of De-
finition 8.4, which implies that the coefficients of the difference operators commute.
For
m > 1, the coefficients of
most systems of differential equations do not commute, and
172
INITIAL VALUE PROBLEMS
I.
hence neither do the coefficients of the difference operators. The positive methods occur in the Banach space B = If e
lim
If(x)j
= 01
11xL -Here the norm is the maximum norm. also defined in this space. (1)
e(x,h)C(h) _
k= h/h (2)
The operators
For positive
are
methods, we need
as(x,h)Tk + I bs(x,h)Tk0C(h)
k= h X
or
Tk
where
e(x,h) = E [as(x,h) + bs(x,h)],
A e1R+.
x e
he(0,ho]
s
(3)
as ,bs a CO(JRm, 1R) as(x,h) > 0,
bs(x,h) > 0
E as(x,h) > 1. S
For
m > 1, the so-called product methods occupy a
special place.
methods for
They arise from the "multiplication" of Their stability follows directly from
m = 1.
the stability of the factors.
More precisely, we have the
following.
Theorem 10.1:
MD,u
Let
B
be a Banach space and
{Cu(h)lh c (O,ho]},
a family of difference methods for properly posed problems.
u = 1(1)m m
(possibly different)
The difference method
MD = {C(h)lh a (O,ho]} is defined by
C(h) = C1(h)C2(h) ... Cm(h). MD
is stable if one of the following two conditions is
Problems in several space variables
10.
173
satisfied. (1)
For fixed
h e (0,h01, the operators
C}.(h), u =
1(l)m, commute. (2)
There exists a
such that
K > 0
p = 1(1)m,
IICp(h)II < 1+Kh,
h e (O,h0J.
If (1) holds, we can write
Proof:
m
IIC(h)nil < Since each of the
m
p n IIC(h)nil
p=1
factors is bounded, so is the product.
If (2) holds, we have the inequalities IIC(h)nil
< (1+Kh) mn < exp(mKT).
We now present a number of methods for the case In all the examples, A = h/k
or
a = h/k2, depending on the
order of the differential equation. Example 10.2:
Differential equation: m
ut(x,t) =
E
ap[ap(x)apu(x,t)],
ap = a/ax
p=1
where and
ap a C1(IRm, IR) , 0 < S < ap (x) v = 1(1)m. Iavap(x)I < K,
Method: C(h) =
[I-(l-a)AH]-lo[I+aAH]
where m
(a (X + 2kep)(Tkp-I) + ap (x- 2kep)(Tku-I)]
H = p=l
and
a e [0,1].
m > 1.
174
I.
INITIAL VALUE PROBLEMS
Amplification matrix: G(h,y,x) = [l+aaH]/[1-(1-a)Xi} where m
H =
I (au(x+ zkeu)[exp(ikyu)-1] u=1
+ au (x- Zkeu)[exp(-ikyu)-1]}. For
2mKaa <
1
the method is positive, and hence stable.
Subject to the
usual regularity conditions, the global error is at most + 0(k2)
for
a + 1/2
0(h2) + 0(k2)
for
a = 1/2
0(h)
If all
(Crank-Nicolson Method).
are constant, then m H -2 a (1-cos ky ) > -4mK.
au
u =1
-
u
u
Precisely when 2mK(2a-1)d <
1
we have JG(h,y)J <
1
and hence stability.
Theoretically speaking, there is nothing new in the case
m >
1
that wasn't contained in the case
m = 1.
Prac-
tically speaking, the implicit methods (a < 1) with few restricting stability conditions are very time consuming for m > 1.
A large system of linear equations has to be solved
for each time interval.
For
m = 1, the matrix of the system
is triangular, and five arithmetic operations per lattice point and time interval are required for the solution.
Thus
the total effort required by an implicit method is not very large in the case
m = 1.
For
m > 1, the matrix of the
10.
Problems in several space variables
175
system of equations is no longer triangular.
Even with an
optimal ordering of the lattice points, we get a band matrix where the width of the band grows as the number of lattice The solution of the system then requires
points increases.
considerable effort.
a
Example 10.3:
Differential equation: ut(xi,x2,t) = aailu(xl,x2,t) + 2bala2u(xl)x2,t) +
where
a > 0, c > 0, and
ca22u(xl,x2,t)
ac > b2.
Method:
C(h) = [I-(1-a)XH]-lo[I+aaH] a e [0,1]
where
and
H = a(Tkl+Tki-2I) + 1b(Tkl-T-1)(Tk2-Tk2
c(Tk2+Tk2-2I).
+
Amplification matrix: G(h,y) =
[1+aaH]/[1 (1 a)aH]
where
H = -2a[1-cos(kyl)]
2c[l-cos(ky2)].
-
The differential equation differs from the one in the previous example in the term and
2ba1a2u(xl,x2,t).
Since
ac >b2, it is nevertheless parabolic.
method is never positive, regardless of
a
a > 0, c > 0,
When and
b # 0, the X.
A sta-
bility criterion is obtainable only through the amplification matrix.
We set
wl = 2ky1
and
w2 = 2 ky2, and get
176
INITIAL VALUE PROBLEMS
I.
2w
2
H = -4a sin wI-8b sin wl sin w2 cos wl cos w2-4c sin w2 _ -4(a+c)+4a cos2wl-8b sin wl sin w2 cos wl cos w2 + 4c cos`w2. wI + Rn/2
Also, for
and
w2 + Rn/2 (R EZZ), let
c = sgn(sin wI sin w2) n = sgn(cos wl cos w2).
Thus we obtain two representations for
H = -4(/ sin wl - eT sin w2) -8[/aclsin w1 sin w21
H,
2
+ b sin 111 sin w2 cos wI cos w2]
and
H = -4(a+c) + 4(T cos w1 - n/E cos w2) 2 +8[V a7 1cos 11 cos w21-b sin wl sin w2 cos w1 cos w2]. Since
Jbi
< ac, the first term in the square brackets is
always the dominant one.
Hence,
-4(a+c) < H < 0.
Equality can occur at both ends of the expression. 2(a+c)(2a-1)A <
we have
IG(h,y)l
dent of
b.
exist
wl
< 1.
w2
such that
method is unstable for all Example 10.4:
1
This stability condition is indepen-
On the other hand, and
For
a
if
b2 > ac, there always
H > 0 and
IG(h,y)l >
and A.
o
ADI-method.
Differential equation:
ut(xi,x2)t) -a[a11u(x1,x2,t) + a22u(xl,x2,t)] + b1a1u(xl,x2,t) + b2a2u(xl,x2,t)
1.
The
Problems in several space variables
10.
where
a > 0
and
177
bl,b2 cIR.
Method:
C(h) = CI(h/2)oC2(h/2) Cp(h/2) = [Io [1+
? A(Tkp-2I+Tkp)-
4bpka(Tkp-Tkp)]
Zaa(Tko-2I+Tka)+ 4boka(Tka-Tk1)],
p = 1,2 Amplification matrix:
and
a = 3-p.
G(h,y) = GI(h,y)G2(h,y)
GP (h,y)
1-aa(l-coswp) + 1ibp/ sin wp 1+aa(l-coswp)
-
Zibp vET sin wp
and
wl = kyl
and
w2 = ky2.
The abbreviation ADI stands for Alternating Direction ImpZicit method.
The first ADI method was described by Peaceman-
Rachford (1955).
The method is of great practical signifi-
cance for the following reasons. fractions
lbpI/a
is very large.
ilities, one must then choose
k
Suppose that one of the To avoid practical instabvery small.
Otherwise, one
immediately encounters difficulties such as those in Example 9.21.
With an explicit method, the stability condition
(2maa < 1) demands
ah < 4 k2.
Hence
h
has to be chosen extremely small.
cit methods allow
h
Although impli-
to be chosen substantially larger, one
nevertheless has to solve very large systems of equations because the lattice point separation
k
this is also true for ADI methods.
The difference with other
is small.
Of course,
178
I.
INITIAL VALUE PROBLEMS
implicit methods is in the structure of the system of equations.
In each of the factors
C1
and
C2, the systems of
equations decompose into independent subsystems for the latx2 s constant
tice points
xl a constant, and the mat-
and
rices of the subsystems are triangular.
Only five to eight
arithmetic operations per lattice point and half time interval are required, and that is a justifiable effort. Note that
does not belong to
G1
C1.
The factors of
the amplification matrix are exchanged in the representation of the method.
Such a representation of the amplification
matrix is only possible with constant coefficients. In practice, one deals mostly with initial boundary value problems.
Stability then depends also on the nature of
Thus we must caution that the following remark
the region.
Rather
is directly applicable only to rectangular regions.
different results can occur when the region is not rectangular or the differential equations do not have constant coefficients.
We have [1-aa(l-cos wp)]2 + 4bp ha sin2w p
<
2
[l+aa(l-cos wp)]2 + Abp hA sin2wp and hence < 1.
IG(h,Y)I C
is stable for all
are unstable for large
A, although the factors A.
C1
and
C2
In order to solve the triangular
system of equations without pivoting, we need
-aa
>
-Ibplka.
This means k < 2 min(a/Ib1I,
If additionally, as < 1, then tice, it suffices to limit
k;
a/Ib2I) is also positive.
C
h
In prac-
can be chosen arbitrarily
Problems in several space variables
10.
179
0
large.
Example 10.5:
Differential equation:
ut(xl,x2,t) = iaa1a2u(xl,x2,t),
a cIR - {0}.
Method: C(h)
where
a c
[I-(l-a)XH]-lo[I+aAH]
[0,1]
and
H
T_
1
Amplification matrix: G(h,y) =
where
wl = ky1
1-iaaX sin wl sin w2 +ia sin wl sin w2 -a and
The differential equation is
w2 = ky2.
sometimes called a pseudo-parabolic equation.
It corresponds
to the real system a
at
ul(x1,x2,t) =-aala2u2(xl.x2,t)
aT u2(xl'x2,t)
It follows that for
a
_aa1a2u1(x1,x2,t).
u c C4(IR,4)
2
ul(xl,x2,t) _ -a22'11a22u1(xi,x2,t).
Solutions of the differential equation can be computed with Fourier transforms, analogously to Example 9.9. is formally the method of Example 10.3.
but is stable for Example 10.6:
a < 1/2.
The method
It is not positive,
a
Product method for symmetric hyperbolic systems.
Differential equation:
ut(x1,x2,t) = A1(x)alu(xl,x2,t) + A2(x)a2u(xl,x2,t).
180
I.
INITIAL VALUE PROBLEMS
where
Au E C2(IR2,MAT(n,n, ]R))
A(x) symmetric,
P(AV(x))
IIAU(x)-AU(x) 112 < L IIx-xII2,
bounded
x,k c IR2,
u = 1,2.
Method:
C(h) = 4{[I+XAl(x)][I+XA2(x)]Tkl'Tk2 + [I+AA1(x)][I-AA2(x)ITkl°Tk2
+ [I-XA1(x)][I+AA2(x)]Tkl°Tk2 + [I-AA1(x)][I-AA 2(x)]Tkl°Tk2}.
The method can also be derived from the Friedrichs method for m = 1.
To see this, consider the two systems of differential
equations
ut(x1,x2,t) = A1(x)81u(xl,x2,t) ut(xl,x2,t) = A2(x)82u(x1,x2,t).
In the first system, there is no derivative with respect to x2, and in the second, there is none with respect to
x1.
Thus, each system can be solved with the Friedrichs method. The variables of parameters.
x2
and
xi, respectively, only play the role
The methods are
Cu (h) = 2 [I+AAU(x)]Tkp
For
A sup IIAu(x)II <
1
+ 2 [I-XAV(x)]Tku.
the methods are positive definite.
By Theorem 8.12, there is then a 11c11 (h)II < 1+Kh,
K > 0
u = 1,2,
By Theorem 10.1, the product is stable.
such that h c (0,h0].
Problems in several srace variables
10.
181
C(h) = C1(h)°C2(h)
{[I+AA1(x)][I+XA,(x+kel)]Tkl°Tk2
+
T_
I
+ [I-AA1(x)][I+AA2(x-kel)]Tk1,Tk2 + [I-AA1(x)][I-AA2(x-kel)]Tkl°Tk2}. C
and
C
agree up to terms of order
0(h).
Hence
C
is
also stable for max
A
sup
p(Au (x)) < 1.
U=1,2 x EIR2 The consistency of a product method also follows immediately from the consistency of the factors.
We would like to demon-
strate this fact by means of this example.
Let
C2OR2,¢n).
u e
We have h-I[C(h)-I](u)
h-I[C1(h)-I](u) + h-1[C2(h)-I](u)
= +
h{h-2[C1(h)-I]0[C2(h)-I](u)}.
Since the Friedrichs method is consistent, the summands on the right
side are approximations for A1(x)a1u(x,t), A2(x)a2u(x,t)
and
hA1(x)a1[A2(x)a2u(x,t)]. Thus, up to
0(h), the left side is an approximation for A1(x)alu(x,t) + A2(x)a2u(x,t).
This establishes consistency for replaced by
C.
C.
For simplicity, C
This doesn't affect consistency, since
was
182
INITIAL VALUE PROBLEMS
I.
A [C(h)-C(h)] (u) T-
1
+ [I-?LA 1(x)][A2(x-kel)-A2(x)]Tkl0[Tk2-Tk2](u).
The difference is obviously of order h-1[C(h)-I](u)
Usually matrix
C
AI(x)A2(x)
remedied. A2
and
C
Let
C*
as well as
-
0(h2).
It follows that
h-1[C(h)-I](u) = 0(h).
are not positive definite, because the is not symmetric.
This deficiency can be
be formed from
by exchanging
Tkl
and
is positive definite.
Tk2.
C
Then the method
1
(C + C*)/2
Further details
on product methods can be found in Janenko (1971).
a
m-dimensional Friedrichs method.
Differential equation: m
ut(x,t) _
I
Au(x)3uu(x,t)
u=1
where Au e C2clRm,MAT(n,n,]R)) p(AU(x))
symmetric
bounded
(x)-A11 (R) II < L IIx-RII2,
11A
x,k a IItm,
u = 1(1)m.
11
Method: C(h) = I +
-1 Au(x)(Tku-Tku)
2A
and
All of the methods mentioned here are
too complicated for practical considerations.
Example 10.7:
A
m
+ Zm
u-1
E (Tku-2I+Tku) u=1
with r e IR. Amplification matrix: m
G(h,y,x) = (1-r)I + Ai
m
E A (x)sin w u=1
u
+ r I cos w E u m u u=1
183
Problems in several space variables
10.
where
mu = kyu
u = 1(1)m.
for
The differential equation
constitutes a symmetric hyperbolic system. The case
found in Mizohata (1973).
m = 2
The theory can be
was covered in
the preceding example.
The m-dimensional wave equation m
bu(x)au[bu(x)auv(x,t)]
vtt(x,t) u=1
bu a C2 (Rm, IIt+) can be reduced to such a system by means of the substitution
u(x,t) _ (vt(x,t), bi(x)alv(x,t),...,bm(x)amv(x,t)) In this special case, the coefficients of the system are elements of
MAT(m+1,m+1,7R): Au(x)
(a0T)(x))
where
aQT)(x)
_
b(x)
for
a =
bu(x)
for
T
and
T = u+l
and
a = u+l
otherwise.
0
For
1
m = r = 1, this obviously is the Friedrichs method preFor
viously considered.
m = 2, this is simpler than the
product method of Example 10.6.
For
m > 2, the m-dimensional
Friedrichs method is substantially simpler than the product methods which can be created for these cases. C
is consistent for all
we skip the proof. r e (0,1]
and
C
a elR+
and all
r CIR, but
is positive definite exactly when
a max
sup
u=1(1)m x e1Rm
-
p(A (x)) < r/m, u
for it is exactly under these conditions that all the matrices
184
INITIAL VALUE PROBLEMS
I.
(1-r)I, AA
p
are positive semidefinite.
c IR and
-AA
and
+ rI m
p
Ap(x) = CI, p = l(1)m,
For
it follows for
r = 1
+ rI m
mp = n/2
that
IIG(h,y,x) 112 = Acm. By Theorem 9.31, the stability condition, at least in this special case, agrees with the condition under which the method is positive definite.
However, there are also cases
in which the method is stable but not positive definite.
We want to compare the above condition on the Friedrichs method for
m = 2, r - 1
the product method. max
h
with our stability condition for
The former is sup
p=1(1)2 x EIR2
-
p(A (x)) < k/2 p
and the latter, h
max
sup
p(A (x)) < k.
p=l (1) 2 x d R2
U
However, one also has to take the separation of the lattice points into account. (see Figure 10.8).
They are
and
respectively
For the product method, the ratio of the
O
O
O
O
k
k
0-----=-O
0-k
O Friedrichs method
product method Figure 10.8
-
10.
185
Problems in several space variables
maximum possible time increment to this separation never-
/.
theless is greater by a factor of
The product method
provides a better approximation for the domain of determinancy of the differential equation.
That is the general ad-
vantage of the product method, and guarantees it some attenIt is also called optimally stable, which is a way of
tion.
saying that its stability condition is the Courant-FriedrichsLewy condition.
o
Lax-Wendroff-Richtmyer method.
Example 10.9:
Differential equation as in Example 10.7, with the additional conditions
Au c C3(gtm,MAT(n,n, ]R))
IIaQaTAU(x)II2 bounded, u = 1(1)m, o = 1(1)m, T = 1(1)m. Method: m
C(h) =
I
+ So [I +
ZS
+ 2m
)
1
with r e IR S =
(Tku-2I+Tku)l
and Ap(x)(Tku-Tku).
Za
u=1
For
m = r = 1
and
Au(x) = A =_ constant, we have the ordin-
ary Lax-Wendroff method (cf. Example 9.26), for then, with Tk = Tkl, XA(Tk - Tk
S = 2
C(h) = I+ ?AA(Tk-Tkl)+ 1X2A2(Tk-2I+Tk2)+ 4AA(T2-Tk2) ZAA(Tk-Tk1) C(h) =
I
+ 4XA(T2-Tk2) +
1A2A2(Tk-2I+Tk2).
186
INITIAL VALUE PROBLEMS
I.
Replacing
and
k
a/2
by
yields the expression
X
AA(Tk-Tkl) + 2 a2A2(Tk-2I+Tkl).
+
I
by
2k
2
In any case, when
r = 1, C
only contains powers
Tk
with
m
even sums
s
u=1 u
Figure 10.10 shows which lattice points
.
0 O
0
G
k
)E
G-44- 0
O
aE
c
0 Figure 10.10
are used to compute only when
r # 1.
for
C
m = 2.
The points
*
are used
The Lax-Wendroff-Richtmyer method has or-
der of consistency
0(h2).
It is perhaps the most important
method for dealing with symmetric hyperbolic systems. choice
r e (0,1)
The
is sometimes to be recommended for gener-
alizations to nonlinear problems. We present a short sketch of the consistency proof. It follows from m
AU(x)auu(x,t)
ut(x,t) u=1
that
m
m
utt(x,t) _ u=1
For
Av(x)avu(x,t)].
Av(x)au[ v=1
u e C3(IRm,4n), one shows sequentially that 0
Su(x,t) = h E Au (x)a u(x,t) + 0(h3) V
U=1
ZS2u(x,t) =
2h2
E
u-1
A11 (x)ap[ E Av(x)avu(x,t)] + 0(h3) v=1
Problems in several space variables
10.
187
m (Tku-2I+Tku)u(x,t)
2m S
1
1
hk2
E A (x)93 u(x t) + O(h3 ). I uvv u=1 v=1 u '
Altogether, this yields C(h)u(x,t) = u(x,t) + hut(x,t) + Zh2utt(x,t) + O(h3) = u(x,t+h) + O(h3).
We want to derive sufficient stability criteria with the aid of the Lax-Nirenberg Theorem 9.34 (cf. Meuer 1972).
However,
In reducing
the theorem is not directly applicable.
to
C
the normal form B5(x,h)Tk
C(h) _ s
one ordinarily obtains coefficients depend on
h.
For example, for
S2 = 1X2A(x)A(x+g)T2 -
Bs(x,h)
which actually
m = 1, we have
[4A2A(x)A(x+g)
+
4A2A(x)A(x-g)]I
+ 4A2A(x)A(x-g)Tk2 where
A = A1, g = kel, and
But the operator
Tk = Tkl.
Bs(x,0)Tk
C*(h) _ s
has coefficients which are independent of
h.
One easily
shows: (1)
IIC(h)
-
C*(h)II2 = O(h).
Thus
C
both stable or both unstable. (2)
For every
II [C(h) Hence
-
u c Co(1Rmn)
C* (h) ] (u) II2
=
we have
O(h2)
.
C* is at least first order consistent.
and
C*
are
188
INITIAL VALUE PROBLEMS
I.
has amplification matrix
C*
(3)
m
G*(h,y,x) =
+ S((1-r)I + 2 + mI
I
cos w V=1
wV = kyV
where
(V = l(1)m) and m
s = is
E
AV(x)sin wV.
u=1
For
m = 1, we have C(h)
- C*(h) = 8A2A(x)[A(x+g) -
+
8A2A(x)[A(x+g) + A(x-g)
- A(x)]T2 - 2A(x)]I
IX2A(x)[A(x-g) - A(x)IT k. 2
(1) follows immediately.
The proof of (2) depends on the dif-
ferences C(h)
C*(h) = 4X2gA(x)A'(x)[Tk-T-2] + 0(h2).
-
m > 1, we leave this to the reader.
For
Now we can apply the Lax-Nirenberg Theorem 9.34 to
C*.
Then it suffices for stability that
IIG*(h,y,x) 112 < 1. By Theorem 9.31 this condition is also necessary. H
be the product of
matrix
G*(h,y,x)
Now let
with the Hermite transposed
(G*)H, in
P =
AV(x) sin wV
E
V=1
and let
m r1
I
m
=
cos w
V=1
We have of
that
-1 < n <
A, r, and
h.
1.
r1
assumes all these values independently
It follows from the Schwartz inequality
Problems in several space variables
10.
189
2I
m
Cosw
n2 < m u=1
may be represented as follows:
H
H = [IH = P
2X2P2-iX(l-r+rn)P][I-
(I- ZX2P2)2 + X2(1-r+rn)2P2.
is real and symmetric.
real.
ZX2P2+iX(l-r+rn)P]
The eigenvalues
To every eigenvalue
eigenvalue
of
a
H.
1.
of
P
of
P
are
there corresponds an
Thus it is both necessary and suffici-
ent for the stability of greater than
a
a
and
C*
C
that
&
never be
Hence we must examine the following inequal-
ity:
a = (1
-
X2a2)2 + X2(1-r+rn)2a2 < 1.
a = 0, this is always satisfied.
For
if all the matrices
Au(x)
is always zero only
a
are zero everywhere.
trivial case, one has stability for all
X
and
In this Y.
In all
other cases, we can restrict ourselves to those combinations of
x
wu
and
with
p(P) > 0, and consider an equivalent
set of inequalities: 4X2p(P)2 + (1-r+rn) 2 < 1. For
r < 0
or
contradictory. necessary.
r > 1, n < -1/r, this inequality is selfIn the nontrivial cases, then, r E (0,1]
For these
r, the inequalities can be converted
to the equivalent inequalities
4r
We now set
is
X2p(P)2 + n2-(1-r) (1-n)2 < 1.
190
INITIAL VALUE PROBLEMS
I.
max 11=1(1)m
K =
p(A (x))
sup
x elRm
and assert that r e (0,1]
ZAK <
and
is sufficient for the stability of let
w
21
m C*
(10.12)
.
and
C.
be an arbitrary eigenvector of a matrix
"w'12 = 1
and
To see this, P
with
= aw.
P (w)
m
[wTAp(x)w]sin wu.
a = u=1
Again we apply the Schwartz inequality to obtain m
m
[wTA (x)w]2
a2 <
u
u=1
sin2w u=1
u
m
a2 < m K2
sin2wu
I
u=1 m
p(P)2 <
in
K2 I sin2wu u=1 in
A 2p(P)2 < m
E
sinwu
u=1
4rA2p(P)2 + n2 <
1.
This inequality is somewhat stronger than (10.11).
There remains the question whether stability condition (10.12) is at all realistic. matrices
Au(x)
The answer is that whenever the
have some special structure, it is worthwhile
to refer back to the necessary and sufficient condition (10.11). tion.
A well-known example is the generalized wave equa-
As noted in Example 10.7, for this equation we have Au(x)
(aar)())
Problems in several space variables
10.
191
where
a
'
for a = 1, T = p+l and
(x) = by (x)
T= 1,
a = p+l
otherwise.
a6T)(x) = 0 Letting
K=
max sup p=1(1)m x elRm
bp (x)
we also have max
K =
sup
p(A (x)). P
pal (1)m x EIItm But in contrast to the above, m
p(P)2 < KZ
sin2w
Y
U=1
With the help of (10.11), one obtains a condition which is
better by a factor of V: r c
and
(0,1]
ZXK <
.
The same weakening of the stability condition (factor Am-)
is
also possible for the m-dimensional Friedrichs method, in the case of the generalized wave equation. So far we have ignored general methods for which there are different spacings rections
ep.
of the lattice points in the diX = h/k
Instead of
then have possibly ap =h/kp
kp
A = h/k2, one could
or
different step increment ratios
m
or A,, = h/k2.
Such methods have definite practi-
11
Now one can obtain
cal significance.
kl = k2 =
...
= km
with the coordinate transformation xp =
apxp
where
ap
> 0, p = l(l)m
This transformation changes the coefficients of the differential equation.
They are multiplied by
ap
or
a2
or
192
In many cases, the following approach has proved use-
auaV. ful.
INITIAL VALUE PROBLEMS
I.
First transform the coordinates so that the coeffici-
ents mapped into each other by the change of variables are nearly the same.
For a symmetric hyperbolic system this means
p(A (x)) _ sup 1 x e lRm
...
= sup
p(A (x)). M(x)).
x C 1Rm
Then choose the increments independent of ponds to a method with
ku = k/ou
V.
This corres-
in the original coordinate
system.
11.
Extrapolation methods
All of the concrete examples of difference methods which we have discussed so far have been convergent of first or second order.
Such simple methods are actually of great
significance in practice.
This will come as a great surprise
to anyone familiar with the situation for ordinary differential equations, for there in practice one doesn't consider methods of less than fourth order convergence. High precision can only be achieved with methods of high order convergence.
This is especially true for partial
differential equations.
Consider a method with
variables, of k-th order, and with
m
space
h/4x = A a constant.
the computational effort for a fixed time interval
Then
[0,T]
is
O(h-m-1-e)
For explicit methods, e = 0, while for implicit
methods, e >
0
at times.
The latter depends on the amount
of effort required to solve the system of equations. case, m+l+c >
2.
In any
To improve the precision by a factor of
thus is to multiply the computational effort by a factor of q(m+l+e)/k
q
11.
Extrapolation methods
193
In solving a parabolic _'ifferential equation we have
as a rule that 0(h
-m/2 -
1
-
h/(Ax) 2 = c)
A
The growth law
- -or.stant.
for the computational effort appears more
However, a remainder of O(hk) + O((tx)k) _ q(m+2+2e)/k. q = q(m+2+2e)/2k implies q = is only
favorable.
O(hk/2)
achieved with a remainder
O(hk)
0((Ax)2k)
+
=
O(hk).
How then is one to explain the preference for simpler methods in practice?
There are in fact a number of import-
ant reasons for this, which we will briefly discuss. (1)
involved.
In many applications, a complicated geometry is The boundary conditions (and sometimes, insuffici-
ently smooth coefficients for the differential equations) lead to solutions which are only once or twice differentiable. Then methods of higher order carry no advantage.
For ordin-
ary differential equations, there is no influence of geometry or of boundary conditions in this sense; with several space variables, however, difficulties of this sort become dominant. (2)
The stability question is grounds enough to re-
strict oneself to those few types of methods for which there A method which is stable
is sufficient experience in hand.
for a pure initial value problem with equations with arbitrarily often differentiable coefficients, may well lose this stability in the face of boundary conditions, less smooth coefficients, or nonlinearities.
In addition, stability is a
conclusion based on incrementations quite unclear how
h
0
h < h
-
.
o
It is often
depends on the above named influences.
In this complicated theoretical situation, practical experience becomes a decisive factor. (3)
The precision demanded by engineers and physicists
194
I.
is often quite modest.
INITIAL VALUE PROBLEMS
This fact is usually unnoticed in the
context of ordinary differential equations, since the computing times involved are quite insignificant.
As a result, the
question of precision demanded is barely discussed.
As with
the evaluation of simple transcendental functions, one simply uses the mantissa length of the machine numbers as a basis for precision.
The numerical solution of partial differential
equations, however, quickly can become so expensive, that the engineer or physicist would rather reduce the demands for This cost constraint may well be relaxed with
precision.
future technological progress in hardware.
These arguments should not be taken to mean that higher order convergence methods have no future.
Indeed one
would hope that their significance would gradually increase. The derivation of such methods is given a powerful assist by extrapolation methods.
We begin with an explanation of the
basic procedure of these methods.
In order to keep the for-
mulas from getting too long, we will restrict ourselves to problems in and
1R2, with one space and one time variable, x
t.
The starting point is a properly posed problem and a corresponding consistent and stable difference method. solutions for considered.
noted by
h.
s-times differentiable initial functions are The step size of the difference method is deThe foundation of all extrapolation methods is
the following assumption: Assumption:
Only
The solutions
w(x,t,h)
method have an asymptotic expansion
of the difference
Extrapolation methods
11.
r-1
w(x,t,h)
195
y
T,(x,t)h j + p(x,t,h),
=
(x,t)
E G,
v=0
h e (O,h01
where
r >
and
2
11p(x,t,h) JI Tv
= 0(hy r),
G + ¢n,
:
v = 0(l)r-1
0 = Yo < Y1 To
(x,t) a G, h e (O,ho]
.
< Yr.
is the desired exact solution of the problem.
a
We begin with a discussion of what is called global extrapolation.
method for
r
For this, one carries out the difference different incrementations
for the entire time interval. dependent of each other. tk/hj c2Z
for all
j
= 1(1)r, each
computations are in-
r
For each level
t = tk, where
= 1(1)r, one can now form a linear com-
w(x,tk,hl,...,hr)
bination
The
hj, j
of the quantities
w(x,tk,hj)
so that w(x,tk,hip.... hr) = T0(x,y) + R.
Letting
by = qvh, v = 1(1)r, and letting
h
converge to
zero, we get
R = 0(hlr) w
is computed recursively:
Tj,o = w(x,tk,hj+l), T.
j
T J.,v-1 B Jv[T J.
= 0(1)r-l l,v-1-T J.,v-1 ]'
J ,v=
1(1)r-1,
v
j
= v(1)r-1
w(x,tk,hl,...,hr) = Tr-l,r-1' In general the coefficients ways on the step sizes
hj
8jv cIR
depend in complicated
and the exponents
yv.
In the
196
I.
INITIAL VALUE PROBLEMS
following two important special cases, however, the computation is relatively simple. Case 1:
hi = lhj 1, _
= 2(1)r,
Yv
Y 2
v-1
Yv = vy, y > 0, v = 1(1)r, hj
6jv
arbitrary
1
Sjv Case 2:
j
=
arbitrary
1
(h.
Y
-Jh
-1
J
l
The background can be found in Stoer-Bulirsch, 1980, Chapter 2, and Grigorieff (1972), Chapter 5.
This procedure, by the way,
is well-known for Romberg and Bulirsch quadrature and mid-
point rule extrapolation for ordinary differential equations (cf. Stoer-Bulirsch 1980).
In practice, the difference method is only carried out for finitely many values of sible for those
x
Extrapolation is then pos-
x.
which occur for all increments
The
h j*
case
hj/(tx)2 = constant
ratios of the
hj's
presents extra difficulties.
The
are very important, both for the size of
the remainder and the computational effort.
For solving hy-
perbolic differential equations one can also use the Romberg or the Bulirsch sequence. Romberg sequence: hj = h/2j-l,
j
= 1(1)r.
Bulirsch sequence: hl = h, h2j = h/2J,
h2j±1 = h/(3.2J 1),
j > 1.
Because of the difficulties associated with the case hj/(Ax)e - constant, it is wise to use a spacing of the
(Ax)j
11.
Extrapolation methods
197
based on these sequences for solving parabolic differential equations.
In principle, one could use other sequences for
global extrapolation, however.
Before applying an extrapolation method, we ask ourselves two decisive questions: expansion?
Does there exist an asymptotic
What are the exponents
would be optimal.
yv?
Naturally
yv= 2v
Usually one must be satisfied with yv = v.
In certain problems, nonintegral exponents can occur.
In
general the derivation of an asymptotic expansion is a very difficult theoretical problem.
This is true even for those
cases where practical experience speaks for the existence of such expansions.
However, the proofs are relatively simple
for linear initial value problems without boundary conditions.
As an example we use the problem ut(x,t) = A(x)ux(x,t) + q(x,t),
u(x,O)
x SIR, t c (0,T)
x SIR.
4 (X)'
The conditions on the coefficient matrix have to be quite strict.
We demand
A e C (IR, MAT(n,n,IR)) A(x)
real and symmetric,
IIA(x)-A(R) Ij < L2Ix-XI , Let the
w(x,t,h)
IjA(x)II < L1
x,R a IR.
be the approximate values obtained with
the Friedrichs method.
Let a fixed
A = h/Ax > 0
be chosen
and let A sup
p(A(x)) < 1.
x SIR The method is consistent and stable in the Banach space L2OR,4n)
(cf. Example 8.9).
In the case of an inhomogeneous
198
INITIAL VALUE PROBLEMS
I.
equation, we use the formula w(x,t+h,h) = 2[I+XA(x)]w(x+Ax,t,h) +
Theorem 11.1:
Let
Z[I-XA(x)]w(x-Ax,t,h) + hq(x,t).
r e 1N, 0 e Co (R, IRn)
h c (O,h0]
Then it is true for all
[O,T],IRn).
q c Co (IR x
and
that
r-l
w(x,t,h) _
TV(x,t)hV + p(x,t,h),
I
v=0
x cIR, t e [O,T],
t/h c ZZ
TV e co(R x (0,T1, ]Rn) O(hr)
uniformly in
t.
Since there is nothing to prove for
Proof:
pose that
r = 1, we sup-
We use the notation
r > 1.
V = Co(dt, IRT),
W = Co(JR x
[0,T], IRn).
The most important tool for the proof is the fact that for $ c V
q c W, the solution
and
longs to
W.
u
of the above problem be-
This is a special case of the existence and
uniqueness theorems for linear hyperbolic systems (cf., e.g., Mizohata 1973).
For arbitrary
v e W, we examine the differ-
ence quotients
Q1(v)(x,t,h) = h-1{v(x,t+h)
-
Z[v(x+Ox,t)+v(x-Ax,t)]}
Q2(v)(x,t,h) = (2ox)-1{v(x+Ax,t)-v(x-Ax,t)} Q(v) = Q1(v) - A(x)Q2(v)
Although
w(x,t,h)
apply
to
Q
is only defined for
t/h c2Z, one can
w:
q(x,t), x cIR, tc[0,T], t/h cZZ, hc(O,h01.
11.
Extrapolation methods
For
v e W, Q1(v)
and
199
can be expanded separately
Q2(v)
with Taylor's series
Q(v) (x,t,h) = vt(x,t) - A(x)vx(x,t) s
+
hv-1DV(v)(x,t)
+ hsZ(x,t,h).
s
v2 Here s
s
is arbitrary.
c IN
vanishes.
The operators
operators containing order
We have
v.
For fixed
h,
x, t, and
h.
A(x)
For
The quantities
2
to
DV, v = 2(1)00, are differential
as well as partial derivatives of
DV(v) c W. e W.
s = 1, the sum from
The support of
Z(x,t,h)
Z
is bounded.
is bounded for all
tv e W, v = 0(1)r-1
are defined re-
cursively:
v=0: a to(x,t) = A(x)Bz T0(x,t)+q(x,t)
te
x e IR,
To(x,0) = fi(x)
[0,T]
V-1 v>0:
8t TV(x,t) = A(x)8z TV(x,t)-uIODV+1-u(tu)(x,t)
t e [0,T]
x e IR, TV(x,0) =
It follows that tients
Q(TV)
0
TV E W, v = 0(1)r-1.
The difference quo-
yield 2r-1
U(T0)(x,t)+h2r-1z0(x,t,h)
hµ-'D
Q(T0)(x,t,h) = q(x,t)+ u=2
2r-2v-1
V-1
E Dv+1
Q(t)(x,t,h)
u(tu)(x,t)+
+
h2r-2v-lz(x,t,h),
In the last equation, the sum from when
v = r-l.
I
h11- D
u=2
P=O
2
v = 1(1)r-1. to
2r-2v-1
vanishes
Next the v-th equation is multiplied by
hV
INITIAL VALUE PROBLEMS
206
I.
and all the equations are added.
Letting
r-2
u=0
v r-v u-1
h
h
h u-1 D u (t
h
v=0
u=r-v+l
r-1 h
+
u
v 2r-2v-1
r-2 F
V D (T) (x,t)
u=2
v=O
+
Dv+1-u(ru) (x,t)
I
11v
v=l
+
we get
v-1
r-l
Q('1) (x,L,1i) = q(x,t)- E
T = Ervhv
2r v 1
Zv
) (x,t)
(x,t,h).
v=0
The first two double sums are actually the same, except for sign.
To see this, substitute
in the second,
v+u-l
obtaining r-1
r-2
I
v=0 1=v+1
hVD=_v+1 (TV)(x,t)
Then change the order of summation: r-1
u
h
u=1
u-1 1
Du+l-v(TV)(x,t)
v=0
Now the substitution
(u,v)
-
(v,u)
yields the first double
sum.
While the first two terms in this representation of Q(T)
cancel, the last two contain a common factor of
hr.
Thus we get
Q(T)(x,t,h) = q(x,t) + hrZ(x,t,h), x E IR,
Z
t
c
[0,T], t+h E [0,T],
has the same properties as
ous for fixed h c (0,h0]. tion
Z
v
h, bounded for all The quanity
T-W
:
h e (0,h01.
bounded support, continux E IR,
t
e [0,T], and
satisfies the difference equa-
11.
Extrapolation methods
Q(T)(x,t,h) r(x,0,h)
Thus, r-w
-
201
Q(w)(x,t,h) = hrZ(x,t,h)
- w(x,0,h) = 0.
is a solution of the Friedrichs method with initial
function
and inhomogeneity
0
hrZ(x,t,h).
It follows from
the stability of the method and from t/h e2Z
and
h e (0,ho], that for these
IIT(',t,h)
for
L t
and
h,
o
-
From the practical point of view, the restriction to functions
and
q
with compact support is inconsequential
because of the finite domain of dependence of the differential equation and the difference method.
Only the differen-
tiability conditions are of significance. do not have a finite dependency domain. V
W
and
Parabolic equations The vector spaces
are therefore not suitable for these differential
equations.
However, they can be replaced by vector spaces of
those functions for which sup
1 0 )(x)xkI <
j
= 0(1)s, k = l(1)m
x dIR sup 1(ax)jq(x,t)xkl < =,
j
= 0(1)s, k = l(l)oo, t = [0,T].
x c 1R s e 1N
suitable but fixed.
These spaces could also have been used in Theorem 11.1.
The
proof of a similar theorem for the Courant-Isaacson-Rees method would founder, for the splitting is not differentiable in
A(x) = A+(x)
x, i.e., just because
A(x)
- A-(x) is
arbitrarily often differentiable, it does not follow that this is necessarily so for
A+(x)
and
A_(x).
Global extrapolation does not correspond exactly to
202
INITIAL VALUE PROBLEMS
I.
the model of midpoint rule extrapolation for ordinary differential equations, for there one has a case of local extrapolation.
Although the latter can be used with partial differ-
ential equations only in exceptional cases, we do want to present a short description of the method here. h = nlhl = n2h2 =
.
= nrhr,
nj
c IN,
Let
j
= 1(1)r.
At first the difference method is only carried out for the interval
For
[O,h].
tions for
T
0
t = h, there are then
r
approxima-
With the aid of the Neville
available.
scheme, a higher order approximation for
t = h
is computed.
The quantities obtained through this approximation then become the initial values for the interval
[h,2h].
There are
two difficulties with this: (1)
points
When the computation is based on finitely many
x, the extrapolation is only possible for those
which are used in all
means that for Since
j
computations.
r
= 1(1)r, the same
A = hj/(ox)j a constant
for the larger increments
hj
or
x
Practically, this
x-values must be used. A = hj/(Ax)e a constant,
the method has to be carried
out repeatedly, with the lattice shifted in the
x-direction.
This leads to additional difficulties except for pure initial value problems.
In any case, the computational effort is in-
creased by this. (2)
Local extrapolation of a difference method is a
new difference method.
Its stability does not follow from
the stability of the method being extrapolated.
Frequently
the new method is not stable, and then local extrapolation is not applicable.
Occasionally so-called weakly stable methods
Extrapolation methods
11.
203
arise, which yield useful results with
h
values that are
Insofar as stability is present, this must be
not too small.
demonstrated independently of the stability of the original Local extrapolation therefore is a heuristic method
method.
in the search for higher order methods. The advantages of local over global extrapolation, however, are obvious.
For one thing, not as many intermedi-
ate results have to be stored, so that the programming task For another, the step size
is simplified.
in the interval
can be changed
h
The Neville scheme yields good in-
[0,T].
formation for the control of the step size.
In this way the
method attains a greater flexibility, which can be exploited to shorten the total computing time.
As an example of local extrapolation, we again examine the Friedrichs method above.
for the problem considered
C(h)
The asymptotic expansion begins with
hT1(x,y) + h2T2(x,y).
Let
r = 2, hl = h, and
T0(x,y) + h2 = h/2.
Then
E2(h) = 2(C(h/2))2 - C(h) is a second order method. Let
Ax = h/A C(h) =
g = Ax/2.
2[I+AA(x)]Tg +
2[I+XA(x)]Tg +
C(h/2) = 2(C(h/2))2
and
We check to see if it is stable. Then
2[I-AA(x)]Tg2
Z[I-XA(x)]T-l
2[I+AA(x)][I+AA(x+g)]T2
= +
2[I+AA(x)][I-AA(x+g)]T0
+ 2[I-AA(x)][I+AA(x-g)]T0 +
2[I-AA(x)][I-AA(x-g)]T92
204
1.
INITIAL VALUE PROBLEMS
E2(h) = ZA[I+AA(x)]A(x+g)T2 + I- ZX[I+XA(x)]A(x+g)+ ?A[I-AA(x)]A(x-g) -
ZA[I-XA(x)]A(x-g)Tg2.
By Theorem 5.13, terms of order stability.
method with
Therefore
E2(h)
0(h)
have no influence on
is stable exactly when the
E2(h), created by replacing
A(x+g)
A(x-g)
and
A(x), is stable:
E2(h) = ZA[I+AA(x)]A(x)T9+I-X2A(x)2- ZA[I-XA(x)]A(x)Tg2 = 1+ 2XA(x)(TAx-TA1)+ ZX2A(x)2(TAx-2I+TAX). For
A(x) ° constant, E2(h)
Example 9.26). A(x)
is the Lax-Wendroff method (cf.
This method is stable for
Xp(A(x)) < 1.
If
is real, symmetric, and constant, it even follows that
II E2 (h)112 < 1. With the help of Theorem 9.34 (Lax-Nirenberg) we obtain a sufficient stability condition for nonconstant and
E2(h)
A.
E2(h)
are stable under the following conditions:
(1)
A E C2(IR,MAT(n,n, IR))
(2)
A(x)
(3)
The first and second derivatives of
(4)
Ap(A(x)) < 1,
is always symmetric A
are bounded
x EIR.
By Theorem 9.31, Condition (4) is also necessary for stability.
In the constant coefficient case, E2(h)
with the special case Example 10.9.
m = 1, r = 1
of method
coincides C(h)
of
Both methods have the same order of consistency
11.
Extrapolation methods
205
and the same stability condition, but they are different for nonconstant
A.
The difference E2(h)-(C(h/2))2 = (C(h/2))2-C(h)
gives a
good indication of order of magnitude of the local error. can use it for stepwise control.
One
In this respect, local
extrapolation of the Friedrichs method has an advantage over direct application of the Lax-Wendroff method. The derivation of
E2(h)
the amplification matrix of
can also be carried through
C(h).
C(h/2)
has amplifica-
tion matrix G(h/2,y,x) = cos w = yg = 2 yAx.
It follows that
H2(h,y,x) = 2G(h/2,y,x)2 - G(h,y,x) 2,12
sin 2w-A(x)
iasin 2w-A(x).
That is the amplification matrix of
Through further ex-
E2.
trapolation, we will now try to derive a method
E3
of third
order consistency: E3(2h) =
E2(h) 2
-
3 E2(2h).
3
Consistency is obvious, since there exists an asymptotic expansion.
We have to investigate the amplification matrix
H3(2h,y,x) =
H2(h,y,x)2
-
3 H2(2h,y,x)
3
Let
p
be an eigenvalue of XA(x), and
ponding eigenvalues of Then
n2,n2,3
H2(h,y,x), H2(2h,y,x), and
the corresH3(2h,y,x).
206
INITIAL VALUE PROBLEMS
I.
n2 = 1-2w2u2
2 w4u2 +
+ i[2wu
w3u]
-
+
O(Iw15)
-
8w3u3)
3 n2 = 1-8w211 2 + 30 w4u2 +
4w41j 4
+ i[4wu -
8
w3u
+
O(Iw15)
3
n2 = 1-8w2u2 + 32 w4u2 + i[4wu - 32 w3u] + o(Iw15) n3 = 1-8w2u2
+ 3
w4u2
16 w4 u4
+
+ i[4wp In312
= 1
-
32 w3u3) + O(IwIS)
w4(u2-u4) + O(IwIS).
+
3 For stability it is necessary that IuI
> 1.
On the other hand, for H2(2h,y,x) =
In3I < 1, that is,
w = n/2 we have
I
H2(h,y,x) = I-2A2A(x)2
n3 = 1 + 3 (u4-u2) and hence the condition
IuI
< 1.
Thus
if by chance all of the eigenvalues of 0
or
-1, for all
x eIR.
E3 XA(x)
is stable only are
+1
or
In this exceptional case, the
Friedrichs method turns into a characteristic method, and thus need not concern us here.
For characteristic methods, local extrapolation is almost always possible as with ordinary differential tions.
present.
This is mostly true even if boundary conditions are The theoretical background can be found in Hackbusch
(1973), (1977).
PART II. BOUNDARY VALUE PROBLEMS FOR ELLIPTIC DIFFERENTIAL EQUATIONS
12.
Properly posed boundary value problems Boundary value problems for elliptic differential equa-
tions are of great significance in physics and engineering.
They arise, among other places, in the areas of fluid dynamics, electrodynamics, stationary heat and mass transport (diffusion), statics, and reactor physics (neutron transport).
In
contrast to boundary value problems, initial value problems for elliptic differential equations are not properly posed as a rule (cf. Example 1.14).
Within mathematics itself the theory of elliptic differential equations appears in numerous other areas.
For a
long time the theory was a by-product of the theory of functions and the calculus of variations.
To this day variational
methods are of great practical significance for the numerical solution of boundary value problems for elliptic differential equations.
Function theoretical methods can frequently be
used to find a closed solution for, or at least greatly simplify, planar problems.
The following examples should clarify the relationship 207
208
BOUNDARY VALUE PROBLEMS
II.
between boundary value problems and certain questions of function theory and the calculus of variations. G
Throughout,
will be a simply connected bounded region in
continuously differentiable boundary
IR2
with a
aG.
EuZer differential equation from the calculus
Example 12.1:
of variations.
Find a mapping
u: G -+]R
which satisfies the
following conditions: (1)
is continuous on
u
entiable on
and continuously differ-
G
G.
(2)
u(x,y) = (x,y)
(3)
u
for all
(x,y) E aG.
minimizes the integral
I[w] = If
[a1(x,Y)wx(x,y)2
+
a2(x,y)wy(x,Y)2
G +
c(x,y)w(x,y)2
2q(x,y)w(x,y)]dxdy
-
in the class of all functions
Here
al,a2 a
C1 (G, ]R)
,
c,q c
w
satisfying (1) and (2).
C1 (G, IR)
al(x,y) > a >
a2 (x,y) > a > c(x,Y) > 0.
,
and ip E C1 (aG, IR)
with
0
0
(x,y)
E
It is known from the calculus of variations that this problem has a uniquely determined solution (cf., e.g., GilbargTrudinger 1977, Ch. 10.5). u
In addition it can be shown that
is twice continuously differentiable on
G
and solves the
following boundary value problem: -[al(x,y)ux]x -
[a2(x,Y)uy]y + c(x,y)u = q(x,y), (x,y) E G
u(x,y) = 'P(x,y),
(x,y)
a
G.
(12.2)
12.
Properly posed boundary value problems
209
The differential equation is called the Euler differential equation for the variational problem.
Its principal part is
-aluxx - a2uyy.
The differential operator 2
a2
__7 ax
ay
2
is called the Laplace operator (Laplacian).
In polar coor-
dinates,
x=rcos0 y = r sin it looks like a2
Dr
1
+
a
r 3r
1
+
a2
r 7 ao2
The equation -°u(x,y) = q(x,y)
is called the Poisson equation and -°u(x,y) + cu(x,y) = q(x,y),
c = constant
is called the Helmholtz equation.
With boundary value problems, as with initial value problems, there arises the question of whether the given problem is uniquely solvable and if this solution depends continuously on the preconditions.
In Equation (12.2) the
preconditions are the functions
and
q
ip.
Strictly speak-
ing, one should also examine the effect of "small deformations" of the boundary curve.
Because of the special prob-
lems this entails, we will avoid this issue.
For many bound-
ary value problems, both the uniqueness of the solution and its continuous dependence on the preconditions follows from
210
BOUNDARY VALUE PROBLEMS
II.
the maximum-minimum principle (extremum principle). Maximum-minimum principle.
Theorem 12.3:
q(x,y) > 0 (q(x,y) < 0) for all
and
every nonconstant solution
If
c(x,y) > 0
(x,y) c G, then
of differential equation (12.2)
u
assumes its minimum, if it is negative (its maximum, if it is positive) on
DG
and not in
G.
A proof may be found in Hellwig 1977, Part 3, Ch. 1.1.
Let boundary value problem (12.2) with
Theorem 12.4:
c(x,y) > 0 (1)
for all
(x,y)
e G
be given.
Then
It follows from q(x,y) > 0,
(x,y) E U
i4(x,y) > 0,
(x,y) e DG
u(X,y) > 0,
(x,y) E G.
and
that
(2)
Iu(x,y)I
There exists a constant
<
max (X,y)eDG
K
K > 0
such that
max Iq(X,Y)l, (X,y)cG (x,y) E
The first assertion of the theorem is a reformulation of the maximum minimum principle which in many instances is more easily applied.
The second assertion shows that the boundary
value problem is properly posed in the maximum norm. Proo
:
(1) follows immediately from Theorem 12.3.
(2), we begin by letting w(x,y) = ' + (exp($ ) - exp(sx))Q where
To prove
12.
Properly posed boundary value problems
'1'
=
max lb(X,Y)1, (x,y)c G const. > 0,
a
211
max jq(x,Y)j (x,y)EG
Q =
a const.
>
max (x,y)EG
Further, let maxc_ {1aX al(x,Y)I, c(x,Y)}.
M
(X,y)
Without loss of generality, we may suppose that the first component, x, is always nonnegative on
Since
G.
a1(x,y) > a,
we have
r(x,y) _ -[al(x,Y)wx(x,Y)]x -
[a 2(x,Y)wy(x,Y)]y
+ c(x,Y)w(x,Y)
= Q exp(Bx)[al(x,Y)s2 +
+ c(x,Y) [Q exp(SC) +
s
ax ai(x,Y) - c(x,Y)J
Y']
> Q exp(Bx)[as2 - M0+1)).
Now choose
a
so large that as2 - M(0+1) > 1.
It follows that r(x,Y) I Q,
(x,y)
E G.
In addition,
w(x,y) >
'l,
(x,y) E 9G.
From this it follows that q(x,y) + r(x,y) > 0 (X,y)
q(x,y)
E G
- r(x,y) < 0
u(x,Y) + w(x,Y) = V'(x,Y)
+ w(x,Y) > 0
- w(x,Y) = i,(x,Y)
- w(x,Y) < 0
(x,y) E U(x,Y)
G.
212
BOUNDARY VALUE PROBLEMS
II.
Together with (1) we obtain u(x,y) + w(x,y) > 0 u(x,y)
- W(X,Y) < 0
which is equivalent to (x,y) e G.
Iu(x,y)I < W(x,Y),
u, and its continu-
To check the uniqueness of the solution ous dependence on the preconditions ferent solution
u
and
for preconditions
Theorem 12.4(2), for Iu(x,Y)
1
- u(x,Y)I <
(x,y)
q, pick a dif-
and
q.
From
c G, we obtain the inequality
max (x,y)e3G + K
max _Iq(X,Y)
q(x,Y)I
(x ,y) eG This implies that the solution
is uniquely determined
u
and depends continuously on the preconditions Example 12.5:
El
i
and
q.
Potential equation, harmonic functions.
Boundary value problem: tu(x,y) = 0,
(x,y)
(x,y)
u(x,Y) _ (x,Y), Here
i e C0(3G, ]R).
e G e 8G.
As a special case of (12.2), this prob-
lem has a uniquely determined solution which depends continuously on the boundary condition
P.
The homogeneous differ-
ential equation tu(x,y) = 0 is called the potential equation.
Its solutions are called
Properly posed boundary value problems
12.
213
Harmonic functions are studied care-
harmonic functions.
fully in classical function theory (cf. Ahlfors 1966, Ch.
Many of these function theoretical results were
4.6).
extended later and by different methods to more general differential equations and to higher dimensions.
In this, the
readily visualized classical theory served as a model.
We
will now review the most important results of the classical theory. Let
(1)
be a holomorphic mapping.
f(z)
f(z), Re(f(z)), and
Im(f(z))
Then
f(z),
are all harmonic functions.
Every function which is harmonic on an open set
(2)
is real analytic, i.e., at every interior point of the set it has a local expansion as.a uniformly convergent power series in
x
and
y.
(3)
When the set
G
is the unit disk, the solution
of the boundary value problem for the potential equation can be given by means of the Poisson integral formula r
2
r27r
1-r
JO
1
2
dm
for r<1
2r
u(x,y) = for r=1.
are the polar coordinates of
Here
(x,y).
The
Poisson integral formula is a simple consequence of the Cauchy integral formula. (4)
The Poisson integral formula leads to the expres-
sion
rv[av
u(x,Y) = 012 + V=1
where
sv
214
II.
BOUNDARY VALUE PROBLEMS
f2w
(cos ¢, sin 4)cos(v4)dc
av = n 0
2n
p(cos 4, sin 4)sin(v4)d4.
Sv = 1 j 0
it
The functions
rvcos(v4), rvsin(v4)
monic functions.
are the simplest har-
Thus the above expansion of
u(x,y)
is
analogous to the power series expansion for holomorphic functions.
The potential equation is invariant with respect
(5)
to one-to-one holomorphic transformations.
Thus one need
consider the boundary value problem for the potential equation only on the unit disk, since by the Riemann mapping theorem, every simply connected region with at least two boundary points can be mapped onto the unit disk conformally (i.e., globally one-to-one and holomorphically).
It follows from the Schwarz reflection principle
(6)
that at every boundary point where the boundary curve and the boundary function solution
u
p
8G
are both real analytic, the
is also real analytic.
At these points, u
can
be continued across the border.
The conformal mappings of a simply connected region onto the unit circle can be given in closed or almost closed form for a great number of regions. (5)
As a result, conclusion
is of considerable practical significance.
It is fre-
quently worthwhile to map regions with a complicated border onto the unit disk or onto some other simple region, such as a rectangle.
Unfortunately, the Riemann mapping theorem
has no generalization to higher dimensions.
The exploitation
of conformal mappings is thus restricted to the plane.
Differ-
12.
Properly posed boundary value problems
215
ential equations differing from the potential equation are not in general invariant with respect to conformal maps.
However, it is usually easy to specify the differential equation for the transformed function.
In executing the trans-
formation, the Wirtinger calculus has proved itself to be of use, and we briefly describe it now. Instead of the (mutually independent) coordinates and
y, we consider the (mutually dependent) complex co-
ordinates
z
z, where
and
z = x + iy
,
z=x
x = 2 (z+z)
,
y = ai (z z) . a/az
The differential operators a
az = a
=
1
a
2
ax
1
a ax
2
+
1
a
1
a
and
-
iy,
a/az are defined by
2i ay
_
2i ay
3'F
Conversely, we have a
ax
=
a
az
+
a
3z
y = i(k -
a
az
).
The potential equation now assumes the form
Au(x,y) = 4 a2 u z
z
= 0.
azaz
A function
f(z) = f(z,z) = a(z,z) + ib(z,z) is holomorphic exactly when it satisfies the differential equation
x
216
BOUNDARY VALUE PROBLEMS
II.
of (z, z) az
=
0
This equation is just another form of the
on an open set.
Cauchy-Riemann differential equations ax(x,Y) = by(x,Y),
ay(x,Y) = -bx(x,Y) For a holomorphic function aaf(Z)
= f'(z)
af(z)
3TT7Z
az
=
af(z)
_
w = f(z)
onto the region
,
z
z
2
azaz
_
aZ .
af'(z)
a= f
aZ
a
aw
az
= f (z)f
G
Then it follows from
G*.
aw
az
=
af(z) = 0
be a conformal mapping of the region
af(z) a+ T(Z
a
_
az
az Now let
it is further true that
f(z)
aw
(z) () aw
a2 + at z + f, (z)(af(z) awaw L az az
2
awaw
a2
z
awaw'
that a2
a2
awaw
fl(z)fl(z) azaz
With the help of this equation one easily transforms differential equations of the form -Au(x,y) = H(x,y,u) or -
4 au 2
z,z)
azaz
= H(z,z,u).
12.
217
Properly posed boundary value problems
First boundary value problem for the Poisson
Example 12.6: equation.
-Au(x,y) = q(x,y),
(x,y) (x,y)
u(x,Y) = Vi(x,Y),
EG
E aG.
In many algorithms it is assumed that either
ip(x,y) = 0
The general case can usually be reduced to these
q(x,y) ° 0.
special cases by means of a substitution: able to a function
be extend-
let
and let
$ E C2(G,IR)
u(x,Y) = u(x,Y)
i(x,Y)
-
q(x,Y) = q(x,y) +
We then obtain the new problem (x,y) E G
-Ai(X,Y) = q(x,Y), u(x,y) =
0
(x,y)
,
c 9G.
If, on the other hand,
_
k
q(x,y) = P(z,z) _
auv
I
z11-V
z
u,v=0
then one can define
a
k u(x,y)
= u(x,y) +
(x, Y) =
+
u+l)(v+1
4
(x, Y) + q
E
zu lZV+1
17
li,v=0
i
(u l) (v 1
V,v=O Pi
or
zu lzv 1
is the solution of the problem au(X,Y) = 0, u(x,Y) = I1(X,y),
(x,y) E G (X,y) E 9G.
0
218
BOUNDARY VALUE PROBLEMS
II.
Example 12.7:
Third boundary value problem.
-Au(x,y) + au(x,y) = q(x,y), 3U3(n)Y).+
Here
Su(x,Y)
(x,y) e G
= (x,Y),
(x,y)
E Co (3G, IR) , q e Co (G, IR)
a, S E 7R,
c
G.
is the
and
derivative in the direction of the outward normal of
8G.
We know, from the theory of partial differential equations (cf., e.g., Walter 1970, Appendix), that: (1)
Whenever the real numbers
a,s
satisfy the rela-
tions
a > 0,
0 > 0,
a+B > 0
the problem has a unique solution. tinuously on the preconditions
The solution depends con-
q(x,y)
is a valid monotone principle: q(x,y) > implies
and
*(x,y). and
0
4i(x,y)
There > 0
u(x,y) > 0. (2)
If
a = 0 = 0, then
a solution whenever uniquely solvable.
u(x,y)
is.
u(x,y) + c, c = constant, is Therefore the problem is not
However, in certain important cases, it
can be reduced to a properly posed boundary value problem of the first type.
To this end, we choose
gl(x,y)
and
g2(x,y)
so that
3x gl(x,Y) + ay g2(x,Y) = q(x,y).
The differential equation can then be written as a first order system:
-ux(x,Y) + vy(x,Y) = gl(x,Y), -uy(x,Y) v
- vx(x,Y) = g2(x,Y)
is called the conjugate function for
u .
If
q e C1(G,IR),
Properly posed boundary value problems
12.
v
219
satisfies the differential equation - v(X,Y) = g(x,Y) = ax g2(x,Y)
-
y gl(x,y).
We now compute the tangential derivative of point.
Let
(wl,w2)
the outward normal.
v
at a boundary
be the unit vector in the direction of Then
is the corresponding tan-
(-w2,wl)
gential unit vector, with the positive sense of rotation. -w2vX(X,Y) + wlvy(X,Y)
= -w2[-uy(x,y)-g2(x,y)] + wl[ux(x,Y)+gl(x,Y)]
= (X,Y) + wlgl(x,Y) + w2g2(x,Y) _ 'P(X,Y) thus is computable for all boundary points
'P(x,y)
given
'P(x,y), gl(x,y), and
g2(x,y).
(x,y),
Since the function
v
is unique, we obtain the integral condition ds = arc length along
faG (x,y)ds = 0,
G.
If the integrability condition is not satisfied, the original problem is not solvable. obtain a
E
Cl(aG,]R)
Otherwise, one can integrate
P
to
with
4)
a s
'P
is only determined up to a constant.
Finally we obtain
the following boundary value problem of the first type for v: -AV(X,y) = g(X,Y), v(x,Y) = T(X,Y),
One recomputes tem.
u
from
v
(X,y) E G (x,y) E
G.
through the above first order sys-
However, this is not necessary in most practical in-
stances (e.g., problems in fluid dynamics) since our interest
220
II.
is only in the derivatives of a < 0
For
(3)
BOUNDARY VALUE PROBLEMS
u.
a < 0, the problem has unique
or
solutions in some cases and not in others.
a = 0, -a = v eIN, q = 0, and
5
For example, for
0, one obtains the family
of solutions
y eIR
u(x,y) =
x = r cos , y = r sin Q.
r2 = x2+y2,
Thus the problem is not uniquely solvable.
In particular,
there is no valid maximum-minimum principle. Example 12.8: geneous plate.
o
Biharmonie equation; load deflection of a homoThe differential equation
06u(x,y) = u
xxxx
+ 2u
xxyy
+ u = 0 yyyy
is called the biharmonie equation.
As with the harmonic equa-
tion, its solutions are real analytic on every open set.
The
deflection of a homogeneous plate is described by the differential equation MMu(x,y) = q(x,y),
(x,y) c G
with boundary conditions u(x,y) _ *1(x,y) (x,y) c 3G
(1)
(x,y) c DG.
(2)
-Du(x,y) = Yx,y) or
u(x,y) = *3(x,y) auan,y) _ 4(x,y)
Here
q c C°(U,IR), ip 1
c
C2(3G,IR), *2,'P4 a C°(3G,IR), and
12.
Properly posed boundary value problems
3 E C1 (BG,IR).
221
The boundary conditions (1) and (2) depend In the first case,
on the type of stress at the boundary.
the problem can be split into two second-order subproblems: -Av(x,y) = q(x,y),
(x,y) e G
(a)
v(x,y) = Yx,y),
(x,y)
e aG
and -tU(X,y) = V(X,y),
(X,y) C G
(b)
u(x,y) = P1(x,y),
(x,y)
c
G.
As special cases of (12.2), these problems are both properly posed, since the maximum minimum principle applies.
All prop-
erties--especially the monotone principle--carry over immediately to the fourth-order equation with boundary conditions (1).
To solve the split system (a),
t'I E C° (BG,IR) problem (2)
(b), it suffices to have
instead of l e C2 (aG,IR). Boundary value
is also properly posed, but unfortunately it can-
not be split into a problem with two second-order differential equations.
Thus both the theoretical and the numerical treat-
ment are substantially more complicated.
There is no simple
monotone principle comparable to Theorem 12.4(1). The variation integral belonging to the differential equation AAu(x,y) = q(x,y) is
I[w] = ff
[(Aw(x,y))2 - 2q(x,y)w(x,y)]dx dy.
a The boundary value problem is equivalent to the variation problem
I [u] =min {I [w] with
I
w e W}
222
BOUNDARY VALUE PROBLEMS
II.
W = (w C C2(G,IR)
I
w
satisfies boundary cond. (1)}
or
W = {w C C1(G, IR) n C2(G, IR)
I
w
satisfies boundary cond. (2)}.
It can be shown that
differentiable in
u
G.
is actually four times continuously o
Error estimates for numerical methods typically use higher derivatives of the solution problem.
u
of the boundary value
Experience shows that the methods may converge ex-
tremely slowly whenever these derivatives do not exist or are This automatically raises the question of the
unbounded.
existence and behavior of the higher derivatives of
u.
Matters are somewhat simplified by the fact that the solution will be sufficiently often differentiable in
G
if the bound-
ary of the region, the coefficients of the differential equation, and the boundary conditions are sufficiently often differentiable.
In practice one often encounters regions with
corners, such as rectangles
G = (a,b) x (c,d) or L-shaped regions G = (-a,a)
x (O,b) U (O,a) x (-b,b).
The boundaries of these regions are not differentiable, and therefore the remark just made is not relevant.
We must first
define continuous differentiability for a function on the boundary of such a region. set
U dIR2
properties: G.
and a function
defined
*
There should be an open
f C C1(U,]R)
with the following
(1) 3G c U, and (2) T = restriction of
f
to
Higher order differentiability is defined analogously.
Properly posed boundary value problems
12.
223
For the two cornered regions mentioned above, this definition is equivalent to the requirement that the restriction of
to each closed side of the region be sufficiently often
*
continuously differentiable. Poisson equation on the square.
Example 12.9:
-Au(x,y) = q(x,y),
(x,y)
c G = (0,1) x (0,1)
u(x,Y) = i(x,Y),
(x,y)
a aG.
v = 1(1)k
u c C2k(G,]R), then for
'Whenever
(-1)v-1(ay)2vu(x,Y)
(DX)2vu(x,Y) +
(-1)v- j-1
x)2j(
(
y)2v-2j
2 ]Au(x,Y)
v=o j Let
(xo,yo)
let
*
be one of the corner points of the square and 2k-times continuously differentiable.
be
left side of the equation at the point mined by and
alone.
*
(xo,yo)
Then the is deter-
We have the following relations between
q:
*xx(xo,Yo) + 4) yy(xo,Yo) = -q(xo,Yo) IPxxxx(xo,Yo)
-
Ip
-gxx(xo,Yo) + gyy(xo,Yo)
YYYY(xo,Yo)
etc.
does not belong to
When these equations are false, u
On the other hand a more careful analysis will
C2k(G,]R).
show that
u
does belong to
tions are satisfied and
q
C2k(G,]R) and
p
if the above equa-
are sufficiently often
differentiable.
The validity of the equations can be enforced through
224
II.
BOUNDARY VALUE PROBLEMS
the addition of a function with the "appropriate singularity". v = 1(l)-, let
For
v
Im(z2vlog
vv(x,Y) = 2(-1)
log z = log r+i4 For
x > 0
and
where
y > 0
z)
r = IzI, 4 = argIzI,
-n < 4
< n.
we have
vv(x,0) = 0 y2v vv(O,Y) = Set
cpv = xx(li,v)+'pyy(u,v)+q(p,v), u = 0,1 and v = 0,1
i
u(x,Y) = u(x,Y) + n
V+(x,Y) = V'(x,Y)
+
n
1
1
1
1
2
cpv Im(zpvlog zpv)
p=0 v=o 1
1
E
E
2
cpv Im(zpv log zpv)
p=0 v=0
where z00 = z,
z10 = -i(z-l),
z01 = i(z-i),
zll = -(z-i-1).
The new boundary value problem reads -au(x,y) = q(x,y),
u(x,Y) = kx,Y),
We have
u e C2 (G, IR)
(x,y)
e G
(x,y) c DG.
.
The problem -Eu(x,Y) = 1,
u(x,Y) = 0,
(x,Y) e G (x,y)
c DG
has been solved twice, with the simplest of difference methods (cf. Section 13), once directly, and once by means of u.
Table 12.10 contains the results for increments
the points
(a,a).
h
and at
The upper numbers were computed directly
Properly posed boundary value problems
12.
225
with the difference method, and the lower numbers with the given boundary correction.
a
1/2
h
1/32
1/8
1/128
0.7344577(-l)
0.1808965(-1)
0.7370542(-l)
0.1821285(-l)
0.7365719(-l) 0.7367349(-1)
0.1819750(-1)
0.1993333(-2)
0.1820544(-1)
0.1999667(-2)
1/256 0.7367047(-1)
0.1820448(-1)
0.1999212(-2)
0.1784531(-3)
0.7367149(-1)
0.1820498(-1)
0.1999622(-2)
0.1788425(-3)
1/16
1/64
Table 12.10
h
a
1/64
1/2
1/128
1/32
1/8
0.736713349(-1)
0.182048795(-l)
0.199888417(-2)
0.736713549(-1)
0.182049484(-1)
0.199961973(-2)
1/256 0.736713532(-1)
0.182049475(-1)
0.199961516(-2)
0.178796363(-3)
0.736713533(-1)
0.182049478(-1)
0.199961941(-2)
0.178842316(-3)
Table 12.11
Table 12.11 contains the values extrapolated from the preceding computations. pure
Extrapolation proceded in the sense of a
h2-expansion:
wh(a,a) =
3[4 uh(a,a)
With the exception of the point
- u2h(a,a)]
(1/128,1/128), the last line
is accurate to within one unit in the last decimal place.
the exceptional point, the error is less than 100 units of the last decimal.
The values in the vicinity of the
At
226
II.
BOUNDARY VALUE PROBLEMS
corners are particularly difficult to compute. that the detour via
and
'
is worthwhile.
u
It is clear Incidentally,
these numerical results provide a good example of the kind of accuracy which can be achieved on a machine with a mantissa length of 48 bits.
With boundary value problems, round-
ing error hardly plays a role, because the systems of equations are solved with particularly nice algorithms. Example 12.12:
Poisson equation on a nonconvex region with
corners.
(x,y) E G
-ou(x,y) = q(x,y),
u(x,y) _ (x,y), Ga = {(x,y) cIR2
1
(x,y) s DGa and
x2+y2 < 1 y
Figure 12.13
jyj
for
> x tan Z} a e (r,2a).
Properly posed boundary value problems
12.
227
The region (Figure 12.13) has three corners (0,0), (cos a/2, sin a/2), (cos a/2, -sin a/2).
The interior angles are a,
n/2,
n/2.
The remarks at 12.9 apply to the right angles. interior angle of
arise.
u
a > n
But at the
other singularities in the derivatives
Let
t (x,y) = Re(zn/a) = Re exp[(n/a)log z]
log z = log r +
7r
<
q(x,y) = 0.
Then u(x,y) = Re(zTr /a ),
and for
a = 3n/2, this is
even the first derivatives of
(x,y) _
0
sin a/2)
q(x,y) _ 0
u
on the intervals from and from
(0,0)
Obviously not
u(x,y) = Re(z2/3 ).
to
are bounded in (0,0)
G.
Here
(cos a/2,
to
(cos a/2, -sin a/2).
Since
also, the singularity has nothing to do with the
derivatives of
W
or
q
It arises from
at the point (0,0).
the global behavior of the functions.
It is not possible to
subtract a function with the "appropriate singularity" in advance. ificance.
Problems of this type are of great practical signIn the Ritz method (cf. §14) and the collocation
methods (cf. §16) one should use special initial functions to take account of these types of solutions.
a
The following two examples should demonstrate that boundary value problems for parabolic and hyperbolic differ-
228
II.
BOUNDARY VALUE PROBLEMS
ential equations are either not solvable or not uniquely solvable.
Boundary value problem for the heat equation.
Example 12.14:
uy(x,Y) = uxx(x,Y),
(x,y)
(x,y) E 3G
u(x,y) = p(x,Y), where
i e C°(3G,IR).
determined.
E G
The boundary value problem is over-
For example, let
G =
(0,1) X (0,1).
Then the
initial boundary value problem already is properly posed. Therefore the
set of all boundary values for which the prob-
lem is solvable cannot lie entirely in the set of all boundary values.
For regions with continuously differentiable
boundary there are similar consequences which we will not enter into here.
Example 12.15:
o
Boundary value problem for the wave equation. (x,y) e G
uxx(x,y) - uyy(x,Y) = 0,
(x,y) e 3G
u(x,y) _ 'D(x,Y),
where
$ c C°(3G,]R).
This problem also is not properly posed.
We restrict ourselves to two simple cases.
Let
G = Q1 = (0,1) x (0,1) or
G = Q2 _ {(x,Y) E]R2
1
ri -
Ixl
> y >
I x I ).
The two regions differ in that the boundary of of characteristics while the boundary of cides with the characteristics.
Ql
Q2
consists
nowhere coin-
According to Example 1.9,
the general solution for the wave equation has the representa-
13.
tion
Difference methods
r(x+y) + s(x-y).
229
If
u(x,y)
is a solution for
G = Q1,
then so is
u(x,y) + cos[2n(x+y)]-cos[2n(x-y)] = u(x,y)
-
2 sin(2nx)sin(2ny).
The problem therefore is not uniquely solvable. G = Q2, r
and
s
In case
can be determined merely from the condi-
tions on two neighboring sides of the square (characteristic initial value problem) and therefore the problem is overdetermined. 13.
e
Difference methods
In composing difference methods for initial value problems, the major problem lies in finding a consistent method (of higher order, preferably) which is also stable.
For
boundary value problems, this problem is of minor significance, since the obvious consistent difference methods are stable as a rule.
In particular, with boundary value problems
one does not encounter difficulties of the sort corresponding
to the limitations on the step size ratio h/Ax
or
h/(Ax)2
encountered with initial value problems.
We consider boundary value problems on bounded regions. Such regions are not invariant under applications of the translation operators.
The difference operators are defined,
therefore, only on a discrete subset of the region--the lattice.
In practice one proceeds in the same manner with
initial value problems, but here, even in theory we will dispense with the distinctions, and start with the assumption that the difference operators are defined on the same Banach space as the differential operators.
230
II.
BOUNDARY VALUE PROBLEMS
From the practical point of view, the real difficulty with boundary value problems lies in the necessity of solving large systems of linear or even nonlinear equations for each We will consider this subject extensively in the
problem.
The systems of equations which
third part of this book.
arise with boundary value problems are rather specialized in But they barely differ from the systems which
the main.
arise with implicit methods for the solution of initial value problems.
Error estimation is the other major area of concern in a treatment of boundary value problems. In this chapter, G
will always be a bounded region
(an open, bounded, and connected set) in boundary of
G
We denote the
JR2.
Let
by
r.
rr :
C°(-d, IR)
C°(r, IR)
be the natural map which assigns to each function r, called the
its restriction to the boundary
u e C°(G,IR)
boundary restriction map.
In
C°(G,]R)
and
C°(r,IR)
we
use the norms
max
Iu(x,y)
max
(x, y)
(x,y) cG and
(x,y)cr
Both spaces are Banach spaces, and
map with
JJrr11
is a continuous linear
= 1.
Definition 13.1: in
rr
A finite set
M c G
is called a Zattice
It has mesh size
G.
M
2
max
min
(x,y)EG (u,v)erUM
11 (x,y) -
(u,v)%.
Difference methods
13.
231
The space of all lattice functions
C° (M, ]R)
f:M +]R
we denote by
With the norm
.
11f11- =
If(x,y) I,
max (x, Y) EM
becomes a finite dimensional Banach space.
C°(M,]R)
The
natural map
rM:C°(G, ]R)
-
C°(M, ]R)
is called the lattice restriction map.
{(x,y) E G
x = ph, y = vh with p,v E 7l}, 0 < h < h 0
I
is called the standard lattice in if
h
Obviously and
It has mesh size
G.
is chosen sufficiently small.
0
the space
rM
C°(M,IR)
h
0
is linear, continuous, and surjective,
If the points of
lirMil = 1.
The lattice
M
are numbered arbitrarily,
can be identified with
1R
(n = number
M) by means of the isomorphism
of points in
f <-> (f(xl,yl).... ,f(xn,yn))
Thus it is possible to consider differentiable maps F
:
C°(M, IR) + C°(M, ]R)
.
In this chapter we will consider only the following problem together with a few special cases. Problem 13.2:
Here
L
Lu(x,y) = q(x,y),
(x,y) E G
u(x,y) = (x,y),
(x,y) e r.
is always a semilinear uniformly elliptic second-
232
BOUNDARY VALUE PROBLEMS
II.
order differential operator of the form Lu = -a11uxx - 2a12uxy - a22uyy -b1ux - b2uy + H(x,y,u),
where
all, a12, a22, b1, b2
H e CC(G x IR, IR) , Furthermore, for all
(x,y) e G and all
H(x,y,O) = 0, 0,
all >
a11a22 - a12 >
tion of the problem.
z cIR, let
Hz(x,y,z) > 0,
u c C°(G,IR) n C2(G,IR), u
If
eye C° (P, IR) .
q c Co (G, IR) ,
0.
is called the classical solu-
o
The next definition contains the general conditions on a difference method for solving 13.2. Definition 13.3:
A sequence D = {(Mj,Fj,Rj)
I
= l(1)oo}
j
is called a difference method for Problem 13.2 if the following three conditions are satisfied: (1)
hj
=
IMjj (2)
The
M.
are lattices in
converging to zero. The
Fj
are continuous maps
c° (r , ]R) X Co (Mj , IR) For each fixed
* c C0(r,IR), all
differentiable maps of (3)
with mesh sizes
G
The
Rj
Co(MjIR)
C° (Mj , IR) . Fj(p, ) to
are continuously
C°(Mj,IR).
are continuous linear maps
13.
Difference methods
C° (Mj
The method
233
-> C° (Mj , ]R) .
, IR)
is called consistent if the following condi-
D
tion is satisfied:
There exists an
(4)
for all
2
with the property that
u e Cm(G, IR) , lim J I F . j _.,,
Here
m >
rj = rM j
,
J
( * , r (u)) - R (r (q) )IIm = 0. J J
1 = rr(u), and
q(x,y) = I,u(x,y)
for all
(x,y) e G. The method
D
is called stable if the following condition
is satisfied:
There exist
(5)
K > 0, K > 0
and
jo e14
with the
following properties:
JIF( ,wj)-Fj(y,wj)11m > K11wj-W,Yi_
IIRj (wj)-Rj (wj)II < KlIwj- W.II V'
C C°(r, 1R), j= jo(l)°°, wj,Wj e C°(Mj, IR).
Example 13.4: problem.
The standard discretization of the model
We consider a consistent and stable difference
method for the model problem -Au(x,Y) = q(x,Y),
u(x,Y) = (x,Y) , For
j
= l(1)= MJ
a
:
(x,y)
e G =
(0,1)2
(x,yj e r.
we set
standard lattice with mesh size
2-J
F('Y,w)(x,Y) = 2 (4w(x,Y)-w(x+hjY)-wj(x-hj,Y) hj
wj(x,Y+h.)-wj(x,Y-h.))
234
BOUNDARY VALUE PROBLEMS
II.
R
(x, y) = wi (x, y)
(x,y) a Mj.
w a C° (Mj , IR) , Here
( x.Y)
-
wi (x,y)
when
(x,y) a M)
rp(x,y)
when
(x,y) e r.
.l
The proof of the consistency condition, (4), we leave to the Stability, (5), follows from Theorem 13.16 below.
reader.
The eigenvalues and eigenfunctions of the linear maps
Fi (0,-): C°(Mi ,
-+ C°(Mi ,
IR)
can be given in closed form.
IR)
One easily checks that the
functions
vuv(x,y) =
(x,y)
a M
u,\) = 1(1)2j-l
are linearly independent eigenfunctions.
The corresponding
eigenvalues are
auv = 7[2 - cos (uirh) - cos (virh)
h=h
h Since lattice
Mj
consists of
tive and lie in the interval
.
points, we have a
(23-1)2
complete system of eigenfunctions.
i
All eigenvalues are posiwhere
[all''mm]
m = 2j-l.
We have
A11 = Z [l - cos (nh) ]
=
2Tr2
-
1Tr4h2 + O(h4)
h
amm = -7[1 + cos(nh)] = 2 h h
2712
+ 6n4h2 + 0 (h4).
With an arbitrary numbering of the lattice points, there are real symmetric matrices
Ai
for the maps
Fi
With
13.
Difference methods
235
respect to the spectral norn:
they satisfy the condi-
tions
mm A11
The functions
1
+ cos(rh
1
-
cos (,rh)
+ 0(h4)
2
(,-h) -
3
vuv, regarded as functions in
IR2,
eigenfunctions of the differential operator -Avll(x,Y) = 21T 2v11(x,Y),
Since the functions
v,v
For example,
-A.
(x,y) e ]R
are also
2
vanish on the boundary of
(0,1)2,
they are also eigensolutions of the boundary value problem. Now let
D
a
be an arbitrary difference method for
solving Problem 13.2.
An approximation
the exact solution
of 13.2 is obtained, when possible,
u
wj
e C°(Mj,IR)
for
from the finitely many difference equations
Fj
(x, y) = Rj ( r . (q) ) (x,Y) ,
Thus our first question is: in the finitely many unknowns tion?
(x,Y) e Mj .
Does the system of equations wj(x,y)
have a unique solu-
For stable methods, a positive answer is supplied by
the following theorem. Theorem 13.5:
Let
F c Cl(Rn, Rn)
and
Then the
K > 0.
following two conditions are equivalent:
(1)
IIF(x)-F(x)II > KIIx-RII,
(2)
F
is bijective.
x,i a Rn
The inverse map
Q
is continu-
ously differentiable and
IIQ(x)-Q(X)II _
x,i c Rn.
IIx-RII,
x Proof that (1) implies (2):
Let
F'(x)
be the Jacobian of
236
F.
BOUNDARY VALUE PROBLEMS
II.
We show that
is regular for all
F'(x)
x0,y0 e Rn
then there would exist
F'(x0)y0 =
and
0
This means that the directional derivative at the
yo # 0.
point
with
For if not,
x.
in the direction
x0
is zero:
y0
lim n IIF(xo+hy0)-F(x0)II = 0.
IhI- 0 Thus there exists an
h
such that
> 0
0
h IIF(x0+h0y0 )
-
F(x0)II < KIIyOII
0
or
IIF(xo+hoyo) This contradicts (1).
-
F(x0)II < Kllhoyo II
Therefore
F'(x)
is regular every-
where.
F(x) = F(Z)
is injective.
F
Since
once by virtue of (1).
F'(x)
implies
x = :
at
is always regular, it
follows from the implicit function theorem that the inverse map
Q
is continuously differentiable and that
open mapping. F(Rn)
It maps open sets to open sets.
F
In particular,
is surjective.
be an arbitrary but fixed vector.
IIF(x)
-
F(0)II _ KIIxII
IIF(x)-x011 + Ilxo-F(0)II For all
is an
is an open set.
We must still show that x0 e]Rn
F
x
KIIxII
outside the ball
E = {x a mn
I
Ilxll _ 2IIx0-F(0)II/K}
we have
d(x) = IIF(x)-xoll > IIF(0)-x011
Let
By (1) we have
13.
Difference methods
237
Therefore there exists an
with
x1 e E
d(x1) < d(x), x c 1R'.
On the other hand,
d(x1) = Since
F(Rn)
inf
n II y-xo II
ycF(R )
is open, it follows that
is surjective.
Thus
x0 a F(1Rn).
F
It also follows from (1) that
IIx-RII = IIF(Q(x))-F(Q(R))II
KIIQ(x)-Q(R) II
This completes the proof of (2). Proof that (2) implies (1):
x,R a Rn.
Let
It follows by
virtue of (2) that
IIx-RII =
IIQ(F(x))-Q(F(X))II_
Theorem 13.6:
Let
KIF(x)-F(R) II
0
be a consistent and stable difference
D
method for Problem 13.2 and let
m, jo
constants as in Definition 13.3.
we define the lattice functions
K > 0
IN, and
For arbitrary
be
u e C2(G, IR)
wj, j = jo(l)m, to be the
solutions of the difference equations
F- (V ,wj) = Rj (r (q)). Here
= rr(u)
and
q = Lu.
Then we have:
Ilrj(u) wjIImJIFJ('Y,rj(u))-Rj(rj(q))II.,
(1)
j = jo(1)m. If
(2)
u e Cm(G, R), then
lim IIrj (u) -wjIIm = 0. j +00
Proof:
$
depends only on
13.5, the maps
F- (*,-) 3
We have
u
and not on
j.
By Theorem
have differentiable inverses
Q3.
238
BOUNDARY VALUE PROBLEMS
11.
rj (u) = Qj (Fj
(u)) )
wi = Qj(Rj(rj(q)))
"jrj (u) -w.'<_
ll Qj (Fj (V+,rj (u))) -Qj (Rj (rj (q)) )%.
(1) follows from Theorem 13.5(2) and (2) follows from (1) and Definition 13.3(4).
o
In Problem 13.2, q
and
y,
are given.
All conver-
gence conditions which take account of the properties of the exact solution
are of only relative utility.
u
Unfortunat-
ely, it is very difficult to decide the convergence question simply on the basis of a knowledge of less one knows that for fixed
q
and
P.
Neverthe-
p c C°(r,IR), the set of
q
for which the difference method converges is closed in
C°(U, IR)
.
Theorem 13.7:
Let
be a consistent and stable difference
D
method for Problem 13.2, let S
{q e C°(, IR)
I
c C°(r,IR)
1P
there exists a
and let
u c C2 (U, ]R)
such that rr(u) q = Lu and Further let
lim lIrj(u)-wj11m= 0).
j -.
q E S and
Fj (0', 4i ) = Rj (rj (q)) , Then there exists a
j
u c C°(U,IR)
such that
0.
1im Note that the function
= jo(l)..
u
need not necessarily be the clas-
sical solution of the boundary value problem.
Difference methods
13.
Proof:
239
Let
q(l),q(2) E
=
q(1)
Lu(1),q(2)
=
Lu(2),
R.(rj(q(1))),
j = jo(1)°°
Rj(rj(q(2))),
j
= jo(1)-.
Then:
Il rj (u(1)) -rj (u(2) Al.
Ilrjlull) )-w(1)Ilm
+
Let Q. j = jo(l)W, again be the inverse functions of and
and
K
tion 13.3(S).
the constants from stability condi-
K'
It follows from Theorem 13.5 that:
II rj (u(1) -u(2) )II W
Il rj (u(2) ) -wJ 2)II
Il
+ K llq(1)-q(2)II,,. In passing to the limit ity converges to
Ilull)
the left side of the inequal-
u(2)11, while the mesh
-
IMJI
con-
On the right, the first two summands converge
verges to zero.
to zero by hypothesis.
,lull)
j - ,,,
-
All this means that
u(2)11
llq(1)
-
Thus corresponding to the Cauchy sequence {q(v)
E S0
1
V = 1(l)m}
q(2)11..
240
BOUNDARY VALUE PROBLEMS
II.
there is a Cauchy sequence
{u(v)
E C0 (G,IR)
v = l(l)m}.
I
Let
lim q(v), v+m
Then for
v = l(l)m
u = lim u(v) v->m
we have the inequalities
II rj (u) -wjll <
1 1 rj (u-u(v) ) I l m +
Il rj (u(v))
11w'jv) -wj 1L
< IIu-u(V)IIL + For
E> 0
K
there is a
vo e 1N
Ilq(v)-qli,;
with
IIii-u(v°)IIm < 3 114-q(v0 )lIm
< 3 -K
K
For this
we choose a
v°
jl E N
IIrj(u(v°))-wj(v°)IIm < E Altogether then, for
j
> jl
II rj (U) -wjll < E.
such that
j = jl(1)m.
we have a
For the most important of the difference methods for elliptic differential equations, stability follows from a monotone principle.
The first presentation of'this relation-
ship may be found in Gerschgorin 1930.
The method was then
expanded extensively by Collatz 1964 and others. The monotone principle just mentioned belongs to the theory of semi-ordered vector spaces. concepts.
Let
0
We recall some basic
be an arbitrary set and
V
a vector space
13.
Difference methods
of elements
f:St -IR.
241
V
In
there is a natural semiorder
f < g -1f(x) < g(x), x e 0). The following computational rules hold: f < f
f< g, g< f f< g, g< h f < g, X c IR+ f
.
f= g f< h of < Ag
-
-g < -f 0 < f+g.
We further define
Ifl(x) = lf(x)I From this it follows that
Ifl,
0< When
is a finite set or when
c
f c V
f < Ifl. 12
is compact and all
are continuous,
jjfjj_= max lf(x) xE12
exists.
Obviously, 11 if 111-
lif 11.e
We use this semiorder for various basic sets V
0
{1,2,...,n}
Btn
{l,2,...,m}x{1,2,...,n}
MAT(m,n,IR)
Lattice M
C°(M, ]R)
G
C° (G, IR)
.
12, including
H.
242
A E MAT(n,n,IR)
Definition 13.8:
BOUNDARY VALUE PROBLEMS
is called an
with the following pro-
A = D - B
there exists a splitting
M-matrix if
perties:
B
is a regular, diagonal matrix; the diagonal of
D
(1)
is identically zero.
D > 0, B > 0. A-1 > 0.
(2) (3)
Theorem 13.9:
A = D - B
Let
MAT(n,n,IR), where A
D
and
B
be a splitting of
A c
satisfy 13.8(1) and (2).
is an M-matrix if and only if
p(D- IB) < 1
Then
(p = spectral
radius).
Proof:
Then the series
p(D-1B) < 1.
Let
(D-'B)"'
S
V=0
converges and
S > 0.
Obviously,
(I-D 1B)S = S(I-D-1B) = I, A-1 > 0
Conversely, let D-1B
with
x
A-1
and let
=
A
SD-I
> 0.
be an eigenvalue of
the corresponding eigenvector.
Then we have
the following inequalities: ID- IBxI < D-1BIxl
lXIlxi =
(I-D-1B)lxl < (l-lal)ixI (D-B)Ixl < (l-lal)Dlxi < (1-IXI)A-l Dlxl.
lxi
Since
x # 0, A-I > 0, and
plies that
IA!
<
1
and
The eigenvalues of
D > 0, the last inequality imp(D-1B) < 1. D-1B
o
can be estimated with the
help of Gershgorin circles (cf. Stoer-Bulirsch 1980).
For
Difference methods
13.
243
this let A = {aij
i = l(1)n, j = l(1)n}.
I
One obtains the following sufficient conditions for P(D-1B) < 1:
Condition 13.10:
A
is diagonal dominant, i.e.
n Jai'j
E
j=1 j#i
Condition 13.11:
A
i = l(1)n.
Iaiil,
<
is irreducible diagonal dominant, i.e.,
n Iai'j
E
A
i = l(1)n,
Jaiil,
j=1 j+i
is irreducible and there exist
r c {0,1,...,n}
such that
n I
Iarri.
1 j
+r
Definition 13.12: ping
F:V1 + V2
Let
and
V1
be semiordered.
V2
is called
isotonic
if
f < g
F(f)
antitonic
if
f < g
F(g) < F(f)
inverse isotonic
if
F(f)
for all
A map-
< F(g)
< F(g) . f < g
f,g e VI.
Definition 13.13:
Let
V
be the vector space whose elements
consist of mappings
f:S2 +IR.
diagonal if for all
f,g e V
Then
F:V + V
and all
x E 0
f(x) = g(x) - F(f) (x) = F(g) (x) .
is called
it is true that:
a
In order to give substance to these concepts, we consider the affine maps
F:x - Ax+c, where
A c MAT(n,n, R)
and
244
BOUNDARY VALUE PROBLEMS
II.
Then we have:
c c IRn. A>
0
F
-A >
0
F
isotonic antitonic
A
an M-matrix
A
diagonal matrix
F
inverse isotonic
F
diagonal
A > 0, regular
F
diagonal, isotonic, and
diagonal matrix
inverse isotonic.
A mapping
t.»
F:IRn ]R"
is diagonal if it can be written as
follows:
yi = fi(xi),
= l(1)n.
i
The concepts of isotonic, antitonic, inverse isotonic, and diagonal were originally defined in Ortega-Rheinboldt 1970. Equations of the form
F(f) = g
with
F
inverse isotonic
are investigated thoroughly in Collatz 1964, where he calls them of monotone type. Theorem 13.14:
Let
A c MAT(n,n,IR)
be an M-matrix and let
F:IRn IR"
be diagonal and isotonic.
F: 1Rn y 1R
defined by
Then the mapping
F(x) = Ax + F(x),
x e IRn
is inverse isotonic and furthermore
IIF(x) Proof: y = F(x)
Since
F
- F(x)II ?
II x-AI
-1
11-
is diagonal, one can write the equation
componentwise as follows: yi = fi(xi),
For fixed but arbitrary
i
= l(1)n.
x = (x1,...,xn) cIRn
and
Difference methods
13.
R = (il) ...,xn) e]Rn
245
we define, for f.(xi)
i = 1(1)n,
fl(Ri) if
xl
-
zi # xi
xl
otherwise
1
E = diag(eii). F
isotonic implies
F(x) Let
A = D
-
B
E > 0.
In addition,
F(i) = E(x-i).
-
be a splitting of
A
as in Definition 13.8.
It follows from
F(x) = AX + F(x) = y F(x) = Ax + F(x) = y that
- F(i) = (D+E-B)(x-i).
F(x)
Since
S=
I (D-1B)' > 0
v=0
converges, [(D+E)-IB]v > 0
T = 0
certainly converges.
The elements in the series are cer-
tainly no greater than the elements in preceding series. Therefore, I
-
(D+E)
1B
is regular, and [I-(D+E)-1B]-I
D+E-B
is also an M-matrix.
and this holds for all inverse monotone.
= T > 0.
We have
x,i c1Rn.
x < i
for
This shows that
F(x) F
< F(R), is
246
BOUNDARY VALUE PROBLEM
II.
In addition we have
ilx-RII ° II (D+E-B)1(F(x)-F(R)]
or
II
II
Ilg(x)-FcR)II
IIT(D+E)-III
The row sum norm of the matrix T(D+E)-1
((D+E)-1131'}(D+E)-
{
'=0
is obviously no greater than the norm of SD
1
=
{ Z (D 1B]'}D 1 = A 1. '=0
This implies that
IIT(D+E)-1II. < IIA-1II-
IIx-xII
IIF(x) Theorem 13.15:
IIA-1IIm
Hypotheses:
(a)
A E MAT(n,n,]R)
(b)
F: ]Rn -r IRn F(x)
o
.
is an
M-matrix
is diagonal and isotonic,
= Ax + F(x)
(c)
v EIRn, v > 0, Av > z = (1,...,1) EIRm
(d)
we]Rn,IIF(w)II_<1.
Conclusions: (1)
It is true for all
IIF(x)-F(z)IIm_
x,R e]R
IIx-RII. IIvIIL
that
Difference methods
13.
(2)
Proof:
F(O) = 0
For all
implies
x c1Rn
A-1 > 0
< v.
lwl
it follows from
Av > z
that
llxll, z <
Ixl Since
247
it follows that
A- Ilxl
Ilxll.v
IIA- Ixlim_
IIA 11x1 IL_ Ilxll. Ilvil.
IIA-III, < Ilvll. Combining this with Theorem 13.14 yields conclusion (1):
IIF(x)-F(X)II. >
llx-xlim
11vil
x,x a IRn ,
For the proof of (2) we need to remember that tonic and
F(-x)
is antitonic.
-z < F(w) <
F(0) = 0
F(x)
is iso-
implies that
z
-Av < F(w) < Av -Av+F(-v) < F(w) < Av+F(v)
-k-v) < F(w) < F(v). Since
F
is inverse isotonic, it follows that
-v < w < v.
a
We conclude our generalized considerations of functions on semiordered vector spaces with this theorem, and return to the topic of difference methods.
In order to lend some
substance to the subject, we assume that the points of lattice
Mj
have been enumerated in some way from
1
We will not distinguish between a lattice function wj
e C°(Mj,IR)
and the vector
to
nj.
248
BOUNDARY VALUE PROBLEMS
II.
[wj(x1,y1),...,wj(xn'yn)) a J
J
Thus, for each linear mapping
F: C° (Mj , IR)
A e MAT(nj,nj,IR)
is a matrix
,Rn j. -).
Co (Mj, IR)
and vice versa.
there
This matrix
depends naturally on the enumeration of the lattice points. "A > 0"
However, properties such as matrix" or
is a diagonal
"A
or
is an M-matrix" either hold for every enumera-
"A
tion or for none.
The primary consequence of these monotoni-
city considerations is the following theorem. Let
Theorem 13.16:
D = {(Mj,Fj,Rj)
I
j
be a dif-
= l(1)oo}
ference method satisfying the following properties:
(1)
F(,wj) = Fl)(wj) + F2)(wj) C C° (r, IR) ,
IIRJ jj wj
-
F3)('U),
wj e Co (Mj , IR) .
(2)
F1)
is a linear mapping having an M-matrix.
(3)
F2)
is diagonal and isotonic, and
(4)
Fj3)
and
< K.
Rj
are linear and isotonic, and
Also it is true for all
e C°(Mj,IR)
with
'p
F2)(0) = 0.
e C°(r, IR) and wj > 1 that
>1
'P
and
F3)('p) + Rj(wj) > (1,...,1) (5)
The method
{(Mj,Fl)-F3),Rj)
consistent if the function
H
I
j
= 1(l)°°}
is
in Problem 13.2 is identically
zero.
Conclusion: Remark 13.17:
D
is stable.
The individual summands of F. Rj
as a rule
correspond to the following terms of a boundary value problem:
Difference methods
13.
249
Boundary value problem
Difference method
Lu(x,y)
- H(x,y,u(x,y))
H(x,y,u(x,y)) P(x,Y)
q(x,y).
R
Since
must be isotonic, Hz(x,y,z)
can never be nega-
If this fails, the theorem is not applicable.
tive.
Consistency as in 13.3(4) can almost always be obtained by multiplying the difference equations with a sufficiently high power of
hj
=
jMjI.
The decisive question is
whether stability survives this approach.
Condition (4) of
Theorem 13.16 is a normalization condition which states precisely when such a multiplication is permissible. points
(x,y)
At most
of the lattice it is the rule that 0.
Such points we call boundary-distant
points.
Among other things, 13.16(4) implies that it follows
from
>
wi
1
that for all boundary-distant points
(x,y),
Ri (wi )(x,y) > I.
In practice, one is interested only in the consistent methods D.
But stability follows from consistency alone for
and
0.
H
=_
0
In general, it suffices to have
isotonic.
In Example 13.4 we have
0, R
the identity
and
h {4wi(x,Y)-w)(x+hj,y)-w)(x-hj,y) J
wi (x,Y+h))-w.(x,y-hi ))
250
BOUNDARY VALUE PROBLEMS
II.
if *(x,y) if
(x,y) e M3 (x,y) E r.
f wj (X, Y)
(x, y)
1
has a corresponding symmetric, irreducible, diagonal dominant matrix (cf. Condition 13.11). Since
of Theorem 13.16 is satisfied. Since
is satisfied.
Thus condition (2) 0, condition (3)
is the identity, one can choose
R i
K = 1
Also when
in (4).
F 3) (P) > 0 ,
w)
>
and
1
R) (w ) > I.
Therefore (4) is also satisfied.
obtained for
Consistency (5) is easily
from a Taylor expansion of the differ-
m = 4 For
ence equations.
t > 1, then obviously
u e C4(G,IR)
one obtains
JI F ( ,r (u)) -r (q)JI_ < Kh where 16
max (x,y)cG
1).
lu
yyyy
Thus the method is stable by Theorem 13.16.
o
Theorem 13.16 is reduced to Theorem 13.15 with the aid of two lemmas. Lemma 13.18:
There exists an
s
e C-(G,IR)
with
s(x,y) >
and
Ls(x,y) = Ls(x,y) - H(x,y,s(x,y)) > 1, Proof:
For all
c
(x,y) c G let
all(x,y) > K1 > 0,
lbl(x,y)) e K2,
We set 3K2
a =
and show that
(x,y)
K1,
B1+B2
B
= -2
Bl < x < B2
0
13.
Difference methods
251
- cosh[a(x-B))}/(2aK2)
s(x,y) _ {cosh[a(62-b)]
is a function with the desired properties. it follows from
ix-al
< 62-8
First of all,
that
cosh[a(x-B)] < cosh[a(02-8)], and from this, that
s(x,y) > 0.
Since
s
depends only on
x, we have
Ls = -a11sxx ZK
blsx
blsinh[a(x-B)].
allcosh[a(x-B)] + 2K 1
2
Since it is always the case that jsinh[a(x-B)]j < cosh[a(x-6)],
it is also true that Ls(x,y) > cosh[a(x-B)] > 1. Remark 13.19:
The function
error estimates, since sible. case.
s
s
o
plays a definite role in
should then be as small as pos-
Our approach was meant to cover the most general In many specific cases there are substantially smaller
functions of this type, as the following three examples demonstrate. bl(x,Y) = 0:
s(x,Y) =
bl(x,Y) > 0:
s(x,Y) = 2K (x-B1)(2 $2-B1-x) 1
L = -a, G = unit circle: s(x,y) = In the last example, the choice of
s
2K1(x-Bl)(62-x)
q(1-x2-y2).
is optimal, since
there are no smaller functions with the desired properties. In the other two examples, one may possibly obtain more
252
II.
K1, al, and
advantageous constants and
BOUNDARY VALUE PROBLEMS
by exchanging
82
x
a
y.
There exists a v e C°(G, IR)
Lemma 13.20:
and a jo c 1N
such that v(x,y) > 0,
(X,y)
c
(1,...,1), Proof:
We choose the
s
v(x,y) > 2,
The function
=
of Lemma 13.18 and define
It is obviously true that
v = 2s + 2.
J
Lv(x,y) > 2
for
v e C (6,1R), and (x,y)
e G.
is a solution of the boundary value problem
v
13.2 with = rr(v) > 2,
H ' 0,
q(x,y) = Lv(x,Y)
Insofar as the method
j
I
> 2.
= 1(1)-}
is
consistent with respect to this problem, we have
lim
0.
j-
We now choose
j
F
)
and
q > 1
Rj
0
so large that for all
(rj (v)) -F3 )
j
> jo
we have
-Rj (rj (q)) II, < I.
are linear and isotonic.
For
i >
1
and
we have
R(r(q)) > Since we actually have
* > 2
and
q > 2, it follows that
Rj(rj(q)) > (2,...,2) and hence that
13.
Difference methods
253
Remark 13.21:
Instead of
tually proved
v e CW(G,1k)
(1,...,1).
o
v e C°(G,1k)
and
and
v >
v > 0, we ac-
However, the condi-
2.
Since one
tions of the lemma are sufficient for the sequel.
is again interested in the smallest possible functions of this type, constructions other than the one of our proof These other methods need only yield a continu-
could be used.
ous function
v > 0.
o
We choose
Proof of Theorem 13.16:
Then we can apply Theorem 13.15.
v
as in Lemma 13.20.
The quantities are related
as follows:
Theorem 13.15
Theorem 13.16
Fc1)
A
F(2)
F
J
J
rj (v) F(l)
v +
FJ2)
J
F
J
w
0
For
j
it follows from Theorem 13.15(1) that:
> jo
(w j)-Fj1) >
lIvIi,
Ilwj-wjII, I1rj(v)II-
>
IIwj -w,II
-
IIvIi
does not depend on
ity in 13.3(5) with
equivalent to
II R3 II
(i .)IIm
This proves the first inequal-
j.
K = 1/IIvII,.
< K.
wj,wj e C (Mj, D2).
The second inequality is
o
In view of the last proof, one may choose in Definition 13.3(5).
Here
v
K = 1/IIvII,,
is an otherwise arbitrary
function satisfying the properties given in Lemma 13.20.
254
II.
BOUNDARY VALUE PROBLEMS
Conclusion (1) of Theorem 13.6 yields the error estimate
Ilrj(u)-wjII,, ` IIvII. IIFj(p,rj(u))-Rj(rj(q))II0.
j
= jo(l)o.
Here is the exact solution of the boundary value problem
u wj
is the solution of the difference equation
F(l,r.(u)) - Rj(rj(q))
is the local error
is a bounding function (which depends only on
v
The inequality can be sharpened to a pointwise estimate with the help of conclusion (2) of Theorem 13.15. points
(x,y)
and
c Mj
j
= j0(1)-
For all lattice
we have
Iu(x,Y)-wj (x,Y) I : v(x,Y)IIFj (*,rj (u))-Rj (rj (q))II_. In many important special cases, e.g., the model problem (Example 13.4), Rj
is the identity.
A straightforward
modification of the proof of Lemma 13.20 then leads to the following result: s > 0
and
exists a
e >
let
Ls(x,y) > 1 jl c 1N
0
and let
s
e Cm(G,]R)
(cf. Lemma 13.18).
with
Then there
such that
Iu(x,Y) -wj (x,Y) I < (1+e) S (x,Y)II Fj (,P,rj (u)) -rj (q))11_,
j = jl(1)-. In the model problem s(x,y) = 4 x(1-x) + 4 y(1-y) is such a function.
independently of
c.
Here one can actually choose
jl = 1,
It therefore follows that
Iu(x,y) -w3 (x,y) I < s ( x , y ) I I F j (,P,rj (u) ) -rj (q)II_,
j
= l(l)-.
We will now construct several concrete difference methods. Let
Difference methods
13.
e(1)
(11,
_
e(2)
(O),
=
0
255
if
Ih X
V
v = 1(1)4
let:
(x,y)+ae(v)
e G
Now for
_
(0). `
0
v = 1(1)4
with
for all
a
1
we associate
(x,y) c G
Nv(x,y,h) c G
four neighboring points
e(4)
`(-11,
With each point
(cf. Figure 13.22).
h > 0.
e(3) _
1
c
and
[0,h]
=
min {A >
(x,y)+xe(v)
0
c r}
otherwise
Nv(x,y,h) = (x,y) + ave(v)
dv(x,y,h) = II (x,Y) - Nv(x,Y,h)II2 = AvIIe(v)II2. Obviously we have 0 < dv(x,y,h) < h,
v = 1(1)4.
By Definition 13.1, the standard lattice with mesh size
h
is
Mh = {(x,y) e G
I
x = yh, y = vh where
u,v e 2Z},
0 < h < ho. For
(x,y)
long to
all the neighboring points
c Mh
Mh
or
Nv(x,y,h)
P.
Lip(2)(G,IR).
For brevity, we introduce the notation This is a subspace of every
be-
C&(G,IR)
f e Lip(Q)(G,IR) a''+Vf
there exists an
au+Vf ayv(x,Y)
-
axu ay v
ax u
(x,y)
E G,
defined as follows:
(x,y)
L >
0
<_ L II(x,y) - (X,Y)Ij E G,
Obviously,
IR) c Lip(2') (G, IR) .
u+v
Q
for
such that
H.
256
BOUNDARY VALUE PROBLEMS
e(2) 1
Figure 13.22.
Direction vectors for the difference method
The next lemma contains a one-dimensional difference equation which we will use as the basis for the difference methods in Lemma 13.23:
Let
a > 0
a,u e
and
C2n
Suppose further that there is a positive
C3
constant
L
such that for all
t,s c (-S,8)
and
v = 0(1)3
the following inequalities are valid:
a(") (t) I < L,
Iu(") (t) I ' L,
a(") (t)-a(")(s)I < Lit -sI,
Iu(")(t)-u(")(s)I < LIt-sI.
Then it is true for all
hl,h2 e (0,6]
that
h1h 22 1h2 {h 2a(Zhl)[u(hl)-u(0)]+hla(- Zh2)[u(-h2)-u(0)]) +
h
a(0)u"(0)+a'(0)u'(0) +
1-
2
[4a(0)u"'(0)+6a'(0)u"(0)
+ 3a"(0)u'(0)] + R where
G.
13.
Difference methods
We examine the function
Proof:
f(S)
= h1h22..1+h2
- h2a(shl)u(0) The
257
[h2a(Zhl)u(shl) + hla(- .h2)u(-sh2)
- hla(- Zh2)u(0)],
s
[0,1].
e
v-th derivatives, v = 0(1)3, are
f(v)(s)
h1h22 1+h2
2
h1h2(hl+h2
11=0
u(0) 2v +
v (v) 2-hl (s ) hlh2a
(-1)vhlh2a(v)(-
2h2)].
It follows that
f(0) = f'(0) = 0 f"(0) = 2a(0)u"(0) + 2a'(O)u'(O) f"'(0)
(h1-h2)[2a(0)u"'(0)+3a'(0)u"(0) + Za"(0)u'(0)].
By Taylor's Theorem,
f(l) = f(0) + f'(0) + 2f"(0) + 6f"'(0) +
6[f"l(e)-fil,(0)
0 < e < 1. The conclusion follows, with
R = 6[f"'(8)
show that 3
41 2 L 24
But this inequality follows from
3
l+h2 TT72
once we
258
BOUNDARY VALUE PROBLEMS
II.
Ia(u)(Zhl)u(v-u)(shl) <
Ia(u)(?hl)I
+
Iu(v-u)
(0) l
au(0)u(v-u)(0)I -
u(v-P) (0)
Iu(v-u)(shl)
-
la(u) ( hl)
a(u) (0) I < . L2hl
-
and, similarly, Iau(- 'h2)u(v-u) (-sh2)
< 2 L2h2.
a
a,u a C4a constant
Whenever
Remark 13.24:
au(0)u(v-11)(0)J
-
with the desired properties always exists.
L
A convenient
choice is L =
Example 13.25:
max v=0(1)4 te[-s,a]
(Ia(v)(t)I, Iu(v)(t)I)
Standard Five Point Method.
Differential operator: Lu = -[aluxlx
where
-
[a2uyly + H(x,y,u)
al,a2 a C(G, IR) , H c C-(G x ]R, IR) , and al(x,y) > 0,
H(x,y,0) = 0,
a2(x,y) > 0
(x,y) c G,
z
e IR.
Hz(x,y,z) > 0
Lattice: 2-(0+£),
h. = Mj:
A point
j
= 1(l)-, t
sufficiently large, but fixed
standard lattice with mesh size
(x,y) e Mj
boring points
h..
is called boundary-distant if all neigh-
Nv(x,y,h
belong to
G; otherwise it is
called boundary-close.
Derivation of the difference equations:
At the boundary-
distant lattice points, the first two terms of
Lu(x,y)
are
Difference methods
13.
259
replaced, one at a time, with the aid of Lemma 13.23. hi, wi, and
abbreviate
ni, merely writing
{al(x+Zh,Y)[w(x,Y)
h, w, and
n:
- w(x+h,y)]
al(x-?h,Y)[w(x,Y)
- w(x-h,y)]
+ a2(x,Y+Zh)[w(x,Y)
- w(x,y+h)]
+ a2(x,Y-Zh)[w(x,Y)
- w(x,Y-h)])
+
We
+ H(x,Y,w(x,Y)) = Q(x,Y) If one replaces
by the exact solution
w
O(h2).
of the boundary
u c Lip(3)(G,]R), the local error will
value problem, where be
u
An analogous procedure at the boundary-close
lattice points yields E1KV(x,Y)[w(x,Y) - w(NV(x,Y,h))]
+ E2KV(x,Y)w(x,Y) + H(x,y,w(x,y))
= q(x,y) + E2KV(x,Y)p(NV(x,Y,h)) where
2a(XV,YV) KV(x,Y) =
du x,Y,h +
dv(x,Y,
u=
1
v = 1,3
2
v = 2,4
I
+2 (x, y, )l
(x,Y) + -11-dV(x,Y,h)e(v)
In the sums
El
and
E2, v
runs through the subsets of
{1,2,3,4}: El:
all
v
with
NV(x,y,h) c G
E2:
all
v
with
NV(x,y,h) e r.
260
BOUNDARY VALUE PROBLEMS
II.
Formally, the equations for the boundary-distant points are special cases of the equations for the boundary-close points. However, they differ substantially with respect to the local error.
In applying Lemma 13.23 at the boundary-close points,
one must choose h1 = dl(x,y,h)
for the first summand of
h2 = d3(x,y,h)
Lu(x,y), and
h1 = d2(x,y,h) for the second.
and
and
h2 = d4(x,y,h)
The local error contains the remainder
R
and also the additional term hi-h
[4a(0)u"'(0) + 6a'(0)u"(0) + 3a"(0)u'(0)].
12
Altogether there results an error of may be reduced to
O(h3)
O(h).
However, this
by a trick (cf. Gorenflo 1973).
Divide the difference equations at the boundary-close points by
b(x,y) = E2KV(x,Y) The new equations now satisfy the normalization condition (4) of Theorem 13.16, since for
p > 1
and
q > 1
it is ob-
viously true that [q(x,Y) + E2Kv(x,Y)'U(x,Y)]/E2Kv(x,Y) > 1.
At the boundary-distant points such an "optical" improvement of the local error is not possible.
is
O(h2)
Therefore the maximum
.
We can now formally define (cf. Theorem 13.16) the difference operators:
13.
Difference methods
261
1
whenever
(x,y)
is boundary-distant
E2Kv(x,y)
whenever
(x,y)
is boundary-close
b(x,y) ll
4
Kv(x,Y)w(x,Y) V=1
E2Kv(x,Y)w(Nv(x,Y,h))]/b(x,Y)
-
H(x,y,w(x,y))/b(x,y)
Ri (ri (q))(x,Y) = q(x,y)/b(x,y).
there is a matrix
For
B-1A; B
is a diagonal matrix
b(x,y), whereas the particular
with diagonal elements
A
naturally also depends on the enumeration of the lattice points.
In practice, there are two methods of enumeration
which have proven themselves to be of value: (1)
Enumeration by columns and rows:
(x,y)
precedes
(z,y)
if one of the following conditions is satisfied: x < x,
(a)
(b)
x = z
and
With this enumeration, the matrix
A
y < y.
becomes block tridia-
gone1: D1
-S1
1
D2
-S2
2
D3
A =
-Sk
The matrices
Du
are quadratic and tridiagonal.
Their dia-
gonal is positive, and all other elements are nonpositive. The matrices
S11
and
SP
are nonnegative.
262
(2)
II.
BOUNDARY VALUE PROBLEMS
Enumeration by the checkerboard pattern:
lattice
Divide the
into two disjoint subsets (the white and black
Mj
squares of a checkerboard): Mil) = {(uh,vh) c M.
u+v
even}
{(uh,vh) a Mj
u+v
odd}.
The elements of
Mil)
In each of these subsets, we use the column
second.
of
are enumerated first, and the elements
and row ordering of (1). D1
The result is a matrix of the form -S
A = -9
D1
and
are quadratic diagonal matrices with positive
D2
diagonals.
D2
S
and
S
are nonnegative matrices.
In Figures 13.26 and 13.27 we have an example of the two enumerations.
Figure 13.26.
Enumeration by columns and rows
13.
Difference methods
Figure 13.27.
263
Enumeration on the checkerboard pattern
We will now show that
B-1A
and
A
are M-matrices.
It is
obvious that: 4
(1)
app =
p = 1(1)n,
Kv(x,y) > 0,
(x,y)
c
V=1 (2)
apa = -Kv(x,y)
<
or
0
apo =
0
for all
p,a
with
a
n (3)
app
>
I
lapa1,
p = 1(1)n.
a=1 a+p (4)
For each row
(ap1,...,apn), belonging to a boundary-
close point, n a pp >
E
IapaI.
a=1
a#p (5)
apa = A
In case
0
implies
aap =
matrix
for
p,a = 1(1)n.
is irreducible, it is even irreducible diagonal
dominant, by (1) through (4)
wise, A
0
(cf. condition 13.11).
Other-
is reducible, and by (5) there exists a permutation P
such that
264
BOUNDARY VALUE PROBLEMS
II.
A PAP
1
1
A2
=
l®
1
Av, v = 1(1)L
The matrices Each matrix
A
are quadratic and irreducible.
has at least one row which belongs to a
AV
boundary-close point.
Hence all of these matrices are ir-
reducible diagonal dominant, and thus quently, A
Conse-
is also an M-matrix.
For certain G = (0,1)
M-matrices.
x (0,1)
h
or
and certain simple regions (e.g.
G= {(x,y) E (0,1) x (0,1)
h = 1/m] it will be the case that
dv(x,y,h) = h.
x+y < 1},
I
When this
condition is met, we have the additional results: (6)
Kv(x,y,h) = Ku(Nv(x,y,h),h)
where
u-1 = (v+1)mod 4, (x,y)
(7)
apo = aop
(8)
A
(9)
B-IA
for
c M..
p,a = 1(1)n.
is positive definite. B-1/2AB-1/2
is similar to
and therefore has
positive eigenvalues only.
Of the conditions of Theorem 13.16 we have shown so far that (2)
(B- 1
A
is an M-matrix), (4) (normalization condition),
and (5) (consistency) are satisfied. H(x,y,w(x,y))/b(x,y) is trivially diagonal and isotonic. also satisfied.
Thus condition (3) is
Therefore, the method is stable.
In the following examples we restrict ourselves to the region
G = (0,1) x (0,1); for the lattice
M
we always
choose the standard lattice with mesh width h = h.
= 2-j.
In
13.
Difference methods
265
this way we avoid all special problems related to proximity In principle, however, they could be
to the boundary.
solved with methods similar to those in Example 13.25.
For
brevity's sake, we also consider only linear differential operators without the summand
Then the sumWhen
drops out of the difference operator.
mand (x,y)
H(x,y,u(x,y)).
c
w(x,y)
P, we use
for
Differential operator:
Example 13.28:
Lu = -a11uxx
a22uyy -
b u 1
x
- b2uy.
Coefficients as in Problem 13.2. Difference equations:
h2{[all(x,Y)+ch][2w(x,Y)-w(x+h,y)-w(x-h,Y)] [a22(x,Y)+ch][2w(x,Y)-w(x,y+h)-w(x,y-h)]}
+
Zh{bl(x,Y)[w(x+h,Y)-w(x-h,Y)]
-
+ b2(x,Y)[w(x,y+h)-w(x,y-h)]}
= q(x,Y) Here
When when
c
> 0
is an arbitrary, but fixed, constant.
u E Lip(3)(G,IR), we obtain a local error of
c = 0,
and
can be given by
0(h)
when
an M-matrix.
c > 0.
For small
h,
The necessary and sufficient
conditions for this are flbl(x,Y)l < all(x,Y) + ch,
(x,Y)
e M
Zjb2(x,Y)l < a22(x,Y) + ch,
(x,Y)
a M
which is equivalent to
0(h2)
266
BOUNDARY VALUE PROBLEMS
II.
2[Ibl(x,Y)I-2c]
E Mj
(x,y)
< all(x,Y),
2[Ib2(x,y)I-2c] < a22(x,y),
(x,y) E M3.
If one of the above conditions is not met, the matrix may possibly be singular.
Therefore these inequalities must be
satisfied in every case.
local error, and for h c (0,h0]. lb2I
For
For
c = 0, one obtains the smaller
c > 0, the larger stability interval
In the problems of fluid dynamics, Ib1I
are often substantially larger than
all
and
or a22.
c > 0, we introduce a numerical viscosity (as with the
Friedrichs method, cf. Ch. 6). in many other ways as well.
This could be accomplished
One can then improve the
global error by extrapolation.
o
Differential operator:
Example 13.29:
as in Example 13.28.
Difference equations: h2{all(x,Y)(2w(x,Y)-w(x+h,Y)-w(x-h,y)l
+ -
Here
D1
and
D2
a22(x,y)[2w(x,y)-w(x,y+h)-w(x,y-h)]}
h{D1(x,y) + D2(x,Y)} = q(x,y).
are defined as follows, where
(x,y)
`bl(x,Y) [w(x+h,Y)-w(x,Y)]
for
b 1(x,y) > 0
bl(x,y) [w(x,y)-w(x-h,y)J
for
bl(x,y)
1b2(x,y) [w(x,y+h)-w(x,y)]
for
b2(x,y) > 0
b2(x,y) [w(x,y)-w(x,y-h)]
for
b2(x,y) <
c M.,
Dl(x,Y) <
0
D2(x,y)
F3l)
is given by an M-matrix for arbitrary
h > 0.
0.
This
is the advantage of this method with one-sided difference quotients to approximate the first derivatives.
The local
13.
Difference methods
error is
0(h)
u e Lip(3)(G,IR).
for
sible only if
and
bI
267
b2
Extrapolation is pos-
do not change in sign.
Note
the similarity with the method of Courant, Isaacson, and Rees (cf. Ch. 6).
o
Differential operator:
Example 13.30:
Lu = -aAu - 2buxy
where
satisfy
a,b c C _(_G, ]R)
a(x,y) > 0, a(x,y)2
-
(X,y)
CG
b(x,y)2 > 0.
Difference equations:
{a(x,y)[2w(x,y)-w(x+h,y+h)-w(x-h,y-h)] 2h
+ a(x,Y)[2w(x,Y)-w(x+h,y-h)-w(x-h,y+h)l - b(x,Y)[w(x+h,Y+h)-w(x-h,y+h)-w(x+h,y-h)+w(x-h,y-h)]} = q(x,y).
When Ib(x,Y)I < a(x,y) ,
(x,Y) c Mi
one obtains an M-matrix independent of
h.
However, the dif-
ferential operator is uniformly elliptic only for Ib(x,Y)I < a(x,Y)
When
b(x,y)
__
,
(x,Y) e G.
0, the system of difference equations splits
into two linear systems of equations, namely for the points (ph,vh)
where
p + v
is even
(ph,vh)
where
p + v
is odd.
and
BOUNDARY VALUE PROBLEMS
[I.
268
One can then restrict oneself to solving one of the systems. The local error is of order
0(h2)
for
u e Lip(3)(U,IR).
o
MuZtipZace method.
Example 13.31:
Differential operator: Lu(x,y) = -Au(x,y).
Difference equations: {5w(x,Y)-[w(x+h,Y)+w(x,y+h)+w(x-h,Y)+w(x,Y-h)] h -
4[w(x+h,Y+h)+w(x-h,y+h)+w(x-h,Y-h)+w(x+h,Y-h)]}
= q(x,y) + S[q(x+h,y)+q(x,y+h)+q(x-h,y)+q(x,y-h)].
The local error is
0(h4)
for
13.16 is applicable because
u c Lip(5)(G, Ill).
Theorem
always has an M-matrix.
The natural generalization to more general regions leads to a method with a local error of
0(h3).
More on other methods
of similar type may be found in Collatz 1966.
o
So far we have only considered boundary value problems of the first type, i.e., the functional values on
t
were
Nevertheless, the method also works with certain
given.
other boundary value problems. Boundary value problem:
Example 13.32:
-Eu(x,Y) = q(x,y),
(x,Y)
u(x,y) = P(x,y),
(x,y)c r
u(0,Y)
where fixed.
ii
-
and
0'ux(0,Y)
4
E G = (0,1) x (0,1) and
x +
0
= 0(y), y E (0,1)
are continuous and bounded and
a > 0
is
13.
Difference methods
269
Lattice: A.:
the standard lattice
with mesh width h=hi2
M3 .
3
(0,µh), p = 1(1)2j-l.
combined with the points Difference equations:
For the points in
M. n (0,1) x (0,1), we use the same equa-
tions as for the model problem (see Example 13.4). u e Lip(3)(G,IR)
y = ph, u = 1(1)2j-1, and
For
we have
u(h,y) = u(O,Y) + hux(0,Y) + 1h2uxx(0,Y) + 0(h3) u(O,Y) + hux(0,Y) -h2uyy(O,y)
If we replace
-
Zh2[q(O,Y)+uyy(0,Y)] + 0(h3).
by
2u(O,Y) - u(0,y+h)
- u(0,y-h) + 0(h3)
we obtain u(h,y) = 2u(0,y)
Zu(0,y+h)
-
+ hu x(0,Y)
-
-
Zu(0,y-h)
Zh2q(O,Y) + 0(h3)
u x(O,Y) =
2h[2uCh,Y)+u(O,y+h)+u(O,Y-h)-4u(O,Y)]
+ Zhq(O,Y) + 0(h2).
This leads to the difference equation - a[2u(h,Y)+u(O,Y+h)+u(O,Y-h)l}
h{(2h+4a)u(O,Y)
(y)
Since
+
Zhq(0,Y)
a > 0, the corresponding matrix is an M-matrix.
theorem similar to Theorem 13.16 holds true. converges like tion by possible.
0(h2).
The method
If one multiplies the difference equa-
1/a, the passage to the limit o
A
a - -
is immediately
270
14.
II.
BOUNDARY VALUE PROBLEMS
Variational methods
We consider the variational problem I[u]
= min{I[wl
I
w e W},
(14.1)
where I[w]
= fi [a1w2 + a2wy + 2Q(x,y,w)Idxdy. G
Here
G
is to be a bounded region in
integral theorem is applicable, and
Q F C2(G x ]R, ]R)
to which the Gauss
al,a2 a C1(G,IR), and
where
al(x,y) > a > 0,
a2(x,y) > a > 0,
0 < QzZ(x,y,z) < d,
The function space below.
IR2
W
(x,y)
e G,
z aIR.
will be characterized more closely
The connection with boundary value problems is es-
tablished by the following theorem (cf., e.g., GilbargTrudinger 1977, Ch. 10.5).
Theorem 14.2:
is a solu-
A function u e C2(G, IR) fl C°(G, ]R)
tion of the boundary value problem -[alux]x -
(a2uyly + Qz(x,y,u) = 0,
(x,y) e G
(14.3)
u(x,y) = 0,
(x,y)
e DG
if and only if it satisfies condition (14.1) with
W = {w a C2(G, IR)
fl
C°(-a, IR)
I
w(x,y) = 0 for all (x,y) e 8G}.
In searching for the minimum of the functional
I[w],
it has turned out to be useful to admit functions which are not everywhere twice continuously differentiable.
In practice
one approximates the twice continuously differentiable solutions of the boundary value problem (14.3) with piecewise once
14.
Variational methods
271
continuously differentiable functions, e.g. piecewise polyThen one only has to make sure that the functions
nomials.
are continuous across the boundary points.
We will now focus on the space in which the functional I[w]
will be considered. K(G,IR)
Let
Definition 14.4:
w e C°(G,IR)
functions
such that:
(1)
w(x,y) = 0,
(2)
w
(x,y) e aG.
is absolutely continuous, both as a function with
x
of
with
y
y
held fixed, and as a function of
held fixed.
x
w. e L2(G, ]R).
wx,
(3)
be the vector space of all
We define the following norm (the Sobolev norm) on
K(G,]R):
2
1IwIIH =
[If (w2 + wx + wy )dxdy]l/2 G
We denote the closure of the space H(G,]R).
this norm by
We can extend setting
plies that
w
with respect to
a
continuously over all of
w
w(x,y) = 0
K(G,]R)
outside of
G.
]R2
by
Then condition (2) im-
is almost everywhere partially differentiable, (a,b) c]R2
and that for arbitrary
(cf. Natanson 1961, Ch.
IX) : rx
wx(t,y)dt
w(x,y) = J
a
(x, Y)
e IR2
rY =
J
wy(x,t)dt.
The following remark shows that variational problem (14.1) can also be considered in the space H(G,]R).
II.
272
Remark 14.5:
Let
BOUNDARY VALUE PROBLEMS
u e C2(G, IR) n C°(G,IR)
be a solution of
Then we have
problem (14.3).
= min{I[w]
I[u]
When the boundary
3G
w e H(G, IR)}.
I
is sufficiently smooth, the converse
For example, it is enough that
also holds.
be piece-
2G
wise continuously differentiable and all the internal angles of the corners of the region be less than
2n.
o
The natural numerical method for a successive approximation of the minimum of the functional
I[w]
is the
Ritz method: Choose
linearly independent functions
n
v = 1(1)n, from the space
K(G, IR).
n-dimensional vector space
Vn.
minimum of the functionals
I[w]
I[v]
Each the
V
I
These will span an
Then determine
v e Vn, the
in V:
w e Vn}.
can be represented as a linear combination of
w e Vn f
= min{I[w]
fv,
:
n
w(x,y) =
I
Bvfv(x,Y)
v=1
In particular, we have n
v(x,Y) =
I
cvfv(x,Y),
v=1 I[w]
= I(Sl,...,8n).
From the necessary conditions
2c
(cl,...,cn)
= 0,
v = 1(1)n
v
one obtains a system of equations for the coefficients
cv:
Variational methods
14.
fG[a,(fv)x
'I c(fx
u=1
273
E cu(fu)Y (14.6) + a2(fv)Y u=1 n E cufu)]dxdy = 0, v = 1(1)n. fvQz(x,y,
+
p=1
Whenever the solution
of the boundary value problem
u
(14.3) has a "good" approximation by functions in can expect the error
to be "small" also.
u - v
Vn, one
Thus the
effectiveness of the method depends very decidedly on a suitable choice for the space
Vn.
These relationships will be
investigated carefully in a later part of the chapter.
Now
we will consider the practical problems which arise in solvIt will turn out
ing the system of equations numerically.
that the choice of a special basis for
Vn
is also important.
In the following we will generally assume that is of the special form
Q(x,y,z)
Q(x,Y,z) = 2 a(x,Y)z2 - q(x,y)z, where
a(x,y) >
0
for
(x,y)
e G.
In this case, the system
of equations (14.6) and the differential equation (14.3) are The system of equations has the form
linear.
A c = d where
A = (auv), c = (c1,...,cn)1, and
d = (dl,...,dn)T
with
auv = Gf[al(fu)x(fv)x + a2(fu)y(fv)y + afufv]dxdY, du = If qfu dxdy. G
A
is symmetric and positive semidefinite.
tions
fv
definite.
are linearly independent, A Therefore, v
Since the func-
is even positive
is uniquely determined.
We begin with four classic choices of basis functions
274
II.
BOUNDARY VALUE PROBLEMS
fV, which are all of demonstrated utility for particular problems: (1) (2)
xkyR
monomials
products of orthogonal polynomials
gk(x)gZ(y)
I sin(kx) sin(Ry) (3)
sin(kx)cos(iy)
trigonometric monomials
:
Icos(kx)cos(iy) (4)
Bk(x)BR(y)
products of cardinal splines.
If the functions chosen above do not vanish on
8G, they
must be multiplied by a function which does vanish on and is never zero on
G.
It is preferable to choose basis
functions at the onset which are zero on if
aG
G.
For example,
G = (0,1)2, one could choose
x(1-x)Y(1-y),
x2(1-x)y(l-y),
x(1-x)y2(1-y),
x2(1-x)y2(1-y),
or sin(Trx) sin(Try) ,
sin(2rrx) sin(Try) ,
sin(rx) sin(2iry) , sin(2nx)sin(2rry)
For
G = {(r cos ¢, r sin 0)
1
r e
[0,1),
a good
c
choice is: r2-1,
(r2-1)sin ,
(r2-1)cos 0,
(r2-1)sin 20, (r2-1)cos 20.
Usually choice (2)
is better than (1), since one ob-
tains smaller numbers off of the main diagonal
of
A.
The
system of equations is then numerically more stable.
For
periodic solutions, however, one prefers choice (3).
Choice
(4)
is particularly to be recommended when choices (1)-(3)
give a poor approximation to the solution.
14.
Variational methods
27S
A shared disadvantage of choices (l)-(4) is that the A
matrix compute tions.
is almost always dense. n(n+3)/2
As a result, we have to
integrals in setting up the system of equa-
The solution then requires tedious general methods The com-
such as the Gauss algorithm or the Cholesky method.
putational effort thus generally grows in direct proportion with
n3.
One usually chooses
n < 100.
The effort just described can be reduced by choosing initial functions with smaller support. fufvo
(f11 )x(fv)x.
The products
(fu)y(fv)y
will differ from zero only when the supports of have nonempty intersection. are zero.
A
fu
In all other cases, the
fv
and auv
In this case, specialized, faster
is sparse.
methods are available to solve the system of equations. Estimates of this type are called finite element methods. The expression "finite element" refers to the support of the initial functions.
In the sequel we present a few simple
examples.
Example 14.7:
Linear polynomials on a triangulated region.
We assume that the boundary of our region is a polygonal line. Then we may represent
as the union of
G
AP, as in Figure 14.8.
N
closed triangles
It is required that the intersection
of two arbitrary distinct triangles be either empty or consist of exactly one vertex or exactly one side. tices of the triangles be denoted by
&v.
which do not belong to
Let them be enumerated from
We then define functions rules:
AP
Those ver-
1
2G, will to
n.
fv, v = 1(1)n, by the following
276
Triangulation of a region
Figure 14.8.
(1)
fv e C°(G, IR)
(2)
fv
restricted to
nomial in
IR2,
(3)
fvW')
(4)
fv(x,y) = 0
The functions (4).
BOUNDARY VALUE PROBLEMS
11.
is a first degree poly-
Op
p = 1(1)N.
dvu for
(x,y)
c 3G.
are uniquely determined by properties (1)-
fv
They belong to the space
fv
vanishes
which does not contain vertex
AP
on every triangle
K(G,IR), and
CV.
If the triangulation is such that each vertex v belongs to at most
k
triangles, then each row and column of
contain at most
k +
1
A
will
elements different from zero.
In the special case v = (rvh, svh)T,
rv,sv eZZ
we can give formulas for the basis functions
fv.
The func-
tions are given in the various triangles in Illustration 14.9.
The coefficients for matrix
this.
We will demonstrate this for the special differential
A
equation -Du(x,y) = q(x,y)
Thus we have
a1 = a2 = 1, a
=_
0, and
can be computed from
(
Variational methods
14.
0
0
0
277
/0 /0 0
0
0
1-rv+sv
0
l+rv x
+h 1-rv+
0
h
l+s
0
x
l+rv-sv-
0
0
0
V V0 Figure 14.9.
svh
v
1 sv+h
0
0
0
0
0
0
r h v
Initial functions triangulation
fv for a special
auv = It [(fu)x(fv)x + (fu)y(fv)y]dxdy Since
(fo)x
and
(fa)y
are
1/h, -1/h, or
0,
depending
on the triangle, it follows that 4
for p = v
-1
for
sv = su
and
rv = ru+1
or
rv = ru-1
-1
for
rv = ru
and
sv = su+1
or
S. = su-
0
otherwise.
In this way we obtain the following "five point difference
278
II.
BOUNDARY VALUE PROBLEMS
operator" which is often also called a difference star: 0
-1
0
-1
4
-1
0
-1
0
Tk,i
are the translation operators from Chapter
Here the
= 41
-
(Th,l + T_ h,1 + Th,2 + T_ h,2)'
10.
The left side of the system of equations is thus the same for this finite element method as for the simplest difference method (cf. Ch. 13).
On the right side here, how-
ever, we have the integrals
du = If gfudxdy while in the difference method we had h2q(r11 h, suh)
In practice, the integrals will be evaluated by a sufficiently accurate quadrature formula.
In the case at hand the follow-
ing formula, which is exact for first degree polynomials (cf., e.g. Witsch 1978, Theorem 5.2), is adequate: If g(x,y)dxdy z h2 [6g(0,0) + 6g(h,0) + 1 (O,h) where
is the triangle with vertices
A
Since the
fu
(0,0), (h,0), (O,h).
will be zero on at least two of the three
vertices, it follows that 2
du
a
6 (+l+l+l+l+l+l)q(ruh,suh) = h2q(ruh,suh).
Example 14.10:
Linear product approach on a rectangular is the union of
N
closed
rectangles with sides parallel to the axes, so that
may
subdivision.
We assume that
G
14.
Variational methods
279
Figure 14.11.
Subdivision into rectangles
be subdivided as in Figure 14.11.
We require that the inter-
section of two arbitrary, distinct rectangles be either empty or consist of exactly one vertex or exactly one side. denote by
CV
(v = 1(1)n)
We
those vertices of the rectangles
p which do not belong to
Then we define functions
G.
fv
by the following rule:
(1)
fv e G°(G, IR)
(2)
fv
restricted to
op
is the product of two
first degree polynomials in the independent variables and
x
y. (3)
fv(u)
(4)
fv(x,y) =
As in Example
svv 0
for
(x,y)
c
G.
14.7, the functions
fv
are uniquely
determined by properties (1)-(4), and belong to the space K(G,IR).
Each
fv
with common vertex
vanishes except on the four rectangles
v.
Thus each row and column of
at most nine elements which differ from zero. In the special case Ev = (rvh, svh)T,
rv,sv C 2Z
A
has
H.
280
BOUNDARY VALUE PROBLEMS
we can again provide formulas for the basis functions
fv,
namely: (1-Ifi -rvI)(1-Ih -svI)
for Ih-rvl
0
otherwise.
< 1,
< 1
Ih-svI
fv
We can compute the partial derivatives of the
fv
on the
interiors of the rectangles: -1Fi(1
I
(1
I
S
for
sv
for -1 <
0 < N-rv <
<
1
h-rv < 0,
Ih-svI <
1
<
h-sv < 1,
Iih-rI
<
1
for -1 <
h-sv < 0,
Ih-rvI
< 1.
1,
IK-svI
otherwise
0
for
0
otherwise.
0
The coefficients of the matrix
A
can be derived from this.
We restrict ourselves to the Poisson equation -Au(x,y) = q(x,y).
By exploiting symmetries, we need consider only four cases in computing the integrals: (1)
Iru- rvI
> 1
(2)
u = v:
auu = h2
(3)
rv = ru + 1
or
Isu- svI
2
h
auv = ri JO
h (h to f0[(1
and rrh
> 1
[-(l
auv =
:
h)2 + (1-
0
)2]dxdy
=
3
sv = su + 1:
)(1 (1
)) (l
)(1 (1
K))]dxdy='3
I
(4)
rv = ru + auv =
h
2
J
h
and
1
h
t (1){1-)+(1
j
0
sv = su:
0
fi)(l-(l-
h))ldxdy
= -7.
Variational methods
14.
281
We obtain the difference star _
1
3
1
3
1
3
1
1
1
8 31
_ 13
8 + 3
3[Th,1+T-h,l+Th,2+T h,2
1
_
3
3
+
The integrals
dU
(Th,l
-h,l)(Th,2+T-h,2))'
can be evaluated according to the formula 2
tt g(x,y)dxdy x[g(0,0) + g(h,0) + g(O,h) + g(h,h)), a
where (h,h).
is the rectangle with vertices
o
(0,0), (h,0), (O,h),
Therefore,
du = h2q(ruh, suh). Example 14.12:
a
Quadratic polynomial approach on a triangu-
lated region (cf. Zlamal 1968). lated as in Example 14.7.
Let region
G
be triangu-
We will denote the vertices of the
triangles
AP
and the midpoints of those sides which do not
belong to
aG
by
We define functions
Ev.
Let these be numbered from
f
(1)
fv a C°(G, IR)
(2)
fv(x,y)
v = 1(1)n
restricted to
1
to
n.
by the following rule:
Op
is a second degree
polynomial, p = 1(1)N.
(3)
fv(O')
(4)
fv(x,y) = 0
avu for
(x,y)
a aG.
As in the previous examples, the functions determined by properties (1)-(4). of a triangle, fv able.
fv
are uniquely
Restricted to one side
is a second degree polynomial of one vari-
Since three conditions are imposed on each side of a
282
II.
is continuous in
triangle, fv
G, and hence belongs to
It vanishes on every triangle which does not con-
K(G, IR).
tain
BOUNDARY VALUE PROBLEMS
CV.
a
With a regular subdivision of the region, most finite element methods lead to difference formulas.
For the pro-
grammer, the immediate application of difference equations is simpler.
However, the real significance of finite ele-
ment methods does not depend on a regular subdivision of the The method is so flexible that the region can be
region.
divided into arbitrary triangles, rectangles, or other geoIn carrying out the division, one can let
metric figures.
oneself be guided by the boundary and any singularities of the solution.
Inside of the individual geometric figures it
is most certainly possible to use higher order approximations (such as polynomials of high degree or functions with special
In these cases, the reduction to difference
singularities).
The programming required by
formulas will be too demanding.
such flexible finite element methods is easily so extensive as to be beyond the capacities of an individual programmer. In such cases one usually relies on commercial software packages.
We now turn to the questions of convergence and error estimates for the Ritz method. Definition 14.13:
Special inner products and norms.
quantities uv dxdy
2 =
f G
I =
ff[aluxvx + a u v G
2
y y
+ ouv]dxdy
The
14.
Variational methods
are inner products on
283
K(G,]R).
2,
II uI12 -
They induce norms
I.
II uIII =
The following theorem will show how the norms
and
can be compared to each other on
11.16
There exist constants
Theorem 14.14:
for all
11.1111
K(G, ]R) .
Y1,Y2 > 0
such that
u e K(G, ]R) : (1)
Y1l1uIII
(2)
IIUII2 < IIuIIH
(3)
1Iu1I2
Y211uIII
11u11H
Y211uIII.
The second inequality is trivial, and the third
Proof:
follows from the second and the first. first is as follows. show that
CI(G,]R)
The proof of the
Analogously to Theorem 4.10(1) we can is dense in
K(G,]R)
with respect to
Thus it suffices to establish the inequalities for
11.16.
u c Co(G,]R).
We begin by showing that there exists a con-
Yo > 0
stant
such that
Gf u2dxdy < Yo Gf (u2 + uy) dxdy, Let
11.1121
[-a,a]
x
[-a,a]
be a square containing
denote that continuous extension of (-a,a]
x
[-a,a]
u C Co (G, ]R) . Let
G.
u c Co(G,]R)
which vanishes outside of
G.
u
to
It follows
that t
u(t,y) = J- ux(x,y)dx. a
Applying the Schwartz inequality we obtain u(t1Y)2 < (t+a)jtaux(x,Y)2dx < 2a 1a aii(x,Y)2dx.
284
BOUNDARY VALUE PROBLEMS
II.
It follows from this that
1a u(t,y)2dy < 2aJa -a
a Ja -a
ux(x,y)2dxdy,
ra
ffu2dxdy
=
4a2fa
-a1a-a f
G
ra
u2dxdy <
u2dxdy
11
all
-a x
< 4a2JG fa(a2 + uy)dxdy. Setting
establishes our claim.
yo = 4a2
We now set =
cc
min
{min[al(x,y), a2(x,y)]}
max
{max[al(x,y), a2(x,y), o(x,y)]}
(x,y) eG Yo =
(x,y) eG
and use the above result to obtain the estimates 2 Ilu<2 (l+Yo) ff (ux+u)dxdy < y G
2
IIuIII < Yo
l+Y
o IIuII2
a
11U12
Inequality (1) then follows by letting
Yl = Let
{uv
1-77
I
v
in
{I
H(G, ]R)
K(G,IR)
Y2 = and
(1+
o
{vv
I
v = l(1)'}
be
which converge to elements
with respect to the norm
v = l(1)°°}
I
and
v = l(1)°°}
Cauchy sequences in
and
-
II
is a Cauchy sequence in
IIH.
Then
]R, for it
follows from the Schwarz inequality and Theorem 14.14 that I
VI-
I1uv-uu1II IIvvIII + Ilvu'vVIII IIuuIII
< yl 2 (I l
uv -uu IIH I I vv "H + 11V u -vV
H
II
I I uu lIH) .
u
14.
Variational methods
285
1 = lim
If we define
and
Theorem 14.14 holds trivially for all The space norms
However, this is not the case with
respect to the norm
II'112, as rather simple counterexamples
There is no inequality of the form
will show.
IIulIH < Y3IIujl2
Convergence for the Ritz method is first es-
tablished for the norm and
II'112
Theorem 14.15:
Let
II'III, and convergence with respect
then follow from the theorem. u c H(G,]R)
I[u] = min{I[w]
w e H(G,IR)
and let
Proof:
u e H(G,IR).
is closed with respect to the
H(G,IR)
and
to
For I[u]
I
be such that
w e H(G,IR)}
be arbitrary.
Then we have:
I = 2
(14.16)
I [u+w] = I [u] + 11W112
(14.17)
A e]R = I
it follows that 22
I[u+aw) = I
-
22
= I[u) + 2a(1 - <w,q>2) + Since
1,
IIull1 =
A2<w,w>1.
is the minimum of the variation integral, the ex-
u
pression in the parentheses in the last equality must be Otherwise, the difference
zero.
sign as
with
A
changes sign.
I[u+Aw]
- I[u]
will change
The second conclusion follows
A = 1.
It is also possible to derive equation (14.16) directly from the differential equation (14.3).
For
286
BOUNDARY VALUE PROBLEMS
II.
a(x,Y)z2 - q(x,y)z
Q(x,Y,z) =
Z we multiply (14.3) by an arbitrary function (test function) and integrate over
w e K(G,]R)
G:
(a2uy)y + au-qlw dxdy = 0.
ff[-(alux)x G
It follows from the Gauss integral theorem that ff[aluxwx + a2uywy + auw]dxdy = This is equation (14.16).
Gf qw dxdy.
It is called the weak form of dif-
ferential equation (14.3).
With the aid of the Gauss inte-
gral theorem, it can also be derived immediately from similar differential equations which are not Euler solutions of a variational problem.
Ac = d
The system of equations
can also be obtained
This process is called the GaZerkin
by discretizing (14.16). method: Let
fv, v = l(1)n, be the basis of a finite dimen-
sional subspace
Vn
We want to find an approxi-
K(G,]R).
of
mation n
v(x,Y) =
E cvfv(x,Y)
v=1 such that
u
=
> I
u
>
2,
u = 1(1)n.
As in the Ritz method it follows that auv =
and
du =
2
A derivation of this type has the advantage of being applicable to more general differential equations.
We prefer to
proceed via variational methods because the error estimates follow directly from (14.17).
14.
Variational methods
Theorem 14.18:
Let
K(G, IR) .
be an n-dimensional subspace of
Vn
Let
287
u e H(G, ]R)
IM = min{I [w]
I
v c Vn be such that
and
w e H(G, IR) },
I (v] = min(I [w]
I
w e Vn}.
Then it is true that
Here
(1)
IN] < I[VI
(2)
(Iu-v112 < Y211u-viII < Y2 min
11u-v*11I
is the positive constant from Theorem 14.14.
Y2
Proof:
v*eVn
Inequality (1) is trivial.
It follows from this,
with the help of Theorem 14.15, that for every
Ii u-v (j 2 = I [v]
I (U]
-
<
I [v* ]
-
I [U]
_ ii u-v* Iii
The conclusion follows from Theorem 14.14. Thus the error
11u-vi12
if there is some approximation for which
11u-v*jjI
is small.
mation in the mean to
u
v* e Vn,
a
in the Ritz method is small v* a Vn
of the solution
u
This requires a good approxi-
and the first derivatives of
u.
Nevertheless, Theorem 14.18 is not well suited to error estimates in practice, because the unknown quantity
u
continues to appear on the right sides of the inequalities in (2).
However, the following theorem makes it possible to
obtain an a posteriori error estimate from the computable defect of an approximate solution. Theorem 14.19:
Let
u e C2(G,IR) n C°(G,]R)
of boundary value problem (14.3).
be a solution
Let the boundary of
G
consist of finitely many segments of differentiable curves. Further let v(x,y) = 0
v e C2(G,]R) for all
(x,y)
be an arbitrary function with e DG
and let
288
II.
BOUNDARY VALUE PROBLEMS
a a a a wx(al az) - ay(a2 ay)
L=
+
Then it is true that II u-v II2 < YZII Lv-q112. Here
is the positive constant from Theorem 14.14.
Y2
Proof:
Let
c(x,y) = u(x,y)
G
Since
is square integrable on
q(x,y), Lu(x,y) ishes on
- v(x,y).
Lu(x,y) _ Since
G.
van-
a
BG, it follows from the Gauss integral theorem that
cLcdxdy = f f I (ale2x + a2ey + cc2)dxdy
=11E,12
'
It follows from Theorem 14.14 and the Schwartz inequality that
IIEII2 < YZIIEIII < Y2 IIE2IIIILEII2
0
We see from the estimate in the theorem that the error will be small in the sense of norm
11112
if
v
is a twice
continuously differentiable approximation of solution for then
depends on
good constants and
Of course the quality of the estimate
Lv z q. Y2.
u,
This shows how important it is to determine Y2
for a region
G
and functions
al
a2.
One further difficulty arises from the fact that the Ritz method normally produces an approximation from instead of from
C2(G,IR).
vented as follows.
K(G,IR)
This difficulty can be circum-
First cover
G
with a lattice and com-
pute the functional values of the approximation on this lattice with the Ritz method.
Then obtain a smooth approxi-
mation by using a sufficiently smooth interpolation between the functional values
v(Ep)
at the lattice points
Cp.
14.
Variational methods
289
Unfortunately, bilinear interpolation is out of the question because it does not yield a twice continuously differentiable A two dimensional generalization of spline inter-
function.
polation is possible, but complicated. interpolation is simpler.
The so-called Hcrmite
We will consider it extensively
in the next chapter.
Up to now we have assumed that form
In the following, let
1 az2 - qz.
function in
Q
C2(G x IR, ]R)
QZ(X,Y,Z) > 0,
has the special Q
be an arbitrary
with (x,y) a G, z cIR.
0 < QZZ(X,Y,z) < b,
Then one has the following generalizations of Theorems 14.15 and 14.18.
Theorem 14.20:
Let
u e K(G,]R)
I [u] = min{I [w] and let
v e K(G,IR)
I
be such that
w c K(G, ]R) } Then it is the case that
be arbitrary.
ff[aluxvx + a2uyvy + Qz(x,y,u)v]dxdy = 0,
I[u+v] = I[u] + ff[alvx+a2v2+Qzz(x,y,u+0v)v2]dxdy, 0<8<1. y G Theorem 14.21:
K(G, R). I[u]
Let
Vn
be an n-dimensional subspace of
Further let u e K(G, IR)
= min{I[w]
I
w c K(G,]R)}
v e Vn be such that
and and
Then there exists a positive constant
I[v] = min{I[w] y2
such that
I
w eVn}.
290
BOUNDARY VALUE PROBLEMS
II.
(1)
I [u]
<
(2)
jl u-vuj
2 < Y2 min
I [v] v*eVn
Y2
G
a2(uy-vy)2
+
The constant
{ff (al (ux-v*) 2
does not depend on
+
6(u-v*)2]dxdy}1/2. Q
or on
d.
Theorems 14.20 and 14.21 are proven analogously to Theorems 14.15 and 14.18.
Inequality (2) of Theorem 14.21
implies that convergence of the Ritz method for semilinear
differential equations is hardly different from convergence for linear differential equations. 15.
Hermite interpolation and its application to the Ritz method
We will present the foundations of global and piecewise Hermite interpolation in this section.
This interpola-
tion method will aid us in smoothing the approximation functions and also in obtaining a particularly effective Ritz method.
In the interest of a simple presentation we will
dispense with the broadest attainable generality, and instead endeavor to explain in detail the more typical approaches. We begin with global Hermite interpolation for one independent variable. Theorem 15.1: (1)
that
m c N
and
f
c
Cm-1([a,b],]R).
There exists exactly one polynomial
deg fm < 2m-1
fmu) (a) fm
Let
fm
Then:
such
and
= f (u) (a) ,
fm(11) (b)
= f (u) (b) , u = 0 (1) m -
is called the Hermite interpolation polynomial for
1. f.
291
Hermite interpolation and the Ritz method
15.
If
(2)
is actually
f
[a,b], then the function
entiable on
µ = 0(1)2m-1, has at least v = 1(1)2m - µ.
2m - µ
f(u)
zeros
fmu), for
-
xµv
[a,b],
in
Here each zero is counted according to multi-
For each
plicity.
2m-times continuously differ-
x e
there exists a
[a,b]
9 e (a,b)
such that the following representation holds: f(u)(x)
The
xµv
fmu)(x)
=
f(mmu 9
+
2fl µ(x-xuv), v=1
u = 0(1)2m-1.
(µ fixed) ordered by size are given by
xuv
ia
for
v = l(1)m -
b
for
v = m+1(1)2m - u.
u
We have the inequality
II
f(u)
_
fmu)IIm
< cmu(b-a) 2m-"II f
0(1) 2m-1.
(2m)II.,
where
mm m-u m-u 77-M (2m-µ)
_-P
1
2m-µ
for
µ = 0(1)m-1
for
µ = m(1)2m-1.
mu 1
(2m-p)
This theorem can be generalized when continuously differentiable on that case, an estimate for
[a,b]
is only
f
with
IIf(µ)-fmµ)II
Swartz-Varga 1972, Theorem 6.1.
For
1-times
0 < R < 2m.
In
can be found in
k < m-1, we require in
(1) that: fmu)(a)
The constants
=
cmµ
fmu)(b)
= 0,
u = i+1(1)m-1.
are not optimal.
Through numerical compu-
tations, Lehmann 1975 obtained improved values of small
m
(cf. Table 15.2).
cmµ
for
7
6
5
4
3
2
1
0
y
cmu
cmu
S.oooo00o0000E-1
l.oooooooooooE o
1.o71428S7143E-1
5.oooooooo000E-1
1.19o47619o48E-2
1.66666666667E-1
1.oooooooooooE o 5.ooo000oooooE-1
5.95238095238E-4
4.16666666667E-2
5.oooooooooooE-1
1.oooooooooooE-1
2.45o7619o282E-5
6.82666666667E-4
3.1oo1984127oE-6
8.33333333333E-3
1.66666666667E-1
1.oooooooooooE o
5.oooooooooooE-1
5.2o833333333E-4
3.o483158o552E-5
4.3945312SoooE-3
5.oooooooooooE-1
8.33333333333E-2
3.689522oo589E-7
1.66527864535E-6
2.8800oooooooE-4
7.453559925ooE-5
9.68812oo3968E-8
9.68812oo3968E-8
m=4
m = 1,2,3,4.
2.17013888889E-5
2.17ol3888889E-5
m=3
(lower entry) for
8.o1875373875E-3
2.4691358o247E-2
1.ooo000000ooE o
5.oooo0000000E-1
2.6o416666667E-3
1.25000ooooooE-1
2.6o416666667E-3
m=2
(upper entry) and
1.2SoooooooooE-1
m=1
TABLE 15.2:
15.
293
Hermite interpolation and the Ritz method
The conditions on
Proof of (1):
equations for the
fmu)
create
coefficients of polynomial
2m
linear
2m
fm.
If
the determinant of the system of equations were zero, then for certain right sides there would be two different polynomials
and
fm
polynomial with
fm - fm
Then
fm.
would be a nonvanishing
zeros and degree < 2m-1.
2m
Since that is
a contradiction, the system of equations must have a unique solution.
Proof of (2): city
The difference
f -
fm
a
and
at each of the points
m
total of
b, and therefore a
It then follows from the generalized
zeros.
2m
has a zero of multipli-
Rolle's Theorem that the derivatives (f-fm)(u)
have at least xuv first
zeros on
2m-u
u = 0(1)2m-1
We denote these by
[a,b].
and order them by size with respect to
equal to
a, and the last
zeros are equal to
m-u b.
Obviously the
v.
m-u
are
Now we consider the function
Oq(x) = f(u)(x)
-
fmu)(x)
2m-u -
q
(x-xuv
II
V=1
for fixed xo a [a,b]
p
e {0,l,...,2m-1}
and
with v = 1(1)2m-p
xo + xuv,
one can then choose a zero.
Then
For a fixed
q e]R.
q e]R
such that
has at least
0q(x)
is equal to
mq(xo)
2m-u+l
zeros in
[a,b].
We again appeal to the generalized Rolle's Theorem to conclude that
0q(2m-u)
(x)
has at least one zero
Then it follows from
q(xo) = 0,
q(2m-u) (g) = 0
8
in
(a,b).
294
II.
BOUNDARY VALUE PROBLEMS
that f(2m)(0)
- q(2m-1j) = 0,
f(u) (xo) When
xo
f (mmu
e (a,b).
6
(xo-xuv)
2mIT
6
v=l
= 0.
xuv, the last equation holds
is one of the zeros
for arbitrary x
fmu) (xo)
-
Therefore it holds for all
The equation, together with
c [a,b].
Ix-xuvI < b-a
immediately implies the inequality for f(u) fmwhere p = 0(1)m-1, we can split the product.
When
p =m(1)2m-1.
We have (x-a)m-u(b-x)m
2m-Vi
Ix-xuvI
II
for
x
c [a,(a+b)/21
for
x
e ((a+b)/2,b1.
<
v=1
(b-x)m-u
1(x-a) m
We want to find the extrema of y(x) = (x-a)
m-u(b-x)m
x c (a,(a+b)/2) .
We have (m-P)(z-a)m-u-1(b-z)m-m(x-a)m
y'(x) _
exactly when
z =
R.
x-a = (m-u)(b-a)/(2m-u)
= 0
The function
[ma + (m-u)b]/(2m-u).
assumes its maximum at
u(b-z)m-1
y(x)
Since and
b-x` = m(b-a)/(2m-p)
it follows that y(X) =
m m m(2m-u)
M-11 (b-a)2m-u
m-u
The considerations for
x e ((a+b)/2,b]
inequality follows for
u = 0(1)m-1.
are similar. c
The
29S
Hermite interpolation and the Ritz method
15.
Suppose a fixed
Then the
has been chosen.
e [a,b]
x
assignment
A
f -* f(u)(x)
fmu)(x),
-
defines a linear functional on
u = 0(1)2m-1 It vanishes
C2m([a,b],]R).
on the set of all polynomials of degree less than
The
2m.
functional can be represented explicitly with the aid of a Peano kernel.
Definition 15.3:
m eIN, x,t c [a,b]
Let
(x-t) (x-t)+m-1
g(x,t) =
2m-1
g(x,t)
of
t,
by
for
x > t
for
x < t.
= 0
For fixed
and
we denote the Hermite interpolation polynomial gm(x,t).
We set
Gm(x,t) = g(x,t) - gm(x,t).
Then
al'Gm/axu
is called the Peano kernel of
Au.
e
The coefficients of the Hermite interpolation polygm(x,t)
nomial
are functions of
t
which can be repreTherefore,
sented explicitly with the aid of Cramer's Rule. gm e C
2m-2
tion in
Since
([a,b] x [a,b],IR). C2m-2([a,bl
g(x,t)
is also a func-
x [a,bl,]R), the same is true for
Gm(x,t) . Theorem 15.4:
Let
f
e
C2m([a,b],]R)
Hermite interpolation polynomial for x
e [a,b]
f(p) (x)
and let
fm
be the
Then for all
f.
we have the representation: - fmp) (x) =
rb m
T
1a
f(2m) (t)
all
m(x,t)dt,
ax
p = 0(1)2m-1.
296
BOUNDARY VALUE PROBLEMS
II.
We begin by showing that
Proof:
b
m(x)
=
f
J
(2m)
(t)Gm(x,t)dt
a
is a solution of the following boundary value problem: 0(2m)(x)
f(2m)(X)
= (2m-l)!
(15.5)
0(v)(a)
=
(u)(b) = 0,
µ = 0(1)m-l.
will then be the Green's function for the
Gm(x,t)/(2m-1)!
boundary value problem (cf., e.g. Coddington-Levinson 1955).
Since
Gm a
C2m-2
([a,bJ
x
[a,b],]R), it follows that
(2m-2)-times continuously differentiable on
is
[a,b].
We have o
(2m-2) (x) =
b r
2m 2
f(2m)(t)a
m (x,t)dt
ax
a
2m-2
f(2m) (t) axZm-Z G M(x,t)dt Jxa rb +
2m-2
f(2m)(t)Gm(x,t)dt.
x
For
x # t, g(x,t), gm(x,t), and hence
It follows that
arbitrarily often differentiable. 10
e
C2m-1([a,b],]R).
(2m-1)(x) _
Differentiation yields
(a f(2m)(t)a2(x,t)dt m + f
respect to
m(x,x-0)
a
m-7Gm(x,t)dt
(t) a
ax -
Since the
(x)
2m-1
b
Jx f(2m)
+
a2m-2
(21m)
Z
J
+
f
(2m)
2m-2 (x)
:xzmm(x,x+0).
(2m-2)-th partial derivative of x
is continuous in
integral terms remain.
are all
Gm(x,t)
x
and
Gm(x,t)
with
t, only the two
As above, it follows that
15.
0
297
Hermite interpolation and the Ritz method
e C2m([a,b],]R)
(2m)
and
m(x,t)dt + f(2m) (x)
ax f(2m)a
(x) =
ax
fax
-Gm(x,t)dt + Jb f(2m)(t)a-ax
a 2Zm=iGm(x,x-0) ax
f(2m)(x)3 2m2m1lGm(x,x+0).
-
axr
x
We have a2m-1 ax
J(2m-l)!
IM--_79 (x,t) = 0
a2m 2 g(x,t) =
ugm(x,t) ax
and
x >
for
x < t
is continuous in
and
x
t
for
u = 0(1)2m-1,
Combining all this, we obtain
= 0.
mm(x,t)
ax
t
x # t
am
all
and
for
0
for
a2m 1
a2m-1
-0)
-
a 2m l-m(x'x+0) _ (2m-1):
a
a2m
a XTM_Gm
(x,t) = 0,
x # t.
From this it follows that (2m)
(x) = (2m-l)!
In addition, it follows for tion of
Gm(x,t)
m
u = 0(1)m-1
from the construc-
that
0(u) (a) Thus
f(2m)(X)
=
0(u) (b) = 0.
is a solution of boundary value problem (15.5).
The
function (2m-l)![f(x)
- fm(x)]
is obviously also a solution of (15.5).
Since the boundary
value problem has a unique solution, it follows that f(x)
-
fm(x) _ (2m-l)! m(x) _
1
(2m-1
fb f(2m) (t)Gm(x,t)dt. Ja
298
BOUNDARY VALUE PROBLEMS
II.
Differentiating this and substituting the derivatives of O(x)
obtained farther above yields the conclusion.
Example 15.6:
g, gm, and
Gm
for
a
m = 1,2,3.
Case m = 1:
x-b-ab-t
gl(x,t) -
g(x,t) = (x-t)+,
(b-x4 t-a
for
x > t
(b-t)(x-a) b-a
for
x < t
G1(x,t) -
t-a
for x > t
b-t
for
Glx(x,t) _
GIx(x,x+0) _ X-a
Glx(x,x-0)
E --a
x < t
+ F = 1.
Case m = 2:
g(x,t) = (x-t)+ (b-t
2
x-a
2
(2(b-t)(x-a) + 3(b-a)(t-x)]
92(x,t)
(b-a)
b-x 2 t-a
2
[2(b-x)(t-a)+3(b-a)(x-t)] for x>t
(b-a)
G2(x,t) _
(b t)2 x- a 2
[2(b-t)(x-a)+3(b-a)(t-x)] for x
(b-a)
t-a
Z
2{-2(b-x)[2(b-x)(t-a)+3(b-a)(x-t)]
(b-a)
+ (b-x)2(3b-a-2t)}
Glx(x,t) =
for x > t
({2(x-a)[2(b-t)(x-a)+3(b-a)(t-x)]
(b-a) +
(x-a)2(3a-b-2t)}
for x < t
15.
299
Hermite interpolation and the Ritz method
(t-a) 2{2[2(b-x)(t-a)+3(b-a)(x-t)] (b-a)
4(b-x)(3b-a-2t)} for x > t G2xx(x,t)
(b-t 2{2(2(b-t)(x-a)+3(b-a)(t-x)] (b-a)
3
+ 4(x-a)(3a-b-2t)} for x < (t-a 2 6(3b-a-2t)
for
x >
for
x < t
t
t
(b-a)
G2xxv(x,t) = b-t
2
6(3a-b-2t)
(ba) G2xxx(x,x-0)
- G2xxx(x,x+0) = 6.
Case m = 3: g(x,t) = (x-t)+ (b-t)3(x-a 3 {5(b-a)(t-x)[2(b-a)(t-x)+3(x-a)(b-t)]
g3(x,t)
(b-a)
+ 6(x-a)2(b-t)2}
G3(x, t)
=
J (x-t)
5
- g3(x,t)
for
x > t
- g3(x,t)
for
x < t.
Theorem 15.4 immediately yields an estimate for the interpolation error with respect to the norm Theorem 15.7:
Let
cmu holds true.
2
r rl rl
m-1
Here
o
Gm(x,t)
tion 15.3 for the interval computed for small by Lehmann 1975.
Then the inequality
< cmu(b-a) 2m-u IIf(2m)I12,
IIf(1j)-fmu)II2
where
f e C2m([a,b],]R).
11-112-
m.
all in [-i
ax
u - 0(1)2m-1
t) ] 2 dx dt I
1/2
M1
is the function (0,1].
Gm
The constants
from Definicmu
can be
The values in Table 15.8 were obtained
2.24457822314E-5 2.77638992969E-4
4.27311575545E-1
4.24705992865E-3 5.37215309350E-2 1
4.21294644506E-11 5.08920680460E-3 5.92874650749E-2
4.45212852385E-4
4.87950036474E-2
4.14039335605E-1
2
3
4
5
6
7
2.38010180208E-6
3.19767674247E-7
6.56734371321E-5
7.27392967453E-3
4.08248290464E-1
7.175679561o6E-8
m=4
1
1.63169843917E-5
m=3
m = 1,2,3,4.
2.01633313311E-3
for
1.05409255339E-1
m=2
cmu
o
m=1
TABLE 15.8.
15.
301
Hermite interpolation and the Ritz method
Proof:
From Theorem 15.4 and an application of the Cauchy-
Schwarz Inequality we obtain
f(u) (x)
-
fmu) 1
(x)
_7
((2m-1)!)
12
(
b
b If(2m)(t)]2dt J [auGm(x,t)]2dt.
a
a ax
By integration, this becomes < cmu(b-a) Ja [f(2m)(t)32dt
Jalf(u)(x)-fmu)(x)12dx where cmu(b-a) =
T_ L
b b u 1/2 {J J (a uGm(x,t)]2dt dx} a a 2x
2m-1
Every interval can be mapped onto transformation.
by an affine
(0,1]
With that substitution, we get (b-a)2m-u
cm11(b-a) _
cm11(1)
Letting 1
1
1
cmu = amu(1)
1/2
u
(2m-1)! {Jo1o [aa u(x,t)]2dx dt}
yields the desired conclusion.
a
The polynomials of degree less than or equal to form a 2m-dimensional vector space. (l,x,...,x
2m-1 )
2m-1
The canonical basis
is very impractical for actual computations
with Hermite interpolation polynomials.
Therefore we will
define a new basis which is better suited to our purposes. Definition 15.9: space.
Basis of the 2m-dimensional polynomial
The conditions S(11),m(0) a,X
= 6112
6a6
(a,6 = 0,1,
u,R = 0(1)m-l)
302
BOUNDARY VALUE PROBLEMS
II.
define a basis
{SI
I
(x)
R = 0(l)m-1}
a = 0,1;
of the 2m-dimensional space of polynomials of degree less than or equal to
2m-1.
o
It is easily checked that the Hermite interpolation polynomial
f e Cm-1([a,bl,]R)
for a function
fm
has the
following representation: M-1
fm(x) =
I
(b-a)t[f(-')(a)SO,1,m(b-a) (15.10)
Z=0
This corresponds to the Lagrange interpolation formula for ordinary polynomial interpolation. Sa ,
R , m(x)
explicitly for
Table 15.11 gives the
m = 1,2,3.
In order to attain
great precision it is necessary to use Hermite interpolation formulas of high degree.
This can lead to the creation of
numerical instabilities.
To avoid this, we pass from global
interpolation over the interval
[a,b]
to piecewise inter-
polation with polynomials of lower degree. partitioning
[a,b]
intermediate points.
into
n
We do this by
subintervals, introducing
n-1
The interpolation function is pieced
together from the Hermite interpolation polynomials for each subinterval.
Theorem 15.12:
Let
m,n c IN,
f
e
Cm-1([a,b],]R)
and let
a = xo < x1 < ...... < xn-1 < xn - b be a partition of the interval and
i = 0(1)n-l
we define
[a,b].
For
x
E [xi,xi+1]
Hermite interpolation and the Ritz method
15.
TABLE 15.11:
S
m = 1,2,3.
for
(x)
m
Sa
m(x)
a
k
0
0
1
1-x
1
0
1
x
0
0
2
1
0
1
2
x - 2x2 + x3
1
0
2
1
1
2
0
0
3
1
0
1
3
x - 6x3 + 8x4 - 3x5
0
2
3
2
1
0
3
10x3
1
1
3
-
1
2
3
-x3
t
+ 2x3
3x2
-
3x2
1x2
2x3
-
x2 + x3
-
-
l0x3
+
-
3x3 + 3x4
_
6x5
1x5
$
Y
2
15x4 + 6x5
-
4x3 + 7x4 -
15x4
-
3x5
x4 + 2x5
M-1
x-x i
xi) [f (t) (xi)SO,R,m(
fm(x) =
303
kI0(xi+1
l
xi+1 xil
x-x. + f( )(xi+1)S1,L,m(x1+11xi)]
Then
fm
is the Hermite interpolation polynomial for
each subinterval
[xi,xi+l]
(cf. Theorem 15.1(1)).
(m-1)-times continuously differentiable on f
[a,b].
f
fm
is
Whenever
is actually 2m-times continuously differentiable on
the following inequalities hold:
on
[a,b],
304
II
f(u) _ fmu)II*
<
muh2m-u
II f(u) _ fmu)II 2 <
15.7.
cmu
and
cmu
We denote by
it f (2m)IIm
cmuh2m-u
II f (2m)II
h = max{xi+l-xi Here
BOUNDARY VALUE PROBLEMS
II.
i
I
2
= 0(1)n-l},
u = 0(1)2m-1.
are the constants from Theorems 15.1 and II'IIm
the norm obtained from
11-11_ by
considering only one-sided limits at the partitioning points
xi . The proof follows immediately from (15.10) and Theorems 15.1 and 15.7.
Our two-fold goal is to use global and piece-
wise Hermite Interpolation both for smoothing the approximation functions in two independent variables and for obtaining Therefore, we will
a special Ritz Method in two dimensions.
generalize global and piecewise Hermite Interpolation to two variables.
We follow the approach of Simonsen 1959 and
Stancu 1964 (cf. also Birkhoff-Schultz-Varga 1968). basic region, we choose
[0,1]
[0,1]
x
As our
instead of an arbit-
rary rectangle, thereby avoiding unnecessary complications in our presentation. Definition 15.13:
m c N.
Let
We define
H(m)
to be the
vector space generated by the set p,q polynomials of degree less than or equal I
to 2m-l}.
This space has dimension
Sa k m(x)SB,i m(y) constitute a basis.
o
4m2.
The functions
a,s = 0,1;
k,i = 0(1)m-1
15.
305
Hermite interpolation and the Ritz method
Remark 15.14:
f e H(m)
if and only if 2m-1 2m-1
f(x,y) =
a..xlyl. j=0
i=0
We can impose
conditions on the interpolation.
4m2
properties demanded of the
Sa
The
by Definition 15.9
2 m
require
exay(Sa,k,m(Y)SS,i,m(6)) a,s,y,6 = 0,1; If, instead of
uk6vk6 ay 6$6
u,v,k,k = 0(1)m-1.
H(m), we choose the set of polynomials in two
variables of degree less than or equal to becomes substantially more difficult. space of dimension
m(2m+1).
m = 1
Then we have a vector
At each of the four vertices of
the unit square we will need to give Even in the case
2m-1, the theory
m(2m+l)
conditions.
we run into difficulties, since we
cannot prescribe the four functional values at the four vertices.
We avoid this difficulty by choosing the interpolation
polynomials from Theorem 15.15:
H(m).
Foi each sequence
{Ca,O,k,t cIR
I
a,O = 0,1;
k,k = 0(1)m-l}
there exists exactly one polynomial akak
ax k ay
X
f(a,6) =
f(x,y) =
ca,a,k,k
1
m-1
E
E
a,s=O k,k=0
H(m)
a,s = 0,1;
such that
k,k = 0(1)m-1.
Ca,d,k,tSa,k,m(x)SB,R,m(Y)
See Remark 15.14 for the proof. special basis for
f e H(m)
Table 15.16 gives the
explicitly for
m = 1,2.
306
II.
Basis of
TABLE 15.16:
BOUNDARY VALUE PROBLEMS
H(m)
m = 1,2.
for
R
m
Sa,k,m(x)
SS'R'm(Y)
0
0
1
(1-x)
(1-y)
1
1
0
1
(I-X)
y
0
1
0
0
1
x
(1-y)
1
0
1
1
0
1
x
y
0
0
2
0
0
2
(1-3x2+2x3)
(1-3y2+2y 3)
0
0
2
0
1
2
(1-3x2+2x3)
(y-2y 2+Y3)
0
1
2
0
0
2
(x-2x2+x3)
(1-3y2+2)r 3)
0
1
2
0
1
2
(x-2x2+x3)
0
0
2
1
0
2
(1-3x2+2x3)
(3y2- 2y 3)
0
0
2
1
1
2
(1-3x2+2x3)
(-y2+y3)
0
1
2
1
0
2
(x-2x2+x3)
(3y2-2y3)
0
1
2
1
1
2
(x-2x2+x3)
(-Y2+Y3)
1
0
2
0
0
2
(3x2-2x3)
(1-3y2+2y3)
1
0
2
0
1
2
(3x2-2x3)
1
1
2
0
0
2
(-x2+x3)
(1-3y2+2y3)
1
1
2
0
1
2
(-x2+x3)
(Y-2Y2+Y3)
1
0
2
1
0
2
(3x2-2x3)
(3y2-2y3)
1
0
2
1
1
2
(3x2-2x3)
(-y2+y3)
1
1
2
1
0
2
(-x2+x3)
1
1
2
1
1
2
(-x2+x3)
a
k
m
0
0
1
0
0
1
.
'
'
(Y-2Y2+Y 3)
(Y-2y
2+Y3)
(3y2-2y3) (-Y2+Y3)
Hermite interpolation and the Ritz method
15.
307
The following theorem uses the Peano kernel to obtain a representation of the error in a Hermite interpolation. f c C4m([0,1] x
Let
Theorem 15.17:
[0,1],
We define
IR).
by the condition
fm c H(m) au+v
au+v
(a4) = axuayv m f
Then for
we have
u,v = 0(1)2m-1
u+2m
2m+v
u+v
ax"ayv(f-fm)II 2_mull azgymv
II
u,v = 0(1)m-1.
a,3 = 0,1;
xuayvf(a,6),
a
f II2 + cmv'I
axu a4m
+
Proof:
We first show
au+v
p,v = 0(1)m-1
for
f 112
ax
that
"+v axuayvfm(E,n)
axuayv
2m+v
1
axaayf(t'n)a m(E,t)dt
(2m-1)
o 1
+
(15.18)
au+2m
(
f(,,s)
0 ax ay
[(2m-1):]
av
4m
2ay
2
Vm(n,s)ds}
ax
flfl
Gm
cmucmvll
f(t,s)
00 ax
au
av
ax
ax
is the function from Definition 15.3 corresponding to the
interval
[0,1].
We begin by assuming that
f
sented as a product, f(x,y) = p(x)q(y), where C2m([0,1],IR).
Then
p
and
q
can be repre-
p,q c
can be approximated indivi-
dually by means of the one dimensional Hermite interpolation. By Theorem 15.4 it is true for all
p(I) M = p(u) ( gmv) (n)
=
)
q(v) (n)
rl p(2m)
1 -
-
(2 m
,n c [0,1]
1
2m11
0
rl
Jo
(t)
-1-11-Gm
ax
that
(&,t)dt
Im(n,s)ds. q(2m) (s) av ax
BOUNDARY VALUE PROBLEMS
II.
308
Multiplying the two equations together yields
pmu)
(Q qmv) (n) = p(u) (E)q(v) (n) 11
(2m
!
1 p(2m)(t)q(v)(n)a-11 J0
u mt)dt
ax
+
(1p(U)(E)q(2m)(s)a
G(n,s)ds}
ax"
0
(2m)
+
1111 p
((2m-1)00
(t) q
all
(2m) (s)
m(E,t)am(n,s)dtds.
ax
ax
By means of the identification
u+v a
ax"ayv
f(C,n) = p(u) Oq(v) (n)
u+v ax''ayvm(E,n)
=
(n)
mu)
we obtain the conclusion for the special case of p(x)q(y).
f(x,y) _
Since the formula is linear, it will also hold
for all arbitrary linear combinations of such products. includes all polynomials of arbitrarily high degree.
This
There
is a theorem in approximation theory which says that every function
f
satisfying the differentiability conditions of
the hypotheses (including the 4m-th order partial derivatives), can be approximated uniformly by a sequence of polynomials. Therefore Equation (15.18) also holds for arbitrary f e C4m([0,1]
x
[0,1],IR).
The Schwarz integral inequality,
applied to (15.18), immediately yields the conclusion.
o
We now pass from global Hermite interpolation on the unit square to piecewise Hermits interpolation.
For this we
allow regions whose closure is the union of closed squares ap
with sides parallel to the axes and of length
require that the intersection
op n oo
(p # a)
h.
We
be either
empty or consist of exactly one vertex or exactly one side.
15.
309
Hermite interpolation and the Ritz method
As in the one dimensional case, we construct the interpolation function by piecing together the Hermite interpolation polynomials for the various squares
All of the con-
op.
cepts and formulas developed for the unit square carry over immediately to squares
aP
Theorem 15.19:
be a region in
Let
properties and let
G
with sides of length 1R2
with the above
We define
f c C4m(G,]R).
h.
fm:G
by
IR
the two conditions: op
restricted to
fm
(2)
At the vertices of square a''+v
uayvf(x'Y),
ax is
fm
For
u+v 2
<
Proof:
mu
it is true that
2m+v
cmuh2m-
cmvh2m-
II2 +
II
G.
is arbitrarily often
u,v = 0(1)2m-1
+ cmycmvh c
u,v = 0(1)m-1.
(m-l)-times continuously differentiable in
differentiable.
Here the
it is true that
ax
In the interiors of the squares, fm
II ax ayv(f fm)II
oP
au+v
uay vfmx.Y) = Then
H(M).
is a polynomial in
(1)
4m-u-v
a2m ax---2_m
u+2m II axj
f II2
f II2 .
II
are the constants from Theorem 15.7.
Along the lines joining two neighboring vertices, fm
and the partial derivatives of already determined by the values vertices of the given side.
fm
through order
m-i
a" vf(x,y)/ax"ay"
at the two
From this it follows that
(m-l)-times continuously differentiable in
are
fm
is
G.
With the aid of Theorem 15.7, the inequality of Theorem 15.17 can immediately be carried over to the case of a square with sides parallel to the axes and of length
h:
310
{cmuh2m-uIf. (a2may
[apaY(f-fm)] 2dxdy <
Jo
BOUNDARY VALUE PROBLEMS
II.
f)2dxdy,1/2
P
P
ch2m-v1I +
mv
+ c
mucmy
(aXa2mf)2dxdy]1/2 Y
h4m-upv((
2ma2mf 2dxd ]1/2}2 o (3x y ) y P
Summing over
p
and applying the Minkowski inequality for
sums (cf. e.g. Beckenbach-Bellmann 1971, Ch. 1.20) gives the inequality for the norms.
a
In the sequel, we will explain how the global and
piecewise Hermite interpolation polynomials can be used as initial functions for the Ritz method (cf. Chapter 14).
We
first consider the one-dimensional problem I(u) = min{I(w)
I
w e W}
where 1
1(w) = J [a(x)w'(x)2 + 2Q(x,w(x))}dx 0
W = {w E C2([0,11, ]R)
I
w(0) = w(l) = 0}.
An actual execution of the Ritz method requires us to choose a finite dimensional subspace
Vn
of
W, and a particular
basis for this subspace.
Example 15.20:
Basis of a one-dimensional Ritz method with
global Hermite interpolation.
{Sa,m(x) as a basis.
I
The functions
m >
Let
a = 0,1;
2,
space generated has dimension
x = 0
2m-2.
Choose
= l(1)m-l}
S a,0,m(x)
they do not vanish at the points
2.
are discarded because and a
x = 1.
The
15.
Hermite interpolation and the Ritz method
Example 15.21:
Basis of a one dimensional Ritz method with
piecewise Hermite interpolation. For
b eJR
311
and
Let
n e 1N
and
h = 1/n.
R = 0(1)m-1, define
Tb,R,m(x)
The restrictions of the m(n+l)-2
Tb,R,m(x)
to
yield
[0,1]
basis functions, for the following combination of
indices:
b = 0,1:
R = 1(1)m-1
b = h,2h,...,(n-1)h: R = 0(1)m-1. On
[0,1], the basis functions are (m-l)-times continuously
differentiable and only differ from zero on one or two of the subintervals of the partition
0 < h < 2h < of the interval
The 15.22.
... < (n-l)h < nh = 1 They all vanish at
[0,1].
Tb,R,m(x)
are given for
and
x = 0
m = 1,2,3
in Table
They are graphed in Figures 15.23 and 15.24.
a
We will now discuss the two-dimensional variational problem with I[w] = If [a 1 W2 + a 2w
2
+ 2Q(x,y,w)]dxdy
G
W = {w a C2(G,IR) n C°(G, IR)
I
w(x,y) = 0 for (x,y) a 8G}.
This is the same problem as in Chapter 14, with the same conditions on
al, a2, and
Q.
For an actual execution of
312
BOUNDARY VALUE PROBLEMS
II.
TABLE 15.22:
Tb
for
R m(x)
for
m = 1,2,3.
R
m
0
1
0
2
1
2
h sgn(x-b)(z-2z2+z3)
10
3
1
1
3
2
3
For
Ix-bi
T
b,R,m
(x)
1
Ix-bi
1
-
-
3z2 + 2z3
z
Ix-bl
z
l0z3 + 15z4
-
< h,
-
6z5
h sgn(x-b)(z-6z 3+8z4_3z5) h22
(z2-3z3+3z4-z5)
> h, Tb,R,m(x) = 0.
the Ritz method, we must choose a finite dimensional subspace
Vn
W
of
Example 15.25:
and a particular basis of this subspace.
Basis of a two-dimensional Ritz method with
global Hermite interpolation. m > 2.
Let
G =
[0,1]
x
[0,1]
and
Choose
{Sa k m(x)SS,R,m(Y) as a basis.
I
a,8 = 0,1; k,R = 1(1)m-l}
The space generated has dimension
Example 15.26:
4(m-1)2.
0
Basis of a two-dimensional Ritz method with
piecewise Hermite interpolation.
Let
G
be a region in
]R2
which satisfies the subdivision properties of Theorem 15.19.
15.
Hermite interpolation and the Ritz method
m = 1 1,0
m =
b-h
b
b+h
b-h
b
b+h
2
0,15
b+h
b-h
Figure 1 5 . 2 3 :
Tb
k
m(x)/hx
for
m = 1,2
313
314
II.
m =
BOUNDARY VALUE PROBLEMS
3
1,0
Tb,0,3 b-h
Figure 15.24:
b
Tb
b+h
m(x)/h91
i
for
m = 3
15.
Hermite interpolation and the Ritz method
Let
E
op
315
denote the set of all vertices of all of the squares
of the partition.
Further, let
m,n c N, and
h = 1/n.
We define the basis functions to be the restrictions of the functions
Tb,k,m(x)Tb'R,m(Y)
G
to
for the following combination of indices:
(b,b) e E
fl
G
e E
f1
2G
(b,b)
k,2 = 0(1)m-1;
and and
k > 1
whenever
(b,b+h)
c DG
or
(b,b-h)
c 8G
> 1
whenever
(b+h,b)
a 8G
or
(b-h,b)
E 8G.
P.
The basis functions belong to
Cm-1(G,IR).
They vanish on
m = 1, we obtain the basis already discussed in
For
3G.
k,t = 0(1)m-1, but
Example 14.10, if the subdivision assumed there agrees with the one prescribed here.
Thus piecewise Hermite interpola-
tion supplies a generalization of this method to The matrix
A = (auv)
m > 1.
of the Ritz method for the
basis chosen here can be given explicitly for the special case
Q(x,y,z) = q(x,y)z + 1-a(x,y)z2, a(x,y) > 0 for (x,y)
c G
by:
auv =
[al(x,Y)Tb,k'm(x)Tb,R,m(Y)Tb*k*,m(x)Tb*'R*,m(Y) +
a(x,Y)Tb,k,m(x)Tb,.,,m(Y)Tb*,k*,m(x)Tb*,R*'m(Y)]dxdy. In this case
Tb,k,m(x)Tb,t,m(Y) is the
u-th basis function, and
316
II.
BOUNDARY VALUE PROBLEMS
Tb*,k*,m(x)Tb*,IC*,m(Y)
v-th basis function.
is the
11 (b,b)
-
The integrals vanish whenever
(b*,b*)II 2
>
Therefore, each row and column of matrix elements which differ from zero.
9m2
A
has at most
At most four squares
contribute to the integrals.
Theorem 14.18 supplies the inequalities
Y211u-wl'I < Y2I!u-w*III.
11u-w112
Here we have u
solution of the variation problem in space
w
Ritz approximation from space
W
V u
w*
arbitrary functions from
:
Vu.
We have the additional inequality: II u-w*II 2
max
<
_ a 1(x,Y)II (u-w*) xI122
I - (x,y)EG max
+
(x,y)c max
+
aI(x,Y)lI (u-w*) y112
_ a(x,Y) IIu-w*II2. 2
(x,y)EG
In our problem we can apply Theorem 15.19 to choose
w*
so
that
IIu-w*112 < Moh2m
II(u-w*) 12 < II (u-w*)YII2 < The numbers of
u
M0, M1, and
and on
Mlh2m-1
M2h2m-1
M2
m, but not on
depend only on the derivatives h.
Altogether it follows that
the Ritz method has convergence order
0(h2m-1 ):
16. Collocation methods and boundary integral methods
317
1Iu-w1I2 < Mh2m 1. In many practical cases it has been observed that the convergence order is actually
O(h2m).
The explanation for this
behavior can usually be found in the fact that the error must be a function of 16.
h2
for reasons of symmetry.
o
Collocation methods and boundary integral methods Collocation methods are based on a very simple con-
cept, which we will explain with the aid of the following boundary value problem (cf. Problem 13.2): Lu(x,y) = q(x,y),
(x,y)
e G
u(x,Y) = (x,Y),
(x,Y)
e r.
(16.1)
Once again, G
is a bounded region with boundary
r
and
L
is a linear, uniformly elliptical, second order differential operator of the form Lu = -alluxx - 2a12uxy - b
1
u
x
- b 2u y
a22uyy
+ gu
where all, a12, a22, b1, b2, g e C_(G, IR) and
q e C°(G, IR), Further it is true for all
a11a22 - a12 >
C° (r, IR) that
(x,y) e
0,
all >
0,
g > 0.
The execution of the method presupposes that we are given: (1) j
n
= 1(1)n, from
linearly independent basis functions C2(G,IR).
vj,
318
n
(2)
different collocation points
of these, the first ing
BOUNDARY VALUE PROBLEMS
II.
n2 = n - n1
are to belong to
n1
are to lie in
The solution
(xk,yk) e G;
G, and the remain-
r.
of boundary value problem (16.1)
u
will now be approximated by a linear combination functions
w
of the
vj, where we impose the following conditions on w: k = 1(1)n1
Lw(xk,yk) = q(xk,yk),
(16.2)
k = nl + 1(1)n.
w(xk,yk) _ (xk,yk), In view of the fact that n
w(x,y) =
cj e IR
E cjv.(x,y), j=1
the substitute problem (16.2) is concerned with the system of linear equations: n
k = 1(1)n
8k,
akj =
Lvj(xk,yk)
for
k < nl
vj(xk,yk)
for
k > nl
q(xk,yk)
for
k < nl
(16.3)
$I,
for k > n
(xk'yk)
In many actual applications, the system of equations can be simplified considerably by a judicious choice of the
vj.
It is often possible to arrange matters so that either the
differential equation or the boundary conditions are satisfied exactly by the functions (A)
Boundary collocation: Lvj(x,y) = 0,
All
(xk,yk)
must lie in
We distinguish:
vj.
We have
j
q _ 0
and
= 1(1)n, (x,y) E
r, i.e. nl = 0, n2 = n.
G.
16. Collocation methods and boundary integral methods
(B)
Interior collocation:
We have
j
vj (x, Y) = 0 , All
(xk,yk)
must lie in
i ° 0
319
and
1 (1) n, (x,y) c r.
=
G, i.e. nl = n, n2 = 0.
The system of equations (16.3) does not always have a unique solution. rarily large.
When it does, the solution can be arbit-
A priori conclusions about the error
u-w
only be drawn on the basis of very special hypotheses. is the weakness of collocation methods.
can
This
It is therefore
essential to estimate the error a posteriori.
Nevertheless,
collocation methods with a posteriori error estimation frequently are superior to all other methods with respect to effort and accuracy.
Error estimates can be carried out in the norm as explained in Section 14 (cf. Theorem 14.19).
However,
this seems unduly complicated in comparison with the simplicity of collocation methods.
Therefore, one usually premonotone principles.
fers to estimate errors with the aid of
We wish to explain these estimates for the cases of boundary collocation and interior collocation. c = u-w,
r = q-Lw,
To this end, let = iy-w.
Then we have Lc(x,y) = r(x,y),
(x,y)
c G
c(x,Y) = O(x,Y),
(x,Y)
c r.
(16.4)
(A)
For boundary collocation, we have
r(x,y) = 0.
It
follows from the maximum-minimum principle (cf. Theorem 12.3) that for all
(x,y) c G:
320
BOUNDARY VALUE PROBLEMS
II.
min max {q(x,y),0} < e(x,y) < (x,y)Er (x,y)er
Thus it suffices to derive estimates for . We assume that r
consists of only finitely many twice continuously dif-
ferentiable curves
rR.
Each arc then has a parametric
representation
t E [0,1].
(x,y) = [c1(t),F,2(t) ], We set
fi(t)
= wl(t),E2(t))
finitely many points h = 1/m.
tj
= jh,
and compute j
= 0(1)m, where
Then it is obviously true for all < h max - 2 te[0,1]
min j=0(1)m
for the
;
t e
m c 1N [0,1]
and that
'(t)
can be interpolated linearly between the points
tj.
The
interpolation error will be at most h2 4
max te[0,1]
"(t)
Combining this and letting dl = Zh
max tE[0,1)
we have, either for min
m(x,y) =
For small
or for
min
fi(t)
v = 2, that >
ta[0,1]
c(x,y) =
max
;(t) <
te[0,1]
(x,y)er91
max
h2
te[0,1]
v = 1
(x,y)erx
max
d2 =
h, coarse estimates for
min j=0(1)m
_0(t.-) -d
max ;(tj)+dv j=0(1)m or
4"
suffice.
Since v
(v)(t)
n
v
dtv vj(1(t),C2(t))
16. Collocation methods and b2undary integral methods
the quantities (B)
cj,
j
321
= l(l)n, are the deciding factors.
For interior collocation, we have
Lemma 13.18, there exists a
4(x,y) = 0.
u e C2(G,IR)
By
such that
Lw(x,y) > 1,
(x,y)
e G
W(x,Y) > 0,
(X,Y)
E G.
We set
max _I r(x,y) I
=
(X,Y) EG
II r II
and obtain
L(e+aw) (x,Y) > (e+Rw)(x,Y)
>
0,
(x, Y) E G
0,
(X,Y)
E F.
It follows from the monotone principle that (E+aW)(x,Y) > 0,
(x,y)
E
E(x,Y) > -aW(x,Y),
(x,y)
E G.
L(E-aw)(x,y) < 0,
(x,y)
e G
(e-aw)(x,Y) < 0,
(X,Y)
E
E(X,Y) < aw(x,Y),
(X,Y)
E
Analogously, one obtains
P
Combining this leads to
IIEII _ aIIWII, . Thus the computation of the error is reduced to a computation of n
max Iq(x,y) (x,y)EG
-
c.Lv.(x,y)I. j=1 >
II.
322
BOUNDARY VALUE PROBLEMS
We next want to consider three examples of boundary collocaIn all cases the differential equation will
tion in detail. be
ou(x,y) = 0. Example 16.5:
Let
We use the polar
be the unit circle.
G
coordinates
x = r cos t y = r sin t
r E [0,1], t
E
[0,2'R)
and let
vj(x,y) = rj-lexp[i(j-1)t],
j
= 1(l)n
h = 2,r/n (xk,yk) = (cos[(k-1)h], sin[(k-1)h]), Since the functions ents
cj
vj
k = l(l)n.
are complex-valued, the coeffici-
will also be complex.
Naturally, one can split the
entire system into real and imaginary parts.
Then this ap-
proach fits in with the general considerations above.
The
system of equations (16.3) can be solved with the aid of a fast Fourier transform Example 16.6:
Let
G
(cf. Stoer-Bulirsch 1980).
be the annulus
G = {(x,Y) E I R 2
For even
n = 2n-2
o
x2+y2 1
0 < rl <
<
r2}.
we set
vl(x,y) = log r
vj(x,y) = r3-nexp[i(j-n)t),
j = 2(1)n
h = 4n/n
(xk,yk) = (rlcos[(k-1)h], rlsin[(k-1)h]), k = 1(1)n-1
16. Collocation methods and boundary integral methods
(xk,yk) = (r2cos[(k-n)h), r2sin((k-n)h]), All functions fi(l)n
vj
k = n(l)n.
are bounded on the annulus.
For
j
=
they correspond to the basis functions of the previous One cannot dispense with the functions
example.
because the region is not simply connected.
cannot be approximated by
vn_1
323
v1
through
vn
vl,"',vn-1 through One com-
vn.
putes, e.g., that 12Tr
(vl(r2cos t, r2sin t) - vl(rlcos t, rlsin t)]dt n
r
r = 21log (7) > 0
2n
)I
[vj(r2cos t, r2sin t) - vj(rlcos t, rlsin t)]dt = 0,
f0
= 2(1)n.
j
This example shows that in each case a thorough theoretical examination of the problem is essential. Example 16.7: technology.
This example has its origins in nuclear It has to do with flow through a porous body.
Let the space coordinates be
(x,y,s).
pressure, u, does not depend on is given by
Au(x,y) = 0.
s.
= = {(x,Y) a IR
u,v = {(x,Y) r1
and
r2
a IR
2
We assume that the
Then a good approximation
Cylindrical channels are bored
through the body, parallel to the
Here
a
s-axis:
(x-2p)2 + (y-2v)2 < r2)
U,v c ZZ. 2
(x-2p-1)2 +
(y-2,v
_1)2 < r2}
are fixed numbers with
0 < r1 < 1 ,
0 < r2 < 1, rl+r2 < /.
Figure 16.8 depicts a section of this region for
r2 = 1/2.
r1 = 1/4
and
324
II.
Figure 16.8:
A region in Example 16.7
In each of the channels
while in each of the
I
Jµ2v
rl
there is a pressure of
µ ,v
there is a pressure of
flow thus goes from channels
monotonically with
BOUNDARY VALUE PROBLEMS
and
I
r2.
to channels
1,
-1.
The
and increases
J
Using the symmetries, one
can reduce the problem to a restricted region
G
Figure 16.9).
or
On the solid lines, u(x,y) =
1
on the dashed lines, the normal derivative of
u
(cf.
u(x,y) = -1; is zero.
In this form, the problem can be solved with a difference method.
The exact solution
problem is doubly periodic.
u(x,y)
of the boundary value
For if
(x,y)
lies in the region
between the channels, we have u(x+2,y) = u(x,y+2) = u(x,y).
Collocation methods and boundary integral methods
16.
Au
0
325
0
an =
u = x
au = 0
an
Figure 16.9:
A region in Example 16.7.
Therefore it is natural to approximate the simplest doubly periodic functions (cf.
with periods
functions, the Weierstrass P-
Magnus-Oberhettinger-Soni 1966, Ch. 10.5).
z = x+iy.
Let
with the help of
u
We denote the Weierstrass 2
and
2i
by
p(z).
P-function
The function is meromor-
phic, with a pole of second order at the points and with a zero of second order at
2u + 2vi
2u+l + (2v+l)i, where
u,v E. The poles and zeros thus are at the centers of the channels.
Therefore one can choose the basis functions
v. J
for the collocation method from the following set: 1,
log1p(z)I,
Re[p(z)J], Im(p(z)J],
j
ell
Because of all the symmetries, the set 1,
suffices.
loglp(z)21, Re [P(z)2j]
We use the trial function
w(x,Y) = Y loglp(z)2I +
c.Re(p(z)2J)
j=-R
J
-
{o}.
326
BOUNDARY VALUE PROBLEMS
II.
and must determine the
n = 29+2
unknowns
Y, c-9' C_ X+11- .'C L'
n collocation points; k+l
To do this, we use the ary of
Io o
and
on the boundary of
k+l
J
on the bound-
oo, '
rlexp((k-1))
for
k = 1(1)i+l
l+i+r2exp(1(k-k-2))
for
k = 9+2(1)n.
xk+iyk =
The linear system of equations is 1
for
k = 1(1)9+1
-1
for
k = 9+2(1)n.
_
w(xk,yk)
Table 16.10 gives the values of of
r1
and
r2
10-3/10-4/10-5.
9
for several combinations
necessary to obtain a precision of When the difference
/2-rl-r2
is not too
small, one can obtain relatively high precision even with small
L.
The flow depends only on
the computed values
y
for
y.
i = 1,3,5,7.
Table 16.11 contains
The computing time
to obtain the solution of the linear system is small in all of these cases compared to the effort required to estimate the error.
The efficiency of a collocation method usually is
highly dependent on the choice of collocation points.
The
situation is reminiscent of polynomial interpolation or numerical quadrature.
Only there exists much less research into
the optimal choice of support points for collocation methods. The least squares method is a modification of the collocation method in which the choice of collocation points is not quite as critical.
basis functions
vi
In these procedures, one chooses and
n > m
collocation points
m (xk,yk)'
16.
Collocation methods and boundary integral methods
TABLE 16.10:
r2
r
R1/R2/R3
for accuracies of
1/8
3/8
5/8
7/8
10-3/10-4/10-5
1
1/8
1/1/1
1/1/2
2/2/3
3/4/5
3/8
1/1/2
1/1/2
2/2/3
4/6/(9?)
5/8
2/2/3
2/2/3
3/4/(9?)
7/8
3/4/5
4/6/(9?)
TABLE 16.11:
r2
r
1/8
y
for
k = 1,3,5,7
3/8
5/8
7/8
1
1/8
0.13823 0.13823 0.13823 0.13823
0.19853 0.19853 0.19853 0.19853
0.25011 0.25003 0.25003 0.25003
0.32128 0.31853 0.31852 0.31852
3/8
0.19853 0.19853 0.19853 0.19853
0.35218 0.35217 0.35217 0.35217
0.55542 0.55495 0.55495 0.55495
1.10252 1.06622 1.06599 1.06599
5/8
0.25011 0.25003 0.25003 0.25003
0.55542 0.55495 0.55495 0.55495
1.34168 1.32209 1.32207 1.32208
7/8
0.32128 0.31853 0.31852 0.31852
1.10252 1.06622 1.06599 1.06599
327
328
II.
of these are to lie in
Again, n1
BOUNDARY VALUE PROBLEMS
n2
G, and
in
r.
Condi-
tion (16.2) is replaced by: 11
6k(Lw(xk,Yk)
-
q(xk,Yk)l2
k=1
(16.12) n
6k(w(xk2Yk)
+
-
1p(xk,Yk)l2 = Mini
k=n1+1
Here the
6k >
are given weights and
0
m
w(x,y) = jIlcjvj(x,Y) Because of these conditions, the coefficients
cj,
j
= 1(1)m,
can be computed as usual with balancing calculations (cf. StoerBulirsch 1980, Chapter 4.8).
Only with an explicit case at
hand is it possible to decide if the additional effort (relative to simple collocation) is worthwhile.
For
n = m, one
simply obtains the old procedure.
Occasionally there have been attempts to replace condition (16.12) with max{
6kILw(xk,Yk) max k=1(1)n1
- q(xk,Yk)I,
max 6klw(xk,Yk) k=n1+1(1)n
- *(xk,Yk)I} = Min!
(minimization in the Chebyshev sense).
Experience has demon-
strated that this increases the computational effort tremenConsequently, any advantages with respect to the pre-
dously.
cision attainable become relatively minor.
We next discuss a boundary integral method for solving Problem (16.1), with region
G
unit disk
L = A, q = 0, and
0 e C1(r,IR).
The
is to be a simply-connected subset of the closed IzI
<
1,
z
e 4, with a continuously differentiable
16. Collocation methods and boundary integral methods
boundary
r.
329
The procedure we are about to describe repre-
sents only one of several possibilities.
be a parametrization of
C e Cl([0,2T],r)
Let
without double points and with
(O) =
and
C(27T)
r
i1 +
2 >
Consider the trial function
(2n u(z) =
z
J
(16.13)
c G.
0
If
p
is continuous, u c C0(G,IR)
(cf. e.g. Kellog 1929).
By differentiating, one shows in addition that monic in
G.
is har-
u
The boundary condition yields 2n
p(t)logjz-C(t)jdt = (z),
z
e F.
(16.14)
0
This is a linear Fredholm integral equation of the first kind with a weakly singular kernel. determined solution
p
There exists a uniquely The numeri-
(cf. e.g. Jaswon 1963).
cal method uses (16.14) to obtain first an approximation of
p
at the discrete points
tj
= 27r(j-1)/n,
j
Next (16.13) is used to obtain an approximation u(z)
for arbitrary
z
u
= 1(1)n. u(z)
of
c G.
The algorithm can be split into two parts, one dependent only on
r
and
E, and the other only on .
(A)
Boundary dependent part:
(1)
Computation of the weight matrix
W = (wjk)
quadrature formulas JTrf(t)loglzj-C(t)Idt °
zj = C(tj),
= k11wjkf(tk) + R(f)
j = 1(1)n
for
n
0.
330
BOUNDARY VALUE PROBLEMS
II.
R(fv) = 0
The matrix
fv(t) =
for
11
v = 1
cos(2 t)
v = 2(2)n
sin(t)
v = 3(2)n.
Therefore
is regular.
(fv(tj))
W
is uniquely
determined.
Most of the computation is devoted to determin-
ing the
integrals
n2 f2s
fv(t)log;z;-E(t)ldt,
v,j = 1(1)n.
1-
Triangulation of
(2)
algorithm or into
W
into
W = QR
W = LU
using the Gauss
using the Householder transforma-
tions. (B)
Boundary value dependent part:
(1)
Computation of
u(tk)
from the system of equations
n
wjku(tk) = (zj),
kIl
Since
W = LU
W = QR, only
or
j
O(n2)
= l(1)n.
operations are re-
quired for this.
Computation of
(2)
u(z)
integrand is a continuous
for
z e G
from (16.13).
2n-periodic function.
The
It seems
natural to use a simple inscribed trapezoid rule with partition points
tj,
j
u(z) =
= 1(1)n:
2n
(16.15)
1u(tk)log1z-E(tk)I.
k= If
z
does not lie in the vicinity of
yields good approximations for For boundary-close
z,
r,
(16.15) actually
u(z).
-loglz - g(t)I
extremely large on a small part of the interval (16.15) is useless.
becomes [0,27T].
Then
The following procedure improves the re-
sults by several decimal places in many cases.
But even this
16. Collocation methods and boundary integral methods
331
approach fails when the distances from the boundary are very small. Let
A(t)
boundary values uc(z) = c +
be that function Then, for
i ° 1.
which results from
u(t)
c e1R,
n
2n
(16.16)
I
k=1
are also approximations to
u(z).
It is best to choose
c
so that u(tR)
whenever a(t)
- ca(tR) = 0
is minimal.
Since the computation of
can proceed independently of the boundary values ,
the effort in (16.15) is about the same as in (16.16). each functional value operations.
one needs
u(z)
0(n)
For
arithmetic
The method is thus economical when only a few
functional values
are to be computed.
u(z)
In the following example, we present some numerical results: ,P(z)
= Re[exp(z)] = exp(x)cos(y)
al(t) = 0.2 cos(t) + 0.3 cos(2t)
-
0.3
E2(t) = 0.7[0.5 sin(t-0.2)+0.2 sin(2t)-0.l sin(4t)] + 0.1.
The region in question is the asymmetrically concave one shown in Figure 16.17.
The approximation
u
was computed on
the rays 1, 2, and 3 leading from the origin to the points E(0), l;(n), and
points.
E(5ii/3).
R
is the distance to the named
Table 16.18 contains the absolute error resulting
from the use of formula (16.15) (without boundary correc-
tion); Table 16.19 gives the corresponding values obtained
332
II.
BOUNDARY VALUE PROBLEMS
from formula (16.16) (with boundary correction).
We note
that the method has no definitive convergence order.
FIgure 16.17.
Asymetrically concave region
n
n
1.9E-2 3.3E-3
8.3E-4
2.SE-3
1.9E-7
1.1E-10
96
5.0E-10 2.2E-7
3.3E-6
9.8E-5
2.4E-5
1.1E-6
1.7E-12
4.7E-12
1.3E-12
4.0E-5
4.4E-8
2.6E-7
1.2E-7
2.8E-4
9.6E-4
2.0E-3 3.4E-4
5.1E-4
4.6E-3
7.4E-3 1.3E-2
9.4E-3 7.5E-5
1/128 1/32
2.4E-5
1/8
1.4E-2
1/128
1.SE-4
1/32
Ray 3
2.1E-2
1/8
Ray 2
Absolute error when computing with boundary correction
3.SE-6
4.6E-12
96
2.4E-9
5.4E-7
48
4.0E-4
4.7E-3
4.3E-5
1.6E-4
1.1E-4
24
1/128
1.9E-6
3.9E-3
3.1E-3
12
Ray 1
1/32
TABLE 16.19:
2.4E-12
Absolute error when computing without boundary correction
1/8
R
TABLE 16.18:
6.7E-2
1.4E-4
2.2E-3
1.9E-7
7.0E-6
1.8E-2
3.0E-5
5.5E-7
48
1.0E-6
1.1E-2
S.SE-3
1.9E-1
6.9E-2
2.5E-2
7.0E-3
2.2E-4
1.3E-2
1.2E-4
8.1E-2
4.3E-3
7.SE-4 2.2E-5
9.8E-3
2.8E-1
5.5E-2
2.6E-2
24
1/128
12
1/32
1/8
1/128
1/32
1/8
1/128
1/32
R
1/8
Ray 3
Ray 2
Ray 1
U4 LA
w
PART III. SOLVING SYSTEMS OF EQUATIONS
17.
Iterative methods for solving systems of linear and nonlinear equations When we discretize boundary value problems for linear
(nonlinear) elliptic differential equations, we usually ob-
tain systems of linear (nonlinear) equations with a great many unknowns.
The same holds true for the implicit discreti-
zation of initial boundary value problems for parabolic differential equations.
For all practical purposes, the utility
of such a discretization is highly dependent on the effectiveness of the methods for solving systems of equations. In the case of systems of linear equations, one distinguishes between direct and iterative methods.
Aside from
rounding errors, the direct methods lead to an exact solution in finitely many steps (e.g. Gauss algorithm, Cholesky method, reduction method).
Iterative methods construct a
sequence of approximations, which converge to the exact solution (e.g. total step method, single step method, overrelaxation method).
These are ordinarily much simpler to
program than the direct methods.
334
In addition, rounding errors
17.
335
Iterative methods
play almost no role.
However, in contrast to direct methods
fitted to the problem (e.g. reduction methods), they require so much computing time that their use can only be defended when the demands for precision are quite modest.
When using
direct methods, one must remain alert to the fact that minimally different variants of a method can have entirely different susceptibilities to rounding errors.
We have only iterative methods for solving systems of non-linear equations.
Newton's method (together with a few
variants) occupies a special position. only a few iterations.
It usually requires
At each stage, we have to solve a
system of linear equations.
Experience shows that a quick
direct method for solving the linear system is a. necessary adjunct to Newton's method.
An iterative method for solving
the linear equations arising in a Newton's method is not to be recommended.
It is preferable instead to apply an itera-
tive method directly to the original non-linear system.
The
Newton's method/direct method combination stands to nonlinear'systems as direct methods to linear systems.
However,
the application is limited by the fact that frequently the linear systems arising at the steps of Newton's method are too complicated for the fast direct methods.
This section will serve as an introduction to the general theory of nonlinear iterative methods.
A complete treat-
ment may be found, e.g., in Ortega-Rheinboldt 1970. In the following two sections, we examine overrelaxation methods (SOR) for systems of linear and nonlinear equations.
After that, we consider direct methods. Let
F : G c
to find a zero
1n y 1n
x* e G
of
be a continuous function.
We want
F, i.e. a solution of the equation
336
SOLVING SYSTE'1S OF EQUATIONS
111.
F(x)
lying in
=
(17.1)
0
In functional analysis, one obtains a number of
G.
sufficient conditions for the existence of such a zero. Therefore, we will frequently assume that a zero
x* E G
exists, and that there exists a neighborhood of F
We further demand that
has no other zeros.
in which
x* G
be an
open set.
Iterative methods for determining a zero of
F
are
based on a reformulation of (17.1) as an equivalent fixed point problem, x = T(x),
so that
x*
point of
is a zero of
T.
T(x(v-1)),
=
One expects the sequence if the initial point
proximation to case.
exactly when
F
is a fixed
x*
Then we set up the following iteration: x(")
x*
(17.2)
x*.
{x(v)
x(0)
I
v = 1(1)-. v = 0(1)oo}
(17.3)
to converge to
is a sufficiently close ap-
But this is by no means true in every
In addition to the question of convergence of the
sequence, we naturally must give due consideration to the speed of the convergence, and to the simplicity, or lack thereof, of computing
T.
Before we begin a closer theoreti-
cal examination of these matters, we want to transform Equation (17.1) into the equivalent fixed point problem for a special case which frequently arises in practice. Suppose that the mapping
Example 17.4:
into a sum, F(x) = R(x) + S(x), in which dependent on
x
and
R
can be split
F S
is only "weakly"
is constructively invertible.
By
the latter we mean that there exists an algorithm which is
337
Iterative methods
17.
realizable with respect to computing time, memory storage demand, and rounding error sensitivity, and for which the R(y) = b
equation
neighborhood of R
can be solved for all
-S(x*).
in a certain
b
Such is the case, for example, when
is a linear map given by a nonsingular diagonal matrix or
by a tridiagonal symmetric and positive definite matrix. we set
then equation
T = R- lo(-S)
to the fixed point problem fore
F(x) = 0
x = T(x).
When
is equivalent S, and there-
also, depends only weakly on the point, one can ex-
T
pect the iterative method (17.3) to converge to
x*
ficiently close approximations
o
Definition 1 7 . 5 :
Let
x(0)
T : G c IRn -' 1R n
a fixed point of
x* e G
sequence (17.3) for
x(0)
The fixed point
x*.
an interior point of
x*.
I(T,x*)
of
II
II
T
in
1R'
A point
y e G
x*, if the and converges
G
is called attractive if it is
x*
The iteration (17.3) is
I(T,x*). x*
The mapping
is attractive. a e
is called contracting if there exists an
norm
for suf-
be a mapping and
remains in
= y
called locally convergent if T
of
T, i.e. T(x*) = x*.
belongs to the attractive region
to
If
[0,1)
and a
such that
IIT(x) - T(Y)IIT < allx-YIIT
x,y e G.
0
Every contraction mapping is obviously continuous. Theorem 17.6:
Let
T:G c IRn
-
iRn
be a contraction mapping.
Then it is true that: (1)
T
has at most one fixed point
x* c G.
x*
is
attractive. (2)
x*.
In case
G =]R
,
there is exactly one fixed point
n Its attractive region is all of ]R.
338
SOLVING SYSTEMS OF EQUATIONS
III.
Proof of (1): Since
Let
and
x*
be two fixed points of
y*
is contracting, there is an
T
a e
T.
such that
[0,1)
IIx* Y*IIT = IIT(x*) T(y*)IIT < allx*-y*IIT. It follows that
x* = y*.
We now choose
r eIR+
so small
that the closed ball KT
lies entirely in
r = {y e Itn
I
IIx*-yIIT < r}
It follows for all
G.
z
e KT r
that
IIT(z)-T(x*)IIT < allz-x*IIT < r. Therefore
T
maps the ball x(v)
is defined for
KT r
T(x(v-1)),
and satisfies the inequality
IIx(v)-x*IIT < avllx(0)-x*IIT < avr, Ix(v)
It follows that the sequence to
I
v = 1(1)00.
v = 0(1)00}
converges
x*.
Proof of (2): T
The sequence
v = 1(1)00
=
a KT r
x(0)
into itself.
Let
x(0) e]Rn
is contracting there is an
v = 0(1)00
be chosen arbitrarily. a e
Since
so that for
[0,1)
it is true that
Iix(v+l) -x(v)IIT = IIT(x(v))
-T(x(v-1)
allx(v)-x(v-1)IIT <
)IIT
avllx(1)-x(0)II T'
From this it follows with the help of the triangle inequality that for all
v,p E IN,
17.
339
Iterative methods
v+
Ilx(v+u)-x(,))IIT
K=V
K=V
laK)ilx(1)-x(0)1IT
lva lix(1) _x(0)iiT This says that
(x(v)
v = 0(1)oo}
I
Its limit value we denote by
is a Cauchy sequence.
Since every contraction
x*.
mapping is continuous, it follows that x* = lim x(v) = lim T(x(v)) = T(x*).
Therefore
x*
is a fixed point of
region of
x*
is all of
Theorem 17.7: let
A
Let
a
IRn,
be a real
be an affine
T(x) = Ax + b
The attractive
T.
n x n
matrix, b eIRn
mapping.
tracting if and only if the spectral radius than 1.
In that case, T
attractive region Proof:
Then p(A)
T
is con-
is less
has exactly one fixed point. is all of
I(T,x*)
and
Its
IRn.
The conclusion follows at once from Lemma 9.17 and
Theorem 17.6.
a
Theorem 17.8:
Let
fixed point
T:G c]Rn -IRn
x* a G.
Let
T
be a map which has one
be differentiable at
x*.
Then
p(T'(x*)) < 1,
implies that
x*
is an attractive fixed point.
The proof is obtained by specializing the following theorem and making use of Lemma 9.17.
Theorem 17.8 says that the local convergence of an iterative method (17.3) for a differentiable map depends in a simple way on the derivative.
Thus differentiable nonlinear
340
SOLVING SYSTEMS OF EQUATIONS
III.
maps behave like linear maps with respect to local converThis conclusion can be extended to nondifferentiable
gence.
maps which are piecewise differentiable. Theorem 17.9:
T:G c]Rn {1Rn
Let
fixed point
x* a G.
there is an
m e N
Let
be continuous at
T
which are differentiable at there is an
s
r = l(1)m,
Suppose that for each
x*.
e {1,...,m}
such that
T(x) = T5(x).
Suppose further that there is a vector norm the corresponding matrix norm
r = 1(1)m.
is an attractive fixed point.
x*
Since
Proof:
are continuous at
Tr
and
T
r e {l,...,m}
for each
T
x*, we have,
the alternative
(1)
T(x*) = Tr(x*), or
(2)
There exists a neighborhood
and
for which
satisfies
IITT(x*)IIT < 1, Then
Suppose
x*.
and maps
Tr :G c IRn 3 ]R"
x e G
be a map which has one
Ur
of
x*
in which
never agree.
Tr
Since we are only interested in the local behavior of may disregard all
r
T, we
for which statement (2) is true.
There-
fore, without loss of generality, we may suppose that statement (1) is true for all Since the maps exists a with
>
6
IIYIIT <
0 6
r.
Tr
are differentiable at
for every and all
c >
0
such that, for all
r E {1,...,m}
IITr(x*+y) x* Tr(x*)YIIT
x*, there
it is true that
EIIYIIT.
y
341
Iterative methods
17.
It follows for
r = 1(1)m
that
IITr(x*+Y)-x*IIT _ (1ITr(x*)IIT+E) IIYIIT. Now we may choose
so small that it is true for all
c
r
that
11 T r' (x *)IIT + c< Y< 1. For every initial vector
satisfying
x(o)
IIx(0) -x*IIT <
6
it then follows that
Iix(")-x*IIT Therefore
V = l(1)m.
Yv
is an attractive fixed point.
x*
o
In addition to the previously considered single step method T(x(v-1))
x(v)
=
practical application also make use of two step methods (or multistep methods) X(V)
=
T(x(v 1),x(v 2)).
These do not lead to any new theory, since one can define a
mapping
T: IR2n ; R2n by setting xl T
T(xl,x2) =
x2
xl
which results in the single step method
x(v)
T
=
T(x(v 1)).
1x(v) x(v)
x(v-l)
is then significant for convergence questions.
Of course
342
III.
SOLVING SYSTEMS OF EQUATIONS
this transformation is advisable only for theoretical considerations.
We are now ready to apply the theorems at hand to Newton's method. Lemma 17.10: zero at
We start with a lemma to help us along.
be a mapping which has a
and is differentiable at
x* e G
mapping from
F:G c]Rn +IRn
Let
J
be a
which is continuous at
MAT(n,n,IR)
to
G
Let
x*.
x*.
Then the mapping
T(x) = x - J(x)F(x) is differentiable at
x*, with Jacobian matrix
T'(x*) = Proof: all
For every
y,z a Rn
e
> 0
satisfying
I
- J(x*)F'(x*).
there exists a IIY112
6
>
0
so that for
< d, it is true that
IIF(x*+Y)-F(x*)-F' (x*)YII2 =IIF(x*+Y)-F'(x*)Y112 <_ e11Y112 II [J (x*+Y) - J(x*) 12112_ all 2112 . This leads to the inequalities IT(x*+y)-T(x*)-[I-J(x*)F'(x*))Y112
= IIJ(x*+y)F(x*+Y) -J(x*)F' (x*)YII2 < II [J(x*+Y)-J(x*))F(x*+Y)II2 + IIJ (x*) [F(x*+Y)-F' (x*)Y1112 _ EIIF(x*+Y)112 + IIJ(x*)112.E:IIAl2 _<
C11Y112 (e+11 F' (X*) 112
+ IIJ(x*)112 ) Example 17.11:
Newton's method and variations.
is to find a zero F
x*
of the mapping
F:G c]Rn
The problem IR
n, where
is continuously differentiable in a neighborhood of
x*
17.
Iterative methods
343
and has a regular Jacobian matrix there.
Then the basic fixed
point problem underlying Newton's method is: x = T(x) = x - J(x)F(x),
By Lemma 17.10, T
where
J(x) = F'(x)-1.
is differentiable at
and has Jacobian
x*
T'(x*) = I-J(x*)F'(x*) = I-F'(x*)-1F'(x*) = 0. This means that
p(T'(x*)) =
0.
By Theorem 17.8, Newton's
method converges for all initial values which lie sufficiently close to
x*.
Theorem 17.8 and Lemma 17.10 also establish that the fixed point
x*
remains attractive when
is not the
J(x)
inverse of the Jacobian, but is merely an approximation thereto, since local convergence only demands p(T'(x*)) = p(I-J(x*)F'(x*)) < 1.
This is of considerable practical significance, since frequently considerable effort would be required to determine the Jacobian and its inverse exactly.
It is also noteworthy
that, by Lemma 17.10, it is not necessary for It suffices to have
be differentiable. x*.
F
J
itself to
differentiable at
The following computation establishes how far
deviate from the inverse of the Jacobian.
be a perturbation matrix and let
Thus we let
J(x) = C[F'(x))-1.
may
J(x) C
Then by
Lemma 17.10 we have
T'(x*) = I-J(x*)F'(x*) = I-C[F'(x*)]-IF'(x*) = I-C. By Theorem 17.8, the iteration converges locally for p(I-C) < 1. for
For the special case
A e (0,2).
o
C = XI, we have convergence
344
SOLVING SYSTEMS OF EQUATIONS
III.
The following two theorems will give a more precise concept of the attractive regions for Newton's method and for a simplified method.
We suppose we are given the following
situation:
G cIRn
X(0)
convex,
K = {x a JR"
IIx-x(0)II < ro},
I
F c C1(G,IRn), IIA-111,
a=
E G
A = F'(x(0))
KeG regular
n = IIA-1F(x(O))II
Newton-Kantorovich.
Theorem 17.12: Hypotheses:
(a)
IIF' (x) -F' (y) II < YII x-y II
(b)
0 < a = 8Yn < 1/2
(c)
rl = 2n/(1+V-1-2a) < ro.
X,y e G
Conclusions: (1)
The sequence x(v+l)
remains in
x(v)
F'(x(v))-1F(x(v)), -
and converges to
K
(2)
IIx(v)-x*II <
(3)
x*
K2 = G If
=
x* e K.
r12-v(laa)(2v-l)
is the only zero of fl
{x eJRn
I
v = 0(1)m
F
11x-x(0)II < r2
v = 0(1)°x. in
(1+)/(BY)}.
a << 1/2, the sequence converges very quickly, by (2).
In a practical application of the method, after a few steps there will be only random changes per step.
These arise
because of inevitable rounding errors.
The theorem permits
an estimate on the error
For this one takes
IIx*
-
x(v)II.
17.
X(V)
Iterative methods
34S
as the initial value
computes upper bounds for error
IIx*
For
-
y,a,n
is at most
x(0)II
for a new iteration, and
x(0)
and
For
rl = 2n/(l +
a < 1/2, the ).
a = 1/2, it is possible that convergence is only
The following example in
linear.
a.
II21
shows that this case
can actually occur: 2
f(x) _
-
B+X-
YYX
,
n > 0, a > 0,
y> 0
We have If'(x)-f'(Y)I = Ylx-y1
1/If'(o)I = a If(o)1/1f'(o)1 = n. For
a < 1/2, f
and
r2 = 2n/(1-/1--2a).
When
a > 1/2, f
has two different real zeros, r1 = 2n/(l+/) When
a = 1/2, they become the same.
has no real zeros (see Figure 17.13).
The example is so chosen that convergence of Newton's method is worse for no other
f.
The proof of Theorem 17.12 is
grounded on this idea. f
Figure 17.13.
Typical graph for
a < 1/2.
346
III.
F'(x(v)) F'(x*)
SOLVING SYSTEMS OF EQUATIONS
is always regular.
However, for
a = 1/2,
can be singular (as in our example).
We will use the following three lemmas in the proof of Theorem 17.12.
In addition to assuming all the hypotheses of
Theorem 17.12, we also make the definitions:
AV = F' (x(v)).
Sv = IIAv1II ,
av = SvnvY,
nv = IIAv1F(x(v))II ,
PV = 2nv/(1+Vl Za
V).
Naturally these definitions only make sense if we also assume E G, AV
x(v)
is regular, and
av < 1/2.
Therefore we will
restrict ourselves temporarily to the set v >
M
for which these hypotheses are true.
0
of integers At this point it
is not at all clear that there are any integers besides which belong to
M.
However, it will later turn out that
contains all the positive integers. Lemma 17.14:
If
$v+l
v/(1-av).
Proof:
Since
x(v+l) E G, then
YIIx(v)-x(v+l)II
IIAv-Av+1II
Av+l
is regular and
= Yn v
we have IIAvl(Av-Av+1)II < av < 1/2.
Therefore, we have convergence for the series [AV'(Av-Av+1)]u.
S = u=0
We have [I-Avl(Av-Av+1)]S = I,
S[I-AV1(Av-Av+1)] = I.
0
M
347
Iterative methods
17.
The matrix inside the square brackets is therefore regular, and its inverse is
But then
S.
Av+1 = Av[I-AV'(Av-Av+1)) is also regular.
For the norm of the inverses we have the
inequalities
0,+1 <_ Lemma 17.15:
IISII
A-
'
E a11av = 0V/(1 -av) .
<
V
u=o
x(v+l) c G, then
If
We have shown above that
Proof:
IIAv+1F(x(v+l)
nv+l °
IIA-IF(x(v+l)
nv+1 < Znvav/(1-av
< IIsII '
)II
= SA 1
Av-+1
and
IIA-1F(x(v+l) )AI
V
)11/(I-av).
It remains to show that IIA-1F(x(v+I))II
- 1 avnv'
V
(1-t)x(v)+tx(v+1),
fi(t)
=
R(t) = A-1F($(t)) + (1-t)(x(v+1)-x(u)) Since
G
is convex, fi(t)
remains in
G.
We clearly obtain
the following: x(v+I)
$(0) = x(v)
O(1)
R(0) = 0,
R(1) = A-1F(x(v+1)),
=
'(t) = x(v+1)-x(v), R'(t) _ [A-1F'(c(t))-I](x(v+1)-x(v)), Since
R'(0) = 0.
348
SOLVING SYSTEMS OF EQUATIONS
III.
IIF'(0(t))-AvII ° IIF'(O(t))-F'(a(o))II IIF'(0(t))-A'vII
Ytllx(v+l)-x(v)II
r1
YIIm(t)-x(v)II
=
we obtain the following estimate for
IIR' (t)II
IIR'(t)II:
0 ytllx(v+1)-x(v)II2 = svyn2t = avnvt.
<
From this we obtain the desired inequality,
IIAv F(x (v+1)
1
)II
= IIR(1)Il
=
II
R'(t)dtll <
avnv.
0
I
When
we set
x(v+l) a G, 6 =
(cf. Lemma 17.14)
> Sv+l
6v/(l-av)
a = SYnv+l > av+1 Then the last lemma implies that
av+1 < a <
a
1
2
= < av < 1/2.
(1 V)
It follows that if Lemma 17.16: Proof:
v e M x(v+l)
If
and
x(v+l) c G, then
e G, then
nv + pv+l <- pv'
Let
1 1 a >
Y
pv+l
From the inequality 2
a
av
1
2
(1 av) 2
it follows that av
1-
1-,/l- 26 <
Since
1-av
v+l e M.
1-av
B(1-av) = sv, it further follows that
Iterative methods
17.
141-2a
349
aV
1-
<
sVY
SY
PV+l <
SVY
< PV - nV .
o
u < v
and
Proof of Theorem 17.12:
If
v e M, then by Lemma
17.16, v
T=p T + PV+l
Since
Pug
po = r1 < ro, we also have V
TI nT + PV+l
IIx(v+I)-X(0)11 _ It follows that 17.15, that
E IIx(T+I)-x(T)II < rI-PV+I
T=0
x(v+l) e K e G
AV+l
and, from Lemmas 17.14 and
is regular and
this implies that integers
ro
rl
Thus the set
v+l a M.
x(V+l) e K
for
Ilx(v+I)-x(°)II_ rl it follows that either all such that
contains all
and
nV # 0
aV
v = 0(1)m.
x(V+l) e K, or there is a first
x(u+2) =
=
This implies
x(v+3)
_
0, we also have
nV + Pv+l = nv < 2n,/(l+/1 « ) nv
PV
pv+1
V
TI°nT + Pv+l < Po = rl
IIx(v+I)
xvl
-
x(O)II
e K.
Since
Pv+l
pV+1 = 0, i.e. nV+1 = 0. x(v+l)
Since
M
v > 0.
Next we show that
v
Altogether,
aV+l < aV.
< rl
ro
.
350
SOLVING SYSTEMS OF EQUATIONS
III.
Hence in this case, too, all
x(v)
a K.
Next we establish the error estimate (2).
We begin
with the inequality 2
0'v
(12
av+l < 2
V
It implies that 1 2-4a v a
>
v+l
-
1
2
<
1-av+1
-
av.
From
n v+l
1-2av+(1-av)72
22-4 2
4
av
v-1
(1-a ao
T
.
(1-av-1) (2v)
C1-a)
1-av
2
av
_
aV
aV+l v+l
(1-a V) 2
av+1
V
a
1
< 2 nvav/(1-av)
(2v)
{1-a>
_
(cf. Lemma 17.15) we finally
obtain (2v+2v-1)
1 \1(2
nv+l - 2(laa/
nv
< 2- v
nv -
nv-1 < ..
(2-1) v
a
1-a
PV = 2nv/(i+
PV < r12-v(1aa)
The sequence
(10-10
x(v)
no
)
<
(2v-l)
therefore converges to
can be estimated by conclusion (2).
Since
continuous, it also follows that F(x*) = -F'(x*)(x* - x*) = 0.
x* a K. F
and
1fx(v)-x*IF F'
are
17.
Iterative methods
351
For the proof of conclusion (3) we refer to Ortega-Rheinboldt (1970).
C3
The Jacobian
F'(x(\))
step of Newton's method.
Frequently this computation in-
Therefore it may be advantageous
volves considerable effort. to replace
must be computed anew at each
by a fixed, nonsingular matrix
F'(x(v))
which does not differ too greatly from
F'(x(v)).
cedure is especially advantageous whenever algebraic structure than the Jacobians.
B
B
This pro-
has a simpler
It is conceivable
that linear systems of equations involving
B
may be solved
with the aid of special direct methods (e.g. the reduction method of Schr6der/Trottenberg or the Buneman algorithm), while the corresponding systems involving the Jacobians F'(x(v))
are too complicated for such an approach.
We describe this situation briefly:
G c Itn
convex,
K = {x c IRn
x(°)
cG
11x-x(O)II I
< ro},
F c C1(G, IRn), B E MAT(n,n, ]R) s = JIB-111 , d
= IIB -
KcG
regular
n = IIB-1F(x(0))II
F'(x(°))II
Theorem 17.17: Hypotheses:
(a)
IIF'(x)-F'(y)II
(b)
Sd < 1
(c)
a = any/(1-66)2 < 1/2
(d)
rl = 2n/[(1+ 1-2a)(1-sS)] < ro
yJJx-yII
x,y c G
352
SOLVING SYSTEMS OF EQUATIONS
III.
Conclusions: (1)
The sequence x(v+l)
remains in (2)
=
x(v)
B-1F(x(v)),
-
K and converges to
v = 0(l)-
x* e K.
It is true that II x
(v+2)
-x(v+1)11 < cli x(v+1) -x(v)11
where c = Bd +
(1-B6) < 1.
2
1+ 41 --2 o,
The theorem contains two interesting special cases: (A)
is an affine map, i.e.
F
y = a = 0, rl = n/(1-86)
and
c = 86. (B)
B = F'(x(0)),6 = 0, y > 0.
The conditions (a), (c), and
(d) are then precisely the hypotheses of the preceding theorem.
Our convergence conditions for the simplified method
thus are the same conditions as for Newton's method. Conclusion (2) of the theorem is of interest first of all for large
v.
better estimates.
For the first few iterations, there are Thus, we have
x(2)-x(1)I a(l-86)2.
a6 +
where For
c1Ix11)-x10) 11
2
a = 1/2, we have
c = 1, independently of
In
6.
fact in such cases the convergence of the method is almost arbitrarily bad.
This can be seen with an example from
Let
f(x) = x2,
x(0)
= 1,
B = f'(1) = 2.
The constants in this example are:
JRl.
17.
353
Iterative methods
R = 1/2,
n = 1/2,
y = 2,
a = 1/2.
6 = 0,
This leads to the sequence x(v+l) =
1(x(v))2
x(v)
=
2
x(v)(1
-
1 x(v)). 2
which converges to zero very slowly. In practice one can apply the method when and
a << 1/2.
In these cases,
c 1 R6 + a/2
Table 17.18 shows the effect of larger
a
a6 << 1
and
c - 66 + a.
or larger
$6.
TABLE 17.18
a
R6
C
R6+ T,
c
R6+a
1/4
1/2
0.531
0.625
0.646
0.750
1/4
1/4
0.320
0.375
0.470
0.500
1/4
0
0.125
0.125
0.293
0.250
1/8
1/2
0.516
0.563
0.567
0.625
1/8
1/4
0.285
0.313
0.350
0.375
1/8
0
0.063
0.063
0.134
0.125
1/16
1/2
0.508
0.531
0.532
0.563
1/16
1/4
0.268
0.281
0.298
0.313
1/16
0
0.031
0.031
0.065
0.063
The proof of Theorem 17.17 runs on a course roughly parallel to that of the proof of Theorem 17.12. based on three lemmas.
nV = JIB- 1F(x(V))II, av = Rynv/(1-8oV)2, v
runs through the set
which it is true that
Once again, it is
We make the following definitions:
6V = JIB -
F'(x(V))II.
PV = 2nV/[(1+
M x(v)
)(1 R6V)1
of all nonnegative integers for c G, Rdv < 1, and
av < 1/2.
354
SOLVING SYSTEMS OF EQUATIONS
III.
Lemma 17.19:
Let
x(v+l) c G
and
[Bdv + av(1-B6v)2]/B.
Then B'5v+l < Ba <
1
1-Bdv+l > 1-Bd = [1-av(1-66 v))(1-B6 v) > 0.
Proof:
We have
BSv+l = BIB-F' (x(v+l))II < B6v+B1!F'(x(v+l))-F' (x(v)) 11 BS v+1 < B6 v+3YIIx(v+1)-x(v)II = BSv+BYnv BSv+l < 66v+av(1-B6 v)2 = BS
2(1+6252)-(2 -av)(1-Bav)2 <
Z(1+a2&)
1.
<
From this it follows that 1-B6 v+l > 1-0 = [1-av(1-BS,))(1-BSv). Lemma 17.20:
x(v+l)
If
o
c G, then
av(1-66 )2]nv.
nv+l < [BSv + 2
Proof:
As in the proof of Lemma 17.15, for
t
e
[0,1]
4(t) = (1-t)x(v) + tx(v+l)
R(t) = B-IFWt)) + (1-t)(x(v+1)-x(v) It follows that x(v+1)
q(0) = x(v),
0(1) =
R(0) = 0,
R(l) = B-1F(x(v+l)),
V (t)
=
x(v+l)
-
x(v)
R'(t) = B[F'((t))-B ](x(v+1)
x(v))
By hypothesis (a) of Theorem 17.17 it follows that
we set
Iterative methods
17.
IIF'(O(t))-BII
355
IIF'(x(v))-BII
+
YIIm(t)-x(v) II
IIF'(a(t))-BII < 6v + Ynvt. Therefore we have
IIR'(t)II < a(6v + Ynvt)nv and finally
nv+l = IIR(1)II
=
1110 R'(t)dtII 1
,V(1-a6v)2Inv.
nv+l < (a6v + 2 aYnv)nv = [a6v + 2
With
aYnv+1(1-06)2 a > av+l
the last two lemmas yield v (1-a6)2 v
aYn[a6v + v <
[1-av(1-a6v)]2(1-66 v)
Since [1-av(1-a6v)]2 = 1-2a(l-a6v)[l> 1-(1-86,,)[1-
v(1-a6v v(1-a6v)]
a6v + -12av(1-a6v)2
we have av+l < a < av < 1/2.
It follows that when
v e M
and
x(v+l) c G, then also
V+l a M.
Lemma 17.21:
If
x(v+l)
a G, then
nv + Pv+l
<<
pv'
SOLVING SYSTEMS OF EQUATIONS
III.
356
Proof: Case 1:
nv = 0.
Case 2:
y = 0.
Then
x(v+l)
is an affine map.
F
av = av+l = 0,
Then
Therefore:
Pv+1 = nv+l/(1-Bdv+l) < vdvpv nv + Pv+1
av = BYnv/(1-Bdv)2 > 0. 2nv+l/[(1+/&)(1-6d))
P =
pv+l = PV = 0.
nv+1 < 06VnV,
6V = 6v+1'
PV = nv/(1-BdV),
Case 3:
and
x(v)
=
nv + 66VPV = PV. Let Sd)
=
ay
Lemma 17.20 implies that
p > PV+l.
nv+1 <- [66v + Zav(1-Bdv)2Inv.
Multiplying by
yields
By
a(l-Ba)2 <
[B6V + 2 av(1-B6v)2]aV(1-Bdv)2.
We have 1 Bd =
[1 av(1 Bdv)](1 66v)
and therefore,
a[1-av(1-66V)]2 2136
2a <
V
[Bdv +
<
l-1
(1-S6v)2]av
+a (1-136 v)2 v a v
[1-av(1-B6 V)]
1-av(1-B6v)-
1
BS BY
1-av(1-B6v) By
(l-66v)
yields
(1-y)(1-4) BY
<
-
1-aV(1-Bdv) By
41-77 (1-Bdv).
17.
Iterative methods
357
The left side is a different representation of
p.
There-
fore, we have shown that 1-av(1-66v)-
Now the right side of the inequality is equal to Proof of Theorem 17.17:
If
v e M, then
nv < PV
pv - nu.
and by
Lemma 17.21, v-1
uI0
nu+pv < po = r1 < ro
This implies that
x(v+l) a K.
iix(v+l) -x(0)Il < r 0 , The lemmas also imply that fore contains all The sequence
v > 0.
{x(v)
I
86v+1 < 1, av+l < 1/2. x(v)
remains in
v = 0(1)OD}
M
there-
K.
has at most one limit point
x* a K, since
Ix(u+l)
E
u=0 is bounded above by mate (2).
v = 0(1)-
n
u=0 r0.
It remains to prove the error esti-
By Lemma 17.20, we have
nv+l ` [06v + 2
V(1-86V1
2
In
or
Iix(v+2)-x(v+l)II < (86v We can get a bound on condition for
86v
+ ?av(1-86v)2]IIx(v+1)-x(v) with the aid of the Lipschitz
F':
86v < 060 +BIIF'(x(V))-F1(x(0)A < 860+8YIIx(v)-x(0)11 It follows from
358
III.
v-l u=o
x(u+1)-x(0)II + p
v-l
n +p
SOLVING SYSTEMS OF EQUATIONS
u=o
u
1
that
$6v < B6o + Syr1-SYpv < B6o + BYrI-Syn
(1+vrl
$6V < B6 +
)(1-86)]-av(1-B6v)2 2a
66v + 2 v(1-Sdv)2 < B6 +
(1-8s) = c.
o
1+ vrl-- -2a
We follow this discussion of Newton's method with a definition of generalized single step methods.
The starting
point is an arbitrary iterative method x(v)
T(x(v-1))
=
In an actual computation, the components of
x(v)
x(v-1).
xiv), i = 1(1)n,
are computed sequentially from the components of Therefore it is advantageous, in looking at the
right side of the equation, to
use those components of
x(v)
which are already known, instead of the corresponding components of
x(v-1).
In practice this means that the compon-
immediately replace those of
ents of
x(v)
memory.
This not only saves memory, but simplifies the pro-
gram.
x(v-1)
in
In many important cases (see Varga, 1962, Ch. 3), the
new method converges better than the original method.
The
new method is called a single step method in contrast to the original total step method.
By defining a modified operator
T, one can again regard a single step method as a total step method.
Before defining the operator important special case.
T, we will look at an
17.
Iterative methods
Example 17.22:
359
Let
T(x) = (L+U)x + b where
b e]Rn, L
is a strictly lower
n x n triangular
matrix (diagonal identically zero), and upper
is a strictly
U
n x n triangular matrix (diagonal identically zero).
The corresponding total step method x(v)
=
T(x(v-1))
(L+U)x(v-1) + b
=
is also known as the Jacobi method, and by Theorem 17.7, converges for
p(L+U)
<
In view of our discussion above,
1.
the single step method may be characterized by the rule x(v)
=
Lx(v) + Ux(v-1) + b.
This is also called the Gauss-Seidel method.
We can again
rewrite it formally as a total step method: x(v)
(I-L)-1(Ux(v-1)+b). =
By Theorem 17.7, the method converges for
p((I-L)-l U)
<
1.
In the following formulation the similarity between the two methods will become clearer. Jacobi method: x(v)
x(v-1) =
[(I-L-U)x(v-1)-b].
-
Gauss-Seidel method: x(v)
Definition 17.23:
(I-L)-1[(I-L-U)x(v-1)-b].
x(v-1)
=
Let
-
T:
G c ]n +]Rn
a
be the fixed point
operator of some total step method, with component mappings ti(YI.Y2....PYn),
i = 1(1)n.
360
III.
We define the components
ti
SOLVING SYSTEMS OF EQUATIONS
of a mapping
-]R'
T:G
IRn
recursively by the rule ti(yl,y2,.... yn) f
= ti(wl,w2,...,wn)
tj(yl)y2,.... yn)
for
yj
otherwise.
<
j
i
wj
Then
x(v)
T(x(v-1)), v = l(l)m
=
method corresponding to x(v)
T.
=
defines the single step
We frequently use the notation
T(x(v-1)/x(v)).
This is to be interpreted as saying that puted with the aid of the mapping
from
T
is to be com-
x(v)
x(v-1);
however,
insofar as they are already known, the components of are to be used in the computation.
x(v)
o
The following theorem focuses on some significant connections between total and single step methods. Theorem 17.24: Let
x*
Let
T
and
be a fixed point of
be as in Definition 17.23.
T T.
Then the following are true:
(1)
x*
is a fixed point of
(2)
If
T
continuous at (3)
is continuous at
T.
x*, then
is also
T
x*.
If
T
differentiable at
is differentiable at x*.
x*, then
Let the Jacobian of
T
at
is also
T
x*
be
partitioned as follows:
T' (x*) = D - R - S where
D =
(dij)
is a diagonal matrix, R = (rij)
strictly lower triangular matrix, and strictly upper triangular matrix.
S = (sij)
is a is a
Then the Jacobian of
T
17.
at
361
Iterative methods
x*
can be decomposed as follows: T'(x*) = (I+R)-I(D-S).
Proof:
Conclusions (1) and (2) follow immediately from the
definition of
By the recursive definition of
T.
ti, we
have that at the fixed point, i-1 a .t. (x*) =
:
ati(x*)ajt4(x*),
j
au ti(x*)ajtu(x*) + ajti(x*),
j > i.
<
i
u=1 i-1
a(x*) = u=1
In both cases, this means that i-1
riuajtu(x*) + dij -sij,
ajti(x*)
i,j = l(1)n.
V=1
It follows from this that T'(x*) _ -RT'(x*) + D -
S
T'(x*) _ (I+R) I(D-S).
o
The method considered in Example 17.11 had the form The corresponding single step method is
T(x) = x - J(x)F(x). x(v)
=
x(v-l)
J(x(v-1)/x(v))F(x(v-1)/x(v)). -
The Jacobian for this method can be determined with the aid of Theorem 17.24 for a special case.
This will also be of
significance in Sec. 19, when we develop the SOR method for nonlinear systems of equations. Theorem 17.25: 17.10.
Let
F:G c1Rn ;1Rn
Suppose further that
Let the Jacobian of
F
J(x)
and
J
be as in Lemma
is a diagonal matrix.
at the point
F'(x*) = D* - R* - S*
x*
be partitioned as
362
III.
where
is a diagonal matrix, R*
D*
triangular matrix, and matrix.
SOLVING SYSTEMS OF EQUATIONS
S*
is a strictly lower
is a strictly upper triangular
Then the Jacobian of the single step method x(v-l)
x(v)
J(x(v-1)/x(v))F(x(v-1)/x(v))
=
at the point
-
may be represented as follows:
x*
T'(x*) _ =
[I-J(x*)R*] 1[I-J(x*)F'(x*)-J(x*)R*] I
-
[I-J(x*)R*]-1J(x*)F'(x*).
By Lemma 17.10, the Jacobian of the corresponding
Proof:
total step method can be represented as T'(x*) = I-J(x*)F'(x*) = I-J(x*)(D*-R*-S*) [I-J(x*)D*] + J(x*)R* + J(x*)S*.
=
into diagonal, lower,
The last sum is a splitting of
T'(x*)
and upper triangular matrices.
Applying Theorem 17.24
yields:
T'(x*) = [I-J(x*)R*] 1[I-J(x*)D* + J(x*)S*]
Remark 17.26: x(v)
=
[I-J(x*)R*] - [I-J(x*)F' (x*) - J(x*)R*]
=
I -
[I-J(x*)R*] 1J(x*)F'(x*).
Instead of the iterative method x(v-1)
=
J(x(v-1)/x(v))F(x(v-1)/x(v)) -
one occasionally also uses the method x(v)
=
x(v-l)
J(x(v-1))F(x(v-1)/x(v)). -
One can show that the two methods have the same derivative at the fixed point.
Therefore one has local convergence for
18.
Overrelaxation methods for linear systems
both methods or for neither.
This is not to say that the
attractive regions are the same.
18.
363
o
Overrelaxation methods for systems of linear equations In this section we will discuss a specialized itera-
tive method for the numerical solution of large systems This is the method of overrelaxation
of linear equations.
developed by Young (cf. Young 1950, 1971).
It is very popu-
lar, for with the same programming effort as required by the Gauss-Seidel method (see below or Example 17.22), one obtains substantially better convergence in many important cases. Definition
Gauss-SeideZ method, successive overrelaxa-
18.1:
tion (SOR) method.
A c MAT(n,n,IR)
Let
be regular and let
The splitting
b e IRn.
A = D - R - S
is called the triangular splitting of
A
if the following
hold true: R
is a strictly lower triangular matrix
S
is a strictly upper triangular matrix
D
is a regular matrix
L = D-1R
is a strictly lower triangular matrix
U = D-1S
is a strictly upper triangular matrix.
To solve the equation (1)
Ax = b, we define the iterative methods;
Gauss-Seidel method: x(v) =
D-1(Rx(v)+Sx(v-1)+b)
=
Lx(v)+Ux(v-1)+D-lb,
or x(v)
=
x(v-1)-D-1(Dx(v-1)-Rx(v)-Sx(v-1)-b),
v = 1(1)x°.
364
(2)
SOLVING SYSTEMS OF EQUATIONS
III.
Successive overreZaxation or SOR method: X(V)
x(v-1)-wD-1(Dx(v-1)-Rx(v)-Sx(v-1)-b)
=
V = 1 (1)00 x(v-1)-w(x(v-1)-Lx(v)-Ux(v-1)-D-lb).
=
where
w eIR
is called the relaxation parameter.
In the splitting
A = D - R
In that case, L
diagonal matrix.
triangular matrices regardless.
-
S, D
and
U
may possibly be a are strictly
Our definition, however,
also encompasses the possibility that the
D
contains more than simply the diagonal of
A.
When
D
o
in the method
is a diagonal matrix, then the methods can
also be described by:
When
D-1(Ax(v-l)/x(v)-b)
x(v-1)
X(V)
=
X(V)
=
-
x(v-1)
wD-1(Ax(v-1)/x(v)-b). -
w = 1, the successive overrelaxation method is the
same as the Gauss-Seidel method.
When
w > 1, the changes
in each iterative step are greater than in the Gauss-Seidel method.
This explains the description as overrelaxation.
However, it is also used for
w < 1.
For
w > 1, convergence
in many important cases is substantially better than for w = 1
(cf. Theorem 18.11).
From a theoretical viewpoint it is useful to rewrite these methods as equivalent total step methods. lemma is useful to this end.
The following
The transformation is without
significance for practical computations. Lemma 18.2:
Let
A e MAT(n,n,IR)
n and let let b e IR,
have a triangular splitting,
365
Overrelaxation methods for linear systems
18.
I-w(D-wR)-1A
=
(I-wL)-1[(l-w)I+wU].
=
Then the method x(v)
= Y x(v-1)
yields the same sequence x(v)
Proof:
I
w(D-wR)-lb,
V = 1(1)-
as the SOR method
x(v)
x(v-1)-wD-1(Dx(v-1)-Rx(v)-Sx(v-1)-b),
=
v = 1(1)co.
The SOR method is easily reformulated as (I-wL)x(v)
Since
+
(1-w)Ix(v-1)+wUx(v-1)+wD-lb. =
is a strictly lower triangular matrix, the matrix
L
is invertible.
- wL
x(v)
It follows that
(I-wL)-1[(1-w)I+wU]x(v-1)
=
+
w(D-wR)-1b.
Further reformulation yields: (I-wL)-1[(1-w)I+wU]
(I-wL)-1D-1D[(l-w)I+wU]
_
_
(D-wR)-1[(1-w)D+w(D-R-A)]
=
I
w(D-wR)-1A
= _Vw.
-
=
(D-wR)-1[(D-wR)-wA]
o
The following theorem restricts the relaxation parameter
p(°) of
to the interval
w
(0,2), since the spectral radius
is greater than or equal to
1
for all other values
w, and for the method to converge, one needs
p(_V) < 1,
by Theorem 17.7. Theorem 18.3:
Under the hypotheses and definitions of Lemma
18.2 it follows that:
(1)
det(.) = (1-w)n
(2)
p(`-tw)
> 11-wl.
366
III.
SOLVING SYSTEMS OF EQUATIONS
Lemma 18.2 provides the representation
Proof:
_V = (I-wL)-1[(1-w)I+wU1.
Y is thus the product of two triangular matrices.
Since
the determinant of a triangular matrix is the product of the diagonal elements, we have det(I-wL)
1
= 1/det(I-wL) = 1
det[(1-w)I+wU] = (1-w)n.
Conclusion (1) follows from the determinant multiplication theorem.
For the proof of (2) we observe that the determinant of a matrix is the product of the eigenvalues.
By (1) how-
ever, the size of at least one of the eigenvalues is greater than or equal to
11-w1.
a
The next theorem yields a positive result on the convergence of the SOR method.
However, there is the substan-
tial condition that the matrix definite.
A
be symmetric and positive
This is satisfied for many discretizations of dif-
ferential equations (cf. Sections 13 and 14). "D
when
The condition
is symmetric and positive definite" is not necessary, D
is a diagonal matrix.
The diagonal of a symmetric
positive definite matrix is always positive definite. when
D
contains more than the true diagonal of
usually true in most applications that
D
A, it is
is still symmetric
positive definite. Theorem 18.4:
A e MAT(n,n,IR)
Even
ostrowski (cf. Ostrowski 1954).
Let
have a triangular splitting and let
18.
367
Overrelaxation methods for linear systems
=
w
I
-
We further require that:
<
IIA1/2.A-1/2II2
(2) p(Yw)
ing in the vector norm
w e (0,2).
Then it is true that:
1
= spectral norm)
(II
T(x) _
II2
x+c (c aIRn)
are contract-
(xTAx)1/2.
IIxIIA =
For each sequence
(4)
are symmetric and
D
1.
<
All mappings
(3)
and
A
(a)
positive definite, and (b)
(1)
w(D-wR)-1A.
{wi
I
with
i e N}
wi e (0,2)
and
lim sup wi <
lim inf wi > 0,
2,
i ym
i -).W
we have
lim ll w.V i->w
i-1
...
1 I12 = 0. 1
denotes the symmetric positive definite matrix
In (1), Al/2
whose square is Proof of (1):
A.
Let
M = (D-wR)/w =
D-R. m
Then A1/2.`L°A-1/2
B =
W
=
I
-
A1/2M-1A1/2.
Therefore BBT
=
(I-A1/2M
1A1/2)(I-A1/2(MT)-lAl/2) I-Al/2M-1(MT+M-A)(MT)-lAl/2.
=
The parenthetic expression on the right can further be rewritten as MT+M-A
=
7D-RT + =D-R-D+R+RT
=
(- -1)D.
This matrix is symmetric and positive definite.
Therefore it
368
SOLVING SYSTEMS OF EQUATIONS
III.
has a symmetric and positive definite root, which we denote by
It follows that
C.
The matrices
BBT
and
BBT + UUT UUT
=
where
I
U = Al/2M-1C.
are symmetric, positive semi-
definite, and also simultaneously diagonalizable. each eigenvalue UUT
of
A
such that
BBT
A + u = 1.
are nonnegative.
Since
there is an eigenvalue All the eigenvalues
U, and therefore
regular, it further follows that values
A
p > 0.
UUT
A
u
of
and
u
also, is
Thus all the eigen-
satisfy the inequality
BBT
of
Thus for
0 <
A
< 1.
It
follows that IA1/2WA-1/2II2
= IIBII.7
= [p(BBT)]1/2 < 1.
Since the matrices
Proof of (2):
they have the same eigenvalues.
B
and W are similar,
From (1)
it follows that
pW) = p(B) < IIBII2 < 1.
Proof of (3):
We have
IIT(x)-T(Y)IIA=II1`w°(xY)IIA= [W (xY)]TA[W(xY)} TA1/2(BTB)A1/2(x-y).
_ (x-Y) T(. )TA w(x-Y) = (x-Y)
Here
B = Al/2W _VA- 1/2
of (1).
BBT
is the matrix already used in the proof
BTB
and
have the same eigenvalues.
the proof of (1), the largest eigenvalue of the inequality Am = max{A
I
A
eigenvalue of
BTB} < 1.
Altogether it follows that
IIT(x)-T(Y)IIA < am(x-y)TA(x-Y) = Amllx-YII2
IIT(x) T(Y)IIA < m IIx-ylIA.
BTB
Thus, by satisfies
18.
Overrelaxation methods for linear systems
Proof of (4): w e (0,2)
369
From the proof of (3) we have for all
that m<1.
Here both "'1A norm
and m depend in general on
however does not depend on
11-11A
as a function of
The
w.
We regard
w.
II
This function is continuous and attains
w.
its maximum in every interval
where
[a,b]
0 < a < b < 2.
Then
a= The endpoints sequence
{wi
i
< 1.
I I-VII A
are chosen so that all elements of the
a,b I
max
we[a,b]
lie in the interval
e 1N)
[a,b].
This
leads to the inequality
i cIN.
11A 1
i-1
1
It follows that
1 im 11 L i-,_
1
i-1
....mow 11A 1
=
0.
Because of the equivalence of norms for finite dimensional vector spaces, the conclusion also obtains for the norm
11.112.
We follow the proof with some remarks intended to increase understanding of the theorem. Remark 18.5:
One knows from examples that in general the
spectral norm
[P(w of 9 wis not less than 1.
Thus the convergence of the SOR
method is not necessarily monotonic in the norm
11.112
How-
ever, convergence is always monotonic in the vector norm
"IA , by (3)_
370
SOLVING SYSTEMS OF EQUATIONS
III.
The significance of conclusion (4) is that the
Remark 18.6:
SOR method will also converge when
w
changes from one step
Such nonstationary methods are by no means un-
to the next.
In a manner similar to the proof of (4)
usual in practice.
one can also prove convergence for the method when the matrix D
is changed from one step to the next.
to remain within the fixed set
It is only necessary
DU < D < D0, where the
inequality signs are to be understood as applying componentwise.
Finally, convergence is even assured when the sequence
of equations and unknowns is permuted from step to step (see Theorem 19.13).
o
The SOR method, like every single step method,
Remark 18.7:
is by definition substantially dependent on the sequential ordering of the equations.
But it is worth noting that
hypothesis (a) of Theorem 18.4 remains true when the ordering of the equations is changed. tation matrix.
When
A
definite, then so are
Let
and
PAP 1
D
and
P
be an arbitrary permu-
are symmetric and positive PDP -l.
Thus convergence
of the method is assured, independent of the ordering of the equations, for symmetric and positive definite matrices.
The
speed of convergence, however, is dependent on the ordering In certain special cases, it is possible
of the equations.
to characterize particularly favorable orderings (see Young's Theorem 18.11).
In the worst cases, with the least favorable
orderings, one needs twice as many iterations. Definition 18.8:
A matrix
B e MAT(n,n,IR)
o
is called
weakly cyclic of index 2, if there exists a permutation matrix P, a matrix
B1 c MAT(q,n-q,]R), and a matrix
B2 c MAT(n-q,q, IR)
such that
Overrelaxation methods for linear systems
18.
371
P B P-1 = B2 B
is called consistently ordered if the eigenvalues of the
matrix B
(a c , a # 0)
aL + aU
do not depend on
a, where
is to be split into B = L + U
with
L
a strictly lower triangular matrix and
strictly upper triangular matrix. Example 18.9:
If
U
a
o
already has the form
B
r
0
B1
B2
0
B =
then
B
is weakly cyclic of index 2 and consistently ordered.
In the proof, we use block notation, beginning with 0
B1
B2
0
B x = x2
x2
It follows that Bix2 = ax1,
B2x1 = ax2.
We let 0
0
L=
,
B2
0
B1
0
0
U=
0
and obtain xl
xl
0
0
ax2
aB2
0
ax2
372
SOLVING SYSTEMS OF EQUATIONS
III.
Example 18.10:
When
is block tridiagonal (cf. Sec. 13)
B
with vanishing diagonal blocks, then
is weakly cyclic of
B
index 2 and consistently ordered.
In the following proof, we restrict ourselves to tridiagonal matrices.
We choose that permutation which places
all odd numbered rows and columns at the beginning, and has all even numbered rows and columns following. for
n = 5, the permutation matrix 1
0 0 0
P =
0
P
is
0 0 0
0
0
0
1
0
0
0
0
1
1
0
0
0
0
1
0 0
For example,
From this we get that B1
0
PBP-I
= B2
It only remains to show that the original ordering.
B = L + U
1
0
is consistently ordered in
B
x clRn
Let
for the eigenvalue
be the eigenvector of
A:
bi,i-lx i-1 + bi,i+1xi+l = Ax i.
It follows that ai-lbi,i-1xi-1
ai-lb1,l+lxi+1 +
(abi,i-1)(ai-2xi-1)
=
Aai-1x1
Alai-1x
+ (1bi,i+l)(alxi+l) =
This means that the vector (xi,ax2....
is an eigenvector of
al, +
aU
,an-lxn)T
for the same eigenvalue
A.
o
18.
373
Overrelaxation methods for linear systems
We now come to the theorem of Young.
Its significance
lies in the fact that it accurately describes the behavior of the function
in the interval
p(Yw)
Such infor-
(0,2).
mation is important for the determination of an
for which
w
the convergence of the SOR method is more rapid. Theorem 18.11:
A e MAT(n,n,]R)
Let
Young.
gular splitting and let U = I-w(D-wR)-1A. B = D-I(R+S) = L+U, where
and
8 = P(B)
have a trian-
Let the matrix
wb = 2/[1+(1-82)1/2]
be weakly cyclic of order 2 and consistently ordered. further hypothesize that: and
8
< 1, or (b) A
definite.
(1)
82
(a) All eigenvalues of
and
D
We
are real
B
are symmetric and positive
Then it follows that:
= P(.") < 1. w c (0,2)
P(ub) = wb-1,
(2)
1-w+2w282+w8+
1-w+4w282
for
w c (O,wb)
for
w e [wb,2).
p (Y)
l
(3) (4)
W-1
w < wb, p(V)
For
simple, if
a
is an eigenvalue of . It is
is a simple eigenvalue of
values of _w, for
B.
All other eigen-
w < wb, are less in absolute value.
w > wb, all eigenvalues of m have magnitude Proof:
For
w - 1.
We derive the proof from a series of intermediate
conclusions: (i)
All eigenvalues of
B
are real.
If condition
(a) does not hold, then by (b) all the matrices are symmetric and positive definite. $ = D-1/2(D-A)D-1/2
=
Then
D-1/2(R+S)D-1/2
A
and
D
374
III.
SOLVING SYSTEMS OF EQUATIONS
is also symmetric and hence has only real eigenvalues. B
and
are similar, B
B
If
(ii)
Since
too has only real eigenvalues.
is an eigenvalue of
p
has the same eigenvalues as For arbitrary
(iii)
and ±,a (L+U) clear for
-p.
(-1)-1U
z =
B.
the matrices
z,w e 1
have the same eigenvaZues. 0
or
w = 0, for then
upper or lower triangular matrix. So now let
z # 0
and
zL + wU
The assertion is
zL + wU
is a strictly
Its eigenvalues are all
w # 0.
zL + wU = Y/'z-w[(z/w)1/2L +
Since
B, then so is
is consistently ordered,
B
-B = -L +
zero.
Since
Then we can rearrange
(z/w)-1/2U].
is consistently ordered, the square-bracketed ex-
B
pression has the same eigenvalues as
L + U.
In view of (ii),
the conclusion follows. (iv)
It is true for arbitrary
z,w,y e
that:
det(yI-zL-wU) = det(yI±I(L+U)). The determinant of a matrix is equal to the product of its eigenvalues. (v)
w e (0,2)
For
and
A c ¢
it is true that:
det((A+w-1)I±w I). It follows from the representation =
(I-wL)-1[(1-w)I+wU]
that
det(AI-5) = det(AI-(I-wL)-1[(l-w)I+wU]) =
det((I-wL)-1(AI-awL-(1-w)I-wU)).
375
Overrelaxation methods for linear systems
18.
it further follows that
det(I-wL) = 1
Since
det(AI-.) = det(AI-AwL-(l-w)I-wU) = det((A+w-1)I-AwL-wU).
This, together with (iv) yields the conclusion. B = p(B) = 0
(vi)
implies that for all
w c (0,2),
Since the determinant of a matrix is the product of its eigenvalues, it follows from (v) that for
p(B) = 0,
n II
(A-Ar) = ()L+w-1)n
r=1
Here the
i = l(1)n, are the eigenvalues of .V.
Ai,
The
conclusion follows immediately. (vii)
Let
w e (0,2), µ e IR
and
A
c 4, A
Further
0.
let (A+W-l)2 Then
p
is an eigenvalue of
value of W.
=
B
Aw2u2.
exactly when
is an eigen-
A
The assertion follows with the aid of (v):
det(AI-.) = det(±wµTI±wTB) _ (wT)ndet(±uI±B) . We are now ready to establish conclusions (1) By (vii), u # 0
Proof of (1):
and only if (a) implies
p2 S2
-
(4):
is an eigenvalue of
is an eigenvalue of .l.
Thus
S2
B
if
= p("5).
< 1, and (b), by Theorem 18.4(2), implies
P(Y1) < 1. Proof of (2):
The conclusion p(.f) > p(. )
follows from
b
considering the graph of the real valued function p(-W), defined in (3), over the interval Remark 18.13).
(0,2)
f(w) _
(cf. also
SOLVING SYSTEMS OF EQUATIONS
III.
376
We solve the equation
Proof of (3) and (4): (a+w-1)2
-
given in (vii) for
Aw2p2 = 0
W2112)
A2-2A(1-w+
+ (w-1)2 =
x:
0
2 A
w2µ2 + wu(1-w
= 1-w +
For
w2p2)1/2.
+ 4
2
[wb,2), the element under the radical is non-positive
w c
for all eigenvalues
of
p
B:
W2s2
w2p2 < 1-w +
1-w + 4
< 0.
4
Therefore it is true for all eigenvalues 2 1 .2112) 2 2 2 +
1
w p (1-w +
2
that
of
A
4
2
wp
2
(w-1)
2
= w-1. It follows that p(`.) W We now consider the case too there can exist eigenvalues
w e (0,wb). of
p
B
In this case
for which the ex-
pression inside the above radical is non-positive. corresponding eigenvalues JAl
= w-1.
of S we again have
However, there is at least one eigenvalue of
B
p = $) for which the expression under the radical is
(namely
positive. B.
A
For the
We consider the set of all of these eigenvalues of
The corresponding eigenvalues
are real.
of
A
positive root gives the greater eigenvalue.
The
For
u > 2(lw-11)1/2/w
the function 1-w +
1 _I
grows monotonically with p = a.
W2,12)1/2
w2 p2 + wp(1-w +
It follows that
4
p.
The maximum is thus obtained for
Overrelaxation methods for linear systems
18.
P(yw) = 1-w + 1 w2s2 + ws[1 w +
377
w2s2]1/2, 4
p(.
W)
is an eigenvalue of
also implies that whenever
is a simple eigenvalue of
p( W)
a = p(B)
The monotonicity
by (vii).
.
is a simple eigenvalue of
other eigenvalues of ./
are smaller.
o
In the literature, the matrix
Remark 18.12:
2-cyclic whenever
A
is called
is weakly cyclic of index 2.
B
allows matrices other than the true diagonal of matrix
D, then
B
All of the
B.
depends not only on
particular choice of
If one
for the
A
A, but also on the
Therefore it seemed preferable to
D.
us to impose the hypotheses directly on matrix
a
B.
Conclusion (1) of Young's Theorem means that
Remark 18.13:
the Gauss-Seidel method converges asymptotically twice as fast as the Jacobi method. convergence for
w = wb
ally greater than for
For the SOR method, the speed of
in many important cases is substantiw = 1
(cf. Table 18.20 in Example
18.15). In (3) the course of the function exactly for
w c (0,2).
is described
A consideration of the graph shows
that the function decreases as the variable increases from to
wb.
On the interval Figure 18.14). known.
w + wb
The limit of the derivative as
-
0
is
0
(wb,2), the function increases linearly (see wb
is easily computed when
S = p(B)
is
However that situation arises only in exceptional
cases at the beginning of the iteration.
As a rule, wb
will
be determined approximately in the course of the iteration. We start the iteration with an initial value
X(V) _ Y x(v-l) + 0
wo(D-woR)-lb.
wo a
[l,wb):
378
SOLVING SYSTEMS OF EQUATIONS
III.
1
i W wb
1
Figure 18.14.
For a solution
2
Typical behavior of p(.VW)
of the system of equations we have
x*
x* = - x* + W0 (D-w0R)
lb.
0
It follows that x(v)-x*
(x(v-1)-x*)
= W
= W
O
By (4), ao = p(W ) 8 = p(B)
)
is an eigenvalue of
0
one if
)v-1(x(1)-x* 0
W 0,
is a simple eigenvalue of
and a simple This occurs,
B.
by a theorem of Perron-Frobenius (cf. Varga 1962), whenever, e.g., the elements of irreducible.
B
are non-negative and
We now assume that )'0
of W with eigenvector
e.
B
is
is a simple eigenvalue
Then the power method can be
0
used to compute an approximation of For sufficiently large x(v) x* z aye, It follows that
v
p(B).
it holds that:
x(v+l)
x* z X0ave,
x(v+2)-x* = aoave.
18.
Overrelaxation methods for linear systems
379
x(v+2)-x(v+l) z (a0-1)A0ae x(v+I)-x(v) z (ao-1)ave x(v+2)-x(v+1)112
a. o 2
(v+1) -X(V)112
11x
The equation (ao+wo-1)2
aowo62
=
makes it possible to determine an approximate value 82.
Next compute
wb
82
for
from the formula
wb = 2/[1+(1-52)l/2]
and then continue the iteration with The initial value
w0
wb.
must be distinctly less than
wb, for otherwise the values of the eigenvalues of Yw
will 0
be too close together (cf. the formula in 18.11(3)) and the power method described here will converge only very But it is preferable to round up
slowly.
function w < wb
p(.9)
grows more slowly for
(cf. Figure 18.14).
difference
2-1b
wb, since the
w > wb
than for
It is worthwhile to reduce the
by about ten percent.
o
In the following example we compare in an important special case the speed of convergence of the Jacobi, GaussSeidel, and SOR methods for Example 18.15:
w = wb.
Sample Problem.
The five-point discretization
of the problem tu(X,y) = q(x,y), u(x,y) = V+(x,y),
(X,y) C G = (0,1)2 (x,y) c 3G.
380
SOLVING SYSTEMS OF EQUATIONS
III.
leads to a linear system of equations with coefficient matrix 1
A=
A
I
I.
A,
e MAT(N2,N2, ]R). I
Here we have (
-4
1
11,
-4
e MAT(N,N, IR) 4
1
N+l = 1/h.
A, as we know from Section 13, are
The eigenvalues of
Let
where
A
avu = -2(2-cos vhn - cos phn),
v,y = 1(1)N.
be partitioned triangularly into
A = D - R - S,
D = -41.
The iteration matrix D-1(R+S) = D-1(D-A)
of the Jacobi method has eigenvalues p(D-1(R+S))
I-D-1A
=
1
+
4
= cos hn = 1- 2 h2n2
Therefore
A
.
+
O(h4).
By Theorem 18.11 (Young) we further obtain
P()
2
=
cos2hn
=
1-h2n2
+ O(h4)
wb = 2/(l+./l-$ ) = 2/(l+sin hn) P(mob )
_ Wb-1 = 1-2h7T + O(h2).
Table 18.16 contains step sizes.
g, p(S), urb, and
p( mob)
for different
18.
381
Overrelaxation methods for linear systems
Spectral radii and
TABLE 18.16:
h
a
)
P(`Sz
wb.
P( W )
Wb
b
1/8
0.92388
0.85355
1.4465
0.44646
1/16
0.98079
0.96194
1.6735
0.67351
1/32
0.99519
0.99039
1.8215
0.82147
1/64
0.99880
0.99759
1.9065
0.90650
1/128
0.99970
0.99940
1.9521
0.95209
1/256
0.99993
0.99985
1.9758
0.97575
Now let
e(v)
=
be the absolute error of the
x(v)-x*
v-th
approximation of an iterative method x(v+l)
Here let Since
Mx(v) + C.
be an arbitrary matrix and let
M
e(v)
=
=
Mve(°)
and
lim
IIMVIIl/v
'V-.W
there is for each
IIMvII
n
.
x* = Mx* + c.
>
0
a
v
0
= P(M), eIN, such that
(P(M)+n)V, V > v
-
1
o
k(V)II < (P(M)+n)" IIe(°)II.
The condition 11e(m)II
< elle(°)II
thus leads to the approximation formula
m=
to
e
log P(M)
(18.17)
which is sufficiently accurate for practical purposes. In summary we obtain the following relations for the iteration numbers of the methods considered above:
382
SOLVING SYSTEMS OF EQUATIONS
III.
mJ c lo
Jacobi:
C
-h Tr /2 e
ml Z 1o
Gauss-Seidel (w=1):
(18.18)
-h it
SOR (w=wb
mw
=
1
b
Here the exact formulas for the spectral radii were replaced by the approximations given above.
The Jacobi method thus
requires twice as many iterations as the Gauss-Seidel method in order to obtain the same degree of accuracy.
one frequently requires that
In practice,
Since
1/1000.
log 1/1000 = -6.91, we get ml z
6.91 h-2 = 0.7/h2 Tr
m
wb
ml/mw
z 0.64/h.
Table 18.20 contains
ml, h2ml,
various step sizes.
(18.19)
6.91 h-1 - 1.1/h
mwb, hmwb , and
ml/mwb
for
These values were computed using Formula
(18.17) and exactly computed spectral radii.
One sees that
the approximate formulas (18.18) and (18.19) are also accurate enough. TABLE 18.20:
h
Step sizes for reducing the error to
m1
2
h ml
mw
hmw
b
1/1000.
m1/m b
b
43
0.682
8
1.071
5
178
0.695
17
1.092
10
1/32
715
0.699
35
1.098
20
1/64
2865
0.700
70
1.099
40
1/128
11466
0.700
140
1.099
81
/256
45867
0.700
281
1.099
162
1/8
1/16
19.
Overrelaxation methods for nonlinear systems
383
For each iterative step, the Jacobi and Gauss-Seidel methods require
4N2
The SOR method, in
floating point operations.
contrast, requires
From (18.19) we get
operations.
7N2
as the total number of operations involved (e = 1/1000): Jacobi:
1.4.4N4 z 6N4
Gauss-Seidel (w=1):
0.7.4N4 z 3N4
SOR (w=wb):
1.1.7N3 = 8N3.
The sample problem is particularly suited to a theoretical comparison of the three iterative methods.
Practical experi-
ence demonstrates that these relations do not change signifiHowever, there exist sub-
cantly in more complex situations.
stantially faster direct methods for solving the sample problem (cf. Sections 21, 22).
SOR is primarily recommended,
therefore, for non-rectangular regions, for differential equations with variable coefficients, and for certain nonlinear differential equations. 19.
o
Overrelaxation methods for systems of nonlinear equations In this chapter we extend SOR methods to systems of non-
linear equations.
The main result is a generalization of
Ostrowski's theorem, which assures the global convergence of SOR methods and some variants thereof. In the following we let
G
denote an open subset of
In Definition 19.1: tions.
Let
SOR method for nonlinear systems of equa-
F c C1(G,IRn), and let
an invertible diagonal
D(x).
F
have a Jacobian with
Then we define the SOR method
384
SOLVING SYSTEMS OF EQUATIONS
III.
for solving the nonlinear equation
F(x) = 0
by generalizing
the method in Definition 18.1:
X(O) e G x(v-1)-wD-1(x(v-1)/x(v))F(x(v-1)/x(v))
X(V)
t(x(v-1)
=
w e (0,2),
v = l(1)-.
(19.2)
Ortega-Rheinholdt 1970 calls this the single-step SOR Newton method. If
of
T.
has a zero
F
x* E G, then
is a fixed point
x*
This immediately raises the following questions: attractive?
(1)
When is
(2)
How should the relaxation parameter
(3)
Under which conditions is the convergence of the method
x*
w
be chosen?
global, i.e., when does it converge for all initial values (4)
x(0) a G?
To what extent can the substantial task of computing the partial derivatives of
(5)
be avoided?
F
Do there exist similar methods for cases where
F
is
not differentiable?
The first and second questions can be answered immediately with the help of Theorems 17.8 and 17.25. Theorem 19.3:
Let the Jacobian of
F
at the point
x*
be
partitioned triangularly (cf. Definition 18.1) into F'(x*) = D* matrix.
-
R*
- S*, where
D*
is an (invertible) diagonal
Then p(I-w[D*-wR*1-1F'(x*)) < 1,
implies that
x*
is attractive.
19.
Overrelaxation methods for nonlinear systems
Proof:
385
By Theorem 17.25 we have I-[I-w(D*)-1R*]-lw(D*)-1F'(x*)
T'(x*) =
I-w[D*-wR*]-1F'(x*).
=
The conclusion then follows from Theorem 17.8.
o
The SOR method for nonlinear equations has the same convergence properties locally as the SOR method for linear equations.
The matrix - w[D*-wR*]
I
1F'(x*)
indeed corresponds to the matrix ..
of Lemma 18.2.
Thus
the theorems of Ostrowski and Young (Theorems 18.4, 18.11), with respect to local convergence at least, carry over to the nonlinear case.
The speed of convergence corresponds asymp-
totically, i.e. for the linear case.
the optimal
v 4 -, to the rate of convergence for
Subject to the corresponding hypotheses, can be determined as in Remark 18.13.
w
sufficiently accurate initial value
x(O)
If a
is available for
the iteration, the situation is practically the same as for linear systems.
This also holds true for the easily modified
method (cf. Remark 17.26) X(V)
wD-1(x(v-l))F(x(v-1)/x(v)).
x(v-l)
=
-
The following considerations are aimed at a generalization of Ostrowski's theorem.
Here convergence will be estab-
lished independently of Theorem 17.8.
The method (19.2) will be generalized one more time, so that it will no longer be necessary to compute the diagonal of the Jacobian
F'(x).
The hypothesis
"F
differenti-
386
SOLVING SYSTEMS OF EQUATIONS
III.
able" can then be replaced by a Lipschitz condition.
Then
questions (4) and (5) will also have a positive answer.
In
an important special case, one even obtains global convergence. Definition 19.4:
A mapping
F e C°(G,)Rn)
gradient mapping if there exists a F(x)T = '(x), x c G.
We write
e
is called a such that
C1(G,IR1)
F = grad
o
In the special case of a simply connected region
G,
the gradient mappings may be characterized with the aid of a well-known theorem of Poincare (cf. Loomis-Steinberg 1968, Ch. 11.5).
Theorem 19.5:
Let
G
be a simply connected region
Then
F
is a gradient mapping if and
Poincare.
and let
F e C1(G,IRn).
only if
F'(x)
is always symmetric.
Our interest here is only in open and convex subsets of
IRn, and these are always simply connected.
then, we always presuppose that
set of
a c (0,1)
Let
4
0: G +]R1
and for all
x,y E G
(1-a)4(y)
- (ax + (1-a)y).
is called, respectively, a
convex function
if
r(x,y,a)
strictly convex function
if
r(x,y,a) > 0,
uniformly convex function
if
r(x,y,a) > ca(l-a)jjx-y,j
for all c
and
let
r(x,y,a) = Then
is an open, convex sub-
G
IRn.
Definition 19.6: all
In the sequel
x,y e G
with
>
x # y, and for all
0,
a e (0,1).
is a positive constant which depends only on
4.
2,
Here 0
19.
Overrelaxation methods for nonlinear systems
387
The following theorem characterizes the convexity properties of
with the aid of the second partial deriva-
tives.
Theorem 19.7:
A function 41 EC2(G,1R1)
is convex, strictly
convex, or uniformly convex, if and only if the matrix of the second partial derivatives of ing inequalities for all
x e G
0
satisfies the follow-
and all nonzero
z e]Rn
respectively,
zTA(x)z > 0
(positive semidefinite)
zTA(x)z >
(positive definite)
0
zTA(x)z > czTz
Here Proof:
(uniformly positive definite in x and z).
depends only on
c > 0
A, not on
x
or
z.
x,y e G, x # y, we define
For
p(t) = r(x,x+t(y-x),a),
t e [0,1).
Then we have P(t) = a41(x)
and
+ (l-a)$(x+t(y-x))
p(O) = 0, p'(0) = 0.
PM =
- 41(x+t(1-a)(y-x))
It follows that
(1
(1-s)p"(s)ds
J
0
PM = (1-a)J (1-s)(y-X) TA(x+s(y-x))(y-x)ds 0
(1-s)(y-x) TA(x+s(1-a)(y-x))(y-x)ds.
(1-a) 2 0
In the second integral, we can make the substitution s = (1-a)s, and then call integrals:
s'
again
A(x)
s, and combine the
III.
388
SOLVING SYSTEMS OF EQUATIONS
1
P(1) = j T(s)(Y-x)TA(x+s(Y-x))(Y-x)ds 0
where
Jas (1-a)(1-s)
for
0 < s < 1-a
for
1-a < s < 1.
The mean value theorem for integrals then provides a suitable 6
for which
c (0,1)
a(1-a)(Y-x)TA(x+6(Y-x))(Y-x).
r(x,Y,(x) = P(1) =
2 The conclusion of the theorem now follows easily from Definition 19.6.
o
is only once continuously differentiable, the
c
If
convexity properties can be checked with the aid of the first derivative. Theorem 19.8:
c c C1(G,IR1), F = grad 4, and
Let
p(x,y) = [F(Y)-F(x)]T(Y-x). Then
0
is convex, strictly convex, or uniformly convex, if
and only if
p(x,y)
satisfies the following inequalities,
respectively, p(x,Y) > 0,
p(x,Y) > 0,
p(x,Y) > c*IIY-x112 Here
c* > 0
Proof: t e
depends only on
Again, let
[0,1].
F.
p(t) = r(x,x+t(y-x),a), x,y c G, x # y,
Then we have
p(t) = a1(x) + (1-a)4(x+t(Y-x)) - (x+t(1-a)(Y-x)) and
Overrelaxation methods for nonlinear systems
19.
389
(1
p(1) = r(x,y,a) =
p'(t)dt
J
0 1
[F(x+t(y-x))-F(x+t(1-a)(y-x))] T(y-x)dt.
_ (1-a) 0
It remains to prove that the inequalities in Theorem 19.8 and Definition 19.6 are equivalent.
We content ourselves
with a consideration of the inequalities related to uniform convexity.
Suppose first that always
P(x,Y) > c*IIY-x112. Then it follows that P(x+t(1-a)(Y-x),x+t(Y-x)) > c*a2t2IIY xll2 IF(x+t(y-x))-F(x+t(1-a)(y-x))] T(y-x) > c*atIIY-xll2
r(x,y,a) >
c*a(1-a)IIY-x442. 2
The quantity here.
c
in Definition 19.6 thus corresponds to
1-2c
Now suppose that always
r(x,Y,a) > ca(1-a)IIY-xI12
Then it follows that aO(x)+(1-a)0(Y) > (x+
a)(Y-x))+ca(1-a)I1Y-x12
O(x+(1-(x)(ya-x))-4(x) + callY-xll2. Since this inequality holds for all the limit
a -
a c (0,1), by passing to
1, we obtain
m(Y) fi(x) > F(x)T(Y-x) + clly-x112. Analogously, we naturally also obtain
390
SOLVING SYSTEMS OF EQUATIONS
III.
(x)-4(y) > F(Y)T(x-Y) + cIIy-xlI2.
Adding these two inequalities yields 0 > -[F(Y)-F(x))T(Y-x) + 2cjjY-x,12.
0
The following theorem characterizes the solution set F(x) = 0
of the equation
for the case where
is the
F
gradient of a convex map. Theorem 19.9: F = grad 0.
Let
m E C1(G,1R1)
Then:
The level sets
(1)
convex for all
is a zero of
global minimum at If
(3)
one zero (4)
x*
N(y,q) = {x e G
O(x) < yl
I
are
y eIR.
x*
(2)
be convex and let
x*.
in
assumes its
0
The set of all zeros of
is strictly convex, then
0
If
exactly when
F
F
is convex.
F
has at most
G.
is uniformly convex and
0
has exactly one zero
G =IRn, then
F
x*, and the inequality
c*IIx*112 < !IF(0)112. is valid, where Proof of (1):
is the constant from Theorem 19.8.
c*
Let
a c (0,1)
follows from the convexity of
y,x c N(y,1).
and 0
Then it
that
4,(ax+(1-a)y) < x (x)+(1-a)0(Y) < ay+(l-a)y = Y.
Thus
ax + (1-a)y
Proof of (2):
Let
also belongs to x*
N(y,o).
be a zero of
F
and let
x E G,
x # x*, be arbitrary.
By the mean value theorem of differ-
entiation, there is a
A e (0,1)
such that
19.
Overrelaxation methods for nonlinear systems
O(x) _ O(x*)
391
[F(x*+A(x-x*))] T(x-x*).
+
It follows from Theorem 19.8 that p(x*,x*+X(x-x*)) = [F(x*+X(x-x*))] TX(x-x*) > 0.
Thus we obtain cp(x)
Therefore
o(x*) > 0.
-
is a global minimum of
x*
sion is trivial, since the set of all zeros of
particular zero of
If
The reverse conclu-
It follows from (1) that
is convex, for if
F
is a
x*o
F, then
{x* e G
Proof of (3):
is open.
G
4>.
F(x*) = 0} = N(4>(xo),4>).
is a strictly convex function, then in
4>
the above proof we have the stronger inequality p(x*,x+A(x-x*)) =
[F(x*+X(x-x*))] TX(x-x*) > 0.
It follows that 4>(x)
Therefore
> (x*) .
is the only point at which
x*
can assume the
global minimum.
By (3), F
Proof of (4): to show that examine
0
x = bt,
F
has at most one zero.
has at least one zero.
Thus we need
To this end we
along the lines
t e ]R,
b c IRn
fixed with
JjblJ 2 =
1.
There it is true that t
4>(bt)
= (0) + F(0)Tbt +
[F(bs)-F(0)]Tbds.
i
0
392
SOLVING SYSTEMS OF EQUATIONS
III.
Since
[F(bs)-F(0)]Tbs > c*s2 it follows that 0(bt)
> 0(0) + F(0)Tbt
c*t2.
+ 2
with
x = bt
For all
Iti
=
11x112
> 211F(0)112/c* ¢(x) > 4(0).
this inequality means that
Therefore
0
as-
sumes its minimum in the ball
{x a ]Rn
1
11x112
<
711F (0)112/c*}'
This minimum is a global minimum in all of fore has at least one zero
x*
even establish the inequality
7Rn.
F
in the above ball.
there-
One can
c*IIx*112 < IIF(0)112, for we have
[F(x*)-F(0)]T(x*-0) > c*IIx*-0112
F(0)Tx* > c*11x*112 o
IIF(0)112_ c*11x*112. Definition 19.10:
Let
,
F = grad and
G* c G
C1(G,]R1)
e
be strictly convex,
(fl,f2,...,fn),
a convex set with at least one interior point.
Then we define h
mid(F,G*) = inf Here the
0
,
and
,
fi(x+he
i
e(j) h E I R
j
= 1(1)n.
)-fi(x)
are the unit vectors parallel to the axes in and
x E I R n
run through all possible
Overrelaxation methods for nonlinear systems
19.
393
combinations satisfying x e G*,
x+he(j)
h# 0.
e G*,
Further let MID(F,G*) = diag(midJ(F,G*)). The notation
MID
Since
G*
stands for "maximal inverse diagonal".
o
has at least one interior point, the infi-
mum is always formed over a nonempty set of combinations of x
and
Further, we always have
h.
[F(x+he(J))
F(x)]The(J) > 0
-
[fJ(x+he(J))
- fJ(x)]h >
0
and therefore, also h
J))-fi(x)
fJ(x
> 0
Nevertheless, the infimum of these quantities can be zero. When
0
is uniformly convex, midJ(F,G*)
bound which is independent of [F(x+he(J))
-
has an upper
It follows from
G*.
F(x)]The(J) > c*I1he(J)II22
that
fJ(x+he(J)) - fJ(x) > c*h midJ(F,G*) < 1/c*. In the sequel, a lower bound for ant than an upper bound. for
midJ(F,G*)
is more import-
That requires Lipschitz conditions
fJ.
Theorem 19.11: inequalities
If, for all
x e G*
with
x+he(j) c G*, the
394
SOLVING SYSTEMS OF EQUATIONS
III.
Ifj(x+he(j))-fj(x)I < IhILj, hold, then
MID(F,G*)
Lj
is positive definite and
midj(F,G*) > 1/Lj,
The proof is trivial. Theorem 19.12: and let
G*
Let
j
= 1(1)n.
0
be twice continuously differentiable
4
be compact.
Then
midj(F,G*) = 1/max Proof:
j = 1(1)n
0,
>
j
xEG*
= 1(1)n.
Since
fj(x+he(j))-fj(x)
=
a.4(x+he(j))-aj4(x)
<
xEG*
it follows from the previous theorem that in any case midj(F,G*) > 1/max xec
For the proof of the reverse inequality, we use the definition of the partial derivatives:
a.O(x+he(j))-aO(x)
urn
j
h-0
h lim h-*0 aj4(x+he(3))-aj4(x) h#0
mid.(F,G*) = inf
2
aJ
=
.4(x)
1/a?.4(x)
=
jj
< 1/3
h
aj4(y+he j
)-aj4(y)
2
.4(W).
o
jj
In the previous section, we considered the iterative solution of linear systems of equations.
We want to explain
briefly how these fit in here in the case of real, symmetric, and positive definite matrices.
Let
F(x) = Ax - b
be an
affine map with a real, symmetric, and positive definite
19.
Overrelaxation methods for nonlinear systems
matrix
395
The map is the gradient of the function
A.
XTAx
(x) =
-
bTx.
Z By Theorem 19.7, 0
is uniformly convex.
The constant
of that theorem is the smallest eigenvalue of tion
p(x,y)
A.
c
The func-
of Theorem 19.8 is p(x,Y) = (Y-X) TA(Y-x).
Thus
c*
is also the smallest eigenvalue of
A.
Finally by
Theorem 19.12 midj(F,G*) = 1/ajj,
These considerations hold for all The next example shows that
j
= l(1)n.
G*.
MID(F,G*)
can be posi-
tive definite even in the case of a nondifferentiable func-
tion F.
Let
¢ c C1( IR, IR)
with
2x2
for
x > 0
l x2
for
x < 0
4x
for
2x
for
(x) and
F(x) = Then
MID(F,IR1) = 1/4.
The next theorem presupposes that G c]Rn
is open and convex,
e C1(G, 1n),
F = grad 0 = (fl,f2,...,fn), Q e MAT(n,n,IR)
Q = diag (qj).
is a positive definite diagonal matrix,
396
SOLVING SYSTEMS OF EQUATIONS
III.
Theorem 19.13:
Hypotheses:
is strictly convex.
(a)
0
(b)
There is a
y e R
G* _ {x c G 14(x) < y}
with a compact level set
which consists of more than one
point. (c)
qj
< 2 midj(F,G*),
j
= 1(1)n.
Conclusions: (1)
x*
There is exactly one
is an interior point of (2)
Either
x = x*
x* e G
with
F(x*) = 0.
G*.
(y) < $(x)
or
for
y = x-QF(x/y). (3)
Every sequence arbitrary
x(0) a G*
x(u+l) =
converges to
x(v)-QF(x(v)/x(v+l)),
u = 0(1)co
x*.
Proof of (1):
Since
on
G*.
is larger outside of
an
x* a G*
O(x)
G*
is compact,
assumes its minimum G*.
Therefore there is
such that m(x*) = min 4(x), xeG
is strictly convex.
Therefore there is at most one point
with these properties.
Since
besides
and
x*, O(x*) < y
Proof of (2):
naturally into
F(x*) = 0.
G*
x*
The computation n
contains other points is an interior point of
y = x-QF(x/y)
individual steps.
x G*.
can be split
With each of these
We call individual steps, only one component is changed. y(1), y(2)' ., y(n) = y these intermediate results y(0) = x,
The individual computations are
19.
Overrelaxation methods for nonlinear systems
y(J)
=
y(J-1)
a
-
397
.e(J) J
where
= qjfi (y(J-1)).
Aj
Since
qj > 0, either
Aj
# 0
In the first case we define
fi (y(J-l)) = 0.
t e [0,1]
YO-l) - tai e(J) y(t) = p(t) = O(Y(t))
This leads to p'(t) = -ai fj(Y(t))
p'(0) = -Aifi (Y(0)) _ -qjfj(Y(0))2 <
0
p' (t) -P' (0) = ai [fi (Y(0)) -fi (Y(t)) ] For
Here we need hypothesis (c).
t > 0
and
y(t) c G*,
this hypothesis implies:
2ta.
(Y(O)) -fj Y t)
q] <
2tifj(Y(0M 1
<
jyt
iy
IP'(t)-p'(0)l < 2t qjfj(Y(0))2. Since t
P(t) = P(0) + P'(0)t + J [P'(s)-P'(0)]ds 0
the last inequality leads to p(t) < p(O)
-
qjfj(Y(0))2t + qjfi (Y(0))2t2
(Y(t)) < O(y(J-1))
-
t(1-t)qjfj(Y(0))2
m(Y(t)) < 4,(Y(J-l)) < Y. Thus
y(t)
we have
cannot leave the set
G*.
For all
t c (0,1]
or
SOLVING SYSTEMS OF EQUATIONS
III.
398
(Y(t)) < O(Y(j-1)) and in particular, 0(Y(J))
Thus, either
fi (y(3-1))
<
(y(J-l)) (y) < ¢(x).
is always zero or
(2) is proven.
The sequence
Proof of (3):
{O(x(v))
I
v = 0(1)")
converges,
for by (2), O(x(v-1))
Let
> 0(x(v))
0(x(v+1)) >
>
> (x*).
...
be an arbitrary limit point of the sequence
x**
It follows from continuity considerations that
(x**), where means that
4(y**) =
By (2) however, this
y** = x** - QF(x**/y**).
F(x**) = 0
{x(v)}.
and hence by (1) that
x** = x*.
The only possible limit point of the sequence {x(v)
v = 1(1)")
I
in the compact set
the sequence is convergent.
G*
x*.
is
Thus
o
This last theorem requires a few clarifying remarks. Remark 19.14:
0 e C1(]R ,IR)
If
every level set
{x EIRn
of Theorem 19.9).
x(0)
a IR
.
Remark 19.15:
< y}
is compact (cf. proof
The iteration therefore converges for all
o If
there is no matrix for
O(x)
is uniformly convex, then
MID(F,G*) Q
is not positive definite, then
with the stated properties.
m e C2(G,IR1), MID(F,G*)
However,
is always positive definite.
The same is also true when the components of
F
Lipschitz conditions (cf. Theorem 19.11).
o
satisfy
Overrelaxation methods for nonlinear systems
19.
In practice, the starting point is usually
Remark 19.16: and not
Since the Jacobian
0.
399
metric, there exists a function in addition, F'(x)
F'(x) 0
of
F
is always sym-
F = grad .
with
is
It is most difficult to establish that
hypothesis (b) of Theorem 19.13 is satisfied.
MID(F,G*)
be determined from bounds on the diagonal elements of In the actual iteration one needs only Remark 19.17:
If,
is always positive definite, then
even strictly convex.
F,
F
and not
The first convergence proof
can
F'(x).
0.
o
for an SOR method
for convex maps was given by Schechter 1962.
In Ortega-
Rheinboldt 1970, Part V, several related theorems are proven. Practical advice based on actual executions can be found, among other places, in Meis 1971 and Meis-Tornig 1973.
o
We present an application of Theorem 19.13 in the form of the following example. Example 19.18:
Let
G
be a rectangle in
]R2
with sides
parallel to the axes, and let a boundary value problem of the first kind be given on this rectangle for the differential equation -(alux)x - (a2uy)y + H(x,y,u) = 0.
The five point difference scheme (see Section 13) leads, for fixed
h, to the system of equations
F(w) = Aw + H(w) = where
A e MAT(n,n,]R)
0
is symmetric and positive definite,
w = (wl,w2,...,wn), (xj,yj ) = lattice points of the dis-
cretization, and A
H(w) = (H(xl,y1,w1),...,H(xn,yn,wn)).
can be split into
400
SOLVING SYSTEMS OF EQUATIONS
III.
-R-
A=D where
D
RT,
is a diagonal matrix, R
is a strictly lower tri-
angular matrix with nonnegative entries, and weakly cyclic of index 2.
pj(Z) =
D-1(R+RT)
is
Let
10z
H(xj,yj2)di
P(w) = (P1(wl),...,Pn(wn)) 4(w) =
wTAw + P(w). 2
Then obviously
F(w) = grad ¢.
Under the hypotheses
0 < H7(x,y,z) < 6,
0
is uniformly convex in
IRn
(x,y)
and
MID(F,G*)
definite for every compact convex set lar, for
j
c G,
z cIR
is positive
G* cfRn.
In particu-
= 1(1)n: 2
a2 0 (w)
= ajj + HZ(xj) yj,w)
ajj <
aj
1/a
> midj(F,G*) > 1/(ajj+6).
ajj + d
ii
ajj
contains the factor
ajj >> 6.
h, therefore,
For small
1/h2.
Every sequence arbitrary
w(0)
w(v+1)
=
w(v) - Q(Aw(v-1)/w(v)
+
H(w(v-1)
converges, if 0
The condition strictive.
< qj
< 2/(ajj+d).
Hz(x,y,u(x,y)) <
6
appears to be very re-
In most cases, however, one has a priori know-
ledge of an estimate
401
Overrelaxation methods for nonlinear systems
19.
a < U(x,y) < a,
(x,y)
e G.
When the maximum principle applies, e.g. for
H(x,y,0) = 0,
this estimate follows at once from the boundary values. H
then only on
is significant
G x
The function can
[a,8].
be changed outside this set, without changing the solution This change can be undertaken
of the differential equation. H c C1(G x]R,]R)
so that
Hz(x,y,z)
and
are bounded.
will demonstrate this procedure for the case If
We
H(x,y,z) = ez.
a < u(x,y) < a, one defines
f
H*(z) =
< a
ea(z-a+l)
for
z
ez
for
a <
es(z S+1)
for
z > S.
z
< R
It follows that ea < H*'(z)
<
e8
and 1
< mid.(F,G*) <
1
J
.+es
a J.J
- a J.J.+ea
One begins the iteration with
Q= w diag(
1
aii +e
0< w < 2.
s)
It may be possible, in the course of the computation, to replace
a
estimating
with a smaller number. a
and
of the results.
Should one have erred in
a, one can correct the error on the basis
The initial bounds on
a
and
To speed convergence,
do not have to be precisely accurate.
one is best off in the final phase to chose for approximation of w diag
(
1 aJJ+ai Hi (w)
therefore
S
),
o
Q
an
402
20.
III.
SOLVING SYSTEMS OF EQUATIONS
Band width reduction for sparse matrices When boundary value problems are solved with differ-
ence methods or finite element methods, one is led to systems of equations which may be characterized by sparse That is, on
matrices.
each row of these matrices there
are only a few entries different from zero.
The distribution
of these entries, however, varies considerably from one problem to another.
In contrast, for the classical Ritz method and with many collocation methods, one usually finds matrices which are fuZZ or almost full; their elements are almost all different from zero.
Corresponding to the different types of matrices are different types of algorithms for solving linear systems of equations.
We would like to differentiate four groups of
direct methods.
The first group consists of the standard elimination methods of Gauss, Householder, and Cholesky, along with their numerous variants.
At least for full matrices, there are as
a rule no alternatives to these methods. too are seldom better.
Iterative methods
For sparse matrices, the computational
effort of the standard methods is too great. ber of equations is
For if the num-
n, then the required number of additions
and multiplications for these methods is always proportional to
n3
The second group of direct methods consists of the specializations of the standard methods for band matrices. In this section we treat Gaussian elimination for band matrices.
Two corresponding FORTRAN programs may be found in
20.
Band width reduction for sparse matrices
Appendix S.
403
The methods of Householder and Cholesky may be
adapted in similar ways.
The number of computations in this
group is proportional to
nw2, where
w
is the band width of
the matrix (cf. Definition 20.1 below).
The third group also consists of a modification of the Its distinguishing characteristic is the
standard methods.
manner of storing the matrix.
Only those entries different
from zero, along with their indices, are committed to memory. The corresponding programs are relatively complicated, since the number of nonzero elements increases during the computation.
In many cases, the matrix becomes substantially filled.
Therefore it is difficult to estimate the number of computations involved.
We will not discuss these methods further.
A survey of these methods can be found in Reid 1977. The fourth group of direct methods is entirely independent of those discussed so far.
Its basis lies in the
special properties of the differential equations associated with certain boundary value problems.
These methods thus can
be used only to solve very special systems.
methods differ greatly from each other.
Further, the
In the following two
sections we consider two typical algorithms of this type. Appendix 6 contains the FORTRAN program for what is known as the Buneman algorithm.
The computational effort for the most
important methods in this group is proportional to merely n log n
or
methods.
n.
These are known, therefore, as fast direct
Please note that
n
is defined differently in
Sections 21 and 22.
We are now ready to investigate band matrices.
404
SOLVING SYSTEMS OF EQUATIONS
III.
Definition 20.1:
A = (aij) a MAT(n,n,Q)
Let
and
A # 0.
Then
w = 1 + 2 max{dld = li-jl where
is called the band width of the band width to be 1.
A matrix for which matrix.
aij ¢ 0 or aji # 0}
For a zero matrix, we define
A. o
w << n
will be called a band
Thus we do not have a precise mathematical concept
in mind; the precise term is band width. examples, an
x
In the following
represents a nonzero element, and a blank
represents a zero.
A diagonal matrix has a band width diagonal matrix has band width
w = 3.
w = 1, and a tri-
A matrix of the follow-
ing type has band width 5:
Every full matrix has the maximal band width of
2n-1.
How-
ever, many sparse matrices also have this bandwidth, e.g. the matrix:
x
x
x
20.
Band width reduction for sparse matrices
be a matrix with band width less
A e
Now let
than or equal to the matrix
405
w, w
A,
Every linear system containing
odd.
n
i = l(l)n
= bit
JI aijxj
is equivalent to the system w j
where
Iajixi+j-k-1 l = aw+l,i' and
k = (w-l)/2
if
j
< w, 1 < i+j-k-1 < n
bi
if
j
= w+l
0
otherwise.
ai
aji =
For
p <
1
or
i+j-k-1
p > n, set
(cf. Figure 20.2).
0
a21
form a matrix
w << n, A
For
less core memory than
0
xp = 0.
aji
The quantities
0
i = l(l)n
A
a31
and
A c MAT(w+l,n',¢)
occupies substantially The ratio is
b.
(w+l)/(n+l).
a42
a53
a64
a75
a86
a97
a32
a43
a54
a65
a76
a87
a98
a22
a33
a44
a55
a66
a77
a88
a99
a12
a23
a34
a45
a56
a67
a78
a89
0
a13
a24
a35
a46
a57
a68
a79
0
0
b2
b3
b4
b5
b6
b7
b8
b9
and
n = 9
all
[b1
Figure 20.2.
A
for
w =
5
Gaussian elimination without pivot search does not lead to an increase in band width.
The same holds true if the search
for the pivot elements is restricted to the row at hand, i.e. if there are only column exchanges.
Thus the algorithm can
406
SOLVING SYSTEMS OF EQUATIONS
III.
run its course entirely within
A.
The number of computa-
tions is, as previously mentioned, proportional to
nw2.
The Ritz method leads to positive definite Hermitian matrices.
A pivot search thus is unnecessary, as it also
is with most difference equations.
The number of computations
without pivot search is
n(w2 + 3w
-
2).
Appendix S contains two FORTRAN programs, for
w = 3
and
w > 3. In many cases the band width of a matrix
A
can be
reduced substantially through a simultaneous exchange of rows and columns.
For example, we can convert x
x
into
rx x
through an exchange of rows 2 and 5 and columns 2 and S.
Such a simultaneous exchange of rows and columns is a similarity transformation with a permutation matrix.
In the
example at hand, we have the permutation (1,2,3,4,5) - (1,5,3,4,2),
which we abbreviate to
(1,5,3,4,2).
Slightly more complicated
20.
Band width reduction for sparse matrices
407
is the example x
x x x x x x x x x x x
x
The permutation
(1,3,5,6,4,2)
leads to the matrix
The band width has been reduced from 11 to S.
A further re-
duction of band width is not possible. For each matrix transforms
A
A
there exists a permutation which
into a matrix with minimal band width.
permutation is not uniquely determined, as a rule. nately there is no algorithm for large
n
This
Unfortu-
which finds a
permutation of this type within reasonable bounds of computation.
The algorithms used in practice produce permutations
which typically have a band width close to the minimal band width.
There are no theoretical predictions which indicate
how far this band width deviates from the minimal one. Among the numerous approximating algorithms of this type, the algorithm of Cuthill-McKee 1969 and a variation of Gibbs-Poole-Stockmeyer 1976 are used the most.
Both algorithms
are based on graph theoretical considerations.
Sometimes the
first algorithm provides the better result, and other times, the second.
The difference is usually minimal.
The second
408
SOLVING SYSTEMS OF EQUATIONS
III.
algorithm almost always requires less computing time. that reason, we want to expand on it here.
For
The correspond-
ing FORTRAN program is to be found in Appendix S.
The band widths of and
A = (aij) are the same.
B = (IaijI
+
Jajij)
For an arbitrary permutation matrix
P, the
band widths of P-1AP
remain the same.
Thus we may assume that
The diagonal elements of width.
P-1BP
and
A
A
is symmetric.
have no significance for band
So, without loss of generality, we can let the dia-
gonal entries be all ones.
The i-th and j-th row of aij # 0.
We write
zi - zj
A =
or
are called connected if
zj
- zi.
Thus, for
x x x x x x
we have
A
x
zl - zl,zl - z2,zl - z3,z2 - zl,z2 - z2,z3 - zl,z3 - z3.
This leads to the picture
This can be regarded as an undirected graph with numbered knots.
The rows of the matrix are the knots of the graph.
Conversely, the plan of the matrix can be reconstructed from a graph.
Thus, the graph
20.
Band width reduction for sparse matrices
409
yields the matrix
A =
Definition 20.3: (G,-)
is called an undirected graph if the
following hold: (1)
G
is a nonempty set.
(2)
-
is a relation between certain knots
The elements are called
knots.
Notation: "g - h" (3)
or
"g
and
and
h.
are connected".
h
g e G, g - g.
For all
g
g - h
always implies
h - g. A knot p+l
g
has degree
knots in
bitrary knots
r 6N and
(G,-)
is connected with exactly
g
A graph is called connected if, for ar-
G. g
if
p
and
ki e G,
i
in
h
there always exists an
G
= 0(1)r, with the properties:
(4)
ko = g, kr = h.
(5)
ki 1 - ki
Let
a
for
i
= 1(1)r.
o
be an arbitrary nonempty subset of
is also a graph (subgraph of
connected, we simply say that Definition 20.4:
(G,-)).
If
G.
(G,-)
where
is called the band width of the graph 0
(G,-)
be
G = {g1,g21" "gn}'
w = 1 + 2 max{dld = ji-jj
the given numbering.
is
is connected.
G
Let the knots of a finite graph
numbered arbitrarily, i.e. let
Then
gi - gj}
(G,-)
with respect to
410
SOLVING SYSTEMS OF EQUATIONS
III.
The problem now becomes one of finding a numbering of the knots for which the band width sible.
is as small as pos-
As a first step, we separate the graph into levels. L = (L1,L2,...9Lr)
Definition 20.5:
structure of the graph
disjoint subsets of For
(2)
is called the level
when the following hold true:
(G,-)
Li, i = 1(1)r, (levels) are nonempty
The sets
(1)
r
w
G.
Their union is
g c Li, h c Lj
and
G.
g - h, li-ji
is called the depth of the level structure.
<
1.
Its width
k
is the number of elements in the level with the most elements. D Figure 20.6 shows how to separate a graph into levels.
L1
Figure 20.6.
Theorem 20.7: of
(G,-)
L3
L2
Let
of width
L4
L5
Separating a graph into levels
be a level structure
L = (L11L21...,Lr) k.
Then there exists a numbering of the
knots such that the band width of the graph with respect to this numbering is less than or equal to Proof: of
that
First number the elements of
L2, etc. li-jJ
For < 1,
gu - 9V , Iu-vl
4k-1.
L1, then the elements
gu c Li, and
gv c Lj
< 2k-1, and hence, w < 4k-1.
it follows 0
20.
Band width reduction for sparse matrices
411
This theorem gives reason for constructing level strucUsually constructing
tures with the smallest width possible.
level structures of the greatest possible depth leads to the same end.
For every knot
Theorem 20.8:
g
of a connected graph there
exists exactly one level structure R(g) = (LI,L2)...,Lr) satisfying: (1)
LI = {g}.
(2)
For every
with
k e Li 1
with
h e Li
i > 1, there is a
This level structure is called the
k - h.
level structure with root The proof is trivial.
g. o
In the following, let the graph
(G,-)
be finite
In actuality, the connected components can
and connected.
The algorithm for band width re-
be numbered sequentially.
duction begins with the following steps: (A)
Choose a knot
(B)
Compute
of minimal degree.
g e G
R(g) _ (LI9L2,...,Lr).
last level (C)
Set j = 1.
(D)
Compute
Lr
be
kj,
j
= l(1)p.
R(kj) _ (M1,M2,...,Ms)
level structure
R(kj).
Let the elements of the
If
s
and set > r, set
mj = width of g = kj
and
return to (B). < p, increase
(E)
If
(F)
Choose set
j
j
e {1,...,p}
h = kj.
j
by 1 and repeat (D). so that
mj
is minimal.
Then
412
III.
SOLVING SYSTEMS OF EQUATIONS
The first part of the algorithm thus determines two knots and
h
g
and the corresponding level structures R(h) = (M1,M2,...,M5).
R(g) _ (L1,L2,...,Lr), Lemma 20.9:
The following are true:
(1)
s = r.
(2)
h e Lr.
(3)
g e Mr.
(4)
Li c U Mr+l_j = Ai,
i
i = 1(1)r.
j=1 (5)
Mi c U Lr+l-j = Bi,
i = 1(1)r.
j=1
Proof:
For
Conclusion (2) is trivial.
the empty set.
in
Mi
Mi+l
All knots in
(cf. Theorem 20.8(2)).
Mi
s
let
M1 = {h} c Lr.
be
M.
is a subset of
are connected with knots
Lu_1, Lu, and
Bi, we have
Conclusion (5) implies that (cf. Step (D)), s < r.
Now the
Since the elements of
are connected only to elements in since
>
We then prove Conclusion (5) by induction.
By (2), the induction begins with induction step.
i
Lu
Lu+l, and
Mi+l c Bi+l.
s > r.
By construction
This proves Conclusion (1).
From (5)
we obtain M. c B. i
i
r-l i Ul
r-l
M. c
B.
Ul
i
=
Br-1
G-L
L1 C Mr g e Mr.
This is conclusion (3).
The proof of (4) is analogous to the
proof of (5), in view of (3).
13
20.
Band width reduction for sparse matrices
413
The conclusions of the lemma need to be supplemented In most practical applications it turns out
by experience. that:
The depth
(6)
and
R(h)
r
of the two level structures
R(g)
is either the maximal depth which can be achieved
for a level structure of
(G,-)
or a good approximation
thereto. (7)
Li n Mr+l_i, i = 1(1)n, in many cases
The sets
contain most of the elements of
L.
U
Mr+l_i.
The two level
structures thus are very similar, through seldom identical. (8)
The return from step (D) of the algorithm to step
(B) occurs at all in only a few examples, and then no more frequently than once per graph.
This observation naturally
is of great significance for computing times.
In the next part of the algorithm, the two level structures
and
R(g)
R(h)
are used to construct a level
structure S = (Sl,S2,....sr)
of the same depth and smallest possible width. k e G
cess, every knot
is assigned to a level
In the proSi
by one
of the following rules: Rule 1:
If
k c Li, let
Rule 2:
If
k e Mr+l-i, let
For the elements of
k c Si. k c Si.
Li n Mr+l-i, the two rules have the
So in any case, L. n Mr+1_i e Si.
same result.
The set
r
V=G splits into
p
-
U (Li n Mr +1 i)
i=1
connected components,
V1,V2,...,Vp.
SOLVING SYSTEMS OF EQUATIONS
III.
414
Unfortunately it is not possible to use one of the Rules 1 and 2 independently of each other for all the elements of
V.
Such an approach would generally not lead to a level structure
S.
But there is
Lemma 20.10:
the same
V
If in each connected component of
rule (either always 1 or always 2) is applied constantly, then
We leave the proof to the
is a level structure.
S
reader.
In the following we use elements of the set K2
the width of
Let
T.
to denote the number of
ITI
K1
be the width of
The second part of the algorithm
R(h).
consists of four separate steps for determining (G)
Compute
Si = L. n Mr+l-i,
and determine
S:
K2, set
and
KI
V.
V
If
i = 1(1)r
is the empty set, this part of
the algorithm is complete (and continue at (K)). wise split Order the (H)
Set v = 1.
(I)
Expand all from
Vv
into connected components
V
so that always
Vi
Si
Si,
to
by rule 1.
IVj+1I
<
Vj,
j
= l(1)p.
IVjI,
j
= 1(1)p-l.
Expand
Si
to
Si,
i
by rule 2.
Vv
= 1(1)r,
Compute
K3 = max{1Si1
Ji = 1(1)r
where
Si # Si}
K4 = max{jSij
Ji = 1(1)r
where
Si # Si}.
K3 < K4
otherwise set
or if
Other-
i = 1(1)r, by including the knots
by including the knots from
If
and
R(g)
K3 = K4
and
Si = Si, i = 1(1)r.
K1 < K2, set
Si = Si;
20.
Band width reduction for sparse matrices
(J)
For
v < p, increase
v
415
by 1 and repeat (I).
This completes the computation of the level structure S. S
Figure 20.11 shows the level structures The knots in
for a graph.
this case knots.
V
are denoted by
In
x.
consists of two components, each having three
For the left component rule 2 was used, and for the
right, rule 1. of 3.
G-V
R(g), R(h), and
In this way, one obtains the optimal width
The second part of the algorithm consumes the greater
part of the computing time.
This is especially so when
has many components.
Figure 20.11.
Level structures
R(g), R(h), and
S
V
416
SOLVING SYSTEMS OF EQUATIONS
III.
Finally, in the third and last part of the algorithm, one starts with the level structure
and derives the numbering
S
of the graph: (K)
Set
p = 1.
(L)
Let
k
Go to (N).
run through all the elements of
order of the numbering. k c Sp
For fixed
which are connected to
yet been numbered
S
in the
p-1
k, number all
and which have not
k
so that the degree of
k
does not
decrease. (M)
Let
k
run through all the elements of
Sp
which have
already been numbered, in the order of the numbering. For fixed
k, number all
k e Sp
which have not been
numbered and which are connected to degree of (N)
k
Sp
which are unnumbered,
search for an unnumbered element of
(0)
Increase
so that the
does not decrease.
If there remain elements of
degree.
k
Sp
of minimal
Assign it the next number and return to (M). p
by
1
and, if
p < r, return to (L).
This last step also completes the numbering.
In some
cases, the result can be improved through the use of an iterative algorithm due to Rosen 1968.
Lierz 1976 contains
a report on practical experience with the algorithm. All algorithms for band width reduction are useful primarily for sparse matrices which are not too large and irregularly patterned.
These arise primarily in finite ele-
ment approaches where the geometry is complicated. magnitudes are on the order of
n = 1000
and
Typical
w = 50.
21.
Buneman Algorithm
21.
Buneman Algorithm Let
417
be a rectangle with sides parallel to the axes,
G
and let the following problem be posed on
G:
Au(x,y) = q(x,y),
(x,y)
E G
u(x,y) = p(x,y),
(x,y)
E DG.
This problem is to be solved with a five point difference method.
We will assume that the distance separating neigh-
boring lattice points from each other is always equal to (cf. Section 13). function
w
h
The difference equations for the discrete
are then uniformly the same for all internal
lattice points:
w(x+h,y) + w(x-h,y) + w(x,y+h) + w(x,y-h)
-
4w(x,y)
= h2q(x,y) (x+h,y), (x-h,y), (x,y+h), or
If one of the points
should lie in the boundary of the rectangle, w placed by the boundary value
is to be re-
at that point.
the points by rows or columns.
(x,y-h)
We number
In either case, we obtain a
linear system of equations of the following type: Mw = z r A
M =
I
I
` A,
E MAT(pn,pn, 1R) I
`A (21.1)
A = 1
-1
I1
1
'4
F MAT(n,n, IR)
SOLVING SYSTEMS OF EQUATIONS
III.
418
wl
W =
Z
=
w.,z. cJRn
;
(i=1(1)p).
L zP J
wP J The inhomogeneity
z
contains
as well as the bound-
h2q(x,y)
(x,y).
ary value
0. Buneman has developed a simple, fast algorithm for solving (21.1) for the case where
p = 2k+1
-
k e N.
1,
wo = w k+l = 0.
In the following, we always let
Three conse-
2
cutive components of w.
J-2
+ Aw. j-1 w.
will satisfy the block equations:
w
= z. J-1
+
'jJ
+ Aw. + w.
J-1
= Z.
j+l
J
J
w. +Aw. j+2 J J+1 + w j+2
Multiply the middle equation by equations.
When restricted to
j
z
j+1*
-A
and then add the three
=
2(2)2k+1-2, this results
in - Azj. wj-2 + (21-A2)w.+wj+2 = zj-I + zj+l
This represents even index.
2k-1
(21.2)
block equations for the unknowns of
After solving this reduced system, we can deter-
mine the remaining unknowns of odd index by solving the ndimensional system Awj = zj
- wj_1 - wj+l,
j
=
1(2)2k+I-1
(21.3)
The reduction process described here can now be applied again, to the
2k-1
block equations (21.2), since these have the
same structure as (21.1).
Using the notation
Buneman Algorithm
21.
A(1) = 21-A z.
1
419
2 2(2)2k+1-2
Azj,
+ zj+1
=
j
we obtain the twice-reduced system of block equations w.
4
+ [2I-(A(l))21w+w J
=
j+4 +4
j,
z(1)+z(1)-A(1)z(1) J-2
What we have here is a system with
J+2
2k-1-1
=4(4)2k+l-4. j=4(4)2
block equations.
After this system is solved, the remaining unknowns of even index can be found as the solution of the n-dimensional system A(1)wj
2(4)2k+1-2.
wj
=
2
- wj+2,
=
j
The remaining unknowns of odd index are then computed via (21.3).
The reduction process we have described can obviously be repeated again and again, until after
reduction steps
k
only one system, with a single block equation, remains to be solved.
After that comes the process of substituting the un-
knowns already computed in the remaining systems of equations. The entire method can be described recursively as follows.
Let
A(0) = A z30)
Then, for
1(1)2k+1
= 1P
=
j
r = 1(1)k, define
A(r)
= 21
z(r)
=
(A(r-1))2 -
z(r-1) j-2r-1
+ z(r-1) j+2r-1
A(r-1) -
z(r-l) J
J
j
Then, for
=
2r(2r)2k+1-2r
r = k(-1)0, solve successively
,
(21.4)
420
III.
A(r)w.
=
w
w
z(r)
j+2r
J
SOLVING SYSTEMS OF EQUATIONS
j=2r(2r+l)2k+1-2r.
r,
(21.5)
j-2
This method is known as cyclic odd/even reduction, and frequently, simply as the COR algorithm. matrices
grows with
A(r)
The band width of the
r, so that a direct determination
of the matrices requires extensive matrix multiplication.
For
this reason, they are explicitly represented instead as the product of simpler matrices in the following. The recursion formula (21.4) implies that polynomial
of degree
P r
2r
in
A(r)
is a
A:
2
2
A(r) = P r(A) 2
r-1
2J 2( r)A,
=
j=0
e IR,
r = 1(1)k.
J
The highest order coefficient is
= -1. c(r,) r 2
We want to represent the polynomial as a product of linear factors.
To do that, we need to know its zeros.
found most easily via the substitution
These are
_ -2 cos 9.
From
(21.4) we obtain the functional equation
P2r(E) =
2
-
LP 2r-1(E) ] 2
which reduces inductively to P r(E) = -2 cos(2rO).
The zeros of
P r(E)
Ej
The matrices
=
A(r)
of linear factors:
thus are
2 cos('r+l},
j
= 1(1)2x.
therefore can be represented as a product
Buneman Algorithm
21.
421
r=0
A, A
(r)
-t =
1
(21.6) 2r
\
J01IA
=L
+
2 cos (2j-l nri], +T
r=1(1)k.
This factorization can be used both in the reduction phase (21.4) for computing
and in the following solution
phase (21.5) for determining
In this way, the matrix
wj.
multiplication as well as the systems of equations are limited to tridiagonal matrices of a special kind.
This last described method is called cyclic odd/even reduction with factorization, and frequently is simply known as the CORF algorithm.
Various theoretical and practical investigations have demonstrated that the COR and CORF algorithms described here are numerically unstable (cf. e.g. Buzbee-Golub-Nielson 1970 and Schr6der-Trottenberg-Reutersberg 1976).
Buneman has
developed a stabilization (cf. e.g. Buzbee-Golub-Nielson 1970) which is based on a mathematically equivalent reformulation from (21.4).
for computing the vectors Let
1(1)2k+1-1.
zj,
0,
r = 1(1)k, define
Then, for p(r)
=
J
(A(r-1))-l(p(r-l)
p(r-l) ]
q (r)
=
=
j
j-2
q(r-1) j-2r-1
+ q(r-1) j+2r-1
-
r- 1
2p (r)
+
p(r-1) j+2r- 1
-
q(r-1))
(21.7)
2r(2r)2k+1
j
=
The sequences
-
2r
per), qtr), and
are related as follows:
422
SOLVING SYSTEMS OF EQUATIONS
III.
A(r)pJr) + q,r) (21.8)
r = 0(1)k,
=
j
2r(2r)2k+1
The proof follows by induction on tion is immediate, since
2r
r = 0, the asser-
For
r.
0, and
q(0) = zj
Now suppose the conclusion (21.8) holds for step
r - r+l
z(0)
The induction
r.
is:
A(r 1)p(r+l)+q (r+l) J
=
= 2p(r+1)-(A(r))2p(r+l +q (r) +q (r) -2p(r+l) J J j-2r j+2r
J
-(A(r))2pjr)+A(r)(p(r)+p(r)r_gjr))+q(r)+q(r)
_
j-2
r
r
j+2
r
j-2
j+2
A(r)p(r)r+q(r)r+A(r)p(r)r+q(r)r j-2
j-2 =
z(r)r +z(r) r -A(r)z(r) J-2
J
j+2
j+2 j+2 z(r+l). =
The Buneman algorithm can be summarized as follows: 1.
Reduction phase:
Compute the sequences
per)
and
qtr)
from (21.7): 0,
J
1(1)2k+1-1
z. J
J
A(r-l)(p(r-l)_p(r))
=
J
=
p(r-T)1 + p(r-T)1
q(r-1) -
J
P(r) J (r) gj
=
2.
j-2
p(r-1)
(p(r-1)
`J
-
J +
(r-1)
= qj-2r-1
r = 1(1)k,
J+2
j
_
(r-1)
p(r)) J 2
qj+2r-1
=
Solution phase:
2r(2r)2k+1
(21.9)
J
(r)
pj -
2r
Solve the system of equations (21.5),
using the relation (21.8):
21.
Buneman Algorithm
423
- w
qtr)
j+2
wj = p r)
+
- w
r
j-2
r
(wj -p r) )
r = k(-1)0,
(21.10) 2r(2r+1)2k+1
=
j
-
2r.
In both the reduction and solution phase we have systems of equations with coefficient matrix
A(r), and these systems
are solved by using the factorization (21.6).
system it is necessary to solve
Thus, for each
systems in the special
2r
symmetric, diagonal dominant, tridiagonal matrix
r c-4 1
`1
c-4
-2 < c = 2 cos(2 r+i ir) < 2. 2
The vectors
p30)
q30), which are set equal to zero and
and
zj, respectively, at the beginning of the reduction phase, take up a segment of length
pn =
(2k+1-1)n
in memory.
cause of the special order of computation, the vectors and
qtr)
can overwrite
and
Beper)
in their respec-
tive places, since the latter quantities are no longer needed. Similarly, in the solution phase, the successively computed solution vectors
wj
can overwrite the
qtr)
in their cor-
responding places. For
k = 2
and
p = 7, the sequence of computation
and memory content in the reduction and solution phases is given in Table 21.11.
r
0
1
2
2
1
0
Step
1
2
3
1
2
3
,
3
p (0)
,
S
P (0)
7
w2, w 4, w6
a (0) ,q 3(0) ,q (0) ,a (C) 1 5 7
1
p (0)
w4
a 2(1) ,q 6(1)
2
PM
q42)
4
,0.
w 1'
w 3'
w2'' 6
W4
p ( 2)
P6(1)
x 4(2)
a 2(1) ,q 4(1) ,q 6(1)
,
P 4(2)
P 4(1) ,6 P (1)
P2(1) p
P (0)
q2(1) 0.
w S'
(1) 4
4 ,
w
j= 1(1)7
a 1(0) ,a 2(0) ,a (0) ,a 4(0) ,a (0) ,a 6(0) >a 7(0) 3 5
z.
P2
3
P 1(0) ,P 2(0) ,P 3 (0) ,P 4(0) ,P 5(0) ,P 6(0) ,P 7(0)
3
initialization: p(.0 )=0, q(0)=
computes
7
x (1) 6
6
{
,
P 2(1)
,2 a (1)
2
,3 P (0)
,a (0) 3
3
,
6
,a S(0) ,a (1) (1) 4 6
5
P 4(2) ,P 5(0) ,P 6(1)
, 0.
4
,
,
P (0) 7
a (0) 7
P7
w
1'
w
2'
w
3'
w
4'
w S
'w61w7
7
,g2 1) ,g 30) ,w4,q 50) ,q 61) ,q 0)
q1(0) ,w2,g3(0) ,w 4,g5(0) ,w6, g7(0)
qi 0)
a 1(0) ,a 2(1) ,a (0) ,a (2) ,a 5(0) ,q 6(1) ,a 7(0) 3 4
P 1(0)
a 1(0)
1
memory content at end of step
Stages of the computation for the reduction phase (upper part) and the solution phase (lower part)
requires
Table 21.11:
Buneman Algorithm
21.
425
The Buneman algorithm described here thus requires twice as many memory locations as there are discretization points.
There exists a modification, (see Buzbee-Golub-
Nielson 1970), which uses the same number of locations.
How-
ever, the modification requires more extensive computation. We next determine the number of arithmetic operations in the Buneman algorithm for the case of a square p = n.
G, i.e.
Solving a single tridiagonal system requires
6n-5
operations when a specialized Gaussian elimination is used. For the reduction phase, noting that p = n =
2k+l
k+l = log2(n+1),
1,
-
we find the number of operations to be k
[6n+2r 1(6n-5)]
I
I
r=l j eMr where 2r(2r)2k+l-2r}
Mr =
{j
2k+l-r
and cardinality(Mr) = k
k+1 r
2(2 1)[6n+2
=
-
r -l
1.
Therefore,
(6n-5)] = 3n1og2(n+l)
r=1
+ 0[n log2(n+l)].
Similarly, for the solution phase we have k
J_ [3n+2r(6n-5)]
I
r=0 j eMr where Mr =
{i
and cardinality(:Nr) = 2k k 1
r=0
2k r[3n+2r(6n-5)]
=
r
=
2r(2r+l)2k+1-2r} Therefore,
3n21og2(n+l)+3n2+0[n log2(n+l)].
426
SOLVING SYSTEMS OF EQUATIONS
III.
Altogether, we get that the number of operations in the Buneman algorithm is 6n21og2(n+l) + 3n2 + O[n log2(n+l)].
At the beginning of the method, it is necessary to compute values of cosine, but with the use of the appropriate recursion formula, this requires only
0(n)
operations, and thus
can be ignored.
Finally, Appendix 6 contains a FORTRAN program for the Buneman algorithm. 22.
The Schroder-Trottenberg reduction method We begin by way of introduction with the ordinary dif-
ferential equation
-u"(x) = q(x). The standard discretization is 2u(x)-u(x+h)-u(x-h)
= q(x)
h2
for which there is the alternative notation (2I-Th-ThI)u(x) Here
I
=
denotes the identity and
h2q(x).
Th
the translation opera-
tor defined by Thu(x) = u(x+h).
Multiplying equation (22.1) by (2I+Th+Th1)
yields (2I-Th-Th2)u(x)
=
(22.1)
h2(21+Th+Th1)q(x)
The Schroder-Trottenberg reduction method
22.
427
Set
q0(x) = q(x),
ql(x) = (21+Th+Th1)go(x)
and use the relations Th = Tjh,
ThJ
=
This simple notational change leads to the simply reduced equation (2I-T2h-T-h1 )u(x)
=
h2g1(x).
This process can be repeated arbitrarily often.
The m-fold
reduced equations are (2I-Tkh-Tkh)u(x) = h2gm(x)
where
k = 2m
(22.2)
and
qm(x) _ (21+TRh+Tth)gm-1(x),
R =
2m-1
The boundary value problem -u"(x) = q(x),
u(O) = A,
can now be solved as follows. into
2n
u(l) = B
Divide the interval
subintervals of length
h = 1/2n.
Then compute
sequentially ql(x)
at the points
2h, 4h, ..., 1-2h
q2(x)
at the points
4h, 8h, ..., 1-4h
qn-1(x) The
at the point
2n-lh = 1/2.
(n-l)-fold reduced equation
[0,1]
428
SOLVING SYSTEMS OF EQUATIONS
III.
(2I-T n 1 -T n-1 )u(1/2) = h2gn 1(1/2)
h
2
to be determined immediately, since the
u(1/2)
allows
values of
u
h
2
are known at the endpoints
and
0
Simi-
1.
larly, we immediately obtain
u(1/4)
(n-2)-fold reduced equation.
Continuing with the method
and
u(3/4)
from the
described leads successively to the functional values of at the lattice points
jh,
u
= 1(1)2n-1.
j
In the following we generalize the one dimensional reduction method to differential or difference equations with constant coefficients. Definition 22.3:
Gh = {vh
Let
I
v e a}, h > 0.
Then we call
r
av Th
Sh =
av c R
v=-r
a one-dimensional difference star on the lattice called symmetric if a2v =
odd (even) if
a v = av 0
for
v = 1(1)r.
(a2v+1 = 0) for all
-r < 2v < r
(-r < 2v+l < r).
the sum for
Sh
v
Gh.
Sh
Sh
is
is called
with
In Schroder-Trottenberg 1973,
is simply abbreviated to [a-ra-r+l .....* ar-lar]h.
o
Each difference star obviously can be split into a sum
Sh = Ph + Qh, where Since
Ph T
2hv
is an even difference star, and
Qh
is an odd one.
v
= T 2h, the even part can also be regarded as the
difference star of the lattice
G2h = {20
I
v e22}.
The reduction step now looks like this:
The Schroder-Trottenberg reduction method
22.
Shu(x) = (Ph+Qh)u(x)
=
429
h2q(x)
(Ph-Qh)(Ph+Qh)u(x) = (Ph-Qh)u(x) = h2(Ph-Qh)q(x) Ph
Since
obviously is an even difference star, it can
Q2
-
be represented as a difference star on the lattice denote it by
We
G2h.
We further define
R2h.
ql(x) = (Ph-Qh)q(x) Thus one obtains the simply reduced equation R2hu(x) = h2g1(x).
In the special case Sh
1Th1 + a o Th + a1Th
one obtains 2 o
2
Ph = aoTh 2
Q2 = a2 Th2 + 2a-la1Th + a T R2h =
2-T-1 2h +
2_
2
- aIT2h.
The reduction process can now be carried on in analogy with (22.2).
In general, the difference star will change from
step to step.
The number of summands, however, remains con-
stant.
The reduction method described also surpasses the Gauss algorithm in one dimension in numerical stability. main concern however is with two dimensional problems.
Our
We
begin once again with a simple example.
The standard discretization of the differential equation
430
SOLVING SYSTEMS OF EQUATIONS
III.
-Au(x) = q(x),
x c IR 2
is
4u(x)-u(x+he1)-u(x-hel)-u(x+he2)-u(x-he2) = h2q(x).
The translation operators Tv,hu(x) = u(x+hev)
lead to the notation (4I-Tl,h-T- 1h-T2,h-T-1 )u(x) = h2q(x) I, 2,h
Multiplying the equation by (41+T1,h+T11h+TZ
T_
1
yields T_
{16I
1
T2,h
-
-
T12h-T2
h2(4I+Tl,h+T11h+T2
h-T2_
h+T21h)q(x)
We set
ql(x) = (41+Tl,h+T11h+T2,h+T21h)q(x)
and obtain the simply reduced equations {121-2(T
l,hT2,h+Tl,hT2,h+TllhT2,h+Tl,hT2,h)
h+T12h+T2 h+T22h))u(x)
=
h2g1(x),
(T121
In contrast to the reduction process for a one dimensional difference equation, here the number of summands has increased from
5
to
spread farther apart.
9.
The related lattice points have The new difference star is an even
polynomial in the four translation operators
Tl,h' Tllh'
The Schroder-Trottenberg reduction method
22.
®
431
0 8
Figure 22.4:
Related lattice points after one reduction step.
T2 h,
and
The related lattice points are shown in
TZlh.
Figure 22.4 with an "x".
Such an even polynomial can be
rewritten as a polynomial in two 'larger' translation operators.
Let e3 = (el + e2)/./-2-
e 4 = (e2
(22.5)
- el)/17
be the unit vectors rotated by
it/4
counterclockwise, and
let
k = h"T T3,ku(x) = u(x+ke3) = u(x+hel+he2) T4 ku(x) = u(x+ke4) = u(x+he2-hel). Then we have T3,k =
Tl,h T2,h -1
-1
-1
(22.6)
-1
Tl,h T2,h.
The simply reduced equation therefore can be written in the following form:
432
SOLVING SYSTEMS OF EQUATIONS
III.
{12I-2(T3,k+T4,k+T31k+T41k)-(T3,kT4'k+T3'kT-1k+T-1kT4,k + T3IkT41k)u(x) = h2g1(x).
The difference star on the left side once again can be split into an even part, 121
(T3,kT4,k+T3,kT41k+T31kT4,k+T31kT41k)
and an odd part,
2(T+TT 3,k 4,k+ and reduced anew.
3,1k+T- 1
4,k)
One then obtains a polynomial in the
translation operators
T1,2h, T1,2h' T2,2h, T2,2h.
Thus, beginning with a polynomial in -1
-1
Tl,h' Tl,h' T2,h' T2,h we have obtained, after one reduction step, a polynomial in
-1
T3,h/' T3,hr' T4,h 2
-1 T4,hV2
and after two reduction steps, a polynomial in -1
1
T1,2h' T1,2h' T2,2h' T2,2h In particular, this means that the and
e2 -> e4
r/4
rotation
e1
e3
has been undone after the second reduction.
This process as described can be repeated arbitrarily often. The amount of computation required, however, grows substantially with the repetitions.
For this reason we have not
carried the second reduction step through explicitly. We now discuss the general case of two-dimensional reduction methods for differential and difference equations
The Schroder-Trottenberg reduction method
22.
with constant coefficients.
433
We preface this with some gen-
eral results on polynomials and difference stars. Definition 22.7: p
variables
be a real polynomial in
P(xl,...,xp)
Let
xi,...,xp.
is called even if
P
P(-x1,...,-xp) = P(xl,...,xp and
is called odd if
P
P(-xl,...,-xp) _ -P(xl,...,xp It is obvious that the product of two even or of two odd polynomials is even, and that the product of an even polyFurther, every polynomial can
nomial with an odd one is odd.
be written as the sum of one even and one odd polynomial. Lemma 22.8:
Let
P(xl,x2,x3,x4)
there exist polynomials
and
P
P(x,x 1,y,Y 1) = P(xy,(xy) P(xY,(xY)
.,-
i+j
l,yx l,xy 1)
=
P
Then
with the properties:
l,yx l,xy 1)
P(x2,x 2,y2,y-2).
We have
Proof:
Since
be an even polynomial.
P
--1 --
-
S
iE7l
aijx iY J
aij
,
E
pl
is even, we need only sum over those i,j For such
even.
x1Y3
---11
=
i,j
(xy)r(yx-1)s,
E 7l
with
we have r = (i+j)/2,
s
= (j-i)/2.
We obtain P(x,x-1,Y,Y-1)
I
r,sE ZZ
r+s(xY)r(Yx-1)s
ar-s ,
and therefore, the first part of the conclusion.
Similarly,
434
SOLVING SYSTEMS OF EQUATIONS
III.
we obtain 1)
l,yx-l,xy
P(xy,(xy)
=
a..(xy)1(yx
X
1)j
i.,jE ZZ
(xy)1(yx-1 ) j
=
P(xy,(xy) l,yx
x2ry2s, l,xy 1)
ar+s,s
rx2ry2s 0
r,sE ZZ h > 0, define the lattice
For fixed
Definition 22.9:
s = (i+j)/2
r = (i-j)/2,
Gv
by
{xe7R2Ix=2v/2h(iel+je2); i,jeZZ}
when v is even
{xe ]R2Ix=2v/2h(ie3+je4); i,jeZl}
when v is odd.
Gv = .
A difference star Ti
k,
on the lattice
Sv
Gv
T2-1k
for even
T4-1k
for odd
Tllk, T2 k,
is a polynomial in v
or in
T3 k, T31k, T4 k, Here
k = 2v/2h.
Sv
is called even (odd) if the correspond-
ing polynomial is even (odd). Go = Gl = G2 D ...
p
O
.
0
It follows that
Figure 22.10 shows
@
p
Go, G1, G2, and
Go
:
G1
:
G2
:
G3
Figure 22.10:
v.
.
0 X
D
A representation of lattices Go, Gl, G2 and G3.
G3.
22.
The Schr6der-Trottenberg reduction method
Theorem 22.11: Gv.
be partitioned into a sum
Sv
Let
S
where
be a difference star on the lattice
Sv
Let
v = Pv
Qv
is an even difference star and
PV
435
ference star.
Qv
is an odd dif-
Then
Sv+1 = RV - Qv)Sv may be regarded as a difference star on the coarser lattice Gv+l c Gv. Proof:
The difference star S
is even since
P2
v+l
2
= P2v
and
- Qv
are even.
Q2
Now let
By Lemma 22.8, there exists a polynomial Sv+l(Tl,k,T-
1
k,T2,k,T-
Sv+l
v
be even.
with
1
2, k)
l,
1,T2,kT1'k,Tl,kT2,k). Sv+1(Tl,kT2,k,(Tl,kT2,k) Since
1
T l,k
one obtains, where
_ T4,k,
m = k/ = 2(v+l)/2h,
Sv+l(Tl,k,Tl1k,T2,k,T2 k) = 9v+l(T3,m,T3,mT4,m,T4,m).
The right side of this equation is a difference star on the lattice
Gv+l.
For odd
taking note of (22.6):
v, one obtains, correspondingly and
SOLVING SYSTEMS OF EQUATIONS
III.
436
-1
-1
Sv+l(Tl,RT2,L,(Tl,kT2,k)
-1
-1
-1
,T1,kT2,k,Tl,kT2,k)
Sv+1(T1,k,T1 2 k,T2,k,T2,2) = Sv+l(Tl,m,Tl,m,T2,m,T2,m) 2
k = k//i,
m = 2k = k/ = 2(v+l)/2h.
a
We may summarize these results as follows: Reduction step for equations with constant coefficients: initial equation:
Svu(x) = h2gv(x),
partition:
Sv = pv + Qv,
compute:
qv+l(x) _ (PV-Qv)gv(x),
x E Gv+l
reduced equation:
Sv+lu(x) = h2gv+1(x),
x e Gv+1'
Pv
x E Gv even, Q.
odd
The reduction process can be repeated arbitrarily often. The difference stars then depend only on the differential equation and not on its right side.
Thus, for a particular
type of differential equation, the stars can be computed once for all time and then stored.
Our representation of the difference stars becomes rather involved for large
f
v.
aiiT iI.
Instead of
k . Tj2.k
for
v
even
for
v
odd
Sv =
i
j
1i T 3,k T 4,k i,3EZZ a
k = 2v/2h
we can use the Schroder-Trottenberg abbreviation and write
22.
The Schroder-Trottenberg reduction method
-1,1
437
......
a0,1
a1,1
a0,-1
al,-1 ......
Sv
a-1,-1
V
With the differential equation
-Au(x) = q(x), the standard discretization leads to the star
So =
0
-1
0
-1
4
-1
0
-1
0 0
The first three reductions are:
S1 =
-1
12
-2
-1
-2
-1 1
1
0
1
0
0
0
-2
-32
-2
0
1
-32
132
-32
1
0
-2
-32
-2
0
0
1
0
0
S3 =
-2
-2
0
S2 =
r
-1
0J 2
l
-4
-4
-752
-2584
-752
6
-2584
13348
-2584
6
-4
-752
-2584
-752
-4
1
-4
6
-4
The even components
6
PV
-4
of the stars
1 "1 -4
1J3 SV
are:
438
SOLVING SYSTEMS OF EQUATIONS
III.
PO =
P2
0
0
0
0
4
0
0
0
0
0
0
0
1
0
-2
0
-2
0
0
132
0
1
0
-2
0
-2
0
0
0
1
0
0
0
1
6 0
1
0
-752 0
-752 0
6 0
13348 0
6
-1
0
-1
0
12
0
-1
0
-1
1
0
0
0 3
,
1
1
P
P1
-752
2
0
0
6
-752
0
0
1
3
When the reduction method is applied to boundary value problems for elliptic differential equations, a difficulty arises in that the formulas are valid only on a restricted region whenever the lattice point under consideration is sufficiently far from the boundary.
In certain special cases, however, one
is able to evade this constraint. Let the unit square G = (0,1) x (0,1) be given.
Let the boundary conditions either be periodic, i.e. u(O,x2) = u(1,x2) u(x1,0) = u(xl,l) x1,x2 e [0,1] alu(0,x2) = a1u(1,x2) a2u(xl,o) = a2u(x1,l),
or homogeneous, i.e.
u(xl,x2) =
0
for
(xl,x2) e
DG.
Let
the difference star approximating the differential equation be a symmetric nine point formula:
22.
The Schroder-Trottenberg reduction method
SO
FY
9
3
a
Y
439
Y S
Y
10'
The extension of the domain of definition of the difference equation S0u(x) = q0(x) to
x E Go
is accomplished in these special cases by a con-
tinuation of the lattice function
to all of
u(x)
the case of periodic boundary conditions, u(x)
are continued periodically on boundary conditions, u(x)
and
G0.
and
Go.
In
q0(x)
In the case of homogeneous
q0(x)
are continued anti-
symmetrically with respect to the boundaries of the unit square:
u(-xl,x2) _ -u(xl,x2),
g0(-xl,x2) = -g0(x1,x2)
u(x1,-x2) = -u(xl,x2),
go(x1,-x2) = -g0(x1,x2)
u(l+xl,x2) _ -u(1-xl)x2), q0(1+x1,x2) = -q0(l-xl,x2) u(xl,l+x2) = -u(xl,l-x2), 90(x1,1+x2) = -go(xl,l-x2) g0(x1,x2) =
0
for
(xl,x2) e @G.
This leads to doubly periodic functions.
Figure 22.12 shows
their relationship.
Figure 22.12:
Antisymmetric continuation.
The homogeneous boundary conditions and the symmetry of the difference star assure the validity of the extended difference equations at the boundary points of
G, and therefore,
440
SOLVING SYSTEMS OF EQUATIONS
III.
on all of
An analogous extension of the exact solution
Go.
of the differential equation, however, is not normally possible, since the resulting function will not be twice differentiable.
We present an example in which we carry out the reduction method in the case of a homogeneous boundary condition and for
h = 1/n
and
After three reduction steps,
n = 4.
we obtain a star, S3, which is a polynomial in the translation operators 1 -1 T3,k,T3,k,T4,k,T4,k,
k=
2
3/2
h = 2hI.
The corresponding difference equation holds on the lattice G3
(cf. Figure 22.10).
Because of the special nature of the
u, it only remains to satisfy the equation
extension of
S3u(1/2,1/2) = 83(1/2,1/2).
By appealing to periodicity and symmetry, we can determine all summands from
u(1/2,1/2), u(0,0), u(1,0), u(0,1), and
But the values of u(1/2,1/2)
u
on the boundary are zero.
can be computed immediately.
sulting from the difference star
S2
u(1,1).
Thus
The equations re-
on the lattice
longer contain new unknowns (cf. Figure 22.10).
G2
no
The equations
S1u(1/4,1/4) = gl(1/4,1/4),
Slu(3/4,1/4) = gl(3/4,1/4),
S1u(1/4,3/4) = g1(1/4,3/4),
Slu(3/4,3/4) = g1(3/4,3/4),
for the values of
u
at the lattice points
(3/4,1/4), (1/4,3/4), and
(3/4,3/4)
otherwise only the boundary values of determined value
u(1/2,1/2)
(1/4,1/4),
are still coupled; u
and the by now
are involved.
Thus the
4 x 4
The Schroder-Trottenberz reduction method
22.
system can be solved.
441
As it is strictly diagonally dominant,
it can be solved, for example, in a few steps (about 3 to 5) with SOR.
All remaining unknowns are then determined from
S0u(1/2,1/4) = q0(1/2,1/4), S0u(1/4,1/2) = q0(1/4,1/2), S0u(3/4,1/2) = 80(3/4,1/2), S0u(1/2,3/4) = g0(1/2,3/4).
In all cases arising in practice, this system too is strictly diagonally dominant.
The method of solution described can
generally always be carried out when say
n = 2m, m e 1N, h = 1/n.
n
is a power of
2,
The (2m-l)-times reduction
equation S2m-lu(1/2,1/2)
= q2m-1(1/2,1/2) u(1/2,1/2).
is then simply an equation for
The values of
u
at the remaining lattice points then follow one after another from strictly diagonally dominant systems of equations. By looking at the difference stars
S1, S2, S3
formed
from -1
0
-1
4
-1
0
-1
0
0
S
0
=
one can see that the number of coefficients differing from In addition, the coefficients differ
zero increases rapidly.
greatly in order of magnitude.
This phenomenon generally can
be observed with all following
SV, and it is independent of
the initial star in all practical cases.
As a result, one
typically does not work with a complete difference star SV, but rather with an appropriately truncated star. a truncation parameter
a
Thus, after
has been specified, all coeffici-
442
SOLVING SYSTEMS OF EQUATIONS
III.
of
ents
with
SV
IaijI
< a1aool
are replaced by
aiJ
zeros.
For sufficiently small
this has no great influ-
a
ence on the accuracy of the computed approximation values of 10-g
u.
As one example, the choice of
for the case of
a =
the initial star
S
=
0
0
1
1
4
-1
0
-1
0
with
leads to the discarding of all coefficients a1J
Iii
+
Iii
> 4,
lii
>
3,
Iii
> 3.
In conclusion, we would like to compare the number of operations required for the Gauss-Seidel method (single-step method), the SOR interation, the Buneman algorithm, and the reduction method.
We arrived at Table 22.13 by restricting
ourselves to the Poisson equation on the unit square (model problem) together with the classic five point difference formula.
All lower order terms are discarded.
N2
denotes the
number of interior lattice points, respectively the number of unknowns in the system.
The computational effort for the
iterative method depends additionally on the factor which the initial error is to be reduced.
e
by
For the reduction
method, we assume that all stars are truncated by
o = 10
8.
More exact comparisons are undertaken in Dorr 1970 and Schroder-Trottenberg 1976. marks on computing times.
Appendices 4 and 6 contain reIn looking at this comparison, note
that iterative methods are also applicable to more general problems.
22.
The Schroder-Trottenberg reduction method
Gauss-Seidel
443
2 N4Ilog ej Tr
SOR
21
N3Ilog e
Buneman
[61og2(N+1)+3]N2
Reduction
36 N2
Table 22.13:
The number of operations for the model problem
APPENDICES: FORTRAN PROGRAMS
Appendix 0:
Introduction.
The path from the mathematical formulation of an algorithm to its realization as an effective program is often difficult.
We want to illustrate this propostion with six
typical examples from our field.
The selections are intended
to indicate the multiplicity of methods and to provide the reader with some insight into the technical details involved. Each appendix emphasizes a different perspective:
computa-
tion of characteristics (Appendix 1), problems in nonlinear implicit difference methods (Appendix 2), storage for more than two independent variables (Appendix 3), description of arbitrary regions (Appendix 4), graph theoretical aids (Appendix 5), and a comprehensive program for a fast direct method (Appendix 6).
Some especially difficult questions,
such as step size control, cannot be discussed here.
As an aid to readability we have divided each problem into a greater number of subroutines than is usual.
This is
an approach we generally recommend, since it greatly simplifies the development and debugging of programs. 444
Those who
Appendix 0:
Introduction
445
are intent on saving computing time can always create a less structured formulation afterwards, since it will only be necessary to integrate the smaller subroutines.
However, with
each modification, one should start anew with the highly structured original program.
An alternative approach to re-
duced computing time is to rewrite frequently called subroutines in assembly language.
This will not affect the
readability of the total program so long as programs equivalent in content are available in a higher language.
The choice of FORTRAN as the programming language was a hard one for us, since we prefer to use PL/l or PASCAL. However, FORTRAN is still the most widespread language in the technical and scientific domain.
The appropriate compiler
is resident in practically all installations.
Further,
FORTRAN programs generally run much faster than programs in the more readable languages, which is a fact of considerable significance in the solution of partial differential equations.
The programs presented here were debugged on the CDCCYBER 76 installation at the University of Koln and in part on the IBM 370/168 installed at the nuclear research station Julich GmbH.
They should run on other installations without
any great changes.
We have been careful with all nested loops, to avoid any unnecessary interchange of variables in machines with virtual memory or buffered core memory. for example, the loop DO 10 I = 1,100
DO 10 J = 1,100 10
A(J,I) =
0
In such installations,
APPENDICES
446
is substantially faster than DO 10 I = 1,100
DO 10 J = 1,100
10
A(I,J) = 0.
This is so because when FORTRAN is used, the elements of
A
appear in memory in the following order:
A(1,l), A(2,1), ..., A(100,1), A(1,2), A(2,2),
..
For most other programming languages, the order is the contrary one:
A(l,l), A(1,2), ..., A(1,100), A(2,1), A(2,2),
...
.
For this reason, a translation of our programs into ALGOL, PASCAL, or PL/l requires an interchange of all indices. There is no measure of computing time which is independent of the machine or the compiler.
If one measures the
time for very similar programs running on different installations, one finds quite substantial differences.
We have ob-
served differences with ratios as great as 1:3, without any plausible explanations to account for this.
It is often a
pure coincidence when the given compiler produces the optimal translation for the most important loops..
Therefore, we use
the number of equally weighted floating point operations as a measure of computing time in these appendices.
This count
is more or less on target only with the large installations. On the smaller machines, multiplication and division consume substantially more time than addition and subtraction.
Appendix 1:
Method of Massau
Appendix 1:
Method of Massau
447
This method is described in detail in Section 3.
It
deals with an initial value problem for a quasilinear hyperbolic system of two equations on a region
where
in
G
uy = A(x,y,u)ux + g(x,y)u),
(x,y)
e G
u(x,0) _ ip(x,0),
(x,0)
e G
A e CI(G x G, MAT(2,2,]R)), and
is an arbitrary subset of
at the equidistant points belong to
g e C1(G x G,IR2).
1R2.
(x) _ (i1(x),Iy2(x))
The initial values
1R2:
are given
(xi,0), insofar as these points
G:
+ 2h(i-nl),
xi = xn
i = nl(1)n2.
1
The lattice points in this interval which do not belong to G
must be marked by
B(I) = .FALSE..
Throughout the complete computation, U(*,I)
contains
the following four components: ul(xi,yi),
u2(xi,yi),
xi,
yi.
The corresponding characteristic coordinates (cf. Figure 3.10) are:
SIGMA = SIGMAO + 2hi,
TAU.
At the start, the COMMON block MASS has to be filled:
Ni = n1 N2 = n2
H2 = 2h SIGMAO = xn
-
1
TAU = 0.
2hn1
APPENDICES
448
For those
for which
i
B(I) =
belongs to
(xi,0)
G, we set
TRUE.
U(1,I) = 01(xi) U(2,I) = 02(xi)
U(3,I) = xi U(4,I) = 0,
and otherwise, simply set B(I) =
FALSE.
Each time MASSAU is called, a new level of the characteristic lattice (TAU = h, TAU = 2h, ...) is computed. also alters N1, N2, and SIGMAO.
The program
The number of lattice points
in each level reduces each time by at least one. N2 < Ni.
At the end,
The results for a level can be printed out between
two calls of MASSAU.
To describe
A
and
g, it is necessary in each con-
crete case to write two subroutines, MATRIX and QUELL. initial parameter is a vector u1(x,Y),
The
of length 4, with components
U
u2(x,Y),
x, Y
Program MATRIX sets one more logical variable, L, as follows: L =
tion
FALSE. if G x G
lies outside the common domain of defini-
U
of
A
and
g; L =
TRUE. otherwise.
To determine the eigenvalues and eigenvectors of MASSAU calls the subroutine EIGEN.
A,
Both programs contain a
number of checks: (1)
Are
(2)
Are the eigenvalues of
(3)
Is the orientation of the (x,y)-coordinate system and of the
A
and
g
defined? A
(o,2)-system the same?
real and distinct?
Appendix 1:
Method of Massau
449
The lattice point under consideration is deleted as necessary B(I) = .FALSE.).
(by setting
Consider the example
2u2
u1
A=
,
g = 0, * =
10exsin(2lrx) ,
ul
u2/2
G = (0,2) x]R.
1
Figures 1 and 2 show the characteristic net in the system and the (x,y)-system.
For
(o,T)-
2h = 1/32, we have the
sector
0.656250 < o < 1.406250 0.312500 < T < 0.703125
20th to 45th level.
In the (x,y)-coordinate system, different characteristics of the type
a + T = constant will intersect each other, so in
this region the computation cannot be carried out.
If one
drops the above checks, one obtains results for the complete In the region where the solution
"determinancy triangle".
exists, there cannot be any intersection of characteristics in the same family.
The method of Massau is particularly well suited to global extrapolation, as shown with the following example: 0
A =
0
1
2 2
1-u2 1
2u1u
,
g =
4ule
2x
G = (0,1) x IR. The exact solution is u(x,y) = 2ex cos y
The corresponding program and the subroutine MATRIX and QUELL are listed below.
0.35
0.40
0. 50
0 .60
0 .70
0.75
0.80
0.7
0.8
Figure 1:
0'.9
1 .0
1.1
Characteristic net in the (x,y)-coordinate system
1.2
1.3
452
APPENDICES
Table 3 contains the results for
(a,T) = (1/2,1/2).
The discrete solution has an asymptotic series of the form T0(x,Y) + T1(x,Y)h + T2(x,Y)h2 +
The results after the first and second extrapolation are found in Tables 4 and S.
The errors
Du1
and
Au2
in all
three tables were computed from the values in a column: Dul = 2exsin y - ul,
Au2 = 2excos y - u2.
H2
1/32
1/64
x
1.551750
1.582640
1.598355
1.606317
y
1.122827
1.138263
1.147540
1.152624
Ul
8.804323
9.004039
9.104452
9.154850
U2
3.361230
3.664041
3.838782
3.932681
AU1 -0.296282
-0.165040
-0.087380
-0.044999
0.727335
0.416843
0.223264
0.115574
AU2
Table 3:
1/128
1/256
Results for (o,T) = (1/2,1/2).
1.614279 1.157708
1.613530 1.153699
1.614070
y Ul
9.203755
9.204865
U2
3.966852
4.013523
9.205248 4.026580
tU1
-0.023579
-0.007085
-0.001948
AU2
0.100843
0.027710
0.007299
x
Table 4:
1.156817
Results after the first extrapolation
Appendix 1:
Method of Nassau
453
x
1,614250
1.614349
y
1.157856
1.158005
U1
9.205235
9.205376
U2
4.029080
4.030932
LU1
-0.001605
-0.000234
AU2
0.003320
0.000496
Table 5:
Results after the second extrapolation
SUBROUTINE MASSAU C
C C C C
C C C C C
VARIABLES OF THE COMMON BLOCK U(1,I) AND U(2,I) COMPONENTS OF U, U(3,I)=X, U(4,I)=Y, WHERE MI.LE.I.LE.N2. B(I)=.FALSE. MEANS THAT THE POINT DOES NOT BELONG TO THE GRID. THE CHARACTERISTIC COORDINATES OF THE POINT (U(3,I),U(4,I)) ARE (SIGMAO+I*H2,TAU). THE BLOCK MUST BE INITIALIZED BY THE MAIN PROGRAMME.
C
REAL U(4,500),SIGMAO,TAU,H2 INTEGER N1,N2 LOGICAL B(500) COMMON /MASS/ U,SIGMAO,TAU,H2,N1,N2,B C
REAL E1(2,2),E2(2,2),LAMBI(2),LAMB2(2),G1(2),G2(2) REAL CI,C2,D,XO,X1,X2,YO,Y1,Y2,RD REAL DXOI,DX21,DY01,DY02,DY21,UI1,U12 INTEGER N3,N4,I LOGICAL L,LL C
N3=N2-N1 10 IF(N3.LE.0) RETURN IF(B(N1)) GOTO 20 11 N1=N1+1 N3=N3-1 GOTO 10 20 LL=.FALSE. N4=N2-1 C C
BEGINNING OF THE MAIN LOOP
C
DO 100 I=N1,N4 IF(.NOT.B(I+1)) GOTO 90 IF(LL) GOTO 30 IF(.NOT.B(I)) GOTO 90 CALL EIGEN(U(1,I),E1,LAMBI,L) IF(.NOT.L) GOTO 90 CALL OUELL(U(1,I),G1)
APPENDICES
454
30 CALL EIGEN(U(1,I+1),E2,LAMB2,L) IF(.NOT.L) GOTO 90 CALL QUELL(U(1,I+1),G2) C C C
SOLUTION OF THE FOLLOWING EQUATIONS (XO-X1)+LAMBI(1)*(YO-Y1)=O (XO-X1)+LAMB2(2)*(YO-Y1)=(X2-X1)+LAMB2(2)*(Y2-Y1)
C
C
CI=LAMBI (I) C2=LAI4B2 (2) D=C2-C1 IF(D.LT.1.E-6*AMAX1(ABS(C1),ABS(C2))) GOTO 80 X1=U(3,I) X2=U(3,I+1)
Y1=U (4, I) Y2=U(4,I+1) DX21=X2-X1 DY21=Y2-Y1 RD=(DX21+C2*DY21)/D DXOI=-C1*RD DYOI=RD XO=XI+DXOI Y0=Y1+DYO1 DYOZ=YO-Y2 C
C C C
CHECK WHETHER THE TRANSFORMATION FROM (SIGMA,TAU) TO (X,Y) IS POSSIBLE IF((DX21*DYOI-DXOI*DYZI).LE.O.) GOTO 80 SOLUTION OF THE FOLLOWING EQUATIONS E1(1,1)*(U(1,I)-U11)+E1(1,2)*(U(2,I)-U12)= DYOI*(El(1,1)*G1(1)+E1(1,2)*G1(2)) E2(2,1)*(U(1,I)-U11)+E2(2,2)*(U(2,I)-U12)= E2(2,1)*(DY02*G2(1)+U21-U11)+E2(2,2)*(DYOZ*62(2)+U22-U12) U11=OLD U12=OLD U21=OLD U22=OLD
VALUE VALUE VALUE VALUE
OF OF OF OF
U(1,I) U(2,I) U(1,I+1) U(2,I+1)
D=E1(1,1)*E2(2,2)-E2(2,1)*E1(1,2) IF(ABS(D).LT.1.E-6) GOTO 80 U11=U(1,I) U12=U(2,I) C1=DYO1*(El(1,1)*G1(1)+El(1,2)*61(2)) C2=E2(2,1)*(DY02*62(1)+U(1,I+1)-UI1) + F E2(2,2)*(DY02*G2(2)+U(2,I+1)-U12) U(1,I)=U11+(C1*E2(2,2)-C2*E1(1,2))/D U(2,I)=U12+(E1(1,1)*C2-E2(2,1)*C1)/D C
C
U(3,I)=XO U(4,I)=YO
Appendix 1:
Method of Massau
455
70 LAMB1(1)=LAMBZ(1)
El (1,1)=E2(1,1) El (1,2)=E2(1,2) 61(1)=G2(l) G1 (2)=G2(2) LL=.TRUE.
GOTO 100 80 B(I)=.FALSE.
GOTO 70 90 B(I)=.FALSE. LL=.FALSE. 100 CONTINUE C C C
END OF THE MAIN LOOP B(N2)=.FALSE. 110 N2=N2-1 IF(.NOT.B(N2).AND.N2.GT.N1) GOTO 110 SIGMAO=SIGMAO+H2*0.5 TAU=TAU+H2*0.5 RETURN END
SUBROUTINE EIGEN(U,E,LAMBDA,L) C
REAL U(4),E(2,2),LAMBDA(2) LOGICAL L C C C C C C C C
INPUT PARAMETERS U CONTAINS U(1),U(2),X,Y OUTPUT PARAMETERS EIGENVALUES LAMBDA(1).LT.LAMBDA(2) MATRIX E (IN THE TEXT DENOTED BY E**-1) L=.FALSE. INDICATES THAT THE COMPUTATION IS NOT POSSIBLE
C
REAL A(2,2),C,D,C1,C2,C3,C4 LOGICAL SW L=.TRUE. CALL
IF(.NOT.L) RETURN C C
COMPUTATION OF THE EIGENVALUES OF A
APPENDICES
456
C
C=A(1,1)+A(2,2) D=A(1,1)-A(2,2) D=D*D+4.*A(1,2)*A(2,1) IF(D.LE.O) GO TO 101 D=SQRT(D) IF(D.LT.1.E-6*ABS(C)) GO TO 101 LAMBDA(1)=0.5*(C-D) LA14BDA(2)=0.5*(C+D) C C C C C C C
SOLUTION OF THE FOLLOWING HOMOGENEOUS EQUATIONS E(1,1)*(A(1,1)-LAMBDA(1))+E(1,2)*A(2,1)=O E(1,1)*A(1,2)+E(1,2)*(A(2,2)-LAMBDA(1))=0 E(2,1)*(A(1,1)-LAMBDA(2))+E(2,2)*A(2,1)=O E(2,1)*A(1,2)+E(2,2)*(A(2,2)-LAMBDA(2))=0 C=LAMBDA(1) SW=.FALSE. 10 C1=ABS(A(1,1)-C) C2=ABS(A(2,1)) C3=ABS(A(1,2)) C4=ABS(A(2,2)-C) IF(AMMAX1(C1,C2).LT.AMAXI(C3,C4)) GO TO 30 IF(C2.LT.C1) GO TO 20 C1=1. C2=(C-A(1,1))/A(2,1) GO TO 50 20 C2=1. C1=A(2,1)/(C-A(1,1)) GO TO 50 30 IF(C3.LT.C4) GO TO 40 C2=1.
C1= (C-A (2,2) )/A (1,2) 60 TO 50 40 C1=1. C2=A(1,2)/(C-A(2,2))
50 IF(Sl!) GO TO 60 E(1,1)=C1 E(1,2)=C2 C=LAMBDA(2) SU=.TRUE. GO TO 10 60 E(2,1)=C1 E(2,2)=C2 RETURN 101 L=.FALSE. RETURN END
Appendix 1:
Method of Massau
EXAMPLE (MENTIONED IN THE TEXT)
MAIN PROGRAMME: C C
DESCRIPTION OF THE COMMON BLOCK IN THE SUBROUTINE MASSAU REAL U(4,500),SIGMAO,TAU,H2 INTEGER N1,N2 LOGICAL B(500) COMMON /MASS/ U,SIGMAO,TAU,H2,NI,N2,B
C
REAL X,DUI,DU2,SIGMA INTEGER I,J C
C
INITIALIZATION OF THE COMMON BLOCK
C
TAU=O. NI=1 N2=65
/32.
P_
.--
_ .*ATAN(1.)
SIGMA.:?--=-H2
X=O. DO 10 I=1,N2
U(1,I)=0.1*SIN(2.*PI*X)*EXP(X) U(2,I)=1.
U(3,I)=X
U(4,I)=0. B(I)=.TRUE. 10 X=X+H2 C C C
LOOP FOR PRINTING AND EXECUTING THE SUBROUTINE DO 40 I=1,65 DO 39 J=N1,N2 IF(.NOT.B(J)) GOTO 39 SIGMA=SIGMA0+J*H2 WRITE(6,49) SIGMA,TAU,U(3,J),U(4,J),U(1,J),U(2,J) 39 CONTINUE WRITE(6,50)
C
IF(N2.LE.N1) STOP CALL. MASSAU
40 CONTINUE STOP C
49 FORMAT(IX,2F8.5,IX,6F13.9) 50 FORMAT(IHI) END
4S7
APPENDICES
458
SUBROUTINES:
SUBROUTINE QUELL(U,G) C C C C
INPUT PARAMETER U CONTAINS U(1),U(2),X,Y OUTPUT PARAMETERS ARE G(1),G(2)
REAL U(4),G(2) G(1)=0. G(2)=0. RE".` 'RN END
SUBROUTINE MATRIX(U,A,L) C C C C C
INPUT PARAMETER U CONTAINS U(1),U(2),X,Y OUTPUT PARAMETERS ARE THE MATRIX A AND L L=.TRUE. IF U BELONGS TO THE DOMAIN OF THE COEFFICIENT MATRIX A AND OF THE TERM G. OTHERWISE, L=.FALSE.
C
REAL U(4),A(2,2) LOGICAL L C
REAL U1,U2 C
U1=U (1) U2=U (2)
L=.TRUE. A(1,1)=-U1 A(1,2)=-2.*U2 A(2,1)=-O.S*U2 A(2,2)=-U1 RETURN END
Appendix 2:
Nonlinear implicit
Appendix 2:
Total implicit difference method for solving a
difference method
459
nonlinear parabolic differential equation.
The total implicit difference method has proven itself useful for strongly nonlinear parabolic equations.
With it
one avoids all the stability problems which so severely complicate the use of other methods.
In the case of one (space)
variable, the amount of effort required to solve the system of equations is often overestimated.
The following programs solve the problem ut = a(u)uxx - q(u),
x e (r,s), t > 0
u(x,0) = $(x),
x c [r,s]
u(r,t) _ ar(t), u(s,t) = 4s(t),
t > 0.
Associated with this is the difference method u(x,t+h)-u(x,t) = Xa(u(x,t+h))[u(x+&x,t+h)+u(x-Ax,t+h) -
where
2u(x,t+h)]
Ax > 0, h > 0, and
- hq(u(x,t+h))
A =
ox = (s-r)/(n+l),
x = r + jAx,
fixed
n e 1N
j
= l(l)n
this becomes a nonlinear system in solved with Newton's method.
When specialized to
h/(Ax)2.
n
unknowns. It is
For each iterative step, we
have to solve a linear system with a tridiagonal matrix. linear equations are
aljuj_1 + a2juj + a3juj+1 = a4j, where
j = l(1)n
The
APPENDICES
460
alj = -aa(uj)
aaI(uj)[uj+l+uj-l-2uj]+hgI(uj
a2j
= 1+2aa(uj
aij
= -Xa(uj)
a4j
= uj-[1+2Aa(uj)]uj+aa(uj)[uj+l+uj-1)-hq(uj)
uj = solution of the difference equation at the point (r+jAx,t+h).
uj = corresponding Newton approximation for u(r+jtx,t+h). When this sytem has been solved, the by
uj
+ uj; the
aij
are replaced
uj
are recomputed; etc., until there is
no noticeable improvement in the
uj.
Usually two to four
Newton steps suffice.
Since the subscript 0 is invalid in FORTRAN, the quantities
u(x+jAx,t)
are denoted in the programs by
For the same reason, the Newton approximation
uj
U(J+1).
is called
Ul(J+1).
The method consists of eight subroutines:
HEATTR, AIN, RIN, GAUBD3, ALPHA, DALPHA, QUELL, DQUELL. HEATTR is called once by the main program for each time increment.
Its name is an abbreviation for heat transfer.
other subroutines are used indirectly only.
The
The last four
subroutines must be rewritten for each concrete case.
They
are REAL FUNCTIONs with one scalar argument of REAL type, which describe the functions
a(u), a'(u), q(u), and
q'(u).
The other subroutines do not depend on the particulars of the problem.
AIN computes the coefficients
linear system of equations.
aij
of the
GAUBD3 solves the equations.
This program is described in detail along with the programs
Appendix 2:
Nonlinear implicit difference method
dealing with band matrices in Appendix S. alj, a2j, and
a3j
Newton's step.
461
The coefficients
are recomputed only at every third
In the intervening two steps, the old values
are reused, and the subroutine RIN is called instead of AIN. RIN only computes
a4j.
Afterwards, GAUBD3 runs in a simpli-
For this reason, the third variable is .TRUE..
fied form.
We call these iterative steps abbreviated Newton's steps. Before HEATTR can be called the first time, it is necessary to fill the COMMON block /HEAT/: N = n DX = Ax = (s-r)/(n+l) U(J+1)
4(r+jIx)
j
= 0(1)n+l
H = h. H
and
can be changed from one time step to another. u(s,t)
depend on
boundary values
If
u(r,t)
t, it is necessary to set the new
U(l) _ r(t+h)
and
U(N+2) = 0s(t+h)
be-
fore each call of HEATTR by the main program.
An abbreviated Newton's step uses approximately 60% of the floating point operations of a regular Newton's step: (1)
(2)
Regular Newton's step: n
calls of ALPHA, DALPHA, QUELL, DQUELL
21n+4
operations in AIN
8n-7
operations in GAUBD3
4n
operations in HEATTR.
Abbreviated Newton's step: n
calls of ALPHA, QUELL
10n+3
operations in RIN
5n-4
operations in GAUBD3
4n
operations in HEATTR.
APPENDICES
462
This sequence of different steps--a regular step followed by two abbreviated steps--naturally is not optimal in every Our
single case.
error test for a relative accuracy of If so desired, it suffices to
10-5 is also arbitrary.
make the necessary changes in HEATTR, namely at IF(AMAX.LE.0.00001*UMAX) GO TO 70 and
IF(ITERI.LT.3) GO TO 21.
As previously noted, two to four Newton's iterations usually suffice.
This corresponds to four to eight times
this effort with a naive explicit method.
If
u
and
a(u)
change substantially, the explicit method allows only extremely small incrementat.ions
h.
This can reach such extremes
that the method is useless from a practical standpoint. ever, if
q1(u)
one should have
How-
< 0, then even for the total implicit method hq'(u) > -1, i.e. h < 1/lq'(u)l.
For very large
n, to reduce the rounding error in
AIN and RIN we recommend the use of double precision when executing the instruction
A(4,J)=U(J+1)-(1.+LAMBD2*AJ)*UJ+LAMBDA*AJ* *
(U1(J+2)+U1(J))-H*QJ.
This is done by declaring DOUBLE PRECISION LAMBDA, LAMBD2, AJ, UJ, U12, U10 and replacing the instructions above by the following three instructions:
Appendix 2:
Nonlinear implicit difference method
463
U12 = Ul(J+2) U10 = U1(J)
A(4,J) = U(J+1)-(1.+LAMBD2*AJ)*UJ +
+LAMBDA*AJ*(U12+UlO).
All remaining floating point variables remain REAL. other than AIN and RIN do not have to be changed.
Programs
APPENDICES
464
SUBROUTINE HEATTR(ITER) C C C C
U(I) VALUES OF U AT X=XO+(I-1)*DX, I=1(1)N+2 U(1), U(N+2) BOUNDARY VALUES H STEP SIZE WITH RESPECT THE TIME COORDINATE
C
REAL U(513),H,DX INTEGER N COMMON/HEAT/U, H,DX,N C
REAL U1(513),AJ,UJ,AMAX,UMAX,A(4,511) INTEGER ITER,I,ITERI,N1,N2,J N1=N+1 N2=N+2 C
C
FIRST STEP OF THE NEWTON ITERATION
C
CALL AIN(A,U) CALL GAUBD3(A,N,.FALSE.) DO 20 J=2,N1 20 U1(J)=U(J)+A(4,J-1)
UI (1)=U(1)
U1(N2)=U(N2) ITER=1 ITER1=1 C C C
STEP OF THE MODIFIED NEWTON ITERATION 21 CALL RIN(A,U1) CALL GAUBD3(A,N,.TRUE.) GO TO 30
C C C
STEP OF THE USUAL NEWTON ITERATION 25 CALL AIN(A,U1) CALL GAUBD3(A,N,.FALSE.) ITER1=0.
C C C
RESTORING AND CHECKING 30 AHAX=O. UHAX=O. DO 40 J=2,N1 AJ=A(4,J-1) UJ=U1(J)+AJ AJ=ABS(AJ) IF(AJ.GT.AMAX) AMAX=AJ U1(J)=UJ UJ=ABS(UJ) IF(UJ.GT.UMAX) UMAX=UJ 40 CONTINUE
C
Appendix 2:
Nonlinear implicit difference method
465
C
ITER=ITER+1
ITERI=ITER1+1 IF(AMAX.LE.0.00001*UMAX) GO TO 70 IF(ITER.GT.20) GO TO 110 IF(ITERI.LT.3) GO TO 21 GO TO 25 C C
U=U1
C
70 DO 80 J=2,N1
80 U(J)=U1(J) RETURN C C
110 WRITE(6,111) 111 FO.RMAT(15H NO CONVERGENCE) STOP END
SUBROUTINE AIN(A,U1) C C C C
EVALUATION OF THE COEFFICIENTS AND OF THE RIGHT-HAND SIDE OF THE SYSTEM OF LINEAR EQUATIONS REAL A(4,511),U1(513)
C
C C
COMMON BLOCK COMPARE HEATTR REAL U(513),H)DX INTEGER N
C
REAL LAMBDA,LAMBD2,LAMBDM,UJ,AJ,DAJ,QJ,DQJ INTEGER J REAL Z LAMBDA=H/(DX*DX) LAHBD2=2.*LAMBDA
DO 10 J=1,N UJ=U1(J+1) AJ=ALPHA(UJ) DAJ=DALPHA(UJ) QJ=QUELL(UJ) DQJ=DQUELL(UJ) 2=LAi1BDH*AJ
466
APPENDICES
A(1,J)=Z *
A(2,J)=1.+LAMBD2*(AJ+DAJ*UJ)-LAMBDA*DAJ* (U1(J+2)+U1(J))+H*DQJ
A (3,J)=Z
A(4,J)=U(J+1)-(1.+LAMBD2*AJ)*UJ+LAMBDA*AJ* (U1(J+2)+U1(J))-H*QJ 10 CONTINUE RETURN *
END
SUBROUTINE RIN(A,U1) C
C C C
EVALUATION OF THE RIGHT-HAND SIDE OF THE LINEAR EQUATIONS REAL A(4,511),U1(513)
C C
COMMON BLOCK COMPARE HEATTR
C
REAL U(513),H,DX INTEGER N COMMON/HEAT/U,H,DX,N C
REAL LAMBDA LAMBDZ,UJ,AJ,QJ INTEGER J LAMBDA=H/(DX*DX) LAMB02=2.*LAMBDA DO 10 J=1,N UJ=U1(J+1) AJ=ALPHA(UJ) QJ=QUELL(UJ) A(4,J)=U(J+1)-(1.+LAMBD2*AJ)*UJ+LAMBDA*AJ* * (U1(J+2)+U1(J))-H*QJ 10 CONTINUE RETURN END
Appendix 2:
Nonlinear implicit difference method
EXAMPLE
MAIN PROGRAMME:
REAL U(513),H,DX INTEGER N COMMON/HEAT/U,H,DX,N REAL PI,T INTEGER I,J,ITER PI=4.*ATAN(1.) C
N=7 DX=1 ./8. H=1 ./64.
DO 10 1=1,9 10 U(I)=SIN(PI*DX*FLOAT(I-1)) C
T=O.
DO 20 I=1,10 CALL HEATTR(ITER) WRITE(6,22) ITER T=T+H WRITE(6,21) T WRITE(6,21)(U(J),J=1,9) 20 CONTINUE STOP C
21 FORMAT(1X,9F12.9/1X,9F12.9) 22 FORMAT(6H ITER=,I2) END
467
468
SUBROUTINES:
REAL FUNCTION ALPHA(U) REAL U ALPHA=(1.-0.5*U)/(4.*ATAN(1.))**2 RETURN END
REAL FUNCTION DALPHA(U) REAL U DALPHA=-.5/(4.*ATAN(1.))**2 RETURN END
REAL FUNCTION QUELL(U) REAL U QUELL=U*U*0.5 RETURN END
REAL FUNCTION DQUELL(U) REAL U DQUELL= U RETURN END
APPENDICES
Appendix 3:
Lax-Wendroff-Richtmyer method
Appendix 3:
Lax-Wendroff-Richtmyer method for the case of
469
two space variables.
The subroutines presented here deal with the initial value problem ut = A 1 u
x
+ A2uy + Du + q,
u(x,y,0) = (X,y), Here
x,y e1R, t > 0
x,y e 1R.
A1,A2,D c C1(IR2,MAT(n,n,IR)), D(x,y) = diag(dii (x,y)),
q e C1(ii2 x [0,-),]R n).
properly posed in
We require that the problem be
L2(IR2,cn).
are always symmetric, and
that (1) A1(x,y), A2(x,y) (2) Al, A2, D, q
For this it suffices, e.g.,
have compact support.
Because of the terms
Du + q, the differential equation
in the problem considered here is a small generalization of Example 10.9.
There is one small difficulty in extending the
difference scheme to this slightly generalized differential equation.
One can take care of the terms
Du + q
in the
differential equation with the additional summand h[D(x,y)u(x,y,t) + q(x,y,t)]
or better yet, with h{ZD(x,y)[u(x,y,t) + u(x,y,t+h)] + q(x,y,t+ 2h)} This creates no new stability problems (cf. Theorem 5.13). Consistency is also preserved.
However, the order of consis-
tency is reduced from 2
The original consistency proof
to 1.
considered only solutions of the differential equation ("consistency in the class of solutions") and not arbitrary sufficiently smooth functions; here we have a different differential equation.
470
APPENDICES
We use the difference method u(x,y,t+h) = [I- ZD(x,Y)] 1[I+S(h)oK(h)+ + h(I- ZD(x,Y)] lq(x,Y,t+
D(x,Y)]u(x,Y,t)
-T)+2(I-ZD(x,Y)) 1S(h)q(x,Y,t)
where
K(h) = ZS(h) + (4I+8D(x,Y)) (TA,1+TA,2+TA1+T012)
S(h) = ZA[A1(x,Y)(TA'1-Tpl1) + A2(x,Y)(TA 2-TA12)1. For
D(x,y) = 0
and
method (r = 1). every case.
q(x,y,t) = 0, this is the original
But then the order of convergence is 2 in
Naturally, this presupposes that the coeffici-
ents, inhomogeneities, and solution are all sufficiently often differentiable.
The computation procedes in two steps. 1st Step
(SUBROUTINE STEP1):
v(x,y) = K(h)u(x,y,t) + 2nd Step
hq(x,y,t).
(SUBROUTINE STEP2):
u(x,y,t+h) = {[I+
[I
-
ZD(x,Y)]-1o
ZD(x,Y)lu(x,Y,t)+S(h)v(x,Y)+hq(x,Y,t+ 11)1.
The last instruction is somewhat less susceptible to rounding error in the following formulation:
u(x,y,t+h) = u(x,y,t) +
[I-
ZD(x,Y)]
to
{S(h)v(x,y)+h[D(x,y)u(x,y,t)+q(x,y,t+ 2)]}. If
u(x,y,t)
is given at the lattice points
(x,Y) = (uo,vt)
(p,v e2z, p+v
even)
Appendix 3:
then
Lax-Wendroff-Richtmeyer method
v(x,y)
can be computed at the following points: p,v e22, p+v
(x,Y) = (PA,vt),
From these values and the old values u(x,y,t+h)
471
odd.
u(x,y),
one obtains
at the points p,v c
(x,y) = (pt,vt),
p+v
,
even.
This completes the computation for one time increment. If steps 1 and 2 follow each other directly, then and
Therefore, we divide each time
have to be stored.
v
u
step into substeps, in which the u-values are computed only for the lattice points on a line x + y = 2aA = constant.
For this one only needs the v-values for x + y = (2a-1)A (as shown in Figure 1).
and
x + y = (2a+1)a
At first, only these v-values are
stored, and in the next substep, half of these are overThus we alternately compute
written. on a line. a line.
v
on a line and
SUBROUTINE STEP1 computes only the
STEP2 does compute all of
v
u
values on
u, but in passing from
one line to the next, STEP2 calls STEP1 to compute
v.
The program starts with the lattice points in the square t(x,y)
-1 < x+y <
1
and
-1 < x-y < U.
Because of the difference star of the Lax-Wendroff method, we lose fewer lattice points at the boundary per time step than with an equally large square with sides parallel to
APPENDICES
472
+: v-lattice :
1\ 1\ "\ .
+
N+N N
+
+
\
+
+
x+y = (2a+l)A x+y = 2au x+y = (2a-l)A
+
Figure 1:
Lattice points for
We set
the axes.
u-lattice
m m = 2 0
and
m0 = 3, m = 8, = 1/8.
A = 1/m.
Altogether, the solution of the initial value problem requires six subroutines: INITIO, STEP1, STEP2, PRINT, COEFF, FUNC.
The last two subroutines have to be rewritten for each application.
COEFF computes the matrices
inhomogeneity
q.
Al, A2, and
D
and the
FUNC provides the initial values.
The first program called by the user is INITIO. defines the lattice, the number of components of
It
u, computes
the initial values with the help of FUNC, and enters all this in COMMON.
Additionally, INITIO computes DMAX = max(O,dii (x,y)).
When calling STEP2, one should choose
h
no larger than
Appendix 3:
Lax-Wendroff-Richtmeyer method
473
In each time step, STEP2 is called first to carry
1.0/DMAX.
out the computation, and then PRINT is called to print out the results.
After remain.
s2
time steps, only
Thus, at most
m/2
(m+l-2s2)2
lattice points
steps can be carried out.
an incorrect call of INITIO or of STEP2, IERR >
0
After
is set
as follows: in INITIO: IERR = 1:
m
outside the boundaries
IERR = 2:
n0
outside the boundaries.
s0
too large.
0
.
in STEP2: IERR = 1:
STEP1 and STEP2 each contain only one computation intensive loop:
STEP1
STEP2
DO 100 K=1,MS
DO 100 J=J1,J2
100 Y=Y-DELTA
100 Y1=YI+DELTA
In the following accounting of the number of floating point calls we ignore all operations outside these loops, with the exception of calls of STEP1 in STEP2. STEP1: (m-2s2)
calls of COEFF
(m-2s2)(4n2+12n+2)
operations.
STEP2: (m-2s2)
calls of STEP1
(m-2s2-1)2
calls of COEFF
(m-2s2-1)2(4n2+lln+2)
operations
474
APPENDICES
Each time step therefore consumes approximately 2(m-2s2)2
calls of COEFF
2(m-2s 2)2(4n2+lln)
operations.
The total effort required for all
m/2
time steps thus is
a
calls of COEFF
a(4n2+lln)
operations
where m/2 a = 8
u2
= 3 (m+l) (m+2)
u=i
If the matrices
A
and
1
contain many zeros (as in the
A2
wave equation for example) then the term substantially.
can be reduced
4n2
To accomplish this it is enough, in STEP1 and
STEP2, to reprogram only the loops beginning with DO 20 LL = 1,N.
If enough memory is available, Al, A2, and
can be com-
D
puted in advance, and CALL COEFF can be replaced by the appropriate reference.
If
is t-dependent, however, it will
q
have to be computed for each time step.
In this way, the
computing time can be reduced to a tolerable level in many concrete cases.
For the case of the wave equation
A1(x,y) =
0
1
0
1
0
0
0
0 0
1 ,
A2(x,y) =
( 0
0
1
0
0 0
0
I
11
1
0
q(x,y,t) = 0
D(x,y) = 0,
we have tried to verify experimentally the theoretical stability bound
A <
vr2-.
The initial values
Appendix 3:
Lax-Wendroff-Richtmeyer method
475
0
cos x cos y
O(x,y)
have the exact solution -sin t(sin x + sin y) cos t cos x cos t cos y
u(x,y,t) =
We chose
m0 = 7, m = 128, A = 1/128, h = A
A = 1.3(0.1)1.7.
where
After 63 steps, we compared the numerical
results with the exact solutions at the remaining 9 lattice points.
The absolute error for the first component of
u
is generally smaller than the absolute error for the other components (cf. Table 2). ceable until
The instability is not really noti-
A = 1.7, where it is most likely due to the
still small number of steps.
max. absolute error 2nd & 3rd comp. 1st comp. < 1.5
1.0
10-7
4.0
10-6
1.6
4.4
10-5
5.3
10-5
1.7
2.0
100
1.2
100
Table 2
Nevertheless, the computations already become problematical with
A > 1.5.
A multiplication of
creates a perturbation, for
A = 1.3
and
h
by 1 + 10-12
A = 1.4, of the
same order of magnitude as the perturbation of
h.
For
A = 1.5, however, the relative changes in the results are greater up to a factor of 1000, and for
A = 1.6, this
amplification can reach 109 for some points.
We have tested consistency among other places in an
476
APPENDICES
example wherein Al
and
A
and
1
A2, as well as the diagonal elements of In this case, q
sentially space-dependent. y, and
are full, and all elements of
A2
t.
D, are es-
depends on
x,
The corresponding subroutines COEFF and FUNC are
listed below.
The initial value problem has the exact solu-
tion
cosx+cosy u(x,y,t) = e- t
cos x + sin y
The computation was carried out for the four examples: (1)
A = 1/8,
h = 1/32,
A = 1/4,
s2 = 1
(2)
A = 1/16,
h = 1/64,
A = 1/4,
s2 =
(3)
A = 1/32,
h = 1/128,
A = 1/4,
s2 = 4
(4)
0 = 1/64,
h = 1/256,
A = 1/4,
s2 = 8.
2
The end results thus all belong to the same time
T = s2h =
Therefore, at lattice points with the same space co-
1/32.
ordinates, better approximations can be computed with the aid of a global extrapolation.
Our approach assumes an asymptotic
expansion of the type
T0(x,y) + h2T2(x,y) + h3T3(x,y) + h4T4(x,y) +
.
The first and third extrapolation do in fact improve the The summand
results substantially.
h3T3(x,y)
very small in our example relative to the terms and
4
h T4(x,y).
should be h2T2(x,y)
The absolute error of the unextrapolated 10-3
values decreases with
h
from about
to 10 5.
After the
third extrapolation, the errors at all 49 points (and for both components of
u) are less than 10-9.
Appendix 3:
Lax-Wendroff-Richtmeyer method
477
We do not intend to recommend the Lax-WendroffRichtmyer method as a basis for an extrapolation method as a result of these numerical results.
For that it is too com-
plicated and too computation intensive.
However, global
extrapolation is a far-reaching method for testing a program for hidden programming errors and for susceptibility to rounding error.
APPENDICES
478
SUBROUTINE INITIO (COEFF,FUNC,TO,MO,NO,DMAX,IERR) C C C C
C C C
FOR THE DESCRIPTION OF COEFF COMPARE STEP2. THE SUBROUTINE FUNC YIELDS THE INITIAL VALUES F(N) AT THE POINTS X,Y. THE USER HAS TO DECLARE THIS SUBROUTINE AS EXTERNAL. T=TO, N=NO, M=2**MD, FOR DMAX COMPARE TEXT.
INTEGER I,IERR,I1,I2,J,MMAXI,MO,NN,NO REAL DMAX,MINUS,TO,XO,XI,YO,Y1 REAL Al(4,4),A2(4,4),D(4),Q(4),F(4) C C C C C C C C C C
C C
C C C C C C
C C
C C C C C C C
C C
MEANING OF THE VARIABLES OF THE COMMON BLOCK M
NUMBER OF THE PARTS OF THE INTERVAL (0,1),
DELTA MMAX
=1./M, UPPER BOUND FOR M,
NUMBER OF THE COMPONENTS OF THE SOLUTION (1.LE.N.LE.4), S2 NUMBER OF CALLS OF STEP2 DURING THE EXECUTION OF INITIO, T TIME AFTER S2 STEPS, H STEP SIZE WITH RESPECT TO THE TIME, LAMBDA =H/DELTA (LAMBDA.GT.O), U SOLUTION. U(*,I,J) BELONGS TO THE POINT X=DELTA*(J+I-MMAX-2) Y=DELTA*(J-I), V INTERMEDIATE VALUES (COMPARE TEXT) V(*,2,I) BELONGS TO THE POINT X=DELTA*(J+I-MMAX-1) Y=DELTA*(J-I) J IS THE RESPECTIVE PARAMETER OF STEP1 V(*,1,I) BELONGS TO THE POINT X1=X-DELTA Y1=Y-DELTA MMAX AND THE BOUNDS OF THE ARRAYS U(4,DIM2,DIM2) AND V(4,2,DIM1) ARE RELATED AS FOLLOWS MMAX DIM2 DIM N
32
32
33
64 128
64 128
129
...
...
...
65
C
INTEGER MMAX,M,N,S2 REAL U(4,65,65),V(4,2,64),H,DELTA,LAMBDA,T COMMON U,V,H,DELTA,LAMBDA,T,MMAX,M,N,SZ DATA MINUS /-1.E5O/ MMAX=64 C
MMAXI=MMAX+1 M=2**M0 IF( MO.LT.1 OR. M GT.MMAX ) GOTO 998 IF( NO.LT.1 OR. NO.GT.4 ) GOTO 997
Appendix 3:
C C C C
Lax-Wendroff-Richtmeyer method
479
SET V(*,2,*)=O AND ASSIGN MINUS INFINITY (HERE -1E50) TO U(*,*,*). DO 10 J=1,MMAX DO 10 NN=1,N 10 V(NN,2,J)=O. 00 20 I=1,MMAXI 00 20 J=1,MMAXI 20 U(1,I,J)=MINUS
C
30
40
997
998
T=TO N=NO S2=0 DMAX=O. IERR=O DELTA=1./FLOAT(M) I1=(MMAX-M)/2+1 I2=I1+M X0=-1. YO=0. DO 40 J=I1,I2 X1=X0 Y1=Y0 DO 30 I=I1,I2 CALL FUNC (X1,Y1,F) CALL COEFF (X1,Y1,TO,A1,A2,D,Q) X1=XI+DELTA Y1=Y1-DELTA DO 30 NN=1,N U(NN,1,J)=F(NN) IF( D(NN).GT.DMAX ) DMAX=D(NN) CONTINUE XO=XO+DELTA YO=YO+DELTA RETURN IERR=1 RETURN IERR=2 RETURN END
APPENDICES
480
SUBROUTINE STEPI (COEFF,J) INTEGER I1,I2,J,J1,J2,K,L,LL,MS REAL H2,H8,LAM4,SUM,X,Y REAL A1(4,4),A2(4,4),D(4),Q(4),UX(4),UY(4) C C C
FOR VARIABLES IN COMMON COMPARE INITIO INTEGER MMAX,M,N,S2 REAL U(4,65,65), V(4,2,64),H,DELTA,LAMBDA,T COMMON U,V,H,DELTA,LAMBDA,T,MMAX,M,N,S2
C
H2=H*.5 HS=H*.125 LAM4=LAMBDA*.25 MS=M-2*S2 I1=(MMAX-MS)/2+1 J1=J I2=I1+1
J2=J1+1 DO 10 K=1,MS DO 10 L=1,N 10 V(L,1,K)=V(L,2,K) X=DELTA*FLOAT(JI+II-MMAX-1) Y=DELTA*FLOAT(JI-I1) DO 100 K=1,MS DO 15 LL=1,N UX(LL)=U(LL,I2,J2)-U(LL,I1,J1) 15 UY(LL)=U(LL,I1,J2)-U(LL,I2,J1) CALL COEFF (X,Y,T,A1,A2,D,Q) DO 30 L=1,N SUM=0. 20 + +
30
DO 20 LL=1,N SUM=SUM+A1(L,LL)*UX(LL)+A2(L,LL)*UY(LL) V(L,2,K)=LAM4*SUM+H2*Q(L)+ (0.25+H8*D(L))*(U(L,I2,J2)+U(L,I1,J1)+ U(L,I1,J2)+U(L,I2,J1)) CONTINUE I1=I1+1 I2=I2+1
100
X=X+DELTA Y=Y-DELTA RETURN END
Appendix 3:
Lax-Wendroff-Richtmeyer method
SUBROUTINE
481
STEP2 (COEFF,HO,IERR)
THE SUBROUTINE COEFF EVALUATES THE COEFFICIENTS Al, A2, 0 AND THE SOURCE TERM Q OF THE DIFFERENTIAL EQUATIONS. COEFF IS TO BE DECLARED AS EXTERNAL. Al(N,N), A2(N,N), D(N) MAY DEPEND ON X AND Y, Q(N) MAY DEPEND ON X,Y, AND T. HO IS THE SIZE WITH RESPECT TO TIME. HO MAY CHANGE FROM ONE CALL STEP2 TO THE NEXT CALL ACCORDING TO THE STABILITY CONDITION. EXTERNAL COEFF INTEGER I,IERR,II,I2,J,J1,J2,K,KK,L,LL,MS REAL HO,H2,LAMM2,MINUS,SUM,T2,X,X1,Y,Y1 REAL A1(4,4),A2(4,4),D(4),Q(4),VX(4),VY(4) C C C
FOR VARIABLES IN COMMON COMPARE INITIO INTEGER MMAX,M,N,S2 I*gAL U(4,65,65),V(4,2,64),H,DELTA,LAMBDA,T COMMON U,V,H,DELTA,LAMBDA,T,MMAX,M,N,S2 DATA MINUS /-1.E50/
C
MS=M-2*S2 IF( MS.LT.1 ) GOTO 99 IERR=O H=HO LAI.IBDA=H/DELTA
LAM2=LAMBDA*.5 H2=H*.5 T2=T.+H2
I1=(MMAX-MS)/2+1 I2=I1+MS J1=I1+1 J2=I2-1 CALL STEP1 (COEFF,I1) X1=DELTA*FLOAT(I1+I1-MMAX) Y1=0. C
DO 100 J=JI,J2 X=Xl Y=Y1 K=1
15
KK=2 CALL $TEPI (COEFF,J) DO 50 I=Jl,J2 DO 15'LL=1,N VX(LL)=V(LL,2,KK)-V(LL,1,K ) VY(LL)=V(LL,2,K )-V(LL,1,KK) CALL COEFF (X,Y,T2,A1,A2,D,Q) DO 30 L=1,N
SUMO.
482
APPENDICES
20
/ 30
50
DO 20 LL=1,N SUM=SUM+A1(L,LL)*VX(LL)+A2(L,LL)*VY(LL) U(L,I,J)=U(L,I,J)+(LAM2*SUM+H*(D(L)*U(L,I,J)+0(L)))/ (l.-H2*0(L)) CONTINUE X=X+DELTA Y=Y-DELTA K=K+1 KK=KK+1
X1=XI+DELTA Yl=Yl+DELTA DO 110 J=I1,I2 U(1,I1,J)=MINUS U(1,I2,J)=MINUS U(1,J,II)=MINUS 110 U(1,J,I2)=MINUS T=T+H 100
S2-S2+1 RETURN
99 IERR=1 RETURN END
SUBROUTINE PRINT INTEGER I,J,L,MMAXI REAL MINUS,X,Y C C C
FOR VARIABLES IN COMMON COMPARE INITIO INTEGER MMAX,M,N,S2 REAL U(4,65,65),V(4,2,64),H,DELTA,LAMBDA,T COMMON U,V,H,OELTA,LAMBDA,T,MMAX,M,N,S2 DATA MINUS /-1.ES0/
C
MMAXI=MMAX+1 DO 30 J=1,MMAXI DO 20 I=1,MMAXI IF( U(1,I,J).LE.MINUS ) GOTO 20 X=DELTA*FLOAT(J+I-MMAX-2) Y=DELTA*FLOAT(J-I) DO 10 L=1,N 10 WRITE(6,800) L,I,J,U(L,I,J),X,Y CONTINUE 20 30 CONTINUE RETURN 800 FORMAT(IH ,IOX,2HU(,I2,IH,,I2,IH,,I2,IH),5X,E20.14, F 5X,2HX=,F10.6,2X,2HY=,F1O.6) END
Appendix 3:
Lax-Wendroff-Richtmeyer method
483
EXAMPLE (MENTIONED IN THE TEXT)
SUBROUTINE COEFF (X,Y,T,A1,A2,D,Q) REAL Al(4,4),A2(4,4),D(4),Q(4) SINX=SIN(X) COSY=COS(Y) CXI=COS(X)+1. CX2=CXI+1. SYI=SIN(Y)-1. SSI=SYI*SYl-1. C C C C C
I SIN(X) AI=I
, COS(X)+1 I I
I COS(X)+I, COS(X)+2 I
I COS(Y) A2=I
I SIN(Y)-l, SIN(Y)*(SIN(Y)-2)
Al(1,1)=SINX
Al (1,2)=CX1 Al(2,1)=CX1 Al(2,2)=CX2 A2(1,1)=COSY A2(1,2)=SYI A2(2,1)=SYI A2(2,2)=SSI
D(1)=0.
D(2)=SY1-CX1 Q(1)=0. 0(2)=-EXP(-T)*(COSY*SS1-SINX*CX2) RETURN END
SUBROUTINE FUNC (X,Y,F) REAL F(4) F(1)=SIN(X)+COS(Y) F(2)=COS(X)+SIN(Y) RETURN END
, SIN(Y)-l
I I I
484
APPENDICES
Appendix 4:
Difference methods with SOR for solving the Poisson equation on nonrectangular regions. G c]R2
Let
be a bounded region and let
Au(x,Y) = q(x,y) u(x,Y) = lp(x,Y)
(x,y) c G (x,y) e 3G.
Furthermore, let one of the following four conditions be satisfied: (1)
G c (-1,1) x (-1,+1) = Q1
(2)
G c (-1,3) x (-1,+1) = Q2 G,q,P
(3)
x = 1.
G c (-1,+1) x (-1,3) = Q3 G,q,P
(4)
are symmetric with respect to the line
(2)
are symmetric with respect to the line
y = 1.
G c (-1,3) x (-1,3) = Q4 G,q,*
are symmetric with respect to the lines
x=1
and
y = 1.
The symmetry conditions imply that the normal derivative of u
vanishes on the lines of symmetry.
This additional bound-
ary condition results in a modified boundary value problem for
u
on the region
(-1,1) x (-1,1)
fl
G.
The program uses the five point difference formula of Section 13.
The linear system of equations is solved by SOR
(cf. Section 18).
Because of the symmetries, computation is
restricted to the lattice points in the square
[-1,1] x [-1,11.
This leads to a substantial reduction in computing time for each iteration.
The optimal overrelaxation parameter w b
the number of required iterations
m, however, remain as
large as with a computation over the entire region.
Altogether, nine subroutines are used:
and
Appendix 4:
Poisson eq::ation on nonrectangular regions
48S
POIS, SOR, SAVE, QNCRM, NEIGHB, CHARDL, CHAR, QUELL, BAND.
The last three named programs depend on the concrete problem and describe
G, q, and
Formally, we have REAL FUNCTIONs
gyp.
of two arguments of type REAL. tion of the region
CHAR is a characteristic func-
G:
if
(X,Y)
e G
= 0
if
(X,Y)
e 8G
< 0
otherwise.
>
CHAR(X,Y)
0
This function should be continuous, but need not be differentiable.
If
ABS(CHAR(X,Y))
LT. 1.E-4
it is assumed that the distance from the point to the boundary
G
is at most 10-3.
Each region is truncated by the
program so as to lie in the appropriate rectangle i e {1,2,3,4}.
CHAR(X,Y) = 1
For
Qi,
G = (-1,1) x (-1,1), therefore, If a region
suffices.
(union) of two regions
GI
and
G2, then the minimum (maxi-
mum) of the characteristic functions of characteristic function for
is the intersection
G
G1
and
G2
is a
G.
POIS is called by the main program.
The first two
parameters are the names of the function programs RAND and QUELL.
The name of CHAR is fixed.
The remaining parameters
of POIS are BR, BO
M, EPSP, OMEGAP BR = TRUE. x
BO =
=1 TRUE. -- y = 1
is a line of symmetry is a line of symmetry.
486
APPENDICES
The mesh of the lattice is
H = l./2**M.
EPSP is the absolute
error up to which the iteration is continued. the computation defaults to 10-3. tion parameter.
OMEGAP is the overrelaxa-
When OMEGAP = 0., wb
cally by the program. Note that does not depend on
or
q
4).
When EPSP = 0.,
wb
is determined numeri-
does depend on
G, but
The numerical approximation
It improves as EPSP gets smaller.
also depends on EPSP.
In
case OMEGAP = 0, POIS should be called sequentially with M
2,
3, 4,
...
.
The program then uses as its given initial
value for determining
the approximate value from the
wb
preceding coarser lattice.
OMEGAP remains zero.
In each iteration, SOR uses
7e+llf
floating point
operations, where e
:
f
:
number of boundary distant lattice points number of boundary close lattice points.
In the composition of the system of equations, the following main terms should be distinguished: calls of QUELL
:
proportional to
1/H**2
calls of RAND
:
proportional to
1/H
calls of CHAR
:
proportional to
Ilog EPSP /H.
The program is actually designed for regions which are not rectangular.
For rectangles, the Buneman algorithm (cf.
Appendix 6) is substantially faster.
It is nevertheless en-
ticing to compare the theoretical results for the model problem (Example 18.15) with the numerical results. Lu(x,y) = -20 u(x,y) = 0
Let
in
G = (0,1) x (0,1)
on
3G.
Appendix 4:
Poisson equation on nonrectangular regions
Since the iteration begins with initial error
u(x,y) E 0, the norm of the
is coarsely approximated by
IIe(0)112
The
1.
following results were obtained first of all with EPSP 1./1000.
487
=
The error reduction of the iterative method is thus
about 1/1000.
Table 1 contains the theoretical values of the numerical approximations
wb.
and
wb
The approximations, as
suspected, are always too large.
wb
h
Table 1.
mb
1/8
1.447
1.527
1/16
1.674
1.721
1/32
1.822
1.847
1/64
1.907
1.920
1/128
1.952
1.959
and its numerical approximations
wb
Table 2 contains the number of iterations and the computing times w = wb, and
w = wb.
t1, t2, and
ml, m2, and
m3,
t3, for OMEGAP = w = 0,
Column 2 contains the theoretically
required number of steps
mw
from Example 18.15.
ml
and
b
ti cab.
describe the computational effort involved in determining The times were measured on a CDC CYBER 76 system (in
units of 1 second). h
mwb
ml
m2
m3
tl
t2
t3
8
22
14
14
0.021
0.016
0.016
1/16
17
36
28
32
0.098
0.077
0.085
1/32
35
52
56
64
0.487
0.506
0.751
1/64
70
132
112
128
4.484
3.785
4.298
1/8
Table 2.
Iterative steps and computing time for EPSP = 10-3
488
APPENDICES
Surprisingly, the number of iterations is smaller for than for
w = wb.
This does not contradict the theory, since
the spectral radius gence behavior. EPSP = 10-9.
w = wb
p(ew)
only describes asymptotic conver-
In fact, the relationship reverses for
Table 3 contains the number, Amw b, Amt, Am3, of
additional iterations required to achieve this degree of accuracy.
Amw
:
theoretical number, as in Example 18.5
b
Amt
computation with
w = wb
Am3
computation with
w = wb.
In both cases, i.e. for EPSP = 10-3 and EPSP = 10-9, it is our experience that w = wb + (2-wb)/50 is better than
wb.
h
Table 3.
Amw
Am2
LN m3
1/8
17
20
20
1/16
35
44
36
1/32
70
88
72
1/64
141
176
144
Additional iterations for EPSP = 10-9
Appendix 4:
Poisson equation on nonrectangular regions
489
SUBROUTINE POIS(RAND,QUELL,BR,BO,M,EPSP,OMEGAP) C C
PARAMETERS OF THE SUBROUTINE
C
C C
REAL EPSP,OMEGAP INTEGER M LOGICAL BR,BO RAND AND QUELL ARE FUNCTIONS FOR THE BOUNDARY VALUES AND THE SOURCE TERM, RESPECTIVELY.
C C C
VARIABLES OF THE COMMON BLOCK REAL W(66,66),W1(65,65),WPS(65),Q(65,65),COEFF(1600) INTEGER NR(66,66),MITTE,N1,N2,ITER,NSYM,L1(65),L2(65),NO EQUIVALENCE (L1(1),W1(1,1)),(L2(1),Q(1,1)) EQUIVALENCE (NO,W1(1,1)),(MITTE,Q(1,1)),(WPS(1),W1(1,2)) COMMON W,W1,Q,COEFF,NR COMMON N1,N2,ITER,NSYM
C
C
LOCAL VARIABLES
C
REAL D(4),PUNKT,STERN,BET2,EPS,EPSI,EPS2,H,H2, OALT,OMEGAB,OMEGA,X,XN,Y,Z1,Z2,Z3 INTEGER I, J,K,K1,K2,LCOEFF,N,NN,N3,N4, MALT,MITTEO,MITTE1,MMAX,LCMAX LOGICAL BRN,BON,HBIT DATA PUNKT/1H./ DATA STERN/1H*/ DATA MALT /0/ C
C C
MEANING OF THE VARIABLES
C C C C C
C
W(I,J)
W1(I,J), WPS (I) COEFF(L)
C C
C C C C C C
C C C C
Q(I,J)
VALUE OF THE UNKNOWN FUNCTION AT (X,Y), WHERE X=(I-MITTE)*H, Y=(J-MITTE)*H AUXILIARY STORAGE HERE THE COEFFICIENTS OF THE DIFFERENCE EQUATIOI BELONGING TO A POINT NEAR THE BOUNDARY ARE STORED. COEFF(L), COEFF(L+1), COEFF(L+2), AND COEFF(L+3) ARE RELATED TO ONE POINT. RIGHT-HAND SIDE OF THE DIFFERENCE EQUATION
THE INTERIOR POINTS OF THE REGION SATISFY THE INEQUALITIES N1.GT.1 N1.LE.J.LE.N2 L1(J).LE.I.LE.L2(J) THIS SET OF INDICES MAY ALSO CONTAIN OUTER POINTS INDICATED BY NR(I,J)=-1. L1(J) IS EVEN IN ORDER TO SIMPLIFY THE RED-BLACK ORDERING OF THE SOR ITERATION. N1,N2, L1(J), L2(J),
490
APPENDICES
THE ARRAY BOUNDS OF (1) W,NR, OF (2) W1,WPS,Q,L1,L2, AND OF (3) COEFF CAN BE CHANGED SIMULTANEOUSLY WITH MMAX: MMAX=4 BOUNDS (1) (2) 34 33 (3) 800 MMAX=5 (1) (2) BOUNDS 66 65 (3) 1600 MMAX=6 BOUNDS (1) 130 (3) 3200 (2) 129 THE REAL VARIABLES MAY BE REPLACED BY DOUBLE PRECISION VARIABLES.
N
DISTANCES OF A BOUNDARY CLOSE POINT FROM THE NEIGHBOURING POINTS. DISTANCES OF THE BOUNDARY DISTANT POINTS (GRID SIZE) H=1./2**M =2**M
H2 NN
=H*H =N/2
MALT
=0 IN THE CASE OF THE FIRST RUN OF THIS SUBROUTINE; OTHERWISE, MALT COINCIDES WITH M FROM THE FOREGOING RUN. OMEGAS OF THE LAST RUN; OTHERWISE UNDEFINED =.TRUE. IN THE CASE OF SYMMETRIE WITH RESPECT TO THE LINE X=1 =.TRUE. IN THE CASE OF SYMMETRIE WITH RESPECT TO THE LINE Y=1 =.NOT.BR =.NOT.BO RELATIVE ACCURACY. THE SOR ITERATION IS CONTINUED UNTIL EPS IS REACHED. RELATIVE ACCURACY FOR DETERMINING THE DIFFERENCE 2-OMEGAB PRELIMINARY SOR PARAMETER THAT IS USED FOR COMPUTING OMEGAB (OMEGA. LT.OMEGAB) NUMBER OF STEPS OF THE SOR ITERATION LENGTH OF THE LINES OF SYMMETRIE IN W(I,J)
D(K) H M
OALT BR BO
BRN BON EPS EPSI
OMEGA ITER NSYM
IF THE PARAMETERS "EPSP" AND "OMEGAP" EQUAL ZERO, THE PROGRAMME DEFINES "EPS=0.001" AND COMPUTES THE OPTIMAL "OMEGAB". IN THE CASE OF "OMEGAP.GT.O.", THE PARAMETER OMEGAB=OMEGAP IS USED DURING THE WHOLE ITERATION.
COMPONENTS OF REAL ARRAYS AND INTEGER VARIABLES EQUATED BY AN EQUIVALENCE STATEMENT ARE USED ONLY AS INTEGERS. OMEGAB=OMEGAP EPS=EPSP MMAX=S LCMAX=50*(2**MMAX) MITT E0=2**MMAX MITTE =MITTEO+1 MITTEI=MITTE +1
Appendix 4:
C
C C C
C
C
Poisson equation on nonrectangular regions
491
M MUST SATISFY 2.LE.M.LE.MMAX IF(2.LE.M AND. M.LE.MMAX) GO TO I PRINT 97, M,MMAX 97 FORMAT(4H1 M=,II,11H NOT IN (2 ,,II,IH)) STOP 1 N=2**M PRELIMINARY "OMEGA" IS I.. ONLY IN THE CASE OF "M=MALT+1" THE VALUE OF "OMEGAB" OF THE FOREGOING RUN IS ASSIGNED TO "OMEGA". OMEGA=1. IF(M.EQ.MALT+I) OMEGA=OALT MALT=M IF(EPS.LE.O.) EPS=0.001 EPS1=-1./ALOG(EPS) EPS2=0.1*EPS THE NUMBER NO OF BISECTION STEPS IN NACHB IS ABOUT -LOG2(EPS). NO=-1.5*ALOG(EPS)+0.5 ITER=O NN=N/2 IF(BR.OR.BO) NN=N
XN = N H = I./XN H2 = H*H Ni = MITTEI-N N2 = MITTE +N N3=N1-1 N4=N2+1
C C C
3
BRN=.NOT.BR BON=.NOT.BO N2R=N2 N20=N2 IF(BRN) N2R=N2R-1 IF(BON) N20=N20-1 THE VALUES 0. AND -1 ARE ASSIGNED TO "W" AND "NR", RESP., AT ALL POINTS OF THE SQUARE -1..LE.Y.LE.+1. -1..LE.X.LE.+I. DO 3 J=N3,N4 DO 3 I=N3,N4 W(I,J)=0. NR(I,J)=-1
C C C C
C C C
THE FOLLOWING NESTED LOOP DETERMINES ALL INTERIOR POINTS. AT THE SAME TIME "KI", "K2", "L1", "L2" ARE DEFINED SUCH THAT KI.LE.J.LE.K2 LI(J).LE.I.LE.L2(J) HOLDS FOR ALL INTERIOR POINTS. KI=N2 K2=N1
492
APPENDICES
DO 5 J=NI,N20 Y = J-MITTE
Y = Y*H LI(J)=N2 L2 (J)=NI
DO 4 I=NI,N2R X = I-MITTE X = X*H (X,Y) IS AN INTERIOR POINT IF "CHAR(X,Y).GT.O.". IN ORDER TO AVOID TOO SMALL DISTANCES FROM THE BOUNDARY, "CHAR(X,Y).GT.EPS2" IS REQUIRED FOR INTERIOR POINTS. THEREFORE IT IS PERMITTED THAT SOME POINTS HAVE A DISTANCE GREATER THAN "H" FROM THE BOUNDARY. WE ALLOW A DISTANCE UP TO 1.1*H. THIS HARDLY INFLUENCES THE ACCURACY. LE. EPS2) GOTO 4 IF (CHAR(X,Y) KI=MINO(KI,J) K2=MAXO(K2,J)
C C C C C
C
L1 (J)=NIND(LI(J) ,I) C C C
C
C C C C C C C
L2 (J) =MAXO (L2 (J) , I) "NR=O" AND "Q=QUELL(X,Y)*H*H" IS USED AT INTERIOR POINTS. THESE VALUES WILL BE CHANGED ONLY AT POINTS NEAR THE BOUNDARY. NR(I,J) = 0 Q(I,J) = QUELL(X,Y)*H2 4 CONTINUE NR(NSYM+I, J)=NR(NSYM-1, J) ROUNDING OFF L1(J) TO THE PRECEDING EVEN INTEGER. L1(J)=L1(J)-MOD (L1(J),2) 5 CONTINUE DO 6 I=NI,N2R HR(I,NSYN+1)=NR(I,NSYM-1) CONTINUE 6
NEW PAGE. AFTERWARDS THE GRID IS PRINTED OUT PROVIDED THAT M.LE.5. "STAR"=INTERIOR POINT "PERIOD"=EXTERIOR POINT "X" IS ORIENTATED FROM THE LEFT TO THE RIGHT, "Y" FROM BOTTOM TO TOP. IF (M GT. 5) GO TO 10 PRINT99 99 FORMAT (IHI) J=N20
DO 9 K = NI,N20 DO 7 I = NI,N2R WPS(I) = PUNKT IF(NR(I,J).GE.O) WPS(I) = STERN 7 CONTINUE PRINT 8, (WPS(I), I=NI,N2R) 8
.FORMAT(3X,64A2)
9
CONTINUE
J = J-1
Appendix 4:
C
Poisson equation on nonrectangular regions
493
IN THE CASE OF "K1.GT.K2" THERE ARE NO INTERIOR POINTS. IF(K1.LE.K2) GOTO 10 PRINT 98 98 FORMAT(30H1 THERE ARE NO INTERIOR POINTS) STOP
C C C
HENCEFORTH THE INTERIOR POINTS SATISFY N1.LE.J.LE.N2 L1(J).LE.I.LE.L2(J) 10 N1=K1 N2=K2
C C C C
C C C C
IT FOLLOWS THE DETERMINATION OF THE BOUNDARY CLOSE POINTS AND THE COMPUTATION OF THE COEFFICIENTS OF THE DIFFERENCE EQUATION IN THESE POINTS. THE COEFFICIENTS ARE ASSIGNED TO COEFF(LCOEFF), COEFF(LCOEFF+1), COEFF(LCOEFF+2), COEFF(LCOEFF+3)
C
LCOEFF = 1 DO 30 J=N1,N2 Y = J-MITTE Y = Y*H
K1=L1 (J) K2=L2 (J) C C C
C C C
C C C C
DO 29 I=K1,K2 NO CHECKS FOR EXTERIOR POINTS. IF (NR(I,J) LT. 0) GOTO 29 THE SUBROUTINE "NEIGHB" DEFINES THE ARRAY "D". "D(1),D(2),D(3),D(4) = DISTANCES FROM THE NEIGHBOURS" FOR BOUNDARY CLOSE POINTS; "D(1)=-H", OTHERWISE. CALL NEIGHB(D, I, J, H) ONLY BOUNDARY CLOSE POINTS ARE TREATED IN THE FOLLOWING IF (D(1).LT.O.) GO TO 29 IF "LCOEFF.GT.LCMAX" THE ARRAY "COEFF" IS FILLED. THE PROGRAMME MUST BE TERMINATED. LCMAX (=LENGTH OF LCOEFF) CAN BE ENLARGED. IN THIS CASE THE COMMON BLOCKS ARE TO BE CHANGED. IF(LCOEFF.GT.LCMAX) GOT0 100 X = I-MITTE
X = X*H Z1=D (1)+D(3)
Z2=D(2)+D(4) Z3 = 1./(Z1*D(1))+1./(Z2*D(2))+1./(Z1*D(3))+1./(Z2*D(4)) Q(I,J) = Q(I,J)*2./(Z3*H2) Z1=Z1*Z3 Z2=Z2*Z3 HBIT=.TRUE.
APPENDICES
494
11
12 13 14
15
16 17
18 19 20
21
22
29 30
IF (NR(I+1,J)) 11, 12, 12 Q(I,J) = Q(I,J) - 4./(D(1)*Z1)*RAND(X+D(1),Y) COEFF(LCOEFF) = 0. HBIT=HBIT.AND.D(1).EQ.H GOTO 13 COEFF(LCOEFF) = 4./(D(1)*Z1) IF (NR(I,J+1)) 14, 15, 15 Q(I,J) = Q(I,J) - 4./(D(2)*Z2)*RAND(X,Y+D(2)) COEFF(LCOEFF+1) = 0. HBIT=HBIT.AND.D(2).EQ.H GOTO 16 COEFF(LCOEFF+1) = 4./(D(2)*Z2) IF (NR(I-1,J)) 17, 18, 18 Q(I,J) = Q(I,J) - 4./(D(3)*Z1)*RAND(X-D(3),Y) COEFF(LCOEFF+2) = 0. HBIT=HBIT.AND.D(3).EQ.H GOT0 19 COEFF(LCOEFF+2) = 4./(D(3)*Z1) IF (NR(I,J-1)) 20, 21, 21 Q(I,J) = Q(I,J) - 4./(D(4)*Z2)*RAND(X,Y-D(4)) COEFF(LCOEFF+3) = 0. HBIT=HBIT.AND.D(4).EQ.H GOTO 22 COEFF(LCOEFF+3) = 4./(D(4)*Z2) NR(I,J) = 0 IF(HBIT) GOTO 29 NR(I,J) = LCOEFF LCOEFF = LCOEFF + 4 CONTINUE CONTINUE
C C
LCOEFF = LCOEFF/4 PRINT 40,LCOEFF 40 FORMAT(IX/ 34H NUMBER OF BOUNDARY CLOSE POINTS, 14H (D(L).NE.H) _, 14) C C C C
THE NEXT LOOP ENDING WITH STATEMENT NUMBER "59" COMPUTES THE OPTIMAL "OMEGAB". THE COMPUTATION IS OMITTED IF
C
"OIIEGAB.GT.O.".
C C
C C C
C
IF(OHEGAB.GT.O.) GOTO 60 "OMEGAB" WILL BE IMPROVED ITERATIVELY, STARTING WITH A UNSUITABLE VALUE. OMEGAB=2. AT FIRST "NN" STEPS OF THE SOR ITERATION ARE EXECUTED. AFTER NEARLY NN STEPS THE INFLUENCE OF THE BOUNDARY VALUES BEARS UPON THE MIDDLE OF THE REGION. 31 DO 32 I = 1,NN 32 CALL SOR(OMEGA) "W1 =W"
CALL SAVE
Appendix 4:
C C
C C C C C C
C C C
C
C
Poisson equation on nonrectangular regions
495
A FURTHER STEP OF THE SOR ITERATION. CALL SOR(OMEGA) "Z1"=SUMME"(W1(I,J)-W(I,J))**2" CALL QNORM(Z1) CALL SAVE CALL SOR(OMEGA) CALL QNORM(Z2) IF THE COMPONENT OF THE INITIAL ERROR BELONGING TO THE LARGEST EIGENVALUE OF THE MATRIX IS TOO SMALL, THE STARTING VALUES MUST BE CHANGED. IF(Z2.GE.Z1) GOTO 110 "Z3"=APPROXIMATION TO THE SPECTRAL RADIUS OF THE ITERATION MATRIX OF THE SOR ITERATION WITH PARAMETER "OMEGA". Z3 = SQRT(Z2/Z1) "BET2"=APPROXIMATION TO THE SQUARED SPECTRAL RADIUS OF THE ITERATION MATRIX OF THE JACOBI ITERATION. BET2 = (Z3+OI4EGA-1.)**2/(Z3*OMEGA**2) "Z3"=NEW APPROXIMATION OF "OMEGAB" Z3 = 2./(1.+SQRT(1.-BET2)) THE DIFFERENCE "2.-OMEGAS" IS TO BE DETERMINED UP TO THE RELATIVE ACCURACY "EPS1". IN CASE OF WORSE ACCURACY THE WHOLE PROCESS IS TO BE REPEATED WITH "OMEGAB=Z3". IF (ABS(Z3-01IEGAB) LT. (2.-23)*EPSI) GO TO 59 OHEGAB=Z3 GOTO 31
C C C
SINCE IT IS MORE ADVANTAGEOUS TO USE A LARGER THAN A SMALLER VALUE, THE APPROXIMATION OF "OMEGAB" IS SOMEWHAT ENLARGED. THEN "OMEGAB" IS ROUNDED UP. 59 Z3=Z3+EPS1*(2.-Z3)+16.
C C
"OMEGAS" IS ASSIGNED TO "OALT". THIS VALUE IS KEPT SINCE IT CAN BE USED IN A FOLLOWING RUN WITH "M=M+l". OALT=OMEGAB PRINTING OF "OMEGAS" AND OF THE NUMBER OF ITERATION STEPS NEEDED UP TO NOW. PRINT 61, OMEGAS, ITER 61 FORMAT (1X/ 11H OMEGAS =,F6.3,7H ITER =,I4) 62 FORMAT(29H TOTAL NUMBER OF ITERATIONS,I6/1H
C C
C C C C
C C
"W1=W"
60 CALL SAVE NN STEPS OF THE SOR ITERATION DO 80 I = 1,NN 80 CALL SOR(OMEGAB) CHECK OF ACCURACY. THE ACCURACY IS NEARLY INDEPENDENT OF "H" AND "H", SINCE "NN" ITERATIONS ARE USED. DO 90 J = N1,N2 K1=L1(J) K2=L2(J) DO 90 I = K1,K2 IF (ABS(W1(I,J)-W(I,J)) GT. EPS) GOTO 60 90 CONTINUE
496
APPENDICES
C
C
PRINT-OUT OF "W" AND OF THE TOTAL NUMBER OF SOR ITERATIONS.
C
IF (M.GT.3) PRINT 99 PRINT 62, ITER
N3 = N3 + I J = N2
Y=(N2-MITTE)*H DO 93 K=N1,N2 PRINT 92, Y,(W(I,J),I=N3,N2R) FORHAT(1X,F9.7,4X,8F7.4,4X,8F7.4/7(14X,8F7.4,4X,8F7.4/)) 92 J = J - 1 Y=Y-H 93 CONTINUE RETURN C
100 PRINT 101 101 FORMAT(33H STOP
TOO MANY BOUNDARY CLOSE POINTS)
C
C
CHANGE OF STARTING VALUES AT THE INTERIOR POINTS.
C
110 DO 120 J=N1,N2 K1= L1(J) K2=L2(J) DO 119 I=K1,K2
IF(NR(I,J).LT.O) GO TO 119 V(I,J)=W(I,J)-1. CONTINUE 119 120 CONTINUE GOTO 31 END
Appendix 4:
Poisson equation on nonrectangular regions
497
SUBROUTINE SOR(OMEGA) C
THE SUBROUTINE "SOR" PERFORMS ONE ITERATION STEP. THE NUMBER OF ITERATIONS IS COUNTED BY "ITER". THE RED-BLACK ORDERING IS USED FOR THE INTERIOR POINTS. SINCE "L1(J)" IS EVEN, M=1 , MOD(I+J,2)=MOD(N1,2) HOLDS FOR THE FIRST RUN, WHILE M=2 , MOD(I+J,2)=MOD(N1+1,2) HOLDS FOR THE SECOND RUN.
C C C C
C C C C C
PARAMETER
C C
REAL OMEGA C
VARIABLES OF THE COMMON BLOCK
C C
REAL W(66,66),W1(65,65),WPS(65),Q(65,65),COEFF(1600) INTEGER NR(66,66),MITTE,NI,N2,ITER,NSYM,L1(65),L2(65),NO EQUIVALENCE (L1(1),W1(1,1)),(L2(1),Q(1,1)) EQUIVALENCE (NO,W1(1,1)),(MITTE,Q(1,1)) COI414ON W,W1,Q,COEFF,NR
COMMON N1,N2,ITER,NSYM C C
LOCAL VARIABLES
C
REAL OM,OM1 INTEGER I,J,K,K2,L,M,N C
ITER=ITER+1 OM = OMEGA*0.25 OMI=1.-OMEGA I.1
=I
N = 0 5 DO 50 J=N1,N2 K=LI(J)+N K2=L2(J) DO 40 I = K,K2,2 IF (NR(I,J)) 40, 10, 20 10 U(I,J) = W(I,J)*OMI + OM*(W(I+1,J)+W(I,J+1)+W(I-1,J) +
+W (I, J-1) -Q (10 J)
GOTO 40 20 L = NR(I,J) W(I,J) = W(I,J)*OM1 + OM*(COEFF(L)*W(I+1,J)+COEFF(L+1)* * W(I,J+1)+COEFF(L+2)*W(I-1,J)+COEFF(L+3)*W(I,J-1)-Q(I,J)) 40 CONTINUE W(NSYM+1,J)=W(NSYM-1,J) N = 1-N 50 CONTINUE IF(N2.LT.NSYM) GOTO 54 K=L1(NSYM-1) K2=L2(NSYM-1)
498
APPENDICES
DO 53 I=K,K2 W(I,NSYM+1)=W(I,NSYM-1) 53 CONTINUE 54 IF (M-2) 55,56,56
55 N = 1
M = 2 GOTO 5 56 RETURN END
SUBROUTINE NEIGHB(D, I, J, H) C C C
C C C
C C
"NEIGHB" COMPUTES THE DISTANCE OF THE INTERIOR POINT (X,Y) WITH X=(I-MITTE)*H Y=(J-MITTE)*H
FROM THE NEIGHBOURS. IN THE CASE OF A BOUNDARY DISTANT POINT, THE RESULT IS
D(1)=-H , D(2)=D(3)=D(4)=H.
C C C C
FOR BOUNDARY CLOSE POINTS FOUR POSITIVE NUMBERS ARE COMPUTED. THE DISTANCE FROM THE BOUNDARY IS DETERMINED BY A BISECTION METHOD, SINCE THE CHARACTERISTIC FUNCTION OF THE REGION IS NOT REQUIRED TO BE DIFFERENTIABLE.
C C C
PARAMETERS
REAL D(4),H INTEGER I,J C C C
VARIABLES OF THE COMMON BLOCKS REAL W(66,66),WI(65,65),WPS(65),Q(65,65),COEFF(1600) INTEGER NR(66,66),MITTE,NI,N2,ITER,NSYM,L1(65),L2(65),NO EQUIVALENCE (L1(1),W1(1,1)),(L2(1),Q(1,1)) EQUIVALENCE (NO,WI(1,1)),(MITTE,Q(1,1)) CON11ON W,W1,Q,COEFF,NR COMMON N1,N2,ITER,NSYM
C C C
LOCAL VARIABLES
REAL DELTA,H1,H11,X,Y INTEGER I1(4),J1(4),I2,J2,K,L LOGICAL B DATA II/1,0,-1,O/ DATA J1/0,1,0,-1/
Appendix 4:
Poisson equa is
cr. 'crrectangular regions
499
C
B = TRUE. H1=H/8. X = (I-MITTE)*H Y = (J-MITTE)*H DO 100 K = 1,4
12 = I + II (K) J2 = J + JI (K)
IF (NR(I2,J2).GE.O.) GO TO 90 B = FALSE. H11 = HI DELTA = HI C C
10
C
C
11
15
20 C
C C C C
80
90 100
IN THIS LOOP THE SIGN OF "CHARDL" IS CHECKED IN STEPS OF "H/8" IN ORDER TO DETECT THE FIRST CHANGE OF SIGN DO 10 L=1,9 IF (CHARDL(K, DELTA, X, Y).LT.O.) GO TO 11 DELTA = DELTA + H11 DELTA=H GO TO 80 HERE THE BISECTION METHOD STARTS. "NO" IS THE REQUIRED NUMBER OF STEPS. IT IS DEFINED IN POIS. H11 = H11*0.5 DELTA = DELTA - H11 DO 20 L = 1,NO H11 = H11*0.5 IF (CHARDL(K, DELTA, X, Y).LE.O.) GO TO 15 DELTA = DELTA + H11 GO TO 20 DELTA = DELTA - H11 CONTINUE IF(DELTA.GT.1.1*H) DELTA=H THE RESULTING DISTANCE MAY BE SOMEWHAT LARGER THAN "H". BUT NOTE THE BOUNDARY POINT IS LOCATED IN ABS(X).LE.1. ABS(Y).LE.1. SINCE EVERY REGION IS CUT IN THIS WAY. IF (DELTA.LE.H) GO TO 80 IF (X + I1(K)*DELTA.GT.1. OR. X + I1(K)*DELTA.LT.-I.) DELTA = H OR. IF (Y + J1(K)*DELTA.GT.1. Y + J1(K)*DELTA.LT.-1.) DELTA = H D(K) = DELTA GO TO 100 D(K) = H CONTINUE IF (B) D(1) _ -H RETURN
END
      REAL FUNCTION CHARDL(K,DELTA,X,Y)
C
C     "CHARDL" TRANSFORMS THE ARGUMENTS OF "CHAR" AS SUITED TO
C     "NEIGHB". "CHAR" SHOULD BE PROGRAMMED AS SIMPLE AS
C     POSSIBLE TO ENABLE EASY CHANGES.
C
C     PARAMETERS
C
      REAL DELTA,X,Y
      INTEGER K
C
C     LOCAL VARIABLES
C
      REAL X1(4),Y1(4),X2,Y2
      DATA X1/1.,0.,-1.,0./
      DATA Y1/0.,1.,0.,-1./
C
      X2 = X + DELTA*X1(K)
      Y2 = Y + DELTA*Y1(K)
      CHARDL = CHAR(X2,Y2)
      RETURN
      END
      SUBROUTINE SAVE
C
C     "SAVE" STORES "W" ON "W1"
C
C     VARIABLES OF THE COMMON BLOCKS
C
      REAL W(66,66),W1(65,65),WPS(65),Q(65,65),COEFF(1600)
      INTEGER NR(66,66),MITTE,N1,N2,ITER,NSYM,L1(65),L2(65),NO
      EQUIVALENCE (L1(1),W1(1,1)),(L2(1),Q(1,1))
      EQUIVALENCE (NO,W1(1,1)),(MITTE,Q(1,1))
      COMMON W,W1,Q,COEFF,NR
      COMMON N1,N2,ITER,NSYM
C
C     LOCAL VARIABLES
C
      INTEGER I,J,K1,K2
C
      DO 10 J=N1,N2
      K1=L1(J)
      K2=L2(J)
      DO 10 I=K1,K2
   10 W1(I,J)=W(I,J)
      RETURN
      END
      SUBROUTINE QNORM(Z)
C
C     "QNORM" COMPUTES THE SQUARED EUCLIDEAN NORM OF THE
C     DIFFERENCE:  Z = SUM OF (W1(I,J)-W(I,J))**2
C
C     PARAMETER
C
      REAL Z
C
C     VARIABLES OF THE COMMON BLOCKS
C
      REAL W(66,66),W1(65,65),WPS(65),Q(65,65),COEFF(1600)
      INTEGER NR(66,66),MITTE,N1,N2,ITER,NSYM,L1(65),L2(65),NO
      EQUIVALENCE (L1(1),W1(1,1)),(L2(1),Q(1,1))
      EQUIVALENCE (NO,W1(1,1)),(MITTE,Q(1,1))
      COMMON W,W1,Q,COEFF,NR
      COMMON N1,N2,ITER,NSYM
C
C     LOCAL VARIABLES
C
      REAL SUM
      INTEGER I,J,K1,K2
C
      SUM=0.
      DO 10 J=N1,N2
      K1=L1(J)
      K2=L2(J)
      DO 9 I=K1,K2
      IF (NR(I,J).LT.0) GOTO 9
      SUM = SUM + (W1(I,J)-W(I,J))**2
    9 CONTINUE
   10 CONTINUE
      Z=SUM
      RETURN
      END
EXAMPLE:

      REAL FUNCTION CHAR(X,Y)
C
C     ELLIPSE
C
      Z1=X/0.9
      Z2=Y/0.6
      CHAR=1.-Z1*Z1-Z2*Z2
      RETURN
      END

      REAL FUNCTION QUELL(X,Y)
C
      QUELL=4.
      RETURN
      END

      REAL FUNCTION RAND(X,Y)
C
      RAND=1.
      RETURN
      END
Appendix 5:  Programs for band matrices.
We present programs for the following methods:

GAUBD3:  Gaussian elimination without pivot search for
         tridiagonal matrices.
GAUBD:   Gaussian elimination without pivot search for band
         matrices of arbitrary band width w >= 3.
REDUCE:  Band width reduction by the Gibbs-Poole-Stockmeyer
         method. (This includes the subroutines LEVEL, KOMPON,
         SSORT1, SSORT2.)

In the programs we use K = (w-1)/2 as a measure of band width
(cf. Definitions 20.1 and 20.4); for a tridiagonal matrix, for
example, w = 3 and K = 1.

We first consider calls of GAUBD3 and GAUBD. A is the matrix A
from Section 20 (cf. also Figure 20.2), N >= 2 is the number of
equations, and K < N is the measure of band width just mentioned.
If the only change since the last call of the program is on the
right side of the system of equations A(4,*) [or A(2*K+2,*)],
set B = .TRUE.; otherwise, B = .FALSE.. After the call of the
program, the solution vector is in A(4,*) [or A(2*K+2,*)]. For
K > 10, the statement for A in GAUBD has to be replaced by
REAL A(m,N). Here m is some number greater than or equal to
2*K+2.

The number of floating point operations in one call of GAUBD3
or GAUBD is:

    GAUBD3    B = .FALSE.:   8N - 7
              B = .TRUE.:    5N - 4
    GAUBD     B = .FALSE.:   (2K² + 5K + 1)(N-1) + 1
              B = .TRUE.:    (4K + 1)(N-1) + 1.
The program REDUCE contains four explicit parameters:

    N:     number of rows in the matrix
    M:     the number of matrix elements different from zero
           above the main diagonal
    KOLD:  K before band width reduction
    KNEW:  K after band width reduction.

N and M are input parameters, and KOLD and KNEW are output
parameters. The pattern of the matrix is described by the
vector A in the COMMON block. Before a call, one enters here
the row and column indices of the various matrix elements
different from zero above the main diagonal. The entry order
is: row index, corresponding column index, row index,
corresponding column index, etc.; altogether there are M pairs
of this sort. REDUCE writes into this vector that permutation
which leads to band width reduction. Either KOLD > KNEW or
KOLD = KNEW. In the second case, the permutation is the
identity, since no band width reduction can be accomplished
with this program.
The next example should make the use of REDUCE more explicit.
The pattern of the matrix is given as

    x x       x x
    x x x     x x x
      x x x     x x x
        x x x     x x x
          x x       x x
    x x       x x       x x
    x x x     x x x     x x x
      x x x     x x x     x x x
        x x x     x x x     x x x
          x x       x x       x x
              x x       x x
              x x x     x x x
                x x x     x x x
                  x x x     x x x
                    x x       x x
This corresponds to the graph in Figure 20.11.
Input:
    N = 15,  M = 38,
    A = 1,2,1,6,1,7,2,3,2,6,2,7,2,8,3,4,3,7,3,8,3,9,
        4,5,4,8,4,9,4,10,5,9,5,10,6,7,6,11,6,12,7,8,7,11,
        7,12,7,13,8,9,8,12,8,13,8,14,9,10,9,13,9,14,9,15,
        10,14,10,15,11,12,12,13,13,14,14,15.

Output:
    KOLD = 6,  KNEW = 4,
    A = 1,4,7,10,13,2,5,8,11,14,3,6,9,12,15.
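A driver for this example might be set up as follows (our own
sketch, not part of the original listings; the COMMON layout is
copied from the declarations of REDUCE below, and the index
pairs are copied from a local array P, since blank COMMON
cannot be initialized by DATA statements):

C     DRIVER FOR THE EXAMPLE ABOVE (SKETCH)
      INTEGER A(4096),VEC(650),IND(651,8)
      COMMON A,VEC,IND
      INTEGER P(76),N,M,KOLD,KNEW,I
      DATA P/1,2,1,6,1,7,2,3,2,6,2,7,2,8,3,4,3,7,3,8,3,9,
     *       4,5,4,8,4,9,4,10,5,9,5,10,6,7,6,11,6,12,7,8,7,11,
     *       7,12,7,13,8,9,8,12,8,13,8,14,9,10,9,13,9,14,9,15,
     *       10,14,10,15,11,12,12,13,13,14,14,15/
      N=15
      M=38
      DO 10 I=1,2*M
   10 A(I)=P(I)
      CALL REDUCE(N,M,KOLD,KNEW)
      WRITE(6,20) KOLD,KNEW,(A(I),I=1,N)
   20 FORMAT(' KOLD =',I3,',  KNEW =',I3/' A =',15I4)
      STOP
      END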
The program declarations are sufficient for N <= NMAX = 650 and
M <= MMAX = 2048. For larger N or M, only the bounds of the
COMMON variables A, VEC, IND, LIST, GRAD, and NR have to be
changed. However, N must be less than 10,000 in any case. On
IBM or Siemens installations with data types INTEGER*2 and
INTEGER*4, two bytes suffice for IND, LIST, GRAD, and NR. All
other variables should be INTEGER*4.

For a logical run of the program, it is immaterial whether the
graph is connected or not. However, if the graph decomposes
into very many connected components (not counting knots of
degree zero), the computing times become extremely long. We
have attached no special significance to this fact, since the
graph in most practical cases has only one or two connected
components.

REDUCE is described in nine sections.

Section 1: Computation of KOLD and numbering of the degree zero
knots. NUM contains the last number given.

Section 2: Building a data base for the following sections.
During this transformation of the input values, matrix elements
entered in duplicate are eliminated. The output is the knots
A(J), J = LIST(I) to LIST(I+1)-1, which are connected to the
knot I. They are ordered in order of increasing degree. GRAD(I)
gives the degree of knot I. NR(I) is the new number of the
knot, or is zero if the knot does not yet have a new number.
In our example, after Section 2 we obtain:

A    = 2,6,7,  1,3,6,7,8,  2,4,7,8,9,  5,3,10,8,9,  4,10,9,
       1,11,2,12,7,  1,11,2,6,13,3,12,8,  2,3,4,13,14,12,7,9,
       15,5,3,14,4,13,10,8,  5,15,4,14,9,  6,12,7,  11,6,13,7,8,
       12,14,7,8,9,  15,10,13,8,9,  10,14,9.
LIST = 1,4,9,14,19,22,27,35,43,51,56,59,64,69,74,77.
GRAD = 3,5,5,5,3,5,8,8,8,5,3,5,5,5,3.
NR   = 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.
If the graph (disregarding knots of degree zero) consists of
several connected components, Sections 3 through 8 will run the
corresponding number of times.

Section 3: Steps (A) and (B) of the algorithm, computation of
κ1.

Section 4: Steps (C) through (F), computation of κ2. The return
from (D) to (B) contains the instruction

    IF(DEPTHB.GT.DEPTHF) GOTO 160

Section 5: Preliminary enumeration of the elements in the
levels Si (Step (G)).
Section 6: Determination and sorting of the components Vi
(Step (G)).
Section 7: Steps (H) through (J). The loop on ν begins with

    DO 410 NUE = 1, K2

Section 8: Steps (K) through (O). Steps (L) and (M) are
combined in the program. The loop on L ends with

    IF(L.LE.DEPTHF) GOTO 450

Section 9: Computation of KNEW and transfer of the new
enumeration from NR to A.
LEVEL computes one level structure with the root START (cf.
Theorem 20.8); KOMPON computes the components of V (cf. Lemma
20.10), beginning with an arbitrary starting element. SSORT1
and SSORT2 are sorting programs. To save time, we use a method
of Shell 1959 (cf. also Knuth 1973), but this can be replaced
just as easily with Quicksort.
Section 7 determines the amount of working memory required. If
the graph is connected and the return from Step (D) to Step (B)
occurs at most once, then the computing time is

    O(c1·n) + O((c1·n)^(5/4)) + O(c1·c2·n)

where

    n  = number of knots in the graph
    c1 = maximum degree of the knots
    c2 = maximum number of knots in the last level of R(G).

The second summand contains the computing time for the sorts.
If Quicksort is used, this term becomes O(c1·n·log(c1·n)) in
the mean (statistically). The third summand corresponds to
Section 4 of the program.
Suppose a boundary value problem in R² is to be solved with a
difference method or a finite element method. We consider the
various systems of equations which result from a decreasing
mesh size h of the lattice. Then it is usually true that

    n = O(1/h²),  c1 = O(1),  c2 = O(1/h),
    K = O(1/h)  (band width measure).

The computing time for REDUCE thus grows at most in proportion
to 1/h³, and for GAUBD, to 1/h⁴.

The program was tested with 166 examples. Of these, 28 are more
or less comparable, in that they each had a connected graph,
the number of knots was between 900 and 1000, and M was between
1497 and 2992. For this group, the computing time on a
CDC CYBER 76 varied between 0.16 and 0.37 seconds.
      SUBROUTINE GAUBD3(A,N,B)
      REAL A(4,N)
      INTEGER N
      LOGICAL B
C
C     SOLUTION OF A SYSTEM OF LINEAR EQUATIONS WITH TRIDIAGONAL
C     MATRIX. THE I-TH EQUATION IS
C     A(1,I)*X(I-1)+A(2,I)*X(I)+A(3,I)*X(I+1)=A(4,I)
C     ONE TERM IS MISSING FOR THE FIRST AND LAST EQUATION.
C     THE SOLUTION X(I) WILL BE ASSIGNED TO A(4,I).
C
      REAL Q
      INTEGER I,I1
C
      IF(N.LE.1)STOP
      IF(B) GOTO 20
      DO 10 I=2,N
      Q=A(1,I)/A(2,I-1)
      A(2,I)=A(2,I)-A(3,I-1)*Q
      A(4,I)=A(4,I)-A(4,I-1)*Q
   10 A(1,I)=Q
      GOTO 40
C
   20 Q=A(4,1)
      DO 30 I=2,N
      Q=A(4,I)-A(1,I)*Q
   30 A(4,I)=Q
C
   40 Q=A(4,N)/A(2,N)
      A(4,N)=Q
      I1=N-1
      DO 50 I=2,N
      Q=(A(4,I1)-A(3,I1)*Q)/A(2,I1)
      A(4,I1)=Q
   50 I1=I1-1
      RETURN
      END
      SUBROUTINE GAUBD(A,N,K,B)
      REAL A(22,N)
      INTEGER N,K
      LOGICAL B
C
C     SOLUTION OF A SYSTEM OF LINEAR EQUATIONS WITH BAND
C     MATRIX. THE I-TH EQUATION IS
C     A(1,I)*X(I-K)+A(2,I)*X(I-K+1)+...+A(K+1,I)*X(I)+...+
C     A(2*K,I)*X(I+K-1)+A(2*K+1,I)*X(I+K) = A(2*K+2,I)
C     FOR I=1(1)K AND I=N-K+1(1)N SOME TERMS ARE MISSING.
C     THE SOLUTION X(I) WILL BE ASSIGNED TO A(2*K+2,I).
C
      REAL Q
      INTEGER K1,K2,K21,K22,I,II,II1,J,JJ,J1,JK,L,LL,LLL
C
      IF((K.LE.0).OR.(K.GE.N))STOP
      K1=K+1
      K2=K+2
      K21=2*K+1
      K22=K21+1
      IF(B) GO TO 100
C
      JJ=K21
      II=N-K+1
      DO 20 I=II,N
      DO 10 J=JJ,K21
   10 A(J,I)=0.
   20 JJ=JJ-1
      DO 50 I=2,N
      II=I-K
      DO 40 J=1,K
      IF(II.LE.0) GO TO 40
      Q=A(J,I)/A(K1,II)
      J1=J+1
      JK=J+K
      LLL=K2
      DO 30 L=J1,JK
      A(L,I)=A(L,I)-A(LLL,II)*Q
   30 LLL=LLL+1
      A(K22,I)=A(K22,I)-A(K22,II)*Q
      A(J,I)=Q
   40 II=II+1
   50 CONTINUE
      GO TO 200
C
  100 DO 150 I=2,N
      II=I-K
      DO 140 J=1,K
      IF(II.LE.0) GO TO 140
      A(K22,I)=A(K22,I)-A(K22,II)*A(J,I)
  140 II=II+1
  150 CONTINUE
C
  200 A(K22,N)=A(K22,N)/A(K1,N)
      II=N-1
      DO 250 I=2,N
      Q=A(K22,II)
      JJ=II+K
      IF(JJ.GT.N) JJ=N
      II1=II+1
      LL=K2
      DO 240 J=II1,JJ
      Q=Q-A(LL,II)*A(K22,J)
  240 LL=LL+1
      A(K22,II)=Q/A(K1,II)
  250 II=II-1
      RETURN
      END
      SUBROUTINE REDUCE(N,M,KOLD,KNEW)
C
C     PROGRAMME FOR REDUCING THE BANDWIDTH OF A SPARSE SYMMETRIC
C     MATRIX BY THE METHOD OF GIBBS, POOLE, AND STOCKMEYER.
C
C     INPUT
C
C     N                 NUMBER OF ROWS
C     M                 NUMBER OF NONVANISHING ENTRIES ABOVE THE
C                       DIAGONAL
C     A(I), I=1(1)2*M   INPUT VECTOR CONTAINING THE INDICES OF
C                       THE NONVANISHING ENTRIES ABOVE THE
C                       DIAGONAL. THE INDICES ARE ARRANGED IN
C                       THE SEQUENCE I1, J1, I2, J2, I3, J3, ...
C
C     OUTPUT
C
C     A(I), I=1(1)N     NEW NUMBERS OF I-TH ROW AND COLUMN.
C     KOLD              BANDWIDTH OF THE INPUT MATRIX
C     KNEW              BANDWIDTH AFTER PERMUTATION OF THE INPUT
C                       MATRIX ACCORDING TO A(I), I=1(1)N
C
C     THE ARRAY BOUNDS MAY BE CHANGED, PROVIDED THAT
C     NMAX.LT.10000
C     A(2*MMAX), VEC(NMAX), IND(NMAX+1,8), LIST(NMAX+1),
C     GRAD(NMAX), NR(NMAX)
C
      INTEGER N,M,KOLD,KNEW
C
      INTEGER A(4096),VEC(650)
      INTEGER IND(651,8),LIST(651),GRAD(650),NR(650)
      COMMON A,VEC,IND
      EQUIVALENCE (LIST(1),IND(1,1)),(GRAD(1),IND(1,2)),
     *            (NR(1),IND(1,3))
C
      INTEGER NMAX,MMAX,NN,M2,N1,NUE,N1M,NUM,IS,OLD,NEW
      INTEGER F,L,L1,L2,L10,I,J,III,K,KNP1,K1,K1N,K2,K2P1
      INTEGER G,H,START,DEPTHF,DEPTHB,LEVWTH,GRDMIN,C,C2
      INTEGER KAPPA,KAPPA1,KAPPA2,KAPPA3,KAPPA4
      INTEGER IND1,IND2,INDJ2,INDJ5,INDJ6,INDI7,INDI8,VECJ
      DATA C/10000/
      C2=2*C
      NMAX=650
      MMAX=2048
      IF(N.LT.2.OR.N.GT.NMAX.OR.M.GT.MMAX) STOP
C
C     SECTION 1
C
      M2=M+M
      KOLD=0
      KNEW=N
      DO 10 I=1,8
      DO 10 J=1,N
   10 IND(J,I)=0
      IF(M.EQ.0) GOTO 680
      DO 15 I=1,M2,2
      J=IABS(A(I)-A(I+1))
      IF(J.GT.KOLD) KOLD=J
   15 CONTINUE
C
      DO 20 I=1,M2
      K1=A(I)
   20 IND(K1,7)=1
      NUM=1
      DO 30 I=1,N
      IF(IND(I,7).GT.0) GOTO 30
      NR(I)=NUM
      NUM=NUM+1
   30 CONTINUE
C
C     SECTION 2 (NEW DATA STRUCTURE)
C
      DO 40 I=1,M2,2
      K1=A(I)
      K2=A(I+1)
      A(I)=K1+C*K2
   40 A(I+1)=K2+C*K1
      CALL SSORT1(1,M2)
      J=1
      OLD=A(1)
      DO 70 I=2,M2
      NEW=A(I)
      IF(NEW.GT.OLD) J=J+1
      A(J)=NEW
   70 OLD=NEW
      M2=J
      IND(1,2)=1
      J=1
      L10=A(1)/C
      DO 90 I=1,M2
      K=A(I)
      L1=K/C
      L2=K-L1*C
      A(I)=L2
      IF(L1.EQ.L10) GOTO 90
      L10=L1
      J=J+1
      IND(J,2)=I
   90 CONTINUE
      IND(J+1,2)=M2+1
      LIST(1)=1
      J=1
      DO 110 I=1,N
      IF(IND(I,7).GT.0) J=J+1
  110 LIST(I+1)=IND(J,2)
      DO 120 I=1,N
  120 GRAD(I)=LIST(I+1)-LIST(I)
      DO 130 I=1,N
      F=LIST(I)
      L=LIST(I+1)-1
  130 CALL SSORT2(A,2,F,L)
C
C     SECTION 3 (COMPUTATION OF R(G))
C     STEPS (A) AND (B), COMPUTATION OF KAPPA 1
C     IND(I,7)   LEVEL NUMBER OF R(G)
C     VEC(I)     ELEMENTS OF THE LAST LEVEL
C
  140 GRDMIN=N
      DO 150 I=1,N
      IF(NR(I).GT.0) GOTO 150
      IF(GRDMIN.LE.GRAD(I)) GOTO 150
      START=I
      GRDMIN=GRAD(I)
  150 CONTINUE
C
  160 G=START
      NN=N
      CALL LEVEL(G,NN,DEPTHF,K1,KAPPA1)
      J=NN-K1
      DO 180 I=1,K1
      III=I+J
  180 VEC(I)=IND(III,6)
      DO 190 I=1,N
  190 IND(I,7)=IND(I,8)
C
C     SECTION 4 (COMPUTATION OF R(H))
C     STEPS (C) TO (F), COMPUTATION OF KAPPA 2
C     IND(I,8)   LEVEL NUMBERS OF R(H)
C
      LEVWTH=N
      DO 210 I=1,K1
      START=VEC(I)
      N1=N
      CALL LEVEL(START,N1,DEPTHB,K1N,KAPPA2)
      IF(DEPTHB.GT.DEPTHF) GOTO 160
      IF(KAPPA2.GE.LEVWTH) GOTO 210
      LEVWTH=KAPPA2
      VECJ=I
  210 CONTINUE
      H=VEC(VECJ)
      N1=N
      CALL LEVEL(H,N1,DEPTHB,K1N,KAPPA2)
C
C     SECTION 5 (PRELIMINARY NUMBERING OF THE ELEMENTS OF S(I))
C     STEP (G)
C     IND(I,4)   PRELIMINARY NUMBER OF ELEMENTS OF S(I)
C     IND(I,5)   LEVEL NUMBERS FOR NODES WITH SAME NUMBERING;
C                ZERO OTHERWISE
C
      DO 230 I=1,N
      IND(I,4)=0
  230 IND(I,5)=0
      J=0
      KNP1=DEPTHF+1
      DO 260 I=1,N
      INDI8=IND(I,8)
      IF(INDI8.EQ.0) GOTO 260
      K2=KNP1-INDI8
      IF(IND(I,7).NE.K2) GOTO 250
      IND(K2,4)=IND(K2,4)+1
      K2=-K2
      J=J+1
  250 CONTINUE
      IND(I,5)=K2
  260 CONTINUE
C
C     SECTION 6 (DETERMINATION AND SORTING OF V(I))
C     STEP (G)
C     VEC(I)   STARTING VALUES OF V(I) SORTED WITH RESPECT TO
C              IABS(V(I))
C
      K2=0
      IF(J.EQ.NN) GOTO 412
      DO 290 I=1,N
      IF(IND(I,5).LE.0) GOTO 290
      START=I
      CALL KOMPON(START,N1)
      K2=K2+1
      VEC(K2)=START
      IND(START,8)=N1
  290 CONTINUE
      DO 310 I=1,N
      IF(IND(I,5).LT.-C) IND(I,5)=IND(I,5)+C2
  310 CONTINUE
      CALL SSORT2(VEC,8,1,K2)
      N1M=VEC(K2)
      N1M=IND(N1M,8)
      DO 315 I=1,K2
  315 IND(I,8)=VEC(I)
C
C     SECTION 7 (COMPUTATION OF S)
C     STEPS (H) TO (J)
C     IND(I,4)   NUMBER OF ELEMENTS OF S(I)
C     IND(I,5)   LEVEL NUMBER OF S
C     IND(I,6)   ALL NODES V(I)
C
      DO 319 J=1,DEPTHF
  319 VEC(J)=0
      K2P1=K2+1
      DO 410 NUE=1,K2
      III=K2P1-NUE
      START=IND(III,8)
      CALL KOMPON(START,N1)
      IND1=7
      IS=0
  325 DO 330 J=1,N1
      INDJ6=IND(J,6)
      III=IND(INDJ6,IND1)+IS
      VEC(III)=VEC(III)+1
  330 IND(J,2)=III
      KAPPA=0
      DO 340 J=1,N1
      INDJ2=IND(J,2)
      III=VEC(INDJ2)
      IF(III.LE.0) GOTO 340
      III=III+IND(INDJ2,4)
      VEC(INDJ2)=0
      IF(III.GT.KAPPA) KAPPA=III
  340 CONTINUE
      IF(IS.GT.0) GOTO 346
      KAPPA3=KAPPA
      IND1=5
      IS=C2
      GOTO 325
C
  346 KAPPA4=KAPPA
      IF(KAPPA3-KAPPA4) 350,347,380
  347 IF(KAPPA1.GT.KAPPA2) GOTO 380
  350 DO 370 J=1,N1
      INDJ6=IND(J,6)
      III=IND(INDJ6,7)
      IND(INDJ6,5)=III-C2
  370 IND(III,4)=IND(III,4)+1
      GOTO 410
  380 DO 400 J=1,N1
      INDJ2=IND(J,2)
  400 IND(INDJ2,4)=IND(INDJ2,4)+1
  410 CONTINUE
      DO 411 J=1,N1M
  411 GRAD(J)=LIST(J+1)-LIST(J)
  412 DO 415 I=1,N
      IF(IND(I,5).LT.-C) IND(I,5)=IND(I,5)+C2
  415 IND(I,5)=IABS(IND(I,5))
C
C     SECTION 8 (NUMBERING)
C     STEPS (K) TO (O)
C
      DO 420 I=1,N
  420 VEC(I)=I
      CALL SSORT2(VEC,5,1,N)
      IND1=1
      L=1
      OLD=0
      NEW=1
      IND(1,7)=G
      NR(G)=NUM
      NUM=NUM+1
C
  450 I=0
C
  460 I=I+1
      IF(I.GT.NEW) GOTO 490
  470 INDI7=IND(I,7)
      L1=LIST(INDI7)
      L2=LIST(INDI7+1)-1
      DO 480 J=L1,L2
      START=A(J)
      IF(NR(START).GT.0) GOTO 480
      IF(IND(START,5).NE.L) GOTO 480
      NR(START)=NUM
      NUM=NUM+1
      NEW=NEW+1
      IND(NEW,7)=START
  480 CONTINUE
      GOTO 460
C
  490 IF(NEW-OLD.GE.IND(L,4)) GOTO 510
      GRDMIN=N
      IND2=IND1
      DO 500 J=IND1,N
      VECJ=VEC(J)
      INDJ5=IND(VECJ,5)
      IF(INDJ5-L) 499,491,501
  491 IF(NR(VECJ).GT.0) GOTO 500
      IF(GRAD(VECJ).GE.GRDMIN) GOTO 500
      GRDMIN=GRAD(VECJ)
      START=VECJ
      GOTO 500
  499 IND2=J+1
  500 CONTINUE
  501 IND1=IND2
      NR(START)=NUM
      NUM=NUM+1
      NEW=NEW+1
      IND(NEW,7)=START
      GOTO 470
C
  510 NEW=NEW-OLD
      DO 520 I=1,NEW
      III=I+OLD
  520 IND(I,7)=IND(III,7)
      OLD=NEW
      L=L+1
      IF(L.LE.DEPTHF) GOTO 450
      IF(NUM.LE.N) GOTO 140
C
C     SECTION 9 (COMPUTATION OF KNEW)
C
      KNEW=0
      DO 670 I=1,N
      N1=NR(I)
      L1=LIST(I)
      L2=LIST(I+1)-1
      IF(L1.GT.L2) GOTO 670
      DO 660 J=L1,L2
      K=A(J)
      III=IABS(N1-NR(K))
      IF(III.GT.KNEW) KNEW=III
  660 CONTINUE
  670 CONTINUE
  680 IF(KOLD.GT.KNEW) GOTO 700
      KNEW=KOLD
      DO 690 I=1,N
  690 A(I)=I
      RETURN
  700 DO 710 I=1,N
  710 A(I)=NR(I)
      RETURN
      END
      SUBROUTINE LEVEL(START,NN,DEPTH,K3,WIDTH)
C
C     GENERATION OF THE LEVELS R(START)
C     DEPTH   DEPTH OF THE LEVELS
C     K3      NUMBER OF NODES IN THE LAST LEVEL
C     WIDTH   WIDTH OF THE LEVELS
C     NN      NUMBER OF ASSOCIATED NODES
C
      INTEGER START,NN,DEPTH,K3,WIDTH
      INTEGER A(4096),VEC(650)
      INTEGER IND(651,8),LIST(651),GRAD(650),NR(650)
      COMMON A,VEC,IND
      EQUIVALENCE (LIST(1),IND(1,1)),(GRAD(1),IND(1,2)),
     *            (NR(1),IND(1,3))
      INTEGER J,I,BEG,END,N1,K,K2,LBR,STARTN,AI,L1,L2
      J=NN
      DO 1 I=1,J
    1 IND(I,8)=0
      BEG=1
      N1=1
      K=1
      LBR=1
      K2=1
      IND(1,6)=START
      IND(START,8)=1
C
    3 K=K+1
      END=N1
      DO 10 J=BEG,END
      STARTN=IND(J,6)
      L1=IND(STARTN,1)
      L2=IND(STARTN+1,1)-1
      DO 5 I=L1,L2
      AI=A(I)
      IF(IND(AI,8).NE.0) GOTO 5
      IND(AI,8)=K
      N1=N1+1
      IND(N1,6)=AI
    5 CONTINUE
   10 CONTINUE
      K3=K2
      K2=N1-END
      IF(LBR.LT.K2) LBR=K2
      BEG=END+1
      IF(K2.GT.0) GOTO 3
C
      DEPTH=K-1
      WIDTH=LBR
      NN=N1
      RETURN
      END
      SUBROUTINE KOMPON(START,N1)
C
C     COMPUTATION OF THE COMPONENT V(I) CONTAINING "START"
C     N1         NUMBER OF INVOLVED NODES
C     IND(I,6)   ALL NODES V(I)
C
      INTEGER START,N1
      INTEGER A(4096),VEC(650)
      INTEGER IND(651,8),LIST(651),GRAD(650),NR(650)
      COMMON A,VEC,IND
      EQUIVALENCE (LIST(1),IND(1,1)),(GRAD(1),IND(1,2)),
     *            (NR(1),IND(1,3))
      INTEGER AI,I,K2,L1,L2,STARTN,J,BEG,END,C,C2
      DATA C/10000/
      C2=2*C
      BEG=1
      N1=1
      IND(START,5)=IND(START,5)-C2
      IND(1,6)=START
C
    3 END=N1
      DO 10 J=BEG,END
      STARTN=IND(J,6)
      L1=IND(STARTN,1)
      L2=IND(STARTN+1,1)-1
      DO 5 I=L1,L2
      AI=A(I)
      IF(IND(AI,5).LT.0) GOTO 5
      IND(AI,5)=IND(AI,5)-C2
      N1=N1+1
      IND(N1,6)=AI
    5 CONTINUE
   10 CONTINUE
      K2=N1-END
      BEG=END+1
      IF(K2.GT.0) GOTO 3
      RETURN
      END
      SUBROUTINE SSORT1(F,L)
C
C     SORTING OF A FROM A(F) TO A(L)
C
      INTEGER F,L
      INTEGER A(4096),VEC(650)
      INTEGER IND(651,8),LIST(651),GRAD(650),NR(650)
      COMMON A,VEC,IND
      EQUIVALENCE (LIST(1),IND(1,1)),(GRAD(1),IND(1,2)),
     *            (NR(1),IND(1,3))
      INTEGER N2,S,T,LS,IS,JS,AH,I,J
      IF(L.LE.F) RETURN
      N2=(L-F+1)/2
      S=1023
      DO 100 T=1,10
      IF(S.GT.N2) GOTO 90
      LS=L-S
      DO 20 I=F,LS
      IS=I+S
      AH=A(IS)
      J=I
      JS=IS
    5 IF(AH.GE.A(J)) GOTO 10
      A(JS)=A(J)
      JS=J
      J=J-S
      IF(J.GE.F) GOTO 5
   10 A(JS)=AH
   20 CONTINUE
   90 S=S/2
  100 CONTINUE
      RETURN
      END
      SUBROUTINE SSORT2(VEC,K,F,L)
C
C     SORTING OF VEC FROM I=F TO I=L SO THAT IND(VEC(I),K) IS
C     WEAKLY INCREASING
C
      INTEGER F,L,VEC(650),K
      INTEGER A(4096),VECC(650)
      INTEGER IND(651,8),LIST(651),GRAD(650),NR(650)
      COMMON A,VECC,IND
      EQUIVALENCE (LIST(1),IND(1,1)),(GRAD(1),IND(1,2)),
     *            (NR(1),IND(1,3))
      INTEGER N2,S,T,LS,IS,JS,AH,GH,AJ,I,J
      IF(L.LE.F) RETURN
      N2=(L-F+1)/2
      S=63
      DO 100 T=1,6
      IF(S.GT.N2) GOTO 90
      LS=L-S
      DO 20 I=F,LS
      IS=I+S
      AH=VEC(IS)
      GH=IND(AH,K)
      J=I
      JS=IS
    6 AJ=VEC(J)
      IF(GH.GE.IND(AJ,K)) GOTO 10
      VEC(JS)=VEC(J)
      JS=J
      J=J-S
      IF(J.GE.F) GOTO 6
   10 VEC(JS)=AH
   20 CONTINUE
   90 S=S/2
  100 CONTINUE
      RETURN
      END
Appendix 6:  The Buneman algorithm for solving the Poisson
equation.
Let G = (-1,1) x (-1,1). We want to find a solution for the
problem

    Δu(x,y) = q(x,y),   (x,y) ∈ G
     u(x,y) = φ(x,y),   (x,y) ∈ ∂G.

Depending on the concrete problem, two REAL FUNCTIONs, QUELL
and RAND, have to be constructed to describe q and φ (see the
example at the end below).
The parameter list for the subroutine BUNEMA includes only the
parameter K in addition to the names for QUELL and RAND (which
require an EXTERNAL declaration). We have H = 1./2**K, where H
denotes the distance separating the lattice points. After the
program has run, the COMMON domain will contain the computed
approximations. All further details can be discerned from the
program comments.

BUNEMA calls the subroutine GLSAR. This solves the system of
equations

    A(r)x = b

by means of a factorization of A(r) (cf. Sec. 21). When first
used, GLSAR calls the subroutine COSVEC, which computes a
series of cosine values via a recursion formula. GLSAR also
calls on the routine GAUBDS. The latter solves special
tridiagonal systems of equations via a modified Gaussian
algorithm (LU-splitting). Program BUNEMA contains 50 executable
FORTRAN instructions, and the other subroutines another 42
instructions. In order to make the program easy to read, we
have restricted ourselves to the case G = (-1,1) x (-1,1).
However, it can be rewritten for a rectangular region without
any difficulties.
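A typical call might look as follows (our own sketch, not part
of the original listings; note the EXTERNAL declaration
required for QUELL and RAND, and the COMMON layout copied from
BUNEMA):

C     SKETCH OF A MAIN PROGRAM FOR BUNEMA
      EXTERNAL RAND,QUELL
      REAL W(63,65),P(63,63)
      COMMON W,P
      INTEGER K
      K=5
      CALL BUNEMA(RAND,QUELL,K)
C     FOR K=5 WE HAVE H=1/32; W(32,33) APPROXIMATES U(0,0),
C     SINCE X=(I-32)*H AND Y=(J-33)*H.
      WRITE(6,10) W(32,33)
   10 FORMAT(' APPROXIMATION AT THE ORIGIN:',E15.7)
      STOP
      END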
We want to discuss the numerical results by means of the
following examples:

    Δu(x,y) = 0,              (x,y) ∈ G
     u(x,y) = 1,              (x,y) ∈ ∂G                     (1)
    exact solution: u(x,y) = 1

    Δu(x,y) = -2π² sin(πx) sin(πy),   (x,y) ∈ G
     u(x,y) = 0,              (x,y) ∈ ∂G                     (2)
    exact solution: u(x,y) = sin(πx) sin(πy).

The first example is particularly well suited to an examination
of the numerical stability of the method, since the
discretization error is zero. Thus the errors measured are
exclusively those arising from the solving of the system of
equations, either from the method or from rounding.
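For Example (1), the two problem functions reduce to constants;
a possible coding (ours, in the same style as the functions for
Example (2) reproduced at the end of this appendix) is:

      REAL FUNCTION QUELL(X,Y)
      REAL X,Y
      QUELL=0.0
      RETURN
      END

      REAL FUNCTION RAND(X,Y)
      REAL X,Y
      RAND=1.0
      RETURN
      END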
The computation was carried out on a CDC CYBER 76 (mantissa
length 48 bits for REAL) and on an IBM 370/168 (mantissa
lengths 21-24 bits for REAL*4 and 53-56 bits for REAL*8).
Table 1 contains the maximal absolute error for the computed
approximations. Here

    H  = 1/2**K = 2/(N+1)
    N  : number of lattice points in one direction
    N² : dimension of the system of equations
Example    N     CYBER 76       370/168        370/168
                 REAL           REAL*4         REAL*8

  (1)        3   0.71E-14       0.77E-6        0.11E-15
             7   0.43E-13       0.77E-6        0.54E-15
            15   0.23E-12       0.10E-4        0.72E-15
            31   0.58E-12       0.55E-4        0.15E-13
            63   0.15E-11       0.45E-3        0.85E-13
           127   0.68E-11       0.14E-2        0.30E-12

  (2)        3   0.23E0         0.23E0         0.23E0
             7   0.53E-1        0.53E-1        0.53E-1
            15   0.13E-1        0.13E-1        0.13E-1
            31   0.32E-2        0.32E-2        0.32E-2
            63   0.80E-3        0.69E-3        0.80E-3
           127   0.20E-3        0.31E-3        0.20E-3

                 Table 1.  Absolute error
Since the extent of the system of equations grows as N², one
would expect a doubling of N in Example (1) to lead to a
fourfold increase in the rounding error. This is almost exactly
what happened in the computation on the CYBER 76, averaged over
all N. On the IBM, the mean increase in error per step was
somewhat greater.

The values for Example (1) and Example (2) show that for a
REAL*4 computation on the IBM 370, N should in no case be
chosen larger than 63. For greater mantissa lengths, there is
no practical stability bound on either machine.

Table 2 contains the required computing times, exclusive of the
time required to prepare the right side of the system of
equations. In the compilation (with the exception of the G1
compiler), the parameter OPT = 2 was used.
Machine        CYBER 76    370/168       370/168       370/168
computation    REAL        REAL*4        REAL*8        REAL*4
compiler       FTN         H-Extended    H-Extended    G1

N =  31        0.03        0.04          0.04          0.07
     63        0.13        0.19          0.22          0.31
    127        0.55        0.85          1.04          1.44

           Table 2.  Computing times in seconds
      SUBROUTINE BUNEMA(RAND,QUELL,K)
C
C     PARAMETERS
C     FUNCTIONS RAND, QUELL
C
      INTEGER K
C
C     VARIABLES OF THE COMMON BLOCKS
C
      REAL W(63,65),P(63,63),Q(63,65)
      EQUIVALENCE (W(1,1),Q(1,1))
      COMMON W,P
C
C     LOCAL VARIABLES
C
      INTEGER I,J,J1,J2,J3,J4,J5,K1,K2,KMAX,R,R1,R2
      REAL B(63),X,Y,H,H2
C
C     MEANING OF THE VARIABLES
C
C     H      DISTANCE OF THE LATTICE POINTS, H=1/2**K. THE
C            PROGRAMME IS TERMINATED IN CASE OF K.LT.1.
C     H2     =H**2
C     K1     =2**(K+1)-1
C     K2     =2**(K+1)
C     P,Q    COMPARE DESCRIPTION OF THE METHOD. AS STARTING
C            VALUES, ZEROS ARE ASSIGNED TO P AND THE RIGHT-HAND
C            SIDE OF THE SYSTEM IS ASSIGNED TO Q.
C     W      AFTER PERFORMING THE PROGRAMME, W(I,J) CONTAINS AN
C            APPROXIMATION TO THE SOLUTION U(X,Y) AT THE
C            INTERIOR GRID POINTS. (I,J) AND (X,Y) ARE RELATED
C            BY
C            X=(I-2**K)*H,   I=1(1)K1
C            Y=(J-1-2**K)*H, J=2(1)K2.
C            TO SIMPLIFY THE SOLUTION PHASE OF THE PROGRAMME,
C            W(*,1) AND W(*,K2+1) ARE INITIALIZED BY ZEROS.
C            W AND Q ARE EQUIVALENCED. THERE IS NO IMPLICIT USE
C            OF THIS IDENTITY. DURING THE SOLUTION PHASE, THOSE
C            COMPONENTS OF W ARE DEFINED SUCCESSIVELY THAT ARE
C            NO LONGER NEEDED IN Q.
C     B      AUXILIARY STORAGE
C
      KMAX=5
C
C     ARRAY BOUNDS
C
C     THE LENGTHS OF THE ARRAYS W(DIM1,DIM2), P(DIM1,DIM1),
C     Q(DIM1,DIM2), AND B(DIM1) CAN BE CHANGED TOGETHER WITH
C     KMAX. IT IS
C     DIM1=2**(KMAX+1)-1
C     DIM2=2**(KMAX+1)+1.
C     ACCORDINGLY, THE LENGTHS OF THE ARRAYS COS2(DIM1) IN
C     SUBROUTINE GLSAR AND A(DIM1) IN SUBROUTINE GAUBDS MUST BE
C     CHANGED. THE PROGRAMME TERMINATES IF K.GT.KMAX.
C
      IF((K.LT.1).OR.(K.GT.KMAX))STOP
      K2=2**(K+1)
      K1=K2-1
      H=1.0/FLOAT(2**K)
      H2=H**2
C
C     ASSIGN ZEROS TO P AND TO PARTS OF W
C
      DO 10 J=1,K1
      W(J,1)=0.0
      W(J,K2+1)=0.0
      DO 10 I=1,K1
   10 P(I,J)=0.0
C
C     STORE THE RIGHT-HAND SIDE OF THE SYSTEM ON Q
C
      Y=-1.0
      DO 120 J=2,K2
      Y=Y+H
      X=-1.0
      DO 110 I=1,K1
      X=X+H
  110 Q(I,J)=H2*QUELL(X,Y)
      Q(1,J)=Q(1,J)-RAND(-1.0,Y)
  120 Q(K1,J)=Q(K1,J)-RAND(1.0,Y)
      X=-1.0
      DO 130 I=1,K1
      X=X+H
      Q(I,2)=Q(I,2)-RAND(X,-1.0)
  130 Q(I,K2)=Q(I,K2)-RAND(X,1.0)
C
C     REDUCTION PHASE, COMPARE EQ. (21.9) OF THE DESCRIPTION
C
      DO 230 R=1,K
      J1=2**R
      J2=K2-J1
      DO 230 J=J1,J2,J1
      J3=J-J1/2
      J4=J+J1/2
      DO 210 I=1,K1
  210 B(I)=P(I,J3)+P(I,J4)-Q(I,J+1)
      CALL GLSAR(R-1,B,K1,K)
      DO 220 I=1,K1
      P(I,J)=P(I,J)-B(I)
  220 Q(I,J+1)=Q(I,J3+1)+Q(I,J4+1)-2.0*P(I,J)
  230 CONTINUE
C
C     SOLUTION PHASE, COMPARE EQ. (21.10) OF THE DESCRIPTION
C
      R2=K+1
      DO 330 R1=1,R2
      R=R2-R1
      J1=2**R
      J2=K2-J1
      J3=2*J1
      DO 330 J=J1,J2,J3
      J4=J+1+J1
      J5=J+1-J1
      DO 310 I=1,K1
  310 B(I)=Q(I,J+1)-W(I,J4)-W(I,J5)
      CALL GLSAR(R,B,K1,K)
      DO 320 I=1,K1
  320 W(I,J+1)=P(I,J)+B(I)
  330 CONTINUE
      RETURN
      END
      SUBROUTINE GLSAR(R,B,N,K)
      INTEGER R,N,K
      REAL B(N)
C
C     SOLUTION OF THE SYSTEM A(R)*X=B BY FACTORING OF A(R)
C     (COMPARE EQ. (21.6) OF THE DESCRIPTION). A(R) IS DEFINED
C     RECURSIVELY BY
C     A(R)=2I-(A(R-1))**2, R=1(1)K
C     WITH A(0)=A=(AIJ), I,J=1(1)N
C     AND
C            -4  IF I=J
C     AIJ =   1  IF I=J+1 OR I=J-1
C             0  OTHERWISE
C
C     THE PROGRAMME FAILS IF N.LT.2; OTHERWISE IT TERMINATES
C     WITH B=X.
C
      INTEGER FICALL,J,J1,J2,JS
      REAL COS2(63)
      DATA FICALL/0/
C
      IF(R.EQ.0) GOTO 30
C
C     THE SUBROUTINE COSVEC IS ONLY CALLED IF THE VECTOR COS2
C     IS NOT YET COMPUTED FOR THE ACTUAL VALUE OF K.
C
      IF(FICALL.EQ.K) GOTO 1
      CALL COSVEC(K,COS2)
      FICALL=K
C
    1 DO 10 J=1,N
   10 B(J)=-B(J)
      J1=2**(K-R)
      J2=(2**(R+1)-1)*J1
      JS=2*J1
C
C     BECAUSE OF COS2(J)=2*COS(J*PI/2**(K+1))=2*COS(I*PI)
C     THE DOMAIN OF THE INDEX I IS
C     I=2**(-R-1) (2**(-R)) 1-2**(-R-1).
C
      DO 20 J=J1,J2,JS
   20 CALL GAUBDS(COS2(J),B,N)
      GOTO 40
C
   30 CALL GAUBDS(0.0,B,N)
C
   40 RETURN
      END
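The loop DO 20 J=J1,J2,JS realizes the factorization of A(R)
into tridiagonal factors. In our notation (a summary of the
situation described in Section 21, not part of the original
comments), the recursion A(r) = 2I - (A(r-1))**2 implies, for
r >= 1,

    A(r) = -(A + 2cos(θ1)I)(A + 2cos(θ2)I)···(A + 2cos(θs)I),
    θj = (2j-1)π/2^(r+1),  j = 1(1)s,  s = 2^r.

Since COS2(J) = 2cos(Jπ/2^(K+1)) and J runs from 2^(K-R) to
(2^(R+1)-1)·2^(K-R) in steps of 2^(K-R+1), the values passed to
GAUBDS are exactly the numbers 2cos(θj); the initial sign
change B = -B accounts for the leading factor -1.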
      SUBROUTINE GAUBDS(C,B,N)
      INTEGER N
      REAL C,B(N)
C
C     SOLUTION OF A LINEAR SYSTEM WITH SPECIAL TRIDIAGONAL
C     MATRIX:
C     X(I-1)+(-4+C)*X(I)+X(I+1)=B(I), I=1(1)N,
C     WHERE X(0) AND X(N+1) ARE VANISHING. THE PROGRAMME
C     TERMINATES WITH B=X. IN CASE OF N.LT.2 THIS SUBROUTINE
C     FAILS. IF N.GT.63 THE ARRAY BOUNDS MUST BE ENLARGED.
C
      INTEGER I,I1
      REAL Q,C4,A(63)
C
      C4=C-4.0
      A(1)=C4
      DO 10 I=2,N
      Q=1.0/A(I-1)
      A(I)=C4-Q
   10 B(I)=B(I)-B(I-1)*Q
      Q=B(N)/A(N)
      B(N)=Q
      I1=N-1
      DO 20 I=2,N
      Q=(B(I1)-Q)/A(I1)
      B(I1)=Q
   20 I1=I1-1
      RETURN
      END
      SUBROUTINE COSVEC(K,COS2)
      INTEGER K
      REAL COS2(1)
C
C     COMPUTATION OF
C     COS2(J)=2*COS(J*PI/2**(K+1)), J=1(1)2**(K+1)-1
C     BY MEANS OF RECURSION AND REFLECTION. THE PROGRAMME FAILS
C     FOR K.LT.1.
C
      INTEGER K2,J,J1
      REAL DC,T,CV,PI4
C
      K2=2**K
      J1=2*K2-1
      PI4=ATAN(1.0)
      COS2(K2)=0.0
      DC=-4.0*SIN(PI4/FLOAT(K2))**2
      K2=K2-1
      T=DC
      CV=2.0+DC
      COS2(1)=CV
      COS2(J1)=-CV
      DO 10 J=2,K2
      J1=J1-1
      DC=T*CV+DC
      CV=CV+DC
      COS2(J)=CV
   10 COS2(J1)=-CV
      RETURN
      END
EXAMPLE (MENTIONED IN THE TEXT):

      REAL FUNCTION QUELL(X,Y)
      REAL X,Y
      DATA PI/3.14159265358979/
      QUELL=-2.0*PI*PI*SIN(PI*X)*SIN(PI*Y)
      RETURN
      END

      REAL FUNCTION RAND(X,Y)
      REAL X,Y
      RAND=0.0
      RETURN
      END
BIBLIOGRAPHY

Abramowitz, M., Stegun, I. A.: Handbook of Mathematical Functions, New York: Dover Publications 1965.

Ahlfors, L. V.: Complex Analysis, New York-Toronto-London: McGraw-Hill 1966.

Ansorge, R., Hass, R.: Konvergenz von Differenzenverfahren für lineare und nichtlineare Anfangswertaufgaben, Lecture Notes in Mathematics, Vol. 159, Berlin-Heidelberg-New York: Springer 1970.

Beckenbach, E. F., Bellman, R.: Inequalities, Berlin-Heidelberg-New York: Springer 1971.

Birkhoff, G., Schultz, M. H., Varga, R. S.: "Piecewise hermite interpolation in one and two variables with applications to partial differential equations," Numer. Math., 11, 232-256 (1968).

Busch, W., Esser, R., Hackbusch, W., Herrmann, U.: "Extrapolation applied to the method of characteristics for a first order system of two partial differential equations," Numer. Math., 24, 331-353 (1975).

Buzbee, B. L., Golub, G. H., Nielson, C. W.: "On direct methods for solving Poisson's equations," SIAM J. Numer. Anal., 7, No. 4, 627-656 (1970).

Coddington, E. A., Levinson, N.: Theory of Ordinary Differential Equations, New York-Toronto-London: McGraw-Hill 1955.

Collatz, L.: Funktionalanalysis und Numerische Mathematik, Berlin-Göttingen-Heidelberg: Springer 1964.

Collatz, L.: The Numerical Treatment of Differential Equations, Berlin-Heidelberg-New York: Springer 1966.

Courant, R., Friedrichs, K. O., Lewy, H.: "Über die partiellen Differenzengleichungen der mathematischen Physik," Math. Ann., 100, 32-74 (1928).

Cuthill, E., McKee, J.: "Reducing the bandwidth of sparse symmetric matrices," Proc. 24th ACM National Conference, 157-172 (1969).

Dieudonné, J.: Foundations of Modern Analysis, New York-London: Academic Press 1960.

Dorr, F. W.: "The direct solution of the discrete Poisson equation on a rectangle," SIAM Rev., 12, No. 2, 248-263 (1970).

Fox, L. (Editor): Numerical Solution of Ordinary and Partial Differential Equations, London: Pergamon Press 1962.

Friedman, A.: Partial Differential Equations, New York: Holt, Rinehart and Winston 1969.
Friedrichs, K. O.: "Symmetric hyperbolic linear differential equations," Comm. Pure Appl. Math., 7, 345-392 (1954).

Gerschgorin, S.: "Fehlerabschätzungen für das Differenzenverfahren zur Lösung partieller Differentialgleichungen," ZAMM, 10, 373-382 (1930).

Gibbs, N. E., Poole, W. G., Stockmeyer, P. K.: "An algorithm for reducing the bandwidth and profile of a sparse matrix," SIAM J. Numer. Anal., 13, 236-250 (1976).

Gilbarg, D., Trudinger, N. S.: Elliptic Partial Differential Equations of Second Order, Berlin-Heidelberg-New York: Springer 1977.

Gorenflo, R.: "Über S. Gerschgorins Methode der Fehlerabschätzung bei Differenzenverfahren." In: Numerische, insbesondere approximationstheoretische Behandlung von Funktionalgleichungen, Lecture Notes in Mathematics, Vol. 333, 128-143, Berlin-Heidelberg-New York: Springer 1973.

Grigorieff, R. D.: Numerik gewöhnlicher Differentialgleichungen, Studienbücher, Bd. 1, Stuttgart: Teubner 1972.

Hackbusch, W.: Die Verwendung der Extrapolationsmethode zur numerischen Lösung hyperbolischer Differentialgleichungen, Universität zu Köln: Dissertation 1973.

Hackbusch, W.: "Extrapolation to the limit for numerical solutions of hyperbolic equations," Numer. Math., 28, 455-474 (1977).

Hellwig, G.: Partial Differential Equations, Stuttgart: Teubner 1977.

Householder, A. S.: The Theory of Matrices in Numerical Analysis, New York: Blaisdell 1964.

Janenko, N. N.: The Method of Fractional Steps, Berlin-Heidelberg-New York: Springer 1971.

Jaswon, M. A.: "Integral equation methods in potential theory I," Proc. R. Soc., Vol. A 275, 23-32 (1963).

Kellogg, O. D.: Foundations of Potential Theory, Berlin: Springer 1929.

Knuth, D. E.: The Art of Computer Programming, Reading, Massachusetts: Addison-Wesley 1973.

Kreiss, H. O.: "On difference approximations of the dissipative type for hyperbolic differential equations," Comm. Pure Appl. Math., 17, 335-353 (1964).

Lax, P. D.: "On the stability of difference approximations to solutions of hyperbolic equations with variable coefficients," Comm. Pure Appl. Math., 14, 497-520 (1961).

Lax, P. D., Wendroff, B.: "On the stability of difference schemes," Comm. Pure Appl. Math., 15, 363-371 (1962).

Lax, P. D., Nirenberg, L.: "On stability for difference schemes; a sharp form of Gårding's inequality," Comm. Pure Appl. Math., 19, 473-492 (1966).

Lehmann, H.: Fehlerabschätzungen in verschiedenen Normen bei eindimensionaler und zweidimensionaler Hermite-Interpolation, Universität zu Köln: Diplomarbeit 1975.

Lierz, W.: Lösung von großen Gleichungssystemen mit symmetrischer schwach besetzter Matrix, Universität zu Köln: Diplomarbeit 1975.

Loomis, L. H., Sternberg, S.: Advanced Calculus, Reading, Massachusetts: Addison-Wesley 1968.

Magnus, W., Oberhettinger, F., Soni, R. P.: Formulas and Theorems for the Special Functions of Mathematical Physics, New York: Springer 1966.

Meis, Th.: "Zur Diskretisierung nichtlinearer elliptischer Differentialgleichungen," Computing, 7, 344-352 (1971).

Meis, Th., Törnig, W.: "Diskretisierungen des Dirichletproblems nichtlinearer elliptischer Differentialgleichungen." In: Methoden und Verfahren der Mathematischen Physik, Bd. 8, Herausgeber: B. Brosowski und E. Martensen, Mannheim-Wien-Zürich: Bibliographisches Institut 1973.

Meuer, H. W.: Zur numerischen Behandlung von Systemen hyperbolischer Anfangswertprobleme in beliebig vielen Ortsveränderlichen mit Hilfe von Differenzenverfahren, Technische Hochschule Aachen: Dissertation 1972.

Mizohata, S.: The Theory of Partial Differential Equations, Cambridge: University Press 1973.

Natanson, I. P.: Theorie der Funktionen einer reellen Veränderlichen, Berlin: Akademie-Verlag 1961.

Ortega, J. M., Rheinboldt, W. C.: Iterative Solution of Nonlinear Equations in Several Variables, New York-London: Academic Press 1970.

Ostrowski, A. M.: "On the linear iteration procedures for symmetric matrices," Rend. Math. e Appl., 14, 140-163 (1954).

Peaceman, D. W., Rachford, H. H.: "The numerical solution of parabolic and elliptic differential equations," J. Soc. Indust. Appl. Math., 3, 28-41 (1955).

Perron, O.: "Über Existenz und Nichtexistenz von Integralen partieller Differentialgleichungssysteme im reellen Gebiet," Mathem. Zeitschrift, 27, 549-564 (1928).

Petrovsky, I. G.: Lectures on Partial Differential Equations, New York-London: Interscience Publishers 1954.

Reid, J. K.: "Solution of linear systems of equations: direct methods (general)." In: Sparse Matrix Techniques, Lecture Notes in Mathematics, Vol. 572, Editor: V. A. Barker, Berlin-Heidelberg-New York: Springer 1977.

Richtmyer, R. D., Morton, K. W.: Difference Methods for Initial Value Problems, New York-London-Sydney: Interscience Publishers 1967.

Rosen, R.: Matrix Bandwidth Minimization, Proc. 23rd ACM National Conf., 585-595 (1968).

Sauer, R.: Anfangswertprobleme bei partiellen Differentialgleichungen, Berlin-Göttingen-Heidelberg: Springer 1958.

Schechter, S.: "Iteration methods for nonlinear problems," Trans. Amer. Math. Soc., 104, 179-189 (1962).

Schröder, J., Trottenberg, U.: "Reduktionsverfahren für Differenzengleichungen bei Randwertaufgaben I," Numer. Math., 22, 37-68 (1973).

Schröder, J., Trottenberg, U., Reutersberg, H.: "Reduktionsverfahren für Differenzengleichungen bei Randwertaufgaben II," Numer. Math., 26, 429-459 (1976).

Shell, D. L.: "A highspeed sorting procedure," Comm. ACM, 2, No. 7, 30-32 (1959).

Simonson, W.: "On numerical differentiation of functions of several variables," Skand. Aktuarietidskr., 42, 73-89 (1959).

Stancu, D. D.: "The remainder of certain linear approximation formulas in two variables," SIAM J. Numer. Anal., Ser. B, 1, 137-163 (1964).

Stoer, J., Bulirsch, R.: Introduction to Numerical Analysis, New York-Heidelberg-Berlin: Springer 1980.

Swartz, B. K., Varga, R. S.: "Error bounds for spline and L-spline interpolation," J. of Appr. Th., 6, 6-49 (1972).

Thomée, V.: "Spline approximation and difference schemes for the heat equation." In: The Mathematical Foundations of the Finite Element Method with Applications to Partial Differential Equations, Edited by K. Aziz, New York-London: Academic Press 1972.

Törnig, W., Ziegler, M.: "Bemerkungen zur Konvergenz von Differenzenapproximationen für quasilineare hyperbolische Anfangswertprobleme in zwei unabhängigen Veränderlichen," ZAMM, 46, 201-210 (1966).

Varga, R. S.: Matrix Iterative Analysis, Englewood Cliffs: Prentice-Hall 1962.

Walter, W.: Differential and Integral Inequalities, Berlin-Heidelberg-New York: Springer 1970.

Whiteman, J. R. (Editor): The Mathematics of Finite Elements and Applications, London-New York: Academic Press 1973.

Whiteman, J. R. (Editor): The Mathematics of Finite Elements and Applications II, London-New York-San Francisco: Academic Press 1976.

Widlund, O. B.: "On the stability of parabolic difference schemes," Math. Comp., 19, 1-13 (1965).

Witsch, K.: "Numerische Quadratur bei Projektionsverfahren," Numer. Math., 30, 185-206 (1978).

Yosida, K.: Functional Analysis, Berlin-Heidelberg-New York: Springer 1978.

Young, D. M.: Iterative Methods for Solving Partial Differential Equations of Elliptic Type, Harvard University: Thesis 1950.

Zlamal, M.: "On the finite element method," Numer. Math., 12, 394-409 (1968).
Textbooks

Ames, W. F.: Numerical Methods for Partial Differential Equations, New York-San Francisco: Academic Press 1977.

Ansorge, R.: Differenzenapproximationen partieller Anfangswertaufgaben, Stuttgart: Teubner 1978.

Ciarlet, Ph. G.: The Finite Element Method for Elliptic Problems, Amsterdam: North-Holland 1978.

Collatz, L.: The Numerical Treatment of Differential Equations, Berlin-Heidelberg-New York: Springer 1966.

Forsythe, G. E., Rosenbloom, P. C.: Numerical Analysis and Partial Differential Equations, New York: John Wiley 1958.

Forsythe, G. E., Wasow, W. R.: Finite Difference Methods for Partial Differential Equations, New York-London: John Wiley 1960.

John, F.: Lectures on Advanced Numerical Analysis, London: Gordon and Breach 1967.

Marchuk, G. I.: Methods of Numerical Mathematics (Applications of Mathematics, Vol. 2), New York-Heidelberg-Berlin: Springer 1975.

Marsal, D.: Die numerische Lösung partieller Differentialgleichungen in Wissenschaft und Technik, Mannheim-Wien-Zürich: Bibliographisches Institut 1976.

Mitchell, A. R.: Computational Methods in Partial Differential Equations, London-New York-Sydney-Toronto: John Wiley 1969.

Ortega, J. M., Rheinboldt, W. C.: Iterative Solution of Nonlinear Equations in Several Variables, New York-London: Academic Press 1970.

Ralston, A., Wilf, H. S.: Mathematical Methods for Digital Computers, New York: John Wiley 1960.

Richtmyer, R. D., Morton, K. W.: Difference Methods for Initial Value Problems, New York-London-Sydney: Interscience Publishers 1967.

Schwarz, H. R.: Methode der finiten Elemente, Stuttgart: Teubner 1980.

Smith, G. D.: Numerical Solution of Partial Differential Equations, London: Oxford University Press 1969.

Stoer, J., Bulirsch, R.: Introduction to Numerical Analysis, New York-Heidelberg-Berlin: Springer 1980.

Strang, G., Fix, G. J.: An Analysis of the Finite Element Method, Englewood Cliffs: Prentice-Hall 1973.

Varga, R. S.: Matrix Iterative Analysis, Englewood Cliffs: Prentice-Hall 1962.

Wachspress, E. L.: Iterative Solution of Elliptic Systems, Englewood Cliffs: Prentice-Hall 1966.

Young, D. M.: Iterative Solution of Large Linear Systems, New York-London: Academic Press 1971.

Zienkiewicz, O. C.: The Finite Element Method in Engineering Science, London: McGraw-Hill 1971.
INDEX

Alternating direction implicit (ADI) method, 176, 177
Amplification matrix, 128
Antitonic, 243
Asymptotic expansion, 194
Attractive region, 337

Banach fixed point theorem, 53
Banach spaces, 40ff.
Band matrices, 175, 404, 503
Band width reduction, 402ff.
Biharmonic equation, 220
Boundary collocation, 318
Boundary distant points, 249, 258
Boundary integral methods, 328ff.
Boundary value problems, 207ff.
Bulirsch sequence, 196
Buneman algorithm, 417ff., 522ff.

Calculus of variations, 207, 208 (see also variational methods)
Cauchy-Riemann equations, 18, 216
Cauchy sequence, 41
Characteristic direction, 23, 27
Characteristic methods, 31ff., 88, 89
Characteristics, 5, 7, 19ff., 24, 27
Classical solution, 3, 232
Collocation methods, 317ff.
Collocation points, 318
Consistency, 61, 70, 233
  order of, 65
Contraction mapping, 337
Contraction theorem, 53
Convergence, 62, 70
  order of, 65
Convex function, 386
COR algorithm, 420
CORF algorithm, 421
Courant-Friedrichs-Lewy condition, 89
Courant-Isaacson-Rees method, 85, 111, 118, 143
Crank-Nicolson method, 75, 137, 138, 174
Cyclic odd/even reduction (COR) algorithm, 420
  with factorization (CORF), 421

Definite, 21
Derivative in a Banach space, 50
Diagonal dominant, 243
Difference methods
  in Banach spaces, 61
  for boundary value problems, 229ff.
  Fourier transforms of, 119ff.
  with positivity properties, 97ff.
  stability of, 55ff.
Difference operator, 70
Difference star, 278, 428, 434
  truncated, 441
Direct methods, 402, 403
Domain of dependence, 8, 87, 88
Domain of determinancy, 8, 87, 88

Elimination methods, 402
Elliptic equations
  definition, 22, 26, 29
  methods for solving, 207ff.
Euler equation, 208
Explicit method, 75
Extrapolation methods, 192ff.

Finite element methods, 275
Fixed point, 53, 336
FORTRAN, 444
Fourier integral, 122
Fourier series, 121
Fourier transforms, 13
  of difference methods, 119ff.
  m-dimensional, 170
Friedrichs method, 82, 142, 180, 197, 203
  m-dimensional, 182
Friedrichs theorem, 116

Galerkin method, 286
Gauss-Seidel method, 359, 363
Gaussian elimination, 402, 503
Generalized solution, 56, 90
Gibbs-Poole-Stockmeyer algorithm, 407, 503ff.
Global extrapolation, 195, 449, 476
Green's function, 296

Heat equation, 12, 14, 107, 137, 141, 228
  nonlinear, 16, 459ff.
Helmholtz equation, 209, 218
Hermite interpolation, 290ff.
  piecewise, 302
  two-variable, 304
Hyperbolic equations
  definition, 22, 26, 29
  methods for solving, 31ff.

Implicit method, 75
Initial boundary value problems, 10
Initial value problems, 1ff.
  in Banach spaces, 55
  inhomogeneous, 89ff.
  in several space variables, 168ff.
Integral in a Banach space, 52
Interior collocation, 319
Irreducible diagonal dominant, 243
Isotonic, 243
Iterative methods for systems of equations, 334ff.

Jacobi method, 359

Kreiss theorem, 67, 119

Laplacian, 209
Lattice, 230
Lattice function, 69
Lax-Nirenberg theorem, 157
Lax-Richtmyer theory, 40ff.
Lax-Richtmyer theorem, 62
Lax-Wendroff method, 144, 204
Lax-Wendroff-Richtmyer method, 185, 469ff.
Level structure, 410
Linear operator, 47
Lipschitz condition, 1
Local extrapolation, 202, 203
Local stability, 154
Locally convergent iteration, 337

Massau's method, 36, 447ff.
Maximum-minimum principle, 210
Mesh size, 230
M-matrix, 242
Monotone type, equations of, 244
Multi-index, 168
Multiplace method, 268
Multistep methods, 341

Negative definite, 21
Neville scheme, 203
Newton-Kantorovich theorem, 344
Newton's method, 335, 342
Norm, 40
  of linear operator, 47
  Sobolev, 271
Numerical viscosity, 87, 266

Optimally stable, 89
Order of consistency, 65
Order of convergence, 65
Ostrowski theorem, 366
Overrelaxation methods
  for linear systems, 363ff.
  for nonlinear systems, 383ff.

Parabolic equations
  definition, 22, 29
  in the sense of Petrovski, 17, 124, 152
Peano kernel, 295
Poincaré theorem, 386
Poisson equation, 209, 217, 223, 226, 280, 379, 484, 522ff.
Poisson integral formula, 213
Positive definite, 21, 387
  difference method, 97, 115
Positive difference methods, 97ff.
Potential equation, 18, 212
Product method, 179
Properly posed
  boundary value problems, 207ff.
  initial value problems, 1ff., 55

Quadratic form, 21
Quasilinear, 20, 26, 30

Relaxation parameter, 364
Ritz method, 272, 310ff.
Romberg sequence, 196

Schröder-Trottenberg method, 426ff.
Schrödinger equation, 17
Semilinear, 20, 26
Semiorder, 241
Single step method, 341, 358
Sobolev norm, 271
SOR, see successive overrelaxation
SOR-Newton method, 384
Solution operator, 56, 57
Sparse matrix, 275, 402
Stability of difference methods, 55ff., 103, 116, 172, 233
  definition, 62, 70
Standard lattice, 231
Strongly finite difference method, 69
Successive overrelaxation (SOR) method
  for linear systems, 363ff.
  for nonlinear systems, 383ff.

Total step method, 358
Totally implicit method, 75, 80, 459ff.
Translation operator, 74
Triangulation, 275
Tridiagonal matrices, 404, 503
Type of a partial differential equation, 19ff.

Uniform boundedness, 50
Uniformly elliptic, 22
Uniformly hyperbolic, 22

Variational methods, 270ff.
Von Neumann condition, 131

Wave equation, 11, 111, 147, 228, 474
  generalized, 183, 190
Weakly cyclic, 370
Weakly stable, 135
Weierstrass approximation theorem, 44
Weierstrass P-functions, 325
Wirtinger calculus, 215

Young's theorem, 373