G. Geymonat (Ed.)
Constructive Aspects of Functional Analysis Lectures given at the Centro Internazionale Matematico Estivo (C.I.M.E.), held in Erice (Trapani), Italy, June 27-July 7, 1971
C.I.M.E. Foundation c/o Dipartimento di Matematica “U. Dini” Viale Morgagni n. 67/a 50134 Firenze Italy
ISBN 978-3-642-10982-9 e-ISBN: 978-3-642-10984-3 DOI:10.1007/978-3-642-10984-3 Springer Heidelberg Dordrecht London New York
© Springer-Verlag Berlin Heidelberg 2011. Reprint of the 1st ed. C.I.M.E., Ed. Cremonese, Roma, 1971. With kind permission of C.I.M.E.
Printed on acid-free paper
Springer.com
CENTRO INTERNAZIONALE MATEMATICO ESTIVO
(C.I.M.E.)
A. V. BALAKRISHNAN
A CONSTRUCTIVE APPROACH TO OPTIMAL CONTROL
Course given at Erice, June 27 to July 7, 1971
A CONSTRUCTIVE APPROACH TO OPTIMAL CONTROL
0. Introduction. Except for linear problems, it is difficult, if not impossible, to obtain explicit solutions for optimal control problems. The closest we get to a general 'solution' is the Maximum Principle of Pontrjagin. But important as this result is, it only provides us with necessary conditions for a (any) postulated solution. Unfortunately, many control problems do not have an optimal solution.
Consider for instance this trivial example:

\[ \dot{x} = u ; \qquad x(0) = 0 \]

Minimize:

subject to the constraint that the control u(t) must be equal to +1 or -1 a.e. The minimal value is zero, but this is attained for $u(t) \equiv 0$, and of course this is impossible. On the other hand

\[ u_n(t) = \frac{\sin n\pi t}{|\sin n\pi t|} \]

provides us with a sequence of admissible controls which approximate the infimum arbitrarily closely. The sequence $u_n(t)$ of course converges in the weak sense in $L_2[0,1]$ to zero, but unfortunately $u_n(t)^2$ converges to one, and of course there is no optimal control.
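The two limits just described can be checked numerically. The following is a minimal Python sketch and not part of the original lectures: the helper names `u_n`, `integrate` and `pairing`, the test function f(t) = t, and the midpoint quadrature are all assumptions made for illustration. It pairs $u_n$ and $u_n^2$ with a continuous test function on [0, 1].

```python
import math

def u_n(n, t):
    """Chattering control sin(n*pi*t) / |sin(n*pi*t)|, taking values +1 or -1."""
    return 1.0 if math.sin(n * math.pi * t) >= 0 else -1.0

def integrate(h, num_cells=20000):
    """Midpoint rule on [0, 1] (an illustrative quadrature choice)."""
    dt = 1.0 / num_cells
    return sum(h((i + 0.5) * dt) for i in range(num_cells)) * dt

def pairing(n, f, power=1):
    """Integral over [0, 1] of f(t) * u_n(t)**power."""
    return integrate(lambda t: f(t) * u_n(n, t) ** power)

if __name__ == "__main__":
    test_function = lambda t: t
    for n in (10, 50, 200):
        # The first pairing shrinks with n; the second stays at the integral of f.
        print(n, pairing(n, test_function), pairing(n, test_function, power=2))
```

The pairing with $u_n$ shrinks as n grows, while the pairing with $u_n^2$ stays equal to the integral of the test function. This is the numerical face of the statement that the weak limit of $u_n^2$ is one and not the square of the weak limit of $u_n$.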
In his recent book [1], L. C. Young has pointed out the fallacy in proving necessary conditions for a possibly non-existent solution. He cites a paradox of Perron that this leads to: consider the problem of finding the largest positive integer. If we assume there exists a solution, say N, then clearly $N \ge 1$; on the other hand, we must have that

\[ N^2 \le N \]

which combined with

\[ N \ge 1 \]

shows that N = 1! To resolve this difficulty, Young introduces the notion of a 'relaxed' or 'generalized control' and proves the existence of an optimal control in this class, and derives the maximum principle valid for such 'functions'.
In the present work we shall go one step further and show how to actually construct ('compute') a sequence of approximating controls which converge to an optimal 'generalized control' and which then satisfies the maximum principle. The computational technique is of more than theoretical value, and in fact has proved to be practically useful as well.

Relaxed controls play an essential role in this approach. We begin with a simple exposition of the theory of relaxed controls (Young [1]), because it is of some independent interest as well.
1. Relaxed Controls

Let U be a compact set in Euclidean space $E_m$. Let H denote the $L_2$-space of functions u(t), $0 < t < T < \infty$. Let $u_n(t)$ be any sequence of measurable functions such that

\[ u_n(t) \in U \quad \text{a.e.} \]

Then we can find a subsequence (renumber it $u_n(\cdot)$) such that $u_n(\cdot)$ converges weakly to $u_0(\cdot)$, say, in H. Let $p(\cdot)$ be any polynomial over $E_m$. Then

\[ p(u_n(t)) \]

also contains a weakly convergent subsequence. What is the limit? Unfortunately it is not

\[ p(u_0(t)) \]

as the example

\[ u_n(t) = \frac{\sin nt}{|\sin nt|} \]

shows, taking $p(u) = u^2$. At the simplest level the 'generalized curves' [we shall continue to use the term generalized 'controls' because we shall need this notion only with the controls] may be regarded as providing a means to straighten out this situation.
Consider now the product space $\Omega = I \times U$ where I denotes the interval [0, T]. Then $\Omega$ is compact metric, and let $C(\Omega)$ denote the Banach space of continuous functions over $\Omega$ with range in $E_m$. Let f(t,u) denote such a function. Then observe that for any Lebesgue measurable function u(t) such that

\[ u(t) \in U \quad \text{a.e.} \]

we have that

\[ \int_I f(t, u(t))\, dt \]

defines a continuous linear functional on $C(\Omega)$. We know that there must be a countably additive set function $\mu$ (of finite variation) defined on the Lebesgue subsets of $\Omega$ such that

\[ \int_I f(t, u(t))\, dt = \int_\Omega f(t;u)\, d\mu(t;u) \]
and it is clear that $\mu$ is an atomic measure with a unit jump at u(t) for each t. That is to say, on any product set of the form $A \times B$

\[ \mu(A \times B) = \text{Lebesgue measure of the set } [\, t \in A \mid u(t) \in B \,] \]
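The product-set formula can be made concrete with a few lines of code. This is a hedged Python sketch under assumptions that are not in the original: A and B are taken to be intervals, the control is scalar, and the measure is approximated by counting midpoint cells on [0, 1]. The names `product_measure` and `chatter` are hypothetical.

```python
import math

def product_measure(u, a_lo, a_hi, b_lo, b_hi, num_cells=10000, horizon=1.0):
    """Approximate mu(A x B) = Lebesgue measure of {t in A : u(t) in B}
    for A = [a_lo, a_hi] and B = [b_lo, b_hi], by counting midpoint cells."""
    dt = horizon / num_cells
    total = 0.0
    for i in range(num_cells):
        t = (i + 0.5) * dt
        if a_lo <= t <= a_hi and b_lo <= u(t) <= b_hi:
            total += dt
    return total

def chatter(t, n=4):
    """Illustrative control taking the values +1 and -1."""
    return 1.0 if math.sin(n * math.pi * t) >= 0 else -1.0

if __name__ == "__main__":
    # Time spent at u = +1 within A = [0, 0.5]
    print(product_measure(chatter, 0.0, 0.5, 0.5, 1.5))
```

For the chattering control with n = 4 the measure of $[0, 0.5] \times [0.5, 1.5]$ is the time spent at +1 inside [0, 0.5], namely one quarter.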
For any polynomial $p(\cdot)$, we note that

\[ \int_U p(u)\, d\mu(t;u) \]

is Lebesgue measurable in t. A generalized control is simply a measure on (the Lebesgue subsets of) $I \times U$ such that

and

\[ \int_U p(u)\, d\mu(t;u) \]

is Lebesgue measurable in t. Alternately, for our purposes it is more natural to define it as a 'family' of probability measures ('control measures') $d\mu(t;u)$ over U such that

\[ \int_U p(u)\, d\mu(t;u) \]

is Lebesgue measurable in t. Thus defined it is not difficult to show that

\[ \int_\Omega f(t;u)\, d\mu(t;u) \]

defines a continuous linear functional on $C(\Omega)$. Moreover

\[ \int_U f(t;u)\, d\mu(t;u) \]

is Lebesgue measurable in t.
Let now $u_n(t)$ be the sequence we began with, $u_n(t)$ converging weakly in H to $u_0(t)$. Let f(t) be any m x m matrix function, continuous on I, and $p(\cdot)$ be any polynomial with domain and range in $E_m$. Then we can write

\[ \int_I f(t)\, p(u_n(t))\, dt = \int_\Omega f(t)\, p(u)\, d\mu_n(t;u) \]

where $d\mu_n(t;u)$ is the corresponding sequence of measures. Now by the weak compactness of measures we know that (independent of $f(\cdot)$ and $p(\cdot)$) we can find a subsequence (renumber it $d\mu_n(\cdot)$ again) which converges to a measure $d\mu_0(t;u)$:

Working with a further subsequence, we know that

where the limit is Lebesgue measurable, and since f(t) is arbitrary, it follows that

Thus if we agree to define

\[ p(\bar{u}_0(t)) = \int_U p(u)\, d\mu_0(t;u) \]

where the bar indicates use of 'generalized control', then we do indeed have that if

\[ u_n(t) \to \bar{u}_0(t) \]

then

\[ p(u_n(t)) \to p[\bar{u}_0(t)] \]
Example

Let us illustrate this with a simple example for m = 1. Let

\[ u_n(t) = \frac{\sin n\pi t}{|\sin n\pi t|}, \qquad 0 < t < 1 \]

What is the limiting generalized function? Note that $d\mu_n(t;u)$ for each t has a jump at +1 or -1. Hence

\[ \int_\Omega f(t;u)\, d\mu_n = \int_0^1 a_n(t)\, f(t;1)\, dt + \int_0^1 (1 - a_n(t))\, f(t;-1)\, dt \]

where $0 \le a_n(t) \le 1$. Hence

\[ \int_\Omega f(t;u)\, d\mu_n \to \int_0^1 a(t)\, f(t;1)\, dt + \int_0^1 (1 - a(t))\, f(t;-1)\, dt \]

Hence the limiting measure $\mu$ is such that $d\mu(t;u)$ has a jump at +1 of a(t) and a jump at -1 of (1 - a(t)).
Now

\[ \int_0^1 \int_U u\, d\mu_n(t;u) \to \int_0^1 \int_U u\, d\mu(t;u) \]

Hence

Also

Hence we must have:

\[ \int_0^1 a(t)\, dt - \int_0^1 (1 - a(t))\, dt = 0 \]

Hence

\[ a(t) = \tfrac{1}{2} \]

Thus the limiting measure is a "chattering" between the values 1 and -1 with equal probability. Note that

which is correct. Generalization is fairly transparent at this stage. For example, for the extension to the immediate case
\[ u_n(t) = \text{one of } m \text{ values, } u_1, \ldots, u_m, \]

and $u_n(\cdot)$ converges weakly to zero, we have:

\[ \int_U p(u)\, d\mu_n(t;u) \to \int_U p(u)\, d\mu(t;u) = \sum_{k=1}^{m} a_k(t)\, p(u_k) \]

To determine the functions $a_k(t)$, we may note that

\[ \sum_{k=1}^{m} a_k(t)\, u_k^2 = \text{limit } u_n(t)^2 \]

\[ \vdots \]

\[ \sum_{k=1}^{m} a_k(t)\, u_k^{m-1} = \text{limit } u_n(t)^{m-1} \]

giving us m equations to determine the m unknowns.
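The m equations form a linear system of Vandermonde type in the weights $a_k(t)$ at each fixed t. The sketch below is an illustration only, under two assumptions not spelled out above: the two remaining equations are taken to be the normalization $\sum a_k = 1$ and the first moment $\sum a_k u_k = 0$ (the weak limit being zero), and the limiting moments are supplied by the caller. The function name `solve_weights` is hypothetical.

```python
from fractions import Fraction

def solve_weights(values, moments):
    """Solve sum_k a_k * values[k]**j = moments[j], j = 0..m-1, for the weights a_k.

    values  : the m distinct control values u_1, ..., u_m
    moments : limits of the j-th powers; moments[0] = 1 (normalization) and
              moments[1] = 0 (weak limit zero) are assumptions of this sketch.
    """
    m = len(values)
    rows = [[Fraction(v) ** j for v in values] + [Fraction(moments[j])]
            for j in range(m)]
    # Gauss-Jordan elimination in exact rational arithmetic
    for col in range(m):
        pivot = next(r for r in range(col, m) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [entry / rows[col][col] for entry in rows[col]]
        for r in range(m):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[k][m] for k in range(m)]

if __name__ == "__main__":
    print(solve_weights([-1, 1], [1, 0]))        # the two-valued chattering case
    print(solve_weights([-1, 0, 2], [1, 0, 1]))  # three values, limit of u^2 equal to 1
```

For the two values -1 and +1 the system returns equal weights, in agreement with the chattering example of this section.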
The length of the time interval, so long as it is finite, obviously plays no role.

The weak limit of "ordinary controls" thus leads to a generalized control. Conversely, we have the following important result due to Young: Any generalized control can be approximated in the weak star topology of linear functionals on $C(\Omega)$ by ordinary controls. [Ordinary controls are weak-star dense in the class of generalized controls.] [Of course the weak-star limits of generalized controls are quite obviously generalized controls.]
2. The Basic Technique

Let us illustrate our technique with reference to a simple control problem:

Minimize:

\[ \int_0^T g(t;x(t);u(t))\, dt \]

where

\[ \dot{x}(t) = f(t;x(t);u(t)); \qquad x(0) = x_0 \]

and the control u(t) is constrained to be in a restricted class of functions (called 'admissible' controls). We replace this problem by the non-dynamic epsilon problem:

Minimize:

\[ \frac{1}{2\epsilon} \int_0^T \| \dot{x}(t) - f(t;x(t);u(t)) \|^2\, dt + \int_0^T g(t;x(t);u(t))\, dt \]

over the class of state functions x(t), absolutely continuous with $x(0) = x_0$, and the class of admissible controls. We present a constructive technique for solving this problem which as $\epsilon$ goes to zero approximates the original problem as closely as desired. The construction exploits the maximum principle indirectly; in fact the Hamiltonian arises in a natural way in the process. See [2] for the bibliography and related work.
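To fix ideas, the epsilon functional can be evaluated on a time grid. The following Python sketch is an illustration under stated assumptions, none of which come from the text: scalar state and control, a uniform grid, a forward difference for the derivative, and left-endpoint sums for the integrals. The name `epsilon_cost` is hypothetical.

```python
def epsilon_cost(xs, us, f, g, eps, dt):
    """Discretized epsilon functional:
    (1 / (2 * eps)) * integral of (x' - f)^2  +  integral of g.

    xs : state values on the grid t_i = i * dt (one more entry than us)
    us : control values on each grid cell
    """
    penalty = 0.0   # integral of the squared defect in the dynamics
    running = 0.0   # integral of the running cost g
    for i in range(len(xs) - 1):
        t = i * dt
        x_dot = (xs[i + 1] - xs[i]) / dt          # forward difference
        defect = x_dot - f(t, xs[i], us[i])
        penalty += defect * defect * dt
        running += g(t, xs[i], us[i]) * dt
    return penalty / (2.0 * eps) + running

if __name__ == "__main__":
    n = 1000
    dt = 1.0 / n
    f = lambda t, x, u: u          # dynamics x' = u (illustrative)
    g = lambda t, x, u: x * x      # running cost x^2 (illustrative)
    on_trajectory = [i * dt for i in range(n + 1)]   # x(t) = t solves x' = 1
    print(epsilon_cost(on_trajectory, [1.0] * n, f, g, 1e-3, dt))
    print(epsilon_cost([0.0] * (n + 1), [1.0] * n, f, g, 1e-3, dt))
```

A pair (x, u) that satisfies the differential equation pays no penalty, so the functional reduces to the original cost. A pair that violates it is charged in proportion to $1/\epsilon$.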
A Basic Estimate

We begin with the immediate question: how well does the epsilon problem approximate the original control problem? This question is of course of primary importance for computation, and it is interesting that we can answer it without the need for any of the usual assumptions of control theory, even including the conditions that assure unique solution to the differential equation. We can also consider as general a class of control problems as necessary. However, in order not to confuse the main ideas with too much generality, we shall confine ourselves to the following class of problems (the extension to more general problems involving other types of phase plane constraints being readily made):

Minimize

\[ \int_0^T g(t;x(t);u(t))\, dt \qquad (2.1) \]

subject to:

where x(t) is absolutely continuous and satisfying additional conditions at the end points t = 0, and t = T. The end-point T is finite but of course not necessarily fixed. The control u(t) is Lebesgue measurable and subject to additional constraints, if any. We shall refer to such controls as "admissible" controls. It should be noted that not every admissible control necessarily yields a trajectory x(t) satisfying all the conditions (2.2), (2.3) and the end conditions. However it would be natural to assume that there do exist admissible controls that lead to such trajectories. (Even this condition can be dispensed with for our purposes in this section.) Nor shall we need to impose any smoothness conditions on the functions $f(\cdot)$, $g(\cdot)$ and $\ell(\cdot)$. We shall only assume that they are Lebesgue measurable and such that the integral in (2.1) is well-defined for each (finite) T.
The epsilon problem is now formulated as follows: Let

\[ h(\epsilon; x(\cdot); u(\cdot); T) = \frac{1}{2\epsilon} \int_0^T \left( \| \dot{x}(t) - f(t;x(t);u(t)) \|^2 + \| \ell(t;x(t);u(t)) \|^2 \right) dt + \int_0^T g(t;x(t);u(t))\, dt \]

Minimize $h(\epsilon; x(\cdot); u(\cdot); T)$ over the class of (absolutely continuous) trajectories x(t) subject to the given end conditions (any other "phase plane" constraints can clearly be added), and admissible controls u(t). We add the condition [F]:

where m is a fixed positive constant independent of epsilon. This condition is not necessary if for example:

\[ \operatorname{Inf}\, g(t;x;u) > -\infty \]

(as in time-optimal problems, see section 4). The condition (F) is certainly a natural one in that we are, after all, trying to approximate the case m = 0. The need for such a condition may be seen by considering the simple example:
Minimize:

\[ -\int \left[ u(t)^2 + x(t)^4 \right] dt \]
Here the epsilon problem without the finiteness condition will have minus infinity for the infimum while the control problem has zero for the infimum. Unless otherwise stated, this condition will be part of the epsilon problem in what follows.
Again in order not to complicate the exposition too much, we shall assume that the infimum of the epsilon problem is attained by a finite final time $T_\epsilon$, in the sense that

\[ h(\epsilon) = \operatorname{Inf}\, h(\epsilon; x(\cdot); u(\cdot); T) = \lim_n h(\epsilon; x_n(\cdot); u_n(\cdot); T_\epsilon) \]

where $T_\epsilon$ is finite, and $x_n(\cdot)$, $u_n(\cdot)$ is a "minimizing" sequence for the epsilon problem. For such a minimizing sequence, let

\[ d(\epsilon) = \liminf\ \frac{1}{2} \int_0^{T_\epsilon} \left( \| \dot{x}_n(t) - f(t;x_n(t);u_n(t)) \|^2 + \| \ell(t;x_n(t);u_n(t)) \|^2 \right) dt \]

\[ G(\epsilon) = \limsup \int_0^{T_\epsilon} g(t;x_n(t);u_n(t))\, dt \]

Then of course

\[ h(\epsilon) = \frac{d(\epsilon)}{\epsilon} + G(\epsilon) \]

Let us now define

\[ \delta(\epsilon) = \sup\, d(\epsilon), \qquad g(\epsilon) = \operatorname{Inf}\, G(\epsilon) \]

where the infimum (and supremum) is taken over the class of all minimizing sequences. While $\delta(\epsilon)$ is finite because of condition (F), $g(\epsilon)$ may well be minus infinity in general. Under the usual conditions on the dynamics, we shall see however that $g(\epsilon)$ will be finite. We have of course:

\[ h(\epsilon) = \frac{\delta(\epsilon)}{\epsilon} + g(\epsilon) \]
It is natural now to define g(0) to be the infimum for the control problem, assuming it is definable. Then

With these definitions we can state the following theorem concerning the approximation:
Theorem 2.1 Suppose $g(\epsilon)$ is finite for some $\epsilon_0$. Then $g(\epsilon)$ is finite for every $\epsilon$ less than $\epsilon_0$, and moreover as $\epsilon \to 0$, $\delta(\epsilon)$ is monotone non-increasing and $g(\epsilon)$ is monotone non-decreasing. Further:

\[ \lim_{\epsilon \to 0} \frac{\delta(\epsilon)}{\epsilon} = 0 \]

if g(0) is definable (not equal to plus infinity).
Proof Let $\epsilon$ be less than $\epsilon_0$. We have then, as an elementary analysis on sums of limits shows:

and similarly:

Since every quantity is finite on the left we can freely transpose to obtain:

\[ \frac{\delta(\epsilon) - \delta(\epsilon_0)}{\epsilon} \le g(\epsilon_0) - g(\epsilon) \le \frac{\delta(\epsilon) - \delta(\epsilon_0)}{\epsilon_0} \]

and since $\epsilon$ is less than $\epsilon_0$, these relations are consistent only if

\[ \delta(\epsilon) \le \delta(\epsilon_0) \]

Hence $g(\epsilon)$ is finite. Moreover since the argument can now be repeated with

the required monotonicity follows. Let g(0+) denote the limit of $g(\epsilon)$ as $\epsilon$ goes to zero. From (2.7), since g(0) is not plus infinity, $\delta(\epsilon)$ must converge to zero. Again with $\epsilon < \epsilon_1 < \epsilon_0$, we have

and letting $\epsilon$ go to zero in this we obtain that for $\epsilon < \epsilon_0$,

and in particular then $\delta(\epsilon)/\epsilon$ goes to zero.
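The behaviour asserted by the theorem can be seen in closed form on a static analogue. The sketch below is not taken from the lectures. It assumes a hypothetical one-variable problem, "minimize z subject to z = 1", whose penalized form $(z - 1)^2 / (2\epsilon) + z$ has the minimizer $z = 1 - \epsilon$ (set the derivative $(z - 1)/\epsilon + 1$ to zero). The names `delta`, `g_of_eps` and `h` mirror the symbols of the text only by analogy.

```python
def penalized_minimizer(eps):
    """Minimizer of (z - 1)**2 / (2 * eps) + z over all real z."""
    return 1.0 - eps

def delta(eps):
    """Half the squared constraint defect at the minimizer (analogue of delta(eps))."""
    z = penalized_minimizer(eps)
    return 0.5 * (z - 1.0) ** 2

def g_of_eps(eps):
    """Cost term at the minimizer (analogue of g(eps)); the constrained infimum is 1."""
    return penalized_minimizer(eps)

def h(eps):
    """Penalized infimum, split as delta(eps) / eps + g(eps)."""
    return delta(eps) / eps + g_of_eps(eps)

if __name__ == "__main__":
    for eps in (0.4, 0.2, 0.1, 0.05):
        print(eps, delta(eps), g_of_eps(eps), delta(eps) / eps, h(eps))
```

As $\epsilon$ decreases, `delta` decreases, `g_of_eps` increases toward the constrained value 1, and `delta(eps) / eps` equals $\epsilon/2$ and so tends to zero. This is the same pattern the theorem establishes for the control problem.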
Remark 1 It should be noted that the infimum of the epsilon problem has been sought in the class of admissible controls. This is natural since, freed of having to satisfy the differential equation constraint, any admissible control can be used. On the other hand this means that in general the optimal control will be a relaxed control. In particular g(0+) may well be less than g(0), the latter being usually sought in the class of ordinary controls, as we assume herein also. An example is given in [3] where g(0+) = -1, while g(0) = 0. However we shall see that the infimum for the control problem allowing relaxed controls will be g(0+), at least under the usual conditions. But the main point is that in the epsilon problem relaxed controls appear of necessity.

Remark 2 As shown in [3], (2.9) and (2.10) actually hold for $d(\epsilon)$ and $G(\epsilon)$ (even though the latter may depend on the particular minimizing sequence chosen!)
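Corollary

Assume $g(0) < +\infty$ and that $g(\epsilon_0)$ is finite for some $\epsilon_0 > 0$. Then $h(\epsilon)$ is monotone non-decreasing and, omitting at most a countable number of points in $0 < \epsilon < \epsilon_0$, we have

and

Proof

For $\epsilon < \epsilon_0$, both $g(\cdot)$ and $\delta(\cdot)$ are monotone, and hence continuous except for a countable number of points and differentiable a.e. Now

\[ h(\epsilon + \Delta) - h(\epsilon) \le \left( \frac{\delta(\epsilon)}{\epsilon + \Delta} + g(\epsilon) \right) - \left( \frac{\delta(\epsilon)}{\epsilon} + g(\epsilon) \right) = \delta(\epsilon) \left( \frac{1}{\epsilon + \Delta} - \frac{1}{\epsilon} \right) \]

(showing monotonicity) while

\[ h(\epsilon + \Delta) - h(\epsilon) \ge \left( \frac{\delta(\epsilon + \Delta)}{\epsilon + \Delta} + g(\epsilon + \Delta) \right) - \left( \frac{\delta(\epsilon + \Delta)}{\epsilon} + g(\epsilon + \Delta) \right) \]

or, (2.11) follows. But omitting a set of measure zero:

from which (2.12) follows.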
3. Fixed End-Point Problems

In order to introduce the basic ideas in the epsilon technique, it is convenient to begin with what is perhaps the simplest class of control problems: fixed end-point problems with fixed initial condition, and bounded controls.

Problem I:

Minimize

\[ \int_0^T g(t;x(t);u(t))\, dt + \varphi(x(T)) \]

where T is fixed and finite and

\[ \dot{x}(t) = f(t;x(t);u(t)) \ \text{a.e.}; \qquad x(0) = x_1 \ \text{fixed} \]

\[ u(t) \ \text{Lebesgue measurable}; \qquad u(t) \in U \ \text{a.e.}, \quad U \ \text{being compact} \]

It will be assumed in addition that f(t;x;u), g(t;x;u), $\varphi(x)$ are $C^1$ in x, continuous in all variables, and further that condition G holds: †

\[ (G): \qquad [x, f(t;x;u)] \le c\, (1 + \|x\|^2) \quad \text{for } u \text{ in } U, \ 0 \le t \le T \qquad (3.4) \]

We note immediately that the infimum, denoted g(0), is finite. The epsilon problem is formulated as follows: Let

\[ h(\epsilon; x(\cdot); u(\cdot)) = \frac{1}{2\epsilon} \int_0^T \| \dot{x}(t) - f(t;x(t);u(t)) \|^2\, dt + \int_0^T g(t;x(t);u(t))\, dt + \varphi(x(T)) \]

† This condition as well as [3.3] can be relaxed as in [15] for example; we forego this generalization in the interest of simplicity of exposition, especially since it is not an intrinsic limitation on the approach.
Minimize $h(\epsilon; x(\cdot); u(\cdot))$ over the class of controls u(t) Lebesgue measurable, $u(t) \in U$, and also over the class of absolutely continuous state functions x(t) with $x(0) = x_1$. (It is clear that additional phase plane constraints can be added here if necessary.) In addition the condition F' is imposed:

The condition F' is a slight weakening of condition F, which is possible because of the smoothness properties of the functions assumed. Thus let $x_n(\cdot)$, $u_n(\cdot)$ be a minimizing sequence for the epsilon problem.
Condition F' implies that $x_n(t)$ is uniformly bounded in $0 \le t \le T$ and hence both $\delta(\epsilon)$ and $g(\epsilon)$ (which now includes the $\varphi(\cdot)$ term) are finite. Again, it is readily seen that condition F implies F'. For let

\[ z_n = \dot{x}_n - f(t;x_n(t);u_n(t)) \]

Then, using (3.4):

\[ |[\dot{x}_n, x_n]| \le |[x_n, f(t;x_n(t);u_n(t))]| + |[x_n, z_n]| \le c\,(1 + \|x_n\|^2) + \|x_n\|\, \|z_n\| \le c\,(1 + \|x_n\|^2) + \tfrac{1}{2} \|x_n\|^2 + \tfrac{1}{2} \|z_n\|^2 \]

and by the usual analysis (Gronwall lemma) this implies that $x_n(t)$ is uniformly bounded. (If the initial condition $x_n(0) = x_1$ is generalized to $\varphi_0(x(0)) = 0$, we must then require that the set

is bounded, for this result to hold as well as for g(0) to be finite.)
To solve the epsilon problem we take the following elementary route. Let an admissible state function x(t) (that is, absolutely continuous and satisfying $x(0) = x_1$ and (3.6)) be chosen. To minimize (3.5), we simply minimize the integrand. Let

\[ m(\epsilon;t;y;x) = \min_{u \in U} \left( \frac{1}{2\epsilon} \| y - f(t;x;u) \|^2 + g(t;x;u) \right) \qquad (3.9) \]

The minimum is clearly attained since U is compact and the functional is continuous. It is readily seen further that $m(\epsilon;t;y;x)$ is continuous in all the variables.
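The pointwise minimization over U is elementary to carry out numerically when the control is scalar. The following Python sketch is an illustration, not the author's procedure: it assumes a scalar state and control, takes U to be an interval, and replaces the exact minimum by a grid search. The name `pointwise_min` and the grid size are hypothetical.

```python
def pointwise_min(eps, t, y, x, f, g, u_lo, u_hi, num_points=2001):
    """Grid-search approximation of
    m(eps; t, y, x) = min over u in [u_lo, u_hi] of
                      (y - f(t, x, u))**2 / (2 * eps) + g(t, x, u).
    Returns the pair (minimum value, minimizing u on the grid)."""
    best_value, best_u = None, None
    for k in range(num_points):
        u = u_lo + (u_hi - u_lo) * k / (num_points - 1)
        value = (y - f(t, x, u)) ** 2 / (2.0 * eps) + g(t, x, u)
        if best_value is None or value < best_value:
            best_value, best_u = value, u
    return best_value, best_u

if __name__ == "__main__":
    f = lambda t, x, u: u        # illustrative dynamics
    g = lambda t, x, u: u * u    # illustrative running cost
    print(pointwise_min(0.5, 0.0, 0.5, 0.0, f, g, -1.0, 1.0))
```

With f = u, g = u², y = 0.5 and $\epsilon = 0.5$ the integrand is $(0.5 - u)^2 + u^2$, whose minimum over [-1, 1] is 0.125, attained at u = 0.25.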
Now

so that

\[ h(\epsilon) \ge \operatorname{Inf} \int_0^T m(\epsilon;t;\dot{x}(t);x(t))\, dt + \varphi(x(T)) \qquad (3.10) \]

where the infimum is taken over the class of admissible state functions x(t). To reverse the inequality in (3.10) we have only to note that we can find an admissible control u(t) such that (a.e.):

This is obvious if the minimum of (3.9) is attained at a unique point in U. Otherwise we invoke the "half-way principle of McShane and Warfield", as in Young [1].
Let $x_n(\cdot)$ be a minimizing sequence for

\[ h(\epsilon) = \operatorname{Inf} \int_0^T m(\epsilon;t;\dot{x}(t);x(t))\, dt + \varphi(x(T)) \]

Let $u_n(\cdot)$ be a corresponding admissible control sequence. Now it is readily seen that $x_n(\cdot)$ is equicontinuous. Hence we may, by renumbering if necessary, assume that $x_n(t)$ converges uniformly to $x_0(t)$ say. Further we can see that $x_0(t)$ is absolutely continuous and we may assume that the sequence (again by renumbering as necessary) $\dot{x}_n(t)$ converges weakly to $\dot{x}_0(t)$. Also $x_0(t)$ is an admissible state function. But the sequence of controls converges, in general, only in the sense of relaxed controls. [Indeed to establish the existence of a relaxed optimal control for the epsilon problem, as in the original control problem, takes "no more than a routine exercise in using the Ascoli theorem and the diagonal process" (McShane [4]), only more so in the present case!]
The main point is, however, that the optimal control for the epsilon problem must be sought in the class of relaxed controls. Because the ordinary admissible controls are weak-star dense in the class of relaxed controls, the infimum of the epsilon problem over relaxed controls is the same as that over ordinary controls. Moreover for Problem I, the infimum over relaxed controls is also the same as that over ordinary controls. We can now obtain a constructive approach to the maximum principle by allowing for relaxed controls.

Approach to the Maximum Principle
Let X denote the element in appropriate product space:

Let $\mu$ denote a probability (regular) measure on the Lebesgue subsets of U. As $\mu$ ranges over the class of all such "control" measures, the points

\[ X = \left( \int_U f(t;x;u)\, d\mu(u),\ \int_U g(t;x;u)\, d\mu(u) \right) \]

describe the closed convex hull of the set

Let us agree to use the notation:

\[ \tilde{f}(t;x;u) = \int_U f(t;x;u)\, d\mu(u), \qquad \tilde{g}(t;x;u) = \int_U g(t;x;u)\, d\mu(u) \]

if a "control measure" (the terminology borrowed from Young [1]) is intended. It is convenient to denote the closed convex hull by C(t;x). Let

\[ r(\epsilon;t;y;x;u) = \frac{1}{2\epsilon} \| y - f(t;x;u) \|^2 + g(t;x;u) \]

Then

\[ m(\epsilon;t;y;x) = \operatorname{Inf}_{u \in U}\, r(\epsilon;t;y;x;u) \]

Let

\[ \tilde{m}(\epsilon;t;y;x) = \operatorname{Inf}_{\mu}\, \tilde{r}(\epsilon;t;y;x;u) = \operatorname{Inf}_{X \in C(t;x)}\, r(\epsilon;t;y;X) \]

where now the infimum is taken over the class of all control measures $\mu$. We note that $\tilde{m}(\epsilon;t;y;x)$ is continuous in t, y and x.
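The gap between the ordinary and the relaxed pointwise infimum is easy to exhibit on a toy case. The sketch below is an illustration under assumptions not found in the text: f(t;x;u) = u, g = 0, and U consisting of the two points -1 and +1, so that a control measure is described by a single weight a on u = +1 and the convex hull of the dynamics is the interval [-1, 1]. The function names are hypothetical.

```python
def ordinary_min(eps, y):
    """m for f = u, g = 0 and U = {-1, +1}: only the two extreme values are allowed."""
    return min((y - u) ** 2 / (2.0 * eps) for u in (-1.0, 1.0))

def relaxed_min(eps, y, num_weights=2001):
    """m-tilde for the same data: weight a on u = +1 and 1 - a on u = -1,
    so the averaged dynamics a * 1 + (1 - a) * (-1) sweep the hull [-1, 1]."""
    best = None
    for k in range(num_weights):
        a = k / (num_weights - 1)
        f_bar = a * 1.0 + (1.0 - a) * (-1.0)
        value = (y - f_bar) ** 2 / (2.0 * eps)
        if best is None or value < best:
            best = value
    return best

if __name__ == "__main__":
    for y in (0.0, 0.5, 2.0):
        print(y, ordinary_min(0.1, y), relaxed_min(0.1, y))
```

For |y| < 1 the relaxed infimum is zero while the ordinary one is strictly positive. For |y| ≥ 1 the two coincide, since the best averaged value is then an extreme point of the hull.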
Let $z_n(t)$ be a minimizing sequence of admissible state functions for

where the infimum is taken over the class of all admissible state functions. If for each t the optimal control measure that attains the minimum of $\tilde{r}(\epsilon;t;\dot{z}(t);z(t);u)$ is unique, or by invoking the McShane-Warfield half-way principle otherwise, we note that there is a relaxed control such that

Because the ordinary controls are weak-star dense in the class of relaxed controls, it readily follows that

\[ h(\epsilon) = \operatorname{Inf} \int_0^T \tilde{m}(\epsilon;t;\dot{x}(t);x(t))\, dt + \varphi(x(T)) \]

Now it is readily seen that $z_n(t)$ can be taken to be uniformly convergent to $z_0(t)$ say, $z_0(t)$ absolutely continuous, and
Next to obtain the maximum principle, we begin by noting that

\[ r(\epsilon;t;y;X) \]

is a differentiable convex functional on the compact convex set C(t;x). Hence the infimum is attained, and denoting such a point by $X_0$ we know that for any point X in C(t;x) we must have [omitting a set of measure zero in t]:

writing

the left side of (3.11) is readily calculated to be:

Or,

\[ [\psi, \tilde{f}(t;x;u_0)] - \tilde{g}(t;x;u_0) \ge [\psi, \tilde{f}(t;x;u)] - \tilde{g}(t;x;u) \]

where

\[ \psi = \frac{y - \tilde{f}(t;x;u_0)}{\epsilon} \]

We note that (3.13) is also a sufficient condition for $X_0$ to be the infimum, and is already recognizable as the Maximum Principle.
and since we may take $\dot{z}_n(t)$ converging weakly (in the $L_2(0,T)$ sense) to $\dot{z}_0(t)$, we have also:

Hence

\[ h(\epsilon) = \int_0^T \tilde{m}(\epsilon;t;\dot{z}_0(t);z_0(t))\, dt + \varphi(z_0(T)) \]

Letting

\[ \Psi(\epsilon;t) = \frac{\dot{z}_0(t) - \tilde{f}(t;z_0(t);\tilde{u}_0(t))}{\epsilon} \]

we see that the optimal control $\tilde{u}_0(\cdot)$ (which we repeat is now, in general, relaxed) is characterized by the Maximum Principle:
where the maximum is now taken over the class of control measures. Finally a routine first variation analysis shows that $\Psi(\epsilon;t)$ must satisfy [also note that for $\epsilon$ sufficiently small, strict inequality will hold in (F')]:

where

\[ f_1(t;x;u) = \nabla_x f(t;x;u), \qquad g_1(t;x;u) = \nabla_x g(t;x;u), \qquad \varphi_1(x) = \nabla_x \varphi(x) \]

$\nabla_x$ denoting gradient with respect to x. We finally note that by Caratheodory's Theorem [5], the control measure $d\mu_0(u;t)$ can be characterized by a finite number of atomic parts for all t.
This result was also noted by Gamkrelidze.

It is quite straightforward now to get the Maximum Principle by letting epsilon go to zero. Let us use $x(\epsilon;t)$, $u(\epsilon;t)$ to denote $z_0(t)$, $\tilde{u}_0(t)$, for the solution to the epsilon problem (the optimal control $u(\epsilon;t)$ being relaxed). Then it is readily seen from the estimate (2.8) that $\delta(\epsilon)$ goes to zero so that $x(\epsilon;t)$ is equicontinuous and we can consider a convergent subsequence converging to x(0;t) say, uniformly in t. Since $x(\epsilon;T)$ is bounded, so also then is $\Psi(\epsilon;T)$, and hence also $\Psi(\epsilon;t)$, the functions $f_1(\cdot)$, $g_1(\cdot)$ being continuous. Moreover, there exists an optimal (relaxed) control, and we may take weak limits in (3.18) and (3.19) to obtain the maximum principle for the control problem, in the form given by McShane [4]. Also, note that

\[ g(0+) = g(0) \]

Remark 1 Note that we have incidentally shown that

since
Remark 2

\[ h(\epsilon) = \lim_n \int_0^T m(\epsilon;t;\dot{x}_n(t);x_n(t))\, dt + \varphi(x_n(T)) \]

and hence with $x_n(t)$ converging uniformly to $x_0(t)$, and $\dot{x}_n(t)$ converging weakly to $\dot{x}_0(t)$, we have:

\[ h(\epsilon) \le \int_0^T m(\epsilon;t;\dot{x}_0(t);x_0(t))\, dt + \varphi(x_0(T)) \]

where the strict inequality may hold.
Remark 3 It is convenient to use the notation:

Let us fix t and let:

\[ m(\epsilon;t;y;x) = r(\epsilon;t;y;x;u_0) \]

and let us assume now that the minimal point $u_0$ is unique. Then letting

\[ m(\epsilon;t;y + \theta h_y;x + \theta h_x) = r(\epsilon;t;y + \theta h_y;x + \theta h_x;u_\theta) \]

we note the usual inequalities:

Since U is compact, every sequence $u_\theta$ contains a convergent subsequence and the limit must be a minimal point and hence equal to $u_0$. Hence:

and the left side is zero as soon as
Let $u(\epsilon;t)$ denote an optimal solution of the epsilon problem, and suppose we see how well we can do in the original control problem if we took $u(\epsilon;t)$ for the control. Let $x_0(\epsilon;t)$ denote the solution of

(note that $u(\epsilon;t)$ may well be a relaxed control but because of condition (3.4), the differential equation still has a unique solution, see Young [11]). Let

\[ y(t) = x_0(\epsilon;t) - x(\epsilon;t) \]

Then

\[ y(0) = 0; \qquad \dot{y}(t) = f(t;x_0(\epsilon;t);u(\epsilon;t)) - f(t;x(\epsilon;t);u(\epsilon;t)) - \epsilon\, \Psi(\epsilon;t) \]

from which it follows that

or y(t) goes to zero uniformly in t with epsilon. Also

\[ \left| \left( \int_0^T g(t;x_0(\epsilon;t);u(\epsilon;t))\, dt + \varphi(x_0(\epsilon;T)) \right) - \left( \int_0^T g(t;x(\epsilon;t);u(\epsilon;t))\, dt + \varphi(x(\epsilon;T)) \right) \right| \le k\, \sup |y(t)| \to 0 \]
Problem II

Minimize

\[ \int_0^T g(t;x(t);u(t))\, dt + \varphi(x(T)) \]

where T is fixed and finite, subject to (3.2), (3.3), (3.4) and in addition:

where $\ell(t;x;u)$ is continuous in all the variables, and continuously differentiable in x. Let g(0) denote the infimum, assumed to be less than plus infinity; in other words there is at least one solution to (3.2) satisfying (3.3) and (3.21). In extending the class of controls to include relaxed controls we require that the latter satisfy:

A relaxed control satisfying (3.22) is not necessarily approximated by ordinary controls satisfying (3.21). Thus the infimum over relaxed controls may well be strictly less than g(0), as the following simple example shows:

minimize $x_1(1)$ subject to:

It is apparent that g(0) is zero, while the infimum over relaxed controls is (-1).
The epsilon problem is formulated as that of minimizing (2.4) over the class of admissible state functions (that is, absolutely continuous, satisfying F', and $x(0) = x_1$) and over the class of controls $u(\cdot)$ satisfying (3.2) and (3.3). We note that the estimate (2.8) applies. In particular we shall see that g(0+) is the infimum for Problem II over the class of relaxed controls. Let us again use the notation:

\[ m(\epsilon;t;y;x) = \min_{u \in U} \left( \frac{1}{2\epsilon} \left( \| y - f(t;x;u) \|^2 + \| \ell(t;x;u) \|^2 \right) + g(t;x;u) \right) \]

Then clearly

where the infimum is taken over the class of admissible state functions and $h(\epsilon)$ denotes the infimum for the epsilon problem. That actually equality holds is readily seen again by invoking the McShane-Warfield halfway principle. Let $x_n(t)$ be a minimizing sequence of admissible state functions, and let us denote the limit (of an appropriate subsequence) by $x_0(\epsilon;t)$.
Approach to the Maximum Principle

Let $\mu$ denote a regular probability measure (or control measure) on the Lebesgue sets of U. Let C(t;x) denote the set of points X in the appropriate dimension product space:

where as before:

as $\mu$ varies over the class of all such measures. Then C(t;x) is the closed convex extension of the set

\[ \{ f(t;x;u),\ \ell(t;x;u),\ g(t;x;u) ;\ u \in U \} \]

Let us use the notation:

and let

\[ \tilde{m}(\epsilon;t;y;x) = \operatorname{Inf}_{X \in C(t;x)}\, r(\epsilon;t;y;X) \]

Then as before, it is readily seen that:

Again letting:

\[ \tilde{m}(\epsilon;t;y;x) = r(\epsilon;t;y;X_0) \]

we have for any X in C(t;x) that [omitting a set of measure zero in t]

leading to the Maximum Principle:

where

\[ \tilde{\psi} = \frac{y - \tilde{f}(t;x;u_0)}{\epsilon} ; \qquad \tilde{\ell}_0 = \frac{\tilde{\ell}(t;x;u_0)}{\epsilon} \]

Corresponding to $X_0(\epsilon;t)$ there is a relaxed optimal control $d\mu_0(\epsilon;u;t)$ [$\sim u_0(\epsilon;t)$] and by a routine variational analysis we have, letting

that $\Psi(\epsilon;t)$ must satisfy:

where

and other notation is the same as before for Problem I. We have the epsilon maximum principle:
\[ = \operatorname{Max}\ \left\{ [\Psi(\epsilon;t), \tilde{f}(t;x_0(\epsilon;t);u)] + [\ell_0(\epsilon;t), \tilde{\ell}(t;x_0(\epsilon;t);u)] \right\} \]
Let us now consider the situation as epsilon goes to zero. The main difference from the earlier treatment for Problem I is that we must now show that $\ell_0(\epsilon;t)$ converges as epsilon goes to zero. First of all we may take $x_0(\epsilon;t)$ [for suitable subsequence] converging uniformly in t to $x_0(t)$ say, and the corresponding controls to converge (in the weak-star topology) to $d\mu_0(u;t)$ say. Then we note (since $\delta(\epsilon)$ converges to zero) that

and that

the latter following from the fact that for each continuous function f(t), we must have:

where we have denoted by $d\mu_0(\epsilon;u;t)$ [$\sim u_0(\epsilon;t)$] the control corresponding to $x_0(\epsilon;t)$. Also

and g(0+) is also then the infimum for the control problem in the class of relaxed controls satisfying (3.22).
Suppose now that

\[ \int_0^T \| \ell_0(\epsilon_n;t) \|^2\, dt \le M < \infty \]

for some sequence $\epsilon_n$ going to zero. Let us denote the weak limit of a subsequence by $\ell_0(t)$. Then since $\Psi(\epsilon_n;t)$ satisfies the linear equation (3.23) and $\Psi(\epsilon_n;T)$ is necessarily bounded, it follows that $\Psi(\epsilon_n;t)$ is bounded. Hence

and hence $\Psi(\epsilon_n;t)$ is equicontinuous, and we may renumber, as necessary, to have $\Psi(\epsilon_n;t)$ converge uniformly in t to $\Psi(t)$ say. † Clearly the first term on the left of (3.24):

The next term in (3.24):

converges to zero a.e., since the integral over [0, T] goes to zero. On the right side of (3.24), both $\tilde{f}(t;x_0(\epsilon;t);u)$ and $\tilde{\ell}(t;x_0(\epsilon;t);u)$ converge uniformly in t, so that we may take weak limits in (3.24) to obtain the maximum principle:

† Here and elsewhere it is understood that we will be dealing with subsequences as required.
Suppose now that

\[ \int_0^T \| \ell_0(\epsilon_n;t) \|^2\, dt \]

is unbounded for every sequence $\epsilon_n$ going to zero. Let us pick a subsequence that makes $x_0(\epsilon_n;t)$ converge uniformly in t to $x_0(t)$, and the controls $u_0(\epsilon;t)$, in weak-star topology, to $u_0(t)$ as before so that (3.25), (3.26) hold. Let

Let

Dividing through by $k_n$ in (3.23) let us note that $\Psi_n(t)$ is bounded. Clearly

\[ \sup_n \int_0^T \| \dot{\Psi}_n(t) \|\, dt < \infty \]

so that we have equicontinuity and may take (suitable subsequence renumbered) $\Psi_n(t)$ to converge uniformly in t to $\Psi(t)$ say. Similarly let us take $\ell_n(t)$ to converge weakly to $\ell(t)$ say. Then by taking (weak) limits in (3.23) we have:

And correspondingly, taking limits in (3.24) we obtain [a.e. in $0 < t < T$]:

\[ [\Psi(t), \tilde{f}(t;x_0(t);u_0(t))] = \operatorname{Max}\ \left\{ [\Psi(t), \tilde{f}(t;x_0(t);u)] + [\ell(t), \tilde{\ell}(t;x_0(t);u)] \right\} \]

We note that (3.27) and (3.28) can be combined by introducing a multiplier $\lambda$ in front of $g(t;x_0(t);u_0(t))$ and saying $\lambda \ge 0$. There is now the possibility that $\Psi(t)$ is zero. To avoid this one can introduce additional conditions involving derivatives with respect to u.
Computational Aspects

We shall now study some of the questions that arise in examining computational aspects more deeply, and at the same time indicate a particular scheme for solving the epsilon problem. In doing so we shall need to make some additional assumptions which are natural in the practical context.

We can appreciate in a general way that as epsilon is made smaller and smaller we will run into computational accuracy problems while too large an epsilon will have no relation to the control problem we wish to

of it for the problem.
Let $C[0,T]$ denote the Banach space of continuous state functions under the sup norm. Let $\{b_n(t)\}$ denote a sequence of basis functions in $C[0,T]$. For each $n$, let $A_n$ denote the set of functions spanned by $b_k(\cdot)$, $k = 1, \ldots, n$, such that
For each $n$, we now consider the epsilon problem over the class of state functions (denoted $S_n$) of the form:
where the $\{a_k\}$ must satisfy (3.29), corresponding to condition (F'), and over controls $u(t)$ as before. (The controls are not approximated by basis functions.) Let us denote the corresponding infimum by $h_n(\epsilon)$. Clearly any admissible state function can be approximated uniformly in $t$ by functions in $S_n$ as closely as desired for large enough $n$, and of course $S_n$ is also conditionally compact.
Let
where the quantities are defined the same way as in section 2, the subscript $n$ denoting restriction to $S_n$. It is evident that $\delta_n(\epsilon)$ and $g_n(\epsilon)$ are again monotone in the same fashion as before. Let $\delta_n(0)$ and $g_n(0+)$ denote the limits as $\epsilon$ goes to zero. Since $h_n(\epsilon)$ (unlike $h(\epsilon)$) has no given upper bound, $\delta_n(0)$ need not be zero.
In fact we have:
so that
Thus $h_n(\epsilon)$ eventually increases without bound [$O(1/\epsilon)$] as we make epsilon smaller. Of course
and $h(\epsilon) = \lim_n h_n(\epsilon)$, for each $\epsilon > 0$.
Let us now indicate a method for obtaining $h(\epsilon)$. We begin with any element of $S_n$, say with all $\{a_k\}$ set to be zero, in the absence of any prior ideas concerning the optimal state function. Call this $x_1(t)$.
We now make the following assumption (U1):
is attained at a unique point in $U$, for each $t$, $y$, $x$ and $\epsilon$. Let $u_1(t)$ denote the minimal point in $U$ for $y = \dot{x}_1(t)$, $x = x_1(t)$, so that in our previous notation:
(The assumption (U1) can clearly be weakened to hold in a suitable neighbourhood.)
We now choose $x_2(t)$ so that
(which is attained in general by an element in the closure of $S_n$) is attained by $x_2(t)$. Note that the function $m(\epsilon; t; \ldots)$ is not required to be known.
We next determine $u_2(t)$ so that (3.33) holds with $x_1(t)$ replaced by $x_2(t)$, and continue on this way to produce the sequence
Now
and we have thus a monotone decreasing sequence. From any subsequence we can choose a further subsequence such that $x_n(t)$, $\dot{x}_n(t)$ converge in $C[0,T]$ to $x_0(t)$, $\dot{x}_0(t)$ say, and now because of condition (U1), the corresponding $u_n(t)$ must converge to $u_0(t)$ where
Moreover we must have:
The last equality together with (3.35) means that $x_0(t)$, $u_0(t)$ cannot be further improved by our procedure.
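The alternation described above (fix the state and minimize over the control, then fix the control and minimize over the state) can be sketched on a toy problem. This is only an illustration under stated assumptions: the scalar cost `cost`, the control set $[-1, 1]$, and the helper names `best_u`, `best_x`, `alternate` are hypothetical and do not come from the text; scalars stand in for the functions $x_n(t)$, $u_n(t)$.

```python
def cost(x, u, eps):
    # Hypothetical epsilon-type cost: penalty term (x - u)^2 / (2 eps) plus a smooth cost.
    return (x - u) ** 2 / (2.0 * eps) + (x - 2.0) ** 2 + 0.5 * u * u


def best_u(x, eps):
    # Unique minimizer over u in U = [-1, 1] for fixed x (the cost is convex in u,
    # so clipping the unconstrained minimizer x / (1 + eps) is exact).
    return max(-1.0, min(1.0, x / (1.0 + eps)))


def best_x(u, eps):
    # Unique minimizer over x for fixed u, from (x - u) / eps + 2 (x - 2) = 0.
    return (u + 4.0 * eps) / (1.0 + 2.0 * eps)


def alternate(x0, eps, sweeps):
    # Produce the sequence (x_n, u_n) and record the cost after each sweep.
    x = x0
    u = best_u(x, eps)
    history = [cost(x, u, eps)]
    for _ in range(sweeps):
        x = best_x(u, eps)
        u = best_u(x, eps)
        history.append(cost(x, u, eps))
    return x, u, history
```

Each half-step can only lower the cost, so the recorded values form a monotone non-increasing sequence, mirroring the argument in the text; the limit pair cannot be improved by either half-step.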
Next let us note that $x_0(t)$, $u_0(t)$ is a local minimum for the epsilon problem in the sense that
for any admissible control $u(t)$, while the first variation of
vanishes at $x(t) = x_0(t)$. For this purpose we assume that for any $h(\cdot)$ in $A_n$,
belongs to $S_n$ for all sufficiently small $|\theta|$.
Now because of (U2), we see from (3.20) that we only need to show that
is zero. But this follows from the fact that this is true for $x_n(t)$, $u_n(t)$ and we can take limits with respect to $n$. Clearly in any computational scheme we can only obtain a local minimum. In practice we assume that (at least for large enough order of approximation) there is only one local minimum, or, at least our search is confined to a region where there is only one minimum and it is the true minimum. Note that under condition (U1), we have
\[ h_n'(\epsilon) = -\delta_n(\epsilon)/\epsilon^2 \quad \text{for every } \epsilon > 0. \]
To conclude the computational scheme, we need only now to describe how the infimum in (3.33a) is determined. Here again we only seek a local minimum. For this purpose, let us note that the functional
is now a function only of the coefficients $\{a_k\}$, and let us denote this function by $h(\epsilon; a; u(\cdot))$, $a = \{a_k\}$.
Then we use the iteration:
\[ a_{m+1} = a_m - H_m^{-1} G_m \]
where $H_m$ is an $n \times n$ matrix with components:
This is then a slight variation of the Newton-Raphson technique (in that no second derivatives of the integrand are used). The convergence of the scheme is proved by a minor modification of the usual proofs, as given in [14] for example. Of course other techniques can be used.
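An iteration of the form $a_{m+1} = a_m - H_m^{-1} G_m$, with $H_m$ built only from products of first derivatives, resembles what is usually called the Gauss-Newton method for least squares. The sketch below is a minimal illustration under stated assumptions, not the author's scheme: $G_m$ is taken to be the gradient $J^T r$ of half the squared residual (the surviving text does not define it), and the exponential-fit test problem and the names `solve`, `gauss_newton`, `exp_residual`, `exp_jacobian` are hypothetical.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def gauss_newton(residual, jacobian, a, steps):
    """Iterate a <- a - H^{-1} G with H = J^T J and G = J^T r (no second derivatives)."""
    n = len(a)
    for _ in range(steps):
        r = residual(a)
        J = jacobian(a)
        G = [sum(J[k][i] * r[k] for k in range(len(r))) for i in range(n)]
        H = [[sum(J[k][i] * J[k][j] for k in range(len(r))) for j in range(n)]
             for i in range(n)]
        d = solve(H, G)
        a = [a[i] - d[i] for i in range(n)]
    return a


# Hypothetical test problem: recover (2, -1) from samples of 2 * exp(-t).
TS = [0.0, 0.25, 0.5, 0.75, 1.0]
YS = [2.0 * math.exp(-t) for t in TS]


def exp_residual(a):
    return [a[0] * math.exp(a[1] * t) - y for t, y in zip(TS, YS)]


def exp_jacobian(a):
    return [[math.exp(a[1] * t), a[0] * t * math.exp(a[1] * t)] for t in TS]
```

Calling `gauss_newton(exp_residual, exp_jacobian, [1.5, -0.5], 20)` starts from a nearby guess; as with the scheme in the text, only a local minimum can be expected in general.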
Finally, let $\delta > 0$ be given. Then, in theory, we can find an $N$ and $\epsilon$ such that
For this we only need to first find $\epsilon$ such that
Then since
we have that
\[ g(0+) - h(\epsilon) \le \delta/2 \]
Next we choose $N$ large enough so that
Computational Aspects
We shall now study some of the questions that arise in examining computational aspects more deeply, and at the same time indicate a particular scheme for solving the epsilon problem. In doing so we shall need to make some additional assumptions which are natural in the practical context.
We can appreciate in a general way that as epsilon is made smaller and smaller we will run into computational accuracy problems, while too large an epsilon will have no relation to the control problem we wish to solve.
This is best seen by examining a Ritz approximation, or a version of it for the problem.
Let $C[0,T]$ denote the Banach space of continuous state functions under the sup norm. Let $\{b_n(t)\}$ denote a sequence of basis functions in $C[0,T]$. For each $n$, let $A_n$ denote the set of functions spanned by $b_k(\cdot)$, $k = 1, \ldots, n$, such that
For each $n$, we now consider the epsilon problem over the class of state functions (denoted $S_n$) of the form:
where the $\{a_k\}$ must satisfy (3.1), corresponding to condition (F'), and over controls $u(t)$ as before. (The controls are not approximated by basis functions.) Let us denote the corresponding infimum by $h_n(\epsilon)$. Clearly any admissible state function can be approximated uniformly in $t$ by functions in $S_n$ as closely as desired for large enough $n$, and of course $S_n$ is also conditionally compact.
Let
where the quantities are defined the same way as in section 2, the subscript $n$ denoting restriction to $S_n$. It is evident that $\delta_n(\epsilon)$ and $g_n(\epsilon)$ are again monotone in the same fashion as before. Let $\delta_n(0)$ and $g_n(0+)$ denote the limits as $\epsilon$ goes to zero. Since $h_n(\epsilon)$ (unlike $h(\epsilon)$) has no given upper bound, $\delta_n(0)$ need not be zero. In fact we have:
so that
\[ \lim_{\epsilon \to 0} \left( h_n(\epsilon) - \delta_n(0)/\epsilon \right) = g_n(0+) \]
Thus $h_n(\epsilon)$ eventually increases without bound [$O(1/\epsilon)$] as we make epsilon smaller. Of course
and $h(\epsilon) = \lim_n h_n(\epsilon)$, for each $\epsilon > 0$.
Let us now indicate a method for obtaining $h(\epsilon)$. We begin with any element of $S_n$, say with all $\{a_k\}$ set to be zero, in the absence of any prior ideas concerning the optimal state function. Call this $x_1(t)$.
We now make the following assumption (U1):
$\min_{u \in U}$
is attained at a unique point in $U$, for each $t$, $y$, $x$ and $\epsilon$. Let $u_1(t)$ denote the minimal point in $U$ for $y = \dot{x}_1(t)$, $x = x_1(t)$, so that in our previous notation:
(The assumption (U1) can clearly be weakened to hold in a suitable neighbourhood.) We now choose $x_2(t)$ so that
(which is attained in general by an element in the closure of $S_n$) is attained by $x_2(t)$. Note that the function $m(\epsilon; t; \ldots)$ is not required to be known. We next determine $u_2(t)$ so that (3.5) holds with $x_1(t)$ replaced by $x_2(t)$, and continue on this way to produce the sequence
\[ x_n(t),\ u_n(t) \]
Now
and we have thus a monotone decreasing sequence. From any subsequence we can choose a further subsequence such that $x_n(t)$, $\dot{x}_n(t)$ converge in $C[0,T]$ to $x_0(t)$, $\dot{x}_0(t)$ say, and now because of condition (U1), the corresponding $u_n(t)$ must converge to $u_0(t)$ where
Moreover we must have:
The last equality together with (3.8) means that $x_0(t)$, $u_0(t)$ cannot be further improved by our procedure.
Next let us note that $x_0(t)$, $u_0(t)$ is a local minimum for the epsilon problem in the sense that
for any admissible control $u(t)$, while the first variation of
vanishes at $x(t) = x_0(t)$. For this purpose we assume that for any $h(\cdot)$ in $A_n$,
belongs to $S_n$ for all sufficiently small $|\theta|$.
Now because of (U1), we see that we only need to show that
is zero. But this follows from the fact that this is true for $x_n(t)$, $u_n(t)$ and we can take limits with respect to $n$. Clearly in any computational scheme we can only obtain a local minimum. In practice we assume that (at least for large enough order of approximation) there is only one local minimum, or, at least our search is confined to a region where there is only one minimum and it is the true minimum. Note that under condition (U1), we have
\[ h_n'(\epsilon) = -\delta_n(\epsilon)/\epsilon^2 \quad \text{for every } \epsilon > 0. \]
To conclude the computational scheme, we need only now to describe how the infimum in (3.6) is determined. Here again we only seek a local minimum. For this purpose, let us note that the functional
is now a function only of the coefficients $\{a_k\}$, and let us denote this function by
Then we use the iteration:
\[ a_{m+1} = a_m - H_m^{-1} G_m \]
where $H_m$ is an $n \times n$ matrix with components:
\[ \frac{1}{\epsilon} \int_0^T \left[ \frac{\partial}{\partial a_i}\bigl(\dot{x}(t) - f(t;x(t);u(t))\bigr),\ \frac{\partial}{\partial a_j}\bigl(\dot{x}(t) - f(t;x(t);u(t))\bigr) \right] dt \]
This is then a slight variation of the Newton-Raphson technique (in that no second derivatives of the integrand are used). The convergence of the scheme is proved by a minor modification of the usual proofs. Of course other techniques can be used.
Finally, let $\delta > 0$ be given. Then, in theory, we can find an $N$ and $\epsilon$ such that
For this we only need to first find $\epsilon$ such that
Then since
\[ \delta(\epsilon)/\epsilon \le g(0+) - g(\epsilon) \]
we have that
\[ g(0+) - h(\epsilon) \le \delta/2 \]
Next we choose $N$ large enough so that
References
1. L. C. Young: "Calculus of Variations and Control Theory", W. B. Saunders, 1969.
2. A. V. Balakrishnan: "On a New Computing Technique in Optimal Control", SIAM Journal on Control, September 1968.
3. A. V. Balakrishnan: "A Computational Approach to the Maximum Principle", Journal of Computer and System Sciences, 1971.
4. E. J. McShane: "Relaxed Controls and Variational Problems", SIAM Journal on Control, 1967.
5. H. G. Eggleston: "Convexity", Cambridge University Press, 1968.
LECTURE NOTES
A . V. Balakrishnan
Erice, July 1971
II
Stochastic Systems
1. Inducing Measures on C; the Wiener measure
Given a stochastic process, or equivalently, a consistent family of finite dimensional distributions, we can always construct a 'function-space' process with $X$ as the sample space and a probability measure $p(\cdot)$ on the sigma-algebra $\mathcal{S}$ of subsets of $X$ (that 'agrees' with the finite-dimensional distributions). Throughout we assume that for any two arbitrary time-points $t_1$, $t_2$, denoting the corresponding 'variables' by $x(t_1)$, $x(t_2)$, we have:
\[ E\bigl[\,|x(t_1) - x(t_2)|^r\,\bigr] \le k\,|t_1 - t_2|^{1+\delta} \tag{2.1} \]
where $r > 0$, $k > 0$, $\delta > 0$ are fixed constants independent of $t_1$, $t_2$, and $|\cdot|$ denotes the Euclidean norm.
Furthermore we shall assume that $T$ is a compact interval. For simplicity of notation, we shall take it to be the unit interval $[0,1]$ without loss of generality. Let $C(0,1)$ denote the class of continuous functions (with range in $E$) on the closed interval $[0,1]$. Endowing it with the 'sup' norm, we know that it becomes a Banach space:
\[ \|f\| = \sup_{0 \le t \le 1} |f(t)| \]
where $|\cdot|$ denotes the Euclidean norm. We note that it is a separable Banach space, and denote it by $\mathcal{C}$. By the Borel sets of $\mathcal{C}$ we mean the smallest sigma-algebra generated by all open sets. Let $B$ be an $m$-dimensional Borel set in $E^{(m)}$. Then for arbitrary $t_1, t_2, \ldots, t_n$ the sets in $\mathcal{C}$ defined by:
are called 'cylinder sets' (and $B$ is then referred to as the 'base').
Lemma
The class of Borel sets in $\mathcal{C}$ coincides with the smallest sigma-algebra generated by the class of cylinder sets.
Proof
It is clear that cylinder sets are Borel sets. Conversely, since $\mathcal{C}$ is separable, any open set in $\mathcal{C}$ can be expressed as the union of a countable number of closed spheres; and every closed sphere, say with center $f_0$ and radius $d$, can be expressed:
\[ \bigcap_n \bigl[\, f(\cdot) \mid |f(r_n) - f_0(r_n)| \le d \,\bigr] \]
where $\{r_n\}$ denotes the countable collection of rational numbers in $[0,1]$. Hence open sets are contained in the smallest sigma-algebra generated by cylinder sets. Note in particular that the Borel sets are generated as the smallest sigma-algebra containing sets of the form
\[ [\, f(\cdot) \mid f(t) \in I,\ f(\cdot) \in \mathcal{C};\ I \text{ an interval in } E \,] \]
Given any consistent set of distributions, we can induce a finitely additive measure on the cylinder sets of $\mathcal{C}$. The main point of this chapter is to show that under condition (2.1), it can be extended to be a countably additive measure on the Borel sets of $\mathcal{C}$. We follow Parthasarathy [4]. First use Kolmogorov's theorem and go to $(X, \mathcal{S}, p(\cdot))$. Then for each $n$, define a mapping $\varphi_n(\cdot)$, mapping $X$ into $\mathcal{C}$ by:
\[ (\varphi_n(x))(t) = x(m/2^n) + 2^n (t - m/2^n)\bigl(x((m+1)/2^n) - x(m/2^n)\bigr) \]
for $m/2^n \le t \le (m+1)/2^n$, $0 \le m < 2^n$, $m$ integer.
In other words we join the ordinates at the 'discrete' time-points $m/2^n$ by segments of straight lines.
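The polygonal map $\varphi_n$ is easy to write down explicitly. The sketch below is an illustration only (scalar-valued paths, hypothetical names `polygonal` and `max_gap`); it also evaluates the deviation between successive interpolants at the odd dyadic points, which is where the text locates the maximum deviation.

```python
def polygonal(x, n):
    """Return phi_n(x): the piecewise-linear function through the values x(m / 2**n)."""
    N = 2 ** n

    def phi(t):
        m = min(int(t * N), N - 1)  # index of the dyadic cell [m/N, (m+1)/N] containing t
        left = m / N
        return x(left) + N * (t - left) * (x((m + 1) / N) - x(left))

    return phi


def max_gap(x, n):
    """Largest |phi_n(x) - phi_{n-1}(x)| over the odd dyadic points k / 2**n."""
    N = 2 ** n
    fine, coarse = polygonal(x, n), polygonal(x, n - 1)
    return max(abs(fine(k / N) - coarse(k / N)) for k in range(1, N, 2))
```

For the smooth test path $x(t) = t^2$ the gap at level $n$ works out to $4^{-n}$, so the successive differences are summable; for sample paths of a process one only has the probabilistic bounds developed in the text.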
It is not difficult to see that $\varphi_n(\cdot)$ is measurable; that is to say, the inverse images of Borel sets:
\[ \varphi_n^{-1}(B) \]
where $B$ is a Borel set in $\mathcal{C}$, belong to $\mathcal{S}$. For this we have only to note that if $B$ is of the form:
\[ B = [\, f(\cdot) \in \mathcal{C},\ f(t_1) \in I,\ I \text{ a Borel set in } E \,] \]
then taking $m/2^n \le t_1 < (m+1)/2^n$ we have that
\[ \varphi_n^{-1}(B) = \bigl[\, x(\cdot) \in X,\ x(m/2^n) + 2^n (t_1 - m/2^n)\bigl(x((m+1)/2^n) - x(m/2^n)\bigr) \in I \,\bigr] \]
which is clearly in $\mathcal{S}$. Hence the smallest sigma-algebra generated by inverse images of this form is in $\mathcal{S}$. But the smallest sigma-algebra generated by sets of the form $B$ is indeed the class of all Borel sets. Hence $\varphi_n(\cdot)$ is measurable.
Next we come to the crucial part, and show that $\varphi_n(x)$ is a Cauchy sequence in $\mathcal{C}$ for all $x$ in $X$, except for a fixed set of measure zero. For this we begin with the bound:
This is evident from the fact that the maximum 'deviation' occurs at the subdivision points $(2k+1)/2^n$, and here the deviation is
Next we invoke the Chebychev inequality, and obtain:
Next let $0 < \theta < \delta/r$ and let
\[ A_n = \bigl[\, x \mid \|\varphi_n(x) - \varphi_{n-1}(x)\| \ge 2^{-n\theta} \,\bigr] \]
Then
(The student should verify that the set considered is measurable.)
and since
it follows (Borel-Cantelli Lemma):
Let
\[ F = \bigcap_{n=1}^{\infty} \bigcup_{j=n}^{\infty} A_j \]
For any $x$ not in $F$ we have that, denoting the complement of $A_j$ by $CA_j$,
or, for each $x$ not in $F$,
for all $n$ sufficiently large (depending on $x$, of course). Hence for such $x$,
for all $n$ sufficiently large and every $p$, or, $\varphi_n(x)$ is a Cauchy sequence in $\mathcal{C}$.
Denote the limit by $\varphi(x)$ for $x$ not in $F$ and define the limit to be zero (the zero function) on $F$. And define
\[ \Psi(x) = \varphi(x) \quad \text{on the complement of } F \]
Then $\Psi(\cdot)$ is measurable. Note that for any $x(\cdot)$ not in $F$, we have that, if
then at each subdivision point $t_j$ (of the form $m/2^n$)
by construction. On the other hand, using (2.1) it follows that we have stochastic continuity; and hence if $t_j$ converges to $t$, a subsequence
converges with probability 1 to $x(t)$. Hence for each $t$
with probability one, where the exceptional set may depend on $t$. [In other words $(\Psi(x))(t)$ is equivalent to $x(t)$.] Next we define the probability measure $\hat{p}$ on $\mathcal{C}$ by: for each Borel set $B$,
\[ \hat{p}(B) = p(\Psi^{-1}(B)) \]
Let us calculate the finite dimensional distributions corresponding to $\hat{p}$. Let $I$ be an interval in $E$, and let
Then
\[ \hat{p}(B) = p(\Psi^{-1}(B)) \]
by the equivalence just proved. Hence $\hat{p}$ 'agrees' with the given finite dimensional distributions.
Wiener Process
An important special case of this construction, for us the one central case, is Wiener measure.
Theorem 2.1
There exists a measure $W$ on the Borel sets of $\mathcal{C}$ (which we shall denote henceforth by $\mathcal{B}_{\mathcal{C}}$) such that
and for any finite number of indices $t_1 < t_2 < \ldots < t_m$,
where $I_k$ are intervals in $E$, is given by
where $G(\ldots)$ is $mn$-variate Gaussian with zero mean and
Proof
It is readily verified that the given joint distributions satisfy (2.1) with $r = 4$, $K = n(n+2)$, and $\delta = 1$, and satisfy the consistency requirements.
With $\omega$ denoting 'points' in $\mathcal{C}$ and defining
\[ W(t;\omega), \quad 0 \le t \le 1 \]
to be the 'function-value' at $t$ corresponding to $\omega$, the measure $W$ being the Wiener measure above, we have a 'Wiener process' which is thus a Gaussian function-space process, with sample space $\mathcal{C}$. We note that for $t_1 < t_2$,
\[ W(t_2;\omega) - W(t_1;\omega) \]
is Gaussian with mean zero and covariance $(t_2 - t_1) I_{(n)}$, $I_{(n)}$ being the $n \times n$ identity matrix, and is independent of
Hence $W(t;\omega)$ is also a Gaussian process with independent increments such that
and has sample functions which are continuous. It is also readily verified that
\[ R(t;s) = E\bigl(W(t;\omega)\,W(s;\omega)^*\bigr) = \min(s,t)\, I_{(n)} \]
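The covariance formula can be checked by simulation in the scalar case $n = 1$. The sketch below is a hedged illustration, not part of the construction in the text: it samples $W$ at two times from independent Gaussian increments and averages the product; the names `wiener_at` and `estimate_covariance`, the sample size, and the seed are arbitrary choices.

```python
import math
import random


def wiener_at(times, rng):
    """Sample W at non-decreasing times, using independent Gaussian increments."""
    w, prev, out = 0.0, 0.0, []
    for t in times:
        w += math.sqrt(t - prev) * rng.gauss(0.0, 1.0)
        prev = t
        out.append(w)
    return out


def estimate_covariance(s, t, paths, seed):
    """Monte Carlo estimate of E[W(s) W(t)] for s <= t."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(paths):
        ws, wt = wiener_at([s, t], rng)
        total += ws * wt
    return total / paths
```

With a few thousand paths the estimate for $s = 0.25$, $t = 0.75$ should sit close to $\min(s,t) = 0.25$, up to sampling error of order $1/\sqrt{\text{paths}}$.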
Measurability
For each $t$, $W(t;\omega)$ is actually continuous in $\omega$, being a continuous linear functional on $\mathcal{C}$. Being thus continuous in both variables, $t$ and $\omega$, it is measurable jointly with respect to $\omega$ and $t$ (Lebesgue measurable sets in $[0,1]$). Let $\varphi(t)$ be any function in $\mathcal{C}$. For each $\omega$, $W(t;\omega)$ is a continuous function of $t$. Hence we can define the integral
as a Riemann integral for every $\omega$. Moreover, it defines actually a continuous linear functional on $\mathcal{C}$. In fact
since
\[ |W(t;\omega_1) - W(t;\omega_2)| \le \|\omega_1 - \omega_2\| \]
The integral defines a Gaussian of mean zero and with variance
using (2.3), and the calculation can of course be carried further.
Stochastic Integrals: Linear Case
Let $\varphi(t)$ be any function in $L_2(0,1)$ over $E$. We wish now to define the stochastic integral (with respect to the Wiener process)
\[ \int_0^1 [\varphi(t),\ dW(t;\omega)] \]
First let $\varphi(t)$ be a function of the form:
\[ \varphi(t) = \varphi_i, \quad t_i \le t < t_{i+1} \]
corresponding to an arbitrary subdivision of $[0,1]$ into subintervals. Then we define
\[ \int_0^1 [\varphi(t),\ dW(t;\omega)] = \sum_i [\varphi_i,\ W(t_{i+1};\omega) - W(t_i;\omega)] \]
This defines a Gaussian random variable with mean zero and variance:
\[ \sum_i |\varphi_i|^2 (t_{i+1} - t_i) = \int_0^1 |\varphi(t)|^2\,dt \]
[The integral of course defines a continuous linear functional on $\mathcal{C}$, and the norm of the functional is
Consider now the $L_2$ space of all random variables (measurable $\mathcal{B}_{\mathcal{C}}$) with finite second moment with respect to Wiener measure. Denote this $L_2(\mathcal{C})$. The mapping takes step functions into $L_2(\mathcal{C})$. It is readily verified to be linear on the linear subspace of step functions of the type considered.
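For a step function the stochastic integral is a finite sum of independent Gaussian terms, so its variance is the squared $L_2$ norm of the integrand. The sketch below (scalar case, hypothetical names `step_integral`, `l2_norm_squared`, `sample_moments`) compares that exact value with a sample variance; it is an illustration under these assumptions rather than anything taken from the text.

```python
import random
import math


def step_integral(values, knots, rng):
    """One sample of sum_i values[i] * (W(knots[i+1]) - W(knots[i]))."""
    total = 0.0
    for f_i, a, b in zip(values, knots, knots[1:]):
        # Increments of W over disjoint intervals are independent N(0, b - a).
        total += f_i * math.sqrt(b - a) * rng.gauss(0.0, 1.0)
    return total


def l2_norm_squared(values, knots):
    """Integral of the squared step function over [knots[0], knots[-1]]."""
    return sum(f_i * f_i * (b - a) for f_i, a, b in zip(values, knots, knots[1:]))


def sample_moments(values, knots, samples, seed):
    """Sample mean and variance of the stochastic integral of the step function."""
    rng = random.Random(seed)
    draws = [step_integral(values, knots, rng) for _ in range(samples)]
    mean = sum(draws) / samples
    var = sum((d - mean) ** 2 for d in draws) / samples
    return mean, var
```

For the values `[1.0, -2.0, 0.5]` on the knots `[0.0, 0.25, 0.75, 1.0]`, `l2_norm_squared` evaluates $0.25 + 2 + 0.0625$, and the sample variance should agree with it up to Monte Carlo error.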
But this subspace is dense in $L_2(0,1)$. If we denote the mapping by $\mathcal{I}$, we have
\[ E\bigl[\,|\mathcal{I}(\varphi)|^2\,\bigr] = \int_0^1 |\varphi(t)|^2\,dt \]
and hence $\mathcal{I}$ can be extended to be continuous on $L_2(0,1)$. In other words, if $\varphi(\cdot)$ is any element in $L_2(0,1)$ we know that we can find a sequence $\{\varphi_n(\cdot)\}$ of step functions of the type considered which converge to $\varphi(\cdot)$, and since $\{\varphi_n(\cdot)\}$ is then a Cauchy sequence in $L_2(0,1)$, so is $\{\mathcal{I}(\varphi_n(\cdot))\}$ in $L_2(\mathcal{C})$. Since $L_2(\mathcal{C})$ is complete it follows that we can define
\[ \mathcal{I}(\varphi(\cdot)) = \lim_n\ \mathcal{I}(\varphi_n(\cdot)) \quad (\text{limit in } L_2(\mathcal{C})) \]
The limit is of course independent of the particular sequence chosen, thanks again to (2.4), and moreover a subsequence of $\mathcal{I}(\varphi_n(\cdot))$ will converge to the limit a.e. Also, of course, we will continue to have:
and further, if $\varphi(\cdot)$, $\psi(\cdot)$ are any two elements in $L_2(0,1)$ we have
where the right side indicates inner product in $L_2(0,1)$. It is hardly necessary to add that $\mathcal{I}$ is a linear continuous (isometric) transformation of $L_2[0,1]$ into $L_2(\mathcal{C})$ and
is Gaussian with zero mean and variance
If $\varphi(\cdot)$ is actually continuous then we can approximate $\mathcal{I}(\varphi(\cdot))$ by sums of the form, with $\{t_i\}$ as before,
where $t_i \le \tau_i \le t_{i+1}$, since the difference (between this sum and the integral)
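If $\varphi(\cdot)$ is absolutely continuous with derivative in $L_2[0,1]$, we can 'integrate by parts':
\[ \int_0^1 [\varphi(t),\ dW(t;\omega)] = [\varphi(1),\ W(1;\omega)] - \int_0^1 [\dot{\varphi}(t),\ W(t;\omega)]\,dt \quad \text{with pr. one} \]
This follows from the calculation (summation by parts over a subdivision $0 = t_0 < t_1 < \ldots < t_n = 1$, using $W(0;\omega) = 0$):
\[ \sum_{i=0}^{n-1} [\varphi(t_i),\ W(t_{i+1};\omega) - W(t_i;\omega)] = [\varphi(t_{n-1}),\ W(t_n;\omega)] - \sum_{i=1}^{n-1} [\varphi(t_i) - \varphi(t_{i-1}),\ W(t_i;\omega)] \]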
Also for any complete orthonormal sequence $\{\varphi_n\}$ in $L_2[0,1]$, we have
\[ \int_0^1 [\varphi(t),\ dW(t;\omega)] = \sum_{n=1}^{\infty} [\varphi, \varphi_n] \int_0^1 [\varphi_n(t),\ dW(t;\omega)] \quad \text{with pr. 1} \]
where
\[ \int_0^1 [\varphi_n(t),\ dW(t;\omega)] \]
are zero mean, unit variance, mutually independent Gaussian variables. We can pursue this a bit further, to obtain an expansion for the process.
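An expansion of the process in independent Gaussian coefficients can be checked numerically through its covariance. The sketch below assumes the scalar case and one particular complete orthonormal sequence in $L_2(0,1)$, namely $\varphi_k(s) = \sqrt{2}\cos((k - 1/2)\pi s)$; that choice, and the names `integrated_basis` and `partial_covariance`, are illustrative assumptions and are not singled out in the text. By Parseval, $\sum_k \int_0^s \varphi_k \int_0^t \varphi_k$ should approach $\min(s,t)$.

```python
import math


def integrated_basis(k, t):
    """Integral over [0, t] of sqrt(2) * cos((k - 1/2) * pi * s) ds."""
    w = (k - 0.5) * math.pi
    return math.sqrt(2.0) * math.sin(w * t) / w


def partial_covariance(s, t, n):
    """Covariance of the n-term partial sums sum_k zeta_k * integrated_basis(k, .)."""
    return sum(integrated_basis(k, s) * integrated_basis(k, t) for k in range(1, n + 1))


def partial_sum_path(t, zetas):
    """Value at t of the partial sum built from given coefficients zeta_1, zeta_2, ..."""
    return sum(z * integrated_basis(k, t) for k, z in enumerate(zetas, start=1))
```

The truncation error of `partial_covariance` decays roughly like $1/n$, so a few thousand terms already reproduce $\min(s,t)$ to three decimal places.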
Shepp Expansion: (Cf. Shepp [5])
Let $\varphi_n(\cdot)$ be a complete orthonormal sequence in $L_2(0,1)$. Let
Then the $\zeta_n$ are zero mean, unit variance, mutually independent Gaussian variables (one-dimensional). Let
Now let us fix $t$, and take any arbitrary element $v$ in $E$, and define the following function $f(\cdot)$ in $L_2(0,1)$:
\[ f(s) = v, \quad 0 < s \le t; \qquad = 0, \quad \text{otherwise.} \]
Then first of all,
for every $\omega$ by definition. Let
be the Fourier coefficients of $f(\cdot)$ with respect to the basis $\varphi_j(\cdot)$. Let
Then
Hence (since $\mathcal{I}$ is a continuous mapping):
\[ E\,[v,\ W_n(t;\omega) - W(t;\omega)]^2 \to 0 \quad \text{as } n \to \infty \]
for every $v$ in $E$, or
Or, we have the expansion (in the mean square sense), for each $t$:
Actually, the convergence is with probability one also, since
and by virtue of Kolmogorov's inequality this converges with probability one, since
Problem
For any $f(\cdot)$ in $L_2[0,1]$, show that
Problem
Hint:
and so,
\[ \sum_k \left\| \int_0^t \varphi_k(s)\,ds \right\|^2 \]
converges boundedly to $(nt)$, in $L_2[0,1]$.
Linear Stochastic Equations
Let $F(s)$ be an $m$-by-$n$ matrix function, Lebesgue measurable on $[0,1]$ and such that
Let
which we know is defined for almost every $\omega$. However, while it is defined for each $t$, it is not defined for every $\omega$, and in particular, the exceptional set of points $\omega$ on which it is not defined may well depend on $t$. We wish now to rectify this situation. If $F(s)$ is absolutely continuous with a square-integrable (on $[0,1]$) derivative we can do this very simply. For then,
with probability one, and since the right side is defined for every $\omega$, we may just define the left side to be that for every $\omega$. Note that $S(t;\omega)$ is then continuous in $t$, $0 \le t \le 1$, for every $\omega$, and hence a separable process. Moreover, we have the following basic bound:
Lemma
Suppose $S(t;\omega)$ in (2.7) is determined as a continuous function of $t$ for almost every $\omega$. Then:
\[ W\left[\, \omega : \sup_{0 \le t \le 1} \left\| \int_0^t F(s)\,dW(s;\omega) \right\| > \epsilon \,\right] \le \frac{1}{\epsilon^2} \int_0^1 \|F(s)\|^2\,ds \]
Proof
Let
Then by the assumed continuity of $S(t;\omega)$ in $t$, we observe that $A$ is measurable. In fact if
then $A_n(k)$ is clearly monotone in $n$ for each $k$.
But the integrals over non-overlapping intervals being independent, a simple application of the Kolmogorov inequality shows that
from which the Lemma follows. Next let us recall that if $F(\cdot)$ is any element in $L_2(0,1)$, we can find a sequence $F_n(\cdot)$ of functions which are absolutely continuous with derivative in $L_2(0,1)$ such that
Using the lemma we obtain that
Let $\theta$, $0 < \theta < 1$, be fixed, and let $n_k$ be such that
Clearly $\{n_k\}$ can be taken to be increasing. Choose
\[ \epsilon_k^2 = 2^{-k\gamma}, \quad 0 < \gamma < \theta \]
and let $B_k$ be the set of $\omega$ for which
\[ \sup_{0 \le t \le 1} \left\| \int_0^t (F_n(s) - F_m(s))\,dW(s;\omega) \right\| > \epsilon_k \]
for $n, m \ge n_k$. Let
Then since $W(B_k) \le 2^{-k(\theta - \gamma)}$ we have (Borel-Cantelli Lemma):
\[ W(\Lambda) = 0 \]
Hence for $\omega$ not in $\Lambda$ we have that:
for all $k >$ some $N$. In other words the sequence
converges uniformly in $t$ with probability one, and hence the limit must be continuous in $t$. Denote this limit by $\hat{S}(t;\omega)$. Then
\[ \hat{S}(t;\omega) = S(t;\omega) \quad \text{with pr. one} \]
and we have produced a continuous equivalent version of $S(t;\omega)$.
Problem
Suppose $F(\cdot)$ is essentially bounded on $[0,1]$. Then show that (2.1) is satisfied and hence use Parthasarathy's construction to obtain a continuous version of $S(t;\omega)$.
Answer
where
\[ m = \operatorname{ess\,sup}_{0 \le s \le 1} \|F(s)\| \]
Define $S(t;\omega)$ arbitrarily for each $t$ for $\omega$ not in the defining set. Then map $\mathcal{C}$ into $X$ by
This is a measurable map and then we can use the fact that (2.1) is satisfied. Hence we can obtain a new measurable map $\Psi(\cdot)$, mapping $X$ into $\mathcal{C}$. Note that if
then for each $t$, $f(t) = x(t)$ omitting a set of measure zero (depending on $t$) in $X$. Hence
\[ \Psi[\varphi(\omega)](t) = S(t;\omega), \]
omitting a set of measure zero (depending on $t$) in $\mathcal{C}$, and
\[ \hat{S}(t;\omega) = \Psi[\varphi(\omega)](t) \]
is continuous in $t$ with probability one.
Let us now consider the problem of solving the following linear stochastic integral equation:
\[ x(t;\omega) = \xi(\omega) + \int_0^t A(s)\,x(s;\omega)\,ds + \int_0^t B(s)\,dW(s;\omega) \]
where $\xi(\omega)$ is a given random variable, independent of $W(\cdot;\omega)$; $A(s)$ is a Lebesgue measurable $m$-by-$m$ matrix function; $B(s)$ is a Lebesgue measurable $m$-by-$n$ matrix function, and
Sometimes (2.9) is written in the differential form:
\[ dx(t;\omega) = A(t)\,x(t;\omega)\,dt + B(t)\,dW(t;\omega) \]
but this is to be looked upon only as a shorthand notation for (2.10). What shall we mean by a solution of (2.10)? Minimally, any stochastic process on $[\mathcal{C}, \mathcal{B}_{\mathcal{C}}, W]$ for which the integrals in (2.10) can be defined. We can actually do a little better. By a 'continuous solution' of (2.10) we shall mean a stochastic process such that $x(t;\omega)$ is continuous in $0 \le t \le L$, and satisfies (2.10) for every $t$ in $[0, L]$, omitting a fixed set of measure zero.
Lemma
Suppose $x_1(t;\omega)$, $x_2(t;\omega)$ are two continuous solutions of (2.9). Then
\[ x_1(t;\omega) = x_2(t;\omega) \]
for every $t$ except for a fixed $\omega$-set of measure zero.
Proof Let
In fact
Then $y(t;\omega)$ is a continuous solution of:
\[ y(t;\omega) = \int_0^t A(s)\,y(s;\omega)\,ds \tag{2.12} \]
for $\omega$ not in $\Lambda$, say, where $\Lambda$ is a set of measure zero. But for each $\omega$ not in $\Lambda$, (2.12) is a 'deterministic' equation, and readily yields that
\[ y(t;\omega) = 0 \]
for every $t$ and $\omega \notin \Lambda$.
To produce a continuous solution, let $\Phi(t)$ be the fundamental matrix solution of
\[ \dot{\Phi}(t) = A(t)\,\Phi(t); \quad \Phi(0) = I. \]
Then let $S(t;\omega)$ be a continuous version of
\[ \int_0^t \Phi(s)^{-1} B(s)\,dW(s;\omega), \quad 0 \le t \le L, \]
and finally let
\[ x(t;\omega) = \Phi(t)\bigl(S(t;\omega) + \xi(\omega)\bigr) \]
which is then a continuous function of $t$ excepting for a set $\Lambda$ of measure zero. Let us next note that
and integrating by parts, we get:
This equality can clearly be defined to hold for every $\omega$ not in $\Lambda$. But for $\omega$ not in $\Lambda$,
as required.
Problem
Let $x(t;\omega)$ be a (continuous) solution of (2.10) and let
show that $R(t)$ is absolutely continuous with
Answer
We note that
\[ R(t) = \Phi(t)\,E[\xi(\omega)\,\xi(\omega)^*]\,\Phi(t)^* \]
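A scalar sanity check may help here. The surviving text does not show the differential relation for $R(t)$; the sketch below assumes the standard one, $\dot{R} = AR + RA^* + BB^*$, specialized to constants $a$, $b$ (so $\dot{R} = 2aR + b^2$), and compares a Runge-Kutta integration of it with the variance obtained from $x = \Phi(S + \xi)$, assuming $\xi$ independent of $W$ with variance `r0`. The names `variance_closed_form` and `variance_by_ode` are hypothetical.

```python
import math


def variance_closed_form(a, b, r0, t):
    """Variance of x(t) = Phi(t) (S(t) + xi) in the scalar case, a != 0."""
    phi = math.exp(a * t)
    # Integral over [0, t] of Phi(s)^{-1} b b Phi(s)^{-1} ds, with Phi(s) = exp(a s).
    integral = b * b * (1.0 - math.exp(-2.0 * a * t)) / (2.0 * a)
    return phi * (r0 + integral) * phi


def variance_by_ode(a, b, r0, t, steps):
    """Classical RK4 integration of dR/dt = 2 a R + b^2, R(0) = r0."""
    h = t / steps
    r = r0

    def rate(value):
        return 2.0 * a * value + b * b

    for _ in range(steps):
        k1 = rate(r)
        k2 = rate(r + 0.5 * h * k1)
        k3 = rate(r + 0.5 * h * k2)
        k4 = rate(r + h * k3)
        r += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return r
```

For $a < 0$ both computations settle at the stationary value $b^2 / (-2a)$ as $t$ grows, which is a quick consistency check on the assumed relation.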
Next let $\xi(\omega) = 0$ in (2.10) and define:
where
$C(s)$ is $q$-by-$m$ and continuous on $[0, L]$
$D(s)$ is $q$-by-$n$ and continuous on $[0, L]$
Note that $Y(t;\omega)$ can be defined so that it is continuous in $t$, $0 \le t \le L$, excepting an $\omega$-set of (Wiener) measure zero. Furthermore, we can define a stochastic integral with respect to the $Y(t;\omega)$ process. Let $f(\cdot)$ be a function in $L_2(0,L)^q$; we first define the integral for a step function taking on a finite number of values each on a subinterval, as before. We note that for a step function the following equality holds:
Now if we take a Cauchy sequence of step functions, the right side converges, and hence so does the left side. And the stochastic integral is then defined as this limit. Hence finally (2.15) holds for every $f(\cdot)$ in $L_2[0,L]^q$.
This may be indicated in differential notation as:
We can carry this further. Define the following operators mapping $L_2[0,L]^n$ into $L_2[0,L]^q$:
Note that $K$ is a Volterra operator. Let $R$ denote the mapping of $L_2[0,L]^q$ into $L_2[0,L]^q$:
Then we can verify that for $f(\cdot)$, $h(\cdot)$ in $L_2[0,L]^q$:
where on the right side we use the inner product notation in $L_2[0,L]^q$. For this we have only to note that
\[ \int_0^L [f(t),\ C(t)\,x(t;\omega)]\,dt = \int_0^L \left[ B(t)^* \Phi(t)^{*-1} \int_t^L \Phi(s)^* C(s)^* f(s)\,ds,\ dW(t;\omega) \right] \]
so that
\[ \int_0^L [f(t),\ dY(t;\omega)] = \int_0^L \left[ D(t)^* f(t) + B(t)^* \Phi(t)^{*-1} \int_t^L \Phi(s)^* C(s)^* f(s)\,ds,\ dW(t;\omega) \right] \]
from which the left side of (2.20) is seen to be
Problem
Let $J = (D+K)^*$. Show that if $\{\varphi_n(\cdot)\}$ is any complete orthonormal sequence in $L_2[0,L]^q$, we have
where
and
\[ \zeta_n(\omega) = \int_0^L [\varphi_n(t),\ dW(t;\omega)] \]
These considerations can be generalized as follows, illustrating our point of view on stochastic processes. Let $y(t;\omega)$ be a Gaussian process [with sample space not necessarily $\mathcal{C}$] of dimension $n$ with continuous covariance function $R(t;s)$. Then let (taking a separable extension)
Note that this is a (Gaussian) process providing a consistent family of finite dimensional distributions, and
Hence (2.1) holds and hence we can define a stochastic process with sample space $\mathcal{C}$. We shall denote this by $Y(t;\omega)$. Then we can also define a stochastic integral, by the same procedure as before. And for $f(\cdot)$, $h(\cdot)$ in $L_2[0,L]^n$:
where
If $\{\varphi_n(t)\}$ are the orthonormalized eigen-functions of $R$, we have:
where
and
and $\{\lambda_n\}$ are the eigen-values of $R$ corresponding to $\varphi_n(\cdot)$.
The series again converges with probability one.
Conditional Expectation and Martingale Theory
Let $(\Omega, \mathcal{B}, p)$ denote a probability triple and let $\xi$ be any ($n$-dimensional) random variable such that
\[ E[\|\xi\|] < \infty \]
Let $\mathcal{B}_s$ denote a sub sigma-algebra of $\mathcal{B}$. Then $\xi$ need not be measurable $\mathcal{B}_s$. We can construct a (sort of) projection which is. Define the set function
\[ \nu(B) = \int_B \xi\,dp \]
for every $B$ in $\mathcal{B}_s$. $\nu(\cdot)$ is absolutely continuous with respect to the measure $p(\cdot)$ on $\mathcal{B}_s$. Hence by the Radon-Nikodym theorem, there is a function $f(\omega)$, measurable $\mathcal{B}_s$, such that
\[ \nu(B) = \int_B f(\omega)\,dp \quad \text{for every } B \text{ in } \mathcal{B}_s \tag{2.23} \]
Let $\bar{\mathcal{B}}_s$ denote the sigma algebra of sets which are either in $\mathcal{B}_s$ or differ from such sets by null sets. Any function measurable with respect to $\bar{\mathcal{B}}_s$ and satisfying (2.23) will be called the conditional expectation of $\xi$ with respect to $\mathcal{B}_s$ and denoted:
\[ E[\xi \mid \mathcal{B}_s] \]
Let us now itemize some of the properties of conditional expectations:
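i) $E[\xi \mid \mathcal{B}_s]$ is a random variable on $(\Omega, \bar{\mathcal{B}}_s, p)$ and
\[ E[\xi \mid \mathcal{B}_s] = E[\xi \mid \bar{\mathcal{B}}_s] \quad \text{with probability one.} \]
ii)
\[ E[\xi] = E\bigl(E[\xi \mid \mathcal{B}_s]\bigr) \]
since
\[ \int_\Omega \xi\,dp = \int_\Omega E[\xi \mid \mathcal{B}_s]\,dp \]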
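iii) Let $\mathcal{B}_1$, $\mathcal{B}_2$ be two sub sigma-algebras of $\mathcal{B}$ such that $\mathcal{B}_1 \subset \mathcal{B}_2$ [this means that every set in $\mathcal{B}_1$ is also in $\mathcal{B}_2$]. Then
\[ E[\xi \mid \mathcal{B}_1] = E\bigl[\,E[\xi \mid \mathcal{B}_2] \mid \mathcal{B}_1\,\bigr] \]
Proof
Let $f_2(\omega) = E[\xi \mid \mathcal{B}_2]$. Let $B$ be any set in $\mathcal{B}_1$. Then
But since $B$ belongs to $\mathcal{B}_2$,
\[ \int_B \xi\,dp = \int_B f_2(\omega)\,dp \]
Hence for every $B$ in $\mathcal{B}_1$,
or, the result follows.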
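On a finite sample space the conditional expectation is just a weighted average over the atoms of the sub sigma-algebra, which makes property iii) easy to verify exactly. The sketch below is an illustration under stated assumptions: the six-point space, the probabilities, the values of $\xi$, the two nested partitions, and the name `conditional_expectation` are all hypothetical.

```python
from fractions import Fraction


def conditional_expectation(xi, prob, partition):
    """E[xi | sigma-algebra generated by `partition`] on a finite sample space."""
    result = [None] * len(xi)
    for atom in partition:
        mass = sum(prob[w] for w in atom)
        average = sum(xi[w] * prob[w] for w in atom) / mass
        for w in atom:
            result[w] = average  # constant on each atom, hence measurable
    return result


# Hypothetical six-point space; `fine` refines `coarse`, so B_1 (coarse) is inside B_2 (fine).
prob = [Fraction(k, 12) for k in (1, 2, 3, 1, 2, 3)]
xi = [Fraction(v) for v in (3, -1, 4, 1, -5, 9)]
fine = [[0, 1], [2], [3, 4], [5]]
coarse = [[0, 1, 2], [3, 4, 5]]

inner = conditional_expectation(xi, prob, fine)       # E[xi | B_2]
tower = conditional_expectation(inner, prob, coarse)  # E[ E[xi | B_2] | B_1 ]
direct = conditional_expectation(xi, prob, coarse)    # E[xi | B_1]
```

Exact rational arithmetic is used so that `tower` and `direct` can be compared for equality rather than up to rounding.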
iv) Let B be a sub-sigma algebra of 2? and l e t h(w) be a p x n m a t r i x valued function measurable
BS. Suppose
Then
E [h(w)
l.gsl = h(w) E[s
Is s l
Proof Let h(w) be the characteristic function of a s e t Bo in Bs. Then for any B in Bs
A . V . Balakrishnan where
By l i r s a r i t y , this r e s u l t i s extended to s i m p l e functions, and by the usual liniiting a r g u m e n t s t o any function m e a s u r a b l e
Definition: Let $\{x_\alpha(\omega)\}$ be any collection of random variables. By the smallest sigma algebra generated by this collection we shall mean the smallest sigma algebra containing all sets of the form:
$$[\omega \mid x_\alpha(\omega) \in B]$$
where $B$ is a Borel set in $E$, or differs from such a set by $\omega$-sets of measure zero. We shall denote the sigma algebra by:

Lemma (Doob): Let $x_1(\omega), \ldots, x_m(\omega)$ be $m$ random variables, and let $\mathscr{B}_m$ denote the smallest sigma algebra generated by them. Let $f(\omega)$ be measurable with respect to $\mathscr{B}_m$. Then there exists a Borel function $g(x_1, \ldots, x_m)$ defined on $E^{(mn)}$ such that
$$f(\omega) = g(x_1(\omega), \ldots, x_m(\omega)) \text{ with probability one.} \qquad (2.26)$$
Proof: The main thing to note is that the smallest sigma algebra generated by the $m$ variables is precisely the class of sets of the form:

where $U_{mn}$ is a Borel set in $m \times n$ dimensions and $N$ is any set of measure zero.

Now suppose $f(\omega)$ is a characteristic function of a set in $\mathscr{B}_m$ of the above type. Define:
$$g(x_1, \ldots, x_m) = 1 \text{ on } U_{mn}, \qquad = 0 \text{ otherwise.}$$
Then $g(\ldots)$ is a Borel function and (2.26) holds. Next note that the class of functions for which (2.26) holds is a linear class. Hence (2.26) holds for simple functions $f(\cdot)$. Now given any function $f(\cdot)$ measurable $\mathscr{B}_m$, we can find a sequence of simple functions converging to it with probability one. Let $f_n(\omega)$ denote a sequence of such simple functions and let $\Lambda$ denote the set of convergence. We know that
$$f_n(\omega) = g_n(x_1(\omega), \ldots, x_m(\omega))$$
for Borel functions $g_n(\ldots)$, and that $\Lambda$ must be of the form:

where $I$ is a Borel set in $m \times n$ dimensions and $N$ is a set of measure zero. Then $g_n(x_1, \ldots, x_m)$ converges on $I$. Define:
$$g(x_1, \ldots, x_m) = \lim_n g_n(x_1, \ldots, x_m) \text{ on } I, \qquad = 0 \text{ otherwise.}$$
Then $g(\ldots)$ is a Borel function, and of course (2.26) holds.

Note in particular that

for some Borel function $g(\ldots)$. Because of this we shall on occasion use the notation:

for the conditional expectation with respect to $\mathscr{B}_m$.
Independence

Let $\zeta$, $\eta$ denote random variables of dimension $n$ and $m$ respectively which are independent. Recall that this means that:

Let us calculate the conditional expectation:
$$E[\zeta \mid \eta]$$
Let $B$ be any set in $\mathscr{B}(\eta)$. Then

where $f(\omega)$ = Identity matrix on $B$, and zero otherwise. Then $f(\omega)$ and $\zeta$ are independent, and hence we have that:

Hence

Or,
$$E(\zeta \mid \eta) = E(\zeta) \text{ with probability one.}$$
Martingales-Discrete Parameter

A sequence of random variables $\{\zeta_n\}$ is called a Martingale if
$$E[\|\zeta_n\|] < \infty$$
and is such that:
$$E[\zeta_n \mid \mathscr{B}_{n-1}] = \zeta_{n-1}$$
where $\mathscr{B}_n$ is the smallest sigma algebra generated by the first $n$ variables. Note in particular that:

Suppose $\{\eta_k\}$ is a sequence of independent random variables with finite expectation, but with zero mean. Then
$$\zeta_n = \sum_{k=1}^{n} \eta_k$$
is a Martingale. Moreover we have the following generalization of the Kolmogorov inequality to Martingales:

Lemma (Doob): Let $x_n$ be a Martingale. Then:
$$\Pr\Big[\max_{j \le n} |x_j(\omega)| \ge \varepsilon\Big] \le (1/\varepsilon^2)\, E[|x_n|^2]$$
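The inequality $\Pr[\max_{j \le n} |x_j| \ge \varepsilon] \le E[|x_n|^2]/\varepsilon^2$ can be illustrated numerically for the simplest Martingale, the partial sums of independent zero-mean steps. The following is a Monte Carlo sketch under assumptions chosen for this example only (steps equal to $\pm 1$ with equal probability, $n = 50$, $\varepsilon = 15$, a fixed seed); it estimates both sides and is not a proof.

```python
import random


def max_abs_and_last(n, rng):
    """Simulate x_j = sum of j independent +/-1 steps; return (max_j |x_j|, x_n)."""
    x, running_max = 0, 0
    for _ in range(n):
        x += rng.choice((-1, 1))
        running_max = max(running_max, abs(x))
    return running_max, x


def check_inequality(n=50, eps=15.0, trials=20000, seed=0):
    """Return Monte Carlo estimates of both sides of the Martingale inequality."""
    rng = random.Random(seed)
    exceed, second_moment = 0, 0.0
    for _ in range(trials):
        biggest, last = max_abs_and_last(n, rng)
        exceed += biggest >= eps
        second_moment += last * last
    lhs = exceed / trials                       # estimate of Pr[max_j |x_j| >= eps]
    rhs = (second_moment / trials) / eps ** 2   # estimate of E[|x_n|^2] / eps^2
    return lhs, rhs


lhs, rhs = check_inequality()
```

For this walk $E[|x_n|^2] = n$, so the right side should be close to $50/225$, while the estimated left side comes out well below it; the bound is far from tight here.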
Proof: Let

Define
$$z_k(\omega) = x_k(\omega) \text{ for } \omega \text{ in } B_k, \qquad = 0 \text{ otherwise.}$$
Then $z_k$ is measurable $\mathscr{B}(x_1, \ldots, x_k)$, and hence:

where the Martingale property is invoked in the last equality. We can now readily proceed as in the proof of the original Kolmogorov inequality:

and

so that

which yields the inequality sought.

Doob's Martingale Convergence Theorem: The following special case of Doob's theorem will be useful to us in the sequel:

Theorem 2.2 Suppose $\{\zeta_n\}$ is a Martingale such that

Then the sequence $\zeta_n$ converges with probability one to a variable $\zeta$ such that

Proof: See Doob ([2] p. 319).
As an example of how this result is used, let us consider the following canonical situation. Let $x(\omega)$ be a variable with finite expectation, and let $y_n(\omega)$ be a sequence of random variables, and let $\mathscr{B}_n$ denote the sigma algebra generated by the first $n$ variables $y_1, \ldots, y_n$. Let $\mathscr{B}_\infty$ denote the sigma algebra generated by the whole sequence. Then
$$\zeta_n = E[x(\omega) \mid \mathscr{B}_n]$$
is a Martingale which converges with probability one to:
$$\zeta = E(x(\omega) \mid \mathscr{B}_\infty)$$
To see this we have only to note that
$$E[\zeta_n \mid \mathscr{B}_{n-1}] = E[\,E[x(\omega) \mid \mathscr{B}_n] \mid \mathscr{B}_{n-1}] = E[x(\omega) \mid \mathscr{B}_{n-1}] = \zeta_{n-1};$$
and of course:

Hence $\zeta_n$ converges with probability one. Let $\hat{\zeta}$ denote the limit. Then for any set $B$ in $\mathscr{B}_n$, we observe that

and hence

Hence
$$E(\zeta - \hat{\zeta} \mid \mathscr{B}_n) = 0 \text{ for every } n.$$
Or, really
$$\int_B (\zeta - \hat{\zeta}) \, dp = 0 \text{ for every } B \text{ in } \mathscr{B}_n, \text{ for every } n. \qquad (2.29)$$
But the class of sets $B$ such that $B$ belongs to $\mathscr{B}_n$ for some $n$ is a field, and it generates $\mathscr{B}_\infty$. Hence (2.29) holds for every $B$ in $\mathscr{B}_\infty$, or,
$$\hat{\zeta} = \zeta \text{ with probability one}$$
since both variables are measurable with respect to $\mathscr{B}_\infty$.
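The canonical situation above can be made concrete on a finite sample space, where every conditional expectation is an exact average. The sketch below is only an illustration: the four fair coin tosses playing the role of $y_1, \ldots, y_4$, the particular variable `x`, and the helper `cond_on_first` are all assumptions made for the example. It checks the Martingale property of $E[x \mid y_1, \ldots, y_n]$ and that conditioning on the whole sequence recovers `x`.

```python
from fractions import Fraction
from itertools import product

N = 4
# Hypothetical y_1..y_N: fair coin tosses, so all 2**N outcomes are equally likely.
OMEGA = list(product((0, 1), repeat=N))


def x(w):
    """An arbitrary integrable variable that depends on the whole sequence."""
    return Fraction(w[0] + 2 * w[1] * w[2] + 3 * w[3])


def cond_on_first(f, n):
    """E[f | y_1..y_n]: average f over all outcomes sharing the first n tosses."""
    def g(w):
        atom = [v for v in OMEGA if v[:n] == w[:n]]
        return sum(f(v) for v in atom) / len(atom)
    return g


zetas = [cond_on_first(x, n) for n in range(N + 1)]   # zeta_0 .. zeta_N

# Martingale property: E[zeta_n | y_1..y_{n-1}] = zeta_{n-1} at every outcome.
martingale_ok = all(
    cond_on_first(zetas[n], n - 1)(w) == zetas[n - 1](w)
    for n in range(1, N + 1)
    for w in OMEGA
)
# With all N variables known, the conditional expectation is x itself.
limit_ok = all(zetas[N](w) == x(w) for w in OMEGA)
```

Here the sequence is finite, so the "limit" is reached at $n = N$; the theorem is what guarantees the analogous statement for an infinite sequence.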
Continuous Parameter Martingales: Let us now turn to continuous parameter Martingales which form the central part of our study. Let $Z(t;\omega)$ be a stochastic process, $t \in T$, where $T$ is assumed to be an interval of the real line. Let $F(t)$ be a sigma algebra of measurable sets such that $Z(t;\omega)$ is measurable $F(t)$ and let $F(t_1) \subset F(t_2)$ for $t_1 < t_2$. Then $Z(t;\omega)$ is said to be a Martingale with respect to $F(\cdot)$, or simply a Martingale if

(i) $E(|Z(t;\omega)|) < \infty$

(ii) $E(Z(t;\omega) \mid F(s)) = Z(s;\omega)$, $s < t$.

Let $\mathscr{B}(t)$ denote the sigma algebra generated by $Z(s;\omega)$ for $s \le t$. Then it is clear that $Z(t;\omega)$ continues to be a Martingale with respect to $\mathscr{B}(t)$. The Wiener process $W(t;\omega)$ is a Martingale with respect to $\mathscr{B}(t)$. In fact:

for any finite number of indices $\sigma_1, \ldots, \sigma_n \le s$. The class of sets $B$ such that

for some finite number of indices, forms a field which generates $\mathscr{B}(s)$, and hence
$$E(W(t;\omega) \mid \mathscr{B}(s)) = W(s;\omega) \text{ for } s \le t.$$
Actually more is true because non-overlapping increments are actually independent:
$$E[(W(t;\omega) - W(s;\omega))(W(t;\omega) - W(s;\omega))^* \mid \mathscr{B}(s)] = (t - s)\, I_n, \quad s < t$$
where $I_n$ is the identity matrix.
Next let $F(s)$ be any rectangular m-by-n matrix function, Lebesgue measurable. For any rectangular matrix $A$, we define:
$$\|A\|^2 = \text{Trace } AA^* = \text{Trace } A^*A$$
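As a quick sanity check of the definition $\|A\|^2 = \text{Trace } AA^* = \text{Trace } A^*A$, the sketch below evaluates both traces for a small real rectangular matrix and compares them with the sum of squared entries. The matrix `A` and the helper names are made up for the illustration; for a real matrix the adjoint is simply the transpose.

```python
def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def trace(a):
    """Sum of the diagonal entries of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


A = [[1, 2, 3],
     [4, 5, 6]]                                # a 2-by-3 example matrix

norm_sq_left = trace(matmul(A, transpose(A)))    # Trace A A*, a 2-by-2 product
norm_sq_right = trace(matmul(transpose(A), A))   # Trace A* A, a 3-by-3 product
sum_of_squares = sum(v * v for row in A for v in row)
```

All three numbers agree, which is why the same quantity serves as the squared norm whichever side the adjoint is taken on.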
Assume now that

Then

is also a Martingale with respect to $\mathscr{B}(t)$. In fact:

and for any simple function $f(\cdot)$, we know that
$$E\Big[\int_s^t f(\sigma)\, dW(\sigma;\omega) \,\Big|\, \mathscr{B}(s)\Big] = 0$$
and hence since $F(\cdot)$ is the limit of such simple functions,

Moreover we know that
We shall now generalize the stochastic integral (still in the linear set-up) to Martingales, specifically to a class of Martingales which, following Nelson [8], we shall call R2 Martingales. Since we shall only be concerned with such Martingales in the sequel, we shall not insert the qualification unless another kind is involved. By an R2 Martingale, we shall mean a Martingale with the additional property that:

where $P(s)$ is a non-negative definite matrix function, Lebesgue measurable, and

To define the stochastic integral we follow the same procedure as before. We wish to define:

for a class of rectangular (m-by-n say to be specific) matrix functions. The class will be the Hilbert space of Lebesgue measurable functions with inner-product defined by:
$$[f, g] = \int_T [f(s), g(s)P(s)]\, ds; \quad (\text{here } [a, b] = \text{Tr. } ab^*).$$
It will be convenient to keep using the generic notation $\mathscr{H}$ to denote the space. For most purposes we may assume that $P(s)$ is continuous; in any case we shall always assume that we can avoid the trivial case where it is zero almost everywhere. To define the integral we begin with simple functions which we know are dense in $\mathscr{H}$. For simple functions it is readily verified that

and the mapping being linear into $L_2(\Omega)$ of m-dimensional random variables, we can complete the definition as before. It should be noted that the basic Martingale property implies that
An Example: Let $W(t;\omega)$ denote the Wiener process, and let $F(s)$ denote an m-by-n matrix function, Lebesgue measurable such that

and
$$\int_T \|F(s)\|^2 \, ds < \infty$$
where we shall now take $T = [0, 1]$. Define

Then $Z(t;\omega)$ is an R2 Martingale. Let

which is then non-singular a.e. Let

Then the function $b(s)$ has the property

Hence we can define

Thus defined, we obtain a Wiener process in $m$ dimensions. To verify this, we note that for any two $m$ dimensional vectors (m-by-1 matrices) we have that:

so that for $s < t$

$$= (t - s)\, I_m$$
where $I_m$ is the m-by-m unit matrix. Of course $K(t;\omega)$ is Gaussian. Moreover, we have the representation (of Doob, Nelson):
For this we have only to note that the partial sums (we assume that $a(t)$ is continuous from now on):

The first term
$$= Z(t;\omega)$$
while the second term goes to zero as the maximal subdivision length goes to zero. For

and
$$\|a(t_i)(b(s) - b(t_i))F(s)\|^2 = \|a(t_i)(b(s) - b(t_i))a(s)\|^2$$
and $a(s)$ is uniformly continuous on $[0, 1]$. Finally, we note that by using Doob's Martingale inequality, we can obtain

as a continuous function of $t$, for almost all $\omega$.

Problem: Show that $Y(t;\omega)$ defined by (2.16) is such that

is an R2 Martingale. Show that if

and we assume say, $b(t) = (a(t))^{-1/2}$ is essentially bounded on $[0, 1]$, then

where $K(t;\omega)$ is a Wiener process.
Radon-Nikodym Derivatives with Respect to Wiener Measure

Let $W(t;\omega)$ denote the n-dimensional Wiener process, $0 \le t \le 1$, inducing the Wiener measure on $\mathscr{C} = C(0, 1)$ as we have indicated. Let $\varphi(\omega)$ denote a (Borel) measurable function mapping $\mathscr{C}$ into $\mathscr{C}$. Then $\varphi(\cdot)$ induces a measure on the Borel sets of $\mathscr{C}$ given by:

where $p_W$ denotes the Wiener measure. Very often we are interested in the case where $\varphi(\cdot)$ is defined by means of a stochastic integral:

where say the functions $L(\cdot)$, $M(\cdot)$ are continuous. Of particular importance is to determine when the induced measure is absolutely continuous with respect to Wiener measure, and then to evaluate the Radon-Nikodym derivative. In the present section we shall study the case where $\varphi(\cdot)$ is a linear transformation (or affine transformation).
Theorem 2.3 Suppose $\varphi(\cdot)$ is a measurable map of $\mathscr{C}$ into $\mathscr{C}$. Let $p_\varphi(\cdot)$ denote the induced measure. Let $\{\varphi_n\}$ be any complete orthonormal system in $L_2(0, 1)^{(n)}$. Let

Let $\mathscr{B}_n$ denote the sigma algebra $\mathscr{B}(\zeta_1, \ldots, \zeta_n)$. Suppose for every $B$ in $\mathscr{B}_n$, we have

In other words, $p_\varphi$ is absolutely continuous with respect to $p_W$ on $\mathscr{B}_n$, for every $n$, and $H_n(\omega)$ denotes the corresponding derivative. Then $p_\varphi$ is absolutely continuous with respect to $p_W$, and the derivative is given by
$$H(\omega) = \lim_n H_n(\omega)$$
omitting a set of Wiener measure zero.
Proof: We observe first of all that the sequence of random variables $H_n(\omega)$ is a Martingale. For, for any set $B$ in $\mathscr{B}_n$, we have that

since $\mathscr{B}_n$ is increasing with $n$. Also

Hence by Doob Martingale convergence theorem we know that $H_n(\omega)$ converges with probability one. Denote the limit by $H(\omega)$. Then of course

and if we denote by $\mathscr{B}_\infty$ the smallest sigma algebra containing every set in every $\mathscr{B}_n$, we know that

Next let $f(\cdot)$ be any element in $\mathscr{H} = L_2(0, 1)^{(n)}$.
Let $U$ be an open set in the real line, and let

Let

It is clearly enough to show that

Unfortunately however $B$ need not belong to $\mathscr{B}_\infty$. On the other hand,
$$\xi(\omega) = \lim_n \xi_n(\omega) \text{ with probability one}$$
where

since $\{\varphi_n\}$ is a complete orthonormal system. Let $\Lambda$ denote the set of convergence of the variables $\xi_n$. Then $\Lambda$ is in $\mathscr{B}_\infty$, and

since $\Lambda$ has Wiener measure one. Hence

Now let
$$A_k = [\omega \mid \xi_k(\omega) \in U]$$
and let
$$A = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k$$
Then

and

Hence

as we started to show.
Corollary: Suppose $\varphi(\omega)$ is a linear transformation such that for any $f(\cdot)$ in $\mathscr{H}$, and any linear Borel set $U$,

where $h = Mf$; and $M$ is a linear bounded transformation mapping $\mathscr{H}$ into itself. Suppose
$$M^*M = (I + J)^{-1}$$
where $J$ is Hilbert-Schmidt. Then $p_\varphi$ is absolutely continuous with respect to Wiener measure.
Proof: Since $J$ is compact and self-adjoint by definition, let

denote the orthonormalized eigenfunctions, including those corresponding to zero eigenvalues, so that

is a complete orthonormal system. Let $\gamma_n$ denote the corresponding eigenvalues. Then we observe that the infinite product

converges. Let $\gamma$ denote its value. Define

Let

Then it is not difficult to verify that for $B$ in $\mathscr{B}(\zeta_1, \ldots, \zeta_n)$,

Hence the theorem applies. We can actually evaluate the limit. For this observe that

The numerical factor in front converges to

While in the random exponent:

is a sum of independent random variables, each of mean zero, while

and hence converges. Hence the sum converges with probability one. Hence
Example: Let $x(t;\omega)$ denote a continuous solution of the stochastic equation:

where let us assume that $A(t)$ is continuous in $[0, 1]$. Then the measure induced by the process $x(t;\omega)$ on $\mathscr{C}$ is absolutely continuous with respect to Wiener measure. In this case we can explicitly write the solution as:

which then defines the mapping $\varphi(\cdot)$ explicitly. This shows that the mapping $M$ is given by

where

Since $K$ is thus a Volterra operator, we note that $(I + K)$ has a bounded inverse. In fact a simple calculation shows that

where $L$ is again a Volterra operator defined by:

Hence we have
$$(M^*M)^{-1} = (I - L^*)(I - L)$$
Thus
$$J = L^*L - (L + L^*)$$
and is obviously Hilbert-Schmidt.
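A finite-dimensional analogue may help make the operator algebra in this example concrete. In the sketch below a strictly lower-triangular matrix stands in for a Volterra operator; this discretization, the particular entries of `K`, and the assumption that the inverse is written as $(I + K)^{-1} = I - L$ are all illustrative choices, not taken from the text. The code checks that $L$ is again strictly lower triangular and that $I + J = (I - L^*)(I - L)$ with $J = L^*L - (L + L^*)$.

```python
from fractions import Fraction


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transpose(a):
    return [list(col) for col in zip(*a)]


n = 4
I = identity(n)
# Strictly lower-triangular K: a discrete stand-in for a Volterra kernel k(t, s), s < t.
K = [[Fraction(i + j + 1, 10) if j < i else Fraction(0) for j in range(n)]
     for i in range(n)]

# Neumann series (I + K)^{-1} = I - K + K^2 - ...; it terminates because K^n = 0.
inverse, term = I, I
minus_K = [[-v for v in row] for row in K]
for _ in range(n - 1):
    term = matmul(term, minus_K)
    inverse = add(inverse, term)

L = add(I, inverse, sign=-1)     # defined so that (I + K)^{-1} = I - L
J = add(matmul(transpose(L), L), add(L, transpose(L)), sign=-1)   # L*L - (L + L*)

inverse_ok = matmul(add(I, K), inverse) == I
L_is_volterra = all(L[i][j] == 0 for i in range(n) for j in range(i, n))
factorization_ok = add(I, J) == matmul(add(I, transpose(L), sign=-1),
                                       add(I, L, sign=-1))
```

The zero diagonal of `L` is the matrix counterpart of the fact, used later in the text, that a trace-class Volterra operator has zero trace.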
It is pertinent to ask whether (2.32) can be expressed as a functional on the (Wiener) process without having to go through the eigenfunctions. Before we do this in general, it is interesting to consider the special case where

Then the main simplification is that $(L + L^*)$ is trace-class. Indeed

so that the operator has finite-dimensional range, and

Hence $J$ is trace-class, since $L^*L$ being the product of Hilbert-Schmidt operators is clearly trace-class.
The implication of this is that now,

so that the infinite product

converges. Indeed we can actually evaluate this product by the general formula that

which is valid whenever $(L + L^*)$ is trace-class. A proof is given in the Appendix. Since the exponent in (2.32) is convergent, and (2.35) holds,

converges also, with probability one. We can go on to evaluate the limit. For this let us note that

Since

and
and letting

we have:

Hence we obtain finally that

Multiplying by $\zeta_k(\omega)$ and summing on $k$, the second term clearly yields
$$-[W(1;\omega), AW(1;\omega)]$$
while the first term

For this we only need to note that

and

which goes to zero. Hence finally

Hence we have:

We shall verify this answer in another way eventually.
So far we have been concerned with mapping $\mathscr{C}$ into $\mathscr{C}$. This is of course not essential. Thus let $Y(t;\omega)$ be any stochastic process with continuous sample functions for almost all $\omega$. Then

for $0 \le t \le 1$, is a measurable map into $\mathscr{C}$, measurable with respect to Borel sets in $\mathscr{C}$. Hence we can define an induced measure $p_\varphi$ as before. Of particular interest is the case where $Y(t;\omega)$ is a Gaussian process and the stochastic integral

can be defined. We can then state the following theorem.

Theorem 2.4 Let $Y(t;\omega)$ be a Gaussian process with continuous sample paths. Let the process dimension be $n$. Suppose the stochastic integral

can be defined for every $f(\cdot)$ in $L_2(0, 1)^{(n)}$ such that

where $R$ is a linear transformation of $L_2(0, 1)^{(n)}$ into itself, and has the form:

where $J$ is a self-adjoint Hilbert-Schmidt operator. Then the measure induced by the process is absolutely continuous with respect to Wiener measure.

Proof: Let $c$ denote the generic element in $\mathscr{C}$, and let $W(t;c)$ denote the Wiener process,

We need only note that for any $f(\cdot)$ in $L_2(0, 1)^{(n)}$, and any linear Borel set $U$,

The proof can clearly proceed exactly as in Theorem 2.3, using (2.39).
Example: Let us consider the process $Y(t;\omega)$ defined by (2.16), and take the case where $L = 1$ and
$$D(t)D(t)^* > 0 \quad \text{a.e.}$$
and assume that
$$b(t) = (D(t)D(t)^*)^{-1/2}$$
is essentially bounded. Let

Then the measure induced by the $\hat{Y}(\cdot\,;\cdot)$ process is absolutely continuous with respect to Wiener measure. Indeed

so that, introducing the operator

we have that

where

Now because $(R - I)$ is compact and self-adjoint, it is clear that $R$ has a bounded inverse unless for some non-zero $f(\cdot)$
$$[Rf, f] = 0$$
Or, equivalently,
$$[(D + K)bf, (D + K)bf] = 0$$
or
$$(D + K)bf = 0$$
Or, with slight abuse of notation:
$$D(t)^* b(t) f(t) + Kbf = 0$$
But, multiplying by $D(t)$, and noting that

we obtain that

But this implies that $(bf)$ must be zero, since the second term involves a Volterra operator. Since $b(t)$ is non-singular, $f$ must be zero. Hence $Rf$ is zero only if $f$ is zero, or $R$ has a bounded inverse. Further it is clear that
$$R^{-1} = I + J$$
where $J$ is self-adjoint, and Hilbert-Schmidt.
As a final example of Radon-Nikodym derivatives, let us consider the case of a non-zero mean. Thus let

where $m(\cdot)$ is in $L_2[0, 1]^{(n)}$. Then the measure induced by the process $Y(t;\omega)$ is absolutely continuous with respect to Wiener measure. For this, let $\{\varphi_k\}$ be a complete orthonormal system and it is readily seen that for $B$ in $\mathscr{B}(\zeta_1, \ldots, \zeta_n)$, where, as usual:

we have, denoting the new measure by $p_Y$,

where

from which it follows that the Radon-Nikodym derivative is given by:
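For orientation, the classical Cameron-Martin form of such a derivative is recalled below. It is written under the assumption that the process has the form $Y(t;\omega) = W(t;\omega) + \int_0^t m(s)\,ds$; the symbols $m_k$ and $\zeta_k$ are introduced here for the coordinates with respect to the orthonormal system and are notational assumptions rather than a transcription of the original display.

```latex
% Assumption: Y(t;\omega) = W(t;\omega) + \int_0^t m(s)\,ds,  m(\cdot) \in L_2[0,1]^{(n)}.
% \{\varphi_k\}: complete orthonormal system,  m_k = [m, \varphi_k],
% \zeta_k(\omega) = \int_0^1 [\varphi_k(t), dW(t;\omega)].
\begin{align*}
H_n(\omega) &= \exp\Big\{ \sum_{k=1}^{n} \big( m_k\,\zeta_k(\omega) - \tfrac{1}{2} m_k^2 \big) \Big\}, \\
H(\omega)   &= \lim_{n \to \infty} H_n(\omega)
             = \exp\Big\{ \int_0^1 [m(t), dW(t;\omega)]
               - \tfrac{1}{2} \int_0^1 \|m(t)\|^2 \, dt \Big\}.
\end{align*}
```

Each factor of $H_n$ is the ratio of a unit-variance Gaussian density with mean $m_k$ to one with mean zero, which is why the finite-dimensional derivatives multiply and the exponent converges when $\sum_k m_k^2 = \|m\|^2$ is finite.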
Ito Integral

We shall now study one of the main tools of much of our theory: namely the Ito integral. The Ito integral is the non-linear version of the stochastic integrals we have considered, in that the integrand will now be a random process also. Let $Z(t;\omega)$ denote an R2 Martingale, with $F(t)$ the increasing sigma-algebra and

We wish to define the integral

where $f(t;\omega)$ is an m-by-n matrix valued random process with the following properties:

(i) $f(t;\omega)$ is measurable jointly in $t$ and $\omega$, in $t$ with respect to Lebesgue measure; $f(t;\omega)$ is measurable $F(t)$ for each fixed $t$.

The significance of this is that $f(t;\omega)$ depends only on the 'past' of the process $Z(t;\omega)$. Sometimes this is indicated by saying that $f(t;\omega)$ is 'non-anticipatory', or 'physically realizable'. Let $\mathscr{H}$ denote the class of such functions; it is clearly a linear space.
Introduce an inner-product in $\mathscr{H}$ by:

(Here let us recall that in the integrand: $[a, b] = \text{Tr. } ab^* = \text{Tr. } a^*b$.)

By a "simple" function in $\mathscr{H}$ we shall mean a function of the form:

It is implicit of course that $v_i(\omega)$ is measurable $F(t_i)$. As in the linear case, we define the integral for such a simple function by:

Note that

$$= E(E(\text{ditto} \mid F(t_{j+1}))) \text{ if } t_{j+1} \le t_i, \text{ and hence zero,}$$
$$= \text{Tr. } E\big(v_i (Z(t_{i+1};\omega) - Z(t_i;\omega))(Z(t_{i+1};\omega) - Z(t_i;\omega))^* v_i^*\big) = \int_{t_i}^{t_{i+1}} E[v_i, v_i P(s)] \, ds, \text{ for } i = j.$$
Hence, by a similar calculation as in the linear case we see that for any two simple functions, $f(t;\omega)$, $g(t;\omega)$ in $\mathscr{H}$, we have:
$$E\Big(\Big[\int_0^1 f(t;\omega)\, dZ(t;\omega), \int_0^1 g(t;\omega)\, dZ(t;\omega)\Big]\Big) = [f, g]$$
In other words we have an inner-product preserving linear transformation from the class of simple functions in $\mathscr{H}$ into the space $L_2(\Omega)$ of matrix valued (m-by-n) $\omega$-random variables with finite second moment. It only remains to show that $\mathscr{H}$ is a Hilbert space, and that the simple functions are dense in it. Let us note first that given any function $f(t;\omega)$ in $\mathscr{H}$, we can find a sequence of functions $f_n(t;\omega)$ in $\mathscr{H}$ such that $f_n(t;\omega)$ is uniformly continuous in $t$ in the mean square sense, and such that:

For this we only need to define

Then when $P(s)$ is the Identity matrix for example, we have
so that

$$\to 0 \text{ as } |\Delta| \to 0, \text{ uniformly in } t.$$
The more general case is handled similarly. Next let $g(t;\omega)$ be in $\mathscr{H}$, and uniformly continuous in the mean. Define

Then it is easy to see that

Since the closure of the class of simple functions is already a Hilbert space, this is enough to show that $\mathscr{H}$ is a Hilbert space.
Here is an elementary canonical example which illustrates the difference between the ordinary integral and the Ito integral. Let $W(t;\omega)$ be the Wiener process. Then the Ito integral
$$\int_0^1 [W(t;\omega), dW(t;\omega)]$$
is clearly definable. Let us attempt what is essentially an integration by parts. For this we begin with an approximating finite sum:
$$\sum_{i=0}^{m-1} [W(t_i;\omega),\; W(t_{i+1};\omega) - W(t_i;\omega)], \quad t_0 = 0 < \cdots < t_i < t_{i+1} < \cdots < t_m = 1,$$
which, since $W(t_i;\omega) = \sum_{j=0}^{i-1} (W(t_{j+1};\omega) - W(t_j;\omega))$, we can rewrite as:
$$\sum_{i=0}^{m-1} \sum_{j=0}^{i-1} [W(t_{j+1};\omega) - W(t_j;\omega),\; W(t_{i+1};\omega) - W(t_i;\omega)]$$
and by interchanging the order of summation, we have that this is the same as:
$$= \sum_{j=0}^{m-1} [W(t_{j+1};\omega) - W(t_j;\omega),\; W(1;\omega) - W(t_{j+1};\omega)]$$
$$= \sum_{j=0}^{m-1} [W(t_{j+1};\omega) - W(t_j;\omega),\; W(1;\omega)] - \sum_{j=0}^{m-1} [W(t_{j+1};\omega) - W(t_j;\omega),\; W(t_{j+1};\omega)]$$
and since the first term in this
$$= [W(1;\omega), W(1;\omega)]$$
and the second term is the sum we originally started with, together with $\sum_j \|W(t_{j+1};\omega) - W(t_j;\omega)\|^2$, we have
$$\sum_{i=0}^{m-1} [W(t_i;\omega),\; W(t_{i+1};\omega) - W(t_i;\omega)] = \tfrac{1}{2} [W(1;\omega), W(1;\omega)] - \tfrac{1}{2} \sum_{i=0}^{m-1} \|W(t_{i+1};\omega) - W(t_i;\omega)\|^2$$
Now since the sum on the left converges as the subdivision size shrinks, the second term on the right converges, in the mean square sense also. But
$$E(\|W(t_{i+1};\omega) - W(t_i;\omega)\|^2) = n(t_{i+1} - t_i)$$
and hence
$$\sum_{i=0}^{m-1} \|W(t_{i+1};\omega) - W(t_i;\omega)\|^2 \to n \text{ in the mean square sense.}$$
Hence
$$\int_0^1 [W(t;\omega), dW(t;\omega)] = \tfrac{1}{2} [W(1;\omega), W(1;\omega)] - \tfrac{n}{2}$$
The significant point is the appearance of the second term, which is a characteristic feature of the Ito integral.
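In the scalar case ($n = 1$) the finite-sum identity behind this example reads $\sum_i W(t_i)(W(t_{i+1}) - W(t_i)) = \tfrac12 W(1)^2 - \tfrac12 \sum_i (W(t_{i+1}) - W(t_i))^2$, and the sum of squared increments is close to 1 for a fine subdivision. The sketch below simulates one Wiener path to illustrate both facts; the uniform grid, the number of steps and the seed are arbitrary choices made for the illustration.

```python
import math
import random


def ito_sum_check(m=4000, seed=1):
    """Left-endpoint (Ito) sum for one scalar Wiener path on a uniform grid of [0, 1]."""
    rng = random.Random(seed)
    dt = 1.0 / m
    w = 0.0            # W(0) = 0
    ito_sum = 0.0      # running sum of W(t_i) * (W(t_{i+1}) - W(t_i))
    quad_var = 0.0     # running sum of (W(t_{i+1}) - W(t_i))**2
    for _ in range(m):
        dw = rng.gauss(0.0, math.sqrt(dt))   # independent N(0, dt) increment
        ito_sum += w * dw
        quad_var += dw * dw
        w += dw
    return ito_sum, quad_var, w


ito_sum, quad_var, w1 = ito_sum_check()
# The algebraic identity holds path by path, up to floating-point rounding.
identity_gap = abs(ito_sum - 0.5 * (w1 * w1 - quad_var))
```

The identity is exact for every path and every subdivision; only the statement that `quad_var` is near 1 is probabilistic, and it is this term that survives in the limit as the extra $-\tfrac{n}{2}$.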
Problem: Show that if we define the partial sum slightly differently:

we get a completely different answer in the limit. In fact the difference
$$\sum_{i=0}^{m-1} \Big[W\Big(\frac{t_i + t_{i+1}}{2};\omega\Big) - W(t_i;\omega),\; W(t_{i+1};\omega) - W(t_i;\omega)\Big] \to \tfrac{1}{2} (\text{Tr. } I)$$
More generally, taking

show that the limit of the partial sums:

Hence the partial sums:

converge to
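The size of the gap between the midpoint sum and the left-endpoint sum can be seen numerically in the scalar case, where $\text{Tr. } I = 1$ and the difference $\sum_i (W(\tfrac{t_i + t_{i+1}}{2}) - W(t_i))(W(t_{i+1}) - W(t_i))$ should be close to $\tfrac12$. The following is a simulation sketch, not a solution of the problem; the uniform grid, step count and seed are assumptions of the example.

```python
import math
import random


def midpoint_minus_left(m=4000, seed=2):
    """Difference between midpoint and left-endpoint sums for one scalar Wiener path."""
    rng = random.Random(seed)
    half_std = math.sqrt(0.5 / m)       # std. dev. of an increment over half a step
    diff = 0.0
    for _ in range(m):
        a = rng.gauss(0.0, half_std)    # W(midpoint) - W(t_i)
        b = rng.gauss(0.0, half_std)    # W(t_{i+1}) - W(midpoint)
        diff += a * (a + b)             # (W(mid) - W(t_i)) * (W(t_{i+1}) - W(t_i))
    return diff


gap = midpoint_minus_left()
```

Each term has mean equal to half the step length, so the expected total is $\tfrac12$, and the variance of the total is of order $1/m$.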
Let us now prove a useful generalization of this result. Thus let $W(t;\omega)$ denote the Wiener process, and let $H(t;\omega)$ be an n-by-1 process defined by:

where $L(\cdot)$ is continuous. Let us calculate the Ito integral:

Let us begin with an approximating sum:

As before,

Transposing, and taking limits, and noticing that

defines in the limit the Ito integral:

we have:

As before the limit exists in the mean square sense, and we shall now show that the limit is actually equal to:

But this follows readily from the fact that:
=
f
T r . L(t) d t
Hence E ( ( g [[si"L(s) 0 't.
dW(.;-)
,
r"
L(t)dt
Y(S;UI)]
1
1
and
'i+ 1 L(s)dW(s;w), t.
A. V:
sit'
Balakrishnan
dW(s;w)] 2 )
ti
ti+l T r . L ( s ) L ( s ) * d s )n(titl -ti) t.
so that (2.46) clearly goes to zero as the maximal length of subdivision goes to zero.

Hence we finally have the result:
c[\i
1,- (
dW(s:w), L(t)dW(t:w)
T r . L ( t ) dt..
....
and of course again, the unusual thing to note is the appearance of the third term on the right. This can clearly be generalized into a more symmetric form:
-b
T r . L(t)M(t)*dt
Note that the right side can be put in the form:
-
1
Tr.
L(t)M(t)dt.
..
but the integral (first term) on the right has to be interpreted just in the way we got it by adding the first terms on the right in (2.48).
R-N Derivatives using Ito Integrals

We shall now see how to express the Radon-Nikodym derivatives in terms of Ito integrals. First of all, let us consider the special result (2.38). Using (2.48), we have (by setting $L(t) = A$; $M(t)$ = Identity; and $A = A^*$):
$$\int_0^1 [AW(t;\omega), dW(t;\omega)] = \tfrac{1}{2} [AW(1;\omega), W(1;\omega)] - \tfrac{1}{2} \int_0^1 \text{Tr. } A \, dt$$
and substituting into (2.38) we have:

(2.49)

and we have the advantage that the Trace term disappears. This expression for the Radon-Nikodym derivative is actually valid for the general case of (2.33). But here let us consider the case where $J$ in Theorem 2.4 is trace-class. We begin with a theorem of interest in itself.

Theorem 2.5
Let $L$ denote a Volterra operator, mapping $L_2(0, 1)^{(n)}$ into itself:
$$Lf = g; \quad g(t) = \int_0^t L(t;s) f(s) \, ds; \quad L(t;s) \text{ continuous,}$$
where $L$ is also trace-class. Then for any orthonormal system

we have:

where

and the convergence of the infinite series is at least in the mean of order two.
Proof: First let us consider the convergence of the series; using the fact that the $\{P_k\}$ are independent, zero mean Gaussian, and the fact that $L$ is trace-class we can readily verify that

Moreover

converges in the mean of order two. Note also that

since $L$ being Volterra, its trace must be zero. Let us next consider the Ito integral on the left of (2.51). For an approximating finite sum:

we have, by using the Shepp expansion:

Define the operator $L_n$ by

Then $L_n$ is clearly finite dimensional, and hence trace-class. The sum in (2.53) can be written
Let
m
n
=
CC~,G,LL(,.~~I 1
1
Then
and more important
using ( 2 . 5 2 ) .
But
But the Hilbert-Schmidt norm of $(L - L_n)$ clearly goes to zero, and hence (2.53a) goes to zero.
Theorem 2.6 Let $Y(t;\omega)$, $0 \le t \le 1$, be a Gaussian process with continuous sample paths such that the stochastic integral

can be defined for each $f(\cdot)$ in $L_2(0, 1)^n$ where

for $f(\cdot)$, $g(\cdot)$ in $L_2(0, 1)^n$, where

and $L$ has the form:

$M(\cdot)$, $L(\cdot)$ being continuous square matrices. Further it is assumed that $J$ is trace-class. Then the measure induced on $\mathscr{C}$ by the mapping:

is absolutely continuous with respect to Wiener measure, and the derivative is given by:
$$H(v) = \exp\Big\{ -\tfrac{1}{2} \int_0^1 [W_1(t;v), W_1(t;v)] \, dt + \int_0^1 [W_1(t;v), dW(t;v)] \Big\} \qquad (2.54)$$
where
$$W_1(t;v) = M(t) \int_0^t L(s) \, dW(s;v)$$
and $W(t;v)$ is the Wiener process with
$$W(t;v) = v(t), \quad v(\cdot) \in \mathscr{C}$$
Proof: Let $\{\varphi_k\}$ denote the complete orthonormal system of eigenvectors of $J$ with corresponding eigenvalues $\gamma_k$. Then as we know from Theorems 2.3 and 2.4, the Radon-Nikodym derivative is given by:
where

Also from (2.36), we have, since if $J$ is trace-class,
nJGk) Now w e c a n write:
where J
and
Pk
= fk
=
exp
--21
T r . (L t L*)
+
s o i s (L L::):
It is readily seen that

while writing:

where

we note that $P$ is finite dimensional, and hence trace-class, and hence $Q$ must be trace-class, and $Q$ being Volterra, its trace must be zero. Hence we obtain:
$$\text{Tr. } (L + L^*) = \text{Tr. } P = \int_0^1 \text{Tr. } L(t) M(t) \, dt \qquad (2.55)$$
We thus obtain, $Q$ being now Volterra and trace-class,
and using Theorem 2.5 we have that

But from (2.48) we have that
- Tr.
lo
Hence, substituting, we finally obtain:
1
M(t)* L(t)* dt
-
k,
T r . M(t)* L ( t ) d t
from which (2.54) readily follows, upon using (2.55).
Example: Suppose in (2.14), we take the special case where:

(i) $D(s)D(s)^*$ = Identity matrix.

(ii) $B(s)D(s)^* = 0$, $0 \le s \le 1$.

Then $R$ in (2.19) becomes
$$R = I + K^*K$$
This implies that

where $J$ is trace-class, since $K^*K$ is. By a theorem of Krein [it], this implies that

where $L$ is Volterra. After we do Kalman filtering, we shall see an alternate method for actually obtaining $L$. Once we can represent $L$ as:

Lf = g; g(t) = M(t)

we can apply Theorem 2.6 directly to obtain the R-N derivative in this case.
Example: Kalman Filtering

Perhaps the most significant application of stochastic differential systems at a useful level is the theory originally due to Kalman and since associated with his name. The approach here is quite different from the original versions based on Wiener-Hopf equation analysis, and considerably simpler. We begin with a fundamental property of Martingales. (Cf. Nelson, Doob).

Lemma: Let $Z_i(t;\omega)$, $i = 1, 2$ denote two Martingales with respect to the growing sigma algebra $F(t)$, and let:

Suppose that for $i$, $j$ fixed, $0 \le t < 1$:
$$\lim_{\Delta \to 0} E\Big( \frac{(Z_i(t + \Delta;\omega) - Z_i(t;\omega))(Z_j(t + \Delta;\omega) - Z_j(t;\omega))^*}{\Delta} \,\Big|\, F(t) \Big) = P_{ij}(t)$$
where the convergence of the random variable on the left is in the mean order one ($L_1$), and it is assumed that $P_{ij}(t)$ is continuous in $t$. Then for $0 \le s < t \le 1$,

(2.57)
Proof: Let $0 \le a < 1$, and define

Observe that for any $\Delta > 0$, $t + \Delta < 1$, we have:

where we have used the fact that:

We can now follow the argument of Nelson (8). Let $\varepsilon > 0$ be given. Let $J$ denote the set of points in $[a, 1]$ such that for $t$ in $J$:
$$E(\|y(t) - y(a)\|) \le \varepsilon \, (t - a) \qquad (2.58a)$$
Clearly $J$ contains the point $a$. Also $J$ is closed. Now we shall show that if $t$ is any point for which (2.58a) holds, then it will hold for $t + \delta$, $0 < \delta < \Delta$, for some $\Delta > 0$. For this, we note that

and let us choose $\Delta_1$ such that for all $\delta < \Delta_1$,

Next let us choose $\Delta_2$ such that for all $\delta < \Delta_2$

Choosing $\Delta$ to be the minimum of $\Delta_1$ and $\Delta_2$, we have:

for $0 < \delta < \Delta$. Hence (2.58a) follows for $t + \delta$, $0 < \delta < \Delta$. In particular this is true for $t = a$. Suppose now that the least upper bound of $t$ such that (2.58a) holds for $[a, t]$ is $t_0$, say. Then $t_0$ must belong to $J$ since $J$ is closed. But if $t_0$ is not equal to 1, we will have a contradiction because it can be extended by a non-zero amount. Since $\varepsilon$ is arbitrary, it is clear that:

which is clearly enough to prove the Lemma.
Corollary:
Assume now that the martingales $Z_i(s;\omega)$, $0 \le s \le 1$, are Gaussian, and that
Assume further that (2.56) holds, and that (2.57) holds for $i = j = 2$, and for $i = 1$, $j = 2$. Let $B_2(t)$ denote the smallest sigma-algebra generated by $Z_2(s;\omega)$, $s \le t$. Then
where $r_{12}(s)$ is defined as the limit:
\[ r_{12}(s) = \lim_{\epsilon \to 0} P_{12}(s) (P_{22}(s) + \epsilon I)^{-1} \qquad \text{a.e. } 0 < s < 1 \]
It should be emphasized that it is not assumed that (2.57) holds for $i = j = 1$.
Proof
First let us prove that for any Lebesgue measurable matrix function $f(\cdot)$, having the same dimension as $P_{12}(s)$, such that
\[ \int_0^1 \mathrm{Tr.}\, f(s)^* f(s) P_{22}(s)\, ds < \infty \qquad (2.61) \]
we have that:
For this it is enough to note that for $0 < s < t < 1$,
from (2.58); hence it is immediate that (2.62) will hold for simple functions, and hence by the usual limiting arguments for any $f(\cdot)$ satisfying (2.61). Next if
we have that: L(o;t) satisfies (2.61), and that:
By Fatou's Lemma, it follows that $r_{12}$ satisfies (2.61). Hence the integral:
is well defined. Let $f(\cdot)$ satisfy (2.61). Then
But
which is trivially true at a point where
and otherwise because if
as can be verified from (2.57). Hence
is uncorrelated with, and being Gaussian, independent of,
for every $f(\cdot)$ satisfying (2.61). But since the latter generate $B_2(t)$, (2.59) follows.
Let us consider now the linear stochastic equation (cf. (2.16)):
where we shall for simplicity assume that all coefficients are continuous. Further we shall assume:
\[ D(s) D(s)^* > 0 \quad \text{on } [0, 1] \]
Note that this implies that
is continuous.
Then as we have seen, defining:
we have:
where $\hat{W}(s;\omega)$ is now a Wiener process. Let $W(t;\omega)$ be measurable with respect to the growing sigma-algebra $\mathcal{F}(t)$. Then of course $x(t;\omega)$, $Y(t;\omega)$, $\hat{Y}(t;\omega)$, $\hat{W}(t;\omega)$ are all measurable $\mathcal{F}(t)$. Let
$B_Y(t)$ = sigma-algebra generated by $Y(s;\omega)$, $s \le t$,
and similarly define $B_{\hat{Y}}(t)$. Then since
we have that
\[ B_Y(t) = B_{\hat{Y}}(t) \]
Since we have that
\[ E(|x(t;\omega)|^2) < \infty \]
we can define the conditional expectation:
\[ \hat{x}(t;\omega) = E(x(t;\omega) \mid B(t)) \]
where from now on we write $B(t)$ for $B_Y(t) = B_{\hat{Y}}(t)$.
Observe now that:
But the left side is a martingale in $s$, $s < t$, for fixed $t$, and converges with probability one as $s$ goes to $t$. But $Y(t;\omega)$ being continuous with probability one, we note that
$B(t)$ = smallest sigma-algebra containing $B(s)$ for every $s < t$
and hence the limit must be $\hat{x}(t;\omega)$. Thus $\hat{x}(t;\omega)$ is continuous from below with probability one. Hence by a theorem of Doob ([2], Theorem 2.6, p. 61, and remark on p. 62), there is a process equivalent to $\hat{x}(t;\omega)$ which is jointly measurable in $t$ and $\omega$, and we may clearly redefine $\hat{x}(t;\omega)$ to be this process. And since
it follows that the integral
is well-defined (converges a.s.).
Lemma:
The process:
\[ \hat{x}(t;\omega) - \int_0^t A(s) \hat{x}(s;\omega)\, ds = Z_s(t;\omega), \qquad 0 \le t < 1 \]
is a Gaussian martingale. Moreover
Proof
Now for $t > s$, we have:
\[ E(\hat{x}(t;\omega) \mid B(s)) = E\big( E(x(t;\omega) \mid B(t)) \mid B(s) \big) = E(x(t;\omega) \mid B(s)) \]
Hence
But the first term is zero because $B(s) \subset \mathcal{F}(s)$, and $\int_0^t B(\sigma)\, dW(\sigma;\omega)$ is a martingale; the second term is zero, using (2.65).
Lemma: Let
Then $Z_0(t;\omega)$ is a Gaussian martingale. Moreover:
Proof
We have only to note that:
so that
and just a s in the previous lemma,
Let us use the notation:
and observe that
since
\[ E(|e(s;\omega)|^2) \le E(|x(s;\omega)|^2) \]
and is bounded in $0 < s < 1$. Again:
Hence clearly:
Hence (2.66) follows.
Lemma:
Let
Then rtt A
Proof
rt+A
Since Zo(s;w) satisfies (2.57) and Zs(t;w) satisfies (2.56), it
only remains to calculate (2.68). We have seen that:
It is immediate that
Next for any $\Delta > 0$, $e(t+\Delta;\omega)$ is uncorrelated with $Y(s;\omega)$, $s \le (t+\Delta)$, and hence with $Z_0(s;\omega)$, $s < (t+\Delta)$. It is also uncorrelated with (and hence also independent of) the random variables generating $B(t)$. Hence
Since
we have:
and
E ( r ! ( ~ ) d W ( ~ ; (~ f t
=
+ ol&
.
since
Hence (2.68) follows by taking the limit as $\Delta$ goes to zero. Note that a.e. in (2.68) is necessary since we cannot (at this stage) assert that $e(t;\omega)$ is continuous in $t$. Eventually we shall show that it is. (Cf. (2.83)).
Lemma:
Under the assumption that
we can write:
where
Proof
Let us first note that
where
Define the operator $L$ by
Then i t follows that
where
Here f ( . ) i s a n m-by-q matrix function, and p(.) i s m-by-q consistent with the choice of dimensions on p, 37
where h(.) i s m-by-n
and p(.) i s m-by-q.
.
Specifically:
Next
\[ E\Big( x(t;\omega) \Big( \int_0^t q(s)\, dY(s;\omega) \Big)^* \Big) \]
where
Now
where
where $k$ stands for the function $k(t;\cdot)$. Hence for (2.69) to hold it is necessary and sufficient that
\[ Z(s) = 0; \]
or,
\[ L^* L k = L^* v \]
But because $D(s)D(s)^*$ is positive, $L^* L$ has a bounded inverse, so that the first part of our result, that there exists a function $k(t;s)$ satisfying (2.69) and such that
for each $0 < t < 1$, is immediate. We need however to show that the double integral in (2.69a) makes sense and that it is finite. For this we proceed to a closer examination of (2.73). Thus we note that if $Lf = g$, we can write:
The point in doing this is that the first two terms are independent of $t$. Again if we denote by
the operator yielding the first two terms, we have
where
Also, the function $r(\cdot)$ in (2.72) can be expressed:
Hence (2.73) can be written:
where
We now exploit the fact that $h(t;s)$ factors into a function of time and a function of $s$:
h(t;s) =
(.-fk ( t ; u ) C ( u ) P ~ ) dt~ ((t]) U(S)
But since $D(s)D(s)^*$ is assumed positive, we note that $L^*$ has a bounded inverse in $L_2(0, 1)$, and so does $L$; moreover if
we have
Since the right side is 'identity plus Volterra operator', we can use a Neumann expansion to find the inverse; and similarly for $L$. But since each of these operations does not involve $t$, it is readily seen that this implies that it is possible to express $k(t;s)$ as
\[ k(t;s) = k_1(t) k_2(s) \qquad \text{(where } k_1(\cdot) \text{ is } m\text{-by-}m\text{)} \]
Hence substituting this into (2.73), and taking advantage of the forms in (2.75) and (2.76) we must have:
And $k_1(t)$ must satisfy:
But since $\phi(t)$ is nonsingular, it follows that both $k_1(t)$ and
are nonsingular, and hence
from which (2.69a) follows. Of course we have obtained more than we sought to prove.
Now we can prove one of the main theorems ([7], [11]).
Theorem 2.7
Under the assumption that
for every $s$, $0 \le s \le 1$, we have for every $t$, $0 \le t \le 1$,
\[ B(t) = \text{smallest sigma-algebra generated by } \{ Z_0(s;\omega),\ s \le t \} \qquad (2.79) \]
Proof
Let us recall that
Hence for any $q$-by-$q$ matrix function $f(\cdot)$ in (appropriate dimensional) $L_2(0, t)$ space, we have:
But using (2.69)
Introduce now the operator:
which maps $L_2(0, t)$ into itself. More importantly, in view of (2.69a) it differs from the identity operator by a Volterra operator with a square integrable kernel. Hence $H$ has a bounded inverse. Hence for any $g(\cdot)$ in $L_2(0, t)$,
Hence the random variables
are measurable with respect to the smallest sigma-algebra generated by $\{ Z_0(s;\omega),\ s \le t \}$, and hence $B(t)$ is contained in that algebra. This proves (2.79) since obviously the right side of (2.79) is contained in $B(t)$.
Theorem 2.8
Proof
First of all, using (2.59), taking $Z_1(t;\omega)$ therein to be $Z_s(t;\omega)$, and $Z_2(t;\omega)$ to be $Z_0(t;\omega)$, and making use of (2.66), (2.68), we have that the conditional expectation of $Z_s(t;\omega)$ with respect to the smallest sigma-algebra generated by $\{ Z_0(s;\omega),\ s \le t \}$ is given by the right side of (2.81). But this algebra, by Theorem 2.7, is the same as $B(t)$, and $Z_s(t;\omega)$ is of course measurable with respect to $B(t)$. Hence (2.81) follows. Finally we note that:
Corollary 1.
Let $\hat{R}(t) = E(\hat{x}(t;\omega)\hat{x}(t;\omega)^*)$. Then $\hat{R}(t)$ is absolutely continuous and
\[ \dot{\hat{R}}(t) = A(t)\hat{R}(t) + \hat{R}(t)A(t)^* + (P(t)C(t)^* + B(t)D(t)^*)(D(t)D(t)^*)^{-1}(C(t)P(t) + D(t)B(t)^*) \qquad (2.82) \]
Proof
We have only to note that by the theorem,
and the result follows by specializing (2.13), and using (2.66).
Corollary 2.
$P(t)$ is absolutely continuous and $P(0) = 0$, and
Proof
We have only to note that
and using (2.13) and (2.82), the result follows.
The equations (2.80), (2.83), (2.84) are the Kalman filtering equations.
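The error-covariance equation can be explored numerically. The following is a minimal sketch, assuming the covariance obeys the standard Kalman-Bucy Riccati equation $\dot{P} = AP + PA^* + BB^* - (PC^* + BD^*)(DD^*)^{-1}(CP + DB^*)$, $P(0) = 0$, which matches the gain structure visible in (2.82). The function names, the forward-Euler scheme and the scalar test system are illustrative assumptions, not part of the text.

```python
import numpy as np

def filter_riccati_rhs(P, A, B, C, D):
    """Right-hand side of the (assumed) error-covariance Riccati equation."""
    cross = P @ C.T + B @ D.T               # P C* + B D*
    gain = cross @ np.linalg.inv(D @ D.T)   # (P C* + B D*)(D D*)^{-1}
    return A @ P + P @ A.T + B @ B.T - gain @ cross.T

def integrate_covariance(A, B, C, D, t_final=10.0, dt=1e-3):
    """Forward-Euler integration of the covariance starting from P(0) = 0."""
    P = np.zeros_like(A, dtype=float)
    for _ in range(int(round(t_final / dt))):
        P = P + dt * filter_riccati_rhs(P, A, B, C, D)
    return P

# Hypothetical scalar system driven by a two-dimensional Wiener process,
# chosen so that signal noise and observation noise are independent (B D* = 0).
A = np.array([[-1.0]])
B = np.array([[1.0, 0.0]])
C = np.array([[1.0]])
D = np.array([[0.0, 1.0]])

P_long_run = integrate_covariance(A, B, C, D)
print(P_long_run)
```

For this particular choice the stationary equation reduces to $0 = -2P + 1 - P^2$, so the long-run value can be compared with $\sqrt{2} - 1$.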
Problem
Let $\Psi(t)$ be a fundamental solution of the matrix equation:
\[ \dot{\Psi}(t) = (A(t) - K(t)C(t)) \Psi(t) \]
where
\[ K(t) = (P(t)C(t)^* + B(t)D(t)^*)(D(t)D(t)^*)^{-1} \]
Then the function $k(t;s)$ in (2.69) is given by
\[ k(t;s) = \Psi(t) \Psi(s)^{-1} K(s) \]
Hint: Substitute (2.80) into (2.83).
Problem
In the notation of (2.86) we have
Hint: Use (2.84)
Time-Invariant Systems: Asymptotic Behavior
Let us now specialize to the case where the system is 'time-invariant': that is, where the matrices $A(t)$, $B(t)$, $C(t)$, $D(t)$ are all constant, independent of $t$, and let us denote them by $A$, $B$, $C$, $D$. Of particular interest then is what happens to $P(t)$, $R(t)$ as $t$ goes to infinity. The situation in regard to $R(t)$ is straightforward; and in fact, motivates the question regarding the others. First of all, note that we have (equation (2.13)):
\[ \dot{R}(t) = A R(t) + R(t) A^* + B B^*; \qquad R(0) = 0 \]
Also, this can be 'solved' explicitly (being a linear equation), as:
\[ R(t) = \int_0^t e^{As} B B^* e^{A^* s}\, ds \qquad (2.87) \]
Note that from this we also have:
\[ \dot{R}(t) = e^{At} B B^* e^{A^* t} \]
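As a quick numerical illustration of the linear equation $\dot{R} = AR + RA^* + BB^*$, $R(0) = 0$, the sketch below integrates it with a classical Runge-Kutta step and compares the long-run value with the solution of the stationary equation $0 = AR + RA^* + BB^*$. The matrices, step size and helper names are assumptions chosen only for illustration; the stationary equation is solved by Kronecker vectorisation, which is practical only for small systems.

```python
import numpy as np

def lyapunov_rhs(R, A, B):
    """dR/dt = A R + R A* + B B* (real matrices, so * is transpose)."""
    return A @ R + R @ A.T + B @ B.T

def integrate_R(A, B, t_final, dt=5e-3):
    """Classical RK4 integration of the covariance equation from R(0) = 0."""
    R = np.zeros((A.shape[0], A.shape[0]))
    for _ in range(int(round(t_final / dt))):
        k1 = lyapunov_rhs(R, A, B)
        k2 = lyapunov_rhs(R + 0.5 * dt * k1, A, B)
        k3 = lyapunov_rhs(R + 0.5 * dt * k2, A, B)
        k4 = lyapunov_rhs(R + dt * k3, A, B)
        R = R + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return R

def stationary_R(A, B):
    """Solve 0 = A R + R A* + B B* using row-major vectorisation."""
    n = A.shape[0]
    eye = np.eye(n)
    lhs = np.kron(A, eye) + np.kron(eye, A)
    rhs = -(B @ B.T).reshape(-1)
    return np.linalg.solve(lhs, rhs).reshape(n, n)

# Hypothetical stable example (eigenvalues -1 and -2).
A = np.array([[-1.0, 1.0], [0.0, -2.0]])
B = np.array([[0.0], [1.0]])

R_late = integrate_R(A, B, t_final=15.0)
R_inf = stationary_R(A, B)
print(R_late)
print(R_inf)
```

For this example the stationary equation can also be solved by hand, giving $R(\infty)$ with entries $1/12$, $1/12$, $1/12$, $1/4$.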
Now (2.87) converges as $t \to +\infty$ if the eigenvalues of $(A + A^*)$ are all negative, strictly less than zero. For
where $\lambda$ is the largest eigenvalue of $(A + A^*)$, and hence clearly
or, $\|\cdot\|$ denoting operator norm of a matrix,
and hence
so that (2.87) converges as $t$ goes to infinity. Actually, we can replace this requirement [of 'stability' of the A-matrix] by a weaker condition that
\[ [(A + A^*)x, x] \le -|\lambda| [x, x] \quad \text{for } x \in \text{Range}(e^{At} B) \]
but since this is quite transparent and only adds to the notation, we shall forego this, and stick to requiring that $A$ be stable.
A related question is whether the limit, denoted $R(\infty)$, is non-singular; or, equivalently, for stable $A$, whether $(A, B)$ is 'controllable', that is to say
\[ B,\ AB,\ \ldots,\ A^{m-1}B \]
are linearly independent (as linear operators), $m$ being the dimension of $A$.
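The controllability condition can be checked numerically. The sketch below reads it as the usual Kalman rank condition, namely that the block matrix $[B, AB, \ldots, A^{m-1}B]$ has rank $m$; that reading, the helper names and the example matrices are assumptions made for illustration.

```python
import numpy as np

def controllability_matrix(A, B):
    """Stack B, AB, ..., A^(m-1) B side by side, m being the dimension of A."""
    m = A.shape[0]
    blocks = [B]
    for _ in range(m - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_controllable(A, B):
    """Rank test: (A, B) is taken as controllable when the rank equals m."""
    rank = np.linalg.matrix_rank(controllability_matrix(A, B))
    return bool(rank == A.shape[0])

# Hypothetical example: the noise enters the second state in one case
# and only the first state in the other.
A = np.array([[-1.0, 1.0], [0.0, -2.0]])
B_second_state = np.array([[0.0], [1.0]])
B_first_state = np.array([[1.0], [0.0]])

print(is_controllable(A, B_second_state))
print(is_controllable(A, B_first_state))
```

In the second case the vector $B$ is an eigenvector of $A$, so $AB$ is a multiple of $B$ and the rank drops to one.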
Suppose now we assume $(A, B)$ is controllable so that $R(\infty)$ is non-singular. Then from (2.13) we have
\[ 0 = A R(\infty) + R(\infty) A^* + B B^* \]
so that also:
where
Next, we shall confine ourselves to the case where 'signal and noise' are independent, namely:
\[ B D^* = 0 \]
Now from
it follows that $R(t)$ is non-singular as soon as $P(t)$ is.
Conversely, suppose $P(t)$ is singular. Say:
\[ v^* P(t) v = 0 \quad \text{for some } m \times 1 \text{ vector } v. \]
This implies that
But left side
and right side
where the first term:
Since
\[ B D^* = 0 \]
it follows that we must have
or since $D D^*$ is non-singular,
Hence both terms in (2.91) are zero.
Hence so must
\[ v^* e^{A(t-s)} B = 0, \qquad 0 < s < t \]
But this implies that $(A, B)$ is not controllable. Also from (2.92) it follows that
But since $R(s)$ is monotone increasing, we must have
Hence also
\[ v^* P(s) = 0, \qquad 0 < s < t \]
Hence $R(t)$ and $P(t)$ are singular or non-singular together. Note also that $R(t)$ is non-singular for any $t > 0$ implies that $(A, B)$ is controllable.
Next let us note that $P(t)$ satisfies:
In what follows we may set
\[ D D^* = I \]
without loss of generality (simply redefine $C$ by $(\sqrt{D D^*})^{-1} C$). Hence we write
\[ \dot{P}(t) = A P(t) + P(t) A^* + B B^* - P(t) C^* C P(t) \]
Note that we can rewrite this as
\[ \dot{P}(t) = (A - P(t) C^* C) P(t) + P(t) (A - P(t) C^* C)^* + B B^* + P(t) C^* C P(t) \qquad (2.94) \]
Before we consider the asymptotic properties, we shall indicate a constructive method for solving (2.92), based essentially on Wonham [9]. An obvious iteration based on (2.94) would be
\[ \dot{P}_{n+1}(t) = (A - P_n(t) C^* C) P_{n+1}(t) + P_{n+1}(t) (A - P_n(t) C^* C)^* + B B^* + P_n(t) C^* C P_n(t) \qquad (2.95) \]
[If we based the iteration on (2.93), we would have
and here it does not follow that $P_{n+1}(t)$ is non-negative.]
We shall show that (2.95) is actually the Newton-Raphson equation solving algorithm of
For our purposes it seems most convenient to work with (2.97). Let $C$ denote the Banach space of symmetric-matrix functions $P(\cdot)$ such that they are continuous on the closed interval $[0, \infty]$ [i.e., approach limits at $+\infty$], and vanishing at the origin. Then
clearly maps $C$ into $C$. Moreover if we denote the Frechet derivative 'at' $P(\cdot)$ by $\Phi'(P)$, we have
The Newton-Raphson iteration would then become:
where $P_n$ stands for the function $P_n(t)$, $R$ for the function $R(t)$. We shall show that for a suitable choice of the initial function $P_0$, $(I + \Phi'(P))$ does have a bounded inverse. We shall now actually compute the inverse. Our main aim is of course to prove that $P_n$ converges to $P$ in $C$, for suitable initial choice. Let
\[ (I + \Phi'(P)) g = f \]
Then we have that $(f(t) - g(t))$ is absolutely continuous and
\[ \frac{d}{dt}(f(t) - g(t)) = A(f(t) - g(t)) + (f(t) - g(t))A^* + g(t) C^* C P(t) + P(t) C^* C g(t) \]
or, assuming $f(t)$ is differentiable, so is $g(t)$, and
\[ \dot{g}(t) = \dot{f}(t) - A f(t) - f(t) A^* - g(t) C^* C P(t) - P(t) C^* C g(t) + A g(t) + g(t) A^* \]
Let us now note that for our iteration (2.98), we have:
\[ P_n - P_{n+1} + 2\Phi(P_n) - \Phi'(P_n) P_{n+1} = P_n + \Phi(P_n) - R \]
But
\[ \dot{\Phi}(P_n) = A \Phi(P_n) + \Phi(P_n) A^* + P_n C^* C P_n \]
\[ \dot{R}(t) = A R(t) + R(t) A^* + B B^* \]
Hence
\[ \dot{P}_{n+1} = B B^* + P_n C^* C P_n - P_{n+1} C^* C P_n - P_n C^* C P_{n+1} + A P_{n+1} + P_{n+1} A^* \]
\[ = (A - P_n C^* C) P_{n+1} + P_{n+1} (A - P_n C^* C)^* + B B^* + P_n C^* C P_n \qquad (2.100) \]
We can then state the following
Theorem:
Suppose $A$ is stable. Suppose further that $(C, A)$ is observable, that is to say that the matrix:
is nonsingular. Denote its inverse by $\Lambda$. Then the iteration (2.100) with $P_0(t)$ defined by:
Po(t) =
[
e
as^^*^ l \ e A*s d s 8
yields a sequence of (real) symmetric non-negative definite matrix functions in the space $C[0, \infty]$, vanishing at the origin. This sequence is moreover monotone:
and converges to $P(t)$, the unique solution of (2.92), in any finite interval of the form $[0, T]$. Moreover $P(t)$ converges as $t$ goes to infinity, and denoting the limit by $P(\infty)$, we have:
Further
is stable (has all eigenvalues with strictly negative real parts). Finally, $P(\infty)$ is non-singular if and only if $R(\infty)$ is (or $(A, B)$ is controllable).
Proof
First let us note that
and hence:
and hence:
$P_0(t)$ converges as $t$ goes to infinity, and
Next, we need
Lemma 1:
Let $K$ be a non-negative definite real symmetric matrix such that
where $\gamma \ge 0$; $\rho > 0$ (strictly positive!). Then the eigenvalues of $(A - K C^* C)$ have strictly negative real parts.
Proof
Let $H$ denote $(A - K C^* C)^*$. Suppose for some nonzero vector $z$,
\[ Hz = \lambda z; \qquad \text{Real part } \lambda = \sigma \ge 0 \]
Then substituting in (2.103), we have
\[ 0 \ge 2\sigma [Kz, z] + \gamma \|B^* z\|^2 + \rho \|C K z\|^2 \]
and hence
\[ CKz = 0 \]
which implies that
\[ Hz = A^* z \]
and hence
\[ [(A + A^*) z, z] = [(H + H^*) z, z] = 2\sigma [z, z] \]
so that $\sigma$ must be strictly negative.
Lemma 2 (Wonham [9]):
Let $K$, $P$ denote two real symmetric, non-negative definite matrices and let
\[ \Psi(K;P) = (A - K C^* C) P + P (A - K C^* C)^* + B B^* + K C^* C K \]
Then
Proof
We have only to note that we can write:
\[ \Psi(K;P) = \Psi(P;P) + (P - K) C^* C P + P C^* C (P - K) - P C^* C P + K C^* C K \]
\[ = \Psi(P;P) + (K - P) C^* C (K - P) \ge \Psi(P;P) \]
as required.
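The algebraic identity behind Lemma 2, $\Psi(K;P) - \Psi(P;P) = (K - P) C^* C (K - P)$, is easy to spot-check on random data. The following sketch does so; the dimensions, the random seed and the helper names are arbitrary illustrative choices.

```python
import numpy as np

def psi(K, P, A, B, C):
    """Psi(K;P) = (A - K C*C) P + P (A - K C*C)* + B B* + K C*C K."""
    F = A - K @ C.T @ C
    return F @ P + P @ F.T + B @ B.T + K @ C.T @ C @ K

def random_psd(rng, n):
    """Random real symmetric non-negative definite matrix."""
    M = rng.standard_normal((n, n))
    return M @ M.T

rng = np.random.default_rng(0)
n, q, r = 3, 2, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, r))
C = rng.standard_normal((q, n))
K = random_psd(rng, n)
P = random_psd(rng, n)

gap = psi(K, P, A, B, C) - psi(P, P, A, B, C)
expected_gap = (K - P) @ C.T @ C @ (K - P)
smallest_eigenvalue = np.linalg.eigvalsh(0.5 * (gap + gap.T)).min()
print(np.allclose(gap, expected_gap), smallest_eigenvalue)
```

The smallest eigenvalue of the gap is non-negative up to rounding, which is the inequality asserted by the lemma.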
Lemma 3.
Suppose $P_n(t)$ is a real symmetric non-negative definite matrix function uniformly continuous on $[0, \infty]$, with $P_n(0)$ equal to zero. Further suppose
is stable. Define $P_{n+1}(t)$ by:
\[ \dot{P}_{n+1}(t) = (A - P_n(t) C^* C) P_{n+1}(t) + P_{n+1}(t) (A - P_n(t) C^* C)^* + P_n(t) C^* C P_n(t) + B B^* \qquad (2.105) \]
Then $P_{n+1}(t)$ has the same properties as $P_n(t)$.
Proof
Since we are given that $P_n(t)$ converges as $t$ goes to infinity, and that
is stable, it follows that for all $t > T$, $T$ sufficiently large, the eigenvalues of
have also all strictly negative real parts, say all less than or equal to $\sigma$, where $\sigma$ is negative. Hence if $\phi(t)$ denotes a fundamental matrix solution of:
we have that for $t > s > T$:
Next let us note that we can express the solution of (2.105) as:
\[ P_{n+1}(t) = \int_0^t \phi(t) \phi(s)^{-1} \Theta(s) \phi(s)^{*-1} \phi(t)^*\, ds \qquad (2.107) \]
where
and the main thing to note is that $\Theta(s)$ is convergent at infinity. From (2.107) it is immediate that $P_{n+1}(t)$ is non-negative definite. Let $\epsilon > 0$ be given. Then we can find $T$ large enough so that (2.106) holds and in addition
Next let us note that for
we have
For, setting
$\Delta$ sufficiently large so that $\Delta \ge T$ and
we have on the one hand
while also:
A simple estimation of the integral using (2.106) verifies (2.108); then we can write:
and hence:
( A - P (m)C*C)*s n ds
- 6(tl-s))e
+ terms which go to zero with $T \to \infty$ by virtue of our estimates, as can be directly verified. The first term, for $t_2 \ge t_1 \ge 2\Delta$, is less than (in norm)
Hence $P_{n+1}(t)$ converges as $t$ goes to infinity. Hence from (2.105), $\dot{P}_{n+1}(t)$ also converges, and hence must have zero for its limit. Hence we have (in the notation of Lemma 2):
by Lemma 2; and the last inequality, by Lemma 1, implies that
\[ A - P_{n+1}(\infty) C^* C \]
is stable. This completes the proof of the Lemma.
Next let us note that $P_0(t)$ satisfies the conditions of Lemma 3. For, from (2.102) we have:
and by Lemma 1, this implies the stability of $(A - \Lambda C^* C)$. Of course $P_0(t)$ converges as $t \to \infty$. Hence (2.100) yields the kind of sequence asserted. The monotonicity follows from Lemma 3. Thus following
so that
\[ \dot{P}_n(t) - \dot{P}_{n+1}(t) \ge \Psi(P_n(t); P_n(t)) - \Psi(P_n(t); P_{n+1}(t)) \]
But if
\[ \dot{Z}(t) \ge H_n(t) Z(t) + Z(t) H_n(t)^* \]
we must have:
as can be verified by calculation. Since $Z(0)$ is zero, it follows that
\[ Z(t) \ge 0 \]
Hence $P_n(t)$ is monotone non-increasing in $n$. In particular we have
\[ P_{n+1}(\infty) \le P_n(\infty) \]
Because $P_n(t)$ is non-negative definite, and the sequence $P_n(t)$ is monotone, we have that $P_n(t)$ converges for every $t$ in the closed interval $[0, \infty]$. From (2.105), $\dot{P}_n(t)$ also converges, and it is evident that, denoting the limit of $P_n(t)$ by $P(t)$, $\dot{P}_n(t)$ converges to $\dot{P}(t)$, and $P(t)$ is of course the unique solution of (2.92). We have thus obtained a constructive method for solving (2.92); particularly noteworthy is the monotonic nature of the approximating sequence. Let us now examine the asymptotic properties.
Lemma 4:
Under the condition that $A$ is stable, (2.101) has a unique solution in the class of real symmetric matrices.
Proof
Let $P$, $Q$ denote two real symmetric matrix solutions of (2.101). First of all, by Lemma 1, both $(A - P C^* C)$ and $(A - Q C^* C)$ are stable. Substituting $P$, $Q$ into equation (2.101) we have, upon subtraction:
\[ A(P - Q) + (P - Q)A^* + Q C^* C Q - P C^* C P = 0 \]
Let $z$ be an eigenvector of $(P - Q)$:
\[ (P - Q) z = \lambda z, \]
$\lambda$ must of course be real. Then we have:
\[ \lambda [(A + A^*) z, z] + \|C Q z\|^2 - \|C P z\|^2 = 0 \]
But
Hence if $\lambda$ is not zero,
which contradicts the stability condition. Hence all eigenvalues must be zero, or $P$ must equal $Q$.
Next let us note that we have also a constructive method for solving (2.101). Thus we have (see (2.109)):
\[ 0 = (A - P_n(\infty) C^* C) P_{n+1}(\infty) + P_{n+1}(\infty) (A - P_n(\infty) C^* C)^* + B B^* + P_n(\infty) C^* C P_n(\infty) \]
which is a linear equation for determining $P_{n+1}(\infty)$ from $P_n(\infty)$, and taking
we have further that
so that $P_n(\infty)$ converges monotonically to the solution of (2.101).
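The stationary iteration lends itself to a short numerical sketch: each step solves one linear (Lyapunov-type) equation for the next iterate. The code below is a sketch under stated assumptions. It starts from $P_0 = 0$, which is admissible here only because the example $A$ is stable, and which is not the initial function used in the theorem. The linear solve uses Kronecker vectorisation, and all names and matrices are hypothetical.

```python
import numpy as np

def solve_lyapunov(F, Q):
    """Solve F X + X F* + Q = 0 by row-major vectorisation (small systems only)."""
    n = F.shape[0]
    eye = np.eye(n)
    lhs = np.kron(F, eye) + np.kron(eye, F)
    return np.linalg.solve(lhs, -Q.reshape(-1)).reshape(n, n)

def newton_riccati(A, B, C, P0, iterations=20):
    """Iterate 0 = (A - Pn C*C) Pn+1 + Pn+1 (A - Pn C*C)* + B B* + Pn C*C Pn."""
    CtC = C.T @ C
    history = [P0]
    P = P0
    for _ in range(iterations):
        F = A - P @ CtC
        Q = B @ B.T + P @ CtC @ P
        P = solve_lyapunov(F, Q)
        P = 0.5 * (P + P.T)          # remove rounding asymmetry
        history.append(P)
    return history

def riccati_residual(P, A, B, C):
    """Residual of 0 = A P + P A* + B B* - P C*C P."""
    return A @ P + P @ A.T + B @ B.T - P @ C.T @ C @ P

# Hypothetical stable example.
A = np.array([[-1.0, 1.0], [0.0, -2.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

history = newton_riccati(A, B, C, P0=np.zeros((2, 2)))
P_limit = history[-1]
closed_loop_eigs = np.linalg.eigvals(A - P_limit @ C.T @ C)
print(P_limit)
print(np.linalg.norm(riccati_residual(P_limit, A, B, C)))
```

With this starting point the first iterate is $R(\infty)$, and from then on the iterates decrease monotonically, in line with the monotonicity argument above.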
Finally, that $P(t)$ converges to $P(\infty)$ as $t$ goes to infinity follows from:
Lemma 5.
The solution $P(t)$ of (2.95) with $P(0) = 0$ is actually monotonic non-decreasing as $t$ increases.
Proof
We follow Wonham [7]. Thus let (in the notation of Lemma 2)
\[ \dot{\check{P}}(t) = \Psi(P(t+\tau); \check{P}(t)); \qquad \check{P}(0) = 0; \qquad \tau > 0 \text{ and fixed.} \]
Then from Lemma 2, we have:
and hence
which, just as in the proof of the monotonicity of the sequence $P_n(\cdot)$ in Lemma 3, implies that
But
\[ \check{P}(t) = \int_0^t \phi(t+\tau) \phi(s+\tau)^{-1} (B B^* + P(s+\tau) C^* C P(s+\tau)) \phi(s+\tau)^{*-1} \phi(t+\tau)^*\, ds \]
and by an obvious change of variable in the integrand, this is
\[ = \int_\tau^{t+\tau} \phi(t+\tau) \phi(s)^{-1} (B B^* + P(s) C^* C P(s)) \phi(s)^{*-1} \phi(t+\tau)^*\, ds \]
Hence
\[ P(t) \le P(t + \tau) \]
as required. Hence $P(t)$ converges as $t$ goes to infinity to the unique solution of (2.101). Finally suppose $P(\infty)$ is singular. Then by Lemma 5, so is $P(t)$ for every $t$, and as we have seen, this implies that $(A, B)$ is not controllable.
References
1. A. N. Kolmogorov: Foundations of the Theory of Probability, Chelsea, 1950.
2. J. L. Doob: Stochastic Processes, John Wiley and Sons, 1953.
3. I. I. Gikhman and A. V. Skorokhod: Introduction to the Theory of Random Processes, W. B. Saunders, 1969.
4. K. Parthasarathy: "Probability Measures on Metric Spaces", Academic Press, 1967.
5. L. Shepp: Radon-Nikodym Derivatives of Gaussian Measures, Annals of Mathematical Statistics, 1966.
6. H. McKean: Stochastic Integrals, Academic Press.
7. W. M. Wonham: "Random Differential Equations in Control Theory", in Probabilistic Methods in Applied Mathematics, Volume 2, Academic Press, 1970.
8. E. Nelson: "Dynamical Theories of Brownian Motion", Princeton University Press, 1967.
9. W. M. Wonham: "On a Matrix Riccati Equation of Stochastic Control", SIAM Journal on Control, Volume 6, No. 4, 1968.
10. M. Loeve: Probability Theory, Van Nostrand, 1954.
11. I. Gohberg and M. G. Krein: Volterra Operators, AMS Translations, 1970.
Example: Linear Stochastic Control
Let us next consider stochastic control problems for the linear system:
where we assume that all the coefficients are continuous on $[0, 1]$, and
and of course $W(s;\omega)$ is a Wiener process as before. The control problem is that of finding an optimal control function $u(t)$, $u(t)$ being measurable $B_Y(t)$, so as to minimize:
where $Q(s)$ is continuous in $s$ and is non-negative definite, and $\lambda$ is a fixed positive constant. Since $u(t)$ is now also a random process, let us denote it by $u(t;\omega)$. It is implicit that $u(t;\omega)$ is jointly measurable in $t$ and $\omega$.
Whatever the choice of $u(t;\omega)$, let
\[ \hat{x}(t;\omega) = E(x(t;\omega) \mid B_Y(t)) \]
Then we have the Kalman filter equations characterizing $\hat{x}(t;\omega)$:
\[ \hat{x}(t;\omega) = \int_0^t A(s)\hat{x}(s;\omega)\, ds + \int_0^t (P(s)C(s)^* + F(s)G(s)^*)(G(s)G(s)^*)^{-1}\, dZ_0(s;\omega) + \int_0^t B(s)u(s;\omega)\, ds \qquad (2.111) \]
where
\[ Z_0(s;\omega) = Y(s;\omega) - \int_0^s C(\sigma)\hat{x}(\sigma;\omega)\, d\sigma \]
and is a Wiener process with covariance $G(s)G(s)^*$. (Cf. equations (2.83), (2.66)).
The matrix $P(s)$ is determined by (cf. (2.84)):
\[ \dot{P}(t) = A(t)P(t) + P(t)A(t)^* + F(t)F(t)^* - (P(t)C(t)^* + F(t)G(t)^*)(G(t)G(t)^*)^{-1}(C(t)P(t) + G(t)F(t)^*) \qquad (2.113) \]
and in particular does not depend on the control. Next in (2.110) we can write
\[ E([Q(s)x(s;\omega), x(s;\omega)]) = \mathrm{Tr.}\, Q(s) E((\hat{x}(s;\omega) + e(s;\omega))(\hat{x}(s;\omega) + e(s;\omega))^*) \]
\[ = \mathrm{Tr.}\, Q(s) E(\hat{x}(s;\omega)\hat{x}(s;\omega)^*) + \mathrm{Tr.}\, Q(s) P(s) \]
where $e(s;\omega) = x(s;\omega) - \hat{x}(s;\omega)$. The point in doing this is that the problem is thus reduced to that of choosing $u(t;\omega)$ so as to minimize:
where $\hat{x}(t;\omega)$ satisfies (2.111). Here we shall exploit our knowledge of deterministic control problems. Thus, let us fix the sample point $\omega$, and consider the problem of minimizing:
for each $\omega$. For this purpose it is convenient to write
(2.114)
where $\phi(t)$ is a fundamental matrix solution of
and finally:
Note that $w(t;\omega)$ is continuous in $t$ (as we have seen). Since $\omega$ is fixed, we consider the control functions as functions of $t$ alone for the moment. Let $L_2(0, 1)$ denote the usual $L_2$ space for control functions $u(t)$. Introduce the linear bounded operator on this space:
Then, we can rewrite (2.116) as:
where $Q$ stands for the operator corresponding to multiplication by $Q(s)$, $u$ stands for $u(t)$, $w$ for $w(t;\omega)$, and inner-products in two spaces have been used. Being a quadratic form with $\lambda$ positive, it is obvious by a routine first variation (gradient with respect to $u$) that the unique
minimum is given by
\[ \lambda u_0 + L^* Q L u_0 = -L^* Q w \qquad (2.119) \]
where $L^*$ denotes the adjoint of $L$, and is given by:
\[ L^* f = h; \qquad h(t) = \int_t^1 B(t)^* \phi(t)^{*-1} \phi(s)^* f(s)\, ds, \qquad 0 \le t \le 1 \]
From (2.119) we have:
where z(tico) = t
1
t
$(s)::< Q(S)$(S;.)~S
and is the unique solution of:
Hence (2.120) is the best 'open loop' solution. It is unfortunately NOT measurable $B_Y(t)$, since, as is evident from (2.121), $z(t;\omega)$ is not measurable $B_Y(t)$ (being independent of the sigma-algebra generated by $Z_0(t;\omega)$, the latter being the same as $B_Y(t)$).
Next we observe that
subject to $P_c(1) = 0$ has a unique non-negative solution which we shall denote by $P_c(t)$. We shall now show that we have the decomposition:
where
This follows from the directly verifiable relation (using (2.111), (2.120), (2.122) and (2.123) in differential form):
and since $z(1;\omega) = 0$; $P_c(1) = 0$, (2.125) follows. Note now that $u_1(t;\omega)$ and $u_2(t;\omega)$ are independent of each other; $u_1(t;\omega)$ is measurable $B_Y(t)$, while $u_2(t;\omega)$ is independent of it. Hence
It is apparent that $u_1(t;\omega)$ should be the optimal Stochastic Control.
We shall now prove this more formally. Let us consider the pre-Hilbert space $H_u$ of control functions $u(t;\omega)$ such that it is jointly measurable in $t$ and $\omega$ and further $u(t;\omega)$ is measurable $B_Y(t)$ for each $t$, with inner product defined by
Then the main thing to note is that if we define the linear transformation $L$ by:
the adjoint is given by:
It is clear also that the functional (2.115) rewritten in the form
\[ [Q(Lu + w), Lu + w] + \lambda [u, u] \]
yields a quadratic form over $H_u$ with the unique solution:
\[ -\frac{1}{\lambda} B(t)^* E(z(t;\omega) \mid B_Y(t)) \]
where $z(t;\omega)$ is the unique solution of:
Let
\[ \tilde{z}(t;\omega) = E(z(t;\omega) \mid B_Y(t)) \]
Then we have, in differential notation:
Because $z(1;\omega)$ and $P_c(1)$ vanish and
\[ E((z(s;\omega) - \tilde{z}(s;\omega)) \mid B_Y(t)) = 0 \quad \text{for } s \ge t \]
it follows that:
thus proving the optimality of
Note that the filtering and control can thus be treated separately; this is referred to as the 'separation' principle (see Wonham [7]). It was first derived by Joseph and Tou [12]. Our treatment is quite different from both of these.
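The control half of the separation principle can be sketched numerically. The equation defining $P_c(t)$ is not legible in the text above, so the sketch below assumes the standard finite-horizon form $\dot{P}_c + A^* P_c + P_c A + Q - \frac{1}{\lambda} P_c B B^* P_c = 0$, $P_c(1) = 0$, with the feedback $u = -\frac{1}{\lambda} B^* P_c \hat{x}$ applied to the filtered state. The function names, the constant coefficients and the scalar example are illustrative assumptions.

```python
import numpy as np

def control_riccati_rhs(P, A, B, Q, lam):
    """dP/dt for the assumed control Riccati equation."""
    return -(A.T @ P + P @ A + Q - (P @ B @ B.T @ P) / lam)

def solve_control_riccati(A, B, Q, lam, steps=1000):
    """Integrate backward from P(1) = 0 with RK4; path[i] approximates P(i / steps)."""
    dt = 1.0 / steps
    P = np.zeros_like(Q, dtype=float)
    path = [P]
    for _ in range(steps):
        k1 = control_riccati_rhs(P, A, B, Q, lam)
        k2 = control_riccati_rhs(P - 0.5 * dt * k1, A, B, Q, lam)
        k3 = control_riccati_rhs(P - 0.5 * dt * k2, A, B, Q, lam)
        k4 = control_riccati_rhs(P - dt * k3, A, B, Q, lam)
        P = P - (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        path.append(P)
    path.reverse()
    return path

def feedback_gain(P, B, lam):
    """Gain applied to the filtered state estimate: u = gain @ x_hat."""
    return -(B.T @ P) / lam

# Hypothetical scalar example: A = 0, B = 1, Q = 1, lambda = 1.
A = np.array([[0.0]])
B = np.array([[1.0]])
Q = np.array([[1.0]])
P_path = solve_control_riccati(A, B, Q, lam=1.0)
print(P_path[0], feedback_gain(P_path[0], B, 1.0))
```

For this scalar case the assumed equation has the closed-form solution $P_c(t) = \tanh(1 - t)$, which gives a direct check on the integrator.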
References
1. A. N. Kolmogorov: Foundations of the Theory of Probability, Chelsea, 1950.
2. J. L. Doob: Stochastic Processes, John Wiley and Sons, 1953.
3. I. I. Gikhman and A. V. Skorokhod: Introduction to the Theory of Random Processes, W. B. Saunders, 1969.
4. K. Parthasarathy: "Probability Measures on Metric Spaces", Academic Press, 1967.
5. L. Shepp: Radon-Nikodym Derivatives of Gaussian Measures, Annals of Mathematical Statistics, 1966.
6. H. McKean: Stochastic Integrals, Academic Press.
7. W. M. Wonham: "Random Differential Equations in Control Theory", in Probabilistic Methods in Applied Mathematics, Volume 2, Academic Press, 1970.
8. E. Nelson: "Dynamical Theories of Brownian Motion", Princeton University Press, 1967.
9. W. M. Wonham: "On a Matrix Riccati Equation of Stochastic Control", SIAM Journal on Control, Volume 6, No. 4, 1968.
10. M. Loeve: Probability Theory, Van Nostrand, 1954.
11. I. Gohberg and M. G. Krein: Volterra Operators, AMS Translations, 1970.
12. P. D. Joseph and J. T. Tou: "On Linear Control Theory", AIEE Transactions, Applications and Industry, Vol. 80, pp. 193-196, 1961.
Identification and Adaptive Control: An Example
As an example of an adaptive control problem we shall take the flight control problem outlined by Taylor and Rediess [1]. The problem here is the design of an automatic control system for a hypothetical aerospace vehicle which operates throughout a wide range of flight conditions that alter its dynamic characteristics. A functional block diagram of the basic aircraft, control servoactuator, and measurement dynamics is shown in figure A-1. The description of these elements given here is necessarily simplified, but it represents many of the important characteristics that would have to be considered in the preliminary design of a stability augmentation system for this type of aircraft. Only longitudinal modes of response are considered, and the phugoid mode is assumed to be negligible.
1. Basic Dynamics and Measurements
The basic dynamical equations for command and windgust inputs are:
The quantity $w_g$ is the random vertical gust velocity and is assumed to have zero mean with spectral density given by
where w is angular frequency 0
= 5.0
The control servoactuator is taken approximately as a first-order lag plus an output rate limit, with dynamic equations:
\[ \dot{\delta}_e = S(-k \delta_e + k \delta_{e_c}), \qquad k = 20 \text{ rad/sec} \]
\[ S(x) = \begin{cases} x, & |x| \le 0.5 \\ 0.5, & x > 0.5 \\ -0.5, & x < -0.5 \end{cases} \]
and $\delta_{e_c}$ is the total servoactuator command signal including the pilot's command.
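The rate-limited servoactuator model is simple enough to simulate directly. The sketch below assumes the model reads $\dot{\delta}_e = S(-k\delta_e + k\delta_{e_c})$ with $k = 20$ and a saturation level of 0.5, as the garbled equations suggest. The forward-Euler scheme, the step command of size 1 and all names are illustrative assumptions.

```python
import numpy as np

RATE_LIMIT = 0.5   # saturation level of S(x)
K_SERVO = 20.0     # first-order lag gain k (rad/sec)

def saturate(x, limit=RATE_LIMIT):
    """S(x): identity inside [-limit, limit], clipped outside."""
    return float(np.clip(x, -limit, limit))

def simulate_servo(command, t_final=5.0, dt=1e-3, k=K_SERVO):
    """Forward-Euler response of the servo position to a constant command."""
    n = int(round(t_final / dt))
    delta = np.zeros(n + 1)
    for i in range(n):
        rate = saturate(-k * delta[i] + k * command)
        delta[i + 1] = delta[i] + dt * rate
    return delta

dt = 1e-3
delta = simulate_servo(command=1.0, dt=dt)
max_rate = np.max(np.abs(np.diff(delta))) / dt
print(delta[-1], max_rate)
```

With a unit step command the response first ramps at the limiting rate and only behaves like a first-order lag once the error is small enough that the saturation is inactive.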
Observations
The following quantities are observed: the pitch rate [$\dot{\theta}$], the pitch attitude [$\theta$], the normal acceleration [$n_z$] and the angle of attack [$\alpha$]. Denoting the observations by the subscript i, we have:
where
$\dot{\theta}_n$ is white Gaussian with spectral density 0.0005 (rad/sec)$^2$; $\eta_1$ and $\eta_2$ are the first and second bending mode deflections (at a specific reference station) and are governed by the equations:
5, + (2f1w1) lii + (w1 v = * 6 e 2 C, + (2f 2 ~ 2 $2 ) + (u2) v2 = Y 6 e
where +=4;4 e
The frequencies w
1
t2 a r e assumed
and w2 and the damping ratios El and
normally distributed with means and variances listed in Table r , m [I]. Indicated pitch attitude
where On is white Gaussian with spectral density Indicated normal acceleration:
n n'
i e white Gaussian with spectral density (0.01) 2 and
aZ = l o 4, =
- 0.15
G2 =
+ 0.45
Indicated Angle of Attack
2
(.0001)
where
$a_\alpha$ = 32 ft. Note that the angle of attack measurement may be taken to be noise free. It is assumed that servoactuator position and rate can be measured with no error and that the measurements are available.

State Space Formulation

We now turn to a state space formulation of both the dynamics and the observations. We define states as follows:
u
5
control
O b s e ~ ~ v a t i o n(Measurements) s Given 4;
Z,
Jd e-!:
w (t-u)
@in w l q ( t - o ) x4(u) du
Ind. Pitch Rate
v 1 = x3 + (0.025) Ind. Pitch Attitude
- (0.040) C2 + N1(. 0005)
Ind..Normal Accel.
-Ind. Angle of Attack v4
a
kl [X1+,F
ki * ka
Let
+
x3]
+ (0.075) u1 - (0.10) u2+ (0. OOOOS)N4
Now $w_g$ is a random process with zero mean and spectral density:
The corresponding correlation function is
Writing
I= x, we have the following state space representation for the windgust:
$N_0, N_1, N_2, N_3$ are all unit power white Gaussian.
We have then the following state- space representation for the system and the observations:
The observation vector v is given by
v = C x+(4Fz4)v+GN
'where
C
=
G = diagonal (.0005, .0001, .01, $d_4$)

where $N_1, N_2, N_3, N_4$ are unit power white Gaussian.
Note: The dimensions of the various matrices and vectors will change depending on whether it is the Identification Phase, State Filtering Phase or Control Phase, although we shall use the same generic notation A, B, etc. In the main text a subscript (i for identification, s for state filtering, c for control) will serve to distinguish the phases. The notation in the Appendices, where the general theory is given, will also use the same letters A, B, etc., and it is intended that appropriate specialization will be made to apply to particular cases in the text.

2. System Identification
The first (and less standard) problem is that of estimating system parameters (System Identification) from observed data. Here we can distinguish between closed loop identification and open loop identification. In closed loop identification, the feedback control input alone is used, in contrast to open loop identification where it is assumed that we can have a separate input or probe signal. For the present this distinction is unnecessary; we assume that the servo-actuator output can always be measured, in other words, that $x_4(t)$ is known [or can be calculated from the dynamic equations knowing u(t) and $\delta_p(t)$]. The main point is that then in the Identification mode we have the representation:
where
independent white Gaussian with unit power.
A general theory of identification for such a system is given in Appendix I. The theory splits into two distinct cases according as to whether $d_4$ is non-zero ['non-singular case'] or $d_4$ is zero ['singular case']. Of primary interest (because of an essential simplification) is the 'singular case' where the windgust is high, so that in the angle of attack measurement ($v_4(t)$) the error due to measurement is 'small in comparison with the windgust component'. [Note that this is the only measurement where the windgust enters directly.] Specializing the general treatment therein, we have
f o r our p a r t i c u l a r problem 32 kl
v (t) = k x + 4 11 v
x3 t klx5 t ((0.075)
-
(0.19) Y2)(47z4)
Hence, if we assume that all the coefficients are known [as we would in calculating the conditional distribution of the observations given the coefficients], we can actually 'solve' for the windgust component as:
Substituting this into the state equations (2.1), we can 'eliminate' $x_5$ from the equations and (since $x_4(t)$ is known) thereby obtain a set of equations in $x_1, x_2, x_3$ only. We can solve the latter, noting that $v_4(t)$ plays the role of a 'forcing term'. We can thus obtain the state vector x(t) exactly; the problem thus reduces to that of identification without state noise, which has been extensively studied [3].
Thus we seek to find the coefficients to minimize:
where $x_5$ is determined by (2.2). This minimization is carried out by a Newton-Raphson technique as indicated in [3], by actually seeking a root of the gradient with respect to the unknown parameters. Note that $x_1(t), x_2(t), x_3(t)$ now contain the unknown parameters, and that at the end of the computation of unknown parameters, we obtain an estimate of the state variables as well. Of course, strictly speaking, the integrals should be written as Ito integrals as indicated in Appendix I. Note that the equations determining $x_1(t), x_2(t), x_3(t)$ are:
G1(t) = (1
32Z1
- -=$--)
3
x (t) + - (v 3 kl 4
, .
+ 4~~~(0.19~~-0.075"~))
42( t ) = x3(t)
and the initial conditions a t t = 0 being taken to be zero.
Non-Singular Case

Let us next consider the case where $d_4 > 0$, so that G is non-singular. By writing $G^{-1}v$ in place of v, we can then clearly assume G to be the identity matrix, and thereby simplify the notation. We shall now specialize the material in Appendix I to our specific problem. Thus let m(t) denote the system response in the absence of windgust noise and observation noise. In other words,
Let the matrix $P_i$ be the solution of the steady-state Riccati equation (since our estimates are good only 'asymptotically' we simplify the calculation in this manner):
Appendix I1 indicates the iterative method used f o r solving this equation. Let
h(t) = m(t)
- C.'0f t e
( A ~ - P ~ C ? C(t) s.) pic;m(s)ds
(Ai- PiC*C) (t- S )
Pic:
v( s)ds
where the $\circ$ indicates Ito integral. Let $\nabla_\theta$ denote the gradient with respect to all the unknown parameters, and let $\theta$ denote the unknown parameter vector, and let
Then we seek a root of:

using the Newton-Raphson algorithm:
where
.L
J~ T 0
act)* dt
and the identifiability conditions ensure that R is non-singular for large enough T. We use the given mean values to start the iteration.
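A root-of-the-gradient search of this kind can be sketched on a toy problem. The example below is an illustration only, not the flight-control computation: it assumes a hypothetical scalar model $\dot x = -a x + u$ with noise-free observations, a forward-Euler discretization, and the Gauss-Newton form of the Newton-Raphson step, in which the matrix playing the role of R is approximated by the sum of squared sensitivities (in the spirit of the modified method of [3]). All function names and numerical values are made up for the illustration.

```python
def simulate(a, u, dt):
    """Euler simulation of x' = -a*x + u with x(0) = 0.

    Returns the state sequence and its sensitivity dx/da, obtained by
    differentiating the recursion with respect to the parameter a.
    """
    x, s = 0.0, 0.0
    xs, ss = [], []
    for uk in u:
        xs.append(x)
        ss.append(s)
        # Tuple assignment: the sensitivity update uses the old value of x.
        x, s = x + dt * (-a * x + uk), s + dt * (-x - a * s)
    return xs, ss


def identify(y, u, dt, a0, iters=20):
    """Gauss-Newton search for a root of the gradient of sum (y - x(a))^2."""
    a = a0
    for _ in range(iters):
        xs, ss = simulate(a, u, dt)
        grad = sum(s * (yk - xk) for s, xk, yk in zip(ss, xs, y))
        r = sum(s * s for s in ss)  # scalar stand-in for the matrix R
        a += grad / r
    return a


dt = 0.01
u = [1.0] * 500                      # assumed step probe input
y, _ = simulate(2.0, u, dt)          # synthetic data with "true" a = 2.0
a_hat = identify(y, u, dt, a0=1.0)   # start from an assumed prior mean of 1.0
print(a_hat)
```

As in the text, the iteration is started from a prior (mean) value of the parameter, and the state trajectory for the final parameter value is obtained as a by-product of the last simulation.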
3. Control Theory

At the first level, we can assume that we use the estimated system parameters to devise a feedback control to meet the performance requirements [see [1] for the latter]. Here we can make use of standard optimal control theory. Since the given performance requirements cannot be handled directly in the theory, we proceed in the usual way by using appropriate 'soft' criteria. Moreover, we shall assume a linear control system; in particular, we assume that the servo-actuator does NOT saturate, so that in reference to the basic state equations (1.1), (1.2), we can write

This simplification is reasonable, and is necessary in any case if we wish to obtain any useful answers from stochastic control theory. We then have the following canonical linear control problem:
where
and C and G are as before. We again have two cases to distinguish depending on whether $d_4$ is zero or not. But first let us formulate the optimization criterion. We wish to choose u(t) such that in the absence of pilot input (that is, $\delta_p(t) \equiv 0$) we minimize

where $\lambda$ is some fixed positive constant to allow for the saturation constraint. Now

$$n_z(t) = -\frac{V}{g}\,(Z_1 x_1 + Z_4 x_4 + Z_1 x_5)$$

so that

$$n_z(t)^2 = [Q\,x(t),\, x(t)]$$

where

[For the control analysis we are of course assuming that the system parameters are known.]
It is well-known that the optimal control is given by

where $P_c$ is the solution of
This equation is again solved by the iteration technique described in Appendix II. Note that $P_c$ is independent of the state noise variance. By the separation theorem, $\hat x(t)$ is given by the Kalman filter (for the case where $d_4 > 0$ so that G is non-singular):

where (in the steady state) $P_s$ is determined from

In the case where $d_4 = 0$ (corresponding to high windgust) we note that

as explained in section 2 and in Appendix I. In fact in our current notation, letting
where $c_1, c_2, c_3, c_4$ are 1 x 5 matrices, and noting that

$c_4 F_c = k \neq 0$

we have
;(t) = ( A - F ~ ' L - c' 4 A) ~ ( t+) ( B - F ~ k - l c 4 B) ~ ( t )
Using the optimal feedback control, the actual value of the normal acceleration component

$$= \lim_{T\to\infty} \frac{1}{T}\int_0^T [Q\,x(t),\,x(t)]\,dt = \mathrm{Tr}\,Q J$$

where in the non-singular case $J = P_s + J_a$, where $J_a$ is the solution of
This follows from the fact that in the steady state
E[(x(t)x(t)*l =
and ~
t =)
E[(x(~)-P( t))(x(t)-:(t)*)]+
B B*P P ') lt (A 0
J~GG*
,o
ds
In the singular c a s e ' (d4 = 0) we have
J = Ja
and Ja i s the solution of
h
G ( s ) ~ st PE*( c c * ) - l ( z O ( t ) )
where Z (t) i s a Wiener process with 0 E[z0(t) zo(t)*] =
A
~ [ x ( t ) x ( t ) * ) ]Ps =
+ Ja
Note that in the absence of feedback, the normal acceleration due to windgust is given by $J_b$, the solution of

$$A J_b + J_b A^* + F F^* = 0$$

The reduction in decibels is given by

$$10 \log(\mathrm{Tr}\, Q J_a) - 10 \log(\mathrm{Tr}\, Q J_b)$$

and this remains the same even if $FF^*$ increases by any multiplicative factor.
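The invariance under scaling of $FF^*$ can be made explicit. The short derivation below is a sketch under one assumption not spelled out above: that $J_a$, like $J_b$, is obtained from a linear (Lyapunov-type) equation whose forcing term is proportional to $FF^*$, with some stable closed-loop matrix written here as $A_{cl}$ (a symbol introduced only for this illustration).

```latex
\begin{aligned}
A J_b + J_b A^* + F F^* = 0
  &\;\Longrightarrow\; A (c J_b) + (c J_b) A^* + c\,F F^* = 0, \qquad c > 0,\\
A_{cl} J_a + J_a A_{cl}^* + F F^* = 0
  &\;\Longrightarrow\; A_{cl} (c J_a) + (c J_a) A_{cl}^* + c\,F F^* = 0,\\
10\log \mathrm{Tr}(Q\,c J_a) - 10\log \mathrm{Tr}(Q\,c J_b)
  &= 10\log \mathrm{Tr}(Q J_a) - 10\log \mathrm{Tr}(Q J_b).
\end{aligned}
```

Both traces are multiplied by the same factor c, so the factor cancels in the difference of logarithms.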
Optimization of Step Response

Let us next consider the problem of meeting the requirements on desired response to step input (that is, $\delta_p(t)$ is a step function). Here again we can follow standard optimization theory. Thus let the state dynamics be (setting windgust noise to be zero):

We consider the 'deterministic' case where the state is known completely. Our optimization criterion is: Minimize
where $\lambda$ is a positive constant to be chosen appropriately later. The corresponding theory is new with this paper, and is given in Appendix III. The optimal control $u_0(t)$ is given by:

where

L being defined by:

$n_z(t) = L\,x(t)$

Remembering that

$Q = L^*L$
the optimal feedback control is the same as that obtained in the stochastic case. Thus the pilot input $\delta_p(t)$ is shaped by the rule:

$$\frac{B_c^*}{\lambda}\left(A^* - \frac{P_c B_c B_c^*}{\lambda}\right)^{-1} L^*\,\delta_p(t)$$

and the feedback control remains the same as in the stochastic case. For a unit step function, the corresponding average mean-square 'tracking-error' is
Of course this average mean-square error is only the square of the difference between the steady-state output corresponding to the unit step, and the unit step, and is not very significant, since the precise steady state value is not important and can be scaled up as desired. However the criterion does yield the same feedback gain as in the stochastic case. Stability is of course guaranteed.
This brings up then the possibility of using a 'shaping filter' with memory. Ideally the frequency transform of this shaping filter should be the inverse of the frequency transform of the feedback system: denoting the former by K(f), we must have, ideally,

and this involves differentiators. The problem is thus solved completely, in theory. In practice a suitable approximation can be chosen. It must be noted that the input is no longer a constant, so that the considerations of Appendix III for Z(t) now hold, strictly speaking, only asymptotically.
APPENDIX I
IDENTIFICATION THEORY
The identification problem can be formulated as: Given
where $W_0(\cdot)$ and $W(\cdot)$ are Wiener processes in the appropriate dimensions, perhaps correlated with each other, $u(\cdot)$ is a given known input, and $v(\cdot)$ is the observed output. It is desired to identify some or all of the parameters in the matrices A, B, C, D, F. The matrix G is a known constant matrix. For most of the analysis we assume G is a non-singular square matrix; we treat the case where G is singular separately. In spite of the considerable literature on Identification [4], such a problem in this generality has not been studied hitherto.
The first question that we wish to answer is: when can we identify such a system? Of course the answer must depend on the notion of identifiability used. While the literature on Identification abounds with many 'recipes', there is often little by way of any measure of goodness of the estimates obtainable. There is not in any case much agreement on what constitutes 'identifiability'. In the present paper we take the following approach, which has at least the virtue of being mathematically precise. First we assume there does exist a set of 'true' values for the unknown parameters. Let $\theta$ denote the set of unknown parameters; $\theta$ then takes its values in some finite-dimensional Euclidean space. Let $\theta_0$ denote the true value. An estimate based on observed data for a time-interval T will be denoted $\theta_T$. We shall say that a system is identifiable if we can find an estimate which is asymptotically unbiased and is consistent. That is to say:

i) $\lim_{T\to\infty} E(\theta_T) = \theta_0$

ii) $\theta_T$ converges with probability one to $\theta_0$ as T goes to infinity.

Such a definition requires a precise formulation of the problem involving 'noise' processes, of course. Fortunately, the practical Flight Control problem shows that this is not unrealistic. In the present paper we shall show that under certain conditions, which we term 'identifiability conditions', it is possible to find such a class of estimates for the problem under consideration. Moreover, we can develop a computational algorithm for generating such an estimate. Our estimate is a maximum-likelihood estimator; or, more correctly (and in order that the difference in approach can be emphasized), it is a root of the gradient of the likelihood functional. Because we are dealing with 'continuous' data, the term 'likelihood functional' will have to be clarified.

We shall first consider the case where G is non-singular; without loss of generality, we can clearly take it to be the Identity, which we do. The main thing to note then is that the measure induced by the process $v(\cdot)$ for any finite time-interval [0, T] on the [Banach] space C of continuous functions on [0, T] in the usual manner is absolutely continuous with respect to the Wiener measure thereon. This is true for any assumed value for $\theta$. By the 'likelihood functional' we mean the corresponding Radon-Nikodym derivative. We shall show that under the 'identifiability conditions', there is a non-zero neighborhood of $\theta_0$ in which the gradient of the likelihood functional has a root for all T bigger than some $T_0$. We shall take such a root as the estimate $\theta_T$, and show that it is asymptotically unbiased and is consistent, provided the identifiability conditions are satisfied.

We begin then with the calculation of the R-N derivative for the case where G is the identity matrix. The system (1) can be rewritten with:
where

$$m(t) = D\,u(t) + C\int_0^t e^{A(t-s)} B\,u(s)\,ds$$

in the form:

$$x(t) = \int_0^t A\,x(s)\,ds + F\,W_0(t)$$

$$v(t) = \int_0^t C\,x(s)\,ds + W(t)$$
Mainly for notational simplicity we shall assume $W_0(\cdot)$ and $W(\cdot)$ are independent. The fact that G is the identity matrix will imply that the process $v(\cdot)$ is absolutely continuous with respect to Wiener measure. However this can be explicitly demonstrated by using the well-known Kalman filtering equations for the system (3). Thus let
Then we have:

$$\hat x(t) = \int_0^t A\,\hat x(s)\,ds + \int_0^t P(s)\,C^*\,dW(s) \qquad (4)$$

where $W(\cdot)$ is the Wiener process and $P(\cdot)$ is the unique solution of:

$$P'(t) = A P(t) + P(t) A^* - P(t) C^* C P(t) + F F^*; \qquad P(0) = 0$$
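The behaviour of this Riccati equation is easy to see in the scalar case. The sketch below is a minimal illustration, assuming scalar coefficients a, c and q (standing for A, C and $FF^*$), a forward-Euler integration, and numerical values chosen only so that the steady state is known in closed form; the function name is hypothetical.

```python
def integrate_riccati(a, c, q, dt=1e-3, steps=10000):
    """Forward-Euler solution of p' = 2*a*p - c*c*p*p + q with p(0) = 0."""
    p = 0.0
    for _ in range(steps):
        p += dt * (2.0 * a * p - c * c * p * p + q)
    return p


# With a = -1, c = 1, q = 3 the steady-state equation -2p - p^2 + 3 = 0
# has the non-negative root p = 1, which P(t) approaches as t grows.
p_T = integrate_riccati(a=-1.0, c=1.0, q=3.0)
print(p_T)
```

Starting from P(0) = 0 the solution increases towards the non-negative root of the steady-state equation, which is the quantity used when the steady-state Riccati equation replaces the time-varying one.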
Multiplying the differential form of (5) by P(t)C*, we have:

and substituting for the last term on the right using (4), we have finally:
x(t)
- f t (A-P(s)C*C) % ( s ) d s = St P(s)C* 0
Letting
d ?(s)
0
@ ( t )to be a fundamental matrix solution.of
A(6)
we have: 'P(t) = #(t)
S'@(S)-'P(S)C* d ;(a) 0
and hence
,.
''W(t) = ~ ( t ).- .
St0 ~ ( t ; s ) ?(a) d '
where: K(t;s) =
J;
C @(.I
d0 O (8)-
In the differential form, dW(t) = d ?(t)
P(.)C*
(8) becomes:
- J~L ( t ; s ) d ;(s)
dt; L(t, s) = C@(~)I#(S)-'P(s)c*'
0
Let y(t) =
- st
L(t;s)dv(s)
0
Theorem. The Radon-Nikodym derivative of the measure induced by the process $v(\cdot)$ on C with respect to Wiener measure is given by:

where the second integral is an Ito integral, and $v(\cdot) \in C$.

Remark. Note that the use of the Ito integral in (9) eliminates the determinant used in [2].
We proceed next to the gradient equation. For this, let $\theta$ denote the vector of unknown parameters. We shall show that for large T it is enough to seek a root of

where

$$\hat m(t) = m(t) - \int_0^t L(t;s)\,m(s)\,ds$$

and $\nabla_\theta$ denotes the gradient with respect to $\theta$. Let

where $\theta_i$ denotes the ith unknown parameter. Then the main 'identifiability condition' is that the matrix with components

be positive definite in the limit as T goes to infinity. Using this matrix a Newton-Raphson technique for finding the root can be readily developed. Further it can be shown, for example, that for our particular problem, the identifiability condition is satisfied under an appropriate almost-periodic character imposed on the input in open loop mode.
Singular Case: Let us now consider the case (corresponding to large windgust component)

and

with

where $W_2(t)$ is a Wiener process. Then of course the measure induced by the process v(t) cannot be absolutely continuous with respect to Wiener measure. But:

Theorem: Suppose $C_1 F$ is non-singular, that is to say:

Denote this (square) matrix by K. Then

being the unique solution of

Thus x(t) is known without error, and the measure induced by $v_2(t)$, conditioned on x(t) being given, with respect to Wiener measure is given by:

where the second integral is an Ito integral. The minimization can proceed as before; indeed, the calculations are simpler.
APPENDIX I1
In this Appendix we indicate the iterative technique used to find the solution of the steady state Riccati equation

where Q is non-negative definite and we assume A is stable. The iteration is:

This is a linear equation for $P_{n+1}$; moreover let us assume that the system is observable so that

is actually non-singular, and denote the inverse by $\Lambda$, so that

Then choose

$P_0 = \Lambda$

With this choice the approximating sequence $P_n$ is actually a monotone decreasing sequence of non-negative definite matrices, the limit being the solution sought. For a proof see [5].
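An iteration of this type, in which every step requires only the solution of a linear equation, can be sketched in the scalar case. The example below is a hedged illustration and not necessarily the iteration intended here: it assumes the scalar Riccati equation $2ap - c^2p^2 + q = 0$, uses the Newton-type linearization about the current iterate, and starts from an arbitrary large value rather than from the specific choice $P_0 = \Lambda$. The function name and numbers are hypothetical.

```python
def riccati_iteration(a, c, q, p0, iters=30):
    """Iterate on the scalar steady-state equation 2*a*p - c*c*p*p + q = 0.

    Each step solves the *linear* equation
        2*(a - c*c*p_n)*p_next + c*c*p_n*p_n + q = 0
    for p_next, given the current iterate p_n.
    """
    seq = [p0]
    p = p0
    for _ in range(iters):
        p = (c * c * p * p + q) / (2.0 * (c * c * p - a))
        seq.append(p)
    return seq


# a < 0 (stable), q >= 0; the starting value 10.0 is an arbitrary large guess.
seq = riccati_iteration(a=-1.0, c=1.0, q=3.0, p0=10.0)
print(seq[:5], seq[-1])
```

In this scalar example the iterates decrease monotonically to the non-negative root p = 1, mirroring the monotone decreasing behaviour stated for the matrix sequence $P_n$.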
[Figure A-1. Functional Block Diagram of the Aircraft, Control Servo Actuator and Measurement Dynamics. Blocks: command signal; nonlinear control servo actuator; control surface; gust disturbance; aircraft (gust response transfer function, rigid body transfer function, structural response transfer function; rigid body response, structural response); ideal measurement sensors; air-data computer; measured signals (dynamic response variables, flight condition variables).]
NOMENCLATURE

$F_s$: pilot command input force, lbs
g: acceleration due to gravity constant, ft/sec²
$G_w$: gust power spectral density, (ft/sec)² sec
$K_\alpha$: angle-of-attack calibration coefficient
M: Mach number
Dimensional pitch moment coefficients
$n_z$: normal acceleration, "g's"
q: dynamic pressure, lbs/ft²
V: true velocity, ft/sec
$w_g$: incremental vertical velocity due to air gusts, ft/sec
$a_z$: distance from c.g. to accelerometer, ft
$Z_\alpha$, $Z_{\delta_e}$: dimensional normal force coefficients
$\alpha$: angle-of-attack, rad
$a_\alpha$: distance from c.g. to angle-of-attack vane, ft
$\delta_e$: elevator deflection angle, rad
$\delta_{e_c}$: commanded elevator deflection angle, rad
$\delta_p$: pilot command input deflection, inches
$\zeta_j$: damping ratio of the jth bending mode
$\theta$: pitch attitude, rad
slope of the jth bending mode at angle-of-attack vane, rad/ft
slope of the jth bending mode at pitch attitude and rate sensor, rad/ft
displacement of the jth bending mode at reference station, ft
relative displacement of the jth bending mode at the accelerometer
forcing function coefficient for bending modes, ft/sec²
characteristic frequency of gust power spectral density, rad/sec
$\omega_j$: natural frequency of the jth bending mode, rad/sec
APPENDIX III

Taking the state equation as:

$$\dot x = A x + B u, \qquad x(0) = 0$$

we wish to find u(t) so as to minimize:

where $\delta_p(t)$ is a unit step function. By the usual analysis, the optimal u(t), denoted $u_0(t)$, is given by:

[The main point to note is the appearance of $\infty$ as the upper limit in the integral.] We can express this solution alternately as

$$u_0(t) = -\frac{B^*}{\lambda}\,Y(t)$$

where

$$\dot Y(t) = -A^* Y(t) - L^*L\,x(t) + L^*\delta_p(t)$$

$$\dot x(t) = A\,x(t) - \frac{BB^*}{\lambda}\,Y(t)$$

Now if A, B is controllable,

has a unique non-negative definite solution such that

is stable. Next let

AIII (1)

Then we have:

AIII (2)

AIII (3)

and substituting from AIII (2) and AIII (3), we have

$$\dot Z(t) = -\left[A^* - \frac{P B B^*}{\lambda}\right] Z(t) + L^*\delta_p(t)$$

and hence letting D denote $(-A^* + PBB^*/\lambda)$, we have

Again from AIII (2), AIII (3) it is clear that Z(t) must have a finite value for the time average, and since D is now unstable (eigenvalues have positive real parts) it follows that

$$Z(t) = -D^{-1}L^*$$

and finally

We note that this result is already available in the literature [6], although derived on less firm grounds, and in particular not on the basis of the time-average error criterion.
References

1. L. Taylor and H. Rediess: "Flight Control Design Challenge", JACC, June 1970.
2. A. V. Balakrishnan: "Flight Control System Design I: Identification of Flight Parameters, Problem Formulation", UCLA Engineering Report No. 70-63, July 1970.
3. L. Taylor and K. Iliff: "A Modified Newton-Raphson Method for Determining Stability Derivatives from Flight Data", in 'Computing Methods in Optimization Problems', II, Academic Press, 1969.
4. A. V. Balakrishnan and V. Peterka: "Identification in Automatic Control Systems", Automatica, Vol. 5, 1969.
5. A. V. Balakrishnan: "Stochastic Differential Systems I", Lecture Notes, UCLA, January 1971.
6. M. Athans and P. Falb: "Optimal Control", p. 804. McGraw-Hill, 1966.
CENTRO INTERNAZIONALE MATEMATICO ESTIVO (C.I.M.E.)
R. GLOWINSKI
METHODES ITERATIVES DUALES POUR LA MINIMISATION DE FONCTIONNELLES CONVEXES
Corso tenuto ad Erice dal 27 giugno al 7 luglio 1971

DUAL ITERATIVE METHODS FOR THE MINIMIZATION OF CONVEX FUNCTIONALS (*)
by R. Glowinski (Université de Paris) (**)
1. Introduction

It sometimes happens that one can associate with a 'difficult' optimization problem (because the function to be minimized is non-differentiable, or because the constraints are non-trivial) a simpler optimization problem whose solution yields that of the initial problem. In what follows we describe a procedure based on the use of duality, via the Lagrangian, which allows such a transformation in a number of important cases. From it we shall deduce, in a very natural way, two optimization algorithms associated with this transformation which in fact allow the two problems to be solved simultaneously. The exposition that follows largely takes up the considerations on dual iterative methods developed in Glowinski-Lions-Tremolieres [1], chapter 2, and used systematically in the same work.

(*) Lecture given at C.I.M.E., Erice, Sicily, Italy, in June-July 1971.
(**) IRIA and Université Paris 6.
2. Functional framework

We consider a Hilbert space V, of finite or infinite dimension, and the functional:

(2.1) $J_0(v) = \tfrac12\,a(v,v) - (f,v)$

where $a(u,v) = a(v,u)\ \forall u,v \in V$ is a continuous bilinear form on V satisfying

(2.2) $a(v,v) \ge \alpha\|v\|^2,\ \alpha > 0,\ \forall v \in V$,

where $\|v\|$ = norm of v in V, and where in (2.1) $v \to (f,v)$ is a continuous linear form on V.

We are also given:

(2.3) M = closed convex set of V,

(2.4) L = Hilbert space, and $\phi$ a function from V (or from M) into L, linear or not,

(2.5) $\Lambda$ = closed convex set of L.

We assume that, $(\cdot,\cdot)_L$ denoting the scalar product in L, we have

(2.6) for every $q \in \Lambda$, the function $v \to (q, \phi(v))_L$ is convex and lower semi-continuous on V weak.

We then consider the problem:

(2.7) $\inf_{v \in M}\,\big[J_0(v) + \sup_{q \in \Lambda}\,(q, \phi(v))_L\big]$

Remark 2-1: We thus take, by definition, the problem in a form adapted to duality (cf. Glowinski-Lions-Tremolieres [1], ch. 1).
We shall see in the rest of the exposition that the formulation (2.7) contains important problems.

Remark 2-2: The function $v \to \sup_{q\in\Lambda}(q,\phi(v))_L$ is convex and lower semi-continuous for the weak topology of V; the same therefore holds for J defined by:

(2.8) $J(v) = J_0(v) + \sup_{q\in\Lambda}(q,\phi(v))_L$.

To study the convergence of the algorithms below we shall make one or other of the following hypotheses:

(2.9) $\phi \in \mathcal{L}(V; L)$ and $\Lambda$ bounded in L,

or

(2.10) M is bounded in V and $\phi$ is Lipschitz from M into L, i.e. $\|\phi(u) - \phi(v)\|_L \le C\|u - v\|\ \forall u, v \in M$.

Note that the function J(v) is strictly convex, with, in the case (2.9), $\lim_{\|v\|\to+\infty} J(v) = +\infty$. Hence under the hypotheses (2.9) or (2.10) there exists a unique u in M such that:

$J(u) \le J(v)\ \forall v \in M$.

3. Examples

Example 3.1:
Let $\Omega$ be a bounded open set of $R^n$, with boundary $\Gamma$; we take

(3.1) $V = H_0^1(\Omega) = \{v \mid v \in L^2(\Omega),\ \partial v/\partial x_i \in L^2(\Omega),\ i = 1,\dots,n,\ v|_\Gamma = 0\}$

(3.2) M = V

(3.3) $a(u,v) = \int_\Omega \operatorname{grad} u \cdot \operatorname{grad} v\,dx$

(3.6) $\phi v = \operatorname{grad} v$, hence $\phi \in \mathcal{L}(V; L)$

(3.7) $\Lambda = \{q \mid q \in L,\ |q(x)| \le g \text{ a.e. in } \Omega\}$, g = const > 0.

We recall that $v \to (\int_\Omega |\operatorname{grad} v|^2\,dx)^{1/2}$ defines a Hilbertian norm on $H_0^1(\Omega)$; we are therefore in the framework (2.9) and one would easily show:

(3.8) $\sup_{q\in\Lambda}(q,\phi(v))_L = \max_{q\in\Lambda}(q,\phi(v))_L = g\int_\Omega |\operatorname{grad} v|\,dx$,

which corresponds, for the problem (2.8), to the flow of a rigid visco-plastic fluid in a cylinder of cross-section $\Omega$ (cf. Cea-Glowinski [1], Glowinski-Lions-Tremolieres [1], Lions-Duvaut [1], etc.).
Example 3.2: In $V = H_0^1(\Omega)$ we consider the elasto-plasticity problem:

(3.9) $\min_{v\in K}\big[\tfrac12\,a(v,v) - \int_\Omega f v\,dx\big] = \min_{v \in K} J_0(v)$

with

(3.10) $K = \{v \mid v \in H_0^1(\Omega),\ |\operatorname{grad} v| \le 1 \text{ a.e.}\}$

and a(u,v) and f as in (3.3), (3.4).

Problem (3.9) admits one and only one solution, the function $J_0$ being strictly convex with $\lim_{\|v\|\to+\infty} J_0(v) = +\infty$ and K being closed in $V = H_0^1(\Omega)$.

In practice (1), one considers the restriction of $J_0$ to a subspace $V_h$ of $V = H_0^1(\Omega)$, $V_h$ being of finite dimension, such that when v ranges over $V_h$ the function

(3.11) $\phi(v) = |\operatorname{grad} v|^2 - 1$

remains in a finite-dimensional space L. Note that K is bounded in $H_0^1(\Omega)$; one can therefore restrict the problem to a bounded set M of $V_h$. We introduce:

(3.12) $\Lambda$ = {cone of the vectors $q = \{q_i\} \in L$ with $q_i \ge 0\ \forall i$}.

Then:

(3.13) $\sup_{q\in\Lambda}(q,\phi(v))_L = 0$ if $|\operatorname{grad} v|^2 - 1 \le 0$, $+\infty$ otherwise,

so that (2.7) becomes (the restriction to $V_h$ of) $\inf\,\big(\tfrac12\int_\Omega |\operatorname{grad} v|^2\,dx - \int_\Omega f v\,dx\big)$, $v \in K$, hence of problem (3.9).

It is easy to see that the function $\phi : H_0^1(\Omega) \to L^1(\Omega)$ is Lipschitz on bounded sets, so that by restriction to $V_h$ we have (2.10). In fact the restriction to $V_h$ is not necessary here, but it is useful for the verification of (4.2) below, since the existence of $p \in \Lambda$ associated with problem (3.9) restricted to $V_h$ will follow from the theory of Lagrange multipliers in finite dimension, cf. R. T. Rockafellar [1].

(1) If one works with internal approximation.

4. A saddle-point search algorithm
Let us introduce the Lagrangian:

(4.1) $\mathcal{L}(v,q) = J_0(v) + (q, \phi(v))_L$

and make the hypothesis:

(4.2) $\mathcal{L}(v,q)$ admits a saddle point on $M \times \Lambda$, i.e. there exists a point $\{u,p\} \in M \times \Lambda$ such that $\mathcal{L}(u,q) \le \mathcal{L}(u,p) \le \mathcal{L}(v,p)\ \forall v \in M,\ \forall q \in \Lambda$.

Remark 4.1: Hypothesis (4.2) holds in the case where (2.9) is satisfied: this is the classical theorem of Ky-Fan [2], Sion [1] if M is bounded; if M is not bounded, one 'approximates' M by $M_R$ and considers the restriction of $\mathcal{L}$ to $M_R \times \Lambda$; there then exists a saddle point $\{u_R, p_R\}$. Everything reduces to verifying that $u_R$ and $p_R$ are bounded independently of R when $R \to +\infty$; for $p_R$ this follows from the hypothesis that $\Lambda$ is bounded, so it remains to verify that $\|u_R\| \le C$. But:

Fixing v we deduce:

hence $J_0(u_R) \le C + C\|u_R\|$, whence the result since
Definition of the algorithm (1): We are given $p^0$, we compute $u^0$, then $p^1$, etc. The general rule is: $p^n$ being known ($\in \Lambda$), $u^n$ is then the element of M minimizing

(4.3) $J_0(v) + (p^n, \phi(v))_L$.

We then define:

(4.4) $p^{n+1} = P_\Lambda(p^n + \rho_n\,\phi(u^n))$

where

(4.5) $P_\Lambda$ = projection operator from L onto $\Lambda$

and where $\rho_n > 0$ is suitably chosen.

(1) This is an algorithm of Uzawa type, cf. H. Uzawa.

Remark 4.2: The function $v \to J_0(v) + (p^n, \phi(v))_L$ is strictly convex and is infinite at infinity in the case (2.9); (4.3) therefore indeed defines $u^n$ uniquely.

Remark 4.3 (Motivation of the algorithm): If $\{u, p\}$ is a saddle point then:

(4.6) $J_0(u) + (p, \phi(u))_L \le J_0(v) + (p, \phi(v))_L\ \forall v \in M$,

which leads to (4.3), and from (4.2):

(4.7) $(q - p, \phi(u))_L \le 0\ \forall q \in \Lambda$,

which is equivalent to:

(4.8) $p = P_\Lambda(p + \rho\,\phi(u))$, $\rho > 0$.

Remark 4.4: In fact the algorithm is not completely determined, since in (4.3) a method must be found for computing $u^n$. We shall see in Section 5 a variant in which this choice is made precise.
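The two steps (4.3), (4.4) can be sketched on a one-dimensional instance. The example below is an illustration under assumptions that are not taken from the text: V = M = L = R, $J_0(v) = \tfrac12 a v^2 - f v$, $\phi(v) = v$ and $\Lambda = [-g, g]$, so that the sup term equals $g|v|$; the scalars a, f, g, the fixed step $\rho$ and the function names are hypothetical choices. In this setting the minimization step (4.3) has the closed form $u^n = (f - p^n)/a$.

```python
def project(q, g):
    """Projection P_Lambda onto the interval Lambda = [-g, g]."""
    return max(-g, min(g, q))


def uzawa(f, g, a=1.0, rho=1.0, iters=200):
    """Uzawa-type iteration for min over v of 0.5*a*v^2 - f*v + g*|v|.

    Step (4.3): u minimizes 0.5*a*v^2 - f*v + p*v, i.e. u = (f - p)/a.
    Step (4.4): p <- P_Lambda(p + rho * phi(u)) with phi(u) = u.
    """
    p = 0.0
    u = 0.0
    for _ in range(iters):
        u = (f - p) / a
        p = project(p + rho * u, g)
    return u, p


# f = 3, g = 1: the minimizer is u = 2, with multiplier p = 1.
u1, p1 = uzawa(f=3.0, g=1.0)
# f = 0.5, g = 1: the non-differentiable term wins and the minimizer is u = 0.
u2, p2 = uzawa(f=0.5, g=1.0)
print(u1, p1, u2, p2)
```

The multiplier p returned by the iteration is the dual variable of the saddle point; in the second case it settles strictly inside $\Lambda$, which is the situation in which the primal solution sits at the kink of $|v|$.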
Convergence of the algorithm:

Theorem 4.1: We assume that (2.3), ..., (2.6) hold, as well as one of the hypotheses (2.9) or (2.10). Then the algorithm defined by (4.3), (4.4) is convergent in the following sense:

(4.9) $u^n \to u$ in V strongly,

u being the solution of (2.7), when:

(4.10) $0 < \rho_0 \le \rho_n \le \rho_1$, $\rho_1$ sufficiently small.

Proof:
From (4.4) and (4.8) we deduce, setting

(4.11) $r^n = p^n - p$

and using the fact that $P_\Lambda$ is a contraction:

(4.12) $\|r^{n+1}\|_L \le \|r^n + \rho_n(\phi(u^n) - \phi(u))\|_L$.

Moreover (4.3) is equivalent to (4.13), and (4.6) to (4.14). Taking v = u (resp. $v = u^n$) in (4.13) (resp. (4.14)) we deduce:

(4.15) $a(u^n - u, u^n - u) + (p^n - p, \phi(u^n) - \phi(u))_L \le 0$.

Under hypothesis (2.10) $\phi$ is Lipschitz; under hypothesis (2.9) $\Lambda$ is bounded, hence $u^n$ is bounded in V; therefore in all cases

(4.16) $(p^n - p, \phi(u^n) - \phi(u))_L \le -a(u^n - u, u^n - u)$.

We deduce from (4.12) that:

$\|r^{n+1}\|_L^2 \le \|r^n\|_L^2 + 2\rho_n\,(r^n, \phi(u^n) - \phi(u))_L + \rho_n^2\,\|\phi(u^n) - \phi(u)\|_L^2$.

But in all cases $\|\phi(u^n) - \phi(u)\|_L \le C\|u^n - u\|$. Let us then choose $\rho_n$ in such a way that $2\alpha\rho_n - C^2\rho_n^2 \ge \beta > 0$ (which is possible if $0 < \rho_0 \le \rho_n \le \rho_1 < 2\alpha/C^2$); then

(4.17) $\|r^{n+1}\|_L^2 + \beta\,\|u^n - u\|^2 \le \|r^n\|_L^2$.

Hence $\|r^n\|_L$ decreases with n, hence converges, so that $\|r^n\|_L^2 - \|r^{n+1}\|_L^2 \to 0$ as $n \to +\infty$, and then by (4.17) we obtain (4.9).

We shall see in Section 6 the application to Example 3.1 of the preceding algorithm; the application to Example 3.2 will be considered in a separate exposition devoted to the numerical analysis of problem (3.9): cf. Glowinski [2].
5. - A second saddle-point search algorithm (1)

We now complete Remark 4.4. We place ourselves in the framework of hypothesis (2.9) with

(5.1) $M = V$;

then in (4.3) $u^n$ is defined by (5.2), which can also be written

(5.3) $A u^n + \Phi^* p^n - f = 0$,

where $A \in \mathcal{L}(V, V')$ is defined by

$a(u, v) = (Au, v) \quad \forall\, u, v \in V$.

If an iterative algorithm is introduced for solving (5.3), one is naturally led to the following algorithm: assuming $u^n$ and $p^n$ known or computed, one defines $u^{n+1}$ by

(5.4) $u^{n+1} = u^n - \rho_1 S^{-1}(A u^n + \Phi^* p^n - f)$,

where:
$S$ = identity if $V' = V$ = finite-dimensional space, any symmetric positive definite matrix being, more generally, suitable;
$S$ = duality operator from $V$ to $V'$ if $V$ is infinite-dimensional (2).

One then defines $p^{n+1}$ by

(5.5) $p^{n+1} = P_\Lambda(p^n + \rho_2 \Phi(u^{n+1}))$,

where in (5.4) and (5.5) $\rho_1$ and $\rho_2$ are two parameters $> 0$ to be suitably chosen. One then has the following result:

(1) The algorithm considered is of ARROW-HURWICZ type [1].
(2) If it is known that $u \in W$, $W$ a Hilbert space included in $V$ with density and continuous injection (whence $W \subset V \subset V' \subset W'$), one may take for $S$ a duality operator from $W$ to $W'$.
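The iteration (5.4), (5.5) can be sketched on a small finite-dimensional case. The data below are hypothetical and not taken from the lecture: $J(v) = \frac{1}{2}(Av, v) - (f, v)$, an affine map $\Phi(v) = v - c$ (so that the derivative of $\Phi$ is the identity), $\Lambda$ the nonnegative orthant (so that $P_\Lambda$ is the componentwise positive part), $S$ the identity, and $\rho_1 = \rho_2 = 0.3$ chosen by hand. It is a minimal illustration under these assumptions, not the scheme analysed in the text.

```python
# Sketch of iteration (5.4)-(5.5) on a hypothetical 2x2 example.
# Assumptions (not from the text): J(v) = 1/2 (Av, v) - (f, v),
# Phi(v) = v - c, Lambda = nonnegative orthant, S = identity.

def matvec(A, x):
    """Dense matrix-vector product for small lists of lists."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def arrow_hurwicz(A, f, c, rho1=0.3, rho2=0.3, iters=2000):
    n = len(f)
    u = [0.0] * n
    p = [0.0] * n
    for _ in range(iters):
        Au = matvec(A, u)
        # (5.4): one gradient step on the primal variable
        # (the adjoint of the derivative of Phi is the identity here)
        u = [u[i] - rho1 * (Au[i] + p[i] - f[i]) for i in range(n)]
        # (5.5): projected ascent step on the multiplier, P_Lambda = max(0, .)
        p = [max(0.0, p[i] + rho2 * (u[i] - c[i])) for i in range(n)]
    return u, p


A = [[2.0, 0.0], [0.0, 1.0]]
f = [2.0, 3.0]
c = [0.5, 5.0]          # constraint v <= c, active only in the first component
u, p = arrow_hurwicz(A, f, c)
print(u, p)
```

With these data the saddle point can be checked by hand: the first constraint is active ($u_1 = 0.5$, $p_1 = 1$) and the second is not ($u_2 = 3$, $p_2 = 0$). Unlike the algorithm of Section 4, no minimisation problem is solved at each step; only one gradient step is taken on $u$.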
R. Glowinski On cherche
P2
SOUS
l a forme:
.Pz= PC
PI=? Utilisant
P
(5. 12) dans ( 5 . l l ) , il vient :
Mais : ZC(1-P) si
p
0 <
(5. 13)
-
<
11 rn 11
p2~2114112 2 min
(
Po, po*
+ c p l l wn
11
>
0 si
5
O
P*o ,
de sorte que
) on a :
-
11
+ p
( llrn+l
~
~
11 2)2 w ~~ I I +I ~
~ ~ +
Dbnc :
11 P 11
2
+ cpll wn
premier member de Remarque
5. 1
11
. donc converge et donc l e
decroit avec n
(5. 13) tend v e r s 0, donc
Remark 5.1
The behaviour of $p^n$ depends on the characteristics of the problem and in particular on the properties of $\Phi$. For example, $\Phi \in \mathcal{L}(V, L)$ surjective leads to $\Phi^*$ injective, and if $V = M$ one has uniqueness of $p$; it can then be shown that the whole sequence $p^n$ converges to $p$ (weakly if $V$ is infinite-dimensional). More generally, under the preceding hypotheses, the sequence $p^n$, being bounded by (4.17) or (5.13), has cluster points (weak ones, at least, in infinite dimension), and if $p$ is such a cluster point one shows without difficulty that $(u, p)$ is a saddle point of $\mathcal{L}$ on $M \times \Lambda$.

Remark 5.2
The first method provides an estimate from below of

$\inf_{v \in M} \bigl[ J(v) + \sup_{q \in \Lambda} (q, \Phi(v))_L \bigr]$.

Indeed, by definition of $u^n$,

$J(u^n) + (p^n, \Phi(u^n))_L \le J(v) + (p^n, \Phi(v))_L \le J(v) + \sup_{q \in \Lambda} (q, \Phi(v))_L \quad \forall\, v \in M$.

Remark 5.3 (Motivation of the terminology "dual iterative methods")
If $\mathcal{L}$ defined in (4.1) admits a saddle point $(u, p)$ on $M \times \Lambda$, one has:

(5.14) $\max_{q \in \Lambda} \min_{v \in M} \mathcal{L}(v, q) = \min_{v \in M} \max_{q \in \Lambda} \mathcal{L}(v, q) = \mathcal{L}(u, p)$.

The existence of the saddle point $(u, p)$ therefore implies that the problem (P) defined by (2.7) admits a solution; problem (P) will be called the primal problem. Likewise the problem (P*) defined by

(5.15) $\max_{q \in \Lambda} \min_{v \in M} \mathcal{L}(v, q)$

admits a solution; if one sets

(5.16) $J^*(q) = \min_{v \in M} \mathcal{L}(v, q)$,

then (P*) can also be written

(5.17) $\max_{q \in \Lambda} J^*(q)$.

(P*) will be called the dual problem of (P), this terminology being compatible with the notions of duality developed, among others, by R. T. ROCKAFELLAR [1]. One will verify, moreover, that the method of UZAWA, i.e. the algorithm studied in Section 4, is nothing other than the gradient method with projection (cf. MOSCO [1]) applied to the dual problem.

Strictly speaking, the two algorithms studied are of primal-dual type, since one determines and expresses the gradient of the dual functional $J^*$ in terms of the primal variable $v(q)$ associated with $q$ and solution of the problem:
6. - Application of the first algorithm to Example 3.1

We saw in Section 3 (cf. (3.8)) that in the case of Example 3.1 the explicit form of the general problem (2.8) is given by:

(6.1) $\min_{v \in H^1_0(\Omega)} \Bigl[ \int_\Omega |\mathrm{grad}\, v|^2 \, dx + 2g \int_\Omega |\mathrm{grad}\, v| \, dx - 2 \int_\Omega f v \, dx \Bigr]$,

a problem admitting a unique solution in $H^1_0(\Omega)$, whose essential difficulty lies in the non-differentiability of the term $2g \int_\Omega |\mathrm{grad}\, v| \, dx$.

Applying algorithm (4.3), (4.4) to problem (6.1) leads, in view of Section 3 (and with a few minor modifications), to the algorithm (6.2), with (6.3).
The convergence condition is then (6.4).
Numerical implementation of (6.2)

For a thorough study of the numerical aspects of (6.2) we refer to CEA-GLOWINSKI [1], GOURSAT [1], GLOWINSKI-LIONS-TREMOLIERES [1].

We shall use a finite-difference approximation: we are given $\Omega \subset \mathbb{R}^2$ ($n = 2$ being the real physical case for this problem) and, having introduced a step $h > 0$ (intended to tend to zero), we define successively: (6.5)
Rh = [ M . 1J
(6.6)
0.. 1J
I Mij E R2
=I 5 , xi
-
1 . q + J (resp. C.. 2 -1 ) 1 -
a
2
o x ( r e s p .oy).
1J
2
xi +
.
M.. = (xi. Y . ) x . ~=ih
J
1.l
ki
z [x]
= translate de
yj
h
- 5 ,
+ h- 2
y. +
J
. Y.J
=jh . i , j r Z I
[
de 4 . , parallelement 1J
4
e t on notera
qj, la valeur approchee
en M.. ; mdme rotation avec 1J
Pour dBfinir la variable duale
(du moins on ltesp&re) de u
f . . pour f. On posera: 1J
2 p ( E (L
2 (a) ) ),
il est commode d1in-
troduire un reseau Qh, de pas h Bgalement, decal6 par rapport R h de l a fason indiquee s u r la figure
6. 1 :
On utilisera systematiquement l e s relations 1
-1 j+
et
2 2 1 2 p 2 ( P = ( P , P 1) en
h
+
1 Z
et
2
Pi+
prendre
Mi+
, Pi + 1 . 1 pour les valeurs tfapproch6estl de p = ~ + ~
en compte
Mi+l j+ ; l e s points M ~ + L.+ 1 a 2 2 2 J Z sont ceux qui sont centre dlun c a r r e de cote
(v. figure 6. 1) dont un sommet au moins appartient B Roh
notera
!2 l h
-
1
llensemble de c e s points; on posera Bgalement :
et on
R. Glowinski Dans c e s conditions llalgorithme (6. 2) e s t
(p u
approch6If p a r :
donne n+ 1 i+ij
+
n+ 1 i-ij
u
,n+l
+
u.. ij+l
n+l 1
+
-
4un+1 -i j-
n g Dij Ph 'fij
(MijQaoh)
avec : Dij
1n Pi+-1 j+L 2 2
-
n
Ph
-
(6. 12)
+
1n
-
2n 1 1 P i+- j+ 2 2
-
1n
j+L
pi-L 2
2
+
P i+L 2
2h 2n 1 . 1 + pi+- J-2 2 2h
In
j-L2 -
2n pi-L 2
approchant
j+L
u
=
n+ 1 u. i+lj+l
-
2
AU
;S;;
n+l
'
au bSf
en Mi+L
2
J
j+L2
u n+l 2+lj 2h
9
+
n+l q+l
u..
2n
j+l- p . 1 . 1 2 1-5 J-2
approchant d i v p e n Mij,
2 Gi+L 2
pi+L j-- 1 2 +
- u .n+l . 13
(6. 14)
{
avec
In (6.11), (6.13), we agree to take $u^{n+1}_{kl} = 0$ if $M_{kl} \notin \Omega_{0h}$.

It is shown in CEA-GLOWINSKI [1], GLOWINSKI-LIONS-TREMOLIERES [1] that (6.11), (6.12), (6.13), (6.14) corresponds to the application of the algorithm of Section 4 (the algorithm of UZAWA) to the minimisation of a convex, non-differentiable functional $J_h(v_h)$ in finite dimension, $J_h$ being an approximation of the functional of problem (6.1), made explicit in the two references above; the convergence of (6.11) is also proved there for $0 < \rho_n \le \rho_h$.

As regards the convergence of $u_h \to u$ as $h \to 0$, this property is proved in CEA-GLOWINSKI [1], GLOWINSKI-LIONS-TREMOLIERES [1], TREMOLIERES [1], for an appropriate topology (of the type of a discretised SOBOLEV norm in $H^1_0(\Omega)$).
Numerical solution of an example:

We took $\Omega = ]0,1[ \times ]0,1[$, $f = 10$, $g = 1$, $h = 1/20$.

The iterative process (6.11) is started with $p^0 = 0$, the stopping test bearing on the differences $|u^{n+1}_{ij} - u^n_{ij}|$, $M_{ij} \in \Omega_{0h}$, with tolerance $10^{-3}$.

One notes, from (6.11), that $u_h^{n+1}$ ($p_h^n$ being given) is the solution of an approximate DIRICHLET problem, solved by over-relaxation with optimal $\omega$, initialising with $u_h^n$; in fact we worked with $\rho_n = \rho$ = constant.

With the parameter $\rho$ at its optimal value $\rho_{opt}$ (found experimentally), there is convergence in 12 iterations, that is, an execution time of the order of 1 s on an IBM 360/91. More generally, Figure 6.2 shows the dependence on $\rho$ of the speed of convergence, and one notes the realistic character of the estimate (6.4): while for $\rho = 2$ there is still convergence (in about 150 iterations), there is divergence for $\rho = 2.1$.

[Figure 6.2: number of iterations as a function of $\rho$]

Figure 6.3 shows the zones of $\Omega$ where $\mathrm{grad}\, u = 0$ and those where $\mathrm{grad}\, u \ne 0$; the zone where $\mathrm{grad}\, u = 0$ is hatched.

[Figure 6.3]
7. - Considerations on the minimisation by duality of non-differentiable convex functionals

We have seen in the preceding sections, and in particular in Section 6, that the introduction of a Lagrangian and the use of duality techniques provide an elegant method for the minimisation of certain non-differentiable functionals; on this subject we refer to CEA-GLOWINSKI-NEDELEC [1], where a number of examples in infinite dimension will be found (including Example 3.1, treated in Section 6). In fact the method is well suited to functionals whose non-differentiable part $J_1$ is positively homogeneous of degree 1, i.e.

$J_1(\lambda v) = \lambda J_1(v) \quad \forall\, \lambda \ge 0$.

In this section we wish to indicate a very simple example of minimisation of a non-differentiable functional for which, the dual problem being also non-differentiable, the preceding method does not apply (at least not directly, cf. Remark 7.1).

7.2 : Statement of the problem

We suppose $V = \mathbb{R}^N$, and we are given a symmetric positive definite matrix $A$, $f \in \mathbb{R}^N$ and $J_0$ defined by:

(7.1) $J_0(v) = (Av, v) - 2(f, v)$,

where $(\cdot, \cdot)$ denotes the standard Euclidean scalar product of $\mathbb{R}^N$. We set:

(7.2) $\|v\|_1 = \sum_{i=1}^{N} |v_i|, \qquad \|v\|_\infty = \max_i |v_i|$.

We then consider the functional, visibly non-differentiable, defined by:

(7.3) $J(v) = (Av, v) - 2(f, v) + \|v\|_1^2$.

$J$ being continuous, strictly convex (thanks to $J_0$), with $\lim_{\|v\| \to +\infty} J(v) = +\infty$, it admits a unique optimum on $V$, say $u$; the problem:

(7.4) $\min_{v \in V} J(v)$

is therefore well posed. Before bringing out, in 7.3, the duality properties of problem (7.4), we shall indicate how the results of the preceding sections can be used to solve (7.4); this is the object of:
Remark 7.1 : We parametrise (7.4) in the following way. Let $\lambda \ge 0$; with (7.4) we associate the family of problems $(P_\lambda)$ defined by:

$(P_\lambda) \qquad \min_{v \in V} \bigl[ J_0(v) + \lambda \|v\|_1 \bigr]$.

Then:

Proposition 7.1 :
i) $(P_\lambda)$ admits a unique solution, $u_\lambda$, $\forall\, \lambda \ge 0$.
ii) $\lambda \to \|u_\lambda\|_1$ is a continuous decreasing function from $\mathbb{R}_+$ to $\mathbb{R}_+$.

Proof: Point (i) is evident. To prove (ii) one notes (cf. U. MOSCO [1]) that if $u_\lambda$ and $u_\mu$ are the solutions associated respectively with $\lambda$ and $\mu$, one has:
whence, by addition:

(7.6) $(J_0'(u_\mu) - J_0'(u_\lambda),\, u_\mu - u_\lambda) + (\mu - \lambda)\,(\|u_\mu\|_1 - \|u_\lambda\|_1) \le 0$;

now:

(7.7) $(J_0'(u_\mu) - J_0'(u_\lambda),\, u_\mu - u_\lambda) \ge 2c\, \|u_\mu - u_\lambda\|^2$,

where $c$ ($c > 0$) is the smallest eigenvalue of $A$. This shows that the function is decreasing; to prove continuity one notes that
I\ uA\I1
<_
~h >- 0
11 U; 11
dloh, compte tenu de (7.6) , (7.7) :
ce qui dCmontre la continuit6 (uniforme) de llapplication
+ :R A
.
V donc celle de
,
X*)luXJll
: R++
R+
On en t i r e la : Proposition 7 . 2 :
L1equation
seule qui est egale a
u
A
=
)) uh\ll admet une solution et une
11 l.
Demonstration : Evidente et illustree par la figure 7. 1 ( u = ~ - ' f ) 0
IIuoIl 1
Hu~ : IIull
.* 1
Figure 7. I
R. Glowinskj Donc si on sait resoudre
( P k ) on e s t
probleme
A-
une variable
Ilu A
(Il
rzmene
l a resolution du
etant continue e t decroissante cela ne pose pas de En c e qui concerne
A
=
-
[--, -In 2 2 X
X 1
difficultes.
( P A ) , on verifiers q u t i l entre dans l e cadre des
problemes etudies dans l e s N.
X
-+ 11 u (I
= 0 ; la fonction
precedents ( avec
) e t que ltalgorithme
L
etudie au
=
Rn , $ = I ,
NO
4 prend la
avec : (7. 10)
( p ~ ( q =) max ~ (
X , - 2
On demontrerait la convergence de (7. 11) sous ( 7 . 11)
O < d L
pn
zp
A
min ( y - , q i )
)
l a condition :
<2c
C Btant (toujours) l a plus petite valeur propre de A .
Ceci
termine la r e m a r q u e 7. 1.
7.3 : Determination of a dual problem

1) We reduce the non-differentiable problem (7.4), an unconstrained problem, to a problem with constraints; indeed (7.4) is identical to:

(7.12) $\min_{v, \mu} \bigl[ (Av, v) - 2(f, v) + \mu^2 \bigr]$ under the constraint $\|v\|_1 - \mu = 0$.

In (7.12) the constraint is non-convex, but a very simple argument shows that it can be replaced by $\|v\|_1 - \mu \le 0$, which is of convex type in $(v, \mu)$; whence the equivalence between (7.4), (7.12) and (7.13) defined by:

(7.13) $\min_{v, \mu} \bigl[ (Av, v) - 2(f, v) + \mu^2 \bigr]$ under the constraint $\|v\|_1 - \mu \le 0$.

With the help of the classical min-max theorems (VON NEUMANN, SION, etc.) or directly, one would show that $(u, \|u\|_1; \|u\|_1)$ is the unique saddle point of

(7.14) $\mathcal{L}(v, \mu; \lambda) = (Av, v) - 2(f, v) + \mu^2 + 2\lambda\,(\|v\|_1 - \mu)$

on $\mathbb{R}^N \times \mathbb{R} \times \mathbb{R}_+$, and that $\|u\|_1$ is a Lagrange multiplier (1) of problems (7.12), (7.13).

From all this we deduce:

(7.15) $\min_v J(v) = \max_{\lambda \ge 0} \min_{v, \mu} \mathcal{L}(v, \mu; \lambda)$,

(7.16) $\min_{v, \mu} \mathcal{L}(v, \mu; \lambda) = \min_v \bigl[ (Av, v) - 2(f, v) + 2\lambda \|v\|_1 - \lambda^2 \bigr]$,

whence:

$\min_v J(v) = \max_{\lambda \ge 0} \min_v \bigl[ (Av, v) - 2(f, v) + 2\lambda \|v\|_1 - \lambda^2 \bigr]$
$\quad = \max_{\lambda \ge 0} \min_v \max_{\|t\|_\infty \le \lambda} \bigl[ (Av, v) - 2(f, v) + 2(t, v) - \lambda^2 \bigr]$
$\quad = \max_{\lambda \ge 0} \max_{\|t\|_\infty \le \lambda} \min_v \bigl[ (Av, v) - 2(f, v) + 2(t, v) - \lambda^2 \bigr]$

(7.17) $\quad = \max_t \min_v \bigl[ (Av, v) - 2(f, v) + 2(t, v) - \|t\|_\infty^2 \bigr]$.

But

$\min_v \bigl[ (Av, v) - 2(f, v) + 2(t, v) \bigr] = -\bigl( A^{-1}(f - t),\, f - t \bigr)$,

whence, setting $g = A^{-1} f$, the formulation of the dual problem:

(7.18) $\min_t \bigl[ (A^{-1} t, t) - 2(g, t) + \|t\|_\infty^2 \bigr]$.

The functional $J^*(t) = (A^{-1} t, t) - 2(g, t) + \|t\|_\infty^2$ is non-differentiable, and the use of a gradient method on the dual problem (7.18) leads to difficulties linked to the non-differentiability of $t \to \|t\|_\infty^2$.

(1) This $\lambda$ is to be distinguished from that of Remark 7.1.
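The relation between (7.4) and (7.18) can be checked numerically on a small case. Expanding $-(A^{-1}(f - t), f - t)$ in (7.17) gives $\min_v J(v) = -\min_t J^*(t) - (g, f)$. The sketch below evaluates both sides by brute-force search on a grid for $N = 2$; the matrix $A$, the vector $f$, the search box $[-1, 1]^2$ and the grid size are hypothetical choices made for illustration, and a grid search only approximates each minimum to within roughly the grid step.

```python
# Numerical check of min J = -min J* - (g, f) for N = 2.
# Assumed data (hypothetical): A = [[2, 0.5], [0.5, 1]], f = (1, -0.5).
A = ((2.0, 0.5), (0.5, 1.0))
f = (1.0, -0.5)

det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
Ainv = ((A[1][1] / det, -A[0][1] / det),
        (-A[1][0] / det, A[0][0] / det))


def quad(M, x):
    """Quadratic form (Mx, x) for a 2x2 matrix."""
    return sum(M[i][j] * x[i] * x[j] for i in range(2) for j in range(2))


def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]


# g = A^{-1} f, as in the text
g = (Ainv[0][0] * f[0] + Ainv[0][1] * f[1],
     Ainv[1][0] * f[0] + Ainv[1][1] * f[1])


def J(v):
    """Primal functional of (7.4): (Av, v) - 2(f, v) + ||v||_1^2."""
    return quad(A, v) - 2.0 * dot(f, v) + (abs(v[0]) + abs(v[1])) ** 2


def J_star(t):
    """Dual functional of (7.18): (A^-1 t, t) - 2(g, t) + ||t||_inf^2."""
    return quad(Ainv, t) - 2.0 * dot(g, t) + max(abs(t[0]), abs(t[1])) ** 2


def grid_min(F, lo=-1.0, hi=1.0, n=401):
    """Smallest value of F over an n x n grid of the box [lo, hi]^2."""
    h = (hi - lo) / (n - 1)
    pts = [lo + k * h for k in range(n)]
    return min(F((x, y)) for x in pts for y in pts)


primal = grid_min(J)
dual = -grid_min(J_star) - dot(g, f)
print(primal, dual)
```

The two printed values should agree up to the grid resolution. Both functionals are evaluated directly, with no use of gradients, which sidesteps the non-differentiability of $\|v\|_1^2$ and $\|t\|_\infty^2$ at the price of a search that is only practical in very low dimension.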
8. Un probleme de contrsle optimal ------ avec fonction cout s n differentiable Soit R un ouvert borne de R Q =
n X ] O , T [ ( T fi")
on s e donne
,
n
2
de fronti@re =
r
x ] 0, T
suffisemment r e g d i e r e et
C;
:
I
Y ( x , O ) = yo (x) dans R
G. Glowinski avec
I E L~ (Q) , y o E L~ (Q). 1 L e contr6le v &ant donne, (8. 2) admet une solution unique dans H (Q),
soit
y (v)
(8.3) avec o(
, et on peut definir l a fonction co3t
J(v) =
>
o
(QIy(v)
-jd12dxdt+dL
:
1vl2dxdt
zrhlvla
. /j > o e t a E L 2 (Q).
I1 r e s u l t e de
LIONS [I]
que l e probleme :
Min J ( v) vEUad adrnet une solution unique , soit Compte
u , qui e s t l e contrzle optimal.
tenu de LIONS , loc. c i t . , et des N. 2 e t 3 , on montrerait
facilement llexistence et lrunicite de p. p. s u r Q solution de At
f
=
+ u
sur
y (x,t) = o y (x,o) = y
P (x,t)
=
0
sur sur >
R
~
p. p . sur
Q I1
u 20
IXI Xu
x O] , T [
sur R
(x)
o
+PA+p
(du
~
s u r x
p(r,T)=o o(u
avec
:
'Y - b y (8. 5)
( y,p, u,X)
11
1
+PA+p ) u =
1.1
= 0
It 11
I
X(x t ) ( 1~
dt
Conversely, if $(y, p, u, \lambda)$ is a solution of (8.5), (8.6), (8.7), then $u$ is the optimal control. The use of (8.5), (8.6), (8.7), combined with algorithms of the type of those developed in Sections 4 and 5, has given satisfactory numerical results for problem (8.4), even for fairly small values of $\alpha$.
9. - A remark on the behaviour of Lagrange multipliers

It has often been observed that, the functional $J$ being given, the efficiency of dual methods (in particular those of Sections 4 and 5) is a "decreasing" function of the convex set $K$. This is not very surprising, since in the limit, when $K$ reduces to a point, the optimisation problem considered does not in general admit Lagrange multipliers. We shall exhibit this phenomenon on a very simple example.

Let $V = \mathbb{R}$, with $J$ defined by (9.1). For $K_\varepsilon$ we take the convex set $K_\varepsilon = \{ y \mid |y| \le \varepsilon \}$, membership in $K_\varepsilon$ being expressed by (9.2). Whence the Lagrangian (9.3).

The optimal solution of

(9.4) $\min_{y \in K_\varepsilon} J(y)$

is given in an evident way by:

(9.5) $y_\varepsilon = \min(1, \varepsilon)$,

and the corresponding Lagrange multiplier $\lambda_\varepsilon$ is obtained from:

(9.6) $\min_{y \in K_\varepsilon} J(y) = \min_y \mathcal{L}(y, \lambda_\varepsilon)$,

whence:
BIBLIOGRAPHY

ARROW K. J. - HURWICZ L. [1] : In Arrow-Hurwicz-Uzawa, Studies in linear and non linear programming. Stanford University Press, 1958.

CEA J. - GLOWINSKI R. [1] : Méthodes numériques pour l'écoulement laminaire d'un fluide rigide visco-plastique incompressible. To appear (IRIA report available).

CEA J. - GLOWINSKI R. - NEDELEC J. C. [1] : Méthodes duales pour la minimisation de fonctionnelles non différentiables. To appear in the proceedings of the Colloque d'analyse numérique de DUNDEE 1971, Springer-Verlag.

DUVAUT G. - LIONS J. L. [1] : Les inéquations en mécanique et en physique. DUNOD, 1971.

GLOWINSKI R. [1] : Méthodes Numériques pour l'écoulement stationnaire d'un fluide rigide visco-plastique incompressible. Proceedings of the 2nd Int. Conf. on Num. Methods in Fluid Dynamics, Lecture Notes in Physics, 8, Springer-Verlag, 1971.

GLOWINSKI R. [2] : Lecture at this CIME meeting devoted to the numerical analysis of the elasto-plastic problem.

GLOWINSKI - LIONS - TREMOLIERES [1] : Book on the numerical analysis of variational inequalities, to appear in 1972, DUNOD.

GOURSAT [1] : Analyse Numérique de problèmes d'élasto-plasticité et de visco-plasticité. Thèse de 3e cycle, PARIS-IRIA, 1971.

KY-FAN [1] : Sur un théorème de Min Max. C. R. A. S. Paris, 259, 3925-3928.

LIONS J. L. [1] : Contrôle optimal de systèmes gouvernés par des équations aux dérivées partielles. DUNOD-GAUTHIER-VILLARS, 1968.

MOSCO U. [1] : Lecture at this CIME meeting.

ROCKAFELLAR R. T. [1] : Convex Analysis. Princeton Univ. Press, 1970.

SION M. [1] : On general minimax theorems. Pacific J. Math. 8, 1958, 171-176.

UZAWA H. [1] : In ARROW-HURWICZ-UZAWA, loc. cit.
CENTRO INTERNAZIONALE MATEMATICO ESTIVO (C.I.M.E.)

J. L. LIONS

Course held at Erice from 27 June to 7 July 1971

NUMERICAL APPROXIMATION OF EVOLUTION INEQUALITIES

J. L. LIONS (Paris)
Introduction. - This course presents the fundamental methods for the numerical solution of the evolution inequalities arising in Mechanics and Physics. The numerical experiments, carried out at I.R.I.A. (Paris), will be presented in full detail in a book by R. Glowinski, R. Trémolières and the author, to appear with Dunod.
Detailed plan.

CHAPTER 1. - Parabolic evolution inequalities. Type I.
1. Examples.
2. General formulation.
3. Strong and weak solutions.
4. Generalities on constructive approximation methods.
4.1 Reduction to a parabolic equation. Penalisation.
4.2 Reduction to a parabolic equation. Regularisation.
4.3 Reduction to an elliptic inequality. Elliptic regularisation.
4.4 Reduction to an elliptic inequality. Discretisation.
4.5 Evolution inequalities and saddle points.

CHAPTER 2. - Approximation by discretisation of parabolic inequalities of type I.
1. Approximation of a pair of spaces. Stability constant.
2. Approximation schemes for parabolic inequalities of type I.
3. Analogue of stability.
4. Study of convergence.

CHAPTER 3. - Parabolic evolution inequalities of type II.
1. Examples.
2. General formulation.
3. Approximation schemes.
4. Stability and convergence.

CHAPTER 4. - Evolution inequalities of second order in t.
1. Examples.
2. General formulation.
3. Approximation schemes.
4. Stability and convergence.

CHAPTER 5. - Complements and problems.
1. Flow of Bingham fluids.
2. Open problems.

BIBLIOGRAPHY.
CHAPTER 1. - Parabolic evolution inequalities. Type I.

1. - Examples.

Example 1.1. - The theory of diffusion in porous media (cf. Duvaut-Lions [1]) leads to problems of the following type.

Let $\Omega$ be a bounded open set of $\mathbb{R}^n$ ($n = 2$ or $n = 3$ in the applications) with "regular" boundary $\Gamma$. Let $n$ be the normal to $\Gamma$ directed toward the exterior of $\Omega$.

We seek a function $u = u(x, t)$, $x \in \Omega$, $t > 0$, solution of

(1.1) $\dfrac{\partial u}{\partial t} - \Delta u = f$ in $\Omega \times ]0, T[$, $T > 0$ finite and arbitrary

(where $f$ is given in $\Omega \times ]0, T[$), with the initial condition

(1.2) $u(x, 0) = u_0(x)$, $u_0$ given in $\Omega$,

and the boundary conditions

(1.3) $u \ge 0$ on $\Sigma$, $\quad \dfrac{\partial u}{\partial n} \ge 0$ on $\Sigma$, $\quad u\, \dfrac{\partial u}{\partial n} = 0$ on $\Sigma$, $\qquad \Sigma = \Gamma \times ]0, T[$.

Remark 1.1. - Problem (1.1), (1.2), (1.3) is nonlinear because of the boundary conditions (1.3).

Remark 1.2. - From $u\, \dfrac{\partial u}{\partial n} = 0$ on $\Sigma$, one sees that $u = 0$ on $\Sigma_0 \subset \Sigma$ and $\dfrac{\partial u}{\partial n} = 0$ on $\Sigma - \Sigma_0$. But $\Sigma_0$ is not given a priori.

Orientation. The aim of this first chapter is:
a) to show briefly how problem (1.1), (1.2), (1.3) (and, in truth, much more general problems) is well posed (1);
b) to give methods for the numerical approximation of the solution of the problem.

Let us give a second example.

Example 1.2. We seek $u$ satisfying (1.1), (1.2) and the boundary conditions

(1.4) $u\, \dfrac{\partial u}{\partial n} + g |u| = 0$ on $\Sigma$, $\quad g$ constant $> 0$.

In other words:

We shall see that this problem is again well posed.

(1) For a more systematic study of the theory, cf. H. Brézis [1], J. L. Lions [1], [2].
2. - General formulation.

We now give an "abstract" formulation of inequality problems of parabolic type, and then show how this formulation contains, in particular, the examples of Section 1.

Let $V$ and $H$ be two Hilbert spaces (1) over $\mathbb{R}$, with

(2.1) $V \subset H$, $V$ dense in $H$, the injection of $V$ into $H$ being continuous.

We denote by:

(2.2) $|\cdot|$ the norm in $H$, $(\cdot, \cdot)$ the corresponding scalar product in $H$, $\|\cdot\|$ the norm in $V$.

By (2.1), there exists a constant $c > 0$ such that

(2.3) $|v| \le c\, \|v\| \quad \forall\, v \in V$.

We are next given:

(2.4) $a(u, v)$, a continuous bilinear form on $V \times V$, coercive in the sense: there exists $\lambda$ such that $a(v, v) + \lambda |v|^2 \ge \alpha \|v\|^2$, $\alpha > 0$, $\forall\, v \in V$,

and we are also given:

(2.5) $K$ = closed convex set in $V$;

(2.6) $j$ = continuous convex function from $V$ to $\mathbb{R}$.

We identify $H$ with its dual and introduce the space $V'$, dual of $V$, so that

(2.7) $V \subset H \subset V'$.

If $f \in V'$, we denote by $(f, v)$ its scalar product with $v \in V$; this notation is compatible with that of the scalar product in $H$.

The problem. We seek a function $t \to u(t)$ from $[0, T]$ to $V$ (2) such that

(2.8) $u(t) \in K$,

(2.9) $\Bigl( \dfrac{\partial u(t)}{\partial t},\, v - u(t) \Bigr) + a(u(t), v - u(t)) \ge (f(t), v - u(t)) \quad \forall\, v \in K$,

(2.10) $u(0) = u_0$.

Another problem is: we seek $u = u(t)$ from $[0, T]$ to $V$ such that

(2.11) $\Bigl( \dfrac{\partial u(t)}{\partial t},\, v - u(t) \Bigr) + a(u(t), v - u(t)) + j(v) - j(u(t)) \ge (f(t), v - u(t)) \quad \forall\, v \in V$,

with (2.10).

The inequality (2.9) or (2.11) is what we shall call here a parabolic inequality of type I (3).

Remark 2.1. If $K = V$ or if $j = 0$, (2.9) and (2.11) reduce to the equation:

(2.12) $\Bigl( \dfrac{\partial u(t)}{\partial t},\, v \Bigr) + a(u(t), v) = (f(t), v) \quad \forall\, v \in V$.

(1) One can go much further, by taking for $V$ a reflexive Banach space. Cf. Lions [2] and the bibliography of that book.
(2) Whose properties will have to be made precise.
(3) Cf. the inequalities of type II in Chap. 3.
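Remark 2.2. If the function $v \to j(v)$ is differentiable on $V$, then (2.11) is equivalent to the (in general nonlinear) equation:

(2.13) $\Bigl( \dfrac{\partial u(t)}{\partial t},\, v \Bigr) + a(u(t), v) + (j'(u(t)), v) = (f(t), v) \quad \forall\, v \in V$.

Remark 2.3. Let us introduce $A \in \mathcal{L}(V; V')$ by

(2.14) $(Au, v) = a(u, v)$.

Then (2.13) is equivalent to

(2.15) $\dfrac{\partial u}{\partial t} + Au + j'(u) = f$.

Remark 2.4. If one considers the indicator function $\psi_K$ of $K$ (1):

(2.16) $\psi_K(v) = 0$ if $v \in K$, $\quad \psi_K(v) = +\infty$ if $v \notin K$,

then (2.8), (2.9) is equivalent to:

(2.17) $\Bigl( \dfrac{\partial u(t)}{\partial t},\, v - u(t) \Bigr) + a(u(t), v - u(t)) + \psi_K(v) - \psi_K(u(t)) \ge (f(t), v - u(t)) \quad \forall\, v \in V$.

The inequalities (2.9) are therefore particular cases of the inequality

(2.18) $\Bigl( \dfrac{\partial u(t)}{\partial t},\, v - u(t) \Bigr) + a(u(t), v - u(t)) + \varphi(v) - \varphi(u(t)) \ge (f(t), v - u(t)) \quad \forall\, v \in V$,

where $\varphi$ is a proper convex function (cf. the course of U. Mosco [1]).

Using the notion of subdifferential, one sees that (2.18) is equivalent to

(2.19) $f - \dfrac{\partial u}{\partial t} - Au \in \partial\varphi(u)$,

a multivalued parabolic equation.

(1) The function $\psi_K$ is convex and lower semicontinuous.

Example 2.1. - Let us see how the general statement covers the problem of Example 1.1. We introduce (notation of the courses of R. Glowinski and U. Mosco):

(2.20) $V = H^1(\Omega)$, $\quad H = L^2(\Omega)$,

(2.21) $K = \{ v \mid v \in H^1(\Omega),\ v \ge 0 \text{ on } \Gamma \}$,

(2.22) $a(u, v) = \sum_{i=1}^{n} \int_\Omega \dfrac{\partial u}{\partial x_i} \dfrac{\partial v}{\partial x_i}\, dx$.

Then problem (2.8), (2.9), (2.10) is equivalent to problem (1.1), (1.2), (1.3).

Example 2.2. We take $V$, $H$, $a(u, v)$ as in (2.20), (2.22) and we introduce

(2.23) $j(v) = g \int_\Gamma |v|\, d\Gamma$.

Then problem (2.11), (2.10) is equivalent to problem (1.1), (1.2), (1.4).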
Orientation. We shall now make precise in what sense the "solutions" of the preceding problems are to be understood.

3. - Strong and weak solutions.

Strong solutions. By a "strong solution" of problem (2.8), (2.9), (2.10) we shall mean a function $u$ such that

(3.1) $u \in L^2(0, T; V)$ (1),

(3.2) $\dfrac{\partial u}{\partial t} \in L^2(0, T; V')$,

(3.3) $u(t) \in K$ a.e. in $t$ (which is not the same as: for all $t \in [0, T]$),

(3.4) except perhaps for $t$ in a set $Z \subset [0, T]$ of measure zero, one has:
$\Bigl( \dfrac{\partial u(t)}{\partial t},\, v - u(t) \Bigr) + a(u(t), v - u(t)) \ge (f(t), v - u(t)) \quad \forall\, v \in K$,

and naturally (2.10) (2):

(3.5) $u(0) = u_0$.

Evidently (3.3) imposes (3)

(3.6) $u_0 \in K$.

(1) $L^2(0, T; X)$ = space of "functions" $t \to u(t)$ from $[0, T]$ to $X$ which are measurable and such that $\int_0^T \|u(t)\|_X^2\, dt < \infty$.
(2) It follows from (3.1) and (3.2) that $t \to u(t)$ is, after possible modification on a set of measure zero, continuous from $[0, T]$ to $H$. Then $u(0)$ makes sense.
(3) As long as one works with strong solutions, where (3.3) holds for all $t$.
It is important for the applications to introduce a notion of weak solution (cf. Lions-Stampacchia [1], Brézis [2]).

To simplify the exposition we take

(3.7) $u_0 = 0$.

One then observes that if $u$ is a "strong" solution of (3.4), one has:

(3.8) $\int_0^T \Bigl[ \Bigl( \dfrac{\partial v}{\partial t},\, v - u \Bigr) + a(u, v - u) - (f, v - u) \Bigr] dt \ge 0$
$\forall\, v \in L^2(0, T; V)$ such that $\dfrac{\partial v}{\partial t} \in L^2(0, T; V')$ and $v(t) \in K$ a.e. and $v(0) = 0$.

But since $\dfrac{\partial u}{\partial t}$ no longer appears in (3.8), one can define $u$ as a weak solution if $u$ satisfies (3.1), (3.3) and (3.8).

Remark 3.1. - One evidently has analogous notions of "strong" and "weak" solutions relative to the inequality (2.11).

Remark 3.2. - Threshold of regularity. The solution $u(t)$ of the preceding problems is not a "very regular" function of $t$, whatever the regularity of the data $f$ and $u_0$. Take indeed $V = H = \mathbb{R}$, $a(u, v) = 0$ (which satisfies (2.4) when $V = H$).

The solution is indicated on the graph opposite. One sees that, in particular, $\dfrac{\partial^2 u}{\partial t^2} \notin L^2(0, T)$.
General results. The following results are proved (1) (cf. for example Lions [2] and the bibliography of that work):

Theorem 3.1. - Suppose $f \in L^2(0, T; V')$ and suppose that (2.4) holds. Then there exists one and only one function $u$ satisfying (3.1), (3.3), (3.8).

Theorem 3.2. - Suppose that (2.4) holds and that (3.9) holds. Then there exists one and only one strong solution of (3.1) ... (3.5).

Remark 3.3. - One has analogous statements for the inequality (2.11).

Remark 3.4. The uniqueness of strong solutions is immediate. For the uniqueness of weak solutions, if $u_1$ and $u_2$ are two possible solutions, one introduces

$w = \dfrac{1}{2}(u_1 + u_2)$,

then $w_\varepsilon$, solution of

$\varepsilon \dfrac{\partial w_\varepsilon}{\partial t} + w_\varepsilon = w$, $\quad w_\varepsilon(0) = 0$,

and one takes $v = w_\varepsilon$ in each of the inequalities (3.8) relative to $u_1$ and $u_2$. One adds, and one can then let $\varepsilon \to 0$. Cf. H. Brézis [2].

(1) We give below some indications on the possible constructive methods of proof, and in Chap. 2 we give the numerical approximation of the solution (which can, moreover, ...

4. - Generalities on constructive approximation methods.
4.1. - Reduction to a parabolic equation. Penalisation.

Let $\beta$ be a penalisation operator attached to $K$ (cf. Lions [2], p. 370, and the courses of R. Glowinski and U. Mosco). One "approximates" (3.4) by the penalised equation

(4.1) $\Bigl( \dfrac{\partial u_\varepsilon}{\partial t},\, v \Bigr) + a(u_\varepsilon(t), v) + \dfrac{1}{\varepsilon} (\beta(u_\varepsilon(t)), v) = (f(t), v) \quad \forall\, v \in V$,

where $\varepsilon > 0$ is intended to tend to 0, with

$u_\varepsilon(0) = 0$.

This is a monotone nonlinear parabolic problem (since $\beta$ is, by definition, monotone from $V$ to $V'$), which is known to admit a unique solution.

One shows (Lions [2]) that, for example under the conditions of Theorem 3.1, $u_\varepsilon \to u$ in $L^2(0, T; V)$ weakly as $\varepsilon \to 0$, where $u$ is the weak solution. One can then approximate $u_\varepsilon$ by one of the methods for solving such equations.
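The effect of penalisation can be seen on a scalar toy case. The sketch below takes $V = H = \mathbb{R}$, $a = 0$, $K = [0, +\infty[$, $f = -1$ and $u_0 = 1$; these data, the choice $\beta(u) = u - P_K(u) = \min(u, 0)$, and the use of an implicit Euler step in time are all assumptions made for illustration. For these data the solution of the inequality is $u(t) = \max(1 - t, 0)$.

```python
# Penalised equation u' + (1/eps) * beta(u) = f on V = H = R, with a = 0.
# Assumptions (not from the text): K = [0, +inf), beta(u) = min(u, 0),
# f = -1, u0 = 1, implicit Euler in time.

def penalised_euler(eps, dt=0.01, T=2.0, u0=1.0, f=-1.0):
    """Return the list of values u_eps(n * dt), n = 0 .. T/dt."""
    n_steps = int(round(T / dt))
    u = u0
    history = [u]
    for _ in range(n_steps):
        w = u + dt * f                 # new value if the penalty is inactive
        if w >= 0.0:
            u = w
        else:
            # penalty active: solve u_new * (1 + dt/eps) = w exactly
            u = w / (1.0 + dt / eps)
        history.append(u)
    return history


def exact(t):
    """Solution of the inequality for these data."""
    return max(1.0 - t, 0.0)


def max_error(eps, dt=0.01, T=2.0):
    hist = penalised_euler(eps, dt=dt, T=T)
    return max(abs(u - exact(n * dt)) for n, u in enumerate(hist))


for eps in (0.1, 0.01, 0.001):
    print(eps, max_error(eps))
```

In this example the penalised solution leaves $K$ by at most $\varepsilon$ (it settles near $-\varepsilon$ once the constraint becomes active), which illustrates the convergence $u_\varepsilon \to u$ as $\varepsilon \to 0$.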
4.2. - Reduction to a parabolic equation. Regularisation.

In the case of the inequalities (2.11) one can introduce $j_\varepsilon(v)$, a differentiable convex functional approximating $j(v)$. For example, one takes (4.2), where $\psi_\varepsilon$ is convex, differentiable, and for example

$\psi_\varepsilon(\lambda) = |\lambda|$ if $|\lambda| \ge \varepsilon$.

One then "approximates" (2.11) by the regularised equation

(4.3) $\Bigl( \dfrac{\partial u_\varepsilon}{\partial t},\, v \Bigr) + a(u_\varepsilon, v) + (j_\varepsilon'(u_\varepsilon), v) = (f, v) \quad \forall\, v \in V$.

This is again a monotone nonlinear parabolic equation, and one again verifies that $u_\varepsilon \to u$ in $L^2(0, T; V)$ weakly, $u$ being the weak solution.
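A regularisation of the absolute value of the kind described above can be sketched as follows. The text only fixes $\psi_\varepsilon(\lambda) = |\lambda|$ for $|\lambda| \ge \varepsilon$; the quadratic completion used inside $]-\varepsilon, \varepsilon[$ below is one common choice and is an assumption, picked so that $\psi_\varepsilon$ is convex and continuously differentiable.

```python
# One possible regularisation of lambda -> |lambda|.
# Assumption: inside (-eps, eps) we use lambda^2 / (2 eps) + eps / 2,
# which matches |lambda| and its derivative at |lambda| = eps.

def psi_eps(lam, eps):
    """Convex, C^1 approximation of |lam|; equals |lam| for |lam| >= eps."""
    a = abs(lam)
    if a >= eps:
        return a
    return lam * lam / (2.0 * eps) + eps / 2.0


def dpsi_eps(lam, eps):
    """Derivative of psi_eps: a nondecreasing function with values in [-1, 1]."""
    if lam >= eps:
        return 1.0
    if lam <= -eps:
        return -1.0
    return lam / eps


eps = 0.1
samples = [k / 100.0 for k in range(-50, 51)]
gap = max(psi_eps(x, eps) - abs(x) for x in samples)
print(gap)
```

With this choice $0 \le \psi_\varepsilon(\lambda) - |\lambda| \le \varepsilon / 2$ for all $\lambda$, so the regularised functional differs from the original one by a quantity that vanishes with $\varepsilon$, while $\psi_\varepsilon'$ is monotone, consistent with the monotone character of the regularised equation.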
4.3. - Reduction to an elliptic inequality. Elliptic regularisation.

To reduce, for example, (3.8) to an already known situation, we have so far reduced the problem to evolution equations. One can try to reduce it to stationary inequalities. For this, one considers the problem of elliptic nature: find $u_\varepsilon \in \mathcal{K}_0$, where

(4.4) $\mathcal{K}_0 = \Bigl\{ v \;\Big|\; v \in L^2(0, T; V),\ \dfrac{\partial v}{\partial t} \in L^2(0, T; H),\ v(t) \in K \text{ a.e.},\ v(0) = 0 \Bigr\}$,

solution of (4.5).

This inequality falls within the framework of the elliptic variational inequalities studied in the courses of R. Glowinski and U. Mosco (1).

One shows (cf. Lions-Stampacchia [1] for example) the existence of $u_\varepsilon$, solution of (4.5), and the convergence of $u_\varepsilon$ to the weak solution as $\varepsilon \to 0$.

Remark 4.1. - For the approximation of the solution $u_\varepsilon$ of (4.5) one can use the methods of the courses of Glowinski and Mosco. It is possible (but not yet verified on numerical examples) that the use of (4.5) is helpful for computations over long time intervals. Cf. Carasso [1] for the case of equations.

Remark 4.2. - One can use simultaneously the ideas of 4.1 or 4.2 and 4.3. One can thus reduce the problem to stationary equations.

(1) Note that problem (4.5) is non-symmetric even if one starts from a symmetric form $a(u, v)$.

4.4. - Reduction to an elliptic inequality. Discretisation.

Perhaps the most natural method of reduction to the elliptic case is to use the discretisation of the derivative in $t$. Because of the essential importance of this procedure for numerical applications, we study it in detail in Chap. 2.
4.5. - Evolution inequalities and saddle points (cf. Trémolières [1]).

One introduces the functional $L(v, w)$ of (4.6) and the set $\mathcal{K}$ of (4.7).

One verifies that if $u$ is a strong solution then (4.8) holds; in other words, $\{u, u\}$ is a saddle point of $L(v, w)$ on $\mathcal{K} \times \mathcal{K}$.

Conversely, let $\{u, \hat{u}\}$ be a saddle point of $L(v, w)$ on $\mathcal{K} \times \mathcal{K}$. Then

(4.9) $L(u, w) \le L(u, \hat{u}) \le L(v, \hat{u}) \quad \forall\, v, w \in \mathcal{K}$,

from which one deduces (observing that $L(u, u) = L(\hat{u}, \hat{u}) = 0$) that $L(u, \hat{u}) = 0$. Then $L(v, \hat{u}) \ge 0$ and $L(w, u) = -L(u, w) \ge 0$, so that by the uniqueness in Theorem 3.1 (or 3.2) one has $\hat{u} = u$.

Hence: if $u$ is a solution, then $\{u, u\}$ is a saddle point of $L(v, w)$ on $\mathcal{K} \times \mathcal{K}$, and conversely.

From this one can deduce a method of proof of the existence of solutions. Indeed, if $K$ is bounded in $V$, then $\mathcal{K}$ is bounded in the space $W$ of functions $v \in L^2(0, T; V)$ with $\dfrac{\partial v}{\partial t} \in L^2(0, T; V')$, and the existence of a saddle point (necessarily of the form $\{u, u\}$) is a consequence of a classical result of Von Neumann. If $K$ is not bounded, one introduces

$K_R = \{ v \mid v \in K,\ \|v\| \le R \}$;

let $u_R$ be the solution of (4.10). Taking $v = 0$ in (4.10), one deduces that

(4.11) $\int_0^T \|u_R\|^2\, dt \le c$ = constant independent of $R$,

from which one deduces the existence of a weak solution $u$ and

(4.12) $u_R \to u$ in $L^2(0, T; V)$ weakly as $R \to +\infty$.
J.L.
CHAPITRE 2.-
Lions
Approximation p a r d i s c r e t i s a t i o n d e s i n e q u a t i o n s paraboliques de type I.
1 . - Approximation d ' u n c o u p l e d ' e s p a c e s C o n s t a n t e d e s t a b i l i t e . . S o i e n t V e t H deux e s p a c e s d e H i l b e r t ( I) comme a u Chap.1, N . 2. Dans l a t h e o r i e d e s a p p r o x i m a t i o n s i n t e r n e s (cf. Aubin
[I],
C6a [I] ) o n c o n s i d e r e une f a m i l l e d e s o u s e s p a c e s Vh d e V ( 2 ) (1.1)
VhC
v
avec (1.2)
d e dimension f i n i e N ( h ) , N ( h ) + =
Vh
l o r s q u e "h+O",
e t l e s Vh r e a l i s a n t une a p p r o x i m a t i o n d e V ( c f . l e s c o u r s d e S t r a n g , Glowingki d a n s c e volume). La norme d e V
-
resp. de H
-
i n d u i t s u r Vh une norme,
e t c o m e Vh e s t d e dimension f i n i e c e s deux normes s o n t Cquival e n t e s s u r Vh.
Evidemment ( 2 . 3 ) Chap. 1 e s t v a l a b l e s u r Vh, i - e .
I V ~ S C I IyI I V V E V ~ , e t
(
h
,
d'aprss l'equivalence:
il e x i s t e une c o n s t a n t e :Sqh) t e l l e q~
La c o n s t a n t e S ( h ) e s t d i t e : c o n s t a n t e d e s t a b i l i t e du c o u p l e {V,H) pour l a f a m i l l e Vh. Remarque 1 - 1
-
On a ( s a u f s i V = H ) ; ~ ( h + ) +=
(1.4)
si h
+
0.
(1) Cela s'etend aux espaces de Banach (2) h est un p s r a m e t r e scalair ou vectoriel representand soit le maillage dans la methode des -differences finies soit l a "triangulationu en simplexes - -------- ----dans la methode d_es elements (cf l e cours de Strang, c e volume et -- -- ---finis - -- Ciarlet- Raviant Ll,
-
J.L. Lions
Indeed, otherwise the norms $|\cdot|$ and $\|\cdot\|$ would be equivalent on $V$, hence $V = H$ since $V$ is dense in $H$. This remark makes the whole difference between the case of ordinary differential equations (where $V = H$) and the case of partial differential equations (where $V \subset H$ strictly).

Remark 1.2. Estimates of $S(h)$.
If $V = H^1(\Omega)$ or $H^1_0(\Omega)$ and $H = L^2(\Omega)$, then
$S(h) \le C/h$,
where $h$ is either the mesh step or the maximum side length of the triangulation (whose angles are assumed to be uniformly $\ge \theta_0 > 0$).
If $V = H^2(\Omega)$ or $H^2_0(\Omega)$ and $H = L^2(\Omega)$, then
$S(h) \le C/h^2$.
Cf. Aubin [1], Strang [1], Bramble-Schatz, Ciarlet-Raviart [1], Fix-Strang, Zlamal [1].
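The growth $S(h) \sim C/h$ of Remark 1.2 can be observed numerically. The following is only a minimal sketch under assumptions that are not in the text: $\Omega = (0,1)$, $V = H^1_0(\Omega)$ with the full $H^1$ norm, $H = L^2(\Omega)$, and $V_h$ spanned by piecewise linear hat functions on a uniform mesh. The function name `stability_constant` is hypothetical. $S(h)^2$ is then the largest generalized eigenvalue of the pair (stiffness + mass, mass).

```python
import numpy as np

def stability_constant(n_cells):
    """Return (h, S(h)) for P1 elements on a uniform mesh of (0, 1).

    Assumed setting (illustration only): V = H^1_0 with the full H^1 norm,
    H = L^2, so S(h)^2 = max over V_h of ||v||^2 / |v|^2.
    """
    h = 1.0 / n_cells
    m = n_cells - 1                      # number of interior nodes
    main = np.ones(m)
    off = np.ones(m - 1)
    # Stiffness matrix (gradient part) and consistent mass matrix of hat functions.
    K = (2.0 * np.diag(main) - np.diag(off, 1) - np.diag(off, -1)) / h
    M = h * (4.0 * np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / 6.0
    # Reduce (K + M) x = lambda M x to a symmetric problem via Cholesky M = L L^T.
    L = np.linalg.cholesky(M)
    Linv = np.linalg.inv(L)
    C = Linv @ (K + M) @ Linv.T
    return h, float(np.sqrt(np.linalg.eigvalsh(C).max()))

if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        h, s = stability_constant(n)
        print(f"h = {h:.5f}  S(h) = {s:9.3f}  h*S(h) = {h * s:.4f}")
```

In this setting the product $h\,S(h)$ stays bounded as the mesh is refined, which is the behaviour stated in Remark 1.2; the value of the constant depends on the assumed element and norm.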
Remark 1.3. To simplify the exposition, we shall use only internal approximations. But, apart from notational difficulties, everything that follows extends to external approximations.
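2. Approximation schemes for parabolic inequalities of type I.

Notation. One introduces a discretization step $\Delta t$ in $t$ (1). One denotes by $u_h^n$, or more simply $u^n$, the approximation of $u$ at time $n\Delta t$.

(1) What follows can be extended to the case of variable steps $\Delta t_n$.

This presupposes the choice of:
a) an approximation $V_h$ of $V$;
b) an approximation $K_h$ of $K$ and $j_h$ of $j$ (cf. the courses of Glowinski and Mosco in this volume).

Let us begin with the approximation of inequality (2.9), Chap. 1. If one replaces $\partial u(t)/\partial t$ by $(u^{n+1}-u^n)/\Delta t$, and if one replaces $v - u(t)$ in (2.9) by $v - u^*$, with $u^*$ to be chosen, one obtains:
(2.1) $\left(\dfrac{u^{n+1}-u^n}{\Delta t},\, v-u^*\right) + a(u^{**},\, v-u^*) \ge (f^n,\, v-u^*) \quad \forall v \in K_h$,
where $u^{**}$ is also to be chosen, and where $u^{n+1} \in K_h$.

Let us reason by formal analogy with the case of equations (which would correspond to the case $K_h = V_h$ in (2.1)). The classical schemes are then:
(2.2) $\left(\dfrac{u^{n+1}-u^n}{\Delta t},\, v\right) + a(u^n, v) = (f^n, v)$, "explicit" scheme (2),
(2.3) $\left(\dfrac{u^{n+1}-u^n}{\Delta t},\, v\right) + a(u^{n+1}, v) = (f^n, v)$, implicit scheme,
and more generally, if one introduces
(2.4) $u^{n+\theta} = \theta u^{n+1} + (1-\theta) u^n$,
(2.5) $\left(\dfrac{u^{n+1}-u^n}{\Delta t},\, v\right) + a(u^{n+\theta}, v) = (f^n, v)$.
The case $\theta = 1/2$ corresponds to the Crank-Nicolson scheme.

(2) Although, in the case of finite elements, one has to solve a non-diagonal system.

This reasonably leads to taking $u^{**} = u^{n+\theta}$ in (2.1), whence:
(2.6) $\left(\dfrac{u^{n+1}-u^n}{\Delta t},\, v-u^*\right) + a(u^{n+\theta},\, v-u^*) \ge (f^n,\, v-u^*) \quad \forall v \in K_h$,
still with $u^{n+1} \in K_h$. It remains to choose $u^*$ so that
(i) the scheme makes sense;
(ii) one can analyze the stability and the convergence of the scheme.

Concerning point (i), note that one cannot take $u^* = u^n$, since the corresponding inequality (2.6) does not define $u^{n+1}$ uniquely. There then remain two possibilities: $u^* = u^{n+1}$ and, more generally, $u^* = u^{n+\sigma}$ belonging to $K_h$ for $0 < \sigma \le 1$.

We shall restrict ourselves in what follows to the case $\sigma = 1$, whence the scheme
(2.7) $\left(\dfrac{u^{n+1}-u^n}{\Delta t},\, v-u^{n+1}\right) + a(u^{n+\theta},\, v-u^{n+1}) \ge (f^n,\, v-u^{n+1}) \quad \forall v \in K_h$,
with (2.8) and
(2.9) $u^0$ = approximation in $V_h$ of $u_0$.

The inequality (2.7) is equivalent to
(2.10) $(u^{n+1},\, v-u^{n+1}) + \theta\Delta t\, a(u^{n+1},\, v-u^{n+1}) \ge (u^n + \Delta t f^n,\, v-u^{n+1}) - (1-\theta)\Delta t\, a(u^n,\, v-u^{n+1}) \quad \forall v \in K_h$;
for $\Delta t$ small enough one has
$(v,v) + \theta\Delta t\, a(v,v) \ge c\,|v|^2$, $c > 0$, on $V_h$ (1),
so that (2.10) defines $u^{n+1}$ uniquely in $K_h$.

(1) This holds even without hypothesis (2.4), Chap. 1; that hypothesis will come into play in the analysis of stability and convergence.

Remark 2.1. If $\theta = 0$, (2.10) reduces to
(2.11) $(u^{n+1},\, v-u^{n+1}) \ge (u^n + \Delta t f^n - \Delta t A_h u^n,\, v-u^{n+1}) \quad \forall v \in K_h$,
where
$a(u,v) = (A_h u, v)$, $A_h$ an operator from $V_h \to V_h$.
Let us introduce $\tilde u^{n+1}$ by
(2.12) $\tilde u^{n+1} = u^n + \Delta t f^n - \Delta t A_h u^n$.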
Then (2.11) is equivalent to
(2.13) $u^{n+1} = P_{K_h}\,(u^n + \Delta t f^n - \Delta t A_h u^n)$,
where $P_{K_h}$ = projector from $V_h \to K_h$ for the norm $|\cdot|$.
One says that the scheme corresponding to $\theta = 0$ is semi-explicit (1). For $\theta = 1$ one obtains an implicit scheme. These schemes were introduced in Lions [3].

Remark 2.2. A variant of (2.12), (2.13) (Trémolières [1]) is the scheme (2.14). It is a "semi-implicit" scheme.

Remark 2.3. If one takes $u^* = u^{n+\theta}$ in (2.6), then (2.6) is equivalent to
$\left(\dfrac{u^{n+\theta}-u^n}{\theta\Delta t},\, v-u^{n+\theta}\right) + a(u^{n+\theta},\, v-u^{n+\theta}) \ge (f^n,\, v-u^{n+\theta}) \quad \forall v \in K_h$.
Note that then $u^n \in K_h$.

Remark 2.4. Everything done in this Chapter adapts at once to inequality (2.11), Chap. 1.

(1) For the optimists! The pessimists prefer "semi-implicit".
Remark 2.5. For the solution of (2.10) one will use the methods indicated in the courses of Glowinski and Mosco. One will moreover use the fact that, in all probability, $u^{n+1}$ is "close" to $u^n$.
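The semi-explicit scheme of Remark 2.1 ($\theta = 0$) is an explicit step followed by a projection onto $K_h$. The sketch below illustrates this under assumptions that are not in the text: $\Omega = (0,1)$, finite differences with the discrete $L^2$ inner product, $A_h$ the standard three-point Laplacian with zero boundary values, $K_h = \{v \mid v_i \ge 0\}$ (so that the projection for $|\cdot|$ is a componentwise maximum), $f = -1$ and $u_0 = \sin \pi x$. The function name and its parameters are hypothetical.

```python
import numpy as np

def semi_explicit_obstacle(n_cells=20, T=1.2, cfl=0.4):
    """Semi-explicit scheme u^{n+1} = P_K(u^n + dt*f^n - dt*A_h u^n).

    Illustrative setting (an assumption, not taken from the source):
    1-D finite differences, K_h = {v >= 0}, f = -1, u_0 = sin(pi x).
    """
    h = 1.0 / n_cells
    dt = cfl * h * h                      # dt * S(h)^2 kept bounded
    x = np.linspace(0.0, 1.0, n_cells + 1)[1:-1]   # interior nodes
    u = np.sin(np.pi * x)
    f = -np.ones_like(x)
    n_steps = int(round(T / dt))
    for _ in range(n_steps):
        padded = np.concatenate(([0.0], u, [0.0]))  # homogeneous Dirichlet values
        Au = (2.0 * u - padded[:-2] - padded[2:]) / h**2
        u_tilde = u + dt * (f - Au)       # explicit step without constraint
        u = np.maximum(0.0, u_tilde)      # projection onto K_h
    return x, u

if __name__ == "__main__":
    x, u = semi_explicit_obstacle(T=0.01)
    print("max of u at t = 0.01:", u.max())
```

With this choice of data the constraint becomes active after a finite time and the discrete solution then remains at zero. The step restriction `cfl <= 0.5` plays the role of a stability condition of the type $S(h)^2 \Delta t \le$ constant.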
Orientation. We are now going to study stability, then convergence.

3. Stability analysis.

One performs a translation by $u_0$, which amounts to supposing that
(3.1) $u_0 = 0$.
To simplify the exposition a little, one will suppose that
(3.2) $0 \in K$.

We shall prove:

Theorem 3.1. Suppose that (3.1) and (3.2) hold. Consider the scheme (2.7), (2.8), (2.9) (with $u^0 = 0$), and suppose that $N\Delta t = T$ is fixed. Then one has
(3.3) $|u^n| \le C$ $(0 \le n \le N)$,
together with the estimate (3.4), where the $C$ denote constants independent of $h$ and $\Delta t$, and this:
(3.5) whatever $\Delta t$ when $\theta = 1$,
(3.6) if the stability condition $(1-\theta)\,S(h)^2\,\Delta t \le c_0$ ($c_0$ a suitable constant) holds (1), when $\theta < 1$.

(1) $S(h)$ is the stability constant introduced in (1.3). Note that if $\theta = 1$ the condition (3.6) disappears and one recovers (3.5). The proof that follows gives an estimate of the constant.

Remark 3.1. There is a difference with the case of equations, where stability conditions appear only if $\theta < 1/2$. But we do not know whether (3.6) is the best possible result. The difficulty comes from the fact that one cannot, in general, take $v = u^{n+1} - u^{n+\theta}$ in (2.7).

Proof. One takes $v = 0$ in (2.7). It follows that
(3.7) $\left(\dfrac{u^{n+1}-u^n}{\Delta t},\, u^{n+1}\right) + a(u^{n+\theta},\, u^{n+1}) \le (f^n,\, u^{n+1})$.
But
$a(u^{n+\theta}, u^{n+1}) = \theta\, a(u^{n+1}) + (1-\theta)\, a(u^n) + (1-\theta)\, a(u^n,\, u^{n+1}-u^n)$,
$(f^n, u^{n+1}) = (f^n, u^{n+\theta}) + (f^n,\, u^{n+1}-u^{n+\theta}) = (1-\theta)(f^n, u^n) + \theta (f^n, u^{n+1}) + (1-\theta)(f^n,\, u^{n+1}-u^n)$,
so that (3.7) gives
(3.8) $\dfrac{1}{2\Delta t}\left(|u^{n+1}|^2 - |u^n|^2 + |u^{n+1}-u^n|^2\right) + \theta\, a(u^{n+1}) + (1-\theta)\, a(u^n) \le (1-\theta)(f^n, u^n) + \theta (f^n, u^{n+1}) + (1-\theta)(f^n,\, u^{n+1}-u^n) - (1-\theta)\, a(u^n,\, u^{n+1}-u^n)$.
With the bounds
$(f^n, u^{n+1}) \le \|f^n\|_*\, \|u^{n+1}\|$ (1),
$(f^n,\, u^{n+1}-u^n) \le \|f^n\|_*\, \|u^{n+1}-u^n\| \le \|f^n\|_*\, S(h)\, |u^{n+1}-u^n|$,
$|a(u^n,\, u^{n+1}-u^n)| \le M\, \|u^n\|\, \|u^{n+1}-u^n\|$ (2) $\le M\, S(h)\, \|u^n\|\, |u^{n+1}-u^n|$,
each product being split with an arbitrary $\varepsilon > 0$, one obtains an inequality (3.9) in which the coefficient of $|u^{n+1}-u^n|^2$ involves $\Delta t\, S(h)^2 (1-\theta)$. One then supposes this coefficient positive (a hypothesis that is void if $\theta = 1$); cf. (3.5) and (3.6). Then (3.8) gives an inequality
(3.10)
whose left-hand side is $|u^{n+1}|^2 - |u^n|^2 + \beta\, |u^{n+1}-u^n|^2 + \alpha \Delta t \left((1-\theta)\|u^n\|^2 + \theta \|u^{n+1}\|^2\right)$.

(1) $\|\cdot\|_*$ = norm in $V'$ dual to $\|\cdot\|$.
(2) $M$ is the continuity constant of $a(u,v)$.

One deduces by summation the estimate (3.11), valid when $n \le N$. From (3.11) one deduces (3.3) and (3.4), and one obtains moreover the estimate (3.12).

Remark 3.2. For the analysis of schemes of the preceding type and of other schemes, cf. R. Trémolières [1].
4. Study of convergence.

Let us define:
(4.1) $u_{\Delta t}(t) = u^n$ in the interval $[n\Delta t, (n+1)\Delta t[$.

We shall prove:

Theorem 4.1. We place ourselves in the conditions of Theorem 3.1. Then, as $h \to 0$ and $\Delta t \to 0$,
(4.2) without supplementary condition if $\theta = 1$,
and with
(4.3) $S(h)^2\, \Delta t \to 0$ if $0 \le \theta < 1$,
one has
(4.4) $u_{\Delta t} \to u$ weakly in $L^2(0,T;V)$,
where $u$ is the "weak" solution of (3.8), Chap. 1.

Proof. Let $v$ be a regular function (for example $C^1$) from $[0,T] \to K$ with $v(0) = 0$, and let $v^n = v_h^n$ = approximation in $K_h$ of $v(n\Delta t)$. One takes $v = v^{n+1}$ in (2.7); this gives (4.5), from which one deduces (4.6) and (4.7) by rewriting the first member of (4.6). By the estimates of Theorem 3.1 one may therefore suppose (by extraction) that
(4.9) $u_{\Delta t} \to w$ weakly in $L^2(0,T;V)$.
Then (4.7) gives
(4.10) $\int_0^T \left[(v', v-w) + a(w,v) - (f, v-w)\right] dt \ge \liminf \sum_{n=0}^{N} \Delta t\, a(u^{n+\theta}, u^{n+1})$.
But since $u^{n+1} = u^{n+\theta} + (1-\theta)(u^{n+1}-u^n)$, the second member of (4.10) equals
(4.11) $\liminf \left[ \sum_{n=0}^{N} \Delta t\, a(u^{n+\theta}) + (1-\theta) \sum_{n=0}^{N} \Delta t\, a(u^{n+\theta},\, u^{n+1}-u^n) \right]$.
Let us admit for a moment that under the condition (3.6) the relation (4.12) holds (1). Then (4.11) gives
$\liminf \sum_{n=0}^{N} \Delta t\, a(u^{n+\theta}) \ge \int_0^T a(w)\, dt$,
which together with (4.10) shows that
$\int_0^T \left[(v', v-w) + a(w, v-w) - (f, v-w)\right] dt \ge 0$
for every regular function $v$ from $[0,T] \to K$ with $v(0) = 0$, whence, by extension by continuity, for all $v$ with
$v \in L^2(0,T;V)$, $v' \in L^2(0,T;V')$, $v(0) = 0$, $v(t) \in K$.
Hence $w = u$ and the theorem is proved, subject to the verification of (4.12).

(1) The case $\theta = 1$ is immediate.

Remark 4.1. The preceding proof shows the existence of a weak solution of the problem (cf. Chap. 1, No. 4.4).
CHAPTER 3. Parabolic evolution inequalities of type II.

1. Examples.

Example 1.1. The theory of servo-control (cf. Duvaut-Lions [1]) leads to problems of the following type; the notation is that of Chap. 1, No. 1. One seeks a function $u = u(x,t)$ solution of
(1.1) $\dfrac{\partial u}{\partial t} - \Delta u = f$ in $\Omega \times\, ]0,T[$,
with the initial condition (1.2) and the boundary conditions (1.3).

Remark 1.1. Equation (1.1) and the initial condition (1.2) are identical to those of Chap. 1, No. 1. By contrast, the (nonlinear) boundary conditions in (1.3) are different from conditions (1.3), Chap. 1.

Remark 1.2 (analogous to Remark 1.2, Chap. 1). From the third condition (1.3), one sees that $\partial u/\partial t = 0$ on a part $\Sigma_0$ of $\Sigma$ and $\partial u/\partial n = 0$ on $\Sigma - \Sigma_0$. But $\Sigma_0$ is not given a priori.

Example 1.2. A variant of Example 1.1 is: one seeks $u$ with
(1.4) $\dfrac{\partial u}{\partial t} \ge 0$, $\dfrac{\partial u}{\partial t} - \Delta u - f \ge 0$, $\dfrac{\partial u}{\partial t}\left(\dfrac{\partial u}{\partial t} - \Delta u - f\right) = 0$ in $\Omega \times\, ]0,T[$,
with the initial condition (1.2) and the boundary condition
(1.5) $u = 0$ on $\Sigma$.

Example 1.3. One seeks $u$ satisfying (1.1), (1.2) and a further condition whose displayed statement is missing here.
2. General formulation.

The notation is that of Chap. 1, No. 2. One seeks a function $t \to u(t)$ from $[0,T] \to V$ such that
(2.1) $\dfrac{\partial u(t)}{\partial t} = u'(t) \in K$,
(2.2) $(u'(t),\, v-u'(t)) + a(u(t),\, v-u'(t)) \ge (f(t),\, v-u'(t)) \quad \forall v \in K$,
(2.3) $u(0) = u_0$.

Variant. One seeks $u$ such that
(2.4) $(u'(t),\, v-u'(t)) + a(u(t),\, v-u'(t)) + j(v) - j(u'(t)) \ge (f(t),\, v-u'(t)) \quad \forall v \in V$,
with (2.3).

The inequalities (2.2) or (2.4) will be called parabolic inequalities of type II.

Remark 2.1. If $K = V$ or if $j = 0$, (2.2) and (2.4) reduce to the ordinary equation (2.12), Chap. 1. The distinction between types I and II has a meaning only for inequalities.

Remark 2.2. For inequalities, those of type II are effectively distinct from type I! Take indeed the situation of Remark 3.2, Chap. 1. Then (2.2) is equivalent to
$u' \ge 0$, $u' - f \ge 0$, $u'(u' - f) = 0$,
hence
(2.5) $u' = f^+$.
If $f = -1$ one finds $u' = 0$, which gives $u(t) = u_0$, a solution different from that of Remark 3.2, Chap. 1. Note moreover that (2.5) shows that there is in general a regularity threshold.

Remark 2.3. For the solution of inequalities of type II we shall have to suppose $a$ symmetric (it suffices that the "principal part" of $a$ be symmetric; it is probable, but not proved, that a condition of this kind is necessary for problems of type II to be well posed).

Remark 2.4. The formulation (2.2) suggests numerous variants. Here are some:
1) delayed servo-control (cf. Duvaut-Lions [1]); let $\tau > 0$ be given; one seeks $u(t)$ solution of the inequality (2.6);
2) a variant of delayed servo-control (cf. D. Viaud [1]); let $\tau > 0$ be given; one defines the difference quotient
(2.7) $\dfrac{u(t) - u(n\tau)}{t - n\tau}$ for $t \in [n\tau, (n+1)\tau[$;
one seeks $u(t)$ solution of the corresponding inequality;
3) note that, using $A \in \mathcal{L}(V;V')$ defined by $a(u,v) = (Au, v)$, the inequality (2.2) is written
$(u' + Au - f,\, v - u') \ge 0 \quad \forall v \in K$;
one can introduce the fractional powers $A^\theta$ of $A$ and consider the inequality
(2.11) $(u' + Au - f,\, v - A^\theta u') \ge 0 \quad \forall v \in K$,
with
(2.12) $A^\theta u'(t) \in K$.

Remark 2.5. Everything that was said in the preceding Chapters and everything that will be done below remains valid on replacing $a(u,v)$ by $a(t;u,v)$ depending suitably on $t$.
Remark 2.6. Examples. Take first
$V = H^1(\Omega)$, $H = L^2(\Omega)$, $K = \{v \mid v \ge 0 \text{ on } \Gamma\}$.
Then (2.1), (2.2), (2.3) are equivalent to the problem of Example 1.1. If one takes
$K = \{v \mid v \ge 0 \text{ in } \Omega,\ v \in H^1_0(\Omega)\}$,
one finds the problem of Example 1.2. If one takes in (2.4) a suitable functional $j$, with $V$, $H$, $a$ as above, one finds the problem of Example 1.3.

Concerning the problem (2.2), (2.3) (or (2.4), (2.3)) one has the following result:

Theorem 2.1. One supposes that the hypotheses (2.13) on $a$ hold, that $f$ is given satisfying (2.14), and that $u_0$ is given (1) satisfying (2.15), (2.16). There then exists one and only one function $u$, solution of (2.2), (2.3) (or (2.4), (2.3)), satisfying (2.17) and
(2.18) $u'' \in L^2(0,T;H)$.

(1) One may, more generally, suppose that there exists $v_0 \in K$ satisfying a compatibility condition with $u_0$ and $f(0)$.

Uniqueness is immediate. For the proof of existence there are methods analogous to those mentioned in Chap. 1, No. 4. We shall develop only, in the two sections that follow, the discretization method.
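3. Approximation schemes.

Notation. The notation is that of Chap. 2. If $y^n$ is a sequence ($n = 0, 1, \ldots$) one sets
(3.1) $\delta y^n = \dfrac{y^{n+1} - y^n}{\Delta t}$.

One then considers the following scheme:
(3.2) $(\delta u^n + A u^{n+\theta} - f^n,\, v - \delta u^n) \ge 0 \quad \forall v \in K_h$, $\delta u^n \in K_h$.
Since $u^{n+\theta} = u^n + \theta \Delta t\, \delta u^n$, this is equivalent to
(3.3) $(\delta u^n,\, v - \delta u^n) + \theta \Delta t\, a(\delta u^n,\, v - \delta u^n) \ge (f^n,\, v - \delta u^n) - a(u^n,\, v - \delta u^n) \quad \forall v \in K_h$,
which shows that, as soon as $\Delta t$ is small enough, $\delta u^n$ is defined uniquely by (3.3). Then $u^{n+1}$ is computed by $u^{n+1} = u^n + \Delta t\, \delta u^n$. One starts the computation with
(3.4) $\delta u^0 = 0$
and
(3.5) $u^0$ = approximation of $u_0$.
This therefore defines the sequence $u^n$ step by step. If $\theta = 1$ the scheme is called "implicit". In fact (cf. (3.3)) it is implicit for every $\theta > 0$ and "semi-explicit" if $\theta = 0$.

Remark 3.1. The scheme analogous to (3.2) for the inequality (2.4) is written:
(3.6) $(\delta u^n + A u^{n+\theta} - f^n,\, v - \delta u^n) + j(v) - j(\delta u^n) \ge 0 \quad \forall v \in V_h$.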
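The scheme (3.2) can be followed by hand in the scalar situation of Remark 2.2. The sketch below is only an illustration under stated assumptions: $V = H = \mathbb{R}$, $K = \{v \ge 0\}$, and $a(u,v) = a\,u\,v$ with a scalar $a \ge 0$ (the parameter `a` is an addition for illustration; Remark 2.2 corresponds to `a = 0`). In that case the inequality for $\delta u^n$ has the closed-form solution $\delta u^n = \max\big(0, (f^n - a u^n)/(1 + \theta \Delta t\, a)\big)$. The function name `type2_scalar` is hypothetical.

```python
def type2_scalar(f, u0, T=1.0, n_steps=100, theta=1.0, a=0.0):
    """Scalar instance of scheme (3.2) with K = [0, +inf).

    f      : callable t -> f(t)
    a      : scalar coefficient of the bilinear form (illustrative assumption)
    Returns the approximation of u(T).
    """
    dt = T / n_steps
    u = u0
    for n in range(n_steps):
        # du >= 0 must satisfy ((1 + theta*dt*a)*du - (f^n - a*u)) * (v - du) >= 0
        # for all v >= 0, i.e. du is the projection onto K of the quotient below.
        du = max(0.0, (f(n * dt) - a * u) / (1.0 + theta * dt * a))
        u = u + dt * du                  # u^{n+1} = u^n + dt * delta u^n
    return u

if __name__ == "__main__":
    print(type2_scalar(lambda t: -1.0, 2.0))   # f = -1: u stays at u0
    print(type2_scalar(lambda t: 1.0, 2.0))    # f = +1: u grows like u0 + t
```

With `a = 0` the scheme reproduces $u' = f^+$ of (2.5): for $f = -1$ the discrete solution stays at $u_0$, as stated in Remark 2.2.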
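4. Stability and convergence (cf. D. Viaud [1]).

Let us write the analogue of (3.2) for $n+1$:
(4.1) $(\delta u^{n+1} + A u^{n+1+\theta} - f^{n+1},\, v - \delta u^{n+1}) \ge 0 \quad \forall v \in K_h$.
Take $v = \delta u^{n+1}$ in (3.2) (resp. $v = \delta u^n$ in (4.1)) and add:
(4.2) $|\delta u^{n+1} - \delta u^n|^2 + a(u^{n+1+\theta} - u^{n+\theta},\, \delta u^{n+1} - \delta u^n) \le (f^{n+1} - f^n,\, \delta u^{n+1} - \delta u^n)$.

(1) This is justified starting from (2.16).

One therefore deduces from (4.2), after division by $\Delta t$, that
$\Delta t\, |\delta^2 u^n|^2 + a(\delta u^{n+\theta},\, \delta u^{n+1} - \delta u^n) \le \dfrac{\Delta t}{2} |\delta^2 u^n|^2 + \dfrac{\Delta t}{2} |\delta f^n|^2$,
hence
(4.3) $\dfrac{\Delta t}{2} |\delta^2 u^n|^2 + a(\delta u^{n+\theta},\, \delta u^{n+1} - \delta u^n) \le \dfrac{\Delta t}{2} |\delta f^n|^2$.

Theorem 4.1. We place ourselves in the hypotheses of Theorem 2.1. One then has
(4.4) $\|u^n\|$, $\|\delta u^n\| \le$ constant independent of $\Delta t$ and of $h$,
together with a second estimate (4.5), as $h$, $\Delta t \to 0$, and this
(4.6) without condition if $1/2 \le \theta \le 1$,
and
(4.7) under a stability condition of the form $S(h)^2 \Delta t \le$ constant if $0 \le \theta < 1/2$.

Proof (1). One checks (using the symmetry of $a(u,v)$) that
$2\, a(\delta u^{n+\theta},\, \delta u^{n+1} - \delta u^n) = a(\delta u^{n+1}) - a(\delta u^n) + (2\theta - 1)\, a(\delta u^{n+1} - \delta u^n)$.
Hence (4.3) gives:
(4.8) $\Delta t\, |\delta^2 u^n|^2 + a(\delta u^{n+1}) - a(\delta u^n) + (2\theta - 1)\, a(\delta u^{n+1} - \delta u^n) \le \Delta t\, |\delta f^n|^2$.

(1) One will suppose that $a(v,v) \ge \alpha \|v\|^2$.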
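Let us distinguish two cases.

1st case: $2\theta - 1 \ge 0$. One sums (4.8); this gives (4.9), from which one deduces (4.4), (4.5) and moreover the estimate (4.10).

2nd case: $0 \le \theta < 1/2$. One notes that
$a(\delta u^{n+1} - \delta u^n) \le M\, S(h)^2\, |\delta u^{n+1} - \delta u^n|^2 = M\, S(h)^2\, \Delta t^2\, |\delta^2 u^n|^2$,
hence
$(2\theta - 1)\, a(\delta u^{n+1} - \delta u^n) \ge -C (1 - 2\theta)\, S(h)^2\, \Delta t^2\, |\delta^2 u^n|^2$,
and (4.8) implies that
(4.11) $\Delta t \left[1 - C(1 - 2\theta)\, S(h)^2\, \Delta t\right] |\delta^2 u^n|^2 + a(\delta u^{n+1}) - a(\delta u^n) \le \Delta t\, |\delta f^n|^2$.
If therefore one supposes that
(4.12) $1 - C(1 - 2\theta)\, S(h)^2\, \Delta t \ge \gamma > 0$,
one deduces from (4.11) that
(4.13) $\gamma\, \Delta t\, |\delta^2 u^n|^2 + a(\delta u^{n+1}) - a(\delta u^n) \le \Delta t\, |\delta f^n|^2$,
from which one deduces (4.4) and (4.5).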
Let us now examine convergence.

Theorem 4.2. We place ourselves in the hypotheses of Theorem 4.1. One defines $u_{\Delta t}$, $\delta u_{\Delta t}$, $\delta^2 u_{\Delta t}$ as the functions equal to $u^n$, $\delta u^n$, $\delta^2 u^n$ on $[n\Delta t, (n+1)\Delta t[$. One has, as $h$, $\Delta t \to 0$:
(4.14) $u_{\Delta t},\ \delta u_{\Delta t} \to u,\ u'$ in $L^\infty(0,T;V)$ weak star,
(4.15) $\delta^2 u_{\Delta t} \to u''$ weakly in $L^2(0,T;H)$,
and this without supplementary condition if $1/2 \le \theta \le 1$, and under the condition
(4.16) $S(h)^2\, \Delta t \to 0$
if $0 \le \theta < 1/2$.

Proof. Let $v$ be a regular function from $[0,T] \to K$ and let
$v^n = v_h^n$ = approximation in $K_h$ of $v(n\Delta t)$.
We take $v = v^n$ in (3.2) and we sum in $n$, after multiplication by $\Delta t$; this gives (4.17), in which the term $\sum_n \Delta t\, a(u^{n+\theta}, \delta u^n)$ is split into two parts $X$ and $Y$ (relations (4.18) to (4.20)).

By the estimates (4.4) and (4.5), $u_{\Delta t}$, $\delta u_{\Delta t}$ remain in a bounded set of $L^\infty(0,T;V)$ and $\delta^2 u_{\Delta t}$ in a bounded set of $L^2(0,T;H)$, so that one can extract a subsequence, still denoted $u_{\Delta t}$, such that
$u_{\Delta t} \to w$ in $L^\infty(0,T;V)$ weak star,
$\delta u_{\Delta t} \to w_1$ in $L^\infty(0,T;V)$ weak star,
$\delta^2 u_{\Delta t} \to w_2$ weakly in $L^2(0,T;H)$.
One checks without difficulty that
$w_1 = w'$, $w_2 = w''$,
so that $u^{N+1} \to w(T)$.

Let us then distinguish two cases.

1st case: $1/2 \le \theta \le 1$. One then has $Y \ge 0$, hence $X \ge \tfrac12 a(u^{N+1}) - \tfrac12 a(u^0)$, whence
(4.21) $\liminf X \ge \tfrac12 a(w(T)) - \tfrac12 a(u_0)$.
Using (4.21) in (4.17) one deduces (4.22), which is equivalent to
(4.23) $\int_0^T \left[(w', v-w') + a(w, v-w') - (f, v-w')\right] dt \ge 0$ for every regular $v$ from $[0,T] \to K$,
and hence, by extension by continuity, for every $v \in L^2(0,T;V)$ with $v(t) \in K$ a.e. One deduces from this that $w$ is a solution of the problem.

2nd case: $0 \le \theta < 1/2$. This time $Y$ is $\le 0$, but the final result remains valid if one checks that $Y \to 0$; now (4.4) implies in particular $|\delta u^n| \le C$, and the bound obtained for $Y$ contains the factor $\Delta t\, S(h)^2$, which $\to 0$ by hypothesis (4.16).

Remark 4.1. One has analogous results concerning (3.6).

Remark 4.2. One has proved, by this method, the existence of a solution of the problem.
CHAPTER 4. Evolution inequalities of the 2nd order in t.

1. Examples.

Numerous problems of Mechanics and Physics (cf. Duvaut-Lions [1]) (1) lead to problems of the following type:

Example 1.1. With the notation of Chap. 1, No. 1, one seeks a function $u$ solution of the equation (1.1) in $\Omega \times\, ]0,T[$, with the initial conditions (1.2) and the boundary conditions (1.3).

Remark 1.1. Note that the boundary conditions (1.3) are identical to those of Example 1.1, Chap. 3; one can therefore make observations analogous to those of Remark 1.2, Chap. 3.

Example 1.2. One seeks $u$ solution of (1.1), (1.2) with the boundary conditions (1.4) on $\Sigma$.

Remark 1.2. Problems of this type also arise for the system of elasticity; cf. Duvaut-Lions, loc. cit.

(1) For example for the Maxwell operators. Cf. Duvaut-Lions, loc. cit.

Remark 1.3. The operator (1.1) is hyperbolic. But problems of this kind also arise for operators that are not hyperbolic. For example: find $u$ solution of
(1.5) $\dfrac{\partial^2 u}{\partial t^2} + \Delta^2 u = f$ in $\Omega \times\, ]0,T[$,
with the initial conditions (1.2) and the boundary conditions on $\Sigma$
(1.6) $\Delta u = 0$,
(1.7) $\dfrac{\partial u}{\partial t} \ge 0$, a sign condition on $\dfrac{\partial}{\partial n} \Delta u$, and $\dfrac{\partial u}{\partial t}\, \dfrac{\partial}{\partial n} \Delta u = 0$ on $\Sigma$.

2. General formulation.

The notation is that of Chap. 1, No. 2. One seeks a function $t \to u(t)$ from $[0,T] \to V$ such that
(2.1) $u'(t) \in K$,
(2.2) $(u''(t),\, v-u'(t)) + a(u(t),\, v-u'(t)) \ge (f(t),\, v-u'(t)) \quad \forall v \in K$,
(2.3) $u(0) = u_0$, $u'(0) = u_1$.

Variant. One seeks $u$ such that
(2.4) $(u''(t),\, v-u'(t)) + a(u(t),\, v-u'(t)) + j(v) - j(u'(t)) \ge (f(t),\, v-u'(t)) \quad \forall v \in V$,
and the initial conditions (2.3).

Remark 2.1. If $K = V$ or if $j = 0$, one recovers the equations of the 2nd order in $t$:
$\forall v \in V$, with (2.3).

Examples.

Example 2.1. One takes:
$V = H^1(\Omega)$, $H = L^2(\Omega)$, $K = \{v \mid v \ge 0 \text{ on } \Gamma\}$.
The problem (2.1), (2.2), (2.3) corresponds to Example 1.1.

Example 2.2. With $V$, $H$ and $a$ chosen as above, one takes a suitable functional $j$. The problem (2.4), (2.3) then corresponds to Example 1.2.

Example 2.3. One takes:
$V = \{v \mid v,\ \Delta v \in L^2(\Omega)\}$, $H = L^2(\Omega)$, $K = \{v \mid v \in V,\ v \ge 0 \text{ on } \Gamma\}$ (1).
Then the problem (2.1), (2.2), (2.3) corresponds to the Example of Remark 1.3.

(1) One can define $v \in H^{-1/2}(\Gamma)$ (cf. Lions-Magenes [1], Chap. 2), so that "$v \ge 0$" has a meaning.

One has the following result:

Theorem 2.1. One supposes that
(2.6) $a(u,v) = a(v,u) \quad \forall u, v \in V$,
and that (2.7) holds. One supposes that
(2.8) $f,\ f' \in L^2(0,T;H)$,
and that the initial data satisfy (2.9), (2.10). There exists one and only one function $u$ such that
(2.11) $u,\ u' \in L^\infty(0,T;V)$,
satisfying (2.12) and solution of (2.1), (2.2), (2.3).

Remark 2.2. One has an analogous result for the problem (2.4), (2.3).

Remark 2.3. One can extend the preceding result to the case where only the "principal part" of $a$ is symmetric.

Remark 2.4. We have defined, and given an existence result for, strong solutions. One can also define weak solutions of these problems. Cf. Lions [4] and a more general and simpler study in Brezis [1].
Remark 2.5. Regularity threshold. One again encounters here the phenomenon of "regularity threshold" already met in Chapter 1, Remark 3.2 and in Chap. 3, Remark 2.2. Take indeed $V = H = \mathbb{R}$, $a = 0$, $K = \{v \mid v \ge 0\}$. Then (2.1), (2.2), (2.3) are equivalent to
$u' \ge 0$, $u'' - f \ge 0$, $u'(u'' - f) = 0$.
Setting $u' = w$, one therefore has:
$w \ge 0$, $w' - f \ge 0$, $w(w' - f) = 0$,
which leads back to Chap. 1, Remark 3.2 (1).

(1) Naturally the general case does not reduce to the situation of Chap. 1!

Uniqueness in Theorem 2.1 is immediate. For the proof of existence, there are methods analogous to those mentioned in Chap. 1, No. 4. We shall develop only, in the two sections that follow, the discretization method.

3. Approximation schemes.

Notation.
(3.1) $d^n = \dfrac{u^{n+1} - u^{n-1}}{2\Delta t}$.
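"Semi-explicit" scheme. One starts with $u^0$ = approximation of $u_0$ and with a value $u^1$ built from the initial data. One then determines $u^{n+1}$ by:
(3.3) $\left(\dfrac{u^{n+1} - 2u^n + u^{n-1}}{\Delta t^2},\, v - d^n\right) + a(u^n,\, v - d^n) \ge (f^n,\, v - d^n) \quad \forall v \in K_h$, $d^n \in K_h$.
This does define $u^{n+1}$. Indeed, since $u^{n+1} = u^{n-1} + 2\Delta t\, d^n$, (3.3) is equivalent to
(3.4) $(d^n,\, v - d^n) \ge \left(\dfrac{u^n - u^{n-1}}{\Delta t} + \dfrac{\Delta t}{2}(f^n - A_h u^n),\, v - d^n\right) \quad \forall v \in K_h$,
which defines $d^n$ uniquely.

Remark 3.1. If one defines $\tilde u^{n+1}$ by
(3.5) $\left(\dfrac{\tilde u^{n+1} - 2u^n + u^{n-1}}{\Delta t^2},\, v\right) + a(u^n, v) = (f^n, v) \quad \forall v \in V_h$,
then (3.4) is equivalent to
$(d^n,\, v - d^n) \ge \left(\dfrac{\tilde u^{n+1} - u^{n-1}}{2\Delta t},\, v - d^n\right) \quad \forall v \in K_h$,
that is, $d^n = P_{K_h}\!\left(\dfrac{\tilde u^{n+1} - u^{n-1}}{2\Delta t}\right)$, where $P_{K_h}$ = projector onto $K_h$ for the norm $|\cdot|$.

Remark 3.2. Implicit scheme. This time one determines $u^{n+1}$ by the scheme (3.7). The scheme (3.7) is unconditionally stable.

Remark 3.3. We began by giving the scheme (3.3) because of its "symmetry". But this scheme has, from the practical point of view, a tendency to "decouple" the $u^n$ according to the parity of the index $n$, and hence to generate oscillations. Another scheme, avoiding this pitfall, is the following; setting
(3.8) $\delta^n = \dfrac{u^{n+1} - u^n}{\Delta t}$,
one considers the scheme:
(3.9) $\left(\dfrac{\delta^n - \delta^{n-1}}{\Delta t},\, v - \delta^n\right) + a(u^n,\, v - \delta^n) \ge (f^n,\, v - \delta^n) \quad \forall v \in K_h$, $\delta^n \in K_h$.
If $u^n$ is known, (3.9) defines $\delta^n$, whence $u^{n+1}$ by (3.8).

We shall now study the stability of (3.3), then (Remark 4.1 at the end) that of the scheme (3.9), the proofs being moreover very similar.

4. Stability and convergence.

We shall prove the following results:

Theorem 4.1 (Stability). We place ourselves in the conditions of Theorem 2.1. Then, if the following stability condition holds: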
Theorem 4.2 (Convergence). The hypotheses are those of Theorem 2.1. One introduces $u_{\Delta t}$, $\delta u_{\Delta t}$, $\delta^2 u_{\Delta t}$ equal to
$u^{n+1}$, $\dfrac{u^{n+1} - u^{n-1}}{2\Delta t}$, $\dfrac{u^{n+1} - 2u^n + u^{n-1}}{\Delta t^2}$
in the interval $[n\Delta t, (n+1)\Delta t[$. Then, as $h$, $\Delta t \to 0$:
(4.4) $u_{\Delta t},\ \delta u_{\Delta t} \to u,\ u'$ in $L^\infty(0,T;V)$ weak star,
(4.4') $\delta^2 u_{\Delta t} \to u''$ in $L^\infty(0,T;H)$ weak star,
where $u$ is the solution of the problem (2.1), (2.2), (2.3), (2.11), (2.12), if
(4.5) $\Delta t\, S(h) \to 0$.

Proof of Theorem 4.1 (2). Let us introduce:
(4.6) $\delta^n = \dfrac{u^{n+1} - u^n}{\Delta t}$.
Then (3.3) is written

(1) $C$ = constant for which an estimate is obtained in the proof below.
(2) To simplify the exposition a little, one supposes that $0 \in K$ and that $a(v) \ge \alpha \|v\|^2$.
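(4.7) $\left(\dfrac{\delta^n - \delta^{n-1}}{\Delta t},\, v - d^n\right) + a(u^n,\, v - d^n) \ge (f^n,\, v - d^n)$.
Taking $v = 0$ and noting that
(4.8) $d^n = \tfrac12(\delta^n + \delta^{n-1}) = \dfrac{u^{n+1} - u^{n-1}}{2\Delta t}$,
it follows that
(4.9) $\dfrac{1}{2\Delta t}\left(|\delta^n|^2 - |\delta^{n-1}|^2\right) + \dfrac{1}{2\Delta t}\, a(u^n,\, u^{n+1} - u^{n-1}) \le \dfrac{1}{2\Delta t}\,(f^n,\, u^{n+1} - u^{n-1})$;
multiplying by $2\Delta t$ and noting that $a(u,v)$ is symmetric, one deduces
(4.10) $|\delta^n|^2 - |\delta^{n-1}|^2 + a(u^n, u^{n+1}) - a(u^{n-1}, u^n) \le (f^n,\, u^{n+1} - u^{n-1})$.
By summation, one deduces that
(4.11) $|\delta^n|^2 + a(u^n, u^{n+1}) \le |\delta^0|^2 + a(u^0, u^1) + \sum_{q=1}^{n} (f^q,\, u^{q+1} - u^{q-1})$.
Bounding $a(u^n, u^{n+1})$ from below by means of the stability constant gives (4.12), so that one obtains an inequality
(4.13)
whose left-hand side contains $\left[1 - C(\Delta t\, S(h))^2\right] |\delta^n|^2$ and a multiple of $a(u^n)$, and whose right-hand side contains $C + C \sum_{q} |f^q|^2 \Delta t$.
One therefore chooses $\Delta t$ and $h$ in such a way that the condition (4.14) keeps the coefficient $1 - C(\Delta t\, S(h))^2$ positive. Then (4.13) gives, for $\Delta t$ small enough, the inequality (4.15). From (4.15) and the discrete Gronwall inequality, one deduces that
(4.16) $|\delta^n| + \|u^n\| \le C$.

We shall now obtain other a priori estimates. Let us write (4.7) for $n+1$ instead of $n$:
(4.17) $\left(\dfrac{\delta^{n+1} - \delta^n}{\Delta t},\, v - d^{n+1}\right) + a(u^{n+1},\, v - d^{n+1}) \ge (f^{n+1},\, v - d^{n+1}) \quad \forall v \in K_h$.
One takes $v = d^{n+1}$ (resp. $v = d^n$) in (4.7) (resp. (4.17)). Adding and dividing by $\Delta t^2$, one obtains (4.18), which is also written as (4.19),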
where one has set
(4.20) $g^n = \dfrac{1}{\Delta t}\,(f^{n+1} - f^n)$.
But (4.19) is equivalent to (4.9) in which one replaces $u^n$ by $\delta^n$. One therefore has the result analogous to (4.16); under the hypothesis (4.14), one has the estimate (4.21). Hence
$u_{\Delta t}$, $\delta u_{\Delta t}$ remain in a bounded set of $L^\infty(0,T;V)$,
$\delta^2 u_{\Delta t}$ remains in a bounded set of $L^\infty(0,T;H)$,
and one can therefore suppose, by extraction of a subsequence, that
(4.22) $u_{\Delta t},\ \delta u_{\Delta t} \to w,\ w'$ in $L^\infty(0,T;V)$ weak star,
(4.23) $\delta^2 u_{\Delta t} \to w''$ in $L^\infty(0,T;H)$ weak star.
One then has:
(4.24) $u^{N-1} \to w(T)$ weakly in $V$,
(4.25) $\delta^{N-1} \to w'(T)$ weakly in $V$.
Let $v = v(t)$ be a regular function from $[0,T] \to K$, and
(4.26) $v^n = v_h^n$ = approximation in $K_h$ of $v(n\Delta t)$.
Let us set:
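Take $v = v^n$ in (4.7); one deduces an inequality after multiplication by $\Delta t$, whence, after summation in $n$, the relation (4.27), which contains a remainder term (4.28).

Let us admit for a moment that
(4.29) this remainder $\to 0$ as $\Delta t \to 0$ (with (4.5)).
One then deduces from (4.27) that
(4.30) $\int_0^T \left[(w'', v) + a(w, v) - (f, v - w')\right] dt \ge \liminf \left[\tfrac12 |\delta^{N-1}|^2 + \tfrac12 a(u^{N-1})\right] - \tfrac12 |w'(0)|^2 - \tfrac12 a(w(0))$,
so that
$\int_0^T \left[(w'', v - w') + a(w, v - w') - (f, v - w')\right] dt \ge 0$
for every function $v$, for example continuous, from $[0,T] \to K$ and, by extension by continuity, for every $v \in L^2(0,T;V)$ with $v(t) \in K$ a.e. One deduces from this that $w = u$ = solution of the problem, whence the Theorem, subject to the verification of (4.29). But the remainder is bounded by an expression (4.31) containing $\Delta t\, S(h)$, whence the result by (4.5).

Remark 4.1. The results for the scheme (3.9) are entirely analogous to the preceding ones. Taking $v = 0$ in (3.9) one deduces:
(4.32) $\left(\dfrac{\delta^n - \delta^{n-1}}{\Delta t},\, \delta^n\right) + a(u^n, \delta^n) \le (f^n, \delta^n)$.
But $a(u^n, \delta^n) = a(u^{n+1}, \delta^n) - \Delta t\, a(\delta^n)$, whence, substituting into (4.32) and multiplying by $\Delta t$: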
By summation one deduces an inequality from which it follows, under a stability condition of the preceding type, that
(4.36) $|\delta^n|^2 + a(u^{n+1}) \le$ constant.
One obtains a supplementary estimate by considering (3.9) for $n+1$, namely
(4.37) $\left(\dfrac{\delta^{n+1} - \delta^n}{\Delta t},\, v - \delta^{n+1}\right) + a(u^{n+1},\, v - \delta^{n+1}) \ge (f^{n+1},\, v - \delta^{n+1})$.
Taking $v = \delta^{n+1}$ (resp. $v = \delta^n$) in (3.9) (resp. (4.37)), one obtains (4.38) by addition and after division by $\Delta t^2$. Introducing $g^n$ as in (4.20) and setting
(4.39) $\eta^n = \dfrac{\delta^{n+1} - \delta^n}{\Delta t}$,
(4.38) is written as (4.40), which is the analogue of (4.32) with $u^n$ (resp. $\delta^n$) replaced by $\delta^n$ (resp. $\eta^n$). One therefore has the analogue of (4.36), namely
(4.41) $|\eta^n|^2 + a(\delta^{n+1}) \le$ constant,
whence the results analogous to Theorems 4.1 and 4.2.
CHAPTER 5. Complements and problems.

1. Flow of two-dimensional Bingham fluids.

Let us describe a model. For the physical motivation, cf. Duvaut-Lions [1], Chap. 6. One introduces:
$V = \{v \mid v \in (H^1_0(\Omega))^2,\ \operatorname{Div} v = 0\}$,
$H = \{v \mid v \in (L^2(\Omega))^2,\ \operatorname{Div} v = 0,\ n \cdot v = 0 \text{ on } \Gamma\}$ (1),
together with the forms $a(u,v)$, $b(u,v,w)$ and $j(v)$ defined in (1.1), (1.2), (1.3).

(1) $n$ = normal to $\Gamma$ directed toward the exterior of $\Omega$.

One seeks a function $u = \{u_1, u_2\}$ (the velocity of the flow) satisfying
(1.4) $(u'(t),\, v - u(t)) + \mu\, a(u(t),\, v - u(t)) + b(u(t), u(t),\, v - u(t)) + g\, j(v) - g\, j(u(t)) \ge (f(t),\, v - u(t)) \quad \forall v \in V$
and
(1.5) $u(0) = u_0$.
In (1.4), $\mu$ is given $> 0$ and $g$ is given $\ge 0$.

Remark 1.1. If $g = 0$, (1.4) is equivalent to an equation (1.6) holding $\forall v \in V$; this is the classical system of Navier-Stokes equations.

Remark 1.2. The inequality (1.4) is a parabolic inequality, but the partial differential operator is (unlike in the preceding chapters) nonlinear (because of the term $b(u, u, v-u)$).

One proves (cf. Duvaut-Lions, loc. cit.) that the problem (1.4), (1.5) admits a unique solution satisfying (1.7). Note that, for three-dimensional flows, one shows the existence of a weak solution, the possible uniqueness in the class of weak solutions in which existence can be shown being an open problem (1).

(1) As in the case of the Navier-Stokes equations.

Approximation scheme. One introduces again
(1.8) $u^{n+\theta} = \theta u^{n+1} + (1-\theta) u^n$.
Starting from $u^0$ = approximation in $V_h$ (2) of $u_0$, one defines $u^{n+1}$ step by step by
(1.9) $\left(\dfrac{u^{n+1} - u^n}{\Delta t},\, v - u^{n+1}\right) + \mu\, a(u^{n+\theta},\, v - u^{n+1}) + b(u^{n+\theta}, u^{n+1},\, v - u^{n+1}) + g\, j(v) - g\, j(u^{n+1}) \ge (f^n,\, v - u^{n+1}) \quad \forall v \in V_h$.

(2) The construction of $V_h$ is one of the difficulties of the problem. Cf. Fortin [1].

The inequality (1.9) is equivalent to
(1.10) $\dfrac{1}{\Delta t}(u^{n+1},\, v - u^{n+1}) + \mu\theta\, a(u^{n+1},\, v - u^{n+1}) + b(\theta u^{n+1} + (1-\theta) u^n,\, u^{n+1},\, v - u^{n+1}) + g\, j(v) - g\, j(u^{n+1}) \ge \dfrac{1}{\Delta t}(u^n,\, v - u^{n+1}) + (f^n,\, v - u^{n+1}) - \mu(1-\theta)\, a(u^n,\, v - u^{n+1})$,
which one shows to admit a unique solution for $\Delta t$ small enough.

Study of stability. A priori estimates. Take $v = 0$ in (1.9). Note that $b(u^{n+\theta}, u^{n+1}, u^{n+1}) = 0$. One therefore obtains an inequality (1.11), or again
and hence (1.11) implies the inequality (1.12). Consequently, if
(1.13) $(1-\theta)^2\, S(h)^2\, \Delta t \le$ suitable constant,
one obtains the estimates (1.14).

Let us now write (1.9) for $n-1$ instead of $n$; this is (1.15). Taking $v = u^n$ (resp. $v = u^{n+1}$) in (1.9) (resp. (1.15)) and setting
(1.16) $\delta^n = \dfrac{u^{n+1} - u^n}{\Delta t}$,
one obtains (1.17) after addition and division by $\Delta t$, so that (1.17) gives (1.18) after a further division by $\Delta t$, and (1.18) gives (1.19), in which the nonlinear term is
$X = b(\theta\delta^n + (1-\theta)\delta^{n-1},\, \delta^n,\, u^n)$,
hence
(1.20) $|X| \le c\, \|\theta\delta^n + (1-\theta)\delta^{n-1}\|_{(L^4)^2}\, \|\delta^n\|\, \|u^n\|_{(L^4)^2}$.
But since the space dimension is 2, one has:
$\|\varphi\|_{(L^4)^2} \le c\, \|\varphi\|^{1/2}\, |\varphi|^{1/2} \quad \forall \varphi \in V$,
and since, by (1.14), $|u^n| \le C$, one obtains the bound (1.21). Setting
(1.22) $g^n = \dfrac{f^n - f^{n-1}}{\Delta t}$,
one deduces from (1.19), (1.21) the inequality (1.23), so that (1.23) and (1.24) imply (1.25). One deduces by summation (and multiplication by $\Delta t$) (1) the inequality (1.26).

(1) Supposing that a condition of the type (1.13) holds for $\theta < 1$.

One deduces from this, supposing that
$\Delta t \sum_{q=0}^{N} |g^q|^2 \le C$,
the inequality (1.27). From the second inequality (1.14) and the discrete Gronwall inequality, one obtains (1.28). Hence:

Theorem 1.1. One supposes that the hypotheses (1.29) hold. One then has, $u^{n+1}$ being given by the scheme (1.9), the estimates (1.30), and this without condition on $\Delta t$ if $\theta = 1$ and under the condition
(1.31) $\Delta t\, S(h)^2 \le$ suitable constant if $0 \le \theta < 1$.

One deduces from this the following convergence theorem:

Theorem 1.2. We place ourselves in the conditions of Theorem 1.1. One defines $u_{\Delta t}$, $\delta u_{\Delta t}$ as the functions equal to $u^n$, $\delta u^n$ on $[n\Delta t, (n+1)\Delta t[$. One has, as $h$, $\Delta t \to 0$:
(1.32) $u_{\Delta t},\ \delta u_{\Delta t} \to u,\ u'$ weakly in $L^2(0,T;V)$,
(1.33) $\delta u_{\Delta t} \to u'$ in $L^\infty(0,T;H)$ weak star,
where $u$ is the solution of (1.4), (1.5) belonging to $L^2(0,T;V)$ with $u' \in L^2(0,T;V) \cap L^\infty(0,T;H)$, and this without supplementary condition if $\theta = 1$ and under the condition (1.34) if $\theta < 1$.

The proof follows the same principles as above. Cf. Fortin [1].

Remark 1.3. One will find other schemes and numerical applications in Fortin [1].
2. Open problems.

We do not mention here the open problems relating to the theory of evolution inequalities (cf. Duvaut-Lions [1], Lions [5] for a certain number of open questions), but rather problems of numerical analysis connected with evolution inequalities.

2.1. Obtaining error estimates. The question is already open in the stationary case. It would be, in particular, very interesting to extend as far as possible (1) the results of Bramble-Schatz [1], Ciarlet-Raviart [1], Fix-Strang [1], Strang [1], Zlamal [1] to stationary inequalities, and then to adapt the analysis to evolution problems.

2.2. Obtaining stable (and efficient...) schemes when the convex sets $K = K(t)$ depend on $t$.

2.3. Adaptation to the case of evolution inequalities of the methods of Douglas-Du Pont [1].

(1) For high-precision schemes, one runs into the difficulty of the "regularity threshold".
BIBLIOGRAPHY

J.P. AUBIN [1] Book, to appear.
J.H. BRAMBLE and A.H. SCHATZ [1] Least square methods for 2mth order elliptic boundary value problems. To appear.
[2] Rayleigh-Ritz-Galerkin methods for Dirichlet's problem using subspaces without boundary conditions. Comm. Pure Applied Math. 23 (1970), 653-675.
H. BREZIS [1] Les inéquations variationnelles. Journal de Mathématiques. Paris 1972.
[2] Equations et inéquations non linéaires dans les espaces vectoriels en dualité. Annales Inst. Fourier, XVIII, 1968, 115-175.
CARASSO [1] Thesis, Madison, 1970.
J. CEA [1] Approximation variationnelle des problèmes aux limites. Annales Inst. Fourier, XIV (1964), 345-444.
Ph. CIARLET, P.A. RAVIART [1] Book, to appear, Dunod, 1972.
J. DOUGLAS Jr and T. DU PONT [1] To appear.
G. DUVAUT and J.L. LIONS [1] Les inéquations en Mécanique et en Physique. Dunod, Paris, 1971.
G. FIX and G. STRANG [1] On the analysis of the finite element method. To appear.
M. FORTIN [1] Thesis, Paris, 1972.
R. GLOWINSKI [1] CIME course, this volume.
R. GLOWINSKI, J.L. LIONS and R. TRÉMOLIÈRES [1] Résolution numérique des inéquations de la Mécanique et de la
J.L. LIONS [1] Quelques problèmes de la théorie des inéquations variationnelles d'évolution. CIME course "Problems in Non Linear Analysis", 1970. Ed. Cremonese, 1971.
[2] Quelques méthodes de résolution des problèmes aux limites non linéaires. Paris, Dunod-Gauthier Villars, 1969.
[3] Sur l'approximation de la solution d'inéquations d'évolution. C.R. Acad. Sc. Paris, t. 269 (1966), pp. 55-57.
[4] Sur un nouveau type de problème non linéaire pour opérateurs hyperboliques du 2ème ordre. Sém. J. Leray, Collège de France, 1965-66, vol. II, 17-33.
[5] Sur les inéquations aux dérivées partielles. Uspekhi Mat. Nauk (1971) (in Russian).
J.L. LIONS, E. MAGENES [1] Problèmes aux limites non homogènes et applications, vol. 1. Dunod, 1968.
J.L. LIONS, G. STAMPACCHIA [1] Variational Inequalities. Comm. Pure Applied Math. XX (1967), 493-519.
U. MOSCO [1] CIME course, this volume.
G. STRANG [1] CIME course, this volume.
R. TRÉMOLIÈRES [1] Thesis, Paris, to appear.
D. VIAUD [1] Thesis, Paris, to appear.
M. ZLAMAL [1] On the finite element method. Numer. Math. 12 (1968), 394-409.
CENTRO INTERNAZIONALE MATEMATICO ESTIVO (C.I.M.E.)

G. I. MARCHUK

INTRODUCTION INTO THE METHODS OF NUMERICAL ANALYSIS

Course held at Erice, June 27 to July 7, 1971
INTRODUCTION INTO THE METHODS OF NUMERICAL ANALYSIS
Introduction

Numerical mathematics, being part of mathematics, currently has at its disposal powerful techniques for solving problems of science and engineering. Large-capacity electronic computers gave rise to algorithmic constructions and mathematical experimentation over a wide area of science and engineering. This attracted new research personnel to the problems of numerical mathematics. The valuable experience gained in solving applied problems was later used to devise effective methods and algorithms in numerical mathematics.

The methods of numerical mathematics are closely related to the state of the computer art. Every new stage of computer technology essentially influences the new concepts and methods formed in numerical mathematics and its numerous applications. As a rule, as existing methods and ideas are perfected, new tendencies begin to appear in numerical mathematics which eventually become important scientific trends.

The standard of research in numerical mathematics is largely dependent on the actual connection with fundamental areas of mathematics. First of all I should like to mention functional analysis, differential equations, algebra and logic, theory of probability, calculus of variations, etc. The interchange of ideas between different branches of mathematics has intensified in the recent decade. This is true in the first place for numerical mathematics, which has used the results of fundamental mathematical areas to develop new and more sophisticated methods and to improve the old ones.

At the same time it should be emphasized that applications have an important influence on numerical mathematics. Thus, for instance, mathematical simulation has often stimulated the discovery of new approaches which are now a most valuable possession of numerical mathematics. Such applied areas as hydrodynamics, atomic physics, mathematical economics and control theory are the most important examples.

The studies of approximation, stability and convergence have provided the necessary basis for wide research into effective difference schemes applied to the problems of mathematical physics.
The algorithms of finite difference methods combine, as a rule, two aspects: the construction of a difference analog of the equation and its solution. Therefore the advance of the constructive theory of finite difference methods depends on an intercoordinated development of the two aspects mentioned above.

One such trend is connected with the systematic development of conservative difference schemes based on the conservation laws inherent in most physical phenomena. For this purpose one uses balance equations written for a separate mesh element of the domain, and then quadrature and interpolation formulas are applied. The resulting difference equations, adequately transformed and summed over all mesh points, satisfy integral conservation laws. These methods were of great importance in forming a general point of view on the construction of difference schemes for linear and quasilinear equations.

The second trend is concerned with finding efficient algorithms for multi-dimensional stationary problems of mathematical physics. As a result of the success achieved in the solution of simultaneous linear algebraic equations with Jacobi and block-tridiagonal matrices, there have emerged a few excellent algorithms in which factorization of the difference operator is used.

The early sixties were marked by a major contribution to numerical mathematics associated with the names of Douglas, Peaceman and Rachford, who suggested an alternating direction method.
The success of the method was ensured by the simple reduction of a multi-dimensional problem to a sequence of one-dimensional problems with Jacobi matrices, which are convenient to handle. Ultimately the alternating direction method can be treated as an iterative method in which optimization of the computation is carried out by a special selection of the compression operator. The latter is a product of simpler operators. Besides the selection of the compression operator there is a set of free relaxation parameters to be selected too.

Later the Soviet mathematicians Yanenko, Diakonov, Samarskii and others suggested a so-called splitting-up method. The point is that the approximation of the initial operator by each auxiliary operator is not necessary; such an approximation is required only on the whole, in special norms. A series of investigations has been devoted to the choice of optimization parameters of splitting-up by means of spectral and variational techniques.

The experience we have in the solution of one-dimensional problems represents a solid base when we come to the development of algorithms for more complex problems in mathematical physics. An important role in the development of new approaches to the solution of non-stationary two-dimensional problems belongs to the alternating direction method. However, from the outset it turned out that the method had some drawbacks. Further advance of the methods for multi-dimensional non-stationary problems is connected with splitting-up techniques based, as a rule, on inhomogeneous difference approximations of the initial differential operators. The mathematical technique is related to the splitting of a compound operator into simple ones. If this approach is used, the given equation can be solved by integrating simpler equations. In this case the intermediate schemes have to satisfy the approximation and stability conditions only as a whole, which permits flexible schemes to be constructed for practically all problems in mathematical physics. Splitting-up schemes for implicit approximations have been suggested by Yanenko, Diakonov, Samarskii et al. and applied in various problems.
Such schemes have stimulated a more general numerical approach to the problems of mathematical physics which has been called a weak approximation method. It turned out that the splitting-up method can be treated as a method of weak approximation of the initial equation by another, simpler one. The solution of the latter, under certain conditions, converges in norm to that of the initial problem. It is natural that the method of weak approximation is applicable to hydrodynamics, meteorology, oceanology, radiation transfer theory and so on.

Recently there has been found a class of splitting-up schemes equivalent in accuracy to the Crank-Nicolson difference scheme and applicable to non-stationary operators. These schemes are absolutely stable for systems of equations with positive semi-definite operators depending explicitly on coordinates and time. This method is easily extended to quasi-linear equations.

The French scientists Lions, Temam et al. have made an important contribution to the splitting-up methods and theoretically substantiated a number of new approaches. These investigations are especially important for fluid dynamics, theory of plasticity and control theory. The method of decomposition and decentralization formulated by these scientists should be specially mentioned. It is closely related to the method of weak approximation.

Lately there has been much interest in variational methods applied to problems in mathematical physics. The variational methods of Ritz, Galerkin, Trefftz and others have long become classical in numerical mathematics. These methods are especially effective when one seeks functionals of a solution. Besides, not long ago there emerged a new trend in variational methods, the so-called method of finite elements or functions. Its main idea was expressed by Courant in the nineteen forties. The essence of this method is that one seeks an approximate solution in the form of a linear combination of functions with compact support of the order of the mesh width h. In other words, one takes as trial functions special functions of polynomial form identically equal to zero outside a fixed domain having a characteristic dimension of several h's. The main problem here is the theory of approximation of functions by a given system of finite elements. An important contribution to the finite element method has been made by Oganesian, Ruhovtz, Rivkind and by Birkhoff, Schultz, Varga et al. The finite element method is closely associated with the application of a variational approach to constructing finite difference equations corresponding to differential equations in mathematical physics. French mathematicians have contributed to this area of research.

The solution of simultaneous algebraic equations and the computation of eigenvalues and eigenvectors of matrices are important problems in numerical mathematics.
Speaking about the numerical methods and problems in linear algebra of recent years, it is necessary first of all to emphasize the growing interest in the solution of large systems of equations, in the solution of ill-conditioned systems and in spectral problems for arbitrary matrices. Much attention has been paid to the use of a priori and a posteriori information in the process of solution. Under the influence of computer development the old numerical methods in linear algebra have been reconsidered. The increasing use of computers has stimulated the creation of new algorithms well suited for automatic calculation.

Direct methods in linear algebra are especially important. A method is called direct if it provides a solution by a finite number of arithmetic operations. Direct methods play an important role when simultaneous linear algebraic equations are solved or inverse matrices and determinants are found. Using some elementary transformations one can represent the initial matrix as a product of two matrices, each being easily inverted. The Gauss elimination method is a classical example of direct methods.

As usual, a difficult problem is the solution of simultaneous equations with ill-conditioned matrices. It is closely associated with the solution of conditionally properly posed problems of mathematical physics. One comes across this difficulty because the solution is sensitive to the accuracy of the matrix elements and of the vector components in the right-hand side of the equation. Though important results have been obtained here, it is only the beginning of extensive research which, evidently, will lead to a general theory.

Iterative methods remain very important in linear algebra. Active progress in these methods has resulted in a number of powerful algorithms which are efficiently used on computers. This progress has been caused first of all by the need to solve problems in mathematical physics, economics and control theory involving large systems of equations with special matrices. Direct methods are in most cases ineffective for such problems, though each new stage in computer technology extends their applicability.

So far several trends have formed in the construction of iterative processes and methods. We shall focus our attention only on two of them.
The first is associated with the use of spectral characteristics of the operators involved. The methods of this type can be described as follows. An iterative method is constructed with a matrix which depends on some set of parameters. There are two alternatives. First, all parameters of the set are the same for all iterative steps, the spectral radius of the transition matrix being minimized. Second, a sequence of parameter values is constructed so that the value of a parameter depends on the iteration number (provided that the error vector tends fast to zero over all initial approximations). Both methods use a priori information about the spectra of the matrices involved. The choice of such parameters is part of the optimization of the numerical algorithm. The major difficulty here is, as a rule, to determine the boundaries of the spectra of the matrices.

Spectral optimization of iterative methods stimulates the formulation of a number of problems. Once again we shall discuss two of them. It will be noted that the two spectral optimization methods are especially effective when we have a set of problems with the same operator but with different input data.

Much attention has recently been attracted to the Lanczos transform of arbitrary matrices, which leads to an equivalent system of equations with a symmetric matrix whose spectrum occupies two segments symmetric with respect to zero. The possibility of such symmetrization, as well as some new problems, encourages the development of spectral methods accelerating convergence and using polynomials with the least deviation from zero on a set of segments. The second problem connected with spectral optimization is the search for effective methods intended to determine the matrix eigenvalue with minimum modulus.

Let us discuss the application of variational principles to iterative methods. Such methods allow a successive minimization of some functional (quadratic as a rule) which attains a minimum on the desired solution. There has been much interest in such problems. When the variational approach to iterative methods is used, one can select relaxation parameters on the basis of a posteriori information obtained at each step. This is the case for the steepest descent method and the iterative method of minimal discrepancies. This is a merit of the variational approach. The rate of convergence seems to be not lower than the rate obtained using Chebyshev polynomials. It is essential that such methods converge for both symmetric and non-symmetric matrices if these are positive definite. It has been possible to devise a number of effective methods, like the method of minimal discrepancies, for positive semi-definite matrices. A grave obstacle to developing nonstationary variational methods has been the necessity to store a larger amount of intermediate data than in the corresponding Chebyshev methods.

At present there are iterative methods which combine spectral and variational approaches.
Lebedev has formulated conditions which must be satisfied by the corresponding operators. Under these conditions unimprovable estimates of the number of arithmetic operations hold. There is also a probabilistic technique intended to choose optimization parameters of iterative processes. A series of interesting results has been obtained by Vorobjev. The Young-Frankel overrelaxation method has not yet lost its importance. It has become classical and is generalized in a number of monographs.

We used to compare computational methods according to the number of arithmetic operations and memory requirements. Now we also ought to pay attention to their accuracy. It means that round-off error analysis has become an essential feature of the method itself. A systematic study of errors was first made by Wilkinson. His results were later systematized in his excellent monograph "The Algebraic Eigenvalue Problem", where the method of equivalent perturbations was taken as the basic mathematical technique. As a result, estimates of the norms of perturbations were obtained for all fundamental transformations in linear algebra. In parallel with the method of equivalent perturbations there was an intensive development of statistical error theory. The results obtained by Bakhvalov, Voevodin et al. initiated an investigation of round-off errors. Certainly the statistical methods will play an important role in round-off error analysis.

Progress in computation technology has had an important influence on many branches of computer science, which show a tendency to integration. The relations between software, the methods of numerical and applied mathematics, and the theory of programming and languages have become so close that the choice of a strategy for the solution of particular problems is now of paramount importance. Though optimization of individual components of the computational process is, as before, a fundamental factor of the theory, attention becomes more and more concentrated on optimization of the whole process. Optimization of computation is obviously one of the central objects in numerical mathematics, which stimulates exploration of new algorithms and new ways of their computer implementation.

The second trend is connected with the solution of classes of problems and with algorithm standardization. A large amount of computer-processed information must be systematized and put in order. The valuable experience which we have in the solution of the problems of science and technology allows us in many cases to set as an ultimate goal the creation of universal methods suitable to handle more or less wide classes of mathematical problems of the same type. At present care must be taken to save the efforts of society spent on the creation of numerous individual algorithms for individual and rare problems. It seems that a rational strategy for the solution of various rare problems is to construct universal algorithms self-adjusting to optimal operating conditions because they use a posteriori information. A rational strategy for the solution of frequently repeated problems is a careful implementation of specific algorithms. These two approaches combined will help to save social resources spent on the creation of efficient software. First steps have been made in the theory of universal algorithms which are self-adjusting to a kind of optimal operating conditions, and a course of further research has been outlined.

Software is becoming a materialization of society's intellect. The process of mathematization of the sciences has given rise to an active development of methods to simulate the phenomena occurring in nature and society. High-speed, large-capacity computers of new generations can store immediately available valuable information, and multi-access computers allow new forms of man-machine interaction using a conversational mode of operation. Therefore standardization of software in general and of numerical algorithms in particular is an urgent problem of scientific and technological progress.

The problem of software has stimulated the formulation of new problems in numerical mathematics, such as the construction of grids for complicated domains. For two-dimensional domains this problem is close to its effective solution, while for three- and multi-dimensional domains it is just being posed. This problem is closely connected with the construction of algorithms for large problems with high accuracy by difference, variational and other techniques, or maybe by a combination of different methods.

The solution of problems with non-linear monotone operators is especially important. The corresponding theory is at present being intensively developed.

The success achieved in analytic transformations on a computer practically leads us to the solution of mathematical physics problems by the well-known technique of continuous function analysis. As the supply of visual aids for analytic computations grows, these methods will penetrate more and more into software. The success achieved in transformations on computers will give computer science new possibilities which nowadays should be taken into account.

Finally I should like to note that the further development of numerical mathematics depends on the standard of research in fundamental branches of mathematics, the importance of the latter essentially increasing in the age of great technological progress. Only a harmonious combination of research in all branches of mathematics will provide the necessary and favourable conditions for the self-development of mathematics and its applications.
Chapter 1

GENERAL INFORMATION FROM THE THEORY OF DIFFERENCE SCHEMES

This chapter presents brief information on the basic problems of the theory of difference schemes, widely applied in the chapters to follow. As our task is to introduce certain modern principles of constructing numerical algorithms, we shall restrict ourselves to the simplest cases that can be easily interpreted. The book is supplied with a list of special literature in which one can find more complicated and specific problems of the theory.

1.1. Basic and Adjoint Equations
Let us consider a real Hilbert space $\Phi$ of functions $\varphi$ with an inner product

$$(g, h) = \int_D g\,h\,dx, \tag{1.1}$$

where $D$ is the domain of definition of the functions $g$ and $h$, and $x$ represents a generalized coordinate of an $n$-dimensional Euclidean space. As usual, the norm of a function $\varphi$ of $\Phi$ is defined by the relation

$$\|\varphi\| = \sqrt{(\varphi, \varphi)}. \tag{1.2}$$

Let us then consider an operator $A$ acting on the functions of the real Hilbert space $\Phi$. The operator $A$ is called positive definite on the functions of $\Phi$ if there exists a constant $\gamma > 0$ such that for each function $\varphi$ different from 0 the relations

$$(A\varphi, \varphi) \ge \gamma\,(\varphi, \varphi) > 0 \tag{1.3}$$

hold; for the sake of simplicity this is denoted by $A > 0$. The operator $A$ is called positive semi-definite if there are non-trivial elements $\varphi \in \Phi$ which turn the inner product $(A\varphi, \varphi)$ into zero, while for all the remaining elements there holds the inequality

$$(A\varphi, \varphi) > 0. \tag{1.4}$$

Below we shall formally denote positive semi-definite operators by $A \ge 0$.

Let us then introduce an adjoint operator $A^*$ satisfying the Lagrange identity

$$(Ag, h) = (g, A^*h), \tag{1.5}$$

where $g \in \Phi$ and $h \in \Phi^*$. It is essential to note that the space $\Phi^*$, generally speaking, does not coincide with $\Phi$, though the domain $D$ of definition of basic and adjoint functions is the same. To clarify this, note that in many problems of mathematical physics the functions $g$ belonging to the Hilbert space $\Phi$ satisfy some homogeneous boundary conditions. In applying the Lagrange identity (1.5) one defines, as a rule, alongside the operator $A^*$ those boundary conditions which are satisfied by the adjoint functions $h$.

Further, we shall apply a more convenient notation for adjoint functions: if the elements of the space $\Phi$ are denoted by $\varphi$, then those of the adjoint space $\Phi^*$ are suitably denoted by $\varphi^*$. In the case $A = A^*$ the operator $A$ is called self-adjoint. Then $\Phi^* = \Phi$.

Note an important consequence connected with the properties of adjoint operators: if $A > 0$, then from the Lagrange identity it follows that $A^* > 0$.

Fourier series expansions in eigenfunctions of the basic and adjoint operators are of great importance for the analysis of algorithms.
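Consider the following two spectral problems for $A \ge 0$:

$$Au = \lambda u, \tag{1.6}$$

$$A^*u^* = \lambda^* u^*. \tag{1.7}$$

Assume that each of the homogeneous equations (1.6), (1.7) defines a complete set of eigenfunctions, $\{u_n\}$ and $\{u_n^*\}$, which are orthogonal to each other and can be normalized as follows:

$$(u_n, u_m^*) = \delta_{nm},$$

and that the eigenvalues $\lambda_n$ belong to the interval

$$\alpha_A \le \lambda_n \le \beta_A.$$

We shall call this complete set of eigenfunctions a biorthogonal basis. Then, supposing completeness, any function $f$ of $\Phi$ and $f^*$ of $\Phi^*$ can be represented as a Fourier series

$$f = \sum_n f_n u_n, \qquad f^* = \sum_n f_n^* u_n^*,$$

where

$$f_n = (f, u_n^*), \qquad f_n^* = (f^*, u_n).$$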
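The biorthogonal expansion can be tried out on a small matrix. The following is a minimal sketch, assuming a hypothetical 2x2 matrix `A` whose eigenvectors and those of its transpose were worked out by hand; the names `dot`, `matvec`, `fourier_coefficients` and `reassemble` are illustrative and do not come from the text.

```python
# Hypothetical 2x2 example; the matrix and vectors are not from the text.
A = [[2.0, 1.0],
     [0.0, 3.0]]

def dot(g, h):
    """Discrete analog of the inner product (g, h)."""
    return sum(gi * hi for gi, hi in zip(g, h))

def matvec(M, v):
    """Product of the matrix M and the vector v."""
    return [dot(row, v) for row in M]

def transpose(M):
    """Adjoint of a real matrix."""
    return [list(col) for col in zip(*M)]

# Eigenpairs worked out by hand for this particular A.
eigenvalues = [2.0, 3.0]
u = [[1.0, 0.0], [1.0, 1.0]]          # A u_n = lambda_n u_n
u_star = [[1.0, -1.0], [0.0, 1.0]]    # A^T u*_n = lambda_n u*_n

def fourier_coefficients(f):
    """Coefficients f_n = (f, u*_n) of f in the basis {u_n}."""
    return [dot(f, us) for us in u_star]

def reassemble(coeffs):
    """Sum of f_n u_n over n."""
    size = len(u[0])
    return [sum(c * un[i] for c, un in zip(coeffs, u)) for i in range(size)]

f = [5.0, 7.0]
coeffs = fourier_coefficients(f)
print(coeffs)              # [-2.0, 7.0]
print(reassemble(coeffs))  # [5.0, 7.0]
```

Although the eigenvectors of this non-symmetric matrix are not orthogonal among themselves, pairing each with the corresponding eigenvector of the transpose recovers the coefficients by plain inner products.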
Later we shall assume, without stipulating it each time, that the spectrum of operators $A > 0$ and $A \ge 0$ is real. It is then not difficult to establish that $\lambda_n > 0$ in the first case and $\lambda_n \ge 0$ in the second.

Estimates of norms of operators are of great value for the analysis of numerical algorithms. The norm of the operator $A$ is defined by

$$\|A\| = \sup_{\varphi \in \Phi} \frac{\|A\varphi\|}{\|\varphi\|}$$

(from here on, for the sake of simplicity, the restriction $\varphi \ne 0$ will not be stated). Taking into consideration the relation

$$(A\varphi, A\varphi) = (\varphi, A^*A\varphi),$$

the squared norm of the operator $A$ can also be written as follows:

$$\|A\|^2 = \sup_{\varphi \in \Phi} \frac{(\varphi, A^*A\varphi)}{(\varphi, \varphi)}. \tag{1.12}$$

The operator $A^*A$ is symmetric and positive semi-definite. Consider the spectral problem

$$A^*A\,\Omega = \lambda\,\Omega. \tag{1.13}$$

The problem defines a set of eigenfunctions $\{\Omega_n\}$ and eigenvalues $\lambda_n^{A^*A} \ge 0$. For symmetric operators this set is complete. Then any function $\varphi$ of $\Phi$ can be represented as a Fourier series

$$\varphi = \sum_n \varphi_n \Omega_n, \tag{1.14}$$

where

$$\varphi_n = (\varphi, \Omega_n). \tag{1.15}$$

Substitute the series (1.14) into (1.12) and use the orthonormality of the functions $\Omega_n$. Then we have

$$\|A\|^2 = \sup_{\{\varphi_n\} \in Q} \frac{\sum_n \lambda_n^{A^*A}\,\varphi_n^2}{\sum_n \varphi_n^2},$$

where $Q$ is the Hilbert space of Fourier coefficients. It is not difficult to find that

$$\|A\| = \sqrt{\lambda_{\max}^{A^*A}}, \qquad \|A^{-1}\| = \frac{1}{\sqrt{\lambda_{\min}^{A^*A}}},$$

where $\lambda_{\min}^{A^*A}$ and $\lambda_{\max}^{A^*A}$ are respectively the minimum and maximum eigenvalues from the totality $\lambda_n^{A^*A}$ of the spectral problem (1.13). The value $\beta_{A^*A} = \lambda_{\max}^{A^*A}$ is usually called the spectral radius of the operator $A^*A$.

In the case of a self-adjoint operator $A$ consider the spectral problem

$$Au = \lambda^A u. \tag{1.18}$$

We have

$$\|A\| = \beta_A.$$

It is evident that for the self-adjoint operator

$$\beta_{A^*A} = \beta_A^2.$$
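Let us consider certain properties of norms of operators connected with the problem of eigenvalues.

1.1.1. Energy Norms

Later we shall always deal with Hilbert spaces of real functions with the norm

$$\|\varphi\|_C = \sqrt{(C\varphi, \varphi)}, \tag{1.21}$$

where $C > 0$. It is easy to see that, using the Lagrange identity, we have the equality

$$(C\varphi, \varphi) = (\varphi, C^*\varphi) = (C^*\varphi, \varphi)$$

and consequently

$$(C\varphi, \varphi) = \left(\frac{C + C^*}{2}\,\varphi, \varphi\right).$$

The operator $\frac{C + C^*}{2}$ is symmetric and positive. It means that if $C > 0$, then the norm of any function $\varphi$ can always be presented as an inner product with a symmetric operator in the form of the weight function, i.e.

$$\|\varphi\|_C^2 = \left(\frac{C + C^*}{2}\,\varphi, \varphi\right).$$

On the basis of the Buniakowsky-Schwarz inequality one can obtain the following important estimate:

$$\alpha\,(\varphi, \varphi) \le (C\varphi, \varphi) \le \beta\,(\varphi, \varphi),$$

where $\beta$ and $\alpha$ are the maximum and minimum eigenvalues of the symmetric operator $\frac{C + C^*}{2}$.

For simpler and more frequent cases one usually assumes $C = E$. Then we get

$$\|\varphi\|_E = \sqrt{(\varphi, \varphi)} = \|\varphi\|.$$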
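The role of the symmetric part in the energy norm can be illustrated numerically. This is a minimal sketch, assuming a hypothetical non-symmetric 2x2 matrix `C` and test vector `phi`; the helper names are illustrative. It checks that $(C\varphi,\varphi)$ equals the same form taken with $(C + C^*)/2$ and that it lies between the extreme eigenvalues of the symmetric part times $(\varphi,\varphi)$.

```python
import math

def dot(g, h):
    """Discrete analog of the inner product (g, h)."""
    return sum(gi * hi for gi, hi in zip(g, h))

def matvec(M, v):
    """Product of the matrix M and the vector v."""
    return [dot(row, v) for row in M]

def symmetric_part(C):
    """Return (C + C^T) / 2."""
    n = len(C)
    return [[0.5 * (C[i][j] + C[j][i]) for j in range(n)] for i in range(n)]

def symmetric_2x2_eigenvalues(S):
    """Closed-form eigenvalues (min, max) of a symmetric 2x2 matrix."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    return mean - radius, mean + radius

# Hypothetical data, chosen only for illustration.
C = [[2.0, 1.0],
     [-1.0, 3.0]]
phi = [1.0, -2.0]

S = symmetric_part(C)
alpha, beta = symmetric_2x2_eigenvalues(S)
energy = dot(matvec(C, phi), phi)        # (C phi, phi)
energy_sym = dot(matvec(S, phi), phi)    # ((C + C^T)/2 phi, phi)

print(energy, energy_sym)                # 14.0 14.0
print(alpha * dot(phi, phi) <= energy <= beta * dot(phi, phi))  # True
```

The skew-symmetric part of `C` contributes nothing to the quadratic form, which is why only the symmetric part enters the estimate.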
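1.1.2. Estimation of the Norm of a Single Operator

Let us consider a positive semi-definite operator $A \ge 0$. The following relation holds:

$$\|(E + \sigma A)^{-1}\| \le 1 \tag{1.25}$$

for any parameter $\sigma > 0$. This assertion can be proved using the formula

$$\|(E + \sigma A)^{-1}\|^2 = \sup_{\varphi \in \Phi} \frac{\left((E + \sigma A)^{-1}\varphi,\ (E + \sigma A)^{-1}\varphi\right)}{(\varphi, \varphi)}. \tag{1.26}$$

Let us take $\psi = (E + \sigma A)^{-1}\varphi$ as a new trial function in (1.26). Then we get

$$\|(E + \sigma A)^{-1}\|^2 = \sup_{\psi \in \Psi} \frac{(\psi, \psi)}{\left((E + \sigma A)\psi,\ (E + \sigma A)\psi\right)} = \sup_{\psi \in \Psi} \frac{(\psi, \psi)}{(\psi, \psi) + 2\sigma (A\psi, \psi) + \sigma^2 (A\psi, A\psi)}.$$

As $A \ge 0$ and $\sigma > 0$, the estimate (1.25) follows from the last relation. If $A > 0$, we have

$$\|(E + \sigma A)^{-1}\| < 1.$$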
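The estimate (1.25) is easy to check on small matrices. The sketch below assumes two hypothetical symmetric 2x2 matrices, one singular and positive semi-definite and one positive definite; the function names are illustrative. The operator norm is computed as the square root of the largest eigenvalue of $M^TM$.

```python
import math

def inverse_2x2(M):
    """Inverse of a nonsingular 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def spectral_norm_2x2(M):
    """Square root of the largest eigenvalue of M^T M."""
    (a, b), (c, d) = M
    p = a * a + c * c      # (M^T M)[0][0]
    q = a * b + c * d      # (M^T M)[0][1]
    r = b * b + d * d      # (M^T M)[1][1]
    lam_max = 0.5 * (p + r) + math.sqrt((0.5 * (p - r)) ** 2 + q * q)
    return math.sqrt(lam_max)

def resolvent_norm(A, sigma):
    """Norm of (E + sigma A)^{-1} for a 2x2 matrix A."""
    M = [[1.0 + sigma * A[0][0], sigma * A[0][1]],
         [sigma * A[1][0], 1.0 + sigma * A[1][1]]]
    return spectral_norm_2x2(inverse_2x2(M))

# Hypothetical test matrices.
A_semidefinite = [[1.0, -1.0], [-1.0, 1.0]]   # eigenvalues 0 and 2
A_definite = [[2.0, -1.0], [-1.0, 2.0]]       # eigenvalues 1 and 3

print(resolvent_norm(A_semidefinite, 0.5))    # equals 1: the bound is attained
print(resolvent_norm(A_definite, 0.5))        # strictly below 1
```

For the singular matrix the bound is attained on the null vector of `A`, while for the positive definite one the norm is strictly smaller than one, in line with the two cases distinguished above.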
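1.1.3. Kellogg's Lemma

If $A \ge 0$ and $\sigma > 0$, then

$$\|(E - \sigma A)(E + \sigma A)^{-1}\| \le 1. \tag{1.28}$$

Let us introduce the notation

$$T = (E - \sigma A)(E + \sigma A)^{-1}.$$

Consider the expression for $\|T\|^2$, where $\psi = (E + \sigma A)^{-1}\varphi$:

$$\|T\|^2 = \sup_{\varphi \in \Phi} \frac{(T\varphi, T\varphi)}{(\varphi, \varphi)} = \sup_{\psi \in \Psi} \frac{\left((E - \sigma A)\psi,\ (E - \sigma A)\psi\right)}{\left((E + \sigma A)\psi,\ (E + \sigma A)\psi\right)} = \sup_{\psi \in \Psi} \frac{(\psi, \psi) - 2\sigma (A\psi, \psi) + \sigma^2 (A\psi, A\psi)}{(\psi, \psi) + 2\sigma (A\psi, \psi) + \sigma^2 (A\psi, A\psi)} \le 1.$$

The property of positive semi-definiteness of the operator $A$ has been essentially used here. Thus, the lemma is proved. In the case of $A > 0$, instead of (1.28) we get

$$\|(E - \sigma A)(E + \sigma A)^{-1}\| < 1.$$

1.1.4. Estimation of the Norm of Operators

As was stated above,

$$\|A\|^2 = \beta_{A^*A}.$$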
Since the squared norm of the operator $A$ coincides with the spectral radius of the self-adjoint operator $A^*A$, to determine $\beta_{A^*A}$ one can use the well-known iterative Kellogg process (if $A$ is a normal operator, that is, $AA^* = A^*A$):

$$\beta_{A^*A} = \lim_{k \to \infty} \frac{\|\varphi^{k+1}\|}{\|\varphi^{k}\|}, \tag{1.29}$$

where the index $k$ denotes the number of the iteration in the following scheme:

$$\varphi^{k+1} = A^*A\,\varphi^{k}.$$
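An iteration of this kind can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the author's exact scheme: it repeatedly applies $A^TA$ to a vector, takes the ratio of successive norms as the estimate of $\|A\|^2$, and rescales the iterate at each step only to avoid overflow (the ratio is unaffected). The 2x2 matrix and all function names are hypothetical.

```python
import math

def dot(g, h):
    """Discrete analog of the inner product (g, h)."""
    return sum(gi * hi for gi, hi in zip(g, h))

def matvec(M, v):
    """Product of the matrix M and the vector v."""
    return [dot(row, v) for row in M]

def transpose(M):
    """Adjoint of a real matrix."""
    return [list(col) for col in zip(*M)]

def kellogg_norm_squared(A, iterations=100):
    """Estimate ||A||^2 as the limit of ||phi^{k+1}|| / ||phi^k||,
    where phi^{k+1} = A^T A phi^k."""
    At = transpose(A)
    phi = [1.0] * len(A[0])            # starting vector (an assumption)
    ratio = 0.0
    for _ in range(iterations):
        nxt = matvec(At, matvec(A, phi))
        norm_next = math.sqrt(dot(nxt, nxt))
        ratio = norm_next / math.sqrt(dot(phi, phi))
        phi = [x / norm_next for x in nxt]   # rescale; the ratio is unchanged
    return ratio

# Hypothetical example: A^T A = [[1, 2], [2, 5]] has largest eigenvalue 3 + 2*sqrt(2).
A = [[1.0, 2.0],
     [0.0, 1.0]]
print(kellogg_norm_squared(A))
```

The starting vector must have a non-zero component along the leading eigenvector of $A^TA$; the all-ones vector satisfies this for the example chosen here.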
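The proof of the convergence of the iterative process (1.29) follows immediately from Fourier analysis. In fact, let

$$\varphi^0 = \sum_n \varphi_n \Omega_n,$$

where the $\Omega_n$ are eigenfunctions of problem (1.13). Consequently

$$\varphi^k = \sum_n \left(\lambda_n^{A^*A}\right)^k \varphi_n \Omega_n.$$

Substituting the series into (1.29), for large $k$ we have

$$\frac{\|\varphi^{k+1}\|}{\|\varphi^{k}\|} = \lambda_m \left[1 + O\!\left(\left(\frac{\lambda_{m-1}}{\lambda_m}\right)^{2k}\right)\right],$$

where $\lambda_m = \beta_{A^*A} = \|A\|^2$ and $\lambda_{m-1}$ is the eigenvalue of the operator $A^*A$ preceding its maximum.

1.1.5. Calculation of Spectrum Bounds of a Positive Matrix

Consider the problem of finding the maximum and minimum eigenvalues of an operator $A$ having a positive spectrum:

$$Au = \lambda u.$$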
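For this purpose we use Lyusternik's method. We introduce the iterative process

$$\varphi^{(n+1)} = \frac{1}{c_n}\,A\varphi^{(n)},$$

where $c_n$ is a normalization factor, conveniently chosen in the form $c_n = \|\varphi^{(n)}\|$. Then

$$\beta_A = \lim_{n \to \infty} \|\varphi^{(n)}\|.$$

Here the following norm is used:

$$\|\varphi^{(n)}\| = \Big(\sum_i \big(\varphi_i^{(n)}\big)^2\Big)^{1/2},$$

where $\varphi_i^{(n)}$ are the components of the vector $\varphi^{(n)}$. The constant $c_n$ is usually of the order of $\beta_A$.

Then consider the matrix

$$B = \beta_A E - A$$

and the problem

$$Bv = \mu v.$$

It is evident that $B \ge 0$. Then consider again Lyusternik's iterative process, now for the matrix $B$, and get its maximum eigenvalue $\beta_B$. It is easy to see that the operators $A$ and $B$ have a common basis of eigenvectors and

$$\mu = \beta_A - \lambda.$$

Hence the minimum eigenvalue of $A$ is

$$\alpha_A = \beta_A - \beta_B.$$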
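The two-stage procedure can be sketched as follows. This is a minimal illustration, assuming the normalization $c_n = \|\varphi^{(n)}\|$ in the Euclidean norm, a hypothetical symmetric tridiagonal test matrix and a starting vector with non-zero components along both extreme eigenvectors; the names `lyusternik_max` and `spectrum_bounds` are not from the text.

```python
import math

def dot(g, h):
    """Discrete analog of the inner product (g, h)."""
    return sum(gi * hi for gi, hi in zip(g, h))

def matvec(M, v):
    """Product of the matrix M and the vector v."""
    return [dot(row, v) for row in M]

def norm(v):
    """Euclidean norm of the vector v."""
    return math.sqrt(dot(v, v))

def lyusternik_max(A, start, iterations=300):
    """Iterate phi <- A phi / ||phi||; the norm of the iterate tends to
    the largest eigenvalue of A (spectrum assumed non-negative)."""
    phi = list(start)
    for _ in range(iterations):
        c = norm(phi)
        phi = [x / c for x in matvec(A, phi)]
    return norm(phi)

def spectrum_bounds(A, start, iterations=300):
    """Return (alpha_A, beta_A) using the shifted matrix B = beta_A E - A."""
    beta_A = lyusternik_max(A, start, iterations)
    n = len(A)
    B = [[(beta_A if i == j else 0.0) - A[i][j] for j in range(n)]
         for i in range(n)]
    beta_B = lyusternik_max(B, start, iterations)
    return beta_A - beta_B, beta_A

# Hypothetical test matrix with eigenvalues 2 - sqrt(2), 2, 2 + sqrt(2).
A = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
alpha_A, beta_A = spectrum_bounds(A, [1.0, 0.0, 0.0])
print(alpha_A, beta_A)
```

The shift turns the smallest eigenvalue of `A` into the largest eigenvalue of `B`, so the same iteration serves both ends of the spectrum.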
In this way not only the maximum but also the minimum eigenvalue of the matrix $A$ is found. We assumed the matrix $A$ to be of a form for which Lyusternik's method is applicable.

1.1.6. Examples

Let us now go over the simplest examples, which will later help to illustrate methods in numerical mathematics.
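1. Let

$$A = -\Delta,$$

where

$$\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$$

is the Laplace operator. The operator $A$ is defined on a set of real functions $\Phi$ whose elements satisfy the following requirements. First,

$$\varphi = 0 \quad \text{on } \partial D, \tag{1.31}$$

where $\partial D$ is the boundary of the domain $D$. For the sake of simplicity it is assumed that the domain $D$ is the unit square $\{0 \le x \le 1,\ 0 \le y \le 1\}$. Second, the functions $\{\varphi\}$ form a Hilbert space with the inner product

$$(a, b) = \int_D a\,b\,dD,$$

where $a \in \Phi$ and $b \in \Phi^*$, and with the norm $\|\varphi\| = \sqrt{(\varphi, \varphi)}$.

Let us show now that on the set of functions $\Phi$ the operator $A$ is self-adjoint. Consider the functional

$$(A\varphi, \varphi^*) = -\int_D \varphi^*\,\Delta\varphi\,dD.$$

Using Green's second formula, or a double integration by parts, we get

$$(A\varphi, \varphi^*) = -\int_D \varphi\,\Delta\varphi^*\,dD - \oint_{\partial D}\left(\varphi^*\frac{\partial\varphi}{\partial n} - \varphi\frac{\partial\varphi^*}{\partial n}\right) dS.$$

Suppose each function $\varphi^* \in \Phi^*$ satisfies the following boundary condition:

$$\varphi^* = 0 \quad \text{on } \partial D. \tag{1.35}$$

The condition (1.35) together with (1.31) makes the integral along $\partial D$ equal to zero. As a result, we have

$$(A\varphi, \varphi^*) = (\varphi, A\varphi^*).$$

It means that $A^* = A$ and the operator in question is self-adjoint.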
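Let us then study the problem of definiteness of $A$. For this purpose we shall consider the functional

$$(A\varphi, \varphi) = -\int_D \varphi\,\Delta\varphi\,dD.$$

With the help of Green's first formula we get

$$(A\varphi, \varphi) = \int_D \left[\left(\frac{\partial\varphi}{\partial x}\right)^2 + \left(\frac{\partial\varphi}{\partial y}\right)^2\right] dD - \oint_{\partial D} \varphi\,\frac{\partial\varphi}{\partial n}\,dS.$$

As $\varphi$ satisfies condition (1.31), we have

$$(A\varphi, \varphi) = \int_D \left[\left(\frac{\partial\varphi}{\partial x}\right)^2 + \left(\frac{\partial\varphi}{\partial y}\right)^2\right] dD > 0$$

for any function $\varphi$ not identically equal to zero.

Finally, by this example we shall illustrate the problem of eigenvalues. As the operator $A = A^*$, the system of eigenfunctions of the problem

$$Au = \lambda u \tag{1.39}$$

in the case of

$$u = 0 \quad \text{for } \partial D \tag{1.40}$$

is complete. If $D$ is taken as the unit square, the system will be of the form

$$u_{mp} = 2\,\sin m\pi x\,\sin p\pi y \qquad (m, p = 1, 2, \dots),$$

the eigenvalues of the operator $A$ being

$$\lambda_{mp} = \pi^2\,(m^2 + p^2).$$

Hence, it follows that

$$2\pi^2 \le \lambda_{mp} < \infty.$$

Thus, in this case $\beta_A = \|A\| = \infty$. Hence it follows in particular that

$$(A\varphi, \varphi) \ge 2\pi^2\,(\varphi, \varphi).$$

Consequently, the operator $A$ is positive definite. The system of eigenfunctions $u_n$ being complete, any function of $\Phi$ can be represented as a Fourier series

$$\varphi = \sum_i \varphi_i u_i,$$

where

$$\varphi_i = (\varphi, u_i), \tag{1.44}$$

and $i$ is a new index of the ordered series.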
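The lower bound $(A\varphi,\varphi) \ge 2\pi^2(\varphi,\varphi)$ can be checked exactly on a simple trial function. The sketch below assumes the hypothetical trial function $\varphi(x,y) = x(1-x)\,y(1-y)$, which vanishes on the boundary of the unit square, and evaluates $(A\varphi,\varphi)$ through Green's first formula as the integral of the squared gradient. Polynomials are stored as coefficient lists; all helper names are illustrative.

```python
from fractions import Fraction
import math

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Derivative of a polynomial."""
    return [k * c for k, c in enumerate(p)][1:]

def poly_int01(p):
    """Exact integral of a polynomial over [0, 1]."""
    return sum(c / (k + 1) for k, c in enumerate(p))

# One-dimensional factor t(1 - t) = t - t^2 of the trial function.
p = [Fraction(0), Fraction(1), Fraction(-1)]
dp = poly_diff(p)

mass = poly_int01(poly_mul(p, p))           # integral of p^2
stiffness = poly_int01(poly_mul(dp, dp))    # integral of (p')^2

phi_phi = mass * mass                       # (phi, phi)
A_phi_phi = 2 * stiffness * mass            # integral of |grad phi|^2
quotient = A_phi_phi / phi_phi

print(quotient, float(quotient) >= 2 * math.pi ** 2)   # 20 True
```

The quotient 20 exceeds $2\pi^2 \approx 19.74$ only slightly, because this trial function is already close in shape to the first eigenfunction $\sin \pi x\,\sin \pi y$.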
2. Let us consider now an example of a finite-difference analog of the Iaplace operator.
ykl
in nodes (x y ), k' 1 uniformly covering the domain D with the h step s o that Suppose that
X
k+ 1
= Xk
+ h , Y L + ,
a r e the function
=Y1
values
+ h .
Let us call the points (xk , yl) mesh points, their s e t a grid, and h mesh s i z e We shall denote, the domain of definition of grid functions (so the functions prescribed in the mesh points a r e usually called) by Dh '
boundary
points by dD and the set of grid functions h
-
G . I. Marchuk
Such a projection of the operators upon a grid domain leads to finite-difference analogs of the equations, whose methods of construction and some other theoretical problems, such as approximation, numerical stability and convergence of the solutions of an approximate problem to a precise one, will be considered later.

Let $\varphi$ be a vector with components $\varphi_{kl}$, the values of the function at the mesh points, and

$$A^h\varphi = A_k\varphi + A_l\varphi,$$

where $A_k$ and $A_l$ are matrices built in the following way. Let us introduce auxiliary scalar operators

$$\Delta_k\varphi_{kl} = \frac{\varphi_{k+1,l} - \varphi_{kl}}{h}, \qquad \nabla_k\varphi_{kl} = \frac{\varphi_{kl} - \varphi_{k-1,l}}{h} \qquad (1.46)$$

and similar operators of index $l$. Then the components of the vectors $(A_k\varphi)_{kl}$ and $(A_l\varphi)_{kl}$ will be

$$(A_k\varphi)_{kl} = -\Delta_k\nabla_k\varphi_{kl}, \qquad (A_l\varphi)_{kl} = -\Delta_l\nabla_l\varphi_{kl}. \qquad (1.47)$$

The totality of the mesh points for which $k = 0, n$ with $l = 0, 1, 2, \ldots, n$ and $l = 0, n$ with $k = 0, 1, 2, \ldots, n$ will be called boundary points of the domain $D_h$ and denoted by $\partial D_h$. Suppose that $\varphi_{kl} = 0$ for $\partial D_h$.
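As an inner product we take

$$(\varphi, \varphi^*) = h^2\sum_{k=1}^{n-1}\sum_{l=1}^{n-1}\varphi_{kl}\,\varphi^*_{kl}.$$

Then consider the functional $(A^h\varphi, \varphi^*)$. Here we have the following identities, similar to Green's first and second formulas:

$$\sum_{k=1}^{n-1}\varphi^*_{kl}\,\Delta_k\nabla_k\varphi_{kl} = -\sum_{k=1}^{n}\nabla_k\varphi_{kl}\,\nabla_k\varphi^*_{kl}, \qquad \sum_{k=1}^{n-1}\varphi^*_{kl}\,\Delta_k\nabla_k\varphi_{kl} = \sum_{k=1}^{n-1}\varphi_{kl}\,\Delta_k\nabla_k\varphi^*_{kl}. \qquad (1.52)$$

Formulas (1.48) and (1.52) are valid only for functions $\varphi$ and $\varphi^*$ satisfying the conditions $\varphi_{kl} = 0$ and $\varphi^*_{kl} = 0$ for $\partial D_h$. Similar equalities take place for the sums of index $l$. The second relation from (1.52) helps us get

$$(A^h\varphi, \varphi^*) = (\varphi, A^h\varphi^*).$$

Hence follows the self-adjointness of $A^h$, i.e. $A^h = (A^h)^*$.

Let us consider the functional $(A^h\varphi, \varphi)$. With the help of the first identity (1.52) for $k$ and $l$ we get

$$(A^h\varphi, \varphi) = h^2\sum_{l=1}^{n-1}\sum_{k=1}^{n}(\nabla_k\varphi_{kl})^2 + h^2\sum_{k=1}^{n-1}\sum_{l=1}^{n}(\nabla_l\varphi_{kl})^2;$$

hence follows $(A^h\varphi, \varphi) > 0$ for $\varphi \ne 0$.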
At last, consider the spectral problem

$$A^h u = \lambda u \quad \text{in } D_h, \qquad u = 0 \quad \text{for } \partial D_h. \qquad (1.56)$$

The components of the eigenvectors corresponding to (1.56) are

$$u^{mp}_{kl} = 2\sin km\pi h\,\sin lp\pi h. \qquad (1.57)$$

In (1.57) the indices $k, l$ specify the components of the solution, and $m, p$ are the numbers of the eigenvalues, which can be ordered as follows:

$$u^{mp}_{kl} = u^{i}_{kl}, \qquad i = 1, 2, \ldots$$

With the obvious relations

$$-\Delta_k(\nabla_k \sin km\pi h) = \frac{4}{h^2}\sin^2\frac{m\pi h}{2}\,\sin km\pi h, \qquad -\Delta_l(\nabla_l \sin lp\pi h) = \frac{4}{h^2}\sin^2\frac{p\pi h}{2}\,\sin lp\pi h,$$

the eigenvalues will be

$$\lambda_{mp} = \frac{4}{h^2}\left(\sin^2\frac{m\pi h}{2} + \sin^2\frac{p\pi h}{2}\right). \qquad (1.58)$$

Here $m$ and $p$ change from 1 to $n-1$. Note that the smallest eigenvalue corresponds to $m = p = 1$. As, usually, $\pi h/2 \ll 1$, we can write approximately

$$\sin^2\frac{\pi h}{2} = \frac{\pi^2h^2}{4} + O(h^4);$$

hence we get $\lambda_{11} = 2\pi^2 + O(h^2)$. Thus, for small $h$ the lower bound of the spectrum of $A^h$ is close to that of the operator $A$.

The basis of eigenvectors (1.57) can be used to present the vector $\varphi_{kl}$ as a series in $u^i_{kl}$, with the coefficients given by the inner products $(\varphi, u^i)$.

The examples considered above give the necessary understanding of some operators and their properties.
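The eigenpairs (1.57), (1.58) can be checked numerically. The sketch below is an illustration only: it assumes the unit square with $h = 1/n$ and the five-point operator $A^h = -(\Delta_k\nabla_k + \Delta_l\nabla_l)$; the function names are hypothetical and the normalising factor of the eigenvector is dropped, since it does not affect the residual.

```python
import math

def eigenvalue(m, p, h):
    """Formula (1.58): (4/h^2) * (sin^2(m*pi*h/2) + sin^2(p*pi*h/2))."""
    return 4.0 / h ** 2 * (math.sin(m * math.pi * h / 2) ** 2
                           + math.sin(p * math.pi * h / 2) ** 2)

def eigenvector(m, p, n):
    """Grid function sin(k*m*pi*h) * sin(l*p*pi*h) for k, l = 0..n, h = 1/n."""
    h = 1.0 / n
    return [[math.sin(k * m * math.pi * h) * math.sin(l * p * math.pi * h)
             for l in range(n + 1)] for k in range(n + 1)]

def apply_Ah(u, n):
    """Apply the five-point operator at the interior mesh points."""
    h = 1.0 / n
    out = [[0.0] * (n + 1) for _ in range(n + 1)]
    for k in range(1, n):
        for l in range(1, n):
            out[k][l] = (4.0 * u[k][l] - u[k - 1][l] - u[k + 1][l]
                         - u[k][l - 1] - u[k][l + 1]) / h ** 2
    return out

def max_residual(m, p, n):
    """Largest interior value of |A^h u - lambda * u| for the pair (m, p)."""
    u = eigenvector(m, p, n)
    Au = apply_Ah(u, n)
    lam = eigenvalue(m, p, 1.0 / n)
    return max(abs(Au[k][l] - lam * u[k][l])
               for k in range(1, n) for l in range(1, n))

if __name__ == "__main__":
    n = 16
    for m, p in [(1, 1), (2, 3), (n - 1, n - 1)]:
        print(m, p, eigenvalue(m, p, 1.0 / n), max_residual(m, p, n))
```

The residual stays at rounding level for every pair $(m, p)$, and `eigenvalue(1, 1, h)` approaches $2\pi^2$ as $h$ decreases, in line with the estimate above.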
1.2. Approximation

Let us consider a certain problem of mathematical physics in the operator form

$$A\varphi = f \quad \text{in } D, \qquad a\varphi = g \quad \text{for } \partial D, \qquad (2.1)$$

where $A$ is a linear operator, $\varphi \in \Phi$ and $f \in F$. Here $\Phi$ and $F$ are Hilbert spaces with the domains of definition of elements $D + \partial D$ and $D$, respectively; $a$ is a linear operator of the boundary condition, $g \in G$, and $G$ is a Hilbert space of functions with the definition domain $\partial D$. Along with equation (2.1), let us consider a similar equation in a finite-dimensional Euclidean space

$$A^h\varphi^h = f^h \quad \text{in } D_h, \qquad a^h\varphi^h = g^h \quad \text{for } \partial D_h, \qquad (2.2)$$

where $A^h$ is a linear operator depending on the mesh size $h$, $\varphi^h \in \Phi_h$, $f^h \in F_h$, and $\Phi_h$ and $F_h$ are Euclidean spaces with the domains of definition of elements $D_h + \partial D_h$ and $D_h$, respectively. Here $D_h$ is a set of inner mesh points of the domain $D$, and $\partial D_h$ is a set of the mesh points on which the boundary condition of the problem is approximated; $a^h$ is a linear operator on the grid, $g^h \in G_h$, and $G_h$ is a Euclidean space of vectors with the domain of definition $\partial D_h$.

Let us introduce the Hilbert norms of vectors in the grid spaces $\Phi_h$, $F_h$, $G_h$. Then we shall denote by $(\cdot)_h$ the totality of values of any function of problem (2.1) after projection on the grid domain $D_h$, $\partial D_h$ or $D_h + \partial D_h$. Then the following definition is usually used: problem (2.2) approximates problem (2.1) with the order $h^n$ for the solution $\varphi$ if the norms of the residuals obtained by substituting $(\varphi)_h$ into (2.2) are bounded by $M_i h^n$ (condition (2.3)), where $M_i$ are some constants different from $\infty$.

For the cases where the solution of problem (2.1) is smooth enough, errors in approximation are conveniently measured by the maximum norm peculiar to the space of continuous and differentiable functions. To this end the Taylor-series expansion of the functions participating in the formulation of the problem is used.

Later we shall assume that the reduction of problem (2.1) to (2.2) is made and, moreover, that the boundary condition of (2.2) is used to eliminate the solution in the boundary points of the domain $D_h + \partial D_h$. As a result we have an equivalent problem

$$A^h\varphi^h = f^h, \qquad (2.4)$$

where $D_h$ is now the domain of definition of the solution. The solution $\varphi^h$ in the boundary points is to be found after solving equation (2.4), by solving the boundary condition of Eq. (2.2) with respect to the unknowns. Thus, in some cases it is convenient to use form (2.4) in writing an approximation problem, and in others form (2.2) is more proper.

So, as a result of the reduction applied with a certain approximation, a problem with a continuous argument (2.1) is reduced to a problem in linear algebra (2.4). The further task is to solve a system of algebraic equations.
Example. Consider the problem

$$-\Delta\varphi = f \quad \text{in } D, \qquad \varphi = 0 \quad \text{for } \partial D. \qquad (2.5)$$

The domain of definition $D$ is assumed to be a square $\{0 \le x \le 1,\ 0 \le y \le 1\}$, and $f$ a smooth function. Let us cover the square $D$ with a uniform grid along $x$ and $y$ with mesh size $h$. The mesh points of the domain will be denoted by two indices $(k, l)$, where the former, $0 \le k \le n$, corresponds to the points of the $x$ coordinate, and the latter, $0 \le l \le n$, to those of $y$. Let us consider the following approximations:

$$\frac{\partial^2\varphi}{\partial x^2} \approx \Delta_k\nabla_k\varphi_{kl}, \qquad \frac{\partial^2\varphi}{\partial y^2} \approx \Delta_l\nabla_l\varphi_{kl},$$

where the difference operators $\Delta_k$, $\Delta_l$, $\nabla_k$ and $\nabla_l$ are defined by (1.46). Then (2.5) can be approximated by

$$-\left[\Delta_k(\nabla_k\varphi_{kl}) + \Delta_l(\nabla_l\varphi_{kl})\right] = f_{kl} \quad \text{in } D_h, \qquad \varphi_{kl} = 0 \quad \text{for } \partial D_h, \qquad (2.6)$$

where $\partial D_h$ is a set of mesh points coinciding with the boundary $\partial D$. With the help of (1.46) problem (2.6) can be reduced to

$$A^h\varphi^h = f^h \quad \text{in } D_h, \qquad \varphi^h = 0 \quad \text{for } \partial D_h. \qquad (2.7)$$

The operator $A^h$ is usually called a difference analog of the Laplace operator. Let us introduce a space of solutions $\Phi_h$. Assume $D_h + \partial D_h = \{0 \le k \le n,\ 0 \le l \le n\}$ to be the domain of definition of the elements from $\Phi_h$. The values $f^h_{kl}$ belong to $F_h$, $D_h = \{1 \le k \le n-1,\ 1 \le l \le n-1\}$ being their domain of definition.

Applying the Taylor-series expansion in the vicinity of the points $(x_k, y_l)$, as well as assuming the continuity of the solution and its derivatives with respect to $(x, y)$ up to the second order, we obtain expansions of $\varphi$ at the neighbouring mesh points. There is a similar expansion for the function $f(x, y)$ as well. These expansions are considered in the domain $\{x_{k-1} \le x \le x_{k+1},\ y_{l-1} \le y \le y_{l+1}\}$. Let us substitute the expansions obtained for $\varphi$ into (2.7) and estimate the result in the energy norm. Then we have the two estimates (2.8), each of the order $h^2$, where $M_1$ and $M_2$ are constants. It is of interest to note that if $f^h_{kl}$ is chosen equal to $f(x_k, y_l)$, then $M_2 = 0$ in the second relation (2.8) and we get an exact approximation of the right-hand part of equation (2.5) in the given metric. As a result of the analysis of (2.8), we come to the conclusion that problem (2.7) approximates the basic problem (2.5) accurate to the second order relative to $h$.

Until now we have considered the approximation of a problem in space variables. Similarly, one can consider an approximation problem of an evolutionary equation
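$$\frac{\partial\varphi}{\partial t} + A\varphi = f \quad \text{in } D \times T, \qquad a\varphi = g \quad \text{for } \partial D, \qquad \varphi = \varphi_0 \quad \text{with } t = 0. \qquad (2.9)$$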
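The second-order approximation claimed for the example (2.5)-(2.7) above can be observed directly. The following sketch is not taken from the lectures: it assumes the test function $\varphi = \sin(x + y)$ (so that $-\Delta\varphi = 2\sin(x+y)$), ignores the boundary condition because only the residual of the difference operator is measured, and uses the maximum norm, which the text recommends for smooth solutions.

```python
import math

def phi(x, y):
    # Hypothetical smooth test function (an assumption of this sketch).
    return math.sin(x + y)

def f(x, y):
    # Right-hand side f = -Laplacian(phi) for the test function above.
    return 2.0 * math.sin(x + y)

def max_truncation_error(n):
    """Maximum over interior nodes of |A^h (phi)_h - f| for mesh size h = 1/n."""
    h = 1.0 / n
    worst = 0.0
    for k in range(1, n):
        for l in range(1, n):
            x, y = k * h, l * h
            ah_phi = (4.0 * phi(x, y) - phi(x - h, y) - phi(x + h, y)
                      - phi(x, y - h) - phi(x, y + h)) / h ** 2
            worst = max(worst, abs(ah_phi - f(x, y)))
    return worst

if __name__ == "__main__":
    for n in (8, 16, 32):
        print(n, max_truncation_error(n))
```

Halving $h$ should reduce the reported error by a factor close to 4, which is what approximation of order $h^2$ means.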
For the sake of simplicity, using the above considerations, we shall carry out a two-stage approximation of (2.9). We shall first approximate this problem in the domain $D_h + \partial D_h$ in space variables. As a result, we get an equation which is differential in time and a difference equation in space variables. Consider a new evolutionary equation

$$\frac{d\varphi^h}{dt} + A^h\varphi^h = f^h, \qquad (2.10)$$

where $A^h$, $f^h$ and $\varphi^h$ are functions of the time $t$. Later the index $h$ in (2.10) will be omitted as non-essential, supposing we deal with a difference analog in space variables of the basic problem in mathematical physics. Generally speaking, (2.10) is a system of ordinary differential equations for the components of the vector $\varphi^h$, which are approximate solutions at the mesh points of the domain $D_h$.

Thus, consider the following Cauchy problem:

$$\frac{d\varphi}{dt} + A\varphi = f, \qquad \varphi = g \quad \text{with } t = 0. \qquad (2.11)$$
Suppose that the operator $A$ does not depend on time. For the sake of explicitness consider the simplest methods of approximation of problem (2.11) with respect to time. Schemes of the first and second order of accuracy in time are currently the most widely applied difference schemes. First, consider the simplest explicit scheme of first-order accuracy in a grid space,

$$\frac{\varphi^{j+1} - \varphi^j}{\tau} + A\varphi^j = f^j, \qquad \varphi^0 = g, \qquad (2.12)$$

where $\tau = \Delta t$ and $f^j$ is a certain approximation of the function $f$. For simplicity one can take $f^j = f(t_j)$. If the simplest implicit approximation scheme is considered, we have

$$\frac{\varphi^{j+1} - \varphi^j}{\tau} + A\varphi^{j+1} = f^j, \qquad \varphi^0 = g, \qquad (2.13)$$

and $f^j$ is chosen as $f^j = f(t_{j+1})$. Both (2.12) and (2.13) are of first-order accuracy in time. This becomes quite obvious if one applies the Taylor-series expansion in time, assuming the existence of time derivatives of the solution up to the second order. Solving (2.12) and (2.13) with respect to the unknown $\varphi^{j+1}$, we get a recurrent relation

$$\varphi^{j+1} = T\varphi^j + Sf^j, \qquad (2.14)$$

where $T$ is a step operator, defined in the following way:

$$T = E - \tau A \ \text{ for scheme (2.12)}, \qquad T = (E + \tau A)^{-1} \ \text{ for scheme (2.13)},$$

and $S$ is a source operator:

$$S = \tau E \ \text{ for scheme (2.12)}, \qquad S = \tau T \ \text{ for scheme (2.13)}.$$
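One should note the fact that all single-layer difference schemes for evolutionary equations are reduced to (2.14). The Crank-Nicolson difference scheme of the second order of accuracy is of certain interest in applications:

$$\frac{\varphi^{j+1} - \varphi^j}{\tau} + A\,\frac{\varphi^{j+1} + \varphi^j}{2} = f^j, \qquad \varphi^0 = g, \qquad (2.15)$$

where $f^j = f(t_{j+1/2})$. Scheme (2.14) is canonical for (2.15), where

$$T = \left(E + \frac{\tau}{2}A\right)^{-1}\left(E - \frac{\tau}{2}A\right), \qquad S = \tau\left(E + \frac{\tau}{2}A\right)^{-1}.$$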
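The orders of accuracy quoted for (2.12), (2.13) and (2.15) can be observed on the simplest possible case. The sketch below is an illustration under stated assumptions: the operator $A$ is replaced by a positive number $a$ (a single eigen-mode), $f = 0$, $g = 1$, so that the exact solution is $e^{-at}$; the names `integrate` and `observed_order` are hypothetical.

```python
import math

# Step operators T of the canonical form (2.14) for a scalar "operator" a,
# written as functions of z = tau * a.
STEP = {
    "explicit (2.12)": lambda z: 1.0 - z,
    "implicit (2.13)": lambda z: 1.0 / (1.0 + z),
    "Crank-Nicolson (2.15)": lambda z: (1.0 - 0.5 * z) / (1.0 + 0.5 * z),
}

def integrate(step, a, n_steps, t_end=1.0):
    """Apply phi^{j+1} = T * phi^j with phi^0 = 1 up to time t_end."""
    tau = t_end / n_steps
    factor = step(tau * a)
    phi = 1.0
    for _ in range(n_steps):
        phi = factor * phi
    return phi

def observed_order(step, a=1.0, n_steps=50):
    """Estimate the order p from the errors with steps tau and tau/2."""
    exact = math.exp(-a)
    e1 = abs(integrate(step, a, n_steps) - exact)
    e2 = abs(integrate(step, a, 2 * n_steps) - exact)
    return math.log(e1 / e2, 2)

if __name__ == "__main__":
    for name, step in STEP.items():
        print(name, round(observed_order(step), 3))
```

The first two schemes should report an order close to 1 and the Crank-Nicolson scheme an order close to 2.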
In some cases the difference equations (2.12), (2.13) and (2.15) are conveniently written in the form of a system of two equations, where one approximates the equation itself in $D$, and the other the boundary condition on $\partial D$. In this case we have

$$L^{h\tau}\varphi^{h\tau} = f^{h\tau} \quad \text{in } D_h, \qquad l^{h\tau}\varphi^{h\tau} = g^{h\tau} \quad \text{on } \partial D_h. \qquad (2.16)$$

It is supposed here that the operator $L^{h\tau}$ approximates the operator of the equation and the operator $l^{h\tau}$ approximates the boundary condition within the interval $0 \le t \le T$. Similarly, $f^{h\tau}$ and $g^{h\tau}$ approximate $f$ and $g$ in the respective (generally speaking, different) norms; this is condition (2.17).

The difference equation in the canonical form (2.14), by introducing vector-functions and new operators acting in the space $D_h \times T_\tau$, can be written in the form (2.18). Essentially speaking, an evolutionary equation with regard to boundary conditions and initial data can be reduced to the problem in linear algebra (2.18). Note that for the analysis of approximation in different cases one can apply either a metric space $D_h$ or $D_h \times T_\tau$. In a particular case, a boundary value elliptic problem, an integral equation etc. can be reduced to equation (2.18). The condition of approximation can again be written as (2.17), where only $h$, the maximum value from the totality $(\Delta x_i)$ of steps in geometrical variables, is the approximation index.
Example. Consider the problem

$$\frac{\partial\varphi}{\partial t} - \Delta\varphi = f \quad \text{in } D, \qquad \varphi = 0 \quad \text{for } \partial D, \qquad \varphi = g \quad \text{with } t = 0. \qquad (2.19)$$

$D \times T$ is the domain of definition of the solution, where $D$, as above, is a square and $T = \{0 \le t \le T\}$. Let us change from $D$ to $D_h$, from $\partial D$ to $\partial D_h$ and from $T$ to $T_\tau$. Let $T_\tau$ be the set of points $t_j$, so that $t_{j+1} - t_j = \tau$. Then the following will be an approximation to the problem (2.19):
1. Consider the simplest explicit approximation (2.21). Problem (2.21) is obviously solvable with respect to the solution $\varphi^{j+1}$, which gives the recurrent relation (2.24). The recurrent relation (2.24) is represented as

$$\varphi^{j+1} = T\varphi^j + \tau f^j,$$

where $T$ is a step operator, and the operators $A_i$ are defined by (1.47). Let us estimate its norm. To that end, consider the maximum eigenvalue for the corresponding spectral problem; the relation (2.28) for the norm of $T$ is then obvious.

2. Along with the explicit approximation of first-order accuracy in $\tau$, one can consider an implicit approximation of the first order in $\tau$ and second order in $h$; then instead of (2.21) we take the scheme (2.29), and the values $f^j_{kl}$ and $g_{kl}$ are again defined by (2.22). In this case equation (2.20) cannot be solved explicitly, and we get the operator equation (2.30). This equation must be solved at every step. Write equation (2.30) in the form

$$\varphi^{j+1} = T\varphi^j + \tau Tf^j,$$

where

$$T = (E + \tau A^h)^{-1}.$$

In this case the norm of the operator $T$ is determined by the minimum eigenvalue of $A^h$.

3. At last, consider the approximation by the Crank-Nicolson difference scheme. In this case the operators and functions in (2.20) are defined so that we arrive at the problem (2.36). In this case equation (2.36) is formally solved with respect to the unknown $\varphi^{j+1}_{kl}$ in the canonical form (2.14), and the norm of the step operator is estimated in the same way.

1.3. Numerical Stability
Now let us turn to the notion of numerical stability of difference schemes. We shall not aim at a possible generalization of the definition, as we are mainly interested in the simplest algorithmic approaches to the analysis of the quality of difference schemes approximating the problems of mathematical physics. Different aspects of stability theory and a number of important generalizing results can be found in some monographs.

To find out the main definitions and concepts of the stability theory, let us first consider the explicit difference scheme (2.12):

$$\frac{\varphi^{j+1} - \varphi^j}{\tau} + A\varphi^j = f^j, \qquad \varphi^0 = g. \qquad (3.1)$$

Suppose the operator $A$ generates a complete set of eigenfunctions $\{u_n\}$ and a set of eigenvalues $\{\lambda_n > 0\}$ corresponding to the spectral problem $Au = \lambda u$. Introduce the following Fourier series:

$$\varphi^j = \sum_n \varphi^j_n u_n, \qquad f^j = \sum_n f^j_n u_n, \qquad g = \sum_n g_n u_n, \qquad (3.2)$$

where the coefficients are the inner products with $u^*_n$, and $u^*_n$ are eigenfunctions of the adjoint spectral problem. Substitute (3.2) into (3.1) and multiply the result scalarly by $u^*_n$. Then, if $\tau > 0$, we get expressions for the Fourier coefficients

$$\frac{\varphi^{j+1}_n - \varphi^j_n}{\tau} + \lambda_n\varphi^j_n = f^j_n. \qquad (3.3)$$

Supposing that $\varphi^0 = g$, we obtain the initial condition

$$\varphi^0_n = g_n. \qquad (3.4)$$

The solution of (3.3), (3.4) is obtained from recurrent elimination of the unknowns. As a result we have

$$\varphi^j_n = r_n^j g_n + \tau\sum_{i=1}^{j} r_n^{j-i} f^{i-1}_n, \qquad (3.5)$$

where

$$r_n = 1 - \tau\lambda_n. \qquad (3.6)$$

Equality (3.5) is estimated in modulus. We reinforce the estimate by substituting $\max_j |f^j_n|$ for $|f^{i-1}_n|$. Then we have

$$|\varphi^j_n| \le |r_n|^j |g_n| + \tau\,\|f_n\| \sum_{i=1}^{j} |r_n|^{j-i}, \qquad (3.7)$$

where $\|f_n\| = \max_j |f^j_n|$.

John von Neumann introduced a so-called spectral criterion of stability. It means that if for each harmonic of the Fourier series (3.2) there holds the relation

$$|\varphi^j_n| \le C_1 |g_n| + C_2 \|f_n\|, \qquad (3.8)$$

where $C_1$, $C_2$ are constants independent of $j$, then the difference scheme (3.1) is announced to be numerically stable. Let us see what conditions should be applied to the parameters of the difference scheme (2.12) to satisfy relation (3.8). Analysis of relation (3.7) shows that the stability criterion (3.8) is fulfilled if the condition

$$|r_n| \le 1 \qquad (3.9)$$

is imposed upon the parameter $r_n$. Suppose that the spectrum of the operator $A$ is situated in the interval

$$0 \le \lambda_n \le \beta. \qquad (3.9')$$

Then, in accordance with (3.6), relation (3.9) will hold if

$$\tau \le \frac{2}{\beta}. \qquad (3.10)$$

Relation (3.10) will be the constructive condition for the stability of the difference scheme (3.1).
Note that condition (3.10) is sufficient for stability. In fact, suppose that

$$|r_n| \le 1 + c\tau,$$

where $c \ge 0$ is a constant independent of $\tau$. In this case relation (3.7) converts into the following:

$$|\varphi^j_n| \le e^{cj\tau}\left(|g_n| + j\tau\,\|f_n\|\right).$$

Let $j\tau \le T$, where $T$ is fixed. It means that with a small $\tau$ a large number of steps $j$ is considered, with $\tau \to 0$, $j \to \infty$, but so that $T$ remains fixed. Then, with such an approach, we again have stable schemes according to Neumann.

In conclusion, two important facts should be stressed. First, it should be noted that stability according to Neumann is based on the spectrum analysis of the operator of the problem. It means that with such an approach the estimation of the maximum eigenvalue of the problem or its upper bound is a necessary element of the algorithm. Second, the spectral stability criterion establishes stability and correctness of the solution with respect to every harmonic of the Fourier series, but it does not say anything about the correctness of the solution as a whole. Meanwhile, it is the solution $\varphi^j$ which is the main object of our consideration. All this has stimulated the investigators to find other stability definitions, connected with the norms of the operators of the problem. We should stress as well that up to now stability analysis according to Neumann plays a significant role in applications.

Now consider other difference schemes, based on implicit difference approximations. In the case of the implicit scheme of first-order approximation (2.13) we get an expression similar to (3.7), where

$$r_n = \frac{1}{1 + \tau\lambda_n}.$$

It is obvious that for the given difference scheme with $\lambda_n > 0$ absolute stability holds, as $|r_n| < 1$.

Similarly, the absolute stability of the Crank-Nicolson scheme (2.15) according to Neumann can be established. In this case the estimation for the Fourier coefficients of the solution has the same form with

$$r_n = \frac{1 - \frac{\tau}{2}\lambda_n}{1 + \frac{\tau}{2}\lambda_n}.$$

Hence $|r_n| < 1$ with any $\lambda_n > 0$.
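The spectral criterion lends itself to a direct check. The sketch below is only an illustration: it assumes a small hypothetical spectrum with $\beta = 100$, uses the factors $r_n$ of the three schemes (explicit (2.12), implicit (2.13), Crank-Nicolson (2.15)), and tests $|r_n| \le 1$ for every eigenvalue.

```python
def r_explicit(tau, lam):
    """Factor (3.6) of the explicit scheme (2.12)."""
    return 1.0 - tau * lam

def r_implicit(tau, lam):
    """Factor of the implicit scheme (2.13)."""
    return 1.0 / (1.0 + tau * lam)

def r_crank_nicolson(tau, lam):
    """Factor of the Crank-Nicolson scheme (2.15)."""
    return (1.0 - 0.5 * tau * lam) / (1.0 + 0.5 * tau * lam)

def neumann_stable(r, tau, spectrum):
    """True if |r_n| <= 1 for every eigenvalue in the given spectrum."""
    return all(abs(r(tau, lam)) <= 1.0 for lam in spectrum)

# Hypothetical spectrum of A (an assumption of this sketch); beta = 100.
SPECTRUM = [1.0, 10.0, 100.0]

if __name__ == "__main__":
    for tau in (0.015, 0.025, 1000.0):
        print(tau,
              neumann_stable(r_explicit, tau, SPECTRUM),
              neumann_stable(r_implicit, tau, SPECTRUM),
              neumann_stable(r_crank_nicolson, tau, SPECTRUM))
```

With $\beta = 100$ the explicit scheme passes for $\tau = 0.015 < 2/\beta$ and fails for $\tau = 0.025 > 2/\beta$, while the two implicit schemes pass for any positive step.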
Now let us come to a more general definition of the notion of numerical stability. For this purpose let us consider the problem

$$\frac{\partial\varphi}{\partial t} + A\varphi = f, \qquad \varphi = g \quad \text{with } t = 0, \qquad (3.14)$$

which is approximated by the difference problem (3.15) written in the canonical form (2.14) with the initial condition $\varphi^0 = g$.

We say that the difference scheme (3.15) is stable if, with the fixed parameter $h$ characterizing the difference approximation, there holds the following relation for any $j$:

$$\|\varphi^j\| \le C_1^h\|g\| + C_2^h\|f\|, \qquad (3.16)$$

where the constants $C_1^h$ and $C_2^h$ are independent of $j$, and $\|f\| = \max_j\|f^j\|$.

The definition of numerical stability involves the notion of the correctness of problems with continuous argument. One can say that numerical stability establishes a continuous dependence of the solution on the input data for problems of discrete argument. In fact, we choose as the input data of (3.15) $f = f_*$ and $g = g_*$. We get some solution of (3.15) and denote it by $\varphi_*$. Then we take as the input data $f = f_{**}$ and $g = g_{**}$ and denote the solution by $\varphi_{**}$. Then for the difference of the solutions we have a problem of the same form (3.15) with the input data $f_* - f_{**}$ and $g_* - g_{**}$, and the stability condition becomes

$$\|\varphi^j_* - \varphi^j_{**}\| \le C_1^h\|g_* - g_{**}\| + C_2^h\|f_* - f_{**}\|.$$

Hence it follows that small variations of the solution correspond to those of the input data $f$ and $g$.
It is easy to see that the definition of stability as it is given in (3.16) already relates the solution itself with a priori information on the input of the problem. Such a definition is more convenient for stability analysis of many problems than the one due to Neumann. Let us consider the stability of (2.12) from this point of view. To this end we rewrite the recurrent relation (3.1) as

$$\varphi^{j+1} = T\varphi^j + \tau f^j, \qquad (3.17)$$

where $T = E - \tau A$. The formal solution of (3.17) is

$$\varphi^j = T^j g + \tau\sum_{i=1}^{j} T^{j-i} f^{i-1}.$$

We shall estimate this solution by the norm using the Cauchy-Bunyakovsky inequality and the triangle inequality:

$$\|\varphi^j\| \le \|T\|^j\|g\| + \tau\sum_{i=1}^{j}\|T\|^{j-i}\|f^{i-1}\|.$$

We replace $\|f^{i-1}\|$ by the maximal value over all $j$. Let $\|f\| = \max_j\|f^j\|$. Then we get

$$\|\varphi^j\| \le \|T\|^j\|g\| + \tau\,\|f\|\sum_{i=1}^{j}\|T\|^{j-i}.$$

If we put

$$\|T\| \le 1, \qquad (3.22)$$

then scheme (2.12) will be stable in the sense of the definition (3.16).

It is natural that condition (3.22) is a sufficient condition of stability. One could get finer and weaker criteria through the norms of the powers of the step operator, $\|T^i\|$ $(i = 1, 2, \ldots, j)$. However, such a weakening of the condition makes a constructive procedure for establishing the stability criterion difficult. As a rule, in practical calculations the sufficient condition of the form (3.22) is applied.
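Let us consider a case with the operator $A = A^* > 0$ and denote

$$J = \frac{(T\varphi, T\varphi)}{(\varphi, \varphi)}, \qquad T = E - \tau A.$$

Then

$$J = 1 - 2\tau\frac{(A\varphi, \varphi)}{(\varphi, \varphi)} + \tau^2\frac{(A\varphi, A\varphi)}{(\varphi, \varphi)}.$$

Let

$$\varphi = \sum_n \varphi_n u_n,$$

where $\{u_n\}$ is the orthonormal basis of eigenfunctions of the operator $A$. Then

$$J = 1 - 2\tau\bar\lambda + \tau^2\,\overline{\lambda^2},$$

where

$$\bar\lambda = \frac{\sum_n \lambda_n\varphi_n^2}{\sum_n \varphi_n^2}, \qquad \overline{\lambda^2} = \frac{\sum_n \lambda_n^2\varphi_n^2}{\sum_n \varphi_n^2}.$$

Let us find conditions to be satisfied by $\tau$ so that $J \le 1$, i.e. $\tau\,\overline{\lambda^2} \le 2\bar\lambda$; then

$$\tau \le \frac{2\bar\lambda}{\overline{\lambda^2}}.$$

Hence, if $\beta_A = \|A\| = \lambda_{\max}$, then $\overline{\lambda^2} \le \beta_A\bar\lambda$. We get the following sufficient condition of stability:

$$\tau \le \frac{2}{\beta_A}.$$

In this case $\|T\| \le 1$, and the calculation will be stable in the sense of (3.16). Note that in the case of a self-adjoint operator, the sufficient conditions of numerical stability according to Neumann (3.10) and those in the sense of definition (3.16) coincide. Generally speaking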
, in the case of a non-self-adjoint operator $A$, there is no similar correspondence in stability criteria. Indeed, if Neumann's method involves the maximum eigenvalue of the operator $A$, the spectral norm stability involves the maximum eigenvalue of the operator $(A^*A)^{1/2}$.

Similarly, we can consider the stability of the implicit difference equations (2.13) and (2.15). In these cases we get an estimate of the same form, where

$$T = (E + \tau A)^{-1} \ \text{ for (2.13)}, \qquad T = \left(E + \frac{\tau}{2}A\right)^{-1}\left(E - \frac{\tau}{2}A\right) \ \text{ for (2.15)}.$$

As stated above, for the self-adjoint operator $A > 0$ the sufficient condition of stability for these schemes coincides with Neumann's criterion considered above. It means that in this case the difference schemes will be absolutely stable in the sense of the definition (3.16).

We have considered the principal scheme for the research of numerical stability of the difference scheme, assuming the operator $A$ to be time independent. Such an assumption is quite natural for a number of problems in mathematical physics. It also allows us to consider a number of further constructive approaches widely used in numerical
mathematics. Indeed, the research of stability leads eventually to the estimation of the norm of the step operator $T$. As it was stated above in § 1.1, the squared norm of the operator $T$ coincides with the spectral radius of the self-adjoint positive operator $T^*T$. To define the spectral radius one can apply Kellogg's iterative process:

$$\|T\|^2 = \lim_{k\to\infty}\frac{(T^*T\varphi^{(k)}, \varphi^{(k)})}{(\varphi^{(k)}, \varphi^{(k)})},$$

where $\varphi^{(k)}$ are the elements of the following sequence:

$$\varphi^{(k+1)} = T^*T\varphi^{(k)}. \qquad (3.27)$$

Thus, the problem of defining the norm of the operator $T$ is reduced to a successive implementation of the recurrent relation (3.27). It is this way that is constructively most developed for use in computers. In the case of a self-adjoint operator $T$ the norm coincides with the spectral radius of $T$ itself.
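The iterative process described above can be sketched for a small matrix. The code is an illustration under assumptions, not the author's program: $T$ is a real matrix stored as nested lists (so $T^*$ is the transpose), the starting vector is all ones and is assumed not to be orthogonal to the dominant eigenvector of $T^*T$, and the iterate is rescaled at every step to avoid overflow, which does not change the quotient.

```python
import math

def matvec(M, v):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def transpose(M):
    return [list(row) for row in zip(*M)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def squared_norm(T, iterations=200):
    """Estimate ||T||^2 as the limit of (T*T phi, phi) / (phi, phi)."""
    Tt = transpose(T)
    phi = [1.0] * len(T[0])          # assumed starting vector
    estimate = 0.0
    for _ in range(iterations):
        psi = matvec(Tt, matvec(T, phi))      # psi = T* T phi, relation (3.27)
        estimate = dot(psi, phi) / dot(phi, phi)
        scale = math.sqrt(dot(psi, psi))      # rescale; T is assumed nonzero
        phi = [x / scale for x in psi]
    return estimate

if __name__ == "__main__":
    T = [[1.0, 2.0], [0.0, 1.0]]     # hypothetical non-self-adjoint example
    print(squared_norm(T), 3.0 + 2.0 * math.sqrt(2.0))
```

For the example matrix all eigenvalues of $T$ equal 1, yet $\|T\|^2 = 3 + 2\sqrt{2}$, which illustrates why the norm criterion and the eigenvalue criterion differ for non-self-adjoint operators.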
The following observation is noteworthy. When one studies the stability of a difference scheme one sometimes applies the method for the definition of the spectral radius of the problem which is infinite and periodic in space variables. For problems with non-periodic boundary conditions, it is quite obligatory to estimate the spectral radius with the help of Kellogg's method for the operators $T$ in whose construction the real boundary conditions have already been considered.

If the operator $A$ changes with time, the problem of stability research is hampered to a great extent, as in such a case the norm of the operator $T$ changes with time as well and it is necessary to establish a spectral radius for each step, the latter depending on the number of the time step. In this case it is appropriate to construct absolutely stable difference analogs of problems. In this respect some methods will be discussed below.

In conclusion it ought to be noted that if the approximation of the evolutionary equation is investigated in terms of the space $D_h \times T_\tau$, the stability definition should rather be given in terms of the same space. Indeed, let equation (2.16) approximate the basic evolutionary problem and let the difference problem be of the form

$$L^{h\tau}\varphi^{h\tau} = f^{h\tau}, \qquad l^{h\tau}\varphi^{h\tau} = g^{h\tau}.$$

Then the stability criterion is

$$\|\varphi^{h\tau}\| \le C_1^h\|f^{h\tau}\| + C_2^h\|g^{h\tau}\|,$$

where $C_1^h$ and $C_2^h$ are constants, with $h$ fixed, independent of $\tau$.

Now let the basic problem of mathematical physics be approximated by the difference equation (2.18) so that the boundary conditions are already taken into account in its construction. Then the stability criterion is conveniently introduced in the form:
$$\|\varphi^{h\tau}\| \le C^h\|f^{h\tau}\|, \qquad (3.31)$$

where $C^h$, with $h$ fixed, is independent of $\tau$.

Of course, if numerical stability is studied with any $h$, $h \to 0$ included, then it is supposed that with small $h$ there are $C_1 > 0$ and $C_2 > 0$, already independent of $h$, i.e.

$$C_1 = \max_h C_1^h \quad \text{and} \quad C_2 = \max_h C_2^h.$$

Thus, with the constructive approach to the establishment of stability of one or another difference scheme it is always preferable to define the constants $C_1^h$ and $C_2^h$ for $h$ fixed. However, in the cases when passages to the limit are studied with $h \to 0$, $\tau \to 0$, the judgement of stability should be connected with constants $C_1$ and $C_2$ independent of either $h$ or $\tau$. It is this view of stability that we shall use in the investigation of the convergence of approximate problem solutions to accurate ones, with simultaneous tendency of $h$ and $\tau$ towards zero. A similar consideration refers to the definition of numerical stability in the form (3.31).
Example 1. Let us investigate the numerical stability of difference schemes approximating the equation of heat conductivity. In the case of the explicit scheme (2.21) we had the norm of the step operator in the form (2.28). Now let $\|T\| \le 1$. It is in this case that the difference scheme is numerically stable. To realize the assumption, the time step $\tau$ should be properly chosen. Obviously, if we choose $\tau$ so that condition (3.32) holds, then $\|T\| \le 1$, and the calculation will be stable. Thus, the explicit difference scheme (2.21) is stable under the condition (3.32). Such schemes are usually called conditionally stable.
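Conditional stability of the explicit scheme is easy to observe in a small experiment. The sketch below makes several assumptions of its own: $f = 0$, the unit square with $n = 8$ intervals per side, initial data concentrated at one mesh point (so that all harmonics are present), and the ratio $\sigma = \tau/h^2$ as the only parameter. Since (1.58) gives $\lambda_{\max} < 8/h^2$, the requirement $|1 - \tau\lambda| \le 1$ is certainly met for $\sigma \le 1/4$; this bound is derived here and is not quoted from (3.32).

```python
def explicit_heat_run(n, sigma, steps):
    """Run phi^{j+1} = phi^j - tau * A^h phi^j with sigma = tau / h^2.

    Returns max |phi| after the given number of steps, starting from a unit
    spike at the central mesh point with zero boundary values.
    """
    phi = [[0.0] * (n + 1) for _ in range(n + 1)]
    phi[n // 2][n // 2] = 1.0
    for _ in range(steps):
        new = [[0.0] * (n + 1) for _ in range(n + 1)]
        for k in range(1, n):
            for l in range(1, n):
                lap = (phi[k - 1][l] + phi[k + 1][l] + phi[k][l - 1]
                       + phi[k][l + 1] - 4.0 * phi[k][l])
                new[k][l] = phi[k][l] + sigma * lap
        phi = new
    return max(abs(v) for row in phi for v in row)

if __name__ == "__main__":
    print("sigma = 0.20:", explicit_heat_run(8, 0.20, 200))   # decays
    print("sigma = 0.30:", explicit_heat_run(8, 0.30, 200))   # blows up
```

With $\sigma = 0.2$ the solution decays, as the heat equation requires; with $\sigma = 0.3$ the highest harmonic is amplified at every step and the result grows without bound.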
Let us consider now the implicit scheme (2.30). In this case $\|T\| < 1$ and, consequently, the difference scheme will be stable with any $\tau > 0$. Such schemes are usually called unconditionally or absolutely stable. Similarly, for the Crank-Nicolson scheme (2.36) we have $\|T\| < 1$. Thus this scheme is also absolutely stable.

1.4. The Convergence Theorem

The present section will deal with one of the most important theorems of numerical mathematics, known as the equivalence theorem, finally formulated by P. Lax. The point is that from approximation and stability of the difference scheme follows the convergence of the solution of the approximate problem to the solution of the exact one. Let us consider an evolutionary problem
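$$\frac{\partial\varphi}{\partial t} + A\varphi = f \quad \text{in } D \times T \qquad (4.1)$$

with the boundary conditions

$$a\varphi = g \quad \text{for } \partial D \times T \qquad (4.2)$$

and the initial data

$$\varphi = \varphi_0 \quad \text{with } t = 0. \qquad (4.3)$$

Let us write the problem (4.1)-(4.3) in the operator form (4.4), with linear operators $L$ and $l$ for the equation and for the boundary and initial conditions.

Let us consider a space of functions with the definition domain $D \times T$ and cover this domain with a grid. Then we project the solution of the problem (4.4) upon the grid domain $\bar D_h \times T_\tau$, where $\bar D_h = D_h + \partial D_h$, and consider an approximate problem (4.5) approximating (4.4), with linear operators $L^{h\tau}$ and $l^{h\tau}$.

Suppose now that the approximation estimate (4.6) is valid, with the order $h^k$ in space and $\tau^p$ in time. Then assume a certain numerical stability (4.7) of the difference problem. Let $C_1$ and $C_2$ be independent of $h$ and $\tau$. Then, provided that we have approximation (4.6), stability (4.7) and linearity of $L$, $l$, $L^{h\tau}$ and $l^{h\tau}$, there is convergence of the solution of (4.5) to the projection of the solution of (4.4) as $h \to 0$, $\tau \to 0$.

Let us prove the theorem as follows. Consider the identities obtained by applying $L^{h\tau}$ and $l^{h\tau}$ to the projection $(\varphi)_{h\tau}$ of the exact solution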
G. I. Marchuk
If triangle inequalities a r e taken f o r the norms :
Applying the conditions of approximation (4. 6 ) we get :
Now consider the equalities
where
A s the difference scheme (4. 1 2 ) is numerically stable according to the theorem condition, we get
and, considering (4. 13) and ( 4 . l l ) , we have
G . I. Ma rchuk
o r , finally,
Thus the convergence theorem is proved. The assumption of the theor e m included a r a t h e r rigid condition that of
h
and
C1,
and C
2
a r e independent
Z.
P a r t i c u l a r l y unpleasant i s the requirement of independence of these constants of h . As i t was already mentioned i n the previous h section, t h e values ch and C2 a r e defined with h fixed. Moreover, 1 with h -+ 0 t h e s e constants, i n many c a s e s , tend t o infinity a s follows:
wheke
m > 0. If this fact i s taken into account,
then the
approxi-
mate solution convergence to the exact one will b e evaluated in the following wa.y :
If k>m and t P< hm , then t h e r e is convergence. Naturally, the convergence t h e o r e m can also be formulated f o r the c a s e s when C depend both on h and on 7 .
1
and C
2
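Convergence following from approximation and stability can be watched on a one-dimensional model problem. Everything specific in the sketch below is an assumption made for illustration: the problem $\partial\varphi/\partial t - \partial^2\varphi/\partial x^2 = 0$ on $(0, 1)$ with zero boundary values and $g = \sin\pi x$ (exact solution $e^{-\pi^2 t}\sin\pi x$), the implicit scheme of type (2.13), the choice $\tau = h^2$, and the elimination (Thomas) algorithm for the tridiagonal system.

```python
import math

def implicit_step(u, sigma):
    """Solve (E + tau * A) v = u for A = minus the second difference.

    u holds the interior values, boundary values are zero, sigma = tau / h^2.
    The tridiagonal system is solved by forward elimination and back
    substitution.
    """
    m = len(u)
    b = 1.0 + 2.0 * sigma
    c = [0.0] * m
    d = [0.0] * m
    c[0] = -sigma / b
    d[0] = u[0] / b
    for i in range(1, m):
        denom = b + sigma * c[i - 1]
        c[i] = -sigma / denom
        d[i] = (u[i] + sigma * d[i - 1]) / denom
    v = [0.0] * m
    v[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        v[i] = d[i] - c[i] * v[i + 1]
    return v

def max_error(n, t_end=0.125):
    """Maximum error at t_end for mesh size h = 1/n and time step tau = h^2."""
    h = 1.0 / n
    tau = h * h
    steps = round(t_end / tau)
    u = [math.sin(math.pi * k * h) for k in range(1, n)]
    for _ in range(steps):
        u = implicit_step(u, tau / h ** 2)
    decay = math.exp(-math.pi ** 2 * t_end)
    return max(abs(u[k - 1] - decay * math.sin(math.pi * k * h))
               for k in range(1, n))

if __name__ == "__main__":
    for n in (8, 16, 32):
        print(n, max_error(n))
```

The scheme is absolutely stable and approximates the problem with the order $\tau + h^2$, so with $\tau = h^2$ the error should fall by a factor close to 4 each time $h$ is halved.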
Let us turn to the case of convergence in stationary problems of mathematical physics. Let the problem be

$$A\varphi = f \quad \text{in } D, \qquad a\varphi = g \quad \text{for } \partial D. \qquad (4.16)$$

Let problem (4.16) be approximated by the following difference scheme:

$$A^h\varphi^h = f^h \quad \text{in } D_h, \qquad a^h\varphi^h = g^h \quad \text{for } \partial D_h. \qquad (4.17)$$

Suppose there is an approximation of the order $h^k$. Besides, there is an a priori estimate of the solution of problem (4.17),

$$\|\varphi^h\| \le C_1\|f^h\| + C_2\|g^h\|,$$

where $C_1$ and $C_2$ are constants independent of $h$. Then, similar to the above, we get convergence of $\varphi^h$ to $(\varphi)_h$ with the order $h^k$.

Thus, in the study of difference stationary problems of mathematical physics the role of the stability condition is performed by the correctness condition close to it, using a priori estimates. Here lies a profound inner relationship of difference equations for stationary and evolutionary problems. After establishing approximation and a priori estimates (of stability in the case of evolutionary problems), both of the problems become principally equivalent in their formulations and are investigated by the same methods.
Chapter 2

METHODS OF SOLUTION OF NONSTATIONARY PROBLEMS

In this chapter we shall deal with methods of solution of nonstationary problems in mathematical physics. We shall concentrate on the solution of complex problems and their reduction to simple ones using the method of finite differences. Our main object will be an evolution problem in mathematical physics

$$\frac{\partial\varphi}{\partial t} + A\varphi = f, \qquad \varphi = g \quad \text{with } t = 0,$$

where $A \ge 0$, and the solution of the problem $\varphi$ and the functions $f$ and $g$ possess the necessary smoothness in the domain of definition of the solution $D \times T$. It will be assumed that at the boundary $\partial D$ the solution of the problem satisfies some boundary conditions.

2.1. Approximation-Stability Relation
The problems of approximating differential equations by finite-difference ones and of the stability of difference equations are closely related. Indeed, let us assume that it is required to find an approximate solution to a problem in mathematical physics given the input data of the problem. The approach to approximation and stability formulated in the previous sections appears in most instances too general to judge the individual properties of the algorithm being developed. One of the reasons is the generality of the assumptions made when one investigates the properties of the solution of an approximate problem. Thus, when one judges approximation one usually uses estimates which are valid for a whole class of problems but not for an individual problem, a theoretical estimate of the operator's norm being given from the worst function of the class, i.e. the function which results in a maximum error. In practical calculations, however, we have to do with specific functions defined by the input data of the problem. Therefore it is expected that investigation of an approximate solution of the problem studied may allow us to construct effective algorithms of approximate calculation in a different aspect.

This thesis is easily illustrated if we consider an evolution problem

$$\frac{\partial\varphi}{\partial t} + A\varphi = 0, \qquad \varphi = g \quad \text{with } t = 0, \qquad (1.1)$$

where $A = A^* > 0$, the operator $A$ does not depend on $t$, and $\varphi$ and $g$ belong to the Hilbert space $L_2$. Let us approximate (1.1) by

$$\frac{\varphi^{j+1} - \varphi^j}{\tau} + A\varphi^j = 0, \qquad \varphi^0 = g. \qquad (1.2)$$

Naturally, a necessary condition for approximation to (1.1) by the difference problem (1.2) is, in a sense, smallness of the expression $\tau A$. This follows immediately from Taylor's expansion in a series of the solution of the initial problem (1.1) and from substitution of the series into (1.2). Here, surely, it is natural, when one uses this method, to make an assumption that the solution is sufficiently smooth. Thus, let

$$\varphi(t_{j+1}) = \varphi(t_j) + \tau\frac{\partial\varphi}{\partial t}(t_j) + \frac{\tau^2}{2}\frac{\partial^2\varphi}{\partial t^2}(t_j) + \ldots \qquad (1.3)$$
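Substituting (1.3) into (1.2) we get

$$\frac{\partial\varphi}{\partial t} + A\varphi + \frac{\tau}{2}\frac{\partial^2\varphi}{\partial t^2} + \ldots = 0 \quad \text{at } t = t_j. \qquad (1.4)$$

Using (1.1), the last equality is transformed and written as

$$\frac{\partial\varphi}{\partial t} + A\varphi + \frac{\tau}{2}A^2\varphi + \ldots = 0 \quad \text{at } t = t_j. \qquad (1.5)$$

From here it follows that the approximation condition may be chosen so as to emphasize smallness of the remainder due to the mutual compensation of the terms in the left-hand side of (1.4) and, considering that for symmetric operators $(A^2\varphi, \varphi) = (A\varphi, A\varphi)$, we have, with $\varphi = \varphi(t_j)$,

$$\frac{\tau}{2}(A\varphi, A\varphi) \ll (A\varphi, \varphi). \qquad (1.6)$$

Naturally, the condition (1.6) will be only necessary. From here follows the necessity of the following criterion of approximation to the problem in question:

$$\tau \ll \frac{2(A\varphi, \varphi)}{(A\varphi, A\varphi)}. \qquad (1.7)$$

On the other hand, the condition of numerical stability also yields a restriction upon $\tau$, which will be obtained in a constructive form. To this end let us consider the difference equation (1.2) and solve it with respect to $\varphi^{j+1}$. We get

$$\varphi^{j+1} = (E - \tau A)\varphi^j.$$

Make up the functional

$$J^j = (\varphi^j, \varphi^j).$$

Then we have

$$J^{j+1} = q^j J^j,$$

where

$$q^j = 1 - 2\tau\frac{(A\varphi^j, \varphi^j)}{(\varphi^j, \varphi^j)} + \tau^2\frac{(A\varphi^j, A\varphi^j)}{(\varphi^j, \varphi^j)}.$$

Obviously, the following will be the stability condition:

$$q^j \le 1.$$

From here immediately follows the condition for $\tau$:

$$\tau \le \frac{2(A\varphi^j, \varphi^j)}{(A\varphi^j, A\varphi^j)}. \qquad (1.12)$$

If we assume that the approximate solution $\varphi^j$ and the exact one $\varphi(t_j)$ differ but little and join (1.7) and (1.12), we arrive at the following algorithm of choice of $\tau_j$ in the step $j$. It is required that the following conditions be fulfilled simultaneously:

$$\tau_j \ll \frac{2(A\varphi^j, \varphi^j)}{(A\varphi^j, A\varphi^j)}, \qquad \tau_j \le \frac{2(A\varphi^j, \varphi^j)}{(A\varphi^j, A\varphi^j)}. \qquad (1.13)$$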
F r o m h e r e i t follows that the algorithm's stability imposes a l e s s r e striction upon tion
1 . than approximation. Hence the approximation condi-
J
G . I. Marchuk
automatically ensures stability of the algorithm a s well.
We would
like to draw attention to t h i s fact. When one considers implicit schemes of the type
which are unconditionally stable for $A \ge 0$, the choice of the parameter $\tau$ should be made from the approximation condition. In this case we have the necessary approximation condition of the form
or, according to the definition introduced above,
Here, as in the previous case, the exact solution $\varphi(t_j)$ can be replaced by the approximate one $\varphi^j$. Then we are led to a constructive criterion for the choice of the steps.
Here, again, we get an estimate of the step $\tau_j$ from the approximation condition only.
From the above analysis it follows that the choice of the parameter $\tau$ for schemes of first-order approximation is dictated by the accuracy of approximation. This is a decisive point when schemes of higher-order accuracy are used in numerical calculations. For such schemes, as is easily found, the limiting condition for the step $\tau$ will be a condition of numerical stability.
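The a posteriori choice of the step described above can be illustrated by a minimal sketch. It assumes a symmetric positive definite matrix `A` standing in for the operator, and it takes the step as a fixed fraction `safety` of the bound $2(A\varphi^j,\varphi^j)/(A\varphi^j,A\varphi^j)$; the function names and the value of `safety` are illustrative assumptions, not part of the original text.

```python
import numpy as np

def explicit_step_bound(A, phi):
    """Return 2 (A phi, phi) / (A phi, A phi), the step bound of the explicit scheme."""
    A_phi = A @ phi
    return 2.0 * (A_phi @ phi) / (A_phi @ A_phi)

def adaptive_explicit_scheme(A, g, t_end, safety=0.1):
    """Integrate d(phi)/dt + A phi = 0 by the explicit scheme with a posteriori steps.

    'safety' << 1 is a hypothetical tuning parameter that plays the role of '<<'
    in the approximation criterion.
    """
    phi, t = np.array(g, dtype=float), 0.0
    norms = [np.linalg.norm(phi)]
    while t < t_end:
        # step chosen from the current approximate solution, not from a priori bounds
        tau = min(safety * explicit_step_bound(A, phi), t_end - t)
        phi = phi - tau * (A @ phi)
        t += tau
        norms.append(np.linalg.norm(phi))
    return phi, np.array(norms)

# Test operator (an assumption for the demo): 1D second-difference matrix, A = A^T > 0
n = 20
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
g = np.random.default_rng(0).standard_normal(n)
phi, norms = adaptive_explicit_scheme(A, g, t_end=1.0)
```

Because every step satisfies the approximation criterion, it also satisfies the weaker stability bound, so the recorded norms never grow.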
Apparently, in the future numerical schemes will be constructed with automatic adjustment to optimal conditions. This means that the choice of $\tau_j$ will be made not from a priori consideration of errors on classes of solutions but from a posteriori estimates of the type (1.13) and (1.16).
2.2 Difference schemes of second-order accuracy with time-dependent operators
Schemes of second-order approximation with respect to $\tau$ deserve special mention in numerical methods. The most common at present is the Crank-Nicolson difference scheme. Let us consider the evolution equation
\[ \frac{\partial\varphi}{\partial t} + A\varphi = 0, \qquad \varphi = g \ \text{ at } t = 0. \tag{2.1} \]
The difference equation corresponding to (2.1) is written in the form
\[ \frac{\varphi^{j+1} - \varphi^j}{\tau} + A\,\frac{\varphi^{j+1} + \varphi^j}{2} = 0, \qquad \varphi^0 = g. \tag{2.2} \]
It is easy to find that if the solution is sufficiently smooth, problem (2.1) is approximated by the difference problem (2.2) with second-order accuracy with respect to $\tau$. Scheme (2.2) is generally called a scheme with central differences in time, or the Crank-Nicolson difference scheme. It is curious to note that scheme (2.2) is the result of an alternate application of explicit and implicit schemes of first-order accuracy written for the intervals $t_j \le t \le t_{j+1/2}$ and $t_{j+1/2} \le t \le t_{j+1}$, respectively (if $A$ is a linear operator):
\[ \frac{\varphi^{j+1/2} - \varphi^j}{\tau/2} + A\varphi^j = 0, \qquad \frac{\varphi^{j+1} - \varphi^{j+1/2}}{\tau/2} + A\varphi^{j+1} = 0. \tag{2.3} \]
Eliminating the unknown $\varphi^{j+1/2}$ from this system of difference equations, we get the Crank-Nicolson scheme. Let us assume now that the operator $A$ depends on time and that in problem (2.1) it is approximated by a difference operator. In this case by the solution of the problem we shall understand the vector function whose components are approximate solutions at the mesh points of space, and $\Lambda^j$ is the matrix approximating the operator $A$. Thus we have a problem in linear algebra
\[ \frac{\varphi^{j+1} - \varphi^j}{\tau} + \Lambda^j\,\frac{\varphi^{j+1} + \varphi^j}{2} = 0, \qquad \varphi^0 = g, \tag{2.4} \]
where
\[ (\Lambda^j\varphi, \varphi) \ge 0, \quad \text{or} \quad \Lambda^j \ge 0, \tag{2.5} \]
for any functions of a Hilbert space. Eq. (2.4) is formally solved with respect to $\varphi^{j+1}$. Then we have
\[ \varphi^{j+1} = T^j\varphi^j, \]
where
\[ T^j = \left(E + \frac{\tau}{2}\Lambda^j\right)^{-1}\left(E - \frac{\tau}{2}\Lambda^j\right) \tag{2.8} \]
is the step operator.
In order to prove numerical stability it is not necessary to evaluate the norm of the step operator $T^j$. We shall proceed differently. Let us form the inner product of Eq. (2.4) and $\frac{1}{2}(\varphi^{j+1} + \varphi^j)$.
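We get
\[ \frac{(\varphi^{j+1}, \varphi^{j+1}) - (\varphi^j, \varphi^j)}{2\tau} + \left(\Lambda^j\,\frac{\varphi^{j+1} + \varphi^j}{2},\ \frac{\varphi^{j+1} + \varphi^j}{2}\right) = 0. \tag{2.9} \]
Since, by assumption, the operator $\Lambda^j$ is positive semi-definite (see (2.5)), we have
\[ \|\varphi^{j+1}\| \le \|\varphi^j\|, \tag{2.10} \]
i.e. stability is ensured. An estimate of the norm of the step operator is of importance in the analysis of difference schemes. For this purpose we consider the equation $\varphi^{j+1} = T^j\varphi^j$. Let us evaluate this equation by the norm using the Buniakowski-Schwarz inequality. As a result we get
\[ \|\varphi^{j+1}\| \le \|T^j\|\,\|\varphi^j\|. \tag{2.11} \]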
Comparing (2.10) to (2.11) we arrive at the important conclusion that
\[ \|T^j\| \le 1. \]
When the operator $\Lambda^j$ is skew-symmetric, i.e. the equality $(\Lambda^j\varphi, \varphi) = 0$ is valid, then, in place of (2.10), we have the strict equality
\[ \|\varphi^{j+1}\| = \|\varphi^j\|. \tag{2.12} \]
We shall make use of this fact later when we consider different applications. Analogously to the foregoing it can be shown that in this case
\[ \|T^j\| = 1. \]
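The two statements about the step operator can be checked numerically. The following is a small sketch with randomly generated test matrices (an assumption made only for illustration): a symmetric positive semi-definite matrix, a skew-symmetric matrix, and their sum, which is non-symmetric but still satisfies $(\Lambda\varphi,\varphi) \ge 0$.

```python
import numpy as np

def crank_nicolson_step_operator(L, tau):
    """Step operator T = (E + tau/2 L)^(-1) (E - tau/2 L) of the Crank-Nicolson scheme."""
    E = np.eye(L.shape[0])
    return np.linalg.solve(E + 0.5 * tau * L, E - 0.5 * tau * L)

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
L_psd = M @ M.T           # symmetric, (L phi, phi) >= 0
L_skew = M - M.T          # skew-symmetric, (L phi, phi) = 0
L_mixed = L_psd + L_skew  # non-symmetric, still (L phi, phi) >= 0

tau = 0.7
norm_psd = np.linalg.norm(crank_nicolson_step_operator(L_psd, tau), 2)
norm_skew = np.linalg.norm(crank_nicolson_step_operator(L_skew, tau), 2)
norm_mixed = np.linalg.norm(crank_nicolson_step_operator(L_mixed, tau), 2)
```

For the skew-symmetric matrix the step operator is orthogonal, so its spectral norm equals one up to rounding; for the other two it does not exceed one, whatever the value of `tau`.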
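Let us now discuss approximation by the Crank-Nicolson difference scheme when the operator $A$ is time-dependent. To this end we expand the operator $T^j$ in powers of the parameter $\tau$. Then we have
\[ T^j = E - \tau\Lambda^j + \frac{\tau^2}{2}(\Lambda^j)^2 - \dots \tag{2.13} \]
Let us introduce the operator $H$ according to the equality
\[ H\varphi = \frac{\partial\varphi}{\partial t} + A\varphi \tag{2.14} \]
and the approximating operator
\[ H^\tau\varphi = \frac{\varphi^{j+1} - \varphi^j}{\tau} + \Lambda^j\,\frac{\varphi^{j+1} + \varphi^j}{2} = \frac{1}{\tau}\left(E + \frac{\tau}{2}\Lambda^j\right)\left\{\varphi^{j+1} - \left[E - \tau\Lambda^j + \frac{\tau^2}{2}(\Lambda^j)^2 - \dots\right]\varphi^j\right\}. \tag{2.15} \]
Then we introduce a norm (2.16) convenient for estimating the approximation to the operator $H$.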
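In order to find the norm (2.16) we expand the solution of the initial equation (2.1) in a Taylor series. We have
\[ \varphi^{j+1} = \varphi^j + \tau\varphi_t^j + \frac{\tau^2}{2}\varphi_{tt}^j + \dots \tag{2.17} \]
Taking into consideration
\[ \varphi_t = -A\varphi, \qquad \varphi_{tt} = A^2\varphi - A_t\varphi, \tag{2.18} \]
where $A_t = \partial A/\partial t$, the Taylor series (2.17) will be reduced to
\[ \varphi^{j+1} = \varphi^j - \tau A^j\varphi^j + \frac{\tau^2}{2}\left[(A^j)^2\varphi^j - A_t^j\varphi^j\right] + \dots \tag{2.19} \]
Substitute (2.19) into (2.16). Then, considering (2.15), we get the estimate (2.20).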
If we choose
\[ \Lambda^j = A^j \equiv A(t_j) \tag{2.21} \]
as the approximating operator $\Lambda^j$, then from (2.20) it follows that we have first-order approximation. It will be noted that in the special case when $A$ is independent of $t$, approximation of the form (2.21) ensures second-order approximation in $\tau$. Let the approximating operator $\Lambda^j$ be chosen in the form (2.22).
In this case we have second-order approximation.
In particular, approximation by the Crank-Nicolson scheme will be of the second order in $\tau$ if, in addition to (2.22), we assume one of the following forms:
\[ \Lambda^j = A^{j+1/2} \tag{2.23} \]
or the form (2.24).
In different applications, especially in the numerical solution of quasilinear equations, one can use any of the three above-mentioned forms of approximation of the operator $A$, namely (2.22), (2.23) or (2.24), each ensuring the second order of accuracy. Finally, it should be noted that when one chooses the approximating operator $\Lambda^j$ in the form (2.22), (2.23) or (2.24), the step operator $T^j$ of the form (2.8) ensures second-order approximation. Later we shall make use of this fact.
2.3 Inhomogeneous evolution equations
In the previous section we dealt with homogeneous equations. Let us consider now inhomogeneous equations
\[ \frac{\partial\varphi}{\partial t} + A\varphi = f, \qquad \varphi = g \ \text{ at } t = 0. \tag{3.1} \]
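The difference approximation to Eq. (3.1) on the basis of the Crank-Nicolson difference scheme, under the assumptions made in § 2.2, has the form
\[ \frac{\varphi^{j+1} - \varphi^j}{\tau} + \Lambda^j\,\frac{\varphi^{j+1} + \varphi^j}{2} = f^j, \qquad \varphi^0 = g, \tag{3.2} \]
where $f^j = f(t_{j+1/2})$.
It is easy to find that the difference problem (3.2) approximates (3.1) to within $\tau^2$. Let us write the formal solution of (3.2) at each interval:
\[ \varphi^{j+1} = T^j\varphi^j + \tau\left(E + \frac{\tau}{2}\Lambda^j\right)^{-1} f^j. \tag{3.3} \]
In the previous section, in the case of a homogeneous equation, it was shown that, if $\Lambda^j \ge 0$, the following estimate holds:
\[ \|T^j\| \le 1. \tag{3.4} \]
Naturally, this estimate of the operator's norm does not depend on the right-hand side $f$. Hence it holds in this case as well. From Eq. (3.3) it follows that
\[ \|\varphi^{j+1}\| \le \|T^j\|\,\|\varphi^j\| + \tau\left\|\left(E + \frac{\tau}{2}\Lambda^j\right)^{-1}\right\|\,\|f^j\|. \tag{3.5} \]
In order to establish stability we make use of the estimate (2.5) from Chapter I. As $\tau > 0$ and $\Lambda^j \ge 0$, we have
\[ \left\|\left(E + \frac{\tau}{2}\Lambda^j\right)^{-1}\right\| \le 1. \tag{3.7} \]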
Thus, considering (3.4) and (3.7), we transform the inequality (3.5) as follows:
\[ \|\varphi^{j+1}\| \le \|\varphi^j\| + \tau\|f^j\|. \tag{3.8} \]
Setting $\|\varphi^0\| = \|g\|$ and $\|f\| = \max_j \|f^j\|$ and using the recurrence relation (3.8), we get
\[ \|\varphi^j\| \le \|g\| + j\tau\|f\| \le \|g\| + C\|f\|, \tag{3.9} \]
where
\[ C = J\tau = T < \infty \tag{3.10} \]
is the time interval on which the solution is defined. In this way, the relation (3.9) establishes stability of the difference scheme. In addition, this relation is an a priori estimate of the norm of a solution if we take (3.10) into account.
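The a priori estimate $\|\varphi^j\| \le \|g\| + j\tau\|f\|$ can be observed on a small example. The sketch below assumes, for brevity, a time-independent positive semi-definite matrix `L` and a hypothetical right-hand side `f(t)`; neither the matrix nor the right-hand side comes from the text.

```python
import numpy as np

def crank_nicolson_inhomogeneous(L, g, f, tau, steps):
    """Crank-Nicolson scheme for d(phi)/dt + L phi = f with f^j = f(t_{j+1/2})."""
    E = np.eye(L.shape[0])
    lhs = E + 0.5 * tau * L
    rhs = E - 0.5 * tau * L
    phi = np.array(g, dtype=float)
    norms = [np.linalg.norm(phi)]
    for j in range(steps):
        f_j = f((j + 0.5) * tau)
        phi = np.linalg.solve(lhs, rhs @ phi + tau * f_j)
        norms.append(np.linalg.norm(phi))
    return phi, np.array(norms)

rng = np.random.default_rng(2)
M = rng.standard_normal((5, 5))
L = M @ M.T                      # positive semi-definite test matrix (assumption)
g = rng.standard_normal(5)
v = rng.standard_normal(5)
f = lambda t: np.cos(3.0 * t) * v  # hypothetical right-hand side, ||f(t)|| <= ||v||

tau, steps = 0.05, 40
phi, norms = crank_nicolson_inhomogeneous(L, g, f, tau, steps)
# right-hand side of the a priori estimate for every j = 0, ..., steps
bound = np.linalg.norm(g) + tau * np.arange(steps + 1) * np.linalg.norm(v)
```

Each entry of `norms` stays below the corresponding entry of `bound`, independently of the size of `tau`.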
2.4 Splitting-up methods for nonstationary problems
Very often, when one has to solve a complicated problem of mathematical physics, it is possible to reduce it to a sequence of simpler problems which can be solved effectively on a computer. Such a reduction is possible when the initial positive semi-definite operator of the problem can be presented as a sum of simpler positive semi-definite operators. This procedure will be called a splitting-up method. The theory of splitting-up methods has been developed in detail for the case when the initial operator is representable as a sum of two simpler operators. Therefore we begin the presentation of splitting-up methods with this case. Let us consider the evolution equation
\[ \frac{\partial\varphi}{\partial t} + A\varphi = 0, \tag{4.1} \]
\[ \varphi = g \ \text{ at } t = 0, \tag{4.2} \]
where the operator $A \ge 0$ is independent of time and can be presented as
\[ A = A_1 + A_2, \qquad A_1 \ge 0, \quad A_2 \ge 0. \tag{4.3} \]
Let us assume that the solution to (4.1) possesses the necessary smoothness. Now we consider the three most efficient splitting-up methods.
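2.4.1 The method of universal algorithm
Let us write the approximate formulation of the problem (4.1)-(4.3) as follows:
\[ \left(E + \frac{\tau}{2}A_1\right)\left(E + \frac{\tau}{2}A_2\right)\frac{\varphi^{j+1} - \varphi^j}{\tau} + A\varphi^j = 0, \qquad \varphi^0 = g. \tag{4.4} \]
If the solution is sufficiently smooth, it is not difficult to show that (4.4) approximates the initial-value differential problem (4.1)-(4.3) to second order in $\tau$. Indeed, through algebraic transformations, equation (4.4) is reduced to
\[ \frac{\varphi^{j+1} - \varphi^j}{\tau} + A\,\frac{\varphi^{j+1} + \varphi^j}{2} + \frac{\tau^2}{4}A_1A_2\,\frac{\varphi^{j+1} - \varphi^j}{\tau} = 0. \tag{4.5} \]
It is seen that the difference equation (4.5) is equivalent in accuracy to the Crank-Nicolson difference scheme
\[ \frac{\varphi^{j+1} - \varphi^j}{\tau} + A\,\frac{\varphi^{j+1} + \varphi^j}{2} = 0 \tag{4.6} \]
if the solution is sufficiently smooth.
This follows from the fact that scheme (4.6) itself is of second-order approximation in $\tau$, and therefore in this sense (4.5) and (4.6) are equivalent. Let us now analyse stability of the difference equation (4.4). For this purpose (4.4) will be written as
\[ \left(E + \frac{\tau}{2}A_1\right)\left(E + \frac{\tau}{2}A_2\right)\frac{\varphi^{j+1} - \varphi^j}{\tau} + A\varphi^j = 0, \qquad \varphi^0 = g, \tag{4.7} \]
and the difference equation from (4.7) will be solved with respect to $\varphi^{j+1}$. Then we have
\[ \varphi^{j+1} = \left(E + \frac{\tau}{2}A_2\right)^{-1}\left(E + \frac{\tau}{2}A_1\right)^{-1}\left(E - \frac{\tau}{2}A_1\right)\left(E - \frac{\tau}{2}A_2\right)\varphi^j. \tag{4.8} \]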
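Then from the unknown $\varphi^j$ we pass to $\psi^j$ according to the formula
\[ \varphi^j = \left(E + \frac{\tau}{2}A_2\right)^{-1}\psi^j. \tag{4.9} \]
In this case for the new unknown $\psi^j$ we come to the relation
\[ \psi^{j+1} = T\psi^j, \tag{4.10} \]
where
\[ T = \left(E + \frac{\tau}{2}A_1\right)^{-1}\left(E - \frac{\tau}{2}A_1\right)\left(E - \frac{\tau}{2}A_2\right)\left(E + \frac{\tau}{2}A_2\right)^{-1} \tag{4.11} \]
is a step operator.
Using (4.10) we get the estimate in the energy norm
\[ \|\psi^{j+1}\| \le \|T\|\,\|\psi^j\|. \tag{4.12} \]
Let us find the norm of the operator $T$:
\[ \|T\| \le \|T_1\|\,\|T_2\|, \tag{4.13} \]
where
\[ T_\alpha = \left(E + \frac{\tau}{2}A_\alpha\right)^{-1}\left(E - \frac{\tau}{2}A_\alpha\right). \]
Here we use the commutativity property
\[ \left(E + \frac{\tau}{2}A_\alpha\right)^{-1}\left(E - \frac{\tau}{2}A_\alpha\right) = \left(E - \frac{\tau}{2}A_\alpha\right)\left(E + \frac{\tau}{2}A_\alpha\right)^{-1}, \tag{4.14} \]
which follows from an obvious identity.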
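Indeed, by multiplying the left-hand side and the right-hand side of (4.14) by $\left(E + \frac{\tau}{2}A_\alpha\right)$ and using commutativity of the operators $\left(E - \frac{\tau}{2}A_\alpha\right)$ and $\left(E + \frac{\tau}{2}A_\alpha\right)$ (which can be verified by direct multiplication), we arrive at the property being proved.
Thus, the problem of determining stability is reduced to that of finding the norms of the operators $T_\alpha$.
Applying Kellogg's lemma to evaluate the norms of the operators $T_1$ and $T_2$ in the relation (4.13), we come to the conclusion
\[ \|T_\alpha\| \le 1, \tag{4.15} \]
hence,
\[ \|\psi^{j+1}\| \le \|\psi^j\|. \tag{4.16} \]
Thus, stability of (4.10) is proved. However, our ultimate aim is to determine stability of the initial difference problem (4.4). For this purpose we use relation (4.9) and rewrite (4.16):
\[ \left\|\left(E + \frac{\tau}{2}A_2\right)\varphi^{j+1}\right\| \le \left\|\left(E + \frac{\tau}{2}A_2\right)\varphi^j\right\|. \tag{4.17} \]
We introduce the notation
\[ \|\varphi\|_{C_2} = (C_2\varphi, \varphi)^{1/2}, \tag{4.18} \]
where
\[ C_\alpha = \left(E + \frac{\tau}{2}A_\alpha^*\right)\left(E + \frac{\tau}{2}A_\alpha\right). \]
It is not difficult to see that $C_\alpha > 0$; hence $\|\cdot\|_{C_2}$ is, in fact, a norm.
So, in this metric the condition of absolute stability holds:
\[ \|\varphi^{j+1}\|_{C_2} \le \|\varphi^j\|_{C_2}. \tag{4.19} \]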
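It will be noted that the difference scheme of the universal splitting-up algorithm allows a convenient implementation on a computer. Indeed, let us write the difference equation (4.4) as follows:
\[ \begin{aligned} \xi^{j} &= -A\varphi^j, \\ \left(E + \frac{\tau}{2}A_1\right)\xi^{j+1/2} &= \xi^{j}, \\ \left(E + \frac{\tau}{2}A_2\right)\xi^{j+1} &= \xi^{j+1/2}, \\ \varphi^{j+1} &= \varphi^j + \tau\xi^{j+1}. \end{aligned} \tag{4.20} \]
Here $\xi^{j+1/2}$ and $\xi^{j+1}$ are auxiliary values allowing reduction of the problem (4.4) to a sequence of simplest problems (4.20) which are solved successively. It should be noted that the first and the last equations of (4.20) are explicit relations. It means that we must invert operators only when we solve the second and third equations of (4.20), in which the simplest operators $A_1$ and $A_2$ are present.
Let us consider the inhomogeneous problem
\[ \frac{\partial\varphi}{\partial t} + A\varphi = f, \qquad \varphi = g \ \text{ at } t = 0, \tag{4.21} \]
where $A = A_1 + A_2$, $A_1 \ge 0$, $A_2 \ge 0$.
Then the scheme of the universal algorithm is written as
\[ \left(E + \frac{\tau}{2}A_1\right)\left(E + \frac{\tau}{2}A_2\right)\frac{\varphi^{j+1} - \varphi^j}{\tau} + A\varphi^j = f^j, \qquad \varphi^0 = g, \tag{4.22} \]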
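where
\[ f^j = f(t_{j+1/2}). \tag{4.23} \]
It can be shown that under condition (4.23) the difference problem (4.22) approximates the initial-value problem (4.21) with second-order accuracy in $\tau$. Let us now discuss stability of the difference scheme. For this purpose we transform equation (4.22) as follows:
\[ \psi^{j+1} = T\psi^j + \tau\left(E + \frac{\tau}{2}A_1\right)^{-1} f^j, \tag{4.24} \]
where
\[ \psi^j = \left(E + \frac{\tau}{2}A_2\right)\varphi^j. \tag{4.25} \]
Equation (4.24) will be evaluated by the energy norm:
\[ \|\psi^{j+1}\| \le \|T\|\,\|\psi^j\| + \tau\left\|\left(E + \frac{\tau}{2}A_1\right)^{-1}\right\|\,\|f^j\|. \tag{4.26} \]
Since for the homogeneous equation it was found that $\|T\| \le 1$, we have
\[ \|\psi^{j+1}\| \le \|\psi^j\| + \tau\left\|\left(E + \frac{\tau}{2}A_1\right)^{-1}\right\|\,\|f^j\|. \tag{4.27} \]
Let us make the following obvious transformation (4.28).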
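Taking (4.18), (4.25) and (4.28) into account we get the desired condition.
Let us then use the estimate of the norm
\[ \left\|\left(E + \frac{\tau}{2}A_\alpha\right)^{-1}\right\| \le 1 \quad \text{at } A_\alpha \ge 0. \]
As a result we have a recurrence inequality for the norm of the solution.
From here, using recurrence relations, we obtain an a priori estimate of the solution in terms of $g$ and $f$.
It means that (4.22) is a stable scheme on the interval $0 \le t_j \le T = J\tau$.
2.4.2 The predictor-corrector method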
We shall now formulate another splitting-up technique, the so-called predictor-corrector method. The point of this approximate method is that the whole interval $0 \le t \le T$ is split into a number of subintervals, and within each elementary interval $t_j \le t \le t_{j+1}$ the problem (4.1) is solved in two steps. First one finds an approximate solution to the problem at the moment $t_{j+1/2} = t_j + \tau/2$ using a scheme of first-order accuracy which has a sufficient "reserve" of stability. Then on the whole interval $(t_j, t_{j+1})$ one writes the basic equation of second-order approximation, which is a corrector. It is essential that in constructing the corrector one uses the "rough" solution at $t_{j+1/2}$ found with the aid of the predictor.
Thus, the predictor-corrector scheme can be written as follows:
\[ \begin{aligned} \frac{\varphi^{j+1/4} - \varphi^j}{\tau/2} + A_1\varphi^{j+1/4} &= 0, \\ \frac{\varphi^{j+1/2} - \varphi^{j+1/4}}{\tau/2} + A_2\varphi^{j+1/2} &= 0, \\ \frac{\varphi^{j+1} - \varphi^j}{\tau} + A\varphi^{j+1/2} &= 0, \end{aligned} \tag{4.34} \]
provided that $\varphi^0 = g$.
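Let us study the predictor-corrector scheme more carefully. First of all we eliminate the auxiliary function $\varphi^{j+1/4}$ from the first two equations of (4.34). Then (4.34) is reduced to the two equations
\[ \left(E + \frac{\tau}{2}A_1\right)\left(E + \frac{\tau}{2}A_2\right)\varphi^{j+1/2} = \varphi^j, \qquad \frac{\varphi^{j+1} - \varphi^j}{\tau} + A\varphi^{j+1/2} = 0. \tag{4.35} \]
Eliminating $\varphi^{j+1/2}$, we get
\[ \frac{\varphi^{j+1} - \varphi^j}{\tau} + A\left(E + \frac{\tau}{2}A_2\right)^{-1}\left(E + \frac{\tau}{2}A_1\right)^{-1}\varphi^j = 0, \qquad \varphi^0 = g. \tag{4.36} \]
To investigate the question of approximation we rewrite equation (4.36) as follows: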
\[ \left(E + \frac{\tau}{2}A_1\right)\left(E + \frac{\tau}{2}A_2\right)\frac{\varphi^{j+1} - \varphi^j}{\tau} + \tilde{A}\varphi^j = 0, \]
where
\[ \tilde{A} = \left(E + \frac{\tau}{2}A_1\right)\left(E + \frac{\tau}{2}A_2\right) A \left(E + \frac{\tau}{2}A_2\right)^{-1}\left(E + \frac{\tau}{2}A_1\right)^{-1}. \]
By means of a series expansion in powers of $\tau$ and by the method of estimates used in the case of the universal algorithm, we conclude that the predictor-corrector method has second-order approximation in $\tau$. Let us study stability of this method. For this purpose (4.36) will be written in terms of a new unknown $\psi^j$, where
\[ \psi^j = \left(E + \frac{\tau}{2}A_2\right)^{-1}\left(E + \frac{\tau}{2}A_1\right)^{-1}\varphi^j. \tag{4.38} \]
As for the difference equation (4.27), above it has been shown to be stable in the metric (4.39).
Let us substitute relation (4.38) into (4.39). Then, taking account of (4.21), we get
the corresponding estimate for $\varphi^j$.
Thus we have proved stability of the system (4.41). Let us consider now an inhomogeneous problem. In this case we formulate the predictor-corrector method (4.42) with the corrector
\[ \frac{\varphi^{j+1} - \varphi^j}{\tau} + A\varphi^{j+1/2} = f^j, \]
where $f^j$ is given by (4.43).
If $f^j$ is chosen in the form (4.43), one can show that (4.42) approximates the initial problem with second-order accuracy in $\tau$.
Stability of (4.42) is established in the same way. As a result we get the estimate (4.44), in which
\[ \|f\|_{C_1^{-1}} = \max_j \|f^j\|_{C_1^{-1}}. \]
Since the operator norms entering (4.44) are of the form $1 + O(\tau)$, the asymptotic estimate of (4.44) for $j \gg 1$ is bounded; hence, if $0 \le t_j \le T$, we again get stability of the difference scheme.
2.4.3 Component-wise splitting-up method
Let us consider now a method of complete component-wise splitting. Let us find an approximate solution to the problem (4.1)-(4.3).
A scheme consisting of the successive solution of the simplest Crank-Nicolson difference schemes will be regarded as the algorithm. Schemes of this kind were suggested by N. N. Yanenko and studied at length by the author in his paper at SYNSPADE-1970:
\[ \frac{\varphi^{j+1/2} - \varphi^j}{\tau} + A_1\,\frac{\varphi^{j+1/2} + \varphi^j}{2} = 0, \qquad \frac{\varphi^{j+1} - \varphi^{j+1/2}}{\tau} + A_2\,\frac{\varphi^{j+1} + \varphi^{j+1/2}}{2} = 0. \tag{4.46} \]
The system of difference equations (4.46), after eliminating the auxiliary function $\varphi^{j+1/2}$, reduces to one equation
\[ \varphi^{j+1} = T\varphi^j, \tag{4.47} \]
where
\[ T = \left(E + \frac{\tau}{2}A_2\right)^{-1}\left(E - \frac{\tau}{2}A_2\right)\left(E + \frac{\tau}{2}A_1\right)^{-1}\left(E - \frac{\tau}{2}A_1\right). \tag{4.48} \]
Let us study first the problem of approximation. For this purpose we expand the operator $T$ in powers of $\tau$. After obvious transformations we get
\[ T = E - \tau A + \frac{\tau^2}{2}\left(A_1^2 + 2A_2A_1 + A_2^2\right) - \dots \tag{4.49} \]
If the operators $A_1$ and $A_2$ are commutative, i.e. $A_1A_2 = A_2A_1$, the above formula (4.49) can be written as
\[ T = E - \tau A + \frac{\tau^2}{2}A^2 - \dots \tag{4.50} \]
If the operators $A_\alpha$ are noncommutative, one can achieve second-order approximation by a special organization of the computation. This is a two-cycle computation. Thus, for instance, one first solves the problem (4.51) and then the problem (4.52).
Within the cycle the difference schemes (4.51), (4.52) are used alternately. In a similar way (see above) it can be shown that for the whole computation, by use of (4.51) and (4.52), we have
\[ \varphi^{j+1} = T_c\varphi^{j-1}, \]
where
\[ T_c = E - 2\tau A + \frac{(2\tau)^2}{2}A^2 - \dots \]
If we compare the operator $T_c$ with the step operator of the total Crank-Nicolson difference scheme,
we find that the operator $T_c$ for the two-cycle splitting-up scheme coincides, to within $\tau^2$, with the step operator of the initial Crank-Nicolson difference scheme applied to a double time interval. This result is true for both commutative and noncommutative operators $A_\alpha$. Hence this method does not require commutativity of the operators. Let us now discuss numerical stability of the method. For this purpose we shall consider the relation (4.47) and evaluate it in the energy norm:
\[ \|\varphi^{j+1}\| \le \|T\|\,\|\varphi^j\|. \]
Since, as shown above, $\|T\| \le 1$ if $A_\alpha \ge 0$, we get
\[ \|\varphi^{j+1}\| \le \|\varphi^j\|. \]
From here it immediately follows that
\[ \|\varphi^j\| \le \|g\|. \tag{4.55} \]
If we use the two-cycle method we get estimates of the form (4.55) at each step of the cycle. It means that the two-cycle method is absolutely stable. Let us consider now an inhomogeneous problem and the approximation of its solution by a two-cycle complete splitting. To this end we write the system of difference equations of the form (4.51), (4.52) in a more convenient form (4.56),
where $f^j = f(t_j)$.
Solving these equations with respect to $\varphi^{j+1}$, we get the relation (4.57), where $T_c = T_1T_2T_2T_1$ and
\[ T_\alpha = \left(E + \frac{\tau}{2}A_\alpha\right)^{-1}\left(E - \frac{\tau}{2}A_\alpha\right). \]
By expanding in powers of the small parameter $\tau$ we reduce the expression (4.57) to a form which is then transformed into (4.61).
Let us eliminate $\varphi^{j-1}$ from the last relation. To this end we make use of the expansion of the solution in a Taylor series in the vicinity of the point $t_{j-1}$. To within $\tau^2$ we have
\[ \varphi^j = \varphi^{j-1} + \tau\frac{\partial\varphi}{\partial t} + O(\tau^2). \tag{4.62} \]
Let us eliminate the derivative $\partial\varphi/\partial t$ using the equation
\[ \frac{\partial\varphi}{\partial t} = -A\varphi + f. \tag{4.63} \]
Let us substitute (4.63) into (4.62). Then we get
\[ \varphi^j = \varphi^{j-1} - \tau A\varphi^{j-1} + \tau f^j + O(\tau^2). \]
Hence
\[ (E - \tau A)\varphi^{j-1} = \varphi^j - \tau f^j + O(\tau^2). \tag{4.64} \]
After substituting (4.64) into (4.61) we get the relation (4.65).
Evidently, Eq. (4.65) approximates the initial equation (4.1) on the interval $t_{j-1} \le t \le t_{j+1}$ to within second order in $\tau$. Thus we have found a difference approximation of the inhomogeneous evolution equation of second-order accuracy by applying the two-cycle method. Stability of the method is proved in an elementary way in the energy norm. Indeed, let us evaluate (4.57) by the norm.
Above it was found that $\|T_\alpha\| \le 1$.
Hence we obtain a recurrence inequality for $\|\varphi^{j+1}\|$.
Using the recurrence relation (4.63) we get
\[ \|\varphi^j\| \le \|g\| + j\tau\|f\|, \tag{4.66} \]
where $\|f\| = \max_j \|f^j\|$.
From (4.68) follows numerical stability of the scheme on any finite time interval. Let us consider now the splitting-up method for implicit difference approximations. To this end we analyse the problem
\[ \frac{\partial\varphi}{\partial t} + A\varphi = 0, \]
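\[ \varphi = g \ \text{ if } t = 0 \text{ in } D. \]
Let $A = \sum_{\alpha=1}^{n} A_\alpha$ and all $A_\alpha \ge 0$, and consider the splitting-up algorithm in the form
\[ \frac{\varphi^{j+\alpha/n} - \varphi^{j+(\alpha-1)/n}}{\tau} + A_\alpha\varphi^{j+\alpha/n} = 0, \qquad \alpha = 1, 2, \dots, n. \tag{$*$} \]
We show now that such an algorithm is absolutely stable. Indeed, let us look at the equation
\[ \frac{\varphi^{j+\alpha/n} - \varphi^{j+(\alpha-1)/n}}{\tau} + A_\alpha\varphi^{j+\alpha/n} = 0 \]
and multiply it scalarly by $\varphi^{j+\alpha/n}$. As a result we obtain
\[ \left(\varphi^{j+\alpha/n}, \varphi^{j+\alpha/n}\right) - \left(\varphi^{j+(\alpha-1)/n}, \varphi^{j+\alpha/n}\right) + \tau\left(A_\alpha\varphi^{j+\alpha/n}, \varphi^{j+\alpha/n}\right) = 0. \]
Considering positive semi-definiteness of the operators $A_\alpha$, we have
\[ \|\varphi^{j+\alpha/n}\|^2 \le \left(\varphi^{j+(\alpha-1)/n}, \varphi^{j+\alpha/n}\right). \]
But, since
\[ \left(\varphi^{j+(\alpha-1)/n}, \varphi^{j+\alpha/n}\right) \le \|\varphi^{j+(\alpha-1)/n}\|\,\|\varphi^{j+\alpha/n}\|, \]
then
\[ \|\varphi^{j+\alpha/n}\| \le \|\varphi^{j+(\alpha-1)/n}\|. \]
Using the recurrence inequality we have
\[ \|\varphi^{j+1}\| \le \|\varphi^j\| \le \dots \le \|g\|. \]
This means that under the assumptions made the computation by the splitting-up scheme ($*$) will be absolutely stable.
It is not difficult to see that the system ($*$) approximates the initial-value problem to within first order in $\tau$.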
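Both properties of the implicit splitting-up scheme, absolute stability and first-order accuracy, can be observed numerically. The sketch below assumes three random symmetric positive semi-definite matrices as the components $A_\alpha$ and uses an eigendecomposition of their sum as the reference solution; the matrices and step counts are illustrative assumptions.

```python
import numpy as np

def implicit_splitting(ops, g, tau, steps):
    """Fractional steps (E + tau A_alpha) phi^{j+alpha/n} = phi^{j+(alpha-1)/n}."""
    E = np.eye(len(g))
    phi = np.array(g, dtype=float)
    for _ in range(steps):
        for A_alpha in ops:
            phi = np.linalg.solve(E + tau * A_alpha, phi)
    return phi

def exact_solution(A, g, t):
    """exp(-tA) g for a symmetric matrix A, via its eigendecomposition."""
    lam, V = np.linalg.eigh(A)
    return V @ (np.exp(-lam * t) * (V.T @ g))

rng = np.random.default_rng(1)
ops = []
for _ in range(3):
    M = rng.standard_normal((5, 5))
    ops.append(M @ M.T / 5.0)  # symmetric positive semi-definite component
g = rng.standard_normal(5)
t_end = 1.0
reference = exact_solution(sum(ops), g, t_end)

errors, final_norms = [], []
for steps in (100, 200, 400):
    phi = implicit_splitting(ops, g, t_end / steps, steps)
    errors.append(np.linalg.norm(phi - reference))
    final_norms.append(np.linalg.norm(phi))
# halving tau should roughly halve the error for a first-order scheme
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
```

The norm of the computed solution never exceeds $\|g\|$, and the error ratios stay close to two, in line with first-order accuracy in $\tau$.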
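Let us consider the inhomogeneous problem
\[ \frac{\partial\varphi}{\partial t} + A\varphi = f, \qquad \varphi = g \ \text{ if } t = 0 \text{ in } D. \]
The splitting-up scheme for this problem will be considered in the form
\[ \begin{aligned} \frac{\varphi^{j+\alpha/n} - \varphi^{j+(\alpha-1)/n}}{\tau} + A_\alpha\varphi^{j+\alpha/n} &= 0, \qquad \alpha = 1, \dots, n-1, \\ \frac{\varphi^{j+1} - \varphi^{j+(n-1)/n}}{\tau} + A_n\varphi^{j+1} &= f^j. \end{aligned} \tag{$**$} \]
Such a splitting-up scheme approximates the basic inhomogeneous equation to within first order in $\tau$.
Stability of the scheme will be proved as follows. Let us multiply scalarly each of the equations, respectively, by $\varphi^{j+1/n}, \dots, \varphi^{j+1}$.
Then, analogously to the previous case, we have
\[ \|\varphi^{j+\alpha/n}\| \le \|\varphi^{j+(\alpha-1)/n}\|, \qquad \alpha = 1, \dots, n-1. \]
Let us consider the final equation of ($**$) in more detail. After the above procedure we have
\[ \|\varphi^{j+1}\|^2 + \tau\left(A_n\varphi^{j+1}, \varphi^{j+1}\right) = \left(\varphi^{j+(n-1)/n}, \varphi^{j+1}\right) + \tau\left(f^j, \varphi^{j+1}\right). \]
Taking into consideration that $A_n \ge 0$, we get
\[ \|\varphi^{j+1}\|^2 \le \left(\varphi^{j+(n-1)/n}, \varphi^{j+1}\right) + \tau\left(f^j, \varphi^{j+1}\right) \]
and, using the Buniakowski-Schwarz inequality,
\[ \left(\varphi^{j+(n-1)/n}, \varphi^{j+1}\right) + \tau\left(f^j, \varphi^{j+1}\right) \le \|\varphi^{j+(n-1)/n}\|\,\|\varphi^{j+1}\| + \tau\|f^j\|\,\|\varphi^{j+1}\|. \]
Hence,
\[ \|\varphi^{j+1}\|^2 \le \|\varphi^{j+(n-1)/n}\|\,\|\varphi^{j+1}\| + \tau\|f^j\|\,\|\varphi^{j+1}\|. \]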
Cancelling $\|\varphi^{j+1}\|$, we arrive at the following inequality:
\[ \|\varphi^{j+1}\| \le \|\varphi^{j+(n-1)/n}\| + \tau\|f^j\|. \]
Eliminating the solution with fractional indices, we have
\[ \|\varphi^{j+1}\| \le \|\varphi^j\| + \tau\|f^j\|. \]
Considering that $\|\varphi^0\| = \|g\|$, by eliminating intermediate values of the solution we get
\[ \|\varphi^j\| \le \|g\| + j\tau\|f\|, \]
where $\|f\| = \max_j \|f^j\|$.
From here follows absolute stability of the difference scheme for any instant of time of the interval $0 \le t_j \le T$.
This splitting-up algorithm can be generalized to the case of a time-dependent operator $A$. In such a case, at each cycle of computation by the splitting-up scheme, instead of $A_\alpha$ we should take a difference approximation of this operator on each interval $t_j \le t \le t_{j+1}$.
2.4.4 A solution of problems with time-dependent operators
Thus, we have considered three splitting-up techniques: the method of universal algorithm, the predictor-corrector method and the method of successive splitting based on a two-cycle procedure. It is remarkable that all three methods are equivalent in accuracy and absolutely stable if $A_\alpha \ge 0$.
However, we should keep in mind one restriction which was imposed on the operators $A_\alpha$ at the very beginning, namely their independence of time. Due to this restriction we could make a complete analysis of stability assuming only that the $A_\alpha$ are positive semi-definite operators. Unfortunately, if the operators are time-dependent, it is generally impossible to make such an analysis of stability. It is pleasant to note that the method of successive splitting is an exception. If the operators $A_\alpha$ are time-dependent, then for the other two methods one can choose approximations of the operators $A_\alpha$ in time, for example
\[ \Lambda_\alpha^j = A_\alpha(t_{j+1/2}), \]
which retain second-order approximation, but stability in other norms has to be established. Most often it is the energy norm. In both cases we have the relations
\[ \varphi^{j+1} = S^j\varphi^j, \]
where
\[ S^j = E - \tau\left(E + \frac{\tau}{2}\Lambda_2^j\right)^{-1}\left(E + \frac{\tau}{2}\Lambda_1^j\right)^{-1}\Lambda^j \]
for the scheme of universal algorithm, and
\[ S^j = E - \tau\Lambda^j\left(E + \frac{\tau}{2}\Lambda_2^j\right)^{-1}\left(E + \frac{\tau}{2}\Lambda_1^j\right)^{-1} \]
for the predictor-corrector scheme.
Unfortunately, $\|S^j\| < 1$ does not follow in the energy metric if we let $\Lambda_\alpha^j \ge 0$. Therefore investigation of the schemes' stability in this case consists of estimating the norm of a very involved operator $S^j$ at each computation step. Owing to this fact the two methods in question become less valuable if the operators $A_\alpha$ are time-dependent. For time-dependent operators $A_\alpha$ the method of successive splitting has an advantage over the other methods because it can be used effectively for a wide range of problems.
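The robustness of the two-cycle method of successive splitting for time-dependent operators can be sketched as follows. The example assumes two noncommuting positive semi-definite matrices with hypothetical time-dependent scalar factors, evaluated at the midpoint $t_j$ of each double interval $[t_{j-1}, t_{j+1}]$; these choices are illustrative and not taken from the text.

```python
import numpy as np

def cn_factor(L, tau):
    """Elementary Crank-Nicolson factor (E + tau/2 L)^(-1) (E - tau/2 L)."""
    E = np.eye(L.shape[0])
    return np.linalg.solve(E + 0.5 * tau * L, E - 0.5 * tau * L)

def two_cycle_step(ops_at, t_mid, phi, tau):
    """Advance phi from t_mid - tau to t_mid + tau by the two-cycle splitting."""
    factors = [cn_factor(L, tau) for L in ops_at(t_mid)]
    for T in factors:            # first half: alpha = 1, ..., n
        phi = T @ phi
    for T in reversed(factors):  # second half: alpha = n, ..., 1
        phi = T @ phi
    return phi

rng = np.random.default_rng(3)
B1 = rng.standard_normal((6, 6)); B1 = B1 @ B1.T
B2 = rng.standard_normal((6, 6)); B2 = B2 @ B2.T

def ops_at(t):
    # hypothetical time dependence; both components stay positive semi-definite
    return [(1.0 + 0.5 * np.sin(t)) * B1, (1.0 + t) * B2]

tau = 0.3  # deliberately coarse step: stability must not depend on it
phi = rng.standard_normal(6)
norms = [np.linalg.norm(phi)]
for cycle in range(20):
    t_mid = (2 * cycle + 1) * tau
    phi = two_cycle_step(ops_at, t_mid, phi, tau)
    norms.append(np.linalg.norm(phi))
norms = np.array(norms)
```

Every elementary factor has norm not exceeding one, so the norm of the solution does not grow from cycle to cycle even for a coarse step.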
2.4.5 An example
Let us illustrate splitting-up methods for nonstationary problems taking as an example the following problem for a grid function $\varphi_{kl}$ in the grid domain $D_h$, with $\varphi = g$ at $t = 0$ and a prescribed condition on the boundary $\partial D_h$.
Here it is assumed that the approximation of the problem in the space variables has been made. Hence, we deal with a system of ordinary differential equations for the components $\varphi_{kl}$ of the solution and $g_{kl}$ of the function $g$.
For the sake of simplicity the indices of the functions will be omitted. The operator $\Lambda = -\Delta_h$ will be presented as a sum of two operators
\[ \Lambda = A_1 + A_2, \]
where $A_1$ and $A_2$ are the one-dimensional difference operators acting along the indices $k$ and $l$, respectively.
It is not difficult to find the norms of the operators $A_1$ and $A_2$ by the same procedure which we used in § 1.1 when we looked for the norm of the operator $\Lambda$.
It is easy to obtain
\[ \|A_1\| \le \frac{4}{h^2} \]
and, analogously, the same bound for $\|A_2\|$.
Let us now examine schemes of realization and stability of the basic splitting-up algorithms.
1. If we use the method of universal algorithm (4.4), the scheme of realization is of the form (4.20):
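\[ \begin{aligned} \xi_{kl}^{j} &= -\Lambda\varphi_{kl}^j, \\ \xi_{kl}^{j+1/2} + \frac{\tau}{2}A_1\xi_{kl}^{j+1/2} &= \xi_{kl}^{j} \qquad (l = 1, 2, \dots, n-1), \\ \xi_{kl}^{j+1} + \frac{\tau}{2}A_2\xi_{kl}^{j+1} &= \xi_{kl}^{j+1/2} \qquad (k = 1, 2, \dots, n-1), \\ \varphi_{kl}^{j+1} &= \varphi_{kl}^j + \tau\xi_{kl}^{j+1}. \end{aligned} \tag{4.72} \]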
Since the step operator has the form (4.11) and the operators $A_\alpha$ in our case are commutative, the norm of the step operator does not exceed unity.
Thus stability of the difference scheme (4.68) is ensured.
2. Let us make use of the predictor-corrector method (4.34); we obtain the scheme (4.73).
The norm of the step operator for the scheme in question, due to the proved general statement (3.39), is strictly less than unity. Therefore (4.73) is stable.
3. In the case of complete component-wise splitting (4.46) we have the scheme (4.74).
Here $\Lambda = A_1 + A_2$, and the operators $A_1$ and $A_2$ are formed with the use of the boundary conditions.
It was shown in (4.64) that the norm of the step operator in the case of the scheme under consideration, and of more general schemes, is strictly less than unity. Therefore the scheme (4.74) is stable.
In all the above examples stability is proved in the energy norm. Each of the schemes considered is a set of equations with difference operators which depend either on the index $k$ or on the index $l$ only. Eventually the problem is reduced to the solution of the simplest three-point equations of the form (4.75), where $a_n$, $b_n$, $c_n$ and $f_n$ are prescribed values.
For solving the difference equations (4.75) one usually uses a factorization method (see § 5.4).
2.5 Multi-component splitting of problems
Until now it has been assumed that the initial operator $A$ is a sum of two simpler operators. When we solve complicated problems in mathematical physics we often have to deal with the splitting of operators into a number of components. In the general case we have
\[ A = \sum_{\alpha=1}^{n} A_\alpha, \tag{5.1} \]
where $A_\alpha \ge 0$. Since in the previous section we dealt with $n = 2$, here we shall only deal with $n > 2$.
First of all we find that a trivial extension of the above splitting-up methods to the case $n > 2$ is generally impossible. Therefore our object will be to extend the splitting-up algorithms to this case, making assumptions which allow such an extension.
2.5.1 The method of universal algorithm
Under the assumption (5.1) it can be presented as
\[ \prod_{\alpha=1}^{n}\left(E + \frac{\tau}{2}A_\alpha\right)\frac{\varphi^{j+1} - \varphi^j}{\tau} + A\varphi^j = f^j, \qquad \varphi^0 = g, \tag{5.2} \]
where $f^j = f(t_{j+1/2})$.
The scheme of the operational algorithm is as follows:
\[ \begin{aligned} \xi^{j} &= -A\varphi^j + f^j, \\ \left(E + \frac{\tau}{2}A_1\right)\xi^{j+1/n} &= \xi^{j}, \\ \left(E + \frac{\tau}{2}A_2\right)\xi^{j+2/n} &= \xi^{j+1/n}, \\ &\ \ \vdots \\ \left(E + \frac{\tau}{2}A_n\right)\xi^{j+1} &= \xi^{j+(n-1)/n}, \\ \varphi^{j+1} &= \varphi^j + \tau\xi^{j+1}. \end{aligned} \tag{5.3} \]
It is not difficult to check that the universal splitting-up algorithm has second-order accuracy in $\tau$ if the solution is sufficiently smooth. Numerical stability will be achieved if the condition
\[ \|T\| \le 1 \tag{5.4} \]
is fulfilled, where $T$ is the step operator defined by
\[ T = E - \tau\left[\prod_{\alpha=1}^{n}\left(E + \frac{\tau}{2}A_\alpha\right)\right]^{-1} A. \tag{5.5} \]
Unfortunately, from the condition $A_\alpha \ge 0$ stability in some norm does not follow, as was the case for $n = 2$. To establish stability one usually uses the following simple algorithmic technique. If $f^j = 0$, the homogeneous equation (5.2) solved with respect to $\varphi^{j+1}$ becomes
\[ \varphi^{j+1} = T\varphi^j. \tag{5.6} \]
$T$ is assumed to be a time-independent operator (independent of the index $j$). Therefore, solving (5.6) with the initial condition
\[ \varphi^0 = g \tag{5.7} \]
and a fixed $\tau$ allowing the necessary approximation, we shall monitor the norm $\|\varphi^j\|$.
If this norm does not increase, it follows that
\[ \|T\| \le 1; \tag{5.8} \]
hence, it will be reckoned that the condition for numerical stability can be fulfilled. Then one can go over to the solution of the inhomogeneous problem. Indeed, if the condition (5.8) is fulfilled, Eq. (5.2) will be rewritten in a form solved with respect to $\varphi^{j+1}$.
From here, passing to norms, we obtain a recurrence inequality.
By the recurrence relation we come to the stability condition in the energy metric.
It will be noted that when solving the homogeneous equation (5.6) we used the initial condition (5.7), though this is not necessary. We can choose any function as the initial condition and monitor the computation. If the computation is stable, the norm of the operator $T$ will be less than or equal to unity. Otherwise round-off errors will cause an increase of the norm of the solution beginning with some $j$.
2.5.2 The predictor-corrector method
In this case the splitting-up scheme (5.11) consists of $n$ implicit predictor steps with the operators $A_\alpha$ followed by the corrector
\[ \frac{\varphi^{j+1} - \varphi^j}{\tau} + A\varphi^{j+1/2} = f^j, \]
where we let $A_\alpha \ge 0$ and $f^j = f(t_{j+1/2})$.
The system of equations (5.11) reduces to one equation (5.12), provided that $\varphi^0 = g$.
The predictor-corrector method in this case has second-order accuracy in $\tau$ if the solution is sufficiently smooth.
Eq. (5.12) will be written in a form solved with respect to $\varphi^{j+1}$, where
\[ T = E - \tau A\prod_{\alpha=n}^{1}\left(E + \frac{\tau}{2}A_\alpha\right)^{-1} \]
is a step operator.
In a similar way the condition of numerical stability reduces eventually to evaluation of the norm of the operator $T$. For this purpose we can use the above method by which we estimated the norm of the solution to the problem (5.6), (5.7). Unfortunately, in this case, as in the previous one, we did not manage to prove stability of the scheme for $A_\alpha \ge 0$.
In order to complete the analysis of the two above-mentioned schemes we shall examine the simplest case, where the splitting-up operators $A_\alpha$ are commutative and have a common basis. This requirement, in addition to the condition $A_\alpha \ge 0$, seems sufficient to prove stability of the schemes. Indeed, in the case of commutativity the step operators $T$ for both schemes coincide with each other. For simplicity we shall consider the homogeneous problem (5.6), (5.7) and look for the solution in the spectral form
\[ \varphi^j = \sum_k \varphi_k^j u_k, \tag{5.16} \]
where the $u_k$ are eigenfunctions of the problem (1.6) (Chapter 1) and $\varphi_k^j = (\varphi^j, u_k^*)$, where the $u_k^*$ are eigenfunctions of the adjoint problem (1.7) (Chapter 1). Since $u_k$ is a common basis, each operator $A_\alpha$ acts on $u_k$ as multiplication by an eigenvalue $\lambda_k^{(\alpha)} \ge 0$.
Substituting (5.16) and the respective expansion for the function $g$ into (5.6), (5.7), we get for the Fourier coefficients $\varphi_k^j$ the following expressions:
\[ \varphi_k^{j+1} = T_k\varphi_k^j, \qquad \varphi_k^0 = g_k, \]
where
\[ T_k = 1 - \frac{\tau\sum_{\alpha=1}^{n}\lambda_k^{(\alpha)}}{\prod_{\alpha=1}^{n}\left(1 + \frac{\tau}{2}\lambda_k^{(\alpha)}\right)}. \tag{5.19} \]
The expression for $T_k$ from (5.19) will be rewritten in the form (5.20), which involves positive constants $p_k$. From (5.20) it follows that
\[ |T_k| < 1, \tag{5.21} \]
which proves the statement. The method of universal algorithm as well as the predictor-corrector method for the $n$-component splitting of the operator can also be applied when the operator $A$ is time-dependent. However, in this situation the a priori formulation of the stability condition appears to be a more complicated problem. Therefore it is difficult to say to what extent the application of the two above schemes is in general worthwhile in this case. This fact encouraged the author to formulate a universal approach to the solution of different complicated and rather general problems using splitting-up techniques. In what follows this new approach, called a two-cycle method of successive splitting, will be discussed in detail.
2.5.3 The method of successive splitting based on elementary Crank-Nicolson difference schemes
We shall attempt to build a difference analog of the problem that is accurate to second order in $\tau$ and absolutely stable in time. According to multicomponent splitting it will be assumed that
\[ \Lambda^j = \sum_{\alpha=1}^{n}\Lambda_\alpha^j, \tag{5.22} \]
where all the
4 a r e positive semi-definite operators s o that A j 2 0. o(
Let us consider a system of equations :
When
A
0 and a r e commutativei (5. 23)
is unconditionaEly stable
This can be easily established j However, for noncommutative operators Ad,
and has second o r d e r approximation. by the Fourier method.
a s i s easily seen, (5. 2 3 ) accuracy
will be
, generally speaking, of first order
in Z and therefore it will be of l e s s interest for applications
than the following scheme of second order accuracy :
Later we shall t f y to find a special construction of the method of complete splitting using (5. 23). This construction will give the solution to the Cauchy problem for the positive semi-definite and noncommutative operators A'
o(
which have second order approximatjon. In fact this
G . I. Marchuk
is in a sense the final solution of the splitting-up problem.
We note that the system of equations (5. 23) reduces to one equation
Using (5. 25) we find the estimation by the norm
By Kellogg's lemma we have
If the operator is skew-symmetric, we have
Thus we have proved absolute stability of this scheme. To determine the order of approximation we shall expand the expression $T = \prod_{\alpha=1}^{n} T_\alpha$ in powers of the small parameter $\tau$. We shall first expand the operator $T_\alpha$ in a series. Then, as in (2.13), we have
$$T_\alpha = E - \tau A_\alpha + \frac{\tau^2}{2} A_\alpha^2 - \dots$$
As a result we get
When the operators $A_\alpha$ are commutative, the expression under the sign of the double sum disappears and we have
Comparing (5.31) to (2.13) we find that in this particular case the scheme (5.23) is accurate to the second order in $\tau$. If the operators $A_\alpha$ are noncommutative, the splitting-up scheme is of first order accuracy in $\tau$. In order to build a scheme accurate to the second order in $\tau$ for the noncommutative case the scheme (5.23) should be modified and substituted by
Algorithmically it means that first the system (5.23) is solved on the interval $t_{j-1} \le t \le t_j$ for $\alpha = 1, 2, \dots, n$, then a similar system is solved on the interval $t_j \le t \le t_{j+1}$ in the inverse sequence ($\alpha = n, n-1, \dots, 1$).
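The two-cycle ordering just described can be tried out numerically. The following is only a minimal sketch under stated assumptions: the elementary step operator is taken to be the Crank-Nicolson factor $(E + \frac{\tau}{2} A_\alpha)^{-1}(E - \frac{\tau}{2} A_\alpha)$, the test problem is $d\varphi/dt + A\varphi = 0$ with $A = A_1 + A_2$, and the two small symmetric non-commuting matrices `OPS`, the initial vector `Y0` and all function names are hypothetical illustrations, not taken from the lectures.

```python
import numpy as np

def cn_factor(A, tau):
    """Elementary Crank-Nicolson step operator (E + tau/2 A)^{-1} (E - tau/2 A)."""
    E = np.eye(A.shape[0])
    return np.linalg.solve(E + 0.5 * tau * A, E - 0.5 * tau * A)

def one_cycle(ops, tau):
    """Step operator for one interval of length tau, alpha taken in list order."""
    T = np.eye(ops[0].shape[0])
    for A in ops:                      # the first operator in the list acts first
        T = cn_factor(A, tau) @ T
    return T

def two_cycle(ops, tau):
    """Step operator over a double interval: alpha = 1..n, then alpha = n..1."""
    return one_cycle(ops[::-1], tau) @ one_cycle(ops, tau)

def exact_solution(A, y0, t):
    """exp(-t A) y0 for a symmetric matrix A, via its eigendecomposition."""
    lam, V = np.linalg.eigh(A)
    return V @ (np.exp(-t * lam) * (V.T @ y0))

def error_at_t1(ops, y0, n_pairs, scheme):
    """Error at t = 1 after 2*n_pairs steps of length tau = 1/(2*n_pairs)."""
    tau = 1.0 / (2 * n_pairs)
    if scheme == "two_cycle":
        step = two_cycle(ops, tau)
    else:                              # same ordering repeated on both intervals
        step = one_cycle(ops, tau) @ one_cycle(ops, tau)
    y = np.linalg.matrix_power(step, n_pairs) @ y0
    return np.linalg.norm(y - exact_solution(sum(ops), y0, 1.0))

# Hypothetical test data: two symmetric positive definite, non-commuting matrices.
OPS = [np.array([[2.0, 1.0], [1.0, 1.0]]), np.array([[1.0, 0.0], [0.0, 3.0]])]
Y0 = np.array([1.0, 1.0])

for scheme in ("one_cycle", "two_cycle"):
    ratio = error_at_t1(OPS, Y0, 50, scheme) / error_at_t1(OPS, Y0, 100, scheme)
    print(scheme, "error ratio when tau is halved:", ratio)
```

Halving $\tau$ should roughly halve the error of a first-order scheme and divide the error of a second-order scheme by about four, so the printed ratios give an informal check of the orders of accuracy claimed in the text for the noncommutative case.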
It is obvious that for the whole cycle of (5.33) we have
$$\varphi^{j+1} = T^j \varphi^{j-1}$$
where
Thus within the interval $t_{j-1} \le t \le t_{j+1}$ the scheme (5.33) is accurate to the second order in $\tau$, as is the case with (2.13) for a double interval. In conclusion it will be remarked that the difference scheme (5.33) is absolutely stable for $A_\alpha \ge 0$. Hence we have come, in a sense, to an optimal algorithm of multicomponent splitting.

2.6. A General Approach to Component-wise Splitting
To solve many problems in mathematical physics one has to split initial differential, integral and integro-differential equations into simpler ones, subsequently reducing the latter to a difference form using the algorithms described in this chapter. In doing so one has each time to consider the question of approximation to the initial equations by difference equations, and this is the object of our discussion. Let us take some problem in mathematical physics.
Suppose that
$$A = \sum_{\alpha=1}^{n} A_\alpha ,$$
where $A_\alpha \ge 0$. The solution $\varphi$ and the function $g$ are assumed to be sufficiently smooth. Then, on each interval $t_j \le t \le t_{j+1}$, (6.1) is written as
where
Earlier it has been shown that if one applies the Crank-Nicolson difference scheme to each equation, one comes to the system of difference equations of the second order approximation
where we use the notation
Let us suppose then that each of the operators $A_\alpha$ is in turn representable in the form
where $A_{\alpha\beta} \ge 0$. There is a question whether it is reasonable to "split", first, the operator $A$ into operators $A_\alpha$ and then, in turn, the operators $A_\alpha$ into $A_{\alpha\beta}$. Is it not easier to represent the operator $A$ as a set of $A_{\alpha\beta}$ right away? In this connection it should be remarked that though these two approaches seem equivalent, in many cases it is more convenient to convert, first, a complex problem in mathematical physics into simpler ones which further can be independently reduced to difference problems (see supplement). Let us analyse any problem of (6.3) and, considering (6.7), split it into even simpler ones
where
It is not difficult to see that the system of the split equations (6.8) approximates the initial-value problem (6.1) to within the second order in $\tau$. The proof of such a statement is based on the fact that, using (6.2) and (6.6), one can change the ordering of the components of splitting-up, by writing
and in this event we have the problem
which, as was shown in § 2.5, approximates the problem (6.1) accurate to the second order in $\tau$. The result is also true of the case when $A$ is dependent on time. Then one should make approximation to the operators $A_{\alpha\beta}$ to within the second order in $\tau$ on each interval $t_j \le t \le t_{j+1}$. If the $A_{\alpha\beta}$ are non-commutative, then, using the two-cycle procedure described in § 2.5, we obtain a difference scheme accurate to the second order for each interval $t_{j-1} \le t \le t_{j+1}$.

To summarize, we can assert the following. When an evolution problem of the form (6.1), under the condition $A_\alpha \ge 0$, is reduced to particular evolution problems (6.3) and these are regarded as a set of new evolution problems, the approximation to the initial-value problem will be accurate to the first order in $\tau$ provided that at least one of the elementary problems is reduced to difference schemes accurate to the first order. If every such problem has approximation of second-order accuracy, then, using the two-cycle procedure with respect to $\alpha$ and $\beta$, we again come to approximation of second order accuracy in $\tau$. It should be noted that if the operators $A_{\alpha\beta}$ are non-commutative, then without using the two-cycle procedure we derive approximation of (6.1) accurate to the first order.
Indeed, let us consider the case of non-commutative operators. Then the following is an initial-value problem:
$\varphi = \varphi^j$ if $t = t_j$.
(6.10) is reduced to the system
Let $A_\alpha = \sum_{\beta} A_{\alpha\beta}$, where the operators $A_{\alpha\beta}$ are non-commutative. Then every problem of (6.11) is solved by the two-cycle method.
The initial conditions for each of the systems (6.12) are taken in the form
It is easy to find that (6.12) approximates, on the interval $t_j \le t \le t_{j+1}$, any of the problems (6.11) to within $\tau^2$. In order for the whole algorithm to lead to a solution of (6.1) to within $\tau^2$ it is also necessary to alternate the basic cycles. Thus, instead of (6.11), on the interval $t_{j-1} \le t \le t_j$ we should have
($\alpha = 1, 2, \dots, n$),
and, on the next interval $t_j \le t \le t_{j+1}$,
(6.14)
It is assumed that each problem of (6.13) and (6.14) is solved by the two-cycle method of the form (6.12). Note that under the condition $A_{\alpha\beta} \ge 0$ the component-wise splitting-up method is absolutely stable.
2.7. Hyperbolic Equations

Hyperbolic equations hold a prominent place in applications, and numerical methods for such equations have been studied extensively. Hyperbolic equations have the following characteristic features. First, the domain of dependence of a solution for such equations is bounded by a characteristic cone, so that the region outside it in $D \times T$ space does not affect the solution at the point under consideration. Second, among the solutions of the initial-value problem there may be non-smooth solutions as well, and this should be kept in mind when one develops numerical schemes. Since a great many excellent investigations are devoted to constructing difference schemes for hyperbolic problems, we shall discuss only some of the methods most widely used in recent years. Let us look at the problem
It will be assumed that the operator $A$ is independent of time and the functions $g$ and $p$ allow sufficient smoothness of the solution of the periodic problem. Let the operator $A$ be positive, i.e. there exists $\gamma > 0$ so that
By the way, we must note that for symmetric positive definite operators $\gamma = \alpha(A)$, where $\alpha(A)$ is the minimum eigenvalue of the spectrum of the operator $A$. Let us consider a difference approximation to the equation of (7.1) in the form
It is easy to show that the difference scheme (7.3) approximates the initial equation of (7.1) to within quantities of the second order of smallness with respect to $\tau$. We apply initial data to (7.3). In order not to distort the second order of approximation we take, along with the condition,
the following relation:
(7.5) is derived by expanding the problem (7.1) into a Taylor series in the vicinity of $t = 0$ with a subsequent elimination of derivatives using the equation and the known initial conditions in (7.1).
The problem (7.3), (7.4), (7.5) is fully formulated. Our object now is to analyse the numerical stability of (7.3), using the spectral method. Let $u_n$ and $u_n^*$ be the eigenfunctions and $\lambda_n \ge 0$ the eigenvalues of the spectral problems
Next, it will be assumed that $\{u_n\}$ forms a basis. Then we seek a solution to the equation in the form
where
Substituting the Fourier series (7.7) into (7.3) and multiplying the result by $u_n^*$ we get for the Fourier coefficients $\varphi_n^j$ the following expression
A solution to (7.8) is sought in the form of the power function
Note that on the left-hand side of (7.9) $j$ is an index, while on the right-hand side it is a power.
Substituting (7.9) into (7.8) we get the characteristic equation for $q_n$:
It is easily seen that if
(7.11)
the roots of (7.10) are complex conjugate and equal to unity in modulus, i.e.
$|q_n| = 1$. (7.12)
From the condition (7.11)
Evidently, (7.13) will be fulfilled for all $n$ if $\tau$ is taken such that
where $\beta(A)$ is the upper bound of the spectrum of the operator $A$. For symmetric operators $\beta(A) = \|A\|$, hence,
Let us proceed to implicit difference schemes
The scheme (7.16) is accurate to the second order with respect to $\tau$ and in combination with (7.4), (7.5) it approximates (7.1) to within the second order. For (7.16) the characteristic equation is of the form
hence,
From here it follows that with any $\tau$
(7.19)
Thus, (7.16) is an unconditionally stable scheme.
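The contrast between the conditionally stable explicit scheme and the unconditionally stable implicit one can be illustrated by computing the roots of the two characteristic equations for a single eigenvalue $\lambda_n$. The displayed formulas (7.3), (7.10), (7.16) and (7.17) did not survive in this copy, so the sketch below rests on assumptions: the explicit scheme is taken to be the standard three-level scheme $(\varphi^{j+1} - 2\varphi^j + \varphi^{j-1})/\tau^2 + A\varphi^j = 0$, and the implicit one is taken to be $(\varphi^{j+1} - 2\varphi^j + \varphi^{j-1})/\tau^2 + A(\varphi^{j+1} + \varphi^{j-1})/2 = 0$. The function names are hypothetical.

```python
import numpy as np

def explicit_roots(tau, lam):
    """Roots of q^2 - (2 - tau^2 lam) q + 1 = 0 (assumed explicit scheme)."""
    return np.roots([1.0, -(2.0 - tau**2 * lam), 1.0])

def implicit_roots(tau, lam):
    """Roots of (1 + s) q^2 - 2 q + (1 + s) = 0, s = tau^2 lam / 2 (assumed implicit scheme)."""
    s = 0.5 * tau**2 * lam
    return np.roots([1.0 + s, -2.0, 1.0 + s])

def max_modulus(roots):
    """Largest modulus among the characteristic roots."""
    return float(np.max(np.abs(roots)))

lam = 100.0                      # hypothetical eigenvalue; 2 / sqrt(lam) = 0.2
for tau in (0.1, 0.3, 10.0):
    print(tau, max_modulus(explicit_roots(tau, lam)), max_modulus(implicit_roots(tau, lam)))
```

Under these assumptions the explicit roots stay on the unit circle only while $\tau^2 \lambda_n < 4$, whereas the implicit roots have unit modulus for every $\tau$, which agrees with the statement of the text about unconditional stability.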
Let
where $A_\alpha \ge 0$. For an approximate solution of (7.1) we make use of a difference approximation of the form
where
From (7.21) and (7.22) it follows that (7.21) approximates the initial equation of (7.1) to within quantities of the second order with respect to $\tau$.
Since the equation (7.21) can be reduced to
from the analysis made above there follows stability of (7.21), (7.22), provided that
In this way the problem of choosing the parameter $\tau$ satisfying the stability condition reduces to calculating the maximum eigenvalue $\beta(B^{-1}A)$ of the problem (under the assumption that all eigenvalues are positive):
$$A U = \lambda B U. \qquad (7.25)$$
This problem is solved by the iterative process
In this connection
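The iterative formula itself is missing from this copy, so the following sketch does not claim to reproduce the author's process. It shows one common choice, the power method applied to $B^{-1}A$ with a Rayleigh quotient estimate, assuming that $A$ and $B$ are symmetric positive definite matrices. The function name and the test matrices are hypothetical.

```python
import numpy as np

def max_generalized_eigenvalue(A, B, iters=200):
    """Estimate the largest eigenvalue of A u = lam B u by power iteration on B^{-1} A."""
    u = np.ones(A.shape[0])
    lam = 0.0
    for _ in range(iters):
        u = np.linalg.solve(B, A @ u)          # u <- B^{-1} A u
        u /= np.linalg.norm(u)                 # normalize to avoid overflow
        lam = (u @ (A @ u)) / (u @ (B @ u))    # Rayleigh quotient for the pencil (A, B)
    return lam

# Hypothetical symmetric positive definite test matrices.
A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
B = np.diag([2.0, 1.0, 1.0])
print(max_generalized_eigenvalue(A, B))
```

The number of iterations needed depends on the ratio of the two largest eigenvalues; for badly separated spectra a larger `iters` value or a different method would be required.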
The operational scheme of the difference system corresponding to (7.21) is written as
This problem is solved successively with $j = 2, 3, \dots$, using the initial data (7.4) and (7.5). (7.28) is a splitting-up scheme. To conclude, we consider a wave equation
where $a^2$ is the squared velocity of propagation of the wave perturbation. The problem (7.29) will be called periodic in geometric variables. Using our notation
$$A = -a^2 \Delta .$$
Let us assume that instead of the differential operator we consider its second-order difference approximation in all variables $x_\alpha$. Then
If $\Delta x_\alpha = h$, the spectral problem
defines the upper bound of the spectrum of the difference operator $A$ in the form
Thus, in this case the explicit scheme (7.3) requires fulfilment of the condition (7.14) or
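The displayed bounds are lost in this copy, but the spectral bound can be checked numerically in the simplest setting. The sketch below is restricted to one space dimension, assumes a periodic uniform grid with an even number of points, and assumes that (7.14) has the form $\tau < 2/\sqrt{\beta(A)}$; the function names and grid parameters are hypothetical.

```python
import numpy as np

def periodic_wave_operator(n_points, h, a):
    """Matrix of A = -a^2 * (second central difference) on a periodic 1-D grid."""
    E = np.eye(n_points)
    shift = np.roll(E, 1, axis=1)               # couples each node to one neighbour
    return (a**2 / h**2) * (2.0 * E - shift - shift.T)

def explicit_step_bound(A):
    """Assumed form of the step restriction: tau < 2 / sqrt(beta(A))."""
    beta = np.linalg.eigvalsh(A).max()          # upper bound of the spectrum
    return 2.0 / np.sqrt(beta)

a, h, n_points = 2.0, 0.1, 16                   # hypothetical data
A_h = periodic_wave_operator(n_points, h, a)
print(np.linalg.eigvalsh(A_h).max(), 4 * a**2 / h**2)
print(explicit_step_bound(A_h), h / a)
```

In this one-dimensional case the largest eigenvalue equals $4a^2/h^2$, so the assumed restriction becomes $\tau < h/a$. With $n$ space variables and the same step $h$ in each, the corresponding values would be $4a^2 n/h^2$ and $\tau < h/(a\sqrt{n})$.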
If we consider the scheme (7.21), (7.22), where
then, applying the spectral analysis, we get
This means that the difference scheme of (7.29) on the basis of the splitting-up algorithm (7.21) will be unconditionally stable.
The algorithm considered extends fairly simply to inhomogeneous hyperbolic equations.
Contents

Introduction
Chapter 1. General information from the theory of difference schemes
1.1. Basic and adjoint equations
1.1.1. Energy norms
1.1.2. Estimation of the norm of a single operator
1.1.3. Kellogg's lemma
1.1.4. Estimation of the norm of the operators
1.1.5. Calculation of the spectrum bounds of a positive matrix
1.1.6. Examples
1.2. Approximation
1.3. Numerical stability
1.4. The convergence theorem
Chapter 2. Methods of solution of nonstationary problems
2.1. Approximation-stability relation
2.2. Difference schemes of second order accuracy with time-dependent operators
2.3. Inhomogeneous evolution equations
2.4. Splitting-up methods for nonstationary problems
2.4.1. The method of universal algorithm
2.4.2. The predictor-corrector method
2.4.3. Component-wise splitting-up method
2.4.4. A solution of problems with time-dependent operators
2.4.5. An example
2.5. Multi-component splitting of problems
2.5.1. The method of universal algorithm
2.5.2. The predictor-corrector method
2.5.3. The method of successive splitting based on elementary Crank-Nicolson difference schemes
2.6. A general approach to component-wise splitting
2.7. Hyperbolic equations
References
U. Mosco

CHAPTER 1. Minimum problems and variational inequalities: convexity, monotonicity and fixed-points
1 : The direct formulation
2 : The weak formulation
3 : The linearized formulation
4 : The fixed-point formulation
5 : The epigraph formulation
6 : Minimum problems in normed spaces
7 : Monotone operators and variational inequalities : the linearization lemma
8 : Variational inequalities and fixed-points
9 : Minimization of non-differentiable functionals and mixed variational inequalities

CHAPTER 2. Some typical problems and existence theorems
1 : Examples
2 : Finite-dimensional and iterative existence theorems
3 : Variational inequalities for bilinear forms in Hilbert spaces
4 : Direct existence theorem

CHAPTER 3. Convergence of convex sets and of solutions of variational inequalities
1 : The Ritz-Galerkin discretization
2 : Convergence of convex sets and convex functions
3 : The "stability" theorem
4 : Further existence theorems
5 : Finite-dimensional approximation, I : the discrete problem
6 : Finite-dimensional approximation, II : convergence of the approximate solutions
7 : Dual variational inequalities and complementary systems
8 : An example
CHAPTER 1
Minimum problems and variational inequalities : convexity, monotonicity and fixed-points

The aim of this introductory chapter is to bring to light some simple geometric features underlying the theory of the so-called variational inequalities, which are also inherent in the problem of minimizing a convex functional on a convex set. Most of the characterizations of the solutions of these problems, indeed, give rise to inequalities involving the differential of the given functional, which is a monotone map from the space where the functional is defined to its dual. For minimum problems in function spaces, these inequalities should be seen as the analogue of the Euler condition of the calculus of variations, in the presence of unilateral constraints on the solution.

The variational inequality approach consists in dealing directly with such inequalities, without assuming "a priori" that the monotone operator involved is the differential of a convex functional. Moreover, even in this special case, they can often be used, as the Euler equation is, to investigate the properties of the solution of the original minimum problem (for example, the regularity, i.e., how smooth that solution is), and they also occur in various approximate methods of solution, alternative to the direct Ritz approach.

Therefore, it is of some interest that some basic general features of convex optimization are shared by the monotone inequalities we are going to discuss in our lectures. Let us recall some of the specific aspects of convexity.
First, any local minimum is actually a global minimum. The relevance of this property is apparent: only a "local" investigation of the functional and the constraints is required, which in terms, for instance, of computer performance means less information to be memorized.

Another motivation for setting (if possible) an infinite dimensional minimum problem in a convex framework is that "good" topologies are then allowed. It is well known, in fact, that convex functionals keep their semicontinuity in topologies weak enough to make the set of constraints compact in a suitable function space, under reasonable boundedness assumptions that are most of the time intrinsic in the problem at hand: the existence of the minimum is then an immediate consequence of the Weierstrass theorem.

Another specific feature of convexity is that some linearization is always possible. This is, on the other hand, the underlying reason for the existence of good topologies. As we shall see later, a linearization of the problem is the basic tool in the existence theory for infinite dimensional variational inequalities.

The equivalent formulations of a convex minimum problem that we shall discuss in the present chapter can be tentatively named as follows:
I the direct formulation
II the gradient or weak formulation
III the linearized formulation
IV the fixed-point and the iterative fixed-point formulation
V the epigraph formulation.
Further methods that should be mentioned are the
VI duality and minimax methods
and the methods based on
VII penalization and regularization.
We should perhaps also say that many of these approaches are often adopted at the same time and can be variously combined with the finite-dimensional approximate methods of solution we shall talk about in the following chapters.

We shall first intuitively sketch I ... V in Sections 1 ... 5 below, postponing the rigorous proof of their equivalence to the later sections, where we shall also discuss further properties of monotone variational inequalities. A partial account of the duality theory will be given in Section 7 of Chapter 3, whereas the other subjects mentioned in VI and VII above will not be treated at all in our lectures, important though they are. Let us mention that a detailed discussion of the methods based on penalization and regularization devices with regard to variational inequalities can be found in J. L. Lions [1], while an account of the duality and minimax methods can be found in J. L. Lions, R. Glowinski and R. Tremolieres [1].

1. Direct formulation
Let us consider the problem of minimizing a real valued convex function $F$ on a convex subset $K$ of the n-dimensional euclidean space $E^n$, i.e., the problem

I. $u \in K$ : $F(u) \le F(v)$ for all $v \in K$.

If we introduce for any fixed vector $v$ of the space the (possibly empty) level set $L(v)$ of $F$ on $K$,
$$L(v) = \{ w \in K : F(w) \le F(v) \},$$
then I can be equivalently written as
$$u \in \bigcap_{v \in K} L(v).$$

Remark 1 : The set of all solutions $u$ of problem I is convex, for $L(v)$ is convex for every $v$. It is also closed provided $F$ is lower semicontinuous and $K$ is closed, for then $L(v)$ is also closed.
Let us notice that throughout this chapter we disregard any question about the existence of solutions. In other words, we are assuming that a solution does exist and we are only concerned with the characterization of it in terms of $F$ and $K$, together with the properties of the set of all solutions.
2. Weak formulation

Let us assume now that $F$ is differentiable on $E^n$ and let
$$\nabla F(v) = \left( \frac{\partial F}{\partial v_1}(v), \dots, \frac{\partial F}{\partial v_n}(v) \right)$$
be the gradient of $F$ at the point $v = (v_1, \dots, v_n)$ of $E^n$:
$$\nabla F : E^n \to E^n .$$

Remark 2 : $\nabla F(z)$ is a vector at the point $z$ pointing toward the direction of maximum increase of $F$; due to the convexity of $F$, hence of the level sets of $F$, at any point $z$ all directions $w - z$ pointing to the same half space where $\nabla F(z)$ is oriented, i.e., all vectors $w - z$ such that
$$(\nabla F(z) \mid w - z) \ge 0,$$
are directions of increasing (or non-decreasing) $F$: we shall call them forward directions at $z$.
Let us now introduce for any fixed $v$ of $E^n$ the set

$$\{ z \in K : (\nabla F(z) \mid v - z) \geq 0 \}$$

of all $z$ of $K$ from which the given $v$ is seen in a forward direction. Clearly, a vector $u$ of $K$ minimizes $F$ on $K$ if and only if all vectors $v$ of $K$ are seen forward from $u$, that is, I is equivalent to the inclusion

$$u \in \bigcap_{v \in K} \{ z \in K : (\nabla F(z) \mid v - z) \geq 0 \} ,$$

which is to say that problem I above is equivalent to the problem

II. $u \in K : \quad (\nabla F(u) \mid v - u) \geq 0 \quad \text{for all } v \in K .$
Remark 3 : If $u$ belongs to the interior of $K$, then II is equivalent to

$$(\nabla F(u) \mid w) = 0 \quad \text{for all } w ,$$

which is the "weak" form of the equation

$$\nabla F(u) = 0 .$$

[In fact, if $u \in \operatorname{int} K$, then the vector $u \pm \varepsilon w$ belongs to $K$ for all $\varepsilon > 0$ sufficiently small, whatever is the vector $w$ of the space; therefore, we can put any such $v = u \pm \varepsilon w$ into inequality II and we get $\pm \varepsilon (\nabla F(u) \mid w) \geq 0$, with $\varepsilon > 0$, hence $(\nabla F(u) \mid w) = 0$.]

3. The linearized formulation
Both I and II can be viewed as a system of infinitely many non-linear inequalities in $u$. As we shall see below, however, it is also possible to characterize a solution $u$ of I and II by means of a system of linear inequalities. It is now convenient to consider, for any given point $v$ of the space, the set

$$N(v) = \{ w \in K : (\nabla F(v) \mid v - w) \geq 0 \}$$

of all $w$ of $K$ which are seen backward from $v$.
Note that "backward" here doesn't necessarily mean "direction $w - v$ of non-increasing $F$" : it simply denotes a direction pointing to the opposite half space with respect to $\nabla F(v)$. Clearly, a vector $u$ minimizes $F$ on $K$ if and only if $u$ is seen backward from all points $v$ of $K$, i.e.,

$$u \in \bigcap_{v \in K} N(v) .$$

Therefore, problem I is equivalent to

III. $u \in K : \quad (\nabla F(v) \mid v - u) \geq 0 \quad \text{for all } v \in K ,$

which is indeed a system of linear inequalities in $u$.
Remark 4 : While it is not immediately recognizable that the set of all solutions of problem II is convex, this property is again apparent for problem III above, as it was for the direct minimum problem I, even if $\nabla F$ is replaced in III by any map $A$ of $E^n$ into itself.
4. The fixed-point formulation

Let $u$ be a vector of $K$ that minimizes $F$ on $K$ and let us assume that $\nabla F(u) \neq 0$. If we move backward from $u$ to some vector $u - \rho \nabla F(u)$, $\rho > 0$, then we leave the convex $K$ normally to a supporting hyperplane. The vector $u - \rho \nabla F(u)$ cannot stay in $K$, because that would contradict the fact that $F$ attains its minimum at $u$ on $K$. Therefore, if we now project the vector $u - \rho \nabla F(u)$ on $K$ we fall again in the point $u$ we started with, that is, we have

IV. $u = P_K (u - \rho \nabla F(u)) ,$

where $P_K$ denotes the minimum distance projection onto $K$. This amounts to say that $u$ is a fixed-point of the map

$$P_K (I - \rho \nabla F) , \qquad I \equiv \text{identity map in } E^n .$$
Note that if $\nabla F(u) = 0$, then IV reduces to $u = P_K u$, a trivial consequence of $u \in K$.

Let us suppose now that $u$ is a fixed-point of the map $P_K (I - \rho \nabla F)$ and let us show that then $u$ minimizes $F$ on $K$. Let us distinguish two cases :

$u - \rho \nabla F(u) \in K$ : then, $u = u - \rho \nabla F(u)$, hence $\nabla F(u) = 0$, which implies that $u$ minimizes $F$ ;

$u - \rho \nabla F(u) \notin K$ : then, in particular, $\nabla F(u) \neq 0$ ; the projection $u$ of $u - \rho \nabla F(u)$ on $K$ will certainly be on the boundary of $K$, the hyperplane at $u$ normal to $\nabla F(u)$ will be a support hyperplane of the convex set $K$ and $K$ will be entirely contained in the forward half space at $u$ : again, the conclusion is that $u$ minimizes $F$ on $K$.

Remark 5 : The fixed-point formulation IV suggests an iterative algorithm for the search of a minimizing $u$, namely

IV$_n$. $u_{n+1} = P_K (u_n - \rho \nabla F(u_n)) ,$

which can be expected to yield a convergent sequence of approximate $u_n$, provided the map $P_K (I - \rho \nabla F)$ is a contraction on $K$. Algorithm IV$_n$ is well known in optimization theory as a "projected gradient" method. We shall come back to this point in Chapter 2.
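The iteration of Remark 5 can be tried out on a small example. The following is only a sketch under assumptions that are not in the lectures: the space is the plane, $K$ is the unit square $[0,1]^2$ (so that the projection is a componentwise clipping), $F(v) = (v_1 - 2)^2 + (v_2 + 1)^2 + v_1 v_2$ is a convex function chosen for illustration, and the step $\rho = 0.2$ and the helper names are hypothetical.

```python
def project_box(v, lo=0.0, hi=1.0):
    """Minimum-distance projection P_K onto the box K = [lo, hi]^n."""
    return [min(max(x, lo), hi) for x in v]


def grad_F(v):
    """Gradient of the illustrative convex function
    F(v) = (v1 - 2)^2 + (v2 + 1)^2 + v1*v2."""
    v1, v2 = v
    return [2.0 * (v1 - 2.0) + v2, 2.0 * (v2 + 1.0) + v1]


def projected_gradient(u0, rho=0.2, tol=1e-12, max_iter=10_000):
    """Iterate u_{n+1} = P_K(u_n - rho * grad F(u_n)) until the step is tiny."""
    u = list(u0)
    for _ in range(max_iter):
        g = grad_F(u)
        u_next = project_box([ui - rho * gi for ui, gi in zip(u, g)])
        if max(abs(a - b) for a, b in zip(u, u_next)) < tol:
            return u_next
        u = u_next
    return u


u = projected_gradient([0.5, 0.5])
print(u)
```

For this particular $F$ the unconstrained minimum lies outside the square, and the iterates settle at the corner-side point $(1, 0)$, where the weak formulation II can be checked by hand: $\nabla F(1,0) = (-2, 3)$ and $-2(v_1 - 1) + 3 v_2 \geq 0$ on the square.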
5. The epigraph formulation

All the characterizations of a minimizing vector $u$ discussed so far require the function $F$ be differentiable. Even if $F$ is not differentiable, however, it is possible to give a characterization of $u$ by means of a system of inequalities, which involve the epigraph of $F$, that is, the subset

$$\operatorname{epi} F = \{ [v, \beta] : \beta \geq F(v) \}$$

of the product space $E^n \times \mathbb{R}$. In fact, let $\tilde K$ be the intersection of epi $F$ with the "cylinder" $K \times \mathbb{R}$ of the space $E^n \times \mathbb{R}$, i.e.,

$$\tilde K = \{ [v, \beta] : v \in K , \ \beta \geq F(v) \} .$$

Clearly, $u$ minimizes $F$ on $K$ if and only if $[u, F(u)]$ minimizes the function

$$\Phi : [v, \beta] \mapsto \beta$$

on the convex set $\tilde K$. Note that the level sets of $\Phi$,

$$\{ [w, \gamma] : \gamma \leq \beta \} ,$$

are the half spaces of $E^n \times \mathbb{R}$.

The convex function $\Phi$ is clearly differentiable on $E^n \times \mathbb{R} \cong E^{n+1}$, with a constant gradient given by

$$\nabla \Phi ([v, \beta]) = [0, 1]$$

for all $[v, \beta] \in E^n \times \mathbb{R}$, that is,

$$(\nabla \Phi ([v, \beta]) \mid [w, \gamma]) = \gamma .$$

Therefore, $[u, F(u)]$ minimizes the function $\Phi$ on the convex set $\tilde K$ if and only if $u$ is a solution of the system of inequalities

V. $u \in K : \quad \beta - F(u) \geq 0 \quad \text{for all } [v, \beta] \in \tilde K .$

The equivalence of our original minimum problem I with problem V above can be also checked by a direct computation (see Section 9).
6. Minimum problems in normed spaces

In this section we shall prove the equivalence of problems I-III of Sections 1-3 in an infinite dimensional normed space framework. Let $X$ be a real normed space, $X^*$ the dual of $X$, $(v^*, v)$ the pairing between $v^* \in X^*$ and $v \in X$. Let

$$F : X \to \mathbb{R}$$

be a convex function that we shall assume to be (Gateaux) differentiable, with a differential

$$DF : X \to X^*$$

given by

$$(DF(u), v) = \frac{d}{dt} F(u + tv) \Big|_{t=0} .$$

We have the following
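PROPOSITION 1 : Let $F$ be a differentiable convex function on $X$ and $K$ a convex subset of $X$. Then, I, II and III below are equivalent :

I. $u \in K : \quad F(u) \leq F(v) \quad \forall v \in K$

II. $u \in K : \quad (DF(u), v - u) \geq 0 \quad \forall v \in K$

III. $u \in K : \quad (DF(v), v - u) \geq 0 \quad \forall v \in K .$

Furthermore, the set of all solutions $u$ is convex (possibly empty) and, if $K$ is closed, also closed.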
Proof : I $\Rightarrow$ II : we use the definition of $DF$ : in fact, if $u$ minimizes $F$ on $K$, then for each $v \in K$, $t = 0$ minimizes the real function $t \mapsto F(u + t(v - u))$ on an interval $[0, \delta)$, $\delta > 0$, therefore we must have

(1) $\frac{d}{dt} F(u + t(v - u)) \Big|_{t=0} = (DF(u), v - u) \geq 0 .$

[Note that we have not used the convexity of $F$, nor even that $F$ attains a global minimum at $u$ : the proof above shows indeed the necessity of II for any differentiable $F$ having a local minimum at $u$ on $K$.]

II $\Rightarrow$ I : it is an immediate consequence of the following property of a differentiable convex $F$ :

(2) $F(v) \geq F(u) + (DF(u), v - u) , \qquad u, v \in X .$

[This can be proved as follows : by the convexity of $t \mapsto F(u + tw)$, the derivative appearing in the definition of $DF$ is the non-increasing limit of the differential quotient of $F$ as $t$ decreases to 0 : hence, for all $t > 0$ and all $u, w \in X$ we have

$$F(u + tw) - F(u) \geq t \, (DF(u), w) ,$$

which for $t = 1$ and $w = v - u$ gives (2) above.]

II $\Rightarrow$ III : it is a consequence of the following basic property of $DF$ : $DF : X \to X^*$ is a monotone operator, i.e.,

(3) $(DF(u) - DF(v), u - v) \geq 0 , \qquad u, v \in X .$

[(3) can be easily proved by writing inequality (2) above for the given vectors $u$ and $v$ and also for the interchanged pair $v, u$ and then adding up the two inequalities obtained.]

III $\Rightarrow$ II : it is a consequence of the continuity of $t \mapsto \frac{d}{dt} F(u + tw)$, hence also of

$$t \mapsto (DF(u + tw), w) , \qquad u, w \in X ,$$

which is a well known elementary property of differentiable convex functions on the real line. In fact, we can replace the vector $v$ in III with the vector $u + t(v - u)$, that belongs to $K$ for all $0 \leq t \leq 1$, and we get

$$(DF(u + t(v - u)), v - u) \geq 0 ,$$

which to the limit as $t \downarrow 0$ yields II, by the continuity of $t \mapsto (DF(u + tw), w)$ at $t = 0$, with $w = v - u$.

Remark 6 : It should be remarked that only the right side differentiability of $F$ at the point $u \in K$ along any vector $v - u$, $v \in K$, has been used in the proof above. The differentiability of $F$ at $u$ is required, instead, in the following corollary of Proposition 1.

Corollary of Proposition 1 : Under the assumptions of Proposition 1, a vector $u$ which belongs to the interior of $K$ minimizes the functional $F$ on $K$ if and only if $u$ is a solution of the problem

$$DF(u) = 0 .$$

Proof : See Remark 3 of Section 2.
Both properties (2) and (3) of $DF$, on which the proof of Proposition 1 has been based, characterize the differential of a convex function. We have in fact the following lemma (see also Remark 11 of Chapter 3) :

LEMMA 1 : Let $F$ be a differentiable function on $X$, with $DF : X \to X^*$. Then (i), (ii) and (iii) below are equivalent :

(i) $F$ is convex

(ii) $F(v) \geq F(u) + (DF(u), v - u) , \qquad u, v \in X$

(iii) $DF$ is monotone, i.e., $(DF(u) - DF(v), u - v) \geq 0 , \qquad u, v \in X .$

Proof : The implications (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) have been shown in the proof of Proposition 1. The proof of (iii) $\Rightarrow$ (ii) is based on the formula

$$F(v) - F(u) = \int_0^1 (DF(u + t(v - u)), v - u) \, dt$$

that is obtained by integrating the function of $t$

$$\frac{d}{dt} F(u + t(v - u)) = (DF(u + t(v - u)), v - u)$$

from 0 to 1 and then applying the mean value theorem. We find indeed, for some $\tau \in [0, 1]$,

$$F(v) - F(u) = (DF(u), v - u) + (DF(u + \tau (v - u)) - DF(u), v - u) \geq (DF(u), v - u) ,$$

since $(DF(u + \tau (v - u)) - DF(u), v - u) \geq 0$ by the monotonicity of $DF$.

(ii) $\Rightarrow$ (i) : Let

$$u = \lambda v_1 + (1 - \lambda) v_2 , \qquad v_1, v_2 \in X , \quad 0 < \lambda < 1 .$$

By (ii) we have

$$F(v_1) \geq F(u) + (DF(u), v_1 - u) , \qquad F(v_2) \geq F(u) + (DF(u), v_2 - u) .$$

By multiplying the first of these inequalities by $\lambda$ and the latter one by $(1 - \lambda)$ and adding up, we find

$$\lambda F(v_1) + (1 - \lambda) F(v_2) \geq F(u) + (DF(u), \lambda v_1 + (1 - \lambda) v_2 - u) = F(\lambda v_1 + (1 - \lambda) v_2) ,$$

which is the convexity inequality.
7. Monotone operators and variational inequalities : the linearization lemma

Any problem such as II of Proposition 1, that is

II. $u \in K : \quad (Au, v - u) \geq 0 \quad \forall v \in K$

with $K$ a convex subset of the normed space $X$ and $A$ a map of $K$ into $X^*$, is called a variational inequality.

As we said in the introduction and the discussion up to now should have shown, variational inequalities involving monotone mappings arise naturally in connection with the minimization of a convex functional subject to convex constraints and share indeed many important properties with these problems, even if the map $A$ is not the differential of a convex functional. For instance, whatever the map $A$ is, any local solution of II, i.e., a vector $u$ of $K$ such that for some $\delta > 0$

$$(Au, v - u) \geq 0 \quad \text{for all } v \in K \text{ with } \| v - u \| < \delta ,$$

is actually a global solution of II. [This can be seen by replacing the vector $v$ in the inequality above with the vector $u + \varepsilon (v - u)$, that belongs to $K$ and satisfies $\| \varepsilon (v - u) \| < \delta$ for $\varepsilon > 0$ small enough.]
Moreover, the solutions of variational inequalities involving any monotone mapping having some mild continuity property can still be characterized, like the solutions of convex minimum problems, by means of a system of infinitely many linear inequalities. In fact, if we inspect the proof of the equivalence II $\Leftrightarrow$ III of Proposition 1, we note that only the following properties of the map $A = DF$ have been used (see also Remark 6) : $A$ is monotone, i.e.,

$$(Au - Av, u - v) \geq 0 \qquad \forall u, v \in X$$

and hemicontinuous, i.e.,

$$t \mapsto (A(u + tw), w) \ \text{ is continuous at } 0^+ , \qquad \forall u, w \in X .$$

Therefore, we can state the following basic lemma, which is essentially due to G. J. Minty [1] (see also F. E. Browder [8]) :

Linearization Lemma : Let $A$ be a monotone and hemicontinuous map of the normed space $X$ to its dual $X^*$. Then, for any convex subset $K$ of $X$, problems II and III below are equivalent :

II. $u \in K : \quad (Au, v - u) \geq 0 \quad \forall v \in K$

III. $u \in K : \quad (Av, v - u) \geq 0 \quad \forall v \in K .$

Corollary : The common set of all solutions $u$ of problems II and III above is convex : it is also closed, provided $K$ is closed.
Perhaps the most important consequence of this "linearization" of problem II is that the inequalities in III are stable under limits in $u$ in the weak topology of $X$, contrary to what occurs for inequalities II, where we can take weak limits in $u$ only if $A$ has some compactness property. An essential use of this property of the equivalent linearized problem III will be made in Chapter 3, when we shall deal with the existence and stability of the solutions of variational inequalities such as II.

Remark 7 : If the solution $u$ is an interior point of $K$, for example if $K = X$, then II reduces to

$$u \in K : \quad (Au, w) = 0 \quad \forall w \in X ,$$

which is the weak form of the equation

$$Au = 0$$

(cf. the Corollary of Proposition 1). Even in this case, however, the linearization lemma is meaningful, for it states the equivalence of this (non-linear) equation with the system of linear inequalities satisfied by the solution $u$ :

$$(Av, v - u) \geq 0 \quad \forall v \in X .$$
Remark 8 : An interesting result due to T. Kato [1] affirms that a monotone hemicontinuous map on a convex domain of a Banach space $X$ is always demicontinuous, i.e., continuous from the strong topology of $X$ to the weak* topology of $X^*$ (this property is perhaps less surprising if we think of the case $A = DF$, $F$ being a differentiable convex functional). Let us remark, however, that what is really required to have the equivalence II $\Leftrightarrow$ III is the monotonicity and hemicontinuity of $A$ only on the set $K$ and not on the whole of $X$, and hemicontinuity on a convex subset $K$ of $X$ is in general a weaker property than demicontinuity. For the boundedness properties of monotone operators see also F. E. Browder [3] and R. T. Rockafellar [4].

Remark 9 : If $A = DF$, $F$ a differentiable convex functional on $X$, then for any finite set

$$u_1, u_2, \ldots, u_m , \qquad u_{m+1} = u_1 ,$$

of vectors of $X$, we have

$$F(u_{i+1}) - F(u_i) \geq (Au_i, u_{i+1} - u_i) , \qquad i = 1, \ldots, m$$

(see (2) of Section 6) and adding up all these inequalities we find

$$\sum_{i=1}^{m} (Au_i, u_{i+1} - u_i) \leq 0 .$$

Therefore, not only the monotonicity condition is satisfied by the map $A = DF$, indeed a whole family of "cyclic" inequalities are also satisfied, each one corresponding to a finite subset of $X$ : the map $DF$ is cyclically monotone. As Rockafellar has shown, this property is the basic one occurring in the characterization of the monotone mappings which are the differential of convex functionals, see R. T. Rockafellar [1].

8. Variational inequalities and fixed-points
Let us come back now to the relation between variational inequalities and fixed-points. Even if the reduction of a general variational inequality to a fixed-point statement can be realized in any smoothly normed Banach space (see Remark 13 below), we shall assume for the sake of simplicity that our problem takes place in a Hilbert space framework. We shall make use of the following tools, both of which depend on the specific inner product of the given Hilbert space $V$ and not merely on the topology induced by it :

(i) the duality Riesz isomorphism $J$ of $V$ onto $V^*$

(ii) the weak characterization of the Riesz projection $P_K$ on the convex set $K$.

The Riesz isomorphism $J$ : If $V$ is a (real) Hilbert space, $V^*$ its dual, $(v^*, v)$ the pairing between $v \in V$ and $v^* \in V^*$ and $(u \mid v)$ the inner product in $V$, then

$$J : V \to V^*$$

is the map defined by the identity

$$(Ju, v) = (u \mid v) , \qquad u, v \in V .$$

The map $J$ is an (isometric) isomorphism of $V$ onto $V^*$. We can use the inverse of $J$ to represent any given map

$$A : V \to V^*$$

by the map

$$\mathcal{A} = J^{-1} A : V \to V .$$
Remark 10 : If $A = DF$, $F$ being a differentiable real valued functional on $V$, then $\mathcal{A} = J^{-1} A$ is the gradient $\nabla F$ of $F$.

The Riesz projection $P_K$ : If $K$ is a convex subset of $V$ and $z \in V$, the vector

$$u = P_K z$$

is defined to be, if it exists, the unique solution of the minimum problem

$$u \in K : \quad \| u - z \| \leq \| v - z \| \quad \forall v \in K ,$$

where $\| w \| = (w \mid w)^{1/2}$. It is well known, and it can be proved in an elementary way by using the parallelogram identity in $V$, that any vector $z$ of $V$ has a projection $u = P_K z$ on $K$, provided $K$ is closed.

It is also clear that a vector $u$ is the solution of the minimum problem above if and only if $u$ is the solution of the problem

$$u \in K : \quad \tfrac{1}{2} \| u - z \|^2 \leq \tfrac{1}{2} \| v - z \|^2 \quad \forall v \in K .$$

Lemma 2 : Let $K$ be a convex subset of a Hilbert space $V$. Then, given $z \in V$, we have

$$u = P_K z$$

if and only if

$$u \in K : \quad (u - z \mid v - u) \geq 0 \quad \forall v \in K .$$

Proof : The functional

$$F(v) = \tfrac{1}{2} \| v - z \|^2$$

is differentiable on $V$, with

$$DF(v) = J(v - z)$$

(that is, $\nabla F = I - z$, $I \equiv$ identity of $V$). Therefore,

$$(DF(u), v - u) = (J(u - z), v - u) = (u - z \mid v - u)$$

and the lemma follows as a special case of the equivalence I $\Leftrightarrow$ II of Proposition 1.
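The weak characterization of $P_K$ turns out to be useful to prove that $P_K$ does not increase distances, a property we shall make use of later on :

Corollary of Lemma 2 : $P_K$ is non-expansive, i.e.,

$$\| P_K z_1 - P_K z_2 \| \leq \| z_1 - z_2 \| , \qquad z_1, z_2 \in V .$$

Proof : Let us write the weak characterizations of $u_1 = P_K z_1$ and $u_2 = P_K z_2$,

$$(u_1 - z_1 \mid v - u_1) \geq 0 , \qquad (u_2 - z_2 \mid v - u_2) \geq 0 , \qquad \forall v \in K ,$$

and replace $v = u_2$ in the first inequality, $v = u_1$ in the latter one. Adding up, we then find

$$(u_1 - z_1 - u_2 + z_2 \mid u_2 - u_1) \geq 0 ,$$

that is

$$(u_1 - u_2 \mid u_1 - u_2) \leq (z_1 - z_2 \mid u_1 - u_2) ,$$

hence, by Schwarz inequality,

$$\| u_1 - u_2 \| \leq \| z_1 - z_2 \| .$$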
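Both the weak characterization of Lemma 2 and the non-expansiveness of $P_K$ can be spot-checked numerically. The sketch below assumes, for illustration only, that $V$ is the Euclidean plane and $K$ the closed unit ball, for which the projection has the explicit form $z / \max(1, \|z\|)$; the sampling scheme, the seed and all names are hypothetical choices.

```python
import math
import random


def project_ball(z):
    """Minimum-distance projection onto the closed unit ball K of the plane."""
    r = math.hypot(z[0], z[1])
    return z if r <= 1.0 else (z[0] / r, z[1] / r)


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1])


rng = random.Random(0)  # fixed seed so that the run is reproducible


def sample():
    return (rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0))


worst_vi = float("inf")    # smallest value of (u - z | v - u) seen
worst_gap = float("-inf")  # largest value of |P z1 - P z2| - |z1 - z2| seen
for _ in range(2000):
    z1, z2 = sample(), sample()
    u1, u2 = project_ball(z1), project_ball(z2)
    v = project_ball(sample())  # an arbitrary point of K
    worst_vi = min(worst_vi, dot(sub(u1, z1), sub(v, u1)))
    gap = math.hypot(*sub(u1, u2)) - math.hypot(*sub(z1, z2))
    worst_gap = max(worst_gap, gap)

print(worst_vi, worst_gap)
```

Up to rounding errors, `worst_vi` should stay non-negative and `worst_gap` non-positive, in agreement with Lemma 2 and its corollary.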
Now we are ready to prove the fixed-point characterization of a variational inequality sketched in Section 4.

PROPOSITION 2 : Let $K$ be a convex set of a Hilbert space $V$ and $A$ a map of $K$ into $V^*$. Then, II and IV below are equivalent :

II. $u \in K : \quad (Au, v - u) \geq 0 \quad \forall v \in K$

IV. $u = P_K (I - \rho J^{-1} A) u , \qquad \rho > 0 ,$

where $I$ is the identity map of $V$, $J$ the canonical isomorphism of $V$ onto $V^*$.

Remark 11 : In terms of the map $\mathcal{A} = J^{-1} A$, II and IV above can be written respectively as

$$u \in K : \quad (\mathcal{A} u \mid v - u) \geq 0 \quad \forall v \in K$$

and

$$u = P_K (I - \rho \mathcal{A}) u .$$

Proof of Proposition 2 : It suffices to write the weak characterization of $u = P_K z$ with $z = u - \rho J^{-1} A u$. In fact, we find

$$u \in K : \quad (u - (u - \rho J^{-1} A u) \mid v - u) \geq 0 \quad \forall v \in K ,$$

which is to say, since $\rho > 0$,

$$u \in K : \quad (J^{-1} A u \mid v - u) \geq 0 \quad \forall v \in K ,$$

where

$$(J^{-1} A u \mid v - u) = (Au, v - u) .$$
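Proposition 2 does not require $A$ to be a differential, and the fixed-point form IV can be iterated in the same way as the projected gradient method. The sketch below rests on assumptions that are not in the text: $V$ is the Euclidean plane (so that $J$ is the identity), $K = [0,1]^2$, and $A(u) = Mu + q$ with the non-symmetric matrix $M = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$, so that $A$ is monotone but is not the gradient of any functional. The step $\rho = 0.5$ is an illustrative choice for which the map $P_K(I - \rho A)$ happens to be a contraction in this example.

```python
def A(u):
    """A(u) = M u + q with M = [[1, 1], [-1, 1]]: monotone, not a gradient."""
    return (u[0] + u[1] - 2.0, -u[0] + u[1] + 0.5)


def project_box(v):
    """Projection onto K = [0, 1]^2 in the Euclidean inner product."""
    return tuple(min(max(x, 0.0), 1.0) for x in v)


def solve_vi(u0, rho=0.5, tol=1e-13, max_iter=10_000):
    """Iterate u <- P_K(u - rho * A(u)) until successive iterates agree."""
    u = tuple(u0)
    for _ in range(max_iter):
        Au = A(u)
        u_next = project_box((u[0] - rho * Au[0], u[1] - rho * Au[1]))
        if max(abs(u_next[0] - u[0]), abs(u_next[1] - u[1])) < tol:
            return u_next
        u = u_next
    return u


u = solve_vi((0.0, 0.0))
print(u, A(u))
```

The limit can be checked by hand against II: at $u = (1, 1/2)$ one has $A(u) = (-1/2, 0)$, and $-\tfrac{1}{2}(v_1 - 1) \geq 0$ for every $v$ in the square.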
Remark 12 : The fixed-point characterization IV of problem II is not intrinsic : if we change the inner product in $V$ to an equivalent one, then the dual $V^*$ of $V$, hence also $A$ and problem II, does not change, whereas $P_K$ and $\mathcal{A} = J^{-1} A$ will change. The choice of the inner product will also affect the range of values of $\rho$ which make the map $P_K (I - \rho \mathcal{A})$ a contraction in $V$ (see Remark 5). We shall come back to this point in the following chapter, when we shall discuss the iterative methods of solution of problem II.
Remark 13 : The weak characterization of the Riesz projection $P_K$, hence also Proposition 2 above, holds in any Banach space $X$ whose norm is Gateaux differentiable (outside the origin). In that case we can take $J$ to be any duality mapping of $X$, that is,

$$(Ju, u) = \| Ju \| \, \| u \| , \qquad \| Ju \| = \varphi (\| u \|) ,$$

where $\varphi$ is a suitable continuous increasing function of the norm (we have chosen above $\varphi (r) = r$). For more details on duality mappings see A. Beurling - A. E. Livingstone [1], E. Asplund [1], F. E. Browder [4].

Remark 14 : Proposition 2 shows that a variational inequality can always be reduced to a fixed-point problem. The converse reduction, however, is also possible : indeed, a vector $u$ is a fixed point of a map $U : K \to K$ if and only if $u$ is a solution of the variational inequality

$$u \in K : \quad (\mathcal{A} u \mid v - u) \geq 0 \quad \forall v \in K ,$$

where $\mathcal{A} = I - U$. In fact, by replacing $v = Uu$ in the inequality above, we obtain

$$- \| u - Uu \|^2 \geq 0 , \quad \text{hence } u = Uu .$$

Let us also remark that the same conclusion can be drawn if $U$ is a map of $K$ into $V$, such that for any $u \in K$, there exists a vector $v \in K$ with

$$Uu - u = \lambda (v - u) , \qquad \lambda > 0 .$$

The fixed points of these so-called inward mappings, as well as those of outward mappings ($\lambda < 0$ in the condition above) have been investigated by using the variational inequality approach by F. E. Browder [11], [14].
Remark 15 : Proposition 2 is essentially due to H. Brezis [2]. It should be remarked, however, that the connection between variational inequalities and fixed points was already present in the theory as it had been previously developed by G. Stampacchia [1] and J. L. Lions - G. Stampacchia [1] for bilinear forms in Hilbert spaces. The existence results obtained by these authors were based, in fact, on iterative methods and reduction to a fixed point statement for a suitable contractive mapping. We shall come back to this case in more detail in the following Chapter 2.
9. Minimization of non-differentiable functionals and mixed variational inequalities

Let us go back to the problem of minimizing a convex functional $F$ on a convex subset $K$ of a normed space $X$, dropping now the assumption, made in Section 6, that $F$ is differentiable. Indeed, many variational problems arising from the applications, as we shall see later on, lead to the minimization of functionals which appear to be the sum of a differentiable $F$ and a non-differentiable $G$. By allowing the functional $G$ to take the value $+\infty$, we can also assume that the indicator function $\theta_K$ of the constraint set $K$ has been preliminarily incorporated into the non-differentiable component of the functional at hand. Let us recall that $\theta_K$ is the functional on $X$ given by

$$\theta_K (v) = 0 \ \text{ if } v \in K , \qquad \theta_K (v) = +\infty \ \text{ if } v \notin K .$$

Clearly, minimizing $F + G$ on $X$ is the same thing as minimizing $F + G$ on the subset of $X$,

$$K \equiv \operatorname{dom} G = \{ v : G(v) < +\infty \}$$

(dom $G$ is also called the effective domain of $G : X \to (-\infty, +\infty]$).
Proposition 1 of Section 6 can then be generalized as follows.

PROPOSITION 3 : Let $F : X \to \mathbb{R}$ be a differentiable convex functional and $G : X \to (-\infty, +\infty]$ be convex, $G \not\equiv +\infty$. Then, I' and II' below are equivalent :

I'. $u \in X : \quad F(u) + G(u) \leq F(v) + G(v) \quad \forall v \in X$

II'. $u \in X : \quad (DF(u), v - u) \geq G(u) - G(v) \quad \forall v \in X .$

Proof (see also M. Sibony [1]) : If I' holds, for every $v$ of $X$ and all $t$, $0 < t < 1$, we have

$$F(u) + G(u) \leq F(u + t(v - u)) + G(u + t(v - u)) \leq F(u + t(v - u)) + (1 - t) G(u) + t G(v) ,$$

where the convexity of $G$ has been used. Since $G(u) < +\infty$, we deduce from the inequality above

$$t^{-1} [ F(u + t(v - u)) - F(u) ] \geq G(u) - G(v) ,$$

that gives II' as $t \downarrow 0$, by the differentiability of $F$.

To prove that II' in turn implies I', it suffices to use the inequality

$$F(v) \geq F(u) + (DF(u), v - u) ,$$

which is a consequence of the convexity of $F$.

Remark 16 : As the proof above shows, a vector $u$ minimizing $F + G$ always satisfies the inequality II' even if the differentiable functional $F$ is not convex.
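A one-dimensional instance may help to see inequality II' at work. The sketch below assumes $X = \mathbb{R}$, $F(x) = \tfrac{1}{2}(x - a)^2$ and $G(x) = |x|$, an illustrative pair that is not taken from the lectures; for this pair the minimizer of $F + G$ is known in closed form (the so-called soft threshold of $a$), and II' is tested on a finite sample of points $v$ with a small rounding tolerance. All function names are hypothetical.

```python
def soft_threshold(a):
    """Minimizer over the real line of F + G,
    with F(x) = (x - a)^2 / 2 and G(x) = |x|."""
    if a > 1.0:
        return a - 1.0
    if a < -1.0:
        return a + 1.0
    return 0.0


def satisfies_mixed_inequality(u, a, tests, eps=1e-12):
    """Check (DF(u), v - u) >= G(u) - G(v) on the sample, with DF(u) = u - a."""
    return all((u - a) * (v - u) >= abs(u) - abs(v) - eps for v in tests)


tests = [k / 10.0 for k in range(-50, 51)]  # sample points v in [-5, 5]
checks = {a: satisfies_mixed_inequality(soft_threshold(a), a, tests)
          for a in (2.5, 0.3, -1.7)}
# The minimizer of F alone, u = a, ignores G and should violate II'.
wrong = satisfies_mixed_inequality(2.5, 2.5, tests)
print(checks, wrong)
```

For $a = 2.5$, for instance, $u = 1.5$ and II' reads $-(v - 1.5) \geq 1.5 - |v|$, that is $|v| \geq v$, which holds for every real $v$.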
Another way of looking at the inequality II' above is to regard it as a unification of the direct and weak formulations of minimum problems. In fact, while II' obviously reduces to the problem

$$u \in X : \quad G(u) \leq G(v) \quad \forall v \in X$$

when $F \equiv 0$, on the other hand, it is easy to verify that II' is equivalent to the variational inequality

$$u \in K : \quad (DF(u), v - u) \geq 0 \quad \forall v \in K$$

when $G$ is taken to be the indicator function of the set $K$.
Similarly, we can generalize both the direct minimum problems and the variational inequalities discussed so far, by introducing inequalities of the form

II''. $u \in X : \quad (Au, v - u) \geq F(u) - F(v) \quad \forall v \in X ,$

with $A : X \to X^*$ and $F : X \to (-\infty, +\infty]$.

It turns out, however, that such "mixed" variational inequalities are only apparently more general than the original ones. In fact, by making use of the epigraph formulation discussed in Section 5, we can equivalently write problem II'' above as a variational inequality of the kind considered so far. In fact, let us consider now the product space

$$\tilde X = X \times \mathbb{R}$$

and the inequality :

$\widetilde{\mathrm{II}}$. $\tilde u \in \tilde K : \quad (\tilde A \tilde u, \tilde v - \tilde u) \geq 0 \quad \forall \tilde v \in \tilde K ,$

where $\tilde K$ is the epigraph of $F$,

$$\tilde K = \{ [v, \beta] \in X \times \mathbb{R} : \beta \geq F(v) \} ,$$

and $\tilde A$ is the map $A \times 1$ of $\tilde X$ into its dual $\tilde X^* = X^* \times \mathbb{R}$, i.e.,

$$\tilde A ([v, \beta]) = [Av, 1] .$$
The following lemma holds :

Lemma 3 : Let $A : X \to X^*$, $F : X \to (-\infty, +\infty]$, $F \not\equiv +\infty$, and $\tilde K$, $\tilde A$ as above. Then, a vector $u$ of $X$ is a solution of problem II'' if and only if the vector $\tilde u = [u, \alpha]$, with $\alpha = F(u)$, is a solution of problem $\widetilde{\mathrm{II}}$.

Proof : If $u$ is a solution of II'', then $F(u) < +\infty$ and, with $\tilde u = [u, \alpha]$, $\alpha = F(u)$, for every $\tilde v = [v, \beta]$ with $\beta \geq F(v)$ we have

$$0 \leq (Au, v - u) + F(v) - F(u) \leq (Au, v - u) + \beta - \alpha = (\tilde A \tilde u, \tilde v - \tilde u) ,$$

therefore $\widetilde{\mathrm{II}}$ holds. Conversely, if $\tilde u = [u, \alpha]$ is a solution of $\widetilde{\mathrm{II}}$, then for all $\tilde v = [v, \beta]$ with $\beta \geq F(v)$ [note that the inequality in II'' is trivially satisfied when $F(v) = +\infty$], we have

$$0 \leq (\tilde A \tilde u, \tilde v - \tilde u) = (Au, v - u) + \beta - \alpha .$$

Therefore, by taking $v = u$ and $\beta = F(u)$, we find, in particular, that $\alpha \leq F(u)$, hence $\alpha = F(u)$ and, moreover, for all $\tilde v = [v, \beta] \in \tilde K$,

$$0 \leq (Au, v - u) + \beta - F(u) ,$$

which yields II'' when $\beta = F(v)$.
Remark 17 : If we now assume that the vector $\tilde u = [u, \alpha]$ is a solution of the analog of problem $\widetilde{\mathrm{II}}$ with $\tilde K$ replaced by the intersection $\tilde K'$ of $\tilde K$ with a "strip" $X \times I$, where $I = [a, b]$ with $a \leq \inf F$, then we can still conclude that $u$ is a solution of the problem

$$u \in X : \quad (Au, v - u) \geq F(u) - F(v) \quad \forall v \in X , \ F(v) \leq b ,$$

and $\alpha = F(u)$. We shall make use of this fact later.
Remark 18 : A different approach to the minimization of non differentiable convex $F$ consists in making the subdifferential $\partial F$ of $F$ take the role of the differential $DF$. Let us recall that $\partial F$ is the, in general multivalued, mapping of $X$ into $2^{X^*}$, which associates with each $u$ of $X$ the set (possibly empty) of all subgradients of $F$ at $u$, i.e., of all $u^* \in X^*$ such that $u^*$ defines a support hyperplane of $F$ at the point $u$ (not necessarily a tangent one). In other words, for each $u$ of $X$,

$$\partial F(u) = \{ u^* \in X^* : F(v) \geq F(u) + (u^*, v - u) \quad \forall v \in X \} .$$

It should be noted, indeed, that all statements about the vector $u^* = DF(u)$ we have made so far, involve $u^*$ only as a subgradient of $F$. This approach leads naturally to variational inequalities involving multivalued monotone mappings. We refer to F. E. Browder [14], R. T. Rockafellar [6].
As general reference on the theory of convex functions, let us only mention h e r e R. T. Rockafellar
[ 11
J. J. Moreau
[I]
A. Ioffe
,
[77
- V.
, J. Stoer and C . Witzgall Tikhomirov [I]
.
Monotonicity properties of operators in Hilbert o r Banach spaces have been investigated R. I. KachurovskiL Ll]
, l2J
,[3]
M. M. Vainberg and R. I . Kachurovskii 1
111
l21 ,
E. H. Zarantonello [I]
, G . J. Minty
,F..E. . ~ r o w d e rs e e ref. quoted ih F.E.B., D51, T. Kato rl]
R . T. Rockafellar
3
]
and others. More references and a survey
of the theory and i t s applications can be found in R. I. Kachurovskii and
F . E. Browder
C3]
loc. cit.
Specific references to variational inequalities will be given in the following chapters.
CHAPTER 2

Some typical problems and existence theorems

1 : Examples
2 : Finite-dimensional and iterative existence theorems
3 : Variational inequalities for bilinear forms in Hilbert spaces
4 : Direct existence theorem
The existence results and the methods of approximation of the solutions of variational inequalities such as

II. $u \in K : \quad (Au, v - u) \geq 0 \quad \forall v \in K ,$

where $K$ is a convex subset of a normed space $X$ and $A$ a monotone map of $K$ into $X^*$, are essentially based on the characterizations of problem II we discussed in the preceding chapter. In fact, the three main ideas underlying this existence and approximation theory can be summarized as follows :

(i) Reduction of II to a fixed-point problem and application of a fixed-point theorem such as Brouwer's or Schauder's theorem or, more constructively, the contraction principle, which also yields an iterative algorithm for the approximation of the solution ;

(ii) Reduction of II to a direct minimum problem (when the map $A$ is the differential of some convex functional $F$), which makes it possible to apply the classical existence theorems of the calculus of variations. The solution can be evaluated by combining a finite-dimensional approximation of Ritz type together with some method of finite-dimensional convex optimization ;

(iii) Preliminary restriction of II to finite-dimensional subspaces of $X$, where a solution can be found by applying any one of the foregoing methods, and then linearization of the problem to get a solution in the whole space. The constructive aspect of this approach comes out again from a combination of a finite-dimensional discretization of Ritz-Galerkin type and iterative algorithms or methods of convex optimization for the evaluation of the solution of the finite-dimensional approximate problem.

In the present chapter we shall present some of the existence and approximation results that can be found along the lines mentioned in (i) and (ii) above, while a more general existence theorem, based on the Linearization Lemma of Chapter 1, and a description of the discretization procedures will be postponed to the following Chapter 3.

Moreover, as a motivation to all problems discussed so far in an abstract framework, we shall describe below some typical examples of minimum problems and variational inequalities involving integral functionals and partial differential operators. Most of these problems arise in the mathematical description of the equilibrium of a physical system subject to unilateral constraints. As everywhere else in these lectures, we shall confine ourselves to problems of elliptic stationary type and we refer to the lectures of J. L. Lions and R. Glowinski at this Course for all that concerns variational inequalities of evolutive type.
1. Some variational inequalities

Everywhere in this section we shall denote by $\Omega$ a bounded open subset of the n-dimensional euclidean space $E^n$, by $\Gamma$ its boundary.

We shall begin our list of examples with a quite classical problem, that doesn't involve unilateral constraints.

Example 1 : The Dirichlet problem :

(1) $-\Delta u = f \ \text{ in } \Omega , \qquad u = 0 \ \text{ on } \Gamma ,$

where $-\Delta$ is the Laplace operator and $f$ is a given function on $\Omega$, say, $f \in L^2(\Omega)$. The variational (or weak) solution of (1) is the function $u$ that minimizes the energy integral

$$F(v) = \frac{1}{2} \int_\Omega |\nabla v|^2 \, dx - \int_\Omega f v \, dx$$

over the appropriate Sobolev space $H_0^1(\Omega)$ : this is, indeed, the classical Dirichlet principle.

[The space $H_0^1(\Omega)$ is made of all functions $v \in L^2(\Omega)$ whose distribution derivatives $\partial v / \partial x_i$, $i = 1, \ldots, n$, still belong to $L^2(\Omega)$ and whose trace on the boundary $\Gamma$ of $\Omega$ vanishes. With the norm

$$\| v \| = \left( \int_\Omega |\nabla v|^2 \, dx \right)^{1/2} ,$$

$H_0^1(\Omega)$ is a reflexive Banach space, whose dual, $H^{-1}(\Omega)$, can be identified with all distributions $T$ on $\Omega$ that can be (non uniquely) written as

$$T = f_0 + \sum_{i=1}^{n} \frac{\partial f_i}{\partial x_i} , \qquad f_0, f_1, \ldots, f_n \in L^2(\Omega) ,$$

the duality pairing between $v \in H_0^1(\Omega)$ and $T \in H^{-1}(\Omega)$ being given by

$$(T, v) = \int_\Omega f_0 v \, dx - \sum_{i=1}^{n} \int_\Omega f_i \frac{\partial v}{\partial x_i} \, dx . ]$$

The functional

$$F_0(v) = \frac{1}{2} \int_\Omega |\nabla v|^2 \, dx$$

is differentiable on $H_0^1(\Omega)$, with differential

$$-\Delta : H_0^1(\Omega) \to H^{-1}(\Omega)$$

given by

$$(-\Delta u, v) = \frac{d}{dt} F_0(u + tv) \Big|_{t=0} = \int_\Omega \nabla u \cdot \nabla v \, dx .$$

The identity above, indeed, provides the variational definition of the Laplace operator and the bilinear form at the right hand,

$$a(u, v) = \int_\Omega \nabla u \cdot \nabla v \, dx ,$$

is called the Dirichlet form.

The variational solution of the Dirichlet problem can then be characterized as the solution of the problem

$$u \in H_0^1(\Omega) : \quad a(u, v) = \int_\Omega f v \, dx \quad \forall v \in H_0^1(\Omega) ,$$

as it follows, for instance, from the Corollary of Proposition 1 of Chapter 1.

As for the existence of $u$, it suffices to show that the energy integral above attains its minimum in the space $H_0^1(\Omega)$, which can be done, for instance, by applying the general theorem on the existence of minima we shall give in Section 4 of the present Chapter.

Remark 1 : To replace the Laplace operator $-\Delta$, in the problem above, by any 2nd order elliptic partial differential operator of the form

$$L = - \sum_{i,j=1}^{n} \frac{\partial}{\partial x_j} \left( a_{ij}(x) \frac{\partial}{\partial x_i} \right) ,$$

where the $a_{ij}(x)$ are bounded measurable functions on $\Omega$ satisfying the ellipticity condition

$$\sum_{i,j=1}^{n} a_{ij}(x) \xi_i \xi_j \geq c \, |\xi|^2 , \qquad c > 0 , \quad \xi \in E^n , \ \text{a.e. in } \Omega ,$$

amounts only to replace the Dirichlet form $a(u, v)$ in problem (2) with the form

$$a(u, v) = \sum_{i,j=1}^{n} \int_\Omega a_{ij}(x) \frac{\partial u}{\partial x_i} \frac{\partial v}{\partial x_j} \, dx .$$

However, this generalized form is not the differential of the functional

$$\frac{1}{2} a(v, v)$$

unless it is symmetric, that is, the coefficients $a_{ij}$ satisfy

$$a_{ij}(x) = a_{ji}(x) \ \text{ a.e.} , \qquad i, j = 1, \ldots, n .$$

The proof of the existence of the solution of this generalized Dirichlet problem cannot be obtained, as before, by using the direct variational formulation. We must appeal, indeed, to a well known theorem of Lax and Milgram that will be recalled later.
The existence theorems for variational inequalities we are going to discuss in our lectures can be seen as a generalization of the Lax-Milgram theorem to problems involving unilateral constraints, and this is in fact how they were first obtained.

The simplest example of problems of this type is the following.

Example 2 : The capacity problem : Given a compact subset $E$ of $\Omega$, we look for the function $u$ that minimizes the Dirichlet integral $F_0(v)$ over the cone of all $v \in H_0^1(\Omega)$, such that $v \geq 1$ on $E$ in the sense of $H_0^1(\Omega)$ [we say that $v \geq 1$ on $E$ in the sense of $H_0^1(\Omega)$ if $v$ is the limit in the norm of $H_0^1(\Omega)$ of a sequence of smooth functions which are $\geq 1$ on $E$]. Therefore, our problem now is

(3) $u \in K : \quad F_0(u) \leq F_0(v) \quad \forall v \in K , \qquad K = \{ v \in H_0^1(\Omega) : v \geq 1 \text{ on } E \} .$

The solution $u$ is called the equilibrium potential of $E$ and the value of the minimum is the capacity of the set $E$ in $\Omega$.

By applying Proposition 1 of Chapter 1, now we find that the solution $u$ of (3) can be characterized by means of the variational inequality

(4) $u \in K : \quad a(u, v - u) \geq 0 \quad \forall v \in K ,$

where $a(u, v)$ is the Dirichlet form (2).
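A one-dimensional discretization gives a feeling for the capacity problem. The sketch below is built on assumptions that are not in the lectures: $\Omega = (0, 1)$, $E = [0.4, 0.6]$, a uniform mesh with 20 intervals and piecewise linear functions described by their nodal values, so that the Dirichlet integral becomes $\tfrac{1}{2} \sum_i (v_{i+1} - v_i)^2 / h$. The discrete problem is solved with the projected gradient iteration of Remark 5 of Chapter 1, taken in the Euclidean inner product of the nodal values, with the illustrative step $\rho = h/4$.

```python
N = 20                  # number of mesh intervals on (0, 1)
h = 1.0 / N
E_NODES = range(8, 13)  # nodes x_i = i*h lying in E = [0.4, 0.6]


def apply_A(u):
    """Gradient of the discrete Dirichlet integral at the interior nodes:
    (A u)_i = (2 u_i - u_{i-1} - u_{i+1}) / h."""
    Au = [0.0] * (N + 1)
    for i in range(1, N):
        Au[i] = (2.0 * u[i] - u[i - 1] - u[i + 1]) / h
    return Au


def project_K(v):
    """Projection onto K = {v : v_0 = v_N = 0, v_i >= 1 on E},
    componentwise because the inner product is the Euclidean one."""
    w = list(v)
    w[0] = w[N] = 0.0
    for i in E_NODES:
        w[i] = max(w[i], 1.0)
    return w


def energy(u):
    """Discrete Dirichlet integral F_0(u) = (1/2) * sum (u_{i+1} - u_i)^2 / h."""
    return 0.5 * sum((u[i + 1] - u[i]) ** 2 for i in range(N)) / h


def solve(rho=h / 4.0, tol=1e-13, max_iter=200_000):
    """Iterate u <- P_K(u - rho * A u) until successive iterates agree."""
    u = project_K([0.0] * (N + 1))
    for _ in range(max_iter):
        Au = apply_A(u)
        u_next = project_K([ui - rho * ai for ui, ai in zip(u, Au)])
        if max(abs(a - b) for a, b in zip(u, u_next)) < tol:
            return u_next
        u = u_next
    return u


u = solve()
capacity_value = energy(u)
print([round(x, 4) for x in u], round(capacity_value, 6))
```

In this setting the discrete equilibrium potential is piecewise linear: it rises linearly from 0 to 1 on $[0, 0.4]$, equals 1 on $E$ and falls linearly back to 0, so the minimum value of the discrete Dirichlet integral should be close to $\tfrac{1}{2}(1/0.4 + 1/0.4) = 2.5$.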
Remark 2. The variational inequality (4) is the only possible formulation of this capacity problem whenever $a(u, v)$ is the generalized Dirichlet form associated with a non-symmetric elliptic operator $L$ as in Example 1. In this case, (4) has to be taken as the definition of the equilibrium potential $u$ on $E$ relative to the operator $L$ in $\Omega$.
The capacity theory for non-symmetric second order elliptic p.d.o. was started by G. Stampacchia [1] and his variational inequality approach to this problem was extended to other variational problems with unilateral constraints, both of stationary and evolutive type, by J. L. Lions and G. Stampacchia [1]. Many other problems of this type arising in physics and engineering have been investigated since then by many authors in the light of the theory of variational inequalities. The main reference in this regard is the recent book of G. Duvaut and J. L. Lions [1].
Let us also mention that many unilateral problems of mechanics and hydrodynamics have also been investigated by J. J. Moreau [3], by using the methods of convex analysis.
Example 3: The "obstacle" problem. We now want to minimize the Dirichlet integral $F_0(v)$ over the cone of all functions $v$ of $H_0^1(\Omega)$ which are, a.e. in $\Omega$, $\ge$ a preassigned function $\psi$. The function $\psi$ will be called the obstacle and we assume that $\psi$ is such that the cone just defined is not empty. For instance we may have $\psi \in H^1(\Omega)$ [that is, $\psi \in L^2(\Omega)$ and all distribution derivatives $\psi_{x_i}$ of $\psi$ also belonging to $L^2(\Omega)$], the trace of $\psi$ on $\Gamma$ being $\le 0$ a.e.
The minimizing $u$ is the solution of the variational inequality
$$u \ge \psi \ \text{a.e. in } \Omega: \quad a(u, v - u) \ge 0 \quad \forall v \in H_0^1(\Omega), \ v \ge \psi \ \text{a.e. in } \Omega .$$
It can be shown that if $I$ is the closed subset of $\Omega$ where, formally, $u = \psi$, then the solution $u$ satisfies the conditions
$$u \ge \psi \ \text{and} \ -\Delta u \ge 0 \ \text{in } \Omega, \qquad u = \psi \ \text{a.e. on } I, \qquad -\Delta u = 0 \ \text{in } \Omega - I .$$
Thus, $u$ is super-harmonic all over $\Omega$ and harmonic outside the set $I$ where it "touches" the obstacle.
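The three conditions above can be observed on a small discrete model. The sketch below is only an illustration under stated assumptions: one space dimension, $\Omega = (0, 1)$, a uniform grid, the hypothetical obstacle $\psi(x) = 0.5 - 8(x - 0.5)^2$, and a projected Gauss-Seidel iteration, which is a standard method for the discrete problem but is not a method described in these lectures.

```python
def solve_obstacle_1d(psi, sweeps=20000, tol=1e-12):
    """Projected Gauss-Seidel for the discrete 1D obstacle problem with u = 0 at both ends."""
    n = len(psi)
    u = [max(p, 0.0) for p in psi]   # any starting vector above the obstacle will do
    u[0] = u[-1] = 0.0
    for _ in range(sweeps):
        change = 0.0
        for i in range(1, n - 1):
            # Unconstrained Gauss-Seidel value (discrete harmonicity), then
            # projection onto the constraint u_i >= psi_i.
            new = max(psi[i], 0.5 * (u[i - 1] + u[i + 1]))
            change = max(change, abs(new - u[i]))
            u[i] = new
        if change < tol:
            break
    return u

n = 40
x = [i / n for i in range(n + 1)]
psi = [0.5 - 8.0 * (xi - 0.5) ** 2 for xi in x]   # hypothetical obstacle, negative at the ends
u = solve_obstacle_1d(psi)

# residual[i-1] approximates h^2 * (-u'') at the interior node i.
residual = [2.0 * u[i] - u[i - 1] - u[i + 1] for i in range(1, n)]
contact = [i for i in range(1, n) if u[i] - psi[i] <= 1e-9]
```

In this model `contact` plays the role of the set $I$: the residual is non-negative everywhere (super-harmonicity) and vanishes at the nodes outside `contact`.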
A similar problem, which includes the capacity problem, consists in minimizing $F_0$ over all functions $v$ which are $\ge \psi$ only on a given compact subset $E$ of $\Omega$ (and now $v \ge \psi$ on $E$ has a meaning analogous to that of the condition $v \ge 1$ on $E$ in Example 2). If $E$ is an $(n-1)$-dimensional manifold in $\Omega$, the problem at hand may be called a "thin obstacle" problem.
Problems of this type were first considered by J. L. Lions and G. Stampacchia [1]. The regularity of the solution has also been investigated by many authors. See H. Lewy and G. Stampacchia [1], [2], [3], H. Brezis and G. Stampacchia [1], H. Lewy [1], [2], H. Brezis [5], G. Stampacchia [2], [3], [4]. An application to hydraulics has been recently given by C. Baiocchi [1].
Example 4: The "boundary obstacle" problem.
The problem now consists in minimizing the functional
$$F(v) = \frac{1}{2} \sum_{i=1}^{n} \int_\Omega \left| \frac{\partial v}{\partial x_i} \right|^2 dx - \int_\Omega f v \, dx ,$$
where $f$ is a given function on $\Omega$, say, $f \in L^2(\Omega)$, on the cone of all $v \in H^1(\Omega)$ such that $v \ge h$ a.e. on $\Gamma$, where $h$ is a preassigned function on $\Gamma$.
It can be seen that the associated variational inequality satisfied by the solution $u$ corresponds to a boundary value problem for the Laplacian $-\Delta$, with unilateral constraints on the boundary $\Gamma$. Formally, this problem can be stated as follows
$$-\Delta u = f \ \text{in } \Omega; \qquad u \ge h, \quad \frac{\partial u}{\partial n} \ge 0, \quad (u - h) \frac{\partial u}{\partial n} = 0 \ \text{on } \Gamma .$$
A detailed discussion of this example can be found in G. Duvaut and J. L. Lions [1]. Let us only remark here, following J. L. Lions, R. Glowinski and R. Tremolieres [1], that the conditions above can be interpreted as describing the stationary equilibrium of a fluid in a region $\Omega$ surrounded by a membrane $\Gamma$ that allows the fluid to come in and prevents it from leaving $\Omega$. If $u(x)$ is the pressure of the fluid inside $\Omega$ and $h(x)$ is the external pressure applied on the boundary $\Gamma$, then when $u = h$ on $\Gamma$ the fluid comes in, hence $\partial u / \partial n \ge 0$; on the other hand, if $u > h$, then the fluid is pushed out, however $\Gamma$ forbids the outcome, hence $\partial u / \partial n = 0$. The regularity of the solution $u$ has been studied, in particular, by H. Brezis - G. Stampacchia [1], H. Brezis [5], H. de Veiga [1], [2].
Example 5: A problem in nonlinear elasticity.
The energy integral
$$F(v) = \frac{1}{2} \sum_{i=1}^{n} \int_\Omega \left| \frac{\partial v}{\partial x_i} \right|^2 dx - \int_\Omega f v \, dx$$
now has to be minimized on the convex set of all $v \in H_0^1(\Omega)$ such that
$$|\operatorname{grad} v| \le 1 \quad \text{a.e. in } \Omega . \tag{17}$$
This is a problem occurring in the elastic-plastic torsion of a bar and it has been studied by many authors, see in particular B. D. Annin [1], H. Lanchon and G. Duvaut [1], H. Lanchon [1], T. W. Ting [1] and G. Duvaut - J. L. Lions, loc. cit., where more information can be found. The variational inequality characterizing the solution $u$ can be formally interpreted as follows: there is a "plasticity region" $\Omega_0$ in $\Omega$ where $|\operatorname{grad} u| = 1$; outside $\Omega_0$, that is, in the region $\Omega - \Omega_0$ where $|\operatorname{grad} u| < 1$, the function $u$ satisfies the equation $-\Delta u = f$; moreover, $u$ and its derivatives $\partial u / \partial x_i$ satisfy certain matching conditions at the interface between $\Omega_0$ and $\Omega - \Omega_0$. Like the obstacle problem, this too is a free boundary problem. Let us also mention that the present problem can be equivalently stated in the form of a "two-obstacles" problem, that is, condition (17) above can be replaced by a condition of the type
$$\psi_1 \le v \le \psi_2 ,$$
where $\psi_1$ and $\psi_2$ are two suitable functions. We refer to T. W. Ting, loc. cit. and H. Brezis - M. Sibony [2] for more details on this point.
The regularity of the solution has been studied by H. Brezis - G. Stampacchia, loc. cit.; see also H. Brezis [5]. For the numerical solution of this problem see R. Glowinski [3], J. F. Bourgat [1], M. Goursat [1], M. Nedelec [1], M. Sibony [2].
Example 6: A Bingham's fluid.
In its direct variational formulation, the problem consists in minimizing the non differentiable functional
$$F(v) + G(v)$$
over the space $H_0^1(\Omega)$. The functional above is the sum of the same energy integral $F(v)$ occurring in Example 5, which is obviously differentiable on $H_0^1(\Omega)$, and the non differentiable term
$$G(v) = g \int_\Omega |\operatorname{grad} v| \, dx .$$
The minimizing $u$ is thus characterized by the mixed variational inequality
$$u \in H_0^1(\Omega): \quad a(u, v - u) \ge G(u) - G(v) \quad \forall v \in H_0^1(\Omega),$$
where
$$a(u, v) = \sum_{i=1}^{n} \int_\Omega \frac{\partial u}{\partial x_i} \frac{\partial v}{\partial x_i} \, dx - \int_\Omega f v \, dx ,$$
cfr. Proposition 3 of Chapter 1.
For the numerical solution see R. Glowinski [1], M. Goursat [1]. The physical motivation of this problem as well as a discussion of the properties of the solution $u$ can be found in G. Duvaut - J. L. Lions, loc. cit.
Example 7: A problem of elasticity with friction on the boundary.
Another problem involving a non differentiable functional is the minimization of
[...]
over the whole space $H^1(\Omega)$. The non differentiable term is the boundary integral
[...]
and the solution $u$ is characterized, as in Example 6, by the mixed variational inequality
[...]
where now we have
[...]
Problems of this type occur in the theory of elastic bodies subjected to unilateral boundary constraints. Once again we refer to G. Duvaut - J. L. Lions, loc. cit., or J. L. Lions, R. Glowinski and R. Tremolieres, loc. cit.
Example 8: Inequalities involving non linear generalizations of the Laplace operator.
A generalization of all problems considered so far, which is quite natural from a mathematical point of view though of no direct physical interest, consists in replacing the Dirichlet integral with the functional
[...]
For every $p \ge 2$ this is indeed a differentiable convex functional on the Sobolev space $H^{1,p}(\Omega)$ and its differential is the monotone operator from $H^{1,p}(\Omega)$ to its dual, associated with the form
[...]
We have in fact
[...] (18)
Let us remark that another natural family of convex functionals that generalize the Dirichlet integral is given by
$$F(v) = \frac{1}{p} \int_\Omega |\operatorname{grad} v|^p \, dx ,$$
whose differential is now the operator
$$Av = - \operatorname{div} \left( |\operatorname{grad} v|^{p-2} \operatorname{grad} v \right) .$$
Like the operator (18), this operator obviously reduces to $-\Delta$ when $p = 2$. Therefore, they can both be considered as natural non linear generalizations of the Laplace operator.
It should also be noted that these operators are all duality mappings of the spaces $H^{1,p}(\Omega)$ suitably normed, cfr. Section 8 of Chapter 1, Remark 12.
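The monotonicity of the operator $Av = -\operatorname{div}(|\operatorname{grad} v|^{p-2} \operatorname{grad} v)$ can be checked on a discrete model. The sketch below is an illustration under assumptions of our own: one space dimension, a uniform grid, nodal vectors vanishing at both ends, and the pairing $(Au, w)$ computed after a formal integration by parts. It evaluates $(Au - Av, u - v)$ on random pairs and compares the case $p = 2$ with the Dirichlet form.

```python
import random

def flux(t, p):
    """phi(t) = |t|^(p-2) t, the one-dimensional form of |grad v|^(p-2) grad v (p >= 2)."""
    return abs(t) ** (p - 2) * t

def pairing(u, w, p, h):
    """(A u, w) = sum_j phi(u'_j) w'_j h for nodal vectors u, w on a uniform grid."""
    total = 0.0
    for j in range(len(u) - 1):
        du = (u[j + 1] - u[j]) / h
        dw = (w[j + 1] - w[j]) / h
        total += flux(du, p) * dw * h
    return total

def monotonicity_gap(u, v, p, h):
    """(A u - A v, u - v), expected to be non-negative."""
    d = [a - b for a, b in zip(u, v)]
    return pairing(u, d, p, h) - pairing(v, d, p, h)

n = 20
h = 1.0 / n
rng = random.Random(0)

def random_function():
    # Small random nodal values with zero boundary values.
    return [0.0] + [0.1 * rng.uniform(-1.0, 1.0) for _ in range(n - 1)] + [0.0]

gaps = {p: [] for p in (2, 3, 4.5)}
pairs = []
for _ in range(200):
    u, v = random_function(), random_function()
    pairs.append((u, v))
    for p in gaps:
        gaps[p].append(monotonicity_gap(u, v, p, h))
```

For $p = 2$ the gap is exactly the discrete Dirichlet integral of $u - v$, in agreement with the remark that the operator then reduces to $-\Delta$.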
Example 9: Inequalities involving more general non linear second order elliptic partial differential operators.
Let us consider the form
[...]
where the functions $a_i(x; \xi)$ are measurable in $x$ for fixed $\xi$ and continuous in $\xi$ for $x$ fixed a.e. in $\Omega$.
If the functions $a_i(x; \xi)$ are of polynomial growth at $\infty$ in $\xi$, that is
[...] (19)
for some $p$ with $1 < p < +\infty$, $c > 0$, then it is easy to verify that for every function $u$ of the Sobolev space $H^{1,p}(\Omega)$ (see Example 8) we have
[...]
and an estimate such as
[...]
holds, with $\delta(r)$ a continuous function of $r > 0$, $\| \cdot \|$ being the norm of $H^{1,p}(\Omega)$. [By taking the Sobolev imbedding theorem into account, the growth condition (19) could obviously be weakened, see the references quoted below.]
Therefore, the identity
$$(Au, v) = a(u, v), \quad u, v \in H^{1,p}(\Omega),$$
defines a map $A$ from $H^{1,p}(\Omega)$ to its dual, formally
[...]
Clearly, this map $A$ is monotone, that is
$$(Au - Av, u - v) = a(u, u - v) - a(v, u - v) \ge 0 ,$$
provided the functions $a_i(x; \xi)$ satisfy a weak ellipticity condition of type
[...] (20)
Let us notice that if there exists a function $\mathcal{F}(x; \xi)$ such that
[...]
then $A$ is the differential of the multiple integral
[...] (21)
since then we have
$$a(u, v) = \frac{d}{dt} F(u + tv) \Big|_{t=0} = (DF(u), v) .$$
The monotonicity condition (20) is then equivalent to the convexity of $\mathcal{F}(x; \xi)$ in $\xi$ (see also the Remark below).
Thus, variational inequalities involving differential operators such as the operator $A$ above arise, for instance, whenever we minimize an integral functional like (21) subject to a convex set of constraints.
There is an extensive literature on the partial differential operators of the type described above, and of higher order too, and on the related boundary value problems. We only refer here to the papers of F. E. Browder [1], [2], to J. Leray and J. L. Lions [1] and to P. H. Hartman and G. Stampacchia [1], where variational problems with unilateral constraints are studied in detail. Surveys of the theory and its applications can be found in F. E. Browder [6], [7], R. I. Kachurovskii [3], J. L. Lions [1].
Remark 3: The application of the theory of monotone operators to the boundary value problems for partial differential operators of elliptic type has led to a natural generalization of the theory. In fact, it is more convenient in many applications concerning an operator such as the $A$ considered above, to require only that the functions $a_i(x; u, u_{x_1}, \dots, u_{x_n})$ be monotone in the arguments corresponding to the highest derivatives ($u_{x_1}, \dots, u_{x_n}$ in our example above) and to handle the lower order terms ($u$ in the example) by using a compactness argument. When $A = DF$, $F$ being the integral functional (21), this corresponds to restricting the convexity assumption to the highest order derivatives appearing in the integrand of $F$, as is indeed natural in many problems of the calculus of variations.
The operators that arise in this way can be described in the simplest case by adding a compact operator to a monotone one and in general by allowing a more sophisticated intertwining between the monotone and compact components. These so-called semimonotone operators are also discussed in the references quoted above. In this regard let us also mention a more general class of operators, the pseudo-monotone operators, which has been introduced by H. Brezis [1], [2], [3], see also J. L. Lions [1].
Example 10: Minimal surfaces with obstacles.
Let us consider the functional
$$F(v) = \int_\Omega \sqrt{1 + |\operatorname{grad} v|^2} \, dx$$
that gives the area of the surface $v = v(x)$, $x \in \Omega$. We can minimize $F$ over the cone of all $v$ which are $\ge$ a given function $\psi$ on a compact subset $E$ of $\Omega$. The variational inequality that characterizes the minimizing surface $u = u(x)$ involves the Euler operator
$$Au = - \sum_{i=1}^{n} \frac{\partial}{\partial x_i} \left( \frac{u_{x_i}}{\sqrt{1 + |\operatorname{grad} u|^2}} \right) .$$
The natural Sobolev space here is $H^{1,1}(\Omega)$, that is, the space of all $v \in L^1(\Omega)$ with all first derivatives $v_{x_i} \in L^1(\Omega)$. However, this space is not reflexive. Therefore, the problem mentioned above cannot be handled with the standard methods discussed in our lectures, and ad hoc techniques have indeed been developed. From the extensive literature concerning minimal surfaces, let us only quote the papers by J. C. C. Nitsche [1], M. Miranda [1], H. Lewy and G. Stampacchia [3], E. Giusti [1], R. Temam [1], that specifically concern the obstacle problem.
2. Finite-dimensional and iterative existence theorems
By taking into account the relation between variational inequalities and fixed-point problems (see Proposition 2 of Chapter 1) and making use of the classical Brouwer's fixed-point theorem, we can easily prove the following finite-dimensional existence theorem, due to P. H. Hartman and G. Stampacchia [1].
THEOREM 1: Let $K$ be a non-empty closed convex subset of the euclidean space $E^n$, $\mathcal{A}$ a continuous map of $K$ into $E^n$. Let us suppose, furthermore, that either $K$ is bounded or the following coerciveness condition holds:
(c) There exists a bounded open convex subset $B$ of $E^n$ and a vector $v_0 \in K \cap B$, such that
$$(\mathcal{A}v, v - v_0) > 0 \quad \forall v \in K \cap \partial B ,$$
$\partial B$ being the boundary of $B$.
Then, there exists at least one solution $u$ of the problem
$$\text{(II)} \qquad u \in K: \quad (\mathcal{A}u, v - u) \ge 0 \quad \forall v \in K .$$
Proof: Let us first assume that $K$ is bounded, hence compact. By Proposition 2 of Chapter 1, $u$ is a solution of problem II above if and only if $u$ is a fixed point of the map
$$P_K(I - \mathcal{A})$$
of $K$ into itself, where $P_K$ is the Riesz projection on $K$. Since $P_K : E^n \to K$ is continuous, as follows from Corollary 2 of Lemma 2 of Chapter 1, the map $P_K(I - \mathcal{A})$ is continuous, provided $\mathcal{A}$ is such too. The existence of a fixed point $u$ of this map is now a consequence of Brouwer's theorem.
Now let us replace the compactness assumption with the coerciveness condition (c). By what we have just proved, there exists a solution $\bar{u}$ of the problem
$$\bar{u} \in K \cap \bar{B}: \quad (\mathcal{A}\bar{u}, v - \bar{u}) \ge 0 \quad \forall v \in K \cap \bar{B} ,$$
where $\bar{B} = B \cup \partial B$. Since we are assuming that condition (c) holds, $\bar{u}$ cannot belong to the boundary $\partial B$ of $B$, for we should have
$$(\mathcal{A}\bar{u}, v_0 - \bar{u}) < 0$$
in contradiction with the inequality above. This means that $\bar{u}$ is a local solution of problem II, hence it is also a global solution of II (cfr. Section 1 of Chapter 1).
Remark 4: For a map $\mathcal{A}$ which is the gradient of a functional $F$, the coerciveness condition (c) is related to the growth at $\infty$ of $F$, see Lemma 2 of the following Section 4.
Remark 5: Theorem 1 could be extended to infinite dimensional spaces by making use of the Schauder or Tychonoff fixed point theorems, see F. E. Browder [8]. However, the continuity assumption needed by the map $\mathcal{A}$ would then be too strong for direct application to problems of the type mentioned in the preceding section.
Theorem 1 cannot be considered a "constructive" existence theorem, because it relies on the deep though non constructive Brouwer's theorem. However, we can easily convert Theorem 1 into a "constructive" theorem, whenever we can replace Brouwer's theorem with a constructive fixed point theorem. The main example is obviously given by the well known contraction principle. We have in fact the following iterative existence theorem:
THEOREM 2: Let $K$ be a closed convex subset of a Hilbert space $V$, $\mathcal{A}$ a map of $K$ into $V$, such that
(+) $I - \rho \mathcal{A}$ is a contraction for some $\rho > 0$.
Then, there exists a unique solution $u$ of the problem
$$u \in K: \quad (\mathcal{A}u, v - u) \ge 0 \quad \forall v \in K$$
and $u = \lim u_n$ in $V$, where the sequence $(u_n)$ is given by the iterative scheme
$$\text{(IV}_n\text{)} \qquad u_{n+1} = P_K(u_n - \rho \mathcal{A} u_n), \quad n = 0, 1, 2, \dots, \quad u_0 \in K \ \text{arbitrary} .$$
Proof: Since $P_K : V \to K$ is non-expansive (see Coroll. 2 of Lemma 2, Chapt. 1), the map $P_K(I - \rho \mathcal{A})$ is a contraction provided $I - \rho \mathcal{A}$ is a contraction, which is, for a suitable $\rho > 0$, our assumption (+). Therefore, there exists a unique fixed point $u$ yielded by the iterative scheme (IV$_n$), which is also a solution of problem II, again by Proposition 2 of Chapter 1.
Theorem 2 above can be complemented by the following lemma that gives a simple sufficient condition for $I - \rho \mathcal{A}$ to be a contraction.
LEMMA 1: Let $\mathcal{A}$ be a map of a subset $K$ of a Hilbert space $V$ into $V$, such that
(i) $\mathcal{A}$ is lipschitzian, i.e., there exists $L > 0$ such that
$$\| \mathcal{A}u - \mathcal{A}v \| \le L \| u - v \| \quad \forall u, v \in K ;$$
(ii) $\mathcal{A}$ is strongly monotone, i.e., there exists $c > 0$ such that
$$(\mathcal{A}u - \mathcal{A}v, u - v) \ge c \| u - v \|^2 \quad \forall u, v \in K .$$
Then, the map $I - \rho \mathcal{A}$ is a contraction for all $\rho$ satisfying the bound
$$0 < \rho < \frac{2c}{L^2} .$$
Proof. An elementary computation shows that
$$\| (I - \rho \mathcal{A})u - (I - \rho \mathcal{A})v \|^2 = \| u - v \|^2 - 2\rho (\mathcal{A}u - \mathcal{A}v, u - v) + \rho^2 \| \mathcal{A}u - \mathcal{A}v \|^2 .$$
Therefore, by (i) and (ii),
$$\| (I - \rho \mathcal{A})u - (I - \rho \mathcal{A})v \|^2 \le (1 - 2\rho c + L^2 \rho^2) \| u - v \|^2 ,$$
hence $I - \rho \mathcal{A}$ is a contraction, provided $1 - 2\rho c + L^2 \rho^2 < 1$ (note that $c \le L$), that is, $-2\rho c + L^2 \rho^2 < 0$, which is to say, $0 < \rho < 2c / L^2$.
Remark 6: The minimum of the function $1 - 2c\rho + L^2 \rho^2$ of $\rho$ is attained at the value
$$\rho_{\mathrm{opt}} = \frac{c}{L^2}$$
that gives the contraction constant
$$\sqrt{1 - \frac{c^2}{L^2}} .$$
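The iterative scheme of Theorem 2, with the step $\rho_{\mathrm{opt}} = c / L^2$ of Remark 6, can be tried on a small finite-dimensional example. The sketch below rests on assumptions chosen only for illustration: $V$ is the plane with the euclidean inner product, $K = [0, 1]^2$, and $\mathcal{A}u = Mu - f$ with the non-symmetric matrix $M$ and the vector $f$ given in the code. For this $M$ one checks by hand that $c = 2$ and $L = \sqrt{5}$, and that the solution of the variational inequality is $(1, 0.5)$.

```python
import math

def project_box(w, lo=0.0, hi=1.0):
    """Projection onto the box K = [lo, hi]^n: componentwise clipping."""
    return [min(max(x, lo), hi) for x in w]

def apply_A(u, M, f):
    """The map A u = M u - f."""
    return [sum(M[i][j] * u[j] for j in range(len(u))) - f[i] for i in range(len(u))]

def solve_vi(M, f, c, L, u0, steps):
    """Iterates u_{n+1} = P_K(u_n - rho A u_n) with rho = c / L^2."""
    rho = c / L ** 2
    theta = math.sqrt(1.0 - (c / L) ** 2)   # contraction constant of Remark 6
    u = list(u0)
    history = [u]
    for _ in range(steps):
        Au = apply_A(u, M, f)
        u = project_box([ui - rho * ai for ui, ai in zip(u, Au)])
        history.append(u)
    return history, rho, theta

M = [[2.0, 1.0], [-1.0, 2.0]]   # (M u, u) = 2 |u|^2, so c = 2; M^T M = 5 I, so L = sqrt(5)
f = [3.0, 0.0]
history, rho, theta = solve_vi(M, f, c=2.0, L=math.sqrt(5.0), u0=[0.0, 0.0], steps=60)
u = history[-1]

u_star = [1.0, 0.5]             # solution found by hand for this example
errors = [math.dist(w, u_star) for w in history]
```

Each entry of `errors` is bounded by `theta` times the previous one, which is the linear convergence predicted by the contraction principle.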
Iterative methods for solving variational inequalities have been considered by J. L. Lions and G. Stampacchia [1], H. Brezis and M. Sibony [1], M. Sibony [1], J. P. Dias - M. Sibony [1], G. Stampacchia [5]. See also J. L. Lions, R. Glowinski and R. Tremolieres, loc. cit. The special case of variational inequalities involving bilinear forms in Hilbert spaces will be treated in some more detail in the following section.
For further references on the iterative methods for the solution of equations involving monotone operators see F. E. Browder and W. V. Petryshyn [1], R. I. Kachurovskii [3].
3. Variational inequalities for bilinear forms in Hilbert spaces
An important class of problems to which the existence and approximation results of the previous section apply are the variational inequalities involving a coercive continuous bilinear form in a Hilbert space $V$.
Let us recall that a bilinear form $a(u, v)$ on $V$ is continuous (on $V \times V$) if and only if there exists a constant $L > 0$, such that
$$|a(u, v)| \le L \| u \| \, \| v \| \quad \forall u, v \in V . \tag{22}$$
Moreover, $a(u, v)$ is said to be coercive on $V$ if there exists a constant $c > 0$, such that
$$a(v, v) \ge c \| v \|^2 \quad \forall v \in V . \tag{23}$$
[As in Chapter 1, we shall denote by $( \cdot , \cdot )$ the inner product in $V$ and we put $\| \cdot \| = ( \cdot , \cdot )^{1/2}$.]
Let us also recall that any continuous bilinear form $a(u, v)$ on $V$ can be represented in the given inner product of $V$ by means of a bounded linear operator $\mathcal{A}$ given by the identity
$$a(u, v) = (\mathcal{A}u, v) \quad \forall u, v \in V .$$
Such an operator $\mathcal{A}$ is obviously lipschitzian in $V$, while the coerciveness of $a(u, v)$ is nothing else than the strong monotonicity of $\mathcal{A}$ stated in Lemma 1. Let us also notice that $\mathcal{A}$ satisfies the conditions (i) and (ii) of that lemma with the same constants $L$ and $c$ that appear in (22) and (23) respectively.
The following theorem is then an immediate consequence of Theorem 2 above.
THEOREM 3: Let $a(u, v)$ be a coercive continuous bilinear form on a Hilbert space $V$, $K$ a closed convex subset of $V$. Then, for any given $f$ in the dual $V^*$ of $V$, there exists a unique solution $u$ of the problem
$$u \in K: \quad a(u, v - u) \ge (f, v - u) \quad \forall v \in K . \tag{24}$$
Proof: In terms of the inner product of $V$ the problem above can be written as
$$u \in K: \quad (\mathcal{A}u - \tilde{f}, v - u) \ge 0 \quad \forall v \in K , \tag{25}$$
where $\mathcal{A}$ is the operator that represents $a(u, v)$ in $V$ and $\tilde{f}$ is the vector of $V$ such that $(f, w) = (\tilde{f}, w)$ $\forall w \in V$, i.e., $\tilde{f} = J^{-1} f$, $J$ being the Riesz isomorphism of $V$ onto $V^*$.
As we already remarked, the operator $\mathcal{A}$, hence also the operator $\mathcal{A} - \tilde{f}$, is lipschitzian and strongly monotone in $V$. Therefore, by Lemma 1, the map $I - \rho(\mathcal{A} - \tilde{f})$ is a contraction for $\rho > 0$ small enough and then the existence and uniqueness of the solution $u$ of problem (25) follow from Theorem 2.
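Remark 7: A further consequence of Theorem 2 is the following iterative algorithm yielding the solution $u$ of problem (24)
$$u_{n+1} = P_K \big( u_n - \rho (\mathcal{A}u_n - \tilde{f}) \big) , \tag{27}$$
where
$$0 < \rho < \frac{2c}{L^2} \qquad \Big( \rho_{\mathrm{opt}} = \frac{c}{L^2} \Big) .$$
The constants $c$ and $L$ are those appearing in the inequalities (22) and (23) satisfied by the form $a(u, v)$.
We can also write this scheme in two successive steps,
$$u_{n+1/2} = u_n - \rho (\mathcal{A}u_n - \tilde{f}), \qquad u_{n+1} = P_K u_{n+1/2} . \tag{28}$$
By using the weak characterization of $P_K$ (see Lemma 2 of Chapter 1), the iterative scheme (28) can also be written in the following weak form:
$$u_{n+1} \in K: \quad (u_{n+1}, v - u_{n+1}) \ge (u_{n+1/2}, v - u_{n+1}) \quad \forall v \in K ,$$
which is to say, in terms of the original form $a(u, v)$:
$$u_{n+1} \in K: \quad (u_{n+1}, v - u_{n+1}) \ge (u_n, v - u_{n+1}) - \rho \big[ a(u_n, v - u_{n+1}) - (f, v - u_{n+1}) \big] \quad \forall v \in K . \tag{29}$$
Let us remark that this is a variational inequality such as the one we started with, with the bilinear form $a(u, v)$ replaced by the inner product of $V$ and the given functional $f \in V^*$ replaced by the functional $g_n$, given by
$$(g_n, w) = (u_n, w) - \rho \big[ a(u_n, w) - (f, w) \big] \quad \forall w \in V .$$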
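The equivalence between the two-step scheme and its weak form can be observed numerically. In the sketch below, which is only an illustration, $V$ is the plane with the euclidean inner product, $K = [0, 1]^2$, and the matrix $M$, the vector $f$ and the step $\rho = 0.4$ are hypothetical data; the code performs the half step and the projection, and then evaluates $(u_{n+1} - u_{n+1/2}, v - u_{n+1})$ on random test vectors $v \in K$.

```python
import random

def project_box(w, lo=0.0, hi=1.0):
    """Projection onto K = [lo, hi]^n in the euclidean inner product."""
    return [min(max(x, lo), hi) for x in w]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def two_step(u, M, f, rho):
    """A half step u - rho (M u - f), followed by the projection onto K."""
    residual = [dot(row, u) - fi for row, fi in zip(M, f)]
    u_half = [ui - rho * r for ui, r in zip(u, residual)]
    return u_half, project_box(u_half)

M = [[2.0, 1.0], [-1.0, 2.0]]   # hypothetical data: a(u, v) = (M u, v)
f = [3.0, 0.0]
rho = 0.4                       # inside the range 0 < rho < 2c/L^2 = 0.8 for this M
rng = random.Random(1)

u = [0.2, 0.9]
worst_gap = float("inf")
for _ in range(20):
    u_half, u_next = two_step(u, M, f, rho)
    for _ in range(50):
        v = [rng.random(), rng.random()]          # a test vector in K
        diff = [vi - wi for vi, wi in zip(v, u_next)]
        # Weak form: (u_next, v - u_next) >= (u_half, v - u_next).
        worst_gap = min(worst_gap, dot(u_next, diff) - dot(u_half, diff))
    u = u_next
```

The smallest gap observed stays non-negative up to rounding, and the iterates approach the solution $(1, 0.5)$ of this example.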
Remark 8: The iterative algorithm that gives the solution $u$ of problem (24), both in the strong form (27) and in the weak form (29), depends on the specific inner product of $V$ that has been used. In fact, as we already noted in Chapter 1, Remark 5, a change of the inner product of $V$ to an equivalent one does not affect the form $a(u, v)$, nor even the given vector $f$ of $V^*$, hence it leaves our problem invariant. On the other hand, such a change modifies the projection $P_K$ and the best constants $c$ and $L$ in (22) and (23), hence the convergence range $0 < \rho < 2c / L^2$ of the $\rho$ allowed in the algorithms. Moreover, in the strong form (27) of the algorithm, the change of $\mathcal{A}$ and $\tilde{f}$ must also be taken into account.
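The dependence of the convergence range on the inner product can be made concrete on a badly scaled example. The sketch below is an illustration under assumptions of our own: $V$ is the plane, $K = [0, 1]^2$, the form is $a(u, v) = (Mu, v)$ with the hypothetical matrix $M$ below, and the second inner product is the weighted one $\sum_i d_i u_i v_i$ with $d = (1, 100)$. For a box and a diagonal weight the projection is still a componentwise clipping, and the Riesz map reduces to a division by $d_i$. The constants $c$ and $L$ passed to the iteration are bounds worked out by hand for this matrix.

```python
def clip01(w):
    """Projection onto K = [0, 1]^2; valid for any diagonally weighted inner product."""
    return [min(max(x, 0.0), 1.0) for x in w]

def iterate(M, f, d, rho, tol=1e-12, max_iter=200000):
    """u_{n+1} = P_K(u_n - rho D^{-1}(M u_n - f)), D = diag(d); returns (u, iterations)."""
    u = [0.0, 0.0]
    for k in range(1, max_iter + 1):
        r = [sum(M[i][j] * u[j] for j in range(2)) - f[i] for i in range(2)]
        new = clip01([u[i] - rho * r[i] / d[i] for i in range(2)])
        step = max(abs(new[i] - u[i]) for i in range(2))
        u = new
        if step < tol:
            return u, k
    return u, max_iter

M = [[1.0, 0.5], [-0.5, 100.0]]   # hypothetical, badly scaled, non-symmetric
f = [2.0, 50.0]

# Euclidean inner product: c = 1 and L <= 100.01 (bound on the norm of M).
u_e, n_e = iterate(M, f, d=[1.0, 1.0], rho=1.0 / 100.01 ** 2)

# Weighted inner product with d = (1, 100): c = 1 and L <= 1.002, since
# D^{-1/2} M D^{-1/2} = [[1, 0.05], [-0.05, 1]].
u_b, n_b = iterate(M, f, d=[1.0, 100.0], rho=1.0 / 1.002 ** 2)
```

Both runs approach the same solution $(1, 0.505)$, as they must since the problem does not depend on the inner product, but the weighted inner product needs far fewer iterations because its ratio $c / L$ is close to one.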
To make the role of the inner product appear explicitly, let us introduce a continuous bilinear form $b(u, v)$ on $V$ and let us suppose that $b(u, v)$ is symmetric, i.e.
$$b(u, v) = b(v, u) \quad \forall u, v \in V ,$$
and coercive on $V$. Under these assumptions,
$$(u, v)_b = b(u, v) \tag{30}$$
will define an inner product in $V$, which is equivalent to the original one. Clearly, any equivalent inner product in $V$ arises in such a way.
If we denote by $P_K^b$ and $J_b$ the Riesz projection on $K$ and the Riesz isomorphism of $V$ onto $V^*$ relative to the new inner product (30), then the strong form (27) of our iterative algorithm takes the form
$$u_{n+1} = P_K^b \big( u_n - \rho J_b^{-1} (Au_n - f) \big) , \tag{31}$$
where $A : V \to V^*$ is the bounded linear operator defined by the identity
$$(Au, v) = a(u, v) = (\mathcal{A}u, v), \quad u, v \in V .$$
The weak form of (31) now is
$$u_{n+1} \in K: \quad b(u_{n+1}, v - u_{n+1}) \ge b(u_n, v - u_{n+1}) - \rho \big[ a(u_n, v - u_{n+1}) - (f, v - u_{n+1}) \big] \quad \forall v \in K . \tag{32}$$
The term in square brackets above is not affected by the change of the inner product; in fact, it is the same in (32) as it was in (29). However, its representation as a vector $u_{n+1/2}$ of $V$ in the two-steps scheme (28) depends on the inner product, and we should now write $u_{n+1/2}$ as
$$u_{n+1/2} = u_n - \rho J_b^{-1} (Au_n - f) .$$
Thus, the explicit evaluation of $u_{n+1/2}$ requires an inversion of the Riesz isomorphism $J_b$ induced by the form $b(u, v)$. By carefully choosing the form $b(u, v)$ we may hope to make this inversion "simple" and consequently simplify the whole algorithm. Some regularity of the solution could possibly facilitate this choice. For more details on this point we refer to J. L. Lions, R. Glowinski and R. Tremolieres, loc. cit., where a more detailed account of the iterative methods discussed in this section can also be found.
Similar results for variational inequalities involving lipschitzian, strongly monotone operators can also be deduced from Theorem 2 and we refer to the papers of H. Brezis and M. Sibony mentioned in the preceding section.
4. Direct existence theorem
As we said in the introductory remarks of the present chapter, when a variational inequality involves a mapping $A$ which is the differential of a convex functional $F$, then the existence of solutions can be proved by applying the direct existence theorems of the calculus of variations to the equivalent minimum problem for the functional $F$.
This approach is also fruitful from a general point of view because the theorem we shall obtain along these lines will be proved to hold, in Chapter 3, even if we drop the assumption that the map $A$ is the differential of a functional.
Let us now recall the following basic existence theorem of minima.
THEOREM 4: Let $K$ be a non-empty closed convex subset of a reflexive Banach space $X$, $F$ a lower semicontinuous convex function on $K$. Let us suppose either that $K$ is bounded or that the following coerciveness condition (c$_0$) holds:
(c$_0$) There exists a vector $v_0 \in K$ and a constant $R > 0$, with $F(v_0) < +\infty$ and $\| v_0 \| < R$, such that
$$F(v) > F(v_0) \quad \text{for all } v \in K \ \text{with } \| v \| = R .$$
Then, there exists at least one solution $u$ of the problem
$$\text{(I)} \qquad u \in K: \quad F(u) \le F(v) \quad \text{for all } v \in K .$$
Proof: The proof is based on the following two well known results of functional analysis: closed convex subsets of a normed space are also closed in the weak topology of the space; bounded subsets of a reflexive Banach space are relatively compact in that topology. Thus, bounded closed convex subsets of $X$ are weakly compact. Such are then, if $K$ is bounded, all level sets
$$L(v) = \{ z \in K : F(z) \le F(v) \}, \quad v \in K ,$$
of $F$ on $K$. Therefore
$$\bigcap_{v \in K} L(v) \ne \emptyset , \tag{33}$$
which is to say, problem I above has a solution $u$.
If the boundedness assumption on $K$ is replaced by the coerciveness condition (c$_0$), we can still draw the same conclusion as before, provided we show that some non-empty level set of $F$ on $K$ is bounded. Now, if $v_0$ and $R$ are the vector and the constant, respectively, appearing in (c$_0$), it is easy to verify that all vectors $z \in L(v_0)$ are bounded in norm by $R$. [In fact, if there exists $z_1 \in L(v_0)$ with $\| z_1 \| > R$, then there would also exist $z_2 \in L(v_0)$ such that $\| z_2 \| = R$, and this would contradict the condition $F(z_2) > F(v_0)$.]
Addendum 1 to Theorem 4: The set of all solutions $u$ of problem I above, under the assumptions of Theorem 4, is a bounded closed convex subset of $K$.
In fact, this set is nothing else than the set (33), which is bounded, closed and convex since some (or all) sets $L(v)$ are such.
Addendum 2 to Theorem 4: The solution $u$ of problem I is unique, provided $F$ is strictly convex on $K$, i.e.,
$$F(u + t(v - u)) < F(u) + t \big[ F(v) - F(u) \big] \quad \text{for all } u \ne v \ \text{in } K, \ 0 < t < 1 .$$
In fact, if $u_1$ and $u_2$ were two distinct solutions of I, since $(u_1 + u_2)/2 \in K$ we would have both
$$F(u_1) \le F\Big( \frac{u_1 + u_2}{2} \Big), \qquad F(u_2) \le F\Big( \frac{u_1 + u_2}{2} \Big),$$
hence
$$\frac{1}{2} F(u_1) + \frac{1}{2} F(u_2) \le F\Big( \frac{u_1 + u_2}{2} \Big),$$
which contradicts the strict convexity of $F$.
Let us now suppose that the functional $F$ is differentiable on $X$ and let us ask how the properties of $F$ involved in the assumptions of Theorem 4 may be expressed in terms of the differential $DF$ of $F$.
We already know, from Section 6 of Chapter 1, that the convexity of $F$ is equivalent to the monotonicity of the map $DF$. Moreover, the differentiability of $F$ clearly assures its lower semicontinuity, as follows trivially from the inequality
$$F(v) \ge F(u) + (DF(u), v - u)$$
that implies $\liminf F(v_j) \ge F(u)$ whenever $v_j$ converges (weakly or strongly) to a given $u$.
More interesting to investigate is the connection between the coercivity of $F$ and the coercivity of $DF$, and we shall do that in the following four lemmas.
We are now assuming that $F$ is a differentiable convex functional on a normed space $X$, $DF : X \to X^*$ the differential of $F$ and $K$ an unbounded convex subset of $X$.
Lemma 2: Let $DF$ satisfy the condition:
(d$_0$) There exists $v_0 \in K$ and $R > 0$, with $\| v_0 \| < R$, such that
$$(DF(v), v - v_0) > 0 \quad \forall v \in K \ \text{with } \| v \| = R .$$
Then, $F$ satisfies the condition:
(c$_0$) There exists $v_0 \in K$ and $R' > 0$, with $F(v_0) < +\infty$ and $\| v_0 \| < R'$, such that
$$F(v) > F(v_0) \quad \forall v \in K \ \text{with } \| v \| = R' .$$
Proof: Let $v_0$ and $R$ be such that (d$_0$) holds and let us fix $R' > R$. Let $z$ be a vector of $K$ with $\| z \| = R'$. Since $\| v_0 \| < R < \| z \|$, the vector
[...]
belongs to $K$ and $\| v \| = R$ for a suitable $\varepsilon > 0$. Moreover, since
[...]
we have
[...]
and thus, in consequence of (d$_0$), we find
[...]
Therefore, condition (c$_0$) is satisfied by taking $v_0 = z_0$ to be any vector that minimizes $F$ on $K \cap \{ z : \| z \| \le R \}$.
If the dimension of $X$ is finite, and only then, the coerciveness condition (d$_0$) satisfied by $DF$ implies that the stronger condition (c$_1$) below is satisfied by the functional $F$.
Lemma 3: If $X$ is a finite-dimensional normed space, then condition (d$_0$) of Lemma 2 implies that (c$_1$) below holds:
(c$_1$) $\| z \| \to \infty$, $z \in K$, implies $F(z) \to +\infty$.
Proof: The map $v \mapsto (DF(v), v - v_0)$ is continuous on $X$, hence it attains its minimum $m$ on the compact subset $K \cap \{ v : \| v \| = R \}$, and
$$(DF(v), v - v_0) \ge m > 0 \quad \forall v \in K \ \text{with } \| v \| = R , \tag{34}$$
because we are assuming that (d$_0$) holds.
Now, for every $z \in K$ with $\| z \| > R$ we can find, as in the proof of Lemma 2, a vector $v \in K$, with $\| v \| = R$, such that
[...]
Thus,
[...]
Therefore, if $z_0$ is any vector that minimizes $F$ on $K \cap \{ v : \| v \| = R \}$, by taking (34) into account we find
[...]
hence $\| z \| \to \infty$ implies $F(z) \to +\infty$.
Remark 9: The lemma above is false if $X$ has infinite dimension, as the following simple example shows: $X = \ell^2$, the space of all sequences $v = (v_n)$ such that
[...]
hence
[...]
Then, we have
[...]
and
$$F(v) > 0 \quad \forall v \ne 0 ,$$
according to Lemma 2 above, whereas
[...]
on the sequence
[...]
though
[...]
Thus, in order that $F$ satisfy the coerciveness condition (c$_1$) in an arbitrary normed space $X$, $DF$ must be coercive in a stronger sense than (d$_0$). We have indeed the following
Lemma 4: Let $DF$ satisfy the condition
(d$_1$) There exists $v_0 \in K$ such that
[...]
Then, $F$ satisfies the coerciveness condition (c$_1$) of the preceding lemma.
Proof: Let $v_0$ be the vector that appears in condition (d$_1$) and let $R > 0$ be such that $\| v_0 \| < R$ and (34) holds for some $m > 0$. Such a constant $R$ now exists in consequence of our assumption (d$_1$). Thereafter, the proof is the same as that of Lemma 2.
The coerciveness condition (c$_1$) is not stable under a correction of $F(v)$ by an affine function $(v_0^*, v) + c$, $v_0^* \in X^*$, $c \in \mathbb{R}$, as condition (d$_1$) is not stable under the addition of a constant term $v_0^*$ to $DF(v)$. In other words, the coerciveness (d$_1$) of the map $v \mapsto DF(v) - v_0^*$ would depend on the given vector $v_0^*$.
Coerciveness conditions which are stable under the above mentioned corrections are those given in the following
Lemma 5: Let $DF$ satisfy the condition
(d$_2$) [...]
Then, $F$ satisfies the condition
(c$_2$) [...]
Proof: It is easy to verify that the lemma at hand is invariant under addition of affine functions to $F$. Therefore, it suffices to prove the lemma with regard to the functional
[...]
$\tilde{F}$ still is a differentiable convex function on $X$, with $D\tilde{F} = DF - DF(0)$, and we have
[...]
Therefore, we have
[...]
where
[...]
by the monotonicity of $D\tilde{F}$, for a suitable
[...]
Thus,
[...]
As
[...]
we also have
[...]
hence also
[...]
since $\tilde{F}$ is positively bounded from below. Therefore, (c$_2$) follows from (d$_2$).
At this point we know how to formulate the properties of $F$ required in the direct existence theorem given at the beginning of this section, in terms of the differential $DF$ of $F$. On the other hand, we also know that a solution $u$ of the minimum problem considered in that theorem can be characterized by means of the variational inequality
$$u \in K: \quad (DF(u), v - u) \ge 0 \quad \forall v \in K .$$
Therefore, the existence theorem below is nothing else than Theorem 4 mentioned above, whenever the map $A$ is the (Gateaux) differential of a function $F$ on $X$:
THEOREM 5: Let $A$ be a monotone hemicontinuous map from a reflexive Banach space $X$ to its dual $X^*$, $K$ a non-empty closed convex subset of $X$. Let us suppose either that $K$ is bounded or that $A$ satisfies the following coerciveness condition on $K$:
(d$_0$) There exists $v_0 \in K$ and $R > 0$, with $\| v_0 \| < R$, such that
$$(Av, v - v_0) > 0 \quad \forall v \in K \ \text{with } \| v \| = R .$$
Then, there exists at least one solution $u$ of the variational inequality
$$\text{(II)} \qquad u \in K: \quad (Au, v - u) \ge 0 \quad \forall v \in K .$$
Proof of Theorem 5 under the additional assumption $A = DF$. If such an $A$ is the Gateaux differential of a function $F$ on $X$, then $F$, as we know from the discussion above, is convex, lower semicontinuous and, in case $K$ is unbounded, it satisfies the coerciveness condition (c$_0$) of Theorem 4. Therefore, there exists a vector $u$ of $K$ that minimizes $F$ on $K$. By Proposition 1 of Chapter 1, any such minimizing $u$ is a solution of problem II above.
Remark 10: If we want a solution $u$ of the inequality
$$u \in K: \quad (Au, v - u) \ge (f, v - u) \quad \forall v \in K$$
to exist whatever $f$ is given in $X^*$, then we must assume in place of the coerciveness condition (d$_0$) above that the stronger condition (d$_2$) holds. The latter condition is indeed stable under the addition of a constant vector to $A$, as we already remarked.
Theorem 5 in its general form is due to P. H. Hartman and G. Stampacchia [1] and F. E. Browder [5]. The proof of Theorem 5, under the additional assumption that $A$ is bounded, will be given in Chapter 3 and it will be based on the Linearization Lemma of Chapter 1 and a "stability theorem" for solutions of II under perturbation of the convex set $K$ that will be proved in that Chapter.
Already now, however, we can easily prove that the set of all solutions $u$ of problem II above has the same properties as the set of all solutions of the minimum problem considered in Theorem 4.
We have in fact the following

Addendum 1 to Theorem 5: Under the assumptions of the theorem, the set of all solutions of the inequality II is a bounded closed convex subset of $K$.

Proof: We already proved in Chapter 1, as a corollary of the Linearization Lemma of Section 7 (where the hemicontinuity of $A$ was used), that this set is closed and convex. Moreover, if $K$ is unbounded, any solution $u$ of II is bounded in norm by the constant $R$ appearing in condition ($d_0$) of the theorem, as can be shown easily by taking the convexity of the set of all solutions into account.
Addendum 2 to Theorem 5. The solution $u$ of II is unique, provided $A$ is strictly monotone on $K$, i.e.,

$(Au - Av, u - v) > 0$ for all $u \ne v$, $u, v \in K$.

Proof: If $u_1$ and $u_2$ are solutions of II, then we have both

$(Au_1, u_2 - u_1) \ge 0$ and $(Au_2, u_1 - u_2) \ge 0$,

hence $(Au_1 - Au_2, u_1 - u_2) \le 0$, which, by the monotonicity of $A$, implies

$(Au_1 - Au_2, u_1 - u_2) = 0$.

Therefore, if $u_1 \ne u_2$, this would contradict the strict monotonicity of $A$.
Remark 11. If $A = DF$, then $A$ is strictly monotone if and only if $F$ is strictly convex (cfr. Lemma 1 of Section 6 of Chapter 1).

In fact, if $A$ is strictly monotone and the convex function $F$ were not strictly convex, then the directional derivative of $F$ would be constant on a line segment joining two points $u_1$, $u_2$, and this would contradict the strict monotonicity of $A$. The proof could also be achieved by remarking that in the implications (iii) $\Rightarrow$ (ii) $\Rightarrow$ (i) of the lemma quoted above one now has strict inequalities everywhere, when $u \ne v$.

On the other hand, if $F$ is strictly convex, then

$F(u + t(v - u)) < F(u) + t\,[F(v) - F(u)]$

for every $u \ne v$, $0 < t < 1$. Therefore, as in the proof of (2) of Sec. 6, Chapter 1, we have

$(DF(u), v - u) < F(v) - F(u)$

for every $u \ne v$. By interchanging the role of $u$ and $v$ and adding up the two inequalities obtained, we find

$0 > (DF(u) - DF(v), v - u), \quad u \ne v$.
U . Mosco
CHAPTER 3

Convergence of convex sets and of solutions of variational inequalities

1 : The Ritz-Galerkin discretization
2 : Convergence of convex sets and convex functions
3 : The "stability" theorem
4 : Further existence theorems
5 : Finite-dimensional approximation, I : the discrete problem
6 : Finite-dimensional approximation, II : convergence of the approximate solutions
7 : Dual variational inequalities and complementarity systems
8 : An example
Solutions of variational inequalities enjoy a remarkable "stability" property under perturbations of the convex set involved in the inequality. We shall describe this stability in Section 3 below, by introducing a suitable topology in the space of all closed convex subsets of a normed space. The topology which turns out to be the appropriate one to describe the perturbation we mentioned above will be discussed in the following Section 2.

This stability is also the property underlying the infinite-dimensional existence theorem we shall prove in Section 4. This theorem, indeed, can be obtained from the existence theorem for variational inequalities in a euclidean space, by approximating the initial problem with a family of finite-dimensional problems. Under suitable assumptions on the map $A$ this procedure can be easily converted into a "constructive" one, yielding a method of Ritz-Galerkin type for the numerical solution of variational inequalities. The solution $u(x)$ of the initial problem can be found as the limit in an appropriate norm of a family of approximate solutions $u_h(x)$, which can be computed numerically by solving a variational inequality in euclidean spaces of increasing dimensions. This approximation will be first sketched in Section 1 below and then discussed in more detail in Section 5 and Section 6.

Finally, in Section 7 we shall give a short account of a duality theory for variational inequalities. We shall restrict ourselves to variational inequalities on cones, relating this subject to the so-called complementarity systems.
1. The Ritz-Galerkin approximation

Let us consider a variational inequality such as

(1)    $u \in K : \quad (Au, v - u) \ge 0 \quad \forall v \in K$

in a normed space $X$. We shall speak of the elements of $X$ as of functions. Now let $X_h$ be some finite-dimensional subspace of $X$, depending on a parameter $h$ to be specified, and let

(2)    $(\varphi_q^h(x))_q$, $q$ ranging from 1 to $n = n_h$,

be a basis in $X_h$, $n = n_h$ being the dimension of $X_h$. Thus, the functions $v_h(x)$ in $X_h$ are those of the form

(3)    $v_h(x) = \sum_{q=1}^{n} v_q^h \, \varphi_q^h(x)$,

where $(v_q^h)_q$ are the components of $v_h(x)$ in the chosen basis (2).

Let us now choose a convex subset $K_h$ of $X_h$, which may be conveniently assumed to realize an approximation of the given $K$. The choice of such a finite-dimensional approximation of $K$ is crucial and we shall come back to this point in a moment. We then replace the initial problem (1) with the following approximate problem

(4)    $u_h \in K_h : \quad (Au_h, v_h - u_h) \ge 0 \quad \forall v_h \in K_h$,
where, for the sake of simplicity, the operator $A$ has been left unchanged. This means that the approximate solution $u_h(x)$ in $K_h$ is required to satisfy exactly the given inequality for all $v_h(x)$ of $K_h$.

If we now take the expressions (3) of $u_h(x)$ and $v_h(x)$ in the basis $(\varphi_q^h(x))_q$ that has been chosen to span the subspace $X_h$ and substitute them into (4), we then find that the approximate problem above takes the form of the following discrete problem in the space $R^n$:

(5)    $u^h = (u_q^h) \in C_h : \quad \sum_s A_s^h(u_1^h, \dots, u_n^h)\,(v_s^h - u_s^h) \ge 0 \quad \forall v^h = (v_s^h) \in C_h$,

where the vector field $A^h = (A_s^h)$ is given by

$A_s^h(u_1^h, \dots, u_n^h) = \big(A(\sum_q u_q^h \varphi_q^h), \varphi_s^h\big)$,

the pairing at the right hand being that between $X$ and its dual $X^*$, and $C_h$ is the convex subset of $R^n$

$C_h = \{ v^h = (v_q^h) \in R^n : \sum_q v_q^h \varphi_q^h \in K_h \}$.

When $A$ is a linear map, then the system of inequalities (5) reduces to the system of linear inequalities

$u^h \in C_h : \quad \sum_{q,s} A_{qs}^h \, u_q^h \, (v_s^h - u_s^h) \ge 0 \quad \forall v^h \in C_h$,

with the $n \times n$ matrix $(A_{qs}^h)$ given by $A_{qs}^h = (A\varphi_q^h, \varphi_s^h)$.
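As an illustration of how a discrete problem of this kind may be solved in practice, here is a minimal sketch of a projection iteration. Everything specific in it is an assumption made for the example and not part of the text: the convex set is taken to be the cone $C = \{u \ge 0\}$ in $R^3$, the field is the affine map $u \mapsto Mu - b$ with a symmetric positive definite matrix `M` (an inequality with a right-hand side, as in Remark 10 of Chapter 2), and the step `rho` is chosen small enough for the iteration to be a contraction.

```python
import numpy as np

def solve_discrete_vi(M, b, rho=0.3, iters=500):
    """Projection iteration u <- P_C(u - rho * (M u - b)) on the cone C = {u >= 0}.

    Hypothetical helper for this sketch: it converges when M is symmetric
    positive definite and rho is smaller than 2 / (largest eigenvalue of M).
    """
    u = np.zeros_like(b, dtype=float)
    for _ in range(iters):
        # the projection onto the non-negative orthant is a componentwise maximum
        u = np.maximum(u - rho * (M @ u - b), 0.0)
    return u

# illustrative data (not taken from the text)
M = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])
b = np.array([1.0, -2.0, 1.0])

u = solve_discrete_vi(M, b)
w = M @ u - b   # value of the field at the computed point
```

On a cone the inequality $\sum_s w_s (v_s - u_s) \ge 0$ for all $v \ge 0$ is equivalent to the three conditions $u \ge 0$, $w \ge 0$, $\sum_s u_s w_s = 0$, which can be checked directly on the arrays `u` and `w`.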
Let us point out that we have been considering so far three distinct problems: first, the initial problem (1) in the space $X$; second, the approximate problem (4) in the finite-dimensional subspace $X_h$ of $X$; third, the discrete problem (5) in the $n$-dimensional vector space $R^n$, $n = n_h$ being the dimension of $X_h$.

To write down the approximate problem (4) we only need to specify the subspace $X_h$ of $X$ and the convex subset $K_h$ of $X_h$. To write the discrete problem (5) we must additionally choose a basis in the subspace $X_h$. The choice of an approximate convex set $K_h$ is thus a basic additional element of the Galerkin scheme of approximation of the inequalities we are dealing with. When these reduce to the equation $Au = 0$, and this is the case, e.g., when $K = X$, then only the choice of the subspace $X_h$ has to be made.
The choice of a family of "good" approximants $K_h$ should be subjected to the following two general requirements: first, the corresponding approximate problem can be converted into a discrete problem as "easy to solve" as possible; second, the dependence of $K_h$ on the parameter $h$ is such that the approximate solution $u_h(x)$ converges to the initial solution $u(x)$ as $K_h$ approaches $K$.

It is trivial to realize that, once the subspace $X_h$ has been chosen, a naive choice of $K_h$ may not always be successful. For example, if $X$ is a separable Hilbert space, $X_h$ the subspace spanned by the first $n$ vectors of a given orthonormal system of $X$, and $K$ the one-dimensional subspace spanned by a vector $v_0$ of $X$ which has infinitely many non-zero components with respect to the given basis, then what in general may be the most natural candidate for $K_h$, namely $K_h = X_h \cap K$, is obviously a bad choice, for this $K_h$ consists of the single vector $0$.
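The failure of the naive choice $K_h = X_h \cap K$ can be observed numerically. The sketch below is only a finite-dimensional stand-in, under assumptions of our own: the space is truncated to $R^N$ with $N = 40$, the orthonormal system is the canonical basis, and $v_0 = (1, 1/2, 1/3, \dots)$ plays the role of a vector with no vanishing component. The helper name `dim_intersection` is hypothetical.

```python
import numpy as np

N = 40                               # truncation level standing in for the Hilbert space
v0 = 1.0 / np.arange(1, N + 1)       # every component of v0 is non-zero

def dim_intersection(n):
    """Dimension of span{v0} ∩ span{e_1, ..., e_n}.

    Uses dim(U ∩ W) = dim U + dim W - dim(U + W), with dim U = n, dim W = 1.
    """
    E = np.eye(N)[:, :n]                       # basis of X_h
    stacked = np.column_stack([E, v0])         # spans X_h + K
    return n + 1 - np.linalg.matrix_rank(stacked)

sizes = (1, 5, 20, 39)
dims = [dim_intersection(n) for n in sizes]

# distance from v0 to X_h: the norm of the discarded tail of v0
tails = [np.linalg.norm(v0[n:]) for n in sizes]
```

The list `dims` contains only zeros, so $X_h \cap K$ reduces to the origin for each of these $n$, while `tails` decreases: $v_0$ itself is better and better approximated from $X_h$, only not from $X_h \cap K$.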
2. A convergence for convex sets and convex functions

As we pointed out in the preceding section, the finite-dimensional approximation of a given variational inequality, as far as the approximation of the convex set involved in the inequality and the convergence of the corresponding approximate solutions are concerned, can be dealt with by suitably defining a topology in the family of all closed convex subsets of a normed space. The topology we are looking for, however, must be weak enough to allow a family of finite-dimensional $K_h$ to converge to a possibly infinite-dimensional $K$.

To investigate what a "good" convergence

(9)    $K = \lim K_h$

would be like, let us consider the basic variational problem consisting in the orthogonal projection of a given vector $z$ of a Hilbert space $V$ onto a closed linear subspace $M$ of $V$.

A "good" convergence can then be reasonably requested to fulfill the following requirement:

"For any closed linear subspace $M$ of $V$, if $(M_h)$ is any sequence of closed linear subspaces of $V$ such that

(10)    $M = \lim M_h$,

then for every $z$ of $V$ the orthogonal projection $P_{M_h} z$ of $z$ on $M_h$ converges strongly to the orthogonal projection $P_M z$ of $z$ on $M$."
In other words, the map

$M \mapsto P_M$,

where $P_M$ is the orthogonal projection operator on $M$, must be continuous at any $M$ for the strong topology of operators.

Moreover, the requirement above can be assumed to be invariant under the orthogonal complementation $M \mapsto M^\perp$ of subspaces of $V$. Therefore, the convergence (10) too must be stable under $\perp$, which is to say, we must have $M = \lim M_h$ if and only if $M^\perp = \lim M_h^\perp$.
Let us now assume that the convergence (9) has indeed been so defined as to satisfy the requirement above and let us derive some necessary conditions satisfied by any sequence $(M_h)$ that converges to a given $M$ according to (10).

A first condition is the following inclusion

$M \subset \text{s-lim inf}\, M_h$,

where s-lim inf $M_h$ denotes the lim inf of the sequence $(M_h)$ in the strong topology of $V$:

(11)    $\text{s-lim inf}\, M_h = \{ v \in V : v = \text{strong lim}\, v_h, \ v_h \in M_h \ \text{for every } h \}$.

In fact, by the requirement above, for any $z$ of $M$ the vector $z_h = P_{M_h} z$ belongs to $M_h$ and converges strongly to $P_M z = z$.
The following lemma shows at one time that the condition above is not stable under orthogonal complementation and how it must be strengthened to become such. We shall denote below by w-limsup $M_h$ the sequential limsup of the sequence $(M_h)$ in the weak topology of $V$, i.e.,

(12)    $\text{w-limsup}\, M_h = \{ v \in V : v = \text{weak lim}\, v_j, \ v_j \in M_{h_j} \ \text{for every } j, \ (M_{h_j})_j \ \text{a subsequence of } (M_h)_h \}$.

LEMMA 1. Let $(M_h)$ be a sequence of closed linear subspaces of $V$ and $M$ a closed linear subspace of $V$. Then we have

(13)    $M \subset \text{s-lim inf}\, M_h$

if and only if

(14)    $\text{w-limsup}\, M_h^\perp \subset M^\perp$.

Proof: We shall base the proof on the following well known formula
(15)    $\text{dist}(v, H) = \max_{w \in H^\perp, \ \|w\| \le 1} (w \mid v)$,

which gives the distance of a vector $v$ of $V$ from a closed linear subspace $H$ of $V$. [dist$(v, H)$ can be seen as the norm of $v$ in the quotient space $V/H$, whose normed dual is isometric to $H^\perp$; hence (15) is a special case of the well known dual formula

$\|v\| = \max_{v^* \in X^*, \ \|v^*\| \le 1} (v^*, v)$

satisfied by the norm of any normed space $X$.]
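Formula (15) is easy to check numerically in a euclidean space. The following sketch does so for a randomly generated two-dimensional subspace $H$ of $R^6$; the dimensions, the random seed and all variable names are assumptions made for the illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2
B = rng.standard_normal((n, k))      # columns span the subspace H
v = rng.standard_normal(n)

Q, _ = np.linalg.qr(B)               # orthonormal basis of H (reduced QR)
proj_H = Q @ (Q.T @ v)               # orthogonal projection of v on H
dist = np.linalg.norm(v - proj_H)    # left-hand side of (15)

w_star = (v - proj_H) / dist         # unit vector of H-perp attaining the maximum
dual_value = w_star @ v              # right-hand side of (15)

# any other unit vector z of H-perp gives a pairing not exceeding the distance
z = rng.standard_normal(n)
z -= Q @ (Q.T @ z)                   # remove the component of z in H
z /= np.linalg.norm(z)
other_value = z @ v
```

The maximizing vector is the normalized residual $v - P_H v$, which is the element of $H^\perp$ used, with $H = M_h$, in the second half of the proof of Lemma 1.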
Let us prove that (13) implies (14). Let $v_0 \in \text{w-limsup}\, M_h^\perp$, i.e.,

$v_0 = \text{weak lim}\, v_{h_j}, \quad v_{h_j} \in M_{h_j}^\perp$,

$(M_{h_j})_j$ being a subsequence of $(M_h)_h$. Let $r > 0$ be such that $\|v_{h_j}\| \le r$ for all $j$. Then, for any $v \in M$ we have, in consequence of (15),

$|(v_{h_j} \mid v)| \le r \ \text{dist}(v, M_{h_j})$

and, by our assumption (13), dist$(v, M_{h_j}) \to 0$. Therefore,

$(v_0 \mid v) = \lim_j \, (v_{h_j} \mid v) = 0$ for all $v \in M$,

that is, $v_0 \in M^\perp$.
Conversely, let us suppose that (14) holds and let us prove that for every $v$ of $M$ we have dist$(v, M_h) \to 0$, which clearly proves (13). Again as a consequence of (15), we can find for every $h$ a vector $v_h \in M_h^\perp$, with $\|v_h\| \le 1$, such that

$\text{dist}(v, M_h) = (v_h \mid v)$.

Let $(M_{h_j})_j$ be an arbitrary subsequence of $(M_h)_h$. By the boundedness of the sequence $(v_{h_j})_j$, there exists a subsequence of $(v_{h_j})_j$, which we still call $(v_{h_j})_j$, that converges weakly to some vector $v_0$ of $V$, and this $v_0$, by our assumption (14), belongs to $M^\perp$. Therefore we have, since $v \in M$,

$\lim_j \text{dist}(v, M_{h_j}) = \lim_j \, (v_{h_j} \mid v) = (v_0 \mid v) = 0$.

It follows, by the arbitrariness of $(M_{h_j})_j$, that dist$(v, M_h) \to 0$.
Corollary: Let $M = \lim M_h$ mean that both the inclusions

$\text{w-limsup}\, M_h \subset M \subset \text{s-lim inf}\, M_h$

hold (cf. (13) and (14) above). Then, we have $M = \lim M_h$ if and only if $M^\perp = \lim M_h^\perp$.

We shall also see in Section 6 that if we actually define the convergence of a sequence of subspaces as in the corollary above, then the requirement we stipulated at the beginning of our discussion is satisfied: that is, $M = \lim M_h$ does indeed imply that

$P_M z = \text{strong lim}\, P_{M_h} z$ for every $z$ of $V$.
The discussion made up to now leads us to the following general definition.

Definition 1: A sequence $(K_h)$ of convex subsets of a normed space $X$ converges to a (closed convex) subset $K$ of $X$, and then we write

$K = \lim K_h$ in $X$,

if both the following inclusions hold:

$\text{w-limsup}\, K_h \subset K \subset \text{s-lim inf}\, K_h$,

where the limits are defined as in (11) and (12) above.

Remark 1: If $K_h \subset K$ for every $h$, then $K = \lim K_h$ is equivalent to the single inclusion $K \subset \text{s-lim inf}\, K_h$. On the other hand, if $K \subset K_h$ for every $h$, then $K = \lim K_h$ reduces to the condition $\text{w-limsup}\, K_h \subset K$.
It could be shown that if $X$ is a reflexive Banach space, then a Hausdorff topology can be introduced in the space of all closed convex subsets of $X$ which reduces on sequences of sets to the convergence defined above. However, we shall not discuss this problem here, and we refer to J. L. Joly [1] for a general investigation of the topologies for convex sets and convex functions and the action of polarity on them.

The basic property of the convergence we have just defined (and of the topology mentioned above) is indeed its stability under polarity, as we already showed in the special case of Lemma 1.
To make this point more specific, let us consider the Young-Fenchel transform

$f \mapsto f^*$,

which associates every l.s.c. convex function $f : X \to (-\infty, +\infty]$ ($f \not\equiv +\infty$) with its polar function $f^* : X^* \to (-\infty, +\infty]$, given by

(16)    $f^*(v^*) = \sup_{v \in X} \, [(v^*, v) - f(v)], \quad v^* \in X^*$.

It can be shown that $f^*$ is again a l.s.c. convex function and that $f \mapsto f^*$ is bijective and involutory, that is,

$f^{**} = f$.
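The transform (16) can be approximated on a grid, which gives a quick numerical check of the involution property. The sketch below is an illustration under assumptions of our own: $X = R$, $f(v) = v^2/2$ (whose polar is known to be $f^*(v^*) = (v^*)^2/2$), suprema replaced by maxima over finite grids, and grid ranges chosen so that each maximum is attained well inside the grid. The helper name `conjugate` is hypothetical.

```python
import numpy as np

def conjugate(values, points, duals):
    """Discrete version of (16): for each dual point s, max over grid points v of s*v - f(v)."""
    return np.max(duals[:, None] * points[None, :] - values[None, :], axis=1)

grid = np.linspace(-4.0, 4.0, 801)       # sample points v
duals = np.linspace(-2.0, 2.0, 401)      # sample points v*
xs = np.linspace(-1.5, 1.5, 301)         # points at which the bipolar is evaluated

f = 0.5 * grid**2
f_star = conjugate(f, grid, duals)           # approximates 0.5 * duals**2
f_star_star = conjugate(f_star, duals, xs)   # approximates 0.5 * xs**2
```

Since the maxima are taken over finite grids, the computed values agree with the exact ones only up to a discretization error of the order of the squared mesh size.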
Let us note, in particular, that when $f = \delta_K$ is the indicator function of a closed convex subset $K$ of $X$ (see Section 9 of Chapter 1), then $f^* = \delta_K^*$ is the support function of $K$. As a special case of this, if $K = M$ is a closed subspace of $X$, then

$f^* = \delta_M^* = \delta_{M^\perp}$,

where $M^\perp$ is the annihilator of $M$ in $X^*$ (hence, the orthogonal subspace of $M$, if $X$ is a Hilbert space and $X^*$ is identified with $X$). More generally, if $H$ is a closed convex cone with vertex at zero, then

$(\delta_H)^* = \delta_{H^*}$,

where $H^*$ is the polar cone of $H$.

If $(f_h)$
is a sequence of l.s.c. convex functions on $X$, we now define

(19)    $f = \lim f_h$ in $X$,

where $f$ is also a l.s.c. convex function on $X$, to be equivalent to the limit

$\text{epi}\, f = \lim \text{epi}\, f_h$ in $X \times R$

in the product space $X \times R$, according to Definition 1 above. Let us recall that for any l.s.c. convex function $g : X \to (-\infty, +\infty]$, epi $g$ is the epigraph of $g$, that is, the closed convex subset

$\text{epi}\, g = \{ [v, \beta] \in X \times R : g(v) \le \beta \}$

of the product space $X \times R$.

It is easy to verify that if $f = \delta_K$, $f_h = \delta_{K_h}$, with $K$, $K_h$ closed convex subsets of $X$, then $f = \lim f_h$ if and only if $K = \lim K_h$.

Again, if $X$ is a reflexive Banach space, a Hausdorff topology can be defined on the space of all l.s.c. convex functions on $X$, induced from the analogous topology for closed convex sets we mentioned above, which reduces on sequences to the convergence (19). The stability of this topology under polarity can now be stated as follows.

THEOREM 1.
The Young-Fenchel bijection $f \mapsto f^*$ is bicontinuous. In particular, for any sequence $(f_h)$ of l.s.c. convex functions on the reflexive Banach space $X$, we have $f = \lim f_h$ in $X$ if and only if $f^* = \lim f_h^*$ in $X^*$.

This theorem generalizes Lemma 1, to which it reduces when $f = \delta_M$ and $f_h = \delta_{M_h}$. Another special case of the theorem above, which we shall need later, is obtained by taking $f$ to be the indicator function of a closed convex cone $H$, with vertex at 0:

COROLLARY: Let $(H_h)$ be a sequence of closed convex cones, with vertex at 0, in a reflexive Banach space $X$, and $(H_h^*)$ the sequence of the polar cones in $X^*$. Then we have

$H = \lim H_h$ in $X$

if and only if

$H^* = \lim H_h^*$ in $X^*$,

$H^*$ being the polar cone of $H$.

This result could also be proved directly, along the lines of the proof of Lemma 1. For the proof of Theorem 1 see J. L. Joly, loc. cit., and U. Mosco [6].
Let us now give a few examples of sequences of convex sets that converge according to Definition 1. By $X$ we shall always denote a reflexive Banach space, even if for some of the results stated below the reflexivity of the space is not needed.

(a) Let $M_1 \subset M_2 \subset \dots \subset M_h \subset \dots$ be subspaces of $X$. Then,

$\lim M_h = M$, where $M$ = closure of $\bigcup_h M_h$ in $X$.

(b) Let $M_1 \supset M_2 \supset \dots \supset M_h \supset \dots$ be subspaces of $X$. Then,

$\lim M_h = \bigcap_h M_h$.
The Ritz approximation suggests the following example.

(c) Let $K$ be a closed convex subset of $X$, whose interior $K^\circ$ is not empty, and $(X_h)$ an increasing sequence of subspaces of $X$ such that $X$ = closure of $\bigcup_h X_h$ in $X$. Then,

$\lim \, (K \cap X_h) = K$.

More generally we have

(c') Let $K$ be as in (c) above and $(S_h)$ a sequence of closed convex subsets of $X$ converging to $S$ in $X$. Let us assume, furthermore, that $K^\circ \cap S \ne \emptyset$. Then,

$\lim \, (K \cap S_h) = K \cap S$.

For the proof we refer to U. Mosco [4], Lemma 1.4. (+)
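The strong lim inf part of example (c) can be visualized with a small computation. The sketch below rests on assumptions made only for the illustration: $X$ is replaced by $R^N$ with $N = 200$, $K$ is the closed unit ball (whose interior is not empty), and $X_h$ is the span of the first $n$ coordinate vectors. For a fixed $v \in K$ it computes the distance from $v$ to $K \cap X_h$ for growing $n$; the helper name `nearest_in_K_cap_Xh` is hypothetical.

```python
import numpy as np

N = 200
v = 1.0 / np.arange(1, N + 1)
v /= np.linalg.norm(v)               # a point of the closed unit ball K

def nearest_in_K_cap_Xh(v, n):
    """Nearest point to v in K ∩ X_h, with K the unit ball and X_h = span{e_1, ..., e_n}.

    Projecting on X_h (truncation) and then on the ball gives the nearest point,
    because the ball is centered at the origin.
    """
    p = np.zeros_like(v)
    p[:n] = v[:n]                    # orthogonal projection onto X_h
    norm = np.linalg.norm(p)
    return p if norm <= 1.0 else p / norm

sizes = (1, 10, 50, 200)
dists = [np.linalg.norm(v - nearest_in_K_cap_Xh(v, n)) for n in sizes]
```

The distances in `dists` decrease to zero, that is, $v$ is the strong limit of points $v_h \in K \cap X_h$, as the inclusion $K \subset \text{s-lim inf}\,(K \cap X_h)$ requires.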
The last example leads us to the general problem of the continuity of the intersection operation: under which assumptions does

$K = \lim K_h, \quad S = \lim S_h$

imply

$K \cap S = \lim \, (K_h \cap S_h)$?

At the end of Section 1 we saw that if we drop the assumption $K^\circ \ne \emptyset$ the conclusion of (c) above may be false. Therefore, some condition on the sequences involved must be imposed. We refer to the paper of J. L. Joly already quoted, where the problem raised above is investigated also in the form it takes for convex functions. Then, it is the problem of the continuity of the so-called inf-convolution $f \,\nabla\, g$, which is, roughly speaking, the polar operation of the sum $f + g$; see J. J. Moreau [1]. Joly introduces some notion of angle, called codistance, $e(K_1, K_2)$, between two convex sets $K_1$ and $K_2$, to find an alternative condition to the one required in (c') above; the condition is stated in terms of this codistance.

That a condition of this kind may be required to draw the conclusion of (c') can also be inferred from trivial examples.

[Sketch omitted.]
(+) As J. L. Joly kindly pointed out to the author, the additional hypothesis $K^\circ \cap S \ne \emptyset$ was erroneously omitted in that lemma: the vector $u_0 \in K^\circ$ that appears in the proof must be replaced, indeed, by a vector $u_0 \in K^\circ \cap S$.
(d) Let $X$ be a Hilbert space, $(p_h)$ a sequence of symmetric linear operators on $X$ such that

$v = \text{strong lim}\, p_h v$ for every $v \in X$.

If $K$ is a bounded closed convex subset of $X$, then $p_h K$ is a bounded closed convex subset of $X$ and

$\lim p_h K = K$.

For the proof see the author's paper [4], Lemma 1.5. [That $p_h K$ is closed for each given $h$ can be seen as follows: let $v = \lim_j p_h w_j$, $w_j \in K$; since $K$ is bounded and weakly closed, there exists a subsequence $(w_j')$ of $(w_j)$ that converges weakly to a vector $w_0 \in K$; thus, for every $z \in X$ we have

$(v \mid z) = \lim \, (p_h w_j' \mid z) = \lim \, (w_j' \mid p_h z) = (w_0 \mid p_h z) = (p_h w_0 \mid z)$;

therefore, $v = p_h w_0$.]
An important special case of (d) is the following.

(d') Let $X$ be a Hilbert space, $(X_h)$ an increasing sequence of finite-dimensional subspaces of $X$, with $\bigcup_h X_h$ dense in $X$, and, for each $h$, $p_h$ the orthogonal projection of $X$ onto $X_h$. Then, the conclusion of (d) above holds.

It can be shown by simple examples that if $K$ is unbounded then the projected sets $p_h K$ may not converge to $K$, even if $K$ is a closed linear subspace of $X$ (see Remark 1.3 of the reference quoted above). A sufficient condition is then the inclusion $p_h K \subset K$ for every $h$.

Let us also remark that if $K$ is unbounded then the projected set $p_h K$ need not even be closed: think of the orthogonal projection of the epigraph of the real function $y = 1/x$, $x > 0$, on the $x$-axis of the euclidean plane $x, y$.

Further examples of converging sequences of convex sets in Sobolev spaces have also been considered in the author's papers [5] and [4], in connection with some perturbed boundary value problems for partial differential operators. In this regard see also L. Boccardo [1]. In the following Section 6 we shall consider some examples arising in the theory of internal or external approximation of Sobolev spaces. Finally, let us also mention the applications to approximation theory that have been given by J. L. Joly in his paper quoted above.
3. The "stability" theorem

Let us now study how the solution $u$ of a variational inequality, involving a map $A$ and a convex set $K$, depends on the set $K$: this is the problem of the continuity of the map

$K \mapsto u$

and we shall study it with respect to the topology for convex sets described in the preceding section and the weak or strong topology in the space $X$.

We shall first assume that the map $A$ is kept fixed while $K$ is allowed to vary, and we shall mention later how joint perturbations of $A$ and $K$ can be taken into account.

Let us consider the "initial" problem

(21)    $u \in K : \quad (Au, v - u) \ge 0 \quad \forall v \in K$

and a family of "perturbed" or "approximate" problems of the form

(22)    $u_h \in K_h : \quad (Au_h, v - u_h) \ge 0 \quad \forall v \in K_h$.

We shall assume $(K_h)_h$ to be a sequence of sets. However, any directed family could also be allowed, with only minor changes in what follows.

The main result concerning the problem considered above can be stated as follows.
THEOREM 2: Let $X$ be a reflexive Banach space and

(i) $A$ a bounded monotone hemicontinuous map of a domain $D(A)$ of $X$ into $X^*$;

(ii) $(K_h)_h$ a sequence of subsets of $D(A)$, which converges to a convex subset $K$ of $D(A)$, according to Definition 1 of Section 2.

Let us assume, furthermore, that there exists for every $h$ a solution $u_h$ of the perturbed problem (22) and that the sequence $(u_h)_h$ is bounded in $X$.

Then, the initial problem (21) has at least one solution; moreover, if this solution, $u$, is unique, then $u_h$ converges weakly to $u$ in $X$ and

$(Au_h - Au, u_h - u) \to 0$.

Proof.
Let us prove first:

(a) Any weak limit $u$ in $X$ of a subsequence $(u_{h_j})_j$ of approximate solutions is a solution of the initial problem.

To simplify our notation, let us call $(u_h)_h$ the subsequence at hand. Thus, we have for every $h$

$(Au_h, w - u_h) \ge 0$ for all $w \in K_h$.

By our assumption (ii), we have $\text{w-limsup}\, K_h \subset K$, hence $u \in K$. On the other hand, we also have $K \subset \text{s-lim inf}\, K_h$; therefore every $v \in K$ is the strong limit of a sequence $(v_h)$, with $v_h \in K_h$. We can put $w = v_h$ in the inequality above and make $v$ appear in that inequality, by writing it as

$(Au_h, v - u_h) \ge (Au_h, v - v_h)$.

Now we use the monotonicity of $A$ at the left hand to get

$(Av, v - u_h) \ge (Au_h, v - v_h)$

and then we go to the limit in $h$: since $A$ is bounded, the sequence $(Au_h)_h$ is bounded, for $(u_h)_h$ is such; thus we find

$(Av, v - u) \ge 0$.

Therefore we can conclude that $u$ is a solution of the linearized problem

$u \in K : \quad (Av, v - u) \ge 0 \quad \forall v \in K$,

and since we are under the assumptions of the Linearization Lemma of Chapter 1, this also means that $u$ is a solution of problem (21).
The second step in our proof is

(b) There exists a solution $u$ of problem (21), and if the solution of (21) is unique then the whole sequence $(u_h)_h$ converges weakly to $u$ in $X$.

Since $X$ is reflexive, the existence of a solution $u$ follows from (a) and our assumption that there exists a bounded sequence $(u_h)$ of approximate solutions. The weak convergence of $u_h$ to $u$ is then an obvious consequence of the uniqueness of $u$, once we take (a) again into account.
The final step is the proof of

(c)    $(Au_h - Au, u_h - u) \to 0$.

Still in consequence of the hypothesis $K \subset \text{s-lim inf}\, K_h$, we can find for every $h$ a vector $z_h \in K_h$ such that $z_h$ converges strongly to $u$. On the other hand, since $u_h$ is a solution of (22), we have for all $h$

$(Au_h, z_h - u_h) \ge 0$.

We now write this inequality by making $u$ appear in it,

$(Au_h, u_h - u) \le (Au_h, z_h - u)$,

and then we go to the limit in $h$. Since $A$ is bounded and $u_h$ converges weakly to $u$, we find that

$\limsup_h \, (Au_h, u_h - u) \le 0$,

hence also

$\limsup_h \, (Au_h - Au, u_h - u) \le 0$,

which, by the monotonicity of $A$, is clearly equivalent to (c) above.

Remark 2. Many operators $A$ arising in the applications have the property that the weak convergence of a sequence $u_h$ to a vector $u$ of $X$, together with the convergence of the form

$(Au_h - Au, u_h - u)$

to zero, imply the strong convergence of $u_h$ to $u$ in $X$. Operators of this kind are often said to be of type (S), see F. E. Browder. This is the case, for instance, if $A$ is strongly monotone:

$(Av - Au, v - u) \ge c \, \|v - u\|^2$, with $c > 0$,

or, more generally, if $A$ satisfies a condition such as

$(Av - Au, v - u) \ge g(\|v - u\|)$,

with $g : R_+ \to R_+$ any continuous and strictly increasing function with $g(0) = 0$. [If the space $X$ is uniformly convex, then this condition can be somewhat weakened, see H. Brezis - M. Sibony [1].]

Let us also remark that the analogue of condition (c) for minimum problems involving a functional $F$ is that the joint weak convergence of $u_h$ to $u$ and of $F(u_h)$ to $F(u)$ should imply the strong convergence of $u_h$ to $u$, and it is well known that this is the case, for instance, if $F$ is the norm of a uniformly convex Banach space. In this regard, see also the Corollary of Theorem 2 below.

Remark 3.
If a simultaneous approximation of the map $A$ must also be taken into account, then problem (22) should be replaced by the problem

(25)    $u_h \in K_h : \quad (A_h u_h, v - u_h) \ge 0 \quad \forall v \in K_h$,

where $A_h$ is a suitable perturbation of the given $A$, with domain $D(A_h)$ in $X$ containing $K_h$ and range in $X^*$.

Indeed, if the maps $A_h$ are uniformly bounded, monotone and hemicontinuous and satisfy the convergence condition

(26)    $\text{graph}\, A \subset \text{s-lim inf graph}\, A_h$ in $X \times X^*$

in the strong topology of this product space, then the analogue of Theorem 2, with (22) replaced by (25) above, can be proved along the same lines, see U. Mosco [4]. [The maps $(A_h)$ are uniformly bounded in $X$ if for any bounded subset $B$ of $X$ there exists a bounded subset $B'$ of $X^*$ such that

$A_h(B \cap D(A_h)) \subset B'$ for all $h$.]
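The continuous dependence of the solution on the convex set, with the map kept fixed as in problems (21) and (22), can be observed on a very simple model. The sketch below is an illustration under assumptions of our own: $X = R^2$, $A(u) = u - z$ for a fixed vector $z$ (so that the solution of the inequality on a closed convex set is the orthogonal projection of $z$ onto that set), $K$ the closed unit ball and $K_h$ the closed ball of radius $1 + 1/h$, which decreases to $K$.

```python
import numpy as np

def project_ball(z, radius):
    """Orthogonal projection of z onto the closed ball of the given radius centered at 0."""
    norm = np.linalg.norm(z)
    return z if norm <= radius else z * (radius / norm)

z = np.array([3.0, 4.0])                 # data of the model operator A(u) = u - z
u = project_ball(z, 1.0)                 # solution of the initial problem on K

errors = []
for h in (1, 10, 100, 1000):
    u_h = project_ball(z, 1.0 + 1.0 / h)     # solution of the perturbed problem on K_h
    errors.append(np.linalg.norm(u_h - u))
```

In this model the operator is strongly monotone, so the convergence of $u_h$ to $u$ is strong, in agreement with Remark 2; here the error equals $1/h$.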
The following result is included in Theorem 2, as can be shown by using the epigraph formulation of minimum problems we discussed in Chapter 1:

Corollary of Theorem 2: Let $f : X \to (-\infty, +\infty]$ and, for each $h$, $f_h : X \to (-\infty, +\infty]$ be convex and l.s.c., and such that $f = \lim f_h$ in $X$ according to the definition of Section 2. Let us suppose that there exists for each $h$ a vector $u_h$ that minimizes $f_h$ and that the sequence $([u_h, f_h(u_h)])_h$ is bounded in $X \times R$. Then there exists a vector $u$ minimizing $f$; moreover, if $u$ is the unique minimizing vector, then $u_h$ converges weakly to $u$ and $f_h(u_h)$ converges to $f(u)$.

Proof: Apply Theorem 2 to the map $0 \times 1$ of the product space $X \times R$ into $X^* \times R$ and to the convex subsets $K = \text{epi}\, f$ and $K_h = \text{epi}\, f_h$ of $X \times R$, and use Lemma 3 of Section 9 of Chapter 1.

Similarly, we can consider a mixed variational inequality such as

$u \in X : \quad (Au, v - u) \ge f(u) - f(v) \quad \forall v \in X$,

and the perturbed inequality

$u_h \in X : \quad (Au_h, v - u_h) \ge f_h(u_h) - f_h(v) \quad \forall v \in X$,

or even take $A = A_h$ to depend on $h$ too, and use the epigraph formulation together with Theorem 2, or its generalization with $A = A_h$, to obtain a result on the convergence of the perturbed solution $u_h$. We refer to U. Mosco [4] for more details on this point. See also Theorem 4 below.

In the following section we shall apply Theorem 2 in order to obtain further existence results for variational inequalities and related problems. In Section 6 we shall apply Theorem 2 to prove the convergence of certain schemes of finite-dimensional approximation of the solutions of variational inequalities or minimum problems.

Let us also mention that Theorem 2 could also be useful to investigate the continuous dependence of the solution of boundary value problems on the, possibly unilateral, constraints imposed on the solution. Some applications of this type can be found in the author's papers [4], [6] and in L. Boccardo [1].
4. Further existence theorems

We shall assume throughout this section that $X$ is a separable reflexive Banach space.

The reason for the separability assumption, which is indeed unnecessary for the general validity of the existence results we shall prove below, must be found in the fact that we shall deduce our results from the "stability" theorem of Section 3, and in that theorem only sequences of perturbed sets $K_h$ were allowed. To remove this assumption we should rely, instead of on Theorem 2, on the analogous stability result for directed families $(K_h)$.

Let us now prove the general existence theorem, namely Theorem 5 we stated in the last section of Chapter 2. In addition to the separability of the space, we shall now prove that theorem by assuming the map $A$
to be bounded.

THEOREM 3. Let $A$ be a bounded monotone hemicontinuous map of $X$ into $X^*$, and $K$ a closed convex subset of $X$. Let us suppose either that $K$ is bounded or that $A$ satisfies the following coerciveness condition on $K$:

($d_0$) There exist $v_0 \in K$ and $R > 0$, with $\|v_0\| < R$, such that

$(Av, v - v_0) > 0$ for all $v \in K$, $\|v\| = R$.

Then, there exists a solution $u$ of the problem

(27)    $u \in K : \quad (Au, v - u) \ge 0 \quad \forall v \in K$.

Proof: By the separability of $X$, we can find an increasing
sequence of finite-dimensional closed convex subsets $K_h$ of $K$, with $v_0 \in K_1$, such that

$K$ = closure of $\bigcup_h K_h$.

We shall now apply the finite-dimensional existence theorem of Chapter 2 to prove the existence, for every $h$, of a solution $u_h$ of the problem

(28)    $u_h \in K_h : \quad (Au_h, v - u_h) \ge 0 \quad \forall v \in K_h$,

such that

$\|u_h\| \le R$ for all $h$,

$R$ being the constant appearing in condition ($d_0$) if $K$ is unbounded, or any positive constant large enough if $K$ is bounded.

Let $h$ be fixed and let $X_h$ be an $n$-dimensional subspace of $X$, $n = n_h$, that contains $K_h$. Let $\pi_h : E^n \to X$ be an injective linear map with $\pi_h(E^n) = X_h$, and let $\pi_h^* : X^* \to E^n$ be the transpose of $\pi_h$,

$(\pi_h^* v^*, x) = (v^*, \pi_h x)$ for $v^* \in X^*$, $x \in E^n$.

Then the set

$C_h = \{ x \in E^n : \pi_h x \in K_h \}$

is a closed convex subset of $E^n$, bounded if $K$ is such. Moreover, the map

$A_h = \pi_h^* A \, \pi_h$

is continuous from $E^n$ to $E^n$. [In fact, $\pi_h$ is obviously continuous from $E^n$ to $X$, $A$ is continuous from $X_h$ to $X^*$ endowed with the weak topology (see Remark 8 of Chapter 1) and $\pi_h^*$, in turn, is continuous from $X^*$ with that topology to $E^n$.] Finally, if $K_h$, hence $C_h$ too, is not bounded, then by the assumption ($d_0$) we have

$(A_h x, x - x_0) = (A\pi_h x, \pi_h x - \pi_h x_0) = (Av, v - v_0) > 0$,

where $\pi_h x_0 = v_0$, $x_0 \in C_h$, for all $x$ such that $v = \pi_h x \in K$ and $\|v\| = R$; in particular, for all $x \in C_h$ such that $\|\pi_h x\| = R$.

We are now in a position to apply Theorem 1 of Chapter 2, with $K = C_h$ and $B = \{ x \in E^n : \|\pi_h x\| \le R \}$, and we find that there exists a solution $x \in C_h \cap B$ of

$x \in C_h : \quad (A_h x, y - x) \ge 0 \quad \forall y \in C_h$,

hence a solution $u_h = \pi_h x$ of problem (28), with $\|u_h\| \le R$.

Thus, we have shown that there exists a bounded sequence $(u_h)$
of approximate solutions. The conclusion of the theorem is now a consequence of the stability theorem.

Remark 4: If the map $A$ is only defined on $K$, which is not an open domain, then the hemicontinuity assumption must be strengthened. In fact, in this case the demicontinuity of $A$ does not necessarily follow from the hemicontinuity (cfr. Remark 8 of Chapter 1), nor can we even affirm that $A$ is continuous from the finite-dimensional sections $K \cap X_h$ of $K$ ($X_h$ being any finite-dimensional subspace of $X$) to the weak topology of $X^*$. This latter property, however, was needed in the proof above to show the continuity of $A_h$. Therefore, it must be assumed "a priori", in place of the hemicontinuity, when $A$ does not have an open domain.

Remark 5:
We know from Chapter 1 that the set of all solutions of problem (27) is closed and convex. Under the assumptions of Theorem 3 above, we also have that this set is bounded. In fact this is obviously the case if $K$ is bounded. If $K$ is unbounded, then any solution $u$ is bounded in norm by the constant $R$ that appears in condition ($d_0$) of the theorem. In fact, if $\bar{u}$ is a solution of (27) with $\|\bar{u}\| > R$, since there exists at least one solution $u$ with $\|u\| \le R$ and the set of all solutions is convex, there would also exist a solution $\tilde{u}$ with $\|\tilde{u}\| = R$; hence

$(A\tilde{u}, v_0 - \tilde{u}) \ge 0$,

and this contradicts ($d_0$).
Remark 6. The coerciveness condition ($d_0$) can obviously be replaced by the stronger condition

($d_1$) There exists $v_0 \in K$ such that

$(Av, v - v_0) \to +\infty$ as $\|v\| \to \infty$, $v \in K$

(cfr. Section 4 of Chapter 2).

We shall now deduce from Theorem 3 an existence theorem for inequalities of type

(29)    $u \in X : \quad (Au, v - u) \ge F(u) - F(v) \quad \forall v \in X$,

by using the epigraph formulation of this problem we already discussed in Sections 5 and 9 of Chapter 1.
THEOREM 4. Let $A$ be a bounded monotone hemicontinuous map of $X$ into $X^*$, $F$ a l.s.c. convex functional on $X$ with values in $(-\infty, +\infty]$. Let us suppose that $A$ and $F$ satisfy the following coerciveness condition:
$(d_0')$ There exist $R > 0$ and $v_0 \in X$, with $\|v_0\| < R$ and $F(v_0) < +\infty$, such that
$(Av, v - v_0) + F(v) - F(v_0) > 0$ for all $v \in X$ with $\|v\| = R$.
Then, problem (29) above has a solution $u$.
Remark 7. Condition $(d_0')$ can obviously be replaced by the stronger condition
$(d_1')$ There exists $v_0 \in X$, with $F(v_0) < +\infty$, such that
$(Av, v - v_0) + F(v) \to +\infty$ as $\|v\| \to +\infty$
(cfr. Remark 6 above). Let us also notice that if the effective domain of $F$ is bounded, then $(d_0')$ is automatically satisfied, for it suffices then to choose any $v_0$ with $F(v_0) < +\infty$ and $R$ large enough, so that $F(v) = +\infty$ whenever $\|v\| = R$.
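The passage from the asymptotic coerciveness condition to the condition on a sphere can be written out in one step. The following is only a sketch, assuming the two conditions have the form "$(Av, v - v_0) + F(v) \to +\infty$ as $\|v\| \to +\infty$" and "$(Av, v - v_0) + F(v) - F(v_0) > 0$ for $\|v\| = R$", with $F(v_0) < +\infty$; the radius $R$ is the quantity to be produced.

```latex
% Assumption: F(v_0) < +\infty and (Av, v - v_0) + F(v) \to +\infty as \|v\| \to +\infty.
% Divergence to +\infty yields a radius beyond which the expression exceeds F(v_0):
\[
  \exists\, R > \|v_0\| : \qquad
  (Av, v - v_0) + F(v) > F(v_0)
  \quad \text{for all } v \in X \text{ with } \|v\| \ge R .
\]
% Restricting to the sphere \|v\| = R gives the weaker condition:
\[
  (Av, v - v_0) + F(v) - F(v_0) > 0
  \quad \text{for all } v \in X \text{ with } \|v\| = R .
\]
```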
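Proof of Theorem 4. In terms of the epigraph formulation of Sections 5 and 9 of Chapter 1, problem (29) can be equivalently written as follows
(30)  $\tilde u \in \tilde K : \ (\tilde A \tilde u, \tilde v - \tilde u) \ge 0 \quad \forall \tilde v \in \tilde K,$
where $\tilde A$ is the map of $X \times \mathbb{R}$ associated with $A$ in that formulation and $\tilde K = \operatorname{epi} F$.
As we know from Section 7 of Chapter 1, to find a solution of the inequality above, it suffices to find a vector $\tilde u \in \tilde K$ which is a local solution of (30).
To find such a local solution $\tilde u = [u, \alpha]$ let us consider the auxiliary problem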
(31)  $\tilde u \in \tilde K_R : \ (\tilde A \tilde u, \tilde v - \tilde u) \ge 0 \quad \forall \tilde v \in \tilde K_R,$
where $\tilde K_R$ is the intersection of the epigraph $\tilde K$ of $F$ with the "cylinder" $B_R \times I$. Here $B_R$ is the closed ball of $X$ whose radius is the constant $R$ that appears in condition $(d_0')$, while $I = [a, b]$ is a closed bounded interval of the real line, which we assume to have been chosen large enough, as we shall specify later.
Since the map $\tilde A$ is obviously bounded, monotone and hemicontinuous in $X \times \mathbb{R}$ and $\tilde K_R$ is a bounded closed convex subset of $X \times \mathbb{R}$, the existence of a solution $\tilde u = [u, \alpha]$ of problem (31) is now a consequence of Theorem 3. To prove that $\tilde u = [u, \alpha]$ is also a local solution of our initial problem (30), it suffices now to show that $\tilde u$ does not belong to the boundary of the "cylinder" $B_R \times I$, which is to say, that we have
(32)  $\|u\| < R$ and $a < \alpha < b$.
Let us now suppose that $a$ and $b$ were so chosen as to satisfy
(33)  $a < -c_0 R - c_1,$
where $c_0 > 0$ and $c_1 > 0$ are such that $F(v) \ge -c_0 \|v\| - c_1$ for all $v \in X$ (any l.s.c. convex $F$ can be bounded from below in this way), and
(34)  [...]
$v_0$ being the vector appearing in condition $(d_0')$.
Let us now interpret $\tilde K_R$ as the intersection
$\tilde K_R = (\operatorname{epi} F_R) \cap (X \times I)$
of the epigraph of the functional $F_R$ with the "strip" $X \times I$, $F_R$ being obtained from $F$ by putting $F_R = +\infty$ outside the ball $B_R$. Note that $F_R(v_0) = F(v_0) < +\infty$. Moreover, by our previous choice (33) of $a$ we now have $a \le \inf F_R$, actually $a < \inf F_R$.
[In fact, by (33), we have
$a < -c_0 R - c_1 \le -c_0 \|v\| - c_1 \le F(v)$ for all $\|v\| \le R$,
hence $a < \inf F_R$.]
Thus, by Lemma 3 and Remark 17 of Section 9 of Chapter 1, since $\tilde u = [u, \alpha]$ is a solution of inequality (31), then $u$ is a solution of the mixed inequality
(35)  $u \in X : \ (Au, v - u) \ge F_R(u) - F_R(v) \quad \forall v \in X,\ F_R(v) \le b,$
and $\alpha = F_R(u)$.
By putting now $v = v_0$ in (35) [note that $F_R(v_0) = F(v_0) \le b$ by (34)] we find that $F_R(u) < +\infty$, hence $F_R(u) = F(u)$ and then
(36)  $(Au, u - v_0) + F(u) - F(v_0) \le 0.$
By condition $(d_0')$, this implies $\|u\| < R$, which is the first strict bound in (32) we had to prove. It remains to prove the bounds on $\alpha$. Again by inequality (36) above and the monotonicity of $A$, we have
$\alpha = F(u) \le F(v_0) + (Au, v_0 - u) \le F(v_0) + (Av_0, v_0 - u),$
hence $\alpha < b$, by our choice (34) of $b$. On the other hand, as we have already seen, we also have
$\alpha = F_R(u) \ge \inf F_R > a.$
Remark 8.
Theorem 4 was deduced above from Theorem 3. However, it clearly implies in turn both Theorem 3 (which is obtained by taking $F \equiv 0$) and the direct existence theorem of Chapter 2 (which is obtained by taking $A \equiv 0$).
Existence theorems for mixed variational inequalities such as (29) have been given by C. Lescarret [1], F. E. Browder [8], [9], J. L. Lions and G. Stampacchia [1], R. T. Rockafellar [6]. Theorem 4 is essentially due to F. E. Browder, loc. cit. See also U. Mosco [2], from which the proof given above is taken.
In the remainder of the present section we shall apply Theorem 3 to prove the existence of fixed points of so-called inward non-expansive mappings $U$ of a closed convex subset $K$ of a Hilbert space $V$ into $V$.
Let us recall that $U$ is said to be an inward mapping if for every $v \in K$ the vector $z = Uv$ belongs to some ray issuing from $v$ and passing through a point $w$ of $K$, that is, $z = v + \lambda (w - v)$ with $w \in K$ and $\lambda \ge 0$. Clearly, any mapping of $K$ into $K$ is an inward mapping, for then we can take $w = Uv$ and $\lambda = 1$.
The following theorem, that generalizes Schauder's theorem, is due to F. E. Browder [11]:
THEOREM 5. Let $U$ be an inward non-expansive mapping of a bounded closed convex subset $K \ne \emptyset$ of a Hilbert space $V$ into $V$. Then, $U$ has a fixed point in $K$. Moreover, the set of all fixed points of $U$ is a closed convex subset of $K$.
Proof. We know from Section 4 of Chapter 1 that $u$ is a fixed point of $U$, i.e., $u = Uu$, if and only if $u$ is a solution of the variational inequality
(37)  $u \in K : \ (\mathcal{A} u, v - u) \ge 0 \quad \forall v \in K,$
where $\mathcal{A} = I - U$, $I$ = identity map of $V$. By taking Lemma 2 below into account, the existence of a solution $u$ of (37) and the properties of the set of all solutions of (37) come as a consequence of Theorem 3 above and the Addendum 1 to Theorem 5 of Chapter 1.
LEMMA 2. Let $U : K \to V$ be non-expansive. Then, the map $\mathcal{A} = I - U$ is monotone and lipschitzian, in particular, bounded and continuous.
Proof. We have for every $v$, $w$ of $K$
$(\mathcal{A} v - \mathcal{A} w, v - w) = \|v - w\|^2 - (Uv - Uw, v - w) \ge \|v - w\|^2 - \|Uv - Uw\| \, \|v - w\| \ge (1 - c) \|v - w\|^2,$
provided $\|Uv - Uw\| \le c \|v - w\|$. Therefore, if $U$ is non-expansive ($c = 1$), then $\mathcal{A} = I - U$ is monotone. Moreover,
$\|\mathcal{A} v - \mathcal{A} w\| \le \|v - w\| + \|Uv - Uw\| \le 2 \|v - w\|,$
hence $\mathcal{A}$ is lipschitzian.
Remark 9. If $U$ is a contraction ($c < 1$ in the inequality above), then $\mathcal{A} = I - U$ is strongly monotone. More refined relations between "monotonicity" and "non-expansiveness" can be found in the papers of F. E. Browder and in Z. Opial [1].
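The two estimates of Lemma 2 can be checked numerically on a simple non-expansive map. The sketch below is only an illustration under assumptions that are not in the lectures: $V$ is taken to be the plane, $U$ is a rotation (an isometry, hence non-expansive with $c = 1$), and the helper names `rotate`, `script_a`, `monotonicity_gap` and `lipschitz_ratio` are hypothetical.

```python
import math
import random

def rotate(v, theta):
    """Plane rotation by the angle theta: an isometry, hence non-expansive."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def script_a(U, v):
    """The map I - U applied to v."""
    Uv = U(v)
    return (v[0] - Uv[0], v[1] - Uv[1])

def monotonicity_gap(U, v, w):
    """The pairing ((I - U)v - (I - U)w, v - w), expected to be >= 0."""
    av, aw = script_a(U, v), script_a(U, w)
    return (av[0] - aw[0]) * (v[0] - w[0]) + (av[1] - aw[1]) * (v[1] - w[1])

def lipschitz_ratio(U, v, w):
    """The quotient ||(I - U)v - (I - U)w|| / ||v - w||, expected to be <= 2."""
    av, aw = script_a(U, v), script_a(U, w)
    num = math.hypot(av[0] - aw[0], av[1] - aw[1])
    den = math.hypot(v[0] - w[0], v[1] - w[1])
    return num / den

random.seed(0)
U = lambda v: rotate(v, 1.0)  # rotation by one radian
samples = [((random.uniform(-1, 1), random.uniform(-1, 1)),
            (random.uniform(-1, 1), random.uniform(-1, 1)))
           for _ in range(1000)]
min_gap = min(monotonicity_gap(U, v, w) for v, w in samples)
max_ratio = max(lipschitz_ratio(U, v, w) for v, w in samples)
```

For a rotation the pairing equals $(1 - \cos\theta)\,\|v - w\|^2$, so `min_gap` stays non-negative, and `max_ratio` stays below the Lipschitz constant 2 of the lemma.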
For the construction of fixed points, see also F. E. Browder-W. V. Petryshyn [1], [2], S. Kaniel [1]. An expository reference for fixed points of non-expansive mappings and iteration methods of solution is D. G. de Figueiredo [1].
5. Finite-dimensional approximation, I: the discrete problem.
Most of the methods that can be used for the numerical solution of variational inequalities arising in the applications can be put, essentially, into the Ritz-Galerkin framework we discussed at the beginning of this chapter.
Let us briefly summarize the situation, trying again to point out which are the important choices that must be made in order to write down: first, an approximate problem yielding an approximate solution $u_h(x)$; second, a discrete problem allowing in practice the numerical evaluation of $u_h(x)$. We shall postpone to the following section the discussion of the convergence of the approximate solution $u_h(x)$ to the solution $u(x)$ of the given problem. Thus, the discretization parameter $h$ must always be considered in the present section as fixed.
We shall assume for convenience, as in Section 1, that the elements of the space $X$ are functions.
We start with an initial problem
(38)  $u \in K : \ (Au, v - u) \ge 0 \quad \forall v \in K$
that involves a normed space $X$, a map $A$ of $X$ into $X^*$ and a convex subset $K$ of $X$.
[These should all be considered as our data. However, the same problem can be given a different formalization leading to different $X$, $A$ and $K$. A new formalization can arise intrinsically from the problem at hand, or else it can be suggested by a technical motivation. We shall see an example of that in connection with the usual finite-difference method in problems involving partial differential operators.]
To write down an approximate problem
(39)  $u_h \in K_h : \ (A u_h, v_h - u_h) \ge 0 \quad \forall v_h \in K_h$
we must conveniently choose a finite-dimensional subspace $X_h$ of $X$ and a convex subset $K_h$ of $X_h$.
[We could also take into account an approximation $A_h$ of the map $A$, $A_h$ being a map from $X$ to $X^*$ with a domain $D(A_h)$ that contains $K_h$. Then the approximate problem becomes
$u_h \in K_h : \ (A_h u_h, v_h - u_h) \ge 0 \quad \forall v_h \in K_h.$
Here too the pairing $(\cdot, \cdot)$ is the duality pairing between $X$ and $X^*$.]
The further choice of a basis
(40)  $\varphi^h_1, \dots, \varphi^h_n$
in $X_h$, $n = n_h$ being the dimension of $X_h$, allows us to associate with the approximate problem (39) a discrete problem
(41)  $u^h \in C^h : \ (A^h u^h, v^h - u^h) \ge 0 \quad \forall v^h \in C^h$
in the euclidean space $E^n$. Here $C^h$ is a convex subset of $E^n$ and $A^h$ is a map of $E^n$ into $E^n$, which are obtained from $K_h$ and $A$ as we now show.
Let
$\pi_h : E^n \to X$
be the injective map that brings the vector $v^h = (v^h_q)_q$ of $E^n$ into the function
(42)  $v_h(x) = \sum_q v^h_q \varphi^h_q(x)$
of $X_h$. Clearly $\pi_h$ is a (norm) isomorphism of $E^n$ onto $X_h$, hence there is a unique convex subset $C^h$ of $E^n$ such that $K_h = \pi_h C^h$.
[The actual determination of $C^h$ can present in practice some difficulties, depending on the choice of $X_h$, hence of $K_h$, and of the basis (40) of $X_h$ that has been made. In this respect see also the following Remark 10.]
Moreover, the map $\pi_h$ has a transpose
$\pi_h^* : X^* \to E^n,$
whose kernel is the annihilator of $X_h$ in $X^*$, and
$A^h = \pi_h^* A \pi_h$
is a map of $E^n$ into $E^n$. For all vectors $v^h$ and $w^h$ of $E^n$ we have
$(A^h v^h, w^h) = (A v_h, w_h),$
where $v_h(x) = \pi_h v^h$, $w_h(x) = \pi_h w^h$. Thus, a vector $u^h = (u^h_q)_q \in E^n$ is a solution of the discrete problem (41) if and only if the function $u_h(x) = \pi_h u^h$ is a solution of the approximate problem (39).
[We have in fact
$(A^h u^h, v^h - u^h) = (A u_h, v_h - u_h),$
where $v_h = \pi_h v^h$, $u_h = \pi_h u^h$, and $\pi_h$ is a 1-1 map of $C^h$ onto $K_h$.]
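The relation between the discrete map and the original one can be tried out on a toy case. The sketch below assumes, purely for illustration, that $X = \mathbb{R}^3$ with the euclidean pairing, that $A$ is linear and given by a matrix, that $\pi_h$ is the matrix `P` whose two columns play the role of the basis functions of $X_h$, and that the discrete map is $\pi_h^* A \pi_h$; the data and the names `A`, `P`, `A_h` are hypothetical.

```python
def matvec(M, x):
    """Matrix-vector product for matrices stored as lists of rows."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# Hypothetical linear map A on X = R^3.
A = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]

# Columns of P are the two basis vectors of X_h; P plays the role of pi_h.
P = [[1.0, 0.0],
     [1.0, 1.0],
     [0.0, 1.0]]

def A_h(u):
    """Discrete map pi_h^* A pi_h acting on a coefficient vector of E^2."""
    return matvec(transpose(P), matvec(A, matvec(P, u)))

u, v = [0.5, -2.0], [1.0, 3.0]
lhs = dot(A_h(u), v)                                # pairing in E^n
rhs = dot(matvec(A, matvec(P, u)), matvec(P, v))    # pairing in X

# q-th component of A_h(u) as the pairing of A u_h with the q-th basis vector.
basis = transpose(P)
components = [dot(matvec(A, matvec(P, u)), phi_q) for phi_q in basis]
```

Both pairings agree, and `components` reproduces `A_h(u)` entry by entry, which is the finite-dimensional content of passing from the approximate problem to the discrete one.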
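To write down the discrete problem (41) explicitly, let us compute the components of the vector $A^h u^h$ of $E^n$ in the canonical basis $(e_q)_q$. We have
$(A^h u^h)_q = (A^h u^h, e_q) = (A \pi_h u^h, \pi_h e_q),$
and since $\pi_h e_q = \varphi^h_q$ we find
$(A^h u^h)_q = (A u_h, \varphi^h_q).$
Therefore, $(A^h u^h)_q$ is expressed, in terms of the components $(u^h_q)_q$ of $u^h$, by the same functions $A^h_q(u^h_1, \dots, u^h_n)$ introduced in Section 1. Thus, the discrete problem (41) is the same as the discrete problem of that section, namely
(45)  $u^h \in C^h : \ \sum_q A^h_q(u^h_1, \dots, u^h_n) \, (v^h_q - u^h_q) \ge 0 \quad \forall v^h \in C^h.$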
We shall not discuss in our lectures which methods can be used to solve the discrete problem numerically. Let us only remark that if the map $A^h$ is strongly monotone, then the iterative methods of Chapter 2 could be applied. On the other hand, if $A$ is the gradient of a convex functional and, therefore, the solution of (38) is also the vector that optimizes a convex function on a convex subset of $E^n$, then the numerical solution of (38) could be carried out by means of one of the several methods available in convex optimization, see for instance J. Cea [2]. For a linear $A$, pivoting methods such as those typical of linear programming can also be tried. We shall see an example of that in the last section of the present Chapter.
Remark 10. The role played by the euclidean metric and the canonical basis $(e_q)_q$ in associating the discrete problem (41) with the approximate problem (39) can be taken by any inner product and any orthonormal basis with respect to that inner product. This change will naturally affect the map $\pi_h$ and its transpose $\pi_h^*$, hence also the subset $C^h$ of $\mathbb{R}^n$ and the map $A^h$ of $\mathbb{R}^n$ into itself. Since the discrete problem (45) will be modified in consequence, a suitable choice of the new metric and a new basis can facilitate the actual numerical solution of (45). In this regard, let us recall Remark 8 of Section 3 of Chapter 2.
[The map $\pi_h$ is still the linear map of $\mathbb{R}^n$ into $X$ that brings the vectors of the new basis chosen in $\mathbb{R}^n$ into the functions of the basis (40) of $X_h$; hence $\pi_h v^h$, for any $v^h \in \mathbb{R}^n$, is given again by (42), where $(v^h_s)_s$ are now the components of $v^h$ in the new basis. While the subset $C^h = \pi_h^{-1} K_h$ changes only according to the change of $\pi_h$, the map $A^h = \pi_h^* A \pi_h$ will also be affected by the change of the metric, on which the transpose $\pi_h^*$ depends. Let us also notice that the discrete problem is still given by (45) above, where the functions $A^h_q$ are the same as before, provided the components $(v^h_q)_q$, $(u^h_q)_q$ of $v^h$ and $u^h$ are now taken with respect to the new basis in
$\mathbb{R}^n$.]
Remark 11. When $A$ is the differential $DF$ of a (convex) functional $F$ on $X$, we know that the variational inequality (38) characterizes a solution $u$ of the minimum problem:
(46)  $u \in K$ minimizes $F$ on $K$.
This suggests that we could directly solve this minimum problem, by taking an approximate problem
(47)  $u_h \in K_h$ minimizes $F$ on $K_h$
and then, once the basis $(\varphi^h_s)_s$ has been chosen in $X_h$, numerically solving the discrete euclidean problem
(48)  $u^h \in C^h$ minimizes $\tilde F$ on $C^h$.
Here $C^h = \pi_h^{-1} K_h$ and $\tilde F = F \circ \pi_h$, that is, $\tilde F(v^h) = F(\pi_h v^h)$ for $v^h \in \mathbb{R}^n$.
It should be remarked, however, that the discrete variational inequality associated with the discrete minimum problem (48) above, that is
$u^h \in C^h : \ (\nabla \tilde F(u^h), v^h - u^h) \ge 0 \quad \forall v^h \in C^h,$
is the same as the variational inequality (45) that we may find by discretizing the variational inequality associated with the initial problem (46). In fact, we have
$\nabla \tilde F(u^h) = \pi_h^* DF(\pi_h u^h),$
and the components of $\nabla \tilde F(u^h)$ in the canonical basis $(e_s)$ of $\mathbb{R}^n$ are given by
$(DF(u_h), \varphi^h_s),$
for $\pi_h e_s = \varphi^h_s$ and $\pi_h u^h = u_h = \sum_q u^h_q \varphi^h_q$. Thus, Ritz-Galerkin discretization and weak characterization of minimum problems are "commuting" operations.
6. Finite-dimensional approximation, II: convergence of the approximate solutions.
We shall now give conditions on the map $A$ and the approximant convex sets $K_h$ in order that the approximate solutions $u_h(x)$ converge to the solution $u(x)$ of the initial problem.
As the stability theorem shows, the most natural convergence to be expected from the sequence $(u_h(x))$ is the weak convergence of $u_h(x)$ to $u(x)$ on the space $X$ where $A$ acts, together with the convergence to zero of the form $(Au_h - Au, u_h - u)$. As we said in Section 3, for a whole class of maps, by definition those of type $(S)$, the convergence of $u_h(x)$ to $u(x)$ in the norm of the space $X$ will then follow as a consequence.
Let us notice, however, that we shall not be able to give estimates of the error $\|u - u_h\|$ of the same type as those which are common in the analogous approximation of the equation $Au = f$. In this regard see Remark 15 below.
By taking the finite-dimensional existence theorem of Chapter 2 into account, we can state the stability theorem of Section 3 in the form of a "constructive" existence theorem, more suitable for our present aims.
THEOREM 6. Let $A$ be a bounded, strictly monotone and hemicontinuous map of $X$ into $X^*$. Let $K$ be a closed convex subset of $X$, $(K_h)_h$ a sequence of finite-dimensional closed convex subsets of $X$, such that $K = \lim K_h$ in $X$, according to Definition 1 of Section 2. Let us assume, furthermore, that the following coerciveness condition holds:
$(d_1)$ There exists $v_0 \in \bigcap_h K_h$ such that
$(Av, v - v_0) \to +\infty$ as $\|v\| \to \infty$.
Then, there exists for every $h$ a unique solution $u_h$ of the problem
$u_h \in K_h : \ (A u_h, v - u_h) \ge 0 \quad \forall v \in K_h$
and a unique solution $u$ of the problem
$u \in K : \ (Au, v - u) \ge 0 \quad \forall v \in K.$
Moreover, $u_h$ converges weakly to $u$ in $X$ and the form $(Au_h - Au, u_h - u)$ converges to zero.
COROLLARY. If, in addition, $A$ is of type $(S)$, then $u_h$ converges strongly to $u$ in $X$.
Proof: Theorem 6 reduces to a special case of Theorem 2 once we have proved the existence of a bounded sequence $(u_h)$ of approximate solutions. For given $h$, the existence of $u_h$ can be proved by applying Theorem 3 to the map $A$ and the set $K_h$. Let us only note that, if $K_h$ is unbounded, then the coerciveness property $(d_0)$ required in Theorem 3 is now an obvious consequence of our present assumption $(d_1)$ (see Remark 6 above). The boundedness of the sequence $(u_h)$ is a further consequence of $(d_1)$: in fact, if $v_0$ is the vector that appears in $(d_1)$, $v_0 \in K_h$ for every $h$, hence
$(A u_h, u_h - v_0) \le 0,$
and this clearly implies, by $(d_1)$, that the sequence $(u_h)$ is bounded in $X$.
Remark 12. The assumption $(d_1)$ of Theorem 6 requires, in particular, that the approximants $K_h$ have a non-empty intersection. A more general condition avoiding this hypothesis can be found, for instance, in a paper of U. Mosco, namely Theorem 3 there. See also the following remark.
Remark 13. Condition $(d_1)$ in Theorem 6 can be replaced by the following assumption on $A$: there exists a function $\phi(r)$, continuous and strictly increasing on $\mathbb{R}^+$, with $\phi(0) = 0$ and $\phi(r) \to +\infty$ as $r \to +\infty$, such that
$\phi(\|v - w\|) \, \|v - w\| \le (Av - Aw, v - w) \quad \forall v, w \in X.$
Let us also note, in particular, that a map $A$ of this type satisfies the property $(S)$ mentioned in the corollary of Theorem 6, see Remark 2 of Section 3.
Remark 14. The approximate solutions considered in Theorem 6 are required to satisfy the initial inequality exactly. This corresponds, indeed, to the fact that the map $A$ was left unchanged in the approximate problem. However, a simultaneous approximation of $A$ could also be taken into account and in this regard we recall Remark 3 of Section 3.
As we said at the beginning, the problem we are concerned with in the present section is the convergence of the approximate solutions $u_h$ to the initial solution $u$. In this respect, the meaning of Theorem 6 above is that, for inequalities involving a map $A$ of the type considered in that theorem, the proof of the weak or strong convergence of $u_h$ to $u$ is reduced to the proof of the convergence of the approximate sets $K_h$ to the convex set $K$ of the initial problem, in the sense of Section 2.
The proof that $K = \lim K_h$ can be achieved, in some cases, by using one of the general convergence results given in Section 2 or otherwise by carrying it out directly in the specific situation at hand. We shall now consider some examples.
(a) Ritz-Galerkin approximation. An example of the first circumstance mentioned above is the internal approximation of Ritz-Galerkin type of a convex set with a non-empty interior. In this case, given an initial convex set $K$ in the space $X$, the approximants $K_h$ are simply chosen to be the finite-dimensional sections
$K_h = K \cap X_h$
of $K$, with respect to an increasing sequence of finite-dimensional subspaces $X_h$ of $X$ such that $X$ = closure of $\bigcup_h X_h$. Then we know from (c) of Section 2 that we have convergence of $K_h$ to $K$, hence, under the assumption of Theorem 6, convergence of $u_h$ to $u$ in the sense of that theorem, provided the interior of $K$ is not empty. If the interior of $K$ is empty, then a condition of the type mentioned in (a) of Section 2 could be checked.
(b) Internal approximation of Sobolev spaces. A general scheme of internal approximation, which is typical, for instance, of the finite-element methods, can be described as follows:
$K$ is a closed convex subset of $X$;
$(V^h)$ is a sequence of finite-dimensional spaces;
$L^h$ is, for every $h$, a closed convex subset of $V^h$.
We assume that there exist for every $h$ an injective map $p_h$ of $V^h$ into $X$ and a map $r_h$ such that (i) and (ii) below hold:
(i) $p_h L^h \subset K$ for every $h$;
(ii) $\|v - p_h r_h v\| \to 0$ for every $v \in K$.
Under these assumptions, if $K_h = p_h L^h$, we obviously have $K = \lim K_h$ in $X$.
Differently from example (a), this scheme requires that the convergence condition (ii) be checked directly in the specific situation at hand. We shall now see a classical example of this kind of approximation, namely the internal approximation of the Sobolev space $H^1_0(\Omega)$. For this, as well as for the example of external approximation of $H^1_0(\Omega)$ we shall give in subsection (c) below, we refer to J. Cea [1], J. P. Aubin [1-5], F. Di Guglielmo [1].
Example 1. Internal approximation of $H^1_0(\Omega)$.
Here $X = H^1_0(\Omega)$, where $\Omega$ is a bounded open subset of $\mathbb{R}^n$ and $H^1(\Omega)$ is the Sobolev space of all real functions $v(x)$ on $\Omega$ such that $v \in L^2(\Omega)$ and $v_{x_i}$, the distribution derivative of $v$, also belongs to $L^2(\Omega)$, for every $i = 1, \dots, n$. $H^1_0(\Omega)$ is a reflexive Banach space with its norm.
Given a discretization parameter $h = (h_1, \dots, h_n)$, $h_i > 0$ for every $i$, we shall now define an injective map $p_h$ of $V^h$ into $X$, where $V^h$ is an $n_h$-dimensional vector space, depending on $h$ and on the given $\Omega$. Let us consider, indeed:
- $\alpha(\xi)$ = the characteristic function of the real interval $[-\frac{1}{2}, \frac{1}{2}]$;
- $\alpha * \alpha(\xi)$ = the one-dimensional "tent" function;
- for any multi-index $q = (q_1, \dots, q_i, \dots, q_n)$ with integer entries $q_i$, the function
$\varphi^h_q(x) = \alpha * \alpha(x_1/h_1 - q_1) \cdots \alpha * \alpha(x_n/h_n - q_n),$
i.e., the $n$-dimensional tent function whose support is the "$n$-cube" with center at the point $Q = (q_1 h_1, \dots, q_i h_i, \dots, q_n h_n)$ and "edge length" $2h = (2h_1, \dots, 2h_i, \dots, 2h_n)$;
- $Q^h$ = the set of all multi-indices $q = (q_1, \dots, q_n)$ such that the support of $\varphi^h_q$ is contained in $\Omega$. Note that $\varphi^h_q(x) \in H^1_0(\Omega)$ for every $q \in Q^h$;
- $V^h = V^h(\Omega)$ = the space of all real vectors $v^h = (v^h_q)_{q \in Q^h}$;
- the map $p_h$ of $V^h$ into $X$ given by $p_h v^h = v_h$, that is
$v_h(x) = \sum_{q \in Q^h} v^h_q \, \alpha * \alpha(x_1/h_1 - q_1) \cdots \alpha * \alpha(x_n/h_n - q_n);$
- $X_h = p_h V^h(\Omega)$, the subspace of $X = H^1_0(\Omega)$ generated by the basis functions $\varphi^h_q(x)$, $q \in Q^h$. Since $p_h$ is obviously injective, $V^h$ and $X_h$ are isomorphic and in many arguments they can indeed be identified;
- the map $r_h$ which associates the function $v(x)$ with the vector $v^h = (v^h_q)_{q \in Q^h}$ whose $q$-component is the mean value of $v(x)$ on the region [...].
The main approximation results can then be summarized as follows:
Lemma 3. For every $v \in H^1_0(\Omega)$, $p_h r_h v$ converges to $v$ in $H^1_0(\Omega)$ as $|h| \to 0$.
Corollary 1. If $X = H^1_0(\Omega)$ and $X_h = p_h V^h(\Omega)$, then $X = \lim X_h$.
Corollary 2. If [...] and [...], then $K = \lim K_h$ in $H^1_0(\Omega)$.
For the estimate of the error $\|v - p_h r_h v\|$ we refer to the papers quoted above.
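The behaviour of the composite map "restrict to cell means, then prolong by tent functions" can be observed in one space dimension. The sketch below rests on assumptions that are not in the lectures: $\Omega = (0, 1)$, a uniform step $h = 1/N$, cell means taken over the interval of length $h$ centred at each interior node, and the error measured only in a discrete $L^2$ norm (the lemma concerns the stronger norm of $H^1_0$). The names `tent`, `restrict`, `prolong` and `l2_error` are hypothetical.

```python
import math

def tent(t):
    """One-dimensional tent function with support [-1, 1] and peak 1 at 0."""
    return max(0.0, 1.0 - abs(t))

def mean_value(v, a, b, n_quad=50):
    """Midpoint-rule approximation of the mean of v over [a, b]."""
    step = (b - a) / n_quad
    return sum(v(a + (k + 0.5) * step) for k in range(n_quad)) / n_quad

def restrict(v, N):
    """Cell means of v around the interior nodes q*h, q = 1..N-1, h = 1/N."""
    h = 1.0 / N
    return [mean_value(v, (q - 0.5) * h, (q + 0.5) * h) for q in range(1, N)]

def prolong(coeffs, N, x):
    """Value at x of the combination sum_q coeffs[q] * tent(x/h - q)."""
    h = 1.0 / N
    return sum(c * tent(x / h - q) for q, c in enumerate(coeffs, start=1))

def l2_error(v, N, n_grid=2000):
    """Discrete L2 distance on (0, 1) between v and its tent approximation."""
    coeffs = restrict(v, N)
    total = 0.0
    for k in range(n_grid):
        x = (k + 0.5) / n_grid
        total += (v(x) - prolong(coeffs, N, x)) ** 2
    return math.sqrt(total / n_grid)

v = lambda x: math.sin(math.pi * x)  # smooth, vanishing at both end points
errors = [l2_error(v, N) for N in (4, 8, 16, 32)]
```

On this smooth example the entries of `errors` decrease as the mesh is refined, which is the qualitative statement of condition (ii) of the general scheme in this particular setting.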
(c) External approximation of Sobolev spaces. The examples mentioned up to now have the common feature that the set $K$ is approximated by sets $K_h$ which are contained in $K$, and it is to stress this fact that we used the term internal approximation. However, it is easy to adapt the scheme (b) to the more general situation in which the approximant set $K_h$ is not required to be contained in $K$. It suffices, indeed, to replace condition (i) with the following condition: if $v_j \in p_{h_j} L^{h_j}$, where $(L^{h_j})_j$ is a subsequence of $(L^h)_h$, and $v_j$ converges weakly to $v$ in $X$, then $v \in K$.
In fact, this condition together with condition (ii) above is equivalent to the limit $K = \lim K_h$.
Although improperly, we may call this kind of approximation an external approximation.
As is well known, the finite-difference methods for the approximation of partial differential operators can be put in the external framework of approximation we have just mentioned. As an example of that, let us briefly describe the external approximation of the Sobolev space $H^1_0(\Omega)$.
Example 2. External approximation of $H^1_0(\Omega)$.
Let us consider:
-
for each
i = 1,
. . .,n,
the space
with the norm
-
the product space
of all vector functions
with the product topology ; -the closed subspace
with
Xo,
.
1 We shall identify the space HO(0) 1 that i s , we shall identify the function v(x) E Ho(0) with
contained in the diagonal of
the vector function
X
-
the n-dimensional tent functions "smooth in the i-directionn : (XI=
i = 1,.
.. , n ,
-
X
1
-
X.
q...
hl
where
X
d + d ( 1h - q i ) . - - - d ( -n
q = (ql,.
..,R)
-
4.1.
hn
1
is a multi-index and
d(
t
)
is
the one-dimensional tent function considered in the previous. example :
The support of t e r at the point
Q = (qlhl,
(x) is the n-coordinate llcube" with cen-
. . .,q,hn),
but the i-direction and edge-leght 2h
i
- Qest
=
the s e t of all multi-indices
the t l c r o s s region"
edge-lenght
h
j
in all directions
in the i-direction :
q = (ql.
. .x),
such that
with center at the point
Q = (qihi)
and "arm-lenght"
and
Clearly, for every
h
-
- 'est -
'est
(G?) = the space of
i = 1,
.. .,n
all real vectors
2hi
in every
- the map ph d
~ : , t s~t
~
where, for every
i = 1,.
)
h est ~ (n)
x= i
. .,n,
the map
-z
h
i~'(')3~
i s given by h phi v
= vhix) =
- q = ph tst(S,2the ) subspace
of
hi
k
h . 'est
CH'(~)]
=
i
generated by the basis vector-functions
-f:
(x) = ( .
.., y:
(XI,.
(XI)
.:st
E~
Clearly,
is an isomorphism of
associates the vector function of
vhi(x) = phi v
where
h
X =
7CH
, i = 1,. . .,n,
Finally, we shall still denote by every function
v(x)
h Vest(SZ) 1 (S2)
7
phir
h
v
i = l,...,n,
h
of
h Ves,(fl)
rh the map with associates
L (SZ) with the vector
Let -
Let -
v
2
1 v E HOW)
~[H'(Q)I
Lemma 5
Xh , which
,
with the vector
whose q-component is given by the mean value of
Lemma 4
onto
. For ,each
v(x) on the region
i =I,.
. .,n
9
@
v
h
.=
h (v ) and suppose that. for every 4 q r Q ; ~~
.
convekges weakly to a funcrion v
r
H'(R)~
f o r all
i = 1,.
in
i
Then, we have --wi(x) Z where
V(X)
v(x) belongs to
weakly to
Hb(0)
-
=
7
cp_ hg(x)
1 6
Corollary
Xo
--
Moreover,
in
x
1 Ho(R)
0
.
..,n
v (x) = p vh -h -h
.
converges
=
C H ~ ( R ~ as
&
Xh
ih/+
o
the subspace of
spanned by the basis vector-functions
CH'(~)J
,
.
y(x) = (v(x). . ... v ( r ) )
Corollary 1 : Let
x
3.e.
a s 1h 1 * -
Qhq
.
r?
: Let
Then
-K
identified --with a cone
1
v E Ho(S2) : v ) 0
=
K
X
a. e.
R
3 be
and
Then, $K = \lim K_h$ in $X$ as $|h| \to 0$.
(d) Projection methods. The approximation methods discussed so far may be called of injective type: they are based on suitable injective mappings $p_h$ of some finite-dimensional space $V^h$ into $X$, carrying some convex subset $L^h$ of $V^h$ into a convex subset $K_h$ of $X$ that approximates the given $K$.
However, a conceptually different type of approximation is also possible, which is based instead on some projection mappings $p_h$ of $X$ onto a finite-dimensional subspace $X_h$ of $X$: in this case, the map $p_h$ carries the given set $K$ onto an approximate $K_h$ in $X_h$.
The most natural setting for these methods involves a Hilbert space $X$, an increasing sequence of finite-dimensional subspaces $X_h$ of $X$ with $\bigcup_h X_h$ dense in $X$, and, for each $h$, the orthogonal projection $p_h$ of $X$ onto $X_h$.
If $A$ is a map of $X$ into $X$ and $K$ is a bounded closed convex subset of $X$, then the problem
$u \in K : \ (Au, v - u) \ge 0 \quad \forall v \in K$
may be approximated by the sequence of problems
$u_h \in K_h : \ (A_h u_h, v - u_h) \ge 0 \quad \forall v \in K_h,$
where $A_h = p_h A$ is a map of $X_h$ into $X_h$ and $K_h = p_h K$ is a (bounded) closed convex subset of $X_h$ (see $(d_1)$ of Section 2).
Let us remark that the approximate problem above is in the space $X_h$. However, its solution $u_h$ is the same as that of the problem
$u_h \in K_h : \ (A u_h, v - u_h) \ge 0 \quad \forall v \in K_h$
in the space $X$. In fact, since $p_h$ is an orthogonal projection and $v - u_h \in X_h$, we have
$(p_h A u_h, v - u_h) = (A u_h, p_h (v - u_h)) = (A u_h, v - u_h).$
Therefore, the proof of the convergence of the approximate solutions $u_h$ can still rely on Theorem 6, by taking now the convergence result stated in $(e_1)$ of Section 2 into account. We refer to the author's paper [4], e.g., Proposition 3.1, for more details on this point.
Projection methods for solving equations involving non-linear operators in Banach spaces have been extensively investigated by many authors; let us mention here F. E. Browder [10], [12], F. E. Browder and W. V. Petryshyn [1] and the review paper by R. I. Kachurovskii [3], where further references on the subject can be found. See also D. G. de Figueiredo [1].
Remark 15.
The usual estimates of the error $\|u - u_h\|$ in the Ritz-Galerkin approximation of an equation $Au = f$ involving a linear map $A$ are based on an inequality of the type
$\|u - u_h\| \le c \, \operatorname{dist}(u, X_h),$
$X_h$ being the subspace of $X$ to which the approximate solution $u_h$ belongs. This estimate, however, is in general false for variational inequalities, even if it refers to an internal approximation of convex cones.
This can be seen with the following simple example, due to G. Strang: $u = (0, 0)$ is the vector that minimizes the distance functional $F(v) = \frac{1}{2} \|v - z_0\|^2$, $v = (v_1, v_2) \in E^2$, $z_0 = (-1, 0)$, on the half plane $v_1 \ge 0$ of the euclidean space $E^2$, while $u_h = (0, -h)$ is for a given $h > 0$ the vector that minimizes $F(v)$ on the cone $K_h$ described in the figure. Then, $\|u - u_h\| = h$, whereas $\operatorname{dist}(u, K_h) \sim h^2$ for $h > 0$ small.
7. Dual variational inequalities and complementarity systems
It is well known that many minimum problems of the calculus of variations and optimization theory admit a "complementary" or "dual" formulation. On this matter, let us only refer here, for instance, to A. M. Arthurs [1], J. Stoer-Witzgall [1], J. Cea [2], Robinson [1], Moreau [2], U. Dieter [1], [2].
For variational inequalities too, many "dual" characterizations of the solutions can be given, which are based, essentially, on separation or mini-max theorems. A discussion of some of these dual methods can be found in J. L. Lions, R. Glowinski, R. Trémolières, loc. cit.
As we shall see below, it is always possible, at least in principle, to associate a variational inequality in $X^*$ (the "dual" inequality) with any given variational inequality in the space $X$ (the "primal" inequality) in such a way that a vector $u$ of $X$ is the solution of the primal inequality if and only if the vector $u^* = -Au$ of $X^*$ is a solution of the associated dual inequality. However, the explicit formulation of the dual inequality may often be in practice a difficult problem in itself.
We shall first consider variational inequalities on convex cones. The dual scheme we have in mind becomes then particularly simple and both the primal and dual inequalities can be characterized more symmetrically by means of a so-called (generalized) complementarity system.
Let indeed $M$ be any map of $X$ into $X^*$, $H$ a convex cone with vertex at 0 in $X$, $z$ a solution of the variational inequality
(48')  $z \in H : \ (Mz, w - z) \ge 0 \quad \forall w \in H.$
Then, the pair $z$, $z^* = -Mz$ is a solution of the problem
(48'')  $z \in H, \quad z^* \in H^*, \quad (z^*, z) = 0.$
In fact, by putting $w = z + v$ into (48'), with $v$ an arbitrary vector of $H$, we find
$(Mz, v) \ge 0 \quad \forall v \in H,$
which is to say, $z^* \in H^*$. Moreover, by replacing now the vector $w$ in (48') once with 0 and then with $2z$, we find
$(Mz, -z) \ge 0$ and $(Mz, z) \ge 0$
respectively, therefore $(z^*, z) = 0$.
Conversely, if the pair $z$, $z^* = -Mz$ satisfies (48'') above, then for every $w \in H$ we have
$(Mz, w - z) = -(z^*, w) + (z^*, z) = -(z^*, w) \ge 0,$
since $z^* \in H^*$. Therefore, we have proved the following
LEMMA 6: Let $M$ be any map of $X$ into $X^*$, $H$ a convex cone with vertex at 0 in $X$. A vector $z$ is a solution of the variational inequality
$z \in H : \ (Mz, w - z) \ge 0 \quad \forall w \in H$
if and only if the pair $z$, $z^* = -Mz$ is a solution of the system
$z \in H, \quad z^* \in H^*, \quad (z^*, z) = 0.$
Let us suppose now that the map $M$ is 1-1 of $X$ onto $X^*$ and let us define the map
(49)  $M' z^* = -M^{-1}(-z^*), \quad z^* \in X^*.$
Note that $M' = M^{-1}$ if $M$ is linear. Let us also assume that $H$ is a closed convex cone with vertex at 0. Then, if we denote by $H^{**}$ the polar cone of $H^*$ in $X$, i.e.
$H^{**} = \{ z \in X : (z^*, z) \le 0 \ \ \forall z^* \in H^* \},$
a simple argument, based on the separation theorems for convex sets, shows that
$H^{**} = H.$
Since the relation
$z^* = -Mz$
is clearly equivalent to
$z = -M' z^*,$
by applying Lemma 6 once to the map $M$ and the cone $H$ and then to $M'$ and $H^*$, we obtain the following
THEOREM 7: Let $M$ be a 1-1 map of $X$ onto $X^*$, $H$ a closed convex cone with vertex at 0 in $X$. If $M'$ is the map of $X^*$ onto $X$ given by (49) and $H^*$ the polar cone of $H$ in $X^*$, then the following three problems are equivalent
(i)  $z \in H : \ (Mz, w - z) \ge 0 \quad \forall w \in H,$
(ii)  $z^* \in H^* : \ (M' z^*, w^* - z^*) \ge 0 \quad \forall w^* \in H^*,$
(iii)  $z \in H, \quad z^* \in H^*, \quad (z^*, z) = 0,$
provided $z$ and $z^*$ are related by $z^* = -Mz$.
Remark 16: If $M = DF$, $F$ being a convex functional on $X$, then
$M' z^* = -DF^*(-z^*), \quad z^* \in X^*,$
where $F^*$ is the conjugate functional of $F$ (see Section 2). Then, the dual problems (i) and (ii) of Theorem 7 characterize the minimum problems
(i)  $z$ minimizes $F(w)$ on $H$,
(ii)  $z^*$ minimizes $F^*(-w^*)$ on $H^*$,
respectively. Problems (i) and (ii) above are conjugate in the sense, for instance, of Fenchel's duality theorem, cfr. R. T. Rockafellar [2].
Remark 17 : Let vo be a given vector of X, A a given 1-1 map of X onto
x*, A'
defined a $ in (49). If we apply Theorem 7 to the map Mz
=
A(z+vo)
zEX
,
,
then we find that the following problems a r e equivalent (i)
(ii)
uEvo+H:(Au,v-u)>O 'u% H~
Y
v.v=H Y
: ( ~ l u * + vv~ , - u 13
o
Vv'g
H*
U. Mosco
(iii) provided
It suffices in fact to make the change of variable $z = u - v_0$.
When $X = \mathbb{R}^n = X^*$ and $H = -H^*$ is the non-negative orthant of $\mathbb{R}^n$, then problems such as (iii) above are known in the literature as complementarity systems: linear c.s. if $M$ is an affine map of $\mathbb{R}^n$ into itself; non-linear c.s. in the general case. They arise in many problems of optimization and game theory, as well as in geometric or physical applications, and have been investigated by many authors, see R. Cottle-G. Dantzig [1], R. Cottle [1], C. E. Lemke [1], S. Karamardian [1], where further references on these systems and their applications can be found. In these papers many algorithms for the numerical solution both of linear and non-linear complementarity systems have been given. These algorithms, which are based mainly on suitable pivoting techniques, can thus also be used to solve discrete variational inequalities on convex cones. We shall see an example in the following Section 8. In turn, the reduction of a complementarity system to a variational inequality, hence to a fixed-point problem, can be convenient in order to obtain more general existence results. Moreover, this reduction is also fruitful from an algorithmic point of view, for it makes it possible to use iterative methods of solution. For more details on the relation between variational inequalities and complementarity systems in finite-dimensional spaces we refer to Karamardian, loc. cit., I. Dolcetta [1], J. Moré [1].
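To make the iterative alternative to pivoting concrete, the following minimal sketch solves a small linear complementarity system $z \ge 0$, $w = Mz + q \ge 0$, $(z, w) = 0$ in $\mathbb{R}^2$ by a projected Gauss-Seidel iteration. The routine name `projected_gauss_seidel`, the matrix `M` and the vector `q` are illustrative assumptions and are not taken from the papers quoted above; the iteration is a standard one for symmetric positive definite `M` and is not claimed to be the algorithm of any of those references.

```python
import numpy as np

def projected_gauss_seidel(M, q, sweeps=500, tol=1e-12):
    """Iteratively solve  z >= 0,  w = M z + q >= 0,  z . w = 0.

    Assumes M is symmetric positive definite (hypothetical test data below).
    """
    z = np.zeros(len(q))
    for _ in range(sweeps):
        z_old = z.copy()
        for i in range(len(q)):
            # Row residual with the diagonal contribution removed.
            r = q[i] + M[i] @ z - M[i, i] * z[i]
            # Solve the i-th equation, then project onto z_i >= 0.
            z[i] = max(0.0, -r / M[i, i])
        if np.max(np.abs(z - z_old)) < tol:
            break
    return z

# Illustrative data (an assumption, not from the text).
M = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-1.0, 1.0])
z = projected_gauss_seidel(M, q)
w = M @ z + q
print(z, w)
```

Each sweep solves one scalar equation at a time and projects the result on the non-negative half-line, which is the fixed-point point of view mentioned above applied coordinate by coordinate.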
Variational inequalities in connection with convex programming have also been investigated, from a computational point of view too, by O. G. Mancino-G. Stampacchia [1].
The duality for variational inequalities on convex cones considered above is a special case of a general dual scheme for variational inequalities of type
(50) $u \in X$: $(Au, v - u) \ge F(u) - F(v)$ $\forall v \in X$,
where $F$ is a l.s.c. convex functional on the normed space $X$, with values in $(-\infty, +\infty]$. In this case, the dual variational inequality can be written as
(51) $u^* \in X^*$: $(A'u^*, v^* - u^*) \ge F^*(u^*) - F^*(v^*)$ $\forall v^* \in X^*$,
where $A'$ is defined as in (49) above and $F^*$ is the Young-Fenchel conjugate of $F$, see Section 2. It can be proved that a vector $u$ is a solution of (50) if and only if the vector
(52) $u^* = -Au$ (i.e., $u = -A'u^*$)
is a solution of (51). Moreover, both solutions $u$ and $u^*$ are characterized by the Young-Fenchel identity
where $u$ and $u^*$ are related as in (52) above. The special case of Theorem 7 is obtained by taking $F$ to be the indicator function $\delta_H$ of the convex cone $H$ in $X$ (see Section 2), hence $F^* = \delta_{H^*}$ is the indicator function of the polar cone $H^*$ of $H$ in $X^*$. [The dual problems of Remark 17 are given, instead, by $F = \delta_{v_0 + H}$, hence $F^*(w^*) = \delta_{H^*}(w^*) + (w^*, v_0)$.] For more details we refer to U. Mosco [7].
Remark 18: When $A$ is the differential of a convex functional $G$, then the dual problems (50), (51) characterize a pair of dual extremum problems in the sense of Fenchel's duality theorem, see R. T. Rockafellar [2]. When $X$ is a Hilbert space, $X^* = X$, and $A$ = identity map of $X$, then problems (50), (51) above and the equivalent system (53) are related to the so-called proximity mappings introduced by J. J. Moreau [2]. An application of a dual scheme of this type to prove the regularity of the solution has been given by H. Brezis
[d.
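As a small illustration of the Hilbert space setting with $A$ = identity, the sketch below checks Moreau's decomposition for the non-negative orthant $H$ of $\mathbb{R}^n$ and its polar cone $H^*$ (the non-positive orthant): every $x$ splits as $x = u + u^*$ with $u \in H$, $u^* \in H^*$ and $(u^*, u) = 0$, where $u$ and $u^*$ are the projections of $x$ on $H$ and on $H^*$. The function names and the sample vector are assumptions made for this example only.

```python
import numpy as np

def proj_cone(x):
    """Projection onto H, the non-negative orthant."""
    return np.maximum(x, 0.0)

def proj_polar(x):
    """Projection onto the polar cone H*, the non-positive orthant."""
    return np.minimum(x, 0.0)

# Illustrative vector (an assumption, not from the text).
x = np.array([3.0, -1.5, 0.0, 2.0, -4.0])
u, u_star = proj_cone(x), proj_polar(x)

print(u + u_star)   # recovers x
print(u @ u_star)   # the two components are orthogonal
```

The pair $(u, u^*)$ obtained in this way satisfies a complementarity system of the kind discussed in Remark 17.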
Remark 19: The explicit formulation of the dual variational inequality (51) requires the knowledge of the inverse map $A^{-1}$ and of the conjugate functional $F^*$. In particular, the calculation of $F^*$ may be a difficult problem even for "simple" $F$. However, the dual scheme described in the present section can be modified in concrete situations, by making a sort of "change of variable" in the initial inequality before operating with the duality. In some cases this leads to a more feasible inversion of $A$ and dualization of $F$. Dual extremum problems have indeed been investigated along these lines by R. Temam [1], by relying on a generalized form of Fenchel's duality theorem given by R. T. Rockafellar [7]. For a similar approach to dual variational inequalities see M. Matzeu [1].
In the following section we shall apply the dual scheme described above to the "obstacle problem" mentioned in Section 1 of Chapter 2 and we shall rely on it to give a method for the numerical approximation of the solution. Let us also mention, in this respect, that a different approach to duality, based mainly on minimax techniques, has also been applied to the numerical solution of problems such as those mentioned in Section 1 of Chapter 2, by J. Cea-R. Glowinski [1], [2], J. Cea-R. Glowinski-Nedelec [1], M. Nedelec [1], J. F. Bourgat [1]. See also J. Cea [2].

8. An example

Let us consider the obstacle problem of Section 1 of Chapter 3 (Example 3):
where $a(u, v)$ is the Dirichlet form
and $\psi$ is a given function in $H_0^1(\Omega)$.
We are in the situation described in Remark 17 of the preceding section, with
Moreover, $A' = A^{-1} = G : H^{-1}(\Omega) \to H_0^1(\Omega)$ is the Green operator for the Dirichlet problem in $\Omega$: for any measure $\mu$ in $H^{-1}(\Omega)$, the potential $G\mu(x)$ is given by
$G\mu(x) = \int_\Omega g(x, y) \, d\mu(y)$,
where $g(x, y)$ is the Green function for the Dirichlet problem in $\Omega$.
The primal variational inequality now is
while the dual inequality now is
Both these inequalities are equivalent to the complementarity system (see Remark 17 of the preceding section)
The approximate solution of problem (50), or of the equivalent direct minimum problem, has been studied by many authors, see R. Glowinski [2], M. Goursat [1], M. Sibony [2], P. Marzulli [1], G. Stampacchia [5], J. J. Moreau [7], J. F. Durand [1], V. Comincioli-L. Guerra and G. Volpi [1].
We shall summarize below the method followed in A. Fusciardi et al. [1]. The complementarity system (52) is approximated by a sequence of finite-dimensional complementarity systems, which gives a direct simultaneous approximation of the function $u$ and the measure $\mu = \Delta u$, the solutions of (50) and (51) respectively, without any assumption of regularity. This discretization is obtained by realizing an internal approximation of the cone of measures $H^*$, by means of unit masses supported by the $(n-1)$-dimensional meshes of a given coordinate subdivision of $\Omega$.
We shall now describe this approximation, by taking, for the sake of simplicity, $n = 2$. Let us consider a coordinate subdivision of $\mathbb{R}^2$ as that given in Ex. 1 of Section 6, $h = (h_1, h_2)$ being the discretization parameter and $Q = (q_1 h_1, q_2 h_2)$, $q = (q_1, q_2) \in \mathbb{Z}^2$, the vertices of the subdivision. For every $q = (q_1, q_2) \in \mathbb{Z}^2$ we shall denote by $s^h_{q1}$ and $s^h_{q2}$ the one-dimensional ($1 = n - 1$) meshes of the subdivision which have $Q$ as the left end point and the lowest end point, respectively: that is,
Let us now consider the functionals $\sigma^h_{q1}$ and $\sigma^h_{q2}$ which associate with each function $\varphi \in C_0^1(\Omega)$ the mean values on $s^h_{q1}$ and $s^h_{q2}$, respectively.
It is easy to show that $\sigma^h_{q1}$ and $\sigma^h_{q2}$ satisfy the estimates
for every $q$ such that both $s^h_{q1}$ and $s^h_{q2}$ are contained in $\Omega$. We shall denote by $Q^h$ the set of all such $q$'s.
[we hive in fact
for every ipC c:(a ), we can find x
0
2
I
such that
thus,
hence xll
j' x1 1
2 I'f(xl,x21\ d x l < ( d i a m a )
jz a
2
2 ~ x l , x 2 ) l dx, dx
2
Therefore,
Then $\sigma^h_{q1}$ and $\sigma^h_{q2}$, $q \in Q^h$, are both elements of the dual $H^{-1}(\Omega)$ of $H_0^1(\Omega)$ and, since they are obviously non-negative (i.e. $(\sigma^h_{qi}, \varphi) \ge 0$ provided $\varphi \ge 0$, $i = 1, 2$), we can conclude that $\sigma^h_{q1}$ and $\sigma^h_{q2}$ are non-negative measures belonging to $H^{-1}(\Omega)$.
We shall now denote by $H^*_h$ the convex cone, with vertex at 0, generated by the non-positive measures $-\sigma^h_{qi}$, $q \in Q^h$, $i = 1, 2$. Clearly $H^*_h \subset H^*$ for every $h$, where, let us recall it, $H^*$ is the cone of all non-positive measures in $H^{-1}(\Omega)$.
The finite-dimensional cones $H^*_h$ approximate the cone $H^*$ in the sense of the following lemma:
LEMMA. $H^* = \lim H^*_h$ in $H^{-1}(\Omega)$ as $|h| \to 0$.
In view of the Corollary of Theorem 1 of Section 2, the convergence just stated is equivalent to the convergence of the polar cones: the polar cone of $H^*$ in $H_0^1(\Omega)$ is the cone $H$ of all non-negative functions of $H_0^1(\Omega)$ we started with, while the polar cone $H_h$ of $H^*_h$ is the cone of all functions of $H_0^1(\Omega)$ whose trace on each $s^h_{qi}$, $q \in Q^h$, $i = 1, 2$, has a non-negative mean value:
Now, it is not difficult to prove that $\lim H_h = H$ as $|h| \to 0$; we refer to A. Fusciardi et al., loc. cit.
We can thus apply the approximation scheme described in Sections 5 and 6 above. The finite-dimensional problems that approximate problem (51) can be obtained by replacing $H^*$ with $H^*_h$ (we take $\psi_h = \psi$ for all $h$ and we also leave the operator $G$ unchanged):
which is equivalent to the complementarity system
where
[Note that whereas the cone $H^*_h$ is finite-dimensional, its polar cone $H_h$ is not such. However, $z_h$ belongs to the finite-dimensional cone $-G(H^*_h) - \psi$; thus the complementarity system above is essentially a finite-dimensional one.]
We now write the approximate measure $\mu_h$ in terms of the basis $\{-\sigma^h_{qi}\}$ that generates $H^*_h$.
If, similarly, we put
then the discrete variational inequality corresponding to (54), according to what we have seen in Section, is given by
[By writing that a vector of $\mathbb{R}^N$ is non-negative, we mean that all its components are non-negative],
where for each $q \in Q^h$, $i = 1, 2$,
$\psi^h_{qi} = (\sigma^h_{qi}, \psi)$ = mean value of $\psi$ on $s^h_{qi}$
fi
of the potential in
of the measure [ ~ o t ethat
h h . carried by the mesh S TI rj
b
Gh =Gg qi; r j rj;qi
,
q , r e ,~ i ,~j = 1,2
because of the symmetry of the Green operator G.
For example, if $i = 1$, $j = 2$, the matrix element $G^h_{q1;r2}$ is given by
Gh
--
1
J (q2+l)h2 g(x10 q2h2;fZhl. ~ ~ ) d x ~ d
qzh2 The discrete problem ( 5 6 ) is in turn equivalent to the discrete complementarity system
which can be solved, for instance, by using pivoting techniques (see F. Scarpini, A. Valdinoci [1]). Once this system has been solved, we can write the approximate measure
and the approximate function
By the convergence results of Section 6 we know that $u_h(x)$ converges strongly in $H_0^1(\Omega)$ to the solution $u(x)$ of problem (51) as $|h| \to 0$, while the measure $\mu_h$ converges strongly in the dual $H^{-1}(\Omega)$ to the solution $\mu$ of (52).
Let us also remark that the coefficients $z_{qi} + \psi_{qi}$, obtained by solving system (58), yield a direct approximation of the mean values of the solution $u(x)$ on each one-dimensional mesh of the subdivision used in the approximation. We refer to A. Fusciardi et al. for more details on this point. [Note that the mean values are well defined for an arbitrary function $u \in H_0^1(\Omega)$ ($n = 2$), whereas the point-values of $u(x)$ are not defined, unless some regularity of the solution $u(x)$, depending on the regularity of the obstacle $\psi$, is known.]
Remark 20. The approximation of the one-dimensional obstacle problem in an interval $(a, b)$ is particularly simple. Then, the basis measures $\sigma^h_q$ can be taken to be the unit masses $\delta_{x_q}$ at the points $x_q = qh$ of a subdivision of $\mathbb{R}$. The approximate measures are finite combinations of such Dirac measures and, since the potential $g_q(x) = G\delta_{x_q}$ in $(a, b)$ is the triangle function
then the approximate functions
are piece-wise affine functions in $(a, b)$. The solution of the complementarity system in this case is particularly simple, see F. Scarpini, A. Valdinoci [1].
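The one-dimensional case can be sketched numerically as follows. We take $(a, b) = (0, 1)$, where the Green function of $-d^2/dx^2$ with zero boundary values is the triangle function $g(x, y) = x(1 - y)$ for $x \le y$ and $y(1 - x)$ for $x \ge y$. The unknowns are the non-negative masses $\lambda_q$ of the measure $-\mu_h = \sum_q \lambda_q \delta_{x_q}$, and the discrete complementarity system reads $\lambda \ge 0$, $G\lambda - \psi \ge 0$, $(\lambda, G\lambda - \psi) = 0$ with $G_{qr} = g(x_q, x_r)$. The obstacle `psi`, the number of nodes and the projected Gauss-Seidel solver (used here in place of the pivoting method of the reference) are assumptions made for illustration only.

```python
import numpy as np

def green(x, y):
    """Green function of -d^2/dx^2 on (0, 1) with zero boundary values."""
    return np.where(x <= y, x * (1.0 - y), y * (1.0 - x))

def solve_lcp_pgs(M, q, sweeps=20000, tol=1e-13):
    """Projected Gauss-Seidel for  lam >= 0, M lam + q >= 0, lam.(M lam + q) = 0."""
    lam = np.zeros(len(q))
    for _ in range(sweeps):
        old = lam.copy()
        for i in range(len(q)):
            r = q[i] + M[i] @ lam - M[i, i] * lam[i]
            lam[i] = max(0.0, -r / M[i, i])
        if np.max(np.abs(lam - old)) < tol:
            break
    return lam

n = 9                                          # interior nodes (assumption)
nodes = np.linspace(0.0, 1.0, n + 2)[1:-1]     # x_q = q h, h = 1 / (n + 1)
psi = 0.5 - 8.0 * (nodes - 0.5) ** 2           # hypothetical obstacle values
G = green(nodes[:, None], nodes[None, :])      # symmetric Green matrix
lam = solve_lcp_pgs(G, -psi)                   # masses of the Dirac measures
u = G @ lam                                    # nodal values of u_h

print(np.round(lam, 4))
print(np.round(u - psi, 4))
```

The function $u_h = \sum_q \lambda_q g(\cdot, x_q)$ is piece-wise affine, and the masses $\lambda_q$ vanish at the nodes where $u_h$ lies strictly above the obstacle.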
Remark 21. The approximation method described in this section requires the knowledge of the Green function $g(x, y)$ for the Dirichlet problem in $\Omega$. It is still possible, however, to combine the dual approach discussed above with an approximation of the Laplace operator of finite-difference type (by replacing point-values with suitable mean values). This requires an external approximation of the measures in $H^{-1}(\Omega)$, in place of the internal one described above. We refer to U. Mosco-F. Scarpini [1].
$
B. D. ANNIN, [1] Existence and uniqueness of the solution of the elastic-plastic torsion problem for a cylindrical bar of oval cross-section, Prikl. Mat. Meh. 29 (1965).
A. M. ARTHURS, [1] Complementary variational principles, Clarendon Press, Oxford, 1970.
E. ASPLUND, [1] Positivity of duality mappings, Bull. Amer. Math. Soc. 73 (1967), 200-203.
J. P. AUBIN, [1] Approximation of variational inequalities, in "Functional Analysis and Optimization" (E. R. Caianiello Ed.), Acad. Press, 1966, 7-14.
[2] Approximation des espaces de distributions et des operateurs differentiels, Bull. Soc. Math. France, Mémoire 12 (1967), 1-139.
[3] Behavior of the error of the approximate solutions of boundary value problems for linear elliptic operators by Galerkin's and finite difference methods, Annali Scuola Norm. Sup. 21 (1967), 599-637.
[4] Evaluation des erreurs de troncature des approximations des espaces de Sobolev, J. Math. Anal. Appl. 21 (1968), 356-368.
[5] Approximation des problemes aux limites non homogenes pour des opérateurs non linéaires, J. Math. Anal. Appl. 30 (1970), 510-521.
C. BAIOCCHI, [1] Su un problema di frontiera libera connesso a questioni di idraulica, to appear in Annali Mat. Pura Appl.
A. BEURLING and A. E. LIVINGSTONE, [1] A theorem on duality mappings in Banach spaces, Ark. Mat. 4 (1962), 405-411.
L. BOCCARDO, [1] Alcuni problemi al contorno con vincoli unilaterali dipendenti da un parametro, to appear.
J. F. BOURGAT, [1] Analyse numerique du probleme de la torsion elastoplastique, Thèse, I.R.I.A., Paris, 1971.
H. BREZIS, [1] Une generalisation des operateurs monotones, Inequations d'évolution abstraites, C. R. Acad. Sci. Paris, t. 264 (1967), 683-686 and 732-735.
[2] Sur certains problèmes non-lineaires, Séminaire Choquet n. 18 (1966-67), 1-18.
[3] Equations et inequations non lineaires dans les espaces vectoriels en dualité, Ann. Inst. Fourier 18 (1968), 115-175.
[4] Problemes unilateraux, to appear.
H. BREZIS and M. SIBONY, [1] Méthodes d'approximation et d'iteration pour les operateurs monotones, Arch. Rat. Mech. Anal. 28 (1969), 59-82.
[2] Equivalence de deux inequations variationnelles et applications, Arch. Rat. Mech. Anal., to appear.
H. BREZIS and G. STAMPACCHIA, [1] Sur la regularité de la solution d'inéquations elliptiques, Bull. Soc. Math. France 96 (1968), 153-180.
F. E. BROWDER, [1] Nonlinear elliptic boundary value problems, Bull. Amer. Math. Soc. 69 (1963), 862-874.
[2] Nonlinear elliptic boundary value problems, II, Trans. Amer. Math. Soc. 117 (1965), 530-550.
[3] Continuity properties of monotone nonlinear operators in Banach spaces, Bull. Amer. Math. Soc. (1964), 551-553.
[4] On a theorem of Beurling and Livingstone, Canad. J. Math. 17 (1965), 367-372.
[5] Nonlinear monotone operators and convex sets in Banach spaces, Bull. Amer. Math. Soc. 71 (1965), 780-785.
[6] Existence and uniqueness theorems for solutions of nonlinear boundary value problems, Proc. Amer. Math. Soc. Symp. Appl. Math. XVII (1965), 24-49.
[7] Problèmes non-lineaires, Les Presses de l'Univ. de Montreal, 1966, 1-48.
[8] Existence and approximation of solutions of nonlinear variational inequalities, Proc. Natl. Acad. Sci. U.S. 56 (1966), 1080-1086.
[9] On the unification of the calculus of variations and the theory of monotone nonlinear operators in Banach spaces, Proc. Natl. Acad. Sci. U.S. 56, 419-425.
[10] Non-linear accretive operators in Banach spaces, Bull. Amer. Math. Soc. 73 (1967), 470-476.
[11] A new generalization of the Schauder fixed point theorem, Math. Annalen 174 (1967), 285-290.
[12] Approximation-solvability of nonlinear functional equations in normed linear spaces, Arch. Rat. Mech. Anal. 26 (1967), 33-42.
[13] Non-expansive nonlinear operators in a Banach space, Proc. Nat. Acad. Sci. USA 54 (1965), 1041-1044.
[14] Non-linear variational inequalities and maximal monotone mappings in Banach spaces, Math. Annalen 175 (1968), 89-113.
[15] Nonlinear operators and nonlinear equations of evolution in Banach spaces, Proc. Amer. Math. Symp. Nonlinear Functional Analysis, Chicago, 1968.
F. E. BROWDER and W. V. PETRYSHYN, [1] The solution by iteration of nonlinear functional equations in Banach spaces, Bull. Amer. Math. Soc. 72 (1966), 571-575.
[2] Construction of fixed points of nonlinear mappings in Hilbert space, J. Math. Anal. Appl. 20 (1967), 197-228.
J. CEA,
Approximation variationnelle d e s problemes aux limite s , Ann. Inst. F o u r i e r 14 ( 19641, 345-444.
@ Optimisation, thkorie et algorithmes, Dunod ed.,Paris, 1971, J. CEA and R. GLOWINSKI, @] Mi-tion bles, t o appear,.
d e s fonctionnelles non differentia-
Methodes numeriques pour 1'6conlement laminaire d'un fluide rigide viscoplastique incompressible, to appear.
b]
J . CEA, R. GLOWINSKI and J . L. NEDELEC, Methodes numeriques pour la torsion elasto-plastique d'une b a r r e cylindrique, t o a p p e a r ,
V. COMINCIOLI, L. GUERRA and G. VOLPI Analisi numerica di un problema di frontiera libera connesso col moto d i un fluido a t t r a v e r s o un mezzo poroso, Pubbl. n. 17 d e l G b . Anal.Numer., pavia, 1971. L
R. W. COTTLE,
n]
Nonlinear P r o g r a m s with positively bounded Jacobians, SIAM J. Appl. Math. 14 (1966), 147-157. - -
R. W. COTTLE and G. B. DANTZIG, [lJ Positive (semi-)definite programming, Symp. Math. P r o g r . , H. W. Kuhn E d . . Princeton, 1970
-
] Complementary pivot theory of mathematical programming, i n LinemAlgebra and i t s Applications, vol. I (1968), 103-125* Mdthodes d'approximation pour certains probl&mes non lindaires non homogenes, to a p p e a r .
J. P . DIAS and M. SIBONY,
Optimierung soufgaben i n topologischen ~ e k t o r r z u m e nI: ~ualitgts:.theorie,Z. Wahrscheinlichkeitheorie verw. Geb. 5 (19661, 89-117.
U. DIETER,
Dual extremal problems ir, linear s p a c e s with examples and applications i n geme theory and statistics, P r o c . NATO Adv. Study Inst. on Theory and Appl. of Monotone Operators, Venezia 1968, Oderisi Ed. , 1969, 1 - 9 . F. DI GUGLIELMO,
[I] Construction dlapproximations d e s espaces d e Sobolev 6 (1969), 279-33 1 s u r d e s reseaux e n simplexes, Calcolo -
I. DOLCETTA,
Sistemi d i complementarith e disuguaglianze variazionali, Tesi, Universith d i Roma, 1972
J. F. DURAND,
.
a
Rdsolution numdrique de problbmes aux limites sousharmonique, These B' llUniv. d e Montpellier, 1968-6 9 .
C. DUNAUT and J. L. LJONS,
n]
MM6chanique e t fn6quations, Dunod. ed., P a r i s , 197t
D. G. de. FIGUERIDO, Topics i n non-linear functional analysis, Lecture S e r i e s N. 48, University of Maryland, 1967
-
A. FUSCIARDI, U. MOSCO, F. SCARPINI and A. SCMAFFINO, bl A dual method f o r the numerical solution of some variational inequalities, t o appear in J. Math.Anal.App1. 4 0 (4972) . E. GIUSTI;
Superfici minime cartesiane con ostacoli discontinui, Arch. Rat. Mech. Anal. 40 (1971) -
U. Mosco R. GLOWINSKI, [I] Mdthodes n u m e r i q u e s pour l t 6 c m l e m e n t stationn.aire d l u n fluide r i g i d e vi'sco-plastique incompressible,,to appear.
La methode d e relaxation. Applications B l a minimisation a v e c e t s a n s consts&ted d e fonctionnelles convexes, Q u a d e r n i d e i Rendiconti, 1st. Mat. Univ. Rorna, 1971.
@] Methodes n u m e r i q u e s pour la t o r s i o n elasto-plastique d t u n e
b a r r e cylindrique, dormulation variationnelle, Colloq. Anal. N u m e r Supper B e s s e s , 1970
.
W
R. GLOWINSKI, J. L. LIONS and R. TREMOLIERES, Methodes nume'riquep d e resolution d e s probl&mes d1in6quations variationnelles e n m e c a n-i que, eT e n phydique, Dunbd E d , . P a r i s , t o appear.
.
M. GOURSAT,
[I] Analyse numkrique d e p r o b l 6 m e s d t e l a s t o - p l a s t i c i t 6 e t d e visco-plasticitd, T h e s e , I. R. I. A. ,P a r i s , 1971
P. H. HARTMAN and G. STAMPACCHIA, [I) On s o m e non l i n e a r elliptic differ e n t i a l functional equations, A c t l Math. 115 (19663, 271-310 . A. I O F F E and V . TIKHOMIROV, [l] Duality of convex functions and e x t r e m u n problems, Uspekhi Mat. Nauk. 23,6 (19681, 51-116; R u s s i a n 23,6 (1968), 5 3 - 5 4 . Math. S u r v e y s J,L.JOLY,
Une f a m i l l e d e topologies e t d e convergences s u r l ' e n s e m b l e d e s fonctionnelles convexes, T h e s e B l a Facult6 d e s Sciences d e Grenoble, 1 9 7 0 .
R. I. KACHUROVSKII, [I] On monotone o p e r a t o r s a n d convex functionals, Uspe h i Mat. Nauk. 15, 94 (1960), 213-215. Monoton'e non-linear o p e r a t o r s i n Banach s p a c e s , Dokl. Akad. Nauk. SSSR 163 (19651, 559- 562. 133 Nonlinear monotone o p e r a t o r s i n Banach s p a c e s , Uspehi Mat. Nauk. 23 (19683, 121-168 R u s s i a n Math. Surveys 23,2 (1968), 117-165. S. KANIEL,
Construction of a fixed-point f o r contractions i n Banach space,. I s r k e l J. Math. , 9 1971, 535-540.
S. KARAMARDIAN, T h e nonlinear complementarity p r o b l e m with application, JOTA 4 (1969),87-98.
U. Mosco T. KATO,
[I] Demicontinuity, hemicontinuity and monotonicity, Bull. i d e m . , P a r t . 11, ibid. A m e r . Math. Soc. 70 (1964), 548-550; 73 (1967), 886-889. -
H. LANCHON,
Solution du p r o b l e m e d e t o r s i o n elasto-plastique d ' u n b a r r e cylindrique d e s e c t i o n quelconque, C. R. A d d . Sc. Paris, 269 (1969), 791-794
H. L. LANCHON and C. DUVAUT, 111 S u r la solution du p r o b l e a e d e l a t o r s i o n Clasto-plastique d'une b a r r e cylindrique d e section quelconque, C. R. Acad. Sci. P a r i s 264 (1967)
-
@]
C. E. LEMKE,
Recent r e s u l t s on complementarity p r o b l e m s , P r o c . P r i n c e t o n Symp. on Math. P r o g r a m m i n g , H. W. Kuhn ed. , P r i n c e t o n Univ. P r e s s , 1970, 349-384.
J . LERAY and J. LIONS, [I] Quelques r e s u l t a t s d e Visik s u r l e s p r o b l e m e s elliptiques nonlineaires p a r l e s mCthodes d e Minty- Browder, Bull. Soc. Math. F r a n c e 93 (1965), 97-107. C. LESCARRET,
H. LEWY,
[I] Ca s d'addition d e s applications monotones m a x i m a l e s d a n s un erpace d e Hilbert, C. R. Acad. Sc. P a r i s 261 (1965), 1160-1163.
03
O n a variational p r o b l e m with inequalities on t h e boundary, J. Math. Mech. 17 (1968), 861-884.
[a
O n a minimum p r o b l e m f o r ' s u p e r h a r m o n h functions, Int. Conf. on Functional Anal. ,Tokyo 196 9 ,
H. LEWY and G. STAMPACCHIA, 111 On t h e rejgularity of a solution of a v a r i a t i o n a l inequality, Comm. P u r e Appl. Math. 22 ( 1969), 153-188.
@] Onthe r e g u l a r i t y of c e r t a i n s u p e r h a r m o n i c functions, J. dlAnxlyse Math. 23 (1970), 227-236 O n existence and s m o o t h n e s s of solutions of s o m e non-coercive variational inequalities, t o a p p e a r
.
J. L. LIONS,
Quelques methodes d e resolution d e s problCmes aux l i m i t e s non l i n e a i r e s . Dunod e t Gauthier-Villars E d . , P a r i s , 1969.
J. L. LIONS and G. STAMPACCHIA, [I] Variational inequalities, Comm. P u r e Appl. Math. 20 (1967), 493-519.
W. LITTMAN, G. STAMPACCHIA and H. F. WE1 NBERGER, [I] R e g u l a r points f o r elliptic equations with discontinuous coefficients (19631, 45-79 Ann. Scuola N o r m a l e Sup. P i s a ,
17
0. G. MANCINO, G. STAMPACCHIA, [I] Convex p r o g r a m m i n g and variational inequalities, JOTA 9 (1972). 3-23. P. MARZULLI, [I] Risoluzione a l l e differenze d i equazinni a l l e d e r i v a t e p a r ziali d i tip0 ellittico oon condizioni s u u n contorno l i b e r o , Calcolo, Suppl. 1 , 5 (19681, 1-22. M. MATZEU,
[l] Dualitti nella t e o r i a della capacitg, Tesi,Ist. Mat. Univ.
d i Roma, 1972.
G. J . MINTY,
p]
Monotone (nonlinear) o p e r a t o r s i n Hilbert space, Duke Math. J. 29 (1962), 341-346.
(23 On a "monotonicity" method f o r t h e solution of non l i n e a r .us. equations i n Banach s p a c e s , P r o c . Natl. Acad. Sci; 5'0 (1963), 10384.041. [33 On t h e monotonicity of t h e gradient of a convex function, 14 (1964), 243-247. Pacific J. Math. -
@
On t h e solvability of nonlinear functional equations of rno14 (1964), 243-247. notonic type, P a c i f J. Math. -
[5) On the generalization of a d i r e c t method of t h e c a l c u l u s of variations, Bull. A m e r . Math. Soc. 73 (1967), 3 1 5 - 3 2 1 .
b1
On s o m e a s p e c t s of the t h e o r y of monotone o p e r a t o r s P r o c . NATO Adv. Study Inst. onl'heory and Appl. of Monotone O p e r a t o r s , Venezia 1968, O d e r i s i E d . , 1969, 6 7 - 8 2 .
.
M. MIRANDA,
[llF r o n t i e r e
J. J. MORE,
T h e application of variational inequalities t o complementarity p r o b l e m s end e x i s t e n c e theorem-s, T e c h . Rep. n. 71- 110, Dept. Computer Science, Cornell Univ., 1972.
m i n i m a l i con ostacoli, t o a p p e a r
J. J. MOREAU, @ Fonctionnelles convexes, S e m i n a i r e s u r l e s equations aux d e r i v e e s $artielles, College d e France., Paris,1966- 1067, miltigraph, 1- 108.
[2] P r o x i m i t e e t dualit6 d a n s un e s p a c e hilbertien, Bull. Soc. Math. F r a n c e 93 (1965),, 273-299.
U. Mosco [3] One-sided c o n s t r a i n t s i n h y d r o d h a m i c s , i n J. Abadie E d . , Nonlinear programming, Nort Holland Pub. , Amsterdam( l967), 257-279. P r i n c i p e s extremaux pour l e probleme d e la n a i s s a n c e de l a cavitation, J o u r n . d e Mkcanique, 5 (1966), 439-470.
[5) La notion d e sur-potentiel e t l e s l i a i s o n s u n i l a t e r a l e s e n elastotatique, C. R. Acad. Sci. P a r i s , 267 (1968), 954-957. [6] S u r l e s lois d e frottement, d e plasticit6 e t d e viscosit6, C. R. Acad. Sci. P a r i s , 271 (1970), 608-611. [7] Traitement numerique d l u n probleme a u x de'rive'es p a r t i e l l e s d e type unilateral, Publ. b.15, Depart. dlInformatique, Univ. d e MontrBal, 196 U. MOSCO,
D]
Approximation of t h e solutions of s o m e variational inequal i t i e s , Ann. Scuola N o r m a l e Sup. P i s a , 2 1 (1967), 373-394.
117
A r e m a r k oil a t h e o r e m of F. E. Browder, J. Math. Anal. Appl. 20 (1967), 90-93.
[3] convergence of solutions of variational inequalities, P r o c . NATO, Adv. Study Inst. o n l h e o r y and Appl. of Monotone Oper a t o r s , Venezia 1968, O d e r i s i Ed. , 1969, 231-247. Convergence of convex s e t s and of solution of variational 3 , 4 (1969), 510-585. inequalities, Adv. i n Math. -
4
[5] Perturbation of variational inequalities, P r o c . Amer. Math. Soc. Symp. P u r e Math. XVIII (1'970), 182-194 . [61 On t h e continuity of t h e Young-Fenchel t r a n s f o r m , J. Math. Anal. Appl. 35 (1971), 518-535. [73 Dual variational inequalities t o a p p e a r i n J'. Math. Anal. Appl. 3_9 (4472) .
U. MOSCO, F. SCARPINI, 01 On t h e approximation of s o m e complementarity s y s t e m s i n Sobolev s p a c e s , t o appear. M. NEDELEC,
Un algorithme dual pour l e problgme d e l a t o r s i o n elastoplastique d'une b a r r e , Colloq. d1Analyse N u m e r . , Supper B e s s e s , 1969-70.
[I] Variational problemswith inequalities a s boundary conditions, Arch. Rat. Mech. Anal. 35 (1969), 83-113
J. C. C. NITSCHE,
Z. OPIAL,
.
Non expansive and monotone mappings in Banach spaces Lecture Notes Divis. Appl. Math. , Brown Univ. ,1967
-
[g Projection methods in non linear numerical functional analysis, J. Math. Mech. 17 (1967), 353-372
W. V. PETRYSHYN,
P. D. ROBINSON, El] Complementary Variational Frinciples, in Nonlinear Functional Analysis and applications,edited by L. B. Rall, Academic P r e s s , 1971 R. T . ROCKAFELLAR, [I] Characterization of the subdifferentials of convex 17 (1966), 497-510 functions, Pacific J. Math. -
.
[21 Extension of Fenchel's duality theorem for convex func3 (1966), 81-90. tions, Duke Math. J. 3 On the virtual convexity of the ,domain and range of a ncn linear maximal monotone operator, Math. Annalen
@] Local boundedness of nonlinear monotone operators Michigan Math. J.
b]
On the maximal monotonicity of subdifferential mappings, Michigan Math. J.
@
Convex functions, monotone operators and variational inequalities, Proc. NATO Study Inst. on Theory and Appl. of Monotone Operators, Venezia 1968, Oderisi Ed. , 1969, 231-247. Convex analysis, Princeton Univ. P r e s s . , Princeton,
. 1970
Su alcuni sistemi di complementaritP F. SCARPINI and T. VALDINOCI, connessi a disequazioni variazionali d i tip0 ellittico, to appear& C - c l ~ l O.
fi]
A. SCHIAFFINO, Su un problema di disequazioni variazionali p e r operat o r i differenziali ordinari, Boll. U. M. I. (1969), 25-35 * M. SIBONY,
@J Sur l'approximation d16quatios et inequations aux d6riv6es partielles non lineaires de type monotone, to appear in J. Math. Anal,. Appl.
@]
Methodes iteratives pour l e s dquations et inequations aux 7 dkrivkes partielles non lineaires de type monotone, Calcolo (1970), 65-183.
U. Mosco G. STAMPACCHIA, [I] F o r m e s b i l i n e a i r e s c o e r c i t i v e s s u r l e s e n s e m b l e s convexes. C. R. Acad. Sc. P a r i s 258. (19641, 4413-4416. [21 On t h e regularity of solutions of variational inequalities Int. Conf. on Functional Anal. , Tokyo, 1969. 131 Variational inequalities, P r o c . NATO Adv. Study Inst. Venezia 1968, O d e r i s i E d . , 1969, 101-191.
.
k]
Regularity of solutions of s o m e variational inequalities, P r o c . Amer. Math. Soc. Symp. P u r e Math. XVIII (1970), 271-281.
151 On a problem of n u m e r i c a l a n a l y s i s connected with the t h e o r y of variational inequalities, I. E . I. , CNR, P i s a , Nota interna ~ 7 2 1 5 ,1972
.
b]
J. STOER and C. WITZGALL, Convexity and optimization i n finite dimen s i o n s I , Springer Verlay, Berlin, 1970 R. TEMAM,
[l] Solutions g6nbralisees d16quations non 1inCaires non unifor mbment elliptiques, Publ. Math. d1Orsay, Univ. Paris XI, 1970-71.
T . W. TING,
[I] Elastic-plastic t o r s i o n P r o b l e m , Arch. Rat. Mech. Anal. 25 (19671, 342-366 1
h]
Variational methods f o r the study of nonlinear o p e r a t o r s , M. M. VAINBERG, Holden Day; San F r a n c i s c o C a l . , 1964
.
LE probl$me d e la minimisation d e s fonctionnelles non l i n e a i r e s , i n P r o b l e m s i n Non-linear Analysis, CIME Varenna 1970, C r e m o n e s e Ed., Roma, 1971. M. M. VAINBERG and R. I. KACHUROVSKI, [fJ .On t h e variational t h e o r y of non-linear o p e r a t o r s and equations, Dokl. Akad. Nank. SSSR 129 (1959), 1199-1202. H. d e VEIGA
111 Sulla holderianitP d e l l e soluzioni d i alcune disequaziopi
variazionali con condizioni unilaterali a 1 bordo, Annali Mat. 83 (1969), 73-112. P u r a Appl. -
[23 R6gulaPitB pour u n e c l e s s e d'inequations non linbaires, C. R. Acad. Sci. P a r i s 271 (1970), 23-25. R. A. WIJSMm. [I] Convergence of sequences of convex s e t s , cones and functions, Bull. A m e r . Math. Soc. 70 (1964), 186-188; idem, P a r t . 11, T r a n s . Amer. Math. Soc. ( 1 9 6 3 , 32-45
E. H. ZARANTONELLD, [d Solving functional equations by contractive averaging, Tech. Rep. N. 160, U. S. Army R e s e a r c h C e n t e r Madison, Wisconsin, 196 0.
CENTRO INTERNAZIONALE MATEMATICO ESTIVO (C.I.M.E.)
I. SINGER
BEST APPROXIMATION IN NORMED LINEAR SPACES
Course held in Erice from June 27 to July 7, 1971
Best approximation in normed linear spaces
by Ivan Singer (University of Bucharest)
Contents
Introduction.
1. Characterizations of elements of best approximation.
1.1. The first main theorem of characterization.
1.2. The second main theorem of characterization.
1.3. Differential characterizations.
1.4. Other characterizations.
2. Existence of elements of best approximation.
2.1. Characterizations of proximinal linear subspaces.
2.2. Some classes of proximinal linear subspaces.
2.3. Normed linear spaces in which all linear subspaces are proximinal.
2.4. Normed linear spaces which are proximinal in every superspace.
2.5. Transitivity of proximinality.
2.6. Proximinality and quotient spaces.
2.7. Very non-proximinal linear subspaces.
3. Uniqueness of elements of best approximation.
3.1. Characterizations of semi-Čebyšev and Čebyšev subspaces.
3.2. Existence of semi-Čebyšev and Čebyšev subspaces.
3.3. Normed linear spaces in which all linear (respectively, all closed linear) subspaces are semi-Čebyšev (respectively, Čebyšev) subspaces.
3.4. Semi-Čebyšev and Čebyšev subspaces and quotient spaces.
3.5. Strongly unique elements of best approximation. Strongly Čebyšev subspaces. Interpolating subspaces.
3.6. Almost Čebyšev subspaces. k-semi-Čebyšev and k-Čebyšev subspaces. Pseudo-Čebyšev subspaces.
3.7. Very non-Čebyšev subspaces.
4. Properties of metric projections.
4.1. Definition and some properties of metric projections.
4.2. Continuity of metric projections.
4.3. Weak continuity of metric projections.
4.4. Lipschitzian metric projections.
4.5. Differentiability of metric projections.
4.6. Linearity of metric projections.
4.7. Semi-continuity and continuity of set-valued metric projections.
4.8. Continuous selections and linear selections for set-valued metric projections.
5. Best approximation by elements of non-linear sets.
5.1. Best approximation by elements of convex sets.
5.2. Best approximation by elements of N-parameter sets.
5.3. Generalizations.
5.4. Best approximation by elements of arbitrary sets.
Introduction

Here we want to present briefly some results, problems and directions of research in the modern theory of best approximation, i.e. the theory in which the methods of functional analysis are applied in a consistent manner. In this theory the functions to be approximated and the approximating functions are regarded as elements of certain normed linear (or, more generally, of certain metric) spaces of functions, and best approximation amounts to finding "nearest points". The advantages and a brief history of this modern point of view have been described in the Introduction to the monograph [82] and we shall not repeat them here; the material which will be presented in the sequel will be convincing enough, we hope, to prove again that the theory of best approximation in normed linear spaces constitutes both a rigorous theoretical foundation for the existing classical and more recent results in various concrete spaces and a powerful tool for obtaining new results and solving the new problems which appear.

Since June 1966, when the Romanian version of the monograph [82] went to print, the theory of best approximation in normed linear spaces has developed rapidly and the number of papers in this field is growing continuously. However, except for the expository paper [31] by A. L. Garkavi, covering the literature up to 1967, which appeared in 1969, and the bibliography [23] compiled by F. Deutsch and J. Lambert in 1970, we know of no other survey material on these new developments. One of the aims of our course is to fill this gap to a certain extent, by presenting much new material which appeared after the monograph [82]. In this respect the present course, though self-contained, may be regarded as an up-to-date complement to the monograph [82]; however, the bibliography does not aim at being complete, but wants merely to give useful orientation to the reader. Naturally, since another aim of the course is to introduce non-specialists to this field, some overlapping with the material of the monograph [82] is unavoidable; however, even this part is presented here in a slightly improved way.

We shall give here only a few simple proofs, but for all results we shall give references. Even with this "economy" in proofs, some important topics had to be omitted. The reader is assumed to know some elements of functional analysis and integration theory, but we shall recall, whenever necessary, the definition of the notions (especially those of the geometry of normed linear spaces) which will be used in the sequel.

We acknowledge with pleasure that we benefited from attending seminar lectures on metric projections by F. R. Deutsch (at Pennsylvania State University, in 1968) and G. Godini (at the Institute of Mathematics of the Academy, Bucharest, in 1970/71).
1. Characterizations of elements of best approximation.

1.1. The first main theorem of characterization

Throughout the sequel, without any special mention, we shall denote by $\rho$ the distance in a metric space $E$ and, in particular, if $E$ is a normed linear space, $\rho$ will denote the distance in $E$ induced by the norm, i.e.

(1.1)    $\rho(x, y) = \|x - y\|$    $(x, y \in E)$.

Definition 1.1. Let $E$ be a metric space, $G$ a set in $E$ and $x \in E$. An element $g_0 \in G$ is called an element of best approximation of $x$ (by the elements of the set $G$) if we have

(1.2)    $\rho(x, g_0) = \rho(x, G) = \inf_{g \in G} \rho(x, g)$,

i.e., if $g_0$ is "nearest" to $x$ among the elements of $G$; we shall denote by $\mathcal{P}_G(x)$ the set of all such elements $g_0$, i.e.

(1.3)    $\mathcal{P}_G(x) = \{ g_0 \in G \mid \rho(x, g_0) = \rho(x, G) \}$.
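As a small illustration of definition 1.1 (a sketch only: the finite set $G$, the point $x$ and the helper names below are hypothetical choices made for this example, not taken from the text), one can compute $\rho(x, G)$ and the set of nearest points by direct enumeration when $G$ is finite. In the plane with the maximum norm the nearest point need not be unique:

```python
def best_approximations(x, G, dist):
    """Return (rho(x, G), list of nearest points) for a finite non-empty set G."""
    d = min(dist(x, g) for g in G)
    return d, [g for g in G if dist(x, g) == d]


def max_norm_dist(u, v):
    """Distance induced by the maximum norm on tuples of numbers."""
    return max(abs(a - b) for a, b in zip(u, v))


# Hypothetical data: three candidate points and one point to approximate.
G = [(0, 0), (2, 0), (1, 3)]
x = (1, 1)

rho, P = best_approximations(x, G, max_norm_dist)
print(rho, P)
```

Here two of the three candidates are at the same minimal distance from $x$, so the set of elements of best approximation has two elements.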
It is natural to consider first the problem of characterization of elements of best approximation, i.e. the problem of finding necessary and sufficient conditions in order that $g_0 \in \mathcal{P}_G(x)$, since these results will be applied to solve the other problems on best approximation (e.g. those of existence and uniqueness of elements of best approximation, etc.). Also, the characterization theorems in concrete spaces (see e.g. the "alternation theorem" 1.7 below) are convenient tools for verifying whether or not a given $g_0$ satisfies $g_0 \in \mathcal{P}_G(x)$, since they are easier to use than (1.2). Since we have obviously

(1.4)    $\mathcal{P}_G(x) = \{x\}$ for $x \in G$,    $\mathcal{P}_G(x) = \emptyset$ for $x \in \bar{G} \setminus G$,

it will be sufficient to characterize the elements of best approximation of the elements $x \in E \setminus \bar{G}$. In order to exclude the case when such elements do not exist, in the sequel we shall assume, without any special mention, that $\bar{G} \neq E$.
Unless otherwise stated, the field of scalars for all (general or concrete) normed linear spaces considered in the sequel can be either the field of complex numbers or the field of real numbers.

The first main theorem of characterization of elements of best approximation by elements of linear subspaces in normed linear spaces is the following (see [82], p. 18):

Theorem 1.1. Let $E$ be a normed linear space, $G$ a linear subspace of $E$, $x \in E \setminus \bar{G}$ and $g_0 \in G$. We have $g_0 \in \mathcal{P}_G(x)$ if and only if there exists an $f \in E^*$ such that

(1.5)    $\|f\| = 1$,
(1.6)    $f(g) = 0$    $(g \in G)$,
(1.7)    $f(x - g_0) = \|x - g_0\|$.
We recall that $E^*$ denotes the conjugate space of $E$, i.e. the space of all continuous linear functionals on $E$, endowed with the usual vector operations and with the norm

    $\|f\| = \sup_{x \in E,\ \|x\| \le 1} |f(x)|$.

To prove theorem 1.1, assume that $g_0 \in \mathcal{P}_G(x)$. Then, since $x \in E \setminus \bar{G}$, we have $\|x - g_0\| = \rho(x, G) > 0$ and hence, by a corollary of the Hahn-Banach theorem (see e.g. [25], p. 64, lemma 12), there exists an $f_0 \in E^*$ such that

    $\|f_0\| = \dfrac{1}{\|x - g_0\|}$,    $f_0(g) = 0$  $(g \in G)$,    $f_0(x) = 1$.

Then the functional $f = \|x - g_0\| f_0 \in E^*$ satisfies (1.5)-(1.7). Conversely, if there is an $f \in E^*$ satisfying (1.5)-(1.7), then for any $g \in G$ we have

    $\|x - g_0\| = |f(x - g_0)| = |f(x - g)| \le \|f\| \, \|x - g\| = \|x - g\|$

and hence $g_0 \in \mathcal{P}_G(x)$, which completes the proof.
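The following sketch checks conditions (1.5)-(1.7) in a toy case chosen purely for illustration (the space, the subspace and all names are assumptions of this example): $E$ is the plane with the maximum norm, $G$ is the first coordinate axis, $x = (0, 1)$ and $g_0 = 0$. The conjugate norm of the maximum norm is the $\ell^1$ norm, and $f(y) = y_2$ is a candidate functional.

```python
def sup_norm(v):
    return max(abs(c) for c in v)


def l1_norm(v):
    # Norm of a functional on (R^2, max norm), identified with its coefficient vector.
    return sum(abs(c) for c in v)


def apply(f, y):
    return sum(a * b for a, b in zip(f, y))


x, g0 = (0, 1), (0, 0)
spanning_vector = (1, 0)      # G = span{(1, 0)}
f = (0, 1)                    # candidate functional f(y) = y[1]

residual = tuple(a - b for a, b in zip(x, g0))
cond_1_5 = l1_norm(f) == 1                           # ||f|| = 1
cond_1_6 = apply(f, spanning_vector) == 0            # f vanishes on G
cond_1_7 = apply(f, residual) == sup_norm(residual)  # f(x - g0) = ||x - g0||

# Brute-force comparison against the points (t, 0), t = -3.00, -2.99, ..., 3.00.
grid_distance = min(sup_norm((x[0] - k / 100, x[1])) for k in range(-300, 301))
g0_is_nearest_on_grid = sup_norm(residual) == grid_distance
print(cond_1_5, cond_1_6, cond_1_7, g0_is_nearest_on_grid)
```

All three conditions hold, in agreement with the brute-force search, which finds no point of the sampled part of $G$ closer to $x$ than $g_0$.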
It is easy to see that theorem 1.1 admits the following geometrical interpretation: We have $g_0 \in \mathcal{P}_G(x)$ if and only if there exists a closed hyperplane $H$ in $E$ (i.e., a closed linear subspace $H$ such that $\dim E/H = 1$) containing $G$, which supports the cell $S(x, \|x - g_0\|) = \{ y \in E \mid \|x - y\| \le \|x - g_0\| \}$ (i.e. $\rho(H, S(x, \|x - g_0\|)) = 0$ and $H \cap \operatorname{Int} S(x, \|x - g_0\|) = \emptyset$).

Any functional $f \in E^*$ satisfying (1.5) and (1.7) is called a "maximal functional" of the element $x - g_0$ (because we have $f(x - g_0) = \|x - g_0\| = \max_{h \in E^*,\ \|h\| \le 1} |h(x - g_0)|$). The usefulness of theorem 1.1 for applications in various concrete normed linear spaces is due to the fact that for these spaces the general form of maximal functionals of the elements of the space is well known and simple (see e.g. [73], [99]).
Let us give now some examples of applications of theorem 1.1 in concrete spaces. We shall use the word "compact" in the sense of N. Bourbaki, i.e.: bicompact Hausdorff. For a compact space $Q$ we shall denote by $C(Q)$, respectively $C_R(Q)$, the space of all complex or real, respectively of all real continuous functions on $Q$, endowed with the usual vector operations and with the norm

    $\|x\| = \max_{q \in Q} |x(q)|$.
Using the general form of maximal functionals of the elements of $C_R(Q)$, from theorem 1.1 we obtain (see [82], p. 33):

Theorem 1.2. Let $E = C_R(Q)$ ($Q$ compact), $G$ a linear subspace of $E$, $x \in E \setminus \bar{G}$ and $g_0 \in G$. We have $g_0 \in \mathcal{P}_G(x)$ if and only if there exist two disjoint closed subsets $Y^+_{g_0}$, $Y^-_{g_0}$ of $Q$ and a Radon measure $\mu$ on $Q$ such that

    [...]

where $S(\mu)$ denotes the carrier of the measure $\mu$.
One can also give a characterization theorem in the spaces $E = C(Q)$ (see [82], p. 29). Theorem 1.2, which appeared in [79], was the first theorem of characterization in $E = C_R(Q)$ (even in $E = C_R([a, b])$) of elements of best approximation by elements of linear subspaces $G$ of arbitrary dimension.
For a positive measure space $(T, \nu)$ (we shall not specify the $\sigma$-field of subsets of $T$ on which the measure $\nu$ is defined; this will cause no confusion) and for $1 \le p < \infty$ (respectively $p = \infty$) we shall denote by $L^p(T, \nu)$ the space of all equivalence classes of $\nu$-measurable functions with $\nu$-integrable $p$-th power (respectively of $\nu$-measurable and $\nu$-essentially bounded functions) on $T$, endowed with the usual vector operations and with the norm

    $\|x\| = \left( \int_T |x(t)|^p \, d\nu(t) \right)^{1/p}$

(respectively, $\|x\| = \operatorname{ess\,sup}_{t \in T} |x(t)|$); for simplicity, we use here the same notation for a function and for its equivalence class in $L^p$. Again the subscript $R$ will mean, both here and for the spaces occurring in the sequel, that we restrict ourselves to real scalars. For a function $x$ on $T$ we shall use the notation

    $Z(x) = \{ t \in T \mid x(t) = 0 \}$.

Using the general form of maximal functionals of the elements of $L^1(T, \nu)$, we obtain from theorem 1.1 the following theorem (see [82], p. 46), which was obtained initially by B. R. Kripke and T. J. Rivlin [51] with different (function-theoretic) methods and with the above functional analytic methods in [81]:

Theorem 1.3. Let $E = L^1(T, \nu)$ (where $(T, \nu)$ is a positive measure space), $G$ a linear subspace of $E$, $x \in E \setminus \bar{G}$ and $g_0 \in G$. We have $g_0 \in \mathcal{P}_G(x)$ if and only if

    $\left| \int_{T \setminus Z(x - g_0)} g(t) \, \operatorname{sign}[x(t) - g_0(t)] \, d\nu(t) \right| \le \int_{Z(x - g_0)} |g(t)| \, d\nu(t)$    $(g \in G)$.
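For a finite set $T$ with the counting measure, the condition of Kripke and Rivlin can be tested directly. The sketch below assumes its usual form, namely $|\sum_{t \notin Z} g(t)\,\operatorname{sign} r(t)| \le \sum_{t \in Z} |g(t)|$ with $r = x - g_0$ and $Z$ the zero set of $r$; real scalars, the data and the choice of $G$ as the constants are illustrative assumptions. Since the condition is homogeneous in $g$, for a one-dimensional $G$ it suffices to test the spanning vector.

```python
def sign(a):
    return (a > 0) - (a < 0)


def kripke_rivlin_holds(x, g0, g):
    """Test the L^1 condition for one element g of G (counting measure, real scalars)."""
    r = [a - b for a, b in zip(x, g0)]
    off_zero = sum(gt * sign(rt) for gt, rt in zip(g, r) if rt != 0)
    on_zero = sum(abs(gt) for gt, rt in zip(g, r) if rt == 0)
    return abs(off_zero) <= on_zero


def l1_distance(x, g):
    return sum(abs(a - b) for a, b in zip(x, g))


x = [0, 1, 5]            # hypothetical data on a three-point set T
ones = [1, 1, 1]         # G = constants, spanned by this vector
median_fit = [1, 1, 1]   # the constant equal to the median of x
other_fit = [2, 2, 2]

print(kripke_rivlin_holds(x, median_fit, ones), l1_distance(x, median_fit))
print(kripke_rivlin_holds(x, other_fit, ones), l1_distance(x, other_fit))
```

The median constant passes the test and the other constant fails it, which matches the direct comparison of the two $L^1$ distances.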
We recall that for a complex number $\alpha \neq 0$, $\operatorname{sign} \alpha = e^{-i \arg \alpha} = \bar{\alpha}/|\alpha|$, and that $\operatorname{sign} 0 = 0$, by definition.

For $E = L^p(T, \nu)$, $1 < p < \infty$, and for an abstract inner product space $E$ we obtain from theorem 1.1 the following well known results (see [82], pp. 56-57):

Theorem 1.4. a) Let $E = L^p(T, \nu)$ (where $(T, \nu)$ is a positive measure space and $1 < p < \infty$), $G$ a linear subspace of $E$, $x \in E \setminus \bar{G}$ and $g_0 \in G$. We have $g_0 \in \mathcal{P}_G(x)$ if and only if

    $\int_T g(t) \, |x(t) - g_0(t)|^{p-1} \operatorname{sign}[x(t) - g_0(t)] \, d\nu(t) = 0$    $(g \in G)$.

b) Let $E$ be an inner product space, $G$ a linear subspace of $E$, $x \in E \setminus \bar{G}$ and $g_0 \in G$. We have $g_0 \in \mathcal{P}_G(x)$ if and only if

    $(g, x - g_0) = 0$    $(g \in G)$,

where $(x, y)$ denotes the scalar product in $E$.
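In an inner product space the characterization reduces to the familiar orthogonality of the residual $x - g_0$ to the subspace. A minimal sketch in three-dimensional euclidean space follows; the vectors and the function name are hypothetical, and the nearest point is computed from the normal equations for a two-dimensional subspace, with exact rational arithmetic so that the orthogonality can be checked without rounding.

```python
from fractions import Fraction as F


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def best_approximation(x, u, v):
    """Nearest point to x in span{u, v} (u, v linearly independent), via the normal equations."""
    a11, a12, a22 = dot(u, u), dot(u, v), dot(v, v)
    b1, b2 = dot(x, u), dot(x, v)
    det = a11 * a22 - a12 * a12
    c1 = F(b1 * a22 - b2 * a12, det)   # Cramer's rule
    c2 = F(a11 * b2 - a12 * b1, det)
    return [c1 * p + c2 * q for p, q in zip(u, v)]


u, v, x = [1, 1, 1], [0, 1, 2], [1, 0, 2]
g0 = best_approximation(x, u, v)
residual = [a - b for a, b in zip(x, g0)]
print(g0, dot(residual, u), dot(residual, v))
```

The residual is orthogonal to both spanning vectors, hence to every element of the subspace.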
Some applications of theorem 1.1 in other concrete spaces, e.g. in the spaces $C^1(Q, \nu)$ and $C^1_R(Q, \nu)$, are given in [82]. We recall that for a compact space $Q$ and a positive Radon measure $\nu$ on $Q$ such that $S(\nu) = Q$, $C^1(Q, \nu)$ denotes the (dense) linear subspace of $L^1(Q, \nu)$ consisting of the equivalence classes of the continuous functions on $Q$, endowed with the usual vector operations and with the norm $\|x\| = \int_Q |x(q)| \, d\nu(q)$; the assumption that the carrier $S(\nu)$ is the whole space $Q$ implies that each equivalence class of $C^1(Q, \nu)$ contains exactly one continuous function.

The foregoing results in concrete spaces illustrate the power of the methods of functional analysis in the theory of best approximation. Indeed, although theorem 1.1 in general normed linear spaces is obtained by a simple application of a corollary of the Hahn-Banach theorem, it gives in various concrete spaces theorems 1.2-1.4 and other results as particular cases, simply by using the general form of maximal functionals in these spaces. A direct proof of theorems 1.2-1.4 would require different methods in each of the concrete spaces involved, apparently having no connection with each other; however, they all turn out to be particular cases of theorem 1.1, and this unified method of obtaining them is simpler and clearer than the separate proofs in each concrete space. This unified method is carried out in a consistent manner, for the whole theory of best approximation (e.g. for problems of uniqueness, etc.), in the monograph [82].
In the sequel we shall only indicate some examples of applications in concrete spaces (rather than mention all known applications) of some of the general results on best approximation in arbitrary normed linear spaces which we shall give.

A natural generalization of the problem of characterization of elements of best approximation is the problem of simultaneous characterization of a set of elements of best approximation: given $E$, $G \subset E$ and $x$ as above and a subset $M$ of $G$, what are the necessary and sufficient conditions in order that every element $g_0 \in M$ be an element of best approximation of $x$ by the elements of $G$? The answer is given (see [82], p. 23) by

Theorem 1.5. Let $E$ be a normed linear space, $G$ a linear subspace of $E$, $x \in E \setminus \bar{G}$ and $M \subset G$. We have $M \subset \mathcal{P}_G(x)$ if and only if there exists an $f \in E^*$ satisfying (1.5), (1.6) and

(1.16)    $f(x - g_0) = \|x - g_0\|$    $(g_0 \in M)$.

In other words this says that one can find the functional $f \in E^*$ of theorem 1.1 to be common for all $g_0 \in M$. Theorem 1.5 is an immediate consequence of theorem 1.1 and the observation that $\|x - g_1\| = \|x - g_2\|$ for all pairs $g_1, g_2 \in M \subset \mathcal{P}_G(x)$; naturally, the converse is also true, since theorem 1.1 is even a particular case of theorem 1.5. We shall see in §3 that theorem 1.5 has applications in the study of the uniqueness of elements of best approximation.

Now we shall consider the important particular case when $\dim G = n < \infty$, i.e. when $G = [x_1, \ldots, x_n]$ = the (closed) linear subspace of $E$ spanned by $n$ linearly independent elements $x_1, \ldots, x_n$. Naturally, the preceding results are also applicable in this particular case; however, by using effectively the assumption $\dim G = n < \infty$ we can obtain additional information. We recall that an element $x$ of a closed convex set $A$ in a topological linear space $L$ is called an extremal point of $A$ if the relations $y, z \in A$, $0 < \lambda < 1$, $x = \lambda y + (1 - \lambda) z$ imply $y = z = x$. Using the classical theorem of Minkowski that the elements of the unit cell $S_{E^*} = \{ f \in E^* \mid \|f\| \le 1 \}$ in a finite-dimensional space $E^*$ can be expressed as finite convex combinations of extremal points of $S_{E^*}$ and that if $E_0$ is an arbitrary linear subspace of a normed linear space $E$, then every extremal point of the unit cell $S_{E_0^*}$ can be extended to an extremal point of $S_{E^*}$ (see e.g. [82], p. 168), from theorem 1.1 we obtain (see [82], p. 170):
Theorem 1.6. Let $E$ be a normed linear space, $G = [x_1, \ldots, x_n]$ an $n$-dimensional linear subspace of $E$, $x \in E \setminus G$ and $g_0 \in G$. We have $g_0 \in \mathcal{P}_G(x)$ if and only if there exist $h$ extremal points $f_1, \ldots, f_h$ of the unit cell $S_{E^*}$, where $1 \le h \le n + 1$ if the scalars are real and $1 \le h \le 2n + 1$ if the scalars are complex, and $h$ numbers $\lambda_1, \ldots, \lambda_h > 0$ with $\sum_{j=1}^{h} \lambda_j = 1$, such that

    $\sum_{j=1}^{h} \lambda_j f_j(x_k) = 0$    $(k = 1, \ldots, n)$,
    $f_j(x - g_0) = \|x - g_0\|$    $(j = 1, \ldots, h)$.

In other words, the additional information to theorem 1.1 which we obtain for $\dim G = n < \infty$ is that for such subspaces $G$ one can take the functional $f$ of theorem 1.1 to be a convex combination of $h \le n + 1$ (respectively $h \le 2n + 1$) extremal points of the unit cell $S_{E^*} = \{ f \in E^* \mid \|f\| \le 1 \}$. Theorem 1.6 is also convenient for applications in the usual concrete spaces, because for these spaces the general form of the extremal points of $S_{E^*}$ is well known and simple. For example (see e.g. [25], p. 441, lemma 6) for $E = C(Q)$ ($Q$ compact), a functional $f \in E^*$ is an extremal point of $S_{E^*}$ if and only if there exist a $q \in Q$ and a scalar $\alpha$ with $|\alpha| = 1$ such that

(1.19)    $f(y) = \alpha \, y(q)$    $(y \in E)$.

We recall that a system of $n$ functions $x_1, \ldots, x_n \in C(Q)$ ($Q$ compact) is called a Čebyšev system (on $Q$) if every non-zero linear combination $\sum_{i=1}^{n} \alpha_i x_i$ has at most $n - 1$ zeros on $Q$. For $Q = [a, b]$ and for real scalars we have the following classical "alternation theorem" of P. L. Čebyšev and S. N. Bernstein (see e.g. [82], p. 184):

Theorem 1.7. Let $G = [x_1, \ldots, x_n]$ be an $n$-dimensional linear subspace of $E = C_R([a, b])$ such that $x_1, \ldots, x_n$ form a Čebyšev system and let $x \in E \setminus G$ and $g_0 \in G$. We have $g_0 \in \mathcal{P}_G(x)$ if and only if there exist $n + 1$ points $q_1 < q_2 < \ldots < q_{n+1}$ of $[a, b]$, at which the difference $x(q) - g_0(q)$ takes the value $\|x - g_0\|$ with alternating signs (i.e., with opposite signs at consecutive points $q_j$, $q_{j+1}$ $(j = 1, \ldots, n)$).

This classical theorem follows as a particular case both from theorem 1.2 for $\dim G = n < \infty$ and from theorem 1.6 for $E = C_R(Q)$, using the general form (1.19) of the extremal points of $S_{E^*}$ (naturally, since the scalars are real, we have now $\alpha = \pm 1$ in (1.19)).
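The alternation theorem is easy to check on a classical textbook case, taken here as an illustrative assumption: the approximation of $x(t) = t^2$ on $[0, 1]$ by the polynomials of degree at most one (so $n = 2$, with the Čebyšev system $1, t$). The candidate $g_0(t) = t - 1/8$ should exhibit $n + 1 = 3$ alternation points. The grid used to estimate the norm is a hypothetical choice; exact fractions avoid rounding issues.

```python
from fractions import Fraction as F


def x(t):
    return t * t


def g0(t):
    return t - F(1, 8)


def err(t):
    return x(t) - g0(t)


# Estimate ||x - g0|| on a uniform grid of [0, 1] containing the candidate points.
grid = [F(k, 1000) for k in range(1001)]
norm = max(abs(err(t)) for t in grid)

points = [F(0), F(1, 2), F(1)]          # candidate alternation points q1 < q2 < q3
values = [err(q) for q in points]

attains_norm = all(abs(v) == norm for v in values)
alternates = all(values[j] == -values[j + 1] for j in range(len(values) - 1))
print(norm, values, attains_norm, alternates)
```

The error takes the values $1/8, -1/8, 1/8$ at the three points, which is the alternation pattern required for $n = 2$.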
1.2. The second main theorem of characterization

The second main theorem of characterization of elements of best approximation by elements of linear subspaces in normed linear spaces is the following (see [82], p. 62):

Theorem 1.8. Let $E$ be a normed linear space, $G$ a linear subspace of $E$, $x \in E \setminus \bar{G}$ and $g_0 \in G$. We have $g_0 \in \mathcal{P}_G(x)$ if and only if for every $g \in G$ there exists an extremal point $f^g$ of the unit cell $S_{E^*} = \{ f \in E^* \mid \|f\| \le 1 \}$ such that

    $\operatorname{Re} f^g(g_0 - g) \ge 0$,
    $f^g(x - g_0) = \|x - g_0\|$.

For a geometrical interpretation of theorem 1.8 see [82], p. 75. From theorem 1.8, using the general form (1.19) of the extremal points of $S_{E^*}$ for the space $E = C(Q)$, one obtains immediately the following classical theorem of characterization of elements of best approximation, due to A. N. Kolmogorov (see [82], p. 69):

Theorem 1.9. Let $E = C(Q)$ ($Q$ compact), $G$ a linear subspace of $E$, $x \in E \setminus \bar{G}$ and $g_0 \in G$. We have $g_0 \in \mathcal{P}_G(x)$ if and only if for every $g \in G$ there exists a $q = q^g \in Q$ such that

    $\operatorname{Re} \{ [x(q) - g_0(q)] \, \overline{[g_0(q) - g(q)]} \} \ge 0$,
    $|x(q) - g_0(q)| = \max_{t \in Q} |x(t) - g_0(t)|$.
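On a finite set $Q$ with real scalars, Kolmogorov's criterion can be tested by enumeration. The sketch below assumes the criterion in its usual real form: for every $g \in G$ there is a point $q$ where $|x(q) - g_0(q)|$ is maximal and $[x(q) - g_0(q)][g_0(q) - g(q)] \ge 0$. The data, the choice of $G$ as the constants and the sampled range of constants are illustrative assumptions.

```python
def kolmogorov_holds(x, g0, g):
    """Check the criterion for one g: some critical point q has r(q) * (g0(q) - g(q)) >= 0."""
    r = [a - b for a, b in zip(x, g0)]
    norm = max(abs(v) for v in r)
    critical = [q for q, v in enumerate(r) if abs(v) == norm]
    return any(r[q] * (g0[q] - g[q]) >= 0 for q in critical)


def passes_for_constants(x, g0, cs):
    """Check the criterion against the constant functions c, for each sampled c."""
    return all(kolmogorov_holds(x, g0, [c] * len(x)) for c in cs)


x = [1, -1, 0]            # a function on a three-point set Q
cs = range(-5, 6)         # sampled constants (a finite sample, not all of G)

midrange_ok = passes_for_constants(x, [0, 0, 0], cs)
shifted_ok = passes_for_constants(x, [1, 1, 1], cs)
print(midrange_ok, shifted_ok)
```

The constant $0$, the midpoint between the extreme values of $x$, passes for every sampled $g$, while the constant $1$ already fails against $g = 0$.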
For applications of theorem 1.8 in other concrete spaces, see [82]. Returning to arbitrary normed linear spaces, it is easy to see that the sufficiency part of theorem 1.8 remains valid for an arbitrary set $G$ in $E$. Indeed, if the condition is satisfied, then for every $g \in G$ we have

    $\|x - g_0\| = \operatorname{Re} f^g(x - g_0) \le \operatorname{Re} f^g(x - g) \le |f^g(x - g)| \le \|x - g\|$,

whence $g_0 \in \mathcal{P}_G(x)$, which proves the assertion. The problem of characterizing those sets $G \subset E$ for which the condition in theorem 1.8 is also necessary is important in non-linear approximation (see §5).

1.3. Differential characterizations.
Since best approximation amounts, by definition, to the minimization of the convex functional $\chi = \chi_{G,x}$ defined by

    $\chi(g) = \|x - g\|$    $(g \in G)$

on the linear subspace $G$ of a normed linear space $E$, there arises naturally the problem of obtaining characterizations of the elements of best approximation $g_0 \in \mathcal{P}_G(x)$ with the aid of differential calculus. The main difficulty is that in general the norm in $E$ is not necessarily Gâteaux differentiable at each non-zero point of $E$. Nevertheless, it is known (see e.g. [25], p. 445, lemma 1) that the limits

(1.25)    $\tau(x, y) = \lim_{t \to 0+} \dfrac{\|x + t y\| - \|x\|}{t}$    $(x, y \in E)$

always exist and one can use them to give the following characterizations of elements of best approximation (see [ ], pp. 88-90):

Theorem 1.10. Let $E$ be a normed linear space, $G$ a linear subspace of $E$, $x \in E \setminus \bar{G}$ and $g_0 \in G$. We have $g_0 \in \mathcal{P}_G(x)$ if and only if

    $\tau(x - g_0, g) \ge 0$    $(g \in G)$.

If the norm in $E$ is Gâteaux differentiable at $x - g_0$, this condition is equivalent to the following:

    $\tau(x - g_0, g) = 0$    $(g \in G)$.
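The one-sided derivatives (1.25) can be evaluated exactly for a piecewise linear norm, where the difference quotient is constant for all sufficiently small $t > 0$. The sketch below does this for the $\ell^1$ norm in the plane with $G$ spanned by $(1, 1)$ and $x = (1, 0)$, and assumes the differential condition in its usual form $\tau(x - g_0, g) \ge 0$ for all $g \in G$; the step $t = 10^{-6}$ and the data are illustrative choices. Because $\tau$ is positively homogeneous in its second argument, it suffices to test $g = \pm(1, 1)$.

```python
from fractions import Fraction as F


def l1(v):
    return sum(abs(c) for c in v)


def tau(x, y, t=F(1, 10**6)):
    """Difference quotient (||x + t*y|| - ||x||) / t; exact for the l1 norm once t is small enough."""
    return (l1([a + t * b for a, b in zip(x, y)]) - l1(x)) / t


# Candidate g0 = (0, 0): the residual is x - g0 = (1, 0).
at_zero = [tau((1, 0), (1, 1)), tau((1, 0), (-1, -1))]

# Candidate g0 = (2, 2): the residual is x - g0 = (-1, -2).
at_two = tau((-1, -2), (1, 1))

print(at_zero, at_two)
```

Both derivatives are non-negative for the residual $(1, 0)$, while the residual $(-1, -2)$ admits a direction in $G$ with negative derivative, so moving from $(2, 2)$ along that direction decreases the distance.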
For some remarks on the uses of the theory of convex functions (subdifferentials, etc.) for characterizations of elements of best approximation by elements of more general sets, see §5.

1.4. Other characterizations.

We have the following characterization theorem, proved by Y. Ikebe [41] for $E = C(Q)$ and in [85] for arbitrary normed linear spaces:

Theorem 1.11. Let $E$ be a normed linear space, $G$ a linear subspace of $E$, $x \in E \setminus \bar{G}$ and $g_0 \in G$. We have $g_0 \in \mathcal{P}_G(x)$ if and only if $0$ belongs to the $\sigma(E^*, E)$-closure of the convex hull of the set

    $\{ f|_G \mid f \in \mathcal{E}(S_{E^*}),\ f(x - g_0) = \|x - g_0\| \}$,

where $\mathcal{E}(S_{E^*})$ denotes the set of all extremal points of the unit cell $S_{E^*}$ and $f|_G$ denotes the restriction of $f$ to the subspace $G$.

A related result in $E = C(Q)$ has been obtained by E. W. [...] For $\dim G < \infty$, from theorem 1.11 one obtains

Corollary 1.1. Let $E$ be a normed linear space, $G = [x_1, \ldots, x_n]$ an $n$-dimensional linear subspace of $E$, $x \in E \setminus G$ and $g_0 \in G$. We have $g_0 \in \mathcal{P}_G(x)$ if and only if $0$ belongs to the convex hull of the following set in the $n$-dimensional euclidean space:

    $\{ (f(x_1), \ldots, f(x_n)) \mid f \in \mathcal{E}(S_{E^*}),\ f(x - g_0) = \|x - g_0\| \}$.

In the particular case when $E = C(Q)$, taking into account (1.19) one obtains from corollary 1.1 a result of E. W. Cheney ([19], p. 73); the necessity part of this latter result was observed, essentially, by T. J. Rivlin and H. S. Shapiro (see [82], p. 181).

For a characterization of elements of best approximation in terms of fixed points of a set-valued mapping, see §5, theorem 5.5. For some other characterizations of elements of best approximation, see also [82].
2. Existence of elements of best approximation

2.1. Characterizations of proximinal linear subspaces

The basic notion in connection with the existence of elements of best approximation is the following:

Definition 2.1. A set $G$ in a metric space $E$ is said to be proximinal if every element $x \in E$ has at least one element of best approximation in $G$, i.e. if

(2.1)    $\mathcal{P}_G(x) \neq \emptyset$    $(x \in E)$.

The term "proximinal" set (a combination of "proximity" and "maximal") was proposed by R. Killgrove and used first by R. R. Phelps ([68], p. 790). Some authors use for such sets the term distance set, or existence set, or (E)-set.

Since by §1, formula (1.4), $\mathcal{P}_G(x) \neq \emptyset$ for all $x \in G$, condition (2.1) is equivalent to

(2.2)    $\mathcal{P}_G(x) \neq \emptyset$    $(x \in E \setminus G)$.

Let us also observe that every proximinal set is necessarily closed, since otherwise no $x \in \bar{G} \setminus G$ would have an element of best approximation in $G$.

Now we shall consider the problem of characterization of proximinal linear subspaces $G$ of a normed linear space $E$, i.e. the problem of giving necessary and sufficient conditions in order that $G$ be proximinal.
The basic observation is the following: A linear subspace $G$ of a normed linear space $E$ is proximinal if and only if $G$ is proximinal in every linear subspace $E_0 \subset E$ such that $G$ is a closed hyperplane in $E_0$ (i.e. a closed linear subspace of $E_0$ such that $\dim E_0/G = 1$). In fact, although this observation is obvious, it is useful since it reduces the problem of characterization of proximinal linear subspaces to that of the characterization of proximinal closed hyperplanes. This latter problem is solved by the observation that a closed hyperplane $G$ in $E_0$ is proximinal if and only if there exists an element $z \in E_0 \setminus \{0\}$ such that $0 \in \mathcal{P}_G(z)$. (Indeed, if $x \in E_0 \setminus G$ and $g_0 \in \mathcal{P}_G(x)$, then $x - g_0 \in E_0 \setminus \{0\}$ and $0 \in \mathcal{P}_G(x - g_0)$; conversely, if $0 \in \mathcal{P}_G(z)$ and $x \in E_0 \setminus G$, then, since $G$ is a closed hyperplane in $E_0$, we have $x - \alpha z \in G$ for a suitable scalar $\alpha \neq 0$, whence, since also $0 \in \mathcal{P}_G(\alpha z)$, we infer $x - \alpha z \in \mathcal{P}_G(\alpha z + x - \alpha z) = \mathcal{P}_G(x)$; note that this proof is somewhat shorter than that given in [82], pp. 93-94.) Using these observations and the characterization of best approximations given in §1, theorem 1.1, we obtain the following main theorem of characterization of proximinal linear subspaces:

Theorem 2.1. A linear subspace $G$ of a normed linear space $E$ is proximinal if and only if $G$ is closed and for every linear subspace $E_0 \subset E$ such that $G$ is a closed hyperplane in $E_0$ and every functional $\varphi \in E_0^*$ with $\varphi|_G = 0$, there exists an element $z \in E_0$ such that

(2.3)    [...]

Any element $z$ satisfying (2.3) is called a maximal element of $\varphi$. In the usual concrete normed linear spaces the general form of functionals which admit maximal elements and the general form of maximal elements of such a functional are well known and simple (see e.g. [73], [99]) and therefore theorem 2.1 is suitable for applications in concrete spaces.

Let us also mention another characterization of proximinal linear subspaces, which appears in Cheney-Wulbert [20]:
Proposition 2.1. A linear subspace $G$ of a normed linear space $E$ is proximinal if and only if

(2.4)    $E = G + \mathcal{P}_G^{-1}(0)$,

where we use the notation

(2.5)    $\mathcal{P}_G^{-1}(0) = \{ x \in E \mid 0 \in \mathcal{P}_G(x) \}$.

Indeed, if $G$ is proximinal and $x \in E$, $g_0 \in \mathcal{P}_G(x)$, then $x = g_0 + (x - g_0) \in G + \mathcal{P}_G^{-1}(0)$. Conversely, if we have (2.4) and $x \in E$, then $x = g_0 + y$, where $g_0 \in G$, $y \in \mathcal{P}_G^{-1}(0)$, whence $0 \in \mathcal{P}_G(y) = \mathcal{P}_G(x - g_0)$, whence $g_0 \in \mathcal{P}_G(x)$.
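The decomposition (2.4) can be made concrete in the plane with the maximum norm, reading the set in (2.5) as the set of all $x$ which have $0$ as an element of best approximation in $G$ (this reading, the choice of $G$ as the first coordinate axis and the helper names are assumptions of this sketch). For this $G$ the distance from $x = (a, b)$ to $G$ equals $|b|$, so $0$ is a nearest point exactly when $|a| \le |b|$.

```python
def sup_norm(v):
    return max(abs(v[0]), abs(v[1]))


def distance_to_G(x):
    # G = {(t, 0)}: the minimum over t of max(|x0 - t|, |x1|) equals |x1|.
    return abs(x[1])


def zero_is_nearest(x):
    """True when 0 is an element of best approximation of x in G."""
    return sup_norm(x) == distance_to_G(x)


def decompose(x):
    """Split x = g0 + y with g0 in G and 0 a nearest point to y."""
    g0 = (x[0], 0)
    y = (0, x[1])
    return g0, y


x = (3, -2)
g0, y = decompose(x)
print(g0, y, zero_is_nearest(y))

# The set of such y is closed under scalar multiples but is not a linear subspace:
print(zero_is_nearest((1, 1)), zero_is_nearest((1, -1)), zero_is_nearest((2, 0)))
```

The last line shows two elements of the set whose sum $(2, 0)$ lies outside it, so in a general normed linear space this set need not be convex, in contrast with the orthogonal complement in an inner product space.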
Some other characterizations of proximinal linear subspaces of normed linear spaces are given in [82], pp. 94-95. From the theorems in concrete spaces, let us mention the following characterization of proximinal linear subspaces $G$ of finite codimension of $E = C_R(Q)$ (we recall that, by definition, $\operatorname{codim} G = \dim E/G$), due to A. L. Garkavi (see [82], p. 302):

Theorem 2.2. A closed linear subspace $G$ of codimension $n$ of $E = C_R(Q)$ ($Q$ compact) is proximinal if and only if the following three conditions are satisfied:

α) For every $\mu \in G^\perp \setminus \{0\}$ the carrier $S(\mu)$ admits a Hahn decomposition into two closed sets $S(\mu)^+$ and $S(\mu)^-$.

β) For every pair of measures $\mu_1, \mu_2 \in G^\perp \setminus \{0\}$ the set $S(\mu_1) \setminus S(\mu_2)$ is closed.

γ) For every pair of measures $\mu_1, \mu_2 \in G^\perp \setminus \{0\}$ the measure $\mu_1$ is absolutely continuous with respect to $\mu_2$ on the set $S(\mu_2)$.

In the particular case when $n = 1$ (i.e., when $G$ is a closed hyperplane), conditions β) and γ) are automatically satisfied (since $\dim G^\perp = 1$) and hence condition α) is necessary and sufficient in order that $G$ be proximinal; one can also show directly that this condition is equivalent to that of theorem 2.1, i.e. to the existence of a maximal element for each $\mu \in G^\perp \setminus \{0\}$.

Problem 2.1. Let $E = L^1_R(T, \nu)$, where $(T, \nu)$ is a positive measure space and let $G$ be a closed linear subspace of codimension $n$ of $E$. What conditions are necessary and sufficient in order that $G$ be proximinal? Some necessary conditions, which are also sufficient when $n = 1$, are known (see [82], p. 325, theorem 2.10).

The notation $\mathcal{P}_G^{-1}(0)$ for the sets (2.5) is motivated by §4, definition 4.1 a). Since in the subsequent sections the set $\mathcal{P}_G^{-1}(0)$ will be a useful tool in the study of problems of best approximation, let us mention here some properties of these sets (see [82], p. 143 and [66], [38]):

Proposition 2.2. Let $G$ be a linear subspace of a normed linear space $E$. Then

a) The set $\mathcal{P}_G^{-1}(0)$ is closed.

b) The set $\mathcal{P}_G^{-1}(0)$ is "star-shaped", i.e. $x \in \mathcal{P}_G^{-1}(0)$ implies $\alpha x \in \mathcal{P}_G^{-1}(0)$ for all scalars $\alpha$ (in particular, [...]).

c) We have [...]

d) If [...], then $\mathcal{P}_G^{-1}(0)$ is nowhere dense in $E$.

e) If $G$ is a semi-Čebyšev subspace of $E$ (see §3, definition 3.1), then $\mathcal{P}_G^{-1}(0)$ is homeomorphic [...].
2.2. Some classes of proximinal linear subspaces.

Whenever a new class (= family) of subspaces is introduced, it is natural to ask whether it is non-void; in particular, one can ask whether proximinal linear subspaces exist in every normed linear space. The answer is affirmative, since from theorem 2.1 it follows that whenever $f \in E^*$ has a maximal element, $G = \{ x \in E \mid f(x) = 0 \}$ is a proximinal hyperplane. Now we shall give some other important classes of proximinal linear subspaces.

Theorem 2.3. Let $E$ be a normed linear space and let $G$ be a linear subspace of $E$ such that the unit cell $S_G = \{ g \in G \mid \|g\| \le 1 \}$ is sequentially compact for the weak topology $\sigma(E, E^*)$. Then $G$ is proximinal.

This theorem (due to V. Klee [45]) can be deduced from theorem 2.1 but it admits also a simple direct proof; a similar remark is also valid for theorem 2.4 below. For the proofs see [82], Ch. I, §2. An immediate consequence of theorem 2.3 is the following:

Corollary 2.1. Let $E$ be a normed linear space and let $G$ be a linear subspace of $E$ with the property that $G$ is a reflexive Banach space. Then $G$ is proximinal. In particular, every finite-dimensional linear subspace $G$ of a normed linear space $E$ is proximinal.

Theorem 2.4. Let $E^*$ be the conjugate space of a normed linear space $E$. Then

a) Every linear subspace $\Gamma$ of $E^*$ having the unit cell $S_\Gamma = \{ f \in \Gamma \mid \|f\| \le 1 \}$ compact for $\sigma(E^*, E)$ is proximinal. In particular, every $\sigma(E^*, E)$-closed linear subspace $\Gamma$ of $E^*$ is proximinal.

b) The same holds when $S_\Gamma$ is sequentially compact for $\sigma(E^*, E)$.

Note that the first statement in a) is indeed more general than the second, since we did not assume $E$ complete. Also, it can be shown by examples that between a) and b) there is no relation of implication.

Recently J. Blatter [6] has shown that the usual concrete Banach spaces $E$ are proximinal in their second conjugate space $E^{**}$ (identifying $E$ with its canonical image in $E^{**}$) and has asked whether all Banach spaces have this property. W. Pollul has shown [69] that the answer is negative, for example the space $c_0$ endowed with the equivalent norm

    [...]

is a Banach space which is not proximinal in its second conjugate space.

2.3. Normed linear spaces in which all closed linear subspaces are proximinal.
Along with every new class of linear subspaces of normed linear spaces which we introduce, it is natural to consider also the complementary class, i.e. the family of all linear subspaces which do not belong to that class; we shall denote the linear subspaces of this family by the prefix "non-" followed by the name of the original class, e.g. the linear subspaces which are not proximinal will be called "non-proximinal". Naturally, in every infinite dimensional normed linear space there exist non-proximinal linear subspaces, for example, the non-closed linear subspaces. However, now we shall show that the problem of existence of non-proximinal closed linear subspaces may have a negative answer, i.e. there exist normed linear spaces in which all closed linear subspaces are proximinal, and we shall give some characterizations of such spaces.

Theorem 2.5. For a normed linear space $E$ the following statements are equivalent:

1°. All closed linear subspaces of $E$ are proximinal.

2°. The restriction of each $f \in E^*$ to every closed linear subspace of $E$ has a maximal element.

If $E$ is a Banach space, these statements are equivalent to the following:

3°. All closed linear subspaces of $E$, of a certain fixed finite codimension $m$, where $1 \le m \le \dim E - 1$, are proximinal.

4°. Every $f \in E^*$ has a maximal element.

5°. $E$ is reflexive.

The equivalence 1° ⟺ 2° follows from theorem 2.1, and the other equivalences are a consequence of the profound theorem of R. C. James [43] (for which only difficult proofs are known to-day) that 4° ⟹ 5°.
Note that a normed linear space $E$ satisfying 4° need not be a Banach space (and thus it need not satisfy 5°), as shown by the following (unpublished) example of R. C. James: Let $B$ be the space of all sequences of real numbers $x = \{\xi_n\}$ such that

    [...]

endowed with the usual operations and with the norm (2.10), and let $E$ be the linear span of all members $x = \{\xi_n\}$ of $B$ for which

    [...]    for $n = 1, 2, \ldots$

Then $B$ is a reflexive Banach space and $E$ is a dense linear subspace of $B$ with $E \neq B$ (hence $E$ is not a Banach space), having the property 4° of theorem 2.5 above.
2.4. Normed linear spaces which are proximinal in every superspace.

We have seen in corollary 2.1 above that if $G$ is a reflexive Banach space, then $G$ is proximinal in every superspace $E$ (i.e., in every normed linear space $E$ containing $G$ as a subspace). It is natural to raise the problem of characterization of all normed linear spaces $G$ with this property. It is obvious that a normed linear space $G$ which is proximinal in every superspace must be complete, i.e., a Banach space, since every non-complete normed linear space $G$ is non-proximinal in its completion. Recently W. Pollul [69] has proved that each non-reflexive Banach space $G$ can be embedded isometrically as a non-proximinal closed hyperplane in another Banach space $E$. Thus we have

Theorem 2.6. A normed linear space $G$ is proximinal in every superspace $E$ if and only if $G$ is a reflexive Banach space.

The proof of Pollul [69] has used James' theorem (the implication 4° ⟹ 5° of theorem 2.5 above), by observing that if $\varphi \in G^*$ has no maximal element, then $\{G, 0\}$ is a non-proximinal hyperplane in $E = G \times R$ endowed with the norm

    [...]

and $\{G, 0\}$ is isometric to $G$. A more elementary proof of theorem 2.6, which does not make use of James' theorem, was given recently in [86].

2.5. Transitivity of proximinality.
Let us denote the statement "$G$ is a proximinal linear subspace of the normed linear space $E$" by: $G \overset{(P)}{\subset} E$. We shall consider now the problem, to what extent this relation is transitive.

Theorem 2.7. Let $E$ be a Banach space. The following statements are equivalent:

1°. For all Banach spaces $F$, $G$, $E \overset{(P)}{\subset} F$, $F \overset{(P)}{\subset} G$ implies $E \overset{(P)}{\subset} G$.

2°. Same as 1°, with $\dim F/E = \dim G/F = 1$.

3°. For all Banach spaces $F$, $G$, $F \overset{(P)}{\subset} E$, $E \overset{(P)}{\subset} G$ implies $F \overset{(P)}{\subset} G$.

4°. Same as 3°, with $\dim E/F = \dim G/E = 1$.

5°. $E$ is reflexive.

The implications 5° ⟹ 1° ⟹ 2° and 5° ⟹ 3° ⟹ 4° are immediate consequences of corollary 2.1 and the fact that every closed linear subspace of a reflexive space is reflexive. The implications 2° ⟹ 5° and 4° ⟹ 5° have been proved recently by W. Pollul [69], using again James' theorem.
One can also consider a third type of transitivity property, with $E$ "on the last place", but the following theorem of W. Pollul [69] shows that some non-reflexive Banach spaces also have this property.
Theorem 2.8. Let $E = c_0$. Then for all $F$, $G$,
(2.14) $G \overset{(P)}{\subset} F$, $F \overset{(P)}{\subset} E$, $\dim F/G < \infty$, $\dim E/F < \infty$ imply $G \overset{(P)}{\subset} E$.
Naturally, by corollary 2.1, every reflexive Banach space $E$ also has property (2.14). On the other hand, there exist non-reflexive Banach spaces $E$ which do not have property (2.14), even with $\dim F/G = \dim E/F = 1$, e.g. $E = C_R(Q)$ ($Q$ compact) and $E = L^1_R(T, \nu)$ ($(T, \nu)$ a positive measure space) whenever $\dim E = \infty$ (W. Pollul [69]).
Problem 2.2 (W. Pollul [69]). Which non-reflexive Banach spaces $E$ have property (2.14) with $\dim F/G = \dim E/F = 1$?
Problem 2.3 (W. Pollul [69]). Does $E = c_0$ have the property
(2.15) $G \overset{(P)}{\subset} F$, $F \overset{(P)}{\subset} E$ imply $G \overset{(P)}{\subset} E$,
or, alternatively: is every Banach space $E$ with property (2.15) reflexive?
2.6. Proximinality and quotient spaces.
The following theorem of Cheney and Wulbert [20] shows how proximinality is transferred to and from quotient spaces:
Theorem 2.8. If $G_1$ is a proximinal linear subspace of a normed linear space $E$ and $G_2$ an arbitrary closed linear subspace of $G_1$, then $G_1/G_2$ is proximinal in $E/G_2$. Conversely, if $G_1$ is a closed linear subspace of $E$, $G_2$ a closed linear subspace of $G_1$ such that $G_1/G_2$ is proximinal in $E/G_2$ (in particular, if $G_1/G_2$ is reflexive) and that $G_2$ is proximinal in $E$, then $G_1$ is proximinal in $E$.
2.7. Very non-proximinal linear subspaces.
Definition 2.2. A set $G$ in a metric space $E$ is said to be very non-proximinal if $G$ is closed and no element $x \in E \setminus G$ has an element of best approximation in $G$, i.e. if
(2.16) $\bar{G} = G$ and $\mathcal{P}_G(x) = \emptyset$ $(x \in E \setminus G)$.
Obviously, every very non-proximinal set is non-proximinal. We have the following simple characterization of very non-proximinal subspaces of normed linear spaces:
Proposition 2.3. A linear subspace $G$ of a normed linear space $E$ is very non-proximinal if and only if there is no element $z \in E \setminus \{0\}$ such that
(2.17) $0 \in \mathcal{P}_G(z)$.
Indeed, if $x \in E \setminus G$ and $\mathcal{P}_G(x) \neq \emptyset$, $g_0 \in \mathcal{P}_G(x)$, then $z = x - g_0 \in E \setminus G$ and $0 \in \mathcal{P}_G(x - g_0) = \mathcal{P}_G(z)$. Conversely, if $z \in E \setminus \{0\}$ satisfies (2.17), then $z \in E \setminus G$ and $\mathcal{P}_G(z) \neq \emptyset$.
From this proposition and from the observation made before theorem 2.1 it follows that a closed hyperplane $G$ in $E$ is very non-proximinal if and only if it is non-proximinal. Consequently, by theorem 2.5, a Banach space $E$ contains very non-proximinal linear subspaces if and only if $E$ is non-reflexive; moreover, in this case, for every $f \in E^*$ which has no maximal element, the closed hyperplane $G = \{x \in E \mid f(x) = 0\}$ is very non-proximinal.
§3. Uniqueness of elements of best approximation.
3.1. Characterizations of semi-Čebyšev and Čebyšev subspaces.
The basic notions in connection with the uniqueness of elements of best approximation are given in
Definition 3.1. A set $G$ in a metric space $E$ is said to be
a) a semi-Čebyšev set if every element $x \in E$ has at most one element of best approximation in $G$, i.e. if $\mathcal{P}_G(x)$ contains at most one element for every $x \in E$;
b) a Čebyšev set, if it is simultaneously a proximinal and semi-Čebyšev set, i.e. if every element $x \in E$ has exactly one element of best approximation in $G$.
Obviously, in definition 3.1 the condition $x \in E$ can be replaced by: $x \in E \setminus G$.
The term "semi-Čebyšev set" has been proposed in the monograph [82]. The term "Čebyšev set" has been used previously by several authors, e.g. by N. Efimov and S. B. Stečkin [28]. Some authors use for such sets the term Haar set.
Let us consider now the problem of characterization of semi-Čebyšev and Čebyšev (linear) subspaces $G$ of a normed linear space $E$. From §1, theorem 1.5, it follows
Theorem 3.1. A linear subspace $G$ of a normed linear space $E$ is a semi-Čebyšev subspace if and only if there do not exist $f \in E^*$, $x \in E$ and $g_0 \in G \setminus \{0\}$ such that
Although this characterization of semi-Čebyšev subspaces is not intrinsic (i.e., it involves also elements of $E \setminus G$), it is convenient for applications, since one can deduce from it intrinsic characterizations of semi-Čebyšev subspaces in the usual concrete normed linear spaces, as we shall see below.
Let us mention some characterizations of Čebyšev subspaces related to §2, proposition 2.1, which can be found in [20], [77] and [38] respectively:
Proposition 3.1. For a closed linear subspace $G$ of a normed linear space $E$ the following statements are equivalent:
1°. $G$ is a Čebyšev subspace.
2°. We have
(3.5) $E = G \oplus \pi_G^{-1}(0)$,
where $\pi_G^{-1}(0)$ is the set defined by §2, formula (2.6) and where $\oplus$ means that the sum decomposition of each element $x \in E$ is unique.
3°. $G$ is proximinal and
(3.6) $(\pi_G^{-1}(0) - \pi_G^{-1}(0)) \cap G = \{0\}$.
4°. $G$ is proximinal and the restriction of the canonical mapping $\omega_G : E \to E/G$ to the set $\pi_G^{-1}(0)$ is one-to-one.
Indeed, the equivalence 1° $\Leftrightarrow$ 2° is essentially proved by the same arguments as those used to prove §2, proposition 2.1. Now, if 3° does not hold, then either $G$ is non-proximinal, contradicting (3.5) (by §2, proposition 2.1) or (3.6) does not hold, say $y_1 - y_2 \in G \setminus \{0\}$ with $y_1, y_2 \in \pi_G^{-1}(0)$, and then $y_1 = 0 + y_1 = (y_1 - y_2) + y_2$, contradicting (3.5); thus, 2° $\Rightarrow$ 3°. Furthermore, if $y_1, y_2 \in \pi_G^{-1}(0)$, $y_1 \neq y_2$, $\omega_G(y_1) = \omega_G(y_2)$, then $y_1 - y_2 \in (\pi_G^{-1}(0) - \pi_G^{-1}(0)) \cap G \setminus \{0\}$, contradicting (3.6); thus, 3° $\Rightarrow$ 4°. Finally, if $x \in E$, $g_1, g_2 \in \mathcal{P}_G(x)$, $g_1 \neq g_2$, then $x - g_1, x - g_2 \in \pi_G^{-1}(0)$, $x - g_1 \neq x - g_2$ and $\omega_G(x - g_1) = \omega_G(x - g_2)$ (because $g_2 - g_1 \in G$), contradicting 4°; thus 4° $\Rightarrow$ 1°, which completes the proof.
Naturally, one obtains corresponding characterizations of semi-Čebyšev subspaces by omitting in 3° and 4° the condition of proximinality of $G$ and by requiring instead of 2° that each element $x \in E$ have at most one sum decomposition.
In connection with 4° above let us observe that for any closed linear subspace $G \subset E$ we have, by definition,
$\|\omega_G(x)\| = \rho(x, G)$ $(x \in E)$
and hence, by §2, formula (2.5), $\pi_G^{-1}(0)$ is that subset of $E$ on which the restriction of $\omega_G$ is norm-preserving.
We recall that a linear subspace $G$ of a normed linear space $E$ is said to have property (U) if every functional $\varphi \in G^*$ has a unique extension with the same norm to the whole space $E$. From proposition 3.1 we obtain the following relations of duality between Čebyšev subspaces and subspaces with property (U):
Corollary 3.1.
Let $E$ be a normed linear space.
a) A $\sigma(E^*, E)$-closed linear subspace $\Gamma$ of $E^*$ is a Čebyšev subspace if and only if $\Gamma_\perp = \{x \in E \mid f(x) = 0 \ (f \in \Gamma)\} \subset E$ has property (U).
b) If $G$ is a proximinal linear subspace of $E$ and $G^\perp = \{f \in E^* \mid f(x) = 0 \ (x \in G)\} \subset E^*$ has property (U), then $G$ is a Čebyšev subspace of $E$.
Indeed, a) follows from (3.7) (for $G$, $E$ replaced by $\Gamma$, $E^*$ respectively) and the observation that $\omega_\Gamma : E^* \to E^*/\Gamma$ carries each $f \in E^*$ into the set of all extensions of $f|_{\Gamma_\perp}$ to the whole space $E$ (we have $\Gamma = (\Gamma_\perp)^\perp$, since $\Gamma$ is $\sigma(E^*, E)$-closed). Furthermore, if $G^\perp \subset E^*$ has property (U), then by part a) $G^{\perp\perp}$ is a Čebyšev subspace in $E^{**}$, whence, since $G \subset G^{\perp\perp}$, b) follows. However, one can show that $E = c_0$ has finite-dimensional Čebyšev subspaces and that for no such subspace $G$ has $G^\perp$ the property (U) (see [82], p. 108), and thus the converse of b) is not true.
Let us return now to applications of theorem 3.1. In $E = C_R(Q)$ from theorem 3.1 one obtains the following theorem of [78] which has been the first intrinsic characterization of semi-Čebyšev subspaces $G$ of an arbitrary (finite or infinite) dimension of $E = C_R(Q)$ (see also [82], p. 117):
Theorem 3.2. A linear subspace $G$ of $E = C_R(Q)$ is a semi-Čebyšev subspace if and only if there do not exist a Radon measure $\mu$ on $Q$, two disjoint closed sets $Y^+, Y^- \subset Q$ and an element $g_0 \in G \setminus \{0\}$ such that
(3.10) $\mu \geq 0$ on $Y^+$, $\mu \leq 0$ on $Y^-$ and $Y^+ \cup Y^- \supset S(\mu)$,
where $S(\mu)$ denotes the carrier of the measure $\mu$.
Recently E. W. Cheney and D. E. Wulbert [20] have proved that a linear subspace $G$ of $E = C_R(Q)$ is a semi-Čebyšev subspace if and only if $0$ is the only element of $G$ which vanishes on an "$M_G$-set", i.e., on a subset of $Q$ of a prescribed form associated with some $x \in \pi_G^{-1}(0)$. Naturally, one can also deduce this easily from theorem 3.1 or directly from theorem 3.2.
Some other characterizations of semi-Čebyšev and Čebyšev subspaces of arbitrary dimension, in general normed linear spaces and in some other concrete normed linear spaces, are given in [82], Ch. I, §3 and in [20].
Let us consider now the particular case when $\dim G = n < \infty$. By §2, corollary 2.1, every such $G$ is proximinal. Naturally, the preceding results on uniqueness of elements of best approximation are also valid in this particular case, but exploiting the assumption of finite dimensionality, we can obtain additional information. Thus, from theorem 3.1 one obtains (see [82], pp. 210-211):
Theorem 3.3. An n-dimensional linear subspace $G$ of a normed linear space $E$ is a Čebyšev subspace if and only if there do not exist $h$ extremal points $f_1, \ldots, f_h$ of $S_{E^*}$, where $1 \leq h \leq n$ if the scalars are real and $1 \leq h \leq 2n - 1$ if the scalars are complex, $h$ numbers $\lambda_1, \ldots, \lambda_h > 0$ and $x \in E$, $g_0 \in G \setminus \{0\}$, such that we have
(3.12) $\sum_{j=1}^{h} \lambda_j f_j(x_k) = 0$ $(k = 1, \ldots, n)$
The main difficulty in the proof of this result consists in establishing the bounds $h \leq n$ and $h \leq 2n - 1$ respectively (note that in §1, theorem 1.6 we had only the bounds $h \leq n + 1$ and $h \leq 2n + 1$ respectively).
In particular, for $E = C(Q)$ ($Q$ compact), we have the following classical theorem, given for real scalars by A. Haar [36] and for complex scalars by A. N. Kolmogorov [49]:
Theorem 3.4. An n-dimensional linear subspace $G = [x_1, \ldots, x_n]$ of $E = C(Q)$ ($Q$ compact) is a Čebyšev subspace if and only if $x_1, \ldots, x_n$ form a "Čebyšev system" (i.e., every linear combination of $x_1, \ldots, x_n$ which is not identically zero has at most $n - 1$ zeros on $Q$).
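As a small numerical illustration of the Čebyšev system condition in theorem 3.4, the following sketch uses the standard equivalent form of that condition (an assumption here, since it is not stated in the text above): $x_1, \ldots, x_n$ form a Čebyšev system on $Q$ exactly when $\det[x_i(q_j)] \neq 0$ for every choice of $n$ distinct points $q_1, \ldots, q_n \in Q$. The finite grid standing in for $Q$, the function name `is_cebysev_system` and the tolerance are illustrative choices, not part of the lectures.

```python
import itertools
import numpy as np

def is_cebysev_system(funcs, points, tol=1e-12):
    """Check det[x_i(q_j)] != 0 for every choice of n distinct points.

    `points` is a finite set standing in for the compact space Q, so the
    answer refers to C(Q) for that finite Q only.
    """
    n = len(funcs)
    for subset in itertools.combinations(points, n):
        matrix = np.array([[f(q) for q in subset] for f in funcs], dtype=float)
        if abs(np.linalg.det(matrix)) <= tol:
            return False  # some nontrivial combination vanishes at n points
    return True

grid = np.linspace(-1.0, 1.0, 9)

# 1, t, t^2: a nonzero polynomial of degree <= 2 has at most 2 zeros.
monomials = [lambda t: 1.0, lambda t: t, lambda t: t * t]

# 1, t^2: the combination t^2 - 1/4 has the two zeros t = -1/2 and t = 1/2.
even_pair = [lambda t: 1.0, lambda t: t * t]

monomials_ok = is_cebysev_system(monomials, grid)
even_pair_ok = is_cebysev_system(even_pair, grid)
```

On this symmetric grid the first system passes the determinant test and the second one fails it, in agreement with the zero counts noted in the comments.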
This theorem follows as a simple particular case both from theorem 3.2 for $\dim G = n < \infty$ (respectively, a complex version of it) and from theorem 3.3 for $E = C(Q)$. These two proofs of theorem 3.4, compared with the other function-theoretic proofs of theorem 3.4 given previously by several authors, show clearly the advantages of applying the methods of functional analysis in the theory of best approximation.
We mention also the following characterization of finite-dimensional Čebyšev subspaces of $C_R(Q)$, due (for $Q = [a, b]$) to Y. Ikebe [40]:
Proposition 3.2. A finite-dimensional subspace $G \subset E = C_R(Q)$ ($Q$ compact) is a Čebyšev subspace if and only if
Let us consider now closed linear subspaces $G$ of finite codimension. From theorem 3.1 one obtains the following result, due to A. L. Garkavi (see [82], p. 296):
Theorem 3.5. A closed linear subspace $G$ of codimension $n$ of a normed linear space $E$,
$G = \{x \in E \mid f_1(x) = \ldots = f_n(x) = 0\}$
(with $f_1, \ldots, f_n \in E^*$ linearly independent), is a semi-Čebyšev subspace if and only if for every $f_0 \in G^\perp \setminus \{0\}$ the set $\mathcal{M}_{f_0}$ of all maximal elements of $f_0$ is of dimension $r = r(f_0) \leq n - 1$ and contains $r + 1$ elements $x_0, x_1, \ldots, x_r$ such that
moreover, in this case for any $r + 1$ linearly independent elements $x_0, x_1, \ldots, x_r \in \mathcal{M}_{f_0}$ we have
(3.15)
Applying this result in the space $E = C(Q)$ ($Q$ compact) and combining it with §2, theorem 2.2, one obtains the following result, due to A. L. Garkavi (see [82], pp. 315-320):
Theorem 3.6. A closed linear subspace $G$ of codimension $n$ of $E = C_R(Q)$ ($Q$ compact),
$G = \{x \in C(Q) \mid \mu_1(x) = \ldots = \mu_n(x) = 0\}$
(with $\mu_1, \ldots, \mu_n \in E^*$ linearly independent), is a Čebyšev subspace if and only if the following four conditions are satisfied:
a) For every $\mu \in G^\perp \setminus \{0\}$ the space $Q$ admits a Hahn decomposition with respect to $\mu$ into two closed sets $Q^+$ and $Q^-$.
b) Any two measures $\mu_1, \mu_2 \in G^\perp \setminus \{0\}$ are equivalent (i.e., each is absolutely continuous with respect to the other) on the set $Q'$ of all limit points of $Q$.
c) For every $\mu \in G^\perp \setminus \{0\}$, the set $Q \setminus S(\mu)$ consists of at most $n - 1$ points.
d) For any $r$ isolated points $q_1, \ldots, q_r$ of $Q$, where $1 \leq r \leq n - 1$, we have
If $Q$ contains at least $n$ isolated points and $G$ is proximinal, then condition c) is necessary and sufficient in order that $G$ be a Čebyšev subspace. If $Q$ has no isolated point and $G$ is proximinal, then $G$ is a Čebyšev subspace if and only if $S(\mu) = Q$ for all $\mu \in G^\perp \setminus \{0\}$.
Some other characterizations of semi-Čebyšev and Čebyšev subspaces of finite codimension in general normed linear spaces and in various concrete spaces are given in [82], Ch. III and in [20], [66], [32]-[34]. In particular, characterizations of semi-Čebyšev subspaces of finite codimension in $L^1_R(T, \nu)$ are given in [82], p. 326 and in [66], theorem 5. At the time when [82] appeared, no characterization of Čebyšev subspaces of finite codimension in $L^1_R(T, \nu)$ was known (only some necessary and some sufficient conditions were known, see [82], p. 328-330); this problem has been solved recently by A. L. Garkavi [34].
3.2. Existence of semi-Čebyšev and Čebyšev subspaces.
It is obvious that every normed linear space $E$ contains semi-Čebyšev linear subspaces; for example, every linear subspace $G$ dense in $E$, with $G \neq E$, satisfies $\mathcal{P}_G(x) = \emptyset$ $(x \in E \setminus G)$ and hence is a semi-Čebyšev subspace. The problem of existence of semi-Čebyšev closed linear subspaces of Banach spaces is less trivial, but still has an affirmative answer, namely, every Banach space has at least one semi-Čebyšev closed hyperplane. Indeed, by §2, every non-reflexive Banach space $E$ contains very non-proximinal closed hyperplanes and any such closed hyperplane is clearly semi-Čebyšev. On the other hand, by a result of J. Lindenstrauss [59], in every reflexive Banach space $E$ the unit cell $S_E$ has an exposed point, i.e., a point $x \in \mathrm{Fr}\, S_E$ with the property that there exists a support hyperplane $H$ of $S_E$ such that $H \cap S_E = \{x\}$, and hence $E$ obviously has a Čebyšev hyperplane (namely, $H - x$).
For Čebyšev subspaces of Banach spaces the situation is different, as shown by the following example of A. L. Garkavi (see [82], p. 114): If $I$ is a set of cardinality $> c$, then the (non-separable) Banach space $E = E(I)$ of all bounded families $x = \{\xi_i\}_{i \in I}$ of scalars which have at most a countable number of non-zero coordinates $\xi_i$, endowed with the usual vector operations and with the norm $\|x\| = \sup_{i \in I} |\xi_i|$, has no Čebyšev subspace. However, the following problem, raised essentially by A. L. Garkavi in 1964 (see [82], p. 116 and [83]) is open:
Problem 3.1. Does there exist a separable Banach space $E$ which has no Čebyšev subspace?
We have seen above that the assumption of separability of $E$ in problem 3.1 is essential. The assumption of completeness of $E$ is also essential, since V. Klee and myself have given (see [83]) the following example of a separable non-complete normed linear space $E$ which has no Čebyšev subspace: the dense linear subspace $E$ of $c_0$ consisting of all almost-zero sequences (i.e. sequences with all coordinates $= 0$, except a finite number of them).
It is well known that the space $E = c_0$ has no Čebyšev subspace of infinite dimension, but it has Čebyšev subspaces of any finite dimension. It is also known that $E = L^1_R([0, 1])$ has no Čebyšev subspace of finite dimension or of finite codimension, but still it has Čebyšev subspaces. Therefore one might perhaps find a solution to problem 3.1 by combining in some way the spaces $c_0$ and $L^1_R([0, 1])$. Furthermore, since in every separable conjugate Banach space $E = B^*$ the unit cell has at least one exposed point and hence a Čebyšev subspace, one may expect a positive solution of problem 3.1 only among those separable Banach spaces which are not isometric to any conjugate Banach space (the spaces $c_0$ and $L^1_R([0, 1])$ do have this property; moreover, they are even not isomorphic to any conjugate Banach space).
Now we shall consider the problem of existence of Čebyšev subspaces $G$ with various restrictions on $\dim G$ or on $\mathrm{codim}\, G$.
For some time the following problem, raised by V. Klee [46] in 1957, has been open: Does every m-dimensional Banach space $E$, where $3 \leq m < \infty$, possess a one-dimensional Čebyšev subspace $G$? Or, geometrically: does every $E$ with $3 \leq \dim E = m < \infty$ possess a line $G$ through the origin such that there exists no segment on $\mathrm{Fr}\, S_E = \{x \in E \mid \|x\| = 1\}$, parallel to $G$? Affirmative answers have been given by various authors for $m = 3, 4$ and for arbitrary $m$ when $S_E$ is sufficiently smooth and the problem has continued to preoccupy many mathematicians. Recently an affirmative solution for arbitrary $m$ has been given by G. Ewald, D. G. Larman and C. A. Rogers [29].
Let us consider now the following problem: What are (i.e., characterize topologically) the compact spaces $Q$ for which $E = C(Q)$ has n-dimensional Čebyšev subspaces? This problem presents interest also from the following general point of view: The classical theorem of Banach-Stone (according to which two compact spaces $Q_1$ and $Q_2$ are homeomorphic if and only if $C(Q_1)$ and $C(Q_2)$ are linearly isometric) shows that, theoretically, the metric-linear properties of the spaces $C(Q)$ are completely determined by the topological properties of the compact spaces $Q$ and conversely; but, the effective, explicit study of this interdependence still presents many open problems and any non-trivial answer to the above question may be regarded also as a contribution to this study.
Let us first observe that by the Haar-Kolmogorov theorem (theorem 3.4 above) the problem is equivalent to the following: What are the compact spaces $Q$ which admit a real (or complex) Čebyšev system $x_1, \ldots, x_n$? (naturally, we assume that $Q$ consists of at least $n + 1$ points). For $n = 1$ the answer is obvious, since every compact space admits a Čebyšev system consisting of one function $x_1$, for example the function $x_1 \equiv 1$. For $n \geq 2$ the answer is given by
Theorem 3.7. A compact space $Q$ admits a real Čebyšev system $x_1, \ldots, x_n$ (or, what is equivalent: $C_R(Q)$ has an n-dimensional Čebyšev subspace) with $n \geq 2$ if and only if $Q$ is homeomorphic to a subset of the unit circumference. Moreover, if $Q$ is homeomorphic to the whole unit circumference, then every real Čebyšev system on $Q$ consists of an odd number of elements.
This result has been conjectured by S. Mazur and proved by J. C. Mairhuber under the assumption that $Q$ is a subset of a finite-dimensional euclidean space; for general compact spaces it has been proved by K. Sieklucki and P. C. Curtis Jr. (see [82], pp. 218-222, where a proof due to I. J. Schoenberg and C. T. Yang is presented; for the last statement of theorem 3.7 see e.g. [61], p. 26). For the case of complex scalars the problem is still open (only partial results are known, see [82], p. 222).
From theorems 3.7 and 3.4 one can deduce the following result, due to R. R. Phelps (see [82], p. 222):
Corollary 3.2. The space $E = L^\infty_R(T, \nu)$, where $(T, \nu)$ is a $\sigma$-finite positive measure space such that $\dim E = \infty$, has no Čebyšev subspace of finite dimension $\geq 2$ (however, it does have Čebyšev subspaces of dimension 1, even in the case of complex scalars).
It is natural to consider the similar problem for subspaces of finite codimension, i.e. the problem of characterizing those compact spaces $Q$ for which $C(Q)$ contains a semi-Čebyšev or a Čebyšev subspace of finite codimension $n \geq 1$. The problem is solved for metric compact spaces $Q$ and real scalars, by the following results, due to A. L. Garkavi (see [82], pp. 314 and 325, footnote, and [33]):
Theorem 3.8. a) For every infinite compact metric space $Q$ and every integer $n$ with $1 \leq n < \infty$, the space $C_R(Q)$ contains semi-Čebyšev subspaces $G$ of codimension $n$ and Čebyšev hyperplanes.
b) For a compact metric space $Q$ and any integer $n$ with $2 \leq n < \infty$ the space $C_R(Q)$ has a Čebyšev subspace $G$ of codimension $n$ if and only if the set of isolated points of $Q$ is dense in $Q$ (i.e., $Q$ coincides with the closure of the set of its isolated points).
For non-metrizable compact spaces $Q$ only partial results are known, namely, conditions which are necessary (e.g. that $Q$ have at most a countable number of disjoint open subsets and that $Q$ contain no open connected infinite subset) or which are sufficient in order that $C_R(Q)$ have Čebyšev subspaces of finite codimension $n \geq 1$ or $n \geq 2$ (see [82], Ch. III).
One can also consider the analogous problem for the spaces $L^1(T, \nu)$ on positive measure spaces $(T, \nu)$. For real scalars we have the following results corresponding to theorems 3.4 and 3.8 b), which are due, in the case when $(T, \nu)$ is $\sigma$-finite, to A. L. Garkavi (see [82], p. 233 and p. 331):
Theorem 3.9. Let $(T, \nu)$ be a positive measure space such that $L^1_R(T, \nu)^* \equiv L^\infty_R(T, \nu)$ and let $n \geq 1$. The following statements are equivalent:
1°. $L^1_R(T, \nu)$ has an n-dimensional Čebyšev subspace.
2°. $L^1_R(T, \nu)$ has a Čebyšev subspace of codimension $n$.
3°. $(T, \nu)$ has at least $n$ atoms.
We recall that an atom of $(T, \nu)$ is a measurable set $A \subset T$ with $\nu(A) > 0$, such that if $B$ is any measurable subset of $A$, then either $\nu(B) = 0$ or $\nu(A \setminus B) = 0$. From theorem 3.9 it follows, in particular, that if $(T, \nu)$ has no atoms (e.g. if $T = [0, 1]$ and $\nu$ is the Lebesgue measure) then $L^1_R(T, \nu)$ has no Čebyšev subspaces of finite dimension or codimension. For Čebyšev subspaces of finite dimension this latter result is known to hold also for complex scalars (see [82], p. 230-232), but no extension of theorem 3.9 to complex $L^1(T, \nu)$ spaces is known.
E. W. Cheney and D. E. Wulbert have proved ([20], theorem 34) that $E = C^1_R(Q, \nu)$ ($Q$ compact, $S(\nu) = Q$) contains a Čebyšev subspace of codimension $n$ if and only if $Q$ has at least $n$ isolated points.
We conclude with the following problem of A. L. Garkavi (see [83], p. 2 and [31], p. 96):
Problem 3.2. Does the space $C_R([0, 1])$ possess a Čebyšev subspace $G$ of infinite dimension and infinite codimension?
Recently D. E. Wulbert [95] has shown that there exist compact spaces $Q$ such that the analogue of problem 3.2 for $C_R(Q)$ has an affirmative answer. It is also known (see [82], p. 332) that $L^1_R([0, 1])$ has Čebyšev subspaces of infinite dimension and infinite codimension.
3.3. Normed linear spaces in which all linear (respectively, all closed linear) subspaces are semi-Čebyšev (respectively, Čebyšev) subspaces.
We recall that a normed linear space $E$ is said to be strictly convex (or rotund) if the relations
$x, y \in E \setminus \{0\}$, $\|x + y\| = \|x\| + \|y\|$
imply the existence of a $c > 0$ such that
$y = cx$.
It is well known and easy to show that this property is equivalent to each of the following properties: a) $\mathrm{Fr}\, S_E = \mathcal{E}(S_E)$; b) $\mathrm{Fr}\, S_E$ contains no segment; c) each $f \in E^*$ has at most one maximal element.
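Property b) can be illustrated by a two-dimensional sketch (an illustration under stated assumptions, not part of the lectures). In the plane with the max norm the unit sphere contains segments, and for the line $G$ spanned by $(1, 0)$ the element $x = (0, 1)$ has a whole segment of best approximations; with the euclidean norm, which is strictly convex, the same $x$ has exactly one. The function name `best_coefficients`, the grid of coefficients and the tolerance are hypothetical choices.

```python
import numpy as np

def best_coefficients(x, direction, order, ts, tol=1e-9):
    """Grid values t for which ||x - t*direction|| is within tol of the minimum."""
    dists = np.array([np.linalg.norm(x - t * direction, ord=order) for t in ts])
    return ts[dists <= dists.min() + tol]

x = np.array([0.0, 1.0])            # element to be approximated
g = np.array([1.0, 0.0])            # G = span{g}
ts = np.linspace(-2.0, 2.0, 401)    # candidate coefficients, step 0.01

# Max norm: ||x - t g|| = max(|t|, 1), minimal for every |t| <= 1.
max_norm_set = best_coefficients(x, g, np.inf, ts)

# Euclidean norm: ||x - t g|| = sqrt(t^2 + 1), minimal only at t = 0.
euclid_set = best_coefficients(x, g, 2, ts)
```

Here `max_norm_set` covers the whole interval of coefficients between -1 and 1, while `euclid_set` reduces to the single grid point nearest 0, so in this example $G$ is a Čebyšev subspace for the euclidean norm only.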
Using theorem 3.1, one obtains (see [82], p. 110):
Theorem 3.10. For a normed linear space $E$ the following statements are equivalent:
1°. All linear subspaces of $E$ are semi-Čebyšev subspaces.
2°. All linear subspaces of $E$ of a certain fixed finite dimension $n$, where $1 \leq n \leq \dim E - 1$, are semi-Čebyšev (or, what is equivalent, Čebyšev) subspaces.
3°. All closed linear subspaces of $E$ of a certain fixed finite codimension $m$, where $1 \leq m \leq \dim E - 1$, are semi-Čebyšev subspaces.
4°. $E$ is strictly convex.
Combining this with §2, theorem 2.5, we obtain (see [82], p. 111):
Theorem 3.11. For a Banach space $E$ the following statements are equivalent:
1°. All closed linear subspaces of $E$ are Čebyšev subspaces.
2°. All closed linear subspaces of $E$ of a certain fixed finite codimension $m$, where $1 \leq m \leq \dim E - 1$, are Čebyšev subspaces.
3°. $E$ is reflexive and strictly convex.
In particular, since the spaces $E = L^p(T, \nu)$ $(1 < p < \infty)$, where $(T, \nu)$ is a positive measure space, and the Hilbert (= complete inner product) spaces $E = \mathcal{H}$ satisfy 3°, it follows that all closed linear subspaces of these spaces are Čebyšev subspaces.
3.4. Semi-Čebyšev and Čebyšev subspaces and quotient spaces.
The following results, corresponding to §2, theorem 2.8, have been proved by E. W. Cheney and D. E. Wulbert [20]:
Theorem 3.12. a) If $G_1$ is a semi-Čebyšev linear subspace of a normed linear space $E$ and $G_2$ a closed linear subspace of $G_1$ which is proximinal in $E$, then $G_1/G_2$ is a semi-Čebyšev subspace of $E/G_2$.
b) If $G_1$ is a linear subspace of $E$ and $G_2$ a closed linear subspace of $G_1$ such that $G_1/G_2$ is semi-Čebyšev in $E/G_2$ and that $G_2$ is semi-Čebyšev in $E$, then $G_1$ is a semi-Čebyšev subspace of $E$.
c) If $G_1$ is a closed linear subspace of $E$, $G_2$ a closed linear subspace of $G_1$ such that $G_1/G_2$ is Čebyšev in $E/G_2$ and that $G_2$ is Čebyšev in $E$, then $G_1$ is a Čebyšev subspace of $E$.
3.5. Strongly unique elements of best approximation. Strongly Čebyšev subspaces. Interpolating subspaces.
Definition 3.2. Let $G$ be a set in a metric space $E$. An element $g_0 \in G$ is said to be a strongly unique element of best approximation of an element $x \in E$ if there exists a constant $r = r(x, G)$ with $0 < r \leq 1$, such that
$\rho(x, g) \geq \rho(x, g_0) + r \rho(g, g_0)$ $(g \in G)$.
In this case we have $\mathcal{P}_G(x) = \{g_0\}$, i.e. $g_0$ is the unique element of best approximation of $x$, since by $r > 0$ we have $\rho(x, g) > \rho(x, g_0)$ for every $g \in G \setminus \{g_0\}$. The following characterization of such elements is due, essentially (namely, for $g_0 = 0$ and $\|x\| = 1$), to D. E. Wulbert [97]:
Proposition 3.3. Let $G$ be a linear subspace of a real normed linear space $E$. An element $g_0 \in G$ is a strongly unique element of best approximation of an element $x \in E \setminus G$ if and only if there exists a constant $r = r(x, G)$ with $0 < r \leq 1$ such that
$f(g) \geq r \|g\|$ $(g \in G)$
where
Definition 3.3. A Čebyšev set $G$ in a metric space $E$ is said to be a strongly Čebyšev set if every $x \in E$ has a strongly unique element of best approximation in $G$.
D. J. Newman and H. S. Shapiro (see [19], p. 80) and, respectively, D. E. Wulbert [97] have proved
Theorem 3.13. In the spaces $C_R(Q)$ ($Q$ compact) and $L^1_R(T, \nu)$ ($(T, \nu)$ a positive measure space) every finite-dimensional Čebyšev subspace is a strongly Čebyšev subspace.
On the other hand, D. E. Wulbert [97] has observed that in a smooth normed linear space $E$ no Čebyšev subspace is strongly Čebyšev.
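The contrast between theorem 3.13 and Wulbert's observation can be checked numerically in a three-dimensional sketch. It assumes the usual form of the strong uniqueness inequality in a normed space, $\|x - g\| \geq \|x - g_0\| + r \|g - g_0\|$, and takes $E = R^3$ with the max norm (that is, $C_R(Q)$ for a three-point $Q$) and with the euclidean norm (a smooth space), $G$ being the constants. The function name `strong_uniqueness_ratio` and the sample offsets are illustrative assumptions.

```python
import numpy as np

x = np.array([0.0, 1.0, 0.0])   # element to be approximated
ones = np.ones(3)               # G = span{(1, 1, 1)}, the constants

def strong_uniqueness_ratio(order, c_best, offsets):
    """Smallest sampled value of (||x - g|| - ||x - g0||) / ||g - g0||.

    g0 = c_best * ones is the best approximation and g = (c_best + d) * ones.
    """
    dist = np.linalg.norm(x - c_best * ones, ord=order)
    ratios = []
    for d in offsets:
        excess = np.linalg.norm(x - (c_best + d) * ones, ord=order) - dist
        ratios.append(excess / np.linalg.norm(d * ones, ord=order))
    return min(ratios)

offsets = [s * 10.0 ** (-k) for k in range(1, 7) for s in (1.0, -1.0)]

# Max norm: the best constant is 1/2 and ||x - c*1|| = 1/2 + |c - 1/2|.
r_max_norm = strong_uniqueness_ratio(np.inf, 0.5, offsets)

# Euclidean norm: the best constant is 1/3 and the excess is quadratic in d.
r_euclidean = strong_uniqueness_ratio(2, 1.0 / 3.0, offsets)
```

For the max norm the sampled ratio stays at 1, so the constant $r = 1$ works in this example, while for the euclidean norm the ratio shrinks with the offset, so no positive $r$ can work.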
We recall that a normed linear space $E$ is said to be smooth if for every $x \in E \setminus \{0\}$ there exists only one $f = f_x \in E^*$ such that $\|f\| = 1$, $f(x) = \|x\|$.
Definition 3.4. An n-dimensional linear subspace $G$ of a normed linear space $E$ is called an interpolating subspace if for any $n$ linearly independent extremal points $f_1, \ldots, f_n$ of $S_{E^*}$ and any $n$ numbers $c_1, \ldots, c_n$ there exists exactly one $g \in G$ such that
(3.23) $f_j(g) = c_j$ $(j = 1, \ldots, n)$.
In arbitrary normed linear spaces such subspaces have been first considered in [76], where it was proved that they are Čebyšev subspaces (indeed, this is a consequence of theorem 3.3; see [82], pp. 213-214) and that the converse need not hold even if $\dim E < \infty$. Recently D. A. Ault, F. R. Deutsch, P. D. Morris and J. E. Olson [3] have studied best approximation by elements of interpolating subspaces, proving, among other results, the following:
Theorem 3.14. Every interpolating subspace $G$ of a normed linear space $E$ is a strongly Čebyšev subspace.
From the Haar-Kolmogorov theorem (theorem 3.4 above) it follows that a finite-dimensional subspace $G \subset E = C(Q)$ is a Čebyšev subspace if and only if it is an interpolating subspace; this, together with theorem 3.14, implies again the first part of theorem 3.13. On the other hand, from theorem 3.14 and the observation made after theorem 3.13 it follows that a smooth normed linear space $E$ (in particular, the $L^p(T, \nu)$ spaces, for $1 < p < \infty$) contains no interpolating subspace. For the $L^1_R$-spaces, D. A. Ault, F. R. Deutsch, P. D. Morris and J. E. Olson [3] have proved the following result, which should be compared with theorem 3.9:
Theorem 3.15. For a $\sigma$-finite positive measure space $(T, \nu)$ the space $E = L^1_R(T, \nu)$ contains an interpolating subspace of dimension $n > 1$ if and only if $T$ is the union of at least $n$ atoms (or, equivalently, $L^1_R(T, \nu)$ is linearly isometric either to $l^1$ or to some $l^1_m$). Also, $L^1_R(T, \nu)$ contains a one-dimensional interpolating subspace if and only if $T$ contains an atom.
For complex scalars and $n > 1$ the situation is quite opposite, namely, the complex spaces $l^1$ and $l^1_m$ contain no proper interpolating subspace of any finite dimension $n > 1$ (J. H. Biggs, F. R. Deutsch, R. E. Huff, P. D. Morris and J. E. Olson [4]); on the other hand, it is clear that the unit vector $\{1, 0, 0, \ldots\}$ in the complex spaces $l^1$ or $l^1_m$ spans a one-dimensional interpolating subspace.
3.6. Almost Čebyšev subspaces. k-semi-Čebyšev and k-Čebyšev subspaces. Pseudo-Čebyšev subspaces.
We shall consider now some generalizations of semi-Čebyšev and Čebyšev subspaces.
Definition 3.5. A set $G$ in a metric space $E$ is called an almost Čebyšev set if the set of all $x \in E$ for which $\mathcal{P}_G(x)$ does not consist of a single element forms a set at most of the first category in $E$.
Almost Čebyšev linear subspaces of normed linear spaces have been introduced by A. L. Garkavi (see [82], p. 116), since they have the advantage that in every separable Banach space $E$ there exist almost Čebyšev subspaces of any finite dimension. However, the Banach space $E(I)$ of section 3.2 has no almost Čebyšev subspace of infinite dimension. For results on finite-dimensional almost Čebyšev subspaces of $C_R(Q)$ ($Q$ compact) see [82], p. 224-225.
Definition 3.6. A linear subspace $G$ of a normed linear space $E$ is called a k-semi-Čebyšev subspace, respectively a k-Čebyšev subspace (where $k$ is an integer with $0 \leq k < \infty$), if
(3.24) $-1 \leq \dim \mathcal{P}_G(x) \leq k$ $(x \in E)$,
respectively if
(3.25) $0 \leq \dim \mathcal{P}_G(x) \leq k$ $(x \in E)$.
We recall that $\mathcal{P}_G(x)$ is a convex set, since it is the intersection of the two convex sets $G$ and $S(x, \rho(x, G))$ and that for a non-void convex set $A$ in a linear space $E$ the dimension $\dim A$ is defined as the dimension of the linear subspace of $E$ spanned by $A - y$, where $y$ is an arbitrary element of $A$; if $A = \emptyset$, then, by definition, $\dim A = -1$. Thus, the 0-semi-Čebyšev and 0-Čebyšev subspaces are nothing else than the usual semi-Čebyšev and, respectively, Čebyšev subspaces.
Most of the preceding results (e.g. theorems 3.1, 3.2, 3.3, 3.4 and 3.10) admit extensions to k-semi-Čebyšev and k-Čebyšev subspaces (see [82]). However, only partial extensions of theorem 3.7 are known, i.e., it is not known which are the compact spaces $Q$ such that $C_R(Q)$ has k-Čebyšev subspaces of finite dimension $n$, where $0 \leq k \leq n - 1$ and $n \geq 2$ (or, equivalently, such that $Q$ admits systems $x_1, \ldots, x_n \in C_R(Q)$ of "Čebyšev rank $\leq k$", i.e. with the property that there do not exist $n - k$ distinct points $q_1, \ldots, q_{n-k} \in Q$ and $k + 1$ linearly independent elements $g_0, g_1, \ldots, g_k \in G = [x_1, \ldots, x_n]$ such that $g_i(q_j) = 0$ for $j = 1, \ldots, n - k$ and $i = 0, 1, \ldots, k$).
Definition 3.7. A linear subspace $G$ of a normed linear space $E$ is called a pseudo-Čebyšev subspace if
(3.26) $0 \leq \dim \mathcal{P}_G(x) < \infty$ $(x \in E)$.
In particular, every finite-dimensional linear subspace and every k-Čebyšev subspace $(0 \leq k < \infty)$ is pseudo-Čebyšev. P. D. Morris [66] has constructed examples of pseudo-Čebyšev subspaces of finite codimension of $E = l^\infty_R$ which are not Čebyšev subspaces.
The following characterization of pseudo-Čebyšev subspaces of $E = C_R(Q)$, due to P. D. Morris [66], should be compared with theorems 3.2 and 3.6 (although the latter is only for $\mathrm{codim}\, G < \infty$):
Theorem 3.16. A proximinal linear subspace $G$ of $E = C_R(Q)$ is a pseudo-Čebyšev subspace if and only if for every $\mu \in G^\perp \setminus \{0\}$ the set $Q \setminus S(\mu)$ is finite.
3.7. Very non-Čebyšev subspaces.
One can introduce the following notion, corresponding to §2, definition 2.2:
Definition 3.8. A set $G$ in a metric space $E$ is said to be a very non-Čebyšev set if $G$ is closed and if for no element $x \in E \setminus G$ does the set $\mathcal{P}_G(x)$ consist of a single element.
One can show (see e.g. [82], pp. 114-116) that all closed linear subspaces $G$ of the space $E(I)$ ($I$ of cardinality $> c$) of section 3.2 and all infinite-dimensional closed linear subspaces of $E = c_0$ are very non-Čebyšev subspaces.
4. Properties of metric projections.
4.1. Definition and some properties of metric projections.
Definition 4.1. a) If $G$ is a set in a metric space $E$, we shall denote by $\pi_G$ the multi-valued mapping $D(\pi_G) \to G$ defined by
$$D(\pi_G) = \{x \in E \mid \mathcal{P}_G(x) \neq \emptyset\}, \qquad \pi_G(x) = \mathcal{P}_G(x) \quad (x \in D(\pi_G)); \tag{4.1}$$
this mapping should be distinguished from the "set-valued metric projection" $\mathcal{P}_G$ defined in section 4.7 (the reader may compare the function $y(t) = \dots$ on $[0, \infty)$ with $\dots$).
b) In the particular case when $D(\pi_G) = E$ and $\pi_G$ is one-valued (i.e. when $G$ is a Čebyšev set), $\pi_G$ is called the metric projection of $E$ onto $G$.
Some properties of the mappings $\pi_G$ (and hence, in particular, of metric projections) onto linear subspaces of normed linear spaces are collected in
Theorem 4.1. Let $E$ be a normed linear space and $G$ a linear subspace of $E$. Then
a) $G \subset D(\pi_G)$ and $\pi_G$ is one-valued on $G$, namely, $\pi_G(g) = g$ for all $g \in G$. Hence, if $x \in D(\pi_G)$, then $\pi_G(x) \in D(\pi_G)$ and we have
$$\pi_G^2(x) = \pi_G(x) \qquad (x \in D(\pi_G)), \tag{4.2}$$
i.e. the mapping $\pi_G$ is idempotent.
b) We have …
c) $\pi_G$ is continuous at every point $g \in G$ (i.e., $x_n \to g \in G$ implies that for any $\pi_G(x_n) \in \mathcal{P}_G(x_n)$ we have $\pi_G(x_n) \to \pi_G(g) = g$).
d) If $G_1$ is a linear subspace of $G$, we have … If, in addition, the mapping $\pi_G$ is one-valued on $D(\pi_G)$ (i.e. if $G$ is a semi-Čebyšev subspace), then …
e) If $x \in D(\pi_G)$ and $g \in G$, then $x + g \in D(\pi_G)$ and we have
$$\pi_G(x + g) = \pi_G(x) + g,$$
i.e. $\pi_G$ is quasi-additive.
f) If $x \in D(\pi_G)$ and $\alpha$ is an arbitrary scalar, then $\alpha x \in D(\pi_G)$ and we have
$$\pi_G(\alpha x) = \alpha\, \pi_G(x) \qquad (x \in D(\pi_G),\ \alpha = \text{scalar}), \tag{4.8}$$
i.e. $\pi_G$ is homogeneous.
g) If $G$ is closed and $x_n \in D(\pi_G)$, $\lim_{n\to\infty} x_n = x$, $\lim_{n\to\infty} \pi_G(x_n) = g$, then $x \in D(\pi_G)$ and $\pi_G(x) = g$, i.e. $\pi_G$ is closed.
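Properties a), e) and f) of theorem 4.1 can be checked numerically in a small concrete case. The following is a minimal sketch, not taken from the text: it assumes $E = R^2$ with the $l^p$ norm for $p = 4$ (strictly convex, so every line is a Čebyšev subspace) and $G$ a line through the origin. The names `lp_norm`, `project_on_line` and `max_gap` are hypothetical helpers, and the projection is only approximated by a ternary search on the convex function $t \mapsto \|x - t g\|_p$.

```python
def lp_norm(v, p):
    """l^p norm of a finite real sequence (1 < p < infinity)."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)


def project_on_line(x, g, p, iters=200):
    """Approximate pi_G(x) for G = span{g} in (R^n, l^p norm).

    The function t -> ||x - t g||_p is convex, and any minimiser satisfies
    |t| <= 2 ||x|| / ||g||, so a ternary search on that bracket converges.
    """
    bound = 2.0 * lp_norm(x, p) / lp_norm(g, p)
    lo, hi = -bound, bound
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        d1 = lp_norm([a - m1 * b for a, b in zip(x, g)], p)
        d2 = lp_norm([a - m2 * b for a, b in zip(x, g)], p)
        if d1 <= d2:
            hi = m2  # a minimiser lies in [lo, m2]
        else:
            lo = m1  # a minimiser lies in [m1, hi]
    t = 0.5 * (lo + hi)
    return [t * b for b in g]


def max_gap(u, v):
    """Largest coordinate-wise difference between two vectors."""
    return max(abs(a - b) for a, b in zip(u, v))


# Assumed sample data (hypothetical, chosen only for illustration).
x, g, p = [3.0, 1.0], [1.0, 2.0], 4.0
px = project_on_line(x, g, p)

# a) idempotence: pi_G(pi_G(x)) = pi_G(x)
gap_idempotent = max_gap(project_on_line(px, g, p), px)
# e) quasi-additivity: pi_G(x + 2.5 g) = pi_G(x) + 2.5 g
shifted = [a + 2.5 * b for a, b in zip(x, g)]
gap_quasi_additive = max_gap(project_on_line(shifted, g, p),
                             [a + 2.5 * b for a, b in zip(px, g)])
# f) homogeneity: pi_G(-3 x) = -3 pi_G(x)
gap_homogeneous = max_gap(project_on_line([-3.0 * a for a in x], g, p),
                          [-3.0 * a for a in px])
print(gap_idempotent, gap_quasi_additive, gap_homogeneous)
```

All three printed gaps should be of the order of the search tolerance. The sketch says nothing about additivity of $\pi_G$, which in general fails for $p \neq 2$.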
The proofs are straightforward; see [82], pp. 140-142 and 390.
Part a) shows that in the particular case when $D(\pi_G) = E$ and $\pi_G$ is one-valued, the metric projection $\pi_G$ is indeed a (nonlinear) closed projection of $E$ onto $G$. Some authors use for the metric projection $\pi_G$ the term normal projection, or best approximation operator, or nearest point map, or Čebyšev map.
4.2. Continuity of metric projections.
The main characterization of Čebyšev subspaces $G$ with continuous metric projection $\pi_G$ is the following result, due to R. B. Holmes ([38], theorem 6; in the particular case when $E/G$ is reflexive, this result also follows from [84], theorem 3):
Theorem 4.2. For a Čebyšev subspace $G$ of a normed linear space $E$ the metric projection $\pi_G$ is continuous if and only if the restriction $\omega = \omega_G|_{\pi_G^{-1}(0)}$ of the canonical mapping $\omega_G : E \to E/G$ to the set $\pi_G^{-1}(0)$ is a homeomorphism of $\pi_G^{-1}(0)$ onto $E/G$.
Proof. Since $G$ is a Čebyšev subspace, by §3, proposition 3.1, $\omega$ is one-to-one. Furthermore, $\omega$ is always continuous and a mapping onto $E/G$, since for any $x + G \in E/G$ we have $x - \pi_G(x) \in \pi_G^{-1}(0)$ and $\omega_G(x - \pi_G(x)) = x + G$. Thus, the condition that $\omega = \omega_G|_{\pi_G^{-1}(0)}$ be a homeomorphism onto $E/G$ is equivalent to the continuity of $\omega^{-1}$.
Assume now that $\omega^{-1}$ is continuous and let $x_n, x \in E$, $\lim_{n\to\infty} \|x_n - x\| = 0$. Then by the preceding remark, $x_n - \pi_G(x_n) = \omega^{-1}(x_n + G)$, $x - \pi_G(x) = \omega^{-1}(x + G)$, whence
$$\pi_G(x_n) = x_n - \omega^{-1}(x_n + G) \to x - \omega^{-1}(x + G) = \pi_G(x),$$
and thus $\pi_G$ is continuous.
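Conversely, assume now that $\pi_G$ is continuous and let $x_n + G,\ x + G \in E/G$, $\lim_{n\to\infty} (x_n + G) = x + G$ and $\varepsilon > 0$. Since $\pi_G$ is continuous at the point $\omega^{-1}(x + G) \in \pi_G^{-1}(0)$, there exists a $\delta > 0$ such that $\|z - \omega^{-1}(x+G)\| < \delta$ implies
$$\|\pi_G(z) - \pi_G(\omega^{-1}(x+G))\| < \tfrac{1}{2}\varepsilon.$$
Consider the open cell
$$V = \operatorname{Int} S\big(\omega^{-1}(x+G), \min(\delta, \tfrac{1}{2}\varepsilon)\big) = \{z \in E \mid \|z - \omega^{-1}(x+G)\| < \min(\delta, \tfrac{1}{2}\varepsilon)\}.$$
Since the canonical mapping $\omega_G$ is open, the set $\omega_G(V)$ is open and obviously $x + G \in \omega_G(V)$. Hence, $x_n + G \in \omega_G(V)$ for $n \ge N = N(\varepsilon)$ and thus there exist elements $z_n \in V$ such that $z_n + G = \omega_G(z_n) = x_n + G$ ($n \ge N$). Therefore $x_n - z_n \in G$ and hence, by the quasi-additivity of $\pi_G$, we have $\pi_G(x_n) = \pi_G(x_n - z_n + z_n) = x_n - z_n + \pi_G(z_n)$. Consequently, since $z_n \in V$ ($n \ge N$) and $\pi_G(\omega^{-1}(x+G)) = 0$, we obtain
$$\|\omega^{-1}(x_n + G) - \omega^{-1}(x + G)\| = \|z_n - \pi_G(z_n) - \omega^{-1}(x+G)\| \le \|z_n - \omega^{-1}(x+G)\| + \|\pi_G(z_n)\| < \tfrac{1}{2}\varepsilon + \tfrac{1}{2}\varepsilon = \varepsilon \qquad (n \ge N),$$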
and thus $\omega^{-1}$ is continuous, which completes the proof of theorem 4.2.
Note that $\omega^{-1}$ is nothing else than the mapping $E/G \to \pi_G^{-1}(0)$ induced by $I - \pi_G$, where $I$ denotes the identical mapping of $E$ onto $E$. Since for every $x \in \pi_G^{-1}(0)$ we have $\omega^{-1}(x + G) = x - \pi_G(x) = x$, theorem 4.2 can be also rephrased as follows: $\pi_G$ is continuous if and only if the relations $x_n, x \in \pi_G^{-1}(0)$, $\lim_{n\to\infty} \rho(x_n - x, G) = 0$ imply $\lim_{n\to\infty} \|x_n - x\| = 0$.
Corollary 4.1. Let $E$ be a normed linear space. Then
a) A $\sigma(E^*, E)$-closed Čebyšev subspace $\Gamma$ of $E^*$ admits a continuous metric projection $\pi_\Gamma$ if and only if the (uniquely determined) extension map $\varphi \in (\Gamma_\perp)^* \to f \in E^*$ with $f|_{\Gamma_\perp} = \varphi$, $\|f\| = \|\varphi\|$, is continuous.
b) If $G$ is a Čebyšev subspace of $E$, such that $G^\perp \subset E^*$ has property (U) and that the extension map $\varphi \in (G^\perp)^* \to \Phi \in E^{**}$ with $\Phi|_{G^\perp} = \varphi$, $\|\Phi\| = \|\varphi\|$, is continuous, then $\pi_G$ is continuous.
Indeed, this follows from theorem 4.2 by the arguments of §3, proof of corollary 3.1. A direct proof of corollary 4.1 a) has been given by J. Lindenstrauss ([57], §7), but the proof sketched here is simpler.
Some other, more elementary, characterizations of the continuity of $\pi_G$, due to E. W. Cheney-D. E. Wulbert [20] and R. B. Holmes [38],
are collected in
Proposition 4.1. For a Čebyšev subspace $G$ of a normed linear space $E$ the following statements are equivalent:
1°. The metric projection $\pi_G$ is continuous.
2°. $\pi_G$ is continuous at each point of $\pi_G^{-1}(0)$.
3°. The direct sum decomposition $E = G \oplus \pi_G^{-1}(0)$ is topological (i.e., $\lim_{n\to\infty} x_n = x$ if and only if $\lim_{n\to\infty} \pi_G(x_n) = \pi_G(x)$ and $\lim_{n\to\infty} (x_n - \pi_G(x_n)) = x - \pi_G(x)$).
4°. $\pi_G|_{\Lambda(G)}$ is continuous, where $\Lambda(G) = \{x \in E \mid \|x - \pi_G(x)\| = 1\}$.
5°. The functional $x \to \|\pi_G(x)\|$ is continuous.
6°. The mapping $\psi_G : E \setminus G \to \pi_G^{-1}(0) \cap \operatorname{Fr} S_E$ defined by
$$\psi_G(x) = \frac{x - \pi_G(x)}{\|x - \pi_G(x)\|} \qquad (x \in E \setminus G)$$
is continuous.
Proof. The implication 1° ⇒ 2° is obvious. Conversely, if 1° does not hold, say $x_n \to x$, $\pi_G(x_n) \not\to \pi_G(x)$, then $x_n - \pi_G(x) \to x - \pi_G(x) \in \pi_G^{-1}(0)$, but $\pi_G(x_n - \pi_G(x)) = \pi_G(x_n) - \pi_G(x) \not\to 0$, contradicting 2°. Thus, 1° ⇔ 2°.
The equivalence 1° ⇔ 3° and the implication 1° ⇒ 5° are obvious. Conversely, if we have 5° and $x_n \to x \in \pi_G^{-1}(0)$, then
$$\lim_{n\to\infty} \|\pi_G(x_n) - \pi_G(x)\| = \lim_{n\to\infty} \|\pi_G(x_n)\| = \|\pi_G(x)\| = 0,$$
which proves that 5° ⇒ 2°.
The implication 1° ⇒ 6° is also obvious. Furthermore, if we have 6° and if $x_n, x \in \Lambda(G)$, $x_n \to x$, then
$$\pi_G(x_n) = x_n - \psi_G(x_n) \to x - \psi_G(x) = \pi_G(x),$$
which proves that 6° ⇒ 4°.
Assume finally that we have 4° and let $x_n \to x$. Since by theorem 4.1 c) $\pi_G$ is always continuous at the points $x \in G$, we may assume $x \in E \setminus G$, hence $x_n \notin G$ for $n > N$. Then $\|x_n - \pi_G(x_n)\| \neq 0$ for $n > N$ and by theorem 4.1 b), formula (4.3), $\|x_n - \pi_G(x_n)\| \to \|x - \pi_G(x)\|$. Therefore
$$\frac{x_n}{\|x_n - \pi_G(x_n)\|} \to \frac{x}{\|x - \pi_G(x)\|},$$
where both sides belong to $\Lambda(G)$ by the homogeneity of $\pi_G$, whence, by 4° and homogeneity again, $\lim_{n\to\infty} \pi_G(x_n) = \pi_G(x)$; thus, 4° ⇒ 1°, which completes the proof of proposition 4.1.
No theorem is known in concrete spaces about characterization
of Čebyšev subspaces $G$ of arbitrary dimension with a continuous metric projection.
Let us consider now, in arbitrary normed linear spaces, the problem of characterization of Čebyšev subspaces $G$ with continuous metric projection $\pi_G$, when we have restrictions on $\dim G$ or $\operatorname{codim} G$, or restrictions on the quotient space $E/G$.
Theorem 4.3. For every finite-dimensional Čebyšev subspace $G$ of a normed linear space $E$, the metric projection $\pi_G$ is continuous.
The proof is straightforward, using a compactness argument (see [82], pp. 251 and 386-390).
For Čebyšev subspaces $G$ of codimension 1, we have even a stronger property of $\pi_G$ (see [82], pp. 142-145):
Theorem 4.4. For every Čebyšev hyperplane $G$ in a normed linear space $E$, the metric projection $\pi_G$ is linear, and hence (by theorem 4.1 c)) continuous.
We have the following characterizations of Čebyšev subspaces of finite codimension with continuous metric projections, due to Cheney-Wulbert [20] and Holmes [38] respectively:
Theorem 4.5. For a Čebyšev subspace $G$ of finite codimension of a normed linear space $E$, the following statements are equivalent:
1°. $\pi_G$ is continuous.
2°. $\pi_G^{-1}(0)$ is boundedly compact, that is, intersects every cell $S(x, r) \subset E$ in a compact set.
3°. $\pi_G^{-1}(0) \cap \operatorname{Fr} S_E$ is compact.
Proof. Since $\dim E/G < \infty$, $S_{E/G}$ and $\operatorname{Fr} S_{E/G}$ are compact. Furthermore, by §3, formula (3.7), $\pi_G^{-1}(0) \cap \operatorname{Fr} S_E = \{x \in E \mid \|\omega_G(x)\| = \|x\| = 1\}$, whence
$$\omega_G\big(\pi_G^{-1}(0) \cap \operatorname{Fr} S_E\big) = \operatorname{Fr} S_{E/G}, \tag{4.12}$$
$$\omega_G\big(\pi_G^{-1}(0) \cap S_E\big) = S_{E/G}, \tag{4.13}$$
where $\omega_G$ is the canonical map $E \to E/G$. Now, if $\pi_G$ is continuous, then, by theorem 4.2, $\omega = \omega_G|_{\pi_G^{-1}(0)}$ is a homeomorphism and hence, by (4.13), we have 2°. Finally, since the implication 2° ⇒ 3° is obvious, let us assume that we have 3°. Then, by (4.12) and by the remarks made at the beginning of the proof of theorem 4.2, $\omega_G|_{\pi_G^{-1}(0) \cap \operatorname{Fr} S_E}$ is a one-to-one continuous mapping of the compact set $\pi_G^{-1}(0) \cap \operatorname{Fr} S_E$ onto the compact set $\operatorname{Fr} S_{E/G}$ and hence a homeomorphism. Therefore, since both $\pi_G^{-1}(0)$ and $E/G$ are star-shaped (by §2, proposition 2.2) and since $\omega_G$ is homogeneous (i.e., $\omega_G(\alpha x) = \alpha\, \omega_G(x)$), it follows easily that $\omega_G|_{\pi_G^{-1}(0)}$ is a homeomorphism of $\pi_G^{-1}(0)$ onto $E/G$ and hence, by theorem 4.2, $\pi_G$ is continuous, which completes the proof of theorem 4.5.
Naturally, one can also prove theorem 4.5 directly, i.e., without using theorem 4.2. Moreover, with a direct argument one can prove that the relations 3° ⇔ 2° ⇒ 1° remain valid for arbitrary Čebyšev subspaces $G$ [20].
For reflexive strictly convex Banach spaces $E$ we have also another useful characterization of Čebyšev subspaces $G$ of finite codimension with continuous $\pi_G$, due to R. B. Holmes [38], in terms of the spherical image map $v : E^* \to E$ defined as follows: for $f \in E^*$, $v(f)$ = the (unique) element $x \in E$ such that
$$f(x) = \|f\| \cdot \|x\|, \qquad \|x\| = \|f\|.$$
Theorem 4.6. For a Čebyšev (or, equivalently, closed linear) subspace $G$ of finite codimension of a reflexive strictly convex Banach space $E$ the metric projection $\pi_G$ is continuous if and only if the restriction $v|_{G^\perp}$ of the spherical image map $v : E^* \to E$ to $G^\perp$ is continuous.
In connection with theorem 4.6, note that we have always $v(G^\perp) = \pi_G^{-1}(0)$ and, if $E$ is smooth, then also $v^{-1}(\pi_G^{-1}(0)) = G^\perp$. Furthermore, $v|_{G^\perp}$ is one-to-one if and only if $E/G$ is smooth. Using that $v$ is a "duality map" and hence a "maximal monotone operator" in the sense of F. Browder (see e.g. [16]), R. B. Holmes [38] has proved that the sufficiency part of theorem 4.6 remains valid if instead of $\operatorname{codim} G < \infty$ we assume only that the norm of $E/G$ is Fréchet differentiable at every non-zero point.
Let us mention the following results on characterization of Čebyšev subspaces of finite codimension with continuous metric projection in
some concrete spaces, due to P. D. Morris [66]:
Theorem 4.7. a) For a Čebyšev subspace $G$ of finite codimension of $E = C_R(Q)$ ($Q$ compact infinite), $\pi_G$ is continuous (if and) only if $G$ is a closed hyperplane.
b) For a Čebyšev subspace $G$ of finite codimension of $E = L^1_R(T, \nu)$, where $(T, \nu)$ is a $\sigma$-finite positive measure space, $\pi_G$ is continuous if and only if the set
$$\bigcup_{\beta \in G^\perp \setminus \{0\}} \operatorname{at} \{t \in T \mid |\beta(t)| = \|\beta\|\}$$
(where $\operatorname{at} A$ denotes the set of all atoms of $A$) is finite.
A characterization of such subspaces in $E = C^1_R(Q, \nu)$ has been given by E. W. Cheney and D. E. Wulbert ([20], theorem 35).
Various extensions of theorem 4.7 a) have been given by A. Lazar, D. E. Wulbert and P. D. Morris [56]. They imply, in particular, the following sharpening of the theorem 4.7 b) for the space $E = c_R$ ([56], corollary 3.10): For a Čebyšev subspace $G$ of $E = c_R$, $\pi_G$ is continuous (if and) only if either $G$ is finite-dimensional or $G$ is a hyperplane.
By theorem 4.7 a), for every Čebyšev subspace $G$ of finite codimension $n \ge 2$ of $E = C_R(Q)$ ($Q$ compact infinite) $\pi_G$ is discontinuous, and theorem 4.7 b) shows how to construct Čebyšev subspaces $G$ of $L^1_R(T, \nu)$-spaces (with $(T, \nu)$ $\sigma$-finite) with discontinuous $\pi_G$. The first example of a Čebyšev subspace $G$ of a normed linear space $E$ (namely, a subspace $G$ of codimension 2 of $C([0, 1])^*$) with discontinuous metric projection $\pi_G$ has been given in 1964, by J. Lindenstrauss [57], and then examples in other spaces by E. W. Cheney-D. E. Wulbert [20] (in $C^1_R$), R. B. Holmes-B. R. Kripke [39] and others.
Let us give now some classes of Banach spaces $E$ in which for every Čebyšev subspace $G$ the metric projection $\pi_G$ is continuous and of spaces $E$ such that all closed linear subspaces $G$ are Čebyšev subspaces with continuous $\pi_G$.
Using corollary 4.1 a), J. Lindenstrauss [57] has proved that if $E^*$ is locally uniformly convex (i.e., if the relations $f_n, f \in \operatorname{Fr} S_{E^*}$, $\lim_{n\to\infty} \|f_n + f\| = 2$ imply $\lim_{n\to\infty} \|f_n - f\| = 0$), then every $\sigma(E^*, E)$-closed linear subspace $\Gamma$ of $E^*$ is a Čebyšev subspace with continuous $\pi_\Gamma$. From this result it follows
Proposition 4.2. In a uniformly convex Banach space $E$ all closed linear subspaces $G$ are Čebyšev subspaces with continuous $\pi_G$.
We recall that a Banach space $E$ is called uniformly convex if for every $\varepsilon > 0$ there is a $\delta(\varepsilon) > 0$ such that the relations $\|x\| = \|y\| = 1$, $\|x - y\| \ge \varepsilon$ imply $\|x + y\| \le 2(1 - \delta(\varepsilon))$. It is well known (see e.g. [21]) that every uniformly convex space is reflexive and strictly convex.
On the other hand, we have proved in [80], corollary 4, that if $E$ is a reflexive Banach space with property (H) (i.e. such that the relations $x_n \to x$ weakly and $\lim_{n\to\infty} \|x_n\| = \|x\|$ imply $\lim_{n\to\infty} \|x_n - x\| = 0$), then the metric projection $\pi_G$ onto any Čebyšev subspace $G$ of $E$ is continuous; for spaces $E$ which are, in addition, strictly convex, this was obtained by Ky Fan-I. Glicksberg [30]. This again implies proposition 4.2, since it is well known that every uniformly convex space has property (H) (see e.g. [21]).
We mention that for subspaces $G$ of finite codimension (and even for $G$ such that the norm of $E/G$ is Fréchet differentiable at every non-zero point) proposition 4.2 also follows from results of M. I. Kadec, who has proved ([44], lemma 2) that for a uniformly convex space $E$ the spherical image map $v : E^* \to E$ is continuous and ([44], lemma 3) for $\operatorname{codim} G < \infty$ the set $\pi_G^{-1}(0) \cap \operatorname{Fr} S_E$ is compact (and thus we can apply theorem 4.6 or 4.5, combined with §3, theorem 3.11).
There also exist some results giving conditions in order that for all Čebyšev subspaces $G$ of finite codimension the metric projection $\pi_G$ be continuous. For example, R. B. Holmes [38] has observed that from theorem 4.6 it follows
Proposition 4.3. In a reflexive strictly convex Banach space $E$ the metric projection onto every Čebyšev (or, equivalently, closed linear) subspace $G$ of finite codimension is continuous if and only if the restriction of the spherical image map $v : E^* \to E$ to every finite-dimensional linear subspace of $E^*$ is continuous (or, in other words, if and only if $v$ is continuous on $E^*$ endowed with the "finite topology").
Similarly, P. D. Morris [66] has used theorem 4.7 b) to deduce
Proposition 4.4. If $(T, \nu)$ is a positive measure space such that $\operatorname{at} T$ is finite, then for every Čebyšev subspace $G$ of finite codimension of $E = L^1_R(T, \nu)$, the metric projection $\pi_G$ is continuous.
Naturally, when applying the results on continuity of $\pi_G$ (for example, theorem 4.7 or proposition 4.4), one should take into account the results of §3 on characterization and existence of Čebyšev subspaces (in particular, of finite codimension, for example, theorems 3.6, 3.8 b), 3.9).
We conclude this section with the following theorem of Cheney and Wulbert [20] on continuity of metric projections in quotient spaces, corresponding to §2, theorem 2.8 and §3, theorem 3.12.
Theorem 4.8. Let $G_1$ be a linear subspace of a normed linear space $E$ and $G_2$ a subspace of $G_1$ such that $G_2$ is a Čebyšev subspace of $E$ with continuous $\pi_{G_2}$. Then $G_1$ is a Čebyšev subspace of $E$ with continuous $\pi_{G_1}$ if and only if $G_1/G_2$ is a Čebyšev subspace of $E/G_2$ with continuous $\pi_{G_1/G_2}$.
The converse of theorem 4.8 is not valid: one can give [20] an example of $G_2 \subset G_1 \subset E$, with $G_2$ Čebyšev in $E$ with $\operatorname{codim} G_2 = 2$, $G_1$ Čebyšev in $E$ with continuous $\pi_{G_1}$ and $G_1/G_2$ Čebyšev in $E/G_2$ with continuous $\pi_{G_1/G_2}$, but $\pi_{G_2}$ discontinuous.
4.3. Weak continuity of metric projections.
A) Weak sequential continuity of $\pi_G$. The following analogues of some results of section 4.2 are due to R. B. Holmes [38]:
Theorem 4.9. a) If $G$ is a Čebyšev subspace of a normed linear space $E$, such that $\omega_G|_{\pi_G^{-1}(0)}$ is a weak sequential homeomorphism of $\pi_G^{-1}(0)$ onto $E/G$ (where $\omega_G : E \to E/G$ is the canonical mapping), then $\pi_G$ is weakly sequentially continuous.
b) If $G$ is a Čebyšev subspace of finite codimension, and if $\pi_G$ is continuous, then $\pi_G$ is weakly sequentially continuous.
c) If $G$ is any closed linear subspace of finite codimension of a normed linear space $E$, then $\pi_G^{-1}(0)$ is weakly sequentially closed.
Theorem 4.10. For a Čebyšev subspace $G$ of a reflexive Banach space $E$ the following statements are equivalent:
1°. $\pi_G$ is weakly sequentially continuous.
2°. $\pi_G^{-1}(0)$ is weakly sequentially closed.
3°. $\omega_G|_{\pi_G^{-1}(0)}$ is a weak sequential homeomorphism.
Corollary 4.2. For every Čebyšev subspace $G$ of finite codimension of a reflexive Banach space $E$, the metric projection $\pi_G$ is weakly sequentially continuous.
Indeed, this follows immediately from theorem 4.10, implication 2° ⇒ 1° and theorem 4.9 c).
Theorem 4.11. If $E$ is a reflexive strictly convex smooth Banach space such that the mapping $x \to \tau_x$ of $E \setminus \{0\}$ into $\operatorname{Fr} S_{E^*}$ defined by
$$\tau_x(y) = \lim_{t \to 0+} \frac{\|x + ty\| - \|x\|}{t}$$
for all $y \in E$, is weakly sequentially continuous, then for every Čebyšev (or, equivalently, closed linear) subspace $G$ of $E$ the metric projection $\pi_G$ is weakly sequentially continuous.
In particular, it is known that the spaces $E = l^p$, where $1 < p < \infty$, satisfy the conditions of theorem 4.11.
B) Weak continuity of $\pi_G$. V. Klee [47] has given conditions on $E$ which guarantee the weak continuity of $\pi_G$ if $\dim G < \infty$, for example the following ([47], proposition 2.5): If for any pair of elements $x, y \in E$ the equidistant set
$$P(x, y) = \{z \in E \mid \|z - x\| = \|z - y\|\}$$
is weakly closed, then for every finite-dimensional Čebyšev subspace $G$ of $E$, $\pi_G$ is weakly continuous. For the case when $\dim G = 1$ C. A. Kottman and Bor-Luh Lin [50] have given the following sharpening of this result: If $G$ is a Čebyšev subspace with $\dim G = 1$ and if for some $g \in G$ the set $P(g, -g)$ is weakly closed, then $\pi_G$ is weakly continuous.
Let us also mention the following partial analogue of theorem 4.10, due to Kottman and Lin [50]: If $G$ is a finite-dimensional Čebyšev subspace of a Banach space $E$, such that $\pi_G^{-1}(0)$ is weakly closed, then $\pi_G$ is weakly continuous.
A similar result for the bw-topology was also given in [50], theorem 3. An example of a one-dimensional Čebyšev subspace $G$ of $E = c_0$ such that $\pi_G$ is not weakly or bw-continuous, was given by Kottman and Lin [50].
4.4. Lipschitzian metric projections.
A) Pointwise Lipschitzian metric projections. The following result is due, essentially, to G. Freud, and E. W. Cheney (see [19]).
Proposition 4.5. For every strongly Čebyšev subspace (hence, in particular, for every interpolating subspace) $G$ of a normed linear space $E$ the metric projection $\pi_G$ is pointwise Lipschitzian, i.e. for each $x \in E$ there exists a constant $\lambda = \lambda(G, x)$ such that
$$\|\pi_G(x) - \pi_G(y)\| \le \lambda \|x - y\| \qquad (y \in E). \tag{4.15}$$
Proof. If $r = r(G, x)$ is as in §3, formula (3.20), then, putting $g_0 = \pi_G(x)$, $g = \pi_G(y)$, we obtain
$$r \|g - g_0\| \le \|x - g\| - \|x - g_0\| \le \|x - y\| + \|y - g\| - \|x - g_0\| \le \|x - y\| + \|y - g_0\| - \|x - g_0\| \le 2 \|x - y\|,$$
and thus we may take $\lambda = \dfrac{2}{r}$, which completes the proof.
The converse of proposition 4.5 is not valid, since e.g. in $E = l^2$ we have (4.15) even with $\lambda = 1$, independent of $G$ and $x$ (since $\pi_G$ is the orthogonal projection onto $G$), but $l^2$ has no strongly Čebyšev subspace $G \neq \{0\}$ (by the remark made after §3, theorem 3.13, about smooth spaces).
Combining proposition 4.5 with §3, theorem 3.13, it follows that in the spaces $E = C_R(Q)$ and $E = L^1_R(T, \nu)$ for every finite-dimensional Čebyšev subspace $G$ the metric projection $\pi_G$ is pointwise Lipschitzian. On the other hand, R. B. Holmes and B. R. Kripke [39] have given an example of a Čebyšev line $G$ in a 3-dimensional uniformly convex space $E$, such that $\pi_G$ is not pointwise Lipschitzian.
B) Lipschitzian metric projections. The main characterization of Čebyšev subspaces $G$ with Lipschitzian metric projection $\pi_G$ is the following analogue of theorem 4.2, due to R. B. Holmes [38].
Theorem 4.12. For a Čebyšev subspace $G$ of a normed linear space $E$ the metric projection $\pi_G$ is Lipschitzian if and only if $\omega = \omega_G|_{\pi_G^{-1}(0)}$ is a Lipschitzian homeomorphism of $\pi_G^{-1}(0)$ onto $E/G$.
The proof is similar to that of theorem 4.2 (the condition amounts to $\omega^{-1}$ being Lipschitzian). One can also give a corollary similar to corollary 4.1.
Some other, more elementary, characterizations of Lipschitzian $\pi_G$, due to R. B. Holmes and B. R. Kripke [39], are collected in
Proposition 4.6. For a Čebyšev subspace $G$ of a normed linear space $E$, the following statements are equivalent:
1°. $\pi_G$ is Lipschitzian.
2°. $\pi_G$ is uniformly continuous on $E$.
3°. $\dots(\pi_G^{-1}(0))$ is bounded, where …
4°. $\pi_G$ is "uniformly locally pointwise Lipschitzian", i.e. there exist two constants $\lambda = \lambda(G) > 0$ and $\delta = \delta(G) > 0$ such that the relations $x \in \pi_G^{-1}(0) \cap \operatorname{Fr} S_E$, $\|x - y\| \le \delta$ imply $\|\pi_G(x) - \pi_G(y)\| \le \lambda \|x - y\|$.
R. B. Holmes and B. R. Kripke [39] have also observed that from a result of Lindenstrauss [58] one obtains, as a particular case, the following important necessary condition for $\pi_G$ to be Lipschitzian:
Theorem 4.13. If the metric projection $\pi_G$ onto a Čebyšev subspace $G$ of a reflexive Banach space is Lipschitzian, then $G$ is complemented in $E$.
However, the condition in theorem 4.13 is not sufficient. Indeed, R. B. Holmes and B. R. Kripke [39] have given even an example of a one-dimensional Čebyšev subspace $G$ of $l^p$, where $2 < p < \infty$, such that $\pi_G$ is not Lipschitzian.
Combining theorem 4.13 with a recent characterization of Banach spaces isomorphic to Hilbert spaces, due to J. Lindenstrauss and L. Tzafriri [60], we obtain
Corollary 4.2. If for all closed linear subspaces $G$ of a strictly convex reflexive Banach space $E$, $\pi_G$ is Lipschitzian (or, in particular, uniformly Lipschitzian, i.e., with a constant $\lambda$ independent of $G$), then $E$ is isomorphic to a Hilbert space.
R. B. Holmes and B. R. Kripke [39] have proved that in the finite-dimensional $L^p(T, \nu)$ spaces, where $2 < p < \infty$, $\pi_G$ is Lipschitzian for all Čebyšev (or equivalently, all linear) subspaces $G$, but not uniformly Lipschitzian, if $\dim L^p(T, \nu) \ge 3$. If $E$ is a strictly convex space of dimension 2, then, by theorem 4.4, $\pi_G$ is uniformly Lipschitzian on $E$. R. B. Holmes and B. R. Kripke [39] have constructed examples of non-Hilbert spaces $E$ of any finite dimension such that $\pi_G$ is uniformly Lipschitzian on $E$.
In this connection we also have (see [82], pp. 247-249 and 350):
Theorem 4.14. If $E$ is a normed linear space of (finite or infinite) dimension $\ge 3$ with the property that for every linear (not necessarily Čebyšev) subspace $G$ of a certain fixed finite dimension $n$, or codimension $n$, where $1 \le n \le \dim E - 1$, the (generally multi-valued) mapping $\pi_G$ satisfies $\|\pi_G(x)\| \le \|x\|$ for all $x \in D(\pi_G)$ (hence, in particular, if $\pi_G$ is "contractive" on $D(\pi_G)$, i.e. Lipschitzian with constant 1), then $E$ is linearly isometric to a Hilbert space.
4.5. Differentiability of metric projections.
The notions and results of this section are due to R. B. Holmes and B. R. Kripke [39].
Definition 4.2. If $G$ is a Čebyšev subspace of a normed linear space $E$ and $x, y \in E$ and if the limit
$$\pi'_G(x, y) = \lim_{t \to 0+} \frac{\pi_G(x + ty) - \pi_G(x)}{t} \tag{4.17}$$
exists, then $\pi'_G(x, y)$ is called the Gâteaux derivative of $\pi_G$ at $x$ in the direction $y$.
The following observations are immediate:
a) $\pi'_G(x, cy) = c\, \pi'_G(x, y)$, if either side exists.
b) $\pi'_G(g, y) = \pi_G(y)$ for all $g \in G$, $y \in E$.
c) If $x \in E \setminus G$ and either $\pi'_G(x, y)$ or $\pi'_G(\psi_G(x), y)$ exist, where $\psi_G$ is as in proposition 4.1, then both exist and are equal.
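Observation b) can be illustrated numerically: for $g \in G$ the difference quotient in (4.17) equals $\pi_G(y)$ for every $t > 0$, by quasi-additivity and homogeneity. The sketch below is an illustration under assumptions, not part of the source: it takes $E = R^2$ with the $l^4$ norm and $G$ a line, and the names `lp_norm`, `project_on_line` and `difference_quotient` are hypothetical helpers. The projection is approximated by a ternary search, so agreement holds only up to the search tolerance.

```python
def lp_norm(v, p):
    """l^p norm of a finite real sequence (1 < p < infinity)."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)


def project_on_line(x, g, p, iters=200):
    """Approximate pi_G(x) for G = span{g}, by ternary search on the
    convex function t -> ||x - t g||_p over |t| <= 2 ||x|| / ||g||."""
    bound = 2.0 * lp_norm(x, p) / lp_norm(g, p)
    lo, hi = -bound, bound
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        d1 = lp_norm([a - m1 * b for a, b in zip(x, g)], p)
        d2 = lp_norm([a - m2 * b for a, b in zip(x, g)], p)
        if d1 <= d2:
            hi = m2
        else:
            lo = m1
    t = 0.5 * (lo + hi)
    return [t * b for b in g]


def difference_quotient(x, y, g, p, t=1e-3):
    """One-sided quotient (pi_G(x + t y) - pi_G(x)) / t from (4.17)."""
    moved = project_on_line([a + t * b for a, b in zip(x, y)], g, p)
    base = project_on_line(x, g, p)
    return [(m - b) / t for m, b in zip(moved, base)]


# Assumed sample data: G = span{(1, 2)}, base point 2g in G, direction y.
g, p = [1.0, 2.0], 4.0
base_point = [2.0 * c for c in g]
y = [3.0, 1.0]

quotient = difference_quotient(base_point, y, g, p)
projection_of_y = project_on_line(y, g, p)
gap = max(abs(a - b) for a, b in zip(quotient, projection_of_y))
print(gap)
```

At points $x \notin G$ the quotient genuinely depends on $t$, and such a finite difference is only a rough indication of whether the limit in (4.17) exists.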
Theorem 4.15. If for a Čebyšev subspace $G$ of a normed linear space $E$, $\pi'_G(x, y)$ exists for all $x \in \pi_G^{-1}(0) \cap \operatorname{Fr} S_E$, $y \in \operatorname{Fr} S_E$ and if
$$\sup_{x \in \pi_G^{-1}(0) \cap \operatorname{Fr} S_E,\ y \in \operatorname{Fr} S_E} \|\pi'_G(x, y)\| < \infty,$$
then $\pi_G$ is Lipschitzian.
If $x \in E \setminus \{0\}$ and if for any $y, z \in E$ the function $N(s, t) = \|x + sy + tz\|$ is twice continuously differentiable in a neighbourhood of $(0, 0)$, then one can define a functional on $E \times E$ by $\langle y, z \rangle_x = \dots$
Theorem 4.16. Let $G = [x_1, \dots, x_n]$ be an $n$-dimensional Čebyšev subspace of a normed linear space $E$ and suppose that for some $x \in \pi_G^{-1}(0) \cap \operatorname{Fr} S_E$, $y \in \operatorname{Fr} S_E$, the function
$$f(s_1, \dots, s_n, t) = \Big\| x + ty - \sum_{k=1}^{n} s_k x_k \Big\|$$
is twice continuously differentiable in a neighbourhood of the origin in $R^{n+1}$. Let $X_x$ be the $n \times n$ matrix $(X_x)_{ij} = \langle x_i, x_j \rangle_x$ and let $\eta_x(y)$ be the $n$-vector whose $i$-th component is $\langle x_i, y \rangle_x$. Then, if $X_x$ is invertible, $\pi'_G(x, y)$ exists and is given by …
A straightforward computation shows that if $E = L^p_R(T, \nu)$, $2 < p < \infty$, then $\langle y, z \rangle_x = (p-1) \int \dots \, d\nu(t)$ and that the functional $x \to \langle y, z \rangle_x$ is continuous for fixed $y, z \in E$. Moreover, using this observation and theorem 4.16, one obtains
Theorem 4.17. For every finite-dimensional linear subspace $G$ of $E = L^p_R(T, \nu)$, where $2 < p < \infty$, $\pi'_G(x, y)$ exists for all $x \in E \setminus G$, $y \in E$.
The assumption here that $\dim G < \infty$ is essential, since one can give an example of an infinite-dimensional closed linear (hence Čebyšev) subspace $G$ of $l^p$, where $1 < p < \infty$, $p \neq 2$, such that $\pi_G$ is non-differentiable and non-pointwise Lipschitzian.
Definition 4.3. If $G$ is a Čebyšev subspace of a normed linear space $E$, $\pi_G$ is said to be Fréchet-$C^1$ on the open set $E \setminus G$ if there exists a continuous mapping $u : E \setminus G \to L(E, G)$ (where $L(E, G)$ denotes the space of all continuous linear mappings of $E$ into $G$, with the uniform norm), such that $\pi'_G(x, y)$ exists and …
Using theorem 4.15 one can show that if $\dim E < \infty$, and if $G$ is a Čebyšev subspace of $E$ such that $\pi_G$ is Fréchet-$C^1$ on $E \setminus G$, then $\pi_G$ is Lipschitzian.
Some more results on the existence of $\pi'_G(x, y)$ whenever $\|x - x_0\| < \delta$ (for some $x_0 \in E \setminus G$) and on $\pi'_G$ satisfying a Lipschitz condition for all $x$ in a neighbourhood of $x_0$ and all $y \in E$, are given in [39].
4.6. Linearity of metric projections.
By theorem 4.1 f), for any semi-Čebyšev subspace $G$ the linearity of $\pi_G$ on $D(\pi_G)$ is equivalent to its additivity on $D(\pi_G)$.
The main characterization of Čebyšev subspaces $G$ with linear metric projection $\pi_G$ is the following analogue of theorem 4.2, due to R. B. Holmes [38].
Theorem 4.18. For a Čebyšev subspace $G$ of a normed linear space $E$ the metric projection $\pi_G$ is linear if and only if $\omega = \omega_G|_{\pi_G^{-1}(0)}$ is an isometric (i.e., distance-preserving) mapping of $\pi_G^{-1}(0)$ onto $E/G$.
Note that, as was observed in the proof of theorem 4.2 and in §3, formula (3.7), for any Čebyšev subspace $G$, $\omega = \omega_G|_{\pi_G^{-1}(0)}$ is a one-to-one continuous norm-preserving mapping of $\pi_G^{-1}(0)$ onto $E/G$. One can also give a corollary of theorem 4.18, similar to corollary 4.1.
Some other characterizations of the linearity of $\pi_G$, given in [82], p. 144 and in [39], theorem 3, are collected in
Proposition 4.7. For a semi-Čebyšev subspace $G$ of a normed linear space $E$ the following statements are equivalent:
1°. $\pi_G$ is one-valued and linear on $D(\pi_G)$.
2°. $\pi_G^{-1}(0)$ is a closed linear subspace of $E$.
3°. $\pi_G^{-1}(0)$ is convex.
If, in addition, $G$ is proximinal (and hence a Čebyšev subspace), these statements are equivalent to the following:
4°. $\pi_G^{-1}(0)$ contains a linear subspace $F$ of $E$ such that $E = G + F$.
5°. There exists a constant $K$ such that
$$\|\pi_G(x + y)\| \le K \big(\|\pi_G(x)\| + \|\pi_G(y)\|\big) \qquad (x, y \in E). \tag{4.21}$$
6°. $\pi_G$ is continuously Gâteaux differentiable.
In theorem 4.4 it was established that for every Čebyšev hyperplane $G$ in a normed linear space $E$, $\pi_G$ is linear. There we also observed that whenever $\pi_G$ onto a Čebyšev subspace $G$ is linear, it is also continuous and hence a bounded linear projection; thus a necessary condition in order that $\pi_G$ be linear is that $G$ be complemented in $E$. Moreover, in this case, by theorem 4.1 b) we have $1 \le \|\pi_G\| \le 2$ and $\|I - \pi_G\| = 1$.
Some simple characterizations of the situation when $\|\pi_G\| = 1$, due to R. B. Holmes and B. R. Kripke [39], are collected in
Proposition 4.8. For a Čebyšev subspace $G$ of a normed linear space $E$, such that $\pi_G$ is linear, the following statements are equivalent:
1°. $\|\pi_G\| = 1$.
2°. $(I - \pi_G)(x) \in \mathcal{P}_{\pi_G^{-1}(0)}(x)$ for all $x \in E$ (recall that $\pi_G^{-1}(0)$ is now a closed linear subspace, by proposition 4.7).
If, in addition, $\pi_G^{-1}(0)$ is a Čebyšev subspace, these statements are equivalent to the following:
3°. …
The following sufficient condition for the linearity of $\pi_G$, related to theorem 4.6, was observed by R. B. Holmes [38]: If for a Čebyšev (or, equivalently, closed linear) subspace $G$ of a reflexive strictly convex Banach space $E$ the restriction $v|_{G^\perp}$ of the spherical image map $v : E^* \to E$ to $G^\perp$ is linear, then $\pi_G$ is linear. However, the converse is not valid.
There arises naturally the problem of characterizing in the usual concrete normed linear spaces $E$ the Čebyšev subspaces $G$ for which $\pi_G$ is linear. In this direction we have the following result of P. D. Morris ([66], theorem 9), related to theorem 4.7 b):
Theorem 4.19. For a Čebyšev subspace $G$ of finite codimension of $E = L^1_R(T, \nu)$, where $(T, \nu)$ is a positive measure space, $\pi_G$ is linear if and only if there exists a $\beta \in G^\perp$ such that …
An example of a Čebyšev subspace $G$ of codimension 2 of $E = l^1_R$ with $\pi_G$ continuous but not linear, has been given by P. D. Morris [66].
The following result (see [82], pp. 249 and 351 and [84]), somewhat related to corollary 4.2 and theorem 4.14, gives, in particular, a characterization of the spaces $E$ in which all $\pi_G$ are linear:
Theorem 4.20. If $E$ is a normed linear space of (finite or infinite) dimension $\ge 3$, with the property that for every closed linear subspace $G$ of a given fixed finite dimension $n$, or codimension $n$, where $1 \le n \le \dim E - 2$, respectively $2 \le n \le \dim E - 1$, the mapping $\pi_G$ is one-valued on $D(\pi_G) = E$ and linear, then $E$ is linearly isometric to an inner product space.
For $\operatorname{codim} G \ge 3$ one can prove more, namely (see [82], p. 352 and [84]):
Theorem 4.21. If $E$ is a normed linear space of (finite or infinite) dimension $\ge 4$, such that for every closed linear subspace $G$ of a certain fixed finite codimension $n$, where $3 \le n \le \dim E - 1$, the mapping $\pi_G$ is one-valued on $D(\pi_G)$ and satisfies …, then $E$ is linearly isometric to an inner product space.
However [84], it is not known what happens if the same conditions are satisfied for every closed linear subspace $G$ of codimension 2.
Finally, let us mention that in some cases for a certain (increasing or decreasing) sequence $\{G_n\}$ of Čebyšev subspaces of a space $E$, each $\pi_{G_n}$ is linear, but not for all Čebyšev subspaces $G$ of $E$ is $\pi_G$ linear. For example, in $E = l^1$, the increasing sequence $G_n = [e_1, \dots, e_n]$ (where $e_k = (0, \dots, 0, 1, 0, \dots)$, with 1 in the $k$-th place) and the decreasing sequence $G^n = \{x = \{\xi_k\} \in l^1 \mid \xi_1 = \dots = \xi_n = 0\}$ ($n = 1, 2, \dots$) have this property; in this example we also have $\dim G_n = \operatorname{codim} G^n = n$ for all $n = 1, 2, \dots$ In the general case, this property is related to the existence of a "Schauder basis" in $E$.
4.7. Semi-continuity and continuity of set-valued metric projections

Definition 4.4. For a proximinal set G in a metric space E, the mapping P_G : x → P_G(x) of E into 2^G (= the collection of all closed non-void subsets of G) is called the set-valued metric projection of E onto G.

Semi-continuity and continuity properties of set-valued metric projections in normed linear spaces and, more generally, in metric spaces, have been first investigated by K. Tatarkiewicz [87] and in [80].
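As an illustration of Definition 4.4, the following minimal Python sketch computes a set-valued metric projection in one concrete case. The example is an assumption chosen here and is not taken from the text: E is the plane with the max norm and G is the horizontal axis. The function names are hypothetical. For x = (a, b) the distance to G is |b|, and every point (t, 0) with |t − a| ≤ |b| is a best approximation, so P_G(x) is a whole segment whenever b ≠ 0.

```python
def max_norm(v):
    """Max norm on the plane."""
    return max(abs(v[0]), abs(v[1]))

def projection_onto_axis(x):
    """Set-valued metric projection of x = (a, b) onto G = {(t, 0)} in the max norm.

    Returns (d, t_min, t_max), meaning dist(x, G) = d and
    P_G(x) = {(t, 0) : t_min <= t <= t_max}.
    """
    a, b = x
    d = abs(b)  # max(|a - t|, |b|) >= |b|, with equality iff |a - t| <= |b|
    return d, a - d, a + d

def grid_minimizers(x, ts):
    """Brute-force check: parameters t in ts minimizing the distance from x to (t, 0)."""
    dists = [max_norm((x[0] - t, x[1])) for t in ts]
    best = min(dists)
    return best, [t for t, d in zip(ts, dists) if d == best]

if __name__ == "__main__":
    x = (0.5, 2.0)
    print(projection_onto_axis(x))
    ts = [k / 4 for k in range(-20, 21)]   # grid -5, -4.75, ..., 5
    print(grid_minimizers(x, ts))
```

The brute-force search over a grid of parameters recovers the same segment as the closed formula, which is a quick way to see that G is proximinal but not Čebyšev in this norm.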
We recall that a mapping U : E → 2^G is called upper semi-continuous, respectively lower semi-continuous, if the set {x ∈ E | U(x) ⊂ M} is open for each open subset M of G, respectively closed for each closed subset M of G, or, equivalently, if the set {x ∈ E | U(x) ∩ N ≠ ∅} is closed for each closed subset N of G, respectively open for each open subset N of G. Furthermore, U : E → 2^G is called continuous in the Hausdorff metric, if x_n → x implies

(4.24)    max [ sup_{g ∈ U(x)} ρ(g, U(x_n)), sup_{g_n ∈ U(x_n)} ρ(g_n, U(x)) ] → 0.
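The quantity appearing in (4.24) is the Hausdorff distance between the sets U(x) and U(x_n). For finite sets it can be evaluated directly. The following Python sketch does so, assuming the Euclidean plane only for concreteness; the function names are hypothetical.

```python
import math

def dist_to_set(p, S):
    """rho(p, S): smallest Euclidean distance from p to a point of the finite non-empty set S."""
    return min(math.dist(p, s) for s in S)

def hausdorff(A, B):
    """Hausdorff distance: max( sup_{a in A} rho(a, B), sup_{b in B} rho(b, A) )."""
    return max(max(dist_to_set(a, B) for a in A),
               max(dist_to_set(b, A) for b in B))

if __name__ == "__main__":
    A = [(0.0, 0.0), (1.0, 0.0)]
    B = [(0.0, 0.0)]
    print(hausdorff(A, B))   # the point (1, 0) of A is at distance 1 from B
```

Both one-sided terms are needed: in the example the second supremum is 0 (B is contained in A), while the first equals 1.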
We also recall the following well known (and easily proved) facts:

a) Continuity of P_G in the Hausdorff metric implies the lower semi-continuity and, if P_G(x) is compact for every x ∈ E, then also the upper semi-continuity of P_G.

b) If G is boundedly compact, then P_G is always upper semi-continuous and Hausdorff metric continuity of P_G is equivalent to its lower semi-continuity.

c) If G is Čebyšev (i.e., for every x ∈ E, P_G(x) is the one-point set {π_G(x)}), the upper semi-continuity, the lower semi-continuity and the Hausdorff metric continuity of P_G are all equivalent to the continuity of π_G.

There arises naturally the problem of the characterization of those proximinal linear subspaces G of a normed linear space E, for which P_G is upper or lower semi-continuous or Hausdorff metric continuous. From b) above it follows, in particular, that for every finite-dimensional linear subspace G of a normed linear space E, P_G is upper semi-continuous. For subspaces of finite codimension, we have the following generalization of theorem 4.5, due to P. D. Morris [66]:

Theorem 4.22.
For a proximinal linear subspace G of finite codimension of a normed linear space E the following statements are equivalent:
1°. P_G is upper semi-continuous and P_G(x) is compact for every x ∈ E.
2°. π_G^{-1}(0) is boundedly compact.

On the other hand, A. L. Brown ([18], proposition 1.1) has observed:

Proposition 4.9. For a proximinal linear subspace G of a normed linear space E, P_G is lower semi-continuous if and only if the relations x_n → x, g_0 ∈ P_G(x) imply ρ(g_0, P_G(x_n)) → 0.

For the spaces E = C_R(Q) we have the following results, due to
P. D. Morris [66] and respectively to J. Blatter, P. D. Morris and D. E. Wulbert [7] (recently, the proof of the latter has been simplified by A. L. Brown [18]):

Theorem 4.23. a) If G is a pseudo-Čebyšev subspace of finite codimension n ≥ 2 of E = C_R(Q) (Q compact), then P_G is not upper semi-continuous.

b) For a pseudo-Čebyšev subspace G of E = C_R(Q), in order that P_G be lower semi-continuous it is necessary, and if P_G is upper semi-continuous, also sufficient, that for every x ∈ π_G^{-1}(0) the set

(4.25)    Z(P_G(x)) = {q ∈ Q | g_0(q) = 0 for all g_0 ∈ P_G(x)}

be open.

Part a) is a generalization of theorem 4.7 a). Blatter, Morris and Wulbert [7] have proved the following corollary of part b):
Corollary 4.3. For a compact space Q the following statements are equivalent:
1°. Q is connected.
2°. Every pseudo-Čebyšev subspace G of E = C_R(Q), such that P_G is lower semi-continuous, is a Čebyšev subspace.
3°. Every one-dimensional subspace G of E = C_R(Q), such that P_G is lower semi-continuous, is a Čebyšev subspace.
For some related results see also B. Brosowski, K.-H. Hoffmann, E. Schäfer and H. Weber [14].

For the spaces E = L^1_R(T, ν), A. Lazar, D. E. Wulbert and P. D. Morris [56] have proved

Theorem 4.24. For an n-dimensional linear subspace G of E = L^1_R(T, ν), where (T, ν) is a σ-finite positive measure space, P_G is lower semi-continuous if and only if there do not exist β ∈ G^⊥ \ {0} and g ∈ G with the following three properties:
α) The set S(β) = {t ∈ T | |β(t)| < ‖β‖} is purely atomic (i.e., ν(S(β) \ ∪ {A : A an atom, A ⊂ S(β)}) = 0) and contains at most n − 1 atoms.
β) S(β) ⊂ Z(g) = {t ∈ T | g(t) = 0}.
γ) T \ Z(g) is not the union of a finite family of atoms.

Corollary 4.4. If G is a finite-dimensional non-Čebyšev linear subspace of E = L^1_R(T, ν) (where (T, ν) is a σ-finite positive measure space) such that no g ∈ G \ {0} has T \ Z(g) purely atomic, consisting of a finite number of atoms (in particular, if (T, ν) has no atoms and hence any finite-dimensional G is non-Čebyšev by theorem 3.9), then P_G is not lower semi-continuous.
Let us consider now the normed linear spaces in which for every proximinal subspace G the set-valued metric projection P_G has one of the above semi-continuity or continuity properties. From [80], theorem 1 and implication 5° ⇒ 3° of theorem 3, there results the following extension of a remark made after proposition 4.2 above:

Theorem 4.25. For all proximinal linear subspaces G of a reflexive Banach space with property (H), P_G is upper semi-continuous.

On the other hand, A. L. Brown [17] has proved the equivalence 1° ⇔ 3° (and introduced "property (P)") and Blatter, Morris and Wulbert [7] observed the other equivalence of

Theorem 4.26. For a normed linear space E the following statements are equivalent:
1°. For every finite-dimensional linear subspace G of E, P_G is lower semi-continuous.
2°. For every one-dimensional linear subspace G of E, P_G is lower semi-continuous.
3°. E has the following property:
(P) For every pair of elements x, z ∈ E such that ‖x + z‖ ≤ ‖x‖, there exist constants b = b(x, z) > 0, c = c(x, z) > 0 such that
It is natural to ask which normed linear spaces E have property (P). A. L. Brown [17] has proved a) and b), and Blatter [5] and Blatter-Morris-Wulbert [7] have proved the other statements of

Proposition 4.10. a) Every strictly convex normed linear space has property (P).

b) Every finite-dimensional normed linear space E, in which the unit cell S_E is a polyhedron (i.e., the intersection of a finite number of half-spaces or, what is equivalent, the convex hull of a finite number of points) has property (P).

c) C_R(Q) (Q compact) has property (P) if and only if Q is finite.

d) c_0 has property (P).

e) If (T, ν) is a σ-finite positive measure space such that T is not the union of a finite number of atoms, then L^1_R(T, ν) does not have
property (P).

4.8. Continuous selections and linear selections for set-valued metric projections.

We recall that if G and E are metric spaces, a continuous mapping u : E → G is said to be a continuous selection for a set-valued mapping U : E → 2^G if u(x) ∈ U(x) for all x ∈ E. If G is a linear subspace of a normed linear space E, one can define a linear selection for P_G in a similar way. By theorem 4.1 b) or c), if G is proximinal, every linear selection for P_G is continuous.
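A small Python sketch may make the notions of continuous selection and linear selection concrete. It assumes, purely for illustration, that E is the plane with the max norm and G is the horizontal axis, so that P_G((a, b)) = {(t, 0) : |t − a| ≤ |b|}; the function names are hypothetical. The midpoint map (a, b) → (a, 0) is a linear selection, while the endpoint map (a, b) → (a + |b|, 0) is a continuous selection that is not linear.

```python
def in_projection(x, g, tol=1e-12):
    """True if g = (t, 0) belongs to P_G(x) for G = {(t, 0)} in the max norm."""
    a, b = x
    t, second = g
    return second == 0 and abs(t - a) <= abs(b) + tol

def midpoint_selection(x):
    """Linear selection: the midpoint of the segment P_G(x)."""
    return (x[0], 0.0)

def endpoint_selection(x):
    """Continuous but non-linear selection: the right endpoint of P_G(x)."""
    return (x[0] + abs(x[1]), 0.0)

def is_additive_on(sel, x, y):
    """Check sel(x + y) == sel(x) + sel(y) for a single pair x, y."""
    s = sel((x[0] + y[0], x[1] + y[1]))
    sx, sy = sel(x), sel(y)
    return s == (sx[0] + sy[0], sx[1] + sy[1])

if __name__ == "__main__":
    x, y = (1.0, 2.0), (3.0, -2.0)
    print(in_projection(x, midpoint_selection(x)), in_projection(x, endpoint_selection(x)))
    print(is_additive_on(midpoint_selection, x, y), is_additive_on(endpoint_selection, x, y))
```

The additivity check on a single pair is of course only a counterexample test for the endpoint map, not a proof of linearity for the midpoint map; linearity of (a, b) → (a, 0) is evident from the formula.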
In order to characterize the proximinal linear subspaces G of a normed linear space E, for which P_G admits a continuous selection or a linear selection, we define a set-valued mapping

It is easy to see that V_G(x + G) ∈ 2^{π_G^{-1}(0)}, i.e., is non-void and closed. Indeed, V_G(x + G) ≠ ∅ since G is proximinal.
Furthermo-
?I (x ) --, x E E , then since ~ - ' ( 0 )i s closed, we have G n G whence x = x -71 (x) E V (x+G), which proves that VG(x+G) G G is closed. Observe that if G is a Eebyiev subspace of E, then VG(x+G)
r e , if
x
-1 n xEXG (0),
-
{ w- '(x))
i s the one-point set
=
( x- T ~ ( x ) ).
where u = w
(0)
(see section 4.2). Also, V_G is nothing else than the set-valued mapping E/G → 2^{π_G^{-1}(0)} induced by I − P_G, where I is the identical mapping of E onto itself.

We have the following generalization of theorem 4.2 (which, in the particular case when E/G is reflexive, was essentially proved in [84], theorem 3):

Theorem 4.27. For a proximinal linear subspace G of a normed linear space E, P_G : E → 2^G admits a continuous selection if and only if the mapping V_G : E/G → 2^{π_G^{-1}(0)} defined by (4.21) admits a continuous selection.

Indeed, this can be proved either similarly to the above proof of theorem 4.2, or using, in the necessity part, a theorem of Bartle and Graves (see [63]) according to which the mapping ω_G^{-1} : E/G → 2^E defined by

always admits a continuous selection, and then putting
where π_G^{(0)} is a continuous selection for P_G.

From theorem 4.27 we obtain the following generalization of corollary 4.1, the first part of which is due to J. Lindenstrauss [57] and the second part to [84]:

Corollary 4.5. Let E be a normed linear space. Then
a) For a σ(E*, E)-closed (hence proximinal) linear subspace Γ of E*, P_Γ admits a continuous selection if and only if the (set-valued) extension map φ ∈ (Γ_⊥)* → {f ∈ E* | f|_{Γ_⊥} = φ, ‖f‖ = ‖φ‖} admits a continuous selection.
b) If G is a proximinal linear subspace of E, such that the extension map φ ∈ (G^⊥)* → {Φ ∈ E** | Φ|_{G^⊥} = φ, ‖Φ‖ = ‖φ‖} admits a continuous selection, then P_G admits a continuous selection.

Let us observe that one can also obtain relations between the semi-continuity properties of the set-valued mappings P_G : E → 2^G and V_G : E/G → 2^{π_G^{-1}(0)} (as well as corollaries of the above type).
The results on lower semi-continuity are particularly useful because of the following theorem on continuous selections, due to E. A. Michael ([63], theorem 3.2"): If E, G are Banach spaces, every lower semi-continuous U : E → 2^G such that U(x) is convex for each x ∈ E, admits a continuous selection. Hence, in particular, if G is a proximinal linear subspace of a Banach space E, such that P_G is lower semi-continuous, then P_G admits a continuous selection; the converse is not true, even if dim G = 1. From this observation and from the results of the preceding section there follow sufficient conditions on a given G in order that P_G admit a continuous selection and sufficient conditions on E in order that for all subspaces G of E, P_G admit a continuous selection. Conversely, in some cases the results on non-lower semi-continuity of P_G can be sharpened to non-existence of continuous selections for P_G. For example, A. Lazar, D. E. Wulbert and P. D. Morris have proved the following partial sharpening of corollary 4.4 ([56], theorem 1.4):

Theorem 4.28. If G is any finite-dimensional subspace of E = L^1_R(T, ν), where (T, ν) is a positive measure space having no atoms, then P_G admits no continuous selection.
We have the following characterizations of the one-dimensional linear subspaces G of E = ℓ^1_R and E = C_R(Q) for which P_G admits a continuous selection, due to A. Lazar ([55], lemma 5.2) and respectively A. Lazar, D. E. Wulbert and P. D. Morris ([56], proposition 2.6):

Proposition 4.11. a) For the one-dimensional linear subspace G of E = ℓ^1_R spanned by an element g = {γ_n}, P_G admits a continuous selection if and only if there do not exist two disjoint subsets N_1, N_2 of N = {1, 2, 3, ...} such that

b) For the one-dimensional linear subspace G = [g] of E = C_R(Q) (Q compact), spanned by an element g of norm 1, P_G admits a continuous selection if and only if
α) card Fr Z(g) ≤ 1, where Z(g) = {q ∈ Q | g(q) = 0},
β) q ∈ Fr Z(g) implies that there exists a neighborhood of q on which g is either non-positive or non-negative.

Let us also mention the following recent results of A. L. Brown
([18], theorems 2.8 and 3.10):

Theorem 4.29. a) If G is a "Z-subspace" of E = C_R(Q) (Q compact), i.e. a closed linear subspace such that Int Z(g) = ∅ for all g ∈ G \ {0}, then either there is no continuous selection for P_G or there is a unique one.

b) There exists a 5-dimensional Z-subspace G of E = C_R([−1, +1]) which contains the constants, is non-Čebyšev and such that P_G admits a unique continuous selection.
The latter result (which disproves a claim of A. Lazar, D. E. Wulbert and P. D. Morris: [56], theorem 2.1) shows that in the particular case when dim G < ∞ (and hence P_G is upper semi-continuous), the implication 1° ⇒ 2° of corollary 4.3 cannot be sharpened so as to assume only existence of a continuous selection for P_G instead of the lower semi-continuity of P_G.

Concerning linear selections for P_G we have (see [82], p. 142):

Theorem 4.30. For every proximinal hyperplane G in a normed linear space E, P_G admits a linear selection.

By theorem 4.1, if for a proximinal linear subspace G of a normed linear space E, P_G admits a linear selection π_G^{(0)}, then π_G^{(0)} is a continuous linear projection of E onto G and hence G is complemented. Obviously, the converse is not valid. Finally, let us mention that one can give a characterization of proximinal linear subspaces G for which P_G admits a linear selection, generalizing theorem 4.18 in a similar way as we generalized theorem 4.2 by theorem 4.27, and one can then prove also a corollary corresponding to corollary 4.5. Also, one can define weakly continuous, Lipschitzian and differentiable selections for P_G and obtain for them similar extensions of the preceding results.
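Theorem 4.30 can be illustrated in a simple finite-dimensional case. The Python sketch below is an illustration under assumptions made here, not the proof referred to in [82]: E is R^3 with the ℓ^1 norm, G is the hyperplane {x : f(x) = 0} with f(x) = x_1 + 2x_2 + 3x_3, and z = (0, 0, 1) is a unit vector with f(z) = ‖f‖ = 3 (the dual norm of f being the largest absolute value of its coefficients). Then s(x) = x − (f(x)/f(z)) z is linear, takes its values in G, and satisfies ‖x − s(x)‖ = |f(x)|/‖f‖, which equals the distance from x to G by the classical formula for the distance to a hyperplane. All names in the code are hypothetical.

```python
COEFFS = (1.0, 2.0, 3.0)   # f(x) = x1 + 2*x2 + 3*x3 (example functional, an assumption)
Z = (0.0, 0.0, 1.0)        # unit vector for the l^1 norm at which f attains its norm

def f(x):
    return sum(c * xi for c, xi in zip(COEFFS, x))

def l1_norm(x):
    return sum(abs(xi) for xi in x)

F_NORM = max(abs(c) for c in COEFFS)   # dual norm of f with respect to the l^1 norm

def linear_selection(x):
    """s(x) = x - (f(x)/f(Z)) * Z; linear in x, with values in G = ker f."""
    lam = f(x) / f(Z)
    return tuple(xi - lam * zi for xi, zi in zip(x, Z))

def distance_to_hyperplane(x):
    """dist(x, ker f) = |f(x)| / ||f||."""
    return abs(f(x)) / F_NORM

if __name__ == "__main__":
    x = (1.0, -2.0, 4.0)
    g = linear_selection(x)
    err = l1_norm(tuple(xi - gi for xi, gi in zip(x, g)))
    print(g, f(g), err, distance_to_hyperplane(x))
```

The construction works for any hyperplane whose defining functional attains its norm, which is the case exactly when the hyperplane is proximinal.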
5. Best approximation by elements of non-linear sets

5.1. Best approximation by elements of convex sets.

By a non-linear set in a normed linear space E we mean any set G ⊂ E which is not a "linear manifold", i.e. which is not of the form x + G_0, where x ∈ E and where G_0 is a linear subspace of E. Since best approximation by elements of linear manifolds can be reduced, by a simple translation, to best approximation by elements of linear subspaces, we shall not consider here this problem, but refer the reader to [82], pp. 135-140 and 242-246. We want to present here, briefly, some directions of research on best approximation by elements of non-linear sets. Note that the existing results in this field do not yet constitute a unified theory (as is the theory of best approximation by elements of linear subspaces) and the construction of such a theory in general normed linear spaces is only at its beginning.

The first natural step when passing from best approximation in normed linear spaces E by elements of linear sets G ⊂ E to non-linear sets is to take as G a convex set in E. The following extension of §1, theorem 1.1, to this case has been given, for real scalars, by G. Rubinstein [75] and Ch. Roumieu ([74], proposition 5) and for complex scalars in [82], pp. 360-361 and [22], [37] (independently):

Theorem 5.1. Let G be a convex set in a normed linear space E, and x ∈ E \ Ḡ, g_0 ∈ G. We have g_0 ∈ P_G(x) if and only if there exists an f ∈ E* with the following properties:
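In the Euclidean plane the functional f of theorem 5.1 can be taken to be the unit vector pointing from g_0 to x. The Python sketch below checks the characterization for an example chosen here (G the unit square; all names are hypothetical). It assumes the conditions in their usual form, namely ‖f‖ = 1, f(g_0 − g) ≥ 0 for all g ∈ G, and f(x − g_0) = ‖x − g_0‖; these are stated as an assumption of the sketch.

```python
import math

def project_onto_square(x):
    """Best approximation of x in G = [0, 1] x [0, 1] for the Euclidean norm (coordinate clipping)."""
    return tuple(min(1.0, max(0.0, xi)) for xi in x)

def separating_functional(x, g0):
    """Unit vector f = (x - g0)/|x - g0|, viewed as the functional v -> <f, v>."""
    d = math.dist(x, g0)
    return tuple((xi - gi) / d for xi, gi in zip(x, g0))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def check_characterization(x, g0, samples, tol=1e-12):
    """Check |f| = 1, <f, g0 - g> >= 0 on the sample points of G, and <f, x - g0> = |x - g0|."""
    f = separating_functional(x, g0)
    diff = tuple(xi - gi for xi, gi in zip(x, g0))
    unit = abs(math.hypot(*f) - 1.0) <= tol
    supports = all(dot(f, (g0[0] - g[0], g0[1] - g[1])) >= -tol for g in samples)
    attains = abs(dot(f, diff) - math.dist(x, g0)) <= tol
    return unit and supports and attains

if __name__ == "__main__":
    x = (2.0, 0.5)
    g0 = project_onto_square(x)   # (1.0, 0.5)
    grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]
    print(g0, check_characterization(x, g0, grid))
```

For a point of G that is not a best approximation, such as the corner (1, 0) for the same x, the second condition fails on the sample grid, so the check returns False.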
This theorem admits the following geometric interpretation, observed by V. N. Burov (see [82], p. 362): g_0 ∈ P_G(x) if and only if there exists a real hyperplane H which separates G from S(x, ‖x − g_0‖). Clearly, such a hyperplane H must pass through g_0 and support the cell S(x, ‖x − g_0‖). The particular case of theorem 5.1, when G is a convex cone, was also considered by G. S. Rubinstein (see [75], pp. 362-363); another characterization theorem for best approximation by elements of convex cones has been given by G. Godini [35].

For finite-dimensional convex sets F. R. Deutsch and P. H. Maserick [22] and, independently, S. Ia. Havinson [37] have proved the following extension of §1, theorem 1.6:

Theorem 5.2. Let G be an n-dimensional convex set in a normed linear space E and let x ∈ E \ Ḡ, g_0 ∈ G. We have g_0 ∈ P_G(x) if and only if there exist h extremal points f_1, ..., f_h of S_{E*}, where 1 ≤ h ≤ n + 1 if the scalars are real and 1 ≤ h ≤ 2n + 1 if the scalars are complex, and h numbers λ_1, ..., λ_h > 0 with Σ_{j=1}^{h} λ_j = 1, such that

Actually, this follows from theorem 5.1 in the same way as §1, theorem 1.6 follows from §1, theorem 1.1.

The second main characterization theorem of §1 (theorem 1.8) remains valid in the case when G is a convex set in E; this was observed by A. L. Garkavi (see [82], p. 360) and, independently, by G. Choquet (unpublished).
Some other characterizations of elements of best approximation by elements of convex sets G have been obtained by P. J. Laurent [52] (see also [54]), who has used the convex cones of displacement (introduced by A. J. Dubovitskii and A. A. Miliutin [24]) and by J. J. Moreau [65], who has used the tools of the theory of convex functions (e.g. subdifferentials, indicatrices, etc., see [64] and [42]).

It was observed in [82], p. 360, that several results on existence and uniqueness of elements of best approximation (for example: §2, theorem 2.3, with S_G replaced by "all bounded subsets of G"; §2, theorem 2.6 with the following addition: 6°. All closed convex subsets of E are proximinal; §3, theorem 3.11 with the following addition: 4°. All closed convex subsets of E are Čebyšev sets) and on properties of the mapping π_G not involving the linearity of G, remain valid for the case when G is a convex set in E. However, a systematic extension of the results on best approximation by elements of linear subspaces G to convex sets G has not yet been accomplished. We mention that some results on existence and uniqueness of elements of best approximation have been extended to convex cones by G. Godini [35]. Some difficulties which arise at the best approximation by elements of certain finite-dimensional convex cones in E = C_R([0, 1]) have been pointed out by J. R. Rice (see [82], p. 363). Some results on the upper semi-continuity of P_G for closed convex sets G in reflexive Banach spaces have been given by E. V. Ošman [67].

5.2. Best approximation by elements of N-parameter sets.
One of the most important classical problems of best approximation by elements of non-linear sets is that of best rational approximation in E = C_R([a, b]), raised by P. L. Čebyšev (see N. I. Ahiezer [1], Ch. II), i.e. the problem of best approximation in E = C_R([a, b]) by elements of the set

where n, m are given positive integers and z is a given function in E = C_R([a, b]), such that z(t) > 0 (t ∈ [a, b]); obviously, in the particular case when m = 1 and z(t) ≡ 1, this problem reduces to that of best approximation by elements of the n-dimensional linear subspace G_n = [1, t, ..., t^{n−1}] of E = C_R([a, b]). A slightly more general problem is that of best approximation in E = C_R([a, b]) by elements of the set

where G_1, G_2 are given finite-dimensional linear subspaces of E = C_R([a, b]) and where z is as above. One can also replace the condition g_2(t) ≠ 0 (t ∈ [a, b]) by weaker ones. A further generalization of the problem consists in replacing the interval [a, b] by a compact space Q and the set R_{G_1, G_2} by an "N-parameter set", i.e. by a set of the form

where P is a subset of a real N-dimensional Banach space (N < ∞), say B_N. The aim is to find classes of sets (5.8) such that the known results of the theory of rational approximation or of convex approximation (in particular, of linear approximation), e.g. the alternation theorem, the characterization theorem of Kolmogorov (see §1, theorem 1.9), uniqueness theorems, etc., remain valid for the sets G of these classes. For this purpose, there have been introduced, by various authors, "interpolating" N-parameter sets G ⊂ E = C_R(Q) (i.e. having the property described in §3, definition 3.4, with n = N; they are also called "unisolvent" N-parameter sets), "locally unisolvent" and "asymptotically convex" N-parameter sets G ⊂ E = C_R(Q), and, in an attempt to include also other important non-linear sets G of approximating functions (e.g. of functions of the type g(t) = ...), "varisolvent" N-parameter sets. The problem of necessary and sufficient conditions on (5.8) in order that a certain known theorem on rational approximation remain valid for (5.8) has been also studied; for example, local unisolvence is necessary and sufficient in order that the alternation theorem remain valid. The literature of these approximation problems in E = C_R([a, b]) and in E = C_R(Q) (Q compact) is very vast; the reader may consult the monographs of N. I. Ahiezer [1], E. W. Cheney [19], G. Meinardus [62], J. R. Rice [70] and B. Brosowski [9], and the papers in the bibliographies of these monographs.

From the above it is clear that it would be important to develop a theory of best approximation by N-parameter sets G in a general normed space E, i.e. by sets
where P is a subset of a real N-dimensional Banach space B_N; naturally, such a theory would include, as particular cases, the above theories. This problem was first raised in [82], p. 137, and it is difficult even when dim E < ∞. Two different approaches to this problem have been proposed by J. R. Rice [71], [72], [70] (see also [82], pp. 371-374) and D. E. Wulbert [96], [97], [98], respectively. Both authors agree in pointing out the importance of the particular case when (5.9) is a manifold and obtain more results for this case; unfortunately, this does not include completely the rational approximation, since the set G = R_{n,m} defined by (5.6) need not be a manifold in E = C_R([a, b]), even when n = m = 2. While the approach of J. R. Rice points out the importance of the concept of "curvature", and insists more on the case when dim E < ∞, that of D. E. Wulbert emphasizes the utility of "boundedly connectedness". Among other results, D. E. Wulbert [98] has obtained a characterization of those n-dimensional C^1-submanifolds G of E = C_R(Q) (Q compact) which are Čebyšev sets and satisfy a certain additional condition. We shall not enter here into more details. Let us only mention that often differentiability is used to linearize the problem and to draw from the known linear results for the non-linear case, by observing that "local" best approximation (i.e. minimizing ‖x − g‖ on a neighbourhood of g_0) is equivalent to best approximation by elements of the "tangent" linear manifold to G passing through g_0; naturally, this contains best approximation by elements of linear subspaces G as a particular case, since it is known (see [82], p. 90) that for linear subspaces G any element of local best approximation is already in P_G(x), i.e. is a "global" best approximation.

5.3. Generalizations.
The problem of best approximation by elements of N-parameter sets admits further generalizations. We shall mention here two directions of such generalizations. Both of them have been considered first in E = C_R(Q) and then in general normed linear spaces.

A) The set P of parameters {α_1, ..., α_N} in (5.8) can be replaced by a subset P of an infinite-dimensional normed linear space F; usually P is assumed to be open in F. For this case, assuming also Fréchet differentiability with respect to the parameter, G. Meinardus and D. Schwedt have given in E = C_R(Q) a necessary condition for an element of best approximation (see [62], p. 140, theorem 89, o; 28, theorem 5), which extends the necessity part of §1, theorem 1.9. In general this condition is not a sufficient one and the problem of characterization of the sets G ⊂ E = C_R(Q) for which this condition is also sufficient, raised by B. Brosowski [10], has been solved recently by B. Brosowski and R. Wegmann [15]. P. J. Laurent [53] has given in an arbitrary normed linear space E, under similar assumptions, necessary conditions for an element of best approximation which extend the necessity parts of §1, theorems 1.1 and 1.6.

B) The set P of parameters can be completely omitted and one can consider the problem of finding classes of sets G ⊂ E such that the results of the theory of N-parameter approximation or of convex approximation (in particular, of linear approximation) remain valid for the sets G of these classes. The notions of interpolating (= unisolvent), locally unisolvent, asymptotically convex and varisolvent sets G, mentioned in section 5.2, do not solve the problem in E = C_R(Q), since they assume that G is an N-parameter set. B. Brosowski [9] has introduced the notion of a "regular" set G in E = C_R(Q), which is independent of the notion of N-parameter set, and has proved that for a set G ⊂ C_R(Q) Kolmogorov's criterion (§1, theorem 1.9) gives a necessary and sufficient condition in order to have g_0 ∈ P_G(x) if and only if G is a regular set (we recall that, as shown in §1, for any set G Kolmogorov's criterion gives a sufficient condition in order that g_0 ∈ P_G(x)).
A set G ⊂ C_R(Q) is called [9] regular, if for every pair of elements g, g_0 ∈ G, every λ > 0 and every closed subset A ⊂ Q such that g(q) − g_0(q) ≠ 0 (q ∈ A), there exists an element g_λ ∈ G such that

(5.10)    sign (g(q) − g_0(q)) = sign (g_λ(q) − g_0(q))    (q ∈ A),

From the above result on Kolmogorov's criterion it follows, in particular, that every convex set G, every set G = R_{G_1, G_2} of the form (5.7), every varisolvent N-parameter set G and every asymptotically convex set G in E = C_R(Q) is regular (naturally, this can be deduced also directly from the definitions; see [9]). Some other problems of best approximation in E = C_R(Q) by elements of regular sets G (for example, uniqueness), have been also studied by B. Brosowski [9].

The similar problem (of finding suitable classes of sets G and studying best approximation by elements of the sets G belonging to these classes) in arbitrary normed linear spaces has been also attacked successfully by B. Brosowski in a series of papers. Let us give here an example of the results obtained in this direction. We recall that a set G in a normed linear space E is called [48] an α-sun, respectively a β-sun, if for every x ∈ E and every g_0 ∈ P_G(x) we have

(5.12)    g_0 ∈ P_G(λx + (1 − λ)g_0)    (1 < λ < ∞),

respectively if for every x there exists a g_0 ∈ P_G(x) such that we have (5.12).

Theorem 5.3. For a set G in a normed linear space E the following statements are equivalent:
lo
.
go€
PG(x) implies
I. Singer
g(sEd denotes
where
.
'3
f(x-go) =
( ~ E E * ~ I I ~ I= I 1.
=
x-go
the s e t of all extremal points of S and where E*
F o r every p a i r of elements g, g
0
and every O - ( E ~ E ) -closed s e t A C that Re that -
f(g-g ) 0
>
0 for a l l
11 P A . -
(5.15)
11 x-g0ll) .
goIl <
The equivalence 1-' the equivalence r a l l y , a set G
low '3 C
regular. Theorem 5 . 4 .
f
€A
e G,
every
( S d containing
A>
x- go
0 , x 6 E\G
and such
, there exists an element g E G X
9
A*
2Ohas been proved by B. Brosowski Ell] and by B. Brosowski and R. Wegmann [15]
.
Natu-
E = C (Q) has the above properties if and only if it i s R F o r a s e t G in a normed linear space E the following
stgtements a r e equivalent :
1.
G
(x)=J,implies the existence of an element g o ~ p G ( x )
tisfying (5. 13).
is proximinal and i f
(x) is compact f o r every x GE, , t . e PG statements a r e equivalent t o the following : IfG
I. Singer 3*. F o r every
x
77
E +2b
€
E and r
> 0 the set-valued mapping Ax , r'.
defined by
has - a fired point (i. e. , a point y E E such that y o ~ A xr(yo)). , 0
The equivalence 1° ⇔ 2° has been proved by B. Brosowski [11] and the equivalence 1° ⇔ 3° by B. Brosowski, K.-H. Hoffmann, E. Schäfer and H. Weber [13]. Some other characterizations of the above classes of sets in terms of fixed points of set-valued mappings have been given by B. Brosowski [12]. A localized Kolmogorov type criterion, in terms of cones of displacement, has been also given by B. Brosowski [11] and the sets G ⊂ E for which this criterion is necessary and sufficient in order that g_0 ∈ P_G(x) have been characterized by B. Brosowski and R. Wegmann [15]. For further results on best approximation by elements of the above classes of sets G in arbitrary and in some concrete normed linear spaces E we refer the reader to the papers of B. Brosowski.

5.4. Best approximation by elements of arbitrary sets.

A) A characterization of elements of best approximation. The following characterization of elements of best approximation in terms of fixed points of a set-valued mapping was given by O. Brandt [8] and B. Brosowski [12]:
Theorem 5. 5. E, X E E \ ~
L e t E be a normed linear space, G a n a r b i t r a r y s e t in
CpG );( if and -+ 2' defined by
g P G . We have g 0
fixed point of the mapping
BX. G
0
only if g >s a 0-
I.. Singer (i. e. , -
Moreover
got
(5t18)
&&(go )
. in this case we have
PG(').
B) Some problems on existence of elements of best approximation by elements of closed sets. One of the problems studied recently is that of finding the Banach spaces E with the property that for every closed set G ⊂ E the set

is dense in E. S. B. Stečkin and M. Edelstein [26] have proved that every uniformly convex Banach space E has this property. This result has been slightly extended by D. E. Wulbert [96], who has proved that every Banach space E "with property (2R)" also has the above property. We recall (see e.g. [21]) that a Banach space E is said to have property (2R) if every sequence {x_n} ⊂ E such that lim_{n,m→∞} ‖x_n + x_m‖ = 2 is a Cauchy sequence (and hence convergent); clearly, every uniformly convex space has property (2R), but the converse is not true. Since every space with property (2R) (and hence every uniformly convex space) is strictly convex, it is natural to ask whether there exist non-strictly convex spaces E with the above property (i.e., such that for every closed set G ⊂ E the set D(π_G) is dense in E). D. E. Wulbert [96] has given an affirmative answer, by proving that every uniformly smooth Banach space E with property (H) (see §4, section 4.2) also has the above property. We recall (see e.g. [21]) that E is called uniformly smooth if for every η > 0 there exists an ε = ε(η) such that the relation

implies ‖x‖ + ‖y‖ ≤ (1 + η)‖x + y‖; D. E. Wulbert [96] has shown that there exist uniformly smooth spaces with property (H) which are not strictly convex.
I. Singer By the r e m a r k made at the end of 5 2 (on v e r y non-proximinal subspaces) a Banach space E with the above property must be reflexive. D. E: Wulbert
96
has raised the problem whether the converse is
true,. i. e. : Problem . 5 . 1 Does there exists a reflexive Banach space E containing a closed set G such that
D(?tG) is not dense in E ?
Some other problems related to existence of elements of best approximation are concerned with very non-proximinal sets (see §2, definition 2.2). M. Edelstein [27] has proved that in a separable conjugate space E* no closed bounded set is very non-proximinal. He has also shown [27] that in the separable space $E = c_0$ (which is not isomorphic to any conjugate Banach space) there do exist bounded very non-proximinal sets.
V. Klee (see [82], p. 371) has considered the problem of characterization of the classes $N_i$ ($i = 0, 1, 2, 3, 4$) of all normed linear spaces E which contain a very non-proximinal set G having respectively the following properties: (0) no additional property; (1) G is convex; (2) G is bounded and convex; (3) $E \setminus G$ is convex; (4) $E \setminus G$ is bounded and convex. V. Klee has made the following remarks: a) $N_1$ is the class of all non-reflexive spaces; b) $N_3 \supset N_1$; c) no Banach space is in $N_2$, but $N_2 \neq \emptyset$; d) $N_4 \neq \emptyset$; e) it is possible that $N_4$ (whence also $N_3$, $N_0$) coincides with the class of all normed linear spaces.
C) Some problems on uniqueness of elements of best approximation. For an arbitrary set G in a normed linear space E, S. B. Stečkin (see [82], p. 375) has studied the set
$U_G = \{x \in E : P_G(x) \text{ contains at most one element}\}$,
i.e. the set of all elements $x \in E$ which have at most one element of best approximation in G, and has obtained, among other results, the following "constructive characterization" of strictly convex spaces:
Theorem 5.6. A Banach space E has the property that for every set $G \subset E$ the set $U_G$ is dense in E if and only if E is strictly convex.
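The role of strict convexity in Theorem 5.6 can be illustrated numerically. The following Python sketch is not taken from the text: the plane with two different norms, the set G (a sampled segment of the horizontal axis), the point x and the helper name `nearest_points` are all assumptions chosen for illustration. In the maximum norm, which is not strictly convex, x has many elements of best approximation in G, so x does not belong to $U_G$; in the Euclidean norm the element of best approximation is unique.

```python
import numpy as np

def nearest_points(x, candidates, norm_ord, tol=1e-9):
    """Return the candidates whose distance to x is minimal (up to tol), and that distance."""
    d = np.linalg.norm(candidates - x, ord=norm_ord, axis=1)
    return candidates[d <= d.min() + tol], d.min()

# Hypothetical example: G is a sampled piece of the horizontal axis in the plane.
t = np.linspace(-2.0, 2.0, 401)
G = np.column_stack([t, np.zeros_like(t)])
x = np.array([0.0, 1.0])

best_max, d_max = nearest_points(x, G, np.inf)  # maximum norm: not strictly convex
best_euc, d_euc = nearest_points(x, G, 2)       # Euclidean norm: strictly convex

print(len(best_max), "nearest points in the maximum norm")
print(len(best_euc), "nearest point in the Euclidean norm")
```

In the maximum norm every sampled point (t, 0) with |t| ≤ 1 lies at distance 1 from x, whereas in the Euclidean norm only the origin does.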
Furthermore, S. B. Stečkin has proved that if E is a strictly convex Banach space, then for every boundedly compact set $G \subset E$ the set $U_G$ is of the second category in E. However, it is not known whether this also holds for every set $G \subset E$; it is also unknown whether from the fact that for every compact $G \subset E$ the set $U_G$ is either dense in E or of the second category in E it follows that E is strictly convex.
Finally, we mention the following results of Stečkin: if E is a locally uniformly convex Banach space, then for every $G \subset E$ the set $U_G$ is of the second category, and if E is a uniformly convex Banach space, then for every closed set $G \subset E$ the set $D(\pi_G) \cap U_G$ is of the second category. However, it is not known whether the second result remains valid also for locally uniformly convex spaces.
We conclude this section with a famous classical problem, namely, the problem of convexity of Čebyšev sets. We have seen in section 5.1 that a Banach space E has the property that every closed convex set $G \subset E$ is a Čebyšev set if and only if E is reflexive and strictly convex. It is natural to ask what are the Banach spaces E in which the converse property holds, i.e. in which every Čebyšev set $G \subset E$ is convex. This problem has been solved only for 3-dimensional spaces E (see [82], p. 364), namely, E has this property if and only if every exposed point of $S_E$ (see §3, section 3.2) admits a unique maximal functional of norm 1. For Banach spaces E of finite dimension $m \ge 4$ it is only known that the smoothness of E is a sufficient but not necessary condition for the convexity of all Čebyšev sets $G \subset E$.
For infinite-dimensional Banach spaces E the problem is considerably more difficult, even the answer to the following problem being unknown:
Problem 5.2. In a Hilbert space $\mathcal{H}$, is every Čebyšev set necessarily convex?
V. Klee has conjectured that the answer is negative and has proved (see [82], p. 370) that in every infinite-dimensional Hilbert space $\mathcal{H}$ there exist non-convex closed semi-Čebyšev sets. On the other hand, much work has been done towards a positive answer. L. P. Vlasov has observed (see [82], p. 366) that in a smooth normed linear space E every Čebyšev set G which is an α-sun (see section 5.3) is convex (the converse is immediate) and thus the problem reduces to proving that every Čebyšev set is an α-sun. With an ingenious application of Schauder's fixed point theorem, L. P. Vlasov has proved (see [82], p. 365) that in an arbitrary Banach space E every boundedly compact Čebyšev set G is an α-sun and hence, if E is smooth, G is convex. The assumption of bounded compactness of G in this result was weakened by N. V. Efimov and S. B. Stečkin and others (see [82], pp. 368-369), under additional restrictions on the space E (e.g. uniform convexity). An important step in this direction was the idea of V. Klee of imposing continuity conditions on the metric projection $\pi_G$ onto G rather than imposing conditions directly on the Čebyšev set G; in this way, for all classes of Čebyšev sets G for which $\pi_G$ has the required continuity properties, it follows that the sets G in those classes are convex. L. P. Vlasov [93] has proved:
Theorem 5.7. If E is a Banach space such that the conjugate space E* is strictly convex (in particular, if E is a smooth Banach space), then every Čebyšev set $G \subset E$ with continuous metric projection $\pi_G$ is convex.
For Hilbert spaces E. Asplund [2] has shown that it is sufficient here to assume that $\pi_G$ is continuous from the norm topology to the weak topology. Also, E. Asplund [2] has proved:
Theorem 5.8. If G is a Čebyšev set in a Hilbert space $\mathcal{H}$ such that every closed half-space intersects G in a proximinal set, then G is convex.
These two theorems contain as particular cases the previously known results, since e.g. every boundedly compact Čebyšev set G satisfies the above hypotheses. Let us note that the arguments of E. Asplund [2] lean heavily on the tools of the theory of convex functions; some other uses of the theory of convex functions to problems of best approximation have been mentioned in section 5.1. For the continuity of metric projections onto Čebyšev sets see also D. E. Wulbert [94].
We mention that the above problems can be generalized in several ways, e.g. some of the above results remain valid if we replace the assumption that G is a Čebyšev set by the weaker assumption that G is a proximinal set such that for every $x \in E$ the set $P_G(x)$ is convex. For these problems and for other related results we refer the reader to [82], pp. 364-371, [83] and to the recent papers of L. P. Vlasov [88]-[93].
Finally, for some results and problems on best approximation in metric (not necessarily normed linear) spaces we refer to [82], pp. 377-391.
References
[1] N. I. Ahiezer, Lectures on the theory of approximation. Second edition, Moscow (1965) [Russian].
[2] E. Asplund, Čebyšev sets in Hilbert space. Trans. Amer. Math. Soc. 144 (1969), 236-240.
D. A. Ault, F. R. Deutsch, P. D. Morris and J. E. Olson, Interpolating subspaces in approximation theory. J. Approx. Theory 3 (1970), 164-182.
J. Blatter, Zur Stetigkeit von mengenwertigen metrischen Projektionen. Schriften des Rheinisch-Westf. Inst. für Instrum. Math. Univ. Bonn, Ser. A, No. 16 (1967), 17-38.
J. Blatter, Approximation und Selection. Habilitationsschrift, Bonn (1969).
J. Blatter, P. D. Morris and D. E. Wulbert, Continuity of the set-valued metric projection. Math. Ann. 178 (1968), 12-24.
O. Brandt, Geometrische Approximationstheorie in normierten Vektorräumen. Schriften des Rheinisch-Westf. Inst. für Instrum. Math. Univ. Bonn, Ser. A, No. 18 (1968), 1-36.
B. Brosowski, Nicht-lineare Tschebyscheff-Approximation. Bibliogr. Inst. Hochschulskripten Bd. 808/808a, Mannheim (1968).
B. Brosowski, Einige Bemerkungen zum verallgemeinerten Kolmogoroffschen Kriterium. Funktionalanalytische Methoden der numerischen Mathematik. ISNM 12, Birkhäuser-Verlag (1969), 25-34.
B. Brosowski, Nichtlineare Approximation in normierten Vektorräumen. Abstract spaces and approximation. ISNM 10, Birkhäuser-Verlag (1969), 140-159.
B. Brosowski, Fixpunktsätze in der Approximationstheorie. Mathematica 11 (34) (1969), 195-220.
B. Brosowski, K.-H. Hoffmann, E. Schäfer und H. Weber, Stetigkeitssätze für metrische Projektionen. Iterationsverfahren. Numerische Mathematik. Approximationstheorie. ISNM 15, Birkhäuser-Verlag (1970), 11-17.
[14] B. Brosowski, K.-H. Hoffmann, E. Schäfer und H. Weber, Metrische Projektionen auf lineare Teilräume von $C_0[Q, H]$. Iterationsverfahren. Numerische Mathematik. Approximationstheorie. ISNM 15, Birkhäuser-Verlag (1970), 19-27.
[15] B. Brosowski und R. Wegmann, Charakterisierung bester Approximationen in normierten Vektorräumen. J. Approx. Theory 3 (1970), 369-397.
[16] F. Browder, Multivalued monotone nonlinear mappings and duality mappings in Banach spaces. Trans. Amer. Math. Soc. 118 (1965), 338-351.
[17] A. L. Brown, Best n-dimensional approximation to sets of functions. Proc. London Math. Soc. 14 (1964), 577-594.
[18] A. L. Brown, On continuous selections for metric projections in spaces of continuous functions (to appear).
[19] E. W. Cheney, Introduction to approximation theory. McGraw-Hill, New York (1966).
[20] E. W. Cheney and D. E. Wulbert, Existence and unicity of best approximations. Math. Scand. 24 (1969), 113-140.
[21] M. M. Day, Normed linear spaces. Springer-Verlag, Berlin-Göttingen-Heidelberg (1962).
[22] F. R. Deutsch and P. H. Maserick, Applications of the Hahn-Banach theorem in approximation theory. SIAM Rev. 9 (1967), 516-530.
[23] F. R. Deutsch and J. Lambert, A bibliography on metric projections (mimeographed).
[24] A. I. Dubovitskii and A. A. Miliutin, Extremum problems in the presence of restrictions. Ž. Vyčisl. Mat. i Mat. Fiz. 5 (1965), 395-453 [Russian].
[25] N. Dunford and J. Schwartz, Linear operators. Part I: General theory. Interscience Publ., New York (1958).
[26] M. Edelstein, Nearest points of sets in uniformly convex Banach spaces. J. London Math. Soc. 43 (1968), 375-377.
[27] M. Edelstein, A note on nearest points. Quarterly J. Math. 21 (1970), 403-406.
[28] N. V. Efimov and S. B. Stečkin, Some properties of Čebyšev sets. Doklady Akad. Nauk SSSR 118 (1958), 17-19 [Russian].
[29] G. Ewald, D. G. Larman and C. A. Rogers, The directions of the line segments and of the n-dimensional balls on the boundary of a convex body in Euclidean space (to appear).
[30] Ky Fan and I. Glicksberg, Some geometric properties of the spheres in a normed linear space. Duke Math. J. 25 (1958), 553-568.
[31] A. L. Garkavi, The theory of best approximation in normed linear spaces. Mathematical analysis 1967, Moscow (1969), 75-132 [Russian].
[32] A. L. Garkavi, The problem of Helly and best approximation in the space of continuous functions. Izvestija Akad. Nauk SSSR 31 (1967), 641-656 [Russian].
[33] A. L. Garkavi, Compacta admitting Čebyšev systems of measures. Matem. Sbornik 74 (116) (1967), 209-217 [Russian].
[34] A. L. Garkavi, Characterization of Čebyšev subspaces of finite codimension in $L^1$. Matem. Zametki 7 (1970), 155-163 [Russian].
[35] G. Godini, Best approximation in normed linear spaces by elements of convex cones. Studii şi cercet. mat. 21 (1969), 931-936 [Romanian].
[36] A. Haar, Die Minkowskische Geometrie und die Annäherung an stetige Funktionen. Math. Ann. 78 (1918), 294-311.
[37] S. Ia. Havinson, On approximation by elements of convex sets. Doklady Akad. Nauk SSSR 172 (1967), 294-297 [Russian].
[38] R. B. Holmes, On the continuity of best approximation operators. Symp. on infinite dimensional topology. Princeton University Press (to appear).
[39] R. B. Holmes and B. R. Kripke, Smoothness of approximation. Michigan Math. J. 15 (1968), 225-248.
[40] Y. Ikebe, A characterization of Haar subspaces in C[a, b]. Proc. Japan Acad. 44 (1968), 219-220.
[41] Y. Ikebe, A characterization of best Tchebycheff approximations in function spaces. Proc. Japan Acad. 44 (1968), 485-488.
[42] A. D. Ioffe and V. M. Tihomirov, Duality of convex functions and extremal problems. Uspehi Mat. Nauk 23, 6 (144) (1968), 51-116 [Russian].
[43] R. C. James, Characterizations of reflexivity. Studia Math. 23 (1964), 205-216.
[44] M. I. Kadec, Topological equivalence of uniformly convex spaces. Uspehi Mat. Nauk 10, 4 (66) (1955), 137-141 [Russian].
[45] V. Klee, The support property of a convex set in a linear normed space. Duke Math. J. 15 (1948), 767-772.
[46] V. Klee, Research problem no. 5. Bull. Amer. Math. Soc. 63 (1957), 419.
[47] V. Klee, Convexity of Chebyshev sets. Math. Ann. 142 (1961), 292-304.
[48] V. Klee, Remarks on nearest points in normed linear spaces. Proc. Coll. on convexity (Copenhagen, 1965), Univ. of Copenhagen (1966), 168-176.
[49] A. N. Kolmogorov, A remark on the polynomials of P. L. Čebyšev deviating the least from a given function. Uspehi Mat. Nauk 3, 1 (23) (1948), 216-221 [Russian].
[50] C. A. Kottman and Bor-Luh Lin, On the weak continuity of metric projections. Michigan Math. J. 17 (1970), 401-404.
[51] B. R. Kripke and T. J. Rivlin, Approximation in the metric of $L^1(X, \mu)$. Trans. Amer. Math. Soc. 115 (1965), 101-122.
[52] P. J. Laurent, Théorèmes de caractérisation en approximation convexe. Mathematica 10 (33) (1968), 95-111.
[53] P. J. Laurent, Conditions nécessaires pour une meilleure approximation non linéaire dans un espace normé. Comptes rendus Acad. Sci. (Paris) Sér. A-B 269 (1969), A245-A248.
[54] P. J. Laurent and Pham-Dinh-Tuan, Global approximation of a compact set by elements of a convex set in a normed space. Numer. Math. 15 (1970), 137-150.
[55] A. J. Lazar, Spaces of affine continuous functions on simplexes. Trans. Amer. Math. Soc. 134 (1968), 503-525.
[56] A. J. Lazar, D. E. Wulbert and P. D. Morris, Continuous selections for metric projections. J. Functional Anal. 3 (1969), 193-216.
[57] J. Lindenstrauss, Extension of compact operators. Memoirs Amer. Math. Soc. 48 (1964).
[58] J. Lindenstrauss, On nonlinear projections in Banach spaces. Michigan Math. J. 11 (1964), 263-287.
[59] J. Lindenstrauss, On nonseparable reflexive Banach spaces. Bull. Amer. Math. Soc. 72 (1966), 967-970.
[60] J. Lindenstrauss and L. Tzafriri, On the complemented subspace problem (to appear).
[61] G. G. Lorentz, Approximation of functions. Holt, Rinehart and Winston, New York (1966).
[62] G. Meinardus, Approximation of functions, theory and numerical methods. Springer-Verlag, Berlin-Heidelberg-New York (1967).
[63] E. A. Michael, Continuous selections. I. Ann. Math. 63 (1956), 361-382.
[64] J. J. Moreau, Fonctionnelles convexes. Séminaire sur les équations aux dérivées partielles. Collège de France, Paris (1966-67).
[65] J. J. Moreau, Distance à un convexe d'un espace normé et caractérisation des points proximaux. Séminaire d'analyse unilatérale. Fac. Sci. Montpellier, 2 (1969), exposé no 6.
[66] P. D. Morris, Metric projections onto subspaces of finite codimension. Duke Math. J. 35 (1968), 799-808.
[67] E. V. Ošman, Continuity of metric projections and some geometric properties of the unit sphere in a Banach space. Doklady Akad. Nauk SSSR (1969), 34-36 [Russian].
[68] R. R. Phelps, Convex sets and nearest points. Proc. Amer. Math. Soc. 8 (1957), 790-797.
[69] W. Pollul, Reflexivität und Existenz-Teilräume in der linearen Approximationstheorie. Schriften der Ges. für Math. und Datenverarbeitung, Bonn (to appear).
[70] J. R. Rice, The approximation of functions. Vol. I: Linear theory. Vol. II: Non-linear and multivariate theory. Addison-Wesley, Reading, Mass.-London-Don Mills, Ont. (1964 and 1969).
[71] J. R. Rice, Nonlinear approximation. Approximation of functions (Ed. by H. L. Garabedian). Elsevier, Amsterdam-London-New York (1965), 111-133.
[72] J. R. Rice, Non-linear approximation. II. Curvature in Minkowski geometry and local uniqueness. Trans. Amer. Math. Soc. 128 (1967), 437-459.
[73] W. W. Rogosinski, Continuous linear functionals on subspaces of $\mathcal{L}^p$ and $\mathcal{C}$. Proc. London Math. Soc. 6, 22 (1956), 175-190.
[74] Ch. Roumieu, Sur quelques problèmes d'approximation. Sémin. math. fac. sci. Montpellier (1966).
[75] G. Š. Rubinštein, On an extremal problem in a linear normed space. Sibirsk. Mat. Ž. 6 (1965), 711-714 [Russian].
[76] I. Singer, Properties of the surface of the unit cell and applications to the solution of the problem of uniqueness of the polynomial of best approximation in arbitrary Banach spaces. Studii şi cercet. mat. 7 (1956), 95-145 [Romanian].
[77] I. Singer, Caractérisation des éléments de meilleure approximation dans un espace de Banach quelconque. Acta Sci. Math. 17 (1956), 181-189.
[78] I. Singer, On best approximation of continuous functions. Math. Ann. 140 (1960), 165-168.
[79] I. Singer, On best approximation of continuous functions. II. Rev. math. pures et appl. 6 (1961), 507-511.
[80] I. Singer, Some remarks on approximative compactness. Rev. roum. math. pures et appl. 9 (1964), 167-177.
[81] I. Singer, On the extension of continuous linear functionals and best approximation in normed linear spaces. Math. Ann. 159 (1965), 344-355.
[82] I. Singer, Best approximation in normed linear spaces by elements of linear subspaces. Publ. House Acad. Soc. Rep. Romania, Bucharest (1967) [Romanian]. English translation: Publ. House Acad. Soc. Rep. Romania, Bucharest and Springer-Verlag, Berlin-Heidelberg-New York (1970).
[83] I. Singer, Some open problems on best approximation in normed linear spaces. Séminaire Choquet, 6e année. Université de Paris (1966/67), exposé no. 12.
[84] I. Singer, On metric projections onto linear subspaces of normed linear spaces. Proc. Confer. on "Projections and related topics" held in Clemson, Aug. 1967. Preliminary Edition (January 1968).
[85] I. Singer, Remark on a paper of Y. Ikebe. Proc. Amer. Math. Soc. 21 (1969), 24-26.
[86] I. Singer, On normed linear spaces which are proximinal in every superspace. J. Approx. Theory (to appear).
[87] K. Tatarkiewicz, Une théorie généralisée de la meilleure approximation. Ann. Univ. Mariae Curie-Sklodowska 6 (1952), 31-46.
[88] L. P. Vlasov, On Čebyšev sets. Doklady Akad. Nauk SSSR 173 (1967), 491-494 [Russian].
[89] L. P. Vlasov, Approximatively convex sets in uniformly smooth spaces. Mat. Zametki 1 (1967), 443-449 [Russian].
[90] L. P. Vlasov, On Čebyšev and approximatively convex sets. Mat. Zametki 2 (1967), 191-200 [Russian].
[91] L. P. Vlasov, Čebyšev sets and some generalizations of them. Mat. Zametki 3 (1968), 59-69 [Russian].
[92] L. P. Vlasov, Approximative properties of sets in Banach spaces. Mat. Zametki 7 (1970), 593-604 [Russian].
[93] L. P. Vlasov, Almost convex and Čebyšev sets. Mat. Zametki 8 (1970), 545-550 [Russian].
[94] D. E. Wulbert, Continuity of metric projections. Trans. Amer. Math. Soc. 134 (1968), 335-343.
[95] D. E. Wulbert, Convergence of operators and Korovkin's theorem. J. Approx. Theory 1 (1968), 8-18.
[96] D. E. Wulbert, Differential theory for non-linear approximation (preprint).
[97] D. E. Wulbert, Uniqueness and differential characterization of approximations from manifolds of functions. Amer. J. Math. (to appear).
[98] D. E. Wulbert, Nonlinear approximation with tangential characterization (to appear).
[99] S. I. Zuhovitskii, On minimal extensions of linear functionals in the space of continuous functions. Izvestija Akad. Nauk SSSR 21 (1957), 409-422 [Russian].
CENTRO INTERNAZIONALE MATEMATICO ESTIVO (C.I.M.E.)
G. STRANG AND G. FIX
A FOURIER ANALYSIS OF THE FINITE ELEMENT VARIATIONAL METHOD
Course held at Erice from June 27 to July 7, 1971
These lectures were prepared for a CIME Advanced Summer Institute held in 1971. The first author has been supported by the Office of Naval Research and the National Science Foundation (GP-13778), and the second by AEC Contract 7158
-
2
.
0. Apologia. This paper has been taken from a preliminary draft of our book "An Analysis of the Finite Element Method", to be published by Prentice-Hall about the end of 1972. In this first draft we developed the theory of finite elements on a regular mesh, with Fourier analysis as the principal tool, and we were able to discuss the connections with finite difference equations and to include a part of the theory of splines. This framework we now call the "abstract finite element method".
In our book, the emphasis will be shifted to the "nodal finite element method" as developed by structural engineers, in which irregular elements are more the rule than the exception. In this case splines are much less convenient, and Fourier analysis is impossible. However the basic theorems remain valid - we refer to a forthcoming paper of the first author in Numerische Mathematik on "Approximations in the Finite Element Method". Furthermore it becomes possible to examine the errors due to the presence of curved boundaries and inhomogeneous boundary data. We hope that the book will give a reasonably complete and realistic treatment of the essential finite element theory for linear problems, and that the reader will accept the present paper as an interim report.
821
G . Strang-G. Fix
The crucial idea in this synthesis is to choose the basis $\varphi_j$ in such a way that the variational equations for the $v_j$ turn out to be difference equations. The customary description of the resulting method is in variational terms, and we shall introduce it in this way below. To think of it at the same time as a difference scheme will require a certain tolerance on the reader's part, since at first sight the finite element method seems to depart at several points from conventional difference equations. In fact it is just these points which represent for us the main contributions of the method to finite difference theory; they are innovations which could have been devised independently, but never were.
Our goal will be to decide when the finite element method is convergent and numerically stable, and to estimate the error. Although we do not discuss at this point the irregular meshes and general boundary conditions which are met in applications, we have tried to retain the mathematical essentials of the method; these we study in some generality. Of course the ultimate question is whether the finite element method is more effective than its competitors, namely those techniques which lie outside the above intersection. The evidence suggests that although difference schemes can be constructed which require fewer operations for a given order of accuracy, nevertheless the variational approach has an important coherence which derives from the fact that, once the basis $\varphi_j$ is chosen, the rest is largely automatic. This coherence seems to be reflected in a more regular behavior both of the error and of the user, who has otherwise to make a separate choice of finite difference replacement for each term in the differential equation and boundary conditions. The evidence in this comparison is still very limited, however, and we shall try to remain neutral.
The name we have adopted was originally chosen by engineers [1], who decompose a continuous structure, for numerical purposes, into a set of "finite elements". The history of the underlying mathematics is less clear.
Both Courant [2] and Pólya [3] commented on the merits, in certain variational problems, of seeking approximate solutions which are linear within each (triangular) element; accuracy is improved by increasing the number of elements rather than the complexity of the approximating functions. With this trial space, the Laplace operator acting on u induces its familiar 5-point difference analogue, acting on the coefficient vector v. Such trial functions therefore make the Ritz method especially simple to execute, and it seems very likely that this idea was proposed even earlier.
The development of the method has led naturally from piecewise linear functions to splines and other piecewise polynomials of fixed degree p; each increase in p adds both to the accuracy and to the complexity of the method. As usual, the extra accuracy is initially worth the price; but just as Newton's method is more popular than its higher-order analogues, questions of convenience soon become paramount. In applications to second-order equations, cubic approximants (p = 3) are apparently close to the turning point. The essential features of the method are the subdivision of the region into finite elements, and the choice of a so-called local basis for the space of approximating functions - that is, a basis composed of functions which vanish over all but a few elements.
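The remark that piecewise linear trial functions turn the Laplace operator into its 5-point difference analogue can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' construction: it takes the unit square, a uniform grid whose squares are each cut into two triangles by the same diagonal, and assembles the matrix of the integrals of $\nabla\varphi_j \cdot \nabla\varphi_k$ for the piecewise linear nodal functions. The function name `p1_stiffness` and the grid layout are hypothetical.

```python
import numpy as np

def p1_stiffness(n):
    """Assemble the matrix of integrals of grad(phi_j).grad(phi_k) for piecewise
    linear nodal functions on an n-by-n grid of squares, each split into two
    triangles by the diagonal from its lower-left to its upper-right corner."""
    h = 1.0 / n
    idx = lambda i, j: i * (n + 1) + j          # node (i*h, j*h) -> row/column index
    coords = np.array([[i * h, j * h] for i in range(n + 1) for j in range(n + 1)])
    K = np.zeros(((n + 1) ** 2, (n + 1) ** 2))
    for i in range(n):
        for j in range(n):
            a, b, c, d = idx(i, j), idx(i + 1, j), idx(i + 1, j + 1), idx(i, j + 1)
            for tri in ((a, b, c), (a, c, d)):
                M = np.hstack([np.ones((3, 1)), coords[list(tri)]])  # rows [1, x, y]
                grads = np.linalg.inv(M)[1:, :]   # column k holds the gradient of phi_k
                area = 0.5 * abs(np.linalg.det(M))
                Ke = area * grads.T @ grads       # 3x3 element matrix
                for r in range(3):
                    for s in range(3):
                        K[tri[r], tri[s]] += Ke[r, s]
    return K, idx

K, idx = p1_stiffness(4)
centre = idx(2, 2)
print("diagonal entry:", K[centre, centre])
print("axis neighbours:", [K[centre, idx(p, q)] for p, q in ((1, 2), (3, 2), (2, 1), (2, 3))])
print("diagonal neighbours:", [K[centre, idx(p, q)] for p, q in ((1, 1), (3, 3))])
```

For an interior node the assembled row carries 4 on the diagonal, -1 for the four axis neighbours and 0 for the two diagonal neighbours that share an element edge with it, which is the 5-point stencil; the factor $h^2$ relating it to the usual difference quotient comes from the element areas.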
We analyze in this paper the case of subdivision by a regular mesh, of width $h \to 0$, with a local basis constructed in the following systematic way. We start with a fixed set of trial functions $\varphi_1(x), \ldots, \varphi_N(x)$; N will be the number of basis functions for each meshpoint. To ensure a local basis, we insist that these functions $\varphi_i$ vanish for large |x|, and rescale the independent variable x by the mesh width h. The eventual Rayleigh-Ritz equations will take the form of difference equations if the basis functions $\varphi^h_{i,j}$, associated with each meshpoint $(j_1 h, \ldots, j_n h)$, are simply translates of these rescaled functions $\varphi_i(x/h)$. The basis is thus composed, for each h, of the functions
(1.1) $\varphi^h_{i,j}(x) = \varphi_i(x/h - j)$.
Of course there have to be modifications at boundaries.
In the piecewise linear case there is a single parameter for each meshpoint, namely the value of the function at that point; thus N = 1. The graph of $\varphi_1$ is a pyramid with vertex at the origin and with base formed from the neighboring triangular elements. It should be clear that this $\varphi_1$ and its translates span the space of all continuous functions in the plane which are linear within each element.
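The construction of a local basis by rescaling and translating one fixed function can be sketched in one space dimension, where the pyramid becomes the familiar hat function. This is a minimal illustration only: the names `hat`, `basis` and `interpolate`, the mesh width and the sample function are assumptions made for the example, and the one-dimensional setting replaces the triangulated plane discussed in the text.

```python
import numpy as np

def hat(x):
    """Fixed trial function: piecewise linear, equal to 1 at 0, zero for |x| >= 1."""
    return np.maximum(0.0, 1.0 - np.abs(x))

def basis(j, h, x):
    """Rescaled translate of the hat function, centred at the meshpoint j*h."""
    return hat(x / h - j)

def interpolate(u, h, x, j_range):
    """Combination of the basis functions with the nodal values u(j*h) as coefficients."""
    return sum(u(j * h) * basis(j, h, x) for j in j_range)

h = 0.25
x = np.array([0.25, 0.375, 0.5])
u = lambda s: s ** 2
print(interpolate(u, h, x, range(-8, 9)))              # piecewise linear interpolant of u
print(sum(basis(j, h, x) for j in range(-8, 9)))       # the translates sum to one
```

At a meshpoint the combination reproduces the nodal value, and halfway between two meshpoints it returns their average, so the translates span the continuous piecewise linear functions on this mesh; each `basis(j, h, x)` vanishes outside the two elements adjacent to its meshpoint.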
In some examples our description (1.1) of the basis will seem less transparent than a description of the space itself, in terms of the functions admitted within each element and the compatibility conditions across element boundaries. Nevertheless, both for practical computations and for the general theory, the definition of the basis is crucial; it is from combinations of these functions $\varphi^h_{i,j}$ that the Rayleigh-Ritz-Galerkin principle will select an approximation $u^h$.
The fundamental questions for a numerical analyst are those of convergence and stability:
i) How accurate is the approximate solution $u^h$, and
ii) how well conditioned are the equations from which $u^h$ is determined numerically?
The answers can only come from the connections between the given differential problem, the trial functions $\varphi_1, \ldots, \varphi_N$, and the norms in which accuracy and condition number are measured. We want to study these questions for elliptic operators of arbitrary order 2m, for quite general $\varphi_i$, and for the norms associated with the spaces $\mathcal{H}^s$ and […]. These norms measure the error $e^h = u^h - u$ and its derivatives of order $|\alpha| = \sum \alpha_j \le s$ in the mean-square and pointwise senses respectively:
[…]
In Sobolev's notation the space $\mathcal{H}^s$ is written $W_2^s$.
Our arguments make constant use of the Fourier transform, which operates at full strength only on problems which are either periodic, or defined on the whole of Euclidean space $R^n$. We are convinced that (as in the theory of elliptic differential operators) the investigation of these special problems is fundamental to the understanding of more general boundary conditions. Thus we regard the present work as a necessary first step in analyzing the wide variety of problems, with irregular meshes and boundaries, which are actually being solved.
Fortunately, most of the second step in the analysis is already complete; J.-P. Aubin, following the work of Céa and others in the French school, has successfully analyzed the solution of boundary problems by means of splines. In his terms, our contribution is to determine all trial functions to which his theory can extend. (We mention, in addition to his forthcoming manuscript, the reference [4].) The third step is the study of more general meshes, particularly those formed by an arbitrary triangulation of the region. This is a major point in our forthcoming book.
For problems on the whole space $R^n$, the index j in (1.1) runs over the set $Z^n$ of all multi-integers $(j_1, \ldots, j_n)$. We adopt the definition
[…]
for the Fourier transform, where $\xi = (\xi_1, \ldots, \xi_n)$ and $x\xi$ denotes $x_1\xi_1 + \cdots + x_n\xi_n$. As a first application of the Fourier transform, Parseval's formula can be used to replace (1.2) by the equivalent and more convenient norm
[…]
We first describe the application of the Ritz-Galerkin method abstractly, to a linear elliptic problem on $R^n$. Using the conventional inner product $(f, g) = \int f(x)\,\overline{g(x)}\,dx$, the problem begins with a bilinear form
[…]
If the coefficients are bounded, the form a is defined whenever u and w lie in the space $\mathcal{H}^m$, that is, whenever all partial derivatives of order not exceeding m lie in $L^2(R^n)$. The form is called $\mathcal{H}^m$-elliptic provided that
[…]
The most familiar example is the form associated with the Laplace equation,
$a(u, w) = \int \left( \frac{\partial u}{\partial x_1} \frac{\partial \bar{w}}{\partial x_1} + \cdots + \frac{\partial u}{\partial x_n} \frac{\partial \bar{w}}{\partial x_n} \right) dx .$
As it stands this is not $\mathcal{H}^1$-elliptic, corresponding to the fact that Laplace's equation has non-zero solutions, e.g. u = constant. To satisfy (1.4), and thereby eliminate this non-uniqueness of the solution, we need to add some positive multiple of the zero-order term (u, w).
An elliptic form induces the following variational problem: given f, find u in $\mathcal{H}^m$ so that
(1.5) $a(u, w) = (f, w)$ for all w in $\mathcal{H}^m$.
This problem has one and only one solution u in $\mathcal{H}^m$, provided the inhomogeneous data f is such that the right side makes sense; since w ranges over $\mathcal{H}^m$, this places f in the adjoint space $\mathcal{H}^{-m}$, so that
[…]
The elliptic problem (1.5) can equally well be put into the more familiar operational form
(1.6) $Lu = f$.
For this we integrate the left side of (1.5) by parts, shifting derivatives from w onto the terms containing u. The result is $(Lu, w) = (f, w)$, which is equivalent to (1.6); L is the map from $\mathcal{H}^m$ to $\mathcal{H}^{-m}$ given by
[…]
Many applications lead also to problems of minimization. In such cases the form a is self-adjoint, and the minimization of $a(u,u) - (f,u) - (u,f)$ leads exactly to the same variational problem (1.5). In fact the operational equation $Lu = f$ is nothing but the Euler equation from the calculus of variations.
The Ritz-Galerkin technique is now simple and very familiar;
i s replaced i n t h e v a r i a t i o n a l statement (1.5) by
a sequence of closed subspaces
sh
.
Thus t h e approximating
problem, w r i t t e n v a r i a t i o n a l l y , i s t o f i n d a ( uh ,wh ) = (f,wh)
(1.7)
for a l l
uh
wh
in
sh
in
E l l i p t i c i t y implies the existence and uniqueness of concerned with i t s computation. expand
sh
.
uh ; we a r e
Therefore we put t h e approximate
problem a l s o i n t o operational form.
sh , we
so t h a t
Choosing a b a s i s
vh u
fdr
and compute the vector $v^h$ of unknown coefficients. Substituting into (1.7), and using the fact that the result holds for all coefficients $w^h$, we find that the vector $v^h$ satisfies the discrete operational equation $A^h v^h = f^h$, where the entries in the coefficient matrix and the inhomogeneous vector are given by $A^h_{jk} = a(\varphi^h_k, \varphi^h_j)$ and $f^h_j = (f, \varphi^h_j)$.
Since all these entries have to be calculated, either analytically or by numerical quadrature, one wants as simple a basis as possible. The use of the classical special functions, in other words a return to stage one, is by no means obsolete; both Urabe and Clenshaw have made successful application of Chebyshev polynomials. We are interested, however, in the basis functions $\varphi^h_{i,j}$ defined earlier; for problems on $R^n$ the index $j$ runs over $Z^n$, whereas if $f, u, \ldots$ all have period 1, we require that $h^{-1}$ be an integer and that $0 \le j_\nu < h^{-1}$ for each component of $j$. In this periodic case $S^h$ has finite dimension $N h^{-n}$.

The index $i$ assumes the values $1, \ldots, N$, so it is natural to take the basis functions in groups of $N$ at a time. This partitions the matrix $A^h$ into blocks of order $N$. Thus the finite element relation $A^h v^h = f^h$ is a coupled system of $N$ discrete equations. Normally this system is analogous to a continuous one, in which the original differential equation $Lu = f$ is coupled to some of its differentiated forms $D(Lu) = Df$. In the Hermite case the discrete system can be formally recombined to yield a single spline-like equation, just as the Hermite basis functions, with small support, can be combined with their translates to yield the spline basis.
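The Ritz-Galerkin system $A^h v^h = f^h$ described above can be sketched numerically for the model problem $-u'' + u = f$ with period 1. This is an illustration only, under stated assumptions: $N = 1$, the basis consists of piecewise linear hat functions without the normalization (1.1), the load integrals are evaluated by 3-point Gauss quadrature, and the names `solve_periodic_ritz` and `nodal_error` are hypothetical.

```python
import numpy as np

def solve_periodic_ritz(f, n):
    """Ritz-Galerkin solution of -u'' + u = f with period 1, using n >= 3
    piecewise linear hat functions on a uniform mesh (assumed basis)."""
    h = 1.0 / n
    x = h * np.arange(n)

    # Coefficient matrix: a(phi_k, phi_j) = int(phi_k' phi_j' + phi_k phi_j) dx.
    A = np.zeros((n, n))
    for j in range(n):
        A[j, j] += 2.0 / h + 2.0 * h / 3.0
        A[j, (j + 1) % n] += -1.0 / h + h / 6.0
        A[j, (j - 1) % n] += -1.0 / h + h / 6.0

    # Inhomogeneous vector: integrals of f * phi_j over the two cells
    # adjacent to x_j, by 3-point Gauss quadrature mapped to [0, 1].
    g, w = np.polynomial.legendre.leggauss(3)
    s, w = 0.5 * (g + 1.0), 0.5 * w
    F = np.array([
        h * np.sum(w * (f(xj - h + h * s) * s + f(xj + h * s) * (1.0 - s)))
        for xj in x
    ])
    return x, np.linalg.solve(A, F)

def nodal_error(n):
    """Maximum nodal error for the exact solution u(x) = sin(2 pi x)."""
    u = lambda t: np.sin(2 * np.pi * t)
    f = lambda t: (1 + 4 * np.pi**2) * np.sin(2 * np.pi * t)
    x, v = solve_periodic_ritz(f, n)
    return np.max(np.abs(v - u(x)))
```

Because the hat functions equal one at their own meshpoint, the coefficients returned here are also the nodal values of the approximate solution; halving $h$ should reduce `nodal_error` by roughly a factor of four in this example.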
We want now to summarize our results. A more extended summary has already appeared [5] in Studies in Applied Mathematics, a reincarnation of M.I.T.'s Journal of Mathematics and Physics. We hope that our discussion there, in terms of the model problem $-\Delta u + u = f$ and its 5- and 9-point difference analogues, will be a useful supplement to the present paper. For the moment we set aside extensions to eigenvalue problems and parabolic equations, and describe our conclusions only for the problems of convergence and stability stated above.
ii) The problem of stability is the simpler. Here the fundamental question is whether or not the independence of the basis elements $\varphi^h_i$ is uniform as $h \to 0$; in $L^2$, with our normalization (1.1) of the basis elements, this means that the norm of any combination of the basis elements is bounded below by a constant multiple of the norm of its coefficient vector, uniformly in $h$. For this uniform independence we find the following necessary and sufficient condition: there exists no non-trivial combination $\Psi = \sum c_i \varphi_i$ and real $\xi_0$ such that the Fourier transform of $\Psi$ satisfies

(1.14)    $\hat\Psi(\xi_0 + 2\pi j) = 0$ for all $j \in Z^n$.

The reader will notice that everything depends on the $\varphi_i$; only gross features of the differential problem, its order and ellipticity, are relevant to the condition number. This is an attraction, at least to the analyst, which should not disappear in more general boundary problems. For difference equations which are not derived variationally, Schaeffer's powerful work [6] has shown how deep the stability problem actually is, even in comparison with the corresponding question in the general theory of elliptic boundary problems. In Thomée's terminology [7], (1.14) is necessary and sufficient for the difference equations (1.9) to be elliptic.

i) The rate of convergence of $u^h$ depends on the "density" of the spaces $S^h$; therefore we begin with a discussion of approximation theory.
Our main result for the case $N = 1$ takes the following form: smooth functions can be approximated from $S^h$ with error $O(h^{p+1-s})$ in the $\mathcal{H}^s$ norm, $s \le p$, if and only if all polynomials in $x_1, \ldots, x_n$ of degree $\le p$ can be written as linear combinations of $\varphi$ and its translates. Fourier analysis leads to an equivalent condition on the transform $\hat\varphi$; it must have zeros of order $p+1$ at all the points $\xi = 2\pi j$, $j \ne (0, \ldots, 0)$. Here we have assumed the conditions most commonly met in practice, that $\varphi$ is in $\mathcal{H}^p$ and stability holds; more precise results are proved in §2.

With $N > 1$, the $\varphi_i$ and linear combinations of their translates may have differing degrees of smoothness. In fact there are important cases in which this is bound to happen. Suppose for example that $S^h$ is comprised of piecewise cubic functions ($n = 1$) which have continuous first derivatives at each joint. Since this Hermite space contains the spline subspace, whose elements have continuous second derivatives as well, the spline basis functions must be combinations of the Hermite basis; the former is in $\mathcal{H}^3$ and the latter only in $\mathcal{H}^2$. Thus, in the general case, we assume only that the $\varphi_i$ are in $\mathcal{H}^q$, $q \le p$, and again we prove that approximation of order $h^{p+1-s}$ is possible in $\mathcal{H}^s$ (for $s \le q$) if and only if all polynomials of degree $\le p$ can be produced from the $\varphi_i$ and their translates.
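The order $h^{p+1-s}$ can be observed in the simplest case $p = 1$, $n = 1$. The sketch below is a hedged illustration, not the construction used in the proofs: it approximates a smooth function by its piecewise linear interpolant (one particular element of the hat-function space), and measures discrete root-mean-square errors of the function ($s = 0$) and of its first derivative ($s = 1$). The function name `interpolation_errors` and the sampling density are assumptions.

```python
import numpy as np

def interpolation_errors(u, du, n, samples_per_cell=40):
    """RMS errors of the piecewise linear interpolant of u on [0, 1]
    with mesh width h = 1/n, for the function and its derivative."""
    h = 1.0 / n
    nodes = h * np.arange(n + 1)
    t = np.linspace(0.0, 1.0, n * samples_per_cell + 1)

    # Interpolant and its (piecewise constant) derivative.
    uh = np.interp(t, nodes, u(nodes))
    slopes = np.diff(u(nodes)) / h
    cell = np.minimum(np.arange(t.size) // samples_per_cell, n - 1)
    duh = slopes[cell]

    err0 = np.sqrt(np.mean((u(t) - uh) ** 2))    # s = 0: expect O(h^2)
    err1 = np.sqrt(np.mean((du(t) - duh) ** 2))  # s = 1: expect O(h)
    return err0, err1
```

Halving $h$ should divide the first error by about four and the second by about two, in agreement with the exponent $p + 1 - s$ for $p = 1$.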
These approximation results are of course already known for many specific choices of $S^h$. We mention in particular the early
estimatcs f o r s p l i n c s by Birl
, and the detailed treatment of multi-dimensional spline and Hermite functions in Birkhoff, Schultz, and Varga [9] and in recent papers by Schultz (cf. [10]). The order $p+1-s$ of approximation, starting with a single arbitrary $\varphi$, has also been established elsewhere; in fact this problem was so striking that it was attacked independently by di Guglielmo [11], Babuska [12], and the two of us. (We regret that the order just given is chronological, although $\Delta t$ is small.)

Our own contributions to the theory of approximation on $\mathcal{H}^s$ with $N = 1$ seem to be these: we have found the exact lower bound of possible constants in the error estimate $c\,h^{p+1-s}\|u\|_{p+1}$, and we have proved the converse result, that such an estimate is only possible when polynomials can be manufactured from $\varphi$. Obviously this gives a special place to piecewise polynomial approximating functions; they not only lead to simple computations, which has until now been their real justification, but they also provide the most efficient basis from which to produce polynomials. (Of course one may use polynomials themselves in the Ritz method, but this produces major difficulties. If monomials $x^j$ are used as the basis, then the matrix $A^h$ is both hopelessly ill-conditioned and full of non-zero entries. The alternative, a basis of orthogonal polynomials, is awkward to compute numerically.)

We note that our converse result was anticipated in [13], whose author considered only a very limited class of approximating functions, and that di Guglielmo has discovered [11] a basis which makes $A^h$ as sparse as possible. Splines are optimal in one dimension, as is the pyramid function drawn above when $n = 2$, $p = 1$. In general he "triangulates" $R^n$ and uses splines whose continuity properties are imposed between the triangular elements, or simplexes; this leads to a transform $\hat\varphi$ given as a product of factors. In the splines over rectangular elements introduced by de Boor and Birkhoff, the last factor disappears and the remaining exponents have to be increased to $p+1$; $S^h$ becomes essentially a tensor product of the one-dimensional case. In either case, the zeros at the points $\xi = 2\pi j \ne 0$ are evident.
For approximation in the pointwise norms $W^s_\infty$, the proofs are different but the results are the same. Here we construct an explicit approximation, a quasi-interpolate, by combining interpolation with a generalized Taylor expansion. The $\Psi_\alpha$ are those combinations of the basic functions $\varphi_1, \ldots, \varphi_N$ which are used in the construction of polynomials up to degree $p$; if $N = 1$ they are suitable multiples of $\varphi_1$.
The corresponding estimate then follows for smooth $u$ by comparing Taylor series. Again this bound is already known in special cases.

We can now state our principal result for problem (i): if $f$ is sufficiently smooth, the error in either the $\mathcal{H}^s$ or the $W^s_\infty$ norm satisfies the estimate (1.17), whose exponent $r$ is the smaller of two expressions. Apparently this exponent $r$ is new, although in $\mathcal{H}^s$, at least for $s \le m$, it can be deduced by variational arguments from the order of approximation given above. It is remarkable that the error in the Ritz method often is not of the same order as the error in the best approximation; the second expression in $r$ may well be smaller than the first. For the special value $s = m$ the two orders automatically coincide, since ellipticity implies that for any choice of the Ritz spaces $S^h$,

$\|u^h - u\|_m \le \text{constant} \cdot \inf_{s \in S^h} \|u - s\|_m .$

We always assume $p \ge m$, which together with stability is necessary and sufficient for the finite element method to succeed; this applies to eigenvalue problems as well. We note that for splines of the minimal degree $p = m$ the error is $O(h)$ in $\mathcal{H}^m$ and, regardless of $m$, is only $O(h^2)$ in $\mathcal{H}^0 = L^2(R^n)$.
Our proof of the error estimate (1.17) is severely classical; we regard (1.8-1.9) as inducing a finite difference scheme and estimate the local truncation error. This allows us to treat the non-reflexive spaces $W^s_\infty$, which in variational arguments are awkward at best; of course Sobolev's imbedding theorems may be used to deduce pointwise estimates, but the cost in powers of $h$ is unacceptably high. Furthermore, the local error is well-defined even in the absence of global properties like ellipticity; our technique should extend for example to parabolic and hyperbolic operators. It may also be used to justify "Richardson extrapolation", which is a useful tool for increasing the accuracy. We note that the order of this local error is normally trivial to compute, by a comparison of Taylor expansions; but for a system of $N$ difference equations, this is no longer so.
We cannot resist pointing to a rather paradoxical consequence of our estimates. It is concerned with the approximation of derivatives of $u$, in which Ritz methods are generally thought to be superior; they provide solutions which can be differentiated analytically, whereas more conventional difference schemes yield only mesh functions. However, Thomée has shown [14] that the order of accuracy is not reduced when these mesh functions are differenced, while our present estimates (which are best possible) imply that the accuracy is generally decreased by each differentiation and finally disappears. The paradox enters into the finite element method, which is both a Ritz system and a difference scheme: apparently the derivatives of $u$ at $x_0$ should not be computed by differentiating $u^h$, but rather by regarding $x_0$ as the center of an $h$-mesh and applying an accurate difference operator! Unfortunately for the theory, round-off error generally reverses
this recommendation.

One further problem, of obvious interest but questionable importance, is this: among all difference schemes consistent with a given differential equation, which can be produced by the finite element method? Certainly not all, since self-adjoint nonnegative-definite problems ($L = L^* \ge 0$) lead only to self-adjoint nonnegative-definite finite element equations ($A^h = (A^h)^* \ge 0$). Furthermore, the inhomogeneous data $f$ enters the Ritz equations, not through its values at meshpoints, but through the integrals $\int f \varphi_j$. This smoothing bears on a remark of Zlamal [15], which we discussed with Widlund and Morton, to the effect that variational error bounds require fewer derivatives of $f$ than some of the standard finite difference estimates. We note that Herbold, Schultz, and Varga [16] have observed how the usual numerical quadratures of these integrals lead to familiar difference schemes.

If we consider only the leading term $-u_{xx}$, for second-order equations in one dimension, a complete answer is possible: all consistent self-adjoint nonnegative-definite difference matrices can be produced. The proof depends on translating these properties of $A^h$, whose action is typically given by a difference formula with coefficients $a_k$, into properties of its symbol $a(\theta)$.
The correspondence is a familiar one:

Consistency <=> $a(\theta) = \theta^2 + O(\theta^3)$, so that $a(0) = a'(0) = 0$
Self-adjointness <=> $a$ is real
Non-negativity <=> $a(\theta) \ge 0$ for all $\theta$.

The Fejér-Riesz theorem allows us to factor such a nonnegative trigonometric polynomial into

(1.19)    $a(\theta) = \left| \sum_{j=0}^{M} \beta_j e^{ij\theta} \right|^2 .$

Now we let $\varphi$ be the piecewise linear function which vanishes for $x \le 0$ and has slope $\beta_j$ in $j < x < j+1$. Consistency gives $\sum \beta_j = 0$, so that $\varphi$ returns to the value 0 at $x = M+1$. For larger $x$ we keep it zero, so that $\varphi$ satisfies the essential condition for the finite element method: it vanishes outside a compact set. Comparing coefficients of $e^{ik\theta}$ in (1.19), the Ritz-Galerkin coefficients are indeed the $a_k$.
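The last step can be checked numerically. Since $\varphi'$ is piecewise constant with values $\beta_j$, the Ritz-Galerkin coefficient $\int \varphi'(x)\,\varphi'(x-k)\,dx$ is the autocorrelation $\sum_j \beta_j \beta_{j+k}$ of the slopes. The sketch below works only in this forward direction (from given slopes to the difference coefficients); it does not compute the Fejér-Riesz factorization itself, and the names `ritz_stencil` and `symbol` are hypothetical.

```python
import numpy as np

def ritz_stencil(beta):
    """Coefficients a_k, k = -M..M, generated by a piecewise linear phi
    with slope beta[j] on (j, j+1): a_k = sum_j beta[j] * beta[j+k]."""
    beta = np.asarray(beta, dtype=float)
    return np.correlate(beta, beta, mode="full")

def symbol(a, theta):
    """Symbol a(theta) = sum_k a_k exp(i k theta) of a symmetric stencil."""
    M = (len(a) - 1) // 2
    k = np.arange(-M, M + 1)
    return np.real(np.exp(1j * np.outer(theta, k)) @ a)
```

For example, the slopes $(1, -1)$ of the hat function give the familiar second difference $(-1, 2, -1)$, while the slopes $(1/2, 0, -1/2)$ give the second difference over a double mesh width; in both cases the slopes sum to zero and the symbol behaves like $\theta^2$ near the origin.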
Unhappily, the result we have just proved is misleading; the proper analogue of the differential equation $Lu = f$ is not simply the explicit system $A^h v^h = f^h$. Instead it is the combination of this system with the expansion (1.18), expressing $u^h$ in terms of the given basis for $S^h$.

Suppose we consider only the values $u^h(kh)$ assumed at the meshpoints $x = kh$; these values are the components of a vector which we denote by $u^h$. On this mesh the expansion becomes a matrix relation between $u^h$ and $v^h$, where the entries in the matrix $B^h$ are the values $\varphi(k-j)$. Altogether, then, $u^h$ is computed from the inhomogeneous data $f$ by solving the combined system. This is the implicit finite difference equation derived by the finite element method; it is the truncation error associated with this equation which yields an estimate of $u^h - u$. There is a similar equation corresponding to every choice of $h$-mesh in the plane; our matrix $B^h$ refers only to the mesh through the origin.

Thus the finite element method is finally exposed as a family of difference equations, each an implicit system of $N$ equations but otherwise more or less conventional, and all descended from the common ancestors $\varphi_1, \ldots, \varphi_N$.
§2. Approximation

In this section we shall isolate those conditions on the finite element spaces $S^h$ which determine the accuracy of approximation. Although the proofs are rather technical, the conditions themselves are remarkably simple. Starting with a single function $\varphi$, that is with $N = 1$, we consider approximation in $\mathcal{H}^s$ by expansions in the functions $\varphi^h_j$. In order that each $\varphi^h_j$ shall be non-zero within only a finite number of elements, we assume that $\varphi$ vanishes for large $|x|$. This property of compact support ensures that $A^h$ is a band matrix, since all products $\varphi^h_j \varphi^h_k$ are identically zero whenever $|j-k|$ is sufficiently large. We let $\mathcal{H}^q_c$ denote the space of functions with compact support whose derivatives of order $\le q$ lie in $L^2$; in many applications the derivatives of order less than $q$ will be continuous, and the $q$th derivatives piecewise continuous.

We adopt the standard notation for multi-integers, writing $j, k, \ldots$ for elements of $Z^n$ and $\alpha, \beta, \ldots$ for elements of $Z^n_+$. Thus each component $j_\nu$ is an integer, and each $\alpha_\nu$ is a non-negative integer.
We write $\alpha \ge \beta$ if and only if each $\alpha_\nu \ge \beta_\nu$. The range of summation is understood to be $Z^n$, and the range of integration is $R^n$, when nothing is said to the contrary.

THEOREM I. Suppose $\varphi$ is in $\mathcal{H}^p_c$. Then the following conditions are equivalent:

(i) $\hat\varphi(0) \ne 0$, but $\hat\varphi$ has zeros of order at least $p+1$ at the other points of $2\pi Z^n$ (this is condition (2.1));

(ii) for $|\alpha| \le p$, $\sum_{j \in Z^n} j^\alpha \varphi(t-j)$ is a polynomial in $t_1, \ldots, t_n$ with leading term $C t^\alpha$, $C \ne 0$;

(iii) for each $u$ in $\mathcal{H}^{p+1}$ there are weights $w^h_j$ such that, as $h \to 0$, the error estimate (2.2) and the bound (2.3) on the weights hold.

The constants $c_s$ and $K$ are independent of $u$.
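Condition (ii) can be checked numerically in the simplest case. The sketch below assumes $n = 1$ and takes $\varphi$ to be the hat function, for which $p = 1$: the sums with $\alpha = 0$ and $\alpha = 1$ reproduce the polynomials $1$ and $t$, whereas the sum with $\alpha = 2$ is only the piecewise linear interpolant of $t^2$ and is not a polynomial. The function names are hypothetical, and the infinite sum is truncated to a finite range of $j$, which is exact here because of the compact support.

```python
import numpy as np

def hat(x):
    """Piecewise linear hat function supported on [-1, 1]."""
    return np.maximum(0.0, 1.0 - np.abs(x))

def lattice_sum(phi, t, power, radius=50):
    """Evaluate sum over j of j**power * phi(t - j) for |j| <= radius."""
    j = np.arange(-radius, radius + 1)
    return np.sum(j**power * phi(t[:, None] - j[None, :]), axis=1)
```

With `t = np.linspace(-3, 3, 121)`, the calls `lattice_sum(hat, t, 0)` and `lattice_sum(hat, t, 1)` agree with $1$ and $t$ to rounding error, while `lattice_sum(hat, t, 2) - t**2` oscillates between 0 and 1/4.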
Proof. (i) <=> (ii). This equivalence, like much of the analysis later in this paper, depends on the Poisson formula: the values of a function $\Psi$ on the lattice $Z^n$ are connected with those of its Fourier transform $\hat\Psi$ on the lattice $2\pi Z^n$, in that the sum of the former equals, up to the normalization of the transform, the sum of the latter. If $\Psi$ has compact support, the first sum involves only a finite number of terms, and the second is absolutely convergent. Applied to the function $\Psi(x) = x^\alpha \varphi(t-x)$, this yields (2.4).

Suppose first that (i) holds. Then for any $\alpha$ with $|\alpha| \le p$, the terms on the right side with $j \ne 0$ all vanish. Therefore we have only to compute the contribution from $j = 0$. The leading term is clearly $C t^\alpha$, with $C = \hat\varphi(0) \ne 0$ as required.

Now we suppose that (ii) holds. Taking $\alpha = 0$, this means from (2.4) that the right side is a non-zero constant function. Therefore $\hat\varphi(0) \ne 0$, and $\hat\varphi(2\pi j) = 0$ for $j \ne 0$. Next we consider $\alpha = (1, 0, \ldots, 0)$ and the corresponding right side of (2.4).
According to (ii) this is a polynomial, and therefore this time we have $\partial\hat\varphi/\partial\xi_1\,(2\pi j) = 0$ for $j \ne 0$. Proceeding in the same way, in order of increasing $|\alpha|$, we establish the remaining conditions in (2.1).

(i) => (iii). Our first step is to convert the inequalities (2.2) and (2.3), by Parseval's formula, into inequalities for Fourier transforms. We note first the form of the transform of each $\varphi^h_j$. Therefore the transform of $\sum w^h_j \varphi^h_j$ is the product of $\hat\varphi(h\xi)$ with a bracketed sum over $j$. We denote the function in brackets by $W^h(\xi)$, and remark that it has period $2\pi/h$ in each variable $\xi_\nu$. Thus if we let $C/h$ denote the cube $-\pi/h \le \xi_\nu \le \pi/h$, the stability condition (2.3) on the weights becomes a bound on the integral of $|W^h|^2$ over $C/h$. Using the definition (1.3) of the $\mathcal{H}^s$ norm, the required estimate (2.2) is converted into
(2.7)
I
nn
A
lu(5)
To construct
wh
coefficients
a,
Recalling t h a t
- vh(5)$(h?) 1 s a t i s f y i n g these conditions, we f i r s t choose
,
la1
5p
6(0) #
Then each remaining
q
by ( i )
0
B
so t h a t
is
, we
chosen t o make t h e Pth derivative
of t h e l e f t s i d e of (2.8) vanish a t only t h e a t h derivative of r u l e 3etermines the
%
qa
= 0
.
I n f a c t , since
i s non-zero a t
i n order of increasing
Now we choose the weights
wh
3
q = 0
$
, this
by
so t h a t % ( h ~ )i ~ n t h e period cube
wh(5) = b ( 5 )
(2.9)
begin with
C/h
.
la19 The s t a b i l i t y condition (2.3) i s s a t i s f i e d , s i n c e
in C/h
.
1 ~ (~ cIGI1
(kle s h a l l denote a l l constants i n t h e following
estimates by
c .)
Therefore Ve have t o show t h a t t h e i n t e g r a l s
are a l l bounded by
L C
For
In
5
I
c h2(p+1-s)l/ u 2\ / ~ ~ W + e ~begin with ,
J 16(5)121hz12(pi1-s)(l+lr,12)Sh5 s i n c e lhc,l C/h
o u t s i d e 'C/h
, we
notice t h a t
Ih%l 2 1 ; t h e r e f o r e
wh
t o change variables:
we use t h e p e r i o d i c i t y of
( r
To show that the sum in this integral is $O(|h\xi|^{2p+2})$, we expand $\hat\varphi$ in a Taylor series around $2\pi j$; since by condition (i) it has a $(p+1)$-fold zero there, there is only the usual remainder term. The evaluation point $\theta_j = \theta_j(h\xi)$ lies on the line between $2\pi j$ and $2\pi j + h\xi$, with $\xi$ in $C/h$. Therefore the sum in (2.10) is bounded by $c\,|h\xi|^{2p+2}\,S(\xi)$.

To bound $S$ uniformly in $\xi$ we apply first the Paley-Wiener theorem: $\varphi \in \mathcal{H}^p_c$ implies that $\hat\varphi$ and its derivatives are entire functions of exponential type, and that they satisfy a corresponding integral bound. The theory of entire functions allows us to estimate the sums $S$ in terms of the integral, as long as the evaluation
p o i n t s a r e s u f f i c i e n t l y vrell spaced; t h i s i s a straightforward extension of
heo or em 6.7.15 i n Boas [17]
always l i e s i n t h e cube centered a t 2~
, and
vre do g e t
( i i => ( i )
of '(iii)
.
, but
S(2) ( c
.
21rj
'(')
=
, (iii)
(2.11)
J
(
0
i n the b a l l
outside
provides I \ $ ~ (1 ~ C)SI
wh
?fp
B = (111 5
11
J3
such t h a t
5 constant
I f we consider t h e s e two i n t e g r a l s only over
+
~ ( h ), it follows t h a t
is
defined by
C/h
$(gh) = Q(0)
' j
s i d e s of l e n g t h
only t h e . f a c t t h a t approximation i n
A
9lP"
, with
A t t h i s s t e p we use, not t h e f u l l s t r e n g t h
(1
E
I n our case
S u b s t i t u t i n g back i n t o (2.10)
possible f o r t h e particular function U
Since U
.
B
, where
,
Appealing again to the crucial constraint (2.11), we must have $\hat\varphi(0) \ne 0$. Next we consider (2.12) over $R^n - C/h$, where $\hat U = 0$; by the periodicity of $W^h$, the integral becomes a sum over $j \ne 0$ of integrals over $C/h$, and the resulting bound must hold for every $j \ne 0$. Comparing each power of $h$ in turn, and recalling (2.11), we conclude that $D^\alpha \hat\varphi(2\pi j) = 0$ for $|\alpha| \le p$, $j \ne 0$.
Remark 1. In one dimension, $n = 1$, the simplest function which has the zeros required by condition (i) is the one which generates the local basis for splines of degree $p$. Furthermore, any $\hat\varphi$ which satisfies (i) is a multiple of the transform $\hat\varphi_p$ of this spline function, the multiplier $E$ being a suitable entire function. Thus every such $\varphi$, which may or may not be a piecewise polynomial, can be found from convolution with the "B-spline" $\varphi_p$. If $n > 1$ (or $N > 1$) no such common divisor exists, but condition (i) should make it possible to identify all useful bases.
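The zeros required by condition (i) can be observed numerically for the B-spline. The sketch below rests on an assumption not legible in the text above: that the centered B-spline of degree $p$ on unit meshes has the standard transform $(\sin(\xi/2)/(\xi/2))^{p+1}$. The hypothetical helper `zero_order` estimates the order of vanishing at $2\pi j$ by comparing values at two nearby points.

```python
import numpy as np

def bspline_hat(xi, p):
    """Assumed transform of the centered B-spline of degree p.
    np.sinc(x) is sin(pi x)/(pi x), so this is (sin(xi/2)/(xi/2))**(p+1)."""
    return np.sinc(np.asarray(xi) / (2.0 * np.pi)) ** (p + 1)

def zero_order(p, j, eps=1e-2):
    """Estimate the order of the zero at 2*pi*j (j != 0) from the decay
    of the transform between distance eps and distance eps/2."""
    a = abs(bspline_hat(2 * np.pi * j + eps, p))
    b = abs(bspline_hat(2 * np.pi * j + eps / 2, p))
    return np.log2(a / b)
```

For each $p$ the estimate is close to $p + 1$ at every $j \ne 0$, while the transform equals one at the origin, which is the pattern demanded by condition (i).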
We have learned that this condition was known much earlier to Schoenberg; it appears in his fundamental paper [18] as the condition for a smoothing formula to map onto itself the space of polynomials of degree $p$. The Poisson formula, which is the crucial connection between transforms on $Z^n$ and $R^n$, also figures in the valuable recent work of Bramble and Hilbert [19].

Remark 2. Theorem I differs a little from the result stated in the introduction, where we assumed stability but gave weaker forms for the conditions in the theorem. In the third condition, for example, the weaker statement imposed no restriction (2.3) on the weights; given the stability condition (1.14), however, this restriction is automatic. In the second condition, we assumed no particular form for the expansion of polynomials, only that they could somehow be represented as combinations of $\varphi$ and its translates, as in (2.14).
We therefore want to show that stability forces these weights $\mu_j$ to be equal. The role of stability is to make any representation (2.14) unique; the only representation of the zero function has all weights zero. Now we use translation invariance, shifting both sides of (2.14) through the unit vector $e_\nu$ in the $t_\nu$ direction. By uniqueness $\mu_{j - e_\nu} = \mu_j$ for all $j$ and $\nu$, and the weights are equal.
For an expension
o S-e, we add
t h e same argument gives s h i f t i n g through
el
Uniqueness now gives
a
j
=
uo
= oj
for
v = 2,. ..,n
; after
1 t o find
+
jlp
, .so t h a t
This means that the last sum is a polynomial of the form required by (ii); the induction is obvious. Thus the statement in the introduction may be regarded as a corollary to Theorem I.

Remark 3.
The condition $\hat\varphi(0) \ne 0$ is in general not necessary for approximation to be possible in $\mathcal{H}^s$, if the restriction (2.3) on the weights is removed. Suppose for example that $n = 1$, and that $\hat\varphi$ has a zero of order $\rho$ at the origin. Then if (2.1) holds with $p$ replaced by $p + \rho$, one can construct weights $w^h$ of order $h^{-\rho}$ such that (2.2) holds. A converse of this result is also possible. We emphasize that in such a situation the transform $\hat\varphi$ vanishes at all $2\pi j$, and (1.14) is satisfied at $\xi_0 = 0$. Therefore the associated Ritz-Galerkin system will be numerically unstable, and such a choice of $\varphi$ is to be avoided.

Remark 4. Theorem I can be made much more precise in a number of directions. (These refinements may be of little interest to the sensible reader, who wants to get on with the plot; he can safely disregard Theorem I'.) First, we can show that the exponent $p+1-s$ in the error estimate (2.2) is best possible for any $u \ne 0$, and find the infimum of constants $c_s$ for which this estimate holds. Second, we note that the smoothness of $\varphi$ and the order of the derivatives of $\hat\varphi$ which vanish at the points $2\pi j$ were specified in Theorem I by the same integer $p$. This relation is the most efficient in practice, and consequently the most common, but there is no a priori reason why the two indices must agree. In the following we allow $\varphi$ to be less smooth, say $\varphi$ in $\mathcal{H}^q_c$ where $q \le p$. There is no use in permitting extra smoothness, $q > p$, since the estimates are not improved. Third, we strengthen the converse part of the theorem by deducing this smoothness of $\varphi$ rather than assuming it.

A further generalization, which we shall forego, is to give estimates also in fractional and negative norms. In fact the reader can verify that such results follow directly from our proofs; the norm (1.3), involving the Fourier transform, applies equally well to all real $s$. We also omit any discussion of $L^p$ estimates for $p \ne 2$.
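The instability described in Remark 3 can be illustrated in one dimension. The sketch below assumes that $\psi$ is a finite combination of integer translates of the hat function, with hypothetical coefficients `mask`; the symbol of the Gram matrix of the translates of $\psi$ is then the symbol of the hat-function Gram matrix multiplied by the squared modulus of the trigonometric polynomial of the mask. The name `gram_symbol` is an assumption made for this illustration.

```python
import numpy as np

# Inner products of hat(x) with hat(x - k), for k = -1, 0, 1.
HAT_GRAM = {-1: 1.0 / 6.0, 0: 2.0 / 3.0, 1: 1.0 / 6.0}

def gram_symbol(mask, theta):
    """Symbol of the Gram matrix of the integer translates of
    psi(x) = sum_m mask[m] * hat(x - m), evaluated at the angles theta."""
    theta = np.asarray(theta, dtype=float)
    hat_sym = sum(g * np.cos(k * theta) for k, g in HAT_GRAM.items())
    mask_sym = np.abs(
        sum(c * np.exp(-1j * m * theta) for m, c in enumerate(mask))
    ) ** 2
    return hat_sym * mask_sym
```

For the hat function itself (`mask = [1.0]`) the symbol lies between 1/3 and 1, so the basis is uniformly independent. For `mask = [1.0, -1.0]` the transform of $\psi$ vanishes at the origin and at every $2\pi j$, the symbol vanishes at $\theta = 0$, and the Gram matrices cannot be uniformly well conditioned.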
The more precise version of Theorem I is

THEOREM I'. For any integers $p \ge q \ge 0$, the following conditions are equivalent:
t
(I)
rp
(ii)
. cp
ja * ( t - j)
l i e s i n :(?
, {(o) #
lies in
, and
1~:
i s a polynomial Xn
eta , c + o .
term
, and
la1
, the
(p
..,tn
tl ,.
&th
function
leading
i s a d i s t r i b u t i o n with compact support, and
(iii) f o r each h+O
for
0
u
in
XP+l
t h e r e a r e weights
wh
5
such t h a t a s
J
The exponent every
u
p
0
,if
pi-1-s p
i s b c s t p o s s i b l e . f o r every
s
and
i s t h e l a r g c s t i n t e g e r f o r which ( i )
I n one dimension, t h e g r e a t e s t lower bound of possible
holds.
constants cs i s
>
With' n
1 t h e l a s t f a c t o r becomes
where " :a w
.
i s t h e d e r i v a t i v e of order
(This constant
Cs
cs = Cs
estimate (2.2) holds, with know only t h a t f o r every u h
<
b(e,u)
.
i n the direction
i s computed i n [Mlfor several
Me must emphasize t h a t we have
range
p+l
Q
;)'
not
established t h a t t h e
+
, for
e
each h ; vre
i n ?ip+l , it w i l l hold i n some
Our constants $C_s$ are in fact nothing but the constants involved in approximating polynomials of degree $p+1$. We know from condition (ii) in Theorem I that the polynomials of degree $p$ can be reproduced exactly, and the point is that for small $h$ any smooth function will resemble a polynomial locally. Therefore it should be no surprise that the error is asymptotically attributable to the terms of degree $p+1$ in a local Taylor expansion of $u$.

In fact we make the following conjecture: for every smooth function $u$ of one variable, the ratio of the error in best approximation in $\mathcal{H}^s$ to $h^{p+1-s}\,\|\partial^{p+1} u\|_{L^2}$ approaches this same constant $C_s$. This means that the constants $C_s$ are a very useful criterion for the comparison of two different finite element functions $\varphi$ (given that they have the same value of $p$); the constants are relevant not just to some extreme choice of $u$, but rather to all choices. With $n > 1$, the situation seems to be essentially the same; there will be a tiny subset of $u$ for which the asymptotic constant exceeds $C_s$, namely those with $\hat u$ supported away from the optimal direction $\omega$.
(We hope to show elsewhere that this conjecture extends also to approximation in the maximum norm.)

Now we consider the approximation problem when the space $S^h$ is generated by several functions $\varphi_1, \ldots, \varphi_N$. In this case there are $N$ unknowns $v_{i,j}$, $i = 1, \ldots, N$, to be computed at each meshpoint from the finite element equations $A^h v^h = f^h$. The merit of such an extension is to make high accuracy $p$ possible with relatively simple functions $\varphi_i$; their support can be small, so that frequently the required inner products are easier to compute and boundary conditions simpler to match, and they can have additional interpolating properties. In the one-dimensional Hermite case, for example, each $\varphi_i$ is supported on the interval $[-1, 1]$, and the $(i-1)$st is the only one of the first $N - 1$ derivatives to be non-zero at the origin. This means that the quantities $v_{i,j}$ which satisfy the finite element equation have physical significance in themselves, as the "displacement", "slope", "stress", etc. of the approximate solution at the meshpoint $x = jh$. This has been found very attractive by users.
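The Hermite case with $N = 2$ can be made concrete. The sketch below assumes the usual cubic Hermite functions on $[-1, 1]$, normalized so that `phi1` carries the value ("displacement") and `phi2` the first derivative ("slope") at the origin; this normalization and the function names are assumptions. It also checks the remark made in the introduction that the spline basis is a combination of the Hermite basis and its translates, using the nodal values and slopes of the standard cubic B-spline as weights.

```python
import numpy as np

def phi1(x):
    """Cubic Hermite function with phi1(0) = 1 and phi1'(0) = 0."""
    a = np.abs(x)
    return np.where(a <= 1.0, (1.0 - a) ** 2 * (1.0 + 2.0 * a), 0.0)

def phi2(x):
    """Cubic Hermite function with phi2(0) = 0 and phi2'(0) = 1."""
    a = np.abs(x)
    return np.where(a <= 1.0, x * (1.0 - a) ** 2, 0.0)

def cubic_bspline(x):
    """Standard centered cubic B-spline on unit meshes, support [-2, 2]."""
    a = np.abs(x)
    inner = 2.0 / 3.0 - a**2 + 0.5 * a**3
    outer = (2.0 - a) ** 3 / 6.0
    return np.where(a <= 1.0, inner, np.where(a <= 2.0, outer, 0.0))

def bspline_from_hermite(x):
    """The same B-spline, assembled from translates of phi1 and phi2 with
    weights B(-1), B(0), B(1) = 1/6, 2/3, 1/6 and B'(-1), B'(1) = 1/2, -1/2."""
    return (phi1(x + 1) / 6.0 + 2.0 * phi1(x) / 3.0 + phi1(x - 1) / 6.0
            + 0.5 * phi2(x + 1) - 0.5 * phi2(x - 1))
```

The two evaluations of the B-spline agree on every mesh interval, since on each interval both are cubics with the same values and slopes at the two endpoints.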
THEOREM II. Suppose $\varphi_1, \ldots, \varphi_N$ are in $\mathcal{H}^q_c$. Then the following conditions are equivalent: (i)
there are linear combinations Pa
of the q i which
satisfy A
A
(2.194
~~(= 0 la, ) y0(2nj)
0
= 0
for j #
o
for all j E z n , 1 s la1 L p
(ii) there are linear combinations Y,
of the cpi
which
satisfy
(iii) for each u that for s = D , l , .
in ?tP+l there,areweights w
. .q,
i, j
wch
Remark 5. Babuska has asked us whether the following condition on the $\varphi_i$ is equivalent to those in Theorem II:

(iv) there is a finite linear combination $\Phi$ of $\varphi_1, \ldots, \varphi_N$ and their translates which satisfies the conditions imposed in the case $N = 1$, in particular

(2.23)    $D^\alpha \hat\Phi(2\pi j) = 0$ for $|\alpha| \le p$, $j \ne 0$.

We shall prove that (i) => (iv) => (iii), so that Babuska's insight allows the reduction of many $N$-dimensional problems to the simpler case $N = 1$. We note that the translates admitted in (iv) are the functions $\varphi_i(x - j)$.
It is obvious that (iv) => (iii): if the function $\theta$ satisfies the conditions of Theorem I, then combinations $\sum_j w_j^h \theta_j^h$ can be used to approximate $u$ as required in (2.21-22).

To prove that (i) => (iv), let $t_\alpha$, $|\alpha| \le p$, denote the unique polynomial of degree […], jointly in $e^{i\xi_1}, \ldots, e^{i\xi_n}$, such that […] for $|\gamma| \le p$. Then we define the required $\theta$ by giving its Fourier transform: […]. Since $t_\alpha \hat\psi_\alpha$ is the transform of a finite combination of translates of $\psi_\alpha$, this $\theta$ is indeed a finite linear combination of the original $\varphi_i$ and their translates. By the periodicity of the $t_\alpha$, we […], which by (2.19a) equals one if $j = 0$ and zero otherwise. To verify (2.23), we use the Leibniz rule for the derivative of a product: […], replacing $\alpha$ by $\gamma - \beta$. This vanishes, by the fundamental property (2.19c) of the $\psi_\alpha$, and condition (iv) is verified.

We note that (2.23) holds even for $1 \le |\alpha| \le p$, when $j = 0$. Thus the […] which were constructed in the proof of Theorem I reduce to […] when we are approximating $u$ by combinations of the functions $\theta_j^h$.
We turn now from the mean-square approximation of a function and its derivatives to the problem of pointwise approximation. The latter is crucial numerically, even though the original differential problem (e.g. that of minimizing a quadratic functional) leads more naturally to the former. It is a fundamental property of the finite element method that the two go together. By this we mean not only that the order $p + 1 - s$ of the best approximation is the same in the two norms, as the next theorem shows, but also that the order $r$ of the actual approximation by the Ritz-Galerkin function $u^h$ is the same in both norms. This exponent $r$ is the smaller of $p + 1 - s$ and $2(p + 1 - m)$.
The following estimate includes known results for spline approximation, in the special case of equally spaced knots. Splines are typical of many important choices of the $\varphi_i$, in that their derivatives of some order $q$ have jump discontinuities; we note that this leaves them safely in $W_\infty^q$. (In the spline case, $q$ is the degree of the polynomial in each interval, and coincides with the accuracy exponent $p$.)

THEOREM III. Suppose $\varphi_1, \ldots, \varphi_N$ satisfy the conditions of the previous theorem, and have bounded derivatives of order $q$. Then if […] it follows that […]

Proof. We write $x = kh + th$, where $k$ is the integral part of $x/h$ and $t$ lies in the unit cube $0 \le t_\nu < 1$. Expanding in powers of $h$, […]
We know from the start that $u^h$ cannot be closer to $u$ than the optimal approximation from the space $S^h$. Therefore the error, measured in the $H^s$ norm, will be at best of the order $h^{p+1-s}$ determined in Theorems I and II. The question is whether the approximation produced by the finite element method is actually of this optimal order. There is no a priori reason why this should always be so; in fact, if we fix $\varphi$ in $H^p$ and increase the order $2m$ of the equation, there is every reason to think otherwise. It seems unlikely that the error in $H^0$ would stay of order $p + 1$ until $m$ exceeds $p$, at which point $\varphi$ would no longer lead to admissible trial functions and the method would collapse. Therefore we anticipate that the order of accuracy will depend on $m$ as well as $p$ and $s$.

The correct order can in fact be determined over the range $0 \le s \le m$ by an elegant variational argument which we learned from Martin Schultz. A special but still typical case of this argument has been published by Nitsche [21], and Aubin has shown us an alternative route to the same result.

We begin with the fundamental result (cf. Varga [22]) that in the case $s = m$, the error $u^h - u$ is indeed of the optimal order $h^{p+1-m}$ allowed by the approximation theorems. Repeating the standard argument, we deduce from the variational equations (1.5) and (1.7) that

$a(u^h - u, w^h) = (f, w^h) - (f, w^h) = 0$ for all $w^h$ in $S^h$. (2.31)

Therefore by ellipticity

$\rho \, \|u^h - u\|_{H^m}^2 \le \mathrm{Re}\, a(u^h - u, u^h - u) = \mathrm{Re}\, a(u^h - u, w^h - u) \le K \, \|u^h - u\|_{H^m} \, \|w^h - u\|_{H^m}$

for all $w^h$. Cancelling the first factor on the right, and choosing $w^h$ as the optimal approximation to $u$,

$\|u^h - u\|_{H^m} \le \frac{K}{\rho} \, C_m \, h^{p+1-m} \, \|u\|_{H^{p+1}}$. (2.32)

Here we need $u$ in $H^{p+1}$ in order to apply Theorems I and II, and therefore we assume that $f$ lies in $H^{p+1-2m}$. If $f$ is less smooth, the estimates of this section are not difficult to revise. It is to the estimate (2.32) that we may apply our calculation in Theorem I' of the minimal constant $C_m$ in approximation.
Now we give Schultz's argument for $s < m$. It begins with the adjoint problem $L^* v = g$, which is equivalent to the variational equation

$a(y, v) = (y, g)$ for all $y$ in $H^m$. (2.33)

Again the ellipticity of $a$ guarantees a unique solution $v$ for $g$ in $H^{-s}$, and furthermore that

$\|v\|_{H^{2m-s}} \le C \, \|g\|_{H^{-s}}$. (2.34)

Taking $y = u^h - u$, and recalling (2.31),

$(u^h - u, g) = a(u^h - u, v) = a(u^h - u, v - w^h)$ for all $w^h$ in $S^h$. (2.35)

Therefore

$|(u^h - u, g)| \le K \, \|u^h - u\|_{H^m} \, \|v - w^h\|_{H^m}$. (2.36)

To estimate the last term, we choose $w^h$ as the best $H^m$ approximation to $v$, and appeal to Theorems I and II:

$\|v - w^h\|_{H^m} \le C \, h^{p+1-m} \, \|v\|_{H^{p+1}}$ if $p + 1 \le 2m - s$,
$\|v - w^h\|_{H^m} \le C \, h^{m-s} \, \|v\|_{H^{2m-s}}$ if $p + 1 > 2m - s$. (2.37)

(In the second case we reduced $p$ to $2m - s - 1$ before applying the approximation theorems; if their hypotheses hold for a given $p$ they certainly hold for a smaller one.) Substituting (2.32) and (2.37) into (2.36) and using (2.34), we have

$|(u^h - u, g)| \le C \, h^r \, \|u\|_{H^{p+1}} \, \|g\|_{H^{-s}}$, $\quad r = \min(p + 1 - s, \; 2(p + 1 - m))$. (2.38)

Now as $g$ runs over the unit ball in $H^{-s}$, the supremum of the left side is exactly the norm of $u^h - u$ in the dual space $H^s$. Therefore the final error estimate is

$\|u^h - u\|_{H^s} \le C \, h^r \, \|u\|_{H^{p+1}}$.

We notice that with $s = m$ the first expression in $r$ is the smaller, and $r = p + 1 - m$ in agreement with (2.32).
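The exponent $r = \min(p + 1 - s, 2(p + 1 - m))$ is easy to tabulate. The following is a minimal sketch; the function name `galerkin_order` and the validity checks ($0 \le s \le m \le p$, taken from the range discussed in the text) are illustrative assumptions rather than anything defined in the paper.

```python
def galerkin_order(p: int, m: int, s: int) -> int:
    """Return r = min(p + 1 - s, 2 * (p + 1 - m)).

    p: accuracy exponent of the trial functions,
    m: half the order 2m of the elliptic equation,
    s: index of the norm H^s in which the error is measured.
    """
    if not 0 <= s <= m:
        raise ValueError("the argument covers only 0 <= s <= m")
    if m > p:
        raise ValueError("trial functions are not admissible when m exceeds p")
    return min(p + 1 - s, 2 * (p + 1 - m))


if __name__ == "__main__":
    # Cubic elements (p = 3) for a second-order problem (m = 1).
    print(galerkin_order(3, 1, 0), galerkin_order(3, 1, 1))
    # Quadratic elements (p = 2) for a fourth-order problem (m = 2):
    # here the second term 2 * (p + 1 - m) limits the order in H^0.
    print(galerkin_order(2, 2, 0), galerkin_order(2, 2, 2))
```

The second example shows the case anticipated in the text: when $m$ is large relative to $p$, the error in $H^0$ no longer attains the approximation order $p + 1$.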
[1] O. C. Zienkiewicz, "The finite element method in structural and continuum mechanics," London: McGraw-Hill (1967).
[2] R. Courant, "Variational methods for the solution of problems of equilibrium and vibrations," Bull. Amer. Math. Soc. 49, 1-23 (1943).
[3] G. Polya, "Sur une interprétation de la méthode des différences finies qui peut fournir des bornes supérieures ou inférieures," Comptes Rendus 235, 995-997 (1952).
[4] J. P. Aubin, "Behavior of the error of the approximate solutions of boundary value problems for linear elliptic operators by Galerkin's and finite difference methods," Rend. Sem. Mat. Padova.
[5] G. Fix and G. Strang, "Fourier analysis of the finite element method in Ritz-Galerkin theory," Studies in Appl. Math. 48, 265-273 (1969).
[6] D. Schaeffer, "Approximation of elliptic boundary value problems by difference equations; I. Factorization of the symbol," J. Functional Analysis 1970.
[7] V. Thomée, "Elliptic difference operators and Dirichlet's problem," Contr. Diff. Eqns. 3, 301-324 (1964).
[8] J. P. Aubin, "Approximation des espaces de distributions et des opérateurs différentiels," Bull. Soc. Math. France, Mémoire 12 (1967).
[9] G. Birkhoff, M. H. Schultz and R. S. Varga, "Hermite interpolation in one and more variables with applications to partial differential equations," Numer. Math. 11, 232-256 (1968).
[10] M. H. Schultz, "Rayleigh-Ritz-Galerkin methods for multidimensional problems," SIAM J. Numer. Anal. 6, 523-538 (1969).
[11] F. DiGuglielmo, "Méthode des éléments finis: une famille d'approximations des espaces de Sobolev par les translatés de p-fonctions," Manuscript, 1970.
[12] I. Babuska, "Approximation by hill functions," to appear.
[13] J. J. Goel, "Construction of basic functions for numerical utilization of Ritz's method," Numer. Math. 12, 435-447 (1968).
[14] V. Thomée, "On the convergence of difference quotients in elliptic problems," Univ. of Maryland, Note BN-537 (1968).
[15] M. Zlamal, "On the finite element method," Numer. Math. 12, 394-409 (1968).
[16] R. J. Herbold, M. H. Schultz, and R. S. Varga, "Quadrature schemes for the numerical solution of boundary value problems by variational techniques," Aequationes Mathematicae 3, 96-119 (1969).
[17] R. Boas, "Entire functions," New York: Academic Press (1954).
[18] I. J. Schoenberg, "Contributions to the problem of approximation of equidistant data by analytic functions, Parts A and B," Quart. Appl. Math. 4, 45-99, 112-141 (1946).
[19] J. H. Bramble and S. R. Hilbert, "Bounds for a class of linear functionals with applications to Hermite interpolation," Numer. Math.
[20] G. Strang, "The finite element method and approximation theory," Numerical Solution of Partial Differential Equations II (SYNSPADE), Academic Press, 1971.
[21] J. Nitsche, "Ein Kriterium für die Quasi-Optimalität des Ritzschen Verfahrens," Numer. Math. 11, 346-348 (1968).
[22] R. S. Varga, "Hermite interpolation-type Ritz methods for two-point boundary value problems," in: "Numerical solution of partial differential equations," J. H. Bramble, Ed., New York: Academic Press, 365-373 (1965).
CENTRO INTERNAZIONALE MATEMATICO ESTIVO
(C.I.M.E.)
M. ZERNER
APPROXIMATION CHARACTERISTICS OF COMPACT SETS IN FUNCTION SPACES AND ELLIPTIC BOUNDARY VALUE PROBLEMS
Course held at Erice, June 27 to July 7, 1971
by M. Zerner (Université de Nice)
1. Consider the boundary value problem

$A u = 0$ in $\Omega$, $\quad B_j u = \varphi_j$ on $\partial\Omega$, $\quad 1 \le j \le m$, (1)

where $\Omega$ is a sufficiently regular bounded open subset of $R^n$, $A$ is an elliptic operator of order $2m$ with regular coefficients, the $B_j$ are operators of order $m_j$, and the $\varphi_j$ are functions given in balls of the spaces $W_p^{k-m_j-1/p}(\partial\Omega)$. We expressly assume that this problem is well posed and that, if $u = G(\varphi_1, \ldots, \varphi_m)$ is the solution, then $G$ is an isomorphism of $V_k = \prod_{1 \le j \le m} W_p^{k-m_j-1/p}(\partial\Omega)$ onto $W_p^k(\Omega) \cap A^{-1}(0)$, and this for every sufficiently large $k$. We want indications on how to discretize this problem so as to guarantee a precision of $\varepsilon$ in the sense of $W_p^l(\Omega)$ for every datum $\varphi = (\varphi_1, \ldots, \varphi_m)$ with $\|\varphi\|_k \le 1$.
Everyone knows that, to reach this result without an inordinate waste of memory and machine time, one must build a discretization mesh that is finer near the boundary than in the interior. However, to go further one has to give an axiomatic description of what is meant by this. From now on we assume $k$ large enough for all the $W_p^{k-m_j-1/p}$ to be spaces of continuous functions.
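We then look for finite subsets $\mathcal{R}$ of $\Omega$ and $\mathcal{R}_0$ of $\partial\Omega$, a linear map $L$ from $(R^{\mathcal{R}_0})^m$ into $R^{\mathcal{R}}$, and a map $J$ from $R^{\mathcal{R}}$ into $W_p^l(\Omega)$ such that

(i) for every $v \in R^{\mathcal{R}}$, $(Jv)|_{\mathcal{R}} = v$;

(ii) for every $\varphi \in V_k$ satisfying $\|\varphi\|_k \le 1$, […], where $\rho$ denotes the restriction to $\mathcal{R}_0$.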
Remark 1: This version of the formalism assumes that the discretization involves only the values of the function itself at the mesh points. Some finite element methods, for example, involve derivatives. There is nothing essential in this, and all the general results can be kept at the price of a slight complication of the formalism. One can also bring in averages over elements of the mesh, etc.

Reminder of definitions: Let $K$ be a subset of a normed space $E$. The $n$-th width of $K$ (in $E$), denoted $d_n(K)$, is the number

$d_n(K) = \inf_{L \in G_n} \sup_{x \in K} \inf_{y \in L} \|x - y\|$,

where $G_n$ is the set of $n$-dimensional subspaces of $E$. In other words, one takes the distance from the elements of $K$ to $L$ and maximizes it over $K$. Finally one varies the $n$-dimensional subspace $L$ so as to minimize the result. If $F$ is another normed vector space and $F \subset E$, we write $d_n(F, E)$ for the $n$-th width of the unit ball of $F$ in $E$.

Let $N_0$ denote the number of elements of $\mathcal{R}_0$.
Since the map $J \circ L \circ \rho$ has rank $N_0$ and $G$ is an isomorphism of $V_k$ onto its image, we have:

Proposition 1: For conditions (i) and (ii) above to hold, it is necessary that

[…] (2)

where the number $C$ does not depend on $\varepsilon$.

Using the known estimates of the $n$-th width (Kolmogorov [4], Birman and Solomjak [2], El Kolli [3]), one deduces

[…] (3)

(here and in what follows, $C$ denotes a "constant" in the analysts' sense, that is, one which does not depend on all the other numbers involved, though it may depend on the number of the formula in which it appears). In simple cases, one can verify the reverse inequality of (3).
The problem is therefore now to see whether $\mathcal{R}$ can have as many points as there are points in $\mathcal{R}_0$. In practice the points of $\mathcal{R}_0$ correspond to points of $\mathcal{R}$ "close" to $\partial\Omega$. Suppose we have a regular mesh $\mathcal{R}$ of step $h$, at least in a neighbourhood of the boundary, and that $\mathcal{R}_0$ has as many points as $\mathcal{R}$ has points whose distance to $\partial\Omega$ is less than $Ah$ ($A$ a certain constant). Then:

$N_0 \approx h^{-(n-1)}$ (4)

($a \approx b$ means $a = O(b)$ and $b = O(a)$; the relations here are for $h \to 0$). We call $N$ the number of elements of $\mathcal{R}$. If the mesh is regular, $N \approx h^{-n} \approx N_0 h^{-1}$, which would show, if there were any need, that an irregular mesh must be built. One would indeed like to have $N = O(N_0)$.
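The growth of $N / N_0$ on a regular mesh can be checked by direct counting. The sketch below is only an illustration under assumptions not in the text: $\Omega$ is taken to be the unit cube $[0,1]^n$, the mesh has $m + 1$ nodes per side ($h = 1/m$), and "close to the boundary" means within fewer than `layers` mesh steps of it; the function names are hypothetical.

```python
def uniform_grid_counts(n: int, m: int, layers: int = 1):
    """Count nodes of a uniform grid on the unit cube [0, 1]^n.

    Returns (N, N0): the total number of nodes, and the number of
    nodes whose distance to the boundary is less than layers * h,
    where h = 1 / m.
    """
    if n < 1 or m < 1 or layers < 1:
        raise ValueError("n, m and layers must be positive integers")
    total = (m + 1) ** n
    # Nodes with every index in [layers, m - layers] are far from the boundary.
    inner_side = max(m + 1 - 2 * layers, 0)
    near_boundary = total - inner_side ** n
    return total, near_boundary


def interior_to_boundary_ratio(n: int, m: int, layers: int = 1) -> float:
    """Return N / N0, which behaves like 1 / h on a regular mesh."""
    total, near_boundary = uniform_grid_counts(n, m, layers)
    return total / near_boundary


if __name__ == "__main__":
    for m in (10, 100, 1000):
        print(m, interior_to_boundary_ratio(2, m))
```

Refining the mesh by a factor of ten multiplies the ratio by roughly ten, which is the behaviour $N \approx N_0 h^{-1}$ that motivates an irregular mesh.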
We can say a little more by introducing a "local step". In part 2 we shall examine meshes $\mathcal{R}$ for which this concept is simple. One constructs a finite sequence of open sets $\Omega = \Omega_0 \supset \Omega_1 \supset \ldots \supset \Omega_d$. On $\Omega_i - \Omega_{i+1}$, the trace of $\mathcal{R}$ is a regular mesh of step $h_i$; $h_i$ is the local step at $x \in \Omega_i - \Omega_{i+1}$. One will naturally take $h_i < h_{i+1}$. If

[…] (5)

then, if one wants

$N = O(N_0)$, (6)

it will be necessary that […]. Moreover, if one sets […], it will be necessary to use a method whose order of precision is at least:

[…] (7)

2.
Keeping the notation above, we shall give a construction of the $\Omega_i$ and the $h_i$ which we shall call the "simplified Bahvalov construction" (abbreviated C.B.S.). We shall say a word about the modifications needed to pass to the constructions of Bahvalov [1]. These modifications do not invalidate the order-of-magnitude calculations below. Two parameters are given: $\alpha \in \, ]0, 1[$ and $\tau > 1$. We set:

$s(i) = \min(t \text{ integer}, \; \alpha t > i)$.

For every multi-integer $j$, $J(j, s)$ will denote the box: […]

We say that $J(j, s) \in \mathcal{J}_s$ if the box with the same centre and twice as large is contained in $\Omega$ and […]. We then set […], and $\Omega_i$ will be the interior of its closure. Finally: […]

We then have, $A$ denoting the side of the largest cube contained in $\Omega$ and provided $h$ is small enough that $2\tau h < A$:

[…]

from which one deduces that (5) holds with: […]

so that (7) becomes: […]
Proposition 2: In a C.B.S. satisfying $\alpha > 1/n$, we have $N \approx h^{-(n-1)}$ (and consequently (6) holds).

Lemma: Write $\Omega^{(r)} = \{x; \; x \in \Omega, \; d(x, \partial\Omega) \le r\}$. There exists $C$ such that $\mathrm{mes}(\Omega^{(r)}) \le C r$.
Proof: Since $\Omega^{(r)} \subset \Omega$, which is bounded, it suffices to prove that

$\limsup_{r \to 0} \; \mathrm{mes}(\Omega^{(r)}) / r < \infty$.

By compactness of the boundary, one reduces to the domain of a local chart and to proving that $\mathrm{mes}(U_r) \le C' r$, where

$U_r = \{x; \; x = (x', t), \; x' \in V, \; 0 < t \le \psi(x', r)\}$,

where $C'$ does not depend on $r$, nor does the bounded set $V$ in $R^{n-1}$, and $\psi(x', r) \le C'' r$ (*). Q.E.D.

(*) The lemma applies to a bounded Lipschitz open set, and consequently so does Proposition 2.

Proof of Proposition 2: If a point of $\Omega$ is not in $\Omega_i$, its distance to $\partial\Omega$ is at most […] $h$. One then uses the lemma to see that […]. There are therefore at most […] points of $\mathcal{R}$ in $\Omega_{i-1} - \Omega_i$, which, by the definition of $s(i)$, is bounded by […]; now the series is convergent, Q.E.D.

Bahvalov [1] gives two constructions, corresponding respectively to $\tau = 2$ and $3$. He is led to add points to the mesh near the boundary of the $\Omega_i$. He also gives the corresponding discretized equations for the problem […] and an interpolation method, the whole ensuring that the approximate solution differs from the exact solution by less than $\varepsilon$ in the sup norm. He then gives an iterative method for solving the discretized system (apparently that of Jacobi) which makes it possible to reach the precision $\varepsilon$ in $O(h^{2\alpha - 2} |\log h|)$ iterations in one case, and in $O(|\log h|^2)$ iterations in the other case, with $n = 2$ independent variables. In this second case the situation is therefore very satisfactory, but I must say that the part on "solution of the discretized systems" seemed to me particularly obscure.
3. We return to the general problem, but taking $p = +\infty$. Let $x \in \Omega$, set $R = d(x, \partial\Omega)$ and let $r \in \, ]0, R[$. We would like to estimate $M$, the number of elements of $\mathcal{R}$ belonging to the ball $U_r$ of centre $x$ and radius $r$. To understand what follows, one should think of $r$ as small compared with the diameter of $\Omega$ but large compared with $h \approx N_0^{-1/(n-1)}$.
Definition. Let $K$ be a bounded subset of a normed space $E$. The $n$-th width of $K$ in the sense of Gelfand is the number

$e_n(K) = \inf_{L^{(n)}} \; \sup_{x \in K \cap L^{(n)}} \|x\|$,

where $L^{(n)}$ ranges over the set of closed vector subspaces of codimension $n$. We recall the following two easy properties:

a) Let $D$ be a dense subset of $E'$. In the definition of the $n$-th width in the sense of Gelfand, it suffices to let $L^{(n)}$ range over the set of subspaces defined by systems of equations: […]

b) Let $L$ be a subspace of dimension $n + 1$ in $E$ and $K$ the ball of radius $\rho$ in $L$. We have: […]
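As an aid to reading property b), here is a sketch of the standard dimension-counting argument for a lower bound of this kind. It is stated under an explicit assumption, namely that $K$ contains the closed ball of radius $\rho$ of an $(n+1)$-dimensional subspace $L$; it is not claimed to be the exact statement or proof intended in the lecture.

```latex
% Assumption: K contains { x in L : ||x|| <= rho }, with dim L = n + 1.
\begin{align*}
&\text{Let } L^{(n)} \text{ be any closed subspace of codimension } n. \\
&\dim\bigl(L \cap L^{(n)}\bigr) \ge (n+1) - n = 1
  \ \Longrightarrow\ \exists\, x_0 \in L \cap L^{(n)},\ \|x_0\| = \rho. \\
&x_0 \in K \cap L^{(n)}
  \ \Longrightarrow\ \sup_{x \in K \cap L^{(n)}} \|x\| \ge \rho. \\
&\text{Taking the infimum over } L^{(n)}:\quad e_n(K) \ge \rho .
\end{align*}
```

The key step is that $n$ linear conditions cannot annihilate an $(n+1)$-dimensional space, so every admissible $L^{(n)}$ still contains a vector of $K$ of norm $\rho$.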
Let $K$ be the set described by the solutions of (1) as $\varphi$ ranges over the unit ball of $V_k$. We want a function $u \in K$ which vanishes at the $M$ points of $\mathcal{R} \cap U_r$ to be bounded by $\varepsilon$ in $W_\infty^l(U_r)$, whence the idea of determining $M$ by the relation

$e_M(K) \le \varepsilon$ (12)

for the norm of $W_\infty^l(U_r)$.

To show that this idea is not entirely utopian, we shall restrict ourselves to the case $n = 2$, $k > 1$, $l = 0$, $\Omega$ simply connected, and prove two very simple inequalities.

Lemma: Under the above hypotheses there exists $A$ such that $u$ decomposes as $u = u_1 + u_2$, with $u_1$ holomorphic, $u_2$ antiholomorphic, and […]. If $u$ is a polynomial of degree $k$, so are $u_1$ and $u_2$; if the derivatives of $u$ of order $<$ […] vanish at $x$, so do those of $u_1$ and $u_2$.
We shall take this lemma for granted; it summarizes classical results of function theory.

Proposition 3: There exists $C$ such that

[…]

where $R = d(x, \partial\Omega)$.

Proof: We take for $L^{(k)}$ the set of functions which vanish, together with their derivatives of order $< k_1$, at $x$. In view of the relations […], this is a subspace of codimension at most […], whence […]. Then, since $u_1 / (x_1 + i x_2)^{k_1}$ and $u_2 / (x_1 - i x_2)^{k_1}$ are still respectively holomorphic and antiholomorphic, and using the maximum principle,

$|u_j(y)| \le C_1 (r/R)^{k_1}$.

Proposition: Let $R'$ be such that $\bar\Omega$ is contained in the open disc of centre $x$ and radius $R'$. There exists $C$ such that, for every $k$: […]

Proof: We use property b). We take as subspace of dimension […] the set of harmonic polynomials of degree at most $k_1$. We observe that there exists $B > 0$ such that $u$ harmonic and bounded by $B$ for $\|y - x\| < R'$ implies $u \in K$. We divide $u_1(y_1, y_2)$ by […] and $u_2(y_1, y_2)$ by $(y_1 - x_1 - i y_2 + i x_2)^{k_1}$. Finally we perform an inversion of centre $x$ and argue as in the proof of Proposition 3.

Concluding remark: The densities predicted by Propositions 2 and 3 are much lower than those of the meshes constructed in part 2. To make use of them, one would have to find discretization formulas exploiting the full regularity of the solution, hence of variable order. Apart from the fact that we are not there yet, we have seen that the meshes of part 2 suffice to ensure that the order of magnitude of the total number of points is that of the number of boundary points.
BIBLIOGRAPHY

[1] N. S. Bahvalov: O čislennom rešenii zadači Dirihle dlja uravnenija Laplasa, Vestnik Moskovsk. Un. 5 (1959).
[2] Birman and Solomjak: Approximation polynômiale par morceaux des fonctions de classe $W_p^\alpha$, Mat. Sbornik 73 (115): 3 (1967), 331-355.
[3] A. El Kolli: n-ième épaisseur dans les espaces de Sobolev. Note aux C. R. Acad. Sc. Paris t. 272 (1971), 537-539.
[4] A. N. Kolmogorov: Ann. of Math. (2) 37 (1936), 107-111.