MORE ON THE LINEAR SEARCH PROBLEM

BY ANATOLE BECK

ABSTRACT

The linear search problem concerns a search made in the real line for a point selected according to a given probability distribution. The search begins at zero and is made by continuous motion with constant speed along the line, first in one direction and then the other. The problem is to search in such a manner that the expected time required for finding the point according to the chosen plan of search is a minimum. This plan of search is usually conceived of as having a first step, a second, etc., and in that case, this author has previously shown a necessary and sufficient condition on the probability distribution for the existence of a search plan which minimizes the expected searching time. In this paper, we define a notion of search in which there is no first step, but the steps are instead numbered from negative to positive infinity. These new rules change the problem, and under them, there is always a minimizing search procedure. In those cases which satisfy the earlier criterion, the solutions obtained are essentially the same as those obtained previously.

Introduction. In a recent paper by this author, [1], the linear search problem is discussed, and a necessary and sufficient condition is derived for the existence of a search procedure having minimal expected path. It is shown that when the left and right upper derivatives of the normalized distribution function are both infinite at 0, then no matter how small the first steps might be, it is nonetheless advantageous to add a yet smaller step before them, thus decreasing the expected path length. In this paper, we consider a modification of the definition of search procedure in which there is no first step. The procedures are conceived of as beginning with an infinitesimal oscillation, as defined below.
Under this definition, which is a generalization of the concept of search procedure as defined in [1], a minimizing procedure exists for every distribution with finite first moment. Furthermore, if a minimizing procedure exists in the sense of [1], then the minimizing procedures derived here are the same ones, in a certain natural sense.

Definitions and fundamental notions. We begin with a probability distribution $F$ on the real line which has finite first moment

$$M_1 = M_1(F) = \int_{-\infty}^{+\infty} |t| \, dF(t).$$

$F$ is assumed to be normalized to be continuous from the left in the left half-line, continuous from the right in the right half-line, and continuous at 0, for reasons discussed in [1]. In that paper, we define a search procedure as a sequence $x = \{x_i\}_{i=1}^{\infty}$ with either

$$\cdots \le x_4 \le x_2 \le 0 \le x_1 \le x_3 \le \cdots \quad \text{or} \quad \cdots \le x_3 \le x_1 \le 0 \le x_2 \le x_4 \le \cdots.$$

Received June 4, 1965.
* The research work for this paper was supported by the National Science Foundation under grant No. GP 2559 to the University of Wisconsin.
A function $X(x,t)$ was defined for $-\infty < t < +\infty$ as the length of the path from 0 to $t$ along the broken line running from 0 to $x_1$ to $x_2$ to $x_3$ etc. The expectation of this function, $X(x) = \int_{-\infty}^{+\infty} X(x,t)\,dF(t)$, was called the expected path length for the procedure $x$, and $m_0$ was the infimum of $X(x)$ for all such $x$. In this paper, we shall designate the set of such search procedures as $\mathfrak{X}_0$, and the functions $X(x,t)$ and $X(x)$ by $X_0(x,t)$ and $X_0(x)$ respectively. Then we have

$$m_0 = m_0(F) = \inf \{ X_0(x) \mid x \in \mathfrak{X}_0 \}.$$

We next define a generalized search procedure. Let $x = \{x_i\}_{i=-\infty}^{+\infty}$, where

$$\cdots \le x_2 \le x_0 \le x_{-2} \le \cdots \le 0 \le \cdots \le x_{-1} \le x_1 \le x_3 \le \cdots,$$

and $\sum_{i=-\infty}^{0} |x_i| < \infty$. Let us imagine a point $t$ lying between $x_1$ and $x_3$. We can imagine a broken line from $t$ to 0 having for its vertices, in order, $t, x_2, x_1, x_0, x_{-1}, \ldots$. We can easily imagine the broken line traversed in the direction from $t$ to 0 (its length is $X_1(x,t) = |t| + \sum_{i=-\infty}^{2} 2|x_i|$), but it is harder to imagine motion in the opposite direction, since we stumble on the question: "What does one do first?" We say that the search plan $x$ so defined begins with an infinitesimal oscillation. As before, we define $X_1(x) = \int_{-\infty}^{+\infty} X_1(x,t)\,dF(t)$, where $X_1(x,t)$ is the distance from 0 to $t$ along the broken line whose vertices, in order, are

$$\ldots, x_{-2}, x_{-1}, x_0, x_1, x_2, \ldots.$$
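As in [1], we define $x^-$ and $x^+$ so that $F(t) = 0$ if $t < x^-$, $F(t) = 1$ if $t > x^+$, while $0 < F(t) < 1$ for $x^- < t < x^+$. If, for instance, $-\infty < x^- < 0$, $x^+ = +\infty$, then we allow the possibility of a search procedure with $x_{j-1} = x^-$, $x_j = +\infty$ for some $j$, and no entries $x_i$ for $i > j$. Similarly if $x^- = -\infty$, $0 < x^+ < +\infty$. In any case we do not allow any search procedures with entries $x_i$ for which $x^- \le x_i \le x^+$ does not hold.

For each $x \in \mathfrak{X}_0$, we can define a corresponding element $\hat{x} \in \mathfrak{X}_1$ by the following: If $\cdots \le x_4 \le x_2 \le 0 \le x_1 \le x_3 \le \cdots$, then

$$\hat{x}_i = \begin{cases} x_i & \text{if } i > 0, \\ 0 & \text{if } i \le 0. \end{cases}$$

If $\cdots \le x_3 \le x_1 \le 0 \le x_2 \le x_4 \le \cdots$, then

$$\hat{x}_i = \begin{cases} x_{i+1} & \text{if } i \ge 0, \\ 0 & \text{if } i < 0. \end{cases}$$

Thus $\hat{x} \in \mathfrak{X}_2 = \{ x \in \mathfrak{X}_1 \mid x_i = 0,\ \forall i < 0 \}$, and

$$\forall y \in \mathfrak{X}_2,\ \exists x \in \mathfrak{X}_0 : \hat{x} = y.$$

For each $x \in \mathfrak{X}_0$, we have, of course, $X_0(x) = X_1(\hat{x})$. We define

$$m_1 = m_1(F) = \inf \{ X_1(x) \mid x \in \mathfrak{X}_1 \}.$$

For each $x \in \mathfrak{X}_1$, we have

$$\begin{aligned} X_1(x) = E(X_1(x,t)) &= \sum_{j=-\infty}^{\infty} (-1)^{j+1} \int_{x_{j-2}}^{x_j} \Big[ |t| + \sum_{i=-\infty}^{j-1} 2|x_i| \Big] dF(t) \\ &= \int_{-\infty}^{+\infty} |t| \, dF(t) + 2 \sum_{i=-\infty}^{\infty} |x_i| \big( 1 - |F(x_i) - F(x_{i-1})| \big) \\ &= M_1(F) + 2 \sum_{i=-\infty}^{\infty} x_i \big( F(x_{i-1}) - F(x_i) - (-1)^i \big). \end{aligned}$$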
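As a rough numerical check of the identity $X_1(x) = M_1(F) + 2\sum |x_i| (1 - |F(x_i) - F(x_{i-1})|)$, the following Python sketch compares the closed form with a direct integration of the path length. Everything concrete in it is an assumption made for illustration only: the uniform distribution on $[-1, 1]$ (so $M_1 = 1/2$), the three-step plan $(0.5, -1, 1)$, regarded as an element of $\mathfrak{X}_2$ by taking $x_i = 0$ for $i \le 0$, and all function names.

```python
def path_length(plan, t):
    """Distance travelled from 0 until the point t is first reached."""
    travelled, pos = 0.0, 0.0
    for turn in plan:
        if min(pos, turn) <= t <= max(pos, turn):
            return travelled + abs(t - pos)
        travelled += abs(turn - pos)
        pos = turn
    raise ValueError("the plan never reaches t")


def expected_path_closed_form(plan, F, first_moment):
    """M_1 + 2 * sum |x_i| * (1 - |F(x_i) - F(x_{i-1})|), starting from x_0 = 0."""
    total, prev = 0.0, 0.0
    for turn in plan:
        total += abs(turn) * (1.0 - abs(F(turn) - F(prev)))
        prev = turn
    return first_moment + 2.0 * total


def expected_path_numeric(plan, cells=2000):
    """Midpoint-rule integral of the path length against the uniform law on [-1, 1]."""
    h = 2.0 / cells
    return sum(path_length(plan, -1.0 + (k + 0.5) * h) * 0.5 * h for k in range(cells))


def uniform_cdf(t):
    """Distribution function of the (assumed) uniform law on [-1, 1]."""
    return (t + 1.0) / 2.0


plan = [0.5, -1.0, 1.0]  # hypothetical plan: right to 0.5, left to -1, right to 1
closed_form = expected_path_closed_form(plan, uniform_cdf, first_moment=0.5)
numeric = expected_path_numeric(plan)
print(closed_form, numeric)
```

For this plan both quantities equal 1.75, the numeric one up to rounding, which agrees with integrating the three pieces of the path by hand.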
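1. LEMMA. Assume $x^- = -\infty$, $x^+ = +\infty$. Then for every $\varepsilon > 0$, $\exists B(\varepsilon) < \infty$ such that for all $x \in \mathfrak{X}_1$ with $X_1(x) < 2m_1$, we have

$$|x_j| \le \varepsilon \implies |x_{j+1}| < B(\varepsilon).$$

Proof. Let $P_\varepsilon = \min(\Pr(t > \varepsilon), \Pr(t < -\varepsilon))$. Assume $x_j > 0$; the other case is dual. Then $X_1(x,t) > 2|x_{j+1}|$, $\forall t > x_j$. Thus

$$2|x_{j+1}| \cdot P_\varepsilon \le \int_{\varepsilon}^{+\infty} X_1(x,t)\,dF(t) \le \int_{-\infty}^{+\infty} X_1(x,t)\,dF(t) < 2m_1,$$

so that $|x_{j+1}| < (m_1 / P_\varepsilon) = B(\varepsilon)$. Q.E.D.

The same proof will give

2. LEMMA. If $x^- < a \le b < x^+$, then $\exists B(a,b) < \infty$ such that for every $x \in \mathfrak{X}_1$ with $X_1(x) < 2m_1$,

$$x_j \in [a,b] \implies |x_{j+1}| < B(a,b).$$

3. COROLLARY (of Lemma 1). If $x^- = -\infty$, $x^+ = +\infty$, then for every $\varepsilon > 0$, $\exists C(\varepsilon) < \infty$ such that for every $x \in \mathfrak{X}_1$ with $X_1(x) < 2m_1$,

$$|x_j| \le \varepsilon \implies |x_{j+2}| < C(\varepsilon).$$

Proof. $|x_{j+1}| < B(\varepsilon)$, by Lemma 1. Thus $|x_{j+2}| < B(B(\varepsilon)) = C(\varepsilon)$. Q.E.D.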
4. DEFINITION. Choose any number $0 < \varepsilon_0 < x^+$ to be fixed for the remainder of the paper. (Note that if $x^- \ge 0$ or $x^+ \le 0$, the problem is completely trivial, as in [1].) Let $n_0$ be chosen so that $x_{n_0+1} \ge \varepsilon_0$, $x_i < \varepsilon_0$, $\forall i < n_0$. For each $x \in \mathfrak{X}_1$, define a new search plan $\tilde{x}$ by $\tilde{x}_i = x_{i+n_0}$. Then $X_1(x) = X_1(\tilde{x})$. $\tilde{x}$ is called the normalized form of $x$.
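5. LEMMA. Let $0 < \varepsilon < \varepsilon_0$. Then there is an $n = n(\varepsilon, F) > 0$ such that for every $x \in \mathfrak{X}_1$ with $X_1(x) < 2m_1$, $|\tilde{x}_i| < \varepsilon$, $\forall i < -n$.

Proof. Let $P = \Pr(t > \varepsilon_0)$, and let $k > 0$ be chosen so that either $|\tilde{x}_{-2k}| \ge \varepsilon$ or $|\tilde{x}_{-2k+1}| \ge \varepsilon$. Then

$$\varepsilon \le |\tilde{x}_{-2k}| + |\tilde{x}_{-2k+1}| \le |\tilde{x}_{-2k+1}| + |\tilde{x}_{-2k+2}| \le \cdots \le |\tilde{x}_{-2}| + |\tilde{x}_{-1}|.$$

Thus $k\varepsilon \le \sum_{i=-2k}^{-1} |\tilde{x}_i|$. For every $t > \varepsilon_0$, $t > \tilde{x}_{-1}$, so that

$$X_1(\tilde{x}, t) \ge 2 \sum_{i=-\infty}^{0} |\tilde{x}_i| \ge 2 \sum_{i=-2k}^{-1} |\tilde{x}_i| \ge 2k\varepsilon.$$

Thus we have

$$2k\varepsilon \cdot P \le X_1(\tilde{x}) = X_1(x) < 2m_1.$$

It follows that $2k < (2m_1 / \varepsilon P)$, and therefore $2k - 1$ is also. Q.E.D.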
6. LEMMA. I f X - < a < _ O < - b < x +, then 3 n = n ( a , b , F ) > O such that for all x e~.l with Xl(x ) < 2m I ,
2j~[a,b],Vj > n. Proof. Let P = min (Pr(t < a), Pr(t > b)). Choose j > 0 and assume that x2j e [a, b]. Then
~o < I~, I+ 1~1 < I ~ l + I ~ l <
"
--<
I ~ - - , [ + I~-I •
Thus, for t < ~7~4, X1(~7,t) >--Jeo and
Jgo " P <
Xl(;,t)dF(t) <
= Xt(x) < 2 m l ,
yielding j < (2ml/goP). If ~2i_le[a,b], the same analysis shows that jeoP < S ~ X l ( ~ , t ) d F ( t ) < 2ml. Thus, in either case, j < (2ml/eoP), and ~ ¢ [a, b] if k > (4ma/%P). Q.E.D. 7. Tn1EOR~M. I f X- = -- ~ , X + = + oO, then 3 y e ~ . l : X l ( y ) = rex. Proof. Let a sequence (x(n))~= 1 of elements o f ~ t be chosen so that Xl(x(n))-~mt as n -~ oo. We note that X1(£ t")) ~ ml also, so that it will not disturb the generality o f the p r o o f if we assume x (,) = ~(,), V n. Let each x (") be designated as {x~,) +i =o~ - - oo
*
Then 0 =< ... _< xP)a =<x~)~ =< eo,Vn, 0 =<... =< [x~)z]<[Xto")[
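process, we extract a subsequence $\{x^{(n_j)}\}_{j=1}^{\infty}$ of $\{x^{(n)}\}_{n=1}^{\infty}$ such that $\{x_i^{(n_j)}\}_{j=1}^{\infty}$ converges for each $i$, and such that $X_1(x^{(n_j)}) < 2m_1$, $\forall j$. We can assume without loss of generality that the sequence $\{x^{(n)}\}_{n=1}^{\infty}$ is actually the chosen subsequence. For each $i$, let $y_i = \lim_{n \to \infty} x_i^{(n)}$. Then $\cdots \le y_2 \le y_0 \le y_{-2} \le \cdots \le 0 \le \cdots \le y_{-1} \le y_1 \le \cdots$. Furthermore, for each $-\infty < a < 0 < b < +\infty$, we have $y_i \notin (a,b)$ for $i > n(a,b,F)$, so that $|y_i| \to \infty$ as $i \to +\infty$. Also, $|y_{-i}| \le \varepsilon$ if $i > n(\varepsilon, F)$, so that $y_i \to 0$ as $i \to -\infty$. Finally, if we set $P = \Pr(t > \varepsilon_0)$, we have

$$X_1(x^{(n)}, t) \ge \sum_{i=-\infty}^{0} 2|x_i^{(n)}|,\ \forall t > \varepsilon_0,\ \forall n.$$

Thus

$$P \cdot \sum_{i=-\infty}^{0} 2|x_i^{(n)}| \le \int_{\varepsilon_0}^{+\infty} X_1(x^{(n)}, t)\,dF(t) \le X_1(x^{(n)}) < 2m_1,$$

and

$$\sum_{i=-\infty}^{0} |x_i^{(n)}| < \frac{m_1}{P},\ \forall n.$$

Therefore $\sum_{i=-\infty}^{0} |y_i| \le (m_1 / P) < \infty$, and $y \in \mathfrak{X}_1$. To show that $X_1(y) = m_1$, choose a $\delta > 0$, and any $k$ large enough so that $\sum_{i=-\infty}^{-2k} |y_i| < \delta$. For each $n$, we now define a $w^{(n)} \in \mathfrak{X}_1$ as follows:

$$w_i^{(n)} = \begin{cases} x_i^{(n)} & \text{if } i < -2k, \\ (-1)^{i+1} \max(|x_i^{(n)}|, |y_i|) & \text{if } -2k \le i \le 2k+1, \\ (-1)^{i+1} \max(|x_i^{(n)}|, |w_{i-2}^{(n)}|) & \text{if } i > 2k+1. \end{cases}$$

Then $w_i^{(n)} = x_i^{(n)}$ for all but at most $s_k$ values of $i$, where $s_k = 2k + 1 + n(y_{2k}, y_{2k+1}, F)$. Choose any $\varepsilon > 0$ with $s_k \varepsilon < \delta$. Since $x_i^{(n)} \to y_i$, $\forall i$, we know that for all $n$ large enough, say all $n > n_1$, we have $|w_i^{(n)} - x_i^{(n)}| < \varepsilon$, $\forall i$. Then

$$X_1(w^{(n)}, t) < X_1(x^{(n)}, t) + s_k \varepsilon,\ \forall -\infty < t < +\infty,\ \forall n > n_1.$$

It follows that for $n > n_1$, $X_1(w^{(n)}) < X_1(x^{(n)}) + s_k \varepsilon$. Note that $w_i^{(n)} \to y_i$ as $n \to \infty$, $\forall i$, and define $v^{(n)} = \{v_i^{(n)}\} \in \mathfrak{X}_1$ by

$$v_i^{(n)} = \begin{cases} y_i & \text{if } i < -2k, \\ w_i^{(n)} & \text{if } i \ge -2k. \end{cases}$$

Then $X_1(v^{(n)}) < X_1(w^{(n)}) + 2\delta$, $\forall n$, since

$$\sum_{i=-\infty}^{-2k} 2|y_i| - \sum_{i=-\infty}^{-2k} 2|w_i^{(n)}| \le \sum_{i=-\infty}^{-2k} 2|y_i| < 2\delta.$$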
Thus we have $X_1(v^{(n)}) < X_1(x^{(n)}) + s_k \varepsilon + 2\delta$, $\forall n$. We see immediately that $X_1(v^{(n)}, t) \to X_1(y, t)$ uniformly for $y_{2k} \le t \le y_{2k+1}$, and thus

$$\int_{y_{2k}}^{y_{2k+1}} X_1(y,t)\,dF(t) = \lim_{n \to \infty} \int_{y_{2k}}^{y_{2k+1}} X_1(v^{(n)}, t)\,dF(t) \le \limsup X_1(v^{(n)}) \le \limsup X_1(x^{(n)}) + s_k \varepsilon + 2\delta < m_1 + 3\delta.$$

Since this inequality holds for all $k$ large enough, and $|y_i| \to \infty$ as $i \to +\infty$, we have

$$X_1(y) = \int_{-\infty}^{+\infty} X_1(y,t)\,dF(t) \le m_1 + 3\delta.$$

Since $\delta$ is arbitrary, we have $X_1(y) \le m_1$, which gives us $X_1(y) = m_1$, by definition of $m_1$. Q.E.D.

8. THEOREM. If $-\infty < x^- < 0$, $x^+ = +\infty$, then $\exists y \in \mathfrak{X}_1 : X_1(y) = m_1$.
Proof. Let $\{x^{(n)}\}_{n=1}^{\infty}$ be chosen so that $X_1(x^{(n)}) \to m_1$ as $n \to \infty$. Again, we can assume $x^{(n)} = \tilde{x}^{(n)}$, $\forall n$. Let $x^{(n)} = \{x_i^{(n)}\}_{i=-\infty}^{k_n}$, $\forall n$, where $-\infty < k_n \le +\infty$. For each $i$, consider the sequence $\{x_i^{(n)}\}_{n=1}^{\infty}$, where $x_i^{(n)} = +\infty$ if $i > k_n$. Either $\limsup_{n \to \infty} x_i^{(n)} < +\infty$, $\forall i$, or else $\exists k : \limsup_{n \to \infty} x_k^{(n)} = +\infty$, $\limsup_{n \to \infty} x_i^{(n)} < +\infty$, $\forall i < k$. In the first case ... $\to +\infty$ as $k \to +\infty$. Then the proof of Theorem 7 employed verbatim will show that for arbitrary $\delta > 0$ and all $k$ large enough,

$$\int_{y_{2k}}^{y_{2k+1}} X_1(y,t)\,dF(t) < m_1 + 3\delta.$$

Thus, as before, we have

$$X_1(y) = \int_{-\infty}^{+\infty} X_1(y,t)\,dF(t) = \int_{x^-}^{+\infty} X_1(y,t)\,dF(t) \le m_1,$$

and thus $X_1(y) = m_1$. On the other hand, if the opposite case holds, then a subsequence $\{x^{(n_j)}\}_{j=1}^{\infty}$ can be chosen from $\{x^{(n)}\}_{n=1}^{\infty}$ so that $x_k^{(n_j)} \to +\infty$ as $j \to \infty$, each sequence $\{x_i^{(n_j)}\}$ converges as $j \to \infty$ for $i < k$, and $X_1(x^{(n_j)}) < 2m_1$, $\forall j$. We can assume that the sequence $\{x^{(n)}\}_{n=1}^{\infty}$ itself satisfies all these conditions. Referring back to the proof of Theorem 8 in [1], we see that $\Pr(t < x_{k-1}^{(n)}) \to 0$ as $n \to \infty$, so that a fortiori, $y_{k-1} = x^-$. We now show, as in [1], that

$$\int_{x_{k-1}^{(n)}}^{x_k^{(n)}} X_1(y,t)\,dF(t) \le m_1,$$

so that $X_1(y) = \int_{-\infty}^{+\infty} X_1(y,t)\,dF(t) \le m_1$, and $X_1(y) = m_1$ in this case also. Q.E.D.

9. COROLLARY. If $x^- = -\infty$, $0 < x^+ < +\infty$, then $\exists y \in \mathfrak{X}_1 : X_1(y) = m_1$.

Proof. Clear by symmetry.

10. THEOREM. If $-\infty < x^- < 0 < x^+ < +\infty$, then $\exists y \in \mathfrak{X}_1 : X_1(y) = m_1$.
Proof. As before, we choose a sequence $\{x^{(n)}\}_{n=1}^{\infty}$ out of $\mathfrak{X}_1$ with $X_1(x^{(n)}) \to m_1$ as $n \to \infty$. We again assume that $x^{(n)} = \tilde{x}^{(n)}$, $\forall n$. We note that either $x^- < \liminf_{n \to \infty} x_i^{(n)}$, $\limsup_{n \to \infty} x_i^{(n)} < x^+$, $\forall i$, or else there is at least a value of $i$, call it $k$, such that one of the inequalities fails. In the first case, the analysis of Theorems 7 and 8 will again show that a sequence $y$ extracted in the indicated way will have $X_1(y) = m_1$. In the contrary case, assume that $\{x_k^{(n)}\}$ has a subsequence converging to $x^-$; the case of convergence to $x^+$ is dual. Then a subsequence $\{x^{(n_j)}\}_{j=1}^{\infty}$ can again be chosen so that $x_k^{(n_j)} \to x^-$ as $j \to \infty$, $\{x_i^{(n_j)}\}$ converges for each $i < k$, and $X_1(x^{(n_j)}) < 2m_1$, $\forall j$. Again, we assume $\{x^{(n_j)}\}_{j=1}^{\infty} = \{x^{(n)}\}_{n=1}^{\infty}$. Define $y_i$ as before for $i < k$, with $y_k = x^-$ and $y_{k+1} = x^+$. The previous analysis will now show that $X_1(y) = m_1$. Q.E.D.

We have now shown that in all cases, $\exists y \in \mathfrak{X}_1 : X_1(y) = m_1$. In [1], we showed that under certain circumstances, we have $\exists y \in \mathfrak{X}_0 : X_0(y) = m_0$. What is the relationship between these results?

11. LEMMA. $m_0(F) = m_1(F)$.

Proof. For every $x \in \mathfrak{X}_0$, $\hat{x} \in \mathfrak{X}_2$ and $X_0(x) = X_1(\hat{x})$. Thus

$$m_0 = \inf \{ X_0(x) \mid x \in \mathfrak{X}_0 \} = \inf \{ X_1(x) \mid x \in \mathfrak{X}_2 \} \ge \inf \{ X_1(x) \mid x \in \mathfrak{X}_1 \} = m_1.$$

On the other hand, let $x \in \mathfrak{X}_1$ and choose $\delta > 0$ arbitrarily. We have $x_i \to 0$ as $i \to -\infty$. Thus, we can choose a $k$ so that

$$-\delta < x_{k+2} \le 0 \le x_{k+1} < \delta.$$

Let $y \in \mathfrak{X}_0$ be defined by $y_i = x_{k+i}$, $\forall i \ge 1$. Then for each $-\infty < t < +\infty$, we have

$$X_0(y,t) \le X_1(x,t) + 4\delta.$$

Since $m_0 \le X_0(y)$, we have $m_0 - 4\delta \le X_1(x)$, $\forall x \in \mathfrak{X}_1$. Thus $m_0 - 4\delta \le m_1$. Since $\delta$ is arbitrary, $m_0 \le m_1$, giving $m_0 = m_1$. Q.E.D.
In [1], we define

$$F^-(0) = \limsup_{t \to 0^-} \frac{F(t) - F(0)}{t}, \qquad F^+(0) = \limsup_{t \to 0^+} \frac{F(t) - F(0)}{t}.$$

If at least one of these is finite, then $\exists y \in \mathfrak{X}_0$ with $X_0(y) = m_0$. There is no need for $y$ to be unique, of course. Under Theorems 7, 8 and 9 of this paper, we can find $y \in \mathfrak{X}_1$ with $X_1(y) = m_1 = m_0$. Are all the $y \in \mathfrak{X}_1$ with this property essentially representatives of elements of $\mathfrak{X}_0$? The answer is "yes", as seen from the next theorem.

12. THEOREM. Assume that $F^+(0) < \infty$. Let $y \in \mathfrak{X}_1 : X_1(y) = m_0$. Then

$$\exists\, -\infty < k < +\infty : y_k = y_{k-1} = 0.$$

Proof. Assume not. Then it is easily seen that $y_i \ne 0$, $\forall -\infty < i < +\infty$. Choose $D > 0$ with $F^+(0) < D < \infty$, and let $K > 0$ be chosen satisfying

1° $\dfrac{F(t) - F(0)}{t} < D$, $\forall\, 0 < t < K$,

2° $F(K) - F(-K) < \dfrac{1}{2}$,

3° $K < \dfrac{1}{2D}$.

Since $\sum_{i=-\infty}^{0} |y_i| < \infty$, there must be an odd, negative number $k$ with

$$y_k - y_{k+1} = |y_k| + |y_{k+1}| < K.$$

Define $x \in \mathfrak{X}_1$ by

$$x_i = \begin{cases} y_i, & \forall i > k, \\ y_{i-2}, & \forall i \le k. \end{cases}$$

Then

$$\begin{aligned} X_1(y) - X_1(x) &= 2\big[\, |y_{k-1}| (1 - F(y_{k-2}) + F(y_{k-1})) + |y_k| (1 - F(y_k) + F(y_{k-1})) \\ &\qquad + |y_{k+1}| (1 - F(y_k) + F(y_{k+1})) - |y_{k+1}| (1 - F(y_{k-2}) + F(y_{k+1})) \big] \\ &= 2\big[ (|y_{k-1}| + |y_k|)(1 - F(y_{k-2}) + F(y_{k-1})) + |y_k| (-F(y_k) + F(y_{k-2})) + |y_{k+1}| (-F(y_k) + F(y_{k-2})) \big] \\ &= 2\big[ (y_k - y_{k-1})(1 - F(y_{k-2}) + F(y_{k-1})) - (y_k - y_{k+1})(F(y_k) - F(y_{k-2})) \big]. \end{aligned}$$

We observe the following:

(a) $y_k \le y_k - y_{k-1}$, with equality only if $y_{k-1} = 0$.

(b) $y_{k-2} \le y_k \le y_k - y_{k+1} < K$, and $y_{k-1} \ge y_{k+1} \ge y_{k+1} - y_k > -K$, so that $F(y_{k-2}) - F(y_{k-1}) \le F(K) - F(-K) < \frac{1}{2}$, and $1 - F(y_{k-2}) + F(y_{k-1}) > \frac{1}{2}$.

(c) $F(y_k) - F(y_{k-2}) \le F(y_k) - F(0) \le D y_k$, with equality holding only if $y_k = 0$.

Thus,

$$X_1(y) - X_1(x) \ge 2 \Big[ y_k \cdot \frac{1}{2} - K \cdot D y_k \Big],$$

with equality holding only if $y_k = y_{k-1} = 0$. However, $X_1(y) - X_1(x) \le 0$ by assumption on $y$, while $KD < \frac{1}{2}$ by definition of $K$, so that

$$\frac{1}{2} y_k - K D y_k \ge 0.$$

It follows that $y_k = y_{k-1} = 0$. Q.E.D.
One important feature of an absolute minimum, aside from aesthetic considerations, lies in the fact that a recursion formula for the entries can sometimes be obtained by partial differentiation. Let $x^{(0)} \in \mathfrak{X}_1$, with $X_1(x^{(0)}) = m_1$. Assume that $F$ is differentiable at each $x_i^{(0)}$, $-\infty < i < +\infty$. Then $X_1(x)$ is a differentiable function of each $x_i$ at $x = x^{(0)}$, and

$$\frac{\partial X_1(x^{(0)})}{\partial x_i} = 0,\ \forall -\infty < i < +\infty.$$

Since $\partial X_1(x) / \partial x_i = 2\big[ (x_{i+1} - x_i) F'(x_i) - (-1)^i - F(x_i) + F(x_{i-1}) \big]$, we have

$$x_{i+1} = x_i - \frac{F(x_{i-1}) - F(x_i) - (-1)^i}{F'(x_i)}$$

and

$$F(x_{i-1}) = F(x_i) + (-1)^i + (x_i - x_{i+1}) F'(x_i).$$

Thus, under proper differentiability conditions, we can derive all the $x_i$ of a minimal solution if we have two consecutive ones. Clearly, if $F$ is a strictly increasing function, the two equations above will give us all the $x_i$. To extend the same observation to distributions which are not strictly increasing, we note that of all the values of $x$ for which $F(x)$ takes a given value, only that value of $x$ having the smallest absolute value can appear as an entry in a minimal search procedure. The formulae hold as well for $x \in \mathfrak{X}_1$ as for $x \in \mathfrak{X}_0$. The problem as originally posed has as yet no solution in a useful sense, even for approximations: the analysis here is too delicate to carry over to approximations, and the recurrence relations, which depend strongly on $F'$ (a very sensitive quantity), do not withstand the approximation process.

BIBLIOGRAPHY

1. Anatole Beck, On the linear search problem, Israel Journal of Math. 2 (1964), 221-228.*
2. Wallace Franck, On the optimal search problem, Technical Report No. 44, University of New Mexico, Oct. 1963.

THE HEBREW UNIVERSITY OF JERUSALEM, AND THE UNIVERSITY OF WISCONSIN
* Please note the following erratum in [1]: Equation 2° on page 224 should read: 2° $F(t) - F(0)$
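The recurrence $x_{i+1} = x_i - (F(x_{i-1}) - F(x_i) - (-1)^i) / F'(x_i)$ from the closing section of the paper above can be sketched in Python as follows. The logistic distribution, the starting pair of entries, and all function names are assumptions chosen purely for illustration. As the paper itself warns, the recurrence depends strongly on $F'$, so this is only a consistency check of the algebra (the computed $x_{i+1}$ makes the stated partial derivative vanish), not a practical numerical method.

```python
import math


def logistic_cdf(t):
    """Assumed example distribution F (not from the paper)."""
    return 1.0 / (1.0 + math.exp(-t))


def logistic_pdf(t):
    """Derivative F' of the logistic distribution function."""
    p = logistic_cdf(t)
    return p * (1.0 - p)


def partial_derivative(x_prev, x_i, x_next, i, F, dF):
    """dX_1/dx_i = 2[(x_{i+1} - x_i) F'(x_i) - (-1)^i - F(x_i) + F(x_{i-1})]."""
    return 2.0 * ((x_next - x_i) * dF(x_i) - (-1) ** i - F(x_i) + F(x_prev))


def next_turning_point(x_prev, x_i, i, F, dF):
    """Solve dX_1/dx_i = 0 for x_{i+1}, given x_{i-1} and x_i."""
    return x_i - (F(x_prev) - F(x_i) - (-1) ** i) / dF(x_i)


# Hypothetical consecutive entries: x_0 = -0.5 (even index), x_1 = 1.0 (odd index).
x_prev, x_i, i = -0.5, 1.0, 1
x_next = next_turning_point(x_prev, x_i, i, logistic_cdf, logistic_pdf)
residual = partial_derivative(x_prev, x_i, x_next, i, logistic_cdf, logistic_pdf)
print(x_next, residual)
```

With an odd index $i$ the computed $x_{i+1}$ lands on the negative side, beyond $x_{i-1}$, as the alternating ordering of the entries requires.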
A REMARK ON ALMOST PERIODIC TRANSITION OPERATORS

BY I. GLICKSBERG

ABSTRACT

Results of Rosenblatt on almost periodic transition operators are extended to the reducible case.

Let $X$ be a compact Hausdorff space and $T$ a non-negative linear operator on $C(X)$ with $T1 = 1$. Such an operator defines (and is defined by) a weak* continuous map $x \to t_x$ from $X$ into $P(X)$ (the space of probability measures on $X$) given by $Tf(x) = t_x(f)$ $(= \int f(y)\,t_x(dy))$, $f \in C(X)$. We shall call the closure of the union of the supports of the measures $t_x$ the support of $T$, denoted by $\Sigma_T$. Recently Rosenblatt [4, 5] has considered the (admittedly rare) situation in which $S = \{T^n : n \ge 1\}$ is almost periodic in the sense that the orbit $\{T^n f : n \ge 1\}$ is conditionally compact in $C(X)$ for all $f$ in $C(X)$. From [2] one knows that then $C(X) = C_0 \oplus C_p$, where $C_0$ is the closed invariant subspace consisting of all $f$ with $\|T^n f\| \to 0$, and $C_p$ is the closed invariant span of the eigenvectors of $T$ with eigenvalues of unit modulus. In [5] Rosenblatt showed that, when $T$ is irreducible in a certain sense, $C_p$ can be identified with $C(Y)$ for some quotient space $Y$ of $X$, while $T$ is induced on $C(Y)$ by a self-homeomorphism of $Y$. We wish to point out a very simple derivation of a stronger assertion, which in Rosenblatt's context says $Y$ is a compact monothetic group on which his self-homeomorphism is a translation.

To begin we recall that in the strong operator topology the closure $\bar{S}$ of $S = \{T^n : n \ge 1\}$ is a compact abelian semigroup whose kernel (least ideal) $K$ is a compact topological group [2]. Indeed it is precisely the identity $E$ of $K$ which projects $C(X)$ onto $C_p$ and annihilates $C_0$. Naturally the elements of $K$ are non-negative and leave 1 fixed, so setting $e_x(f) = Ef(x)$, $f \in C(X)$, defines $e_x \in P(X)$. Evidently $C_p = EC(X)$ is conjugate closed. With $\Sigma_E$ the support of $E$, $A = C_p|\Sigma_E = EC(X)|\Sigma_E$ and $C_p$ are isometric, so $A$ is closed in $C(\Sigma_E)$: for $|e_x(f)| \le \sup |f(\Sigma_E)|$, and, applied to $Ef$, $|e_x(f)| = |e_x(Ef)| \le \sup |Ef(\Sigma_E)|$, whence $\|Ef\| \le \|(Ef|\Sigma_E)\|$.

As a consequence Rosenblatt's argument [4] shows $A$ is a subalgebra of $C(\Sigma_E)$, viz.: $A_R$ (the set of real elements of $A$) is a sublattice of $C_R(\Sigma_E)$ since for $E\phi_i = \phi_i \in A_R$, $i = 1, 2$, $E(\phi_1 \vee \phi_2) \ge E\phi_i = \phi_i$, $i = 1, 2$, so $\psi = E(\phi_1 \vee \phi_2) - (\phi_1 \vee \phi_2) \ge 0$; and since $E\psi = 0$, $\psi$ vanishes on $\Sigma_E$. So by Stone's proof of the Stone-Weierstrass theorem, $A_R$ (and thus $A$) is an algebra. Consequently $A = C(Y)$ for some factor space $Y$ of $\Sigma_E$.

Received February 16, 1965.

Now trivially $k = kE \in K$ has support $\Sigma_k \subset \Sigma_E$, since if $f \in C(X)$ vanishes on $\Sigma_E$ then $kf = kEf = k0 = 0$. Since $C_p$ is invariant for each $k$ in $K$ (as for all elements of $\bar{S}$), $k$ actually yields a well defined operator $(f|\Sigma_E \to kf|\Sigma_E)$ on $A = C_p|\Sigma_E$. So $K$ acts as a group of operators on $A$, and evidently $k \to kf$ is strongly continuous as a map into $A$ since it coincides with $k \to kg|\Sigma_E$ for any extension $g \in C(X)$ of $f$, and $k \to kg$ is strongly continuous. Viewed as operators on $C(Y)$ then, $K$ is a group of non-negative operators leaving 1 fixed whose identity is the identity operator. So each adjoint $k^*$ maps $P(Y)$ onto itself, and thus maps extreme points onto extreme points: with $\mu_y$ the unit mass at $y$, $k^*\mu_y = \mu_{k(y)}$ for some unique $k(y)$ in $Y$. But

(1) $(k, y) \to k(y)$

is continuous since this amounts to continuity of

$$(k, y) \to f(k(y)) = k^*\mu_y(f) = kf(y), \quad \text{all } f \in C(Y),$$

and that follows from the strong continuity of $k \to kf$. Hence each element $k$ of $K$ induces a self-homeomorphism $k(\cdot)$ of $Y$, which in turn induces the action of $k$ on $C(Y)$: $kf = f \circ k(\cdot)$. Thus $K$, with the action (1), gives rise to a transformation group on $Y$ which yields the action of $K$ on $A = C_p|\Sigma_E$, and in particular that of $k_0 = TE$.

Having identified $C(Y)$ and $EC(X)|\Sigma_E$ we can of course compose $Ef|\Sigma_E$ with an element $k(\cdot)$ of our transformation group on $Y$, and thus write, without ambiguity,(1) $kf|\Sigma_E = k(f|\Sigma_E) = (f|\Sigma_E) \circ k(\cdot)$ if $f = Ef$. To obtain this action of $T$ (hence that of $TE = k_0$) on $C_p = EC(X)$ then we note that for any $x$ in $X$ and $f = Ef$, $Tf(x) = TEf(x) = k_0 f(x) = E k_0 f(x) = e_x(k_0 f|\Sigma_E)$, so

(2) $Tf(x) = e_x([f|\Sigma_E] \circ k_0(\cdot)), \quad f \in EC(X).$

Noting that the powers of $k_0 = TE$ are dense in $K = \bar{S}E$ since those of $T$ are dense in $\bar{S}$ we have proved half of the following

THEOREM. A non-negative operator $T$ with $T1 = 1$ has $S = \{T^n : n \ge 1\}$ almost periodic if and only if
(i) there is a projection $E$ in the strong operator closure of $S$,
(ii) there is a quotient space $Y$ of $\Sigma_E$ for which $EC(X)|\Sigma_E$ is precisely $C(Y)$ (as naturally imbedded in $C(\Sigma_E)$),
(iii) a compact (monothetic) transformation group $K$ acts on $Y$, with the action of $T$ on $EC(X)$ that induced by a generator $k_0$ of $K$, as in (2).

(1) The second $k$ is our operator on $A = C_p|\Sigma_E$.
"If" is easily proved by showing conditional compactness of orbits for $f$ in $(I - E)C(X)$, and then for $f$ in $EC(X)$, as follows. Suppose the net $T^{n_\delta} \to E$ strongly. For $f \in (I - E)C(X)$, $Ef = 0$, so given $\varepsilon > 0$, $\|T^n f\| \le \|T^{n_\delta} f\| < \varepsilon$ for $n \ge n_\delta$, some $\delta$, whence $\|T^n f\| \to 0$ and our conditional compactness is apparent. On the other hand, the action of $T$ on $EC(X)$ is determined on $\Sigma_E$, by (2), which also shows $EC(X)$ and $EC(X)|\Sigma_E$ are isometric. Thus our conditional compactness will follow from that of the corresponding orbit in $C(Y)$, which itself follows directly from the compactness of $K$ and the fact that $k \to f \circ k(\cdot)$ is strongly continuous for any transformation group on a compact space.

REMARKS. 1. In case $S = \{T^n : n \ge 1\}$ is weakly almost periodic (i.e., $\{T^n f : n \ge 1\}$ is conditionally weakly compact for each $f$ in $C(X)$), (i)-(iii) still hold if "strong" is replaced by "weak" in (i). Indeed $E$ is then the identity of the least ideal $K$ of the weak operator closure of $S$, which is still [2, 8.1] a compact group in the strong operator topology, so that the same proof applies. Needless to say, in this situation $C_0$ (the nullity of $E$) is not so simply described. (More generally the same proof yields (i)-(iii) (with obvious modifications) for any (weakly) almost periodic semigroup $S$ of non-negative $T$ with $T1 = 1$ for which the conclusions of [2, 4.11] hold (in particular for $S$ amenable [1]), with $C_0$ and $C_p$ the subspaces defined in [2].)

2. Note that if $X = \Sigma_E$, the natural decomposition of $Y$ into orbits lifts to a decomposition $\mathfrak{F}$ of $X$ for which $x \in F \in \mathfrak{F}$ implies the support of $t_x$ is contained in $F$. Indeed if $f \in C(Y)$, $0 \le f \le 1$, and $f = 1$ on the orbit of the image $y$ of $x$ then viewing $f$ as an element of $C_p$ we have $Tf(x) = f(k_0(y)) = 1 = t_x(f)$, so that $t_x$ is carried by $f^{-1}(1)$, hence by an arbitrary neighborhood of $F$ if $f$ is chosen appropriately, which yields the assertion. Thus we may define operators $T_F : C(F) \to C(F)$ for which $Tf|F = T_F(f|F)$, $f \in C(X)$, and so in a sense decompose $T$ into irreducible parts. As is easily seen the case in which $\mathfrak{F}$ is a singleton is precisely Rosenblatt's irreducible case, and then $Y$ and $K$ can be identified.

3. That identification is made possible by the fact that our operator group $K$, in its role as a transformation group on $Y$, always acts effectively; i.e., $k(y) = y$ for all $y$ implies $k = E$. For it clearly implies $kf = f$ for all $f$ in $C_p$, whence $kf = kEf = Ef$ for all $f$ in $C(X)$.

4. Whenever $C_p$ is finite dimensional, as in the special case in which $T$ is a compact operator (where $S$ is necessarily almost periodic), one easily obtains parallels to the results obtainable when $T$ is compact [3, §8]. Indeed if $f_1, \ldots, f_n$ are independent eigenfunctions spanning $C_p$, corresponding to eigenvalues $\lambda_1, \ldots, \lambda_n$, the fact that $K$ must be finite (since $Y$ is, and $K$ is effective) shows $TE$ is of finite order $N \le n!$ and each $\lambda_i$ is a root of unity. Moreover since $Ef$ is uniquely expressible as $\sum c_i(f) f_i$ while $f \to c_i(f)$ is continuous, we have unique complex measures $\mu_i$ with

(3) $Ef = \sum_{i=1}^{n} \mu_i(f) f_i, \quad f \in C(X),$

necessarily biorthogonal to the $f_i$. And $T^*\mu_i = \lambda_i \mu_i$: for given any $f$ in $C(X)$, $g = \sum \lambda_j^{-1} T^*\mu_j(f) f_j - Ef$ is an element of $C_p$ with $Tg = \sum T^*\mu_j(f) f_j - TEf = \sum \mu_j(Tf) f_j - TEf = ETf - TEf = 0$ since $\bar{S}$ is commutative. But $T$ acts as an invertible operator on $C_p$, so $g = 0$, $Ef = \sum \lambda_j^{-1} T^*\mu_j(f) f_j$, whence $T^*\mu_j = \lambda_j \mu_j$ by uniqueness of the $\mu_j$. Lastly, from the fact that $T^N E = (TE)^N = E$ one concludes that $T^{Nj} \to E$ strongly as $j \to \infty$; for any strong cluster point $k$ of the sequence must lie in $K$, as is easily seen, so that $T^{Nj}E = E$ implies $k = kE = E$. Since $\bar{S}$ is compact in the strong operator topology, our convergence is assured.

5. Finally, we note that invariant integration over $K$ can be used to obtain analogues of the other results of Rosenblatt [5], while the eigenfunctions $f$ spanning $C_p|\Sigma_E$ are easily obtained; each is a (common) eigenfunction (with unimodular eigenvalue) of each $k$ in $K$, and since $k \to kf$ is continuous, $kf = \chi(k) f$ for some character $\chi$ of $K$. Thus each such $f$ coincides on each orbit in $Y$ with a multiple of a fixed character $\chi_f$ of $K$, and any such function is obviously an eigenfunction in $C_p|\Sigma_E$. (Values off $\Sigma_E$ must of course be computed via (2).)

BIBLIOGRAPHY

1. M. M. Day, Amenable semigroups, Illinois J. Math. 1 (1958), 509-543.
2. K. de Leeuw and I. Glicksberg, Applications of almost periodic compactifications, Acta Math. 105 (1961), 63-97.
3. M. G. Krein and M. A. Rutman, Linear operators leaving invariant a cone in a Banach space, Uspehi Matem. Nauk (N.S.) 3 (1948), 3-95; Amer. Math. Soc. Translation 26.
4. M. Rosenblatt, Equicontinuous Markov operators, Theory of Probability and its Applications 9 (1964), 205-222.
5. M. Rosenblatt, Almost periodic transition operators..., J. Math. Mech. 13 (1964), 837-847.

UNIVERSITY OF WASHINGTON, SEATTLE, WASHINGTON
THE CONNECTION BETWEEN NORMALIZABLE AND SPECTRAL OPERATORS

BY L. TZAFRIRI*

ABSTRACT

A necessary and sufficient condition for an operator to be normalizable is given in terms of Dunford's spectral theory.

In [7], Zaanen defined the notions of symmetrizable and normalizable operators with respect to a positive, self-adjoint fixed operator $H$. His definitions become evident and more transparent by taking into account that an operator is normalizable (symmetrizable) whenever there exists a certain naturally corresponding normal (self-adjoint) operator on a certain factor space of our Hilbert space (not necessarily complete) (see [7] §12, p. 226). Using the additional assumption that $H$ has a closed range, we prove here that an operator is normalizable (symmetrizable) with respect to $H$ if and only if it is normal (self-adjoint) on a subspace of our basic Hilbert space provided with a new and equivalent norm. Relying on the above-mentioned result and using an idea of Mackey [5] and Wermer [6] we carry on by proving that an operator is normalizable if and only if its product with the orthogonal projection on the afore-mentioned subspace is a spectral operator of scalar type in Dunford's sense [1]. Later we shall draw the conclusion that the product of a positive self-adjoint invertible operator and a self-adjoint operator is a spectral operator of scalar type having a real spectrum. It should be mentioned that this result is proved with no assumption of commutativity of the operators.

1. Notation. Our notation is essentially that of Zaanen [7], [8]. Throughout the paper $\mathfrak{X}$ will denote a fixed Hilbert space; $H \ne 0$ a bounded positive self-adjoint operator defined in $\mathfrak{X}$; $\mathfrak{L}$ the null space of $H$; $\mathfrak{M} = \mathfrak{L}^{\perp}$ and $P$ the orthogonal projection on $\mathfrak{M}$. Every operator will be assumed to be bounded. For convenience, we shall summarize here some definitions and results from [7], [8].

Received May 15, 1965.
* This paper is a part of the author's Ph.D. thesis to be submitted to the Hebrew University. The author wishes to express his indebtedness to Professor S. R. Foguel for his guidance and kind encouragement.
First, we shall mention that

(1.1) $H = HP = PH.$

DEFINITION 1. Given an operator $T$, any operator $\tilde{T}$ satisfying the relation

(1.2) $(HTx, y) = (Hx, \tilde{T}y), \quad x, y \in \mathfrak{X},$

will be called an $H$-adjoint of $T$. Generally, the $H$-adjoint of an operator is not uniquely determined. In the special case that $H = I$, the $H$-adjoint is uniquely determined and equal to the ordinary adjoint $T^*$.

DEFINITION 2. An operator $T$ will be called $H$-symmetrizable if $\tilde{T} = T$, i.e.,

(1.3) $(HTx, y) = (Hx, Ty), \quad x, y \in \mathfrak{X}.$

Obviously, $T$ is $H$-symmetrizable whenever $HT$ is self-adjoint.

DEFINITION 3. An operator $T$ will be called $H$-normalizable if:

(1.4) $H\tilde{T}T = HT\tilde{T}.$
The concept of spectral operator as used here is that which was developed by Dunford in [1], [3].

2. Spectral properties. Let us begin with a necessary and sufficient condition for the existence of the $H$-adjoint.

LEMMA 4. Assume $H$ a positive operator with a closed range and let $H_0 = H|\mathfrak{M}$ be the restriction of $H$ to the subspace $\mathfrak{M}$; then $H_0$ is invertible.

Proof. Using (1.1) it follows that $H\mathfrak{X} = H\mathfrak{M} \subseteq \mathfrak{M}$ and therefore $H_0$ is a one-to-one, positive, self-adjoint operator. Hence, if $0 \in \sigma(H_0)$ then 0 is in the continuous spectrum of $H_0$, i.e. $H_0\mathfrak{M} = H\mathfrak{M} = H\mathfrak{X}$ is dense in $\mathfrak{M}$. But, since $H_0$ has a closed range it is invertible.

THEOREM 5. Let $H$ be a positive, self-adjoint operator with a closed range. An operator $T$ will have an $H$-adjoint if and only if

(2.1) $PT = PTP$

or, equivalently, $\mathfrak{L}$ and $\mathfrak{M}$ are invariant under $PT$.

Proof. If $T$ has an $H$-adjoint $\tilde{T}$, then by (1.1)

$$(PTx, Hy) = (HPTx, y) = (HTx, y) = (Hx, \tilde{T}y) = 0$$

for every $x \in \mathfrak{L}$, $y \in \mathfrak{X}$, i.e. $PT\mathfrak{L} \perp H\mathfrak{X}$, thus $PT\mathfrak{L} \perp \mathfrak{M}$, hence $PT\mathfrak{L} \subseteq \mathfrak{L}$. On the other hand it is clear that $PT\mathfrak{M} \subseteq \mathfrak{M}$, i.e. $\mathfrak{L}$ and $\mathfrak{M}$ are subspaces invariant under $PT$ and, therefore, $PT = P(PT) = (PT)P = PTP$. It should be mentioned that in this part of the proof, the fact that $H$ has a closed range is not used. Conversely, if (2.1) holds then

$$(HTx, y) = (HPTx, y) = (HPTPx, y) = (x, PT^*PHy) = (x, H_0 H_0^{-1} PT^*Hy) = (x, H H_0^{-1} PT^*Hy) = (Hx, H_0^{-1} PT^*Hy), \quad x, y \in \mathfrak{X},$$

and, therefore, we will be able to put $\tilde{T} = H_0^{-1} PT^*H$.

In [7], Zaanen elucidates Definitions 2 and 3 through the factor space $\mathfrak{X}/\mathfrak{L}$. Given $[x], [y] \in \mathfrak{X}/\mathfrak{L}$, a new inner product can be defined on $\mathfrak{X}/\mathfrak{L}$ by putting $\langle [x], [y] \rangle = (Hx, y)$, but with this new norm $\mathfrak{X}/\mathfrak{L}$ does not have to be complete. If $T$ has an $H$-adjoint it is easy to see that $Hx = 0$ implies $HTx = 0$ and hence we can define without ambiguity the operator $[T]$ acting in $\mathfrak{X}/\mathfrak{L}$ by putting:

$$[T][x] = [Tx], \quad [x] \in \mathfrak{X}/\mathfrak{L}.$$
Zaanen proved that a compact operator T is H-normalizable (symmetrizable) if and only if [T] is a bounded normal (self-adjoint) operator on $\mathfrak{X}/\mathcal{L}$, provided with a new norm. Relying on the afore-mentioned consideration and the additional assumption that H has a closed range we give here this parallel construction; by Lemma 4, $H\mathfrak{X} = H\mathcal{M} = \mathcal{M}$, i.e. $\mathcal{M}$ is an invariant subspace of H. Define a new inner product on $\mathcal{M}$ as follows:
$\langle x, y \rangle = (Hx, y) = (H^{1/2}x, H^{1/2}y)$, $x, y \in \mathcal{M}$
Denoting the initial norm on $\mathcal{M}$ by $\|\cdot\|$ and the new norm by $|||\cdot|||$, we shall have
$|||x||| = \|H^{1/2}x\| \le \|H^{1/2}\| \cdot \|x\|$
but, according to Lemma 4, $0 \notin \sigma(H_0)$ and hence $0 \notin \sigma(H_0^{1/2})$, i.e.
$\|x\| = \|H_0^{-1/2}H_0^{1/2}x\| \le \|H_0^{-1/2}\| \cdot |||x|||$, $x \in \mathcal{M}$
and it will follow that these two norms on $\mathcal{M}$ are equivalent and $\mathcal{M}$ with the new norm is a complete space. By using Theorem 5 we have:
$\langle PTx, y \rangle = (HPTx, y) = (HTx, y) = (Hx, \tilde{T}y) = (x, HP\tilde{T}y) = (Hx, P\tilde{T}y) = \langle x, P\tilde{T}y \rangle$, $x, y \in \mathcal{M}$
for every operator T satisfying (2.1). Hence
(2.2) $(PT)^+x = P\tilde{T}x$, $x \in \mathcal{M}$
where $(PT)^+$ is the adjoint of PT within the space $\mathcal{M}$, with the new norm.

THEOREM 6. An operator T is H-normalizable if and only if it satisfies (2.1) and PT is a normal operator on the Hilbert space $\{\mathcal{M}, |||\cdot|||\}$.
Proof. If T is H-normalizable, then it has an H-adjoint and according to Theorem 5, (2.1) is satisfied. In addition
$(H\tilde{T}Tx, y) = (H\tilde{T}Tx, Py) = (HTx, TPy) = (HTx, PTPy) = (Hx, \tilde{T}PTPy) = (HPx, (P\tilde{T})(PT)Py) = \langle Px, (P\tilde{T})(PT)Py \rangle$, $x, y \in \mathfrak{X}$
and similarly
$(HT\tilde{T}x, y) = \langle Px, (PT)(P\tilde{T})Py \rangle$, $x, y \in \mathfrak{X}$
therefore $H\tilde{T}T = HT\tilde{T}$ whenever $(P\tilde{T})(PT)P = (PT)(P\tilde{T})P$ or, using (2.2), if and only if PT is a normal operator on $\{\mathcal{M}, |||\cdot|||\}$.

For H-symmetrizable operators we have a similar theorem.

THEOREM 7. An operator T is H-symmetrizable if and only if it satisfies (2.1) and PT is a self-adjoint operator on the space $\{\mathcal{M}, |||\cdot|||\}$.
Proof. We have
$(HTx, y) = (HPTPx, y) = \langle PTPx, Py \rangle$, $x, y \in \mathfrak{X}$
and similarly
$(Hx, Ty) = (H\tilde{T}x, y) = \langle P\tilde{T}Px, Py \rangle$, $x, y \in \mathfrak{X}$
Hence $PTP = P\tilde{T}P$ if and only if T is H-symmetrizable and the proof may be finished by (2.2).

The next two theorems elucidate the connection between normalizable or symmetrizable operators and spectral operators.

THEOREM 8. If T is an H-normalizable operator, then PT is a spectral operator of scalar type. Moreover, if T is H-symmetrizable then the spectrum of PT is real.

Proof. According to Theorem 6, PT is a normal operator on the space $\{\mathcal{M}, |||\cdot|||\}$, in which the new norm $|||\cdot|||$ is equivalent to the initial one $\|\cdot\|$ and, therefore, $PT/\mathcal{M}$ is a spectral operator of scalar type on $\{\mathcal{M}, \|\cdot\|\}$. If T is H-symmetrizable then by Theorem 7, PT will be self-adjoint on $\{\mathcal{M}, |||\cdot|||\}$ and, therefore, scalar with real spectrum on $\{\mathcal{M}, \|\cdot\|\}$. According to Theorem 5, $PT\mathcal{L} = PTP\mathcal{L} = \{0\}$ and we can conclude, using Dunford and Schwartz [3], Theorem XVI-5-3, that PT is a spectral operator of scalar type on the whole space $\mathfrak{X}$. If T is H-symmetrizable, naturally, $\sigma(PT)$ will be a real set.
In order to prove the converse of the previous theorem we shall use a result of Mackey [5].

THEOREM 9. Let P be a self-adjoint projection in $\mathfrak{X}$ and T an operator such that PT is spectral of scalar type and (2.1) is satisfied. Then, there exists a positive, self-adjoint operator H, having a closed range and such that T is an H-normalizable operator satisfying $H\mathfrak{X} = P\mathfrak{X}$ and
(2.3) $H = PH = HP$.
Moreover, if $\sigma(PT)$ is a real set, then T is H-symmetrizable.

Proof. First, let us remark that the restriction of PT to $\mathcal{M} = P\mathfrak{X}$ is a spectral operator of scalar type on $\mathcal{M}$ and $\sigma(PT/\mathcal{M})$ is real whenever $\sigma(PT)$ is real. By Mackey [5], Theorem 55 (see also Wermer [6]) we can define a new norm $|||\cdot|||$ on $\mathcal{M}$ (associated with a new inner product $\langle \cdot, \cdot \rangle$) which is equivalent to the initial one and such that $PT/\mathcal{M}$ is a normal operator on $\{\mathcal{M}, |||\cdot|||\}$. If $\sigma(PT/\mathcal{M})$ is real then $PT/\mathcal{M}$ is self-adjoint on $\{\mathcal{M}, |||\cdot|||\}$. Using [2], Lemma X.2.2, we can suppose that the new inner product is given by
$\langle x, y \rangle = (Bx, y)$, $x, y \in \mathcal{M}$
where B is a positive, self-adjoint operator on $\{\mathcal{M}, \|\cdot\|\}$, having a bounded everywhere defined inverse. Hence, $H = BP$ is a positive, self-adjoint operator on $\mathfrak{X}$ satisfying (2.3) and $H\mathfrak{X} = P\mathfrak{X} = \mathcal{M}$ since
$(Hx, x) = (BPx, x) = (PBPx, x) = (BPx, Px) = |||Px|||^2 \ge 0$
Now, let $\{x_n\}$, $x_n \in \mathfrak{X}$; n = 1, 2, ... be such that
(2.4) $\lim_{n \to \infty} Hx_n = y$
Then $Py = y$, i.e. $y \in \mathcal{M}$. Applying $B^{-1}$ on (2.4) we shall get
$\lim_{n \to \infty} Px_n = B^{-1}y$
and therefore
$\lim_{n \to \infty} Hx_n = HB^{-1}y$
and, consequently, $y = HB^{-1}y$, i.e. H has a closed range. Taking into account (2.3) we can remark that P is just the orthogonal projection on the orthogonal complement of the null space of H; hence by Theorems 6 and 7, T is H-normalizable and if $\sigma(PT)$ is real then it is H-symmetrizable.

COROLLARY 10. Let A and B be self-adjoint operators; if A is positive and has an inverse (defined everywhere), then the product AB is a spectral operator of scalar type, with real spectrum.
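The statement of Corollary 10 can be spot-checked in finite dimensions. The sketch below is only an illustration, not part of the paper: it assumes real 2 × 2 matrices, builds A as a symmetric positive definite matrix and B as a symmetric matrix (the helper names are hypothetical), and tests whether the characteristic polynomial of AB has a non-negative discriminant, which is equivalent to AB having real eigenvalues.

```python
import random

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eigen_discriminant(m):
    """Discriminant of the characteristic polynomial of a 2x2 matrix.
    Both eigenvalues are real exactly when this value is >= 0."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return trace * trace - 4.0 * det

def random_case(rng):
    """Draw A = C^T C + I (symmetric positive definite) and a symmetric B."""
    c = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
    ct = [[c[j][i] for j in range(2)] for i in range(2)]
    a = matmul2(ct, c)
    a[0][0] += 1.0
    a[1][1] += 1.0
    off = rng.uniform(-1, 1)
    b = [[rng.uniform(-1, 1), off], [off, rng.uniform(-1, 1)]]
    return a, b

rng = random.Random(0)
worst = min(eigen_discriminant(matmul2(*random_case(rng))) for _ in range(1000))
print("smallest discriminant over 1000 samples:", worst)
```

A random test of this kind cannot prove the corollary; it only fails to find a counterexample among the sampled matrices.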
Proof. Denote S = AB; it follows that $B = A^{-1}S$ is self-adjoint and from the observation made after Definition 2 it follows that S is $A^{-1}$-symmetrizable ($A^{-1}$ is positive, self-adjoint and invertible too). Therefore, the orthogonal projection on the orthogonal complement of the null space of $A^{-1}$ coincides with the identity I. By Theorem 8, S is a spectral operator of scalar type and $\sigma(S)$ is real.

Every normal operator N can be decomposed as follows:
$N = N_1 + iN_2$
where $N_1 = (N + N^*)/2$ and $N_2 = (N - N^*)/2i$ are commuting self-adjoint operators. A similar decomposition exists for H-normalizable operators too.

THEOREM 11. Let T be H-normalizable. Then, there exist two H-symmetrizable operators $T_1$ and $T_2$ satisfying:
(a) $T = T_1 + iT_2$
(b) $PT_1T_2 = PT_2T_1$
(c) If $S_1$, $S_2$ are H-symmetrizable operators such that $T = S_1 + iS_2$ and $PS_1S_2 = PS_2S_1$ then $PS_i = PT_i$, i = 1, 2.

Proof. We can put
$T_1 = \dfrac{T + \tilde{T}}{2}$; $T_2 = \dfrac{T - \tilde{T}}{2i}$
where $\tilde{T}$ is the H-adjoint of T. Obviously, $T_1$ and $T_2$ are H-symmetrizable and $HT_1T_2 = HT_2T_1$. Using (1.1) we get $H(PT_1T_2 - PT_2T_1) = 0$, i.e. $(PT_1T_2 - PT_2T_1)x \in \mathcal{L} \cap \mathcal{M} = \{0\}$ for every x and, hence, (b) is satisfied. If the hypotheses of (c) hold for some $S_1$ and $S_2$ then by Theorem 8 the operators $PT_1$, $PT_2$, $PS_1$ and $PS_2$ will be spectral of scalar type and their spectra will be real. But
$PT = PT_1 + iPT_2 = PS_1 + iPS_2$
and using Theorem 5
$(PT_1)(PT_2) = (PT_1P)T_2 = PT_1T_2 = PT_2T_1 = (PT_2)(PT_1)$
and similarly $(PS_1)(PS_2) = (PS_2)(PS_1)$. According to Foguel [4] (Theorem 1, p. 59) we get $PS_i = PT_i$, i = 1, 2.

REFERENCES
1. N. Dunford, Spectral operators, Pacific J. Math. 4 (1954), 321-354.
2. N. Dunford and J. Schwartz, Linear operators, Part II, Interscience Publishers, New York, 1963.
3. N. Dunford and J. Schwartz, Linear operators, Part III (to be published by Interscience Publishers).
4. S. R. Foguel, The relations between a spectral operator and its scalar part, Pacific J. Math. 8 (1958), 51-65.
5. G. Mackey, Commutative Banach algebras, Notas de Matematica, No. 4, Rio de Janeiro.
6. J. Wermer, Commuting spectral measures on Hilbert space, Pacific J. Math. 4 (1954), 355-361.
7. A. C. Zaanen, Normalisable transformations in Hilbert space and systems of linear integral equations, Acta Math. 83 (1950), 197-248.
8. A. C. Zaanen, Linear analysis, P. Noordhoff, Groningen and Interscience Publishers, New York, 1953.

THE HEBREW UNIVERSITY OF JERUSALEM
CAYLEY'S DECOMPOSITION AND POLYA'S W-PROPERTY OF ORDINARY LINEAR DIFFERENTIAL EQUATIONS(1)
BY MISHAEL ZEDEK

ABSTRACT
A proof is given for the equivalence of Pólya's W-property of a linear differential equation $L_n(D)y = 0$ to the possibility of decomposing $L_n(D) \equiv \prod_{i=n}^{1}[D + \lambda_i(x)]$ in a given interval. In this case a set of n independent solutions forms a Chebyshev system in the interval. An application determines intervals of non-oscillation for solutions of linear equations of the second order.

1. Introduction. In 1886 Cayley [1] considered the problem of decomposing a linear differential operator
(1) $L_n(D) \equiv D^n + \gamma_1(x)D^{n-1} + \dots + \gamma_n(x)$; $D \equiv d/dx$
into a product of linear factors
(2) $L_n(D) = [D + \lambda_n(x)][D + \lambda_{n-1}(x)] \cdots [D + \lambda_1(x)]$.
By carrying out the multiplications and differentiations in (2) and comparing coefficients with (1) he arrived at a set of differential equations for the functions $\lambda_k(x)$, k = 1, ..., n in terms of the coefficients $\gamma_k(x)$, k = 1, ..., n. He also indicated methods for finding local solutions. The example $D^2 + 1 = [D - \tan(x + \alpha)][D + \tan(x + \alpha)]$ illustrates however the fact that a decomposition (2) does not necessarily exist in an arbitrary given interval (e.g. one whose length exceeds $\pi$) and is not unique when it does exist there.

In 1922 Pólya [4] defined a W-property of an ordinary linear differential equation $L_n(D)y = 0$ in a closed interval [a, b] as the property of possessing n independent solutions $h_1(x), \dots, h_n(x)$ such that the Wronskians
$W_k = W(h_1, \dots, h_k) = \det |h_j^{(i-1)}(x)|_{i,j = 1, \dots, k}$
are all positive in [a, b] for k = 1, ..., n. Pólya has shown that the W-property of $L_n(D)y = 0$ is equivalent to the possibility of writing $L_n(D)$ as a product in the following way:
$L_n(D) \equiv (W_n/W_{n-1})\,D\,(W_{n-1}^2/W_{n-2}W_n)\,D \cdots D\,(1/W_1)$.
It is known (compare Ince [3, p. 121]) that this form is equivalent to the decomposition (2). In §2 we show directly that the validity of a decomposition (2) in an interval [a, b] with $\lambda_k(x) \in C^{(n-k)}[a, b]$ is equivalent to Pólya's W-property of $L_n(D)y = 0$ in that interval. In §3 we show that if $L_n(D)y = 0$ has a decomposition (2) in an interval then every set of n independent solutions forms a Chebyshev (or unisolvent) system in that interval. This means that every non-trivial solution has at most n - 1 distinct zeros in that interval. Applications to Approximation Theory will appear elsewhere. In §4 we apply the above results to the finding of intervals of non-oscillation for the solutions of linear equations of the second order.

Received June 6, 1965.
(1) Research supported by the National Science Foundation Grant No. Gpo3897 with the University of Maryland.
2. The equivalence between Cayley's decomposition and Pólya's W-property. Before we state our main result we prove a

LEMMA. Suppose $L_k(D) = \prod_{i=k}^{1}[D + \lambda_i]$. Then
(3) $(D + \lambda_k + \dots + \lambda_1)W(h_1, \dots, h_k) = \begin{vmatrix} h_1 & \cdots & h_k \\ \vdots & & \vdots \\ h_1^{(k-2)} & \cdots & h_k^{(k-2)} \\ L_k(D)h_1 & \cdots & L_k(D)h_k \end{vmatrix}$
assuming that the functions $\lambda_i$, $h_i$ are sufficiently differentiable to make the above expressions meaningful.

Proof. In the last row of the determinant $L_k(D)h_i$ can be replaced by $[D^k + (\lambda_k + \dots + \lambda_1)D^{k-1}]h_i$ since the omitted terms form a linear combination of the other rows. At the same time it is easily seen that the left hand side is obtained by operating with $(D + \lambda_k + \dots + \lambda_1)$ on the last row only of the Wronskian $W(h_1, \dots, h_k)$.

THEOREM 1. Let $L_n(D) \equiv D^n + \gamma_1(x)D^{n-1} + \dots + \gamma_n(x)$ be a differential operator with coefficients $\gamma_k(x)$, k = 1, ..., n, continuous in an interval [a, b]. Then a necessary and sufficient condition that $L_n(D)$ admit in [a, b] a Cayley decomposition (2) with $\lambda_k \in C^{(n-k)}[a, b]$, k = 1, ..., n, is that the equation $L_n(D)y = 0$ should have in [a, b] Pólya's W-property, i.e. there should exist n independent solutions $h_1, \dots, h_n$ such that
(4) $W[h_1(x), \dots, h_k(x)] > 0$, $x \in [a, b]$, k = 1, ..., n.
Proof. $h_1(x) = W(h_1) = \exp[-\int_a^x \lambda_1]$ is a solution of $[D + \lambda_1(x)]y = 0$ and is also simultaneously a solution of $\prod_{i=k}^{1}[D + \lambda_i(x)]y = 0$ for k = 1, ..., n. Let $h_2$ be a solution of $[D + \lambda_2(x)][D + \lambda_1(x)]y \equiv [D^2 + (\lambda_2 + \lambda_1)D + \lambda_1' + \lambda_1\lambda_2]y = 0$ linearly independent of $h_1$. By Abel's identity (see [3, p. 119]) $W(h_1, h_2) = c \cdot \exp[-\int_a^x (\lambda_2 + \lambda_1)]$, $c \ne 0$. By suitably normalizing $h_2$ we may assume c = 1. We continue in this manner until we add to the solutions $h_1, \dots, h_{n-1}$ of $\prod_{i=n-1}^{1}[D + \lambda_i]y = 0$ a solution $h_n$ of $\prod_{i=n}^{1}[D + \lambda_i]y = 0$ independent of the previous ones and normalized so that $W(h_1, \dots, h_n) = \exp[-\int_a^x (\lambda_n + \dots + \lambda_1)]$. Assuming now that $L_n(D)$ has the decomposition (2), $h_1, \dots, h_n$ are solutions of $L_n(D)y = 0$ verifying property W. This proves the necessity of the condition in the theorem.

We shall now prove that the condition is sufficient by constructing a system of $\lambda_k$'s from a set of solutions $h_1, \dots, h_n$ of the equation
(5) $[D^n + \gamma_1(x)D^{n-1} + \dots + \gamma_n(x)]y = 0$
satisfying (4). Indeed, define recursively
(6) $\lambda_1 = -\dfrac{D[W(h_1)]}{W(h_1)}$; $\lambda_k = -\dfrac{D[W(h_1, \dots, h_k)]}{W(h_1, \dots, h_k)} - \sum_{i=1}^{k-1}\lambda_i$, k = 2, ..., n.
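As a worked illustration of the recursion (6), one may take the equation $y'' + y = 0$ on an interval $[a, b] \subset (-\pi/2, \pi/2)$ with the solutions $h_1 = \cos x$, $h_2 = \sin x$. This particular choice of interval and solutions is an assumption made here for illustration; it recovers the example of the Introduction with $\alpha = 0$.

```latex
% Worked check of (6) for y'' + y = 0 with h_1 = \cos x, h_2 = \sin x
% on [a,b] \subset (-\pi/2, \pi/2), where both Wronskians are positive.
\begin{align*}
W(h_1) &= \cos x > 0, \qquad
W(h_1,h_2) = \cos x\cdot\cos x - \sin x\cdot(-\sin x) = 1 > 0,\\
\lambda_1 &= -\frac{D[W(h_1)]}{W(h_1)} = \frac{\sin x}{\cos x} = \tan x,\\
\lambda_2 &= -\frac{D[W(h_1,h_2)]}{W(h_1,h_2)} - \lambda_1 = 0 - \tan x = -\tan x,\\
[D+\lambda_2][D+\lambda_1]y
  &= y'' + (\lambda_1+\lambda_2)\,y' + (\lambda_1' + \lambda_1\lambda_2)\,y
   = y'' + (\sec^2 x - \tan^2 x)\,y = y'' + y .
\end{align*}
```

The decomposition breaks down as soon as the interval reaches a zero of $\cos x$, in line with the remark that no decomposition need exist on an interval of length exceeding $\pi$.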
Denote $L_k(D) \equiv \prod_{i=k}^{1}[D + \lambda_i]$, k = 1, ..., n. We shall show that $h_1, \dots, h_i$ are solutions of $L_i(D)y = 0$ for i = 1, ..., n. The proof is by induction. The statement is true for i = 1 by the first part of (6). Suppose it is true for i = k - 1 and let us prove that it is also valid for i = k. By the assumption $L_k(D)h_j = [D + \lambda_k]L_{k-1}(D)h_j = 0$ for j = 1, ..., k - 1 and thus the right hand side of (3) in the lemma is equal to $L_k(D)h_k \cdot W(h_1, \dots, h_{k-1})$. The left hand side of (3) vanishes by (6) and since the Wronskian is positive, $L_k(D)h_k = 0$. Hence $h_1, \dots, h_n$ are independent solutions of the equation $L_n(D)y \equiv [\prod_{i=n}^{1}(D + \lambda_i)]y = 0$, which proves its identity with equation (5).

3. The number of zeros of a solution. Theorems 2 and 3 are generalizations of Rolle's theorem.

THEOREM 2. Let f(x) be differentiable in the compact interval [a, b] and suppose f(a) = f(b) = 0. Then for every continuous function $\lambda(x)$ in [a, b] there exists a number c, a < c < b, such that
(7) $[D + \lambda(c)]f(c) = f'(c) + \lambda(c)f(c) = 0$.
Proof. Apply Rolle's theorem to the function $G(x) = f(x)\exp[\int_a^x \lambda(x)\,dx]$. Clearly G(a) = G(b) = 0 and $G'(x) = [f'(x) + \lambda(x)f(x)]\exp[\int_a^x \lambda(x)\,dx]$. (7) follows from G'(c) = 0.

In view of Theorem 1, the next two theorems are essentially the same as Theorems I and II of Pólya [4]. The short proofs given here are made possible by using the decomposition (2) as point of departure.

THEOREM 3. Let f(x) be n times differentiable in the interval [a, b] and have there n + 1 distinct zeros and let $\lambda_k(x)$ belong to $C^{(n-k)}[a, b]$, k = 1, ..., n. Then there exists a number c, a < c < b, such that
(8) $f_n(c) = [D + \lambda_n(x)][D + \lambda_{n-1}(x)] \cdots [D + \lambda_1(x)]f(x)\,|_{x=c} = 0$.
Proof. Let us define $f_k(x)$ for k = 0, 1, ..., n as follows: $f_0(x) = f(x)$; $f_k(x) = [D + \lambda_k(x)]f_{k-1}(x)$ for k = 1, ..., n. An inductive argument, which makes use of Theorem 2, shows that $f_k(x)$ has at least n - k + 1 zeros, one lying between each pair of adjacent zeros among the n - k + 2 zeros of $f_{k-1}(x)$ (k = 1, ..., n). For k = n we obtain (8).

THEOREM 4. Let $y = f(x) \not\equiv 0$ be a function defined on the compact interval [a, b] and satisfy there the linear differential equation
(9) $L_n(D)y \equiv [D + \lambda_n(x)][D + \lambda_{n-1}(x)] \cdots [D + \lambda_1(x)]y = 0$
where n is a positive integer and $\lambda_k(x) \in C^{(n-k)}[a, b]$, k = 1, ..., n. Then f(x) has at most n - 1 distinct zeros in [a, b]. The conclusion remains true if the interval [a, b] is replaced by a finite or infinite open or semi-open interval.

Proof. The proof is by induction. For n = 1, equation (9) is reduced to $[D + \lambda_1(x)]y = 0$ whose general non-trivial solution is given by $y = c\exp[-\int_a^x \lambda_1(x)\,dx]$, $c \ne 0$, and has no zeros in [a, b] as claimed. Now suppose our theorem to be true for k = n - 1. We shall prove that it is also valid for k = n. Indeed, if $y = f(x) \not\equiv 0$ is a solution of equation (9), then the function $f_{n-1}(x) \equiv [D + \lambda_{n-1}(x)] \cdots [D + \lambda_1(x)]f(x)$ is a solution of the equation
(10) $[D + \lambda_n(x)]y(x) = 0$.
Two cases are possible. If $f_{n-1}(x)$ is a non-trivial solution of (10), then by the case n = 1 discussed above, it has no zero in [a, b] and therefore, by Theorem 3, f(x) can have at most n - 1 zeros there. If on the other hand $f_{n-1}(x) \equiv 0$, then f(x) is a solution of an equation of order n - 1, $[D + \lambda_{n-1}(x)] \cdots [D + \lambda_1(x)]f(x) = 0$, and by the assumption of induction f(x) has at most n - 2 zeros in [a, b].

COROLLARY. If $h_1(x), \dots, h_n(x)$ are linearly independent solutions of equation (9), then they form a Chebyshev system in [a, b], i.e. every linear combination $\sum_{k=1}^{n} a_kh_k(x)$ with $\sum_{k=1}^{n} a_k^2 > 0$ has at most n - 1 distinct zeros in [a, b].

4. Intervals of non-oscillation of solutions. It follows from Theorem 4 that if a decomposition
(11) $L_2(D) \equiv D^2 + A(x)D + B(x) \equiv [D + \lambda_2(x)][D + \lambda_1(x)]$
is valid in an interval [a, b] then every non-trivial solution of $L_2(D)y = 0$ is non-oscillatory in [a, b], i.e. has at most one zero in that interval. We assume here $A, B, \lambda_2 \in C[a, b]$ and $\lambda_1 \in C^{(1)}[a, b]$. From (11) we obtain
(12) $\lambda_1(x) + \lambda_2(x) = A(x)$ and $\lambda_1(x)\lambda_2(x) + \lambda_1'(x) = B(x)$.
The simultaneous solvability of (12) is equivalent to the existence in [a, b] of a solution of the Riccati equation (compare Ince [3, p. 24])
$\lambda_1'(x) = \lambda_1(x)^2 - A(x)\lambda_1(x) + B(x)$.
We can now apply the standard Cauchy-Lipschitz existence theorem (see [2, p. 3]) to find intervals of non-oscillation:

THEOREM 5. Let A(x) and B(x) be continuous in the interval [a, b] and let
$M = M(y_0, h) = \max[\,|y^2 - A(x)y + B(x)|,\ x \in [a, b],\ |y - y_0| \le h\,]$.
Then every non-trivial solution of $[D^2 + A(x)D + B(x)]y = 0$ has at most one zero in the interval [a, c], where $c - a = \min[b - a, h/M]$.
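The length $c - a$ in Theorem 5 is straightforward to estimate numerically. The following is a minimal sketch under stated assumptions: the function name, the grid resolution and the test equation $y'' + y = 0$ (A = 0, B = 1, $y_0$ = 0, h = 1) are all illustrative choices, and M is approximated by sampling on a grid rather than computed exactly.

```python
def nonoscillation_length(A, B, a, b, y0, h, samples=401):
    """Estimate c - a = min(b - a, h / M) from Theorem 5.

    M is approximated by sampling |y^2 - A(x) y + B(x)| on a uniform grid
    over [a, b] x [y0 - h, y0 + h]; the result is only as good as the grid.
    Assumes the sampled maximum M is nonzero.
    """
    M = 0.0
    for i in range(samples):
        x = a + (b - a) * i / (samples - 1)
        for j in range(samples):
            y = y0 - h + 2.0 * h * j / (samples - 1)
            M = max(M, abs(y * y - A(x) * y + B(x)))
    return min(b - a, h / M)

# y'' + y = 0 on [0, 3]: here M = max |y^2 + 1| over |y| <= 1, i.e. M = 2.
length = nonoscillation_length(lambda x: 0.0, lambda x: 1.0,
                               a=0.0, b=3.0, y0=0.0, h=1.0)
print(length)
```

Because sampling can only underestimate the true maximum M, the returned value should be read as an estimate of the length in the theorem, not as a guaranteed bound.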
REFERENCES
1. Arthur Cayley, On linear differential equations (the theory of decomposition), Quarterly J. of Pure and Appl. Math. 21 (1886), 331-335.
2. Lamberto Cesari, Asymptotic behavior and stability problems in ordinary differential equations, Springer-Verlag, Berlin, 1959.
3. E. L. Ince, Ordinary differential equations, Longmans, Green and Co., London, 1927.
4. Georg Pólya, On the mean-value theorem corresponding to a given linear homogeneous differential equation, Trans. Amer. Math. Soc. 24 (1922), 312-324.

UNIVERSITY OF MARYLAND, MARYLAND
A SUMMATION FORMULA FOR APPELL'S FUNCTION F2
BY R. C. BHATT

ABSTRACT
The exact solution of a number of problems in quantum mechanics has been given in terms of Appell's function $F_2$; in an extension of this work I have given here a summation formula, which is as follows:
$\sum_{n=0}^{m} F_2(a, -n, -n; 1, 1; x, y) = \dfrac{m+1}{a-1}(x - y)^{-1}[F_2(a - 1, -m, -m - 1; 1, 1; x, y) - \rightleftharpoons]$,
where $\rightleftharpoons$ shows the presence of a similar term with x and y interchanged.

The exact solution of a number of problems in quantum mechanics has been given in terms of Appell's function $F_2$; in an extension of this work, I give here a summation formula, which may prove to be useful. We take for $F_2$ the definition
(1) $F_2(a, b, b'; c, c'; x, y) = \dfrac{1}{\Gamma(a)}\int_0^\infty e^{-t}t^{a-1}\,{}_1F_1(b; c; xt)\,{}_1F_1(b'; c'; yt)\,dt$, $|x| + |y| < 1$ and $\mathrm{Re}(a) > 0$.
A special case of (1) is
(2) $F_2(a, -n, -n; 1, 1; x, y) = \dfrac{1}{\Gamma(a)}\int_0^\infty e^{-t}t^{a-1}L_n(xt)L_n(yt)\,dt$,
where $L_n(x) = {}_1F_1(-n; 1; x)$. Therefore
$\sum_{n=0}^{m} F_2(a, -n, -n; 1, 1; x, y) = \dfrac{1}{\Gamma(a)}\sum_{n=0}^{m}\int_0^\infty e^{-t}t^{a-1}L_n(xt)L_n(yt)\,dt$.

Received April 29, 1965.

The change in order of summation and integration is easily justified and we get the right-hand side as
$\dfrac{1}{\Gamma(a)}\int_0^\infty e^{-t}t^{a-1}\sum_{n=0}^{m}L_n(xt)L_n(yt)\,dt$.
Using now the relation [1, p. 214],
(3) $\sum_{k=0}^{n} L_k(x)L_k(y) = (n + 1)(x - y)^{-1}[L_{n+1}(y)L_n(x) - L_{n+1}(x)L_n(y)]$,
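Relation (3) is easy to confirm numerically before it is used. The sketch below is an illustration only: the function names are hypothetical, and the simple Laguerre polynomials are generated from the standard three-term recurrence rather than from the ${}_1F_1$ definition given above.

```python
def laguerre(n, x):
    """Simple Laguerre polynomial L_n(x) via the three-term recurrence
    (k + 1) L_{k+1}(x) = (2k + 1 - x) L_k(x) - k L_{k-1}(x)."""
    prev, cur = 1.0, 1.0 - x            # L_0 and L_1
    if n == 0:
        return prev
    for k in range(1, n):
        prev, cur = cur, ((2 * k + 1 - x) * cur - k * prev) / (k + 1)
    return cur

def lhs(n, x, y):
    """Left-hand side of (3): the finite sum of products."""
    return sum(laguerre(k, x) * laguerre(k, y) for k in range(n + 1))

def rhs(n, x, y):
    """Right-hand side of (3); requires x != y."""
    return (n + 1) * (laguerre(n + 1, y) * laguerre(n, x)
                      - laguerre(n + 1, x) * laguerre(n, y)) / (x - y)

for n in range(6):
    print(n, lhs(n, 0.3, 1.7), rhs(n, 0.3, 1.7))
```

The two columns printed for each n should agree up to rounding error.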
the right-hand side becomes
$(m + 1)(x - y)^{-1}\dfrac{1}{\Gamma(a)}\int_0^\infty e^{-t}t^{a-2}[L_{m+1}(yt)L_m(xt) - L_{m+1}(xt)L_m(yt)]\,dt$, $\mathrm{Re}(a) > 1$.
Now separating the right-hand side as the difference of two integrals, then by virtue of (1) and $\Gamma(a) = (a - 1)\Gamma(a - 1)$, we get
(4) $\sum_{n=0}^{m} F_2(a, -n, -n; 1, 1; x, y) = \dfrac{m+1}{a-1}(x - y)^{-1}[F_2(a - 1, -m, -m - 1; 1, 1; x, y) - \rightleftharpoons]$,
where $\rightleftharpoons$ shows the presence of a similar term with x and y interchanged. This is a summation formula.

REFERENCES
1. Earl D. Rainville, Special functions, New York, 1960.

UNIVERSITY OF JODHPUR, JODHPUR (INDIA).
SECONDARY FLOW ABOUT A MAGNETIZED SPHERE ROTATING IN VISCOUS CONDUCTING FLUID
BY SUNIL DATTA

ABSTRACT
The problem of secondary motion induced by the steady rotation of a magnetized sphere in an infinite incompressible viscous conducting fluid is considered. It is found that the secondary flow adds nothing to the couple required to maintain the motion and the effect of the magnetic field is to damp the secondary velocity field.

Introduction. The steady rotation of a sphere, magnetized along the axis of rotation, in an infinite incompressible viscous conducting fluid was considered by the author [1] under the assumption that the fluid moves in concentric circles whose centres lie on the axis of rotation. Actually since the centrifugal force is greatest in the neighbourhood of the equator of the sphere, the fluid particles will recede from the sphere at the equator and approach it again at the poles. Thus combined with the motion about the axis of rotation, there will be a circulatory motion in planes containing the axis of rotation. This secondary flow for the case of an incompressible viscous fluid has been investigated by several authors [2-4]. In the present paper the analysis has been extended to the case of a magnetized sphere.

Basic Equations. With the usual notation the basic equations of magnetohydrodynamics in the non-dimensional form are
(1 a,b,c,d) $\mathrm{curl}\,\vec{E} = 0$, $\vec{J} = R_m\,\mathrm{curl}\,\vec{B}$, $\vec{J} = (\vec{E} + \vec{V} \times \vec{B})$, $R(\vec{V}\cdot\nabla)\vec{V} = -\nabla P + \nabla^2\vec{V} + M(\vec{J} \times \vec{B})$,
(2 a,b) $\mathrm{div}\,\vec{B} = 0$, $\mathrm{div}\,\vec{V} = 0$,
where $R\ (= a^2\Omega/\nu)$ is the Reynolds number, $R_m\ (= 4\pi\sigma a^2\Omega\mu)$ is the magnetic Reynolds number and $M\ (= (\sigma/\rho\nu)a^2B^2)$ is the square of the Hartmann number. From (1) we get
(3) $\vec{E} = -\mathrm{grad}\,\phi$ and $\nabla^2\phi = \mathrm{div}(\vec{V} \times \vec{B})$.
To solve the problem we make use of the following perturbation expansions
(4 a,b,c) $\vec{V} = \vec{V}_0 + R\vec{V}_1 + M\vec{V}_1' + RM\vec{V}_2 + \dots$, $P = P_0 + RP_1 + MP_1' + RMP_2 + \dots$, $\phi = \phi_0 + R\phi_1 + \dots$.

Received Feb. 7, 1965.

The magnetic Reynolds number is assumed to be small so that the magnetic field, $\vec{B}\ [= -\nabla(\tfrac{1}{2}(z/r^3))]$, of the sphere, remains unaffected by the velocity field. Inserting this value of $\vec{B}$ and the expansions of $\vec{V}$ and $\phi$ in (3) we get
(5 a,b) $\nabla^2\phi_0 = \mathrm{div}(\vec{V}_0 \times \vec{B})$, $\nabla^2\phi_1 = \mathrm{div}(\vec{V}_1 \times \vec{B})$.
Again using the perturbation expansions (4) in (2a) we get the following equations
(6 a,b,c,d)
$0 = -\nabla P_0 + \nabla^2\vec{V}_0$,
$(\vec{V}_0\cdot\nabla)\vec{V}_0 = -\nabla P_1 + \nabla^2\vec{V}_1$,
$0 = -\nabla P_1' + \nabla^2\vec{V}_1' + (\vec{E}_0 + \vec{V}_0 \times \vec{B}) \times \vec{B}$,
$(\vec{V}_0\cdot\nabla)\vec{V}_1' + (\vec{V}_1'\cdot\nabla)\vec{V}_0 = -\nabla P_2 + \nabla^2\vec{V}_2 + (\vec{E}_1 + \vec{V}_1 \times \vec{B}) \times \vec{B}$.
The problem is to be solved subject to the following boundary conditions:
(i) No-slip condition at the surface of the sphere.
(ii) Continuity of the normal component of the current density vector and continuity of the tangential component of the electric intensity vector at the surface of the sphere.
(iii) Vanishing of the quantities at infinity.

Solution. Cylindrical polar coordinates $(\varpi, \theta, z)$ with velocity components (u, v, w) will be used in writing down the solutions. The equation (6a) with $\vec{V}_0 = \varpi$ at $r\ (= \sqrt{\varpi^2 + z^2}) = 1$ yields [5]
(7) $\vec{V}_0 = (0, \varpi/r^3, 0)$.
~b~= r52z ( r - l ) 2 171 = ( 1 ~¢1 8 rs ' t5 Oz '0' When P1 is inserted in (5b) we get
I ~h~) t5 &5 "
~7~~bl = 0, which together with the boundary condition gives/~ = 0. The solution of equation (6c) has been obtained elsewhere I-1] and is
~7i = (0,,~g,, 0), where
l(z 2 1) 3,Z(z 2 l) ~ gl=~ ~+7-7)- +ig 7~+3--~ +g+v
(Sz 2 1) r7 r5 '
19651 with 2= -~
1(l+a,) 3+2a,
E-
l+3a, 140(3+2ar)'F=
'
3-2a, 480(3+2a,)'
a, being the ratio of the conductivities of the sphere and the fluid medium. Using the above results, equation (6d) in cylindrical polar-coordinates wit axial symmetry can be written as 2> r--Y-- -
Op2 ~
"~ V 2 U 2
×Lr, g - -r4+
g
u2
th {3z 2
(..~2
i-6 \ r s
r3
r5
r6 + P-
2
~
1 )
+2 ~ rs
x
(9 a,b,c)
2 ,_z2(3r-T - r - g8- + ~ - 5)}]
r-5 . - r---X + ~
0---~ V 2 / ) 2
=
/)2
th 2
-- ~-~ dr- V 2 W 2 -~- 1~6
3 Z2(~ 2 2 rs
x
r3 ]
I~3
r---f - L~ rS .
r--~ -
i.--~ -t- - ~
,) ')1]
5) + 21(3z2 rs
8 + ~r6
r3
{1r--~ - r---g 2 + ~1_z2(3r---g- r--g8 + ~-
The boundary conditions are that, on r = 1,
$u_2 = v_2 = w_2 = 0$,
and each of the functions tends to zero as r tends to infinity. The equation for $v_2$ is satisfied by taking $v_2 = 0$ throughout the fluid. To solve the equations for $u_2$ and $w_2$, let
u2
1 c~2 60 Oz ' w2--
1 ~¢2 c3 c ~
Substituting these values the equation for ~b2 is obtained from (9 a, c) to be (10)
A4~,2 = &2[z/l(r ) + zaf2(r)],
where 02
A2 ~
06~2
1 0 02 & O& + Oz2'
1 / " 1600 F A
=
6 6 2 - 15
28
7 )
+
fz =
12E rs
'
36F 62 + 3 rio + 16rt~
17 9 21r12 + 16r~----T,
and ~2 = (O~b2/Or) = 0, on r = 1 and ¢2 -~ 0 as r ~ oe. Writing ~2 = (°2[zFa(r)+z3F2(r)] we get ordinary linear differential equations for the functions Fl(r) and F2(r). These are 1(1600F 662-15 28 7 ) D(0-2)(0+7)(0+9)Fz = ]-~ 7 + ~ r +~ +~ ' and
D(D - 2)(D + 3)(D + 5)F 1 -
12E r4
36F 62 + 3 r6 + l l r ~
17 9 21r8 + 16r---g
- (12r4F~ + 96raFt), where D = (1/r)(d/dr). The equations have been solved to give the functions Fx and F2 which satisfy the boundary conditions F 1 = F 2 = F] = F~ = 0, on r = 1, and tend to zero as r tends to infinity. The functions F 1 and Fz for tr, = 1 are given below Fx(r ) = [ 976 2857 7435 243 3778 In r3 r ~ + r5 r6 r7 r (11)
1378 r7
2133 152] ~r- + r 9 J × 10 -6,
and (12)
F2(r)=
-
6062 521 298 8900, 4861 r---~-+-~+-~-+--~-mr+-~-+-~
382 ]
× 1 0 -6
The stream function for the secondary flow is given by 8 - - gr~
+ M zFl(r) + z3F2(r)}
It is easy to see that the secondary velocity field, as obtained above, contributes nothing to the couple required to maintain the motion.
Fig. Stream line pattern for the secondary flow (solid line: M = 0; dashed line: M = 1).

The stream line pattern for the secondary flow, in the plane containing the axis, is given in the figure for M = 0 and M = 1. The graph suggests that the secondary velocity field decreases on account of the magnetic field.

Acknowledgement. I am grateful to Prof. Ram Ballabh for his help and guidance in the preparation of this paper.

REFERENCES
1. SUNIL DATTA, J. Phys. Soc. Japan, 19 (1964), 392.
2. W. G. BICKLEY, Phil. Mag. 25 (1938), 746.
3. W. D. COLLINS, Mathematika 2 (1955), 42.
4. W. L. HABERMAN, Phys. of Fluids, 5 (1962), 625.
5. H. LAMB, Hydrodynamics, p. 588-9.

DEPARTMENT OF MATHEMATICS AND ASTRONOMY, LUCKNOW UNIVERSITY, LUCKNOW, INDIA
A MODIFIED NEWTON-RAPHSON METHOD FOR THE SOLUTION OF SYSTEMS OF EQUATIONS
BY ADI BEN-ISRAEL*

ABSTRACT
An implicit function theorem and a resulting modified Newton-Raphson method for roots of functions between finite dimensional spaces, without assuming non-singularity of the Jacobian at the initial approximation.

Introduction. The Newton-Raphson method for solving an equation
(1) $f(y) = 0$
is based upon the convergence, under suitable conditions, of the sequence
(2) $y_{p+1} = y_p - \dfrac{f(y_p)}{f'(y_p)}$, p = 0, 1, ...
to the solution of (1), where $y_0$ is an initial approximation to that solution. The modified Newton-Raphson method uses, instead of (2), the sequence
(3) $y_{p+1} = y_p - \dfrac{f(y_p)}{f'(y_0)}$, p = 0, 1, ...
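The difference between the sequences (2) and (3) is only where the derivative is evaluated. The sketch below illustrates this for a single scalar equation; the function names and the test equation $y^2 - 2 = 0$ with $y_0 = 1.5$ are assumptions chosen for illustration, not taken from the paper.

```python
def newton(f, df, y0, steps):
    """Sequence (2): the derivative is re-evaluated at every iterate."""
    y = y0
    for _ in range(steps):
        y = y - f(y) / df(y)
    return y

def modified_newton(f, df, y0, steps):
    """Sequence (3): the derivative is frozen at the initial approximation y0."""
    slope = df(y0)
    y = y0
    for _ in range(steps):
        y = y - f(y) / slope
    return y

f = lambda y: y * y - 2.0        # illustrative equation with root sqrt(2)
df = lambda y: 2.0 * y

root_newton = newton(f, df, 1.5, 6)
root_modified = modified_newton(f, df, 1.5, 40)
print(root_newton, root_modified)
```

The modified sequence saves derivative evaluations at the price of linear rather than quadratic convergence, which is why it is given more steps here.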
These methods are described in detail in [7], [6] and [4]. Extensions to systems of equations
(4) $f_1(y_1, y_2, \dots, y_n) = 0, \ \dots, \ f_m(y_1, y_2, \dots, y_n) = 0$
are immediate in case m = n, e.g. [3] and [4]. The analogs of (2), (3) are respectively:
(5) $y_{p+1} = y_p - (J(y_p))^{-1}f(y_p)$, p = 0, 1, ...
(6) $y_{p+1} = y_p - (J(y_0))^{-1}f(y_p)$, p = 0, 1, ...
where y is the vector with components $y_j$, j = 1, ..., n, f(y) is the vector with components $f_i(y)$, i = 1, ..., n, J(y) is the Jacobian matrix, whose (i, j)th element is $\partial f_i(y)/\partial y_j$, and $y_0$ is an initial approximation to a solution of (4).

The composite Newton-Raphson gradient method of Hart and Motzkin [2] is applicable also if m ≠ n, provided rank J(y) = n at the solution. In this note the Algorithm (6) is extended to general systems of equations, by using the generalized inverse [8] of the Jacobian matrix. Conditions of convergence, as well as bounds on the convergence rate, are stated in Theorem 2. These are based on Theorem 1, which is an implicit function theorem following from a classical result of Hildebrandt and Graves [5, Theorem 3], and is of independent interest.

Received February 28, 1965.
* Research supported by the Swope Foundation.

NOTATIONS. Let
$E^k$ be the k-dimensional vector space with the Euclidean norm $\|x\| = (x, x)^{1/2}$. Let $E^{m \times n}$ be the space of m × n complex matrices, with the norm $\|A\| = \max\{\sqrt{\lambda} : \lambda \text{ an eigenvalue of } A^*A\}$, $A^*$ being the conjugate transpose of A. These norms satisfy $\|Ax\| \le \|A\|\,\|x\|$ for every $A \in E^{m \times n}$, $x \in E^n$. By R(A), N(A) we denote the range space, respectively null space, of A, and by $A^+$ the generalized inverse of A [8]. For $x_0 \in E^k$ and a real positive r, $S(x_0, r) = \{x \in E^k : \|x - x_0\| < r\}$ is the open ball of radius r around $x_0$. The components of a function $F: E^n \to E^m$ are denoted by $f_i(y)$, i = 1, ..., m. The Jacobian of F at $y \in E^n$ is the m × n matrix
$J(y) = \left(\dfrac{\partial f_i(y)}{\partial y_j}\right)$, i = 1, ..., m; j = 1, ..., n.
Let Y be an open set in $E^n$. Following Hildebrandt and Graves [5] we say that a function $F: E^n \to E^m$ is in the class C'(Y) if the mapping $E^n \to E^{m \times n}$ given by $y \to J(y)$ is continuous for every $y \in Y$. The modulus of continuity of J(y) at $y_0$, $\delta(y_0, \epsilon)$, is defined by
$\|y - y_0\| < \delta(y_0, \epsilon) \Rightarrow \|J(y) - J(y_0)\| \le \epsilon$.

THEOREM 1. Let $X_0$ be an open set in $E^p$, $y_0$ a vector in $E^n$, F a function, $F: X_0 \times S(y_0, r) \to E^m$, T a linear transformation, $T: E^n \to E^m$, M a real positive number, such that:
(7) $M\|T^+\| < 1$
(8) $\|T(y_1 - y_2) - F(x, y_1) + F(x, y_2)\| \le M\|y_1 - y_2\|$ for every $x \in X_0$ and $y_1, y_2 \in S(y_0, r)$ which satisfy $y_1 - y_2 \in R(T^*)$
(9) $\|T^+\|\,\|F(x, y_0)\| < (1 - M\|T^+\|)\,r$
Then there is a unique function $y: X_0 \to S(y_0, r) \cap \{y_0 + R(T^*)\}$, which for every $x \in X_0$ is the solution of
(10) $T^*F(x, y(x)) = 0$
Proof. Define a function $G: X_0 \times S(y_0, r) \to E^n$ by
(11) $G(x, y) = y - T^+F(x, y)$
For every $y_1, y_2$ such that $y_1 - y_2 \in R(T^*)$ we recall that [8]: $T^+T(y_1 - y_2) = y_1 - y_2$ and therefore
(12) $G(x, y_1) - G(x, y_2) = T^+\{T(y_1 - y_2) - F(x, y_1) + F(x, y_2)\}$
From (12) and (8) it follows that:
(13) $\|G(x, y_1) - G(x, y_2)\| \le M\|T^+\|\,\|y_1 - y_2\|$ for every $y_1, y_2$ in $S(y_0, r)$ which satisfy $y_1 - y_2 \in R(T^*)$
Consider now the sequence
(14) $y_1(x) = G(x, y_0)$, $y_{p+1}(x) = G(x, y_p(x))$, p = 1, 2, ....
Since $R(T^+) = R(T^*)$ it follows that
(15) $y_{p+1}(x) - y_p(x) \in R(T^*)$ for every $x \in X_0$, p = 0, 1, ....
Also by (9), for every $x \in X_0$:
(16) $\|y_1(x) - y_0\| = (1 - k)c$ where $k = M\|T^+\| < 1$ by (7) and $c < r$
and by induction:
(17) $\|y_{p+1}(x) - y_p(x)\| \le k^p(1 - k)c$ for every $x \in X_0$, p = 1, 2, ....
Thus for every $x \in X_0$ the sequence $\{y_p(x)\}$ converges to a unique vector y(x), which by (15) and (17) lies in $S(y_0, r) \cap \{y_0 + R(T^*)\}$. We prove now that for every $x \in X_0$, y(x) is a solution of
(18) $y(x) = G(x, y(x))$
Indeed
(19) $\|y(x) - G(x, y(x))\| \le \|y(x) - y_{p+1}(x)\| + \|G(x, y_p(x)) - G(x, y(x))\| \le \|y(x) - y_{p+1}(x)\| + k\|y_p(x) - y(x)\|$, which $\to 0$ as $p \to \infty$.
The proof is completed by noting that (18) is equivalent, by (11), to
(20) $T^+F(x, y(x)) = 0$
which, since $N(T^+) = N(T^*)$, is equivalent to (10). Q.E.D.
REMARKS: (i) If rank $T^{*} = m$ then Theorem 1 reduces to a well known theorem of Hildebrandt and Graves [5, Theorem 3] restricted to the manifold $\{y_0 + R(T^{*})\}$.
(ii) Taking Theorem 1 as the basis of a modified Newton-Raphson method for solving the system (4), where $y$ is the independent variable, we proceed by regarding $X_0$ and $E^{p}$ of Theorem 1 both identical with the zero-dimensional vector space.

THEOREM 2. Let $F$ be a function, $F: E^{n} \to E^{m}$, $y_0$ a vector in $E^{n}$, $M$ a real positive number, such that:
\[ F \in C'(S(y_0, \delta(y_0,M))) \tag{21} \]
\[ M\,\|J_0^{+}\| < 1 \quad \text{where } J_0 = J(y_0) \tag{22} \]
\[ \|J_0^{+}\|\,\|F(y_0)\| < (1 - M\|J_0^{+}\|)\,\delta(y_0,M) \tag{23} \]
Then the sequence
\[ y_{p+1} = y_p - J_0^{+}F(y_p), \qquad p = 0,1,\ldots \tag{24} \]
converges to the unique solution of
\[ J_0^{*}F(y) = 0 \tag{25} \]
which lies in $S(y_0, \delta(y_0,M)) \cap \{y_0 + R(J_0^{*})\}$. Moreover:
\[ \|y_{p+1} - y_p\| \le k_0^{p}(1-k_0)\,\delta(y_0,M), \tag{26} \]
where $k_0 = M\|J_0^{+}\|$.
Proof. Specializing a result of Bartle [1, Lemma 1] we verify that (21) implies that
\[ \|F(y_1) - F(y_2) - J(y_0)(y_1 - y_2)\| \le M\,\|y_1 - y_2\| \tag{27} \]
for every $y_1, y_2 \in S(y_0, \delta(y_0,M))$. Taking $T = J(y_0)$ in Theorem 1, we have conditions (7), (8) and (9) satisfied respectively by (22), (27) and (23). The proof is completed by noting the correspondence between (24), (25), (26) and respectively (14), (10), (17) in Theorem 1. Q.E.D.
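The iteration (24) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's implementation: it assumes a single equation (m = 1), so that the generalized inverse of the nonzero row vector $J_0$ is simply $J_0^{T}/\|J_0\|^2$. The test function, the starting point and the helper names `pinv_row` and `modified_newton` are hypothetical choices, and the hypotheses (21)-(23) are not checked.

```python
import math

def pinv_row(j):
    """Generalized inverse of a nonzero 1 x n row vector j.

    For a single row, j^+ is the column vector j^T / (j j^T).
    """
    norm_sq = sum(c * c for c in j)
    if norm_sq == 0.0:
        raise ValueError("row vector must be nonzero")
    return [c / norm_sq for c in j]

def modified_newton(F, J, y0, steps=100):
    """Iterate y_{p+1} = y_p - J0^+ F(y_p) with J0 = J(y0) held fixed."""
    j0_plus = pinv_row(J(y0))  # computed once, at the initial point
    y = list(y0)
    for _ in range(steps):
        f = F(y)
        y = [yi - gi * f for yi, gi in zip(y, j0_plus)]
    return y

# Hypothetical example: one equation in two unknowns (the unit circle).
def F(y):
    return y[0] ** 2 + y[1] ** 2 - 1.0

def J(y):
    return [2.0 * y[0], 2.0 * y[1]]

y0 = [1.0, 1.0]
y = modified_newton(F, J, y0)
print(y, F(y))
```

In this example every iterate stays on the line $y_0 + R(J_0^{*})$ through (1, 1) in the direction (1, 1), and the sequence settles on the point of the circle lying on that line.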
98
ADI BEN-ISRAEL
REMARKS. (i) If $y_0$ is chosen so that rank $J(y_0) = m$, then (25) is equivalent to (4) and (24) is a generalization of the Algorithm (6) for the solution of (4).
(ii) The computational efficiency of the Algorithm (24) proposed above depends upon that of computing the generalized inverse $J^{+}$, and upon the initial approximation $y_0$. Methods for computing generalized inverses were recently given by several authors, and the rapid progress in this area may result in favor of the proposed Algorithm.

REFERENCES
1. R. G. Bartle, Newton's method in Banach spaces, Proc. Amer. Math. Soc. 6 (1955), 827-831.
2. W. L. Hart and T. S. Motzkin, A composite Newton-Raphson gradient method for the solution of systems of equations, Pacific J. Math. 6 (1956), 691-707.
3. P. Henrici, Discrete Variable Methods in Ordinary Differential Equations, J. Wiley, 1962.
4. P. Henrici, Elements of Numerical Analysis, J. Wiley, 1964.
5. T. H. Hildebrandt and L. M. Graves, Implicit functions and their differentials in general analysis, Trans. Amer. Math. Soc. 29 (1927), 127-153.
6. A. S. Householder, Principles of Numerical Analysis, McGraw-Hill, 1953.
7. A. M. Ostrowski, Solution of Equations and Systems of Equations, Academic Press, 1960.
8. R. Penrose, A generalized inverse for matrices, Proc. Cambridge Philos. Soc. 51 (1955), 406-413.

TECHNION-ISRAEL INSTITUTE OF TECHNOLOGY, HAIFA
A SHORT PROOF OF THE LEVY CONTINUITY THEOREM IN HILBERT SPACE*
BY J. FELDMAN

ABSTRACT
In the theory of the normal distribution on a real Hilbert space H, certain functions $\varphi$ have been shown by L. Gross to give rise to random variables $\varphi^{\sim}$ in a natural way; in particular, this is the case for functions which are "uniformly $\tau$-continuous near zero". Among such functions are the characteristic functions $\varphi$ of probability distributions m on H, given by $\varphi(y) = \int e^{i(y,x)}\,dm(x)$. The following analogue of the Levy continuity theorem has been proved by Gross: Let $\varphi_j$ be the characteristic function of the probability measure $m_j$ on H. Then necessary and sufficient that $\int f\,dm_j \to \int f\,dm$ for some probability measure m and all bounded continuous f, is that there exists a function $\varphi$, uniformly $\tau$-continuous near zero, with $\varphi_j^{\sim} \to \varphi^{\sim}$ in probability. $\varphi$ turns out, of course, to be the characteristic function of m. In the present paper we give a short proof of this theorem.
Let H and K be a pair of real linear spaces in duality, and let $\mathcal{S}$ be the smallest $\sigma$-field of subsets of H for which all elements of K become measurable. If m is a probability measure on $\mathcal{S}$, then the formula
\[ \varphi(y) = \int e^{i\langle x,y\rangle}\,dm(x) \]
defines a function on K, the "characteristic function" of m. $\varphi$ is clearly positive-definite, 1 at the origin, and weakly continuous. One is then led to consider the possibility of generalizing theorems about characteristic functions which are known for the finite-dimensional case. Two basic theorems are the following.

THEOREM I. (S. BOCHNER) The characteristic functions of probability measures on finite-dimensional spaces are precisely the continuous positive-definite functions which are 1 at the origin.

THEOREM II. (P. LEVY) Let $\varphi_j$, j = 1,2,... be characteristic functions of probability measures $m_j$ on the finite-dimensional space K, and let $\varphi$ be a continuous function on K with $\varphi(0) = 1$. A necessary and sufficient condition that $\varphi$ be the characteristic function of a measure m such that $\int f\,dm_j \to \int f\,dm$ for all continuous bounded f, is that $\varphi_j \to \varphi$ Lebesgue-almost-everywhere.

* Received August 5, 1965. Research supported by National Science Foundation Grant GP-3977.
The first of these theorems has been generalized to Hilbert space H, in the following form. A weaker topology $\mathcal{T}$ is placed on H (see [1, 2, 6]) via the seminorms $\{\|\cdot\|_A : A \text{ any Hilbert-Schmidt operator on } H\}$, where $\|x\|_A = \|Ax\|$. Then we have

THEOREM 1. Necessary and sufficient conditions for a function $\varphi$ on Hilbert space H to be a characteristic function are that $\varphi$ be positive-definite, 1 at 0, and uniformly continuous in the topology $\mathcal{T}$ (and in fact the continuity assumption may be weakened to $\mathcal{T}$-continuity at 0).

This theorem has been proven by V. Sazonov [6], in a somewhat different form by R. A. Minlos [4], and also by L. Gross [2] as a corollary to the next theorem. Recently K. Ito has given an extremely direct and simple proof, as yet unpublished.

Theorem II has also been generalized to Hilbert space by L. Gross [2]. His proof is rather long and technical, as befits a first proof. The object of the present note is to give a simpler and more transparent proof. One of the devices in my proof, the use of the measure $e^{-\|x\|^2/2}\,dm(x)$ to estimate the m-measure of a set, was already used by K. Ito in his brief proof of Theorem 1, and by Kolmogorov [3]; it goes back to Prokhorov [5].

In order to state Gross's generalization, we recall some definitions and results from [7] and [2]. By a cylinder function on H we shall mean a function f such that $f = f \circ P$ for some finite-dimensional projection P. Let $\mathcal{A}$ be the *-algebra of all continuous cylinder functions on H, and $\mathcal{A}_\infty$ the bounded elements of $\mathcal{A}$. A linear functional on $\mathcal{A}_\infty$ may be defined by integrating f over PH with respect to the normal distribution on PH, i.e., $\int f\,dn$ is defined by
\[ \int f\,dn = \frac{1}{(2\pi)^{d/2}} \int_{PH} f(x)\,e^{-\frac{1}{2}\|x\|^2}\,dx, \]
where d = dimension of PH. This is to be interpreted as integration with respect to a generalized normal distribution on H. Although many different P may be used, the number $\int f\,dn$ is independent of the P chosen. It is now possible, as in [7], to define a homomorphism, which in this case is clearly an injection, from $\mathcal{A}$ into random variables on some probability space, call this isomorphism $f \to f^{\sim}$, such that if f is bounded, then
\[ \int f\,dn = \int f^{\sim}\,dPr. \]
The isomorphism preserves the algebraic operations including complex conjugation, and for any bounded continuous complex function u of a complex variable, sends $u(f)$ to $u(f^{\sim})$.

In [1, 2] Gross has extended the injection $f \to f^{\sim}$ to a larger class of functions: those which are uniformly $\mathcal{T}$-continuous near zero. This term means, for a function f, that there is a sequence $A_j$ of Hilbert-Schmidt operators with $\operatorname{tr}(A_j^{*}A_j) \to 0$ and such that f is uniformly $\mathcal{T}$-continuous on $\{x : \|A_j x\| \le 1\}$. We shall denote by $\mathcal{B}$ the functions satisfying this condition, and by $\mathcal{B}_\infty$ the bounded ones. For any net of finite-dimensional projections $P_k \uparrow I$, it is shown in [1, 2] that $f^{\sim} = \lim$ in probability $(f \circ P_k)^{\sim}$ exists. The extended map again is an injection preserving algebraic operations and conjugation, and one may define $\int f\,dn = \int f^{\sim}\,dPr$ on bounded elements. Furthermore, for continuous u with compact support, f in $\mathcal{B}$ implies $u(f)$ is in $\mathcal{B}_\infty$. We are now in a position to state Gross's generalization of Theorem II.
THEOREM 2. Let $\varphi_j$, j = 1,2,... be the characteristic function of a probability measure $m_j$ on the Hilbert space H. Then necessary and sufficient for the measures $m_j$ to converge to a probability measure m, in the sense that $\int f\,dm_j \to \int f\,dm$ for all bounded continuous f, is that the random variables $\varphi_j^{\sim}$ converge in probability to $\varphi^{\sim}$, for some $\varphi$ in $\mathcal{B}$ with $\varphi(0) = 1$. $\varphi$ then turns out to be the characteristic function of m.

Proof. All we shall show here is that if $\varphi$ is in $\mathcal{B}$ with $\varphi(0) = 1$, and $\varphi_j^{\sim} \to \varphi^{\sim}$ in probability, then the set $\{m_j : j = 1,2,\ldots\}$ is precompact. The "necessity" proof, and the remainder of the proof of "sufficiency", are fairly straightforward in Gross's original treatment.

First we note that $|\varphi| \le 1$. For let u be any nonnegative continuous function of a complex variable, having compact support, and vanishing in the unit disk. Then each $u(\varphi_j) = 0$, and $\int u(\varphi_j)\,dn \to \int u(\varphi)\,dn$, so $\int u(\varphi)\,dn = 0$. Thus $u(\varphi)^{\sim} = 0$, hence $u(\varphi) = 0$. Thus $|\varphi| \le 1$.

Next, observe that for given a > 0, b > 0, finite-dimensional projection P, and Q = I - P, the set $\{x : \|(aP + Q)x\| \le b\}$ is contained in the b-neighborhood in H of the b/a-sphere in the range of P. Thus, by a theorem of Prokhorov, all we need show is that for any preassigned b > 0 there exist a > 0 and a finite-dimensional P such that, for all sufficiently large j,
\[ m_j\{x : \|(aP + Q)x\| \le b\} > 1 - b. \]
Select b > 0, and let $c = (1 - e^{-b^2/2})\,b/3$. There exists a Hilbert-Schmidt operator $A \ge 0$ such that $\|Ax\| < 1$ implies $|1 - \varphi(x)| < c$. Then
\[ \operatorname{re}\varphi(y) > 1 - c - 2\|Ay\|^2. \]
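Let T = aP + Q, where 0 < a < 1, and P is a finite-dimensional projection. Then
\[ \int \operatorname{re}\varphi \circ T\,dn > 1 - c - 2\operatorname{tr}(AT^2A). \]
Now, a simple finite-dimensional calculation shows that, for $f \in \mathcal{A}_\infty$, $\int f \circ T\,dn = \int fg\,dn$, where
\[ g(x) = \frac{1}{a^{d}}\,e^{\frac{1}{2}(1 - a^{-2})\|Px\|^2}, \]
d being the dimension of P. Then the same equation holds for $f \in \mathcal{B}_\infty$. (This really amounts to saying $d(n \circ T^{-1})/dn = g$; see [7] for a fuller discussion of these ideas.) Now: since $\varphi_j^{\sim} \to \varphi^{\sim}$ in probability, it follows that $\varphi_j^{\sim}g^{\sim} \to \varphi^{\sim}g^{\sim}$ in probability, so $\int \varphi_j^{\sim}g^{\sim}\,dPr \to \int \varphi^{\sim}g^{\sim}\,dPr$. Then, for sufficiently large j,
\[ \operatorname{re}\int \varphi_j \circ T\,dn = \operatorname{re}\int \varphi_j g\,dn > 1 - c - 2\operatorname{tr}(AT^2A). \]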
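Now let $Q_k$ be a net of finite-dimensional projections ascending to Q, and $P_k = P + Q_k$. Then, for fixed j,
\[ \int \varphi_j \circ P_kT\,dn = \int \varphi_j \circ TP_k\,dn \to \int \varphi_j \circ T\,dn. \]
Now, $\varphi_j \circ P_kT$ restricted to $P_kH$ is the characteristic function of the measure $m_j \circ (P_kT)^{-1}$ on $P_kH$, so
\[ \int \varphi_j \circ P_kT\,dn = \int_{P_kH} e^{-\|x\|^2/2}\,d\,m_j \circ (P_kT)^{-1} = \int_H e^{-\frac{1}{2}\|P_kTx\|^2}\,dm_j(x). \]
Then, taking limits in k, $\int \varphi_j \circ T\,dn = \int e^{-\frac{1}{2}\|Tx\|^2}\,dm_j(x)$. Now:
\[ \int e^{-\frac{1}{2}\|Tx\|^2}\,dm_j(x) \le m_j\{x : \|Tx\| \le b\} + e^{-b^2/2}\,m_j\{x : \|Tx\| > b\}. \]
Thus,
\[ m_j\{x : \|Tx\| \le b\} > 1 - \frac{b}{3} - \frac{1}{1 - e^{-b^2/2}}\,2\operatorname{tr}(AT^2A). \]
Since $\operatorname{tr}(AQA) \to 0$ as $P \uparrow I$, a finite-dimensional P may be chosen with $2\operatorname{tr}(AQA) < c$. Then choose a so small that $2a^2\operatorname{tr}(APA) < c$. Writing $2\operatorname{tr}(AT^2A) = 2a^2\operatorname{tr}(APA) + 2\operatorname{tr}(AQA)$ we have
\[ m_j\{x : \|(aP + Q)x\| \le b\} > 1 - b. \]
Q.E.D.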
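The change-of-measure identity $\int f \circ T\,dn = \int fg\,dn$ used in the proof can be checked numerically in the simplest case. The sketch below is an illustration only: it assumes dimension one with P = I and Q = 0 (so T = a), the test function f = cos, and a = 0.5; the helper `normal_integral` and the trapezoidal quadrature are hypothetical choices. For f = cos both sides have the closed form $e^{-a^2/2}$.

```python
import math

def normal_integral(h, lo=-12.0, hi=12.0, steps=48000):
    """Integrate h against the standard normal law on the real line,
    using the composite trapezoidal rule on [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * dx
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * h(x) * math.exp(-0.5 * x * x)
    return total * dx / math.sqrt(2.0 * math.pi)

a = 0.5  # hypothetical scaling factor, 0 < a < 1
d = 1    # dimension of the range of P (here P = I, Q = 0)

def f(x):
    return math.cos(x)  # a bounded continuous test function

def g(x):
    # g(x) = a^{-d} exp((1/2)(1 - a^{-2}) |Px|^2), as in the text
    return a ** (-d) * math.exp(0.5 * (1.0 - a ** (-2)) * x * x)

lhs = normal_integral(lambda x: f(a * x))     # integral of f o T dn
rhs = normal_integral(lambda x: f(x) * g(x))  # integral of f g dn
exact = math.exp(-0.5 * a * a)                # closed form for f = cos
print(lhs, rhs, exact)
```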
REFERENCES
1. L. Gross, Integration and nonlinear transformations in Hilbert space, Trans. Amer. Math. Soc. 94 (1960), 404-440.
2. L. Gross, Harmonic analysis in Hilbert space, Memoirs, American Mathematical Society, 46, 1963.
3. A. N. Kolmogorov, "A note to the papers of R. A. Minlos and V. Sazonov," Teoriya Veroyatnostei i eyo Primenyeniya 4 (1959), 237-239; Theory of Probability and its Applications 4, 221-223.
4. R. A. Minlos, Generalized random processes and their extension to measures, Trudy Moskov. Mat. Obsc. 8 (1959), 497-518.
5. Yu. V. Prohorov, Convergence of random processes and limit theorems in probability theory, Theory of Probability and its Applications 1 (1956), English translation published by S. I. A. M., 157-214.
6. V. Sazonov, "A remark on characteristic functionals," Teoriya Veroyatnostei i eyo Primenyeniya 3 (1958), 201-205; Theory of Probability and its Applications 3, 188-192.
7. I. E. Segal, Tensor algebras over Hilbert space, Trans. Amer. Math. Soc. 81 (1956), 106-134.

UNIVERSITY OF CALIFORNIA, BERKELEY, CALIFORNIA
FINITE SETS ON CURVES AND SURFACES*
BY H. GUGGENHEIMER

ABSTRACT
A complete proof is given for Schnirelmann's theorem on the existence of a square in $C^2$ Jordan curves. The following theorems are then proved, using the same method:
1. On every hypersurface in $R^n$, $C^3$-diffeomorphic to $S^{n-1}$, there exist 2n points which are the vertices of a regular 2n-cell $C_n$.
2. Every plane C' Jordan curve can be C' approximated by a curve on which there are 2N distinct points which are the vertices of a centrally symmetric 2N-gon (angles $\pi$ not excluded).
3. On every plane $C^2$ curve there exist 5 distinct points which are the vertices of an axially symmetric pentagon with given base angles $\alpha$, $\pi/2 \le \alpha < \pi$. (The angle at the vertex on the axis of symmetry might be $\pi$.)

1. L. Schnirelmann has published [4] the following theorem:
On every simple, closed, plane Jordan curve having continuous curvature of bounded variation, one can find four points which form the vertices of a square. The same holds for any finite union of such curves.
However, Schnirelmann's proof as printed is not quite convincing. [In the Uspehi text, the only one available to the present author, the remark on p. 38, lines 28-30, does not seem to be justified by the statement of Lemma 1. This text is posthumous and apparently not an exact reprint of the 1929 paper.] Since this theorem (as well as the related one on systems of rhombs, §5) is of great intrinsic interest and the topological method invented for its proof seems capable of a wide range of other applications, we present here a complete proof of a slightly improved version of Schnirelmann's theorem, following as closely as possible that author's method. This is done in §§ 2 to 4. §5 deals with Schnirelmann's rhomb theorem, and §§ 6, 7 bring some new applications of Schnirelmann's method to problems in two and more dimensions.

2. The space L of oriented line elements in the plane $R^2$ can be identified with the cartesian product $R^2 \times S^1$. In the space of differentiable, closed, plane curves, i.e., of the differentiable maps $f : S^1 \to R^2$, we use the C' metric defined by
\[ d(f_1,f_2) = \max_{t \in [0,2\pi]} \{\,|f_1(t) - f_2(t)| + |f_1'(t) - f_2'(t)|\,\}. \]
For fixed t, the expression in brackets can be used as a distance in L induced by the cartesian product. Hence it is possible to speak of a curve which is near to a line element $(x, y, \alpha)$ in a neighborhood of a point $(x, y)$. We say that $(x, y, \alpha)$ is a line element at $(x, y)$.

* Research supported by Grant AF-AFOSR-664-64, Air Force Office of Scientific Research.
The main tool in the proof is a perturbation lemma. For convenience, the indices i will always run from 1 to 4, and the index 4 + 1 shall be identified to 1.

MAIN LEMMA: At the vertices $A_i$ of a square we choose line elements $\sigma_i$ characterized by the angle $\mu_i$ between the direction of the line element and the edge $A_iA_{i+1}$ of the square. If the rank of the matrix
\[ A = \begin{pmatrix}
-\cos\mu_1 & \cos\mu_2 + \sin\mu_2 & -\sin\mu_3 & 0 \\
0 & -\cos\mu_2 & \cos\mu_3 + \sin\mu_3 & -\sin\mu_4 \\
-\sin\mu_1 & 0 & -\cos\mu_3 & \cos\mu_4 + \sin\mu_4 \\
\sin\mu_1 - \cos\mu_1 & \cos\mu_2 - \sin\mu_2 & \sin\mu_3 - \cos\mu_3 & \cos\mu_4 - \sin\mu_4
\end{pmatrix} \]
is at least three, then there exist neighborhoods $V(\sigma_i) \subset L$ with the following property: On all quadruples of analytic arcs $c_i$ such that the tangent elements to $c_i$ are in $V(\sigma_i)$ it is possible to find points $B_i \in c_i$ which are the vertices of a square. The square $B_i$ is unique if $\det A \ne 0$.

We select that system of coordinates for which the square $A_i$ becomes the unit square, i.e., $A_1 = (0,0)$, $A_2 = (1,0)$, $A_3 = (1,1)$, $A_4 = (0,1)$. On each arc $c_i$ we use as parameter $s_i$ the arclength measured from some point $X_i \in c_i$. The angle of the line element of $c_i$ at $X_i$ and $A_iA_{i+1}$ is denoted by $\mu_i'$. The parametric representation of the four arcs then becomes
\[ \begin{aligned}
x_1 &= a_1 + \cos\mu_1'\,s_1 + \eta_1(s_1), & y_1 &= b_1 - \sin\mu_1'\,s_1 + \theta_1(s_1), \\
x_2 &= 1 + a_2 + \sin\mu_2'\,s_2 + \eta_2(s_2), & y_2 &= b_2 + \cos\mu_2'\,s_2 + \theta_2(s_2), \\
x_3 &= 1 + a_3 - \cos\mu_3'\,s_3 + \eta_3(s_3), & y_3 &= 1 + b_3 + \sin\mu_3'\,s_3 + \theta_3(s_3), \\
x_4 &= a_4 - \sin\mu_4'\,s_4 + \eta_4(s_4), & y_4 &= 1 + b_4 - \cos\mu_4'\,s_4 + \theta_4(s_4).
\end{aligned} \]
The neighborhoods $V(\sigma_i)$ are characterized by a number $\varepsilon > 0$ where
\[ |a_i| + |b_i| + |\mu_i' - \mu_i| < \varepsilon, \tag{1} \]
and we may add the condition
\[ |s_i| < \varepsilon. \]
The $\eta_i$, $\theta_i$ are analytic functions subject to
\[ \eta_i(s_i) = O(s_i^2), \qquad \theta_i(s_i) = O(s_i^2). \]
The points $(x_i(s_i), y_i(s_i))$ are the vertices of a square iff
\[ (x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2 = (x_{i+2} - x_{i+1})^2 + (y_{i+2} - y_{i+1})^2 \qquad (i = 1,2,3) \]
\[ (x_3 - x_1)^2 + (y_3 - y_1)^2 = (x_4 - x_2)^2 + (y_4 - y_2)^2 \]
or, by the parametric representation,
\[ \begin{aligned}
(-\cos\mu_1' + \varepsilon_{11})s_1 + (\cos\mu_2' + \sin\mu_2' + \varepsilon_{12})s_2 + (-\sin\mu_3' + \varepsilon_{13})s_3 + g_1 &= f_1 \\
(-\cos\mu_2' + \varepsilon_{22})s_2 + (\cos\mu_3' + \sin\mu_3' + \varepsilon_{23})s_3 + (-\sin\mu_4' + \varepsilon_{24})s_4 + g_2 &= f_2 \\
(-\sin\mu_1' + \varepsilon_{31})s_1 + (-\cos\mu_3' + \varepsilon_{33})s_3 + (\cos\mu_4' + \sin\mu_4' + \varepsilon_{34})s_4 + g_3 &= f_3 \\
(\sin\mu_1' - \cos\mu_1' + \varepsilon_{41})s_1 + (\cos\mu_2' - \sin\mu_2' + \varepsilon_{42})s_2 + (\sin\mu_3' - \cos\mu_3' + \varepsilon_{43})s_3 + (\cos\mu_4' - \sin\mu_4' + \varepsilon_{44})s_4 + g_4 &= f_4
\end{aligned} \tag{2} \]
wherein the $f_i = f_i(a_1, \ldots, a_4, b_1, \ldots, b_4)$ we have collected all the constant terms, and in the $g_i$ the terms of order $\ge 2$ in the $s_i$. By (1), computation shows

(3)

We have to solve the system (2) for given constants $a_i$, $b_i$, $\mu_i'$. Its Jacobian J can be written
\[ J = A + \Delta \]
where, by (1) and (3), the elements $\delta_{ij}$ of $\Delta$ satisfy
\[ \delta_{ij} < 10\,\varepsilon\,|s_i|. \tag{4} \]
CASE 1. $\det A \ne 0$. If $\varepsilon$ is small enough, $\det J \ne 0$ for small $s_i$, and the system (2) has a unique solution by the inverse function theorem. (An upper bound for the $s_i$ can be obtained by successive approximation, hence $\varepsilon$ can be determined to justify our reasoning.)
CASE 2. $\det A = 0$, rank A = 3. At least one of the three-rowed minors of A is $\ne 0$. Since $\det A$ is linear in the trigonometric functions, we may assume without loss of generality that we have a relation
\[ \det A = \cos\mu_4 \cdot F(\mu_1,\mu_2,\mu_3) - \sin\mu_4 \cdot G(\mu_1,\mu_2,\mu_3) = 0 \quad \text{where } F^2 + G^2 \ne 0. \]
Hence, for a given choice of line elements $\sigma_1, \sigma_2, \sigma_3$ there exists a unique (up to orientation) line element through $A_4$ which will annul $\det A$.
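The distinction between the two cases can be explored numerically. The following sketch is a hedged illustration, not part of the original argument: it builds the matrix A of the Main Lemma for given angles and computes its rank and determinant by Gaussian elimination. The function names and the tolerance are hypothetical. As a test case it assumes that, in the sign convention of the parametric representation, the tangents of the circle circumscribed about the unit square correspond to $\mu_i = \pi/4$; that choice falls under Case 2, which fits the fact that a circle carries a one-parameter family of inscribed squares.

```python
import math

def square_matrix_A(mu):
    """Matrix A of the Main Lemma for angles mu = (mu1, mu2, mu3, mu4)."""
    c = [math.cos(m) for m in mu]
    s = [math.sin(m) for m in mu]
    return [
        [-c[0], c[1] + s[1], -s[2], 0.0],
        [0.0, -c[1], c[2] + s[2], -s[3]],
        [-s[0], 0.0, -c[2], c[3] + s[3]],
        [s[0] - c[0], c[1] - s[1], s[2] - c[2], c[3] - s[3]],
    ]

def rank_and_det(matrix, tol=1e-12):
    """Rank and determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det, rank, row = 1.0, 0, 0
    for col in range(n):
        pivot = max(range(row, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < tol:
            det = 0.0  # no usable pivot in this column: singular matrix
            continue
        if pivot != row:
            a[row], a[pivot] = a[pivot], a[row]
            det = -det
        det *= a[row][col]
        for r in range(row + 1, n):
            factor = a[r][col] / a[row][col]
            for k in range(col, n):
                a[r][k] -= factor * a[row][k]
        row += 1
        rank += 1
    return rank, det

quarter = math.pi / 4.0
circle_case = rank_and_det(square_matrix_A([quarter] * 4))
generic_case = rank_and_det(
    square_matrix_A([quarter, quarter, quarter, math.pi / 3.0]))
print(circle_case, generic_case)
```

Changing only the fourth angle already moves the configuration from Case 2 to Case 1, in line with the remark that a single line element through $A_4$ annuls $\det A$.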
By continuity, rank $J \ge 3$ in some neighborhood of $\sigma_1 \times \sigma_2 \times \sigma_3 \times \sigma_4$ in $L \times L \times L \times L$. If $\det J \ne 0$ for the given $c_i$, we are back at case 1. If $\det J = 0$, we momentarily direct our attention to the case $A_4 = B_4$, and approximate $c_4$ by analytic arcs $c^{*}$ through $A_4$ and which at that point have line elements $(0, 1, \mu_4^{*})$, $\mu_4^{*} \ne \mu_4$. By the preceding paragraph, $\det A(\mu_1, \mu_2, \mu_3, \mu_4^{*}) \ne 0$, hence there exists a square inscribed in $(c_1, c_2, c_3, c^{*})$. All these squares have their widths bounded from above and below. By the Blaschke Auswahlsatz there exists a converging sequence of squares. The limit figure is a non-degenerate oval which must be a square inscribed in $(c_1, c_2, c_3, c_4)$. There is no upper bound for the number of squares obtained in this way. A repetition of the process eliminates the condition $A_4 = B_4$.

3. A point $(a_0, \ldots, a_{2n}; b_0, \ldots, b_{2n})$ in (4n + 2) dimensional space $R^{4n+2}$ defines a plane curve by
\[ x(t) = \frac{a_0}{2} + \sum_{k=1}^{n} (a_k \cos kt + a_{n+k} \sin kt), \qquad y(t) = \frac{b_0}{2} + \sum_{k=1}^{n} (b_k \cos kt + b_{n+k} \sin kt). \]
Let $G_0(n)$ be the open set of points which represent simple closed curves. (The image of the C' topology in the space of curves is compatible with the cartesian topology in $R^{4n+2}$.) Ellipses with center at the origin and axes on the coordinate axes are represented by the points
\[ \Gamma_0 : \quad a_1 = a, \quad b_{n+1} = b, \quad \text{all other coordinates } 0. \]
Let $G_1(n)$ be the connected component of $G_0(n)$ which contains the (connected) set of the $\Gamma_0$. By T we denote the set of points which characterize curves with an inscribed square for which rank A < 3 (computed for the line elements of the curve at the vertices of the square). The dimension of T is $\le 4n$, therefore $G_2(n) = G_1(n) - T$ is a connected set, hence is arcwise connected. $G_2(n)$ is dense in $G_1(n)$. Plane curves which can be C' approximated by curves in $G_1(n)$ (i.e., curves represented by points of $G_1(n)$) therefore also can be C' approximated by curves in $G_2(n)$.

In a fixed coordinate system in the plane, let $\Theta(s)$ be the tangent angle of a C' curve as a function of its arclength. The declension is the difference quotient
\[ d(s_1, s_2) = \frac{\Theta(s_1) - \Theta(s_2)}{s_1 - s_2}. \]
As an average over the curvature of $C^2$ curves the declension is of geometric interest [1]. We are interested in the curves of bounded declension. However, for the Fourier approximation of these curves we have to ask that the declension $d(s_0, s)$ be of bounded variation as a function of s. If this holds for all $s_0$, we say that the declension is of bounded variation. A sufficient condition would be for $\Theta(s)$ to satisfy a Hölder condition of exponent > 2. If the curve is closed, x(t) and y(t) are periodic functions. We may assume the period to be $2\pi$. The Fourier approximation $J_n(f)$ of a periodic function f(t) is
\[ J_n(f) = \frac{1}{2\pi}\int_0^{2\pi} f(\tau)\,d\tau + \frac{1}{\pi}\sum_{k=1}^{n} \int_0^{2\pi} \{ f(\tau)\cos k\tau \cos kt + f(\tau)\sin k\tau \sin kt \}\,d\tau. \]
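The Fourier approximation $J_n(f)$ can be evaluated numerically as follows. This is a minimal sketch, assuming the integrals over one period are replaced by uniform Riemann sums; the function name `fourier_approximation`, the sample count and the test function are hypothetical. For a trigonometric polynomial, $J_n$ simply keeps the harmonics of order at most n.

```python
import math

def fourier_approximation(f, n, samples=2048):
    """Return t -> J_n(f)(t) for a 2*pi-periodic function f.

    The integrals over [0, 2*pi] are replaced by uniform Riemann sums,
    an assumption of this sketch rather than part of the definition.
    """
    h = 2.0 * math.pi / samples
    taus = [i * h for i in range(samples)]
    vals = [f(tau) for tau in taus]
    mean = sum(vals) * h / (2.0 * math.pi)
    cos_coef = []
    sin_coef = []
    for k in range(1, n + 1):
        cos_coef.append(
            sum(v * math.cos(k * tau) for v, tau in zip(vals, taus)) * h / math.pi)
        sin_coef.append(
            sum(v * math.sin(k * tau) for v, tau in zip(vals, taus)) * h / math.pi)

    def approx(t):
        total = mean
        for k in range(1, n + 1):
            total += (cos_coef[k - 1] * math.cos(k * t)
                      + sin_coef[k - 1] * math.sin(k * t))
        return total

    return approx

# Hypothetical coordinate function with harmonics of order 1, 2 and 5.
def x(t):
    return 1.0 + 2.0 * math.cos(t) - 3.0 * math.sin(2.0 * t) + 0.5 * math.cos(5.0 * t)

J2 = fourier_approximation(x, 2)
truncated = 1.0 + 2.0 * math.cos(0.7) - 3.0 * math.sin(1.4)
print(J2(0.7), truncated)
```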
Any plane, simple, closed curve of declension of bounded variation is differentiably isotopic to an ellipse. This means that there exists a map $F : S^1 \times I \to R^2$ of the unit cylinder into the plane such that
a) $F(t, 0) = (a \cos t, b \sin t)$
b) $F(t, 1) = (x(t), y(t))$
c) $F(t, \alpha) = (x_\alpha(t), y_\alpha(t))$ is a simple, closed curve of declension of bounded variation for constant $\alpha$
d) $F(t, \alpha)$ is continuous of bounded variation in the two variables t, $\alpha$.
This fact is well known even in n dimensions [3]. Since $S^1 \times I$ is compact, for given $\varepsilon > 0$ there exists an $n_0$ such that $(J_n(x_\alpha), J_n(y_\alpha))$ is a C'-approximation of $(x_\alpha, y_\alpha)$ for all $\alpha$ and is of uniformly bounded declension, for $n > n_0$. We have shown:
Every simple, closed curve of declension of bounded variation can be C' approximated by curves in $G_1(n)$ of uniformly bounded declension, hence also by curves of $G_2(n)$ of uniformly bounded declension, for some $n > n_0$ depending on curve and approximation.

4. The theorem may now easily be proved. First we show that:
All curves given by points of $G_2(n)$ admit an inscribed square.
$G_2(n)$ is arcwise connected. We choose an arc which connects any point $c \in G_2(n)$ to an ellipse $\Gamma_0$. An ellipse contains an inscribed square (of edge $2ab(a^2 + b^2)^{-1/2}$). Since nearness of points in $R^{4n+2}$ implies nearness of the curves in C' topology, it follows from the main lemma that points near a point in $G_2(n)$ which represents a curve with an inscribed square, also represent curves with an inscribed square. The result follows from the compactness of the arc.
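The edge length quoted for the square inscribed in an ellipse is easy to verify. The sketch below is an illustration with hypothetical names and sample half-axes: for the ellipse $x^2/a^2 + y^2/b^2 = 1$ it takes the axis-parallel square with vertices $(\pm t, \pm t)$, $t = ab(a^2 + b^2)^{-1/2}$, and checks that the vertices lie on the ellipse and that the edge is $2ab(a^2 + b^2)^{-1/2}$.

```python
import math

def inscribed_square(a, b):
    """Axis-parallel square inscribed in the ellipse x^2/a^2 + y^2/b^2 = 1.

    Returns the four vertices in cyclic order and the edge length.
    """
    t = a * b / math.hypot(a, b)
    vertices = [(t, t), (-t, t), (-t, -t), (t, -t)]
    return vertices, 2.0 * t

a, b = 3.0, 2.0  # hypothetical half-axes
vertices, edge = inscribed_square(a, b)
residuals = [vx * vx / (a * a) + vy * vy / (b * b) - 1.0 for vx, vy in vertices]
side = math.hypot(vertices[1][0] - vertices[0][0], vertices[1][1] - vertices[0][1])
print(edge, side, residuals)
```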
THEOREM: On every simple, closed C' curve of declension of bounded variation one can find four points which form the vertices of a square.
For any sequence $\varepsilon_j \to 0$ we can find curves $c_j$, represented by points $C_j \in G_2(n_j)$, which C' approximate the given curve c up to $\varepsilon_j$. By our last result, there exists a square $S_j$ in $c_j$. The set of ovals $S_j$ is bounded and by the Blaschke Auswahlsatz there exists a converging subsequence of the $S_j$. All the limit vertices must be
on c. Since c is simple, the limit figure can be degenerate only if an $\varepsilon$-arc of $c_j$ ($j > j_0$) passes through the four vertices of $S_j$. But then the declension of c, which is approximated by the declension of the $c_j$, cannot be bounded at the limit point, Q.E.D.
At this point the hypothesis of bounded declension, which was a technical convenience in § 3, becomes essential. The proof still holds if the curve is C' and piecewise of bounded declension.

5. The following theorem is given by Schnirelmann as Theorem 2.
DEFINITION: We shall call complete a system of rhombs whose vertices lie on a closed curve c, if it satisfies the following conditions:
1) Every point of c can be taken as a vertex of some rhomb of the system;
2) Any two rhombs $R_0$ and $R_1$ can be connected by a continuous one-parameter family of rhombs $R_\alpha$ ($0 \le \alpha \le 1$) so that a fixed vertex $A_0$ of the rhomb $R_0$ passes into a fixed vertex $A_1$ of the rhomb $R_1$;
3) None of the rhombs degenerates into a figure without interior points.
THEOREM: There exists a complete system of rhombs in every simple, closed curve of bounded continuous curvature.
"Bounded continuous curvature" may again be replaced by "declension of bounded variation." An analysis of the preceding proof shows the following steps:
1. The main lemma. For $\det J \ne 0$ it is trivial, for $\det J = 0$ it depends on the fact that J is linear in the trigonometric functions of the tangent angles.
2. Therefore, the quadruples of line elements which do not admit a prolongation satisfy at least two independent conditions. This allows us to choose a connected open set $G_2(n)$ in each approximation space $R^{4n+2}$.
3. In the final convergence argument, the main point to be established is the non-degeneracy of the limit figure.
Rhombs are characterized by the first three conditions (2). The main lemma therefore reads: "At the vertices $A_i$ of a rhomb we choose line elements $\sigma_i$ characterized by the angle $\mu_i$ between the direction of the line element and the edge $A_iA_{i+1}$ of the rhomb. If the rank of the matrix
\[ A = \begin{pmatrix}
-\cos\mu_1 & \cos\mu_2 + \sin\mu_2 & -\sin\mu_3 & 0 \\
0 & -\cos\mu_2 & \cos\mu_3 + \sin\mu_3 & -\sin\mu_4 \\
-\sin\mu_1 & 0 & -\cos\mu_3 & \cos\mu_4 + \sin\mu_4
\end{pmatrix} \]
is three, then there exist neighborhoods $V(\sigma_i) \subset L$ with the following property: On all quadruples of analytic arcs $c_i$ such that the tangent elements to $c_i$ are in $V(\sigma_i)$ it is possible to find quadruples $B_i(t)$ which for constant t form the vertices of a rhomb. $B_i(t)$ is a continuous function of t where t is identified with
one of the arclengths $s_i$ of the arcs $c_i$." For the proof one simply chooses the fixed $s_i$ so that the resulting system has a non-vanishing Jacobian determinant. Step two then follows as before (rank A = 2 gives four conditions) and also step three (the theorem is trivial for ellipses) by the boundedness of the declensions. The reader may easily fill in the details. Schnirelmann also gives a theorem about possible degeneracies if the curve is only supposed to be C'.

6. It is worthwhile to investigate n-dimensional generalizations of the square theorem. The square is the two-dimensional member both of the series of n-dimensional cubes $B_n$ and of their duals, the n-dimensional 2n-cells $C_n$. (For n > 4, the regular simplices $A_n$, $B_n$, and $C_n$ are the only regular polyhedra.) In general, a smooth closed surface in $R^3$ does not contain an inscribed cube. But by Schnirelmann's method we may prove:
THEOREM: Every $C^3$ hypersurface in $R^n$, $C^3$-diffeomorphic to $S^{n-1}$, contains 2n points which are the vertices of a regular $C_n$.
The regular $C_n$ has 2n vertices and 2n(n-1) edges. All its two-dimensional faces are triangles ($n \ge 3$). A piece of hypersurface may be described by n-1 variables; for pieces laid out about all the vertices we need 2n(n-1) variables. The equality of the edges is expressed by 2n(n-1) - 1 equations which can be expanded so that the Jacobian matrix contains only first powers of the trigonometric functions of the Euler angles of the elements of hypersurface. These conditions alone suffice to make the polytope a regular $C_n$. (This situation is parallel to that in § 5, and a deeper discussion probably would yield a proof that each closed $C^{2+\varepsilon}$ hypersurface admits a continuum of inscribed $C_n$, $n \ge 3$.)
By hypothesis, the function which describes the surface, as well as its Gauss curvature, can be considered as univalent differentiable functions on the sphere $S^{n-1}$. Therefore they can be developed into series of spherical harmonics and, by compactness, we again have the possibility of $C^2$ approximation by finite polynomials of (n-1) spherical harmonics. These polynomials again may be characterized by points of an open set $G_1(n)$ of a finite dimensional cartesian space, in which there is a dense, connected subset which contains ellipsoids and for which the main (prolongation) lemma holds. The theorem holds for ellipsoids. A vertex of $C_n$ in an ellipsoid of half-axes $a_1, \ldots, a_n$ has distance p from the center, where (see [2])
\[ \frac{1}{p^2} = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{a_i^2}. \]
The Gauss curvature of our surfaces is continuous, hence bounded, and uniformly bounded on the approximating surfaces to one $C^3$ surface. The approximation of a $C^3$ surface by surfaces given by spherical polynomials yields a bounded set of
$C_n$'s which, by the Blaschke Auswahlsatz, is relatively compact in the space of ovals in $R^n$. A converging sequence of $C_n$'s must converge to a non-degenerate $C_n$ since otherwise the Gauss curvature of the approximating surfaces could not be bounded. This completes the outline of the proof. A count of constants shows that also in dimensions 3 and 4 the $A_n$ and $C_n$ are the only universally inscribable regular polytopes.

7. According to Schnirelmann's theorem, every smooth curve contains a square. It is well known that every oval contains a symmetric hexagon, but not every oval contains a centrally symmetric octagon. In this connection, we can prove:
THEOREM: Every simple C' curve can be C'-approximated by a curve which has an inscribed centrally symmetric 2N-gon.
Let $A_i$ (i = 1, ..., 2N) be the vertices of a 2N-gon in cyclic order. The condition of central symmetry is
a) $A_iA_{i+1} = A_{N+i}A_{N+i+1}$ (indices mod 2N)
b) $\angle A_i = \angle A_{N+i}$
Because of (a), condition (b) is equivalent to
b') $A_iA_{i+2} = A_{N+i}A_{N+i+2}$
Therefore, all conditions are given by relations between distances which yield a Jacobian matrix linear in the trigonometric functions of the angles of the line elements, and a Main Lemma will hold if the number of variables $s_i$ is not less than the number of conditions. There are 2N variables $s_i$, N conditions (a) and N conditions (b). If we start from a 2N-gon with distinct vertices, we get 2N-gons with distinct vertices by the limit process needed to establish the Main Lemma. However, we are not able to control the sizes of the angles, and, therefore, the theorem has to be understood in such a way that 2N-gons with distinct vertices but angles $\pi$ are admissible. The approximation by trigonometric polynomials and the definition of a $G_2(n)$ for whose curves an inscribed 2N-gon exists does not present any difficulties. On the other hand, since we cannot control the angles we cannot be sure that no points will merge in the final approximation process, even for analytic curves. Therefore, the theorem cannot be improved by Schnirelmann's methods, and I would even conjecture that with every reasonable measure in the space of C' curves the curves admitting a 2N-gon would fill a subset of measure zero.
The proof of the previous theorem shows that for the complete success of Schnirelmann's method a control of at least some angles of the inscribed polygon is necessary. In this direction, we have for instance the following theorem:
THEOREM: On every simple, closed curve of declension of bounded variation
there are five points which are the vertices of an axially symmetric pentagon with three equal edges and fixed base angles $\alpha > \pi/2$.
If the vertex $A_1$ is on the axis of symmetry, the pentagon is characterized by
\[ A_1A_2 = A_1A_5, \qquad A_2A_3 = A_3A_4 = A_4A_5, \]
\[ A_2A_4 = 2\,A_2A_3 \sin\frac{\alpha}{2}, \qquad A_3A_5 = 2\,A_2A_3 \sin\frac{\alpha}{2}. \]
These are five equations for the five variables $s_i$. The existence of the pentagon for an ellipse follows from a simple continuity argument. The reader will easily fill in the details of the proof.

NOTE. (Added in Proof). The full Schnirelmann Theorem, for analytic curves only, was proved in a different way by R. P. Jerrard, Inscribed squares in plane curves, Trans. Amer. Math. Soc. 98 (1961), 234-241 (Reference supplied by the Referee).

BIBLIOGRAPHY
1. C. Carathéodory, Die Kurven mit beschränkten Biegungen, Sitz. Ber. Preuss. Akad. Wiss. Berlin, Math. Phys. Kl. (1933), 102-125. (Gesammelte Werke, vol. II)
2. G. Salmon and W. Fiedler, Analytische Geometrie des Raumes, 2. Aufl., B. G. Teubner, Leipzig, 1874; sec. 95.
3. H. Whitney, Differentiable manifolds, Ann. of Math. (2) 37 (1936), 645-680; and remark by S. S. Chern, La géométrie des sous-variétés d'un espace euclidien à plusieurs dimensions, Ens. Math. 40 (1954), 26-46.
4. L. G. Šnirel'man, O nekotoryh geometričeskih svoistvah zamknutyh krivyh, Sbornik rabot matematičeskogo razdela sekcii estestvennyh i točnyh nauk Komakademii, Moskva 1929. Reproduced in Uspehi Matematičeskih Nauk 10 (1944), 34-44.

UNIVERSITY OF MINNESOTA, MINNEAPOLIS, MINNESOTA
ON SOME E X T R E M A L
PROBLEMS
IN GRAPH
THEORY
BY P. E R D O S
ABSTRACT
The author proves that if C is a sufficientlylarge constant then every graph of n verticesand [Cn3/2] edgescontains a hexagon X1, )(2, X3, )(4, )(5, X6 and a seventh vertex Y joined to X1, X3 and )is. The problem is left open whether our graph contains the edges of a cube, (i.e. an eight vertex Z joined to X2, )(4 and )(6). Throughout this paper G, G' will denote graphs, V(G) denotes the number of edges, n(G) the number of vertices of G. G(n; m) is a graph of n vertices and m edges. Vertices will be denoted by xt'"Yl"'" edges by (x, y). {Xl,'", xn} denotes a path whose edges are (x l, x2),'-',(xn-x,X~), the vertices Xx,'", x, are assumed distinct, n - 1 is the length of the path, similarly ( x D ' " , x,) is a circuit of length n whose edges are (Xl, x2), ..., (x~_ t, xn), (x n, Xl). v(x), the valency of x is the number of edges incident to x. G(xl, ...,x~) is the subgraph of G spanned by(xt, ..-,x~). In an even graph all circuits have even length. It is well known and easy to see that the vertices of an even graph can be divided into two classes A and B so that every edge joins a vertex of A to a vertex of B. C, c, e l ' " denote suitable positive absolute constants. Recently several papers appeared which discussed various extremal problems in graph theory [1]. Denote by f ( n ; k , l) the smallest integer for which every G(n;f(n;k;l)) contains a G(k,l). Two years ago Turfin asked me to determine or estimate the smallest integer m for which every G(n;m) contains the various graphs determined by the vertices and edges of the regular polyhedra. For the tetrahedron the problem was solved many years ago by Turfin himself [6], for the octahedron I proved several years ago that (n 2/4) + en 3/2 < m < (n2/4) + Cn 3/2, details of the proof have not been published [1] and in this note we do not discuss the octahedron. The question for the dodecahedron and icosahedron seems difficult. 
Received February 24, 1965.

It is well known that $f(n; 4, 4) > cn^{3/2}$, but for a sufficiently large $C$ every $G(n; [Cn^{3/2}])$ contains a rectangle [2]. One might conjecture that for a sufficiently large $C$ every $G(n; [Cn^{3/2}])$ contains a cube. In fact I proved that $f(n; 8, 12) < Cn^{3/2}$, and I even showed that every $G(n; [Cn^{3/2}])$ contains a $G(8; 12)$ having the vertices $x_1, x_2, x_3, x_4; y_1, y_2, y_3, y_4$ and the edges $(x_i, y_j)$ where $\min(i, j) \le 2$ [3]. But at present I can not prove that it must contain a cube. I can prove the much weaker result that it contains a $G(7, 9)$ consisting of a hexagon $(x_1, \ldots, x_6)$ and a vertex $y$ joined to $x_1$, $x_3$ and $x_5$. To prove the existence of a cube we would need an eighth vertex $z$ joined to $x_2$, $x_4$ and $x_6$, and I have not succeeded in showing this. More precisely I am going to prove the following

THEOREM. Let $n > n_0(k)$. Then every $G(n; 10[k^{1/2}n^{3/2}])$ contains a $G(2k + 1; 4k - 2)$ which has a path of length $2k$, $\{x_1, y_1, \ldots, y_k, x_{k+1}\}$, and the further edges $(x_1, y_i)$, $(y_1, x_j)$, $2 \le i \le k$, $3 \le j \le k + 1$.

Clearly our $G(2k + 1, 4k - 2)$ contains for every $2 \le l \le k$ a circuit of length $2l$ and another vertex joined to every second vertex of our circuit. It seems likely that for a sufficiently large $c_k$ every $G(n; [c_k n^{3/2}])$ contains
a $G\left(1 + k + \binom{k}{2}; k^2\right)$ defined as follows: The vertices are $x_0; y_1, \ldots, y_k; z_{i,j}$, $1 \le i < j \le k$; $x_0$ is joined to all the $y$'s and $z_{i,j}$ to $y_i$ and $y_j$. I can not prove this for $k > 3$.

To prove our Theorem we need two lemmas.

LEMMA 1. Every $G(n; m)$ has an even subgraph having at least $m/2$ edges.

We prove the Lemma by induction on $n$. It is clearly true for $n \le 2$. Assume that it is true for $n - 1$; we shall show it for $n$. Denote the vertices of $G(n; m)$ by $x_1, \ldots, x_n$. Since the lemma is true for $n - 1$, we can split the vertices $x_1, \ldots, x_{n-1}$ into two classes $A$ and $B$ so that the number of edges joining a vertex of $A$ to a vertex of $B$ is at least $\frac{1}{2}V(G(x_1, \ldots, x_{n-1}))$. Without loss of generality we can assume that the number of edges joining $x_n$ to the vertices of $B$ is at least $\frac{1}{2}v(x_n)$. But then the even graph spanned by the vertices $A \cup \{x_n\}$ and $B$ has at least $\frac{1}{2}\left(V(G(x_1, \ldots, x_{n-1})) + v(x_n)\right) \ge m/2$ edges, which proves the Lemma.

By a slightly more careful induction process we can prove that if the graph $G(n; m)$ has no vertices of valency 0 then it contains an even graph having at
least $\left[\frac{m}{2} + \frac{n}{4}\right]$ edges. The complete graph of $n$ vertices $G\left(n; \binom{n}{2}\right)$ shows that this result is in general best possible. It seems probable that if we know that our $G(n; m)$ contains no triangle, the lemma can be considerably strengthened, i.e. $m/2$ can perhaps be improved to $cm$ for some $c > 1/2$, but I did not succeed in doing this.

LEMMA 2. Every $G(n; m)$ contains a subgraph $G'$ every vertex of which has valency (in $G'$) greater than $[m/n]$.
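Both lemmas can be read as procedures. The following Python sketch is an illustration only; the function names are hypothetical and do not come from the paper. `even_subgraph` follows the induction of Lemma 1: vertices are placed one at a time, each on the side opposite to the majority of its already placed neighbours. `peel` assumes the usual argument for Lemma 2 (the paper only cites [4] for it), namely repeated deletion of vertices of valency at most $m/n$. Graphs are assumed simple, with vertices labelled $0, \ldots, n-1$.

```python
from collections import defaultdict


def even_subgraph(n, edges):
    """Greedy bipartition in the spirit of Lemma 1.

    Returns (side, kept): side maps each vertex to "A" or "B",
    kept lists the edges joining A to B (at least half of all edges).
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    side = {}
    for x in range(n):
        in_a = sum(1 for y in adj[x] if side.get(y) == "A")
        in_b = sum(1 for y in adj[x] if side.get(y) == "B")
        # Put x opposite to the class holding most of its placed neighbours.
        side[x] = "A" if in_b >= in_a else "B"
    kept = [(u, v) for u, v in edges if side[u] != side[v]]
    return side, kept


def peel(edges, threshold):
    """Delete vertices of valency <= threshold until none remain.

    Assumed argument for Lemma 2 (not spelled out in the paper).
    Returns (alive, kept): surviving vertices and surviving edges.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    alive = set(adj)
    stack = [v for v in alive if len(adj[v]) <= threshold]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue  # already deleted via an earlier stack entry
        alive.discard(v)
        for w in adj[v]:
            adj[w].discard(v)
            if w in alive and len(adj[w]) <= threshold:
                stack.append(w)
        adj[v].clear()
    kept = [(u, v) for u, v in edges if u in alive and v in alive]
    return alive, kept


# Example: K_6 with a pendant path 0-6-7 (8 vertices, 17 edges).
n = 8
edges = [(i, j) for i in range(6) for j in range(i + 1, 6)] + [(0, 6), (6, 7)]
side, even_edges = even_subgraph(n, edges)
alive, dense_edges = peel(even_edges, len(even_edges) / n)
```

Applying `peel` to the output of `even_subgraph` mirrors the first step of the proof of the Theorem, where the two lemmas are used one after the other.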
The Lemma is known [4]. The proof is very simple.

Now we can prove our Theorem. By Lemmas 1 and 2 our $G(n; 10[k^{1/2}n^{3/2}])$ contains an even subgraph $G'$ every vertex of which has valency greater than $5k^{1/2}n^{1/2}$. Let $x_1, \ldots, x_u; y_1, \ldots, y_v$, $u + v \le n$, be the vertices of $G'$. Let $y_1, \ldots, y_t$, $t > 5k^{1/2}n^{1/2}$, be the vertices joined to $x_1$ and let $x_2, \ldots, x_{u'}$, $u' \le u$, be the other $x$'s joined to a $y_i$, $1 \le i \le t$. $G''$ is the subgraph of $G'$ spanned by $y_1, \ldots, y_t, x_2, \ldots, x_{u'}$. Clearly each $y$ in $G''$ has valency $> 5k^{1/2}n^{1/2} - 1 > 4k^{1/2}n^{1/2}$, since each $y_i$ has valency (in $G'$) greater than $5k^{1/2}n^{1/2}$. Thus

$$V(G'') > 4tk^{1/2}n^{1/2}. \tag{1}$$

Denote by $x_2, \ldots, x_{u''}$ the $x_i$ with

$$v(x_i) > 2tk^{1/2}/n^{1/2}. \tag{2}$$

Let $G_1''$ be the subgraph of $G''$ spanned by $x_2, \ldots, x_{u''}; y_1, \ldots, y_t$. By (1), (2) and $u'' < n$ we have

$$V(G_1'') > V(G'') - 2tk^{1/2}n^{1/2} > 2tk^{1/2}n^{1/2}. \tag{3}$$

By (3) one of the $y$'s has valency $> 2k^{1/2}n^{1/2}$ (in $G_1''$). Let this vertex be $y_1$ and let $x_2, \ldots, x_{l+1}$, $l > 2k^{1/2}n^{1/2}$, be the vertices joined to $y_1$. Consider finally the graph $G''(x_2, \ldots, x_{l+1}, y_2, \ldots, y_t)$; each $x_i$ has by (2) valency greater than $2tk^{1/2}/n^{1/2} - 1 > tk^{1/2}/n^{1/2}$ (since $t > 4k^{1/2}n^{1/2}$). Thus by a simple computation

$$V(G''(x_2, \ldots, x_{l+1}, y_2, \ldots, y_t)) > \frac{tlk^{1/2}}{n^{1/2}} > k\,\pi(G''(x_2, \ldots, x_{l+1}, y_2, \ldots, y_t)) \tag{4}$$

since by $t > 4k^{1/2}n^{1/2}$, $l > 2k^{1/2}n^{1/2}$,

$$\frac{tl}{t + l} > \frac{8kn}{6k^{1/2}n^{1/2}} > k^{1/2}n^{1/2}$$

and $\pi(G''(x_2, \ldots, x_{l+1}, y_2, \ldots, y_t)) = l + t - 1$. From (4) we obtain by a theorem of Gallai and myself [5] that $G''(x_2, \ldots, x_{l+1}, y_2, \ldots, y_t)$ has a path of length $2k - 2$, $\{x_2, y_2, \ldots, y_k, x_{k+1}\}$. By our construction $x_1$ is joined to every $y$ of our path and $y_1$ to every $x$ of it. Thus finally $G'(x_1, \ldots, x_{k+1}, y_1, \ldots, y_k)$ satisfies the requirements of our Theorem.

The constant 10 could clearly be reduced, but I made no attempt at doing so since I am not sure if the factor $k^{1/2}$ is of the right order of magnitude.
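The graph of the Theorem and the conjectured graph are both given explicitly, so their vertex and edge counts can be checked mechanically. The Python sketch below is an illustration with hypothetical function names, not part of the paper. It builds both graphs and, for each $2 \le l \le k$, checks one particular circuit of length $2l$, namely $y_1, x_2, y_2, \ldots, x_l, y_l, x_{l+1}$, with $x_1$ joined to every second vertex. The choice of this circuit is an assumption about what "Clearly" refers to; the paper does not name it.

```python
from itertools import combinations


def theorem_graph(k):
    """Edge set of the G(2k+1; 4k-2) described in the Theorem."""
    edges = set()
    # Path x_1, y_1, x_2, y_2, ..., x_k, y_k, x_{k+1} of length 2k.
    for i in range(1, k + 1):
        edges.add(frozenset((("x", i), ("y", i))))
        edges.add(frozenset((("y", i), ("x", i + 1))))
    # Further edges (x_1, y_i), 2 <= i <= k.
    for i in range(2, k + 1):
        edges.add(frozenset((("x", 1), ("y", i))))
    # Further edges (y_1, x_j), 3 <= j <= k + 1.
    for j in range(3, k + 2):
        edges.add(frozenset((("y", 1), ("x", j))))
    return edges


def has_circuit_with_hub(edges, l):
    """Check the circuit y_1, x_2, y_2, ..., x_l, y_l, x_{l+1} of length 2l
    and that x_1 is joined to y_1, ..., y_l (every second vertex)."""
    cycle = [("y", 1)]
    for i in range(2, l + 1):
        cycle += [("x", i), ("y", i)]
    cycle.append(("x", l + 1))
    closed = all(
        frozenset((cycle[i], cycle[(i + 1) % len(cycle)])) in edges
        for i in range(len(cycle))
    )
    hub = all(frozenset((("x", 1), ("y", i))) in edges for i in range(1, l + 1))
    return len(cycle) == 2 * l and closed and hub


def conjectured_graph(k):
    """Edge set of the conjectured G(1 + k + C(k,2); k^2):
    x_0 joined to all y's, z_{i,j} joined to y_i and y_j."""
    edges = {frozenset((("x", 0), ("y", i))) for i in range(1, k + 1)}
    for i, j in combinations(range(1, k + 1), 2):
        edges.add(frozenset((("z", i, j), ("y", i))))
        edges.add(frozenset((("z", i, j), ("y", j))))
    return edges
```

For $k = 3$ the call `has_circuit_with_hub(theorem_graph(3), 3)` checks the hexagon with a seventh vertex joined to three alternate vertices, and `conjectured_graph(3)` is that same $G(7, 9)$ written with a centre $x_0$.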
REFERENCES
1. P. Erdős, Extremal problems in graph theory, Proc. Symposium on Graph theory, Smolenice, Acad. C.S.S.R. (1963), 29-36.
2. P. Erdős, On sequences of integers no one of which divides the product of two others and on some related problems, Izv. Inst. Math. i Mech. Tomsk, 2 (1938), 79-82.
3. P. Erdős, On an extremal problem in graph theory, Coll. Math. 13 (1965), 251-254.
4. P. Erdős, On the structure of linear graphs, Israel J. Math. 1 (1963), 156-160, see Lemma 1, p. 157-158.
5. P. Erdős and P. Gallai, On the maximal paths and circuits of graphs, Acta Math. Acad. Sci. Hung. 10 (1959), 337-357.
6. P. Turán, On the theory of graphs, Coll. Math. 3 (1955), 19-30.

TECHNION-ISRAEL INSTITUTE OF TECHNOLOGY, HAIFA