TWO TOPOLOGICAL PROPERTIES OF TOPOLOGICAL LINEAR SPACES* BY
CZESLAW BESSAGA AND VICTOR KLEE ABSTRACT
A topological and a geometrical-topological property, previously known only for normed linear spaces, are established here for much more general classes of topological linear spaces.

Introduction. Throughout the present paper, $E$ and $E'$ will denote topological linear spaces (real scalars, separation axiom assumed). A convex body in $E$ or $E'$ is a convex set which has nonempty interior. Our two main results are as follows.

THEOREM A. If $E$ is infinite-dimensional and admits a countable family of open (or closed) convex bodies whose intersection consists of a single point, then for each point $p$ of $E$ the spaces $E$ and $E \sim \{p\}$ are homeomorphic.†

THEOREM B. Every closed convex body in $E$ is homeomorphic with a closed halfspace or with the product of an $n$-cell by a closed linear subspace of finite deficiency $n$ in $E$.††

These results were first established in [2] for Hilbert space, and were extended in [3] and [1] to arbitrary normed linear spaces. Note that infinite-dimensionality is required for A but not for B. The topological property expressed in A has a number of interesting consequences; in particular, it implies that $E$ admits a fixed-point-free homeomorphism of period two. Property B is useful in connection with the topological classification of closed convex bodies.

Our methods are analogues or refinements of those employed previously, and as before the notions of gauge functional and characteristic cone will play an important role. When $y$ is an interior point of a convex body $U$ in $E$, the gauge functional of $U$ with respect to $y$ is the real-valued function $\mu_{Uy}$ defined as follows for all $x \in E$:

$$\mu_{Uy}(x) = \inf\{\lambda > 0 : \tfrac{1}{\lambda}(x - y) \in U\}.$$

Received December 6, 1964.
* This research was conducted at the University of Washington in 1963 when the first author was visiting there. The work of both authors was supported in part by the National Science Foundation, U. S. A. (NSF-GP-378).
†, †† See the footnotes on the last page.
212
CZESLAW BESSAGA AND VICTOR KLEE
[December
When $y$ is the origin 0, we speak simply of the gauge functional of $U$ and write $\mu_U$ rather than $\mu_{U0}$. In several instances below, we shall define a transformation geometrically and leave to the reader the routine but occasionally tedious verification that the transformation is actually a homeomorphism. This can often be accomplished by expressing the transformation in terms of the appropriate gauge functionals and then making use of the well-known continuity of the function $\mu_{Uy}(x) \mid (x, y) \in E \times \operatorname{int} U$. For $y \in \operatorname{int} U$, we define

$$cc_y U = \{x \in U : \mu_{Uy}(x) = 0\}, \qquad ccU = cc_y U - y, \qquad csU = cc(U \cap (y - U)).$$

Thus the sets $y + ccU$ and $y + csU$ are the unions with $\{y\}$ of, respectively, all rays and all lines which issue from $y$ and lie in $U$. The sets $ccU$ and $csU$, being independent of the choice of $y \in \operatorname{int} U$, are called respectively the characteristic cone and the characteristic subspace of $U$. Note that the convex cone $ccU$ is a linear subspace if and only if $ccU = csU$.

NOTATION. The interior, boundary, closure and convex hull of a set $X$ are denoted by $\operatorname{int} X$, $\partial X$, $\operatorname{cl} X$ and $\operatorname{con} X$ respectively. The fact that $X$ and $Y$ are homeomorphic is indicated by $X \approx Y$. Set-theoretic addition and subtraction are indicated by $\cup$ and $\sim$ respectively, while $+$ and $-$ are reserved for algebraic operations. The real number field and the set of all positive integers are denoted by $\mathfrak{R}$ and $\mathfrak{N}$ respectively. Equality by definition is indicated by $\mathrel{.=}$ or $\mathrel{=.}$. When $x$ and $y$ are distinct points of a linear space, the open segment connecting them is denoted by $]x, y[$, the half-open segments by $]x, y]$ and $[x, y[$, and the closed segment by $[x, y]$. The open and closed rays which issue from $x$ and pass through $y$ are denoted by $]x, y($ and $[x, y($ respectively.
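The gauge functional with respect to the origin can be approximated numerically, which may help in checking the definitions above on small examples. The following is only a sketch under stated assumptions: the ambient space is taken to be the plane, the convex body is given by a membership test, and the names `gauge`, `in_l1_ball` and `in_halfplane` are hypothetical helpers that do not come from the paper.

```python
def gauge(contains, x, tol=1e-9):
    """Approximate inf{lam > 0 : x / lam in U} by bisection.

    Assumes U is convex with the origin in its interior, so the set of
    admissible lam is an unbounded interval and bisection applies.
    """
    hi = 1.0
    # Grow hi until x / hi lies in U (possible because 0 is interior to U).
    while not contains(tuple(c / hi for c in x)):
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if contains(tuple(c / mid for c in x)):
            hi = mid
        else:
            lo = mid
    return hi


def in_l1_ball(p):
    # Linearly bounded, centrally symmetric body: the gauge is the l1 norm.
    return abs(p[0]) + abs(p[1]) <= 1.0


def in_halfplane(p):
    # Closed halfplane {p : p[0] <= 1}; its characteristic cone is not a subspace.
    return p[0] <= 1.0


print(gauge(in_l1_ball, (0.5, -1.5)))     # close to the l1 norm of the point
print(gauge(in_halfplane, (3.0, 0.0)))    # close to 3
print(gauge(in_halfplane, (-5.0, 0.0)))   # close to 0: the ray from 0 through the point stays in U
```

In the halfplane example the points with gauge (approximately) zero are exactly those on rays from the origin that stay inside the body, which is the description of the characteristic cone given above.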
1. Three Propositions. The three propositions of this section are used later in proving Theorems A and B.

1.1. PROPOSITION. If $(L, \|\ \|)$ is an infinite-dimensional normed linear space, then the linear space $L$ admits norms $|\ |$ and $|||\ |||$ such that $|\ | \le \|\ \| \le |||\ |||$, so
1964]
PROPERTIES OF TOPOLOGICAL LINEAR SPACES
213
continuous. If the space $(L, |||\ |||)$ is complete, then with $(L, \|\ \|)$ also complete, it follows from the open mapping theorem that $\tau$ is a homeomorphism. But that is impossible, for the point 0 is in the $\|\ \|$-closure of $B$ but not in the $|||\ |||$-closure of $B$.

Actually, it is the existence of $|\ |$ rather than $|||\ |||$ which is used in the sequel. To obtain $|\ |$, let $M$ be a separable infinite-dimensional linear subspace of $(L, \|\ \|)$, and use a construction in [3] to produce an unbounded closed convex body $V$ in $M$ such that $V$ is linearly bounded and $V = -V$. Let $U$ denote the unit ball of the space $(L, \|\ \|)$ and let $C = \operatorname{con}(U \cup V)$. Then of course $C$ is a convex body in $(L, \|\ \|)$ and $C = -C$. Now suppose that $C$ contains a line $J$ through 0, and consider an arbitrary point $x$ of $J$. For each $n \in \mathfrak{N}$ there exist $u_n \in U$, $v_n \in V$ and $\lambda_n \in [0, 1]$ such that $nx = \lambda_n u_n + (1 - \lambda_n) v_n$. Since the sequence $\{(\lambda_n/n) u_n\}_{n \in \mathfrak{N}}$ is convergent to 0 and since always

$$\frac{1 - \lambda_n}{n}\, v_n \in V,$$

it follows that $x \in V$. This implies that $J \subset V$, an impossibility since $V$ is linearly bounded. We conclude that the set $C$ is linearly bounded, whence the gauge functional $|\ |$ of $C$ is a norm for $L$. Clearly $|\ | \le \|\ \|$. If the spaces $(L, \|\ \|)$ and $(L, |\ |)$ were both complete, the open mapping theorem would lead to a contradiction as in the preceding paragraph, for the point 0 is in the $|\ |$-closure of the set $\{y \in M : \|y\| = 1\}$ but not in its $\|\ \|$-closure. The proof is complete.

A strip is the set of all points lying on or between two parallel hyperplanes. Proof of the following is left to the reader.

1.2. LEMMA. Suppose that $H$ and $H'$ are closed hyperplanes, $S$ and $S'$ are closed strips, and $Q$ and $Q'$ are closed halfspaces such that
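$$\partial Q = H \subset \partial S \subset S \subset Q \subset E \quad\text{and}\quad \partial Q' = H' \subset \partial S' \subset S' \subset Q' \subset E'.$$

Then every homeomorphism of $H$ onto $H'$ can be extended to a homeomorphism of $E$ onto $E'$ which carries $Q$ onto $Q'$ and $S$ onto $S'$. Further, $H$ and $H'$ are homeomorphic if $E = E'$.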
1.3. LEMMA. Suppose that $U$, $V$ and $P$ are closed convex bodies in $E$, $Q$ is a closed halfspace, $y$ is a point of $E \sim \{0\}$, and the following four conditions are all satisfied:
(i) $P \subset \operatorname{int} Q$ and $U \subset \operatorname{int} V$;
(ii) $0 \in \partial Q \cap \partial V$;
(iii) $[1, \infty[\,y \subset \operatorname{int}(P \cap U)$;
(iv) $V$ does not contain any ray which issues from $y$ and passes through a point of $\partial Q$.
Then there is a homeomorphism of $E$ onto $E$ which carries $U$ onto $P$ and $V$ onto $Q$.

Proof. Let $W \mathrel{.=} (U - y) \cap (y - U) \cap \partial Q$, a closed convex body relative to $\partial Q$. Then $W = -W$ and $ccW = csW = csU$, where the latter equality depends on (iv) and the fact that $U \subset V$. For each point $q \in \partial Q$, let $c(q) \mathrel{.=} (1 + \mu_W(q))\,y \in \operatorname{int}(P \cap U)$ and let the (unique) points at which the ray $]c(q), q($ intersects the sets $\partial U$, $\partial V$ and $\partial P$ be denoted by $u(q)$, $v(q)$ and $p(q)$ respectively. The existence of these points follows from conditions (iv), (iii) and (i), and from (i) it follows that $p(q) \in\ ]c(q), q]$ and $u(q) \in\ ]c(q), v(q)[$. Let $t(q) \mathrel{.=} \tfrac12 c(q) + \tfrac12 u(q)$. Since the set $E \sim (csU + [1, \infty[\,y)$ is simply covered by the family of open rays $\{\,]c(q), q( \ : q \in \partial Q\}$, we can define a biunique transformation $\xi$ of $E$ onto $E$ by specifying that $\xi$ is the identity on $csU + [1, \infty[\,y$ and that for each $q \in \partial Q$, $\xi$ is the identity on the segment $[c(q), t(q)]$, carries the segments $[t(q), u(q)]$ and $[u(q), v(q)]$ affinely onto the segments $[t(q), p(q)]$ and $[p(q), q]$ respectively, and translates the ray $c(q) + [1, \infty[\,(v(q) - c(q))$ onto the ray $c(q) + [1, \infty[\,(q - c(q))$. With the aid of the continuity properties of the relevant gauge functionals, it is tedious but not difficult to verify that $\xi$ is the desired homeomorphism of $E$ onto $E$. This completes the proof of 1.3.

When $U$ and $V$ are subsets of $E$, we shall write $U \subset\subset V$ to indicate the existence of a neighborhood $G$ of the origin such that $U + G \subset V$. A closed convex body will be said to be of type Q provided its characteristic cone is not a linear subspace.

1.4. LEMMA. Suppose that $U$ and $V$ are closed convex bodies of type Q in $E$, with $U \subset\subset V$. Then there is a homeomorphism of $E$ onto $E$ which carries $U$ and $V$ onto a pair of parallel halfspaces.

Proof. Clearly $ccU \subset ccV$ and $csU \subset csV$. Suppose first that the set $ccU \sim csV$ is nonempty, and choose $y \in \operatorname{int} U$. Then there is a point $x \in \partial V$ such that $\operatorname{int} U$ contains the ray $x + [1, \infty[\,(y - x)$. We assume without loss of generality that $x = 0$. Let $Q$ be a closed halfspace such that $0 \in \partial Q$ and $V \subset Q$, and let $P$ be the translate of $Q$ such that $\tfrac12 y \in \partial P$. Then the conditions of 1.3 are satisfied and the desired conclusion follows. (Here the full strength of the condition $U \subset\subset V$ was not required; it was sufficient to have $U \subset \operatorname{int} V$.)

In the remaining case, $ccU \subset csV$ and 1.3 does not apply directly. However, we can apply 1.3 in two stages with the aid of a closed convex body $J$ (to be constructed) such that $U \subset \operatorname{int} J$, $J \subset \operatorname{int} V$, and

$$ccU \sim csJ \ne \emptyset \ne ccJ \sim csV.$$
By the preceding paragraph, there are homeomorphisms $\xi$ and $\eta$ of $E$ onto $E$ such that $(\xi U, \xi J)$ and $(\eta J, \eta V)$ are pairs of parallel halfspaces. Let $K$ be a closed halfspace which is contained in the interior of $\eta J$ and hence is parallel to $\eta J$. By 1.2, the homeomorphism $\eta\xi^{-1}$ of $\partial \xi J$ onto $\partial \eta J$ can be extended to a homeomorphism $\zeta$ of $E$ onto $E$ which carries the strip $\operatorname{cl}(\xi J \sim \xi U)$ onto the strip $\operatorname{cl}(\eta J \sim K)$. For each point $w \in E$, let

$$\tau(w) \mathrel{.=} \begin{cases} \eta(w) & \text{if } w \in E \sim \operatorname{int} J, \\ \zeta(\xi(w)) & \text{if } w \in J. \end{cases}$$

Then $\tau$ is a homeomorphism of $E$ onto $E$ which carries $U$ onto $K$ and $V$ onto $\eta V$. Thus it remains only to construct the intermediate body $J$.

In constructing $J$, we assume without loss of generality that $0 \in \operatorname{int} U$. Since $U$ and $V$ are both of type Q, there are points $u, v \in E \sim \{0\}$ such that

(1) $[0, u( \ \subset U \not\supset [0, -u($ and $[0, v( \ \subset V \not\supset [0, -v($.

On the other hand, the fact that $ccU \subset csV$ implies that

(2) $[0, -u( \ \subset V$ and $[0, v( \ \not\subset U$.

For each $\lambda > 0$, let $L_\lambda$ denote the line $-\lambda u + \mathfrak{R} v$. Since $U$ is closed and convex it is easy to derive from (1) and (2) the existence of $\lambda > 0$ such that $L_\lambda \cap U = \emptyset$. But then $L_{2\lambda} \cap 2U = \emptyset$, and by a standard separation theorem there is a closed halfspace $Q$ in $E$ such that $L_{2\lambda} \subset \partial Q$ and $2U \subset Q$; the latter condition implies that $U \subset\subset Q$. Let $J \mathrel{.=} \tfrac12 U + \tfrac12 (V \cap Q)$. Then $U \subset\subset J$ because $U \subset\subset V \cap Q$, while $J \subset\subset V$ because $U \subset\subset V$ and $V \cap Q \subset V$. From the relevant definitions in conjunction with (1) and (2), it follows that $u \in ccU \sim csJ$ and $v \in ccJ \sim csV$. Thus the proof of 1.4 is complete.

1.5. PROPOSITION. If $V$ and $V'$ are closed convex bodies in $E$ which have the same characteristic cone or are both of type Q, then there is a homeomorphism of $E$ onto $E$ which carries $V$ onto $V'$ and $\partial V$ onto $\partial V'$.

Proof. Suppose first that $ccV = ccV'$, and let $\tau$ and $\tau'$ be translations of $E$ such that $0 \in W \mathrel{.=} \operatorname{int} \tau V \cap \operatorname{int} \tau' V'$. Of course, $cc\,\tau V = cc\,\tau' V'$. For each point $v \in \partial(\tau V)$, let the points $w_v$ and $v'$ be defined by the conditions that

$$w_v \in \partial W \cap [0, v( \quad\text{and}\quad v' \in \partial(\tau' V') \cap [0, v(.$$
Let $\eta$ be the identity transformation on $W$, and for each $v \in \partial(\tau V)$ let $\eta$ map the segment $[w_v, v]$ affinely onto the segment $[w_v, v']$ and translate the ray $[1, \infty[\,v$ onto the ray $[1, \infty[\,v'$. Then $\eta$ is a homeomorphism of $E$ onto $E$ which carries $\tau V$ onto $\tau' V'$, and the transformation $\tau'^{-1}\eta\tau$ has the properties desired in 1.5.

In view of 1.2, it suffices for the other case in 1.5 to show that an arbitrary closed convex body $V$ of type Q in $E$ can be carried onto a closed halfspace by means of a homeomorphism of $E$ onto $E$. But this is a special case of 1.4, for if $0 \in \operatorname{int} V$ then the set $U \mathrel{.=} \tfrac12 V$ is a convex body of type Q with $U \subset\subset V$.

A finite or infinite sequence $V_1, V_2, \ldots$ of closed convex bodies in $E$ will be called nested provided $V_1 \ne E$ and one of the following two conditions is satisfied:
(*) the convex bodies $V_i$ all have the same characteristic cone; $V_{i+1} \subset \operatorname{int} V_i$ ($i = 1, 2, \ldots$);
(**) the convex bodies $V_i$ are all of type Q; $V_{i+1} \subset\subset V_i$ ($i = 1, 2, \ldots$).

1.6. PROPOSITION. Suppose that $E$ and $E'$ are topological linear spaces, that $V_1, V_2, \ldots$ is a nested sequence of convex bodies in $E$, that $V_1', V_2', \ldots$ is a nested sequence of convex bodies in $E'$, and that the two sequences are of the same length. Then every homeomorphism of $\partial V_1$ onto $\partial V_1'$ can be extended to a homeomorphism of $E \sim \bigcap_i \operatorname{int} V_i$ onto $E' \sim \bigcap_i \operatorname{int} V_i'$ which (for each $i$) carries $\partial V_i$ onto $\partial V_i'$.

(Presumably, the proposition remains valid when "$V_{i+1} \subset\subset V_i$" is replaced by "$V_{i+1} \subset \operatorname{int} V_i$" in condition (**). However, the replacement seems to entail some technical complications and the present form of the proposition is adequate for our needs.)

Proof. It suffices to consider the case in which the sequences involve only $V_1, V_2$ and $V_1', V_2'$ respectively. For if this is known, then by its use the given homeomorphism of $\partial V_1$ onto $\partial V_1'$ can be extended to a homeomorphism $\eta_1$ of $V_1 \sim \operatorname{int} V_2$ onto $V_1' \sim \operatorname{int} V_2'$ such that $\eta_1(\partial V_2) = \partial V_2'$. The restriction of $\eta_1$ to $\partial V_2$ can then be extended to a homeomorphism $\eta_2$ of $V_2 \sim \operatorname{int} V_3$ onto $V_2' \sim \operatorname{int} V_3'$ such that $\eta_2(\partial V_3) = \partial V_3'$. Then the restriction of $\eta_2$ to $\partial V_3$ can be extended.... Proceeding in this way to obtain a sequence $\eta_1, \eta_2, \ldots$ of homeomorphisms, we find that $\bigcup_i \eta_i$ is the homeomorphism required in the statement of 1.6.

Now if $ccV_1 = ccV_2$, $ccV_1' = ccV_2'$, and $\xi$ is a homeomorphism of $\partial V_1$ onto $\partial V_1'$, a straightforward application of gauge functionals extends $\xi$ to a homeomorphism of $E \sim \operatorname{int} V_2$ onto $E' \sim \operatorname{int} V_2'$. (See 1.1 of [1].) The other cases can be reduced to this one, for if $V_1$ and $V_2$ are both of type Q then 1.4 guarantees the existence of a homeomorphism of $E$ onto $E$ which carries $V_1$ and $V_2$ onto a nested pair of closed halfspaces, and of course these halfspaces have the same characteristic cone.

2. Proof of Theorem A. The following result provides some alternative characterizations of the spaces for which Theorem A will be established.
2.1. PROPOSITION. If $E$ is a topological linear space and $\aleph$ is an infinite cardinal number, then the following four assertions are equivalent:
(i) $E$ contains $\aleph$ closed convex bodies whose intersection is $\{0\}$;
(ii) $E$ contains $\aleph$ closed convex bodies whose characteristic cones have intersection $\{0\}$;
(iii) $E$ contains $\aleph$ open convex bodies whose intersection is $\{0\}$;
(iv) $E$ contains $\aleph$ open convex bodies whose characteristic cones have intersection $\{0\}$.
In each case, the convex bodies may be chosen so that the origin 0 is interior to all of them and so that all are of type Q or are linearly bounded and centrally symmetric (about 0). In the latter circumstance, we may take $\aleph = \aleph_0$. When $\aleph = \aleph_0$, we may require that the $\aleph_0$ convex bodies are arranged in a sequence $C_1, C_2, \ldots$ such that $C_{n+1} + C_{n+1} \subset C_n$ for all $n \in \mathfrak{N}$.

Proof. (i) $\Rightarrow$ (iii). Suppose that $\mathfrak{B}$ is a family of $\aleph$ closed convex bodies such that $\bigcap \mathfrak{B} = \{0\}$. For each $B \in \mathfrak{B}$, choose a point $p_B \in \operatorname{int} B$ and then for each $n \in \mathfrak{N}$ let $B_n \mathrel{.=} \operatorname{int} B - (1/n)p_B$. Then each $B_n$ is an open convex body and $0 \in \bigcap_{n \in \mathfrak{N}} B_n \subset \bigcap_{n \in \mathfrak{N}} \operatorname{cl} B_n \subset B$. The family $\{B_n : B \in \mathfrak{B}, n \in \mathfrak{N}\}$ has intersection $\{0\}$, and with $\aleph \ge \aleph_0$ it has the same cardinality as $\mathfrak{B}$.

(iii) $\Rightarrow$ (iv). Note that if $0 \in B$ and $B$ is open, then $ccB \subset B$.

(iv) $\Rightarrow$ (ii). Suppose that $\mathfrak{B}$ is a family of $\aleph$ open convex bodies such that $\bigcap\{ccB : B \in \mathfrak{B}\} = \{0\}$. For each $B \in \mathfrak{B}$, choose a point $p_B \in B$ and let $B' \mathrel{.=} \tfrac12 \operatorname{cl}(B - p_B)$. Then $B'$ is a closed convex body and $ccB' = ccB$, so $\bigcap\{ccB' : B \in \mathfrak{B}\} = \{0\}$.

(ii) $\Rightarrow$ (i). Suppose that $\mathfrak{B}$ is a family of $\aleph$ closed convex bodies such that $\bigcap\{ccB : B \in \mathfrak{B}\} = \{0\}$. For each $B \in \mathfrak{B}$, choose a point $p_B \in \operatorname{int} B$ and then for each $n \in \mathfrak{N}$ let $B_n \mathrel{.=} (1/n)(B - p_B)$. Then each $B_n$ is a closed convex body and $\bigcap_{n \in \mathfrak{N}} B_n = ccB$, so the desired conclusion follows.

In the above discussion, the point 0 is always interior to the sets $B_n$ and $B'$; further, the sets $B_n$ and $B'$ are linearly bounded or of type Q if and only if the same is true of the set $B$. Thus in restricting the type of the convex bodies in question, it suffices to consider condition (iii). Suppose, then, that $\mathfrak{B}$ is a family of open convex bodies in $E$ whose intersection is $\{0\}$. If some member $B$ of $\mathfrak{B}$ is linearly bounded, then the same is true of the centrally symmetric sets $(1/n)(B \cap -B)$, and of course $\bigcap_{n \in \mathfrak{N}} (1/n)(B \cap -B) = \{0\}$. Suppose, on the other hand, that no member of $\mathfrak{B}$ is linearly bounded. Then for each $B \in \mathfrak{B}$ there exists $q_B \in E \sim \{0\}$ such that $[0, \infty[\,q_B \subset B$. Since $\bigcap \mathfrak{B} = \{0\}$, there exists $C_B \in \mathfrak{B}$ such that $-q_B \notin C_B$, and then by the separation theorem there is an open halfspace $Q_B \supset C_B$ such that $-q_B \notin Q_B$. But then $[0, \infty[\,q_B \subset Q_B$, and with $B'' \mathrel{.=} B \cap Q_B$ we have $ccB'' \ne csB''$. The family $\{B'' : B \in \mathfrak{B}\}$ has intersection $\{0\}$ and its members are all open convex bodies of type Q.

For the last assertion of 2.1, it suffices to observe that if $G_1, G_2, \ldots$ is a sequence of convex sets whose intersection is $\{0\}$, then the same is true of the sequence $C_1, C_2, \ldots$ where $C_j \mathrel{.=} 2^{-j} \bigcap_{i=1}^{j} G_i$. Further, $C_{n+1} + C_{n+1} \subset C_n$ for all $n \in \mathfrak{N}$.

2.2. THEOREM A. Suppose that $E$ is an infinite-dimensional topological linear space which admits a countable family of open (or closed) convex bodies whose intersection consists of a single point. Then for each point $p$ of $E$, the spaces $E$ and $E \sim \{p\}$ are homeomorphic.
Proof. By 2.1, there exists a sequence $C_1, C_2, \ldots$ of closed convex bodies in $E$ such that the following four conditions are all satisfied:
(1) $\bigcap_{n \in \mathfrak{N}} C_n = \{0\}$;
(2) $0 \in \operatorname{int} C_n$ for all $n \in \mathfrak{N}$;
(3) $C_{n+1} + C_{n+1} \subset C_n$ for all $n \in \mathfrak{N}$;
(4) each set $C_n$ is of type Q, or each set $C_n$ is linearly bounded and centrally symmetric about 0.

Suppose first that each set $C_n$ is of type Q, and let $u$ be a point of $\operatorname{int} C_1$ such that $[0, \infty[\,u \subset C_1$ but $-u \notin C_1$. Let $C_1' \mathrel{.=} C_1$, and for $n > 1$ let $C_n' \mathrel{.=} (1/n)C_1 + nu$. Then each set $C_n'$ is of type Q, $\bigcap_{n \in \mathfrak{N}} C_n' = \emptyset$, and the sequence $C_1', C_2', \ldots$ is nested. The sequence $C_1, C_2, \ldots$ is nested by conditions (2) and (3). Let $\xi$ be the identity mapping on $E \sim \operatorname{int} C_1 = E \sim \operatorname{int} C_1'$. Since $\bigcap_{n \in \mathfrak{N}} C_n = \{0\}$ while $\bigcap_{n \in \mathfrak{N}} C_n' = \emptyset$, it follows from 1.6 that $\xi$ can be extended to a homeomorphism of $E \sim \{0\}$ onto $E$.

In the remaining case, the set $C_1$ is centrally symmetric and linearly bounded. Let $\|\ \|$ denote the gauge functional of $C_1$, and use 1.1 to obtain a norm $|\ |$ for $E$ such that $|\ | \le \|\ \|$ and the space $(E, |\ |)$ is incomplete. Let $\hat{E}$ denote the completion of $E$ with respect to the norm $|\ |$ and let $p \in \hat{E} \sim E$ with $|p| < \tfrac12$. For $n > 1$, let $C_n' \mathrel{.=} \{x \in E : |x - p| \le 1/n\}$ and let $C_1' \mathrel{.=} C_1$. Then $C_1, C_2, \ldots$ is a nested sequence of closed convex bodies whose intersection is $\{0\}$ and $C_1', C_2', \ldots$ is a nested sequence of closed convex bodies whose intersection is $\emptyset$, so the use of 1.6 leads to a homeomorphism of $E \sim \{0\}$ onto $E$.

2.3. COROLLARY. An infinite-dimensional topological linear space $E$ has property (A) if it satisfies any of the following conditions:
(i) $E$ contains a linearly bounded convex body;
(ii) $E$ admits a countable separating family of continuous linear forms;
(iii) $E$ is separable, metrizable and locally convex.

Note that property (A) is not possessed by every infinite-dimensional locally convex topological linear space. Indeed, let $U$ be an uncountable set and let $E$ be the space of all bounded real functions on $U$, in the topology of pointwise convergence. (That is, $E$ is a subspace of $\mathfrak{R}^U$.) Then $E$ is $\sigma$-compact, but (with $p \in E$) $\{p\}$ is not a $G_\delta$ set in $E$ and consequently $E \sim \{p\}$ is not $\sigma$-compact.

3. Proof of Theorem B.
3.1. THEOREM B. Suppose that $U$ is a closed convex body in a topological linear space $E$. If the characteristic cone $ccU$ is not a linear subspace or is a linear subspace of infinite deficiency, then $U$ is homeomorphic with a halfspace in $E$. If $ccU$ is a linear subspace of finite deficiency $n$, then $U$ is homeomorphic with the product $ccU \times [0,1]^n$.

Proof. The case in which $ccU$ is not a linear subspace is handled immediately by 1.5. If $ccU$ is a linear subspace of finite deficiency $n$, then $E$ is topologically and algebraically the direct sum of $ccU$ and an $n$-dimensional linear subspace $L$ of $E$. It is clear that $U$ is homeomorphic with the product $ccU \times (U \cap L)$ and that $U \cap L$ is homeomorphic with $[0,1]^n$. There remains only the case in which $ccU$ is a linear subspace of infinite deficiency. With $0 \in ccU = csU$, we have

$$cc(U \cap -U) = cs(U \cap -U) = csU.$$
In view of 1.5, we may assume (in treating the remaining case) that the closed convex body $U$ is centrally symmetric. Choose $u_0 \in \partial U$ and let $f$ be a linear form on $E$ such that $f(u_0) = 1$ and $fU \subset\ ]-\infty, 1]$. Let $Q$ denote the halfspace $\{x : f(x) \ge 0\}$, whence $Y \mathrel{.=} csU \subset \partial Q = \{x : f(x) = 0\}$. Let $U' \mathrel{.=} \{x : (\mu_U(x - f(x)u_0))^2 + f(x)^2 \le 1\}$. Then $U'$ is a closed convex body with $ccU' = ccU$, whence $U \approx U'$ by 1.5 and also $\partial U \approx \partial U'$. For each $x \in \partial U \sim (u_0 + Y)$, let

$$\tau(x) \mathrel{.=} u_0 + \frac{1}{1 - f(x)}\,(u_0 - x),$$

so that $\tau$ is a "stereographic projection" translated by the vector $u_0$. It can be verified that $\tau$ is a homeomorphism of $\partial U \sim (u_0 + Y)$ onto $\partial Q \sim Y$. Thus to complete the proof it suffices to prove the following:
(1) $\partial Q \approx \partial Q \sim Y$;
(2) $\partial U \approx \partial U \sim (u_0 + Y)$;
(3) $U \approx \partial U \times\ ]0,1]$;
(4) $Q \approx \partial Q \times\ ]0,1]$.

Assertion (4) is obvious. In connection with the others, let us prove:

(*) There is a homeomorphism of $E$ onto $E \sim Y$ which carries $U$ onto $U \sim Y$ and is the identity on $E \sim \operatorname{int} U$.
The gauge functional $\mu_U$ is a seminorm on $E$. Let $([E], \|\ \|)$ be the normed linear space corresponding to $(E, \mu_U)$ in the usual way, and for each $x \in E$ let $[x]$ denote the corresponding element of $[E]$. Since $Y$ is of infinite deficiency, the space $[E]$ is infinite-dimensional and hence by 1.1 $[E]$ admits a norm $|\ | \le \|\ \|$ such that the space $([E], |\ |)$ is incomplete. Let $(\hat{E}, |\ |)$ be the completion of the space $([E], |\ |)$, and choose $q \in \hat{E} \sim [E]$ with $|q| < \tfrac12$. Define $E' \mathrel{.=} E$, $V_n \mathrel{.=} (1/n)U$ for $n \in \mathfrak{N}$, $V_1' \mathrel{.=} U$, and

$$V_n' \mathrel{.=} \{x \in E : |[x] - q| \le 1/n\} \qquad (n = 2, 3, \ldots).$$
Then the conditions of 1.6 are all satisfied and the desired conclusion follows from 1.6.

Now (1) follows by applying (*) with the sets $E$ and $U$ replaced by $\partial Q$ and $U \cap \partial Q$ respectively. (3) also follows easily from (*), for $U \sim Y$ is simply covered by the segments $]0, u]$ with $u \in \partial U$. It remains only to establish (2). Let $W$ denote the set $\{x : f(x) \le 0\}$. Then with the aid of (*) it is easy to see that $\partial W \approx \partial W \sim Y$. And of course $\partial W \approx \partial U$ by 1.5. But $\partial W \sim Y$ is the image of $\partial U \sim (u_0 + Y)$ under the homeomorphism $\psi$ given by
1 W (x) = T U ° +
1 1 (x - T Uo).
This completes the proof.

REFERENCES
1. Corson, H. and Klee, V. 1963, Topological classification of convex sets, Proceedings of Symposia in Pure Math., vol. 7, Convexity, Amer. Math. Soc., Providence, R. I., pp. 37-51.
2. Klee, V. 1953, Convex bodies and periodic homeomorphisms in Hilbert space, Trans. Amer. Math. Soc., 74, 10-43.
3. Klee, V. 1956, A note on topological properties of normed linear spaces, Proc. Amer. Math. Soc., 7, 735-737.

UNIVERSITY OF WARSAW, WARSAW, POLAND
AND
UNIVERSITY OF WASHINGTON, SEATTLE, WASHINGTON, U. S. A.

† (Added in proof.) Let us say that a subset $X$ of a topological space $E$ is negligible provided the spaces $E$ and $E \sim X$ are homeomorphic. There are now several results which assert the negligibility of certain sets in various infinite-dimensional topological linear spaces. Theorem A above involves the most general class of spaces, but asserts the negligibility only of one-pointed sets. It is also known that $X$ is a negligible subset of $E$ if any of the following conditions is satisfied:
(i) $X$ is compact and $E$ is an infinite-dimensional normed linear space;
(ii) $X$ is compact and $E$ is a topological linear space which admits a Schauder basis whose closure does not include the origin;
(iii) $X$ is weakly compact and $E$ is a nonreflexive Banach space;
(iv) $X$ is weakly compact and $E$ is a Banach space which admits a Schauder basis;
(v) $X$ is $\sigma$-compact and $E = \mathfrak{R}^{\aleph_0}$.
The results (i) and (iii) appear in [3] and [2] respectively; (ii) and (iv) are in a recently completed paper of R. D. Anderson ("On a theorem of Klee"), and (v) is in another paper of Anderson ("Topological properties of the Hilbert cube and the infinite product of open intervals").

†† Theorem B is used in the authors' recently completed paper entitled "Every nonnormable Fréchet space is homeomorphic with all of its closed convex bodies."
ON THE LINEAR SEARCH PROBLEM* BY
ANATOLE BECK ABSTRACT
A man in an automobile searches for another man who is located at some point of a certain road. He starts at a given point and knows in advance the probability that the second man is at any given point of the road. Since the man being sought might be in either direction from the starting point, the searcher will, in general, have to turn around many times before finding his target. How does he search so as to minimize the expected distance travelled? When can this minimum expectation actually be achieved? This paper answers the second of these questions.

The purpose of this paper is to prove an existence theorem concerning the linear search problem. The problem, which this author has been circulating for some years, is the following. A point $t$ is placed on the real line $R$ according to a known probability distribution $F$. A search is made for the point by executing a continuous path in $R$ starting at 0. A search plan is a program of the following type: Start in the positive (or negative) direction, and if the point is not found before reaching $x_1$, turn around and explore the other half of the real line as far as $x_2$. If this still does not yield the point, turn around again and explore as far as $x_3$, etc. Thus, we can represent the search plan as a sequence $x = \{x_i\}$ with $\cdots \le x_4 \le x_2 \le 0 \le x_1 \le x_3 \le \cdots$ or $\cdots \le x_3 \le x_1 \le 0 \le x_2 \le x_4 \le \cdots$. In certain cases, as we shall note below, there may be only finitely many entries in a search plan, and one of them (but not more) might be infinite. We use the weak inequalities for technical reasons. If all the inequalities are strong, we call $x$ a strong search plan, otherwise it is said to be weak. A search plan as so defined represents all the meaningful planning the searcher can do, since no new information is coming in except that $t$ has not yet been found, and this supposition is made in constructing the plan. Thus, assume that $F$ is fixed, and that some plan $x$ has been chosen.
For each $t$, we let $X(x, t)$ be the length of the path from 0 to $t$ according to the search procedure $x$. For each $x$, $X(x, t)$ is a random variable, and we define

$$X(x) = E(X(x, t)) = \int_{-\infty}^{\infty} X(x, t)\, dF(t).$$

Received January 29, 1965.
* The research work for this paper was supported by the National Science Foundation (under Grant No. GP 2559 to the University of Wisconsin) and the Wisconsin Alumni Research Foundation.
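The quantities $X(x, t)$ and $X(x)$ defined above can be computed directly when the distribution has finite support. The sketch below makes that simplifying assumption (the paper's $F$ is an arbitrary distribution), represents a plan as a finite list of turning points, and uses the hypothetical names `path_length` and `expected_cost`, which are not from the paper.

```python
def path_length(plan, t):
    """X(x, t): distance travelled from 0 along the plan until t is first reached."""
    travelled = 0.0
    position = 0.0
    for turn in plan:
        # The current leg runs from `position` to `turn`.
        if min(position, turn) <= t <= max(position, turn):
            return travelled + abs(t - position)
        travelled += abs(turn - position)
        position = turn
    raise ValueError("the plan never reaches t")


def expected_cost(plan, distribution):
    """X(x) for a finitely supported distribution given as {point: probability}."""
    return sum(p * path_length(plan, t) for t, p in distribution.items())


# Assumed toy distribution: the target is at -1 or at 2 with equal probability.
toy = {-1.0: 0.5, 2.0: 0.5}
print(expected_cost([2.0, -1.0], toy))   # search to the right first
print(expected_cost([-1.0, 2.0], toy))   # search to the left first
```

For this toy distribution, starting toward the nearer point gives the smaller expected path length, which is the kind of comparison between plans that the infimum of $X(x)$ formalizes.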
222
ANATOLE BECK
[December
It has long been known by this author (and probably many others) that $X(x)$ can be made finite for some search plan $x$ exactly if the first moment $M_1(F) = \int_{-\infty}^{\infty} |t|\, dF(t) < \infty$. In that case, we define $x_\delta = \{\delta, -2\delta, 4\delta, \ldots\}$ and then $X(x_\delta) < \infty$ and in fact $\limsup_{\delta \to 0} X(x_\delta) \le 9M_1$ (see e.g. [1]). We shall assume throughout that $\int_{-\infty}^{\infty} |t|\, dF(t) < \infty$. There is an infimum for all the $X(x)$, and we designate it as $m_0 = m_0(F)$ throughout this paper. The main problem is to produce a method for finding a search plan $y$ with $X(y) = m_0$, or at least a $y_\varepsilon$ with $X(y_\varepsilon) < m_0 + \varepsilon$ for each $\varepsilon > 0$. A "best answer" would be a formula for the $x_i$ in terms of $F$ (or at least an algorithm for deciding whether $x_1$ is positive or negative).

In the pursuit of this problem, it is interesting and useful to know whether it is possible to realize the infimum for some search plan. Under date of October 1963, Wallace Franck, of the University of New Mexico, has circulated a preprint (see reference) in which he gives sufficient conditions for the existence of a minimum and also an example in which there is no minimum. The heart of his work is Lemma 1 (pp. 4-5) and an example (pp. 13-14). In this paper, we sharpen these last two results to produce a necessary and sufficient condition for the attainment of the minimum. Aside from the sharpened results (Lemma 3 and Theorem 13 of this paper), the other techniques are reasonably simple and not essentially different from those used by Franck. They are included only for the sake of completeness. Our main theorem is as follows:

1. THEOREM. Let $F$ be a probability distribution on the real line and let $M_1 = \int_{-\infty}^{\infty} |t|\, dF(t) < \infty$. Define $X(x)$ as in the above discussion, and let

$$F^+(a) = \limsup_{t \to 0+} \frac{F(a + t) - F(a+)}{t}, \qquad F^-(a) = \limsup_{t \to 0-} \frac{F(a + t) - F(a-)}{t}.$$

Then there is a search plan $y$ with $X(y) \le X(x)$ for all search plans $x$ if and only if at least one of $F^+(0)$ and $F^-(0)$ is finite.

Let us dispose immediately of the trivial cases. If the probability that $t$ is negative (resp. positive) is 0, then the only reasonable way to search is to proceed to the right (resp. left) until $t$ is found. Because of cases like this, we require only that $t$ be found almost surely (i.e. with probability 1), and we allow the possibility of a search plan with only finitely many entries, one of which might be $\pm\infty$. Otherwise, if we were to require that every point be searched, even simple cases like this would have no minima, thus ruining an otherwise interesting problem. Let $x^-$ and $x^+$ be so designated that $F(t) = 0$, $\forall t < x^-$, $F(t) = 1$, $\forall t > x^+$, and $0 < F(t) < 1$, $\forall x^- < t < x^+$. In any case where $x^- > -\infty$ or $x^+ < +\infty$, we allow the possibility that search procedures may be finite and may have $\pm\infty$ for the last entry. Whenever we reach a point, we shall assume it has been searched, and thus we normalize $F$ by assuming that $F$ is continuous from the right in the positive half of the real axis and from the left in the negative half. The jump at 0, if there is one, is unimportant, as 0 is searched immediately at the outset. Thus, our answer will be the same whether we consider the given probability distribution or the
1964]
ON THE LINEAR SEARCH PROBLEM
223
conditional distribution on the hypothesis $t \ne 0$. The distribution function for the conditional probability is continuous at 0, and thus it will not affect the generality of our result if we assume that $F$ is continuous at 0.

Suppose we choose a sequence $x^{(n)} = \{x_i^{(n)}\}$ with $X(x^{(n)}) \to m_0$. A reasonable procedure would be to define $y_i = \lim_{n \to \infty} x_i^{(n)}$ and prove that $X(y) = m_0$, where $y = \{y_i\}$. In fact, where possible, that is exactly our procedure. We must be wary, however, of the possibility that $x_i^{(n)} \to 0$, $\forall i$, and this unhappy circumstance can occur regardless of the nature of $F$, since for any search plan $x$, arbitrarily many points can be added at the beginning arbitrarily close to 0, such that the change in $X(x)$ is as small as we like. Furthermore, $x_i^{(n)}$ can diverge to $\pm\infty$ but not, as we shall see, if $x^- = -\infty$ and $x^+ = +\infty$. So let us take this case first.

2. LEMMA. If $x^- = -\infty$, $x^+ = +\infty$, then we can find a sequence $\{b_i\}$ such that $|x_i| < b_i < \infty$, $\forall i = 1, 2, \ldots$, holds for every search plan $x$ with $X(x) < 2m_0$.

Proof. Let $P_1 = \min(\Pr(t < 0), \Pr(t > 0))$. Assume that $x_1 > 0$; the other case is dual. In this case,

$$2|x_1| \cdot P_1 \le \int_{-\infty}^{0} 2|x_1|\, dF(t) \le \int_{-\infty}^{\infty} X(x, t)\, dF(t) < 2m_0.$$

Thus $|x_1| < (m_0/P_1) = b_1$. In the same way, let $P_2 = \min(\Pr(t < -b_1), \Pr(t > b_1))$. Then

$$2|x_2| \cdot P_2 \le \int_{b_1}^{\infty} 2|x_2|\, dF(t) \le \int_{b_1}^{\infty} X(x, t)\, dF(t) < 2m_0,$$

so that $|x_2| < (m_0/P_2) = b_2$. In general, then, we choose $P_n = \min(\Pr(t < -b_{n-1}), \Pr(t > b_{n-1}))$ and $b_n = (m_0/P_n)$, and this sequence gives us the desired bounds. Q.E.D.

Even if the hypotheses of Lemma 2 are not satisfied, something can yet be salvaged, as we shall see later. What of the possibility that $x_i^{(n)} \to 0$ as $n \to \infty$? We shall modify the proof of Franck's Lemma 1 to show

3. LEMMA. If $F^-(0) < D < \infty$, then we can find a $K > 0$ such that for all sequences $x$ with $x_2 < 0 < x_1$ and $x_3 - x_4 < K$, we can form the sequence $y$ by removing $x_1$ and $x_2$ from $x$ ($y_i = x_{i+2}$, $\forall i$), in which case $X(y) < X(x)$.

Proof. It is easy to see (or cf. [1] p. 3 formula (2)) that
$$X(x) - X(y) = 2\big[\,|x_1|(1 - |F(x_1) - F(0)|) + |x_2|(1 - |F(x_2) - F(x_1)|) + |x_3|(1 - |F(x_3) - F(x_2)|) - |x_3|(1 - |F(x_3) - F(0)|)\,\big]$$
$$= 2\big[(x_1 - x_2)(1 - (F(x_1) - F(0))) - (x_3 - x_2)(F(0) - F(x_2))\big].$$
If $K$ is small enough, this difference is positive. Indeed, let $K > 0$ be chosen so that
10 F(K) - F(0) < ½Pr(t > 0),
f(0)
2 o (F(t)-----
V-K
Note that $K$ is dependent only on $F$, and not on $x$. Assume $x_3 - x_4 < K$. In this case, we have $x_1 < x_1 - x_2 < x_3 - x_4 < K$, so that $1 - (F(x_1) - F(0)) > 1 - (F(K) - F(0)) > \tfrac12$. Also, $|x_2| < x_1 - x_2 < x_3 - x_4 < K$, so that $F(0) - F(x_2) < D|x_2|$. Finally, $x_3 - x_2 < x_3 - x_4 < K$, so that
$$X(x) - X(y) > 2\big[\,|x_2| \cdot \tfrac12 - K \cdot D|x_2|\,\big] \ge 2\big[\tfrac12|x_2| - \tfrac12|x_2|\big] = 0.$$
Q.E.D.
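The identity for $X(x) - X(y)$ used in this proof can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes, purely for illustration, that $t$ is uniform on $[-1, 1]$, and the function names `search_cost`, `expected_cost` and `F` are hypothetical. It compares a direct integration of the search cost with the closed form $2[(x_1 - x_2)(1 - (F(x_1) - F(0))) - (x_3 - x_2)(F(0) - F(x_2))]$.

```python
def search_cost(plan, t):
    """Distance walked until t is first passed, for turning points x_1, x_2, ...

    Assumes the plan alternates in sign and expands, as in the text.
    """
    walked = 0.0
    for turn in plan:
        # t is found on this leg if it lies between the origin and the turning point
        if t * turn > 0 and abs(t) <= abs(turn):
            return walked + abs(t)
        walked += 2 * abs(turn)  # walk out to the turning point and back to 0
    raise ValueError("plan never reaches t")


def expected_cost(plan, cells=20000):
    """Midpoint-rule integral of search_cost against the uniform law on [-1, 1]."""
    width = 2.0 / cells
    total = 0.0
    for j in range(cells):
        t = -1.0 + (j + 0.5) * width
        total += search_cost(plan, t) * (width / 2.0)  # dF(t) = dt / 2
    return total


def F(t):
    """Distribution function of the uniform law on [-1, 1] (illustrative assumption)."""
    return min(1.0, max(0.0, (t + 1.0) / 2.0))


x = [0.1, -0.2, 0.4, -0.8, 1.0, -1.0]
y = x[2:]  # y_i = x_{i+2}: the first two turning points are removed

lhs = expected_cost(x) - expected_cost(y)
rhs = 2 * ((x[0] - x[1]) * (1 - (F(x[0]) - F(0)))
           - (x[2] - x[1]) * (F(0) - F(x[1])))
print(lhs, rhs)
```

The turning points were chosen to fall on cell boundaries of the integration grid, so the midpoint rule is exact up to rounding for this piecewise linear integrand.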
4. LEMMA. If $F^-(0) < \infty$, $\varepsilon > 0$, and $x$ is any strong search plan, then we can find a search plan $y$ such that $X(y) < X(x) + \varepsilon$, $y_1 > 0$, and $y_3 - y_4 \ge K$, where $K = K(F)$ is defined in the proof of the previous lemma.

Proof. Perhaps $x_1 > 0$. If not, let $z = \{z_i\}$ be chosen with $0 < z_1 < x_2$ and $z_i = x_{i-1}$, $\forall i \ge 2$. If $z_1$ is small enough, then $X(z) < X(x) + \varepsilon$. If $x_1 > 0$, let $z = x$. If $z_3 - z_4 \ge K$, we are through. If not, then $z^{(1)} = \{z_i^{(1)}\}$ defined by $z_i^{(1)} = z_{i+2}$ has $X(z^{(1)}) \le X(z) < X(x) + \varepsilon$. Defining $z^{(n)}$ by $z_i^{(n)} = z_{i+2}^{(n-1)}$, we have $X(z^{(n)}) < X(z^{(n-1)})$ whenever $z_3^{(n-1)} - z_4^{(n-1)} < K$. Since $z_3^{(n)} = z_{3+2n}$ must eventually exceed $K$ (recall that $F(K) - F(0) < \tfrac12\Pr(t > 0)$), we must come to a first $n$ for which $z_3^{(n)} - z_4^{(n)} \ge K$. Then if we set $y = z^{(n)}$, we have $X(y) < X(x) + \varepsilon$. Q.E.D.
5. LEMMA. If $X(x) < 2m_0$, $x_1 > 0$, $x_3 - x_4 \ge K$, and $x^- < a < 0 < b < x^+$, then $x_i \in [a, b]$ for only $n_0$ values of $i$ at most, where $n_0$ depends only on $F$, $a$, $b$, and $K$.

Proof. Let $P = \min\{\Pr(x^- < t < a), \Pr(b < t < x^+)\}$. Then if, for example, $a \le x_{2n} \le 0$, we see immediately that
$$2(n-1)K \cdot P \le \int_{x^-}^{a} X(x,t)\,dF(t) \le \int_{-\infty}^{\infty} X(x,t)\,dF(t) < 2m_0,$$
so that $n < (m_0/KP) + 1$. Similarly if $0 \le x_{2n+1} \le b$. Q.E.D.
6. THEOREM. Let $F$ be our given distribution. If $x^- = -\infty$, $x^+ = +\infty$, and $F^-(0) < \infty$, then there exists a search plan $y$ with $X(y) = m_0$.

Proof. The proof given is very similar to that of Franck. First note that for any weak search plan $x$, we can find a strong search plan $z$ with $X(z) \le X(x)$. We
do this by first finding the least value $j$ of $i$ with $x_j \neq 0$. If we define $w$ by $w_i = x_{i+j-1}$, we see that $X(w) = X(x)$. Now let $k$ be the greatest value of $i$ with $w_k = 0$. Define $v_i = w_{i+k}$. Then $X(v) \le X(w)$. Finally, whenever $v_i = v_{i+2}$, eliminate $v_i$ and $v_{i+1}$ from $v$, thus giving us a new search plan $z$. It is clear that $z$ is a strong search plan and $X(z) \le X(v) \le X(x)$.

Let $x^{(n)} = \{x_i^{(n)}\}$ be chosen for each $n$ in such a way that $X(x^{(n)}) \to m_0$. Let $K$ be chosen as before, and let a sequence $\{\varepsilon_n\}$ be chosen with $\varepsilon_n > 0$, $\varepsilon_n \to 0$ as $n \to \infty$. By Lemma 4 and the comments above, we can choose a strong search plan $z^{(n)}$ based on $x^{(n)}$ with $z_1^{(n)} > 0$, $\forall n$, $z_3^{(n)} - z_4^{(n)} \ge K$, and $X(z^{(n)}) < X(x^{(n)}) + \varepsilon_n$. Then $X(z^{(n)}) \to m_0$, and for each $i$, $\{z_i^{(n)}\}$ is a bounded sequence. Using the diagonal method, we can extract a subsequence $\{z^{(n_j)}\}$ of $\{z^{(n)}\}$ with $\{z_i^{(n_j)}\}$ convergent for each $i$ as $j \to \infty$. For convenience, we assume $\{z^{(n)}\} = \{z^{(n_j)}\}$. Let $y_i = \lim_n z_i^{(n)}$, $\forall i$, and $y = \{y_i\}$. Then $y$ is a search plan (possibly weak). We wish to show that $X(y) = m_0$.

Let $w_i^{(n)}$ be chosen so that it has the same sign as $x_i^{(n)}$ and so that
$$|w_i^{(n)}| = \max\{|x_i^{(n)}|, |y_i|\}.$$
Then $w_i^{(n)} \to y_i$ as $n \to \infty$, $\forall i$. Choose any $k > 0$, $\delta > 0$, and let $n_0$ be chosen so that $|x_i^{(n)} - y_i| < \delta$, $\forall n > n_0$, $\forall i = 1, \dots, 2k$. Then for every $y_{2k} < t < y_{2k-1}$, we have
$$X(w^{(n)}, t) < X(x^{(n)}, t) + 2k \cdot 2\delta.$$
Thus
$$\int_{y_{2k}}^{y_{2k-1}} X(w^{(n)},t)\,dF(t) \le \int_{y_{2k}}^{y_{2k-1}} X(x^{(n)},t)\,dF(t) + (y_{2k-1} - y_{2k})\,2k \cdot 2\delta.$$
On the other hand, as $n \to \infty$, we have $X(w^{(n)},t) \to X(y,t)$ for every $t \in R$. Thus, for $n$ large enough, say $n > n_1$, we have
$$\left|\int_{y_{2k}}^{y_{2k-1}} X(y,t)\,dF(t) - \int_{y_{2k}}^{y_{2k-1}} X(w^{(n)},t)\,dF(t)\right| < \delta.$$
Thus, for $n > \max\{n_0, n_1\}$, we have
$$\int_{y_{2k}}^{y_{2k-1}} X(y,t)\,dF(t) < \int_{y_{2k}}^{y_{2k-1}} X(w^{(n)},t)\,dF(t) + \delta$$
$$\le \int_{y_{2k}}^{y_{2k-1}} X(x^{(n)},t)\,dF(t) + (y_{2k-1} - y_{2k})\,2k \cdot 2\delta + \delta$$
$$\le \int_{-\infty}^{+\infty} X(x^{(n)},t)\,dF(t) + (y_{2k-1} - y_{2k})\,2k \cdot 2\delta + \delta.$$
Since $X(x^{(n)}) \to m_0$, we have for each $\delta > 0$
$$\int_{y_{2k}}^{y_{2k-1}} X(y,t)\,dF(t) \le m_0 + (y_{2k-1} - y_{2k})\,2k \cdot 2\delta + \delta,$$
so that
$$\int_{y_{2k}}^{y_{2k-1}} X(y,t)\,dF(t) \le m_0, \quad \forall k > 0,$$
and thus
$$X(y) = \int_{-\infty}^{+\infty} X(y,t)\,dF(t) \le m_0.$$
Since $y$ is a search plan (weak or strong), $X(y) \ge m_0$, so that $X(y) = m_0$. Furthermore, we could make $y$ a strong search plan (by removing no more than three zeroes at the beginning) and it would then still have this property. Q.E.D.

7. COROLLARY. If, in the above theorem, the hypothesis $F^-(0) < \infty$ is replaced by $F^+(0) < \infty$, then the same conclusion follows.

Proof. Clear by symmetry.

We deal now with the thornier case in which $x^- > -\infty$ or $x^+ < +\infty$. Let us consider first the case $x^- > -\infty$, $x^+ = +\infty$. As we noted before, the case $x^- > 0$ is trivial.

8. THEOREM. Let $F$ be our given distribution. If $-\infty < x^- < 0$, $x^+ = +\infty$,
and $F^-(0) < \infty$, then there is a $y$ with $X(y) = m_0$.

Proof. Let $x^{(n)}$ be a sequence of search plans with $X(x^{(n)}) \to m_0$ as $n \to \infty$. As before, choose $z^{(n)}$ so that each $z^{(n)}$ is a strong search plan with $z_1^{(n)} > 0$, $z_3^{(n)} - z_4^{(n)} \ge K$, and so that $X(z^{(n)}) \to m_0$ as $n \to \infty$. If $\{z_i^{(n)}\}$ is bounded for each $i$ as $n \to \infty$, then the proof given for Theorem 6 will apply here. In the other case, let $k$ be the least value of $i$ for which $\{z_i^{(n)}\}$ is unbounded as $n \to \infty$. Choose a subsequence $\{z^{(n_j)}\}$ of $\{z^{(n)}\}$ such that $\{z_i^{(n_j)}\}$ converges for each $1 \le i < k$, while $\{z_k^{(n_j)}\} \to +\infty$, and such that $X(z^{(n_j)}) < 2m_0$, $\forall j$. We shall save notational difficulties by assuming that $\{z^{(n)}\}$ itself is this subsequence. We first show that $z_{k-1}^{(n)} \to x^-$. Let $u_n = z_k^{(n)}$ and $v_n = z_{k-1}^{(n)}$. Then we have
$$2|u_n|\,\Pr(t < v_n) = \int_{-\infty}^{v_n} 2|u_n|\,dF(t) \le \int_{-\infty}^{v_n} X(z^{(n)},t)\,dF(t) \le \int_{-\infty}^{\infty} X(z^{(n)},t)\,dF(t) < 2m_0,$$
so that
$$\Pr(t < v_n) < \frac{m_0}{u_n} \to 0 \text{ as } n \to \infty, \text{ since } u_n = z_k^{(n)} \to \infty.$$
Thus $z_{k-1}^{(n)} = v_n \to x^-$, as asserted. Letting $y_i = \lim_{n\to\infty} z_i^{(n)}$, $\forall i = 1, \dots, k-1$, we have as before
$$\int_{x^-}^{n} X(y,t)\,dF(t) \le m_0, \quad \forall n,$$
and thus $y = (y_1, \dots, y_{k-1}, +\infty)$ is a search procedure with $X(y) = m_0$. Q.E.D.
9. COROLLARY. If, in the above theorem, the hypothesis $F^-(0) < \infty$ is replaced by $F^+(0) < \infty$, then the same conclusion follows.

Proof. The proof is identical except that $z_1^{(n)} < 0$ in each $z^{(n)}$, which is an inessential difference.

10. COROLLARY. If in Theorem 8 or Corollary 9, the hypothesis $-\infty < x^- < 0$, $x^+ = +\infty$ is replaced by $x^- = -\infty$, $0 < x^+ < +\infty$, then the conclusions still hold.

Proof. Clear by symmetry.

11. THEOREM. Let $F$ be our given distribution. If $-\infty < x^- < 0 < x^+ < +\infty$, and $F^-(0) < \infty$, then we can find a search plan $y$ with $X(y) = m_0$.

Proof. Define $x^{(n)}$ and $z^{(n)}$ as before. If the sequence $z_i^{(n)}$ is bounded away from $x^-$ and $x^+$ for every $i$ as $n \to \infty$, then the proof is the same as in Theorem 6. Otherwise, let $k$ be the smallest value of $i$ for which $\{z_i^{(n)}\}$ violates this condition. Assume $\{z_k^{(n)}\}$ has $x^+$ for a limit point; the other case is dual. As previously, we can assume each $\{z_i^{(n)}\}$ converges, $\forall\, 1 \le i < k$, to a point of $(x^-, x^+)$, and $z_k^{(n)} \to x^+$. Then let $y_i = \lim_{n\to\infty} z_i^{(n)}$, $\forall\, 1 \le i < k$, and set $y = (y_1, \dots, y_{k-1}, x^+, x^-)$. Then as before, $X(y) = m_0$. Q.E.D.

12. COROLLARY. If, in Theorem 11, the assumption $F^-(0) < \infty$ is replaced by $F^+(0) < \infty$, then the same conclusion follows.

Proof. Clear by symmetry.

13. THEOREM. Let $F$ be a probability distribution with $\int_{-\infty}^{\infty} |t|\,dF(t) < \infty$. Suppose that $F^-(0) = F^+(0) = \infty$. Let $x$ be any search procedure. Then there is a search procedure $y$ such that $X(y) < X(x)$.

Proof. Since $F^-(0) = F^+(0) = \infty \neq 0$, $x^- < 0 < x^+$. Thus, any search procedure has at least two entries. Assume $x_1 > 0$; the other case is dual. Choose any $y_1$ with $x_2 < y_1 < 0$ and $y_1^{-1}(F(y_1) - F(0)) > 1/x_1$, so that $F(y_1) - F(0) < y_1/x_1$, and define $y_i = x_{i-1}$, $\forall i \ge 2$. Then
$$X(y) - X(x) = 2\big[y_1(F(0) - F(y_1) - 1) + x_1(F(y_1) - F(0))\big] < 2\Big[y_1(F(0) - F(y_1) - 1) + x_1 \cdot \frac{y_1}{x_1}\Big] = 2y_1(F(0) - F(y_1)) < 0,$$
so that $X(y) < X(x)$. Q.E.D.
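The construction in this proof can be tried on a concrete distribution. The sketch below is an illustration only and makes its own assumptions: it takes the density $\tfrac14|t|^{-1/2}$ on $[-1, 1]$, for which both one-sided derivatives of $F$ at 0 are infinite, and the names `F` and `cost_change` are hypothetical. For this $F$ the condition $y_1^{-1}(F(y_1) - F(0)) > 1/x_1$ reduces to $|y_1| < x_1^2/4$.

```python
import math


def F(t):
    """Distribution function for the density |t|**-0.5 / 4 on [-1, 1] (illustrative)."""
    t = max(-1.0, min(1.0, t))
    return 0.5 + 0.5 * math.copysign(math.sqrt(abs(t)), t)


def cost_change(y1, x1):
    """X(y) - X(x) as in the proof, where y = (y1, x1, x2, ...) and x2 < y1 < 0 < x1."""
    return 2 * (y1 * (F(0) - F(y1) - 1) + x1 * (F(y1) - F(0)))


x1 = 0.5
y1 = -0.04  # |y1| < x1**2 / 4, so (F(y1) - F(0)) / y1 > 1 / x1
change = cost_change(y1, x1)
print(change)  # negative: inserting y1 before x1 lowers the expected cost
```

Any plan with $x_1 = 0.5$ and $x_2 < -0.04$ is improved by this insertion, which is why no plan can be minimal for such an $F$.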
Proof of Theorem 1. Direct consequence of Theorems 6, 8, 11, and 13 and Corollaries 7, 9, 10 and 12. Q.E.D.

In fact, Theorem 13 shows a little more than promised. It shows that if $F^-(0) = \infty$ (resp. $F^+(0) = \infty$), then there is no minimal search procedure with $x_1 > 0$ (resp. $x_1 < 0$). Thus, if $F^-(0) = \infty$, $F^+(0) < \infty$ (resp. the reverse), then we see that there is a minimal search procedure $x$, and $x_1 < 0$ (resp. $x_1 > 0$). Thus, in this limited case, we have an indication of the direction of the first entry.
REFERENCE
1. Wallace Franck, On the optimal search problem, Technical Report No. 44, University of New Mexico, October, 1963.

THE HEBREW UNIVERSITY OF JERUSALEM, AND THE UNIVERSITY OF WISCONSIN
SHADOW SYSTEMS OF CONVEX SETS BY G. C. SHEPHARD
ABSTRACT
An s-system of convex sets is the system of shadows of a given convex set cast on to a subspace by a beam of light whose direction varies. Here the convexity properties of s-systems are investigated, and, in the final section, a relationship with the projection functions of convex sets is established.

In three-dimensional space, the shadow of a convex set cast on to a plane by a parallel beam of light is a convex region. If we let the direction of the beam vary, we get a system of convex regions in the plane which will be called a shadow system of convex sets, or, more briefly, an s-system. The purpose of this short paper is to investigate the properties of s-systems. In particular it will be shown that if an s-system is parametrised in a suitable way, many geometrical functionals such as the volume, surface area, diameter, etc., are convex functions of the system parameter.

Although there is no exact theory of duality in the study of convex bodies, s-systems seem, in some sense, to play the part of duals to Minkowski concave systems. This duality arises because, whereas an s-system consists of the projections of some convex set in higher-dimensional space, a Minkowski concave system may be considered as arising from the parallel sections of such a set [2, p. 33]. S-systems are closely related to the process of Steiner symmetrisation [2, p. 69], and, in a one-dimensional form, occur implicitly in the works of many authors (compare, for example, the continuous symmetrisation of Pólya and Szegő [5, p. 200] and the linear parameter system of Rogers and Shephard [6, p. 95]).

Received December 30, 1964.

§1. Definitions and Elementary Properties. In Euclidean space of $n + 1$ dimensions, $E^{n+1}$, let $\xi$ be any non-zero vector, $K$ be any closed bounded convex set, and $\mathcal{H}$ any hyperplane (subspace of $n$ dimensions). Then we define $S(\xi, K, \mathcal{H})$ (the shadow of $K$ on $\mathcal{H}$ in the direction $\xi$) to be
$$\mathcal{H} \cap Z(K, \xi)$$
where $Z(K, \xi)$ is the cylinder $\{x + t\xi \mid x \in K,\ -\infty < t < \infty\}$ containing $K$ with generators in the direction $\xi$. Let $a$ be any fixed vector in $E^{n+1}$ not parallel to $\mathcal{H}$. Since the definition is affine invariant there will be no loss of generality in assuming that $a$ is a unit vector normal to $\mathcal{H}$. Let $u$ be a variable vector parallel to $\mathcal{H}$, and let $K(u) = S(a + u, K, \mathcal{H})$. Then the system of convex sets $\{K(u)\}$, as $u$ varies, is called an s-system, and $u$ is the system parameter. The s-system will be said to originate from the set $K \subset E^{n+1}$.
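As an informal illustration of this definition, the sketch below computes shadows of a polytope in $E^3$ on the plane $z = 0$. The assumptions are ours, not the paper's: $n = 2$, $a = (0, 0, 1)$, $K$ is the unit cube given by its vertices, and the helper names are hypothetical. A point $(x, y, z)$ moved along the direction $a + u$ until it meets the plane lands at $(x - z u_1,\ y - z u_2)$, so $K(u)$ is the convex hull of the shifted vertices. The last lines spot-check, at a few parameter values, the convexity of the area in $u$ that §2 establishes as statement (5).

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(poly):
    """Shoelace formula for a simple polygon given as a list of vertices."""
    twice = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        twice += x1 * y2 - x2 * y1
    return abs(twice) / 2.0


def shadow_area(vertices, u):
    """Area of K(u): project each vertex along a + u onto the plane z = 0."""
    shadow = [(x - z * u[0], y - z * u[1]) for (x, y, z) in vertices]
    return polygon_area(convex_hull(shadow))


cube = [(float(i), float(j), float(k)) for i in (0, 1) for j in (0, 1) for k in (0, 1)]

# Midpoint convexity check for a few pairs of parameters (illustrative values).
pairs = [((1.0, 0.0), (-1.0, 2.0)), ((0.5, 0.5), (-2.0, 1.0)), ((3.0, -1.0), (0.0, 0.0))]
for u, v in pairs:
    mid = ((u[0] + v[0]) / 2.0, (u[1] + v[1]) / 2.0)
    assert shadow_area(cube, mid) <= (shadow_area(cube, u) + shadow_area(cube, v)) / 2.0 + 1e-9
```

For the cube the shadow is the hull of two unit squares, one shifted by $-u$, so its area is $1 + |u_1| + |u_2|$, which is visibly convex in $u$.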
Since, clearly, $S(a + u, x + \zeta a, \mathcal{H}) = x - \zeta u$ for any real number $\zeta$, an alternative definition of $K(u)$ is
$$K(u) = \{x - \zeta u \mid x + \zeta a \in K\}$$
for each $u$. Written in this way, it is easy to see that if $u$ is restricted to lie on a line, we obtain the linear parameter system of [6].

Since $S(\xi, K, \mathcal{H})$ is an affine image of the orthogonal projection of $K$ on to a hyperplane normal to $\xi$, many elementary properties of s-systems follow immediately from the corresponding properties of orthogonal projections. For example: If all the sets $K(u)$ of an s-system are centrally symmetric, then so is the set $K$ from which they originate. The corresponding result for orthogonal projections was first proved by Blaschke and Hessenberg, see [2, p. 124].

As a second example, we mention the (rather surprising) fact that if $\{K_1(u)\}$ and $\{K_2(u)\}$ are two s-systems such that

(2) $v_n(K_1(u)) < v_n(K_2(u))$

for all $u$, then it is possible for

(3) $v_{n+1}(K_1) > v_{n+1}(K_2)$.
(Here $v_n(X)$ means the $n$-dimensional volume or content of the set $X$.) The corresponding result for orthogonal projections must have been known for a long time, but does not seem to appear in the literature; we therefore give an example below. The question of whether (2) implies $v_{n+1}(K_1) < v_{n+1}(K_2)$ if we restrict $K_1$ and $K_2$ to be centrally symmetric is still open. An answer would be interesting since this question is dual (in some sense) to an unsolved problem of Busemann and Petty [3, p. 88] about the cross-sections of centrally symmetric bodies.

Let $K_1$ be any ball in $E^3$, and $K_2'$ be any non-spherical body of constant brightness ([2, p. 140] and [1, p. 151]) whose orthogonal projection (and therefore shadow) in any direction is equal in volume to that of $K_1$. By Cauchy's surface area formula [2, p. 48] and the isoperimetric theorem [2, p. 111] it follows that $v_{n+1}(K_1) > v_{n+1}(K_2')$. If we dilate $K_2'$ slightly, we obtain
a body $K_2$ which satisfies (2), and, if the dilation is small enough, will satisfy (3) also.

Let $H_0$, $H_1$ be two given convex sets in $\mathcal{H}$. Then it will not, in general, be possible to find an s-system which contains them both. A criterion for this is:

(4) Let $H_0$, $H_1$ be any two closed, bounded convex sets and $u_0$, $u_1$ be any two vectors in $E^n$. Then there exists an s-system $\{K(u)\}$ such that $K(u_0) = H_0$, $K(u_1) = H_1$ if and only if $P(H_0, \mathcal{R}) = P(H_1, \mathcal{R})$, where $P(H, \mathcal{R})$ is the orthogonal projection of $H$ on to a hyperplane $\mathcal{R} \subset \mathcal{H}$ normal to the vector $u_0 - u_1$.

The corresponding criterion for more than two sets $H_i$ is not known. Clearly (4) is necessary since, by the properties of orthogonal projection, both $P(K(u_0), \mathcal{R})$ and $P(K(u_1), \mathcal{R})$ are equal to $P(K, \mathcal{R})$. It is also sufficient, for if $a$ is any unit vector in $E^{n+1}$ normal to the hyperplane $\mathcal{H}$ in which $H_0$, $H_1$ lie, then we may put
$$K = Z(H_0, a + u_0) \cap Z(H_1, a + u_1)$$
and it is easily verified that $H_0 = K(u_0)$ and $H_1 = K(u_1)$. This proves (4).

There will be many s-systems containing the given sets $H_0$ and $H_1$, but for any such system $\{K'(u)\}$, it is clear that $K' \subset K$. Hence the system defined above is, in an obvious sense, the 'maximal' one. It may be called a linear s-system by analogy with a linear Minkowski system, which is the 'minimal' concave system containing two given sets. If $H_0 = K(u_0)$, $H_1 = K(u_1)$, then the system of sets $K((1 - \theta)u_0 + \theta u_1)$ $(0 \le \theta \le 1)$, may, besides being part of an s-system, be a Minkowski linear system. By [2, p. 94] this occurs if and only if one of the sets $H_0$, $H_1$ can be produced from the other by being 'stretched' in a direction normal to the hyperplane $\mathcal{R}$.

If $H_1$ is the reflection of $H_0$ in the hyperplane $\mathcal{R}$, then $H_0$ and $H_1$ satisfy condition (4) and we may, as above, construct a linear s-system $\{K(u)\}$ with $K(u_0) = H_0$, $K(u_1) = H_1$ for suitable vectors $u_0$, $u_1$. Such a system is invariant under reflection in $\mathcal{R}$. Further $K(\tfrac12(u_0 + u_1))$ is the result of applying Steiner symmetrisation [2, p. 69] to $H_0$ (or to $H_1$). This is the relation between s-systems and Steiner symmetrisation mentioned in the introduction.

§2. Convexity Properties of s-systems.

(5) Let $\{K(u)\}$ be an s-system of convex sets in $E^n$. Then the volume $v_n(K(u))$ is a convex function of the system parameter $u$.

We present two short proofs of this basic result:

(i) As we have already noted, if $u$ is restricted to lie on a line, then $\{K(u)\}$ is a linear parameter system, and so by [6, p. 95], $v_n(K(u))$ is a convex function of $u$. Since this is true for every line, $v_n(K(u))$ is a convex function of $u$ in $E^n$.

(ii) Since $a$ is a unit vector normal to $\mathcal{H}$, the equality
$$v_n(S(a + u, K, \mathcal{H})) = (n + 1)\,v(a + u, K, K, \dots, K)$$
holds, where the expression on the right is the mixed volume of the line segment corresponding to the vector $a + u$, and the set $K$ taken $n$ times. But Minkowski proved that this mixed volume is a convex function of $a + u$ (see [2, p. 44]) and so $v_n(K(u))$ is a convex function of $u$.

Minkowski's argument also applies to the mixed volume $v(a + u, K_1, K_2, \dots, K_n)$, and so we deduce the following generalisation of (5):

(6) Let $\{K_1(u)\}, \{K_2(u)\}, \dots, \{K_n(u)\}$ be any $n$, not necessarily distinct, s-systems with the same parameter $u$. Then the mixed volume $v(K_1(u), K_2(u), \dots, K_n(u))$ is a convex function of $u$.

A number of special cases are of interest. Let $K_1 = K_2 = \dots = K_r = K$ and let $K_{r+1} = K_{r+2} = \dots = K_n = B^n$, an $n$-dimensional unit ball lying in $\mathcal{H}$, so that $B^n(u) = B^n$ for all $u$. Taking $r = n - 1$ we deduce:

(7) If $\{K(u)\}$ is any s-system in $E^n$, then the $(n - 1)$-dimensional surface area $a(K(u))$ is a convex function of $u$.

An alternative proof of (7) can be constructed from (10) and the fact that, by Cauchy's surface area formula, the surface area is a constant multiple of the average area of projection on to a hyperplane. If we take $r = 1$ we obtain:

(8) If $\{K(u)\}$ is any s-system, then the mean width of $K(u)$ is a convex function of $u$.

Other values of $r$ lead to convex functionals, of which the following special case will be used later:

(9) Let $\{K(u)\}$ be an s-system in $E^n$ with the property that all the sets $K(u)$ have dimension $r$. Then $v_r(K(u))$ is a convex function of $u$.

A system of this type arises (for $r < n$) if and only if $K$ is $r$-dimensional.

Further special cases of (6) arise when we take some of the sets $K$ to be balls of dimension lower than $n$. Let $\mathcal{R}$ be any $r$-dimensional subspace of $\mathcal{H}$, and, as above, write $P(H, \mathcal{R})$ for the orthogonal projection of any set $H$ on to $\mathcal{R}$. Let $K_{r+1} = K_{r+2} = \dots = K_n = B^{n-r}$ be an $(n - r)$-dimensional ball in $\mathcal{H}$ normal to $\mathcal{R}$.
Then $B^{n-r}(u) = B^{n-r}$ (for all $u$) and since the mixed volume
$$v(K(u), \dots, K(u), B^{n-r}, \dots, B^{n-r})$$
(in which $K(u)$ occurs $r$ times and $B^{n-r}$ occurs $(n - r)$ times) is a constant multiple of $v_r(P(K(u), \mathcal{R}))$, we deduce:

(10) Let $\{K(u)\}$ be any s-system in $E^n$ and $\mathcal{R}$ any linear subspace of $r$ dimensions. Then $v_r(P(K(u), \mathcal{R}))$ is a convex function of $u$.

The case $r = n - 1$ (so that $\mathcal{R}$ is a hyperplane in $E^n$) leads to the alternative
proof of (7) mentioned above. Since the supremum of a set of convex functions is also a convex function, we may deduce also:

(11) Let $\{K(u)\}$ be any s-system in $E^n$. Then the maximum brightness of $K(u)$ is a convex function of $u$.

The maximum brightness (defined by analogy with 'sets of constant brightness') is the maximum $(n - 1)$-dimensional volume of the projections of $K(u)$ on to hyperplanes. The case $r = 1$ yields:

(12) Let $\{K(u)\}$ be any s-system. Then the width of $K(u)$ in a given fixed direction is a convex function of $u$.

If we average over all directions, we obtain an alternative proof of (8). If we take the supremum over all directions, we obtain:

(13) Let $\{K(u)\}$ be any s-system, then the diameter of $K(u)$ is a convex function of $u$.

Statement (10) may also be established from:

(14) If $\{K(u)\}$ is any s-system, and $\mathcal{R}$ is any $r$-dimensional subspace, then $\{P(K(u), \mathcal{R})\}$ is also an s-system, the system parameter being $P(u, \mathcal{R})$.

For let $\mathcal{R}^*$ be any $(r + 1)$-dimensional subspace through $\mathcal{R}$ and normal to $\mathcal{H}$. Then projecting orthogonally on to $\mathcal{R}^*$ we see that
$$P(K(u), \mathcal{R}) = P(K(u), \mathcal{R}^*) = P(S(a + u, K, \mathcal{H}), \mathcal{R}^*) = S(P(a + u, \mathcal{R}^*), P(K, \mathcal{R}^*), P(\mathcal{H}, \mathcal{R}^*)) = S(a + P(u, \mathcal{R}^*), P(K, \mathcal{R}^*), P(\mathcal{H}, \mathcal{R}^*))$$
which, as $u$ varies, is an s-system in $\mathcal{R}$ with parameter $P(u, \mathcal{R}^*) = P(u, \mathcal{R})$.

Other convex functionals can be defined in terms of convex polytopes inscribed in $K(u)$:

(15) Let $f_s(X)$ be the functional defined as the maximum $n$-dimensional volume of all polytopes with at most $s$ vertices included in the set $X$. Then for any s-system $\{K(u)\}$ in $E^n$, $f_s(K(u))$ $(s \ge n + 1)$ is a convex function of $u$.

Let $\Pi_s$ be any convex polytope, with at most $s$ vertices, included in the set $K$. Then $S(a + u, \Pi_s, \mathcal{H}) \subset S(a + u, K, \mathcal{H})$ and, further, $S(a + u, \Pi_s, \mathcal{H})$ is a convex polytope with at most $s$ vertices. Conversely, any polytope with at most $s$ vertices included in $K(u)$ can arise in this way as a shadow of a suitable $\Pi_s \subset K$. Hence
$$f_s(K(u)) = \sup_{\Pi_s \subset K} v_n(S(a + u, \Pi_s, \mathcal{H})).$$
But, by (5), $v_n(S(a + u, \Pi_s, \mathcal{H}))$ is a convex function of $u$ for each $\Pi_s$, and so,
being a supremum of convex functions, $f_s(K(u))$ is also a convex function of $u$. This proves (15).

In an exactly similar manner, using (6) instead of (5), we may deduce a statement corresponding to (15) concerning the maximal $(n - 1)$-dimensional surface area of polytopes, with at most $s$ vertices, included in $K(u)$.

(16) Let $j_s(X)$ be the maximum sum of the lengths of the $\tfrac12 s(s - 1)$ line segments joining $s$ points belonging to the convex set $X$. Then for any shadow system $\{K(u)\}$, $j_s(K(u))$ is a convex function of $u$.

For the proof, take $T_s$ as any set of $s$ points belonging to $K$. Then $S(a + u, T_s, \mathcal{H})$ is a set of $s$ points belonging to $K(u)$, and the sum of the lengths of the joins of these points is
$$j(T_s, u) = \sum_{1 \le i < j \le s} v_1(S(a + u, (t_i t_j), \mathcal{H}))$$
where $T_s = \{t_1, \dots, t_s\}$ and $(t_i t_j)$ is the line segment joining $t_i$ to $t_j$. By (9), with $r = 1$, each term $v_1(S(a + u, (t_i t_j), \mathcal{H}))$ is a convex function of $u$, and so, being a sum of convex functions, $j(T_s, u)$ is also convex. Now
$$j_s(K(u)) = \sup_{T_s \subset K} j(T_s, u)$$
and so, being a supremum of convex functions, $j_s(K(u))$ is also a convex function of $u$. This proves (16). If we put $s = 2$ in (16) we obtain (13) again.

Because of the relation between s-systems and Steiner symmetrisation mentioned at the end of §1, it follows that:

(17) Let $f(X)$ be any functional defined on convex sets $X$ in $E^n$, with the properties: (i) for any s-system $\{K(u)\}$, $f(K(u))$ is a convex function of $u$, and (ii) $f(X) = f(X')$, where $X'$ is the reflection of $X$ in any hyperplane, then the functional $f(X)$ is not increased by Steiner symmetrisation of $X$.

Thus, for example, the surface area, mean width, maximum brightness and diameter of a set tend to be decreased by Steiner symmetrisation (see (7), (8), (11), (13)). The volume $v_n(X)$ is left invariant by Steiner symmetrisation, which is a consequence of the fact that if $K(u)$ is any linear s-system joining $H_0 = K(u_0)$ and $H_1 = K(u_1)$, then $v_n(K(\lambda u_0 + (1 - \lambda)u_1))$ is, for $0 \le \lambda \le 1$, a linear function of $\lambda$. Hence, for a symmetrical linear s-system, it is constant.

The converse of (17) is not true; in fact many interesting and important functionals, such as the circumradius, reciprocal of the inradius, reciprocal of the breadth, moment of inertia, electrical capacity, are decreased by Steiner symmetrisation, yet none of these are convex functions on general s-systems, or even on linear ones. The proofs of these assertions are omitted. For those concerning moments of inertia and electrical capacity (as well as other quantities of a physical nature), the reader is referred to the work of Pólya and Szegő [5].

§3. A Generalisation. In the first two sections we considered s-systems $\{K(u)\}$ with one system parameter $u$. We now present a brief account of a generalisation which provides geometrical insight into the properties of the projection function $P(K, R)$ $(1 \le r \le n - 1)$ discussed in recent publications of H. Busemann, G. Ewald and the author [4]. For notations and terminology, the reader is referred to this paper.

Let $K$ be any closed bounded convex set in $E^{n+t}$, $\mathcal{H}$ any $n$-dimensional linear subspace and $a_1, \dots, a_t$ be unit vectors normal to $\mathcal{H}$ and to each other. Put
(18) $$K(u_1, \dots, u_t) = \Big\{x - \sum_{i=1}^{t} \zeta_i u_i \;\Big|\; x + \sum_{i=1}^{t} \zeta_i a_i \in K\Big\}$$
for any set $\{u_1, \dots, u_t\}$ of vectors parallel to $\mathcal{H}$. Then $\{K(u_1, \dots, u_t)\}$ will be called, as the $u_i$ vary, an s-system of convex sets with $t$ system parameters $u_1, \dots, u_t$. If one system parameter varies, and the remainder are fixed, $\{K(u_1, \dots, u_t)\}$ is an s-system as previously defined. It is natural to ask whether a system with $t$ parameters has corresponding convexity properties to those of §2 when the parameters are allowed to vary simultaneously.

Let $T$ be any simple $t$-vector, then, with $K$ and $\mathcal{H}$ as above, we define $S(T, K, \mathcal{H})$ (the shadow of $K$ on $\mathcal{H}$ in the direction $T$) to be
$$\mathcal{H} \cap Z(K, T)$$
where $Z(K, T)$ is the cylinder $\{x + y \mid x \in K,\ y \parallel T\}$ containing $K$ and with $t$-dimensional generators parallel to $T$. Then

(19) $K(u_1, \dots, u_t) = S(T, K, \mathcal{H})$

where
$$T = (a_1 + u_1) \wedge (a_2 + u_2) \wedge \dots \wedge (a_t + u_t).$$
To see this, we notice that
$$x - \sum_{i=1}^{t} \zeta_i u_i + \sum_{i=1}^{t} \zeta_i (a_i + u_i) = x + \sum_{i=1}^{t} \zeta_i a_i,$$
so that the line joining $x - \sum \zeta_i u_i$ to $x + \sum \zeta_i a_i$ is linearly dependent on $a_1 + u_1, \dots, a_t + u_t$ and so is parallel to $T$. Consequently $x - \sum \zeta_i u_i$ is the unique point in which a $t$-dimensional subspace parallel to $T$, through the point $x + \sum \zeta_i a_i \in K$, meets $\mathcal{H}$. From this, (19) follows immediately.

We can show that

(20) For the s-system defined by (18),
$$v_n(K(u_1, \dots, u_t)) = P(K, T^{\perp})$$
where $T = (a_1 + u_1) \wedge \dots \wedge (a_t + u_t)$ and $T^{\perp}$ is the simple $n$-vector normal to $T$ with $|T^{\perp}| = |T|$. (In terms of components,
$$T^{\perp}_{i_1 i_2 \dots i_n} = T_{i_{n+1} i_{n+2} \dots i_{n+t}}$$
if $(i_1 \dots i_{n+t})$ is an even permutation of $(1, 2, \dots, n + t)$.)

Let $T_0 = a_1 \wedge \dots \wedge a_t$. Since $K(u_1, \dots, u_t)$ and $P(K, \mathcal{T}^{\perp})$ (where $\mathcal{T}^{\perp}$ is the $n$-dimensional subspace of $E^{n+t}$ through the origin determined by the vector $T^{\perp}$) are cross-sections of the same cylinder, the latter being the normal cross-section, we deduce
$$\frac{v_n(P(K, \mathcal{T}^{\perp}))}{v_n(K(u_1, \dots, u_t))} = \frac{|T \cdot T_0|}{|T|\,|T_0|}.$$
However $|T_0| = 1$, $|T \cdot T_0| = 1$ from the way in which $T$ and $T_0$ were defined, and $|T| = |T^{\perp}|$. Hence
$$v_n(K(u_1, \dots, u_t)) = |T^{\perp}|\, v_n(P(K, \mathcal{T}^{\perp}))$$
$= P(K, T^{\perp})$ by definition. Thus (20) is proved.

Relation (20) enables us to deduce immediately the properties of the system $\{K(u_1, \dots, u_t)\}$ from the properties of $P(K, R)$ given in [4] and [7]. For example, we see that if $n > 2$, $v_n(K(u_1, \dots, u_t))$ is not, in general, a convex function of $T$ for $t > 1$, but is, in certain special cases. For example it is so if $K$ is a simplex of at most $n + 2$ dimensions [7, p. 307] or if $K$ is a vector sum of line segments [4, p. 20]. The fact that $v_n(K(u_1, \dots, u_t))$ is, for all $K$, a convex function of each parameter $u_i$ separately corresponds to the fact that $P(K, T^{\perp})$ is a weakly convex function [4, p. 34], i.e. is convex on the generators of $G_n^{n+t}$.

REFERENCES
1. W. Blaschke, Kreis und Kugel. Veit, Leipzig, 1916. Reprint: Chelsea, New York, 1948.
2. T. Bonnesen and W. Fenchel, Theorie der konvexen Körper. Springer, Berlin, 1934. Reprint: Chelsea, New York, 1948.
3. H. Busemann and C. M. Petty, Problems on convex bodies. Math. Scand., 4 (1956), 88-94.
4. H. Busemann, G. Ewald and G. C. Shephard, Convex bodies and convexity on Grassmann cones, Parts I-IV. Math. Annalen, 151 (1963), 1-41.
5. G. Pólya and G. Szegő, Isoperimetric Inequalities in Mathematical Physics, Princeton University Press, 1951.
6. C. A. Rogers and G. C. Shephard, Some extremal problems for convex bodies. Mathematika, 5 (1958), 93-102.
7. G. C. Shephard, Convex bodies and convexity on Grassmann cones, Part VI. Jour. London Math. Soc., 39 (1964), 307-319.

UNIVERSITY OF BIRMINGHAM, BIRMINGHAM, ENGLAND
NONNEGATIVE MATRICES WITH STOCHASTIC POWERS* BY
DAVID LONDON ABSTRACT
Matrices with nonnegative elements, which are nonstochastic but have stochastic powers, are considered. These matrices are characterized in the irreducible case and in the symmetric one.

1. Introduction. In this paper we consider square matrices with nonnegative elements which themselves are not stochastic, but for which a certain power is stochastic. In §2 we deal with nonnegative irreducible matrices, and in §3 with nonnegative symmetric matrices. In each of these cases we obtain a characterization of the nonstochastic matrices of the corresponding class which have stochastic powers. Our characterizations are constructive and enable us to build effectively the corresponding matrices. A very special case of our second result, the characterization of all $3 \times 3$ nonnegative symmetric matrices $A$ which are nonstochastic, but for which $A^2$ is stochastic, was obtained earlier as a byproduct of the proof of a certain matrix inequality [2, Remark 3 following Theorem 1].

The main tool used in this paper is the Perron-Frobenius theorem [1, p. 53]. Let $A = (a_{ij})$ be an $n \times n$ nonnegative irreducible matrix. By the Perron-Frobenius theorem, $A$ has a dominant simple positive characteristic value $\alpha = \alpha(A)$. If $\alpha_1, \dots, \alpha_h = \alpha$ are all the characteristic values of $A$ with modulus $\alpha$, then $\alpha_k = \alpha\omega^k$, $k = 1, \dots, h$, where $\omega = e^{2\pi i/h}$. If $h = 1$, $A$ is primitive. If $h > 1$, $A$ is cyclic of index $h$. If $A$ is cyclic of index $h$, then there exists a permutation matrix $P$ such that

(1.1) $$PAP^T = \begin{pmatrix} 0 & A_1 & 0 & \cdots & 0 \\ 0 & 0 & A_2 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & A_{h-1} \\ A_h & 0 & 0 & \cdots & 0 \end{pmatrix}.$$
Received December 15, 1964.
* This paper represents part of a thesis submitted to the Senate of the Technion-Israel Institute of Technology in partial fulfillment of the requirements for the degree of Doctor of Science. The author wishes to thank Professor B. Schwarz for his guidance in the preparation of this paper.
The null matrices in the main diagonal are squares of orders $n_k$, $k = 1, \dots, h$. (1.1) is the Frobenius normal form of $A$. Let $r_1, \dots, r_h$ be the characteristic vectors of $PAP^T$ corresponding respectively to $\alpha_1, \dots, \alpha_h$. $r_h$ is positive ($r_h > 0$). Write $r_h = z_1 \dotplus \dots \dotplus z_h$, where $z_k$ is a vector of order $n_k$, and the symbol $\dotplus$ indicates direct sum. (If $u = (u_1, \dots, u_m)$ and $v = (v_1, \dots, v_n)$, then $u \dotplus v = (u_1, \dots, u_m, v_1, \dots, v_n)$.) We have
$$r_k = z_1 \dotplus \omega^k z_2 \dotplus \omega^{2k} z_3 \dotplus \dots \dotplus \omega^{(h-1)k} z_h, \quad k = 1, \dots, h.$$
We end this introduction by a definition. Let $B = (b_{ij})$ be a nonnegative $m \times n$ matrix. If
$$\sum_{j=1}^{n} b_{ij} = \beta, \quad i = 1, \dots, m,$$
then $B$ is $\beta$ stochastic or generalized stochastic. If $\beta = 1$ then $B$ is stochastic. We remark that usually this definition is given only for square matrices. However, for our purpose it is convenient to use it for rectangular matrices.

2. Nonnegative irreducible matrices. Let $A$ be a nonnegative irreducible matrix which is not stochastic. In this section we obtain a necessary and sufficient condition for some power of $A$ to be stochastic.

THEOREM 1. Let $A$ be a nonnegative irreducible square matrix and let $m > 1$ be a positive integer. Let $H$ be the cyclic permutation
$$H = (1\ 2\ \dots\ h),$$
and let

(2.1) $$H^m = C_1 C_2 \cdots C_s$$

be the representation of $H^m$ as the product of disjoint cycles. $A$ is not a stochastic matrix while $A^m$ is stochastic if and only if

(I) $A$ is cyclic of index $h$, where $(h, m) > 1$.

(II) There exist positive numbers $\beta_i$, $i = 1, \dots, h$, such that the matrices $A_i$ appearing in the Frobenius normal form (1.1) of $A$ are respectively $\beta_i/\beta_{i+1}$ stochastic.(*) The numbers $\beta_i$ fulfill the following two conditions:
(A) They are not all equal.
(B) Every two numbers with indices belonging to the same cycle in (2.1) are equal.

Proof. First we prove that the conditions (I) and (II) are necessary. Let $A$ be a nonnegative irreducible matrix which is not stochastic while $A^m$ is stochastic.

* Here and in the sequel the indices are taken modulo $h$.
As $A^m$ is stochastic, 1 is the dominant characteristic value of $A^m$ and $e = (1, \dots, 1)$ is a corresponding characteristic vector. Returning to $A$, it follows that 1 is the dominant characteristic value of $A$. As $A$ is not stochastic, $e$ is not a characteristic vector of $A$. Assume $A$ is primitive. Then 1 is a simple characteristic value of $A^m$ and the only characteristic vector of $A^m$ corresponding to the characteristic value 1 is the characteristic vector of $A$ corresponding to 1. But as this vector is different from $e$, it follows that $A$ cannot be primitive. Hence, $A$ is cyclic and can be represented by the Frobenius normal form (1.1).

As $PAP^T$ is only a cogredient permutation of the rows and columns of $A$, we may replace in the above considerations $A$ and $A^m$ respectively with $PAP^T$ and $PA^mP^T$. Let $\alpha_1, \dots, \alpha_h = 1$ be all the characteristic values of $PAP^T$ (or of $A$) with modulus 1, and let $r_1, \dots, r_h$ be the corresponding characteristic vectors. As quoted in §1 we have

(2.2) $$\alpha_k = \omega^k, \quad \omega = e^{2\pi i/h}, \quad k = 1, \dots, h,$$

(2.3) $$r_k = z_1 \dotplus \omega^k z_2 \dotplus \omega^{2k} z_3 \dotplus \dots \dotplus \omega^{(h-1)k} z_h, \quad k = 1, \dots, h.$$

As $e$ is a characteristic vector of $PA^mP^T$ corresponding to 1 while it is not a characteristic vector of $PAP^T$, there exist integers $k_1, \dots, k_l$, $1 \le k_1 < k_2 < \dots < k_l \le h$, $l > 1$, such that

(2.4) $$\omega^{mk_1} = \omega^{mk_2} = \dots = \omega^{mk_l} = 1,$$

and also numbers $d_1, \dots, d_l$ such that

(2.5) $$d_1 r_{k_1} + d_2 r_{k_2} + \dots + d_l r_{k_l} = e.$$

(2.3) and (2.5) imply

(2.6) $$z_1(d_1 + \dots + d_l) \dotplus \dots \dotplus z_h(d_1\omega^{(h-1)k_1} + \dots + d_l\omega^{(h-1)k_l}) = e.$$

Let $e_i = (1, \dots, 1)$, $i = 1, \dots, h$, be a vector of order $n_i$. From (2.3), (2.6) and the fact that $r_h > 0$ it follows that there exist positive numbers $\beta_1, \dots, \beta_h$ such that

(2.7) $$r_h = \beta_1 e_1 \dotplus \beta_2 e_2 \dotplus \dots \dotplus \beta_h e_h.$$

As $r_h$ is a characteristic vector of $PAP^T$ corresponding to the characteristic value 1, we obtain from (1.1) and (2.7)
$$PAP^T r_h = \beta_2 A_1 e_2 \dotplus \dots \dotplus \beta_h A_{h-1} e_h \dotplus \beta_1 A_h e_1 = \beta_1 e_1 \dotplus \dots \dotplus \beta_h e_h.$$
Hence,

(2.8) $$A_i e_{i+1} = \frac{\beta_i}{\beta_{i+1}}\, e_i, \quad i = 1, \dots, h.$$

From (2.8) follows that $A_i$ is a $\beta_i/\beta_{i+1}$ stochastic matrix.
240
DAVID LONDON
[December
We have now to show that the $\beta_i$ fulfill the conditions (A) and (B). $PAP^T$ is not stochastic and therefore not all the matrices $A_i$ are stochastic. As $A_i$ is $\beta_i/\beta_{i+1}$ stochastic, it follows that not all the numbers $\beta_i$ are equal. (A) is thus proved. To prove (B) denote the blocks of $PAP^T$ in the partitioning (1.1) by $A_{ij}$, $i, j = 1, \dots, h$, and the blocks of $PA^mP^T$ in the same partitioning by $A^{(m)}_{ij}$. We have

(2.9) $A_{ij} = \begin{cases} A_i, & j \equiv i+1 \pmod h \\ 0, & j \not\equiv i+1 \pmod h \end{cases}$

and

(2.10) $A^{(m)}_{ij} = \sum_{k_1, \dots, k_{m-1} = 1}^{h} A_{i k_1} A_{k_1 k_2} \cdots A_{k_{m-1} j}$.

From (2.9) and (2.10) follows

(2.11) $A^{(m)}_{ij} = \begin{cases} A_i A_{i+1} \cdots A_{i+m-1}, & j \equiv i+m \pmod h \\ 0, & j \not\equiv i+m \pmod h. \end{cases}$

As $PA^mP^T$ is stochastic, all the matrices $A^{(m)}_{i,i+m}$ are stochastic. As $A_i$ is $\beta_i/\beta_{i+1}$ stochastic, it follows from (2.11) that $A^{(m)}_{i,i+m}$ is

$\dfrac{\beta_i}{\beta_{i+1}} \cdot \dfrac{\beta_{i+1}}{\beta_{i+2}} \cdots \dfrac{\beta_{i+m-1}}{\beta_{i+m}} = \dfrac{\beta_i}{\beta_{i+m}}$

stochastic. Hence,

(2.12) $\beta_i = \beta_{i+m}$, $i = 1, \dots, h$.
The permutation $H^m$ carries i into i + m and therefore i and i + m belong to the same cycle in (2.1), and so (2.12) is equivalent to (B). (B) is thus proved, and the proof of (II) is established. We have already proved that A is cyclic of index h. To complete the proof of (I), we have to show that (h, m) > 1. This fact follows easily from (2.2) and (2.4). The proof of the necessity part of the theorem is completed.

We now prove that the conditions (I) and (II) are sufficient. Let A be a matrix which fulfills the conditions (I) and (II). From (A) it follows that $PAP^T$, and therefore A too, is not stochastic. According to (2.11) the matrix $A^{(m)}_{i,i+m}$ is $\beta_i/\beta_{i+m}$ stochastic. (B) is equivalent to $\beta_i = \beta_{i+m}$, and so $A^{(m)}_{i,i+m}$ is stochastic. Hence $PA^mP^T$, and therefore $A^m$, is stochastic. The proof of Theorem 1 is thus completed.

REMARK 1. We have $H^h = (1)(2)\cdots(h)$, and so for m = h the condition (B) holds for any $\beta_i$. In this case it is thus sufficient that the condition (A) holds. From this we conclude that if A is a nonstochastic cyclic matrix of index h and if $A^m$ is stochastic, then $A^h$ is also stochastic and so is any power $A^{m_1}$, where $m_1 \equiv m \pmod h$.
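The conditions of Theorem 1 can be checked on a small numerical instance. The following is only an illustrative sketch: the helper names (`cyclic_matrix`, `is_stochastic`, and so on), the choice h = 4, the values of the $\beta_i$ and the block sizes are assumptions made for the example and are not taken from the paper. Each block $A_i$ is filled with a constant so that its row sums equal $\beta_i/\beta_{i+1}$, and exact rational arithmetic is used.

```python
from fractions import Fraction


def cyclic_matrix(betas, sizes):
    """Matrix in cyclic block form: block (i, i+1 mod h) has constant entries
    and row sums betas[i] / betas[i+1]; all other blocks are zero."""
    h = len(betas)
    offsets = [sum(sizes[:i]) for i in range(h)]
    n = sum(sizes)
    A = [[Fraction(0)] * n for _ in range(n)]
    for i in range(h):
        j = (i + 1) % h
        entry = Fraction(betas[i], betas[j] * sizes[j])
        for r in range(sizes[i]):
            for c in range(sizes[j]):
                A[offsets[i] + r][offsets[j] + c] = entry
    return A


def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def power(A, m):
    """A multiplied by itself m times (m >= 1)."""
    P = A
    for _ in range(m - 1):
        P = matmul(P, A)
    return P


def is_stochastic(M):
    """True if every row of the (nonnegative) matrix M sums to exactly 1."""
    return all(sum(row) == 1 for row in M)


# h = 4 with beta_1 = beta_3 and beta_2 = beta_4, not all equal: for m = 2 the
# cycles of H^2 are (1 3)(2 4), so conditions (A) and (B) both hold.
A = cyclic_matrix(betas=[1, 2, 1, 2], sizes=[1, 2, 1, 2])
stochastic_powers = [m for m in range(1, 9) if is_stochastic(power(A, m))]
print(is_stochastic(A), stochastic_powers)
```

In this instance A itself is not stochastic, and the stochastic powers among $A, \dots, A^8$ are exactly the even ones, in agreement with Remark 1 (the exponents m = 2 and m = h = 4 and everything congruent to them modulo h).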
REMARK 2. Let m and h be positive integers with (m, h) > 1. By the sufficient conditions of Theorem 1 we can construct all the matrices A which are nonstochastic and cyclic of order h, and for which $A^m$ is stochastic. As (m, h) > 1, there is more than one cycle in the representation (2.1), and so we can find positive numbers $\beta_i$, $i = 1, \dots, h$, for which both (A) and (B) hold. Let $A_i$, $i = 1, \dots, h$, be $\beta_i/\beta_{i+1}$ stochastic matrices chosen so that their dimensions fit the structure of (1.1) and so that A is cyclic of index h. Using the $A_i$'s, we construct A according to (1.1).

3. Nonnegative symmetric matrices. Let A be a nonnegative symmetric matrix which is not stochastic. In this section we obtain a necessary and sufficient condition for some powers of A to be stochastic. Let us first define a class of matrices $\mathfrak{A}_n$. A matrix A belongs to the class $\mathfrak{A}_n$ if and only if A is an n × n nonnegative symmetric matrix, A is not stochastic and there exists a natural number m for which $A^m$ is stochastic. Let $A \in \mathfrak{A}_n$. $A^m$ is thus stochastic while A is not stochastic. It is necessary that the multiplicity of the dominant characteristic value of $A^m$ is greater than the multiplicity of the dominant characteristic value of A. Hence, m is even. As the multiplicity of the dominant characteristic value of $A^m$ is equal for all the even m's, it follows that if $A \in \mathfrak{A}_n$, then $A^m$ is stochastic if and only if m is even. In the following theorem we characterize the classes $\mathfrak{A}_n$ by a recursive procedure. The structure of the class $\mathfrak{A}_n$ is determined by the structure of the classes $\mathfrak{A}_m$, m < n. As the class $\mathfrak{A}_1$ is void, we can by this procedure determine the structure of $\mathfrak{A}_n$ for any n.

THEOREM 2. Let A be an n × n matrix.
(1) If A is reducible, then $A \in \mathfrak{A}_n$ if and only if there exists a permutation matrix P such that

(3.1) $A = P^T \begin{pmatrix} B_k & 0 \\ 0 & B_{n-k} \end{pmatrix} P$.

k is an integer for which the inequality

(3.2) $\dfrac{n}{2} \le k < n$

holds. $B_k$ and $B_{n-k}$ are respectively k × k and (n − k) × (n − k) matrices, and at least one of the following two conditions

(3.3) $B_k \in \mathfrak{A}_k$, $B_{n-k} \in \mathfrak{A}_{n-k}$

holds. If only one of these conditions holds, then the matrix for which the condition does not hold is symmetric and stochastic.
(2) If A is irreducible, then $A \in \mathfrak{A}_n$ if and only if there exists a permutation matrix P such that

(3.4) $A = P^T \begin{pmatrix} 0 & A_1 \\ A_1^T & 0 \end{pmatrix} P$.

0 indicates square null matrices, k is an integer for which the inequality

(3.5) $\dfrac{n}{2} < k < n$

holds. $A_1$ is a k × (n − k) matrix which is $[(n-k)/k]^{1/2}$ stochastic and its transpose $A_1^T$ is $[k/(n-k)]^{1/2}$ stochastic.

Proof of (1). First we prove the necessity part. Let A be a reducible matrix belonging to $\mathfrak{A}_n$. As A is reducible and symmetric, there exists a permutation matrix P for which (3.1) holds, where $B_k$ and $B_{n-k}$ are symmetric matrices. It is obvious that P can be chosen so that (3.2) holds. We have

$A^2 = P^T \begin{pmatrix} B_k^2 & 0 \\ 0 & B_{n-k}^2 \end{pmatrix} P$.

As $A \in \mathfrak{A}_n$, $A^2$ is stochastic and therefore $B_k^2$ and $B_{n-k}^2$ are both stochastic. As A is nonstochastic, at least one of the two matrices $B_k$ and $B_{n-k}$ is nonstochastic. For the matrix which is nonstochastic the corresponding condition in (3.3) holds. If the other matrix is also nonstochastic, then (3.3) holds for this matrix too. If the other matrix is stochastic, then it is symmetric and stochastic. It is easy to verify that the conditions are also sufficient.

Proof of (2). Let us begin with the necessity part. Let A be an irreducible matrix belonging to $\mathfrak{A}_n$. According to Theorem 1, A is cyclic. As A is symmetric, it is cyclic of index 2 and so there exists a permutation matrix P for which (3.4) holds. It is obvious that P can be chosen so that (3.2) holds. According to Theorem 1 there exist positive numbers $\beta_1$ and $\beta_2$, $\beta_1 \ne \beta_2$, such that $A_1$ is $\beta_1/\beta_2$ stochastic and $A_1^T$ is $\beta_2/\beta_1$ stochastic. As $A_1$ is a k × (n − k) matrix, summing all its elements once by rows and once by columns we obtain

$k\,\dfrac{\beta_1}{\beta_2} = (n-k)\,\dfrac{\beta_2}{\beta_1}$.

Hence,

$\dfrac{\beta_1}{\beta_2} = \left[\dfrac{n-k}{k}\right]^{1/2}$.

$A_1$ is thus $[(n-k)/k]^{1/2}$ stochastic and $A_1^T$ is $[k/(n-k)]^{1/2}$ stochastic. As $\beta_1 \ne \beta_2$, it follows that the sign of equality in the left-hand side of (3.2) does not hold, and so (3.5) holds.
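The scaling just derived can be checked numerically. The sketch below is not taken from the paper: it assumes the simplest admissible block, namely $A_1$ with all entries equal to $1/\sqrt{k(n-k)}$, and the function names are hypothetical. It builds the matrix of the form (3.4) with P the identity and compares row sums in floating-point arithmetic.

```python
import math


def bipartite_symmetric(n, k):
    """Symmetric matrix [[0, A1], [A1^T, 0]] where A1 is k x (n-k) with all
    entries 1/sqrt(k*(n-k)) (an assumed, particularly simple choice of A1)."""
    entry = 1.0 / math.sqrt(k * (n - k))
    A = [[0.0] * n for _ in range(n)]
    for i in range(k):
        for j in range(k, n):
            A[i][j] = entry
            A[j][i] = entry
    return A


def row_sums(M):
    return [sum(row) for row in M]


def square(M):
    """Matrix product M * M."""
    cols = list(zip(*M))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in M]


def all_close(values, target):
    return all(math.isclose(v, target, rel_tol=1e-12, abs_tol=1e-12) for v in values)


n, k = 5, 3                      # satisfies n/2 < k < n, as required by (3.5)
A = bipartite_symmetric(n, k)
sums = row_sums(A)
print(sums[:k], sums[k:])        # sqrt((n-k)/k) for A1, sqrt(k/(n-k)) for A1^T
print(row_sums(square(A)))       # every row of A^2 sums to 1 up to rounding
```

The first k row sums equal $[(n-k)/k]^{1/2}$ and the remaining ones $[k/(n-k)]^{1/2}$, so A is not stochastic, while every row of $A^2$ sums to 1.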
The sufficiency part follows by direct computation of $PA^2P^T$. The proof of Theorem 2 is completed.

We shall now discuss the structure of the classes $\mathfrak{A}_n$ for n up to 4.

n = 1. As already mentioned $\mathfrak{A}_1$ is void.

n = 2. (1) A reducible. (3.2) implies k = 1, n − k = 1. As $\mathfrak{A}_1$ is void, the condition (3.3) cannot be fulfilled, and so there are no reducible matrices in $\mathfrak{A}_2$. (2) A irreducible. No natural k exists for which (3.5) holds, and so there are no irreducible matrices in $\mathfrak{A}_2$. Conclusion: $\mathfrak{A}_2$ is void.

n = 3. (1) A reducible. (3.2) implies k = 2, n − k = 1. As the classes $\mathfrak{A}_1$ and $\mathfrak{A}_2$ are void, the condition (3.3) cannot be fulfilled, and therefore there are no reducible matrices in $\mathfrak{A}_3$. (2) A irreducible. (3.5) implies k = 2, n − k = 1. $A_1$ is a 2 × 1, $1/\sqrt{2}$ stochastic matrix, and A has the following form

(3.6) $A = P^T \begin{pmatrix} 0 & 0 & \frac{1}{\sqrt{2}} \\ 0 & 0 & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \end{pmatrix} P$.

There exist 3 distinct matrices of the form (3.6). Conclusion: $\mathfrak{A}_3$ includes precisely the three matrices given by (3.6). From this conclusion follows the result mentioned in the introduction.

n = 4. (1) A reducible. (3.2) implies k = 2; 3 and so n − k = 2; 1 respectively. As $\mathfrak{A}_2$ is void, there remains only the possibility k = 3, n − k = 1. Let $B_3$ be one of the three matrices belonging to $\mathfrak{A}_3$. A has the following form

(3.7) $A = P^T \begin{pmatrix} B_3 & 0 \\ 0 & 1 \end{pmatrix} P$.
There are 12 distinct matrices of the form (3.7). (2) A irreducible. (3.5) implies k = 3, n − k = 1. $A_1$ is a 3 × 1, $1/\sqrt{3}$ stochastic matrix, and A has the following form

(3.8) $A = P^T \begin{pmatrix} 0 & 0 & 0 & \frac{1}{\sqrt{3}} \\ 0 & 0 & 0 & \frac{1}{\sqrt{3}} \\ 0 & 0 & 0 & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} & 0 \end{pmatrix} P$.

There are 4 distinct matrices of the form (3.8). Conclusion: $\mathfrak{A}_4$ includes 12 reducible matrices given by (3.7) and 4 irreducible matrices given by (3.8).

REFERENCES
1. F. R. Gantmacher, The Theory of Matrices, Vol. 2, Chelsea, New York, 1959.
2. D. London, Two inequalities in nonnegative symmetric matrices (to appear in the Pacific J. Math.).

TECHNION-ISRAEL INSTITUTE OF TECHNOLOGY, HAIFA, ISRAEL
ON PROJECTIONS AND SIMULTANEOUS EXTENSIONS*
BY
DAN AMIR

ABSTRACT
Let $X_1$, $X_2$ be subspaces of a completely regular space X. The bounded linear extensions of $C(X_1 \cap X_2)$ into C(X) are related to the projections of norm < 3 from $C(X_1) \oplus C(X_2)$ onto C(X).

1. Introduction. C(X) denotes the Banach space of all bounded continuous real-valued functions on a topological space X, with the supremum norm. If B is a subspace of X, a simultaneous extension is a linear operator E from C(B) to C(X), such that for each f in C(B), Ef is an extension of f. If R denotes the restriction operator of C(X) to C(B), then a simultaneous extension is a linear right inverse of R. When a bounded simultaneous extension exists, C(B) is isomorphic to the subspace EC(B) of C(X), and P = ER is a projection (all "projections" in this paper are linear and bounded) of C(X) onto this subspace. If X is metric and B is closed in X, then there exists a simultaneous extension E of C(B) to C(X) with norm 1 [2]. In the general case a bounded simultaneous extension may fail to exist: Let $X = \beta N$ (the Stone-Čech compactification of the discrete sequence N), and $B = \beta N - N$. As proved in [1], $C(\beta N - N)$ is not isomorphic to a direct factor of $C(\beta N)$. Corson and Lindenstrauss [4] found recently, for every k > 1, a pair $B \subset X$ of compact Hausdorff spaces, such that there is a simultaneous extension of C(B) to C(X) with norm k, but none with smaller norm.

Another simple relation between projections and simultaneous extensions was observed by Dean [3]: Let $C_0(X, B)$ denote the subspace of C(X) of functions vanishing on B. If E is a bounded simultaneous extension of C(B) to C(X), then I − ER is a projection of C(X) onto $C_0(X, B)$. If R has a bounded (not necessarily linear) right inverse Q on C(B) (e.g. when B is closed and X is normal, by Tietze's theorem), the converse is also true: If P is such a projection, define E = Q − PQ. E does not depend on the choice of Q (if Q' is another right inverse of R, then (Q − PQ) − (Q' − PQ') = (Q − Q') − P(Q − Q') = 0), and is a bounded linear extension.

In this paper we study a less immediate relation between projections and simultaneous extensions: Suppose we have two pairs: $B_1 \subset X_1$, $B_2 \subset X_2$ and a homeomorphism h of $B_1$ onto $B_2$. We can "paste" the spaces $X_1$ and $X_2$ along the $B_i$ by identifying all the points s in $B_1$ with the corresponding hs in $B_2$, the quotient space X having the quotient topology. C(X) is naturally identified as a subspace of $C(X_1) \oplus C(X_2)$. The theorem relates projections of $C(X_1) \oplus C(X_2)$ onto C(X) to simultaneous extensions from C(B) to C(X), where B is the image of the $B_i$ under the quotient map. This is done for the case where the $B_i$ are closed and nowhere dense, the $X_i$ are completely regular, and there exist norm preserving extensions $Q_i$ (not necessarily linear) of $C(B_i)$ to $C(X_i)$ ($R_iQ_i$ is the identity on $C(B_i)$) (i = 1, 2).

Received January 28, 1965.
* This work was supported in part by National Science Foundation grant NSF GP-2026.

2. THEOREM. (In the conditions specified above) if P is a projection of $C(X_1) \oplus C(X_2)$ onto C(X) with $\|P\| < 3$, then there is a simultaneous extension E of C(B) to C(X), with

$\|E\| \le (\|P\| - 1)/(3 - \|P\|)$.
Proof. For f in C(B) define $W_1 f$ as the restriction of $P(0 \oplus Q_2 f)$ to $X_1$ ($f_1 \oplus f_2$ denotes the function which is $f_1$ on $X_1$ and $f_2$ on $X_2$). $W_1 f$ is independent of the choice of $Q_2$: if $Q_2'$ is another extension, then $P(0 \oplus Q_2 f) - P(0 \oplus Q_2' f) = P[0 \oplus (Q_2 - Q_2')f] = 0 \oplus (Q_2 - Q_2')f$ vanishes on $X_1$. $W_1$ is a linear operator from C(B) to $C(X_1)$; $R_1W_1$ is a linear operator from C(B) into itself. Define $W_2$ symmetrically. For each f in C(B) we have:

(1) $(R_1W_1 + R_2W_2)f = R[P(Q_1 f \oplus 0) + P(0 \oplus Q_2 f)] = R(Q_1 f \oplus Q_2 f) = f$.

We shall give now some bounds for $W_i f$: Let f be in C(B) with $\|f\| = 1$, and x in B. Let $\varepsilon > 0$ be arbitrary. In the open set $U = \{s \in X_1 - B;\ |Q_1 f(s) - f(x)| < \varepsilon\}$ choose a point t that satisfies also: $|P(Q_1 f \oplus 0)(t) - P(Q_1 f \oplus 0)(x)| < \varepsilon$. Take a function g in C(X) such that $0 \le g \le 1$, g(t) = 1 and g vanishes out of U. Consider the function $F = (-Q_1 f \oplus Q_2 f) + [1 + f(x)]g$ which belongs to $C(X_1) \oplus C(X_2)$ and satisfies $|F| \le 1 + \varepsilon$. Since

$(PF)(t) = (Q_1 f \oplus Q_2 f)(t) - 2P(Q_1 f \oplus 0)(t) + [1 + f(x)]g(t) = 1 + 2f(x) - 2W_2 f(x) + [Q_1 f(t) - f(x)] - 2[P(Q_1 f \oplus 0)(t) - P(Q_1 f \oplus 0)(x)]$,

by the choice of t, we have: $(1 + \varepsilon)\|P\| \ge (PF)(t) \ge 1 + 2f(x) - 3\varepsilon - 2W_2 f(x)$, and as $\varepsilon$ was arbitrary, we can conclude (by symmetry) that

(2) $W_i f(x) \ge f(x) - \tfrac12(\|P\| - 1)$ for all $x \in B$; $f \in C(B)$ with $|f| \le 1$ and i = 1, 2.

Combining (1) and (2), we get also the upper bounds: $W_i f(x) \le \tfrac12(\|P\| - 1)$. A very similar method gives us the same upper bound for $x \in X_i - B$: Take $g \in C(X)$ such that $0 \le g \le 1$, g(x) = 1 and g vanishes out of $\{s \in X_1 - B;\ |Q_1 f(s) - Q_1 f(x)| < \varepsilon\}$. Consider the function $F = (-Q_1 f \oplus Q_2 f) + [1 + Q_1 f(x)]g$. $|F| \le 1 + \varepsilon$ and $(PF)(x) = -(Q_1 f \oplus Q_2 f)(x) + 2W_1 f(x) + [1 + Q_1 f(x)]g(x) = 1 + 2W_1 f(x)$, hence $W_1 f(x) \le \tfrac12(\|P\| - 1)$ whenever $|f| \le 1$. Using (−f) instead of f, we get:

(3) $|W_i f(x)| \le \tfrac12(\|P\| - 1)$ for all $x \in X_i$; $f \in C(B)$ with $|f| \le 1$; i = 1, 2.

If $\|f\| = 1$, $\sup\{|W_i f(x)|;\ x \in B\} \ge \sup\{|f(x)|;\ x \in B\} - \tfrac12(\|P\| - 1) = \tfrac12(3 - \|P\|)$ (by (2)); combining these results we get:

(4) $0 < \tfrac12(3 - \|P\|)$ [...] n > 0, $f_n = f_{n-1} + (R_1W_1)(f - f_{n-1})$.

All the $f_n$ are evidently in $R_1W_1C(B)$. We prove by induction that $f - f_n = (R_2W_2)^n f$: this is obvious for n = 0, and for n > 0: $f - f_n = (f - f_{n-1}) - (R_1W_1)(f - f_{n-1}) = (R_2W_2)(f - f_{n-1}) = (R_2W_2)^n f$ by (1). This implies that the $f_n$ converge uniformly to f, and as $R_1W_1C(B)$ is closed, $f \in R_1W_1C(B)$. By symmetry we have also

$R_2W_2C(B) = C(B)$.
As $R_iW_i$ is one-to-one, $R_i$ must be one-to-one on $W_iC(B)$, and this establishes a simultaneous extension $E_i = R_i^{-1}$ of C(B) to $W_iC(B) \subset C(X_i)$. $Ef = E_1 f \oplus E_2 f$ is a simultaneous extension of C(B) to C(X); its norm is, by (4), not larger than $(\|P\| - 1)/(3 - \|P\|)$. Q.E.D.

3. REMARK. 1) If B is not empty, then necessarily $\|P\| \ge 2$. If $\|P\| = 2$, then the extension E is norm preserving.

2) If we have bounded simultaneous extensions $E_i$ of C(B) to $C(X_i)$, a projection P of norm $1 + 2\|E_1\|\|E_2\|/(\|E_1\| + \|E_2\|)$ exists: for $f \oplus g$ in $C(X_1) \oplus C(X_2)$, define:

$P(f \oplus g) = \left[f - \dfrac{\|E_2\|}{\|E_1\| + \|E_2\|}\, E_1(R_1 f - R_2 g)\right] \oplus \left[g - \dfrac{\|E_1\|}{\|E_1\| + \|E_2\|}\, E_2(R_2 g - R_1 f)\right]$.

If we have $\|E_1\| = 1$, then $\|P\| = 1 + 2\|E_2\|/(1 + \|E_2\|)$, hence $\|E_2\| = (\|P\| - 1)/(3 - \|P\|)$. This shows that the bound in the theorem is the best.

3) The projection in 2) can be defined also when $\|E_2\| = \infty$, and we get a projection of norm $1 + 2\|E_1\|$. If $\|E_1\| = 1$ (e.g. $X_2 = \beta N$, $B_2 = \beta N - N$, $X_1 = [\beta N - N] \times [0, 1]$ and $B_1 = [\beta N - N] \times \{0\}$), we get a projection of norm 3, but there is no bounded simultaneous extension of C(B) to C(X). "Reflecting" by an open and closed subset of $\beta N - N$, we get an example where a projection of norm 3 exists, but there is no bounded simultaneous extension of C(B) to any of the $C(X_i)$.
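The arithmetic behind the sharpness claim in Remark 2) can be verified with exact rational numbers. This is only a sketch: the function names and the sample values of $\|E_2\|$ are assumptions chosen for illustration, and the norms are treated as plain numbers rather than computed from operators.

```python
from fractions import Fraction


def projection_norm(e1, e2):
    """Norm 1 + 2*e1*e2/(e1 + e2) of the projection described in Remark 2)."""
    return 1 + 2 * e1 * e2 / (e1 + e2)


def extension_bound(p):
    """Bound (p - 1)/(3 - p) from the theorem, defined for p < 3."""
    return (p - 1) / (3 - p)


checks = []
for e2 in [Fraction(1), Fraction(3, 2), Fraction(7), Fraction(1000)]:
    p = projection_norm(Fraction(1), e2)   # the case ||E_1|| = 1
    checks.append((e2, p, extension_bound(p)))

for e2, p, bound in checks:
    print(f"||E_2|| = {e2}, ||P|| = {p}, bound = {bound}")
```

In every row the bound recovered from $\|P\|$ equals the $\|E_2\|$ that was put in, and $\|P\|$ stays in the interval [2, 3), approaching 3 as $\|E_2\|$ grows; this matches Remarks 1) and 3).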
4) In the symmetric case, when h can be extended to a homeomorphism H of $X_1$ onto $X_2$, we get easily a better extension: $W_1 f + W_2 f H$. Its norm is $\|P\| - 1$. Conversely, for an extension E, the projection defined in 2) has norm $\|E\| + 1$.

REFERENCES
1. D. Amir, Projections onto continuous function spaces, Proc. Amer. Math. Soc., 15 (1964), 396-402.
2. J. Dugundji, An extension of Tietze's theorem, Pacific Jour. of Math., 1 (1951), 353-367.
3. D. W. Dean, Subspaces of C(H) which are direct factors of C(H), Notices of the Amer. Math. Soc., 11 (1964), 344.
4. H. H. Corson and J. Lindenstrauss, On simultaneous extension of continuous functions (to appear).

UNIVERSITY OF CALIFORNIA, BERKELEY, CALIFORNIA
A PROOF OF THE DVORETZKY-ROGERS THEOREM
BY
IVAN SINGER

ABSTRACT
We give a new proof of the famous Dvoretzky-Rogers theorem ([2], Theorem 1), according to which a Banach space E is finite-dimensional if every unconditionally convergent series in E is absolutely convergent.

It is clearly sufficient to prove the theorem for a separable Banach space E. Now, if every unconditionally convergent series in E is absolutely convergent, then every continuous linear mapping u of E into $(l^1)^* = m$ is integral(*). Since E is separable, let u be an isometrical embedding ([1], Theorem a)) of E into m. Then the "astriction" $v = u \colon E \to u(E)$ of u is also integral, whence there exist(**) a Hilbert space H and continuous linear mappings $w_1 \colon E \to H$, $w_2 \colon H \to u(E)$ such that $v = w_2 w_1$. Since v is onto, $w_2$ must be onto, whence the isomorphisms (linear homeomorphisms)

$E \cong u(E) \cong H/\operatorname{Ker} w_2 \cong H_1$,

where $H_1$ is a Hilbert space. Thus, by our hypothesis, every unconditionally convergent series in $H_1$ is absolutely convergent. It is trivial that in this case we have $\dim H_1 < \infty$ (since e.g. the series $\sum_{n=1}^{\infty} x_n$ in $l^2$, where $x_n = (0, \dots, 0, 1/n, 0, \dots)$ with $n - 1$ zeros before the entry $1/n$, is unconditionally convergent, but not absolutely convergent), whence also $\dim E < \infty$. This completes the proof.

REMARK. Our proof is, in a certain sense, dual to that of Grothendieck ([3], pp. 149-150). In fact, he considers, for every sequence $\{f_n\} \subset E^*$ with $\|f_n\| \le 1$, the mapping $w \colon l^1 \to E^*$ defined by

$w(\{\xi_i\}) = \sum_{i=1}^{\infty} \xi_i f_i$,

while in the above proof we consider the isometry $u \colon E \to m$ defined by $u(x) = \{f_n(x)\}$ ($x \in E$),

Received February 7, 1965.
* See e.g. [3], p. 148, Lemma 15; for an elementary proof, see [4], p. 162.
** See e.g. [3], p. 163; for an elementary proof, see [4], p. 160, proposition 2 (this latter proof is valid only for real spaces and mappings u with values in conjugate spaces, but it is easy to adapt it to the general case).
where $\{f_n\} \subset E^*$ is a fixed $\sigma(E^*, E)$-dense sequence in $\{f \in E^* \mid \|f\| \le 1\}$ (see [1], the proof of theorem a)), and for such $\{f_n\}$ it is clear that $w^*|E = u$. However, the proof of Grothendieck makes use of two results on integral operators(*) and of the Eberlein theorem on reflexivity, while the above proof is perhaps slightly more simple.

REFERENCES
1. S. Banach and S. Mazur, Zur Theorie der linearen Dimension, Studia Math., 4 (1933), 100-112.
2. A. Dvoretzky and C. A. Rogers, Absolute and unconditional convergence in normed linear spaces, Proc. Nat. Acad. Sci., 36 (1950), 192-197.
3. A. Grothendieck, Produits tensoriels topologiques et espaces nucléaires, Mem. Amer. Math. Soc., no. 16 (1955).
4. A. Pełczyński, A proof of the theorem of Grothendieck on the characterization of nuclear spaces (Russian), Prace Mat., 7 (1962), 155-167.

INSTITUT DE MATHEMATIQUE, ACADEMIE DE LA R. P. ROUMAINE

* Namely, of: (1) every integral operator is weakly compact ([3], p. 131, theorem 9, 1°) and (2) every integral mapping into a reflexive space is compact ([3], p. 134, Corollary 2).
ON THE MULTIPLICATIVE REPRESENTATION OF INTEGERS
BY
P. ERDŐS

Dedicated to my friend A. D. Wallace on the occasion of his 60th birthday.

ABSTRACT
Let $a_1 < a_2 < \dots$ be an infinite sequence of integers. Denote by g(n) the number of solutions of $n = a_i a_j$. If g(n) > 0 for a sequence n of positive upper density then $\limsup g(n) = \infty$.

Let $a_1 < a_2 < \dots$ be an infinite sequence of integers and denote by f(n) the number of solutions of $n = a_i + a_j$. An old conjecture of Turán and myself states that if f(n) > 0 for all $n > n_0$ then $\limsup_{n = \infty} f(n) = \infty$. A stronger conjecture (which nevertheless might be easier to attack) states that if $a_k < ck^2$ then $\limsup_{n = \infty} f(n) = \infty$. Both these conjectures seem rather deep. I could only prove that $a_k < ck^2$ implies that the sums $a_i + a_j$ can not all be different [6] ($c, c_1, c_2, \dots$ denote absolute constants). In view of the difficulty of these conjectures it is perhaps surprising that the multiplicative analogues of these conjectures, though definitely non-trivial, are not too hard to settle. In fact I shall prove the following.
THEOREM 1. Let $b_1 < b_2 < \dots$ be an infinite sequence of integers. Denote by g(n) the number of solutions of $n = b_i b_j$. Then

(1) $g(n) > 0$ for all $n > n_0$

implies

(2) $\limsup_{n = \infty} g(n) = \infty$.
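For a finite set of integers the function g can be tabulated directly. The sketch below is illustrative only: it assumes that a solution of $m = b_i b_j$ is counted as an unordered pair with $i \le j$ (the text does not state the convention explicitly), and the function name and sample sets are hypothetical.

```python
from collections import Counter


def representation_counts(b):
    """Return a Counter g with g[m] = number of pairs i <= j such that
    b[i] * b[j] == m, for a finite collection b of positive integers."""
    b = sorted(set(b))
    g = Counter()
    for i, x in enumerate(b):
        for y in b[i:]:          # y runs over b[i], b[i+1], ... so i <= j
            g[x * y] += 1
    return g


primes = [2, 3, 5, 7, 11, 13]
g_primes = representation_counts(primes)
g_all = representation_counts(range(1, 13))

print(max(g_primes.values()))    # products of two primes are all distinct
print(g_all[12], g_all[36])      # 12 = 1*12 = 2*6 = 3*4, 36 = 3*12 = 4*9 = 6*6
```

For the primes every product is represented exactly once, whereas for the full interval 1, ..., 12 some integers already have three representations; the theorems below quantify how large a set can be before such repetitions are forced.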
Define

$B(x) = \sum_{b_i \le x} 1$.

A well known theorem of Raikov [5] states that (1) implies that for infinitely many x

(3) $B(x) > c_1 x/(\log x)^{1/2}$.

Thus to prove Theorem 1 it will suffice to show that if (3) holds for infinitely many x then (2) follows. In fact I shall prove stronger results. Denote by $u_l(n)$ the smallest integer so that if $b_1 < \dots < b_t \le n$, $t = u_l(n)$, is any sequence of integers then for some m, $g(m) \ge l$. Theorem 1 would follow from $u_l(n) = o(n/(\log n)^{1/2})$.

Received December 17, 1964.

THEOREM 2. $u_{2^k}(n) < c_2\, n (\log\log n)^{k+1}/\log n$.

In a previous paper [1] I proved that

(4) $\Pi(n) + c_3 n^{3/4}/(\log n)^{3/2} < u_2(n) < \Pi(n) + c_4 n^{3/4}$.

$\Pi(n)$ denotes the number of primes not exceeding n and $\Pi_k(n)$ denotes the number of integers $m \le n$ the number of distinct prime factors of which does not exceed k. The right side of (4) can in fact be strengthened to

(5) $u_2(n) < \Pi(n) + c_5 n^{3/4}/(\log n)^{3/2}$.
I do not prove (5) in this paper. (4) and (5) suggest the possibility of obtaining an asymptotic formula with an error term for $u_l(n)$ also for l > 2. I am going to outline the proof of

THEOREM 3. Let $2^{k-1} < l \le 2^k$. Then

$u_l(n) = (1 + o(1))\, \dfrac{n (\log\log n)^{k-1}}{(k-1)!\, \log n}$.

Finally I am going to prove the following

THEOREM 4. To every c and l there is an $n_0 = n_0(c, l)$ so that if $n > n_0$ and $b_1 < \dots < b_s \le n$ is such that the number N(n) of integers $t \le n$ which can be written in the form $b_i b_j$ is greater than cn then there is an m with $g(m) \ge l$.

Theorem 4 clearly implies Theorem 1, but not Theorems 2 and 3. Our main tool will be the following

LEMMA. Let $S_1, \dots, S_r$ be r sets of integers, $S_i$ has $N_i$ elements ($N_1 \ge \dots \ge N_r$) $x_j^{(i)}$, $1 \le j \le N_i$. Let $u_1 < u_2 < \dots < u_t$ be a sequence of integers where each $u_j$, $1 \le j \le t$, is of the form $\prod_{i=1}^{r} x^{(i)}$ (i.e. every u can be written as the product of r integers, one from each $S_i$). Then if

(6) $t > \dfrac{3^r}{N_r^{1/2^{r-1}}} \prod_{i=1}^{r} N_i$

there is an m so that the number of solutions of $m = u_{j_1} u_{j_2}$ is at least $2^{r-1}$.

To each integer of $S_i$, $1 \le i \le r$, we make correspond a vertex and to each $u_j = \prod_{i=1}^{r} x_j^{(i)}$ we make correspond the r-tuple $\{x_j^{(i)}\}$, $1 \le i \le r$. Thus we obtain an r-graph [2] $G^{(r)}(\sum_{i=1}^{r} N_i;\ t)$ and if t satisfies (6) then by the corollary of Theorem 1 of [2] there are integers $x_1^{(i)}, x_2^{(i)}$ in $S_i$, $1 \le i \le r$, so that all the $2^r$ integers

$\prod_{i=1}^{r} x_{\lambda_i}^{(i)}$, $\lambda_i = 1$ or 2,

are u's. Thus $\prod_{i=1}^{r} x_1^{(i)} x_2^{(i)} = u_{j_1} u_{j_2}$ has at least $2^{r-1}$ solutions, which completes the proof of the Lemma.

Let now $b_1 < \dots < b_s \le n$ be a sequence of integers for which $g(m) < 2^k$ for all m. To prove Theorem 2 we have to show

(7) $s < c_2\, \dfrac{n (\log\log n)^{k+1}}{\log n}$.

To prove (7) we split the b's into two classes. In the first class are the b's which can not be written in the form ($\exp z = e^z$)

(8) $\prod_{i=1}^{k+1} e_i$, $e_i > \exp((\log\log n)^2)$.

Denote these b's by $b_1', \dots, b_{s_1}'$, and write $b_i' = u_i v_i$ where all prime factors of $u_i$ are not exceeding $\exp((\log\log n)^2)$, and all prime factors of $v_i$ are greater than $\exp((\log\log n)^2)$. By (8) $v_i$ has at most k prime factors (for otherwise $v_i$ and therefore $b_i' = u_i v_i$ would be of the form (8)). Further a simple argument shows that $u_i < \exp((2k + 2)(\log\log n)^2)$ (for otherwise $u_i$ and therefore $b_i'$ would be of the form (8)). But then clearly

(9) $s_1 \le \sum{}' \, \Pi_k\!\left(\dfrac{n}{t}\right)$

where the dash in the summation indicates that $1 \le t < \exp((2k + 2)(\log\log n)^2)$. Now by a theorem of Landau [4]

(10) $\Pi_k(x) = (1 + o(1))\, \dfrac{x (\log\log x)^{k-1}}{(k-1)!\, \log x}$.
sl < c6n(log logn) k+l/logn.
Denote now by b " l , " " b"s2 the b's of the form (8). If c2 > c6 and (7) would be false, we would have from (11) (12)
s2 > (c2 - % )
. n (log log n) k+ 1 1-~ "
254
P. ERDOS
[December
Put for 1 < j < s2 (we write e~° instead of e~ ))
(13)
rle.~ 0 , 2 ' < f b7 = k+* ~.(o e~0< 21+~,,., t 1 =
log n (Iog log n) 2 < 2(o < log 2"
To each bj' we make correspond the (k + 1)-tuple (14)
{2J°},
1 < i < k + 1.
By (13) the number of possible choices of the (k + 1)-tuples (14) is for n > n o less than (logn/log2) k+l. Thus by (12) there is a ( k + 1)-tuple (21,...,2k+1) which corresponds to more than n/(log n)k+ 3b,,,,s say b~' ,..., b', (15)
s 3 > n/(log n) k+
3
Now we apply our Lemma with r = k + 1. The sets S, are the integers in (2 a*, 2t+x'), thus Ni = 2 a', and by b~" < n we have k+l zk+l '~1 1"I N t £ 2 i=i
By (15)and 2t >-(loglogn) 2 a simple computation shows that sa = t clearly satisfies (6). Thus by our Lemma there is an m for which m = b~: bl~ has at least 2 k solutions, which proves Theorems 1 and 2. COROLLARY. Let bl < .." be an infinitesequence of integers so that every n > no can be written as the product of k or fewer b's. Then limsup,=~ g(n) = oo.
Raikov's theorem implies that for infinitely many x, $B(x) > cx/(\log x)^{1 - 1/k}$. Thus the corollary follows from Theorem 2.

Now we prove Theorem 4. We shall show that there is an $\varepsilon = \varepsilon(c) > 0$ so that to every T there is an $n_0 = n_0(T, \varepsilon)$ for which $n > n_0$, $N(n) > cn$ implies that there is an L > T satisfying

(16) $B(L) > \varepsilon L/(\log L)^{1/2}$.

(16) by Theorem 2 implies Theorem 4. (16) implies Raikov's theorem with $\varepsilon = c_1$. Our proof of (16) will not use Raikov's theorem but we will use his method. We evidently have

(17) $cn < \sum_{b_i \le n} B\!\left(\dfrac{n}{b_i}\right) = \Sigma_1 + \Sigma_2 + \Sigma_3$

where in $\Sigma_1$ $b_i \le T$, in $\Sigma_2$ $T < b_i < n/T$ and in $\Sigma_3$ $b_i \ge n/T$. Clearly

(18) $\Sigma_1 \le T B(n)$

and

(19) $\Sigma_3 \le T B(n)$.

(19) follows from the fact that there are at most B(n) summands in $\Sigma_3$ and each summand is $\le T$. If (16) does not hold then for every L > T

(20) $B(L) \le \varepsilon L/(\log L)^{1/2}$.

Thus from (18), (19) and (20) we have for $n > n_0(T, \varepsilon)$

(21) $\Sigma_1 + \Sigma_3 \le 2T B(n) \le 2T\varepsilon n/(\log n)^{1/2} < cn/2$.

From (20) we further have

(22) $\Sigma_2 < \sum_{T < b_i < n/T} \dfrac{\varepsilon n}{b_i \left(\log \frac{n}{b_i}\right)^{1/2}}$.

Now from (20) we have by a simple argument that for $b_i > T$, $b_i > i(\log i)^{1/2}$. Thus from (22) we obtain by a simple computation

(23) $\Sigma_2 < \varepsilon n \sum_{i \ge 2} \dfrac{1}{i (\log i)^{1/2} \left(\log [\dots]\right)^{1/2}} < \dfrac{cn}{2}$

if $\varepsilon$ is sufficiently small. (21) and (23) contradict (17), thus (20) can not hold for all L > T (or (16) holds for some L > T), which completes the proof of Theorem 4.

The following problem can now be put: Assume that (1) holds. What can be said about

$F(n) = \max_{m \le n} g(m)$?

I can prove that there are two constants $\varepsilon_1$ and $\varepsilon_2$ so that (1) implies for $n > n_0$

(24) $F(n) > (\log n)^{\varepsilon_1}$.

But there exists a sequence $b_1 < \dots$ for which (1) is satisfied and for all n

(25) $F(n) < (\log n)^{\varepsilon_2}$.

In this paper I do not give the proof of (24) and (25) but only remark that the proof of (24) is a refinement of the proof of Theorem 2 and the proof of (25) uses probabilistic arguments similar to the ones used in [3].

Now we outline the proof of Theorem 3. By (10) Theorem 3 implies that for $2^{k-1} < l \le 2^k$, $u_l(n) = (1 + o(1))\,\Pi_k(n)$. First we show

(26)
$u_{2^{k-1}+1}(n) > (1 + o(1))\, \Pi_k(n) = (1 + o(1))\, \dfrac{n (\log\log n)^{k-1}}{(k-1)!\, \log n}$.

Denote by $v_1^{(k)} < \dots < v_{t_k}^{(k)} \le n$ the set of integers of the form

(27) $\dfrac{n}{\log n} \le \prod_{i=1}^{k} p_i \le n$, $p_{i+1} < p_i^{1/k^2}$, $p_k > (\log n)^2$.

It is a simple exercise in analytic number theory to prove by induction with respect to k that

(28) $t_k = (1 + o(1))\, \dfrac{n (\log\log n)^{k-1}}{(k-1)!\, \log n}$.

We leave the proof of (28) to the reader. To prove (26) we now show that for every m the number of solutions of

(29) $v_{j_1}^{(k)} v_{j_2}^{(k)} = m$

is at most $2^{k-1}$. Observe that if (29) is solvable we must have

(30) $m = \prod_{i=1}^{k} p_i q_i$,

where $\prod_{i=1}^{k} p_i$ and $\prod_{i=1}^{k} q_i$ both satisfy (27). Every solution of (29) must be of the form

(31) $v_{j_1}^{(k)} = \prod_{i=1}^{k} x_i^{(1)}$, $v_{j_2}^{(k)} = \prod_{i=1}^{k} x_i^{(2)}$, $x_1^{(1)} > \dots > x_k^{(1)}$, $x_1^{(2)} > \dots > x_k^{(2)}$,

where the $x_i^{(1)}$ and $x_i^{(2)}$ are the p's and q's and $\prod_{i=1}^{k} x_i^{(1)}$ and $\prod_{i=1}^{k} x_i^{(2)}$ satisfy (27). $x_i^{(1)}$ and $x_i^{(2)}$ we will call the i-th coordinate of $v_{j_1}^{(k)}$ respectively $v_{j_2}^{(k)}$. Clearly $p_1$ and $q_1$ must be the first coordinates of any possible solution of (29). To see this observe that (27) implies

$\prod_{i=2}^{k} p_i q_i < n^{1/2} < n/\log n$

and hence (27) can be satisfied only if the first coordinates are $p_1$ and $q_1$. Assume that the first i − 1 coordinates of a solution $v_{j_1}^{(k)}$ of (29) have already been chosen. I claim that there are only two possible choices for the i-th coordinate of $v_{j_1}^{(k)}$. To show this it will suffice to prove that only one p and only one q can possibly occur as the i-th coordinate of $v_{j_1}^{(k)}$. If this is not so we assume that both $v_j' = x_1 \cdots x_{i-1}\, p_u\, x_{i+1}' \cdots x_k'$ and $v_j'' = x_1 \cdots x_{i-1}\, p_v\, x_{i+1}'' \cdots x_k''$ would be solutions of (29). But then clearly (32)
vjr > =Xl...Xi_lPu,
tl
O j < X l ' . . X i _ I p k.
Hence by (27) and (32) , ,, k pul/2 > log n > vy/vj > Pu/Pv > log n
an evident contradiction.
The fact that the first coordinates of every solution of (29) must be $p_1$ and $q_1$ and the fact that for i > 1 there are at most two choices for the i-th coordinate of $v_{j_1}^{(k)}$ immediately imply that (29) has at most $2^{k-1}$ solutions. Thus by (28), (26) is proved. To complete the proof of Theorem 3 we have to show

(33) $u_{2^k}(n) < (1 + o(1))\, \dfrac{n (\log\log n)^{k-1}}{(k-1)!\, \log n}$.

To prove (33) it suffices to show that to every $\varepsilon > 0$ there is an $n_0 = n_0(\varepsilon, k)$ so that if

(34) $b_1 < \dots < b_t \le n$, $t > (1 + \varepsilon)\, \dfrac{n (\log\log n)^{k-1}}{(k-1)!\, \log n}$

is any sequence of integers then there is an m with $g(m) \ge 2^k$. We will only outline the fairly complicated proof. Assume that there is a sequence satisfying (34) for which $g(m) < 2^k$ for all m. We shall show that this assumption leads to a contradiction. We split the b's into five classes. In the first class are the b's which can be written in the form

(35) $\prod_{i=1}^{k+1} e_i$, $e_i > (\log n)^{c_k}$, $1 \le i \le k + 1$,

where $c_k$ is a sufficiently large absolute constant. Using (35) and our Lemma in the same way as we used (8) and our Lemma in the proof of Theorem 2 we obtain that $g(m) < 2^k$ for all m implies that the number of integers of the first class is $O(n/(\log n)^2)$. The integers of the second class have at most k − 2 prime factors $> (\log n)^{c_k}$ and they can not be written in the form (35). In the same way as we proved (11) we can show that the number of integers of the second class is less than $c\, n (\log\log n)^{k-2}/\log n$. The integers which do not belong to the first two classes can be written in the form

(36) $t \prod_{i=1}^{k-1} p_i$, $p_i > (\log n)^{c_k}$, $1 \le i \le k - 1$,

and where t can not be written as the product of two integers $> (\log n)^{c_k}$ (for otherwise our number would be of the first class). In the third class are the integers where all prime factors of t are less than $(\log n)^{\eta_1}$ where $\eta_1 = \eta_1(\varepsilon)$ is sufficiently small. We can assume $t < (\log n)^{4c_k}$ for otherwise t would be the product of two integers $> (\log n)^{c_k}$. Thus the number of integers of the third class is at most $\sum' \Pi_{k-1}(n/t)$ where the dash indicates that $t < (\log n)^{4c_k}$ and all prime factors of t are less than $(\log n)^{\eta_1}$. By a simple computation we have from (10) and $\eta_1 = \eta_1(\varepsilon)$
258 (37)
P. ERDOS ]~'Hk_l
[December
( t ) = (1 + o(1) n(l°gl°gn)k-2 ~' 1 < - -e n(loglogn) k-I (k - 2)!logn
10 (-k~
Thus by (37) the number of b's which belong to the first three classes is less than (e/2)(n(loglog n)k- l)/(k- 1)! log n) and hence by (34) there are at least n (log log n) k (1 + -~-) (-~-~-~.Vl~n
(38)
1
b's which do not belong to the first three classes. These b's can by (36) all be written in the form ($p_k$ is the greatest prime factor of $t$)
(39) $t' \prod_{i=1}^{k} p_i, \quad p_i > (\log n)^{c_k},\ 1 \le i \le k-1, \quad p_k > (\log n)^{\eta_1}, \quad t' = t/p_k.$
In the fourth class are the b's for which
(40) $t' < (\log n)^{\eta_2},$
where $\eta_2 = \eta_2(\eta_1)$ is sufficiently small. We shall now show that our assumption $g(m) < 2^k$ for all $m$ implies that the number $N$ of integers of the fourth class satisfies
(41) $N < \left(1 + \frac{\varepsilon}{4}\right) \frac{n(\log\log n)^{k-1}}{(k-1)!\,\log n}.$
If $C$ is any set of integers, $N(C)$ will denote the number of integers of this set. Let $b_i$ be any integer of the fourth class; $b_i$ can be written (uniquely) in the form (39), and by $I_{t'}$ we denote the set of integers $b_i/t'$. The integers in $I_{t'}$ all have $k$ prime factors. If (41) does not hold then (in $\sum'$, $t' < (\log n)^{\eta_2}$)
(42) $N = \sum' N(I_{t'}) \ge \left(1 + \frac{\varepsilon}{4}\right) \frac{n(\log\log n)^{k-1}}{(k-1)!\,\log n}.$
We evidently have (in $\sum''$, $t' \ne t''$; $I_{t'} \cap I_{t''}$ is the set of integers belonging to both $I_{t'}$ and $I_{t''}$)
(43) $\sum' N(I_{t'}) < \Pi_k(n) + \sum'' N(I_{t'} \cap I_{t''}) < \Pi_k(n) + (\log n)^{2\eta_2} \max_{t', t''} N(I_{t'} \cap I_{t''}).$
From (42), (43) and (10) we obtain that for $n > n_0$
(44) $\max_{t', t''} N(I_{t'} \cap I_{t''}) > \frac{\varepsilon}{16} \cdot \frac{n(\log\log n)^{k-1}}{(k-1)!\,(\log n)^{1+2\eta_2}} > \frac{n}{(\log n)^{1+3\eta_2}}.$
Hence there are values of $t'$ and $t''$, say $t^{(1)}$ and $t^{(2)}$ ($t^{(1)} \ne t^{(2)}$), for which (44) holds. We are going to prove that (44) implies that there are primes $p_i^{(1)}, p_i^{(2)}$, $1 \le i \le k$, so that all the $2^k$ products
1964]
MULTIPLICATIVE REPRESENTATION OF INTEGERS
259
(45) $\prod_{i=1}^{k} p_i^{(\lambda)}, \quad \lambda = 1 \text{ or } 2$ (chosen independently for each $i$),
belong to $I_{t^{(1)}} \cap I_{t^{(2)}}$. But then all the $2^{k+1}$ integers
$t^{(\lambda)} \prod_{i=1}^{k} p_i^{(\lambda)}, \quad \lambda = 1 \text{ or } 2,$
are b's of the fourth class. Thus for $m = t^{(1)} t^{(2)} \prod_{i=1}^{k} p_i^{(1)} p_i^{(2)}$ we have $g(m) \ge 2^k$, which contradicts our assumption; hence (41) is proved. Thus we only have to show that if the primes $p_i^{(\lambda)}$, $1 \le i \le k$, with the property (45) do not exist then (44) cannot hold. This will be accomplished by arguments similar to but more complicated than the ones used in the proof of Theorem 2. We will only outline the argument. Denote by
(46) $r_1 < \dots < r_l, \quad l > n/(\log n)^{1+3\eta_2},$
the integers belonging to $I_{t^{(1)}} \cap I_{t^{(2)}}$. By (39) each $r_j$ is the product of $k$ primes each greater than $(\log n)^{\eta_1}$. As in the proof of Theorem 2 we make correspond to $r_j = \prod_{i=1}^{k} p_i$ the $k$-tuple
(47) $(\lambda_1, \dots, \lambda_k), \quad \lambda_1 \ge \dots \ge \lambda_k, \quad 2^{\lambda_i} < p_i < 2^{1+\lambda_i}.$
Denote by $N(\lambda_1, \dots, \lambda_k)$ the number of r's corresponding to the $k$-tuple $(\lambda_1, \dots, \lambda_k)$. We shall show that (48)
$N(\lambda_1, \dots, \lambda_k) \le 2^{\sum_{i=1}^{k} \lambda_i} \Bigl(\prod_{i=1}^{k} \lambda_i\Bigr)^{-1} 2^{-\lambda_k/2^{k+1}}.$
By the prime number theorem the number of primes p~ satisfying (47) is (1 + o(1))(2a'/2~1og2). Now we apply our Lemma with r = k and (49)
N, = (1 + o(1)) ~ 2 ~
> 2~k/2, 2~k > 21__(log n) "~.
We obtain by the Lemma by a simple argument that if (50) N(21 , "",2k) > (1 +
1
,
(1og2)-k 2-2k/2~>
)
2Zi=t2~ 1-I 2~ -1 2-ak/2~*' ',1=1
then primes $p_i^{(1)}, p_i^{(2)}$, $1 \le i \le k$, exist so that the numbers (45) are all r's, and we have assumed that such primes do not exist. Thus (50) is false and (48) is proved. (48) clearly implies (the dash indicates that $2^{\sum_{i=1}^{k} \lambda_i} \le n$ and $2^{\lambda_k} > \tfrac12 (\log n)^{\eta_1}$)
(51) $l = \sum' N(\lambda_1, \dots, \lambda_k) \le \sum' 2^{\sum_{i=1}^{k} \lambda_i} \Bigl(\prod_{i=1}^{k} \lambda_i\Bigr)^{-1} 2^{-\lambda_k/2^{k+1}}.$
By an elementary but somewhat lengthy argument (using elementary inequalities) which we suppress we obtain that (51) implies
(52) $l < n/(\log n)^{1+3\eta_2}$
if $\eta_2 = \eta_2(\eta_1)$ is sufficiently small. (52) contradicts (46) and this contradiction proves that (44) cannot hold, which finally proves (41). The remaining b's are in the fifth class. By (41) these integers can be written in the form
(53) $t' \prod_{i=1}^{k} p_i, \quad p_i > (\log n)^{c_k},\ 1 \le i \le k-1, \quad p_k > (\log n)^{\eta_1}, \quad (\log n)^{\eta_2} < t' < (\log n)^{4c_k}$
(if $t' > (\log n)^{4c_k}$ then a simple argument would show that our $t' \prod_{i=1}^{k} p_i$ can be written in the form (35) and hence belongs to the first class). By (38) and (41) there are at least
(54) $\frac{\varepsilon}{4} \cdot \frac{n(\log\log n)^{k-1}}{(k-1)!\,\log n} > \frac{n}{\log n}$
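b's of the fifth class. To each such b we make correspond a $(k+1)$-tuple $(\lambda_1, \dots, \lambda_{k+1})$, $\lambda_1 \ge \dots \ge \lambda_{k+1}$, satisfying
(55) $2^{\lambda_i} < p_i < 2^{1+\lambda_i}, \quad 2^{\lambda_i} > \tfrac12 (\log n)^{\eta_1}, \quad 1 \le i \le k, \qquad \tfrac12 (\log n)^{\eta_2} < 2^{\lambda_{k+1}} < t' < 2^{1+\lambda_{k+1}} < (\log n)^{4c_k}, \qquad 2^{\sum_{i=1}^{k+1} \lambda_i} < n.$
Denote by $N_1(\lambda_1, \dots, \lambda_{k+1})$ the number of b's of the fifth class belonging to $(\lambda_1, \dots, \lambda_{k+1})$. By (54) we have
(56) $\sum' N_1(\lambda_1, \dots, \lambda_{k+1}) > \frac{n}{\log n},$
where the dash indicates that
(57) $2^{\sum_{i=1}^{k+1} \lambda_i} < n, \quad 2^{\lambda_i} > \tfrac12 (\log n)^{\eta_1},\ 1 \le i \le k, \quad \tfrac12 (\log n)^{\eta_2} < 2^{\lambda_{k+1}} < (\log n)^{4c_k}.$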
Now we prove
(58) $N_1(\lambda_1, \dots, \lambda_{k+1}) \le 2^{\sum_{i=1}^{k+1} \lambda_i} \Bigl(\prod_{i=1}^{k} \lambda_i\Bigr)^{-1} 2^{-\lambda_{k+1}/2^{k+2}}.$
As in the proof of (48) we obtain that if (58) would not hold then there would be
primes $p_i^{(\lambda)}$, $1 \le i \le k$, $\lambda = 1$ or $2$, and two integers $t^{(1)}$ and $t^{(2)}$ so that the $2^{k+1}$ integers
$t^{(\lambda)} \prod_{i=1}^{k} p_i^{(\lambda)}, \quad \lambda = 1 \text{ or } 2,$
all would be b's of the fifth class, but as we have already seen this implies $g(m) \ge 2^k$ (for $m = t^{(1)} t^{(2)} \prod_{i=1}^{k} p_i^{(1)} p_i^{(2)}$). Thus (58) is proved. Now we obtain from (58), by a simple computation the details of which we suppress, that (the dash indicates that (57) is satisfied)
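(59) $\sum' N_1(\lambda_1, \dots, \lambda_{k+1}) \le \sum' 2^{\sum_{i=1}^{k+1} \lambda_i} \Bigl(\prod_{i=1}^{k} \lambda_i\Bigr)^{-1} 2^{-\lambda_{k+1}/2^{k+2}} = o\!\left(\frac{n}{\log n}\right).$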
(59) contradicts (56) and this contradiction proves (34) and also (33), and hence completes the proof of Theorem 3. Let $2^{k-1} < l \le 2^k$. Theorem 3 could be sharpened to
$u_l(n) = \frac{n(\log\log n)^{k-1}}{(k-1)!\,\log n} + O\!\left(\frac{n}{(\log n)^{1+c}}\right),$
where $c > 0$ is a suitable positive constant. But at present I cannot prove for $l > 2$ a result as sharp as (4) and (5).
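The counting step that closes each case above (a "cube" of $2^{k+1}$ integers forces $g(m) \ge 2^k$) can be checked on a toy example. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the small factors are arbitrary, and $g(m)$ is taken to count unordered pairs of distinct terms whose product is $m$, a convention this part of the text does not restate.

```python
from collections import Counter
from itertools import combinations, product
from math import prod


def representation_counts(seq):
    """Map m -> number of unordered pairs {b, b'} of distinct terms of seq with b * b' == m.

    Assumed counting convention; the text's definition of g(m) is not restated here.
    """
    counts = Counter()
    for b, b2 in combinations(sorted(set(seq)), 2):
        counts[b * b2] += 1
    return counts


def cube_products(pairs):
    """All products obtained by choosing one factor from each pair (2**len(pairs) integers)."""
    return [prod(choice) for choice in product(*pairs)]


# k = 2: one pair playing the role of t, two pairs of primes (toy values for illustration).
pairs = [(2, 3), (5, 7), (11, 13)]
bs = cube_products(pairs)                     # the 2^(k+1) integers
m = prod(x for pair in pairs for x in pair)   # product of all six factors
g = representation_counts(bs)
print(len(bs), g[m])
```

Each of the eight integers pairs with its complementary choice, which is why the eight terms yield four representations of the same $m$.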
REFERENCES
1. P. Erdős, On sequences of integers no one of which divides the product of two others and on some related problems, Izv. Inst. Math. and Mech. Univ. of Tomsk 2 (1938), 74-82.
2. P. Erdős, On extremal problems of graphs and generalized graphs, Israel J. Math. 2, 3 (1964), 183-190.
3. P. Erdős and A. Rényi, Additive properties of random sequences of positive integers, Acta Arithmetica 6 (1960), 83-110.
4. E. Landau, Verteilung der Primzahlen, Vol. 1, 203-213.
5. D. Raikov, On multiplicative bases for the natural series, Math. Sbornik, N. S., 3 (1938), 569-576.
6. A. Stöhr, Gelöste und ungelöste Fragen über Basen der natürlichen Zahlenreihe, I and II, J. reine u. angew. Math., 194 (1955), 40-65 and 111-140; see p. 133.
TECHNION-ISRAEL INSTITUTE OF TECHNOLOGY, HAIFA, ISRAEL
INDEPENDENT SETS IN REGULAR GRAPHS BY
M. ROSENFELD* ABSTRACT
Lower and upper bounds for the maximal number of independent vertices in a regular graph are obtained, and it is shown that the bounds are best possible. Some properties of regular graphs concerning the property $\mathcal{H}$ defined below are investigated.
Introduction. In this paper we are interested in independent sets in regular graphs. In §2 we give bounds for the maximal number of independent vertices in a regular graph $G$. ($G$ will always denote a graph without loops and multiple edges.) It is shown that these bounds are best possible. It seems true that each value between the bounds is obtainable. In §3 we define the property $\mathcal{H}$ for graphs: we say that a graph $G \in \mathcal{H}$ if every vertex of $G$ belongs to a maximal independent set of vertices in $G$. In some cases conditions are given under which $G \in \mathcal{H}$. In §3 we also define a class of graphs called homogeneous, whose properties and structure seem interesting to investigate.
1. Definitions and Notations. A graph $G$ will be called regular of degree $m$ if every vertex is incident with exactly $m$ edges. We shall denote such a graph by $G(n, m)$, where $n$ is the number of vertices in $G$. It is evident that such a graph exists iff $n > m$ and $n \cdot m \equiv 0 \pmod 2$. $\alpha(M)$ will denote the number of elements of the finite set $M$. $\bar G$ will denote the complementary graph of $G$. The components of $G$ are its maximal connected subgraphs. A set $R = \{a_1, \dots, a_k\} \subseteq S$ ($S$ the set of vertices of $G$) will be called a representing system of the edges of $G$ if every edge of $G$ is incident with at least one vertex from $R$. $\mu(G)$ will denote the minimal number of vertices representing the edges of $G$. $\bar\mu(G)$ will denote the maximal number of independent vertices in $G$. $\nu(G)$ denotes the number of edges in $G$. For $M \subseteq S$, $[M]$ will denote the subgraph spanned by $M$. For $M, N \subseteq S$, an $MN$-edge is an edge with one endpoint in $M$ and the other in $N$. $[M, N]$ will denote the subgraph whose vertices are $M \cup N$ and whose edges are the $MN$-edges contained in $G$.
For $a, b \in S$, $(a, b) \in G$ denotes that $a$ and $b$ are incident in $G$. $C_n$ will denote the complete graph with $n$ vertices. Received January 20, 1965. * This paper is to be a part of the author's Ph.D. thesis written under the supervision of Prof. B. Grünbaum at the Hebrew University of Jerusalem.
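The quantities just defined can be computed by exhaustive search on very small graphs. The following sketch is an illustration only: the function names are hypothetical, vertices are assumed to be labelled 0..n-1, and the search is exponential, so it is meant for toy cases such as the circuit on five vertices.

```python
from itertools import combinations


def regular_graph_can_exist(n, m):
    """Existence condition quoted in the text: n > m and n*m even."""
    return n > m and (n * m) % 2 == 0


def is_independent(vertices, edges):
    """True if no edge has both endpoints among the given vertices."""
    chosen = set(vertices)
    return not any(a in chosen and b in chosen for a, b in edges)


def max_independent(n, edges):
    """Maximal number of independent vertices, by exhaustive search (small n only)."""
    for size in range(n, 0, -1):
        if any(is_independent(c, edges) for c in combinations(range(n), size)):
            return size
    return 0


def min_representing(n, edges):
    """Minimal number of vertices meeting every edge, by exhaustive search."""
    for size in range(n + 1):
        for c in combinations(range(n), size):
            chosen = set(c)
            if all(a in chosen or b in chosen for a, b in edges):
                return size


# The circuit on 5 vertices is a G(5, 2).
cycle5 = [(i, (i + 1) % 5) for i in range(5)]
print(max_independent(5, cycle5), min_representing(5, cycle5))
```

On this example the two values add up to the number of vertices, as expected from the fact that the complement of a maximal independent set is a minimal representing system.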
$[\beta]^*$ will denote the smallest integer not less than $\beta$.
2. THEOREM I. Let $G = G(n, m)$ be a regular graph of degree $m$ with $n$ vertices. Denote by $\bar\mu(G)$ the maximal number of independent vertices in $G$; then:
a) $\bar\mu(G) \le \min\bigl([\tfrac{n}{2}],\ n - m\bigr)$;
b) $\bar\mu(G) \ge k + 2$ if $n = k(m+1) + m$, $m > 2$; $\bar\mu(G) \ge 3$ if $G = G(2n+1, 2n-m)$, $m < n$, $\frac{5m}{2} > 2n+1$; and $\bar\mu(G) \ge [\frac{n}{m+1}]^*$ in all other cases.
These bounds are best possible in the sense that for each pair of integers $n$, $m$ such that $m < n$ and $n \cdot m \equiv 0 \pmod 2$ they are obtainable.
Proof. Let $S$ be the set of vertices of $G$. Let $A \subset S$ be an independent set of vertices in $G$ with $\alpha(A) = \bar\mu(G)$. Since $A$ is a maximal independent set of vertices, each vertex of $S - A$ is neighboring with at least one vertex of $A$. The number of different edges having one endpoint in $A$ is $\bar\mu(G) \cdot m$, hence:
1) $\bar\mu(G) \cdot m \ge n - \bar\mu(G) \Rightarrow \bar\mu(G) \ge [\frac{n}{m+1}]^*$ ($\bar\mu(G)$ is an integer!)
2) $\nu(G) = \frac{n \cdot m}{2} \Rightarrow \frac{n \cdot m}{2} \ge \bar\mu(G) \cdot m \Rightarrow [\frac{n}{2}] \ge \bar\mu(G)$.
Let $a \in A$. The $m$ different edges whose one endpoint is the vertex $a$ have their second endpoint in $S - A$. Therefore
3) $\alpha(S - A) = n - \bar\mu(G) \ge m \Rightarrow \bar\mu(G) \le n - m$.
To show b) we need two Lemmas:
LEMMA I. Let $G = G(k(m+1)+m;\, m)$, $m > 2$. Then $\bar\mu(G) \ge k + 2$ (i.e. $\bar\mu(G) \ge [\frac{n}{m+1}]^* + 1$).
Proof. (Observe that $m$ must be even, otherwise the graph does not exist.) To prove the lemma we use induction on $k$. For $k = 1$: since $G = G(2m+1, m) \Rightarrow \bar G = G(2m+1, m)$, and since an independent set of $k$ vertices in $G$ forms a complete $k$-subgraph in $\bar G$, it will suffice to show that $\bar G$ must contain a $C_3$. Let $a \in S$ be any vertex in $G$, $a \to \{a_1, \dots, a_m\} = A$. If $a$ is not contained in a $C_3$, $A$ must be independent. $S = \{a\} \oplus A \oplus B$, $\alpha(B) = m$. This implies that each $a_i \in A$ is an endpoint of $m - 1$ edges whose second endpoint is in $B$.
$\nu(G) = m^2 + \frac{m}{2} \Rightarrow \nu([B]) = m^2 + \frac{m}{2} - (m + (m-1)m) = \frac{m}{2}.$
Since by our assumption $m > 2$ and $m$ is even, $\nu([B]) \ge 2$. Let $b_i \to b_k$, $b_i' \to b_k'$ (it is possible that $b_i = b_i'$ but $b_k \ne b_k'$). $b_i$ is neighboring with some $a_i \in A$, otherwise $b_i \oplus A$ would be an independent set with $m + 1$ vertices, which is impossible. If $a_i \nrightarrow b_k$ then $a_i \to \{B - b_k\} \Rightarrow a_i \to b_i'$ and $a_i \to b_k'$ and $[a_i b_i' b_k'] = C_3$. If $a_i \to b_k$ then $[a_i b_i b_k] = C_3$. Suppose $k_0 > 1$ is the smallest positive integer for which the lemma does not hold. Let us denote by $G_0$ the graph satisfying
(1) $G_0 = G(k_0(m+1)+m;\, m), \quad \bar\mu(G_0) = k_0 + 1.$
$G_0$ cannot be connected: if $G_0$ is connected, by a theorem due to Brooks [1] $G_0$ is $m$-chromatic. Hence $G_0 = \bigcup_{i=1}^{m} A_i$, where $A_i$ is the set of vertices colored with the $i$-th color, and
$\max \alpha(A_i) \ge \frac{k_0(m+1)+m}{m} > k_0 + 1.$
Since $A_i$ is an independent set of vertices in $G_0$ this is a contradiction to our assumption (1). Hence $G_0$ must have at least two components:
(2) $G_0 = G_1 \oplus G_2.$
Let us consider the two possible cases:
i) $G_1 = G(k_1(m+1)+r_1,\ m)$, $G_2 = G(k_2(m+1)+r_2,\ m)$, $k_1 + k_2 = k_0$, $r_1 + r_2 = m$, $r_1, r_2 \ne 0$;
ii) $G_1 = G(k_1(m+1)+m,\ m)$, $G_2 = G(k_2(m+1),\ m)$.
CASE (i): $\bar\mu(G_0) = \bar\mu(G_1) + \bar\mu(G_2) \ge k_1 + 1 + k_2 + 1 = k_0 + 2$, which contradicts the assumption (1); hence this decomposition is impossible.
CASE (ii): By the induction hypothesis ($k_1 < k_0$) $\bar\mu(G_1) \ge k_1 + 2$, so $\bar\mu(G_0) = \bar\mu(G_1) + \bar\mu(G_2) \ge k_1 + 2 + k_2 = k_0 + 2$. This shows that $\bar\mu(G_0) = k_0 + 1$ is impossible and the proof of the lemma is completed.
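The base case of Lemma I ($k = 1$, $m = 4$, so $n = 9$) can be sanity-checked on a few concrete graphs. The sketch below is an illustration, not part of the proof: it uses circulant graphs as a convenient, assumed family of 4-regular graphs on 9 vertices, and the function names are hypothetical.

```python
from itertools import combinations


def circulant_edges(n, steps):
    """Edges of the circulant graph on 0..n-1 joining i to i + s (mod n) for each s in steps."""
    return {frozenset((i, (i + s) % n)) for i in range(n) for s in steps}


def max_independent(n, edges):
    """Maximal number of independent vertices, by exhaustive search (small n only)."""
    for size in range(n, 0, -1):
        for c in combinations(range(n), size):
            if not any(frozenset(p) in edges for p in combinations(c, 2)):
                return size
    return 0


k, m = 1, 4
n = k * (m + 1) + m                 # 9 vertices
general_bound = -(-n // (m + 1))    # smallest integer not less than n / (m + 1)

results = {}
for steps in combinations(range(1, 5), 2):   # two steps give degree 4 on 9 vertices
    results[steps] = max_independent(n, circulant_edges(n, steps))
print(general_bound, results)
```

For each of these graphs the computed value can be compared with the general bound $[\frac{n}{m+1}]^*$ and with the improved bound $k + 2$ of the lemma.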
LEMMA II. Let $G = G(2n+1;\, m)$ and suppose that $\frac{5m}{2} > 2n+1$; then $G$ contains a $C_3$.
Proof. Suppose that $G$ does not contain a $C_3$. Let $a_1 \to \{b_1, b_2, \dots, b_m\} = B$, $b_1 \to \{a_1, \dots, a_m\} = A$. If $G$ does not contain a $C_3$, $A$ and $B$ are independent sets of vertices and $A \cap B = \emptyset$. Therefore $G = A \oplus B \oplus C$, where $C = \{c_1, \dots, c_r\}$, and by our assumption $r$ is an odd integer and $r < \frac{m}{2}$. Let $c \in C$ and suppose that $c$ has $r_1 \le r - 1$ neighboring vertices in $C$. Suppose furthermore that
$c \to \{a_{i_1}, \dots, a_{i_j},\ b_{k_1}, \dots, b_{k_k}\}, \quad j + k = m - r_1.$
Without loss of generality we may suppose that $j \ge k$. Since $B$ is independent, $b_{k_1}$ has $m$ neighboring vertices in $A \oplus C$. Now $c$ has $r_1$ neighboring vertices in $C$; then if $G$ does not contain a $C_3$, $b_{k_1}$ has at most $r - r_1$ neighboring vertices in $C$. Therefore $b_{k_1}$ has at least $m - r + r_1$ neighboring vertices in $A$. $A$ contains $m - j$ vertices that are not incident with $c$.
$j + k = m - r_1, \quad j \ge k \ \Rightarrow\ m - j \le \frac{m + r_1}{2}.$
Since $\frac{m}{2} > r$, $m - r + r_1 > \frac{m}{2} + r_1 \ge \frac{m + r_1}{2} \ge m - j$; this means that $b_{k_1}$ must have at least one neighboring vertex from $a_{i_1}, \dots, a_{i_j}$ and $G$ would contain a $C_3$, a contradiction to our assumption. Therefore we conclude:
i) $c \to a_i \Rightarrow c \nrightarrow \{B\}$.
ii) $c, c' \in C$, $c \to c'$, $c \to \{A\} \Rightarrow c' \to \{B\}$.
For if we had $c \to \{a_{i_1}, \dots, a_{i_j}\} = A_c \subseteq A$ and $c' \to \{a_{i'_1}, \dots, a_{i'_k}\} = A_{c'} \subseteq A$, then $k = m - r_1' > \frac{m}{2}$, $j = m - r_1 > \frac{m}{2} \Rightarrow A_c \cap A_{c'} \ne \emptyset$ and $G$ would contain a $C_3$ ($r_1$, $r_1'$ have the same meaning as in the preceding paragraph). Denote by $C_A = \{c_1, \dots, c_j\} \subseteq C$ and $C_B = \{c'_1, \dots, c'_k\} \subseteq C$ those vertices of $C$ having neighboring vertices in $A$ or $B$ respectively. Because of i) and ii), $C = C_A \oplus C_B$ and $C_A$ and $C_B$ are independent sets in $G$. If $G$ does not contain a $C_3$ we conclude from the above discussion that $G$ contains only four types of edges: (1) $C_A C_B$-edges, (2) $A C_A$-edges, (3) $B C_B$-edges, (4) $AB$-edges. Hence to complete our proof it will suffice to show that $\nu[B, C_B] \ne \nu[A, C_A]$. (This means that if $G$ does not contain a $C_3$ it cannot be regular.) Since $r$ is odd and $j + k = r$ we may suppose that $k > j$. Let $r_i$ denote the number of $C_A C_B$-edges at $c_i \in C_A$; we have $\sum_{i=1}^{j} r_i$ $C_A C_B$-edges in $G$, $j \cdot m - \sum_{i=1}^{j} r_i$ $A C_A$-edges and $k \cdot m - \sum_{i=1}^{j} r_i$ $B C_B$-edges. Since $k > j$,
$\nu[B, C_B] = k \cdot m - \sum_{i=1}^{j} r_i > j \cdot m - \sum_{i=1}^{j} r_i = \nu[A, C_A].$ Q.E.D.
Lemmas I and II give the justification of the modification of the lower bound for $\bar\mu(G)$ given in (b). To complete the proof we will construct for each given admissible $m$, $n$ a regular graph in which we obtain the bounds.
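Lemma II can be spot-checked numerically on a restricted family. The sketch below is an illustration only: it enumerates circulant graphs with an odd number $N = 2n + 1$ of vertices (an assumed, convenient family, not the general case of the lemma), keeps those whose degree $m$ satisfies $\frac{5m}{2} > N$, and asserts that each contains a triangle. All function names are hypothetical.

```python
from itertools import combinations


def circulant(n_vertices, steps):
    """Adjacency sets of the circulant graph: i is joined to i + s and i - s (mod n_vertices)."""
    return [
        {(i + s) % n_vertices for s in steps} | {(i - s) % n_vertices for s in steps}
        for i in range(n_vertices)
    ]


def has_triangle(adj):
    """True if some edge {a, b} has a common neighbour, i.e. the graph contains a C_3."""
    return any(adj[a] & adj[b] for a in range(len(adj)) for b in adj[a])


def check_lemma_on_circulants(max_vertices=13):
    """Check the triangle conclusion on every qualifying circulant; return how many were checked."""
    checked = 0
    for big_n in range(5, max_vertices + 1, 2):          # odd vertex counts
        for half in range(1, (big_n - 1) // 2 + 1):
            m = 2 * half                                  # each step contributes two neighbours
            if 5 * m <= 2 * big_n:                        # hypothesis 5m/2 > 2n+1 fails
                continue
            for steps in combinations(range(1, (big_n - 1) // 2 + 1), half):
                assert has_triangle(circulant(big_n, steps))
                checked += 1
    return checked


print(check_lemma_on_circulants())
```

The circuit on five vertices ($m = 2$, $\frac{5m}{2} = 5$) sits exactly on the boundary of the hypothesis and is triangle-free, which shows that the strict inequality cannot be relaxed.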
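(a) $G(2n, m)$, $m \le n$, $\bar\mu(G) = n$.
$A = \{a_1, \dots, a_n\}$, $B = \{b_1, \dots, b_n\}$,
$a_i \to \{b_i, b_{i \dotplus 1}, \dots, b_{i \dotplus (m-1)}\}$, where
$i \dotplus j = i + j$ if $i + j \le n$, and $i \dotplus j = i + j - n$ if $i + j > n$.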
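Construction (a) is easy to reproduce and inspect for small parameters. The sketch below rests on one assumption about the displayed rule, namely that $a_i$ is joined to the $m$ cyclically consecutive vertices $b_i, b_{i+1}, \dots$ with indices reduced into $1, \dots, n$; the function names are hypothetical.

```python
from itertools import combinations


def construction_a(n, m):
    """Bipartite graph on a_1..a_n, b_1..b_n with a_i joined to b_i, ..., b_{i+m-1}.

    Indices are reduced into 1..n; this is an assumed reading of the displayed rule.
    """
    assert 1 <= m <= n
    edges = set()
    for i in range(1, n + 1):
        for step in range(m):
            j = (i + step - 1) % n + 1
            edges.add((("a", i), ("b", j)))
    return edges


def degrees(edges):
    """Degree of every vertex that occurs in some edge."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def max_independent(vertices, edges):
    """Maximal number of independent vertices, by exhaustive search (small graphs only)."""
    for size in range(len(vertices), 0, -1):
        for c in combinations(vertices, size):
            chosen = set(c)
            if not any(u in chosen and v in chosen for u, v in edges):
                return size
    return 0


n, m = 4, 3
edges = construction_a(n, m)
vertices = sorted(degrees(edges))
print(set(degrees(edges).values()), max_independent(vertices, edges))
```

Because every edge joins an $a$ to a $b$, the graph is bipartite by construction, so it cannot contain a $C_3$.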
It is easily seen that this construction gives a regular bipartite graph with $\bar\mu(G) = n$. Observe that this graph does not contain a $C_3$; hence the complementary graph is a $G(2n, m')$, $m' \ge n$, with $\bar\mu(\bar G) = 2$.
(b) $G(2n+1, m)$ ($m \le n$), $\bar\mu(G) = n$. Observe that in this case $m$ must be even, $m = 2k$. To the graph constructed in (a) adjoin an additional vertex $c$. Omit the edges $(a_i b_i)$, $1 \le i \le k$, and add the edges $(c a_i)$ and $(c b_i)$. The resulting graph is easily seen to be a $G(2n+1, m)$ with $\bar\mu(G) = n$. Observe that if $k \ge 2$, $G$ will contain a $C_3$ but not a $C_4$.
(c) $G(n;\, m)$, $m > \frac{n}{2}$, $\bar\mu(G) = n - m$. Let $\bar G^* = C_{n-m} \oplus G(m;\, n-m-1)$. $G(m;\, n-m-1)$ always exists: for $m > \frac{n}{2}$ implies that $m > n - m - 1$, and if $m$ is odd $n$ must be even and therefore $n - m - 1$ is even. Furthermore it is easily seen that $\bar G^*$ contains a $C_{n-m}$ but not a $C_{n-m+1}$. The graph $G^* = G(n, m)$ is the desired example.
(a), (b) and (c) show that the upper bound stated in a) is best possible for each admissible $m$, $n$.
(d) $G(k(m+1)+m;\, m)$, $\bar\mu(G) = k + 2$, $m \ge 4$ (see Lemma I). For $k = 1$ the graph $G_0(2m+1, m) = \bar G(2m+1, m)$, where $G(2m+1, m)$ is the graph constructed in (b), has $\bar\mu(G_0) = 3$. For $k > 1$ the graph $G = C_{m+1} \oplus \dots \oplus C_{m+1} \oplus G_0(2m+1, m)$ will have $\bar\mu(G) = k + 2$.
(e) $G(2n+1;\, m)$, $m = 2r_0 > n$, $2n+1 \ge \frac{5(2n-m)}{2}$, $\bar\mu(G) = 2$. The condition $2n+1 \ge \frac{5(2n-m)}{2}$ is necessary because of Lemma II. Denote the set of vertices by: $A = \{a_1, \dots, a_n\}$, $B = \{b_1, \dots, b_n\}$ and $c$.
Edges: c -* (a, "" a,, bt "'" b,} m' -- 2rm' = 2n - m
ai~{b,+t...b,}
b~{a,+l...a,)
By our a s s u m p t i o n m ' > n - r ~ n - r + l = m ' - k
l
a,+,~(aj};
b,+,-~(bj}
l <j
l
This is possible since $n - r > k$, as will be shown in the sequel. It is easily seen that:
$d(a_i) = d(b_i) = 2r = m'$ for $1 \le i \le r + k$, $\quad d(a_i) = d(b_i) = r$ for $r + k + 1 \le i \le n$, $\quad d(c) = 2r = m'$.
Now $n - r - k = n - r - [m' - (n - r + 1)] = 2n + 1 - 2m' \ge r$ by our assumption (which proves that $n - r > k$); hence we can construct from the two sets $A' = \{a_{r+k+1}, \dots, a_n\}$ and $B' = \{b_{r+k+1}, \dots, b_n\}$ a bipartite regular graph of degree $r$ with $A'$, $B'$ independent. It is easy to see that the graph thus defined is a $G(2n+1, m')$ that does not contain a $C_3$. The complementary graph of this graph is the example looked for.
(f) $G(n, m)$, $\bar\mu(G) = [\frac{n}{m+1}]^*$.
Letn=h(m+l)+r
1" n # k(m + l) + m
~5 m_ ' < n ( i f n i s o d d ) "
O
If $r = 0$ take $G = C_{m+1} \oplus \dots \oplus C_{m+1}$ ($h$ times). If $r > 0$ but $m + 1 + r$ is even or $m + r + 1 > \frac{5r}{2}$, take in the first case $G = C_{m+1} \oplus \dots \oplus C_{m+1} \oplus G_a$ and in the second case
$G = C_{m+1} \oplus \dots \oplus G_e$, where $G_a = G(m+r+1, m)$ is the graph constructed in (a) ($\bar\mu(G_a) = 2$) and
Ge = G(m + r + 1, m) is the graph constructed in (e). One easily verifies the equality ~ ( G ) = [ - ~ + 1 ]* . If m + r + l is odd but m + r + l < - ~ - ,5r mmust be even. Let 2m + r + 2 = 3 k + J o 0 < J o < 2 ( 2 m + r + 2 is even!). I f j o = 0 k is even. Let Ci . . .{c~ i 1 <-- i < 3 be a set of vertices. Join by an edge the following . Ck} vertices: c¢ -4, 2 {Ck-l+j(mod(k/2))} } C2 "-4"(C~q-,}
( l
k) k {l+jifl+j O .
Each C t is a complete k-graph. These relations define a graph G* = G(3k, m)with ~(G*) = 3. The graph G = Cm+l @ "'" @ G* (Cm+~ taken h - 2 times) is a G(n,m) =
. If io ¢ 0 then: 2(k + (k + 1) 2m+2+r=
2(k+l)+k
k + 1 is even kiseven.
In the first case take three complete graphs: C 1 = C 2 = Ck; C 3 = C~+~. In the
second case we take C 1 = C 2 = Ck+~ ; C 3 = C~,. Add to the graph C ~ @ C 2 @ C 3 the following edges: c3
I
1
-
3
k+l
2
1
O<=j<m-k-1
k+l -----<s
2
~
=
2
2
=
It is easily seen that these relations define a graph $G^* = G(3k+1, m)$ with $\bar\mu(G^*) = 3$. A similar construction can be carried out in the second case. In both cases the graph looked for is $G = C_{m+1} \oplus \dots \oplus G^*$.
(g) $G(2n+1, 2n-m)$, $m < n$, $\frac{5m}{2} > 2n+1$, $\bar\mu(G) = 3$. It is trivially seen that the complementary graph of the graph constructed in (b) is the desired example. This completes the proof of the theorem.
REMARKS. Since the complement of a maximal independent set of vertices in a graph is a minimal representing system and vice versa, our theorem can be applied to the estimation of $\mu(G)$ in regular graphs. P. Erdős and T. Gallai [2] have shown that
$n - \bar\mu(G) = \mu(G) \le \frac{2\nu(G)\,\pi(G)}{2\nu(G) + \pi(G)} \ \Rightarrow\ \bar\mu(G) \ge \frac{n}{m+1},$
which is the same bound obtained in (b). But in the case of regular graphs, as was shown, we can say more than in the general case. In [2] it is shown that the equality $\bar\mu(G) = \frac{n}{m+1}$ holds if and only if $G$ is the direct sum of complete graphs. In the case of regular graphs the equality $\bar\mu(G) = [\frac{n}{m+1}]^*$, which can be obtained except for the two cases mentioned in (b), does not determine uniquely the graph in general. Furthermore, we can give a lower bound for $\mu(G)$ (only for regular graphs):
$\mu(G) \ge \max\bigl([\tfrac{n}{2}]^*,\ m\bigr),$
and the minimal value is obtainable for each $n$, $m$ such that $n \cdot m \equiv 0 \pmod 2$. 2) Given a regular graph we can easily estimate $\varepsilon(G)$.
8(G)
where I(G) is the interchange graph of G and if G = G(n, m) then hence:
I(G) = G(½ n " m, 2(m -- 1))
Into
2~-m" -- 2
= e(G) = min
/[4J
' z n" m - 2(m -
1,/ .
One can easily modify these bounds using the fact that each vertex in $I(G)$ is contained in a $C_m$. Lemma I is a sharpening of a theorem of Turán [4], for the case of $C_3$. Observe that it holds only for regular graphs having an odd number of vertices. One can easily deduce from Theorem I that if $n/(n-m) > r$, $G(n, m)$ must contain a $C_{r+1}$. (This result can be obtained from Turán's theorem.) A slight modification of this result can be obtained from Lemma II: if
$G = G(u(m+1)+m,\ (u-1)(m+1)+m)$
then $G$ contains a $C_{u+2}$.
3. DEFINITION. A graph $G$ will have the property $\mathcal{H}$ ($G \in \mathcal{H}$) if every vertex in $G$ is contained in a maximal independent set of vertices with $\bar\mu(G)$ vertices. In this section we shall investigate the property $\mathcal{H}$ in some extremal cases of regular graphs. For this we need a few more definitions:
1) With each vertex of $G(n, m)$ we associate a $\binom{m}{2}$-tuple of integers ordered by increasing magnitude and defined as follows: with each two edges incident with the vertex in consideration associate the length of the shortest circuit containing them. If such a circuit does not exist the number associated will be $+\infty$. We denote by $\tau(a)$ the $\binom{m}{2}$-tuple associated with the vertex $a$, and call it the type of $a$. It is obvious that a necessary condition for the existence of an automorphism of the graph that carries $a$ to $b$ is $\tau(a) = \tau(b)$.
2) A regular graph will be called homogeneous if all the vertices in each component have the same type. Examples of homogeneous graphs are circuits, complete graphs and point symmetric regular graphs. A homogeneous graph need not be point symmetric; see for example the graph constructed by B. Grünbaum in [3].
LEMMA 3.1. Let $G$ be a graph with $d(a) \le m$ for all $a \in G$. Then $\bar\mu(G) \ge [\frac{n}{m+1}]^*$.
Proof. We use induction on $n$ ($\alpha(G) = n$). For "small" $n$'s the lemma is obvious. Let $a \in G$, $a \to \{b_1, \dots, b_{m'}\}$, $m' \le m$. $\alpha(S - \{a, b_1, \dots, b_{m'}\}) = n - (m' + 1)$. The graph $G' = [S - \{a, b_1, \dots, b_{m'}\}]$ satisfies the conditions of the lemma, hence
$\bar\mu(G') \ge \Bigl[\frac{n - (m'+1)}{m+1}\Bigr]^* \ge \Bigl[\frac{n}{m+1}\Bigr]^* - 1;$
therefore a maximal independent set from $G'$ together with $a$ is an independent set $A$ with $\alpha(A) \ge [\frac{n}{m+1}]^*$.
THEOREM 3.1. $G = G(n, m)$, $\bar\mu(G) = [\frac{n}{m+1}]^* \Rightarrow G \in \mathcal{H}$.
Proof. The proof is a direct consequence of Lemma 3.1, since we have shown that any arbitrarily chosen vertex belongs to a maximal independent set.
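The induction in Lemma 3.1 is in effect a greedy procedure: take a vertex, delete it together with its neighbours, and repeat on what is left. The sketch below illustrates that reading on a circuit; it is not the author's own formulation, the function name is hypothetical, and the rule for picking the next vertex is arbitrary.

```python
def greedy_independent_set(adj, start):
    """Independent set containing `start`: pick a vertex, delete it with its neighbours, repeat."""
    remaining = set(adj)
    chosen = []
    v = start
    while True:
        chosen.append(v)
        remaining -= adj[v] | {v}      # at most m + 1 vertices disappear per step
        if not remaining:
            return chosen
        v = min(remaining)             # any remaining vertex would do; min() keeps runs deterministic


# Circuit on 7 vertices: every degree is m = 2.
n, m = 7, 2
adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
bound = -(-n // (m + 1))               # smallest integer not less than n / (m + 1)
sets = [greedy_independent_set(adj, s) for s in range(n)]
print(bound, sets[0])
```

Since each step removes at most $m + 1$ vertices, the procedure runs for at least $[\frac{n}{m+1}]^*$ steps from any starting vertex, which is the fact used in the proof of Theorem 3.1.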
THEOREM 3.2. $G = G(n, m)$, $\bar\mu(G) = \frac{n}{m+1} \Rightarrow G \in \mathcal{H}$ and $G$ is uniquely determined.
Proof. P. Turán in [4] proved that in this case $G$ is the direct sum of complete $(m+1)$-graphs, hence the theorem follows. We give here another proof of the theorem. We use induction on $k = \frac{n}{m+1}$. For $k = 1$ the theorem is obvious since in this case $G = C_{m+1}$. Let $G = G(n, m)$ with $\bar\mu(G) = \frac{n}{m+1} = k > 1$ be given. If $G$ is connected, by a theorem of Brooks [1] $G$ would be $m$-chromatic, $G = \bigcup_{i=1}^{m} A_i$, where $A_i$ is the set of vertices colored "$i$", and
$\max \alpha(A_i) \ge \frac{n}{m} > \frac{n}{m+1}.$
But $A_i$ is independent, in contradiction to the assumption $\bar\mu(G) = \frac{n}{m+1}$. Hence $G = G_1 \oplus G_2$, where $G_1 = G(n_1, m)$, $G_2 = G(n_2, m)$, $n_1 + n_2 = n$ and $\bar\mu(G) = \bar\mu(G_1) + \bar\mu(G_2)$. Suppose $\frac{n_1}{m+1}$ is not an integer $\Rightarrow$ $\frac{n_2}{m+1}$ is not an integer, and by Theorem I:
$\bar\mu(G) \ge \Bigl[\frac{n_1}{m+1}\Bigr]^* + \Bigl[\frac{n_2}{m+1}\Bigr]^* > \frac{n_1}{m+1} + \frac{n_2}{m+1} = \frac{n}{m+1}.$
Hence $\frac{n_1}{m+1} = k_1 < k$ and $\frac{n_2}{m+1} = k_2 < k$, and by the induction hypothesis $G_1$ and $G_2$ are the direct sum of complete $(m+1)$-graphs. This means that $G$ is point symmetric $\Rightarrow G \in \mathcal{H}$.
THEOREM 3.3. Let $G = G(n, m)$; $m > \frac12 n$; $\bar\mu(G) = n - m$; then:
1) $G \in \mathcal{H}$ if and only if $(n - m) \mid n$ and $G$ is homogeneous. In this case $G$ is uniquely determined and point symmetric.
2) $G \notin \mathcal{H}$: if $b$ does not belong to a maximal independent set of vertices while $a$ does, then $\tau(b) < \tau(a)$. (The types are ordered lexicographically.)
Proof. 1) Suppose $G \in \mathcal{H}$.
Let $A_1 = \{a_1, \dots, a_{n-m}\}$ be a maximal independent set. Hence $a_i \to \{S - A_1\}$, $1 \le i \le n - m$. Therefore if $A_2 = \{a_1', \dots, a_{n-m}'\}$ is a maximal independent set different from $A_1$ we must have $A_1 \cap A_2 = \emptyset$. Now if $G \in \mathcal{H}$ each $g \in G$ belongs to a uniquely determined maximal independent set of vertices, hence $G$ is the direct sum of independent sets of vertices and each vertex is connected by an edge to all the vertices not belonging to the independent set including it. This means that $(n - m) \mid n$. Since the complementary graph of $G$ is easily seen to be the direct sum of complete $(n-m)$-graphs, $G$ is point symmetric and therefore homogeneous. It is then obvious that if $(n - m) \nmid n \Rightarrow G \notin \mathcal{H}$.
2) Let $a \in A$, $A$ a maximal independent set $\Rightarrow a \to \{S - A\}$. Let $B$ be the set of all vertices that are not joined by an edge to $b$ (including $b$) $\Rightarrow \alpha(B) = n - m$. Since each $a \in A$ is joined by an edge to $S - A$ and $b \in S - A$, $B \cap A = \emptyset$. Hence $G = C \oplus A \oplus B$. Denote by $l(xay)$ the length of the shortest circuit containing the edges $(ax)$ and $(ay)$ (in the sequel it will be shown that $l(xay)$ is finite). Let us calculate $\tau(a)$ and
$\tau(b)$. Put $\{c_1, \dots, c_s\} = C$, $s = 2m - n$.
For $c_i, c_j \in C$: $(c_i c_j) \in G \Rightarrow l(c_i a c_j) = l(c_i b c_j) = 3$, and $(c_i c_j) \notin G \Rightarrow l(c_i a c_j) = l(c_i b c_j) = 4$.
$l(a_i b c_j) = 3$; this contributes $(n - m)(2m - n)$ times "3" to $\tau(b)$. $l(a_i b a_j) = 4$; this contributes $\frac12 (n - m)(n - m - 1)$ times "4" to $\tau(b)$. Since $b$ does not belong to a maximal independent set, $\nu[B] = r \ge 1$. Suppose therefore $(b_i b_j) \in G \Rightarrow \exists\, c', c'' \in C$ with $(b_i c'), (b_j c'') \notin G \Rightarrow l(b_i a b_j) = 3$, $l(b_i a c') = 4$, $l(b_j a c'') = 4$. Since $\nu[B] = r$ it is easily seen that we have $2r$ triangles of type $l(abc)$ more than of type $l(cab)$ while only $r$ triangles of type $l(bab)$ more than of type $l(aba)$; this shows that in $\tau(b)$ we have $r$ "3" more than in $\tau(a) \Rightarrow \tau(b) < \tau(a)$. This proves also that if $G$ is homogeneous we must have $(n - m) \mid n$ and $G \in \mathcal{H}$. This completes the proof of the theorem.
THEOREM 3.4. $G = G(n, m)$, $\bar\mu(G) = 3$. If $a$ does not belong to a maximal independent set of vertices in $G$ while $b$ does, then $\tau(a) < \tau(b)$.
Proof. Observe first that if $m < \frac{n}{2}$, $G \in \mathcal{H}$; hence we will suppose that $m \ge \frac{n}{2}$.
Let $a \to \{x_1, \dots, x_m\} = X_a$, $a \nrightarrow \{y_1, \dots, y_{n-m-1}\} = Y_a$.
1) $G = \{a\} \oplus X_a \oplus Y_a$.
2) $a$ does not belong to a maximal independent set implies $Y_a = C_{n-m-1}$.
3) The number of different triangles containing $a$ is $\nu[X_a]$.
$\nu[X_a] = \frac{n \cdot m}{2} - \Bigl\{ m + \frac{(n-m-1)(n-m-2)}{2} + (n-m-1)(2m-n+1) \Bigr\}.$
Let $b \to \{r_1, \dots, r_m\} = R_b$, $b \nrightarrow \{p_1, \dots, p_{n-m-1}\} = P_b$, $G = \{b\} \oplus R_b \oplus P_b$. Since $b$ belongs to a maximal independent set, $P_b \ne C_{n-m-1}$; the number of
different triangles containing $b$ is $\nu[R_b]$. It will therefore suffice to show that $\nu[R_b] < \nu[X_a]$. Suppose that in $[P_b]$ $r$ edges are needed to complete the graph $[P_b]$; the two endpoints of such an edge are connected by an edge to vertices in $R_b$. Hence for each "missing" edge in $P_b$ we have two "additional" $PR$-edges:
$\nu[R_b] = \frac{n \cdot m}{2} - \Bigl\{ m + \frac{(n-m-1)(n-m-2)}{2} - r + (n-m-1)(2m-n+1) + 2r \Bigr\}.$
Since $r \ge 1 \Rightarrow \nu[R_b] < \nu[X_a]$. This completes the proof.
REMARKS. 1) $G = G(n, m)$, $\bar\mu(G) = 3$, $G$ is homogeneous $\Rightarrow G \in \mathcal{H}$. 2) In the general case we do not know when $G \in \mathcal{H}$; it is obvious that $G \in \mathcal{H} \Leftrightarrow \bigcap R_\alpha = \emptyset$ ($R_\alpha$ runs over all the minimal representing systems in $G$), but this is not a useful criterion.
REFERENCES
1. R. L. Brooks, On coloring the nodes of a network, Proc. Cambridge Philos. Soc., 37 (1941), 194-197.
2. P. Erdős and T. Gallai, On the minimal number of vertices representing the edges of a graph, Magyar Tud. Akad. Mat. Kutató Int. Közl., 6 (1964), 181-202.
3. B. Grünbaum, A problem in graph coloring. (Unpublished).
4. P. Turán, On the theory of graphs, Coll. Mathematicum, 3 (1954), 10-30.
THE HEBREW UNIVERSITY OF JERUSALEM