CONTRACT LAW AND THE VALUE OF A GAME
BY
JOHN R. ISBELL

ABSTRACT
For the special case of games with linearly transferable utility, a treatment preserving the main features of the controversial treatment in the author's doctoral dissertation is newly derived from a model for negotiation and play that is more elaborate than most such models. The crucial point of the derivation is that the author's special bargaining theory is not needed; the usual Zeuthen-Nash-Harsanyi bargaining theory gives the same result. The main novelty in the model that makes this possible is the replacement of customary informal uses of "enforceable agreements" by explicit contract law. The problems of contract law for cooperative games seem to be very complex, and the present work makes only a bare beginning on them. A characteristic function and value are derived.

Introduction. This paper presents a value for certain n-person cooperative games (viz. those with linearly transferable utility) which, though technically new, could have been derived by trivial steps from two different places in the literature. There are two main points to the paper.

Background: my previous work in cooperative games [4, 5] begins by changing utility theory and goes forward on radically different lines from anyone else's work; and, sound or not, it has not been followed.

Main point 1: here I concede almost everything to the opposition, using their utility theory, their (Zeuthen-Nash-Harsanyi [1]) bargaining theory, and (though this is agreement, not concession) Shapley's formula, which can be based on a model due to Harsanyi [2], for deriving a value from a characteristic function. Nevertheless the value obtained is consistent with my previous scattered remarks [4, 5] on what a value should do, and inconsistent with the evaluations most recently proposed by Harsanyi [3], Selten [9], and Shapley [11]. The crucial turn of course lies under the word "almost". I depart from the custom of treating the coalition as a sort of religious order acting like a single person.
I introduce a simple explicit contract law. It is actually too simple; for playing a game cooperatively, one would want a more sophisticated law, and its development appears to involve knotty problems. But the simple contracts used here seem to suffice for evaluation, in much the same way that ordinary mixed strategies suffice for a matrix game, because the means proposed secure certain results regardless of what the other players do.

Main point 2: for the games considered, one can speak of "the" other value, for Harsanyi [3], Selten [9], and Shapley [11] assign the same values to these games.

Received December 20, 1965, in revised form June 22, 1966.
The point is that the other value, to speak picturesquely, swindles certain players. Obviously there is room for considerable controversy about such a point. At any rate, the middle section of this paper concentrates on two three-person games which we evaluate differently. Selten's derivation of a value and other considerations lead us to examine a couple of related games. I have profited from a confrontation with Selten on this question, and shall try to indicate some of his points about the examples. But the Selten value seems to require support by a theory at least as detailed as the simple theory given in section 1 below taking account of contract law. Toward that, Selten has made one point quite clear: he follows Schelling [8] in stressing unilateral commitments and feels they should have a legal standing if my contracts do. Lacking a detailed development, one can still note that if commitments are to have the force suggested by Schelling, there must be possible legal proceedings compelling "specific performance", the actual carrying out of promised actions. In contrast, I propose to exclude specific performance, securing contracts only by penalty clauses for pecuniary (utility) indemnities.

Section 3 of the paper has the only non-trivial theorems, but is much the least interesting; it gives some results on "games with infinitely many players".

The ideas of this paper were worked out at the 1965 Jerusalem Game Theory Workshop, sponsored by The Hebrew University and the Israel Academy of Sciences and Humanities. I am much indebted to J. C. Harsanyi, L. S. Shapley, Martin Shubik and R. M. Thrall for constructive criticism there. The writing was supported by the National Science Foundation.

1. Value. For the simplest definition of the proposed value, we consider games in normal form. Though for most game theory, and for everything in this paper, the normal form suffices, we shall usually find it convenient to speak of the extensive form.
If $F$ is an $n$-person finite game in normal form with payoff function $h$, define $h^-(x_1, \dots, x_n)$ for each (pure) strategy $n$-tuple $\xi = (x_1, \dots, x_n)$ by $h_i^-(\xi) = h_i(\xi) + \frac{1}{n}\bigl(m - \sum_j h_j(\xi)\bigr)$, where $m$ is the joint maximum $\max_\xi \sum_i h_i(\xi)$. Let $F^-$ be the game derived from $F$ by replacing the payoff $h$ with $h^-$. We may follow an unpublished manuscript of Harsanyi and call $F^-$ the upraised game. It is evidently constant-sum, since the payoffs $h_i^-(\xi)$ sum to $m$ at every $\xi$. We define the upraised characteristic function $v^- = v^-_F$ of $F$ as the Neumann-Morgenstern characteristic function of $F^-$ and the contract value $\phi^-$ of $F$ as the Shapley value of $F^-$.

What does $\phi^-$ evaluate? To begin with, there is substantial agreement on the Shapley value for constant-sum transferable-utility games. As far as I know, none of the numerous values for more or less general games that have been proposed since Shapley's original paper [10] has differed on these games. A number of arguments leading to it are known. Note Shapley's axiomatic argument, adapted to the constant-sum setting in [4], and note Harsanyi's bargaining model [2].
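The definition of the upraised game can be checked mechanically. The following is a minimal sketch, not part of the original paper: the function name `upraise`, the dictionary representation of the payoff function, and the two-person payoff table are all illustrative assumptions. It adds to each player's payoff an equal share of the shortfall from the joint maximum $m$ and confirms that every upraised payoff vector sums to $m$.

```python
from fractions import Fraction


def upraise(payoffs):
    """Return the upraised payoff table.

    `payoffs` maps each pure strategy n-tuple to a payoff vector h(xi).
    Each player receives h_i(xi) + (m - sum_j h_j(xi)) / n, where m is
    the joint maximum of the payoff sums.
    """
    n = len(next(iter(payoffs.values())))
    m = max(sum(h) for h in payoffs.values())
    return {
        xi: tuple(Fraction(h_i) + Fraction(m - sum(h), n) for h_i in h)
        for xi, h in payoffs.items()
    }


# Hypothetical two-person game (numbers chosen only for illustration).
example = {
    (0, 0): (3, 3),
    (0, 1): (0, 5),
    (1, 0): (5, 0),
    (1, 1): (1, 1),
}
upraised = upraise(example)

# Every outcome of the upraised game has the same payoff sum m = 6.
assert all(sum(h) == 6 for h in upraised.values())
```

Exact rational arithmetic (`Fraction`) is used so that the constant-sum check is an equality rather than a floating-point approximation.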
Accordingly, we shall here take the step from $v^-$ to $\phi^-$ as established, and examine the passage from $F$ to $v^-$. A specification of the assumptions involved that would be adequate for, say, the designer of a game-theoretic experiment, would be extremely long. Roughly, we need the usual assumptions after von Neumann-Morgenstern [7], amplified by explicit legal assumptions (see below), and supplemented by a symmetry or democracy assumption to the effect that the players recognize each other as peers. That last assumption enters in two ways: in the (standard) bargaining theory to be applied to the model, and also in the structure of the model, where we seem to need a special meeting of all players. Of course no positive or negative concern of the players with each other's welfare is implied. In Zeuthen's and in Harsanyi's presentation of the bargaining theory [1], a player's resistance to a concession is supposed to depend monotonically on the ratio of the cost (to him) of the concession to the cost (to him) of a breakdown of negotiations, and by the same rule for all players. As for the special meeting of all players, it serves to conclude indefinitely long negotiations which may have already determined the outcome; but players considering whether to delay or not in the previous negotiations have the definite prospect of a last grand meeting, into which they will go with the support derived from previous agreements or the independence secured by previous disagreements, whichever they prefer.

Concerning the legal apparatus of the model, some preliminary remarks. Indisputably one wants contracts to be sufficiently definite so that a trial court can in a finite time determine whether the contract has been violated. That sounds like a question from recursive function theory; and I think the analogy is sound, remote though it seems from ordinary business practice.
An analogue in contracts to the endless passage from $n$ to $n + 1$ that gives rise to arithmetic may be found in this remark: if a contract $C_n$ (between A and B, say) could affect the prospects of the players, then so could a contract $C_{n+1}$ (between A and C) binding its parties to enter no contract of the form $C_n$. We may hope that the analogy is unsound. At any rate, the apparatus here proposed gives it no footing.

This model admits those and only those contracts $C$ such that, in consideration of certain side payments $x$ among the set $S$ of parties to $C$, all players in $S$ agree (1) to relinquish all their turns to move in the game tree to a designated agent who is to play a specified mixed strategy $\sigma$ and (2) to sign no other contract; any member $i$ of $S$ violating (1) or (2) is to pay a specified vector indemnity $f_i$ to the rest of $S$. Enforcement is paternalistic; without plaintiff, defendant, or trial, the court will assess the indemnities provided for.

Time taken to move or to negotiate is customarily ignored in game theory and can be ignored here, as long as we have three successive periods: the first for general negotiation, the second for an attempt for all players to agree on an outcome, the third (if necessary) for play and assessment and payment of indemnities. There is nothing to add to what has been said on the first period, and almost nothing on the third; but the second period bargaining requires a determination of the outcome if no agreement is reached. Now the players may be legally committed to play certain strategies, and may even have inconsistent commitments. In the second period I want them actually committed to a certain
course of play. Thus the second period begins with each player handing to a clerk a statement of a mixed strategy on the following finite set: the union of his set of pure strategies in the game tree and his set of contracts. (Playing a contract means complying with its play clause, and that would determine his play completely. This "early" commitment to a strategy is objectionable if one thinks of it as early, but the players are supposed to be through negotiating, ready to play, only making a last attempt to secure a jointly optimal outcome by compromise.) Then if the second period results in agreement on an outcome, that is the outcome; if not, play is governed by the strategies given to the clerk.

One remark seems needed, on cancelling contracts. Formally, we allowed the set of all players to cancel or render irrelevant previous contracts. In effect any superset of the set of parties to a contract $C$ can cancel $C$ by drawing up a new contract $D$ (a direct violation of $C$) and offsetting the indemnities associated with $C$ by the side payments associated with $D$.

The model is now completely specified, by the second and third paragraphs before this one. It is not specifically a bargaining model, though because of the rudimentary contract law it is a defective model for play. I remark that the customary model for cooperative play, from [7] on, amounts to the first period of this model, with a contract law about a millennium more primitive, followed by play.

Assuming Zeuthen-Nash-Harsanyi bargaining theory, any set $S$ of players can secure jointly in this model $v^-(S)$. (Since the same will hold for the complement of $S$, it will follow that $v^-(S)$ is all that $S$ can secure.) After the completely trivial proof, we shall note how the model leads to a result so different from the result of Harsanyi's similar model in [3].

Proof: our (second period) bargaining problem, for any threat payoff $(t_1, \dots, t_n)$, has the solution $(t_1 + a, \dots, t_n + a)$, where $na$ is the excess $m - \sum_i t_i$ [1]. Thus the first-period problem for $S$ is (1) the maximin problem, familiar from the Neumann-Morgenstern theory, for the upraised game $F^-$; and its solution is the solution $v^-(S)$ of that problem [7].

(1) The fact that the problem for "S", which is not a legal individual, is fairly simple follows from their ability to secure each other's continued cooperation by large indemnities.

How? Well, Harsanyi does not permit $S$ to choose the best coordinated mixed strategy $\sigma$ and still act as a number of separate persons in the bargaining. He does not even permit them to play uncoordinated strategies and bargain as separate persons. In Harsanyi's model, the number corresponding to $v^-(S)$ (call it $u(S)$) comes from $S$ and its complement choosing the best coordinated strategies and bargaining as two ("corporate") persons. (Strictly speaking, the present model does not permit them to bargain separately or corporately as they may prefer; but it would not matter if it did, since Zeuthen-Nash-Harsanyi bargaining theory for games with transferable utility never makes combination gainful.) Accordingly the rather strange function $u$ that emerges is not a plausible characteristic function (not superadditive). Harsanyi puts no stress on $u$ and describes his model
as a model for bargaining, not for play. The presentation in [3] goes on, in effect reproducing Harsanyi's previous [2] bargaining-model derivation of the Shapley value.

2. Crucial examples. There are two rather simple three-person games, for one of which the contract value seems prima facie sound and the Selten value (2) strange, with the reverse holding for the other. It is not only as a debating tactic that I present the latter first; it is simpler. In fact, there is no need to bother about the details of moves. There are three players, Adams, Brown, Cox. If Adams wishes it, Brown will receive 90 units. Otherwise no one gets anything. Call the game $F_{90}$. The Selten value is (45, 45, 0). Surely this seems obviously right. But the upraised characteristic function gives Adams (precisely, {Adams}) 30, Brown 30, Cox 0, and so on, and the contract value is (40, 40, 10).

Several objections may be made to this contract value. First, what has Cox to do with the game? (With a little jargon, the objector can give an answer instead of a question; Cox is a dummy, and has nothing to do with the game.) The model already described provides an answer. The indicated contract for Cox to try to negotiate with (e.g.) Adams will call for Cox to pay Adams 15 units, and for Adams to deny Brown the 90, paying Cox a prohibitively large indemnity if he violates. If this is legal, and accomplished, Brown cannot afterward hope to split 45-45 with Adams; Cox has bought a full one-third share in the enterprise of stealing 90 dollars, or whatever it is. Cox can expect a gross return of 30, net of 15. The reader can easily work out the whole analysis.

There is a second-order objection that I have met often enough to justify a comment on it. "Why don't Adams and Brown close ranks and get 45 instead of 40?" They may. Or Adams and Cox may close ranks. That would get Adams 45.

The second serious objection is, if Cox can do this, what about Davis? We were discussing a three-person game.
It seems to me that the reminder that the enterprise might consist of stealing 90 dollars almost suffices to answer this. It is notorious that in applying game theory to the description of actual conflict situations, often the hardest part is to say what game is being played. There is a legitimate question, what happens if there are very many players in the situation of Cox. Some answers are given in Section 3 below. Of course there are other legitimate questions on various levels (Is this model for cooperative play socially useful?); not for this paper.

A counterquestion: how does Selten justify the value (45, 45, 0)? By axioms. A dummy is a player having no moves, and the same payoff at every outcome. Two axioms require dummies to get nothing and to exert no influence [9]. Harsanyi's model secures that value by requiring Cox in effect to marry Adams and lose his legal identity if he makes any agreement with Adams at all.

(2) Selten's paper [9] discusses a number of values, but the main results (according to Selten's Introduction) concern this one and its constant-sum specialization.
Presumably some readers are now at least provisionally convinced. I must address readers who consider dummies outside the community or object to a pluralistic society. The next game $F_{54}$ has merits for that purpose and will lead to other points too. The moves are as before, but the payoff is (0, 54, 36) if Adams wishes, otherwise (0, 0, 0). Here the contract value is the rather obvious (30, 30, 30). The reader can check the analysis, but a reader who stayed with Selten for the previous example will give no credit to the contract value. An ad hoc argument for (30, 30, 30): each player may be supposed to ask for a "fair share". Adams' claim is ironclad; without him, nothing can be done. Brown's claim is at least as good as Cox's. Cox's claim is that the other players cannot get anything without giving Cox more than 30; if they wish to divide 90, they can do it foolishly and overpay Cox, or they can treat it as a project requiring unanimous agreement and share equally.

The Selten value is (33, 33, 24). All arguments for it (that I know of) depend somehow, and I think must depend, on a judgment that the 54-36 security which Adams-Brown have against irrational behavior, or skillful bargaining, or other deviant actions by Cox, is worth something, while the lower-level security which Adams-Cox have is worth nothing. (If it were worth something, Adams would have a higher value than Brown.) Of course the dividing line is 45, and an unfriendly critic can readily explain it as the product of a naive theory of coalition formation.

I do not see how the friendliest critic can maintain the (33, 33, 24) value in a social context like that of Harsanyi's model (3) or my more detailed model in Section 1. If the players are bargaining, we may suppose Cox to say: "Your insurance against my deviations is interesting, and I wish I had insurance against your theory. I will not accept 24. If you wish to optimize, give me 30.
If you value your insurance more, take it, and I will pocket my 36". If the context is arbitration, Cox may be more helpless, but otherwise his argument still seems sound.

(3) The working of Harsanyi's model for F involves three shotgun marriages, i.e. coalitions formed in order to lose.

The more serious defense of (33, 33, 24), in line with the insurance idea, treats the value as some sort of average outcome of play. (With the best precedent; the idea is older [10] than the idea of a value as outcome of a definite procedure.) It is difficult to criticize a distribution of which only the mean is known, but we can look. Since an insurance policy paying 45 is worth 0 in this setting, the last 9 of Adams-Brown's 54 provides all their advantage. (We are comparing $F_{54}$ with $F_{45}$, defined in the obvious way. The Selten and contract values for $F_{45}$ agree: (30, 30, 30).) This is 4½ units each, some of the time. Their gain in value is 3 each, in the mean. Clearly we cannot get such a result by giving Adams and Brown the extra 4½ only the one-third of the time that the A-B coalition may be supposed to form.
We must assume that if Cox combines with Adams, he will have to compensate Adams for entering the less secure A-C partnership instead of A-B (4). This suggests that Adams can be 4½ better off two-thirds of the time. (For if Brown-Cox combine, the insurance available for Adams-Brown cannot help Adams.) But the possibility of Adams-Brown getting 60 instead of 54 has completely disappeared. The player excluded from a two-man coalition must behave deviantly with probability 1. And with invariable success.

Perhaps a better explanation, though not for supporting Selten's theory, is that as this game is 20% of the way from $F_{45}$ to $F_{90}$, where Cox would be a dummy, Cox is suffering from 20% leprosy. There is a serious point here; is Selten's value for $F_{54}$ required by linearity to be the weighted average that it is of the values of $F_{45}$ and of $F_{90}$? In a way, yes; the value of $F_{54}$ is determined by the axioms. But the linearity notion involved in the axioms, a standard one, does not make $F_{54}$ a weighted average of the other two. $.8 F_{45} + .2 F_{90}$ is a well-defined game $M$, played as follows. There is an initial chance move, after which either $F_{45}$ (with probability .8) or $F_{90}$ (with probability .2) will be played. Selten's axioms require $M$ to have the corresponding average value, (33, 33, 24), which happens to be the same as that of $F_{54}$. The contract value also has this linearity property, and the contract value of $M$ is (32, 32, 26). It differs from the Selten value just in proportion as the contract value of $F_{90}$ differs from the Selten value of $F_{90}$. But $M$ is not at all the same as $F_{54}$; the difference has some value according to my theory, none according to Selten's. The difference is just that the one windfall of 90, which in $F_{54}$ Adams can give or withhold, breaks into two parts in $M$. The parts happen to be mathematical expectations, but for game theory it would be the same if they were certainties, mere physical parts of the 90-unit pile.
Adams can give 72, 36 each to Brown and to Cox; and he can give Brown 18 independently of what he does with the 72. Obviously Cox has no claim to equal standing in the game $M$. According to the Selten value, the detached 18 in $M$ as compared with $F_{54}$ does not strengthen Adams and Brown.

A remark is in order about linearity. The contract value has the property, and I have no argument against linearity; but neither do I regard it as a satisfactory axiom to support dubious theories with. If one drops the artificial hypothesis of transferable utility (greatly increasing the difficulties), neither the Nash cooperative value for two-person games [6], the value for two-person games in my thesis [4], nor any value that I know of satisfies any substantial remnant of linearity. One would not expect equality, for the average of values need not be Pareto optimal in the average of games. But one might expect an inequality. But it fails.

(4) 4½ is enough compensation, for Adams has no prospect in $F_{54}$ more than 4½ better than in $F_{45}$.
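The contract values quoted in this section can be reproduced with a short computation. What follows is a sketch under a simplifying assumption that is mine, not the paper's: only one player (Adams, index 0) has moves, so the maximin value of a coalition in the upraised game reduces to a maximum over Adams' pure choices when the coalition contains him and a minimum when it does not. The function names and the list-of-outcomes encoding are hypothetical. The game $M$ is entered through its four expected payoff vectors (give both parts, only the 72, only the 18, or nothing).

```python
from fractions import Fraction
from itertools import permutations


def upraise(outcomes):
    """Add to each payoff an equal share of the shortfall from the joint maximum."""
    n = len(outcomes[0])
    m = max(sum(z) for z in outcomes)
    return [tuple(Fraction(h) + Fraction(m - sum(z), n) for h in z) for z in outcomes]


def upraised_char_fn(outcomes, mover=0):
    """Characteristic function of the upraised game when only `mover` has moves."""
    up = upraise(outcomes)

    def v(coalition):
        totals = [sum(z[i] for i in coalition) for z in up]
        # The mover picks the outcome: best for his coalition, worst for the others.
        return max(totals) if mover in coalition else min(totals)

    return v


def shapley(v, n):
    """Shapley value by averaging marginal contributions over all orderings."""
    value = [Fraction(0)] * n
    orders = list(permutations(range(n)))
    for order in orders:
        before = frozenset()
        for i in order:
            after = before | {i}
            value[i] += v(after) - v(before)
            before = after
    return tuple(x / len(orders) for x in value)


def contract_value(outcomes):
    return shapley(upraised_char_fn(outcomes), len(outcomes[0]))


# Each game lists the payoff vector for every choice open to Adams.
F90 = [(0, 90, 0), (0, 0, 0)]
F54 = [(0, 54, 36), (0, 0, 0)]
F45 = [(0, 45, 45), (0, 0, 0)]
M = [(0, 54, 36), (0, 36, 36), (0, 18, 0), (0, 0, 0)]

values = {name: contract_value(g)
          for name, g in [("F90", F90), ("F54", F54), ("F45", F45), ("M", M)]}
```

Under these assumptions the sketch yields (40, 40, 10) for $F_{90}$, (30, 30, 30) for $F_{54}$ and $F_{45}$, and (32, 32, 26) for $M$, the last being the .8/.2 weighted average of the values of $F_{45}$ and $F_{90}$.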
How does the Selten value of $F_{54}$ follow from the axioms? The complete answer is very complicated (fourteen auxiliary games in Selten's proof [9]), but the qualitative point that Cox's value will be less than the others is easily established; and the argument brings out another feature of the theory that may enrich our dossier. Consider the game $A$ in which Adams chooses (0, 0, 0) or (-36, 0, -36). In $F_{54} + A$ (a careful reader may translate via $.5 F_{54} + .5 A$, for correctness) one strategy for Adams yields (-36, 54, 0); so Adams-Brown can certainly secure (9, 9, 0), and Cox is clearly weaker. (It is interesting to compare $F_{54} + A$ and $M$; my theory agrees with Selten's that Cox is weaker in these games, but makes him quite a bit worse off in $F_{54} + A$. But this is not the place for fine points.) Of course that settles $F_{54}$, for the Selten value of $A$ is (0, 0, 0).

Look at what the dummy taboo does in $A$. The contract value is (4, 4, -8). The mechanism is plain; this is a game of extortion, the threat (-36, 0, -36) is not very persuasive, but a threat of (-18, -18, -36) may move Cox. Selten forbids it; this may be socially desirable. It is not impartial. From Adams' point of view as well as Brown's, it prohibits best play.

3. Residual value. If $F$ has players $1, \dots, n$, let $D_k(F)$ denote an $(n + k)$-player game constructed from $F$ by adding dummy players $n + 1, \dots, n + k$, each of whom gets 0 at each outcome of $F$. $D_k\phi^-(F)$ is the $n$-vector consisting of the first $n$ components of $\phi^-(D_k(F))$. The residual value $\phi^r(F)$ is $\lim_{k \to \infty} D_k\phi^-(F)$. The question of the easiest proof that $\phi^r$ exists seems mildly interesting; as we see from the example $A$, the vectors $D_k\phi^-(F)$ need not be monotone decreasing. Here we shall describe a computation previously done by L. S. Shapley (unpublished) and indicate why it yields the residual value.
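The definition of $D_k\phi^-$ can be explored numerically for the example $A$. The sketch below is an illustration under assumptions of mine: as before, only Adams (index 0) has moves, so coalition values in the upraised game are a maximum or minimum over his choices; the function name `dummy_extended_value` is hypothetical; and coalitions are grouped by their real members and their number of dummies, which is legitimate because all dummies are interchangeable. It is not the computation attributed to Shapley in the text.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def dummy_extended_value(outcomes, k, mover=0):
    """First n components of the contract value after adding k dummy players."""
    n = len(outcomes[0])
    total_players = n + k
    m = max(sum(z) for z in outcomes)

    def v(real, size):
        # Coalition made of the real players in `real` plus (size - len(real)) dummies.
        # In the upraised game each member also gets 1/(n+k) of the shortfall m - sum(z).
        totals = [sum(z[i] for i in real) + Fraction(size, total_players) * (m - sum(z))
                  for z in outcomes]
        return max(totals) if mover in real else min(totals)

    values = []
    for i in range(n):
        others = [p for p in range(n) if p != i]
        acc = Fraction(0)
        for r in range(len(others) + 1):
            for real in combinations(others, r):
                for j in range(k + 1):
                    s = r + j  # number of predecessors of player i
                    # comb(k, j) coalitions of this shape, each with Shapley
                    # weight s!(N-1-s)!/N! = 1 / (N * comb(N-1, s)).
                    weight = Fraction(comb(k, j), total_players * comb(total_players - 1, s))
                    acc += weight * (v(real + (i,), s + 1) - v(real, s))
        values.append(acc)
    return values


A = [(0, 0, 0), (-36, 0, -36)]   # Adams chooses between these two outcomes
no_dummies = dummy_extended_value(A, 0)
many_dummies = [float(x) for x in dummy_extended_value(A, 397)]
```

With no dummies the sketch returns the contract value (4, 4, -8). With several hundred dummies the components for Adams and Cox come out close to 3 and -15, while Brown, himself a dummy in $A$, drifts toward 0.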
Recall that the Shapley value $\phi_i(D_k(F)^-)$ of the upraised game with $k$ dummies, to the $i$-th player, is the average of the expressions $v^-(S \cup \{i\}) - v^-(S)$, averaged over the $(n + k)!$ total orderings of the players, $S$ being the set of predecessors of $i$. Let $x_j$ ($j = 1, \dots, n$) be the fraction of all the players preceding the $j$-th, $S_0 = \{j \in S : j \le n\} = S_0(x, i)$. Then $v^-(S \cup \{i\}) - v^-(S) = \delta_k(x, i)$ is determined by the $n$-vector $x$ and the indices $i$, $k$; $v^-(S)$, for example, is the value of a two-player constant-sum game with sum $m$ derived from $F$ by aggregating players, $S_0$ making the first player and getting at each outcome $z$ the payoff $\sum[h_j(z) : j \in S_0] + x_i\bigl(m - \sum_j h_j(z)\bigr)$. The other term involves $k$. So

(3.0) $\phi_i(D_k(F)^-) = \int \delta_k(x, i)\, d\mu_k$,

where $\mu_k$ is a suitable atomic measure on the unit $n$-cube. The formulas define $\delta_k(x, i)$ on the whole cube, piecewise uniformly continuous on a fixed set of $n!$ pieces; then so is $\delta(x, i) = \lim_{k \to \infty} \delta_k(x, i)$, for the convergence is uniform. We conclude

(3.1) $\phi_i^r = \int \delta(x, i)\, d\mu$,

where $\mu$ is ordinary volume. For the functionals $\int \cdot\, d\mu_k$ (Riemann sums) converge to $\int \cdot\, d\mu$.

To illustrate (3.1), the double integral for $\phi_a^r$ (a = Adams) in the example $A$ reduces to $2\int_0^{.5} (36x - 72x^2)\, dx = 3$; $\phi_c^r$ is $-15$. Evidently the operators $D_k$, $\phi^r$ are positive-linear on games to games resp. vectors. From this we can get an interesting remark and a computational shortcut. The constant-sum extension of $F$, with any chosen constant sum $c$, is constructed [7] by adjoining a player who gets $c - \sum_i h_i(z)$ at each outcome $z$. In general the contract value of the constant-sum extension differs from the residual value; for $A$ it is $(12, 0, -24, 12 + c)$. But:

(3.2) The constant-sum extension gives the residual value for fixed-threat games.
A fixed-threat, or "characteristic function" game is determined by giving a superadditive function $v$ on the set of all sets of players. To play, each player names a set; those sets $S$ named by all their members get $v(S)$ (divided equally among them, say), and players $i$ left out get $v(\{i\})$. Now the characteristic functions, as set functions, are linearly generated [10] by the pure bargaining games $B_S$ determined by $v(T) = 1$ for $T \supseteq S$, $v(T) = 0$ otherwise. Thus every fixed-threat game $F$ satisfies a relation $F + \sum \lambda_S B_S = \sum \mu_S B_S$ with non-negative coefficients $\lambda_S$ and $\mu_S$. (Strictly, + for games complicates the strategies; to justify the equation we must modify the sums of games by requiring a player to name the same $S$ for each summand. Clearly this will not affect the values.) So it suffices to prove (3.2) for the games $B_S$. In the constant-sum extension of $B_S$ with sum 1, the added player's value is the probability that in a total ordering he is between two members of $S$; this is $(s - 1)/(s + 1)$. The remaining $2/(s + 1)$ is shared equally by the members of $S$. In $D_k(B_S)$, if the players are totally ordered, each player $i$ of $S$ has $\delta_k(x, i) < 1/k$ except for the first and last of them, who get together nearly the fraction of players not between two members of $S$. The expected value is again $2/(s + 1)$, and $S$ shares it equally.

REFERENCES
1. J. C. Harsanyi, Approaches to the bargaining problem before and after the theory of games: a critical discussion of Zeuthen's, Hicks', and Nash's theories, Econometrica 24 (1956), 144-157.
2. J. C. Harsanyi, A bargaining model for the cooperative n-person game, Annals of Math. Study 40, Princeton, 1959, 325-355.
3. J. C. Harsanyi, A simplified bargaining model for the n-person cooperative game, International Economic Review 4 (1963), 194-220.
4. J. R. Isbell, Absolute games, Annals of Math. Study 40, Princeton, N. J., 1959, 357-396.
5. J. R. Isbell, A modification of Harsanyi's bargaining model, Bull. Amer. Math. Soc. 66 (1960), 70-73.
6. J. Nash, Two-person cooperative games, Econometrica 21 (1953), 128-140.
7. J. von Neumann and O. Morgenstern, Theory of Games and Economic Behavior, 2nd edn., Princeton, 1947.
8. T. C. Schelling, The Strategy of Conflict, Cambridge, Mass., 1960.
9. R. Selten, Valuation of n-person games, Annals of Math. Study 52, Princeton, 1964, 577-626.
10. L. S. Shapley, A value for n-person games, Annals of Math. Study 28, Princeton, N. J., 1953, 307-317.
11. L. S. Shapley, Values of large market games: status of the problem, Rand Memorandum RM-3957-PR, 1964.

CASE-WESTERN RESERVE UNIVERSITY
CLEVELAND, OHIO
NEIGHBORHOODS OF EXTREME POINTS (1)
BY
I. NAMIOKA

ABSTRACT
An examination of the relationship between two neighborhood systems (relative to two linear topologies) of extreme points yields a unified approach to some known and new results, among which are Bessaga-Pełczyński's theorem on closed bounded convex subsets of separable conjugate Banach spaces and Ryll-Nardzewski's fixed point theorem.

§0. Introduction. Let $C$ be a compact subset of a Banach space $E$. Then, of course, the norm topology and the weak topology agree on $C$. Now suppose that $C$ is only weakly compact. Then the identity map: $(C, \text{weak}) \to (C, \text{norm})$ is no longer continuous in general. Nevertheless one may still ask how the set of points of continuity of this map is distributed in $C$. In particular, when $C$ is convex as well as weakly compact, is the identity map: $(C, \text{weak}) \to (C, \text{norm})$ continuous at any of the extreme points of $C$, i.e., do there exist extreme points of $C$ which have weak neighborhoods (relative to $C$) of arbitrarily small diameter? The importance of an answer to such a question is demonstrated in Rieffel [7] and in note [6]. Professor J. L. Kelley also recognized the relevance of this question to Ryll-Nardzewski's fixed point theorem.

The work of Lindenstrauss in [4] yields the following answer: if $C$ is a weakly compact, convex subset of a separable Banach space, then there are "many" extreme points of $C$, where the identity map $(C, \text{weak}) \to (C, \text{norm})$ is continuous. This fact was proved by using deep Banach space techniques due to Kadec and Lindenstrauss. In the present article, we shall generalize this result in various directions. The main theorem of this paper (Theorem 2.3) is stated in somewhat obscure, if not pedantic, language, because we tried to combine all the generalizations into one theorem. However, we hope this is forgiven because of the diverse applications of the single theorem.
Here are some of the consequences of the main theorem: each bounded subset of a separable, conjugate Banach space is "dentable" in the sense of [7]; each closed, convex, bounded subset of $E$ is the closed convex hull of its extreme points, where $E$ is either a separable, conjugate Banach space or a Fréchet space such that $E^{**}$ is separable relative to its strong topology. In addition, a slight generalization of Ryll-Nardzewski's fixed point theorem can easily be derived from the main theorem.

Received February 27, 1967.
(1) This research was partly supported by the U.S. National Science Foundation.
The paper is organized as follows: Section §0, the present one, is the introduction. Section §1 contains the preliminary material, and section §2 is devoted to the main theorem. Our proof of the main theorem is independent of Lindenstrauss' work and is quite different in spirit. Category plays a large rôle throughout §§1-2. Section §3 gives applications of the main theorem. Our terminology and notation will be those of Kelley, Namioka, et al. [3]. Finally, we wish to thank R. Phelps for many enlightening discussions on the subject of the present paper.

§1. Preliminaries. Let $(E, \mathcal{T})$ be a linear topological space, and let $A$ be a subset of $E$. Then we denote by $(A, \mathcal{T})$ the space $A$ with the topology induced by $\mathcal{T}$. If $p$ is a pseudo-norm on the linear space $E$, then $\mathcal{T}_p$ denotes the pseudo-norm topology on $E$ given by $p$. The pseudo-norm $p$ is lower $\mathcal{T}$-semicontinuous if $\{x : p(x) \le 1\}$ is $\mathcal{T}$-closed. If $V$ is a convex, circled, $\mathcal{T}$-closed subset of $E$ which is radial at 0 (2), then the Minkowski functional of $V$ is lower $\mathcal{T}$-semicontinuous. For instance, if $E$ is a normed linear space, then the norm on $E$ is lower $w$-semicontinuous, and the norm on the dual $E^*$ is lower $w^*$-semicontinuous, where $w$ and $w^*$ are the topologies $w(E, E^*)$ and $w(E^*, E)$ respectively.

1.1 LEMMA. Let $X$ be a compact Hausdorff space, and let $\{C_i : i = 1, 2, \dots\}$ be a sequence of closed subsets of $X$ such that $X = \bigcup\{C_i : i = 1, 2, \dots\}$. Then $\bigcup\{\operatorname{Int} C_i : i = 1, 2, \dots\}$ is dense in $X$, where $\operatorname{Int} C_i$ is the interior of $C_i$ in $X$.

Proof. We may assume that $X \ne \emptyset$. Let $U$ be an open nonempty subset of $X$. Then $U$ is locally compact, and hence $U$ is of the 2nd category in itself. Since $U = \bigcup\{U \cap C_i : i = 1, 2, \dots\}$ and $U \cap C_i$ is closed in $U$, for at least one $i$, $U \cap C_i$ has non-empty interior relative to $U$ and hence relative to $X$. Therefore $U \cap \bigcup\{\operatorname{Int} C_i : i = 1, 2, \dots\} \ne \emptyset$, and since $U$ is arbitrary, $\bigcup\{\operatorname{Int} C_i : i = 1, 2, \dots\}$ is dense in $X$.

1.2 PROPOSITION.
Let $(E,\mathcal{T})$ be a Hausdorff linear topological space, let $p$ be a lower $\mathcal{T}$-semicontinuous pseudo-norm such that $(E,\mathcal{T}_p)$ is separable, and let $K$ be a $\mathcal{T}$-compact subset of $E$. The set of all points of continuity of the identity map: $(K,\mathcal{T}) \to (K,\mathcal{T}_p)$ is a dense $G_\delta$ subset of $(K,\mathcal{T})$.

Proof. For a subset $X$ of $E$, define $p\text{-diam}(X) = \sup\{p(x - y) : x, y \in X\}$. For each $\varepsilon > 0$, let $A_\varepsilon$ be the union of all open subsets of $(K,\mathcal{T})$ of $p$-diam $\le \varepsilon$. Clearly $A_\varepsilon$ is open. Let $S = \{x : p(x) \le \varepsilon/2\}$. Then, since $(E,\mathcal{T}_p)$ is separable, there is a sequence $\{x_i\}$ in $E$ such that $K = \bigcup\{K \cap (x_i + S) : i = 1,2,\dots\}$, and each $K \cap (x_i + S)$ is $\mathcal{T}$-closed because $p$ is lower $\mathcal{T}$-semicontinuous. Hence, by Lemma 1.1, the union of the interiors of $K \cap (x_i + S)$ in $(K,\mathcal{T})$ is dense in $K$, and this union is clearly contained in $A_\varepsilon$. Therefore $A_\varepsilon$ is a dense open subset
(2) $V$ is radial at 0 if, for each $x$ in $E$, there is a positive number $t$ such that $sx \in V$ whenever $0 \le s \le t$.
1967]
NEIGHBORHOODS OF EXTREME POINTS
of $(K,\mathcal{T})$. Now the set of points of continuity of the identity map: $(K,\mathcal{T}) \to (K,\mathcal{T}_p)$ is precisely $\bigcap\{A_{1/n} : n = 1,2,\dots\}$, and this set is dense in $(K,\mathcal{T})$, because $(K,\mathcal{T})$ is of the 2nd category in itself.

In order to state the corollary, we introduce the following notion, which will be useful in the sequel as well. A locally convex linear bitopological space is a triple $(E;\mathcal{T}_1,\mathcal{T}_2)$ such that $\mathcal{T}_1$ and $\mathcal{T}_2$ are locally convex vector topologies for the linear space $E$ and such that there is a local base for $\mathcal{T}_1$ consisting of $\mathcal{T}_2$-closed sets. Clearly the last condition is equivalent to: a family $\{p_\alpha\}$ of $\mathcal{T}_1$-continuous, lower $\mathcal{T}_2$-semicontinuous pseudo-norms determines the topology $\mathcal{T}_1$, i.e. a net $\{x_\gamma\}$ $\mathcal{T}_1$-converges to $x$ if and only if $\lim_\gamma p_\alpha(x_\gamma - x) = 0$ for each $\alpha$. Let $(E,\mathcal{T})$ be a locally convex linear topological space, and let $E^*$ be its dual. Then $(E;\mathcal{T},w(E,E^*))$, $(E^*;s(E^*,E),w(E^*,E))$ and $(E^*;m(E^*,E),w(E^*,E))$ are examples of locally convex linear bitopological spaces.(3)
(3) $s(E^*,E)$, $w(E^*,E)$ and $m(E^*,E)$ are respectively the strong, weak and Mackey topologies induced on $E^*$ by the natural pairing of $E^*$ and $E$.

1.3 COROLLARY. Let $(E;\mathcal{T}_1,\mathcal{T}_2)$ be a locally convex linear bitopological space such that $(E,\mathcal{T}_1)$ is pseudo-metrizable and separable and $(E,\mathcal{T}_2)$ is Hausdorff. If $K$ is a $\mathcal{T}_2$-compact subset of $E$, then the set of all points of continuity of the identity map: $(K,\mathcal{T}_2) \to (K,\mathcal{T}_1)$ is a dense $G_\delta$ subset of $(K,\mathcal{T}_2)$.

Proof. Let $\{p_i\}$ be a sequence of $\mathcal{T}_1$-continuous, lower $\mathcal{T}_2$-semicontinuous pseudo-norms which determines $\mathcal{T}_1$. Let $Z_i$ be the set of all points of continuity of the identity map: $(K,\mathcal{T}_2) \to (K,\mathcal{T}_{p_i})$. By Proposition 1.2, each $Z_i$ is a dense $G_\delta$ subset of $(K,\mathcal{T}_2)$, and therefore $Z = \bigcap\{Z_i : i = 1,2,\dots\}$ is also a dense $G_\delta$ subset of $(K,\mathcal{T}_2)$. The set $Z$ is precisely the set of all points of continuity of the identity map $(K,\mathcal{T}_2) \to (K,\mathcal{T}_1)$.

In Proposition 1.2 and Corollary 1.3, the assumption of separability is necessary. The following example was suggested by E. Michael. Let $M$ be a compact Hausdorff space such that no singleton is $G_\delta$, let $E$ be the Banach space $C(M)$ with the supremum norm, and let $e : M \to E^*$ be the evaluation map. Then $e[M]$ is a weak* compact subset of $E^*$. If the identity map: $(e[M],\text{weak*}) \to (e[M],\text{norm})$ is continuous at $e(m)$, $m \in M$, then $\{m\}$ is $G_\delta$ in $M$. Therefore this map is not continuous at any point of $e[M]$. For $M$, we may take for instance $[0,1]^{\Gamma}$.

2. The main results. In Corollary 1.3, suppose $K$ is convex as well as $\mathcal{T}_2$-compact. Then we may ask, as we have done in the introduction, whether the identity map $(K,\mathcal{T}_2) \to (K,\mathcal{T}_1)$ is continuous at any of the extreme points of $K$. We shall show below that this is indeed the case. We need the following theorem of Choquet. The symbol $\operatorname{ext}(K)$ denotes the set of all extreme points of $K$. For a proof of Choquet's theorem, see [2; p. 355].

2.1 THEOREM (Choquet). Let $(E,\mathcal{T})$ be a Hausdorff locally convex linear topological space, and let $K$ be a non-empty, compact, convex subset of $E$. Then $(\operatorname{ext}(K),\mathcal{T})$ is of the 2nd category in itself.

The essential idea of the proof of the next theorem is in the proof of the lemma in [6]. We shall, however, give the proof in full for the sake of completeness.

2.2 THEOREM. Let $(E,\mathcal{T})$ be a Hausdorff locally convex linear topological space, let $p$ be a lower $\mathcal{T}$-semicontinuous pseudo-norm on $E$ such that $(E,\mathcal{T}_p)$ is separable, let $K$ be a $\mathcal{T}$-compact, convex subset of $E$, and let $Z$ be the set of all points of continuity of the identity map $(K,\mathcal{T}) \to (K,\mathcal{T}_p)$. Then $Z \cap \operatorname{ext}(K)$ is a dense $G_\delta$ subset of $(\operatorname{ext}(K),\mathcal{T})$.

Proof. We may assume that $K \ne \emptyset$. Let $X = \operatorname{ext}(K)$ and $\varepsilon > 0$. Let $B_\varepsilon$ be the subset of $X$ such that $u \in B_\varepsilon$ if and only if there is a neighborhood of $u$ in $(K,\mathcal{T})$ of $p$-diam $\le \varepsilon$. Clearly $B_\varepsilon$ is an open subset of $(X,\mathcal{T})$. We will show that $B_\varepsilon$ is dense in $(X,\mathcal{T})$. Let $W$ be an arbitrary $\mathcal{T}$-open subset of $E$ such that $W \cap X \ne \emptyset$; we must show that $B_\varepsilon \cap W \ne \emptyset$. Let $D$ be the $\mathcal{T}$-closure of $X = \operatorname{ext}(K)$. Then $D$ is $\mathcal{T}$-compact and $W \cap D \ne \emptyset$. By Proposition 1.2, the set of all points of continuity of the identity map: $(D,\mathcal{T}) \to (D,\mathcal{T}_p)$ is dense in $(D,\mathcal{T})$. Hence there is a $\mathcal{T}$-open subset $V$ of $E$ such that $W \cap D \supset V \cap D \ne \emptyset$ and $p\text{-diam}(V \cap D) < \varepsilon/2$. Let $K_1$ be the $\mathcal{T}$-closed, convex hull of the $\mathcal{T}$-compact set $D \sim V$, and let $K_2$ be the $\mathcal{T}$-closed convex hull of $D \cap V$. Since $K_1$ and $K_2$ are $\mathcal{T}$-compact and $X \subset K_1 \cup K_2$, $K$ is the convex hull of $K_1 \cup K_2$. Note also that $p\text{-diam}(K_2) < \varepsilon/2$, because $p$ is lower $\mathcal{T}$-semicontinuous. Moreover $K_1 \ne K$, because $\operatorname{ext}(K_1) \subset D \sim V$ and $D \cap V \ne \emptyset$. Let $r \in (0,1]$, and let $C_r$ be the image of the map $f_r : K_1 \times K_2 \times [r,1] \to K$ defined by $f_r(x_1,x_2,\lambda) = \lambda x_1 + (1-\lambda)x_2$. Then $C_r$ is a $\mathcal{T}$-compact, convex subset of $K$. In addition, $C_r \ne K$, because $X \cap C_r \subset K_1$ and $K_1 \ne K$. Let $y \in K \sim C_r$. Then $y$ is of the form $y = \lambda x_1 + (1-\lambda)x_2$, $x_i \in K_i$, $\lambda \in [0,r)$. Hence $p(y - x_2) = \lambda p(x_1 - x_2) \le rd$, where $d = p\text{-diam}(K)$. Now by the absorption theorem [3; p. 91], the set $\{x : p(x) \le 1\}$ absorbs $K$; hence $K$ is $\mathcal{T}_p$-bounded, and it follows that $d < \infty$. Taking into account that $p\text{-diam}(K_2) < \varepsilon/2$, we see that $p\text{-diam}(K \sim C_r) \le \varepsilon/2 + 2rd$. Let $C = C_r$ with $r = \varepsilon/4d$; then $p\text{-diam}(K \sim C) < \varepsilon$. Since $C \ne K$, there is $u$ in $(K \sim C) \cap X$, and $K \sim C$ is a neighborhood of $u$ in $(K,\mathcal{T})$ of $p$-diam $< \varepsilon$. Hence $u \in B_\varepsilon$. Next, since $C \supset K_1 \supset D \sim V$, $u \in D \cap V \subset W$. Therefore $u \in B_\varepsilon \cap W$, and consequently $B_\varepsilon \cap W \ne \emptyset$. Hence $B_\varepsilon$ is dense in $(X,\mathcal{T})$. Finally, to conclude the proof, observe that $Z \cap X = \bigcap\{B_{1/n} : n = 1,2,\dots\}$ and that the set $\bigcap\{B_{1/n} : n = 1,2,\dots\}$ is a dense $G_\delta$ subset of $(X,\mathcal{T})$ in view of Theorem 2.1.

2.3 THE MAIN THEOREM. Let $(E;\mathcal{T}_1,\mathcal{T}_2)$ be a locally convex linear bitopological space such that $(E,\mathcal{T}_1)$ is pseudo-metrizable and separable and $(E,\mathcal{T}_2)$ is Hausdorff, let $K$ be a $\mathcal{T}_2$-compact, convex subset of $E$, and let $Z$ be the set of
all points of continuity of the identity map: $(K,\mathcal{T}_2) \to (K,\mathcal{T}_1)$. Then $Z \cap \operatorname{ext}(K)$ is a dense $G_\delta$ subset of $(\operatorname{ext}(K),\mathcal{T}_2)$, and the $\mathcal{T}_2$-closed, convex hull of $Z \cap \operatorname{ext}(K)$ is $K$.

Proof. Let $Z_i$ be as in the proof of Corollary 1.3. Then, by Theorem 2.2, $Z_i \cap \operatorname{ext}(K)$ is a dense $G_\delta$ subset of $(\operatorname{ext}(K),\mathcal{T}_2)$. By Theorem 2.1, $(\operatorname{ext}(K),\mathcal{T}_2)$ is of the 2nd category in itself. Therefore $\bigcap\{Z_i \cap \operatorname{ext}(K) : i = 1,2,\dots\} = [\bigcap\{Z_i : i = 1,2,\dots\}] \cap \operatorname{ext}(K) = Z \cap \operatorname{ext}(K)$ is a dense $G_\delta$ subset of $(\operatorname{ext}(K),\mathcal{T}_2)$. The second conclusion follows from the first one, because of the Krein-Milman theorem.

3. Applications. In this section, we shall derive several consequences of the material presented in §2. If $(E,\mathcal{T})$ is a locally convex, separable, metrizable, linear topological space, then the bitopological space $(E;\mathcal{T},w(E,E^*))$ satisfies the conditions of Theorem 2.3. Thus we obtain:

3.1 THEOREM. Let $(E,\mathcal{T})$ be a locally convex, separable, metrizable, linear topological space, let $K$ be a weakly compact (i.e. $w(E,E^*)$-compact), convex subset of $E$, and let $Z$ be the set of all points of continuity of the identity map $(K,w(E,E^*)) \to (K,\mathcal{T})$. Then $Z \cap \operatorname{ext}(K)$ is weakly dense in $\operatorname{ext}(K)$, and the closed convex hull of $Z \cap \operatorname{ext}(K)$ is $K$.

REMARKS. (a) In Theorem 3.1, instead of the separability of $(E,\mathcal{T})$, one can assume that $K$ is $\mathcal{T}$-separable or, equivalently, that $K$ is weakly separable, because then the closed subspace generated by $K$ will be separable. (b) If $E$ is a separable Banach space, then Theorem 3.1 is a consequence of Lindenstrauss' theorem [4; Theorem 4]. For, if $u$ is a "strongly exposed" point of $K$, then the identity map $(K,\text{weak}) \to (K,\text{norm})$ is continuous at $u$.

Let $E$ be a Banach space such that $E^*$ is separable. Then the bitopological space $(E^*;\mathcal{T},w(E^*,E))$ also satisfies the conditions of Theorem 2.3, where $\mathcal{T}$ is the norm topology for $E^*$. Thus we obtain:

3.2 THEOREM. Let $E$ be a Banach space such that $E^*$ is separable, let $K$ be a weak* compact, convex subset of $E^*$, and let $Z$ be the set of all points of continuity of the identity map: $(K,\text{weak*}) \to (K,\text{norm})$. Then $Z \cap \operatorname{ext}(K)$ is weak* dense in $\operatorname{ext}(K)$.

3.3 COROLLARY.
Let $E$ be a Banach space such that $E^*$ is separable, let $K$ be a norm-closed, bounded, convex subset of $E^*$, and let $K_1$ be the weak*-closure of $K$. Then $K \cap \operatorname{ext}(K_1)$, which is a subset of $\operatorname{ext}(K)$, is weak* dense in $\operatorname{ext}(K_1)$.

Proof. Since $K$ is bounded, $K_1$ is weak* compact. Hence Theorem 3.2 applies to $K_1$. Let $Z$ be the set of all points of continuity of the identity map $(K_1,\text{weak*}) \to (K_1,\text{norm})$, and let $z \in Z$. Since $K$ is weak* dense in $K_1$, there is a net $\{x_\alpha\}$ in $K$ converging to $z$ relative to the weak* topology. Then $\{x_\alpha\}$ converges to $z$ relative to the norm topology, because $z \in Z$, and therefore $z \in K$. Hence $Z \subset K$ and $Z \cap \operatorname{ext}(K_1) \subset K \cap \operatorname{ext}(K_1)$. It follows from Theorem 3.2 that $K \cap \operatorname{ext}(K_1)$ is weak* dense in $\operatorname{ext}(K_1)$. Finally, since $K_1 \supset K$, $K \cap \operatorname{ext}(K_1) \subset \operatorname{ext}(K)$.
From Corollary 3.3, we may deduce a recent result due to Bessaga and Pełczyński [1]:

3.4 COROLLARY. Let $E$ be a Banach space such that $E^*$ is separable. Then each norm-closed, convex, bounded subset of $E^*$ is the norm-closed, convex hull of its extreme points.

Proof. According to Lemma 1 in [5], it is sufficient to prove that, if $K$ is a non-empty, norm-closed, convex, bounded subset of $E^*$, then $\operatorname{ext}(K) \ne \emptyset$. Let $K_1$ be the weak* closure of $K$. Then $\operatorname{ext}(K_1) \ne \emptyset$, and hence, by Corollary 3.3, $\emptyset \ne K \cap \operatorname{ext}(K_1) \subset \operatorname{ext}(K)$.

REMARK. As pointed out by Bessaga and Pełczyński [1], the separability of $E^*$ is essential in Corollary 3.4.

A subset $A$ of a Banach space $E$ is called dentable [7] if, for each $\varepsilon > 0$, there is $x$ in $A$ such that $x$ is not in the convex, closed hull of $A \sim \{y : \|x - y\| < \varepsilon\}$. It is easily seen that $A$ is dentable if the closed, convex hull of $A$ is. A point $x$ of $A$ is called a denting point if for each $\varepsilon > 0$, $x$ does not belong to the closed, convex hull of $A \sim \{y : \|x - y\| \le \varepsilon\}$. Clearly $A$ is dentable if the closed, convex hull of $A$ contains a denting point. A denting point is necessarily extreme. It is easy to deduce from Theorem 3.1 that a relatively weakly compact subset of a separable Banach space is dentable. M. Rieffel asked [7; Question 1] whether the separability here is essential. This question, we believe, is still open. Rieffel also raised the question [7; Question 3]: which Banach spaces have the property that all bounded subsets are dentable? He proved that $l_1(X)$ has this property, where $X$ is any set (possibly uncountable). We give another family of Banach spaces with this property.

3.5 THEOREM. Let $E$ be a Banach space such that $E^*$ is separable. Then each nonempty, norm-closed, convex, bounded subset of $E^*$ contains a denting point. Consequently each bounded subset of $E^*$ is dentable.

Proof. Let $K$ be a nonempty, norm-closed, convex, bounded subset of $E^*$, and let $K_1$ be its weak* closure. Let $u$ be a point in $\operatorname{ext}(K_1)$ where the identity map: $(K_1,\text{weak*}) \to (K_1,\text{norm})$ is continuous. By Theorem 3.2, we know such a point exists, and, as shown in the proof of Corollary 3.3, $u \in \operatorname{ext}(K)$. Let $\varepsilon > 0$. Then there is a weak* open subset $W$ of $E^*$ such that $u \in W \cap K_1$ and $\operatorname{diam}(W \cap K_1) \le \varepsilon$. The point $u$ is not in the weak* closed, convex hull of $K_1 \sim W$, and a fortiori $u$ is not in the (norm)-closed, convex hull of $K_1 \sim W$. Clearly $K_1 \sim W \supset K \sim \{x : \|x - u\| \le \varepsilon\}$, and hence $u$ is not in the closed, convex
hull of $K \sim \{x : \|x - u\| \le \varepsilon\}$. Therefore $u$ is a denting point of $K$. The second conclusion is clear from the remarks preceding the theorem.

Let $E$ be a locally convex, metrizable linear topological space, let $E^*$ be its dual with the strong topology (i.e. $s(E^*,E)$), and let $E^{**}$ be the dual of $E^*$ with the strong topology (i.e. $s(E^{**},E^*)$). Let $I$ denote the evaluation map: $E \to E^{**}$. Then $I$ is one-to-one and relatively open. Since $E$ is metrizable, the topology is bound [3; 22.3], and hence it is evaluable [3; 20.4], i.e. $I$ is continuous. Therefore $I$ is a linear topological isomorphism of $E$ onto $I[E]$. A Fréchet space is a locally convex, complete metrizable linear topological space. A linear topological space is called quasi-separable if each bounded subset is separable.

3.6 THEOREM. Let $E$ be a Fréchet space such that $E^{**}$ is quasi-separable relative to the strong topology. Then each closed, bounded, convex subset of $E$ is the closed convex hull of its extreme points.

Proof. Let $K$ be a closed, bounded, convex subset of $E$, and let $I$ be the evaluation map: $E \to E^{**}$. Then the weak* ($= w(E^{**},E^*)$) closure $K_1$ of $I[K]$ in $E^{**}$ is weak* compact and strongly bounded. Let $F$ be the subspace of $E^{**}$ generated by $K_1$. Then, since $E^{**}$ is quasi-separable, $(F,s(E^{**},E^*))$ is separable and metrizable, and thus we may apply Theorem 2.3 to the bitopological linear space $(F;s(E^{**},E^*),w(E^{**},E^*))$ and $K_1$. Let $Z$ be the set of all points of continuity of the identity map: $(K_1,\text{weak*}) \to (K_1,\text{strong})$. Because $K$ is complete, $I[K]$ is strongly closed in $E^{**}$, and, as in the proof of Corollary 3.3, we see that $Z \subset I[K]$. Hence $Z \cap \operatorname{ext}(K_1) \subset \operatorname{ext}(I[K]) = I[\operatorname{ext}(K)]$, and Theorem 2.3 implies that the weak* closed, convex hull of $I[\operatorname{ext}(K)]$ is $K_1$. It follows that the (weak) closed, convex hull of $\operatorname{ext}(K)$ is $K$.

REMARK. Let $E$ be a Fréchet space satisfying the hypothesis of Theorem 3.6. Then, by [3; 22.15], each strongly bounded subset of $E^{**}$ is equicontinuous, i.e. $E^*$ is evaluable. Therefore, by [3; 22.15], $E^*$ is bound and barrelled as well; that is, $E$ is a distinguished (= distingué) Fréchet space.

Finally, we present a bitopological version of Ryll-Nardzewski's fixed point theorem. Let $Q$ be a subset of a locally convex linear topological space $(E,\mathcal{T})$ and let $\mathcal{S}$ be a semigroup of transformations of $Q$ into $Q$. The semigroup $\mathcal{S}$ is $\mathcal{T}$-noncontracting on $Q$ if, for each pair of distinct points $x, y$ of $Q$, there is a $\mathcal{T}$-continuous pseudo-norm $p$ on $E$ such that:
$$\inf\{p(Tx - Ty) : T \in \mathcal{S}\} > 0.$$
A proof of the next theorem is a straightforward modification of the proof of Ryll-Nardzewski's fixed point theorem given in [6]. Theorem 2.2 takes the place of the lemma in [6].

3.7 THEOREM. Let $(E;\mathcal{T}_1,\mathcal{T}_2)$ be a locally convex linear bitopological space such that $(E,\mathcal{T}_1)$ is separable and $(E,\mathcal{T}_2)$ is Hausdorff, let $Q$ be a nonempty,
$\mathcal{T}_2$-compact, convex subset of $E$, and let $\mathcal{S}$ be a semigroup of $\mathcal{T}_2$-continuous affine transformations on $Q$ into itself. If $\mathcal{S}$ is $\mathcal{T}_1$-noncontracting on $Q$ then $\mathcal{S}$ has a common fixed point in $Q$.

REMARK. In Theorem 3.7, the assumption of the separability of $(E,\mathcal{T}_1)$ can be dropped if $(E;\mathcal{T}_1,\mathcal{T}_2)$ satisfies the following condition:
(S) Each $\mathcal{T}_2$-compact, $\mathcal{T}_2$-separable subset of $E$ is included in a $\mathcal{T}_1$-separable subset of $E$.
If $(E,\mathcal{T})$ is a locally convex space then $(E;\mathcal{T},\text{weak})$ satisfies (S). However, if $E$ is a separable Banach space, then $(E^*;\text{norm},\text{weak*})$ satisfies (S) if and only if $E^*$ is separable relative to the norm topology.
REFERENCES
1. C. Bessaga and A. Pełczyński, On extreme points in separable conjugate spaces, Israel J. of Math. 4, No. 4 (1966), 262-264.
2. J. Dixmier, Les C*-algèbres et leurs représentations, Gauthier-Villars, Paris, 1964.
3. J. L. Kelley, I. Namioka, et al., Linear Topological Spaces, D. Van Nostrand, Princeton, 1963.
4. J. Lindenstrauss, On operators which attain their norm, Israel J. of Math. 1, No. 3 (1963), 139-148.
5. J. Lindenstrauss, On extreme points in $l_1$, Israel J. of Math. 4, No. 1 (1966), 59-61.
6. I. Namioka and E. Asplund, A geometric proof of Ryll-Nardzewski's fixed point theorem, Bull. Amer. Math. Soc. 73 (1967), 443-445.
7. M. A. Rieffel, Dentable subsets of Banach spaces, with application to a Radon-Nikodym theorem (to appear).
UNIVERSITY OF WASHINGTON, SEATTLE, WASHINGTON
ON COMPLEMENTED SUBSPACES OF m
BY
JORAM LINDENSTRAUSS*
ABSTRACT
It is proved that an infinite dimensional subspace of $m$ is complemented in $m$ if and only if it is isomorphic to $m$.

Let $Y$ be a Banach space and let $X$ be a closed linear subspace of $Y$. We say that $X$ is complemented in $Y$ if there is a bounded linear projection from $Y$ onto $X$. A Banach space is called a $\mathcal{P}$ space if it is complemented in every Banach space containing it. For a set $\Gamma$ let $m(\Gamma)$ denote the space of bounded scalar-valued functions on $\Gamma$ with the supremum norm. Since every Banach space is isometric to a subspace of $m(\Gamma)$ for a suitable $\Gamma$ and since every $m(\Gamma)$ is a $\mathcal{P}$ space, it is easily seen and well known (cf. [3, p. 94]) that a Banach space $X$ is a $\mathcal{P}$ space if and only if $X$ is isomorphic to a complemented subspace of some $m(\Gamma)$. The question of functional representation of $\mathcal{P}$ spaces has been considered by many authors but is still open. The main known results in this direction are contained in [5] and its references. The purpose of the present note is to settle the question of the structure of the complemented subspaces of $m(\Gamma)$ if $\Gamma$ is countably infinite (for this $\Gamma$ we denote, as usual, $m(\Gamma)$ by $m$).

THEOREM. An infinite dimensional subspace $X$ of $m$ is complemented in $m$ if and only if $X$ is isomorphic to $m$.

The "if" part of the theorem is trivial since $m$ is a $\mathcal{P}$ space. The theorem answers a question of Pełczyński [5] (cf. also [2]). Pełczyński proved in [5] a result similar to the "only if" part of the theorem for $l_p$, 1 ≤
For a proof see [5, p. 222]. We do not use essentially new methods here. Lemma 5 below is closely related to the paper [1] of Bessaga and Pełczyński. In the proof of the theorem itself we use an idea of Nakamura and Kakutani [4] (cf. also [7] and [8]). Our approach can be used to get some information on general $\mathcal{P}$ spaces. We do not, however, treat general $\mathcal{P}$ spaces here since it seems that the solution of the problem of characterizing general $\mathcal{P}$ spaces will need other methods. Our reason for believing this is the observation in [5, p. 223] that there are $\mathcal{P}$ spaces which are not isomorphic to $m(\Gamma)$ for any $\Gamma$.

By the $w^*$ topology of $m$ we understand the topology induced on $m$ by the elements of $l_1$. For $x \in m$ and an integer $i$, $x(i)$ denotes the $i$-th coordinate of $x$. For $x \in m$ and $\varepsilon > 0$ we put $N(x,\varepsilon) = \{i;\ |x(i)| > \varepsilon\}$ and $M(x,\varepsilon) = \{i;\ |x(i)| \le \varepsilon\}$. The scalar field (real or complex) is denoted by $R$.

LEMMA 3. Let $\{x_k\}_{k=1}^{\infty}$ be a sequence of elements in $m$ such that for some constant $K$
$$\Bigl\| \sum_{k=1}^{n} \lambda_k x_k \Bigr\| \le K \sup_k |\lambda_k| \quad \text{for all } \{\lambda_k\}_{k=1}^{n} \subset R,\ n = 1,2,\dots. \tag{1}$$
Then for every bounded sequence $\{\lambda_k\}_{k=1}^{\infty} \subset R$ the series $\sum_{k=1}^{\infty} \lambda_k x_k$ converges in the $w^*$ topology to an element of norm $\le K \sup_k |\lambda_k|$ in $m$.
Proof. Obvious.

LEMMA 4. Let $\{x_k\}_{k=1}^{\infty}$ be a sequence of elements in $m$ such that (1) holds and such that for some $H > 0$
$$\|x_k\| \ge H, \quad k = 1,2,\dots. \tag{2}$$
Then for every $\varepsilon > 0$ and $\theta < H$ there is an index $k$ such that
$$A_k = \{h;\ M(x_k,\varepsilon) \cap N(x_h,\theta) \ne \emptyset\} \tag{3}$$
is an infinite set.
Proof. By (1) we get that for every $i$, $\sum_k |x_k(i)| \le K$, and hence the intersection of $r > K/\varepsilon$ sets of the form $N(x_k,\varepsilon)$ is empty. Assume that $A_k$, $k = 1,\dots,r$, are finite sets. Then $A = \bigcup_{k=1}^{r} A_k$ is a finite set. For $h \notin A$, $N(x_h,\theta) \subset \bigcap_{k=1}^{r} N(x_k,\varepsilon) = \emptyset$, but this contradicts (2).

LEMMA 5. Let $\{x_k\}_{k=1}^{\infty}$ be a sequence of elements in $m$ such that (1) holds and $\|x_k\| > 2$ for every $k$. Then there is a subsequence $\{x_{n_k}\}$ of $\{x_k\}$ such that
$$\sup_k |\lambda_k| \le \Bigl\| \sum_{k=1}^{\infty} \lambda_k x_{n_k} \Bigr\| \le K \sup_k |\lambda_k| \tag{4}$$
for every bounded sequence $\{\lambda_k\}_{k=1}^{\infty}$ of scalars.
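Proof. The right inequality of (4) is valid for every subsequence $\{x_{n_k}\}$ by Lemma 3, so we have to consider only the left inequality of (4). By Lemma 4 there is an $n_1$ so that $B_1 = \{h;\ M(x_{n_1},1/4) \cap N(x_h,7/4) \ne \emptyset\}$ is infinite. Next we choose an integer $i_1$ such that $|x_{n_1}(i_1)| > 2$. Since $\sum_{h=1}^{\infty} |x_h(i_1)| \le K$, there is an infinite subset $C_1$ of $B_1$ such that $\sum_{k \in C_1} |x_k(i_1)| < 1/4$. Set $M_1 = M(x_{n_1},1/4)$. The set $M_1$ is infinite, since otherwise there would be infinitely many $h$ with $|x_h(i)| > 7/4$ for some fixed $i \in M_1$, and this contradicts the fact that $\sum_h |x_h(i)| \le K$. Let $x_{k,1}$ be the restriction of $x_k$ to $M_1$, $k = 1,2,\dots$ (i.e. $x_{k,1}(i)$ is defined only for $i \in M_1$ and for these $i$, $x_{k,1}(i) = x_k(i)$). By the definition of $B_1$, $\|x_{k,1}\| > 7/4$ for $k \in B_1$ and hence for $k \in C_1$. By applying Lemma 4 to $\{x_{k,1}\}_{k \in C_1}$ we get an $n_2 \in C_1$ such that
$$B_2 = C_1 \cap \{h;\ M_1 \cap M(x_{n_2},1/4^2) \cap N(x_h,7/4 - 1/4^2) \ne \emptyset\}$$
is an infinite set. Let $i_2 \in M_1$ be such that $|x_{n_2}(i_2)| \ge 7/4$ and let $C_2$ be an infinite subset of $B_2$ such that $\sum_{k \in C_2} |x_k(i_2)| < 1/4^2$. Set $M_2 = M_1 \cap M(x_{n_2},1/4^2)$ and let $x_{k,2}$ be the restriction of $x_k$ to $M_2$, $k = 1,2,\dots$. Continuing inductively we get a subsequence $\{x_{n_k}\}$ of $\{x_k\}$ and a sequence of integers $i_k$ such that
$$|x_{n_k}(i_k)| > 2 - 1/4 - \dots - 1/4^{k-1} > 5/3, \tag{5}$$
$$\sum_{k=j+1}^{\infty} |x_{n_k}(i_j)| < 1/4^{j}, \tag{6}$$
$$|x_{n_k}(i_j)| \le 1/4^{k} \quad \text{for } j > k. \tag{7}$$
By (6) and (7), $\sum_{k \ne j} |x_{n_k}(i_j)| < 1/3$. Let now $\{\lambda_k\}_{k=1}^{\infty}$ be any sequence of scalars such that $\sup_k |\lambda_k| = 1$ and let $j$ be such that $|\lambda_j| > 4/5$. Then
$$\Bigl\| \sum_{k=1}^{\infty} \lambda_k x_{n_k} \Bigr\| \ge |\lambda_j|\,|x_{n_j}(i_j)| - \sum_{k \ne j} |x_{n_k}(i_j)| \ge 1,$$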
and this concludes the proof of the lemma.

Proof of the theorem. Let $X$ be an infinite dimensional complemented subspace of $m$. By Lemma 1, $X$ contains a subspace isomorphic to $c_0$. By Lemma 5 it follows that there is a sequence $\{x_k\}_{k=1}^{\infty}$ in $X$ and a constant $K$ such that sup_k |λ_k| ≤ ‖ ∑_{k=1}
known (cf. [7], [8] and the references there). For each $\gamma \in \Gamma$ let $X_\gamma$ be the subspace of $m$ consisting of the elements of the form $\sum_{k \in N_\gamma} \lambda_k x_k$ where $\{\lambda_k\}_{k \in N_\gamma}$ is a bounded set of scalars. Clearly every $X_\gamma$ is isomorphic to $m$. We shall prove that $X_\gamma \subset X$ for some $\gamma$ and this will conclude the proof of the theorem (by Lemma 2).

Let $\varphi : m \to m/X$ be the quotient map. Since $X$ is complemented in $m$, $m/X$ is isomorphic to a subspace of $m$. Hence there is a countable set of functionals $\{f_j\}_{j=1}^{\infty}$ in $(m/X)^*$ such that $f_j(u) = 0$ for every $j$ ($u \in m/X$) implies $u = 0$. Assume that for every $\gamma \in \Gamma$ there is an $x_\gamma = \sum_{k \in N_\gamma} \lambda_k^{\gamma} x_k$ such that $\|x_\gamma\| = 1$ and $x_\gamma \notin X$, i.e. $\varphi(x_\gamma) \ne 0$. We claim that for every choice of signs $\varepsilon_i$ and every finite set $\{x_{\gamma_i}\}_{i=1}^{n}$
$$\Bigl\| \sum_{i=1}^{n} \varepsilon_i \varphi(x_{\gamma_i}) \Bigr\| \le K. \tag{8}$$
Indeed, by our assumption on the $N_\gamma$ there is a finite set $M_0$ such that $N_{\gamma_i} \subset M_0 \cup M_i$, $i = 1,\dots,n$, and $M_i \cap M_k = \emptyset$ for $i \ne k$. Put, for $i = 1,\dots,n$, $x_{\gamma_i} = y_i + z_i$ where $y_i = \sum_{k \in M_0} \lambda_k^{\gamma_i} x_k$ and $z_i = \sum_{k \in M_i} \lambda_k^{\gamma_i} x_k$. Every $y_i$ belongs to $X$ and hence $\varphi(x_{\gamma_i}) = \varphi(z_i)$. Since $\|\sum_i \varepsilon_i z_i\| \le K$ for every choice of signs, we get (8). From (8) it follows that for every $f \in (m/X)^*$, $\sum_\gamma |f(\varphi(x_\gamma))| \le K\|f\|$, and in particular there is only a countable set of $\gamma \in \Gamma$ such that $f(\varphi(x_\gamma)) \ne 0$. It follows that there is only a countable number of $\gamma \in \Gamma$ such that $f_j(\varphi(x_\gamma)) \ne 0$ for some $j$, and this contradicts our choice of the $f_j$ and the assumption that $\varphi(x_\gamma) \ne 0$ for every $\gamma$.

REFERENCES
1. C. Bessaga and A. Pełczyński, On bases and unconditional convergence of series in Banach spaces, Studia Math. 17 (1958), 151-164.
2. W. J. Davis and D. W. Dean, The direct sum of Banach spaces with respect to a basis, Studia Math. 28 (1967), 209-219.
3. M. M. Day, Normed Linear Spaces, Springer Verlag, Berlin, 1958.
4. M. Nakamura and S. Kakutani, Banach limits and the Čech compactification of a countable discrete set, Proc. Imp. Acad. Japan 19 (1943), 224-229.
5. A. Pełczyński, Projections in certain Banach spaces, Studia Math. 19 (1960), 209-228.
6. A. Pełczyński, Banach spaces on which every unconditionally converging operator is weakly compact, Bull. Acad. Pol. Sci. 12 (1962), 641-648.
7. A. Pełczyński and V. N. Sudakov, Remark on non-complemented subspaces of the space $m(S)$, Colloq. Math. 9 (1962), 85-88.
8. R. Whitley, Projecting $m$ onto $c_0$, Amer. Math. Monthly 73 (1966), 285-286.
THE HEBREW UNIVERSITY OF JERUSALEM
ÜBER DAS SCHWARZSCHE LEMMA UND VERWANDTE SÄTZE
(ON THE SCHWARZ LEMMA AND RELATED THEOREMS)
BY
ALEXANDER DINGHAS
ABSTRACT
Upper bounds for the Jacobian determinant of holomorphic mappings of bounded domains $D$ into themselves were first given more than thirty years ago by Stefan Bergman by means of his theory of the kernel function of $D$. In this paper a different method is developed, and distortion theorems for holomorphic mappings of bounded domains of a Kähler manifold $M^n$ into a Kähler manifold $M_0^n$ are proved. The special cases $M^n = C^n$ (unit sphere of $C^n$) and $M^n = M_0^n = C^n$ are also considered. The proof depends essentially on the two Hermitian quadratic forms corresponding to the metric and to the Ricci tensor. The manifolds must be of negative Ricci curvature and fulfil two conditions given in section 4.

1. Introduction. In a paper published in the Festschrift zur Gedächtnisfeier für Karl Weierstrass 1815-1965 (1), I proved a general distortion theorem for the content of the covering piece (Überlagerungsstück) generated by a holomorphic mapping $w : C^n \to M^n$ of the open unit ball $C^n$ into a Kähler manifold $M^n$. In its simplest form ($n = 1$, $M^1 = C^1$) this theorem contains as a special case a classical result of Schwarz and Pick (2) on the deformation of content under into-mappings of an open disc of the complex plane. Both the general theorem and the theorem of Schwarz and Pick just mentioned are connected with theorems which S. Bergman was the first to prove, more than thirty years ago, by means of his theory of kernel functions (3), and which for the first time demonstrated impressively the significance of the Kähler metric for a number of problems of complex analysis, in particular for the mapping of domains of the $n$-dimensional complex space $C^n$. The significance of the developments that follow lies less in the results (at least in part of them) than in the more differential-geometric method employed. This method, whose origin in the simple case $n = 1$ goes back to Ahlfors (4), may still be of interest in that it not only makes use of the notion of the Kähler metric from the very beginning, but also puts the significance of the Ricci tensor for the problems in question in its proper light. Section 2 contains proofs of lemmas which, in view of the space available to me there, I could develop only briefly or by way of indication in the paper cited at the beginning of this note. In Sections 3 and 4 three distortion theorems are proved, all of which concern mappings of domains of $C^n$ by systems of $n$ holomorphic functions.

Received February 23, 1967, and in revised form March 26, 1967.
(1) See [5].
(2) See [5], p. 478 f.
(3) See for instance [2], p. 138 ff. Bergman's original papers are spread over a long period of time. The reader will find most of them listed at the end of [2].
(4) See [1] and possibly also [5].

2. Preparatory facts and lemmas. In what follows $\bar z_k$ ($k = 1,\dots,n$) denotes the complex conjugate of the number $z_k$. The variables $z_k$ and $\bar z_k$ ($k = 1,\dots,n$) define an $n$-dimensional Hermitian space $H^n$. If one writes $z_k = x_k + i y_k$, $\bar z_k = x_k - i y_k$, then $H^n$ is equivalent to the space $R^{2n}$.

LEMMA 1. For $i, k = 1,\dots,n$ let
$$\delta_{ik} = \begin{cases} 1 & (i = k) \\ 0 & (i \ne k). \end{cases}$$
If $z_1,\dots,z_n$ and $\lambda$ are arbitrary complex numbers, then
$$|\delta_{ik} + \lambda z_i \bar z_k| = 1 + \lambda (z_1 \bar z_1 + \dots + z_n \bar z_n) = 1 + \lambda \|z\|^2. \tag{2.1}$$
Here $|\delta_{ik} + \lambda z_i \bar z_k|$ denotes the determinant of the matrix $(\delta_{ik} + \lambda z_i \bar z_k)$ ($i, k = 1,\dots,n$).
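The identity (2.1) lends itself to a quick numerical sanity check. The following is only a sketch and not part of the paper: the helper names `det` and `lemma1_sides` and the sample data are assumptions made for illustration, and the determinant is computed by plain Gaussian elimination from the standard library.

```python
import random


def det(matrix):
    """Determinant of a square complex matrix (Gaussian elimination, partial pivoting)."""
    a = [list(row) for row in matrix]
    n = len(a)
    result = 1 + 0j
    for col in range(n):
        # Use the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0:
            return 0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def lemma1_sides(z, lam):
    """Return (left side, right side) of (2.1) for a vector z and a scalar lam."""
    n = len(z)
    matrix = [[(1 if i == k else 0) + lam * z[i] * z[k].conjugate()
               for k in range(n)] for i in range(n)]
    lhs = det(matrix)
    rhs = 1 + lam * sum(abs(zi) ** 2 for zi in z)
    return lhs, rhs


random.seed(0)  # hypothetical sample data, not taken from the paper
z = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(4)]
lam = complex(0.7, -0.3)
lhs, rhs = lemma1_sides(z, lam)
print(abs(lhs - rhs))  # the difference should be at rounding-error level
```

Such a check does not replace the proof; it only guards against sign or index slips when the formula is transcribed.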
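Proof. Set $\Delta(\lambda) = |\delta_{ik} + \lambda z_i \bar z_k|$. Then $\Delta''(\lambda) = 0$, and hence $\Delta$ is a linear function of $\lambda$. This easily yields the equation
$$\Delta(\lambda) = 1 - \lambda \begin{vmatrix} 0 & z_1 & \cdots & z_n \\ \bar z_1 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ \bar z_n & 0 & \cdots & 1 \end{vmatrix} = 1 + \lambda \|z\|^2. \tag{2.2}$$

LEMMA 2. For $\|z\| < 1$ set
$$V_0 = V_0(z,\bar z) = \log\,(1 - \|z\|^2)^{-1}. \tag{2.3}$$
Then $V_0$ satisfies the partial differential equation
$$\left| \frac{\partial^2 V_0}{\partial z_i\,\partial \bar z_k} \right| = \exp[(n+1)V_0]. \tag{2.4}$$
Here the left-hand side denotes the determinant of the quantities $\partial^2 V_0 / \partial z_i\,\partial \bar z_k$.(5)

Proof. We have
$$\frac{\partial^2 V_0}{\partial z_i\,\partial \bar z_k} = \frac{1}{1 - \|z\|^2} \left( \delta_{ik} + \frac{\bar z_i z_k}{1 - \|z\|^2} \right).$$
This yields the equation
$$\left| (1 - \|z\|^2)\, \frac{\partial^2 V_0}{\partial z_i\,\partial \bar z_k} \right| = \left| \delta_{ik} + \frac{\bar z_i z_k}{1 - \|z\|^2} \right| = \frac{1}{1 - \|z\|^2}.$$

(5) See [7] and [5].

LEMMA 3. The quadratic (Hermitian) form(6)
$$\frac{\partial^2 V_0}{\partial z_i\,\partial \bar z_k}\, \zeta_i \bar\zeta_k \qquad (\zeta_1,\dots,\zeta_n \text{ complex}) \tag{2.5}$$
is positive definite at every point $z = (z_1,\dots,z_n)$ with $\|z\| < 1$.

Proof. Let $D_m$ ($1 \le m \le n$) be that principal minor of the left-hand side of (2.4) which contains the variables $z_1,\dots,z_m$, $\bar z_1,\dots,\bar z_m$. Because of
$$\left| \delta_{ik} + \frac{\bar z_i z_k}{1 - \|z\|^2} \right|_{i,k = 1,\dots,m} = 1 + \frac{\|z\|_m^2}{1 - \|z\|^2}$$
with $\|z\|_m^2 = z_1 \bar z_1 + \dots + z_m \bar z_m$, one finds
$$D_m = \frac{1 - \|z\|^2 + \|z\|_m^2}{(1 - \|z\|^2)^{m+1}}.$$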
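The two determinant computations above (equation (2.4) and the leading principal minors used for positive definiteness) can be checked numerically at a sample point of the unit ball. This is a sketch under stated assumptions: the function names `det`, `hessian_v0` and `leading_minors` and the sample point are hypothetical, chosen only for illustration, and the matrix entries use the closed form of the second derivatives of $V_0 = -\log(1 - \|z\|^2)$.

```python
def det(matrix):
    """Determinant of a square complex matrix (Gaussian elimination, partial pivoting)."""
    a = [list(row) for row in matrix]
    n = len(a)
    result = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0:
            return 0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def hessian_v0(z):
    """Matrix of d^2 V_0 / dz_i d(conj z_k) for V_0 = -log(1 - |z|^2), with |z| < 1."""
    s = sum(abs(zi) ** 2 for zi in z)
    n = len(z)
    return [[(1 if i == k else 0) / (1 - s) + z[i].conjugate() * z[k] / (1 - s) ** 2
             for k in range(n)] for i in range(n)]


def leading_minors(h):
    """Real parts of the leading principal minors of a Hermitian matrix h."""
    return [det([row[:m] for row in h[:m]]).real for m in range(1, len(h) + 1)]


z = [0.3 + 0.2j, -0.1 + 0.4j, 0.5j]  # hypothetical sample point with |z|^2 = 0.55
s = sum(abs(zi) ** 2 for zi in z)
h = hessian_v0(z)
# (2.4): the determinant should equal exp[(n+1) V_0] = (1 - |z|^2)^(-(n+1)).
print(abs(det(h) - (1 - s) ** (-(len(z) + 1))))
# Positivity of all leading principal minors is the criterion used for definiteness.
print(all(m > 0 for m in leading_minors(h)))
```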
Now set $w_k = u_k + i v_k$, $\bar w_k = u_k - i v_k$ with real $u_k$, $v_k$, and consider the points $(x_1,\dots,x_n,y_1,\dots,y_n)$ and $(u_1,\dots,u_n,v_1,\dots,v_n)$ of $R^{2n}$. The task will now be to determine the transformation of the volume element $[dx_1,\dots,dy_n]$ by the mapping functions $w_k$. The proof of the following lemma uses the classical theory of exterior multiplication of differential forms due to E. Cartan (7):

(6) Following the model of tensor calculus, summation from 1 to $n$ is always to be carried out over an index occurring twice (dummy index). A corresponding rule is to hold for pairs of indices occurring twice. No distinction is made here between upper and lower indices. A quadratic form
$$H_c(\zeta,\bar\zeta) = c_{ik}\,\zeta_i \bar\zeta_k \qquad (i,k = 1,\dots,n)$$
is called Hermitian if $c_{ki} = \bar c_{ik}$. It is called positive definite if $\|\zeta\| > 0$ always implies $H_c(\zeta,\bar\zeta) > 0$. For the proof of Lemma 3 see also [7] and [5].
(7) See for instance [4] and [6].

LEMMA 4. Let $w_1,\dots,w_n$ be holomorphic in a neighborhood of the point $z = (z_1,\dots,z_n)$ of $C^n$. For $k = 1,2,\dots,n$ form the differential forms
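$$dw_k = \sum_{l=1}^{n} \frac{\partial w_k}{\partial z_l}\, dz_l = a_{kl}\, dz_l \tag{2.6}$$
and
$$d\bar w_k = \sum_{l=1}^{n} \frac{\partial \bar w_k}{\partial \bar z_l}\, d\bar z_l = \bar a_{kl}\, d\bar z_l \tag{2.7}$$
with $\bar w_k = \bar w_k(\bar z_1,\dots,\bar z_n)$. Then the equation
$$[du_1,\dots,dv_n] = |a_{ik}|\,|\bar a_{ik}|\,[dx_1,\dots,dy_n] \tag{2.8}$$
holds. Here $|a_{ik}|$ and $|\bar a_{ik}|$ denote the (complex conjugate) determinants of the $a_{ik}$ and the $\bar a_{ik}$ respectively.

Proof. By definition, the exterior multiplication $\wedge$ of the elements $\alpha, \beta, \gamma, \dots$ of a vector space $V$ over a field $R$ has the properties:
1. $\alpha \wedge \beta = -\beta \wedge \alpha$, $\alpha \wedge \alpha = 0$
2. $\alpha \wedge C\beta = C(\alpha \wedge \beta)$ ($C$ a constant)
3. $\alpha \wedge (\beta \wedge \gamma) = (\alpha \wedge \beta) \wedge \gamma$
4. $\alpha \wedge (\beta + \gamma) = (\alpha \wedge \beta) + (\alpha \wedge \gamma)$.
This first yields the equations
$$\bigwedge_{k=1}^{n} dw_k = |a_{ik}| \bigwedge_{k=1}^{n} dz_k$$
and
$$\bigwedge_{k=1}^{n} d\bar w_k = |\bar a_{ik}| \bigwedge_{k=1}^{n} d\bar z_k,$$
hence
$$\bigwedge_{k=1}^{n} dw_k \wedge \bigwedge_{k=1}^{n} d\bar w_k = |a_{ik}|\,|\bar a_{ik}| \bigwedge_{k=1}^{n} dz_k \wedge \bigwedge_{k=1}^{n} d\bar z_k.$$
On the other hand, because of
$$dx_i \wedge dx_k = -dx_k \wedge dx_i, \qquad dy_i \wedge dy_k = -dy_k \wedge dy_i$$
and
$$dx_i \wedge dy_k = -dy_k \wedge dx_i,$$
we have
$$\bigwedge_{k=1}^{n} dz_k \wedge \bigwedge_{k=1}^{n} d\bar z_k = (-2i)^n \bigwedge_{k=1}^{n} dx_k \wedge \bigwedge_{k=1}^{n} dy_k$$
and
$$\bigwedge_{k=1}^{n} dw_k \wedge \bigwedge_{k=1}^{n} d\bar w_k = (-2i)^n \bigwedge_{k=1}^{n} du_k \wedge \bigwedge_{k=1}^{n} dv_k.$$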
This proves equation (2.8).(8) In what follows, the square brackets in (2.8) are identified with the volume elements (> 0!) of the corresponding spaces.

LEMMA 5. Let
$$H_a(z,\bar z) = a_{ik}\, z_i \bar z_k \qquad (a_{ki} = \bar a_{ik}),$$
$$H_b(z,\bar z) = b_{ik}\, z_i \bar z_k \qquad (b_{ki} = \bar b_{ik})$$
be two (Hermitian) positive definite forms, with the (positive) determinants $|a_{ik}|$, $|b_{ik}|$. Then
$$H_b(z,\bar z) \le H_a(z,\bar z) \qquad (z \in H^n) \tag{2.9}$$
implies the inequality
$$|b_{ik}| \le |a_{ik}|. \tag{2.10}$$
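Before the proof, the statement (2.9) implies (2.10) can be illustrated on randomly generated Hermitian matrices. The sketch below is an illustration only and does not follow the integral argument of the paper: it builds a positive definite $b$ and a positive semi-definite $c$ as Gram matrices, sets $a = b + c$ so that $H_b \le H_a$ holds by construction, and compares determinants. All names (`det`, `gram`, `random_complex_matrix`, `dominated_pair`) are hypothetical.

```python
import random


def det(matrix):
    """Determinant of a square complex matrix (Gaussian elimination, partial pivoting)."""
    a = [list(row) for row in matrix]
    n = len(a)
    result = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0:
            return 0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def gram(m):
    """Return m times its conjugate transpose (Hermitian, positive semi-definite)."""
    rows, cols = len(m), len(m[0])
    return [[sum(m[i][j] * m[k][j].conjugate() for j in range(cols))
             for k in range(rows)] for i in range(rows)]


def random_complex_matrix(n, rng):
    """An n-by-n matrix with entries drawn uniformly from the square [-1,1] + i[-1,1]."""
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
            for _ in range(n)]


def dominated_pair(n, seed):
    """Build Hermitian a, b with b positive definite and a - b positive semi-definite."""
    rng = random.Random(seed)
    b = gram(random_complex_matrix(n, rng))
    for i in range(n):
        b[i][i] += 1  # shift by the identity so that b is positive definite
    c = gram(random_complex_matrix(n, rng))
    a = [[b[i][k] + c[i][k] for k in range(n)] for i in range(n)]
    return a, b


for seed in range(3):
    a, b = dominated_pair(3, seed)
    # By (2.10) each comparison is expected to come out True.
    print(det(b).real <= det(a).real)
```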
Proof. We start from the fact (footnote 9) that every positive definite Hermitian form
$H_c(z, \bar z) = c_{ik} z_i \bar z_k$  ($c_{ki} = \bar c_{ik}$)
can be brought, by a (unitary) transformation
(2.11)  $z_i = p_{ik} \zeta_k$  ($i, k = 1, \dots, n$)
with $\bar p_{ik} p_{il} = \delta_{kl}$ ($\delta_{kl} = 1$ for $k = l$, $\delta_{kl} = 0$ for $k \ne l$), into the (normal) form
$H_c = \lambda_1 \zeta_1 \bar\zeta_1 + \dots + \lambda_n \zeta_n \bar\zeta_n$.
Consider the integral
$Q_c = \int \exp\{-H_c(z, \bar z)\}\,[dx_1 \dots dy_n]$
extended over the whole space $H^n$, and note that in general
$\int_{-\infty}^{+\infty} \exp\{-\lambda x^2\}\,dx = \pi^{1/2} / \lambda^{1/2}$  ($\lambda > 0$)
holds. Using Hilfssatz 4 and the fact that
$Q^0 = \pi^n / (\lambda_1 \lambda_2 \cdots \lambda_n)$,
this yields the equation
$Q_c = \pi^n / |c_{ik}|$.

(8) For $n = 2$ this equation is found in a paper by P. J. Myrberg; cf. [8].
(9) Cf., for example, [9]. This book (p. 169 ff.) contains a complete theory of Hermitian forms, as well as an exposition of the (classical) proof of Hilfssatz 5 by means of a simultaneous reduction of $H_a(z, \bar z)$ and $H_b(z, \bar z)$ to normal form.
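If now (2.9) holds, then $Q_a \le Q_b$, and consequently
$\pi^n / |a_{ik}| \le \pi^n / |b_{ik}|$.
This yields the inequality (2.10). Hilfssatz 5 evidently also holds when the (Hermitian) form $H_b(z, \bar z)$ is positive semi-definite (i.e. $\ge 0$). Indeed, it suffices to consider the (positive) definite forms $H_{a'}(z, \bar z)$, $H_{b'}(z, \bar z)$ with
$a'_{ik} = a_{ik} + 2\varepsilon \delta_{ik}$,  $b'_{ik} = b_{ik} + \varepsilon \delta_{ik}$,
and to let $\varepsilon$ tend to zero in the inequality $|b'_{ik}| \le |a'_{ik}|$. The inequality (2.10) will be used in the following sections.

3. Proof of a theorem of S. Bergman. Put
(3.1)  $V_0 = V_0(z, \bar z) = \log \dfrac{1}{1 - \|w\|^2}$
and
$J = |\partial w_i / \partial z_k|$  (Jacobian determinant).
Then, because of
$\dfrac{\partial^2 V_0}{\partial z_i \partial \bar z_k} = \dfrac{\partial^2 V_0}{\partial w_\mu \partial \bar w_\nu} \dfrac{\partial w_\mu}{\partial z_i} \dfrac{\partial \bar w_\nu}{\partial \bar z_k}$,
we have
$\left| \dfrac{\partial^2 V_0}{\partial z_i \partial \bar z_k} \right| = \left| \dfrac{\partial^2 V_0}{\partial w_i \partial \bar w_k} \right| |J|^2 = \dfrac{|J|^2}{(1 - \|w\|^2)^{n+1}}$
for every point $(z, \bar z)$ of $C_n$. Since, moreover,
$\dfrac{\partial^2}{\partial z_i \partial \bar z_k} \log |J|^2 = \dfrac{\partial^2}{\partial z_i \partial \bar z_k} \log (J \bar J) = 0$
holds, the function
$V_1 = \log \dfrac{|J|^2}{(1 - \|w\|^2)^{n+1}} = \log |J|^2 + (n+1) V_0$
satisfies the partial differential equation
(3.2)  $\left| \dfrac{\partial^2 V_1}{\partial z_i \partial \bar z_k} \right| = (n+1)^n \exp V_1 = A_n \exp V_1$.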
Now let $r$ denote a positive number between 0 and 1. For $\|z\| < r$ (briefly, $z \in C_r$) put
$U_r = \log \dfrac{r^2}{(r^2 - \|z\|^2)^{n+1}}$
and note that $U_r$ satisfies the partial differential equation
(3.3)  $\left| \dfrac{\partial^2 U_r}{\partial z_i \partial \bar z_k} \right| = (n+1)^n \exp U_r$.
The task is now to show that
(3.4)  $V_1 \le U_r$
holds in $C_r$, and hence also $V_1 \le U_1$ in $C_n$. Put $\psi = V_1 - U_r$ and assume that the set
$E_r = \{z : z \in C_r,\ \psi > 0\}$
is not empty. Then (because $U_r = +\infty$ on $\|z\| = r$) the closure $\bar E_r$ of $E_r$ likewise lies in $C_r$, and so $\psi$ has a maximum at a point of $E_r$, say at the point $(z_0, \bar z_0)$. Now the following holds:

HILFSSATZ 6. If $\psi$ has a maximum at $z \in E_r$, then the inequality
(3.5)  $\dfrac{\partial^2 \psi}{\partial z_i \partial \bar z_k}\, t_i \bar t_k \le 0$
holds there.
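Here the $t_k$ are arbitrary complex numbers.

Proof. Write $t_k = \xi_k + i\eta_k$ (with real $\xi_k$, $\eta_k$) and note that
$\xi_k \dfrac{\partial}{\partial x_k} + \eta_k \dfrac{\partial}{\partial y_k} = t_k \dfrac{\partial}{\partial z_k} + \bar t_k \dfrac{\partial}{\partial \bar z_k}$  (no summation),
hence
$D_t = \sum_{k=1}^{n} \left( t_k \dfrac{\partial}{\partial z_k} + \bar t_k \dfrac{\partial}{\partial \bar z_k} \right)$.
If now $\psi$ attains its supremum at $z_0$, then for an arbitrary choice of the $t_k$ the inequality
$D_t^2 \psi \le 0$
must hold there. Because of $D_t^2 \psi + D_{it}^2 \psi = 4\, \dfrac{\partial^2 \psi}{\partial z_i \partial \bar z_k}\, t_i \bar t_k$, the assertion (3.5) follows.

The following lemma permits the application of Hilfssatz 5:

HILFSSATZ 7. Put
$V_0 = \log \dfrac{1}{1 - \|w\|^2}$.
Then the (Hermitian) quadratic form
$\dfrac{\partial^2 V_0}{\partial z_i \partial \bar z_k}\, t_i \bar t_k$
is positive definite.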
Proof. Since the inequality $|J| > 0$ holds in $E_r$, the linear system
$\dfrac{\partial w_\mu}{\partial z_i}\, t_i = 0$  ($\mu = 1, \dots, n$)
has no nontrivial solutions in $E_r$. Now
$\dfrac{\partial^2 V_0}{\partial z_i \partial \bar z_k}\, t_i \bar t_k = \dfrac{\partial^2 V_0}{\partial w_\mu \partial \bar w_\nu}\, \tau_\mu \bar\tau_\nu$,  $\tau_\mu = \dfrac{\partial w_\mu}{\partial z_i}\, t_i$,
and hence
$\dfrac{\partial^2 V_0}{\partial z_i \partial \bar z_k}\, t_i \bar t_k > 0$ for $(t) \ne 0$.
This proves (in view of Hilfssatz 3) the assertion.

After these preparations the inequality (3.4) can be proved as follows. From (3.5) it follows (at $z_0$) that
$\dfrac{\partial^2 V_1}{\partial z_i \partial \bar z_k}\, t_i \bar t_k \le \dfrac{\partial^2 U_r}{\partial z_i \partial \bar z_k}\, t_i \bar t_k$
and hence, by Hilfssatz 5, i.e. the inequality (2.10),
$\left| \dfrac{\partial^2 V_1}{\partial z_i \partial \bar z_k} \right| \le \left| \dfrac{\partial^2 U_r}{\partial z_i \partial \bar z_k} \right|$.
Because of (3.2) and (3.3) this yields the inequality $\exp V_1 \le \exp U_r$, i.e. $V_1 \le U_r$ at $(z_0, \bar z_0)$, which contradicts the assumption. Hence $V_1 \le U_r$ in $C_r$, and so also $V_1 \le U_1$ in $C_n$ (footnote 10). Thus the following theorem has been proved:

SATZ 1. Let $w: C_n \to C_n$ denote a holomorphic mapping of the unit ball
(3.6)  $C_n = \{z : \|z\| < 1\}$
with $z = (z_1, \dots, z_n)$, $\|z\|^2 = z_j \bar z_j$ (footnote 11), given by the functions (holomorphic in $C_n$)
(3.7)  $w_k = w_k(z_1, \dots, z_n)$.
Then at every point of $C_n$ the inequality
(3.8)  $|J|^2 \le \left( \dfrac{1 - \|w\|^2}{1 - \|z\|^2} \right)^{n+1}$
holds.

Here $|J|$ on the left denotes the modulus of the Jacobian determinant of the $w_i$ with respect to the variables $z_k$. The inequality (3.8), and hence Satz 1, was first proved by S. Bergman in 1936 using the theory of kernel functions (footnote 12). From Satz 1 one can, using classical tools of measure and integration theory, prove the following theorem without difficulty:
SATZ 1a. Let $A$ denote a measurable subset of $C_n$ and $w(A)$ the covering set produced by the mapping
(3.9)  $w: C_n \to C_n$.
Then the inequality
(3.10)  $\int_{w(A)} \dfrac{[du_1 \dots dv_n]}{(1 - \|w\|^2)^{n+1}} \le \int_{A} \dfrac{[dx_1 \dots dy_n]}{(1 - \|z\|^2)^{n+1}}$
holds.

(10) The idea of proving (and generalizing) the Schwarz lemma ($n = 1$) by means of the Poincaré-Picard differential equation $\Delta u = 4 \exp(2u)$ goes back to Ahlfors [1]. Bergman's theory of kernel functions would at first lead to the same result.
(11) Cf. footnote (6).
(12) I owe to a communication from Mr. M. Schiffer the remark that Satz 1 can be derived directly from the theory of Bergman kernel functions, by considering the kernel function of $C_n$ and applying the inequality (5.7) of [3].

The proof of the inequality (3.10) rests on the following properties of the mapping $w: C_n \to C_n$ (which is assumed here to be non-constant):
1. The subset of $C_n$ on which $J$ vanishes has ($n$-dimensional) measure zero.
2. On the remaining subset of $w(A)$ the mapping $A \to w(A)$ is locally one-to-one and locally of bounded dilatation (footnote 13).
3. Let $K$ be a measurable subset of $A$ on which the mapping $w: A \to w(A)$ is one-to-one and of bounded dilatation. Then the equation
(3.11)  $\int_{w(K)} \dfrac{[du_1 \dots dv_n]}{(1 - \|w\|^2)^{n+1}} = \int_{K} \dfrac{|J|^2}{(1 - \|w\|^2)^{n+1}}\,[dx_1 \dots dy_n]$
holds.

The reader will find in [6] an adequate justification of the general transformation of a Lebesgue integral that underlies equation (3.11).
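The distortion inequality of Satz 1, $|J|^2 \le ((1 - \|w\|^2)/(1 - \|z\|^2))^{n+1}$, can be spot-checked numerically. The following is only a sketch under stated assumptions: the map $w(z_1, z_2) = (z_1^2, z_2^2)$ for $n = 2$ is an illustrative choice that does not appear in the paper (it sends the unit ball into itself because $|z_1|^4 + |z_2|^4 \le (|z_1|^2 + |z_2|^2)^2$), and the function names are hypothetical.

```python
import itertools


def jacobian_bound_holds(z1: complex, z2: complex, tol: float = 1e-12) -> bool:
    """Check inequality (3.8) with n = 2 for the assumed map w = (z1**2, z2**2)."""
    n = 2
    norm_z_sq = abs(z1) ** 2 + abs(z2) ** 2
    if norm_z_sq >= 1.0:
        raise ValueError("the point must lie in the open unit ball")
    w1, w2 = z1 * z1, z2 * z2
    norm_w_sq = abs(w1) ** 2 + abs(w2) ** 2
    # The Jacobian matrix of this map is diag(2*z1, 2*z2).
    jacobian = (2 * z1) * (2 * z2)
    lhs = abs(jacobian) ** 2
    rhs = ((1.0 - norm_w_sq) / (1.0 - norm_z_sq)) ** (n + 1)
    return lhs <= rhs + tol


def sample_points(steps: int = 6):
    """Yield grid points (z1, z2) that lie safely inside the unit ball."""
    grid = [k / steps for k in range(-steps + 1, steps)]
    for a, b, c, d in itertools.product(grid, repeat=4):
        z1, z2 = complex(a, b), complex(c, d)
        if abs(z1) ** 2 + abs(z2) ** 2 < 0.99:
            yield z1, z2


all_ok = all(jacobian_bound_holds(z1, z2) for z1, z2 in sample_points())
print("inequality (3.8) holds on the sample grid:", all_ok)
```

A grid check of this kind illustrates the statement for one map; it is of course no substitute for the proof.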
4. General distortion theorems. Since Satz 1 is a special case of a more general theorem of S. Bergman (footnote 14), the question arises to what extent the procedure developed in Section 3 can be generalized. That this is indeed possible will be shown briefly here. Let $M^n$ denote a Kähler manifold (footnote 15) with the quadratic fundamental form
(4.1)  $ds^2 = 2 g_{\alpha\bar\beta}\, dz_\alpha\, d\bar z_\beta$  ($\alpha, \beta = 1, \dots, n$)
and
(4.2)  $g_{\alpha\bar\beta} = \dfrac{\partial^2 \Phi}{\partial z_\alpha \partial \bar z_\beta}$,
where $\Phi$ is a real (four times continuously differentiable) function of $z$ and $\bar z$. Here $z$ is regarded as a local parameter, with the usual transformation rules under a change of parameter. If one writes $\bar\beta$ for the conjugate index and denotes $\bar z_\beta$ by $z_{\bar\beta}$, then the relations $\overline{z_\beta} = z_{\bar\beta}$, $g_{\alpha\bar\beta} = g_{\bar\beta\alpha}$, $\overline{g_{\alpha\bar\beta}} = g_{\bar\alpha\beta}$ hold (footnote 16).

Important for what follows is the notion of the Ricci tensor $(R_{\alpha\bar\beta})$ of the Kähler manifold given by the metric (4.1). Let $|g_{\alpha\bar\beta}|$ denote the (positive) determinant of the $g_{\alpha\bar\beta}$. Then the equation
(4.3)  $R_{\alpha\bar\beta} = -\dfrac{\partial^2}{\partial z_\alpha \partial \bar z_\beta} \log |g_{\alpha\bar\beta}|$
holds (footnote 17).

(13) Cf., for example, [6], p. 285 ff.
(14) Cf., for example, [2], p. 139 ff., and [3], p. 95.
(15) The reader will find the definition of an analytic manifold of complex dimension $n$, and of a Kähler manifold, in [10].
(16) Cf. [10], p. 120 ff.
(17) Cf. [10], p. 126.
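With the help of the Ricci tensor $(R_{\alpha\bar\beta})$ the Ricci curvature at each point of $M^n$ is defined by the expression
(4.4)  $M = \dfrac{R_{\alpha\bar\beta}\, dz_\alpha\, d\bar z_\beta}{g_{\alpha\bar\beta}\, dz_\alpha\, d\bar z_\beta}$
(footnote 18). For the case $M^n = C_n$ and the choice
(4.5)  $g_{\alpha\bar\beta} = \dfrac{\partial^2}{\partial z_\alpha \partial \bar z_\beta} \log \dfrac{1}{1 - \|z\|^2}$
(Poincaré-Kähler metric of $C_n$), the equation
(4.6)  $|{-R_{\alpha\bar\beta}}| = \left| \dfrac{\partial^2}{\partial z_\alpha \partial \bar z_\beta} \log \dfrac{1}{(1 - \|z\|^2)^{n+1}} \right| = |(n+1) g_{\alpha\bar\beta}|$
evidently holds. In this case, as is well known, the Ricci curvature of $C_n$ has a constant value independent of the point (Einstein manifold).

(18) Cf. [10], p. 23.

Now assume that there exists on $M^n$ a metric
(4.7)  $ds^2 = 2 \tilde g_{\alpha\bar\beta}\, dz_\alpha\, d\bar z_\beta$
with the properties:
1. (4.8)  $-\dfrac{\partial^2}{\partial z_\alpha \partial \bar z_\beta} \log |\tilde g_{\alpha\bar\beta}| = -(n+1)\, \tilde g_{\alpha\bar\beta}$.
2. (4.9)  $\lim |\tilde g_{\alpha\bar\beta}| = +\infty$ when the point of $M^n$ in question converges to the boundary $\Gamma$ of $M^n$.
Then, to begin with, the following distortion theorem can be proved:

SATZ 2. Let $M_0^n$ denote a Kähler manifold with the metric
(4.10)  $ds^2 = 2 g_{\alpha\bar\beta}\, dw_\alpha\, d\bar w_\beta$.
Assume that (4.10) has at every point the properties
1. $-R_{\alpha\bar\beta}\, dw_\alpha\, d\bar w_\beta > 0$;
2. $|{-R_{\alpha\bar\beta}}| \ge A_n |g_{\alpha\bar\beta}|$  ($A_n = (n+1)^n$).
Then for the holomorphic mapping $w: M^n \to M_0^n$ the inequality
(4.11)  $|g_{\alpha\bar\beta}|\, |J|^2 \le |\tilde g_{\alpha\bar\beta}|$
holds,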
provided that $w$ can be regarded as the limit of holomorphic mappings $f_k: M^n \to M_0^n$ ($k = 1, 2, \dots$) with the property that the closures of the image sets $f_k(M^n)$ all lie in $M_0^n$.

Proof. Replace the $w_k$ and $\bar w_k$ in the $g_{\alpha\bar\beta}$ by $w_k(z_1, \dots, z_n)$ and $\bar w_k(\bar z_1, \dots, \bar z_n)$, respectively, and form the function
$U = \log |g_{\alpha\bar\beta}| + \log |J|^2$,
at first under the assumption that the image set $w(M^n)$ has the property
(4.12)  $\overline{w(M^n)} \subset M_0^n$.
Then (in view of the developments of Section 3 and assumption 2)
(4.13)  $\left| \dfrac{\partial^2 U}{\partial z_\alpha \partial \bar z_\beta} \right| \ge A_n \exp U$  ($J \ne 0$).
On the other hand, the function $V = \log |\tilde g_{\alpha\bar\beta}|$ satisfies the differential equation
(4.14)  $\left| \dfrac{\partial^2 V}{\partial z_\alpha \partial \bar z_\beta} \right| = A_n \exp V$,
and this opens the way to the argument of Section 3. For assumption (4.12) guarantees that the supremum of $\psi = U - V$ is attained at a point $z_0$ of $M^n$. On the other hand, $U - V > 0$ cannot possibly hold at this point (because of $|J| > 0$, condition 2 and Hilfssatz 6). This proves the assertion for every mapping $w$ with the property (4.12). The general case is settled by using the sequence of mappings $(f_k)$ and passing to the limit $k \to +\infty$.

If $M^n$ is convex, then the consideration of convex (closed) exhausting manifolds leads, as in the case of the balls $C_r$, to a direct proof of (4.11). We do not go into this further. If $M^n = C_n$, then Satz 2, based on the Poincaré-Kähler metric
$\tilde g_{\alpha\bar\beta} = \dfrac{\partial^2}{\partial z_\alpha \partial \bar z_\beta} \log \dfrac{1}{1 - \|z\|^2}$,
yields the theorem:

SATZ 3. For every Kähler metric (4.10) of $M_0^n$ with the properties 1 and 2 and every holomorphic mapping $w: C_n \to M_0^n$ the inequality
(4.15)  $|g_{\alpha\bar\beta}|\, |J|^2 \le (1 - \|z\|^2)^{-(n+1)}$
holds.

That theorems similar to Satz 1a follow from (4.11) as well as from (4.15) is presumably trivial and needs no further explanation here.
It should be of interest to determine to what extent the theorems established by S. Bergman with the help of the theory of kernel functions coincide with the theorems proved here by differential-geometric methods. This is a question to which, for the time being, an answer can be given all the less as Bergman's theory has so far been known to me only in outline. I thank Prof. M. Schiffer for drawing my attention to the relevant passage in Bergman's book.
REFERENCES
1. L. V. Ahlfors, An extension of Schwarz's lemma, Trans. Amer. Math. Soc. (1938), 359-364.
2. S. Bergman, The Kernel Function and Conformal Mapping, Amer. Math. Soc., New York, 1950.
3. S. Bergman, Zur Theorie der pseudokonformen Abbildungen, Rec. math. Moscou (2) 1 (43) (1936), 79-96.
4. E. Cartan, Leçons sur les invariants intégraux, Hermann, Paris, 1922.
5. A. Dinghas, Ein n-dimensionales Analogon des Schwarz-Pickschen Satzes für holomorphe Abbildungen der komplexen Einheitskugel in eine Kähler-Mannigfaltigkeit, Festschrift zur Gedächtnisfeier für Weierstrass 1815-1965, Wissenschaftliche Abhandlungen der Arbeitsgemeinschaft für Forschung des Landes Nordrhein-Westfalen 33, Westdeutscher Verlag, Köln-Opladen, 1966, 477-494.
6. Haupt-Aumann-Pauc, Differential- und Integralrechnung, III. Band, S. 279 ff., Walter de Gruyter, Berlin, 1955.
7. E. Kähler, Über eine bemerkenswerte Hermitesche Metrik, Abh. Math. Sem. Univ. Hamburg 9 (1933), 173-186.
8. P. J. Myrberg, Ein Analogon des Flächensatzes bei Funktionen mehrerer Variablen, Annal. Acad. Scient. Fennicae A.I. (1961), 303.
9. B. L. van der Waerden, Algebra, Zweiter Teil, 4te Auflage, Springer-Verlag, Berlin-Göttingen-Heidelberg, 1959.
10. K. Yano and S. Bochner, Curvature and Betti Numbers, Princeton Univ. Press, Princeton, New Jersey, 1953.
I. MATHEMATISCHES INSTITUT, FREIE UNIVERSITÄT BERLIN, BERLIN, GERMANY
SOME MOMENT INEQUALITIES FOR STOCHASTIC INTEGRALS AND FOR SOLUTIONS OF STOCHASTIC DIFFERENTIAL EQUATIONS
BY MOSHE ZAKAI

ABSTRACT
Some inequalities concerning the Itô stochastic integral and solutions of stochastic differential equations are obtained.

1. Stochastic integrals. Let $(\Omega, B, P)$ be a probability space, and $w(t, \omega)$, $t \in [0, T]$, $\omega \in \Omega$, a standard Wiener process. Let $B_t$, $t \in [0, T]$, be an increasing system of sub-$\sigma$-fields of $B$ such that for all $t \in [0, T]$, $w(s, \omega)$ ($0 \le s \le t$) is measurable with respect to $B_t$, and such that if $t_1, t_2, \dots, t_k \in [t, T]$, then the aggregate of differences $w(t_j, \omega) - w(t, \omega)$, $j = 1, 2, \dots, k$, is independent of $B_t$. Finally, let $f(t, \omega)$ be a real-valued measurable random function such that for each $t$ in $[0, T]$, $f(t, \omega)$ is measurable with respect to $B_t$ and for almost all $\omega \in \Omega$
$\int_0^T f^2(t, \omega)\,dt < \infty$.
Under these assumptions, the Itô stochastic integral ([1], [2])
$\int_0^T f(t, \omega)\,dw(t, \omega)$
exists, and if
$E \int_0^T f^2(t, \omega)\,dt < \infty$
then
$E \int_0^T f(t, \omega)\,dw(t, \omega) = 0$,
$E \left( \int_0^T f(t, \omega)\,dw(t, \omega) \right)^2 = E \int_0^T f^2(t, \omega)\,dt$.
Received September 11, 1966, in revised form December 9, 1966.
REMARK. $E^r(\ )$ will be used as an abbreviation for $[E(\ )]^r$.

THEOREM 1. If, for some $p > 2$, $\int_0^T E^{2/p} |f(t, \omega)|^p\,dt < \infty$, then
$E \left| \int_0^T f(t, \omega)\,dw(t, \omega) \right|^p < \infty$
and
(1)  $E \left| \int_0^T f(t, \omega)\,dw(t, \omega) \right|^p \le \left( (p-1) \int_0^T E^{2/p} |f(t, \omega)|^p\,dt \right)^{p/2}$.

In order to prove (1) we first prove the following
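LEMMA. If $\varphi(s)$ is non-negative and continuous; $\psi(s)$ is non-negative, Lebesgue-measurable, $\int_0^T \psi(s)\,ds < \infty$, and if for $0 \le t \le T$ and $0 < \alpha < 1$
(2)  $\varphi(t) \le \delta + \int_0^t \psi(s)\,(\varphi(s))^{1-\alpha}\,ds$,  $\delta > 0$,
then
(3)  $\varphi(t) \le \left( \delta^\alpha + \alpha \int_0^t \psi(s)\,ds \right)^{1/\alpha}$.

Proof of the lemma. Under the additional assumption that $\psi(t)$ is continuous it follows from (2) that
(4)  $\dfrac{\frac{d}{d\theta} \left( \delta + \int_0^\theta \psi(s) \varphi^{1-\alpha}(s)\,ds \right)}{\left( \delta + \int_0^\theta \psi(s) \varphi^{1-\alpha}(s)\,ds \right)^{1-\alpha}} = \dfrac{\psi(\theta) \varphi^{1-\alpha}(\theta)}{\left( \delta + \int_0^\theta \psi(s) \varphi^{1-\alpha}(s)\,ds \right)^{1-\alpha}} \le \psi(\theta)$.
The above inequality can be integrated to obtain
(5)  $\left( \delta + \int_0^t \psi(s) \varphi^{1-\alpha}(s)\,ds \right)^\alpha \le \delta^\alpha + \alpha \int_0^t \psi(s)\,ds$,
and (3) follows from (5) and (2). The additional restriction that $\psi(s)$ be continuous can be removed, since for almost all $\theta$ in $[0, T]$, $\delta + \int_0^\theta \psi(s) \varphi^{1-\alpha}(s)\,ds$ is differentiable, with $\psi(\theta) \varphi^{1-\alpha}(\theta)$ as its derivative. For each such $\theta$, (4) is true. Since $\left( \delta + \int_0^\theta \psi \varphi^{1-\alpha} \right)^\alpha$ is absolutely continuous, it is the integral of its almost everywhere derivative; therefore, (5) holds and (3) follows. From now on the variable $\omega$ will be omitted.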
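The lemma can be illustrated numerically in the extremal case where (2) holds with equality. The sketch below is an illustration only: it assumes a constant $\psi$, integrates $\varphi' = \psi \varphi^{1-\alpha}$, $\varphi(0) = \delta$, by a forward Euler scheme, and compares the result with the bound (3). The function names and parameter values are hypothetical choices, not taken from the paper. For a non-decreasing $\psi$ and an increasing solution the Euler iterates stay below the exact solution, so they should also stay below the bound.

```python
def euler_phi(delta, alpha, psi, horizon, steps):
    """Forward-Euler path of phi' = psi(t) * phi**(1 - alpha), phi(0) = delta."""
    h = horizon / steps
    t, phi = 0.0, delta
    path = [(t, phi)]
    for _ in range(steps):
        phi += h * psi(t) * phi ** (1.0 - alpha)
        t += h
        path.append((t, phi))
    return path


def lemma_bound(delta, alpha, psi_integral, t):
    """Right-hand side of (3): (delta**alpha + alpha * int_0^t psi)**(1/alpha)."""
    return (delta ** alpha + alpha * psi_integral(t)) ** (1.0 / alpha)


# Assumed example: psi identically 2, alpha = 1/2 (the value 2/p for p = 4).
delta, alpha = 0.5, 0.5
path = euler_phi(delta, alpha, lambda t: 2.0, horizon=1.0, steps=10_000)
t_end, phi_end = path[-1]
print("Euler value:", phi_end)
print("bound (3):  ", lemma_bound(delta, alpha, lambda t: 2.0 * t, t_end))
```

In this equality case the bound is attained by the exact solution, which is why the two printed values should nearly coincide.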
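Proof of Theorem 1. We assume first that a.s. $|f(t)| \le k$ for all $t$ in $[0, T]$. It follows, then, from a theorem of Dynkin ([2], Theorem 7.3) that all the moments of $\int_0^t f(s)\,dw(s)$ are finite. Write $X_t = \int_0^t f(s)\,dw(s)$ for brevity. For any $\delta > 0$, $|X_t|^p \le (\delta + X_t^2)^{p/2}$, and by Itô's formula [3]
(6)  $(\delta + X_t^2)^{p/2} - \delta^{p/2} = p \int_0^t (\delta + X_s^2)^{(p-2)/2} X_s f(s)\,dw(s) + \int_0^t \left[ \tfrac{p}{2} (\delta + X_s^2)^{(p-2)/2} + \tfrac{p(p-2)}{2} X_s^2 (\delta + X_s^2)^{(p-4)/2} \right] f^2(s)\,ds$.
The right hand side of (6) is of the form
(7)  $\int_0^t F_1(s)\,dw(s) + \int_0^t F_2(s)\,ds$.
Since $E|F_1(s)|^2$ and $E|F_2(s)|$ are bounded on $[0, T]$, the expectation of the first term of (7) is zero; and for the expectation of the second term the order of integration and expectation may be interchanged. Also, the expectation of the left hand side of (6) is continuous in $t$. Therefore, since $X_s^2 \le \delta + X_s^2$,
(8)  $E(\delta + X_t^2)^{p/2} \le \delta^{p/2} + \tfrac{p(p-1)}{2} \int_0^t E\left[ (\delta + X_s^2)^{(p-2)/2} f^2(s) \right] ds$.
Applying to (8) the Hölder inequality with $\sigma = p/2$, $\rho = p/(p-2)$ we have
$E(\delta + X_t^2)^{p/2} \le \delta^{p/2} + \tfrac{p(p-1)}{2} \int_0^t E^{(p-2)/p} (\delta + X_s^2)^{p/2} \cdot E^{2/p} |f(s)|^p\,ds$.
By (2) and (3) (with $\alpha = 2/p$)
(9)  $E\left( \delta + \left( \int_0^t f(s)\,dw(s) \right)^2 \right)^{p/2} \le \left( \delta + (p-1) \int_0^T E^{2/p} |f(s)|^p\,ds \right)^{p/2}$,
from which (1) follows after letting $\delta \to 0$. In order to remove the additional restriction $|f(s)| \le k$, let $f_n(s) = \min[n, \max(-n, f(s))]$. Since $E^{1/p} |f(x)|^p$ is non-decreasing in $p$, the condition of Theorem 1 implies that $E \int_0^T f^2(t)\,dt < \infty$ and $\int_0^T f_n(s)\,dw(s)$ converges to $\int_0^T f(s)\,dw(s)$ in quadratic mean. Therefore there exists a subsequence $n_i$ for which convergence is almost sure. For this subsequence we have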
$\left( \delta + (p-1) \int_0^T E^{2/p} |f(s)|^p\,ds \right)^{p/2} \ge \liminf_i E\left( \delta + \left( \int_0^T f_{n_i}(s)\,dw(s) \right)^2 \right)^{p/2} \ge E\left( \delta + \left( \int_0^T f(s)\,dw(s) \right)^2 \right)^{p/2}$,
where the first inequality follows from (9) and the second from Fatou's lemma.

COROLLARY 1. Under the condition of Theorem 1,
(10)  $E \left| \int_0^T f(s)\,dw(s) \right|^p \le (p-1)^{p/2}\, T^{(p-2)/2} \int_0^T E|f(s)|^p\,ds$.
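For a constant, non-random integrand the two sides of (10) can be computed in closed form, which gives a quick sanity check of the constant $(p-1)^{p/2}$. The sketch below assumes $f \equiv c$, so that $\int_0^T c\,dw$ is Gaussian with variance $c^2 T$ and $E|Z|^p = 2^{p/2} \Gamma((p+1)/2)/\sqrt{\pi}$ for a standard Gaussian $Z$. The function names are hypothetical, and the example is an illustration of the corollary rather than part of the paper's argument.

```python
import math


def abs_gaussian_moment(p: float) -> float:
    """E|Z|^p for a standard Gaussian Z."""
    return 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)


def corollary_ratio(p: float, c: float = 1.0, horizon: float = 1.0) -> float:
    """Left side of (10) divided by its right side for the constant integrand f = c."""
    lhs = abs(c) ** p * horizon ** (p / 2.0) * abs_gaussian_moment(p)
    rhs = (p - 1.0) ** (p / 2.0) * horizon ** ((p - 2.0) / 2.0) * (horizon * abs(c) ** p)
    return lhs / rhs


for p in (2.0, 2.5, 3.0, 4.0, 6.0, 10.0):
    print(p, corollary_ratio(p))
```

The ratio does not depend on $c$ or $T$ in this special case, and for $p = 4$ it compares the Gaussian fourth moment 3 with the constant 9 quoted in the remark that follows Corollary 1.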
The proof follows from (9) by direct application of the Hölder inequality to $\int_0^T E^{2/p} |f(s)|^p\,ds$ with $\sigma = p/(p-2)$ and $\rho = p/2$.

REMARK. For $p = 4$, inequality (10) has been known ([1] p. 23, with 36 instead of $(p-1)^{p/2} = 9$, and [4] in a weaker form).

2. Stochastic differential equations. Let $E_r$ be the Euclidean $r$-space. Let $m(t, x)$ ($t \ge 0$, $x \in E_r$) assume values in $E_r$; $G(t, x)$ ($t \ge 0$, $x \in E_r$) will assume real $r \times q$ matrix values and $w(t)$ will denote the $q$-dimensional Wiener process. The prime (') will denote the transpose of a vector or a matrix; for vectors $|\cdot|$ will denote the Euclidean norm, for matrices $G$, $|G|$ will denote the norm $(\mathrm{trace}\, GG')^{1/2}$ ([2] p. 209). Assume that $m(t, x)$ and $G(t, x)$ are measurable functions of their variables, that
(11)  $|m(t, x) - m(t, y)| \le k|x - y|$,  $|G(t, x) - G(t, y)| \le k|x - y|$,
and $\int_0^T (|m(t, 0)|^2 + |G(t, 0)|^2)\,dt < \infty$. Let $a$ be a random variable assuming values in $E_r$ and independent of $w(t)$, $t \ge 0$. Under these assumptions, the stochastic differential equation
(12)  $dx(t) = m(t, x(t))\,dt + G(t, x(t))\,dw(t)$,  $x(0) = a$
has a unique solution in $[0, T]$, and if $E|a|^2 < \infty$, then $E|x(t)|^2$ is bounded on $[0, T]$ ([1], [2]).

THEOREM 2. If for some $p > 2$
(13)  $\int_0^T (|m(t, 0)|^p + |G(t, 0)|^p)\,dt < \infty$
and $E|a|^p < \infty$, then $E|x(t)|^p$ is bounded on $[0, T]$.

For $p = 2$ this result is usually obtained (together with the existence of solution) by the method of successive approximations ([1] p. 47, or [2] Theorem 11.1).
Using Corollary 1 (and the Hölder inequality instead of the Schwarz inequality), the same proof applies directly to convergence in the mean of order $p$. The details are, therefore, omitted.

THEOREM 3. If, for all $t \ge 0$,
(14)  $|m(t, 0)| + |G(t, 0)| \le k_1$,
then there exists a constant $\alpha$ such that for all $t \ge 0$, for all $a$ in $E_r$ and for all non-negative integers $n$
(15)  $E_a (1 + |x(t)|^2)^n \le (1 + |a|^2)^n e^{n(2n+1)\alpha t}$.
If, in addition, $|G(t, x)| \le k_2$ ($t \ge 0$, $x \in E_r$), then there exists a constant $\beta$ such that for all $a$ and for all $t \ge 0$
(16a)  $E_a (1 + |x(t)|^2)^n \le \dfrac{(2n)!}{n!\, 2^n} \left( \tfrac{3}{2} + |a|^2 \right)^n e^{2n\beta t}$
(which is the same as
(16b)  $E_a (1 + |x(t)|^2)^n \le E(\xi(t))^{2n}$,
where
$\xi(t) = \left( \tfrac{3}{2} + |a|^2 \right)^{1/2} e^{\beta t}\, \omega$
and $\omega$ is a Gaussian random variable with $E\omega = 0$ and $E\omega^2 = 1$).

Proof.
By Itô's formula ([2], [3]), for $n = 0, 1, 2, \dots$
(17)  $(1 + |x(t)|^2)^n = (1 + |a|^2)^n + \int_0^t 2n (1 + |x(s)|^2)^{n-1} x'(s)\, m(s, x(s))\,ds + \int_0^t 2n (1 + |x(s)|^2)^{n-1} x'(s)\, G\,dw(s) + \tfrac{1}{2} \int_0^t (1 + |x(s)|^2)^{n-1}\, 2n\, \mathrm{trace}(G G')\,ds + \tfrac{1}{2} \int_0^t 4n(n-1) (1 + |x(s)|^2)^{n-2} x'(s)\, G G'\, x(s)\,ds$.
The right hand side of (17) is of the form
$(1 + |a|^2)^n + \int_0^t F(s)\,dw(s) + \int_0^t f(s)\,ds$.
By the result of Theorem 2, $E|F(s)|^2$ and $E|f(s)|$ are bounded on any bounded $t$ interval. Therefore $E \int_0^t F(s)\,dw(s) = 0$ and $E_a (1 + |x(t)|^2)^n$ is continuous in $t$.
From (11) and (14) it follows that for some $\alpha$
$|m(t, x)| \le \alpha (1 + |x|^2)^{1/2}$,
$|G(t, x)|^2 = \mathrm{trace}(G G') \le \alpha (1 + |x|^2)$.
Therefore, from (17),
(18)  $E_a (1 + |x(t)|^2)^n \le (1 + |a|^2)^n + n(2n+1)\alpha \int_0^t E_a (1 + |x(s)|^2)^n\,ds$.
Setting $E_a (1 + |x(t)|^2)^n = \varphi_n(t)$, and since $\varphi_n(t)$ is continuous in $t$, we have from (18)
$\dfrac{d}{dt} \left\{ e^{-n(2n+1)\alpha t} \int_0^t \varphi_n(s)\,ds \right\} \le \varphi_n(0)\, e^{-n(2n+1)\alpha t}$.
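Integrating and substituting in (18), we get (15). From the condition $|G| \le k_2$, (11) and (14) it follows that for some $\beta$
$|m(t, x)| \le \beta (1 + |x|^2)^{1/2}$,  $|G(t, x)|^2 \le \beta$.
Therefore, from (17), for $n = 1, 2, \dots$
(19)  $\varphi_n(t) \le \varphi_n(0) + 2n\beta \int_0^t \varphi_n(s)\,ds + n(2n-1)\beta \int_0^t \varphi_{n-1}(s)\,ds$,
or
$\dfrac{d}{dt} \left\{ e^{-2n\beta t} \int_0^t \varphi_n(s)\,ds \right\} \le \varphi_n(0)\, e^{-2n\beta t} + n(2n-1)\beta\, e^{-2n\beta t} \int_0^t \varphi_{n-1}(s)\,ds$.
Integrating and substituting in (19)
(20)  $\varphi_n(t) \le \varphi_n(0)\, e^{2n\beta t} + \beta n(2n-1)\, e^{2n\beta t} \int_0^t e^{-2n\beta s} \varphi_{n-1}(s)\,ds$.
Repeated applications of (20) yield
$\varphi_n(t) \le \sum_{i=0}^{n} \dfrac{(2n)!\, (\varphi_1(0))^i\, \beta^{n-i}\, e^{2n\beta t}}{(2i)!\, 2^{n-i}}\, Q(t, n-i)$,
where
$Q(t, 0) = 1$,  $Q(t, j) = \int_0^t Q(\xi, j-1)\, e^{-2\beta\xi}\,d\xi = \dfrac{(2\beta)^{-j} (1 - e^{-2\beta t})^j}{j!}$.
Therefore
$\varphi_n(t) \le \sum_{i=0}^{n} \dfrac{(\varphi_1(0))^i\, (2n)!\, (1 - e^{-2\beta t})^{n-i}\, e^{2n\beta t}}{(2i)!\, 4^{n-i}\, (n-i)!}$,
and, since $(2i)! \ge 2^i\, i!$,
$\varphi_n(t) \le (2n)! \sum_{i=0}^{n} \dfrac{(\varphi_1(0))^i\, (1 - e^{-2\beta t})^{n-i}\, e^{2n\beta t}}{i!\, (n-i)!\, 2^{2n-i}} = \dfrac{(2n)!}{n!\, 2^n} \left( \varphi_1(0) + \dfrac{1 - e^{-2\beta t}}{2} \right)^n e^{2n\beta t}$.
This proves (16a), and (16b) follows from the fact that for $\xi$ Gaussian with zero expectation
$E\xi^{2n} = \dfrac{(2n)!}{n!\, 2^n} (E\xi^2)^n$.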
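The last two steps of the proof are elementary and can be checked mechanically. The sketch below, with hypothetical function names and arbitrarily chosen test values, verifies that $(2n)!/(n!\,2^n)$ equals the double factorial $(2n-1)!!$ (the $2n$-th moment of a standard Gaussian variable) and that the finite sum obtained from repeated application of (20) is dominated by the closed form in (16a). It assumes $\varphi_1(0) = 1 + |a|^2$, as in the proof.

```python
import math
from fractions import Fraction


def gaussian_even_moment(n: int) -> Fraction:
    """(2n)! / (n! * 2**n), the 2n-th moment of a standard Gaussian variable."""
    return Fraction(math.factorial(2 * n), math.factorial(n) * 2 ** n)


def odd_double_factorial(n: int) -> int:
    """(2n - 1)!! = 1 * 3 * 5 * ... * (2n - 1); the empty product is 1."""
    out = 1
    for k in range(1, 2 * n, 2):
        out *= k
    return out


def iterated_sum(n: int, a_sq: float, beta: float, t: float) -> float:
    """The finite sum bounding phi_n(t) before the estimate (2i)! >= 2**i * i! is used."""
    phi1 = 1.0 + a_sq
    decay = 1.0 - math.exp(-2.0 * beta * t)
    total = 0.0
    for i in range(n + 1):
        total += (phi1 ** i * math.factorial(2 * n) * decay ** (n - i)
                  / (math.factorial(2 * i) * 4 ** (n - i) * math.factorial(n - i)))
    return total * math.exp(2.0 * n * beta * t)


def bound_16a(n: int, a_sq: float, beta: float, t: float) -> float:
    """Right-hand side of (16a)."""
    return float(gaussian_even_moment(n)) * (1.5 + a_sq) ** n * math.exp(2.0 * n * beta * t)


print(iterated_sum(3, 1.0, 0.5, 2.0), bound_16a(3, 1.0, 0.5, 2.0))
```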
REFERENCES
1. A. V. Skorohod, Studies in the theory of random processes, Addison-Wesley (1965).
2. E. B. Dynkin, Markov Processes, Springer (1965).
3. K. Itô, On a formula concerning stochastic differentials, Nagoya Math. J. 3 (1951), 55-65.
4. E. Wong and M. Zakai, The oscillation of stochastic integrals, Z. Wahrscheinlichkeitstheorie verw. Geb. 4 (1965), 103-112.
FACULTY OF ELECTRICAL ENGINEERING, TECHNION-ISRAEL INSTITUTE OF TECHNOLOGY, HAIFA
THE INCREASE OF SUMS AND PRODUCTS DEPENDENT ON $(y_1, \dots, y_n)$ BY REARRANGEMENT OF THIS SET
BY SHOSHANA ABRAMOVICH

ABSTRACT
Let $F(u, v)$ be a symmetric real function defined for $\alpha < u, v < \beta$ and assume that $G(u, v, w) = F(u, v) + F(u, w) - F(v, w)$ is decreasing in $v$ and $w$ for $u \le \min(v, w)$. For any set $(y) = (y_1, \dots, y_n)$, $\alpha < y_i$ [...] $> 0$ and where the set $(y) = (y_1, \dots, y_n)$, $y_i \ge 0$, is given except in arrangement.

The problems considered here arose in connection with a theorem by A. Lehman [1] and a lemma of Duffin and Schaeffer [2]. We start with some definitions given in [1]. The sets $(y^-) = (y_1^-, y_2^-, \dots, y_n^-)$ and $({}^-y) = ({}^-y_1, \dots, {}^-y_n)$ are symmetrically decreasing rearrangements of an ordered set $(y) = (y_1, \dots, y_n)$ of $n$ real numbers if
(1)  $y_1^- \le y_n^- \le y_2^- \le y_{n-1}^- \le \dots \le y^-_{[(n+2)/2]}$
and
(2)  ${}^-y_n \le {}^-y_1 \le {}^-y_{n-1} \le \dots \le {}^-y_{[(n+1)/2]}$.
A circular rearrangement of an ordered set $(y) = (y_1, \dots, y_n)$ is a cyclic rearrangement of $(y)$ or a cyclic rearrangement followed by inversion. An ordered set $(y) = (y_1, \dots, y_n)$ of $n$ real numbers is arranged in circular symmetrical order if one of its circular rearrangements is symmetrically decreasing. It follows that the sets $(y^-)$, $({}^-y)$ are arranged in circular symmetrical order and so is the set $(y) = (y_1, \dots, y_n)$ if either
(3)  $y_1 \le y_2 \le y_n \le y_3 \le y_{n-1} \le \dots \le y_{[(n+3)/2]}$
or
(4)  $y_2 \le y_1 \le y_3 \le y_n \le y_4 \le y_{n-1} \le \dots \le y_{[(n+4)/2]}$
holds.

Received Aug. 3, 1966, and in revised form April 27, 1967. This paper is part of the author's Master of Science dissertation at the Technion-Israel Institute of Technology. The author wishes to thank Professor B. Schwarz and Professor E. Jabotinsky for their help in the preparation of this paper.

THEOREM 1. Let $F(u, v)$ be a symmetric real function defined for
$\alpha < u, v < \beta$ and assume that the function
(5)  $G(u, v, w) = F(u, v) + F(u, w) - F(v, w)$,  $\alpha < u, v, w < \beta$,
is decreasing in $v$ and $w$ for $u < \min(v, w)$. Let the set $(y) = (y_1, \dots, y_n)$, $\alpha < y_i < \beta$, $i = 1, \dots, n$, be given except in arrangement. Then
(6)  $S_n = \sum_{i=1}^{n} F(y_i, y_{i+1})$  ($y_{n+1} = y_1$)
is maximal if $(y)$ is arranged in circular symmetrical order. Moreover, if $G(u, v, w)$ is strictly decreasing in $v$ and $w$ for $u < \min(v, w)$ and
(7)  $F(u, u) = G(u, v, u) = G(u, u, w) > G(u, v, w)$,  $u < \min(v, w)$,
and if, in addition, no three elements of $(y)$ have the same value, then (6) attains its maximum only if $(y)$ is arranged in circular symmetrical order.

Proof. As the proof is similar to the proof in [1], we give here only a short outline. The first assertion of the theorem is equivalent to
(8)  $S'_n = \sum_{i=1}^{n} F(y'_i, y'_{i+1}) \ge \sum_{i=1}^{n} F(y_i, y_{i+1}) = S_n$,
$y_{n+1} = y_1$, $y'_{n+1} = y'_1$, $(y') = (y^-)$, $y_1 \le y_i$, $i = 2, \dots, n$.
(8) is proved by induction, using the equalities
(9)  $\sum_{i=1}^{n} F(y'_i, y'_{i+1}) = \sum_{i=1}^{n-1} F(x'_i, x'_{i+1}) + G(y'_1, x'_1, x'_{n-1})$,
$\sum_{i=1}^{n} F(y_i, y_{i+1}) = \sum_{i=1}^{n-1} F(x_i, x_{i+1}) + G(y_1, x_1, x_{n-1})$,
$y_{n+1} = y_1$, $x_n = x_1$, $y'_{n+1} = y'_1$, $x'_n = x'_1$,
where $x_i = y_{i+1}$, $x'_i = y'_{i+1}$, $i = 1, \dots, n-1$ (hence $(x') = ({}^-x)$). The second assertion is also proved by induction and (9) again allows us to proceed from $n - 1$ to $n$. For the sets $(1, 2, 3, 4, 2, 2)$ and $(1, 2, 2, 3, 4, 2)$ and any symmetric function $F$ the sums (6) are equal; hence it is necessary for the second part of the theorem to assume that no three elements of $(y)$ have the same value.
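The first assertion of Theorem 1 can be checked by brute force for small sets. The sketch below is an illustration under stated assumptions: it uses $F(u, v) = u \cdot v$ and $F(u, v) = -(u - v)^2$ as test functions (the second is $f(|u - v|)$ with the concave decreasing $f(x) = -x^2$), builds one symmetrically decreasing arrangement following pattern (1), and compares its circular sum with the maximum over all permutations. The helper names are hypothetical.

```python
from itertools import permutations


def circular_sum(y, F):
    """S_n = sum of F(y_i, y_{i+1}) with y_{n+1} = y_1."""
    n = len(y)
    return sum(F(y[i], y[(i + 1) % n]) for i in range(n))


def symmetric_arrangement(values):
    """Place the sorted values alternately at the front and the back,
    so that y_1 <= y_n <= y_2 <= y_{n-1} <= ... as in pattern (1)."""
    ordered = sorted(values)
    out = [None] * len(ordered)
    front, back = 0, len(ordered) - 1
    for k, v in enumerate(ordered):
        if k % 2 == 0:
            out[front] = v
            front += 1
        else:
            out[back] = v
            back -= 1
    return out


def product(u, v):
    return u * v


def neg_square_gap(u, v):
    return -(u - v) ** 2


values = [1, 2, 3, 4, 5]
arranged = symmetric_arrangement(values)
best = max(circular_sum(p, product) for p in permutations(values))
print(arranged, circular_sum(arranged, product), best)
```

For the values 1 to 5 the arrangement produced is (1, 3, 5, 4, 2), whose circular sum of products is 48.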
The theorem yields the following result concerning linear rearrangements.

COROLLARY. Let all the assumptions about $F(u, v)$ and $G(u, v, w)$ of the first assertion of Theorem 1 hold, and assume $\alpha < 0$ ($\alpha < u, v < \beta$). If $F(0, v) = F(u, 0) = 0$ and if $(x) = (x_1, \dots, x_n)$ is a set of $n$ positive numbers $0 < x_i < \beta$, $i = 1, \dots, n$, given except in arrangement, then
(10)  $\sum_{i=1}^{n-1} F(x_i, x_{i+1})$
is maximal if $(x)$ is arranged in symmetrical decreasing order.

Proof. Define
(11)  $y_{i+1} = x_i$, $i = 1, \dots, n$;  $y'_{i+1} = {}^-x_i$, $i = 1, \dots, n$;  $y_1 = y'_1 = 0$;
note that $(y') = (y^-)$. Using (11) the assertion of the corollary is equivalent to
(10')  $\sum_{i=1}^{n+1} F(y'_i, y'_{i+1}) \ge \sum_{i=1}^{n+1} F(y_i, y_{i+1})$
because $y_1 = 0$ and $F(0, u) = 0$. But this is (8) because $(y') = (y^-)$, and thus the proof of the corollary is complete. It can be shown that if in the corollary we assume that the stronger assumptions of the second part of Theorem 1 hold, then (10) attains its maximum only if $(x)$ is symmetrically decreasing.

We now give examples of functions $F(u, v)$ satisfying the assumptions of Theorem 1. Define $F(u, v) = f(|u - v|)$, where $f(x)$ is a concave decreasing function for $x > 0$. This is a symmetric function of $u$ and $v$, and
$G(u, v, w) = f(|u - v|) + f(|u - w|) - f(|v - w|)$
has all the properties which are needed for the first assertion of Theorem 1. If the concavity of $f(x)$ is strict, then $F(u, v) = f(|u - v|)$ and $G(u, v, w)$ have all the properties of Theorem 1. In this special case Theorem 1 becomes Lehman's theorem. Another class of functions satisfying the first part of Theorem 1 consists of the symmetric functions $F(u, v)$ for which $\partial^2 F / \partial u\, \partial v > 0$ holds. Examples of such functions are
$F(u, v) = u \cdot v$,
$F(u, v) = u^s v^r + v^s u^r$,  $s, r > 0$,  $u, v > 0$,
$F(u, v) = \log(u \cdot v + t)$,  $t > \max(0, -u \cdot v)$.
It can be shown for these examples that the assumptions on $F(u, v)$ of both parts of Theorem 1 hold. From the last example it follows that
(12)  $\prod_{i=1}^{n} (y_i y_{i+1} + t)$,  $y_{n+1} = y_1$,  $t > \max(0, -y_i y_j;\ i \ne j;\ i, j = 1, \dots, n)$
is maximal when $(y)$ is arranged in circular symmetrical order, and if no three elements of $(y)$ have the same value, then the maximum is attained only if $(y)$ is arranged in circular symmetrical order.

We now turn to another result concerning a product, which generalizes the following lemma of Duffin and Schaeffer.

LEMMA [2, p. 522]. Let the set $(y) \ge 0$ of $2n$ nonnegative numbers be given except in arrangement. Then
(13)  $\prod_{i=1}^{n} (y_{2i-1} \cdot y_{2i} + t)$,  $t > 0$,
is maximal when $(y)$ is arranged in decreasing order.

Generalizing this result we obtain the following theorem.
THEOREM 2. Let $(a) = (a_1, \dots, a_m)$ and $(y) = (y_1, \dots, y_{2n})$ be sets of nonnegative numbers, where $(y)$ is given except in arrangement. Then the product
(14)  $\prod_{i=1}^{n} \left[ (y_{2i-1} y_{2i})^m + a_1 (y_{2i-1} y_{2i})^{m-1} + \dots + a_m \right]$
attains its maximum when $(y)$ is arranged in decreasing order.
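Theorem 2 lends itself to the same kind of brute-force check for small sets. The sketch below is an illustration only: the function name and the test data are assumptions, and integer inputs are used so that the comparison is exact. It evaluates the product (14) for every arrangement of six numbers and compares the maximum with the value at the decreasing arrangement.

```python
from itertools import permutations


def pair_polynomial_product(y, a):
    """Product (14): over consecutive pairs (y_1, y_2), (y_3, y_4), ... of
    x**m + a_1 * x**(m-1) + ... + a_m, where x is the product of the pair."""
    m = len(a)
    total = 1
    for i in range(0, len(y), 2):
        x = y[i] * y[i + 1]
        total *= x ** m + sum(a_k * x ** (m - k) for k, a_k in enumerate(a, start=1))
    return total


y = [5, 1, 4, 2, 3, 6]
a = [1, 2]  # m = 2, so each factor is x**2 + x + 2
best = max(pair_polynomial_product(p, a) for p in permutations(y))
decreasing = sorted(y, reverse=True)
print(decreasing, pair_polynomial_product(decreasing, a), best)
```

With $m = 1$ and $a = (t)$ the same function evaluates the product (13) of the lemma of Duffin and Schaeffer.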
Proof. The proof is by induction on $m$. The case $m = 1$ is the lemma of Duffin and Schaeffer. Let $y_1$ be the maximal term in $(y)$. If between $y_1$ and $y_2$ there is a term, let us call it $y_3$, $y_2 < y_3 \le y_1$, we interchange $y_2$ with $y_3$ and consider the difference
$\Delta = [(y_1 y_3)^m + a_1 (y_1 y_3)^{m-1} + \dots + a_m]\,[(y_2 y_4)^m + a_1 (y_2 y_4)^{m-1} + \dots + a_m] - [(y_1 y_2)^m + a_1 (y_1 y_2)^{m-1} + \dots + a_m]\,[(y_3 y_4)^m + a_1 (y_3 y_4)^{m-1} + \dots + a_m]$
$= \{ [a_1 (y_1 y_3)^{m-1} + \dots + a_m]\,[a_1 (y_2 y_4)^{m-1} + \dots + a_m] - [a_1 (y_1 y_2)^{m-1} + \dots + a_m]\,[a_1 (y_3 y_4)^{m-1} + \dots + a_m] \}$
$+ \{ (y_1 y_3)^m [a_1 (y_2 y_4)^{m-1} + \dots + a_m] + (y_2 y_4)^m [a_1 (y_1 y_3)^{m-1} + \dots + a_m] - (y_1 y_2)^m [a_1 (y_3 y_4)^{m-1} + \dots + a_m] - (y_3 y_4)^m [a_1 (y_1 y_2)^{m-1} + \dots + a_m] \}$.
Let us look at the terms after the equality sign. By the induction assumption the term in the first braces is nonnegative. The term in the second braces is also nonnegative because it is equal to
$\sum_{k=1}^{m} a_k (y_1 y_2 y_3 y_4)^{m-k} \left[ (y_1^k - y_4^k)(y_3^k - y_2^k) \right]$
and we assumed that $y_3 > y_2$, $y_1 \ge y_4$. Therefore $\Delta \ge 0$ and we can rearrange the set $(y)$ in such a way that the two greatest numbers of $(y)$ will appear in the same term of the product (14) without diminishing it. We continue the same process for the remaining terms of the product (14). This completes the proof of the theorem.

The proof shows that if $(y) > 0$ and at least one of the $a_k$, $k = 1, \dots, m$, is positive, then the maximum is attained only in those cases in which neither
$y_{2i-1} < y_k < y_{2i}$ nor $y_{2i} < y_k < y_{2i-1}$
holds.

REFERENCES
1. A. L. Lehman, A result on rearrangements, Israel J. Math. 1, No. 1 (1963), 22-28.
2. R. J. Duffin and A. C. Schaeffer, A refinement of an inequality of the brothers Markoff, Trans. Amer. Math. Soc. 50 (1941), 517-528.
TECHNION-ISRAEL INSTITUTE OF TECHNOLOGY, HAIFA, ISRAEL
SUBFAIR CASINO FUNCTIONS ARE SUPERADDITIVE
BY LESTER E. DUBINS (*)

ABSTRACT
It is shown that all subfair casino functions are superadditive on the unit interval.

As the main step in showing that bold play in a primitive casino is optimal, and, consequently, that the utility $U$ of the bold strategies in a primitive casino is a casino function, it was shown in [1, Chap. 6] that for all $f$ and $g$ in the closed unit interval, $U$ satisfies:
(1)  $U(f + g) \ge U(f) + U(g)$ if $f + g \le 1$,
and
(2)  $U(f + g - 1) \ge U(f) + U(g) - 1$ if $f + g \ge 1$.
It turns out to be very simple to prove

THEOREM 1. (**) Every subfair casino function $U$ satisfies (1) and (2).

Proof. (***) We first show that $U$ satisfies (1). As was observed in [1, Chap. 4, Secs. 2 and 3],
(3)  $U(fg) \ge U(f)\,U(g)$,
and
(4)  $U(f + g(1 - f)) \ge U(f) + U(g)(1 - U(f))$,
for all $f$ and $g$ in the unit interval. The only discontinuous subfair casino function is 0 for $0 \le f < 1$ and is 1 at $f = 1$, as was shown in [1, Chap. 4, Sec. 3]. Since this function plainly satisfies (1) and (2), only continuous $U$ need to be considered. Let
(5)  $S(f, g) = U(f) + U(g) - U(f + g)$
on the triangle where $f + g \le 1$. If $S$ were ever positive, there would be a least $f$ such that $S$ attains its maximum for that $f$ and some $g$. That $f$ cannot be 0, nor can a corresponding $g$ be 1. Apply $S$ to $(f', g')$ where $f' = fg$ and $g' = f + g - fg$:
(6)  $S(f', g') = U(f') + U(g') - U(f' + g')$
$= U(fg) + U(f + g - fg) - U(f + g)$
$\ge U(f)U(g) + U(f) + U(g) - U(f)U(g) - U(f + g)$
$= U(f) + U(g) - U(f + g)$
$= S(f, g)$,
where the inequality is an application of (3) and (4). But (6) is a contradiction, since $f' < f$. So (1) holds.

A very similar proof would show that (2) holds, but it seems more interesting to demonstrate (2) by means of a digression that brings out the intimate relation of (2) to (1). To each function $V$ of two real variables associate its dual $V^*$, thus:
(7)  $V^*(f, g) = 1 - V(1 - f, 1 - g)$.
Let $\mathcal{V}$ be the set of all $V$ such that: (i) $V(f, g)$ is increasing in $f$ and in $g$; and (ii) for every subfair casino function $U$,
(8)  $U(V(f, g)) \ge V(U(f), U(g))$
whenever $f$, $g$ and $V(f, g)$ are in the unit interval.

LEMMA 1. $V \in \mathcal{V}$ if and only if $V^* \in \mathcal{V}$.

Received March 5, 1967.
* Supported in part by N.S.F. Grant G.P. 5059.
** The idea that Theorem 1 might be true, and a proof of it, occurred to me in the course of a seminar in which David Gilat, Shmuel Shye, and William Sudderth actively participated.
*** I am grateful to Leonard J. Savage for significantly simplifying my original far-from-elegant proof.
Proof of Lemma 1. Introduce $U^*$ as in [1, Chap. 4, Sec. 6], namely, $U^*(z) = 1 - U^{-1}(1 - z)$ for $0 \le z \le 1$. Let $f$, $g$, $x$ and $y$ be related thus: $U(f) = 1 - x$, $U(g) = 1 - y$, or, equivalently, $U^*(x) = 1 - f$, $U^*(y) = 1 - g$. Let $V \in \mathcal{V}$ and suppose first that $V(f, g)$ and $V^*(x, y)$ are in the unit interval. Then notice that (8) holds if and only if
(9)  $U^*(V^*(x, y)) \ge V^*(U^*(x), U^*(y))$,
even if $V$ is not monotone. Therefore, it may be supposed that $x$, $y$, and $V^*(x, y)$ are in the unit interval, but $V(f, g)$ is not. Verify that because $V^*(x, y)$ is in the unit interval, so is $V(U(f), U(g))$. So by monotonicity of $V$, $V(f, g)$ cannot be less than 0, and therefore must exceed 1. So $V^*(U^*(x), U^*(y)) = 1 - V(f, g) < 0 \le U^*(V^*(x, y))$. This completes the proof of Lemma 1.

Since (1) has been shown to hold for all subfair casino functions, the function $V(f, g) = f + g$ is in $\mathcal{V}$. In view of Lemma 1, so is the function $f + g - 1$, that is, (2) also holds for all subfair casino functions $U$. The proof of Theorem 1 is now complete.

As is easily seen, the proofs of Theorem 1 and Lemma 1 apply not only to casino functions $U$ but to all bounded solutions $U$ to (3) and (4), omitting those $U$ for which $U(f) = 1$ for $0 < f < 1$. Inequalities (3) and (4) for casino functions $U$ have intuitive interpretations that make them a priori plausible and, therefore, natural to conjecture. It would be nice to find such interpretations for (1) and (2).
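Inequalities (1) and (2) can be observed on a concrete example. The sketch below rests on an assumption that is not made in this note: it takes as $U$ the bold-play utility of red-and-black with win probability $w \le 1/2$, which satisfies $U(f) = w\,U(2f)$ for $f \le 1/2$ and $U(f) = w + (1 - w)\,U(2f - 1)$ for $f \ge 1/2$, and treats it as an instance of a subfair casino function. The function name is hypothetical. Exact rational arithmetic is used, and the recursion terminates because the fortunes are dyadic rationals.

```python
from fractions import Fraction


def bold_play_utility(f, w):
    """Probability of reaching fortune 1 from f under bold play at red-and-black
    with win probability w; f must be a dyadic rational in [0, 1]."""
    f = Fraction(f)
    if f <= 0:
        return Fraction(0)
    if f >= 1:
        return Fraction(1)
    if f <= Fraction(1, 2):
        # Stake everything: win with probability w and move to 2f.
        return w * bold_play_utility(2 * f, w)
    # Stake 1 - f: win and reach 1, or lose and fall to 2f - 1.
    return w + (1 - w) * bold_play_utility(2 * f - 1, w)


w = Fraction(1, 3)
grid = [Fraction(k, 32) for k in range(33)]
superadditive = all(
    bold_play_utility(f + g, w) >= bold_play_utility(f, w) + bold_play_utility(g, w)
    for f in grid for g in grid if f + g <= 1
)
dual_inequality = all(
    bold_play_utility(f + g - 1, w) >= bold_play_utility(f, w) + bold_play_utility(g, w) - 1
    for f in grid for g in grid if f + g >= 1
)
print(superadditive, dual_inequality)
```

For $w = 1/2$ the recursion gives $U(f) = f$, and both (1) and (2) become equalities.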
REFERENCES
1. L. E. Dubins and L. J. Savage, How To Gamble If You Must, McGraw-Hill, New York, 1965.
UNIVERSITY OF CALIFORNIA, BERKELEY, CALIFORNIA
A FUNCTIONAL METHOD FOR LINEAR SETS BY
R. KAUFMAN ABSTRACT
Kronecker sets for approximation by characters are constructed by an application of Baire's theorem to Banach spaces of differentiable functions. A compact subset E of $(-\infty, \infty)$ is a Kronecker set [2, §5.2] if the exponential functions $e^{i\lambda x}$ $(-\infty < \lambda < \infty)$ are uniformly dense in the continuous complex-valued functions of modulus 1 on E. Wik [3] has constructed Kronecker sets of Hausdorff dimension 1; in fact E can carry a positive measure subject to any prescribed continuity condition weaker than absolute continuity. However, the sets constructed in [3] seem to be very unevenly dispersed; we shall describe a function-space method that necessarily yields Kronecker sets with some degree of symmetry. Let p be a positive integer and $r_1, r_2, r_3, \ldots$ numbers such that
(1) $0 < 2r_{n+1} < r_n < 1$ $(1 \le n < \infty)$,
(2) $\sup_n \, r_n^p / r_{n+1} = \infty$.
Let [Y be the "symmetric set" of numbers ~ = l t n r n , e~ = "Jr 1 (1 _- 6, then for some f e C p f ( x ) = ai for x ~ Ii
llsll s
1 < i < n)
(, + n6-P)max[ a,[
for an absolute constant B - - B ( p ) . A function 4~ in C v belongs to ~ N if each continuous unimodular function h Received December 20, 1966, and in revised form Match, 1967. 185
on Y can be uniformly approximated by exponentials $e^{i\lambda x}$ $(-\infty < \lambda < \infty)$. Because the functions h admit a countable dense subset it is enough to prove that for each fixed h, the exceptional set has void interior, since it is evidently of type $F_\sigma$. We denote by $\lambda(\eta)^{-1}$ $(\eta > 0)$ a real number whose nearest neighbors in the sequence $r_1, r_2, r_3, \ldots$, say $r_n > \lambda(\eta)^{-1} > r_{n+1}$, satisfy the inequalities
(3) $\eta^2 r_n^p > \eta\,\lambda(\eta)^{-1} > r_{n+1}$;
$\lambda(\eta)^{-1}$ and $r_n$ can be made arbitrarily small by (2). For a number $t > 0$, $\eta_1 \in (0, 1)$, and a complex number z of modulus 1 set
(4) $V(t, z, \eta_1) = \{-\infty < x < \infty,\ |e^{itx} - z| < \eta_1\}$.
Every real number is within 2nt -~ of the mid-point of an interval in V(t,z,//1) and each interval has length >= 2//lt -1 or > 2n. Divide Y into disjoint closed subsets (with mutual distances > p > 0) on each of which h has oscillation /1. Choose a number 2(//)-1 and further divide Y according to the co-ordinates e l , ' " , ~., where n depends on 2(//) by the inequality r. > 2(//) -1 > r.+l. The distances between the distinct portions, say Ej (1 < j < s), are at least p, or r.(1 -//2), on account of (1) and (3). More exactly, suppose r t ~ ,oo= le,.., and ~moo= lemr,, agree up to ej with 1 < j < n. Then
I
OO
O0
E---
,gmT"m
ra=l
m=l
~
~O
2r j+ 1
>=
2r.(2 "-J-x - 2 n - J - 2
--
2r.(1
- 2/12) __> rn(p
j+2
.....
n+l
1 - 2//2)
- / / 2 ) , f o r 1 > 3//2.
By choosing n and // appropriately we make the distances > ½1".. The length I ,1 =<2r.+, < 2//2(//) -1. Choose any number zj in h(Ej) and any number wj smaller than 2n2(//) such that wj + ~b(Ej) contains the mid-point of an interval of VQ.(//), zj ~h)) (see (4)); the length of the interval > 2//~2(//) -1 (for large 1(//)). Because the length II ll when ~ and // are properly chosen ff(g~) + wj ~V(2(//), zj, rh). This means that I expi,t(//) (~ + wj) - zjI < nx on Ej
(1 =<j < s),
and it is already true that I h - z~l /~ on ej (1 <=j<=s). Now the distance between the Ej's is at least ½r~; Therefore there is a function ~ in C p such that ¢=wjonE~(l<j<s) and
I1 11 -<
+ 2"r~-P)max[ wJ[ < B2P+*rZ*2n2(//)'-x =< 4n//B by (3).
Then $|\exp i\lambda(\eta)(\phi + \psi) - h| < 2\eta_1$ and the proof is complete.
REMARKS. Let L be the Lebesgue function of Y [1, I], so that dL is the "natural" probability carried by Y. Then $L(x + \varepsilon) \le L(x) + 2^{-m(\varepsilon)}$, whenever $r_{m(\varepsilon)} \ge \varepsilon$, so that the modulus of continuity $\omega_L(\varepsilon) \le \varepsilon \cdot (2^{-m(\varepsilon)}/\varepsilon)$. Now let p = 1 and observe that (1) and (2) can be attained with $m \log 2 + \log r_m$ converging to $-\infty$ as slowly as we please. But this means that $2^{-m(\varepsilon)}/\varepsilon$ can converge to $+\infty$ as slowly as we please. Thus Y can have positive h-measure for any indicator h such that $\lim_{\varepsilon \to 0} h(\varepsilon)/\varepsilon = +\infty$. Now if we choose $\phi' > 0$, by the theorem proved, $\phi(Y)$ has positive h-measure. This is Theorem 1 of [3].
REFERENCES
1. J.-P. Kahane and R. Salem, Ensembles Parfaits et Séries Trigonométriques, Hermann, Paris, 1963.
2. W. Rudin, Fourier Analysis on Groups, Interscience, New York, 1962.
3. I. Wik, Some examples of sets with linear independence, Arkiv för Matematik, 5 (1965), 207-214.
UNIVERSITY OF ILLINOIS, URBANA, ILLINOIS
A GENERALIZATION OF MAHLO'S METHOD FOR OBTAINING LARGE CARDINAL NUMBERS BY
HAIM GAIFMAN ABSTRACT
Mahlo used a method by which fixed points of an enumeration of regular cardinals were employed to get a hierarchy of "large cardinals." He also employed a second method which, in a certain sense, is much stronger than the first. Here the methods are investigated and generalized and the relations between them are clarified. This stronger method turns out to be a kind of "least upper bound" to all "fixed-points operations." Possibilities of strengthening these processes in a natural way are pointed out. 0. Introduction. In his articles from 1911-1913 (cf. [1], [2] and [3]) Mahlo uses the method of taking fixed points of an enumeration of ordinals, in order to arrive at "big ordinals." Consider an increasing sequence of ordinals $\alpha_0, \alpha_1, \ldots, \alpha_\lambda, \ldots$; a fixed point would be an ordinal which is equal to its index, i.e. $\lambda = \alpha_\lambda$. Now in order that this should give us "big ordinals" it is first required that the sequence should have "large enough" gaps. For instance, if there are no gaps at all and the sequence runs through all ordinals, then every one will be a fixed point. This, however, is not enough. Consider the sequence of all cardinals (being identified with their initial ordinals), $\omega_0, \omega_1, \ldots, \omega_\lambda, \ldots$. As is well known there are lots of fixed points here. Just put $\phi(0) = \omega_0$, $\phi(n + 1) = \omega_{\phi(n)}$ and then $\bigcup_n$
bers, where $\alpha > 0$, can be stated, for $\alpha$ a limit or a non-limit ordinal, by saying that a $\pi_\alpha$-number is a fixed point in the enumeration of the $\pi_\beta$-numbers, for all $\beta$
It should be remarked here that the existence of $\pi_1$-numbers > 0 cannot be proved; indeed, it is consistent to assume that they do not exist. Moreover, for every model M of set theory (say, Zermelo-Fraenkel's), in which there are $\pi_{x+1}$-numbers > 0, where x and x + 1 are, respectively, an ordinal and its successor in M, there is a submodel N of set theory, which is transitive in M, having x as a member, in which there are $\pi_x$-numbers > 0 but not $\pi_{x+1}$-numbers > 0. The clause "> 0" is added here because according to Mahlo 0 is considered among the regular cardinals, the next one being $\omega_0$. This makes 0 a $\pi_\nu$-number for all $\nu$. If one starts from $\omega_0$ it can be omitted. The process of taking fixed points is not finished by now. One can continue and ask whether there are ordinals $\nu$ which are $\pi_\nu$-numbers. Enumerating all $\pi_\nu$-numbers: $\pi_{0,\nu}, \pi_{1,\nu}, \ldots, \pi_{\lambda,\nu}, \ldots$, since 0 is trivially $\pi_{0,\nu}$, it can be shown that if $\nu$ is a $\pi_\nu$-number then $\nu = \pi_{\nu,\nu}$. Thus, one can consider fixed points for the second index. Enumerating them, one can take fixed points, and continue likewise "indefinitely." A clarification of this "indefinitely" is one of the aims of the present work. Mahlo introduces a second kind of "big ordinals." An ordinal is called by him a $\rho_0$-number if it is a regular ordinal and every sequence of ordinals, whose limit it is, has an initial segment whose limit is a smaller regular ordinal. A $\rho_1$-number is a $\rho_0$-number having the analogous property with respect to $\rho_0$-numbers, that is, every sequence whose limit it is has an initial segment whose limit is a smaller $\rho_0$-number. In this way the $\rho_\alpha$-numbers are defined for all $\alpha$, the case where $\alpha$ is a limit ordinal being taken care of by forming the intersection of all previously obtained classes. It is shown by Mahlo that this second method is "stronger" than the first in the sense that every $\rho_0$-number is equal to some $\nu$ which is a $\pi_\nu$-number. Moreover the first $\nu$ which is a $\pi_\nu$-number is not a $\rho_0$-number.
The proof seems to indicate that one cannot get at the $\rho_0$-numbers by iterating the fixed point process. Thus every $\rho_0$-number is a fixed point in the enumeration of all $\nu$'s which are $\pi_\nu$-numbers, and this seems to hold for any "indefinite" continuation of the fixed point process. The same situation takes place if one compares the numbers obtained from the $\rho_\alpha$-numbers by fixed point methods with the $\rho_{\alpha+1}$-numbers. It is the aim of this paper to establish the exact relationship between the two processes and to show in which sense the $\rho_\alpha$-numbers are "bigger" than those obtained by fixed point methods. It turns out that the second process can be described as the "supremum" of all iterations of the fixed point methods, and is itself a kind of "generalized" fixed point operation. This generalized process when applied, say, to the operation which determines the $\rho_0$-numbers does not yield the $\rho_1$-numbers but much "larger" ordinals whose relation to the $\rho_0$-numbers is perhaps more analogous to the relation which the $\rho_0$-numbers bear to the $\pi_0$-numbers.
Finally, the way is open to even "stronger" operations. 2. Preliminaries. Cardinals are identified with their initial ordinals, and every ordinal is identified with the set of all its preceding ones. Ord = the class of all ordinals. We will consider functions whose arguments and values are sets or proper classes of ordinals; moreover we will form classes of such functions. Thus classes of classes of classes of ... to the fourth or fifth degree are used. This, however, is completely inessential and is being done for convenience only. All our functions will have the property:
$G(X \cap \alpha) = G(X) \cap \alpha$
where G is the function, X any class of ordinals and $\alpha$ any ordinal. All the operations on the functions will preserve this property. The structure will have, thus, the local property that ordinals $< \alpha$ are not affected by the question whether or not our sets of ordinals include ordinals $> \alpha$. One can, therefore, limit oneself from the beginning to an initial section consisting of all ordinals $< \alpha$ and then let $\alpha$ be arbitrarily large. In fact, the whole work can be carried out in Zermelo-Fraenkel's set theory at the cost of encumbering, somewhat, the formulation. Those who still feel unsure may imagine that all the classes involved are subsets of $\theta$, where $\theta$ is some strongly inaccessible cardinal, and Ord = $\theta$. For the sake of convenience we introduce a new symbol, "$\infty$", and make the convention that $\alpha < \infty$ for all ordinals $\alpha$. $\infty$ is not to be considered an ordinal, and symbols such as "$\alpha$", "$\beta$", ... which are used to denote ordinals never denote $\infty$.
A sequence $X = \langle X_\alpha \rangle_{\alpha < \infty}$ ($\langle X_\alpha \rangle_{\alpha < \delta}$) of classes of ordinals is decreasing if for every $\alpha < \beta$ (every $\alpha < \beta < \delta$) $X_\alpha \supseteq X_\beta$. It is continuously decreasing if for every limit ordinal $\alpha > 0$ ($\delta > \alpha > 0$) we have $X_\alpha = \bigcap_{\beta < \alpha} X_\beta$. $\infty$ (or $\delta$) is the length of the sequence. All the functions to be considered will have as arguments and values classes of ordinals. D(f) is the domain of the function f. $f \ge g$ if D(f) = D(g) and $f(X) \supseteq g(X)$ for all $X \in D(f)$. If $\{f_i\}_{i \in I}$ is a family of functions then $\bigcap_{i \in I} f_i$ is the function whose domain is $\bigcap_{i \in I} D(f_i)$ and whose values are given by: $(\bigcap_{i \in I} f_i)(X) = \bigcap_{i \in I} f_i(X)$. $F = \langle F_\alpha \rangle_{\alpha < \infty}$ ($\langle F_\alpha \rangle_{\alpha < \delta}$) is a decreasing sequence of functions if for all $\alpha < \beta$ ($\alpha < \beta < \delta$) $F_\alpha \ge F_\beta$. It is continuously decreasing if it is decreasing and $F_\alpha = \bigcap_{\beta < \alpha} F_\beta$ for all limit ordinals (all limit ordinals $< \delta$).
If X is a decreasing sequence of classes of ordinals then $X^\Delta$ is defined by:
$X^\Delta = \{\alpha : \alpha \in X_\alpha\}$, if $X = \langle X_\alpha \rangle_{\alpha < \infty}$;
$X^\Delta = \{\alpha : \alpha < \delta \text{ and } \alpha \in X_\alpha\} \cup \bigcap_{\alpha < \delta} X_\alpha$, if $X = \langle X_\alpha \rangle_{\alpha < \delta}$.
$X^\Delta$ is referred to as the diagonal of X. If $F = \langle F_\alpha \rangle_\alpha$ is a decreasing sequence of functions then $F^\Delta$ is defined by: $D(F^\Delta) = D(F_0)$, $F^\Delta(X) = Y^\Delta$, where $Y_\alpha = F_\alpha(X)$. A function f is a local thinning function if it has the properties:
$X \in D(f)$ and $Y \subseteq X$ imply $Y \in D(f)$.
$f(X) \subseteq X$ for $X \in D(f)$.
$f(X \cap \alpha) = f(X) \cap \alpha$, for all $X \in D(f)$ and all $\alpha$ (this is equivalent to: for all $\beta$, $\beta \in f(X)$ iff $\beta \in f(X \cap (\beta + 1))$).
We will abbreviate and speak about an LTF or say that f is an LTF, $f \in$ LTF. f is said to be monotone if $X \subseteq Y$ implies $f(X) \subseteq f(Y)$, for all $X, Y \in D(f)$. (Not every LTF is monotone. Consider, for example, the function which, for every X, has as a value the set of all members of X which are not limit points of X.) The composition of two functions, $f \circ g$, is defined by: $D(f \circ g) = \{X : X \in D(g),\ g(X) \in D(f)\}$, $(f \circ g)(X) = f(g(X))$.
PROPOSITION 1. (i) If $f, g \in$ LTF and D(f) = D(g) then $f \circ g \in$ LTF. (ii) $\{f_i\}_{i \in I} \subseteq$ LTF implies $\bigcap_{i \in I} f_i \in$ LTF. (iii) If F is a decreasing sequence of members of LTF then $F^\Delta \in$ LTF. (iv) The statements (i)-(iii) are true if LTF is replaced by the subclass of all the monotone functions in LTF.
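The defining properties of a local thinning function can be checked by brute force once everything is cut down to finite ordinals. The sketch below is only an illustration under that assumption: the domain is taken to be all subsets of {0, ..., N-1}, and the functions `drop_first`, `keep_even` and `keep_max` are hypothetical examples, not functions from the paper. It checks parts (i) and (ii) of Proposition 1 on these samples.

```python
from itertools import combinations

N = 6  # assumption: only the finite ordinals 0, 1, ..., N-1 are considered


def subsets(n):
    """All subsets of {0, ..., n-1} as frozensets."""
    return [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]


def below(a):
    """The ordinal a, identified with the set of its predecessors."""
    return frozenset(range(a))


def is_ltf(f, n=N):
    """Check f(X) <= X and f(X & a) == f(X) & a for all X and all a <= n."""
    for X in subsets(n):
        fx = f(X)
        if not fx <= X:
            return False
        for a in range(n + 1):
            if f(X & below(a)) != fx & below(a):
                return False
    return True


def drop_first(X):
    """Delete the least member of X."""
    return frozenset(sorted(X)[1:])


def keep_even(X):
    """Keep the even members of X."""
    return frozenset(x for x in X if x % 2 == 0)


def keep_max(X):
    """Keep only the greatest member of X; this one is NOT local."""
    return frozenset([max(X)]) if X else frozenset()


def compose(f, g):
    """The composition f o g."""
    return lambda X: f(g(X))


def meet(*fs):
    """Pointwise intersection of one or more functions."""
    return lambda X: frozenset.intersection(*(f(X) for f in fs))


assert is_ltf(drop_first) and is_ltf(keep_even)
assert is_ltf(compose(drop_first, keep_even))  # Proposition 1 (i)
assert is_ltf(meet(drop_first, keep_even))     # Proposition 1 (ii)
assert not is_ltf(keep_max)                    # depends on larger ordinals
```

`keep_max` fails because its value on $X \cap \alpha$ depends on which ordinals above $\alpha$ belong to X, which is exactly what the locality condition rules out.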
The proof is straightforward. Let I be the identity function (I(X) = X, for all X). $f^\alpha$ is defined by transfinite induction as follows: $f^0 = I$ restricted to D(f);
$f^{\alpha+1} = f \circ f^\alpha$; $f^\alpha = \bigcap_{\beta < \alpha} f^\beta$, if $\alpha$ is a limit ordinal.
Let $f \in$ LTF, let F be the sequence $\langle f^\alpha \rangle_{\alpha \in \mathrm{Ord}}$; then $f^\Delta$ is defined to be $F^\Delta$. This means that $f^\Delta(X) = \{\alpha : \alpha \in f^\alpha(X)\}$.(*)
(*) The definitions of $f^\alpha$ and $f^\Delta$ are due to D. Scott.
Let X be any class of ordinals. Let $\alpha_0, \alpha_1, \ldots, \alpha_\lambda, \ldots$ be the enumeration of X according to the natural order. Then $\mathrm{th}(\lambda, X)$ is the $\lambda$th ordinal of X, which is
$\alpha_\lambda$, if there is such, or $\infty$ if there is no such ordinal. fp(X) is defined as $\{\lambda : \lambda = \alpha_\lambda\}$. fp(X) is the class of the fixed points of X. We define q(X) as $\{\alpha_1, \alpha_2, \ldots\}$, i.e., the class obtained by deleting the first member of X, and put q(0) = 0. From Proposition 1 it follows that if $f \in$ LTF then also $f^\alpha \in$ LTF, for all $\alpha$, and $f^\Delta \in$ LTF. The same holds for monotone LTF's. It is easily seen that:
$\mathrm{fp} = q^\Delta$. Since q is, obviously, a monotone LTF, so is fp. The following is also easily verified. PROPOSITION 2. (i) $X \supseteq Y$ implies $\mathrm{th}(\lambda, X) \le \mathrm{th}(\lambda, Y)$. (ii) If $\mathrm{th}(\mu, X) < \infty$ then $\mathrm{th}(\nu, X)$ is strictly increasing as a function of $\nu$, where $\nu$ ... $\alpha$ then $\alpha, \mathrm{th}(\alpha, X) \notin \mathrm{fp}(X)$. 3. The $\pi$-numbers and the $\rho$-numbers. Rg = class of all regular ordinals. An ordinal $\alpha$ is regular if $\alpha$ is a limit ordinal > 0 and $\alpha \ne \bigcup_{\beta < \gamma} \alpha_\beta$ whenever $\gamma < \alpha$ and $\alpha_\beta < \alpha$ for all $\beta < \gamma$. The members of $\mathrm{fp}^\alpha(\mathrm{Rg} \cup \{0\})$ are what Mahlo [1] calls the $\pi_\alpha$-numbers.
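The identity $\mathrm{fp} = q^\Delta$ stated above can be tested on finite sets of natural numbers. The following is a minimal sketch under a loud assumption: only finite ordinals are used, so the limit-ordinal clause in the definition of $f^\alpha$ is never exercised. The function names mirror the text's th, q and fp; `power` and `diagonal` are hypothetical helper names.

```python
from itertools import combinations

INF = float("inf")  # stands in for the symbol "infinity" of the text


def th(lam, X):
    """The lam-th member of X in the natural order, or INF if there is none."""
    xs = sorted(X)
    return xs[lam] if lam < len(xs) else INF


def q(X):
    """Delete the first member of X; q of the empty set is the empty set."""
    return frozenset(sorted(X)[1:])


def fp(X):
    """Fixed points of the enumeration of X: all lam with lam = th(lam, X)."""
    return frozenset(lam for lam in X if th(lam, X) == lam)


def power(f, a):
    """f^a for a finite ordinal a (f^0 is the identity)."""
    def g(X):
        Y = frozenset(X)
        for _ in range(a):
            Y = f(Y)
        return Y
    return g


def diagonal(f, n):
    """f^Delta on subsets of {0, ..., n-1}: all a with a in f^a(X)."""
    return lambda X: frozenset(a for a in range(n) if a in power(f, a)(X))


N = 6  # assumption: work inside the finite ordinal 6
q_diag = diagonal(q, N)
for r in range(N + 1):
    for c in combinations(range(N), r):
        X = frozenset(c)
        assert q_diag(X) == fp(X)
```

In this finite setting a number $\lambda$ is a fixed point of X exactly when every number up to and including $\lambda$ belongs to X, which is why both sides reduce to the longest initial segment contained in X.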
$\pi_{\gamma,\nu}(X) =_{Df} \mathrm{th}(\gamma, \mathrm{fp}^\nu(X))$. The following proposition sums up the properties of the doubly-indexed array
n~,.(x). PROPOSITION 3. (i) I f It > 0 then ct = nx,u(X) for some 4, iff for all v < # = n~,~(X).
(ii) nz,(X) as a function of v is non-decreasing. (iii) I f It > v and na,~(X) < oo then the following three conditions are equivalent na,z(X) > n~,~(X), nz,~,(X) ¢ ft g`+t(X), 4 6fl ?+ X(X). (iv) I f It > 4 and 4 + 1 • X then n~,,x(X) ~ na.~(X ). (v) Assume that # > 4 and that either n~,~(X) < oo or n~,x(X) < oo. Then n~,~,(X) = n~,,x(X) iff # = n~,~,(X); each side implies that 4 is the least ordinal which is not in X. Proof. (i) is straightforward. (ii) follows from the fact that fp~(X) is decreasing with v, and from proposition
2(i). (iii) Each of the conditions 4~f1¢'+l(X), n a , ~ f p " + l ( X ) i s equivalent to 4 = n~,,(X). Now, zr~,(X) >_ n~.~(X) > 4. Hence 4 = n~,j,(X) implies zc~,,(X) = na,~(X). Consequently nx,,(X)> rcx.,(X) implies each of the other two con-
ditions. On the other hand n~,~,(X)= rcx,,(X) means, since 7rx,~(X)< 0% that na,~(X) ~ fp"(X), hence rcx,~(X)~fp v+l(X), implying 2 = rcx,v(X) = rca,~(X). (iv) If na,u(X) = ~ it is obvious. Otherwise let V be the first ordinal which is not in X. Then ~ < 2 and hence rc~,u(X) < ~ . Obviously ~ CfpU+1(X).Hence, from (ii) and (iii) it follows that rcr,,(X) is strictly increasing as a function of v, for all v ~. Therefore 7rr,~(X)> tt, Hence nx,~(X)> p. Put ~ = 7ra,u(X). Now ~efp~(X)~_fpX+~(X), therefore ~=zcg, x(X). Since ~ > # we have rcu,x(X) =< ~ , ~ ( x ) = ~ = ~ , ~ ( x ) . (v) If 7ra,~(X) = =u,a(X) < m, then, putting ~ = rca,~(X), it follows by (i) that = n=,a(X). Hence ~ = # and we get # = 7ra,u(X). On the other side, if/~ = =a,~(X), then, again by (i), # = rcu,a(X) and therefore 7ra,u(X)= 7ru,a(X). It is clear that if 2 + 1 ___X then rca,,(X ) = 2, for all a. Hence, if rc;,,u(X) =/~ > 2 we have an ordinal < 2 which is not in X. Let ~ be the smallest one, By (iv) we have: ~,~(X) < 7rr,u(X). Since # < rc~,,~(X)one gets: # < rc~,~(X) < rc~,~(X) < 7~a,u(X) = ~t. Hence rcr,a(X) = 7~a,~(X) < m and, consequently, y = 2. The class of fixed points for the second index is F'(X) where F = (fpV),<~o. We define L(X), the class of all limit points of X, as the class of all ordinal which are of the form Oa<~,a, where ~a is a strictly increasing sequence of members of X and # a limit ordinal > 0. Thus 0 ~ L(X). PROPOSITION 4. (MAlqLO). If X -~ Rg then fp(X) = X A L(X). Proof. If ~ = th(cq X), then X n ~ forms a strictly increasing sequence of type ~, ~ being a limit ordinal. The union of this sequence is therefore > ~, on the other hand it is < ~. Hence ~eL(X). If ~ e L(X) then ~ = U ~ < ~ where ~ is a limit ordinal > 0 and ~ a strictly increasing sequence of members of X. Since ~ is regular we have ~ = ~. Put Y = {~: 2 < 7} u {~}. Then ~ = th(~, Y). Since Y ~ X it follows that if~ = th(fl, X) then fl >-_~. But we must have fl < ~. Hence ~ = th(~, X). 
Note that the argument of the first half of the proof shows that, for all
X, $\mathrm{fp}(X) \cap L(\mathrm{Ord}) \subseteq L(X)$. We define X to be closed in Y if $L(X) \cap Y \subseteq X$. If $X \subseteq Y$ this actually means that X is a subclass of Y which is closed in the order-induced topology; in general it means that X is a closed subclass of $X \cup Y$. A function f is closed in Y if, for every $X \in D(f)$, if X is closed in Y so is f(X). It is easily seen that if $X_i$ is closed in Y for all $i \in I$ so is $\bigcap_{i \in I} X_i$. Consequently if $f_i$ is a function which is closed in Y for all $i \in I$ so is $\bigcap_{i \in I} f_i$. If f and g are closed in Y so is $f \circ g$. PROPOSITION 5. (i) If X is a continuously decreasing sequence of classes of ordinals, each of which is closed in Y, then $X^\Delta$ is closed in Y. (ii) If F is a continuously decreasing sequence of functions, each closed in Y, then $F^\Delta$ is closed in Y.
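Proof. Assume with no loss of generality that $X = \langle X_\alpha \rangle_{\alpha < \infty}$, because, if $X = \langle X_\alpha \rangle_{\alpha < \delta}$, we can replace it by $X^*$, where $X^*_\alpha = X_\alpha$ for $\alpha < \delta$ and $X^*_\alpha = \bigcap_{\beta < \delta} X_\beta$ for $\alpha \ge \delta$. Then $X^\Delta = (X^*)^\Delta$ and each $X^*_\alpha$ is closed in Y. Let $\alpha \in L(X^\Delta) \cap Y$. Then $\alpha = \bigcup_{\gamma < \delta} \alpha_\gamma$, where $\langle \alpha_\gamma \rangle_{\gamma < \delta}$ is a strictly increasing sequence of members of $X^\Delta$ and $\delta$ is a limit ordinal, and $\alpha_\gamma \in X_{\alpha_\gamma}$, for all $\gamma < \delta$. Hence $\alpha_\beta \in X_{\alpha_\gamma}$, for all $\gamma < \beta < \delta$. Consequently $\alpha \in L(X_{\alpha_\gamma})$ for all $\gamma < \delta$, and, since $X_{\alpha_\gamma}$ is closed in Y, it follows that $\alpha \in X_{\alpha_\gamma}$ for all $\gamma < \delta$. Therefore $\alpha \in \bigcap_{\gamma < \delta} X_{\alpha_\gamma}$, but since X is continuously decreasing this intersection is $X_\alpha$. Thus, $\alpha \in X_\alpha$, i.e. $\alpha \in X^\Delta$. (ii) follows from (i).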
PROPOSITION 6. If $X \subseteq \mathrm{Rg}$ and X is closed in Rg then $\mathrm{fp}(X \cup L(X)) \cap \mathrm{Rg} = \mathrm{fp}(X)$.
The proof follows the usual considerations. The function $\mathrm{fp}(X \cup L(X)) \cap \mathrm{In}$, where In is the class of inaccessible cardinals, was used by Lévy in [4]. This amounts exactly to fp(X) for closed subclasses of In. DEFINITION. h(X) is the class of all limit ordinals $\alpha \in X$ such that, for every $Y \subseteq \alpha$, if $\alpha \in L(Y)$ then $L(Y) \cap X \cap \alpha \ne 0$. The function h, applied to Rg and iterated, was first used by Mahlo. If $\alpha \in h(X)$ then $\alpha \in L(X)$, because otherwise, for some $\beta < \alpha$, we will have $\{\xi : \beta < \xi < \alpha\} \cap X = 0$, contradicting the requirement for $Y = \{\xi : \beta < \xi < \alpha\}$. D. Scott suggested the version: $\alpha \in h(X)$ if it is a limit ordinal and, for all $Y \subseteq \alpha$, if Y is closed in $\alpha$ and $\alpha \in L(Y)$ then $Y \cap X \ne 0$. This version is equivalent to the one given here for $X \subseteq L(\mathrm{Ord})$. In general we have $h_S(X) \supseteq h(X)$, if $h_S$ is the function as defined by Scott. Obviously h is a monotone LTF. It can be easily seen that, for $X \subseteq \mathrm{Rg}$, $h(X) \subseteq \mathrm{fp}(X)$. (Letting $\{\alpha_\lambda\}_\lambda$ be the enumeration of X, if $\alpha = \alpha_\lambda > \lambda$ consider $\beta = \bigcup_{\nu < \lambda} \alpha_\nu$ and $Y = \{\xi : \beta < \xi < \alpha\}$.) THEOREM (MAHLO). (I) If $Y \subseteq \mathrm{Rg}$ and $\pi_{\mu,\nu}(Y) > \mu, \nu$ then $\pi_{\mu,\nu}(Y) \notin h(Y)$. (II) If $Y \subseteq \mathrm{Rg}$ then the first $\alpha > 0$ such that $\alpha \in \mathrm{fp}^\Delta(Y)$ does not belong to h(Y). This follows from the generalization which we formulate and prove next. The idea of the following theorems is to indicate a sense in which the function h is "stronger" than other "decent" functions. Hereby, "stronger" roughly means that its application yields smaller classes of ordinals, whose members can, therefore, be considered as "larger." "Decent" includes, among the rest, the function fp, as well as any function $\mathrm{fp}^\alpha$, $\mathrm{fp}^\Delta$, and lots of others which are specified in this section. Theorem 1 implies that, under certain conditions, a regular $\alpha$, which is in h(Y), cannot be removed by an application of any function which arises out of a "decent function" by means of a process which involves composition of functions,
iterations of less than $\alpha$ times, and forming the diagonal of a continuously decreasing sequence of functions. Mahlo's theorem, which is implied by Theorem 1, deals with two special cases which are typical of the general state of affairs. The first case is that where $\alpha = \pi_{\mu,\nu}(Y) > \mu, \nu$. Here, applying $\mathrm{fp}^\nu$ will still leave $\alpha$, but one more application of fp (or $\mu + 1$ applications of q) will remove it. In the second case $\alpha$ is the first $\beta$ such that $\beta \in \mathrm{fp}^\Delta(Y)$. Here an application of the diagonal of
Let
~e
(x) c~L(Y) = L(X) n x .
h(Y) ARg.
(I) If Z ~ D(f) is closed in Y, then ~ e f(Z) implies ~ ef(f(Z)) (II) If 0 < fl < ~ and, for all 7 < fl, Xr e D(f), Xr is closed in Y and ~ ~f(X~) then ~ ef(n~r<~ is a continuously decreasing sequence and, for all 7 < ~, c5, we have o~f(Xr), then ~ ~f(XD).(*) (The theorem can be generalized by omitting the requirements that f is a LTF and that D(J) consists of subsets of Y. The requirements which have to be made are:
f(X) ~_ X t~ L(X) 0 Y and L(X) ~ f ( X ) 0 L(Y). In (I), (II), (III) one has to add in every place the condition that the classes in question are in D(f). The proof is the same). Proof. The ideas are the same as in the p r o o f given by Mahlo for the previous theorem. (I) In order to show ~ e f(f(Z)) it suffices to show a e L(f(Z)). Since f(Z) ~_ Z n L(Z) it is enough to show ~t ~ L(Z n L(Z)). Now ~ el(Z), hence ~ e L(Z). Since ~ e h(Y), ~ is not confinal with 090. This is easily seen to imply ct ~ L(L(Z)). Consequently, for every fl < ~, c¢e L([fl, ~) o L(Z)), where [fl,ct) = (~: fl < ~ < ~}. Since [fl, ~ ) o L(Z) is closed in ~ it follows that [fl,~)tn L ( Z ) n Y # 0. But Z is closed in Y, hence L(Z) n Y ~_ Z. Consequently, for every fl < a, [/3, ~) n Z L(Z) # O, which proves that a e L(Z ~ L(Z)) (II) By induction on/3. For fl = 1 -obvious. If/3 = 6 + 1 and the claim holds for fl = 5 then ~ f ( o ~<~X~). Putting Z t = n~<~X~ and Z2 = X~ it suffices to 2. The theorem is also true if we use Scott's definition of h. Indeed, the same proof is valid.
show that ~ ef(Z~ n Z2), from the assumptions that ct ef(Z3 i = 1, 2, and Z~ are closed subclasses of Y. Now ~ e L(Z~) o L(Z2). Hence, given fl < ~, one can construct a sequence Co < (o < ~ < (~ < "'" < ¢~ < (~ < "'" where C~e Zt,(~ e Z2, Co > fland ~,(~ < ~. Then U~ / L In general let ~o,~be the first ordinal in Zr which is > Ua U~<,C~o,~,cq,~ as the first member of Z~ which is > U~
~ , r e Z ~ and U~.~,,~ < ~t,.~<
The regularity of ,t insures that all the ,t,,r are < ct. Put ~ = U~.:~0~,,~. Then fl~eL(Z~) for every ~ < iS, and U , < ~ , = ~. Let T = (/1,: r / < ct} UL({fl,: t / < ~}). Since T is dosed in ~ and ~eL(T) we have TAY#0. But C e T implies CeL(Zr), and L(Zv) AY_~Z~, therefore T n Y__GUr<~Z r. Thus, [fl,~)nnr<~Zr#Oforeveryfl ft. Let ~t be the first member of X~o which is > ~o. In general, let ~ + t be the first member of X ~ which is > ,q, and, if 2 is a limit ordinal, let 0t~ Ur<~ctr. The regularity of implies that ~ < ~ for all ;l < ct. The set {0~a}a, where 2 ranges over all limit ordinals < c~, has ~ as its limit and is closed in ~. Therefore, for some arbitrary large limit ordinal 2 < ~, we have ~x e Y. Now, ~ = U ~ . : ~ + t and ~ + ~ e X~. Hence ~aeL(X~,) =
for all ~ < ;t. Since X,~ is closed in Ywe have ~ ~ X,~, for all 7 < A. Consequently ~nr • and Y is the sequence of length ct defined by: Yx = Xx n ( ~ + 1), for all 2 < ~, then y O n ~ = X ° o o~. (ii) I f in (i) ce is a limit ordinal and X is continuously decreasing then also y O n (~ + 1) = X ° n (~ + 1) (iii) I f F is a decreasing sequence of L T F ' s, of length > ~ and G is the initial segment of F of length or, then for all X ~ D(F°), FD(x) n o~= G°(X) N o~. (iv) I f o:is a limit ordinal and F is continuously decreasing then F°(X) N (~+ 1) = 6
(x) n
+ 1).
(iii) follows from (i) and (ii) whose proof is straightforward. If $Q'(\alpha, f)$ is the class obtained by replacing (iv) by (iv'), and if $g \in Q'(\alpha, f)$ then the restriction of g to subsets of $\alpha + 1$ is in $Q(\alpha, f)$. This is shown by proving, with the help of Proposition 8, that the class of all LTF's whose restrictions are in $Q(\alpha, f)$ satisfies the conditions of $Q'(\alpha, f)$. PROPOSITION 8. (i) $Q(\alpha, f) \subseteq$ LTF and all its members have the same domain. (ii) $h, g \in Q(\alpha, f)$ implies $h \circ g \in Q(\alpha, f)$. Proof. (i) is obvious. (ii) is proved by showing that, for any given $g \in Q(\alpha, f)$,
the class of all functions h which satisfy $h \circ g \in Q(\alpha, f)$ satisfies the conditions of the definition of $Q(\alpha, f)$. DEFINITION. $J_{\alpha,f}(X) = \bigcap \{g(X) : g \in Q(\alpha, f)\}$, $X \in D(f)$. $f^\nabla(X) = \{\alpha : \alpha \in J_{\alpha,f}(X)\}$. Thus $f^\nabla = J_f^\Delta$ where $J_f = \langle J_{\alpha,f} \rangle_{\alpha \in \mathrm{Ord}}$. The remark following Proposition 8 implies that $f^\nabla$ would have been the same if, in the definition of $Q(\alpha, f)$, (iv) is replaced by (iv'). The operation $f \to f^\nabla$ can be described as the "supremum" of all the iterated diagonal operations. To see how "strong" $f^\nabla$ is, let $\alpha$ be any ordinal > 0. Let f be an LTF and let g be its restriction to subsets of $\alpha + 1$. We will have $g^\beta \in Q(\alpha, g)$ for all $\beta < \alpha$. Hence, since the domain of g is limited to subsets of $\alpha + 1$ and $\langle g^\beta \rangle_{\beta < \alpha}$ is continuously decreasing, we have $g^\Delta \in Q(\alpha, g)$. Going on, this implies that $(g^\Delta)^\Delta \in Q(\alpha, g)$ and so on. If $g^{\Delta\beta}$ is defined by: $g^{\Delta(\beta+1)} = (g^{\Delta\beta})^\Delta \cap g^{\Delta\beta}$ and $g^{\Delta\lambda} = \bigcap_{\beta < \lambda} g^{\Delta\beta}$, for limit ordinals $\lambda$, then $g^{\Delta\beta} \in Q(\alpha, f)$ for all $\beta < \alpha$. If $g^*$ is obtained by diagonalizing over this sequence, then since the sequence is decreasing continuously, $g^* \in Q(\alpha, f)$. One can continue on to form $(g^*)^*$, ..., diagonalize over this, etc. As long as we diagonalize over sequences which are decreasing continuously we are still in $Q(\alpha, g)$. Therefore $g^\Delta, g^*, \ldots$ are all $\ge g^\nabla$. Since this is true for all $\alpha$ we have: $f^\Delta \ge f^\nabla$, $f^* \ge f^\nabla$ ... etc. $f^\nabla$ is also defined by a diagonal process. However, this is different from the diagonalizations used in defining the members of $Q(\alpha, f)$. In $Q(\alpha, f)$ we allowed for diagonalizations over continuously decreasing sequences of functions. Now $\langle J_{\alpha,f} \rangle_{\alpha \in \mathrm{Ord}}$ is, in general, not continuously decreasing. Take for example f = q. It is not difficult to show that $q^\nabla(\mathrm{Ord}) = \mathrm{Rg} - \{\omega_0\}$ and $\mathrm{Rg} - \{\omega_0\}$ is not closed in Ord. On the other hand q is closed, and hence, using Proposition 4, it follows that every member of $Q(\alpha, q)$ is closed. Consequently the $J_{\alpha,q}$'s form a sequence of closed functions. If it were continuously decreasing then $q^\nabla$ would have been closed, which it is not.
All the diagonal operations on continuously decreasing sequences preserve the property of being closed. The fact that $q^\nabla$ is not so indicates that here we made indeed a jump. THEOREM 2. (I) If $f \in$ LTF is closed in Y, $Y \in D(f)$, and, for every $X \subseteq Y$, $f(X) \cap L(Y) = L(X) \cap X$, then $h(Y) \cap \mathrm{Rg} \subseteq f^\nabla(Y)$. (II) For every $Y \subseteq \mathrm{Rg}$, $h(Y) \subseteq \mathrm{fp}^\nabla(Y)$. Proof. (I) Let $\alpha \in h(Y) \cap \mathrm{Rg}$. We claim that, for every $g \in Q(\alpha, f)$, $\alpha \in f(g(Y))$, hence a fortiori $\alpha \in g(Y)$. Put $T = \{g : g \text{ is closed in } Y \text{ and } \alpha \in f(g(Y))\}$. Since $\alpha \in L(Y) \cap Y$, we have $\alpha \in f(Y) = f(I(Y))$. Thus, $I \in T$. By Theorem 1 (I), putting Z = Y we get $\alpha \in f(f(Y))$. Hence $f \in T$. If $g \in T$ then $\alpha \in f(g(Y))$. Putting Z = g(Y) we get, by Theorem 1 (I), $\alpha \in f(f(g(Y)))$.
The other conditions in the definition of Q(ct,f) are also satisfied by T. If, for all ~ < fl, F~, is a member of T a n d fl < ct, then we deduce from Theorem 1(II), by putting X~ = F~(Y), that f")r
Proof. One shows that for every $h \in Q(\alpha, g)$ there is $h' \in Q(\alpha, f)$ such that $h' \le h$. This is done by proving that the class of functions h for which such an h' exists satisfies the conditions of $Q(\alpha, g)$. If $0 \notin X$ then $\mathrm{fp}(X) \subseteq q(X)$. Hence, by Theorem 2 we have $h(X) \subseteq q^\nabla(X)$ for every $X \subseteq \mathrm{Rg}$. Actually, equality holds. THEOREM 3. For every X, $q^\nabla(X) \subseteq h(X)$.
Proof. Assume ~6h(X) and show ot~q~(X). First let 0t be a limit ordinal > 0. Let Y ~ ~ he such that ~ L ( Y ) and X ~ L ( Y ) = 0. Enumerate YU L(Y) in the natural order: %,~1, " " , ~ , ' " , ~ < ~5. Then for limit ordinals 2 > 0 we have c~x= Ur<xTx and ~x~X. Define: fa = q'~+~, f o r / l a non limit ordinal or 0, and f~ = q~ otherwise. Thenfa ~ Q(ce, q) for all 2 < iS. If t~ < 0tthen q "= 0 x<~fa e Q(a, q) hence q~+1 ~ Q(ct,q). But ct $ q~+ l(X) hence ~ $ qV (X). If t~ = ct then if 2 < ~ is a non limit ordinal or 0 we have 2 < ~tx < ~x + t, hence ;t 6fa(X). Otherwise 2 6 X and a f o r t i o r i 2$fa(X). Consequently F ° ( X ) ~ = 0 where F = (fa)a<~. This implies ct ¢ q(F°(X)). Since q ~ F ° ~ Q(~, q) it follows that a ~ q V(X). If ~t = 0 the claim is obvious. If ct is a non-limit ordinal > 0, let ~ = fl + k, where fl is a limit ordinal. Then q#= f')~<#q~ ~ Q(ct, q). Hence, applying (i) of the definition k + 1 times, we have qa+k+~ Q(~,f). But ~ 6 q~+ l(X). q.e.d. Thus for X ~ Rg we have:
$h(X) = \mathrm{fp}^\nabla(X) = q^\nabla(X)$. It seems that the "jump" from q to h amounts to the $\nabla$ operation. Thus, if one wishes to make such another jump, the natural way would be not to iterate h but to take $h^\nabla$. One can of course start iterating the $\nabla$ operation, take fixed points, etc. In this way one gets a new operation, $\nabla^*$, which is defined similarly to $\nabla$. The difference is that in this definition $Q^*(\alpha, f)$ is required also to be closed under $\nabla$, i.e. $g \in Q^*(\alpha, f)$ should imply $g^\nabla \in Q^*(\alpha, f)$. One can still continue on in this direction, getting stronger and stronger operations and adding them to the set of operations which is used to define $Q(\alpha, f)$, but here we prefer to stop.
REFERENCES
1. P. Mahlo, Über lineare transfinite Mengen, Berichte über die Verhandlungen der Königlich Sächsischen Gesellschaft der Wissenschaften zu Leipzig, Mathematisch-Physische Klasse, 63 (1911), 187-225.
2. P. Mahlo, Zur Theorie und Anwendung der $\rho_0$-Zahlen, ibid. 64 (1912), 108-112.
3. P. Mahlo, Zur Theorie und Anwendung der $\rho_0$-Zahlen, ibid. 65 (1913), 268-282.
4. A. Lévy, Axiom schemata of strong infinity in axiomatic set theory, Pacific Journal of Mathematics, 10 (1960), 223-238.
HEBREW UNIVERSITY OF JERUSALEM
SETS WITH UNIQUE FARTHEST POINTS BY
EDGAR ASPLUND ABSTRACT
If a certain set in $R^n$ has the property that, in some unsymmetric norm, each point in $R^n$ has a unique farthest point in this set, then it consists of exactly one point. Let there be given a finite dimensional real vector space with a norm that satisfies all usual conditions except that it need not be symmetric, only homogeneous for multiplication with positive scalars. Suppose that, in the (unsymmetric) metric defined by this norm, a certain set has the property that each point in the space has a unique farthest point in this set; then we say that the set has unique farthest points. We shall prove that the sets with unique farthest points are exactly the one point subsets of the space. Since every one point set is a set with unique farthest points, we have only to prove the converse. This problem was solved by V. L. Klee [3] for several types of special norms in finite dimensional space and also for some infinite dimensional cases under certain extra a priori conditions on the set with unique farthest points. The coveted goal is to solve the problem for an infinite dimensional Hilbert space (cf. Klee, loc. cit.). No counterexample is so far known in any normed real linear vector space. Here we give a proof of the most general finite dimensional case. We will use the apparatus of "convexity calculus" developed by Brøndsted, Moreau, Rockafellar and others. In the finite dimensional case the notions are essentially due to Fenchel [2], except for that of subdifferential, which has appeared later. Furthermore, we will use a result of the author [1], which we refer to as the generalized Straszewicz theorem. In the final section we give a more detailed discussion of the relation of our results to Klee's, and also a proof for the infinite dimensional space $c_0(W)$. 1. Conjugate convex functions and subdifferentials. By a lower semicontinuous proper convex function we mean a function f on $R^n$, with values in
$R \cup \{+\infty\}$ and not identically $+\infty$, such that
$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y)$ for all $x, y$ in $R^n$ and $0 < \lambda \le 1$,
$f(x) = \liminf_{y \to x} f(y)$ for all $x$ in $R^n$.
Received September 30, 1963; revised version, February 8, 1967.
For brevity, we will say simply convex function instead of lower semicontinuous proper convex function. Then each convex function $f$ has a conjugate convex function $f^*$, defined by
(1.1) $f^*(x) = \sup \{ \langle x, y \rangle - f(y) : y \in R^n \}$
and $(f^*)^* = f$. The set of points in $R^n$ where $f$ has a finite value is called the effective domain of $f$ and denoted by $\operatorname{dom} f = \{x : x \in R^n, f(x) < \infty\}$. The subdifferential $\partial f$ of $f$ is a set valued function defined for each $x$ in $\operatorname{dom} f$ by
(1.2) $\partial f(x) = \{ y : f(z) \ge f(x) + \langle y, z - x \rangle \text{ for all } z \in R^n \}$
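Definitions (1.1) and (1.2) can be checked numerically in one dimension. The following is a minimal sketch, assuming the example $f(x) = |x|$ on $R$ and a finite sample grid in place of the supremum over all of $R$; the grid and the helper names `conjugate` and `is_subgradient` are illustrative and do not come from the paper.

```python
# Numerical sketch of (1.1) and (1.2) for f(x) = |x| on the real line.
# The grid and the helper names are illustrative assumptions.

def f(x):
    return abs(x)

# Sample points in [-10, 10] with step 0.5 (exactly representable floats).
GRID = [k / 2 for k in range(-20, 21)]

def conjugate(x):
    """Approximate f*(x) = sup { x*y - f(y) } by a maximum over GRID."""
    return max(x * y - f(y) for y in GRID)

def is_subgradient(y, x, tol=1e-12):
    """Check f(z) >= f(x) + y*(z - x) at every sample point z, as in (1.2)."""
    return all(f(z) >= f(x) + y * (z - x) - tol for z in GRID)

# Inside [-1, 1] the conjugate is 0; outside, the grid maximum grows with the
# grid radius, reflecting that the true value of f* there is +infinity.
print(conjugate(0.5), conjugate(2.0))
# At x = 0 the subgradients of |x| are exactly the slopes in [-1, 1].
print(is_subgradient(0.5, 0.0), is_subgradient(1.5, 0.0))
```

On this example the effective domain of $f^*$ is the interval $[-1, 1]$, which is the one-dimensional instance of the polar body that appears later in the proof.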
Relation (1.2) says that in the "graph space" $R^n \times R = \{(x, a) : x \in R^n, a \in R\}$ the set
(1.3) $\{(z, a) : a = f(x) + \langle y, z - x \rangle\}$
is a supporting hyperplane to the epigraph $\operatorname{gr} f$ of $f$, defined by $\operatorname{gr} f = \{(z, a) : z \in \operatorname{dom} f, a \ge f(z)\}$. Actually, $\operatorname{gr} f$ is a closed convex subset of the graph space, and each "non-vertical" supporting hyperplane (a hyperplane in $R^n \times R$ would be called "vertical" if its projection onto $R^n$ is not all of $R^n$) of $\operatorname{gr} f$ can be written as (1.3) for some $x$ in $\operatorname{dom} f$ and $y$ in $\partial f(x)$. From (1.1) and (1.2) we find that
(1.4) $\partial f(x) = \{ y : f^*(y) = \langle y, x \rangle - f(x) \}$
and from the symmetry of (1.4) it is clear that
$y \in \partial f(x) \iff x \in \partial f^*(y) \iff f(x) + f^*(y) = \langle y, x \rangle$.
Relation (1.4) can also be interpreted as stating: $\partial f(x)$ is the projection onto $R^n$ of the intersection of $\operatorname{gr} f^*$ and its supporting hyperplane $\{(z, a) : a = \langle z, x \rangle - f(x) = f^*(y) + \langle z - y, x \rangle\}$, cf. (1.3). All statements in this section that are not completely evident are proved in Fenchel [2].

2. The generalized Straszewicz theorem. The well known Straszewicz theorem says that in a compact convex set $C$ in $R^n$, the set of all exposed points is dense in the set of all extreme points, a point of $C$ being called exposed if
there is a hyperplane $H$ such that $H \cap C$ contains this point alone. We say that a subset of $C$ is a face of $C$ if it is the intersection of $C$ and some supporting hyperplane. Then a point is exposed if the smallest dimension of any face containing it is 0. To state a generalization of the Straszewicz theorem we define a point (in the boundary of $C$) to be $k$-exposed if the smallest dimension of any face containing it is at most $k$, and to be $k$-extreme if the largest dimension of any simplex contained in $C$, of which the point is the barycenter, is at most $k$. The following theorem is proved in [1]. For $k = 0$ it is the old Straszewicz theorem.

THEOREM 1 (Generalized Straszewicz theorem). Each $k$-extreme point of the boundary of a compact convex set is the limit of $k$-exposed points.

3. Restatements of the farthest point problem. We assume that we have in $R^n$ an "unsymmetric norm", i.e. a function $x \mapsto \|x\| : R^n \to R^+$ satisfying
(3.1)
$\|\lambda x\| = \lambda \|x\|$ for $\lambda > 0$
and
(3.2) $B = \{x : \|x\| \le 1\}$ is a bounded convex neighborhood of 0 in $R^n$.
In general, however, $\|x\| \ne \|-x\|$, so one has to be careful with signs.
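A concrete unsymmetric norm may help fix ideas. The sketch below assumes a polyhedral example on $R^2$, the maximum of four linear functionals chosen here for illustration (they do not come from the paper), whose unit ball is the rectangle $-1/2 \le x_1 \le 1$, $|x_2| \le 1$. It checks positive homogeneity (3.1), subadditivity on one pair of vectors, and the failure of symmetry.

```python
# A polyhedral "unsymmetric norm" on R^2: the maximum of finitely many linear
# functionals.  The functionals in A are an illustrative assumption.

A = [(1.0, 0.0), (-2.0, 0.0), (0.0, 1.0), (0.0, -1.0)]

def norm(x):
    """Gauge of the body B = {x : <a, x> <= 1 for every a in A}."""
    return max(a[0] * x[0] + a[1] * x[1] for a in A)

def neg(x):
    return (-x[0], -x[1])

def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

x, y = (1.0, 0.5), (-3.0, 2.0)
print(norm(x), norm(neg(x)))                  # the two values differ
print(norm((3.0, 1.5)) == 3 * norm(x))        # homogeneity for a positive scalar
print(norm(add(x, y)) <= norm(x) + norm(y))   # triangle inequality on this pair
```

Because $\|x\|$ and $\|-x\|$ differ, the distance from $x$ to $y$ must always be read as $\|y - x\|$, in that order.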
Now let $S$ be a set that has unique farthest points with respect to this norm. In other words, given any point $x$ in $R^n$ we assume that there exists a point $q(x)$ in $S$ (the "antiprojection" of $x$ onto $S$) such that
$q(x) \ne y \in S$ implies $\|y - x\| < \|q(x) - x\|$.
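The defining property of $q$ can be explored on finite sets. The following sketch assumes a polyhedral unsymmetric norm on $R^2$ and two small sets, all chosen for illustration and not taken from the paper; it lists every farthest point of $x$ in $S$, so a list with more than one entry means that $q(x)$ is not defined.

```python
# Farthest points of x in a finite set S for an unsymmetric polyhedral norm.
# The norm and the sets are illustrative assumptions, not taken from the paper.

A = [(1.0, 0.0), (-2.0, 0.0), (0.0, 1.0), (0.0, -1.0)]

def norm(v):
    return max(a[0] * v[0] + a[1] * v[1] for a in A)

def farthest_points(x, S):
    """All y in S maximizing the unsymmetric distance ||y - x||."""
    dist = {y: norm((y[0] - x[0], y[1] - x[1])) for y in S}
    g = max(dist.values())
    return [y for y in S if dist[y] == g]

S_two = [(0.0, 0.0), (3.0, 0.0)]
S_one = [(3.0, 0.0)]

print(farthest_points((0.0, 0.0), S_two))   # a single farthest point here
print(farthest_points((1.0, 0.0), S_two))   # a tie: both points are farthest
print(farthest_points((1.0, 0.0), S_one))   # a one point set never ties
```

The tie at $x = (1, 0)$ shows that this particular two point set does not have unique farthest points, which is what Theorem 2 below asserts for every set with more than one point.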
Also, we will let $s$ denote one point in $S$, fixed once and for all. The theorem that we have set out to prove is the following.

THEOREM 2. The set $S$ consists of the point $s$ alone.

The proof will be indirect, assuming the counterhypothesis to Theorem 2 in the following version
(CH) $q(x) \ne x$ for all $x$ in $R^n$.
If (CH) is true then Theorem 2 must be false, since $q(q(x)) \ne q(x)$ and both are in $S$, whereas if Theorem 2 is true then (CH) is false because $q(s) = s$. Hence another way of stating Theorem 2 is: the antiprojection onto $S$ has a fixpoint. It then follows immediately that the antiprojection is a constant.

Another reformulation of Theorem 2, which will not be used in the proof, is the following. Denote by $B(x, r)$ the (unsymmetric) ball with center $x$ and radius $r$:
$B(x, r) = \{y : \|y - x\| \le r\}$.
Then $S \subset B(x, \|q(x) - x\|)$ and $S \cap \operatorname{bd} B(x, \|q(x) - x\|) \ne \emptyset$ together define the number $\|q(x) - x\|$ in a way which would make sense for any bounded set $S$ even if it did not have unique farthest points, whereas "for each $x$ in $R^n$,
$S \cap \operatorname{bd} B(x, \|q(x) - x\|)$ has exactly one element" expresses this unique farthest point property. Now $\{B(x, r) : x \in R^n, r \ge 0\}$ is exactly the family of all (positive) homotheties of the unit ball $B$. Suppose that $\mathcal{C}$ were any family of closed convex sets. Define $S$ to be strictly convex with respect to $\mathcal{C}$ if $S \subset C$ for some $C \in \mathcal{C}$, and if $C \in \mathcal{C}$, $S \subset C$ and $S \cap \operatorname{bd} C \ne \emptyset$ imply that $S \cap \operatorname{bd} C$ has exactly one element. Since $B$ could have been a translate of any given compact convex body, we have the following reformulation of Theorem 2.

THEOREM 2'. If the set $S$ is strictly convex with respect to any family consisting of all positive homotheties of some fixed compact convex body, then $S$ consists of exactly one point.

4. Proof of Theorem 2.
We introduce the real-valued function $g$ defined by
$g(x) = \|q(x) - x\| = \sup \{ \|y - x\| : y \in S \}$.
The function $g$ is the supremum of the elementary functions $x \mapsto \|y - x\|$ that are clearly convex, so $g$ itself is convex in the sense of Section 1. It also satisfies the following Lipschitz condition
(4.1) $-\|x - y\| \le g(x) - g(y) \le \|y - x\|$ for all $x, y \in R^n$
with equality on either side only if $q(x) = q(y)$. We will denote the polar body of the unit ball $B$ by $B^\circ$:
$B^\circ = \{x : \langle x, y \rangle \le 1 \text{ for all } y \in B\} = \{x : \langle x, y \rangle \le \|y\| \text{ for all } y \in R^n\}$.
We will also define two set-valued mappings $D$ and $D^*$ by
$D(x) = \{y : \langle y, x \rangle = 1, y \in B^\circ\}$ for $x \in \operatorname{bd} B$ (i.e., $\|x\| = 1$)
and
$D^*(x) = \{y : \langle x, y \rangle = 1, y \in B\}$ for $x \in \operatorname{bd} B^\circ$.
These are sometimes called the spherical mappings or duality mappings between the boundaries of the unit ball and its polar body. Obviously, $D(x) \subset \operatorname{bd} B^\circ$ for each $x$ in $\operatorname{bd} B$ and $D^*(x) \subset \operatorname{bd} B$ for each $x$ in $\operatorname{bd} B^\circ$. We assume (CH), so we may define the function $b$, with values in $\operatorname{bd} B$, by
$b(x) = \dfrac{q(x) - x}{\|q(x) - x\|}$ for all $x$ in $R^n$.
Now we will prove the following lemma
LEMMA 1. $-D(b(x)) \subset \partial g(x)$.

Proof. Take $y$ in $-D(b(x))$. This means that
$\langle -y, q(x) - x \rangle = \|q(x) - x\|$, $\langle -y, u \rangle \le \|u\|$ for all $u$ in $R^n$.
Now compute
$g(x) = \|q(x) - x\| = \langle -y, q(x) - x \rangle = \langle -y, (q(x) - z) + (z - x) \rangle \le \|q(x) - z\| - \langle y, z - x \rangle \le g(z) - \langle y, z - x \rangle$.
Hence
$g(z) \ge g(x) + \langle y, z - x \rangle$ for all $z$ in $R^n$
so that $y$ is also in $\partial g(x)$, as claimed.

We go on to interpret $\partial g(x)$ as the projection onto $R^n$ of the intersection of the epigraph of $g^*$,
$\operatorname{gr} g^* = \{(z, a) : a \ge g^*(z), z \in \operatorname{dom} g^*\}$
and its supporting hyperplane $\{(z, a) : a = \langle z, x \rangle - g(x)\}$. In other words, $\partial g(x)$ is the projection onto $R^n$ of a face of $\operatorname{gr} g^*$, in the sense of Section 2. We want of course to apply the generalized Straszewicz theorem to $\operatorname{gr} g^*$. However, this is an unbounded set, so we must find out how to truncate it in a good way. The following inequalities hold for $g$:
$\|s - x\| \le g(x) \le g(0) + \|-x\|$ for all $x$ in $R^n$.
From them, we derive the corresponding inequalities for $g^*$:
$-g(0) \le g^*(x) \le \langle x, s \rangle \le \|-s\|$ for $x$ in $-B^\circ$, and $g^*(x) = +\infty$ otherwise.
In other words, $\operatorname{dom} g^* = -B^\circ$, and in its effective domain, $g^*$ assumes values between $-g(0)$ and $\|-s\|$. Thus, we may consider $\operatorname{gr} g^*$ to be truncated, by adding, say, $a \le \|-s\| + 2$ to the defining relations. We will be interested in that part of the boundary of $\operatorname{gr} g^*$ which lies in the open set
$\{(z, a) : z \in \operatorname{int}(-B^\circ), a < \|-s\| + 1\}$.
For brevity, we will say simply that a boundary point of $\operatorname{gr} g^*$ is "in $\operatorname{int}(-B^\circ)$" when it lies in the above open set. As a first application of the Straszewicz theorem (for this the classical version suffices) we show that $\operatorname{gr} g^*$ has no extreme points in $\operatorname{int}(-B^\circ)$. For if it were so,
then some boundary point $(z, g^*(z))$ in $\operatorname{int}(-B^\circ)$ would have to be exposed, i.e. possible to separate from the rest of $\operatorname{gr} g^*$ by some hyperplane with gradient $x$. But then $z$ would be the only element of $\partial g(x)$, whereas we know by Lemma 1 that $\partial g(x)$ contains points in the boundary of $-B^\circ$. Hence there are no extreme points of $\operatorname{gr} g^*$ in the interior of $-B^\circ$.

Suppose now, inductively, that we have shown that $\operatorname{gr} g^*$ has no $(k - 1)$-extreme points in $\operatorname{int}(-B^\circ)$, for some $k \ge 1$, and suppose that some point of $\operatorname{gr} g^*$ in $\operatorname{int}(-B^\circ)$ is $k$-exposed. Then, for a suitable $x$,
$\partial g(x) = (-B^\circ) \cap K$
where $K$ is a $k$-dimensional affine subspace of $R^n$. By Lemma 1, the intersection of $-B^\circ$ and its supporting hyperplane $H = \{y : \langle y, b(x) \rangle = -1\}$, which is $-D(b(x))$, is contained in $\partial g(x)$. Here we need a second lemma.

LEMMA 2. If $z$ in $\partial g(x)$ is a boundary point of $-B^\circ$, then any hyperplane supporting $-B^\circ$ at $z$ intersects $-D(b(x)) = H \cap (-B^\circ)$.

Proof.
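We have by definition that
$\langle -z, u \rangle = \|u\| = 1$ for all $u$ in $D^*(-z)$.
The hyperplane defined by $u$ is the set
$H_u = \{y : \langle -y, u \rangle = 1\}$
and to prove Lemma 2 we have to show that $H \cap H_u \cap (-B^\circ)$ is nonempty. Since $z$ is in $\partial g(x)$,
$\langle z, y \rangle - g(y) \le \langle z, x \rangle - g(x)$ for all $y$ in $R^n$.
Thus, for $y = x - \lambda u$, $\lambda > 0$ we have
$g(x) - g(y) \le -\langle -z, x - y \rangle = -\|x - y\|$
so, by (4.1) we have equality, and $q(x) = q(y)$. We will now show that the midpoint of $u$ and $b(x)$ is in $\operatorname{bd} B$, i.e. $\|\tfrac{1}{2}(u + b(x))\| = 1$. Take $\lambda = \|q(x) - x\|$ above. Then $q(x) - y = \lambda (u + b(x))$ and $g(y) = g(x) + \|x - y\| = 2\lambda$, so that
$\|\tfrac{1}{2}(u + b(x))\| = \dfrac{\|q(y) - y\|}{2\lambda} = 1$.
Hence there is an element $v$ of $B^\circ$ with
$\tfrac{1}{2} \langle v, u + b(x) \rangle = 1$.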
Since $u$ and $b(x)$ are both in $B$, we have that
$\langle v, u \rangle = \langle v, b(x) \rangle = 1$.
Hence the element $-v$ is both in $H$ and $H_u$, and since it is also in $-B^\circ$, we have completed the proof of Lemma 2.

Going back to the proof of Theorem 2, we see that $H \cap K$ is a supporting hyperplane of $\partial g(x)$ in $K$. Since $\partial g(x)$ is supposed to have interior points relative to $K$, there is in $K$ a hyperplane supporting $\partial g(x)$ which is parallel to but different from $H$. This can then be extended to a hyperplane $J$ supporting $-B^\circ$ in $R^n$. But $J$ meets $H \cap (-B^\circ)$ by Lemma 2, so $J$ must contain $\partial g(x)$, which is therefore contained in the boundary of $-B^\circ$, a contradiction. We conclude that no point of $\operatorname{gr} g^*$ in $\operatorname{int}(-B^\circ)$ can be $k$-exposed. Invoking Theorem 1, we now find by induction that no point of $\operatorname{gr} g^*$ in $\operatorname{int}(-B^\circ)$ is $(n - 1)$-extreme. But that means that if $z$ is any point in the interior of $-B^\circ$ and $x$ is in $\partial g^*(z)$, then $\partial g(x)$ is all of $-B^\circ$, since if $z$ is $k$-extreme in $\partial g(x)$ then $(z, g^*(z))$ is $k$-extreme in $\operatorname{gr} g^*$, for $k \le n - 1$. As in the proof of Lemma 2 we have that $q(x) = q(y)$ for every $y = x - \lambda u$ with $\lambda > 0$ and $u$ in $D^*(-z)$ for some $-z$ in the boundary of $B^\circ$, which now coincides with $-\partial g(x)$. But this means that $u$ can be any element of $\operatorname{bd} B$, so $q(x) = q(y)$ for all $y \in R^n$, contradicting (CH). Thereby the proof of Theorem 2 becomes complete.

5. Klee's results, an infinite dimensional case, and an open problem. The special cases solved by Klee that were mentioned in the introduction are the following.
1. The set $S$ is closed.
2. The norm is rotund.
3. The norm is polyhedral, i.e. it is the maximum of a finite family of linear functions.
Each case has its own proof, and they are all very different from each other and from our proof here, which is most closely related to Klee's proof of Case 3, but much more complicated. Klee states and proves Cases 1 and 2 for general Banach spaces, with the additional assumption that $S$ is compact and totally bounded, respectively.
Klee does not extend Case 3 to any infinite dimensional case, but it is in fact possible to do so, by means of the following remark. Let $E$ be a Banach space taking the role of $R^n$ and let all other notations be as before. We have then

LEMMA 3. The family $\{q^{-1}(x) : x \in S\}$ is a cover of $E$ consisting of closed sets that are pairwise either disjoint or identical. Hence the family has either all members equal to $E$ or else uncountably many different members.

Proof. The closedness of $q^{-1}(x) = \{y : q(y) = x\}$ is obvious from the continuity of the norm, and the disjointness or identity between any pair $q^{-1}(x)$, $q^{-1}(y)$ is of course another way of stating the uniqueness of the farthest point. Furthermore, the existence of farthest points on all of $E$ shows that $E$ is
covered by the family. Finally, the uncountability of a nontrivial closed disjoint cover of $E$ is proved by reduction to the case $E = R$ (take the trace of the whole configuration on a line contained in $E$) and there it is an easy consequence of the Cantor-Bendixson theorem on uncountability of perfect sets. Namely, the set of all interval endpoints of the complements of the closed sets in the family would be at the same time countable and perfect, which is impossible.

From Lemma 3 we get the following theorem.

THEOREM 3. If the norm in $E$ is the maximum of a countable family of linear functions (i.e. there is a sequence $y_n$ in $E^*$ such that, for each $x$ in $E$, $\langle x, y_n \rangle \le \|x\|$ for all $n$ and $\langle x, y_n \rangle = \|x\|$ for some $n$), then each set with unique farthest points consists of a single point.

COROLLARY. Each set with unique farthest points in $C_0(W)$ consists of a single point.

Proof. Say that a farthest point $q(x)$ corresponds to the element $y_n$ in $E^*$ if $\|q(x) - x\| = \langle q(x) - x, y_n \rangle$. Suppose another farthest point $q(z)$ also corresponds to $y_n$. From the following short computation
(5.1) $\langle q(x) - x, y_n \rangle = \|q(x) - x\| \ge \|q(z) - x\| \ge \langle q(z) - x, y_n \rangle$
we deduce that $\langle q(z), y_n \rangle \le \langle q(x), y_n \rangle$, and the converse inequality is obtained by letting $x$ and $z$ change places in (5.1), so $\langle q(z), y_n \rangle = \langle q(x), y_n \rangle$. But then substitution into (5.1) shows that we have equality all along in (5.1), which implies $q(x) = q(z)$. Thus at most one farthest point corresponds to each $y_n$. It follows that $\{q^{-1}(x) : x \in S\}$ is countable, hence by Lemma 3 trivial, and this proves Theorem 3.

Thus Klee's results extend in all cases to some infinite dimensional situation (note that our Theorem 3 extends Klee's Case 3 in some finite-dimensional cases too, but this has already been covered by our Theorem 2) and the extension of Case 3 gives the only proof known to the author of a case where $S$ may be a priori non-precompact. In contrast to this, our method in the previous sections is impossible to extend to infinite dimensional cases, both because of its inductive nature and because of its use of the generalized Straszewicz theorem.

The referee has pointed out that Klee's method in Case 1 works also if the norm function is replaced by an arbitrary continuous convex function that attains its minimum. Let $f$ be such a function. Then
(5.2) $g(x) = \sup \{ f(z - x) : z \in S \} = f(q(x) - x)$
together with the hypothesis that for each x there exists a unique such q(x), serves to define this analog of the antiprojection function. By translation one may
assume that $f(0) = 0 \le f(x)$ for all $x$. If $S$ is compact it follows that $q$ is continuous, and, as in [3], that it has a fixed point $s$, with $q(s) = s$. It then follows from (5.2) and the unicity that $s$ is the only point in $S$. Thus, in particular, the method works as well for unsymmetric norms. Klee's proof for Case 2, however, uses the homogeneity of the norm and so does not work for general convex functions, although it will take a generally unsymmetric norm. The third case, again, has an appropriate extension to more general convex functions. We give here the corresponding generalization of our Theorem 3.

THEOREM 4. If the (continuous) function $f$ is defined on the Banach space $E$ as the maximum (attained at each point of $E$) of a denumerable family of continuous affine functions, and $f$ attains its minimum on $E$, then each subset $S$ of $E$ such that (5.2) has a unique solution $q(x)$ for each $x$ in $E$ (i.e. each translate of $f$ attains its maximum uniquely on $S$), consists of a single point.

The proof of Theorem 4 consists of a repetition of the arguments in the proof of Theorem 3, and is omitted. Again, it seems indicated by the success in Case 1 and Case 3 that the corresponding statement would be true, in finite dimensional spaces, for any continuous convex function that attains its minimum and for a priori arbitrary sets $S$. We have not been able to solve this and state it as an open problem.

PROBLEM. Suppose $f$ is a continuous convex function on $R^n$ that attains its minimum and that $S$ is a subset of $R^n$ such that each translate of $f$ attains its maximum at a unique point in $S$. Must $S$ consist of a single point?

REMARK. Some condition on $f$ like attaining its minimum is needed, as shown by taking $f$ to be a nonconstant linear function.
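The Remark can be checked on a small instance. The sketch below assumes one particular linear function on $R^2$, a two point set $S$ and a few test points $x$, all chosen for illustration; since $f(z - x) = f(z) - f(x)$ for linear $f$, the maximizer in (5.2) does not depend on $x$ and is unique whenever $f$ takes distinct values on the points of $S$.

```python
# The Remark on a small instance: f is linear and does not attain its minimum,
# yet a two point set S gives a unique maximizer in (5.2) for every x.
# The choice of f, S and the test points is an illustrative assumption.

def f(v):
    return v[0] + 2.0 * v[1]          # a nonconstant linear function on R^2

S = [(0.0, 0.0), (1.0, 0.0)]

def maximizers(x):
    """Points z in S at which f(z - x) is largest, as in (5.2)."""
    vals = {z: f((z[0] - x[0], z[1] - x[1])) for z in S}
    g = max(vals.values())
    return [z for z in S if vals[z] == g]

for x in [(0.0, 0.0), (5.0, -3.0), (-2.5, 7.0)]:
    print(x, maximizers(x))            # the same single point each time
```

Here $S$ has two points although every translate of $f$ attains its maximum on $S$ at exactly one point, so the hypothesis that $f$ attains its minimum cannot simply be dropped from the Problem.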
REFERENCES
1. E. Asplund, A k-extreme point is the limit of k-exposed points, Israel J. Math. 1 (1963), 161-162.
2. W. Fenchel, On conjugate convex functions, Canad. J. Math. 1 (1949), 73-77.
3. V. L. Klee, Convexity of Chebyshev sets, Math. Annalen 142 (1961), 292-304.
UNIVERSITY OF WASHINGTON, SEATTLE, WASHINGTON