An Introduction to Nonstandard Real Analysis
This is a volume in PURE AND APPLIED MATHEMATICS, A Series of Monographs and Textbooks. Editors: SAMUEL EILENBERG AND HYMAN BASS
A list of recent titles in this series appears at the end of this volume.
ALBERT E. HURD, Department of Mathematics, University of Victoria, Victoria, British Columbia, Canada
PETER A. LOEB, Department of Mathematics, University of Illinois, Urbana, Illinois
1985
ACADEMIC PRESS, INC. (Harcourt Brace Jovanovich, Publishers)
Orlando San Diego New York London Toronto Montreal Sydney Tokyo
COPYRIGHT © 1985 BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.
ACADEMIC PRESS, INC. Orlando, Florida 32887
United Kingdom Edition published by
ACADEMIC PRESS INC. (LONDON) LTD. 24/28 Oval Road, London NW1 7DX
Library of Congress Cataloging in Publication Data. Main entry under title: An introduction to nonstandard real analysis. Includes bibliographical references and index. 1. Mathematical analysis, Nonstandard. I. Hurd, A. E. (Albert Emerson), DATE. II. Loeb, P. A. QA299.82.I58 1985 515 84-24563 ISBN 0-12-362440-1 (alk. paper)
PRINTED IN THE UNITED STATES OF AMERICA
85 86 87 88   9 8 7 6 5 4 3 2 1
Dedicated to the memory of ABRAHAM ROBINSON
Contents

Preface

Chapter I Infinitesimals and The Calculus
I.1 The Hyperreal Number System as an Ultrapower
I.2 *-Transforms of Relations
I.3 Simple Languages for Relational Systems
I.4 Interpretation of Simple Sentences
I.5 The Transfer Principle for Simple Sentences
I.6 Infinite Numbers, Infinitesimals, and the …
I.7 The Hyperintegers
I.8 Sequences and Series
I.9 Topology on the Reals
I.10 …
I.11 …
I.12 Riemann Integration
I.13 …
I.14 Two Applications to Differential Equations
I.15 Proof of the Transfer Principle

Chapter II Nonstandard Analysis on Superstructures
II.1 Superstructures
II.2 Languages and Interpretation for Superstructures
II.3 Monomorphisms between Superstructures: The Transfer Principle
II.4 The Ultrapower Construction for Superstructures
II.5 Hyperfinite Sets, Enlargements, and …
II.6 Internal and External Entities; Comp…
II.7 The Permanence Principle
II.8 κ-Saturated Superstructures

Chapter III Nonstandard Theory of Topological Spaces
III.1 Basic Definitions and Results
III.2 Compactness
III.3 Metric Spaces
III.4 Normed Vector Spaces and Banach Spaces
III.5 Inner-Product Spaces and Hilbert Spaces
III.6 Nonstandard Hulls of Metric Spaces
III.7 Compactifications
III.8 Function Spaces

Chapter IV Nonstandard Integration Theory
IV.1 Standardizations of Internal Integration Structures
IV.2 Measure Theory for Complete Integration Structures
IV.3 Integration on R^n; the Riesz Representation Theorem
IV.4 Basic Convergence Theorems
IV.5 The Fubini Theorem
IV.6 Applications to Stoch…

Appendix Ultrafilters
References
List of Symbols
Index
Preface
The notion of an infinitesimal has appeared off and on in mathematics since the time of Archimedes. In his formulation of the calculus in the 1670s, the German mathematician Gottfried Wilhelm Leibniz treated infinitesimals as ideal numbers, rather like imaginary numbers, which were smaller in absolute value than any ordinary positive real number but which nevertheless obeyed all of the usual laws of arithmetic. Leibniz regarded infinitesimals as a useful fiction which facilitated mathematical computation and invention. Although it gained rapid acceptance on the continent of Europe, Leibniz’s method was not without its detractors. In commenting on the foundations of calculus as developed both by Leibniz and Newton, Bishop George Berkeley wrote, “And what are these same evanescent increments? They are neither finite quantities, nor quantities infinitely small, nor yet nothing. May we not call them the ghosts of departed quantities?” The question was: How can there be a positive number which is smaller than any positive real number without being zero?

Despite this unanswered question, the infinitesimal calculus was developed by Euler and others during the eighteenth and nineteenth centuries into an impressive body of work. It was not until the late nineteenth century that an adequate definition of limit replaced the calculus of infinitesimals and provided a rigorous foundation for analysis. Following this development, the use of infinitesimals gradually faded, persisting only as an intuitive aid to conceptualization.

There the matter stood until 1960 when Abraham Robinson gave a rigorous foundation for the use of infinitesimals in analysis. More specifically, Robinson showed that the set of real numbers can be regarded as a subset of a larger set of “numbers” (called hyperreal numbers) which contains infinitesimals and also, with appropriately defined arithmetic operations, satisfies all of the arithmetic rules obeyed by the ordinary real numbers. Even more, he demonstrated that the relational structure over the reals (sets, relations, etc.) can be extended to a similar structure over the hyperreals in such a way that all statements true in the real structure remain true, with a suitable interpretation, in the hyperreal structure. This latter property, known as the transfer principle, is the pivotal result of Robinson’s discovery.
Robinson’s invention, called nonstandard analysis, is more than a justification of the method of infinitesimals. It is a powerful new tool for mathematical research. Rather quickly it became apparent that every mathematical structure has a nonstandard model from which knowledge of the original structure can be gained by applications of the appropriate transfer principle. In the twenty-five years since Robinson’s discovery, the use of nonstandard models has led to many new insights into traditional mathematics, and to solutions of unsolved problems in areas as diverse as functional analysis, probability theory, complex function theory, potential theory, number theory, mathematical physics, and mathematical economics.

Robinson’s first proof of the existence of hyperreal structures was based on a result in mathematical logic (the compactness theorem). It was perhaps this aspect of his work, more than any other, which made it difficult to understand for those not adept at mathematical logic. At present, the most common demonstration of the existence of nonstandard models uses an “ultrapower” construction. But the use of ultrapowers is not restricted to nonstandard analysis. Indeed, the construction of ultrapower extensions of the real numbers dates back to the 1940s with the work of Edwin Hewitt [17] and others, and the use of ultrapowers to study Banach spaces [10, 16] has become an important tool in modern functional analysis. Nonstandard analysis is a far-reaching generalization of these applications of ultrapowers. One essential difference between the method of ultrapowers and the method of nonstandard analysis is the consistent use of the transfer principle in the latter. To present this principle one needs a certain amount of mathematical logic, but the logic is used in an essential way only in stating and proving the transfer principle, and not in applying nonstandard analysis.
We hope to demonstrate that the amount of logic needed is minimal, and that the advantages gained in the use of the transfer principle are substantial.

The aim of this book is to make Robinson’s discovery, and some of the subsequent research, available to students with a background in undergraduate mathematics. In its various forms, the manuscript was used by the second author in several graduate courses at the University of Illinois at Urbana-Champaign. The first chapter and parts of the rest of the book can be used in an advanced undergraduate course. Research mathematicians who want a quick introduction to nonstandard analysis will also find it useful. The main addition of this book to the contributions of previous textbooks on nonstandard analysis [12, 37, 42, 46] is the first chapter, which eases the reader into the subject with an elementary model suitable for the calculus, and the fourth chapter on measure theory in nonstandard models.

A more complete discussion of this book’s four chapters must begin by noting H. Jerome Keisler’s major contribution to nonstandard analysis in the form of his 1976 textbook, “Elementary Calculus” [23] together with the instructor’s volume, “Foundations of Infinitesimal Calculus” [24]. Keisler’s book is an excellent calculus text (see the second author’s review [30]) which makes that part of nonstandard analysis needed for the calculus available to freshman students. Keisler’s approach uses equalities and inequalities to transfer properties from the real number system to the hyperreal numbers. In our first chapter, we have modified that approach to an equivalent one by formulating a simple transfer principle based on a restricted language.

The first chapter begins by using ultrafilters on the set of natural numbers to construct a simple ultrapower model of the hyperreal numbers. A formal language is then developed in which only two kinds of sentences are used to transfer properties from the real number system to the larger, hyperreal number system. The rest of the chapter is devoted to extensive applications of this simple transfer principle to the calculus and to more advanced real analysis, including differential equations. By working through these applications, the reader should acquire a good feeling for the basics of nonstandard analysis by the end of the chapter. Anyone who begins this book with no background in mathematical logic should have no problem with the logic in the first chapter and hence should easily pick up the background needed to proceed. Indeed, it is our hope that such a reader will grow quite impatient with the restrictions on the language we impose in the first chapter, and thus be more than ready for the general language introduced in Chapter II and used in the rest of the book. We will not comment on what might be in the mind of a logician at that point.

Chapter II extends the context of Chapter I to “higher-order” models appropriate to the discussion of sets of sets, sets of functions, etc., and covers the notions of internal and external sets and saturation. These topics, together with a general language and transfer principle, are held in abeyance until the second chapter so that the beginner can master the subject in reasonably easy steps.
They are, however, essential to the applications of nonstandard analysis in modern mathematics. External constructions, such as the nonstandard hulls discussed in Chapter III and the standard measure spaces on nonstandard models described in Chapter IV, have been the principal tools through which new results in standard mathematics have been obtained using nonstandard analysis.

The general theory of Chapter II is applied in Chapter III to topological spaces. These are sets with an additional structure giving the notion of nearness. The presentation assumes no familiarity with topology but is rather brisk, so that acquaintance with elementary topological ideas would be useful. The chapter includes discussions of compactness and of metric, normed, and Hilbert spaces. We present a brief discussion of nonstandard hulls of metric spaces, which are important in nonstandard technique. Some of the more advanced topics in Kelley’s “General Topology,” such as function spaces and compactifications, are also included.

Finally, in Chapter IV, we introduce the reader to nonstandard measure theory, certainly one of the most active and fruitful areas of present-day research in nonstandard analysis. With measure theory one extends the notion of the Riemann integral. We shall take a “functional” approach to the integral on nonstandard spaces. This approach will produce both classical results in standard integration theory and some new results which have already proved quite useful in probability theory, mathematical physics, and mathematical economics. The development in this chapter does not assume familiarity with measure theory beyond the Riemann integral. Most of the results in [27, 29, 32, 33] are presented without further reference. We note here that the measures and measure spaces constructed on nonstandard models in Chapter IV are often referred to in the literature as Loeb measures and Loeb spaces.

With one exception (Section I.15), every section of the book has exercises. In designing the text, we have assumed the active participation of the reader, so some of the exercises are details of proofs in the text. At the back of the book there is a list of the notation used, together with the page where the notation is introduced. Of course, we freely use the symbols $\in$, $\cup$, and $\cap$ for set membership, union, and intersection. We have starred sections that can be skipped at the first reading.

Every item in the book has three numbers: the number of the chapter (I, II, III, or IV), the number of the section, and the number of the item in the section. Thus, Theorem IV.2.3 is the third item in the second section of the fourth chapter. In referring to an item, we shall omit the chapter number for items in the same chapter as the reference, and the section number for items in the same section as the reference.
CHAPTER I
Infinitesimals and The Calculus
Our aim in this chapter is to introduce the reader to nonstandard analysis in the familiar context of the calculus. It was in this context that the concept of an infinitesimal was used by Leibniz and his followers to define the derivative, thus launching the infinitesimal calculus on its spectacular development. The notion of an infinitesimal is a cornerstone in all applications of nonstandard methods to analysis, and so an understanding of this chapter is basic to the rest of the book. Moreover, such an understanding will make the technical elaborations of the later chapters easier to appreciate.

In spite of the many technical advantages attending the use of infinitesimals as developed by Leibniz, the notion of infinitesimal was always controversial. The main question was whether infinitesimals actually existed. Since an infinitesimal real number was supposed to be smaller in absolute value than any ordinary positive one, it was clear that all infinitesimals other than zero were not ordinary real numbers. Leibniz regarded them as “numbers” in some ideal world. Further, he implicitly made the important but somewhat vague hypothesis that the infinitesimals satisfied the same rules as the ordinary real numbers. Consider how this hypothesis would work in the calculation of the derivative of the function $e^x$. Leibniz would write

$$\frac{d e^x}{dx} = \frac{e^{x+dx} - e^x}{dx} = e^x\left(\frac{e^{dx} - 1}{dx}\right) = e^x,$$

where $dx$ is an infinitesimal. A separate calculation (Example 11.3.2) would show that $(e^{dx} - 1)/dx = 1$. We will learn in this chapter that the foregoing calculation is correct as long as the equality signs are replaced by $\simeq$, where $a \simeq b$ means that $a$ and $b$ are infinitesimally close. Two facts should be noted:

(a) We need to be able to add infinitesimals to ordinary real numbers. This implies that both infinitesimals and ordinary reals are contained in a larger set of “numbers” for which the operations of arithmetic are defined.

(b) The function $e^x$ needs to be extended to this larger set of numbers in such a way that the law of exponents is satisfied.

The example of the previous paragraph shows that to make Leibniz’s approach to the calculus rigorous we must

A. construct a set ${}^*R$ of “numbers” and define operations of addition, multiplication, and linear ordering on ${}^*R$ so that (i) the field $R$ of real numbers (or an isomorphic copy of $R$) is embedded as a subfield of ${}^*R$ and (ii) the laws of ordinary arithmetic are valid in ${}^*R$,

B. show how functions and relations on $R$ are extended to functions and relations on ${}^*R$, thus extending the “relational” structure on $R$ to one on ${}^*R$,

C. ensure that statements true in the relational structure on $R$ are “extended” to statements true in the relational structure on ${}^*R$.

A set ${}^*R$ having the properties mentioned in A is developed in §I.1 using ultrafilters. We show in the Appendix that the existence of ultrafilters follows from Zorn’s lemma, a form of the axiom of choice. In §I.2 we show how relations and functions on $R$ are extended to relations and functions on ${}^*R$. To deal with C we must develop a very modest amount of mathematical logic (§§I.3 and I.4) in order to make precise what is meant by the words “statement” and “true.” The sense in which true statements for $R$ “extend” to true statements for ${}^*R$ is made precise in the transfer principle, which is stated in §I.5. This principle is at the heart of nonstandard methods as developed by Abraham Robinson. Its proof is deferred to §I.15 since it is not necessary to know the proof in order to apply the transfer principle. In the intervening sections we show how to use the transfer principle to prove results in the calculus. The proofs are usually similar to those developed in the early days of the calculus except for the role played by mathematical logic.

As noted in the Preface, we have used a very simple formal language in this chapter in order to facilitate the initiation of readers not familiar with formal languages. Consideration of a more elaborate language and nonstandard model is deferred until Chapter II.
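As a purely numerical companion to Leibniz's calculation of the derivative of the exponential function in this introduction, the following Python sketch replaces the infinitesimal dx by small ordinary real increments. This is only an illustration under that assumption: a floating-point dx is not an infinitesimal, and the function name `difference_quotient` is chosen here and does not come from the text.

```python
import math


def difference_quotient(x: float, dx: float) -> float:
    """Return (e^(x+dx) - e^x) / dx for an ordinary (non-infinitesimal) real dx."""
    return (math.exp(x + dx) - math.exp(x)) / dx


x = 1.0
for dx in (1e-2, 1e-4, 1e-6):
    quotient = difference_quotient(x, dx)
    # math.expm1(dx) evaluates e^dx - 1 without cancellation for small dx.
    ratio = math.expm1(dx) / dx
    print(f"dx={dx:.0e}  quotient={quotient:.8f}  (e^dx - 1)/dx={ratio:.8f}")
```

As dx shrinks, the printed ratio approaches 1 and the quotient approaches e^x, which is the real-number shadow of the statement that the two sides are infinitesimally close.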
I.1 The Hyperreal Number System as an Ultrapower

We assume that anyone reading this book is familiar with the real number system as a complete linearly ordered field $\mathcal{R} = (R, +, \cdot, <)$, where $R$ denotes the set of real numbers and $+$, $\cdot$, and $<$ denote the usual algebraic operations and relations of addition, multiplication, and linear ordering on $R$. Our object in this section is to construct another linearly ordered field ${}^*\mathcal{R} = ({}^*R, +, \cdot, <)$ which contains an isomorphic copy of $\mathcal{R}$ but is strictly larger than $\mathcal{R}$. ${}^*\mathcal{R}$ will be called a nonstandard or hyperreal number system. The construction of ${}^*\mathcal{R}$ is reminiscent of the construction of the reals from the rationals by means of equivalence classes of Cauchy sequences.

To begin the construction, let $N$ denote the natural numbers and $\hat{R}$ denote the set of all sequences of real numbers (indexed by $N$); i.e., each element in $\hat{R}$ is of the form $r = \langle r_1, r_2, r_3, \ldots \rangle$. For convenience we denote $\langle r_1, r_2, r_3, \ldots \rangle$ by $\langle r_i : i \in N \rangle$ or simply $\langle r_i \rangle$. Operations of addition, $\oplus$, and multiplication, $\odot$, can be defined on $\hat{R}$ in the following way: If $r = \langle r_i \rangle$ and $s = \langle s_i \rangle$ are elements of $\hat{R}$, we define

$$r \oplus s = \langle r_i + s_i \rangle \quad\text{and}\quad r \odot s = \langle r_i \cdot s_i \rangle.$$

It is easy to check (Exercise 1) that $(\hat{R}, \oplus, \odot)$ is a commutative ring with an identity $\langle 1, 1, \ldots \rangle$ and a zero $\langle 0, 0, \ldots \rangle$ (where 1 and 0 are the unit and zero in $R$). However, the ring is not a field; for example, $\langle 1,0,1,0,1,\ldots \rangle \odot \langle 0,1,0,1,0,\ldots \rangle = \langle 0,0,0,\ldots \rangle$, so the product of nonzero elements can be zero. We remedy the situation by introducing an equivalence relation on $\hat{R}$ and defining operations and relations $+$, $\cdot$, and $<$ on the set ${}^*R$ of equivalence classes which make $({}^*R, +, \cdot, <)$ into a linearly ordered field. To introduce the equivalence relation we need the notion of an ultrafilter (for more on ultrafilters see the Appendix).
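The zero-divisor example for the ring of sequences can be checked on finite truncations. The sketch below is only an illustration: it assumes sequences are cut off after `n` terms (a real element of the ring is an infinite sequence), and the helper names `add` and `mul` are hypothetical.

```python
from itertools import cycle, islice


def add(r, s):
    """Componentwise sum of two (truncated) sequences."""
    return [a + b for a, b in zip(r, s)]


def mul(r, s):
    """Componentwise product of two (truncated) sequences."""
    return [a * b for a, b in zip(r, s)]


n = 10  # truncation length, chosen arbitrarily for the illustration
r = list(islice(cycle([1, 0]), n))  # 1, 0, 1, 0, ...
s = list(islice(cycle([0, 1]), n))  # 0, 1, 0, 1, ...

product = mul(r, s)
print(product)    # every entry is 0 although neither r nor s is the zero sequence
print(add(r, s))  # every entry is 1: r and s sum to the identity
```

Neither factor is zero, yet their product is, which is why the componentwise ring cannot be a field before passing to equivalence classes.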
1.1 Definition Let $I$ be a nonempty set. A filter on $I$ is a nonempty collection $\mathcal{U}$ of subsets of $I$ having the following properties:

(i) The empty set $\emptyset \notin \mathcal{U}$.
(ii) If $A, B \in \mathcal{U}$, then $A \cap B \in \mathcal{U}$.
(iii) If $A \in \mathcal{U}$ and $I \supseteq B \supseteq A$, then $B \in \mathcal{U}$.

A filter $\mathcal{U}$ is an ultrafilter if

(iv) for any subset $A$ of $I$ either $A \in \mathcal{U}$ or its complement $A' = I - A \in \mathcal{U}$ [but not both by (i) and (ii)].

For each $x \in I$ there is a fixed ultrafilter $\mathcal{U}_x = \{B \subseteq I : x \in B\}$. If $I$ is an infinite set the collection $\mathcal{F}_I = \{A \subseteq I : I - A \text{ is finite}\}$ is a filter called the cofinite or Fréchet filter on $I$. An ultrafilter $\mathcal{U}$ on $I$ is free if it contains $\mathcal{F}_I$.
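The filter and ultrafilter conditions (i) to (iv) of Definition 1.1 can be tested by brute force when the index set is finite. The following sketch does this for a four-element set; the function names are hypothetical. On a finite set every ultrafilter is fixed, so this illustrates only fixed ultrafilters, not the free ones needed for the construction.

```python
from itertools import combinations


def powerset(I):
    """All subsets of I, as frozensets."""
    items = list(I)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]


def is_filter(F, I):
    """Check conditions (i)-(iii) for a collection F of subsets of I."""
    F = set(F)
    if not F or frozenset() in F:                      # nonempty, and (i)
        return False
    if any(A & B not in F for A in F for B in F):      # (ii) intersections
        return False
    return all(B in F for A in F                       # (iii) supersets
               for B in powerset(I) if A <= B)


def is_ultrafilter(F, I):
    """Check (iv): exactly one of A and its complement lies in F."""
    I = frozenset(I)
    return is_filter(F, I) and all((A in F) != ((I - A) in F)
                                   for A in powerset(I))


def fixed_ultrafilter(x, I):
    """The collection of all subsets of I that contain x."""
    return {B for B in powerset(I) if x in B}


I = frozenset({0, 1, 2, 3})
U = fixed_ultrafilter(2, I)
print(is_ultrafilter(U, I))    # the fixed collection passes all four checks
print(is_filter({I}, I), is_ultrafilter({I}, I))  # {I} is a filter, not an ultrafilter
```

The brute-force check is exponential in the size of the index set, so it is meant for small examples only.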
A free ultrafilter $\mathcal{U}$ cannot contain any finite set $F$: the complement $F'$ is cofinite and hence in $\mathcal{U}$, so if $F$ were also in $\mathcal{U}$ we would have $F \cap F' = \emptyset \in \mathcal{U}$, contradicting 1.1(i). Intuitively an ultrafilter is a very large collection of subsets, but not too large, since, for example, it cannot contain two disjoint sets by 1.1(i) and (ii). Note that if $\mathcal{U}$ is an ultrafilter on $I$, then $I \in \mathcal{U}$. Note also that if $A_1, A_2, \ldots, A_n$ are a finite number of sets in $I$ with $A_i \cap A_j = \emptyset$ for $i \neq j$ and $\bigcup A_i\ (1 \le i \le n) = I$, then one and only one of the sets $A_i$ is in $\mathcal{U}$ (Exercise 2). It is not at all obvious that free ultrafilters exist. These, however, are the important ultrafilters for our construction. Therefore we take as a basic assumption the following axiom.
1.2 Ultrafilter Axiom If $\mathcal{F}$ is a filter on $I$, then there is an ultrafilter $\mathcal{U}$ on $I$ which contains $\mathcal{F}$.

We show in the Appendix that the ultrafilter axiom follows from Zorn’s lemma (that is, the axiom of choice). Now assume that we have chosen a free ultrafilter $\mathcal{U}$ on $N$. We define a relation $\equiv$ on $\hat{R}$ as follows ($\equiv$ will depend on $\mathcal{U}$, but this dependence will not be indicated explicitly).

1.3 Definition If $r = \langle r_i \rangle$ and $s = \langle s_i \rangle$ are in $\hat{R}$, then $r \equiv s$ if and only if $\{i \in N : r_i = s_i\} \in \mathcal{U}$. We then say that $\langle r_i \rangle = \langle s_i \rangle$ almost everywhere (a.e.).

1.4 Lemma The relation $\equiv$ is an equivalence relation on $\hat{R}$.

Proof: The relation $\equiv$ is reflexive ($r \equiv r$) because $N \in \mathcal{U}$, symmetric ($r \equiv s$ implies $s \equiv r$) because $=$ is a symmetric relation on $R$, and transitive ($r \equiv s$ and $s \equiv t$ imply $r \equiv t$) because of conditions 1.1(ii) and (iii) for a filter. The details are left to the reader (Exercise 3). □

Note that two sequences can have the same limit as $n \to \infty$ and not be equivalent. For example, $\langle 1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots \rangle \not\equiv \langle 0, 0, 0, \ldots \rangle$ since $\emptyset \notin \mathcal{U}$; sequences like $\langle 1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots \rangle$ will later be used to define “infinitesimal numbers” different from zero. We will see shortly that the equivalence relation also eliminates the problem that the product of nonzero elements can be zero. For example, consider again the sequences $\langle 1,0,1,0,1,\ldots \rangle$ and $\langle 0,1,0,1,0,\ldots \rangle$. By 1.1(iv), one of these two sequences is equivalent to $\langle 0, 0, \ldots \rangle$; which one depends on the particular ultrafilter $\mathcal{U}$ used to define $\equiv$ (there are many such).
The set $\hat{R}$ is divided into disjoint subsets called equivalence classes by $\equiv$. Each equivalence class consists of all sequences equivalent to any given sequence in the class. Thus $r$ and $s$ are in the same equivalence class iff $r \equiv s$. Of course, two sequences which differ at only a finite number of places are equivalent under $\equiv$, since the set on which they agree is cofinite and $\mathcal{U}$ is free.

1.5 Definition Let ${}^*R$ denote the set of all the equivalence classes of $\hat{R}$ induced by $\equiv$. The equivalence class containing a particular sequence $s = \langle s_i \rangle$ is denoted by $[s]$ or $\mathbf{s}$. Thus if $r \equiv s$ in $\hat{R}$ then $\mathbf{r} = [r] = [s] = \mathbf{s}$. Elements of ${}^*R$ are called nonstandard or hyperreal numbers.

${}^*R$ is technically known as an ultrapower (the general concept will be presented in Chapter II). Notice that if $\mathbf{r} = [\langle r_i \rangle]$ and $\mathbf{s} = [\langle s_i \rangle]$ are two elements of ${}^*R$ then $\mathbf{r} = \mathbf{s}$ if and only if $\langle r_i \rangle = \langle s_i \rangle$ a.e. (Exercise 4). We use the same idea to define operations and relations which make ${}^*R$ into an ordered field.

1.6 Definition Let $\mathbf{r} = [\langle r_i \rangle]$ and $\mathbf{s} = [\langle s_i \rangle]$. Then

(i) $\mathbf{r} + \mathbf{s} = [\langle r_i + s_i \rangle]$, i.e., $[r] + [s] = [r \oplus s]$,
(ii) $\mathbf{r} \cdot \mathbf{s} = [\langle r_i \cdot s_i \rangle]$, i.e., $[r] \cdot [s] = [r \odot s]$,
(iii) $\mathbf{r} < \mathbf{s}$ ($\mathbf{s} > \mathbf{r}$) if and only if $\{i \in N : r_i < s_i\} \in \mathcal{U}$, and $\mathbf{r} \le \mathbf{s}$ ($\mathbf{s} \ge \mathbf{r}$) if and only if $\mathbf{r} < \mathbf{s}$ or $\mathbf{r} = \mathbf{s}$.

The structure $({}^*R, +, \cdot, <)$ is denoted by ${}^*\mathcal{R}$.
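We must check that the definitions are independent of the particular representatives chosen from the equivalence classes. That is, if $r \equiv \bar{r}$ and $s \equiv \bar{s}$ we must show that $[r \oplus s] = [\bar{r} \oplus \bar{s}]$, $[r \odot s] = [\bar{r} \odot \bar{s}]$, and $[\bar{r}] < [\bar{s}]$ if and only if $[r] < [s]$. We check the first equality and leave the rest to the reader (Exercise 5). Let $r = \langle r_i \rangle$, $\bar{r} = \langle \bar{r}_i \rangle$, $s = \langle s_i \rangle$, and $\bar{s} = \langle \bar{s}_i \rangle$. Then $\{i \in N : r_i = \bar{r}_i\}$ and $\{i \in N : s_i = \bar{s}_i\}$ are in $\mathcal{U}$. Obviously

$$\{i \in N : r_i + s_i = \bar{r}_i + \bar{s}_i\} \supseteq \{i \in N : r_i = \bar{r}_i\} \cap \{i \in N : s_i = \bar{s}_i\}. \tag{1.1}$$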
The right-hand side of (1.1) is in $\mathcal{U}$ by 1.1(ii), and so the left-hand side is in $\mathcal{U}$ by 1.1(iii). The remarkable fact is the following.

1.7 Theorem The structure ${}^*\mathcal{R}$ is a linearly ordered field.

Proof: That ${}^*\mathcal{R}$ is a commutative ring with zero $\mathbf{0} = [\langle 0, 0, \ldots \rangle]$ and unit $\mathbf{1} = [\langle 1, 1, \ldots \rangle]$ is easy to check. For example, the distributive law $\mathbf{r} \cdot (\mathbf{s} + \mathbf{t}) = \mathbf{r} \cdot \mathbf{s} + \mathbf{r} \cdot \mathbf{t}$ is proved as follows: Let $\mathbf{r} = [r]$, $\mathbf{s} = [s]$, and $\mathbf{t} = [t]$. Then

$$\mathbf{r} \cdot (\mathbf{s} + \mathbf{t}) = [r] \cdot ([s] + [t]) = [r] \cdot [s \oplus t] = [r \odot (s \oplus t)] = [(r \odot s) \oplus (r \odot t)] = [r \odot s] + [r \odot t] = \mathbf{r} \cdot \mathbf{s} + \mathbf{r} \cdot \mathbf{t}.$$

The other commutative ring axioms are left to the reader (Exercise 6). To show that ${}^*\mathcal{R}$ is a field we need to prove in addition that every nonzero element in ${}^*R$ has an inverse (i.e., if $\mathbf{r} \neq \mathbf{0}$ then there is an element $\mathbf{r}^{-1}$ in ${}^*R$ so that $\mathbf{r} \cdot \mathbf{r}^{-1} = \mathbf{1}$). Suppose that $\mathbf{r} = [\langle r_i \rangle] \neq [\langle 0, 0, \ldots \rangle]$. Then $\{i \in N : r_i = 0\} \notin \mathcal{U}$ and so $\{i \in N : r_i \neq 0\} \in \mathcal{U}$ by 1.1(iv). Define $\mathbf{r}^{-1} = [\langle \bar{r}_i \rangle]$, where $\bar{r}_i = r_i^{-1}$ if $r_i \neq 0$, and $\bar{r}_i = 0$ if $r_i = 0$. Then it is easy to check that $\mathbf{r} \cdot \mathbf{r}^{-1} = \mathbf{1}$. Here we have used the fact that $\mathcal{U}$ is an ultrafilter.

Finally we must show that ${}^*\mathcal{R}$ is a linearly ordered field with the ordering given by $<$. We say that an element $\mathbf{r}$ of ${}^*R$ is positive if $\mathbf{r} > \mathbf{0}$. We must show that

(α) the sum of two positive elements is positive,
(β) the product of two positive elements is positive,
(γ) (Law of Trichotomy) for a given element $\mathbf{r}$ either $\mathbf{r}$ is positive, $\mathbf{r} = \mathbf{0}$, or $-\mathbf{r}$ is positive (where $-\mathbf{r}$ is the additive inverse of $\mathbf{r}$).

(α) and (β) are left to the reader (Exercise 7). To demonstrate (γ), let $\mathbf{r} = [\langle r_i \rangle]$ and define $A = \{i \in N : r_i > 0\}$, $B = \{i \in N : r_i = 0\}$, and $C = \{i \in N : r_i < 0\}$. We want to show that one and only one of $A$, $B$, and $C$ is in $\mathcal{U}$. This follows from Exercise 2 or the following equivalent proof: From the law of trichotomy in $R$, we see that $A \cup B \cup C = N \in \mathcal{U}$. Now one of $A$, $B$, and $C$ must be in $\mathcal{U}$, for otherwise $A'$, $B'$, and $C'$ are in $\mathcal{U}$ by 1.1(iv), and so $A' \cap B' \cap C' = (A \cup B \cup C)' = \emptyset \in \mathcal{U}$, and this is a contradiction. Suppose finally that at least two (say, $A$ and $B$) of the sets $A$, $B$, and $C$ are in $\mathcal{U}$, and so $A \cap B \in \mathcal{U}$. By the law of trichotomy in $R$, $A \cap B = \emptyset$, contradicting 1.1(i). □

The definition of absolute value in ${}^*\mathcal{R}$ is now clear.

1.8 Definition If $\mathbf{r} \in {}^*R$, then

$$|\mathbf{r}| = \begin{cases} \mathbf{r} & \text{if } \mathbf{r} \ge \mathbf{0}, \\ -\mathbf{r} & \text{if } \mathbf{r} < \mathbf{0}. \end{cases}$$
This absolute value has all of the properties of the familiar absolute value in $R$. In Exercise 8 the reader is asked to show that if $\mathbf{r} = [\langle r_i \rangle]$ then $|\mathbf{r}| = [\langle |r_i| \rangle]$.

Next we want to show that $\mathcal{R}$ can be embedded isomorphically as a linearly ordered subfield of ${}^*\mathcal{R}$. To be precise, we define a mapping $* : R \to {}^*R$ as follows.

1.9 Definition If $r \in R$, we define $*(r) = {}^*r$, where ${}^*r = [\langle r, r, \ldots \rangle] \in {}^*R$.

1.10 Theorem The mapping $*$ is an order-preserving isomorphism of $R$ into ${}^*R$.

Proof: The mapping $*$ is 1-1, for if ${}^*r = {}^*s$ then $[\langle r, r, \ldots \rangle] = [\langle s, s, \ldots \rangle]$ and so $r = s$. It is a trivial matter to show that $*$ preserves the field and order properties. For example, the equation $[\langle r, r, \ldots \rangle] + [\langle s, s, \ldots \rangle] = [\langle r + s, r + s, \ldots \rangle]$ establishes ${}^*(r + s) = {}^*r + {}^*s$. The details are left to the reader (Exercise 9). □
Of particular interest are the standard numbers in R;these are the images of elements of R under *. 1.11 Definition If A E R then (A), is the set of all elements *a, where a E A; ( R ) , is the set of standard numbers in R.
Finally we want to show that ${}^*R$ contains numbers other than standard numbers. In order to do so we use for the first time the assumption that $\mathcal{U}$ is a free ultrafilter. Consider the number $\omega = [\langle 1, 2, 3, \ldots \rangle]$. This number cannot equal any standard number ${}^*r = [\langle r, r, r, \ldots \rangle]$, for the set $\{i \in N : r = i\}$ consists of at most one natural number and so, being finite, cannot belong to the free ultrafilter $\mathcal{U}$. Thus ${}^*R$ is strictly larger than its set of standard numbers. In §I.6, $\omega$ will be called an infinite number. Similarly the number $\omega^{-1} = [\langle 1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots \rangle]$ is not a standard number and is called an infinitesimal. We will see that there are many other distinct infinite and infinitesimal numbers in ${}^*R$.

To sum up, we have shown that the structure ${}^*\mathcal{R}$ is at least an ordered field. The proof of this fact has involved simple but tedious manipulations involving the ultrafilter $\mathcal{U}$. One might ask whether other properties of $\mathcal{R}$ are likewise true of ${}^*\mathcal{R}$. For example, $\mathcal{R}$ has the property that if $r < s$ then there is a number $t$ so that $r < t < s$. It turns out that ${}^*\mathcal{R}$ also has this property (Exercise 10). After checking this and a few more properties, one begins to suspect that all reasonable statements that are true in $\mathcal{R}$ are also true in ${}^*\mathcal{R}$ if the statements are suitably interpreted. This is the content of the transfer principle, which will be stated in a simple form in §I.5 and proved at the end of the chapter. With the transfer principle the proofs of Theorem 1.7 and similar results become trivial.
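The reason the class of the sequence 1, 1/2, 1/3, ... lies below every positive standard number is that, for each real r > 0, the terms 1/i drop below r for all but finitely many indices i; that index set is cofinite and so lies in every free ultrafilter. The Python sketch below locates the threshold index for a given r. It is only an illustration: the function name `first_index_below` is hypothetical, and floating-point arithmetic stands in for exact real arithmetic.

```python
import math


def first_index_below(r: float) -> int:
    """Smallest i in N = {1, 2, ...} with 1/i < r, for a real r > 0."""
    if r <= 0:
        raise ValueError("r must be positive")
    i = math.floor(1 / r) + 1
    # Adjust in both directions to guard against floating-point rounding.
    while i > 1 and 1 / (i - 1) < r:
        i -= 1
    while 1 / i >= r:
        i += 1
    return i


for r in (2.0, 0.5, 0.01):
    i0 = first_index_below(r)
    exceptions = list(range(1, i0))  # the finitely many i with 1/i >= r
    print(f"r={r}: 1/i < r for all i >= {i0}; {len(exceptions)} exceptional indices")
```

Since only finitely many indices are exceptional for each r, the comparison holds on a set in the ultrafilter no matter which free ultrafilter was chosen.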
Exercises I.1

1. Show that $(\hat{R}, \oplus, \odot)$ is a ring with identity $\langle 1, 1, 1, \ldots \rangle$ and zero $\langle 0, 0, 0, \ldots \rangle$.
2. Fix an ultrafilter $\mathcal{U}$ in a set $I$ and show that if $A_1, A_2, \ldots, A_n$ are a finite number of subsets of $I$ with $A_i \cap A_j = \emptyset$ for $i \neq j$ and $\bigcup A_i\ (1 \le i \le n) = I$, then one and only one of the sets $A_i$ is in $\mathcal{U}$.
3. Complete the proof of Lemma 1.4.
4. Show that if $\mathbf{r} = [\langle r_i \rangle]$ and $\mathbf{s} = [\langle s_i \rangle]$, then $\mathbf{r} = \mathbf{s}$ (equality of equivalence classes as sets) if and only if $\langle r_i \rangle = \langle s_i \rangle$ a.e.
5. Show that parts (ii) and (iii) of Definition 1.6 are independent of the representatives chosen for the equivalence classes. Also show that $\mathbf{r} \le \mathbf{s}$ if and only if $\{i \in N : r_i \le s_i\} \in \mathcal{U}$.
6. Prove that ${}^*\mathcal{R}$ is a ring.
7. Establish the properties (α) and (β) of the ordering $<$ which are stated in the proof of Theorem 1.7.
8. Show that if $\mathbf{r} = [\langle r_i \rangle]$ then $|\mathbf{r}| = [\langle |r_i| \rangle]$.
9. Complete the proof of Theorem 1.10.
10. For any $\mathbf{r}, \mathbf{s} \in {}^*R$ with $\mathbf{r} < \mathbf{s}$, show that there exists a $\mathbf{t} \in {}^*R$ with $\mathbf{r} < \mathbf{t} < \mathbf{s}$.
11. Show directly (without using Theorem 1.7) that if $\mathbf{r} < \mathbf{s}$ and $\mathbf{s} < \mathbf{t}$, then $\mathbf{r} < \mathbf{t}$.
12. Show that $|\mathbf{r} + \mathbf{s}| \le |\mathbf{r}| + |\mathbf{s}|$ and $|\mathbf{r}\mathbf{s}| = |\mathbf{r}||\mathbf{s}|$ for all $\mathbf{r}, \mathbf{s} \in {}^*R$.
13. Show that there are infinitely many distinct elements of ${}^*R$ greater than $\omega = [\langle 1, 2, 3, \ldots \rangle]$.
14. Show that an ultrafilter $\mathcal{U}$ is free iff it is not fixed at any $x \in I$.
15. Show that if one lets $\mathcal{U}$ be an ultrafilter fixed at some $n \in N$ in the construction of ${}^*\mathcal{R}$, then ${}^*\mathcal{R}$ is isomorphic to $\mathcal{R}$.
I.2 *-Transforms of Relations
In order to do calculus we must introduce sets and functions into discussions involving $\mathcal{R}$ and $\boldsymbol{\mathcal{R}}$. Of course, sets and functions are just special types of relations. We will show how to extend relations from $\mathcal{R}$ to $\boldsymbol{\mathcal{R}}$. The procedure generalizes what we have done for the relation $<$.

2.1 Definition For any set $S$, the set $S^n = S \times S \times \cdots \times S$ ($n$ factors) is the set of ordered $n$-tuples $\langle a^1, a^2, \ldots, a^n\rangle$, $a^i \in S$. An $n$-ary relation $P$ on $S$ is a subset of $S^n$. If $\langle a^1, \ldots, a^n\rangle$ is an element of $P$ we write either $\langle a^1, \ldots, a^n\rangle \in P$ or $P\langle a^1, \ldots, a^n\rangle$. The complement of a relation $P$ is the relation $P' = (S \times S \times \cdots \times S) - P$. In particular, a subset $A$ of $S$ is a unary relation on $S$, and we write $c \in A$ or $A\langle c\rangle$ if $c$ is in $A$. The domain of $P$ is the subset of $S^{n-1}$ consisting of those $(n-1)$-tuples $\langle a^1, \ldots, a^{n-1}\rangle$ in $S^{n-1}$ for which there exists an $a \in S$ so that $P\langle a^1, \ldots, a^{n-1}, a\rangle$. The set of all such elements $a$ is called the range of $P$. We write dom $P$ and range $P$ for the domain and range of $P$. An $S$-valued function $f$ of $n$ variables on $S$ is an $(n+1)$-ary relation with the special property that if $f\langle a^1, \ldots, a^n, a\rangle$ and $f\langle a^1, \ldots, a^n, b\rangle$ then $a = b$. Here $a$ is called the image of $\langle a^1, \ldots, a^n\rangle$ under $f$. We also frequently write $f(a^1, \ldots, a^n) = a$ if $f\langle a^1, \ldots, a^n, a\rangle$ (notice the different brackets). If $f(a^1, \ldots, a^n) = f(b^1, \ldots, b^n)$ implies $\langle a^1, \ldots, a^n\rangle = \langle b^1, \ldots, b^n\rangle$, then we say that $f$ is one-to-one (1-1) or injective.
As examples, we see that $=$ and $<$ are 2-ary (usually written "binary") relations on $R$; we will usually write $a = b$ and $a < b$ rather than $=\langle a, b\rangle$ and $<\langle a, b\rangle$. Similarly $+$ and $\cdot$ are functions of two variables; we will write $a + b = c$ and $a \cdot b = c$ rather than $+(a, b) = c$ (or $+\langle a, b, c\rangle$) and $\cdot(a, b) = c$. Following common practice, we will often leave out the dot denoting multiplication altogether. Next, we generalize Definition 1.9 by introducing a mapping $*$ from the set of $n$-ary relations on $R$ to the set of $n$-ary relations on $\mathbf{R}$.
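2.2 Definition Let $P$ be an $n$-ary relation on $R$. The *-transform ${}^*P$ of $P$ is the set of all $n$-tuples $\langle r^1, \ldots, r^n\rangle$ in $\mathbf{R}^n$ such that if $r^k = [\langle r^k_1, r^k_2, \ldots\rangle]$, $1 \le k \le n$, then $P\langle r^1_i, r^2_i, \ldots, r^n_i\rangle$ holds a.e.; that is, $\{i \in N : P\langle r^1_i, \ldots, r^n_i\rangle\} \in \mathcal{U}$.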
The set ${}^*P$ is well defined, for if $[\langle r^k_i\rangle] = [\langle \bar{r}^k_i\rangle]$, $k = 1, 2, \ldots, n$, and $A = \{i \in N : P\langle r^1_i, \ldots, r^n_i\rangle\} \in \mathcal{U}$, then there is a set $B$ in $\mathcal{U}$ such that $r^k_i = \bar{r}^k_i$, $1 \le k \le n$, for each $i \in B$ (Exercise 1), and so $P\langle \bar{r}^1_i, \ldots, \bar{r}^n_i\rangle$ for $i \in A \cap B \in \mathcal{U}$. The definitions in 1.6 for $\boldsymbol{+}$, $\boldsymbol{\cdot}$, and $\boldsymbol{<}$ are easily seen to be special cases of Definition 2.2, in the sense that ${}^*{+}$ is $\boldsymbol{+}$, etc. (Exercise 2). Notice that ${}^*{=}$ is equality on $\mathbf{R}$, and not just a relation satisfying the formal properties of equality (see Exercise 4 in §I.1).

For any set $A \subseteq R$, it follows from Definition 2.2 that ${}^*A = \{[\langle s_i\rangle] \in \mathbf{R} : \{i \in N : s_i \in A\} \in \mathcal{U}\}$ and therefore ${}^*A \supseteq (A)_*$ (Exercise 3). We will show later that, for an infinite set $A$, ${}^*A$ contains points not in $(A)_*$. As an example, we see that ${}^*R = \mathbf{R}$ since $\{i \in N : r_i \in R\} = N \in \mathcal{U}$, and $\mathbf{R}$ properly contains $(R)_*$. For a slightly less trivial example, let $A = [a, b] = \{x \in R : a \le x \le b\}$. Then ${}^*A = \{[\langle s_i\rangle] \in \mathbf{R} : \{i \in N : a \le s_i \le b\} \in \mathcal{U}\}$, and it follows that ${}^*[a, b] = \{x \in \mathbf{R} : {}^*a \le x \le {}^*b\}$. It will again be a consequence of the transfer principle that ${}^*A$ has all of the properties (appropriately interpreted) of $A$.
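The step from the ultrafilter description of ${}^*[a,b]$ to its interval form deserves one more line. The sketch below assumes the characterization of $\le$ from Exercise 5 of §I.1 and the closure of the ultrafilter $\mathcal{U}$ under finite intersections and supersets; the set names $E$, $E_a$, and $E_b$ are introduced here only for illustration.

```latex
For $x = [\langle s_i \rangle]$ put
\[
E_a = \{\, i \in N : a \le s_i \,\}, \qquad
E_b = \{\, i \in N : s_i \le b \,\}, \qquad
E = E_a \cap E_b = \{\, i \in N : a \le s_i \le b \,\}.
\]
Then
\[
x \in {}^*[a,b]
\iff E \in \mathcal{U}
\iff E_a \in \mathcal{U} \ \text{and}\ E_b \in \mathcal{U}
\iff {}^*a \le x \ \text{and}\ x \le {}^*b .
\]
```

The middle equivalence uses both filter properties: $E \in \mathcal{U}$ forces its supersets $E_a$ and $E_b$ into $\mathcal{U}$, and conversely the intersection of two members of $\mathcal{U}$ is again in $\mathcal{U}$.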
To take another example, let $Z$ denote the set of integers, and consider ${}^*Z = \{[\langle s_i\rangle] \in \mathbf{R} : \{i \in N : s_i \in Z\} \in \mathcal{U}\}$. If the transfer principle is true, then ${}^*Z$ must be a subring of $\boldsymbol{\mathcal{R}}$ under the induced operations of addition and multiplication. To establish this without the transfer principle would require more tedious calculations with the ultrafilter $\mathcal{U}$ (Exercise 4).

Let $r^k = [\langle r^k_i\rangle]$, $1 \le k \le n$, and $s = [\langle s_i\rangle]$. If $f$ is a function of $n$ variables then the relation ${}^*f\langle [\langle r^1_i\rangle], \ldots, [\langle r^n_i\rangle], [\langle s_i\rangle]\rangle$ holds in $\mathbf{R}^{n+1}$ iff $f\langle r^1_i, \ldots, r^n_i, s_i\rangle$ holds a.e., in which case $f(r^1_i, \ldots, r^n_i) = s_i$ almost everywhere. It follows that ${}^*f$ is a function of $n$ variables on $\mathbf{R}$. Moreover, ${}^*f(r^1, \ldots, r^n)$ is defined iff $f(r^1_i, \ldots, r^n_i)$ is defined a.e., and, if defined, then ${}^*f(r^1, \ldots, r^n) = [\langle s_i\rangle]$ where $f(r^1_i, \ldots, r^n_i) = s_i$ almost everywhere. It should now be easy to prove the following result. The proof is left as an exercise.
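2.3 Theorem If $P$ is an $n$-ary relation on $R$ and $P\langle r^1, \ldots, r^n\rangle$ for $r^k \in R$, then ${}^*P\langle {}^*r^1, \ldots, {}^*r^n\rangle$. Thus if we identify each $r^k$ with ${}^*r^k$, the relation ${}^*P$ extends $P$ (where $R$ is regarded as embedded in $\mathbf{R}$). In particular, if $f$ is a function of $n$ variables on $R$ and $f(r^1, \ldots, r^n) = s$ with $r^k$ and $s$ in $R$, then ${}^*f({}^*r^1, \ldots, {}^*r^n) = {}^*s$. Consequently ${}^*f$ is an extension of $f$ if we think of $R$ as embedded in $\mathbf{R}$ by the mapping $*$.

An $n$-ary relation $P$ can be defined by its characteristic function
\[
\chi_P(x^1, \ldots, x^n) =
\begin{cases}
1 & \text{if } \langle x^1, \ldots, x^n\rangle \in P, \\
0 & \text{if } \langle x^1, \ldots, x^n\rangle \notin P.
\end{cases}
\]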
The proof of the following proposition is also left to the reader.
2.4 Proposition ${}^*\chi_P = \chi_{{}^*P}$.
2.5 Notational Convention In order to eliminate the use of boldface symbols and so conform to Robinson's notation in the rest of Chapter I, we adopt the following conventions:

(a) $\mathbf{R}$ and $\boldsymbol{\mathcal{R}}$ will be denoted by ${}^*R$ and ${}^*\mathcal{R}$, respectively.
(b) Numbers in $R$ and ${}^*R$ will be denoted by lowercase letters, e.g., $r$, $s$. The context will make clear whether a number is in $R$ or ${}^*R$.
(c) We regard $R$ as embedded in ${}^*R$, and hence identify $R$ and $(R)_*$. For example, we will write 5 instead of ${}^*5$ for $[\langle 5, 5, \ldots\rangle]$.
(d) We will use the usual notation $<$, $+$, and $\cdot$ for $\boldsymbol{<}$, $\boldsymbol{+}$, and $\boldsymbol{\cdot}$. Again the context will settle any possible confusion.
(e) $\mathcal{R}$ will from now on denote the structure consisting of the set $R$ together with all relations and in particular all functions on $R$. ${}^*\mathcal{R}$ will denote the structure consisting of ${}^*R$ and the extensions of $n$-ary relations and functions on $R$. Any relation or function in ${}^*\mathcal{R}$ is the extension of a standard one.

We will deal with the structures $\mathcal{R}$ and ${}^*\mathcal{R}$ as special cases of structures known as relational systems.

2.6 Definition A relational system is a structure $\mathcal{S} = (S, \{P_i : i \in I\}, \{f_j : j \in J\})$ consisting of a set $S$, a collection of relations $P_i$ ($i \in I$) on $S$, and a collection of functions $f_j$ ($j \in J$) on $S$.
It should be noted that a given relational system does not necessarily contain all relations and functions on $S$. The formal exhibition of functions in $\mathcal{S}$ is needed for the definition of "terms" in the next section. In the next two sections we present the elements of symbolic logic for relational systems to facilitate work with the transfer principle.

Exercises I.2

1. Complete the proof that ${}^*P$, as defined in 2.2, is well defined.
2. Show that ${}^*{+}$, ${}^*{\cdot}$, and ${}^*{<}$ are $\boldsymbol{+}$, $\boldsymbol{\cdot}$, and $\boldsymbol{<}$, respectively.
3. If $A$ is a subset of $R$, show that ${}^*A \supseteq (A)_*$.
4. Show that if $Z$ denotes the set of integers, then ${}^*Z$ is a subring of ${}^*\mathcal{R}$ under the induced operations of addition and multiplication from $+$ and $\cdot$.
5. Prove Theorem 2.3.
6. Prove Proposition 2.4.
7. Show that if $A$ is finite, then ${}^*A = (A)_*$.
8. Let $A$ and $B$ be two subsets of $R$, and show that ${}^*(A \cap B) = {}^*A \cap {}^*B$ and ${}^*(A \cup B) = {}^*A \cup {}^*B$.
9. Show that ${}^*(\operatorname{dom} P) = \operatorname{dom} {}^*P$ for any $n$-ary relation $P$.
10. Let $f$ be a 1-1 function, and show that ${}^*f$ is 1-1.
I.3 Simple Languages for Relational Systems
From this point (with the exception of §I.15) we suppress the construction of ${}^*\mathcal{R}$ and work instead with the properties obtained through use of the transfer principle. This facilitates our work in the same way that work with the set of real numbers is helped by the suppression of its construction via Dedekind cuts or Cauchy sequences.

It is not enough to extend functions and relations to ${}^*R$; the extensions must follow the same rules as the original functions and relations. For example, we want ${}^*\sin(x + 2\pi) = {}^*\sin(x)$ for all $x \in {}^*R$. To make this precise, we must make the notion of "rule" precise, and for that we need the formal language which we begin to develop here.

At the end of the last section we remarked that both $\mathcal{R}$ and ${}^*\mathcal{R}$ could be regarded as relational systems, that is, structures of the form $\mathcal{S} = (S, \{P_i : i \in I\}, \{f_j : j \in J\})$, where $S$ is some basic set, the $P_i$ ($i \in I$) are relations, and the $f_j$ ($j \in J$) are functions on $S$. Given any such structure $\mathcal{S}$ (it need not be $\mathcal{R}$ or ${}^*\mathcal{R}$), we present in this section a symbolic language $L_{\mathcal{S}}$ in which we may make mathematical assertions about $\mathcal{S}$. In the next section we will show how to interpret formal sentences in $L_{\mathcal{S}}$ in the relational structure $\mathcal{S}$, and also indicate how to translate informal mathematical statements about $\mathcal{S}$ into sentences in $L_{\mathcal{S}}$.

The transfer principle is stated (in §I.5) and proved (in §I.15) in the context of a formal language. Later, with a little experience in translating mathematical statements into formal sentences and vice versa, this formality will usually be unnecessary. Thus, in subsequent chapters of this book the transfer principle will be applied directly to informal mathematical statements; at this stage, however, it is important to develop confidence in the translation procedure.

The language consists of (a) a basic set of symbols and (b) combinations of those basic symbols into formal sentences of a particular type which we will call simple sentences.

The basic symbols of $L_{\mathcal{S}}$ fall into two categories. The first category consists of logical symbols which are common to any simple language and do not vary if $\mathcal{S}$ is changed. These are

3.1 Logical Connectives The symbols $\wedge$ and $\rightarrow$, to be interpreted later as "and" and "implies."

3.2 Quantifier Symbol The symbol $\forall$, to be interpreted as "for all."

3.3 Parentheses The symbols [ , ], ( , ), and $\langle$ , $\rangle$, to be used as usual in mathematics for bracketing.

3.4 Variable Symbols A countable collection of symbols like $x$, $y$, $x_1$, $x_2$, and $m$, $n$, to be used as "variables."

Only a countable collection is necessary since each sentence will involve only finitely many variables, and we write down only finitely many sentences in any proof.

The symbols in the second category depend on $\mathcal{S}$ and will be called parameters. They consist of

3.5 Constant Symbols A symbol $\underline{s}$ called the name of $s$ for each $s$ in $S$.
3.6 Relation Symbols A relation symbol $\underline{P}$ for each relation $P$ in $\mathcal{S}$. We call $\underline{P}$ the name of $P$.

3.7 Function Symbols A function symbol $\underline{f}$ for each function $f$ in $\mathcal{S}$. We call $\underline{f}$ the name of $f$.

Whenever possible in this and the next section we underline to denote constant, relation, and function symbols. But, for convenience, familiar constants, relations, and functions like 1, 2, $\pi$, $<$, $+$, and sin will be named by the same symbol. Note that we do not rule out the possibility of more than one name for each entity.

We next describe how to make up meaningful combinations of the above symbols. The first step is to show how to build up familiar expressions like $\underline{f}(x)$, $\underline{g}(2, \underline{f}(x, y))$, and $\sin(x + \pi)$. Such expressions are like composite functions in usual mathematical notation, and use constant, variable, and function symbols. They are special cases of terms, which are defined as follows.

3.8 Definition Terms are defined inductively as follows:

(i) Each constant and variable symbol is a term.
(ii) If $\underline{f}$ is the name of a function of $n$ variables and $\tau^1, \ldots, \tau^n$ are terms, then $\underline{f}(\tau^1, \ldots, \tau^n)$ is a term.
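A term containing no variables is called a constant term. In $L_{\mathcal{R}}$, for example, the expression $xy + xz$ is a term which could be written more formally as $\underline{S}(\underline{P}(x, y), \underline{P}(x, z))$, where $S$ and $P$ are the functions defined by $S(a, b) = a + b$ and $P(a, b) = ab$ for $a, b \in R$.

We can now be precise about the form of the mathematical sentences to be used in the rest of the chapter; they will be called simple sentences in $L_{\mathcal{S}}$.

3.9 Definition A simple sentence is a string of symbols in $L_{\mathcal{S}}$ which takes either of the following forms:

(A) Atomic sentences. Such sentences are of the form $\underline{P}\langle \tau^1, \ldots, \tau^n\rangle$, where $\underline{P}$ is the name of an $n$-ary relation and the $\tau^i$ ($i = 1, \ldots, n$) are constant terms.
(B) Compound sentences. Such sentences are of the form
\[
(\forall x_1) \cdots (\forall x_n)\Bigl[\, \bigwedge_{i=1}^{k} \underline{P}_i\langle \vec{\tau}_i\rangle \rightarrow \bigwedge_{j=1}^{l} \underline{Q}_j\langle \vec{\sigma}_j\rangle \Bigr],
\]
where $\bigwedge_{i=1}^{k} S_i$ denotes $S_1 \wedge \cdots \wedge S_k$, and $\vec{\tau}_i$ and $\vec{\sigma}_j$ are $n_i$-tuples and $n_j$-tuples of terms (e.g., $\vec{\tau}_i = \langle \tau_i^1, \ldots, \tau_i^{n_i}\rangle$) involving no other variables than $x_1, \ldots, x_n$, $n_i$ and $n_j$ being the orders of the relations named by the symbols $\underline{P}_i$ and $\underline{Q}_j$.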
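The inductive definition of terms in 3.8 maps naturally onto a small recursive data structure. The following is a minimal sketch, not part of the text's formalism: the class names `Const`, `Var`, `App` and the helper functions are hypothetical, and function symbols are represented simply by their names as strings.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Const:
    """A constant symbol naming an element of S (clause (i))."""
    name: str


@dataclass(frozen=True)
class Var:
    """A variable symbol such as x, y, or x1 (clause (i))."""
    name: str


@dataclass(frozen=True)
class App:
    """A function symbol applied to a tuple of terms (clause (ii))."""
    func: str
    args: Tuple["Term", ...]


Term = Union[Const, Var, App]


def variables(term: Term) -> set:
    """Collect the names of all variable symbols occurring in a term."""
    if isinstance(term, Var):
        return {term.name}
    if isinstance(term, Const):
        return set()
    found = set()
    for arg in term.args:
        found |= variables(arg)
    return found


def is_constant_term(term: Term) -> bool:
    """A constant term is a term containing no variables."""
    return not variables(term)


# xy + xz written formally as S(P(x, y), P(x, z)).
x, y, z = Var("x"), Var("y"), Var("z")
distributed = App("S", (App("P", (x, y)), App("P", (x, z))))

# S(2, P(2, 3)) contains no variables, so it is a constant term.
closed = App("S", (Const("2"), App("P", (Const("2"), Const("3")))))
```

Here `is_constant_term(closed)` holds while `is_constant_term(distributed)` does not, which is exactly the distinction that atomic sentences in Definition 3.9(A) rely on.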
For example, if $\underline{I}$ is the name of the inequality relation $I$ (so that $I\langle a, b\rangle$ if $a < b$), then the expression $\underline{I}\langle 1, 2\rangle$ is an atomic sentence. The sentence $\underline{I}\langle 1, 2\rangle$ will usually be written in the more familiar form $1 < 2$. This and similar conventions will be used here and throughout the text. The expression
\[
(\forall x)(\forall y)(\forall z)[\underline{R}\langle x\rangle \wedge \underline{R}\langle y\rangle \wedge \underline{R}\langle z\rangle \rightarrow x \cdot (y + z) = x \cdot y + x \cdot z],
\]
where $\underline{R}$ is the name of the unary relation $R$ defined by $R\langle a\rangle$ iff $a$ is a real number, is a compound sentence which, as we shall see in the next section, expresses the fact that the distributive law holds in $\mathcal{R}$. On the other hand, the expression
\[
(\forall x)[\underline{R}\langle x\rangle \wedge x < 1 \rightarrow x < y]
\]
is not a simple sentence, since the variable $y$ occurs within the square brackets without a corresponding $(\forall y)$ outside.

We will use the symbol $\leftrightarrow$ to abbreviate pairs of compound sentences which differ only in that $\bigwedge_{i=1}^{k} \underline{P}_i\langle \vec{\tau}_i\rangle$ and $\bigwedge_{j=1}^{l} \underline{Q}_j\langle \vec{\sigma}_j\rangle$ are reversed. For example, the pair of sentences $(\forall x)[(x = x) \rightarrow \underline{R}\langle x\rangle]$ and $(\forall x)[\underline{R}\langle x\rangle \rightarrow (x = x)]$ will be abbreviated as $(\forall x)[(x = x) \leftrightarrow \underline{R}\langle x\rangle]$.

The use of simple sentences as the only type of sentence in the language and the corresponding use of "Skolem functions" (defined in §I.4) was suggested by a similar use of equalities, inequalities, and Skolem functions in H. J. Keisler's work [23,24].
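The requirement that every variable inside the square brackets be bound by a quantifier outside can be checked mechanically. The sketch below is an illustration under our own assumptions: terms are encoded as nested tuples, and the class names `Atom` and `Compound` are hypothetical, not notation from the text.

```python
from dataclasses import dataclass
from typing import Tuple

# Terms are modelled as nested tuples: ("var", name), ("const", name),
# or ("app", function_name, term, term, ...). This encoding is an
# assumption made for the sketch only.


def term_variables(term) -> set:
    """Names of the variable symbols occurring in a term."""
    kind = term[0]
    if kind == "var":
        return {term[1]}
    if kind == "const":
        return set()
    found = set()
    for sub in term[2:]:
        found |= term_variables(sub)
    return found


@dataclass(frozen=True)
class Atom:
    """A relation symbol applied to a tuple of terms."""
    relation: str
    args: Tuple[tuple, ...]

    def variables(self) -> set:
        found = set()
        for term in self.args:
            found |= term_variables(term)
        return found


@dataclass(frozen=True)
class Compound:
    """(forall x1)...(forall xn)[P1 and ... and Pk -> Q1 and ... and Ql]."""
    quantified: Tuple[str, ...]
    hypotheses: Tuple[Atom, ...]
    conclusions: Tuple[Atom, ...]

    def free_variables(self) -> set:
        used = set()
        for atom in self.hypotheses + self.conclusions:
            used |= atom.variables()
        return used - set(self.quantified)

    def is_simple(self) -> bool:
        """True when no variable occurs without a matching quantifier."""
        return not self.free_variables()


x, y = ("var", "x"), ("var", "y")
one = ("const", "1")

# (forall x)[R<x> and x < 1 -> x < y]: y is free, so this is not simple.
not_simple = Compound(
    quantified=("x",),
    hypotheses=(Atom("R", (x,)), Atom("<", (x, one))),
    conclusions=(Atom("<", (x, y)),),
)

# Adding (forall y) repairs the syntax (truth is a separate question).
repaired = Compound(
    quantified=("x", "y"),
    hypotheses=(Atom("R", (x,)), Atom("<", (x, one))),
    conclusions=(Atom("<", (x, y)),),
)
```

The check is purely syntactic: `repaired` is a well-formed simple sentence even though it is false in the real numbers.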
Exercises I.3

In the following we let $\mathcal{S} = \mathcal{R}$, where $\mathcal{R}$ consists of the basic set $R$, the collection $\{P_i : i \in I\}$ of all relations on $R$, and the collection $\{f_j : j \in J\}$ of all functions on $R$. 1. Let $\underline{f}$, $\underline{g}$, and $\underline{h}$ be names of functions of 1, 2, and 3 variables, respectively, and let $x$, $y$, and $z$ denote variables. Which of the following are terms in $L_{\mathcal{R}}$?
(a) WX, 2,f(&, 4 )) (b) f(g(Y*1)!!(1, (4 b(Li(2),g(A3)) 2. In Exercise 1, which of the expressions are constant terms? 3. Let P and S denote the binary functions of product and sum in R (see remarks after Definition 3.8). Also let I and R denote the relations defined in the text. Which of the following expressions are atomic sentences and which are compound sentences? 1 9 4 )
(a) $\underline{I}\langle \underline{S}(2, \underline{P}(2, 3)), 9\rangle$ (b) $\underline{I}\langle 2, \underline{S}(x, 2)\rangle$
I.4 Interpretation of Simple Sentences
In preparation for the statement of the transfer principle, which will be presented in the next section, we now show how a simple sentence in $L_{\mathcal{S}}$ is interpreted in a relational system $\mathcal{S} = (S, \{P_i : i \in I\}, \{f_j : j \in J\})$; i.e., we show how to determine when such a sentence is true or false.

The process of interpretation was already implicitly begun in §I.3 by designating symbols like $\underline{s}$, $\underline{P}$, and $\underline{f}$ in $L_{\mathcal{S}}$ as names of the corresponding entities $s$, $P$, and $f$ in $\mathcal{S}$. It will be seen that the interpretation of simple sentences is completely determined once this naming has been specified. The interpretation is, in fact, the obvious one consistent with the interpretation of $\forall$, $\wedge$, and $\rightarrow$ as "for all," "and," and "implies," respectively.

We first show how to interpret constant terms, i.e., those containing no variable symbols. In defining "terms" we have taken no account of the domain of definition of the functions whose names may occur in the terms, and this must be accounted for in the interpretation. Suppose that the constant term is just a constant symbol $\underline{s}$ naming $s \in S$. Then this term will be interpreted as the corresponding $s$. If the term is of the form $\underline{f}(\underline{s}^1, \ldots, \underline{s}^n)$, where $\underline{f}$ is a symbol for a function of $n$ variables and names the function $f$, and the $\underline{s}^i$ name $s^i \in S$ ($i = 1, \ldots, n$), then we obviously should interpret $\underline{f}(\underline{s}^1, \ldots, \underline{s}^n)$ as the constant $f(s^1, \ldots, s^n)$ as long as $f(s^1, \ldots, s^n)$ is defined, i.e., $\langle s^1, \ldots, s^n\rangle$ is in the domain of $f$. We proceed inductively to more general constant terms as follows.
4.1 Definition A constant term is interpretable in $\mathcal{S}$ if either

(i) it is a constant symbol $\underline{s}$ naming an element $s \in S$, in which case it is interpreted as $s$, or
(ii) it is of the form $\underline{f}(\tau^1, \ldots, \tau^n)$, where the terms $\tau^1, \ldots, \tau^n$ are interpretable in $\mathcal{S}$ and hence can be interpreted as the elements $s^1, \ldots, s^n \in S$, and the $n$-tuple $\langle s^1, \ldots, s^n\rangle$ is in the domain of the function $f$ named by $\underline{f}$; in this case $\underline{f}(\tau^1, \ldots, \tau^n)$ is interpreted as $f(s^1, \ldots, s^n)$.

Note that terms $\tau$ which are interpretable contain no variable symbols and are interpreted as the unique elements in $S$ which can be obtained by imitating in $\mathcal{S}$ the process of constructing $\tau$. In $\mathcal{R}$, for example, the term $\sin\sqrt{1 + (\pi/2)^2}$ is interpretable since, inductively, 1 is defined by (i), $1 + (\pi/2)^2$ is interpretable by (ii), $\sqrt{1 + (\pi/2)^2}$ is interpretable by (ii), and finally $\sin\sqrt{1 + (\pi/2)^2}$ is interpretable by (ii). On the other hand, $1 + \tan(\pi/2)$ is not interpretable since $\pi/2$ is not in the domain of the function tan.

We can now state when a simple sentence in $\mathcal{S}$ is true. A sentence of the form "P implies Q" is regarded as true if whenever P is true then Q is true. More precisely, such a sentence is false only if P is true and Q is false. Thus the following definition is natural.
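4.2 Definition In the notation of Definition 3.9,

(A) the atomic sentence $\underline{P}\langle \tau^1, \ldots, \tau^n\rangle$ is true (or holds) in $\mathcal{S}$ if each of the terms $\tau^i$, $1 \le i \le n$, is interpretable in $\mathcal{S}$ as $s^i$, $1 \le i \le n$, and the $n$-tuple $\langle s^1, \ldots, s^n\rangle$ is in the relation $P$ named by $\underline{P}$ (thus $\underline{P}\langle \tau^1, \ldots, \tau^n\rangle$ does not hold in $\mathcal{S}$ either if one of the $\tau^i$ is not interpretable in $\mathcal{S}$ or if all of the $\tau^i$ are interpretable in $\mathcal{S}$ but the corresponding $n$-tuple $\langle s^1, \ldots, s^n\rangle$ is not in the relation $P$),
(B) the sentence
\[
(\forall x_1) \cdots (\forall x_n)\Bigl[\, \bigwedge_{i=1}^{k} \underline{P}_i\langle \vec{\tau}_i\rangle \rightarrow \bigwedge_{j=1}^{l} \underline{Q}_j\langle \vec{\sigma}_j\rangle \Bigr]
\]
is true (or holds) in $\mathcal{S}$ if, for each replacement of the variable symbols $x_1, \ldots, x_n$ with constant symbols $\underline{s}^1, \ldots, \underline{s}^n$ such that, with this replacement, the atomic sentences $\underline{P}_i\langle \tau_i^1, \ldots, \tau_i^{n_i}\rangle$ are all true in $\mathcal{S}$ ($1 \le i \le k$), the atomic sentences $\underline{Q}_j\langle \sigma_j^1, \ldots, \sigma_j^{n_j}\rangle$, $1 \le j \le l$, are also true in $\mathcal{S}$ with the same replacement (we are assuming that each of the constant symbols $\underline{s}^i$, $1 \le i \le n$, names an element of $S$).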
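Definitions 4.1 and 4.2(A) can be imitated in a few lines of code. The following is a rough sketch under our own assumptions: the tables `FUNCTIONS` and `RELATIONS` are a hypothetical finite fragment of a relational system on the reals, constant symbols are modelled by Python numbers, and "not defined" is modelled by returning `None`. Because floating-point arithmetic cannot represent $\pi/2$ exactly, the sketch uses the reciprocal at 0, rather than the tangent at $\pi/2$, as its example of a term that is not interpretable.

```python
import math

# Hypothetical fragment of a relational system on the reals. Each function
# returns None outside its domain, which models "not defined" in the text.
FUNCTIONS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "sin": lambda a: math.sin(a),
    "sqrt": lambda a: math.sqrt(a) if a >= 0 else None,
    "recip": lambda a: 1.0 / a if a != 0 else None,
}

RELATIONS = {
    "gt": lambda a, b: a > b,
    "not_gt": lambda a, b: not a > b,   # the complement of "gt"
}


def interpret(term):
    """Value of a constant term, or None if the term is not interpretable."""
    if isinstance(term, (int, float)):      # clause (i): a constant symbol
        return float(term)
    name, *subterms = term                  # clause (ii): f(t1, ..., tn)
    values = [interpret(sub) for sub in subterms]
    if any(value is None for value in values):
        return None                         # some subterm is not interpretable
    return FUNCTIONS[name](*values)


def holds(relation, *terms):
    """Truth of an atomic sentence in the sense of Definition 4.2(A)."""
    values = [interpret(term) for term in terms]
    if any(value is None for value in values):
        return False                        # a term is not interpretable
    return RELATIONS[relation](*values)


half_pi = math.pi / 2
nested = ("sin", ("sqrt", ("add", 1, ("mul", half_pi, half_pi))))
undefined = ("add", 1, ("recip", 0))
```

With these tables, `interpret(nested)` is a number while `interpret(undefined)` is `None`, and both `holds("gt", undefined, 0)` and `holds("not_gt", undefined, 0)` are false: an atomic sentence containing a term that is not interpretable is not true, and neither is the sentence built from the complementary relation.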
This scheme for interpreting simple sentences in $\mathcal{S}$ is only the obvious one consistent with the interpretation of $\wedge$ and $\rightarrow$ as "and" and "implies," respectively. In most cases of interest it will be clear whether a given sentence is true in $\mathcal{S}$ or not. For example, the atomic sentence
\[
1 + (\pi/2)^2 > 2 \tag{4.1}
\]
is true in $\mathcal{R}$. Likewise, the sentence
\[
(\forall x)(\forall y)[x > 0 \wedge y > 0 \rightarrow xy > 0] \tag{4.2}
\]
is true in $\mathcal{R}$, and expresses the fact that the product of positive real numbers is positive. Note that
\[
(\forall x)[(\sqrt{x} \ge -1) \rightarrow (\sqrt{x} \ge 0)] \tag{4.3}
\]
is true in $\mathcal{R}$, since $\sqrt{a}$ is interpretable only for $a \ge 0$, and if $a \ge 0$ then $\sqrt{a} \ge -1$ and $\sqrt{a} \ge 0$. On the other hand,
\[
(\forall x)[\underline{R}\langle x\rangle \rightarrow \sqrt{x} \ge 0] \tag{4.4}
\]
is not true in $\mathcal{R}$, since $\sqrt{a}$ is not defined for all real numbers $a$. Many examples
will be presented in this and the succeeding sections which will provide practice in deciding on the truth in $\mathcal{S}$ of simple sentences.

It is even more important to be able to translate an informal mathematical statement about a relational system $\mathcal{S}$ (involving English phrases like "for all," "there exists," "and," and "or") into a sentence in $L_{\mathcal{S}}$ which has the same interpretation. The rest of this section is devoted to some examples and remarks concerning this problem.

A basic problem in translation is that the simple language of this chapter involves only formal analogues of the phrases "for all," "and," and "implies," and these must occur in certain formal combinations in a simple sentence, whereas informal mathematical statements often involve phrases like "there exists," "or," and "not." It is not always easy to decide whether there exists a corresponding sentence in our simple language having the same interpretation. Fortunately, the specific translations which are necessary to do calculus will not cause any difficulty. Some typical examples follow.

Sometimes the translation is direct. Take, for example, the statement "The distributive law holds in $R$," or, more precisely, "For all real numbers $x$, $y$, and $z$, $x(y + z) = xy + xz$." Some simple sentences in $L_{\mathcal{R}}$ which each correspond to this statement are
\[
(\forall x)(\forall y)(\forall z)[\underline{R}\langle x\rangle \wedge \underline{R}\langle y\rangle \wedge \underline{R}\langle z\rangle \rightarrow x(y + z) = xy + xz] \tag{4.5}
\]
and
\[
(\forall x)(\forall y)(\forall z)[1 = 1 \rightarrow x(y + z) = xy + xz]. \tag{4.6}
\]
In the latter sentence we have used the fact that the only substitutions allowed for variables in $L_{\mathcal{R}}$ are names of elements in $R$. We often, however, write $\underline{R}\langle x\rangle$ for clarity.

Mathematical statements involving "not" attached to a given $n$-ary relation $P$ on $S$ can often be restated using the complement $P'$ of $P$. For example, corresponding to the true statement "2 is not less than 1" is the atomic sentence $\underline{I}'\langle 2, 1\rangle$, where $I$ is the relation $\{\langle x, y\rangle \in R^2 : x < y\}$ and $I'$ is the complement of $I$ given by $\{\langle x, y\rangle \in R^2 : x \ge y\}$. Note, however, that if a
term in $\underline{P}'\langle \tau^1, \ldots, \tau^n\rangle$ is not interpretable in $\mathcal{S}$, then neither $\underline{P}\langle \tau^1, \ldots, \tau^n\rangle$ nor $\underline{P}'\langle \tau^1, \ldots, \tau^n\rangle$ is true in $\mathcal{S}$.

Statements involving "or" are sometimes more difficult to deal with. Consider the law of trichotomy for the ordering on $R$, "For all real $x$, either $x > 0$ or $x = 0$ or $x < 0$." This can be translated by any one of the three simple sentences
\[
\begin{aligned}
&(\forall x)[\underline{R}\langle x\rangle \wedge x \not< 0 \wedge x \neq 0 \rightarrow x > 0], \\
&(\forall x)[\underline{R}\langle x\rangle \wedge x \not< 0 \wedge x \not> 0 \rightarrow x = 0], \\
&(\forall x)[\underline{R}\langle x\rangle \wedge x \not> 0 \wedge x \neq 0 \rightarrow x < 0],
\end{aligned}
\tag{4.7}
\]
where we have written $x \not< 0$ and $x \not> 0$ for the expressions $\underline{I}'\langle x, 0\rangle$ and $\underline{I}'\langle 0, x\rangle$.

Statements involving "there exists" are translated by the technical artifice of introducing so-called Skolem functions, which will occur frequently in later sections of this chapter. Consider, for example, the true statement "For each nonzero $x$ in $R$ there exists a $y$ in $R$ so that $xy = 1$." Notice that the statement asserts the existence of a special function $\psi$ of one variable whose domain is the set of nonzero reals, and which satisfies $x\psi(x) = 1$ [so that $\psi(x) = x^{-1}$]. An equivalent mathematical statement using this special function $\psi$ is thus "For all nonzero real $x$, $x\psi(x) = 1$," which translates to the simple sentence
\[
(\forall x)[\underline{R}\langle x\rangle \wedge x \neq 0 \rightarrow x\underline{\psi}(x) = 1]. \tag{4.8}
\]
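The same device handles other existence statements. As a hedged illustration (the statement, the unary relation $N$ for the natural numbers, and the particular choice of $\psi$ are ours, not taken from the text), consider "For each real $x$ there exists a natural number $n$ with $n > x$."

```latex
Define $\psi : R \to N$ by
\[
\psi(x) = \max\{\, 1,\ \lfloor x \rfloor + 1 \,\},
\]
so that $\psi(x) \in N$ and $\psi(x) \ge \lfloor x \rfloor + 1 > x$ for every real $x$.
The statement then translates to the simple sentence
\[
(\forall x)[\, \underline{R}\langle x\rangle \rightarrow
\underline{N}\langle \underline{\psi}(x)\rangle \wedge \underline{\psi}(x) > x \,],
\]
which has the form required by Definition 3.9(B) with $k = 1$ and $l = 2$.
```

Any other function choosing a natural number larger than $x$ would serve equally well; the translation needs only that some such choice function exists.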
The function $\psi$ is an example of a Skolem function. Clearly the same trick can be used systematically in other situations to remove the expression "there exists." As another example of a Skolem function, let $f$ be a function of $n$ variables with domain $A$ and range $B \subseteq R$. The fact that $B$ is the range of $f$ can be expressed using $n$ Skolem functions $\psi_i$, $1 \le i \le n$, with the two simple sentences
\[
\begin{aligned}
&(\forall x_1)(\forall x_2) \cdots (\forall x_n)[\underline{f}(x_1, \ldots, x_n) = \underline{f}(x_1, \ldots, x_n) \rightarrow \underline{B}\langle \underline{f}(x_1, \ldots, x_n)\rangle], \\
&(\forall y)[\underline{B}\langle y\rangle \rightarrow \underline{f}(\underline{\psi}_1(y), \ldots, \underline{\psi}_n(y)) = y].
\end{aligned}
\tag{4.9}
\]
How could the $\psi_i$ be defined?

The ideas just presented do not constitute a general translation scheme between statements and simple sentences, but will suffice for the problems presented in this chapter. In the next chapter we present a richer formal language for more general mathematical structures which will involve formal analogues of "there exists," "or," and "not," and so will avoid Skolem functions. We have restricted ourselves to simple sentences in this chapter because the transfer principle is easier to state and prove for these sentences
and because this restriction allows a more gradual introduction to the general techniques of nonstandard analysis.

Exercises I.4

1. Show in detail that the sentence (4.1) is true when interpreted in $\mathcal{R}$.
2. Show that the sentences (4.9) express the fact that $B$ is the range of the function $f$ of $n$ variables. In doing so, define the Skolem functions $\psi_i$, $1 \le i \le n$

(a) bounded by 1 in absolute value,
(b) periodic with period $2\pi$.

8. Write a simple sentence in $L_{\mathcal{R}}$ which expresses the fact that the function $f$ on $R$ is continuous, i.e., given $\varepsilon > 0$ there is a $\delta > 0$ so that $|x - a| < \delta$ implies $|f(x) - f(a)| < \varepsilon$.
9. Let $A$, $B$, and $C$ denote unary relations defining subsets of $R$. Write simple sentences whose interpretation in $\mathcal{R}$ asserts that
(a) $A \subseteq B$,
(b) $A = B$,
(c) $C = A \cap B$,
(d) $C = A \cup B$.
I.5 The Transfer Principle for Simple Sentences
We are now able to state accurately the transfer principle for simple sentences in $L_{\mathcal{R}}$. The proof will be deferred to the end of the chapter. In the intervening sections we will present many applications of the principle which
should convince the reader that it is a very powerful tool. Moreover, it will be clear that one need not know the proof of the transfer principle to apply it successfully. A transfer principle for more general sentences and more general mathematical structures will be presented in Chapter II.

We first introduce the notion of the *-transform of a sentence in $L_{\mathcal{R}}$. Here, we adopt the following conventions.

5.1 Conventions

(a) If $\underline{r}$ is a name in $L_{\mathcal{R}}$ of $r \in R$ then $\underline{r}$ is also a name in $L_{{}^*\mathcal{R}}$ of ${}^*r \in {}^*R$ (remember that we identify $r$ and ${}^*r$).
(b) If $\underline{P}$ is a name in $L_{\mathcal{R}}$ of the relation $P$ on $R$ then ${}^*\underline{P}$ is a name in $L_{{}^*\mathcal{R}}$ of the relation ${}^*P$ on ${}^*R$. In particular,
(c) If $\underline{f}$ is a name in $L_{\mathcal{R}}$ of the function $f$ on $R$ then ${}^*\underline{f}$ is a name in $L_{{}^*\mathcal{R}}$ of the function ${}^*f$ on ${}^*R$.
(d) The symbols $<$, $+$, and $\cdot$ will denote the corresponding relation and functions in $\mathcal{R}$ and ${}^*\mathcal{R}$.

Stated briefly, the *-transform of a simple sentence $\Phi$ in $L_{\mathcal{R}}$ is the simple sentence ${}^*\Phi$ in $L_{{}^*\mathcal{R}}$ obtained by starring all function and relation symbols in the sentence $\Phi$. This is made more precise in the following definitions.
5.2 Definition The *-transform of terms is defined by induction as follows:

(i) If $\tau$ is a constant or variable symbol then ${}^*\tau = \tau$.
(ii) If $\tau = \underline{f}(\tau^1, \ldots, \tau^n)$ then ${}^*\tau = {}^*\underline{f}({}^*\tau^1, \ldots, {}^*\tau^n)$.

For example, the *-transform of the term $\underline{f}(x, \underline{g}(y + n))$ is the term ${}^*\underline{f}(x, {}^*\underline{g}(y + n))$ [notice again that $+$ should also be starred, but we avoid the awkward notation ${}^*{+}$ by Convention 5.1(d)].
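5.3 Definition If $\Phi$ is a simple sentence in $L_{\mathcal{R}}$ we define the *-transform ${}^*\Phi$ of $\Phi$ as follows:

(a) If $\Phi$ is the atomic sentence $\underline{P}\langle \tau^1, \ldots, \tau^n\rangle$ then ${}^*\Phi$ is ${}^*\underline{P}\langle {}^*\tau^1, \ldots, {}^*\tau^n\rangle$.
(b) If $\Phi$ is the sentence
\[
(\forall x_1) \cdots (\forall x_n)\Bigl[\, \bigwedge_{i=1}^{k} \underline{P}_i\langle \vec{\tau}_i\rangle \rightarrow \bigwedge_{j=1}^{l} \underline{Q}_j\langle \vec{\sigma}_j\rangle \Bigr]
\]
then ${}^*\Phi$ is the sentence
\[
(\forall x_1) \cdots (\forall x_n)\Bigl[\, \bigwedge_{i=1}^{k} {}^*\underline{P}_i\langle {}^*\vec{\tau}_i\rangle \rightarrow \bigwedge_{j=1}^{l} {}^*\underline{Q}_j\langle {}^*\vec{\sigma}_j\rangle \Bigr],
\]
where ${}^*\vec{\tau} = \langle {}^*\tau^1, \ldots, {}^*\tau^n\rangle$ if $\vec{\tau} = \langle \tau^1, \ldots, \tau^n\rangle$.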
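Thus, for example, the *-transform of the simple sentence
\[
(\forall x)(\forall y)(\forall z)[\underline{R}\langle x\rangle \wedge \underline{R}\langle y\rangle \wedge \underline{R}\langle z\rangle \wedge x < y \rightarrow x + z < y + z] \tag{5.1}
\]
is the sentence
\[
(\forall x)(\forall y)(\forall z)[{}^*\underline{R}\langle x\rangle \wedge {}^*\underline{R}\langle y\rangle \wedge {}^*\underline{R}\langle z\rangle \wedge x < y \rightarrow x + z < y + z], \tag{5.2}
\]
which expresses the fact (true in ${}^*\mathcal{R}$) that if $x, y, z \in {}^*R$ and $x < y$ then $x + z < y + z$. We now come to the main result of this section.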
5.4 Theorem (Transfer Principle) If $\Phi$ is a simple sentence in $L_{\mathcal{R}}$ which is true in $\mathcal{R}$, then ${}^*\Phi$ is true in ${}^*\mathcal{R}$.
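Before the formal applications, a one-step use of the theorem may help fix ideas. The sketch below applies transfer to sentence (4.2); it assumes Convention 5.1(d), under which $<$ and $\cdot$ are left unstarred, and it writes $\omega$ for the infinite number of §I.1 only as an illustration.

```latex
\[
\Phi:\quad (\forall x)(\forall y)[\, x > 0 \wedge y > 0 \rightarrow xy > 0 \,]
\qquad \text{(true in } \mathcal{R}\text{)}.
\]
Since no symbol in $\Phi$ needs a star, ${}^*\Phi$ is the same string of symbols,
now interpreted in ${}^*\mathcal{R}$ with the variables replaced by names of
elements of ${}^*R$. By Theorem 5.4,
\[
x, y \in {}^*R,\quad x > 0,\quad y > 0 \ \Longrightarrow\ xy > 0 .
\]
In particular, taking $x = y = 1/\omega$ gives $1/\omega^2 > 0$.
```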
As examples of the application of Theorem 5.4 we will establish some elementary results which will be useful later. Some of these have already been presented in §§I.1 and I.2. Throughout the rest of this book we will often say that (the truth of) a certain mathematical statement ${}^*\Phi$ about a nonstandard structure ${}^*\mathcal{S}$ follows by transfer of a sentence $\Phi$ in $L_{\mathcal{S}}$. This means that the truth of the statement follows by the transfer principle from the fact that $\Phi$ is true in $\mathcal{S}$.

5.5 Proposition Let $P$ be an $n$-ary relation on $R$ and $\chi_P$ its characteristic function on $R^n$. Then ${}^*P$ is an extension of $P$. Also ${}^*\chi_P = \chi_{{}^*P}$ and ${}^*(P') = ({}^*P)'$.
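Proof: Suppose $\langle c^1, \ldots, c^n\rangle \in P$. Then the atomic sentence $\underline{P}\langle \underline{c}^1, \ldots, \underline{c}^n\rangle$ is true in $\mathcal{R}$. By transfer, ${}^*P\langle c^1, \ldots, c^n\rangle$; i.e., ${}^*P$ is an extension of $P$. The rest follows by transfer of the four sentences
\[
(\forall x_1) \cdots (\forall x_n)[\underline{R}\langle x_1\rangle \wedge \cdots \wedge \underline{R}\langle x_n\rangle \rightarrow \underline{\chi}_P(x_1, \ldots, x_n) = \underline{\chi}_P(x_1, \ldots, x_n)], \tag{5.3}
\]
\[
(\forall x_1) \cdots (\forall x_n)[\underline{\chi}_P(x_1, \ldots, x_n) \neq 1 \rightarrow \underline{\chi}_P(x_1, \ldots, x_n) = 0], \tag{5.4}
\]
\[
(\forall x_1) \cdots (\forall x_n)[\underline{P}\langle x_1, \ldots, x_n\rangle \leftrightarrow \underline{\chi}_P(x_1, \ldots, x_n) = 1], \tag{5.5}
\]
\[
(\forall x_1) \cdots (\forall x_n)[\underline{P}'\langle x_1, \ldots, x_n\rangle \leftrightarrow \underline{\chi}_P(x_1, \ldots, x_n) = 0], \tag{5.6}
\]
which are true in $\mathcal{R}$. $\Box$

5.6 Proposition If $f$ is a function of $n$ variables on $R$, then ${}^*f$ is a function of $n$ variables and is an extension of $f$ with ${}^*(\operatorname{dom} f) = \operatorname{dom} {}^*f$ and ${}^*(\operatorname{range} f) = \operatorname{range} {}^*f$.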
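Proof: That ${}^*f$ is a function follows from the definition of ${}^*f$. It also follows by transfer of the sentence
\[
(\forall x_1) \cdots (\forall x_n)(\forall y)(\forall z)[\underline{f}\langle x_1, \ldots, x_n, y\rangle \wedge \underline{f}\langle x_1, \ldots, x_n, z\rangle \rightarrow y = z]. \tag{5.7}
\]
That ${}^*f$ is an extension of $f$ follows from Proposition 5.5. Transfer of the sentence
\[
(\forall x_1) \cdots (\forall x_n)[\underline{\operatorname{dom} f}\langle x_1, \ldots, x_n\rangle \leftrightarrow \underline{R}\langle \underline{f}(x_1, \ldots, x_n)\rangle] \tag{5.8}
\]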
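yields ${}^*(\operatorname{dom} f) = \operatorname{dom} {}^*f$. To show that ${}^*(\operatorname{range} f) = \operatorname{range} {}^*f$ is a little tricky, and so we consider the case $n = 1$ first. The trick is in noticing that there is a Skolem function $\psi$ defined on range $f$ so that $f(\psi(b)) = b$ for each $b \in \operatorname{range} f$. Thus the sentences
\[
(\forall x)(\forall y)[\underline{f}(x) = y \rightarrow \underline{\operatorname{range} f}\langle y\rangle] \tag{5.9}
\]
and
\[
(\forall y)[\underline{\operatorname{range} f}\langle y\rangle \rightarrow \underline{f}(\underline{\psi}(y)) = y] \tag{5.10}
\]
are true in $\mathcal{R}$. By transfer, the first says that $y \in \operatorname{range} {}^*f$ implies that $y \in {}^*(\operatorname{range} f)$, and the second yields the converse implication. The general case is now clear, and is left to the reader [see sentences (4.9)]. $\Box$

5.7 Proposition If $f$ and $g$ are functions of $n$ variables on $R$, then for $x = \langle x_1, \ldots, x_n\rangle$ we have

(i) ${}^*(f + g)(x) = {}^*f(x) + {}^*g(x)$ and ${}^*(f \cdot g)(x) = {}^*f(x) \cdot {}^*g(x)$ when $x \in \operatorname{dom} {}^*f \cap \operatorname{dom} {}^*g$,
(ii) ${}^*|f|(x) = |{}^*f(x)|$ when $x \in \operatorname{dom} {}^*f$.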
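Proof: (i) follows from Exercise 6(b) and by transfer of
\[
(\forall x_1) \cdots (\forall x_n)[\underline{\operatorname{dom} f \cap \operatorname{dom} g}\langle x_1, \ldots, x_n\rangle \rightarrow (\underline{f + g})(x_1, \ldots, x_n) = \underline{f}(x_1, \ldots, x_n) + \underline{g}(x_1, \ldots, x_n)]; \tag{5.11}
\]
a similar sentence holds for the product. (ii) follows by transfer of the sentences
\[
(\forall x)[x \ge 0 \rightarrow |x| = x] \tag{5.12}
\]
and
\[
(\forall x)[x < 0 \rightarrow |x| = -x]. \tag{5.13}
\]
$\Box$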
5.8 Proposition
(i) ${}^*\emptyset = \emptyset$.
(ii) If $A$ and $B$ are two sets in $R^n$ then ${}^*(A \cup B) = {}^*A \cup {}^*B$, ${}^*(A \cap B) = {}^*A \cap {}^*B$, and ${}^*(A') = ({}^*A)'$.
(iii) Let $A_i$ ($i \in I$) be a family of sets in $R^n$. Then $\bigcup {}^*A_i\ (i \in I) \subseteq {}^*[\bigcup A_i\ (i \in I)]$ and $\bigcap {}^*A_i\ (i \in I) \supseteq {}^*[\bigcap A_i\ (i \in I)]$.

Proof: (i) Since $\chi_\emptyset$ is identically zero, it follows from Proposition 5.5 that ${}^*\chi_\emptyset = \chi_{{}^*\emptyset}$ is identically zero, so ${}^*\emptyset$ is empty.

(ii) Using Propositions 5.5 and 5.7, we have
\[
\chi_{{}^*(A \cup B)} = {}^*\chi_{A \cup B} = {}^*(\chi_A + \chi_B - \chi_A\chi_B) = {}^*\chi_A + {}^*\chi_B - {}^*\chi_A\,{}^*\chi_B = \chi_{{}^*A} + \chi_{{}^*B} - \chi_{{}^*A}\chi_{{}^*B} = \chi_{{}^*A \cup {}^*B},
\]
with a similar proof for the intersection (Exercise 1). That ${}^*(A') = ({}^*A)'$ follows directly from Proposition 5.5.

(iii) Let $j \in I$ be fixed. Then $[\bigcup A_i\ (i \in I)]$ is the name of a set in $R^n$ and the sentence
\[
(\forall x)[x \in A_j \rightarrow x \in [\textstyle\bigcup A_i\ (i \in I)]]
\]
is true in $\mathcal{R}$. By transfer we see that ${}^*A_j \subseteq {}^*[\bigcup A_i\ (i \in I)]$. This is true for each $j \in I$, and the result for unions follows by elementary set theory. The result for intersections is similarly established (Exercise 1). $\Box$

We conclude these examples by indicating how to prove Theorem 1.7 using the transfer principle. The field properties are easy. For example, the distributive law follows by transfer of the sentence
\[
(\forall x)(\forall y)(\forall z)[\underline{R}\langle x\rangle \wedge \underline{R}\langle y\rangle \wedge \underline{R}\langle z\rangle \rightarrow x \cdot (y + z) = x \cdot y + x \cdot z], \tag{5.14}
\]
and the existence of multiplicative inverses follows by transfer of
\[
(\forall x)[x \neq 0 \rightarrow x\underline{\psi}(x) = 1], \tag{5.15}
\]
where $\psi(x)$ is the Skolem function defined on the nonzero elements of $R$ by $\psi(x) = x^{-1}$. To prove the law of trichotomy one can use parts (i) and (ii) of Proposition 5.8. The details are left to the reader (Exercise 2).
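The proof of Proposition 5.8(ii) rests on the identity $\chi_{A \cup B} = \chi_A + \chi_B - \chi_A\chi_B$ for standard sets, together with the analogous identity $\chi_{A \cap B} = \chi_A\chi_B$ for intersections. The following is a small sanity check of both identities on finite sets; the universe and the sets `A` and `B` are arbitrary choices made for this sketch, and a finite check is of course no substitute for the general argument.

```python
def chi(subset):
    """Characteristic function of a subset: 1 on the subset, 0 elsewhere."""
    return lambda x: 1 if x in subset else 0


universe = range(6)
A = {0, 1, 2}
B = {2, 3}

chi_A, chi_B = chi(A), chi(B)
chi_union, chi_inter = chi(A | B), chi(A & B)

# chi_{A u B} = chi_A + chi_B - chi_A * chi_B at every point of the universe.
union_ok = all(
    chi_union(x) == chi_A(x) + chi_B(x) - chi_A(x) * chi_B(x)
    for x in universe
)

# chi_{A n B} = chi_A * chi_B at every point of the universe.
inter_ok = all(chi_inter(x) == chi_A(x) * chi_B(x) for x in universe)
```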
Exercises I.5

1. Finish the proofs for 5.8(ii) and (iii).
2. Finish the proof of Theorem 1.7 using the transfer principle.
3. Show, using the transfer principle, that if $B \neq \emptyset$ is an upper-bounded set in $R$ and $C$ is the set of upper bounds of $B$ with least element $c$, then ${}^*C$ is the set of upper bounds of ${}^*B$ and $c$ is the least upper bound of ${}^*B$.
4. Use the transfer principle to show that for each $x \in {}^*R$ there is an $m \in {}^*N$ so that $m \ge x$ (i.e., the *-Archimedean property holds in ${}^*\mathcal{R}$).
5. Let $f$ be a function on $R$ which is continuous at each point $x \in R$. Show that for each $x \in {}^*R$ and each $\varepsilon > 0$ in ${}^*R$ there is a $\delta > 0$ in ${}^*R$ so that if $y \in {}^*R$ and $|x - y| < \delta$ then $|{}^*f(x) - {}^*f(y)| < \varepsilon$. In this case we say that ${}^*f$ is *-continuous at each $x \in {}^*R$.
6. Let $A$ and $B$ be subsets of $R$. Use the transfer principle directly to show that
(a) if $A \subseteq B$ then ${}^*A \subseteq {}^*B$,
(b) ${}^*(A \cap B) = {}^*A \cap {}^*B$,
(c) ${}^*(A \cup B) = {}^*A \cup {}^*B$,
(d) ${}^*(A') = ({}^*A)'$.
7. Use the transfer principle to establish Exercises 9 and 10 in §I.2.
8. Use the transfer principle to show that if $f(x) = \sin x$ then ${}^*f$ is periodic with period $2\pi$.
9. Let $S$ be a real-valued function defined on the natural numbers $N$ (a sequence). Suppose that for each $m \in N$ there is an $n \in N$ so that $S(n) \ge m$. Using the transfer principle, show that for each $m \in {}^*N$ there is an $n \in {}^*N$ so that ${}^*S(n) \ge m$.
10. Use the transfer principle to show that if $|f(x)| \le M$ for all $x \in \operatorname{dom} f$, then $|{}^*f(x)| \le M$ for all $x \in \operatorname{dom} {}^*f$.
11. Use the transfer principle to show that if $f$ is a real-valued function on $R$ and $c \in R$, then ${}^*\{x \in R : f(x) > c\} = \{x \in {}^*R : {}^*f(x) > c\}$.
12. Show that if $P$ and $Q$ are relations on $R$ and $P \neq Q$ then ${}^*P \neq {}^*Q$.
1.6 Infinite Numbers, Infinitesimals, and the Standard Part Map
We have already noted in constructing ${}^*R$ that it contains numbers not in $R$ [we are regarding $R$ as embedded as a subset of ${}^*R$; i.e., $R$ and its image in ${}^*R$ are identified]. For example, the numbers $\omega = [\langle 1, 2, 3, \ldots \rangle]$ and $1/\omega$ are not in $R$. They are called non-standard numbers.
6.1 Definition A hyperreal number is called standard if the number is in $R$ and non-standard otherwise. An $n$-tuple $\langle a^1, \ldots, a^n \rangle$ is standard if each $a^i \in R$ and non-standard otherwise.

Since ${}^*R$ is called the set of nonstandard real numbers, we use a hyphen to indicate elements of ${}^*R - R$. We further classify the numbers in ${}^*R$ by the following definition.

6.2 Definition
(i) A number $s \in {}^*R$ is infinite if $n < |s|$ for all standard natural numbers $n$.
(ii) A number $s \in {}^*R$ is finite if $|s| < n$ for some standard natural number $n$.
(iii) A number $s \in {}^*R$ is infinitesimal if $|s| < 1/n$ for all standard natural numbers $n$.

In the construction of §1.1, we see that the number $\omega = [\langle 1, 2, 3, \ldots \rangle]$ is infinite since, for any $r \in R$, the set $\{i \in N : i > r\}$ is cofinite and thus belongs to the ultrafilter, showing that $\omega > r$ for any standard $r > 0$. There are many more infinite numbers. For example, $\omega + r$ is infinite for any positive $r \in {}^*R$ since $\omega + r > \omega$ (we are using the properties of linear ordering in ${}^*R$). Similarly $1/\omega = [\langle 1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots \rangle]$ is an infinitesimal number, but again there are many more since the reciprocal of any infinite number is infinitesimal (Exercise 1). The number 0 is the only standard infinitesimal number. Clearly, every infinitesimal number is finite, but the sets of finite and infinite numbers are disjoint.
6.3 Theorem (i) The finite and infinitesimal numbers in * R each form subrings of *R; i.e., sums, differences, and products of finite (infinitesimal) numbers are finite (infinitesimal). (ii) The infinitesimals are an ideal in the finite numbers; i.e., the product of an infinitesimal and a finite number is infinitesimal.
Proof: (i) Let $\varepsilon$ and $\delta$ be infinitesimal, and let $r > 0$ be standard. Then $|\varepsilon| < r/2$ and $|\delta| < r/2$ and so $|\varepsilon + \delta| < r$ and $|\varepsilon - \delta| < r$. Also $|\varepsilon|$ and $|\delta|$ are both $< \sqrt{r}$, so that $|\varepsilon\delta| < r$. Here we have used the familiar properties of the absolute value for numbers in ${}^*R$ which are valid by transfer (see Exercise 3). This shows that the infinitesimals in ${}^*R$ form a subring. A similar argument works for the finite numbers (Exercise 4).
(ii) Let $\varepsilon$ be infinitesimal and $b$ be finite. Then $|b| < s$ for some standard $s > 0$. Also, $|\varepsilon| < r/s$ for any standard $r > 0$. Therefore $|\varepsilon b| < r$, and so $\varepsilon b$ is infinitesimal. □
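As a small worked illustration of Theorem 6.3 (a sketch only; the particular numbers are chosen here for illustration and do not come from the text), take $\omega = [\langle 1, 2, 3, \ldots \rangle]$, the infinitesimal $\varepsilon = 1/\omega$, and the finite number $b = 3 + \varepsilon$. The sum of two infinitesimals and the product of an infinitesimal with a finite number can then be bounded directly:

```latex
\begin{align*}
\varepsilon + \varepsilon^{2} &= \frac{1}{\omega} + \frac{1}{\omega^{2}}, &
  \bigl|\varepsilon + \varepsilon^{2}\bigr| &< \frac{2}{\omega} < r
  \quad \text{for every standard } r > 0, \\
\varepsilon b &= \frac{3}{\omega} + \frac{1}{\omega^{2}}, &
  \bigl|\varepsilon b\bigr| &< \frac{4}{\omega} < r
  \quad \text{for every standard } r > 0 .
\end{align*}
```

By contrast, $\omega \cdot \varepsilon = 1$ is not infinitesimal, which suggests why the second factor in part (ii) must be finite.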
We next introduce two important equivalence relations and the associated notions of monad and galaxy. Monads are central to the nonstandard treatment of convergence and continuity.

6.4 Definition Let $x$ and $y$ be numbers in ${}^*R$.

(i) $x$ and $y$ are near or infinitesimally close if $x - y$ is infinitesimal. We write $x \simeq y$ if $x$ and $y$ are near and $x \not\simeq y$ otherwise. The monad of $x$ is the set $m(x) = \{y \in {}^*R : x \simeq y\}$.
(ii) $x$ and $y$ are finitely close if $x - y$ is finite. We write $x \sim y$ if $x$ and $y$ are finitely close and $x \not\sim y$ otherwise. The galaxy of $x$ is the set $G(x) = \{y \in {}^*R : x \sim y\}$.
The monadic and galactic structure of ${}^*R$ is easily visualized. To aid in the visualization, we present the following facts. Clearly $m(0)$ is the set of infinitesimals and $G(0)$ is the set of finite numbers. It follows easily from 6.3 that any two monads $m(x)$ and $m(y)$ are either equal (if $x \simeq y$) or disjoint (if $x \not\simeq y$) and the relation $\simeq$ is an equivalence relation on ${}^*R$. Likewise any two galaxies $G(x)$ and $G(y)$ are either equal (if $x - y$ is finite) or disjoint. It is equally easy to prove the somewhat disconcerting fact that between any two disjoint monads or galaxies is a third, disjoint from the first two. If $x \neq 0$ we see easily that $m(x)$ is a translate of $m(0)$; i.e., for any $x$,

$m(x) = \{y \in {}^*R : y = x + z,\ z \in m(0)\}.$

Similarly

$G(x) = \{y \in {}^*R : y = x + z,\ z \in G(0)\}.$

We leave the proofs of these facts as exercises. These remarks show that the structure of ${}^*R$ with respect to infinite, finite, and infinitesimal numbers is somewhat complicated but easily visualized. Some authors say that $x$ is infinitely close to $y$ if $x - y$ is infinitesimal. We continue with the following basic fact about the structure of ${}^*R$.

6.5 Theorem If $p \in {}^*R$ is finite, there is a unique standard real number $r \in R$ with $p \simeq r$; i.e., every finite number is near a unique standard number.

Proof: Let $A = \{x \in R : p \leq x\}$ and $B = \{x \in R : x < p\}$. Since $p$ is finite, there exists a standard number $s$ such that $-s < p < s$. It follows that $B$ is
nonempty and has an upper bound. Let $r$ be the least upper bound of $B$ (the existence of $r$ is assured by the completeness of $R$). For each $\varepsilon > 0$ in $R$, $(r + \varepsilon) \in A$ and $(r - \varepsilon) \in B$, so $r - \varepsilon < p \leq r + \varepsilon$, and hence $|r - p| \leq \varepsilon$. It follows that $r \simeq p$. If $r_1 \simeq p$ then $|r_1 - r| \leq |r_1 - p| + |p - r| < 2\varepsilon$ for each standard $\varepsilon > 0$, whence $r = r_1$. □
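6.6 Definition If $p \in {}^*R$ is finite, the unique standard number $r \in R$ such that $p \simeq r$ is called the standard part of $p$ and is denoted by $\operatorname{st}(p)$ or ${}^\circ p$. This defines a map $\operatorname{st}: G(0) \to R$ called the standard part map.

Clearly st maps $G(0)$ onto $R$ since $\operatorname{st}(r) = r$ when $r \in R$. That the map also preserves algebraic structure is shown by the following theorem.

6.7 Theorem The map st is an order-preserving homomorphism of $G(0)$ onto $R$, i.e.,

(i) $\operatorname{st}(x \pm y) = \operatorname{st}(x) \pm \operatorname{st}(y)$,
(ii) $\operatorname{st}(xy) = \operatorname{st}(x)\operatorname{st}(y)$,
(iii) $\operatorname{st}(x/y) = \operatorname{st}(x)/\operatorname{st}(y)$ if $\operatorname{st}(y) \neq 0$,
(iv) $\operatorname{st}(x) \leq \operatorname{st}(y)$ if $x \leq y$.

Proof: Let $x = {}^\circ x + \varepsilon$, $y = {}^\circ y + \delta$ with $\varepsilon$ and $\delta$ infinitesimal. Then $x \pm y = ({}^\circ x \pm {}^\circ y) + (\varepsilon \pm \delta)$, which establishes (i) using 6.3. Parts (ii) and (iii) are left to the reader (Exercise 6). To prove (iv), we have ${}^\circ x + \varepsilon \leq {}^\circ y + \delta$, so that ${}^\circ x \leq {}^\circ y + (\delta - \varepsilon) < {}^\circ y + r$ for any positive $r \in R$; from this we conclude that ${}^\circ x \leq {}^\circ y$. □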
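The following short computation is offered as a sketch of how the rules in Theorem 6.7 are typically used; the expression and the nonzero infinitesimal $\varepsilon$ are assumptions chosen for illustration, not taken from the text. The quotient is simplified algebraically first, and only then is the standard part taken:

```latex
\begin{align*}
\operatorname{st}\!\left(\frac{(4+\varepsilon)^{2} - 16}{\varepsilon}\right)
  &= \operatorname{st}\!\left(\frac{8\varepsilon + \varepsilon^{2}}{\varepsilon}\right)
   = \operatorname{st}(8 + \varepsilon) \\
  &= \operatorname{st}(8) + \operatorname{st}(\varepsilon)
   && \text{by 6.7(i)} \\
  &= 8 + 0 = 8 .
\end{align*}
```

Note that 6.7(iii) cannot be applied to the original quotient directly, since the denominator has standard part $\operatorname{st}(\varepsilon) = 0$.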
6.8 Corollary The quotient field $G(0)/m(0)$ is isomorphic to the standard field $R$.

Proof: $m(0)$ is the kernel of the linear (over $R$) map st, i.e., $m(0) = \{x \in G(0) : \operatorname{st}(x) = 0\}$. □

6.9 Corollary If $x$, $x'$, $y$, and $y'$ are finite and $x \simeq x'$, $y \simeq y'$ then

(i) $x \pm y \simeq x' \pm y'$,
(ii) $xy \simeq x'y'$,
(iii) $x/y \simeq x'/y'$ if $y \not\simeq 0$ (and hence $y' \not\simeq 0$).
From Definition 6.2 we see that the set of infinite hyperreal numbers is the complement of the set G(0) of finite numbers. Since various subsets of the set
of infinite numbers (especially the set of infinite integers) will occur frequently in the sequel, we adopt the following definition.

6.10 Definition Given a set $A \subseteq R$, the set of infinite numbers in ${}^*A$ is the set ${}^*A_\infty = {}^*A \cap ({}^*R - G(0))$.

6.11 Theorem If $A \subseteq N$ and $A$ is infinite, then ${}^*A$ contains infinite natural numbers, i.e., ${}^*A \cap {}^*N_\infty \neq \emptyset$.

Proof: For each $n \in N$, there is an element $k \in A$ with $k \geq n$, and so we may define a Skolem function $\psi: N \to A$ with $\psi(n) \geq n$. Thus the sentence $(\forall n)[\underline{N}(n) \to \underline{A}(\psi(n)) \wedge n \leq \psi(n)]$ is true in $\mathcal{R}$. By transfer, ${}^*\psi(n) \in {}^*A$ and $n \leq {}^*\psi(n)$ for all $n \in {}^*N$, including $n = \omega \in {}^*N_\infty$. Thus, ${}^*\psi(\omega) \in {}^*A \cap {}^*N_\infty$. □
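To make the role of the Skolem function concrete, here is a sketch for one particular set; the choice of $A$ as the even natural numbers and of $\psi(n) = 2n$ is an assumption made for illustration only:

```latex
\begin{align*}
&(\forall n)\bigl[\underline{N}(n) \to \underline{A}(\psi(n)) \wedge n \leq \psi(n)
   \wedge \psi(n) = 2n\bigr]
   && \text{true in the standard structure}, \\
&{}^*\psi(\omega) = 2\omega \in {}^*A, \qquad 2\omega \geq \omega
   && \text{by transfer, for any } \omega \in {}^*N_\infty .
\end{align*}
```

Thus $2\omega$ is an infinite element of ${}^*A$, and different choices of infinite $\omega$ give different infinite elements.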
Note that the proof of Theorem 6.11 shows that ${}^*A$ contains arbitrarily large infinite natural numbers.

Exercises 1.6

1. Show that the reciprocal of an infinite number is infinitesimal and the reciprocal of a nonzero infinitesimal number is infinite.
2. Show that if $r$ is an infinitesimal standard number, then $r = 0$.
3. Write simple sentences for $L_{\mathcal{R}}$ which yield the properties of the absolute value function on ${}^*R$ used in the proof of Theorem 6.3(i) for infinitesimal numbers.
4. Prove Theorem 6.3(i) for finite numbers.
5. Fill in the details in the remarks following Definition 6.4.
6. Prove Theorem 6.7, parts (ii) and (iii).
7. Show that it does not follow from ${}^\circ x \leq {}^\circ y$ that $x \leq y$ in $G(0)$. What can be said if ${}^\circ x < {}^\circ y$?
8. Prove Corollary 6.9.
9. Show that Corollary 6.9(iii) need not be true if $y \simeq 0$.
10. Start with the fact that every finite element of ${}^*R$ is near a standard $r \in R$ and show that $R$ is complete.
11. Show that if $x \in {}^*R$ then $m(x) = \bigcup\{y \in {}^*R : |x - y| < \varepsilon,\ \varepsilon > 0 \text{ infinitesimal in } {}^*R\} = \bigcap\{y \in {}^*R : |x - y| < \varepsilon,\ \varepsilon > 0 \text{ in } R\}$.
12. Show that if $x_i \in {}^*R$, $1 \leq i \leq n$, then $\sqrt{x_1^2 + \cdots + x_n^2} \simeq 0$ iff $x_i \simeq 0$ for all $i$, $1 \leq i \leq n$.
13. Show that if $a$ and $b$ are finite numbers in ${}^*R$ with $b \neq 0$, and $n$ is infinite in ${}^*N$, then $a + nb$ is infinite.
1.7 The Hyperintegers
The set of integers, which we denote by $Z$, and the set $N$ of natural numbers play central roles in analysis. We therefore pay particular attention to the structure of the *-transforms ${}^*Z$ and ${}^*N$ of these sets; we will call elements of ${}^*Z$ and ${}^*N$ hyperintegers and hypernatural numbers, respectively. In the literature, ${}^*Z$ and ${}^*N$ are often called the nonstandard integers and nonstandard natural numbers, respectively. The first obvious fact is the following.

7.1 Proposition ${}^*Z$ is a linearly ordered subring of ${}^*R$.

Proof: To show that ${}^*Z$ is a subring of ${}^*R$, we need only check that it is closed under addition and multiplication. This fact follows from the interpretation in ${}^*\mathcal{R}$ of the *-transform of the simple sentence

(7.1) $(\forall x)(\forall y)[\underline{Z}(x) \wedge \underline{Z}(y) \to \underline{Z}(x + y) \wedge \underline{Z}(x \cdot y)]$,

which is true in $\mathcal{R}$. Finally, notice that ${}^*Z$ inherits the linear ordering on ${}^*R$. □

In $\mathcal{R}$ there is a greatest integer function $[\,\cdot\,]: R \to Z$ which satisfies

(7.2) $[x] \leq x < [x] + 1$

for all $x \in R$. Therefore the extended function ${}^*[\,\cdot\,]: {}^*R \to {}^*Z$ satisfies ${}^*[x] \leq x < {}^*[x] + 1$ for all $x \in {}^*R$ by the transfer principle. Thus we have

7.2 Proposition For each $x \in {}^*R$ there is an element $k \in {}^*Z$ so that $k \leq x < k + 1$.
7.3 Corollary There are positive and negative infinite integers.

Proof: If $x$ is positive infinite then the hyperinteger $k + 1$ of Proposition 7.2 is positive infinite and the hyperinteger $-(k + 1)$ is negative infinite. □

7.4 Corollary If $x \in {}^*R$, there is an $n \in {}^*N$ so that $|x| < n$.

The following result shows that the hyperintegers in ${}^*Z$ are a unit distance apart.

7.5 Proposition For each $n \in {}^*Z$, $n + 1$ is the smallest hyperinteger greater than $n$.
Proof: The simple sentence

(7.3) $(\forall x)(\forall y)[\underline{Z}(x) \wedge \underline{Z}(y) \wedge (x \leq y \leq x + 1) \wedge (y \neq x) \to (y = x + 1)]$

is true in $\mathcal{R}$. The interpretation of its *-transform in ${}^*\mathcal{R}$ yields the desired conclusion. □

7.6 Corollary ${}^*Z \cap G(0) = Z$ (i.e., any finite hyperinteger is an ordinary integer).

Proof: Let $k$ be a finite hyperinteger. Then $\operatorname{st}(k)$ is a real number and so $n \leq \operatorname{st}(k) < n + 1$ for some $n \in Z$. It is easy to see (Exercise 1) that $0 \leq |n - k| < 1$. But $|n - k| \in {}^*Z$ and so $n = k$ by Proposition 7.5. □

7.7 Corollary If $x \in {}^*Z$, then ${}^*Z \cap m(x) = \{x\}$.

Proof: Exercise 2. □

If we let $Q$ denote the standard set of rational numbers, then the set ${}^*Q$ will be called the set of hyperrationals or nonstandard rationals. In contrast to Corollary 7.7 we see that if $x \in {}^*Q$ then ${}^*Q \cap m(x)$ contains many other hyperrationals distinct from $x$; for example, if $\omega \in {}^*N_\infty$, then $1/\omega \in {}^*Q \cap m(0)$ (proof?). An interesting exercise, which we leave to the reader, shows that, in analogy with Corollary 6.8, the real number system is isomorphic to $[{}^*Q \cap G(0)]/m(0)$. Notice that only the rational numbers are used in defining $[{}^*Q \cap G(0)]/m(0)$. Although this would not be a recommended way of defining the real numbers from the rationals, the result is a prototype of many results in nonstandard analysis which construct standard mathematical structures from nonstandard structures.

We end this section with some remarks which will clarify the nature of the *-mapping and the transfer principle. Similar considerations will be crucial in correctly applying the more powerful transfer principle of Chapter II (see Remark 3.5 of §II.3 and §II.6 in Chapter II).

7.8 Remarks on Sets Which Are Nonstandard Extensions of Standard Sets We first show that there are subsets of ${}^*R$ which are not the *-transforms (i.e., nonstandard extensions) of sets in $R$. A typical example is the set $R$ itself, regarded as embedded in ${}^*R$. For suppose that ${}^*A = R$ for some subset $A \subseteq R$. Two cases are possible:

(i) $A$ is bounded above by a number $a \in R$. But in this case the sentence $(\forall x)[\underline{A}(x) \to x \leq a]$ is true in $\mathcal{R}$. By transfer, $(\forall x)[{}^*\underline{A}(x) \to x \leq a]$, i.e., every element of ${}^*A$ is $\leq a$, and ${}^*A$ cannot equal $R$;
(ii) $A$ is not bounded above. Then for all $x \in R$ there is a $y \in A$ with $y \geq x$. Thus there is a Skolem function $\psi: R \to A$ so that the sentence $(\forall x)[\underline{R}(x) \to \psi(x) \geq x \wedge \underline{A}(\psi(x))]$ is true in $\mathcal{R}$. By transfer, $(\forall x)[{}^*\underline{R}(x) \to {}^*\psi(x) \geq x \wedge {}^*\underline{A}({}^*\psi(x))]$. In particular, if $x$ is an infinite natural number then there is an element $y = {}^*\psi(x) \geq x$. Since $y \in {}^*A$ we see that ${}^*A$ contains infinite numbers and cannot equal $R$.

Thus there is no $A \subseteq R$ so that ${}^*A = R$. A similar argument shows that there is no $A \subseteq R$ so that ${}^*A = N$. Thus the *-mapping of Definition 2.2 does not map the collection of all subsets of $R$ onto the collection of all subsets of ${}^*R$ but only onto a subcollection of them.

It is obvious from Definition 5.3 that the *-transform ${}^*\Phi$ of a sentence $\Phi$ in $L_{\mathcal{R}}$ can contain only the names of the *-transforms of sets and $n$-ary relations on $R$. A lack of attention to this fact can lead to an incorrect understanding of the transfer principle. As an example, recall that $R$ is Archimedean; this means that given any $x \in R$ there is an $n \in N$ so that $|x| \leq n$. We might naively expect that ${}^*R$ is Archimedean by transfer; i.e., for all $x \in {}^*R$ there is an $n \in N$ so that $|x| \leq n$. But this statement is obviously false, as we see by taking $x$ to be an infinite integer. The mistake is in transferring the sentence but forgetting to replace $N$ by ${}^*N$, thus leaving the name of the set $N$ in the transferred statement. The correct transfer of the Archimedean property is in Corollary 7.4.

Even though only sets which are *-transforms of standard sets arise in the application of the transfer principle, other subsets of ${}^*R$ occur regularly in the nonstandard characterization of standard concepts. For example, we will show (Proposition 8.1) that a sequence $\langle s_n \rangle$ converges to the limit $L$ if and only if ${}^*s_n$ is infinitesimally close to $L$ for all $n \in {}^*N_\infty$. Neither ${}^*N_\infty$ nor the monad of zero is the *-transform of a standard set.
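The Archimedean pitfall can be written out side by side. The display below is a sketch only: the Skolem function $\varphi$ is a hypothetical choice introduced for illustration (for instance $\varphi(x) = [\,|x|\,] + 1$, built from the greatest integer function of (7.2)), and the underlined letters name the sets $R$ and $N$:

```latex
\begin{align*}
\text{standard sentence:}\quad
  & (\forall x)\bigl[\underline{R}(x) \to \underline{N}(\varphi(x))
    \wedge |x| \leq \varphi(x)\bigr], \\
\text{correct $*$-transform:}\quad
  & (\forall x)\bigl[{}^*\underline{R}(x) \to {}^*\underline{N}({}^*\varphi(x))
    \wedge |x| \leq {}^*\varphi(x)\bigr], \\
\text{incorrect ``transfer'':}\quad
  & (\forall x)\bigl[{}^*\underline{R}(x) \to \underline{N}({}^*\varphi(x))
    \wedge |x| \leq {}^*\varphi(x)\bigr].
\end{align*}
```

The first two sentences are true. The third is false: taking $x$ to be an infinite hyperinteger forces ${}^*\varphi(x)$ to be infinite, so it cannot belong to $N$.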
Exercises 1.7

1. Show that $0 \leq |n - k| < 1$ in the proof of Corollary 7.6.
2. Prove Corollary 7.7.
3. Show that the real number system is isomorphic to $[{}^*Q \cap G(0)]/m(0)$.
4. Show that there is no $A \subseteq R$ so that ${}^*A = {}^*N_\infty$.
5. Show that there is no function $f$ on $R$ so that ${}^*f = \chi_R$.
6. Show that there is no $A \subseteq R$ so that ${}^*A = m(0)$.
7. Show that there is no $A \subseteq R$ so that ${}^*A = G(0)$.
8. Show that if $A$ is a finite set $\{a_1, a_2, \ldots, a_n\}$ in $R$ then ${}^*A = A$.
9. Show that if $A$ is an unbounded set in $R$ then ${}^*A \neq A$.
10. Show that if $x < y$ in $R$ and $r \in {}^*R$ with $x \leq \operatorname{st}(r) < y$ then $|x - r| < y - x$ and ${}^\circ|x - r| < y - x$.
1.8 Sequences and Series
The first task in applying nonstandard analysis to a given theory is to find nonstandard equivalents for the basic definitions in the theory. The nonstandard equivalents can then be applied to produce (often shorter) proofs of standard results. In this section we will illustrate these remarks by considering the basic theory of limits for real sequences and series. In this and the next few sections, nonstandard equivalents of the standard definitions will be presented as propositions. These results are due to Robinson [40,42]. Familiarity with the standard definitions is assumed.

Let $s: N \to R$ be a standard sequence. As usual we write $s(n) = s_n$ and denote the sequence by $\langle s_n : n \in N \rangle$ or simply $\langle s_n \rangle$. The sequence $s: N \to R$ has a *-transform ${}^*s: {}^*N \to {}^*R$ and we let ${}^*s(n) = {}^*s_n$ for $n \in {}^*N$. From 2.3 or 5.5 we see that ${}^*s_n = s_n$ for $n \in N$. Applying the remark preceding Theorem 2.3, we see, for example, that if $\omega = [\langle 2, 4, \ldots, 2^n, \ldots \rangle]$ then ${}^*s_\omega = [\langle s_2, s_4, \ldots, s_{2^n}, \ldots \rangle]$.

8.1 Proposition The sequence $\langle s_n \rangle$ converges to $L$ iff ${}^*s_n \simeq L$ for all infinite $n$.

Proof: Recall the condition for convergence of $\langle s_n \rangle$ to $L$:

(8.1) Given $\varepsilon > 0$ in $R$ there is a $k \in N$ (depending in general on $\varepsilon$) so that $|s_n - L| < \varepsilon$ for all $n > k$.
Suppose $\langle s_n \rangle$ converges to $L$, let $\varepsilon > 0$ be given, and find the corresponding $k$ from (8.1). Then the sentence

(8.2) $(\forall n)[\underline{N}(n) \wedge n > k \to |s_n - L| < \varepsilon]$

is true in $\mathcal{R}$. By transfer,

(8.3) $(\forall n)[{}^*\underline{N}(n) \wedge n > k \to |{}^*s_n - L| < \varepsilon]$,

and so if $n \in {}^*N$ and $n > k$, then $|{}^*s_n - L| < \varepsilon$. But all infinite $n \in {}^*N$ are larger than $k$, so $|{}^*s_n - L| < \varepsilon$ for all infinite $n$ in ${}^*N$. The core of our argument is that the latter conclusion could have been derived for any standard $\varepsilon > 0$. Thus $|{}^*s_n - L| \simeq 0$ for all infinite $n \in {}^*N$.

Conversely, suppose that $\langle s_n \rangle$ does not converge to $L$. Then there is a standard $\varepsilon > 0$ and a Skolem function $\psi: N \to N$ satisfying $\psi(k) \geq k$ and $|s_{\psi(k)} - L| \geq \varepsilon$ for all $k \in N$. Thus the sentence

(8.4) $(\forall k)[\underline{N}(k) \to \psi(k) \geq k \wedge |s_{\psi(k)} - L| \geq \varepsilon]$

is true in $\mathcal{R}$. By transfer,

(8.5) $(\forall k)[{}^*\underline{N}(k) \to {}^*\psi(k) \geq k \wedge |{}^*s_{{}^*\psi(k)} - L| \geq \varepsilon]$,

and so $|{}^*s_{{}^*\psi(k)} - L| \geq \varepsilon$ for all $k \in {}^*N$ and, in particular, for an infinite $k = \omega$. Since $n = {}^*\psi(\omega) \geq \omega$ is infinite and $|{}^*s_n - L| \geq \varepsilon$, ${}^*s_m \simeq L$ is not true for all infinite $m$. □
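As a brief check of Proposition 8.1 on a concrete case (a sketch; the sequence $s_n = n/(n+1)$ is chosen here for illustration and is not one of the text's examples), the formula for $s_n$ transfers to all of ${}^*N$, and its distance from 1 is infinitesimal at every infinite index:

```latex
\begin{align*}
{}^*s_n &= \frac{n}{n+1} = 1 - \frac{1}{n+1}
   && \text{for all } n \in {}^*N \text{ (by transfer)}, \\
\bigl|{}^*s_n - 1\bigr| &= \frac{1}{n+1} \simeq 0
   && \text{for all infinite } n .
\end{align*}
```

Hence $\langle s_n \rangle$ converges to 1. For the sequence $s_n = (-1)^n$, by contrast, ${}^*s_n = 1$ at even infinite $n$ and ${}^*s_n = -1$ at odd infinite $n$, so no single $L$ is near ${}^*s_n$ for all infinite $n$ and the sequence does not converge.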
If $\langle s_n \rangle$ converges to $L$ we write $\lim_{n \to \infty} s_n = L$. We then get

8.2 Theorem If $\lim_{n \to \infty} s_n = L$ and $\lim_{n \to \infty} t_n = M$ then

(i) $\lim_{n \to \infty} (s_n + t_n) = L + M$,
(ii) $\lim_{n \to \infty} (s_n t_n) = LM$,
(iii) $\lim_{n \to \infty} (s_n / t_n) = L/M$ if $M \neq 0$.

Proof: To prove (i), we have ${}^*s_n \simeq L$, ${}^*t_n \simeq M$, and hence ${}^*s_n + {}^*t_n \simeq L + M$ for any infinite $n$ by Corollary 6.9. The proofs of (ii) and (iii) are left as exercises. □

By variants of the arguments in the proof of Proposition 8.1 we can establish the following results.

8.3 Proposition The sequence $\langle s_n \rangle$ is a Cauchy sequence iff ${}^*s_n \simeq {}^*s_m$ for all infinite $n$ and $m$.

Proof: Recall the standard condition for $\langle s_n \rangle$ to be a Cauchy sequence:

(8.6) Given $\varepsilon > 0$ in $R$ there is a $k \in N$ so that $|s_n - s_m| < \varepsilon$ for all $n, m > k$.

Then proceed as in the proof of Proposition 8.1 (Exercise 2). □
8.4 Proposition The sequence $\langle s_n \rangle$ is bounded iff ${}^*s_n$ is finite for all infinite $n$.

Proof: If $\langle s_n \rangle$ is bounded then there is a $k \in N$ so that the sentence

(8.7) $(\forall n)[\underline{N}(n) \to |s_n| < k]$

is true in $\mathcal{R}$. By transfer, $|{}^*s_n| < k$ for all $n \in {}^*N$, and hence $|{}^*s_n|$ is finite for all infinite $n$. Conversely, if $\langle s_n \rangle$ is not bounded then there is a Skolem function $\psi: N \to N$ satisfying $\psi(k) > k$, $k \in N$, so that $|s_{\psi(k)}| > k$ for all $k \in N$. By transfer of the appropriate sentence (which the reader is invited to write down), $|{}^*s_{{}^*\psi(k)}| > k$ for all $k \in {}^*N$ and, in particular, $|{}^*s_n|$ is infinite if $n = {}^*\psi(k)$ and $k$ is infinite. □
Using Proposition 8.3, one can prove the standard result that any Cauchy sequence $\langle s_n \rangle$ is bounded (Exercise 3).

8.5 Theorem The sequence $\langle s_n \rangle$ converges iff it is a Cauchy sequence.

Proof: If $\langle s_n \rangle$ converges to $L$ then ${}^*s_n \simeq L \simeq {}^*s_m$ for all infinite $n$, $m$ by 8.1, so $\langle s_n \rangle$ is a Cauchy sequence by 8.3. Conversely, if $\langle s_n \rangle$ is a Cauchy sequence then $\langle s_n \rangle$ is bounded and so ${}^*s_n$ is finite for all infinite $n$. Define $L = \operatorname{st}({}^*s_\omega)$, where $\omega$ is a specific infinite natural number. Then ${}^*s_n \simeq {}^*s_\omega \simeq L$ for all infinite $n$ by 8.3, and so $\langle s_n \rangle$ converges to $L$ by 8.1. □

8.6 Corollary A monotonic bounded sequence $\langle s_n \rangle$ converges.

Proof: We may assume that the sequence is increasing, and we need only show that $\langle s_n \rangle$ is a Cauchy sequence. If not then there exists an $\varepsilon > 0$ in $R$ and a Skolem function $\psi: N \to N$ so that $\psi(1) = 1$, $\psi(n + 1) > \psi(n)$, $s_{\psi(k)} + \varepsilon < s_{\psi(k+1)}$ for $k \in N$, and hence $s_{\psi(k+1)} > s_1 + k\varepsilon$ for all $k \in N$. By transfer (of what sentence?), ${}^*s_{{}^*\psi(k+1)} > s_1 + k\varepsilon$ for $k \in {}^*N$, and so ${}^*s_{{}^*\psi(k+1)}$ is infinite if $k$ is infinite. By 8.4, $\langle s_n \rangle$ is not bounded (contradiction). □

The notion of a limit point of a sequence $\langle s_n \rangle$ can be treated in much the same way as we have treated limits.

8.7 Proposition $L$ is a limit point of the sequence $\langle s_n \rangle$ if and only if ${}^*s_n \simeq L$ for some infinite $n$.
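Proof: Suppose that $L$ is a limit point of $\langle s_n \rangle$. The standard definition of a limit point states that for a given $\varepsilon > 0$ in $R$ and $k \in N$ there is an $n \in N$ with $n \geq k$ so that $|s_n - L| < \varepsilon$. Thus there is a Skolem function $\psi: R^+ \times N \to N$ ($R^+$ the positive reals) with $\psi(\varepsilon, k) \geq k$ so that the sentence

(8.8) $(\forall \varepsilon)(\forall k)[\underline{R}^+(\varepsilon) \wedge \underline{N}(k) \to \psi(\varepsilon, k) \geq k \wedge |s_{\psi(\varepsilon, k)} - L| < \varepsilon]$

is true in $\mathcal{R}$. By transfer, for all $\varepsilon \in {}^*R^+$ and $k \in {}^*N$ (and, in particular, $k \in {}^*N_\infty$), $|{}^*s_n - L| < \varepsilon$ if $n = {}^*\psi(\varepsilon, k) \geq k$; thus ${}^*s_n \simeq L$ if $\varepsilon$ is infinitesimal. Conversely, if $L$ is not a limit point of $\langle s_n \rangle$ then there is a standard $\varepsilon > 0$ and $k \in N$ so that $|s_n - L| \geq \varepsilon$ for all $n \geq k$. By transfer of the appropriate sentence (Exercise 4), $|{}^*s_n - L| \geq \varepsilon$ for all $n \geq k$ and, in particular, for all infinite $n$; i.e., ${}^*s_n \not\simeq L$ for any infinite $n$. □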
8.8 Theorem (Bolzano-Weierstrass) Every bounded sequence $\langle s_n \rangle$ has a limit point $L$.

Proof: If $\langle s_n \rangle$ is bounded then ${}^*s_n$ is finite for all infinite $n$ by Proposition 8.4. If $L = \operatorname{st}({}^*s_\omega)$ for some infinite $\omega \in {}^*N$, then ${}^*s_\omega \simeq L$ and $L$ is a limit point by Proposition 8.7. □

8.9 Examples

1. If $s_n = (2n^2 + 3n)/(5n^2 + 1)$ for $n \in N$, then by the transfer principle, ${}^*s_n = (2n^2 + 3n)/(5n^2 + 1)$ for $n \in {}^*N$. Thus, for all infinite $n \in {}^*N$, ${}^*s_n = (2 + 3/n)/(5 + 1/n^2) \simeq \tfrac{2}{5}$, so $\langle s_n \rangle$ converges to $\tfrac{2}{5}$.
2. If $\lim_{n \to \infty} s_n = L \neq 1$ and $t_n = (1 + s_n)/(1 - s_n)$, then ${}^*t_n \simeq (1 + L)/(1 - L)$ for all infinite $n$, and so $\lim_{n \to \infty} t_n = (1 + L)/(1 - L)$.
3. Let $s_n$ be defined recursively by $s_{n+1} = \tfrac{1}{2}(s_n + a/s_n)$, $n \geq 1$, $a > 0$, $s_1 \geq \sqrt{a}$. It is easy to see that $\langle s_n \rangle$ is a decreasing and positive sequence, i.e., $0 \leq s_{n+1} \leq s_n$, and $s_n \geq \sqrt{a}$ for all $n$ (check); hence $s_n$ converges to a limit $L$ by Corollary 8.6. Thus $L \simeq {}^*s_{n+1} \simeq \tfrac{1}{2}({}^*s_n + a/{}^*s_n) \simeq \tfrac{1}{2}(L + a/L)$ if $n$ is infinite, so $2L = L + a/L$, and hence $L = \sqrt{a}$.
4. Let $s_n = (-1)^n(1 - 1/n)$. Then ${}^*s_n = 1 - 1/n \simeq 1$ for all even infinite $n$ and ${}^*s_n = 1/n - 1 \simeq -1$ for all odd infinite $n$. Thus the numbers 1 and $-1$ are the only limit points of $\langle s_n \rangle$ by Proposition 8.7.

The methods just developed can be used effectively in the study of double sequences. A double sequence is a mapping $s: N \times N \to R$; we write $s(n, m) = s_{nm}$ and denote the sequence by $\langle s_{nm} \rangle$. The sequence $\langle s_{nm} \rangle$ converges to $L$, and we write $\lim_{n,m \to \infty} s_{nm} = L$, if, given $\varepsilon > 0$ in $R$, there is a $k \in N$ so that $|s_{nm} - L| < \varepsilon$ if $n, m > k$. The *-transform of $s$ yields the numbers ${}^*s(n, m) = {}^*s_{nm}$ ($n, m \in {}^*N$). The proofs of the following results are analogous to those of 8.1 and 8.5 and are left as exercises.

8.10 Proposition $\lim_{n,m \to \infty} s_{nm} = L$ iff ${}^*s_{nm} \simeq L$ for all infinite $n$, $m$.

8.11 Proposition $\langle s_{nm} \rangle$ converges iff ${}^*s_{nm} \simeq {}^*s_{n'm'}$ for all infinite $n$, $m$, $n'$, $m'$ and then converges to $L = \operatorname{st}({}^*s_{nm})$ for $n$, $m$ infinite.

We may want to compute the limit of $\langle s_{nm} \rangle$ by first computing $\lim_{n \to \infty} s_{nm} = s_m$ and then computing $\lim_{m \to \infty} s_m$. It may happen that the above limits exist but $\langle s_{nm} \rangle$ is not convergent. For example, if $s_{nm} = n/(n + m)$ then $\lim_{n \to \infty} s_{nm} = 1$, $m \in N$, so $\lim_{m \to \infty}(\lim_{n \to \infty} s_{nm}) = 1$. If $\omega \in {}^*N$ is infinite, however, then $\omega/(\omega + \omega) \not\simeq \omega/(\omega + 2\omega)$ and so $\langle s_{nm} \rangle$ is not convergent by 8.11.
Notice that if $\lim_{n \to \infty} s_{nm} = s_m$ then

(8.9) ${}^*s_{nm} \simeq s_m$ for $n \in {}^*N_\infty$, $m \in N$.

If, moreover, $\lim_{m \to \infty} s_m = L$ then

(8.10) ${}^*s_m \simeq L$ for $m \in {}^*N_\infty$.

If in place of (8.9) we could establish

(8.11) ${}^*s_{nm} \simeq {}^*s_m$ for $n \in {}^*N_\infty$, $m \in {}^*N$

then (8.10) and (8.11) would yield ${}^*s_{nm} \simeq {}^*s_m \simeq L$ for all infinite $n$, $m$, and hence the convergence of $\langle s_{nm} \rangle$ to $L$. Condition (8.11) is equivalent to the uniformity in $m$ of the convergence of the sequences $\langle s_{nm} : n \in N \rangle$, as shown by the following result.

8.12 Proposition $\langle s_{nm} : n \in N \rangle$ converges to $s_m$ uniformly in $m$ iff ${}^*s_{nm} \simeq {}^*s_m$ for all $n \in {}^*N_\infty$, $m \in {}^*N$.

Proof: Recall that $\lim_{n \to \infty} s_{nm} = s_m$ uniformly in $m$ iff, given $\varepsilon > 0$, there exists a $k \in N$, possibly depending on $\varepsilon$ but not on $m$, so that $|s_{nm} - s_m| < \varepsilon$ for all $n \geq k$. Let $\varepsilon > 0$ be specified and find the corresponding $k$. Then the sentence

(8.12) $(\forall n)(\forall m)[\underline{N}(n) \wedge \underline{N}(m) \wedge n \geq k \to |s_{nm} - s_m| < \varepsilon]$

is true in $\mathcal{R}$. By transfer, $|{}^*s_{nm} - {}^*s_m| < \varepsilon$ if $n, m \in {}^*N$ with $n \geq k$, and in particular for all infinite $n$. The latter conclusion is valid for any standard $\varepsilon > 0$, and so ${}^*s_{nm} \simeq {}^*s_m$ for $n \in {}^*N_\infty$, $m \in {}^*N$. The converse is left to the reader. □
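A pair of small cases may help to see what the criterion of Proposition 8.12 detects. This is a sketch: the first double sequence, $s_{nm} = 1/n + 1/m$ with $s_m = 1/m$, is an assumption chosen for illustration, while the second, $t_{nm} = n/(n+m)$ with $t_m = 1$, is the example discussed before (8.9), renamed here to keep the two apart:

```latex
\begin{align*}
\bigl|{}^*s_{nm} - {}^*s_m\bigr| &= \frac{1}{n} \simeq 0
   && \text{for all } n \in {}^*N_\infty,\ m \in {}^*N
   \quad \text{(uniform)}, \\
\bigl|{}^*t_{\omega\omega} - {}^*t_{\omega}\bigr|
   &= \left|\frac{\omega}{\omega + \omega} - 1\right| = \frac{1}{2} \not\simeq 0
   && \text{for } \omega \in {}^*N_\infty
   \quad \text{(not uniform)} .
\end{align*}
```

In the second case the criterion already holds for every standard $m$; it fails only when $m$ is allowed to be infinite, which is exactly the difference between (8.9) and (8.11).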
The preceding discussion yields the first part of the following result. The second is left as an exercise.

8.13 Theorem If $\lim_{n \to \infty} s_{nm} = s_m$ uniformly in $m$, and $\lim_{m \to \infty} s_m = L$, then $\lim_{n,m \to \infty} s_{nm} = L$. If, moreover, $\lim_{m \to \infty} s_{nm} = s_n$ exists for each $n \in N$, then $\lim_{n \to \infty} s_n$ exists and equals $L$.

Note in passing that $\lim_{n,m \to \infty} s_{nm}$ may exist even though $\lim_{n \to \infty} s_{nm}$ does not exist; for example, let $s_{nm} = [(-1)^n + (-1)^m]/m$.

We continue with a consideration of infinite series. Recall that the infinite series $\sum_{i=1}^{\infty} a_i$ converges (to $L$) if the sequence $s_n = \sum a_i\ (1 \leq i \leq n)$ converges (to $L$). Both sequences $\langle a_n \rangle$ and $\langle s_n \rangle$ have *-transforms $\langle {}^*a_n : n \in {}^*N \rangle$ and $\langle {}^*s_n : n \in {}^*N \rangle$. We will write ${}^*s_n = {}^*\!\sum {}^*a_i\ (1 \leq i \leq n)$, thus defining the nonstandard "summation" operation on the right-hand side. This operation has all of the familiar properties of ordinary summation, as we easily check by transfer from the properties of $\langle s_n \rangle$. For example, ${}^*\!\sum {}^*a_i\ (1 \leq i \leq m) - {}^*\!\sum {}^*a_i\ (1 \leq i \leq n) = {}^*\!\sum {}^*a_i\ (n + 1 \leq i \leq m)$ for $m > n$ in ${}^*N$. From the previous results in this section we immediately obtain

8.14 Proposition

(1) $\sum_{i=1}^{\infty} a_i$ converges to $L$ iff ${}^*\!\sum {}^*a_i\ (1 \leq i \leq n) \simeq L$ for all $n \in {}^*N_\infty$;
(2) $\sum_{i=1}^{\infty} a_i$ converges iff ${}^*\!\sum {}^*a_i\ (n \leq i \leq m) \simeq 0$ for both $m$ and $n$ in ${}^*N_\infty$;
(3) If $a_i \geq 0$ then $\sum a_i\ (i \in N)$ converges iff ${}^*\!\sum {}^*a_i\ (1 \leq i \leq n)$ is finite for some $n \in {}^*N_\infty$, in which case ${}^*\!\sum {}^*a_i\ (1 \leq i \leq n)$ is finite for all $n \in {}^*N_\infty$.
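Proposition 8.14 can be tried out on the geometric series with $a_i = 2^{-i}$; this particular series is an assumption chosen for illustration. The closed forms below hold for standard indices and extend to all hypernatural indices by transfer:

```latex
\begin{align*}
{}^*\!\sum {}^*a_i\ (1 \leq i \leq n) &= 1 - 2^{-n} \simeq 1
   && \text{for all } n \in {}^*N_\infty, \\
{}^*\!\sum {}^*a_i\ (n \leq i \leq m) &= 2^{\,1-n} - 2^{-m},
   \qquad 0 < 2^{\,1-n} - 2^{-m} < 2^{\,1-n} \simeq 0
   && \text{for } n \leq m \text{ in } {}^*N_\infty .
\end{align*}
```

The first line gives convergence to 1 by part (1), and the second line verifies the Cauchy-type condition of part (2). Here $2^{1-n}$ is infinitesimal for infinite $n$ because $2^{n-1} \geq n$ holds for all $n \in {}^*N$ by transfer.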
The series $\sum a_i\ (i \in N)$ converges absolutely if $\sum |a_i|\ (i \in N)$ converges. The comparison test and its consequences, the ratio and root tests, are important in the standard theory of absolute convergence. The following result is a nonstandard version of the limit comparison test.

8.15 Theorem If $\sum b_i\ (i \in N)$ converges absolutely and $|{}^*a_n| \leq |{}^*b_n|$ for all $n \in {}^*N_\infty$, then $\sum a_i\ (i \in N)$ converges absolutely.

Proof: There is a $k \in N$ so that if $n \geq k$ then $|a_n| \leq |b_n|$ (why?). By transfer of the appropriate sentence, ${}^*\!\sum |{}^*a_i|\ (n \leq i \leq m) \leq {}^*\!\sum |{}^*b_i|\ (n \leq i \leq m) \simeq 0$ if $m > n$ are in ${}^*N_\infty$. □

The notion of the limit of a sequence $\langle s_n \rangle$ can be extended to the notions of $\limsup s_n$ and $\liminf s_n$, called the limit superior and limit inferior, respectively. Here, for a change, we define lim sup and lim inf using nonstandard notions, and show that these definitions coincide with (one of) the standard definitions in Proposition 8.17.

8.16 Definition Let $\langle s_n \rangle$ be a sequence in $R$. For $\limsup s_n$ we consider three cases:

(i) If ${}^*s_n$ is positive infinite for some $n \in {}^*N_\infty$, then $\limsup s_n = +\infty$.
(ii) If ${}^*s_n$ is negative infinite for all $n \in {}^*N_\infty$, then $\limsup s_n = -\infty$.
(iii) If neither case (i) nor case (ii) holds, then $\limsup s_n = \sup\{\operatorname{st}({}^*s_n) : n \in {}^*N_\infty,\ {}^*s_n \text{ finite}\}$.

We define $\liminf s_n$ in a similar way, or equivalently we set $\liminf s_n = -\limsup(-s_n)$.
If neither case (i) nor case (ii) of Definition 8.16 holds, then, as in Proposition 8.4, the sequence $\langle s_n \rangle$ is bounded above. If $r = \sup\{\operatorname{st}({}^*s_n) : n \in {}^*N_\infty,\ {}^*s_n \text{ finite}\}$, then $r$ is a limit point of $\langle s_n \rangle$ (why?), so by Proposition 8.7, $r = \operatorname{st}({}^*s_n)$ for some $n \in {}^*N_\infty$. Thus $r = \max\{\operatorname{st}({}^*s_n) : n \in {}^*N_\infty,\ {}^*s_n \text{ finite}\} = \limsup s_n$. Clearly, $r$ is the largest limit point of $\langle s_n \rangle$.

8.17 Proposition Let $\langle s_n \rangle$ be a sequence in $R$ and let $u_n = \sup\{s_k : k \geq n \text{ in } N\}$ and $v_n = \inf\{s_k : k \geq n \text{ in } N\}$ for each $n \in N$. Then $\langle u_n \rangle$ is a nonincreasing sequence and $\langle v_n \rangle$ is a nondecreasing sequence. Moreover, $\limsup s_n = \inf\{u_m : m \in N\}$ and $\liminf s_n = \sup\{v_m : m \in N\}$.

Proof: If case (i) of Definition 8.16 holds, then $u_n$ is $+\infty$ for all $n$ (why?) and so $\inf u_n = +\infty$. If case (ii) holds, then for any $n_0 \in N$ there is an $m \in N$ so that $s_n \leq -n_0$ for all $n \geq m$ (why?). In this case, $u_m \leq -n_0$. Since this is true for each $n_0 \in N$, $\inf u_n = -\infty$. If case (iii) holds and $r$ is the largest limit point of $\langle s_n \rangle$, then, for any $\varepsilon > 0$ in $R$, $u_m \geq r - \varepsilon$ for each $m \in N$ and $u_m \leq r + \varepsilon$ for some $m \in N$, so $r = \inf u_m$. The proof for $\liminf s_n$ is left to the reader. □
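The two descriptions of lim sup and lim inf can be compared on the sequence $s_n = (-1)^n(1 - 1/n)$ of Example 8.9.4. The computation below is a sketch; the values of $u_n$ and $v_n$ are worked out here rather than quoted from the text:

```latex
\begin{align*}
\{\operatorname{st}({}^*s_n) : n \in {}^*N_\infty\} &= \{1, -1\}
   &&\Longrightarrow\quad \limsup s_n = 1, \quad \liminf s_n = -1, \\
u_n = \sup\{s_k : k \geq n\} &= 1, \qquad
v_n = \inf\{s_k : k \geq n\} = -1
   &&\Longrightarrow\quad \inf_n u_n = 1, \quad \sup_n v_n = -1 .
\end{align*}
```

Here $u_n = 1$ because the even-indexed terms $1 - 1/k$ increase toward 1 without reaching it, and $v_n = -1$ because the odd-indexed terms $-1 + 1/k$ decrease toward $-1$. Both descriptions give the same values, as Proposition 8.17 asserts.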
Let $\langle s_n \rangle$ and $\langle t_n \rangle$ be bounded sequences. The reader should verify that

$\liminf s_n + \liminf t_n \leq \liminf(s_n + t_n) \leq \limsup(s_n + t_n) \leq \limsup s_n + \limsup t_n.$

Moreover, $\langle s_n \rangle$ has limit $L \in R$ if and only if $\limsup s_n = \liminf s_n = L$.

8.18 Theorem (Ratio Test) A series $\sum_{i=1}^{\infty} a_i$ converges absolutely if $\limsup(|a_{i+1}|/|a_i|) < 1$. A series $\sum_{i=1}^{\infty} a_i$ diverges if $\liminf(|a_{i+1}|/|a_i|) > 1$.

Proof: Left to reader. □

Exercises 1.8

1. Prove Theorem 8.2, parts (ii) and (iii).
2. Prove Proposition 8.3.
3. Using Proposition 8.3, prove that a Cauchy sequence is bounded.
4. What sentence must be transferred for the proof of the second part of Proposition 8.7?
5. Use Exercises 1.7.7 and 1.7.8, Proposition 8.7, and Theorem 8.8 to show that if $A \subseteq R$ then ${}^*A \subseteq A$ if and only if $A$ is a finite set.
6. Prove Proposition 8.10.
7. Prove Proposition 8.11.
8. Finish the proof of Proposition 8.12.
9. Finish the proof of Theorem 8.13.
10. Fill in the details in the proof of Theorem 8.15.
11. Show that if $\langle s_n \rangle$ is bounded above and $r = \sup\{\operatorname{st}({}^*s_n) : n \in {}^*N_\infty,\ {}^*s_n \text{ finite}\}$, then $r$ is a limit point of $\langle s_n \rangle$.
12. Fill in the details and finish the proof of Proposition 8.17.
13. Fill in the details in the remark preceding Theorem 8.18.
14. Prove Theorem 8.18.
15. Use Exercise 13 in 1.6 to show that if $a$, $b$ are real and $b \neq 0$ then the sequence $\langle s_n \rangle$ given by $s_n = 1/(a + nb)$ converges to 0.
16. Suppose that $\langle s_n \rangle$ and $\langle t_n \rangle$ converge to $L$ and $M$, respectively. Show that
    (a) $\langle s_n + t_n \rangle$ converges to $L + M$,
    (b) $\langle a s_n \rangle$ converges to $aL$ for $a \in R$,
    (c) $\langle s_n t_n \rangle$ converges to $LM$,
    (d) $\langle s_n / t_n \rangle$ converges to $L/M$ if $M \neq 0$.
17. Show that if $\langle s_n \rangle$ and $\langle t_n \rangle$ converge to $L$ and $M$, respectively, and $s_n \leq t_n$ for $n \in N$, then $L \leq M$. Prove as a consequence that the limit of a sequence is unique.
18. Show that if $r_n \leq s_n \leq t_n$ for all $n \in N$ and $\lim_{n \to \infty} r_n = \lim_{n \to \infty} t_n = s$ then $\langle s_n \rangle$ converges to $s$.
19. Show that if $\lim_{n \to \infty} (s_n - 1)/(s_n + 1) = 0$ then $\lim_{n \to \infty} s_n = 1$.
20. Investigate the limits $\lim_{n,m \to \infty} s_{nm}$, $\lim_{n} s_{nm}$, $\lim_{m} s_{nm}$, and the iterated limits for the sequences
    (i) $s_{nm} = n/(n + m)$,
    (ii) $s_{nm} = (-1)^n n/(n + m)$,
    (iii) $s_{nm} = (-1)^{n+m}(1/n + 1/m)$.
21. Show that $\sum a_i\ (i \in N)$ converges iff ${}^*\!\sum {}^*a_i\ (n \leq i \leq m) \simeq 0$ for all $n$ and $m$ ($n < m$) in ${}^*N_\infty$. Conclude that if $\sum a_i\ (i \in N)$ converges then ${}^*a_n \simeq 0$ for all $n \in {}^*N_\infty$.
22. Prove the formulas $\lim_{n \to \infty} (n!/b^n) = \infty$ ($b \geq 1$), $\lim_{n \to \infty} (b^n/n^c) = \infty$ ($b > 1$, $c \geq 0$), $\lim_{n \to \infty} (n^c/\ln n) = \infty$ ($c > 0$) by using transfer of familiar properties of logs.
1.9 Topology on the Reals
In this section we present nonstandard characterizations of the basic topological notions of open, closed, and compact set, and use these characterizations to prove a few standard results. Familiarity with the standard definitions is assumed.
9.1 Proposition Let $A$ be a subset of $R$. Then

(i) $A$ is open iff $m(a) \subseteq {}^*A$ for each $a \in A$,
(ii) $A$ is closed iff $m(a) \cap {}^*A$ is empty for each $a \in A'$.

Proof: (i) Suppose that $A$ is open and let $a \in A$. By the definition of openness, there exists a standard $\varepsilon > 0$ so that

(9.1) $(\forall x)[\underline{R}(x) \wedge |x - a| < \varepsilon \to \underline{A}(x)]$

is true in $\mathcal{R}$. By transfer, if $x \in {}^*R$ and $|x - a| < \varepsilon$ then $x \in {}^*A$. In particular, if $|x - a| \simeq 0$ then $x \in {}^*A$ and so $m(a) \subseteq {}^*A$. Conversely, suppose that $m(a) \subseteq {}^*A$ for each $a \in A$. If $A$ is not open, there exists an $a \in A$ so that for each $n \in N$ we can find an $x_n \in A'$ with $|x_n - a| < 1/n$. Define a Skolem function $\psi: N \to R$ by $\psi(n) = x_n$, where $x_n$ is a specifically chosen element of $A'$ with $|x_n - a| < 1/n$. Then the sentence

(9.2) $(\forall n)[\underline{N}(n) \to \underline{A}'(\psi(n)) \wedge |\psi(n) - a| < 1/n]$

is true in $\mathcal{R}$. By transfer, for all $n \in {}^*N$, ${}^*\psi(n) \in {}^*A'$ and $|{}^*\psi(n) - a| < 1/n$. In particular, for $n = \omega$ where $\omega$ is infinite, the number $x_\omega = {}^*\psi(\omega)$ satisfies $x_\omega \in {}^*A'$ and $|x_\omega - a| < 1/\omega \simeq 0$, i.e., $x_\omega \in m(a)$ (contradiction).

(ii) This assertion can be proved by noting that, by definition, $A$ is closed iff $A'$ is open (exercise). □

9.2 Theorem

(i) If $\{A_i : i \in I\}$ is a collection of open sets in $R$, then $\bigcup A_i\ (i \in I)$ is open.
(ii) If $A_1, \ldots, A_n$ are open in $R$, then $\bigcap A_i\ (1 \leq i \leq n)$ is open.
(iii) If $\{A_i : i \in I\}$ is a collection of closed sets in $R$, then $\bigcap A_i\ (i \in I)$ is closed.
(iv) If $A_1, \ldots, A_n$ are closed in $R$, then $\bigcup A_i\ (1 \leq i \leq n)$ is closed.

Proof: We prove (i) and (ii) and leave the proofs of (iii) and (iv) to the reader.

(i) Let $x \in \bigcup A_i\ (i \in I)$. Then $x \in A_j$ for some $j \in I$ and so $m(x) \subseteq {}^*A_j$ by 9.1(i). Thus $m(x) \subseteq \bigcup {}^*A_i\ (i \in I) \subseteq {}^*[\bigcup A_i\ (i \in I)]$, the last inclusion by Proposition 5.8(iii). This shows that $\bigcup A_i\ (i \in I)$ is open by 9.1(i).

(ii) Let $x \in \bigcap A_i\ (1 \leq i \leq n)$. Then $x \in A_i$ and so $m(x) \subseteq {}^*A_i$ for each $i$, $1 \leq i \leq n$, by 9.1(i). Thus $m(x) \subseteq {}^*A_1 \cap \cdots \cap {}^*A_n = {}^*[\bigcap A_i\ (1 \leq i \leq n)]$, the last equality by Proposition 5.8(ii). Thus $\bigcap A_i\ (1 \leq i \leq n)$ is open by 9.1(i). □
Recall that a point $x \in \mathbb{R}$ is an accumulation point of a set $A \subseteq \mathbb{R}$ if, for every $n \in \mathbb{N}$, there is a point $y$ in $A$ different from $x$ with $|y - x| < 1/n$. The set of accumulation points of $A$ is denoted by $\hat{A}$, and the closure of $A$ is the set $\bar{A} = A \cup \hat{A}$.

9.3 Proposition A point $x \in \mathbb{R}$ is an accumulation point of $A \subseteq \mathbb{R}$ iff there is a $y \ne x$ in ${}^*A$ with $y \simeq x$.
Proof: Suppose that $x$ is an accumulation point of $A$. Then for each $n \in \mathbb{N}$ we can find a $y \ne x$ in $A$ with $|x - y| < 1/n$. Let $\psi: \mathbb{N} \to A$ be a Skolem function obtained by associating a $y \in A$ with each $n \in \mathbb{N}$ so that the sentence

(9.3)  $(\forall n)[\underline{N}(n) \rightarrow \psi(n) \ne x \wedge \underline{A}(\psi(n)) \wedge |x - \psi(n)| < 1/n]$

is true in $\mathcal{R}$. By transfer we see that, for each $n \in {}^*\mathbb{N}$, ${}^*\psi(n) \ne x$, ${}^*\psi(n) \in {}^*A$, and $|x - {}^*\psi(n)| < 1/n$. We need only choose $y = {}^*\psi(\omega) \in {}^*A$ for $\omega \in {}^*\mathbb{N}_\infty$. The converse is left to the reader. □

9.4 Proposition The closure $\bar{A}$ of a set $A$ in $\mathbb{R}$ consists of those $x \in \mathbb{R}$ for which $m(x) \cap {}^*A$ is not empty.
Proof: If $x \in \bar{A}$ then $x \in A$ or $x \in \hat{A}$. If $x \in A$ then $x \in {}^*A$ and $x \in m(x)$. If $x \in \hat{A}$ then $m(x) \cap {}^*A$ is not empty by Proposition 9.3. The converse is established by reversing the argument. □

Proposition 9.4 can be expressed in a more graphic way. The standard part map $\mathrm{st}: G(0) \to \mathbb{R}$ defines a mapping, also denoted by st, from subsets of $G(0)$ to subsets of $\mathbb{R}$ by the obvious definition: for each $B \subseteq G(0)$, $\mathrm{st}(B) = \{\mathrm{st}(y) : y \in B\} = \{x \in \mathbb{R} : \text{there exists a } y \in B \text{ with } y \simeq x\}$. Proposition 9.4 can be restated as asserting that $\mathrm{st}({}^*A \cap G(0)) = \bar{A}$ for any subset $A$ of $\mathbb{R}$, and thus it shows how to construct the closure of any set $A$ by constructing the *-transform of $A$ and then collapsing back to $\mathbb{R}$ by a standard part operation. In this form, Proposition 9.4 is a prototype of similar results obtained in more complicated situations later in this book.

9.5 Theorem For any subsets $A$ and $B$ of $\mathbb{R}$,
(a) $A \subseteq \bar{A}$,
(b) $\bar{\bar{A}} = \bar{A}$,
(c) $\overline{A \cup B} = \bar{A} \cup \bar{B}$,
(d) $\bar{A}$ is closed,
(e) if $B$ is closed and $A \subseteq B$ then $\bar{A} \subseteq B$,
(f) if $A$ is closed then $A = \bar{A}$.
Proof: (a) Immediate from the definition.
(b) $\bar{A} \subseteq \bar{\bar{A}}$ from (a). If $x \in \bar{\bar{A}}$ but $x \notin \bar{A}$ then $x$ is an accumulation point of $\bar{A}$. Thus, for any $n \in \mathbb{N}$, there is a $y \in \bar{A}$ with $|x - y| < 1/n$; by Proposition 9.4 there is a $z \in {}^*A$ with $|x - z| < 1/n$. On the other hand, if $x \notin \bar{A}$ there is an $n \in \mathbb{N}$ so that $|x - z| > 1/n$ for all $z \in A$. By transfer (check) this is true for all $z \in {}^*A$ (contradiction).
(d) If $b \notin \bar{A}$ then $m(b) \cap {}^*\bar{A} = \emptyset$, for otherwise $b \in \bar{\bar{A}}$ by 9.4, and then $b \in \bar{A}$ by part (b). Hence $\bar{A}$ is closed by 9.1(ii).
Parts (c), (e), and (f) are left as exercises. □

Next we present an important characterization of compactness due to Robinson. Recall that, by definition, the collection $A_i$ $(i \in I)$ of sets is a covering of the set $A \subseteq \mathbb{R}$ if $A \subseteq \bigcup A_i$ $(i \in I)$, and that $A$ is compact if each covering $A_i$ $(i \in I)$ by open sets contains a finite subcovering $A_i$ $(i \in I')$ (i.e., $I' \subseteq I$ is finite). To obtain Robinson's characterization we need the following standard result.
9.6 Lemma Each covering of $A \subseteq \mathbb{R}$ by open sets $A_i$ $(i \in I)$ contains a finite subcovering if each covering of $A$ by a collection of open intervals $(a_n, b_n)$ with rational end points contains a finite subcovering.
Proof: Let $A_i$ $(i \in I)$ be a covering of $A$ by open sets. If $x \in A$ then $x \in A_j$ for some $j \in I$. Since the rationals are dense in $\mathbb{R}$ and $A_j$ is open, we can find rationals $a$ and $b$ so that $x \in (a, b) \subseteq A_j$ (why?). The corresponding countable collection covers $A$. Select a finite subcovering from this latter covering. Each interval in the finite subcovering is contained in some $A_i$, and so we may find a finite collection of the $A_i$ $(i \in I)$ which also covers $A$. □

9.7 Robinson's Theorem The set $A \subseteq \mathbb{R}$ is compact iff for each $y \in {}^*A$ there is an $x \in A$ with $x \simeq y$, i.e., every point in ${}^*A$ is near a point in $A$.
Proof: Suppose that $A$ is compact but $y \in {}^*A$ is not near any $x \in A$. Then for each $x \in A$ there is a $\delta_x > 0$ in $\mathbb{R}$ such that $|x - y| \ge \delta_x$. Since $A$ is compact we can extract a finite subcovering $A_{x_i} = \{z \in \mathbb{R} : |x_i - z| < \delta_{x_i}\}$ $(i = 1, 2, \ldots, n)$ from the covering of $A$ by the sets $A_x = \{z \in \mathbb{R} : |x - z| < \delta_x\}$ $(x \in A)$. It follows that

(9.4)  $(\forall y)[\underline{A}(y) \wedge |x_1 - y| \ge \delta_{x_1} \wedge \cdots \wedge |x_{n-1} - y| \ge \delta_{x_{n-1}} \rightarrow |x_n - y| < \delta_{x_n}]$

is true in $\mathcal{R}$. Transferring to ${}^*\mathcal{R}$, we obtain a contradiction with the fact that $y \in {}^*A$ and $|x_i - y| \ge \delta_{x_i}$ for $i = 1, 2, \ldots, n$.

Assume now that a covering $A_i$ $(i \in I)$ of $A$ by open sets contains no finite subcovering. By Lemma 9.6 there exists a covering of $A$ by a countable collection $I_n = \{x \in \mathbb{R} : a_n < x < b_n\}$, $n \in \mathbb{N}$, of open intervals with rational end points which has no finite subcovering. Thus there is a Skolem function $\psi: \mathbb{N} \to A$ so that

(9.5)  $(\forall n)(\forall k)[\underline{N}(n) \wedge \underline{N}(k) \wedge k \le n \wedge a_k < \psi(n) \rightarrow b_k \le \psi(n)]$

is true in $\mathcal{R}$ (check). By transfer we see that if $\omega$ is infinite, then ${}^*\psi(\omega) \notin {}^*(a_k, b_k)$ for any $k \in \mathbb{N}$. Thus, ${}^*\psi(\omega) \in {}^*A$ is not near a point $x$ in $A$ since $m(x) \subseteq {}^*(a_k, b_k)$ for some $k \in \mathbb{N}$. □

In Chapter III we generalize this result to topological spaces, and the proof given there avoids an analogue of Lemma 9.6. As an application of Robinson's theorem, we prove the following famous result.
9.8 Theorem (Heine-Borel) A set $A \subseteq \mathbb{R}$ is compact iff it is closed and bounded.

Proof: If $A$ is not closed then, by Proposition 9.1(ii), there is an $x \in A'$ and a $y \in {}^*A$ with $y \simeq x$; since $\mathrm{st}(y) = x \notin A$ it follows by Theorem 9.7 that $A$ is not compact. If $A$ is not bounded there is a Skolem function $\psi: \mathbb{N} \to A$ so that

(9.6)  $(\forall n)[\underline{N}(n) \rightarrow n \le |\psi(n)| \wedge \underline{A}(\psi(n))]$

is true in $\mathcal{R}$. Transferring to ${}^*\mathcal{R}$, and choosing $\omega$ infinite, we see that $\omega \le |{}^*\psi(\omega)|$, and so the point $y = {}^*\psi(\omega) \in {}^*A$ is not near any standard point. Hence $A$ is not compact by Theorem 9.7. If $A$ is closed and bounded there is an $M \in \mathbb{R}$ so that

(9.7)  $(\forall y)[\underline{A}(y) \rightarrow |y| \le M]$.

By transfer, if $y \in {}^*A$ then $|y| \le M$, and so $\mathrm{st}(y) = x$ is in $\bar{A}$ (why?). Since $A$ is closed, Theorem 9.5(f) shows that $x \in A$. Thus, $A$ is compact by Theorem 9.7. □
The nonstandard characterizations of topological notions on the real line developed in this section can easily be extended to $n$-dimensional space $\mathbb{R}^n$. Observe that all characterizations are stated in terms of the notions of near points or monads. To extend our characterizations to subsets of $\mathbb{R}^n$ we make the following definition.
9.9 Definition If $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ are points in ${}^*\mathbb{R}^n$ then $x \simeq y$ iff $x_i \simeq y_i$, $1 \le i \le n$, and $m(x) = \{y \in {}^*\mathbb{R}^n : x \simeq y\}$.
With this definition the results of the section apply also to subsets of $\mathbb{R}^n$. We return to these problems (in more generality) in Chapter III.

Exercises 1.9

1. Finish the proof of Proposition 9.1(ii).
2. Finish the proof of Proposition 9.3 by showing that if $x$ is not an accumulation point of $A$ then for $y \in {}^*A - \{x\}$ we have $y \not\simeq x$.
3. Prove parts (c), (e), and (f) of Theorem 9.5.
4. Show that a set $A \subseteq \mathbb{R}$ is closed iff whenever $\langle x_n : n \in \mathbb{N} \rangle$ is a sequence of points in $A$ which converges to $x$, then $x \in A$.
5. Show that if $A_1, A_2, \ldots, A_n$ are open (closed) subsets of $\mathbb{R}$ then $A_1 \times A_2 \times \cdots \times A_n$ is open (closed) in $\mathbb{R}^n$.
6. Use Robinson's theorem to show that if $K \subseteq \mathbb{R}$ is compact and $A \subseteq \mathbb{R}$ is closed then $K \cap A$ is compact.
7. Show that Robinson's theorem holds also for subsets $A$ of $\mathbb{R}^n$ (with the obvious definition of compactness). Hence show that if $A_1, \ldots, A_n$ are compact subsets of $\mathbb{R}$ then $A_1 \times \cdots \times A_n$ is a compact subset of $\mathbb{R}^n$.
8. Prove that $\mathbb{R}$ is connected. That is, show that $\mathbb{R}$ cannot be of the form $A \cup B$, where $A$ and $B$ are nonempty and $\bar{A} \cap B$ and $A \cap \bar{B}$ are both empty. [Hint: Assume the contrary, choose $x \in A$, $y \in B$, and consider the points $x_k = x + (y - x)k/n$, $0 \le k \le n$. There is a largest $k$, say $k_0$, such that $x_k \in A$ for all $k \le k_0$ and $x_{k_0+1} \in B$.]
9. A set $A \subseteq \mathbb{R}$ is bounded if there exists an $n \in \mathbb{N}$ so that $A \subseteq [-n, n]$. Show that $A$ is bounded iff every $x \in {}^*A$ is finite.
10. Show that if $A$ is compact in $\mathbb{R}$ and $x \notin A$, then there is a $y \in A$ such that for all $z \in A$, $|x - y| \le |x - z|$.
11. Let $F_1 \supseteq F_2 \supseteq \cdots$ be a decreasing sequence of non-empty compact sets in $\mathbb{R}$. Show that $\bigcap F_i$ $(i \in \mathbb{N}) \ne \emptyset$ by choosing $x_n \in F_n$ for each $n \in \mathbb{N}$ (so then $x_n \in F_m$ for $m \le n$).
12. Use Theorem 9.7 to show that if $A$ and $B$ are compact subsets of $\mathbb{R}$ then $A + B = \{x + y : x \in A, y \in B\}$ is compact.
1.10 Limits and Continuity
It should now be clear that the notions of limit and continuity can be characterized nonstandardly in much the same way as were the notions of the previous sections; therefore we will be brief in the following discussion.
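10.1 Proposition Let $f$ be defined on $A \subseteq \mathbb{R}$ and let $a \in A$. Then

(a) $\lim_{x \to a} f(x) = L$ iff ${}^*f(x) \simeq L$ for all $x \in {}^*A$ with $x \simeq a$ but $x \ne a$,
(b) $\lim_{x \to a^+} f(x) = L$ [$\lim_{x \to a^-} f(x) = L$] iff ${}^*f(a + \varepsilon) \simeq L$ for all $\varepsilon > 0$ [$\varepsilon < 0$] with $\varepsilon \simeq 0$, $a + \varepsilon \in {}^*A$, and at least one such $\varepsilon$ exists,
(c) $\lim_{x \to a} f(x) = \infty$ $(-\infty)$ iff ${}^*f(x)$ is positive (negative) infinite for all $x \in {}^*A$ with $x \simeq a$, $x \ne a$,
(d) $\lim_{x \to +\infty\,(-\infty)} f(x) = L$ iff ${}^*f(x) \simeq L$ for all positive (negative) infinite $x \in {}^*A$, and at least one such $x$ exists.

Proof: We prove (a) and leave the remaining proofs to the reader as exercises. Recall that $\lim_{x \to a} f(x) = L$ if and only if, given $\varepsilon > 0$ in $\mathbb{R}$, there exists a $\delta > 0$ in $\mathbb{R}$ so that $|f(x) - L| < \varepsilon$ if $0 < |x - a| < \delta$ and $x \in A$. Suppose that $\lim_{x \to a} f(x) = L$, and find the $\delta$ corresponding to some $\varepsilon > 0$ in $\mathbb{R}$. Then

(10.1)  $(\forall x)[\underline{A}(x) \wedge 0 < |x - a| < \delta \rightarrow |f(x) - L| < \varepsilon]$.

By transfer, if $x \in {}^*A = \mathrm{dom}\,{}^*f$ and $0 < |x - a| < \delta$, then $|{}^*f(x) - L| < \varepsilon$. In particular, $|{}^*f(x) - L| < \varepsilon$ if $x \simeq a$ but $x \ne a$ for any $\varepsilon > 0$ in $\mathbb{R}$ and so ${}^*f(x) \simeq L$. Conversely, if $\lim_{x \to a} f(x)$ does not exist or $\lim_{x \to a} f(x)$ exists but is not equal to $L$, then there exists a standard $\varepsilon > 0$ and a Skolem function $\psi: \mathbb{N} \to A - \{a\}$ so that $|\psi(n) - a| < 1/n$ and $|f(\psi(n)) - L| \ge \varepsilon$. Thus

(10.2)  $(\forall n)[\underline{N}(n) \rightarrow \underline{A}(\psi(n)) \wedge 0 < |\psi(n) - a| < 1/n \wedge |f(\psi(n)) - L| \ge \varepsilon]$.

By transfer, $|{}^*f({}^*\psi(n)) - L| \ge \varepsilon$, ${}^*\psi(n) \in {}^*A$, and $0 < |{}^*\psi(n) - a| < 1/n$ for all $n \in {}^*\mathbb{N}$. In particular, if $n \in {}^*\mathbb{N}_\infty$ then $x = {}^*\psi(n)$ satisfies $x \in {}^*A$, $x \simeq a$, $x \ne a$, and $|{}^*f(x) - L| \ge \varepsilon$, i.e., ${}^*f(x) \not\simeq L$. □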
10.2 Proposition Let $f$ be defined on $A$ and choose $a \in A$. Then the limit $\lim_{x \to a} f(x)$ exists iff ${}^*f(x) \simeq {}^*f(y)$ for all $x, y \in {}^*A$ with $x \simeq a$, $y \simeq a$ but $x \ne a$, $y \ne a$.

Proof: Exercise. □

10.3 Theorem If $\lim_{x \to a} f(x) = L$, $\lim_{x \to a} g(x) = M$, then

(a) $\lim_{x \to a} (f + g)(x) = L + M$,
(b) $\lim_{x \to a} (fg)(x) = LM$,
(c) $\lim_{x \to a} (f/g)(x) = L/M$ if $M \ne 0$.

Proof: Exercise. □
10.4 Proposition Let $f$ be defined on $A \subseteq \mathbb{R}$. Then $f$ is continuous at $a \in A$ iff ${}^*f(x) \simeq f(a)$ for all $x \in {}^*A$ with $x \simeq a$, i.e., ${}^*f(m(a) \cap {}^*A) \subseteq m(f(a))$.

Proof: Immediate from 10.1 and the definition of continuity. □

Proposition 10.4 says that if $f$ is continuous at $x \in A$, and $x + \Delta x \in {}^*A$ where $\Delta x \simeq 0$, then $\Delta y = {}^*f(x + \Delta x) - f(x) \simeq 0$. For example, if $f(x) = x^2$, then $\Delta y = (x + \Delta x)^2 - x^2 = 2x\,\Delta x + (\Delta x)^2 \simeq 0$.
10.5 Theorem If $f$ and $g$ are defined on $A$ and continuous at $a \in A$, then so are $f + g$, $fg$, and [if $g(a) \ne 0$] $f/g$.

Proof: Immediate from 10.3 and 10.4. □

The preceding propositions can be used to prove the intermediate and extreme value theorems.

10.6 Intermediate Value Theorem If $f$ is continuous on the closed and bounded interval $[a, b]$ and $f(a) < d < f(b)$ for some $d$, then there exists a $c \in (a, b)$ with $f(c) = d$.

Proof: Consider the points $x_k = a + k(b - a)/n$, $0 \le k \le n$. Considering the values of $f$ at $x_k$, we see that there exists a Skolem function $\psi: \mathbb{N} \to [a, b)$ satisfying $f(\psi(n)) < d$ and $f(\psi(n) + (b - a)/n) \ge d$ (check). Hence the sentence

(10.3)  $(\forall n)[\underline{N}(n) \rightarrow a \le \psi(n) < b \wedge f(\psi(n)) < d \wedge f(\psi(n) + (b - a)/n) \ge d]$

is true in $\mathcal{R}$. Transferring to ${}^*\mathcal{R}$, and letting $n \in {}^*\mathbb{N}_\infty$, we have

(10.4)  ${}^*f({}^*\psi(n)) < d$  and  ${}^*f({}^*\psi(n) + (b - a)/n) \ge d$.

Let $c = \mathrm{st}({}^*\psi(n)) = \mathrm{st}({}^*\psi(n) + (b - a)/n)$. By continuity we have $f(c) \le d$ and $f(c) \ge d$, and hence $f(c) = d$. Also $c$ cannot equal either $a$ or $b$, since otherwise $f(c) = f(a)$ or $f(b)$. □
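The grid argument in the proof of Theorem 10.6 has a finite shadow that can be tried out directly. The following is only a minimal sketch, assuming a large standard $n$ stands in for an infinite hyperinteger; the function name `first_crossing` and the sample function $t \mapsto t^2$ are illustrative choices, not part of the text.

```python
from fractions import Fraction


def first_crossing(f, a, b, d, n):
    """Return (x_k, step) for the first grid point x_k = a + k(b-a)/n
    with f(x_k) < d <= f(x_k + step).

    This mimics the Skolem function psi(n) in the proof: it assumes
    f(a) < d < f(b), which guarantees that such a k exists.
    """
    step = Fraction(b - a) / n
    for k in range(n):
        x = a + k * step
        if f(x) < d and f(x + step) >= d:
            return x, step
    raise ValueError("expected f(a) < d < f(b)")


# Illustrative use: approximate a point c in (0, 2) with c*c = 2.
x, step = first_crossing(lambda t: t * t, 0, 2, 2, 1000)
print(float(x), float(x + step))
```

For every standard $n$ the two printed values bracket a solution of $f(c) = d$ and differ by $(b - a)/n$; the proof replaces "very close" by "infinitely close" and recovers $c$ with the standard part map.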
10.7 Extreme Value Theorem If $f$ is continuous on the closed and bounded interval $[a, b]$, then there exists a $c \in [a, b]$ so that $f(c) \ge f(x)$ for all $x \in [a, b]$.

Proof: For each $n \in \mathbb{N}$ construct the points $x_{n,k} = a + k(b - a)/n$, $0 \le k \le n$. There is a Skolem function $\psi: \mathbb{N} \to \mathbb{N} \cup \{0\}$ satisfying $\psi(n) \le n$ such that, for each $n \in \mathbb{N}$, $f(x_{n,\psi(n)}) \ge f(x_{n,k})$, $0 \le k \le n$, since the finite set of numbers $f(x_{n,k})$, $0 \le k \le n$, has a maximum for some $k$ satisfying $0 \le k \le n$. By transfer, ${}^*f(x_{n,{}^*\psi(n)}) \ge {}^*f(x_{n,k})$, $0 \le k \le n$, for $k \in {}^*\mathbb{N}$ and $n$ fixed and infinite. Then $c = \mathrm{st}(x_{n,{}^*\psi(n)})$ satisfies the conditions of the theorem. To see this, fix $d \in [a, b]$. Then $d \simeq x_{n,k}$ for some $k \in {}^*\mathbb{N}$ with $0 \le k \le n$ (exercise), so, using continuity, $f(d) \simeq {}^*f(x_{n,k}) \le {}^*f(x_{n,{}^*\psi(n)}) \simeq f(c)$. If $f(d) \simeq f(c)$ then $f(d) = f(c)$ since both numbers are real. Otherwise $f(d) < f(c)$. □

Proposition 10.4 shows that $f$ is continuous on $A$ iff ${}^*f(m(a) \cap {}^*A) \subseteq m(f(a))$ for all $a \in A$. Uniform continuity on $A$ results if an analogous condition holds for all $a \in {}^*A$.

10.8 Proposition The function $f$ is uniformly continuous on a set $A$ iff ${}^*f(m(a) \cap {}^*A) \subseteq m({}^*f(a))$ for all $a \in {}^*A$; i.e., $a, b \in {}^*A$ and $a \simeq b$ implies ${}^*f(a) \simeq {}^*f(b)$.

Proof: Recall that $f$ is uniformly continuous on $A$ iff, given $\varepsilon > 0$ in $\mathbb{R}$, there exists a $\delta > 0$ in $\mathbb{R}$ so that, for all $a \in A$, $|f(x) - f(a)| < \varepsilon$ if $|x - a| < \delta$ and $x \in A$. Suppose that $f$ is uniformly continuous on $A$, let $\varepsilon > 0$ in $\mathbb{R}$ be given, and find the corresponding $\delta > 0$ in $\mathbb{R}$. Then the sentence

(10.5)  $(\forall a)(\forall b)[\underline{A}(a) \wedge \underline{A}(b) \wedge |a - b| < \delta \rightarrow |f(a) - f(b)| < \varepsilon]$

is true in $\mathcal{R}$. By transfer, for all $a$ and $b$ in ${}^*A$, $|a - b| < \delta$ implies $|{}^*f(a) - {}^*f(b)| < \varepsilon$. In particular, this is true for any $\varepsilon > 0$ in $\mathbb{R}$ if $a \simeq b$, and hence $a, b \in {}^*A$ and $a \simeq b$ implies ${}^*f(a) \simeq {}^*f(b)$. Conversely, suppose $f$ is not uniformly continuous on $A$. Then there is an $\varepsilon > 0$ in $\mathbb{R}$ so that, for each $n \in \mathbb{N}$, there are points $\psi_1(n) = a_n \in A$ and $\psi_2(n) = b_n \in A$ with $|a_n - b_n| < 1/n$ but $|f(a_n) - f(b_n)| \ge \varepsilon$. By transfer of the appropriate sentence (the reader is invited to write one down), for each $n \in {}^*\mathbb{N}$ there are points $a_n$ and $b_n \in {}^*A$ with $|a_n - b_n| < 1/n$ but $|{}^*f(a_n) - {}^*f(b_n)| \ge \varepsilon$. With $n \in {}^*\mathbb{N}_\infty$ we have $a_n \simeq b_n$ but ${}^*f(a_n) \not\simeq {}^*f(b_n)$. □
10.9 Examples

1. $\lim_{x \to 3} x^2 = 9$ since if $h \simeq 0$, we have $(3 + h)^2 = 9 + 6h + h^2 \simeq 9$.
2. $\lim_{h \to 0} \{[(x + h)^2 - x^2]/h\} = 2x$ since if $h \simeq 0$, $h \ne 0$, $[(x + h)^2 - x^2]/h = 2x + h \simeq 2x$.
3. $\lim_{x \to \infty} (\sqrt{x + 1} - \sqrt{x}) = 0$ since for $h$ positive infinite in ${}^*\mathbb{R}$

$\sqrt{h + 1} - \sqrt{h} = \dfrac{(\sqrt{h + 1} - \sqrt{h})(\sqrt{h + 1} + \sqrt{h})}{\sqrt{h + 1} + \sqrt{h}} = \dfrac{1}{\sqrt{h + 1} + \sqrt{h}} \simeq 0.$

4. $f(x) = 1/x$ is continuous on $(0, 1)$ since if $a \in (0, 1)$ and $h \simeq 0$, $1/a - 1/(a + h) = h/a(a + h) \simeq 0$. However, $f$ is not uniformly continuous on $(0, 1)$ since if $n \in {}^*\mathbb{N}_\infty$, $1/n$ and $1/(n - 1)$ are in ${}^*(0, 1)$ and $1/n \simeq 1/(n - 1)$ but ${}^*f(1/n) - {}^*f(1/(n - 1)) = 1 \not\simeq 0$.
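Example 10.9(4) can be mirrored with standard numbers. The sketch below is only an illustration under the assumption that large standard $n$ imitate an infinite hyperinteger; the helper name `gap_and_jump` is hypothetical. Exact rational arithmetic is used so that no rounding blurs the picture.

```python
from fractions import Fraction


def f(x):
    """The function f(x) = 1/x of Example 10.9(4)."""
    return 1 / x


def gap_and_jump(n):
    """Return (distance between 1/(n-1) and 1/n, f(1/n) - f(1/(n-1)))."""
    a, b = Fraction(1, n), Fraction(1, n - 1)
    return b - a, f(a) - f(b)


for n in (10, 1000, 10**6):
    gap, jump = gap_and_jump(n)
    # The gap shrinks like 1/(n(n-1)) while the jump stays equal to 1.
    print(n, float(gap), jump)
```

The distance between the two arguments tends to zero while the difference of the function values stays at 1, which is the standard counterpart of finding $a \simeq b$ in ${}^*(0, 1)$ with ${}^*f(a) \not\simeq {}^*f(b)$ in Proposition 10.8.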
Proposition 10.8 can be used effectively to prove standard results.

10.10 Theorem If $f$ is continuous on the compact set $A$, then $f$ is uniformly continuous on $A$.

Proof: If $x, y \in {}^*A$ and $x \simeq y$, then both $x$ and $y$ are near a standard point $a \in A$ since $A$ is compact (Theorem 9.7). Thus ${}^*f(x) \simeq f(a) \simeq {}^*f(y)$ by continuity (Proposition 10.4), so $f$ is uniformly continuous by Proposition 10.8. □

10.11 Theorem If $A \subseteq \mathbb{R}$ is compact and $f$ is continuous on $A$, then $f(A)$ is compact.

Proof: If $y \in {}^*[f(A)] = {}^*f({}^*A)$ (Proposition 5.6) then there is an $x \in {}^*A$ with ${}^*f(x) = y$. Since $A$ is compact there is a point $a \in A$ with $x \simeq a$ (Theorem 9.7). Then ${}^*f(x) = y \simeq f(a)$ since $f$ is continuous at $a$, and so $f(A)$ is compact by Theorem 9.7. □
10.12 Theorem Suppose that $f$ is uniformly continuous on each bounded subset of its domain $A$. Then $f$ has a unique extension $g$ defined on $\bar{A}$ (i.e., $f$ agrees with $g$ on $A$) such that $g$ is uniformly continuous on every bounded subset of $\bar{A}$.

Proof: Every standard point $y \in \bar{A}$ is near a finite point $x \in {}^*A$ and we define $g(y) = \mathrm{st}({}^*f(x))$. This definition is independent of the $x$ we choose since if $x' \simeq y$ then $x \simeq x'$, and both $x$ and $x'$ are in ${}^*B$, where $B = A \cap [-|y| - 1, |y| + 1]$ is bounded. Therefore, ${}^*f(x) \simeq {}^*f(x')$ by uniform continuity on $B$. We leave as an exercise the proof that ${}^*f(x)$ is finite. If $C = A \cap [-2n, 2n]$, $n \in \mathbb{N}$, then, given $\varepsilon > 0$, there exists a $\delta > 0$ so that $|f(x) - f(x')| < \varepsilon/2$ if $|x - x'| < \delta$ and $x, x' \in C$. By transfer, $|{}^*f(x) - {}^*f(x')| < \varepsilon/2$ for all $x, x' \in {}^*C$ satisfying $|x - x'| < \delta$. Now if $y, y' \in \bar{A} \cap [-n, n]$ are such that $|y - y'| < \delta/2$ and $y \simeq x$, $y' \simeq x'$ for some $x, x' \in {}^*C$, then $|x - x'| < \delta$, and so $|g(y) - g(y')| \simeq |{}^*f(x) - {}^*f(x')| < \varepsilon/2$. Thus, $|g(y) - g(y')| \le \varepsilon/2 < \varepsilon$. Uniqueness is left to the reader. □

Theorem 10.12 can be used to extend the exponential function $f(x) = a^x$, $a > 0$ in $\mathbb{R}$, defined on the rationals $\mathbb{Q}$ to the reals $\mathbb{R} = \bar{\mathbb{Q}}$. The function $a^x$, $x \in \mathbb{Q}$, satisfies the following properties.
10.13 Properties of Exponents If $a$ and $b$ are positive reals and $q$ and $r$ are rational then

(i) $1^q = 1$,
(ii) $a^q a^r = a^{q+r}$, $a^{-q} = 1/a^q$,
(iii) $(a^q)^r = a^{qr}$,
(iv) $a^q b^q = (ab)^q$,
(v) $a < b$ and $q > 0$ implies $a^q < b^q$,
(vi) $1 < a$ and $q < r$ implies $a^q < a^r$,
(vii) $a \ge 0$ and $q \ge 1$ implies $(a + 1)^q \ge aq + 1$.
The useful inequality (vii) follows by noting that, for $x \ge 0$, $(x + 1)^q - qx - 1$ has a minimum at $x = 0$. Properties (i) through (vi) are obvious. To extend $f(x) = a^x$, $a > 0$, $x \in \mathbb{Q}$, to $\mathbb{R}$ we need only show that $f$ is uniformly continuous on bounded subsets of $\mathbb{Q}$. That is, we need the following lemma.

10.14 Lemma If $a > 0$ in $\mathbb{R}$, then $a^p \simeq a^q$ if $p \simeq q$ in ${}^*\mathbb{Q} \cap G(0)$.

Proof: We may suppose that $p > q$ and $a \ge 1$ [if $0 < a < 1$ consider $a^q = (1/a)^{-q}$]. Let $b = a^{p-q} - 1$; we must show that $b \simeq 0$. By transfer from 10.13(vi), $b \ge 0$, and, by transfer from 10.13(vii),

(10.6)  $a = (b + 1)^{1/(p-q)} \ge b/(p - q) + 1 \ge 1$,

so $b/(p - q)$ is a finite number $\rho$, and hence $b = (p - q)\rho \simeq 0$. □

This argument is due to Keisler [23]. It is easy to show that properties 10.13 are satisfied by the extension $g(x)$, $x \in \mathbb{R}$, of $f(x) = a^x$, $x \in \mathbb{Q}$. For example, $g(y + y') \simeq {}^*f(q + q') = {}^*f(q)\,{}^*f(q') \simeq g(y)g(y')$ if $q \simeq y$, $q' \simeq y'$, and $q, q' \in {}^*\mathbb{Q}$; this establishes the first part of 10.13(ii) for $g$ since $g$ is real-valued.

Most of the results in this section can be extended to functions $f$ of $n$ variables defined on subsets of $\mathbb{R}^n$ simply by using the definition of nearness for points in ${}^*\mathbb{R}^n$ introduced in the previous section. The details are left to the reader.

Exercises 1.10

1. Prove parts (b)-(d) of Proposition 10.1.
2. Prove Proposition 10.2.
3. Prove Theorem 10.3.
4. Complete the proof of Theorem 10.7 by showing that for each $d \in [a, b]$ there is a $k \in {}^*\mathbb{N}$ with $0 \le k \le n$ such that $d \simeq x_{n,k}$.
5. Prove that if $f$ is uniformly continuous on a bounded set $B \subseteq \mathbb{R}$, then ${}^*f(x)$ is finite for each $x \in {}^*B$.
6. Prove uniqueness in Theorem 10.12.
7. Show that there are infinite rational numbers $p$ and $q$ with $p \simeq q$ such that $2^p \not\simeq 2^q$. Where is the assumption that $p, q \in G(0)$ used in the proof of Lemma 10.14?
8. Let $f(x) = \sin(1/x)$ for $0 < x \le 1$ and $f(x) = 0$ for $x = 0$.
(a) Show that $f(x)$ is not continuous on $[0, 1]$.
(b) Show that the function $xf(x)$ is uniformly continuous on $[0, 1]$.
9. Show that the function $f(x) = x^2$ on $(0, \infty)$ is continuous but not uniformly continuous.
10. Show that $\lim_{x \to a} f(x) = L$ iff for each sequence $\langle s_n \rangle$ with $\lim_{n \to \infty} s_n = a$ and $s_n \ne a$, $n \in \mathbb{N}$, we have $\lim_{n \to \infty} f(s_n) = L$.
11. Prove that if $f$ is uniformly continuous on $\mathbb{R}$ and $\langle s_n \rangle$ is a Cauchy sequence then $\langle f(s_n) \rangle$ is a Cauchy sequence.
12. Suppose that $f$ is continuous on $\mathbb{R}$ and satisfies $\lim_{x \to \infty} f(x) = \lim_{x \to -\infty} f(x) = 0$. Prove that $f$ is uniformly continuous.
13. Suppose that $f$ is defined on a compact set $A$ in $\mathbb{R}$. Prove that $f$ is continuous iff the graph $\{(x, f(x)) \in \mathbb{R}^2 : x \in A\}$ of $f$ is compact.
14. Show that if the function $f$ is continuous on the set $A$ then the zero set $\{x \in A : f(x) = 0\}$ of $f$ is closed.
15. Suppose that the function f on the closed bounded interval [a, b] is monotone [e.g., x
1.11 Differentiation
The theory of differentiation can now be developed easily using the results of the previous section.

11.1 Proposition Let $f$ be defined at $a \in \mathbb{R}$. The derivative $f'(a)$ of $f$ at $a$ exists iff for any infinitesimal $h \ne 0$

(i) ${}^*f(a + h)$ is defined,
(ii) $[{}^*f(a + h) - f(a)]/h$ is finite,
(iii) $\mathrm{st}([{}^*f(a + h) - f(a)]/h)$ is independent of the choice of $h$.

In this case, $f'(a) = \mathrm{st}([{}^*f(a + h) - f(a)]/h)$. The right-hand (left-hand) derivative of $f$ at $a$ exists iff (i)-(iii) hold for any infinitesimal $h > 0$ ($h < 0$).
Proof: Immediate from Proposition 10.1. □

11.2 Proposition Let $f$ be defined on $[a, b]$. The following statements are equivalent:

(i) $f'$ exists and is continuous on $[a, b]$, where $f'(a)$ is the right-hand derivative at $a$ and $f'(b)$ is the left-hand derivative at $b$.
(ii) For all $x, y, x', y'$ in ${}^*[a, b]$ with $x \simeq x' \simeq y \simeq y'$ and $x \ne y$ and $x' \ne y'$,

$\dfrac{{}^*f(x) - {}^*f(y)}{x - y} \simeq \dfrac{{}^*f(x') - {}^*f(y')}{x' - y'} \in G(0).$

If (ii) holds, then $f'(\mathrm{st}(x)) = \mathrm{st}([{}^*f(x) - {}^*f(y)]/(x - y))$.

Proof: If (i) holds and, in ${}^*[a, b]$, $x \simeq y$, $x < y$, $\mathrm{st}(x) = c \in [a, b]$, then by the transfer of the mean value theorem there is an $x_0$ with $x < x_0 < y$ such that $[{}^*f(x) - {}^*f(y)]/(x - y) = {}^*f'(x_0)$. (How is a Skolem function used here?) Since $f'$ is continuous, ${}^*f'(x_0) \simeq f'(c)$, whence (ii) follows. Assume that (ii) holds. If $c = x = x' \in [a, b]$, then $f'(c)$ exists by Proposition 11.1. Using a Skolem function and the transfer principle, we can obtain for each $x \in {}^*[a, b]$ and positive infinitesimal $\varepsilon$ a positive infinitesimal $\delta$ such that when $y \in {}^*[a, b]$ and $0 < |x - y| < \delta$, $|{}^*f'(x) - [{}^*f(x) - {}^*f(y)]/(x - y)| < \varepsilon$. It follows from (ii) that if $x \simeq x'$ in ${}^*[a, b]$ then ${}^*f'(x) \simeq {}^*f'(x')$; i.e., $f'$ is uniformly continuous on $[a, b]$ by Proposition 10.8. □
11.3 Examples

1. If $f(x) = 2x^2 + 3x$ then

$\dfrac{{}^*f(x + h) - f(x)}{h} = \dfrac{2(x + h)^2 + 3(x + h) - 2x^2 - 3x}{h} = \dfrac{4xh + 2h^2 + 3h}{h} = 4x + 3 + 2h \simeq 4x + 3$

for all $h \simeq 0$, $h \ne 0$, and hence $f'(x) = 4x + 3$.

2. Starting with the definition $e = \lim_{x \to \infty} (1 + 1/x)^x$, we show that $de^x/dx = e^x$. If $f(x) = e^x$ then $[{}^*f(x + h) - f(x)]/h = e^x(e^h - 1)/h$, and we need to show that $(e^h - 1)/h \simeq 1$ if $h \simeq 0$. If $b = (e^h - 1)/h$ then $e^h = 1 + bh$. If $h \simeq 0$, $h > 0$, $e^h \simeq 1$ by the continuity of $e^x$ (which we assume here) and so $bh \simeq 0$ and $1/bh$ is infinite. Then

$e = \lim_{x \to \infty} \left(1 + \dfrac{1}{x}\right)^x \simeq (1 + bh)^{1/bh} = (e^h)^{1/bh} = e^{1/b}.$

Hence $b \simeq 1$, and $[{}^*f(x + h) - f(x)]/h \simeq e^x$ if $h > 0$. A similar argument works for $h < 0$, showing that $f'(x) = e^x$ (this argument is due to Keisler [23]).

11.4 Theorem If $f$ is differentiable at $x \in (a, b)$, then $f$ is continuous at $x$.

Proof: By Proposition 10.1, ${}^*f(x + h) - f(x) \simeq f'(x)h$ for $h \simeq 0$, and so ${}^*f(x + h) \simeq f(x)$ for all $h \simeq 0$; i.e., $f$ is continuous at $x$. □

11.5 Theorem If $f$, defined on $(a, b)$, achieves a relative maximum or minimum at $x \in (a, b)$ and is differentiable at $x$, then $f'(x) = 0$.

Proof: Suppose that $f$ achieves a relative minimum at $x$. Then, for all $h$ sufficiently small and positive (negative), we have $[f(x + h) - f(x)]/h \ge 0$ $(\le 0)$. By transfer of the appropriate sentence, we see that $[{}^*f(x + h) - f(x)]/h \ge 0$ $(\le 0)$ if $h \simeq 0$ and $h > 0$ $(h < 0)$. Thus $f'(x) = 0$ from 11.1 and 6.7(iv). □
Rolle’s theorem and the mean value theorem can be deduced in the standard way from this result and the extreme value theorem.
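The key step of Example 11.3(2), that $b = (e^h - 1)/h \simeq 1$ for infinitesimal $h \ne 0$, has an obvious numerical counterpart. The following is only a floating-point sketch, assuming small standard values of $h$ stand in for infinitesimals; the function name `b` follows the example, and `math.expm1` is used because it evaluates $e^h - 1$ accurately for small $h$.

```python
import math


def b(h):
    """The quotient (e^h - 1)/h from Example 11.3(2), for h != 0."""
    return math.expm1(h) / h


for h in (1e-1, 1e-3, 1e-6, -1e-6):
    # b(h) approaches 1 from above for h > 0 and from below for h < 0.
    print(h, b(h))
```

The printed values approach 1 from both sides, in line with the conclusion $f'(x) = e^x$. This is a plausibility check only, not a substitute for the argument in the example.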
$= g(x)f'(x) + f(x)g'(x)$ by 11.1, 11.4 (applied to $g$), and 6.7. The result follows from Proposition 11.1. □

At this point it is natural to introduce differentials in the spirit of Leibniz. Denoting the nonzero infinitesimal $h$ by $\Delta x$, we have

$[{}^*f(x + \Delta x) - f(x)]/\Delta x \simeq f'(x)$

if $f$ is differentiable at $x$. We call

$\Delta y = {}^*f(x + \Delta x) - f(x)$

the increment of $f$ at $x$ corresponding to the increment $\Delta x$. The differential of $f$ at $x$ corresponding to $\Delta x$ is defined to be $dy = f'(x)\,\Delta x$. Notice that $\varepsilon = \Delta y/\Delta x - f'(x)$ is infinitesimal, and so

(11.1)  $\Delta y = f'(x)\,\Delta x + \varepsilon\,\Delta x = dy + \varepsilon\,\Delta x$.
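Equation (11.1) can be made concrete with the polynomial of Example 11.3(1). The sketch below is illustrative only: it assumes $f(x) = 2x^2 + 3x$ with the derivative $4x + 3$ computed in that example, and the helper name `error_term` is hypothetical. Exact rationals show that the error term is $\varepsilon = 2\,\Delta x$, so it shrinks with $\Delta x$.

```python
from fractions import Fraction


def f(x):
    """The polynomial of Example 11.3(1)."""
    return 2 * x * x + 3 * x


def fprime(x):
    """Its derivative, as computed in Example 11.3(1)."""
    return 4 * x + 3


def error_term(x, dx):
    """epsilon = (increment of f)/dx - f'(x), so that Delta y = dy + epsilon*dx."""
    increment = f(x + dx) - f(x)
    return increment / dx - fprime(x)


for dx in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**6)):
    print(dx, error_term(Fraction(5), dx))
```

For this $f$ the computation gives $\varepsilon = 2\,\Delta x$ for every nonzero $\Delta x$, so an infinitesimal $\Delta x$ yields an infinitesimal $\varepsilon$, as (11.1) requires.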
11.7 Theorem (Chain Rule) Let $h(t) = f(g(t))$ be the composite of $f$ and $g$. If $g'(t)$ exists and $f'(g(t))$ exists [so that $g$ is defined in an interval about $t$ and $f$ is defined in an interval about $g(t)$], then $h'(t)$ exists and $h'(t) = f'(g(t))g'(t)$.

Proof: Let $x = g(t)$ and $y = h(t) = f(x)$. By (11.1),

(11.2)  $\Delta y = f'(x)\,\Delta x + \varepsilon\,\Delta x$,  $\varepsilon \simeq 0$,

for any infinitesimal $\Delta x$. Setting $\Delta x = {}^*g(t + \Delta t) - g(t)$, where $\Delta t$ is any nonzero infinitesimal, and dividing by $\Delta t$, we get $\Delta y/\Delta t = f'(x)(\Delta x/\Delta t) + \varepsilon(\Delta x/\Delta t)$. The result follows by taking standard parts. □
11.8 Inverse Function Theorem Let $f$ be continuous and strictly increasing (or decreasing) on $(a, b)$ and let $g$ be the inverse of $f$. If $f$ is differentiable at $x \in (a, b)$ with $f'(x) \ne 0$, then $g$ is differentiable at $y = f(x)$, and $g'(y) = 1/f'(x)$.

Proof: Let $\Delta y \simeq 0$, $\Delta y \ne 0$, and set $\Delta x = {}^*g(y + \Delta y) - g(y)$. Then $\Delta x$ is infinitesimal and nonzero since $g$ is continuous (why?) and one-to-one. Since $f'(x) \ne 0$,

(11.3)  $\dfrac{1}{f'(x)} \simeq \dfrac{\Delta x}{{}^*f(x + \Delta x) - f(x)} = \dfrac{\Delta x}{(y + \Delta y) - y} = \dfrac{\Delta x}{\Delta y}$.

Since this is true for all nonzero infinitesimals $\Delta y$, $g'(y)$ exists and equals $1/f'(x)$. □

Partial derivatives of functions of several variables are defined as usual. For notational convenience, we confine ourselves to functions $z = f(x, y)$ of two variables; the extension to functions of $n$ variables is obvious. The partial derivatives $f_x$ and $f_y$ are defined by $f_x(a, b) = g'(a)$ and $f_y(a, b) = h'(b)$, where $g(x) = f(x, b)$, $h(y) = f(a, y)$. Assuming that the partial derivatives exist, we define the increment $\Delta z$ and total differential $dz$ by

(11.4)  $\Delta z = {}^*f(a + \Delta x, b + \Delta y) - f(a, b)$,

and

(11.5)  $dz = f_x(a, b)\,\Delta x + f_y(a, b)\,\Delta y$,

respectively, where $\Delta x$ and $\Delta y$ are arbitrary numbers in ${}^*\mathbb{R}$. Note that both $\Delta z$ and $dz$ depend on $a$, $b$, $\Delta x$, and $\Delta y$. We say that $f$ is differentiable at $(a, b)$ if

(11.6)  $\Delta z = dz + \varepsilon\,\Delta x + \delta\,\Delta y$

for any infinitesimals $\Delta x$ and $\Delta y$ and corresponding $\varepsilon \simeq 0$ and $\delta \simeq 0$.

11.9 Theorem If $f_x$ and $f_y$ are continuous at $(a, b)$, then $f$ is differentiable at $(a, b)$.
Proof: If $\Delta x$ and $\Delta y$ are nonzero standard numbers, then

(11.7)  $f(a + \Delta x, b + \Delta y) - f(a, b) = [f(a + \Delta x, b + \Delta y) - f(a + \Delta x, b)] + [f(a + \Delta x, b) - f(a, b)]$.

Using the mean value theorem, we have

(11.8)  $f(a + \Delta x, b) - f(a, b) = f_x(u, b)\,\Delta x$,
        $f(a + \Delta x, b + \Delta y) - f(a + \Delta x, b) = f_y(a + \Delta x, v)\,\Delta y$,

where $|a - u| \le |\Delta x|$, $|b - v| \le |\Delta y|$. Hence

(11.9)  $f(a + \Delta x, b + \Delta y) - f(a, b) = f_x(u, b)\,\Delta x + f_y(a + \Delta x, v)\,\Delta y$.

Since this equation is true for all standard $\Delta x$ and $\Delta y$ we have by transfer (check; you must use Skolem functions) that for $\Delta x \simeq 0$, $\Delta y \simeq 0$,

(11.10)  $\Delta z = {}^*f_x(u, b)\,\Delta x + {}^*f_y(a + \Delta x, v)\,\Delta y$

for $u, v \in {}^*\mathbb{R}$ with $|a - u| \le |\Delta x|$, $|b - v| \le |\Delta y|$. The result follows since ${}^*f_x(u, b) \simeq f_x(a, b)$ and ${}^*f_y(a + \Delta x, v) \simeq f_y(a, b)$. □
Exercises 1.11

1. Prove Theorem 11.6, parts (i) and (iii).
2. Why is the inverse function $g$ in Theorem 11.8 continuous?
3. Use Proposition 11.2 to show that if $f'$ exists then it is continuous on $[a, b]$ if and only if for each $x \in {}^*[a, b]$ and each $\Delta x$ with $\Delta x \simeq 0$ and $x + \Delta x \in {}^*[a, b]$, we have $\Delta y = {}^*f(x + \Delta x) - {}^*f(x) = {}^*f'(x)\,\Delta x + \varepsilon\,\Delta x$, where $\varepsilon \simeq 0$. That is, at any $x \in {}^*[a, b]$, $\Delta y = dy + \varepsilon\,\Delta x$ with $\varepsilon \simeq 0$ when $\Delta x \simeq 0$.
4. Consider the example $f(x) = x^2 \sin(1/x)$, $x \ne 0$, $f(0) = 0$, to see what happens in Exercise 3 if $f'$ exists but is not continuous.
5. (Darboux's Theorem) A function f on [a, b] may possess a derivative f' on [a, b] that is not continuous. Prove that if f'(a) < c 0; prove that f'(x) = 0 for some x in (a, b). (iii) Reduce the problem to (ii) by using an appropriate function.]
6. (Hyperreal Mean Value Theorem) Let $f$ be differentiable on $(a, b)$. Assuming the standard mean value theorem (i.e., if $x < y$ are points in $(a, b)$ then there is a $c$, $x < c < y$, with $f'(c) = [f(y) - f(x)]/(y - x)$), show that if $x < y$ in ${}^*(a, b)$ then there is a $c \in {}^*(a, b)$, $x < c < y$, with ${}^*f'(c) = [{}^*f(y) - {}^*f(x)]/(y - x)$.
7. Let $f$ be twice differentiable on $(a, b)$. Prove that if $f'(c) = 0$ and $f''(c) < 0$ [$f''(c) > 0$] for some $c \in (a, b)$ then $f$ has a local maximum [minimum] at $c$. (Hint: Use Exercise 6.)
8. (Behrens [5]). A real-valued function $f$ defined in a neighborhood of $c \in \mathbb{R}$ is uniformly differentiable at $c$ with derivative $f'(c)$ if, for each $\varepsilon > 0$ in $\mathbb{R}$, there is a $\delta > 0$ in $\mathbb{R}$ so that

$\left| \dfrac{f(x) - f(y)}{x - y} - f'(c) \right| < \varepsilon$

for all $x, y \in (c - \delta, c + \delta)$ with $x \ne y$.
(a) Show that $f$ is uniformly differentiable at $c$ iff there exists an $a \in \mathbb{R}$ with

$a \simeq \dfrac{{}^*f(x) - {}^*f(y)}{x - y}$

for all $x, y \in {}^*\mathbb{R}$ with $x \simeq y \simeq c$ and $x \ne y$, and that in this case $f'(c) = a$.
(b) Show that if $f$ has a derivative on an open interval $(a, b)$ containing $c$, then $f'$ is continuous at $c$ iff $f$ is uniformly differentiable at $c$. [Hint: see the proof of Proposition 11.2.]
(c) Give an example of a function $f$ which is uniformly differentiable at a point $c$, but every neighborhood of $c$ contains a point where $f$ is not differentiable.
(d) Show that if $f$ is uniformly differentiable at $c$ then $f$ is continuous on some neighborhood of $c$.
(e) Show that if $f$ is increasing on an interval $(a, b)$ and $f$ is uniformly differentiable at $x \in (a, b)$ with $f'(x) \ne 0$, then the inverse function $g$ is uniformly differentiable at $y = f(x)$ and $g'(y) = 1/f'(x)$.
1.12 Riemann Integration
Nonstandard analysis is a natural tool for developing the theory of Riemann integration on an interval $[a, b]$, and this section contains a few relevant results. We concentrate on integration of continuous functions on intervals $[a, b]$. The presentation in this section owes much to Keisler [23].

12.1 Definition Let $f$ be a continuous function on $[a, b] \subseteq \mathbb{R}$, $a < b$. A partition $P$ of $[a, b]$ is a set $\{x_0, x_1, \ldots, x_n\}$, where $a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b$. The upper, lower, and ordinary Riemann sums $\overline{S}_a^b(f, P)$, $\underline{S}_a^b(f, P)$, and $S_a^b(f, P)$ of $f$ with respect to $P$ on $[a, b]$ are defined by

$\overline{S}_a^b(f, P) = \sum M_i\,\Delta x_i$ $(1 \le i \le n)$,
$\underline{S}_a^b(f, P) = \sum m_i\,\Delta x_i$ $(1 \le i \le n)$,

and

$S_a^b(f, P) = \sum f(x_{i-1})\,\Delta x_i$ $(1 \le i \le n)$,

where $M_i$ and $m_i$ are the maximum and minimum of $f$ on $[x_{i-1}, x_i]$ and $\Delta x_i = x_i - x_{i-1}$, $1 \le i \le n$. If $P$ is given by setting $x_k = a + k\,\Delta x$, $0 \le k \le n - 1$, where $\Delta x$ is a fixed positive number and $n$ is the greatest integer for which $a + (n - 1)\,\Delta x < b$, then we write $\overline{S}_a^b(f, \Delta x)$, $\underline{S}_a^b(f, \Delta x)$, and $S_a^b(f, \Delta x)$ for the upper, lower, and ordinary Riemann sums, and say that $P$ is determined by $\Delta x$. Here, $\Delta x_n = b - x_{n-1} \le \Delta x$. If $a = b$, all Riemann sums are set equal to 0.
The partition $P_2$ is a refinement of $P_1$ if $P_1 \subseteq P_2$. It is easy to see that if $P_2$ is a refinement of $P_1$, then

$\underline{S}_a^b(f, P_1) \le \underline{S}_a^b(f, P_2) \le \overline{S}_a^b(f, P_2) \le \overline{S}_a^b(f, P_1)$.

The common refinement $P_3$ of $P_1$ and $P_2$ is given by $P_3 = P_1 \cup P_2$. Since

$\underline{S}_a^b(f, P_1) \le \underline{S}_a^b(f, P_3) \le \overline{S}_a^b(f, P_3) \le \overline{S}_a^b(f, P_2)$,

it follows that any lower Riemann sum is less than or equal to any upper Riemann sum.

12.2 Definition The function $f$ on $[a, b]$ is said to be Riemann integrable on $[a, b]$ with integral $\int_a^b f(x)\,dx$ if (i) $\underline{S}_a^b(f, P) \le \int_a^b f(x)\,dx \le \overline{S}_a^b(f, P)$ for any partition $P$ of $[a, b]$ and (ii) given any $\varepsilon > 0$ in $\mathbb{R}$ there is a partition $P$ so that $\overline{S}_a^b(f, P) - \underline{S}_a^b(f, P) < \varepsilon$.
We now set out to show that a continuous function $f$ is Riemann integrable. Although we do not have an extension of the set of partitions of $[a, b]$ in this chapter, we can fix $f$ and extend the Riemann sums determined by positive numbers $\Delta x \in \mathbb{R}$ to $\Delta x \in {}^*\mathbb{R}$. In the following result, ${}^*\overline{S}_a^b(f, \cdot)$ and ${}^*\underline{S}_a^b(f, \cdot)$ denote the extensions to ${}^*\mathbb{R}$ of such sums $\overline{S}_a^b(f, \cdot)$ and $\underline{S}_a^b(f, \cdot)$.

12.3 Proposition Let $f$ be continuous on $[a, b]$, and let $\Delta x$ be a positive infinitesimal in ${}^*\mathbb{R}$. Then ${}^*\underline{S}_a^b(f, \Delta x) \simeq {}^*\overline{S}_a^b(f, \Delta x)$.
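Proof: Given $\Delta x > 0$ in $\mathbb{R}$,

$\overline{S}_a^b(f, \Delta x) - \underline{S}_a^b(f, \Delta x) = \sum (M_i - m_i)\,\Delta x_i$ $(1 \le i \le n)$ $\le \sum B\,\Delta x_i$ $(1 \le i \le n)$ $= B \sum \Delta x_i$ $(1 \le i \le n)$ $= B(b - a)$,

where $B = \max_{1 \le i \le n} (M_i - m_i)$. Thus to each $\Delta x \in \mathbb{R}^+$ correspond two points $\psi(\Delta x)$ and $\phi(\Delta x)$ on $[a, b]$ with $|\psi(\Delta x) - \phi(\Delta x)| \le \Delta x$ and

(12.1)  $\overline{S}_a^b(f, \Delta x) - \underline{S}_a^b(f, \Delta x) \le [f(\psi(\Delta x)) - f(\phi(\Delta x))](b - a)$.

For $\Delta x \simeq 0$ in ${}^*\mathbb{R}$ there is a $c \in [a, b]$ with ${}^*\psi(\Delta x) \simeq c \simeq {}^*\phi(\Delta x)$, and hence ${}^*f({}^*\psi(\Delta x)) \simeq {}^*f({}^*\phi(\Delta x))$ by the continuity of $f$ at $c$. The result follows by transfer of (12.1). □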
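The partition "determined by $\Delta x$" of Definition 12.1 and the gap between upper and lower sums in Proposition 12.3 can be computed exactly in a simple case. The following is only a sketch under an extra assumption: $f$ is taken to be increasing, so that the maximum and minimum on each subinterval sit at its right and left end points. The names `partition` and `upper_lower_gap` are illustrative.

```python
from fractions import Fraction


def partition(a, b, dx):
    """Partition of [a, b] determined by dx (Definition 12.1), for a < b.

    Points a + k*dx are kept while they lie below b; b is the last point,
    so the final subinterval has length at most dx.
    """
    points = []
    x = a
    while x < b:
        points.append(x)
        x += dx
    points.append(b)
    return points


def upper_lower_gap(f, a, b, dx):
    """Upper sum minus lower sum, assuming f is increasing on [a, b]."""
    pts = partition(a, b, dx)
    return sum((f(right) - f(left)) * (right - left)
               for left, right in zip(pts, pts[1:]))


square = lambda x: x * x
for dx in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    print(dx, upper_lower_gap(square, Fraction(0), Fraction(1), dx))
```

For $f(x) = x^2$ on $[0, 1]$ and $\Delta x = 1/n$ the sum telescopes to exactly $1/n$, so the gap tends to zero with $\Delta x$; with an infinitesimal $\Delta x$ the same computation, transferred, gives an infinitesimal gap.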
12.4 Corollary Let $f$ be continuous on $[a, b]$. Then $f$ is Riemann integrable and $\int_a^b f(x)\,dx \simeq {}^*S_a^b(f, \Delta x)$ for any positive infinitesimal $\Delta x$.

From Corollary 12.4 it follows that $\int_a^b f(x)\,dx = \lim_{\Delta x \to 0^+} S_a^b(f, \Delta x)$. In the following we will write $S_a^b(f, \Delta x)$ and ${}^*S_a^b(f, \Delta x)$ as $\sum_a^b f(x)\,\Delta x$ and $\sum_a^b {}^*f(x)\,\Delta x$, respectively. By convention we set $\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx$ and $\int_a^a f(x)\,dx = 0$.

12.5 Theorem Let $f$ and $g$ be continuous on $[a, b]$. Then

(i) $\int_a^b cf(x)\,dx = c \int_a^b f(x)\,dx$ for $c \in \mathbb{R}$,
(ii) $\int_a^b [f(x) + g(x)]\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$,
(iii) $\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$ if $a < c \le b$,
(iv) if $f(x) \le g(x)$ on $[a, b]$ then $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$,
(v) if $m \le f(x) \le M$ on $[a, b]$ then $m(b - a) \le \int_a^b f(x)\,dx \le M(b - a)$.

Proof: We prove (iii) and (iv) and leave the remaining proofs to the reader.

(iii) For each natural number $n$, if $\Delta x = (c - a)/n > 0$ then $\sum_a^c f(x)\,\Delta x + \sum_c^b f(x)\,\Delta x = \sum_a^b f(x)\,\Delta x$. The result follows by taking standard parts of the terms in the transferred equality when $n \in {}^*\mathbb{N}_\infty$.
(iv) For each standard $\Delta x > 0$, $\sum_a^b f(x)\,\Delta x \le \sum_a^b g(x)\,\Delta x$. Thus by transfer $\sum_a^b {}^*f(x)\,\Delta x \le \sum_a^b {}^*g(x)\,\Delta x$, where $\Delta x > 0$ is infinitesimal. The result follows from Theorem 6.7(iv). □
12.6 Theorem If $f$ is continuous on $[a, b]$, then the function $F(x) = \int_a^x f(t)\,dt$, defined for $x \in [a, b]$, is differentiable. Moreover, $F'(c) = f(c)$ for each $c \in [a, b]$, where $F'(c)$ is the right- or left-hand derivative if $c = a$ or $b$.

Proof: Fix $c \in [a, b)$. For any standard $h \in (0, b - c)$ we have, using 12.5(iii) and (v), that $f(x_1)h \le F(c + h) - F(c) \le f(x_2)h$, where $f$ has a minimum and maximum on $[c, c + h]$ at $x_1$ and $x_2$, respectively. Thus there are Skolem functions $\varphi, \psi$ on $(0, b - c)$ with $\varphi(h), \psi(h) \in [c, c + h]$ so that $f(\varphi(h))h \le F(c + h) - F(c) \le f(\psi(h))h$ for all $h \in (0, b - c)$. By transfer, ${}^*f({}^*\varphi(h))h \le {}^*F(c + h) - F(c) \le {}^*f({}^*\psi(h))h$ for all $h \in {}^*(0, b - c)$. In particular, if $h \simeq 0$ we have

$${}^*f({}^*\varphi(h)) \le [{}^*F(c + h) - F(c)]/h \le {}^*f({}^*\psi(h)).$$

Now ${}^*\varphi(h) \simeq c$ and ${}^*\psi(h) \simeq c$ if $h$ is infinitesimal and so ${}^*f({}^*\varphi(h)) \simeq {}^*f({}^*\psi(h)) \simeq f(c)$ by continuity of $f$ at $c$. Therefore $[{}^*F(c + h) - F(c)]/h \simeq f(c)$ if $h$ is a positive infinitesimal. The argument is similar if $h$ is a negative infinitesimal; the result follows from Proposition 11.1. $\square$
A result due to Keisler [23], which can be used to justify the definition via integrals of many quantities occurring in applications, is the following.
12.7 Infinite Sum Theorem If $f(x)$ is continuous on $[a, b]$ and $B(u, v)$ is a real-valued function of two variables $(u, v) \in [a, b] \times [a, b]$ satisfying

(a) $B(u, v) = B(u, w) + B(w, v)$ for $u \le w \le v$,
(b) for any infinitesimal subinterval $[x, x + \Delta x] \subseteq {}^*[a, b]$, ${}^*B(x, x + \Delta x) = {}^*f(x)\,\Delta x + \varepsilon\,\Delta x$ with $\varepsilon \simeq 0$,

then $B(a, b) = \int_a^b f(x)\,dx$.

Proof: For $n \in N$ let $g(n)$ be the maximum of $|B(x_k, x_k + \Delta x) - f(x_k)\,\Delta x|/\Delta x$, where $x_k = a + k\,\Delta x$, $0 \le k < n$, and $\Delta x = (b - a)/n$. From (b), ${}^*g(\omega) \simeq 0$ if $\omega \in {}^*N_\infty$. From (a) and (b), $|B(a, b) - \sum_a^b f(x)\,\Delta x| \le g(n)(b - a)$ for each $n \in N$, and so, by transfer, $B(a, b) \simeq \sum_a^b {}^*f(x)\,\Delta x = {}^*\underline{S}_a^b(f, \Delta x)$ for $n = \omega$. $\square$
12.8 Fundamental Theorem of Calculus If a function $F$ has a continuous derivative $f$ on $[a, b]$, then $\int_a^b f(x)\,dx = F(b) - F(a)$.

Proof: Let $B(u, v) = F(v) - F(u)$ in Theorem 12.7 and use Exercise I.11.3. $\square$
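The following calculation is a direct proof of Theorem 12.8. Let $\Delta x = (b - a)/\omega$, where $\omega \in {}^*N_\infty$, $\Delta_k y = {}^*F(a + k\,\Delta x) - {}^*F(a + (k - 1)\,\Delta x)$, and $d_k y = {}^*f(a + (k - 1)\,\Delta x)\,\Delta x$, $k = 1, 2, \ldots, \omega$. Then by Exercise I.11.3

$$F(b) - F(a) = \sum_{k=1}^{\omega} \Delta_k y = \sum_{k=1}^{\omega} [{}^*f(a + (k - 1)\,\Delta x) + \varepsilon_k]\,\Delta x = \sum_{k=1}^{\omega} d_k y + \sum_{k=1}^{\omega} \varepsilon_k\,\Delta x \simeq \int_a^b f(x)\,dx + \sum_{k=1}^{\omega} \varepsilon_k\,\Delta x,$$

where $\varepsilon_k \simeq 0$ for each $k$. Since $|\sum_{k=1}^{\omega} \varepsilon_k\,\Delta x| \le \max_k |\varepsilon_k|\,(b - a) \simeq 0$, the result follows. The standard proof of 12.8 uses 12.6 to show that $F(x) = \int_a^x f(t)\,dt + F(a)$ for $x \in [a, b]$.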
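The bookkeeping in this direct proof can be replayed with a large standard $n$ in place of the infinite $\omega$. This is a hedged sketch rather than part of the text's argument: the function names and the choice $F(x) = x^3$, $f(x) = 3x^2$ on $[0, 1]$ are assumptions made for illustration. It checks that the increments $\Delta_k y$ telescope exactly to $F(b) - F(a)$ and that the distance to the sum of the $d_k y$ is bounded by $\max_k |\varepsilon_k|\,(b - a)$.

```python
from fractions import Fraction

def F(x):
    return x ** 3        # illustrative antiderivative (an assumption)

def f(x):
    return 3 * x ** 2    # its continuous derivative

def telescoping_check(a, b, n):
    """Return (sum of Delta_k y, sum of d_k y, max |eps_k|) for dx = (b - a)/n."""
    dx = Fraction(b - a, n)
    xs = [a + k * dx for k in range(n + 1)]
    delta_y = [F(xs[k]) - F(xs[k - 1]) for k in range(1, n + 1)]
    d_y = [f(xs[k - 1]) * dx for k in range(1, n + 1)]
    # eps_k is defined by Delta_k y = [f(x_{k-1}) + eps_k] * dx.
    eps = [dy / dx - f(xs[k - 1]) for k, dy in enumerate(delta_y, start=1)]
    return sum(delta_y), sum(d_y), max(abs(e) for e in eps)

total, riemann, max_eps = telescoping_check(0, 1, 1000)
print(total, float(riemann), float(max_eps))
```

With exact rationals the telescoping sum equals $F(1) - F(0) = 1$ on the nose, while the error term is controlled by the largest $\varepsilon_k$, mirroring the final estimate in the calculation above.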
12.9 Example: Volume of Revolution Suppose that a volume $V$ is obtained by revolving a region $R = \{(x, y) \in R^2 : 0 \le x \le 1,\ 0 \le y \le f(x)\}$ about the $x$-axis, where $f(x)$ is continuous. To find a formula for $V$ we let $B(u, v)$ be the volume obtained by revolving $R(u, v) = \{(x, y) \in R^2 : u \le x \le v,\ 0 \le y \le f(x)\}$ about the $x$-axis and make the reasonable assumption that $B$ satisfies (a) of 12.7. Also obvious is the fact that $\pi m^2\,\Delta x \le B(x, x + \Delta x) \le \pi M^2\,\Delta x$ if $\Delta x$ is standard, $x \in [0, 1]$, and $m$ and $M$ are the minimum and maximum, respectively, of $f(x)$ in $[x, x + \Delta x]$. As in the proof of 12.6, for $\Delta x > 0$, $\Delta x \simeq 0$, we have ${}^*B(x, x + \Delta x) = \pi [{}^*f(x)]^2\,\Delta x + \varepsilon\,\Delta x$ where $\varepsilon \simeq 0$, and so

$$V = B(0, 1) = \pi \int_0^1 [f(x)]^2\,dx.$$
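As a quick numerical illustration of Example 12.9, the sum of thin-disk volumes $\pi f(x)^2\,\Delta x$ can be compared with the closed form for a cone. This is a sketch under assumptions: the names `disk_sum` and `cone_profile` are hypothetical, and the profile $f(x) = x$ on $[0, 1]$ (a cone of volume $\pi/3$) is chosen only because its exact volume is known.

```python
import math

def disk_sum(f, a, b, n):
    """Sum of thin-disk volumes pi * f(x)^2 * dx over a uniform partition."""
    dx = (b - a) / n
    return sum(math.pi * f(a + k * dx) ** 2 * dx for k in range(n))

def cone_profile(x):
    return x  # illustrative: revolving y = x on [0, 1] gives a cone

approx = disk_sum(cone_profile, 0.0, 1.0, 10_000)
print(approx, math.pi / 3)
```

The disk sum plays the role of $\sum {}^*f(x)\,\Delta x$ in Theorem 12.7 with $\pi f^2$ in place of $f$; refining the partition moves it toward $\pi \int_0^1 x^2\,dx = \pi/3$.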
Exercises I.12

1. Prove Theorem 12.5, parts (i), (ii), and (v).
2. (Keisler [23]) An "approximate" average of a continuous function $f$ for an interval $[a, b]$ is given by $\sum_{k=1}^{n} f(a + k\,\Delta x)/n$, where $n \in N$ and $\Delta x = (b - a)/n$. What relationship does this have to the integral average $\int_a^b f(x)\,dx/(b - a)$?
3. Do Example 12.9 for the case in which the axis of rotation is the $y$-axis.
4. Prove Bliss's theorem: Let $f$ and $g$ be continuous functions on $[a, b]$. For each $\Delta x > 0$ and the corresponding partition $P$, let $\varphi$ and $\psi$ be Skolem functions such that, for $1 \le i \le n - 1$, $a + (i - 1)\,\Delta x \le \varphi(i, \Delta x) \le a + i\,\Delta x$ and $a + (i - 1)\,\Delta x \le \psi(i, \Delta x) \le a + i\,\Delta x$, while $a + (n - 1)\,\Delta x \le \varphi(n, \Delta x) \le b$ and $a + (n - 1)\,\Delta x \le \psi(n, \Delta x) \le b$. Let $S(\Delta x) = \sum_{i=1}^{n} f(\varphi(i, \Delta x))\,g(\psi(i, \Delta x))\,\Delta x$. Show that $\lim_{\Delta x \to 0} S(\Delta x) = \int_a^b f(x)g(x)\,dx$.
1.13 Sequences of Functions
A sequence of functions on $A \subseteq R$ is a map $f : N \times A \to R$. As usual we denote $f(n, x)$ by $f_n(x)$ ($n \in N$, $x \in A$). We will use nonstandard analysis to study the convergence of such sequences.

13.1 Proposition The sequence $\langle f_n \rangle$, $f_n : A \to R$, $n \in N$, converges pointwise to the function $f : A \to R$ iff ${}^*f_n(x) \simeq f(x)$ for all $x \in A$ and all infinite $n \in {}^*N$.

Proof: The sequence $\langle f_n \rangle$ converges pointwise iff for each fixed $x \in A$ the sequence $\langle f_n(x) \rangle$ converges to $f(x)$. The result then follows from 8.1. $\square$

13.2 Proposition The sequence $\langle f_n \rangle$, $f_n : A \to R$, converges uniformly to the function $f : A \to R$ iff ${}^*f_n(x) \simeq {}^*f(x)$ for all $x \in {}^*A$ and all infinite $n \in {}^*N$.

Proof: Recall that $\langle f_n \rangle$ converges uniformly to $f$ iff, given $\varepsilon > 0$ in $R$, there exists a $k \in N$ so that $|f_n(x) - f(x)| < \varepsilon$ for all $x \in A$ if $n \ge k$. Suppose then that $\langle f_n \rangle$ converges uniformly to $f$ and find the $k$ corresponding to a specified $\varepsilon > 0$. Then the sentence

$$(\forall n)(\forall x)[N(n) \wedge A(x) \wedge n \ge k \to |f_n(x) - f(x)| < \varepsilon] \tag{13.1}$$

is true in $\mathcal{R}$. By transfer, $|{}^*f_n(x) - {}^*f(x)| < \varepsilon$ for all $n \in {}^*N$, $n \ge k$, and all $x \in {}^*A$. In particular, this is true for all infinite $n$, no matter what $\varepsilon > 0$ we choose. Hence ${}^*f_n(x) \simeq {}^*f(x)$ for all infinite $n \in {}^*N$ and all $x \in {}^*A$. The converse is left to the reader. $\square$
13.3 Dini's Theorem Suppose that the sequence $\langle f_n \rangle$ of continuous functions on the compact set $A \subset R$ is monotone [i.e., $f_n(x) \le f_m(x)$ or $f_n(x) \ge f_m(x)$ for all $n \ge m$, $x \in A$] and converges pointwise to the continuous function $f$. Then the convergence is uniform.

Proof: We may suppose that $f(x) = 0$, $x \in A$ (simply by considering the sequence $f_n - f$), and that $f_n$ decreases (otherwise consider $-f_n$). By transfer we see that ${}^*f_n(x) \le {}^*f_m(x)$ for all $n \ge m$ in ${}^*N$ and all $x \in {}^*A$. Fix $x \in {}^*A$. Since $A$ is compact there is a $y \in A$, $y \simeq x$. Then, for each $n \in {}^*N_\infty$ and standard $m$, $0 \le {}^*f_n(x) \le {}^*f_m(x) \simeq f_m(y)$, and since $\lim_{m \to \infty} f_m(y) = 0$ it follows that ${}^*f_n(x) \simeq 0$. $\square$

13.4 Theorem If $\langle f_n \rangle$ converges uniformly to $f$ on $A$, $a \in R$ is a limit point of $A$, and $\lim_{x \to a} f_n(x) = s_n$ exists for all $n \in N$, then $\langle s_n \rangle$ converges and $\lim_{x \to a} f(x) = \lim_{n \to \infty} s_n$.

Proof: Let $\varepsilon > 0$ in $R$ be specified. Then there is a $k \in N$ so that $|f_n(x) - f(x)| < \varepsilon/4$ and $|f_n(x) - f_m(x)| < \varepsilon/2$ for all $x \in A$ and all $n, m \ge k$ by uniform convergence of $\langle f_n \rangle$ to $f$ on $A$. By transfer as in 13.2, $|{}^*f_n(x) - {}^*f(x)| < \varepsilon/4$ and $|{}^*f_n(x) - {}^*f_m(x)| < \varepsilon/2$ for all $n, m \ge k$ and all $x \in {}^*A$. Since $s_n \simeq {}^*f_n(x)$ if $x \simeq a$, $x \in {}^*A$, we have $|s_n - s_m| \simeq |{}^*f_n(x) - {}^*f_m(x)| < \varepsilon$ if $n, m \ge k$, and so $\langle s_n \rangle$ is a Cauchy sequence and converges, say, to $L$. It follows (letting $x \simeq a$ and $n \ge k$) that $|{}^*f_n(x) - {}^*f(x)| < \varepsilon/4$, and hence

$$|{}^*f(x) - L| \le |{}^*f(x) - {}^*f_n(x)| + |{}^*f_n(x) - s_n| + |s_n - L| \le \varepsilon/4 + \text{infinitesimal} + 2\varepsilon < 3\varepsilon,$$

and hence ${}^*f(x) \simeq L$. $\square$
13.5 Corollary If the functions $f_n$ are continuous on $A$ and $\langle f_n \rangle$ converges uniformly to $f$ on $A$, then $f$ is continuous on $A$.
We end this section with a proof of the Arzelà-Ascoli theorem, a result which has many important applications in analysis. The theorem asserts that from a uniformly bounded, equicontinuous sequence $\langle f_n \rangle$ of functions on a closed bounded interval $[a, b] \subset R$ it is possible to select a subsequence which converges uniformly on $[a, b]$ to a continuous function $f$. That the result is not true for an arbitrary sequence of continuous functions is shown by the sequence in which $f_n(x) = x^n$ on $[0, 1]$. Here $\langle f_n \rangle$ actually converges pointwise (but not uniformly) to the discontinuous function

$$f(x) = \begin{cases} 0, & 0 \le x < 1, \\ 1, & x = 1. \end{cases}$$
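The failure of uniform convergence for $f_n(x) = x^n$ can be seen by sampling each $f_n$ at a point that moves toward 1 as $n$ grows. The sketch below is illustrative only; the helper name `sup_gap_witness` and the sample point $x = 1 - 1/n$ are choices made here, not taken from the text. In nonstandard terms, for infinite $\omega$ the point $x = 1 - 1/\omega$ lies in ${}^*[0, 1)$ and ${}^*f_\omega(x) \simeq 1/e$, which is not infinitesimally close to ${}^*f(x) = 0$, so the criterion of Proposition 13.2 fails.

```python
def sup_gap_witness(n):
    """Value of |x^n - 0| at the sample point x = 1 - 1/n in [0, 1)."""
    x = 1.0 - 1.0 / n
    return x ** n

# At the moving point the gap stays near 1/e instead of shrinking to 0 ...
witnesses = [sup_gap_witness(n) for n in (10, 100, 1000, 10_000)]
# ... while at a fixed point such as x = 0.5 the values do tend to 0.
pointwise = [0.5 ** n for n in (10, 100, 1000)]
print(witnesses)
print(pointwise)
```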
13.6 Definition The sequence $\langle f_n \rangle$ of functions on $[a, b]$ is uniformly bounded if there exists an $M$ so that $|f_n(x)| \le M$ for all $x \in [a, b]$ and all $n \in N$. The sequence $\langle f_n \rangle$ of functions on $[a, b]$ is equicontinuous if, given $\varepsilon > 0$, there is a $\delta > 0$ (independent of $x$, $y$, and $n$) so that $|f_n(x) - f_n(y)| < \varepsilon$ for all $n \in N$ and all $x, y \in [a, b]$ such that $|x - y| < \delta$. (Each $f_n$, then, is uniformly continuous on $[a, b]$.)

13.7 Arzelà-Ascoli Theorem If $\langle f_n \rangle$ is a uniformly bounded and equicontinuous sequence of functions on the closed and bounded interval $[a, b]$, then there is a subsequence $\langle f_{n_k} \rangle$ which converges uniformly to a continuous function $f$ on $[a, b]$.

Proof: Let $\varepsilon > 0$ be given and find the corresponding $\delta > 0$ from the equicontinuity of the sequence. Then the sentence

$$(\forall n)(\forall x)(\forall y)[n \in N \wedge x \in [a, b] \wedge y \in [a, b] \wedge |x - y| < \delta \to |f_n(x) - f_n(y)| < \varepsilon]$$

is true. By transfer, for all $n \in {}^*N$ and all $x, y \in {}^*[a, b]$ such that $|x - y| < \delta$ we have $|{}^*f_n(x) - {}^*f_n(y)| < \varepsilon$. In particular, $|{}^*f_n(x) - {}^*f_n(y)| < \varepsilon$ if $x \simeq y$ for any $n \in {}^*N$. Since $\varepsilon > 0$ is arbitrary, we see that ${}^*f_n(x) \simeq {}^*f_n(y)$ for any $n \in {}^*N$ as long as $x \simeq y$.

Now let $n = \omega$ be a fixed infinite natural number. By an argument similar to that of the first paragraph we see that $|{}^*f_\omega(x)| \le M$ for any $x \in {}^*[a, b]$, so that ${}^*f_\omega(x)$ is near-standard for $x \in [a, b]$. Define $f(x) = {}^\circ({}^*f_\omega(x))$, $x \in [a, b]$. We claim that $f(x)$ is uniformly continuous. For let $\varepsilon > 0$ be given and find the $\delta > 0$ corresponding to $\varepsilon/2$ from equicontinuity. Then if $x, y \in [a, b]$ and $|x - y| < \delta$, we have

$$|f(x) - f(y)| \le |f(x) - {}^*f_\omega(x)| + |{}^*f_\omega(x) - {}^*f_\omega(y)| + |{}^*f_\omega(y) - f(y)|.$$

The first and last terms on the right are infinitesimal by definition of $f$, and the middle term is $< \varepsilon/2$ by the argument of the first paragraph, and so $|f(x) - f(y)| < \varepsilon$.

Finally we show that a subsequence of $\langle f_n \rangle$ converges uniformly to $f$ on $[a, b]$. To do this it suffices to show that for all $\varepsilon > 0$ and all $n \in N$ there is an $m > n$ so that $|f_m(x) - f(x)| < \varepsilon$ for all $x \in [a, b]$ (why?). Suppose this statement is not true. Then there exists an $\varepsilon_0 > 0$ and an $n_0 \in N$ so that for each $m > n_0$ we can find an $x \in [a, b]$ with $|f_m(x) - f(x)| \ge \varepsilon_0$. Thus there exists a Skolem function $\psi : \{n_0 + 1, n_0 + 2, \ldots\} \to [a, b]$ so that the statement $(\forall m)[m \in N \wedge m > n_0 \to |f_m(\psi(m)) - f(\psi(m))| \ge \varepsilon_0]$ is true. By transfer, given $\omega \in {}^*N_\infty$, we have $\omega > n_0$, and so there exists an $x \in {}^*[a, b]$ [equal to ${}^*\psi(\omega)$] such that $|{}^*f_\omega(x) - {}^*f(x)| \ge \varepsilon_0$. But by compactness of $[a, b]$ and Robinson's theorem, this $x$ is infinitesimally close to a $y \in [a, b]$, and so

$$|{}^*f_\omega(x) - {}^*f(x)| \le |{}^*f_\omega(x) - {}^*f_\omega(y)| + |{}^*f_\omega(y) - f(y)| + |f(y) - {}^*f(x)|.$$

Each term on the right is infinitesimal, the last by the continuity of $f$. This contradiction proves the theorem. $\square$

A general form of the Arzelà-Ascoli theorem will be given in §III.8 of Chapter III.
Exercises I.13

1. Finish the proof of Proposition 13.2.
2. Prove Corollary 13.5 directly from Proposition 13.2.
3. Give an example to show that an equicontinuous sequence need not be uniformly bounded.
4. Let $\langle f_n \rangle$ be a sequence of functions on $[a, b]$. Show that $\langle f_n \rangle$ is an equicontinuous sequence if and only if for any $n \in {}^*N$ and any pair $x, y \in {}^*[a, b]$ with $x \simeq y$ we have ${}^*f_n(x) \simeq {}^*f_n(y)$. (Hint: For the necessity see the proof of Theorem 13.7.)
5. Let $\langle f_n \rangle$ be a sequence of continuous functions on $[a, b]$ which converges uniformly to $f$. Show that $\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$.

I.14 Two Applications to Differential Equations
As our first application we prove the Cauchy-Peano existence theorem for ordinary differential equations. A nonstandard proof was first presented by A. Robinson [a].
14.1 Cauchy-Peano Existence Theorem Let $f$ be continuous and satisfy $|f(x, y)| \le M$ on the rectangle $B = \{(x, y) \in R^2 : |x - x_0| \le a,\ |y - y_0| \le b\}$. Then there exists a function $\phi$ with continuous first derivative, defined on the closed interval $I = \{x \in R : |x - x_0| \le c\}$, where $c = \min(a, bM^{-1})$, and satisfying $\phi(x_0) = y_0$ and $\phi'(x) = f(x, \phi(x))$ for $x \in I$.

Proof: We begin, as in [40], by constructing a family of polygonal approximations. It suffices to construct a solution on $[x_0, x_0 + c]$. Divide $[x_0, x_0 + c]$ into $n$ equal parts by the points $x_k = x_0 + kc/n$, $0 \le k \le n$, and define $\phi_n$ by the equations

$$\phi_n(x_0) = y_0, \qquad \phi_n(x) = \phi_n(x_k) + f(x_k, \phi_n(x_k))(x - x_k) \tag{14.1}$$

for $x_k < x \le x_{k+1}$, $0 \le k \le n - 1$. For any $n \in N$, the graph of $\phi_n$ lies in $B$ since $|f(x, y)| \le M$. Moreover, $|\phi_n(x) - \phi_n(x')| \le M|x - x'|$ for any $x, x' \in [x_0, x_0 + c]$. Thus the following statement is true in $\mathcal{R}$:

(14.2) For all $n \in N$, $x, x' \in [x_0, x_0 + c]$, we have $|\phi_n(x) - y_0| \le b$ and $|\phi_n(x) - \phi_n(x')| \le M|x - x'|$.

By transfer, for all $n \in {}^*N$ and $x, x' \in {}^*[x_0, x_0 + c]$,

$$|{}^*\phi_n(x) - y_0| \le b \tag{14.3}$$

and

$$|{}^*\phi_n(x) - {}^*\phi_n(x')| \le M|x - x'|. \tag{14.4}$$

We now let $n = \omega \in {}^*N_\infty$ and note that ${}^*\phi_\omega(x)$ is finite for all $x \in {}^*[x_0, x_0 + c]$ by (14.3). We may therefore define the standard function $\phi$ on $[x_0, x_0 + c]$ by $\phi(x) = \mathrm{st}({}^*\phi_\omega(x))$. Now $\phi$ is continuous since, for standard $x$ and $x'$ in $[x_0, x_0 + c]$, $|\phi(x) - \phi(x')| \simeq |{}^*\phi_\omega(x) - {}^*\phi_\omega(x')| \le M|x - x'|$ by (14.4). Therefore

$${}^*\phi(y) \simeq \phi(\mathrm{st}(y)) \simeq {}^*\phi_\omega(\mathrm{st}(y)) \simeq {}^*\phi_\omega(y) \tag{14.5}$$

if $y \in {}^*[x_0, x_0 + c]$. Since $f$ is continuous and hence uniformly continuous on the closed bounded set $B$,

$${}^*f(x, {}^*\phi(x)) \simeq {}^*f(x, {}^*\phi_\omega(x)) \tag{14.6}$$

for all $x \in {}^*[x_0, x_0 + c]$ (exercise). Now if $x \in [x_0, x_0 + c]$, then $x_k < x \le x_{k+1}$ for some $k \in {}^*N$, $0 \le k \le \omega - 1$, whence $x_k \simeq x$ and so

$$\phi(x) \simeq {}^*\phi_\omega(x_k) = y_0 + \sum_{i=0}^{k-1} {}^*f(x_i, {}^*\phi_\omega(x_i))(x_{i+1} - x_i)$$

by transfer from (14.1). Thus

$$\phi(x) \simeq y_0 + \sum_{i=0}^{k-1} {}^*f(x_i, {}^*\phi(x_i))(x_{i+1} - x_i)$$

since the difference between the two sums is at most

$$c \max_{0 \le i \le k-1} |{}^*f(x_i, {}^*\phi_\omega(x_i)) - {}^*f(x_i, {}^*\phi(x_i))|,$$

which is infinitesimal by (14.6) (where have we used the transfer principle in this argument?). Since

$$\sum_{i=0}^{k-1} {}^*f(x_i, {}^*\phi(x_i))(x_{i+1} - x_i) = \sum_{i=0}^{k-1} {}^*f(x_i, {}^*\phi(x_i))\,\Delta x,$$

where $\Delta x = c/\omega$, it follows from Corollary 12.4 that

$$\phi(x) = y_0 + \int_{x_0}^{x} f(t, \phi(t))\,dt.$$

Therefore $\phi$ has a continuous derivative and $\phi'(x) = f(x, \phi(x))$. $\square$
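The polygonal approximations of (14.1) are easy to compute for standard $n$, which gives a feel for what ${}^*\phi_\omega$ looks like before standard parts are taken. The following is a minimal sketch, not the text's construction verbatim: the function names, the right-hand side $f(x, y) = y$ with $\phi(0) = 1$, and the interval length $c = 0.5$ are illustrative assumptions, chosen because the exact solution $e^x$ is known.

```python
import math

def euler_polygon(f, x0, y0, c, n):
    """Return the polygonal approximation phi_n on [x0, x0 + c] as a function."""
    h = c / n
    xs = [x0 + k * h for k in range(n + 1)]
    ys = [y0]
    for k in range(n):
        # Value at the node x_{k+1}, following the recursion in (14.1).
        ys.append(ys[k] + f(xs[k], ys[k]) * h)

    def phi(x):
        # Pick the segment starting at x_k and follow its slope f(x_k, phi_n(x_k)).
        k = min(int((x - x0) / h), n - 1)
        return ys[k] + f(xs[k], ys[k]) * (x - xs[k])

    return phi

def rhs(x, y):
    return y  # illustrative right-hand side: y' = y, y(0) = 1

phi_coarse = euler_polygon(rhs, 0.0, 1.0, 0.5, 10)
phi_fine = euler_polygon(rhs, 0.0, 1.0, 0.5, 10_000)
print(phi_coarse(0.5), phi_fine(0.5), math.exp(0.5))
```

For this right-hand side the solution happens to be unique, so the polygons converge to it as $n$ grows; in general the theorem only guarantees that the standard part of ${}^*\phi_\omega$ is some solution.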
The standard proof of this result [8] uses the Arzelà-Ascoli theorem, 13.7. The reader is referred to any standard text on differential equations for a discussion of a (Lipschitz) condition that ensures uniqueness of the solution.

Lastly we use nonstandard techniques to derive the wave equation for a vibrating string. We assume that the magnitude of the tension $T$ and the density $\rho$ of the string are constant along the string. Given an infinitesimal segment of length $\Delta s$ from $P_1$ to $P_2$ on the string as shown in the figure, the vertical force on the segment is $T({}^*\sin\theta_2 - {}^*\sin\theta_1)$ and its mass is $\rho\,\Delta s$. If the nearest standard point is $x$, and we are considering the vertical position $y$ as a twice continuously differentiable function of $x$ and $t$, then by Newton's law,

$$T({}^*\sin\theta_2 - {}^*\sin\theta_1) = \rho\,\Delta s\,(y_{tt}(x, t) + \varepsilon),$$

where $\varepsilon \simeq 0$, so that

$$\frac{T}{\rho} \cdot \frac{{}^*\sin\theta_2 - {}^*\sin\theta_1}{\Delta s} \simeq y_{tt}(x, t).$$

We want to show that $({}^*\sin\theta_2 - {}^*\sin\theta_1)/\Delta s \simeq y_{xx}(x, t)$. Often in deriving the wave equation the assumption that $\Delta y$ is uniformly small is made, and then the expression $(\sin\theta_2 - \sin\theta_1)/\Delta s$ is replaced by $(\tan\theta_2 - \tan\theta_1)/\Delta x$, where $\Delta x^2 + \Delta y^2 = \Delta s^2$. Since $\Delta s$ is infinitesimal, as is ${}^*\sin\theta_2 - {}^*\sin\theta_1$, it is not clear why this replacement is justified. Let us instead fix $t$ and consider small changes $\Delta x_1$ and $\Delta x_2$ at $P_1$ and $P_2$ resulting in changes $\Delta y_i$, $\Delta s_i$, $i = 1, 2$, respectively. We may take $\Delta x_i$, $i = 1, 2$, so small that

$$\Delta x_i/\Delta s_i = {}^*x_s(P_i, t) + \varepsilon_i, \qquad \Delta y_i/\Delta s_i = {}^*\sin\theta_i + \varepsilon_{i+2}, \qquad \Delta y_i/\Delta x_i = {}^*y_x(P_i, t) + \varepsilon_{i+4}$$

for $i = 1, 2$, where $\varepsilon_j/\Delta s \simeq 0$ for $1 \le j \le 6$. Then, omitting the $*$, we have

$$\frac{\sin\theta_2 - \sin\theta_1}{\Delta s} \simeq \frac{\dfrac{\Delta y_2}{\Delta x_2}\dfrac{\Delta x_2}{\Delta s_2} - \dfrac{\Delta y_1}{\Delta x_1}\dfrac{\Delta x_1}{\Delta s_1}}{\Delta s} \simeq \frac{y_x(P_2, t)\,x_s(P_2, t) - y_x(P_1, t)\,x_s(P_1, t)}{\Delta s}.$$

Since we are assuming that $y(x, t)$ is twice continuously differentiable, it follows as in Proposition 11.2 that

$$\frac{\sin\theta_2 - \sin\theta_1}{\Delta s} \simeq \frac{\partial}{\partial s}\,[y_x x_s] = y_{xx}(x, t)[x_s(x, t)]^2 + x_{ss}(x, t)\,y_x(x, t).$$

The wave equation now results if $y$ is "uniformly small" in the sense that $\partial x/\partial s$ can be taken to be 1 and $\partial^2 x/\partial s^2$ can be taken to be 0. There are many potential applications of nonstandard analysis to differential equations. For example, for applications to singular perturbations, the reader is referred to the work of the Strasbourg group ([6] and the papers referenced there).
Exercises I.14

1. In the proof of Theorem 14.1, Eq. (14.6) states that ${}^*f(x, {}^*\phi(x)) \simeq {}^*f(x, {}^*\phi_\omega(x))$ for all $x \in {}^*[x_0, x_0 + c]$. Show that this is correct.
2. Fill in the details in Theorem 14.1 on the use of the transfer principle to show that $\phi(x) = y_0 + \int_{x_0}^{x} f(t, \phi(t))\,dt$.
3. Show that Theorem 14.1 goes through if we replace $f(x_k, \phi_n(x_k))$ in (14.1) by $M_{n,k}$ where $\min_{(x,y) \in A} f(x, y) \le M_{n,k} \le \max_{(x,y) \in A} f(x, y)$ and
4. The conditions in Theorem 14.1 do not guarantee a unique solution $\phi$ to $\phi' = f(x, \phi)$. Use infinitesimal partitions and the idea in Exercise 3 to obtain the solutions $\phi(x) = 0$ and $\phi(x) = x^3$ to the equation $\phi' = 3\phi^{2/3}$, $\phi(0) = 0$.
5. Generalize Theorem 14.1 to the vector situation. Let $x$ denote a point in $R$ and $\mathbf{y}$ denote a point in $R^n$. Let $\mathbf{f}$ be defined on $\{(x, \mathbf{y}) \in R \times R^n : |x - x_0| \le a,\ \|\mathbf{y} - \mathbf{y}_0\| \le b\}$ where $x_0 \in R$, $\mathbf{y}_0 \in R^n$, and $\|\cdot\|$ denotes the usual distance in $R^n$. Consider the system $\boldsymbol{\phi}' = \mathbf{f}(x, \boldsymbol{\phi})$, $\boldsymbol{\phi}(x_0) = \mathbf{y}_0$, where $\boldsymbol{\phi}(x) = (\phi_1(x), \ldots, \phi_n(x))$ and $\boldsymbol{\phi}'(x) = (\phi_1'(x), \ldots, \phi_n'(x))$. Find conditions on the vector function $\mathbf{f}$ so that a solution $\boldsymbol{\phi}$ to this system exists in a certain interval about $x_0$.
1.15 Proof of the Transfer Principle
Recall that the only functions and relations that are in ${}^*\mathcal{R}$ are extensions of standard functions and relations. We assume that each constant $c$ in $L_{{}^*\mathcal{R}}$ names an element of ${}^*R$, and if $c$ names an element of $R$ then $c$ is in $L_{\mathcal{R}}$. Recall Definitions 3.8, 4.1, and 5.2, which give the following inductive definition of a constant term which is interpretable in ${}^*\mathcal{R}$:

(i) A constant $c$ in $L_{{}^*\mathcal{R}}$ is interpretable in ${}^*\mathcal{R}$ and is interpreted as the element it names.
(ii) If $\underline{f}$ is the name of a function $f$ of $n$ variables on $R$ and $\tau^1, \ldots, \tau^n$ are constant terms interpretable in ${}^*\mathcal{R}$ as $r^1, \ldots, r^n$, respectively, and if the $n$-tuple $(r^1, \ldots, r^n)$ is in the domain of the nonstandard extension ${}^*f$ of $f$, then ${}^*\underline{f}(\tau^1, \ldots, \tau^n)$ is a constant term interpretable in ${}^*\mathcal{R}$ as ${}^*f(r^1, \ldots, r^n)$.

We now want to associate with each constant term in $L_{{}^*\mathcal{R}}$ a fixed sequence of constant terms in $L_{\mathcal{R}}$. We will denote the sequence for a constant term $\tau$ by $\langle T_\tau(n) \rangle$ or just $T_\tau$. A sequence $T_\tau$ is defined for all terms $\tau$, interpretable or not, by the following inductive definition:

(α) For each $r \in {}^*R$ we choose a definite sequence $\langle r_n \rangle$ from $R$ so that $r = [\langle r_n \rangle]$. If $r \in R$, we choose $r_n = r$ for all $n$. If $c$ is a constant in $L_{{}^*\mathcal{R}}$ that names $r$, we set $T_c(n) = c_n$, where $c_n$ is a name in $L_{\mathcal{R}}$ of $r_n \in R$ for all $n \in N$. If $r \in R$, we set $T_c(n) = c$ for all $n$.
(β) If $\tau = {}^*\underline{f}(\tau^1, \ldots, \tau^k)$ where $\underline{f}$ is a name of the function $f$ of $k$ variables on $R$ and the $\tau^i$ are constant terms in $L_{{}^*\mathcal{R}}$, $1 \le i \le k$, then $T_\tau(n) = \underline{f}(T_{\tau^1}(n), \ldots, T_{\tau^k}(n))$.

Conditions (α) and (β) serve to define $T_\tau$ inductively for all constant terms $\tau$ in $L_{{}^*\mathcal{R}}$. We are now able to prove a simple form of a theorem due to Łoś (pronounced "Wash").

15.1 Theorem

(A) If $\tau$ is a constant term in $L_{{}^*\mathcal{R}}$ and $\langle r_n \rangle$ is a sequence of numbers in $R$, then $\tau$ is interpretable in ${}^*\mathcal{R}$ and names $[\langle r_n \rangle]$ iff $T_\tau(n)$ is almost everywhere (a.e.) interpretable in $\mathcal{R}$ and names $r_n$ a.e. [i.e., for all $n$ in a set $U$ in $\mathcal{U}$, $T_\tau(n)$ is interpretable and names $r_n$].
(B) If $\tau^1, \ldots, \tau^k$ are constant terms in $L_{{}^*\mathcal{R}}$ and ${}^*\underline{P}(\tau^1, \ldots, \tau^k)$ is an atomic sentence in $L_{{}^*\mathcal{R}}$, then ${}^*\underline{P}(\tau^1, \ldots, \tau^k)$ holds in ${}^*\mathcal{R}$ iff $\underline{P}(T_{\tau^1}(n), \ldots, T_{\tau^k}(n))$ holds a.e. in $\mathcal{R}$.

Proof: (A) The proof is by induction on the complexity of the terms (as defined by 3.8 and 5.2).

(i) If $\tau = c$ where $c$ is a constant naming an element of ${}^*R$, then $c$ names $[\langle r_n \rangle]$ iff $T_\tau(n)$ a.e. names $r_n$ by definition of $T_\tau$ in (α).
(ii) Let $\tau = {}^*\underline{f}(\tau^1, \ldots, \tau^k)$, where $\underline{f}$ is a name of the function $f$ of $k$ variables and $\tau^1, \ldots, \tau^k$ are constant terms for which (A) is true; i.e., given $j$, $1 \le j \le k$, and a sequence $\langle r_n^j \rangle$, $r_n^j \in R$, $\tau^j$ is interpretable in ${}^*\mathcal{R}$ and names $[\langle r_n^j \rangle]$ iff $T_{\tau^j}(n)$ is a.e. interpretable and names $r_n^j$ a.e. Let $\langle s_n \rangle$ be a sequence in $R$. Then the following statements are equivalent:

(a) The term $\tau = {}^*\underline{f}(\tau^1, \ldots, \tau^k)$ is interpretable in ${}^*\mathcal{R}$ and names $[\langle s_n \rangle]$.
(b) There exist elements $[\langle r_n^1 \rangle], \ldots, [\langle r_n^k \rangle]$ in ${}^*R$ such that, for $1 \le j \le k$, $\tau^j$ is interpretable as $[\langle r_n^j \rangle]$, the $k$-tuple $([\langle r_n^1 \rangle], \ldots, [\langle r_n^k \rangle])$ is in the domain of ${}^*f$, and ${}^*f([\langle r_n^1 \rangle], \ldots, [\langle r_n^k \rangle]) = [\langle s_n \rangle]$.
(c) There exist sequences $\langle r_n^1 \rangle, \ldots, \langle r_n^k \rangle$ in $R$ and a set $U \in \mathcal{U}$ such that, for each $m \in U$, if $1 \le j \le k$ then $T_{\tau^j}(m)$ is interpretable as $r_m^j$, the $k$-tuple $(r_m^1, \ldots, r_m^k)$ is in the domain of $f$, and $f(r_m^1, \ldots, r_m^k) = s_m$.
(d) $\underline{f}(T_{\tau^1}(n), \ldots, T_{\tau^k}(n))$ is a.e. interpretable as $s_n$ in $\mathcal{R}$.
(e) $T_\tau(n)$ is a.e. interpretable in $\mathcal{R}$ as $s_n$.

Thus (A) is true by induction.

(B) To prove (B), let $\underline{P}$ be a name for the $k$-ary relation $P$ on $R$, and let $\tau^1, \ldots, \tau^k$ be constant terms in $L_{{}^*\mathcal{R}}$. Then the following are equivalent statements:

(a) ${}^*\underline{P}(\tau^1, \ldots, \tau^k)$ holds in ${}^*\mathcal{R}$.
(b) There are elements $[\langle r_n^1 \rangle], \ldots, [\langle r_n^k \rangle]$ in ${}^*R$ such that $\tau^j$ is interpretable as $[\langle r_n^j \rangle]$, $1 \le j \le k$, and the $k$-tuple $([\langle r_n^1 \rangle], \ldots, [\langle r_n^k \rangle])$ is in ${}^*P$.
(c) There are sequences $\langle r_n^1 \rangle, \ldots, \langle r_n^k \rangle$ in $R$ and a set $U \in \mathcal{U}$ such that, for each $m \in U$, $T_{\tau^j}(m)$ is interpretable as $r_m^j$ for $1 \le j \le k$ and the $k$-tuple $(r_m^1, \ldots, r_m^k)$ is in $P$.
(d) $\underline{P}(T_{\tau^1}(n), \ldots, T_{\tau^k}(n))$ holds a.e. in $\mathcal{R}$.

This establishes (B). $\square$

We are now in a position to prove the transfer principle. If $\Phi$ is an atomic sentence which holds in $\mathcal{R}$ then 15.1 shows immediately that ${}^*\Phi$ holds in ${}^*\mathcal{R}$. Suppose that $\Phi$ is of the form

$$(\forall x_1) \cdots (\forall x_n)\Big[\bigwedge_{i=1}^{k} \underline{P}_i(\tau_1^i, \ldots, \tau_{m_i}^i) \to \bigwedge_{j=1}^{l} \underline{Q}_j(\sigma_1^j, \ldots, \sigma_{n_j}^j)\Big]$$

and $\Phi$ holds in $\mathcal{R}$. Let ${}^*\tau_p^i$ and ${}^*\sigma_q^j$ be the $*$-transforms of $\tau_p^i$ and $\sigma_q^j$ and replace the variables $x_1, \ldots, x_n$ in ${}^*\tau_p^i$ and ${}^*\sigma_q^j$ with constant symbols $c_1, \ldots, c_n$ from $L_{{}^*\mathcal{R}}$. Assume that with this replacement ${}^*\underline{P}_i({}^*\tau_1^i, \ldots, {}^*\tau_{m_i}^i)$ holds in ${}^*\mathcal{R}$ for each $i$, $1 \le i \le k$. Using 15.1, we see that there is a set $U \in \mathcal{U}$ such that if $n \in U$ then $\underline{P}_i(T_{\tau_1^i}(n), \ldots, T_{\tau_{m_i}^i}(n))$ holds in $\mathcal{R}$ for each $i$, $1 \le i \le k$. But then, since $\Phi$ holds in $\mathcal{R}$, $\underline{Q}_j(T_{\sigma_1^j}(n), \ldots, T_{\sigma_{n_j}^j}(n))$ holds in $\mathcal{R}$ for $1 \le j \le l$ and $n \in U$. By 15.1 again, ${}^*\underline{Q}_j({}^*\sigma_1^j, \ldots, {}^*\sigma_{n_j}^j)$ holds in ${}^*\mathcal{R}$ for each $j$, $1 \le j \le l$, and we are through.
CHAPTER II
Nonstandard Analysis on Superstructures
In order to proceed to analysis more general than the calculus, we will need to consider mathematical systems which contain entities corresponding to sets of sets, sets of functions, and so on. For example, we might want to prove theorems involving the set of open subsets of $R$, or the set of all continuous functions on $R$. Such entities, regarded as objects in themselves, are not contained in any relational system based on $R$. Beginning with a basic set $X$, we can construct a superstructure $V(X)$ which contains all of the entities normally encountered in the mathematics of $X$ by successively taking subsets. This chapter is devoted to nonstandard analysis in this general setting. In particular, we consider mathematical logic for superstructures in §II.2, and the transfer principle in §II.3. The language presented is more general than that of Chapter I, and this will allow us to avoid Skolem functions and proofs by contradiction in applying the transfer principle in the rest of this book. We generalize the ultrapower construction and $*$-mapping for ${}^*\mathcal{R}$ to superstructures in §II.4, obtaining a superstructure $V({}^*X)$ and a map $* : V(X) \to V({}^*X)$. In §II.5 we show how to choose the ultrafilter in the construction of §II.4 to ensure that $V({}^*X)$ is an enlargement, a notion which is fundamental to nonstandard analysis as developed by Abraham Robinson. The notions of internal and external entities and sentences are developed in §II.6. These notions are important in being able to recognize when a sentence $\Psi$ about $V({}^*X)$ is of the form ${}^*\Phi$ for some sentence $\Phi$ about $V(X)$. We will often use such a corresponding "downward transfer principle" in succeeding chapters. In §II.7 we present the permanence principle, which involves the idea of internal formulas, and is useful in many proofs. Finally in §II.8 we survey the theory of saturated superstructures, a concept which was introduced by W. A. J. Luxemburg [36] and is very important in some of the recent applications of nonstandard analysis.

II.1 Superstructures
In the succeeding chapters of this book, we will need to consider mathematical systems which contain entities corresponding to sets of sets, sets of functions, etc. Such sets, regarded as objects in themselves, are not contained in any relational system based on $R$; there are no names for them in the language of relational systems. More generally, we are led to work with a set $X$ and all of the sets which can be obtained inductively from $X$ in a finite number of steps by successively taking subsets of the preceding set, as indicated in the following definition. The resulting structure is called a superstructure over $X$. We will always assume that $X$ contains the natural numbers $N$ in order later to be able to define ordered $n$-tuples (Definition 1.2).

1.1 Definition Let $X$ be a nonempty set containing at least the natural numbers $N$. The power set $\mathcal{P}(X)$ of $X$ is the set of all subsets of $X$ (including the empty set $\emptyset$). The $n$th cumulative power set $V_n(X)$ of $X$ is defined recursively by

$$V_0(X) = X, \qquad V_{n+1}(X) = V_n(X) \cup \mathcal{P}(V_n(X)).$$

The superstructure over $X$ is the set

$$V(X) = \bigcup_{n=0}^{\infty} V_n(X).$$

The elements of $V(X)$ are called entities, and the entities in $X$ are also called individuals. The entities in $V_n(X) - V_{n-1}(X)$ are of rank $n$.
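The recursion in Definition 1.1 can be run on a tiny base set to see how quickly the levels grow. This is only an illustrative sketch: the definition requires $X$ to contain $N$, whereas the code takes the finite toy set $\{0, 1\}$ so that the levels can be enumerated, and the function names `power_set`, `cumulative_levels`, and `rank` are hypothetical. Individuals are modeled as Python integers and sets as `frozenset` objects.

```python
from itertools import combinations

def power_set(s):
    """All subsets of the finite set s, each as a frozenset."""
    items = list(s)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def cumulative_levels(X, depth):
    """Return [V_0, V_1, ..., V_depth] with V_{n+1} = V_n union P(V_n)."""
    levels = [frozenset(X)]
    for _ in range(depth):
        prev = levels[-1]
        levels.append(prev | power_set(prev))
    return levels

def rank(entity, levels):
    """Least n with entity in V_n (0 for individuals)."""
    return next(n for n, level in enumerate(levels) if entity in level)

# Toy base set (assumption): a real superstructure would start from X containing N.
levels = cumulative_levels({0, 1}, 2)
sizes = [len(level) for level in levels]
print(sizes)
```

Even from a two-element base set the third level already has 66 entities, which hints at why $V(X)$ is rich enough to hold the relations and functions discussed below.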
For example, let $X = N$, the set of natural numbers. Then some entities in $V_1(N)$ are 7, $\{7\}$, and the set $\{2, 4, 6, \ldots\}$ of even numbers. Similarly, some entities in $V_2(N)$ are 7, $\{1, 3, 5, \ldots\}$, and the set of all finite subsets of $N$. As usual in set theory we use the symbol $\in$ to stand for "is an element of" and $\notin$ to stand for "is not an element of." Similarly, if $x, y \in V(X)$, we write $x \subseteq y$ if $z \in x$ implies $z \in y$; we write $x = y$ if $x \subseteq y$ and $y \subseteq x$ and write $x \ne y$ otherwise. In particular, we write $x \subset y$ if $x \subseteq y$ but $x \ne y$. Notice that an entity may simultaneously be a subset of, and an element of, another entity; in particular, $V_n(X) \subseteq V_{n+1}(X)$ and $V_n(X) \in V_{n+1}(X)$ for all $n$. We always assume that the individuals have no members; i.e., if $x \in X$ then $x \ne \emptyset$ and the statement $t \in x$ is false.

The choice of the basic set $X$ is always somewhat arbitrary and depends on the context. If, for example, we want to study the real number system, and do not need to consider the manner of construction of each real number (as, for example, an equivalence class of Cauchy sequences of rational numbers), then we may take $X = R$. If, on the other hand, we want to study the real numbers as equivalence classes of Cauchy sequences of rationals, then we might take $X = Q$ (the rational numbers).

We next show how to describe relations and functions in the set theory of $V(X)$. The basic step is to define an ordered $n$-tuple set-theoretically, and the rest follows as in Definition I.2.1. We start with the definition of an ordered pair and make a distinction between ordered pairs and two-tuples.

1.2 Definition An ordered pair $(a, b)$ is the set $\{\{a\}, \{a, b\}\}$. For $n \ge 2$, an ordered $n$-tuple $\langle x_1, \ldots, x_n \rangle$ of elements $x_1, x_2, \ldots, x_n$ is defined by $\langle x_1, \ldots, x_n \rangle = \{(1, x_1), \ldots, (n, x_n)\}$, where for each $k \in N$, $1 \le k \le n$, $(k, x_k)$ is an ordered pair. If $c_1, c_2, \ldots, c_n$ and $c$ are sets, we define

$$c_1 \times c_2 \times \cdots \times c_n = \{\langle x_1, \ldots, x_n \rangle : x_i \in c_i\ (i = 1, \ldots, n)\}$$

and $c^n = c \times \cdots \times c$ ($n$ factors). For $n \ge 2$, an $n$-ary relation $P$ on $c_1 \times c_2 \times \cdots \times c_n$ is a subset of $c_1 \times c_2 \times \cdots \times c_n$. $P$ is a relation in $V(X)$ if each $c_i \in V_k(X)$ ($i = 1, \ldots, n$) for some fixed integer $k$. If $P$ is a 2-ary relation on $c_1 \times c_2$ we will call it a binary relation. In this case we define the domain and range of $P$ by

$$\operatorname{dom} P = \{x_1 \in c_1 : \text{there exists } x_2 \in c_2 \text{ such that } \langle x_1, x_2 \rangle \in P\}$$

and

$$\operatorname{range} P = \{x_2 \in c_2 : \text{there exists } x_1 \in c_1 \text{ such that } \langle x_1, x_2 \rangle \in P\}.$$

Similarly, if $b \subseteq c_1$ we define the image of $b$ under $P$ by

$$P[b] = \{x_2 \in c_2 : \text{there exists } x_1 \in b \text{ such that } \langle x_1, x_2 \rangle \in P\},$$

and the inverse image of $b \subseteq c_2$ under $P$ is the set

$$P^{-1}[b] = \{x_1 \in c_1 : \text{there exists } x_2 \in b \text{ with } \langle x_1, x_2 \rangle \in P\}.$$
If $b$ is a singleton set, i.e., $b = \{x\}$, we will usually write $P[x]$ and $P^{-1}[x]$ for $P[b]$ and $P^{-1}[b]$. A function $f$ from $a$ to $b$, which we denote by $f : a \to b$, is a subset of $a \times b$ (and hence a binary or 2-ary relation) such that, for each $x \in a$, there is exactly one $y \in b$ such that $\langle x, y \rangle \in f$. The element $y$ is called the image of $x$ and is denoted by $f(x)$. The set $a$ is the domain of $f$, and we say that $f$ is defined on $a$. If $f(x) = f(y)$ implies $x = y$ for all $x, y \in a$, we say that $f$ is one-to-one (1-1) or injective. If range $f = b$ we say that $f$ maps $a$ onto $b$ or that $f$ is surjective. A function $g : c \to b$ is an extension of $f : a \to b$ (and $f$ is the restriction of $g$ to $a$) if $c \supseteq a$ and $g(x) = f(x)$ for all $x \in a$; we write $f = g|a$ in this case. If $a \subseteq c_1 \times \cdots \times c_n$ we may say that $f$ is a function of $n$ variables and may write $f(x)$ as $f(x_1, \ldots, x_n)$, where $x = \langle x_1, \ldots, x_n \rangle$, $x_i \in c_i$. (Note that $f(x)$ will be different from $f[x]$; e.g., if $f(x) = x^2$ on $R$ then $f(2) = 4$ and $f[2] = \{4\}$.)

The set-theoretic definition of ordered $n$-tuple in Definition 1.2 is justified by the following lemma, which expresses the definitive property of ordered $n$-tuples.

1.3 Lemma $\langle x_1, \ldots, x_n \rangle = \langle y_1, \ldots, y_n \rangle$ (set equality) iff $x_i = y_i$ ($i = 1, \ldots, n$).

Proof: We prove the lemma for $n = 2$ and leave the rest of the proof to the reader. It is immediate that if $x_1 = y_1$ and $x_2 = y_2$ then $(1, x_1) = (1, y_1)$ and $(2, x_2) = (2, y_2)$ so $\langle x_1, x_2 \rangle = \langle y_1, y_2 \rangle$. Suppose, conversely, that $\langle x_1, x_2 \rangle = \langle y_1, y_2 \rangle$. Then

$$\{\{\{1\}, \{1, x_1\}\}, \{\{2\}, \{2, x_2\}\}\} = \{\{\{1\}, \{1, y_1\}\}, \{\{2\}, \{2, y_2\}\}\}. \tag{1.1}$$

Suppose first that $x_1 = 1$. Then $\{\{1\}, \{1, x_1\}\} = \{\{1\}, \{1\}\} = \{\{1\}\}$ and so, from (1.1), $\{\{1\}, \{1, y_1\}\} = \{\{1\}\}$, so $\{1\} = \{1, y_1\}$ and hence $x_1 = y_1 = 1$. Suppose now that $x_1 \ne 1$. From (1.1), $\{\{1\}, \{1, x_1\}\} = \{\{1\}, \{1, y_1\}\}$, whence $\{1, x_1\} = \{1, y_1\}$ and so $x_1 = y_1$. Similarly $x_2 = y_2$. $\square$
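The coding of Definition 1.2 and the content of Lemma 1.3 can be checked mechanically on small values. The sketch below is illustrative: the function names `pair` and `n_tuple` are hypothetical, individuals are modeled as Python integers, and sets as `frozenset` objects.

```python
def pair(a, b):
    """Ordered pair (a, b) coded as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def n_tuple(*xs):
    """Ordered n-tuple coded as {(1, x_1), ..., (n, x_n)}."""
    return frozenset(pair(k, x) for k, x in enumerate(xs, start=1))

same = n_tuple(3, 1, 4) == n_tuple(3, 1, 4)
swapped = n_tuple(3, 1, 4) == n_tuple(1, 3, 4)
collapsed = pair(1, 1)  # {{1}, {1, 1}} collapses to {{1}}
print(same, swapped, collapsed)
```

The collapse of `pair(1, 1)` to a one-element set is exactly the case $x_1 = 1$ that the proof of the lemma treats separately, and `pair(1, 2)` differs from `n_tuple(1, 2)`, which is the distinction the text draws between ordered pairs and two-tuples.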
From Definition 1.2 we see that $n$-ary relations on $V_k(X)$ and functions with domain and range in $V_k(X)$ for some $k \in N$ are entities in $V(X)$. Indeed, if $x, y \in V_k(X)$ then the ordered pair $(x, y) \in V_{k+2}(X)$. Thus if $x_1, \ldots, x_n, x_{n+1} \in V_k(X)$ then $\langle x_1, \ldots, x_n \rangle \in V_{k+3}(X)$, and the 2-tuple $\langle \langle x_1, \ldots, x_n \rangle, x_{n+1} \rangle \in V_{k+6}(X)$. Therefore, any relation $P$ on a set $c_1 \times \cdots \times c_n$, $c_i \in V_k(X)$ ($i = 1, \ldots, n$), is an element of $V_{k+4}(X)$, and a function of $n$ variables on $c_1 \times \cdots \times c_n$ with range in $V_k(X)$ is in $V_{k+7}(X)$; a function on just $c_1$ is again in $V_{k+4}(X)$. Thus superstructures are at least rich enough to contain entities corresponding to the usual relations and functions occurring in mathematical systems. We conclude this section with some examples.

1.4 Examples

1. Let $X = R$ and let $\mathcal{I}$ be the set of all finite closed intervals in $R$; i.e., $I \in \mathcal{I}$ iff $I = \{x \in R : a \le x \le b;\ a, b \in R\} = [a, b]$. Then $\mathcal{I} \in V_2(R)$.
2. We define a relation $P$ on $N \times \mathcal{I}$, where $N$ is the set of natural numbers, by "$\langle n, y \rangle \in P$ iff $n \in y$." Clearly $P$ is in $V_6(R)$ since $N$ and $\mathcal{I}$ are in $V_2(R)$.
3. The relation $\mu$ defined on $\mathcal{I} \times R^+$ (where $R^+$ is the set of positive real numbers) by "$\langle y, r \rangle \in \mu$ iff $y = [a, b]$ and $r = |b - a|$" is a function on $\mathcal{I}$ which measures the length of each interval $I \in \mathcal{I}$; it is in $V_6(R)$.
Exercises II.1
1. Complete the proof of Lemma 1.3.
2. If $a \in V_k(X)$, $k \ge 1$, and $b \subseteq a$ then for what $n$ is $b \in V_n(X)$?
3. If $a, b \in V_k(X) - V_0(X)$, $k \ge 1$, then for what $n$ are $a \cup b$, $a \cap b$, and $a - b$ in $V_n(X)$?
4. Show that a relation $P$ is in $V(X)$ iff dom $P$ and range $P$ are in $V(X)$.
5. Let $\mathcal{I}$ denote the collection of all finite closed intervals on the real line (i.e., sets of the form $[a, b]$). For each $I \in \mathcal{I}$ let $\mu(I) = b - a$ (i.e., the length of $I$). For which value of $n$ is $\mu \in V_n(R)$?
II.2 Languages and Interpretation for Superstructures

In this section we introduce a suitable language for superstructures and show how to interpret sentences in this language. Let $V(X)$ be a given superstructure. The symbols of the language $\mathcal{L}_X$ for $V(X)$ consist of the following.

2.1 Connectives The symbols $\neg$, $\wedge$, $\vee$, $\to$, and $\leftrightarrow$, to be interpreted later as “not,” “and,” “or,” “implies,” and “if and only if,” respectively.

2.2 Quantifiers The symbols $\forall$ and $\exists$, to be interpreted as “for all” and “there exists,” respectively.

2.3 Parentheses The symbols $[\ ,\ ]$, $(\ ,\ )$, and $\langle\ ,\ \rangle$, to be used for bracketing.

2.4 Constant Symbols At least one symbol $\underline{a}$ for each element $a$ of $V(X)$. For simplicity of notation we will identify $a$ and its symbol $\underline{a}$. The context will clear up any possible confusion.

2.5 Variable Symbols A countable collection of symbols like $x, y, x_1, x_2, \ldots$, to be used as “variables.”

2.6 Equality Symbol The symbol $=$, to be interpreted as “equals” [it denotes set-theoretic equality for elements of $V(X) - V_0(X)$].
2.7 Predicate Symbol The symbol $\in$, to be interpreted as “is an element of.” We use the same symbol that was used informally in §II.1.

Notice that the language $\mathcal{L}_X$ is richer than the language $L_{\mathcal{S}}$ for a relational system $\mathcal{S}$ based on $X$ in that $\mathcal{L}_X$ has the symbols $\neg$, $\vee$, $\leftrightarrow$, and also $\exists$. However, $\mathcal{L}_X$ is poorer in having no terms. Sentences in $\mathcal{L}_X$ are built up inductively using the symbols just introduced. The basic building blocks are the atomic formulas, introduced in the following definition.

2.8 Definition A formula of $\mathcal{L}_X$ is built up inductively using the following rules:
(a) If $x_1, \ldots, x_n$, $x$, and $y$ are either constants or variables, the expressions $x \in y$, $x = y$, $(x_1, \ldots, x_n) \in y$, $(x_1, \ldots, x_n) = y$, $((x_1, \ldots, x_n), x) \in y$, and $((x_1, \ldots, x_n), x) = y$ are formulas, called atomic formulas.
(b) If $\Phi$ and $\Psi$ are formulas, then so are $\neg\Phi$, $\Phi \wedge \Psi$, $\Phi \vee \Psi$, $\Phi \to \Psi$, and $\Phi \leftrightarrow \Psi$.
(c) If $x$ is a variable symbol, $y$ is either a variable or a constant symbol, and $\Phi$ is a formula which does not already contain an expression of the form $(\forall x \in z)$ or $(\exists x \in z)$ (with the same variable symbol $x$), then $(\forall x \in y)\Phi$ and $(\exists x \in y)\Phi$ are formulas.

A variable $x$ occurs in the scope of a quantifier in $\Phi$ if, whenever $x$ occurs in $\Phi$, it is contained in a formula $\Psi$ which occurs in $\Phi$ in the form $(\forall x \in z)\Psi$ or $(\exists x \in z)\Psi$ ($z$ may be either a variable or a constant); it is then said to be bound, and otherwise it is called free. A sentence is a formula in which all variables are bound.

For example, the expression $(\forall x \in b)[x \in y \wedge (y, a) = b]$, where $a$ and $b$ are constants and $x$ and $y$ are variables, is a formula in $\mathcal{L}_X$ but not a sentence, since the variable $y$ is free. The formula $(\forall x \in y)(\exists z \in c)[(x, z) = c]$ is likewise not a sentence, but $(\exists y \in a)(\forall x \in y)(\exists z \in c)[(x, y) = c \wedge (y, z) = d]$ is a sentence.

We now indicate how to interpret a given sentence $\Phi$ in $\mathcal{L}_X$ in the superstructure $V(X)$. That is, we show how to decide whether $\Phi$ is true or false in $V(X)$.

2.9 Definition

(a) The atomic sentences $a \in b$, $(a_1, \ldots, a_n) \in b$, $((a_1, \ldots, a_n), c) \in b$ and $a = b$, $(a_1, \ldots, a_n) = b$, $((a_1, \ldots, a_n), c) = b$ are true (hold) in $V(X)$ if, respectively, the entity (corresponding to) $a$, $(a_1, \ldots, a_n)$, or $((a_1, \ldots, a_n), c)$ is an element of, or identical to, $b$.
(b) If $\Phi$ and $\Psi$ are sentences then
(i) $\neg\Phi$ is true if $\Phi$ is not true (does not hold),
(ii) $\Phi \wedge \Psi$ is true if both $\Phi$ and $\Psi$ are true,
(iii) $\Phi \vee \Psi$ is true if at least one of $\Phi$ and $\Psi$ is true,
(iv) $\Phi \to \Psi$ is true if either $\Psi$ is true or $\Phi$ is not true,
(v) $\Phi \leftrightarrow \Psi$ is true if $\Phi$ and $\Psi$ are either both true or both not true.

(c) Let $\Phi = \Phi(x)$ be a formula in which $x$ is the only free variable, and let $b$ be a constant symbol. Then
(i) $(\forall x \in b)\Phi$ is true if, for all entities $a \in b$, when the symbol corresponding to the entity $a$ is substituted for $x$ in $\Phi$, the resulting sentence, which we denote by $\Phi(a)$, is true,
(ii) $(\exists x \in b)\Phi$ is true if there exists an entity $a \in b$ so that $\Phi(a)$ is true.

Theoretically, the induction scheme for interpretation of sentences implied by 2.9 could be rather involved. But it is nothing more than the obvious one consistent with the specified (and usual) interpretation of the logical symbols involved. For most sentences $\Phi$ in $\mathcal{L}_X$ which we will later encounter, it will be easy to check if $\Phi$ is true in $V(X)$. For example, let $\mathcal{L}_{\mathbb{R}}$ be the language for $V(\mathbb{R})$, and let $\underline{S}$ be the ternary relation for sum (i.e., $(a, b, c) \in \underline{S}$ if $a + b = c$), $\underline{P}$ be the ternary relation for product (i.e., $(a, b, c) \in \underline{P}$ if $ab = c$), and $\mathbb{R}_0$ be the set $\mathbb{R} - \{0\}$ of nonzero reals. Then the sentence

$$(\forall x \in \mathbb{R}_0)(\exists y \in \mathbb{R})[(x, y, 1) \in \underline{P}] \tag{2.1}$$

is true in $V(\mathbb{R})$. For let $\Phi(x) = (\exists y \in \mathbb{R})[(x, y, 1) \in \underline{P}]$. Then $(\forall x \in \mathbb{R}_0)\Phi$ is true if, for all nonzero $a \in \mathbb{R}$, $(\exists y \in \mathbb{R})[(a, y, 1) \in \underline{P}]$ is true. This sentence is true if there is a number $b$ so that $(a, b, 1) \in \underline{P}$. But this is true with $b = a^{-1}$. Thus the sentence states that “every nonzero real number has an inverse.”

As in Chapter I, it is important to be able to translate an ordinary mathematical statement into a sentence in the language $\mathcal{L}_X$, since in the next section we will show how to write down the “transform” of such a sentence, which can then be interpreted in an appropriate “nonstandard” superstructure. As an example, consider the distributive law for the real numbers. To simplify matters we introduce the following notational convention.
2.10 Convention If $f$ is the symbol for a function of $n$ variables we may write $x_{n+1} = f(x_1, \ldots, x_n)$ for the atomic formula $((x_1, \ldots, x_n), x_{n+1}) \in f$. If $f$ is a function of one variable, we may write $y = f(x)$ for $(x, y) \in f$.

Noting that the ternary relations $\underline{S}$ and $\underline{P}$ define functions $S$ and $P$ of two variables [e.g., $S(a, b) = c$ if $(a, b, c) \in \underline{S}$], we may express the distributive law by the sentence

$$(\forall x \in \mathbb{R})(\forall y \in \mathbb{R})(\forall z \in \mathbb{R})(\forall x_1 \in \mathbb{R}) \cdots (\forall x_4 \in \mathbb{R})\,\big[[S(y, z) = x_1] \wedge [P(x, x_1) = x_2] \wedge [P(x, y) = x_3] \wedge [P(x, z) = x_4] \to [S(x_3, x_4) = x_2]\big]. \tag{2.2}$$
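The interpretation rules of Definition 2.9 amount to a recursive evaluation, which can be carried out literally when every bounding set is finite. The sketch below is only an illustration under a loud assumption: the five-element field of integers modulo 5 stands in for the reals (which cannot be enumerated), and the names `S_rel`, `P_rel`, `forall`, `exists`, and `implies` are hypothetical helpers, not notation from the text.

```python
from itertools import product

Q = 5                               # assumption: Z/5Z stands in for R
F = range(Q)                        # the finite set of individuals
F0 = [a for a in F if a != 0]       # nonzero elements, playing the role of R_0
S_rel = {(a, b, (a + b) % Q) for a in F for b in F}   # ternary relation for sum
P_rel = {(a, b, (a * b) % Q) for a in F for b in F}   # ternary relation for product


def forall(domain, phi):
    """(forall x in domain) phi: phi(a) must hold for every entity a in domain."""
    return all(phi(a) for a in domain)


def exists(domain, phi):
    """(exists x in domain) phi: phi(a) must hold for at least one a in domain."""
    return any(phi(a) for a in domain)


def implies(p, q):
    """p -> q is true if either q is true or p is not true."""
    return q or not p


# Analogue of sentence (2.1): every nonzero element has a multiplicative inverse.
inverse_sentence = forall(F0, lambda x: exists(F, lambda y: (x, y, 1) in P_rel))

# Analogue of sentence (2.2): the distributive law, written without terms.
distributive_sentence = all(
    implies(
        (y, z, x1) in S_rel and (x, x1, x2) in P_rel
        and (x, y, x3) in P_rel and (x, z, x4) in P_rel,
        (x3, x4, x2) in S_rel,
    )
    for x, y, z, x1, x2, x3, x4 in product(F, repeat=7)
)

# Dropping the restriction to nonzero elements makes the inverse sentence false.
unrestricted_inverse = forall(F, lambda x: exists(F, lambda y: (x, y, 1) in P_rel))
```

The auxiliary variables $x_1, \ldots, x_4$ are what make the term-free sentence long: each one names an intermediate value that a language with terms would write as a nested expression.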
Notice that since our language $\mathcal{L}_{\mathbb{R}}$ does not have terms, the sentence (2.2) is somewhat more involved than the corresponding sentence

$$(\forall x)(\forall y)(\forall z)\big[R(x) \wedge R(y) \wedge R(z) \to E(P(x, S(y, z)), S(P(x, y), P(x, z)))\big] \tag{2.3}$$

in the language $L_{\mathcal{R}}$ of Chapter I. [Remember that, in $L_{\mathcal{R}}$, $x(y + z) = xy + xz$ is shorthand for the term $E(x(y + z), xy + xz)$, where $E$ is the symbol for the equality relation.] However, it is still easy to check that (2.2) is true in $V(\mathbb{R})$.

For another example, consider the statement that $f: \mathbb{R} \to \mathbb{R}$ is continuous at the point $x = a$, or, more precisely, given $\varepsilon > 0$ there exists a $\delta > 0$ so that whenever $|x - a| < \delta$, $|f(x) - f(a)| < \varepsilon$. To translate this into a sentence in $\mathcal{L}_{\mathbb{R}}$, let $\mathbb{R}^+$ denote the entity in $V(\mathbb{R})$ which is the set of strictly positive real numbers. Let $\rho$ be the function of two variables corresponding to distance (so that $((x, y), z) \in \rho$ iff $|x - y| = z$), and $I$ be the binary relation of strict inequality (so that $(x, y) \in I$ iff $x < y$). Then the corresponding sentence in $\mathcal{L}_{\mathbb{R}}$ is

$$(\forall \varepsilon \in \mathbb{R}^+)(\exists \delta \in \mathbb{R}^+)(\forall x \in \mathbb{R})(\forall x_1 \in \mathbb{R})(\forall x_2 \in \mathbb{R})(\forall x_3 \in \mathbb{R})\,\big[[\rho(x, a) = x_1 \wedge (x_1, \delta) \in I \wedge f(x) = x_2 \wedge f(a) = b \wedge \rho(x_2, b) = x_3] \to [(x_3, \varepsilon) \in I]\big]. \tag{2.4}$$
Check that, when $b$ denotes the value $f(a)$, the interpretation of this sentence is true in $V(\mathbb{R})$ iff $f$ is continuous at $x = a$.

Since, with a little practice, the translation of ordinary mathematical statements in a given superstructure $V(X)$ into sentences in the language $\mathcal{L}_X$ will be routine, we adopt the following convention in the rest of this book.

2.11 Convention A sentence in $\mathcal{L}_X$ will often be written as a sentence in the language $L_{\mathcal{S}}$ of Chapter I, where $\mathcal{S}$ is a relational system over $X$, or even as a sentence in ordinary mathematical language, when the translation into a sentence in $\mathcal{L}_X$ is clear. We will also abbreviate $(\forall x_1 \in c) \cdots (\forall x_n \in c)$ by $(\forall x_1, \ldots, x_n \in c)$.

Thus, for example, the sentence (2.4) in $\mathcal{L}_{\mathbb{R}}$ may be written in the equivalent form

$$(\forall \varepsilon \in \mathbb{R}^+)(\exists \delta \in \mathbb{R}^+)(\forall x \in \mathbb{R})\big[|x - a| < \delta \to |f(x) - f(a)| < \varepsilon\big]. \tag{2.5}$$
Similarly, the sentence in $\mathcal{L}_{\mathbb{R}}$ which is equivalent to the semiformal sentence $(\forall x \in a)[x \subseteq b]$ is the sentence $(\forall x \in a)(\forall y \in x)[y \in b]$.

Exercises II.2

1. Write out the commutative and associative laws of addition for $\mathbb{R}$ in the form of sentence (2.2).
2. Write out a sentence in the form (2.4) which means that $\lim_{x \to a} f(x) = L$ in $\mathbb{R}$.
3. Write out a sentence in the form (2.4) which means that the derivative $f'(a)$ exists and equals $L$.
4. Formulate a sentence in $\mathcal{L}_{\mathbb{R}}$ which expresses the Archimedean property of the real number system (i.e., for each $x \in \mathbb{R}$ there is an $m \in \mathbb{N}$ so that $m \ge x$).
5. Write sentences in $\mathcal{L}_{\mathbb{R}}$ expressing the fact that a collection $\mathcal{F}$ of subsets of $\mathbb{R}$ is a filter.
6. Let $X$ be any set. Write sentences in $\mathcal{L}_X$ which express the facts that a function $f: A \to B$ is surjective (i.e., onto) and injective (i.e., one-to-one), respectively.
II.3 Monomorphisms between Superstructures: The Transfer Principle

In §I.5 we stated that the relational systems $\mathcal{R}$ and ${}^*\mathcal{R}$ are connected by a transfer principle. To be precise, we stated that, with the $*$-mapping and the associated $*$-transform of simple sentences defined in §I.5, if $\Phi$ is any simple sentence which is true in $\mathcal{R}$ then ${}^*\Phi$ is true in ${}^*\mathcal{R}$. In this section we generalize this relationship to superstructures. The basic properties of the new mapping $*$, which was introduced by Robinson and Zakon [45, 48], are abstracted in the notion of a monomorphism. In the next section we show that with each superstructure $V(X)$ one can associate a superstructure $V({}^*X)$ and a monomorphism $*: V(X) \to V({}^*X)$.

Let $X$ and $Y$ be two sets of individuals with associated superstructures $V(X)$ and $V(Y)$ and languages $\mathcal{L}_X$ and $\mathcal{L}_Y$, respectively. We will again assume that there is at least one constant symbol in $\mathcal{L}_X$ and $\mathcal{L}_Y$ for each entity in $V(X)$ and $V(Y)$, respectively, and identify the constant symbols with the corresponding entities. The context should settle any possible confusion. A constant symbol in $\mathcal{L}_X$ names something in $V(X)$, and the same is true for constant symbols in $\mathcal{L}_Y$.
Now let $*: V(X) \to V(Y)$ be a one-to-one mapping (injection). For $a \in V(X)$ we write $*(a) = {}^*a$. We assume that for each $a \in V(X)$ the symbol ${}^*a$ is in $\mathcal{L}_Y$ and names $*(a)$.

3.1 Definition If $\Phi$ is a formula (or sentence) in $\mathcal{L}_X$, the $*$-transform ${}^*\Phi$ of $\Phi$ is the formula (or sentence) in $\mathcal{L}_Y$ obtained from $\Phi$ by replacing each constant symbol $c$ in $\Phi$ with the symbol ${}^*c$ in $\mathcal{L}_Y$ associated with the entity $*(c)$.
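Definition 3.1 describes a purely syntactic operation: constants are renamed, while variables, connectives, and quantifiers are left alone. A minimal sketch follows, assuming formulas are represented as nested Python tuples whose first entry is a tag; the tags `"const"`, `"var"`, `"in"`, `"tuple"`, `"forall"`, `"exists"` and the function `star_transform` are illustrative conventions, not notation from the text.

```python
def star_transform(formula):
    """Return the *-transform: rename every constant c to *c, leave the rest alone."""
    tag = formula[0]
    if tag == "const":
        return ("const", "*" + formula[1])
    if tag == "var":
        return formula
    # Connectives, quantifiers, tuples, and atomic formulas: recurse into the parts.
    return (tag,) + tuple(star_transform(part) for part in formula[1:])


x, y = ("var", "x"), ("var", "y")
R, R0, one, P = ("const", "R"), ("const", "R_0"), ("const", "1"), ("const", "P")

# The sentence "every nonzero real has an inverse":
# (forall x in R_0)(exists y in R)[(x, y, 1) in P]
sentence = ("forall", x, R0, ("exists", y, R, ("in", ("tuple", x, y, one), P)))
transformed = star_transform(sentence)
```

Nothing about the meaning of the constants is used here; whether the transformed sentence is true in $V(Y)$ is a separate question, answered by the transfer principle rather than by the syntax.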
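For example, given the set $\mathbb{R}$ of real numbers, we assume that a superstructure $V({}^*\mathbb{R})$ over a set ${}^*\mathbb{R}$ and a monomorphism $*: V(\mathbb{R}) \to V({}^*\mathbb{R})$ exist (this will be established in the next section). Then the $*$-transform of the sentence (2.1) of the last section is the sentence

$$(\forall x \in {}^*\mathbb{R}_0)(\exists y \in {}^*\mathbb{R})[(x, y, {}^*1) \in {}^*\underline{P}], \tag{3.1}$$

and the $*$-transform of (2.4) is

$$(\forall \varepsilon \in {}^*\mathbb{R}^+)(\exists \delta \in {}^*\mathbb{R}^+)(\forall x \in {}^*\mathbb{R})(\forall x_1 \in {}^*\mathbb{R})(\forall x_2 \in {}^*\mathbb{R})(\forall x_3 \in {}^*\mathbb{R})\,\big[[{}^*\rho(x, {}^*a) = x_1 \wedge (x_1, \delta) \in {}^*I \wedge {}^*f(x) = x_2 \wedge {}^*f({}^*a) = {}^*b \wedge {}^*\rho(x_2, {}^*b) = x_3] \to [(x_3, \varepsilon) \in {}^*I]\big]. \tag{3.2}$$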
3.2 Definition The injection $*: V(X) \to V(Y)$ is called a monomorphism if

(i) ${}^*\emptyset = \emptyset$, where $\emptyset$ is the empty set,
(ii) $a \in X$ implies ${}^*a \in Y$, and $n \in \mathbb{N}$ implies ${}^*n = n$ (recall that $\mathbb{N} \subset X$ and $\mathbb{N} \subset Y$ by assumption),
(iii) $a \in V_{n+1}(X) - V_n(X)$ implies ${}^*a \in V_{n+1}(Y) - V_n(Y)$, $n \ge 0$,
(iv) if $a \in {}^*V_n(X)$, $n \ge 1$, and $b \in a$, then $b \in {}^*V_{n-1}(X)$,
(v) (transfer principle) for any sentence $\Phi$ in $\mathcal{L}_X$, $\Phi$ holds in $V(X)$ iff ${}^*\Phi$ holds in $V(Y)$.

Property (iv) is called strictness by Zakon [48]. We will later interpret it to say that elements of “internal” sets are internal. Because of (ii) we may, and will, assume that $X$ is actually a subset of $Y$ and ${}^*a = a$ for $a \in X$ [this is the analogue of Convention 2.5(c) of Chapter I]. The transfer principle as stated is redundant in that if from $\Phi$ holding in $V(X)$ one can conclude that ${}^*\Phi$ holds in $V(Y)$, then when $\neg\Phi$ holds in $V(X)$, ${}^*(\neg\Phi)$, i.e., $\neg({}^*\Phi)$, holds in $V(Y)$; thus ${}^*\Phi$ holding in $V(Y)$ forces $\Phi$ to hold in $V(X)$. The principle that ${}^*\Phi$ holding in $V(Y)$ implies that $\Phi$ holds in $V(X)$ will sometimes be called the downward transfer principle.
We now suppose that $*: V(X) \to V(Y)$ is a monomorphism and collect together some elementary results that follow easily from the transfer principle; the proofs are good illustrations of the use of that principle.
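3.3 Theorem

(a) Let $a, b, a_1, \ldots, a_n$ be fixed entities in $V(X)$. Then
(i) ${}^*\{a_1, \ldots, a_n\} = \{{}^*a_1, \ldots, {}^*a_n\}$,
(ii) ${}^*(a_1, \ldots, a_n) = ({}^*a_1, \ldots, {}^*a_n)$,
(iii) $a \in b$ iff ${}^*a \in {}^*b$,
(iv) $a = b$ iff ${}^*a = {}^*b$,
(v) $a \subseteq b$ iff ${}^*a \subseteq {}^*b$,
(vi) ${}^*(\bigcup_{i=1}^n a_i) = \bigcup_{i=1}^n {}^*a_i$ and ${}^*(\bigcap_{i=1}^n a_i) = \bigcap_{i=1}^n {}^*a_i$,
(vii) ${}^*(a_1 \times a_2 \times \cdots \times a_n) = {}^*a_1 \times \cdots \times {}^*a_n$.

(b) If $P$ is a relation on $a_1 \times \cdots \times a_n$ then ${}^*P$ is a relation on ${}^*a_1 \times \cdots \times {}^*a_n$, and, for $n = 2$, ${}^*(\operatorname{dom} P) = \operatorname{dom} {}^*P$ and ${}^*(\operatorname{range} P) = \operatorname{range} {}^*P$.

(c) If $f$ is a mapping from $a$ into $b$ then ${}^*f$ is a mapping from ${}^*a$ into ${}^*b$, and ${}^*[f(c)] = {}^*f({}^*c)$ for each $c \in a$. Also $f$ is one-to-one iff ${}^*f$ is one-to-one.

Proof: (a)(i) Let $b = \{a_1, \ldots, a_n\}$ and transform the sentence $(\forall x \in b)[x = a_1 \vee x = a_2 \vee \cdots \vee x = a_n]$, as well as the sentences $a_1 \in b, \ldots, a_n \in b$.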
(a)(ii) Exercise 1.

(a)(iii) Clear.

(a)(iv) Clear.

(a)(v) The sentence $(\forall x \in a)[x \in b]$ is true in $V(X)$ iff its $*$-transform $(\forall x \in {}^*a)[x \in {}^*b]$ is true in $V(Y)$. The interpretation of the latter sentence is that ${}^*a \subseteq {}^*b$.

(a)(vi) We show that ${}^*(a \cup b) = {}^*a \cup {}^*b$; it then follows by induction that ${}^*(\bigcup_{i=1}^n a_i) = \bigcup_{i=1}^n {}^*a_i$. The proof that ${}^*(\bigcap_{i=1}^n a_i) = \bigcap_{i=1}^n {}^*a_i$ is similar (Exercise 2). Let $c = a \cup b$. The sentence $(\forall x \in c)[x \in a \vee x \in b]$ is true in $V(X)$, so its $*$-transform $(\forall x \in {}^*c)[x \in {}^*a \vee x \in {}^*b]$ is true in $V(Y)$. The interpretation of the latter sentence is that ${}^*(a \cup b) \subseteq {}^*a \cup {}^*b$. Similarly, the interpretation of the $*$-transforms of the sentences $(\forall x \in a)[x \in c]$ and $(\forall x \in b)[x \in c]$ shows that ${}^*(a \cup b) \supseteq {}^*a \cup {}^*b$.

(a)(vii) We show that ${}^*(a \times b) = {}^*a \times {}^*b$; the proof for $n > 2$ is similar. Interpretation of the $*$-transforms of the sentences $(\forall z \in (a \times b))(\exists x \in a)(\exists y \in b)[(x, y) = z]$ and $(\forall x \in a)(\forall y \in b)(\exists z \in (a \times b))[(x, y) = z]$ shows that ${}^*(a \times b) \subseteq {}^*a \times {}^*b$ and ${}^*a \times {}^*b \subseteq {}^*(a \times b)$.

(b) That ${}^*P$ is a relation on ${}^*a_1 \times \cdots \times {}^*a_n$ follows by interpretation of the $*$-transform of $(\forall x \in P)(\exists x_1 \in a_1) \cdots (\exists x_n \in a_n)[(x_1, \ldots, x_n) = x]$. To show that, for $n = 2$, ${}^*(\operatorname{dom} P) \subseteq \operatorname{dom} {}^*P$, interpret the $*$-transform of the sentence $(\forall x \in \operatorname{dom} P)(\exists y \in a_2)[(x, y) \in P]$. The proof of the fact that ${}^*(\operatorname{dom} P) \supseteq \operatorname{dom} {}^*P$ is left to the reader (Exercise 3).

(c) ${}^*f$ is a relation on ${}^*a \times {}^*b$ by (b). To show that ${}^*f$ is a mapping, interpret the $*$-transform of the sentence $(\forall x \in a)(\forall y \in b)(\forall z \in b)\big[[(x, y) \in f \wedge (x, z) \in f] \to y = z\big]$, which is true in $V(X)$. The rest of the proof of (c) is left as Exercise 3. □

The results in Theorem 3.3 are quite general in nature. To be more concrete we consider, as examples, the interpretation of the sentences (3.1) and (3.2). Remember that the sentence (2.1), of which (3.1) is the $*$-transform, holds in $V(\mathbb{R})$ because there exists a multiplicative inverse of each nonzero element in the field $\mathbb{R}$, and (2.1) is a formal expression of that mathematical statement. Clearly (3.1) should be a formal expression of a similar fact about $V({}^*\mathbb{R})$. To see this, note that the ternary relation $\underline{P}$ defines a function $P$ of two variables since the product of two real numbers is uniquely defined. By parts (b) and (c) of Theorem 3.3 we see that ${}^*P$ is a function from $({}^*\mathbb{R})^2$ to ${}^*\mathbb{R}$. Thus for each $a, b \in {}^*\mathbb{R}$ the number $c \in {}^*\mathbb{R}$ such that $((a, b), c) \in {}^*P$ is uniquely defined and is called the $*$-product of $a$ and $b$. We denote $c$ by $a * b$ or $ab$. Now (3.1) is true by transfer in $V({}^*\mathbb{R})$ since (2.1) is true in $V(\mathbb{R})$, and its interpretation establishes the existence, for each $a \neq 0$ in ${}^*\mathbb{R}$, of a number $y \in {}^*\mathbb{R}$ so that $a * y = 1$. One can similarly show by transfer that $y$ is unique.

Consider now the interpretation of (3.2). Proceeding as above, we see that (3.2) is equivalent to the ordinary mathematical statement “Given $\varepsilon > 0$ in ${}^*\mathbb{R}$ there is a $\delta > 0$ in ${}^*\mathbb{R}$ so that, for all $x \in {}^*\mathbb{R}$, $|x - a| < \delta$ implies $|{}^*f(x) - {}^*f(a)| < \varepsilon$.” (The absolute value $|x|$ for $x \in {}^*\mathbb{R}$ is the extension of the usual absolute value in $\mathbb{R}$.) Notice that here $\varepsilon$ and $\delta$ are allowed to be any positive numbers in ${}^*\mathbb{R}$ (even infinitesimal). The function ${}^*f$ will be said to be $*$-continuous at $a$ if it satisfies (3.2), which will be the case, by transfer, if $f$ is continuous at $a$.
In §I.2 we noted that if $B$ was a subset of $\mathbb{R}$ then ${}^*B$ was an extension of $B$ (regarded as embedded in ${}^*\mathbb{R}$). This fact is again true in the present context. For if $b \in \mathcal{P}(X)$ and $a \in b$ then ${}^*a \in {}^*b$ by Theorem 3.3(a)(iii). But since $a \in X$ we have ${}^*a = a$ and so $a \in {}^*b$, and hence $b \subset {}^*b$. One might expect that this fact is true in general, i.e., that $a \in b$ implies $a \in {}^*b$ for any entities $a, b \in V(X)$, but in general Theorem 3.3(a)(iii) is the best we can do, as shown by the following example.

3.4 Example Let $\mathcal{I}$ denote the set of closed bounded intervals in $\mathbb{R}$; each $I \in \mathcal{I}$ is of the form $I = \{x \in \mathbb{R} : a \le x \le b,\ a, b \in \mathbb{R}\} = [a, b]$. Then $\mathcal{I} \in V_2(\mathbb{R})$. Thus the following statements are true in $V(\mathbb{R})$:

$$(\forall x \in \mathcal{I})(\exists a, b \in \mathbb{R})(\forall y \in \mathbb{R})[a \le y \le b \leftrightarrow y \in x],$$

$$(\forall a, b \in \mathbb{R})(\exists x \in \mathcal{I})(\forall y \in \mathbb{R})[a \le y \le b \leftrightarrow y \in x].$$
By transfer, assuming a monomorphism $*: V(\mathbb{R}) \to V({}^*\mathbb{R})$, we see that if $I \in {}^*\mathcal{I}$ then there exist numbers $a, b \in {}^*\mathbb{R}$ so that $I = \{x \in {}^*\mathbb{R} : a \le x \le b\}$. Even if $a$ and $b$ are standard (i.e., in $\mathbb{R}$), if $a \neq b$ such an interval is not identical to an interval in $\mathcal{I}$, since it contains nonstandard reals between $a$ and $b$. Thus ${}^*\mathcal{I}$ contains the transform ${}^*I = \{x \in {}^*\mathbb{R} : a \le x \le b\}$ of each standard interval $I = [a, b]$, $a, b \in \mathbb{R}$, and also all other intervals of the form $\{x \in {}^*\mathbb{R} : a \le x \le b\}$ where either $a$ or $b$ or both are nonstandard. Notice, in particular, that $\mathcal{I}$ is not embedded in ${}^*\mathcal{I}$, i.e., only singleton sets in $\mathcal{I}$ lie in ${}^*\mathcal{I}$. This situation is indicative of what happens in general when one forms ${}^*b$ for an entity $b$ of rank higher than one.
The fact that the languages $\mathcal{L}_X$ and $\mathcal{L}_Y$ contain $\exists$, the existential quantifier, allows alternative proofs of many of the results established in Chapter I. In particular, we may use $\exists$ to do the work done by Skolem functions in Chapter I. To illustrate, consider the following proof of the sufficiency of the condition in Proposition 8.1 of Chapter I, which states that if $(s_n)$ is a standard sequence and ${}^*s_n \simeq L \in \mathbb{R}$ for all infinite $n$, then $s_n$ converges to $L$. We present the proof in a hybrid of the languages $L_{\mathcal{R}}$ and $\mathcal{L}_{\mathbb{R}}$. Translation into the language $\mathcal{L}_{\mathbb{R}}$ is left to the interested reader.

Suppose then that ${}^*s_n \simeq L$ for all infinite positive integers $n \in {}^*\mathbb{N}$. Let $\varepsilon > 0$ be a fixed standard real. Since $|{}^*s_n - L|$ is infinitesimal for all infinite positive integers $n$, the statement

$$(\forall n \in {}^*\mathbb{N})[n \ge \omega \to |L - {}^*s_n| < \varepsilon] \tag{3.3}$$

is a sentence in $\mathcal{L}_{{}^*\mathbb{R}}$ which is true for any infinite positive integer $\omega$. However, (3.3) is not the $*$-transform of a sentence in $\mathcal{L}_{\mathbb{R}}$, since it involves the constant $\omega$, which does not name the image $*(a)$ of an element $a \in V(\mathbb{R})$. But since (3.3) is true, the sentence

$$(\exists m \in {}^*\mathbb{N})(\forall n \in {}^*\mathbb{N})[n \ge m \to |L - {}^*s_n| < \varepsilon] \tag{3.4}$$

is also true in $V({}^*\mathbb{R})$ and is the $*$-transform of

$$(\exists m \in \mathbb{N})(\forall n \in \mathbb{N})[n \ge m \to |L - s_n| < \varepsilon], \tag{3.5}$$
which is then true in $V(\mathbb{R})$ by virtue of the transfer principle. Since (3.5) is true for any $\varepsilon > 0$, we see that $s_n$ converges to $L$.

3.5 Remark Comparison of this proof with that in Chapter I shows that we have avoided a proof by contradiction, and the use of Skolem functions. Another and more important aspect of this new technique of proof is that we construct a true sentence ${}^*\Phi$ in $\mathcal{L}_{{}^*X}$ which is the $*$-transform of a sentence $\Phi$ in $\mathcal{L}_X$, so $\Phi$ is true by transfer down and yields the desired result. In Chapter I we used the transfer principle only in the upward direction, i.e., from $L_{\mathcal{R}}$ to $L_{{}^*\mathcal{R}}$. The construction of ${}^*\Phi$ is often accomplished, as above, by writing down a sentence $\Psi$ in $\mathcal{L}_{{}^*X}$ which is true but involves entities, like the $\omega$ above, which do not occur in the $*$-transforms of sentences $\Phi$ in $\mathcal{L}_X$, and then appropriately adding the existential quantifier to convert $\Psi$ to a sentence of the form ${}^*\Phi$ for some $\Phi$ in $\mathcal{L}_X$. The proof above may seem surprising since we infer the existence of a standard integer $m$ satisfying

$$(\forall n \in \mathbb{N})[n \ge m \to |L - s_n| < \varepsilon] \tag{3.6}$$

from the existence of the infinite integer $\omega$ satisfying (3.3). Since similar proofs will occur in the rest of this book it is important to be able to recognize when a sentence $\Psi$ in $\mathcal{L}_{{}^*X}$ is or is not of the form ${}^*\Phi$ for some sentence $\Phi$ in $\mathcal{L}_X$. This question will be dealt with in §II.6.
Exercises II.3

1. Prove Theorem 3.3(a)(ii).
2. Show that ${}^*(\bigcap_{i=1}^n a_i) = \bigcap_{i=1}^n {}^*a_i$.
3. Finish the proof of parts (b) and (c) of Theorem 3.3.
4. Use the downward transfer principle to prove the sufficiency of the condition in Proposition 10.1(a) of Chapter I.
5. Use the downward transfer principle to prove the sufficiency of the condition in Proposition 10.8 of Chapter I.
6. Use the transfer principle to show that the set $\mathbb{N}$ of standard natural numbers is not an element of ${}^*\mathcal{P}(\mathbb{N})$.
7. Let $*: V(X) \to V(Y)$ be a monomorphism. Show that if $f \in V(X)$ maps $a$ onto $b$ then ${}^*f$ maps ${}^*a$ onto ${}^*b$.
*II.4 The Ultrapower Construction for Superstructures
In this section we show how to generalize the construction of ${}^*\mathbb{R}$ in Chapter I by constructing, for any superstructure $V(X)$, a superstructure $V({}^*X)$ on an appropriate set ${}^*X$ and a monomorphism $*: V(X) \to V({}^*X)$. We begin with an ultrafilter $\mathcal{U}$ on an index set $I$ (see the Appendix); both $I$ and $\mathcal{U}$ will be fixed in the construction of $V({}^*X)$, but in later sections we will choose them to have additional properties. Now let $V(X) = \bigcup_{n=0}^{\infty} V_n(X)$ be a given superstructure.

4.1 Definition Let $S$ be an entity in $V(X)$. The set of all maps $a: I \to S$ is denoted by $\prod S$; we write $a(i) = a_i$ for $i \in I$. The maps $a$ and $b$ in $\prod S$ are equivalent (with respect to $\mathcal{U}$), and we write $a =_{\mathcal{U}} b$, iff $\{i \in I : a_i = b_i\} \in \mathcal{U}$ (the equality is set-theoretic except when the values are individuals in $X$, in which case it is identity). If $a =_{\mathcal{U}} b$ we say that $a_i = b_i$ almost everywhere (a.e.). The relation $=_{\mathcal{U}}$ is an equivalence relation on $\prod S$. The set of associated equivalence classes is denoted by $\prod_{\mathcal{U}} S$, and is called the ultrapower of $S$ (with respect to $\mathcal{U}$). The equivalence class in $\prod_{\mathcal{U}} S$ containing $a \in \prod S$ is denoted by $[a]$. Let $V_{-1}(X) = \emptyset$, the empty set. The bounded ultrapower of $V(X)$ is the set

$$\prod\nolimits^0_{\mathcal{U}} V(X) = \bigcup_{n=0}^{\infty} \prod\nolimits_{\mathcal{U}} [V_n(X) - V_{n-1}(X)].$$
We define the map $e: V(X) \to \prod^0_{\mathcal{U}} V(X)$ by $e(a) = [\bar{a}]$, where $\bar{a}_i = a$ for all $i \in I$.

The proof that $=_{\mathcal{U}}$ is an equivalence relation on $\prod S$ is similar to the proof of Lemma 1.4 of Chapter I and is left as an exercise. We see immediately from Definitions 1.3 and 1.5 of Chapter I that ${}^*\mathbb{R} = \prod_{\mathcal{U}} \mathbb{R}$, where $\mathcal{U}$ is the ultrafilter of §I.1. The map $e$ is a generalization of the map $*: \mathbb{R} \to {}^*\mathbb{R}$ of Definition 1.9 of Chapter I. $\prod^0_{\mathcal{U}} V(X)$ is called a bounded ultrapower since, for each $[a] \in \prod^0_{\mathcal{U}} V(X)$, $a_i \in V_k(X) - V_{k-1}(X)$, $i \in I$, for some fixed $k \in \mathbb{N}$; thus, there is a uniform upper bound to the rank of $a_i$, $i \in I$.

We now want to construct from $\prod^0_{\mathcal{U}} V(X)$ a superstructure $V({}^*X)$ over a set ${}^*X$, and an associated mapping $M: \prod^0_{\mathcal{U}} V(X) \to V({}^*X)$. We will finally define the mapping $*: V(X) \to V({}^*X)$ as the composition of $e$ and $M$, and show that $*$ is a monomorphism. In the literature, $M$ is called a Mostowski collapsing function. First we must define ${}^*X$. In analogy with the definition of ${}^*\mathbb{R}$ we put

$${}^*X = \prod\nolimits_{\mathcal{U}} X = \prod\nolimits_{\mathcal{U}} V_0(X). \tag{4.1}$$
Now $V({}^*X)$ is completely determined and we proceed to the definition of $M: \prod^0_{\mathcal{U}} V(X) \to V({}^*X)$. We define $M$ successively on $\prod_{\mathcal{U}} [V_n(X) - V_{n-1}(X)]$ by induction. By (4.1), $\prod_{\mathcal{U}} V_0(X) = {}^*X$, and by definition $V_0({}^*X) = {}^*X$, and so we define $M$ to be the identity on ${}^*X$, i.e.,

$$M(a) = a, \qquad a \in \prod\nolimits_{\mathcal{U}} V_0(X) = {}^*X. \tag{4.2}$$
For higher levels we need the following definition.

4.2 Definition If $[a], [b] \in \prod^0_{\mathcal{U}} V(X)$, then $[a] \in_{\mathcal{U}} [b]$ iff $\{i \in I : a_i \in b_i\} \in \mathcal{U}$.
The reader should check that $\in_{\mathcal{U}}$ is well defined (exercise). To motivate the definition of $M$ on $\prod_{\mathcal{U}} [V_1(X) - V_0(X)]$ we let $X = \mathbb{R}$ and recall from §I.2 the definition of ${}^*A$, where $A$ is a subset of $\mathbb{R}$. By Definition 2.2 of Chapter I (with $I = \mathbb{N}$), ${}^*A$ consists of those elements $[a]$ of ${}^*\mathbb{R}$ for which $\{i \in I : a_i \in A\} \in \mathcal{U}$. For our more general situation, the subset $A$ is mapped by $e$ to the element $e(A) = [\bar{A}]$ in $\prod_{\mathcal{U}} V_1(\mathbb{R})$. Note that $[\bar{A}]$ is not a subset of ${}^*\mathbb{R}$. We want ${}^*A$ to be a subset of ${}^*\mathbb{R}$ and to consist of precisely those elements $[a] \in {}^*\mathbb{R}$ for which $[a] \in_{\mathcal{U}} [\bar{A}]$. Since $*$ will be the composition of $e$ and $M$, it follows that we should put

$$M([\bar{A}]) = \{[a] \in {}^*\mathbb{R} : [a] \in_{\mathcal{U}} [\bar{A}]\} = \{M([a]) \in V_0({}^*\mathbb{R}) : [a] \in \prod\nolimits_{\mathcal{U}} V_0(\mathbb{R}) \text{ and } [a] \in_{\mathcal{U}} [\bar{A}]\}.$$

The general definition is now clear.
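4.3 Definition We define $M: \prod^0_{\mathcal{U}} V(X) \to V({}^*X)$ inductively by

$$M([b]) = [b] \quad \text{for } [b] \in \prod\nolimits_{\mathcal{U}} V_0(X),$$

$$M([b]) = \Big\{ M([a]) : [a] \in \bigcup_{k=0}^{n-1} \prod\nolimits_{\mathcal{U}} [V_k(X) - V_{k-1}(X)] \text{ and } [a] \in_{\mathcal{U}} [b] \Big\} \quad \text{for } [b] \in \prod\nolimits_{\mathcal{U}} [V_n(X) - V_{n-1}(X)],\ n \ge 1.$$

The important properties of $M$ and $e$ are collected together in the following result.

4.4 Lemma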
(i) $e$ and $M$ are one-to-one maps; i.e., $a = b$ iff $e(a) = e(b)$, and $[a] = [b]$ iff $M([a]) = M([b])$.
(ii) $e$ maps $X$ into ${}^*X$; $M$ maps ${}^*X$ onto ${}^*X$.
(iii) $e$ maps $V_{n+1}(X) - V_n(X)$ into $\prod_{\mathcal{U}} [V_{n+1}(X) - V_n(X)]$; $M$ maps $\prod_{\mathcal{U}} [V_{n+1}(X) - V_n(X)]$ into $V_{n+1}({}^*X) - V_n({}^*X)$.
(iv) $a \in b$ iff $e(a) \in_{\mathcal{U}} e(b)$; $[a] \in_{\mathcal{U}} [b]$ iff $M([a]) \in M([b])$.
(v) $e(X) = [\bar{X}]$ and $M([\bar{X}]) = {}^*X$.
(vi) Let $[a], [b] \in \prod^0_{\mathcal{U}} V(X)$ and put $c_i = \{a_i, b_i\}$, $i \in I$. Then $[c] \in \prod^0_{\mathcal{U}} V(X)$ and $M([c]) = \{M([a]), M([b])\}$. Similar statements hold with $\{\ \}$ replaced by $(\ )$ and $=$ replaced by $\in$, and also for three or more terms.
(vii) If $[b] \in_{\mathcal{U}} e(a)$, $a \in V_n(X) - V_{n-1}(X)$, then $[b] \in_{\mathcal{U}} e(V_{n-1}(X))$.

Proof: We leave the proof of (ii)-(v) and (vii) as exercises.
(i) To show $e$ is one-to-one let $a \neq b \in V(X)$. Then $e(a) \neq e(b)$ since $\bar{a}_i \neq \bar{b}_i$ for all $i \in I$, and $\emptyset \notin \mathcal{U}$. To show $M$ is one-to-one, we consider only the case that $[a]$ and $[b]$ are in $\prod^0_{\mathcal{U}} V(X) - {}^*X$, and $[a] \neq [b]$ in $\prod^0_{\mathcal{U}} V(X)$. Let $U_a = \{i \in I : \text{there exists } u_i \in a_i \text{ with } u_i \notin b_i\}$ and $U_b = \{i \in I : \text{there exists } v_i \in b_i \text{ with } v_i \notin a_i\}$. If neither $U_a$ nor $U_b$ is in $\mathcal{U}$, then $I - (U_a \cup U_b)$ is in $\mathcal{U}$ and $a_i = b_i$ for almost all $i \in I$. But this is impossible. Assume, therefore, that $U_a \in \mathcal{U}$. Choose $u_i \in a_i - b_i$ for each $i \in U_a$ and let $u_i$ be a fixed $u_{i_0}$ otherwise. Then $M([u]) \in M([a])$ and $M([u]) \notin M([b])$. The rest is left to the reader.

(vi) We prove the first statement and leave the rest to the reader. Now $M([c]) = \{M([y]) : y_i \in \{a_i, b_i\} \text{ a.e.}\}$. If $y_i \in \{a_i, b_i\}$ a.e., let $A = \{i \in I : y_i = a_i\}$ and $B = \{i \in I : y_i = b_i\}$. Then $A \cup B \in \mathcal{U}$, and so either $A \in \mathcal{U}$ or $B \in \mathcal{U}$ since $\mathcal{U}$ is an ultrafilter. Thus

$$M([c]) = \{M([y]) : y_i = a_i \text{ a.e.}\} \cup \{M([y]) : y_i = b_i \text{ a.e.}\} = \{M([a]), M([b])\}. \qquad \Box$$
*
defined as the composition of e and M, we now show that
*: V(X)+ V(*X)is a monomorphism. To do so we need the following funda-
mental result; in the proof we use the axiom of choice.
was) If 4(x1,. .. ,x,) is a formula in le, with x i , . . . ,x, its only free variables, and [a,], . . . ,[a,] E f l g V ( X ) , then *4(M([a,]), . . , , M([a,])) is true in V(*X) iff
4 5 Theorem
{i E l:@(al(i),. . . ,a,(i)) is true} E Q.
Proof: 1. We first establish the result when $\Phi$ is an atomic formula. If $\Phi$ is of the form $x \in y$ or $x = y$, where $x$ and $y$ are either constants or variables, the result is immediate from 4.4(i) and 4.4(iv). The result for $\Phi$ of the form $(x_1, \ldots, x_n) \in x_{n+1}$, $(x_1, \ldots, x_n) = x_{n+1}$, $((x_1, \ldots, x_n), x) \in x_{n+1}$, and $((x_1, \ldots, x_n), x) = x_{n+1}$ can be proved by induction using 4.4(vi) (Exercise 4).

2. Suppose now that the theorem has been established for the formulas $\Phi(x_1, \ldots, x_n)$ and $\Psi(x_1, \ldots, x_n)$. We would like to prove it for the formulas $\neg\Phi$, $\Phi \wedge \Psi$, $\Phi \vee \Psi$, and $\Phi \to \Psi$. We do so for the first two and leave the proofs for the last two as exercises (Exercise 4); recall, however, that $\Phi \vee \Psi$ is equivalent to $\neg[(\neg\Phi) \wedge (\neg\Psi)]$.

(i) For $\neg\Phi$ note that the following are equivalent:
${}^*(\neg\Phi)(M([a_1]), \ldots, M([a_n]))$ is true;
$\neg\,{}^*\Phi(M([a_1]), \ldots, M([a_n]))$ is true;
$\{i \in I : \Phi(a_1(i), \ldots, a_n(i)) \text{ is true}\} \notin \mathcal{U}$;
$\{i \in I : \neg\Phi(a_1(i), \ldots, a_n(i)) \text{ is true}\} \in \mathcal{U}$ (since $\mathcal{U}$ is an ultrafilter).

(ii) For $\Phi \wedge \Psi$ note that the following are equivalent:
${}^*(\Phi \wedge \Psi)(M([a_1]), \ldots, M([a_n]))$ is true;
${}^*\Phi(M([a_1]), \ldots, M([a_n])) \wedge {}^*\Psi(M([a_1]), \ldots, M([a_n]))$ is true;
$\{i \in I : \Phi(a_1(i), \ldots, a_n(i)) \text{ is true}\} \in \mathcal{U}$ and $\{i \in I : \Psi(a_1(i), \ldots, a_n(i)) \text{ is true}\} \in \mathcal{U}$;
$\{i \in I : \Phi(a_1(i), \ldots, a_n(i)) \text{ is true}\} \cap \{i \in I : \Psi(a_1(i), \ldots, a_n(i)) \text{ is true}\} \in \mathcal{U}$ (since $\mathcal{U}$ is a filter);
$\{i \in I : (\Phi \wedge \Psi)(a_1(i), \ldots, a_n(i)) \text{ is true}\} \in \mathcal{U}$.
3. Suppose the result is true for a formula of the form $\Phi(x_1, \ldots, x_n, y)$. We want to show it is true for formulas of the form $(\exists y \in c)\Phi$, $(\exists y \in z)\Phi$, $(\forall y \in c)\Phi$, and $(\forall y \in z)\Phi$, where $c$ is a constant and $z$ is a variable. We consider the case $(\exists y \in c)\Phi$ and leave the case $(\exists y \in z)\Phi$ to the reader (Exercise 4). For the quantifier $\forall$, replace $(\forall y \in c)\Phi$ with $\neg(\exists y \in c)\neg\Phi$ and $(\forall y \in z)\Phi$ with $\neg(\exists y \in z)\neg\Phi$.

Suppose ${}^*(\exists y \in c)\Phi(M([a_1]), \ldots, M([a_n]), y)$ holds in $V({}^*X)$, i.e.,

$$(\exists y \in {}^*c)\,{}^*\Phi(M([a_1]), \ldots, M([a_n]), y)$$

holds in $V({}^*X)$. Thus we can find $M([a]) \in V({}^*X)$ so that $(M([a]) \in {}^*c) \wedge {}^*\Phi(M([a_1]), \ldots, M([a_n]), M([a]))$ holds in $V({}^*X)$. Using step 2, this is equivalent to

$$\{i \in I : a(i) \in c \wedge \Phi(a_1(i), \ldots, a_n(i), a(i)) \text{ is true}\} \in \mathcal{U}.$$
Hence also the larger set $\{i \in I : (\exists y \in c)\Phi(a_1(i), \ldots, a_n(i), y) \text{ is true}\}$ is in $\mathcal{U}$.

Conversely, let $\{i \in I : (\exists y \in c)\Phi(a_1(i), \ldots, a_n(i), y) \text{ is true}\} = U$ belong to $\mathcal{U}$. Then, for each $i \in U$, we can use the axiom of choice to choose some $a(i) \in c$ for which $\Phi(a_1(i), \ldots, a_n(i), a(i))$ is true, and for $i \in I - U$ put $a(i) = d \in c$, where $d$ is a fixed element of $c$, so that $\{i \in I : a(i) \in c \wedge \Phi(a_1(i), \ldots, a_n(i), a(i))\} \in \mathcal{U}$. Changing $a(i)$ on the complement of a set in $\mathcal{U}$ if necessary, we may assume that $a(i) \in V_k(X) - V_{k-1}(X)$ for some $k \in \mathbb{N}$ and all $i$ (Exercise 6). Now the map $a: I \to c$ defines $[a] \in \prod^0_{\mathcal{U}} V(X)$ and the steps of the previous paragraph can be retraced, yielding the result.

4. The general result now follows by induction based on Definition 2.9 (Exercise 4). □

4.6 Theorem The map $*: V(X) \to V({}^*X)$ defined by $* = M \circ e$ is a monomorphism.

Proof: We prove (v) of Definition 3.2 and leave the remaining proofs as exercises. Let $\Phi$ be a sentence in $\mathcal{L}_X$. Then $\Phi$ has no free variables, so ${}^*\Phi$ is true in $V({}^*X)$ iff $\{i \in I : \Phi \text{ is true}\} \in \mathcal{U}$ by Theorem 4.5. But the set $\{i \in I : \Phi \text{ is true}\}$ is either $I$ [if $\Phi$ is true in $V(X)$] or $\emptyset$ [if $\Phi$ is not true in $V(X)$], so ${}^*\Phi$ is true if and only if $\Phi$ is true. □

Whenever nonstandard analysis is applied in any concrete situation in the rest of this book, we will start with a superstructure $V(S)$ based on a suitable set $S$, and then use a superstructure $V({}^*S)$ and a monomorphism $*: V(S) \to V({}^*S)$ constructed with an ultrafilter $\mathcal{U}$ as in this section. Usually the monomorphism will not be mentioned explicitly, but we will always choose $\mathcal{U}$ in such a way that $V({}^*S)$ has a special property, that of being an enlargement. This will guarantee that ${}^*S$ is large enough to contain “infinite” entities. We turn to this question in the next section.

Exercises II.4

1. Prove that $=_{\mathcal{U}}$ is an equivalence relation on $\prod S$.
2. Show that the relation $\in_{\mathcal{U}}$ of Definition 4.2 is well defined.
3. Finish the proof of Lemma 4.4.
4. Finish the proof of Theorem 4.5.
5. Finish the proof of Theorem 4.6.
6. Show that if $a(i) \in V_n(X)$ for a fixed $n$ and all $i \in I$, then, for some $k \le n$ and all $i \in U$ for some $U \in \mathcal{U}$, $a(i) \in V_k(X) - V_{k-1}(X)$.
II.5 Hyperfinite Sets, Enlargements, and Concurrent Relations
In §I.1 we showed that ${}^*\mathbb{R}$ was strictly larger than $\mathbb{R}$ (regarded as embedded in ${}^*\mathbb{R}$) by exhibiting elements like $[(1, 2, 3, \ldots)]$ in ${}^*\mathbb{R}$ which were not equal to any element of $\mathbb{R}$. The demonstration involved the fact that the ultrafilter $\mathcal{U}$ on $\mathbb{N}$ was free, i.e., it contained the cofinite filter $\mathcal{F}_{\mathbb{N}}$. In the general case it is interesting to determine the conditions under which ${}^*X$ is strictly larger than $X$. It should be recalled that, by assumption, $X$ contains $\mathbb{N}$ and hence is infinite. The following result shows that ${}^*X = X$ and hence $V({}^*X) = V(X)$ when $\mathcal{U}$ is a principal (nonfree) ultrafilter on $I$; thus we get nothing new in this case.

5.1 Lemma If $\mathcal{U}$ is a principal ultrafilter on $I$ then ${}^*X$ (as constructed in §II.4) equals $X$ (regarded as embedded in ${}^*X$).
Proof: A principal ultrafilter $\mathcal{U}$ is generated by a single element $i_0 \in I$; i.e., $\mathcal{U}$ consists of all sets $U \subseteq I$ which contain $i_0$ (see the Appendix). If $[a] \in {}^*X$ and $a_{i_0} = \alpha$, then $[a] =_{\mathcal{U}} [\bar{\alpha}]$, where $\bar{\alpha}_i = \alpha$ for all $i \in I$. Thus $[a] \in X$, where $X$ is regarded as embedded in ${}^*X$. □
We will next show how to choose an index set and an ultrafilter of subsets of the index set so that the ${}^*X$ constructed as in §II.4 is strictly larger than $X$, and so that $V({}^*X)$ has other desirable properties; the most important is that of being an enlargement. We begin by introducing the notion of a hyperfinite or *-finite set.
5.2 Definition If $A \in V_n(X) - V_0(X)$ for some $n$, we denote by $\mathcal{P}_F(A)$ the set of all finite subsets of $A$. $\mathcal{P}_F(A)$ is in $V(X)$, and we call the image ${}^*\mathcal{P}_F(A) \in V({}^*X)$ (with respect to a monomorphism $*$) the set of hyperfinite or *-finite subsets of ${}^*A$. The set of all hyperfinite subsets is the set $\bigcup_{n=1}^{\infty} {}^*\mathcal{P}_F(V_n(X))$.
Any elementary mathematical result that holds for finite sets extends to a similar result for hyperfinite sets by the transfer principle. An example of a hyperfinite set is the set $J \subseteq {}^*N$ of positive integers less than or equal to some $j \in {}^*N$. To see that $J$ is hyperfinite consider the collection $\mathcal{S} \subseteq \mathcal{P}_F(N)$ of all finite subsets of $N$ of the form $\{1, 2, \ldots, j\}$ for some $j \in N$ (a set of this form is called an initial segment). Then ${}^*\mathcal{S} \subseteq {}^*\mathcal{P}_F(N)$ contains sets of the form $\{n \in {}^*N : n \le j\}$ for some $j \in {}^*N$. The following result shows that these hyperfinite sets are in some sense the prototype.
5.3 Theorem If $B \in {}^*V_k(X)$, $k \in N$, is a hyperfinite set, then there is an initial segment $J = \{n \in {}^*N : n \le j\}$ for some $j \in {}^*N$ and a one-to-one, onto mapping $f: J \to B$ in ${}^*V_{k+4}(X)$.
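Proof: Suppose $B \in {}^*\mathcal{P}_F(A)$, where $A \in V_n(X)$, $n \ge 1$. Now the following statement (in semiformal language) is true in $V(X)$:
\[
(\forall B \in \mathcal{P}_F(A))(\exists j \in N)(\exists f \in V_{n+4}(X))\,[\,f \text{ maps } J \text{ one-to-one onto } B, \text{ where } J = \{n \in N : n \le j\}\,]
\]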
[the reader should check that the sentence in square brackets can be translated into a sentence in $\mathcal{L}_X$ (exercise)]. The result follows by transfer. □
Because of Theorem 5.3 we will often write a hyperfinite set $B$ as $B = \{b_1, b_2, \ldots, b_j\}$, where $b_k = f(k)$, $k \in J$, and $f$ is the function of the theorem. It should be noted that the dots in this representation cover somewhat more ground than they do in the standard case, and that this representation is really an abbreviation of the setup in Theorem 5.3.
Hyperfinite sets are an important tool in nonstandard analysis by virtue of the fact that many standard mathematical structures can be "approximated" by hyperfinite structures in a natural way. We will illustrate this fact later in this section.
5.4 Definition Entities in $V(X)$, and entities which are of the form ${}^*b$ for some $b \in V(X)$, are called standard; all others are called nonstandard.
5.5 Examples
1. Each individual in $X \subseteq {}^*X$ is standard.
2. In Example 3.4 the intervals $I \in {}^*\mathcal{I}$ of the form $I = \{x \in {}^*R : a \le x \le b\}$, where $a < b$, are themselves standard entities even though they contain nonstandard numbers. An interval $\{x \in {}^*R : \alpha \le x \le \beta\}$, where $0 < \alpha < \beta$ and $\alpha$ and $\beta$ are infinitesimal, is a nonstandard entity.
5.6 Definition The superstructure $V({}^*X)$ [with respect to a monomorphism $*: V(X) \to V({}^*X)$] is called an enlargement of $V(X)$ if for each set $A \in V(X)$ there is a set $B \in {}^*\mathcal{P}_F(A)$ such that ${}^*a \in B$ for each $a \in A$, i.e., $B$ contains the standard entities in ${}^*A$.
We have already seen that a hyperfinite set of the form $\{n \in {}^*N : 1 \le n \le j\}$, where $j \in {}^*N_\infty$, contains every standard natural number. Definition 5.6 is a generalization for arbitrary sets in $V(X)$. We will now show that for a given superstructure $V(X)$ it is possible to choose an index set $J$ and a free ultrafilter $\mathcal{V}$ on $J$ so that the associated superstructure $V({}^*X)$, constructed as in §II.4 using $J$ and $\mathcal{V}$, is an enlargement of $V(X)$. It will follow as a corollary, since $X$ is infinite, that ${}^*X$ is strictly larger than $X$. The proofs of Lemma 5.7 and Theorem 5.8 may be skipped on first reading of the chapter.
Let $J$ be the set of all nonempty finite subsets of $V(X)$. It follows that $a \in J$ iff there is a $b \in V(X) - V_0(X)$ and $a \in \mathcal{P}_F(b) - \{\emptyset\}$ (why?). If $a \in J$ we define
\[
J_a = \{b \in J : a \subseteq b\}.
\]
5.7 Lemma The collection $\mathcal{F} = \{A \subseteq J : \text{there exists } a \in J \text{ such that } J_a \subseteq A\}$ is a free filter on $J$.
Proof: It is easy to show that $\mathcal{F}$ is a filter. For example, if $A_1, A_2 \in \mathcal{F}$ there exist $a_1, a_2 \in J$ so that $A_i \supseteq J_{a_i}$ ($i = 1, 2$). Since $A_1 \cap A_2 \supseteq J_{a_1} \cap J_{a_2} = J_{a_1 \cup a_2}$, $A_1 \cap A_2 \in \mathcal{F}$. The rest is left as an exercise. To show $\mathcal{F}$ is free, let $a \in J$. Then there is an element $b \in J$ so that $a \cap b = \emptyset$. Since $a \notin J_b$, $J - \{a\} \supseteq J_b$, so $\mathcal{F}$ is free. □
Now let $\mathcal{V}$ be an ultrafilter on $J$ with $\mathcal{V} \supseteq \mathcal{F}$ (such an ultrafilter exists by Theorem A.5 of the Appendix).
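The identity $J_{a_1} \cap J_{a_2} = J_{a_1 \cup a_2}$ used in the proof of Lemma 5.7 can be checked mechanically on a small finite stand-in for $J$. The following Python sketch is only an illustration under that assumption: the helper names `nonempty_subsets` and `cone` are hypothetical, and the universe `{0, 1, 2, 3}` replaces $V(X)$. A finite model cannot exhibit freeness, so only the intersection property is tested.

```python
from itertools import combinations


def nonempty_subsets(universe):
    """All nonempty subsets of a finite universe, as frozensets (finite stand-in for J)."""
    items = sorted(universe)
    return [
        frozenset(c)
        for r in range(1, len(items) + 1)
        for c in combinations(items, r)
    ]


def cone(a, J):
    """J_a = {b in J : a is a subset of b}."""
    return {b for b in J if a <= b}


def cones_closed_under_intersection(J):
    """Check J_{a1} & J_{a2} == J_{a1 | a2} for every pair a1, a2 in J."""
    return all(
        cone(a1, J) & cone(a2, J) == cone(a1 | a2, J)
        for a1 in J
        for a2 in J
    )


J = nonempty_subsets({0, 1, 2, 3})
identity_holds = cones_closed_under_intersection(J)
```

Because each generating set $J_a$ is nonempty and the family of such sets is closed under intersection, the supersets of the $J_a$ form a filter, which is the part of the lemma left as an exercise.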
5.8 Theorem If $V({}^*X)$ is constructed from $V(X)$ using $\mathcal{V}$ and $J$ then it is an enlargement of $V(X)$.
Proof: Let $A$ be a set in $V(X)$. We define a map $\Gamma: J \to \mathcal{P}_F(A)$ by $\Gamma(a) = a \cap A$, and let $B = M([\Gamma])$. Then $B \in {}^*\mathcal{P}_F(A)$. If $x \in A$ then $J_{\{x\}} = \{a \in J : x \in a\}$, so $\{a \in J : x \in a \cap A\} \in \mathcal{V}$. Thus $[\bar{x}] \in_{\mathcal{V}} [\Gamma]$ and so ${}^*x \in B$. □
Robinson's original definition of enlargement (see Theorem 5.10 below) made use of the notion of concurrent relation and was the cornerstone of his development of nonstandard analysis.
5.9 Definition A binary relation $P$ is concurrent (finitely satisfiable) on $A \subseteq \operatorname{dom} P$ if for each finite set $\{x_1, \ldots, x_n\}$ in $A$ there is a $y \in \operatorname{range} P$ so that $(x_i, y) \in P$, $1 \le i \le n$. $P$ is concurrent if it is concurrent on $\operatorname{dom} P$.
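Definition 5.9 asks, for every finite set of arguments, for a single witness $y$ related to all of them. The Python sketch below illustrates this on finite samples only; it cannot prove concurrence, and the function name `find_common_witness` and the bounded search range are assumptions made for the illustration.

```python
def find_common_witness(P, xs, candidates):
    """Return some y in candidates with P(x, y) for every x in xs, or None if there is none."""
    for y in candidates:
        if all(P(x, y) for x in xs):
            return y
    return None


def leq(x, y):
    """The relation <= on the natural numbers (concurrent)."""
    return x <= y


def neq(x, y):
    """The relation 'x differs from y' (concurrent on any infinite set)."""
    return x != y


def eq(x, y):
    """Equality (not concurrent: two distinct arguments share no witness)."""
    return x == y


search_space = range(100)  # finite window into N, an assumption of this sketch
w_leq = find_common_witness(leq, [3, 17, 5], search_space)
w_neq = find_common_witness(neq, [0, 1, 2], search_space)
w_eq = find_common_witness(eq, [1, 2], search_space)
```

For $\le$ the first witness found is the maximum of the finite set, which is the reason the relation is concurrent on $N$; for equality no witness exists as soon as two distinct arguments are given.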
Examples of concurrent relations are the relation $\le$ in $N$ and $\subseteq$ in $\mathcal{P}_F(N)$.
5.10 Theorem The following are equivalent:
(i) $V({}^*X)$ is an enlargement of $V(X)$.
(ii) For each concurrent relation $P \in V(X)$ there is an element $b \in \operatorname{range} {}^*P$ so that $({}^*x, b) \in {}^*P$ for all $x \in \operatorname{dom} P$.
Proof: (i) ⇒ (ii): Let $B \in {}^*\mathcal{P}_F(\operatorname{dom} P)$ be such that, for each $x \in \operatorname{dom} P$, ${}^*x \in B$. Since the sentence
\[
(\forall w \in \mathcal{P}_F(\operatorname{dom} P))(\exists y \in \operatorname{range} P)(\forall x \in w)\,[(x, y) \in P]
\]
is true in $V(X)$ by concurrence of $P$, its *-transform is true in $V({}^*X)$. Thus there exists an element $b \in \operatorname{range} {}^*P$ so that $(z, b) \in {}^*P$ for each $z \in B$, and in particular for each ${}^*x$ with $x \in \operatorname{dom} P$.
(ii) ⇒ (i): Exercise. □
5.11 Corollary If $Y \in V(X)$ contains an infinite number of entities and $V({}^*X)$ is an enlargement, then ${}^*Y$ contains entities which are not standard. In particular, if $A \subseteq X$ is infinite then ${}^*A$ properly contains $A$.
Proof: The relation $P$ on $Y \times Y$ defined by "$(a, b) \in P$ iff $a \ne b$" is concurrent since $Y$ is infinite. By 5.10(ii) there is a $b \in {}^*Y$ such that $b \ne {}^*x$ for all $x \in Y$. □
Corollary 5.11 gives another proof of the existence, in an enlargement $V({}^*R)$ of $V(R)$, of nonstandard numbers, but it holds in much more general situations.
5.12 Definition A set $\mathcal{S}$ of subsets of an entity $A \in V(X)$ is called exhausting if, for each finite subset $F \subseteq A$, there is an $S \in \mathcal{S}$ with $F \subseteq S$.
5.13 Proposition If $\mathcal{S}$ is an exhausting set of subsets of $A \in V(X)$ and $V({}^*X)$ is an enlargement, then there is a set $C \in {}^*\mathcal{S}$ containing all the standard entities in ${}^*A$.
Proof: Let $B$ be a hyperfinite subset of ${}^*A$ such that ${}^*a \in B$ for each $a \in A$. By transfer of the exhausting property, there is a $C \in {}^*\mathcal{S}$ with $B \subseteq C$. □
In spite of its simplicity, Proposition 5.13 turns out to be a very powerful tool in nonstandard analysis. The typical application runs as follows. Suppose $A$ is an infinite set with some additional mathematical structure; for example, $A$ could be an infinite graph, or a Hilbert space. Suppose further that $A$ can be exhausted by a family $\mathcal{S}$ of substructures (finite subgraphs, finite-dimensional inner-product spaces, etc.) so that for each $S \in \mathcal{S}$ a certain result can be proved. One wants to establish a corresponding result for $A$. Using Proposition 5.13, we can find a set $C \in {}^*\mathcal{S}$ containing all of the standard elements in ${}^*A$, and by transfer the *-transform of the given result is true for $C$. The problem then is to show how the validity of the *-transform of the result on $C$ induces the validity of the result on $A$. This last step can be quite difficult but is often easier than proving the result by standard methods.
This method of proof was the basis of the first successful attack, by Bernstein and Robinson [7], on an invariant subspace problem in Hilbert space proposed by Smith and Halmos. We illustrate the technique by proving a result in infinite graph theory due to de Bruijn and Erdős. (See also the related paper by Luxemburg [35].) The application indicates how nonstandard analysis is applicable in areas other than analysis.
A graph $(A, E)$ consists of a set $A$ of vertices and a binary relation $E$ on $A \times A$ which is symmetric (i.e., $(x, y) \in E$ implies $(y, x) \in E$). If $(x, y) \in E$ we say that $x$ and $y$ are connected by an edge. $(A, E)$ is infinite if $A$ is infinite. $(A, E)$ is $k$-colorable if there exists a map $f: A \to \{1, 2, \ldots, k\}$ (the set of "colors") such that if $(a, b) \in E$ then $f(a) \ne f(b)$, i.e., no two vertices which are connected by an edge are given the same color. If $B \subseteq A$ then the subgraph $(B, E|B)$ is defined by "$(x, y) \in E|B$ iff $x, y \in B$ and $(x, y) \in E$"; i.e., $B$ inherits its edges from $E$.
5.14 Theorem (De Bruijn-Erdős [13]) If each finite subgraph of an infinite graph $(A, E)$ is $k$-colorable, then $(A, E)$ is $k$-colorable.
Proof: We work in the superstructure $V(A \cup N)$. Let $\mathcal{S}$ denote the set of all finite subsets of $A$ (obviously exhausting). For each $F \in \mathcal{S}$ the graph $(F, E|F)$ is $k$-colorable, so the following is true in $V(A \cup N)$:
\[
(\forall F \in \mathcal{S})(\exists f_F : F \to \{1, 2, \ldots, k\})(\forall x, y \in F)\,[(x, y) \in E \to f_F(x) \ne f_F(y)]. \tag{5.1}
\]
By the definition of enlargement, there exists a $B \in {}^*\mathcal{S}$ so that $B \supseteq A$. By transfer of (5.1) we see that there is a map (coloring) $f_B : B \to {}^*\{1, 2, \ldots, k\}$ ($= \{1, 2, \ldots, k\}$) so that if $(x, y) \in {}^*E$ then $f_B(x) \ne f_B(y)$. We now restrict $f_B$ to $A$ to get a map $f: A \to \{1, 2, \ldots, k\}$. $f$ is a coloring since it inherits the property "$(x, y) \in E$ implies $f(x) \ne f(y)$" from $f_B$ (check). □
Intuitively, the proof of 5.14 given above is obvious; we have simply covered $A$ by a *-finite and hence $k$-colorable graph $B$ and then restricted the coloring. A similar technique can be used to give easy proofs of more intricate theorems in infinite graph theory.
In closing this section we note that the results of Chapter I for ${}^*R$ remain valid for an enlargement of $V(R)$. To get more we need to consider the notions of internal and external entities in $V({}^*R)$; these are introduced in the next section.
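The last step of the proof of Theorem 5.14, restricting a coloring of a larger graph to a subset of its vertices, has a purely finite analogue that can be run directly. The Python sketch below is an illustration only, with a finite graph standing in for the hyperfinite graph $B$; the function names `k_color` and `is_coloring` and the 5-cycle example are assumptions, and the graph is taken to have no self-loops.

```python
def is_coloring(vertices, edges, f):
    """True if f assigns different colors to the endpoints of every edge inside `vertices`."""
    return all(
        f[x] != f[y]
        for (x, y) in edges
        if x in vertices and y in vertices
    )


def k_color(vertices, edges, k):
    """Backtracking search for a k-coloring; returns a dict vertex -> color, or None."""
    verts = sorted(vertices)
    adjacent = {v: set() for v in verts}
    for x, y in edges:
        if x in adjacent and y in adjacent:
            adjacent[x].add(y)
            adjacent[y].add(x)
    coloring = {}

    def extend(i):
        if i == len(verts):
            return True
        v = verts[i]
        for color in range(1, k + 1):
            # Uncolored neighbors return None from .get and never clash.
            if all(coloring.get(u) != color for u in adjacent[v]):
                coloring[v] = color
                if extend(i + 1):
                    return True
                del coloring[v]
        return False

    return dict(coloring) if extend(0) else None


# A 5-cycle plays the role of the covering graph B; A is a subset of its vertices.
B = {0, 1, 2, 3, 4}
E = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
A = {0, 1, 2}

f_B = k_color(B, E, 3)
f_A = {v: f_B[v] for v in A}  # restriction of the coloring to A
no_two_coloring = k_color(B, E, 2)
```

The restriction `f_A` is automatically a coloring of the subgraph on `A`, because every edge of the subgraph is an edge of the larger graph; this is the "(check)" in the proof.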
Exercises II.5
1. Show that if $j$ is infinite then $J = \{n \in {}^*N : n \le j\} \in {}^*\mathcal{P}_F(N) - \mathcal{P}_F({}^*N)$.
2. Show that in general ${}^*\mathcal{P}_F(A) \supseteq \mathcal{P}_F({}^*A)$ whereas $\mathcal{P}({}^*A) \supseteq {}^*\mathcal{P}(A)$.
3. Check the translation into a sentence in $\mathcal{L}_X$ of the informal sentence in the proof of Theorem 5.3.
4. Show that the family $\mathcal{F}$ in Lemma 5.7 is a filter given that $A_1, A_2 \in \mathcal{F} \Rightarrow A_1 \cap A_2 \in \mathcal{F}$.
5. Prove that (ii) ⇒ (i) in Theorem 5.10.
6. Show that if $\{O_\alpha : \alpha \in A\}$ is an open covering of a set $S \subseteq R$ but no finite subcollection covers $S$, then there is a $y \in {}^*S$ such that $y \not\simeq x$ for all $x \in S$.
7. Give another proof of the existence of infinite natural numbers in an enlargement $V({}^*R)$ of $V(R)$ by using the concurrent relation $<$.
8. Let $A$ be an entity in a superstructure $V(X) - X$ which is closed under finite unions; i.e., if $a_i \in A$ ($1 \le i \le n$) then $\bigcup a_i\,(1 \le i \le n) \in A$. Show that if $V({}^*X)$ is an enlargement of $V(X)$, there is an element $b \in {}^*A$ so that $\bigcup {}^*a\,(a \in A) \subseteq b$.
9. (Luxemburg) Let $A$ be an entity of $V(X) - X$. The intersection monad of $A$ is the set $\mu(A) = \bigcap {}^*a\,(a \in A)$ in $V({}^*X)$. $A$ has the finite intersection property (f.i.p.) if $a, b \in A$ implies $a \cap b \ne \emptyset$. Show that $V({}^*X)$ is an enlargement of $V(X)$ iff the intersection monad $\mu(A)$ of each $A$ with the f.i.p. is nonempty.
10. (Luxemburg) Let $V({}^*X)$ be an enlargement of $V(X)$ and $\mathcal{F}$ be a filter in $V(X) - X$. Let $\mu(\mathcal{F})$ be the intersection monad (see Exercise 9). Show that if $B \in V(X)$ and $F \cap B \ne \emptyset$ for all $F \in \mathcal{F}$ then $\mu(\mathcal{F}) \cap {}^*B \ne \emptyset$.
11. Show that if $\mathcal{F}$ is a filter in $V(X) - X$ and $B$ is a set in $V({}^*X)$ such that $B \cap {}^*F \ne \emptyset$ for all $F \in \mathcal{F}$ then it is not necessarily true that $\mu(\mathcal{F}) \cap B \ne \emptyset$, where $\mu(\mathcal{F})$ is the intersection monad of Exercise 9. [Hint: Let $X = N$, $\mathcal{F}$ be the Fréchet filter on $N$ (the collection of complements of finite subsets of $N$), and $B = N$.]
12. A standard result states (informally) that a set $A \in V(X)$ is finite iff every injective map $f: A \to A$ is surjective. Formalize this statement and so obtain a similar characterization for the *-finite sets.
13. Let $P \in V(X)$ be a binary relation and suppose that in some enlargement $V({}^*X)$ of $V(X)$ the following is true: for every $y \in \operatorname{range} {}^*P$, there exists an $x \in \operatorname{dom} P$ so that $({}^*x, y) \in {}^*P$. Show that there exists a finite set $\{x_1, \ldots, x_n\} \subseteq \operatorname{dom} P$ such that for all $y \in \operatorname{range} P$, there is an $i$, $1 \le i \le n$, with $(x_i, y) \in P$.
14. (König's lemma) Let $\langle S_n : n \in N \rangle$ be a sequence of mutually disjoint nonempty finite sets and let $P$ be a binary relation on $\bigcup S_n\,(n \in N)$ such that, whenever $x \in S_{n+1}$ for some $n$, there exists a $y \in S_n$ such that $(y, x) \in P$. Show that there exists an infinite sequence $\langle x_n : n \in N \rangle$ such that $x_n \in S_n$ and $(x_n, x_{n+1}) \in P$ for $n \in N$.
15. (Total ordering) Let $X$ be a nonempty set. A binary relation $P$ on $X$ is a partial ordering if the following holds: (a) $P$ is reflexive, that is, $(x, x) \in P$ for all $x \in X$; (b) $P$ is antisymmetric, that is, if $(x, y) \in P$ and $(y, x) \in P$ then $x = y$; (c) $P$ is transitive, that is, if $(x, y) \in P$ and $(y, z) \in P$ then $(x, z) \in P$. A partial ordering $P$ on $X$ is a total ordering if whenever $x, y \in X$ then either $(x, y) \in P$ or $(y, x) \in P$. Every finite set can be totally ordered. Assuming this result and the fact that enlargements exist, show that any set can be totally ordered.
16. (Rado's selection lemma) Let $\{A_\lambda : \lambda \in \Lambda\}$ be a nonempty family of finite sets. A choice function over $\Lambda$ is a function $\phi: \Lambda \to \bigcup A_\lambda\,(\lambda \in \Lambda)$ so that $\phi(\lambda) \in A_\lambda$ for each $\lambda \in \Lambda$. Let $\{A_\gamma : \gamma \in \Gamma\}$ be a nonempty family of finite sets. Assume that for each finite subset $F \subseteq \Gamma$ there is a choice function $\phi_F$ over $F$. Show that there exists a choice function $\phi$ over $\Gamma$ so that for any finite set $F \subseteq \Gamma$, there exists a finite set $F' \supseteq F$ with $\phi(x) = \phi_{F'}(x)$ for all $x \in F$.
II.6 Internal and External Entities; Comprehensiveness
We noted in Remark 3.5 that a basic technique of proof in nonstandard analysis is to establish the validity of a sentence $\Phi$ in $\mathcal{L}_X$ by noticing that it is the downward *-transform of a sentence ${}^*\Phi$ which is true in $\mathcal{L}_{{}^*X}$. Thus it is particularly important to be able to recognize when a sentence $\Psi$ in $\mathcal{L}_{{}^*X}$ is of the form ${}^*\Phi$ for some sentence $\Phi$ in $\mathcal{L}_X$. Notice that, for a sentence $\Phi$ in $\mathcal{L}_X$, ${}^*\Phi$ uses only the names of standard objects and so ${}^*\Phi$ involves only expressions like $(\forall x \in {}^*a)\Psi$, $(\forall x \in y)\Psi$, $(\exists x \in {}^*a)\Psi$, and $(\exists x \in y)\Psi$. Thus to check the truth of ${}^*\Phi$ we need only look at elements $b$ in $V({}^*X)$ which satisfy $b \in {}^*a$ for some $a \in V(X)$. By 3.2(iv), if $c \in b$ and $b \in {}^*a$, then $c \in {}^*V_k(X)$ for some $k$. If $b \in {}^*a$ for some $a \in V(X)$, we call it internal; otherwise we call $b$ external (Definition 6.1). A sentence $\Psi$ in $\mathcal{L}_{{}^*X}$ is not of the form ${}^*\Phi$ if it contains names of external entities, i.e., is an external sentence.
A common mistake in nonstandard arguments is to apply the transfer principle to external sentences $\Psi$ in $\mathcal{L}_{{}^*X}$. Thus it is important to be able to recognize external entities in $V({}^*X)$. We will learn in this section that $R$, $N$, $Z$, ${}^*R_\infty$, ${}^*N_\infty$, ${}^*Z_\infty$, and $m(0)$ are external subsets of ${}^*R$. Using these, we can construct many external functions and relations. For example, the characteristic function of an external set is an external function; the relation of nearness $\simeq$ is external. The properties of external entities cannot be obtained by transfer from those of $V(X)$. For example, it is true that any subset of $N$ which is bounded below has a least element. However, this property is not true of ${}^*N_\infty$, for if $n$ were a least element in ${}^*N_\infty$ then $n - 1$ would have to be finite, which is impossible.
In this section we first concentrate on internal entities and their properties and then present examples of external entities. The section ends with a discussion of comprehensiveness which involves internality.
6.1 Definition An entity $b \in V({}^*X)$ is called internal [with respect to $*: V(X) \to V({}^*X)$] if there exists an $a \in V(X)$ so that $b \in {}^*a$; i.e., internal entities are elements of standard entities. An entity which is not internal is called external. Similarly, a sentence or formula $\Phi$ in $\mathcal{L}_{{}^*X}$ is called either standard or internal if the constants in $\Phi$ are names of standard or internal entities, respectively. A sentence which is not internal is called external.
6.2 Examples
1. All standard entities are internal (Exercise 1).
2. With $\mathcal{I}$ the set of closed and bounded intervals in $R$, every set $\{x : a \le x \le b,\ a, b \in {}^*R\} \in {}^*\mathcal{I}$ is internal; the standard *-intervals are those for which $a$ and $b$ are in $R$.
3. If $\mathcal{C}$ denotes the set of continuous functions on $R$, then each $f \in {}^*\mathcal{C}$ is internal and is called a *-continuous function.
4. If $P$ is concurrent, the element $b \in \operatorname{range} {}^*P$ given by Theorem 5.10(ii) is internal.
5. The *-transform of any formula $\Phi \in \mathcal{L}_X$ is standard.
6. The sentence
\[
(\forall \varepsilon > 0 \text{ in } {}^*R)(\forall y \in {}^*R)(\exists \delta > 0 \text{ in } {}^*R)(\forall x \in {}^*R)\,[\,|x - y| < \delta \to |f(x) - f(y)| < \varepsilon\,],
\]
where $f \in {}^*\mathcal{C}$ is internal, is an internal sentence and expresses the fact that $f$ is *-continuous on ${}^*R$.
6.3 Theorem The set of all internal elements of $V({}^*X)$ is the set ${}^*V(X) = \bigcup_{n=0}^{\infty} {}^*V_n(X)$.
Proof: If $b \in {}^*V(X)$ then $b \in {}^*V_n(X)$ for some natural number $n \ge 0$ and so $b$ is internal since $V_n(X)$ is standard. Conversely, if $b$ is internal then $b \in {}^*a$, where $a$ is in $V_{n+1}(X) - V_n(X)$ for some $n \ge 1$, so $a \subseteq V_n(X)$. Thus ${}^*a \subseteq {}^*V_n(X)$ and $b \in {}^*V_n(X)$. □
It is necessary to be able to recognize internal sets. In that regard the following result is very useful.
6.4 Theorem (Keisler's Internal Definition Principle [24]) Let $\Phi(x)$ be an internal formula in $\mathcal{L}_{{}^*X}$ for which $x$ is the only free variable, and let $A$ be an internal set. Then $\{x \in A : \Phi(x) \text{ is true}\}$ is internal.
Proof: Let $c_1, \ldots, c_n$ be the constants in $\Phi(x)$; we write $\Phi(x) = \Phi(c_1, \ldots, c_n, x)$. Now $A, c_1, \ldots, c_n \in {}^*V_k(X)$ for some $k \in N$. Thus the sentence
\[
(\forall x_1, \ldots, x_n, y \in V_k(X))(\exists z \in V_{k+1}(X))(\forall x \in V_k(X))\,[\,x \in z \leftrightarrow [\,x \in y \wedge \Phi(x_1, \ldots, x_n, x)\,]\,]
\]
in $\mathcal{L}_X$ holds in $V(X)$. Its interpretation in $V({}^*X)$ says that $\{x \in A : \Phi(x) \text{ is true}\} \in {}^*V_{k+1}(X)$. □
6.5 Examples
1. The set $Z_f$ of zeros of an internal ${}^*R$-valued function $f$ in $V({}^*R)$ is internal since $Z_f = \{x \in {}^*R : (x, 0) \in f\}$.
2. The characteristic function of an external set is external (Exercise 2).
A consequence of property 3.2(iv) of a monomorphism $*: V(X) \to V({}^*X)$ is that any element of an internal entity is an internal entity. We use this fact in the proof of the following result.
6.6 Theorem If $A$ and $B$ are internal, then so are $A \cup B$, $A \cap B$, $A - B$, and $A \times B$.
Proof: We prove the result for $A \cup B$ and leave the remaining proofs as an exercise. Suppose $A, B \in {}^*V_{n+1}(X)$ and consider the following true statement in $V(X)$:
\[
(\forall W, Y \in V_{n+1}(X))(\exists Z \in V_{n+1}(X))(\forall x \in V_n(X))\,[\,x \in Z \leftrightarrow x \in W \vee x \in Y\,].
\]
By transfer, there exists a set $C \in {}^*V_{n+1}(X)$ having exactly the same elements from ${}^*V_n(X)$ as $A \cup B$. But by 3.2(iv) all elements of $A$, $B$, and $C$ are in ${}^*V_n(X)$, and so $C = A \cup B$. □
Having considered internal entities in some detail, we are now ready to demonstrate the existence of external entities. Recall that in Remark 7.8 of Chapter I we showed that there was no set $A \subseteq R$ so that ${}^*A = R$. This fact is not sufficient to show that $R$ is external in the sense of Definition 6.1; we would need to show that $R$ was not an element in the *-transform of an element of $V(R)$. To show the existence of external subsets we use the following lemmas.
6.7 Lemma If $a \in V(X) - X$ then the internal entities in $\mathcal{P}({}^*a)$ consist exactly of the entities in ${}^*\mathcal{P}(a)$.
Proof: Consider the following true statement in $V(X)$ with $n \ge 1$:
\[
(\forall x \in V_n(X))\,[\,(\forall y \in x)[y \in a] \leftrightarrow x \in \mathcal{P}(a)\,]
\]
[i.e., for all $x \in V_n(X)$, $x$ is a subset of $a$ if and only if $x \in \mathcal{P}(a)$]. Its *-transform says that, for all $x \in {}^*V_n(X)$, $x$ is a subset of ${}^*a$ if and only if $x \in {}^*\mathcal{P}(a)$. We see from Theorem 6.3 that if $x$ is an internal set in $V({}^*X)$, i.e., $x \in {}^*V(X)$, then $x \in {}^*V_n(X)$ for some $n$. Such an $x$ is a subset of ${}^*a$ if and only if it is in ${}^*\mathcal{P}(a)$. Thus ${}^*V(X) \cap \mathcal{P}({}^*a) = {}^*V(X) \cap {}^*\mathcal{P}(a) = {}^*\mathcal{P}(a)$. □
As an example, we note that the internal subsets of ${}^*N$ are exactly the members of ${}^*\mathcal{P}(N)$.
6.8 Lemma Each nonempty internal subset of the hyperintegers ${}^*Z$ which is bounded below (above) has a least (greatest) element.
Proof: If $X$ is an internal nonempty subset of ${}^*Z$ then $X \in {}^*\mathcal{P}(Z)$ by Lemma 6.7. The result in the "bounded below" case now follows by transfer of the sentence
\[
(\forall X \in \mathcal{P}(Z))\,[\,(\exists b \in Z)(\forall x \in X)[b \le x] \wedge X \ne \emptyset \to (\exists y \in X)(\forall x \in X)[y \le x]\,],
\]
which expresses the fact that each nonempty subset of $Z$ which is bounded below has a least element. The "bounded above" case is similar. □
6.9 Theorem In an enlargement $V({}^*R)$ of $V(R)$ the set ${}^*N_\infty$ of infinite natural numbers is external.
Proof: Suppose that ${}^*N_\infty \in \mathcal{P}({}^*N)$ is internal. Then by Lemma 6.8 there exists a least $b \in {}^*N_\infty$. But then $b - 1 \in {}^*N_\infty$ and $b - 1 < b$ (contradiction). □
6.10 Corollary The sets $R$, $N$, $Z$, ${}^*Z_\infty$ (the set of infinite integers), ${}^*R_\infty$ (the set of infinite reals), and $m(0)$ (the set of infinitesimals) are external in an enlargement $V({}^*R)$ of $V(R)$.
Proof: Note that ${}^*N_\infty = {}^*N - N$. If $N$ were internal then ${}^*N_\infty$ would be internal by Theorem 6.6, contradicting Theorem 6.9. Using the fact that the set of integers $Z$ is external (exercise), we see that $R$ is external, since otherwise $Z = R \cap {}^*Z$ would be internal. Similarly ${}^*Z_\infty$ and ${}^*R_\infty$ are external. To show that $m(0)$ is external, we note that ${}^*R_\infty = \{x \in {}^*R : (\exists y \in {}^*R)[(x, y) \in P \wedge y \in m(0)]\}$, where $P$ is defined by "$(x, y) \in P$ if $y = 1/x$." If $m(0)$ is internal then so is ${}^*R_\infty$ by Theorem 6.4 (contradiction). □
Clearly, external entities and notions play a very important role in nonstandard analysis, as we see by noting the occurrence of the set of infinite natural numbers and the set of infinitesimals in many of the results of Chapter I. The reader might want to review some of the proofs in Chapter I to see just how external sets arise, and how the transfer principle is effective even though it involves only internal sets.
In many cases, external entities and notions are useful in recovering standard results from internal results. "Limiting" entities corresponding to "converging" families of entities in $V(X)$ can often be identified with internal entities in ${}^*V(X)$, but to recover actual limiting entities in $V(X)$ usually involves some external operation (one which produces external entities). For example, consider Theorem 14.1 of Chapter I in this light. We constructed the solution $\phi(x)$ of the differential equation $\phi' = f(x, \phi)$ as the standard part, i.e., $\phi(x) = \operatorname{st}({}^*\phi_\omega(x))$, of the internal function ${}^*\phi_\omega(x)$. The operation of taking the standard part is an external operation. The solution is usually constructed as the limit of a subsequence of the polygonal sequence $\phi_n(x)$ by using the Arzelà-Ascoli theorem.
We end this section with a consideration of the notion of a comprehensive monomorphism; the special case of a denumerably comprehensive monomorphism will be used in Chapter IV.
6.11 Definition The monomorphism $*: V(X) \to V({}^*X)$ is comprehensive if, for any sets $C, D \in V(X)$ and any map $h: C \to {}^*D$, there is an internal map $g: {}^*C \to {}^*D$ such that $g({}^*a) = h(a)$ for $a \in C$. The monomorphism is called denumerably comprehensive if the choice of $C$ is restricted so that the cardinality of $C$ is that of the natural numbers $N$.
6.13 Theorem A monomorphism *: V ( X )+ V(*X), constructed as in t11.4, is comprehensive.
Proof: Let C,D, and h be as in Definition 6.11. Each element of * D is of the form M ( [ b ] ) . Let S ( M [ b ] ) be a representative b from the equivalence class [ b ] . We may assume that bi E D for all i E I. For each i E I, let k, be the mapping from C to D given by ki = {(u,S(h(a))(i)):a E C} and let [ k ] denote the equivalence class generated by the mapping { ( i , k i ) : i E I } . We leave as an exercise the proof that M ( [ k ] ) is an internal function from *Cto *D. If M ( [ a ] )E *Cand { ( i , u i ) : i E I} is a representative from the equivalence class [ a ] , then the image of M ( [ a ] ) under the mapping M ( [ k ] )is M ( [ b ] ) , where bi = S(h(ai))(i)for i E I. In particular, if ai = a E C for almost all i E I, then bi = S(h(a))(i)for all i E I. Thus M ( [ k ] )extends h. 0 Exercises 11.6
1. Show that all standard entities are internal.
2. Show that the characteristic function of an external set is external.
3. Finish the proof of Theorem 6.6.
4. Show that the sets $Z$, ${}^*Z_\infty$, and ${}^*R_\infty$ are external.
5. Show that $M([k])$ defined in the proof of Theorem 6.13 is an internal function from ${}^*C$ to ${}^*D$.
6. Show that if $\{x_n : n \in {}^*N\}$ is an internal sequence and $|x_n| \le 1/n$ for all $n \in N$ then, for some $k \in {}^*N_\infty$, $|x_n| \le 1/n$ for all $n \le k$.
7. Let $\{A_n : n \in N\}$ be a sequence in the ordinary sense of internal subsets of ${}^*R$ such that, for any $k \in N$, $\bigcap_{n=1}^{k} A_n \ne \emptyset$. Assume the monomorphism $*$ is denumerably comprehensive and show that $\bigcap_{n \in N} A_n \ne \emptyset$.
8. (a) Show that every nonempty internal subset $A$ of ${}^*R$ with an upper bound has a least upper bound. (b) Show that every nonempty internal subset $A$ of ${}^*N$ has a minimal element.
9. Show that if $P$ is an internal binary relation on $c_1 \times c_2$ then $\operatorname{dom} P$ and $\operatorname{range} P$ are internal. In particular, $c_1 \times c_2$ is internal if $c_1$ and $c_2$ are internal.
10. Show that $\operatorname{st}: G(0) \to R$ is an external map.
11. Show that every internal subset of a *-finite set is *-finite.
12. Show that if $a$ and $b$ are internal then the set of all internal functions from $a$ to $b$ is internal.
13. (a) Let $\sup$ be the function in $V(R)$ which assigns, to each upper bounded set $E \subseteq R$, its supremum, $\sup E$. The function $\sup$ can be extended to a function ${}^*\sup$ defined on all internal subsets of ${}^*R$ which are "*-bounded above." Characterize by a sentence in $\mathcal{L}_{{}^*X}$ the collection $\mathcal{B}$ of sets which are *-bounded above, and then show that ${}^*\sup E \le {}^*\sup F$ if $E \subseteq F$ are sets in $\mathcal{B}$. (b) Show that if $E$ is a finitely upper bounded external subset of ${}^*R$, then ${}^*\sup E$ may have no meaning in ${}^*R$, but $\sup\{{}^{\circ}r : r \in E\}$ has a meaning in $R$.
II.7 The Permanence Principle
In this section we present a principle with many applications called the permanence principle by Robinson and Lightstone [44] or Cauchy's principle by Stroyan and Luxemburg [46]. Throughout the section we suppose that $X$ contains the set of reals $R$.
7.1 Theorem (Permanence Principle) Let $\Phi(x)$ be an internal formula in $\mathcal{L}_{{}^*X}$ with $x$ the only free variable.
(i) If $\Phi(x)$ holds for each $x \in N$ ($x \in {}^*R^+ - {}^*R_\infty$), then there exists a $k \in {}^*N_\infty$ so that $\Phi(b)$ holds for each $b \le k$ in ${}^*N$ (${}^*R^+$).
(ii) If $\Phi(x)$ holds for each $x \in {}^*N_\infty$ (${}^*R^+_\infty$), then there exists a $k \in N$ so that $\Phi(b)$ holds for each $b \ge k$ in ${}^*N$ (${}^*R^+$).
(iii) If $\Phi(x)$ holds for each infinitesimal $x$, then there is a standard $r > 0$ in $R$ so that $\Phi(b)$ holds for all $b$ with $|b| \le r$ in ${}^*R$.
Proof: We prove the results in the case of the natural numbers $N$ and leave the proofs for the real case (parentheses) of (i) and (ii) to the reader.
(i) Let $A = \{x \in {}^*N : \neg\Phi(x) \text{ holds in } V({}^*X)\}$. Then $A$ is internal by Theorem 6.4 (internal definition principle) and $A \subseteq {}^*N_\infty$ by hypothesis. If $A = \emptyset$ we are through. Otherwise $A$ is bounded below and hence has a least element $l$ by Lemma 6.8; we may take $k = l - 1$.
(ii) Given the internal set $A$ defined as in (i), $A \subseteq N$. If $A = \emptyset$ we are through. Otherwise $A$ is bounded above and hence has a largest element $l$ by Lemma 6.8; we can take $k = l + 1$.
(iii) Let $A = \{x \in {}^*N - \{0\} : \Phi(y) \text{ holds for all } y \text{ with } |y| \le 1/x\}$ and use (ii). □
7.2 Corollary (Spillover Principle) Let $A$ be an internal subset of ${}^*R$.
(i) If $A$ contains all standard natural numbers then $A$ contains an infinite natural number.
(ii) If $A$ contains all infinite natural numbers then $A$ contains a standard natural number.
(iii) If $A$ contains the positive infinitesimals then $A$ contains a standard positive real number.
Theorem 7.1 can be used to give yet another proof of the fact that if $\langle s_n : n \in N \rangle$ is a standard sequence and ${}^*s_n \simeq L \in R$ for all infinite $n$, then $\lim s_n = L$ (see §II.3). Let $\varepsilon > 0$ be a fixed number in $R$. Then $|{}^*s_n - L| < \varepsilon$ for all infinite $n$. Applying Theorem 7.1(ii) with $\Phi(b)$ the internal statement "$|{}^*s_b - L| < \varepsilon$", we see that there is a $k \in N$ so that $|{}^*s_b - L| < \varepsilon$ for all $b \ge k$ in ${}^*N$ and, in particular, $|s_b - L| < \varepsilon$ for all $b \ge k$ in $N$ since ${}^*s_b = s_b$ if $b \in N$. This establishes the desired result.
The following result has many applications.
7.3 Theorem (Robinson's Sequential Lemma) Let $\langle s_n : n \in {}^*N \rangle$ be an internal ${}^*R$-valued sequence such that $s_n \simeq 0$ for each $n \in N$. Then there is an infinite natural number $\omega$ so that $s_n \simeq 0$ for all natural numbers $n \le \omega$.
Proof: The sequence $\langle n s_n : n \in {}^*N \rangle$ is internal. Apply 7.1(i) with $\Phi(n)$ the internal formula "$|n s_n| \le 1$" to obtain an $\omega \in {}^*N_\infty$ so that $|s_n| \le 1/n$ if $n \le \omega$. Thus $s_n \simeq 0$ if $n \in {}^*N_\infty$ and $n \le \omega$, and so $s_n \simeq 0$ for all $n \le \omega$. □
One should beware of assertions similar to Theorem 7.3 which sound plausible but are not true. For example, it is not true that if $s_n \simeq 0$ for all infinite $n$ then there exists a finite $k$ so that $s_n \simeq 0$ for all $n \ge k$, as the example $s_n = 1/n$ shows.
As an application of Theorem 7.3 we give another proof of the fact (Corollary I.13.5) that if the sequence $\langle f_n(x) : n \in N \rangle$ of continuous real-valued functions on the interval $[a, b]$ converges uniformly then the limit $f(x)$ is continuous on $[a, b]$. Let $x_0 \in [a, b]$; we need to show that ${}^*f(x) \simeq f(x_0)$ if $x \simeq x_0$. But ${}^*f_n(x) \simeq {}^*f_n(x_0)$ for each $n \in N$, and so ${}^*f_\omega(x) \simeq {}^*f_\omega(x_0)$ for some infinite $\omega$ by Theorem 7.3. But ${}^*f_\omega(x) \simeq {}^*f(x)$ for all $x \in {}^*[a, b]$ by Proposition 13.2 of Chapter I, and we are through.
Robinson [41] applied Theorem 7.3 in a more significant context in giving a nonstandard construction for Banach limits of bounded sequences. Suppose $\langle s_n : n \in N \rangle$ is a bounded sequence, i.e., $|s_n| \le M$ for some real $M > 0$. We would like to attach a "limit" to $\langle s_n \rangle$ even though it might not converge in the usual sense. For example, the sequence $t_n = (s_1 + s_2 + \cdots + s_n)/n$ ($n = 1, 2, \ldots$) of Cesàro means sometimes converges when $\langle s_n \rangle$ does not converge and defines a limit called the Cesàro sum of the sequence $\langle s_n \rangle$. Any generalized limit should satisfy the properties in the following definition.
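The Cesàro means $t_n = (s_1 + \cdots + s_n)/n$ are easy to compute for a concrete sequence. The Python sketch below is an illustration under stated assumptions: the function name `cesaro_means` and the alternating test sequence $1, 0, 1, 0, \ldots$ are choices made here, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def cesaro_means(s, n_terms):
    """Return [t_1, ..., t_n] with t_n = (s(1) + ... + s(n)) / n, as exact fractions."""
    total = Fraction(0)
    means = []
    for n in range(1, n_terms + 1):
        total += s(n)
        means.append(total / n)
    return means


def alternating(n):
    """The bounded, non-convergent sequence 1, 0, 1, 0, ... (s_n = n mod 2)."""
    return n % 2


means = cesaro_means(alternating, 1000)
t_999 = means[998]
t_1000 = means[999]
```

The sequence itself has no limit, but its means settle near $1/2$: every even-indexed mean is exactly $1/2$, and the odd-indexed means differ from $1/2$ by $1/(2n)$.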
7.4 Definition Let $l_\infty$ denote the set of standard bounded sequences. A map $L: l_\infty \to R$ is called a Banach limit if
(i) $L(a\sigma + b\tau) = aL(\sigma) + bL(\tau)$ ($a, b \in R$; $\sigma, \tau \in l_\infty$),
(ii) if $\sigma = \langle s_n \mid n \in N \rangle$ then $\liminf s_n \le L(\sigma) \le \limsup s_n$,
(iii) if $\sigma = \langle s_n \mid n \in N \rangle$ and $\tau = \langle t_n \mid n \in N \rangle$, where $t_n = s_{n+1}$, then $L(\sigma) = L(\tau)$.
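Property (iii) of Definition 7.4 has a finite counterpart that suggests why averaging over a very long initial segment is a natural candidate for a Banach limit: the average of the shifted sequence differs from the average of the original by the telescoping term $(s_{n+1} - s_1)/n$, which is at most $2M/n$ in absolute value. The Python sketch below checks this for one bounded sequence; the names `average` and `shift` and the particular sequence are assumptions made for the illustration, not part of the text.

```python
from fractions import Fraction


def average(s, n):
    """Exact average (s(1) + ... + s(n)) / n."""
    return sum(Fraction(s(i)) for i in range(1, n + 1)) / n


def shift(s):
    """The shifted sequence t with t(i) = s(i + 1)."""
    return lambda i: s(i + 1)


def s(i):
    """A bounded sequence with |s(i)| <= 2 that does not converge."""
    return (-1) ** i * (1 + Fraction(1, i))


M = 2
n = 500
defect = abs(average(shift(s), n) - average(s, n))
telescoped = abs(s(n + 1) - s(1)) / n
```

The defect shrinks as $n$ grows, so for an infinite index it is infinitesimal and disappears on taking the standard part.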
To obtain a Banach limit, we let the summation operators $\sum_{i=1}^{\omega}$ for $\omega \in {}^*N$ extend the standard summation operators $\sum_{i=1}^{n}$, $n \in N$.
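7.5 Theorem Fix $\omega \in {}^*N_\infty$, and let $L(\sigma) = {}^{\circ}\!\left((1/\omega) \sum_{i=1}^{\omega} {}^*s_i\right)$ for each $\sigma = \langle s_n : n \in N \rangle$ in $l_\infty$. Then $L$ is a Banach limit.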
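Proof: The mapping $L$ clearly satisfies 7.4(i). Given $\sigma = \langle s_n : n \in N \rangle$, let $M = \sup\{|s_n| : n \in N\}$. For a given $m \in N$,
\[
\left| \frac{1}{\omega} \sum_{i=1}^{\omega} {}^*s_i - \frac{1}{\omega - m} \sum_{i=m+1}^{\omega} {}^*s_i \right| \le \frac{m}{\omega(\omega - m)}\,(\omega - m)M + \frac{mM}{\omega} \simeq 0.
\]
By Theorem 7.3, there is an $m \in {}^*N_\infty$ so that
\[
L(\sigma) \simeq \frac{1}{\omega - m} \sum_{i=m+1}^{\omega} {}^*s_i. \tag{7.1}
\]
Fix $\varepsilon > 0$ in $R$. We see immediately from Definition 8.16 of Chapter I that for each $n \in {}^*N$ with $m + 1 \le n \le \omega$
\[
\liminf s_n - \varepsilon < {}^*s_n < \limsup s_n + \varepsilon.
\]
By the transfer of the usual properties of an average applied to (7.1),
\[
\liminf s_n - \varepsilon \le L(\sigma) \le \limsup s_n + \varepsilon.
\]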
Since E is arbitrary, we obtain 7.qii). The rest of the proof is left to the reader. 0 Exercises 11.7 1. Prove the real case of (i) and (ii) of Theorem 7.1. 2. Assume that A is an internal set in * N such that, for some infinite integer y, if n is infinite and n 5 y in *N then n E A. Show that, for some finite m e N, if n E N and m 4 n then n E A. 3. Prove that the mapping L of Theorem 7.5 satisfies property (iii) of Definition 7.4, i.e., L is invariant under finite translations. 4. Use the permanence principle to show that iff is a standard function and I*f(x) - LI ‘v 0 for all x ‘v 1 but x # 1, then limx-.l f ( x ) = L. 5. Let (s,:n E *N) be an internal *R-valued sequence, and suppose that there is an M > 0 in R so that Is,[ IM for all n E N. Show that there is an o E * N , so that Is,I IM for all n Io in *N. 6. Show that the assertion in Exercise 5 is not true if ‘‘ls,,l S M” is replaced by “s, is finite.” 7. A filter 9E V ( X )- X has a countable subbasis if there is a countable family { A i : iE N} of entities in 9 so that for each F E 9 there is a sequence il, . . . ,in with r)Aa(l Ik In) c F. Suppose that B is an internal set in V ( * X ) and 9 has a countable subbasis. Show that if B n *F = 0 for all F E f then B n p ( 9 ) # 0,where d9)is the intersection monad o f f introduced in Exercise 11.5.9. 8. Let a: V(R)+ V(*R) be comprehensive, and let S = {nk:k E N} be a countable set contained in * N , .
(a) Show that $S$ has a lower bound in ${}^*\mathbb{N}_\infty$. [Hint: Regard $S$ as a sequence, i.e., a map $h\colon \mathbb{N} \to {}^*\mathbb{N}$ with $h(k) = n_k$. Use comprehensiveness to extend $h$ to an internal map $g\colon {}^*\mathbb{N} \to {}^*\mathbb{N}$ and apply the spillover principle to the set $A = \{m \in {}^*\mathbb{N} : g(k) > m \text{ for all } k < m\}$.] For decreasing sequences $n_k$ this was presented by DuBois-Reymond and proved in our context by Robinson.
(b) Use the transfer principle applied to $g$ to show that $S$ has an upper bound in ${}^*\mathbb{N}_\infty$.
9. Show that if $f$ is an internal function on an internal set $A$ in some superstructure $V({}^*X)$, and $f$ is finite-valued, then there exists a standard $n \in \mathbb{N}$ so that $|f(x)| \le n$ for all $x \in A$. Give an example to show that the assertion is not necessarily true if $f$ is not internal.
10. (*-Convergence and S-Convergence) An internal ${}^*\mathbb{R}$-valued sequence $\langle s_n : n \in {}^*\mathbb{N}\rangle$ is (i) *-convergent to $L \in {}^*\mathbb{R}$ if for each $\varepsilon > 0$ in ${}^*\mathbb{R}$ there is an $m \in {}^*\mathbb{N}$ so that $n > m$ implies $|s_n - L| < \varepsilon$, (ii) S-convergent to $L \in {}^*\mathbb{R}$ if $s_n \simeq L$ for all $n \in {}^*\mathbb{N}_\infty$.
(a) Show that if $s_n = {}^*t_n$, where $\langle t_n\rangle$ is a standard sequence converging to $L$, then $\langle s_n\rangle$ is *-convergent and S-convergent to $L$.
(b) Show that there are internal sequences which are *-convergent but not S-convergent and vice versa.
(c) Show that if $\langle s_n\rangle$ is S-convergent to a finite $L \in {}^*\mathbb{R}$ then there is an $m \in \mathbb{N}$ so that $s_n$ is finite for $n \ge m$ and the standard sequence $\langle {}^{\circ}s_n : n \in \mathbb{N}\rangle$ converges to ${}^{\circ}L$.
(d) Show that if $s_n = {}^*t_n$, where $\langle t_n\rangle$ is a standard sequence, then $\langle s_n\rangle$ is S-convergent to a finite $L$ iff there exists an infinite $\omega \in {}^*\mathbb{N}_\infty$ so that ${}^*s_n \simeq L$ for every $n \in {}^*\mathbb{N}_\infty$ with $n \le \omega$.
II.8 $\kappa$-Saturated Superstructures
Theorem 7.5 of the last section is a good example of a result in which a standard entity (a Banach limit) is obtained by performing a standardizing operation on an internal entity [in this case, taking the standard part of the internal sum $(1/\omega)\sum {}^*s_i\ (1 \le i \le \omega)$]. Similar applications of nonstandard analysis often occur in more complicated circumstances, and sometimes the internal structure in a given extension $V({}^*X)$ of a superstructure $V(X)$ is not rich enough to produce a desired result. A specific example arose from a result of Robinson, which was that if $X$ is a metric space and $B$ an internal subset of ${}^*X$ in an enlargement $V({}^*X)$, then the standard part of $B$ is closed (definitions and results will be presented in Chapter III). It was natural to ask whether the result was still true if $X$ was not metric. An example due to H. J. Keisler showed that the answer was negative if $V({}^*X)$ was only an enlargement of $V(X)$ [36, Example 3.4.3]. Luxemburg [36, Theorem 3.4.2] showed that the result does go through if $V({}^*X)$ is large enough to satisfy a generalization of the property of an enlargement, valid for internal concurrent binary relations on an appropriate set $A$ in $V({}^*X)$. $V({}^*X)$ is called $\kappa$-saturated, where $\kappa$ is a cardinal number, if this generalization holds for all sets $A$ in $V({}^*X)$ with the cardinality of $A < \kappa$ (Definition 8.1). It is not necessary for the reader to be very knowledgeable about the theory of cardinal numbers for arbitrary sets in order to apply the theory. In a typical application we will begin with an internal concurrent binary relation on $A$; then we can assert that the results of the section will be applicable if $V({}^*X)$ is sufficiently large. Sufficiently large means that $V({}^*X)$ is $\kappa$-saturated,
where $\kappa > \operatorname{card} A$, but this is irrelevant in the application as long as we are assured that $\kappa$-saturated structures exist (Theorem 8.2). Let $V(X)$ be a given superstructure and $*\colon V(X) \to V({}^*X)$ a monomorphism. We write $\operatorname{card} A$ to denote the cardinality, in the standard sense, of a set $A$.

8.1 Definition $V({}^*X)$ is $\kappa$-saturated if, for each internal binary relation $P \in V({}^*X)$ which is concurrent (Definition 5.9) on some (not necessarily internal) set $A$ in $V({}^*X)$ with $\operatorname{card} A < \kappa$, there exists an element $y \in \operatorname{range} P$ so that $(x, y) \in P$ for all $x \in A$.
H. J. Keisler [21, 22] characterized those ultrafilters $\mathcal{U}$ such that the superstructure $V({}^*X)$ constructed from a given superstructure $V(X)$, using $\mathcal{U}$ as in §II.4, is $\kappa$-saturated; he called them $\kappa$-good ultrafilters. In [21] Keisler established the existence of $\kappa$-good ultrafilters on the assumption of the generalized continuum hypothesis. This assumption was subsequently removed by Kunen. Thus we have the following result.

8.2 Theorem Given any superstructure $V(X)$ and cardinal $\kappa$ there is a $\kappa$-saturated superstructure $V({}^*X)$ and a monomorphism $*\colon V(X) \to V({}^*X)$.
For the proof of this and related results the interested reader is referred to the papers mentioned above and also to the book by Stroyan and Luxemburg [46], where the desired structures are constructed as limits of ultrapowers. In any applications it will not be necessary to know the details of the proof. It follows from Theorem 5.10 that if $\kappa > \operatorname{card} V(X)$ then $V({}^*X)$ is an enlargement.

In applying Theorem 8.2 it is important to note that the set $A$ of Definition 8.1 need not be internal, although the binary relation $P$ must be internal and so the elements of $A$ are internal. For a successful application, however, we do need an upper bound on the cardinality of $A$ which is independent of the particular construction of $V({}^*X)$. For example, suppose that $P$ is the binary relation on ${}^*\mathbb{R} \times {}^*\mathcal{P}_F(\mathbb{R})$ defined by "$(x, B) \in P$ iff the *-finite set $B$ contains $x$." Then $P$ is concurrent on any subset $A \subseteq {}^*\mathbb{R}$. However, it is not possible to apply Theorem 8.2 and Definition 8.1 with $A = {}^*\mathbb{R}$; i.e., it is not possible to find a *-finite subset of ${}^*\mathbb{R}$ which contains all numbers of ${}^*\mathbb{R}$, no matter how large $\kappa$ is. For then ${}^*\mathbb{R}$ itself would be a *-finite set and hence, by transfer down, $\mathbb{R}$ would be finite. The error occurs in trying to apply the result to the set $A = {}^*\mathbb{R}$, whose cardinality depends on the construction of the extension $V({}^*\mathbb{R})$ and is not fixed in advance.

In [36] Luxemburg developed a general theory of monads in enlargements and $\kappa$-saturated extensions. In the following we present several of his important results.
8.3 Definition Let $*\colon V(X) \to V({}^*X)$ be a monomorphism, and let $\mathcal{A}$ be an entity in $V(X)$. The (intersection) monad $\mu(\mathcal{A})$ of $\mathcal{A}$ (with respect to $*$) is the set $\mu(\mathcal{A}) = \bigcap {}^*A\ (A \in \mathcal{A})$.
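Monads $\mu(\mathcal{A})$ are most important when $\mathcal{A}$ is a filter $\mathcal{F}$, i.e., when $\emptyset \notin \mathcal{F}$, $F$ and $G$ in $\mathcal{F}$ implies $F \cap G \in \mathcal{F}$, and $F \in \mathcal{F}$ and $G \supseteq F$ implies $G \in \mathcal{F}$. The next result generalizes the permanence principle.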
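8.4 Theorem (Luxemburg) Let $*\colon V(X) \to V({}^*X)$ be a monomorphism, and assume that $V({}^*X)$ is $\kappa$-saturated. Fix a filter $\mathcal{F} \in V(X)$ with $\operatorname{card} \mathcal{F} < \kappa$; then

(a) given an internal set $B \in V({}^*X)$, if ${}^*F \cap B \ne \emptyset$ for all $F \in \mathcal{F}$, then $\mu(\mathcal{F}) \cap B \ne \emptyset$,
(b) given an internal subset $\mathcal{A}$ of ${}^*\mathcal{F}$ such that every standard element of ${}^*\mathcal{F}$ is an element of $\mathcal{A}$, there exists an element $E \in \mathcal{A}$ such that $E \subseteq \mu(\mathcal{F})$,
(c) given an internal subset $\mathcal{A}$ of ${}^*\mathcal{F}$ such that $E \in {}^*\mathcal{F}$ and $E \subseteq \mu(\mathcal{F})$ implies $E \in \mathcal{A}$, there exists an element $F \in \mathcal{F}$ such that ${}^*F \in \mathcal{A}$.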
Proof: (a) Define an internal relation $P$, with domain ${}^*\mathcal{F}$ and range contained in $B$, by "$(F, x) \in P$ if $x \in B \cap F$." Then $P$ is concurrent on the collection of standard elements of ${}^*\mathcal{F}$, and this collection has the same cardinality as $\mathcal{F}$. Therefore there is a $y \in B$ so that $y \in B \cap {}^*F$ for each $F \in \mathcal{F}$, i.e., $y \in \mu(\mathcal{F}) \cap B$.

(b) Define an internal relation $P$, with domain ${}^*\mathcal{F}$ and range contained in $\mathcal{A}$, by "$(F, G) \in P$ if $G \in \mathcal{A}$ and $G \subseteq F$." Then $P$ is concurrent on the collection of standard elements of ${}^*\mathcal{F}$ (why?), so there is an $E \in \mathcal{A}$ such that $E \subseteq {}^*F$ for each $F \in \mathcal{F}$, i.e., $E \subseteq \mu(\mathcal{F})$.

(c) Let $\mathcal{A}$ satisfy the condition of (c). If $\mathcal{A}$ does not contain a standard element ${}^*F \in {}^*\mathcal{F}$ then the internal set ${}^*\mathcal{F} - \mathcal{A} \subseteq {}^*\mathcal{F}$ contains all standard elements of ${}^*\mathcal{F}$ and so by (b) there exists an element $E \in {}^*\mathcal{F} - \mathcal{A}$ with $E \subseteq \mu(\mathcal{F})$. But then $E \in \mathcal{A}$ by the hypothesis on $\mathcal{A}$ (contradiction). □

Several exercises in the preceding sections have dealt with situations in which, without saturation, the statement (a) of Theorem 8.4 may or may not hold. The results can be summarized as follows: The statement does not hold in general if $B$ is not internal (Exercise II.5.11), but does hold if $B$ is standard (Exercise II.5.10) or if $B$ is internal and $\mathcal{F}$ has a countable basis (Exercise II.7.7). (See Theorem 8.6.) An example due to H. J. Keisler (see Example 2.7.4 in [36]) shows that the statement need not hold if $B$ is internal but $V({}^*X)$ is only an ultrapower enlargement.

We note finally that an internal version of comprehensiveness holds in $\kappa$-saturated extensions.
8.5 Theorem Let $V({}^*X)$ be a $\kappa$-saturated extension of $V(X)$. Assume $C$ is a (not necessarily internal) set of entities in $V_n({}^*X)$ for some $n \in \mathbb{N}$ with $\operatorname{card} C < \kappa$, and $D$ is an internal set in $V({}^*X)$. For any mapping $\phi\colon C \to D$, there is an internal extension $\tilde{\phi}\colon \tilde{C} \to D$ of $\phi$ [i.e., $\tilde{C}$ is internal, contains $C$, and $\tilde{\phi}(a) = \phi(a)$ if $a \in C$]. If $C = \{{}^*a : a \in C_0\}$ we may take $\tilde{C} = {}^*C_0$.
Proof: Let $P$ be the binary relation "$(\phi, \psi) \in P$ iff $\psi$ is an extension of $\phi$" [i.e., $\operatorname{dom} \psi \supseteq \operatorname{dom} \phi$ and $\psi(a) = \phi(a)$ if $a \in \operatorname{dom} \phi$] defined on the set of internal mappings with values in $D$. Let $A$ be the set of all internal mappings $f_x\colon \{x\} \to \phi(x)$, $x \in C$. That is, each element of $A$ is a set consisting of exactly one element from $\phi$. Then $\operatorname{card} A = \operatorname{card} C < \kappa$ and $P$ is concurrent on $A$ (check). Thus there exists an internal map $\tilde{\phi}$ with values in $D$ which extends each $f_x$, $x \in C$, and so $\operatorname{dom} \tilde{\phi} = \tilde{C} \supseteq C$ and $\tilde{\phi}(a) = \phi(a)$, $a \in C$. The rest is left as an exercise (Exercise 1). □
There is a converse of Theorem 8.5 when $\kappa = \aleph_1$, where $\aleph_1$ is the first cardinal number bigger than $\operatorname{card} \mathbb{N}$.
8.6 Theorem $V({}^*X)$ is a denumerably comprehensive extension of $V(X)$ (Definition 6.11) if and only if $V({}^*X)$ is $\aleph_1$-saturated.

Proof: Exercise. □

8.7 Corollary An extension $V({}^*X)$ constructed as in §II.4 is $\aleph_1$-saturated.
Proof: Follows from Theorems 6.13 and 8.6. □
Corollary 8.7 shows that assuming $\aleph_1$-saturation in an application of nonstandard analysis is not assuming very much. Later in this book we assume a stronger form of saturation (larger $\kappa$) only in the proof of Theorem 1.22 of Chapter III (which is not used afterward) and in the proofs of the last few results in §IV.3, where $\kappa$-saturation is used in a more significant way.
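Exercises II.8

1. Show that if the set $C$ in Theorem 8.5 has the form $\{{}^*a : a \in C_0\}$ then one may take $\tilde{C} = {}^*C_0$ in the conclusion of the theorem.
2. Prove Theorem 8.6.
3. Let $V({}^*\mathbb{R})$ be a $\kappa$-saturated extension of $V(\mathbb{R})$ with $\operatorname{card} \mathcal{P}(\mathbb{R}) < \kappa$. Let $B$ be an internal subset of ${}^*\mathbb{R}$ and $\operatorname{st}(B) = \{x \in \mathbb{R} : \text{there exists a } y \in B \text{ with } \operatorname{st}(y) = x\}$. Use Theorem 8.4(a) to show that $\operatorname{st}(B)$ is closed in $\mathbb{R}$.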
4. (Luxemburg [36]) Suppose that $V({}^*X)$ is a $\kappa$-saturated extension of $V(X)$ with $\kappa > \operatorname{card} V(X)$. Let $A \in V(X)$ contain an infinite number of elements. If $\mathcal{A} \subseteq {}^*(\mathcal{P}_F(A))$ is internal and moreover, $E \in \mathcal{A}$ for every *-finite subset $E \subseteq {}^*A$ with the property that $A = \{a \in V(X) : {}^*a \in E\}$, then there exists a finite subset $\{a_1, \ldots, a_n\} \subseteq A$ so that $\{{}^*a_1, \ldots, {}^*a_n\} \in \mathcal{A}$. (Hint: Apply Theorem 8.4 to the Fréchet filter of $A$.)
CHAPTER III
Nonstandard Theory of Topological Spaces
In Chapter I we showed how the notion of continuity for real-valued functions of a real variable could be characterized in terms of the nonstandard concept of nearness [$f$ is continuous at $x$ if ${}^*f(y) \simeq f(x)$ for all $y \simeq x$]. On the real line, nearness and the associated concept of monad are characterized in terms of the distance function, so that $x \simeq y$ if $|x - y| \simeq 0$. We also characterized open and closed sets in terms of monads. In this chapter we will show how these notions can be extended to more general settings.

In the standard development of topology one usually begins with a set $X$ possessing a collection $\mathcal{T}$ of (open) subsets satisfying the abstract analogues of conditions (i) and (ii) of Theorem 9.2 in Chapter I. The pair $(X, \mathcal{T})$ is called a topological space. The notions of continuity can then be defined just in terms of the open sets; i.e., a function $f\colon X \to Y$ is continuous if $f^{-1}(V)$ is open in $X$ for every set $V$ which is open in $Y$. In the nonstandard theory developed here, we will show how the collection $\mathcal{T}$ on $X$ can be used to characterize nearness and monad and so allow a simple development of the theory of topological spaces analogous to that of Chapter I.

One of the most useful results in the nonstandard development is a characterization of compact spaces (the analogues of closed bounded sets on the real line) due to Abraham Robinson. This development is presented in §III.2, with an elaboration in §III.7. Sections III.3, III.4, and III.5 are devoted to the nonstandard theory of metric, normed, and inner-product spaces, which are of central importance in much of analysis. In §III.6 we show how one may begin with a standard metric space $X$ and construct a (standard) metric space on the nonstandard set ${}^*X$, leading to the so-called nonstandard hull of a metric space. This construction plays a central role in some recent applications of nonstandard analysis to the theory of Banach spaces by Henson and Moore (see [16] for a
review). The section ends with a discussion of some results in the theory of function spaces, and includes a generalization of the Arzela-Ascoli theorem of Chapter I.
III.1 Basic Definitions and Results
A topological space is a pair $(X, \mathcal{T})$, where $X$ is a set and $\mathcal{T}$ is a family of subsets of $X$ satisfying the conditions in the following definition.

1.1 Definition A family $\mathcal{T}$ of subsets of $X$, called open sets, is a topology for $X$ if
(a) $\emptyset, X \in \mathcal{T}$;
(b) $U, V \in \mathcal{T}$ implies $U \cap V \in \mathcal{T}$ (and thus every finite intersection of open sets is open);
(c) $U_i \in \mathcal{T}$ $(i \in I)$ implies $\bigcup U_i\ (i \in I) \in \mathcal{T}$, i.e., every arbitrary union of open sets is open.
Closed sets are complements of open sets. Often we call $X$ rather than $(X, \mathcal{T})$ the topological space.
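As an illustration that is not part of the original text, conditions (a)-(c) of Definition 1.1 can be checked mechanically when $X$ is finite. The following minimal Python sketch assumes a finite underlying set; the function name `is_topology` and the two sample families are hypothetical choices made only for this example.

```python
from itertools import combinations

def is_topology(points, opens):
    """Check conditions (a)-(c) of Definition 1.1 for a finite family of subsets."""
    points = frozenset(points)
    family = {frozenset(u) for u in opens}
    # every member of the family must be a subset of the underlying set
    if any(not u <= points for u in family):
        return False
    # (a) the empty set and the whole space are open
    if frozenset() not in family or points not in family:
        return False
    for u, v in combinations(family, 2):
        # (b) pairwise, hence all finite, intersections are open
        if u & v not in family:
            return False
        # (c) for a finite family, closure under pairwise unions covers all unions
        if u | v not in family:
            return False
    return True

X = {1, 2, 3}
chain = [set(), {1}, {1, 2}, {1, 2, 3}]
broken = [set(), {1}, {2}, {1, 2, 3}]   # the union {1, 2} is missing

print(is_topology(X, chain))    # True
print(is_topology(X, broken))   # False
```

The check of (c) by pairwise unions is valid only because the family is finite; for an infinite family arbitrary unions would have to be considered.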
The usual family of open subsets of $\mathbb{R}$, defined in the proof of Proposition 9.1 of Chapter I, is a topology for $\mathbb{R}$ (Theorem I.9.2). We will presently see that there are many topologies for $\mathbb{R}$ as for most sets. With each topology we will associate corresponding notions of convergence and continuity, using only the open sets. In order to develop a nonstandard theory, we first generalize the notions of nearness and monad which were central to the work in Chapter I. We begin with a few basic definitions.

1.2 Definition Let $(X, \mathcal{T})$ be a topological space. A set $U$ is a neighborhood of a point $x \in X$ if $U$ contains an open set $V$ which contains $x$. The neighborhood system $\mathcal{N}_x$ of $x$ is the set of all neighborhoods of $x$. We denote the system of open neighborhoods of $x \in X$ by $\mathcal{T}_x$. A collection $\mathcal{B} \subseteq \mathcal{T}$ is a base for $\mathcal{T}$ if each set in $\mathcal{T}$ is a union of sets in $\mathcal{B}$ or, equivalently, if for each $x \in X$ and each $U \in \mathcal{T}_x$ there is a $V \in \mathcal{T}_x \cap \mathcal{B}$ with $V \subseteq U$. (For example, open intervals form a base for the usual open sets in $\mathbb{R}$.) A collection $\mathcal{W}$ is called a subbase for $\mathcal{T}$ if the collection of finite intersections of members of $\mathcal{W}$ is a base for $\mathcal{T}$. Similarly $\mathcal{B}_x \subseteq \mathcal{N}_x$ is a (neighborhood) base at $x$ if for each $U \in \mathcal{N}_x$ there is a $V \in \mathcal{B}_x$ with $V \subseteq U$; $\mathcal{W}_x \subseteq \mathcal{N}_x$ is a subbase at $x$ if the collection of finite intersections of members of $\mathcal{W}_x$ is a base at $x$. If $\mathcal{T}$ and $\mathcal{S}$ are topologies for $X$, then $\mathcal{T}$ is weaker than $\mathcal{S}$ (and $\mathcal{S}$ is stronger than $\mathcal{T}$) if $\mathcal{T} \subseteq \mathcal{S}$.

From now on we work in an enlargement $V({}^*S)$ of a superstructure $V(S)$, where $V(S)$ contains the standard space $X$ under consideration, so $\mathcal{T} \in V(S)$ as well. In this section we will not use the fact that if $x \in X$ then $x$ may contain elements. Therefore, we will write $x$ instead of ${}^*x$ for the nonstandard extension of $x$.

1.3 Definition The sets in ${}^*\mathcal{T}$ are called *-open subsets of ${}^*X$. The monad of $x \in X$ is the subset $m(x) = \bigcap {}^*U\ (U \in \mathcal{T}_x)$ of ${}^*X$. A point $y \in {}^*X$ is near $x \in X$, and $x$ is the standard part of $y$, if $y \in m(x)$; then we write $y \simeq x$ and $x = \operatorname{st}(y)$. The set of near-standard points is the set $\operatorname{ns}({}^*X) = \bigcup m(x)\ (x \in X)$. A point $y \in {}^*X$ is called remote if it is not near-standard.

An easy exercise shows that $m(x) = \bigcap {}^*U\ (U \in \mathcal{N}_x)$.
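1.4 Proposition If $\mathcal{W}_x$ is a local subbase at $x$, then $m(x) = \bigcap {}^*U\ (U \in \mathcal{W}_x)$.

Proof: $\bigcap {}^*U\ (U \in \mathcal{W}_x) \supseteq \bigcap {}^*U\ (U \in \mathcal{N}_x)$ since $\mathcal{W}_x \subseteq \mathcal{N}_x$. On the other hand, for each $U \in \mathcal{N}_x$ there exist $V_i \in \mathcal{W}_x$ $(1 \le i \le n)$ with $\bigcap V_i\ (1 \le i \le n) \subseteq U$, and so $\bigcap {}^*V_i\ (1 \le i \le n) \subseteq {}^*U$ by transfer. Hence $\bigcap {}^*V\ (V \in \mathcal{W}_x) \subseteq \bigcap {}^*U\ (U \in \mathcal{N}_x)$. □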
1.5 Examples
1. Discrete topology. $(X, \mathcal{T})$ is discrete if $\{x\}$ is open for each $x \in X$. In this case $m(x) = \{x\}$ for each $x \in X$.
2. Trivial topology. $(X, \mathcal{T})$ is trivial if $\mathcal{T} = \{\emptyset, X\}$. In this case $m(x) = {}^*X$ for each $x \in X$.
3. Usual topology on $\mathbb{R}$. The open sets in $\mathbb{R}$ as defined in §I.9 constitute a topology. The monads as defined here and in Definition 6.4 of Chapter I are identical [where we assume that ${}^*\mathbb{R}$ and $V({}^*\mathbb{R})$ are obtained from the same ultrafilter]. This follows immediately from Proposition 1.4 since the set $\mathcal{B}_x$ of symmetric open intervals about $x$ forms a local base by the definition of open set in $\mathbb{R}$. A subbase for the topology is formed by intervals of the form $(-\infty, b)$, $(a, +\infty)$ with $a, b \in \mathbb{R}$.
4. Half-open interval topology on $\mathbb{R}$. Let $\mathcal{T}$ be the topology for $\mathbb{R}$ which has as base the set $\mathcal{B}$ of half-open intervals $[a, b) = \{x : a \le x < b\}$, where $a$ and $b$ are real. Here $m(x) = \{y \in {}^*\mathbb{R} : x \le y,\ x \simeq y\}$ (Exercise 1).
5. Finite complement topology. For simplicity let $X = \mathbb{N}$ (any infinite set would do), and let $\mathcal{T}$ be the collection consisting of the empty set and those subsets of $\mathbb{N}$ whose complements are finite. It is an easy standard exercise to show that $\mathcal{T}$ is a topology. Here $m(x) = \{x\} \cup {}^*\mathbb{N}_\infty$ (Exercise 1).
6. Product topology. Let $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ be topological spaces. Then $X \times Y$ can be made into a topological space as follows: A set $W \subseteq X \times Y$ is open if to each $(x, y) \in W$ there correspond sets $U \in \mathcal{T}_x$, $V \in \mathcal{S}_y$ so that $U \times V \subseteq W$, i.e., products of open sets form a base for the topology (check that this defines a topology). The resulting topology is called the product topology and is denoted by $\mathcal{T} \times \mathcal{S}$. If $m_X$, $m_Y$, and $m$ denote monads in $(X, \mathcal{T})$, $(Y, \mathcal{S})$, and $(X \times Y, \mathcal{T} \times \mathcal{S})$, respectively, then $m((x, y)) = m_X(x) \times m_Y(y)$, $x \in X$, $y \in Y$ (Exercise 1).

The following facts should be noted in comparing the usual monads for $\mathbb{R}$ and monads in a general topological space $(X, \mathcal{T})$:
(a) The concept of nearness is derived from that of monad and not vice versa as in Definition 6.4 of Chapter I.
(b) We have defined monads only for standard points in ${}^*X$.
(c) Nearness is not in general an equivalence relation on ${}^*X$ [this is, of course, because of (b)].

The monad $m(x)$ always contains $x$. That $m(x)$ will in general contain points other than $x$ follows from the following basic lemma, the proof of which requires that $V({}^*S)$ be an enlargement.

1.6 Proposition For each $x \in X$ there is a *-open set $V \in {}^*\mathcal{T}_x$ with $V \subseteq m(x)$.

Proof: The binary relation $P$ on $\mathcal{T}_x \times \mathcal{T}_x$ defined by $P(U, V)$ if $V \subseteq U$ is concurrent. For if $U_1, \ldots, U_n \in \mathcal{T}_x$ then $V = U_1 \cap \cdots \cap U_n$ satisfies $P(U_i, V)$, $1 \le i \le n$. Since $V({}^*S)$ is an enlargement, Theorem 5.10 of Chapter II guarantees the existence of an element $V \in {}^*\mathcal{T}_x$, so that $V \subseteq {}^*U$ for all $U \in \mathcal{T}_x$ and hence $V \subseteq m(x)$. □
1.7 Proposition Let $A$ be a subset of $X$. Then
(i) $A$ is open iff $m(x) \subseteq {}^*A$ for each $x \in A$,
(ii) $A$ is closed iff $m(x) \cap {}^*A = \emptyset$ for each $x$ in the complement $A'$ of $A$.

Proof: (i) Suppose $A$ is open and let $x \in A$. By definition there exists an open set $U \in \mathcal{T}_x$ with $U \subseteq A$. By transfer $m(x) \subseteq {}^*U \subseteq {}^*A$.
Conversely, suppose $m(x) \subseteq {}^*A$ for $x \in A$. By Proposition 1.6 there exists a $V \in {}^*\mathcal{T}_x$ with $V \subseteq m(x) \subseteq {}^*A$. Thus the internal sentence $(\exists V \in {}^*\mathcal{T}_x)[V \subseteq {}^*A]$ is true and so, by downward transfer, there exists a set $V_x \in \mathcal{T}_x$ with $V_x \subseteq A$. Thus $A$ is open since $A = \bigcup V_x\ (x \in A)$.
(ii) This follows immediately from (i) and the definition of a closed set: $A$ is closed if $A'$ is open. □

1.8 Definition A point $x$ is an accumulation point of the set $A \subseteq X$ if every open neighborhood of $x$ contains points of $A$ other than $x$. We let $\hat{A}$ denote the set of accumulation points of $A$; the set $\bar{A} = A \cup \hat{A}$ is the closure of $A$. $A$ is dense in $B$ if $\bar{A} = B$.

1.9 Proposition A point $x$ is an accumulation point of $A$ iff $m(x)$ contains a point $y \in {}^*A$ different from $x$.
Proof: If $x$ is an accumulation point of $A$ then the sentence $(\forall U \in \mathcal{T}_x)(\exists y \in U \cap A)[y \ne x]$ is true for $V(X)$, and hence, by transfer, each $U \in {}^*\mathcal{T}_x$ contains a point $y \ne x$ in ${}^*A$. This is true, in particular, of the *-open set $V$ of Proposition 1.6, and so there is a $y \in m(x) \cap {}^*A$ with $y \ne x$. Conversely, suppose that $m(x)$ contains a point $y \ne x$ in ${}^*A$. Then, for a fixed $U \in \mathcal{T}_x$, ${}^*U$ contains a point $y \ne x$ in ${}^*A$. Thus the internal sentence $(\exists y \in {}^*(U \cap A))[y \ne x]$ is true, and it follows by downward transfer that there exists a $y \in U \cap A$ with $y \ne x$. □

1.10 Proposition The closure $\bar{A}$ of $A \subseteq X$ consists of those $x \in X$ for which $m(x) \cap {}^*A \ne \emptyset$. The closure of $A$ is the smallest closed set containing $A$. Thus $A = \bar{A}$ if $A$ is closed.
Proof: Exercise. □
Let $\mathcal{T}$ and $\mathcal{S}$ be two topologies for a set $X$ with associated monads $m_{\mathcal{T}}(x)$ and $m_{\mathcal{S}}(x)$ $(x \in X)$. An easy exercise shows that $\mathcal{T}$ is weaker than $\mathcal{S}$ iff $m_{\mathcal{T}}(x) \supseteq m_{\mathcal{S}}(x)$ for each $x \in X$.

We noted in §I.6 that if $x$ and $y$ are distinct standard real numbers then $m(x) \cap m(y)$ is empty. Therefore, we say that $\mathbb{R}$ is a Hausdorff space. This property is not true in general for topological spaces. Properties of spaces which deal with the relationship between monads of distinct points are called separation properties. Some of the more important separation properties are presented next; the most important of these is the Hausdorff property.
1.11 Definition The space $(X, \mathcal{T})$ is
(a) $T_0$ if, for each pair $x, y$ of distinct points in $X$, there is an open neighborhood of one not containing the other,
(b) $T_1$ if $\{x\}$ is closed for each $x \in X$,
(c) Hausdorff (or $T_2$) if whenever $x \ne y$ in $X$ there are disjoint open neighborhoods $U$ and $V$ of $x$ and $y$.
There are more separation properties (e.g., regularity and normality) which we will consider in the exercises.

1.12 Proposition The topological space $(X, \mathcal{T})$ is
(a) $T_0$ iff whenever $x, y \in X$ and both $x \in m(y)$ and $y \in m(x)$ then $x = y$,
(b) $T_1$ iff whenever $x, y \in X$ and $x \in m(y)$ then $x = y$,
(c) Hausdorff iff monads of distinct points in $X$ are disjoint.

Proof: We prove (c) and leave the other proofs as exercises. Suppose $(X, \mathcal{T})$ is Hausdorff and $x, y \in X$ are distinct. Then there exist $U \in \mathcal{T}_x$, $V \in \mathcal{T}_y$ with $U \cap V = \emptyset$. Therefore, ${}^*U \cap {}^*V = \emptyset$, and since $m(x) \subseteq {}^*U$ and $m(y) \subseteq {}^*V$, we have $m(x) \cap m(y) = \emptyset$. Conversely, if $m(x) \cap m(y) = \emptyset$ then by Proposition 1.6 there exist $U \in {}^*\mathcal{T}_x$, $V \in {}^*\mathcal{T}_y$ with $U \cap V = \emptyset$. By downward transfer of the appropriate sentence (check), there exist $U \in \mathcal{T}_x$, $V \in \mathcal{T}_y$ with $U \cap V = \emptyset$. □
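The monad criteria of Proposition 1.12 can be tried out on finite spaces. The sketch below is an illustration only and rests on one assumption: $X$ is finite, so every ${}^*U$ may be identified with $U$ and $m(x)$ is simply the intersection of the open sets containing $x$. The names `monad` and `separation` are hypothetical helpers, not notation from the text.

```python
def monad(x, opens):
    """Intersection of all open sets containing x (the monad when X is finite)."""
    m = None
    for u in opens:
        if x in u:
            m = set(u) if m is None else m & set(u)
    return m

def separation(points, opens):
    """Evaluate the three monad criteria of Proposition 1.12 on a finite space."""
    m = {x: monad(x, opens) for x in points}
    pairs = [(x, y) for x in points for y in points if x != y]
    # (a) T0: x in m(y) and y in m(x) together force x = y
    t0 = all(not (x in m[y] and y in m[x]) for x, y in pairs)
    # (b) T1: x in m(y) forces x = y
    t1 = all(x not in m[y] for x, y in pairs)
    # (c) Hausdorff: monads of distinct points are disjoint
    hausdorff = all(not (m[x] & m[y]) for x, y in pairs)
    return t0, t1, hausdorff

chain = [set(), {1}, {1, 2}, {1, 2, 3}]
discrete = [set(), {1}, {2}, {1, 2}]
trivial = [set(), {1, 2}]

print(separation({1, 2, 3}, chain))   # (True, False, False)
print(separation({1, 2}, discrete))   # (True, True, True)
print(separation({1, 2}, trivial))    # (False, False, False)
```

The discrete and trivial cases agree with the monads $m(x) = \{x\}$ and $m(x) = {}^*X$ computed in Examples 1.5; the identification of ${}^*U$ with $U$ fails for infinite spaces, where the enlargement is essential.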
If $(X, \mathcal{T})$ is Hausdorff then there is only one standard point $\operatorname{st}(y)$ associated with each $y \in \operatorname{ns}({}^*X)$. It is defined by $\operatorname{st}(y) = x$, $y \in m(x)$. Thus for Hausdorff spaces we have a well-defined map $\operatorname{st}\colon \operatorname{ns}({}^*X) \to X$ called the standard part map, which has many applications (e.g., see §IV.3 below).

1.13 Examples
1. The discrete topology is Hausdorff, and every subset is both open and closed.
2. The trivial topology of a space with two or more points is not $T_0$.
3. The finite complement topology on $\mathbb{N}$ is $T_1$ but not Hausdorff by Proposition 1.12. Also a proper subset of $\mathbb{N}$ is closed in the finite complement topology iff it is finite. For if $A$ is finite then ${}^*A = A$, and if $x \in A'$ then $m(x) \cap {}^*A = (\{x\} \cup {}^*\mathbb{N}_\infty) \cap A = \emptyset$. On the other hand, if $A$ is infinite then ${}^*A \cap {}^*\mathbb{N}_\infty \ne \emptyset$ by 6.11 of Chapter I, and $m(x) \cap {}^*A \ne \emptyset$ for any $x$.
So far we have used a topology $\mathcal{T}$ to define associated monads $m(x)$, $x \in X$. Conversely, it is possible to start with a collection $k(x)$, $x \in X$, of subsets of ${}^*X$ with $x \in k(x)$, and define an associated family $\mathcal{T}$ as follows: $U \in \mathcal{T}$ if $k(x) \subseteq {}^*U$ for each $x \in U$. An easy exercise shows that $\mathcal{T}$ is a topology. If $m(x)$ $(x \in X)$ are the monads of $\mathcal{T}$ then clearly $k(x) \subseteq m(x)$ for all $x \in X$, but set equality does not necessarily hold (see Exercise 6). The sets $k(x)$ will be called pseudomonads; the concept will be used in §III.8.

Let $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ be topological spaces with monads $m(x)$ $(x \in X)$ and $\tilde{m}(y)$ $(y \in Y)$, respectively. To discuss continuity of mappings $f\colon X \to Y$ we work in an enlargement containing ${}^*X$ and ${}^*Y$ and thus ${}^*\mathcal{T}$, ${}^*\mathcal{S}$ and all mappings ${}^*f\colon {}^*X \to {}^*Y$, etc. The symbol $\simeq$ will be used for the relation of nearness in both $({}^*X, {}^*\mathcal{T})$ and $({}^*Y, {}^*\mathcal{S})$; the context should clear up any ambiguities.

1.14 Definition The map $f\colon X \to Y$ is continuous at $x \in X$ if to each $V \in \mathcal{S}_{f(x)}$ there corresponds a $U \in \mathcal{T}_x$ with $f[U] \subseteq V$. $f$ is continuous on $X$ if it is continuous at each $x \in X$. A one-to-one map $f$ from $X$ onto $Y$ is a homeomorphism if $f$ and $f^{-1}$ are continuous.
1.15 Proposition The map $f\colon X \to Y$ is continuous at $x \in X$ iff ${}^*f(y) \simeq f(x)$ for each $y \simeq x$. That is, ${}^*f[m(x)] \subseteq \tilde{m}(f(x))$.

Proof: Suppose $f$ is continuous at $x \in X$, and let $V$ be any open neighborhood of $f(x)$. Find a corresponding $U \in \mathcal{T}_x$ from the definition of continuity so that $f[U] \subseteq V$. If $y \simeq x$ then $y \in {}^*U$ by 1.7(i), so ${}^*f(y) \in {}^*V$ since ${}^*f[{}^*U] \subseteq {}^*V$ by transfer. Thus ${}^*f(y) \in {}^*V$ for each $V \in \mathcal{S}_{f(x)}$, i.e., ${}^*f(y) \simeq f(x)$. The converse is left to the reader. □
Proposition 1.15 shows that for real-valued functions of a real variable, Definition 1.14 is equivalent to the $\varepsilon$-$\delta$ definition of continuity.

1.16 Theorem The map $f\colon X \to Y$ is continuous on $X$ iff $f^{-1}[V] \in \mathcal{T}$ for each $V \in \mathcal{S}$.
Proof: Fix $x \in X$, suppose $f$ is continuous, and let $V \in \mathcal{S}_{f(x)}$. Then ${}^*f[m(x)] \subseteq \tilde{m}(f(x)) \subseteq {}^*V$ by continuity at $x$ and the fact that $V$ is open. It follows that $m(x) \subseteq {}^*f^{-1}[{}^*V] = {}^*(f^{-1}[V])$ (check), and so $f^{-1}[V]$ is open by Proposition 1.7(i). The converse is left to the reader. □
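On finite spaces the two descriptions of continuity, via monads (Proposition 1.15) and via preimages of open sets (Theorem 1.16), can be compared directly. The following Python sketch is illustrative only: it assumes both spaces are finite, so that ${}^*f = f$ and each monad is the intersection of the open sets containing the point, and all function names and sample maps are hypothetical.

```python
def monad(x, opens):
    """Intersection of all open sets containing x (the monad when the space is finite)."""
    m = None
    for u in opens:
        if x in u:
            m = set(u) if m is None else m & set(u)
    return m

def continuous_by_monads(f, X, opens_X, opens_Y):
    """Proposition 1.15 on finite spaces: f maps m(x) into the monad of f(x)."""
    return all({f[p] for p in monad(x, opens_X)} <= monad(f[x], opens_Y) for x in X)

def continuous_by_preimages(f, X, opens_X, opens_Y):
    """Theorem 1.16: the preimage of every open set is open."""
    family = {frozenset(u) for u in opens_X}
    return all(frozenset(x for x in X if f[x] in v) in family for v in opens_Y)

X = {1, 2, 3}
opens_X = [set(), {1}, {1, 2}, {1, 2, 3}]
opens_Y = [set(), {"a"}, {"a", "b"}]       # a two-point space in which only "a" is open

f = {1: "a", 2: "a", 3: "b"}               # preimage of {"a"} is {1, 2}, which is open
g = {1: "b", 2: "a", 3: "a"}               # preimage of {"a"} is {2, 3}, which is not open

print(continuous_by_monads(f, X, opens_X, opens_Y), continuous_by_preimages(f, X, opens_X, opens_Y))
print(continuous_by_monads(g, X, opens_X, opens_Y), continuous_by_preimages(g, X, opens_X, opens_Y))
```

Both tests accept `f` and reject `g`, as Theorem 1.16 predicts; the agreement is checked here only for these two sample maps, not proved in general.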
The reader will have noticed that the proofs of the results 1.6-1.16 are considerably simpler than the proofs of the corresponding results in §§I.9 and I.10. This is mainly because the richer language of Chapter II allows us to avoid proofs by contradiction which use Skolem functions.

If $Y$ is a subset of the topological space $(X, \mathcal{T})$, then $\mathcal{T}$ induces a topology called the relative topology $\mathcal{T}_Y$ on $Y$. A subset $U \subseteq Y$ belongs to $\mathcal{T}_Y$ iff $U = V \cap Y$ for some $V \in \mathcal{T}$. It is easy to see that the monads in $(Y, \mathcal{T}_Y)$ are given by $\tilde{m}(y) = m(y) \cap {}^*Y$, $y \in Y$, where $m(y)$ is the monad of $y$ in $(X, \mathcal{T})$ (check). The characterizations of relative openness, relative closedness, continuity, etc., are the obvious modifications of those we have just proved with $\tilde{m}$ replacing $m$.

Next we define the important notion of a weak topology. Suppose that $X$ is a set and $(X_i, \mathcal{T}_i)$ $(i \in I)$ is a family of topological spaces. We work in an enlargement containing ${}^*X$ and ${}^*Y$ where $Y = \bigcup X_i\ (i \in I)$. We let $m_i(y)$ $(i \in I,\ y \in X_i)$ denote the monads of $y$ in $(X_i, \mathcal{T}_i)$. Let $\{\phi_i\colon X \to X_i : i \in I\}$ be a family of mappings.

1.17 Definition The weak topology $\mathcal{T}$ on $X$ for the family $\{\phi_i : i \in I\}$ is the topology generated from the subbase $\mathcal{W}$ consisting of all inverse images of the form $\phi_i^{-1}[U]$, $U \in \mathcal{T}_i$; i.e., $\mathcal{T}$ consists of all sets obtained by taking arbitrary unions of finite intersections of sets in $\mathcal{W}$.
The weak topology is the weakest topology which makes all the maps $\phi_i$ continuous (Exercise 8).

1.18 Proposition If $m(x)$ $(x \in X)$ is a monad of the weak topology, then

$$m(x) = \{y \in {}^*X : {}^*\phi_i(y) \in m_i(\phi_i(x)) \text{ for all } i \in I\}.$$
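Proof: Let the right-hand side of the equation be denoted by $k(x)$. Write $\mathcal{T}^i_{\phi_i(x)}$ for the system of open neighborhoods of $\phi_i(x)$ in $(X_i, \mathcal{T}_i)$. If $x \in X$ then for $i \in I$ the sets $\phi_i^{-1}[U]$, $U \in \mathcal{T}^i_{\phi_i(x)}$, are open neighborhoods of $x$, so

$$m(x) \subseteq \bigcap \bigl\{ y \in {}^*X : y \in \bigcap {}^*(\phi_i^{-1}[U])\ (U \in \mathcal{T}^i_{\phi_i(x)}) \bigr\}\ (i \in I) = \bigcap \bigl\{ y \in {}^*X : y \in {}^*\phi_i^{-1}\bigl[\bigcap {}^*U\ (U \in \mathcal{T}^i_{\phi_i(x)})\bigr] \bigr\}\ (i \in I) = k(x).$$

On the other hand, if $V \in \mathcal{T}_x$ is a neighborhood in the base of $\mathcal{T}$ generated by the subbase $\mathcal{W}$, then $V$ is a finite intersection of sets of the form $\phi_i^{-1}[U_i]$, $U_i \in \mathcal{T}^i_{\phi_i(x)}$. Clearly $k(x) \subseteq {}^*\phi_i^{-1}[{}^*U_i]$ for each $U_i \in \mathcal{T}^i_{\phi_i(x)}$ and so $k(x) \subseteq {}^*V$. It follows that $k(x) \subseteq m(x)$, and we are through. □

1.19 Definition: The Product Topology Let $(X_i, \mathcal{T}_i)$ $(i \in I)$ be a family of topological spaces. Then the product $X = \prod X_i\ (i \in I)$ is defined to be the set of all mappings $x$ on $I$ with $x(i) \in X_i$ for $i \in I$. The product topology $\mathcal{T}$ for $X$ is the weak topology generated by the mappings $\phi_i\colon X \to X_i$ defined by $\phi_i(x) = x(i)$.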
To see what ${}^*X$ is, note that each $x \in X$ is of the form $x\colon I \to \bigcup X_i\ (i \in I)$ with $x(i) \in X_i$. The *-transform of the collection $\{X_i : i \in I\}$ includes new sets $X_i$ for $i \in {}^*I - I$. Thus, by transfer, each $x \in {}^*X$ is of the form $x\colon {}^*I \to {}^*[\bigcup X_i\ (i \in I)]$ with $x({}^*i) \in {}^*X_i$ if $i \in I$, whereas if $i$ is not standard, then $x(i) \in X_i$, but $X_i$ need not be the extension of a standard set. If $x \in X$, and $m(x)$ denotes the monad in $\mathcal{T}$, then by Proposition 1.18

$$m(x) = \{y \in {}^*X : y(i) \in m_i(x(i)) \text{ for all standard } i \text{ in } {}^*I\}.$$
That is, the monad is determined by just the standard indices in *I. 1.20 Theorem The topological product of Hausdorff spaces is Hausdorff.
Proof: Let $X = \prod X_i$, where the $(X_i, \mathcal{T}_i)$ are Hausdorff with monads $m_i(x)$. Let $\mathcal{T}$ be the product topology with monad $m(x)$. If $x, y \in X$ with $m(x) \cap m(y) \ne \emptyset$, let $z \in m(x) \cap m(y)$. Then $z(i) \in m_i({}^*x(i)) \cap m_i({}^*y(i))$ for each $i \in I$, and so $x(i) = y(i)$ for each $i \in I$ since $(X_i, \mathcal{T}_i)$ is Hausdorff, i.e., $x = y$. □
We end this section with a result which is valid under the assumption that $X$ is in $V(S)$ for some $S$ and $V({}^*S)$ is $\kappa$-saturated with $\kappa > \operatorname{card} \mathcal{T}$. This result was mentioned at the beginning of §II.8 as a good example of the use of saturation in nonstandard analysis. It will be referred to again in §IV.3.

1.21 Definition Let $(X, \mathcal{T})$ be a topological space with monads $m(x)$, $x \in X$. The standard part $\operatorname{st}(A)$ of a set $A \subseteq {}^*X$ is the set of all $x \in X$ for which there exists a $y \in A$ with $y \in m(x)$.

*1.22 Theorem Assume $X \in V(S)$ and $V({}^*S)$ is $\kappa$-saturated with $\kappa > \operatorname{card} \mathcal{T}$. If $B \subseteq {}^*X$ is internal then $\operatorname{st}(B)$ is closed.
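Proof: Suppose $z$ is an accumulation point of ${}^{\circ}B = \operatorname{st}(B)$. If $U \in \mathcal{T}_z$ then there exists a point $x \in {}^{\circ}B$ with $x \in U$. Since $x \in {}^{\circ}B$ there exists a $y \in B$ with $y \in m(x)$, and hence $y \in {}^*U$ since $U$ is open. Thus ${}^*U \cap B \ne \emptyset$ for all $U \in \mathcal{T}_z$. Since $V({}^*S)$ is $\kappa$-saturated with $\kappa > \operatorname{card} \mathcal{T}_z$, we see from Theorem 8.4(a) of Chapter II that $\mu(\mathcal{T}_z) \cap B \ne \emptyset$, where $\mu(\mathcal{T}_z)$ is the intersection monad of the filter $\mathcal{T}_z$ (Definition 8.3 of Chapter II). Clearly $\mu(\mathcal{T}_z) = m(z)$, and so $z \in {}^{\circ}B$, and we are through. □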
Note that if $A \subseteq {}^*X$ then $\operatorname{st}(A) = \operatorname{st}(A \cap \operatorname{ns}({}^*X))$. Also note that Proposition 1.10 can be interpreted to say that $\bar{A} = \operatorname{st}({}^*A \cap \operatorname{ns}({}^*X))$, and so Theorem 1.22 is a generalization of Proposition 1.10. Theorem 1.22 was established for metric spaces by Robinson using an enlargement [42, Theorem 4.3.3], and in the general case (assuming saturation) by Luxemburg [36, Theorem 3.4.2]. An example due to Keisler shows that Theorem 1.22 is not true if $V({}^*S)$ is not $\kappa$-saturated with $\kappa > \operatorname{card} \mathcal{T}$ [36, Example 3.4.3].
Exercises III.1

1. Verify the statements in Examples 1.5.4-6.
2. Prove Proposition 1.10.
3. Prove (a) and (b) of Proposition 1.12.
4. Prove that a topology $\mathcal{T}$ is weaker than a topology $\mathcal{S}$ on $X$ iff $m_{\mathcal{T}}(x) \supseteq m_{\mathcal{S}}(x)$ for each $x \in X$, where $m_{\mathcal{T}}$ and $m_{\mathcal{S}}$ denote the monads for $\mathcal{T}$ and $\mathcal{S}$, respectively.
5. A $T_1$ space is normal if for any two disjoint closed sets $A$ and $B$ there are disjoint open sets $U$ and $V$ with $A \subseteq U$ and $B \subseteq V$. A $T_1$ space is regular if the same condition holds for all $A$ and $B$, where $A$ is a point (actually a set consisting of a point) and $B$ is a closed set. Give a nonstandard condition for regularity and normality.
6. (a) Let $k(x)$ be a subset of ${}^*X$ for each $x \in X$. Define a collection $\mathcal{T}$ of subsets of $X$ as follows: $U \in \mathcal{T}$ iff $k(x) \subseteq {}^*U$ for each $x \in U$. Show that $\mathcal{T}$ is a topology for $X$. Also show that if $m(x)$ is the $\mathcal{T}$-monad of $x \in X$ then $k(x) \subseteq m(x)$.
(b) Fix an infinitesimal $\varepsilon > 0$ in ${}^*\mathbb{R}$ and for each $x \in \mathbb{R}$ let $k(x)$ be the pseudomonad $\{y \in {}^*\mathbb{R} : |y - x| < \varepsilon\}$. Show that a set $U$ is open in $\mathbb{R}$ in the usual sense if and only if, for each $x \in U$, $k(x) \subseteq {}^*U$. Clearly $k(x) \subsetneq m(x)$ for each $x \in \mathbb{R}$.
(c) Let $X$ be any set. Let $\mathcal{B}_x$, $x \in X$, be a collection of subsets of $X$ satisfying the following:
(i) If $V \in \mathcal{B}_x$ then $x \in V$,
(ii) If $V_1, V_2 \in \mathcal{B}_x$, there exists a $V \in \mathcal{B}_x$ with $V \subseteq V_1 \cap V_2$,
(iii) If $y \in U \in \mathcal{B}_x$, then there is a $V \in \mathcal{B}_y$ with $V \subseteq U$.
Use the sets $k(x) = \bigcap {}^*U\ (U \in \mathcal{B}_x)$ to define a topology $\mathcal{T}$ as in 6(a). Show that $\mathcal{B}_x$ is a neighborhood base in $\mathcal{T}$ for each $x \in X$.
7. Finish the proof of Proposition 1.15.
8. Show that the weak topology is the weakest topology making the corresponding functions continuous. (See Definition 1.17.)
9. Let $A$ be a subset of a topological space $X$. A point $x$ is an interior point of $A$ iff $A$ is a neighborhood of $x$. The set of interior points of $A$ is denoted by $A^{\circ}$. A point $x$ is a boundary point of $A$ if $x$ is not interior to $A$ and not interior to $A'$. The set of boundary points of $A$ is denoted by $\partial A$.
Show that (a) $x \in A^{\circ}$ iff $m(x) \subseteq {}^*A$, (b) $x \in \partial A$ iff $m(x) \cap {}^*A \neq \emptyset$ and $m(x) \cap {}^*A' \neq \emptyset$.
10. Let $A$ be a subset of a topological space $X$. Use Exercise 9 and the text material to establish the following results:
(a) $\partial A = \bar{A} \cap \overline{A'} = \bar{A} - A^{\circ}$,
(b) $X - \partial A = A^{\circ} \cup (A')^{\circ}$,
(c) $\bar{A} = A \cup \partial A$, $A^{\circ} = A - \partial A$,
(d) $A$ is closed iff $A \supseteq \partial A$,
(e) $A$ is open iff $A \cap \partial A = \emptyset$.
11. Let $(X \times Y, \mathcal{T} \times \mathcal{S})$ be the product of $(X,\mathcal{T})$ and $(Y,\mathcal{S})$. Show that if $A \subseteq X$ and $B \subseteq Y$ then
(a) $\overline{A \times B} = \bar{A} \times \bar{B}$,
(b) $(A \times B)^{\circ} = A^{\circ} \times B^{\circ}$,
(c) $\partial(A \times B) = (\partial A \times \bar{B}) \cup (\bar{A} \times \partial B)$.
12. Let $Y$ be a subset of $(X,\mathcal{T})$ with relative topology $\mathcal{T}_Y$. If $A \subseteq Y$ show that
(a) $A$ is $\mathcal{T}_Y$-closed iff it is the intersection of $Y$ and a $\mathcal{T}$-closed set.
(b) A point $y \in Y$ is a $\mathcal{T}_Y$-accumulation point of $A$ iff it is a $\mathcal{T}$-accumulation point.
13. (a) Let $(X_1,\mathcal{T}_1)$, $(X_2,\mathcal{T}_2)$, and $(X_3,\mathcal{T}_3)$ be topological spaces. Show that a function $f: X_1 \to X_2$ is continuous iff, for each subset $A \subseteq X_1$, $f[\bar{A}] \subseteq \overline{f[A]}$.
(b) Show that if $f: X_1 \to X_2$ and $g: X_2 \to X_3$ are continuous, then the composite function $h = g \circ f$ defined by $h(x) = g(f(x))$ for $x \in X_1$ is continuous.
14. Let $\mathcal{T}$ be the product topology on $X = \prod X_i\ (i \in I)$ where $(X_i,\mathcal{T}_i)$ are topological spaces. If $A_i \subseteq X_i$ for each $i \in I$, show that $\overline{\prod A_i\ (i \in I)} = \prod \bar{A}_i\ (i \in I)$, so that the product of closed sets is closed.
15. (a) A sequence $(x_n : n \in N)$ in a space $(X,\mathcal{T})$ converges to $x \in X$ if for every neighborhood $U$ of $x$ there is an $m$ so that $x_n \in U$ if $n \ge m$. Show that $(x_n)$ converges to $x$ iff $^*x_\omega \in m(x)$ for all infinite $\omega$.
(b) Let $(x_n : n \in N)$ be a sequence in $X = \prod X_i\ (i \in I)$, where the $(X_i,\mathcal{T}_i)$ are topological spaces. Show that $(x_n)$ converges to $x \in X$ iff $(\phi_i(x_n))$ converges to $\phi_i(x)$ for each $i \in I$, where the $\phi_i$ are as in Definition 1.19.
16. Let $(X,\mathcal{T})$ and $(Y,\mathcal{S})$ be topological spaces with $(Y,\mathcal{S})$ being Hausdorff. Suppose that $f, g: X \to Y$ are continuous. Show that $\{x : f(x) = g(x)\}$ is closed.
III.2 Compactness
A cornerstone of topology is the notion of compactness, which is defined as follows.

2.1 Definition A collection $\mathcal{A} = \{A_i : i \in I\}$ of sets is a cover of (or covers) $A \subseteq X$ if $A \subseteq \bigcup A_i\ (i \in I)$. A subcover of $\mathcal{A}$ is a subcollection of $\mathcal{A}$ which also covers $A$. $A$ is a compact subset of a topological space $(X,\mathcal{T})$ if each open cover, that is, each cover of $A$ by open sets $U_i$ $(i \in I)$, contains a finite subcover.

Probably the most useful result in nonstandard analysis is the following pointwise characterization of compactness due to Robinson.

2.2 Robinson's Theorem Let $(X,\mathcal{T})$ be a topological space. Then $A \subseteq X$ is compact iff every $y \in {}^*A$ is near a standard point $x \in A$.
Proof: Suppose $A$ is compact but that there is a point $y \in {}^*A$ which is not contained in the monad of any $x \in A$. Then each $x \in A$ possesses an open neighborhood $U_x$ with $y \notin {}^*U_x$. The covering $\{U_x : x \in A\}$ of $A$ has a finite subcovering $\{U_1,\ldots,U_n\}$; i.e., $U_1 \cup \cdots \cup U_n \supseteq A$. By transfer $^*U_1 \cup \cdots \cup {}^*U_n \supseteq {}^*A$. This contradicts the fact that $y \in {}^*A$ but $y \notin {}^*U_i$, $1 \le i \le n$.
Conversely, suppose that $A$ is not compact. Then there is an open covering $\mathcal{A} = \{U_i : i \in I\}$ of $A$ which has no finite subcover. The binary relation $P$ on $\mathcal{A} \times A$ defined by $P(U,x)$ iff $x \notin U$ is concurrent (check). By Theorem 5.10(ii) of Chapter II there is a point $y \in {}^*A$ with $y \notin {}^*U$ for all $U \in \mathcal{A}$. If $x \in A$ then $x \in U$ for some $U \in \mathcal{A}$, but $y \notin {}^*U$, so $y \notin m(x)$. □
2.3 Examples
1. In the discrete topology the only compact subsets are finite.
2. All subsets in the trivial topology are compact.
3. In the finite complement topology for $N$, every subset $A$ is compact. For if $A \neq \emptyset$ and $y \in {}^*A$ then either $y \in A$ or $y \in {}^*N_\infty$ (Corollary 7.6 of Chapter I). In the first case $y \in m(y)$, and in the second case $y \in m(x)$ for any $x \in N$ and, in particular, for some $x \in A$. Recall that a proper subset must be finite to be closed in this topology, so there are compact subsets which are not closed in this non-Hausdorff topology.
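As a further illustration (our addition, not part of the original examples), Robinson's criterion can be checked by hand on three subsets of $R$ with the usual topology. The sketch assumes, as in Chapter I, that the monad of a standard real $x$ consists of the hyperreals infinitely close to $x$ and that st denotes the standard-part map.

```latex
% Robinson's criterion for three subsets of R with the usual topology.
% Here m(x) = { y : y \simeq x } and st is the standard-part map of Chapter I.
\begin{enumerate}
  \item $A = [0,1]$. If $y \in {}^*A$ then $0 \le y \le 1$ by transfer, so $y$ is
        finite and $x = \mathrm{st}(y)$ satisfies $0 \le x \le 1$. Thus
        $y \in m(x)$ with $x \in A$, and $A$ is compact.
  \item $A = (0,1)$. For a positive infinitesimal $\varepsilon$ we have
        $\varepsilon \in {}^*A$, but the only standard point near $\varepsilon$
        is $0 \notin A$. Thus $A$ is not compact.
  \item $A = [0,\infty)$. An infinite $\omega \in {}^*N_\infty$ lies in ${}^*A$
        but is near no standard point at all, so $A$ is not compact.
\end{enumerate}
```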
We use Robinson's theorem to give proofs of the following standard results.
2.4 Theorem If $X$ is compact in the topology $\mathcal{T}$ and $A \subseteq X$ is closed, then $A$ is compact.
Proof: Let $y \in {}^*A$. Since $X$ is compact there is an $x \in X$ with $y \in m(x)$, whence $x \in A$ by 1.7(ii), so $A$ is compact. □
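2.5 Theorem If $(X,\mathcal{T})$ is Hausdorff and $A \subseteq X$ is compact, then $A$ is closed.
Proof: Let $x \in A'$ and suppose that $y \in m(x)$, $y \in {}^*A$. Since $A$ is compact, $y \in m(\hat{x})$ for some $\hat{x} \in A$, but then $m(x) \cap m(\hat{x}) \neq \emptyset$ with $x \neq \hat{x}$, contradicting the fact that $(X,\mathcal{T})$ is Hausdorff. Hence $m(x) \cap {}^*A = \emptyset$ for each $x \in A'$, and so $A$ is closed. □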
2.6 Theorem If $(X,\mathcal{T})$ and $(Y,\mathcal{S})$ are topological spaces and $f: X \to Y$ is continuous, then $f[K]$ is compact for each compact $K \subseteq X$.
Proof: Exercise. □
2.7 Theorem If $(X,\mathcal{T})$ is compact, $(Y,\mathcal{S})$ is Hausdorff, and $f: X \to Y$ is continuous, then
(i) $f$ is closed (i.e., takes closed sets onto closed sets),
(ii) if $f$ is one-to-one then it is a homeomorphism.
Proof: (i) Follows from 2.4-2.6. (ii) We may assume that $f[X] = Y$. We need only show that $f$ is open (i.e., takes open sets onto open sets). But if $U$ is open in $X$, then $U'$ is closed. Since $f$ is one-to-one, $f[U] = Y - f[U']$, which is open by (i). □
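The compactness hypothesis in Theorem 2.7(ii) cannot simply be dropped. The following standard counterexample is our addition (the half-open interval and the circle map are not taken from the text); it is only a sketch, phrased in the monad language used above.

```latex
% Assumed example (not from the text): X = [0, 2\pi) is not compact.
\[
  f : [0,2\pi) \to S^1 = \{(u,v) \in R^2 : u^2 + v^2 = 1\},
  \qquad f(t) = (\cos t, \sin t).
\]
% f is continuous, one-to-one, and onto. Take y = 2\pi - \varepsilon with
% \varepsilon a positive infinitesimal, so that y \in {}^*[0,2\pi).
\[
  {}^*f(y) = (\cos\varepsilon, -\sin\varepsilon) \simeq (1,0) = f(0),
  \qquad\text{but}\qquad y \not\simeq 0 .
\]
% Hence f^{-1} is not continuous at (1,0): a point near f(0) is carried by
% {}^*(f^{-1}) to a point that is not near 0.
```

Note that $y$ is near no standard point of $[0,2\pi)$, which is exactly the failure of Robinson's criterion for this domain.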
The real power of Robinson's theorem is illustrated by the proofs of the following standard results. The standard proofs of these results as given in Kelley [20] are somewhat involved.
2.8 Tychonoff's Theorem If $(X_i,\mathcal{T}_i)$ $(i \in I)$ are compact spaces and $X = \prod X_i\ (i \in I)$, then $X$ is compact in the product topology $\mathcal{T}$.
Proof: Let $y \in {}^*X$. Then $y(i) \in {}^*X_i$ for (standard) $i \in I$ and so $y(i)$ is near a standard point $x_i \in X_i$ for each $i \in I$. That is, $y(i) \in m_i(x_i)$, where $m_i(x_i)$ denotes the monad of $x_i$ in $(X_i,\mathcal{T}_i)$. By 1.19, $y \in m(x)$, where $m(x)$ is the monad in $\mathcal{T}$ of the point $x \in X$ defined by $x(i) = x_i$. □
Tychonoff's theorem is used in many proofs in analysis. One can usually replace these standard proofs by simpler nonstandard ones which use Robinson's theorem directly (for example, see the proof of Alaoglu's theorem, 4.22, below).
*2.9 Alexander's Theorem If $\mathcal{S}$ is a subbase for the topology of $(X,\mathcal{T})$ and every cover of $X$ by members of $\mathcal{S}$ has a finite subcover, then $X$ is compact.
Proof: (Hirschfeld [15]) Suppose $X$ is not compact. By 2.2 there exists a $y \in {}^*X$ which is not near-standard and so for each $x \in X$ there is an open set $U_x$ with $x \in U_x$ and $y \notin {}^*U_x$. Since each $U_x$ may be taken to be a finite intersection of members $V_i$ of $\mathcal{S}$, one of the $^*V_i$ must omit $y$, so we may as well assume that $U_x \in \mathcal{S}$ for each $x$. Then the covering $\{U_x : x \in X\}$ cannot have a finite subcover $U_1,\ldots,U_n$, for in that case $^*X = {}^*U_1 \cup \cdots \cup {}^*U_n$ and $y \in {}^*U_i$ for some $i$, $1 \le i \le n$ (contradiction). □
Exercises III.2
1. Prove Theorem 2.6.
2. Let $(X,\mathcal{T})$ and $(Y,\mathcal{S})$ be topological spaces and suppose that $(Y,\mathcal{S})$ is compact Hausdorff. Show that $f: X \to Y$ is continuous iff the graph $G_f = \{(x,f(x)) \in X \times Y : x \in X\}$ of $f$ is closed in $X \times Y$.
3. Let $X$ have the topologies $\mathcal{T}$ and $\mathcal{S}$, and suppose that $(X,\mathcal{T})$ is compact Hausdorff. Show that (a) if $\mathcal{S}$ is strictly contained in $\mathcal{T}$ then $\mathcal{S}$ is not Hausdorff, (b) if $\mathcal{T}$ is strictly contained in $\mathcal{S}$ then $\mathcal{S}$ is not compact.
4. Show that if $(X,\mathcal{T})$ is compact then there is a hyperfinite set $F \subseteq {}^*X$ with $X \subseteq F \subseteq \mathrm{ns}(^*X)$ such that $X = \mathrm{st}(F)$.
5. Suppose that $(X,\mathcal{T})$ is compact. Show that if $(A_n : n \in N)$ is a sequence of nonempty closed subsets of $X$ which is monotone, i.e., $A_1 \supseteq A_2 \supseteq \cdots$, then $\bigcap A_n\ (n \in N) \neq \emptyset$.
6. The following problem is derived from a result of A. Abian (see [1]): Let $p_n$ be a sequence of polynomials and $x_n$ a sequence of variables so that, for each $n$, $p_n = p_n(x_1,x_2,\ldots,x_n)$ is a function of the first $n$ variables. Let $I_n$ be a sequence of closed and bounded intervals in $R$. Assume that for each $n$ there are values $a_i^n \in I_i$ for $1 \le i \le n$ such that, for each $i \le n$, $p_i(a_1^n,a_2^n,\ldots,a_i^n) = 0$. Show that there are values $a_i \in I_i$ for $1 \le i < \infty$ such that, for each $n \in N$, $p_n(a_1,a_2,\ldots,a_n) = 0$.
7. (Luxemburg [36]). Let $(X,\mathcal{T})$ be a regular Hausdorff space (see Exercise 1.5). If $A$ is an internal set in a $\kappa$-saturated enlargement of $V(X)$ where $\kappa > \mathrm{card}\,\mathcal{T}$, and $A \subseteq \mathrm{ns}(^*X)$, then $\mathrm{st}(A) = \{x \in X : \text{there exists } y \in A \text{ with } x = \mathrm{st}(y)\}$ is compact.
III.3 Metric Spaces
The most important topologies which occur in analysis are those associated with a metric or distance function. The corresponding spaces are called metric spaces. 3.1 Definition A metric space is a pair ( X , d), where X is a set and d is a map from X x X into the nonnegative reals satisfying (for all x, y, z E X )
(a) $d(x,y) = 0$ iff $x = y$,
(b) $d(x,y) = d(y,x)$,
(c) (triangle inequality) $d(x,z) \le d(x,y) + d(y,z)$.
Each metric space $(X,d)$ can be made into a topological space $(X,\mathcal{T}_d)$ by specifying that a set $U \in \mathcal{T}_d$ if, for each $x \in U$, there is an $\varepsilon > 0$ in $R$ so that the open $\varepsilon$-ball $B_\varepsilon(x) = \{y \in X : d(x,y) < \varepsilon\} \subseteq U$. The resulting collection $\mathcal{T}_d$ is a topology (standard exercise). When the metric $d$ and associated topology $\mathcal{T}_d$ are understood we simply call $X$ rather than $(X,d)$ or $(X,\mathcal{T}_d)$ the metric space. Note that the open $\varepsilon$-balls about a point $x \in X$ form a local base at $x$.
3.2 Examples
1. $R$ is a metric space with the usual metric $d(x,y) = |x - y|$ for $x, y \in R$.
2. $R$ is a metric space with the metric $d(x,y) = |x - y|/(1 + |x - y|)$ (check).
3. Let $X$ be any set and define $d(x,y) = 1$ if $x \neq y$ and $d(x,y) = 0$ otherwise. It is easy to see that $d$ is a metric; it is called the discrete metric.
4. $R^n$ is a metric space under each of the following metrics [where $x = (x_1,\ldots,x_n)$, $y = (y_1,\ldots,y_n)$]:
(a) $d_1(x,y) = \sum_{i=1}^{n} |x_i - y_i|$,
(b) $d_\infty(x,y) = \max\{|x_i - y_i| : 1 \le i \le n\}$.
Properties (a) and (b) of Definition 3.1 are trivial; to check property (c) for metric (a) we have [with $z = (z_1,\ldots,z_n)$]
$$d_1(x,z) = \sum_{i=1}^{n} |x_i - z_i| \le \sum_{i=1}^{n} \bigl(|x_i - y_i| + |y_i - z_i|\bigr) = d_1(x,y) + d_1(y,z).$$
The triangle inequality (c) for metric (b) is left as an exercise.
5. Let $l_\infty$ (also often denoted by $l^\infty$) be the set of bounded sequences $x = (x_1,x_2,\ldots)$. Then $l_\infty$ is a metric space under the metric defined by $d_\infty(x,y) = \sup\{|x_i - y_i| : i \in N\}$ with $y = (y_1,y_2,\ldots)$. Note that $d_\infty(x,y)$ is finite for any $x, y \in l_\infty$ since, for any $i$, $|x_i - y_i| \le |x_i| + |y_i|$ and so
$$\sup\{|x_i - y_i| : i \in N\} \le \sup\{|x_i| : i \in N\} + \sup\{|y_i| : i \in N\}.$$
To check the triangle inequality (c) we have [with $z = (z_1,z_2,\ldots)$]
$$|x_i - z_i| \le |x_i - y_i| + |y_i - z_i| \le \sup\{|x_i - y_i| : i \in N\} + \sup\{|y_i - z_i| : i \in N\} = d_\infty(x,y) + d_\infty(y,z).$$
The result follows by taking sup over $i \in N$ on the left.

The nonstandard analysis of metric spaces will be carried out in an enlargement $V(^*S)$ of a superstructure $V(S)$ that contains $X$. We always assume that $S$ contains the set of real numbers $R$. In proving abstract theorems concerning a metric space $(X,d)$ we will write $x$ instead of $^*x$ for an element of $^*X$. In concrete examples, it might be important to investigate in more detail the structure of the elements of $^*X$. For example, if $X = l_\infty$ then we could take $S = R$, in which case elements of $l_\infty$ would appear as bounded real-valued functions on the integers. Often the set $S$ in a particular example will not be specified; the reader should be able to fill in the details. By transfer, the *-transform $^*d$ of $d$ satisfies the conditions of Definition 3.1 with $^*d$ replacing $d$ for all $x, y, z \in {}^*X$.
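The metrics of Examples 3.2 lend themselves to a quick numerical sanity check. The following is a minimal sketch in Python (our addition); the function names `d1`, `d_inf`, `d_bounded`, and `triangle_violations`, as well as the random sample, are illustrative assumptions and not notation from the text. A finite check of this kind proves nothing, but it can catch a wrongly stated metric.

```python
import itertools
import random

def d1(x, y):
    """Metric (a) of Example 3.2.4: sum of coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def d_inf(x, y):
    """Metric (b) of Example 3.2.4: largest coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))

def d_bounded(x, y):
    """Metric of Example 3.2.2 on R; its values always lie in [0, 1)."""
    t = abs(x - y)
    return t / (1 + t)

def triangle_violations(d, points, tol=1e-12):
    """Return all triples (x, y, z) with d(x, z) > d(x, y) + d(y, z) + tol."""
    return [
        (x, y, z)
        for x, y, z in itertools.product(points, repeat=3)
        if d(x, z) > d(x, y) + d(y, z) + tol
    ]

random.seed(0)  # fixed seed so the sample is reproducible
vectors = [tuple(random.uniform(-5, 5) for _ in range(3)) for _ in range(10)]
reals = [random.uniform(-5, 5) for _ in range(10)]

bad_d1 = triangle_violations(d1, vectors)
bad_d_inf = triangle_violations(d_inf, vectors)
bad_d_bounded = triangle_violations(d_bounded, reals)
```

All three lists of violations should come back empty. Since `d_bounded` never reaches 1, every subset of $R$ is bounded in the metric of Example 3.2.2, a fact the text uses after Definition 3.21.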
3.3 Definition Let $(X,d)$ be a metric space. Two points $x$ and $y$ in $^*X$ are near if $^*d(x,y) \simeq 0$. We write $x \simeq y$ if $x$ and $y$ are near and $x \not\simeq y$ otherwise. The monad of $x \in {}^*X$ is the set $m(x) = \{y \in {}^*X : y \simeq x\}$. Two points $x, y$ in $^*X$ are in the same galaxy if $^*d(x,y)$ is finite. The principal galaxy of $^*X$ is the one containing the standard points, and is denoted by $\mathrm{fin}(^*X)$. Points in $\mathrm{fin}(^*X)$ are called finite.
An easy exercise shows that for standard points $x \in X$ the monad $m(x)$ of Definition 3.3 coincides with the monad obtained from the associated topology $\mathcal{T}_d$. The metric monads, however, are defined for all points $x \in {}^*X$. It is also easy to see that the relation $\simeq$ is an equivalence relation.
3.4 Examples
1. In the metric of Example 3.2.1, $x \simeq y$ iff $x - y$ is infinitesimal.
2. In each of the metrics on $R^n$ defined in Example 3.2.4, $x \simeq y$ iff $x_i - y_i$ is infinitesimal for $1 \le i \le n$ (exercise).
3. Each element of $^*l_\infty$ is an internal function $x: {}^*N \to {}^*R$, and we usually write $x(i) = x_i$ and $x = (x_i : i \in {}^*N)$. The standard elements in $^*l_\infty$ are of the form $^*y$, where $y = (y_i : i \in N)$ is an element of $l_\infty$. Each $x \in {}^*l_\infty$ is *-bounded in the sense that there exists an $M \in {}^*R$ (which could be infinite) so that $|x_i| \le M$ for each $i \in {}^*N$ (exercise). In passing, note that there are external functions $z: {}^*N \to {}^*R$ which are also *-bounded; an example occurs when $z_i = 1$ for $i \in N$ and $z_i = 0$ for $i \in {}^*N_\infty$.
The real-valued function on $\mathcal{P}(N) \times l_\infty$ defined by $(A,x) \to \sup\{|x_i| : i \in A\}$ extends by transfer to a $^*R$-valued function on $^*\mathcal{P}(N) \times {}^*l_\infty$. We again denote the value of this extended function by $\sup\{|x_i| : i \in A\}$, where $A \subseteq {}^*N$ is internal and $x \in {}^*l_\infty$. Properties of the extended sup function can be obtained by transfer. For example, if $A$ and $B$ are internal subsets of $^*N$ and $A \subseteq B$ then $\sup\{|x_i| : i \in A\} \le \sup\{|x_i| : i \in B\}$. For each $x, y \in {}^*l_\infty$ we have $^*d_\infty(x,y) = \sup\{|x_i - y_i| : i \in {}^*N\}$.
The monads in $^*l_\infty$ are easily characterized. We claim that if $x, y \in {}^*l_\infty$ then $x \simeq y$ iff $x_i \simeq y_i$ for all $i \in {}^*N$. For suppose $x \simeq y$. Then, for any $i \in {}^*N$, $|x_i - y_i| \le \sup\{|x_i - y_i| : i \in {}^*N\} \simeq 0$. The converse is left as an exercise.
The finite elements in $^*l_\infty$ are those $x = (x_i : i \in {}^*N)$ for which there exists a finite $M$ (and hence even a standard $M$) in $^*R$ so that $|x_i| \le M$ for all $i \in {}^*N$. The value of $M$ depends on $x$.

All of the results of §III.1 and §III.2 are available for the topological space $(X,\mathcal{T}_d)$ associated with a metric space $(X,d)$. We concentrate in this section on some results which are special to metric spaces. The first few revolve around the notion of uniformity.
3.5 Definition Let $(X,d)$ and $(Y,\bar{d})$ be two metric spaces and $A$ a subset of $X$.
(a) A map $f: A \to Y$ is uniformly continuous on $A$ if, given $\varepsilon > 0$ in $R$, there exists a $\delta > 0$ in $R$ so that $\bar{d}(f(x),f(y)) < \varepsilon$ for all $x, y \in A$ for which $d(x,y) < \delta$.
(b) A sequence of maps $f_n: A \to Y$, $n \in N$, converges uniformly on $A$ to $f: A \to Y$ if, given $\varepsilon > 0$, there exists a $k \in N$ so that $\bar{d}(f_n(x),f(x)) < \varepsilon$ for all $n \ge k$ in $N$ and all $x \in A$.
In the following results, $(X,d)$ and $(Y,\bar{d})$ are metric spaces and $A$ is a subset of $X$. We use $\simeq$ to denote nearness in both $^*X$ and $^*Y$, letting the context settle any ambiguity.
3.6 Proposition The map $f: A \to Y$ is uniformly continuous on $A$ iff $^*f(x) \simeq {}^*f(y)$ whenever $x, y \in {}^*A$ and $x \simeq y$.
Proof: Let $f$ be uniformly continuous on $A$. Find the $\delta > 0$ for a prescribed $\varepsilon > 0$ from 3.5(a). By transfer, $^*\bar{d}(^*f(x),{}^*f(y)) < \varepsilon$ for all $x, y \in {}^*A$ for which $^*d(x,y) < \delta$. In particular, $^*\bar{d}(^*f(x),{}^*f(y)) < \varepsilon$ for all $x, y \in {}^*A$ for which $x \simeq y$. This is true for any $\varepsilon > 0$ in $R$, and so $^*f(x) \simeq {}^*f(y)$ for all $x, y \in {}^*A$ for which $x \simeq y$.
Conversely, suppose $^*f(x) \simeq {}^*f(y)$ whenever $x, y \in {}^*A$ and $x \simeq y$. Let $\varepsilon > 0$ in $R$ be given. Then the internal sentence
$$(\exists \delta \in {}^*R)\bigl[\delta > 0 \wedge (\forall x, y \in {}^*A)[{}^*d(x,y) < \delta \to {}^*\bar{d}({}^*f(x),{}^*f(y)) < \varepsilon]\bigr]$$
is true in $V(^*S)$ (choose $\delta$ to be infinitesimal). That $f$ is uniformly continuous follows by transfer to $V(S)$. □
3.7 Proposition The sequence $f_n: A \to Y$ converges uniformly on $A$ to $f: A \to Y$ iff $^*f_n(x) \simeq {}^*f(x)$ for all $n \in {}^*N_\infty$ and all $x \in {}^*A$.
Proof: Exercise. □
3.8 Theorem If $f: A \to Y$ is continuous and $A$ is compact, then $f$ is uniformly continuous on $A$.
Proof: Let $x, y \in {}^*A$ with $x \simeq y$. Then $x$ and $y$ are near a standard point $z \in A$ since $A$ is compact, and $^*f(x) \simeq f(z) \simeq {}^*f(y)$ since $f$ is continuous at $z$. The result follows from Proposition 3.6. □
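Proposition 3.6 also gives a quick way to see that uniform continuity fails, and why the compactness hypothesis in Theorem 3.8 matters. The worked computation below is our addition; the choice $f(x) = x^2$ and the particular points are assumptions made for illustration.

```latex
% f(x) = x^2 on A = R: pick \omega infinite and compare x = \omega, y = \omega + 1/\omega.
\[
  x \simeq y, \qquad
  {}^*f(y) - {}^*f(x) = \Bigl(\omega + \tfrac{1}{\omega}\Bigr)^{2} - \omega^{2}
  = 2 + \tfrac{1}{\omega^{2}} \not\simeq 0 ,
\]
% so f is not uniformly continuous on R by Proposition 3.6.
% On the compact set A = [0,1], however, x, y \in {}^*A and x \simeq y give
\[
  |{}^*f(x) - {}^*f(y)| = |x - y|\,|x + y| \le 2\,|x - y| \simeq 0 ,
\]
% in agreement with Theorem 3.8.
```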
3.9 Theorem If $f_n: A \to Y$ is a sequence of continuous functions which converges uniformly on $A$ to $f: A \to Y$, then $f$ is continuous.
Proof: Let $x \in A$ and $y \in {}^*A$ with $y \simeq x$. We need to show that $^*f(y) \simeq f(x)$. Now $^*f_n(y) \simeq {}^*f_n(x)$ for each $n \in N$ and so, by Theorem 7.3 of Chapter II, $^*f_\omega(y) \simeq {}^*f_\omega(x)$ for some $\omega \in {}^*N_\infty$. By Proposition 3.7, $^*f_\omega(y) \simeq {}^*f(y)$ and $^*f_\omega(x) \simeq {}^*f(x)$, so $^*f(y) \simeq {}^*f(x) = f(x)$. □
Next we present the notion of a complete metric space. To do so we need the obvious generalizations of the definitions in §I.8.
3.10 Definition Let $(X,d)$ be a metric space, and let $(s_n : n \in N)$ be a sequence of points in $X$. Then
(i) $(s_n)$ converges to $s$ if, given $\varepsilon > 0$ in $R$, there is a $k \in N$ so that $d(s_n,s) < \varepsilon$ if $n \ge k$,
(ii) $(s_n)$ is a Cauchy sequence if, given $\varepsilon > 0$ in $R$, there is a $k \in N$ so that $d(s_n,s_m) < \varepsilon$ if $n, m \ge k$,
(iii) $s$ is a limit point of $(s_n)$ if, for each $\varepsilon > 0$ in $R$ and each $k \in N$, there is an $n > k$ so that $d(s_n,s) < \varepsilon$.
The reader will easily be able to prove that $(s_n)$ converges to $s$ iff $^*s_n \simeq s$ for all $n \in {}^*N_\infty$, $(s_n)$ is a Cauchy sequence iff $^*s_n \simeq {}^*s_m$ for all $n, m \in {}^*N_\infty$, and $s$ is a limit point of $(s_n)$ iff $^*s_n \simeq s$ for some $n \in {}^*N_\infty$.
3.11 Definition $(X,d)$ is complete if each Cauchy sequence in $X$ converges to a point in $X$.
3.12 Examples
1. The set $R$ with the usual metric is complete by 8.5 of Chapter I.
2. Any set $X$ with the discrete metric is complete.
3. $R^n$ with each metric of Example 3.2.4 is complete. For example, let $(x^k)$ be a Cauchy sequence in $(R^n,d_\infty)$. Then for each $i$, $1 \le i \le n$, $|x_i^k - x_i^l| \le d_\infty(x^k,x^l)$. Thus $(x_i^k : k \in N)$ is a Cauchy sequence for each $i$ and so converges to a point $x_i$ in $R$. The point $x = (x_1,\ldots,x_n)$ in $R^n$ is the limit of $(x^k)$ in $R^n$.
We now use nonstandard analysis to prove some abstract theorems on completeness. The nonstandard characterization of completeness requires the following notion.
3.13 Definition Let $(X,d)$ be a metric space. A point $y \in {}^*X$ is a pre-near-standard point if for every standard $\varepsilon > 0$ there is a standard $x \in X$ with $^*d(x,y) < \varepsilon$.
3.14 Proposition A metric space $(X,d)$ is complete iff every pre-near-standard point $y \in {}^*X$ is near-standard.
Proof: Suppose $(X,d)$ is complete. If $y$ is pre-near-standard, find a sequence $s_n \in X$ so that $^*d(y,s_n) < 1/n$. Then $(s_n)$ is a Cauchy sequence with limit $s$ and $y \simeq {}^*s_n \simeq s$ if $n \in {}^*N_\infty$.
Conversely, suppose every pre-near-standard point is near-standard, and let $(s_n)$ be a Cauchy sequence. Given $\varepsilon > 0$, find the associated $k \in N$ from Definition 3.10. Then $^*d(^*s_n,s_k) < \varepsilon$ if $n \in {}^*N_\infty$. Thus $^*s_n$ is pre-near-standard for every $n \in {}^*N_\infty$ and each such $^*s_n$ must be near-standard to the same $s \in X$ (check). The sequence $(s_n)$ must converge to $s$. □
3.15 Corollary A closed subset $(A,d)$ of a complete metric space $(X,d)$ is complete.
Proof: Let $y$ be a pre-near-standard point in $^*A$. Then $y \simeq x$ for some $x \in X$ since $(X,d)$ is complete. But $x \in A$ by Proposition 1.10 since $A$ is closed. □
Using this characterization, we will show that it is possible to adjoin “ideal” elements to a metric space $(X,d)$ so that the result is a complete metric space in which $(X,d)$ is densely embedded.
3.16 Definition Let $(X,d)$ be a metric space. A metric space $(\hat{X},\hat{d})$ is a completion of $(X,d)$ if $(\hat{X},\hat{d})$ is complete, there is an isometric embedding $\phi: X \to \hat{X}$ [i.e., $d(x,y) = \hat{d}(\phi(x),\phi(y))$ for all $x, y \in X$, whence $\phi$ is one-to-one], and $\phi[X]$ is dense in $\hat{X}$.
3.17 Theorem Any metric space $(X,d)$ has a completion $(\hat{X},\hat{d})$.
Proof: We let $X'$ be the set of pre-near-standard points in $^*X$, and $\hat{X}$ be the set of equivalence classes of $X'$ under the relation of nearness $\simeq$ (an equivalence relation); thus the elements of $\hat{X}$ are monads $m(x')$ of pre-near-standard points $x' \in {}^*X$. Also define $\hat{d}(m(x'),m(y')) = \mathrm{st}(^*d(x',y'))$ [note that $^*d(x',y')$ is finite for any pre-near-standard points $x', y'$]. This metric is independent of the pre-near-standard points chosen to represent the elements of $\hat{X}$, for if $x' \simeq x'_1$ and $y' \simeq y'_1$ then $^*d(x',y') \simeq {}^*d(x'_1,y'_1)$ (Exercise 6). The map $\phi: X \to \hat{X}$ defined by $\phi(x) = m(x)$ is obviously an isometric embedding. Also $\phi[X]$ is dense in $\hat{X}$. For if $m(x') \in \hat{X}$, where $x'$ is pre-near-standard, then given $\varepsilon > 0$ there exists an $x \in X$ so that $^*d(x',x) < \varepsilon$ and then
$$\hat{d}(m(x'),m(x)) = \mathrm{st}(^*d(x',x)) \le \varepsilon.$$
To show completeness, let $(m(x'_n) : n \in N)$ be a Cauchy sequence in $(\hat{X},\hat{d})$, with $x'_n \in X'$. Since each $x'_n \in X'$, there are elements $x_n \in X$ with $^*d(x_n,x'_n) < 1/n$ for each $n \in N$. Given $\varepsilon > 0$ in $R$, there exists a $k \in N$ so that $\hat{d}(m(x'_n),m(x'_m)) < \varepsilon$ and hence $^*d(x'_n,x'_m) < \varepsilon$ if $n, m \ge k$. Then $d(x_n,x_m) = {}^*d(x_n,x_m) \le 2/n + \varepsilon$ if $m \ge n \ge k$ in $N$ by the triangle inequality. Again by transfer, $^*d(^*x_n,{}^*x_m) \le 2/n + \varepsilon$ if $m \ge n \ge k$ in $^*N$. In particular, if $\omega \in {}^*N_\infty$, $^*d(x_n,{}^*x_\omega) \le 2/n + \varepsilon$ if $n \ge k$, and so $^*x_\omega$ is pre-near-standard. Therefore $^*d(x'_n,{}^*x_\omega) \le {}^*d(x'_n,x_n) + {}^*d(x_n,{}^*x_\omega) < 3/n + \varepsilon$ if $n \ge k$, yielding $\hat{d}(m(x'_n),m(^*x_\omega)) \le 3/n + \varepsilon$ if $n \ge k$. Thus $(m(x'_n))$ converges to $m(^*x_\omega)$. □
As an example, note that the rationals $Q$ form a metric space under the usual metric $d(x,y) = |x - y|$, $x, y \in Q$. The completion $(\hat{Q},\hat{d})$ is isomorphic to the real metric space $(R,d)$.
Recall that a subset of the real line is compact iff it is closed and bounded. In arbitrary metric spaces there is a similar relationship between compactness, completeness, and total boundedness, the last being a generalization of boundedness.
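To see concretely why $Q$ with the usual metric needs completing, one can exhibit a Cauchy sequence of rationals with no rational limit. The sketch below is our addition; it uses Newton's iteration for $\sqrt{2}$ (an assumed example, not taken from the text) with exact rational arithmetic, and the helper names `newton_sqrt2` and `tail_diameter` are hypothetical.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Rational Newton iterates s -> (s + 2/s)/2 starting from s = 1."""
    s = Fraction(1)
    seq = [s]
    for _ in range(steps):
        s = (s + 2 / s) / 2
        seq.append(s)
    return seq

def tail_diameter(seq, k):
    """Largest distance |s_m - s_n| among the computed terms with index >= k."""
    tail = seq[k:]
    return max(tail) - min(tail)

seq = newton_sqrt2(6)
# Shrinking tail diameters are the finite shadow of the Cauchy condition 3.10(ii).
diameters = [tail_diameter(seq, k) for k in range(len(seq))]
```

Every term is rational and none squares to 2, yet the tail diameters shrink rapidly; in the completion the sequence converges to the point corresponding to $\sqrt{2}$.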
3.18 Definition A metric space $(X,d)$ is totally bounded if, to each $\varepsilon > 0$ in $R$, there corresponds a finite covering $\{B_\varepsilon(x_i) : 1 \le i \le n\}$ of $X$ by open $\varepsilon$-balls [each $B_\varepsilon(x) = \{y \in X : d(x,y) < \varepsilon\}$].
3.19 Proposition A metric space $(X,d)$ is totally bounded iff every point of $^*X$ is pre-near-standard.
Proof: Suppose $(X,d)$ is totally bounded. Let $\varepsilon > 0$ be given and find the corresponding points $x_i$, $1 \le i \le n$, so that $X = \bigcup B_\varepsilon(x_i)\ (1 \le i \le n)$. By transfer, $^*X = \bigcup {}^*B_\varepsilon(x_i)\ (1 \le i \le n)$, and so every point of $^*X$ is pre-near-standard. The converse is left to the reader. □
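The finite covering in Definition 3.18 can be produced greedily whenever one exists. The following sketch is our addition: it works on a finite grid standing in for the unit square with the metric $d_\infty$, and the names `greedy_net`, `grid`, and `centers` are illustrative assumptions. It is a finite illustration only, not a proof of total boundedness.

```python
from fractions import Fraction

def d_inf(p, q):
    """Sup metric on the plane, computed exactly with rationals."""
    return max(abs(a - b) for a, b in zip(p, q))

def greedy_net(points, dist, eps):
    """Pick centers one by one; skip any point already within eps of a center."""
    centers = []
    for p in points:
        if all(dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers

# Finite sample of the unit square: the grid with spacing 1/20.
grid = [(Fraction(i, 20), Fraction(j, 20)) for i in range(21) for j in range(21)]
eps = Fraction(1, 4)
centers = greedy_net(grid, d_inf, eps)
```

By construction every grid point lies in the open `eps`-ball about some center, and distinct centers are at least `eps` apart. The same greedy step reappears in part (ii) of the proof of Theorem 3.24, where it never terminates if the space is not totally bounded.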
3.20 Theorem A metric space $(X,d)$ is compact iff it is complete and totally bounded.
Proof: Suppose $(X,d)$ is compact. Then every point $y \in {}^*X$ is near a point in $X$, so $(X,d)$ is complete and totally bounded by 3.14 and 3.19, respectively. Conversely, suppose $(X,d)$ is complete and totally bounded. If $y \in {}^*X$, then $y$ is pre-near-standard by 3.19 and hence near-standard by 3.14. □
One might expect that “totally bounded” may be replaced by “bounded” in this theorem, where boundedness is defined as follows.
3.21 Definition A set $A$ in a metric space $(X,d)$ is bounded if there is a point $x_0 \in X$ and a number $M$ so that $d(x,x_0) \le M$ for all $x \in A$.
Example 3.2.2 and the following example show that boundedness is not enough for Theorem 3.20.
3.22 Example Let $B_1 = \{x \in l_\infty : d_\infty(x,\theta) \le 1\}$ be the “unit ball” in $(l_\infty,d_\infty)$, where $\theta = (0,0,\ldots)$. It is easy to see that $B_1$ is closed and hence is complete when regarded as a metric space with the metric induced by $d_\infty$ (Exercises 8, 14). Also, $B_1$ is obviously bounded. Now consider the element $x = (x_i : i \in {}^*N) \in {}^*B_1$ which is zero except at some infinite integer $\omega$ where $x_\omega = 1$. Then $x$ is not near-standard. For if $x \simeq {}^*y$ for some standard $y = (y_i : i \in N)$ then $0 = x_i \simeq y_i$ for at least all $i \in N$, and so $y_i = 0$ for all $i \in N$. By transfer, $^*y_i = 0$ for all $i \in {}^*N$, and so $^*y_\omega \not\simeq x_\omega$. Thus $B_1$ is complete and bounded but, by Robinson's theorem, not compact.
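The standard counterpart of Example 3.22 is the sequence of unit sequences $e_i$ (1 in position $i$, 0 elsewhere): all lie in $B_1$ and any two are at $d_\infty$-distance 1, so no ball of radius 1/2 contains two of them and $B_1$ is not totally bounded. The point $x$ of the example is the term of index $\omega$ in the *-transform of this sequence. The sketch below is our addition; it checks the distances on truncations of length 50, and both the truncation and the names `unit_sequence` and `d_sup` are assumptions made so that the code can run.

```python
def unit_sequence(i, length):
    """First `length` terms of e_i: 1 in position i, 0 elsewhere."""
    return tuple(1 if j == i else 0 for j in range(length))

def d_sup(x, y):
    """Sup metric restricted to finitely many coordinates."""
    return max(abs(a - b) for a, b in zip(x, y))

length = 50
zero = (0,) * length
units = [unit_sequence(i, length) for i in range(length)]

# Set of all distances between distinct unit sequences.
pairwise = {
    d_sup(units[i], units[j])
    for i in range(length)
    for j in range(i + 1, length)
}
```

The set `pairwise` contains the single value 1, so a cover of these 50 points by balls of radius 1/2 needs 50 balls; letting the length grow shows that no finite number of such balls covers $B_1$.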
To end this section we consider another compactness criterion, which is especially important in applications. In many situations one can obtain a sequence $(x_n)$ of points (in a given topological space $X$) which has certain desirable properties, e.g., giving better and better approximate solutions to a set of equations. One would like to assert that a subsequence of the given sequence converges to a point in the space (in order, e.g., to produce an exact solution). Though the criterion of compactness in the sense of §III.2 is not always of help in constructing such a subsequence, if the assertion is nevertheless always true we call the space sequentially compact.
3.23 Definition A topological space $X$ is sequentially compact if from each sequence $(x_n)$ in $X$ it is possible to select a subsequence which converges to a point $x \in X$.
It turns out that compactness is equivalent to sequential compactness in a metric space. Unfortunately this is not true in general topological spaces, as we shall see in §III.7.
3.24 Theorem A metric space $(X,d)$ is compact iff it is sequentially compact.
Proof: (i) Suppose that $(X,d)$ is compact and let $(x_n)$ be a sequence in $X$. By Exercise 9 there is a point $x_0$ which is a limit point of $(x_n)$. We will show that some subsequence of $(x_n)$ converges to $x_0$. Consider the open ball $B_1 = \{x \in X : d(x,x_0) < 1\}$. Since $x_0$ is a limit point of $(x_n)$ there is an $x_{n_1} \in B_1$. Similarly there is an $x_{n_2}$ in $B_{1/2} = \{x \in X : d(x,x_0) < 1/2\}$ with $n_2 > n_1$. Continuing this process inductively, we obtain a subsequence $(x_{n_k})$ with $x_{n_k} \in B_{1/k} = \{x \in X : d(x,x_0) < 1/k\}$; clearly $(x_{n_k})$ converges to $x_0$.
(ii) Suppose $(X,d)$ is sequentially compact. Then it is obvious that $(X,d)$ is complete, so that if $(X,d)$ is not compact, it must not be totally bounded (Theorem 3.20). Thus there exists some $\varepsilon > 0$ so that no finite collection $\{B_\varepsilon(y_i) : 1 \le i \le n\}$ covers $X$. Let $x_1 \in X$ be a given point. Then there is an $x_2$ with $d(x_1,x_2) \ge \varepsilon$. Similarly there is an $x_3$ with $d(x_1,x_3) \ge \varepsilon$ and $d(x_2,x_3) \ge \varepsilon$. Continuing in this way, we construct a sequence $(x_n)$ with $d(x_n,x_m) \ge \varepsilon$ for any $n \neq m$ in $N$. Clearly $(x_n)$ can have no convergent subsequence. □
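The inductive selection in part (i) of the proof can be sketched as a short program. This is our addition and makes several assumptions for the sake of a runnable example: the sample sequence `x` (fractional parts of multiples of the golden-ratio conjugate), the claim that 0.5 is one of its limit points, and the helper name `convergent_subsequence` are all illustrative and not from the text.

```python
def x(n):
    """Sample sequence in [0, 1): fractional part of n times 0.618..."""
    return (n * 0.6180339887498949) % 1.0

def convergent_subsequence(seq, x0, k_max, search_limit=10**6):
    """Return indices n_1 < n_2 < ... < n_{k_max} with |seq(n_k) - x0| < 1/k."""
    indices = []
    n = 0
    for k in range(1, k_max + 1):
        n += 1  # forces n_k > n_{k-1}
        while abs(seq(n) - x0) >= 1.0 / k:
            n += 1
            if n > search_limit:
                raise RuntimeError("no further term found; is x0 a limit point?")
        indices.append(n)
    return indices

indices = convergent_subsequence(x, 0.5, 20)
```

The search in the inner loop succeeds at every stage exactly because $x_0$ is a limit point in the sense of Definition 3.10(iii); the radii $1/k$ play the role of the countable local base discussed after the proof.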
The procedure used in part (i) of the proof in going from a limit point to a convergent subsequence does not work in a general topological space. It uses in an essential way the fact that the neighborhood system of $x_0$ has a countable base. A topological space is said to satisfy the first axiom of countability if the neighborhood system of each point has a countable base. Included in such spaces are the metric spaces. Clearly, a subset $A$ in a metric or first countable space is closed iff $A$ contains the limit of any convergent sequence in $A$.

Exercises III.3
1. Show that $d_\infty(x,y)$ satisfies the triangle inequality.
2. Show that for the metrics on $R^n$ defined in Example 3.2.4, $x \simeq y$ iff $x_i \simeq y_i$ for $1 \le i \le n$.
3. Show that for each $x \in {}^*l_\infty$ there is an $M \in {}^*N$ such that $|x_i| \le M$ for all $i \in {}^*N$.
4. Prove that if $x, y$ are internal sequences and $x_i \simeq y_i$ for all $i \in {}^*N$ then $\sup\{|x_i - y_i| : i \in {}^*N\} \simeq 0$.
5. Prove Proposition 3.7.
6. (a) Show that if $a, b, c$ are points in a metric space $(X,d)$ then $|d(a,c) - d(b,c)| \le d(a,b)$.
(b) Show that if $x' \simeq x'_1$ and $y' \simeq y'_1$ in $(^*X,{}^*d)$ then $^*d(x',y') \simeq {}^*d(x'_1,y'_1)$.
7. Show that if $(X,d)$ is a metric space and each point of $^*X$ is pre-near-standard then $(X,d)$ is totally bounded.
8. Show that $B_1 = \{x \in l_\infty : d_\infty(x,\theta) \le 1\}$ is closed.
9. Show that a sequence in a compact metric space has a limit point.
10. Let $(x_n)$ be a sequence in a compact metric space $(X,d)$. Fix $\omega \in {}^*N_\infty$. Use the downward transfer principle and the fact that $^*x_\omega$ is near-standard to prove there is a subsequence $x_{n_k}$ that converges to $\mathrm{st}(^*x_\omega)$.
11. Use Theorem 3.24 to prove Robinson's result: If $(X,d)$ is a metric space and $A$ is an internal set in $^*X$ such that each $a \in A$ is near-standard, then $\mathrm{st}(A) = \{x \in X : \text{there exists an } a \in A \text{ with } x \simeq a\}$ is compact. (The generalization for regular topological spaces (Exercise 2.7) is due to Luxemburg [36].)
12. Prove that a Cauchy sequence in a metric space $(X,d)$ is bounded.
13. Use Exercise 12 to show that $(X,d)$ is complete if every finite point in $^*X$ is near-standard.
14. Show that $(l_\infty,d_\infty)$ is complete.
15. ($\varepsilon\delta$-Continuity, *-Continuity, and S-Continuity) Let $(X,d)$ be a metric space, $A$ be a subset of $^*X$, and $f: A \to {}^*R$ be a function. We say that $f$ is $\varepsilon\delta$-continuous (*-continuous) at $x \in A$ if, for each $\varepsilon > 0$ in $R$ ($^*R$), there is a $\delta > 0$ in $R$ ($^*R$) such that $|f(x) - f(y)| < \varepsilon$ if $y \in A$ and $^*d(x,y) < \delta$. We say that $f$ is S-continuous at $x \in A$ if $f(y) \simeq f(x)$ for every $y \in A$ with $y \simeq x$.
(a) Let $A = {}^*X$ and $f = {}^*g$, where $g: X \to R$. Show that if $g$ is continuous at each $x \in X$, then $f$ is *-continuous and S-continuous at each $x \in {}^*X$.
(b) Show that if $f$ is $\varepsilon\delta$-continuous at $x \in A$ then $f$ is S-continuous at $x \in A$ but not necessarily vice versa.
(c) Suppose that $f$ is internal. Show that $f$ is S-continuous at $x \in A$ iff $f$ is $\varepsilon\delta$-continuous at $x$. (Hint: Use the spillover principle.)
(d) Show that there are internal functions $f$ on $^*R$ which are *-continuous but not S-continuous at zero and vice versa. (Hint: Look for examples on $X = R$ with the usual metric.)
16. Let $A$ be an internal set in $^*X$ where $(X,d)$ is a metric space and let $f: A \to {}^*R$ be internal. Show that $f$ is S-continuous at each point $x \in A$ iff, for every (standard) $\varepsilon > 0$ in $R$, there is a $\delta > 0$ in $R$ such that $|f(x) - f(y)| < \varepsilon$ for all $x, y \in A$ for which $^*d(x,y) < \delta$. (Hint: Again use the spillover principle.)
17. Let $X$ be a compact metric space. Suppose that the internal function $f: {}^*X \to {}^*R$ is S-continuous at each point of $^*X$ and finite at each $x \in X$. Let $g$ be defined by $g(x) = {}^{\circ}f(x)$ for $x \in X$. Then $g$ is continuous on $X$ and $^*g(x) \simeq f(x)$ for all $x \in {}^*X$.
18. Two metrics on $X$ are equivalent if they define the same topology. Show that the metrics $d$ and $d'$ are equivalent if there exist positive (nonzero) constants $\alpha$ and $\beta$ in $R$ so that $\alpha d(x,y) \le d'(x,y) \le \beta d(x,y)$ for all $x, y \in X$.
19. Let $I = [0,1] \subset R$ and let $X$ be the set of all continuous functions $f: I \to I$ such that $|f(x) - f(y)| \le |x - y|$. Define $d(f,g) = \sup\{|f(x) - g(x)| : x \in I\}$ for $f, g \in X$. (a) Show that $(X,d)$ is a metric space. (b) Show that $(X,d)$ is compact.
20. Use Robinson's theorem to show that the set of elements $x$ of $l_1$ with $\|x\|_1 \le 1$ (the unit ball) is not compact.
21. (Lebesgue covering lemma). If $U_1,\ldots,U_n$ is an open covering of a compact metric space $(X,d)$, then there is an $\varepsilon > 0$ in $R$ such that the $\varepsilon$-ball $B_\varepsilon(x)$ about any $x \in X$ is entirely contained in one of the sets $U_i$, $1 \le i \le n$.
III.4 Normed Vector Spaces and Banach Spaces
The space $R$ is not only a metric space with the usual metric; it is also equipped with operations of addition and multiplication, and the distance function $d(x,y) = |x - y|$ involves these operations. In this section we generalize this simple example. The metric spaces will have the additional structure of a vector space, and the metric will come from a generalization of the absolute value. Many theorems and exercises are standard. As in §III.3, the nonstandard analysis will be carried out in an enlargement $V(^*S)$ of a suitable superstructure $V(S)$. The choice of $S$ will depend on the context and will not be mentioned explicitly.
4.1 Definition A (real)† vector space is a set $X$ on which are defined operations of vector addition ($+$) and scalar multiplication ($\cdot$) (so that we form the sum $x + y$ of two vectors $x, y \in X$ and the scalar multiple $a \cdot x$ of the vector $x \in X$ by $a \in R$). These operations satisfy the following conditions (as usual we often omit the dot in scalar multiplication):
(i) $x + y = y + x$ for all $x, y \in X$.
(ii) $(x + y) + z = x + (y + z)$ for all $x, y, z \in X$.
(iii) There is a vector $\theta \in X$ called the zero vector so that $x + \theta = x$ for all $x \in X$.
(iv) $a(x + y) = ax + ay$ if $a \in R$ and $x, y \in X$.
(v) $(a + b)x = ax + bx$ if $a, b \in R$ and $x \in X$.
(vi) $a(bx) = (ab)x$ if $a, b \in R$ and $x \in X$.
(vii) $0 \cdot x = \theta$, $1 \cdot x = x$ for all $x \in X$.

We write $(-1)x = -x$, so that $x + (-x) = \theta$ by (v) and (vii). The set $Y \subseteq X$ is a (linear) subspace of $X$ if $x, y \in Y$ and $a, b \in R$ imply $ax + by \in Y$. An easy exercise shows that the element $\theta$ is unique. A subspace $Y$ of a vector space $X$ is itself a vector space with the inherited operations of addition and scalar multiplication.
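The two small facts just quoted follow from the axioms in a line each. The derivation below is our addition, written out only to show which of the conditions (i)-(vii) is used at each step; the symbol $\theta'$ for a second candidate zero vector is an assumption of the sketch.

```latex
% Derivations from axioms (i)-(vii) of Definition 4.1.
% Additive inverse, using (vii), then (v), then (vii) again:
\[
  x + (-x) = 1\cdot x + (-1)\cdot x = (1 + (-1))\,x = 0 \cdot x = \theta .
\]
% Uniqueness of the zero vector: suppose \theta' also satisfies x + \theta' = x
% for all x. Apply (iii) to x = \theta', then (i), then the property of \theta'
% with x = \theta:
\[
  \theta' = \theta' + \theta = \theta + \theta' = \theta .
\]
```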
4.2 Definition A norm on a vector space $X$ is a nonnegative real-valued function $\|\cdot\|: X \to R$ satisfying
(a) $\|x\| = 0$ iff $x = \theta$,
(b) $\|x + y\| \le \|x\| + \|y\|$ (triangle inequality),
(c) $\|ax\| = |a|\,\|x\|$.
A normed vector space $(X,\|\cdot\|)$ is a metric space if we define the metric $d$ by $d(x,y) = \|x - y\|$ (exercise). If the normed vector space is complete in this metric it is called a Banach space. A subspace $Y \subseteq X$ is closed if it is closed in the topology defined by the norm. The reader should easily be able to prove that the norm function $\|\cdot\|: X \to R$ is continuous when $X$ has the topology induced by $d$. Note also that a closed subspace of a Banach space is complete (Corollary 3.15) and hence a Banach space.

† Much of this and the succeeding section obtains (with some obvious modifications) if the real numbers are replaced by complex numbers in the definition of vector space.

4.3 Examples
1. $R^n$ can be made into a vector space in the following standard way: If $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n)$, and $a \in R$ we define $x + y = (x_1 + y_1, \dots, x_n + y_n)$, $ax = (ax_1, \dots, ax_n)$, and $\theta = (0, 0, \dots, 0)$. $R^n$ is a normed space under each of the following definitions of a norm (exercise):
(a) $\|x\|_1 = \sum_{i=1}^{n} |x_i|$,
(b) $\|x\|_\infty = \sup\{|x_i| : 1 \le i \le n\}$.

2. The space $l_1$. The space $R^\infty$ of infinite sequences of real numbers is a vector space with the following definitions of addition and scalar multiplication: If $x = (x_1, x_2, \dots)$, $y = (y_1, y_2, \dots)$, and $a \in R$, we define $x + y = (x_1 + y_1, x_2 + y_2, \dots)$ and $ax = (ax_1, ax_2, \dots)$ (check). Let $l_1$ be the set of elements $x = (x_1, x_2, \dots)$ in $R^\infty$ for which $\|x\|_1 = \sum_{i=1}^{\infty} |x_i|$ is finite. Then $l_1$ is a linear subspace of $R^\infty$ and $\|\cdot\|_1$ is a norm on $l_1$ (regarded as a vector space). For example, to check the triangle inequality 4.2(b) and the fact that $l_1$ is closed under $+$, we have (with $x = (x_1, x_2, \dots)$ and $y = (y_1, y_2, \dots)$)
$$\sum_{i=1}^{n} |x_i + y_i| \le \sum_{i=1}^{n} |x_i| + \sum_{i=1}^{n} |y_i| \le \|x\|_1 + \|y\|_1,$$
and the results follow by taking the limit as $n \to \infty$ on the left. Properties 4.2(a) and 4.2(c) are immediate. Finally we show that $l_1$ is complete and so is a Banach space. Let $(x^k : k \in N)$ be a Cauchy sequence in $l_1$ with $x^k = (x^k_1, x^k_2, \dots)$. Then given $\varepsilon > 0$ there is an $n \in N$ so that $\|x^k - x^l\|_1 \le \varepsilon$ if $k, l \ge n$. Since Cauchy sequences are bounded there exists a number $A$ so that $\|x^k\|_1 \le A$ for all $k \in N$. Let $\omega$ be an infinite integer; by transfer we have ${}^*\|x^\omega\|_1 \le A$. Now $|x^k_i| \le \|x^k\|_1$ for all $k$ and $i$, and so by transfer $|x^\omega_i| \le A$. Let $x_i = \mathrm{st}(x^\omega_i)$. We will show that $x = (x_i) \in l_1$ and $(x^k)$ converges to $x$. For any $k$ and $L$ we have
$$\sum_{i=1}^{L} |x^k_i| \le \|x^k\|_1 \le A,$$
and so by transfer
$$\sum_{i=1}^{L} |x_i| \simeq \sum_{i=1}^{L} |x^\omega_i| \le A.$$
This shows that $x \in l_1$. Finally, for any $k$, $l$, and $L$,
$$\sum_{i=1}^{L} |x_i - x^l_i| \le \sum_{i=1}^{L} |x_i - x^k_i| + \|x^k - x^l\|_1.$$
By transfer, with $k = \omega$, we have
$$\sum_{i=1}^{L} |x_i - x^l_i| \le \sum_{i=1}^{L} |x_i - x^\omega_i| + {}^*\|x^\omega - x^l\|_1,$$
where the first sum on the right is infinitesimal.
The right-hand side is $< 2\varepsilon$ if $l \ge n$. Since this is true for any $L \in N$, we conclude that $\|x - x^l\|_1 \le 2\varepsilon$ if $l \ge n$.

3. The space $l_\infty$ is a Banach space under the norm defined by $\|x\|_\infty = \sup\{|x_i| : i \in N\}$, where $x = (x_1, x_2, \dots)$ (Exercise III.3.14).

4. The space $c_0$. The space $c_0$ consists of those $x = (x_i : i \in N) \in l_\infty$ for which $\lim_{i \to \infty} x_i = 0$. It is easy to see that $c_0$ is a closed linear subspace of $l_\infty$ and hence a Banach space.

5. The spaces $B(S)$ and $C(S)$. Let $S$ be an arbitrary set. We denote by $B(S)$ the set of all bounded functions on $S$. Then $B(S)$ is a vector space with the usual definitions of addition and scalar multiplication of functions, that is, if $f, g \in B(S)$ and $a \in R$, we put $(f + g)(x) = f(x) + g(x)$ and $(af)(x) = af(x)$ for $x \in S$; we take $\theta$ to be the function that is identically zero. $B(S)$ is a Banach space under the norm defined by $\|f\|_\infty = \sup\{|f(x)| : x \in S\}$ (Exercise 3). If $S$ is a topological space we define $C(S)$ to be the subset of $B(S)$ consisting of continuous functions. Then $C(S)$ is a closed subspace of $B(S)$ (Exercise 4), and hence a Banach space.
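The norms of Examples 4.3 can be checked numerically on small vectors. The following is a minimal sketch, assuming vectors are stored as Python lists; the helper names (norm_1, norm_inf, add, scale) are illustrative and not part of the text.

```python
def norm_1(x):
    # ||x||_1: sum of the absolute values of the coordinates.
    return sum(abs(t) for t in x)

def norm_inf(x):
    # ||x||_inf: largest absolute value of a coordinate.
    return max(abs(t) for t in x)

def add(x, y):
    # Coordinatewise vector addition.
    return [s + t for s, t in zip(x, y)]

def scale(a, x):
    # Scalar multiplication.
    return [a * t for t in x]

x = [3.0, -4.0, 1.0]
y = [-1.0, 2.0, 5.0]

# Triangle inequality 4.2(b) and homogeneity 4.2(c) for both norms.
for norm in (norm_1, norm_inf):
    assert norm(add(x, y)) <= norm(x) + norm(y)
    assert norm(scale(-2.0, x)) == 2.0 * norm(x)
```

Such a check only samples the axioms at particular vectors; it illustrates the definitions and is no substitute for the exercise.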
Let $(X, \|\cdot\|)$ be a normed space. From now on we will follow the usual convention of denoting the *-transform of the norm $\|\cdot\|$ on ${}^*X$ by $\|\cdot\|$ rather than ${}^*\|\cdot\|$; the context will clear up any possible confusion. We see immediately that the (norm) monad of a point $x \in {}^*X$ is the set $m(x) = \{y \in {}^*X : \|y - x\| \simeq 0\}$. It is also almost immediate that $m(x) = \{y \in {}^*X : y = x + z,\ z \in m(\theta)\}$, so that all monads are translates of the monad about zero (Exercise 5). The finite points in ${}^*X$ (Definition 3.3) are those $x \in {}^*X$ for which $\|x\|$ is finite. Next we come to the basic notion of linear operator.
4.4 Definition Let $X$ and $Y$ be vector spaces. A map $T: X \to Y$ is called a linear operator if $T(ax + by) = aTx + bTy$ for all $a, b \in R$ and $x, y \in X$. The set of all such linear operators is denoted by $L(X, Y)$. Let $X$ and $Y$ be normed vector spaces. (Since there is no possibility of confusion we denote the norms and zeros on both by $\|\cdot\|$ and $\theta$, respectively.) A linear operator $T: X \to Y$ is bounded if the number $\|T\| = \sup\{\|Tx\| : \|x\| \le 1\}$ is finite. This number is called the norm of $T$. Then $\|Tx\| \le \|T\|\,\|x\|$ for all $x \in X$ (check). The set of all bounded linear operators $T: X \to Y$ is denoted by $B(X, Y)$. If $Y = R$ (with the usual operations of addition and multiplication and usual norm) then a linear operator $T$ is called a linear functional.

In what follows, we will often write $x$ and $T(x)$ for the nonstandard extensions ${}^*x$ and ${}^*T(x)$ of $x$ and $T(x)$; ${}^*x$ and ${}^*T(x)$ may, however, have nonstandard elements.

4.5 Example Define a map $T: l_\infty \to l_\infty$ as follows: if $x = (x_1, x_2, x_3, \dots)$ then $Tx = (0, x_1, x_2, \dots)$. Then $T$ is linear, one-to-one, and bounded (in fact $\|Tx\| = \|x\|$ for all $x \in l_\infty$). However, $T$ does not map $l_\infty$ onto $l_\infty$.

4.6 Theorem (Robinson) Let $T \in L(X, Y)$, where $X$ and $Y$ are normed spaces. The following are equivalent:
(i) $T$ is bounded.
(ii) ${}^*T: {}^*X \to {}^*Y$ takes finite points to finite points.
(iii) ${}^*T$ takes the monad of $\theta$ into the monad of $\theta$.
(iv) ${}^*T$ takes near-standard points to near-standard points. In fact, if $z \in {}^*X$ is near $x \in X$ then ${}^*Tz$ is near $Tx$.
Proof: (i) $\Rightarrow$ (ii): Suppose $\|Tx\| \le M\|x\|$ for all $x \in X$. By transfer $\|{}^*Tx\| \le M\|x\|$ for all $x \in {}^*X$ and (ii) follows.
(ii) $\Rightarrow$ (iii): Proceed by contradiction. Suppose $x \in m(\theta)$ but $\|{}^*Tx\| \not\simeq 0$. Then the element $z = x/\|x\| \in {}^*X$ is finite with norm 1 (here and in the following we use freely the transfers of the properties in Definitions 4.1, 4.2, and 4.4) but ${}^*Tz = (1/\|x\|)\,{}^*Tx$ is not finite since $\|x\| \simeq 0$ but $\|{}^*Tx\| \not\simeq 0$.
(iii) $\Rightarrow$ (iv): Let $x \in X$ and $z \in m(x)$, so $x - z \in m(\theta)$. Then ${}^*T(x - z) = Tx - {}^*Tz \in m(\theta)$, so ${}^*Tz$ is near $Tx$.
(iv) $\Rightarrow$ (i): Proceed by contradiction. If $T$ is not bounded then there exists a sequence $(x_n \in X : n \in N)$ so that $\|x_n\| = 1$ but $\|Tx_n\| > n$ for $n \in N$ (check). Then $\|{}^*Tx_\omega\|$ is infinite for some infinite natural number $\omega$. Now $z = x_\omega/\sqrt{\|{}^*Tx_\omega\|}$ is near-standard since it belongs to $m(\theta)$, but $\|{}^*Tz\| = \sqrt{\|{}^*Tx_\omega\|}$ is not finite, so ${}^*Tz$ cannot be near-standard. □
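The contrast between a bounded and an unbounded operator, as used in the step (iv) $\Rightarrow$ (i), can be seen on finitely supported sequences with the sup norm. The sketch below is only an illustration under assumptions: sequences are finite Python lists, shift is the operator of Example 4.5, and weighted is a hypothetical operator (not from the text) that multiplies the $i$th coordinate by $i + 1$.

```python
def norm_inf(x):
    # Sup norm of a finitely supported sequence stored as a list.
    return max((abs(t) for t in x), default=0.0)

def shift(x):
    # The operator of Example 4.5: (x1, x2, ...) -> (0, x1, x2, ...).
    return [0.0] + list(x)

def weighted(x):
    # Hypothetical unbounded operator: the coordinate at list index i
    # (the (i+1)-th coordinate) is multiplied by i + 2.
    return [(i + 2) * t for i, t in enumerate(x)]

def unit_vector(n):
    # The sequence with 1 in the n-th place (n >= 1) and 0 elsewhere.
    return [0.0] * (n - 1) + [1.0]

x = [1.5, -3.0, 0.25]
assert norm_inf(shift(x)) == norm_inf(x)  # the shift preserves the norm

# Unit vectors x_n with ||x_n|| = 1 but ||T x_n|| = n + 1 > n.
for n in range(1, 51):
    assert norm_inf(unit_vector(n)) == 1.0
    assert norm_inf(weighted(unit_vector(n))) > n
```

The second loop exhibits exactly the kind of sequence $(x_n)$ that the proof extracts from an unbounded operator.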
It is easy to see that a linear operator is continuous if and only if it is continuous at $\theta$ (Exercise 6). Therefore we have the following result.

4.7 Corollary $T \in L(X, Y)$ is bounded iff it is continuous.

Proof: Use 4.6(iii) and 1.15. □

4.8 Corollary If $T \in B(X, Y)$, then the null space $N(T) = \{x \in X : Tx = \theta\}$ is a closed linear subspace of $X$.

Proof: Exercise. □

One of the most important results concerning bounded linear operators on Banach spaces is the uniform boundedness theorem. The proof is entirely standard.

4.9 Uniform Boundedness Theorem Let $X$ be a Banach space, $Y$ a normed vector space, and $\mathcal{F} \subset L(X, Y)$ a family of bounded linear operators. Suppose that for each $x \in X$ there is a constant $M_x$ so that $\|Tx\| \le M_x$ for all $T \in \mathcal{F}$. Then there is a constant $M$ so that $\|T\| \le M$ for all $T \in \mathcal{F}$, i.e., the operators in $\mathcal{F}$ are uniformly bounded.

Proof: Suppose that $T \in L(X, Y)$. Note that if $\|Tx\| \le M$ for all $x$ in the closed ball $B_\varepsilon(x_0) = \{x \in X : \|x - x_0\| \le \varepsilon\}$ then $T$ is bounded and $\|T\| \le 2M/\varepsilon$. The proof of this fact is left to the reader.
Now we proceed by contradiction. Let $x_0 \in X$ and $\varepsilon_0 > 0$ be given. Then there is an $x_1 \in B_{\varepsilon_0}(x_0)$ and a $T_1 \in \mathcal{F}$ so that $\|T_1 x_1\| > 1$. For otherwise $\|Tx\| \le 1$ for all $x \in B_{\varepsilon_0}(x_0)$ and all $T \in \mathcal{F}$, and then $\|T\| \le 2/\varepsilon_0$ for all $T \in \mathcal{F}$ by the remark in the first paragraph. By continuity we can find an $\varepsilon_1$ with $0 < \varepsilon_1 < 1/2$ and $B_{\varepsilon_0}(x_0) \supseteq B_{\varepsilon_1}(x_1)$ so that $\|T_1 x\| > 1$ for all $x \in B_{\varepsilon_1}(x_1)$. Inductively we can find a sequence $\{B_{\varepsilon_n}(x_n) : n \in N\}$ with $B_{\varepsilon_n}(x_n) \supseteq B_{\varepsilon_{n+1}}(x_{n+1})$ and $\lim_{n \to \infty} \varepsilon_n = 0$, and a sequence $T_n \in \mathcal{F}$ so that $\|T_n x\| > n$ for all $x \in B_{\varepsilon_n}(x_n)$. Now $(x_n)$ is a Cauchy sequence since $\lim_{n \to \infty} \varepsilon_n = 0$. Let $x \in X$ be the limit of $(x_n)$ (here we use the completeness of $X$). Then $x \in B_{\varepsilon_n}(x_n)$ for all $n$, so $\|T_n x\| > n$ for all $n$, contradicting the assumption. □

As a corollary we can prove the following result.

4.10 Theorem Let $X$ be a Banach space and $Y$ a normed vector space, and suppose that $(T_n : n \in N)$ is a sequence in $B(X, Y)$ such that for each $x \in X$ there is an element $y_x$ with $\lim_{n \to \infty} T_n x = y_x$ (limit in norm). Then the mapping $T$ given by $Tx = y_x$ is in $B(X, Y)$.

Proof: An easy exercise shows that the map $T: X \to Y$ is linear. Since $\|\cdot\|$ is a continuous function, $\lim_{n \to \infty} \|T_n x\| = \|Tx\|$, and thus for each $x$ there exists an $M_x$ so that $\|T_n x\| \le M_x$ for all $n$. By the uniform boundedness theorem there is an $M \in N$ with $\|T_n\| \le M$ for all $n \in N$, so $\|Tx\| = \lim \|T_n x\| \le M\|x\|$ and $T$ is bounded. □
Next we study an important class of bounded linear operators, the compact operators. These operators occur in many applications. There is an extensive analysis of equations in Banach spaces involving these operators; it is called the Fredholm theory.
4.11 Definition Let $X$ and $Y$ be normed vector spaces. An operator $T \in L(X, Y)$ is compact if the closure $\overline{T[B]}$ is compact for every norm-bounded set $B \subset X$.

4.12 Theorem (Robinson) $T \in L(X, Y)$ is compact iff ${}^*T$ takes finite points to near-standard points.

Proof: Suppose $T$ is compact and let $x \in {}^*X$ be finite, i.e., $\|x\| < M$ for some $M > 0$. The ball $B = \{x \in X : \|x\| \le M\}$ is bounded and so $\overline{T[B]}$ is compact. Thus every point of ${}^*(T[B]) = {}^*T[{}^*B]$ is near-standard by Robinson's theorem, 2.2. Since $x \in {}^*B$ we conclude that ${}^*Tx$ is near-standard.
Conversely, suppose that ${}^*T$ maps finite points into near-standard points, and let $B$ be a bounded set. By Theorem 2.4 we need only show that $T[B] \subseteq K$ for some compact set $K$. Let $K = \{y \in Y : y \simeq y' \text{ for some } y' \in {}^*(T[B])\} = \mathrm{st}({}^*T[{}^*B])$. Then $T[B] \subset K$ and $K$ is compact by Exercise III.3.11. □

We see immediately from 4.6 and 4.12 that compact operators are bounded. Theorem 4.12 can be used to establish the compactness of many operators, as the following example shows.
4.13 Example: Integral Operators (Robinson [42, Theorem 7.1.7]) Let $T: C([0,1]) \to C([0,1])$ be defined by
$$Tf(x) = \int_0^1 K(x, y) f(y)\,dy,$$
where $K(x, y)$ is a continuous function on $[0,1] \times [0,1]$. The reader should check that $T$ is a linear operator. To show that $Tf$ is continuous notice that if $|f(x)| \le M$ for all $x \in [0,1]$ then
$$(4.1)\qquad |Tf(x) - Tf(y)| \le \int_0^1 |K(x, t) - K(y, t)|\,|f(t)|\,dt \le M \max\{|K(x, t) - K(y, t)| : (x, t), (y, t) \in [0,1] \times [0,1]\},$$
and $\max|K(x, t) - K(y, t)|$ can be made as small as desired if $|x - y|$ is sufficiently small by the uniform continuity of $K(x, t)$. Also note that $|K(x, t)| \le K$ for all $(x, t) \in [0,1] \times [0,1]$ for some constant $K$, and so, for any $x \in [0,1]$,
$$(4.2)\qquad |Tf(x)| \le K \max\{|f(t)| : t \in [0,1]\}.$$
To show that $T$ is compact we need to show that ${}^*Tf$ is near-standard for each finite $f$. Let $f \in {}^*C([0,1])$ be finite. This means that there is a finite standard $M$ so that $|f(t)| \le M$ for all $t \in {}^*[0,1]$. From the transfer of (4.2) we see that $|{}^*Tf(x)| \le KM$ for all $x \in {}^*[0,1]$, i.e., ${}^*Tf$ is finite, and we may define a function $\psi$ on $[0,1]$ by $\psi(x) = \mathrm{st}({}^*Tf(x))$, $x \in [0,1]$. To complete the proof we will show that $\psi$ is continuous and ${}^*Tf$ is near ${}^*\psi$.

From the transfer of (4.1) we have
$$|{}^*Tf(x) - {}^*Tf(y)| \le M \max\{|{}^*K(x, t) - {}^*K(y, t)| : (x, t), (y, t) \in {}^*[0,1] \times {}^*[0,1]\}.$$
Thus ${}^*Tf(x) \simeq {}^*Tf(y)$ whenever $x, y \in {}^*[0,1]$ and $x \simeq y$ by the uniform continuity of $K(x, t)$ (Theorem 10.10 and Proposition 10.8 of Chapter I). Let $\varepsilon > 0$ be a fixed standard real, and let $D = \{\delta \in {}^*R,\ \delta > 0 : x, y \in {}^*[0,1] \text{ and } |x - y| < \delta \text{ implies } |{}^*Tf(x) - {}^*Tf(y)| < \varepsilon\}$. Then $D$ contains all positive infinitesimals by the above remark, and so contains a standard $\delta > 0$ by Corollary 7.2(iii) of Chapter II. Now if $x, y \in [0,1]$ then
$$|\psi(x) - \psi(y)| \le |\psi(x) - {}^*Tf(x)| + |{}^*Tf(x) - {}^*Tf(y)| + |{}^*Tf(y) - \psi(y)|.$$
The first and last terms are infinitesimal, so that $|\psi(x) - \psi(y)| < 2\varepsilon$ if $|x - y| < \delta$ with $\delta$ chosen as above; thus $\psi$ is continuous. To show that ${}^*\psi$ is near ${}^*Tf$ notice that ${}^*\psi(x) \simeq {}^*Tf(x)$ for all standard $x$ by the definition of $\psi$ and the fact that ${}^*\psi$ is an extension of $\psi$. If $x \in {}^*[0,1]$ then
$$|{}^*Tf(x) - {}^*\psi(x)| \le |{}^*Tf(x) - {}^*Tf({}^\circ x)| + |{}^*Tf({}^\circ x) - {}^*\psi({}^\circ x)| + |{}^*\psi({}^\circ x) - {}^*\psi(x)|,$$
and all terms on the right are infinitesimal by the preceding remarks and the continuity of $\psi$.

A word of caution here. The reader may think that the above proof is needlessly complicated since we could replace $\psi$ by $\psi(x) = \mathrm{st}(Tf(x))$ for all $x$ in ${}^*[0,1]$ rather than $[0,1]$, in which case it would be obvious that $\psi$ is near $Tf$. Unfortunately the $\psi$ defined this way is usually external and thus not a standard element in ${}^*C([0,1])$. Notice also that an internal finite $f \in {}^*C([0,1])$ can be quite wild; e.g., $f(x) = \sin \omega x$, where $\omega$ is infinite.

The set of bounded linear operators can be made into a linear space in an obvious way. If $T, S \in B(X, Y)$ and $a \in R$ we define $(T + S)(x) = T(x) + S(x)$ and $(aT)(x) = aT(x)$. It is then not hard to see that the operator norm on $B(X, Y)$ makes $B(X, Y)$ into a normed vector space (Exercise 8).
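The estimates (4.1) and (4.2) of Example 4.13 can be observed on a discretized integral operator. The following is a rough sketch under stated assumptions: the kernel $K(x, y) = xy + 1$, the midpoint quadrature rule, and the test function $\sin 200t$ (a standard stand-in for the "wild" finite $f$ of the example) are all choices made here for illustration, not taken from the text.

```python
import math

def kernel(x, y):
    # Hypothetical continuous kernel on [0,1] x [0,1]; |kernel| <= 2 there.
    return x * y + 1.0

def apply_T(f, x, n=2000):
    # Midpoint-rule approximation of the integral of kernel(x, y) * f(y) over [0, 1].
    h = 1.0 / n
    return h * sum(kernel(x, (j + 0.5) * h) * f((j + 0.5) * h) for j in range(n))

def f(t):
    # A bounded but rapidly oscillating input, |f| <= 1.
    return math.sin(200.0 * t)

K_BOUND = 2.0   # bound for |kernel| on the unit square
M = 1.0         # bound for |f| on [0, 1]
points = [i / 10 for i in range(11)]
values = [apply_T(f, x) for x in points]

# Discrete analogue of (4.2): |Tf(x)| <= K * max|f|.
assert all(abs(v) <= K_BOUND * M for v in values)

# Discrete analogue of (4.1): for this kernel |K(x,t) - K(y,t)| <= |x - y|,
# so |Tf(x) - Tf(y)| <= M |x - y| no matter how fast f oscillates.
for x, vx in zip(points, values):
    for y, vy in zip(points, values):
        assert abs(vx - vy) <= M * abs(x - y) + 1e-12
```

The point of the second check is that the modulus of continuity of $Tf$ depends only on the bound $M$ and on the kernel, which is what makes the image of a bounded set equicontinuous.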
4.14 Theorem Let $X$ be a normed vector space and $Y$ a Banach space. Then the normed vector space $B(X, Y)$ is complete and hence a Banach space. The set of compact operators in $B(X, Y)$ forms a closed linear subspace.

Proof: Let $(T_n \in B(X, Y) : n \in N)$ be a Cauchy sequence. Then, for each $x \in X$, $T_n x$ is a Cauchy sequence and hence converges to an element $y_x$ by completeness of $Y$. We define $T$ by $Tx = \lim T_n x$. Then $T$ is linear (check) and bounded since $\lim \|T_n\| = \|T\|$ (check). Finally, we show that $T_n$ converges to $T$ in norm. For given $\varepsilon > 0$ there is an $N$ so that $\|T_n x - T_m x\| \le \|T_n - T_m\|\,\|x\| < \varepsilon\|x\|$ if $n, m \ge N$. Thus $\|T_n x - Tx\| \le \varepsilon\|x\|$ if $n \ge N$, and so $\|T_n - T\| \le \varepsilon$ for $n \ge N$, and we are through.
An easy exercise shows that the set of compact operators is a linear subspace of $B(X, Y)$. To show that it is closed, let $(T_n)$ be a sequence of compact operators converging to an operator $T \in B(X, Y)$. If $y \in {}^*X$ is finite then it belongs to ${}^*B$, where $B = \{x \in X : \|x\| \le M\}$ for some standard real $M > 0$. Now note that $T_n x$ converges to $Tx$ uniformly on the ball $B$, i.e., for any $\varepsilon > 0$ there is an $m(\varepsilon) \in N$ so that $\|T_n x - Tx\| < \varepsilon$ for all $n \ge m(\varepsilon)$ and all $x \in B$. Thus $\|{}^*T_{n_0} x - {}^*Tx\| < \varepsilon/2$ for $n_0 \ge m(\varepsilon/2)$ in $N$ and all $x \in {}^*B$. Since $T_{n_0}$ is compact, ${}^*T_{n_0} y$ is near a standard $z \in Y$ and so $\|{}^*Ty - z\| < \varepsilon$ by the triangle inequality. Since $\varepsilon$ is arbitrary, ${}^*Ty$ is pre-near-standard. Since $Y$ is complete, it follows from Proposition 3.14 that ${}^*Ty$ is near-standard. □

The standard proof of the closedness of the set of compact operators usually involves the selection of infinite subsequences with certain desirable properties.

4.15 Corollary The space of bounded linear functionals on a normed vector space $X$ is a Banach space.

The Banach space of this corollary is used sufficiently often for us to introduce some notation.

4.16 Definition The Banach space of bounded linear functionals on a normed linear space $X$ is called the dual space of $X$ and is denoted by $X'$. The dual of $X'$ is denoted by $X''$ and is called the second dual of $X$. Similarly for $X'''$, etc.

It is sometimes difficult to characterize the dual of a given Banach space, but the following example is an easy case.
4.17 Example: $l_1' = l_\infty$ Our aim is to define a mapping $T: l_\infty \to l_1'$, which is linear, 1-1, onto, and satisfies $\|Ty\| = \|y\|_\infty$ for $y \in l_\infty$. Let $y = (y_i : i \in N) \in l_\infty$ and define $Ty: l_1 \to R$ by $Ty(x) = \sum_{i=1}^{\infty} x_i y_i$ for $x = (x_i) \in l_1$. Then $Ty$ is linear, and
$$|Ty(x)| \le \sup\{|y_i| : i \in N\} \sum_{i=1}^{\infty} |x_i| = \|y\|_\infty \|x\|_1,$$
so $Ty$ is a bounded linear functional on $l_1$ with $\|Ty\| \le \|y\|_\infty$. We next show that $\|Ty\| \ge \|y\|_\infty$. We may assume $\|y\|_\infty > 0$. Given a positive $\varepsilon < \|y\|_\infty$, there is an $n_0$ so that $|y_{n_0}| > \|y\|_\infty - \varepsilon$. Now define $x = (x_i) \in l_1$ by $x_i = 0$ for $i \ne n_0$ and $x_{n_0} = y_{n_0}/|y_{n_0}|$. Then $\|x\|_1 = 1$ and $|Ty(x)| = |y_{n_0}| > \|y\|_\infty - \varepsilon$, so $\|Ty\| \ge \|y\|_\infty$. We also see that $T$ is 1-1, since if $Ty = \theta$ then $\|y\|_\infty = 0$ so $y = \theta$.

It only remains to show that $T$ is onto. Let $f \in l_1'$. If $e^n \in l_1$ is defined by $e^n = (\delta^n_i)$, where $\delta^n_i = 0$ if $i \ne n$ and $\delta^n_n = 1$, then $\|e^n\|_1 = 1$ for all $n \in N$. Put $f(e^n) = y_n \in R$. Then $|y_n| \le \|f\|$, and so $y = (y_i) \in l_\infty$. Now the functional $Ty$ attached to $y$ as in the first paragraph agrees with $f$ on the elements $e^n$. A simple limiting argument (check) shows that $Ty = f$, and so $l_1' = l_\infty$.
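The two inequalities of Example 4.17 can be tried out on truncated sequences. This is a minimal sketch, assuming sequences are cut off after 20 terms (so the supremum is attained, which need not happen in $l_\infty$ itself); the particular $y$ and $x$ and the helper names are illustrative only.

```python
def norm_1(x):
    return sum(abs(t) for t in x)

def norm_inf(y):
    return max(abs(t) for t in y)

def pair(y, x):
    # The functional Ty evaluated at x: sum of x_i * y_i.
    return sum(s * t for s, t in zip(x, y))

# A bounded sequence y (truncated) and a summable sequence x (truncated).
y = [(-1) ** i * (1.0 - 1.0 / (i + 1)) for i in range(1, 21)]
x = [1.0 / 2 ** i for i in range(1, 21)]

# Upper bound: |Ty(x)| <= ||y||_inf * ||x||_1.
assert abs(pair(y, x)) <= norm_inf(y) * norm_1(x)

# Lower bound: pick n0 where |y_n0| is largest and test Ty on the
# unit vector carrying the sign of y_n0, as in the example.
n0 = max(range(len(y)), key=lambda i: abs(y[i]))
norming = [0.0] * len(y)
norming[n0] = y[n0] / abs(y[n0])
assert norm_1(norming) == 1.0
assert abs(pair(y, norming) - norm_inf(y)) < 1e-12
```

For a genuine element of $l_\infty$ whose supremum is not attained, the same construction gives $|Ty(x)| > \|y\|_\infty - \varepsilon$ rather than equality.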
In the case of a general normed vector space $X$, it is not at all obvious that $X'$ contains any elements other than $\theta$. The following result, which is basic to the study of duality, shows that $X'$ always contains many elements.

4.18 Hahn-Banach Theorem Let $X$ be a vector space and suppose that a given function $p: X \to R$ satisfies $p(x + y) \le p(x) + p(y)$ and $p(ax) = ap(x)$ for each $a \ge 0$ in $R$ and $x, y \in X$. Suppose that $f$ is a linear functional defined on a subspace $S$ of $X$ with $f(x) \le p(x)$ for all $x \in S$. Then there is a linear functional $F$ on $X$ which extends $f$ [i.e., $F(x) = f(x)$ for all $x \in S$] and satisfies $F(x) \le p(x)$ for $x \in X$.

Proof: Let $g$ and $h$ be linear functionals, each defined on a linear subspace of $X$. We say that $g$ extends $h$ and write $h \prec g$ if the domain of $g$ contains the domain of $h$ and $g = h$ on $\mathrm{dom}\,h$. The relation $\prec$ partially orders the set of linear functionals. Consider the set of all extensions $g$ of $f$ which satisfy $g(x) \le p(x)$ for $x$ in the domain of $g$. Applying Zorn's lemma (see the Appendix) to this set, partially ordered by $\prec$, we see that there is a maximal extension $F$. We need only show that the domain $X_0$ of $F$ is all of $X$. Suppose this is not the case, i.e., there is a vector $y$ in $X$ but not in $X_0$. Then $F$ may be extended to a functional $g$ on the subspace $Z \supset X_0$ consisting of elements of the form $ay + x_0$, $x_0 \in X_0$, $a \in R$, by putting $g(ay + x_0) = ag(y) + F(x_0)$. Now $g$ is specified uniquely by $g(y)$, and we need to show that $g(y)$ can be chosen so that $g(x) \le p(x)$ for all $x \in Z$ in order to get a contradiction. For $x_1, x_2 \in X_0$ we have $F(x_2) - F(x_1) = F(x_2 - x_1) \le p(x_2 - x_1) \le p(x_2 + y) + p(-y - x_1)$, which yields $-p(-y - x_1) - F(x_1) \le p(x_2 + y) - F(x_2)$. Since the left is independent of $x_2$ and the right is independent of $x_1$ there is a constant $c \in R$ so that
(i) $c \le p(x_2 + y) - F(x_2)$,
(ii) $-p(-y - x_1) - F(x_1) \le c$
for all $x_1, x_2 \in X_0$. We now put $g(y) = c$. Then for $x = ay + x_0 \in Z$ the inequality $g(x) = g(ay + x_0) = ac + F(x_0) \le p(ay + x_0)$ follows by replacing $x_2$ by $x_0/a$ in (i) if $a > 0$ and $x_1$ by $x_0/a$ in (ii) if $a < 0$. □
4.19 Corollary If $X$ is a normed vector space and $x \in X$, $x \ne \theta$, then there is an $x' \in X'$ so that $x'(x) = \|x\|$ and $\|x'\| = 1$.

Proof: Standard exercise. □

We now show that $X$ can be isometrically and isomorphically embedded in $X''$.

4.20 Theorem Let $X$ be a normed vector space and define a map $T: X \to X''$ by $Tx(x') = x'(x)$ for all $x' \in X'$. Then $T$ is a linear and norm-preserving embedding. If $X$ is a Banach space then $T[X]$ is a closed linear subspace of $X''$.

Proof: The reader should check that $T$ is linear. That $Tx$ is bounded (as we have implied in the statement of the theorem) follows since $|Tx(x')| = |x'(x)| \le \|x\|\,\|x'\|$, and we see that $\|Tx\| \le \|x\|$. The result will be established when we show that $\|Tx\| \ge \|x\|$. This is trivial if $x = \theta$, so suppose $x \ne \theta$. From Corollary 4.19 there exists an $x' \in X'$ so that $\|x'\| = 1$ and $x'(x) = \|x\|$. Thus $\|x\| = |x'(x)| = |Tx(x')| \le \|Tx\|\,\|x'\| = \|Tx\|$. The rest is left to the reader. □
Because of Theorem 4.20 we identify $X$ with $T[X]$ and regard $X$ as a subspace of $X''$ in the rest of this section without further explicit comment.

We end this section with a consideration of compactness properties in Banach spaces. We have seen in Example 3.22 that the closed unit ball in $l_\infty$ is not norm-compact. This situation turns out to be typical of all infinite-dimensional spaces. In fact one can prove that a closed ball in a Banach space is norm-compact iff the space is finite-dimensional [14, Theorem IV.3.5]. It follows that no set in an infinite-dimensional Banach space $X$ containing a closed ball can be norm-compact. Since this severely limits the sets which can be norm-compact we look for other topologies on a Banach space in which closed balls are compact.

4.21 Definition Let $X$ be a normed vector space. The weak topology on $X$ is the topology whose neighborhood system at a generic point $x \in X$ is generated by the subbase consisting of sets of the form $U(x; x', \varepsilon) = \{y \in X : |x'(y) - x'(x)| < \varepsilon\}$ for some $x' \in X'$. Let $X'$ be the dual space of a normed vector space $X$. The weak* topology on $X'$ is the topology whose neighborhood system at a generic point $x' \in X'$ is generated by the subbase consisting of sets of the form $V(x'; x, \varepsilon) = \{y' \in X' : |x(y') - x(x')| < \varepsilon\}$ for some $x \in X$ (regarded as embedded in $X''$).

Notice that in the definition of the subbase for the weak* topology we take only those $x \in X$ and not all $x'' \in X''$. This turns out to make a crucial difference. An easy exercise, which we leave to the reader, shows that the monads of points $x \in X$ and $x' \in X'$ in the weak and weak* topologies, respectively, are given by
$$m_w(x) = \{y \in {}^*X : {}^*x'(y) \simeq {}^*x'({}^*x) = x'(x) \text{ for all (standard) } x' \in X'\},$$
$$m_{w^*}(x') = \{y' \in {}^*X' : {}^*x(y') \simeq {}^*x({}^*x') = x(x') \text{ for all (standard) } x \in X\}.$$
Using the Hahn-Banach theorem, we can show that the weak and weak* topologies are Hausdorff (exercise).

4.22 Alaoglu's Theorem The closed unit ball in $X'$ is compact in the weak* topology.

Proof: Let $B$ be the unit ball in $X'$. We must show that corresponding to every $y' \in {}^*B$ there is a point $x' \in B$ so that ${}^*x(y') \simeq x(x')$ for all $x \in X$. Fix $y' \in {}^*B$ and define a functional $x'$ on $X$ by $x'(x) = \mathrm{st}(y'({}^*x))$, $x \in X$. Then ${}^*x(y') \simeq x(x')$ for all standard $x \in X$. The linearity of $x'$ is obvious, and, finally, $x' \in B$ since $|x'(x)| \le \mathrm{st}(\|y'\|\,\|{}^*x\|) \le \|x\|$ by transfer ($y' \in {}^*B$ so $\|y'\| \le 1$). □

The same result can be proved for a ball of any radius and also follows directly from Theorems 4.22 and 2.6. We obtain as a consequence the following corollary.

4.23 Corollary A norm-bounded and weak*-closed subset of $X'$ is compact.

Proof: Use Theorems 4.22 and 2.4. □

One might expect a similar result to be true for subsets of $X$ in the weak topology. However, it turns out that the unit ball in $X$ is weakly compact iff $X$ is reflexive, which means that $X = X''$ [14, Theorem V.4.7]. Considering the importance of sequential compactness as emphasized in §III.3, we would like to know when the unit ball $B$ in a Banach space $X$ is weakly sequentially compact. A deep theorem due to Eberlein and Šmulian asserts that $B$ is weakly sequentially compact iff $B$ is weakly compact (iff $X$ is reflexive by the above remark). A nonstandard proof of this result can be found in [47].
4.24 Example We will show that the unit sphere in $l_\infty'$ is not weak* sequentially compact even though it is weak* compact by Alaoglu's theorem. Consider the sequence $e^n \in l_1$ (regarded as embedded in $l_\infty'$) defined by $e^n = (\delta^n_i : i \in N)$. Then $\|e^n\|_1 = 1$. Suppose that $(e^n)$ has a convergent subsequence $(e^{n_k})$. Define the element $x = (x_i : i \in N) \in l_\infty$ by $x_i = 1$ if $i = n_k$ and $k$ is even, and $x_i = 0$ otherwise. Then $e^{n_k}(x) = 1$ if $k$ is even, and $0$ if $k$ is odd, so the sequence $(e^{n_k}(x))$ does not converge, i.e., $(e^{n_k})$ does not converge in the weak* topology. Note that by compactness (check) the sequence $e^n$ has a weak* limit point $y$, but we cannot select a convergent subsequence since the neighborhood system at $y$ does not have a countable base.
An extensive study of the structure of Banach spaces using nonstandard methods has been developed by Henson and Moore [16]. This study uses in an essential way the notion of the nonstandard hull of a Banach space. We present the definition of the nonstandard hull of a metric space in §III.6 to help the interested reader to understand these results.
Exercises III.4
1. Show that $d(x, y) = \|x - y\|$ is a metric.
2. Show that $\|\cdot\|_1$ and $\|\cdot\|_\infty$ are norms on $R^n$.
3. Show that $B(S)$ with the sup norm $\|\cdot\|_\infty$ is a Banach space.
4. Show that $C(S)$ is a closed subspace of $B(S)$ if $S$ is a topological space.
5. Show that for a normed space all monads are translates of the monad of zero.
6. Show that a linear operator is continuous if and only if it is continuous at $\theta$.
7. Prove Corollary 4.8.
8. Show that the operator norm on $B(X, Y)$ makes $B(X, Y)$ into a normed vector space.
9. Show that the set of compact operators is a linear subspace of $B(X, Y)$.
10. Show that the weak and weak* topologies are Hausdorff.
11. Discuss the relationship between Alaoglu's theorem and the Tychonoff product theorem.
12. Two norms on a space $X$ are equivalent if the corresponding metrics they define are equivalent.
(a) Show that the norms $\|\cdot\|$ and $|||\cdot|||$ on $X$ are equivalent iff there exist positive (nonzero) constants $\alpha$ and $\beta$ in $R$ so that $\alpha\|x\| \le |||x||| \le \beta\|x\|$ for all $x \in X$.
(b) Show that any two norms on $R^n$ are equivalent. (Hint: Show that any norm $|||\cdot|||$ is equivalent to $\|\cdot\|_1$. To do so you need only show that $|||x|||/\|x\|_1$ and $\|x\|_1/|||x|||$ are finite for all $x \ne \theta$ in ${}^*R^n$. Write $x = \sum_{i=1}^{n} x_i e_i$, and get estimates.)
13. Let $X$ be a vector space with a topology $\mathcal{T}$. $X$ is a topological vector space if both vector addition (as a map $X \times X \to X$) and scalar multiplication (as a map $R \times X \to X$) are continuous. Let $m(a)$ denote the monad of $a \in R$ and $\mu(x)$ denote the monad of $x \in X$. Show that if $X$ is a topological vector space (of more than one dimension) then
(a) $\mu(x) + \mu(y) = \mu(x) + y = \mu(x + y) = x + y + \mu(\theta)$,
(b) $m(a)x \subset m(a)\mu(x) = a\mu(x) = \mu(ax)$,
(c) $\mathcal{T}$ is Hausdorff iff $\mu(\theta) \cap X = \{\theta\}$,
(d) if $X$ is a topological vector space with topologies $\mathcal{T}_1$ and $\mathcal{T}_2$ having monads $\mu_1$ and $\mu_2$ then $\mathcal{T}_1 = \mathcal{T}_2$ iff $\mu_1(\theta) = \mu_2(\theta)$.
III.5 Inner-Product Spaces and Hilbert Spaces
In this section we consider those normed spaces and Banach spaces in which the norm is derived from an inner product. Most of the results and proofs of this section are standard. The canonical example of an inner product occurs in Euclidean space $R^n$ where the scalar product of $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$ is $(x, y) = \sum_{i=1}^{n} x_i y_i$. The angle $\theta$ between two nonzero vectors $x$ and $y$ is given by the familiar formula $\cos\theta = (x, y)/\|x\|\,\|y\|$. The scalar product is generalized to vector spaces as follows.

5.1 Definition Let $H$ be a vector space. An inner product on $H$ is a map $(\cdot,\cdot): H \times H \to R$ which satisfies (for all $x, y, z$ in $H$ and $a, b \in R$)
(i) $(x, y) = (y, x)$,
(ii) $(ax + by, z) = a(x, z) + b(y, z)$,
(iii) $(x, x) \ge 0$, and $(x, x) = 0$ iff $x = \theta$.
A vector space with an inner product is called an inner-product space. A norm on $H$ is obtained by setting $\|x\| = \sqrt{(x, x)}$ (exercise). If $H$ is complete in this norm it is called a Hilbert space.

To prove that $\|\cdot\|$ is a norm on $H$ one uses the following basic result.

5.2 Schwarz's Inequality For any $x, y$ in an inner-product space $H$, $|(x, y)| \le \|x\|\,\|y\|$.
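Proof: Let $x$ and $y$ be given. For any real $\lambda$ we have $(x + \lambda y, x + \lambda y) = \|x\|^2 + 2\lambda(x, y) + \lambda^2\|y\|^2 \ge 0$. Thus the quadratic expression in $\lambda$ given by $\|y\|^2\lambda^2 + 2(x, y)\lambda + \|x\|^2$ is nonnegative for all real $\lambda$, so its discriminant satisfies $4(x, y)^2 - 4\|x\|^2\|y\|^2 \le 0$, which is the desired inequality. □

5.3 Corollary $(x, y)$ is continuous on $H \times H$ as a function of the variables $x, y \in H$.

Proof: Exercise. □

5.4 Examples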
1. In the linear space $R^n$ we define the inner product of $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$ by $(x, y) = \sum_{i=1}^{n} x_i y_i$. The reader should check that this defines an inner product on $R^n$. From Schwarz's inequality,
$$\Big|\sum_{i=1}^{n} x_i y_i\Big| \le \Big(\sum_{i=1}^{n} x_i^2\Big)^{1/2} \Big(\sum_{i=1}^{n} y_i^2\Big)^{1/2}.$$

2. The space $l_2$. Let $l_2$ denote the space of all infinite sequences $x = (x_1, x_2, \dots)$ for which $\sum_{i=1}^{\infty} x_i^2 < \infty$. If $x = (x_1, x_2, \dots)$ and $y = (y_1, y_2, \dots)$ are two such sequences, we define $(x, y) = \sum_{i=1}^{\infty} x_i y_i$. To check that $(x, y)$ is finite for $x, y \in l_2$ we have
$$\sum_{i=1}^{n} |x_i y_i| \le \Big(\sum_{i=1}^{n} x_i^2\Big)^{1/2} \Big(\sum_{i=1}^{n} y_i^2\Big)^{1/2} \le \Big(\sum_{i=1}^{\infty} x_i^2\Big)^{1/2} \Big(\sum_{i=1}^{\infty} y_i^2\Big)^{1/2},$$
and so $\sum_{i=1}^{\infty} |x_i y_i|$ converges. Using the fact that $(\sum_{i=1}^{\infty} x_i^2)^{1/2} = \|x\|$ is a norm, we can now easily check that $l_2$ is a linear space. We will see later that all separable Hilbert spaces are isomorphic to $l_2$.

Using the inner product, we can introduce a notion of orthogonality in an inner-product space.

5.5 Definition If $H$ is an inner-product space then $x$ and $y$ in $H$ are orthogonal if $(x, y) = 0$, in which case we write $x \perp y$. If $S \subseteq H$ then $S^\perp = \{x \in H : x \perp z \text{ for all } z \in S\}$.

5.6 Proposition For any $S \subseteq H$, $S^\perp$ is a closed linear subspace of $H$.

Proof: Let $x, y \in S^\perp$ and $a, b \in R$. Then, for any $z \in S$, $(ax + by, z) = a(x, z) + b(y, z) = 0$, so $S^\perp$ is a linear subspace. To show closure, let $x \in {}^*(S^\perp)$
and $x \simeq y \in H$. Then $(y, z) \simeq (x, z) = 0$ for all $z \in S$ by the continuity of the inner product, and so $y \in S^\perp$. Thus $S^\perp$ is closed. □

Since the norm on an inner-product space $H$ is derived from the inner product, we might expect that it has some special properties. It turns out that it is completely characterized by the following law.

5.7 Parallelogram Law A normed space $(H, \|\cdot\|)$ is an inner-product space iff for all $x, y \in H$
$$\|x - y\|^2 + \|x + y\|^2 = 2\|x\|^2 + 2\|y\|^2.$$

Proof: Suppose $H$ is an inner-product space. Then
$$\|x - y\|^2 + \|x + y\|^2 = (x - y, x - y) + (x + y, x + y)$$
$$= \|x\|^2 - (y, x) - (x, y) + \|y\|^2 + \|x\|^2 + (y, x) + (x, y) + \|y\|^2 = 2\|x\|^2 + 2\|y\|^2.$$
The converse, which we omit, sets $(x, y) = \tfrac{1}{4}\{\|x + y\|^2 - \|x - y\|^2\}$. □
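The parallelogram law separates norms that come from an inner product from those that do not, and this is easy to see numerically. The sketch below, with illustrative helper names, checks that the Euclidean norm on $R^n$ satisfies the law and the polarization formula of the converse, while the norm $\|\cdot\|_1$ of Example 4.3 fails the law.

```python
import math

def inner(x, y):
    return sum(s * t for s, t in zip(x, y))

def norm_2(x):
    # Norm derived from the inner product.
    return math.sqrt(inner(x, x))

def norm_1(x):
    return sum(abs(t) for t in x)

def defect(norm, x, y):
    # ||x - y||^2 + ||x + y||^2 - 2||x||^2 - 2||y||^2; zero iff the law holds at (x, y).
    diff = [s - t for s, t in zip(x, y)]
    summ = [s + t for s, t in zip(x, y)]
    return norm(diff) ** 2 + norm(summ) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2

def polarization(norm, x, y):
    # (1/4)(||x + y||^2 - ||x - y||^2), the candidate inner product.
    diff = [s - t for s, t in zip(x, y)]
    summ = [s + t for s, t in zip(x, y)]
    return 0.25 * (norm(summ) ** 2 - norm(diff) ** 2)

x = [1.0, 2.0, -1.0]
y = [3.0, 0.0, 4.0]
assert math.isclose(defect(norm_2, x, y), 0.0, abs_tol=1e-9)
assert math.isclose(polarization(norm_2, x, y), inner(x, y), abs_tol=1e-9)

# The 1-norm violates the law already on the coordinate vectors of R^2.
assert defect(norm_1, [1.0, 0.0], [0.0, 1.0]) == 4.0
```

So $\|\cdot\|_1$ on $R^2$ cannot be derived from any inner product, in line with 5.7.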
Using this simple result, we now establish a sequence of results which are fundamental to all further analysis of Hilbert spaces.

5.8 Definition A subset $K$ of a vector space $H$ is convex if whenever $x, y \in K$ then $ax + (1 - a)y \in K$ for all real $a \in [0,1]$.

In the proof of the next result we use completeness in an essential way.

5.9 Theorem If $K$ is a closed convex subset of a Hilbert space $H$, then there is a unique element $x_0 \in K$ so that $\|x_0\| \le \|x\|$ for all $x \in K$, i.e., $K$ has a unique element of smallest norm.

Proof: Let $d = \inf\{\|x\| : x \in K\}$. Then for each $\delta > 0$ there is an $x \in K$ so that $d \le \|x\| < d + \delta$. By transfer, with $\delta$ infinitesimal, there is a $y \in {}^*K$ with $\|y\| \simeq d$. We now show that $y$ is near-standard. Since $K$ is complete by Corollary 3.15, it is enough to show that $y$ is pre-near-standard (see Proposition 3.14). Fix $\varepsilon > 0$ in $R$. By transfer from the parallelogram law,
$$(5.1)\qquad \|x - y\|^2 + \|x + y\|^2 = 2\|x\|^2 + 2\|y\|^2$$
for any $x \in K$. If $x \in K$ then since $y \in {}^*K$, $(x + y)/2 \in {}^*K$, so $\|x + y\|^2 = 4\|(x + y)/2\|^2 \ge 4d^2$. It follows from (5.1) that $\|x - y\|^2 \le 2\|x\|^2 + 2d^2 - 4d^2 + \eta = 2\|x\|^2 - 2d^2 + \eta$, where $\eta$ is infinitesimal and $x \in K$. But we can find an $x \in K$ so that $\|x\|^2 < d^2 + \varepsilon/4$, and we get $\|x - y\|^2 < \varepsilon/2 + \eta < \varepsilon$.
Thus $y$ is pre-near-standard, so $y$ is near some $x_0 \in H$. The point $x_0 \in K$ since $K$ is closed, and $\|x_0\| = d$ by the continuity of the norm. The uniqueness is another application of the parallelogram law (exercise). □

5.10 Theorem Let $E$ be a closed subspace of the Hilbert space $H$ with $E \ne H$. There are unique linear operators $P: H \to E$, $Q: H \to E^\perp$ so that $x = Px + Qx$ for all $x \in H$. Further,
$Px = x$ iff $x \in E$
and
$Qx = x$ iff $x \in E^\perp$.
$P$ and $Q$ are called the projections of $H$ onto $E$ and $E^\perp$, respectively.

Proof: For $x \in H$ let $K = x + E = \{x + y : y \in E\}$. Then $K$ is convex and closed (check). Let $Qx$ be the unique element of smallest norm in $K$ (existing by 5.9), and put $Px = x - Qx$. Then it is clear that $x = Px + Qx$ and $Px \in E$. To show that $(Qx, z) = 0$ for all $z \in E$, we put $Qx = y$. Assuming without loss of generality that $\|z\| = 1$, we have
$$\|y\|^2 \le \|y - az\|^2 = (y - az, y - az) = \|y\|^2 - 2a(y, z) + |a|^2$$
for every $a \in R$, yielding $0 \le -2a(y, z) + |a|^2$. If $a = (y, z)$ this gives $0 \le -|(y, z)|^2$, and so $(y, z) = 0$. The uniqueness of $P$ and $Q$ follows from the fact that $E \cap E^\perp = \{\theta\}$. For if $x = x_1 + x_2$ with $x_1 \in E$, $x_2 \in E^\perp$, then $x_1 - Px = Qx - x_2$ and $x_1 - Px \in E$, $Qx - x_2 \in E^\perp$, so $x_1 = Px$ and $x_2 = Qx$. The rest of the proof is left to the reader. □

The culmination of the preceding sequence of results is the following theorem, which probably has more applications than any other result on Hilbert spaces.

5.11 Riesz Representation Theorem To each bounded linear functional $L$ on $H$ there corresponds a unique element $y \in H$ so that $L(x) = (x, y)$ for each $x \in H$, and $\|L\| = \|y\|$.
Proof: We may assume that $L$ is not identically zero (otherwise take $y = \theta$). Let $E = \{x \in H : Lx = 0\}$. Then $E$ is a closed linear subspace (check) and $E^\perp \neq \{\theta\}$, so we may choose $z \neq \theta$ in $E^\perp$. Then, for any $x \in H$, $x - (Lx/Lz)z \in E$, so $(x, z) - (Lx/Lz)(z, z) = 0$. Thus $Lx = (x, [Lz/(z, z)]z)$, and we take $y = [Lz/(z, z)]z$. The rest is left as an exercise. □

5.12 Corollary A Hilbert space $H$ is self-dual; i.e., the dual space of $H$ can be identified with $H$ itself.
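The projection theorem 5.10 and the Riesz representation 5.11 can be checked numerically in the finite-dimensional Hilbert space $R^3$. The following is only a sketch under stated assumptions: the helper names (`dot`, `project`, `L`) and the particular vectors are illustrative and do not come from the text, and the basis handed to `project` is assumed to be orthogonal.

```python
def dot(u, v):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def project(x, basis):
    """Projection of x onto span(basis); basis is assumed to be orthogonal."""
    p = [0.0] * len(x)
    for e in basis:
        c = dot(x, e) / dot(e, e)
        p = [pi + c * ei for pi, ei in zip(p, e)]
    return p

# Theorem 5.10 with E = span{(1, 1, 0)} in R^3 (hypothetical data).
x = [3.0, 1.0, 2.0]
E_basis = [[1.0, 1.0, 0.0]]
Px = project(x, E_basis)                 # component of x in E
Qx = [a - b for a, b in zip(x, Px)]      # component of x orthogonal to E

# Theorem 5.11 for the (hypothetical) functional L(v) = 2 v_1 - v_2 + 3 v_3.
def L(v):
    return 2 * v[0] - v[1] + 3 * v[2]

# The kernel of L is the plane orthogonal to (2, -1, 3),
# so z below lies in the orthogonal complement of the kernel.
z = [4.0, -2.0, 6.0]
c = L(z) / dot(z, z)
y = [c * a for a in z]                   # y = [Lz/(z,z)] z

print(Px, Qx, dot(Qx, E_basis[0]))
print(y, L(x), dot(x, y))
```

In this sketch `Px` and `Qx` add up to `x`, `Qx` is orthogonal to the spanning vector of $E$, and the vector `y` built from a nonzero `z` in the complement of the kernel reproduces `L` as an inner product.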
Next we investigate the generalization to Hilbert space of a familiar notion in $R^n$, that of an orthonormal basis. In $R^n$ the vectors $e_1 = (1, 0, 0, \ldots, 0)$, $e_2 = (0, 1, 0, 0, \ldots, 0)$, $\ldots$, $e_n = (0, 0, \ldots, 0, 1)$ have the property that $\|e_i\| = 1$, $(e_i, e_j) = \delta_i^j$ (the Kronecker $\delta$-function), and any vector $x \in R^n$ can be written uniquely as $x = \sum_{i=1}^n a_i e_i$. The set $\{e_i\}$ is called an orthonormal basis. In Hilbert spaces we will see that orthonormal bases exist and that any vector can be expressed in a limiting sense in terms of the orthonormal basis.
5.13 Definition A set $S = \{e_i : i \in I\}$ of nonzero vectors in an inner-product space $H$ is orthonormal if $e_i \perp e_j$ for $i \neq j$ and $\|e_i\| = 1$ for all $i \in I$. $S$ is maximal (or complete) if it is not properly contained in any other orthonormal set. Given any $x \in H$ the numbers $\hat{x}(i) = (x, e_i)$ are called the Fourier coefficients of $x$ relative to the orthonormal set $S = \{e_i\}$.

If $H$ is a nontrivial inner-product space (i.e., contains more than the zero vector $\theta$) then there is at least one orthonormal set in $H$ obtained by taking a single nonzero vector $x \in H$ and forming the normalized vector $e = x/\|x\|$. The existence of maximal orthonormal sets then follows from the following more general result.

5.14 Theorem Every orthonormal set $S \subset H$ is contained in a maximal orthonormal set $\tilde{S} \subset H$.

Proof: Let $\mathcal{S}$ be the collection of all orthonormal sets in $H$ containing $S$, and partially order $\mathcal{S}$ by set inclusion $\subseteq$. $\mathcal{S}$ is nonempty since it contains $S$. We use Zorn's lemma (see the Appendix) to show the existence of a maximal orthonormal set. Let $\mathcal{C} \subset \mathcal{S}$ be any chain in $\mathcal{S}$. Then the set $\tilde{S} = \bigcup S\ (S \in \mathcal{C})$ is an orthonormal set, for if $x_1, x_2 \in \tilde{S}$ are distinct, then $x_1 \in S_1$ and $x_2 \in S_2$ for some $S_1, S_2 \in \mathcal{C}$. Since $\mathcal{C}$ is a chain, either $S_1 \subseteq S_2$ or $S_2 \subseteq S_1$. In either case $x_1$ and $x_2$ are in some $S \in \mathcal{C}$, so $x_1 \perp x_2$. Thus $\tilde{S}$ is orthonormal. By Zorn's lemma there is a maximal orthonormal set. □

With a little more work it is possible to prove that any two maximal orthonormal sets can be put in one-to-one correspondence (i.e., have the same cardinality), but we will not need this fact. The reader should prove (exercise) that $S$ is a maximal orthonormal set iff $x \in H$ and $x \perp S$ implies that $x = \theta$. This fact will be used in the proof of Theorem 5.19.

5.15 Example The vectors $e_i = (\delta_i^j : j \in N)$ in $l_2$ form a maximal orthonormal set, for if $x = (x_j : j \in N) \in l_2$ and $(x, e_i) = x_i = 0$ for all $i \in N$, then $x = \theta$.
In the following we will deal only with inner-product spaces $H$ which are (norm) separable (i.e., $H$ contains a countable set which is dense in the topology induced by the norm). In this case any orthonormal set is either finite or countable, for if $\{e_i : i \in I\}$ is orthonormal and $i \neq j$, then $\|e_i - e_j\|^2 = (e_i - e_j, e_i - e_j) = \|e_i\|^2 + \|e_j\|^2 = 2$ since $(e_i, e_j) = 0$. Conversely, if any orthonormal set in $H$ is either finite or countable then $H$ is separable (exercise). Since the following results are easy if $H$ is finite-dimensional (i.e., contains a finite maximal orthonormal set), we will assume in the following that the inner-product space $H$ contains a countable orthonormal set which we arrange in a sequence $(e_i : i \in N)$. Without loss of generality we have chosen $I = N$. Now let $x \in H$ and $(a_i : i \in N)$ be a sequence of real numbers. Then, for each $n \in N$,

$$\Big\|x - \sum_{i=1}^n a_i e_i\Big\|^2 = \|x\|^2 - 2\sum_{i=1}^n a_i (x, e_i) + \sum_{i=1}^n a_i^2 = \|x\|^2 - \sum_{i=1}^n |\hat{x}(i)|^2 + \sum_{i=1}^n |a_i - \hat{x}(i)|^2. \tag{5.2}$$

From this we obtain the following results.
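5.16 Best Approximation Theorem Let $(e_i : i \in N)$ be an orthonormal sequence in an inner-product space $H$. For any $x \in H$, any $n \in N$, and any real numbers $a_1, \ldots, a_n$,

$$\Big\|x - \sum_{i=1}^n \hat{x}(i) e_i\Big\| \le \Big\|x - \sum_{i=1}^n a_i e_i\Big\|,$$

i.e., the best norm approximation to $x$ by a linear combination of the $e_i$ is given by choosing the coefficients to be the Fourier coefficients.

Proof: The right-hand side of (5.2) is minimized if $a_i = (x, e_i)$. □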
5.17 Bessel's Inequality For any $x \in H$,

$$\sum_{i=1}^\infty |\hat{x}(i)|^2 \le \|x\|^2.$$

Proof: Exercise. □
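The best approximation property and Bessel's inequality are easy to observe numerically for a finite orthonormal family in $R^3$. The following is a minimal sketch; the vectors, the perturbation of the coefficients, and the helper names (`dot`, `error_sq`) are assumptions made for illustration only.

```python
def dot(u, v):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def error_sq(x, es, coeffs):
    """Squared norm of x - sum_i coeffs[i] * es[i]."""
    r = list(x)
    for a, e in zip(coeffs, es):
        r = [ri - a * ei for ri, ei in zip(r, e)]
    return dot(r, r)

# An orthonormal pair in R^3 (not maximal) and a test vector; both hypothetical.
es = [[1.0, 0.0, 0.0], [0.0, 0.6, 0.8]]
x = [1.0, 2.0, 3.0]

fourier = [dot(x, e) for e in es]             # Fourier coefficients (x, e_i)
best = error_sq(x, es, fourier)               # error using the Fourier coefficients
shifted = [fourier[0] + 0.5, fourier[1] - 0.25]
other = error_sq(x, es, shifted)              # error using perturbed coefficients
bessel_sum = sum(c * c for c in fourier)      # left-hand side of Bessel's inequality

print(best, other)
print(bessel_sum, dot(x, x))
```

Up to rounding, the error for the perturbed coefficients exceeds the best error by exactly the sum of the squared perturbations, and `bessel_sum` stays below the squared norm of `x`.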
Bessel's inequality has the following interpretation. For any $x \in H$ we can consider the sequence $(\hat{x}(i) : i \in N)$ of Fourier coefficients of $x$ relative to a given orthonormal sequence $S = (e_i : i \in N)$. Then Bessel's inequality shows that this sequence is in $l_2$. Thus for a fixed countable orthonormal sequence $(e_i)$ we obtain a mapping $T: H \to l_2$ defined by $Tx = (\hat{x}(i) : i \in N)$. It is easy to check that $T$ is a linear mapping. The next result, which requires that $H$ be complete, shows that $T$ maps $H$ onto $l_2$.

5.18 Riesz-Fischer Theorem Let $(e_i : i \in N)$ be a countable orthonormal sequence in the Hilbert space $H$. Then each element of $l_2$ is of the form $\hat{x}$ for some $x \in H$.

Proof: Let $(a_i : i \in N)$ be a sequence in $l_2$ so that $\sum_{i=1}^\infty a_i^2 < \infty$. Then the sequence $x_n = \sum_{i=1}^n a_i e_i$ is a Cauchy sequence in $H$ since $x_m - x_n = \sum_{i=n+1}^m a_i e_i$ if $m > n$, and so $\|x_m - x_n\|^2 = \sum_{i=n+1}^m a_i^2$. Since $H$ is complete, there is an element $x$ which is the (norm) limit of $x_n$. By the continuity of the inner product, $(x, e_i) = \lim_n (x_n, e_i) = a_i$ for any $i \in N$. □

The element $x$ which is obtained in Theorem 5.18 is often written $x = \sum_{i=1}^\infty a_i e_i$. One consequence of the next theorem is that if $(e_i : i \in N)$ is a maximal orthonormal sequence, then the associated map $T$ is one-to-one, and so an $x \in H$ can be written in only one way as $\sum_{i=1}^\infty a_i e_i$. For this reason a maximal orthonormal set (also called a complete orthonormal set) is sometimes called an orthonormal basis. It should be emphasized that this notion of basis must be understood in a limiting sense and not in the algebraic sense of vector space theory.
5.19 Theorem The orthonormal sequence $(e_i : i \in N)$ in the Hilbert space $H$ is maximal iff each $x \in H$ can be written uniquely as $x = \sum_{i=1}^\infty (x, e_i) e_i$.

Proof: Suppose $(e_i)$ is maximal and $x \in H$. Then by the Riesz-Fischer theorem there is an element $y \in H$ so that $y = \sum_{i=1}^\infty (x, e_i) e_i$ and so $(y, e_i) = (x, e_i)$ for all $i \in N$. But then $(x - y, e_i) = 0$ for $i \in N$, so $x = y$ by the remark following Theorem 5.14. Conversely, suppose each $x \in H$ can be written as $x = \sum_{i=1}^\infty (x, e_i) e_i$. If $(e_i)$ is not maximal there is an $x \neq \theta$ in $H$ so that $(x, e_i) = 0$ for $i \in N$. But then $x = \sum_{i=1}^\infty (x, e_i) e_i = \theta$ (contradiction). □
5.20 Theorem (Parseval's Identities) If $(e_i : i \in N)$ is a maximal orthonormal sequence in the Hilbert space $H$ then

(i) $\sum_{i=1}^\infty |(x, e_i)|^2 = \|x\|^2$ for all $x \in H$;
(ii) $\sum_{i=1}^\infty (x, e_i)(y, e_i) = (x, y)$ for all $x, y \in H$.

Proof: We leave the proof of (ii) as an exercise. To prove (i) we see by Bessel's inequality that $\sum_{i=1}^\infty |(x, e_i)|^2 \le \|x\|^2$. On the other hand, given $\varepsilon > 0$ there is a $z = \sum_{i=1}^n (x, e_i) e_i$ so that $\|x - z\| < \varepsilon$, whence $\|x\| \le \|z\| + \varepsilon$. Thus

$$\|x\| - \varepsilon \le \|z\| = \Big(\sum_{i=1}^n |(x, e_i)|^2\Big)^{1/2} \le \Big(\sum_{i=1}^\infty |(x, e_i)|^2\Big)^{1/2},$$

and the result follows since $\varepsilon$ is arbitrary. □
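Parseval's identities can be checked in the simplest setting, a (finite) orthonormal basis of $R^2$ obtained by rotating the standard basis. This is a sketch only: the angle `theta` and the vectors `x`, `y` are arbitrary illustrative choices, and the finite-dimensional case stands in for the sequence case treated in the theorem.

```python
import math

def dot(u, v):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

theta = 0.7  # arbitrary rotation angle (assumption for illustration)
e1 = [math.cos(theta), math.sin(theta)]
e2 = [-math.sin(theta), math.cos(theta)]
basis = [e1, e2]  # a maximal orthonormal set in R^2

x = [3.0, 4.0]
y = [-1.0, 2.0]

parseval_i = sum(dot(x, e) ** 2 for e in basis)          # compare with ||x||^2
parseval_ii = sum(dot(x, e) * dot(y, e) for e in basis)  # compare with (x, y)

print(parseval_i, dot(x, x))
print(parseval_ii, dot(x, y))
```

Both printed pairs agree up to rounding, whatever angle is chosen.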
The results above can now be used to show that $l_2$ is essentially the only separable infinite-dimensional Hilbert space.

5.21 Theorem Given a maximal orthonormal sequence $S = (e_i : i \in N)$ in a separable Hilbert space $H$, the associated map $T: H \to l_2$ is one-to-one, onto, and satisfies $(x, y) = (Tx, Ty)$ for all $x, y \in H$, and so $T$ is a Hilbert space isomorphism.

Proof: Use 5.18-5.20. □
We end this section with an application of nonstandard analysis to prove a theorem concerning compact operators in Hilbert space. Much more can be done in this direction. In particular, Bernstein and Robinson [4] first proved that so-called polynomially compact operators have nontrivial invariant subspaces using refinements of the technique used here. We are going to prove that every compact operator on a separable Hilbert space $H$ can be approximated arbitrarily closely by an operator of "finite rank."

5.22 Definition An operator $Q: H \to H$ is of finite rank if there is a finite-dimensional subspace $E \subset H$ so that $Qx \in E$ for each $x \in H$.

Since every separable infinite-dimensional Hilbert space $H$ is isomorphic to $l_2$ we will identify $H$ and $l_2$ in the following discussion. Thus we will assume that an orthonormal sequence $(e_i)$ is given and represent any $x \in H$ as either $x = \sum_{i=1}^\infty a_i e_i$ or $(a_1, a_2, \ldots)$. First we need the following lemma.
5.23 Lemma If $x = (a_i : i \in {}^*N) \in {}^*l_2$ is near-standard, then $\sum |a_i|^2\ (i \in {}^*N,\ i \ge \omega)$ is infinitesimal for any $\omega \in {}^*N_\infty$.

Proof: If $y = (b_i : i \in N) \in l_2$ then $\lim_{k \to \infty} \sum |b_i|^2\ (i \in N,\ i \ge k) = 0$, so $\sum |{}^*b_i|^2\ (i \in {}^*N,\ i \ge \omega) \simeq 0$ for any infinite $\omega$. Now since $x \in {}^*l_2$ is near-standard there is a $y \in l_2$ with $\|x - {}^*y\|^2 = \sum |a_i - {}^*b_i|^2\ (i \in {}^*N) \simeq 0$. By transfer of the triangle inequality,

$$\Big(\sum_{i \ge \omega} |a_i|^2\Big)^{1/2} \le \Big(\sum_{i \ge \omega} |a_i - {}^*b_i|^2\Big)^{1/2} + \Big(\sum_{i \ge \omega} |{}^*b_i|^2\Big)^{1/2},$$

and both terms on the right are infinitesimal. □

5.24 Theorem Let $T: H \to H$ be a compact linear operator. For each $\varepsilon > 0$ there is an operator $Q$ of finite rank so that $\|T - Q\| < \varepsilon$.

Proof: For each $k \in {}^*N$ (finite or infinite) we define a projection operator $P_k: {}^*H \to {}^*H$ by $P_k x = (a_1, a_2, \ldots, a_k, 0, 0, \ldots)$ when $x = (a_i : i \in {}^*N)$. Then $P_k$ is linear and $\|P_k x\| \le \|x\|$ for any $x \in {}^*H$. Also, $\|(I - P_k)x\|^2 = \sum |a_i|^2\ (i \in {}^*N,\ i \ge k + 1)$, and so, by Lemma 5.23, $\|(I - P_k)x\|$ is infinitesimal for $k$ infinite and $x$ near-standard. Since $T$ is compact, ${}^*T$ maps each point of finite norm in ${}^*H$ to a near-standard point, and it follows that $\|{}^*T - P_k{}^*T\|$ is infinitesimal for all infinite $k$. Now let $\varepsilon > 0$ in $R$ be given. The internal set $A = \{n \in {}^*N : \|{}^*T - P_n{}^*T\| < \varepsilon\}$ contains all infinite natural numbers, and so contains a finite (standard) integer $m$ by Corollary 7.2(ii) of Chapter II. Thus $\|{}^*T - P_m{}^*T\| < \varepsilon$. Transferring down shows that $\|T - P_m T\| < \varepsilon$. Finally, the operator $Q = P_m T$ is of finite rank since its range is contained in the subspace $E$ generated by $\{e_1, \ldots, e_m\}$. □
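The conclusion of Theorem 5.24 can be made concrete for one particular compact operator on $l_2$, the diagonal operator $Te_i = e_i/i$. For this operator $T - P_mT$ is again diagonal, with entries $1/i$ for $i > m$ and $0$ otherwise, so its norm is $1/(m+1)$. The sketch below works with the first `N` coordinates only; the operator, the truncation level `N`, and the function names are assumptions chosen for illustration, not part of the text.

```python
def tail_norm(m, N):
    """Operator norm of T - P_m T for T e_i = e_i / i, restricted to the
    first N coordinates. The difference is diagonal with entries 1/i for
    i > m, and the norm of a diagonal operator is its largest entry."""
    return max((1.0 / i for i in range(m + 1, N + 1)), default=0.0)

def rank_needed(eps, N):
    """Smallest m with ||T - P_m T|| < eps on the first N coordinates."""
    m = 0
    while tail_norm(m, N) >= eps:
        m += 1
    return m

N = 1000  # truncation level (hypothetical)
print(tail_norm(4, N))
print(rank_needed(0.1, N))
```

For a given tolerance, `rank_needed` returns the rank of the finite-rank operator $Q = P_mT$ that the theorem promises for this example.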
This result can be used as a starting point for the Fredholm theory of compact operators.

Exercises III.5
1. Show that if $(\cdot\,, \cdot)$ is an inner product on a vector space $H$ then the map $\|\cdot\|: H \to R^+$ defined by $\|x\| = \sqrt{(x, x)}$ is a norm on $H$.
2. Prove Corollary 5.3.
3. Show that the element $x_0$ of Theorem 5.9 is unique.
4. Complete the proof of Theorem 5.10.
5. Finish the proof of Theorem 5.11.
6. Show that $S$ is a maximal orthonormal set iff $x \in H$ and $x \perp S \Rightarrow x = \theta$.
7. Show that if any orthonormal set in an inner-product space $H$ is either finite or countable, then $H$ is separable.
8. Prove Theorem 5.17.
9. Prove Theorem 5.20(ii).
10. Establish the following converse to Lemma 5.23. If $x = (a_i : i \in {}^*N) \in {}^*l_2$, $\|x\|^2 = \sum |a_i|^2\ (i \in {}^*N)$ is finite, and $\sum |a_i|^2\ (i \in {}^*N,\ i > \omega) \simeq 0$ for all infinite $\omega$, then $x$ is near-standard.
11. The Hilbert cube is the set of all $x = (x_i) \in l_2$ such that $|x_i| \le 1/i$, $i \in N$. Show that the Hilbert cube is compact.
12. Let $H$ be a Hilbert space and let $B(H)$ denote the normed space of all bounded linear operators $A: H \to H$. A subbase for the weak operator topology on $B(H)$ is formed by the collection of all sets of the form $\{A : |((A - A_0)x, y)| < \delta\}$, $A_0 \in B(H)$, $x, y \in H$ and $\delta > 0$ in $R$. Show that the monad of $A_0$ in $B(H)$ in the weak topology is given by $\mu(A_0) = \{A \in {}^*B(H) : (Ax, y) \simeq (A_0x, y)$ for all standard $x, y \in H\}$.
13. (Standard) A bilinear form on $H$ is a map $B: H \times H \to R$ such that $B(x, \cdot)$ is linear for each $x \in H$ and $B(\cdot, y)$ is linear for each $y \in H$. $B$ is bounded if there exists $M \in R$ such that $|B(x, y)| \le M\|x\|\,\|y\|$ for all $x, y \in H$. Show that if $B$ is a bounded bilinear form, then there exists an operator $T \in B(H)$ such that $B(x, y) = (Tx, y)$ for all $x, y \in H$.
14. Use Exercises 5.12 and 5.13 to show that the unit ball in $B(H)$ is compact in the weak operator topology.

III.6 Nonstandard Hulls of Metric Spaces
In this short section we introduce the reader to the concept of the nonstandard hull of a metric space. This notion was introduced by Luxemburg [36] and has proved to be a powerful tool in the nonstandard analysis of Banach spaces, as indicated by the survey paper of Henson and Moore [16]. The technique of nonstandard analysis, as applied to the theory of Banach spaces, is essentially equivalent to the use of Banach space ultrapowers, a technique which originated with Dacunha-Castelle and Krivine [10] and is now used extensively. Nonstandard methods, however, are more intuitive and usually easier to apply, especially when they involve concepts, such as the internal cardinality of a *-finite set, which are not easy to express in the ultraproduct setting.

In this section we will assume that the nonstandard analysis is carried out in a $\kappa$-saturated enlargement where $\kappa > \aleph_0$. Suppose that $(X, d)$ is a metric space. Recall that the principal galaxy $G = \mathrm{fin}({}^*X)$ is the set of points in ${}^*X$ each of which is at a finite distance from a point in $X$ (regarded as embedded in ${}^*X$). If $a, b \in {}^*X$ we say as usual that $a \simeq b$ if ${}^*d(a, b) \simeq 0$. Let $\hat{X}$ denote the equivalence classes of $G$ under the equivalence relation $\simeq$. Alternatively, $\hat{X}$ is the set of monads, where each monad $m(a) = \{b \in G : {}^*d(a, b) \simeq 0\}$ for $a \in G$ (notice that if $a \in G$ and $b \simeq a$ then $b \in G$). Since ${}^*d(a, b)$ is finite for any $a, b \in G$, we can define

$$\hat{d}(x, y) = \mathrm{st}({}^*d(a, b))$$

when $x = m(a)$ and $y = m(b)$ in $\hat{X}$.
6.1 Proposition $(\hat{X}, \hat{d})$ is a metric space.

Proof: Exercise. □

6.2 Definition $(\hat{X}, \hat{d})$ is called the nonstandard hull of $(X, d)$.

We now use saturation to prove that $(\hat{X}, \hat{d})$ is complete [even if $(X, d)$ is not]. Our construction is like that of Theorem 3.17, but here $\hat{X}$ consists of monads of finite points and not just pre-near-standard points.
6.3 Theorem Suppose that ${}^*X$ lies in a $\kappa$-saturated superstructure with $\kappa > \aleph_0$. Then $(\hat{X}, \hat{d})$ is a complete metric space.

Proof: Let $(m(a_i) : i \in N)$ be a Cauchy sequence in $(\hat{X}, \hat{d})$. Then for each $k \in N$ there is an $n(k) \in N$ so that ${}^*d(a_i, a_j) < 1/k$ if $i$ and $j$ are both $> n(k)$; we can assume without loss of generality that $n(k) \to \infty$ as $k \to \infty$. Let $\phi(i) = a_i$. By Theorem 8.5 of Chapter II, the map $\phi: N \to {}^*X$ can be extended to an internal map $\tilde{\phi}: \tilde{N} \to {}^*X$, where $\tilde{N} \subseteq {}^*N$ is internal and contains $N$ and so contains some infinite integer; for $i \in \tilde{N}$ we write $a_i = \tilde{\phi}(i)$. We would like to show that there is some infinite integer $m'$ in $\tilde{N}$ so that ${}^*d(a_i, a_{m'}) < 1/k$ for all $i \in N$ with $i > n(k)$, where $a_{m'} = \tilde{\phi}(m')$. For any $k \in N$ the set $E(k) = \{m \in \tilde{N} : {}^*d(a_i, a_j) < 1/k$ for all $i, j \in \tilde{N}$ satisfying $n(k) < i \le m$, $n(k) < j \le m\}$ is internal and contains $N$. Therefore $E(k)$ also contains $\{m \in {}^*N : m \le m_k\}$ for some infinite integer $m_k$, and we may assume that $m_{k+1} \le m_k$ for all $k \in N$. Again by Theorem 8.5 of Chapter II we may extend the sequence $(m_k)$ to an internal decreasing mapping from an internal subset of ${}^*N$ containing $N$ into ${}^*N$. Since $m_k > k$ for each finite $k \in N$, there is an infinite $\omega$ with $m_\omega \ge \omega$ and $m_\omega \in E(k)$ for all $k \in N$. Let $m' = m_\omega$. Then ${}^*d(a_i, a_{m'}) < 1/k$ for all $i \in N$ with $i > n(k)$. It follows that $a_{m'}$ is finite and $(m(a_i))$ converges to $m(a_{m'})$. □
If our metric space $(X, d)$ is a normed vector space with norm $\|\cdot\|$, the nonstandard hull can be made into a normed vector space in an obvious way. For in this case $G$ consists of all $x \in {}^*X$ for which $\|x\|$ is finite, and so $G$ is a vector space over the reals. We define addition and scalar multiplication of elements in $\hat{X}$ by

$$m(x) + m(y) = m(x + y), \qquad x, y \in G,$$

and $\alpha\, m(x) = m(\alpha x)$. Also we define a norm in $\hat{X}$ by $|||m(x)||| = \mathrm{st}\,\|x\|$, $x \in G$. It is easy to check that $\hat{d}(m(x), m(y)) = |||m(x) - m(y)|||$. From Theorem 6.3 we see that $(\hat{X}, |||\cdot|||)$ is a Banach space. The details are left to the reader.

Exercises III.6
1. Prove Proposition 6.1; in particular, show $\hat{d}$ is well defined.
2. Show that if $(X, \|\cdot\|)$ is a normed space, then $(\hat{X}, |||\cdot|||)$ is a Banach space.
3. Show that there is an isometric embedding of a Banach space into its nonstandard hull.
4. Consider the sequence $(e_n)$ which is $0$ for $n \neq \omega \in {}^*N_\infty$ and $1$ for $n = \omega$ to show that the mapping in Exercise 3 is not onto for $l_2$.
5. Consider the sequence $(x_n)$, where $x_n = 1/\omega$ for $1 \le n \le \omega$, $\omega \in {}^*N_\infty$, and $x_n = 0$ for $n > \omega$, to show that the mapping in Exercise 3 is not onto for $l_1$.

*III.7 Compactifications
In this section we show how some Hausdorff spaces $(X, \mathcal{T})$ can be embedded as dense subsets of compact Hausdorff spaces $(Y, \mathcal{S})$. That is, there exists a 1-1 map $\psi: X \to \mathrm{range}\ \psi \subseteq Y$ so that $\psi$ is a homeomorphism and $\mathrm{range}\ \psi$ is dense in $Y$. In this case $(Y, \mathcal{S})$ is called a compactification of $(X, \mathcal{T})$. We usually identify $X$ and $\mathrm{range}\ \psi$, and so regard $X$ as a subset of $Y$; we will denote $Y$ by $\bar{X}$.

A given space $X$ typically has many compactifications. For example, if one adjoins $0$ and $1$ to $(0, 1)$ one obtains the compact interval $[0, 1]$. Adjoining a single point to both ends of $(0, 1)$ gives a circle. Similarly the plane can be made into a sphere by adjoining a single point. We are interested here in compactifying a space $X$ so that certain continuous functions on $X$ have continuous extensions to $\bar{X}$. What, for example, should one adjoin to $(0, 1]$ to make $\sin(1/x)$ continuous on the resulting compact space?

7.1 Definition Let $Q$ be a family of (perhaps not uniformly) bounded, continuous, real-valued functions on $(X, \mathcal{T})$. $(\bar{X}, \bar{\mathcal{T}})$ is called a Q-compactification of $(X, \mathcal{T})$ if it is a compactification for which
(a) each $f \in Q$ has a continuous extension $\bar{f}$ to $(\bar{X}, \bar{\mathcal{T}})$,
(b) if $x$ and $y$ are different points in $\bar{X} - X$ there is an $f \in Q$ whose extension $\bar{f}$ separates $x$ and $y$, i.e., $\bar{f}(x) \neq \bar{f}(y)$.
We sometimes write $\bar{X}^Q$ for $\bar{X}$.

In order to construct a Q-compactification we need to suppose that $Q$ contains sufficiently many functions.
7.2 Definition A family $Q$ of continuous functions distinguishes points and closed sets if, for each closed set $A \subset X$ and each $x \in X - A$, there is an $f \in Q$ so that $f(x) \notin \overline{f[A]}$.
It should be noted that not all Hausdorff spaces $X$ admit sufficiently many continuous real-valued functions to distinguish points and closed sets. There are enough functions if $X$ is completely regular [20]. The compactifications of this section will be constructed from ${}^*X$. The original work on this construction was done by Gonshor [15], Luxemburg [36], Machover and Hirschfeld [37], and Robinson [43].

Let $Q$ be a family of bounded, continuous, real-valued functions on $(X, \mathcal{T})$. Assuming that $Q$ distinguishes points and closed sets, we construct $\bar{X}$ as follows. We call two points $y, z \in {}^*X$ equivalent, and write $y \sim z$, if ${}^*f(y) \simeq {}^*f(z)$ for all $f \in Q$. It is easy to see (check) that $\sim$ is an equivalence relation. The equivalence class containing $x \in {}^*X$ is denoted by $[x]$, and the set of all equivalence classes is $\bar{X}$.

Next we show that if $x \in X$, then $[x] = m(x)$, the monad of $x$. First note that if $y \in m(x)$, then $y \sim x$ since each $f \in Q$ is continuous. On the other hand, if $U$ is an open set containing $x$, then there is an $f \in Q$ so that $f(x) \notin \overline{f[X - U]}$. Thus $f^{-1}[R - \overline{f[X - U]}]$ is an open set containing $x$ and contained in $U$, so any $y \sim x$ lies in ${}^*U$. We conclude that $[x] = m(x)$.

We extend each $f \in Q$ to a function on $\bar{X}$ (again denoted by $f$) by setting $f([y]) = \mathrm{st}({}^*f(y))$, $y \in {}^*X$ (check that $f$ is well defined). The set of extended functions is again denoted by $Q$. The topology $\bar{\mathcal{T}}$ on $\bar{X}$ is the weak topology for the functions in $Q$. Thus $U$ is open in $\bar{X}$ iff for each $[y] \in U$ there is a finite set $\{f_1, \ldots, f_n\} \subseteq Q$ and a positive number $\varepsilon$ in $R$ so that $\{[z] \in \bar{X} : |f_i([z]) - f_i([y])| < \varepsilon,\ 1 \le i \le n\} \subset U$. In order that we may treat $\bar{X}$ as an element of the original superstructure $V(X)$ (which will be used in the proof of compactness), we may think of each point in $\bar{X}$ as a function on $Q$ by the definition $[y](f) = f([y])$. Distinct points of $\bar{X}$ give distinct functions on $Q$. The standard construction of $\bar{X}$ is based on such a family of functions on $Q$. It is often helpful, however, to think of $\bar{X}$ as a quotient of ${}^*X$ as we have done.

Let $\bar{X}^Q$ be constructed as above from a set $Q$ of bounded, continuous, real-valued functions on $(X, \mathcal{T})$ which separate points and closed sets.
7.3 Theorem $(\bar{X}^Q, \bar{\mathcal{T}})$ is a Q-compactification of $(X, \mathcal{T})$.

Proof: Let $\bar{X}^Q$ be denoted by $\bar{X}$. Define the map $\psi: X \to \bar{X}$ by $\psi(x) = [x]$, $x \in X$. Now $[x] = m(x)$ for $x \in X$, so the map $\psi$ is 1-1 by 1.12(c) since $X$ is Hausdorff.
To show $\psi$ is a homeomorphism, we must show that $\psi$ and $\psi^{-1}$ are continuous. An easy exercise shows that $\psi$ is continuous. To see that $\psi^{-1}$ is continuous, we must show that if $x \in X$ and $V$ is an open neighborhood of $x$ in $\mathcal{T}$, then there is a $U \in \bar{\mathcal{T}}_x$ so that $U \cap X \subseteq V$ (we regard $X$ as contained in $\bar{X}$). Let $f \in Q$ be such that $f(x) \notin \overline{f[A]}$, where $A = X - V$. Then there is an $\varepsilon > 0$ in $R$ so that $\{z \in X : |f(z) - f(x)| < \varepsilon\} \subseteq V$ (why?); we let $U = \{z \in \bar{X} : |f(z) - f(x)| < \varepsilon\}$.

To show $\psi[X]$ is dense in $\bar{X}$, let $[y] \in \bar{X} - \psi[X]$, and let $U \in \bar{\mathcal{T}}$ be given by $U = \{[z] \in \bar{X} : |f_i([z]) - f_i([y])| < \varepsilon,\ 1 \le i \le n\}$. We must show that $[x] \in U$ for some $x \in X$. Let $a_i = f_i([y])$, $1 \le i \le n$. Then the set $\{x \in {}^*X : |{}^*f_i(x) - a_i| < \varepsilon,\ 1 \le i \le n\}$ is not empty (indeed it contains $y$). By downward transfer, the set $\{x \in X : |f_i(x) - a_i| < \varepsilon,\ 1 \le i \le n\}$ is not empty, and we are through.

To show that $\bar{X}$ is compact we consider a mapping $T$ on $\bar{X}$. For each $[y] \in \bar{X}$, $T([y])$ is the function from $Q$ into $R$ defined by setting $T([y])(f) = f([y])$ for each $f \in Q$. Let $A$ be the range of $T$; then $T$ is a 1-1 mapping from $\bar{X}$ onto $A$. We make $T$ a homeomorphism by letting $U$ be open in $A$ iff $T^{-1}[U]$ is open in $\bar{X}$. Thus a typical neighborhood of an $a \in A$ is given by a finite set $\{f_1, \ldots, f_n\} \subset Q$ and an $\varepsilon > 0$ in $R$: it consists of those $b \in A$ with $|a(f_i) - b(f_i)| < \varepsilon$, $1 \le i \le n$. Since $X$ is dense in $\bar{X}$, each such neighborhood contains a $T([x])$ for some $x \in X$; i.e., $|a(f_i) - f_i(x)| < \varepsilon$ for $1 \le i \le n$. To show that $\bar{X}$ is compact, we need only show that $A$ is compact. Fix $b \in {}^*A$. Let $\varepsilon$ be a positive infinitesimal in ${}^*R$, and let $Q_F$ be a hyperfinite subset of ${}^*Q$ such that ${}^*f \in Q_F$ for each $f \in Q$. By the transfer principle, there is an $x \in {}^*X$ such that $|b(f) - f(x)| < \varepsilon$ for each $f \in Q_F$. Let $c = T([x])$. For each $f \in Q$, $c(f) = T([x])(f) \simeq {}^*f(x) \simeq b({}^*f)$, so $b$ is in the monad of the standard point $c \in A$. Thus $A$ is compact.

Finally, by the construction, each member of $Q$ has a continuous extension to $\bar{X}$, and the family of extensions separates the points of $\bar{X}$. □
It is not hard to see that if $Q_1$ and $Q_2$ are two families as described above with $Q_1 \subseteq Q_2$, then there is a continuous map $\zeta$ from $\bar{X}^{Q_2}$ onto $\bar{X}^{Q_1}$ such that $\zeta(x) = x$ for all $x \in X$. In this case we write $\bar{X}^{Q_1} \le \bar{X}^{Q_2}$. It follows that a Q-compactification of $X$ is unique up to a homeomorphism that leaves the points of $X$ fixed (see, for example, [20, Theorem 22]). Any compactification $\bar{X}$ of $X$ is a Q-compactification; just let $Q = \{g|_X :$ the function $g: \bar{X} \to R$ continuous$\}$, where $g|_X$ denotes the restriction of $g$ to $X$. It follows that if $Q$ consists of all bounded, continuous, real-valued functions on $X$, then $\bar{X}^Q$ is the largest compactification of $X$, i.e., $\bar{X}^Q \ge \bar{X}$ for any other compactification $\bar{X}$ of $X$. $\bar{X}^Q$ is called the Stone-Čech compactification and is often denoted by $\beta X$.

If $(X, \mathcal{T})$ is locally compact (i.e., each point $x \in X$ has at least one compact neighborhood), and $Q_0$ is any family of bounded, continuous, real-valued functions on $X$, then, following Constantinescu and Cornea [9], we may obtain $Q$ by adjoining to $Q_0$ the family $C_c$ of all continuous functions with compact support (i.e., vanishing off compact sets). If $Q_0$ is empty and so $Q = C_c$, then $\bar{X}^Q$ can be identified with $X \cup \{\infty\}$, where $\infty$ is a single point not in $X$. In this case we have a topology $\bar{\mathcal{T}}$ whose members are all open sets in $X$, together with all sets $U$ of $\bar{X}^Q$ such that $\bar{X}^Q - U$ is compact in $X$ (check); the space $\bar{X}^Q$ is called the one-point compactification of $X$.

We end this section with a result concerning the Stone-Čech compactification $\bar{N}$ of the natural numbers $N$ (with the discrete topology).

7.4 Theorem The points in $\bar{N} - N$ are in one-to-one correspondence with the free ultrafilters on $N$ via the map $[\omega] \to \mathcal{F}_{[\omega]}$, where $\mathcal{F}_{[\omega]} = \{A \subset N : \omega \in {}^*A\}$.
Proof: For each $A \subset N$, the characteristic function $\chi_A$ is in $Q$, the set of bounded continuous functions on $N$. It follows that for each equivalence class $[\omega] \in \bar{N} - N$ either $[\omega] \subset {}^*A$ or $[\omega] \subset {}^*N - {}^*A$. If $[\omega] \in \bar{N} - N$ then $\omega \in {}^*N_\infty$, and the family $\mathcal{F}_{[\omega]} = \{A \subset N : \omega \in {}^*A\}$ is a free ultrafilter (exercise). This is the same ultrafilter as $\{A \subset N : \chi_A([\omega]) = 1\}$, where $\chi_A$ has been extended to $\bar{N}$. On the other hand, if $\mathcal{F}$ is a free ultrafilter on $N$, then the intersection monad $\mu(\mathcal{F}) = \bigcap {}^*F\ (F \in \mathcal{F})$ is a unique element $[\omega]$ in $\bar{N} - N$. To prove this, we assume it is false. Then there are at least two distinct equivalence classes $[\omega]$ and $[y]$ in $\mu(\mathcal{F})$ and a bounded sequence $(s_n) \in Q$ such that $a = {}^\circ s_\omega \neq {}^\circ s_y = b$. We may assume that $a < b$ and choose $c \in R$ with $a < c < b$. Since either $\{n \in N : s_n \le c\} \in \mathcal{F}$ or $\{n \in N : s_n > c\} \in \mathcal{F}$, we have a contradiction. □

Exercises III.7
1. Let $(X, \mathcal{T})$ be locally compact, and let $\bar{X}$ denote the one-point compactification of $X$. Let $A$ be an internal set of near-standard points in ${}^*X$. Use the fact that $\mathrm{st}[A]$ is closed in $X$ and a closed subset of $\bar{X}$ is compact to show that $\mathrm{st}[A]$ is compact.
2. Show directly that the one-point compactification of a locally compact Hausdorff space is compact.
3. Show that, for $\omega \in {}^*N_\infty$, $\{A \subset N : \omega \in {}^*A\}$ is a free ultrafilter.
4. What is the Q-compactification of $(0, 1)$ when $Q = \{f(x) = x\}$?
5. What is the Q-compactification of $(0, 1)$ when $Q = \{f(x) = x,\ g(x) = \sin(1/x)\}$?
6. Show that $X$ is open in a compactification $\bar{X}$ if and only if $X$ is locally compact.

*III.8 Function Spaces
Let $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ be Hausdorff topological spaces and $F$ be a family of mappings from $X$ into $Y$. This section will be concerned with two questions:
(a) For which topologies $\mathcal{W}$ on $F$ is the map $\phi_A: F \times A \to Y$ defined by $\phi_A(f, x) = f(x)$ continuous for all subsets $A \subseteq X$ in a certain family $\mathcal{K}$? Such a topology $\mathcal{W}$ is said to be jointly continuous with respect to $\mathcal{K}$.
(b) For which topologies on the space $M$ of all mappings from $X$ into $Y$ is the closure of $F$ compact?

To answer these questions, we consider two important topologies, the topology of pointwise convergence and the compact-open topology. For a standard treatment the reader is referred to Kelley [20, Chapter 7]. Our treatment follows suggestions of Hirschfeld [18].

The nonstandard analysis will be done in an enlargement of a structure containing $X$ and $Y$. Monads in $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ will be denoted by $m_{\mathcal{T}}(x)$ $(x \in X)$ and $m_{\mathcal{S}}(y)$ $(y \in Y)$, respectively, but we will denote nearness in both $X$ and $Y$ by $\simeq$ as in §III.1. With each subset $A \subseteq X$ we associate an important pseudomonad $k_A(f)$ $(f \in M)$ on the space $M$ of all maps from $X$ into $Y$ by setting

$$k_A(f) = \{g \in {}^*M : g(x') \simeq f(x) \text{ for all } x \in A \text{ and } x' \in {}^*A \text{ with } x' \simeq x\}. \tag{8.1}$$

The following result provides a nonstandard answer to question (a).

8.1 Proposition Let $\mathcal{W}$ be a topology for $F$ with associated monads $m(f)$ $(f \in F)$. Then $\mathcal{W}$ is jointly continuous with respect to $\mathcal{K}$ iff $m(f) \subset \bigcap \{k_A(f) : A \in \mathcal{K}\}$ for all $f \in F$.

Proof: We need only show that, for each $A \subseteq X$, $\phi_A$ is continuous iff $m(f) \subset k_A(f)$ for all $f \in F$. But for $f \in F$ and $x \in A$, the monad of $(f, x)$ in ${}^*F \times {}^*A$ is $m(f) \times m_A(x)$, where $m_A(x) = m_{\mathcal{T}}(x) \cap {}^*A$. Thus $\phi_A$ is continuous at each $(f, x) \in F \times A$ $\Leftrightarrow$ ${}^*\phi_A(m(f) \times m_A(x)) \subset m_{\mathcal{S}}(\phi_A(f, x))$ for each $f \in F$, $x \in A$ $\Leftrightarrow$ if $f \in F$, $x \in A$, then whenever $g \in m(f)$ and $y \simeq x$, $y \in {}^*A$, we have $g(y) \simeq f(x)$ $\Leftrightarrow$ $m(f) \subseteq k_A(f)$ for each $f \in F$. □

8.2 Definition
(a) The topology of pointwise convergence $\mathcal{P}$ on $M$ is the weak topology for the family $\{\phi_x : x \in X\}$ of evaluation maps $\phi_x: M \to Y$ defined by $\phi_x(f) = f(x)$. The monads for $\mathcal{P}$ are denoted by $p(f)$ $(f \in M)$.
(b) The compact-open topology $\mathcal{C}$ on $M$ is generated by the subbase consisting of all sets of the form $W(K, U) = \{g \in M : g[K] \subset U\}$, where $K$ is compact in $(X, \mathcal{T})$ and $U$ is open in $(Y, \mathcal{S})$. We let $c(f)$ $(f \in M)$ denote the monads of $\mathcal{C}$.

From 1.18 we see that

$$p(f) = \{g \in {}^*M : g(x) \simeq f(x) \text{ for all standard points } x \in X\}. \tag{8.2}$$
8.3 Proposition Let $\mathcal{K}$ be the family of compact subsets of $(X, \mathcal{T})$. Then, for each $f \in M$, $k_X(f) \subseteq \bigcap \{k_A(f) : A \in \mathcal{K}\} \subseteq c(f) \subseteq p(f)$.

Proof: (i) $k_X(f) \subseteq k_A(f)$ for any $A \subseteq X$, and the first containment follows. (ii) Let $K$ be compact in $(X, \mathcal{T})$ and $U$ be an open set in $(Y, \mathcal{S})$ containing $f[K]$. If $g \in \bigcap \{k_A(f) : A \in \mathcal{K}\}$, then $g \in k_K(f)$, so $g(y) \simeq f(x)$ for all $x \in K$ and all $y \in {}^*K$ with $y \simeq x$. Since $U$ is open, $g(y) \in {}^*U$ for all $y \in {}^*K$ with $y \simeq x \in K$. But this includes all $y \in {}^*K$ since $K$ is compact, and so $g[{}^*K] \subseteq {}^*U$, i.e., $g \in {}^*W(K, U)$. Thus $\bigcap \{k_A(f) : A \in \mathcal{K}\} \subseteq {}^*W(K, U)$ for any $K$ and $U$ with $f[K] \subset U$, and the second containment follows. (iii) A subbase for $\mathcal{P}$ consists of sets of the form $W(\{x\}, U)$, and so $\mathcal{P}$ is weaker than $\mathcal{C}$ and the third containment follows. □
8.4 Theorem Each topology which is jointly continuous with respect to the family of compact subsets of $X$ is stronger than $\mathcal{C}$.

Proof: Immediate from 8.1 and 8.3. □
8.5 Theorem Assume $F \subset M$ is closed with respect to $\mathcal{P}$. Then $F$ is compact in $(M, \mathcal{P})$ if for each $x$ the set $\{f(x) : f \in F\}$ has compact closure in $Y$.

Proof: Our condition guarantees that, for any $x \in X$, every point in ${}^*\{f(x) : f \in F\} = \{g(x) : g \in {}^*F\}$ is near a standard point in $Y$. Given $g \in {}^*F$, let $f(x)$ be defined for each $x \in X$ by setting $f(x) = y_x$, where $y_x$ is a point in $Y$ with $y_x \simeq g(x)$ [such a point is unique since $(Y, \mathcal{S})$ is Hausdorff]. Then $f \in M$ and $f(x) \simeq g(x)$ for all $x \in X$, i.e., $g \in p(f)$. Since $g \in {}^*F$ and $F \subset M$ is closed, $f \in F$. Thus each $g \in {}^*F$ is near a standard $f \in F$. □
The fact that $\{f(x) : f \in F\}$ has compact closure for each $x \in X$ is an essential ingredient in obtaining a function $f \in F$ from a function $g \in {}^*F$. The argument of Theorem 8.5 does not work, however, for the compact-open topology since the condition $g(x) \simeq f(x)$ for all $x \in X$ is not sufficient to guarantee that $g \in c(f)$. If, however, $g(x') \simeq f(x)$ for all $x \in X$ and $x' \in {}^*X$ with $x' \simeq x$, then $g \in k_X(f) \subset c(f)$ (by Proposition 8.3) and compactness follows. A standard condition guaranteeing that this holds is the following from Kelley [20].
8.6 Definition The family $F$ is evenly continuous if for each $x \in X$, $y \in Y$ and each open neighborhood $U$ of $y$, there are neighborhoods $V$ of $x$ and $W$ of $y$ so that for all $f \in F$ with $f(x) \in W$, we have $f[V] \subset U$.

8.7 Proposition The family $F$ is evenly continuous iff the following condition holds: Given $x \in X$ and $y \in Y$, if $g \in {}^*F$ and $g(x) \simeq y$, then $g(x') \simeq y$ for all $x' \simeq x$ in ${}^*X$.

Proof: Assume first that $F$ is evenly continuous. Fix a neighborhood $U \in \mathcal{S}_y$ and the corresponding sets $V \in \mathcal{T}_x$ and $W \in \mathcal{S}_y$ given by Definition 8.6. Since $g(x) \simeq y$, $g(x) \in {}^*W$, so by transfer $g[{}^*V] \subset {}^*U$. In particular, $g(x') \in {}^*U$ if $x' \simeq x$. This last statement is true for any $U \in \mathcal{S}_y$, and so $g(x') \simeq y$ if $x' \simeq x$. To prove the converse, fix $U \in \mathcal{S}_y$ and let $V$ and $W$ be *-open sets in ${}^*\mathcal{T}_x$ and ${}^*\mathcal{S}_y$, respectively, with $V \subseteq m_{\mathcal{T}}(x)$ and $W \subset m_{\mathcal{S}}(y)$. Now if $g \in {}^*F$ and $g(x) \in W$, then $g(x) \simeq y$. By assumption, for all $x' \in V$, $g(x') \in m_{\mathcal{S}}(y) \subset {}^*U$. The rest follows by downward transfer. □
As a corollary we get a generalized Ascoli theorem due to Kelley [20].

8.8 Ascoli Theorem If $F \subset M$ is closed in $\mathcal{C}$ and evenly continuous, and $\{f(x) : f \in F\}$ has compact closure for each $x \in X$, then $F$ is compact in $(M, \mathcal{C})$.

Proof: Immediate from the discussion preceding Definition 8.6. □
For the rest of this section we assume that $(Y, \mathcal{S})$ is a metric space with metric $d$. In this context, a notion which is closely related to even continuity is the notion of equicontinuity, which has already been presented in the real-variable case in Definition 1.13.6.

8.9 Definition A family $F \subset M$ is called equicontinuous on $X$ if, for each $x \in X$ and each $\varepsilon > 0$ in $R$, there is a $V \in \mathcal{T}_x$ such that, for any $f \in F$, if $x' \in V$, then $d(f(x'), f(x)) < \varepsilon$.

8.10 Proposition The family $F \subset M$ is equicontinuous on $X$ iff, for any $x \in X$ and any $g \in {}^*F$, $g(x') \simeq g(x)$ whenever $x' \simeq x$.

Proof: Exercise. □
If $F$ is the family $\{n + nx : n \in N\}$ then $F$ is evenly continuous but not equicontinuous on $[0, 1]$. By Propositions 8.7 and 8.10, any equicontinuous family $F \subset M$ is evenly continuous. If $F \subset M$ is a family of continuous functions, then the compact-open topology in $F$ is the same as the topology of uniform convergence on compact sets, or the topology of compact convergence. For the latter topology, a typical basic open neighborhood of $f \in F$ is of the form $\{g \in F : d(f(x), g(x)) < \varepsilon$ for all $x \in K\}$ for some compact $K \subset X$ and $\varepsilon > 0$ in $R$ (see [20, p. 229]). It follows from Theorem 8.8 that if $F$ is an equicontinuous family in $M$ (whence each $f \in F$ is continuous), and $F$ is closed in $M$ with respect to the topology of uniform convergence on compact sets with $\{f(x) : f \in F\}$ having compact closure in $Y$ for each $x \in X$, then $F$ is compact with respect to the topology of uniform convergence on compact sets. Moreover, for an equicontinuous family $F$, the topology of pointwise convergence is jointly continuous on compact sets (exercise), and hence coincides with the topology of uniform convergence on compact sets.
Exercises III.8
1. Use Theorem 8.5 to prove Alaoglu's theorem, 4.22.
2. Prove Proposition 8.10.
3. (a) Show that the set of real-valued continuous functions on $R$ (with the usual topology) is closed with respect to the topology of uniform convergence on compact sets.
(b) Show that part (a) is no longer true if we replace the usual topology on $R$ with a topology $\mathscr{T}$ such that $\{r\} \in \mathscr{T}$ for each $r \neq 0$ in $R$, and $U$ is an open neighborhood of $0$ if $0 \in U$ and $R - U$ is countable. [Hint: What are the compact sets? Is $g$ continuous if $g(0) = 1$ and $g(r) = -1$ for $r \neq 0$?]
4. Show that if $(Y, \mathscr{S})$ is a metric space and $F$ is an equicontinuous family, then the topology of pointwise convergence is jointly continuous on compact sets and hence coincides with the topology of uniform convergence on compact sets.
5. Let $C$ denote the set of real-valued continuous functions on $I = [0,1]$. Then the map $d : C \times C \to R^+$ defined by $d(f,g) = \max\{|f(x) - g(x)| : x \in I\}$ is a metric on $C$. Show that the compact-open topology on $C$ coincides with the metric topology.
6. Show that the space $C(X, Y)$ of continuous mappings from $(X, \mathscr{T})$ to $(Y, \mathscr{S})$ with the compact-open topology is Hausdorff if $(Y, \mathscr{S})$ is Hausdorff.
CHAPTER IV
Nonstandard Integration Theory
In trying to apply the theory of the Riemann integral we are faced with the following technical problem. Suppose we are given a converging infinite series $\sum_{n=1}^{\infty} f_n(x) = f(x)$ of functions on $[a,b]$ and are asked to calculate $\int_a^b f(x)\,dx$. The answer is often simple if we can write
$\int_a^b f(x)\,dx = \sum_{n=1}^{\infty} \int_a^b f_n(x)\,dx.$
Thus we need to find conditions under which integration and infinite summation can be interchanged. Equivalently [letting $g_n(x) = \sum_{i=1}^{n} f_i(x)$] we need conditions under which, if $g(x) = \lim_{n\to\infty} g_n(x)$, then
$\lim_{n\to\infty} \int_a^b g_n(x)\,dx = \int_a^b g(x)\,dx$
for a sequence $\langle g_n(x) \rangle$ of Riemann-integrable functions on $[a,b]$. It turns out that we can reduce the discussion to sequences $\langle g_n(x) \rangle$ which are monotone increasing, i.e., $g_{n+1}(x) \ge g_n(x)$ for all $n \in N$ [this is the case if $f_n(x) \ge 0$ for all $n \in N$]. Thus, assuming that $\langle g_n(x) \rangle$ is a monotone increasing sequence of integrable functions and $g_n(x)$ converges to $g(x)$ on $[a,b]$, we need conditions which ensure that $g(x)$ is integrable and the above equation holds. A result of this type is known as a monotone convergence theorem. Unfortunately, the conditions under which a monotone convergence theorem holds for Riemann integration are quite restrictive (for example, it holds if the sequence $\langle g_n \rangle$ converges uniformly on $[a,b]$). This fact led Lebesgue [26] and others to generalize the process of integration in such a way that the conditions for a monotone convergence theorem were considerably relaxed. The procedure was to generalize the concept of the length of an interval so that one could measure the “length” of a very general subset of $[a,b]$ called a measurable set. The theory of integration then developed systematically from this “measure theory.”
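A standard illustration, not taken from the text, of how a monotone convergence theorem can fail for the Riemann integral may help here. The enumeration $\langle r_k \rangle$ and the choice of the interval $[0,1]$ are assumptions made only for this sketch.

```latex
Let $\langle r_k : k \in N \rangle$ enumerate the rational numbers in $[0,1]$, and set
\[
  g_n(x) =
  \begin{cases}
    1, & x \in \{r_1, \dots, r_n\}, \\
    0, & \text{otherwise.}
  \end{cases}
\]
Then $g_{n+1}(x) \ge g_n(x)$ for all $n$, each $g_n$ is Riemann integrable
(it differs from $0$ at only finitely many points), and
\[
  \int_0^1 g_n(x)\,dx = 0 \quad \text{for all } n \in N .
\]
The pointwise limit $g(x) = \lim_{n \to \infty} g_n(x)$ equals $1$ at rational $x$
and $0$ at irrational $x$. Every upper Riemann sum of $g$ on $[0,1]$ is $1$ and
every lower Riemann sum is $0$, so $g$ is not Riemann integrable, although the
integrals of the $g_n$ are uniformly bounded.
```

In the Lebesgue theory the same limit function is integrable with integral $0$, which is the behavior a monotone convergence theorem is meant to guarantee.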
An alternative approach was developed by P. Daniell [11]. He began with the general notions of a lattice $L$ of functions on a set $X$ and an integral $I$ on $L$. As indicated in Definition 1.2, a lattice of functions is a linear space which is also closed under the operation of taking absolute values, and an integral $I$ on $L$ is a linear functional which is also positive [i.e., $f \ge 0$ implies $I(f) \ge 0$]. Daniell showed that if $I$ satisfied the additional continuity condition “If $\langle f_n \rangle$ decreases to 0 then $I(f_n)$ decreases to 0,” $(L, I)$ could be enlarged to a structure $(\hat{L}, \hat{I})$ which satisfied the monotone convergence theorem.
Our nonstandard approach to integration follows the Daniell approach except that we begin with an “internal” integration structure $(L, I)$ on an internal set $X$ in some enlargement. We show that, without any continuity assumption, we can construct from $(L, I)$ a standard integration structure $(\hat{L}, \hat{I})$ on the same internal set $X$, and that structure satisfies the monotone convergence theorem. In §IV.2 we show that the usual measure-theoretic approach can be recovered from any structure $(\hat{L}, \hat{I})$ satisfying the monotone convergence theorem. The usual Lebesgue theory on $R^n$ is developed in §IV.3 by using the standard part map to carry results on ${}^*R^n$ down to $R^n$. Some important convergence theorems which hold in any structure for which the monotone convergence theorem is valid are developed in §IV.4. A nonstandard approach to the Fubini theorem, which is an analogue of the iterated integration procedure for the Riemann integral, is developed in §IV.5. Finally, in §IV.6 we apply the nonstandard integration theory developed in the previous sections to study several important stochastic processes, including the Poisson process and Brownian motion. These processes are represented as processes on a *-finite probability space and indicate the usefulness of an integration theory on nonstandard sets.
References to the original work on nonstandard integration theory will be given in the body of this chapter, with the exception, as noted in the Preface, of [27,29,32,33] by the second author.
IV.1 Standardizations of Internal Integration Structures
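The Riemann integral for continuous functions on an interval $[a,b]$ (see §I.12) has the properties
(1.1) $\int_a^b [\alpha f(x) + \beta g(x)]\,dx = \alpha \int_a^b f(x)\,dx + \beta \int_a^b g(x)\,dx$ for all real $\alpha$, $\beta$,
(1.2) $\int_a^b f(x)\,dx \ge 0$ if $f(x) \ge 0$ on $[a,b]$.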
Implicit in (1.1) is the fact that a linear combination of continuous functions is continuous. It is also true that $|f|$ is continuous if $f$ is continuous. A general theory of integration should specify (A) a class $L$ of “integrable” functions on a space $X$ corresponding to the continuous functions on $[a,b]$ in the above example, and (B) a real-valued function $I$ on $L$ whose value at $f \in L$ we denote by $If$ (a numerical-valued function on a set of functions is usually called a functional). Here $If$ corresponds to the Riemann integral of $f$. In general, the analogues of the properties above should be satisfied. We abstract these properties in the notion of an integration structure. It consists of a lattice of functions and a positive linear functional on this lattice as in Definition 1.2 below. This definition incorporates the standard (real) and nonstandard (hyperreal) notions of integration structures since we want to consider internal analogues of integration structures when the functions are internal and hyperreal-valued.
Our main objective in this section is to show how, beginning with an internal integration structure $(L, I)$ on an internal set $X$, we can construct a real integration structure $(\hat{L}, \hat{I})$ on the same internal set $X$ by a process called standardization. The important fact is that the real integration structures so obtained satisfy a closure property called the monotone convergence theorem. This theorem states roughly that a monotone increasing sequence $\langle f_n \rangle$ of functions in $\hat{L}$, whose integrals $\hat{I}f_n$ are uniformly bounded, converges to a function $f \in \hat{L}$, and $\hat{I}f$ is the limit of $\langle \hat{I}f_n \rangle$. It is the basic tool in all further developments of integration theory. We begin with a definition summarizing standard notation.
1.1 Definition Let $X$ be a set and $E \subseteq X$. The functions $\chi_E$, $1$, and $0$ on $X$ are defined by
$\chi_E(x) = \begin{cases} 1, & x \in E, \\ 0, & x \notin E, \end{cases}$
$1 = \chi_X$, and $0 = \chi_\emptyset$, where $\emptyset$ is the empty set. If $f$ and $g$ are functions on $X$, we write $f \le g$ if $f(x) \le g(x)$ for all $x \in X$; we define $\alpha f$, $f + g$, $fg$, $f/g$ (if $g$ does not vanish at any point in $X$), and $|f|$ as usual by assigning the values $\alpha f(x)$, $f(x) + g(x)$, $f(x)g(x)$, $f(x)/g(x)$, and $|f(x)|$ at $x \in X$.
1.2 Definition A set $L$ of real- or hyperreal-valued functions on a set $X$ is a real (hyperreal) lattice if
(a) $f, g \in L$ implies $\alpha f + \beta g \in L$ for all real (hyperreal) $\alpha$, $\beta$,
(b) $f \in L$ implies $|f| \in L$.
A real- or hyperreal-valued function $I$ on $L$ is called a real (hyperreal) positive linear functional (p.l.f.) if
(c) $I(\alpha f + \beta g) = \alpha If + \beta Ig$ for all $f, g \in L$ and real (hyperreal) $\alpha$, $\beta$,
(d) $If \ge 0$ if $f \ge 0$.
The pair $(L, I)$ then forms a real (hyperreal) integration structure on $X$. The integration structure $(\hat{L}, \hat{I})$ on $X$ is an extension of the integration structure $(L, I)$ if $L \subseteq \hat{L}$ and $\hat{I}f = If$ when $f \in L$. If the sets $X$ and $L$ (and hence all $f \in L$) and the functional $I$ are internal in some enlargement $V({}^*S)$ of a superstructure $V(S)$, then we say that $(L, I)$ is an internal integration structure. A lattice $L$ always contains $0$ (check), and is also closed under the operations of taking maxima and minima, defined as follows.
1.3 Definition If $f$ and $g$ are (real- or hyperreal-valued) functions defined on $X$, we define the maximum and minimum of $f$ and $g$ by
$\max(f, g) = f \vee g = (f + g + |f - g|)/2, \qquad \min(f, g) = f \wedge g = (f + g - |f - g|)/2$
and the positive and negative parts of $f$ by $f^+ = f \vee 0$, $f^- = (-f) \vee 0$.
Clearly, if $L$ is a lattice and $f, g \in L$ then $f \vee g$, $f \wedge g \in L$. Conversely, if $L$ is a set of functions on $X$ which is closed under linear combinations and for which $f, g \in L$ implies $f \vee g$ and $f \wedge g \in L$, then $L$ is a lattice (Exercise 1). Notice that if $f, g \in L$ and $f \ge g$, then the inequality $If \ge Ig$ follows from 1.2(c) and 1.2(d) applied to $f - g$. This fact will be used frequently in the development. The following are examples of real integration structures of real-valued functions.
1.4 Examples
1. Let $C[a,b]$ denote the set of all continuous real-valued functions on the finite interval $[a,b] \subset R$. Define the linear functional $\int_a^b$ on $C[a,b]$ by $\int_a^b f = \int_a^b f(x)\,dx$ (Riemann integral). Then $(C[a,b], \int_a^b)$ is a real integration structure on $[a,b]$ (exercise). Note that $1 \in C[a,b]$.
2. Let $C_c(R)$ denote the set of all continuous real-valued functions $f$ on $R$ with compact support, where the support of $f$ is the set
$\operatorname{supp} f = \{x : f(x) \neq 0\}.$
(a) Let $\int$ denote the functional on $C_c(R)$ defined by $\int f = \int_a^b f(x)\,dx$ if $\operatorname{supp} f \subseteq [a,b]$. (The definition of $\int$ is independent of the choice of $a$ and $b$ satisfying this condition.) Then $(C_c(R), \int)$ is a real integration structure (exercise). Note that $1 \notin C_c(R)$.
(b) Let $\{\dots, x_{-2}, x_{-1}, x_0, x_1, \dots\}$ be a countable set of points in $R$ with no limit point. For each $f \in C_c(R)$ let $\Sigma f = \sum_{i=-\infty}^{\infty} f(x_i)$. Then $(C_c(R), \Sigma)$ is a real integration structure on $R$ (exercise).
3. A step function on $R$ is a function $f$ of the form $f = \sum_{i=1}^{n} c_i \chi_{E_i}$, where the sets $E_i$ are disjoint finite intervals (open, closed, or semiopen; this includes the case where the end points are equal and $E_i$ is thus a single point). Let $S(R)$ denote the set of step functions on $R$. Define the functional $\int$ on $S(R)$ by $\int f = \sum_{i=1}^{n} c_i (b_i - a_i)$ if $f = \sum_{i=1}^{n} c_i \chi_{E_i}$ and $E_i$ has the end points $a_i$ and $b_i$, $a_i \le b_i$. Then $(S(R), \int)$ is a real integration structure on $R$ (exercise).
4. With $Y = \{x_1, \dots, x_n\}$ a finite set, let $B(Y)$ denote the set of all real-valued functions on $Y$. If $a_1, \dots, a_n$ are fixed real numbers with $a_i > 0$, $1 \le i \le n$, define the functional $\Sigma$ on $B(Y)$ by $\Sigma f = \sum_{i=1}^{n} a_i f(x_i)$. Then $(B(Y), \Sigma)$ is a real integration structure on $Y$ (exercise).
5. With $Y$ any nonempty set, let $B_0(Y)$ denote the set of all real-valued functions on $Y$, each of which is zero except for finitely many $x \in Y$. If $a$ is a positive real-valued function on $Y$, let $\Sigma_0$ denote the functional on $B_0(Y)$ defined by $\Sigma_0 f = \sum_{i=1}^{n} a(x_i) f(x_i)$, where $\operatorname{supp} f = \{x_1, \dots, x_n\}$. Then $(B_0(Y), \Sigma_0)$ is a real integration structure on $Y$ (exercise). If $Y$ is a finite set, this example reduces to Example 1.4.4.
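As a rough illustration of Example 1.4.4, the following Python sketch checks linearity 1.2(c), positivity 1.2(d), and the formula for $f \vee g$ from Definition 1.3 on sample data. The three-point set, the weights, the sample functions, and all names in the code are made up for this illustration and do not come from the text.

```python
from fractions import Fraction

# Hypothetical finite set Y = {0, 1, 2} with positive weights a_i (illustrative values).
POINTS = (0, 1, 2)
WEIGHTS = {0: Fraction(1, 2), 1: Fraction(2), 2: Fraction(3)}


def integral(f):
    """Weighted sum of a_i * f(x_i) over Y, the functional of Example 1.4.4."""
    return sum(WEIGHTS[x] * f[x] for x in POINTS)


def combo(alpha, f, beta, g):
    """Pointwise linear combination alpha*f + beta*g."""
    return {x: alpha * f[x] + beta * g[x] for x in POINTS}


def absolute(f):
    """Pointwise absolute value |f|."""
    return {x: abs(f[x]) for x in POINTS}


def lattice_max(f, g):
    """f v g computed as (f + g + |f - g|)/2, as in Definition 1.3."""
    diff = absolute(combo(1, f, -1, g))
    return {x: (f[x] + g[x] + diff[x]) / 2 for x in POINTS}


# Two sample functions on Y, stored as dictionaries point -> value.
f = {0: Fraction(1), 1: Fraction(-2), 2: Fraction(5)}
g = {0: Fraction(3), 1: Fraction(0), 2: Fraction(-1)}

# Linearity 1.2(c): I(2f + 3g) = 2 If + 3 Ig.
linear_ok = integral(combo(2, f, 3, g)) == 2 * integral(f) + 3 * integral(g)
# Positivity 1.2(d), tested on the nonnegative function |f|.
positive_ok = integral(absolute(f)) >= 0
# The lattice formula agrees with the pointwise maximum.
max_ok = lattice_max(f, g) == {x: max(f[x], g[x]) for x in POINTS}
```

Exact rational arithmetic is used so that the equalities are tested without rounding error; a check on two sample functions is of course no substitute for the proofs requested in the exercises.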
The next proposition, easily proved using the transfer principle, shows that each standard real integration structure on a set $Y$ (in particular, each of Examples 1.4) gives rise to an internal integration structure on ${}^*Y$ by transfer. We now fix an enlargement of a structure containing $Y$, with the associated monomorphism $*$.
1.5 Proposition If $(L, I)$ is a real integration structure on a set $Y$, then $({}^*L, {}^*I)$ is an internal integration structure on $X = {}^*Y$.
Proof: Exercise. $\square$
There are internal integration structures which cannot be obtained from a real integration structure by using Proposition 1.5, as the following example shows.
1.6 Hyperfinite Integration Structures Let $X$ be an internal *-finite set $\{x_1, \dots, x_\omega\}$ in an enlargement $V({}^*S)$ of some superstructure $V(S)$. Let $B_1(X)$ denote the set of all hyperreal-valued internal functions on $X$. With $\{a_1, \dots, a_\omega\}$ a fixed set of hyperreal nonnegative numbers of the same internal cardinality as $X$, let $\Sigma_1$ denote the hyperreal functional on $B_1(X)$ defined by $\Sigma_1 f = \sum_{i=1}^{\omega} a_i f(x_i)$, where the summation is the extension of finite summation. Then $(B_1(X), \Sigma_1)$ is a hyperreal integration structure on $X$ (Exercise 5). Such “hyperfinite” integration structures have recently been used as the starting point in an extensive nonstandard treatment of Brownian motion and other stochastic processes. An introduction to this theory is presented in §IV.6.
Now let $(L, I)$ be an internal hyperreal integration structure on an internal set $X$ in an enlargement $V({}^*S)$ of a superstructure $V(S)$ containing the reals. Our main objective in this section is to construct a real integration structure $(\hat{L}, \hat{I})$ on the same internal set $X$ so that the monotone convergence theorem is valid. $(\hat{L}, \hat{I})$ will be called the standardization of $(L, I)$. To prove the convergence theorem and other results we need to assume that $V({}^*S)$ is $\aleph_1$-saturated. Thus we assume from now on without further explicit comment that any internal structure $(L, I)$ being standardized lies in an $\aleph_1$-saturated enlargement $V({}^*S)$ of a superstructure $V(S)$. $\hat{L}$ is now defined as follows.
1.7 Definition Let $(L, I)$ be an internal integration structure on an internal set $X$. We define the set $L_0$ of null functions to be the set of hyperreal-valued (possibly external) functions $g$ on $X$ such that, for each $\varepsilon > 0$ in $R$, there is a $\psi \in L$ with $|g| \le \psi$ and ${}^\circ I\psi < \varepsilon$. Further we define $\hat{L}$ to be the set of real-valued functions $f$ on $X$ such that $f = \phi + g$, where $\phi \in L$, ${}^\circ I|\phi| < \infty$, and $g \in L_0$.
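1.8 Lemma
(a) If $f = \phi + g \in \hat{L}$ with $\phi \in L$, ${}^\circ I|\phi| < \infty$, $g \in L_0$, and we also have $f = \tilde{\phi} + \tilde{g}$ with $\tilde{\phi} \in L$, $\tilde{g} \in L_0$, then ${}^\circ I|\tilde{\phi}| < \infty$ and $\phi - \tilde{\phi} \in L_0$, so ${}^\circ I\phi - {}^\circ I\tilde{\phi} = {}^\circ I(\phi - \tilde{\phi}) = 0$.
(b) If $f_i \in \hat{L}$ with $f_i = \phi_i + g_i$, $\phi_i \in L$, $g_i \in L_0$ ($i = 1, 2$), then $(f_1 \vee f_2) - (\phi_1 \vee \phi_2)$ and $(f_1 \wedge f_2) - (\phi_1 \wedge \phi_2)$ are in $L_0$.
Proof: (a) Since $\tilde{\phi} - \phi = g - \tilde{g} \in L_0$, we have $|{}^\circ I(\tilde{\phi} - \phi)| \le {}^\circ I|\tilde{\phi} - \phi| = 0$ and $|{}^\circ I|\tilde{\phi}| - {}^\circ I|\phi|| \le {}^\circ I|\tilde{\phi} - \phi|$ (Exercise 6). It follows that ${}^\circ I|\tilde{\phi}| < \infty$ and ${}^\circ I\tilde{\phi} - {}^\circ I\phi = 0$.
(b) Given $\varepsilon > 0$ in $R$, there is a $\psi \in L$ with $|g_i| \le \psi$ ($i = 1, 2$) and ${}^\circ I\psi < \varepsilon$ (why?). From the inequalities
$(\phi_1 \vee \phi_2) - \psi = (\phi_1 - \psi) \vee (\phi_2 - \psi) \le (\phi_1 + g_1) \vee (\phi_2 + g_2) = f_1 \vee f_2 \le (\phi_1 \vee \phi_2) + \psi,$
it follows that $(f_1 \vee f_2) - (\phi_1 \vee \phi_2) \in L_0$. Similarly $(f_1 \wedge f_2) - (\phi_1 \wedge \phi_2) \in L_0$. $\square$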
1.9 Theorem The sets $L_0$ and $\hat{L}$ are real lattices.
Proof: We show only that $L_0$ is a real lattice. The proof that $\hat{L}$ is a real lattice is left as an exercise [use Lemma 1.8(b)]. Let $g_1, g_2 \in L_0$ and $\alpha, \beta \in R$ with $\alpha^2 + \beta^2 > 0$. Given $\varepsilon > 0$ in $R$ there exists a function $\psi \in L$ so that $|g_i| \le \psi$ ($i = 1, 2$) and $I\psi < \varepsilon / (2\max(|\alpha|, |\beta|))$. Then $|\alpha g_1 + \beta g_2| \le \phi$, where $\phi = 2\psi \max(|\alpha|, |\beta|) \in L$ and $I\phi < \varepsilon$. Property 1.2(b) for $L_0$ is obvious. $\square$
If $f \in \hat{L}$ has two representations $f = \phi + g = \tilde{\phi} + \tilde{g}$ as in Lemma 1.8(a) then ${}^\circ I\phi = {}^\circ I\tilde{\phi} < \infty$, so we may unambiguously make the following definition.
1.10 Definition For each $f = \phi + g \in \hat{L}$, where $\phi \in L$ and $g \in L_0$, we set $\hat{I}f = {}^\circ I\phi$. The real number $\hat{I}f$ is called the integral of $f$ [with respect to the hyperreal integration structure $(L, I)$].
1.11 Theorem The functional $\hat{I}$ is a real p.l.f. on $\hat{L}$.
Proof: The linearity of $\hat{I}$ follows from that of $I$. If $f \ge 0$ then $f = \phi + g$ where $\phi \in L$, $g \in L_0$, and we may take $\phi \ge 0$ by Lemma 1.8(b). Thus $\hat{I}f \ge 0$. $\square$
1.12 Corollary The pair $(\hat{L}, \hat{I})$ is a real integration structure on $X$.
1.13 Definition The structure $(\hat{L}, \hat{I})$ constructed from $(L, I)$ is called the standardization of $(L, I)$.
The next result gives another useful characterization of functions in $\hat{L}$. In its proof we use saturation.
1.14 Theorem A real-valued function $f$ on $X$ is in $\hat{L}$ iff for each $\varepsilon > 0$ in $R$ there exist functions $\psi_1$ and $\psi_2$ in $L$ with $\psi_1 \le f \le \psi_2$, ${}^\circ I(|\psi_1|) < \infty$, and $I(\psi_2 - \psi_1) < \varepsilon$, in which case ${}^\circ I\psi_1 \le \hat{I}f \le {}^\circ I\psi_1 + \varepsilon$.
Proof: First assume that $f = \phi + g \in \hat{L}$ with $\phi \in L$, $g \in L_0$, and ${}^\circ I|\phi| < \infty$. For each $n \in N$ choose $\phi_n \in L$ with $|g| \le \phi_n$ and $I\phi_n < \varepsilon/n$ for some fixed $\varepsilon > 0$ in $R$. Setting $\psi_1 = \phi - \phi_2$ and $\psi_2 = \phi + \phi_2$, we have $\psi_1 \le f \le \psi_2$, ${}^\circ I|\psi_1| < \infty$, and $I(\psi_2 - \psi_1) < \varepsilon$. If $\psi_1$ and $\psi_2$ are any elements of $L$ satisfying $\psi_1 \le f \le \psi_2$ and $I(\psi_2 - \psi_1) < \varepsilon$, then $\psi_1 - \phi_n \le \phi \le \psi_2 + \phi_n$, and so
${}^\circ I\psi_1 - \varepsilon/n \le {}^\circ I\phi = \hat{I}f \le {}^\circ I\psi_2 + \varepsilon/n \le {}^\circ I\psi_1 + \varepsilon + \varepsilon/n$
for each $n \in N$. It follows that ${}^\circ I\psi_1 \le \hat{I}f \le {}^\circ I\psi_1 + \varepsilon$.
To prove the converse we use saturation. Assume that $f$ is an arbitrary real-valued function on $X$ for which the conditions of the theorem hold. Then there exists an increasing sequence $\langle \psi_n : n \in N \rangle$ (i.e., $\psi_{n+1} \ge \psi_n$ for all $n$) and a decreasing sequence $\langle \psi'_n : n \in N \rangle$ in $L$ with $\psi_n \le f \le \psi'_n$, ${}^\circ I|\psi_n| < \infty$, and $I(\psi'_n - \psi_n) < 1/n$ for each $n \in N$. We now apply Theorem II.8.5 with $C = N$, $D = L$, and $\phi : N \to L$ and $\phi' : N \to L$ defined by $\phi(n) = \psi_n$ and $\phi'(n) = \psi'_n$. Then there are internal extensions $\phi : {}^*N \to L$ and $\phi' : {}^*N \to L$. By the permanence principle, Theorem II.7.1(i), we may find a $k \in {}^*N_\infty$ so that $\psi_n$ and $\psi'_n$ form increasing and decreasing sequences and $\psi_n \le \psi'_n$ for $n \le k$. Thus, for some infinite $\omega \le k$, $\psi_n \le \psi_\omega \le \psi'_n$, and so $\psi_n - \psi'_n \le f - \psi_\omega \le \psi'_n - \psi_n$ for all $n \in N$. It follows that $f - \psi_\omega \in L_0$ and $f \in \hat{L}$. $\square$
We now come to a result, called the monotone convergence theorem, which is central to the further development of the subject, both practically and theoretically. The result says, roughly speaking, that $\hat{L}$ is closed under monotone limits if the integrals are uniformly bounded. We will later generalize the result to a larger class of functions.
1.15 Monotone Convergence Theorem for $(\hat{L}, \hat{I})$ Suppose that $\langle f_n \in \hat{L} : n \in N \rangle$ is a monotone increasing (i.e., $f_{n+1} \ge f_n$ for all $n \in N$) sequence of functions in $\hat{L}$ for which
(a) $\lim_{n\to\infty} f_n(x) = f(x)$ exists for all $x \in X$,
(b) $\sup\{\hat{I}f_n : n \in N\} = \lim_{n\to\infty} \hat{I}f_n < \infty$.
Then $f \in \hat{L}$ and $\hat{I}f = \lim_{n\to\infty} \hat{I}f_n$.
Proof: We may assume without loss of generality that $f_n \ge 0$ (otherwise consider $f_n - f_1$). By 1.8(b) we may find representations $f_n = \phi_n + g_n$ with $\phi_n \in L$, $g_n \in L_0$, and $0 \le \phi_n \le \phi_{n+1}$ (check). Let $B = \lim_{n\to\infty} \hat{I}f_n$. Then given $\varepsilon > 0$ in $R$ we may find an $m \in N$ so that, for $n \ge m$ in $N$, $B - \varepsilon < \hat{I}f_n \le B$, and hence $B - \varepsilon < I\phi_n < B + \varepsilon$ for any $\varepsilon > 0$ in $R$.
We now use saturation again. As in the proof of Theorem 1.14 we can extend the sequence $\langle \phi_n \in L : n \in N \rangle$ to $\langle \phi_n \in L : n \in {}^*N \rangle$ so that it is still increasing (if necessary repeat some $\phi \in L$ for all $n \ge$ some $k$ in ${}^*N_\infty$). Thus, for some infinite $\omega$, $\phi_\omega \ge \phi_n$ for each $n \in N$ and ${}^\circ I\phi_\omega = \sup\{{}^\circ I\phi_n : n \in N\}$ (Exercise 8). We need only show that $f - \phi_\omega \in L_0$.
Fix $\varepsilon > 0$ in $R$, and for each $n \in N$ choose a $\psi_n \in L$ with $|g_n| \le \psi_n$ and $I\psi_n < \varepsilon/2^n$. Again by $\aleph_1$-saturation we may extend the sequence $\langle \psi_n : n \in N \rangle$ to $\langle \psi_n : n \in {}^*N \rangle$ so that, for some infinite $k \in {}^*N$, $\psi_n \ge 0$ and $I\psi_n < \varepsilon/2^n$ for each $n \le k$. Let $\psi = \sum_{n=1}^{k} \psi_n$. Then $I\psi < \varepsilon$ and
(1.3) $\phi_n - \psi \le \phi_n - \psi_n \le \phi_n + g_n \le f \le (1 + \varepsilon)(\phi_\omega + \psi)$
for each $n \in N$, so that
$(\phi_n - \phi_\omega) - \psi \le f - \phi_\omega \le \varepsilon\phi_\omega + (1 + \varepsilon)\psi.$
We may choose $n \in N$ so large that
$-2\varepsilon < I(\phi_n - \phi_\omega) - I\psi.$
Also,
$I(\varepsilon\phi_\omega + (1 + \varepsilon)\psi) < \varepsilon I\phi_\omega + \varepsilon + \varepsilon^2.$
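Since $\varepsilon$ is arbitrary, it follows that $f - \phi_\omega \in L_0$ (check). $\square$
Our next theorem is a result which is useful in many applications. It gives conditions under which the standard part ${}^\circ\phi$ of a function $\phi \in L$ is in $\hat{L}$ and $\hat{I}({}^\circ\phi) = {}^\circ I\phi$. In general we define ${}^\circ\phi$ by
${}^\circ\phi(x) = \begin{cases} \operatorname{st}(\phi(x)), & \phi(x) \text{ finite}, \\ +\infty, & \phi(x) \in {}^*R^+_\infty, \\ -\infty, & \phi(x) \in {}^*R^-_\infty. \end{cases}$
1.16 Theorem If $\phi \in L$ takes only finite values, and for some $\psi \ge 0$ in $L$ with ${}^\circ I\psi < \infty$ we have $\{x \in X : \phi(x) \neq 0\} \subseteq \{x \in X : \psi(x) \ge 1\}$, then $\phi - {}^\circ\phi \in L_0$, ${}^\circ\phi \in \hat{L}$, and $\hat{I}({}^\circ\phi) = {}^\circ I\phi$.
Proof: For each $\varepsilon > 0$ in $R$, $|\phi - {}^\circ\phi| \le \varepsilon\psi$, and so $\phi - {}^\circ\phi \in L_0$. Since $|\phi(x)| \le n\psi(x)$ for all infinite $n$ and all $x \in X$, an easy argument using the permanence principle shows that $|\phi| \le n\psi$ for some finite $n$, and so ${}^\circ I|\phi| < \infty$. Thus ${}^\circ\phi = \phi + ({}^\circ\phi - \phi) \in \hat{L}$ and $\hat{I}({}^\circ\phi) = {}^\circ I\phi$. $\square$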
It is important to note that if $1 \in L$ and ${}^\circ I1 < \infty$ then the conclusion of Theorem 1.16 holds under the sole assumption that $\phi$ is finite-valued, for then we may take $\psi = 1$. In this case we now show that if $f$ is a real-valued function on $X$ and $f = \phi + g$, $\phi \in L$, $g \in L_0$, then ${}^\circ I|\phi|$ is automatically finite.
1.17 Theorem Assume the function $1 \in L$ and ${}^\circ I1 < \infty$. Then $1 \in \hat{L}$. Moreover, if $f = \phi + g$ is a real-valued function on $X$ with $\phi \in L$ and $g \in L_0$ then ${}^\circ I|\phi| < \infty$, i.e., $f \in \hat{L}$.
Proof: If $1 \in L$ and ${}^\circ I1 < \infty$ then, by Theorem 1.16, $1 = {}^\circ 1 \in \hat{L}$. Moreover, given $f = \phi + g$ as above, we fix $\psi \in L$ with $|g| \le \psi$ and $I\psi < 1$. Then
$\phi - \psi \le f \le \phi + \psi.$
Since $f$ is real-valued and both $\phi - \psi$ and $\phi + \psi$ are internal, there is an $n \in N$ with $\phi - \psi \le n$ and $-n \le \phi + \psi$ (permanence principle). Thus ${}^\circ I|\phi| < \infty$. $\square$
1.18 Examples
1. Let $(L, I) = ({}^*C_c(R), {}^*\!\int)$ be the internal integration structure on $X = {}^*R$ constructed from Example 1.4.2 using Proposition 1.5.
(a) We first show that if $g$ vanishes off the bounded interval $[a,b] \subset X$ and takes only infinitesimal values, then $g \in L_0$. We may assume $b - a \ge 1$. For each $\varepsilon > 0$ in $R$, $|g(x)| \le \phi$, where $\phi$ is the simplest piecewise linear function which is $\varepsilon/(2(b - a))$ on $[a,b]$ and positive inside and zero outside $(a - \tfrac{1}{2}, b + \tfrac{1}{2})$. Then $\phi \in L$, and we see by transfer that $I\phi \le \varepsilon$. Thus $g \in L_0$.
(b) Suppose now that $f \in C_c(R)$ and thus vanishes outside an interval $[a,b] \subset R$. We will show that ${}^\circ({}^*f) \in \hat{L}$, and $\hat{I}({}^\circ({}^*f)) = {}^\circ I\,{}^*f$. Since $f$ is continuous we have $|f| \le M$ for some positive $M \in R$, so ${}^*f$ is everywhere finite. The result now follows from Theorem 1.16 if we choose $\psi$ to be piecewise linear, $1$ on ${}^*[a,b]$, and $0$ outside $[a - \tfrac{1}{2}, b + \tfrac{1}{2}]$. Note also that $I({}^*f) = \int f\,dx$ by transfer.
(c) Lastly we show that $\hat{L}$ contains the characteristic function of each nonstandard interval of finite length in $X$. Only closed intervals will be considered; the other cases are similar. Suppose that $a, b \in {}^*R$ with $a < b$ and $|a - b|$ finite; we want to show that $\chi_{[a,b]} \in \hat{L}$. By Exercise 10 we may assume that $b \not\simeq a$ (check). For each $a, b \in R$ and $n \in N$, the simplest function $f_{n,a,b}$ which is piecewise linear, continuous, and $1$ if $a \le x \le b$, $0$ if $x \le a - 1/n$ or $x \ge b + 1/n$, is in $C_c(R)$, and $\int f_{n,a,b}(x)\,dx \le b - a + 2$. By transfer, for each $a, b \in {}^*R$ with $a < b$ and $|a - b|$ finite but not infinitesimal, and for each $n \in {}^*N$, the function $F_{n,a,b}$ on ${}^*R$ defined by
$F_{n,a,b}(x) = \begin{cases} n(x - a) + 1, & a - 1/n \le x \le a, \\ 1, & a \le x \le b, \\ 1 - n(x - b), & b \le x \le b + 1/n, \\ 0 & \text{otherwise} \end{cases}$
is in $L$ and ${}^\circ IF_{n,a,b} \le {}^\circ(b - a) + 2 < \infty$. Now let $\omega \in {}^*N_\infty$, and consider $g = \chi_{[a,b]} - F_{\omega,a,b}$. Then for each $n \in N$ with $2/n \le b - a$, $|g| \le \phi_n$, where $\phi_n \in L$ is defined by
$\phi_n(x) = \begin{cases} n(x - a + 1/n), & a - 1/n \le x \le a, \\ n(a + 1/n - x), & a \le x \le a + 1/n, \\ n(x - b + 1/n), & b - 1/n \le x \le b, \\ n(b + 1/n - x), & b \le x \le b + 1/n, \\ 0 & \text{otherwise.} \end{cases}$
By transfer $I\phi_n = 2/n$, and we conclude that $\chi_{[a,b]} \in \hat{L}$. An easy calculation shows that $\hat{I}\chi_{[a,b]} = {}^\circ|b - a|$.
2. Let $(L, I) = ({}^*B(Y), {}^*\Sigma)$ be the internal structure on $X = {}^*Y = Y = \{x_1, \dots, x_n\}$ constructed from Example 1.4.4 using Proposition 1.5. We first show that $L_0$ consists exactly of the functions $g$ which take infinitesimal values. Suppose that $g(x_i) \simeq 0$ for all $1 \le i \le n$. Then $|g| \le \phi$, where $\phi = \varepsilon / (2\sum_{i=1}^{n} a_i)$, and $I\phi < \varepsilon$ by transfer. Conversely, if $g(x_i) = r \not\simeq 0$ for some $i$, then when $|g| \le \phi$, $\phi \ge |r|\chi_{\{x_i\}}$, so $I\phi \ge a_i|r| \not\simeq 0$. Thus $g \notin L_0$. It is now easy to show that $\hat{L}$ consists of all real-valued functions on $X$, and that $(\hat{L}, \hat{I}) = (B(Y), \Sigma)$.
3. Let $X$ be any internal set and let $x_0 \in X$. Let $L$ consist of all hyperreal-valued functions which vanish except at $x_0$. Put $If = f(x_0)$ for $f \in L$. Then $(L, I)$ is an internal integration structure. It is easy to check that $L_0$ consists of all functions which vanish except at $x_0$, where they are infinitesimal, and that $\hat{L}$ consists of all real-valued functions $f$ which vanish except at $x_0$, where $f(x_0)$ is finite, and $\hat{I}(f) = f(x_0)$.
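The value $2/n$ for the integral of the two-tent function $\phi_n$ in Example 1.18.1(c) is obtained by transfer from the standard case, and the standard case can be checked directly. The sketch below does this for standard rational $a$, $b$ and a positive integer $n$; the function names and the use of the trapezoid rule on the breakpoints are choices made for this illustration only, not part of the text.

```python
from fractions import Fraction


def phi_n(x, a, b, n):
    """Standard analogue of the two-tent function phi_n from Example 1.18.1(c)."""
    h = Fraction(1, n)
    if a - h <= x <= a:
        return n * (x - a + h)
    if a <= x <= a + h:
        return n * (a + h - x)
    if b - h <= x <= b:
        return n * (x - b + h)
    if b <= x <= b + h:
        return n * (b + h - x)
    return Fraction(0)


def integral_phi_n(a, b, n):
    """Integral of phi_n, computed by the trapezoid rule on its breakpoints.

    phi_n is linear between consecutive breakpoints, so the rule is exact.
    """
    h = Fraction(1, n)
    if 2 * h > b - a:
        raise ValueError("need 2/n <= b - a so that the two tents do not overlap")
    knots = [a - h, a, a + h, b - h, b, b + h]
    total = Fraction(0)
    for left, right in zip(knots, knots[1:]):
        total += (phi_n(left, a, b, n) + phi_n(right, a, b, n)) * (right - left) / 2
    return total
```

For instance, `integral_phi_n(Fraction(0), Fraction(1), 4)` evaluates to `Fraction(1, 2)`, in agreement with $2/n$ for $n = 4$.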
Exercises IV.1
1. Show that if $L$ is closed under linear combinations and $f, g \in L \Rightarrow f \vee g \in L$ and $f \wedge g \in L$, then $f \in L \Rightarrow |f| \in L$.
2. (Standard) Show that the structures in Examples 1.4 are real integration structures.
3. Let $X = \ell_2$ and $y = \langle y_i \rangle \in \ell_2$; assume that $y_i \ge 0$ for all $i \in N$. Show that $(X, I)$ is an integration structure if we define $Ix = (x, y)$ for all $x \in X$.
4. Prove Proposition 1.5.
5. Show that the structure in 1.6 is a hyperreal integration structure.
6. (a) Show that, for functions $\phi$ and $\psi$ in $L$, $|{}^\circ I|\psi| - {}^\circ I|\phi|| \le {}^\circ I|\psi - \phi|$.
(b) Show that if $\phi \in L \cap L_0$, then ${}^\circ I|\phi| = 0$.
7. Prove that $\hat{L}$ is a real lattice.
8. In the proof of Theorem 1.15, show that for some infinite $\omega \in {}^*N$, $\phi_\omega \ge \phi_n$ for all $n \in N$, and ${}^\circ I\phi_\omega = \sup\{{}^\circ I\phi_n : n \in N\}$.
9. Show that one cannot in general replace $(1 + \varepsilon)(\phi_\omega + \psi)$ with $(\phi_\omega + \psi)$ in the right-hand side of Eq. (1.3) in the proof of Theorem 1.15.
10. Let $(L, I)$ be an internal integration structure on the internal set $X$ and suppose that the function $f$ is real-valued and nonnegative. Show that $f \in L_0$ iff $f \in \hat{L}$ on $X$ and $\hat{I}f = 0$.
11. (Comparison Theorem) Let $(L, I)$ and $(L', I')$ be two internal integration structures on the internal set $X$. Suppose that $L_0 \subseteq L'_0$ and that for each $\phi \in L$ there exists a $\psi \in L'$ so that $I\phi \simeq I'\psi$ and $\phi - \psi \in L'_0$. Show that $\hat{L} \subseteq \hat{L}'$ and $\hat{I}f = \hat{I}'f$ for all $f \in \hat{L}$.
12. Use Exercise 11 to show that if $(L, I)$ and $(L', I')$ are the *-transfers $({}^*C_c(R), {}^*\!\int)$ and $({}^*S(R), {}^*\!\int)$ of the structures in Examples 1.4.2(a) and 1.4.3 respectively, then $(\hat{L}, \hat{I}) = (\hat{L}', \hat{I}')$.
13. In the standardization of Example 1.18.1 give an example of a function $g \in L_0$ which takes infinitely large values.
14. (Standard) A collection $\mathscr{S}$ of subsets of a set $X$ is a ring if $A, B \in \mathscr{S}$ implies that $A \cup B$ and $A - B \in \mathscr{S}$. A function $\nu : \mathscr{S} \to R^+$ is a finitely additive measure on $\mathscr{S}$ if $\nu(A \cup B) = \nu(A) + \nu(B)$ for $A, B \in \mathscr{S}$ with $A \cap B = \emptyset$.
(a) Show that if $A, B \in \mathscr{S}$ then $A \cap B$ and $A \,\Delta\, B = (A - B) \cup (B - A) \in \mathscr{S}$.
(b) Show that if $\mathscr{E}$ is any collection of subsets of $X$ then there is a unique smallest ring $\mathscr{S}$ containing $\mathscr{E}$. (Hint: $\mathscr{S}$ is the intersection of all rings containing $\mathscr{E}$.)
(c) Show that the set $L$ of all linear combinations of characteristic functions of disjoint sets in $\mathscr{S}$ is a lattice.
(d) Show that if $\phi = \sum_{i=1}^{n} a_i \chi_{A_i} \in L$, we may unambiguously define $I\phi = \sum_{i=1}^{n} a_i \nu(A_i)$ and that $(L, I)$ is an integration structure.
15. Develop the internal analogues of the notions in Exercise 14.
IV.2 Measure Theory for Complete Integration Structures
In the last section we showed that the monotone convergence theorem holds for the integration structure $(\hat{L}, \hat{I})$ obtained from an internal structure $(L, I)$ by standardization. In this section we develop a measure theory for any integration structure $(\hat{L}, \hat{I})$ for which the monotone convergence theorem is valid. Such structures will be called complete.
2.1 Definition A real integration structure $(\hat{L}, \hat{I})$ on a set $X$ is complete if whenever $\langle f_n \in \hat{L} : n \in N \rangle$ is a monotone increasing sequence for which
(a) $\lim_{n\to\infty} f_n(x) = f(x)$ exists for all $x \in X$,
(b) $\sup\{\hat{I}f_n : n \in N\} = \lim_{n\to\infty} \hat{I}f_n < \infty$,
then $f \in \hat{L}$ and $\hat{I}f = \lim_{n\to\infty} \hat{I}f_n$.
Throughout this section $(\hat{L}, \hat{I})$ will denote a complete integration structure. Our first objective is to introduce a set $M$ of functions which includes the set $\hat{L}$. The functions in $M$ are called measurable functions. Roughly speaking, measurable functions will have the same regularity as functions in $\hat{L}$ but may not have finite integrals. We will find that products of measurable functions are measurable, a useful fact that is not in general true for functions in $\hat{L}$. We then extend the functional $\hat{I}$ to a subset $\hat{L}_1$ of $M$, and obtain a real integration structure which is an extension of $(\hat{L}, \hat{I})$. We will also study the basic properties of those sets, called measurable, whose characteristic functions are in $M$. This leads to a discussion of measure theory which is often taken as the starting point for a standard development of integration theory and is important in many areas of analysis; in particular, it is basic to probability theory. We will show that the two approaches are equivalent. Most of the proofs are standard except at the end of the section where we establish connections with §IV.1.
The functions in $M$ will be extended real-valued functions; that is, they may take the values $+\infty$ and $-\infty$. Thus we make the following definition.
2.2 Definition The extended real number system is the set $\bar{R} = R \cup \{-\infty, +\infty\}$. By convention $-\infty < x$, and $x < +\infty$ for all $x \in R$. The rules of arithmetic for $R$ are supplemented by the following rules: If $x \in R$ then
$(\pm\infty) + (\pm\infty) = x + (\pm\infty) = (\pm\infty) + x = \pm\infty,$
$(\pm\infty)(\pm\infty) = +\infty, \qquad (\pm\infty)(\mp\infty) = -\infty,$
$x(\pm\infty) = (\pm\infty)x = \begin{cases} \pm\infty & \text{if } x > 0, \\ 0 & \text{if } x = 0, \\ \mp\infty & \text{if } x < 0, \end{cases}$
$x/(\pm\infty) = 0 \quad \text{for all } x \in R.$
If a set $A \subseteq \bar{R}$ is not bounded above we define $\sup A = +\infty$, and if $A$ is not bounded below we define $\inf A = -\infty$, with a similar convention for lim sup and lim inf. As usual, we often denote $+\infty$ by $\infty$. Notice that we have not defined $(\pm\infty) + (\mp\infty)$, $(\pm\infty)/(\pm\infty)$, or $(\pm\infty)/(\mp\infty)$.
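The arithmetic rules of Definition 2.2 differ from IEEE floating-point arithmetic in one visible place: the product $0 \cdot (\pm\infty)$ is $0$ here, whereas floats give `nan`. The following Python sketch models the rules on top of `math.inf`; the function names are hypothetical, and raising `ValueError` for the undefined combinations is a design choice of this sketch, not something the text prescribes.

```python
import math

INF = math.inf


def ext_add(x, y):
    """Sum in the extended reals; (+inf) + (-inf) is deliberately left undefined."""
    if math.isinf(x) and math.isinf(y) and x != y:
        raise ValueError("(+inf) + (-inf) is not defined")
    return x + y


def ext_mul(x, y):
    """Product with the convention 0 * (+-inf) = 0 (plain floats would give nan)."""
    if x == 0 or y == 0:
        return 0.0
    return x * y


def ext_div(x, y):
    """Quotient x / (+-inf) = 0 for real x; infinite / infinite is undefined.

    For a real divisor y this falls back to ordinary float division, which is
    an assumption of the sketch: Definition 2.2 only lists division by +-inf.
    """
    if math.isinf(y):
        if math.isinf(x):
            raise ValueError("(+-inf) / (+-inf) is not defined")
        return 0.0
    return x / y
```

With these helpers `ext_mul(0, INF)` returns `0.0`, while the built-in expression `0 * math.inf` evaluates to `nan`.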
2.3 Definition $\hat{L}^+$ denotes the set of nonnegative functions in $\hat{L}$. We denote by $M^+$ the set of nonnegative $\bar{R}$-valued functions $h$ on $X$ such that $h \wedge f \in \hat{L}$ for each $f \in \hat{L}$. If $h \in M^+$ we define
$Jh = \sup\{\hat{I}(h \wedge f) : f \in \hat{L}\}.$
$J$ is an $\bar{R}$-valued function on $M^+$. We denote by $M$ the set of $\bar{R}$-valued functions $h$ on $X$ whose positive and negative parts $h^+ = h \vee 0$ and $h^- = -h \vee 0$ are both in $M^+$. If $h \in M$ and either $Jh^+$ or $Jh^-$ is finite, we define
$Jh = Jh^+ - Jh^-.$
2.4 Remarks
1. Since $\hat{L}$ is a lattice we see that $M \supseteq \hat{L}$, and it is easy to check that if $h \in \hat{L}$ then $Jh = \hat{I}h$.
2. In defining $M^+$ and $Jh$ for $h \in M^+$, we may assume that $f \in \hat{L}^+$, where $\hat{L}^+$ is the set of nonnegative functions in $\hat{L}$. That is, fix $h \ge 0$ and suppose that $h \wedge f \in \hat{L}$ for all $f \in \hat{L}^+$. Then if $f = f^+ - f^- \in \hat{L}$, we have $h \wedge (f^+ - f^-) = (h \wedge f^+) - f^- \in \hat{L}$. Similarly, $Jh = \sup\{\hat{I}(h \wedge f) : f \in \hat{L}^+\}$ for $h \in M^+$.
3. An easy calculation shows that $Jh = \sup\{\hat{I}f : 0 \le f \le h,\ f \in \hat{L}\}$ for $h \in M^+$. This formula will be used later without explicit comment.
4. Suppose that $(\hat{L}, \hat{I})$ is obtained by standardization from $(L, I)$. For $h \in M^+$, $Jh$ may be less than the supremum of the integrals ${}^\circ I\phi$ for $\phi \in L$, $0 \le \phi \le h$. For example, let $X = \{x, y\}$, and let $L$ be the internal set of ${}^*R$-valued functions on $X$. For $\phi \in L$ define $I\phi = \phi(x) + \omega\phi(y)$, where $\omega \in {}^*N_\infty$. Then each $f \in \hat{L}$ vanishes at $y$, $1 \in M^+$, $J1 = 1$, but $\sup\{{}^\circ I\phi : \phi \in L,\ 0 \le \phi \le 1\} = \infty$.
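The claims in Remark 2.4.4 can be verified along the following lines. This is only a sketch of one possible argument, written with the notation of Definitions 1.7, 1.10, and 2.3; the constant $C$ and the particular representation chosen for $k$ are introduced here for the illustration.

```latex
Let $X = \{x, y\}$ and $I\phi = \phi(x) + \omega\phi(y)$ with $\omega$ infinite.

\emph{Each $f \in \hat{L}$ vanishes at $y$.} Write $f = \phi + g$ with
${}^\circ I|\phi| < \infty$ and $g \in L_0$. Then
$\omega|\phi(y)| \le I|\phi| \le C$ for some standard $C$, so
$|\phi(y)| \le C/\omega \simeq 0$. Given $\varepsilon > 0$ in $R$, choose
$\psi \in L$ with $|g| \le \psi$ and ${}^\circ I\psi < \varepsilon$; then
$\omega|g(y)| \le I\psi < \varepsilon$, so $g(y) \simeq 0$. Hence the real number
$f(y) = \phi(y) + g(y)$ is infinitesimal, i.e., $f(y) = 0$.

\emph{$J1 = 1$.} If $k \in \hat{L}$, then $k(y) = 0$, so
$k = k(x)\chi_{\{x\}} + 0$ is a representation with
$k(x)\chi_{\{x\}} \in L$, and therefore $\hat{I}k = k(x)$. For $f \in \hat{L}$
the function $1 \wedge f$ is real-valued and vanishes at $y$, so
$1 \wedge f \in \hat{L}$ and
\[
  \hat{I}(1 \wedge f) = \min(1, f(x)) \le 1,
\]
with equality for $f = \chi_{\{x\}}$. Thus $1 \in M^+$ and $J1 = 1$.

\emph{The internal supremum is infinite.} The constant function $\phi = 1$ lies in
$L$, satisfies $0 \le \phi \le 1$, and has $I\phi = 1 + \omega$, which is
infinite, so ${}^\circ I\phi = +\infty$.
```

The gap between $J1$ and the internal supremum comes entirely from the infinite weight placed on the point $y$, which no function in $\hat{L}$ can see.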
2.5 Proposition If $h_1, h_2 \in M^+$ and $\alpha \in R^+$, then $h_1 + h_2$, $\alpha h_1$, $h_1 \wedge h_2$, and $h_1 \vee h_2$ are in $M^+$. Also $J(h_1 + h_2) = Jh_1 + Jh_2$, $J(\alpha h_1) = \alpha Jh_1$ for $\alpha \in R^+$, and $Jh_1 \le Jh_2$ if $h_1 \le h_2$.
Proof: Let $f \in \hat{L}^+$. Then
$(h_1 + h_2) \wedge f = [(h_1 \wedge f) + (h_2 \wedge f)] \wedge f \in \hat{L}.$
For $\alpha > 0$, $(\alpha h_1) \wedge f = \alpha(h_1 \wedge (1/\alpha)f) \in \hat{L}$. Similarly $h_1 \wedge h_2$ and $h_1 \vee h_2 \in M^+$. For any $f \in \hat{L}^+$, the reader should check that $(h_1 + h_2) \wedge f \le (h_1 \wedge f) + (h_2 \wedge f)$. Thus
$\hat{I}((h_1 + h_2) \wedge f) \le \hat{I}(h_1 \wedge f) + \hat{I}(h_2 \wedge f) \le Jh_1 + Jh_2.$
Taking the supremum on the left-hand side, we obtain
$J(h_1 + h_2) \le Jh_1 + Jh_2.$
On the other hand, suppose $f_1, f_2 \in \hat{L}$ and $f_1 \le h_1$, $f_2 \le h_2$. Then $f_1 + f_2 \le h_1 + h_2$, so
$\hat{I}f_1 + \hat{I}f_2 = \hat{I}(f_1 + f_2) \le J(h_1 + h_2),$
and hence $Jh_1 + Jh_2 \le J(h_1 + h_2)$. Thus $J(h_1 + h_2) = Jh_1 + Jh_2$. The rest is left as an exercise. $\square$
Our next result extends the monotone convergence theorem to $(M^+, J)$. In considering its meaning remember that $J$ is an extended real-valued function and takes on the value $+\infty$ for many functions in $M^+$.
2.6 Monotone Convergence Theorem for $(M^+, J)$ If $\langle h_n \in M^+ : n \in N \rangle$ is an increasing sequence in $M^+$, then $h = \sup h_n \in M^+$ and $Jh = \sup\{Jh_n : n \in N\} = \lim_{n\to\infty} Jh_n$.
Proof: Let $f \in \hat{L}^+$. Then $h_n \wedge f \in \hat{L}$ for each $n$, the sequence $\langle h_n \wedge f : n \in N \rangle$ increases to $h \wedge f$, and $\sup\{\hat{I}(h_n \wedge f) : n \in N\} \le \hat{I}(f) < \infty$. By completeness, $h \wedge f \in \hat{L}$ and $\hat{I}(h \wedge f) = \lim \hat{I}(h_n \wedge f)$. Thus $h \in M^+$ and
$Jh = \sup\{\hat{I}(h \wedge f) : f \in \hat{L}\} = \sup\{\sup\{\hat{I}(h_n \wedge f) : f \in \hat{L}\} : n \in N\} = \sup\{Jh_n : n \in N\}.$ $\square$
It is now natural to restrict our attention to those functions in $M$ whose integrals are finite.
2.7 Definition We define $\hat{L}_1$ to be the set of $\bar{R}$-valued functions $h \in M$ for which $Jh$ is finite and $\hat{L}_1^+$ to be the set of nonnegative functions in $\hat{L}_1$.
The functions in $\hat{L}_1$ are extended real-valued functions. For this reason they cannot, in general, be added without encountering difficulties with expressions of the form $\infty - \infty$. We can, however, restrict ourselves to the real-valued functions in $\hat{L}_1$ and obtain an integration structure. Later we will show that with any function $f \in \hat{L}_1$ is associated a real-valued function $\tilde{f} \in \hat{L}_1$ (which equals $f$ almost everywhere; see §IV.4) such that $Jf = J\tilde{f}$.
2.8 Proposition The set of real-valued functions in $\hat{L}_1$ together with $J$ forms a complete integration structure on $X$, $\hat{L}_1 \supseteq \hat{L}$, and $Jf = \hat{I}f$ if $f \in \hat{L}$.
Proof: Exercise. $\square$
2.9 Remarks
1. To show that a given function h is in $\hat L_1$ it suffices to show that $h \in \hat M$ and $|h| \le g$ for some $g \in \hat L_1^+$ (exercise).
2. If $1 \in \hat L$, then every real-valued function in $\hat L_1$ is in $\hat L$ (exercise).
3. In general, $\hat L_1$ properly contains $\hat L$. In Example 1.18.3, $\hat L$ consists of all real-valued functions which vanish except perhaps at $x_0$, while $\hat L_1$ consists
of all $\bar R$-valued functions f which are finite at $x_0$, and $\hat J f = f(x_0)$. In particular, $1 \in \hat L_1 - \hat L$.

To proceed we need to make a further assumption on L due, in the standard development of the subject, to Marshall Stone.

2.10 Definition A lattice L (real or hyperreal) is Stonian if $\phi \in L$ implies $\phi \wedge 1 \in L$. An integration structure (L, I) is Stonian if L is Stonian.

2.11 Remarks
1. If $1 \in L$, then L is Stonian.
2. If $\hat L$ is Stonian, then $1 \in \hat M^+$.
3. Each of the real lattices in Examples 1.4 is a Stonian lattice.
4. If L is a real Stonian lattice on a standard set Y, then *L is an internal Stonian lattice on X = *Y.
5. If L is Stonian, then $\phi \wedge a \in L$ for any $a > 0$ since $\phi \wedge a = a((1/a)\phi \wedge 1)$.
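2.12 Proposition If (L, I) is an internal Stonian integration structure on the internal set X, then the standardization $(\hat L, \hat I)$ is a Stonian integration structure.
Proof: Let $\varepsilon > 0$ in R and $f \in \hat L$ be given. By Theorem 1.14 there are functions $\psi_1, \psi_2 \in L$ so that $\psi_1 \le f \le \psi_2$, ${}^\circ I|\psi_2| < \infty$, and $I(\psi_2 - \psi_1) < \varepsilon$. Then $\psi_1 \wedge 1 \le f \wedge 1 \le \psi_2 \wedge 1$ and $I(\psi_2 \wedge 1 - \psi_1 \wedge 1) \le I(\psi_2 - \psi_1) < \varepsilon$, so $f \wedge 1 \in \hat L$ by Theorem 1.14. □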
The above results show that all of the examples of integration structures encountered so far have been Stonian. In the rest of this chapter we will assume without further explicit comment that all integration structures are Stonian. To lead into our discussion of measurable sets we give an alternative characterization of measurable functions in terms of "good" sets, which are defined as follows.

2.13 Definition We let $\hat{\mathcal G}$ denote the collection of all sets $A \subseteq X$ for which $\chi_A \in \hat L^+$.

2.14 Proposition If $A = \{x \in X : f(x) > a\}$, where $f \in \hat L^+$ and $a > 0$, then $A \in \hat{\mathcal G}$.
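Proof: By considering $(1/a)f$ we may assume $a = 1$. Then $\tilde f = f - f \wedge 1 \in \hat L$, and if $B = \{x \in X : \tilde f(x) > 0\}$ then $A = B$. Also $1 \wedge n\tilde f \in \hat L$, $\hat I(1 \wedge n\tilde f) \le \hat I(1 \wedge f) \le \hat I f$ for all $n \in N$, and so $\chi_B = \lim(1 \wedge n\tilde f) \in \hat L$ by completeness. □

2.15 Proposition $\hat M^+$ consists of all nonnegative extended real-valued functions h such that $h \wedge n\chi_A \in \hat L$ for each $n \in N$ and $A \in \hat{\mathcal G}$. Given $h \in \hat M^+$, $\hat J h = \sup\{\hat I(h \wedge n\chi_A) : n \in N,\ A \in \hat{\mathcal G}\}$.
Proof: Given $f \ge 0$ in $\hat L$, let $A_n = \{x \in X : f(x) > 1/n\}$, $n \in N$. Then $A_n \in \hat{\mathcal G}$ by 2.14, and the result follows from completeness and the fact that $h \wedge f = \lim_{n\to\infty} [h \wedge n\chi_{A_n} \wedge f]$ and $h \wedge f \wedge n\chi_{A_n} \le h \wedge n\chi_{A_n} \le h$. □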
We are now ready to consider the notions of measurable set and measure. These notions were the starting point of the integration theory developed by Lebesgue. He proposed attaching a real number $\mu(A)$, called the measure of A, to a subset A of a set X. The measure of a subset can be thought of as a generalization of the length of an interval on the real line, or the area of a rectangle in the plane. Thus it is natural to require that the measure of a disjoint union of sets is the sum of the measures of the sets, at least for finite unions. Unfortunately it is usually impossible to define $\mu$ on all subsets of a given set X. The best we can expect is that the subsets, called measurable, on which $\mu$ is defined are closed under countable unions and complements, and that the measure is "countably additive". The general definitions of measurable sets and measure as presented by Lebesgue are as follows.

2.16 Definition A collection $\mathcal M$ of subsets of a set X is called a σ-algebra if
(a) $X \in \mathcal M$,
(b) $A \in \mathcal M$ implies that the complement $A'$ of A is in $\mathcal M$,
(c) $\{A_i \in \mathcal M : i \in N\}$ implies $\bigcup A_i\ (i \in N) \in \mathcal M$.
Each set in $\mathcal M$ is called measurable, and $(X, \mathcal M)$ is called a measurable space. A nonnegative function $\mu : \mathcal M \to \bar R^+$ is called a measure on $\mathcal M$ if $\mu(\emptyset) = 0$ and
(d) for each collection $\{A_i \in \mathcal M : i \in N\}$ which is disjoint (i.e., $A_i \cap A_j = \emptyset$ if $i \ne j$) we have $\mu(\bigcup A_i\ (i \in N)) = \sum \mu(A_i)\ (i \in N)$.
This property is called countable additivity. A measure $\mu$ on $\mathcal M$ is complete if
(e) whenever $A \in \mathcal M$ with $\mu(A) = 0$ and $B \subset A$, then $B \in \mathcal M$ (and thus $\mu(B) = 0$ since $\mu(B) \le \mu(A - B) + \mu(B) = \mu(A)$).
The triple $(X, \mathcal M, \mu)$ is called a measure space.
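On a finite set the axioms of Definition 2.16 can be checked by brute force, which may help in keeping the roles of (a)-(d) apart. The Python sketch below is an illustration under simplifying assumptions: the set X is finite, so closure under pairwise unions already gives closure under countable unions, and countable additivity reduces to additivity on disjoint pairs. The function names `power_set`, `is_sigma_algebra`, and `is_additive` are hypothetical helpers introduced here.

```python
from itertools import chain, combinations


def power_set(universe):
    """All subsets of a finite set, as frozensets."""
    items = list(universe)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return {frozenset(s) for s in subsets}


def is_sigma_algebra(universe, family):
    """Check (a)-(c) of Definition 2.16 for a family of subsets of a finite set."""
    universe = frozenset(universe)
    if universe not in family:                            # (a)
        return False
    if any(universe - a not in family for a in family):  # (b)
        return False
    # (c): on a finite set, pairwise unions suffice
    return all(a | b in family for a in family for b in family)


def is_additive(family, mu):
    """Check mu(empty set) = 0 and additivity on disjoint pairs, the finite form of (d)."""
    if mu(frozenset()) != 0:
        return False
    return all(mu(a | b) == mu(a) + mu(b)
               for a in family for b in family if not (a & b))
```

For instance, with X = {0, 1, 2}, point weights 0.5, 0.25, 0.25, and the family of all subsets, both checks succeed, whereas the family consisting of the empty set, X, and {0} fails (b).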
2.17 Remarks
1. $\emptyset = X'$.
2. If $\{A_i : i \in N\} \subset \mathcal M$ then, by De Morgan's law, $\bigcap A_i\ (i \in N) = (\bigcup A_i'\ (i \in N))' \in \mathcal M$.
3. Finite unions and intersections of sets in $\mathcal M$ are again in $\mathcal M$.
4. If $A, B \in \mathcal M$ then $A - B = A \cap B'$ and the symmetric difference $A \,\Delta\, B = (A - B) \cup (B - A)$ are in $\mathcal M$.
5. If $\mu$ is a measure on $(X, \mathcal M)$, then for any collection $\{A_n \in \mathcal M : n \in N\}$ we have $\mu(\bigcup_1^\infty A_n) \le \sum_1^\infty \mu(A_n)$. If $A_1 \subset A_2$ then $\mu(A_1) \le \mu(A_2)$ (exercise).
6. The term "complete" for measures is not related to completeness for integration structures.
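Now we will show how to use a complete integration structure $(\hat L, \hat I)$ on X to introduce a measure theory on X.

2.18 Definition A set $A \subseteq X$ is measurable with respect to $(\hat L, \hat I)$ if $\chi_A \in \hat M^+$. The collection of these measurable sets is denoted by $\hat{\mathcal M}$. For each $A \in \hat{\mathcal M}$ define $\hat\mu(A) = \hat J\chi_A$.

Note that $\hat{\mathcal G} \subseteq \{A \in \hat{\mathcal M} : \hat\mu(A) < \infty\}$.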
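2.19 Theorem $\hat{\mathcal M}$ is a σ-algebra on X and $\hat\mu$ is a measure on $\hat{\mathcal M}$.

Proof: (a) By Remark 2.11.2, $1 = \chi_X \in \hat M^+$.
(b) If $A \in \hat{\mathcal M}$ then $\chi_A \in \hat M^+$ and so $\chi_{A'} = 1 - \chi_A \in \hat M^+$.
(c) Suppose $A_i \in \hat{\mathcal M}$ $(i \in N)$ and put $A = \bigcup_1^\infty A_i$ and $B_n = \bigcup_1^n A_i$. Then $(\chi_{B_n} : n \in N)$ is an increasing sequence of functions in $\hat M^+$. Since $\chi_A = \sup \chi_{B_n}$, $\chi_A \in \hat M^+$ by the monotone convergence theorem, 2.6, and hence $A \in \hat{\mathcal M}$.
(d) In the notation of (c) we have
$$\hat\mu(A) = \hat J\chi_A = \lim_{n\to\infty} \hat J\chi_{B_n} \quad \text{(by monotone convergence)}$$
$$= \lim_{n\to\infty} \sum_{i=1}^{n} \hat J\chi_{A_i} \quad \text{(since the } \{A_i\} \text{ are disjoint)}$$
$$= \sum_{i=1}^{\infty} \hat\mu(A_i). \ \Box$$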
2.20 Examples

1. Let $(\hat L, \hat I)$ be the standardization of $(L, I) = ({}^*C_c(R), {}^*\!\int)$ on X = *R (see Example 1.18.1).
(a) $\hat{\mathcal G}$ contains all intervals of finite length, including intervals of infinitesimal length and (the degenerate case) single points [see Example 1.18.1(c)].
(b) $\hat{\mathcal M}$ contains each interval in *R (exercise).
(c) The set G of finite numbers in *R is in $\hat{\mathcal M}$ (exercise).
(d) The set of numbers infinitesimally close to any $a \in R$ is in $\hat{\mathcal G}$ (exercise).
2. In Example 1.18.3, $\hat{\mathcal G}$ consists of $\{x_0\}$, and $\hat{\mathcal M}$ consists of all sets.
In the standard developments of integration, one begins with a measure on a σ-algebra $\mathcal M$. Using $\mathcal M$, one then defines the notions of measurable function and associated integral. We now present this development. Our eventual aim is to show that if we begin with the $\hat{\mathcal M}$ and $\hat\mu$ obtained from $(\hat L, \hat I)$ then the measurable functions and integrals obtained from the standard development coincide with those obtained from $(\hat L, \hat I)$. In the next few results $\mu$ will be a measure on an arbitrary σ-algebra $\mathcal M$.

2.21 Definition An extended real-valued function h on X is measurable with respect to $\mathcal M$ if $A_a = \{x \in X : h(x) > a\} \in \mathcal M$ for each $a \in R$. The set of functions which are measurable with respect to $\mathcal M$ is denoted by $M$.

We will see presently that $M = \hat M$ in our situation, but a few results must first be established. We want to show that each $h \in M$ is the limit of a sequence of functions in $M$, each of which takes only finitely many values.

2.22 Definition A function $v \in M$ is simple if it takes only finitely many distinct real values $a_1, \ldots, a_n$, and the sets $A_i = \{x \in X : v(x) = a_i\} \in \mathcal M$ $(i = 1, \ldots, n)$. The representation $v(x) = \sum_{i=1}^{n} a_i \chi_{A_i}(x)$ is called the reduced representation of v.
2.23 Proposition Each nonnegative function $h \in M$ is the limit of a monotonically increasing sequence $(v_n \in M : n \in N)$ of nonnegative simple functions.
Proof: Define
$$v_n(x) = \begin{cases} (k-1)/2^n, & \text{if } (k-1)/2^n \le h(x) < k/2^n,\ 1 \le k \le n2^n, \\ n, & \text{if } h(x) \ge n \end{cases}$$
(drawing a picture helps here). Then $0 \le h(x) - v_n(x) \le 1/2^n$ if $h(x) \le n$, and $v_n = n$ if $h(x) > n$. Also $v_n$ increases monotonically to h. □

In the standard development of integration that we are following, the integral of a nonnegative function $h \in M$ is defined as follows.

2.24 Definition Let the measure $\mu$ on $\mathcal M$ be given. If $v = \sum_{i=1}^{n} a_i\chi_{A_i}$ is a simple function with each $a_i \ge 0$, we define the integral of v by $\int v\,d\mu = \sum_{i=1}^{n} a_i\mu(A_i)$. One can show that the integral is well defined (Exercise 7). If $h \in M$ is nonnegative we define the integral of h by
$$\int h\,d\mu = \sup\Big\{\int v\,d\mu : v \text{ simple},\ 0 \le v \le h\Big\}.$$
If $h \in M$ and $h = h^+ - h^-$ we define $\int h\,d\mu = \int h^+\,d\mu - \int h^-\,d\mu$ if one of the integrals is finite.
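The dyadic construction in the proof of Proposition 2.23 and the integral of a simple function in Definition 2.24 can both be written out directly. The following Python sketch is an illustration only: it works with the value h(x) at a single point rather than with a function on a measure space, and the names `dyadic_approx` and `simple_integral` are assumptions made here.

```python
import math


def dyadic_approx(h_value, n):
    """Value v_n(x) from Proposition 2.23 at a point where h(x) = h_value >= 0."""
    if h_value >= n:
        return float(n)
    # Here 0 <= h_value < n, so k - 1 = floor(h_value * 2**n) with 1 <= k <= n * 2**n.
    return math.floor(h_value * 2**n) / 2**n


def simple_integral(values_and_measures):
    """Integral of a simple function given as pairs (a_i, mu(A_i)) with a_i >= 0."""
    return sum(a * m for a, m in values_and_measures)
```

At a point where h(x) = 2.5, for example, the values v_1, v_2, v_3 are 1, 2, and 2.5, which shows both the cap at n and the refinement of the dyadic grid.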
We now show that our development of integration coincides with this standard development.

2.25 Theorem Let $(\hat L, \hat I)$ be a complete integration structure with measurable functions $\hat M$, and let $M$ be the functions measurable with respect to the σ-algebra $\hat{\mathcal M}$ obtained from $(\hat L, \hat I)$. Then an $\bar R$-valued function h is in $\hat M^+$ iff it is in $M^+$, and $\hat J h = \int h\,d\hat\mu$, where $\hat\mu$ is the measure obtained from $(\hat L, \hat I)$.

Proof: Assume that $h \in \hat M^+$. For $a > 0$ let $A = \{x \in X : h(x) > a\}$; fix $m > a$ in N and $C \in \hat{\mathcal G}$. For any $n \in N$, $\chi_A \wedge n\chi_C = \chi_{A \cap C}$, and
$$A \cap C = \{x \in X : h \wedge m\chi_C > a\} \in \hat{\mathcal G}$$
by 2.14 and 2.15, so $A \in \hat{\mathcal M}$. Moreover,
$$\{x \in X : h(x) > 0\} = \bigcup_{n \in N} \{x \in X : h(x) > 1/n\} \in \hat{\mathcal M},$$
and so $h \in M^+$. Now assume that $h \in M^+$, and fix $C \in \hat{\mathcal G}$ and $n \in N$. Then $h \wedge n\chi_C$ is the limit of an increasing sequence of simple functions from $\hat L$ by 2.23. Thus $h \wedge n\chi_C \in \hat L$ by completeness, so $h \in \hat M^+$. To show that $\hat J h = \int h\,d\hat\mu$, note that $\int v\,d\hat\mu = \hat J v$ for nonnegative simple functions and that
$$\int h\,d\hat\mu = \sup\{\hat J v : v \text{ simple},\ 0 \le v \le h\} \le \sup\{\hat J f : f \in \hat M,\ 0 \le f \le h\} = \hat J h.$$
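But if $f \in \hat L$ and $0 \le f \le h$, then there exists an increasing sequence $(v_n : n \in N)$ of simple functions with $0 \le v_n \le f$ and $\lim_{n\to\infty} v_n = f$, so that
$$\hat J f = \lim_{n\to\infty} \hat J v_n \le \int h\,d\hat\mu.$$
Hence $\hat J h = \sup\{\hat I f : f \in \hat L,\ 0 \le f \le h\} \le \int h\,d\hat\mu$. □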
2.26 Corollary $\hat M = M$ and $\hat J h = \int h\,d\hat\mu$ for all $h \in \hat M$ for which $\hat J$ is defined.

Proof: To show that $M \subseteq \hat M$ let $h = h^+ - h^- \in M$. By Theorem 2.25, $h^+ \in M^+ = \hat M^+$ and $h^- \in M^+ = \hat M^+$, and so $h \in \hat M$. To show that $\hat M \subseteq M$ we proceed in the same way, using the fact that if $f, g \in \hat M^+$ and $fg = 0$, then $f - g \in M$. To prove this we have
$$\{x \in X : f(x) - g(x) > a\} = \begin{cases} \{x \in X : f(x) > a\} & \text{if } a \ge 0, \\ \{x \in X : g(x) < -a\} & \text{if } a < 0. \end{cases}$$
Now $\{x \in X : f(x) > a\} \in \hat{\mathcal M}$. Also
$$\{x \in X : g(x) < -a\} = \{x \in X : g(x) \ge -a\}' = \Big(\bigcap \{x \in X : g(x) > -a - 1/n\}\ (n \in N)\Big)'$$
is in $\hat{\mathcal M}$ by Theorem 2.19. □
2.27 Notation Let $(\hat L, \hat I)$ be a complete integration structure with associated sets and measure $\hat M$, $\hat{\mathcal M}$, and $\hat\mu$. With Corollary 2.26 in mind we will denote the value of $\hat J$ at $h \in \hat M$ by the standard notation $\int h\,d\hat\mu$.
We can now show that the set of measurable functions is closed under many limiting and algebraic operations.

2.28 Proposition If $(h_n \in \hat M : n \in N)$ is a sequence of functions in $\hat M$, then the functions $h$, $H$, $\underline h$, $\bar H$ defined by
$$h(x) = \inf\{h_n(x) : n \in N\}, \qquad \underline h(x) = \liminf h_n(x),$$
$$H(x) = \sup\{h_n(x) : n \in N\}, \qquad \bar H(x) = \limsup h_n(x)$$
are in $\hat M$.

Proof: Since $\{x \in X : H(x) > a\} = \bigcup_{n=1}^{\infty} \{x \in X : h_n(x) > a\}$ we see that $H \in \hat M$ by 2.26. Then $h \in \hat M$ since $\inf\{h_n\} = -\sup\{-h_n\} \in \hat M$. Finally $\underline h = \sup_n\{\inf\{h_m : m \ge n\}\} \in \hat M$ and similarly $\bar H \in \hat M$. □
2.29 Proposition If $f, g \in \hat M$ and H is a continuous function on the plane $R^2$, then the function h defined by $h(x) = H(f(x), g(x))$ is in $\hat M$. In particular, $f + g$ and $fg \in \hat M$.
Proof: Since H is continuous, the sets $U_a = \{(u, v) : H(u, v) > a\}$ are open, and so each can be written as a union of open boxes:
$$U_a = \bigcup_{n=1}^{\infty} \{(u, v) : (u, v) \in (a_n, b_n) \times (c_n, d_n)\}.$$
Therefore
$$\{x \in X : h(x) > a\} = \bigcup_{n=1}^{\infty} \big(\{x \in X : a_n < f(x) < b_n\} \cap \{x \in X : c_n < g(x) < d_n\}\big)$$
is measurable (why?), and so h is measurable. □
The preceding two propositions can be used to show that most functions commonly encountered in analysis are measurable.

2.30 Notation If $f \in \hat M$ and $\int f\,d\hat\mu$ is defined, then $f\chi_A \in \hat M$ for any $A \in \hat{\mathcal M}$ by Proposition 2.29. We put $\int_A f\,d\hat\mu = \int f\chi_A\,d\hat\mu$.

It follows from Proposition 2.29 that if $f, g \in \hat M$ and $E \in \hat{\mathcal M}$ then the function h defined by
$$h(x) = \begin{cases} f(x), & x \in E, \\ g(x), & x \notin E, \end{cases}$$
is in $\hat M$ (exercise). This fact will be used later without explicit reference.

We end this section with several results which hold when the complete integration structure $(\hat L, \hat I)$ is the standardization of an internal structure (L, I). We begin by showing that ${}^\circ\phi \in \hat M$ for any $\phi \in L$.
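2.31 Proposition If $\phi \in L$ then ${}^\circ\phi \in \hat M$.

Proof: We need only show that ${}^\circ\phi \wedge \chi_C \in \hat L$ if $\phi \in L^+$ and $C \in \hat{\mathcal G}$. The rest follows by considering $\phi^+$ and $\phi^-$ and rescaling. Given $\varepsilon > 0$, choose $\psi_1$ and $\psi_2$ in L with $0 \le \psi_1 \le \chi_C \le \psi_2 \le 1$, ${}^\circ I\psi_2 < \infty$, and $I(\psi_2 - \psi_1) < \varepsilon$. Then
$$-\varepsilon\psi_2 + (\phi \wedge \psi_1) \le {}^\circ\phi \wedge \chi_C \le (\phi \wedge \psi_2) + \varepsilon\psi_2$$
and
$$I((\phi \wedge \psi_2) - (\phi \wedge \psi_1) + 2\varepsilon\psi_2) \le I(\psi_2 - \psi_1) + 2\varepsilon I\psi_2 \le \varepsilon + 2\varepsilon I\psi_2.$$
Since $\varepsilon$ is arbitrary, the result follows from 1.14. □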
The following result shows that if $h \in \hat M^+$, we may often be able to find a function $\phi \in L$ which is "close" to h in an appropriate sense.

2.32 Proposition Assume that $1 \in L$. For each $h \in \hat M^+$ there is a function $\phi \in L$ so that $|(h \wedge n) - (\phi \wedge n)| \in L_0$ for each $n \in N$, and thus
$$\hat J h = \sup\{\hat J(h \wedge n) : n \in N\} = \sup\{{}^\circ I(\phi \wedge n) : n \in N\} = {}^\circ I(\phi \wedge \omega)$$
for some $\omega \in {}^*N_\infty$.
Proof: By Theorem 1.14 we may choose sequences $(\phi_n : n \in N)$ and $(\psi_n : n \in N)$ in L such that $\phi_n \le h \wedge n \le \psi_n$, $\phi_n \le \phi_{n+1}$, and $I(\psi_n - \phi_n) < 1/n$ for each $n \in N$. Given $k \ge m \ge n$ in N, we obtain $\psi_m \wedge n \ge h \wedge n \ge \phi_k \wedge n \ge \phi_m \wedge n$ and $I((\psi_m \wedge n) - (\phi_m \wedge n)) \le I(\psi_m - \phi_m) < 1/m$. By $\aleph_1$-saturation we may find $\phi \in L$ such that $\psi_m \wedge n \ge \phi \wedge n \ge \phi_m \wedge n$ for every $m, n \in N$ with $m \ge n$. Clearly $|(h \wedge n) - (\phi \wedge n)| \in L_0$. □
The function 4 is sometimes called a "lifting" of h. Given a *R-valued function 4 E L, where (L, I) is an internal integration structure, it is important to know whether '4 E E or '4 E El and whether = 0 and j('4) = '14. Note that if 4(x) 'v 0 for all x, and 14 0, then j ( O 4 ) # "Id.If 4 2 0 then we always have j('4) 5 "I4(Exercise 17).
+
2.33 Proposition Let $\phi$ be a finite-valued element of L. Then ${}^\circ\phi \in \hat L$ iff $\sup\{{}^\circ I(|\phi| - (1/n \wedge |\phi|)) : n \in N\} < \infty$.
Proof: We may assume that $\phi \ge 0$. For each $n \in N$,
$$\{x \in X : \phi - (1/n \wedge \phi) > 0\} = \{x \in X : \phi > 1/n\} = \{x \in X : \phi - (1/2n \wedge \phi) > 1/2n\} \subseteq \{x \in X : 2n[\phi - (1/2n \wedge \phi)] \ge 1\}.$$
Moreover, ${}^\circ\phi = \lim_{n\to\infty} {}^\circ[\phi - (1/n \wedge \phi)]$. If $\sup\{{}^\circ I[\phi - (1/n \wedge \phi)] : n \in N\} < \infty$ then, for each $n \in N$, ${}^\circ[\phi - (1/n \wedge \phi)] \in \hat L$ by Theorem 1.16, and so ${}^\circ\phi \in \hat L$ by Theorem 1.15. The converse follows from the fact that if $\phi \ge 0$ is in L and $0 \le {}^\circ\phi \le \psi \in L$ with ${}^\circ I\psi < \infty$, then, for each $n \in N$, $\phi - (1/n \wedge \phi) \le \psi$. □

In the following treatment of "S-integrability", we have replaced Anderson's original definition [2] by a condition which is a direct consequence of the definition of the general integral, and is often easier to apply.
2.34 Definition A function $\phi \in L$ is S-integrable if ${}^\circ\phi \in \hat L_1$ and $\hat J({}^\circ\phi) = {}^\circ I\phi$.
2.35 Proposition A function $\phi \in L$ is S-integrable iff $I(|\phi| - (|\phi| \wedge \omega)) \simeq 0$ and $I(|\phi| \wedge 1/\omega) \simeq 0$ for each $\omega \in {}^*N_\infty$.

Proof: We may assume that $\phi \ge 0$. For each $n \in {}^*N$, $\phi = (\phi - (\phi \wedge n)) + ((\phi \wedge n) - (\phi \wedge 1/n)) + (\phi \wedge 1/n)$. Assume that $I(\phi - (\phi \wedge \omega)) \simeq 0$ and $I(\phi \wedge 1/\omega) \simeq 0$ for each $\omega \in {}^*N_\infty$. Then by the permanence principle $I(\phi - (\phi \wedge n))$ and $I(\phi \wedge 1/n)$ are finite for some $m \in N$ and all $n \ge m$ in N. Fix $n \ge m$ in N. Then $\phi \wedge n \le n^2(\phi \wedge 1/n)$, so $I((\phi \wedge n) - (\phi \wedge 1/n))$ is also finite, whence $I\phi$ is finite. Moreover, $(\phi \wedge n) - (\phi \wedge 1/n)$ is finite-valued and $\{x \in X : (\phi \wedge n) - (\phi \wedge 1/n) > 0\} = \{x \in X : \phi > 1/n\} \subseteq \{x \in X : n(\phi \wedge 1/n) \ge 1\}$, and so $({}^\circ\phi \wedge n) - ({}^\circ\phi \wedge 1/n) \in \hat L$ and
$${}^\circ I((\phi \wedge n) - (\phi \wedge 1/n)) = \hat J(({}^\circ\phi \wedge n) - ({}^\circ\phi \wedge 1/n))$$
by Theorem 1.16. Now by our assumption and Theorem 2.6,
$${}^\circ I\phi = \lim_{n\to\infty} {}^\circ I((\phi \wedge n) - (\phi \wedge 1/n)) = \lim_{n\to\infty} \hat J(({}^\circ\phi \wedge n) - ({}^\circ\phi \wedge 1/n)) = \hat J({}^\circ\phi).$$
If, on the other hand, $I\phi$ is finite, then the second and third in this string of equalities hold as before. If we also have ${}^\circ I\phi = \hat J({}^\circ\phi)$, then $I\phi \simeq I((\phi \wedge \omega) - (\phi \wedge 1/\omega))$, and so $I(\phi - (\phi \wedge \omega)) \simeq 0$ and $I(\phi \wedge 1/\omega) \simeq 0$ for each $\omega \in {}^*N_\infty$. □

Note that the condition $I(|\phi| \wedge 1/\omega) \simeq 0$ is automatically satisfied for any $\phi \in L$ and $\omega \in {}^*N_\infty$ if $1 \in L$ and ${}^\circ I(1) < \infty$.

Exercises IV.2
1. (Standard) Finish the proof of Proposition 2.5.
2. (Standard) Prove Proposition 2.8.
3. (Standard) Show that if $h \in \hat M$ and $|h| \le g$ for some $g \in \hat L_1^+$ then $h \in \hat L_1$.
4. (Standard) Show that if $1 \in \hat L$, then every real-valued function in $\hat L_1$ is in $\hat L$.
5. (Standard) Show that if $(X, \mathcal M, \mu)$ is a measure space and $A_n \in \mathcal M$, $n \in N$, then $\mu(\bigcup_1^\infty A_n) \le \sum_1^\infty \mu(A_n)$, and if $A_1 \subset A_2$ then $\mu(A_1) \le \mu(A_2)$.
6. Verify the statements in (b)-(d) of Example 2.20.1.
7. (Standard) Show that, for a simple function v, $\int v\,d\mu$ is well defined in Definition 2.24. That is, if $v = \sum a_i\chi_{A_i} = \sum b_j\chi_{B_j}$, $a_i \ge 0$, $b_j \ge 0$, show that $\sum a_i\mu(A_i) = \sum b_j\mu(B_j)$.
8. Show that the measure $\hat\mu$ obtained from the standardization $(\hat L, \hat I)$ of an internal integration structure (L, I) is complete (Definition 2.16(e)). [Hint: Use Exercise IV.1.10.]
9. (Standard) Show that if $f, g \in \hat M$ and $E \in \hat{\mathcal M}$, then the function h defined by
$$h(x) = \begin{cases} f(x), & x \in E, \\ g(x), & x \notin E, \end{cases}$$
is in $\hat M$.
10. (Standard) Given a sequence $(f_n)$ of measurable functions, show that the set E of points where $\lim_{n\to\infty} f_n(x)$ exists is measurable. [Hint: Consider $\limsup f_n$ and $\liminf f_n$.]
11. Prove that if (L, I) is an internal integration structure with standardization $(\hat L, \hat I)$, then for each $\varepsilon > 0$ and $A \in \hat{\mathcal G}$ there is a $\phi \in L$ with $0 \le \phi \le \chi_A$ and $\hat\mu(A) - {}^\circ I(\phi) < \varepsilon$. In particular, if $\hat\mu(A) > 0$ there is a $\phi \in L$ with $0 \le \phi \le \chi_A$ and $I(\phi) > \hat\mu(A)/2$.
12. (Standard) Show that $\hat{\mathcal M}$ consists of those sets C such that $C \cap A \in \hat{\mathcal G}$ for each $A \in \hat{\mathcal G}$.
13. Let S be an internal hyperfinite subset of an internal set X. If $\mathcal A$ is the set of internal subsets of X, define the function $\nu : \mathcal A \to {}^*R^+$ by $\nu(A) = |A \cap S|/|S|$, where $|\cdot|$ denotes internal cardinality.
(a) Show that $\nu$ is finitely additive, i.e., $\nu(A \cup B) = \nu(A) + \nu(B)$ for $A, B \in \mathcal A$ and $A \cap B = \emptyset$.
(b) Show how you may use the theory of §IV.1 to define a measure $\mu$ on a σ-algebra $\mathcal M$ of subsets of X (see 1.6 in particular) so that $\mathcal M \supseteq \mathcal A$ and $\mu(A) = {}^\circ\nu(A)$ for $A \in \mathcal A$. Note that $0 \le \mu(A) \le 1$ for all $A \in \mathcal M$.
14. (Nonmeasurable sets) Consider Exercise 13 where $X = \{n \in {}^*N : 0 \le n < \omega\}$, $\omega \in {}^*N_\infty$. Define an operation $\oplus$ on X by $n \oplus m = n + m$ if $n + m < \omega$, and $n \oplus m = n + m - \omega$ if $n + m \ge \omega$. Call n and m in X equivalent if there is a standard $k \in N$ with either $n \oplus k = m$ or $m \oplus k = n$ (this is an equivalence relation). Using the axiom of choice, choose one point from each equivalence class to form a set B. Show that $B \notin \mathcal M$. (Hint: Show that $X = \bigcup[(B \oplus n) \cup (B \oplus (\omega - n))]\ (n \in N)$.)
15. Let (L, I) be an internal lattice. Give an example of a function $\phi \in L$ for which $I\phi$ is finite but $\phi$ is not S-integrable.
16. Let (L, I) be an internal lattice.
(a) Show that if $f, g \in L$, g is S-integrable, and $|f| \le |g|$, then f is S-integrable.
(b) Show that if $f \in L$ is S-integrable and $g \in L$ satisfies $|g| \le n$ for some $n \in N$, then fg is S-integrable.
(c) Show that if f, g are S-integrable and $a, b \in {}^*R$ are finite, then $af + bg$ is S-integrable.
17. Modify the proof of Proposition 2.33 to show that for $\phi \ge 0$ in L, $\hat J({}^\circ\phi) \le {}^\circ I\phi$. (Hint: we may assume ${}^\circ I\phi < \infty$.)
18. Use Theorem 1.14 and 1.17 to show that if (L, I) is an internal Stonian integration structure, then the function 1 is in $\hat L$ iff $1 \in L$ and ${}^\circ I(1) < \infty$.
19. State and prove Proposition 2.35 with the additional simplifying assumption that the function $1 \in L$ and ${}^\circ I(1) < \infty$.
20. Let (L, I) be the hyperfinite integration structure of Example 1.6, and let $(\hat L, \hat I)$ be the standardization of (L, I), with associated $\hat{\mathcal G}$, $\hat{\mathcal M}$, $\hat\mu$, etc. Assume that $\sum a_i\ (i \in I)$ is finite.
(a) Show that $A \in \hat{\mathcal G}$ iff for every $\varepsilon > 0$ in R there exist internal subsets B and C of X such that $B \subseteq A \subseteq C$ and $\sum a_i\ (i \in C - B) < \varepsilon$.
(b) Show that $A \in \hat{\mathcal M}$ iff there is an internal set B such that $\hat\mu((A - B) \cup (B - A)) = 0$. (Hint: use $\aleph_1$-saturation and the permanence principle.)
IV.3 Integration on $R^n$; the Riesz Representation Theorem

Let X be any open or closed subset of $R^n$ and suppose that $I_0$ is a positive linear functional (p.l.f.) on the lattice $C_c(X)$ of continuous functions with compact support on X (of course $C_c(X) = C(X)$ if X is compact). For example, $I_0(f)$ could denote the Riemann integral of $f \in C_c(X)$ or, more generally, the Riemann-Stieltjes integral of f with respect to an increasing integrator. In particular, $I_0(f)$ could be evaluation of f at some point $x_0 \in X$. We want to use the theory developed in the previous sections to define a measure space $(X, \mathcal M_X, \mu_X)$ and a corresponding complete integration structure $(L_X, J_X)$ on X which is an extension of the structure $(C_c(X), I_0)$. Most of these results are easy to prove and are left as exercises. The measure $\mu_X$ will be shown to satisfy an additional condition known as regularity. This and other associated results are more technical, and can be skipped if desired. All of the above results taken together yield the Riesz representation theorem.

With minor modifications except in one place, the results and proofs of this section carry over to the case that X is any locally compact Hausdorff space. One essential difficulty arises in the proof of Lemma 3.8, which, for the general case, requires Urysohn's lemma [20]. Also, if X is not compact a "countability" condition is needed for the general case to show "outer regularity."

Without further explicit comment, the nonstandard analysis in this section will be carried out in a $\kappa$-saturated enlargement $V({}^*R)$ of $V(R)$. We assume that $\kappa \ge \aleph_1$. For a general space X we would need $\kappa > \operatorname{card} \mathcal T$, where $\mathcal T$ is the collection of open sets in X.
Let (L, I) be the internal integration structure $({}^*C_c(X), {}^*I_0)$ on *X, with $(\hat M, \hat J)$, $(\hat L_1, \hat J)$, $\hat{\mathcal M}$, $\hat{\mathcal G}$, $\hat\mu$ denoting the objects constructed from (L, I) by the procedures of §§IV.1 and IV.2. Recall that if G denotes the near-standard elements in *X then the standard part map st: $G \to X$ maps G onto X. The basic idea of this section is to use the standard part map to lift functions from X to *X as follows.

3.1 Definition For each $\bar R$-valued function f on X we define the function $\tilde f$ on *X by
$$\tilde f(y) = \begin{cases} f(\operatorname{st}(y)), & y \in G, \\ 0, & y \in {}^*X - G, \end{cases}$$
and for each $A \subseteq X$ we define
$$\tilde A = \operatorname{st}^{-1}(A) \cap {}^*X.$$

3.2 Remarks
1. $\tilde f$ is constant on the monads of standard points in *X, and zero at all points which are remote (i.e., not near-standard). In particular, $\tilde f(x) = 0$ if $x \in {}^*X$ and the norm of x is infinite.
2. $\widetilde{af} = a\tilde f$, $\widetilde{f + g} = \tilde f + \tilde g$, $\widetilde{f \vee g} = \tilde f \vee \tilde g$, $\widetilde{f \wedge g} = \tilde f \wedge \tilde g$ (exercise).
3. $\widetilde{\chi_A} = \chi_{\tilde A}$ (exercise).
We now obtain measure-theoretic structures on X with the following definition.

3.3 Definition We let $M_X = \{f : \tilde f \in \hat M\}$ and define $J_X$ by putting $J_X(f) = \hat J(\tilde f)$ when $\hat J(\tilde f)$ is defined. For each set $A \subseteq X$ with $\tilde A \in \hat{\mathcal M}$, i.e., $\chi_A \in M_X$, we set $\mu_X(A) = \hat\mu(\tilde A)$; the set $\mathcal M_X = \{A \subseteq X : \tilde A \in \hat{\mathcal M}\}$. We let $L_X$ denote the real-valued functions f in $M_X$ for which $J_X f$ is defined and finite.

3.4 Proposition $(L_X, J_X)$ is a complete integration structure which extends $(C_c(X), I_0)$. Moreover, $(X, \mathcal M_X, \mu_X)$ is a measure space such that $f \in M_X$ iff f is $\mathcal M_X$-measurable, and $\int f\,d\mu_X = J_X f$ when $J_X f$ is defined.
Proof: That $(L_X, J_X)$ is an integration structure is left as an exercise. To show that $(L_X, J_X)$ extends $(C_c(X), I_0)$, let $f \in C_c(X)$. By the uniform continuity of f, ${}^*f(y) \simeq {}^*f(x)$ if $y \simeq x$, and ${}^*f$ is zero at any remote point since f has compact support. Thus $\tilde f = {}^\circ({}^*f)$. By the obvious extension of Example 1.18.1(b), $\tilde f \in \hat L$ and $\hat J\tilde f = \hat I\tilde f = {}^\circ I({}^*f) = I_0 f$.
To show that $(L_X, J_X)$ is complete, let $(f_n)$ be a monotone increasing sequence of functions in $L_X$ for which $\lim_{n\to\infty} f_n(x) = f(x)$ exists for all $x \in X$ and $\sup\{J_X f_n : n \in N\} < \infty$. Then $(\tilde f_n)$ is a monotone increasing sequence of functions in $\hat L_1$ and $\sup\{\hat J\tilde f_n : n \in N\} < \infty$. Also $\lim_{n\to\infty} \tilde f_n(z) = \tilde f(z)$ for all $z \in {}^*X$ (check), so $\tilde f \in \hat L_1$ and $\hat J\tilde f = \lim \hat J\tilde f_n$ by the monotone convergence theorem for $(\hat L_1, \hat J)$. Therefore $f \in L_X$ and $J_X f = \lim_{n\to\infty} J_X f_n$. The rest is left to the reader (Exercise 2); the equality $\int f\,d\mu_X = J_X f$ follows from the corresponding fact for simple functions. □

When we start with $I_0$ being the p.l.f. given by ordinary Riemann integration, then $\mathcal M_X$ is called the class of Lebesgue-measurable sets and $\mu_X$ is called Lebesgue measure. In that case we write $\int f\,d\mu_X$ as $\int f\,dx$.
3.5 Examples In the following examples we consider the case in which X = R and $I_0$ is given by Riemann integration.

1. The characteristic function of any bounded interval in X is in $L_X$ (i.e., these intervals are in $\mathcal M_X$). This follows from Example 1.18.1(c). The corresponding result for bounded rectangles holds if $X = R^n$.
2. Next we show that $L_X$ contains the function
$$f(x) = \begin{cases} 1/\sqrt{x}, & 0 < x \le 1, \\ 0, & \text{otherwise}, \end{cases}$$
and hence contains unbounded functions. If $A = (0, 1]$ then $\chi_A$ and hence $n\chi_A$ are in $L_X$ by Example 1. Thus $f_n = n\chi_A \wedge 1/\sqrt{x} \in L_X$ by the lattice property. Now the sequence $(f_n)$ is monotone increasing and converges to f. An easy calculation shows that $J_X f_n \le 2$, so the result follows from completeness.
3. If $E \in \mathcal M_X$ is bounded then $\mu_X(E) < \infty$ (Exercise 3). This again generalizes to $X = R^n$.
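The "easy calculation" in Example 3.5.2 can be made explicit: splitting the interval at $x = 1/n^2$ gives $\int_0^1 \min(n, 1/\sqrt{x})\,dx = n \cdot (1/n^2) + (2 - 2/n) = 2 - 1/n \le 2$. The Python sketch below compares this closed form with a midpoint Riemann sum. It is a numerical illustration only; the function names and the number of cells are assumptions chosen here, and the Riemann sum merely approximates the integral.

```python
def truncated_integral_exact(n):
    """Closed form of the integral over (0, 1] of min(n, 1/sqrt(x)), namely 2 - 1/n."""
    return 2.0 - 1.0 / n


def truncated_integral_midpoint(n, cells=200_000):
    """Midpoint Riemann sum of min(n, 1/sqrt(x)) over (0, 1]."""
    width = 1.0 / cells
    total = 0.0
    for i in range(cells):
        x = (i + 0.5) * width          # midpoint of the i-th cell
        total += min(float(n), x ** -0.5)
    return total * width
```

The values 2 - 1/n increase to 2, in agreement with the monotone convergence argument in the example.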
The following results give more detailed information about $\mathcal M_X$ and $\mu_X$ and center about the notion of regularity, which is defined as follows.

*3.6 Notation Let $\mathcal K$ and $\mathcal T$ be the collections of subsets of X that are compact and open in X, respectively. Recall that, for $X \subseteq R^n$, $V \subseteq X$ is open in X if $V = X \cap W$ for some open $W \subseteq R^n$. A set K is compact in X iff it is compact in $R^n$. We write $K \prec f$ if $K \in \mathcal K$, $f \in C_c(X)$, $0 \le f \le 1$, and $f(x) = 1$ for all $x \in K$. We write $f \prec V$ if $V \in \mathcal T$, $f \in C_c(X)$, $0 \le f \le 1$, and $\operatorname{supp} f \subseteq V$. The notation $K \prec f \prec V$ means that $K \prec f$ and $f \prec V$.
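*3.7 Definition A measure $\mu$ on a σ-algebra $\mathcal M \supseteq \mathcal K \cup \mathcal T$ of subsets of a metric space X is inner regular if
(a) $\mu(A) = \sup\{\mu(K) : K \subseteq A,\ K \in \mathcal K\}$, $A \in \mathcal M$,
outer regular if
(b) $\mu(A) = \inf\{\mu(V) : A \subseteq V,\ V \in \mathcal T\}$, $A \in \mathcal M$,
and regular if it is both inner and outer regular.

We first show that $\mathcal M_X \supseteq \mathcal K \cup \mathcal T$. To do so we need the following fact about continuous functions.

*3.8 Lemma Suppose $K \in \mathcal K$, $V \in \mathcal T$, and $K \subset V$. Then there exists a function $f \in C_c(X)$ so that $K \prec f \prec V$.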
[I
Let
[I be
c Y. For any set A
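*3.9 Proposition If $V \in \mathcal T$, then $V \in \mathcal M_X$ and $\mu_X(V) = \sup\{I_0 f : f \prec V\}$.
Proof: Let $A \in \hat{\mathcal G}$ and $\varepsilon > 0$ in R be given. We may choose $\psi_1, \psi_2 \in L$ with $0 \le \psi_1 \le \chi_A \le \psi_2 \le 1$ and $I(\psi_2 - \psi_1) < \varepsilon/3$ by Theorem 1.14 (the inequality $\psi_2 \le 1$ uses the fact that L is Stonian). Let $\mathcal K_0 = \{K \in \mathcal K : K \subset V\}$. For each $K \in \mathcal K_0$ let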
aK = inf{OI($,
A
*f):K
BK = inf{"Z($, A * f ) : K
/? = sup(/?,:K
E
Xo}.
For each K E X o ,f l K - aK 5 ~ / 3 so , /? - u < ~ / 3 By . definition of u, we may . Kchoose a standard f E C,(X)with f < V such that Z($, A *f) > a - ~ / 3 By saturation we may choose a K' E *Xoand a 4 E L so that K' 3 *K for each K E Xo, 0 I4 I 1, 41K' = 1, and I($, A 4) < /? ~ / 3(check). It follows that $1 A *f 5 X A A Zf 5 $2 A 4 and 1 ( $ 2 A 4)-1($1 A *f) < ( f l - C f ) + 2E/3 IE, and hence xA A xp E L for each A E 2.We conclude that P E A? and V E Ax. We assume px(V ) < 00, and leave the case px( V) = co to the reader. Given E > 0 there exists an A E 8 so that &) I P(A n P) E since j x c = sup{.f(x, A xp): A E 2). With the and f obtained for this E and A as in A *f)+ E , and so the first paragraph, we have "Z($, A *f)I P(A n ?) I
+
+
OZ($,
Of($; A * j )Ijx; = p x ( V ) I" I ( + , A *f)+ 2 ~Also, . by Theorem 1.16,"(*f)= ! E L , and o f ( $ l ~ * f ) s " f * f = f o f = j f I j x ; = p x ( V ) s i n c e j I x , - . We conclude that px( V) = sup{f,ff< V ) . 0
*3.10 Proposition If $K \in \mathcal K$ then $\tilde K \in \hat{\mathcal G}$, so $K \in \mathcal M_X$, and $\mu_X(K) = \inf\{I_0 f : K \prec f\}$.
Proof:Leta = inf{f,f:K 0 in R, we have
(1 whence x i - 4 E Lo, x i E
+ E)*f
2xi 2
4,
t,and b(k)= jxi = O f 4 = a.
0
*3.11 Corollary If $K \in \mathcal K$ then $\mu_X(K) = \inf\{\mu_X(V) : V \in \mathcal T,\ V \supseteq K\}$.
Proof: Exercise. □

*3.12 Theorem The measure $\mu_X$ on $\mathcal M_X$ is regular.
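Proof: (a) We first show that $\mu_X$ is inner regular. Let $A \in \mathcal M_X$. For any $\varepsilon > 0$ in R and $n \in N$, choose $h \in \hat L^+$ so that if $\hat J\chi_{\tilde A} < \infty$ we have $\hat J(h \wedge \chi_{\tilde A}) > \hat J\chi_{\tilde A} - \varepsilon$ and if $\hat J\chi_{\tilde A} = \infty$ we have $\hat J(h \wedge \chi_{\tilde A}) > n$. Now choose $\psi \in L$ so that $0 \le \psi \le h \wedge \chi_{\tilde A}$ and ${}^\circ I\psi \ge \hat J(h \wedge \chi_{\tilde A}) - \varepsilon$. Let $K = \operatorname{st}\{y \in {}^*X : \psi(y) > 0\}$. Then K is the standard part of an internal set which is near-standard (i.e., contained in G) since $0 \le \psi \le \chi_{\tilde A}$, and $K \subset A$. Thus K is compact by Exercise III.3.11. Finally, if $\hat J\chi_{\tilde A} < \infty$ we have
$$\hat J\chi_{\tilde A} \ge \hat J\chi_{\tilde K} \ge {}^\circ I\psi \ge \hat J\chi_{\tilde A} - 2\varepsilon,$$
and $\varepsilon > 0$ is arbitrary, so (a) is established in this case. A similar argument works if $\hat J\chi_{\tilde A} = \infty$.
(b) Now we show that $\mu_X$ is outer regular. Let $A \in \mathcal M_X$. The result is trivial if $\mu_X(A) = \infty$, so suppose that $\mu_X(A) < \infty$. First assume that W is open in X and that $A \subseteq W \subseteq \bar W \subseteq X$ and $\bar W$ is compact. Given $\varepsilon > 0$ in R we may use (a) to find a compact $K \subseteq W - A$ so that $\mu_X[(W - A) - K] < \varepsilon$. Then the open set $V = W - K \supseteq A$ and $\mu_X(V) - \mu_X(A) < \varepsilon$. In general there exists an increasing sequence $(W_n)$ of sets open in X with $X = \bigcup W_n\ (n \in N)$, and $\bar W_n$ compact and contained in X for each n (exercise). Let $A_k = A \cap W_k \in \mathcal M_X$, and put $B_1 = A_1$, $B_k = A_k - A_{k-1}$, $k \ge 2$, so that the $B_k$ are disjoint and $\bigcup_{k=1}^{\infty} B_k = A$. For each k we may find an open set $V_k \supseteq B_k$ with $\mu_X(V_k) < \mu_X(B_k) + \varepsilon/2^k$. Then $V = \bigcup_{k=1}^{\infty} V_k$ is open and
$$\mu_X(V) \le \sum_{k=1}^{\infty} \mu_X(V_k) \le \sum_{k=1}^{\infty} \mu_X(B_k) + \varepsilon = \mu_X(A) + \varepsilon. \ \Box$$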
The following result summarizes this section. In its proof we use the notation $\int f\,d\mu$ for integration based on a measure $\mu$ on $\mathcal M_X$.

*3.13 Riesz Representation Theorem Let T be a p.l.f. on $C_c(X)$. Then there exists a σ-algebra $\mathcal M_X$ on X which contains all open and compact subsets of X and a unique complete regular measure $\mu_X$ on $\mathcal M_X$ so that $T(f) = \int f\,d\mu_X$ for all $f \in C_c(X)$.
Proof: From the previous results, all that remains is to show the uniqueness and completeness of $\mu_X$. To show uniqueness, let $\mu$ be any other regular measure on $\mathcal M_X$ so that $Tf = \int f\,d\mu$ for all $f \in C_c(X)$. It suffices to show that $\mu(K) = \mu_X(K)$ for all $K \in \mathcal K$ by regularity. Let $K \in \mathcal K$ and $\varepsilon > 0$ in R be fixed. By regularity there is a $V \supseteq K$ with $\mu(V) < \mu(K) + \varepsilon$. Let f satisfy $K \prec f \prec V$. Then
$$\mu_X(K) = \int \chi_K\,d\mu_X \le \int f\,d\mu_X = Tf = \int f\,d\mu \le \int \chi_V\,d\mu = \mu(V) < \mu(K) + \varepsilon.$$
This is true for any $\varepsilon > 0$, so that $\mu_X(K) \le \mu(K)$. Similarly $\mu(K) \le \mu_X(K)$, and the uniqueness follows. The completeness of $\mu_X$ follows easily from the completeness of $\hat\mu$ (see Exercise IV.2.8) and is left as an exercise. □

Exercises IV.3
1. Prove the validity of Remarks 3.2.2 and 3.2.3.
2. Show that $(L_X, J_X)$ as defined in Definition 3.3 is an integration structure, and finish the proof of Proposition 3.4.
3. Show that if $E \in \mathcal M_X$ is bounded, then $\mu_X(E) < \infty$.
4. Show that if X is an open or closed subset of $R^n$ and $K \subset X$ is compact, then there is an open set V in X (i.e., $V = X \cap W$ for some open $W \subset R^n$) such that $K \subset V$ and the closure of V is both compact and contained in X.
5. Finish the proof of Proposition 3.9 by showing that if $\mu_X(V) = \infty$, then $\mu_X(V) = \sup\{I_0 f : f \prec V\}$.
6. Assume that X is compact in $R^n$, and deduce Proposition 3.9 from Proposition 3.10.
7. Prove Corollary 3.11.
8. Show that if X is open or closed in $R^n$, then there is an increasing sequence $(W_n)$ of sets open in X with $X = \bigcup W_n\ (n \in N)$ and each $\bar W_n$ compact and contained in X.
9. Prove that $\mu_X$ is a complete measure in Theorem 3.13.
10. Show that in the case of Lebesgue integration the functionf on R defined by
is Lebesgue-measurable but not Lebesgue-integrable. 1 1 . Replace *lowith any internal positive linear functional I on *C,(X)such that Z(*f) < co for each f E C,(X). Prove that if we define (LX,J,) on X as in Definition 3.3, then (L,,J,) is a complete integration structure with J,y/'= 'If for each f E C,(X). 12. Let A x = l/n! with n E * N , be a fixed infinitesimal and let T = { u E * R : u = n Ax, n E * Z } . Let L = *C,(R) and for f E L put Z(f) = C f ( x )Ax ( x E T ) (note that for any f E L the sum is equal to the *-finite transfer of finite summation). (a) Show that (L, I) is an internal integration structure. (b) Show that if I' is the *-transfer of Riemann integration in C,(R), then there are internal functions f for which Zf I ' f . (c) If (&f) is the standardization of (L,Z), modify the procedure in Exercise IV.2.14 to produce a subset E of *[O, 13 which is not in 2.
13. Let $(L_X, J_X)$, $X = \mathbb{R}$, denote the integration structure on $X$ obtained from the $(L, I)$ of Exercise 12 by the procedure of Exercise 11.
(a) Show that the associated $\mathcal{A}_X$ contains all compact and open sets.
(b) Show that the associated $\mu_X$ is regular.
(c) Prove, hence, that $(L_X, J_X)$ coincides with the Lebesgue integration structure. (Hint: Use Theorem 3.13, especially uniqueness.)
14. (Standard) Let $D$ be the unit disk $\{z : |z| < 1\}$ in the complex plane, and let $C$ be its boundary $\{z : |z| = 1\}$. It is well known that for each continuous function $f$ on $C$, there is a unique continuous function $h_f$ on $\overline{D} = D \cup C$ such that $h_f|C = f$ and $h_f|D$ is harmonic, that is, $(\partial^2 h_f/\partial x^2) + (\partial^2 h_f/\partial y^2) = 0$. Moreover, $h_f \ge 0$ if $f \ge 0$. Use Theorem 3.13 to show that for each $x \in D$ there is a measure $\mu_x$ on $C$ such that $h_f(x) = \int_C f\,d\mu_x$ for all continuous functions $f$ on $C$.
IV.4 Basic Convergence Theorems
In this section we will present several convergence theorems which complement those presented in §IV.2. Our first concern is to establish analogues of the monotone convergence theorem in which we deal with sequences of integrable functions which are not necessarily monotone. The basic results here are Fatou's lemma and the dominated convergence theorem. Next we present several results concerning various types of convergence for sequences of measurable functions, including almost uniform convergence and convergence in measure. The proofs are standard; we include these results to fill out the standard theory.

Throughout the section we will be dealing with classes $M$ and $L_1$ of $\overline{\mathbb{R}}$-valued measurable and integrable functions on a measure space $(X, \mathcal{A}, \mu)$. Integrals of functions $f$ in $M$ and $L_1$ will be denoted by $\int f\,d\mu$. Before embarking on a presentation of the convergence theorems, we consider the role played by sets of measure zero in the discussion. These occur frequently enough for us to make the following definition.

4.1 Definition A proposition $P(x)$, which depends on $x \in X$, holds $\mu$-almost everywhere (a.e.) if there is a set $E$ of measure zero so that $P(x)$ is true for all $x \in E'$ (the complement of $E$ in $X$). When the measure $\mu$ is understood we write a.e. instead of $\mu$-a.e.
For example, a function $f$ is bounded a.e. if there is a constant $B > 0$ so that $\mu(\{x : |f(x)| > B\}) = 0$. Similarly, we say that $f = g$ a.e. if there is a set $A \in \mathcal{A}$ with $\mu(A) = 0$ and $\{x : f(x) \ne g(x)\} \subseteq A$. If $\mu$ is a complete measure or $f$ and $g$ are measurable, we need only specify that $\mu(\{x : f(x) \ne g(x)\}) = 0$. The relation of equality a.e. is easily seen to be an equivalence relation (Exercise 1). The basic fact is that sets of measure zero can be ignored as far as integration is concerned, as indicated by the following results.

4.2 Theorem
(a) If $f \in M$ is zero a.e. then $\int f\,d\mu = 0$.
(b) If $f \in M^+$ and $\int f\,d\mu = 0$ then $f = 0$ a.e.

Proof: Let $E = \{x : f(x) \ne 0\}$; then $E \in \mathcal{A}$.

(a) Suppose first that $f \in M^+$ and $\mu(E) = 0$. Letting $u_n = n\chi_E$, we have $u_n \in M^+$ and $\int u_n\,d\mu = n\mu(E) = 0$. With $h = \lim u_n$ it follows from Theorem 2.6 that $h \in M^+$ and $\int h\,d\mu = \sup\{\int u_n\,d\mu : n \in N\} = 0$. Finally $f \le h$, and hence $0 \le \int f\,d\mu \le \int h\,d\mu = 0$, so that $\int f\,d\mu = 0$. For general $f$ we write $f = f^+ - f^-$. If $f = 0$ a.e. then $f^+$ and $f^-$ are both zero a.e., and the result follows by linearity of the integral.

(b) The sets $E_n = \{x : f(x) \ge 1/n\}$ are in $\mathcal{A}$ and $E = \bigcup E_n$ $(n \in N)$. Since $f \ge (1/n)\chi_{E_n}$, we have $0 = \int f\,d\mu \ge (1/n)\mu(E_n) \ge 0$, so $\mu(E_n) = 0$. Hence $\mu(E) = 0$ by countable additivity. □
4.3 Corollary If $f, g \in M$ and $f = g$ a.e. then $\int f\,d\mu = \int g\,d\mu$.

Proof: If $E = \{x : f(x) = g(x)\}$, then $\int f\chi_{X-E}\,d\mu = \int g\chi_{X-E}\,d\mu = 0$ by 4.2(a), so $\int f\,d\mu = \int f\chi_E\,d\mu = \int g\chi_E\,d\mu = \int g\,d\mu$. □
4.4 Theorem If $f \in M$ and $\int |f|\,d\mu < \infty$, then $f$ is finite a.e.

Proof: Let $E = \{x : |f(x)| = \infty\}$. Then $E \in \mathcal{A}$ (check) and $n\chi_E \le |f|$, and so $n\mu(E) \le \int |f|\,d\mu < \infty$ for any $n \in N$. We conclude that $\mu(E) = 0$. □
Most of the results in §IV.2 can be improved by replacing assumptions which hold everywhere by corresponding assumptions holding almost everywhere. We illustrate this by proving a final version of the monotone convergence theorem.

4.5 Lebesgue's Monotone Convergence Theorem Let $f_n$ $(n \in N)$ and $g$ belong to $M$. If $f_n \ge g$ a.e. where $\int g\,d\mu > -\infty$, and $f_n \le f_{n+1}$ a.e. for all $n \in N$, then $f_n$ converges a.e. to a function $f \in M$ and $\lim_{n\to\infty} \int f_n\,d\mu = \int f\,d\mu$.

Proof: By combining the countably many sets (where $f_n < g$ or $f_n > f_{n+1}$) into one set $E$ of measure zero, we may set each $f_n$ and $g$ equal to 0 on $E$ without changing the integrals. We may also assume that $0 \ge g(x) > -\infty$ for all $x$ (check), so $-\infty < \int g\,d\mu \le 0$. The result now follows from the monotone convergence theorem applied to $f_n - g$. □
4.6 Fatou's Lemma If $(f_n)$ is a sequence of nonnegative measurable functions, then $\int (\liminf f_n)\,d\mu \le \liminf \int f_n\,d\mu$.

Proof: If $g_n = \inf\{f_i : i \ge n\}$, then $g_n \in M^+$ and $(g_n : n \in N)$ is an increasing sequence which converges to $\liminf f_n$. Also, if $n \le m$, then $g_n \le f_m$, so $\int g_n\,d\mu \le \int f_m\,d\mu$; hence $\int g_n\,d\mu \le \liminf \int f_m\,d\mu$. Therefore $\int (\liminf f_n)\,d\mu = \lim_{n\to\infty} \int g_n\,d\mu \le \liminf \int f_n\,d\mu$ by the monotone convergence theorem. □
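The inequality in Fatou's lemma can be strict. The following sketch is an illustration that is not part of the text: it assumes Lebesgue measure on $[0,1]$ and the sequence $f_n = n\chi_{(0,1/n)}$, for which every integral equals 1 while $\liminf f_n = 0$ at every point. The names `f`, `integral_f`, and `tail_inf` are hypothetical helpers, and the infimum is taken over a finite range of indices only, as a stand-in for the function $g_n$ used in the proof.

```python
from fractions import Fraction


def f(n, x):
    """f_n = n times the indicator of the open interval (0, 1/n)."""
    return n if 0 < x < Fraction(1, n) else 0


def integral_f(n):
    """Exact integral of f_n over [0, 1]: height n times length 1/n."""
    return n * Fraction(1, n)


def tail_inf(x, start, stop):
    """inf of f_i(x) over start <= i < stop, a finite stand-in for g_n(x)."""
    return min(f(i, x) for i in range(start, stop))


# Sample points 0, 1/10, ..., 1 in [0, 1].
points = [Fraction(k, 10) for k in range(0, 11)]

# Every integral is 1, so lim inf of the integrals is 1.
integrals = [integral_f(n) for n in range(1, 50)]

# For i >= 20 the support (0, 1/i) misses every sample point,
# so the tail infimum vanishes at each of them.
tail_values = [tail_inf(x, 20, 200) for x in points]

print(set(integrals), set(tail_values))
```

At the sample points the integral of the lower limit is 0 while the lower limit of the integrals is 1, which is consistent with the lemma and shows that equality need not hold.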
4.7 Lebesgue's Dominated Convergence Theorem Suppose that $(f_n)$ is a sequence of measurable functions which converges a.e. to a measurable function $f$. If there is a nonnegative function $g \in L_1$ so that $|f_n| \le g$ a.e. for each $n \in N$, then $f \in L_1$ and $\int f\,d\mu = \lim_{n\to\infty} \int f_n\,d\mu$.

Proof: Fix a set $E \in \mathcal{A}$ with $\mu(E) = 0$ so that $(f_n)$ converges to $f$ except possibly on the set $E$, and $|f_n| \le g$ except possibly on the set $E$. If $f_n' = f_n\chi_{X-E}$, $f' = f\chi_{X-E}$, and $g' = g\chi_{X-E}$, then the sequence $(f_n')$ of measurable functions converges everywhere to $f'$, $|f_n'| \le g'$ on $X$, and finally $\int f\,d\mu = \int f'\,d\mu$ and $\int f_n\,d\mu = \int f_n'\,d\mu$ by Corollary 4.3. Since $|f'| \le g'$ and $f' \in M$, $f' \in L_1$, as is each of the functions $f_n'$. Now $g' + f_n' \ge 0$, and so by Fatou's Lemma
$$\int g'\,d\mu + \int f'\,d\mu = \int (g' + f')\,d\mu \le \liminf \int (g' + f_n')\,d\mu = \int g'\,d\mu + \liminf \int f_n'\,d\mu.$$
Hence $\int f'\,d\mu \le \liminf \int f_n'\,d\mu$. Similarly, applying Fatou's lemma to $g' - f_n' \ge 0$, we obtain
$$\int g'\,d\mu - \int f'\,d\mu = \int (g' - f')\,d\mu \le \liminf \int (g' - f_n')\,d\mu = \int g'\,d\mu - \limsup \int f_n'\,d\mu.$$
Thus $\limsup \int f_n'\,d\mu \le \int f'\,d\mu$, and the result follows. □

The rest of this section will center on various convergence properties of sequences of measurable functions without special concern for the convergence of their integrals. The first of these is the famous result of Egoroff which states that a.e. convergence "almost" implies uniform convergence. To be specific we introduce the following definition.

4.8 Definition A sequence $(f_n)$ converges almost uniformly if for each $\varepsilon > 0$ there exists a set $E \in \mathcal{A}$ with $\mu(E) < \varepsilon$ so that $(f_n)$ converges uniformly on $E'$.

4.9 Egoroff's Theorem If $\mu(X)$ is finite and $(f_n)$ converges a.e. to $f$ on $X$ then $(f_n)$ converges almost uniformly to $f$.

Proof: For each $k$ and $n$ define the set $E_{kn} \in \mathcal{A}$ by
$$E_{kn} = \bigcap_{m=n}^{\infty} \{x : |f_m(x) - f(x)| < 1/k\}.$$
Notice that if $E$ is the set on which $(f_n)$ converges then for each $k$ we have $\bigcup E_{kn}$ $(n \in N) \supseteq E$. For fixed $k$ we have $E_{kn} \subseteq E_{km}$ if $n \le m$, and so $\lim_{n\to\infty} \mu(E_{kn}) = \mu(\bigcup E_{kn}\ (n \in N)) \ge \mu(E) = \mu(X)$. Since $\mu(X)$ is finite, for a given $\varepsilon > 0$ we see that with each $k \in N$ is associated an $n_k \in N$ so that $\mu(E_{kn_k}') < \varepsilon/2^k$. If $F = \bigcap E_{kn_k}$ $(k \in N)$ then $\mu(F') \le \sum_{k=1}^{\infty} \mu(E_{kn_k}') < \sum_{k=1}^{\infty} \varepsilon/2^k = \varepsilon$. Finally we show that $(f_n)$ converges uniformly on $F$. Let $\varepsilon > 0$ be given and find a $k$ so that $1/k < \varepsilon$. Then $|f_m(x) - f(x)| < \varepsilon$ for all $m \ge n_k$ if $x \in E_{kn_k}$. Since $F \subseteq E_{kn_k}$ we have uniform convergence on $F$. □
Another type of convergence which is important in probability theory is that of convergence in measure.

4.10 Definition A sequence $(f_n)$ of measurable real-valued functions on $X$ converges in measure to a real-valued function $f$ if for every real $\varepsilon > 0$ we have $\lim_{n\to\infty} \mu(\{x : |f_n(x) - f(x)| \ge \varepsilon\}) = 0$. Similarly $(f_n)$ is Cauchy in measure if for each $\varepsilon > 0$ we have $\lim_{n,m\to\infty} \mu(\{x : |f_n(x) - f_m(x)| \ge \varepsilon\}) = 0$.

It is easy to see that if $(f_n)$ is convergent in measure to $f$ then it is Cauchy in measure. Recall that Egoroff's theorem has been established only for sets of finite measure (see Exercise 2). The following result shows that, in general, almost uniform convergence is stronger than both convergence a.e. and convergence in measure.

4.11 Theorem If a sequence $(f_n)$ converges to $f$ almost uniformly then it converges a.e. and in measure.

Proof: For each $k \in N$ let $(f_n)$ converge uniformly to $f$ on $F_k$ where $\mu(F_k') < 1/k$. Then $(f_n)$ converges on $F$ where $F = \bigcup F_k$ $(1 \le k < \infty)$ and $\mu(F') \le \mu(F_k') < 1/k$ for each $k \in N$, so that $\mu(F') = 0$. Thus $(f_n)$ converges a.e. To prove convergence in measure let $\varepsilon > 0$ be given and choose $k$ with $1/k < \varepsilon$. Since $f_n$ converges uniformly on $F_k$, there is an $m$, depending on $k$, such that $\{x : |f_n(x) - f(x)| \ge \varepsilon\} \subseteq F_k'$ for all $n \ge m$. Thus $\mu(\{x : |f_n(x) - f(x)| \ge \varepsilon\}) < 1/k < \varepsilon$ for all $n \ge m$, and the result follows. □
The following example shows that a sequence can converge in measure but fail to converge at any point.

4.12 Example Represent each $n \in N$ as $n = k + 2^m$, $m \ge 0$, $0 \le k < 2^m$, and define $f_n(x)$ on $[0,1]$ to be $\chi_{[k2^{-m},\,(k+1)2^{-m}]}$ (the reader should draw some pictures). Then for any $x \in [0,1]$ and any $n_0$ there is an $m_1 \ge n_0$ and an $m_2 \ge n_0$ so that $f_{m_1}(x) = 0$ and $f_{m_2}(x) = 1$. Thus $f_n$ does not converge at any point. On the other hand, given $\varepsilon > 0$, the Lebesgue measure of $\{x : |f_n(x)| > \varepsilon\}$ is at most $2/n$, so that $f_n \to 0$ in measure.

In this example it is possible to select a subsequence of $(f_n)$ which converges a.e. This is true in general, as we now show.

4.13 Theorem If $(f_n)$ converges in measure to $f$, then there is a subsequence $(f_{n_k})$ which converges almost uniformly and hence a.e. to $f$.
Proof: Given $k$ we can find an $n_k$ so that $\mu(\{x : |f_n(x) - f(x)| \ge 2^{-k}\}) < 2^{-k}$ for $n \ge n_k$. We may assume that $n_{k+1} > n_k$. Now let $E_k = \{x : |f_{n_k}(x) - f(x)| \ge 2^{-k}\}$. Given $\varepsilon$, let $m$ be chosen so that $2^{-m+1} < \varepsilon$. If $x \notin \bigcup_{k=m}^{\infty} E_k = A$ then $|f_{n_k}(x) - f(x)| < 2^{-k}$ for $k \ge m$, so $f_{n_k}(x)$ converges uniformly to $f(x)$ on $A'$. But $\mu(A) \le \sum_{k=m}^{\infty} \mu(E_k) \le \sum_{k=m}^{\infty} 2^{-k} = 2^{-m+1} < \varepsilon$, and the result follows. □
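The sequence of Example 4.12 is easy to explore numerically. The sketch below is an illustration only: the helper names `decompose`, `support`, `f`, and `support_length` are hypothetical, and the decomposition $n = k + 2^m$ is taken with $m \ge 0$ so that $n = 1$ is included. It uses exact rational arithmetic to compare the length of each supporting interval with the bound $2/n$, and to show that at a fixed point the values 0 and 1 both occur within a single dyadic block of indices.

```python
from fractions import Fraction


def decompose(n):
    """Write n = k + 2**m with m >= 0 and 0 <= k < 2**m."""
    m = n.bit_length() - 1
    return n - 2**m, m


def support(n):
    """Endpoints of the interval [k 2^-m, (k+1) 2^-m] on which f_n equals 1."""
    k, m = decompose(n)
    return Fraction(k, 2**m), Fraction(k + 1, 2**m)


def f(n, x):
    """Indicator function of the supporting interval of index n."""
    a, b = support(n)
    return 1 if a <= x <= b else 0


def support_length(n):
    """Lebesgue measure of {x : f_n(x) = 1}."""
    a, b = support(n)
    return b - a


x = Fraction(1, 3)
# Indices 64..127 correspond to m = 6, whose intervals sweep across [0, 1].
values_in_block = {f(n, x) for n in range(64, 128)}
print(values_in_block, support_length(100), Fraction(2, 100))
```

Because the intervals with a fixed $m$ cover $[0,1]$, every later block of indices again produces both values at $x$, which is the pointwise divergence described in the example, while the interval lengths shrink to 0.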
Exercises IV.4

1. (Standard) Show that the relation $\equiv$ on the set of functions on a measure space $(X, \mathcal{A}, \mu)$ defined by $f \equiv g$ iff $f = g$ a.e. is an equivalence relation.
2. (Standard) Show that Egoroff's theorem does not hold for Lebesgue measure on all of $\mathbb{R}$.
3. (Standard) Show that if for each $n \in N$, $f_n \in L_1$ and $\sum_{n=1}^{\infty} \int |f_n|\,d\mu < \infty$, then the series $\sum f_n$ converges absolutely and almost everywhere to an integrable function $f$ and $\int f\,d\mu = \sum_{n=1}^{\infty} \int f_n\,d\mu$.
4. (Standard) Show that if $\lim_{n\to\infty} \int |f_n - f|\,d\mu = 0$ then $f_n$ converges to $f$ in measure.
In the following problems, $(L, I)$ will be an internal integration structure and $(\hat{L}, \hat{I})$ the complete integration structure of §IV.1 with associated measurable structure of §IV.2.

5. Show that if $g \in L_0$ then $g \simeq 0$ $\hat{\mu}$-a.e. (Hint: Assuming $g \ge 0$, for any $\varepsilon > 0$, there is a $\psi \in L$ with $0 \le g \le \psi$ and $I\psi < \varepsilon$. Use Proposition 2.33, Exercise 2.17, and the fact that $\{x : g \ge 1/n\} \subseteq \{x : \psi \ge 1/n\} \subseteq \{x : {}^\circ\psi \ge 1/2n\}$.)
6. (Lifting of Measurable Functions) Assume that $1 \in L$. A function $f$ is in $\hat{M}$ iff there exists a $\psi \in L$ such that $^\circ\psi = f$ $\hat{\mu}$-a.e. If $f$ is bounded then $\psi$ can be obtained with the same bound and $\int f\,d\hat{\mu} = {}^\circ I\psi$. (Hint: Use Proposition 2.32 and Exercise 5.) Any function $\psi \in L$ satisfying these conditions is called a lifting of $f$.
7. (Lifting of Integrable Functions) Assume that $1 \in \hat{L}$. Show that $f \in \hat{L}_1$ iff $f$ has an S-integrable lifting $\psi$, in which case $\int f\,d\hat{\mu} = {}^\circ I\psi$.
IV.5 The Fubini Theorem
A familiar process in the theory of Riemann integration for functions of several variables is that of iterated integration. If, for example, $f(x, y)$ is a continuous function on the set $[a,b] \times [c,d]$ in $\mathbb{R} \times \mathbb{R}$ then we have the equality
$$\int_a^b \left( \int_c^d f(x,y)\,dy \right) dx = \int_c^d \left( \int_a^b f(x,y)\,dx \right) dy.$$
The purpose of this section is to establish a nonstandard version of this equality in the contexts of the earlier sections of this chapter. The general result is known as the Fubini theorem, after its originator, G. Fubini. The nonstandard version is then applied to establish a Fubini theorem for integration structures on Euclidean spaces.

First some notation. We will be dealing with integration structures (internal or standard) on product spaces $U \times V$ (internal or standard). These structures will typically be denoted by $(L_{U\times V}, I_{U\times V})$. We will also be given integration structures $(L_U, I_U)$ and $(L_V, I_V)$ on $U$ and $V$, respectively. Given a function $f \in L_{U\times V}$ we may find that $f(u, \cdot) \in L_V$ for $u \in U$, in which case $I_V f$ is a function of $u$. If $g = I_V f$ is also in $L_U$ then we denote its integral $I_U g$ by $I_U I_V f$ (a slight abuse of notation since we are suppressing variables).

5.1 Definition Let $(L_U, I_U)$, $(L_V, I_V)$, and $(L_W, I_W)$ be integration structures on $U$, $V$, and $W = U \times V$, respectively. If the integration structures are standard, we say that a function $f \in L_W$ has the strong Fubini property with respect to $I_U$, $I_V$, and $I_W$ if
(i) $f(u, \cdot) \in L_V$ for all $u \in U$ and $f(\cdot, v) \in L_U$ for all $v \in V$,
(ii) $I_V f$ is in $L_U$ and $I_U f$ is in $L_V$,
(iii) $I_W f = I_U I_V f = I_V I_U f$.

If "all" in (i) is replaced by "almost all" (i.e., the conditions hold a.e.), and (ii) and (iii) hold if $I_V f$ and $I_U f$ are set equal to zero when not otherwise defined, then we say that $f$ has the Fubini property. If the integration structures are internal and (i), (ii) and (iii) hold without exception, we say that $f$ has the internal strong Fubini property.
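A finite model makes condition (iii) of Definition 5.1 concrete. The sketch below is only an illustration under stated assumptions: $U$ and $V$ are small finite sets, the functionals `I_U`, `I_V`, and `I_W` are hypothetical weighted sums (with `I_W` built from product weights), and every function on the product is trivially in the lattice, so (i) and (ii) hold automatically. With exact rational arithmetic the two iterated integrals and the integral over the product agree.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite spaces and point weights (assumptions for illustration).
U = [0, 1, 2]
V = [0, 1]
weight_U = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
weight_V = {0: Fraction(1, 4), 1: Fraction(3, 4)}


def I_U(g):
    """Weighted sum over U, playing the role of the functional I_U."""
    return sum(g(u) * weight_U[u] for u in U)


def I_V(h):
    """Weighted sum over V, playing the role of the functional I_V."""
    return sum(h(v) * weight_V[v] for v in V)


def I_W(f):
    """Weighted sum over W = U x V with product weights."""
    return sum(f(u, v) * weight_U[u] * weight_V[v] for u, v in product(U, V))


def f(u, v):
    """A sample function on U x V."""
    return Fraction(u * u + 3 * v + 1)


iterated_UV = I_U(lambda u: I_V(lambda v: f(u, v)))  # I_U I_V f
iterated_VU = I_V(lambda v: I_U(lambda u: f(u, v)))  # I_V I_U f
joint = I_W(f)                                       # I_W f

print(iterated_UV, iterated_VU, joint)
```

In this finite setting the equality is just the interchange of two finite sums. The internal strong Fubini property for *-finite sums is the transfer of this kind of statement.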
To begin we need the following basic result.

5.2 Lemma Let $(L_U, I_U)$, $(L_V, I_V)$, and $(L_W, I_W)$ with $W = U \times V$ be real complete integration structures on $U$, $V$, and $W$, respectively. Suppose that each function $f_n \in L_W$ in the sequence $\{f_n : n \in N\}$ has the Fubini property with respect to $I_U$, $I_V$, and $I_W$, and $\{f_n\}$ is a monotone increasing sequence converging to a real-valued $f$. Also suppose that $\sup\{I_W f_n : n \in N\} < \infty$. Then $f$ has the Fubini property with respect to $I_U$, $I_V$, and $I_W$.

Proof: Exercise. □
We next establish results concerning the standardizations $(\hat{L}_U, \hat{I}_U)$, $(\hat{L}_V, \hat{I}_V)$, and $(\hat{L}_W, \hat{I}_W)$ of internal integration structures $(L_U, I_U)$, $(L_V, I_V)$, and $(L_W, I_W)$ on the internal sets $U$, $V$, and $W = U \times V$, respectively, in an $\aleph_1$-saturated enlargement. These will be used to establish results on Euclidean spaces via the results of §IV.3. We assume that the function 1 (i.e., the function which is identically 1) is in $L_W$ and that $^\circ I_W 1 < \infty$. This will allow us to apply Theorem 1.16 when $\phi \in L_W$ by taking $\psi = 1$. We also assume that each function in $L_W$ has the internal strong Fubini property (as is the case, for example, with Riemann integration of continuous functions). In particular, 1 is in $L_U$ and $L_V$, and $^\circ I_U 1 < \infty$ and $^\circ I_V 1 < \infty$.
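5.3 Lemma Suppose that $\phi$ is a finite-valued function in $L_W$. Then $^\circ\phi$ has the strong Fubini property with respect to $\hat{I}_U$, $\hat{I}_V$, and $\hat{I}_W$.

Proof: Since, by assumption, $\phi(u, \cdot) \in L_V$ for each $u \in U$, we see that $^\circ\phi(u, \cdot) \in \hat{L}_V$ by Theorem 1.16. Similarly, using Theorem 1.16 where necessary, we have $\hat{I}_V(^\circ\phi) = {}^\circ I_V\phi$ in $\hat{L}_U$, $\hat{I}_U(^\circ\phi) = {}^\circ I_U\phi$ in $\hat{L}_V$, and $\hat{I}_W(^\circ\phi) = {}^\circ I_W(\phi) = {}^\circ I_U I_V(\phi) = \hat{I}_U\,{}^\circ I_V(\phi) = \hat{I}_U \hat{I}_V(^\circ\phi)$. The same argument with $U$ and $V$ reversed yields the result. □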
For the next lemma we use the fact (Exercise IV.1.10) that if $h$ is real-valued and nonnegative, then $h$ is a null function (Definition 1.7) with respect to an integration structure $(L, I)$ iff $h \in \hat{L}$ and $\hat{I}(h) = 0$.

5.4 Lemma Suppose that $h$ is a bounded real-valued null function on $W$. Then $h$ has the Fubini property with respect to $\hat{I}_U$, $\hat{I}_V$, and $\hat{I}_W$.

Proof: We may assume that $h \ge 0$ by considering $h = h^+ - h^-$ and using the fact that the Fubini property is preserved under sums (exercise). Then we have $0 \le h \le K$ for some standard integer $K$. Since $h$ is null there is a decreasing sequence $(\phi_n : n \in N)$ of functions $\phi_n \in L_W$ with $h \le \phi_n \le K$ for all $n$, and $\lim {}^\circ I_W(\phi_n)$ $(n \in N) = 0$. Since $h$ is real-valued there is a real-valued $H \in \hat{L}_W$ to which the sequence $(^\circ\phi_n)$ monotonically decreases, and $0 \le h \le H$. Now $H$ also has the strong Fubini property by Lemmas 5.3 and 5.2 (appropriately modified), and $\hat{I}_W(H) = 0$. It follows from Theorem 4.2 that for almost all $u \in U$ (in the measure induced by $(\hat{L}_U, \hat{I}_U)$), $\hat{I}_V H(u, \cdot) = 0$, whence $h(u, \cdot)$ is null on $V$. Therefore $\hat{I}_U \hat{I}_V h = 0$. The same argument works with $U$ and $V$ reversed, and we conclude that the Fubini property holds for $h$. □
Our main theorem generalizes a result of H. J. Keisler [25, p. 7].

5.5 Nonstandard Fubini Theorem Let $(L_U, I_U)$, $(L_V, I_V)$, and $(L_W, I_W)$ be internal integration structures on the internal sets $U$, $V$, and $W = U \times V$, respectively, with 1 in $L_W$ and $^\circ I_W 1 < \infty$. Assume that every finite-valued function $\phi$ in $L_W$ has the internal strong Fubini property with respect to $I_U$, $I_V$, and $I_W$. Then any $f \in \hat{M}_W$ for which $\hat{I}_W|f| < \infty$ has the Fubini property with respect to $\hat{I}_U$, $\hat{I}_V$, and $\hat{I}_W$.

Proof: Using the fact that the Fubini property is preserved under sums and writing $f = f^+ - f^-$, we may assume that $f$ is positive. Also, we may assume that $f$ is bounded by first proving the result for $f \wedge n$ and using Lemma 5.2 to pass to the limit. Suppose then that $f \in \hat{L}_W$ is a bounded nonnegative function. Then $f$ has a decomposition $f = \phi + h$ with $\phi \in L_W$ bounded and $h$ a bounded null function (check). Now $f = {}^\circ\phi + (\phi - {}^\circ\phi) + h$, and since the null function $(\phi - {}^\circ\phi) + h$ is real-valued, the theorem follows from 5.3 and 5.4. □
We will now apply Theorem 5.5 to prove a Fubini theorem for integration structures in Euclidean spaces. In the following, $X$ and $Y$ will denote closed and bounded (and thus compact) subsets of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively, and $Z = X \times Y$. Notice that 1 belongs to $C(X)$, $C(Y)$, and $C(Z)$. Given positive linear functionals $I_X$, $I_Y$, and $I_Z$ on $C(X)$, $C(Y)$, and $C(Z)$, we obtain integration structures $(C(X), I_X)$, $(C(Y), I_Y)$, and $(C(Z), I_Z)$. These structures have *-transforms on $^*X$, $^*Y$, and $^*Z$, namely, $(^*C(X), {}^*I_X)$, $(^*C(Y), {}^*I_Y)$, and $(^*C(Z), {}^*I_Z)$, respectively. For example, $^*C(X)$ is the set of all *-continuous functions on $^*X$. Using the techniques of §IV.1 and IV.3, we find that these internal structures induce integration structures $(\hat{L}_X, \hat{I}_X)$, $(\hat{L}_Y, \hat{I}_Y)$, and $(\hat{L}_Z, \hat{I}_Z)$ on $^*X$, $^*Y$, and $^*Z$, which in turn induce integration structures $(L_X, J_X)$, $(L_Y, J_Y)$, and $(L_Z, J_Z)$ on $X$, $Y$, and $Z$, respectively. The latter structures extend $(C(X), I_X)$, $(C(Y), I_Y)$, and $(C(Z), I_Z)$. The reader should recall (Remark 2.9.2) that every real-valued function in is in $\hat{L}_Z$. We remark that for $f \in C(Z)$ the equality of the iterated integrals always holds [34, 16B, p. 44]. If that common value is $I_Z f$, then the strong Fubini property holds for $f$.

5.6 Standard Fubini Theorem Assume that $X$ and $Y$ are compact. Suppose that each $f \in C(Z)$ has the strong Fubini property with respect to $I_X$, $I_Y$, and $I_Z$. Then each $f \in M_Z$ such that $J_Z|f| < \infty$ has the Fubini property with respect to $J_X$, $J_Y$, and $J_Z$.

Proof: It suffices to prove the result for $f$ bounded and hence in $L_Z$. The assumptions of Theorem 5.5 are satisfied with $^*X = U$, $^*Y = V$, and $^*Z = W$, since the strong Fubini property for each $f \in C(Z)$ transfers to the internal strong Fubini property for each $\phi \in {}^*C(Z)$. Let $f \in L_Z$. Then $\tilde{f} \in \hat{L}_Z$, and $\tilde{f}$ has the Fubini property with respect to $\hat{I}_X$, $\hat{I}_Y$, and $\hat{I}_Z$. If $^\circ x_1 = {}^\circ x_2$ then $\tilde{f}(x_1, y) = \tilde{f}(x_2, y)$ for all $y \in {}^*Y$. Thus there is a standard set $A \subset X$ such that $\tilde{f}(x, \cdot) \in \hat{L}_Y$ for all $x \in {}^*X - \tilde{A}$. Also $\tilde{A}$ is null in $^*X$ so $A$ is null in $X$. If $x \in X - A$ then $\tilde{f}(x, \cdot) = \widetilde{f(x, \cdot)}$ on $^*Y$, so $J_Y f(x, \cdot) = \hat{I}_Y \tilde{f}(x, \cdot)$. Set $J_Y f(x, \cdot) = 0$ for $x \in A$. Since $\widetilde{J_Y f}(x) = \hat{I}_Y \tilde{f}(x, \cdot)$ for $x \in {}^*X - \tilde{A}$, we have $J_Y f(x, \cdot) \in L_X$ and $J_X J_Y f = \hat{I}_X \hat{I}_Y \tilde{f} = \hat{I}_Z \tilde{f} = J_Z f$. The same argument with the roles of $X$ and $Y$ reversed gives the result. □
We have established the Fubini theorem for the case that $X$ and $Y$ are compact subsets of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively. The extension of this result for the case that $X$ and $Y$ are both open or both closed in $\mathbb{R}^n$ and $\mathbb{R}^m$ is a standard exercise, which we leave to the reader (Exercise 3).

Exercises IV.5

1. Prove Lemma 5.2.
2. Show that the Fubini property is preserved under sums.
3. Use Theorem 5.6, Exercise IV.3.8, and the obvious extension of Lemma 5.2 (for the case of $\overline{\mathbb{R}}$-valued functions) to establish Fubini's theorem for integrable $f$ on $X \times Y$, when $X$ and $Y$ are both open or both closed in $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively.
4. (Nonstandard version of Tonelli's theorem) In the notation of this section, assume that $1 \in L_W$ with $^\circ I_W 1 < \infty$, and the other assumptions of Theorem 5.5 hold. Show that if $f \in \hat{M}_W^+$, then
(a) $f(u, \cdot) \in \hat{M}_V$ for a.e. $u \in U$, and $f(\cdot, v) \in \hat{M}_U$ for a.e. $v \in V$,
(b) $\hat{I}_V f(u, \cdot) \in \hat{M}_U$, and $\hat{I}_U f(\cdot, v) \in \hat{M}_V$,
(c) $\hat{I}_W f = \hat{I}_U \hat{I}_V f = \hat{I}_V \hat{I}_U f$.
5. In the context of Exercise 4, show that if $f \in \hat{M}_W$ and either of the repeated integrals $\hat{I}_U \hat{I}_V |f|$ or $\hat{I}_V \hat{I}_U |f|$ is finite, then $\hat{I}_W |f| < \infty$ and $\hat{I}_W f = \hat{I}_U \hat{I}_V f = \hat{I}_V \hat{I}_U f$.
6. State and prove a standard version of Tonelli's theorem extending Theorem 5.6 for the case that $f \ge 0$.
7. (Standard) (a) Let $f$ be the function on $[0,1] \times [0,1]$ defined by
$(x, y) = (0, 0)$.
Use trigonometric substitutions to show that (with Lebesgue integration)
Conclude that $f$ is not Lebesgue integrable on $[0,1] \times [0,1]$.
(b) Let $f$ be the function on $S = [-1,1] \times [-1,1]$ defined by
$(x, y) = (0, 0)$
Show that the iterated integrals of $f$ over $S$ are equal, but $f$ is not integrable.
*IV.6 Applications to Stochastic Processes
In this section we present a few examples which show how the theory of integration structures and the associated measure theory as developed in §IV.1 and IV.2 can be applied to problems in probability theory, and in particular to stochastic processes. The essential idea is to extend the concepts of elementary probability theory on finite sample spaces to situations in which the sample space is a *-finite set in some enlargement. By transfer, this allows us to use the techniques of calculation and also the conceptual simplicity of the finite cases to deal with probabilistic situations in which the sample spaces are intrinsically infinite.

The standard treatment of the problems we present below, and especially Brownian motion, can be a little complicated. Following the nonstandard treatment of coin tossing and Poisson processes in [27], a nonstandard approach to the theory of Brownian motion was developed by Robert Anderson in [2]. This work has since led to a sequence of papers on nonstandard probability theory (see, for instance, the survey article [39] and other related papers [19]) and, in particular, has resulted in the solution of some difficult questions in the theory of stochastic processes by Keisler [25] and Perkins [33].

We begin with a very quick survey of probability theory. This theory was developed in order to provide a mathematical foundation for the study of problems in which the outcomes of certain experiments or measurements cannot be determined with certainty. To illustrate, we consider two typical examples from elementary probability theory:

1. A die is tossed at random and the upturned face is recorded.
2. A marksman is shooting at a target, and the resulting hole in the target is noted. Each shot is subject to unpredictable effects of wind.
In example 1 the words “at random” are meant to convey that no device (for example, weighting the die, or influencing it with magnets) is in operation. We are interested in the likelihood of a particular number or set of numbers occurring on any given toss. It is almost evident that one is more likely to toss an element from the set $\{2,4,6\}$ than that the number 3 will turn up. If the die is tossed $n$ times and even numbers turn up $n_1$ times, then $n_1/n$ “should” turn out to be quite close to $\frac{1}{2}$ (and “should” approach $\frac{1}{2}$ as $n \to \infty$). Thus, the ratio $\frac{1}{2}$ is a measure of the likelihood of an even number turning up and is called the probability of that event. To attack the problem mathematically and, in particular, to attach a meaning to the word “should” used above, we consider the set $\{1,2,3,4,5,6\}$ consisting of all the possible outcomes of the experiment. Because of the randomness we argue that each face is equally likely to turn up, and so the probability of any outcome is $\frac{1}{6}$. Using the idea that the probability of an even number turning up is the sum of the probabilities of a 2, 4, or 6 turning up, we see that the probability of an even number turning up is $\frac{1}{2}$. Similarly, we can attach a number $P(A)$ between 0 and 1 to any subset $A$ of $\{1, 2, \ldots, 6\}$ which will be a measure of the likelihood of a number in $A$ turning up and will be called the probability of the event $A$. Given the random nature of the experiment, $P(A)$ will be $|A|/6$, where $|A|$ is the number of elements in $A$. For multiple tosses of the die, we must consider a product space and corresponding probabilities.

In example 2 the analogue of the set $\{1, 2, \ldots, 6\}$ in example 1 is the set of points in the target. Each point has zero probability of being hit, but sets with positive Lebesgue measure are assigned a positive probability of being hit. A typical event is the event $A$ of hitting a particular set in the target. The probability of the event $A$ should again be a number between 0 and 1 which could be approximately determined by performing the experiment many times.
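The die example can be checked by direct computation. The following sketch is an illustration only: `prob` implements the rule $P(A) = |A|/6$ with exact fractions, and `empirical_frequency` is a hypothetical helper that simulates tosses with Python's pseudorandom generator (the seed and the number of tosses are arbitrary choices) to show the ratio $n_1/n$ settling near the computed probability.

```python
from fractions import Fraction
import random

OUTCOMES = range(1, 7)  # the six faces of the die


def prob(event):
    """Equiprobability model: P(A) = |A| / 6 for a subset A of {1, ..., 6}."""
    return Fraction(len(set(event) & set(OUTCOMES)), 6)


def empirical_frequency(event, tosses, seed=0):
    """Fraction of simulated tosses whose outcome lies in the event."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(tosses) if rng.randint(1, 6) in event)
    return hits / tosses


evens = {2, 4, 6}
print(prob(evens), prob({3}), empirical_frequency(evens, 20000))
```

The simulated frequency is only an approximation, and making the word "should" precise is the job of the limit theorems of probability theory, which are not developed here.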
In general, an abstract model for problems in probability is constructed as follows:

(a) We construct a space $\Omega$, called the sample space, whose points consist of all of the outcomes of the experiment. In example 1, $\Omega$ consists of the possible six faces (or, equivalently, the numbers from 1 to 6). In example 2, $\Omega$ consists of all of the points in the target.
(b) The events to which we wish to assign a probability are subsets of $\Omega$. Given events $A$ and $B$, $A'$ is the event that $A$ does not occur, and $A \cup B$ is the event that $A$ or $B$ occurs. More generally we require the set of events to be closed under complementation and countable unions and thus be a $\sigma$-algebra $\mathcal{E}$.
(c) The probability of an event $A$ is a number $P(A)$ satisfying $0 \le P(A) \le 1$. Since $\mathcal{E}$ is a $\sigma$-algebra we can and do require $P$ to be a measure on $\mathcal{E}$. In particular, $P(\bigcup_{n=1}^{\infty} A_n) = \sum_{n=1}^{\infty} P(A_n)$ for disjoint events $A_n$. This generalizes the procedure for constructing $P$ used in example 1.

Now we have the following definition.
6.1 Definition A probability space is a measure space $(\Omega, \mathcal{E}, P)$, where $P$ is a measure on $(\Omega, \mathcal{E})$ satisfying $P(\Omega) = 1$. The $\sigma$-algebra $\mathcal{E}$ is called the collection of events and $P$ the probability measure.

When the space $\Omega$ is finite and $\mathcal{E}$ is the set of all subsets of $\Omega$, then $P$ is completely determined by its values on the points in $\Omega$. A particularly important situation, as represented above, occurs when all the points of $\Omega$ are assigned equal probabilities (the equiprobability model), in which case $P(A) = |A|/|\Omega|$. In the examples we will consider below, the nonstandard models will be hyperfinite analogues of the equiprobability model. Following a standard convention for probability theory, we will use $\omega$ to denote elements of $\Omega$. Thus $\omega$ will no longer be used for elements of $^*N_\infty$.

In applications, we are usually concerned with functions defined on the sample space $\Omega$. For instance, in example 2 the target might be divided into three concentric regions $A_1$ (the central circle), $A_2$, and $A_3$, and the marksman could score 10, 5, or 1 depending on whether he hit $A_1$, $A_2$, or $A_3$. The expected average score of the marksman if the shooting is performed many times would be $10P(A_1) + 5P(A_2) + P(A_3)$. This leads us to the following definition.

6.2 Definition A random variable is a real-valued measurable function $X$ on the probability space $(\Omega, \mathcal{E}, P)$. The expected value $E(X)$ of $X$ is $\int X\,dP$ (when the integral is defined).

In many situations we are more interested in a particular random variable than in the underlying probability space $(\Omega, \mathcal{E}, P)$ on which it is defined. The probabilistic information involving a random variable $X$ is contained in its distribution $P_X$, which is a probability measure defined on the collection $\mathcal{B}$ of Borel-measurable subsets of the real line $\mathbb{R}$ (i.e., the smallest $\sigma$-algebra containing the open sets in $\mathbb{R}$) by the formula $P_X(A) = P(\{\omega \in \Omega : X(\omega) \in A\})$, $A \in \mathcal{B}$. It turns out that $P_X$ is completely determined by its value on all intervals in $\mathbb{R}$. Thus in many applications the properties of a random variable are defined in terms of the function $F_X(x) = P_X((-\infty, x])$, which is called the distribution function of $X$. When $X$ takes on only finitely many values $\{a_1, \ldots, a_n\}$ then $P_X$ is completely determined by the values $P_X(a_i) = P(\{\omega \in \Omega : X(\omega) = a_i\})$, $i = 1, \ldots, n$. We are now ready for the definition of a stochastic process.

6.3 Definition A stochastic process is a family $\{X_t : t \in I\}$ of random variables all defined on a common probability space $(\Omega, \mathcal{E}, P)$. $I$ is called the parameter set.
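For a random variable with finitely many values, the expected value and the distribution function reduce to finite sums. The sketch below revisits the marksman's score. The region probabilities in `region_prob` are invented for illustration (the text gives none), and it is assumed that every shot lands in one of the three regions so that they sum to 1. The names `expected_value`, `distribution`, and `F` are hypothetical helpers.

```python
from fractions import Fraction

# Hypothetical probabilities of hitting each region (assumed, sum to 1).
region_prob = {"A1": Fraction(1, 10), "A2": Fraction(3, 10), "A3": Fraction(6, 10)}
# Score awarded in each region, as in the marksman example.
score = {"A1": 10, "A2": 5, "A3": 1}


def expected_value():
    """E(X) = 10 P(A1) + 5 P(A2) + P(A3) for the score X."""
    return sum(score[r] * p for r, p in region_prob.items())


def distribution():
    """Values P_X(a_i) = P(X = a_i) for each score value a_i."""
    dist = {}
    for region, p in region_prob.items():
        dist[score[region]] = dist.get(score[region], Fraction(0)) + p
    return dist


def F(x):
    """Distribution function F_X(x) = P(X <= x)."""
    return sum(p for value, p in distribution().items() if value <= x)


print(expected_value(), F(1), F(5), F(10))
```

Here $F_X$ is a step function that jumps at the score values 1, 5, and 10 by the corresponding probabilities.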
In the following examples $I$ will be either the positive integers (for infinite coin tossing) or a subset of the real line (for the Poisson and Brownian motion processes). In the case of the coin-tossing and Poisson processes, the random variables will take values in the integers. A fundamental notion in probability theory, and especially important for stochastic processes, is the notion of independence.

6.4 Definition A collection $X_1, \ldots, X_n$ of random variables is independent if for any $x_1, \ldots, x_n$
$$P(\{\omega : X_1(\omega) \le x_1, \ldots, X_n(\omega) \le x_n\}) = \prod_{i=1}^{n} P(\{\omega : X_i(\omega) \le x_i\}).$$
If the $X_i$ are integer-valued we can replace the inequalities $\le$ by equality.

Suppose, for example, a coin is tossed $n$ times and $X_k$, $1 \le k \le n$, is the random variable which records a 1 or $-1$ if the outcome of the $k$th toss is a head or a tail, respectively. Here $\Omega$ is the set of sequences of 1's or $-1$'s of length $n$ and contains $2^n$ elements. It is clear that, for any $k$, $P(\{\omega \in \Omega : X_k(\omega) = 1\})$ is $2^{n-1}/2^n = 1/2$. More generally, if $x_i$ is fixed as 1 or $-1$ for $1 \le i \le k$, then $P(\{\omega \in \Omega : X_1 = x_1, \ldots, X_k = x_k\}) = 2^{n-k}/2^n = 1/2^k$, so the $X_k$ are independent.

A common practice is to define a stochastic process $\{X_t\}$ by properties involving the distribution functions of certain combinations of the $X_t$, for example the increments $X_t - X_s$. The Poisson and Brownian motion processes are ones in which the increments over a finite number of disjoint intervals are independent.

One last notion, which is central to probability, is that of conditional probability. Suppose, for example, that we want to compute the probability of a 5 turning up in example 1 given the extra information that an odd number will turn up. The answer is clearly $\frac{1}{3}$. In general the probability of $A$ given $B$ is denoted by $P(A \mid B)$ and is computed as follows.

6.5 Definition The conditional probability of the event $A$ given the event $B$ is given by $P(A \mid B) = P(A \cap B)/P(B)$ if $P(B) \ne 0$.
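Both the independence of finitely many coin tosses and the conditional probability for the die can be verified by brute-force enumeration. The sketch below is an illustration under assumptions: it fixes $n = 4$ tosses, uses the equiprobability model on all $2^n$ sequences, and indexes tosses from 0 as is usual in Python. The names `P`, `joint_prob`, `marginal_prob`, and `conditional` are hypothetical helpers.

```python
from fractions import Fraction
from itertools import product

n = 4
# Sample space: all sequences of +1 (head) and -1 (tail) of length n.
omega = list(product([1, -1], repeat=n))


def P(event):
    """Equiprobability measure: fraction of sequences satisfying the event."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))


def joint_prob(values):
    """P(X_1 = values[0], ..., X_k = values[k-1]) for the first k tosses."""
    return P(lambda w: all(w[i] == v for i, v in enumerate(values)))


def marginal_prob(index, value):
    """P(X = value) for the toss at the given zero-based index."""
    return P(lambda w: w[index] == value)


def conditional(A, B, outcomes):
    """P(A | B) = P(A n B) / P(B) on a finite equiprobable outcome set."""
    A, B, total = set(A), set(B), len(list(outcomes))
    return Fraction(len(A & B), total) / Fraction(len(B), total)


values = (1, -1, 1)
product_of_marginals = Fraction(1)
for i, v in enumerate(values):
    product_of_marginals *= marginal_prob(i, v)

print(joint_prob(values), product_of_marginals)
print(conditional({5}, {1, 3, 5}, range(1, 7)))
```

The joint probability $1/2^k$ factors as the product of the marginals, which is the integer-valued form of Definition 6.4, and the conditional probability reproduces the value computed for the die.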
In the standard approach to the problems to follow it is sometimes difficult to define a suitable space $(\Omega, \mathcal{E}, P)$ on which the process is defined. One advantage of the nonstandard approach is that this step is relatively easy.

6.6 Example (Infinite Coin Tossing)
In the elementary theory of probability (for finite sample spaces), one encounters the experiment of tossing a fair (i.e., unbiased) coin a finite number
IV.6
Applications to Stochastic Processes
209
of times. If the coin is tossed n times, then, as just remarked, a sample space for the experiment can be taken to be the set R, of all sequences (e,,e,, . . . ,en), where e, is either + 1 or - 1 depending on whether a head or a tail is obtained on the ith toss; thus R consists of 2" points (sequences). Specifying any event, for example the event of obtaining exactly two heads in n tosses, is the same as specifying a subset A of R. Since the coin is fair, it is argued that each sequence is equally likely, and so the probability P,(A) of an event A is measured by P,(A) = (1/2")1AI,where IAI is the cardinality of A. Suppose now that a coin is tossed an infinite number of times. We may define an associated stochastic process { X , : n E N} by putting X,(w) = + 1 or - 1 depending on whether a head or a tail occurs on the nth toss. Thus {X,} is a discrete parameter stochastic process in which the X, take on the values 1 and - 1. We would now like to define a probability space (Q8,P) on which this process is defined. In the standard theory one takes R to be the (infinite) set of all infinite sequences (e,,ez, . . .) of + 1's and - 1's. Now, however, the specification of the set of events and the probability of each event is not so clear. It is required that the set of events form a a-algebra 8 of subsets of R, and that the probability is a countably additive measure P on d with 0 5 P(A) 5 P(R) = 1 for each A E 8. Also, 8 should contain any event A, which depends on only a finite number of tosses, for example the event of getting two heads in the first 10 tosses, and P should assign to this event A the probability obtained by using only the finite theory. In the standard theory, the existence of an 8 and P satisfying these conditions is a consequence of a general theorem of Kolmogoroff. We now show that the nonstandard theory provides an appropriate 8 and P, and that these have conceptual as well as calculational advantages. 
Our sample space $\Omega$ is the internal set of all internal sequences $(e_1, e_2, \ldots, e_\eta)$ of $+1$'s and $-1$'s of length $\eta$, where $\eta = \zeta!$ and $\zeta$ is an infinite integer. The lattice $L$ is the set of all hyperreal-valued internal functions on $\Omega$; we define $I(\phi) = (1/2^\eta)\sum_{\omega \in \Omega} \phi(\omega)$ if $\phi \in L$. As noted in 1.6, $(L, I)$ is an internal integration structure. We denote by $\mathcal{E}$ the collection of internal subsets of $\Omega$ and put $P(A) = I(\chi_A)$ for $A \in \mathcal{E}$. The associated collection $\hat{\mathcal{E}}$ of measurable sets will be the collection of events, and the measure $\hat{P}$ on $\hat{\mathcal{E}}$ coming from $(\hat{L}, \hat{I})$ as in §IV.2 will define the probability. The reader should check that $0 \le \hat{P}(A) \le 1$ for each $A \in \hat{\mathcal{E}}$, and that any internal set $A$ is in $\hat{\mathcal{E}}$ with $\hat{P}(A) = \mathrm{st}(|A|/2^\eta)$, where $|\cdot|$ is the internal cardinality, i.e., the *-transfer of the standard cardinality function. It is not hard to show that if $A$ is an internal set in $\mathcal{E}$ which depends on only the first $n$ tosses then $\hat{P}(A)$ equals the probability obtained using the finite theory on $\Omega_n$. Thus $(\Omega, \hat{\mathcal{E}}, \hat{P})$ is an alternative to the standard space mentioned above. We can use $(\Omega, \hat{\mathcal{E}}, \hat{P})$ to compute the probabilities of events depending on an infinite number of tosses. As an example, let $A_n$ be the event "The first
$n - 1$ tosses are tails, then the $n$th toss is a head" in $\Omega$. Then $\hat{P}(A_n) = 1/2^n$. The event $A = \bigcup_{n=1}^{\infty} A_{2n}$ corresponds to the standard event of getting at least one head in an infinite number of tosses, the first one occurring at an even-numbered toss, and $\hat{P}(A) = \sum_{n=1}^{\infty} (1/2^{2n}) = 1/3$. Note that the internal set $B = \bigcup_{n=1}^{\eta/2} A_{2n}$ is the event of getting at least one head in $\eta$ tosses, the first one occurring at an even-numbered toss, and we also have $\hat{P}(B) = \mathrm{st}\left(\sum_{n=1}^{\eta/2} (1/2^{2n})\right) = \mathrm{st}\left(1/3 - 1/(3 \cdot 2^\eta)\right) = 1/3$. We may now consider the original stochastic process $\{X_n : n \in N\}$ as a process on $(\Omega, \hat{\mathcal{E}}, \hat{P})$ by putting $X_n(\omega) = \pm 1$ according as $e_n = \pm 1$ in $\omega = (e_1, \ldots, e_\eta)$. For each $n$, $\hat{P}(\{\omega \in \Omega : X_n(\omega) = e_n\}) = \mathrm{st}(2^{\eta-1}/2^\eta) = 1/2$ for $e_n = -1$ or $1$. Similarly any finite set of the $X_n$'s is independent.
6.7 Example (The Poisson Process)
The Poisson process is a stochastic process which is intended to model situations in which isolated events occur randomly in time. Imagine, for instance, an experiment in which we record the time $t \ge 0$ of arrival of each telephone call to an office. We can define a set $\{\tilde{N}(\omega, t) : t \in [0, \infty)\}$ of random variables by specifying that, for any particular $\omega$ in the as yet unspecified probability space $(\tilde{\Omega}, \tilde{\mathcal{E}}, \tilde{P})$ ($\omega$ should represent a particular selection from the set of ways the calls come in), $\tilde{N}(\omega, t)$ equals the number of incoming calls in the time interval $[0, t]$. Then, for $s < t$, $\tilde{N}(\omega, t) - \tilde{N}(\omega, s)$ equals the number of calls in the interval $(s, t]$. In many situations it is found that $\tilde{N}(\omega, t)$ has the following properties, which define a Poisson process, and in particular, force the measurability of the process.
(6.1) for each $\omega$, $\tilde{N}(\omega, t) \ge 0$, $\tilde{N}(\omega, 0) = 0$, and $\tilde{N}(\omega, t)$ is integer-valued,
(6.2) if $s < t$ and $\omega \in \tilde{\Omega}$ then $\tilde{N}(\omega, s) \le \tilde{N}(\omega, t)$ and $\tilde{N}(\omega, t)$ is right continuous (see Exercise 3(a) for definition),
(6.3) for each $t_1 < t_2 < \cdots < t_n \in R$ the random variables $\tilde{N}(\cdot, t_2) - \tilde{N}(\cdot, t_1), \ldots, \tilde{N}(\cdot, t_n) - \tilde{N}(\cdot, t_{n-1})$ are independent [i.e., the $\tilde{N}(\cdot, t)$ have independent increments],
(6.4) $\tilde{P}(\{\omega : \tilde{N}(\omega, s + t) - \tilde{N}(\omega, s) = k\}) = e^{-\lambda t}\left((\lambda t)^k / k!\right)$.
The assumption (6.3) says that what happens in one time interval is not affected by what happens in a disjoint time interval. The assumption (6.4) says in particular that the probability of $k$ calls occurring in the interval $(s, s + t]$ is independent of $s$ and depends only on the length $t$ of the interval and the parameter $\lambda$ in the manner indicated. We call $\lambda$ the rate of the process. We now present a nonstandard model for discussing the Poisson process. In doing so we will specify an appropriate probability space $(\tilde{\Omega}, \tilde{\mathcal{E}}, \tilde{P})$ on which the process can take place.
As in Example 6.6, let $\eta = \zeta!$ be an infinite factorial in $^*N$. For simplicity we choose a standard positive rational number $\lambda$ as the rate for our process, and we let $\gamma$ be the infinite integer $\lambda\eta$. Divide the interval $[0, \eta)$ into $\eta^2$ intervals $[0, 1/\eta), [1/\eta, 2/\eta), \ldots, [(\eta^2 - 1)/\eta, \eta)$, and let $\Omega$ be the internal set of all internal ways that $\gamma$ distinguishable points can be put into the $\eta^2$ intervals $[k/\eta, (k + 1)/\eta)$. That is, $\Omega$ consists of internal sequences $\omega = \langle \omega_i : 1 \le i \le \gamma \rangle$ with $1 \le \omega_i \le \eta^2$ for each $i$. By transfer, the internal cardinality $|\Omega|$ of $\Omega$ is $\eta^{2\gamma}$. Again we use the counting measure to induce an internal integration structure on $\Omega$. Let $L$ denote the internal lattice of all internal $^*R$-valued functions on $\Omega$, and for $f \in L$ define $If = (1/\eta^{2\gamma}) \sum_{\omega \in \Omega} f(\omega)$. Then $(L, I)$ is an internal integration structure on $\Omega$. We let the set of internal subsets of $\Omega$ (internal events) be denoted by $\mathcal{E}$. If $A \in \mathcal{E}$ then $\chi_A \in L$, and we define the internal probability of $A$ by $P(A) = I\chi_A$. The standardization $(\hat{L}, \hat{I})$ of $(L, I)$ leads to the measure space $(\Omega, \hat{\mathcal{E}}, \hat{P})$, where $\hat{\mathcal{E}}$ is the collection of measurable sets and $\hat{P}$ is the measure on $\hat{\mathcal{E}}$ obtained from $(\hat{L}, \hat{I})$ by the methods of §IV.2. Since $\hat{P}(\Omega) = 1$, we see that $0 \le \hat{P}(A) \le 1$ for all $A \in \hat{\mathcal{E}}$, and $(\Omega, \hat{\mathcal{E}}, \hat{P})$ is a probability space. Also $\hat{P}(A) = {}^{\circ}P(A)$ for all $A \in \mathcal{E}$.
We now define an internal stochastic process $\{N_t : t \in I\}$ on $\Omega$ which is an internal analogue of the Poisson process. Here $I$ is the internal set $\{k/\eta : 1 \le k \le \eta^2\}$. For any $\omega \in \Omega$ and $t \in I$, we define $N_t(\omega)$ to be the number of points which the outcome $\omega$ places in the interval $[0, t)$. For the outcome $\omega = \langle \omega_i \rangle$, a point "lies" in $[k/\eta, (k + 1)/\eta)$ if $\omega_j = k + 1$ for some $j$, $1 \le j \le \gamma$. Note that $N_t(\omega)$ can be an infinite integer for some $\omega$ even for finite $t$. Also note that since $\eta$ is an infinite factorial, any positive, standard rational number is of the form $k/\eta \in I$.
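A finite stand-in can make the bookkeeping of this model concrete. The sketch below is only an illustration under stated assumptions: `eta` and `gamma` are ordinary (finite) integers replacing the infinite integers of the text, and the names `outcomes` and `N` are hypothetical. An outcome is a tuple of cell indices, with cell $j$ standing for the interval $[(j-1)/\eta, j/\eta)$.

```python
from fractions import Fraction
from itertools import product


def outcomes(eta, gamma):
    """All ways to place gamma distinguishable points into the cells 1..eta**2."""
    return list(product(range(1, eta * eta + 1), repeat=gamma))


def N(t, omega, eta):
    """Number of points that omega places in [0, t), for t = k/eta on the grid I."""
    k = Fraction(t) * eta
    if k.denominator != 1 or not 1 <= k <= eta * eta:
        raise ValueError("t must be of the form k/eta with 1 <= k <= eta**2")
    # Cell j is [(j-1)/eta, j/eta), so it lies inside [0, k/eta) exactly when j <= k.
    return sum(1 for cell in omega if cell <= k)


eta = 4
omega = (1, 3, 3, 9, 16)              # five points; two of them share cell 3
print(N(Fraction(3, 4), omega, eta))  # points in [0, 3/4)
print(N(4, omega, eta))               # points in [0, eta), i.e. all of them
print(len(outcomes(2, 3)))            # eta**(2*gamma) outcomes for eta = 2, gamma = 3
```

Representing times as exact fractions keeps the grid test `t = k/eta` free of rounding problems.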
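We want first to compute the $P$ and $\hat{P}$ probabilities of the internal set $A = \{\omega : N_{t_2}(\omega) - N_{t_1}(\omega) = k\}$, i.e., the probabilities that $k$ points fall in $[t_1, t_2)$, where $t_2 - t_1$ is finite and $k$ is an ordinary natural number. Let $s = t_2 - t_1$. For simplicity we assume that $t_1$ and $t_2$ are rational numbers. Then there are exactly $s\eta$ of the $\eta^2$ intervals inside $[t_1, t_2)$, and the $P$-probability of any one of the $\gamma$ points being put in $[t_1, t_2)$ is $s\eta/\eta^2 = s/\eta = \lambda s/\gamma$. Now by (transfers of) elementary counting and independence,
$$P(A) = \binom{\gamma}{k} \left(\frac{\lambda s}{\gamma}\right)^k \left(1 - \frac{\lambda s}{\gamma}\right)^{\gamma - k} \simeq \frac{(\lambda s)^k}{k!} e^{-\lambda s},$$
so that $\hat{P}(A) = e^{-\lambda s} (\lambda s)^k / k!$.
This establishes the analogue of (6.4) for $N_t$.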
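The passage from counting to the Poisson law can be observed numerically. The following is a hedged sketch with finite stand-ins: it assumes that the number of the $\gamma$ points falling in an interval of length $s$ is binomial with success probability $s/\eta = \lambda s/\gamma$, as computed in the paragraph above, and compares that probability with the Poisson value for a large but finite `eta`. The function names are hypothetical, and `lam` and `eta` are taken to be integers so that `gamma = lam * eta` is an integer.

```python
import math


def internal_probability(k, s, lam, eta):
    """Binomial probability that exactly k of gamma = lam * eta points fall in
    an interval of length s, each point landing there with probability s / eta."""
    gamma = lam * eta
    p = s / eta                       # equals lam * s / gamma
    return math.comb(gamma, k) * p**k * (1.0 - p) ** (gamma - k)


def poisson_probability(k, s, lam):
    """Poisson probability e^{-lam*s} (lam*s)^k / k! from condition (6.4)."""
    return math.exp(-lam * s) * (lam * s) ** k / math.factorial(k)


lam, s, eta = 2, 1.5, 10**6
for k in range(6):
    print(k, internal_probability(k, s, lam, eta), poisson_probability(k, s, lam))
```

For a genuinely infinite $\eta$ the two columns differ by an infinitesimal; for the finite value used here the discrepancy is merely small.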
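The analogue of (6.3) can be established in the same way. Let $t_{i+1} - t_i = s_i$ and $s = s_1 + \cdots + s_{n-1}$, where $t_1 < t_2 < \cdots < t_n$, $n$ is an ordinary natural number, and $t_i \in I$. Assume $s$ is finite. If $A_i = \{\omega : N_{t_{i+1}}(\omega) - N_{t_i}(\omega) = k_i\}$, where the $k_i$ are ordinary natural numbers, with $k = k_1 + \cdots + k_{n-1}$ and $i = 1, \ldots, n - 1$, then
$$P(A_1 \cap \cdots \cap A_{n-1}) = \frac{\gamma!}{k_1! \cdots k_{n-1}!\,(\gamma - k)!} \prod_{i=1}^{n-1} \left(\frac{\lambda s_i}{\gamma}\right)^{k_i} \left(1 - \frac{\lambda s}{\gamma}\right)^{\gamma - k} \simeq \prod_{i=1}^{n-1} \frac{(\lambda s_i)^{k_i}}{k_i!} e^{-\lambda s_i}.$$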
We want to use $\{N_t(\omega) : t \in I\}$ to define a standard Poisson process. Unfortunately, for some $\omega \in \Omega$, $N_t(\omega)$ will take infinite values even for a finite $t \in I$. Another possibility is that for some $\omega \in \Omega$ there can be many points falling in an infinitesimal interval. We will show in the next paragraph that these abnormalities happen only on a set of $\hat{P}$-measure zero in $\Omega$; we can define a standard Poisson process on the remainder.
Given $\omega \in \Omega$, we order the distinguishable points $b_i$ by the order in which they fall in the line $^*R$. Thus $b_i \le b_{i+1}$, and $b_i = b_{i+1}$ if and only if $b_i$ and $b_{i+1}$ are in the same interval $[k/\eta, (k + 1)/\eta)$. Again fix $j > 0$ and $k \ge 0$ in $N$. Given $t_0 \in I$ and $t > 0$ with $t$ finite and $t_0 + t \in I$, let $C_{t_0}$ be the event "$b_j \in [t_0, t_0 + 1/\eta)$", and let $D_{t_0}$ be the event "If $j + 1 \le i \le j + k$, $b_i \in [t_0, t_0 + t)$, and $b_{j+k+1} \notin [t_0, t_0 + t)$." Let $\gamma' = \gamma - j$. Given $C_{t_0}$, the conditional probability of getting a given point of the remaining $\gamma'$ points in $[t_0, t_0 + t)$ is
$$\frac{t\eta}{\eta^2 - t_0\eta} = \frac{t}{\eta - t_0} = \frac{\lambda t}{\gamma - \lambda t_0} = \frac{\lambda t}{\gamma' + j - \lambda t_0}.$$
Therefore, for all finite $t_0$, and hence for all $t_0 < T$ for some infinite $T$, the conditional probability $P(D_{t_0} \mid C_{t_0}) \simeq (\lambda t)^k e^{-\lambda t}/k!$. On the other hand, $\sum_{t_0 < T} P(C_{t_0}) \simeq 1$, and so $\sum_{t_0 < T} P(D_{t_0} \mid C_{t_0}) P(C_{t_0}) \simeq (\lambda t)^k e^{-\lambda t}/k!$. That is, the $\hat{P}$-probability of having exactly $k$ more distinguishable
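points in the interval of length $t$ after the $j$th point is $(\lambda t)^k e^{-\lambda t}/k!$. Since
$$\sum_{k=0}^{\infty} \frac{(\lambda t)^k e^{-\lambda t}}{k!} = 1,$$
the $\hat{P}$-probability of having only a finite number of distinguishable points in any finite interval $[0, t]$ is 1. Moreover, since $\lim_{t \to 0} e^{-\lambda t} = 1$, the $\hat{P}$-probability of having point $b_{j+1}$ infinitely close to $b_j$ is 0. Since this is true for each $j \ge 1$ in $N$, it follows that the $\hat{P}$-probability of having two distinguishable points in the same monad is 0.
We now let $E \subset \Omega$ denote that set of measure zero consisting of those $\omega$ for which $N_t(\omega)$ is infinite for some finite $t$ or for which two or more distinguishable points fall in the same monad. Since $E \in \hat{\mathcal{E}}$, we define a new probability space $(\tilde{\Omega}, \tilde{\mathcal{E}}, \tilde{P})$ by putting $\tilde{\Omega} = \Omega - E$, $\tilde{\mathcal{E}} = \{A : A \subseteq \tilde{\Omega}, A \in \hat{\mathcal{E}}\}$, and $\tilde{P}(A) = \hat{P}(A)$ for $A \in \tilde{\mathcal{E}}$. We now use $N_t(\omega)$ to define a process $\{\tilde{N}_t : t \in R^+\}$ on $(\tilde{\Omega}, \tilde{\mathcal{E}}, \tilde{P})$. For $\omega \in \tilde{\Omega}$ and $t \in R^+$ we put $\tilde{N}_t(\omega) = \sup N_s(\omega)$ ($s \simeq t$, $s \in I$). By the above remarks, $\tilde{N}_t(\omega)$ is finite and integer-valued for any $\omega \in \tilde{\Omega}$ and $t \in R^+$, and $\tilde{N}_t(\omega) = N_s(\omega)$ for some $s \in I$, $s \simeq t$. We leave it to the reader to show that $\tilde{N}_t(\omega)$ is right continuous (Exercise 3) and that (6.3) and (6.4) are satisfied. Thus $\{\tilde{N}_t\}$ is a Poisson process on $(\tilde{\Omega}, \tilde{\mathcal{E}}, \tilde{P})$.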
6.8 Example (Anderson's Construction of Brownian Motion)
Brownian motion is a stochastic process which is intended to model the behavior of a particle (for example, a small particle suspended in water). The particle is subject to random disturbances (for example, collisions with the water molecules) which cause its position to change with time. For simplicity, we consider the one-dimensional case, and denote the random position of the particle on the real line at time $t \ge 0$ by $X(t)$. Again for simplicity we follow the particle only for a unit time interval. Then $\{X_t : t \in [0, 1]\}$ is to be a stochastic process on an as yet unspecified probability space $(\Omega, \mathcal{E}, P)$. A (standard) Brownian motion $\{X_t : t \in [0, 1]\}$ must satisfy the following conditions:
(6.5) $X_0 = 0$,
(6.6) if $s_1 < t_1 \le s_2 < t_2 \le \cdots \le s_n < t_n$ are points in $[0, 1]$ then the random variables $X(t_1) - X(s_1), X(t_2) - X(s_2), \ldots, X(t_n) - X(s_n)$ are independent random variables, which we denote by $X_{t_1} - X_{s_1}$, etc.,
(6.7) if $t > s$ are points in $[0, 1]$ then $P(\{\omega \in \Omega : X_t(\omega) - X_s(\omega) \le \alpha\}) = \psi(\alpha/\sqrt{t - s})$, where $\psi(x) = (1/\sqrt{2\pi}) \int_{-\infty}^{x} e^{-u^2/2}\, du$.
Condition (6.5) locates the particle at the origin at $t = 0$. Condition (6.6) says that the probability of a change in position of the particle in any time interval $(s_i, t_i]$ is unaffected by the changes in position in other disjoint intervals. Condition (6.7) indicates how closely the position of the particle at time $t$ can be determined if its position at time $s$ is known. The probability distribution function $\psi(x)$ is known as the normal distribution with mean 0 and variance 1. One should note that $\psi(x/\sigma) = (1/(\sigma\sqrt{2\pi})) \int_{-\infty}^{x} e^{-u^2/2\sigma^2}\, du$, which is the normal distribution with mean 0 and variance $\sigma^2$.
In [2], Robert M. Anderson used the measure space construction of §IV.2 to obtain, among other things, a nonstandard representation of Brownian motion. We give here a brief account of some of his results, which is necessarily incomplete since we refer to his nonstandard version of the central limit theorem (Theorem 6.11), which is crucial to the development. The central limit theorem is one of the deeper results in probability theory and to prove it here would lead us too far from the main theme of these examples.
A Brownian motion can now be defined as follows. Fix $\eta = \zeta!$, an infinite factorial in $^*N$; and let $(\Omega, \mathcal{E}, P)$ be the internal space for infinite coin tossing of Example 6.6 (with $\Omega$ being all sequences $\omega = (\omega_1, \ldots, \omega_\eta)$, and $\omega_i = +1$ or $-1$) constructed from the internal integration structure $(L, I)$. Let $(\Omega, \hat{\mathcal{E}}, \hat{P})$ be the corresponding standardization of $(\Omega, \mathcal{E}, P)$ constructed from $(\hat{L}, \hat{I})$ as in Example 6.6. Let $\chi(t, \cdot)$ denote the internal random variable (function in $L$) defined by setting
$$\chi(t, \omega) = \frac{1}{\sqrt{\eta}} \sum_{k=1}^{[\eta t]} \omega_k,$$
where $\sum_{k=1}^{0} \omega_k = 0$. Here $[\eta t]$ denotes the largest element of $^*N$ less than or equal to $\eta t$. Thus, for any $\omega = (\omega_1, \omega_2, \ldots, \omega_\eta)$, the particle located by $\chi(t, \omega)$ starts at the origin at $t = 0$ [i.e., $\chi(0, \omega) = 0$], and at each time $t_i = i/\eta$ ($i = 1, 2, 3, \ldots, \eta$) the particle moves to the right or left a distance $1/\sqrt{\eta}$, depending on whether $\omega_i$ is $+1$ or $-1$; at times lying between the $t_i$ the particle remains fixed. The resulting motion is an internal analogue of a standard "symmetric random walk." We now define $\beta(t, \omega) = {}^{\circ}\chi(t, \omega)$ for $t \in [0, 1]$ and $\omega \in \Omega$. We will show that $\beta(t, \cdot)$ is a Brownian motion on $(\Omega, \hat{\mathcal{E}}, \hat{P})$. To do so we need the following results.
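The scaled walk can be tried out with a finite number of steps. The sketch below is an illustration only, assuming the walk has the form $\chi(t, \omega) = \eta^{-1/2} \sum_{k \le [\eta t]} \omega_k$ described above, with an ordinary integer `eta = len(omega)` in place of the infinite $\eta$; the names `chi` and `second_moment` are hypothetical. It also checks by exhaustive enumeration that the second moment of $\chi(t, \cdot)$ is $[\eta t]/\eta$, which is the finite counterpart of the variance $t$ required of Brownian motion by condition (6.7).

```python
import math
from fractions import Fraction
from itertools import product


def chi(t, omega):
    """Position at time t of the walk driven by the +1/-1 sequence omega."""
    eta = len(omega)
    steps = math.floor(Fraction(t) * eta)      # [eta * t], computed exactly
    return sum(omega[:steps]) / math.sqrt(eta)


def second_moment(t, eta):
    """Exact mean of chi(t, .)**2 under the uniform measure on all 2**eta sequences."""
    total = sum(chi(t, omega) ** 2 for omega in product((1, -1), repeat=eta))
    return total / 2**eta


omega = (1, 1, -1, 1)
print([chi(Fraction(i, 4), omega) for i in range(5)])   # path sampled at t = 0, 1/4, ..., 1
print(second_moment(Fraction(1, 2), 8))                 # close to [8 * 1/2] / 8
```

Times are passed as exact fractions so that the integer part $[\eta t]$ is not disturbed by floating-point rounding.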
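6.9 Definition An internal random variable on $(\Omega, \mathcal{E}, P)$ is a function $X \in L$. A collection $\{X_i : i \in I\}$ of internal random variables is *-independent if for every *-finite internal subcollection $\{X_1, \ldots, X_m\}$ ($m \in {}^*N$) and every internal $m$-tuple $(\alpha_1, \ldots, \alpha_m) \in {}^*R^m$,
(6.8) $P(\{\omega : X_1(\omega) < \alpha_1, \ldots, X_m(\omega) < \alpha_m\}) = \prod_{k=1}^{m} P(\{\omega : X_k(\omega) < \alpha_k\})$.
$\{X_i : i \in I\}$ is S-independent if, for every finite subcollection $\{X_1, \ldots, X_m\}$ ($m \in N$) and every $m$-tuple $(\alpha_1, \ldots, \alpha_m) \in R^m$, (6.8) holds with $=$ replaced by $\simeq$.
6.10 Lemma Suppose $\{X_i : i \in I\}$ is S-independent. Then $\{{}^{\circ}X_i : i \in I\}$ is an independent collection of random variables on $(\Omega, \hat{\mathcal{E}}, \hat{P})$.
Proof Suppose $m \in N$, $(\alpha_1, \ldots, \alpha_m) \in R^m$. Then
$$\begin{aligned}
\hat{P}(\{\omega : {}^{\circ}X_{i_1}(\omega) < \alpha_1, \ldots, {}^{\circ}X_{i_m}(\omega) < \alpha_m\})
&= \lim_{n \to \infty} {}^{\circ}P\left(\left\{\omega : X_{i_1}(\omega) < \alpha_1 - \tfrac{1}{n}, \ldots, X_{i_m}(\omega) < \alpha_m - \tfrac{1}{n}\right\}\right) \\
&= \lim_{n \to \infty} \prod_{j=1}^{m} {}^{\circ}P\left(\left\{\omega : X_{i_j}(\omega) < \alpha_j - \tfrac{1}{n}\right\}\right) \\
&= \prod_{j=1}^{m} \hat{P}(\{\omega : {}^{\circ}X_{i_j}(\omega) < \alpha_j\}).
\end{aligned}$$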
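6.11 Theorem Let $\{X_n : n \in {}^*N\}$ be an internal sequence of *-independent random variables on $(\Omega, \mathcal{E}, P)$. Assume that there is a standard distribution function $F$ such that $^*F$ is the distribution of $X_n$, $E(X_n) = 0$, and $E(X_n^2) = 1$ for each $n \in {}^*N$. Let $\psi$ denote the standard normal distribution. Then for any $m \in {}^*N - N$ and any $\alpha \in {}^*R$
$$P\left(\left\{\omega : \frac{1}{\sqrt{m}} \sum_{k=1}^{m} X_k(\omega) < \alpha\right\}\right) \simeq {}^*\psi(\alpha).$$
Proof: See Theorem 21 in [2]. □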
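The content of this central limit theorem can be explored for the coin-tossing variables with a large finite number of tosses. The sketch below is an illustration under stated assumptions, not the hyperfinite statement itself: `m` is an ordinary integer, the variables are fair $\pm 1$ tosses, and the names `psi` and `walk_cdf` are hypothetical. The probability that the normalized sum is less than $\alpha$ is computed exactly from binomial coefficients and compared with the normal distribution function, which is evaluated through `math.erf`.

```python
import math


def psi(x):
    """Normal distribution function with mean 0 and variance 1."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def walk_cdf(m, alpha):
    """Exact P((w_1 + ... + w_m) / sqrt(m) < alpha) for m fair +1/-1 tosses."""
    bound = alpha * math.sqrt(m)
    favourable = 0
    ways = 1                          # binomial coefficient C(m, h), updated incrementally
    for h in range(m + 1):            # h heads give the sum 2h - m
        if 2 * h - m < bound:
            favourable += ways
        ways = ways * (m - h) // (h + 1)
    return favourable / 2**m


for alpha in (-1.0, 0.0, 1.0):
    print(alpha, walk_cdf(10_000, alpha), psi(alpha))
```

For finite `m` the two values differ by roughly the size of a single atom of the walk's distribution; the theorem asserts that for infinite $m$ this difference is infinitesimal.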
6.12 Theorem If $\eta \in {}^*N - N$, then $\beta(t, \cdot)$ is a Brownian motion on $(\Omega, \hat{\mathcal{E}}, \hat{P})$.
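Proof: (i) Given $t \in [0, 1]$, $\chi(t, \cdot)$ is $\mathcal{E}$-measurable, and so $\beta(t, \cdot)$ is $\hat{\mathcal{E}}$-measurable by Proposition 2.31.
(ii) By transfer from the case of finite coin tossing we see that the $X_i$ have identical distributions. Also if $S_k = \sum_{i=1}^{k} X_i$ for any $k \in {}^*N$ then the $S_k$ have independent increments by the transfer of Exercise 1. Thus if $s_1 < t_1 \le s_2 < t_2 \le \cdots \le s_n < t_n$ are points in $[0, 1]$, then $\{\chi(t_1, \cdot) - \chi(s_1, \cdot), \ldots, \chi(t_n, \cdot) - \chi(s_n, \cdot)\}$ are *-independent and so S-independent, and condition (6.6) follows by Lemma 6.10.
(iii) Given $s < t$ in $[0, 1]$, $\lambda = [\eta t] - [\eta s]$, and $\alpha \in R$,
$$\begin{aligned}
\hat{P}(\{\omega \in \Omega : \beta(t, \omega) - \beta(s, \omega) \le \alpha\})
&= \hat{P}(\{\omega : {}^{\circ}\chi(t, \omega) - {}^{\circ}\chi(s, \omega) \le \alpha\}) \\
&= \hat{P}\left(\left\{\omega : {}^{\circ}\left(\frac{1}{\sqrt{\eta}} \sum_{k=[\eta s]+1}^{[\eta t]} \omega_k\right) \le \alpha\right\}\right) \\
&= \lim_{n \to \infty} {}^{\circ}P\left(\left\{\omega : \frac{1}{\sqrt{\eta}} \sum_{k=[\eta s]+1}^{[\eta t]} \omega_k < \alpha + \frac{1}{n}\right\}\right) \\
&= \lim_{n \to \infty} {}^{\circ}P\left(\left\{\omega : \frac{1}{\sqrt{\lambda}} \sum_{k=[\eta s]+1}^{[\eta t]} \omega_k < \sqrt{\frac{\eta}{\lambda}} \left(\alpha + \frac{1}{n}\right)\right\}\right) \\
&= \lim_{n \to \infty} {}^{\circ}\left({}^*\psi\left(\sqrt{\frac{\eta}{\lambda}} \left(\alpha + \frac{1}{n}\right)\right)\right) \quad \text{(by Theorem 6.11)} \\
&= \lim_{n \to \infty} \psi\left(\frac{\alpha + 1/n}{\sqrt{t - s}}\right) = \psi\left(\frac{\alpha}{\sqrt{t - s}}\right).
\end{aligned}$$
This establishes condition (6.7). □
In general, by a path of a stochastic process $\{X_t : t \in I\}$, we mean a function $f(t) = X_t(\omega)$, for some particular $\omega \in \Omega$. The last result of this section shows that almost all of the paths of Brownian motion are continuous.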
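6.13 Theorem There is a set $\Omega' \in \hat{\mathcal{E}}$ with $\hat{P}(\Omega') = 1$ such that $\beta(\cdot, \omega)$ is a continuous and finite function on $[0, 1]$ for all $\omega \in \Omega'$.
Proof: For each $m, n \in N$, let $\Omega_{mn}$ be the internal set given, using the internal extensions of sup and inf, by
$$\Omega_{mn} = \bigcup_{i=0}^{n-1} \left\{\omega \in \Omega : \sup_{t \in [i/n, (i+1)/n]} \chi(t, \omega) - \inf_{t \in [i/n, (i+1)/n]} \chi(t, \omega) > \frac{1}{m}\right\}.$$
Then for $\lambda = \eta/n$,
$$\begin{aligned}
P(\Omega_{mn}) &\le nP\left(\left\{\omega : \sup_{t \in [0, 1/n]} \chi(t, \omega) - \inf_{t \in [0, 1/n]} \chi(t, \omega) > \frac{1}{m}\right\}\right) \\
&\le nP\left(\left\{\omega : \max_{1 \le k \le \lambda} \frac{1}{\sqrt{\eta}} \sum_{i=1}^{k} \omega_i > \frac{1}{2m}\right\}\right) + nP\left(\left\{\omega : \min_{1 \le k \le \lambda} \frac{1}{\sqrt{\eta}} \sum_{i=1}^{k} \omega_i < -\frac{1}{2m}\right\}\right).
\end{aligned}$$
With $\Omega' = \Omega - \bigcup_{m} \bigcap_{n} \Omega_{mn}$,
$$\hat{P}(\Omega') \ge 1 - \sup_{m} \inf_{n} \hat{P}(\Omega_{mn}) \ge 1 - \sup_{m} \inf_{n} 4n e^{-\sqrt{n}/4m} = 1.$$
Fix $\omega \in \Omega$. If for some $t \in {}^*[0, 1]$ we have ${}^{\circ}\chi(t, \omega) = +\infty$ or ${}^{\circ}\chi(t, \omega) = -\infty$, then $\omega \in \Omega_{mn}$ for all standard $m$ and $n \in N$, whence $\omega \notin \Omega'$. If for some $s$ and $t \in {}^*[0, 1]$ with $s \simeq t$ we have ${}^{\circ}|\chi(s, \omega) - \chi(t, \omega)| = a > 0$, then for $m > 2/a$ we have $\omega \in \Omega_{mn}$ for all $n \in N$ (exercise), whence $\omega \notin \Omega'$.
Now suppose $\omega \in \Omega'$. By the preceding paragraph, $\beta(t, \omega)$ is finite for all $t \in [0, 1]$. Fix $\varepsilon > 0$ in $R$. Then the set $\{n \in {}^*N : |t - s| < 1/n \Rightarrow |\chi(t, \omega) - \chi(s, \omega)| < \varepsilon/2\}$ is internal and contains all infinite $n$. Hence it contains a finite $n$ by II.7.2(ii). Thus if $|t - s| < 1/n$, $|\chi(t, \omega) - \chi(s, \omega)| < \varepsilon/2$ and hence $|\beta(t, \omega) - \beta(s, \omega)| < \varepsilon$. It follows that $\beta(\cdot, \omega)$ is continuous on $[0, 1]$. □
Exercise IV.6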
1. (Standard) Let $X_i$ be defined on the space $\Omega_n$ of Example 6.6 by $X_i(\omega) = e_i$ if $\omega = (e_1, e_2, \ldots, e_n)$. Show that the random variables $S_k = \sum_{i=1}^{k} X_i$, $1 \le k \le n$, have independent increments, i.e., if $1 \le k_1 < k_2 < k_3 < k_4 < \cdots < k_l \le n$ then $S_{k_2} - S_{k_1}, S_{k_3} - S_{k_2}, \ldots, S_{k_l} - S_{k_{l-1}}$ are independent.
2. In Example 6.7, check that
$$P(D_{t_0} \mid C_{t_0}) \simeq \frac{(\lambda t)^k}{k!} e^{-\lambda t} \quad \text{and} \quad \sum_{t_0 < T} P(C_{t_0}) \simeq 1.$$
3. (a) Show that the process $\tilde{N}_t(\omega)$ defined on $(\tilde{\Omega}, \tilde{\mathcal{E}}, \tilde{P})$ in Example 6.7 is right continuous. That is, show that, for each fixed $\omega \in \tilde{\Omega}$, the function $f : R^+ \to Z$ defined by $f(t) = \tilde{N}_t(\omega)$ satisfies $\lim_{s \to t,\, s > t} f(s) = f(t)$.
(b) Show that the process $\tilde{N}_t(\omega)$ satisfies Properties (6.3) and (6.4).
4. (Inter-arrival times) Define the process $\{\tilde{T}_n : n \in N\}$ on the space $(\tilde{\Omega}, \tilde{\mathcal{E}}, \tilde{P})$ of Example 6.7 as follows. For $\omega \in \tilde{\Omega}$, $\tilde{T}_n(\omega)$ is the time between the $(n - 1)$st and the $n$th jump of $\tilde{N}_t(\omega)$.
(a) Define the internal analogues $\{T_n : n \in {}^*N\}$ of $\{\tilde{T}_n : n \in N\}$ on $(\Omega, \mathcal{E}, P)$.
(b) Show that $\tilde{P}\{\tilde{T}_1 > t_1\} = e^{-\lambda t_1}$ and $\tilde{P}\{\tilde{T}_n > t_n \mid \tilde{T}_1 = t_1, \ldots, \tilde{T}_{n-1} = t_{n-1}\} = e^{-\lambda t_n}$.
(c) Use (b) to show that $\tilde{P}\{\tilde{T}_1 > t_1, \tilde{T}_2 > t_2, \ldots, \tilde{T}_n > t_n\} = e^{-\lambda t_1} e^{-\lambda t_2} \cdots e^{-\lambda t_n}$, showing that the $\tilde{T}_n$ are independent, identically distributed random variables.
5. Prove the result tagged as an exercise in the proof of Theorem 6.13. (Note that we may have $s < i/n < t$ for some values of $n \in N$.)
APPENDIX
Ultrafilters
In this appendix we present the essential facts concerning ultrafilters which are needed in the text. In the following, I will be an arbitrary set.
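A.1 Definition A nonempty collection $\mathcal{F}$ of subsets of $I$ is a filter on $I$ if
(i) $\emptyset \notin \mathcal{F}$,
(ii) $A, B \in \mathcal{F}$ implies $A \cap B \in \mathcal{F}$,
(iii) $A \in \mathcal{F}$ and $B \supseteq A$ implies $B \in \mathcal{F}$.
A filter $\mathcal{F}$ on $I$ is an ultrafilter if it is maximal; i.e., whenever $\mathcal{G}$ is a filter on $I$ and $\mathcal{F} \subseteq \mathcal{G}$ then $\mathcal{F} = \mathcal{G}$. The following result shows that this definition of ultrafilter is equivalent to that of Definition 1.1 in Chapter I.
A.2 Proposition A filter $\mathcal{F}$ on $I$ is an ultrafilter iff, for every subset $A$ of $I$, either $A \in \mathcal{F}$ or $A' = I - A \in \mathcal{F}$.
Proof: Suppose that $\mathcal{F}$ is a filter such that for every $A \subseteq I$ either $A \in \mathcal{F}$ or $A' \in \mathcal{F}$. Let $\mathcal{G}$ be a filter with $\mathcal{G} \supseteq \mathcal{F}$ and suppose that $B \in \mathcal{G}$ and $B \notin \mathcal{F}$. But then $B' \in \mathcal{F} \subseteq \mathcal{G}$, and so $\emptyset = B \cap B' \in \mathcal{G}$, contradicting A.1(i) for a filter. Thus there is no filter $\mathcal{G}$ properly containing $\mathcal{F}$, and so $\mathcal{F}$ is an ultrafilter.
Conversely, suppose that $\mathcal{F}$ is an ultrafilter and $A \notin \mathcal{F}$. Let $\mathcal{G}$ be the set $\{X \subseteq I : A \cap F \subseteq X \text{ for some } F \in \mathcal{F}\}$. Then $\mathcal{F} \subseteq \mathcal{G}$ and $\mathcal{F} \ne \mathcal{G}$ (since, for example, $A \in \mathcal{G}$), and so $\mathcal{G}$ is not a filter since $\mathcal{F}$ is maximal. But $\mathcal{G}$ is not empty, and if $B, C \in \mathcal{G}$ and $D \supseteq B$ then $B \cap C \in \mathcal{G}$ and $D \in \mathcal{G}$. Thus $\mathcal{G}$ can fail to be a filter only if $\emptyset \in \mathcal{G}$. That is, we have $A \cap F = \emptyset$ for some $F \in \mathcal{F}$ for which we then must have $F \subseteq A'$. It follows that $A' \in \mathcal{F}$ by A.1(iii). □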
We now want to prove the ultrafilter axiom, 1.2 of Chapter I. To do so we need Zorn's lemma, which is a variant of the axiom of choice. The statement of Zorn's lemma involves the idea of a partially ordered set and related concepts.
A.3 Definition A partially ordered set is a pair $(X, \le)$, where $X$ is a nonempty set and $\le$ is a binary relation on $X$ which is
(i) reflexive, i.e., $x \le x$ for all $x \in X$,
(ii) antisymmetric, i.e., if $x \le y$ and $y \le x$ then $x = y$,
(iii) transitive, i.e., if $x \le y$ and $y \le z$ then $x \le z$.
A subset $C$ of $X$ is a chain if for all $x, y \in C$ either $x \le y$ or $y \le x$. The element $x$ is an upper bound for a subset $B \subseteq X$ if $b \le x$ for all $b \in B$. An element $m \in X$ is maximal if, for any $x \in X$, $m \le x$ implies $x = m$.
A.4 Zorn's Lemma Let $(X, \le)$ be a partially ordered set. If each chain in $X$ has an upper bound then $X$ has at least one maximal element.
Zorn's lemma is equivalent to the axiom of choice.
A.5 Axiom of Choice For any set $A$ of nonempty sets, there is a function $f$ with domain $A$ such that $f(x) \in x$ for each $x \in A$. The function $f$ is called a choice function for $A$.
We now use Zorn's lemma to prove the ultrafilter axiom.
A.6 Ultrafilter Axiom If $\mathcal{F}$ is a filter on $I$ then there is an ultrafilter $\mathcal{U}$ on $I$ containing $\mathcal{F}$.
Proof: Let $\mathfrak{F}$ be the set of all filters which contain $\mathcal{F}$. $\mathfrak{F}$ is nonempty since $\mathcal{F} \in \mathfrak{F}$. We partially order $\mathfrak{F}$ by inclusion; i.e., if $\mathcal{A}, \mathcal{B} \in \mathfrak{F}$ then we say that $\mathcal{A} \le \mathcal{B}$ if $A \in \mathcal{A}$ implies $A \in \mathcal{B}$. It is easy to check that $\le$ is a partial ordering on $\mathfrak{F}$. Now let $\mathfrak{C}$ be a chain in $\mathfrak{F}$. To show that $\mathfrak{C}$ has an upper bound consider $\mathcal{G} = \bigcup \mathcal{C}$ ($\mathcal{C} \in \mathfrak{C}$). Then $\mathcal{C} \le \mathcal{G}$ for all $\mathcal{C} \in \mathfrak{C}$. Also $\mathcal{G}$ is a filter. For if $A, B \in \mathcal{G}$ then $A \in \mathcal{C}_1$ and $B \in \mathcal{C}_2$ for some $\mathcal{C}_1$ and $\mathcal{C}_2$ in $\mathfrak{C}$. Since $\mathfrak{C}$ is a chain, we may assume without loss of generality that $\mathcal{C}_1 \le \mathcal{C}_2$, and so $A, B \in \mathcal{C}_2$ and $A \cap B \in \mathcal{C}_2 \subseteq \mathcal{G}$. Similarly we check conditions (i) and (iii) of A.1. We deduce from Zorn's lemma that $\mathfrak{F}$ contains a maximal element which is then an ultrafilter containing $\mathcal{F}$. □
There are some ultrafilters on $I$ which, for our purposes, are quite trivial. Consider, for example, the collection $\mathcal{U}_a = \{A \subseteq I : a \in A\}$ for some $a \in I$. It is easy to see that $\mathcal{U}_a$ is an ultrafilter.
A.7 Definition An ultrafilter $\mathcal{U}$ is principal or fixed if there is some $a \in I$ so that $\mathcal{U} = \{A \subseteq I : a \in A\}$. If the ultrafilter $\mathcal{U}$ is not principal it is called free.
A.8 Theorem Free ultrafilters exist on any infinite set $I$.
Proof: The collection $\mathcal{F}_I = \{F \subseteq I : I - F \text{ is finite}\}$ is a filter (check) called the cofinite or Fréchet filter. Let $\mathcal{U}$ be an ultrafilter containing $\mathcal{F}_I$. Then $\mathcal{U}$ cannot be principal. For if $\mathcal{U} = \{A \subseteq I : a \in A\}$ and $\mathcal{U} \supseteq \mathcal{F}_I$, then the set $F = \{a\}' \in \mathcal{U}$ (contradiction). □
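On a finite set the definitions of this appendix can be verified exhaustively, which also shows why Theorem A.8 needs $I$ to be infinite: on a finite set every ultrafilter turns out to be principal. The following is a small sketch with hypothetical helper names (`powerset`, `is_filter`, `all_filters`, `ultrafilters`, `principal`); it enumerates every family of subsets of a three-element set, keeps those satisfying A.1(i)-(iii), and picks out the maximal ones.

```python
from itertools import combinations


def powerset(I):
    """All subsets of I, as frozensets."""
    items = sorted(I)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def is_filter(F, I):
    """Check conditions (i)-(iii) of Definition A.1 for a family F of subsets of I."""
    if not F or frozenset() in F:
        return False                                     # nonempty, and (i)
    if any(A & B not in F for A in F for B in F):
        return False                                     # (ii)
    return all(B in F for A in F for B in powerset(I) if A <= B)   # (iii)


def all_filters(I):
    """Every filter on I, found by testing each nonempty family of subsets."""
    subsets = powerset(I)
    families = (
        frozenset(s for i, s in enumerate(subsets) if (mask >> i) & 1)
        for mask in range(1, 2 ** len(subsets))
    )
    return [F for F in families if is_filter(F, I)]


def ultrafilters(I):
    """Filters on I that are maximal under inclusion."""
    filters = all_filters(I)
    return [F for F in filters if not any(F < G for G in filters)]


def principal(a, I):
    """The principal ultrafilter of all subsets of I containing a."""
    return frozenset(A for A in powerset(I) if a in A)


I = frozenset({0, 1, 2})
print(len(all_filters(I)), len(ultrafilters(I)))
print(set(ultrafilters(I)) == {principal(a, I) for a in I})
```

The brute-force search is only feasible for very small sets, since a set with $n$ elements has $2^{2^n}$ families of subsets; it is meant as a check on the definitions, not as an algorithm.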
References
1 Abian, A. Solvability of infinite systems of polynomial equations (to appear).
2 Anderson, R. M. A non-standard representation for Brownian motion and Itô integration. Israel J. Math. 25 (1976), 15-46.
3 Anderson, R. M. Star-finite representations of measure spaces. Trans. Amer. Math. Soc. 271 (1982), 667-687.
4 Anderson, R. M., and Rashid, S. A nonstandard characterization of weak convergence. Proc. Amer. Math. Soc. 69 (1978), 327-332.
5 Behrens, M. A local inverse function theorem. In "Victoria Symposium on Nonstandard Analysis," (A. E. Hurd and P. A. Loeb, eds.), Lecture Notes in Mathematics, Vol. 369. Springer, Berlin, 1974.
6 Benoit, E., Callot, J. L., Diener, F., and Diener, M. Chasse au canard. Collectanea Mathematica 32 (1981), 37-74.
7 Bernstein, A. R., and Robinson, A. Solution of an invariant subspace problem of K. T. Smith and P. R. Halmos. Pacific J. Math. 16 (1966), 421-431.
8 Coddington, E. A., and Levinson, N. "Theory of Ordinary Differential Equations." McGraw-Hill, New York, 1955.
9 Constantinescu, C., and Cornea, A. "Ideale Ränder Riemannscher Flächen," Ergebnisse der Math., Vol. 32. Springer, Berlin, 1963.
10 Dacunha-Castelle, D., and Krivine, J.-L. Applications des ultraproduits à l'étude des espaces et des algèbres de Banach. Studia Math. 41 (1972), 315-334.
11 Daniell, P. A general form of the integral. Ann. Math. 19 (1917-18), 279-294.
12 Davis, M. "Applied Nonstandard Analysis." Wiley, New York, 1977.
13 De Bruijn, N. G., and Erdős, P. A color problem for infinite graphs and a problem in the theory of relations. Proc. Kon. Nederl. Akad. v. Wetensch., Ser. A, 54 (1951), 371-373.
14 Dunford, N., and Schwartz, J. T. "Linear Operators," Vol. 1. Interscience, New York, 1958.
15 Gonshor, H. Enlargements contain various kinds of completions. In "Victoria Symposium on Nonstandard Analysis," (A. E. Hurd and P. A. Loeb, eds.), Lecture Notes in Mathematics, Vol. 369. Springer, Berlin, 1974.
16 Henson, C. W., and Moore, L. Nonstandard analysis and the theory of Banach spaces. In "Nonstandard Analysis-Recent Developments," (A. E. Hurd, ed.), Lecture Notes in Mathematics, Vol. 983. Springer, Berlin, 1983.
17 Hewitt, E. Rings of real-valued continuous functions, Vol. I. Trans. Amer. Math. Soc. 64 (1948), 45-99.
18 Hirschfeld, J. "A Non-Standard Smorgasbord" (unpublished notes). Tel-Aviv University.
19 Hurd, A. E. (ed.) "Nonstandard Analysis-Recent Developments," Lecture Notes in Mathematics, Vol. 983. Springer, Berlin, 1983.
20 Kelley, J. L. "General Topology." Van Nostrand, New York, 1955.
21 Keisler, H. Jerome. Good ideals in fields of sets. Ann. Math. 79 (1964), 338-359.
22 Keisler, H. Jerome. Ultraproducts and saturated models. Proc. Kon. Nederl. Akad. v. Wetensch., Ser. A, 67 (1964), 178-186.
23 Keisler, H. Jerome. "Elementary Calculus." Prindle, Weber and Schmidt, Boston, 1976.
24 Keisler, H. Jerome. "Foundations of Infinitesimal Calculus." Prindle, Weber and Schmidt, Boston, 1976.
25 Keisler, H. Jerome. "An infinitesimal approach to stochastic analysis," Memoirs Amer. Math. Soc., No. 297, American Mathematical Society, Providence, Rhode Island, Vol. 48, 1984.
26 Lebesgue, H. Sur une généralisation de l'intégrale définie. Comptes Rendus Acad. Sci. Paris 132 (1901), 1025-28.
27 Loeb, P. A. Conversion from nonstandard to standard measure spaces and applications in probability theory. Trans. Amer. Math. Soc. 211 (1975), 113-122.
28 Loeb, P. A. An introduction to nonstandard analysis and hyperfinite probability theory. In "Probabilistic Analysis and Related Topics," Vol. 2 (A. T. Bharucha-Reid, ed.). Academic Press, New York, 1979.
29 Loeb, P. A. Weak limits of measures and the standard part map. Proc. Amer. Math. Soc. 77 (1979), 128-135.
30 Loeb, P. A. Review of [23] and [24]. J. Symbolic Logic 46 (1981), 673-676.
31 Loeb, P. A. Measure spaces in nonstandard models underlying standard stochastic processes. "Proceedings of the International Congress of Mathematicians." Warsaw, 1983.
32 Loeb, P. A. A functional approach to nonstandard measure theory. Contemporary Math. 26 (1984), 251-261.
33 Loeb, P. A. A nonstandard functional approach to Fubini's theorem. Proc. Amer. Math. Soc. 93 (1985), 343-346.
34 Loomis, L. H. "An Introduction to Abstract Harmonic Analysis." Van Nostrand, New York, 1953.
35 Luxemburg, W. A. J. A remark on a paper by N. G. De Bruijn and P. Erdős. Proc. Kon. Nederl. Akad. v. Wetensch., Ser. A, 65 (1962), 343-345.
36 Luxemburg, W. A. J. A general theory of monads. In "Applications of Model Theory to Algebra, Analysis, and Probability," (W. A. J. Luxemburg, ed.). Holt, Rinehart, and Winston, New York, 1969.
37 Machover, M., and Hirschfeld, J. "Lectures on Non-standard Analysis," Lecture Notes in Mathematics, Vol. 94. Springer, Berlin, 1969.
38 Perkins, E. A. A global intrinsic characterization of Brownian local time. Ann. Probability 9 (1981), 800-817.
39 Perkins, E. A. Stochastic processes and nonstandard analysis. In "Nonstandard Analysis-Recent Developments," (A. E. Hurd, ed.), Lecture Notes in Mathematics, Vol. 983. Springer, Berlin, 1983.
40 Robinson, A. "Introduction to Model Theory and to the Metamathematics of Algebra." North-Holland, Amsterdam, 1963.
41 Robinson, A. On generalized limits and linear functionals. Pacific J. Math. 14 (1964), 269-283.
42 Robinson, A. "Nonstandard Analysis." North-Holland, Amsterdam, 1966.
43 Robinson, A. Compactifications of groups and rings and nonstandard analysis. J. Symbolic Logic 34 (1969), 576-588.
44 Robinson, A., and Lightstone, A. H. "Nonarchimedean Fields and Asymptotic Expansions." North-Holland, Amsterdam, 1975.
45 Robinson, A., and Zakon, E. A set-theoretical characterization of enlargements. In "Applications of Model Theory to Algebra, Analysis, and Probability," (W. A. J. Luxemburg, ed.). Holt, Rinehart, and Winston, New York, 1969.
46 Stroyan, K., and Luxemburg, W. A. J. "Introduction to the Theory of Infinitesimals." Academic Press, New York, 1976.
47 Tacon, D. G. Weak compactness in normed linear spaces. J. Australian Math. Soc. 14 (1972), 9-12.
48 Zakon, E. A new variant of nonstandard analysis. In "Victoria Symposium on Nonstandard Analysis," (A. E. Hurd and P. A. Loeb, eds.), Lecture Notes in Mathematics, Vol. 369. Springer, Berlin, 1974.
Index
A
Bounded set,44,129 Brownian motion, 213-217
Abian, A., 122 Absolute value, 6 Accumulation point, 41, 113 Alaoglu's theorem, 143 Alexander's theorem, 122 Almost everywhere (with respect to an ultrafilter), 4 Almost everywhere validity, 196 Almost uniform convergence, 198-200 Anderson, R. M.,186,205,214 Archimedean property, 19 Archimedes, ix Arzela-Ascoli theorem, 62-63 Ascoli theorem, 162 Axiom of choice, 220
C
Cauchy sequence in measure, 199-200 Cauchy-Peano existence theorem, 63-65 Cauchy's principle, 100 Chain in an ordered set, 220 Chain rule, 53 Characteristic function, 10 Choice function, 220 Closed map, 121 closed set, 40,110,112 closed subspace, 133 Closure of a set, 41, 113 Cofinite, 3 Cofinite filter, 221 Compact metric space, 129 Compact operator, 137-140,152-153 Compact set, 42,48,120-122 Compactification, 156-159 Compactness theorem, x Complete integration structure, 175 Complete measure, 180 Complete metric space, 127-129 Complete orthonormal set (basis), 149, 151-152 Comprehensive monomorphism, 98-99, 106-107 Concurrent relation, 91, 105 Conditional probability, 208 Constantinescu, C., 159 Continuity,46,48,77, 115, 119, 131 Continuous linear operator, 136
B Banach limit, 102 Banach space, 133 Banach space ultrapower, 154 Base, 110 Behrens, M.,56 Berkeley, G., ix Bernstein, A. R.,92, 152 Bessel's inequality, 150-151 Best approximation theorem, 150 Bilinear form. 154 Bliss's theorem, 60 Bolzano- Weierstrass theorem, 35 Bore1 set, 207 Boundary point, 118-1 19 Bounded linear operator, 135-140 227
228
index
Convergence in measure, 199-200 Convergence, +, 104 Convergence, S, 104 Convex set in a vector space, 147 Cornea, A., 159 Countable additivity, 180
D Dacunha-Castelle D., 154 Daniel, P., 165 Darboux's theorem, 55 De Bruijn, N.G., 92 Differential, 53 total 54 Denumerably comprehensive monomorphism, 98-99,106-107 Derivative, 5 1-52 Dini's theorem, 61 Discrete metric, 123 Distribution function of a random variable, 207
Dual space, 140-144 DuBois-Reymond, 103
E
Eberlein-Šmulian theorem, 143
Egoroff's theorem, 199-200
Element, 71
Enlargement, 90-92
Entity, 71
Equality almost everywhere (a.e.), 84
Equicontinuous family of functions, 162-163
Equiprobability model, 207
Equivalence with respect to an ultrafilter, 84
Equivalent metrics, 132
Equivalent norms, 132
Erdős, P., 92
Euler, L., ix
Evenly continuous family of functions, 162-163
Event in a probability space, 207
Expected value of a random variable, 207
Exponential function, 48-49, 52
Extended real number system, 176
External entity, 95, 97-98
External sentence, 95
Extreme value theorem, 46, 50
F
Family distinguishing points and closed sets, 157
Fatou's lemma, 197
Filter, 3, 219
  cofinite, 3
  countable subbasis, 103
  Fréchet, 3, 221
Finite intersection property (f.i.p.), 93
Finite point, 124, 135
Finitely additive measure, 175
First axiom of countability, 130-131
Formula, 75
  *-transform, 79
  atomic, 75
Fourier coefficient, 149
Fréchet filter, 3, 221
Fubini, G., 201
Fubini property, 201
  internal, 201
  strong, 201
Fubini theorem
  nonstandard, 202-203
  standard, 203-204
Function, 9, 72
  continuous, 46, 48, 77, 115, 119, 131
  *-continuous, 81, 95-96, 131-132
  differentiable, 51-56
  domain, 72
  extension, 72
  increment, 53-54
  injective, 9, 72
  n variables, 72
  one-to-one, 9, 72
  onto, 72
  range, 72
  restriction, 12
  Riemann integrable, 57
  S-continuous, 131-132
  surjective, 72
  uniformly differentiable, 56
Fundamental theorem of calculus, 59
G
Galaxy, 26, 124
  principal, 124, 154
Gonshor, H., 157
Graph, 92
  edge, 92
  infinite, 92
  k-colorable, 92
  vertex, 92
H
Hahn-Banach theorem, 141-142
Halmos, P. R., 92
Hausdorff space, 114, 117
Heine-Borel theorem, 43
Henson, C. W., 109, 144, 154
Hewitt, E., x
Hilbert cube, 154
Hilbert space, 145-154
Hirschfeld, J., 122, 157, 160
Homeomorphism, 115
Hyperfinite integration structure, 168-169, 188-189
Hyperfinite set, 89
Hyperintegers, 29
Hypernatural numbers, 29
Hyperrational numbers, 30
Hyperreal number system, 5
I
Image under a function, 9, 72
Image under a relation, 72
Independent random variables, 208
Individual, 71
Infinite coin tossing, 208-210
Infinite sum theorem, 59
Initial segment of N, 89
Inner product, 145
Inner product space, 145-154
Inner regular measure, 192
Integral
  for a standardization, 170
  with respect to a measure, 183
Integral operator, 138
Integration structure, 166-167
  hyperreal, 166-167
  internal, 166-167
  real, 166-167
Interior point, 118-119
Intermediate value theorem, 46, 50
Internal entity, 95-97
Internal formula, 95
Internal sentence, 95
Internal set, 79
Intersection monad, 93, 103, 106
Inverse function theorem, 54
Inverse image under a relation, 72
K
K-saturation, 70
Keisler, H. J., x-xi, 14, 49, 52, 56, 59, 60, 96, 104-106, 118, 205
Keisler's internal definition principle, 96
Kelly, J. L., xi, 160, 162
Kolmogoroff, A., 209
König's lemma, 94
Krivine, J.-L., 154
Kunen, K., 105
L
Language for superstructures, 74-78
Lattice
  hyperreal, 166-167
  real, 166-167
Lebesgue, H., 164, 180
Lebesgue covering lemma, 132
Lebesgue dominated convergence theorem, 197-198
Lebesgue measurable set, 191
Lebesgue measure, 191
Lebesgue monotone convergence theorem, 197
Leibniz, W. G., ix, 1, 53
Lifting, 186, 200
Lightstone, A. H., 100
Limit, 45
Linear functional, 135, 140-144
Linear operator, 135-140
Linear subspace, 133
Locally compact space, 158-159
Loeb measure, xii
Loeb space, xii
Łoś theorem, 68-69, 86-87
Luxemburg, W. A. J., 70, 92, 93, 100, 104, 105, 106, 108, 118, 122, 131, 154, 157
M
Machover, M., 157
Mapping *, 7
Maximal element in an ordered set, 220
Maximal orthonormal set (basis), 149, 151-152
Maximum of functions, 167
Mean value theorem, 55
Measurable function, 175-178, 182-183
Measurable set, 176, 180-181
Measurable space, 180
Measure, 180
Measure space, 180
Metric space, 123-132
  completion, 128
Minimum of functions, 167
Monad, 26, 106, 111, 124, 135
Monomorphism, 79, 86-88
  strictness, 79
Monotone convergence theorem, 164, 171-172, 178, 197
Moore, L. C., Jr., 109, 144, 154
Mostowski collapsing function, 84-88
N
Near point, 111, 124
Near-standard point, 111, 127
Negative part of function, 167
Neighborhood, 110
Neighborhood base, 110
Neighborhood subbase, 110-111
Neighborhood system, 110
Newton, I., ix
Nonmeasurable set, 188
Non-standard entities, 90
Nonstandard hull, 144
Nonstandard hull of a metric space, 154-156
Nonstandard hull of a normed space, 155-156
Nonstandard number system, 5
Nonstandard summation operation, 36-37
Norm of a linear operator, 135
Norm on a vector space, 133
Normal distribution, 214
Normal space, 118
Null function, 169
Null space of a linear operator, 136
Number
  finite, 25
  infinite, 25, 28
  infinitesimal, 25
  non-standard, 25
  standard, 25
Numbers
  finitely close, 26
  infinitesimally close, 26
  near, 26
O
One point compactification, 159
Open ball, 123
Open covering, 42, 120
Open map, 121
Open set, 40, 110, 112
Open subcovering, 42, 120
Operator of finite rank, 152-153
Ordered n-tuple, 72
Ordered pair, 72
Ordering
  partial, 94
  total, 94
Orthogonality, 146
Orthonormal basis, 149
Orthonormal sequence, 149-152
Orthonormal vectors, 149
Outer regular measure, 192
P
Parallelogram law, 147
Parameter, 12
Parameter set of a stochastic process, 207
Parentheses, 74
Parseval's identities, 151-152
Partially ordered set, 220
Partition, 56-57
  refinement, 57
Path of a stochastic process, 216
Perkins, E. A., 205
Permanence principle, 100-104
Poisson process, 210-213
Polynomially compact operator, 152
Positive linear functional, 166-167
Positive part of function, 167
Power set, 71
  cumulative, 71
Pre-near-standard point, 127
Probability measure, 207
Probability of an event, 206
Probability space, 207
Projection in a Hilbert space, 148
Pseudomonad, 114-115, 118
Q
Q-compactification, 156-159
R
Rado's selection lemma, 94
Random variable, 207
  internal, 214
  *-independent, 214
  S-independent, 215
Rank, 71
Rate of a Poisson process, 210
Reflexive normed space, 143
Regular measure, 192
Regular space, 118
Relation
  *-transform, 9
  binary, 9, 72
  complement, 8
  concurrent, 91
  domain, 9, 72
  finitely satisfiable, 91
  n-ary, 8, 72
  range, 9, 72
  unary, 8
  in V(X), 72
Relational system, 11
Remote point, 111, 190
Riemann integral, 57-60
Riemann integration, 56-60
Riemann sum, 56-57
Riesz representation theorem, 148, 194
Riesz-Fischer theorem, 151
Ring, 175
Robinson, A., ix-x, 32, 42, 63, 70, 78, 91, 92, 100, 102, 103, 109, 118, 120, 131, 136, 138, 152, 157
Robinson's sequential lemma, 101
Robinson's theorem, 42, 120-122, 132
S
Sample space, 206-207
Saturation, 104-108, 117
Scalar multiplication, 133
Schwarz inequality, 145
Scope of a quantifier, 75
Second dual space, 140, 142
Sentence, 75
  atomic, 13
  compound, 13
  simple, 13
  transfer, 21
  *-transform, 20, 79
  truth, 16, 75-76
Separable Hilbert space, 149-150
Separation properties, 113-114
Sequence, 32-36
  bounded, 33-35
  Cauchy, 33-34, 126-127
  convergence, 32, 34-35, 119, 126
  double, 35-36
  limit, 32, 34-36, 126-127
  limit inferior, 37-38
  limit point, 34-35, 126
  limit superior, 37-38
Sequence of functions, 60-63
  convergence, 60-61
  equicontinuous, 62, 63
  uniform convergence, 36, 60-62, 125-126
  uniformly bounded, 62
Sequentially compact space, 130
Series, 36-38
  absolute convergence, 37-38
  convergence, 36-37
  ratio test, 38
Set
  dense, 113
  exhausting, 91-92
  *-finite, 89
  *-open, 111
Sigma-algebra of sets, 180
Simple function, 182
  reduced representation, 182
S-integrability, 186-188
Skolem function, 18
Smith, K. T., 92
Spillover principle, 101
Standard entity, 90
Standard formula, 95
Standard numbers, 7
Standard part, 27, 111
Standard part map, 27, 41, 114
Standard part of a function, 172
Standard part of a set, 41, 107, 117, 122, 132
Standard sentence, 95
Standardization of an integration structure, 170
Stochastic process, 207
Stone, M. H., 179
Stone-Čech compactification, 158-159
Stonian integration structure, 179
Stonian lattice, 179
Stroyan, K., 100, 105
Subbase, 110
Subgraph, 92
sup function, 125
Superstructure, 71
Support of a function, 167
Symbol
  connective, 12, 74
  constant, 12, 74
  equality, 74
  function, 13
  logical, 12, 74-75
  predicate, 75
  quantifier, 12, 74
  relation, 13
  variable, 12, 74
T
Term, 13
  constant, 13
  interpretable, 15
  *-transform, 20
Tonelli's theorem, 204
Topological space, 110
Topological vector space, 145
Topology, 110
  compact convergence, 163
  compact-open, 160-163
  discrete, 111
  finite complement, 112
  half-open interval, 111
  Hausdorff, 114
  jointly continuous, 160-162
  normal, 118
  pointwise convergence, 160-162
  product, 112, 116-119, 121-122
  regular, 118
  relative, 115-116
  stronger, 111
  trivial, 111
  uniform convergence on compact sets, 163
  weak, 116, 118, 142-144
  weaker, 111
Totally bounded metric space, 129
Transfer principle, 21, 67-69, 78-83
  downward, 79, 82-83
Triangle inequality, 123, 133
Trichotomy law, 6
Tychonoff product theorem, 121-122
U
Ultrafilter, 3, 219
  fixed, 3, 88, 221
  free, 3, 8, 221
  principal, 221
Ultrafilter axiom, 4, 220
Ultrapower, x, 5, 84
  bounded, 84
Uniform boundedness theorem, 137
Uniform continuity, 47-48, 125-126
Uniform convergence, 36, 60-62, 125, 126
Upper bound in an ordered set, 220
V
Variable, bound, 75
Variable, free, 75
Vector addition, 133
Vector space, 133
Volume of revolution, 60
W
Wave equation, 65
Weak compactness, 143
Weak operator topology, 154
Weak sequential compactness, 143
Weak topology on normed space, 142-143
Weak* sequential compactness, 144
Weak* topology on normed space, 142-144
Z
Zakon, E., 78, 79
Zero vector, 133
Zorn's lemma, 220
Pure and Applied Mathematics
A Series of Monographs and Textbooks
Editors
Samuel Eilenberg and Hyman Bass
Columbia University, New York
RECENT TITLES
CARL L. DEVITO. Functional Analysis
MICHIEL HAZEWINKEL. Formal Groups and Applications
SIGURDUR HELGASON. Differential Geometry, Lie Groups, and Symmetric Spaces
ROBERT B. BURCKEL. An Introduction to Classical Complex Analysis: Volume I
JOSEPH J. ROTMAN. An Introduction to Homological Algebra
C. TRUESDELL and R. G. MUNCASTER. Fundamentals of Maxwell's Kinetic Theory of a Simple Monatomic Gas: Treated as a Branch of Rational Mechanics
BARRY SIMON. Functional Integration and Quantum Physics
GRZEGORZ ROZENBERG and ARTO SALOMAA. The Mathematical Theory of L Systems
DAVID KINDERLEHRER and GUIDO STAMPACCHIA. An Introduction to Variational Inequalities and Their Applications
H. SEIFERT and W. THRELFALL. A Textbook of Topology; H. SEIFERT. Topology of 3-Dimensional Fibered Spaces
LOUIS HALLE ROWEN. Polynomial Identities in Ring Theory
DONALD W. KAHN. Introduction to Global Analysis
DRAGOŠ M. CVETKOVIĆ, MICHAEL DOOB, and HORST SACHS. Spectra of Graphs
ROBERT M. YOUNG. An Introduction to Nonharmonic Fourier Series
MICHAEL C. IRWIN. Smooth Dynamical Systems
JOHN B. GARNETT. Bounded Analytic Functions
EDUARD PRUGOVEČKI. Quantum Mechanics in Hilbert Space, Second Edition
M. SCOTT OSBORNE and GARTH WARNER. The Theory of Eisenstein Systems
K. A. ZHEVLAKOV, A. M. SLIN'KO, I. P. SHESTAKOV, and A. I. SHIRSHOV. Translated by HARRY SMITH. Rings That Are Nearly Associative
JEAN DIEUDONNÉ. A Panorama of Pure Mathematics: Translated by I. MACDONALD
JOSEPH G. ROSENSTEIN. Linear Orderings
AVRAHAM FEINTUCH and RICHARD SAEKS. System Theory: A Hilbert Space Approach
ULF GRENANDER. Mathematical Experiments on the Computer
HOWARD OSBORN. Vector Bundles: Volume I, Foundations and Stiefel-Whitney Classes
K. P. S. BHASKARA RAO and M. BHASKARA RAO. Theory of Charges
RICHARD V. KADISON and JOHN R. RINGROSE. Fundamentals of the Theory of Operator Algebras, Volume I
EDWARD B. MANOUKIAN. Renormalization
BARRETT O'NEILL. Semi-Riemannian Geometry: With Applications to Relativity
LARRY C. GROVE. Algebra
E. J. MCSHANE. Unified Integration
STEVEN ROMAN. The Umbral Calculus
JOHN W. MORGAN and HYMAN BASS (Eds.). The Smith Conjecture
SIGURDUR HELGASON. Groups and Geometric Analysis: Integral Geometry, Invariant Differential Operators, and Spherical Functions
E. R. KOLCHIN. Differential Algebraic Groups
ISAAC CHAVEL. Eigenvalues in Riemannian Geometry
W. D. CURTIS and F. R. MILLER. Differential Manifolds and Theoretical Physics
JEAN BERSTEL and DOMINIQUE PERRIN. Theory of Codes
A. E. HURD and P. A. LOEB. An Introduction to Nonstandard Real Analysis
IN PREPARATION
RICHARD V. KADISON and JOHN R. RINGROSE. Fundamentals of the Theory of Operator Algebras, Volume II
A. P. MORSE. A Theory of Sets, Second Edition
CHARALAMBOS D. ALIPRANTIS and OWEN BURKINSHAW. Positive Operators
DOUGLAS C. RAVENEL. Complex Cobordism and Stable Homotopy Groups of Spheres