NONSTANDARD METHODS IN STOCHASTIC ANALYSIS AND MATHEMATICAL PHYSICS
This is a volume in PURE AND APPLIED MATHEMATICS, A Series of Monographs and Textbooks. Editors: SAMUEL EILENBERG AND HYMAN BASS. The complete listing of books in this series is available from the Publisher upon request.
Sergio Albeverio
Institute of Mathematics, Ruhr University Bochum, Bochum, West Germany

Raphael Høegh-Krohn
Institute of Mathematics, University of Oslo, Oslo, Norway

Jens Erik Fenstad
Institute of Mathematics, University of Oslo, Oslo, Norway

Tom Lindstrøm
Institute of Mathematics, University of Trondheim-NTH, Trondheim, Norway
1986
ACADEMIC PRESS, INC. Harcourt Brace Jovanovich, Publishers
Orlando San Diego New York Austin Boston London Sydney Tokyo Toronto
COPYRIGHT © 1986 BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.
ACADEMIC PRESS, INC., Orlando, Florida 32887
United Kingdom Edition published by ACADEMIC PRESS INC. (LONDON) LTD., 24-28 Oval Road, London NW1 7DX
Library of Congress Cataloging in Publication Data. Main entry under title: Nonstandard methods in stochastic analysis and mathematical physics. (Pure and applied mathematics ; ) Includes bibliographies and index. 1. Stochastic analysis. 2. Mathematical physics. I. Albeverio, Sergio. II. Series: Pure and applied mathematics (Academic Press) ; QA3.P8 [QA274.2] 510 s [519.2] 85-21442 ISBN 0-12-048860-4 (alk. paper) ISBN 0-12-048861-2 (pbk. : alk. paper)
86 87 88 89    9 8 7 6 5 4 3 2 1
CONTENTS

Preface

Part I. BASIC COURSE

Chapter 1. Calculus
1.1. Infinitesimals
1.2. The Extended Universe
1.3. Limits, Continuity, and the Derivative
1.4. The Integral
1.5. Differential Equations
References

Chapter 2. Topology and Linear Spaces
2.1. Topology and Saturation
2.2. Linear Spaces and Operators
2.3. Spectral Decomposition of Compact Hermitian Operators
2.4. Nonstandard Methods in Banach Space Theory
References

Chapter 3. Probability
3.1. The Loeb Measure
3.2. Hyperfinite Probability Spaces
3.3. Brownian Motion
3.4. Pushing Down Loeb Measures
3.5. Applications to Limit Measures and Measure Extensions
References

Part II. SELECTED APPLICATIONS

Chapter 4. Stochastic Analysis
4.1. The Hyperfinite Itô Integral
    A. Stochastic Integration
    B. Liftings
    C. Itô's Lemma
4.2. General Theory of Stochastic Integration
    A. Internal Martingales
    B. Martingale Integration
4.3. Lifting Theorems
    A. Nonanticipating Liftings
    B. Uniform Liftings
4.4. Representation Theorems
    A. Square S-Integrable Martingales
    B. Nonstandard Representations of Standard Stochastic Integrals
    C. Quadratic Variation and Itô's Lemma
    D. Further Remarks on Nonstandard Representations
    E. Standard Representations of Nonstandard Stochastic Integrals
4.5. Stochastic Differential Equations
    A. Itô Equations with Continuous Coefficients
    B. Itô Equations with Measurable Coefficients
    C. Equations with Coefficients Depending on the Past
4.6. Optimal Stochastic Controls
    A. Optimal Controls: The Markov Case
    B. Girsanov's Formula
    C. Optimal Controls: Dependence on the Past
4.7. Stochastic Integration in Infinite-Dimensional Spaces
    A. Brownian Motion on Hilbert Spaces
    B. Infinite-Dimensional Stochastic Integrals
    C. A Remark on Stochastic Partial Differential Equations
4.8. White Noise and Lévy Brownian Motion
    A. Construction of White Noise
    B. Stochastic Integrals and the Continuity Theorem
    C. Lévy Brownian Motion
    D. Invariance Principles
Notes
References

Chapter 5. Hyperfinite Dirichlet Forms and Markov Processes
5.1. Hyperfinite Quadratic Forms and Their Domains
    A. The Domain
    B. The Resolvent
5.2. Connections to Standard Theory
5.3. Hyperfinite Dirichlet Forms
    A. Hyperfinite Markov Processes and the Definition of Dirichlet Forms
    B. Alternative Descriptions of Dirichlet Forms
    C. Equilibrium Potentials
    D. Fukushima's Decomposition Theorem
    E. The Hyperfinite Feynman-Kac Formula
5.4. Standard Parts and Markov Processes
    A. Exceptional Sets
    B. Strong Markov Processes and Modified Standard Parts
5.5. Regular Forms and Markov Processes
    A. Separation of Compacts
    B. Nearstandardly Concentrated Forms
    C. Quasi-Continuous Extensions
    D. Regular Forms
5.6. Applications to Quantum Mechanics and Stochastic Differential Equations
    A. Hamiltonians and Energy Forms
    B. Standard and Nonstandard Energy Forms
    C. Energy Forms and Markov Processes
References

Chapter 6. Topics in Differential Operators
6.1. A Singular Sturm-Liouville Problem
6.2. Singular Perturbations of Non-Negative Operators
    A. The Computation
    B. Nontriviality
    C. The Main Result
    D. The Case of Standard A's
    E. Translation into Standard Terms
6.3. Point Interactions
    A. Application of the General Theory
    B. An Alternative Approach
6.4. Perturbations by Local Time Functionals
    A. Applications of the General Theory
    B. The Basic Estimate
    C. Models of Polymers
6.5. Applications of Nonstandard Analysis to the Boltzmann Equation
    A. The Equation
    B. Physically Natural Initial Conditions
    C. The Equation with a Truncated Collision Term
    D. The Nonstandard Tool
    E. The Space-Homogeneous Case: Global Existence of Solutions
    F. The Space-Homogeneous Case: Asymptotic Convergence to Equilibrium
    G. The Space-Inhomogeneous Case: A Loeb-Measure Approach
6.6. A Final Remark on the Feynman Path Integral and Other Matters
References

Chapter 7. Hyperfinite Lattice Models
7.1. Stochastic Evolution of Lattice Systems
7.2. Equilibrium Theory
7.3. The Global Markov Property
    A. Hyperfinite Markov Property
    B. Lifting and the Global Markov Property
    C. S-Continuity, Dobrushin's Condition, and the Global Markov Property
    D. Maximal and Minimal Gibbs States
    E. The Case of Unbounded Fiber
7.4. Hyperfinite Models for Quantum Field Theory
    A. The Program
    B. Free Scalar Fields
    C. Interacting Scalar Fields
    D. Some Concluding Remarks on Gauge Fields
7.5. Fields and Polymers
    A. Poisson Fields of Brownian Bridges
    B. The Square of the Free Field as a Local Time Functional
    C. Local Time Representations for Interactions Which Are Functions of Φ²
    D. Φ⁴ and Polymer Measures
    E. @:Oi
References

Index
PREFACE
This is a book on applied nonstandard analysis. It is divided into two parts. The basic course (see accompanying chart) on calculus, topology and linear spaces, and probability gives a complete and self-contained introduction to nonstandard methods. The purpose of this part is expository, but we have tried to enliven the text with some substantial examples in differential equations, linear space theory, and Brownian motion. The second part presents selected applications to stochastic analysis and mathematical physics. Some of the applications of this part are new. Some represent a re-analysis of known results, but from a novel point of view. We hope through this diversity of examples to convince the reader that nonstandard analysis is a viable tool in many parts of the mathematical sciences.

The collaboration that led to this book started in the fall of 1976. At that time P. Loeb and R. Anderson had established the basic facts of nonstandard measure theory and hyperfinite probability theory with applications to Brownian motion and stochastic integration. Our point of entry was somewhat different. In the summer of 1976, E. Nelson had given his AMS lecture on nonstandard theory with application to a singular perturbation problem. This was a field of interest for two of us (S. A. and R. H.-Kr.). They had several “heuristic” calculations and wondered if the “new” numbers of nonstandard theory could make this sound. This led to a joint seminar in 1976-1977 by three of us (S. A., J. E. F., and R. H.-Kr.) and to the subsequent work on singular perturbations (see Chapter 6). The seminar was very lively and we had the active participation of some of our Oslo colleagues, in particular, Bent Birkeland and Dag Normann. In fact, as the reader will see, they have contributed substantially to the book.

At that time the fourth author (T. L.) was a beginning graduate student. We gave him a standard text on stochastic analysis and asked him to do better using the new tools of nonstandard theory. This led to his thesis (see Chapter 4) and to a continued collaboration on the book project. He spent two years in Madison and the contact established with H. J. Keisler has proved important for our project. In fact, Keisler’s monograph An Infinitesimal Approach to Stochastic Analysis, in various stages of publication, has been a source of much inspiration and insight for us.

Many others have contributed to our project. In Bochum, Albeverio has worked in close contact with C. Kessler and A. Stoll; some of their work is reported in this book. L. Arkeryd in Gothenburg and N. Cutland in Hull have been other friends in this enterprise. In Chapter 6 we report on Arkeryd’s new results on the Boltzmann equation and in Chapter 4 we have included some of Cutland’s contributions to control theory. The contacts with Arkeryd and Cutland have been very stimulating and we have learned much from both.
[Chart of Main Dependencies. Blocks shown in the chart: Basic course (Ch. 1-3); 4.1-4.2; 4.3-4.4; 5.1-5.2; 5.3; 5.4-5.6; 6.1; 6.2-6.3; 6.4; 6.5; 6.6; 7.1-7.4; 7.5. The connecting lines of the diagram are not recoverable from this copy.]

There are certain omissions in this chart, which are all rather innocent. A typical example is that the dependence of Theorems 5.3.9 and 6.4.5 on Proposition 4.8.5 is not indicated since the proof of 4.8.5 can be read independently of the rest of Section 4.8. In the other direction, none of the later chapters really depends on Sections 1.5, 2.4, and 3.5, which have been included to enliven the text by illuminating and worthwhile applications of the basic techniques.
The book has taken a number of years to complete and we have benefited through this period from a close interaction with a number of friends and colleagues. In addition to those mentioned above we should like to thank Ph. Blanchard, S. Fajardo, W. Henson, H. Holden, P. Loeb, W. Luxemburg, D. Miller, H. Osswald, E. Perkins, D. Ross, K. Rumberger, R. Seising, K. Stroyan, P. Suppes, T. T. Wu, and R. Živaljević for helpful advice and comments on successive versions of the manuscript. We also want to express our sincere thanks to Signe Cordtsen for her expert typing and our editors at Academic Press for their extensive help in transforming the manuscript into a printed book.
Part I
BASIC COURSE
CHAPTER 1
CALCULUS
The basic insight or, if you prefer, discovery of nonstandard theory is that the geometric line or continuum can support a point set richer than the standard reals. This, among other things, gives us a framework for a geometric analysis of physical phenomena on many scales, and of physical phenomena that are too singular to fit in a direct way into the standard framework.

Let us elaborate a bit. Take the usual axiomatization of the affine plane. There are two basic categories of objects, lines and points. Any point lies on a line, but a line is not a set of points. Axioms with a “true” geometric content allow us to introduce coordinates from a field. But we need the less elementary Archimedean axiom to tell us that the field of an ordered geometry is isomorphic to a subfield of the reals.

Is the Archimedean axiom a “true” geometric fact? What is given in our immediate experience is a limited part of the geometric line with at most a finite number of points marked on it, representing, e.g., the result of some physical measurement. The rest is an extension, ideal or real. In the orthodox view the real numbers are created from the rationals by a limit construction in which we adjoin points representing certain equivalence classes of convergent sequences. The claim is that the “ideal elements” thus created fill up the line. But in this process we do not pay any attention to the rate of convergence. If we do this and add “witnesses” to distinguish between different convergence behavior, we are led to a richer point set on the line. And if we also care about the asymptotic behavior of sequences we are ineluctably led to the full notion of hyperreals or nonstandard reals. Nonstandard analysis asserts that the geometric line can support this richer set. And we shall see how this gives a frame for rescaling arguments, i.e., for a geometric analysis of phenomena on many scales.

Ours is a book on applied nonstandard analysis and we have not set ourselves the task of writing the history of infinitesimals and nonstandard theory. But two names should be mentioned. The “founding” memoir of the nonstandard theory is the 1934 paper by T. A. Skolem, “Über die Nichtcharakterisierbarkeit der Zahlenreihe mittels endlich oder abzählbar unendlich vieler Aussagen mit ausschliesslich Zahlenvariablen” (Skolem, 1934). Skolem’s aim was, as the title indicates, in a certain sense destructive. But a deep insight cannot be only negative. In 1960 Abraham Robinson turned the nonstandard method into a new and efficient technique in mathematics. This was a truly remarkable step forward. He extended Skolem’s analysis from arithmetic to the reals and saw how the nonstandard version could provide a suitable framework for the development of analysis by means of the infinitely small and infinitely large numbers; see his book (Robinson, 1966) and Volume 2 of his collected works (Robinson, 1979).

Some have seen a vindication of the Leibnizian infinitesimals in nonstandard analysis. There certainly are similarities. But one should be careful in claiming that novel developments prove the correctness of older ideas. The history of infinitesimals, however, is fascinating; see Chapter X of Robinson (1966) and the many historical remarks in Laugwitz (1978).
We now turn to the text proper. In Sections 1.1 and 1.2 we lay the foundation of the infinitesimal method. In Sections 1.3 and 1.4 we discuss briefly the basic notions of elementary calculus. And in Section 1.5 we present some more substantial examples from differential equation theory.
1.1. INFINITESIMALS
If you are firmly convinced that infinitesimals, and hence their inverses, the infinitely large, exist, then you need not read this section. You may jump directly to the next section, the extended universe. But if you have some lingering doubts and need to be reassured, we have added some introductory remarks to demonstrate that the real line can be extended to accommodate both the infinitely small and the infinitely large.
The real numbers $\mathbb{R}$ can be constructed from the rationals $\mathbb{Q}$ in various ways. One method is to add to $\mathbb{Q}$ new points to represent limits of convergent sequences of rational numbers. In this process one has to identify sequences converging to the “same point” in $\mathbb{R}$. We shall follow the same procedure in adding infinitesimals and infinitely large numbers. But we shall be more careful in the process of identifying sequences. If we care about rate of convergence and asymptotic properties, we will identify fewer sequences, hence end up with a richer set of points on the line, the nonstandard extension.

We turn to the construction. Let $\mathbb{N}$ be the set of natural numbers, $\mathbb{N} = \{0, 1, 2, \ldots, n, \ldots\}$. Following usual set-theoretic notation let $\mathbb{R}^{\mathbb{N}}$ be the direct product of $\mathbb{N}$ copies of $\mathbb{R}$, i.e., $\mathbb{R}^{\mathbb{N}}$ is the set of all real sequences or, equivalently, the set of all functions $f : \mathbb{N} \to \mathbb{R}$. The direct product has a natural algebraic structure; we add and multiply two sequences by adding and multiplying in each coordinate separately, i.e., let $f, g \in \mathbb{R}^{\mathbb{N}}$; then we define $f + g$ and $f \cdot g$ by the coordinate equations

$$(f + g)(i) = f(i) + g(i), \qquad (f \cdot g)(i) = f(i) \cdot g(i).$$
We are interested in studying the behavior of sequences “in the limit,” i.e., the asymptotic properties. As a first requirement we certainly want to identify sequences which agree on a cofinite set, where a set $E \subseteq \mathbb{N}$ is cofinite if it is of the form $E = \mathbb{N} - \{m_1, \ldots, m_k\}$. Note that every $m > \max\{m_1, \ldots, m_k\}$ belongs to $E$. Two sequences $f, g \in \mathbb{R}^{\mathbb{N}}$ agree on the cofinite set $E$ iff $f(m) = g(m)$ for all $m \in E$. Obviously, if $f, g \in \mathbb{R}^{\mathbb{N}}$ agree on a cofinite set, then they have the same behavior “in the limit.” Conversely, if the set $\{i \in \mathbb{N} \mid f(i) \neq g(i)\}$ is cofinite, then we shall want to distinguish the limiting behavior of $f$ and $g$, i.e., add different points for $f$ and $g$ in the extension of $\mathbb{R}$.

But cofinite sets are not good enough. It is well known from algebra that the direct product construction may introduce zero divisors, i.e., we may have elements $a \neq 0$ and $b \neq 0$ such that nevertheless $a \cdot b = 0$. The natural candidate for zero in $\mathbb{R}^{\mathbb{N}}$ is the sequence $0 = (c_n)_{n \in \mathbb{N}}$, where $c_n = 0$ for all $n \in \mathbb{N}$. Let $f = (a_n)_{n \in \mathbb{N}}$ be any sequence such that $a_n = 0$ iff $n$ is even, and $g = (b_n)_{n \in \mathbb{N}}$ any sequence such that $b_n = 0$ iff $n$ is odd. Then $f \cdot g = 0$, but $f \neq 0$ and $g \neq 0$. Identifying sequences modulo cofinite sets does not help the situation: neither $f$ nor $g$ is equal to the sequence $0$ on a cofinite set.

The existence of zero divisors destroys usual algebra, hence we need something more refined than cofinite sets in our construction. A filter $\mathcal{F}$ is a family of subsets of $\mathbb{N}$ satisfying the following properties:
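(1) $\mathbb{N} \in \mathcal{F}$, $\emptyset \notin \mathcal{F}$.

(2) $A_1, \ldots, A_n \in \mathcal{F} \Rightarrow A_1 \cap \cdots \cap A_n \in \mathcal{F}$.

(3) $A \in \mathcal{F}$, $A \subseteq B \Rightarrow B \in \mathcal{F}$.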
The first property is a condition of nontriviality. The second is the important finite intersection property, which will turn up in a number of connections. The third is a convenient closure property. The class of cofinite sets, Cof, is an example of a filter.

(4) A filter $\mathcal{F}$ on $\mathbb{N}$ is called free if it contains no finite set.

We shall be particularly interested in free filters extending the filter of cofinite sets.

(5) A filter $\mathcal{U}$ is called an ultrafilter over $\mathbb{N}$ if for all $E \subseteq \mathbb{N}$ either $E \in \mathcal{U}$ or $\mathbb{N} - E \in \mathcal{U}$.

Notice that from the nontriviality condition (1) it follows that if $\mathcal{U}$ is an ultrafilter on $\mathbb{N}$ and $E \subseteq \mathbb{N}$, then exactly one of the sets $E$ and $\mathbb{N} - E$ belongs to $\mathcal{U}$. Also observe that if a finite union $E_1 \cup \cdots \cup E_n$ belongs to $\mathcal{U}$, then at least one $E_i$ belongs to $\mathcal{U}$.
1.1.1. ULTRAFILTER THEOREM. There exist free ultrafilters $\mathcal{U}$ on $\mathbb{N}$ extending the filter of cofinite sets.

The proof is an easy exercise in the use of transfinite methods in set theory, namely Zorn’s lemma. Let $A$ be the set of all filters $\mathcal{F}$ extending the filter Cof of cofinite sets. Then $A$ is nonempty and is closed under union of chains. By Zorn’s lemma, any such family has maximal elements. We shall prove that any maximal element of $A$ is an ultrafilter extending Cof.

So let $\mathcal{U}$ be a maximal element of $A$. $\mathcal{U}$ is clearly a filter; let us show that given any $E \subseteq \mathbb{N}$, either $E \in \mathcal{U}$ or $\mathbb{N} - E \in \mathcal{U}$. There are two cases:

(1) Assume that $E \cap F$ is infinite for all $F \in \mathcal{U}$; then the set $\mathcal{V} = \{D \subseteq \mathbb{N} \mid D \supseteq E \cap F \text{ for some } F \in \mathcal{U}\}$ is a filter in $A$ which extends $\mathcal{U}$, and $E \in \mathcal{V}$. By maximality $\mathcal{U} = \mathcal{V}$, thus $E \in \mathcal{U}$.

(2) Assume that $E \cap F$ is finite for some $F \in \mathcal{U}$. If $(\mathbb{N} - E) \cap D$ is finite for some $D \in \mathcal{U}$, then both $(\mathbb{N} - E) \cap D \cap F$ and $E \cap F \cap D$ will be finite, hence $F \cap D$ will also be finite; but this is a contradiction since both $F, D \in \mathcal{U} \supseteq \mathrm{Cof}$ (so that $F \cap D \in \mathcal{U}$, and a filter extending Cof contains no finite set). Thus $(\mathbb{N} - E) \cap D$ is infinite for all $D \in \mathcal{U}$, and we proceed as in case (1) with $\mathbb{N} - E$ in place of $E$.

We can now construct the nonstandard extension. Let $\mathcal{U}$ be a free ultrafilter on $\mathbb{N}$ and introduce an equivalence relation on sequences in $\mathbb{R}^{\mathbb{N}}$ as

$$f \sim_{\mathcal{U}} g \quad \text{iff} \quad \{i \in \mathbb{N} \mid f(i) = g(i)\} \in \mathcal{U}. \tag{6}$$
Sets in the ultrafilter are “large” sets. In fact, there is an obvious one-to-one correspondence between ultrafilters on $\mathbb{N}$ and finitely additive 0-1 valued measures on $\mathbb{N}$; with respect to this correspondence the relation in (6) identifies sequences which are equal “almost everywhere,” i.e., which are equal on a set of measure 1 in the measure associated to $\mathcal{U}$.
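Since $\mathcal{U}$ is free, two sequences that agree on a cofinite set are identified with respect to the relation introduced in (6). But more sequences are identified: since $\mathcal{U}$ is an ultrafilter, either the set $\{i \in \mathbb{N} \mid i \text{ even}\}$ or the set $\{i \in \mathbb{N} \mid i \text{ odd}\}$ belongs to $\mathcal{U}$. Thus either the sequence $(a_n)_{n \in \mathbb{N}}$, where $a_n = 0$ iff $n$ is even, or the sequence $(b_n)_{n \in \mathbb{N}}$, where $b_n = 0$ iff $n$ is odd, will be identified with the zero sequence $0 = (c_n)_{n \in \mathbb{N}}$, where $c_n = 0$ for all $n$. Thus identifying modulo an ultrafilter eliminates the difficulty we noticed above in connection with zero divisors.

$\mathbb{R}^{\mathbb{N}}$ divided out by the equivalence relation $\sim_{\mathcal{U}}$ gives us the nonstandard extension ${}^*\mathbb{R}$, the hyperreals; in symbols,

$${}^*\mathbb{R} = \mathbb{R}^{\mathbb{N}} / \mathcal{U}. \tag{7}$$

If $f \in \mathbb{R}^{\mathbb{N}}$, we denote its image in ${}^*\mathbb{R}$ by $f_{\mathcal{U}}$, and, of course, every element in ${}^*\mathbb{R}$ is of the form $f_{\mathcal{U}}$ for some $f : \mathbb{N} \to \mathbb{R}$. For any real number $r \in \mathbb{R}$ let $\mathbf{r}$ denote the constant function with value $r$ in $\mathbb{R}^{\mathbb{N}}$, i.e., $\mathbf{r}(n) = r$ for all $n \in \mathbb{N}$. We then have a natural embedding

$${}^* : \mathbb{R} \to {}^*\mathbb{R} \tag{8}$$

by setting ${}^*r = \mathbf{r}_{\mathcal{U}}$ for all $r \in \mathbb{R}$. We must now lift the structure of $\mathbb{R}$ to the hyperreals ${}^*\mathbb{R}$.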
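The zero-divisor example discussed above can be checked on a finite window of indices. The following is only a sketch: the names `f`, `g`, and `WINDOW` are illustrative choices, and a finite window can exhibit the pattern but not replace the argument over all of N.

```python
# f vanishes exactly at the even indices, g exactly at the odd ones.
def f(n: int) -> int:
    return 0 if n % 2 == 0 else 1

def g(n: int) -> int:
    return 0 if n % 2 == 1 else 1

WINDOW = range(20)  # finite stand-in for N; an assumption of this sketch

# Coordinatewise product: zero at every index, although f and g are not.
product = [f(n) * g(n) for n in WINDOW]

# The zero sets are complementary (the evens and the odds), so an
# ultrafilter contains exactly one of them, and exactly one of f, g
# is identified with the zero sequence.
zeros_of_f = {n for n in WINDOW if f(n) == 0}
zeros_of_g = {n for n in WINDOW if g(n) == 0}

print(all(p == 0 for p in product))
print(zeros_of_f.isdisjoint(zeros_of_g), zeros_of_f | zeros_of_g == set(WINDOW))
```

Neither zero set is cofinite, which is why passing to the quotient by cofinite agreement leaves the zero divisors in place.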
REMARK. What happens if we use in (7) an ultrafilter $\mathcal{U}_0$ that is not free? If $\mathcal{U}_0$ is not free, i.e., does not extend the filter of cofinite sets, then $\mathcal{U}_0$ must contain some finite set; in fact, it is easily seen that in this case there must exist some number $n_0 \in \mathbb{N}$ such that

$$\mathcal{U}_0 = \{E \subseteq \mathbb{N} \mid n_0 \in E\}.$$

In this case the construction of ${}^*\mathbb{R}$ collapses,

$${}^*\mathbb{R} = \mathbb{R}^{\mathbb{N}} / \mathcal{U}_0 = \mathbb{R}.$$
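On a finite index set every ultrafilter contains a finite set, so by the remark above each one should have the form {E : n0 ∈ E}. The sketch below confirms this by brute force on a three-element set. The set `X` and all function names are illustrative assumptions, and the finite case says nothing about the existence of free ultrafilters on N, which rests on Zorn's lemma.

```python
from itertools import combinations

X = frozenset(range(3))  # small finite index set, chosen for this sketch

def powerset(items):
    """All subsets of `items`, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

SUBSETS = powerset(X)  # the 8 subsets of X

def is_filter(fam):
    # (1) nontriviality
    if frozenset() in fam or X not in fam:
        return False
    # (2) closed under pairwise, hence finite, intersections
    if any(a & b not in fam for a in fam for b in fam):
        return False
    # (3) closed under supersets
    return all(b in fam for a in fam for b in SUBSETS if a <= b)

def is_ultrafilter(fam):
    # (5) every subset or its complement belongs to the family
    return is_filter(fam) and all(e in fam or (X - e) in fam for e in SUBSETS)

def principal(n0):
    """The family {E : n0 in E}."""
    return frozenset(e for e in SUBSETS if n0 in e)

# Enumerate all 2**8 families of subsets and keep the ultrafilters.
ultrafilters = [fam for fam in powerset(SUBSETS) if is_ultrafilter(fam)]
print(len(ultrafilters), set(ultrafilters) == {principal(n) for n in X})
```

Taking the quotient by `principal(n0)` identifies two sequences exactly when they agree at the index n0, which is the collapse to R described in the remark.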
As an algebraic structure, $\mathbb{R}$ is a complete ordered field, i.e., a structure of the form

$$\langle \mathbb{R}, +, \cdot, <, 0, 1 \rangle, \tag{9}$$

where $\mathbb{R}$ is the set of elements of the structure, $+$ and $\cdot$ are the binary operations of addition and multiplication, $<$ is the ordering relation, and 0 and 1 are two distinguished elements of the domain. And it is complete in the sense that every nonempty set bounded from above has a least upper bound.

The $*$-embedding of (8) sends 0 to ${}^*0 = \mathbf{0}_{\mathcal{U}}$ and 1 to ${}^*1 = \mathbf{1}_{\mathcal{U}}$. We must lift the operations and relations of $\mathbb{R}$ to ${}^*\mathbb{R}$. We get the clue from (6), which tells us when two elements $f_{\mathcal{U}}$ and $g_{\mathcal{U}}$ of ${}^*\mathbb{R}$ are equal:

$$f_{\mathcal{U}} = g_{\mathcal{U}} \quad \text{iff} \quad \{i \in \mathbb{N} \mid f(i) = g(i)\} \in \mathcal{U}. \tag{10}$$
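In a similar way we extend $<$ to ${}^*\mathbb{R}$ by setting for arbitrary $f_{\mathcal{U}}$ and $g_{\mathcal{U}}$ in ${}^*\mathbb{R}$:

$$f_{\mathcal{U}} < g_{\mathcal{U}} \quad \text{iff} \quad \{i \in \mathbb{N} \mid f(i) < g(i)\} \in \mathcal{U}. \tag{11}$$

With this definition of $<$ in ${}^*\mathbb{R}$ we easily show that the extended domain ${}^*\mathbb{R}$ is linearly ordered. As an example we verify transitivity of $<$ in ${}^*\mathbb{R}$: let $f_{\mathcal{U}} < g_{\mathcal{U}}$ and $g_{\mathcal{U}} < h_{\mathcal{U}}$, i.e., $D_1 = \{i \in \mathbb{N} \mid f(i) < g(i)\} \in \mathcal{U}$ and $D_2 = \{i \in \mathbb{N} \mid g(i) < h(i)\} \in \mathcal{U}$. By the finite intersection property (2), $D_1 \cap D_2 \in \mathcal{U}$. If $i \in D_1 \cap D_2$, then $f(i) < g(i)$ and $g(i) < h(i)$; hence by transitivity of $<$ in $\mathbb{R}$, $f(i) < h(i)$. Thus $D_1 \cap D_2 \subseteq \{i \in \mathbb{N} \mid f(i) < h(i)\}$, so by the closure property (3) the latter set belongs to $\mathcal{U}$, i.e., $f_{\mathcal{U}} < h_{\mathcal{U}}$. In a similar way, using the ultrafilter property (5), one can prove that given any $f_{\mathcal{U}}, g_{\mathcal{U}} \in {}^*\mathbb{R}$, then either $f_{\mathcal{U}} < g_{\mathcal{U}}$, or $f_{\mathcal{U}} = g_{\mathcal{U}}$, or $f_{\mathcal{U}} > g_{\mathcal{U}}$.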
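REMARK. The relation $<$ on ${}^*\mathbb{R}$ introduced in (11) extends the relation $<$ on $\mathbb{R}$, i.e., given any $r_1, r_2 \in \mathbb{R}$ we see that $r_1 < r_2$ in $\mathbb{R}$ iff ${}^*r_1 < {}^*r_2$ in ${}^*\mathbb{R}$. This is the reason why we have not decorated the equality introduced in (10) and the ordering in (11) with an asterisk.

We now have a linear order on ${}^*\mathbb{R}$ and can verify that ${}^*\mathbb{R}$ contains infinitesimals and infinite numbers. A (positive) infinitesimal $\delta$ in ${}^*\mathbb{R}$ is an element $\delta \in {}^*\mathbb{R}$ such that ${}^*0 < \delta < {}^*r$ for all $r > 0$ in $\mathbb{R}$. Infinitesimals exist; let $f(n) = 1/n$ for $n \in \mathbb{N}$ (the value at $n = 0$ may be chosen arbitrarily, since a single coordinate does not affect the class $f_{\mathcal{U}}$). Then $\delta = f_{\mathcal{U}}$ is a positive infinitesimal. Also notice that $f'(n) = 1/n^2$ introduces another infinitesimal $\delta'$ and that $\delta' < \delta$ in ${}^*\mathbb{R}$. In the same way $g(n) = n$ and $g'(n) = n^2$ introduce infinite numbers, $\omega = g_{\mathcal{U}}$ and $\omega' = g'_{\mathcal{U}}$, and we have that $\omega < \omega'$ in ${}^*\mathbb{R}$.

It remains to extend the operations $+$ and $\cdot$ to ${}^*\mathbb{R}$. Looking back to (10) and (11) we have nothing to do but to set

$$\begin{aligned} f_{\mathcal{U}} + g_{\mathcal{U}} = h_{\mathcal{U}} \quad &\text{iff} \quad \{i \in \mathbb{N} \mid f(i) + g(i) = h(i)\} \in \mathcal{U}, \\ f_{\mathcal{U}} \cdot g_{\mathcal{U}} = h_{\mathcal{U}} \quad &\text{iff} \quad \{i \in \mathbb{N} \mid f(i) \cdot g(i) = h(i)\} \in \mathcal{U}. \end{aligned} \tag{12}$$

With these definitions we can prove easily that ${}^*\mathbb{R}$ is an ordered field extension of $\mathbb{R}$. And these definitions introduce an honest algebra on the infinitesimals and on the infinitely large numbers. As an exercise the reader may wish to verify that if $f_{\mathcal{U}} < g_{\mathcal{U}}$ and ${}^*0 < h_{\mathcal{U}}$, then $f_{\mathcal{U}} \cdot h_{\mathcal{U}} < g_{\mathcal{U}} \cdot h_{\mathcal{U}}$. One should also notice that for the infinitesimals $\delta$ and $\delta'$ and the infinite $\omega$ and $\omega'$ introduced above, we have, e.g., $\omega' = \omega^2$, $\delta \cdot \omega = 1$, $\delta' \cdot \omega$ is infinitesimal, and $\delta \cdot \omega'$ is infinite. (You may directly verify from the definitions that $\delta \cdot \omega'$ is infinite, but you can also calculate $\delta \cdot \omega' = (1/\omega) \cdot \omega^2 = \omega$.) Thus the infinitely small and the infinitely large have a decent arithmetic.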
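The identities just listed all hold coordinatewise on a cofinite set of indices, so they hold for every free ultrafilter. A minimal sketch of that check, on representative sequences and a finite window, follows. It rests on two assumptions that are not in the text: the index is shifted by one so that 1/(n + 1) is defined at n = 0, and the helper `holds_on_window` only tests finitely many indices, so it illustrates the claims rather than proving them.

```python
from fractions import Fraction

# Representatives, with the index shifted by one (assumption of this sketch).
def delta(n):        # infinitesimal: 1/(n+1)
    return Fraction(1, n + 1)

def delta_prime(n):  # smaller infinitesimal: 1/(n+1)^2
    return Fraction(1, (n + 1) ** 2)

def omega(n):        # infinite number: n+1
    return Fraction(n + 1)

def omega_prime(n):  # larger infinite number: (n+1)^2
    return Fraction((n + 1) ** 2)

def mul(f, g):
    """Coordinatewise product, as in definition (12)."""
    return lambda n: f(n) * g(n)

def holds_on_window(pred, start, stop):
    """True if pred(n) holds for every index n with start <= n < stop."""
    return all(pred(n) for n in range(start, stop))

checks = {
    "omega' = omega^2": holds_on_window(
        lambda n: omega_prime(n) == mul(omega, omega)(n), 0, 500),
    "delta * omega = 1": holds_on_window(
        lambda n: mul(delta, omega)(n) == 1, 0, 500),
    "delta * omega' = omega": holds_on_window(
        lambda n: mul(delta, omega_prime)(n) == omega(n), 0, 500),
    # Fails at n = 0 (both values are 1) but holds from n = 1 on:
    "delta' < delta": holds_on_window(
        lambda n: delta_prime(n) < delta(n), 1, 500),
    # delta lies below the real 1/100 from index 100 on:
    "delta < 1/100": holds_on_window(
        lambda n: delta(n) < Fraction(1, 100), 100, 500),
}
for name, ok in checks.items():
    print(name, ok)
```

The comparison of δ' with δ fails at the single index 0, which is harmless precisely because a free ultrafilter contains every cofinite set.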
The way we extended the particular operators $+$ and $\cdot$ and the particular relation $<$ from $\mathbb{R}$ to ${}^*\mathbb{R}$ can be used to extend any function and relation on $\mathbb{R}$ to ${}^*\mathbb{R}$. Let $F$ be an $n$-ary function on $\mathbb{R}$, i.e.,

$$F : \underbrace{\mathbb{R} \times \cdots \times \mathbb{R}}_{n \text{ times}} \to \mathbb{R}.$$

We introduce as in (12) the extended function ${}^*F$ by the equivalence

$${}^*F(f^1_{\mathcal{U}}, \ldots, f^n_{\mathcal{U}}) = g_{\mathcal{U}} \quad \text{iff} \quad \{i \in \mathbb{N} \mid F(f^1(i), \ldots, f^n(i)) = g(i)\} \in \mathcal{U}. \tag{13}$$

The reader may want to verify that ${}^*F$ is a function and that ${}^*F$ really extends $F$, i.e., ${}^*F({}^*r_1, \ldots, {}^*r_n) = {}^*r$ iff $F(r_1, \ldots, r_n) = r$. In the same way we extend any $n$-ary relation $S$ on $\mathbb{R}$ to a relation ${}^*S$ on ${}^*\mathbb{R}$. Note that since a subset $E \subseteq \mathbb{R}$ corresponds to a unary relation, we have an extension ${}^*E$ characterized by the condition

$$f_{\mathcal{U}} \in {}^*E \quad \text{iff} \quad \{i \in \mathbb{N} \mid f(i) \in E\} \in \mathcal{U}. \tag{14}$$
Thus if $E = (0, 1]$, then ${}^*E$ as a subset of ${}^*\mathbb{R}$ will have every positive infinitesimal as an element, but not ${}^*0$, a fact which can be read off immediately from condition (14). We shall later return to some intuitive characterizations of open, closed, and compact sets in $\mathbb{R}$ in terms of their $*$-extension to ${}^*\mathbb{R}$. But first a few elementary observations on the $*$-extension of subsets of $\mathbb{R}$: ${}^*\emptyset$ is the empty set in ${}^*\mathbb{R}$. If $E \subseteq \mathbb{R}$, then ${}^*r \in {}^*E$ for all $r \in E$, but in general (see the example $E = (0, 1]$ above) ${}^*E$ will contain elements not of the form ${}^*r$ for any $r \in \mathbb{R}$. Furthermore $*$ is a Boolean homomorphism in the sense that ${}^*(E_1 \cup E_2) = {}^*E_1 \cup {}^*E_2$ and ${}^*(E_1 \cap E_2) = {}^*E_1 \cap {}^*E_2$ for arbitrary sets $E_1, E_2 \subseteq \mathbb{R}$. Finally, we note that ${}^*E_1 = {}^*E_2$ iff $E_1 = E_2$, and ${}^*r \in {}^*E$ iff $r \in E$. Note that $*$ is not a $\sigma$-homomorphism; see the discussion of Loeb measure in Section 3.1. And, of course, $*$ is not onto since there exist sets in ${}^*\mathbb{R}$ that are not of the form ${}^*E$ for some $E \subseteq \mathbb{R}$.

Before proceeding we need to discuss the important concept of standard part. By virtue of (13) the absolute-value function has an extension to ${}^*\mathbb{R}$ that we will denote by the usual $|\cdot|$ rather than the “correct” ${}^*|\cdot|$. An element $x \in {}^*\mathbb{R}$ is called finite if $|x| < {}^*r$ for some $r > 0$. As we shall see in a moment, every finite $x \in {}^*\mathbb{R}$ is infinitely close to some (unique) $r \in \mathbb{R}$ in the sense that $|x - {}^*r|$ is either 0 or positively infinitesimal in ${}^*\mathbb{R}$. This unique $r$ is called the standard part of $x$ and is denoted by $\operatorname{st}(x)$ or ${}^{\circ}x$.

The proof of existence of the standard part is rather simple. Let $x \in {}^*\mathbb{R}$ be finite. Let $D_1$ be the set of $r \in \mathbb{R}$ such that ${}^*r < x$ and $D_2$ the set of $r' \in \mathbb{R}$ such that $x < {}^*r'$. The pair $(D_1, D_2)$ forms a Dedekind cut in $\mathbb{R}$, hence determines a unique $r_0 \in \mathbb{R}$. A simple argument shows that $|x - {}^*r_0|$ is infinitesimal, i.e., $\operatorname{st}(x) = r_0$.
REMARK. The reader should notice that we did not mess up the above argument with irrelevant details about the construction of ${}^*\mathbb{R}$. It is entirely based upon the properties of the $*$-embedding that we have already established. This is an important point to which we shall return. The properties of the nonstandard extension ${}^*\mathbb{R}$ can be pinned down in a few fundamental principles; see the discussion in Section 1.2. We do the same when we characterize $\mathbb{R}$ as a complete ordered field extension of the rationals. And neither for $\mathbb{R}$ nor for ${}^*\mathbb{R}$ are the basic properties of the extension dependent upon the particular way they are constructed. For $\mathbb{R}$ we may use either Dedekind cuts or Cauchy sequences. For ${}^*\mathbb{R}$ we can use the ultrafilter construction as above or the Gödel completeness theorem for first-order logic.
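The standard part map is well behaved. If $r \in \mathbb{R}$, then $\operatorname{st}({}^*r) = r$; if $x, y \in {}^*\mathbb{R}$ are both finite, then $\operatorname{st}(x + y) = \operatorname{st}(x) + \operatorname{st}(y)$ and $\operatorname{st}(x \cdot y) = \operatorname{st}(x) \cdot \operatorname{st}(y)$; and if $\operatorname{st}(y) \neq 0$, then $\operatorname{st}(x/y) = \operatorname{st}(x)/\operatorname{st}(y)$.

We shall now use the nonstandard extension ${}^*\mathbb{R}$ and the standard part map to characterize some topological notions in $\mathbb{R}$. By the monad of a real number $r \in \mathbb{R}$, denoted by $\mu(r)$, we understand the set of all $x \in {}^*\mathbb{R}$ such that $\operatorname{st}(x) = r$.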
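As a worked illustration of the rules for the standard part map, consider a quotient of two infinite numbers that is itself finite. This is a sketch under the assumption that $\omega$ is a positive infinite hyperreal and $\delta = 1/\omega$ its infinitesimal inverse, as for the sequences $g(n) = n$ and $f(n) = 1/n$ used earlier; real numbers are identified with their images under the $*$-embedding.

```latex
\begin{align*}
x &= \frac{2\omega + 1}{\omega + 3}
   = \frac{2 + \delta}{1 + 3\delta},
   \qquad \delta = 1/\omega, \\
\operatorname{st}(2 + \delta)
  &= \operatorname{st}(2) + \operatorname{st}(\delta) = 2 + 0 = 2, \\
\operatorname{st}(1 + 3\delta)
  &= \operatorname{st}(1) + \operatorname{st}(3\delta) = 1 + 0 = 1 \neq 0
  \qquad \text{(since $3\delta$ is infinitesimal)}, \\
\operatorname{st}(x)
  &= \frac{\operatorname{st}(2 + \delta)}{\operatorname{st}(1 + 3\delta)}
   = \frac{2}{1} = 2 .
\end{align*}
```

Dividing numerator and denominator by $\omega$ first is essential: the sum and quotient rules apply only to finite arguments, and $2\omega + 1$ and $\omega + 3$ are not finite.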
1.1.2. PROPOSITION. Let $E \subseteq \mathbb{R}$; then

(i) $E$ is open iff $\mu(r) \subseteq {}^*E$ for all $r \in E$;
(ii) $E$ is closed iff $\operatorname{st}(x) \in E$ for all finite $x \in {}^*E$; and
(iii) $E$ is compact iff for all $x \in {}^*E$, $\operatorname{st}(x)$ exists and is an element of $E$.

Here (ii) is the dual form of (i) and (iii) follows from (ii), recalling that compact in $\mathbb{R}$ means closed and bounded. Since $\mu(r)$ is contained in the $*$-extension of every standard neighborhood of $r$, one part of (i) is immediate. Assume now that $E$ is not open; then there exists some $r \in E$ such that every neighborhood of $r$ in $\mathbb{R}$ intersects $\mathbb{R} - E$. We shall produce an element $x \in \mu(r)$ such that $x \notin {}^*E$: for each $n \in \mathbb{N}$ pick some $r_n \in \mathbb{R} - E$ such that $|r - r_n| < 1/n$. Let $f : \mathbb{N} \to \mathbb{R}$ be defined by $f(n) = r_n$. From (14) and the definition of $f$, we see that $f_{\mathcal{U}} \notin {}^*E$. But $f_{\mathcal{U}} \in \mu(r)$ since by construction $|{}^*r - f_{\mathcal{U}}| < \delta$, where $\delta$ is the infinitesimal associated with the sequence $(1/n)_{n \in \mathbb{N}}$.

We hope that the reader by now has been somewhat reassured with respect to his or her doubts about the existence of the infinitely small and the infinitely large. Let us conclude by proving a general result about ultrafilter extensions which draws together in simple terms the many separate insights scattered over the preceding pages. The result we are hinting at is the general transfer principle of nonstandard analysis.

We have already noticed that $\mathbb{R}$ and ${}^*\mathbb{R}$ are similar in many respects; e.g., both are linearly ordered fields. We shall now make precise in which sense $\mathbb{R}$ and ${}^*\mathbb{R}$ are similar; i.e., we shall specify in more detail which properties of $\mathbb{R}$ transfer to ${}^*\mathbb{R}$. We consider the reals as a structure

$$\langle \mathbb{R}, +, \cdot, <, |\cdot|, 0, 1 \rangle, \tag{15}$$
where, in addition to the information in (9), we have added the absolute value that defines the metric on $\mathbb{R}$. Of course, $|\cdot|$ is definable in terms of the other entities in (15), but it makes things a bit easier to include it explicitly in the specification.

The structure $\mathbb{R}$ has an associated simple language $L(\mathbb{R})$ that can be used to describe the kind of properties of $\mathbb{R}$ that are preserved under the embedding ${}^* : \mathbb{R} \to {}^*\mathbb{R}$. The elementary formulas of $L(\mathbb{R})$ are expressions of the form

(i) $t_1 + t_2 = t_3$,
(ii) $t_1 \cdot t_2 = t_3$,
(iii) $|t_1| = t_2$,
(iv) $t_1 = t_2$,
(v) $t_1 < t_2$,
(vi) $t_1 \in X$,

where $t_1, t_2, t_3$ are either the constants 0 or 1 or a variable for an arbitrary number $r \in \mathbb{R}$, and $X$ is a variable for a subset $A \subseteq \mathbb{R}$. From the elementary formulas we generate the class of all formulas or expressions of $L(\mathbb{R})$ using the propositional connectives

$\wedge$ (and), $\vee$ (or), $\neg$ (not), $\to$ (if, then)

and the number quantifiers

$\forall x$ (for all $x$ in $\mathbb{R}$), $\exists x$ (for some $x$ in $\mathbb{R}$)

by the rules:

(vii) If $\Phi$ and $\Psi$ are formulas of $L(\mathbb{R})$, then $\Phi \wedge \Psi$, $\Phi \vee \Psi$, $\neg\Phi$, and $\Phi \to \Psi$ are formulas of $L(\mathbb{R})$.
(viii) If $\Phi$ is a formula of $L(\mathbb{R})$ and $x$ is a number variable, then $\forall x \Phi$ and $\exists x \Phi$ are formulas of $L(\mathbb{R})$.
Whenever necessary, we add parentheses to formulas of $L(\mathbb{R})$ to avoid possible ambiguities. The language $L(\mathbb{R})$ is basically a first-order language; i.e., we allow number quantification but not set quantification. The reader with some knowledge of logic may also be puzzled by our somewhat restricted notion of elementary formula; e.g., $(x + 1) \cdot y = z$ is not elementary in our sense. However, we can easily write a formula in $L(\mathbb{R})$ capturing the meaning of this expression, namely

$$\exists x_1 [x + 1 = x_1 \wedge x_1 \cdot y = z].$$
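The rewriting of a nested term into elementary formulas can be mechanized by introducing one fresh variable per inner operation. The sketch below is an illustration only: the tuple encoding of terms, the function names, and the fresh names `x1`, `x2`, ... are all assumptions of this example (in particular, it does not check that a fresh name is unused in the input).

```python
from itertools import count

def flatten_equation(term, result):
    """Rewrite `term = result` as (bound_variables, elementary_formulas).

    A term is either a string (a variable or one of the constants 0, 1)
    or a triple (op, left, right) with op in {"+", "·"}.
    """
    fresh = (f"x{i}" for i in count(1))
    bound, atoms = [], []

    def name_of(t):
        # Variables and constants name themselves.
        if isinstance(t, str):
            return t
        # A compound subterm gets a fresh variable naming its value.
        op, left, right = t
        a, b = name_of(left), name_of(right)
        v = next(fresh)
        bound.append(v)
        atoms.append(f"{a} {op} {b} = {v}")
        return v

    if isinstance(term, str):
        atoms.append(f"{term} = {result}")
    else:
        op, left, right = term
        a, b = name_of(left), name_of(right)
        atoms.append(f"{a} {op} {b} = {result}")
    return bound, atoms

def render(bound, atoms):
    """Prefix the conjunction of atoms with one quantifier per fresh variable."""
    prefix = "".join(f"∃{v}" for v in bound)
    return f"{prefix}[{' ∧ '.join(atoms)}]"

# (x + 1) · y = z
formula = render(*flatten_equation(("·", ("+", "x", "1"), "y"), "z"))
print(formula)
```

Each atom produced has one of the elementary shapes (i), (ii), or (iv), so the output is a formula of the restricted language with the same meaning as the original equation.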
This trick is universal and shows that L(R) has the intended expressive power. We give a few examples: in the language L(R) we can write down conditions which express that < is a linear ordering:
transitive: $\forall x\, \forall y\, \forall z\, [[x < y \wedge y < z] \to x < z]$
irreflexive: $\forall x\, \neg[x < x]$
linear: $\forall x\, \forall y\, [x < y \vee x = y \vee y < x]$.
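As one more illustration of the same kind (our own example, not taken from the text), the following sketch writes "X is bounded above" as a formula of L(R). The name $\mathrm{Bdd}$ is an assumption introduced here; note that $\le$ has to be spelled out, since only $<$ and $=$ occur in elementary formulas, and that only number quantifiers are used, with $X$ a free set variable.

```latex
\[
  \mathrm{Bdd}(X) \;=_{\mathrm{def}}\;
  \exists b\, \forall x\, [\, x \in X \to (x < b \vee x = b) \,]
\]
```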
Let us also write down a formula $\Phi(X)$ which expresses that $X$ is an open set:
\[ \Phi(X) =_{\mathrm{def}} \forall y\, [\, y \in X \to \exists z\, [\, z > 0 \wedge \forall y_1 [\, |y - y_1| < z \to y_1 \in X \,]]]. \tag{16} \]
A formula $\Phi$ of L(R) is in general of the form
\[ \Phi = \Phi(x_1, \ldots, x_m, X_1, \ldots, X_n), \]
where $x_1, \ldots, x_m$ are the free number variables of $\Phi$, i.e., variables not bound by a quantifier $\forall$ or $\exists$, and $X_1, \ldots, X_n$ are the (free) set variables of $\Phi$. Every formula in L(R) has an immediate meaning or interpretation in the structure R; e.g., let $\Phi(X)$ be the formula in (16) and let $A \subseteq \mathbb{R}$; then $\Phi(A)$ expresses the fact that A is open in R. But formulas of L(R) can also be interpreted in the extended structure *R. We use (12) to interpret elementary statements of the previous form (i) and (ii). We use the general format of (13) to interpret (iii), and we use (10) and (11) to interpret (iv) and (v), respectively. Finally, we use (14) to interpret (vi). And since the logical symbols have a fixed meaning over any domain, this means that every formula $\Phi$ in L(R) has an interpretation in both R and *R. Let us once more return to (16); given $A \subseteq \mathbb{R}$ we have already explained what we mean by $\Phi(A)$. Now we can also interpret $\Phi({}^*A)$ over *R according to the specification given in this paragraph.
REMARKS. (1) As already remarked, not every subset of *R is of the form *A for some subset $A \subseteq \mathbb{R}$. If $A_0$ is a subset of *R that is not of the form *A for some $A \subseteq \mathbb{R}$, then we have not given any meaning to $\Phi(A_0)$; this follows from the fact that (14) applies only to sets in *R of the form *A. (2) Using the general format of (13) we could also have added arbitrary function parameters to our L(R) formulas.
We can now state the main result about ultrafilter extensions, which has the general transfer principle as an immediate corollary.
1.1.3. THEOREM OF ŁOŚ. Let $\Phi(X_1, \ldots, X_m, x_1, \ldots, x_n)$ be a formula of L(R). Then for any $A_1, \ldots, A_m \subseteq \mathbb{R}$ and $f^1_{\mathcal{U}}, \ldots, f^n_{\mathcal{U}} \in {}^*\mathbb{R}$,
\[ \Phi({}^*A_1, \ldots, {}^*A_m, f^1_{\mathcal{U}}, \ldots, f^n_{\mathcal{U}}) \quad\text{iff}\quad \{\, i \in \mathbb{N} \mid \Phi(A_1, \ldots, A_m, f^1(i), \ldots, f^n(i)) \,\} \in \mathcal{U}. \tag{17} \]
The proof is by induction on the number of logical symbols in $\Phi$. If $\Phi$ has no logical symbols, it is an elementary formula of the form (i)-(vi), and (17) then reduces to one of (12), (13), (10), (11), or (14). If $\Phi$ contains logical symbols, then $\Phi$ is of the form $\Phi_1 \wedge \Phi_2$, $\Phi_1 \vee \Phi_2$, $\neg\Phi_1$, $\Phi_1 \to \Phi_2$, $\forall x\,\Phi_1$, or $\exists x\,\Phi_1$. The verification of (17) is, by induction, in each case reduced to an elementary property of the ultrafilter $\mathcal{U}$. In fact, we have done bits of the proof in verifying the transitivity of < in *R; see the paragraph following (11). Let us, however, comment briefly on a few of the cases. For example, if $\Phi = \Phi_1 \wedge \Phi_2$, (17) follows from the finite intersection property of the ultrafilter. The case $\Phi = \neg\Phi_1$ uses in an essential way that $\mathcal{U}$ is an ultrafilter, namely, that $\mathbb{N} - E \in \mathcal{U}$ iff $E \notin \mathcal{U}$; the reader should recall our discussion of zero divisors above. Quantifiers offer no special difficulties, but let us spell out the case $\Phi = \exists x\,\Phi_1$ in detail. For simplicity let $\Phi$ have one free variable; we shall prove
\[ \Phi(f_{\mathcal{U}}) \quad\text{iff}\quad \{\, i \in \mathbb{N} \mid \Phi(f(i)) \,\} \in \mathcal{U}, \]
where $\Phi(f_{\mathcal{U}})$ is of the form $\exists x\,\Phi_1(x, f_{\mathcal{U}})$. Now $\Phi(f_{\mathcal{U}})$ is true in *R iff there is some $g_{\mathcal{U}} \in {}^*\mathbb{R}$ such that $\Phi_1(g_{\mathcal{U}}, f_{\mathcal{U}})$ is true in *R. By the induction hypothesis this means that $\{ i \in \mathbb{N} \mid \Phi_1(g(i), f(i)) \} \in \mathcal{U}$. But if $\Phi_1(g(i), f(i))$ is true in R, then $\exists x\,\Phi_1(x, f(i))$ is also true in R, i.e., $\{ i \in \mathbb{N} \mid \Phi_1(g(i), f(i)) \} \subseteq \{ i \in \mathbb{N} \mid \exists x\,\Phi_1(x, f(i)) \}$. From the property (3) of filters it follows that $\{ i \in \mathbb{N} \mid \Phi(f(i)) \} \in \mathcal{U}$. To prove the converse, assume that $\{ i \in \mathbb{N} \mid \Phi(f(i)) \} \in \mathcal{U}$. For each $i$ in this set choose some $a_i \in \mathbb{R}$ such that $\Phi_1(a_i, f(i))$. Let $g \in \mathbb{R}^{\mathbb{N}}$ be a function such that $g(i) = a_i$ for all $i$ in the set we started with and $g(i) = a'$
otherwise, where $a'$ is some arbitrary element of R. Then we have $\Phi_1(g(i), f(i))$ for a set of indices in $\mathcal{U}$; hence by the induction hypothesis we have $\Phi_1(g_{\mathcal{U}}, f_{\mathcal{U}})$ in *R, i.e., we have $\Phi(f_{\mathcal{U}}) = \exists x\,\Phi_1(x, f_{\mathcal{U}})$ in *R. The theorem of Łoś has the transfer principle as an immediate corollary.
1.1.4. TRANSFER PRINCIPLE. Let $\Phi(X_1, \ldots, X_m, x_1, \ldots, x_n)$ be a formula of the language L(R). Then for any $A_1, \ldots, A_m \subseteq \mathbb{R}$ and $r_1, \ldots, r_n \in \mathbb{R}$, $\Phi(A_1, \ldots, A_m, r_1, \ldots, r_n)$ is true in R iff $\Phi({}^*A_1, \ldots, {}^*A_m, {}^*r_1, \ldots, {}^*r_n)$ is true in *R.
The proof is indeed immediate. From (17) we get at once $\Phi({}^*A_1, \ldots, {}^*A_m, {}^*r_1, \ldots, {}^*r_n)$ iff $\{ i \in \mathbb{N} \mid \Phi(A_1, \ldots, A_m, r_1, \ldots, r_n) \} \in \mathcal{U}$. But the set $\{ i \in \mathbb{N} \mid \Phi(A_1, \ldots, A_m, r_1, \ldots, r_n) \}$ is equal to $\mathbb{N} \in \mathcal{U}$ if $\Phi$ is true of $A_1, \ldots, A_m, r_1, \ldots, r_n$ in R, and is equal to $\emptyset \notin \mathcal{U}$ if $\Phi$ is not true of $A_1, \ldots, A_m, r_1, \ldots, r_n$. Thus $\Phi(A_1, \ldots, A_m, r_1, \ldots, r_n)$ is true in R iff $\Phi({}^*A_1, \ldots, {}^*A_m, {}^*r_1, \ldots, {}^*r_n)$ is true in *R.
The existence of the embedding $* : \mathbb{R} \to {}^*\mathbb{R}$ satisfying the transfer principle, 1.1.4, is all we need for elementary nonstandard analysis. To make this point let us briefly return to the proof of Proposition 1.1.2. The essential point was to prove that E is open if $\mu(r) \subseteq {}^*E$ for all $r \in E$. In the previous proof we worked inside the model; here we use transfer: let $a \in E$; since $\mu(a) \subseteq {}^*E$ the following statement
\[ \Phi(x, X) = \exists z\, [\, 0 < z \wedge \forall y\, [\, |x - y| < z \to y \in X \,]] \]
is true in *R when $X$ is interpreted as *E and $x$ as *a. By transfer $\Phi(a, E)$ is true in R, which means that some standard neighborhood of a is contained in E, i.e., E is open. The reader will find it instructive to compare the two proofs.
Let us round off this introductory discussion by the following more general comments. Starting from the rationals Q we can form the ultraproduct
\[ {}^*\mathbb{Q} = \mathbb{Q}^{\mathbb{N}} / \mathcal{U}, \tag{18} \]
where $\mathcal{U}$ is some ultrafilter on N extending the filter of cofinite sets. As in the case of *R we have the notions of finite, infinite, and infinitesimal in *Q. But this time we must be a bit careful with the notion of standard part, since Q is not complete in the standard metric uniformity. Let $Q_f$ denote the finite points of *Q and $Q_i$ the set of infinitesimals. It is easy to prove that there is an onto map $\pi : Q_f \to \mathbb{R}$, obtained by identifying modulo infinitesimals, i.e., $\mathbb{R} = Q_f / Q_i$.
This gives perhaps the most direct way of constructing the reals from the rationals; *Q is an elementary extension of Q, the substructure $Q_f$ is seen to be an integral domain, and $Q_i$ is a prime ideal in $Q_f$. Thus the quotient is a field, namely the reals. This example can be generalized; R is usually regarded as the completion of the normed algebraic structure Q. In general, given a normed algebraic structure E, we can construct both the standard completion $\hat{E}$ and the nonstandard extension *E. In *E we can distinguish the set of prenearstandard (or bounded) points $E_p$. There is a map $\pi : E_p \to \hat{E}$ such that the following diagram is commutative:
[Commutative diagram relating $E$, $E_p$, and $\hat{E}$ via the map $\pi$.]
[See Fenstad and Nyberg (1970) for a detailed discussion.] The map $\pi$ has the property that whenever F is complete and the map $f : E \to F$ has an extension to a map $\hat{f} : \hat{E} \to F$, then
\[ \mathrm{st}({}^*f(x)) = \hat{f}(\pi(x)) \quad \text{for all } x \in E_p. \]
It is this commutative diagram which "explains" the successful use of nonstandard methods in many situations. It is well known that the algebraic structure of E does not always extend to $\hat{E}$. By transfer it does extend to *E. The map $\pi : {}^*E \to \hat{E}$ "confuses" points that from the algebraic point of view should be kept distinct. It is this richness of *E which adds power and intuition to the nonstandard methods.
REMARK. We should admit that, whereas $\hat{E}$ is a "good" extension of E from the topological point of view, *E does not admit a nice intrinsic topological structure; see Zakon (1969). But we are not in general interested in doing topology inside *E. The topological aspect will enter the situation in different ways, e.g., via the commutative diagram above; see also Proposition 1.1.2 and the discussion of measure extensions in Sections 3.4 and 3.5.
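As a further illustration of the Transfer Principle 1.1.4 with a set parameter (our own example; the name $\mathrm{Arch}$ is an assumption introduced here), consider the Archimedean property of R written as a formula of L(R) in one free set variable. The sketch below shows what transfer does and does not give.

```latex
\begin{align*}
  \mathrm{Arch}(X) \;&=_{\mathrm{def}}\; \forall x\, \exists n\, [\, n \in X \wedge x < n \,] \\
  \mathrm{Arch}(\mathbb{N}) \text{ is true in } \mathbb{R}
    \;&\Longleftrightarrow\;
  \mathrm{Arch}({}^*\mathbb{N}) \text{ is true in } {}^*\mathbb{R}
    \qquad \text{(by 1.1.4)}
\end{align*}
```

So every hyperreal lies below some element of *N. Transfer replaces N by *N; it does not assert that every hyperreal lies below a standard natural number, which would be false for infinite elements of *R.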
1.2. THE EXTENDED UNIVERSE
What we have learned so far is that the complete ordered field R of real numbers has a proper (but not unique) ordered field extension *R, the hyperreals. The extension is elementary, which means that it preserves the true statements about R which are expressible in the language L(R).
The structure R is not a large enough domain for the development of classical mathematics. We need an extended universe that, in addition to numbers and functions, also contains sets of functions, sets of spaces of functions, etc.; i.e., we need the finite type structure over R. In more generality, given any set S we introduce the superstructure V(S) over S as follows.
1.2.1. DEFINITION. For any set S let
\[ V_1(S) = S, \qquad V_{n+1}(S) = V_n(S) \cup \{\, X \mid X \subseteq V_n(S) \,\}, \qquad V(S) = \bigcup_n V_n(S). \]
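To get a feeling for the levels of the superstructure, the following sketch locates a few familiar objects in V(R). It assumes, as an illustration only, that ordered pairs are coded in the Kuratowski way, $(a, b) = \{\{a\}, \{a, b\}\}$, and that a function is identified with its graph; the text above does not fix these codings.

```latex
\begin{align*}
  r \in \mathbb{R} &\;\Longrightarrow\; r \in V_1(\mathbb{R}), \\
  A \subseteq \mathbb{R} &\;\Longrightarrow\; A \in V_2(\mathbb{R}), \\
  a, b \in \mathbb{R} &\;\Longrightarrow\; (a, b) = \{\{a\}, \{a, b\}\} \subseteq V_2(\mathbb{R})
      \;\Longrightarrow\; (a, b) \in V_3(\mathbb{R}), \\
  f : \mathbb{R} \to \mathbb{R} &\;\Longrightarrow\; f \subseteq V_3(\mathbb{R})
      \;\Longrightarrow\; f \in V_4(\mathbb{R}).
\end{align*}
```

With a different coding of pairs the level numbers may shift, but every object of classical analysis still appears at some finite level.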
Thus we see that the superstructure over S is obtained by iterating the power-set operator countably many times. Classical analysis lives inside V(R). The extended universe of nonstandard analysis will be obtained by postulating an extension ${}^*\mathbb{R} \supseteq \mathbb{R}$ and postulating an embedding
\[ * : V(\mathbb{R}) \to V({}^*\mathbb{R}) \tag{1} \]
that will have properties similar to the embedding $* : \mathbb{R} \to {}^*\mathbb{R}$ constructed in Section 1.1. First of all we assume the following principle.
1.2.2. EXTENSION PRINCIPLE. *R is a proper extension of R and $*r = r$ for all $r \in \mathbb{R}$.
In the model of Section 1.1 this means that we identify R with its *-image in *R. We shall now extend the ultrafilter construction to demonstrate that superstructure embeddings of the type (1) satisfying a transfer principle analogous to 1.1.4 exist. This is, of course, not part of our systematic development, but simply a proof that our axioms are nontrivial, i.e., they admit a model. The construction proceeds in two stages. First we construct a bounded ultrapower of V(R) using a free ultrafilter $\mathcal{U}$ on N. This is similar to the construction of *R in Section 1.1. Then we map the bounded ultrapower into the superstructure V(*R) in such a way that the embedding (1) obeys the transfer principle.
I. Constructing the Bounded Ultrapower. A sequence $(A_1, A_2, \ldots)$ of elements of V(R) is bounded if there is a fixed $n$ such that each $A_i \in V_n(\mathbb{R})$. Two bounded sequences A and B are equivalent with respect to the free ultrafilter $\mathcal{U}$, in symbols $A \sim_{\mathcal{U}} B$, iff $\{ i \in \mathbb{N} \mid A_i = B_i \} \in \mathcal{U}$. We let $A_{\mathcal{U}}$ denote the equivalence class of A and define the bounded ultrapower by
\[ V(\mathbb{R})^{\mathbb{N}} / \mathcal{U} = \{\, A_{\mathcal{U}} \mid A \text{ is a bounded } V(\mathbb{R})\text{-sequence} \,\}. \tag{2} \]
We define the membership relation $\in_{\mathcal{U}}$ in the ultrapower by
\[ A_{\mathcal{U}} \in_{\mathcal{U}} B_{\mathcal{U}} \quad\text{iff}\quad \{\, i \in \mathbb{N} \mid A_i \in B_i \,\} \in \mathcal{U}. \tag{3} \]
The reader will notice how this extends definition (14) of Section 1.1. There is a natural proper embedding
\[ i : V(\mathbb{R}) \to V(\mathbb{R})^{\mathbb{N}} / \mathcal{U}, \tag{4} \]
namely let $i(A) = (A, A, \ldots)_{\mathcal{U}}$, the equivalence class corresponding to the constant sequence; this is similar to the embedding (8) of Section 1.1.
II. Embedding $V(\mathbb{R})^{\mathbb{N}}/\mathcal{U}$ into V(*R). *R is the (bounded) ultrapower $\mathbb{R}^{\mathbb{N}}/\mathcal{U}$. But $V(\mathbb{R})^{\mathbb{N}}/\mathcal{U}$ will not be the same as the full superstructure V(*R). We shall now construct an embedding
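\[ j : V(\mathbb{R})^{\mathbb{N}} / \mathcal{U} \to V({}^*\mathbb{R}) \tag{5} \]
such that (i) $j$ is the identity on *R and (ii) if $A_{\mathcal{U}} \notin {}^*\mathbb{R}$, then $j(A_{\mathcal{U}}) = \{\, j(B_{\mathcal{U}}) \mid B_{\mathcal{U}} \in_{\mathcal{U}} A_{\mathcal{U}} \,\}$. This means that the relation $\in_{\mathcal{U}}$ in the ultrapower is mapped into the ordinary membership relation in V(*R). The embedding $j$ is constructed in stages. Let
\[ V_k(\mathbb{R})^{\mathbb{N}} / \mathcal{U} = \{\, A_{\mathcal{U}} \mid A \text{ is a sequence from } V_k(\mathbb{R}) \,\}. \]
Then the bounded ultrapower is the union of the chain
\[ {}^*\mathbb{R} = V_1(\mathbb{R})^{\mathbb{N}} / \mathcal{U} \subseteq \cdots \subseteq V_k(\mathbb{R})^{\mathbb{N}} / \mathcal{U} \subseteq \cdots \]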
and we can define $j$ by induction. For $k = 1$, $j$ must be the identity. If $A_{\mathcal{U}} \in V_{k+1}(\mathbb{R})^{\mathbb{N}}/\mathcal{U}$ and $A_{\mathcal{U}} \notin {}^*\mathbb{R}$, we simply set $j(A_{\mathcal{U}}) = \{\, j(B_{\mathcal{U}}) \mid B_{\mathcal{U}} \in_{\mathcal{U}} A_{\mathcal{U}} \,\}$. This makes sense: if $B_{\mathcal{U}} \in_{\mathcal{U}} A_{\mathcal{U}}$, it follows from (3) that $\{ i \in \mathbb{N} \mid B_i \in V_k(\mathbb{R}) \} \in \mathcal{U}$, i.e., $B_{\mathcal{U}} \in V_k(\mathbb{R})^{\mathbb{N}}/\mathcal{U}$, which means that $j(B_{\mathcal{U}})$ is defined at a previous stage of the inductive construction. Combining $i$ and $j$ we get a model of the extended nonstandard universe
\[ V(\mathbb{R}) \xrightarrow{\;i\;} V(\mathbb{R})^{\mathbb{N}} / \mathcal{U} \xrightarrow{\;j\;} V({}^*\mathbb{R}), \]
where $*A = j(i(A))$, for any $A \in V(\mathbb{R})$. The reader may verify that this *-embedding extends the *-embedding constructed in Section 1.1. Here V(R) and V(*R) are connected by a transfer principle generalizing 1.1.4 of Section 1.1. The structure R has an associated elementary language L(R), which we used to give the necessary precision to the transfer principle. We need a similar formal tool to state the extended transfer principle.
1.2.3. THE RESTRICTED LANGUAGE L(V(R)). The language L(V(R)) will be an extension of the language L(R). We add to our stock of elementary formulas [see (i)-(vi) in Section 1.1] expressions of the form
\[ X \in Y, \tag{6} \]
where $X$ and $Y$ are variables for arbitrary sets in V(R). In (vi) of Section 1.1 we restricted ourselves to $t \in Y$, $t$ being a variable for elements of R. We keep the logical symbols of L(R), but in addition to the number quantifiers we add bounded set quantifiers
\[ \forall X \in Y \ \text{(for all sets } X \text{ element of } Y\text{)}, \qquad \exists X \in Y \ \text{(for some set } X \text{ element of } Y\text{)}. \tag{7} \]
Formulas $\Phi$ of L(V(R)) are then constructed in exactly the same way as formulas of L(R). A formula $\Phi$ of L(V(R)) can be interpreted in a natural way in any of the structures V(R), $V(\mathbb{R})^{\mathbb{N}}/\mathcal{U}$, and V(*R); note that in V(R) and V(*R) we have the standard interpretation of the $\in$ symbol, whereas in $V(\mathbb{R})^{\mathbb{N}}/\mathcal{U}$ we use $\in_{\mathcal{U}}$ as introduced in (3) to interpret membership. Given any formula
\[ \Phi = \Phi(X_1, \ldots, X_n) \]
with $X_1, \ldots, X_n$ as the only free set parameters, and given sets $A_1, \ldots, A_n \in V(\mathbb{R})$, we mean by $\Phi(A_1, \ldots, A_n)$ the statement about V(R) obtained by giving the variables $X_1, \ldots, X_n$ the values $A_1, \ldots, A_n$, respectively. In a similar way we interpret $\Phi({}^*A_1, \ldots, {}^*A_n)$ as a condition about V(*R) obtained by giving each $X_k$ the value ${}^*A_k = j(i(A_k))$, $k = 1, \ldots, n$.
1.2.4. TRANSFER PRINCIPLE. Let $A_1, \ldots, A_n \in V(\mathbb{R})$. Any L(V(R)) statement $\Phi$ that is true of $A_1, \ldots, A_n$ in V(R) is true of ${}^*A_1, \ldots, {}^*A_n$ in V(*R), and conversely.
Transfer will be used over and over again. We grant that it may take some time to be completely free and easy in manipulating elementary, i.e., L(V(R)), statements. But it is our experience that it is at most a temporary stumbling block. Here we discuss a simple example. Let $A \subseteq {}^*\mathbb{R}$; then $A \in V_2({}^*\mathbb{R})$. The embedding (1) maps $V_2(\mathbb{R})$ to a set ${}^*(V_2(\mathbb{R}))$ in V(*R). Will A belong to this set? Not necessarily, as we shall show below (see the remark after 1.2.6). But if $A \in {}^*B$ for some $B \in V(\mathbb{R})$, then $A \in {}^*(V_2(\mathbb{R}))$.
REMARK. This can be "proved" in two ways. First we can show that it is true in the model, which means that the result is consistent with our axiomatic principles, extension and transfer. But we can also prove the fact directly from the two basic principles, which means that it is true. We shall
do the latter and leave the former to the reader. Perhaps we should at this point recall the two "proofs" of 1.1.2(i). We thus want to prove that
\[ \forall A\, [\, A \in {}^*B \wedge A \subseteq {}^*\mathbb{R} \to A \in {}^*V_2(\mathbb{R}) \,]. \tag{8} \]
As it stands, (8) is not an L(V(R)) formula. However, it is equivalent to
\[ (\forall A \in {}^*B)\, [\, \forall r \in A\, (r \in {}^*\mathbb{R}) \to A \in {}^*V_2(\mathbb{R}) \,]. \tag{9} \]
This is genuine L(V(R)); i.e., we have only bounded set quantifiers. With some experience one always sees how a condition can be rewritten in correct L(V(R)) form. So in the future we will write formulas in the style of (8) rather than always insist on the correct (9). Now (9) is a condition $\Phi({}^*B, {}^*\mathbb{R}, {}^*V_2(\mathbb{R}))$, which by transfer is true in V(*R) iff the corresponding $\Phi(B, \mathbb{R}, V_2(\mathbb{R}))$ is true in V(R). But the latter condition is trivially true. Thus we have shown that if a subset of *R is an element of some *B in V(*R), then it is already an element of the *-image of $V_2(\mathbb{R})$.
REMARK. We shall comment briefly on the proof of 1.2.4. In the ultrapower model there are three structures involved, V(R), $V(\mathbb{R})^{\mathbb{N}}/\mathcal{U}$, and V(*R). Given any L(V(R)) formula $\Phi(X, Y)$, we have explained how to interpret it in the three structures. Now Łoś' theorem, 1.1.3, immediately extends to the bounded ultrapower $V(\mathbb{R})^{\mathbb{N}}/\mathcal{U}$ by exactly the same proof; i.e., for any $A_{\mathcal{U}}, B_{\mathcal{U}} \in V(\mathbb{R})^{\mathbb{N}}/\mathcal{U}$ we have
\[ \Phi(A_{\mathcal{U}}, B_{\mathcal{U}}) \quad\text{iff}\quad \{\, i \in \mathbb{N} \mid \Phi(A_i, B_i) \,\} \in \mathcal{U}, \tag{10} \]
from which transfer follows between V(R) and $V(\mathbb{R})^{\mathbb{N}}/\mathcal{U}$ exactly as in 1.1.4. But Principle 1.2.4 asserts transfer between V(R) and V(*R). And in order to prove this we need to replace (10) by
\[ \Phi(j(A_{\mathcal{U}}), j(B_{\mathcal{U}})) \quad\text{iff}\quad \{\, i \in \mathbb{N} \mid \Phi(A_i, B_i) \,\} \in \mathcal{U}. \tag{11} \]
But this is a rather immediate extension which follows from the fact that every element of, say, $j(A_{\mathcal{U}})$ in V(*R) is of the form $j(A'_{\mathcal{U}})$ for some $A'_{\mathcal{U}}$ in $V(\mathbb{R})^{\mathbb{N}}/\mathcal{U}$; see the construction of the $j$-map. And once we have (11) the Transfer Principle 1.2.4 follows by the same argument as in 1.1.4.
All properties of *R discussed in Section 1.1 continue to hold for *R in V(*R). Elements of *R are either finite or infinite, and every finite element in *R has a unique standard part in R. Infinitesimals are the inverses of the infinite numbers in *R. But there are further distinctions to make in the extended universe.
1.2.5. DEFINITION. Let $A \in V({}^*\mathbb{R})$; then
(i) A is called standard if $A = {}^*B$ for some $B \in V(\mathbb{R})$,
(ii) A is called internal if $A \in {}^*B$ for some $B \in V(\mathbb{R})$, and
(iii) A is called external if A is not internal.
It is easy to see that every standard set is internal and that every element of an internal set is internal. The latter fact follows from our previous discussion: let $A \in {}^*B$; then, for some $k$, $A \subseteq V_k({}^*\mathbb{R})$, thus $A \in {}^*(V_{k+1}(\mathbb{R}))$. But it follows from 1.2.1 and transfer that if $A_0 \in A \in {}^*(V_{k+1}(\mathbb{R}))$, then $A_0 \in {}^*(V_k(\mathbb{R}))$; the stages are, in set-theoretic terminology, transitive.
REMARK. Because of their importance we will describe in detail the internal sets in the model. Let A be internal; thus $A \in {}^*(V_{k+1}(\mathbb{R}))$ for some $k \ge 1$. This means that A will be of the form $A = j(A_{\mathcal{U}})$, for some $A_{\mathcal{U}}$. By the construction of $j$, we then get
\[ A \in {}^*(V_{k+1}(\mathbb{R})) \quad\text{iff}\quad j(A_{\mathcal{U}}) \in j(i(V_{k+1}(\mathbb{R}))) \quad\text{iff}\quad A_{\mathcal{U}} \in_{\mathcal{U}} i(V_{k+1}(\mathbb{R})), \]
where $i$ is the embedding of V(R) into the ultrapower. The definition of $\in_{\mathcal{U}}$ then gives
\[ A \in {}^*(V_{k+1}(\mathbb{R})) \quad\text{iff}\quad \{\, i \in \mathbb{N} \mid A_i \in V_{k+1}(\mathbb{R}) \,\} \in \mathcal{U}, \]
where $(A_1, A_2, \ldots)$ is the bounded sequence defining $A_{\mathcal{U}}$. Thus the internal sets are precisely the objects we obtain by starting with an arbitrary bounded sequence $(A_1, A_2, \ldots)$; the standard objects are obtained by starting from a constant sequence $(A, A, \ldots)$. As an example consider the sequence $(A_1, A_2, \ldots, A_n, \ldots)$, where $A_n = [0, n] \subseteq \mathbb{R}$. The internal (but not standard) set defined by this sequence is the interval $[0, \omega] \subseteq {}^*\mathbb{R}$, where $\omega$ is the infinite number in *R defined by the sequence $(1, 2, \ldots, n, \ldots)$; i.e., a number $x \in {}^*\mathbb{R}$ belongs to $[0, \omega]$ iff $0 \le x \le \omega$. We have the following important property of internal sets.
1.2.6. PROPOSITION. (i) Every nonempty internal subset of *N has a least element. (ii) Every nonempty internal subset of *R with an upper bound has a least upper bound.
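The proof is a simple exercise in the use of the transfer principle. We prove (i), so let $A \subseteq {}^*\mathbb{N}$ be internal. Then $A \in {}^*(V_2(\mathbb{R}))$; see (9). We can express the fact that an internal subset of *N has a least element by the condition
\[ \Phi({}^*\mathbb{N}, {}^*V_2(\mathbb{R})) =_{\mathrm{def}} \forall X \in {}^*V_2(\mathbb{R})\, [\, X \ne \emptyset \wedge X \subseteq {}^*\mathbb{N} \to X \text{ has a } <\text{-least element} \,]. \]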
To be exact we must write out in detail the condition that $X$ has a <-least element, which is
\[ \exists x \in X\, [\, \forall y \in X\, \neg(y < x) \,], \]
but expanding such clauses into "correct" L(V(R)) statements is a matter of routine and will be largely omitted. We thus have a condition $\Phi$ such that $\Phi(\mathbb{N}, V_2(\mathbb{R}))$ is true in V(R). By transfer $\Phi({}^*\mathbb{N}, {}^*V_2(\mathbb{R}))$ is true in V(*R), proving (i) of 1.2.6.
REMARK. It follows from this that *N - N is external since there is no <-least element in *N - N; if $x \in {}^*\mathbb{N} - \mathbb{N}$, then also $x - 1 \in {}^*\mathbb{N} - \mathbb{N}$. We also see that N is external; thus $\mathbb{N} \in V_2({}^*\mathbb{R}) - {}^*V_2(\mathbb{R})$. From (ii) it follows that R as a subset of *R is external. Note that Proposition 1.2.6 is valid only for internal sets; the set of positive infinitesimals in *R is bounded but has no least upper bound.
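The last claim of the remark can be checked by a short argument, sketched here in our own notation (the symbol $M$ for the set of positive infinitesimals is an assumption introduced for this illustration). Together with 1.2.6(ii) it shows that $M$ is external.

```latex
\begin{align*}
  &M = \{\, \varepsilon \in {}^*\mathbb{R} \mid \varepsilon > 0,\ \varepsilon \approx 0 \,\}
     \text{ is nonempty and bounded above by } 1. \\
  &\text{Suppose } s = \sup M \text{ existed in } {}^*\mathbb{R}; \text{ then } s > 0. \\
  &\text{Case 1: } s \approx 0. \text{ Then } 2s \in M \text{ and } 2s > s,
     \text{ so } s \text{ is not an upper bound of } M. \\
  &\text{Case 2: } s \not\approx 0. \text{ Then } s/2 \not\approx 0 \text{ and } s/2 > 0,
     \text{ so } s/2 > \varepsilon \text{ for every } \varepsilon \in M; \\
  &\qquad \text{thus } s/2 < s \text{ is a smaller upper bound of } M. \\
  &\text{Both cases are contradictory, so } M \text{ has no least upper bound.}
\end{align*}
```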
We spell out a few important corollaries of 1.2.6.
1.2.7. PROPOSITION. (i) If A is internal and $\mathbb{N} \subseteq A$, then A contains some infinite natural number, i.e., an element of *N - N. (ii) If A is internal and every infinite $n \in {}^*\mathbb{N}$ belongs to A, then A contains some standard $n \in \mathbb{N}$. (iii) If an internal set A contains every positive infinitesimal, then A contains some positive standard real. (iv) If an internal set A contains every standard positive real, then A contains some positive infinitesimal.
The first two parts of this proposition are often referred to as overflow and underflow. The proofs are immediate from 1.2.6; e.g., if an internal set A contained every positive infinitesimal, but did not contain a positive standard real, then A would be an internal subset of *R with an upper bound, but with no least upper bound.
How do you recognize a set as internal? The following simple principle is quite powerful.
1.2.8. INTERNAL DEFINITION PRINCIPLE. Let $A_1, \ldots, A_n$ be internal sets in V(*R) and let $\Phi(X_1, \ldots, X_n, x)$ be an L(V(R)) statement. Then the set
\[ \{\, x \in A_1 \mid \Phi(A_1, \ldots, A_n, x) \,\} \]
is internal.
The proof goes as follows. Since $A_1, \ldots, A_n$ are internal, there must be some integer $m$ such that $A_1, \ldots, A_n \in {}^*(V_m(\mathbb{R}))$. In V(R) we have the truth of the following set existence or comprehension principle:
\[ \forall X_1 \cdots \forall X_n \in V_m(\mathbb{R})\ \exists y \in V_{m+1}(\mathbb{R})\, [\, y = \{\, x \in X_1 \mid \Phi(X_1, \ldots, X_n, x) \,\} \,], \]
where $y = \{ x \in X_1 \mid \Phi(X_1, \ldots, X_n, x) \}$ is an abbreviation of the formula $\forall x\, (x \in y \leftrightarrow x \in X_1 \wedge \Phi(X_1, \ldots, X_n, x))$; thus $x$ is a bound variable of the term $\{ x \in X_1 \mid \Phi(X_1, \ldots, X_n, x) \}$. Using transfer and the fact that $A_1, \ldots, A_n \in {}^*(V_m(\mathbb{R}))$, we conclude that the set $\{ x \in A_1 \mid \Phi(A_1, \ldots, A_n, x) \}$ is in ${}^*V_{m+1}(\mathbb{R})$, hence internal.
1.2.9. REMARK. Together, 1.2.7 and 1.2.8 carry more punch than you would initially guess. Let the functions $f$ and $g$ be internal, i.e., internal as sets in V(*R). Suppose that you have proved that for all $t \in {}^*[0, 1]$ there is some infinitesimal $\delta_t > 0$ such that $|f(t) - g(t)| < \delta_t$. Can we then find a uniform estimate $\delta$ such that $|f(t) - g(t)| < \delta$ for all $t \in {}^*[0, 1]$? The answer is yes and follows immediately from 1.2.7(iv) and 1.2.8. The set
\[ A = \{\, r \in {}^*\mathbb{R} \mid r > 0 \wedge \forall t \in {}^*[0, 1]\, (|f(t) - g(t)| < r) \,\} \tag{12} \]
is internal by 1.2.8. We easily see that A contains every standard positive real, hence by 1.2.7(iv) it contains some positive infinitesimal $\delta$, which is the uniform estimate we asked for.
This concludes our general description of the extended universe. We have introduced the superstructure V(R) and postulated a superstructure embedding (1)
\[ * : V(\mathbb{R}) \to V({}^*\mathbb{R}) \]
satisfying extension (1.2.2) and transfer (1.2.4). We have constructed a model of the superstructure embedding via the bounded ultrapower. Thus the axioms are consistent; i.e., there is a coherent conception of the infinitely small and the infinitely large. This conception is ruled by the transfer principle, which will be our main tool in the sections to come.
1.2.10. REMARK. We have restricted ourselves to V(R); in some cases it may be more natural to work inside a different superstructure. For instance, let E be a linear normed space over the complex numbers C. To apply the present machinery we must assume that E and C are objects in V(R). However, from a certain point of view it would have been as natural to work in the superstructure $V(E \cup \mathbb{C})$, i.e., to regard E and C as basic or irreducible objects. The rest would be a set-theoretic construction from the set of "urelements" $E \cup \mathbb{C}$.
The ultrapower construction can easily be adapted to construct an embedding
\[ * : V(E \cup \mathbb{C}) \to V({}^*E \cup {}^*\mathbb{C}), \tag{13} \]
where *E is a normed linear space over *C, satisfying
(14) Extension: *E and *C are proper extensions of E and C, respectively, and $*x = x$ for all $x \in E \cup \mathbb{C}$,
and
(15) Transfer: Let $A_1, \ldots, A_n \in V(E \cup \mathbb{C})$. Any $L(V(E \cup \mathbb{C}))$ statement $\Phi$ which is true of $A_1, \ldots, A_n$ in $V(E \cup \mathbb{C})$ is true of ${}^*A_1, \ldots, {}^*A_n$ in $V({}^*E \cup {}^*\mathbb{C})$,
where $L(V(E \cup \mathbb{C}))$ is the language of the structure E as a normed linear space over C augmented by the $\in$-relation and bounded set quantification. This setting would be the natural framework for our discussion of linear operators in Section 2.2; for other purposes we may choose a different base structure S for V(S).

1.3. LIMITS, CONTINUITY, AND THE DERIVATIVE
We continue our basic course with a brief discussion of limits, continuity, and the derivative. A sequence $(a_n)_{n \in \mathbb{N}}$ is a map $a : \mathbb{N} \to \mathbb{R}$ and, as such, has an extension to a map ${}^*a : {}^*\mathbb{N} \to {}^*\mathbb{R}$. For any $n \in {}^*\mathbb{N}$ we write $a_n = {}^*a(n)$. We use $(a_n)_{n \in {}^*\mathbb{N}}$ to denote the extended sequence. For any elements $a, a' \in {}^*\mathbb{R}$ we shall write $a \approx a'$ to mean that the difference $a - a'$ is infinitesimal.
1.3.1. PROPOSITION. $\lim_{n \to \infty} a_n = a$ iff $a_\omega \approx a$ for all $\omega \in {}^*\mathbb{N} - \mathbb{N}$.
Here the left-hand side of the equivalence has its standard meaning inside V(R). The right-hand side is a statement about the extended universe V(*R). Thus 1.3.1 characterizes the limit notion in nonstandard terms. But it could also have been taken as a definition of what it means for a standard sequence to converge. If $\lim a_n = a$, then given any $\varepsilon > 0$ there is some $n \in \mathbb{N}$ such that the following statement is true in V(R):
\[ \forall m \in \mathbb{N}\, (m \ge n \to |a - a_m| < \varepsilon). \]
By transfer (1.2.4) the statement
\[ \forall m \in {}^*\mathbb{N}\, (m \ge n \to |a - a_m| < \varepsilon) \]
is true in V(*R). If $\omega \in {}^*\mathbb{N} - \mathbb{N}$, then $|a - a_\omega| < \varepsilon$ is true in V(*R). Since this is true for all standard $\varepsilon > 0$, it means that the difference $a - a_\omega$ is infinitesimal, i.e., $a_\omega \approx a$.
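We present two versions of the proof of the converse: (i) If $a \approx a_\omega$ for all $\omega \in {}^*\mathbb{N} - \mathbb{N}$, then, in particular, the following statement is true in V(*R), where $\varepsilon > 0$ is some standard real:
\[ \Phi(\varepsilon, {}^*\mathbb{N}) \quad\text{iff}\quad \exists n \in {}^*\mathbb{N}\ \forall m \in {}^*\mathbb{N}\, (m \ge n \to |a - a_m| < \varepsilon). \]
By transfer (1.2.4) $\Phi(\varepsilon, \mathbb{N})$ is true in V(R). Since $\varepsilon$ is arbitrary, this means that $\lim a_n = a$ in the standard sense. (ii) Once more let $a \approx a_\omega$ for all $\omega \in {}^*\mathbb{N} - \mathbb{N}$ and fix an $\varepsilon > 0$ in R. The set
\[ A = \{\, n \in {}^*\mathbb{N} \mid |a - a_m| < \varepsilon \text{ for all } m \ge n,\ m \in {}^*\mathbb{N} \,\} \]
is internal by 1.2.8 and contains every infinite $n$. By 1.2.7(ii) A contains some finite $n \in \mathbb{N}$. Once more convergence in the standard sense follows.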
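Two concrete sequences may help to see how Proposition 1.3.1 is used. They are our own examples, not taken from the text; the names $a_n$ and $b_n$ are chosen here only for illustration.

```latex
\begin{align*}
  a_n = \tfrac{1}{n}:\quad
    & a_\omega = \tfrac{1}{\omega} \approx 0
      \text{ for all } \omega \in {}^*\mathbb{N} - \mathbb{N},
      \text{ hence } \lim_{n \to \infty} a_n = 0. \\
  b_n = (-1)^n:\quad
    & b_\omega = 1 \text{ if } \omega \text{ is even},\quad
      b_\omega = -1 \text{ if } \omega \text{ is odd}; \\
    & \text{so } b_\omega \not\approx b_{\omega + 1}
      \text{ for every infinite } \omega, \text{ and no } a
      \text{ satisfies } b_\omega \approx a \text{ for all infinite } \omega.
\end{align*}
```

In the second case transfer is what guarantees that every element of *N is either even or odd, so the sequence $(b_n)$ fails the nonstandard criterion and does not converge.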
We could go on from here to develop a nonstandard version of the theory of limits. For instance, Cauchy's criterion tells us that the standard sequence $(a_n)_{n \in \mathbb{N}}$ converges iff $a_\omega \approx a_\lambda$ for all $\omega, \lambda \in {}^*\mathbb{N} - \mathbb{N}$, and that the limit is $\mathrm{st}(a_\omega)$ for all $\omega \in {}^*\mathbb{N} - \mathbb{N}$. In general, $\mathrm{st}(a_\omega)$, if it exists, defines a limit point of the sequence, and every limit point is of this form. But we shall leave the general theory to the reader and only concentrate on one particular point which will be of great importance in Chapter 7: double limits. Given a double sequence $(a_{m,n})_{m,n \in \mathbb{N}}$, we will let $(a_{m,n})_{m,n \in {}^*\mathbb{N}}$ denote the extension ${}^*((a_{m,n})_{m,n \in \mathbb{N}})$. By analogy with Proposition 1.3.1 it is natural to guess that
\[ \lim_{n \to \infty} \lim_{m \to \infty} a_{m,n} = \mathrm{st}(a_{\omega,\lambda}) \tag{1} \]
for all $\omega, \lambda \in {}^*\mathbb{N} - \mathbb{N}$. But then
\[ \lim_{n \to \infty} \lim_{m \to \infty} a_{m,n} = \mathrm{st}(a_{\omega,\lambda}) = \lim_{m \to \infty} \lim_{n \to \infty} a_{m,n}, \]
and we know that this is not true in general; the order of the limits matters. We will show that (1) is true if we are more careful in our choice of $\omega$ and $\lambda$; roughly speaking, we will show that (1) holds if $\omega$ is "large" relative to $\lambda$, but that if $\lambda$ is "large" relative to $\omega$, we have instead
\[ \lim_{m \to \infty} \lim_{n \to \infty} a_{m,n} = \mathrm{st}(a_{\omega,\lambda}). \tag{2} \]
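Let us first illustrate this by a simple example. Define $(a_{m,n})_{m,n \in \mathbb{N}}$ by
\[ a_{m,n} = \begin{cases} a & \text{if } m > n, \\ b & \text{if } m < n, \\ c & \text{if } m = n, \end{cases} \tag{3} \]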
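where a, b, and c are real numbers. Thus
\[ \lim_{n \to \infty} \lim_{m \to \infty} a_{m,n} = a, \tag{4} \]
\[ \lim_{m \to \infty} \lim_{n \to \infty} a_{m,n} = b. \tag{5} \]
By transfer, (3) remains true for infinite integers, and we get $a = \mathrm{st}(a_{\omega,\lambda})$ if $\omega > \lambda$, but $b = \mathrm{st}(a_{\omega,\lambda})$ if $\omega < \lambda$. Note that if $\omega = \lambda$, then $c = \mathrm{st}(a_{\omega,\lambda})$, and hence there may be infinite integers $\omega, \lambda$ where the standard part is different from both limits.
1.3.2. PROPOSITION. Let $(a_{m,n})_{m,n \in \mathbb{N}}$ be a standard sequence. (i) Assume that $\lim_{n \to \infty} \lim_{m \to \infty} a_{m,n} = a$ and $\lim_{m \to \infty} \lim_{n \to \infty} a_{m,n} = b$. For each $m \in {}^*\mathbb{N} - \mathbb{N}$ there are numbers $n_1, n_2 \in {}^*\mathbb{N} - \mathbb{N}$ such that $a_{m,n} \approx a$ when $n \le n_1$ and $a_{m,n} \approx b$ when $n \ge n_2$. Similarly, for each $n \in {}^*\mathbb{N} - \mathbb{N}$ there are $m_1, m_2 \in {}^*\mathbb{N} - \mathbb{N}$ such that $a_{m,n} \approx b$ when $m \le m_1$ and $a_{m,n} \approx a$ when $m \ge m_2$. (ii) We have $a_{m,n} \approx a$ for all $m, n \in {}^*\mathbb{N} - \mathbb{N}$ iff for each $\varepsilon \in \mathbb{R}_+$ there is an $N \in \mathbb{N}$ such that $|a_{m,n} - a| < \varepsilon$ whenever $m, n > N$.
Part (i) of the proposition is illustrated in Fig. 1.1. Before we turn to the proof, let us just remark that the nonstandard condition in (ii) does not necessarily hold even when $\lim_{n \to \infty} \lim_{m \to \infty} a_{m,n} = \lim_{m \to \infty} \lim_{n \to \infty} a_{m,n}$; to see this just choose $a = b \ne c$ in the example above.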
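A less artificial double sequence shows the same dependence on the relative size of the two infinite indices. The following sketch is our own example (the particular formula for $a_{m,n}$ is an assumption chosen for illustration, with $m, n \ge 1$).

```latex
\begin{align*}
  a_{m,n} &= \frac{m}{m + n}, \qquad
  \lim_{n \to \infty} \lim_{m \to \infty} a_{m,n} = \lim_{n \to \infty} 1 = 1, \qquad
  \lim_{m \to \infty} \lim_{n \to \infty} a_{m,n} = \lim_{m \to \infty} 0 = 0. \\
  a_{\omega,\lambda} &= \frac{1}{1 + \lambda/\omega}
     \quad \text{for } \omega, \lambda \in {}^*\mathbb{N} - \mathbb{N}: \\
  &\lambda/\omega \approx 0 \ (\text{e.g. } \omega = \lambda^2)
     \;\Longrightarrow\; \mathrm{st}(a_{\omega,\lambda}) = 1, \\
  &\omega/\lambda \approx 0 \ (\text{e.g. } \lambda = \omega^2)
     \;\Longrightarrow\; \mathrm{st}(a_{\omega,\lambda}) = 0, \\
  &\omega = \lambda
     \;\Longrightarrow\; \mathrm{st}(a_{\omega,\lambda}) = \tfrac{1}{2}.
\end{align*}
```

Here "large relative to" can be read quite literally as an infinitesimal quotient, and intermediate choices such as $\omega = \lambda$ produce standard parts that agree with neither iterated limit.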
[Figure 1.1. Labels recovered from the original figure: m; N; *N - N.]
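To prove (i), we first observe that by applying Proposition 1.3.1 to each of the sequences $\{a_{m,n}\}_{m \in \mathbb{N}}$, $n \in \mathbb{N}$, we get that $a_{m,n} \approx a$ whenever $m \in {}^*\mathbb{N} - \mathbb{N}$, $n \in \mathbb{N}$. Fixing an $m \in {}^*\mathbb{N} - \mathbb{N}$, this implies that the internal set
\[ A_m = \{\, n \in {}^*\mathbb{N} \mid \forall k \le n\, (|a_{m,k} - a| < 1/k) \,\} \]
contains all of N, and hence has an infinite element $n_1$. By definition $n_1$ has the property we want. To find $n_2$, we first define
\[ a_m = \lim_{n \to \infty} a_{m,n}, \]
and note that $a_m \approx b$ when $m \in {}^*\mathbb{N} - \mathbb{N}$. Since
\[ \forall m \in \mathbb{N}\ \forall \varepsilon \in \mathbb{R}_+\ \exists n_2 \in \mathbb{N}\ \forall n \ge n_2\, (|a_{m,n} - a_m| < \varepsilon) \]
holds in the standard universe, the transferred statement
\[ \forall m \in {}^*\mathbb{N}\ \forall \varepsilon \in {}^*\mathbb{R}_+\ \exists n_2 \in {}^*\mathbb{N}\ \forall n \ge n_2\, (|a_{m,n} - a_m| < \varepsilon) \]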
is true in the nonstandard universe. The $n_2$ we get by choosing $m$ infinite and $\varepsilon$ infinitesimal is the one we want. The second part of (i) is the same as the first part with the roles of $m$ and $n$ interchanged, and it need not be proved separately.
The proof of (ii) is almost a copy of the proof of Proposition 1.3.1. If $a_{m,n} \approx a$ for all $m, n \in {}^*\mathbb{N} - \mathbb{N}$, then for all $\varepsilon \in \mathbb{R}_+$ the statement
\[ \exists N \in {}^*\mathbb{N}\ \forall m, n \ge N\, (|a_{m,n} - a| < \varepsilon) \]
is true, and by transfer so is
\[ \exists N \in \mathbb{N}\ \forall m, n \ge N\, (|a_{m,n} - a| < \varepsilon). \]
For the converse we note that if $N \in \mathbb{N}$ is such that
\[ \forall m, n \in \mathbb{N}\, (m, n \ge N \to |a_{m,n} - a| < \varepsilon), \]
then by transfer
\[ \forall m, n \in {}^*\mathbb{N}\, (m, n \ge N \to |a_{m,n} - a| < \varepsilon). \tag{6} \]
If $m, n \in {}^*\mathbb{N} - \mathbb{N}$, the condition in (6) is satisfied for all standard $\varepsilon$. Thus $a_{m,n} \approx a$, and Proposition 1.3.2 is proved.
We turn briefly to the topics of continuity and uniform continuity. Let $f : I \to \mathbb{R}$ be a standard, real function, where I is some interval in R.
1.3.3. PROPOSITION. (i) $f$ is continuous at $a \in I$ iff $f(x) \approx f(a)$ for all $x \in {}^*I$ such that $a \approx x$.
(ii) $f$ is uniformly continuous in I iff $f(x) \approx f(y)$ for all $x, y \in {}^*I$ such that $x \approx y$.
The reader should appreciate the difference. $f(x) = 1/x$ is continuous for every $x \in (0, 1)$, but $f$ is not uniformly continuous. Let $\omega$ be an infinite number in *N; then $1/\omega \approx 1/\omega^2 \approx 0$, but $f(1/\omega) = \omega$ and $f(1/\omega^2) = \omega^2$ are far from being close. Here $1/\omega$ and $1/\omega^2$ represent two different ways of converging to 0. Proofs are again simple; e.g., assume that $f(a) \approx f(x)$ for all $x \in {}^*I$ such that $a \approx x$. Fix an $\varepsilon \in \mathbb{R}_+$, i.e., $\varepsilon > 0$. The set $A = \{\, \delta \in {}^*\mathbb{R} \mid \forall x \in {}^*I\, (|a - x| < \delta \to |f(a) - f(x)| < \varepsilon) \,\}$ is internal by 1.2.8 and contains every positive infinitesimal. By 1.2.7(iii) A must then contain some positive $\delta \in \mathbb{R}$; continuity in the standard sense follows.
1.3.4. REMARKS. Again we could develop a nonstandard theory of continuity, uniform continuity, uniform convergence, equicontinuity, and so on. This we shall not do, but we cannot resist inserting the following painless way of proving that if $f : I \to \mathbb{R}$ is continuous and I is compact, then $f$ is uniformly continuous on I. The proof is an immediate combination of Propositions 1.1.2 and 1.3.3. Let $x, y \in {}^*I$ and assume that $x \approx y$. Compactness of I tells us that $\mathrm{st}(x)$, $\mathrm{st}(y)$ exist and are elements of I; obviously $\mathrm{st}(x) = \mathrm{st}(y)$. Since $f$ is continuous in I, $f(x) \approx f(\mathrm{st}(x)) = f(\mathrm{st}(y)) \approx f(y)$.
We round off this section by introducing the nonstandard version of the derivative.
1.3.5. THE DERIVATIVE. Let I be an open interval in R and let $f : I \to \mathbb{R}$. Let $dx$ be an infinitesimal different from zero. Then we call $dy = f(a + dx) - f(a)$ a differential of $f$ at $a \in I$. If the standard part of $dy/dx$ exists and is the same for all nonzero $dx$, then $f$ has a derivative at $a$ and $f'(a) = \mathrm{st}(dy/dx)$. The reader may amuse himself by proving the standard results. Is, e.g., the chain rule nothing but the calculation
\[ \frac{dy}{dx} = \frac{dy}{dx} \cdot \frac{du}{du} = \frac{dy}{du} \cdot \frac{du}{dx}\,? \]
1.4. THE INTEGRAL
A general theory of measure and integration will be developed in Chapter 3. Here we shall, in as simple as possible a setting, make sense of the heuristic idea that the integral is an infinite sum of infinitesimal parts.
Thus for this section let $f : I \to \mathbb{R}$ be a positive continuous function, where $I$ is some interval in $\mathbb{R}$. Let $[a, b] \subseteq I$ and let $\Delta x$ be a positive real. The Riemann sum is defined as
\[ \sum_a^b f(x)\,\Delta x = f(x_0)\,\Delta x + f(x_1)\,\Delta x + \cdots + f(x_{n-1})\,\Delta x + f(x_n)(b - x_n), \tag{1} \]
where $n$ is the largest integer such that $a + n\,\Delta x \le b$ and where $x_0 = a$, $x_1 = a + \Delta x$, ..., $x_n = a + n\,\Delta x$. Note that it may happen that $x_n < b < x_n + \Delta x$; the last term accounts for this remaining piece. [Since $f$ is positive and continuous we have formed the Riemann sum as the sum of the rectangles over each subinterval with height equal to the value of $f(x)$ at the left end of the base of the rectangle.] The Riemann sum for fixed $a$, $b$ is a function of $\Delta x$. By extension and transfer this function is also defined for positive infinitesimals $dx$. We get a corresponding hyperfinite sum
\[ \sum_a^b f(x)\,dx = f(x_0)\,dx + f(x_1)\,dx + \cdots + f(x_{n-1})\,dx + f(x_n)(b - x_n), \]
where the number $n$ in (1) is now an infinite number. The reader should notice that the Riemann sum is a finite hyperreal number; thus it has a standard part.

1.4.1. DEFINITION. Let $[a, b] \subseteq I$ and let $dx$ be a positive infinitesimal. The definite integral of $f$ from $a$ to $b$ with respect to $dx$ is the standard part of the Riemann sum,
\[ \int_a^b f(x)\,dx = \operatorname{st}\Bigl(\sum_a^b f(x)\,dx\Bigr). \]
This definition depends upon the choice of infinitesimal $dx$. But it can be immediately proved that if $dx$ and $du$ are two positive infinitesimals, then
\[ \int_a^b f(x)\,dx = \int_a^b f(u)\,du. \]
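The finite Riemann sum (1) can be tried out numerically. The following is only a sketch: the function name `riemann_sum` and the test integrand are illustrative choices, and a small real step merely stands in for an infinitesimal, so the independence of the choice of step shows up only approximately.

```python
import math

def riemann_sum(f, a, b, dx):
    """Left-endpoint Riemann sum as in (1): n full steps of width dx,
    followed by one partial rectangle from x_n = a + n*dx up to b."""
    n = math.floor((b - a) / dx)      # largest n with a + n*dx <= b (up to rounding)
    total = sum(f(a + i * dx) * dx for i in range(n))
    x_n = a + n * dx
    return total + f(x_n) * (b - x_n)

# Illustrative integrand (an assumption, not from the text): positive and continuous.
f = lambda x: x * x + 1.0

coarse = riemann_sum(f, 0.0, 1.0, 1e-3)   # step 10^-3
fine = riemann_sum(f, 0.0, 1.0, 1e-5)     # step 10^-5
```

Both values lie close to 4/3, and they move closer to each other as the steps shrink, which is the finite shadow of the statement that two positive infinitesimals give the same integral.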
Note that the $x$ in $f(x)$ and the $u$ in $f(u)$ are dummy variables; the $dx$ and the $du$ are not. The equality above is thus not a matter of typographic convention but an assertion to be verified.

From Definition 1.4.1 we can now develop the elementary theory of integration. We mention one result which “justifies” the intuitive idea of calculating an integral by taking a typical infinitesimal element and adding up. We follow the exposition in Keisler (1976). First a bit of notation. Given a nonzero infinitesimal $\Delta x$, we write
\[ u \approx v \quad (\text{compared to } \Delta x) \]
to mean that $u/\Delta x \approx v/\Delta x$.
1.4.2. INFINITE SUM THEOREM. We assume that
(i) $h$ is a real function continuous on $[a, b]$;
(ii) $B(u, w)$ is a real function with the additive property
\[ B(u, w) = B(u, v) + B(v, w) \]
for $u < v < w$ in $[a, b]$;
(iii) for any infinitesimal subinterval $[x, x + \Delta x]$ of ${}^*[a, b]$,
\[ B(x, x + \Delta x) \approx h(x)\,\Delta x \quad (\text{compared to } \Delta x). \]
Then
\[ B(a, b) = \int_a^b h(x)\,dx. \]
The proof is rather simple. Choose $\omega \in {}^*\mathbb{N} - \mathbb{N}$ and set $\Delta x = (b - a)/\omega$. We show that for any $r > 0$, $r \in \mathbb{R}$,
\[ \sum_a^b (h(x) - r)\,\Delta x < B(a, b) < \sum_a^b (h(x) + r)\,\Delta x, \]
which proves the result. In order to prove the last inequality above, we note that if $B(a, b) \ge \sum_a^b (h(x) + r)\,\Delta x$, then by the transfer principle there is an $x$ such that $a \le x < x + \Delta x \le b$ and such that $B(x, x + \Delta x) \ge (h(x) + r)\,\Delta x$. Since $r \in \mathbb{R}$, this contradicts the fact that $\Delta B = B(x, x + \Delta x) \approx h(x)\,\Delta x$ (compared to $\Delta x$).

A typical application of the infinite sum theorem is the following volume determination (see Fig. 1.2). Let $D$ be the region
\[ D = \{(x, y) \mid a \le x \le b,\ 0 \le y \le g(x)\}. \]
Figure 1.2
The solid generated by revolving $D$ about the $x$ axis has volume
\[ V = \int_a^b \pi (g(x))^2\,dx. \]
This is an immediate consequence of 1.4.2 since $\Delta V \approx \pi (g(x))^2\,\Delta x$ (compared to $\Delta x$).

1.5. DIFFERENTIAL EQUATIONS
We close our calculus course by giving some examples of the application of nonstandard methods to differential equations. We first present the nonstandard proof of Peano’s existence theorem for ordinary differential equations.

1.5.1. THEOREM. Let $f : [0, 1] \times \mathbb{R} \to \mathbb{R}$ be continuous and bounded. Let $u_0 \in \mathbb{R}$ be given. Then there exists a solution $u : [0, 1] \to \mathbb{R}$ such that $u(0) = u_0$ and
\[ du(t)/dt = f(t, u(t)). \]
The key to the nonstandard proof consists in noticing that the following procedure is given by an $L(V(\mathbb{R}))$ condition on the parameters involved:

(1) For all $n \in \mathbb{N}$ divide $[0, 1]$ into equal parts $t_0 = 0$, $t_1 = 1/n$, ..., $t_n = 1$, and define $u_n(t)$ over $[0, 1]$ inductively by first setting
\[ u_n(t_0) = u_0, \qquad u_n(t_{i+1}) = u_n(t_i) + f(t_i, u_n(t_i))(t_{i+1} - t_i), \]
and extend to all of $[0, 1]$ by linear interpolation.

Since the procedure is elementary, let us also carry it through for some $\Lambda \in {}^*\mathbb{N} - \mathbb{N}$. A simple calculation shows that for $1 \le \omega \le \Lambda$
\[ u_\Lambda(t_\omega) = u_0 + \sum_{i=0}^{\omega - 1} f(t_i, u_\Lambda(t_i))(t_{i+1} - t_i). \tag{2} \]
Introduce a standard function $u : [0, 1] \to \mathbb{R}$ by
\[ u(t) = \operatorname{st}(u_\Lambda(t)), \qquad t \in [0, 1]. \]
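For a finite $n$, procedure (1) is the familiar Euler polygon method. A minimal sketch follows; the names `euler_polygon` and `interpolate` and the right-hand side $f(t, u) = \cos t$ are assumptions made for illustration (chosen because it is bounded and continuous and the exact solution $u(t) = u_0 + \sin t$ is known), not part of the text.

```python
import math

def euler_polygon(f, u0, n):
    """Grid values u_n(t_0), ..., u_n(t_n) of procedure (1) on [0, 1], t_i = i/n."""
    h = 1.0 / n
    u = [u0]
    for i in range(n):
        u.append(u[i] + f(i * h, u[i]) * h)   # u(t_{i+1}) = u(t_i) + f(t_i, u(t_i)) h
    return u

def interpolate(u, t):
    """Piecewise-linear extension of the grid values to any t in [0, 1]."""
    n = len(u) - 1
    i = min(int(t * n), n - 1)                # index of the subinterval containing t
    s = t * n - i                             # relative position inside it
    return (1.0 - s) * u[i] + s * u[i + 1]

# Illustrative right-hand side (an assumption): bounded and continuous on [0, 1] x R.
f = lambda t, u: math.cos(t)

grid = euler_polygon(f, 0.0, 1000)
```

With $n = 1000$ the polygon stays within about $10^{-3}$ of $\sin t$; in the nonstandard proof the same recursion is run with an infinite $\Lambda$ and the standard part is taken instead of a limit.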
Elementary considerations (using the compactness of $[0, 1]$ and the boundedness of $f$) show that
\[ f(t, {}^*u(t)) \approx f(t, u_\Lambda(t)), \qquad t \in {}^*[0, 1]. \tag{3} \]
(For emphasis we have put an asterisk on the extended $u$, but in conformity with previous conventions we have dropped the asterisk on the extended $f$.)
Here (3) means that for all $t \in {}^*[0, 1]$ there is some infinitesimal $\delta_t > 0$ such that
\[ |f(t, {}^*u(t)) - f(t, u_\Lambda(t))| < \delta_t. \]
By Remark 1.2.9 the choice of $\delta_t$ can be made uniform (the internal definition principle); i.e., there exists some infinitesimal $\delta > 0$ such that
\[ |f(t, {}^*u(t)) - f(t, u_\Lambda(t))| < \delta \]
for all $t \in {}^*[0, 1]$. This gives the uniformity needed to pass from (2) to
\[ u(t) \approx u_0 + \sum_{i=0}^{\omega - 1} f(t_i, {}^*u(t_i))(t_{i+1} - t_i), \tag{4} \]
where $\omega$ is such that $t \approx t_\omega$. However (4) is nothing but the standard
\[ u(t) = u_0 + \int_0^t f(s, u(s))\,ds; \]
see Definition 1.4.1. This completes the proof.

Let us at this point add a mildly polemical remark. It has often been held that nonstandard analysis is highly nonconstructive, thus somewhat suspect, depending as it does upon the ultrapower construction to produce a model; see Section 1.1. On the other hand, nonstandard praxis is remarkably constructive; having the extended number set we can proceed with explicit calculations. A case in point is the existence theorem (Theorem 1.5.1). In the standard approach one uses in the final step the Ascoli lemma, which asserts that every bounded equicontinuous sequence of functions on a bounded interval $I$ has a uniformly convergent subsequence. This part of the argument is lacking in the nonstandard proof, which makes it more direct. And indeed it is, in the following precise sense. It is possible to recast the nonstandard proof to give a proof of the Peano existence theorem where the only nonrecursive element is the weak König’s lemma asserting that every infinite binary tree, i.e., infinite tree of sequences of 0’s and 1’s, has an infinite path. And this is a principle which is provably weaker, i.e., more constructive, than the Ascoli lemma (Simpson, 1984). On the other hand, adding saturation (see Chapter 2) adds real power to the nonstandard calculations, which explains the many successes of nonstandard methods in stochastic analysis and related fields (Henson et al., 1984; Henson and Keisler, 1985).

In the generality of 1.5.1 there is no uniqueness of solution. The procedure described in (1) gives one. Is it possible through some “infinitesimal” variation of the data to obtain all solutions via a nonstandard difference equation?

1.5.2. EXAMPLE.
The equation
\[ u' = 3u^{2/3} \tag{6} \]
with initial condition $u(0) = 0$ has for each $a \in [0, 1]$ a solution
\[ u(t) = \begin{cases} 0, & 0 \le t \le a, \\ (t - a)^3, & a \le t \le 1. \end{cases} \]
The family of solutions can also be parameterized by the values $u(1) = r$, $r \in [0, 1]$; i.e., to each $r \in [0, 1]$ there is a unique solution $u$ of (6) that satisfies $u(0) = 0$ and $u(1) = r$. We will show how to obtain all these solutions from the difference equation of (1),
\[ u_\Lambda(t_{i+1}) = u_\Lambda(t_i) + (1/\Lambda)\,3u_\Lambda(t_i)^{2/3}, \tag{7} \]
where $\Lambda \in {}^*\mathbb{N} - \mathbb{N}$, by imposing different initial conditions $u_\Lambda(0) = \delta$, where $\delta$ is a positive infinitesimal.

Let $u_\delta$ be the unique solution of (7) with initial condition $u_\Lambda(0) = \delta$, and let ${}^\circ u_\delta$ be the corresponding solution of (6). Clearly ${}^\circ u_0$ is the constant zero and is the solution obtained in 1.5.1. We note that for any standard real $\delta > 0$ equation (6) has a unique solution with initial condition $u(0) = \delta$. This solution is the standard part of the solution $u_\delta$ obtained from (7). We observe that $u_\delta(1) > 1$. Thus the internal set
\[ \{\delta \in {}^*\mathbb{R} \mid \delta > 0 \wedge u_\delta(1) > 1\} \]
contains every standard real $> 0$; hence it contains some positive infinitesimal $\delta_0 > 0$. The function $f(\delta) = u_\delta(1)$ is internal and continuous in the sense of $V({}^*\mathbb{R})$. We know that $f(0) = 0$ and $f(\delta_0) > 1$. Thus for every $r \in {}^*[0, 1]$ there is some $\delta \in [0, \delta_0] \subset {}^*\mathbb{R}$ such that $f(\delta) = r$, i.e., such that $u_\delta(1) = r$. Since ${}^\circ u_\delta$ is the standard part of $u_\delta$ we conclude that every solution of (6) with initial value $u(0) = 0$ can be obtained as the standard part of some solution $u_\delta$ of (7) by choosing the infinitesimal $\delta$ suitably.

This observation also has a numerical content. We can carry out an approximate calculation of the solution corresponding to a given $a$ by choosing an initial value bearing the same relationship to $a$ and the step length as $\delta$ has to $a$ and $n$. It is, of course, a well-known fact in numerical analysis that various approximations, step lengths, often must be chosen to depend on each other in quite specific ways in order to exhibit a particular phenomenon. It could be that the nonstandard theory is the right way to discuss this.
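The numerical content mentioned above can be illustrated with a finite step. The sketch below is an assumption-laden toy: the names `u_at_1` and `find_delta`, the step count `n = 50`, the target `r = 0.5`, and the use of bisection are all illustrative choices. It relies only on the facts that the value at $t = 1$ of the difference equation (7) is continuous and increasing in the initial value, is 0 for initial value 0, and exceeds 1 for initial value 1.

```python
def u_at_1(delta, n):
    """Value at t = 1 of the difference equation (7) with step 1/n and u(0) = delta."""
    u = delta
    for _ in range(n):
        u += (1.0 / n) * 3.0 * u ** (2.0 / 3.0)
    return u

def find_delta(r, n, iterations=1200):
    """Bisect on [0, 1] for an initial value delta with u_at_1(delta, n) close to r."""
    lo, hi = 0.0, 1.0              # u_at_1(0, n) = 0 and u_at_1(1, n) > 1
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if u_at_1(mid, n) < r:
            lo = mid
        else:
            hi = mid
    return hi

n = 50
delta = find_delta(0.5, n)         # initial value that steers the solution to u(1) = 0.5
```

The initial value found this way is many orders of magnitude smaller than the step length, which is the finite counterpart of choosing the infinitesimal $\delta$ in a specific relationship to $\Lambda$. For targets $r$ near 0 the required initial value drops below what double precision can represent, so this naive version only reaches part of the family.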
This example is taken from Birkeland and Normann (1980), where the reader can find a more general discussion. We mention some of their other results. First, if the solutions of the equation $u' = f(t, u)$ are unique to the left in the sense that if $u_1$ and $u_2$ are solutions and $u_1(t) = u_2(t)$ for some $t \in (0, 1]$, then $u_1(s) = u_2(s)$ for all $s \in [0, t]$, then all solutions $u$ of the equation with initial condition $u(0) = 0$ can be obtained from procedure (1) by choosing a suitable infinitesimal $\delta$ and starting from the initial condition $u_\Lambda(0) = \delta$.

In the general case it does not suffice to perturb the initial condition only; we must perturb the whole equation. We call $g$ a $\delta$ perturbation of $f$, where $\delta$ is some infinitesimal, if $g$ is internal, continuous in the sense of $V({}^*\mathbb{R})$, and $\|f - g\|_\infty < \delta$. We let $X_\delta$ denote the set of $\delta$ perturbations of $f$. We further denote by $u_g$ the unique solution of the difference equation in (1) with fixed initial condition $u_\Lambda(0) = 0$ and with the $\delta$ approximation $g$ replacing $f$ in Eq. (1). One can now prove that given $f$ there is some infinitesimal $\delta > 0$ such that if $u$ is a solution of $u' = f(t, u)$ with initial condition $u(0) = 0$ there is a $\delta$ perturbation $g$ of $f$ such that $u$ is the standard part of $u_g$.

Our final example comes from the study of vector fields or dynamical systems of the form
\[ \dot{x} = f(x, y), \qquad \varepsilon\dot{y} = g(x, y), \tag{8} \]
where $\varepsilon$ is a “small” parameter. Here $x = x(t)$, $y = y(t)$ are functions of “time” $t$, and $\dot{x}$, $\dot{y}$ denote time derivatives. We are interested in the behavior for small $\varepsilon$. The standard approach is to use asymptotic expansions in powers of $\varepsilon$. This often leads to complicated computations [see Section 2.6 of Cole (1968)], in particular when $f$ and $g$ also depend on some auxiliary parameter. In the nonstandard setting we may use a fixed, infinitesimal $\varepsilon$ instead of a moving, standard one.

1.5.3. REMARK. The work we report on was done by a group of French mathematicians, Callot (1981), F. Diener (1981), and M. Diener (1981). For more complete information the reader may consult the surveys Benoit (1980), Cartier (1982), and Zvonkin and Shubin (1984).
We begin with some general remarks on the system (8). For simplicity we assume that $f$ and $g$ are smooth standard functions, and $\varepsilon$ is some fixed infinitesimal constant. At nearstandard points $(x, y) \in {}^*\mathbb{R}^2$ where $g(x, y)/\varepsilon$ is infinite, the trajectories of the system (8) are “quasi parallel” to the $y$ axis; i.e., their standard part is parallel to it. At such points the “speed” $(\dot{x}^2 + \dot{y}^2)^{1/2}$ with which trajectories are traversed is infinite; such points are called fast points.
The remaining nearstandard points, the slow points, at which the trajectories may have other directions and the speed is finite, all lie close to the set $S$: $g(x, y) = 0$, which is called the slow manifold. The orientation of the vector field at a fast point is determined by the sign of $g$, while the direction of the flow at a point on the slow manifold is determined by the sign of $f$ at that point. In particular, near a segment of the slow manifold $S$ where the partial derivative $g_y(x, y)$ is negative, all fast trajectories are directed toward $S$, while near segments where $g_y$ is positive, they all go away from it. We speak of stable (attracting) and unstable (repelling) parts of $S$. We shall not develop a general theory here (see the references in Remark 1.5.3), but exhibit the rather complex behavior of trajectories in a simple but fairly representative example.

1.5.4. EXAMPLE. Our system will be a van der Pol equation of the form
\[ \varepsilon\ddot{x} + (x^2 - 1)\dot{x} + x - a = 0, \tag{9} \]
where $a$ is a parameter and $\varepsilon \approx 0$. For reasons of symmetry we may suppose that $a > 0$ and that $\varepsilon$ is a positive infinitesimal. There are many ways of transforming this equation to a system of the form of (8); we find it convenient to first look at Liénard’s substitution $u = F(x) + \varepsilon\dot{x}$, where $F(x) = x^3/3 - x$. This leads to the vector field
\[ U_a:\quad \varepsilon\dot{x} = u - F(x), \qquad \dot{u} = a - x. \tag{10} \]
[Note that the use of variables in (10) is not entirely consistent with (8); this time the fast trajectories will be quasi parallel to the $x$ axis.] From classical theory, with standard $x$, $u$, $a$, $\varepsilon$, we recall the following facts; see, e.g., LaSalle and Lefschetz (1961). The only stationary or equilibrium point is $(a, F(a))$. If $a > 1$ it is stable, and all trajectories tend toward it as $t \to \infty$. If $a < 1$ it is unstable, and there is a stable limit cycle which goes around it and toward which all trajectories tend as $t \to \infty$. As $a$ increases toward 1, the limit cycle shrinks continuously toward the stationary point.

By transfer this description also applies in a nonstandard setting with $\varepsilon$ chosen infinitesimal. But now more can be said. For brevity we restrict the discussion to the case $a < 1$. The slow manifold $S_a$ for the system $U_a$ of (10) is the cubic $u = F(x)$; it is stable when $|x| > 1$, unstable when $|x| < 1$. Trajectories through nearstandard points $(x, u)$ are quasi parallel to the $x$ axis unless $(u - F(x))/\varepsilon$ is finite, and they are oriented toward increasing or decreasing $x$ depending on the sign of $u - F(x)$. Thus the standard part of any trajectory must be a union of parallels to the $x$ axis and of segments of the slow manifold. In particular, the only possible shapes for (the standard part of) a limit cycle are as shown in Fig. 1.3a-c. Continuity arguments [either by transfer from standard theory or directly for the present nonstandard situation, as in Benoit et al. (1980)] show that all three shapes must occur. It is also clear that the cases of Fig. 1.3b,c occur only when $1 - a$ is infinitesimal. The reason for this is simple: if $a$ has a noninfinitesimal distance from 1, a point on a trajectory passing infinitesimally close to the point $(1, -\tfrac{2}{3})$ on the slow manifold will have a noninfinitesimal distance from the slow manifold at $x = a$, since $\dot{u} < 0$ for $x > a$; this means that we are in the situation of Fig. 1.3a.

Figure 1.3

We shall be interested in the situations of Fig. 1.3b,c, where a trajectory first follows the stable and then the unstable part of the slow manifold infinitesimally closely, both for some noninfinitesimal distance. Now numerical experiments (Benoit et al., 1980) have uncovered an unexpected phenomenon: the transition from a cycle of the shape of Fig. 1.3a to an “infinitesimally” small cycle of the shape of Fig. 1.3c is extremely abrupt and takes place well away from $a = 1$; for $\varepsilon = 1/100$ it all takes place within $2 \cdot 10^{-\ldots}$ of the value $a = 0.9987404512$. Thus we must expect some arithmetic restrictions on the value of $a$ in order to exhibit what we, according to the next definition, will call a “canard.”

1.5.5. DEFINITION. A canard (duck) for the vector field $U_a$ is a segment of a trajectory which first follows the stable, then the unstable part of the slow manifold $S_a$ infinitesimally closely, both for some noninfinitesimal distance.
The name canard is motivated by the general shape of the cycle in Fig. 1.3b; see Benoit et al. (1980). Examples of canards are, of course, segments near the point $(1, -\tfrac{2}{3})$ of the limit cycles in Fig. 1.3b,c.

The question now is: For which values of $a$ does the vector field $U_a$ allow canards? Remember that $\varepsilon$ is now a positive infinitesimal and $a$ must be infinitesimally close to 1. To answer this question we “magnify” the immediate neighborhood of the slow manifold $u = F(x)$ by the substitution $y = (u - F(x))/\varepsilon$. The vector field $U_a$ is then transformed to
\[ Y_a:\quad \dot{x} = y, \qquad \varepsilon\dot{y} = a - x - (x^2 - 1)y. \tag{11} \]
This is a typical example of a “change of scale” argument, which is available to us in the nonstandard setting. The slow manifold in the $(x, u)$ plane is transformed into the $x$ axis in the $(x, y)$ plane, and the finite part of the $(x, y)$ plane represents an infinitesimal strip around the slow manifold in the $(x, u)$ plane. And this rescaling will enable us to answer the question posed above. (See Fig. 1.4.)

Let us note that the field $Y_a$ is of the form (8) with its fast trajectories quasi parallel to the $y$ axis. The slow manifold $S_a$ is defined by $(x^2 - 1)y = a - x$, and it is stable when $|x| > 1$, unstable when $|x| < 1$. Note that when $a = 1$ the standard part $H$ of $S_a$ is the union of the straight line $x = 1$ and the hyperbola $y = -1/(x + 1)$.

1.5.6. REMARK. We note that (11) is the usual “phase plane” representation of the van der Pol equation (9). We preferred to start with the Liénard representation (10) because the fast trajectories for (11) for $|x| < 1$ “go off to infinity.”
There is one further point we need to discuss before starting to calculate. There is a kind of “transition zone” of points $(x, u)$ in the Liénard plane with $u - F(x)$ infinitesimal but $(u - F(x))/\varepsilon$ infinite. These are fast points in the Liénard plane, but they are also infinitesimally close to the slow manifold; in the phase plane they are infinitely far off. We must show that trajectories cross this zone in a “nice” way.

Figure 1.4 (the slow manifold in the phase plane)

Consider, for example, the trajectory through a nearstandard point $(x_0, y_0)$ in the phase plane with $x_0 > 1$ and $y_0 > 0$. It is quasi parallel to the $y$ axis, and may therefore be described as the graph of a function $x = \gamma(y)$, defined at least for all finite $y \ge y_0$. But its domain of definition is internal, hence contains some infinite $y_1 > y_0$. Now for all finite $y$, $\gamma(y) \approx x_0$, hence $\gamma(y) \approx x_0$ on some interval $y_0 \le y \le y_2$, with infinite $y_2 \le y_1$.
1.5.7. REMARK. The existence of $y_2$ is an exercise in the use of the internal definition principle. Consider the internal set
\[ A = \{y \in {}^*\mathbb{R} \mid \forall z \in {}^*\mathbb{R}\,(y_0 < z < y \to |\gamma(z) - x_0| < 1/y)\}. \]
Since $A$ contains arbitrarily large real numbers, it must have an infinite element $y_2$.

The point $(\gamma(y_2), y_2)$ corresponds to a point $(x_2, u_2)$ in the Liénard plane with $(u_2 - F(x_2))/\varepsilon$ infinite, thus to a point on a fast trajectory, quasi parallel to the $x$ axis. In short, the correspondence between trajectories in the two coordinate planes is as nice as one could hope for.

Now let $\gamma : t \mapsto \gamma(t) = (x(t), u(t))$, $t_1 \le t \le t_2$, be a canard for the vector field $U_a$; i.e., $\gamma$ is a trajectory for $U_a$, ${}^\circ x(t_1) > 1 > {}^\circ x(t_2)$, and $(u(t) - F(x(t)))/\varepsilon$ is finite for $t_1 \le t \le t_2$. Our change of scale takes $\gamma$ into a trajectory $\Gamma$ for the magnified field $Y_a$. Since the $x$ coordinate of $\Gamma$ is the same as for $\gamma$, $\Gamma$ also projects onto the same noninfinitesimal interval $(x(t_2), x(t_1))$ of the $x$ axis and it is clear that $\Gamma(t)$ is finite for all $t \in [t_1, t_2]$. But this is possible only if $\Gamma(t)$ is infinitesimally close to the slow manifold $S_a$, at least for ${}^\circ t_1 < {}^\circ t < {}^\circ t_2$, that is, if $\Gamma$ is again a canard (see Fig. 1.4).

For which values of $a$ do canards exist? Heuristically, we may argue as follows: the slope of the vector field $Y_a$ at the point $P = (1, -\tfrac{1}{2})$ is
\[ \frac{dy}{dx} = \frac{1}{\varepsilon}\Bigl(\frac{a - x}{y} - (x^2 - 1)\Bigr)\Big|_{x = 1,\ y = -1/2} = \frac{2(1 - a)}{\varepsilon}. \]
At the same point the hyperbola $H$ has slope $\tfrac{1}{4}$. If a canard exists, it ought to be quasi parallel to $H$; i.e., we should expect that $2(1 - a)/\varepsilon \approx \tfrac{1}{4}$, or
\[ a = 1 - \varepsilon/8 + \varepsilon\eta \tag{12} \]
for some $\eta \approx 0$. This result is, in fact, true, but it needs a proof. One way is to consider the behavior of $Y_a$ at some suitable straight line $l$ through $P$ with slope between that of $Y_a$ and that of $H$ at $P$ and to show that unless $a$ is of the form (12), all trajectories cross it in the wrong direction.
It is convenient to parameterize $l$ by $x = 1 + \tau$, $y = -\tfrac{1}{2} + \theta\tau = -\tfrac{1}{2}(1 - 2\theta\tau)$, and to write $\eta = 2(1 - a)/\varepsilon$. Then on $l$:
\[ \frac{dy}{dx} = \frac{\eta - (\tau^2/\varepsilon)(1 - 2\theta\tau - 4\theta)}{1 - 2\theta\tau}. \]
Suppose now that ${}^\circ\eta > \tfrac{1}{4}$; the case ${}^\circ\eta < \tfrac{1}{4}$ is analogous and will be omitted. Choose $\theta$ such that ${}^\circ\eta > {}^\circ\theta > \tfrac{1}{4}$ and note that with this choice $(dy/dx) - \theta > 0$ on some standard neighborhood $|\tau| \le \delta$ of $P$ on $l$ (see Fig. 1.5). This implies that all trajectories of $Y_a$ cross this line segment in the downward direction. If a trajectory was a canard for this value of $\eta = 2(1 - a)/\varepsilon$, it would have to “enter” our picture at a point $(1 + \tau, y)$ with $y < -\tfrac{1}{2} + \theta\tau$. Then in order to be a canard it would have to cross $l$ in an upward direction. But this was impossible on the segment $|\tau| \le \delta$, and since the endpoints of this segment are in the fast part of the plane, we see that no canard is possible. Thus $a$ must necessarily be of the form (12).

Figure 1.5

We observed above that the canard $\gamma$ for $U_a$ remained a canard for the magnified field $Y_a$. The process can be iterated and eventually leads to the conclusion that the values of $a$ which allow canards are of the form
\[ a = \sum_{j=0}^{N} a_j \varepsilon^j + \varepsilon^N \eta, \qquad \eta \approx 0, \tag{13} \]
for every standard natural number $N$. Here the $a_j$ are standard reals depending only on the given vector field $U_a$; note that $a_0 = \operatorname{st}(a) = 1$, $a_1 = -\tfrac{1}{8}$. This shows that the values of $a$ for which canards exist all lie in an extremely narrow interval; the reader is asked to recall the numerical experiments reported on above.

1.5.8. REMARK. By the amplitude of a limit cycle we understand the length of its projection onto the $x$ axis. In connection with the Liénard plane representation of (9) (see Fig. 1.3), we noted that the transition from a cycle of the shape of Fig. 1.3a to an infinitesimally small cycle of the shape of Fig. 1.3c is rather abrupt. This is illustrated in Fig. 1.6. The amplitude in case of Fig. 1.3a is 4. Near 1 the amplitude is infinitesimal. At $1 - \varepsilon/8$ the curve is extremely steep; this is the region of values of $a$ where canards do occur. Note that this is the nonstandard picture; in the standard setting we will have a sudden jump.
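As a quick arithmetic check, the first two terms of (13) can be compared with the numerical value quoted earlier for $\varepsilon = 1/100$. This is only a sanity check with a finite $\varepsilon$; the variable names are illustrative.

```python
eps = 1.0 / 100.0
a_first_order = 1.0 - eps / 8.0    # a_0 + a_1 * eps with a_0 = 1, a_1 = -1/8
a_reported = 0.9987404512          # numerical value quoted in the text for eps = 1/100
gap = a_first_order - a_reported   # what the first-order formula leaves unexplained
```

The first-order value is 0.99875, and the gap to the reported value is roughly $9.5 \cdot 10^{-6}$, well below $\varepsilon^2 = 10^{-4}$, in line with a remainder that is small compared to $\varepsilon$.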
Figure 1.6 (vertical axis: amplitude)

We have seen that linear changes of scale by factors $\varepsilon^{-n}$, $n \in \mathbb{N}$, are insufficient to separate different slow trajectories. A stronger magnification is provided by the function
\[ \tau \mapsto \tau^{[\varepsilon]} = \operatorname{sgn}(\tau)\,|\tau|^{\varepsilon}, \tag{14} \]
i.e., the odd function which is equal to $\tau^\varepsilon$ for $\tau > 0$. Note that $|\tau^{[\varepsilon]}| \approx 1$ when $|\tau|$ is finite and noninfinitesimal and even when $\varepsilon^n < |\tau| < \varepsilon^{-n}$ for some $n \in \mathbb{N}$ (since $\varepsilon \ln \varepsilon \approx 0$ when $\varepsilon \approx 0$). And if $0 < {}^\circ|\tau^{[\varepsilon]}| < 1$ then $\tau$ is of the form $|\tau| = \exp(-k/\varepsilon)$ with ${}^\circ k$ finite positive.

Returning to the field $U_a$, we consider from now on only that particular value of $a$ for which the limit cycle follows the slow manifold $S_a$ all the way between $-1$ and $1$, i.e., the border case between parts b and c of Fig. 1.3. This limit cycle is a canard for $-1 \le x \le 2$ and we represent it as a function graph $y = \gamma(x)$, $-1 \le x \le 2$, in the phase plane, Fig. 1.4. The change of scale
\[ w = (y - \gamma(x))^{[\varepsilon]} \]
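leads to a new vector field
\[ W_a:\quad \dot{x} = \gamma(x) + w^{[1/\varepsilon]}, \qquad \dot{w} = -\frac{(a - x)\,w}{\gamma(x)}, \]
which in the domain $-1 \le x \le 2$, ${}^\circ|w| < 1$, can be viewed as an infinitesimal perturbation of the elementary integrable vector field
\[ \tilde{W}:\quad \dot{x} = -1/(x + 1), \qquad \dot{w} = (1 - x^2)\,w. \tag{15} \]
To see this just note that ${}^\circ\gamma(x) = -1/(x + 1)$ and that $w^{[1/\varepsilon]} \approx 0$ when ${}^\circ|w| < 1$. Solution curves for (15) are
\[ w(x) = c \exp(\varphi(x)), \tag{16} \]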
where $c$ is a constant and $\varphi(x) = x^4/4 + x^3/3 - x^2/2 - x$. Some of the solution curves are sketched in Fig. 1.7. The important feature of $\varphi$ is that it has a minimum at $x = 1$ and from there increases monotonically as $x$ increases or decreases. Thus a map $\psi$ is defined by the following prescription: $\psi(1) = 1$, and if $x \ne 1$, then $\psi(x)$ is the only number $\ne x$ such that $\varphi(\psi(x)) = \varphi(x)$. For our purpose both $x$ and $\psi(x)$ must lie in $[-1, 2]$, so we can define $\psi$ only for $-1 \le x \le \tfrac{5}{3}$; $\psi(-1) = \tfrac{5}{3}$.
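The map $\psi$ has no simple closed form, but it is easy to evaluate numerically. The sketch below uses bisection on the side of the minimum opposite to $x$; the function names and the tolerance are illustrative assumptions, and exact rational arithmetic is used only to confirm the endpoint value $\psi(-1) = 5/3$.

```python
from fractions import Fraction

def phi(x):
    """phi(x) = x^4/4 + x^3/3 - x^2/2 - x, the exponent in the solution curves (16)."""
    return x**4 / 4 + x**3 / 3 - x**2 / 2 - x

def psi(x, tol=1e-12):
    """The point on the other side of the minimum x = 1 with the same phi-value,
    located by bisection; intended for -1 <= x <= 5/3."""
    if x == 1:
        return 1.0
    target = phi(x)
    # phi decreases on [-1, 1] and increases on [1, 5/3]; search the opposite side.
    searching_right = x < 1
    lo, hi = (1.0, 5.0 / 3.0) if searching_right else (-1.0, 1.0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (phi(mid) < target) == searching_right:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Exact check of the endpoint: phi(-1) and phi(5/3) are both 5/12.
endpoint_match = phi(Fraction(-1)) == phi(Fraction(5, 3)) == Fraction(5, 12)
```

Because $\varphi'(x) = (x - 1)(x + 1)^2$ vanishes to second order at $x = -1$, a numerical search for values of $\psi$ close to $-1$ is poorly conditioned; the sketch is most reliable when the result lies away from that endpoint.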
1.5.9. REMARK. It can be proved (see Benoit et al., 1980) that in our domain the trajectories for $W_a$ must lie infinitesimally close to those for $\tilde{W}$. Further, in analogy to a previous argument, one can show that a trajectory $w(x)$ that reaches $1$ (or $-1$) for some $x = \xi$, with ${}^\circ\xi \in (-1, 2)$, reappears in both the phase plane and the Liénard plane as a fast trajectory at abscissas with the same standard part ${}^\circ\xi$.

Figure 1.7
We are now in a position to describe the standard parts of the trajectories in the Liénard plane in some detail. We start at a point $Q$ outside the slow manifold $S_a$. The trajectory through it is quasi horizontal, going to the right if $Q$ is above $S_a$, to the left if $Q$ is below $S_a$. It meets $S_a$ in the monad of some standard point $(\xi, F(\xi))$ with $|\xi| > 1$ and from there follows $S_a$ in the direction of decreasing $|x|$. There are now three possibilities:

(i) If $1 < \xi < \tfrac{5}{3}$, $\gamma$ reappears in the $(x, w)$ plane as a curve $w(x) = c \exp(\varphi(x))$ with $w(\xi) = 1$ if $Q$ is above $S_a$, $w(\xi) = -1$ if $Q$ is below $S_a$. Since $\psi(\xi) > -1$ when $1 < \xi < \tfrac{5}{3}$, the trajectory seen in the Liénard plane is a canard for values of $x$ such that $\psi(\xi) < x < \xi$; then the trajectory leaves $S_a$ to the same side from which it approached it.
(ii) If $\tfrac{5}{3} < \xi$, $\gamma$ reappears in the $(x, w)$ plane as a curve $w(x) = c \exp(\varphi(x))$ with $|w(x)| < 1$ for $-1 < {}^\circ x < \min(\xi, 2)$. In the Liénard plane this corresponds to a canard following $S_a$ all the way up to $(-1, \tfrac{2}{3})$, where it leaves $S_a$ to the upper side, and will then meet $S_a$ at $(2, \tfrac{2}{3})$.
(iii) If $\xi < -1$, $\gamma$ follows $S_a$ in the Liénard plane up to $(-1, \tfrac{2}{3})$, where it leaves $S_a$ and again approaches $S_a$ at $(2, \tfrac{2}{3})$.
1.5.10. REMARK. It is seen that if the point $Q = (x, u)$ lies in the region $F(x) < u < F(\tfrac{5}{3}) = -\tfrac{10}{81}$, $x > 0$, the trajectory through it will exhibit one or more “small” oscillations of the type described under (i) before it enters the limit cycle.
We see that we have a complete description for this particular value of $a$. Similar results hold for other values of $a$ that allow canards (see Benoit et al., 1980). It is interesting to note that numerical calculations with finite $\varepsilon$, say $\varepsilon = 0.05$, accord nicely with the descriptions under (i)-(iii).

We have dwelt at length upon this example because it exhibits in a simple situation the richness of rescaling arguments that nonstandard analysis makes available to us. This is a theme which will recur throughout this book.

REFERENCES*

S. Albeverio (1984). Nonstandard analysis; polymer models, quantum fields. Acta Phys. Austriaca, Suppl. XXVI.
E. Benoit, J.-L. Callot, F. Diener, and M. Diener (1981). Chasse au canard. Collect. Math. 32.
B. Birkeland and D. Normann (1980). A non-standard treatment of the equation y' = f(y, t). Mat. Sem. Oslo.
J.-L. Callot (1981). Bifurcations du portrait de phase pour des équations différentielles du second ordre ayant pour type l'équation d'Hermite. Thèse, Strasbourg.
P. Cartier (1982). Perturbations singulières des équations différentielles ordinaires et analyse non-standard. Sém. Bourbaki, Astérisque 92-93.
J. D. Cole (1968). Perturbation Methods in Applied Mathematics. Ginn (Blaisdell), Boston, Massachusetts.
N. J. Cutland (1983). Nonstandard measure theory and its applications. Bull. London Math. Soc. 15.
M. Davis (1977). Applied Nonstandard Analysis. Wiley, New York.
F. Diener (1981). Méthode du plan d'observabilité; développements en ε-ombre. Thèse, Strasbourg.
M. Diener (1981). Étude générique des canards. Thèse, Strasbourg.
J. E. Fenstad (1980). Nonstandard methods in stochastic analysis and mathematical physics. Jber. Deutsch. Math.-Verein. 82.
J. E. Fenstad (1985). Is nonstandard analysis relevant for the philosophy of mathematics? Synthese 62.
J. E. Fenstad (1986). Lectures on stochastic analysis with applications to mathematical physics. Proc. Simposio Chileno Log. Mat., Santiago.
J. E. Fenstad and A. Nyberg (1970). Standard and nonstandard methods in uniform topology. Logic Colloq. 1969, North-Holland Publ., Amsterdam.
J. M. Henle and E. M. Kleinberg (1979). Infinitesimal Calculus. MIT Press, Cambridge, Massachusetts.
C. W. Henson and H. J. Keisler (1985). The strength of nonstandard analysis. J. Symbolic Logic (to appear).
C. W. Henson, M. Kaufmann, and H. J. Keisler (1984). The strength of nonstandard methods in arithmetic. J. Symbolic Logic 49.
A. Hurd, ed. (1983). Nonstandard Analysis: Recent Developments, Lecture Notes in Math. 983.
A. Hurd and P. A. Loeb (1985). Introduction to Nonstandard Real Analysis. Academic Press, New York.
H. J. Keisler (1976). Foundations of Infinitesimal Calculus. Prindle, Weber and Schmidt, Boston, Massachusetts.
H. J. Keisler (1984). An infinitesimal approach to stochastic analysis. Mem. Amer. Math. Soc. 291.
D. Laugwitz (1978). Infinitesimalkalkül. Bibliographisches Inst., Mannheim.
J. P. LaSalle and S. Lefschetz (1961). Stability by Liapunov’s Direct Method. Academic Press, New York.
T. Lindstrøm (1986). Nonstandard analysis and perturbations of the Laplacian along Brownian paths. In Stochastic Processes in Mathematics and Physics (S. Albeverio et al., eds.), Proc. of BiBoS I, Lecture Notes in Mathematics 1158, Springer-Verlag, Berlin and New York.
P. A. Loeb (1979). An introduction to nonstandard analysis and hyperfinite probability theory. In Probabilistic Analysis and Related Topics, ed. A. T. Bharucha-Reid, Vol. 2. Academic Press, New York.
R. Lutz and M. Goze (1981). Nonstandard Analysis. Lecture Notes in Math. 881, Springer-Verlag, Berlin and New York.
W. A. J. Luxemburg (1973). What is nonstandard analysis. Amer. Math. Monthly 80.
E. Nelson (1977). Internal set theory. Bull. Amer. Math. Soc. 83.
M. M. Richter (1982). Ideale Punkte, Monaden, und Nichtstandard-Methoden. Vieweg, Wiesbaden.
A. Robinson (1966). Non-Standard Analysis. North-Holland Publ., Amsterdam.
A. Robinson (1979). Selected Papers, Vol. 2. North-Holland Publ., Amsterdam.
S. Simpson (1984). Which set existence axioms are needed to prove the Cauchy/Peano theorem for ordinary differential equations? J. Symbolic Logic 49.
T. A. Skolem (1934). Über die Nichtcharakterisierbarkeit der Zahlenreihe mittels endlich oder abzählbar unendlich vieler Aussagen mit ausschliesslich Zahlenvariablen. Fund. Math. 33.
K. D. Stroyan and W. A. J. Luxemburg (1976). Introduction to the Theory of Infinitesimals. Academic Press, New York.
K. D. Stroyan and J. M. Bayod (1985). Foundations of Infinitesimal Stochastic Analysis. North-Holland Publ., Amsterdam (to appear).
E. Zakon (1969). Remarks on the nonstandard real axis. In Applications of Model Theory to Algebra, Analysis, and Probability. Holt, New York.
A. K. Zvonkin and M. A. Shubin (1984). Nonstandard analysis and singular perturbations of ordinary differential equations. Russian Math. Surveys 39.

* In addition to books and papers explicitly referred to in this chapter we have included a number of references to books and survey papers on nonstandard analysis, which the reader may find useful to consult.
CHAPTER 2 TOPOLOGY AND LINEAR SPACES
In this chapter we shall prove the spectral theorem for compact Hermitian operators as an introductory example to a powerful nonstandard technique which we shall apply several times in the second part of this book. But first we need to review a few basic facts about topology and about linear operators in normed spaces. We shall also introduce one new principle of nonstandard theory, saturation, in addition to extension and transfer.

2.1. TOPOLOGY AND SATURATION
Our development of the nonstandard theory has so far rested upon two principles, the extension principle, 1.2.2, and the transfer principle, 1.2.4; see also 1.2.10. Now we need to add a new principle. We recall that a family $\mathcal{F}$ of subsets of some set $E$ has the finite intersection property if $X_1 \cap \cdots \cap X_n \neq \emptyset$ for every finite set of elements $X_1, \ldots, X_n$ in $\mathcal{F}$. As we noticed in connection with (2) in Section 1.1, a filter as a family of sets has the finite intersection property. Does the finite intersection property imply that the intersection of all sets in the family is nonempty? Not necessarily, and this deficiency is a basic motivation for the construction of various completions and compactifications. One wishes to adjoin “limit points” to certain families with the finite intersection property, e.g., Cauchy filters.
2.1.1. EXAMPLE. Let $(a_n)_{n \in \mathbb{N}}$ be a Cauchy sequence of rationals, $a_n \in \mathbb{Q}$. For each $\varepsilon > 0$ there is some $k_\varepsilon \in \mathbb{N}$ such that $|a_{k_\varepsilon} - a_m| < \varepsilon$ for all $m \geq k_\varepsilon$. Thus the set
\[ A_\varepsilon = \{ a \in \mathbb{Q} \mid |a - a_m| < \varepsilon, \text{ all } m \geq k_\varepsilon \} \]
is a nonempty subset of $\mathbb{Q}$. The family $\mathcal{F} = \{A_\varepsilon\}$ has the finite intersection property, but the intersection $\bigcap A_\varepsilon$ of all sets in the family $\mathcal{F}$ is not necessarily nonempty; observe that $a \in \bigcap_\varepsilon A_\varepsilon$ iff $a = \lim a_n$. The completion $\mathbb{R}$ of $\mathbb{Q}$ is introduced exactly to adjoin limit points to families $\mathcal{F}$ of this type.
We shall consider embeddings ${}^* : V(\mathbb{R}) \to V({}^*\mathbb{R})$ with a certain saturation property which in a uniform and general way adjoins limit points to families with the finite intersection property. As an introduction we shall verify the following countable version in the bounded ultrapower model of Section 1.2.
2.1.2. COUNTABLE SATURATION PRINCIPLE. If $A_1 \supseteq A_2 \supseteq \cdots$ is a countable decreasing chain of nonempty internal sets, then
\[ \bigcap_{n \in \mathbb{N}} A_n \neq \emptyset. \]
Each $A_k$ is internal, thus is of the form $A_k = j(A'_k)$ where $A'_k = \langle A^k_1, A^k_2, \ldots \rangle_{\mathcal{U}}$; see the construction of the bounded ultrapower model and the description of internal sets in the model in Section 1.2. We may assume that each $A^k_i \subseteq V_p(\mathbb{R})$ for some fixed $p \geq 1$. And, for convenience, we adjoin the constant sequence $A'_0 = \langle V_p(\mathbb{R}), \ldots \rangle_{\mathcal{U}}$. Define for $k \geq 0$
\[ I_k = \{ i \geq k \mid A^0_i \supseteq A^1_i \supseteq \cdots \supseteq A^k_i \neq \emptyset \}. \]
We see that: (i) $I_0 = \mathbb{N}$; (ii) $I_k \in \mathcal{U}$ and $I_k \supseteq I_{k+1}$ for all $k \geq 0$; and (iii) $\bigcap I_k = \emptyset$. For each $i \in \mathbb{N}$ let
\[ m(i) = \max\{ m \mid i \in I_m \}. \]
Since $I_0 = \mathbb{N}$ and $\bigcap I_m = \emptyset$, $m(i)$ is well defined. Let $B_i$ be some element in $A^{m(i)}_i$. We shall prove that $B = \langle B_1, B_2, \ldots \rangle_{\mathcal{U}} \in_{\mathcal{U}} A'_k$ for all $k \geq 0$, which, via the $j$ map, verifies 2.1.2 in the model. But $B \in_{\mathcal{U}} A'_k$ will follow if we show that $I_k \subseteq \{ i \in \mathbb{N} \mid B_i \in A^k_i \}$. The latter is, however, immediate since $i \in I_k$ implies that $m(i) \geq k$, so that $B_i \in A^{m(i)}_i \subseteq A^k_i$.
Saturation is a very important uniformity principle which lies behind many mathematical arguments, namely a transition from a quantifier structure $\forall\exists$ (to express the “local” property of finite intersection) to one of the form $\exists\forall$ (to express the “global” or uniform property), which is the heart of many finiteness, compactness, or uniform boundedness arguments.
As a first application of the saturation principle 2.1.2, we prove the following useful extension principle. We know by transfer that any standard sequence $(A_n)_{n \in \mathbb{N}}$ has a canonical extension $({}^*A_n)_{n \in {}^*\mathbb{N}}$. But what happens if we are given a countable sequence $(A_n)_{n \in \mathbb{N}}$ in $V({}^*\mathbb{R})$?
2.1.3. PROPOSITION. Any bounded countable sequence $(A_n)_{n \in \mathbb{N}}$ of internal sets in $V({}^*\mathbb{R})$ can be extended to an internal sequence $(A_n)_{n \in {}^*\mathbb{N}}$ in $V({}^*\mathbb{R})$.
Notice that the object $(A_n)_{n \in \mathbb{N}}$ is external in $V({}^*\mathbb{R})$, even if every element $A_n$ of the sequence is internal. Thus transfer is of no help. But saturation does the trick: define for each $n \in \mathbb{N}$ a set $B_n$ by
\[ f \in B_n \quad \text{iff} \quad \operatorname{dom} f = {}^*\mathbb{N} \;\wedge\; \forall i \leq n \, (f(i) = A_i). \]
Each set $B_n$ is internal. Thus every $f \in B_n$ is internal. Since the sets $B_n$ form a decreasing chain of nonempty sets, we observe by saturation, 2.1.2, that $\bigcap B_n \neq \emptyset$. Any $f$ in this intersection is a suitable extension of the given sequence.
Before turning to topology we shall state the general saturation property. If our base space satisfies some countability condition, e.g., if our base space is a separable metric space, then countable saturation will suffice. For general topological spaces we need something more.
2.1.4. GENERAL SATURATION PRINCIPLE. Let $\kappa$ be an infinite cardinal.
(1) A nonstandard extension is called $\kappa$-saturated if for every family $\{X_i\}_{i \in I}$ of internal sets, $\operatorname{card}(I) < \kappa$, with the finite intersection property, the intersection $\bigcap_{i \in I} X_i$ is nonempty, i.e., contains some internal object.
(2) The embedding ${}^* : V(S) \to V({}^*S)$ is saturated if the extension $V({}^*S)$ is $\operatorname{card}({}^*S)$-saturated. The embedding is called polysaturated if it is $\operatorname{card}(V(S))$-saturated.
We make a few remarks. If $\kappa = \omega_1$, the first uncountable cardinal, we are back to the countable saturation principle 2.1.2. And we have proved that $\omega_1$-saturated embeddings exist, namely the bounded ultrapower of Section 1.2 is $\omega_1$-saturated. We also notice that by choosing the cardinality of ${}^*S$ large enough saturation implies polysaturation; thus we use “general saturation” to refer to either of the notions in part (2) of 2.1.4. The reader is referred to the literature for a proof that saturated models exist (Chang and Keisler, 1973).
2.1.5. REMARK. For readers who feel some uneasiness about the nonuniqueness of the extension ${}^*\mathbb{R}$, we mention the following uniqueness theorem for superstructure embeddings: there is up to isomorphism a unique superstructure embedding ${}^* : V(\mathbb{R}) \to V({}^*\mathbb{R})$ such that: (i) ${}^*$ satisfies the transfer principle, 1.2.4; (ii) ${}^*$ is saturated in the sense of 2.1.4(2); (iii) ${}^*\mathbb{R}$ and the set of all internal sets have cardinality equal to the first uncountable inaccessible cardinal [see Keisler (1976) for an excellent discussion].
We now turn to topology. We assume that the topology on the space $E$ is given by the family $\mathcal{O}$ of open sets. For $x \in E$ we let $\mathcal{O}_x$ be the family of open sets containing $x$. We introduce the monad $\mu(x)$ of $x \in E$ by the equation
\[ \mu(x) = \bigcap \{ {}^*O \mid O \in \mathcal{O}_x \}. \tag{1} \]
The reader will immediately see that this generalizes our previous definition 1.1.2. Furthermore, we have the following extension of Proposition 1.1.2.
2.1.6. PROPOSITION. Let $E$ be a topological space.
(i) $E$ is Hausdorff iff $\mu(x) \cap \mu(y) = \emptyset$ for all $x, y \in E$ such that $x \neq y$.
(ii) A set $A \subseteq E$ is open iff $\mu(x) \subseteq {}^*A$ for all $x \in A$.
(iii) A set $A \subseteq E$ is closed iff for all $x \in E$ and all $y \in {}^*A$, $y \in \mu(x)$ implies that $x \in A$.
(iv) A set $A \subseteq E$ is compact iff for all $x \in {}^*A$ there is a $y \in A$ such that $x \in \mu(y)$.
We call a point $x \in {}^*E$ nearstandard iff $x \in \mu(y)$ for some $y \in E$. Thus (iv) of 2.1.6 says that a space $E$ is compact iff every $x \in {}^*E$ is nearstandard. We let $\mathrm{Ns}({}^*E)$ denote the set of nearstandard points in ${}^*E$. Let $x \in \mu(y)$ for some $y \in E$; if $E$ is Hausdorff it follows from 2.1.6(i) that $y$ is unique, and in this case we set $y = \mathrm{st}(x)$. Note that $\mathrm{st} : {}^*E \to E$ is a partial map. For any $A \subseteq {}^*E$ we let
\[ \mathrm{st}(A) = \{ \mathrm{st}(x) \mid x \in A \cap \mathrm{Ns}({}^*E) \}. \]
In the case where $E$ is Hausdorff we have the following useful reformulation of 2.1.6(ii)-(iv):
(ii)' A set $A \subseteq E$ is open iff $\mathrm{st}^{-1}(A) \subseteq {}^*A$.
(iii)' A set $A \subseteq E$ is closed iff ${}^*A \cap \mathrm{Ns}({}^*E) \subseteq \mathrm{st}^{-1}(A)$.
(iv)' A set $A \subseteq E$ is compact iff ${}^*A \subseteq \mathrm{st}^{-1}(A)$.
The map st is called the standard part map, and $\mathrm{st}(x)$ will be referred to as the standard part of $x$.
We should comment a bit on the proof of 2.1.6. And for that purpose let us concentrate on the characterization of openness. We recall from the proof of 1.1.2(i) that if $\mu(x) \subseteq {}^*A$, then we could find some standard open neighborhood of $x$ inside $A$. This was an exercise using transfer. But we had to use a trick. Since $\mu(x)$ is not an internal set, we cannot pull $\mu(x)$ back via the inverse of the ${}^*$-embedding to a standard neighborhood of $x$ in $A$. But we can take a positive infinitesimal $\delta > 0$ and consider the internal set $D = \{ y \in {}^*\mathbb{R} \mid |x - y| < \delta \}$. We now observe that $D \subseteq \mu(x) \subseteq {}^*A$, and $D$ obviously is an element in ${}^*\mathcal{O}_x$. Thus we have the truth in $V({}^*\mathbb{R})$ of the statement
\[ \exists D \in {}^*\mathcal{O}_x \, [D \subseteq {}^*A]. \]
By the transfer principle there exists a $D \in \mathcal{O}_x$ such that $D \subseteq A$. This is exactly the proof of 1.1.2(i) (presented immediately after the transfer principle 1.1.4 in Section 1.1), observing that over $\mathbb{R}$ the quantifier $\exists D$ can be reduced to a quantifier $\exists \delta$.
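As a concrete check of the criterion 2.1.6(ii) on the real line, one may compare an open and a closed interval. The two intervals below are our own illustrative choices and do not appear in the text; $\delta$ denotes a positive infinitesimal.

```latex
% Illustration of 2.1.6(ii) over E = \mathbb{R}; \delta is a positive infinitesimal.
\begin{align*}
A = (0,1):\quad & x \in A,\ y \approx x \;\Longrightarrow\; 0 < y < 1,
   \ \text{since } x \text{ and } 1 - x \text{ are positive reals, hence not infinitesimal};\\
 & \text{thus } \mu(x) \subseteq {}^*A \text{ for all } x \in A, \text{ and } A \text{ is open.}\\
B = [0,1]:\quad & 0 \in B,\ -\delta \in \mu(0),\ \text{but } -\delta \notin {}^*B;\\
 & \text{thus } \mu(0) \not\subseteq {}^*B, \text{ and } B \text{ is not open.}
\end{align*}
```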
In order to make this proof work in general we need the following fact.
2.1.7. APPROXIMATION LEMMA. For each $x \in E$ there is an internal set $D \in {}^*\mathcal{O}_x$ such that $D \subseteq \mu(x)$.
For each $D \in \mathcal{O}_x$ let $F_D = \{ D' \in {}^*\mathcal{O}_x \mid D' \subseteq {}^*D \}$. Each set $F_D$ is internal and the family $\{ F_D \mid D \in \mathcal{O}_x \}$ has the finite intersection property. Hence by the general saturation principle 2.1.4, there exists some internal $D_0$ belonging to $F_D$ for all $D \in \mathcal{O}_x$. Thus $D_0 \subseteq {}^*D$ for all $D \in \mathcal{O}_x$, i.e., $D_0 \subseteq \mu(x)$.
Using 2.1.7, the proof of 2.1.6(ii) is as before. The proof of 2.1.6(i) uses once more Lemma 2.1.7 to approximate $\mu(x)$ from the inside by an internal set. And 2.1.6(iii) is the dual of (ii) and needs no further comment. We do, however, add a comment on the characterization of compactness. We claim that the space $E$ is compact iff every $x \in {}^*E$ is nearstandard. This is nothing but the well-known ultrafilter characterization of compactness in an easy disguise (see the note Fenstad, 1967). We spell out the details: with every point $x \in {}^*E$ we can associate an ultrafilter $\mathcal{F}_x$ on $E$ in the following way:
\[ X \in \mathcal{F}_x \quad \text{iff} \quad x \in {}^*X, \]
i.e., $\mathcal{F}_x$ is the “trace” on $E$ of a principal ultrafilter on ${}^*E$. Conversely, with every ultrafilter $\mathcal{F}$ on $E$ one may associate a point $x_{\mathcal{F}} \in {}^*E$ with $x_{\mathcal{F}} \in {}^*X$ for all $X \in \mathcal{F}$; this is a consequence of general saturation. It follows that $\mathcal{F} = \mathcal{F}_{x_{\mathcal{F}}}$. In fact, if $X \in \mathcal{F}$, then by choice $x_{\mathcal{F}} \in {}^*X$, which means that $X \in \mathcal{F}_{x_{\mathcal{F}}}$. As $\mathcal{F}$ is maximal, equality follows. Let $a \in E$ and $b \in {}^*E$. We note that $b \in \mu(a)$ iff $\mathcal{F}_b$ converges as a filter to $a$, i.e., every open set containing $a$ belongs to $\mathcal{F}_b$.
Combining these observations, the proof of 2.1.6(iv) follows. (i) Let $E$ be compact and $b \in {}^*E$. Compactness means that every ultrafilter on $E$ converges; in particular, there is some $a \in E$ such that $\mathcal{F}_b$ converges to $a$, i.e., $b \in \mu(a)$. (ii) Conversely, let $\mathcal{F}$ be an ultrafilter on $E$; then $\mathcal{F}$ is of the form $\mathcal{F} = \mathcal{F}_b$ for some $b \in {}^*E$. Since every point in ${}^*E$ is nearstandard, i.e., $b \in \mu(a)$ for some $a \in E$, we see that $\mathcal{F}_b$ converges. We shall make one important addition to 2.1.6.
2.1.8. PROPOSITION. Let $E$ be Hausdorff and $A$ an internal set in ${}^*E$. Then $\mathrm{st}(A)$ is closed.
The proof of 2.1.8 is a simple but essential application of the saturation principle. Assume that $O \cap \mathrm{st}(A) \neq \emptyset$ for all $O \in \mathcal{O}_a$, where $a \in E$. If $b$ is any point in $O \cap \mathrm{st}(A)$, then for some $c \in A$, $b = \mathrm{st}(c)$. Since $O$ is open, 2.1.6 tells us that $\mu(b) = \mu(\mathrm{st}(c)) \subseteq {}^*O$, hence $c \in {}^*O \cap A$. Our assumption thus implies that ${}^*O \cap A \neq \emptyset$ for all $O \in \mathcal{O}_a$. Since $A$ is internal, saturation implies that there is some point $b_0 \in \mu(a) \cap A$; but this means that $a = \mathrm{st}(b_0) \in \mathrm{st}(A)$. We have thus proved that if $a$ belongs to the closure of $\mathrm{st}(A)$, i.e., $O \cap \mathrm{st}(A) \neq \emptyset$ for all $O \in \mathcal{O}_a$, then $a \in \mathrm{st}(A)$, i.e., $\mathrm{st}(A)$ is closed.
2.1.9. EXAMPLE. The results in 2.1.6-2.1.8 really presuppose that we work in a (poly-)saturated extension. The following is an example of a noncompact space in which every point is nearstandard. Let $\omega_1 = \{0, 1, 2, \ldots, \alpha, \ldots\}$ be the (uncountable) set of all countable ordinals. The structure $(\omega_1, <)$, where $<$ is the ordering relation between ordinals, is a complete ordered set. The family of intervals $(\alpha, \beta) = \{ \gamma \in \omega_1 \mid \alpha < \gamma < \beta \}$, $\alpha, \beta \in \omega_1$, induces a topology on $\omega_1$, and it is easily seen that $\omega_1$ is not compact in this topology. Let $\mathcal{U}$ be a free ultrafilter on $\mathbb{N}$ and consider
\[ {}^*\omega_1 = \omega_1^{\mathbb{N}} / \mathcal{U}. \]
We claim that every point $\alpha_{\mathcal{U}} \in {}^*\omega_1$ is finite; i.e., there exists some $\alpha \in \omega_1$ such that $\alpha_{\mathcal{U}} < {}^*\alpha$. Now $\alpha_{\mathcal{U}} = \langle \alpha_1, \ldots, \alpha_n, \ldots \rangle_{\mathcal{U}}$. And since a countable set of countable ordinals is bounded in $\omega_1$, there is some $\alpha \in \omega_1$ such that $\alpha_n < \alpha$ for all $\alpha_n$ in the defining sequence for $\alpha_{\mathcal{U}}$. But then $\alpha_{\mathcal{U}} < {}^*\alpha$; i.e., every point in ${}^*\omega_1$ is finite. But in a complete ordered structure every finite point has a standard part, i.e., $\alpha_{\mathcal{U}}$ is nearstandard for all $\alpha_{\mathcal{U}} \in {}^*\omega_1$. The extension ${}^*\omega_1 = \omega_1^{\mathbb{N}} / \mathcal{U}$ satisfies 2.1.2 but not 2.1.4. However, $\omega_1$ does have saturated extensions; see the reference in Remark 2.1.5.
We round off our general topology course by mentioning the characterization of continuity.
2.1.10. PROPOSITION. Let $E$ and $F$ be topological spaces and $f : E \to F$. Then $f$ is continuous at a point $a \in E$ iff ${}^*f(\mu(a)) \subseteq \mu(f(a))$.
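To see the criterion 2.1.10 at work over $E = F = \mathbb{R}$, consider the function $f(x) = x^2$ (our own illustrative choice, not taken from the text). A minimal check, for a real point $a$ and an arbitrary infinitesimal $\delta$, runs as follows:

```latex
% f(x) = x^2 at a real point a; \delta is an arbitrary infinitesimal.
% The factor 2a + \delta is finite, and finite times infinitesimal is infinitesimal.
\begin{align*}
{}^*f(a + \delta) - f(a) &= (a + \delta)^2 - a^2 = 2a\delta + \delta^2
                          = \delta\,(2a + \delta) \approx 0,\\
\text{hence}\quad {}^*f(\mu(a)) &\subseteq \mu(f(a)).
\end{align*}
```

By contrast, for the step function $h$ with $h(x) = 0$ for $x \leq 0$ and $h(x) = 1$ for $x > 0$, a positive infinitesimal $\delta$ lies in $\mu(0)$ while ${}^*h(\delta) = 1 \notin \mu(h(0))$, so the criterion correctly detects the discontinuity at $0$.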
We postpone the few remarks that we shall make on the topic of uniform continuity to our discussion of metric spaces. There is, however, a general discussion of standard versus nonstandard methods in uniform topology in Fenstad and Nyberg (1970), which was behind the commutative diagram in Section 1.1. Suitable references on general topology are Davis (1977) and Stroyan and Luxemburg (1976). Let us also mention that Richter and Benninghofen have recently introduced an interesting new approach to nonstandard topology. Their idea is to extend the transfer principle to a larger class of formulas (allowing certain external quantifiers) in a way which makes it possible to describe monads; see Richter (1982) and Benninghofen and Richter (1983). Benninghofen and Stroyan (1984) have given applications to the bounded weak star topology on linear spaces. Richter's and Benninghofen's ideas were inspired by Nelson's (1977) formulation of nonstandard analysis, but can also be used within the formulation we have described.
2.1.11. REMARK. So far we have been discussing nonstandard characterizations of standard concepts. In 2.1.10 we discussed a map $f : E \to F$ that is a standard object. However, in applications of nonstandard theory we shall be much concerned with internal objects, such as internal maps $f : {}^*E \to {}^*F$. For internal maps basic notions such as continuity and differentiability split into two separate concepts. Take as an example continuity; let $f : {}^*\mathbb{R} \to {}^*\mathbb{R}$ and recall that $\mathbb{R}_+$ denotes the set of positive reals:
(i) $f$ is called S-continuous at $a \in {}^*\mathbb{R}$ iff
\[ (\forall \varepsilon \in \mathbb{R}_+)(\exists \delta \in \mathbb{R}_+)(\forall x \in {}^*\mathbb{R})(|x - a| < \delta \to |f(x) - f(a)| < \varepsilon). \]
(ii) $f$ is called *-continuous at $a \in {}^*\mathbb{R}$ iff
\[ (\forall \varepsilon \in {}^*\mathbb{R}_+)(\exists \delta \in {}^*\mathbb{R}_+)(\forall x \in {}^*\mathbb{R})(|x - a| < \delta \to |f(x) - f(a)| < \varepsilon). \]
The reader will appreciate the difference. Let $\omega \in {}^*\mathbb{N} - \mathbb{N}$; then the internal function
\[ f(x) = \sin(\omega x) \]
is everywhere *-continuous but nowhere S-continuous. The internal function
\[ f(x) = \begin{cases} 1/\omega & \text{if } x \in {}^*\mathbb{Q}, \\ 0 & \text{if } x \notin {}^*\mathbb{Q}, \end{cases} \]
is everywhere S-continuous but nowhere *-continuous.
The difference between S-continuity/differentiability and *-continuity/differentiability will be important in several applications. For instance, the "free energy" is always *-differentiable but not always S-differentiable in an Ising model; see Section 7.2. In the non-S-differentiable case there may be a phase transition.
We now turn for a moment to metric spaces. Let $(E, d)$ be a metric space. The notion of monad now makes sense for any $x \in {}^*E$. We set
\[ \mu(x) = \{ y \in {}^*E \mid d(x, y) \approx 0 \}. \]
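Returning for a moment to the function $f(x) = \sin(\omega x)$ of Remark 2.1.11, the failure of S-continuity at an arbitrary point $a \in {}^*\mathbb{R}$ can be checked by hand. The following short computation is only a sketch; the auxiliary quantities $h$, $s$, and $x_0$ are our own notation.

```latex
% f(x) = \sin(\omega x), \omega infinite; a \in {}^*\mathbb{R} arbitrary.
% By transfer, \sin takes every value in [-1, 1] on any interval of length 2\pi.
\begin{align*}
&\text{choose } h \in [0, 2\pi/\omega] \text{ with } \sin(\omega a + \omega h) = s,
  \qquad s = \begin{cases} \phantom{-}1 & \text{if } \sin(\omega a) \leq 0,\\
                           -1 & \text{if } \sin(\omega a) > 0, \end{cases}\\
&x_0 = a + h: \qquad |x_0 - a| \leq 2\pi/\omega \approx 0, \qquad
  |f(x_0) - f(a)| = |s - \sin(\omega a)| \geq 1 .
\end{align*}
```

Hence for $\varepsilon = 1$ no $\delta \in \mathbb{R}_+$ satisfies condition (i), whereas condition (ii) holds by transfer of the continuity of the standard functions $x \mapsto \sin(nx)$, $n \in \mathbb{N}$.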
In analogy with Proposition 2.1.10 we have the following general version of Proposition 1.3.3(ii): a function $f : E \to F$, where $E$ and $F$ are metric spaces, is uniformly continuous iff $f(\mu(x)) \subseteq \mu(f(x))$, all $x \in {}^*E$. In a metric space $E$ we must also carefully distinguish between a notion of nearstandard and a notion of finite point.
2.1.12. DEFINITION. A point $x \in {}^*E$ is called nearstandard if $d(x, y) \approx 0$ for some $y \in E$. The point $x$ is called finite if $d(x, y)$ is finite for some $y \in E$.
Trivially, every nearstandard point is finite. The converse is not true. If $E$ is a separable infinite-dimensional Hilbert space there are finite points in ${}^*E$ which are not nearstandard. In order to see this, we let $(e_n)_{n \in \mathbb{N}}$ be an orthonormal basis for $E$. Choose in ${}^*E$ an element $e_\omega$, $\omega \in {}^*\mathbb{N} - \mathbb{N}$. Then $e_\omega$ is finite but not nearstandard in the metric derived from the norm of $E$. We note that for every finite point of ${}^*E$ to be nearstandard it is necessary and sufficient for $E$ to satisfy the condition that every bounded closed set is compact.
Any metric space $E$ is Hausdorff. Thus if $x \in {}^*E$ is nearstandard there is a unique $y \in E$ such that $x \in \mu(y)$; we write $y = {}^\circ x$ or $y = \mathrm{st}(x)$. We discuss one result that involves the notion of standard part in three spaces: in many applications we have a pair of separable metric spaces $E_1$, $E_2$ and the set $C(E_1, E_2)$ of continuous maps $f : E_1 \to E_2$ that in the compact-open topology is a separable metrizable space. Let $F \in {}^*C(E_1, E_2)$; i.e., $F$ is a map from ${}^*E_1$ to ${}^*E_2$. As an element of ${}^*C(E_1, E_2)$, $F$ may have a standard part ${}^\circ F \in C(E_1, E_2)$. How is this notion of standard part related to the notions of standard parts in ${}^*E_1$ and ${}^*E_2$?
2.1.13. PROPOSITION. Let $E_1$, $E_2$ be separable metric spaces with $E_1$ locally compact and let $C(E_1, E_2)$ have the compact-open topology. If $f \in C(E_1, E_2)$ and $F \in {}^*C(E_1, E_2)$, then ${}^\circ F = f$ iff for each nearstandard point $x \in {}^*E_1$, ${}^\circ(F(x)) = f({}^\circ x)$.
We give the proof. Let ${}^\circ F = f$ and let $x \in {}^*E_1$ be nearstandard. $E_1$ is locally compact, so for every open neighborhood $U$ of $f({}^\circ x)$ we can find a compact neighborhood $K$ of ${}^\circ x$ such that $f(K) \subseteq U$; i.e., $C(K, U)$ is a neighborhood of $f$. We see that $x \in {}^*K$ and $F \in {}^*C(K, U)$. Thus $F : {}^*K \to {}^*U$ and $F(x) \in {}^*U$. Since this is true for arbitrary open neighborhoods of $f({}^\circ x)$ we conclude that ${}^\circ(F(x)) = f({}^\circ x)$.
For the converse, suppose that ${}^\circ(F(x)) = f({}^\circ x)$ for all nearstandard $x \in {}^*E_1$. Let $C(K, U)$ be any neighborhood of $f$; we must show that $F \in {}^*C(K, U)$ in order to conclude that ${}^\circ F = f$. Pick any $x \in {}^*K$; $K$ is compact so ${}^\circ x$ exists and is an element of $K$. Then $f({}^\circ x) \in U$, so by assumption ${}^\circ(F(x)) \in U$, i.e., $F(x) \in {}^*U$. Thus $F : {}^*K \to {}^*U$, i.e., $F \in {}^*C(K, U)$.
We shall prove one more result about metric spaces needed in further applications.
2.1.14. DEFINITION. Let $f$ be a map between metric spaces $E$, $F$. $f$ is called compact if for every bounded subset $A$ of $E$, $f(A) \subseteq D$ for some compact subset $D$ of $F$.
2.1.15. PROPOSITION. $f : E \to F$ is compact iff $f$ maps finite points of ${}^*E$ to nearstandard points of ${}^*F$.
Let $f$ be compact and $x$ finite in ${}^*E$. There is some $r \in \mathbb{R}_+$ such that $d(x, y) \leq r$ for some $y \in E$. The set $A = \{ u \in E \mid d(u, y) \leq r \}$ is bounded, hence $f(A) \subseteq D$ for some compact $D \subseteq F$. We observe that $x \in {}^*A$, which implies that $f(x) \in {}^*D$. Since $D$ is compact, $f(x)$ is nearstandard.
For the converse let $A$ be a bounded subset of $E$. Our candidate for the set $D$ will be the set of all points $y \in F$ such that $y \approx x$ for some $x \in {}^*(f(A))$. We must show that $D$ is compact. For any $y \in D$ and $\varepsilon \in \mathbb{R}_+$ we have in $V({}^*\mathbb{R})$ the truth of the statement
\[ \exists x \in {}^*f(A) \, (d(x, y) < \varepsilon). \]
By transfer we see that for all $y \in D$ and all $\varepsilon \in \mathbb{R}_+$ there exists $x \in f(A)$ such that $d(x, y) < \varepsilon$. Thus we have in $V(\mathbb{R})$ the truth of
\[ (\forall y \in D)(\forall \varepsilon \in \mathbb{R}_+)(\exists x \in f(A))(d(x, y) < \varepsilon). \]
This means, once more by transfer, that if $y \in {}^*D$, then there is some $x \in {}^*f(A)$ such that $x \approx y$. By assumption $A$ is bounded; thus any point of ${}^*f(A) = f({}^*A)$ is nearstandard. But if $x$ is nearstandard and $y \approx x$, then $y$ is nearstandard, i.e., $D$ is compact.
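Proposition 2.1.15 can be tested against the point $e_\omega$ introduced after Definition 2.1.12. The short computation below is a sketch in the notation used there, with $y$ an arbitrary standard vector of the separable Hilbert space $E$; it shows that $e_\omega$ is finite but not nearstandard.

```latex
% (e_n) orthonormal basis of E, \omega \in {}^*\mathbb{N} - \mathbb{N}, y \in E standard.
% Since \langle e_n, y \rangle \to 0 as n \to \infty, we have \langle e_\omega, y \rangle \approx 0.
\begin{align*}
\|e_\omega\| &= 1, \\
\|e_\omega - y\|^2 &= \|e_\omega\|^2 - 2\,\mathrm{Re}\,\langle e_\omega, y \rangle + \|y\|^2
                    \approx 1 + \|y\|^2 \geq 1 .
\end{align*}
```

Thus no standard $y$ is infinitely close to $e_\omega$, and by 2.1.15 the identity map on an infinite-dimensional Hilbert space is not compact.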
2.2. LINEAR SPACES AND OPERATORS
In this section we shall introduce a nonstandard approach to linear spaces and operators. We shall only discuss a few basic facts and refer the reader to the existing literature for further information (Heinrich, 1980; Henson and Moore, 1983). The reader should also preview Chapter 5, where we develop a hyperfinite theory of quadratic forms with applications to stochastic analysis.
Let $(E, \|\cdot\|)$ be a linear normed space. An element $x \in {}^*E$ is called finite (norm-finite) if $\|x\|$ is a finite hyperreal; we let $\mathrm{Fin}({}^*E)$ denote the finite elements of ${}^*E$. The element $x \in {}^*E$ is infinitesimal if $\|x\| \approx 0$; we write $x \approx y$ for $\|x - y\| \approx 0$. Both $\mathrm{Fin}({}^*E)$ and $\mu(0) = \{ x \in {}^*E \mid \|x\| \approx 0 \}$ are vector spaces over the same field as $E$. The quotient space $\hat{E} = \mathrm{Fin}({}^*E)/{\approx}$ is also a normed linear space that we call the nonstandard hull of $E$. This notion was introduced by Luxemburg (1969), and there is a large literature on this topic; e.g., see Henson and Moore (1983) for a survey and introduction. We are working with saturated embeddings, a fact which has important consequences for the study of the nonstandard hull. We shall treat the following somewhat more general situation. Let $(F, \|\cdot\|)$ be an internal normed linear space; in this case we assume that the norm is a map from $F$ into ${}^*\mathbb{R}$. The notions of finite and infinitesimal still make sense; i.e.,
$\mathrm{Fin}(F)/{\approx}$ is a well-defined normed linear space. We have the following result:
2.2.1. PROPOSITION. The space $\mathrm{Fin}(F)/{\approx}$ is complete, i.e., is a Banach space.
PROOF. The proof is a variation on Example 2.1.1. Let $(\hat{a}_n)_{n \in \mathbb{N}}$ be a Cauchy sequence in $\mathrm{Fin}(F)/{\approx}$. This sequence comes from a sequence $(a_n)_{n \in \mathbb{N}}$ in $F$; by saturation this sequence can be extended to an internal *Cauchy sequence in $F$, namely let
\[ A_n = \{ b \mid b : {}^*\mathbb{N} \to F,\ b \text{ is *Cauchy},\ \forall i \leq n \, [b_i = a_i] \}. \]
Each $A_n$ is nonempty and internal. By countable saturation 2.1.2, the intersection
\[ A = \bigcap_{n \in \mathbb{N}} A_n \]
is nonempty. Let $b \in A$ and let $\eta \in {}^*\mathbb{N} - \mathbb{N}$, then $\hat{b} = b_\eta/{\approx}$ is an element of $\mathrm{Fin}(F)/{\approx}$ that is easily seen to be the limit of the given sequence $(\hat{a}_n)_{n \in \mathbb{N}}$. We add the observation that if the norm in $F$ comes from an inner product, then $\mathrm{Fin}(F)/{\approx}$ will be a Hilbert space.
We now let $E$ be a standard normed linear space. We recall that an element $x \in {}^*E$ is called nearstandard if $\|x - y\| \approx 0$ for some $y \in E$. An element $x \in {}^*E$ is called pre-nearstandard if for all $\varepsilon \in \mathbb{R}_+$ there is some $y \in E$ such that $\|x - y\| < \varepsilon$; we let $\mathrm{Pns}({}^*E)$ denote the set of pre-nearstandard points of ${}^*E$. By analogy with Proposition 2.2.1 we have:
2.2.2. PROPOSITION. The space $\mathrm{Pns}({}^*E)/{\approx}$ is the completion of the given space $E$.
The proof is exactly the same as for 2.2.1. We only add the observation that $\mathrm{Pns}({}^*E)$ is the closure of $E$ in the topology on ${}^*E$ defined by the extended norm; this implies that $\mathrm{Pns}({}^*E)/{\approx}$ will be the completion of $E$. We further observe that if $E$ is an inner product space (a pre-Hilbert space) then $\mathrm{Pns}({}^*E)/{\approx}$ is the Hilbert space completion of $E$. Further results can be found in Luxemburg (1969) and in Fenstad and Nyberg (1970) (where "pre-nearstandard" was called "bounded"). Note also that the space $E$ is complete iff every pre-nearstandard point is nearstandard. This generalizes the fact that compactness is equivalent to every point of ${}^*E$ being nearstandard.
Let $E$ be any normed linear space; we denote by $\mathcal{F}_E$ the class of all finite-dimensional subspaces of $E$. For $F \in \mathcal{F}_E$ let $\dim(F)$ denote the dimension of $F$. By transfer we obtain an object ${}^*\mathcal{F}_E \in V({}^*\mathbb{R})$ and a map
${}^*\dim : {}^*\mathcal{F}_E \to {}^*\mathbb{N}$; in conformity with previous conventions we drop the asterisk and write $\dim(F)$ for all $F \in {}^*\mathcal{F}_E$. If $F \in {}^*\mathcal{F}_E$ and $\dim(F) = \eta \in {}^*\mathbb{N}$ it follows by transfer that there is an internal sequence $(e_i)_{i \leq \eta}$ such that $e_i \in {}^*E$ and
\[ F = \Bigl\{ \sum_{i=1}^{\eta} \lambda_i e_i \Bigm| (\lambda_i)_{i \leq \eta} \text{ an internal sequence in } {}^*K \Bigr\}, \]
where $K = \mathbb{R}$ or $\mathbb{C}$ is the field of coefficients of $E$. The space $F$ is a hyperfinite-dimensional linear space. By transfer $F$ has all the elementary properties of finite-dimensional spaces.
2.2.3. PROPOSITION. Let $E$ be a normed linear space. Then there is an $F \in {}^*\mathcal{F}_E$ such that
\[ E \subseteq F \subseteq {}^*E. \]
The proof is an exercise in the use of general saturation 2.1.4. For each $x \in E$ let $A_x = \{ F \in {}^*\mathcal{F}_E \mid x \in F \}$. The family $\{A_x\}_{x \in E}$ of internal sets has the finite intersection property; hence there is some $F \in {}^*\mathcal{F}_E$ such that $x \in F$ for all $x \in E$.
This is the powerful technique alluded to in the introductory paragraph of this chapter. We imbed the space $E$ into a hyperfinite-dimensional space $F$. By transfer we prove results about $F$ by proving results about finite-dimensional subspaces of $E$. Since $E$ sits inside $F$, we shall use the standard part map to prove results about $E$. This will be illustrated in the next section. Here we add a few remarks about linear operators. Let $T : E \to E$ be a bounded linear operator. The norm $\|T\| = \sup_{\|x\| = 1} \|Tx\|$ is well defined and $\|Tx\| \leq \|T\| \cdot \|x\|$. We have the following nonstandard characterization.
2.2.4. PROPOSITION. Let $T : E \to E$ be a linear operator:
(i) $T$ is bounded iff ${}^*T$ maps finite points to finite points.
(ii) $T$ is compact iff ${}^*T$ maps finite points to nearstandard points.
For the proof note first that (ii) is a corollary of Proposition 2.1.15. For the proof of (i) note that the inequality $\|Tx\| \leq \|T\| \, \|x\|$ immediately implies that a bounded $T$ maps finite points to finite points. For the converse we observe that every point in the set $\{ Tx \in {}^*E \mid x \in {}^*E, \|x\| = 1 \}$ is finite since $T$ maps finite to finite. Since the set is internal and bounded in norm by every infinite element of ${}^*\mathbb{N}$, there is some finite bound to the set; we conclude that $\|T\| < \infty$.
We now turn to a study of compact symmetric operators in a Hilbert space.
2.3. SPECTRAL DECOMPOSITION OF COMPACT HERMITIAN OPERATORS
In this section we shall illustrate the use of Proposition 2.2.3 by giving a nonstandard proof of the spectral theorem in an arbitrary Hilbert space. We have included this result as a pedagogical example. The result is classical; the nonstandard treatment is due to Robinson (1966); consult also the exposition in Davis (1977). For a different approach and a more general result see Moore (1976).
2.3.1. THEOREM. If $T \neq 0$ is a compact Hermitian operator in a Hilbert space $H$, then there exist real numbers $\nu_1, \nu_2, \ldots, \nu_i, \ldots$, $\nu_i \neq \nu_j$ for $i \neq j$, and finite-dimensional subspaces $H_i$ of $H$ such that for all $x \in H$
\[ Tx = \sum_i \nu_i P_{H_i} x, \]
where $P_{H_i}$ is the orthogonal projection onto the subspace $H_i$. The set of eigenvalues $\{\nu_1, \ldots, \nu_i, \ldots\}$ is either finite or countably infinite. In the latter case $\lim \nu_i = 0$.
We recall that a bounded linear operator $T$ on $H$ is called Hermitian or self-adjoint if $(Tx, y) = (x, Ty)$ for all $x, y \in H$. A bounded linear operator $P$ on $H$ is called an (orthogonal) projection if it is Hermitian and idempotent. There is a one-to-one correspondence between closed linear subspaces $E$ of $H$ and projections $P_E$ on $H$. We recall that $P_E x$ is the unique point $y \in E$ such that $\|x - y\| \leq \|x - z\|$ for all $z \in E$.
For the proof of Theorem 2.3.1 we use Proposition 2.2.3 to pick a hyperfinite-dimensional space $E \in {}^*\mathcal{F}_H$ such that $H \subseteq E \subseteq {}^*H$. $E$ is a “finite approximation from above” to $H$. And we shall use “finite-dimensional” linear algebra in $E$ to prove the spectral theorem in $H$. Let $P = P_E$ be the projection in ${}^*H$ onto $E$. We note that if $x \in {}^*H$ is nearstandard, then $Px \approx x$. Let $T$ be a nontrivial compact Hermitian operator on $H$; we write $T'$ for the operator $P\,{}^*T$ restricted to $E$. $T'$ can also be described as $P\,{}^*T P$ restricted to $E$. We note a few elementary facts about $T'$. Since $T$ is bounded on $H$, $T'$ is bounded on $E$; in fact, $\|T'\| \leq \|T\|$. And if $x \in H$, $T'x = Tx$. We also see that $T'$ is Hermitian on $E$. The compactness of $T$ immediately implies that if $x \in E$ is finite, then $T'x$ is nearstandard.
Thus we have a compact Hermitian operator $T'$ on the hyperfinite-dimensional space $E$, and we can use standard linear algebra. By transfer we have a set of eigenvalues $\lambda_1, \ldots, \lambda_\nu$, $\nu \in {}^*\mathbb{N}$, for $T'$, where $|\lambda_1| \geq |\lambda_2| \geq \cdots \geq |\lambda_\nu|$. We also have a corresponding set of orthonormal eigenvectors $r_1, r_2, \ldots, r_\nu$ satisfying the eigenvalue equations
\[ T' r_i = \lambda_i r_i, \qquad i = 1, 2, \ldots, \nu. \]
By “standard parts” we shall push this back to a result about $T$ on $H$. First we note that each $\lambda_i$ is finite; in fact, $|\lambda_i| \leq \|T\|$ for each $i \leq \nu$. This means that $r_j$ is nearstandard for all noninfinitesimal $\lambda_j$. The reader should note that this is the point where we use the compactness assumption on $T$. From the eigenvalue equation $T' r_j = \lambda_j r_j$ and the fact that $r_j$ is norm-finite, compactness of $T'$ implies that $\lambda_j r_j$ is nearstandard. So if $\lambda_j$ is not infinitesimal, then $r_j$ is nearstandard. We shall need the following lemma, which adds a further ingredient from nonstandard theory.
2.3.2. LEMMA. Let $\varepsilon \in \mathbb{R}_+$ and suppose that $|\lambda_j| \geq \varepsilon$ for all $j \leq k$; then $k \in \mathbb{N}$.
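From our remark we see that $r_j$ is nearstandard for all $j \leq k$, and $\|r_i - r_j\| = \sqrt{2}$ for all $i, j \leq k$, $i \neq j$. Suppose to the contrary that $k \in {}^*\mathbb{N} - \mathbb{N}$. Consider the internal sequence
\[ s_j = \begin{cases} r_j & \text{if } j \leq k, \\ 0 & \text{if } j > k. \end{cases} \]
This internal sequence has the property that $\|s_i - s_j\| \geq \sqrt{2}$ for $i \neq j$, $i, j \in \mathbb{N}$. Thus the set
\[ \{ i \in {}^*\mathbb{N} \mid (\forall j < i) \, \|s_j - s_i\| \geq \sqrt{2} \} \]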
contains an $i_1 < k$ in ${}^*\mathbb{N} - \mathbb{N}$. For $i \in \mathbb{N}$ let $t_i = \mathrm{st}(s_i)$. This is a standard sequence and thus has an extension $(t_i \mid i \in {}^*\mathbb{N})$. Since $t_i \approx s_i$ for all $i \in \mathbb{N}$, there exists an $i_2 \in {}^*\mathbb{N} - \mathbb{N}$ such that $t_i \approx s_i$ for all $i < i_2$. Choose an infinite $i_0 < \min\{i_1, i_2\}$. Then $i_0 < k$ and $t_{i_0} \approx s_{i_0}$ is nearstandard. Therefore $s = \mathrm{st}(t_{i_0}) = \mathrm{st}(s_{i_0}) = \mathrm{st}(r_{i_0})$ is a limit point of the sequence $(t_i \mid i \in \mathbb{N})$. But this is clearly impossible since $\|s - t_i\| \geq \sqrt{2}$ for all $i \in \mathbb{N}$.
From this lemma we see that if we have a block of equal and noninfinitesimal eigenvalues
\[ \lambda_{r+1} = \cdots = \lambda_{r+k}, \]
then the length of the block $k$ is a finite integer. We rewrite the set of eigenvalues as a sequence $\kappa_i$, $i \leq \mu \leq \nu$, without repetition. Let the eigenvectors corresponding to $\kappa_i$ be $r^{(i)}_1, \ldots, r^{(i)}_{m_i}$. Write
\[ E_i = \operatorname{span}(r^{(i)}_1, \ldots, r^{(i)}_{m_i}). \]
At this point $i$ and $m_i$ may both be hyperfinite. But transfer of standard matrix theory tells us that we can write
\[ E = E_1 \oplus E_2 \oplus \cdots \oplus E_\mu \]
with the associated spectral resolution of $T'$,
\[ T' = \kappa_1 P_{E_1} + \kappa_2 P_{E_2} + \cdots + \kappa_\mu P_{E_\mu}. \]
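The resolution of $T'$ is obtained by transfer from ordinary finite-dimensional linear algebra. As a sanity check of that finite-dimensional fact, the following Python sketch computes the spectral resolution of a real symmetric $2 \times 2$ matrix and reassembles the matrix from its eigenvalues and projections. The function names `spectral_2x2` and `reassemble` are our own illustrative choices and do not come from the text; the sketch covers only the real $2 \times 2$ case.

```python
import math


def spectral_2x2(a, b, d):
    """Spectral resolution of the real symmetric matrix [[a, b], [b, d]].

    Returns a list of (eigenvalue, projection) pairs without repetition of
    eigenvalues; each projection is a 2x2 matrix given as a list of rows.
    """
    if b == 0 and a == d:
        # Scalar matrix: one eigenvalue, whose eigenspace is the whole plane.
        return [(float(a), [[1.0, 0.0], [0.0, 1.0]])]
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    pairs = []
    for kappa in (mean + radius, mean - radius):
        # (b, kappa - a) solves the first row of (T - kappa) v = 0.
        v = (b, kappa - a)
        if math.hypot(v[0], v[1]) < 1e-12:
            # Degenerate choice (happens when b == 0); use the second row.
            v = (kappa - d, b)
        norm2 = v[0] ** 2 + v[1] ** 2
        # Orthogonal projection onto span(v): P = v v^T / |v|^2.
        proj = [[v[i] * v[j] / norm2 for j in range(2)] for i in range(2)]
        pairs.append((kappa, proj))
    return pairs


def reassemble(pairs):
    """Return the matrix sum of kappa * P over all (kappa, P) pairs."""
    return [[sum(k * p[i][j] for k, p in pairs) for j in range(2)]
            for i in range(2)]


if __name__ == "__main__":
    pairs = spectral_2x2(2.0, 1.0, 2.0)  # eigenvalues 3 and 1
    print([kappa for kappa, _ in pairs])
    print(reassemble(pairs))
```

In the hyperfinite setting the same identity holds with $\mu$ possibly infinite; the code only illustrates the standard statement that is being transferred.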
A regularity argument remains; let
\[ W = \{ j \in {}^*\mathbb{N} \mid \kappa_j \not\approx 0 \}. \]
2.3.3. LEMMA. $W$ is a nonempty subset of $\mathbb{N}$.
The main thing to show is that $W$ is nonempty; it then follows from Lemma 2.3.2 that $W$ is already a subset of $\mathbb{N}$. For a contradiction assume that $W = \emptyset$; then $\kappa = |\kappa_1| \approx 0$. Since $T \neq 0$, there is an element $x \in H \subseteq E$ such that $\|x\| \leq 1$ and $\|T'x\| = \|Tx\| > 0$; being a standard real, $\|Tx\|$ is not infinitesimal. By the above decomposition we can write $x = x_1 + \cdots + x_\mu$ and $T'x = \kappa_1 x_1 + \cdots + \kappa_\mu x_\mu$. Then
\[ \|T'x\|^2 = \sum_{i \leq \mu} |\kappa_i|^2 \|x_i\|^2 \leq \kappa^2 \sum_{i \leq \mu} \|x_i\|^2 = \kappa^2 \|x\|^2 \approx 0. \]
This is the desired contradiction which proves that $W$ is nonempty.
It remains to put the pieces together. If $\lambda_i \not\approx 0$, then $r_i$ is nearstandard and the pair $\mathrm{st}(r_i)$, $\mathrm{st}(\lambda_i)$ satisfy the eigenvalue equation for $T$ in $H$. If $j \in W$, then $m_j$ is finite. Let $s^{(j)}_i = \mathrm{st}(r^{(j)}_i)$ and $\nu_j = \mathrm{st}(\kappa_j)$ for $j \in W$. Let
\[ H_j = \operatorname{span}(s^{(j)}_1, \ldots, s^{(j)}_{m_j}) \subseteq H. \]
We note that if $x \in H$ then $P_{H_j} x \approx P_{E_j} x$. We can now write
\[ H = H_0 \oplus H_1 \oplus H_2 \oplus \cdots, \]
where $H_0$ is the orthogonal complement of $H_1 \oplus H_2 \oplus \cdots$ in $H$. The sum is finite if $W$ is finite. It is infinite if $W = \mathbb{N}$; both cases may happen. From the spectral decomposition of $T'$ we derive for $x \in H$
\[ Tx = \nu_1 P_{H_1} x + \nu_2 P_{H_2} x + \cdots. \]
This completes our exposition. We just add that in the case $W = \mathbb{N}$ we immediately see that $\lim_{i \to \infty} \nu_i = 0$. This is so since $|\nu_i| \approx |\kappa_i|$ for all $i \in \mathbb{N}$. Thus for some $\omega \in {}^*\mathbb{N} - \mathbb{N}$, $|\nu_\omega| \approx |\kappa_\omega|$. From the definition of $W$ it follows that $|\kappa_\omega| \approx 0$. Thus 0 is a limit point of the monotone decreasing sequence $(|\nu_i| : i \in \mathbb{N})$.
An early success of the technique of approximating a Hilbert space $H$ “from above” by a hyperfinite-dimensional linear space was the solution by Bernstein and Robinson of the invariant subspace problem. The result is as follows:
2.3.4. THEOREM. Let $T$ be a bounded linear operator on a separable Hilbert space $H$. Suppose $T$ is such that for some complex polynomial
$p(\lambda) = c_0 + c_1 \lambda + \cdots + c_m \lambda^m$, $c_m \neq 0$, $p(T)$ is compact. Then $T$ leaves invariant at least one closed linear subspace of $H$ other than $H$ or $\{0\}$.
This example is noteworthy; here we have a classical problem whose solution was first given by nonstandard methods. The special case $p(\lambda) = \lambda$ was solved by von Neumann in 1930. Successive extensions are due to Aronszajn, Smith, and Halmos. The problem was finally settled by Bernstein and Robinson in 1966; for an exposition see Robinson (1966) and Bernstein (1973). The result has subsequently been extended in a standard framework by Lomonozov (1973); for a general exposition see the book by Radjavi and Rosenthal (1973). However, the nonstandard approach exemplifies a general technique that has proved useful on many other occasions. We start out with a basic finite-dimensional fact, in this case the fact that in a finite-dimensional complex linear space $E$ of dimension $m$ any linear operator possesses a chain of invariant subspaces
\[ E_0 = \{0\} \subseteq E_1 \subseteq \cdots \subseteq E_m = E, \]
where $\dim(E_j) = j$, $0 \leq j \leq m$. The given Hilbert space $H$ will be imbedded into a suitably chosen hyperfinite-dimensional extension $H_\nu$, and one uses the transfer of the above finite-dimensional result in the process of obtaining an invariant subspace for the given operator on $H$; see Bernstein (1973) and Robinson (1966).
2.4. NONSTANDARD METHODS IN BANACH SPACE THEORY
The nonstandard hull construction (see Section 2.2) has had important applications in the study of local properties of Banach spaces. We aim in this book toward other applications, but feel that our introductory account would be incomplete if we did not at least give a glimpse into an area of fruitful applications of nonstandard methods.

Let $(F, \|\cdot\|)$ be an internal normed linear space; in Section 2.2 we introduced the nonstandard hull of $F$, $\hat F = \mathrm{Fin}(F)/\approx$. There is an isometric embedding of $F$ into $\hat F$ and we showed in Proposition 2.2.1 that $\hat F$ is a Banach space.

2.4.1. EXAMPLE. For each $p \ge 1$ in ${}^*\mathbb{R}$ and each $n \in {}^*\mathbb{N}$ we can define the internal space $l_p(n)$ in exact analogy with the standard notion. By varying $p$ and $n$, the nonstandard hulls $\hat l_p(n)$ give us many interesting examples of hyperfinite-dimensional spaces. It is easy to see that if $p$ and $n$ are both finite then $\hat l_p(n)$ is order isometric to $l_{p'}(n)$, where $p' = \mathrm{st}(p)$. If $p$ is infinite
and $n$ finite, then $\hat l_p(n)$ is order isometric to $l_\infty(n)$. But new possibilities arise when $n$ is infinite; see Henson and Moore (1983, Section 9) for an introduction.

As remarked in Section 2.2, the notion of nonstandard hull was introduced by Luxemburg (1969). There is a parallel development due to Krivine and continued by Stern using ultraproducts. A comprehensive introduction can be found in the recent survey papers by Heinrich (1980) and by Henson and Moore (1983), who are among the main contributors to this field. As mentioned above, we offer only a brief look and recommend the papers by Heinrich and Henson and Moore for a full account.

We recall that a Banach space $E$ is called reflexive if the canonical embedding of $E$ into its second dual $E^{**}$ is surjective. A stronger notion of superreflexivity has turned out to be important in many connections. To explain this notion we need the concept of finite representability. Let $E$ and $F$ be standard Banach spaces; $E$ is said to be finitely representable in $F$ if for each finite-dimensional subspace $E_0$ of $E$ and each positive real number $\varepsilon$ there exists a linear transformation $T$ of $E_0$ into $F$ such that

$$\|x\| \le \|Tx\| \le (1 + \varepsilon)\|x\|$$

for all $x \in E_0$. $F$ is said to be superreflexive if $E$ finitely representable in $F$ implies that $E$ is reflexive. Every superreflexive Banach space is reflexive, but the converse is not in general true.

2.4.2. PROPOSITION. A nonstandard hull $\hat F$ is reflexive if and only if it is superreflexive.

Let $\hat F$ be reflexive; then every separable subspace $E$ of $\hat F$ is reflexive. We need to show that every separable space $E$ that is finitely representable in $\hat F$ is reflexive. Being finitely representable in a space is not in general the same as being isomorphic to a subspace of it. However, in the case of a nonstandard hull we have the following result, which proves 2.4.2.

2.4.3. PROPOSITION. Let $E$ be a separable Banach space and $F$ an internal Banach space. Then $E$ is finitely representable in $\hat F$ if and only if $E$ is isometrically embeddable in $\hat F$.

PROOF. Let $E$ be finitely representable in $\hat F$. Since $E$ is separable, there is an increasing family $E_1 \subseteq E_2 \subseteq \cdots \subseteq E_n \subseteq \cdots$ of finite-dimensional subspaces of $E$ such that $\dim E_n = n$ and $\bigcup_{n\in\mathbb{N}} E_n$ is dense in $E$. Let $e_1, \ldots, e_n$ be a basis for $E_n$; by assumption there exists a linear transformation $T_n : E_n \to \hat F$ such that $\|x\| \le \|T_n x\| \le (1 + 1/2n)\|x\|$ for all $x \in E_n$. Pick elements $p_1, p_2, \ldots, p_n \in F$ such that $p_i/\!\approx\; = T_n(e_i)$, $i = 1, 2, \ldots, n$. We see
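that the map $T_n'$ of ${}^*E_n$ onto the linear span [in the sense of $V({}^*\mathbb{R})$] of $p_1, \ldots, p_n$ in $F$ given by

$$T_n'\Big(\sum_{i=1}^{n} \lambda_i e_i\Big) = \sum_{i=1}^{n} \lambda_i p_i$$

is an internal, linear, one-to-one map of ${}^*E_n$ into $F$ satisfying

$$(1 - 1/n)\|q\| \le \|T_n' q\| \le (1 + 1/n)\|q\| \qquad (*)$$

for all $q \in {}^*E_n$ with $\|q\| = 1$.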
We are set for an application of the internal definition principle. The set of integers $n \in {}^*\mathbb{N}$ such that there exists an internal, linear, one-to-one mapping $T_n'$ of ${}^*E_n$ into $F$ satisfying $(*)$ is an internal set containing $\mathbb{N}$; hence it contains some $n_0 \in {}^*\mathbb{N} - \mathbb{N}$. The map $T$ defined by $T(q/\!\approx) = T_{n_0}'(q)/\!\approx$ defines an embedding of the nonstandard hull $\hat E_{n_0}$ of $E_{n_0}$ into $\hat F$. We notice that $\bigcup_{n\in\mathbb{N}} E_n$ is in a natural way contained in the hyperfinite extension $\hat E_{n_0}$; thus the extension to $E$ of the restriction of $T$ to $\bigcup_{n\in\mathbb{N}} E_n$ gives the desired embedding of $E$ into $\hat F$.

We can use these ideas to give a simple proof of a result due to Enflo et al. (1975). It is well known that reflexivity is a three-space property; i.e., if $E$ is a Banach space and $F$ is a closed subspace such that both $F$ and $E/F$ are reflexive, then $E$ is reflexive. Enflo et al. showed that superreflexivity is also a three-space property. Rakov observed that using the nonstandard hull construction, one has the following simple proof; see Heinrich (1980) for further references.
2.4.4. PROPOSITION. Let $E$ be a Banach space and $F$ a closed subspace of $E$. If both $F$ and $E/F$ are superreflexive, then $E$ is superreflexive.

PROOF. A simple argument shows that if a Banach space $E$ is superreflexive, then $\hat E$ is reflexive. A further easy calculation shows that the nonstandard hull of $E/F$ is canonically isometric to $\hat E/\hat F$. It thus follows from the assumptions of 2.4.4 that $\hat F$ and $\hat E/\hat F$ are reflexive. By the classical result $\hat E$ is reflexive. By Proposition 2.4.2 it follows that $\hat E$ is superreflexive, and so obviously is $E$ since $E$ can be regarded as a subspace of $\hat E$.
We stop our account here. The present proof that superreflexivity is a three-space property clearly indicates the strength of the nonstandard hull or ultrapower construction. But we have not in this glimpse touched the really deep and difficult results such as the Kürsten-Stern local duality theorem and Krivine's results on block finite representability of the $c_0$ and $l_p$ bases. For these and other applications we refer once more to the surveys of Heinrich (1980) and Henson and Moore (1983).
REFERENCES

B. Benninghofen and M. M. Richter (1983). General theory of superinfinitesimals (preprint), RWTH Aachen.
B. Benninghofen and K. D. Stroyan (1984). Bounded-weak-star continuity. Univ. of Iowa.
A. R. Bernstein (1973). Non-standard analysis. In Studies in Model Theory. Math. Ass. of Amer.
C. C. Chang and H. J. Keisler (1973). Model Theory. North-Holland Publ., Amsterdam.
M. Davis (1977). Applied Nonstandard Analysis. Wiley, New York.
P. Enflo, J. Lindenstrauss, and G. Pisier (1975). On the "three space problem". Math. Scand. 36.
J. E. Fenstad (1967). A note on "standard" versus "non-standard" topology. Indag. Math. 29.
J. E. Fenstad (1985). Is nonstandard analysis relevant for the philosophy of mathematics? Synthese 62.
J. E. Fenstad and A. Nyberg (1970). Standard versus nonstandard methods in uniform topology. Logic Colloq. 1969, North-Holland Publ., Amsterdam.
S. Heinrich (1980). Ultraproducts in Banach space theory. J. Reine Angew. Math. 313.
C. W. Henson and L. C. Moore (1983). Nonstandard analysis and the theory of Banach spaces. Nonstandard Analysis: Recent Developments. Lecture Notes in Math. 383, Springer-Verlag, Berlin and New York.
H. J. Keisler (1976). Foundations of Infinitesimal Calculus. Prindle, Weber and Schmidt, Boston, Massachusetts.
V. J. Lomonozov (1973). Invariant subspaces for operators commuting with compact operators. Functional Anal. Appl. 7.
W. A. J. Luxemburg (1969). A general theory of monads. In Applications of Model Theory to Algebra, Analysis and Probability. Holt, New York.
L. C. Moore (1976). Hyperfinite extensions of bounded operators on a separable Hilbert space. Trans. Amer. Math. Soc. 218.
E. Nelson (1977). Internal set theory. Bull. Amer. Math. Soc. 83.
H. Radjavi and P. Rosenthal (1973). Invariant Subspaces. Springer-Verlag, Berlin and New York.
M. M. Richter (1982). Ideale Punkte, Monaden, und Nichtstandard-Methoden. Vieweg, Wiesbaden.
A. Robinson (1966). Non-Standard Analysis. North-Holland Publ., Amsterdam.
K. D. Stroyan and W. A. J. Luxemburg (1976). Introduction to the Theory of Infinitesimals. Academic Press, New York.
CHAPTER 3
PROBABILITY

In this chapter we shall give an introduction to hyperfinite probability theory, while at the same time presenting the necessary background from the theory of measure and integration. We shall illustrate the general theory by discussing the particular but important case of Brownian motion. In the last two sections we shall present a hyperfinite approach to limit measures and measure extensions.

3.1. THE LOEB MEASURE

Measure theory and probability theory were studied early within the context of nonstandard analysis. However, there were troublesome points, since *-extensions of $\sigma$-additive measures are not in general $\sigma$-additive in the extended universe $V({}^*\mathbb{R})$. The breakthrough came with a paper by Loeb (1975). His construction of the Loeb measure, which converts an internal measure to a standard $\sigma$-additive measure, is the key to the nonstandard approach to stochastic analysis.

Let $(X, \mathcal{A}, \nu)$ be an internal measure space; i.e., $X$ is an internal set in $V({}^*\mathbb{R})$, $\mathcal{A}$ is an internal algebra of subsets of $X$, and $\nu$ is a finitely additive
internal measure on $\mathcal{A}$. It should be noticed that $\nu$ takes values in $({}^*\mathbb{R})_+$. $\mathcal{A}$ being internal means that $\mathcal{A}$ also is closed under *-finite unions; i.e., if $\omega \in {}^*\mathbb{N} - \mathbb{N}$ and $\{A_i\}$ is an internal sequence such that $A_i \in \mathcal{A}$ for each $i$, then $A_1 \cup A_2 \cup \cdots \cup A_\omega \in \mathcal{A}$.

From the internal algebra $\mathcal{A}$ we can in standard fashion generate the external $\sigma$-algebra $\sigma(\mathcal{A})$. Is it possible in some way to obtain from $\nu$ on $\mathcal{A}$ a standard $\sigma$-additive measure on $\sigma(\mathcal{A})$? Or, more specifically, let ${}^\circ\nu$, the standard part of $\nu$, be the set map

$${}^\circ\nu(A) = {}^\circ(\nu(A)), \qquad A \in \mathcal{A};$$

${}^\circ\nu$ is a map from $\mathcal{A}$ to $\mathbb{R}_+ \cup \{\infty\}$. Can ${}^\circ\nu$ be extended from $\mathcal{A}$ to $\sigma(\mathcal{A})$?

If $\nu$ is an internal probability measure, i.e., $\nu(X) = 1$, the Countable Saturation Principle 2.1.2 gives an immediate affirmative answer. In this case ${}^\circ\nu$ is a finite measure on the algebra $\mathcal{A}$ and it is well known that such a measure can be extended to a $\sigma$-additive measure on $\sigma(\mathcal{A})$ if the following continuity property is satisfied: let $A_1, A_2, \ldots \in \mathcal{A}$; if $A_n \downarrow \emptyset$, then ${}^\circ\nu(A_n) \downarrow 0$. But this is trivially satisfied because of saturation, 2.1.2: if $A_n \downarrow \emptyset$, then $A_n = \emptyset$ for some finite $n$, and convergence is trivial.

In the general case the Carathéodory extension procedure gives the answer. For the proof we need the following technical lemma, which shows that a countable, infinite union of disjoint nonempty sets in $\mathcal{A}$ is never itself an element of $\mathcal{A}$, which, trivially, implies that ${}^\circ\nu$ is countably additive on $\mathcal{A}$.
3.1.1. LEMMA. Let $A_n \in \mathcal{A}$ for $n = 0, 1, 2, \ldots$; if $A_0 \subseteq A_1 \cup A_2 \cup \cdots$, then there is an $m \in \mathbb{N}$ such that $A_0 \subseteq A_1 \cup A_2 \cup \cdots \cup A_m$.

For the proof, let $(A_n)_{n \in {}^*\mathbb{N}}$ be an internal sequence extending $A_0, A_1, A_2, \ldots$; see 2.1.3. The set

$$\Big\{ m \in {}^*\mathbb{N} \;\Big|\; A_0 \subseteq \bigcup_{n=1}^{m} A_n \Big\}$$

is internal and contains every $m \in {}^*\mathbb{N} - \mathbb{N}$. It follows from 1.2.7 that it must contain some $m \in \mathbb{N}$.

We can now state the main result due to Loeb (1975).

3.1.2. THEOREM. Let $(X, \mathcal{A}, \nu)$ be an internal measure space. The measure ${}^\circ\nu$ has a unique $\sigma$-additive extension, denoted by $L(\nu)$, to the $\sigma$-algebra $\sigma(\mathcal{A})$ generated by $\mathcal{A}$. Furthermore, if ${}^\circ\nu(X) < \infty$, then

(i) For each $B \in \sigma(\mathcal{A})$ there exists $A \in \mathcal{A}$ such that $L(\nu)(A \,\Delta\, B) = 0$ (where $A \,\Delta\, B = (A - B) \cup (B - A)$).
(ii) For each $B \in \sigma(\mathcal{A})$ and each $\varepsilon \in \mathbb{R}_+$ there are sets $C, D \in \mathcal{A}$ such that $C \subseteq B \subseteq D$ and $L(\nu)(D) - \varepsilon \le L(\nu)(B) \le L(\nu)(C) + \varepsilon$.
The existence is clear from Carathéodory's extension theorem, which also tells us that the extension is unique provided ${}^\circ\nu(X) < \infty$. When $\nu$ is infinite, a more refined argument is necessary to establish the uniqueness; see Henson (1979). Since we are mostly interested in probability measures, we omit this part.

We now turn to (ii). By construction of $L(\nu)$ there is a sequence $(A_n)_{n\in\mathbb{N}}$ of elements of $\mathcal{A}$ with $A_n \subseteq A_{n+1}$ such that $B \subseteq \tilde B = \bigcup_{n\in\mathbb{N}} A_n$ and $L(\nu)(\tilde B) < L(\nu)(B) + \varepsilon$. Extend the sequence to an internal sequence $(A_n)_{n \in {}^*\mathbb{N}}$. For any $\omega \in {}^*\mathbb{N} - \mathbb{N}$ we clearly have $B \subseteq \tilde B \subseteq A_\omega$. We need to choose $\omega$ such that $\nu(A_\omega) \le L(\nu)(B) + \varepsilon$, as we can then put $D = A_\omega$. Let $r = L(\nu)(B)$ and consider the set $\{ m \in {}^*\mathbb{N} \mid \nu(A_m) \le r + \varepsilon \}$. This set is internal and contains $\mathbb{N}$, hence it also contains some $\omega \in {}^*\mathbb{N} - \mathbb{N}$. Note that $L(\nu)$ and $B$ are external objects, but the number $r$ is internal. Applying the same argument to $X - B$, we find $C$.

It is now easy to prove (i). For each $n$, let $C_n$ and $D_n$ be elements in $\mathcal{A}$ such that $C_n \subseteq B \subseteq D_n$ and

$$L(\nu)(D_n) - (1/n) \le L(\nu)(B) \le L(\nu)(C_n) + (1/n).$$

We may assume that $\{C_n\}$ is increasing and $\{D_n\}$ decreasing. Extend $\{C_n\}$ and $\{D_n\}$ to an increasing, respectively decreasing, internal sequence such that $C_n \subseteq D_n$ for all $n \in {}^*\mathbb{N}$, and put $A = C_\omega$ for some $\omega \in {}^*\mathbb{N} - \mathbb{N}$.

3.1.3. EXAMPLE. The space $2^{\mathbb{N}}$ of infinite sequences of 0's and 1's is the standard model of unlimited or infinite coin tossing. More precisely, the model is the measure space $(2^{\mathbb{N}}, \mathcal{B}, \mu)$ defined in the following way: on each factor $2 = \{0, 1\}$ start with the $\sigma$-algebra of all subsets and the counting measure giving equal weight to each point in the space. Then $\mathcal{B}$ is the usual product $\sigma$-algebra and $\mu$ the product measure. In this case existence of $\mathcal{B}$ and $\mu$ is trivial; in more general cases there are often problems.
A nonstandard way of modeling the same phenomena would be to choose some $\eta \in {}^*\mathbb{N} - \mathbb{N}$ and consider the space $\Omega = \{0, 1\}^\eta$ of all internal sequences of 0's and 1's of length $\eta$. In this case we let $\mathcal{A}$ be the algebra of all internal subsets of $\Omega$ and $P$ the associated counting measure. This means that for each $A \in \mathcal{A}$

$$P(A) = |A| / 2^\eta,$$

where for any internal set $A \subseteq \Omega$, $|A|$ is the internal cardinality of $A$, i.e., the number of elements in $A$, and $2^\eta = |\Omega|$. Here $(\Omega, \mathcal{A}, P)$ is an internal measure space. Let $L(P)$ be the associated Loeb measure on $\sigma(\mathcal{A})$; $(\Omega, \sigma(\mathcal{A}), L(P))$ is a standard measure space; however, on a rather unusual sample space. The possible usefulness of the nonstandard approach lies in the fact that we have two spaces to play with.
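The counting measure $P(A) = |A|/2^\eta$ can be tried out in a truly finite setting. The following is only a sketch under a loud assumption: a standard integer `eta = 10` stands in for the infinite $\eta$, and the helper names `counting_measure` and `cylinder` are hypothetical, introduced here for illustration. It checks that a cylinder set fixing the first three tosses gets the weight $2^{-3}$ that the product measure would assign.

```python
from fractions import Fraction
from itertools import product


def counting_measure(event, eta):
    """P(A) = |A| / 2**eta for a set A of 0/1 sequences of length eta."""
    return Fraction(len(event), 2 ** eta)


def cylinder(prefix, eta):
    """All 0/1 sequences of length eta that start with the given prefix."""
    tail_len = eta - len(prefix)
    return {tuple(prefix) + tail for tail in product((0, 1), repeat=tail_len)}


eta = 10  # assumption: a standard stand-in for the infinite eta of the text
A = cylinder((1, 0, 1), eta)          # first three tosses are fixed
p = counting_measure(A, eta)          # weight of the cylinder set
total = counting_measure(cylinder((), eta), eta)  # weight of the whole space
```

In the finite model the value does not depend on `eta` as long as `eta` is at least the prefix length, which is the finite shadow of the measure-preserving property discussed in the example.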
The space $(\Omega, \mathcal{A}, P)$ as an internal object is in some sense "finite," which means that we can calculate as if we were in a truly finite situation. On the other hand, the external space $(\Omega, \sigma(\mathcal{A}), L(P))$ has a well-behaved measure and integration theory. How is $(2^{\mathbb{N}}, \mathcal{B}, \mu)$ related to these structures? Again, in a suitable sense, it should be the "standard part" of the nonstandard constructs. Let $\mathrm{st}_\eta : \Omega \to 2^{\mathbb{N}}$ be the restriction map; we would like to assert that for any $A \in \mathcal{B}$

$$\mu(A) = L(P)(\mathrm{st}_\eta^{-1}(A)),$$

i.e., that $\mathrm{st}_\eta$ is a measure-preserving map. This is almost true; we need only be a bit careful about the $\sigma$-algebras involved.

3.1.4. REMARK. So far we have considered extensions $(\Omega, \sigma(\mathcal{A}), L(P))$ of $(\Omega, \mathcal{A}, P)$. For many purposes it is more convenient to work with the completion of $(\Omega, \sigma(\mathcal{A}), L(P))$. We denote the completion by $(\Omega, L(\mathcal{A}), L(P))$ and call it the Loeb space associated with $(\Omega, \mathcal{A}, P)$. Similarly, the completed measure $L(P)$ will be called the Loeb measure of $P$. Referring back to the previous example, we may show that if $B$ is a Borel set in $2^{\mathbb{N}}$ then $\mathrm{st}_\eta^{-1}(B) \in L(\mathcal{A})$ in $\Omega$; see Sections 3.4 and 3.5 for a general discussion.
3.1.5. REMARK. Although we have used Carathéodory's extension theorem to obtain Loeb measures, there is a simple and elegant direct construction which presupposes no measure theory: given an internal measure $P$ on an internal algebra $\mathcal{A}$, define inner and outer measures $\underline{P}$ and $\overline{P}$ by

$$\underline{P}(A) = \sup\{ {}^\circ P(B) \mid B \in \mathcal{A} \text{ and } B \subseteq A \}$$

and

$$\overline{P}(A) = \inf\{ {}^\circ P(B) \mid B \in \mathcal{A} \text{ and } B \supseteq A \}.$$

Then the Loeb algebra $L(\mathcal{A})$ is just the collection of all sets $A$ such that $\underline{P}(A) = \overline{P}(A)$, and $L(P)(A)$ is just this common value. For the details, see, e.g., Cutland (1983) or Stroyan and Bayod (1985). A more radically different, Daniell-type approach to nonstandard measure theory has been advocated by Loeb (1983, 1984) [see also Hurd and Loeb (1983)]; it has the advantage of allowing one to develop measure and integration theory simultaneously. There could be interesting connections to vector-valued Loeb measures, a topic which has only recently begun to attract attention (Osswald, 1983, 1985; Zivaljevic, 1985).
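The inner and outer measure construction of Remark 3.1.5 has an elementary finite shadow, sketched below. This is an illustration under stated assumptions, not the Loeb construction itself: the sample space is a finite set with the counting measure, the algebra is the one generated by a partition, no standard parts are needed, and the function name `inner_outer` is hypothetical. In that setting the supremum over sets of the algebra contained in $A$ is attained by the union of the blocks inside $A$, and the infimum over supersets by the union of the blocks meeting $A$.

```python
from fractions import Fraction


def inner_outer(A, partition, omega_size):
    """Inner and outer counting measure of A relative to the algebra
    generated by a partition of a finite set with omega_size points."""
    A = set(A)
    # Largest set of the algebra inside A: union of blocks contained in A.
    inner = sum(len(b) for b in partition if set(b) <= A)
    # Smallest set of the algebra covering A: union of blocks meeting A.
    outer = sum(len(b) for b in partition if set(b) & A)
    return Fraction(inner, omega_size), Fraction(outer, omega_size)


partition = [{0, 1}, {2, 3}, {4, 5, 6, 7}]
lo1, hi1 = inner_outer({0, 1, 2, 3}, partition, 8)  # a union of blocks
lo2, hi2 = inner_outer({0, 2}, partition, 8)        # cuts through two blocks
```

A set is "measurable" in this toy model exactly when the two values agree, which mirrors the criterion $\underline{P}(A) = \overline{P}(A)$ of the remark.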
3.2. HYPERFINITE PROBABILITY SPACES
We claimed that our internal model $(\Omega, \mathcal{A}, P)$ in a "suitable sense" was finite. The time has come to be more precise. A subset $A \subseteq \mathbb{N}$ is finite if it is a subset of some proper initial segment of $\mathbb{N}$; i.e., for some $m \in \mathbb{N}$, $A \subseteq \{ n \in \mathbb{N} \mid n \le m \}$. An internal subset $A \subseteq {}^*\mathbb{N}$ is called hyperfinite (or *-finite) if $A \subseteq \{ n \in {}^*\mathbb{N} \mid n \le m \}$, for some $m \in {}^*\mathbb{N}$. In general, an internal set $E \in V({}^*\mathbb{R})$ is called hyperfinite if there is an internal one-to-one map $f$ of a hyperfinite subset of ${}^*\mathbb{N}$ onto $E$. This is equivalent to the following:

3.2.1. DEFINITION. An internal set $E \in V({}^*\mathbb{R})$ is called hyperfinite if there is an internal bijection $f$ of some proper initial segment $\{ n \in {}^*\mathbb{N} \mid n \le m \}$ of ${}^*\mathbb{N}$ onto $E$. The number $m$ is called the internal cardinality of $E$, in symbols $|E| = m$.

REMARK. In the bounded ultrapower model an internal set $A$ was defined from a bounded sequence $(A_1, A_2, \ldots, A_n, \ldots)$; see the remark following Definition 1.2.5. If each set $A_n$ is finite, then $A$ will be hyperfinite and the internal cardinality of $A$ will be the hyperfinite integer determined by the sequence $(|A_1|, |A_2|, \ldots, |A_n|, \ldots)$ of standard integers, where $|A_n|$ is the number of elements in $A_n$. Every hyperfinite set in the model can be described in this way.

At this point we should draw the reader's attention to the difference between internal and external cardinality. The segment $\{ n \in {}^*\mathbb{N} \mid n \le n_0 \}$, where $n_0 \in {}^*\mathbb{N} - \mathbb{N}$, has the hyperfinite number $n_0$ as its internal cardinality; i.e., there is no internal bijection of this segment onto a smaller one. But seen from the "outside," i.e., within the full structure $V({}^*\mathbb{R})$, the set has uncountably many elements; i.e., there is no (external) map of $\mathbb{N}$ onto $\{ n \in {}^*\mathbb{N} \mid n \le n_0 \}$. This follows from the general fact that an internal set is either finite or uncountable. For let $A$ be an internal set which is countable (but not finite) from the outside; i.e., we have an external enumeration of $A$, $A = \{a_1, a_2, \ldots, a_n, \ldots\}$, in $V({}^*\mathbb{R})$. We can then define internal sets $A_0 \supseteq A_1 \supseteq \cdots$ by setting $A_0 = A$ and $A_n = A - \{a_1, \ldots, a_n\}$ for $n > 0$. $A_0, A_1, \ldots$ is an external sequence of nonempty internal sets. By the Saturation Principle 2.1.2, $\bigcap A_n \neq \emptyset$; but this is a contradiction!

The precise external cardinality of an internal set depends upon the kind of ultrapower construction we use. But it is the internal cardinality which is well behaved and is the important concept in our theory.

Definition 3.2.1 defines the precise sense in which the set $\Omega = \{0, 1\}^\eta$ of Example 3.1.3 is "finite," namely hyperfinite. Thus we see that the definition of $P(A) = |A|/|\Omega|$ is the exact analog of the definition of counting measures in finite probability spaces.
In the following we shall call an internal probability space $(\Omega, \mathcal{A}, P)$, where $\Omega$ is hyperfinite, a hyperfinite probability space.
3.2.2. EXAMPLE. The hyperfinite time line will be another important example of a hyperfinite probability space. The idea is to replace the time interval $[0, 1]$ by a hyperfinite set $T$. To this end choose some $\eta \in {}^*\mathbb{N} - \mathbb{N}$, so that $\Delta t = \eta^{-1}$ is a positive infinitesimal. Let

$$T = \{0, \Delta t, 2\Delta t, \ldots, \eta\,\Delta t = 1\},$$

where $T$ is hyperfinite. We notice that if $\eta = \omega!$ for some $\omega \in {}^*\mathbb{N} - \mathbb{N}$, then every standard rational $m/n$ in $[0, 1]$ belongs to $T$; they are all of the form $k\,\Delta t$ for some $k \le \eta$. The standard part map ${}^\circ : T \to [0, 1]$ is onto: no irrational number $r$ in $[0, 1]$ is an element of $T$, but given an irrational $r$ there is a unique $t \in T$ such that $t < r < t + \Delta t$. We let $P$ be the counting measure on $T$; i.e., for any internal set $A \subseteq T$ we set $P(A) = |A|/|T|$. As we shall see below, $(T, \mathcal{A}, P)$, where $\mathcal{A}$ is the internal algebra of all internal subsets of $T$, and the associated Loeb space $(T, L(\mathcal{A}), L(P))$ are our versions of the usual Lebesgue space $([0, 1], \mathcal{B}, \mu)$, where $\mathcal{B}$ denotes the Lebesgue-measurable subsets of $[0, 1]$ and $\mu$ is the standard Lebesgue measure.
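Two features of the time line in Example 3.2.2 can be checked on a finite grid. The sketch below assumes a standard $n = 6$ and $\eta = n!$ in place of the infinite $\omega$ and $\eta = \omega!$; the names `time_line` and `counting_measure` are hypothetical helpers. It verifies that every rational in $[0, 1]$ with denominator at most $n$ lies on the grid, and that the counting measure of $[a, b)$ differs from $b - a$ by at most the mesh $\Delta t$.

```python
from fractions import Fraction
from math import factorial


def time_line(eta):
    """T = {0, dt, 2*dt, ..., eta*dt = 1} with dt = 1/eta."""
    return [Fraction(k, eta) for k in range(eta + 1)]


def counting_measure(A, T):
    """P(A) = |A| / |T| for a subset A of the grid T."""
    return Fraction(len(A), len(T))


n = 6
eta = factorial(n)  # assumption: finite stand-in for eta = omega!
T = time_line(eta)
T_set = set(T)

# All rationals p/q in [0, 1] with denominator q <= n.
rationals = {Fraction(p, q) for q in range(1, n + 1) for p in range(q + 1)}

a, b = Fraction(1, 3), Fraction(3, 4)
measure_ab = counting_measure([t for t in T if a <= t < b], T)
error = abs(measure_ab - (b - a))
```

With an infinite $\eta$ the error becomes infinitesimal, so the standard part of the counting measure of such an interval is exactly its length.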
Spaces and measures give the necessary setting. The mappings and the processes are the objects of central importance. A standard stochastic process is a map

$$x : E \times [0, 1] \to \mathbb{R},$$

where $(E, \mathcal{B}, \mu)$ is some probability space. The value space here is $\mathbb{R}$ but could, e.g., be some suitable separable metric space. A hyperfinite stochastic process is an internal map

$$X : \Omega \times T \to {}^*\mathbb{R},$$

where $T$ is a hyperfinite time line and $(\Omega, \mathcal{A}, P)$ is some hyperfinite probability space.

From $X$ we want to derive a standard process $x$ by taking "standard parts." And conversely, if we let our initial data be an external, i.e., standard process $x : \Omega \times [0, 1] \to \mathbb{R}$ with respect to $(\Omega, L(\mathcal{A}), L(P))$, is it possible to approximate it by a hyperfinite process $X : \Omega \times T \to {}^*\mathbb{R}$? Processes are maps of two variables; thus we need to approximate external objects by internal objects in each factor. This leads to the notion of liftings.
3.2.3. DEFINITIONS. (1) Let $f : \Omega \to \mathbb{R}$ and $F : \Omega \to {}^*\mathbb{R}$. $F$ is called a lifting of $f$ if $F$ is internal and

$${}^\circ F(\omega) = f(\omega)$$

for almost all $\omega \in \Omega$ with respect to the Loeb measure $L(P)$ on $\Omega$.

(2) Let $f : [0, 1] \to \mathbb{R}$ and $F : T \to {}^*\mathbb{R}$. $F$ is called a lifting of $f$ if $F$ is internal and

$${}^\circ F(t) = f({}^\circ t)$$

for almost all $t \in T$ with respect to the Loeb measure $L(P)$ on $T$.
We have the following important result:
3.2.4. THEOREM. (1) Let $(\Omega, \mathcal{A}, P)$ be a hyperfinite probability space and $(\Omega, L(\mathcal{A}), L(P))$ its associated Loeb space. A function $f : \Omega \to \mathbb{R}$ is Loeb-measurable iff $f$ has a lifting $F : \Omega \to {}^*\mathbb{R}$.

(2) Let $([0, 1], \mathcal{B}, \mu)$ be the standard Lebesgue space and $(T, L(\mathcal{A}), L(P))$ the Loeb space associated to the hyperfinite time line $T$. A function $f : [0, 1] \to \mathbb{R}$ is Lebesgue-measurable iff $f$ has a lifting $F : T \to {}^*\mathbb{R}$.

We note that $\mathbb{R}$ can be replaced here by any separable metric space, e.g., $C(\mathbb{R}^n, \mathbb{R}^n)$ in the compact-open topology.

Now we turn to the proof of (1). Let $F$ be a lifting of $f$. Let $N_{1/n}(r) = \{ r' \in \mathbb{R} \mid |r - r'| < 1/n \}$; i.e., $N_{1/n}(r)$ is a standard neighborhood of $r \in \mathbb{R}$. We must prove that $f^{-1}(N_{1/n}(r))$ is Loeb-measurable. Since $F$ lifts $f$ the set

$$U = \{ \omega \in \Omega \mid {}^\circ F(\omega) = f(\omega) \}$$

has $L(P)$ measure 1. Let $\omega \in U$; then

$$f(\omega) \in N_{1/n}(r) \quad\text{iff}\quad |r - {}^\circ F(\omega)| < 1/n,$$
$$\text{iff}\quad {}^\circ|r - F(\omega)| < 1/n,$$
$$\text{iff}\quad |r - F(\omega)| \le 1/n - 1/m, \text{ some } m \in \mathbb{N}.$$

Here the first equivalence is by definition and the second comes from the continuity of the absolute-value function. The importance of the third equivalence lies in the fact that an external condition ${}^\circ|r - F(\omega)| < 1/n$ is replaced by an internal condition $|r - F(\omega)| \le (1/n) - (1/m)$. Thus the set $\{ \omega \in \Omega \mid |r - F(\omega)| \le (1/n) - (1/m) \} \in \mathcal{A}$. Therefore $\bigcup_{m\in\mathbb{N}} \{ \omega \in \Omega \mid |r - F(\omega)| \le (1/n) - (1/m) \} \in L(\mathcal{A})$. Hence

$$U \cap f^{-1}(N_{1/n}(r)) \in L(\mathcal{A}),$$

which suffices to show that $f$ is Loeb measurable.

For the converse let $N_1, N_2, \ldots$ be a countable open base for $\mathbb{R}$ and set $U_n = f^{-1}(N_n) \in L(\mathcal{A})$. By Theorem 3.1.2 we can find internal sets $A_{n,m}$ such
that $L(P)(U_n) \le P(A_{n,m}) + 1/m$ and such that $A_{n,m} \subseteq A_{n,m+1} \subseteq U_n$. We calculate that

$$L(P)\Big(U_n - \bigcup_m A_{n,m}\Big) = 0.$$

Thus the set $U = \Omega - \bigcup_n \big(U_n - \bigcup_m A_{n,m}\big)$ has Loeb measure 1. For $n, m \in \mathbb{N}$ let $\mathcal{F}_{n,m}$ be the internal set of all internal functions $F : \Omega \to {}^*\mathbb{R}$ such that $F(A_{k,l}) \subseteq {}^*N_k$ for all $k \le n$, $l \le m$. Each $\mathcal{F}_{n,m}$ is nonempty. Hence by the Saturation Principle 2.1.2 there exists an internal $F \in \bigcap_{n,m} \mathcal{F}_{n,m}$. For $\omega \in U$ we see that

$$f(\omega) \in N_n \quad\text{implies}\quad F(\omega) \in {}^*N_n.$$

Since $N_1, N_2, \ldots$ is an open base for $\mathbb{R}$, we conclude that ${}^\circ F(\omega) = f(\omega)$ for all $\omega \in U$. Thus $F$ lifts $f$.

For the proof of part (2) of 3.2.4 we need the following result.

3.2.5. PROPOSITION. A set $A \subseteq [0, 1]$ is Lebesgue measurable iff the set $\mathrm{st}^{-1}(A) = \{ t \in T \mid {}^\circ t \in A \}$ is Loeb measurable. In this case we have $\mu(A) = L(P)(\mathrm{st}^{-1}(A))$.
The reader should at this point recall our discussion in Example 3.1.3. Granted Proposition 3.2.5, the proof of 3.2.4(2) reduces to the case 3.2.4(1). Given $f : [0, 1] \to \mathbb{R}$, define $f_1 : T \to \mathbb{R}$ by setting $f_1(t) = f({}^\circ t)$. For any open $U \subseteq \mathbb{R}$, 3.2.5 tells us that $f^{-1}(U)$ is Lebesgue measurable iff $f_1^{-1}(U)$ is Loeb measurable. Pick by (1) a lifting $F$ of $f_1$; then $F$ also lifts $f$.

The intuition behind 3.2.5 should be clear. The Lebesgue measure is the uniform measure on $[0, 1]$, i.e., the continuous version of the counting measure. The corresponding Loeb measure is the counting measure on the hyperfinite approximation $T$ to $[0, 1]$. So everything is all right, except for the fact that there are $\sigma$-algebras involved.

Half of the proof is rather immediate. Namely, let $A$ be Lebesgue measurable. It follows from general constructions that we may restrict ourselves to the case $A = [a, b)$, where $a$ and $b$ are rationals. By our choice of $\Delta t$, cf. 3.2.2, we have $a, b \in T$ and

$$\mathrm{st}^{-1}(A) = \bigcup_m \bigcap_n \{ t \in T \mid a - (1/n) \le t < b - (1/m) \}.$$

Then $\mathrm{st}^{-1}(A) \in L(\mathcal{A})$, and quite clearly $L(P)(\mathrm{st}^{-1}(A)) = b - a$.

For the converse we use the fact that if $B$ is an internal subset of $T$ in ${}^*\mathbb{R}$, then $\mathrm{st}(B)$ is a closed, hence compact, subset of $[0, 1]$ in $\mathbb{R}$; see Proposition 2.1.8. Now let $A \subseteq [0, 1]$ and assume that $\mathrm{st}^{-1}(A) \in L(\mathcal{A})$. We must show that $A$ is Lebesgue measurable. Given any $\varepsilon \in \mathbb{R}_+$, we may, by Theorem 3.1.2, find an internal $B \subseteq \mathrm{st}^{-1}(A)$ such that $P(B) > L(P)(\mathrm{st}^{-1}(A)) - \varepsilon$. $C = \mathrm{st}(B) \subseteq A$ is compact in $[0, 1]$, hence Lebesgue measurable. By the first part of 3.2.5 we know that $\mathrm{st}^{-1}(C) \in L(\mathcal{A})$ and $\mu(C) = L(P)(\mathrm{st}^{-1}(C)) \ge {}^\circ(P(B)) \ge L(P)(\mathrm{st}^{-1}(A)) - \varepsilon$. By a dual argument
we can find for any $\varepsilon \in \mathbb{R}_+$ an open set $D \supseteq A$ such that $\mu(D) \le L(P)(\mathrm{st}^{-1}(A)) + \varepsilon$. This suffices to show that $A$ is Lebesgue measurable, and $\mu(A) = L(P)(\mathrm{st}^{-1}(A))$.

3.2.6. REMARK. The arguments above generalize in such a way that they can be used to show that any Radon probability space can be obtained from a hyperfinite probability space via a measure isomorphism. In our case the standard part map $\mathrm{st} : T \to [0, 1]$ is a measure isomorphism between $(T, L(\mathcal{A}), L(P))$ and $([0, 1], \mathcal{B}, \mu)$; for a full discussion see Anderson (1977, 1982) and also Sections 3.4 and 3.5.
The theory of integration is particularly simple in a hyperfinite probability space.

3.2.7. DEFINITION. Let $(\Omega, \mathcal{A}, P)$ be a hyperfinite probability space with $P$ the counting measure and $\mathcal{A}$ the algebra of all internal sets. Let $F : \Omega \to {}^*\mathbb{R}$ be an internal function. The expectation $E(F)$ of $F$ is defined as

$$E(F) = \int_\Omega F(\omega)\,dP = \sum_{\omega\in\Omega} F(\omega)\,\frac{1}{|\Omega|}.$$

Some remarks may be in order. First, the hyperfinite sum $\sum_{\omega\in\Omega}$ exists by transfer, thus $E(F)$ is a well-defined hyperreal number. We have restricted ourselves here to integration with respect to the counting measure $P$. We could also assign different weights to the points of $\Omega$. Let $(a_\omega \mid \omega \in \Omega)$ be an internal sequence such that $\sum a_\omega = 1$. In a standard way this defines a hyperfinite probability on $\Omega$ with associated expectation

$$E(F) = \sum_{\omega\in\Omega} F(\omega)\,a_\omega.$$
We could also develop the theory with respect to an arbitrary internal algebra $\mathcal{A}$. We would then have to make appropriate mention of $\mathcal{A}$-measurability; e.g., in Theorem 3.2.4(1) we would get that $f : \Omega \to \mathbb{R}$ is Loeb measurable [i.e., $L(\mathcal{A})$-measurable] iff $f$ has an $\mathcal{A}$-measurable lifting $F : \Omega \to {}^*\mathbb{R}$.

We shall now relate the hyperfinite expectation with the standard Lebesgue integral on $\Omega$. First we state a definition:

3.2.8. DEFINITION. Let $(\Omega, \mathcal{A}, P)$ be an internal probability space and $F : \Omega \to {}^*\mathbb{R}$ an $\mathcal{A}$-measurable internal function. $F$ is called S-integrable if

(i) $E(|F|)$ is a finite hyperreal;
(ii) if $A \in \mathcal{A}$ and $P(A) \approx 0$, then $\int_A |F(\omega)|\,dP \approx 0$.

The space $(\Omega, \mathcal{A}, P)$ has an associated Loeb space $(\Omega, L(\mathcal{A}), L(P))$ with a standard integration theory with respect to the $\sigma$-algebra $L(\mathcal{A})$ and the $\sigma$-additive measure $L(P)$. We have the following not unexpected result.
3.2.9. THEOREM. (1) Let $(\Omega, \mathcal{A}, P)$ be a hyperfinite probability space and $(\Omega, L(\mathcal{A}), L(P))$ its associated Loeb space. A function $f : \Omega \to \mathbb{R}$ is Loeb integrable iff $f$ has an S-integrable lifting $F : \Omega \to {}^*\mathbb{R}$. In this case

$${}^\circ E(F) = \int_\Omega f(\omega)\,dL(P)(\omega).$$

(2) Let $(\Omega, \mathcal{A}, P)$ be a hyperfinite probability space and $(\Omega, L(\mathcal{A}), L(P))$ its associated Loeb space. Let $F : \Omega \to {}^*\mathbb{R}$ be internal and nonnegative. Then $F$ is S-integrable iff (i) ${}^\circ F$ is Loeb integrable; (ii) ${}^\circ E(F) = \int_\Omega {}^\circ F\,dL(P)$.

(3) Let $([0, 1], \mathcal{B}, \mu)$ be the Lebesgue space and $(T, \mathcal{A}, P)$ a hyperfinite time line. A function $f : [0, 1] \to \mathbb{R}$ is Lebesgue integrable iff $f$ has an S-integrable lifting $F : T \to {}^*\mathbb{R}$. In this case

$${}^\circ E(F) = \int_{[0,1]} f\,d\mu.$$

Note that if $f : [0, 1] \to \mathbb{R}$ is continuous, then ${}^*f$ restricted to $T$ is a lifting of $f$. Thus (3) above shows that the Riemann integral as defined in Section 1.4 is a special case of hyperfinite integration.

For the proof of 3.2.9 note first that part (2) is a variation on (1); it does, however, require a proof! Part (3) follows from the proof of (1) and the fact that $\mathrm{st} : T \to [0, 1]$ is a measure isomorphism. We are thus left with part (1). First note that if $F$ is a finite function, i.e., $F : \Omega \to {}^*[-n, n]$ for some $n \in \mathbb{N}$, then a very simple approximation argument shows that

$${}^\circ E(F) = \int_\Omega {}^\circ F(\omega)\,dL(P)(\omega).$$

The general case will follow from the following characterization:

3.2.10. PROPOSITION. A function $F : \Omega \to {}^*\mathbb{R}$ is S-integrable iff there exists a sequence $(F_n \mid n \in \mathbb{N})$ of finite functions such that

$${}^\circ E(|F - F_n|) \to 0 \quad\text{as}\quad n \to \infty.$$

We give a fairly complete proof of 3.2.10 since it shows us exactly the role of S-integrability. The remaining part of the proof of 3.2.9 is rather standard and is left to the reader. [If needed one may consult Loeb (1975), Anderson (1976), and also Henson (1979) for the case of unbounded Loeb measures; see also the forthcoming book Stroyan and Bayod (1985).]
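The approximation by finite functions in Proposition 3.2.10 can be illustrated with truncations on an ordinary finite probability space. The sketch below makes several assumptions that are not in the text: $\Omega = \{1, \ldots, 100\}$ carries the counting measure, the test function $F(\omega) = \pm 100/\omega$ is an arbitrary choice, and `truncate` and `expectation` are hypothetical helper names. It computes $E(|F - F_n|)$ for truncation levels $n = 1, 10, 100$ and a Chebyshev-type bound on the tail $P(|F| > n)$.

```python
from fractions import Fraction


def truncate(F, n):
    """Clip F to the interval [-n, n]; the result is bounded by n."""
    return lambda w: max(-n, min(n, F(w)))


def expectation(F, omega):
    """E(F) with respect to the counting measure on the finite set omega."""
    omega = list(omega)
    return sum(Fraction(F(w)) for w in omega) / len(omega)


omega = range(1, 101)
F = lambda w: Fraction(100, w) * (-1) ** w   # assumed test function

# E(|F - F_n|) for three truncation levels.
errors = [
    expectation(lambda w, n=n: abs(F(w) - truncate(F, n)(w)), omega)
    for n in (1, 10, 100)
]

# Chebyshev-type bound: P(|F| > n) <= E(|F|) / n, checked at n = 10.
tail = Fraction(sum(1 for w in omega if abs(F(w)) > 10), len(omega))
bound = expectation(lambda w: abs(F(w)), omega) / 10
```

On a finite space the error vanishes once $n$ exceeds the maximum of $|F|$; the content of S-integrability is that the same error is infinitesimal for every infinite truncation level.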
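Let $F : \Omega \to {}^*\mathbb{R}$ be S-integrable. For each $n \in {}^*\mathbb{N}$ define a function $F_n : \Omega \to {}^*\mathbb{R}$ by setting $F_n(\omega) = F(\omega)$ if $|F(\omega)| \le n$, $F_n(\omega) = n$ if $F(\omega) > n$, and $F_n(\omega) = -n$ if $F(\omega) < -n$. Then $F_n$ is a finite function if $n \in \mathbb{N}$. For $m \in {}^*\mathbb{N} - \mathbb{N}$

$$P(\{ \omega \in \Omega \mid |F(\omega)| > m \}) \le (1/m)\,E(|F|) \approx 0$$

by Chebyshev's inequality. The last part follows since by S-integrability ${}^\circ E(|F|) < \infty$. For $m \in {}^*\mathbb{N} - \mathbb{N}$,

$$E(|F - F_m|) \le \int_{|F(\omega)| > m} |F(\omega)|\,dP.$$

Since $P(\{ \omega \in \Omega \mid |F(\omega)| > m \}) \approx 0$, S-integrability at once implies that the integral is $\approx 0$. As the internal sequence $E(|F - F_n|)$ is thus infinitesimal for every infinite $n$, it follows that ${}^\circ E(|F - F_n|) \to 0$ as $n \to \infty$ in $\mathbb{N}$.

For the converse let $(F_n)_{n\in\mathbb{N}}$ be given; we assume that $\sup_\omega |F_n| \le n$. Obviously ${}^\circ E(|F|) < \infty$ follows from the approximation ${}^\circ E(|F - F_n|) \to 0$. In order to verify 3.2.8(ii), let $\varepsilon \in \mathbb{R}_+$ be given. Choose some $n \in \mathbb{N}$ such that ${}^\circ E(|F - F_n|) < \varepsilon/2$. Let $A \in \mathcal{A}$ be such that $P(A) < \varepsilon/2n$. Then

$$\int_A |F|\,dP \le \int_A |F_n|\,dP + \int_A |F - F_n|\,dP < \varepsilon.$$

Thus if $A \in \mathcal{A}$ and $P(A) \approx 0$, then $\int_A |F|\,dP \approx 0$.

We shall give a brief introduction to conditional expectations. Let $(\Omega, \mathcal{A}, P)$ be a hyperfinite probability space where $\mathcal{A}$ is the internal algebra of internal subsets of $\Omega$. Any internal subalgebra $\mathcal{B}$ of $\mathcal{A}$ is generated by a hyperfinite partition $\{\Omega_1, \ldots, \Omega_\eta\}$, $\eta \in {}^*\mathbb{N}$, of the set $\Omega$; this follows by transfer from the finite case.

3.2.11. DEFINITION. Let $F : \Omega \to {}^*\mathbb{R}$ be an internal function and $\mathcal{B}$ a subalgebra of $\mathcal{A}$ generated by the internal partition $\{\Omega_1, \ldots, \Omega_\eta\}$, $\eta \in {}^*\mathbb{N}$, of $\Omega$. The conditional expectation $E(F \mid \mathcal{B})$ is defined as

$$E(F \mid \mathcal{B})(\omega) = P(\Omega_n)^{-1} \sum_{\omega' \in \Omega_n} F(\omega')\,P(\omega'),$$

for $\omega \in \Omega_n$, $n = 1, 2, \ldots, \eta$.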
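Definition 3.2.11 is the usual block-averaging formula, and a finite sketch makes this concrete. The assumptions here are an 8-point sample space with uniform weights, the test function $F(\omega) = \omega^2$, and a three-block partition; `conditional_expectation` is a hypothetical helper name. The code averages $F$ over each block with weights $P(\omega')$ and checks the identity $E(E(F \mid \mathcal{B})) = E(F)$.

```python
from fractions import Fraction


def conditional_expectation(F, partition, P):
    """Return E(F | B) as a dict omega -> value, where B is the algebra
    generated by the given partition and P gives the point weights."""
    result = {}
    for block in partition:
        p_block = sum(P[w] for w in block)
        # Weighted average of F over the block, as in Definition 3.2.11.
        avg = sum(F[w] * P[w] for w in block) / p_block
        for w in block:
            result[w] = avg  # constant on each block, hence B-measurable
    return result


omega = range(8)
P = {w: Fraction(1, 8) for w in omega}   # assumed uniform weights
F = {w: w * w for w in omega}            # assumed test function
partition = [[0, 1], [2, 3, 4], [5, 6, 7]]

G = conditional_expectation(F, partition, P)


def E(H):
    """Expectation of H with respect to the weights P."""
    return sum(H[w] * P[w] for w in omega)
```

Because the result is constant on each block, it is measurable with respect to the algebra generated by the partition; the same computation works for nonuniform weights as long as every block has positive weight.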
We note that the function $E(F \mid \mathcal{B}) : \Omega \to {}^*\mathbb{R}$ is $\mathcal{B}$-measurable and that

$$E(E(F \mid \mathcal{B})) = E(F).$$

The following proposition relates the hyperfinite concepts to the standard one.

3.2.12. PROPOSITION. Let $(\Omega, \mathcal{A}, P)$ be a hyperfinite probability space, and let $F : \Omega \to {}^*\mathbb{R}$ be S-integrable. For any internal subalgebra $\mathcal{B}$ of $\mathcal{A}$ we
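let $E(F \mid \mathcal{B})$ denote the (hyperfinite) conditional expectation of $F$ with respect to $\mathcal{B}$ and $E({}^\circ F \mid L(\mathcal{B}))$ denote the standard conditional expectation of ${}^\circ F$ with respect to the sub-$\sigma$-algebra $L(\mathcal{B})$ of $L(\mathcal{A})$. Then $E(F \mid \mathcal{B})$ is S-integrable and

$${}^\circ E(F \mid \mathcal{B}) = E({}^\circ F \mid L(\mathcal{B})), \qquad L(P)\text{-a.e.}$$

We begin the proof by observing that for all $A \in \mathcal{B}$,

$${}^\circ\!\int_A E(F \mid \mathcal{B})\,dP = {}^\circ\!\int_A F\,dP = \int_A {}^\circ F\,dL(P) = \int_A E({}^\circ F \mid L(\mathcal{B}))\,dL(P),$$

by S-integrability of $F$ and 3.2.9(2). Taking $A$ such that $P(A) \approx 0$, we get S-integrability of $E(F \mid \mathcal{B})$. By 3.1.2 we may for any $B \in L(\mathcal{B})$ find an $A \in \mathcal{B}$ such that $L(P)(A \,\triangle\, B) = 0$. Hence

$$\int_B {}^\circ E(F \mid \mathcal{B})\,dL(P) = {}^\circ\!\int_A E(F \mid \mathcal{B})\,dP = \int_B E({}^\circ F \mid L(\mathcal{B}))\,dL(P),$$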
by the S-integrability of $E(F \mid \mathcal{B})$, Theorem 3.2.9(2), and the calculation above. Since ${}^\circ E(F \mid \mathcal{B})$ is $L(\mathcal{B})$-integrable, the result follows.

There is more to measure and integration theory than we have touched upon. Let us add a few remarks on product measures. Anderson (1976) has shown that if $\Omega_1$ and $\Omega_2$ are hyperfinite probability spaces and $U \subseteq \Omega_1 \times \Omega_2$ is measurable with respect to the product of the Loeb measures on $\Omega_1$ and $\Omega_2$, then $U$ is Loeb measurable on $\Omega_1 \times \Omega_2$, and the measure of $U$ is the same with respect to the two measures. As first pointed out by Hoover (1982), the converse is false in general. Our exposition follows D. Normann (unpublished).

3.2.13. EXAMPLE. Fix some $m \in {}^*\mathbb{N} - \mathbb{N}$. Let $\Omega_1$ be the set of all internal subsets of $\{1, 2, \dots, m\}$ and $\Omega_2 = \{1, 2, \dots, m\}$. Let $P_1$ and $P_2$ be the counting measures on $\Omega_1$ and $\Omega_2$, respectively. On $\Omega_1 \times \Omega_2$, let $L(P_1) \otimes L(P_2)$ be the product of the two Loeb measures and $L(P_1 \otimes P_2)$ be the Loeb measure of the counting measure $P_1 \otimes P_2$ on $\Omega_1 \times \Omega_2$. Let $F \subseteq \Omega_1 \times \Omega_2$ be defined as

$$F = \{(x, i) \in \Omega_1 \times \Omega_2 \mid i \in x\},$$
where $F$ is internal in $\Omega_1 \times \Omega_2$ and $|F| = \tfrac{1}{2}\,|\Omega_1|\,|\Omega_2|$; $|\cdot|$ denotes internal cardinality. Hence $F$ is $L(P_1 \otimes P_2)$-measurable and $L(P_1 \otimes P_2)(F) = \tfrac{1}{2}$. We will show that $F$ is not $L(P_1) \otimes L(P_2)$-measurable. Assume to the contrary that $F$ is $L(P_1) \otimes L(P_2)$-measurable. By the result of Anderson quoted above $L(P_1) \otimes L(P_2)(F) = \tfrac{1}{2}$. By definition of the product measure there will be a family of measurable rectangles $\{A_i \times B_i\}_{i \in \mathbb{N}}$ such that
and such that
By construction of the Loeb measure (see 3.1.2), we may assume that each $A_i$, $B_i$ is internal. Then by 3.1.1 there is a number $n \in \mathbb{N}$ such that
The complement of $\bigcup_{i \le n} (A_i \times B_i)$ will also be a finite union of internal rectangles. Thus there are internal sets $A \subseteq \Omega_1$ and $B \subseteq \Omega_2$ such that $F \cap (A \times B) = \emptyset$ and $L(P_1)(A) > 0$ and $L(P_2)(B) > 0$. This is impossible as the following argument shows:

$$|\{x \in \Omega_1 \mid \forall i \in B\,(i \notin x)\}| = 2^{m - |B|} = |\Omega_1|\,2^{-|B|}.$$

Since $F \cap (A \times B) = \emptyset$ we have that $P_1(A) \le 2^{-|B|}$. But then either $P_1(A)$ is infinitesimal (if $B$ is infinite) or $P_2(B)$ is infinitesimal (if $B$ is finite); thus either $L(P_1)(A) = 0$ or $L(P_2)(B) = 0$.

Even if the product of Loeb measures is not the Loeb measure of the product, Keisler (1977) has provided the following often useful Fubini-type theorem:

3.2.14. THEOREM. Let $\Omega_1$ and $\Omega_2$ be hyperfinite probability spaces and $f : \Omega_1 \times \Omega_2 \to \mathbb{R}$ a Loeb-integrable function. Then

(i) $f(\omega_1, \cdot)$ is Loeb integrable for almost all $\omega_1 \in \Omega_1$;
(ii) the function $g(\omega_1) = \int f(\omega_1, \omega_2)\,dL(P_2)$ is Loeb integrable on $\Omega_1$;
(iii) $\int f(\omega_1, \omega_2)\,dL(P_1 \otimes P_2) = \int \big( \int f(\omega_1, \omega_2)\,dL(P_2) \big)\,dL(P_1)$.
The proof we shall give is due to Loeb. It splits naturally into two parts.
(a) We shall show that if $A \subseteq \Omega_1 \times \Omega_2$ has $L(P_1 \otimes P_2)$-measure zero, then for $L(P_1)$-a.a. $\omega_1$, the section $A(\omega_1) = \{\omega_2 \mid (\omega_1, \omega_2) \in A\}$ has $L(P_2)$-measure zero: pick a decreasing sequence $\{B_n\}$ of internal subsets of $\Omega_1 \times \Omega_2$ such that $A \subseteq B_n$ and ${}^\circ P_1 \otimes P_2(B_n) \to 0$. Let $B = \bigcap B_n$. Since $P_1 \otimes P_2(B_n) = \int P_2(B_n(\omega_1))\,dP_1(\omega_1)$, Theorem 3.2.9 tells us that

$$L(P_1 \otimes P_2)(B_n) = {}^\circ P_1 \otimes P_2(B_n) = {}^\circ\!\int P_2(B_n(\omega_1))\,dP_1(\omega_1) = \int {}^\circ P_2(B_n(\omega_1))\,dL(P_1)(\omega_1) = \int L(P_2)(B_n(\omega_1))\,dL(P_1)(\omega_1).$$

By the monotone convergence theorem, this implies that $L(P_2)(B(\omega_1)) = 0$ a.e., and since $A \subseteq B$, we have proved our claim.

(b) Pick an S-integrable lifting $F$ of $f$, and define $G(\omega_1) = \int F(\omega_1, \omega_2)\,dP_2$. It suffices to show that for almost all $\omega_1$, the function $F(\omega_1, \cdot)$ is an S-integrable lifting of $f(\omega_1, \cdot)$ and that $G$ is an S-integrable lifting of $g$. By Theorem 3.2.9 this immediately implies the first two parts of the theorem, and the last part follows from the calculation

$$\int f\,dL(P_1 \otimes P_2) = {}^\circ\!\int F\,d(P_1 \otimes P_2) = {}^\circ\!\int G\,dP_1 = \int g\,dL(P_1) = \int \Big( \int f\,dL(P_2) \Big)\,dL(P_1).$$
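That $F(\omega_1, \cdot)$ is a lifting of $f(\omega_1, \cdot)$ for almost all $\omega_1$ follows immediately from (a). For the S-integrability we use Proposition 3.2.10. Let

$$F_m(\omega_1, \omega_2) = \begin{cases} F(\omega_1, \omega_2) & \text{if } -m \le F(\omega_1, \omega_2) \le m, \\ m & \text{if } F(\omega_1, \omega_2) > m, \\ -m & \text{if } F(\omega_1, \omega_2) < -m \end{cases}$$

be a truncation of $F$ for each $m \in \mathbb{N}$. Then

$${}^\circ\!\int \Big( \int |F(\omega_1, \omega_2) - F_m(\omega_1, \omega_2)|\,dP_2 \Big)\,dP_1 \to 0$$

as $m \to \infty$, implying that ${}^\circ\!\int |F(\omega_1, \omega_2) - F_m(\omega_1, \omega_2)|\,dP_2 \to 0$ a.e. and thus that $F(\omega_1, \cdot)$ is S-integrable for almost all $\omega_1$.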
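A consequence of what we just proved is that for almost all $\omega_1$

$${}^\circ G(\omega_1) = {}^\circ\!\int F(\omega_1, \omega_2)\,dP_2(\omega_2) = \int f(\omega_1, \omega_2)\,dL(P_2)(\omega_2) = g(\omega_1),$$

and thus $G$ is a lifting of $g$. If $G_m(\omega_1) = \int F_m(\omega_1, \omega_2)\,dP_2(\omega_2)$ then $G_m$ is a finite function and

$${}^\circ\!\int |G - G_m|\,dP_1 \le {}^\circ\!\int \Big( \int |F - F_m|\,dP_2 \Big)\,dP_1 \to 0$$

as $m \to \infty$. By 3.2.10, $G$ is S-integrable, and the theorem is proved.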
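The truncation device used in part (b), and in Proposition 3.2.10, can be illustrated at a finite size. The following is only a sketch under stated assumptions: the sample space $\{1, \dots, N\}$ with uniform weights and the function $F(\omega) = \sqrt{N/\omega}$ are hypothetical choices made for illustration, and `truncate` and `truncation_error` are names introduced here, not taken from the text. It tabulates $E(|F - F_m|)$ for a few truncation levels $m$.

```python
def truncate(value, m):
    """Clip value to the interval [-m, m] (the truncation F_m of the text)."""
    return max(-m, min(m, value))


def truncation_error(values, weights, m):
    """Return E|F - F_m| when F takes values[i] with probability weights[i]."""
    return sum(w * abs(v - truncate(v, m)) for v, w in zip(values, weights))


# Hypothetical finite model: omega = 1..N with uniform probability 1/N and
# F(omega) = sqrt(N / omega), which has a few very large values.
N = 10_000
weights = [1.0 / N] * N
values = [(N / k) ** 0.5 for k in range(1, N + 1)]

levels = (1, 2, 4, 8, 16)
errors = [truncation_error(values, weights, m) for m in levels]
for m, err in zip(levels, errors):
    print(f"m = {m:2d}   E|F - F_m| = {err:.4f}")
```

In this finite model the error shrinks as the level grows, which is the behaviour that Proposition 3.2.10 turns into a criterion for S-integrability.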
Let us conclude with the following brief remark. We shall not always start from a hyperfinite probability space. It may be convenient to let the internal space be the *-transform of a standard measure space. For instance, if $(\mathbb{R}, \mathcal{B}, \mu)$ is the standard Lebesgue space on $\mathbb{R}$, our internal starting point could be the internal measure space $({}^*\mathbb{R}, {}^*\mathcal{B}, {}^*\mu)$. Here ${}^*\mu$ is finitely, hence hyperfinitely, additive on the internal algebra ${}^*\mathcal{B}$; $\sigma$-additivity is lost in the transition. However, it is restored by passing to the associated Loeb space. By transfer we can write down "integrals" $\int_A f(r)\,d{}^*\mu$, where $A \in {}^*\mathcal{B}$, which however must be handled with some care: no countable manipulations are allowed.

3.2.15. REMARK. The internal measure space $({}^*\mathbb{R}, {}^*\mathcal{B}, {}^*\mu)$ has an associated Loeb space $({}^*\mathbb{R}, L({}^*\mathcal{B}), L({}^*\mu))$ and results like Theorem 3.2.9 connecting the two spaces are still true. There is, however, one important point to notice. The measure ${}^*\mu$ is not finite, hence we have to add one clause to the definition, 3.2.8, of S-integrability in order for 3.2.9 to extend to the unbounded case. We state the necessary modifications. Let $(X, \mathcal{A}, \nu)$ be an internal measure space as in Section 3.1 and let $(X, L(\mathcal{A}), L(\nu))$ be its associated Loeb space. A function $f : X \to {}^*\mathbb{R}$ is S-integrable if it is $\mathcal{A}$-measurable and

(i) $\int_X |f|\,d\nu$ is finite,
(ii) if $A \in \mathcal{A}$ and $\nu(A) \approx 0$, then $\int_A |f|\,d\nu \approx 0$,
(iii) if $A \in \mathcal{A}$ and $f(A) \subseteq \mu(0)$, then $\int_A |f|\,d\nu \approx 0$.

Condition (iii) is redundant if $\nu$ is finite and, hence, was omitted from Definition 3.2.8. Theorem 3.2.9 extends; for instance, let $f : X \to {}^*\mathbb{R}$ be $\mathcal{A}$-measurable, then $f$ is S-integrable iff ${}^\circ f$ is $L(\nu)$-integrable and ${}^\circ\!\int |f|\,d\nu = \int |{}^\circ f|\,dL(\nu)$. And a function $g : X \to \mathbb{R}$ is $L(\nu)$-integrable iff $g$ has an S-integrable lifting $f : X \to {}^*\mathbb{R}$; in this case ${}^\circ\!\int f\,d\nu = \int g\,dL(\nu)$.
We round off this section by mentioning the following elegant application of hyperfinite ideas to ergodic theory.

3.2.16. EXAMPLE. Let $K = \{0, 1, \dots, k - 1\}$, $k \in {}^*\mathbb{N} - \mathbb{N}$, and consider the Loeb space $(K, L(\mathcal{A}), L(P))$ where $\mathcal{A}$ is the algebra of internal subsets of $K$ and $P$ the counting measure. Let $\varphi$ be the shift operator on $K$ defined by

$$\varphi(x) = \begin{cases} x + 1 & \text{if } x < k - 1, \\ 0 & \text{if } x = k - 1. \end{cases}$$

Kamae (1982) has proved that any dynamical system of the form $(\Omega, \mathcal{B}, \mu, T)$, where $(\Omega, \mathcal{B}, \mu)$ is a (standard) probability space and $T$ a measure-preserving transformation, is a factor of the "hypercycle" $K$ in the sense that there exists a measure-preserving transformation $g : K \to \Omega$ such that $g(\varphi(x)) = T(g(x))$ for almost all $x \in K$. As Kamae (1982) [see also Katznelson and Weiss (1982)] shows, this has an immediate application to the individual ergodic theorem, since this theorem is rather simple to prove for hypercycles. It would be interesting to see if this notion of hypercycle has other applications. The reader should also consult the recent work by Ross (1983, 1984a, 1984b) on measurable transformations on Loeb spaces for additional information.

3.3. BROWNIAN MOTION

Now it is time to come down from the abstract theory to a concrete example. The probabilists know how to construct Brownian motion as a limit of random walks. In an important paper Anderson (1976) constructed Brownian motion as a hyperfinite random walk. See also Keisler (1984), whose version of Anderson's process we shall adopt. For us Brownian motion will be an internal map
$$B : \Omega \times T \to {}^*\mathbb{R},$$

where $T$ is a hyperfinite time line, $T = \{0, \Delta t, 2\Delta t, \dots, 1\}$, as in 3.2.2, and $\Omega$ is essentially our model for hyperfinite coin tossing with the minor change that the base space $\{0, 1\}$ is replaced by $\{-1, +1\}$ and $\Omega = \{-1, +1\}^T$ (see 3.1.3). As a hyperfinite random walk $B$ has the following explicit definition

$$B(\omega, t) = \sum_{s=0}^{t} \omega(s)\sqrt{\Delta t}, \qquad \omega \in \Omega.$$

Thus between times $t_0$ and $t_0 + \Delta t$ the "particle" moves a distance $\sqrt{\Delta t}$ either to the "left" or to the "right" independently with probability $\tfrac{1}{2}$.
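A finite-size analogue of this construction may help fix ideas. The sketch below is an illustration under assumptions, not the hyperfinite object itself: it takes a large but finite number of steps `n_steps` (so $\Delta t = 1/\texttt{n\_steps}$), uses Python's pseudo-random generator for the coin tosses, and sums only over times strictly before $t$, so that $B(\omega, 0) = 0$. The function name `random_walk` is introduced here for illustration.

```python
import math
import random


def random_walk(n_steps, rng):
    """Return [B(0), B(dt), ..., B(1)] for a fair +-1 coin-tossing walk.

    dt = 1 / n_steps and every increment is +sqrt(dt) or -sqrt(dt).
    B(t) sums the tosses at times s < t, so B(0) = 0.
    """
    dt = 1.0 / n_steps
    step = math.sqrt(dt)
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.choice((-1, 1)) * step)
    return path


rng = random.Random(0)          # fixed seed so the run is reproducible
path = random_walk(1000, rng)
print(len(path), path[0], path[-1])
```

Each call produces one sample path on the grid $\{0, \Delta t, \dots, 1\}$; the hyperfinite walk corresponds to taking the number of steps infinite.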
REMARK. We use the following convention that if $u \le t$, then

$$\sum_{s=u}^{t} X(\omega, s) = X(\omega, u) + \cdots + X(\omega, t - \Delta t),$$

i.e., $X(\omega, t)$ is not included in the sum. The standard Brownian motion is obtained by setting
$$b(\omega, t) = {}^\circ B(\omega, \tilde{t}),$$

where $\tilde{t}$ is the point in $T$ to the immediate right of $t$; $b$ will be a standard stochastic process from $\Omega \times [0, 1]$ to $\mathbb{R}$ where $\Omega$ has the measure structure given by the Loeb construction $(\Omega, L(\mathcal{A}), L(P))$, where $\mathcal{A}$ is the internal algebra of internal subsets of $\Omega$ and $P$ is the hyperfinite counting measure on $\mathcal{A}$. We emphasize that $b$ is a standard process although on a somewhat unusual sample space $\Omega$. Perhaps this $\Omega$ is closer to "physical intuition"?

In order to prove that $b$ is a Brownian motion, we must verify the following:

(i) $b(\cdot, t)$ is a measurable function of $\omega$ for all $t \in [0, 1]$;
(ii) for $s < t$, $b(\omega, t) - b(\omega, s)$ has a normal distribution with mean 0 and variance $t - s$;
(iii) if $s_1 < t_1 \le s_2 < t_2 \le \cdots \le s_n < t_n$ in $[0, 1]$, then $\{b(\omega, t_1) - b(\omega, s_1), \dots, b(\omega, t_n) - b(\omega, s_n)\}$ is an independent set of random variables.

By construction (i) is immediate. We turn to the verification of (ii) and (iii). Before giving the proofs we shall digress briefly to make a few remarks on independence of random variables in the extended universe.

3.3.1. DEFINITION. A collection of internal random variables $(X_i)_{i \in I}$ on a hyperfinite probability space $(\Omega, \mathcal{A}, P)$ is called *-independent if for every hyperfinite subset $\{X_1, \dots, X_m\}$, $m \in {}^*\mathbb{N}$, and every internal $m$-tuple $(a_1, \dots, a_m) \in {}^*\mathbb{R}^m$,

$$P(\{\omega \in \Omega \mid X_1(\omega) < a_1 \wedge \cdots \wedge X_m(\omega) < a_m\}) = \prod_{k=1}^{m} P(\{\omega \in \Omega \mid X_k(\omega) < a_k\}).$$

The collection is called S-independent if for every finite set $\{X_1, \dots, X_m\}$, $m \in \mathbb{N}$, and every $(a_1, \dots, a_m) \in \mathbb{R}^m$ the same product formula holds with $=$ replaced by $\approx$.

Once more we have a *-version and an S-version in the extended universe. And it is the S-version that has a standard significance.

3.3.2. LEMMA. Let $(X_i)_{i \in I}$ be S-independent on $(\Omega, \mathcal{A}, P)$; then $({}^\circ X_i)_{i \in I}$ is independent on the associated Loeb space $(\Omega, L(\mathcal{A}), L(P))$.
PROOF (by calculation). Let $m \in \mathbb{N}$ and $(a_1, \dots, a_m) \in \mathbb{R}^m$,
k=l
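We have included this calculation not because we expect any reader to have any difficulties in providing the proof, but simply once more in a simple context to demonstrate how an external condition involving $L(P)$ and ${}^\circ X_{i_k}$ is converted to an internal condition involving $P$ and $X_{i_k}$; see also the proof of 3.2.4.

3.3.3. PROPOSITION (Central Limit Theorem). Let $(X_n)_{n \in {}^*\mathbb{N}}$ be an internal sequence of *-independent random variables on $(\Omega, \mathcal{A}, P)$ with a common standard distribution function $F$ and with mean 0 and variance 1. Then for any $m \in {}^*\mathbb{N} - \mathbb{N}$ and any $a \in {}^*\mathbb{R}$

$$P\Big(\Big\{\omega \in \Omega \;\Big|\; \frac{1}{\sqrt{m}} \sum_{k=1}^{m} X_k(\omega) \le a\Big\}\Big) \approx {}^*\Psi(a),$$

where

$$\Psi(a) = (2\pi)^{-1/2} \int_{-\infty}^{a} \exp(-x^2/2)\,dx$$

is the standard Gaussian distribution.

For the proof let $G$ be the distribution function of ${}^\circ X_n$ on the Loeb space. Since $F$ is standard, it is easy to show that $G = {}^\circ F$ and $F = {}^*G$. It is also straightforward to calculate that $E({}^\circ X_n) = 0$ and $E({}^\circ X_n^2) = 1$. Hence, by the standard Central Limit Theorem, given $a \in \mathbb{R}$ and $\varepsilon \in \mathbb{R}_+$ there exists $n_0 \in \mathbb{N}$ such that if $m > n_0$, then

$$\Big| L(P)\Big(\Big\{\omega \in \Omega \;\Big|\; \frac{1}{\sqrt{m}} \sum_{k=1}^{m} {}^\circ X_k(\omega) \le a\Big\}\Big) - \Psi(a) \Big| < \varepsilon.$$

The collection $({}^\circ X_n)$ is independent. Thus the sum $\sum_{k=1}^{m} {}^\circ X_k$ has a distribution function $G^m$ which is the $m$-fold convolution product of $G$. We get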
for $m > n_0$ that $|G^m(\sqrt{m}\,a) - \Psi(a)| < \varepsilon$. Now $F = {}^*G$; thus we may apply the transfer principle to conclude that for any $m \in {}^*\mathbb{N} - \mathbb{N}$ and any $a \in \mathbb{R}$, $F^m(\sqrt{m}\,a) \approx {}^*\Psi(a)$. However, $F^m$ is the distribution function of $\sum_{k=1}^{m} X_k$. Thus for any $a \in \mathbb{R}$

$$P\Big(\Big\{\omega \in \Omega \;\Big|\; \frac{1}{\sqrt{m}} \sum_{k=1}^{m} X_k(\omega) \le a\Big\}\Big) \approx {}^*\Psi(a).$$

As the reader will appreciate, the distribution function $\Psi$ is so well behaved (also at $\pm\infty$) that we can extend to all $a \in {}^*\mathbb{R}$.

3.3.4. REMARK. In the proof for Proposition 3.3.3 we used the standard central limit theorem. Perhaps the reader would like to give an alternative proof of this result thinking of the Gaussian distribution as a hyperfinite binomial distribution?
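The binomial picture suggested in Remark 3.3.4 can be explored numerically at finite sizes. The sketch below is an illustration only, not a proof: it assumes fair $\pm 1$ steps, computes the exact distribution function of the normalized sum from binomial coefficients, and compares it with the Gaussian distribution function. The helper names `walk_cdf` and `gaussian_cdf` are introduced here for illustration.

```python
import math


def walk_cdf(m, a):
    """P((X_1 + ... + X_m) / sqrt(m) <= a) for independent fair +-1 steps X_k."""
    threshold = a * math.sqrt(m)
    favourable = 0
    for k in range(m + 1):
        # The sum equals 2k - m when exactly k of the m steps are +1.
        if 2 * k - m <= threshold:
            favourable += math.comb(m, k)
    return favourable / 2 ** m


def gaussian_cdf(a):
    """Distribution function of the standard normal law."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))


a = 0.5
gaps = {m: abs(walk_cdf(m, a) - gaussian_cdf(a)) for m in (10, 100, 1000)}
for m, gap in gaps.items():
    print(f"m = {m:4d}   |binomial cdf - Gaussian cdf| = {gap:.4f}")
```

Because the binomial law lives on a lattice, the gap need not shrink monotonically in $m$; it is nevertheless of the order of a single lattice jump, which becomes infinitesimal when $m$ is infinite.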
3.3.5. THEOREM. Let $B$ be the hyperfinite random walk on $(\Omega, \mathcal{A}, P)$ and $b$ its standard part. Then $b$ is a Brownian motion on $(\Omega, L(\mathcal{A}), L(P))$.

We have to verify conditions (i)-(iii) listed above. As we previously observed, the proof of (i) is immediate by construction. The proof of (iii) follows easily from Lemma 3.3.2. We prove (ii) in two different ways:

FIRST PROOF OF (ii). We use Proposition 3.3.3:

$$L(P)(\{\omega \in \Omega \mid b(\omega, {}^\circ t) - b(\omega, {}^\circ s) \le a\}) = L(P)(\{\omega \in \Omega \mid {}^\circ B(\omega, t) - {}^\circ B(\omega, s) \le a\}).$$

Here $t - s = m\,\Delta t$ and $i_1, \dots, i_m$ enumerate the random variables "between" $s$ and $t$. We now have a form suitable for an application of 3.3.3 and get

$$L(P)(\{\omega \in \Omega \mid b(\omega, {}^\circ t) - b(\omega, {}^\circ s) \le a\}) = \Psi\big(a / \sqrt{{}^\circ t - {}^\circ s}\big).$$

Thus $b(\omega, t) - b(\omega, s)$ for $t, s \in [0, 1]$ has a normal distribution with mean 0 and variance $t - s$.
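SECOND PROOF OF (ii). We calculate the Fourier transform

$$\begin{aligned}
\int_\Omega \exp[i(b(\omega, {}^\circ t) - b(\omega, {}^\circ s))z]\,dL(P)
&= {}^\circ\!\int_\Omega \exp[i(B(\omega, t) - B(\omega, s))z]\,dP && \text{(this is a use of 3.2.9)} \\
&= {}^\circ\!\int_\Omega \exp\Big[i\Big(\sum_{k=s}^{t} \omega_k \sqrt{\Delta t}\Big) z\Big]\,dP \\
&= {}^\circ\!\int_\Omega \prod_{k=s}^{t} \exp(i\omega_k \sqrt{\Delta t}\,z)\,dP \\
&= {}^\circ \prod_{k=s}^{t} \int_\Omega \exp(i\omega_k \sqrt{\Delta t}\,z)\,dP && \text{(by the independence)} \\
&= {}^\circ \Big[\frac{\exp(i\sqrt{\Delta t}\,z) + \exp(-i\sqrt{\Delta t}\,z)}{2}\Big]^{(t-s)/\Delta t} && \text{(since } \omega_k = \pm 1 \text{, both with probability } \tfrac{1}{2}) \\
&= {}^\circ \big[\cos(\sqrt{\Delta t}\,z)\big]^{(t-s)/\Delta t} \\
&= {}^\circ \Big[1 - \frac{z^2}{2m} + O\Big(\frac{z^4}{m^2}\Big)\Big]^{m(t-s)} && \text{(where } m = \Delta t^{-1}) \\
&= \exp\Big(-\frac{{}^\circ t - {}^\circ s}{2}\,z^2\Big).
\end{aligned}$$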
Once more this proves the result. In this case we made a direct calculation which did not appeal to the Central Limit Theorem. Perhaps we should add a comment on the last equality. We know that if $m \in {}^*\mathbb{N} - \mathbb{N}$, then $(1 + x/m)^m \approx e^x$. Since the standard part operation kills the remainder term $O(z^4/m^2)$, the equality follows.

However, there is more to Brownian motion.

3.3.6. THEOREM. $B(\omega, \cdot)$ is S-continuous for almost all $\omega \in \Omega$; i.e., there is a set $\Omega'$ of Loeb measure one such that $B(\omega, s) \approx B(\omega, t)$ whenever $\omega \in \Omega'$ and $s \approx t$. Consequently, $b(\omega, \cdot)$ is continuous for all $\omega \in \Omega'$.
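The last equality in the second proof of (ii) rests on $(1 + x/m)^m \approx e^x$ for infinite $m$. A finite-size numerical sketch makes the convergence visible. It is an illustration under assumptions: $m$ is a large finite integer standing in for $\Delta t^{-1}$, the times $s$, $t$ are chosen so that $(t - s)m$ is an integer, and the function names are introduced here rather than taken from the text.

```python
import math


def walk_char_function(z, t, s, m):
    """E exp(i z (B(t) - B(s))) for the +-1 walk with dt = 1/m.

    Each of the (t - s) * m independent steps contributes cos(z * sqrt(dt)).
    """
    n_steps = round((t - s) * m)
    return math.cos(z / math.sqrt(m)) ** n_steps


def gaussian_char_function(z, t, s):
    """Characteristic function of a normal law with mean 0 and variance t - s."""
    return math.exp(-(t - s) * z * z / 2.0)


z, t, s = 1.5, 0.75, 0.25
target = gaussian_char_function(z, t, s)
errors = [abs(walk_char_function(z, t, s, m) - target) for m in (4, 400, 40000)]
for m, err in zip((4, 400, 40000), errors):
    print(f"m = {m:5d}   error = {err:.2e}")
```

The error decays roughly like $1/m$, in line with the remainder term $O(z^4/m^2)$ raised to a power of order $m$.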
Before proving the theorem, we shall establish two simple identities. First observe that if $\Delta B(s) = B(s + \Delta t) - B(s)$, then $\Delta B(s)^2 = \Delta t$ and thus

$$E(B(t)^2) = E\Big(\sum_{s=0}^{t} \big((B(s) + \Delta B(s))^2 - B(s)^2\big)\Big) = \sum_{s=0}^{t} E(2B(s)\,\Delta B(s) + \Delta t) = \sum_{s=0}^{t} \Delta t = t,$$
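where we have used that $E(B(s)\,\Delta B(s)) = 0$ since $\Delta B(s)$ is plus or minus $\sqrt{\Delta t}$ with probability $\tfrac{1}{2}$ no matter what $B(s)$ is. From this we get

$$\begin{aligned}
E(B(t)^4) &= E\Big(\sum_{s=0}^{t} \big((B(s) + \Delta B(s))^4 - B(s)^4\big)\Big) \\
&= \sum_{s=0}^{t} E\big(4B(s)^3\,\Delta B(s) + 6B(s)^2\,\Delta B(s)^2 + 4B(s)\,\Delta B(s)^3 + \Delta B(s)^4\big) \\
&= \sum_{s=0}^{t} E\big(6B(s)^2\,\Delta t + \Delta t^2\big) = \sum_{s=0}^{t} (6s\,\Delta t + \Delta t^2) \\
&= 3t(t - \Delta t) + t\,\Delta t = 3t^2 - 2t\,\Delta t \le 3t^2,
\end{aligned}$$

by summing the arithmetic series $\sum_{s=0}^{t} s\,\Delta t = t(t - \Delta t)/2$. The same argument shows that

$$E((B(t) - B(s))^4) \le 3(t - s)^2.$$

Turning to the proof of the theorem, we define a "bad" set $\Omega_{m,n}$ for each pair $(m, n) \in \mathbb{N}^2$:

$$\Omega_{m,n} = \Big\{\omega \in \Omega \;\Big|\; \exists i < n\ \exists s \in T \cap \Big[\frac{i}{n}, \frac{i+1}{n}\Big] \Big(\Big|B(\omega, s) - B\Big(\omega, \frac{i}{n}\Big)\Big| > \frac{1}{m}\Big)\Big\}.$$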
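The second- and fourth-moment identities that drive this proof can be checked exactly for a finite walk. The sketch below is an illustration under assumptions: `n_total` is a finite number of steps standing in for $\Delta t^{-1}$, the time $t$ is `n_steps / n_total`, and exact rational arithmetic is used so that the identities $E(B(t)^2) = t$ and $E(B(t)^4) = 3t^2 - 2t\,\Delta t$ can be compared without rounding. The function name `walk_moment` is introduced here.

```python
from fractions import Fraction
from math import comb


def walk_moment(power, n_steps, n_total):
    """Exact E(B(t)**power) for t = n_steps / n_total and dt = 1 / n_total.

    power must be even, so that sqrt(dt)**power is the rational dt**(power/2).
    B(t) = (2k - n_steps) * sqrt(dt) when k of the n_steps tosses are +1.
    """
    dt = Fraction(1, n_total)
    raw = sum(comb(n_steps, k) * (2 * k - n_steps) ** power
              for k in range(n_steps + 1))
    return Fraction(raw, 2 ** n_steps) * dt ** (power // 2)


n_total, n_steps = 50, 20
dt = Fraction(1, n_total)
t = n_steps * dt

second = walk_moment(2, n_steps, n_total)
fourth = walk_moment(4, n_steps, n_total)
print("E(B(t)^2) =", second, "  t =", t)
print("E(B(t)^4) =", fourth, "  3t^2 - 2t dt =", 3 * t * t - 2 * t * dt)
```

The bound $E(B(t)^4) \le 3t^2$ is what feeds the Chebyshev-type estimate of the bad sets in the remainder of the proof.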
Note that the path $B(\omega, \cdot)$ is discontinuous iff $\omega \in \bigcup_m \bigcap_n \Omega_{m,n}$ and that it suffices to show that ${}^\circ P(\Omega_{m,n}) \to 0$ as $n \to \infty$ for all $m \in \mathbb{N}$. Observe also that
where the second inequality follows from a reflection argument: assume that $|B(\omega, (i+1)/n) - B(\omega, i/n)| < 1/m$, but that there is an $s \in (i/n, (i+1)/n]$ such that $|B(\omega, s) - B(\omega, i/n)| > 1/m$. Let $s_\omega$ be the smallest such $s$, and consider the "reflected path" $\omega'$ defined by $\omega'(t) = \omega(t)$ for $t < s_\omega$ and $\omega'(t) = -\omega(t)$ for $t \ge s_\omega$. Obviously, $|B(\omega', (i+1)/n) - B(\omega', i/n)| \ge 1/m$, and since each reflected path corresponds to a unique unreflected path, the inequality follows. Completing our calculations, we now have
as $n \to \infty$. The theorem is proved.

Just as we could get the Lebesgue measure from the counting measure on the hyperfinite time line, we can now get a measure on $C[0, 1]$ by applying the inverse standard part map to the measure induced by $B$.

3.3.7. REMARK (on Wiener Measure). The Wiener measure is defined as the unique completed Borel measure on $C[0, 1]$ satisfying the following conditions:

(i) the measure of $\{f \in C[0, 1] \mid f(t) - f(s) < a\}$ is $\Psi(a/\sqrt{t - s})$;
(ii) if $s_1 < t_1 \le s_2 < \cdots \le s_n < t_n$ in $[0, 1]$, then $\{f(t_1) - f(s_1), \dots, f(t_n) - f(s_n)\}$ is an independent set of random variables.

By using 3.3.5 and 3.3.6, we get the following easy construction of the Wiener measure. Let $\mathcal{B}$ be a $\sigma$-algebra on $C[0, 1]$ defined by

(2) $F \in \mathcal{B}$ iff $\{\omega \in \Omega \mid b(\omega, \cdot) \in F\} \in L(\mathcal{A})$,

where $L(\mathcal{A})$ is the Loeb algebra on $\Omega$. A measure $W$ on $\mathcal{B}$ is obtained by

(3) $W(F) = L(P)(\{\omega \in \Omega \mid b(\omega, \cdot) \in F\})$.

The space $(C[0, 1], \mathcal{B}, W)$ is the completion of the Wiener space; this can be proved in exactly the same way as 3.2.5 once we realize that (3) can also be expressed as

(4) $W(F) = L(\tilde{P})(\mathrm{st}^{-1}(F))$,

where $\tilde{P}$ is the measure on ${}^*C[0, 1]$ defined by

$$\tilde{P}(A) = P\{\omega \mid B(\omega, \cdot) \in A\}$$

and st is the standard part map on ${}^*C[0, 1]$. We shall develop a general theory for such "pushed-down" Loeb measures in the next section, and in Section 3.5 we shall return to have a look at the Wiener measure from a somewhat different point of view.

Thus far we have been looking at Brownian motions with a fixed initial position 0; what happens if we also fix the final position $a$? In the hyperfinite setting everything is straightforward: let $\tilde{a}$ be an element in the monad of $a$ which is hit by a Brownian path at time 1; i.e., $\tilde{a} = B(\omega, 1)$ for some $\omega \in \Omega$, and put
$$\Omega_a = \{\omega \in \Omega \mid B(\omega, 1) = \tilde{a}\}.$$

Let $P_a$ be the normalized counting measure on $\Omega_a$. As before $P_a$ induces a completed Borel measure on $C[0, 1]$ by

(5) $W_a(F) = L(P_a)\{\omega \in \Omega_a \mid B(\omega, \cdot) \in \mathrm{st}^{-1}(F)\}$.
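At finite size, the normalized counting measure on $\Omega_a$ is simply the uniform distribution over all $\pm 1$ sequences with a prescribed sum, and such a sequence can be sampled by shuffling a fixed multiset of steps. The sketch below is an illustration under assumptions: `n_steps` is finite, the end point is specified as an integer `end_index` (so that $\tilde{a} = \texttt{end\_index} \cdot \sqrt{\Delta t}$), and `sample_bridge` is a name introduced here.

```python
import math
import random


def sample_bridge(n_steps, end_index, rng):
    """Uniformly sample a +-1 walk of n_steps steps whose steps sum to end_index.

    end_index must have the parity of n_steps and satisfy |end_index| <= n_steps.
    Returns the path [B(0), B(dt), ..., B(1)] with dt = 1 / n_steps.
    """
    if (n_steps + end_index) % 2 or abs(end_index) > n_steps:
        raise ValueError("end point cannot be hit by a walk of this length")
    ups = (n_steps + end_index) // 2
    steps = [1] * ups + [-1] * (n_steps - ups)
    # A uniform shuffle gives every admissible sequence the same probability,
    # i.e. the normalized counting measure on the conditioned sample space.
    rng.shuffle(steps)
    scale = math.sqrt(1.0 / n_steps)
    path = [0.0]
    for step in steps:
        path.append(path[-1] + step * scale)
    return path


rng = random.Random(1)
bridge = sample_bridge(1000, 20, rng)
print(bridge[0], bridge[-1], 20 / math.sqrt(1000))
```

Every sampled path starts at 0 and ends exactly at the prescribed point, which is the finite counterpart of conditioning on $B(\omega, 1) = \tilde{a}$.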
The measures $W_a$ are called conditional Wiener measures, and a process with such a distribution is often called a Brownian bridge. Given $s, t \in [0, 1]$, $s < t$, and $a, b \in \mathbb{R}$, more general conditional Wiener measures $W_{s,t,a,b}$ can be constructed to model particles starting at $a$ at time $s$ and ending at $b$ at time $t$. We leave these to the reader.

In a purely standard setting a direct construction of conditional Wiener measures is not so easy since we cannot condition on sets like

$$\{\omega \in \Omega \mid b(\omega, 1) = a\},$$

which have measure zero. However, we can list the properties we want $W_a$ to have and then prove that such measures exist. Let $P_t(x, y)$ be the Gaussian kernels: $P_t(x, y) = (2\pi t)^{-1/2} \exp(-|x - y|^2/2t)$. If $0 = t_0 < t_1 < \cdots < t_n < t_{n+1} = 1$ is an increasing sequence of elements from $[0, 1]$, and $A_1, \dots, A_n$ are Borel sets in $\mathbb{R}$, we want

$$W_a(A) = \int_{A_1} \cdots \int_{A_n} P_1(0, a)^{-1}\,P_{t_1}(0, x_1)\,P_{t_2 - t_1}(x_1, x_2) \cdots P_{t_{n+1} - t_n}(x_n, a)\,dx_1 \cdots dx_n,$$

where $A = \{f \in C[0, 1] \mid f(0) = 0,\ f(1) = a \text{ and } f(t_i) \in A_i \text{ for all } i\}$.
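A quick numerical sanity check of this formula is possible in the simplest case $n = 1$. With $A_1 = \mathbb{R}$ the right-hand side should equal 1 (this is the Chapman-Kolmogorov identity for the Gaussian kernels), and with $A_1 = [a t_1, \infty)$ it should equal $\tfrac{1}{2}$, since the integrand is symmetric about $x_1 = a t_1$. The sketch below assumes a plain trapezoid rule on a wide finite interval as a stand-in for the integral over $\mathbb{R}$; the function names and the numerical parameters are choices made here for illustration.

```python
import math


def kernel(t, x, y):
    """Gaussian kernel P_t(x, y) = (2 pi t)^(-1/2) exp(-|x - y|^2 / (2 t))."""
    return math.exp(-(x - y) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)


def bridge_mass(a, t1, lo, hi, n_grid=20001):
    """Trapezoid-rule value of the n = 1 case of the formula with A_1 = [lo, hi]."""
    h = (hi - lo) / (n_grid - 1)
    total = 0.0
    for j in range(n_grid):
        x = lo + j * h
        weight = 0.5 if j in (0, n_grid - 1) else 1.0
        total += weight * kernel(t1, 0.0, x) * kernel(1.0 - t1, x, a)
    return total * h / kernel(1.0, 0.0, a)


a, t1 = 0.7, 0.3
total_mass = bridge_mass(a, t1, lo=-12.0, hi=12.0)
upper_half = bridge_mass(a, t1, lo=a * t1, hi=12.0)
print(f"total mass = {total_mass:.6f}, mass above a*t1 = {upper_half:.6f}")
```

The interval $[-12, 12]$ is wide enough that the neglected Gaussian tails are far below the tolerance used here.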
It can be proved that a unique Borel measure $W_a$ satisfying this condition exists, and this gives a standard construction of conditional Wiener measures. We leave it to the reader to check that the two definitions agree.

3.3.8. BROWNIAN LOCAL TIME. The notion of local time is important in the study of Brownian motion. Formally the local time $l(t, x)$ is given by

$$l(t, x) = \int_0^t \delta(b(s) - x)\,ds,$$

where $b$ is a Brownian motion and $\delta$ the delta function. The idea is that $l(t, x)$ measures the number of times the Brownian particle visits the site $x$ before time $t$. One way of making this heuristic idea precise is to show that there exists a jointly continuous process $l(t, x)$ such that

(ii) for almost all $(t, x) \in [0, \infty) \times \mathbb{R}$, where $I_A$ is the characteristic function of the set $A$. This is the standard, but somewhat indirect, approach.
Thinking of $b$ as the standard part of the hyperfinite random walk gives us a different and more direct approach. Starting from the approximation [use either (i) or (ii)] (iii) to $l(t, x)$, we replace the time line $[0, \infty)$ by the hyperfinite discretization $T = \{0, \Delta t, \dots, n\Delta t, \dots\}$, $n \in {}^*\mathbb{N}$, and the space $\mathbb{R}$ by $\Lambda = \{0, \pm\Delta x, \dots, \pm n\Delta x, \dots\}$, $n \in {}^*\mathbb{N}$, and introduce the internal process $L : T \times \Lambda \to {}^*\mathbb{R}$ by

We note that in the hyperfinite random walk $\Delta x$ and $\Delta t$ are chosen such that $\Delta x = (\Delta t)^{1/2}$. This definition is due to Perkins (1981), who showed that $L$ has a standard part which is, in fact, a Brownian local time, i.e., satisfies (ii). Perkins used the hyperfinite representations in (ii) to prove the following global intrinsic characterization of local time. Let $m(t, x, \delta)$ denote the Lebesgue measure of the set of points within $\delta/2$ of $\{s \le t \mid b(s) = x\}$. Then for almost all $\omega$ and each $t_0 > 0$,

(v) $\displaystyle \lim_{\delta \to 0+}\ \sup_{t \le t_0,\ x \in \mathbb{R}} \big|m(t, x, \delta)\,\delta^{-1/2} - 2(2/\pi)^{1/2}\,l(t, x)\big| = 0.$

This characterization of $l(t, x)$ is both intrinsic (i.e., depends only on $\{s \mid b(s) = x\}$) and global (i.e., holds for all $x$ simultaneously). Previously it was known to hold for each $x$ separately. The reader is referred to Perkins (1981, 1983) for further discussion and proofs. We will return to the idea of local time in Chapters 6 and 7 in connection with stochastic potentials given by local time functionals; for the present, we just mention that Perkins' construction has found numerous applications in both standard and nonstandard contexts (Perkins 1981a,b, 1982a-c, 1983; Greenwood, 1982, 1985). There are other hyperfinite constructions of Brownian motion besides Anderson's; see the paper by Oikkonen (1985) for one of them.
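A finite-size sketch of a discretized local time may make the idea concrete. The displayed definition of $L$ is not legible in this text, so the normalization used below, $L(t, x) = \#\{s < t \mid B(s) = x\} \cdot \Delta t / \Delta x$ with $\Delta x = \sqrt{\Delta t}$, is an assumption: it is the natural occupation-density discretization and may differ from Perkins' formula by conventions. The function name `local_time` is likewise introduced here.

```python
import math
import random
from collections import Counter


def local_time(n_steps, rng):
    """Return ({site index: L(1, x)}, dx) for one sampled walk path.

    ASSUMED normalization (see the lead-in):
        L(t, x) = (number of times s < t with B(s) = x) * dt / dx,
    with dt = 1 / n_steps, dx = sqrt(dt) and x = site index * dx.
    """
    dt = 1.0 / n_steps
    dx = math.sqrt(dt)
    visits = Counter()
    position = 0                     # B(s) / dx, always an integer
    for _ in range(n_steps):
        visits[position] += 1        # record the visit at the current time s
        position += rng.choice((-1, 1))
    return {site: count * dt / dx for site, count in visits.items()}, dx


rng = random.Random(2)
L, dx = local_time(10_000, rng)
occupation = sum(value * dx for value in L.values())
print("number of visited sites:", len(L))
print("sum over x of L(1, x) * dx =", occupation)
```

With this normalization the discrete occupation-time identity $\sum_x L(1, x)\,\Delta x = 1$ holds exactly, mirroring the fact that a local time integrates over space to the elapsed time.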
3.4. PUSHING DOWN LOEB MEASURES

In the previous sections we have developed the basic theory for Loeb measures and used it to get easy constructions of the Lebesgue and Wiener measures. Our strategy was the same in both cases: we began by constructing a natural internal measure on the *-version ${}^*X$ of our space $X$; taking the
Loeb measure, we obtained a countably additive measure on ${}^*X$; and pulling this back to $X$ with the inverse standard part map, we got the desired measure on $X$. This method is useful in other situations as well; assume, e.g., that $X$ is a nice enough topological space, and let $\{\mu_\alpha\}_{\alpha \in I}$ be a weakly convergent net of measures on $X$. How do we construct the limit measure? Consider ${}^*(\{\mu_\alpha\}_{\alpha \in I})$, which is a nonstandard net $\{\tilde{\mu}_\alpha\}_{\alpha \in {}^*I}$ of internal measures on ${}^*X$, and pick $\tilde{\mu}_\omega$ for some infinite $\omega \in {}^*I$. Taking the Loeb measure $L(\tilde{\mu}_\omega)$ and pushing it down to $X$, we get the limit measure [for details, consult Anderson and Rashid (1978) and Loeb (1979a)].

With these examples in mind we shall now study internal probability spaces $({}^*X, \mathcal{A}, P)$, where $X$ is a Hausdorff space, and try to determine when $\mu = \mathrm{st}(L(P))$ [i.e., $\mu(A) = L(P)(\mathrm{st}^{-1}(A))$ for all measurable $A$] is a reasonable probability measure on $X$. The exposition may seem unduly technical, but we shall try to convince you in the next section that the machinery we develop here is an extremely efficient tool for constructing limit measures and measure extensions. Before we begin, we should mention that the first use of these techniques seems to have been an application to potential theory in Loeb (1976); the first systematic treatments were by Anderson (1977, 1982), who used them to obtain hyperfinite representations of Radon spaces, and by Loeb (1979a).

Let us first agree on what a "reasonable" probability measure on $X$ is:

3.4.1. DEFINITION. Let $X$ be a Hausdorff space, and let $\mathcal{C}$ be a family of subsets of $X$. A function $\nu : \mathcal{C} \to \mathbb{R}_+$ is called regular if for all $C \in \mathcal{C}$

(1) $\nu(C) = \sup\{\nu(F) \mid F \subseteq C,\ F \in \mathcal{C} \text{ is closed}\} = \inf\{\nu(O) \mid O \supseteq C,\ O \in \mathcal{C} \text{ is open}\}$.

A measure $\nu$ on $X$ is called a Radon measure if it is the completion of a Borel measure, and for all $\nu$-measurable $C$:

(2) $\nu(C) = \sup\{\nu(K) \mid K \subseteq C,\ K \text{ is compact}\} = \inf\{\nu(O) \mid O \supseteq C,\ O \text{ is open}\}$.

If $\nu$ is finite, the condition on approximation by open sets in the definition of Radon measure is clearly redundant. We shall be mostly interested in Radon measures, but our first result is on regular measures:

3.4.2. PROPOSITION. Let $X$ be a Hausdorff space, and let $({}^*X, \mathcal{A}, P)$ be an internal, finitely additive probability space such that $\mathrm{st}^{-1}(F) \in L(\mathcal{A})$ for all closed $F$. Assume that $L(P)(\mathrm{Ns}({}^*X)) = 1$. Then $\mathrm{st}(L(P))$ is a regular, completed Borel probability measure on $X$.
PROOF. Since $\mathrm{st}^{-1}(F)$ is measurable for all closed $F$, the measure $\mathrm{st}(L(P))$ must clearly be defined on a $\sigma$-algebra extending the Borel sets, and since $L(P)$ is complete, so is $\mu = \mathrm{st}(L(P))$. The proposition will follow if we can prove the first equality in (1). Assume that $C$ is $\mu$-measurable, and let $\varepsilon > 0$ be given. Since $\mathrm{st}^{-1}(C) \in L(\mathcal{A})$, there must be an $A \in \mathcal{A}$, $A \subseteq \mathrm{st}^{-1}(C)$, with

$$L(P)(A) > L(P)(\mathrm{st}^{-1}(C)) - \varepsilon = \mu(C) - \varepsilon.$$

Since $A$ is internal, $\mathrm{st}(A)$ is a closed subset of $C$ by 2.1.8. But since $\mathrm{st}^{-1}\,\mathrm{st}(A) \supseteq A \cap \mathrm{Ns}({}^*X)$, we get

$$\mu(\mathrm{st}(A)) = L(P)(\mathrm{st}^{-1}\,\mathrm{st}(A)) \ge L(P)(A) \ge \mu(C) - \varepsilon,$$

and the proposition is proved.

This proof is exactly the same as the one we gave for Proposition 3.2.5; Anderson (1977, 1982) attributes it to E. Fisher. We shall be more concerned with the following consequence.

3.4.3. COROLLARY. Let $X$ be a Hausdorff space, and let $({}^*X, \mathcal{A}, P)$ be an internal, finitely additive probability space such that $\mathrm{st}^{-1}(K) \in L(\mathcal{A})$ for all compact $K$. Assume also that for all positive $\varepsilon \in \mathbb{R}$, there is a compact $K_\varepsilon$ with $L(P)(\mathrm{st}^{-1} K_\varepsilon) > 1 - \varepsilon$. Then $\mathrm{st}(L(P))$ is a Radon measure on $X$.

PROOF.
The corollary follows from the proposition if we can prove that $\mathrm{st}^{-1}(F) \in L(\mathcal{A})$ for all closed $F$. But

$$F = \bigcup_{n \in \mathbb{N}} (F \cap K_{1/n}) \cup \Big(F - \bigcup_{n \in \mathbb{N}} K_{1/n}\Big),$$

and $\mathrm{st}^{-1}(F)$ is hence a countable union of measurable sets.

To use the last result we need to know that $\mathrm{st}^{-1}(K) \in L(\mathcal{A})$ for all compact sets $K$. In many applications this is far from obvious, and our next task is to reduce this problem to a much easier one. We begin with a topological lemma.

3.4.4. LEMMA. Let $X$ be a Hausdorff space and $\tau$ a basis for the topology closed under finite unions. If $K \subseteq X$ is compact, then

$$\mathrm{st}^{-1}(K) = \bigcap\{{}^*O \mid K \subseteq O,\ O \in \tau\}.$$

PROOF. By definition of the standard part map

$$\mathrm{st}^{-1}(K) \subseteq \bigcap\{{}^*O \mid K \subseteq O,\ O \in \tau\}.$$

To prove the opposite inclusion, assume $y \notin \mathrm{st}^{-1}(K)$. For each $x \in K$, we can find $G_x \in \tau$ such that $x \in G_x$ and $y \notin {}^*G_x$. Obviously $K \subseteq \bigcup_{x \in K} G_x$, and since $K$ is compact, we may find a finite subcovering

$$K \subseteq G_{x_1} \cup \cdots \cup G_{x_n},$$
and thus

$${}^*K \subseteq {}^*(G_{x_1} \cup \cdots \cup G_{x_n}).$$

Since $\tau$ is closed under finite unions, we have $G_{x_1} \cup \cdots \cup G_{x_n} \in \tau$, and since $y \notin {}^*(G_{x_1} \cup \cdots \cup G_{x_n})$, the lemma follows.

3.4.5. PROPOSITION. Let $X$ be a Hausdorff space and $\tau$ a basis for the topology closed under finite unions. Let $({}^*X, \mathcal{A}, P)$ be an internal, finitely additive probability space such that ${}^*O \in L(\mathcal{A})$ for all $O \in \tau$. Then $\mathrm{st}^{-1}(K) \in L(\mathcal{A})$ for all compact sets $K$, and

$$L(P)(\mathrm{st}^{-1}(K)) = \inf\{L(P)({}^*O) \mid O \in \tau,\ K \subseteq O\}.$$

PROOF. Let $K$ be compact, and put

$$\alpha_K = \inf\{L(P)({}^*O) \mid O \in \tau,\ K \subseteq O\}.$$

Given $O_1, \dots, O_n \in \tau$ with $K \subseteq O_1 \cap \cdots \cap O_n$, and $m \in \mathbb{N}$, let

$$A_{O_1, \dots, O_n, m} = \{B \in \mathcal{A} \mid B \subseteq {}^*O_1 \cap \cdots \cap {}^*O_n,\ P(B) > \alpha_K - 1/m\}.$$

Each $A_{O_1, \dots, O_n, m}$ is nonempty since ${}^*O_1 \cap \cdots \cap {}^*O_n$ is Loeb measurable with measure $\ge \alpha_K$. Applying saturation to the family $\{A_{O_1, \dots, O_n, m}\}$, we find an internal $B \in \mathcal{A}$ such that ${}^\circ P(B) \ge \alpha_K$ and $B \subseteq {}^*O$ for all $O \in \tau$, $O \supseteq K$. By the lemma $B \subseteq \mathrm{st}^{-1}(K)$, and the proposition follows from the completeness of the Loeb measure.
Combining Corollary 3.4.3 and Proposition 3.4.5, we get the main result of this section.

3.4.6. THEOREM. Let $X$ be a Hausdorff space and $\tau$ a basis for the topology closed under finite unions. Let $({}^*X, \mathcal{A}, P)$ be an internal, finitely additive probability space such that ${}^*O \in L(\mathcal{A})$ for all $O \in \tau$, and assume that for each $\varepsilon \in \mathbb{R}_+$ there is a compact set $K_\varepsilon$ with

$$\alpha_{K_\varepsilon} = \inf\{L(P)({}^*O) \mid K_\varepsilon \subseteq O,\ O \in \tau\} > 1 - \varepsilon.$$

Then $\mu = \mathrm{st}(L(P))$ is a Radon probability measure on $X$, and for all compacts $K$, we have $\mu(K) = \alpha_K$.

Theorem 3.4.6 has three important advantages over Corollary 3.4.3. The first is that we need no longer check if the external sets $\mathrm{st}^{-1}(K)$ are in $L(\mathcal{A})$, but only if the internal sets ${}^*O$ are; in most applications they will already be in $\mathcal{A}$, and the checking is trivial. The second advantage is that we only have to show the measurability of sets in a basis for the topology; this is important in spaces where the open sets are not countably generated from the basis. Finally, we need not show that there exist compacts with arbitrarily large measure; it is enough to come up with compacts that can
be approximated from the outside by only basis elements of large measure. When we turn to applications in the next section, we shall exploit these three points systematically.

The hypothesis in Corollary 3.4.3 that $L(P)(\mathrm{st}^{-1}(K_\varepsilon)) > 1 - \varepsilon$ (and the corresponding one in Theorem 3.4.6, that $\alpha_{K_\varepsilon} > 1 - \varepsilon$) serves two purposes. The first is to ensure that $\mathrm{Ns}({}^*X)$ is Loeb measurable with measure one, which is necessary for $L(P) \circ \mathrm{st}^{-1}$ to be a well-defined probability measure, and the second is to guarantee the existence of arbitrarily large compacts, which is necessary for $L(P) \circ \mathrm{st}^{-1}$ to be a Radon measure, and also for the proof of 3.4.3 to work. We shall now show that if $X$ is a locally compact space or allows a complete metric, then these conditions are satisfied whenever $\mathrm{Ns}({}^*X)$ has outer measure one.
3.4.7. PROPOSITION. Let $X$ be a Hausdorff space, and assume that $({}^*X, \mathcal{A}, P)$ is an internal, finitely additive probability space such that $O \in \mathcal{A}$ for all *-open sets $O$. If either

(a) $X$ is locally compact, or
(b) $X$ is a complete metric space,

then $\mathrm{Ns}({}^*X)$ is $L(P)$-measurable and

$$L(P)(\mathrm{Ns}({}^*X)) = \sup\{L(P)(\mathrm{st}^{-1}(K)) \mid K \text{ compact}\}.$$

PROOF. (a) Define

$$\alpha = \sup\{L(P)(\mathrm{st}^{-1}(K)) \mid K \text{ compact}\}.$$

If $K_1, \dots, K_n$ are compact and $m \in \mathbb{N}$, let

$$A_{K_1, \dots, K_n, m} = \{B \in \mathcal{A} \mid B \supseteq {}^*K_1 \cup \cdots \cup {}^*K_n \text{ and } P(B) < \alpha + 1/m\}.$$

Since by 2.1.6(iv) we have ${}^*K \subseteq \mathrm{st}^{-1}(K)$ for all compact sets $K$, the set $A_{K_1, \dots, K_n, m}$ is nonempty. Using saturation on the family $\{A_{K_1, \dots, K_n, m}\}$, we find a set $B \in \mathcal{A}$ such that ${}^\circ P(B) \le \alpha$ and ${}^*K \subseteq B$ for all compact $K$. Since $X$ is locally compact, any element in $\mathrm{Ns}({}^*X)$ is in the *-version of some compact, and hence $\mathrm{Ns}({}^*X) \subseteq B$. It follows that $\mathrm{Ns}({}^*X)$ is Loeb-measurable with $L(P)(\mathrm{Ns}({}^*X)) = \alpha$.

(b) Let $\overline{L(P)}$ be the outer measure generated by $L(P)$, and put

$$\gamma = \overline{L(P)}(\mathrm{Ns}({}^*X)).$$

If $\alpha$ is as in part (a), all we have to show is that $\gamma \le \alpha$. The main idea of the argument is indicated by the following claim.

CLAIM. Let $\varepsilon \in \mathbb{R}_+$. For each $m \in \mathbb{N}$, there is a finite sequence $\{C_i^{(m)}\}_{i \le n(m)}$ of subsets of $X$ such that each $C_i^{(m)}$ is a closed ball of radius
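$3/m$ and

$$\overline{L(P)}\Big(\mathrm{Ns}({}^*X) \cap \bigcap_{m \in \mathbb{N}} \bigcup_{i=1}^{n(m)} {}^*C_i^{(m)}\Big) > \gamma - \varepsilon.$$

Let us first use the claim to prove the proposition. Define

$$K_\varepsilon = \bigcap_{m \in \mathbb{N}} \bigcup_{i=1}^{n(m)} C_i^{(m)};$$

since it is closed and totally bounded, $K_\varepsilon$ is compact. We have

$$\mathrm{st}^{-1}(K_\varepsilon) = \bigcap_{m \in \mathbb{N}} \bigcup_{i=1}^{n(m)} \mathrm{st}^{-1}(C_i^{(m)}) \supseteq \bigcap_{m \in \mathbb{N}} \bigcup_{i=1}^{n(m)} {}^*C_i^{(m)} \cap \mathrm{Ns}({}^*X)$$

since $C_i^{(m)}$ is closed. By the claim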
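$$L(P)(\mathrm{st}^{-1}(K_\varepsilon)) > \gamma - \varepsilon,$$

from which the proposition follows.

We now prove the claim. For each $x \in {}^*X$, $r \in {}^*\mathbb{R}_+$, let

$$B(x, r) = \{y \in {}^*X \mid d(x, y) \le r\}.$$

Given $\delta \in \mathbb{R}_+$, $n \in {}^*\mathbb{N}$, let

$$\beta_{n,\delta} = \sup\Big\{P\Big(\bigcup_{i \le n} B(x_i, \delta)\Big) \;\Big|\; x_1, \dots, x_n \in {}^*X\Big\}$$

and put

$$\beta_\delta = \sup\{{}^\circ\beta_{n,\delta} \mid n \in \mathbb{N}\}.$$

Observe that the set

$$\Big\{n \in {}^*\mathbb{N} \;\Big|\; \forall x_1, \dots, x_n \in {}^*X \Big(P\Big(\bigcup_{i \le n} B(x_i, \delta)\Big) < \beta_\delta + \frac{1}{n}\Big)\Big\}$$

is internal and contains $\mathbb{N}$. Hence we can find $\eta \in {}^*\mathbb{N} - \mathbb{N}$ such that $\beta_{\eta,\delta} \approx \beta_\delta$ for all $\delta \in \mathbb{R}_+$. Let $\varepsilon$ be the positive real number in the claim. For each $m \in \mathbb{N}$, there is a finite sequence $x_1^{(m)}, \dots, x_{n(m)}^{(m)}$ such that

We must have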
The reason is as follows: extend xi"'), . . . ,xi:,!,) to an internal sequence xi"'), . . . ,x',"' containing all standard points; this can be done by saturation. By definition of 7,the set
has measure less than ~/2"',and since the sequence x!"'), . . . ,x',"' contains all standard points,
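We shall now replace the sequence $\{B(x_i^{(m)}, 1/m)\}_{i \le n(m)}$ by a sequence $\{C_i^{(m)}\}_{i \le n(m)}$ of standard sets satisfying the claim. If $B(x_i^{(m)}, 1/m) \cap \mathrm{Ns}({}^*X) = \emptyset$, let $C_i^{(m)} = \emptyset$. If the intersection is nonempty, we can find a standard element $y_i^{(m)} \in X$ such that
$$B(x_i^{(m)}, 1/m) \subset {}^*C_i^{(m)},$$
where
$$C_i^{(m)} = \{x \in X \mid d(x, y_i^{(m)}) \le 3/m\}.$$
It follows that
$$\overline{L(P)}\Bigl(\mathrm{Ns}({}^*X) - \bigcup_{i=1}^{n(m)} {}^*C_i^{(m)}\Bigr) < \frac{\varepsilon}{2^m},$$
and hence
$$\overline{L(P)}\Bigl(\mathrm{Ns}({}^*X) \cap \bigcap_{m \in \mathbb{N}} \bigcup_{i=1}^{n(m)} {}^*C_i^{(m)}\Bigr) > \gamma - \varepsilon.$$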
Throwing out the empty $C_i^{(m)}$'s, we prove the claim (and the proposition).
REMARK. The proof of (a) is essentially due to Loeb (1984), while the idea in (b) is based on a well-known argument showing that all probability measures on Polish spaces are tight [see, e.g., Billingsley (1968)]. It should be pointed out that the proof of (b) can be considerably simplified if we assume that $X$ is separable, but since Proposition 3.4.7 will play an important part in Chapter 5, we have decided to present the general case. Note that (a) remains true if we weaken the measurability condition on *-open sets to just demanding that ${}^*O \in L(\mathcal{A})$ for all open sets $O$; the same is true of (b) when $X$ is separable.
In many situations it is much easier to show that the nearstandard points have outer measure one than to prove the existence of large compacts, and
what Proposition 3.4.7 tells us is that for certain spaces this is sufficient to allow us to apply Theorem 3.4.6. As a matter of fact, in these spaces Theorem 3.4.6 can be used even when $\mathrm{Ns}({}^*X)$ does not have outer measure one, but $L(P) \circ \mathrm{st}^{-1}$ will then no longer be a probability measure.
When we have used Theorem 3.4.6 to construct a measure, we often want to compare the result with some given set function; e.g., if we want to extend a measure to a larger algebra, we would like the new measure to agree with the old one where the old one is defined. The next result is an efficient tool for checking this; it goes back to Anderson (1977, 1982).
3.4.8. PROPOSITION. Let $X$ be a Hausdorff space, and $({}^*X, \mathcal{A}, P)$ an internal, finitely additive probability space with $L(P)(\mathrm{Ns}({}^*X)) = 1$. Let $\mathcal{C}$ be a family of subsets of $X$, and let $\nu : \mathcal{C} \to \mathbb{R}_+$ be a regular set function such that
$$\nu(C) = L(P)({}^*C) \quad \text{for all } C \in \mathcal{C}. \tag{3}$$
Then $\mu = \mathrm{st}(L(P))$ is an extension of $\nu$.
PROOF. Let $C \in \mathcal{C}$ and $\varepsilon \in \mathbb{R}_+$ be given, and choose $F, O \in \mathcal{C}$ closed and open, respectively, such that $F \subset C \subset O$ and
$$\nu(O) - \nu(F) < \varepsilon, \tag{4}$$
by the regularity of $\nu$. Obviously ${}^*F \subset {}^*C \subset {}^*O$, and by the nonstandard characterization of closed and open sets (see 2.1.6)
$${}^*F \cap \mathrm{Ns}({}^*X) \subset \mathrm{st}^{-1}(F) \subset \mathrm{st}^{-1}(C) \subset \mathrm{st}^{-1}(O) \subset {}^*O.$$
Combining (4) with (3) applied to $O$ and $F$, and remembering that $L(P)(\mathrm{Ns}({}^*X)) = 1$, we get $L(P)({}^*O) - L(P)({}^*F \cap \mathrm{Ns}({}^*X)) < \varepsilon$. Since $\varepsilon$ is arbitrary, and ${}^*C \cap \mathrm{Ns}({}^*X)$ and $\mathrm{st}^{-1}(C)$ are both squeezed between ${}^*F \cap \mathrm{Ns}({}^*X)$ and ${}^*O$, we must have
$$L(P)(\mathrm{st}^{-1}(C) \,\triangle\, {}^*C) = 0, \tag{5}$$
and hence
$$\mu(C) = L(P)(\mathrm{st}^{-1}(C)) = L(P)({}^*C) = \nu(C),$$
and the proposition is proved.
As a consequence of the last proposition, we get a nonstandard version of Lusin's theorem due to Anderson (1977, 1982); the result is interesting in its own right and also useful in the construction of liftings.
3.4.9. COROLLARY. Let $(X, \mathcal{C}, \nu)$ be a Radon probability space, and let $f : X \to Y$ be a measurable map into a Hausdorff space with countable basis. Then ${}^\circ({}^*f(x)) = f({}^\circ x)$ for $L({}^*\nu)$-a.a. $x$ in ${}^*X$.
PROOF. Let $\{U_n\}_{n \in \mathbb{N}}$ be a countable basis for $Y$ with $U_1 = Y$. If ${}^\circ({}^*f(x)) \ne f({}^\circ x)$, we must have
$$x \in \bigcup_{n \in \mathbb{N}} \bigl\{(f \circ \mathrm{st})^{-1}(U_n) \,\triangle\, {}^*f^{-1}({}^*U_n)\bigr\},$$
and we only have to show that each of the sets in the union has measure zero. But
$$(f \circ \mathrm{st})^{-1}(U_n) \,\triangle\, {}^*f^{-1}({}^*U_n) = \mathrm{st}^{-1}(f^{-1}(U_n)) \,\triangle\, {}^*(f^{-1}(U_n)),$$
and applying formula (5) with $P = {}^*\nu$ and $C = f^{-1}(U_n)$, we see that the set on the right has measure zero.
Anderson's Lusin theorem tells us that ${}^*f$ is a lifting of $f$ with respect to all standard measures ${}^*\nu$. A second and just as useful consequence of 3.4.8 is the following representation theorem, also due to Anderson (1977, 1982):
3.4.10. COROLLARY. Let $\mu$ be a Radon probability measure on a Hausdorff space $X$. Then there exist a hyperfinite subset $Y$ of ${}^*X$ and an internal measure $P$ on $Y$ such that $\mu = \mathrm{st}(L(P))$.
PROOF. Given a finite family $O_1, O_2, \dots, O_n$ of open sets in $X$, let $\mathcal{P}_{O_1, O_2, \dots, O_n}$ be the set of all hyperfinite partitions of ${}^*X$ into ${}^*\mu$-measurable sets such that each ${}^*O_i$ is a union of partition classes. The family $\{\mathcal{P}_{O_1, O_2, \dots, O_n}\}$ has the finite intersection property, and thus there is a hyperfinite partition of ${}^*X$ into ${}^*\mu$-measurable sets such that the *-version of any open set is a union of partition classes. Fix one such partition $E$; let $Y$ consist of one element from each partition class of $E$, and if $y \in Y$ and $[y]$ is its partition class, put
$$P\{y\} = {}^*\mu([y]).$$
Since each open set is a union of partition classes, we clearly have
$$\mu(O) = {}^*\mu({}^*O) = P({}^*O) = L(P)({}^*O)$$
for all open sets $O$. By Proposition 3.4.8, the measures $\mu$ and $\mathrm{st}(L(P))$ agree on the open sets, and since they both are Radon measures, they must be equal.
Corollary 3.4.10 states one of the basic facts of nonstandard measure theory, and it will be used repeatedly in the sequel, often without an explicit reference.
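The construction in the proof of Corollary 3.4.10 has an elementary finite analogue, which the following Python sketch illustrates. It is only an illustration under simplifying assumptions: the space is a finite set of points with given masses, a finite list of subsets plays the role of the open sets, and the names `representation`, `points`, `mass`, and `open_sets` are hypothetical. The partition classes are the sets of points with the same membership pattern, one representative is chosen from each class, and the representative carries the mass of its class.

```python
from fractions import Fraction


def representation(points, mass, open_sets):
    """Return a point measure P on one representative per partition class.

    Two points lie in the same class when they belong to exactly the same
    members of `open_sets`, so every member is a union of classes.
    """
    classes = {}
    for x in points:
        pattern = tuple(x in O for O in open_sets)
        classes.setdefault(pattern, []).append(x)
    P = {}
    for members in classes.values():
        y = members[0]                          # chosen representative
        P[y] = sum(mass[x] for x in members)    # P{y} = mass of the class [y]
    return P


points = range(12)
mass = {x: Fraction(1, 12) for x in points}
open_sets = [set(range(0, 6)), set(range(4, 9)), {1, 10, 11}]
P = representation(points, mass, open_sets)

# The point measure agrees with the original mass on every listed set.
for O in open_sets:
    assert sum(p for y, p in P.items() if y in O) == sum(mass[x] for x in O)
```

In the hyperfinite setting the same bookkeeping is carried out for all open sets at once, which is what saturation provides.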
3.5. APPLICATIONS TO LIMIT MEASURES AND MEASURE EXTENSIONS
We shall now apply the techniques developed in the last section to some questions in standard measure and probability theory. Our aim is to show how these techniques form a strong and flexible tool for constructing different kinds of measures. A natural first step is to see what happens when we apply Theorem 3.4.6 to the *-version of a standard space.
3.5.1. THEOREM. Let $X$ be a Hausdorff space, $\tau$ a basis for the topology closed under finite unions, and let $\nu$ be a regular, finitely additive probability measure defined on an algebra $\mathcal{C}$ extending $\tau$. Assume that for each positive $\varepsilon \in \mathbb{R}_+$, there is a compact $K_\varepsilon$ with
$$\beta_{K_\varepsilon} = \inf\{\nu(O) \mid O \in \tau,\ O \supset K_\varepsilon\} > 1 - \varepsilon. \tag{1}$$
Then $\nu$ has a unique extension to a Radon measure $\mu$ on $X$, and for all compact $K$, $\mu(K) = \beta_K$.
PROOF. To construct $\mu$, just apply Theorem 3.4.6 to $({}^*X, {}^*\mathcal{C}, {}^*\nu)$, and put $\mu = \mathrm{st}(L({}^*\nu))$. It follows immediately from Proposition 3.4.8 that $\mu$ is an extension of $\nu$. For the uniqueness, assume that $\tilde\mu$ is another Radon extension of $\nu$. Since $\mu(K) = \beta_K$, we must have $\mu(K) \ge \tilde\mu(K)$ for all compact sets $K$. Since $\tilde\mu \ne \mu$, there must be a set $B$ such that $\mu(B) < \tilde\mu(B)$, and since $\tilde\mu$ is Radon there is a compact $K \subset B$ with $\tilde\mu(K) > \mu(B)$. But $\mu(B) \ge \mu(K)$, and hence $\tilde\mu(K) > \mu(K)$ and we have a contradiction.
To illustrate the strength of this theorem, we shall use it to obtain two famous results of probability theory. The first is concerned with projective limits of measures, and is what Schwartz (1972) calls Prohorov's theorem; the reader is warned that this result is different from the one that is usually known by this name [as a matter of fact, the theorem seems to be due to Kisyński (1969)]. The theorem plays a fundamental role in the theory of cylindrical measures and generalized stochastic processes, but for these applications we can only refer the reader to Schwartz's book. Our second result is another proof of the existence of Brownian motion, from which the Lévy modulus of continuity follows immediately.
Before we can state Prohorov's theorem, we need some definitions. A projective system of topological spaces is a directed family $\{(X_i, \tau_i)\}_{i \in I}$ of Hausdorff spaces, together with a family $\{\pi_{ij}\}_{i<j}$ of continuous mappings $\pi_{ij} : X_j \to X_i$, satisfying $\pi_{ik} = \pi_{ij} \circ \pi_{jk}$ whenever $i < j < k$, $i, j, k \in I$. The projective limit $(X, \tau)$ of $\{(X_i, \tau_i)\}_{i \in I}$ is the space
$$X = \{(x_i)_{i \in I} : \forall i < j\ (x_i = \pi_{ij}(x_j))\}$$
with the weakest topology making all the maps $\pi_i((x_j)_{j \in I}) = x_i$ continuous; $(X, \tau)$ is clearly a Hausdorff space. A family $\{\mu_i\}_{i \in I}$ of Radon measures on the spaces $X_i$ is a projective system of measures if $\mu_i = \pi_{ij}(\mu_j)$ whenever $i < j$. The question is: When does a projective system of measures give rise to a limit Radon measure $\mu$ on $X$ such that $\mu_i = \pi_i(\mu)$ for all $i \in I$?
3.5.2. PROHOROV'S THEOREM. Let $(X_i, \tau_i, \mu_i, \pi_{ij})_{i,j \in I}$ be a projective system of Hausdorff spaces and Radon probability measures. The following is a necessary and sufficient condition for the existence of a Radon probability measure $\mu$ on the projective limit $X$ such that $\mu_i = \pi_i(\mu)$ for all $i \in I$:
(2) For all $\varepsilon > 0$, there is a compact $K_\varepsilon \subset X$ such that for all $i \in I$ we have $\mu_i(\pi_i(K_\varepsilon)) > 1 - \varepsilon$.
The limit is unique when it exists.
PROOF. The necessity is almost trivial; since $\mu$ is Radon there exists a compact $K_\varepsilon$ with $\mu(K_\varepsilon) > 1 - \varepsilon$, and hence
$$\mu_i(\pi_i(K_\varepsilon)) = \mu(\pi_i^{-1}(\pi_i(K_\varepsilon))) \ge \mu(K_\varepsilon) > 1 - \varepsilon.$$
To prove the sufficiency, let $\tau'$ be the basis for the topology given by $\pi_i^{-1}(O)$ for all open sets $O \in \tau_i$ and all $i \in I$. Let $\nu$ be the finitely additive measure defined on the algebra $\mathcal{C}$ generated by this basis, by $\nu(\pi_i^{-1}(B)) = \mu_i(B)$. Since each $\mu_i$ is Radon, and the inverse image of a compact set is closed, $\nu$ is regular. Condition (1) of 3.5.1 follows immediately from (2), and hence 3.5.1 gives us the existence of a unique Radon extension of $\nu$. This proves the theorem.
Notice that (2) only allows us to calculate the measure of base elements containing the sets $K_\varepsilon$; thus the extra strength of 3.4.6 was crucial in this application.
We now turn to our next example, another construction of Brownian motion, and in this case we shall see how the equality $\beta_K = \mu(K)$ of 3.5.1 can be used. Let us recall what we are trying to show. Let $C[0,1]$ be the set of real-valued, continuous functions on $[0,1]$, and $\pi$ the uniform topology on $C[0,1]$. A cylinder set $C$ is a subset of $C[0,1]$ of the form
$$C = \{\omega \in C[0,1] \mid \omega(t_1) \in A_1, \dots, \omega(t_n) \in A_n\},$$
where $t_1 < t_2 < \cdots < t_n$ are elements of $[0,1]$, and $A_1, \dots, A_n$ are Borel sets in $\mathbb{R}$. Let $\nu$ be the finitely additive measure defined on the cylinder sets by
$$\nu(C) = \int_{A_1} \cdots \int_{A_n} \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi(t_i - t_{i-1})}} \exp\Bigl(-\frac{(x_i - x_{i-1})^2}{2(t_i - t_{i-1})}\Bigr)\, dx_n \cdots dx_1,$$
where $t_0 = x_0 = 0$. We shall prove
3.5.3. THEOREM. The finitely additive measure $\nu$ has a unique extension to a Borel measure $W$ on $(C[0,1], \pi)$. Moreover,
PROOF. Let $\sigma$ be the topology of pointwise convergence in $C[0,1]$. If $\tau$ consists of all finite unions of cylinder sets $C$ with open sections $A_1, \dots, A_n$, then $\tau$ is a basis for $\sigma$ closed under finite unions. We want to apply Theorem 3.5.1 to the space $(C[0,1], \sigma)$. The regularity of $\nu$ is obvious, and we only have to prove (1). Consider the following sets:
for $C \in \mathbb{R}_+$, $n \in \mathbb{N}$. The standard part of an element in ${}^*K_n^C$, in both the $\sigma$- and the $\pi$-topology, is obviously in $K_n^C$, and hence $K_n^C$ is compact in both topologies. A little thought will convince the reader that
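$$\beta_{K_n^C} = \inf\Bigl\{\prod_i \Bigl(\int_{-[2C\,\Delta t_i \ln(1/\Delta t_i)]^{1/2}}^{[2C\,\Delta t_i \ln(1/\Delta t_i)]^{1/2}} \frac{1}{\sqrt{2\pi\,\Delta t_i}} \exp\bigl(-x^2/(2\,\Delta t_i)\bigr)\, dx\Bigr)\Bigr\},$$
where the infimum is over all partitions $0 = t_0 < t_1 < t_2 < \cdots < t_k = 1$, where $\Delta t_i = t_{i+1} - t_i$ and $\max_i \Delta t_i \le 1/n$. Introducing a new variable $y = x/\sqrt{\Delta t_i}$ in the $i$th integral above, the expression for $\beta_{K_n^C}$ becomes
$$\beta_{K_n^C} = \inf\Bigl\{\prod_i \Bigl(\int_{-[2C \ln(1/\Delta t_i)]^{1/2}}^{[2C \ln(1/\Delta t_i)]^{1/2}} \frac{1}{\sqrt{2\pi}} \exp(-y^2/2)\, dy\Bigr)\Bigr\}. \tag{4}$$
Let us first consider the case $C > 1$. By using L'Hôpital's rule, we see that
$$\lim_{x \to \infty} \frac{\int_x^\infty (2/\sqrt{2\pi}) \exp(-y^2/2)\, dy}{\exp(-x^2/2)} = 0,$$
and for $n$ large enough, we thus have
$$\beta_{K_n^C} \ge \inf\Bigl\{\prod_i \bigl(1 - \Delta t_i^{\,C}\bigr)\Bigr\} = \inf\Bigl\{\exp\Bigl[\sum_i \ln\bigl(1 - \Delta t_i^{\,C}\bigr)\Bigr]\Bigr\} \ge \inf\Bigl[\exp\Bigl(-2 \sum_i \Delta t_i^{\,C}\Bigr)\Bigr] \ge \exp\Bigl[-2 \Bigl(\frac{1}{n}\Bigr)^{C-1}\Bigr],$$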
where the inf's are over the same set as previously. This gives us condition (1) in Theorem 3.5.1, and $\nu$ thus has a unique extension to a Radon measure $W$ on $(C[0,1], \sigma)$. Since the $K_n^C$'s also are $\pi$-compact, $W$ must also be a Radon measure on $(C[0,1], \pi)$, and that it is the only $\pi$-Radon extension of $\nu$ follows exactly as the uniqueness part of 3.5.1. Since all completed Borel measures on complete, separable metric spaces are Radon, $W$ is also unique as a Borel measure on $(C[0,1], \pi)$.
It remains to prove (3). Let
ue‘
then $K^C \supset \bigcup_{n \in \mathbb{N}} K_n^C$ and $K^C \subset \bigcup_{n \in \mathbb{N}} K_n^{C'}$ for all $C' > C$. From what we have just proved plus the fact that $W(K_n^C) = \beta_{K_n^C}$, it follows that $W(K^C) = 1$ for $C > 1$. To prove (3) it clearly suffices to show that $W(K^C) = 0$ for $C < 1$, and this follows if we can show that $\beta_{K_n^C} = 0$ for all $n \in \mathbb{N}$ and $C < 1$. Assume that $C < 1$, and pick $\alpha > 1$ such that $\alpha C < 1$. We have
$$\lim_{x \to \infty} \frac{\int_x^\infty (2/\sqrt{2\pi}) \exp(-y^2/2)\, dy}{\exp(-\alpha x^2/2)} = \infty,$$
and for all large enough $m$, we thus have
$$\beta_{K_n^C} \le \bigl(1 - (1/m)^{\alpha C}\bigr)^m$$
by applying (4) to the partition $t_i = i/m$. Since $\alpha C < 1$, the limit of $(1 - (1/m)^{\alpha C})^m$ as $m$ goes to infinity is zero, and hence $\beta_{K_n^C} = 0$ for all $C < 1$. This completes the proof of the theorem.
Equation (3) is the famous Lévy modulus for the continuity of the Brownian sample path; if we only wanted to prove the existence of Wiener measure, we could get away more easily by choosing simpler compacts. It is also worthwhile to notice how we used the $W(K) = \beta_K$ part of 3.5.1 to get $W(K^C) = 1$ for $C > 1$.
For the next application we take our leave of Theorem 3.5.1, and return to the basic techniques of Section 3.4. Let $H$ be a real separable Hilbert space; $\mathcal{F}$ the class of finite-dimensional subspaces of $H$; and $\mathcal{P}$ the class of finite-dimensional projections in $H$. If $E, F \in \mathcal{F}$, $E \subset F$, we let $P_{E,F}$ denote the projection from $F$ to $E$, and $P_E$ the projection from $H$ to $E$. We consider a projective system $(E, \mu_E, P_{E,F})_{E,F \in \mathcal{F}}$ of Radon measures, and let $\mu$ be the finitely additive measure on $H$ defined by $\mu(P_E^{-1}(A)) = \mu_E(A)$. Using Prohorov's theorem, it is not hard to show that there exists a Borel measure $\nu$ extending $\mu$ if and only if
$$\sup_{r \in \mathbb{R}_+} \inf_{E \in \mathcal{F}} \mu_E(B_E(r)) = 1,$$
where $B_E(r)$ is the ball in $E$ of radius $r$ centered at the origin [see, e.g., Lindstrøm (1982) for details].
What happens when this condition is not satisfied? It turns out that although we no longer have a limit measure on $H$, we may still have a limit measure on some larger space. From the nonstandard point of view we shall see that this means that even though the natural nonstandard limit measure is not supported on the nearstandard points in the Hilbert space topology, it may be supported on the nearstandard points in some weaker topology. The right way of weakening the topology was discovered by Gross (1967):
3.5.4. DEFINITION. Let $\mu = \{\mu_F\}_{F \in \mathcal{F}}$ be a projective system of measures on $H$. A norm $|\cdot|$ on $H$ is called $\mu$-measurable if it is continuous with respect to the Hilbert space norm $\|\cdot\|$ and for all $\varepsilon > 0$ there is a $P_0 \in \mathcal{P}$ such that
$$\mu\{x \in H \mid |Px| > \varepsilon\} < \varepsilon \quad \text{for all } P \in \mathcal{P} \text{ with } P \perp P_0.$$
Gross's theorem says that if $|\cdot|$ is $\mu$-measurable, then $\mu$ can be extended to a Borel measure on the Banach space $B$ obtained by completing $H$ with respect to $|\cdot|$. Thus for the cases we are interested in, the Hilbert space norm cannot be measurable. Before we formulate Gross's result more precisely, we shall take a look at an example.
3.5.5. EXAMPLE. For each $F \in \mathcal{F}$ let $\mu_F$ be Gaussian distributed with mean zero and covariance matrix $I_{\dim F}$, the identity matrix on $F$. It is easy to check that $\{\mu_F\}_{F \in \mathcal{F}}$ is a projective system of measures, and if a limit measure on $H$ existed, this would be the natural infinite-dimensional Gaussian measure. But, as is easily seen, no such limit measure exists. However, Gross's theorem tells us that if we can only find a $\mu$-measurable norm, we can produce a limit measure on a larger space $B$.
Let $\{e_n\}_{n \in \mathbb{N}}$ be an orthonormal basis for $H$, and let $T : H \to H$ be a linear map. Recall that $T$ is called a Hilbert-Schmidt operator if $\sum_{n=1}^{\infty} \|Te_n\|^2 < \infty$, and that $\sum_{n=1}^{\infty} \|Te_n\|^2$ is independent of which orthonormal basis we use. If $T$ is a one-to-one Hilbert-Schmidt operator, let $|\cdot|_T$ be the norm defined by
$$|x|_T = \|Tx\|.$$
It is not hard to show that $|\cdot|_T$ is $\mu$-measurable. Thus $\mu$ can be extended to a Borel measure on the Hilbert space generated by $|\cdot|_T$.
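A rough heuristic for Example 3.5.5 can be checked with a few lines of Python. The sketch below is not part of the original argument and rests on two assumptions: the norm is taken to be $|x|_T = \|Tx\|$, and $T$ is the hypothetical diagonal operator $Te_i = e_i/i$, which is one-to-one and Hilbert-Schmidt. For a standard Gaussian vector $x$ on the span of $e_1, \dots, e_n$, the expected squared Hilbert norm is $n$, which grows without bound, while the expected value of $|x|_T^2$ is $\sum_{i \le n} 1/i^2$, which stays bounded.

```python
from fractions import Fraction


def expected_squared_norms(n):
    """Expected squared norms of x ~ N(0, I_n) on span(e_1, ..., e_n).

    E||x||^2  = n, the sum of n unit variances.
    E||Tx||^2 = sum_i ||T e_i||^2 for the assumed diagonal T e_i = e_i / i.
    """
    hilbert = Fraction(n)
    weighted = sum(Fraction(1, i * i) for i in range(1, n + 1))
    return hilbert, weighted


for n in (1, 10, 100):
    hilbert, weighted = expected_squared_norms(n)
    print(n, hilbert, float(weighted))
```

The computation only compares expectations; it suggests why the mass escapes every ball of $H$ but not of the larger space, and it is not a substitute for the measurability argument.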
This example is of great interest in infinite-dimensional probability theory. We shall return to the problem in Section 4.7, when we discuss Brownian motion on Hilbert spaces. Since Gaussian measures do not exist on the "natural" Hilbert space, the Brownian motion must also live on the "wrong" space. For more information on the standard theory the reader should consult Kuo's Lecture Notes (Kuo, 1975).
Before we can prove Gross's theorem, we must agree on what it should mean for a measure on $B$ to extend $\mu$. Let $B^*$ be the dual of $B$, and embed $B^*$ in $H$ in the natural way. If $y_1, \dots, y_n \in B^*$ and $A$ is a Borel set in $\mathbb{R}^n$, then
$$\{x \in B \mid (y_1(x), \dots, y_n(x)) \in A\}$$
is called a cylinder set in $B$. We define a finitely additive measure $\tilde\mu$ on the cylinder sets by
$$\tilde\mu\{x \in B \mid (y_1(x), \dots, y_n(x)) \in A\} = \mu\{x \in H \mid (\langle y_1, x\rangle, \dots, \langle y_n, x\rangle) \in A\}.$$
That a Borel measure on $B$ extends $\mu$ we now take to mean that it is an extension of $\tilde\mu$.
3.5.6. GROSS'S THEOREM. Let $\mu = \{\mu_F\}_{F \in \mathcal{F}}$ be a projective system of measures on $H$, $|\cdot|$ a $\mu$-measurable norm, and $B$ the completion of $H$ with respect to $|\cdot|$. Then $\mu$ has an extension to a Borel measure on $B$.
Let $\mathrm{st}_{|\cdot|}$ be the standard part map in $B$ with respect to the norm $|\cdot|$, and let $\mathrm{Ns}_{|\cdot|}({}^*B)$ be the set of nearstandard elements in ${}^*B$. The family $\{\mu_E\}_{E \in \mathcal{F}}$ extends to an internal family $\{\hat\mu_E\}_{E \in {}^*\mathcal{F}}$ of nonstandard measures. By saturation we find $E \in {}^*\mathcal{F}$ such that $H \subset E$. We define an internal measure $\hat\mu$ on ${}^*B$ by
$$\hat\mu(A) = \hat\mu_E(A \cap E).$$
We shall show that $\mathrm{st}_{|\cdot|}(L(\hat\mu))$ is the desired measure. First notice that if $O \subset B$ is open, then since $|\cdot|$ is continuous with respect to $\|\cdot\|$, ${}^*O$ is $\hat\mu$-measurable. The key step is the following lemma, which allows us to apply Proposition 3.4.7.
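3.5.7. LEMMA. If $|\cdot|$ is a $\mu$-measurable norm, then $L(\hat\mu)(\mathrm{Ns}_{|\cdot|}({}^*B)) = 1$.
PROOF. Define
$$A_m = \{x \in E \mid \exists v \in H\ (|v - x| < 1/m)\};$$
by the completeness of $B$ we see that $\mathrm{Ns}_{|\cdot|}({}^*B) \cap E = \bigcap_{m \in \mathbb{N}} A_m$. Hence it suffices to show that $L(\hat\mu_E)(A_m) = 1$ for all $m \in \mathbb{N}$. But for each $P \in \mathcal{P}$ and each $n > m$ we have
$$A_m \supset \{x \in E \mid {}^*P(x) \text{ is nearstandard}\} \cap \{x \in E \mid |x - {}^*Px| < 1/n\}.$$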
The first set on the right has Loeb measure one since ${}^*(\mu_{P(H)})$ is nearstandardly concentrated (it is the *-version of a Radon measure on a finite-dimensional space), and since $|\cdot|$ is measurable, the second set has measure larger than $1 - 1/n$. Since $n$ is arbitrarily large, $L(\hat\mu_E)(A_m) = 1$.
By Lemma 3.5.7 and Theorem 3.4.6 we now get that $\mathrm{st}_{|\cdot|}(L(\hat\mu))$ is a Radon measure on $B$, and it only remains to show that it is an extension of $\tilde\mu$. Let us first calculate the value of $\mathrm{st}_{|\cdot|}(L(\hat\mu))$ on cylindrical sets:
$$\mathrm{st}_{|\cdot|}(L(\hat\mu))\{x \in B \mid (y_1(x), \dots, y_n(x)) \in A\} = L({}^*\mu_F)\bigl(\mathrm{st}^{-1}\{x \in F \mid (\langle y_1, x\rangle, \dots, \langle y_n, x\rangle) \in A\}\bigr),$$
where $F$ is the finite-dimensional subspace of $H$ generated by $y_1, \dots, y_n$. On the other hand, by definition of $\tilde\mu$
$$\tilde\mu\{x \in B \mid (y_1(x), \dots, y_n(x)) \in A\} = \mu_F\{x \in F \mid (\langle y_1, x\rangle, \dots, \langle y_n, x\rangle) \in A\}.$$
We must show that these two expressions are equal. Using $F$ as the $X$ of Proposition 3.4.8, $\mu_F$ as the $\nu$, and ${}^*\mu_F$ as $P$, the equality follows immediately from that result. This completes the proof of Gross's theorem.
As we have already remarked, we shall return to Gross's theorem in Section 4.7; for the present we just remark that in the standard treatments we know (Gross, 1967; Kallianpur, 1971; Kuo, 1975), the theorem is only proved when the cylindrical measure is Gaussian, and that the nonstandard proof of the general case may be the first.
In our last application we return to the problem of extending measures. What happens in Theorem 3.5.1 if we do not even know that $\nu$ is defined on a basis for the topology? We shall give a short proof of the following theorem due to Henry (1969).
3.5.8. HENRY'S THEOREM. Let $\mathcal{C}$ be an algebra of subsets of a Hausdorff space $X$, and let $\nu$ be a finitely additive probability measure on $\mathcal{C}$ such that for all $C \in \mathcal{C}$
$$\nu(C) = \sup\{\nu(K) \mid K \in \mathcal{C},\ K \subset C,\ K \text{ compact}\}. \tag{6}$$
Then $\nu$ can be extended to a Radon measure on $X$.
PROOF. Let $\mathcal{B} = \{{}^*C \mid C \in \mathcal{C} \text{ or } C \text{ is a Borel set}\}$. Let $\mathcal{A}$ be a hyperfinite algebra containing $\mathcal{B}$, and put $\mathcal{D} = \mathcal{A} \cap {}^*\mathcal{C}$. Obviously, ${}^*\nu$ is an internal measure on $\mathcal{D}$. If we have two finite algebras and a measure defined on the smaller of them, we can in a trivial way extend this measure to the larger algebra. By transfer, we may thus extend ${}^*\nu|\mathcal{D}$ to $\mathcal{A}$ and get an internal measure $P$. Applying 3.4.6 to $({}^*X, \mathcal{A}, P)$ we get a Radon measure $\mathrm{st}(L(P))$ on $X$, and by 3.4.8 this is an extension of $\nu$.
For applications of Henry's theorem, we again refer the reader to Schwartz (1972). We have included this proof to show how embeddings in hyperfinite algebras may be used to produce measure extensions; this kind of trick seems to go back to Loeb (1972).
Our hope is that the depth and variety of the results obtained in this section will convince the reader of the strength of the techniques developed in Section 3.4, and we again urge him or her to study the applications to weak convergence given in Anderson and Rashid (1978) and Loeb (1979a). Theorems 3.5.2 and 3.5.6 were first given a nonstandard treatment in Lindstrøm (1982), while (a more general version of) 3.5.7 will appear in Lindstrøm (1986). Although we have not been able to find the exact statement of Theorem 3.5.1 in the literature, we should mention that a standard proof can be obtained by methods introduced by Kisyński (1969) [e.g., combine Theorem 2.1.4 and Lemma 2.1.9 in Berg et al. (1984)]. As for Lévy's modulus of continuity, the reader should look up Keisler's (1984) proof based on Anderson's random walk. One natural question we have not investigated is how to push down Loeb measures when $X$ does not carry a nice topological structure; see Anderson (1982), Ross (1983), and Lindstrøm (1986) for three different approaches to this problem.
We shall end our introduction to nonstandard measure and probability theory here. The reader who is interested in the history and development of the subject should consult Cutland's excellent survey paper (Cutland, 1983), which contains an almost complete report of what had happened in the field up to the summer of 1983. In the remainder of this book, we shall mainly concentrate on those aspects of the theory which have to do with stochastic processes and stochastic phenomena in physics. The books by Stroyan and Bayod (1985) and Hurd and Loeb (1985) contain a wealth of additional information, and the surveys and introductions by Cutland (1982, 1983), Fenstad (1980, 1985), Lindstrøm (1985), Loeb (1979b, 1983), Nelson (1985), Osswald (1985), and Perkins (1983) approach the theory from different perspectives.
REFERENCES
R. M. Anderson (1976). A nonstandard representation for Brownian motion and Itô integration. Israel J. Math. 25.
R. M. Anderson (1977). Star-finite representations of measure spaces. Ph.D. thesis. Yale Univ., New Haven, Connecticut.
R. M. Anderson (1982). Star-finite representations of measure spaces. Trans. Amer. Math. Soc. 271.
R. M. Anderson and S. Rashid (1978). A nonstandard characterization of weak convergence. Proc. Amer. Math. Soc. 69.
C. Berg, J. P. Reus-Christensen, and P. Ressel (1984). Harmonic Analysis on Semigroups. Springer-Verlag, Berlin and New York.
P. Billingsley (1968). Convergence of Probability Measures. Wiley, New York.
N. J. Cutland (1982). Infinitesimal methods in measure theory, probability theory and stochastic analysis. Bull. Inst. Math. Appl. 18.
N. J. Cutland (1983). Nonstandard measure theory and its applications. Bull. London Math. Soc. 15.
M. Davis (1977). Applied Nonstandard Analysis. Wiley, New York.
J. E. Fenstad (1980). Nonstandard methods in stochastic analysis and mathematical physics. Jber. Deutsch. Math.-Verein. 82.
J. E. Fenstad (1985). Is nonstandard analysis relevant for the philosophy of mathematics? Synthese 62.
P. Greenwood and E. Perkins (1983). A conditional limit theorem for random walk and Brownian local time on square root boundaries. Ann. Probab. 11.
P. Greenwood and E. Perkins (1985). Limit theorems for excursions from a moving boundary. Theory of Prob. and Appl. 29.
L. Gross (1967). Abstract Wiener spaces. Proc. 5th Berkeley Sym. Math. Statist. Probab. 2 (1965). Univ. of California Press, Berkeley and Los Angeles.
J. P. Henry (1969). Prolongements de Mesure de Radon. Ann. Inst. Fourier (Grenoble) 19.
C. W. Henson (1979). Unbounded Loeb measures. Proc. Amer. Math. Soc. 74.
D. N. Hoover (1982). A normal form theorem for $L_{\omega_1 P}$ with applications. J. Symbolic Logic 47.
A. E. Hurd and P. A. Loeb (1985). Nonstandard Real Analysis. Academic Press, New York.
G. Kallianpur (1971). Abstract Wiener processes and their reproducing kernel Hilbert spaces. Z. Wahrsch. Verw. Gebiete 17.
T. Kamae (1982). A simple proof of the ergodic theorem using non-standard analysis. Israel J. Math. 42.
Y. Katznelson and B. Weiss (1982). A simple proof of some ergodic theorems. Israel J. Math. 42.
H. J. Keisler (1977). Hyperfinite model theory. In (R. O. Gandy and J. M. E. Hyland, eds.), Logic Colloquium 1976. North-Holland Publ., Amsterdam.
H. J. Keisler (1984). An infinitesimal approach to stochastic analysis. Mem. Amer. Math. Soc. 297.
J. Kisyński (1969). On the generation of tight measures. Studia Math. 30.
H.-H. Kuo (1975). Gaussian measures in Banach spaces. Springer-Verlag, Berlin and New York.
T. Lindstrøm (1982). A Loeb-measure approach to theorems by Prohorov, Sazonov and Gross. Trans. Amer. Math. Soc. 269.
T. Lindstrøm (1985). Nonstandard analysis and perturbations of the Laplacian along Brownian paths. Proceedings of the First BiBos Symposium. Lecture Notes 1158, Springer-Verlag, Berlin and New York.
T. Lindstrøm (1986). Weak Loeb-space representations (in preparation).
P. A. Loeb (1972). A nonstandard representation of measurable spaces, $L_\infty$, and $L_\infty^*$. In (W. A. J. Luxemburg and A. Robinson, eds.), Contributions to Nonstandard Analysis. North-Holland Publ., Amsterdam.
P. A. Loeb (1975). Conversion from nonstandard to standard measure spaces and applications in probability theory. Trans. Amer. Math. Soc. 211.
P. A. Loeb (1976). Applications of nonstandard analysis to ideal boundaries in potential theory. Israel J. Math. 25.
P. A. Loeb (1979a). Weak limits of measures and the standard part map. Proc. Amer. Math. Soc. 77.
P. A. Loeb (1979b). An introduction to nonstandard analysis and hyperfinite probability theory. In A. T. Bharucha-Reid (ed.), Probabilistic Analysis and Related Topics II. Academic Press, New York.
P. A. Loeb (1983). Measure spaces in nonstandard models underlying standard stochastic processes. Proc. Inter. Congr. Math. Warsaw.
P. A. Loeb (1984). A functional approach to nonstandard measure theory. In (Beals et al., eds.), Conference on Modern Analysis and Probability. Amer. Math. Soc., Providence, Rhode Island.
E. Nelson (1977). Internal set theory: A new approach to nonstandard analysis. Bull. Amer. Math. Soc. 83.
E. Nelson (1985). Radically elementary probability theory (preprint). Math. dept., Princeton Univ.
J. Oikkonen (1985). Harmonic analysis and nonstandard Brownian motion in the plane. Math. Scand. (to appear).
H. Osswald (1983). On hyperfinite integration in reflexive Banach spaces (preprint). Univ. of München.
H. Osswald (1985). On Pettis integrability on Loeb spaces (preprint). Univ. of München.
H. Osswald (1985). Introduction to nonstandard measure theory, I-II. Lecture notes, Univ. of München, 1983-85.
E. Perkins (1981a). A global intrinsic characterization of Brownian local time. Ann. Probab. 9.
E. Perkins (1981b). The exact Hausdorff measure of the level sets of Brownian motion. Z. Wahrsch. Verw. Gebiete 58.
E. Perkins (1982a). Weak invariance principles for local time. Z. Wahrsch. Verw. Gebiete 60.
E. Perkins (1982b). Local time is a semimartingale. Z. Wahrsch. Verw. Gebiete 60.
E. Perkins (1982c). Local time and pathwise uniqueness for stochastic differential equations. Sem. Probab. XVI, Lecture Notes in Math. 920, Springer-Verlag, Berlin and New York.
E. Perkins (1983a). On the Hausdorff dimension of the Brownian slow points. Z. Wahrsch. Verw. Gebiete 64.
E. Perkins (1983b). Stochastic processes and nonstandard analysis. In (A. E. Hurd, ed.), Nonstandard Analysis-Recent Developments. Springer-Verlag, Berlin and New York.
D. Ross (1983). Measurable transformations in saturated models of analysis. Ph.D. thesis, Univ. of Wisconsin, Madison.
D. Ross (1984a). Automorphisms of the Loeb algebra (preprint). Univ. of Iowa.
D. Ross (1984b). Completeness theorem for probability logic with function symbols (preprint). Univ. of Iowa.
L. Schwartz (1972). Radon Measures on Arbitrary Topological Spaces and Cylindrical Measures. Oxford Univ. Press, London and New York.
K. D. Stroyan and J. Bayod (1985). Foundations of Infinitesimal Stochastic Analysis. North-Holland Publ., Amsterdam (to appear).
R. Zivaljevic (1985). Loeb-completion of internal vector-valued measures. Math. Scand. (to appear).
Part II
SELECTED APPLICATIONS
CHAPTER 4
STOCHASTIC ANALYSIS
Following up the ideas from the last chapter, we shall now study one of the most lively and active areas of nonstandard research-the theory of stochastic processes and their applications. In this field hyperfinite structures play a particularly interesting and important role, combining in the same model the combinatorial aspects of the discrete theory and the analytic character of the continuous one. We have already seen an example of this interplay between the continuous and the discrete in the construction of Brownian motion in Section 3.3, and we shall now consider it in greater breadth and detail. As our central theme we shall take stochastic integration with its applications to diffusions, control theory, and multiparameter processes.
4.1. THE HYPERFINITE ITÔ INTEGRAL
We shall begin by giving an informal introduction to Anderson's (1976) construction of the standard Itô integral as a hyperfinite Stieltjes sum. In this first section our aim is simply to give the reader a feeling for the basic ideas of the subject, and we have postponed the more technical proofs to the systematic treatment starting in the next section. But to illustrate the power of the method, we have included a complete and, we think, illuminating proof of Itô's lemma.
A. Stochastic Integration
The fundamental problem of stochastic integration is to give sense to integrals of the form $\int x\,dy$, where $x$ and $y$ are stochastic processes. Everybody's first idea is to let $x_\omega$ and $y_\omega$ denote the functions $x(\omega, \cdot)$ and $y(\omega, \cdot)$, respectively, and consider the Stieltjes integrals
$$z_\omega(t) = \int_0^t x_\omega(s)\, dy_\omega(s). \tag{1}$$
The natural candidate for the stochastic integral is then the process $(\omega, t) \mapsto z_\omega(t)$. However, for (1) to make sense as a Stieltjes integral, the path $y(\omega, \cdot)$ must be of bounded variation, and for many naturally occurring processes this is not the case; e.g., almost all Brownian paths are of unbounded variation. Hence a pathwise Stieltjes approach breaks down.
It was Itô (1944), extending work by Wiener, who discovered a way of defining integrals of the form $\int g\,db$, where $b$ is a Brownian motion and $g$ is a suitable process. We shall sketch Itô's idea below, but first we want to explore a nonstandard approach where indeed a pathwise Stieltjes definition works.
Let $T = \{0, \Delta t, 2\Delta t, \dots, 1\}$ be a hyperfinite time line with $\Delta t$ infinitesimal. Our sample space is a hyperfinite probability space $(\Omega, \mathcal{A}, P)$, where for simplicity we assume that $\mathcal{A}$ is the algebra of all internal subsets of $\Omega$. A hyperfinite stochastic process on $\Omega$ is an internal map $X : \Omega \times T \to M$ into some internal set $M$; note that since we are using the algebra of all internal subsets of $\Omega$, no measurability conditions are needed in this definition. We shall write $\Delta X(\omega, t)$ for the forward increment of $X(\omega, \cdot)$ at $t$, i.e.,
$$\Delta X(\omega, t) = X(\omega, t + \Delta t) - X(\omega, t), \tag{2}$$
and use the following convention for sums
$$\sum_{r=s}^{t} X(\omega, r) = X(\omega, s) + \cdots + X(\omega, t - \Delta t); \tag{3}$$
hence $X(\omega, t)$ is not included in the sum $\sum_{r=s}^{t} X(\omega, r)$. The nonstandard stochastic integral is just a Stieltjes sum:
4.1.1. DEFINITION. Let $X, Y : \Omega \times T \to {}^*\mathbb{R}$ be two hyperfinite processes. The stochastic integral of $X$ with respect to $Y$ is the process $\int X\,dY$ defined by
$$\Bigl(\int X\,dY\Bigr)(\omega, t) = \sum_{s=0}^{t} X(\omega, s)\,\Delta Y(\omega, s). \tag{4}$$
We shall write $\int_0^t X\,dY$ for the random variable $(\int X\,dY)(\cdot, t)$.
The stochastic integral is well defined for all hyperfinite processes $X$ and $Y$, but, of course, in this generality it may have strange properties and no standard part. One of our first tasks will be to single out classes of processes $X$ and $Y$ for which the integral is well behaved, and we shall devote the next section to this problem. For the present, we shall be satisfied with getting a reasonably good grasp of the basic ideas. To understand what is going on, let us take a look at a simple example.

4.1.2. EXAMPLE. We are back in the setting of Section 3.3; $\Omega$ is the set of all internal functions $\omega: T \to \{-1, 1\}$, and $\chi: \Omega \times T \to {}^*\mathbb{R}$ is Anderson's random walk, $\chi(\omega, t) = \sum_{s=0}^{t} \omega(s)\sqrt{\Delta t}$. Let $X(\omega, s) = \omega(s)$ and consider the stochastic integral $\int X\,d\chi$. We have

\[ \int_0^t X\,d\chi(\omega) = \sum_{s=0}^{t} \omega(s)\,\omega(s)\sqrt{\Delta t} = \sum_{s=0}^{t} \sqrt{\Delta t} = \frac{t}{\sqrt{\Delta t}}, \]

which is infinite for all noninfinitesimal $t$. Hence the integral of the finite function $X$ with respect to the "Brownian motion" $\chi$ is infinite.
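The computation in Example 4.1.2 can be imitated with a finite time line. The following is only a sketch under stated assumptions: the finite step `dt = 1/n**2` stands in for the infinitesimal $\Delta t$, and the names `stieltjes_sum`, `n`, `h`, and `omega` are illustrative and not taken from the text.

```python
from fractions import Fraction

def stieltjes_sum(integrand, increments):
    """Stieltjes sum of X(s) * dY(s) over the time steps, as in Definition 4.1.1."""
    return sum(x * dy for x, dy in zip(integrand, increments))

n = 4                       # assumption: sqrt(dt) = 1/n, so dt = 1/n**2
h = Fraction(1, n)          # size of one increment of the walk
steps = n * n               # number of time steps in [0, 1)

omega = [1, -1, -1, 1] * (steps // 4)      # one arbitrary path of signs
d_chi = [w * h for w in omega]             # increments of the walk chi
value = stieltjes_sum(omega, d_chi)        # anticipating integrand X(omega, s) = omega(s)

print(value, 1 / h)         # both equal t / sqrt(dt) with t = 1
```

Since every product $\omega(s)\omega(s)$ equals one, the result does not depend on the chosen path, and it grows without bound as `n` increases.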
It is easy to see what went wrong in Example 4.1.2: the increments of a Brownian motion are much larger than one would expect of a finite function, and what keeps the Brownian path finite is the delicate balance between the positive and negative contributions. The integrand $\omega(s)$ upsets this balance by making all increments positive. What we learn from this is that we cannot allow integrands which anticipate the behavior of $\chi$ if we want to keep the integral finite. Let us make this notion of nonanticipation precise. If $\omega \in \Omega$ and $t \in T$, let

\[ \omega \restriction t = \langle \omega(s) \mid s < t \rangle \qquad (5) \]

(we are still working with the $\Omega$ of Example 4.1.2). A hyperfinite process $X: \Omega \times T \to {}^*\mathbb{R}$ is nonanticipating if $X(\omega, t) = X(\omega', t)$ whenever $\omega \restriction t = \omega' \restriction t$. To show that we are headed in the right direction, we shall prove the following result, where $\Lambda$ denotes the uniform probability measure on $T$:

4.1.3. PROPOSITION. Assume that $X$ is a nonanticipating process which is square S-integrable with respect to $P \times \Lambda$. Then for all $t$, the stochastic integral $\int_0^t X\,d\chi$ is finite a.e.
PROOF. By simple algebra

\[ E\Big(\Big(\int_0^t X\,d\chi\Big)^2\Big) = E\Big(\sum_{s=0}^{t} X(s)^2\,\Delta\chi(s)^2\Big) + 2\sum_{r<s} E\big(X(s)X(r)\,\Delta\chi(s)\,\Delta\chi(r)\big). \qquad (6) \]

Here the last term is zero since $\Delta\chi(s)$ is plus or minus $\sqrt{\Delta t}$ [no matter what $X(s)$, $X(r)$, and $\Delta\chi(r)$ are] with probability one-half. Since $\Delta\chi(s)^2 = \Delta t$, we end up with

\[ E\Big(\Big(\int_0^t X\,d\chi\Big)^2\Big) = E\Big(\sum_{s=0}^{t} X(s)^2\,\Delta t\Big) = \int_{\Omega \times [0,t]} X^2\,d(P \times \Lambda), \qquad (7) \]

and the proposition follows.

This result is the first indication that we get a reasonable theory for stochastic integration if we restrict our integrands to the class of nonanticipating processes. Without delving deeper into this theory here, we mention that in the next section we shall prove that the integral $\int X\,d\chi$ above is S-continuous a.e. and hence induces a nice standard process.

Before turning to the standard theory of stochastic integration, we shall give an alternative description of nonanticipating processes which is less intuitive, but easier to generalize. For each $t \in T$, let $\mathcal{A}_t$ be the internal algebra on $\Omega$ generated by the sets

\[ [\omega]_t = \{\omega' \in \Omega \mid \omega' \restriction t = \omega \restriction t\}. \]

It is easy to see that a process $X$ is nonanticipating if and only if $X(t)$ is $\mathcal{A}_t$-measurable for each $t$. We call the tuple $(\Omega, \{\mathcal{A}_t\}_{t \in T}, P)$ an internal filtration.

Let us now turn to the standard theory of stochastic integration with respect to the standard part $b$ of $\chi$. The standard notion of nonanticipation is based on a filtration $(\Omega, \{\mathcal{B}_t\}_{t \in [0,1]}, L(P))$ generated by the internal filtration $(\Omega, \{\mathcal{A}_t\}, P)$ above. If $\mathcal{N}$ is the class of null sets with respect to $L(P)$, we define for each $t \in [0,1]$,

\[ \mathcal{B}_t = \sigma\Big(\bigcup_{s \approx t} L(\mathcal{A}_s) \cup \mathcal{N}\Big), \]

where the $\sigma$ means that we take the $\sigma$-algebra generated by $\bigcup_{s \approx t} L(\mathcal{A}_s) \cup \mathcal{N}$. Basically, $\mathcal{B}_t$ classifies the events that only depend on what happens up to and including the monad of $t$; the null sets $\mathcal{N}$ are added just for technical convenience.
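The identity (7) in the proof of Proposition 4.1.3 is purely algebraic, so it can be checked exactly on a small finite model. The sketch below is an illustration under assumptions, not the hyperfinite construction itself: it enumerates all sign paths for four steps with `h = 1/2` playing the role of $\sqrt{\Delta t}$, and uses the nonanticipating integrand $X = \chi$; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def walk(omega, h):
    """Partial sums chi(t_k) = sum of omega(j) * h over j < k, for k = 0..len(omega)."""
    path = [Fraction(0)]
    for w in omega:
        path.append(path[-1] + w * h)
    return path

def isometry_sides(steps, h, integrand):
    """Return (E((int X dchi)^2), E(sum X^2 dt)) under the uniform measure on sign paths."""
    dt = h * h
    lhs = rhs = Fraction(0)
    paths = list(product((-1, 1), repeat=steps))
    for omega in paths:
        chi = walk(omega, h)
        # X(t_k) may use only omega(j) for j < k: this is nonanticipation
        X = [integrand(omega[:k], chi[k]) for k in range(steps)]
        integral = sum(X[k] * omega[k] * h for k in range(steps))
        lhs += integral ** 2
        rhs += sum(x * x for x in X) * dt
    return lhs / len(paths), rhs / len(paths)

h = Fraction(1, 2)          # assumption: dt = 1/4, four time steps
second_moment, energy = isometry_sides(4, h, lambda past, chi_k: chi_k)
print(second_moment, energy)
```

The helper passes only the past of the path to `integrand`, so an anticipating choice such as the one in Example 4.1.2 cannot be expressed with it; that restriction is exactly what makes the cross terms vanish.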
Recall that a standard process $x: \Omega \times [0,1] \to \mathbb{R}$ is measurable if it is measurable with respect to the completed product of the Loeb measure on $\Omega$ and the Lebesgue measure on $[0,1]$. We say that $x$ is adapted to the filtration $(\Omega, \{\mathcal{B}_t\}, L(P))$ if it is measurable and if $x(\cdot, t)$ is $\mathcal{B}_t$-measurable for each $t \in [0,1]$. Adapted is clearly a standard counterpart of the nonstandard notion of nonanticipating. When it is clear which filtration $(\Omega, \{\mathcal{B}_t\}, L(P))$ we have in mind, we suppress the explicit mention of it, and only refer to $x$ as an adapted process.

In view of Proposition 4.1.3, it is natural to expect that standard stochastic integrals $\int x\,db$ will be defined for all adapted processes $x$ which are square integrable with respect to $L(P) \times m$ (where $m$ is the Lebesgue measure). We first define the integral when $x$ is an adapted step function, i.e., when there is a partition $0 = t_0 < t_1 < \cdots < t_k = 1$ such that $x(\omega, s) = x(\omega, t_i)$ whenever $s \in [t_i, t_{i+1})$, and $x(\cdot, t_i)$ is bounded and $\mathcal{B}_{t_i}$-measurable for each $t_i$. Let

\[ \int_0^1 x(\omega, s)\,db(\omega, s) = \sum_{i=0}^{k-1} x(\omega, t_i)\big(b(\omega, t_{i+1}) - b(\omega, t_i)\big). \]

Itô's observation was that since

\[ E\Big(\Big(\int_0^1 x\,db\Big)^2\Big) = E\Big(\sum_{i=0}^{k-1} x(t_i)^2\,\Delta b(t_i)^2\Big) + 2\sum_{i<j} E\big(x(t_i)x(t_j)\,\Delta b(t_i)\,\Delta b(t_j)\big) = E\Big(\int_0^1 x(t)^2\,dt\Big), \]

where $\Delta b(t_j) = b(t_{j+1}) - b(t_j)$, the map $x \mapsto \int_0^1 x\,db$ is norm-preserving from $L^2(L(P) \times m)$ to $L^2(L(P))$. Since the adapted step functions are dense in the set of adapted processes in $L^2(L(P) \times m)$, we can extend $x \mapsto \int x\,db$ to an isometry which we shall also denote by $\int x\,db$. If $1_{[0,t]}$ is the indicator function of the interval $[0, t]$, the stochastic integral is the process

\[ \int_0^t x\,db = \int 1_{[0,t]}\,x\,db. \qquad (9) \]

Note that since $\int x\,db$ is defined as an element in $L^2(L(P))$, the stochastic integral is only determined up to equivalence.

B. Liftings
We have described two stochastic integrals: Anderson's hyperfinite Stieltjes sum (4) and Itô's classical integral (9). As in Section 3.2, the standard and nonstandard theories are connected through the notion of a lifting. Recall that $\Lambda$ is the normalized counting measure on $T$.

4.1.4. DEFINITION. Let $x: \Omega \times [0,1] \to \mathbb{R}$; a hyperfinite process $X: \Omega \times T \to {}^*\mathbb{R}$ is a lifting of $x$ (with respect to $P \times \Lambda$) if

\[ {}^\circ X(\omega, t) = x(\omega, {}^\circ t) \qquad (10) \]

almost surely in $L(P \times \Lambda)$.

Notice how this definition combines the two parts of Definition 3.2.3. Call a process $x$ almost surely adapted if there is an adapted process $y$ such that $x(\omega, s) = y(\omega, s)$ for almost all $(\omega, s)$. Anderson (1976) proved the "only if" and Keisler (1984) the "if" part of the following theorem.

4.1.5. THEOREM. A stochastic process $x: \Omega \times [0,1] \to \mathbb{R}$ is almost surely adapted if and only if it has a nonanticipating lifting.
This result is a consequence of the lifting Theorem 3.2.4 for random variables; all that needs to be checked is that the classes of almost surely adapted and nonanticipating sets fit together correctly. This is slightly technical and not very surprising, and we postpone the proof until the systematic treatment of liftings in Section 4.3. It follows easily from 4.1.5 that if $x$ is square-integrable with respect to $L(P) \times m$, then we may choose the lifting to be square S-integrable with respect to $P \times \Lambda$. The next theorem establishes the relationship between Itô integration and internal Stieltjes integration (Anderson, 1976).

4.1.6. THEOREM. Let $x$ be a square-integrable, adapted process, and $X$ a nonanticipating, square S-integrable lifting of $x$. Then for all $t \in T$

\[ \int_0^{{}^\circ t} x(\omega, s)\,db(\omega, s) = {}^\circ\!\int_0^t X(\omega, s)\,d\chi(\omega, s) \qquad (11) \]

for $L(P)$-a.a. $\omega$.

In this case also the proof is quite simple; we first establish (11) when $x$ is an adapted step function, and then extend to the general case by a limit argument, using the definition of the Itô integral. The full details will be given in Section 4.4.

A consequence of 4.1.6 is that if $X_1$ and $X_2$ are two nonanticipating, square S-integrable liftings of $x$, then $\int X_1\,d\chi \approx \int X_2\,d\chi$ almost surely. To give a little more of the flavor of the theory, we shall give a proof of this fact from scratch. Let $X = X_1 - X_2$; then $\int X\,d\chi = \int X_1\,d\chi - \int X_2\,d\chi$ and by (7)

\[ E\Big(\Big(\int_0^t X\,d\chi\Big)^2\Big) = \int_{\Omega \times [0,t]} X^2\,d(P \times \Lambda) \le \int (X_1 - X_2)^2\,d(P \times \Lambda) \approx 0. \]

Hence $\int X_1\,d\chi \approx \int X_2\,d\chi$ a.e.
C. Itô's Lemma

This is a result of great importance in stochastic analysis. We shall give a nonstandard proof which in a simple way converts the usual heuristics into a precise argument. Let $\varphi: \mathbb{R}^n \times [0,1] \to \mathbb{R}$. We shall use $\nabla$ and $\Delta$ for space derivatives and $\partial_t$ for the time derivative. If $b$ is an $n$-dimensional Brownian motion [i.e., $b(\omega, t)$ is a vector $(b_1(\omega, t), \ldots, b_n(\omega, t))$ where $b_1, \ldots, b_n$ are independent, one-dimensional Brownian motions], then Itô's lemma in its differential form asserts that

\[ d\varphi(b(t)) = \nabla\varphi(b(t))\,db + \big(\tfrac{1}{2}\Delta\varphi + \partial_t\varphi\big)(b(t))\,dt, \qquad (12) \]

which is just a convenient abbreviation for

\[ \varphi(b(t)) - \varphi(b(0)) = \int_0^t \nabla\varphi(b(s))\,db(s) + \int_0^t \big(\tfrac{1}{2}\Delta\varphi + \partial_t\varphi\big)(b(s))\,ds. \qquad (13) \]
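Itô's lemma is the fundamental theorem of stochastic calculus, and it is particularly useful for calculating with functions $\varphi$ such that $\partial_t\varphi = -\tfrac{1}{2}\Delta\varphi$; e.g., see Simon (1979) for a proof of the Feynman-Kac formula using this technique. There are more general versions of the result than (13), and we refer the reader to Anderson (1976), Lindstrøm (1980a), and Section 4.4 for more extensive nonstandard treatments.

For simplicity we shall write down the proof only in the one-dimensional case. Assume that $\varphi: \mathbb{R} \times [0,1] \to \mathbb{R}$ has continuous first and second derivatives; it suffices to show that

\[ \varphi(\chi_t, t) - \varphi(\chi_0, 0) \approx \int_0^t \nabla\varphi(\chi_s, s)\,d\chi(s) + \int_0^t \big(\tfrac{1}{2}\Delta\varphi + \partial_t\varphi\big)(\chi_s, s)\,ds, \qquad (14) \]

since $\nabla\varphi(\chi_s, s)$ is a lifting of $\nabla\varphi(b_s, s)$. The left-hand side of (14) can be rewritten as a hyperfinite sum

\[ \varphi(\chi_t, t) - \varphi(\chi_0, 0) = \sum_{s=0}^{t} \big[\varphi(\chi_{s+\Delta t}, s + \Delta t) - \varphi(\chi_s, s)\big] \]
\[ = \sum_{s=0}^{t} \nabla\varphi(\chi_s, s)\,\Delta\chi(s) + \sum_{s=0}^{t} \big[\varphi(\chi_{s+\Delta t}, s) - \varphi(\chi_s, s) - \nabla\varphi(\chi_s, s)\,\Delta\chi(s)\big] + \sum_{s=0}^{t} \big[\varphi(\chi_{s+\Delta t}, s + \Delta t) - \varphi(\chi_{s+\Delta t}, s)\big]. \]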
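This is a simple algebraic reformulation suggested by Taylor's formula. By definition, the first sum is the correct hyperfinite stochastic integral. To treat the middle sum note that

\[ \big|\varphi(x) - \varphi(y) - \nabla\varphi(y)(x - y) - \tfrac{1}{2}\Delta\varphi(y)(x - y)^2\big| \le C|x - y|^3 \]

for a finite constant $C$. Since $\sum_{s=0}^{t} |\Delta\chi(s)|^3 = t\sqrt{\Delta t} \approx 0$, the second sum can be replaced up to $\approx$ by

\[ \sum_{s=0}^{t} \tfrac{1}{2}\Delta\varphi(\chi_s, s)\,\Delta\chi(s)^2. \]

But, and here is the point where the usual heuristics becomes exact, $\Delta\chi(s)^2 = \Delta t$. Hence the middle sum can be replaced by

\[ \int_0^t \tfrac{1}{2}\Delta\varphi(\chi(s), s)\,ds. \]

To handle the last term we have the inequality

\[ |\varphi(x, t) - \varphi(x, s) - \partial_t\varphi(x, s)(t - s)| \le C(t - s)^2, \]

which tells us that up to $\approx$ the last sum can be replaced by

\[ \int_0^t \partial_t\varphi(\chi(s + \Delta t), s)\,ds \approx \int_0^t \partial_t\varphi(\chi(s), s)\,ds, \]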
and the proof of (14) is complete; no further limit arguments have to be made.

We have purposely modeled our proof of Itô's lemma on the standard argument in Simon (1979) to make it easier for the reader to compare the two approaches. Two aspects of the nonstandard proof are particularly noteworthy; one is the advantage of the pathwise definition of the stochastic integral, the other is the exact relation $\Delta\chi(t)^2 = \Delta t$, where $\Delta t$ is infinitesimal.

Our aim so far has been to introduce the most important ideas of hyperfinite stochastic calculus with a minimum of technical details. In the sections to come we will be taking a closer look at the themes that this informal account has suggested, according to the following plan. The next section develops a theory for hyperfinite stochastic integrals $\int X\,dM$, where $X$ is nonanticipating and $M$ is a square-integrable martingale; we will be proving regularity properties for the paths of such integrals and will take a look at an important criterion for S-continuity. In Section 4.3, we turn to lifting theorems and establish not only a generalized version of Theorem 4.1.5, but also other lifting results which will be important in later applications. We then study the relationship between standard and nonstandard stochastic integration; Section 4.4 contains a more general version of 4.1.6 plus a discussion of other aspects of this question. Applications to stochastic differential equations and stochastic control theory are the themes of Sections 4.5 and 4.6; here the ideas used in the proof of Itô's lemma are more fully exploited. The last two sections of this chapter extend the theory in different directions; in 4.7 we treat stochastic integration in infinite-dimensional spaces with applications to stochastic partial differential equations, and in 4.8 we consider a generalization of Brownian motion to several time parameters.
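The role of the exact relation $\Delta\chi(s)^2 = \Delta t$ in the proof of Itô's lemma can be seen on a finite walk. The sketch below is a hedged illustration, assuming the test function $\varphi(x) = x^3$ (so $\nabla\varphi = 3x^2$, $\tfrac{1}{2}\Delta\varphi = 3x$, $\partial_t\varphi = 0$) and a finite step `dt = 1/n**2`; the function name and the seeded sample path are hypothetical choices made for the example.

```python
from fractions import Fraction
import random

def ito_cubic_terms(omega, h):
    """Return (chi(1), sum of 3*chi^2*dchi, sum of 3*chi*dt) along one path of signs."""
    dt = h * h                     # exact relation: (dchi)^2 = dt
    chi = Fraction(0)
    stochastic = Fraction(0)       # Stieltjes sum of grad(phi) against dchi
    drift = Fraction(0)            # sum of (1/2) * Laplacian(phi) * dt
    for w in omega:
        d_chi = w * h
        stochastic += 3 * chi ** 2 * d_chi
        drift += 3 * chi * dt
        chi += d_chi
    return chi, stochastic, drift

rng = random.Random(0)             # fixed seed, only to pick some sample path
n = 10
h = Fraction(1, n)                 # assumption: sqrt(dt) = 1/n
omega = [rng.choice((-1, 1)) for _ in range(n * n)]
chi_1, stochastic, drift = ito_cubic_terms(omega, h)

# Taylor remainder: the sum of (dchi)^3 = dt * dchi, which telescopes to dt * chi(1)
remainder = chi_1 ** 3 - (stochastic + drift)
print(remainder == h * h * chi_1)
```

For this particular $\varphi$ the remainder is exactly $\Delta t \cdot \chi(1)$, which becomes infinitesimal when $\Delta t$ is infinitesimal and $\chi(1)$ is finite.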
4.2. GENERAL THEORY OF STOCHASTIC INTEGRATION
It is time to begin the systematic development of the ideas we sketched in the last section. Although the hyperfinite integral $\int X\,dY$ is defined for all ${}^*\mathbb{R}$-valued, internal processes $X$ and $Y$, we have seen that certain restrictions on $X$ and $Y$ are necessary to get a decent theory. The appropriate nonanticipation conditions on the integrand $X$ have already been discussed in some detail, but we have so far simply assumed that the integrator process $Y$ is a hyperfinite random walk. In this section we shall allow more general integrators; basically, we shall be working with the class of processes called martingales.

In addition to $X$ being nonanticipating and $Y$ being a martingale, we need to impose certain integrability conditions on the two processes. Here we are given a certain freedom. Originally, standard stochastic integration had a strong $L^2$ flavor, and both $X$ and $Y$ were assumed to be locally square-integrable (see, e.g., Kunita and Watanabe, 1967). However, by using more sophisticated techniques and stronger inequalities, it has been possible to extend the theory to the $L^1$ case; thus $X$ and $Y$ need only be assumed locally integrable [see Meyer (1976) and Métivier and Pellaumail (1980) for expositions]. On the nonstandard side, the $L^2$ and the $L^1$ theory were developed independently and almost simultaneously by Lindstrøm (1980a-c) and Hoover and Perkins (1983a,b). We have decided to restrict ourselves to the less general $L^2$ approach in this book as we feel that the extra technicalities needed for the $L^1$ case tend to blur the very simple ideas underlying the whole theory, and also because the extra generality is not needed for our applications.

A. Internal Martingales

As in the last section we shall be studying hyperfinite stochastic processes $X: \Omega \times T \to {}^*\mathbb{R}$, where $(\Omega, \mathcal{A}, P)$ is a hyperfinite probability space, but we shall allow $T$ to be slightly more general. A hyperfinite time line will be a hyperfinite set

\[ T = \{t_0, t_1, \ldots, t_\xi\}, \]

where $0 = t_0 < t_1 < \cdots < t_\xi = 1$ and $t_{i+1} - t_i \approx 0$ for each $i$. We write $\Delta X(\omega, t_i)$ for $X(\omega, t_{i+1}) - X(\omega, t_i)$ and use the same convention for sums as in (4.1.3); i.e., if $s = t_i$, $t = t_j$, then

\[ \sum_{r=s}^{t} X(\omega, r) = X(\omega, t_i) + X(\omega, t_{i+1}) + \cdots + X(\omega, t_{j-1}); \qquad (1) \]

thus the term $X(\omega, t)$ is not included in the sum. Given two internal processes $X, Y: \Omega \times T \to {}^*\mathbb{R}$, the stochastic integral is defined as before:

\[ \int_0^t X\,dY = \sum_{s=0}^{t} X(s)\,\Delta Y(s). \qquad (2) \]
To define martingales and nonanticipating processes, we must first introduce the notion of a filtration.

4.2.1. DEFINITION. Let $T$ be a hyperfinite time line and $(\Omega, \mathcal{A}, P)$ a hyperfinite probability space. An internal filtration on $\Omega$ indexed by $T$ is a tuple $(\Omega, \{\mathcal{A}_t\}_{t \in T}, P)$, where $\{\mathcal{A}_t\}_{t \in T}$ is an increasing internal sequence of internal algebras on $\Omega$.

Since we are always assuming that $\mathcal{A}$ is the internal power set of $\Omega$, all the $\mathcal{A}_t$'s are automatically subalgebras of $\mathcal{A}$. A natural example of an internal filtration was given in Section 4.1; it gives substance to the assertion that $\mathcal{A}_t$ is supposed to represent the information we have about the stochastic system at time $t$.

4.2.2. DEFINITION. An internal process $X: \Omega \times T \to {}^*\mathbb{R}$ is nonanticipating with respect to the filtration $(\Omega, \{\mathcal{A}_t\}_{t \in T}, P)$ if $\omega \mapsto X(\omega, t)$ is $\mathcal{A}_t$-measurable for all $t \in T$.

There is another way of describing nonanticipating processes that is both more intuitive and closer to the definition we gave for a special case in the previous section. For each $t \in T$, introduce an equivalence relation $\sim_t$ on $\Omega$ by

\[ \omega \sim_t \omega' \quad \text{iff} \quad \forall A \in \mathcal{A}_t\,(\omega \in A \Leftrightarrow \omega' \in A). \qquad (3) \]

It is easy to see that $X$ is nonanticipating if and only if $X(\omega, t) = X(\omega', t)$ whenever $\omega \sim_t \omega'$.

4.2.3. DEFINITION. An internal process $M: \Omega \times T \to {}^*\mathbb{R}$ is a martingale with respect to the filtration $(\Omega, \{\mathcal{A}_t\}_{t \in T}, P)$ if it is nonanticipating and if for all $s, t \in T$, $s < t$, and all $A \in \mathcal{A}_s$,

\[ E(1_A(M_t - M_s)) = 0. \qquad (4) \]
If we replace the equality in (4) by the inequality $E(1_A(M_t - M_s)) \ge 0$, then $M$ is called a submartingale, and if we replace (4) by the opposite inequality $E(1_A(M_t - M_s)) \le 0$, then $M$ is called a supermartingale. For the final word on this much discussed and somewhat confusing piece of terminology, see Doob (1984, p. 808).

The easiest way to figure out what Definition 4.2.3 really says is to use the equivalence relation (3). If $[\omega]_t$ is the equivalence class of $\omega$, then a nonanticipating process $M$ is a martingale if and only if

\[ E(\Delta M(t) \mid [\omega]_t) = 0 \qquad (5) \]

for all $\omega$ and $t$. Using this characterization, it is trivial to see that if $X$ is nonanticipating and $M$ is a martingale, then $\int X\,dM$ is also a martingale.
The sample paths of a martingale usually oscillate wildly; just think of the most studied of all martingales, the Brownian motion, whose paths are not only of unbounded variation, but even nowhere differentiable. To tame such extremely irregular behavior, we introduce an associated increasing process called the quadratic variation.

4.2.4. DEFINITION. By the quadratic variation of the process $X: \Omega \times T \to {}^*\mathbb{R}$ we mean the process $[X]: \Omega \times T \to {}^*\mathbb{R}$ defined by

\[ [X](\omega, t) = \sum_{s=0}^{t} \Delta X(\omega, s)^2. \]

A striking example of how much simpler the quadratic variation is than the original process is provided by Anderson's random walk $\chi$:

\[ [\chi](t) = \sum_{s=0}^{t} \Delta\chi(s)^2 = \sum_{s=0}^{t} \Delta t = t. \qquad (6) \]
In martingale theory the simplicity of the quadratic variation is exploited systematically by translating problems about martingales into much simpler problems about their quadratic variations. We shall see several examples of this strategy in this and later sections. An important tool in some of these proofs is the following simple identity:

4.2.5. LEMMA. For all hyperfinite processes $X: \Omega \times T \to {}^*\mathbb{R}$,

\[ [X](t) = X(t)^2 - X(0)^2 - 2\int_0^t X\,dX. \]

PROOF. By elementary algebra

\[ \Delta X(t_i)^2 = (X(t_{i+1}) - X(t_i))^2 = X(t_{i+1})^2 - X(t_i)^2 - 2X(t_i)\,\Delta X(t_i). \]

Summing over all $t_i < t$, the lemma follows.
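Lemma 4.2.5 is a pathwise identity that holds for any finite sequence of values, so it can be tested directly. The following sketch uses hypothetical helper names and arbitrary sample values; the second part assumes a finite walk with increments of size `1/4` in place of $\sqrt{\Delta t}$ to illustrate (6).

```python
from fractions import Fraction

def quadratic_variation(X):
    """Sum of squared forward increments of the path X."""
    return sum((X[i + 1] - X[i]) ** 2 for i in range(len(X) - 1))

def self_integral(X):
    """Stieltjes sum of X(t_i) * dX(t_i), i.e. the integral of X with respect to X."""
    return sum(X[i] * (X[i + 1] - X[i]) for i in range(len(X) - 1))

# an arbitrary path: the identity does not need any probabilistic structure
X = [Fraction(v) for v in (2, 5, -1, 0, 7, 7, 3)]
qv = quadratic_variation(X)
rhs = X[-1] ** 2 - X[0] ** 2 - 2 * self_integral(X)
print(qv, rhs)

# a walk with 16 increments of size 1/4 (dt = 1/16): quadratic variation equals t = 1
h = Fraction(1, 4)
chi = [Fraction(0)]
for w in [1, -1, -1, 1] * 4:
    chi.append(chi[-1] + w * h)
print(quadratic_variation(chi))
```

The second computation does not depend on the signs chosen, which is the content of formula (6) for Anderson's walk.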
If we apply the formula we just proved to a martingale $M$, we get

\[ E(M_t^2) = E(M_0^2 + [M](t)) \qquad (7) \]

since $\int M\,dM$ is a martingale starting at 0 and thus has zero expectation. By connecting the expectations of $M^2$ and the quadratic variation $[M]$, this formula provides much of the explanation of why the theory of square-integrable martingales is particularly well behaved.

4.2.6. DEFINITION. A hyperfinite martingale $M$ is called a $\lambda^2$-martingale if ${}^\circ E(M_t^2) < \infty$ for all $t \in T$.

Note that since $[M](t)$ is an increasing process, (7) tells us that $E(M_t^2)$ is also increasing. To show that $M$ is a $\lambda^2$-martingale it thus suffices to check that $E(M_1^2)$ is finite.

Definition 4.2.6 introduces a condition on the size of $M$. Often we can reduce a martingale to a $\lambda^2$-martingale by stopping it before it grows too big. An internal stopping time adapted to the filtration $(\Omega, \{\mathcal{A}_t\}, P)$ is an internal mapping $\tau: \Omega \to T$ such that for all $t \in T$, the set $\{\omega \mid \tau(\omega) \le t\}$ belongs to $\mathcal{A}_t$. In terms of the equivalence relation $\sim_t$ in (3), this just means that if $\tau(\omega) = t$, then $\tau(\omega') = t$ for all $\omega' \sim_t \omega$. The important observation to make is that if $\tau$ is a stopping time and $M$ is a martingale, then the stopped process $M_\tau$ defined by

\[ M_\tau(\omega, t) = M(\omega, t \wedge \tau(\omega)) \]

is also a martingale.

4.2.7. DEFINITION. An internal martingale $M$ is a local $\lambda^2$-martingale if there is an increasing sequence $\{\tau_n\}_{n \in \mathbb{N}}$ of internal stopping times such that each $M_{\tau_n}$ is a $\lambda^2$-martingale, and such that for almost all $\omega$, $\tau_n(\omega) = 1$ for some $n \in \mathbb{N}$. The sequence $\{\tau_n\}_{n \in \mathbb{N}}$ is called a localizing sequence for $M$.
It is the local $\lambda^2$-martingales we are going to allow as integrators in our stochastic integrals. The difference between $\lambda^2$ and local $\lambda^2$ theory is really of little importance as most results in the latter follow immediately from corresponding results in the former, but the extra freedom of localization is handy to have around in applications.

Before we turn to stochastic integration, we would like to show that the local $\lambda^2$-martingales are processes with nice standard parts. One thing is clear; if we fix $t \in T$, then $M_t$ is finite almost everywhere. But a problem is that the exceptional set may differ from one $t$ to another, and hence there might conceivably be no $\omega$ such that $M(\omega, t)$ is finite for all $t$. What we would like to know is that $\max_{t \in T} |M(\omega, t)|$ is finite for almost all $\omega$. To answer "uniform" questions of this type, we have an extremely useful inequality due to Doob (1953).

4.2.8. DOOB'S INEQUALITY. If $X: \Omega \times T \to {}^*\mathbb{R}$ is a positive submartingale, then for all $p > 1$ and all $t \in T$

\[ \Big\| \sup_{s \le t} X_s \Big\|_p \le \frac{p}{p - 1}\,\| X_t \|_p, \]

where $\|\cdot\|_p$ denotes the $L^p$ norm.

We postpone the proof of this inequality until we have completed our analysis of the paths of $M$. To apply 4.2.8 to the problem we are working on, observe that $|M|$ is a positive submartingale. Letting $p = 2$, we get

\[ E\Big(\sup_{s \le t} M_s^2\Big) \le 4E(M_t^2). \qquad (8) \]
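Inequality (8) can be checked exactly for a small finite random walk by enumerating all paths. This is only a sketch under stated assumptions: the walk has six steps of size `1/2`, the uniform measure on sign sequences stands in for $P$, and `doob_l2_sides` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

def doob_l2_sides(steps, h):
    """Return (E(max over s <= t of M_s^2), 4 * E(M_t^2)) for the walk M at the final time t."""
    paths = list(product((-1, 1), repeat=steps))
    lhs = rhs = Fraction(0)
    for omega in paths:
        m = Fraction(0)
        running_max_sq = Fraction(0)      # starts at M_0^2 = 0
        for w in omega:
            m += w * h
            running_max_sq = max(running_max_sq, m * m)
        lhs += running_max_sq
        rhs += 4 * m * m
    return lhs / len(paths), rhs / len(paths)

max_sq, bound = doob_l2_sides(6, Fraction(1, 2))
print(max_sq, bound, max_sq <= bound)
```

The right-hand side is $4E(M_t^2) = 4 \cdot 6 \cdot \tfrac{1}{4} = 6$ here, while the left-hand side lies between $E(M_t^2)$ and that bound.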
If $M$ is a $\lambda^2$-martingale, the expectation on the right is finite, and hence there is a set of measure one where $M_s$ is finite for all $s$. Using a localizing sequence, we see that this must also hold for local $\lambda^2$-martingales.

But the fact that $M$ is finite a.e. is not sufficient to guarantee that it has a decent standard part; its paths may still be jumping infinitely often between different finite values. We would like $M$ to have one-sided limits in the following sense:

4.2.9. DEFINITION. Let $f: T \to {}^*\mathbb{R}$ be internal. We say that $r \in \mathbb{R}$ is the S-right limit of $f$ at $t \in [0,1]$ if for all standard $\varepsilon > 0$, there is a standard $\delta > 0$ such that if $s \in T$ and $t < {}^\circ s < t + \delta$, then $|f(s) - r| < \varepsilon$. We write $r = \text{S-}\lim_{s \downarrow t} f(s)$. The S-left limit, $\text{S-}\lim_{s \uparrow t} f(s)$, is defined analogously.

Note that it is the standard part of $s$ (and not $s$ itself) which should lie between $t$ and $t + \delta$; hence we do not care how $f$ behaves on the upper half-monad of $t$.

Before we can show that local $\lambda^2$-martingales have S-right and S-left limits a.e., we have to know a little more about stopping times. The algebras $\mathcal{A}_t$ in our filtration classify the events which happen before a fixed time $t$; we shall now generalize this notion and introduce algebras $\mathcal{A}_\tau$ classifying events which happen before a stopping time $\tau$. We first extend the class of equivalence relations $\sim_t$ in (3) by letting

\[ \omega \sim_\tau \omega' \quad \text{if and only if} \quad \omega \sim_{\tau(\omega)} \omega' \qquad (9) \]

when $\tau$ is a stopping time. Note that if $\omega \sim_{\tau(\omega)} \omega'$, then $\tau(\omega') = \tau(\omega)$ and hence $\sim_\tau$ is really an equivalence relation. We let $\mathcal{A}_\tau$ be the internal algebra generated by the equivalence classes of $\sim_\tau$.

Assume now that we are given an increasing sequence $\{\tau_n\}$ of internal stopping times, and let $M_{\tau_n}$ be the random variable $M_{\tau_n}(\omega) = M(\omega, \tau_n(\omega))$. It is trivial to check that $(\omega, n) \mapsto M_{\tau_n}(\omega)$ is a martingale with respect to the filtration $(\Omega, \{\mathcal{A}_{\tau_n}\}, P)$, and hence by (7) we have

\[ E(M_{\tau_n}^2) = E\Big(M_0^2 + \sum_{k=0}^{n-1} (M_{\tau_{k+1}} - M_{\tau_k})^2\Big), \qquad (10) \]

where $\tau_0 = 0$.
4.2.10. PROPOSITION. If $M$ is a local $\lambda^2$-martingale, then almost all of $M$'s paths have S-right and S-left limits at each $t \in [0,1]$.

PROOF. Without loss of generality, we may assume that $M$ is a $\lambda^2$-martingale. Since we already know that $M$ is finite almost everywhere, the proposition can only fail if $M$ oscillates too much, i.e., if the set

\[ \bigcup_{a, b \in \mathbb{Q},\ a < b} \{\omega \mid \text{the path } M(\omega, \cdot) \text{ crosses the interval } [a, b] \text{ infinitely many times}\} \]

has positive probability. Since there are only countably many pairs of rationals, this implies that we can find $a, b \in \mathbb{Q}$, $a < b$, such that $M$ crosses $[a, b]$ infinitely many times with positive probability. Define a sequence $\{\tau_k\}$ of stopping times as follows. Let $\tau_0 = 0$ and for $k$ odd define

\[ \tau_k(\omega) = \inf\{t > \tau_{k-1}(\omega) \mid M(\omega, t) \le a\} \wedge 1. \]

Similarly, if $k$ is even, we let

\[ \tau_k(\omega) = \inf\{t > \tau_{k-1}(\omega) \mid M(\omega, t) \ge b\} \wedge 1. \]

Since the sequence $\{\tau_k\}$ is strictly increasing until it reaches one, we see that if $\gamma$ is the number of elements in the time line, then $\tau_\gamma$ is identically one. By (10), we thus have

\[ E(M_1^2) = E\Big(M_0^2 + \sum_{k=0}^{\gamma - 1} (M_{\tau_{k+1}} - M_{\tau_k})^2\Big). \]

The left-hand side of this equation is finite by assumption, while the sum on the right must be infinite on a set of positive measure. Hence we have a contradiction and the proposition is proved.

Hyperfinite processes with one-sided S-limits can be turned into standard processes as follows.
4.2.11. DEFINITION. Let $X: \Omega \times T \to {}^*\mathbb{R}$ be a hyperfinite process with S-left and S-right limits a.e. The standard process ${}^\circ X^+: \Omega \times [0,1] \to \mathbb{R}$ defined by

\[ {}^\circ X^+(\omega, t) = \text{S-}\lim_{s \downarrow t} X(\omega, s) \]

is called the right standard part of $X$. The process

\[ {}^\circ X^-(\omega, t) = \text{S-}\lim_{s \uparrow t} X(\omega, s) \]

is called the left standard part of $X$.

Of these two standard part processes, ${}^\circ X^+$ is by far the most important one and we shall often refer to it simply as the standard part of $X$. Indeed, Hoover and Perkins (1983a) and Stroyan and Bayod (1985) observed that ${}^\circ X^+$ can be considered as the standard part of $X$ in the Skorohod topology [see, e.g., Billingsley (1968)] on the space $D$ of right continuous functions with left limits. For these reasons most authors denote ${}^\circ X^+$ simply by ${}^\circ X$, but with the notation we have been using, this would make it impossible to distinguish between ${}^\circ X(t)$ (the standard part of the value of $X$ at an internal time $t$) and ${}^\circ X^+(t)$ (the value of ${}^\circ X^+$ at the standard time $t$).

Having shown that the local $\lambda^2$-martingales have reasonable path properties, we are ready to discuss stochastic integration. However, we still have the postponed proof of Doob's inequality to attend to, and this seems as good a time as any. First a lemma:
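4.2.12. LEMMA. Let $U$ and $V$ be two internal, positive maps from $\Omega$ to ${}^*\mathbb{R}$. Assume that $p, \alpha \in {}^*\mathbb{R}$ are such that $p > \alpha$, $p > 1$, $\alpha > 0$, and that for all positive $\xi \in {}^*\mathbb{R}$

\[ \xi^\alpha P[U > \xi] \le \int_{\{U > \xi\}} V^\alpha\,dP. \qquad (11) \]

Then

\[ E(U^p) \le \Big(\frac{p}{p - \alpha}\Big)^{p/\alpha} E(V^p). \]

PROOF. Let $\mu$ be the distribution of $U$ (i.e., $\mu(A) = P[U \in A]$). By straightforward calculations

\[ E(U^p) = \int_0^\infty \xi^p\,d\mu(\xi) = p\int_0^\infty \xi^{p-1} P[U > \xi]\,d\xi \le p\int_0^\infty \xi^{p-\alpha-1} \int_{\{U > \xi\}} V^\alpha\,dP\,d\xi, \]

where the last step uses (11). Continuing our calculations, we see that

\[ p\int_0^\infty \xi^{p-\alpha-1} \int_{\{U > \xi\}} V^\alpha\,dP\,d\xi = p\int_\Omega V^\alpha \int_0^U \xi^{p-\alpha-1}\,d\xi\,dP = \frac{p}{p - \alpha}\int_\Omega U^{p-\alpha} V^\alpha\,dP \le \frac{p}{p - \alpha}\,E(U^p)^{1-\alpha/p}\,E(V^p)^{\alpha/p} \]

by Hölder's inequality. Dividing by $E(U^p)^{1-\alpha/p}$ and raising both sides to the $p/\alpha$-th power, the lemma follows.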
To prove Doob's inequality, we apply the lemma with $U = \sup_{s \le t} X_s$, $V = X_t$, and $\alpha = 1$. All we have to do is to check (11). Let $\xi > 0$ and define a stopping time $\tau$ by

\[ \tau(\omega) = \inf\{s \in T \mid X(\omega, s) > \xi\} \wedge 2. \]

Note that since $t < 2$, we have $\{\sup_{s \le t} X_s > \xi\} = \{\tau \le t\}$. Thus

\[ \xi P\Big[\sup_{s \le t} X_s > \xi\Big] = \xi P\{\tau \le t\} \le \int_{\{\tau \le t\}} X_\tau\,dP = \int_{\{\tau \le t\}} (X_\tau - X_t)\,dP + \int_{\{\tau \le t\}} X_t\,dP \le \int_{\{\tau \le t\}} X_t\,dP, \]

where the last step uses the fact that $X$ is a submartingale. Hence

\[ \xi P\Big[\sup_{s \le t} X_s > \xi\Big] \le \int_{\{\sup_{s \le t} X_s > \xi\}} X_t\,dP \]

and Doob's inequality follows.

What we have stated and proved is, of course, a nonstandard version of Doob's result, but we shall not hesitate to use the original, standard version whenever it is convenient.

B. Martingale Integration
In the first section of this chapter we showed that if $\chi$ is Anderson's random walk and $\Lambda$ is the normalized counting measure on $T$, then the integral $\int X\,d\chi$ is well behaved if $X$ is nonanticipating and square S-integrable with respect to $P \times \Lambda$. When we now turn to integration with respect to $\lambda^2$-martingales, the integrability condition on the integrand will no longer refer to the measure $P \times \Lambda$ but to a measure constructed from the martingale we integrate with respect to. If $M$ is a $\lambda^2$-martingale, we let $\nu_M$ be the internal measure on $\Omega \times T$ defined by

\[ \nu_M\{(\omega, t)\} = \Delta M(\omega, t)^2\,P\{\omega\}. \qquad (13) \]

Note that $\nu_M(\Omega \times T) = E([M](1))$ is finite, and that $\nu_\chi = P \times \Lambda$ for Anderson's process $\chi$. Two natural classes of integrands are defined as follows:

4.2.13. DEFINITION. Let $M$ be a $\lambda^2$-martingale. A hyperfinite process $X$ belongs to the class $SL^2(M)$ if it is nonanticipating and square S-integrable with respect to $\nu_M$. If $M$ is a local $\lambda^2$-martingale, then $X$ belongs to $SL(M)$ if it is nonanticipating and in $SL^2(M_{\tau_n})$ for all $\tau_n$ in a localizing sequence for $M$.
The reason why $\nu_M$ is the right measure to use in this definition should become clear from the proof of the next proposition.

4.2.14. PROPOSITION. If $M$ is a $\lambda^2$-martingale and $X \in SL^2(M)$, then $\int X\,dM$ is a $\lambda^2$-martingale. If $M$ is a local $\lambda^2$-martingale and $X$ is in $SL(M)$, then $\int X\,dM$ is a local $\lambda^2$-martingale.

PROOF. The second half of the proposition is an immediate consequence of the first. Thus assume that $M$ is a $\lambda^2$-martingale and $X \in SL^2(M)$. Applying (7) to the martingale $\int X\,dM$, we get

\[ E\Big(\Big(\int_0^t X\,dM\Big)^2\Big) = E\Big(\Big[\int X\,dM\Big](t)\Big) = E\Big(\sum_{s=0}^{t} X^2\,\Delta M^2\Big) = \int_{\Omega \times [0,t]} X^2\,d\nu_M, \]

which is finite by assumption.

Since we already know that local $\lambda^2$-martingales are nearstandard, this proposition tells us that we have obtained a reasonable integration theory. Turning to the deeper aspects of this theory, we first recall that an internal function $f: T \to {}^*\mathbb{R}$ is S-continuous if $f(s) \approx f(t)$ whenever $s \approx t$, and each $f(t)$ is nearstandard. A process $X: \Omega \times T \to {}^*\mathbb{R}$ is S-continuous if almost all its paths are. The result we are aiming at is:

4.2.15. THEOREM. If $M$ is an S-continuous local $\lambda^2$-martingale and $X \in SL(M)$, then $\int X\,dM$ is also S-continuous.

To prove it, we shall use the following characterization of S-continuous martingales:

4.2.16. THEOREM. A local $\lambda^2$-martingale is S-continuous if and only if its quadratic variation is.

This is perhaps the most important example of the interplay between a martingale and its quadratic variation. As a first illustration of its usefulness, we apply it to the hyperfinite random walk $\chi$: since $[\chi](t) = t$ obviously is S-continuous, the theorem tells us that $\chi$ itself is S-continuous! A second illustration is provided by the following proof of Theorem 4.2.15:

PROOF OF 4.2.15. It suffices to prove the theorem when $M$ is a $\lambda^2$-martingale and $X \in SL^2(M)$. Let us first assume that $X$ is bounded by a real number $n$. Since $M$ is S-continuous, we know from 4.2.16 that the quadratic variation $[M]$ is S-continuous, and since

\[ \Big[\int X\,dM\Big](t) - \Big[\int X\,dM\Big](s) = \sum_{r=s}^{t} X^2\,\Delta M^2 \le n^2 \sum_{r=s}^{t} \Delta M^2 = n^2\big([M](t) - [M](s)\big), \]

this implies that $[\int X\,dM]$ is S-continuous. Using 4.2.16 for a second time, we get the S-continuity of $\int X\,dM$.

The idea of the proof is the argument we just gave; to extend the result to general $X \in SL^2(M)$ is just an exercise in measure theory. To carry it out, note that since $X$ is square S-integrable, there is a sequence $\{X_n\}$ of S-bounded functions such that ${}^\circ\!\int (X - X_n)^2\,d\nu_M \to 0$. By Doob's inequality

\[ {}^\circ E\Big(\max_{s \le 1}\Big(\int_0^s X\,dM - \int_0^s X_n\,dM\Big)^2\Big) \le 4\,{}^\circ E\Big(\Big(\int_0^1 (X - X_n)\,dM\Big)^2\Big) = 4\,{}^\circ\!\int (X - X_n)^2\,d\nu_M \to 0. \]

Ordinary measure theory tells us that there is a subsequence $\{{}^\circ\max_{s \le 1}(\int_0^s X\,dM - \int_0^s X_{n_k}\,dM)\}_{k \in \mathbb{N}}$ converging to zero almost everywhere. Since each $\int X_{n_k}\,dM$ is S-continuous, the uniform limit $\int X\,dM$ must also be S-continuous, and the proof is complete.

Observe the important part played by the S-integrability of $X^2$ in this proof; it is not sufficient to assume that $\int X^2\,d\nu_M$ is finite as we need to approximate $X$ by S-bounded functions $X_n$. Indeed, it is almost trivial to find an example which shows that the theorem is false under the weaker condition. This is rather typical of the theory and is the reason why we require the integrands to be square S-integrable despite the fact that a few simple results (such as 4.2.14) only need the finiteness of $\int X^2\,d\nu_M$.

As almost all our applications will be concerned with continuous processes, Theorems 4.2.15 and 4.2.16 are of the utmost importance to us. Each time we have constructed a process we need to check that it is continuous, and 4.2.15 and 4.2.16 are the perfect tools for this task.

The remainder of this section is devoted to the proof of Theorem 4.2.16. To get the necessary estimates, we shall use the following inequalities.
4.2.17. LEMMA. There exist constants C, K E R, such that for all hyperfinite martingales M :R x T + *R with M ( 0 ) = 0,
(14)
) E ( [ M ] ( t ) 2I ) KE(max M ( s ) ~ ) . CE(max M ( s ) ~I S l t
P S f
4.2. GENERAL THEORY OF STOCHASTIC INTEGRATION PROOF
125
By Doob’s inequality and simple algebra E(max M ( s ) ~5 ) ($)4E(M(t)4) S 5 1
=
( (:)4E (
(:)4E
s=o
=
{ ( M ( s )+ A M ( s ) ) ~ -M(S)~}) { ~ M ( sA) M ~ f s ) + 6M(s)’ hM(s)’
s=o
1
+ ~ M ( S ) A M ( S+)A~ M ( s ) ~ } . Using that E ( M ( s ) ~AM(s)) = 0 and /AM(s)/5 2max,,,IM(r)), we get from this that M(s)’[M](t) E(max M ( s ) ~5) ($)4E(6max s--1 3 5 1
+ 8 max M(s)’[M](t) S C l
+ 4 max M ( s)*[ M]( t ) ) (Cf
=
18(:)4E(max M(s)’[M](t)) 4 5
5
1
18($)4E(maxM ( S ) ~ ) ’ / ~ E ( [ M ] ( ~ ) * ) ’ ’ ~ , S C f
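where the last step is Hölder's inequality. Dividing by $E(\max_{s\le t} M(s)^4)^{1/2}$, the first half of (14) follows.

To prove the second half, we first use Lemma 4.2.5 and Hölder's inequality to get
\[
\begin{aligned}
E\bigl([M](t)^2\bigr) &= E\Bigl(\bigl(M_t^2 - 2\int_0^t M\,dM\bigr)^2\Bigr) \\
&= E\Bigl(M_t^4 - 4M_t^2\int_0^t M\,dM + 4\bigl(\int_0^t M\,dM\bigr)^2\Bigr) \\
&\le E\bigl(\max_{s\le t} M_s^4\bigr) + 4E\bigl(\max_{s\le t} M_s^4\bigr)^{1/2} E\Bigl(\bigl(\int_0^t M\,dM\bigr)^2\Bigr)^{1/2} + 4E\Bigl(\bigl(\int_0^t M\,dM\bigr)^2\Bigr),
\end{aligned}
\tag{15}
\]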
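where we have replaced $M_t^2$ by the larger quantity $\max_{s\le t} M_s^2$ in the first two terms. Since $\int M\,dM$ is a martingale,
\[
E\Bigl(\bigl(\int_0^t M\,dM\bigr)^2\Bigr) = E\Bigl(\bigl[\int M\,dM\bigr](t)\Bigr) = E\Bigl(\sum_{s=0}^{t} M(s)^2\Delta M(s)^2\Bigr) \le E\bigl(\max_{s\le t} M(s)^2[M](t)\bigr),
\]
and putting this into (15) and applying Hölder's inequality once more, we get
\[
E\bigl([M](t)^2\bigr) \le E\bigl(\max_{s\le t} M_s^4\bigr) + 4E\bigl(\max_{s\le t} M(s)^4\bigr)^{3/4} E\bigl([M](t)^2\bigr)^{1/4} + 4E\bigl(\max_{s\le t} M(s)^4\bigr)^{1/2} E\bigl([M](t)^2\bigr)^{1/2}.
\tag{16}
\]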
By the part of the lemma we have already proved, $E(\max_{s\le t} M_s^4) \le (1/C)E([M](t)^2)$, and making use of this, we turn (16) into an inequality of the form
\[
E\bigl([M](t)^2\bigr) \le \bigl(C^{-1/2} + 4C^{-1/4} + 4\bigr) E\bigl(\max_{s\le t} M(s)^4\bigr)^{1/2} E\bigl([M](t)^2\bigr)^{1/2}.
\]
Dividing by $E([M](t)^2)^{1/2}$, we have proved the lemma.

The inequalities (14) are very simple special cases of the famous Burkholder-Davis-Gundy inequalities, which appear in the literature in various forms. A fairly general version is that
\[
c_p E\bigl([M](t)^{p/2}\bigr) \le E\bigl(\max_{s\le t}|M(s)|^p\bigr) \le C_p E\bigl([M](t)^{p/2}\bigr)
\]
for all $p \in (1, \infty)$, with constants $c_p, C_p$ depending only on $p$. To prove these results one needs much more sophisticated methods than we used in our proof of 4.2.17; see, e.g., Neveu (1975).

Before we can use our inequalities to prove Theorem 4.2.16, we have one more problem to solve. Assume that we want to stop a martingale before it grows too large. The natural approach is to use a stopping time
\[ \tau_K = \min\{t \in T \mid |M(t)| \ge K\} \]
for some $K \in \mathbb{R}_+$, but since the last increment before $\tau_K$ could be enormous, we might still stop the martingale too late. Let us say that $M$ has infinitesimal increments if $\Delta M(\omega, t) \approx 0$ for all $\omega$ and $t$; if this is the case, clearly ${}^\circ|M_{\tau_K}| \le K$. The next lemma tells us that if one of the processes $M$, $[M]$ is S-continuous, then there is a martingale with infinitesimal increments which is almost identical to $M$.

4.2.18. LEMMA.
Let $M$ be a $\lambda^2$-martingale such that the set $\{\omega \in \Omega \mid \exists t \in T\,({}^\circ\Delta M(\omega, t) \ne 0)\}$ has Loeb measure zero. Then there is a $\lambda^2$-martingale $\tilde M$ with infinitesimal increments such that on a set of Loeb measure one $\tilde M(t) \approx M(t)$ and
\[ [\tilde M](t) \approx [M](t) \]
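for all $t$.

PROOF. For each $n \in {}^*\mathbb{N}$, let $\Omega_n = \{\omega \mid \exists t\,(|\Delta M(\omega, t)| \ge 1/n)\}$. Since the internal set $A = \{n \in {}^*\mathbb{N} \mid P(\Omega_n) \le 1/n\}$ contains $\mathbb{N}$, it must have an infinite member $\eta$. For $\omega \in \Omega_\eta$, let $t_\omega$ be the first $t$ such that $|\Delta M(\omega, t)| \ge 1/\eta$, and put $t_\omega = 1$ for all $\omega$ not in $\Omega_\eta$. If $\sim_t$ are the equivalence relations in (3), we let $[\omega]_t$ denote the partition class of $\omega$ under $\sim_t$. Introduce
\[ [\omega]_t' = \{\tilde\omega \in [\omega]_t \mid t_{\tilde\omega} \le t\}, \]
and note that if $t > t_{\tilde\omega}$ for some $\tilde\omega \in [\omega]_t$, then $[\omega]_t' = [\omega]_t$. We first modify $M$ by cutting away the increments which are larger than $1/\eta$: let $K$ be the internal process defined by $K(0) = M(0)$ and
\[
\Delta K(\omega, t) = \begin{cases} \Delta M(\omega, t) & \text{if } t < t_\omega, \\ 0 & \text{if } t \ge t_\omega. \end{cases}
\]
$K$ is usually not a martingale, but if we add the process $N$ given by $N(0) = 0$,
\[
\Delta N(\omega, t) = \frac{1}{P([\omega]_t)} \sum_{\tilde\omega \in [\omega]_t'} \Delta M(\tilde\omega, t) P\{\tilde\omega\},
\]
then $\hat M = K + N$ is a martingale. The crucial observation is that the sum $\sum_{t=0}^{1} |\Delta N(\omega, t)|$ is infinitesimal a.e. To see this, introduce the set $C \subset \Omega \times \Omega \times T$ consisting of those triplets $(\omega, \tilde\omega, t)$ such that $t = t_{\tilde\omega} < 1$ and $\omega \sim_t \tilde\omega$, and make the following computation:
\[
E\Bigl(\sum_{t=0}^{1} |\Delta N(t)|\Bigr) \le \sum_{(\omega, \tilde\omega, t) \in C} \frac{|\Delta M(\tilde\omega, t)| P\{\tilde\omega\} P\{\omega\}}{P([\omega]_t)} = \sum_{\tilde\omega \in \Omega_\eta} |\Delta M(\tilde\omega, t_{\tilde\omega})| P\{\tilde\omega\} \le 2E\bigl(1_{\Omega_\eta} \max_{s\le 1}|M(s)|\bigr),
\]
which is infinitesimal since $P(\Omega_\eta) \approx 0$ and $\max_{s\le 1}|M(s)|$ is S-integrable. On the subset of $\Omega - \Omega_\eta$ where $\sum |\Delta N(s)|$ is infinitesimal, we obviously have $\hat M(t) \approx M(t)$ for all $t$. Moreover, on $\Omega - \Omega_\eta$,
\[
[\hat M](t) - [M](t) = \sum_{s=0}^{t} (2\Delta M(s) + \Delta N(s))\Delta N(s),
\]
which must be infinitesimal for all $t$ and almost all $\omega$. Since $\hat M$ is a $\lambda^2$-martingale, it looks like a promising candidate for $\tilde M$. There is one small problem, however; $\hat M$ need not have infinitesimal increments since $\Delta N$ may be noninfinitesimal. The cure is simple; let $\gamma$ be an infinite element of the set
\[ \{n \in {}^*\mathbb{N} \mid P\{\omega \mid \max_t |\Delta N(\omega, t)| > 1/n\} \le 1/n\} \]
and define $\tau : \Omega \to T$ by
\[ \tau(\omega) = \min\{t \in T \mid |\Delta N(\omega, t)| > 1/\gamma\} \wedge 1. \]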
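Since $\Delta N(\omega, t)$ only depends on the equivalence class $[\omega]_t$, $\tau$ is a stopping time. Using that $\tau = 1$ almost everywhere, it is trivial to check that the stopped process $\tilde M = \hat M_\tau$ satisfies the lemma.

With these preparations it is now quite easy to prove the main theorem.

PROOF OF THEOREM 4.2.16. Without loss of generality, we may restrict ourselves to $\lambda^2$-martingales $M$. By Lemma 4.2.18 it suffices to consider the case where $M$ has infinitesimal increments, and using stopping times
\[ \tau_n(\omega) = \min\{t \in T \mid |M(\omega, t)| \ge n \text{ or } [M](\omega, t) \ge n\}, \]
we may thus assume that $M$ and $[M]$ are S-bounded.

(i) Let us first assume that $[M]$ is S-continuous. For each pair $(m, n) \in {}^*\mathbb{N}^2$ define a subset $A_{m,n}$ of $\Omega$ by
\[
A_{m,n} = \Bigl\{\omega \in \Omega \Bigm| \exists i < n \Bigl(\max_{\overline{i/n} \le s \le \overline{(i+1)/n}} \bigl|M(s) - M(\overline{i/n})\bigr| \ge \frac{1}{m}\Bigr)\Bigr\},
\]
where $\bar r$ denotes the smallest element in $T$ larger than or equal to $r$. To prove that $M$ is S-continuous, we must show that
\[ A = \bigcup_{m\in\mathbb{N}} \bigcap_{n\in\mathbb{N}} A_{m,n} \]
has measure zero, and for this it is sufficient to show that $P(A_{m,\gamma}) \approx 0$ for all $m \in \mathbb{N}$, $\gamma \in {}^*\mathbb{N} - \mathbb{N}$. But
\[
\begin{aligned}
P(A_{m,\gamma}) &\le \sum_{i<\gamma} P\Bigl\{\omega \Bigm| \max_{\overline{i/\gamma} \le s \le \overline{(i+1)/\gamma}} \bigl|M(s) - M(\overline{i/\gamma})\bigr| \ge \frac{1}{m}\Bigr\} \\
&\le m^4 \sum_{i<\gamma} E\Bigl(\max_{\overline{i/\gamma} \le s \le \overline{(i+1)/\gamma}} \bigl(M(s) - M(\overline{i/\gamma})\bigr)^4\Bigr) \\
&\le \frac{m^4}{C} E\Bigl(\sum_{i<\gamma} \bigl([M](\overline{(i+1)/\gamma}) - [M](\overline{i/\gamma})\bigr)^2\Bigr),
\end{aligned}
\]
where the last step uses Lemma 4.2.17 applied to the martingales $M(s) - M(\overline{i/\gamma})$. Since the quadratic variation is S-continuous and finite, the last expectation is infinitesimal and so is $P(A_{m,\gamma})$.

(ii) We now assume that $M$ is S-continuous and define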
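\[
B_{m,n} = \Bigl\{\omega \in \Omega \Bigm| \exists i \in {}^*\mathbb{N} \Bigl([M](\overline{(i+1)/n}) - [M](\overline{i/n}) \ge \frac{1}{m}\Bigr)\Bigr\}.
\]
As above it suffices to show that $P(B_{m,\gamma}) \approx 0$ for all $m \in \mathbb{N}$ and $\gamma \in {}^*\mathbb{N} - \mathbb{N}$. Fix $\gamma \in {}^*\mathbb{N} - \mathbb{N}$, and let $N$ be the restriction of $M$ to the time line $S = \{\overline{i/\gamma} \mid i \le \gamma\}$. By stopping $M$ before $[N]$ gets too large if necessary, we may assume that $[N]$ is S-bounded (this uses the fact that $M$ has infinitesimal increments). Using 4.2.17 and Doob's inequality, we have
\[
\begin{aligned}
0 \le P(B_{m,\gamma}) &\le \sum_{i<\gamma} P\Bigl\{\omega \Bigm| \bigl([M](\overline{(i+1)/\gamma}) - [M](\overline{i/\gamma})\bigr)^2 \ge \frac{1}{m^2}\Bigr\} \\
&\le m^2 \sum_{i<\gamma} E\Bigl(\bigl([M](\overline{(i+1)/\gamma}) - [M](\overline{i/\gamma})\bigr)^2\Bigr) \\
&\le m^2 K \sum_{i<\gamma} E\Bigl(\max_{\overline{i/\gamma} \le s \le \overline{(i+1)/\gamma}} \bigl(M(s) - M(\overline{i/\gamma})\bigr)^4\Bigr) \\
&\le m^2 K (\tfrac{4}{3})^4 \sum_{i<\gamma} E\Bigl(\bigl(M(\overline{(i+1)/\gamma}) - M(\overline{i/\gamma})\bigr)^4\Bigr) \\
&\le m^2 K (\tfrac{4}{3})^4 E\Bigl\{\max_{i<\gamma} \Bigl(\bigl(M(\overline{(i+1)/\gamma}) - M(\overline{i/\gamma})\bigr)^2\Bigr)[N](1)\Bigr\},
\end{aligned}
\]
which is infinitesimal since $[N]$ is S-bounded and $M$ is S-continuous. This completes the proof.

The theory we have presented above is due independently to Hoover and Perkins (1983a,b) and Lindstrøm (1980a). Our exposition follows Lindstrøm (1980a) quite closely; the Hoover-Perkins approach is based on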
a systematic use of the Burkholder-Davis-Gundy inequalities and leads to results which are slightly stronger than the ones we have arrived at; basically, they only need local S-integrability where we have required local square S-integrability. Hoover and Perkins also give an alternative formulation of Theorem 4.2.16 which is more convenient to use in certain applications. The continuity theorem 4.2.15 for stochastic integrals has several ancestors and relatives in the literature; Anderson (1976) proved the theorem for M a hyperfinite random walk and X a lifting, and Keisler (1984) removed the lifting hypothesis from X and at the same time estimated the modulus of continuity of the stochastic integral. Indeed, the papers by Anderson and Keisler give a complete theory of stochastic integration with respect to Anderson’s random walk, and much of the first part of this section is a straightforward generalization of their work to a martingale setting [see also the thesis of Panetta (1978), which contains the first results on integration with respect to S-continuous martingales]. This generalization would, of course, have been much less “straightforward” if the standard theory of stochastic integration had not told us what to look for and provided us with the necessary inequalities. There are several good introductions to standard stochastic integration available, e.g., Meyer’s classic “course on stochastic integrals” (1976) and the books by Metivier and Pellaumail (1980), Chung and Williams (1983), and Ikeda and Watanabe (1981). As for the nonstandard theory, the book by Stroyan and Bayod (1985) gives an account of the Hoover-Perkins approach and also adds many new contributions to the field, while Osswald’s (1985) lecture note is a careful and detailed exposition of the approach we have followed. 
Martingale theory is much more than stochastic integration; for nonstandard contributions to other aspects of it, see Helms and Loeb (1982), Hoover (1984), Hoover and Keisler (1984), and Perkins (1982) [extending results of Barlow (1981)].
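The inequalities (14) of Lemma 4.2.17 can be checked by brute force on a small example. The following Python sketch is an illustration only: it assumes an ordinary finite random walk with $n$ steps of size $\pm n^{-1/2}$ in place of a hyperfinite martingale, and the name `walk_moments` is chosen here and does not come from the text. It enumerates all $2^n$ paths and returns the exact values of $E(\max_s M(s)^4)$, $E(M(1)^4)$, and $E([M](1)^2)$.

```python
from itertools import product


def walk_moments(n):
    """Exact moments of an n-step random walk with steps +-n**-0.5.

    Returns (E(max_s M(s)^4), E(M(1)^4), E([M](1)^2)), computed by
    enumerating all 2**n equally likely sign sequences.
    """
    step = n ** -0.5
    weight = 2.0 ** -n
    e_max4 = e_end4 = e_qv2 = 0.0
    for signs in product((-1, 1), repeat=n):
        m = 0.0        # current value M(s)
        max_abs = 0.0  # running maximum of |M(s)|
        qv = 0.0       # quadratic variation [M](s)
        for sgn in signs:
            dm = sgn * step
            m += dm
            qv += dm * dm
            max_abs = max(max_abs, abs(m))
        e_max4 += weight * max_abs ** 4
        e_end4 += weight * m ** 4
        e_qv2 += weight * qv ** 2
    return e_max4, e_end4, e_qv2
```

For this walk the quadratic variation is identically one, so the second half of (14) reduces to a lower bound on $E(\max_s M(s)^4)$, while the first half, with the constant produced by the proof, reads $E(\max_s M(s)^4) \le (18(\tfrac{4}{3})^4)^2 E([M](1)^2)$. The same enumeration also confirms Doob's bound $E(\max_s M(s)^4) \le (\tfrac{4}{3})^4 E(M(1)^4)$ used in the first step of the proof.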
4.3. LIFTING THEOREMS

A lifting theorem gives an internal approximation to an external object and is thus an important technical tool in nonstandard praxis. Lifting theorems also provide simple characterizations of classes of processes and are therefore of equal importance for nonstandard theory. We shall develop both themes.

A. Nonanticipating Liftings
We recall from Theorem 3.2.4 that a function $f$ from a hyperfinite probability space $\Omega$ to $\mathbb{R}$ is Loeb measurable if and only if it has a lifting $F : \Omega \to {}^*\mathbb{R}$, and that a function $f : [0,1] \to \mathbb{R}$ is Lebesgue measurable if and only if it has a lifting $F : T \to {}^*\mathbb{R}$, where $T$ is a hyperfinite time line. A stochastic process is a map of two variables $x : \Omega \times [0,1] \to \mathbb{R}$, and we need to combine the two separate lifting theorems into one result asserting the existence of some suitable hyperfinite process $X : \Omega \times T \to {}^*\mathbb{R}$. We started this discussion in Section 4.1 and shall continue here. Let us make one preliminary remark; Theorem 4.1.6 would lose some of its force if Theorem 4.1.5 did not assert the existence of "enough" liftings of the appropriate kind.

Before we can begin to prove lifting theorems, we must introduce the basic concepts of stochastic analysis. A stochastic filtration is a tuple $(\Omega, \{\mathcal{B}_t\}_{t\in[0,1]}, Q)$, where $\{\mathcal{B}_t\}_{t\in[0,1]}$ is an increasing family of $\sigma$-algebras on a set $\Omega$, and $Q$ is a probability measure on $\mathcal{B}_1$. Let $(\Omega, \{\mathcal{A}_t\}_{t\in T}, P)$ be an internal filtration as defined in 4.2.3. The stochastic filtration generated by $(\Omega, \{\mathcal{A}_t\}_{t\in T}, P)$ consists of $\Omega$, the Loeb measure $L(P)$, and the $\sigma$-algebras
\[ \mathcal{B}_t = \sigma\Bigl(\bigcup_{s\approx t} L(\mathcal{A}_s) \cup \mathcal{N}\Bigr), \tag{1} \]
where $\mathcal{N}$ consists of the null sets in $L(\mathcal{A})$. A stochastic filtration $(\Omega, \{\mathcal{B}_t\}_{t\in[0,1]}, Q)$ is said to satisfy the usual conditions if each $\mathcal{B}_t$ contains all the null sets of $\mathcal{B}_1$ and for all $t \in [0, 1)$
\[ \mathcal{B}_t = \bigcap_{s>t} \mathcal{B}_s. \tag{2} \]
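The next lemma implies that all internally generated filtrations satisfy these conditions.

4.3.1. LEMMA. Let $(\Omega, \{\mathcal{B}_t\}_{t\in[0,1]}, L(P))$ be the stochastic filtration generated by $(\Omega, \{\mathcal{A}_t\}_{t\in T}, P)$. Then for all $t \in [0, 1]$,
\[ \mathcal{B}_t = \bigcup_{s\approx t} \sigma(L(\mathcal{A}_s) \cup \mathcal{N}). \tag{3} \]

PROOF. Obviously
\[ \bigcup_{s\approx t} \sigma(L(\mathcal{A}_s) \cup \mathcal{N}) \subset \mathcal{B}_t \subset \sigma\Bigl(\bigcup_{s\approx t} \sigma(L(\mathcal{A}_s) \cup \mathcal{N})\Bigr), \]
and it is thus enough to prove that $\bigcup_{s\approx t} \sigma(L(\mathcal{A}_s) \cup \mathcal{N})$ is a $\sigma$-algebra.

Let $\{A_n\}_{n\in\mathbb{N}}$ be a countable family of sets from $\bigcup_{s\approx t} \sigma(L(\mathcal{A}_s) \cup \mathcal{N})$, and assume $A_n \in \sigma(L(\mathcal{A}_{s_n}) \cup \mathcal{N})$. The family $S_n = [s_n, s_n + 1/n] \cap T$ is countable and has the finite intersection property, hence by saturation $\bigcap_{n\in\mathbb{N}} S_n \ne \emptyset$. If $\tilde s \in \bigcap_{n\in\mathbb{N}} S_n$, then $\tilde s \approx t$ and $\bigcup_{n\in\mathbb{N}} A_n \in \sigma(L(\mathcal{A}_{\tilde s}) \cup \mathcal{N}) \subset \bigcup_{s\approx t} \sigma(L(\mathcal{A}_s) \cup \mathcal{N})$. This shows that our family is closed under countable unions, and as it obviously has the other properties of a $\sigma$-algebra, the lemma is proved.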
From (3) and (1), we immediately get

4.3.2. COROLLARY. A stochastic filtration generated by an internal filtration satisfies the usual conditions.
There are several standard attempts to capture the concept of a "nonanticipating" process. Unfortunately, there seems to be no natural candidate; different formulations give rise to different notions and not all are suitable for the same purposes. We shall concentrate on two of these notions: adapted processes and predictable ones. First recall that a measurable rectangle is a subset of $\Omega \times [0, 1]$ of the form $B \times [s, t]$, where $B$ is Loeb measurable. A set is measurable if it is in the $\sigma$-algebra generated by the measurable rectangles. Let $(\Omega, \{\mathcal{B}_t\}_{t\in[0,1]}, Q)$ be a stochastic filtration.

4.3.3. DEFINITION. A set $A \subset \Omega \times [0, 1]$ is called adapted (with respect to $\{\mathcal{B}_t\}$) if $A$ is measurable and each section $A_t = \{\omega \mid (\omega, t) \in A\}$ is $\mathcal{B}_t$-measurable.

4.3.4. DEFINITION. A predictable rectangle (with respect to $\{\mathcal{B}_t\}$) is a set of the form $B_s \times (s, t]$, where $B_s \in \mathcal{B}_s$, or $B_0 \times [0, t]$, where $B_0 \in \mathcal{B}_0$. A set is called predictable if it is in the $\sigma$-algebra generated by the predictable rectangles.
A process $X : \Omega \times [0, 1] \to \mathbb{R}$ is adapted if it is measurable with respect to the $\sigma$-algebra of adapted sets. Let $\mu$ be a measure on $\Omega \times [0,1]$ defined on an extension of the measurable sets. A set $A$ is almost surely adapted with respect to $\mu$ if there is an adapted set $B$ such that $\mu(A \,\Delta\, B) = 0$, and a process is almost surely adapted if it is equal almost everywhere to an adapted process. We use similar terminology for measurable and predictable sets and processes. It is clear that all almost surely predictable processes are almost surely adapted. The opposite is not true in general. We shall be mostly interested in the case where $\mu$ is defined from an internal measure $\nu$ on $\Omega \times T$ by
\[ \mu = L(\nu) \circ (\mathrm{id} \times \mathrm{st})^{-1} \tag{4} \]
(here id is the identity map on $\Omega$). The situation we have in mind is $\nu = \nu_M$, where $\nu_M$ is the measure derived from a martingale $M$ as in (4.2.13). A special case is $\nu = P \times \lambda$, where $\lambda$ is the uniform measure on $T$ [i.e., $\lambda(\{t_i\}) = t_{i+1} - t_i$]; then $\mu$ is the completed product of $L(P)$ and the Lebesgue measure. This is the case we discussed in Section 4.1. For convenience we introduce the notation
\[ \mathrm{St} = \mathrm{id} \times \mathrm{st} : \Omega \times T \to \Omega \times [0,1]. \tag{5} \]
From now on $(\Omega, \{\mathcal{A}_t\}_{t\in T}, P)$ will be an internal filtration generating $(\Omega, \{\mathcal{B}_t\}_{t\in[0,1]}, L(P))$; $\nu$ will be an internal measure defined on all internal subsets of $\Omega \times T$; and $\mu = L(\nu) \circ \mathrm{St}^{-1}$. We shall assume that $\nu(\Omega \times T)$ is finite. The measure $\nu$ is said to be absolutely continuous with respect to $P$ if $L(P)(C) = 0$ implies $L(\nu)(C \times T) = 0$, and $\mu(\Omega \times \{0\}) = 0$.

4.3.5. LEMMA. Let $\nu$ be absolutely continuous with respect to $P$. If $B \subset \Omega \times [0, 1]$ is almost surely predictable, then there exists a nonanticipating $A \subset \Omega \times T$ such that $L(\nu)(A \,\Delta\, \mathrm{St}^{-1}(B)) = 0$.

PROOF. It is enough to consider the case where $B$ is a predictable rectangle. Assume first that $B$ is of the form $B_s \times (s, t]$. By 4.3.1 we can find an $\tilde s \in T$, $\tilde s \approx s$, and an $A_{\tilde s} \in \mathcal{A}_{\tilde s}$ such that $L(P)(B_s \,\Delta\, A_{\tilde s}) = 0$. By the absolute continuity of $\nu$ with respect to $P$, we get
\[ \mu(B_s \times (s, t]) = \lim_{m\to\infty} \lim_{n\to\infty} L(\nu)\bigl(A_{\tilde s} \times (s + \tfrac{1}{n}, t + \tfrac{1}{m}]\bigr). \]
Thus we can find $n, m \in {}^*\mathbb{N} - \mathbb{N}$ such that $s + 1/n \ge \tilde s$ and
\[ \mu(B_s \times (s, t]) = {}^\circ\nu\bigl(A_{\tilde s} \times (s + \tfrac{1}{n}, t + \tfrac{1}{m}]\bigr). \]
Hence we may take $A = A_{\tilde s} \times (s + 1/n, t + 1/m]$. To treat the other kind of predictable rectangle $B_0 \times [0, t]$, we use the second clause in the definition of absolute continuity. The details are left to the reader.

In the opposite direction we have

4.3.6. LEMMA. Let $B \subset \Omega \times [0, 1]$ and assume that there is a nonanticipating $A \subset \Omega \times T$ such that $L(\nu)(A \,\Delta\, \mathrm{St}^{-1}(B)) = 0$. Then $B$ is almost surely adapted.

PROOF.
Let $\nu_A$ be the internal measure given by $\nu_A(C) = \nu(A \cap C)$, and put $\mu_A = L(\nu_A) \circ \mathrm{St}^{-1}$. Let $g$ be the Radon-Nikodym derivative $g = \partial\mu_A/\partial\mu$, and define
\[ C = \{(\omega, t) \mid g(\omega, t) = 1\}. \]
Since $L(\nu)(A \,\Delta\, \mathrm{St}^{-1}(B)) = 0$, we must have $g = 1$ a.e. on $B$ and $g = 0$ a.e. outside $B$. Hence $\mu(B \,\Delta\, C) = 0$, and all we have to do is to find an adapted version of $g$. Define internal functions $F, F_A : \Omega \times T \to {}^*\mathbb{R}$ by
\[ F(\omega, t) = \sum\{\nu(\omega, s) \mid s \le t\}, \qquad F_A(\omega, t) = \sum\{\nu_A(\omega, s) \mid s \le t\}. \]
The right standard parts $f = {}^\circ F^+$, $f_A = {}^\circ F_A^+$ are measurable since they are increasing, right-continuous processes, and they are obviously $\{\mathcal{B}_t\}$-adapted. But
and the lemma is proved.

The proof above has an immediate corollary.

4.3.7. LEMMA. Let $\nu$ be absolutely continuous with respect to $P$. A set $B \subset \Omega \times [0, 1]$ is almost surely measurable if and only if there is an internal subset $A \subset \Omega \times T$ such that $L(\nu)(A \,\Delta\, \mathrm{St}^{-1}(B)) = 0$.

PROOF. That an almost surely measurable set can be lifted is proved by a straightforward routine argument, which we leave to the reader. The proof of the opposite direction is a copy of the proof of 4.3.6; just delete everything which has to do with nonanticipation.

The results in 4.3.5-4.3.7 can be expressed in terms of processes instead of sets. The following important definition generalizes 4.1.1.

4.3.8. DEFINITION. Let $x : \Omega \times [0, 1] \to \mathbb{R}$ be a stochastic process. A lifting of $x$ (with respect to $\nu$) is an internal process $X : \Omega \times T \to {}^*\mathbb{R}$ such that
\[ {}^\circ X(\omega, t) = x(\omega, {}^\circ t) \quad L(\nu)\text{-a.e.} \]
Of course, the notion of lifting generalizes to processes taking values in arbitrary Hausdorff spaces. By 3.2.4 (i), the results above can now be reformulated as the following proposition.

4.3.9. PROPOSITION. Assume that $\nu$ is absolutely continuous with respect to $P$, and let $x : \Omega \times [0, 1] \to \mathbb{R}$ be a stochastic process.
(i) If x is almost surely predictable, then x has a nonanticipating lifting. (ii) If x has a nonanticipating lifting, then x is almost surely adapted. (iii) x is almost surely measurable if and only if it has a lifting.
A lifting theorem in the sense of Keisler (1984) is a result that characterizes a class of standard processes in terms of what kind of liftings they allow; 4.3.9(iii) is such a characterization of almost surely measurable processes. The two first parts of the proposition do not quite add up to a lifting theorem since we have not determined exactly which standard notion corresponds to the nonstandard one of a nonanticipating process; is it a.s. predictable, a.s. adapted, or something in between? However, in cases where these classes coincide, we do get a lifting theorem from 4.3.9(i) and (ii). We shall take a brief look at one such case.

4.3.10. LEMMA. Let $m$ be the Lebesgue measure on $[0, 1]$, let $P$ be a probability measure on $\Omega$, and put $\mu = P \times m$. A process $x : \Omega \times [0,1] \to \mathbb{R}$ is almost surely predictable with respect to $\mu$ if and only if it is almost surely adapted.

PROOF. It is enough to show that if $x$ is adapted and bounded, then $x$ is almost surely predictable. First observe that if a process is adapted and continuous, then it is almost surely predictable. Hence, taking
\[
\delta_\varepsilon(x) = \begin{cases} -\dfrac{6}{\varepsilon^3}\,x(x - \varepsilon) & \text{for } 0 < x < \varepsilon, \\ 0 & \text{elsewhere} \end{cases}
\]
as an approximation to the delta function, we see that
\[ x_\varepsilon(t) = (x * \delta_\varepsilon)(t) = \int_0^1 x(t - s)\delta_\varepsilon(s)\,ds \]
is a.s. predictable. Since $x_\varepsilon$ converges to $x$ in $\mu$-norm as $\varepsilon \to 0$, the original process $x$ must be almost surely predictable.
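The smoothing step in the proof of Lemma 4.3.10 can be illustrated on a sampled path. The Python sketch below is a discrete analogue under stated assumptions, not the construction of the proof itself: it uses a box kernel (the average over the preceding `width` grid points) in place of $\delta_\varepsilon$, it treats the path as zero before time zero, and the names `causal_average` and `mean_square_error` are chosen here for illustration. The point it shows is that the smoothed value at a grid index uses only strictly earlier samples, which is the discrete counterpart of the smoothed process depending only on the past.

```python
def causal_average(path, width):
    """Average of the `width` samples strictly before each index.

    Samples before index 0 are treated as 0, so early windows are
    padded with zeros rather than shortened.
    """
    out = []
    for i in range(len(path)):
        window = path[max(0, i - width):i]  # only indices < i are read
        out.append(sum(window) / width)
    return out


def mean_square_error(path, width):
    """Grid version of the squared L2 distance between path and its average."""
    smoothed = causal_average(path, width)
    return sum((a - b) ** 2 for a, b in zip(smoothed, path)) / len(path)
```

For a bounded path with a single jump, shrinking `width` relative to the grid size makes `mean_square_error` small, mirroring the convergence of $x_\varepsilon$ to $x$ in $\mu$-norm as $\varepsilon \to 0$.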
4.3.11. COROLLARY. If $P$ is an internal probability measure on $\Omega$, $\lambda$ the uniform measure on $T$, and $\nu = P \times \lambda$, then a process $x : \Omega \times [0,1] \to \mathbb{R}$ is almost surely adapted with respect to $\mu = L(\nu) \circ \mathrm{St}^{-1}$ if and only if it has a nonanticipating lifting.
The reader should notice that 4.3.11 is identical to 4.1.5. The results above are sufficient for our needs in this book; if you are interested in the deeper aspects of the theory, you should consult the work of Stroyan (1985) [see also Stroyan and Bayod (1985)], which contains detailed information on the questions raised by Proposition 4.3.9.
B. Uniform Liftings

We shall now turn our attention to the continuous processes. The appropriate notion of lifting was introduced by Keisler (1984).
4.3.12. DEFINITION. Let $E$ and $F$ be Hausdorff spaces and $x : \Omega \times E \to F$ a stochastic process. An internal process $X : \Omega \times {}^*E \to {}^*F$ is a uniform lifting of $x$ if there is a set $\Omega'$ of measure one such that
\[ {}^\circ X(\omega, m) = x(\omega, {}^\circ m) \]
for all $\omega \in \Omega'$ and all nearstandard $m \in {}^*E$.

Recall that $x : \Omega \times E \to F$ being a stochastic process just means that $x(\cdot, m)$ is measurable for each $m \in E$. In most applications $E = [0, 1]$ and $F = \mathbb{R}^n$, but there are examples in stochastic differential equations and
stochastic control theory that make it convenient to consider also the more general situation. Keisler proved the following result when $E = [0, 1]$ and $F = \mathbb{R}^n$:

4.3.13. PROPOSITION. Assume that $E$ and $F$ are separable metric spaces. A stochastic process $x : \Omega \times E \to F$ is continuous if and only if it has a uniform lifting.

PROOF. Let us do the easy part first. Assume that $X$ is a uniform lifting of $x$, and fix $m \in E$, $\varepsilon \in \mathbb{R}_+$, and $\omega \in \Omega'$ (where $\Omega'$ is the set where $x$ and $X$ agree as in Definition 4.3.12). If $y = x(\omega, m)$, the internal set
\[ \{\delta \in {}^*\mathbb{R}_+ \mid d(y, X(\omega, m')) < \varepsilon \text{ whenever } d(m, m') < \delta\} \]
contains all positive infinitesimals, and hence a noninfinitesimal $\delta_0$. Consequently $d(m, m') < \delta_0$ implies $d(x(\omega, m), x(\omega, m')) \le \varepsilon$, and $x(\omega, \cdot)$ is continuous at $m$.

For the converse, the idea is as follows. Let $C(E, F)$ be the set of continuous functions from $E$ to $F$, and define $\hat x : \Omega \to C(E, F)$ by $\hat x(\omega) = x(\omega, \cdot)$. Use Anderson's lifting theorem 3.2.4 for random variables to pick a lifting $\hat X$ of $\hat x$, and define $X : \Omega \times {}^*E \to {}^*F$ by $X(\omega, m) = \hat X(\omega)(m)$. Then $X$ is a uniform lifting of $x$.

To carry out this plan we need a topology on $C(E, F)$, and in order for Anderson's lifting theorem to work, it has to be second countable. When $E = [0,1]$, $F = \mathbb{R}^d$, Keisler could use the compact-open topology, but in general this does not work as $C(E, F)$ may fail to be second countable if $E$ is not locally compact. Instead we shall use a topology which can most conveniently be described as follows. Fix a countable, dense subset $E_0$ of $E$, and let $\mathfrak{B}(E)$ be the family
\[ \mathfrak{B}(E) = \{B(e, 1/n) \mid e \in E_0,\ n \in \mathbb{N}\} \]
of closed balls $B(e, 1/n) = \{m \in E \mid d(e, m) \le 1/n\}$. Let $\mathfrak{B}(F)$ be a similarly defined family in $F$. If $B_1 \in \mathfrak{B}(E)$, $B_2 \in \mathfrak{B}(F)$, let
\[ O_{B_1,B_2} = \{f \in C(E, F) \mid f[B_1] \subset B_2\}. \]
Define a topology $\tau$ on $C(E, F)$ by letting its open sets be arbitrary unions of finite intersections of the form
\[ O_{B_1^{(1)},B_2^{(1)}} \cap \cdots \cap O_{B_1^{(m)},B_2^{(m)}}. \tag{6} \]
Since there are only countably many such sets (6), this topology is second countable, and it is obviously Hausdorff. As above, define $\hat x : \Omega \to C(E, F)$ by $\hat x(\omega) = x(\omega, \cdot)$. To lift $\hat x$, we must first show that it is measurable with respect to the topology $\tau$. Since there are only countably many basic open sets of the form (6), it suffices to show that each set $\hat x^{-1}(O_{B_1,B_2})$ is measurable. Let $D$ be a countable, dense subset of $B_1$; then since $B_2$ is closed
\[ \hat x^{-1}(O_{B_1,B_2}) = \bigcap_{m\in D} \{\omega \mid x(\omega, m) \in B_2\}, \]
which is measurable since $x$ is a stochastic process. We can now pick a lifting $\hat X$ of $\hat x$, and define $X$ by $X(\omega, m) = \hat X(\omega)(m)$. Let $\Omega'$ be a set of measure one such that ${}^\circ\hat X(\omega) = \hat x(\omega)$ for all $\omega \in \Omega'$. It only remains to show that if $\omega \in \Omega'$, then $x(\omega, {}^\circ m) = {}^\circ X(\omega, m)$ for all nearstandard $m$. Assume not; then $d(x(\omega, {}^\circ m), {}^\circ X(\omega, m))$ is noninfinitesimal, and there is an element $B_2 \in \mathfrak{B}(F)$ such that $x(\omega, {}^\circ m)$ belongs to the interior of $B_2$, but $X(\omega, m) \notin {}^*B_2$. By continuity of $x(\omega, \cdot)$, there is a set $B_1 \in \mathfrak{B}(E)$ with ${}^\circ m$ in its interior such that $x(\omega, m') \in B_2$ for all $m' \in B_1$. Hence $\hat x(\omega) \in O_{B_1,B_2}$ while clearly $\hat X(\omega) \notin {}^*O_{B_1,B_2}$, contradicting the assumption ${}^\circ\hat X(\omega) = \hat x(\omega)$. The proof is complete.
A special case of the proposition above is that a process $x : \Omega \times [0,1] \to \mathbb{R}$ is continuous if and only if it has a uniform lifting $X : \Omega \times T \to {}^*\mathbb{R}$. Another situation where the result is useful is when we are considering a function
\[ f : [0, 1] \times C([0, 1], \mathbb{R}^n) \to \mathbb{R}^m, \tag{7} \]
which is measurable in the first coordinate and continuous in the second. If $\lambda$ is the normalized counting measure on $T$, Proposition 4.3.13 asserts the existence of a uniform lifting $F : T \times {}^*C([0, 1], \mathbb{R}^n) \to {}^*\mathbb{R}^m$.

Note that in this case $C(E, F) = C(C([0, 1], \mathbb{R}^n), \mathbb{R}^m)$ is not second countable in the compact-open topology, and hence the special choice of the topology in the proof of 4.3.13 was necessary. We shall encounter functions $f$ of the above type in the section on stochastic control theory, where we are interested in expressions of the form $f(t, x(\omega, \cdot))$ for a controlled process $x$.
A natural question is whether it is possible to combine two already established lifting results to create a new one; e.g., it is tempting to conjecture that a process is continuous and adapted if and only if it has a nonanticipating, uniform lifting. In most cases an extension of this kind is possible, but we may have to formulate it carefully, and the proof is often far from trivial. As an illustration we shall prove a modification of the conjecture we just mentioned. An internal process $X : \Omega \times T \to {}^*\mathbb{R}$ is an essentially uniform lifting of $x : \Omega \times [0,1] \to \mathbb{R}$ if there is an infinitesimal $\delta \in T$ such that
\[ \{\omega \mid \forall t > \delta\,({}^\circ X(\omega, t) = x(\omega, {}^\circ t))\} \]
has Loeb measure one; i.e., the uniformity of the lifting may break down on an infinitesimal initial segment. It follows from 4.3.13 that $x$ has an essentially uniform lifting if and only if it is continuous.

4.3.14. THEOREM. A process $x : \Omega \times [0, 1] \to \mathbb{R}$ is continuous and adapted if and only if it has an essentially uniform, nonanticipating lifting.

REMARK. Keisler (1984) originally proved this result in a slightly less general context; the version we are using is taken from Osswald (1985). Another formulation of the theorem, which is more common in the literature, is to claim that the lifting is uniform, but only nonanticipating after an infinitesimal time $\delta$. The statement we get by just removing the word "essentially" is not true, as we shall show in Example 4.3.15; hence our conjecture above (i.e., that a process is continuous and adapted iff it has a nonanticipating, uniform lifting) is actually false! The assertion obtained by turning "adapted" into "almost surely adapted" is also false, which is perhaps somewhat surprising at first glance in view of 4.3.9.

4.3.15. EXAMPLE. To construct a continuous, adapted process that does not have a uniform, nonanticipating lifting, we return to the setting of Section 4.1. Hence $\Omega$ is the set of all internal maps $\omega : T \to \{-1, 1\}$, and $\mathcal{A}_t$ is the internal algebra generated by the equivalence relation $\omega \sim_t \omega'$ iff $\omega(s) = \omega'(s)$ for all $s < t$. Let $x$ be defined by $x(\omega, t) = \omega(0)$ for all $\omega \in \Omega$, $t \in [0,1]$. Clearly $x$ is a continuous, $\mathcal{B}_t$-adapted process. Since $\mathcal{A}_0 = \{\emptyset, \Omega\}$, $x$ does not have a uniform, nonanticipating lifting. Note that
\[
X(\omega, t) = \begin{cases} \omega(0) & \text{if } t > 0, \\ 0 & \text{if } t = 0 \end{cases}
\]
is a nonanticipating, essentially uniform lifting of $x$.

In the proof of 4.3.14 it will be convenient to work with a standard filtration $\{\mathcal{C}_t\}$ which is slightly coarser than $\{\mathcal{B}_t\}$. To define it, we first introduce an internal equivalence relation $\sim_s$ on $\Omega$ for each $s \in T$ by
\[ \omega \sim_s \omega' \quad\text{iff}\quad \forall A \in \mathcal{A}_s\,(\omega \in A \leftrightarrow \omega' \in A). \tag{8} \]
If $t \in [0, 1]$, let $\equiv_t$ be the (external) equivalence relation given by $\omega \equiv_t \omega'$ if $\omega \sim_s \omega'$ for all $s$ such that ${}^\circ s \le t$. The $\sigma$-algebra $\mathcal{C}_t$ consists of all Loeb measurable sets $C$ which are closed under $\equiv_t$. Using 4.3.1 and 4.3.2 it is easy to see that $\mathcal{C}_t \subset \mathcal{B}_t$, but that for each $B \in \mathcal{B}_t$ there is a $C \in \mathcal{C}_t$ such that $B \,\Delta\, C$ has Loeb measure zero. It follows from this that for each continuous, $\mathcal{B}_t$-adapted process $x$, there is a continuous, $\mathcal{C}_t$-adapted process $y$ such that $x(\omega, \cdot) = y(\omega, \cdot)$ for almost all $\omega$. The filtration $\{\mathcal{C}_t\}$ is the one used by Keisler (1984).

PROOF OF THEOREM 4.3.14. Assume that $x$ has an essentially uniform, nonanticipating lifting $X$. Since it has an essentially uniform lifting, $x$ is continuous. To show that it also is adapted, pick $t \in [0, 1]$ and $\tilde t \in T$ such that $t = {}^\circ\tilde t$. Then ${}^\circ X(\tilde t) = x(t)$ almost everywhere, and since $X(\tilde t)$ is $\mathcal{A}_{\tilde t}$-measurable, $x(t)$ must be $\mathcal{B}_t$-measurable.

For the converse, assume that $x$ is continuous and adapted. Let $y$ be a continuous and $\mathcal{C}_t$-adapted process such that $y(\omega, \cdot) = x(\omega, \cdot)$ almost everywhere, and let $Y$ be a uniform lifting of $y$. We shall modify $Y$ such that it becomes a nonanticipating, essentially uniform lifting of $y$ (and hence of $x$).

Let $\Omega'$ be the set of measure one where $Y$ lifts $y$, and choose an increasing, internal sequence $\{\Omega_n\}_{n\in{}^*\mathbb{N}}$ such that for each $n \in \mathbb{N}$, $\Omega_n \subset \Omega'$ and $P(\Omega_n) > 1 - 1/n$. Let $\tilde\Omega = \bigcup_{n\in\mathbb{N}} \Omega_n$. For each equivalence class $\theta$ of the relation $\sim_s$ in (8), choose a representative $\omega_\theta \in \theta$. We pick $\omega_\theta$ such that it belongs to the smallest $\Omega_n$ which intersects $\theta$, and note that this selection can be carried out in a strictly internal way. To each element $\omega \in \Omega$ and each $s \in T$, we thus associate a representative $\omega_s$ (i.e., the representative of the equivalence class of $\omega$ under $\sim_s$). Observe that since we pick $\omega_s$ from the smallest possible $\Omega_n$, the representative $\omega_s$ is always in $\tilde\Omega$ when $\omega$ is. This implies that
\[ {}^\circ Y(\omega, s) = y(\omega, {}^\circ s) \quad\text{and}\quad {}^\circ Y(\omega_s, s) = y(\omega_s, {}^\circ s) \tag{9} \]
when $\omega \in \tilde\Omega$, since $Y$ lifts $y$ on $\tilde\Omega$. Define $X : \Omega \times T \to {}^*\mathbb{R}$ by
\[ X(\omega, s) = Y(\omega_s, s). \]
As $X$ is clearly nonanticipating, it suffices to show that it is an essentially uniform lifting of $y$. Assume that $\omega \in \tilde\Omega$, $s \in T$, and that ${}^\circ s > 0$. By construction $\omega \sim_s \omega_s$, and hence $\omega \equiv_r \omega_s$ for all $r \in [0, 1]$, $r < {}^\circ s$. Since $y$ is $\mathcal{C}_t$-adapted, this implies that $y(\omega, r) = y(\omega_s, r)$. Taking the limit as $r$ increases to ${}^\circ s$, we get
\[ y(\omega, {}^\circ s) = y(\omega_s, {}^\circ s). \tag{10} \]
Combining (9) and (10), we see that for all noninfinitesimal $s \in T$ and all $\omega \in \tilde\Omega$
\[ {}^\circ X(\omega, s) = {}^\circ Y(\omega_s, s) = y(\omega_s, {}^\circ s) = y(\omega, {}^\circ s), \tag{11} \]
which shows that $X$ is a uniform lifting of $y$ on the half-open interval $(0, 1]$. It remains to extend (11) to sufficiently large, infinitesimal $s$. The trick is as follows: from (11) we get that the internal set
\[ \Bigl\{n \in {}^*\mathbb{N} \Bigm| P\bigl\{\omega \bigm| \forall s \ge \tfrac{1}{n}\,\bigl(|X(\omega, s) - Y(\omega, s)| < \tfrac{1}{n}\bigr)\bigr\} > 1 - \tfrac{1}{n}\Bigr\} \]
contains $\mathbb{N}$ and thus has an infinite element $\gamma$. Since $Y$ is a uniform lifting of $y$, we see that $X$ must be a uniform lifting of $y$ on $T \cap [1/\gamma, 1]$, and hence an essentially uniform lifting on $T$.

The basic idea of the proof above is to create a nonanticipating lifting from an ordinary lifting by choosing one element from each equivalence class. In the last result of this section we shall see the same idea at work in a much simpler setting. A function $f : [0,1] \times C([0, 1], \mathbb{R}^n) \to \mathbb{R}^m$ as in (7) is called nonanticipating if $f(t, y) = f(t, y')$ whenever $y(s) = y'(s)$ for all $s \le t$. We use the same terminology for internal functions $F : T \times {}^*C([0, 1], \mathbb{R}^n) \to {}^*\mathbb{R}^m$. The following result will be useful in Section 4.6.

4.3.16. PROPOSITION.
Assume that the nonanticipating function
\[ f : [0, 1] \times C([0, 1], \mathbb{R}^n) \to \mathbb{R}^m \]
is measurable in the first coordinate and continuous [w.r.t. the uniform topology on $C([0, 1], \mathbb{R}^n)$] in the second. Then $f$ has a nonanticipating lifting $F$ which is uniform in the second coordinate; i.e., there is a set $T' \subset T$ of Loeb measure one such that ${}^\circ F(t, y) = f({}^\circ t, {}^\circ y)$ for all nearstandard $y$ and all $t \in T'$.

PROOF. Let $G : T \times {}^*C([0, 1], \mathbb{R}^n) \to {}^*\mathbb{R}^m$ be a lifting of $f$ which is uniform in the second variable. For each $y \in {}^*C([0, 1], \mathbb{R}^n)$ and each $t \in T$, define $y_t \in {}^*C([0, 1], \mathbb{R}^n)$ by
\[
y_t(s) = \begin{cases} y(s) & \text{if } s \le t, \\ y(t) & \text{if } s > t, \end{cases}
\]
and notice that $({}^\circ y)(r) = ({}^\circ y_t)(r)$ for all $r \in [0, 1]$, $r \le {}^\circ t$. Hence
\[ f({}^\circ t, {}^\circ y_t) = f({}^\circ t, {}^\circ y). \tag{12} \]
Define a nonanticipating $F : T \times {}^*C([0, 1], \mathbb{R}^n) \to {}^*\mathbb{R}^m$ by
\[ F(t, y) = G(t, y_t). \tag{13} \]
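The device in (13), evaluating $G$ on the path frozen at time $t$, can be tried out on sampled paths. The Python sketch below is only an illustration under stated assumptions: paths are finite lists of samples, time is a list index, and the names `stop_path`, `make_nonanticipating`, and `g_whole_path_max` are hypothetical and do not come from the text.

```python
def stop_path(y, k):
    """Return the path y frozen at index k: y[i] for i <= k, y[k] afterwards."""
    return [y[min(i, k)] for i in range(len(y))]


def make_nonanticipating(g):
    """Turn a functional g(k, path) into f(k, path) = g(k, stopped path)."""
    def f(k, y):
        return g(k, stop_path(y, k))
    return f


def g_whole_path_max(k, y):
    """A hypothetical functional that looks at the entire path, future included."""
    return max(y)
```

Two paths that agree up to index `k` have the same stopped path, so the wrapped functional returns the same value on both, even when the original functional distinguishes them by looking at later samples.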
If $t$ is in the set $T'$ where $G$ lifts $f$ and $y$ is nearstandard, then by (12) and (13)
\[ {}^\circ F(t, y) = {}^\circ G(t, y_t) = f({}^\circ t, {}^\circ y_t) = f({}^\circ t, {}^\circ y), \]
and thus $F$ is a uniform, nonanticipating lifting of $f$.

We shall end our survey of lifting theorems here. Although we have at times used a different approach or worked in a more general setting, all the main results in this section are basically due to H. J. Keisler. We urge the reader to consult his monograph Keisler (1984) for further information, e.g., on lifting characterizations of different kinds of Markov processes. Hoover and Perkins (1983a) and Stroyan and Bayod (1985) contain a detailed analysis of liftings of right continuous processes with left limits, and Rodenhausen (1982a,b) has developed an interesting criterion for when a process is a lifting.

4.4. REPRESENTATION THEOREMS
We shall now discuss in more detail the relationship between standard and hyperfinite stochastic integrals. There are two aspects of this relationship. First, is the hyperfinite theory strong enough, i.e., can standard integrals always be reduced in some suitable sense to hyperfinite integrals? And second, does the hyperfinite theory give us something which the standard theory cannot, i.e., are there "meaningful" uses of the hyperfinite integral that cannot be reduced (at least in a simple way) to standard integration theory? Theorems 4.1.5 and 4.1.6 give a positive answer to the first question in the case of Brownian motion. The nonstandard integral with respect to a hyperfinite random walk gives, via liftings and the standard part map, the classical theory for the Itô integral. In the first part of this section we shall give the proofs we left out in Section 4.1 and at the same time extend the treatment to cover the theory of 4.2; in the second part we shall discuss some results pertaining to the second question.

A. Square S-Integrable Martingales
Let us begin with a brief sketch of the standard theory. An L2-martingale with respect to a stochastic filtration ( Z , {9t}tt[o,l,, P ) is just a martingale N : Z x [0,1] -$ R such that E ( N : ) < 00 for all t E [0,1]. We shall also assume that N is right-continuous with left limits. Most of the results we shall prove about L2-martingales have trivial extensions to the larger class of local L2-martingales; we shall just give the definition of this class and leave the extensions to the interested reader. A process N : 2 x [0,1] + R
is a local $L^2$-martingale if there exists an increasing sequence $\{\tau_n\}_{n\in\mathbb{N}}$ of stopping times such that each stopped process $N_{\tau_n}$ is an $L^2$-martingale, and for almost all $\omega \in Z$, $\tau_n(\omega) = 1$ for large enough $n \in \mathbb{N}$. Given an $L^2$-martingale $N$, we define a measure $\nu_N$ on the $\sigma$-algebra of predictable sets by putting

(1) $\nu_N(A \times (s,t]) = E\big(1_A (N(t) - N(s))^2\big)$

for all predictable rectangles $A \times (s,t]$, and letting

(2) $\nu_N(A \times \{0\}) = 0$.
The generated measure $\nu_N$ is often called the Doléans measure of $N$ (Doléans, 1968; Meyer, 1976). Stochastic integrals of the form $\int X\,dN$ are defined as follows. First, if $X$ is a simple function of the form

(3) $X = \sum_{i=1}^{n} a_i 1_{A_i \times (s_i, t_i]}$,

where $A_i \in \mathcal{F}_{s_i}$, the integral is given by

(4) $\int X\,dN = \sum_{i=1}^{n} a_i 1_{A_i}\,\big(N(t_i) - N(s_i)\big)$.

Observing that $X \mapsto \int X\,dN$ is an isometry from $L^2(\nu_N)$ to $L^2(P)$ and that the simple functions of the form (3) are dense in $L^2(\nu_N)$, we extend the mapping $X \mapsto \int X\,dN$ to an isometry from all of $L^2(\nu_N)$ into $L^2(P)$. We shall denote this extension also by $\int X\,dN$. The stochastic integral as a process is now defined for all $X \in L^2(\nu_N)$ and all $t \in [0,1]$ by

(5) $\Big(\int X\,dN\Big)(\omega, t) = \Big(\int_0^1 1_{[0,t]} X\,dN\Big)(\omega)$.
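The isometry property behind this extension can be checked exactly in a toy discrete model. The sketch below is only an illustration under stated assumptions: $N$ is replaced by an $n$-step random walk with steps $\pm 1$, the integrand is a function of the step index and the current position (so it depends only on the past), and the names `isometry_sides` and `integrand` are hypothetical, not taken from the text.

```python
from fractions import Fraction
from itertools import product


def isometry_sides(n_steps, integrand):
    """Return (E[(sum X dN)^2], E[sum X^2 dN^2]) for a +/-1 random walk N.

    The expectation is an exact average over all 2**n_steps paths.
    integrand(k, position) may only use the walk before its k-th step.
    """
    weight = Fraction(1, 2 ** n_steps)
    lhs = Fraction(0)  # second moment of the discrete stochastic integral
    rhs = Fraction(0)  # integral of X^2 against the discrete Doleans measure
    for steps in product((-1, 1), repeat=n_steps):
        position = 0
        integral = 0
        squared_sum = 0
        for k, dn in enumerate(steps):
            x = integrand(k, position)
            integral += x * dn
            squared_sum += x * x * dn * dn
            position += dn
        lhs += weight * integral ** 2
        rhs += weight * squared_sum
    return lhs, rhs


# Integrand X_k = N_k, the position of the walk before its k-th step.
lhs, rhs = isometry_sides(3, lambda k, position: position)
```

Both sides agree because the cross terms in the square have zero expectation; this is the discrete counterpart of the $L^2(\nu_N) \to L^2(P)$ isometry, not a proof of the continuous-time statement.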
Notice that since $X \mapsto \int X\,dN$ is given as an $L^2$ limit, the stochastic integral is only defined up to equivalence. Observe also that the definition above extends the definition of the Itô integral in Section 4.1; we have replaced the measure $L(P) \times m$ by $\nu_N$ and restricted the class of integrands from adapted to predictable processes (recall 4.3.10). We can now formulate the theme of the first half of this section more precisely. Let $M$ be a $\lambda^2$-martingale with respect to an internal filtration $(\Omega, \{\mathcal{A}_t\}_{t\in T}, P)$. It is easy to check that the standard part ${}^{\circ}M^{+}$ of $M$ is an $L^2$-martingale with respect to the generated stochastic filtration $(\Omega, \{\mathcal{B}_t\}_{t\in[0,1]}, L(P))$. [A word of caution may be appropriate at this point; it is not true that the standard part of a local $\lambda^2$-martingale is always a local $L^2$-martingale, since we may get problems with the stopping times; see
Lindstrøm (1980b) for an example.] Given an $X \in L^2(\nu_{{}^{\circ}M^{+}})$ we want to show that $\int X\,d\,{}^{\circ}M^{+}$ can be obtained as a hyperfinite integral of $M$; i.e., $\int X\,d\,{}^{\circ}M^{+} = {}^{\circ}\big(\int Y\,dM\big)^{+}$ for suitable $Y \in SL^2(M)$. The idea of the construction is simple; we use Proposition 4.3.9 (i) to pick $Y \in SL^2(M)$ to be a lifting of $X$ with respect to $\nu_M$, and then just check that this $Y$ does the job. However, there is one technical obstacle; Proposition 4.3.9 requires $\nu_M$ to be absolutely continuous with respect to $P$, and this is not true of $\nu_M$ for all $\lambda^2$-martingales $M$. We shall get around this obstacle by restricting our class of martingales slightly.
4.4.1. DEFINITION. An internal martingale $M : \Omega \times T \to {}^{*}\mathbb{R}$ is called an $SL^2$-martingale if $M_t \in SL^2(\Omega, \mathcal{A}_t, P)$ for all $t \in T$.

We have a corresponding notion of a local $SL^2$-martingale. We also need the following definition.

4.4.2. DEFINITION. An internal process $X : \Omega \times T \to {}^{*}\mathbb{R}$ is S-right-continuous at 0 if ${}^{\circ}X(0) = {}^{\circ}X^{+}(0)$ $L(P)$-a.e.
We notice that if $M$ is an $SL^2$-martingale which is S-right-continuous at 0, then $\nu_M$ is absolutely continuous with respect to $P$. We shall now interrupt our development of integration theory for a moment while we prove two theorems about $SL^2$-martingales. These results are included not only to make the exposition self-contained, but also because they provide another illustration of our strategy of reducing questions concerning martingales to questions concerning their quadratic variation. However, a reader primarily interested in a short introduction to stochastic integration may skip the proofs.

4.4.3. PROPOSITION. An internal martingale $M$ is an $SL^2$-martingale if and only if $M_0^2 + [M](1)$ is S-integrable.

PROOF. Since in either case $M$ is a $\lambda^2$-martingale, it is enough to prove the result for such processes. Recall from Chapter 3 that if $f : \Omega \to {}^{*}\mathbb{R}$ is internal and non-negative, then ${}^{\circ}\!\int f\,dP \ge \int {}^{\circ}f\,dL(P)$ with equality if and only if $f$ is S-integrable. We shall use this several times in the following. By formula (4.2.7)

(6) $E(M_0^2 + [M](t)) = E(M_t^2)$;

we shall prove a similar result for the standard part of $M$. Define a sequence $\{\tau_n\}_{n\in\mathbb{N}}$ of internal stopping times by

$\tau_n(\omega) = \min\{s \in T : |M(\omega, s)| \ge n\}$.
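Since $|M_{\tau_n}(s)| < n$ for all $s < \tau_n$ and the increments of $M_{\tau_n}$ vanish from $\tau_n$ onward, $\int M_{\tau_n}\,dM_{\tau_n}$ is a $\lambda^2$-martingale, and thus $\int_0^t M_{\tau_n}\,dM_{\tau_n}$ is S-integrable for all $t$. By the characterization of S-integrability above

$E\Big({}^{\circ}\!\int_0^t M_{\tau_n}\,dM_{\tau_n}\Big) = {}^{\circ}E\Big(\int_0^t M_{\tau_n}\,dM_{\tau_n}\Big) = 0$.

Since

$[M_{\tau_n}](t) = M_{\tau_n}(t)^2 - M_{\tau_n}(0)^2 - 2\int_0^t M_{\tau_n}\,dM_{\tau_n}$,

we get

(7) $E\big({}^{\circ}M_{\tau_n}(0)^2 + {}^{\circ}[M_{\tau_n}](t)\big) = E\big({}^{\circ}M_{\tau_n}(t)^2\big)$.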
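Obviously, ${}^{\circ}[M_{\tau_n}](t) \to {}^{\circ}[M](t)$ and ${}^{\circ}M_{\tau_n}(t) \to {}^{\circ}M(t)$ almost everywhere as $n \to \infty$. The sequence $\{{}^{\circ}[M_{\tau_n}](t)\}$ is bounded by ${}^{\circ}[M](t)$, which is integrable since $E({}^{\circ}[M](t)) \le {}^{\circ}E([M](t)) < \infty$. Also, ${}^{\circ}\big(M_{\tau_n}(t)^2\big) \le {}^{\circ}\max_{s\le t} M_s^2$, and since by Doob's inequality

$E\big({}^{\circ}\max_{s\le t} M_s^2\big) \le {}^{\circ}E\big(\max_{s\le t} M_s^2\big) \le 4\,{}^{\circ}E(M_t^2) < \infty$,

${}^{\circ}\max_{s\le t} M_s^2$ is integrable. Applying Lebesgue's convergence theorem to both sides of (7), we get

(8) $E\big({}^{\circ}M(0)^2 + {}^{\circ}[M](t)\big) = E\big({}^{\circ}M(t)^2\big)$.

Combining (6) and (8), we see that

${}^{\circ}E\big(M_0^2 + [M](t)\big) = E\big({}^{\circ}M_0^2 + {}^{\circ}[M](t)\big)$ iff ${}^{\circ}E(M_t^2) = E({}^{\circ}M_t^2)$.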
The result now follows from the characterization of S-integrability mentioned at the beginning of the proof. We mention one more result of the same kind [see Hoover and Perkins (1983a) for a proof]: If $M$ is an $SL^2$-martingale, then $\max_{s\le t} M_s^2$ is S-integrable. The next proposition shows that the class of $SL^2$-martingales is closed under stochastic integration.

4.4.4. PROPOSITION. If $M$ is an $SL^2$-martingale and $X \in SL^2(M)$, then $\int X\,dM$ is an $SL^2$-martingale.
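PROOF. We first consider the case where $X$ is S-bounded, i.e., $|X| \le n$ for some $n \in \mathbb{N}$. Then

$\Big[\int X\,dM\Big](1) = \sum X^2\,\Delta M^2 \le n^2\,[M](1)$,

and it follows from 4.4.3 that $\int X\,dM$ is an $SL^2$-martingale. Let us consider the general case of $X \in SL^2(M)$. There exists a sequence $\{X_n\}_{n\in\mathbb{N}}$ of S-bounded elements in $SL^2(M)$ such that ${}^{\circ}\!\int_{\Omega\times T} |X^2 - X_n^2|\,d\nu_M \to 0$. We have

${}^{\circ}E\Big(\Big|\Big[\int X\,dM\Big](1) - \Big[\int X_n\,dM\Big](1)\Big|\Big) = {}^{\circ}E\Big(\Big|\sum X^2\,\Delta M^2 - \sum X_n^2\,\Delta M^2\Big|\Big) \le {}^{\circ}\!\int_{\Omega\times T} |X^2 - X_n^2|\,d\nu_M \to 0$.

Since each $[\int X_n\,dM](1)$ is S-integrable, so is $[\int X\,dM](1)$, and the proposition follows from 4.4.3.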
B. Nonstandard Representations of Standard Stochastic Integrals
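Having made ourselves a little more familiar with $SL^2$-martingales, we now return to integration theory. Recall that $\mathrm{St} = \mathrm{id} \times \mathrm{st} : \Omega \times T \to \Omega \times [0,1]$.

4.4.5. LEMMA. Let $M$ be an $SL^2$-martingale which is S-right-continuous at 0, and let ${}^{\circ}M^{+}$ be its standard part. Then $\nu_{{}^{\circ}M^{+}}$ is the restriction of $L(\nu_M) \circ \mathrm{St}^{-1}$ to the predictable sets.

PROOF. It suffices to prove that $\nu_{{}^{\circ}M^{+}}$ and $L(\nu_M) \circ \mathrm{St}^{-1}$ agree on predictable rectangles. Let $B \in \mathcal{B}_s$; then

$\nu_{{}^{\circ}M^{+}}(B \times (s,t]) = E\big(1_B({}^{\circ}M^{+}(t) - {}^{\circ}M^{+}(s))^2\big)$.

Let $A$ be an internal set such that $L(P)(A \,\triangle\, B) = 0$ and $A \in \mathcal{A}_{\tilde s}$ for some $\tilde s \approx s$; such an $A$ exists by 4.3.1. We get

$L(\nu_M) \circ \mathrm{St}^{-1}(B \times (s,t]) = L(\nu_M) \circ \mathrm{St}^{-1}(A \times (s,t])$

$= \lim_{n\to\infty}\lim_{m\to\infty} {}^{\circ}E\big(1_A([M](t + \tfrac{1}{m}) - [M](s + \tfrac{1}{n}))\big)$

$= \lim_{n\to\infty}\lim_{m\to\infty} {}^{\circ}E\big(1_A(M(t + \tfrac{1}{m}) - M(s + \tfrac{1}{n}))^2\big)$

$= E\big(1_B({}^{\circ}M^{+}(t) - {}^{\circ}M^{+}(s))^2\big)$,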
where the S-integrability of $[M]$ has been used to switch between $B$ and $A$, and the S-integrability of $M^2$ to get the standard part inside the expectation. Pulling the limits inside the expectation is justified by a combination of Doob's inequality and Lebesgue's convergence theorem as in the proof of 4.4.3. It only remains to observe that since $M$ is S-right-continuous at 0,

$L(\nu_M) \circ \mathrm{St}^{-1}(B \times \{0\}) = 0 = \nu_{{}^{\circ}M^{+}}(B \times \{0\})$,

and the lemma is proved. The notion of lifting we shall need is the following.

4.4.6. DEFINITION. Let $M$ be an $SL^2$-martingale, and let $x : \Omega \times [0,1] \to \mathbb{R}$ be a predictable process in $L^2(\nu_{{}^{\circ}M^{+}})$. A 2-lifting of $x$ (with respect to $M$) is a nonanticipating process $X : \Omega \times T \to {}^{*}\mathbb{R}$ in $SL^2(M)$ such that ${}^{\circ}X(\omega, t) = x(\omega, {}^{\circ}t)$ for $L(\nu_M)$-a.a. $(\omega, t)$.
Remembering that we introduced $SL^2$-martingales which are S-right-continuous at 0 in order to have the conditions of Proposition 4.3.9 satisfied, the following result is not surprising.

4.4.7. LEMMA. Let $M$ be an $SL^2$-martingale which is S-right-continuous at 0. If $x \in L^2(\nu_{{}^{\circ}M^{+}})$, then $x$ has a 2-lifting with respect to $M$.

PROOF. By 4.3.9 $x$ has a nonanticipating lifting $X$; we must show that we can choose $X$ in $SL^2(\nu_M)$. For each $n \in \mathbb{N}$, let $x_n$ be the truncation of $x$ given by $x_n = (x \wedge n) \vee (-n)$. If $X_n$ is the corresponding truncation of $X$, we see that $X_n$ is a nonanticipating lifting of $x_n$. From 4.4.5 we get

${}^{\circ}\!\int X_n^2\,d\nu_M = \int x_n^2\,d\nu_{{}^{\circ}M^{+}}$.

Since $\int x_n^2\,d\nu_{{}^{\circ}M^{+}} \to \int x^2\,d\nu_{{}^{\circ}M^{+}}$, we can find an $\eta \in {}^{*}\mathbb{N} - \mathbb{N}$ such that

${}^{\circ}\!\int X_\eta^2\,d\nu_M = \int x^2\,d\nu_{{}^{\circ}M^{+}}$.

By Proposition 3.2.10 it follows that $X_\eta \in SL^2(\nu_M)$, and hence that it is a 2-lifting of $x$.
The next result shows that the particular choice of 2-lifting is irrelevant for stochastic integration.
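4.4.8. LEMMA. Let $M$ be an $SL^2$-martingale and let $x \in L^2(\nu_{{}^{\circ}M^{+}})$. If $X$ and $Y$ are 2-liftings of $x$, then there is a set $\Omega'$ of Loeb measure one such that for all $\omega \in \Omega'$ and all $t \in T$

${}^{\circ}\Big(\int X\,dM\Big)(\omega, t) = {}^{\circ}\Big(\int Y\,dM\Big)(\omega, t)$.

PROOF. By Doob's inequality

$E\Big(\max_{t\in T}\Big(\int_0^t (X - Y)\,dM\Big)^2\Big) \le 4E\Big(\Big(\int_0^1 (X - Y)\,dM\Big)^2\Big) = 4\int (X - Y)^2\,d\nu_M \approx 0$,

since $X - Y$ is in $SL^2(\nu_M)$ and is infinitesimal almost everywhere. The lemma follows. We can now prove the representation theorem.

4.4.9. THEOREM. Let $M$ be an $SL^2$-martingale that is S-right-continuous at 0, and assume $x \in L^2(\nu_{{}^{\circ}M^{+}})$. Then $x$ has a 2-lifting $X$, and

(9) $\int x\,d\,{}^{\circ}M^{+} = {}^{\circ}\Big(\int X\,dM\Big)^{+}$.

PROOF. All that remains to be done is to prove the equality. We first consider the case where $x$ is a simple process of the form

$x = \sum_{i=1}^{n} a_i 1_{B_i \times (s_i, t_i]}$,

where $B_i \in \mathcal{B}_{s_i}$. For each $i$, choose an $\tilde s_i \approx s_i$ such that ${}^{\circ}M(\tilde s_i) = {}^{\circ}M^{+}(s_i)$ almost everywhere, and such that there is an $A_i \in \mathcal{A}_{\tilde s_i}$ with $L(P)(A_i \,\triangle\, B_i) = 0$. Pick $\tilde t_i \approx t_i$ such that ${}^{\circ}M(\tilde t_i) = {}^{\circ}M^{+}(t_i)$ almost everywhere. Let

$X = \sum_{i=1}^{n} a_i 1_{A_i \times (\tilde s_i, \tilde t_i]}$;

then $X$ is a 2-lifting of $x$, and (9) obviously holds for this pair. To prove the general case, we only have to show that the map $x \mapsto {}^{\circ}\big(\int X\,dM\big)(1)$ is an isometry from $L^2(\Omega \times [0,1], \nu_{{}^{\circ}M^{+}})$ into $L^2(\Omega, L(P))$. But we have already observed that by 4.4.5

$\int x^2\,d\nu_{{}^{\circ}M^{+}} = {}^{\circ}\!\int X^2\,d\nu_M = {}^{\circ}E\Big(\Big(\int X\,dM\Big)^2(1)\Big) = E\Big({}^{\circ}\Big(\int X\,dM\Big)^2(1)\Big)$,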
where we use 4.4.4 in the last step. This proves the theorem. Recalling Lemma 4.3.10, we now get Theorem 4.1.6 as an immediate corollary of 4.4.9.
In spite of the result above there is a difference between the standard and the nonstandard definition of the stochastic integral that seems worthy of a remark. Since the standard definition only determines the integral process up to a null set for each $t$, questions about the path properties of this integral do not really make sense; we cannot ask whether the integral itself is continuous or has left and right limits, but only whether it has a representative with these properties. By Lemma 4.4.8 the nonstandard approach defines the stochastic integral up to indistinguishability; i.e., if ${}^{\circ}\big(\int X\,dM\big)^{+}$ and ${}^{\circ}\big(\int Y\,dM\big)^{+}$ are two versions of it, then there exists a null set such that outside this set the two versions agree for all $t$. Hence in this case it does make sense to ask for path properties, and it is reassuring to notice that the representatives picked by the nonstandard definition always seem to have the nicest properties possible (see, e.g., Proposition 4.2.15).

C. Quadratic Variation and Itô's Lemma
As the quadratic variation of an internal martingale has proved to be such a useful tool, it may be interesting to take a look at the corresponding standard notion. Let $N : \Omega \times [0,1] \to \mathbb{R}$ be an $L^2$-martingale. If $\pi = \{0 = t_0 < t_1 < t_2 < \cdots < t_n = 1\}$ is a partition of $[0,1]$, let $\delta(\pi) = \max_{0\le i<n}(t_{i+1} - t_i)$ be its mesh. Given a sequence $\{\pi_m\}$, $\pi_m = \{0 = t_0^m < t_1^m < \cdots < t_{n_m}^m = 1\}$, of such partitions with $\delta(\pi_m) \to 0$, it is natural to define the quadratic variation $[N]$ of $N$ by

(10) $[N](t) = \lim_{m\to\infty} \sum_{i=0}^{n_m - 1} \big(N(t_{i+1}^m \wedge t) - N(t_i^m \wedge t)\big)^2$,
where the limit is taken in the $L^1$-sense. There are two obvious questions; first whether the limit exists and is independent of the sequence $\{\pi_m\}$, and then whether

(11) $[{}^{\circ}M^{+}] = {}^{\circ}[M]^{+}$

for all internal $SL^2$-martingales $M$. Standard probabilists have answered the first question affirmatively; the quadratic variation exists and can be defined through (10) [see, e.g., Metivier and Pellaumail (1980) and Meyer (1976)]. As we have stated it, the answer to the second question is "no"; there are $SL^2$-martingales $M$ for which (11) does not hold; an example is given in Hoover and Perkins (1983a). However, we shall now introduce a class of $SL^2$-martingales for which (11) is true.
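The partition sums in (10) are easy to evaluate on a simulated path. The following is a minimal sketch under stated assumptions: $N$ is modeled by a piecewise constant random walk with steps $\pm\sqrt{1/n}$ on a grid of $n$ steps, and the helper names `partition_sum` and `scaled_walk` are hypothetical.

```python
import math
import random


def partition_sum(path_value, partition, t):
    """Sum of squared increments of N over `partition`, truncated at time t as in (10)."""
    total = 0.0
    for left, right in zip(partition, partition[1:]):
        increment = path_value(min(right, t)) - path_value(min(left, t))
        total += increment ** 2
    return total


def scaled_walk(n_steps, seed=0):
    """Return a function t -> N(t) for a walk with steps +/- sqrt(1/n_steps)."""
    rng = random.Random(seed)
    step = math.sqrt(1.0 / n_steps)
    values = [0.0]
    for _ in range(n_steps):
        values.append(values[-1] + rng.choice((-step, step)))

    def path_value(t):
        # Piecewise constant path: value at the last grid point not exceeding t.
        index = min(int(math.floor(t * n_steps + 1e-9)), n_steps)
        return values[index]

    return path_value


n_steps = 1024
walk = scaled_walk(n_steps)
grid = [k / n_steps for k in range(n_steps + 1)]
fine_sum = partition_sum(walk, grid, 0.5)          # sum along the walk's own grid
coarse_sum = partition_sum(walk, [0.0, 0.5, 1.0], 1.0)  # sum along a two-interval partition
```

Along the walk's own grid every squared increment equals $1/n$, so the sum up to time $t$ is $t$ on every path; along a coarse partition the sum is random, which is why the limit in (10) is taken in the $L^1$-sense rather than pathwise.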
4.4.10. DEFINITION. An internal function $f : T \to {}^{*}\mathbb{R}$ is called well-behaved if for each $t \in [0,1]$ there is a $\tilde t \in T$, $\tilde t \approx t$, such that for all $s \approx t$, $s \le \tilde t$,

$f(s) \approx f(\tilde t)$,

and for all $s \approx t$, $s > \tilde t$,

$f(s) \approx f(\tilde t')$,

where $\tilde t'$ denotes the successor of $\tilde t$ in $T$. A process is well-behaved if almost all its paths are. A well-behaved process is simply a process which "jumps" at most once per monad.

REMARK. The nondescript term "well-behaved" was introduced by Lindstrøm (1980b), who could not come up with anything better. Unfortunately, Hoover and Perkins' (1983a) terminology is hardly an improvement (a "well-behaved process" is to them an "SDJ-process"), although it makes a certain sense in their setting. We shall continue to use the worst alternative "well-behaved" in the hope that its sheer awfulness will force somebody to invent a better name for it.
To prove that $[{}^{\circ}M^{+}] = {}^{\circ}[M]^{+}$ for all well-behaved $SL^2$-martingales $M$, we need the following lemma.

4.4.11. LEMMA. Let $M : \Omega \times T \to {}^{*}\mathbb{R}$ be a well-behaved $SL^2$-martingale. If $S$ is a hyperfinite subline of $T$ and $M^S$ is the restriction of $M$ to $S$, then

(12) $[M](s) \approx [M^S](s)$ a.e.

for all $s \in S$.
PROOF. Let $\tau_n(\omega) = \inf\{t \in T : |M(\omega, t)| \ge n\}$. Since $M_{\tau_n} \in SL^2(M_{\tau_n})$, and for a.a. $\omega$ we have $\tau_n(\omega) = 1$ for sufficiently large $n \in \mathbb{N}$, it suffices to prove the lemma when $M \in SL^2(M)$. By Lemma 4.2.5

(13) $[M](t) = M(t)^2 - M(0)^2 - 2\int_0^t M\,dM$

and

(14) $[M^S](t) = M^S(t)^2 - M(0)^2 - 2\int_0^t M^S\,dM^S$.

Note that $\int_0^t M^S\,dM^S = \int_0^t \tilde M^S\,dM$ where $\tilde M^S : \Omega \times T \to {}^{*}\mathbb{R}$ is defined by letting $\tilde M^S(\omega, t)$ equal $M^S(\omega, s)$ for the largest $s$ in $S$ smaller than $t$. Since $M$ is well behaved, $M$ and $\tilde M^S$ are both 2-liftings of the left standard part ${}^{\circ}M^{-}$ of $M$, and hence

(15) $\int_0^t M\,dM \approx \int_0^t \tilde M^S\,dM$ a.e.

by Lemma 4.4.8. Putting this into (13) and (14), the lemma follows.
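For finite paths, identity (13) is pure algebra and the effect of passing to a subline can be seen directly. The sketch below is a finite analogue only, with hypothetical helper names; the gap between two consecutive subline points plays the role of a monad, which is an assumption of this illustration and not the hyperfinite definition.

```python
def quadratic_variation(path):
    """Sum of squared increments along the full path."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))


def ito_sum(path):
    """Discrete integral of the path against itself: sum of M(s) * (M(s') - M(s))."""
    return sum(a * (b - a) for a, b in zip(path, path[1:]))


def restrict(path, indices):
    """Restriction of the path to a coarser time line given by `indices`."""
    return [path[i] for i in indices]


# Identity (13) on an arbitrary integer path.
path = [2, 5, 3, 3, -1]
identity_holds = (
    quadratic_variation(path) == path[-1] ** 2 - path[0] ** 2 - 2 * ito_sum(path)
)

subline = [0, 3]
single_jump = [0, 0, 1, 1]  # one jump between the two subline points
double_jump = [0, 1, 0, 0]  # two jumps that cancel between the subline points
```

With one jump per gap the restricted path has the same quadratic variation as the full path; with two cancelling jumps the full path has quadratic variation 2 while the restricted path has 0, which is the finite shadow of jumps inside a monad being missed.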
The reader may feel that this proof is unnecessarily roundabout. Why are we using Lemma 4.2.5 to reduce the problem to a question about stochastic integrals, and not attacking it directly through the definition of the quadratic variation? The answer is simply that we do not know of a more direct proof; the trick above was used by both Hoover and Perkins (1983a) and Lindstrøm (1980b).

4.4.12. PROPOSITION. If $M$ is a well-behaved $SL^2$-martingale that is S-continuous at 0, then $[{}^{\circ}M^{+}]$ exists and equals ${}^{\circ}[M]^{+}$. Moreover,

(16) $[{}^{\circ}M^{+}](t) = {}^{\circ}M^{+}(t)^2 - {}^{\circ}M^{+}(0)^2 - 2\int_0^t {}^{\circ}M^{-}\,d\,{}^{\circ}M^{+}$.

Before we give the proof, let us say how to interpret the stochastic integral in (16). The point is that since ${}^{\circ}M^{-}$ need not belong to the class $L^2(\nu_{{}^{\circ}M^{+}})$, this integral is not covered by the definition given at the beginning of this section. But the solution is simple: put $\int {}^{\circ}M^{-}\,d\,{}^{\circ}M^{+} = \lim_{n\to\infty} \int {}^{\circ}M^{-}_{\sigma_n}\,d\,{}^{\circ}M^{+}$, where $\{\sigma_n\}$ is a sequence of stopping times increasing to 1 such that ${}^{\circ}M^{-}_{\sigma_n} \in L^2(\nu_{{}^{\circ}M^{+}})$ for each $n$. To prove the proposition, fix $t \in [0,1]$ and choose $\tilde t \in T$, $\tilde t \approx t$, so large that ${}^{\circ}M(\tilde t) = {}^{\circ}M^{+}(t)$ almost everywhere. Given a sequence $\{\pi_m\}_{m\in\mathbb{N}}$ of partitions of $[0,1]$ with mesh going to zero, construct a sequence $\{\tilde\pi_m\}_{m\in\mathbb{N}}$ of internal partitions of $T$ by choosing for each $t_i^m$ a $\tilde t_i^m \in T$, $\tilde t_i^m \approx t_i^m$, such that ${}^{\circ}M(\tilde t_i^m) = {}^{\circ}M^{+}(t_i^m)$. Extend to an internal sequence $\{\tilde\pi_m\}_{m\in{}^{*}\mathbb{N}}$. For each $m \in {}^{*}\mathbb{N} - \mathbb{N}$, we have by the lemma that

$[M](\tilde t) \approx [M^{\tilde\pi_m}](\tilde t)$ a.e.,
and the first part of the proposition follows. To prove (16), let $\tau_n(\omega) = \min\{t \in T : |M(\omega, t)| \ge n\}$ and put $\sigma_n = {}^{\circ}\tau_n$. Then $M_{\tau_n} \in SL^2(M_{\tau_n})$ is a 2-lifting of ${}^{\circ}M^{-}_{\sigma_n}$, and thus

$\int {}^{\circ}M^{-}_{\sigma_n}\,d\,{}^{\circ}M^{+} = {}^{\circ}\Big(\int M_{\tau_n}\,dM_{\tau_n}\Big)^{+}$.

If we combine this with the identity

$[M_{\tau_n}](t) = M_{\tau_n}(t)^2 - M_{\tau_n}(0)^2 - 2\int_0^t M_{\tau_n}\,dM_{\tau_n}$

and take the limit as $n$ goes to infinity, we get (16). Not much reflection should be needed to realize why 4.4.12 fails for $SL^2$-martingales in general; if $M$ is not well behaved it will have jumps inside monads which will be counted by $[M]$ but missed by $[{}^{\circ}M^{+}]$. That the proposition is useful despite the restriction to well-behaved processes
is guaranteed by the following fact: if $X$ is an internal process with S-left and S-right limits, there is a subline $S$ such that the restriction $X^S$ of $X$ to $S$ is well behaved. Thus we can always assume that our processes are well behaved by restricting them to a coarser time line if necessary. We shall not prove this fact here as proofs are easily found in Hoover and Perkins (1983a), Stroyan and Bayod (1985), and Lindstrøm (1980b), but would like to make the following observation first pointed out by Hoover and Perkins: if $N$ is the right standard part of an $SL^2$-martingale $M$, we can always choose $M$ to be well behaved. Thus Proposition 4.4.12 establishes the existence of the quadratic variation $[N]$ and the formula (16) for all such processes. As an application of Proposition 4.4.12, we shall prove the following generalization of Itô's lemma.

4.4.13. PROPOSITION. Let $N : \Omega \times [0,1] \to \mathbb{R}$ be the right standard part of an S-continuous $SL^2$-martingale $M$. If $\varphi : \mathbb{R} \times [0,1] \to \mathbb{R}$ is twice continuously differentiable in the first variable and once in the second variable, then

(17) $\varphi(N(t), t) = \varphi(N(0), 0) + \int_0^t \frac{\partial\varphi}{\partial x}(N(s), s)\,dN(s) + \int_0^t \frac{\partial\varphi}{\partial t}(N(s), s)\,ds + \frac{1}{2}\int_0^t \frac{\partial^2\varphi}{\partial x^2}(N(s), s)\,d[N](s)$.
Each of the three last terms equals the corresponding term on the right-hand side of (17). The last of these equalities uses the fact that $[N] = {}^{\circ}[M]^{+}$. There is a version of Itô's lemma for discontinuous martingales also, but it is more complicated as we must add each jump term separately [see Lindstrøm (1980a) for a nonstandard treatment]. The continuous version above suffices for our purposes in this book. Let us end our discussion of well-behaved martingales on a cautious note; although "well-behaved" with respect to the quadratic variation, the class is not closed under stochastic integration: if $M$ is a well-behaved
martingale and $X$ is a lifting, then $\int X\,dM$ is well behaved, but this is not always true for general $X \in SL^2(M)$ [see Lindstrøm (1980b)]. However, Hoover and Perkins (1983b) have shown that it is possible to find a subline such that the restriction of the martingale to this subline and all its integrals are well behaved.

D. Further Remarks on Nonstandard Representations
We have shown that the standard theory of stochastic integration with respect to the right standard part ${}^{\circ}M^{+}$ of an internal martingale $M$ can be reduced to the nonstandard theory of stochastic integration with respect to $M$ itself. To claim on this basis that standard stochastic integration in general can be reduced to nonstandard stochastic integration is still somewhat premature; what about the martingales that are not the standard parts of internal martingales? A case in point is Proposition 4.4.13; do we really need the awkward formulation "$N$ is the right standard part of an S-continuous martingale," or does it suffice to assume that $N$ itself is a continuous martingale? In this subsection we shall take a brief look at this and a few related questions. The reader who does not care for such niceties, and who believes that the taste is a better proof of the pudding than a close scrutiny of the recipe, may skip this discussion and let the applications in the next sections convince him or her that the nonstandard theory is rich enough for all practical purposes. By a saturation argument it is quite easy to show that if $(\Omega, \mathcal{A}, P)$ is an internal probability space and $N : \Omega \times [0,1] \to \mathbb{R}$ is a martingale with respect to a filtration $(\Omega, \{\mathcal{G}_t\}, L(P))$, then $N$ is the standard part of an internal martingale in the following sense. There is an internal filtration $(\Omega, \{\mathcal{A}_t\}_{t\in T}, P)$ and an internal martingale $M : \Omega \times T \to {}^{*}\mathbb{R}$ adapted to it such that $N = {}^{\circ}M^{+}$. Moreover, $\mathcal{G}_t \subset \mathcal{B}_t$, where $\{\mathcal{B}_t\}$ is the standard filtration generated by $\{\mathcal{A}_t\}$. This answers our question about Proposition 4.4.13; it suffices to assume that $N$ is a continuous martingale. But not all processes live on Loeb spaces; if $N : Z \times [0,1] \to \mathbb{R}$ is a martingale with respect to some general filtration $(Z, \{\mathcal{F}_t\}, Q)$, what do we then do?
It turns out that $N$ can be represented by a martingale $N^{\Theta}$ adapted to a Loeb-space filtration $(\Omega, \{\mathcal{G}_t\}, L(P))$ through a measure-preserving, Boolean $\sigma$-homomorphism $\Theta$, mapping Loeb measurable sets into $Q$-measurable sets. As we have already remarked, $N^{\Theta}$ is the right standard part of an internal martingale $M$, and by using the map $\Theta$ all stochastic integrals with respect to $N$ can be interpreted as nonstandard integrals with respect to $M$ [see Lindstrøm (1980c) for the details]. Although this procedure completes the reduction of the standard theory to the nonstandard, it does not seem to be of any great significance. The reason is that probabilists usually do not care which probability space they are working with as long as it is "rich" enough to support the phenomenon they are studying. Keisler (1984, 1985) and Hoover and Keisler (1984) have shown that Loeb spaces are extremely rich in this respect; whatever happens on some probability space also happens on a Loeb space. This universality property implies that we need not care about other spaces; we can always assume that we are working in a Loeb setting. To obtain their results, Hoover and Keisler use a notion of elementary equivalence taken from probability logic, which gives a classification of processes much finer than those habitually used by probabilists. For the reader with some background in logic, we mention that Loeb spaces play much the same role in probability theory as saturated models do in first-order model theory. Additional information about probability logic can be found in Keisler (1977, 1985), Hoover (1978, 1982, 1985), Rodenhausen (1982a), Fajardo (1984, 1985), and Ross (1984).
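E. Standard Representations of Nonstandard Stochastic Integrals

If standard stochastic integrals can be obtained from hyperfinite ones, a natural question is whether the converse also holds; if $X \in SL^2(M)$, can we find a standard process $N$ somehow related to $M$, and an $x \in L^2(\nu_N)$ such that $\int x\,dN = {}^{\circ}\big(\int X\,dM\big)^{+}$? In the remainder of this section we shall discuss various formulations of this problem. The following example shows that the answer is no if we insist on having $N = {}^{\circ}M^{+}$.

4.4.14. EXAMPLE. We are in the setting of Section 4.1; our time line is $\{0, \Delta t, 2\Delta t, \ldots, 1\}$, and $\chi : \Omega \times T \to {}^{*}\mathbb{R}$ is Anderson's random walk. Let $X : \Omega \times T \to {}^{*}\mathbb{R}$ be given by

$X(\omega, k\,\Delta t) = (-1)^k$;

obviously $X \in SL^2(\chi)$. If $\beta = {}^{\circ}\chi^{+}$ and $x \in L^2(\nu_\beta)$, we shall show that $\int x\,d\beta \ne {}^{\circ}\big(\int X\,d\chi\big)^{+}$. Let $Y$ be a 2-lifting of $x$; we may assume that $Y(\omega, 2k\,\Delta t) = Y(\omega, (2k+1)\,\Delta t)$ for all $k$. We have

$E\Big(\Big({}^{\circ}\Big(\int X\,d\chi\Big)(1) - {}^{\circ}\Big(\int Y\,d\chi\Big)(1)\Big)^2\Big) = {}^{\circ}E\Big(\sum_{s<1} (X - Y)^2(s)\,\Delta t\Big) = {}^{\circ}E\Big(\sum_{k} \big[(1 - Y(2k\,\Delta t))^2 + (1 + Y(2k\,\Delta t))^2\big]\,\Delta t\Big)$.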
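The last expression is smallest when $Y$ is identically zero, and hence

$E\Big(\Big({}^{\circ}\Big(\int X\,d\chi\Big)(1) - \Big(\int x\,d\beta\Big)(1)\Big)^2\Big) \ge 1$

for all $\beta$-integrable $x$.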
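The lower bound in this example can be checked by brute force on a short time line. The sketch below makes simplifying assumptions that are not in the text: the time line has only finitely many steps, and $Y$ is deterministic and constant on each pair of steps $2j, 2j+1$ (in the example $Y$ may also depend on $\omega$ in a nonanticipating way). The names `squared_distance` and `closed_form` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def squared_distance(pair_values):
    """Exact E[(sum_k (X_k - Y_k) * omega_k * sqrt(dt))^2] over all +/-1 paths.

    X_k = (-1)**k, and Y takes the value pair_values[j] on steps 2j and 2j+1.
    """
    n_steps = 2 * len(pair_values)
    dt = Fraction(1, n_steps)
    coefficients = [(-1) ** k - pair_values[k // 2] for k in range(n_steps)]
    weight = Fraction(1, 2 ** n_steps)
    second_moment = Fraction(0)
    for omega in product((-1, 1), repeat=n_steps):
        total = sum(c * w for c, w in zip(coefficients, omega))
        second_moment += weight * total ** 2
    return second_moment * dt  # the factor sqrt(dt)**2 = dt is pulled out


def closed_form(pair_values):
    """Value 1 + 2 * dt * sum(Y_j^2) obtained by expanding (1 - Y)^2 + (1 + Y)^2."""
    dt = Fraction(1, 2 * len(pair_values))
    return 1 + 2 * dt * sum(v * v for v in pair_values)


example = [Fraction(1, 2), Fraction(-1)]
distance = squared_distance(example)
```

Since $(1 - Y)^2 + (1 + Y)^2 = 2 + 2Y^2$ on each pair of steps, the distance is $1 + 2\,\Delta t \sum_j Y_j^2$, which is at least 1 and equals 1 only for $Y \equiv 0$.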
As observed by Rodenhausen, this calculation can be used to prove

4.4.15. PROPOSITION. Let $M$ be an $SL^2$-martingale that is S-right continuous at the origin, and let $X \in SL^2(M)$. If $x \in L^2(\nu_{{}^{\circ}M^{+}})$ is such that $\int x\,d\,{}^{\circ}M^{+} = {}^{\circ}\big(\int X\,dM\big)^{+}$, then $X$ is a 2-lifting of $x$.

PROOF. Let $Y$ be a 2-lifting of $x$; then $\int_0^1 X\,dM \approx \int_0^1 Y\,dM$ almost everywhere. Hence

${}^{\circ}\!\int (X - Y)^2\,d\nu_M = {}^{\circ}E\Big(\Big(\Big(\int X\,dM\Big)(1) - \Big(\int Y\,dM\Big)(1)\Big)^2\Big) = 0$.
Since $Y$ is a 2-lifting of $x$, so is $X$, and the proposition is proved. The example and the proposition above show that the class of hyperfinite stochastic integrals with respect to $M$ is much larger than the class of standard stochastic integrals with respect to ${}^{\circ}M^{+}$. Before we go on to explore this extra richness, let us introduce a note of caution to the discussion by stating

4.4.16. PROPOSITION. Let $M : \Omega \times T \to {}^{*}\mathbb{R}$ be an $SL^2$-martingale that is S-continuous at the origin. Let $N : \Omega \times [0,1] \to \mathbb{R}$ be an $L^2$-martingale such that for each hyperfinite subline $S$ of $T$, there is a process $X^S \in SL^2(M^S)$ such that $N = {}^{\circ}\big(\int X^S\,dM^S\big)^{+}$. Then there is an $x \in L^2(\nu_{{}^{\circ}M^{+}})$ such that $N = \int x\,d\,{}^{\circ}M^{+}$.
The proof is not hard, and we leave it to the reader. Proposition 4.4.16 may be interpreted as saying that as long as we only use $M$ as an arbitrary representation of its standard part, the extra power of hyperfinite stochastic integration may not be very useful since it depends solely on the representation and not on the original process. However, it is not quite clear that this interpretation is correct since for any hyperfinite representation we get a much richer class of stochastic integrals by allowing integrands that are not liftings, and it may be that it is this extra richness that is important and not the representation of some particular process as a stochastic integral. Returning to our original question we see that we have no chance of finding an $x$ such that $\int x\,dN = {}^{\circ}\big(\int X\,dM\big)^{+}$ if $N = {}^{\circ}M^{+}$. The next question is what happens if we allow a looser relationship between $N$ and $M$. We shall answer this question in the case of Brownian motion. From now on we work in the setting of Section 4.1, and let $\Omega$, $P$, $\{\mathcal{A}_t\}$, $\{\mathcal{B}_t\}$, and $\chi$ be as in that section. Our aim is to prove

4.4.17. THEOREM. Let $\chi$ be Anderson's random walk and $X \in SL^2(\chi)$. Then there exist a Brownian motion $\beta$ and an adapted process $x \in L^2(\Omega \times [0,1])$ such that

$\int x\,d\beta = {}^{\circ}\Big(\int X\,d\chi\Big)^{+}$.
Before we turn to the proof, let us make a few comments. If $X$ is not a lifting, Proposition 4.4.15 tells us that $\beta \ne {}^{\circ}\chi^{+}$. A natural extension of 4.4.17 would be that given an $SL^2$-martingale $M$ and an $X \in SL^2(M)$, there always exist an $L^2$-martingale $N$ having the same finite-dimensional distributions as ${}^{\circ}M^{+}$ and an $x \in L^2(\nu_N)$ such that

$\int x\,dN = {}^{\circ}\Big(\int X\,dM\Big)^{+}$.

This turns out to be false even if $M$ is assumed to be continuous [see Lindstrøm (1985) for an example]. On the other hand, the generalization of 4.4.17 to $n$-dimensional Brownian motions does hold, and with a proof that is basically the same as the one we shall give for the one-dimensional case [see Lindstrøm (1985)]. Let us agree that by a Brownian motion $\beta$ adapted to a filtration $(\Omega, \{\mathcal{F}_t\}, P)$, we mean a continuous martingale with respect to this filtration such that for all $t > s$, $\beta(t) - \beta(s)$ is independent of $\mathcal{F}_s$ and Gaussian distributed with mean zero and variance $t - s$. Before we can prove 4.4.17 we need a result which can help us in recognizing Brownian motions. The following is a standard result (Lévy, 1948) in hyperfinite disguise:

4.4.18. PROPOSITION. Let $M : \Omega \times T \to {}^{*}\mathbb{R}$ be an S-continuous $SL^2$-martingale such that $E\big({}^{\circ}[M]^{+}(t) - {}^{\circ}[M]^{+}(s) \mid \mathcal{B}_s\big) = t - s$ for all $t > s$. Then ${}^{\circ}M^{+}$ is a Brownian motion with respect to $(\Omega, \{\mathcal{B}_s\}, L(P))$.

PROOF. Applying 4.4.13, Itô's lemma, to the martingale $M$ and the function $\varphi(x) = e^{iyx}$, we get

$\exp(iy(M_t - M_s)) = 1 + iy\int_s^t \exp(iy(M_r - M_s))\,dM(r) - \frac{y^2}{2}\int_s^t \exp(iy(M_r - M_s))\,d[M](r)$.
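Let $B \in \mathcal{B}_s$. If we take the standard part of the equation above, then multiply by $1_B$ and take expectations, we get

$\int_B \exp\big(iy({}^{\circ}M_t^{+} - {}^{\circ}M_s^{+})\big)\,dL(P) = L(P)(B) - \frac{y^2}{2}\int_s^t \int_B \exp\big(iy({}^{\circ}M_r^{+} - {}^{\circ}M_s^{+})\big)\,dL(P)\,dr$.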
Since the unique solution to the integral equation

$q(t) = L(P)(B) - \frac{y^2}{2}\int_s^t q(r)\,dr$

is $q(t) = L(P)(B)\exp\big[-\frac{y^2}{2}(t - s)\big]$, we see that, conditionally on any $B \in \mathcal{B}_s$ of positive measure, ${}^{\circ}M^{+}(t) - {}^{\circ}M^{+}(s)$ is Gaussian distributed with mean zero and variance $t - s$; in particular the increment is independent of $\mathcal{B}_s$. This proves the proposition. We now turn to the proof of 4.4.17. Let $M = \int X\,d\chi$.
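As a side check on the integral equation used in the proof of 4.4.18, one can verify numerically that the exponential $c\,\exp[-\tfrac{y^2}{2}(t - s)]$ with $c = L(P)(B)$ satisfies $q(t) = c - \tfrac{y^2}{2}\int_s^t q(r)\,dr$. This is a minimal sketch; the names `q` and `residual` and the parameter `mass` (standing in for $L(P)(B)$) are hypothetical, and the integral is approximated by a midpoint rule.

```python
import math


def q(t, s, y, mass):
    """Candidate solution mass * exp(-y^2 (t - s) / 2)."""
    return mass * math.exp(-0.5 * y * y * (t - s))


def residual(t, s, y, mass, n=20000):
    """q(t) minus the right-hand side of the integral equation (midpoint rule)."""
    h = (t - s) / n
    integral = sum(q(s + (k + 0.5) * h, s, y, mass) for k in range(n)) * h
    return q(t, s, y, mass) - (mass - 0.5 * y * y * integral)


check = residual(t=0.9, s=0.2, y=1.7, mass=0.35)
```

The residual vanishes up to quadrature error, in line with the fact that differentiating the integral equation gives $q' = -\tfrac{y^2}{2} q$ with $q(s) = L(P)(B)$.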
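First notice that since $[M] = \int X^2\,dt$, the standard part ${}^{\circ}[M]^{+}$ is increasing and absolutely continuous. Hence

$f(\omega, t) = \frac{d}{dt}\,{}^{\circ}[M]^{+}(t) = \lim_{h\downarrow 0} \frac{{}^{\circ}[M]^{+}(t) - {}^{\circ}[M]^{+}(t - h)}{h}$

exists, and is a non-negative, adapted process with

(18) ${}^{\circ}[M]^{+}(\omega, t) = \int_0^t f(\omega, s)\,ds$ a.e.

Define a new adapted process $g$ by

$g(\omega, t) = f(\omega, t)^{-1/2}$ if $f(\omega, t) \ne 0$, and $g(\omega, t) = 0$ otherwise,

and let $1_g$ be the characteristic function of the set $\{(\omega, t) \mid g(\omega, t) = 0\}$. Since by (18)

$\int g^2\,d\nu_{{}^{\circ}M^{+}} = E\Big(\int_0^1 g^2 f\,dt\Big) \le 1$,

we have $g \in L^2(\nu_{{}^{\circ}M^{+}})$. Let $G \in SL^2(M)$ be a 2-lifting of $g$, and $1_G$ a lifting of $1_g$. We may assume that $G \cdot 1_G = 0$. Define

(19) $\beta(\omega, t) = {}^{\circ}\Big(\int_0^t G(\omega, s)\,dM(\omega, s) + \int_0^t 1_G(\omega, s)\,d\chi(\omega, s)\Big)$.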
Since $G$ and $1_G$ have disjoint supports

${}^{\circ}\Big[\int G\,dM + \int 1_G\,d\chi\Big]^{+}(t) = \int_0^t g^2 f\,ds + \int_0^t 1_g\,ds = \int_0^t 1\,ds = t$,

and combining this with 4.4.18 we see that $\beta$ is a Brownian motion adapted to $(\Omega, \{\mathcal{B}_t\}, L(P))$. By (18) we have $f^{1/2} \in L^2(\nu_\beta)$, and

$\int f^{1/2}\,d\beta = \int f^{1/2} g\,d\,{}^{\circ}M^{+} + \int f^{1/2} 1_g\,d\,{}^{\circ}\chi^{+} = \int f^{1/2} g\,d\,{}^{\circ}M^{+}$,

since $f^{1/2} 1_g = 0$. It only remains to prove ${}^{\circ}M^{+} = \int f^{1/2} g\,d\,{}^{\circ}M^{+}$, since we then get 4.4.17 by putting $x = f^{1/2}$. By Doob's inequality

$E\Big(\max_{t\le 1}\Big({}^{\circ}M^{+}(t) - \int_0^t f^{1/2} g\,d\,{}^{\circ}M^{+}\Big)^2\Big) \le 4E\Big(\Big({}^{\circ}M^{+}(1) - \int_0^1 f^{1/2} g\,d\,{}^{\circ}M^{+}\Big)^2\Big)$

$= 4E\Big(\int_0^1 (1 - f^{1/2} g)^2\,d\,{}^{\circ}[M]^{+}\Big) = 4E\Big(\int_0^1 (1 - f^{1/2} g)^2 f\,dt\Big) = 0$,

since $f^{1/2} g = 1$ whenever $f \ne 0$. Theorem 4.4.17 is proved. This proof is not new; this technique for proving that processes are integrals of Brownian motions goes back to Doob. One way of using 4.4.17 is the following. Assume that we know a certain property holds for all standard integrals of Brownian motions. A priori there is no reason to believe that this property should also hold for all hyperfinite integrals, but by 4.4.17 all such integrals are standard integrals, and thus the property extends. However, Theorem 4.4.17 is not really a standard characterization of nonstandard integrals since it does not tell us which Brownian motions $\beta$ should be allowed. This question is answered in Lindstrøm (1986) and we shall give a brief account of the main results.
The absolute joint variation of two internal processes $M_1$ and $M_2$ is defined as

(20) $[\![M_1, M_2]\!](\omega, t) = \sum_{s=0}^{t} \varepsilon(\omega, s)\,\Delta M_1(\omega, s)\,\Delta M_2(\omega, s)$,

where $\varepsilon(\omega, s) = \operatorname{sgn} E\big(\Delta M_1(s)\,\Delta M_2(s) \mid \mathcal{A}_s\big)(\omega)$. We say that $M_1$ and $M_2$ are absolutely orthogonal if $[\![M_1, M_2]\!](1) \approx 0$ a.e. If $M_2$ is absolutely orthogonal to all S-continuous $SL^2$-martingales that are absolutely orthogonal to $M_1$, we say that $M_2$ is in the absolute span of $M_1$.

4.4.19. PROPOSITION. Let $N$ be an S-continuous $SL^2$-martingale with $N(0) = 0$. Then $N = \int X\,d\chi$ for some $X \in SL^2(\chi)$ if and only if $N$ is in the absolute span of $\chi$.
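The nonstandard version has a standard counterpart which generalizes (10). If $M, N : \Omega \times [0,1] \to \mathbb{R}$ are continuous $L^2$-martingales and $\pi = \{0 = t_0 < t_1 < \cdots < t_n = 1\}$ is a partition of $[0,1]$ we let

(21) $[\![M, N]\!]_\pi = \sum_{i=0}^{n-1} \varepsilon_\pi(\omega, t_i)\,\big(M(t_{i+1}) - M(t_i)\big)\big(N(t_{i+1}) - N(t_i)\big)$,

where

$\varepsilon_\pi(\omega, t_i) = \operatorname{sgn} E\big((M(t_{i+1}) - M(t_i))(N(t_{i+1}) - N(t_i)) \mid \mathcal{B}_{t_i}\big)(\omega)$,

and the absolute joint variation $[\![M, N]\!]$ is obtained as the limit of $[\![M, N]\!]_{\pi_n}$, where $\{\pi_n\}$ is a sequence of partitions with $\delta(\pi_n) \to 0$. A lemma is needed to show that this is a well-defined notion at least for all continuous $L^2$-martingales with respect to $(\Omega, \{\mathcal{B}_t\}, L(P))$. We have the associated notion of absolute span.

4.4.20. THEOREM. Let $\chi : \Omega \times T \to {}^{*}\mathbb{R}$ be Anderson's random walk. A process $N : \Omega \times [0,1] \to \mathbb{R}$ is of the form $N = {}^{\circ}\big(\int X\,d\chi\big)^{+}$ with $X \in SL^2(\chi)$ iff there is a Brownian motion $\beta$ in the absolute span of ${}^{\circ}\chi^{+}$ and an adapted $x \in L^2(\Omega \times [0,1])$ such that $N = \int x\,d\beta$.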
The results in the first half of this section are adapted from Lindstrøm (1980b) and Hoover and Perkins (1983a,b), but the treatment is closely modeled on the theory of Anderson (1976) and Keisler (1984) for Brownian motions. Theorem 4.4.17 is from Lindstrøm (1985) and the last two results from Lindstrøm (1986).
4.5. STOCHASTIC DIFFERENTIAL EQUATIONS
We have completed our account of the fundamentals of the hyperfinite theory for stochastic integration and its relationship to the standard theory. In the sections which remain of this chapter, we shall take a look at various applications and extensions of the theory, trying to show by example what we have so far only postulated, namely, the strength and the flexibility of the nonstandard approach.

A. Itô Equations with Continuous Coefficients
In this section we shall discuss the existence of solutions of stochastic differential equations of the form

(1) $x(\omega, t) = x_0(\omega) + \int_0^t f(s, x(\omega, s))\,ds + \int_0^t g(s, x(\omega, s))\,db(\omega, s)$,

where $b$ is an $n$-dimensional Brownian motion on a suitable Loeb space $\Omega$, and $f$ and $g$ are functions

(2) $f : [0,1] \times \mathbb{R}^n \to \mathbb{R}^n$, $\quad g : [0,1] \times \mathbb{R}^n \to \mathbb{R}^n \otimes \mathbb{R}^n$,
where $\mathbb{R}^n \otimes \mathbb{R}^n$ is the space of real $n \times n$ matrices. First of all we must explain what this means.

We have thus far restricted our discussion to real-valued processes, since the multidimensional theory for stochastic integration is a trivial extension of the one-dimensional. However, for stochastic differential equations the situation is quite different; here the multidimensional theory is far more complex, and one-dimensional techniques often have no extensions to the general case. To show the strength and applicability of our approach we shall have to work in a multidimensional setting.

Let $T = \{0, 1/\eta, 2/\eta, \ldots, 1\}$ be a hyperfinite time line. As our sample space we shall use $\Omega = \{-1,1\}^{T \times \{1,2,\ldots,H\}}$, the set of all internal functions from $T \times \{1,2,\ldots,H\}$ to $\{-1,1\}$, for some $H \in {}^*\mathbb{N}$, $H \ge n$. If $\omega \in \Omega$, we often write $\omega_i(t)$ for $\omega(t,i)$. Let $\{e_1, e_2, \ldots, e_n\}$ be the natural basis for $\mathbb{R}^n$, and define an $n$-dimensional Anderson process by

(3) $\chi(\omega,t) = \sum_{s=0}^{t} \sum_{i=1}^{n} \frac{\omega_i(s)}{\sqrt{\eta}}\, e_i.$

Notice that $\chi$ consists of $n$ independent random walks $\chi_1, \chi_2, \ldots, \chi_n$ running in orthogonal directions. The standard part of $\chi$ is an $n$-dimensional Brownian motion with respect to the Loeb measure $L(P)$ of the normalized counting measure $P$.
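As a finite illustration of the Anderson process, the following Python sketch replaces the hyperfinite time line by an ordinary grid with `steps` points. It assumes the usual normalization in which each coordinate moves by $\pm\sqrt{\Delta t}$ per time step; the function names are hypothetical and chosen only for this example.

```python
import math
import random

def anderson_walk(n, steps, rng):
    """Return the positions chi(k/steps), k = 0..steps, of a walk in R^n.

    Each coordinate moves by +sqrt(dt) or -sqrt(dt) per step, driven by
    independent fair coin flips (the finite analogue of omega_i(t)).
    """
    dt = 1.0 / steps
    scale = math.sqrt(dt)
    path = [[0.0] * n]
    for _ in range(steps):
        prev = path[-1]
        path.append([prev[i] + rng.choice((-1, 1)) * scale for i in range(n)])
    return path

def quadratic_variation(path, i):
    """Sum of squared increments of the i-th coordinate along the path."""
    return sum((b[i] - a[i]) ** 2 for a, b in zip(path, path[1:]))
```

Since every increment has square $\Delta t$, the quadratic variation of each coordinate over the whole time line equals 1 (up to rounding) for every sample path, which is the discrete counterpart of the corresponding property of Brownian motion.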
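As in Section 4.1 an internal filtration $\{\mathcal{A}_t\}_{t \in T}$ is defined in terms of the equivalence classes

$[\omega]_t = \{\omega' \in \Omega \mid \forall s < t\ \forall i \le H\ (\omega_i(s) = \omega'_i(s))\},$

and the standard filtration $\{\mathcal{B}_t\}_{t \in [0,1]}$ is generated from $\{\mathcal{A}_t\}$ in the usual way. For all $i, j \le n$ let $X_{ij}$ be a nonanticipating process in $SL^2(P \times \lambda)$ (where $\lambda$ is the normalized counting measure on $T$); the ${}^*\mathbb{R}^n \otimes {}^*\mathbb{R}^n$-valued process $X(\omega,t) = (X_{ij}(\omega,t))_{i,j \le n}$ is then said to be in $SL^2(\chi)$, and the stochastic integral is defined by

(4) $\int_0^t X\,d\chi = \sum_{s=0}^{t} X(\omega,s) \cdot \Delta\chi(\omega,s),$

where $\cdot$ means matrix multiplication. Notice that $\int X\,d\chi$ is an ${}^*\mathbb{R}^n$-valued process; the $i$th component is given by

(5) $\left(\int X\,d\chi\right)_i = \sum_{j=1}^{n} \int X_{ij}\,d\chi_j.$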
This gives a definition of $\int X\,d\chi$ in terms of one-dimensional integrals, and the corresponding formula is used to define standard stochastic integrals of multidimensional processes. Because of this reduction all lifting and representation theorems we shall need will carry over from the one-dimensional case.

Having explained what each part of (1) means, let us now define what a solution is:

4.5.1. DEFINITION. Let $(\Omega, \{\mathcal{B}_t\}_{t \in [0,1]}, P)$ be a filtration and $b$ a Brownian motion adapted to it. A solution $x$ of (1) with respect to $(\Omega, \{\mathcal{B}_t\}, P, b)$ is an adapted process $x : \Omega \times [0,1] \to \mathbb{R}^n$ such that $f(s, x(\omega,s))$ is in $L^1(\Omega \times [0,1])$ and $g(s, x(\omega,s))$ is in $L^2(\Omega \times [0,1])$, and such that for all $t \in [0,1]$ equation (1) holds almost surely.
Existence theorems for solutions usually state that for all $f$ in a certain class $F$ and all $g$ in a class $G$, equation (1) has a solution. Probabilistic jargon classifies such theorems as “weak,” “strong,” and “strict” according to the dependence of $(\Omega, \{\mathcal{B}_t\}, P, b)$ on $f$ and $g$. We shall use the following version of this terminology (Barlow, 1982; Cutland, 1982; Keisler, 1984).

Weak solutions. For all $f \in F$, $g \in G$, there exist $(\Omega, \{\mathcal{B}_t\}, P, b)$ (depending on $f$ and $g$) such that (1) has a solution $x$ with respect to $(\Omega, \{\mathcal{B}_t\}, P, b)$.

Strong solutions. There exist $(\Omega, \{\mathcal{B}_t\}, P, b)$ such that for all $f \in F$, $g \in G$, Eq. (1) has a solution $x$ with respect to $(\Omega, \{\mathcal{B}_t\}, P, b)$.
Strict solutions. Let $\Omega = C([0,1])$, $P$ Wiener measure, and $\mathcal{F}_t$ the $\sigma$-algebra generated by $C([0,t])$. Let $b$ be the coordinate function; $b(\omega,t) = \omega(t)$. For all $f \in F$, $g \in G$, equation (1) has a solution with respect to $(\Omega, \{\mathcal{F}_t\}, P, b)$.

Notice that if an equation has a strict solution, then it also has a solution with respect to any other Brownian motion. What we get by nonstandard methods are strong solutions living on hyperfinite Loeb spaces. Since most methods only yield weak solutions when the conditions on $f$ and $g$ become fairly general, this is in itself noteworthy. Also, hyperfinite Loeb spaces have the following homogeneity property [in the remainder of this section, unless otherwise specified, $(\Omega, \{\mathcal{A}_s\}, P)$ is a hyperfinite probability space of the form previously described]:

4.5.2. PROPOSITION.
Let

$f : [0,1] \times \mathbb{R}^n \to \mathbb{R}^n, \qquad g : [0,1] \times \mathbb{R}^n \to \mathbb{R}^n \otimes \mathbb{R}^n$

be measurable functions. Suppose that there is a Brownian motion $b$ adapted to $(\Omega, \{\mathcal{B}_t\}, L(P))$ such that for all $\mathcal{B}_0$-measurable initial conditions $x_0 : \Omega \to \mathbb{R}^n$, the equation

$x(\omega,t) = x_0(\omega) + \int_0^t f(s, x(\omega,s))\,ds + \int_0^t g(s, x(\omega,s))\,db(\omega,s)$

has a solution. Then the equation has a solution for every Brownian motion adapted to $\{\mathcal{B}_t\}$ and every $\mathcal{B}_0$-measurable initial condition.

This result is due to Keisler (1984), and although we shall not prove it here, it will be used below to extend results from one Brownian motion to all. Proposition 4.5.2 shows that Loeb spaces of the form $\Omega = \{-1,1\}^{T \times \{1,\ldots,H\}}$ are extremely regular; e.g., if we can produce weak solutions on such spaces, we automatically get strong solutions.

To illustrate the hyperfinite approach to stochastic differential equations, we first take a look at a fairly simple, but typical proof from Keisler (1984).

4.5.3. PROPOSITION. Let $b$ be an $n$-dimensional Brownian motion adapted to $(\Omega, \{\mathcal{B}_t\}, L(P))$, and assume that

$f : [0,1] \times \mathbb{R}^n \to \mathbb{R}^n, \qquad g : [0,1] \times \mathbb{R}^n \to \mathbb{R}^n \otimes \mathbb{R}^n$

are bounded measurable functions which are continuous in the second variable. Let $x_0$ be a $\mathcal{B}_0$-measurable initial condition. Then the equation

(6) $x(\omega,t) = x_0(\omega) + \int_0^t f(s, x(\omega,s))\,ds + \int_0^t g(s, x(\omega,s))\,db(\omega,s)$

has a solution.
PROOF. Choose $\delta \approx 0$ such that $x_0$ has an $\mathcal{A}_\delta$-measurable lifting $X_0$. By Proposition 4.3.13 we can find liftings $F : T \times {}^*\mathbb{R}^n \to {}^*\mathbb{R}^n$ and $G : T \times {}^*\mathbb{R}^n \to {}^*\mathbb{R}^n \otimes {}^*\mathbb{R}^n$ of $f$ and $g$ which are S-bounded and uniform in the second variable. Hence there exists a set $T' \subset T$ of Loeb measure one such that ${}^\circ F(t,y) = f({}^\circ t, {}^\circ y)$ and ${}^\circ G(t,y) = g({}^\circ t, {}^\circ y)$ whenever $t \in T'$ and $y$ is nearstandard. Let us consider the hyperfinite difference equation

(7) $X(\omega,t) = X_0(\omega) + \sum_{s=\delta}^{t} F(s, X(\omega,s))\,\Delta t + \sum_{s=\delta}^{t} G(s, X(\omega,s))\,\Delta\chi(\omega,s),$

where $\chi$ is Anderson's random walk. This equation obviously has a unique, nonanticipating solution $X$ defined inductively for all $t \ge \delta$ [recall that by our convention (4.1.3), the sum $\sum_{s=\delta}^{t}$ is really a sum $\sum_{s=\delta}^{t-\Delta t}$]. We extend $X$ to all of $T$ by letting $X(\omega,t) = 0$ for $t < \delta$. Let $x$ be the standard part ${}^\circ X^+$ of $X$; we shall show that $x$ is a solution of the original equation (6) in the case where $b = {}^\circ\chi^+$. It suffices to show that $F(s, X(\omega,s))$ is a lifting of $f(s, x(\omega,s))$ and that $G(s, X(\omega,s))$ is a lifting of $g(s, x(\omega,s))$, as we will then have for all $t \ge \delta$

$x({}^\circ t) = {}^\circ X(t) = {}^\circ X_0 + {}^\circ\sum_{s=\delta}^{t} F(s, X(s))\,\Delta t + {}^\circ\sum_{s=\delta}^{t} G(s, X(s))\,\Delta\chi(s) = x_0 + \int_0^{{}^\circ t} f(s, x(s))\,ds + \int_0^{{}^\circ t} g(s, x(s))\,db,$

which proves the proposition for $b = {}^\circ\chi^+$. The general case then follows from 4.5.2.

To prove that $F(s, X(\omega,s))$ and $G(s, X(\omega,s))$ are liftings of $f(s, x(\omega,s))$ and $g(s, x(\omega,s))$, respectively, we note that since $F$ and $G$ are S-bounded, there is a set $\Omega' \subset \Omega$ of Loeb measure one such that $X(\omega,s)$ is nearstandard for all $s$ when $\omega \in \Omega'$. Thus if $(\omega,s) \in \Omega' \times T'$,

(8) ${}^\circ F(s, X(\omega,s)) = f({}^\circ s, x(\omega, {}^\circ s)), \qquad {}^\circ G(s, X(\omega,s)) = g({}^\circ s, x(\omega, {}^\circ s)),$

which completes the proof, since $\Omega' \times T'$ obviously has measure one.

For an application with more sting, we refer the reader to Theorem 5.14 of Keisler (1984). The proof is just a slightly more technical variant of the one above, and the result is a new, strong existence theorem. Even Proposition 4.5.3 is less innocent than the proof might lead you to believe; Barlow (1982) has shown that (6) may have no strict solution even in the one-dimensional case! (Warning: Barlow's terminology is slightly different from ours.) This is an indication that the Loeb space is a better setting for stochastic differential equations than standard path space $C[0,1]$.
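The inductive construction in the proof can be imitated on an ordinary finite grid. The sketch below is only a scalar ($n = 1$) analogue under that assumption: it replaces the hyperfinite time line by `steps` grid points and Anderson's walk by a coin-flip walk with increments $\pm\sqrt{\Delta t}$. The name `solve_difference_equation` is hypothetical.

```python
import math
import random

def solve_difference_equation(f, g, x0, steps, rng):
    """Solve X(t+dt) = X(t) + f(t, X(t)) dt + g(t, X(t)) dchi(t) on a finite grid.

    dchi(t) is +sqrt(dt) or -sqrt(dt) with equal probability.
    Returns the list of values X(0), X(dt), ..., X(1).
    """
    dt = 1.0 / steps
    x = x0
    path = [x]
    for k in range(steps):
        t = k * dt
        dchi = rng.choice((-1.0, 1.0)) * math.sqrt(dt)
        # The coefficients are evaluated at the current state only, so the
        # solution is nonanticipating by construction.
        x = x + f(t, x) * dt + g(t, x) * dchi
        path.append(x)
    return path
```

For instance, `solve_difference_equation(lambda t, x: math.cos(x), lambda t, x: 1.0 / (1.0 + x * x), 0.0, 10_000, random.Random(1))` produces one sample path for a pair of bounded coefficients that are continuous in the space variable. Each value is computed from the previous one in a single forward pass, which is why existence and uniqueness for the difference equation are immediate.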
B. Itô Equations with Measurable Coefficients
The argument we have just given is typical; by choosing appropriate liftings $F$ and $G$ of $f$ and $g$, we turn Eq. (1) into a hyperfinite difference equation (7), and the solution of Eq. (1) is obtained as the standard part of the solution of Eq. (7). There are two technical problems that occur; first we have to find the proper liftings $F$ and $G$, and then we have to prove that $F(s, X(\omega,s))$ and $G(s, X(\omega,s))$ are liftings of $f(s, x(\omega,s))$ and $g(s, x(\omega,s))$, respectively. In the theorem above these problems were relatively easy to solve since $f$ and $g$ were continuous in the space variable. If this is not the case the problems become much harder; the reason is that although we may choose $F$ and $G$ such that they differ from $f$ and $g$ only on a null set in $T \times \Omega$, the process may nevertheless stay in this null set with positive probability, and this will destroy the lifting properties. Following Keisler (1984), we shall see how this problem can be solved under the extra hypothesis that $g$ does not degenerate.

We shall need the following inequality due to Krylov (1974, 1980):

4.5.4. THEOREM.
Let $x : \Omega \times [0,1] \to \mathbb{R}^n$ be a process of the form

$x(\omega,t) = x_0 + \int_0^t f(\omega,s)\,ds + \int_0^t g(\omega,s)\,db(\omega,s),$

where $b$ is a Brownian motion adapted to a filtration $(\Omega, \{\mathcal{B}_t\}, P)$; and

$f : \Omega \times [0,1] \to \mathbb{R}^n, \qquad g : \Omega \times [0,1] \to \mathbb{R}^n \otimes \mathbb{R}^n$

are bounded, adapted processes; and $x_0 \in \mathbb{R}^n$. For $d \in \mathbb{R}_+$, let

$\tau_d(\omega) = \inf\{t \in [0,1] \mid |x(\omega,t)| \ge d\} \wedge 1.$

Given $d, K \in \mathbb{R}_+$, there is a constant $N = N(n,d,K)$, depending only on $d$, $K$, and the dimension $n$, such that for all adapted $f$ and $g$ bounded by $K$, and all functions $h \in L^{n+1}([0,1] \times \mathbb{R}^n)$
Krylov's inequality is a deep result which we do not prove here, especially as we have no new, nonstandard insight to offer. Instead we refer the reader to Krylov's (1980) book; what we have stated is a special case of his Theorem 2.2.2.

We shall need a hyperfinite version of 4.5.4. If $J \in {}^*\mathbb{R}_+$, a function $h : {}^*([0,1] \times \mathbb{R}^n) \to {}^*\mathbb{R}^m$ is called $J$-Lipschitz if it is internal, bounded by $J$, and satisfies

$\|h(t,x) - h(s,y)\| \le J\,\|(t,x) - (s,y)\|$

for all $(t,x), (s,y) \in {}^*([0,1] \times \mathbb{R}^n)$.
If $\chi$ is Anderson's random walk in ${}^*\mathbb{R}^n$, and $U : \Omega \times T \to {}^*(\mathbb{R}^n \otimes \mathbb{R}^n)$ is a nonanticipating process such that $U(\omega,t)$ is a unitary matrix for all $(\omega,t)$, then the standard part of

$\chi' = \int U\,d\chi$

is a Brownian motion. We denote the class of all such $\chi'$ by $\mathcal{U}(\chi)$.
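4.5.5. PROPOSITION. Let $\chi : \Omega \times T \to {}^*\mathbb{R}^n$ be Anderson's random walk. For all $n \in \mathbb{N}$, $D, K \in \mathbb{R}_+$, there exist constants $N = N(n,D,K) \in \mathbb{R}_+$ and $J \in {}^*\mathbb{N} - \mathbb{N}$ such that if $X : \Omega \times T \to {}^*\mathbb{R}^n$ is of the form

(9) $X(\omega,t) = X_0 + \int_0^t F(s, X(\omega,s))\,ds + \int_0^t G(s, X(\omega,s))\,d\chi'(\omega,s),$

where

$F : {}^*([0,1] \times \mathbb{R}^n) \to {}^*\mathbb{R}^n, \qquad G : {}^*([0,1] \times \mathbb{R}^n) \to {}^*(\mathbb{R}^n \otimes \mathbb{R}^n)$

are $J$-Lipschitz and bounded by $K$, $X_0 \in {}^*\mathbb{R}^n$, and $\chi' \in \mathcal{U}(\chi)$, then

(10)

for all $J$-Lipschitz $H : {}^*([0,1] \times \mathbb{R}^n) \to {}^*\mathbb{R}$, where

$\sigma_D = \inf\{t \in T \mid |X(\omega,t)| \ge D\} \wedge 1.$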
Before we prove the proposition, we observe that we can take $J$ to be independent of $(n,D,K) \in \mathbb{N} \times \mathbb{R}_+ \times \mathbb{R}_+$. If we can find a suitable $J(n,D,K)$ for each such triple, then there is by saturation a $J \in {}^*\mathbb{N} - \mathbb{N}$ smaller than all of them, and this $J$ will work for all choices of $n$, $D$, and $K$.

To prove the proposition, we first consider the case where $J \in \mathbb{N}$. Then $F$ and $G$ are S-continuous and we can define functions $f$ and $g$ by

$f({}^\circ t, {}^\circ x) = {}^\circ F(t,x), \qquad g({}^\circ t, {}^\circ x) = {}^\circ G(t,x).$

If $x$ is the standard part of $X$,

(11) $x(\omega,t) = x_0 + \int_0^t f(s, x(\omega,s))\,ds + \int_0^t g(s, x(\omega,s))\,db'(\omega,s),$

where $x_0 = {}^\circ X_0$ and $b' = {}^\circ\chi'^{+}$ (if $X_0$ is not nearstandard, $\sigma_D = 0$ and there is nothing to prove). Applying Theorem 4.5.4 to $x$ with $d = D + 1$, we get

(12)
Since all $J$-Lipschitz functions are S-continuous and S-bounded, it follows from (12) that for all $J$-Lipschitz $H$:

(13) $E\left(\int_0^{\sigma_D} (\det G(s, X(\omega,s)))^{1/(n+1)}\,|H(s, X(\omega,s))|\,ds\right) \le N(n, D+1, K)\,\|H\|_{n+1} + 1/J,$
where $1/J$ is added to take care of the case where we have equality in (12). Consider the internal set $A$ defined as $\{J \in {}^*\mathbb{N} \mid$ (13) holds for all $J$-Lipschitz $F$, $G$, $H$, all $X_0 \in {}^*\mathbb{R}^n$, and all $\chi' \in \mathcal{U}(\chi)\}$. Since we have just shown that $A \supseteq \mathbb{N}$, it also contains an infinite $J$, and the proposition follows.

In order to use Proposition 4.5.5, we must know how to produce functions that are $J$-Lipschitz. Let $h : {}^*\mathbb{R}^m \to {}^*\mathbb{R}^k$ be any internal, S-bounded function. If $x \in {}^*\mathbb{R}^m$, let $[x]_J$ be the box centered at $x$ with sides of length $1/\sqrt{J}$. If $J \in {}^*\mathbb{N} - \mathbb{N}$, an easy calculation shows that the function $h^J$ defined by
$h^J(x) = m([x]_J)^{-1} \int_{[x]_J} h(y)\,dm(y)$

is $J$-Lipschitz (here $m$ is Lebesgue measure).

Another important aspect of 4.5.5 is the need to keep track of points where ${}^\circ\det(G(s, X(\omega,s))) = 0$ since the estimate does not tell us anything about them. The key observation is the following lemma.

4.5.6. LEMMA. Let $a : [0,1] \times \mathbb{R}^n \to \mathbb{R}^n \otimes \mathbb{R}^n$ be a bounded and measurable function taking non-negative, symmetric values. Then $a^J$ is a $J$-Lipschitz lifting of $a$ also taking symmetric and non-negative values. Moreover, for each $M \in \mathbb{R}_+$, there is an $m \in \mathbb{R}_+$ such that if $\det {}^*a(s,y) \ge M$ for all $(s,y) \in [(t,x)]_J$, then $\det a^J(t,x) \ge m$.

PROOF. By Anderson's Lusin theorem 3.4.9, we know that ${}^*a$ is a lifting of $a$, and hence $a^J = ({}^*a)^J$ is a $J$-Lipschitz lifting of $a$. To prove the last part of the lemma, we define

$l(B) = \inf\{\langle Bx, x\rangle \mid \|x\| = 1\}$

for all non-negative, symmetric $n \times n$ matrices $B$, and note that

(14) $l(B + C) \ge l(B) + l(C).$

Since $\det B$ is the product of the eigenvalues of $B$ and $l(B)$ is the smallest eigenvalue,

(15) $l(B)^n \le \det B \le \|B\|^{n-1}\,l(B).$

Returning to our function $a$, we get from (15) that if $\det {}^*a(s,y) \ge M$ for all $(s,y) \in [(t,x)]_J$, then $l({}^*a(s,y)) \ge M/\|a\|^{n-1}$ for the same $(s,y)$'s.
From the superadditivity property (14), we get $l(a^J(t,x)) \ge M/\|a\|^{n-1}$, and using (15) again we obtain

$\det(a^J(t,x)) \ge (M/\|a\|^{n-1})^n,$

which shows that we can choose $m = (M/\|a\|^{n-1})^n$.
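The two matrix inequalities used in this proof are easy to check numerically in the smallest nontrivial case. The following sketch treats only $2 \times 2$ symmetric, non-negative matrices, written as triples `(a, b, c)` for the matrix with rows `(a, b)` and `(b, c)`; this encoding and the function names are assumptions made for the example.

```python
import math

def smallest_eigenvalue(a, b, c):
    """Smallest eigenvalue l(B) of the symmetric matrix B = [[a, b], [b, c]]."""
    return 0.5 * ((a + c) - math.hypot(a - c, 2.0 * b))

def largest_eigenvalue(a, b, c):
    """Largest eigenvalue, which equals the operator norm of B when B >= 0."""
    return 0.5 * ((a + c) + math.hypot(a - c, 2.0 * b))

def det(a, b, c):
    return a * c - b * b

def check_inequalities(B, C, tol=1e-12):
    """Return True if (14) holds for B and C and (15) holds for B, with n = 2."""
    n = 2
    lB, lC = smallest_eigenvalue(*B), smallest_eigenvalue(*C)
    S = tuple(x + y for x, y in zip(B, C))
    superadditive = smallest_eigenvalue(*S) >= lB + lC - tol
    lower = lB ** n <= det(*B) + tol
    upper = det(*B) <= largest_eigenvalue(*B) ** (n - 1) * lB + tol
    return superadditive and lower and upper
```

For example, `check_inequalities((2.0, 1.0, 2.0), (1.0, 0.0, 4.0))` evaluates both inequalities for a matrix with eigenvalues 1 and 3. In dimension two the upper bound in (15) is attained with equality, since the determinant is exactly the product of the smallest and largest eigenvalues.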
We have now reached the stage where we can prove that solutions to (1) exist even in the case where $f$ and $g$ are not continuous in the space variable, provided that the determinant of $g$ is bounded away from zero. A "weak" solution of this kind was first obtained by Krylov (1974); the "strong" result we shall give is due to Keisler (1984).

4.5.7. THEOREM. Let

$f : [0,1] \times \mathbb{R}^n \to \mathbb{R}^n, \qquad g : [0,1] \times \mathbb{R}^n \to \mathbb{R}^n \otimes \mathbb{R}^n$

be bounded, measurable functions, and assume that there exists an $\varepsilon \in \mathbb{R}_+$ such that $|\det g(t,y)| > \varepsilon$ for all $(t,y) \in [0,1] \times \mathbb{R}^n$. For all $\mathcal{B}_0$-measurable random variables $x_0$, and all Brownian motions $b$ adapted to $(\Omega, \{\mathcal{B}_t\}, L(P))$ the equation

(16) $x(\omega,t) = x_0(\omega) + \int_0^t f(s, x(\omega,s))\,ds + \int_0^t g(s, x(\omega,s))\,db(\omega,s)$

has a solution.

PROOF. We shall only consider the case where $x_0$ is constant and $b$ is a particular Brownian motion. The easy task of gluing these processes together to obtain solutions for general $x_0$'s is left to the reader, as is the final appeal to Proposition 4.5.2.

We begin by simplifying the problem further. For each $(t,x)$ let $|g|(t,x)$ be the absolute value of $g(t,x)$, i.e., the unique symmetric, non-negative matrix such that $|g|^2 = g\,{}^t\!g$, where ${}^t\!g$ is the transpose of $g$. Let $u(t,x)$ be the unitary matrix such that $|g| = g \cdot u$. We claim that it suffices to find a solution to the equation

(17) $x(\omega,t) = x_0 + \int_0^t f(s, x(\omega,s))\,ds + \int_0^t |g|(s, x(\omega,s))\,db(\omega,s),$
where $|g|$ has replaced $g$. The reason is simply that since $u(t,x)$ is a unitary matrix, the process $b' = \int u\,db$ is a Brownian motion. Thus $|g|\,db = g\,db'$, and a solution of (17) is also a solution of (16), although with respect to a different Brownian motion.

To solve (17), we let $J$ be an element of ${}^*\mathbb{N} - \mathbb{N}$ satisfying Proposition 4.5.5, and choose $J_0 \in {}^*\mathbb{N} - \mathbb{N}$ infinitesimal compared to $J$. The reason why we pass to the second constant will become clear later. Let $F = f^{J_0}$ and $|G| = |g|^{J_0}$, and observe that by 4.5.6, the standard part of $|G|$ is bounded away from zero. Let $X$ be the solution of the hyperfinite difference equation

(18) $X(\omega,t) = x_0 + \sum_{s=0}^{t} F(s, X(\omega,s))\,\Delta t + \sum_{s=0}^{t} |G|(s, X(\omega,s))\,\Delta\chi(\omega,s);$
we shall show that $x = {}^\circ X^+$ is a solution of (17) when $b = {}^\circ\chi^+$. All we need to prove is that ${}^\circ F(s, X(\omega,s)) = f({}^\circ s, x(\omega, {}^\circ s))$ and ${}^\circ|G|(s, X(\omega,s)) = |g|({}^\circ s, x(\omega, {}^\circ s))$ almost everywhere in $\Omega \times T$. To this end let $B_K = \{(t,x) \in T \times {}^*\mathbb{R}^n \mid \|x\| \le K\}$, and define

$A_K = \{(t,x) \in B_K \mid {}^\circ F(t,x) \ne f({}^\circ t, {}^\circ x) \text{ or } {}^\circ|G|(t,x) \ne |g|({}^\circ t, {}^\circ x)\}.$
We shall show that for each $\varepsilon \in \mathbb{R}_+$ there is a $J$-Lipschitz function $H_\varepsilon$ which is 1 on $A_K$ and has ${}^\circ\|H_\varepsilon\|_{n+1} \le \varepsilon$. Applying 4.5.5 to $X$ and $H_\varepsilon$, we get

Since $\det|G|$ is bounded away from zero, it follows that

$L(P \times \lambda)\{(\omega,t) \mid (t, X(\omega,t)) \in A_K\} = 0$

for all $K \in \mathbb{R}_+$, and hence $x = {}^\circ X^+$ is a solution of (17).

To construct $H_\varepsilon$, observe that since $F$ and $G$ are $J_0$-Lipschitz for a $J_0$ which is infinitesimal compared to $J$, the set $A_K$ has the following "openness" property: if $x \in A_K$ and $|x - y| \le 1/J$, then $y \in A_K$. Thus if $B$ is an internal set in the complement of $A_K$, the function $H_\varepsilon(y) = J \cdot d(y,B) \wedge 1$ is one on $A_K$. If we also choose $B$ such that its complement has measure less than $\varepsilon$, then $\|H_\varepsilon\|_{n+1} < \varepsilon$ as $H_\varepsilon$ is zero on $B$. But $H_\varepsilon$ is obviously $J$-Lipschitz, and hence the theorem is proved.

Observe that in the proof above we only proved that (16) has a weak solution; Proposition 4.5.2 took care of the rest. Since we have claimed that one of the nicest features of the nonstandard theory is its ability to produce strong solutions, but at the same time have refused to prove 4.5.2, the reader could accuse us of not playing fair. However, by changing the proof above slightly, we can get a strong solution directly. The idea is to replace (17) by

(19) $X(\omega,t) = x_0 + \sum_{s=0}^{t} F(s, X(\omega,s))\,\Delta t + \cdots$

where $V$ is a suitable lifting of $u^{-1}$. The proof proceeds along the same lines as above, but becomes a little more technical since we also have to treat the relationship between $u$ and $V$.
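The liftings $F$ and $|G|$ in the proof above are obtained by box averaging. A one-dimensional, finite analogue shows why averaging produces Lipschitz functions: if $|h| \le K$ and the averaging interval has length $\delta$, the exact average has Lipschitz constant at most $2K/\delta$, because two intervals centered at $x$ and $y$ differ on a set of measure at most $2|x - y|$. The sketch below approximates the average with a midpoint rule, so the bound holds only up to the discretization error; `box_average` and `step_function` are hypothetical names.

```python
def box_average(h, width, x, samples=200):
    """Approximate the mean of h over the interval of length `width` centered at x.

    A midpoint rule with `samples` points stands in for the exact average
    (1 / m(box)) * integral of h over the box.
    """
    left = x - width / 2.0
    step = width / samples
    return sum(h(left + (k + 0.5) * step) for k in range(samples)) / samples

def step_function(y):
    """A bounded, discontinuous test function with |h| <= 1."""
    return 1.0 if y >= 0.0 else -1.0
```

With `width = 0.5` the averaged step function rises linearly from -1 to 1 across the interval from -0.25 to 0.25, with slope $2K/\delta = 4$, so the jump of the original function has been replaced by a ramp whose steepness is controlled by the box size.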
In a certain sense Keisler's result is the best we can expect if we only assume $f$ and $g$ to be measurable. If we allow $g$ to degenerate, all potential solutions of (16) may take values in a null set in $\mathbb{R}^n$, and if $f$ and $g$ are only given as measurable functions, their values on null sets are arbitrary; the equation no longer makes sense. However, by assuming that $f$ and $g$ are continuous on the set where $g$ degenerates, Kosciuk (1982, 1983) obtained the following generalization:

4.5.8. THEOREM. Let $f : [0,1] \times \mathbb{R}^n \to \mathbb{R}^n$ and $g : [0,1] \times \mathbb{R}^n \to \mathbb{R}^n \otimes \mathbb{R}^n$ be bounded, measurable functions, and define $N \subset [0,1] \times \mathbb{R}^n$ by

$N = \{(t,x) \mid \text{there is a sequence } (t_n, x_n) \to (t,x) \text{ such that } \det g(t_n, x_n) \to 0\}.$

Assume that the restrictions of $f$ and $g$ to $N$ are continuous, and that if either $f$ or $g$ is discontinuous at a point $(t,x) \in N$, then $\det g$ is bounded away from zero on $V - N$ for some neighborhood $V$ of $(t,x)$. Under these conditions the equation

(20) $x(\omega,t) = x_0 + \int_0^t f(s, x(\omega,s))\,ds + \int_0^t g(s, x(\omega,s))\,db(s)$

has a solution for all Brownian motions $b$ adapted to $(\Omega, \{\mathcal{B}_t\}, L(P))$ and all $\mathcal{B}_0$-measurable initial conditions $x_0$.
PROOF. As in the proof of 4.5.7 we may assume that $g$ is symmetric and non-negative by passing to the absolute value $|g|$ if necessary. We shall also assume that $x_0$ is constant and that $b$ is the right standard part of Anderson's random walk $\chi$. Choose a $J \in {}^*\mathbb{N} - \mathbb{N}$ satisfying 4.5.5, and let $F = f^{J_0}$, $G = g^{J_0}$ for a $J_0 \in {}^*\mathbb{N} - \mathbb{N}$ that is infinitesimal compared to $J$. We shall prove that the standard part $x$ of the process

(21) $X(\omega,t) = x_0 + \sum_{s=0}^{t} F(s, X(\omega,s))\,\Delta t + \sum_{s=0}^{t} G(s, X(\omega,s))\,\Delta\chi(\omega,s)$

is a solution of (20). As usual, it suffices to show that

(22) $L(P \times \lambda)\{(\omega,t) \mid (t, X(\omega,t)) \in A_K\} = 0$

for all $K \in \mathbb{N}$, where

(23) $A_K = \{(t,x) \mid \|x\| \le K \text{ and either } {}^\circ F(t,x) \ne f({}^\circ t, {}^\circ x) \text{ or } {}^\circ G(t,x) \ne g({}^\circ t, {}^\circ x)\}.$

We shall first show that if $\det G(t,x) \approx 0$, then $(t,x) \notin A_K$. By the conditions on $g$ and Lemma 4.5.6, it is clear that $\det G(t,x)$ can only be infinitesimal when $(t,x) \in \mathrm{st}^{-1}(N)$. Let $N_C$ be the points in $N$ where both
$f$ and $g$ are continuous, and let $N_p = N - N_C$. Assume first that $(t,x) \in \mathrm{st}^{-1}(N_C)$; then $f$ and $g$ are continuous at $({}^\circ t, {}^\circ x)$, and $(t,x)$ cannot be in $A_K$. If, on the other hand, $(t,x) \in \mathrm{st}^{-1}(N_p)$, then $\det G(t,x)$ can only be infinitesimal if

(24)

since $\det g$ is bounded away from zero on $V - N$. However, since $f$ and $g$ are continuous in $N$, formula (24) implies that $(t,x) \notin A_K$. Thus we have shown that if $\det G(t,x) \approx 0$, then $(t,x) \notin A_K$.

For each $\varepsilon \in \mathbb{R}_+$, let $H_\varepsilon$ be a $J$-Lipschitz function that is one on $A_K$ and has $L^{n+1}$ norm less than $\varepsilon$, as in the proof of 4.5.7. By 4.5.5

$E\left(\int_0^{\sigma_D} \det G(s, X(\omega,s))^{1/(n+1)}\,|H_\varepsilon(s, X(\omega,s))|\,ds\right) \le N\varepsilon.$

Since $\det G \not\approx 0$ on $A_K$, and $\varepsilon$, $K$, and $D$ are arbitrary elements of $\mathbb{R}_+$, $\mathbb{N}$, and $\mathbb{N}$, respectively, (22) follows. The proof is complete.

We should mention that we have reformulated Kosciuk's result somewhat; in his papers Kosciuk (1982, 1983) does not consider the stochastic differential equation (20), but constructs a solution to the associated martingale problem without using stochastic integration [for information on martingale problems, see Stroock and Varadhan (1979)]. The proof we have given is shorter and fits better into our presentation.

As Kosciuk (1982) pointed out, Theorem 4.5.8 may have interesting applications in biophysics and biochemistry, e.g., to the study of reactions between regulatory proteins and DNA molecules. Here is what is believed to happen (Richter and Eigen, 1974): the protein is diffusing in a solution until it hits a DNA molecule (nonspecific binding); the process then abruptly changes to a diffusion along the surface of the DNA until the protein either dissociates from it or reaches a particular site where it gets trapped (specific binding). The lower-dimensional diffusion along the DNA chain is necessary to explain the high association rate found in experiments; if we use a model with only a three-dimensional diffusion, the value will be much too low. A mathematical model for the phenomenon was suggested by Berg and Blomberg (1976, 1977), using a coupling of three-dimensional and one-dimensional diffusion processes, but their theory has some unsatisfactory features; e.g., the dissociation of the protein from the DNA can only be obtained by allowing it to jump a finite distance back into the solution. In Theorem 4.5.8 we have a flexible tool for engineering models of diffusions with abrupt changes and degeneracies in the coefficients, and it seems likely that by applying it to proteins and DNA molecules, we would arrive at a
better mathematical model. However, it should be pointed out that using their model, Berg and Blomberg were able to obtain connections between the diffusion coefficients and physical properties of the DNA molecule, and that no analysis on this level has yet been attempted using 4.5.8; it remains an interesting challenge for future research.

C. Equations with Coefficients Depending on the Past
As a final example of the hyperfinite approach to stochastic differential equations, we shall take a brief look at equations where the coefficients depend on the past of the process and not only on its present state. Although there are several papers dealing with equations of this sort, we shall concentrate on Hoover and Perkins's (1983b) work on equations where the driving term is a semimartingale, and even in this case restrict ourselves to the main ideas; for a description of the technical machinery, you will have to consult the original paper or Perkins' (1983) survey article.

A semimartingale is a process $z = a + m$ where $a$ has paths of bounded variation and $m$ is a local martingale. Hoover and Perkins studied equations of the form

(25) $y(\omega,t) = h(\omega,t) + \int_0^t f(\omega, s, y(\omega,\cdot))\,dz(\omega,s),$

where $z$ is a $d$-dimensional semimartingale, $h : \Omega \times [0,1] \to \mathbb{R}^n$ is right-continuous with left limits, and $f(\omega, s, y(\omega,\cdot)) \in \mathbb{R}^n \otimes \mathbb{R}^d$ only depends on the values of $y(\omega,\cdot)$ up to time $s$. The strategy is the same as before; we find a well-behaved (in the technical sense of Section 4.4) lifting $H$ of $h$, and a hyperfinite process $Z$ such that ${}^\circ Z^+ = z$ and $Z$ and all its stochastic integrals are well behaved. If we can construct a suitable lifting $F$ of $f$, we can solve the hyperfinite difference equation

(26) $Y(\omega,t) = H(\omega,t) + \sum_{s=0}^{t} F(\omega, s, Y(\omega,\cdot))\,\Delta Z(\omega,s),$

and since all processes in (26) are well behaved, it is not hard to see that under reasonable conditions the right standard part $y$ of $Y$ is a solution of (25).

But how do we construct $F$? If $J_1$ is the Skorohod topology on the space $\mathcal{D}$ of right-continuous processes with left limits, we can use Proposition 4.3.13 to pick a lifting $F$ of $f$ which is uniformly continuous in the third variable, provided that $f(\omega, s, \cdot)$ is $J_1$-continuous for all $\omega$ and all $s$. However, Hoover and Perkins only assumed that $f(\omega, s, \cdot)$ was continuous in the uniform topology, and this makes the problem much harder as $\mathcal{D}$ is then no longer separable. Since we only need to lift $f$ on functions $y$ which
are possible solutions to (25), Hoover and Perkins could show that we are only interested in a subspace of $\mathcal{D}$ where the two topologies coincide [this is basically because a solution to (25) can only jump when $h$ or $z$ jumps]. Hence we get our lifting $F$, and the standard part of $Y$ is a solution of (25). For the precise statement of the result, we again refer the reader to Hoover and Perkins (1983b) and Perkins (1983).

Prior to Hoover and Perkins’s work, solutions to (25) were only known to exist under Lipschitz conditions on the space variable in $f$ [Doléans-Dade (1976); Protter (1977a,b)], but a variant of the general result was found independently and at about the same time by Jacod and Mémin (1981). Other nonstandard studies of stochastic differential equations with dependence on the past include Cutland (1982) and a paper by Osswald (1984). We shall take a closer look at Cutland’s ideas when we turn to stochastic control theory in the next section; at this point we only mention an interesting feature of his latest papers, where he works with internal Brownian motions instead of hyperfinite random walks. This *-continuous approach seems to simplify a number of technical problems concerning liftings and transfer of standard results; see his proof (Cutland, 1985e) of Keisler’s existence theorem 4.5.7 as an illustrative example.

In this section we have concentrated on existence theorems, although nonstandard methods have also been used successfully to obtain other kinds of results about stochastic differential equations, notably invariance principles (Keisler, 1984; Kosciuk, 1982) and Markov properties (Keisler, 1984). With uniqueness questions there has yet been no progress. For an interesting application to economics, see Keisler (1983).

4.6. OPTIMAL STOCHASTIC CONTROLS
The stochastic differential equations

(1) $x(t) = x_0 + \int_0^t f(s, x)\,ds + \int_0^t g(s, x)\,db(s)$

of the last section have been used to model a wide variety of phenomena in science, engineering, and economics. What many of these phenomena have in common is a feedback mechanism which modifies the process’s future behavior on the basis of an observation of its past. The purpose of this modification is usually to optimize a certain outcome of the process. Mathematically the situation can be described as follows: we consider processes

(2) $x^u(t) = x_0 + \int_0^t f(s, x^u, u(s, x^u))\,ds + \int_0^t g(s, x^u, u(s, x^u))\,db$
where $f$ and $g$ are given functions, but where the control $u$ may vary. If the outcome of the process is determined by a function $h$, we define the cost of $x^u$ to be

(3) $j(u) = E\left(\int_0^1 h(x^u(t))\,dt\right).$

The idea is to find the control $u$ which minimizes the cost $j(u)$. It is obvious that the solution to this problem depends on which functions $u$ we allow. Certain restrictions are necessary; e.g., (2) does not make sense unless $u$ is nonanticipating in the sense that $u(s, x)$ only depends on the values of $x$ up to time $s$. On the other hand, it is not always the case that $u$ should be allowed to depend on the entire past of the process; the feedback mechanism may be based on incomplete information about the system and this will result in a smaller choice of $u$'s. Hence we are led to the subject matter of this section, optimal controls for partially observed stochastic systems.

The nonstandard approach we shall present here has been developed by Cutland (1983a-c, 1985a,b) in a series of papers. Although we have changed the details slightly to fit into our framework, the main ideas are all due to him. Our aim is to prove the existence of optimal controls under quite general conditions, and we shall first take a look at the case where $f(s, x, a)$, $g(s, x, a)$, and $u(s, x)$ only depend on the value of $x$ at time $s$. In this situation Krylov's inequality and the techniques of the last section can be used. When $f$, $g$, and $u$ are allowed to depend on the past of $x$, the problem becomes more difficult, and we can only solve it under extra hypotheses. The main tool in this case is Girsanov's formula, which we shall develop from scratch.

A. Optimal Controls: The Markov Case
The first type of controlled systems we shall study is

(4) $x(t) = \int_0^t f(x(s), s, u(x, s))\,ds + \int_0^t g(x(s), s, u(x, s))\,db(s),$

where $f : \mathbb{R}^n \times [0,1] \times K \to \mathbb{R}^n$, $g : \mathbb{R}^n \times [0,1] \times K \to \mathbb{R}^n \otimes \mathbb{R}^n$ are continuous in the third variable and $K$ is a compact Polish space (i.e., a topological space which allows a complete, separable metric). Since $f$ and $g$ do not depend on the past of $x$, we shall refer to this rather sloppily as "the Markov case"; strictly speaking this terminology is incorrect, as a certain dependence on the past is introduced through the control $u$.

The feedback mechanism works as follows. Observations are made at fixed times $0 \le t_1 < t_2 < \cdots < t_p < 1$, and the result of the observation at
time $t_i$ is recorded as a value $y_i(x(t_i))$, where $y_i : \mathbb{R}^n \to \mathcal{Y}$ is a measurable function into a countable observation space $\mathcal{Y}$. Given a path $x : [0,1] \to \mathbb{R}^n$, we let

$y(x) = (y_1(x(t_1)), \ldots, y_p(x(t_p)))$

be the sequence of observations made on $x$.

4.6.1. DEFINITION. An admissible control is a measurable function $u : \mathcal{Y}^p \times [0,1] \to K$ that is nonanticipating in the following sense: if $t_i \le t < t_{i+1}$, then $u(y, t)$ depends only on the first $i$ components of $y = (y_1, y_2, \ldots, y_p)$.
Given a bounded, measurable cost function $h : \mathbb{R}^n \times [0,1] \to \mathbb{R}$, the problem is to find a control $u$ which minimizes the cost

(5)

where

(6) $x^u(t) = \int_0^t f(x^u(s), s, u(y(x^u), s))\,ds + \int_0^t g(x^u(s), s, u(y(x^u), s))\,db(s).$
For this to make sense, the solutions to (6) must be unique in distribution, and this is guaranteed by assuming that $g$ is Lipschitz continuous in the first variable and that $g^{-1}$ is bounded on compacts. But even when (6) has a unique solution, there may not exist an admissible optimal control. The problem is that since the set of controls has rather weak closure properties, the infimum

(7) $\inf\{j(u) \mid u \text{ an admissible control}\}$

need not be attained. One way of solving, or evading, this difficulty is to allow measure-valued controls. Let $\mathcal{M}(K)$ be the set of Borel probability measures on $K$ with the weak topology.

4.6.2. DEFINITION. An admissible relaxed control is a measurable function $u : \mathcal{Y}^p \times [0,1] \to \mathcal{M}(K)$ that is nonanticipating in the sense of Definition 4.6.1.
Relaxed controls were first introduced in a standard context (Filippov, 1962; McShane, 1967; Warga, 1967, 1972), but we shall see below that they arise naturally in nonstandard theory as the standard parts of internal controls.
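To see informally how a measure-valued control can arise from an ordinary one, consider a control that switches between two values at every step of a very fine time grid. Over any window containing many grid points, what matters is the proportion of time spent at each value, that is, a probability measure on the control set. The sketch below is only a finite illustration under that assumption; the control `chattering_control` and the two-point control set are hypothetical.

```python
from collections import Counter
from fractions import Fraction

def chattering_control(k):
    """A fast-switching control: value 'a' on even grid steps, 'b' on odd ones."""
    return "a" if k % 2 == 0 else "b"

def empirical_distribution(control, start, length):
    """Distribution of the control values over grid indices start, ..., start + length - 1."""
    counts = Counter(control(k) for k in range(start, start + length))
    return {value: Fraction(c, length) for value, c in counts.items()}
```

Over any window of even length the empirical distribution puts mass 1/2 on each of the two values, regardless of where the window starts, so on the macroscopic scale the switching control behaves like the constant measure-valued control with that distribution.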
We must explain the proper way of interpreting (6) when $u$ is measure valued. The drift term causes no difficulties; just let

(8) $\int_0^t f(x^u(s), s, u(y(x^u), s))\,ds = \int_0^t \int_K f(x^u(s), s, a)\,du(y(x^u), s)(a)\,ds.$

The interpretation of the martingale term is perhaps more surprising; we let

(9)
Why this is the appropriate definition will become clear later; it suffices to say here that it is connected to the fact that it is the covariance of the diffusion that determines the dynamics of the system. Before stating the result we are aiming at, we introduce conditions on $f$, $g$, and $h$. The conditions are stronger than what is strictly necessary, but they make the theory run smoothly without too many technicalities.

4.6.3. CONDITION. Assume that $f : \mathbb{R}^n \times [0,1] \times K \to \mathbb{R}^n$, $g : \mathbb{R}^n \times [0,1] \times K \to \mathbb{R}^n \otimes \mathbb{R}^n$, and $h : \mathbb{R}^n \times [0,1] \to \mathbb{R}$ are bounded, measurable functions with the following properties:
(i) $f$ and $g$ are continuous in the third variable;
(ii) $g$ is uniformly Lipschitz continuous in the first variable; i.e., there is a constant $K$ such that $\|g(x, t, a) - g(y, t, a)\| \le K\|x - y\|$ for all $x$, $y$, $t$, $a$;
(iii) the values of $g$ are positive definite, symmetric matrices whose determinants are bounded away from zero, i.e., there is a constant $\epsilon > 0$ such that $|\det g(x, t, a)| > \epsilon$ for all $x$, $t$, $a$.

Note that (i) and (ii) imply that $g$ is jointly continuous in the first and the third variable.

4.6.4. THEOREM. Assume that $f$, $g$, and $h$ satisfy Condition 4.6.3. Then there exists an admissible relaxed control $u$ that minimizes the cost

(10)  $j(u) = E\Bigl(\int_0^1 h(x^u(t), t)\,dt\Bigr),$

i.e., $j(u) \le j(u')$ for all other admissible relaxed controls $u'$. The control $u$ is called an optimal relaxed control for the system defined by $f$, $g$, and $h$.
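
As a small numerical illustration of how a measure-valued control enters the two coefficients, the sketch below treats the scalar case with a control measure of finite support. It assumes, as the later use of $g^2$ in this section suggests, that the martingale term is read through the averaged square of $g$, while the drift is read through the plain average of $f$. The coefficient functions, the two-point measure, and all function names are hypothetical choices made only for this example.

```python
import math

def relaxed_drift(f, x, s, support, weights):
    """Average f(x, s, a) over the control measure: sum_k w_k * f(x, s, a_k)."""
    return sum(w * f(x, s, a) for a, w in zip(support, weights))

def relaxed_diffusion(g, x, s, support, weights):
    """Square root of the averaged square of g (scalar case only)."""
    return math.sqrt(sum(w * g(x, s, a) ** 2 for a, w in zip(support, weights)))

# Hypothetical coefficients and a two-point control measure on K = {0, 1}.
f = lambda x, s, a: a * x
g = lambda x, s, a: 1.0 + a
support, weights = [0.0, 1.0], [0.5, 0.5]

drift = relaxed_drift(f, 2.0, 0.0, support, weights)
diffusion = relaxed_diffusion(g, 2.0, 0.0, support, weights)

# For comparison only: the plain average of g, which is NOT the rule used above.
naive_mean_g = sum(w * g(2.0, 0.0, a) for a, w in zip(support, weights))
```

In this example the averaged-square rule gives a strictly larger diffusion coefficient than the plain average of $g$, which is one way to see that the two readings of the martingale term are genuinely different.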
To prove Theorem 4.6.4 we shall apply the following strategy. We first translate the problem into a hyperfinite setting by using Anderson's random walk and suitable liftings of $f$, $g$, and $h$. In the nonstandard problem it will be immediately clear that an optimal control exists. Taking the standard part of this internal optimal control, we obtain the optimal relaxed control for the original system. In order to carry out this program, we must first study internal controls and their relationship to relaxed admissible controls. Let

(11)  $T = \{0, \Delta t, 2\Delta t, \ldots, 1\}$

be an internal time line and

(12)  $0 \le \tilde t_1 < \tilde t_2 < \cdots < \tilde t_p < 1$

a finite sequence of elements of $T$ with ${}^\circ\tilde t_i = t_i$ for all $i \le p$.

4.6.5. DEFINITION. An internal control is an internal function $U : {}^*\mathcal{Y}^p \times T \to {}^*K$ that is nonanticipating in the following sense: if $\tilde t_i \le t < \tilde t_{i+1}$, then $U(Y, t)$ depends only on the first $i$ components of $Y = (Y_1, \ldots, Y_p)$.
We shall now define the standard part of an internal control to be an admissible relaxed control. The intuitive idea behind the construction is as follows: fix a time $t \in [0,1]$ and a control parameter $y \in \mathcal{Y}^p$. For each $n \in \mathbb{N}$, let $\nu_n(y, t)$ be the internal measure on ${}^*K$ given by

(13)  $\nu_n(y, t)(A) = \frac{n}{2} \cdot \#\bigl\{s \in T : |t - s| \le \tfrac{1}{n} \text{ and } U(y, s) \in A\bigr\} \cdot \Delta t.$
If $u_n(y, t) = L(\nu_n(y, t)) \circ \mathrm{st}^{-1}$, the standard part of $U$ is the limit of $u_n$ as $n$ goes to infinity. Thus $u(y, t)$ is the distribution of $U(y, s)$ over the monad of $t$. The proof of the next lemma gives an alternative construction of $u$ which is perhaps less intuitive, but technically more convenient.

4.6.6. LEMMA. For each internal control $U$ there is an admissible relaxed control $u$ with the following property: assume that $z : [0,1] \times K \to \mathbb{R}$ is a bounded, measurable function continuous in the second variable, and let $Z : T \times {}^*K \to {}^*\mathbb{R}$ be an S-bounded lifting of $z$ such that for all $t$ outside a set of Loeb measure zero, ${}^\circ Z(t, a) = z({}^\circ t, {}^\circ a)$ for all $a \in {}^*K$. Then for all $a$, $b$, $y$

(14)  $\int_{{}^\circ a}^{{}^\circ b} z(t, u(y, t))\,dt = {}^\circ \sum_{t=a}^{b} Z(t, U(y, t))\,\Delta t.$
PROOF. Fix a control parameter $y$, and let $\nu_y$ be the internal measure on $T \times {}^*K$ defined by

$\nu_y(A) = \#\{t \in T \mid (t, U(y, t)) \in A\} \cdot \Delta t.$

Put $\mu_y = L(\nu_y) \circ \mathrm{st}^{-1}$.
Note that for each Borel set $B \subset K$, the measure $\mu_{y,B}$ defined on $[0,1]$ by

$\mu_{y,B}(C) = \mu_y(C \times B)$

is absolutely continuous with respect to the Lebesgue measure. We define $u(y, t)(B)$ to be the Radon-Nikodym derivative, i.e.,

(15)  $u(y, t)(B) = \frac{d\mu_{y,B}}{dt}(t).$
By working on each of the intervals $[t_i, t_{i+1}]$ at a time, it is easy to check that we can choose $u$ nonanticipating. What is intuitively just as obvious is that we may assume $u(y, t)(\cdot)$ to be a measure for each pair $(y, t)$, but this is actually a nontrivial observation [see, e.g., Stroock and Varadhan (1979), Theorem 1.1.6]. It remains to prove that (14) is satisfied. Since $Z$ is a lifting of $z$ with respect to $\nu_y$, we have

$\int_{{}^\circ a}^{{}^\circ b} z(t, u(y, t))\,dt = \int_{{}^\circ a}^{{}^\circ b} \Bigl(\int_K z(t, a)\,du(y, t)(a)\Bigr)\,dt$
$\qquad = \int_{[{}^\circ a, {}^\circ b] \times K} z(t, a)\,d\mu_y(t, a)$
$\qquad = {}^\circ\!\int_{[a, b] \times {}^*K} Z(t, a)\,d\nu_y(t, a),$
and the lemma is proved.

The relaxed control $u$ in the lemma above is called the standard part of $U$.

4.6.7. LEMMA. All admissible relaxed controls are standard parts of internal controls.

PROOF. We shall only sketch this simple construction. Fix an admissible relaxed control $u$ and a control parameter $y \in \mathcal{Y}^p$. Choose $N \in {}^*\mathbb{N} - \mathbb{N}$ such that $1/N$ is infinitely large compared to $\Delta t$, and consider the internal measures $u_k$ on ${}^*K$ defined by

(16)  $u_k(A) = N \int_{k/N}^{(k+1)/N} {}^*u(y, t)(A)\,dt.$
By strictly internal methods we can find an internal function $U_y : T \to {}^*K$ such that

$u_k(A) \approx N \cdot \#\{t \in T \mid k/N \le t < (k+1)/N \wedge U_y(t) \in A\} \cdot \Delta t$

for all internal $A$, and $U(y, t) = U_y(t)$ is nonanticipating. It is easy to check that $u$ is the standard part of $U$.
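
The construction in the last two lemmas has a finite analogue that can be checked numerically: an ordinary control that switches rapidly between values, spending in each coarse block a fraction of the fine grid points proportional to the weights of the relaxed control, reproduces the relaxed integral up to a small error. The sketch below is only this finite caricature (a fine grid instead of a hyperfinite time line, and a relaxed control with finite support); the function names, the weight function, and the test integrand are assumptions made for the illustration.

```python
def chattering_control(weights_at, support, n_blocks, pts_per_block):
    """Ordinary control on a fine grid that mimics a relaxed control.

    In block k the value support[j] is used on a number of consecutive grid
    points proportional to the weight weights_at(t_mid)[j] at the block midpoint.
    """
    controls = []
    for k in range(n_blocks):
        w = weights_at((k + 0.5) / n_blocks)
        cum, prev = 0.0, 0
        for j, a in enumerate(support):
            cum += w[j]
            # Cumulative rounding keeps every count nonnegative; the last
            # boundary is forced to the block length so the block is filled.
            last = (j == len(support) - 1)
            bound = pts_per_block if last else round(cum * pts_per_block)
            controls.extend([a] * (bound - prev))
            prev = bound
    return controls

def internal_sum(z, controls):
    """Riemann-type sum of z(t, U(t)) * dt over the fine grid on [0, 1)."""
    dt = 1.0 / len(controls)
    return sum(z(j * dt, a) * dt for j, a in enumerate(controls))

# Hypothetical relaxed control on K = {0, 1}: weight t on a = 1, weight 1 - t on a = 0.
weights_at = lambda t: (1.0 - t, t)
support = (0.0, 1.0)
z = lambda t, a: a * (1.0 + t)

U = chattering_control(weights_at, support, n_blocks=200, pts_per_block=200)
approx = internal_sum(z, U)
exact = 5.0 / 6.0  # integral over [0, 1] of t * (1 + t) dt
```

Refining both the number of blocks and the number of points per block shrinks the gap between `approx` and `exact`; in the hyperfinite setting the corresponding gap is infinitesimal, which is the content of (14).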
4.6.8. REMARK. The last two results are fundamental to the theory in this section. They were discovered by Cutland, who applied them first in deterministic (Cutland, 1983a) and then later in stochastic control theory (Cutland, 1983b,c, 1985a,b). Recently, a similar connection between internal solutions and "weak" standard solutions has turned up in Arkeryd's work on partial differential equations, but here the relationship seems to be more complicated, especially in the nonlinear case. Arkeryd's interest in these problems grew out of his work on the Boltzmann equation (which we report on in Section 6.5),but in two recent papers (Arkeryd, 1984; Arkeryd and Bergh, 1985), he has given a general treatment in a Sobolev space setting.
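We now pick internal functions

$F : {}^*\mathbb{R}^n \times T \times {}^*K \to {}^*\mathbb{R}^n$,  $G : {}^*\mathbb{R}^n \times T \times {}^*K \to {}^*\mathbb{R}^n \otimes {}^*\mathbb{R}^n$,
$H : {}^*\mathbb{R}^n \times T \to {}^*\mathbb{R}$,  $Y_i : {}^*\mathbb{R}^n \to {}^*\mathcal{Y}$, $i \le p$,

and consider the controlled internal process

$X^U(t) = \int_0^t F(X^U(s), s, U(Y(X^U), s))\,ds + \int_0^t G(X^U(s), s, U(Y(X^U), s))\,d\chi,$

where $\chi : \Omega \times T \to {}^*\mathbb{R}^n$ is Anderson's random walk, $U$ is an internal control, and

$Y(X^U) = (Y_1(X^U(\tilde t_1)), \ldots, Y_p(X^U(\tilde t_p))).$

The cost of $X^U$ is defined to be

$J(U) = E\Bigl(\int_0^1 H(X^U(t), t)\,dt\Bigr).$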
We shall assume that $F$, $G$, and $H$ are S-bounded. The next lemma reduces the proof of Theorem 4.6.4 to a question of constructing the right kinds of liftings of $f$, $g$, and $h$. It also explains why we have chosen (9) as the interpretation of $g(x^u, s, u(y(x^u), s))\,db$ when $u$ is measure valued.
4.6.9. LEMMA. Assume that there is a set $N \subset \Omega \times T$ of Loeb measure zero such that if $(\omega, s) \notin N$, then

${}^\circ F(X^U(\omega, s), s, a) = f({}^\circ X^U(\omega, s), {}^\circ s, {}^\circ a)$ for all $a \in {}^*K$,
${}^\circ G(X^U(\omega, s), s, a) = g({}^\circ X^U(\omega, s), {}^\circ s, {}^\circ a)$ for all $a \in {}^*K$,
${}^\circ H(X^U(\omega, s), s) = h({}^\circ X^U(\omega, s), {}^\circ s).$

Assume, moreover, that $Y(X^U(\omega)) = y({}^\circ X^U(\omega))$ for $L(P)$ almost all $\omega$, and that $f$, $g$, and $h$ satisfy Condition 4.6.3. If $u$ is the standard part of $U$, then $j(u) = {}^\circ J(U)$.

PROOF. Let $x$ be the right standard part of $X^U$. We shall show that $x$ is a solution of

$x(t) = \int_0^t f(x(s), s, u(y(x), s))\,ds + \int_0^t g(x(s), s, u(y(x), s))\,db$
for a suitably chosen Brownian motion $b$. Since Condition 4.6.3 guarantees that the solution to this problem is unique in distribution, this is sufficient to prove the lemma. First, observe that by Lemma 4.6.6 and the conditions on $F$ and $Y$,

$\int_0^{{}^\circ t} f(x(s), s, u(y(x), s))\,ds = {}^\circ \sum_{s=0}^{t} F(X^U(s), s, U(Y(X^U), s))\,\Delta t$
for almost all $\omega$. In treating the martingale part of our process, we shall first assume for simplicity that the dimension $n$ is equal to one. The quadratic variation is given by

$\Bigl[\int G\,d\chi\Bigr](t) = \int_0^t G^2(X^U(s), s, U(Y(X^U), s))\,ds.$
By Lemma 4.6.6 and the conditions on G and Y, the standard part of the integral on the right is equal to
Thus by (9)
Recalling (9) again, we get

$x(t) = \int_0^t f(x(s), s, u(y(x), s))\,ds + \int_0^t g(x(s), s, u(y(x), s))\,db$
which completes the proof of the one-dimensional case. The proof in higher dimensions is similar and we shall leave it to the reader, with the following comments. If $M$ is an ${}^*\mathbb{R}^n$-valued martingale, its quadratic variation is the ${}^*\mathbb{R}^n \otimes {}^*\mathbb{R}^n$-valued process whose $(i, j)$ component is given by

$[M]_{ij}(t) = \sum_{s=0}^{t} \Delta M_i(s)\,\Delta M_j(s),$

where $M_i$, $M_j$ are the $i$th and $j$th components of $M$. The proof of 4.4.17 can be extended to the $n$-dimensional case rather easily [see Lindstrøm (1989, but observe that we are only interested in the nondegenerate case], and the lemma then follows as in dimension one.

We did not use the full force of Condition 4.6.3 in the proof above; essentially all that was needed was the continuity of $f(x(s), s, \cdot)$ and $g(x(s), s, \cdot)$, and the uniqueness in distribution of the solutions to the stochastic differential equations
However, it is not easy to see how one should construct the liftings $F$, $G$, $H$, and $Y$ and guarantee the uniqueness, unless something close to 4.6.3 is satisfied. Observe how that condition is used in the proof of the main theorem:

PROOF OF THEOREM 4.6.4. Assume that we can find liftings $F$, $G$, $H$, and $Y$ of $f$, $g$, $h$, and $y$ such that the conditions of Lemma 4.6.9 are satisfied for all internal controls $U$. By 4.6.6, 4.6.7, and 4.6.9, we then have

$\inf\{j(u) \mid u \text{ a relaxed, admissible control}\} = \inf\{{}^\circ J(U) \mid U \text{ an internal control}\}.$

If $\alpha$ is this common infimum, the set

$\{n \in {}^*\mathbb{N} \mid \text{there is an internal control } U \text{ with } J(U) \le \alpha + 1/n\}$

is internal and contains $\mathbb{N}$, and hence it has an infinite element $\gamma$. If $U$ is an internal control such that $J(U) \le \alpha + 1/\gamma$, the standard part $u$ of $U$ is an optimal relaxed control since $j(u) = {}^\circ J(U) = \alpha$.
To complete the proof, it only remains to construct the liftings $F$, $G$, $H$, and $Y$. Let $J \in {}^*\mathbb{N} - \mathbb{N}$ be the constant occurring in the nonstandard version of Krylov's inequality, Proposition 4.5.5. We choose $F$, $G$, and $H$ to be $J$-Lipschitz liftings of $f$, $g$, and $h$, and we make sure that for almost all nearstandard $(x, t)$, the functions $F(x, t, \cdot)$ and $G(x, t, \cdot)$ are S-continuous. We also require that ${}^\circ\det G$ is bounded away from zero; since we have assumed that $g$ is positive definite and symmetric, this does not interfere with the requirement that $G$ is $J$-Lipschitz (recall 4.5.6). Note that if $y \in \mathcal{Y}^p$ is an observation parameter and

$X(t) = \int_0^t F(X(s), s, U(y, s))\,ds + \int_0^t G(X(s), s, U(y, s))\,d\chi$

for some internal control $U$, then by Krylov's inequality

${}^\circ F(X(\omega, s), s, a) = f({}^\circ X(\omega, s), {}^\circ s, {}^\circ a)$ for all $a \in {}^*K$,
${}^\circ G(X(\omega, s), s, a) = g({}^\circ X(\omega, s), {}^\circ s, {}^\circ a)$ for all $a \in {}^*K$,
${}^\circ H(X(\omega, s), s) = h({}^\circ X(\omega, s), {}^\circ s),$

for almost all $(\omega, s)$ as needed for Lemma 4.6.9. If we can only show that there is a $Y$ such that

$Y(X(\omega)) = y({}^\circ X(\omega))$ a.e.
for all controls $U$, the proof will be finished. Since $y = (y_1, \ldots, y_p)$, we need only explain how to lift each component $y_i : \mathbb{R}^n \to \mathcal{Y}$. The trouble with $y_i$ is that it depends only on the value $x(t_i)$ of the process at time $t_i$, and not on the joint distribution $(x(t), t)$ as $t$ varies. This makes it impossible to apply Krylov's inequality to $y_i$ the way we did for $f$, $g$, and $h$. However, there is a similar inequality which holds for fixed $t$'s, namely that for all $t > 0$

(18)  $L(P)\{\omega : x(\omega, t) \in A\} \le C\,m(A)^{1/2},$

where $C$ is a constant depending only on $t$, the bounds on $f$, $g$, and $\det g$, and the Lipschitz coefficient $K$ of $g$. We omit the proof of this estimate, which is long, dull, and not very informative. To turn (18) into an internal inequality we argue as follows: if $H \in {}^*\mathbb{N}$, let an $H$-element be a cube in ${}^*\mathbb{R}^n$ of the form

$\prod_{i=1}^{n} \Bigl[\frac{k_i}{H}, \frac{k_i + 1}{H}\Bigr),$

where $(k_1, \ldots, k_n) \in {}^*\mathbb{Z}^n$. An $H$-set is an internal union of $H$-elements. For finite $H$ it follows from (19) that

(20)  $P\{\omega \mid X(\omega, t) \in A\} \le 3^{d/2} C\,{}^*m(A)^{1/2} + 1/H$

for all $H$-sets $A$. The factor $3^{d/2}$ is due to the fact that when $X(\omega, t)$ is in one $H$-element, $x$ may be in the same element or any one of its neighbors, and the extra term $1/H$ is included to compensate for the fact that $P\{X(\omega, t) \in A\}$ may be larger than $L(P)\{X(\omega, t) \in A\}$ by an infinitesimal amount. Since (20) is valid for all finite $H$, there must be an infinite $H_i$ for which it also holds. We let $Y_i$ be a lifting of $y_i$ (with respect to the ${}^*$version of the Lebesgue measure) which is constant on $H_i$-elements. It follows from (20) that

$Y_i(X(\omega)) = y_i({}^\circ X(\omega))$ a.e.,

and this completes the proof of Theorem 4.6.4.

Although much more deserves to be said about the stochastic system we have been discussing, we shall restrict ourselves to a few short remarks. First we would like to mention that Theorem 4.6.4 is a simplified version of the results in Cutland (1985a,b); the original treatment covers a more complicated and general situation which is really an amalgamation of the one we have used above and the one we shall discuss at the end of this section. Next we should admit that the feedback mechanism we have been using is partly dictated by technical considerations. As an example of the problems we run into in a more general setting, note that if the observation space $\mathcal{Y}$ is not discrete, then it is not clear that an internal control $U$ has a standard part satisfying anything like Lemma 4.6.6, the problem being
that $U(y, t)$ and $U(y', t)$ may be completely different for infinitely close $y$ and $y'$. Indeed, the existence of optimal relaxed controls for partially observed stochastic systems still seems to be a wide open question in the general case.

But there is more to stochastic control theory than the existence of optimal controls. We shall take a very brief look at one of the other aspects as a further indication of the strength of the nonstandard approach. If $H \in {}^*\mathbb{N}$, call an internal control $U$ an $H$-control if $U(y, \cdot)$ is constant on the intervals $[k/H, (k+1)/H) \cap T$, $k \in {}^*\mathbb{N}$, for all $y \in {}^*\mathcal{Y}^p$. Recalling the proof of Lemma 4.6.7, we see that if $u$ is an admissible relaxed control and $H$ is infinite, then $u$ is the standard part of an internal $H$-control. If $\epsilon \in \mathbb{R}_+$ and $\alpha$ is the lowest possible cost, the set

$\{H \in {}^*\mathbb{N} \mid \text{there is an } H\text{-control } U \text{ with } J(U) < \alpha + \epsilon\}$

is internal and contains all infinite $H$, hence it contains a finite element $H_\epsilon$. The standard part of an $H_\epsilon$-control is an ordinary (not measure-valued) admissible control which is constant on intervals of length $1/H_\epsilon$. Thus we have shown that the optimal relaxed control can be approximated arbitrarily well by very simple ordinary controls. This argument can be refined by also allowing $\Delta t$ to become finite, and we can then describe the system as a limit of discrete systems where the Brownian motion is replaced by random walks. Results of this kind are of interest in applications; see Christopheit (1983) for a discussion.

There is one natural question we have not touched on yet: When is there an optimal ordinary control? We shall postpone this until the end of the section.

B. Girsanov's Formula
In the stochastic systems we studied above, the drift coefficient $f$, the diffusion coefficient $g$, and the observations $y_i$ all depended only on the present value of the process. For many applications this is too restrictive; $f$, $g$, and $y_i$ will also depend on the past; they will be nonanticipating functions

$f : C[0,1] \times [0,1] \times K \to \mathbb{R}^n$,  $g : C[0,1] \times [0,1] \times K \to \mathbb{R}^n \otimes \mathbb{R}^n$,  $y_i : C[0,1] \to \mathcal{Y}$,

where $C[0,1]$ is short for $C([0,1], \mathbb{R}^n)$. Looking for liftings of $f$, $g$, and $y_i$, it is clear that they will be functions

$F : {}^*C[0,1] \times T \times {}^*K \to {}^*\mathbb{R}^n$,  $G : {}^*C[0,1] \times T \times {}^*K \to {}^*\mathbb{R}^n \otimes {}^*\mathbb{R}^n$,  $Y_i : {}^*C[0,1] \to {}^*\mathcal{Y}$,
but the big question is which measure on ${}^*C[0,1]$ they should be liftings with respect to. What we need, of course, is that $F$, $G$, and the $Y_i$'s are liftings with respect to all the measures induced on ${}^*C[0,1]$ by the controlled processes

$X^U = \int F(X^U, s, U(Y(X^U), s))\,ds + \int G(X^U, s, U(Y(X^U), s))\,d\chi,$

but it is impossible to define $F$, $G$, and $Y_i$ in this way since there are uncountably many of these measures and most of them are mutually singular. In the simpler case we studied above where $\mathbb{R}^n$ replaced $C[0,1]$, the liftings were picked with respect to the Lebesgue measure, and Krylov's inequality was used to prove that they were also liftings with respect to the measures induced by the $X^U$'s. It is possible to use a similar idea in the present setting, but the price we shall have to pay is that the diffusion coefficient $g$ is no longer allowed to depend on the control $u$; hence $g$ is a function

$g : C[0,1] \times [0,1] \to \mathbb{R}^n \otimes \mathbb{R}^n.$

The point is that if $G$ is a lifting of $g$ and

(21)  $Z = \int G(Z, s)\,d\chi,$

then (under reasonable conditions) all the measures induced by the $X^U$'s are absolutely continuous with respect to the one induced by $Z$. Hence we can define our liftings with respect to the latter, and the absolute continuity will take care of the rest.

The result we are referring to is a nonstandard version of Girsanov's formula. To prove it, we shall take a look at stochastic differential equations from a slightly different angle than we did in the last section. This change of viewpoint is not, strictly speaking, necessary (see Cutland, 1982), but we find that the alternative approach is not only a convenient way of proving Girsanov's theorem, but also interesting in its own right. The basic idea is that we can deal with the drift term by changing the underlying probability measure rather than changing the paths of the process.

As usual we work with the hyperfinite time line $T = \{0, \Delta t, 2\Delta t, \ldots, 1\}$ and the probability space $\Omega = \{-1, 1\}^{T \times \{1, \ldots, n\}}$ with normalized counting measure $P$. We let $\{\mathcal{A}_t\}_{t \in T}$ be the natural filtration on $\Omega$ and

$\chi(\omega, t) = \sum_{i=1}^{n} \sum_{s=0}^{t} \omega_i(s)\,\sqrt{\Delta t}\,e_i$
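an $n$-dimensional version of Anderson's random walk. By $\Sigma$ we shall denote the set of all internal maps $\sigma : T \to {}^*\mathbb{R}^n$ which are zero at the origin. Given a nonanticipating function $H : \Sigma \times T \to {}^*\mathbb{R}^n$, we let $P_{H,I}$ be the probability measure on $\Omega$ given by

(22)  $P_{H,I}(\omega) = \prod_{s=0}^{1} \prod_{i=1}^{n} \bigl(\tfrac12 + \tfrac12 H_i(\chi(\omega), s)\,\Delta\chi_i(\omega, s)\bigr),$

where $H_i$ and $\chi_i$ are the $i$th components of $H$ and $\chi$, respectively. This is the measure governing a random walk obtained by flipping unfair coins, $H$ being a measure of the unfairness. Note that the density of $P_{H,I}$ with respect to $P$ is

(23)  $(P_{H,I}/P)(\omega) = \prod_{s=0}^{1} \prod_{i=1}^{n} \bigl(1 + H_i(\chi(\omega), s)\,\Delta\chi_i(\omega, s)\bigr).$

If $E_{H,I}$ denotes expectation with respect to $P_{H,I}$, an easy calculation shows that

(24)  $E_{H,I}(\Delta\chi(t) \mid \mathcal{A}_t) = H(\chi, t)\,\Delta t.$

Hence

(25)  $W_{H,I}(t) = \chi(t) - \sum_{s=0}^{t} H(\chi, s)\,\Delta t$

is a martingale with respect to $(\Omega, \{\mathcal{A}_t\}, P_{H,I})$. In fact, it is more than just a martingale:

4.6.10. LEMMA. Assume that $\|H(\sigma, t)\|^2 \Delta t \approx 0$ for all $\sigma$ and $t$. Then the standard part of $W_{H,I}$ is a Brownian motion with respect to the Loeb measure $L(P_{H,I})$.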
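PROOF. Let $\Delta W_{H,I}(t)_i$ be the $i$th component of the increment $\Delta W_{H,I}(t)$. Then

$E_{H,I}(\Delta W_{H,I}(t)_i^2 \mid \mathcal{A}_t)(\omega) = \bigl(\sqrt{\Delta t} - H_i(\chi(\omega), t)\,\Delta t\bigr)^2 \bigl(\tfrac12 + \tfrac12 H_i(\chi(\omega), t)\sqrt{\Delta t}\bigr)$
$\qquad + \bigl(-\sqrt{\Delta t} - H_i(\chi(\omega), t)\,\Delta t\bigr)^2 \bigl(\tfrac12 - \tfrac12 H_i(\chi(\omega), t)\sqrt{\Delta t}\bigr)$
$\qquad = \Delta t - H_i(\chi(\omega), t)^2\,\Delta t^2.$

Since $\|H(\chi(\omega), t)\|^2 \Delta t \approx 0$, Proposition 4.4.18 tells us that each component of ${}^\circ W^+_{H,I}$ is a Brownian motion with respect to $L(P_{H,I})$, and since the components of ${}^\circ W^+_{H,I}$ are independent, the lemma follows.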
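
The one-step identities behind this proof can be checked in exact rational arithmetic. The sketch below looks at a single coordinate and a single time step of the unfair coin flip, with up-probability $\tfrac12 + \tfrac12 h\sqrt{\Delta t}$, and computes the mean of the walk increment, the mean of the compensated increment, and its second moment. The particular values of $h$ and $\sqrt{\Delta t}$ are arbitrary illustrative choices; $\sqrt{\Delta t}$ is taken rational so that no rounding occurs.

```python
from fractions import Fraction

def one_step_moments(h, root_dt):
    """Exact one-step moments of the unfair coin flip with drift parameter h.

    root_dt plays the role of sqrt(dt); the step is +root_dt with probability
    1/2 + h*root_dt/2 and -root_dt otherwise.
    """
    dt = root_dt * root_dt
    p_up = Fraction(1, 2) + h * root_dt / 2
    p_down = 1 - p_up
    # Mean of the walk increment.
    mean_dchi = root_dt * p_up - root_dt * p_down
    # Mean and second moment of the compensated increment (step minus h*dt).
    mean_dW = (root_dt - h * dt) * p_up + (-root_dt - h * dt) * p_down
    second_dW = (root_dt - h * dt) ** 2 * p_up + (-root_dt - h * dt) ** 2 * p_down
    return dt, mean_dchi, mean_dW, second_dW

h = Fraction(3, 2)
root_dt = Fraction(1, 100)
dt, mean_dchi, mean_dW, second_dW = one_step_moments(h, root_dt)
```

The compensated increment has mean zero and second moment $\Delta t - h^2 \Delta t^2$, so its variance differs from $\Delta t$ only by a term of order $\Delta t^2$; this is why the hypothesis $\|H\|^2 \Delta t \approx 0$ suffices.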
Since $\chi(t) = W_{H,I}(t) + \sum_{s=0}^{t} H(\chi, s)\,\Delta t$ by (25), we see that if $H$ is a suitable lifting of a standard function $h$, then ${}^\circ\chi^+$ is a solution of the stochastic differential equation

(26)  $x(t) = \int_0^t h(x, s)\,ds + b(t)$

on $(\Omega, L(P_{H,I}))$. In the terminology of the last section, this is a weak solution as the Brownian motion $b = {}^\circ W^+_{H,I}$ depends on the coefficient $h$.

Let us take a brief look at what we have been doing. Starting with a Brownian motion ${}^\circ\chi^+$ on $(\Omega, L(P))$, we have constructed a new measure $L(P_{H,I})$ such that on $(\Omega, L(P_{H,I}))$ the "same" process is a solution to (26). The connection between $L(P)$ and $L(P_{H,I})$ is given by (23). We shall now explain how the method can be extended to deal with stochastic differential equations of the form

(27)  $x(t) = \int_0^t f(x, s)\,ds + \int_0^t g(x, s)\,db(s).$

The idea is the same as before; given a solution of the simpler equation

(28)  $x(t) = \int_0^t g(x, s)\,db(s)$
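we turn it into a solution of (27) by changing the underlying measure. Assume that

$F : \Sigma \times T \to {}^*\mathbb{R}^n$,  $G : \Sigma \times T \to {}^*\mathbb{R}^n \otimes {}^*\mathbb{R}^n$

are nonanticipating, and that $G(\sigma, t)$ is invertible for all $\sigma$ and $t$. For each $\sigma \in \Sigma$, let $\sigma_G$ be given by

$\sigma_G(t) = \sum_{s=0}^{t} G(\sigma_G, s)\,\Delta\sigma(s),$

and similarly let $\chi_G : \Omega \times T \to {}^*\mathbb{R}^n$ be the process

(29)  $\chi_G(\omega, t) = \sum_{s=0}^{t} G(\chi_G(\omega), s)\,\Delta\chi(\omega, s).$

Note that (29) is the nonstandard version of (28). We define a nonanticipating function $H : \Sigma \times T \to {}^*\mathbb{R}^n$ by

(30)  $H(\sigma, t) = G^{-1}F(\sigma_G, t).$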
Combining (29), (30), and (25), we get

(31)  $\chi_G(t) = \sum_{s=0}^{t} F(\chi_G, s)\,\Delta t + \sum_{s=0}^{t} G(\chi_G, s)\,\Delta W_{H,I}(s),$

which is a nonstandard version of (27). Let us introduce the following notation. We shall write $P_{F,G}$ and $W_{F,G}$ for $P_{H,I}$ and $W_{H,I}$, respectively, where $H$ is defined as in (30). When we consider $\chi_G$ as a stochastic process on $(\Omega, P_{F,G})$, we shall denote it by $\chi_{F,G}$; when we consider it as a process on $(\Omega, P)$, we shall keep the old notation $\chi_G$. In this way we avoid having to refer explicitly to the underlying measure every time we mention a process. By Lemma 4.6.10 and formula (31) we have

4.6.11. LEMMA. Assume that $\|G^{-1}F(\sigma, t)\|^2 \Delta t \approx 0$ for all $\sigma$ and $t$, and that the sets

(32)  $\{(\omega, t) \mid {}^\circ F(\chi_{F,G}, t) \ne f({}^\circ\chi^+_{F,G}, {}^\circ t)\},$
(33)  $\{(\omega, t) \mid {}^\circ G(\chi_{F,G}, t) \ne g({}^\circ\chi^+_{F,G}, {}^\circ t)\}$

have $L(P_{F,G} \times \Lambda)$ measure zero, where $\Lambda$ is the normalized counting measure on $T$. Then ${}^\circ\chi^+_{F,G}$ is a solution to

(34)  $x(t) = \int_0^t f(x, s)\,ds + \int_0^t g(x, s)\,db(s)$

on $(\Omega, L(P_{F,G}))$.

Recall that we are not interested in the solution of (34) for its own sake, but rather in its density with respect to the solution of

(35)  $x(t) = \int_0^t g(x, s)\,db(s).$

Since $\chi_G$ and $\chi_{F,G}$ have the same paths, this density is easy to calculate; in fact, formula (23) tells us that

(36)  $(P_{F,G}/P)(\omega) = \prod_{s=0}^{1} \prod_{i=1}^{n} \bigl(1 + H_i(\chi(\omega), s)\,\Delta\chi_i(\omega, s)\bigr).$

To study the expression on the right-hand side, we introduce a process $M : \Omega \times S \to {}^*\mathbb{R}$
as follows: $S$ is the time line obtained by inserting $n - 1$ new points between $t$ and $t + \Delta t$ for each $t \in T$, i.e.,

$S = \{0, \Delta s, 2\Delta s, \ldots, 1\},$

where $\Delta s = \Delta t/n$. A point in $S$ is thus of the form $s = t + i\,\Delta s$ for a $t \in T$ and an $i$ between $0$ and $n$. We define $M$ by letting

(37)  $M(0) = 1,$
(38)  $M(t + (i + 1)\Delta s) = M(t + i\,\Delta s)\bigl(1 + H_i(\chi, t)\,\Delta\chi_i(t)\bigr).$

It is clear that $M$ is a martingale with respect to $P$ and that

(39)  $(P_{F,G}/P)(\omega) = M(\omega, 1).$

4.6.12. LEMMA. If $\|H(\sigma, t)\| \le K$ for all $\sigma$ and $t$, then

(40)  $E(M(s)^2) \le \exp(nK^2 s)$

for all $s \in S$.

PROOF. First note that if $s = t + i\,\Delta s$, then

$E(M(s + \Delta s)^2) = E\bigl(M(s)^2 (1 + H_i(\chi, t)\,\Delta\chi_i(t))^2\bigr) = E\bigl(M(s)^2 (1 + H_i(\chi, t)^2\,\Delta t)\bigr) \le E(M(s)^2)(1 + nK^2\,\Delta s).$

By induction we see that if $f : S \to {}^*\mathbb{R}$ is an internal function such that $f(0) = 1$ and $f(s + \Delta s) \le (1 + nK^2\,\Delta s) f(s)$, then

$f(s) \le (1 + nK^2\,\Delta s)^{s/\Delta s} \le \exp(nK^2 s).$

Putting $f(s) = E(M(s)^2)$, the lemma follows.

If $A \subset \Omega$ has infinitesimal $P$ measure, then by (39), Hölder's inequality and the lemma,

$P_{F,G}(A) = E(1_A M(1)) \le E(1_A)^{1/2} E(M(1)^2)^{1/2} \le P(A)^{1/2} \exp(nK^2/2) \approx 0.$

Hence $L(P_{F,G})$ is absolutely continuous with respect to $L(P)$. The next lemma weakens the boundedness assumption on $H$.

4.6.13. LEMMA. Let $F$ and $G$ be S-bounded, nonanticipating functions. Assume that for all S-bounded $\sigma \in \Sigma$, we have ${}^\circ|G^{-1}F(\sigma, t)| < \infty$ for all $t \in T$. Then $L(P_{F,G})$ is absolutely continuous with respect to $L(P)$, and
PROOF. We already know that the conclusion is true if $|G^{-1}F|$ is uniformly S-bounded. Let

$B_N = \{\sigma \in \Sigma \mid \sup_{t \in T} |\sigma(t)| \le N\}$

for each $N \in \mathbb{N}$, and observe that

If $L(P_{F,G})$ is not absolutely continuous with respect to $L(P)$, there must be an internal set $A$ contained in some $B_N$, $N \in \mathbb{N}$, such that $L(P)(A) = 0$ and $L(P_{F,G})(A) > 0$. Define $\tilde F : \Sigma \times T \to {}^*\mathbb{R}^n$ by

$\tilde F(\sigma, t) = F(\sigma, t)$ if $|\sigma(s)| \le N$ for all $s \le t$, and $\tilde F(\sigma, t) = 0$ otherwise.

Then ${}^\circ|G^{-1}\tilde F|$ is uniformly S-bounded, and $P_{\tilde F,G}$ and $P_{F,G}$ agree on $B_N$. Thus $L(P_{\tilde F,G})(A) = L(P_{F,G})(A) > 0$, while $L(P)(A) = 0$, contradicting the preceding lemma.

The conditions in Lemma 4.6.13 are still far from optimal; e.g., the S-boundedness of $F$ and $G$ can be replaced by bounds of the form

$|F(\sigma, t)|, |G(\sigma, t)| \le K(1 + \|\sigma\|),$
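where $\|\cdot\|$ is the supremum norm, but we shall leave all further refinements to the reader. However, although we shall not need it, we would like to bring our expression for the density $\partial L(P_{F,G})/\partial L(P)$ to a more familiar form.

4.6.14. NONSTANDARD GIRSANOV FORMULA. Let $F$ and $G$ be S-bounded nonanticipating functions, and assume that for all S-bounded $\sigma \in \Sigma$, we have ${}^\circ|G^{-1}F(\sigma, t)| < \infty$ for all $t \in T$. Then $L(P_{F,G})$ is absolutely continuous with respect to $L(P)$ and

(42)  $\frac{\partial L(P_{F,G})}{\partial L(P)}(\omega) = {}^\circ\exp\Bigl(\sum_{s=0}^{1} {}^tF A^{-1}(\chi_G, s)\,\Delta\chi_G(s) - \tfrac12 \sum_{s=0}^{1} {}^tF A^{-1} F(\chi_G, s)\,\Delta t\Bigr),$

where $A = G \cdot {}^tG$.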
where we have used the Taylor expansion $\ln(1 + x) \approx x - x^2/2$. Recalling that by (31) $d\chi = G^{-1}(\chi_G)\,d\chi_G$, we get (42), and the theorem is proved.
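
On a very small finite walk the change of measure can be verified by brute force, which may help to make the roles of the density $M$ and of the bound in Lemma 4.6.12 concrete. The sketch below takes dimension $n = 1$ and eight time steps, enumerates all coin-flip sequences, and computes the density as a product of factors $1 + H\,\Delta\chi$ together with the compensated walk $\chi - \sum H\,\Delta t$. The drift function, the number of steps, and all names are hypothetical choices for this illustration, and floating-point arithmetic replaces the hyperfinite one.

```python
import itertools
import math

def density_table(H, n_steps):
    """For every path omega in {-1, 1}^n_steps return (M(omega), W(omega)).

    M is the product of the factors 1 + H(chi, t) * dchi (density w.r.t. the
    fair-coin measure) and W is chi(1) minus the accumulated drift sum H * dt.
    """
    dt = 1.0 / n_steps
    root_dt = math.sqrt(dt)
    table = []
    for omega in itertools.product((-1, 1), repeat=n_steps):
        chi, M, drift_sum = 0.0, 1.0, 0.0
        for k, step in enumerate(omega):
            h = H(chi, k * dt)          # nonanticipating: uses the path so far
            M *= 1.0 + h * step * root_dt
            drift_sum += h * dt
            chi += step * root_dt
        table.append((M, chi - drift_sum))
    return table

K = 1.0                                  # bound on |H|
H = lambda chi, t: K * math.cos(chi)     # hypothetical bounded drift
n_steps = 8
table = density_table(H, n_steps)
p = 0.5 ** n_steps                       # fair-coin probability of each path

mean_M = sum(M for M, _ in table) * p                # E(M(1)) under P
second_moment = sum(M * M for M, _ in table) * p     # E(M(1)^2) under P
mean_W_tilted = sum(M * W for M, W in table) * p     # E of W under the tilted measure
largest_atom = max(M for M, _ in table) * p          # tilted mass of the heaviest path
holder_bound = math.sqrt(p) * math.exp(K * K / 2)    # P(A)^(1/2) * exp(n K^2 / 2), n = 1
```

Here `mean_M` equals one up to rounding (so the tilted weights form a probability measure), `mean_W_tilted` vanishes (the compensated walk is centered under the new measure), and `second_moment` stays below $\exp(K^2)$. The last two quantities illustrate the Hölder estimate for a one-point event.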
In our work on stochastic control theory we shall only use the fact that $L(P_{F,G})$ is absolutely continuous with respect to $L(P)$, and not the explicit formula for the Radon-Nikodym derivative given in (42). All we shall need to know is that a lifting with respect to $P$ is also a lifting with respect to $P_{F,G}$. For other applications, however, formula (42) is of the greatest importance; see Stroock and Varadhan (1979) for numerous examples.

C. Optimal Controls. Dependence on the Past

We shall now take a look at a stochastic control problem where the coefficients $f$, $g$, and the observations $y_i$ are allowed to depend on the past of the process. This means that $f$ and $g$ will be functions

$f : C[0,1] \times [0,1] \times K \to \mathbb{R}^n$,  $g : C[0,1] \times [0,1] \to \mathbb{R}^n \otimes \mathbb{R}^n$,

and that for each $i$, $y_i$ is a function

$y_i : C[0,1] \to \mathcal{Y}$,

where $\mathcal{Y}$ is a countable observation space as before. We shall assume that $f$ and $g$ are nonanticipating in the sense that for all $x$, $t$, $a$, the values $f(x, t, a)$ and $g(x, t)$ depend only on the values of $x$ up to time $t$. Similarly, if

$0 \le t_1 < t_2 < \cdots < t_p < 1$
are the times observations are made, then the $i$th observation $y_i(x)$ depends only on the values of $x$ up to time $t_i$. The sequence $(y_1(x), y_2(x), \ldots, y_p(x)) \in \mathcal{Y}^p$ of observations is denoted by $y(x)$. If $u$ is a relaxed admissible control

$u : \mathcal{Y}^p \times [0,1] \to \mathcal{M}(K)$

as in Definition 4.6.2, we are interested in the process

(43)  $x^u(t) = \int_0^t f(x^u, s, u(y(x^u), s))\,ds + \int_0^t g(x^u, s)\,db(s).$

Given a cost function $h : C[0,1] \to \mathbb{R}$, we want to minimize the cost

$j(u) = E(h(x^u)).$

We shall use the following conditions on $f$, $g$, and $h$.

4.6.15. CONDITION. Assume that:
(i) $h$ is a measurable, bounded function;
(ii) $f$ and $g$ are bounded, measurable, and nonanticipating, and $f$ is continuous in the third variable;
(iii) $g$ is uniformly Lipschitz in the first variable, i.e., there is a number $L \in \mathbb{R}$ such that $|g(x, t) - g(y, t)| \le L\|x - y\|$ for all $x$, $y$, $t$, where $\|\cdot\|$ denotes the supremum norm;
(iv) for each $n$, the function $g^{-1}f(x, t, a)$ is bounded on the set

$A_n = \{(x, t, a) \in C[0,1] \times [0,1] \times K \mid \|x\| \le n\}.$
The main result is due to Cutland (1985a):

4.6.16. THEOREM. If Condition 4.6.15 is satisfied there exists an optimal relaxed control.

PROOF. We shall first construct the nonstandard counterparts of $f$, $g$, $h$, and $y$. Let $\Lambda$ be the normalized counting measure on $T$. Since $g$ is continuous in the first variable, we can find a nonanticipating lifting $G : \Sigma \times T \to {}^*\mathbb{R}^n \otimes {}^*\mathbb{R}^n$ such that for all $t$ outside a set of measure zero with respect to $L(\Lambda)$,

(44)  ${}^\circ G(\sigma, t) = g({}^\circ\sigma^+, {}^\circ t)$

for all nearstandard $\sigma$ (recall Proposition 4.3.16). Let

(45)  $\chi_G(t) = \sum_{s=0}^{t} G(\chi_G, s)\,\Delta\chi(s)$
be the process defined in (29). Since $f$ is continuous in the third variable, it has a nonanticipating lifting $F : \Sigma \times T \times {}^*K \to {}^*\mathbb{R}^n$ such that for $L(P \times \Lambda)$ almost all $(\omega, t)$, we have

(46)  ${}^\circ F(\chi_G(\omega), t, a) = f({}^\circ\chi_G^+(\omega), {}^\circ t, {}^\circ a)$

for all $a \in {}^*K$. We let $H$ be a lifting of $h$ such that

(47)  ${}^\circ H(\chi_G(\omega)) = h({}^\circ\chi_G^+(\omega))$

for $L(P)$ almost all $\omega$. We can obviously choose $F$, $G$, and $H$ S-bounded. Note that we can also choose $F$ and $G$ such that $|G^{-1}F(\sigma, t, a)|$ is S-bounded when we restrict $\sigma$ to an S-bounded set. For each $i$, choose $\tilde t_i \in T$ infinitely close to the observation time $t_i$, and let

(48)  $Y_i : \Sigma \to {}^*\mathcal{Y}$

be a lifting of $y_i$ such that

(49)  ${}^\circ Y_i(\chi_G(\omega)) = y_i({}^\circ\chi_G^+(\omega))$ a.e.,

and $Y_i$ depends only on $\chi_G(\omega)$ up to time $\tilde t_i$.

We are now ready to define the internal counterparts of the $x^u$ processes. Given an internal control $U : {}^*\mathcal{Y}^p \times T \to {}^*K$, we let $F^U : \Sigma \times T \to {}^*\mathbb{R}^n$ be defined by $F^U(\sigma, t) = F(\sigma, t, U(Y(\sigma), t))$. The process $\chi_{F^U,G}$ [i.e., $\chi_G$ considered as defined on $(\Omega, P_{F^U,G})$] is a hyperfinite version of $x^u$ since by (31)

(50)  $\chi_{F^U,G}(t) = \sum_{s=0}^{t} F^U(\chi_{F^U,G}, s)\,\Delta t + \sum_{s=0}^{t} G(\chi_{F^U,G}, s)\,\Delta W_{F^U,G}(s),$

where ${}^\circ W^+_{F^U,G}$ is a Brownian motion on $(\Omega, L(P_{F^U,G}))$.

Let $u$ be the standard part of $U$. We shall show that ${}^\circ\chi^+_{F^U,G}$ is a solution of (43). To this end let $\Omega_0$ be the set of all $\omega \in \Omega$ such that (49) holds for all $i$ and (46) holds for almost all $t$ and all $a$. By Lemma 4.6.6 we see that if $\omega \in \Omega_0$, then

${}^\circ \sum_{s=0}^{t} F(\chi_G(\omega), s, U(Y(\chi_G(\omega)), s))\,\Delta t = \int_0^{{}^\circ t} f({}^\circ\chi_G^+(\omega), s, u(y({}^\circ\chi_G^+(\omega)), s))\,ds.$

Since $\Omega_0$ has $L(P)$-measure one and $L(P_{F^U,G})$ is absolutely continuous with respect to $L(P)$, the last equation holds $L(P_{F^U,G})$ almost everywhere. By a similar argument
$L(P_{F^U,G})$ almost everywhere. Since $\chi_G$ and $\chi_{F^U,G}$ have the same sample paths, it follows that ${}^\circ\chi^+_{F^U,G}$ is a solution of (43). Moreover, since $H$ is a lifting of $h$ also with respect to $P_{F^U,G}$, we get

$j(u) = {}^\circ J(U),$

where

$J(U) = E_{F^U,G}\bigl(H(\chi_{F^U,G}(\omega))\bigr)$

(here we have tacitly assumed that the solutions of (43) are unique in distribution; this follows from the Lipschitz condition on $g$). Thus

$\inf\{j(u) \mid u \text{ an admissible relaxed control}\} = \inf\{{}^\circ J(U) \mid U \text{ an internal control}\}.$
Since the last infimum is achieved, so is the first, and the theorem is proved.

Cutland (1985a) has extended the result above to a situation which incorporates both the present and the Markov case we looked at before. This is achieved by considering a system consisting of two interacting parts, an "observation" part and a "performance" part, both modeled by stochastic differential equations. Both parts are allowed to depend on the past, but only the noise in the observation part can be controlled. The observation scheme is also slightly different. Without going deeper into this problem, we only mention that Cutland's methods are rather similar to the ones we have used, the main difference being that he works in a *-continuous setting with *-Brownian motions instead of hyperfinite random walks (see the comment at the end of Section 4.5).

We would like to point out that Girsanov's theorem was used in the proof of Theorem 4.6.16 only to show that $F$, $G$, $H$, and $Y$ were liftings of $f$, $g$, $h$, and $y$ with respect to all the measures $P_{F^U,G}$, and not only with respect to $P$. A natural question is whether absolute continuity is the "best" condition for the existence of simultaneous liftings of this sort, or whether it is possible to find a weaker one. Observe that by Anderson's Lusin theorem, 3.4.9, ${}^*f$ is a lifting of $f$ with respect to all standard measures ${}^*\mu$, and hence absolute continuity is not always required. However, since the internal measures we construct in nonstandard probability theory are rarely standard, the Lusin theorem is often of little use; what we need is an extension which applies to a larger class of internal measures. No such result seems to be known yet, and we shall make no attempt to prove one here, restricting ourselves to the rather cryptic remark that Anderson's work on standardly distributed measures (Anderson, 1982) may turn out to contain valuable ideas for such a project.
We have promised to say a few words about the existence of optimal ordinary controls. In the setting of Theorem 4.6.16 there is a general strategy for turning an optimal relaxed control into an optimal ordinary control which works under suitable convexity assumptions on the range of $f$. The idea is that if $f$ has a convex and closed range, then for all $x$, $t$ and each measure $u$ in $\mathcal{M}(K)$ there is an element $a(x, t, u) \in K$ such that

$f(x, t, a(x, t, u)) = \int f(x, t, c)\,du(c).$

Given a relaxed control $u$, the idea is to choose an ordinary control $\tilde u$ by

$\tilde u(y(x), t) = a(x, t, u(y(x), t)).$

Since $y$ contains only partial information about $x$, this definition need not make sense as we may have $y(x) = y(x')$ and $a(x, t, u(y(x), t)) \ne a(x', t, u(y(x'), t))$. To circumvent this difficulty, one has to assume that $f$ is of the form $f(x, t, a) = f_1(x, t) \cdot f_2(t, a)$. Once this assumption is made, the difficulties disappear and the argument can be carried through [see Elliott and Kohlmann (1982) for the details].

When the control is also allowed to enter into the martingale term as in Theorem 4.6.4, the situation becomes more complicated. Not only do we need a convexity assumption on $g^2$ guaranteeing the existence of an element $b(x, t, u) \in K$ satisfying

$g^2(x, t, b(x, t, u)) = \int g^2(x, t, c)\,du(c),$

but we must also be able to choose $a(x, t, u) = b(x, t, u)$. The simplest way of ensuring this is perhaps to let $f$ and $g$ be of the form

$f(x, t, a) = f_1(x, t)\,h(t, a)$,  $g(x, t, a) = g_1(x, t)\,h^{1/2}(t, a)$

for the same function $h$. We shall leave it to the reader to obtain an optimal ordinary control in this case.

The theory of optimal stochastic controls is a fascinating area both in its own right and as a laboratory for the techniques of nonstandard probability theory. There are books by Krylov (1980), Elliott (1982), and Davis (1984) for those who want to know more about the subject.

4.7. STOCHASTIC INTEGRATION IN INFINITE-DIMENSIONAL SPACES
In the last sections of this chapter we shall extend the theory of Brownian motion and stochastic integration in two directions; first we shall consider
processes taking values in infinite-dimensional spaces, and then, in the next section, we shall turn to the case of multidimensional “time” parameters. Infinite-dimensional Brownian motions are important for the study of stochastic partial differential equations. As an example we shall take a look at a problem from hydrodynamics. Recall that the velocity of a fluid at position $x$ at time $t$ is given by a function $u(x, t)$ satisfying the Navier-Stokes equation
$$\partial_t u + (u \cdot \nabla) u = -\nabla p + \nu\, \Delta u + f(x, t) \tag{1}$$
(where $p$ is the pressure, $\nu$ is the viscosity, and $f$ is the external force) and the incompressibility assumption
$$\nabla \cdot u = 0. \tag{2}$$
To (1) and (2) one must in each particular case add the appropriate boundary conditions. The mathematical formulation we have just given is the one most often found in the literature, but from the point of view of integral equations, there is another that is more convenient. Using the incompressibility assumption $\nabla \cdot u = 0$, the pressure $p$ can be eliminated from (1), and the system (1), (2) is reduced to the equation
$$\partial_t u + (u \cdot \nabla) u = \nu\, \Delta u + K f, \tag{3}$$
where $K$ is a certain projection operator. The actual form of $K$ is of no importance to us here, and we only refer the interested reader to Chow (1978) and Vishik et al. (1979). We can consider (3) as an infinite-dimensional evolution equation; for each $t$ the solution is a function $u_t(\cdot)$ in a suitably chosen function space.

It is not very realistic to assume full knowledge of the external force $f$; in many situations it may be reasonable to introduce it as a stochastic term. The natural candidate is white noise, and we are led to the stochastic equation
$$du = \nu\, \Delta u\, dt - (u \cdot \nabla) u\, dt + h\, dt + g\, dw, \tag{4}$$
where h is the deterministic contribution of the external force, and w is a “Brownian motion.” But what kind of a Brownian motion can w be? Since u takes values in an infinite-dimensional space, so does w. In order to understand (4), we must first study infinite-dimensional Brownian motions and their stochastic integrals. This, in fact, is almost all we shall do in this section; only briefly shall we return to (4).
A. Brownian Motions on Hilbert Spaces
Brownian motions are defined in terms of rotationally invariant Gaussian measures. In Section 3.5 (recall Gross's theorem), we noticed that in the infinite-dimensional case such measures exist only in a rather curious sense, not as measures on the original Hilbert space, but rather as measures on a larger Banach space. We also discovered an easy way of constructing these measures by pulling down a hyperfinite-dimensional Gaussian measure using the Banach space topology. The same idea will be used to define Brownian motions on a Hilbert space, starting with a hyperfinite-dimensional random walk and taking the standard part with respect to a measurable norm.

Let us begin by recalling and extending some of the results from Section 2.2. If $(E, |\cdot|)$ is an internal normed linear space, a standard generating set for $E$ is a set $\{v_n\}_{n \in \mathbb{N}}$ of elements from $E$ with $0 < {}^\circ|v_n| < \infty$ for all $n \in \mathbb{N}$. An element $u \in E$ is called $|\cdot|$-pre-nearstandard with respect to $\{v_n\}$ if for all $\varepsilon \in \mathbb{R}_+$, there is a $v = \sum_{n=1}^{k} a_n v_n$, with $k \in \mathbb{N}$, $a_1, \dots, a_k \in \mathbb{R}$, such that $|u - v| < \varepsilon$. We denote the set of all pre-nearstandard points by $\mathrm{Pns}(E, |\cdot|)$. Let $\sim_{|\cdot|}$ be the equivalence relation on $\mathrm{Pns}(E, |\cdot|)$ defined by $u \sim_{|\cdot|} v$ if and only if $|u - v| \approx 0$. We let ${}^\circ u$ be the equivalence class of $u$, and define a norm ${}^\circ|\cdot|$ on ${}^\circ E_{|\cdot|} = \mathrm{Pns}(E, |\cdot|)/\!\sim_{|\cdot|}$ by ${}^\circ|{}^\circ u| = \mathrm{st}|u|$. The following lemma is just a slight reformulation of Proposition 2.2.2.

4.7.1. LEMMA. $({}^\circ E_{|\cdot|}, {}^\circ|\cdot|)$ is a Banach space, and the set of finite linear combinations $\sum_{i=1}^{k} a_i\, {}^\circ v_i$ is dense in ${}^\circ E_{|\cdot|}$.

The mapping $\mathrm{st}_{|\cdot|} : \mathrm{Pns}(E, |\cdot|) \to {}^\circ E_{|\cdot|}$ defined by $\mathrm{st}_{|\cdot|}(x) = {}^\circ x$ is called the standard part map. This slight abuse of terminology is justified by $\mathrm{st}_{|\cdot|}$ having exactly the same properties as a real standard part function. For example, mimicking the proof of Theorem 3.2.4 (1), we get:

4.7.2. PROPOSITION. Let $(\Omega, \mathcal{A}, P)$ be a hyperfinite probability space, and let $Y : \Omega \to {}^\circ E_{|\cdot|}$ be an $L(\mathcal{A})$-measurable random variable. Then there exists an internal, $\mathcal{A}$-measurable random variable $X : \Omega \to E$ such that $\mathrm{st}_{|\cdot|}(X(\omega)) = {}^\circ X(\omega) = Y(\omega)$ for $L(P)$ almost all $\omega$.
We call $X$ a lifting of $Y$. Using 4.7.2 we can derive the natural counterparts of the lifting results in Section 4.3.

Let $(E, (\cdot, \cdot))$ be a hyperfinite inner product space where the norm is denoted by $\|\cdot\|$. Choose an orthonormal generating set $\{e_n\}_{n \in \mathbb{N}}$ in $E$, and let $\{e_n\}_{n \le \gamma}$, $\gamma \in {}^*\mathbb{N} - \mathbb{N}$, be an extension of $\{e_n\}_{n \in \mathbb{N}}$ to an orthonormal basis for $E$. We shall construct a hyperfinite-dimensional random walk on $E$. Let $T = \{0, \Delta t, 2\Delta t, \dots, 1\}$ be a hyperfinite time line, and let $H \in {}^*\mathbb{N} - \mathbb{N}$ be at least as large as the dimension $\gamma$ of $E$. The sample space $\Omega$ consists
of all internal mappings $\omega : \{1, 2, \dots, H\} \times T \to \{-1, 1\}$, and $P$ is the normalized counting measure on $\Omega$. We shall write $\omega_i(s)$ for $\omega(i, s)$. The process $\chi : \Omega \times T \to E$ is defined analogously to the $n$-dimensional random walk in Section 4.5;
$$\chi(\omega, t) = \sum_{i=1}^{\gamma} \sum_{s=0}^{t} \sqrt{\Delta t}\; \omega_i(s)\, e_i. \tag{5}$$
We let $(\Omega, \{\mathcal{A}_t\}, P)$ be the natural internal filtration on $\Omega$; i.e., $\mathcal{A}_t$ is generated by the equivalence relation $\omega \sim_t \omega'$ iff $\omega_i(s) = \omega'_i(s)$ for all $s < t$ and all $i$. It is easy to see that $E(\|\chi_t\|^2) = \gamma \cdot t$ and that $\chi_t$ is not nearstandard in $(E, \|\cdot\|)$. It does, however, have the right finite-dimensional properties:

4.7.3. PROPOSITION. Let $P : E \to E_0$ be the projection onto a finite-dimensional subspace $E_0$. Then ${}^\circ(P\chi)$ is a Brownian motion on $({}^\circ E_0, {}^\circ\|\cdot\|)$.
PROOF It is not hard to see that $P\chi$ is S-continuous; in fact, this is a special case of Theorem 4.7.6 below, and we postpone the proof until then. To show that ${}^\circ(P\chi)$ is a Brownian motion, we first check the distributions of one-dimensional projections. Let $Q$ be the projection on a unit vector $e \in E_0$, and let $q_k(e)$ be the Fourier coefficient $(e, e_k)$. The quadratic variation of $Q\chi$ is given by
$$\cdots = \sum_{r=s}^{t} \Delta t \sum_{i=1}^{\gamma} q_i(e)^2 = t - s,$$
where we have used the independence of $\omega_i(r)$ and $\omega_j(r)$ for $i \neq j$. By Proposition 4.4.18, ${}^\circ Q\chi$ is a Brownian motion.

To show that ${}^\circ(P\chi)$ itself is a Brownian motion, it suffices to show that two one-dimensional projections ${}^\circ Q\chi$ and ${}^\circ Q'\chi$ along orthonormal axes $e$ and $e'$ are independent. Using that $\omega_i(s)$ and $\omega_j(r)$ are independent when $(i, s) \neq (j, r)$, we get
$$E\bigl((\chi_t, e)(\chi_t, e')\bigr) = \sum_{s=0}^{t} \sum_{i=1}^{\gamma} \Delta t\; q_i(e)\, q_i(e') = t\,(e, e') = 0,$$
which shows that ${}^\circ Q\chi$ and ${}^\circ Q'\chi$ are uncorrelated and hence independent.

Let $(H, \|\cdot\|)$ be the Hilbert space $({}^\circ E, {}^\circ\|\cdot\|)$. We have just seen that the hyperfinite random walk $\chi$ is in some sense a Brownian motion on $H$.
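The moment identities behind Proposition 4.7.3 can be checked by brute force in a small finite-dimensional analogue of the walk (5). The sketch below is only an illustration: the finite dimension `gamma`, the number of time steps, the function name `second_moments`, and the convention that the sum over $s$ in (5) runs over $s < t$ are all assumptions made here, not part of the text. It enumerates every coin sequence and returns the exact values of $E\|\chi_t\|^2$, $E((\chi_t, e)^2)$ and $E((\chi_t, e)(\chi_t, e'))$.

```python
from fractions import Fraction
from itertools import product


def second_moments(gamma, steps, t_index, e, e_prime):
    """Exact second moments of a finite analogue of the random walk (5).

    gamma      : dimension of the space (finite here, hyperfinite in the text)
    steps      : number of time steps, so dt = 1/steps
    t_index    : the walk is evaluated at t = t_index * dt
    e, e_prime : coordinates of two directions in the basis e_1, ..., e_gamma
    Assumption: the inner sum in (5) runs over the time points s < t.
    """
    dt = Fraction(1, steps)
    n_coins = gamma * t_index
    norm_sq = var_e = cov = Fraction(0)
    for omega in product((-1, 1), repeat=n_coins):
        # i-th coordinate of chi_t, divided by sqrt(dt)
        coords = [sum(omega[i * t_index + s] for s in range(t_index))
                  for i in range(gamma)]
        proj = sum(c * a for c, a in zip(coords, e))
        proj_prime = sum(c * a for c, a in zip(coords, e_prime))
        norm_sq += dt * sum(c * c for c in coords)
        var_e += dt * proj * proj
        cov += dt * proj * proj_prime
    total = 2 ** n_coins
    return norm_sq / total, var_e / total, cov / total


# Two orthonormal directions with rational coordinates (a 3-4-5 rotation).
e = (Fraction(3, 5), Fraction(4, 5), Fraction(0))
e_prime = (Fraction(-4, 5), Fraction(3, 5), Fraction(0))
moments = second_moments(gamma=3, steps=4, t_index=2, e=e, e_prime=e_prime)
```

With `gamma = 3` and $t = 1/2$ the three returned values are $\gamma t = 3/2$, $t = 1/2$ and $t\,(e, e') = 0$. The first grows with the dimension, which is the finite shadow of $\chi_t$ failing to be nearstandard, while the projected quantities do not depend on `gamma` at all.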
But how do we turn $\chi$ into a standard process? Since we have already observed that $\chi$ does not have a standard part in $H$, the only hope is to take the standard part with respect to a weaker topology. This is exactly the situation we encountered in connection with Gross's theorem in Section 3.5. In fact, if for each finite-dimensional subspace $F$ of $H$ and each $t \in [0, 1]$ we let $\mu_t^F$ be the measure on $F$ induced by the projection ${}^\circ(P_F \chi_t)$ of $\chi_t$ to $F$, then $\{\mu_t^F\}_{F \in \mathcal{F}}$ is a cylindrical measure on $H$ for each $t$. Gross's theorem tells us that if $|\cdot|$ is a norm on $H$ which is measurable with respect to Gaussian measures and $B$ is the completion of $H$ with respect to $|\cdot|$, then $\{\mu_t^F\}$ induces a measure $\mu_t$ on $B$. The proof of Gross's theorem and in particular Lemma 3.5.7 tell us more; they say that $\chi_t$ is nearstandard almost everywhere with respect to the norm $|\cdot|$ (note that since we can consider $E$ as a subspace of ${}^*H$, there is a natural extension of $|\cdot|$ to $E$). Since $B = {}^\circ E_{|\cdot|}$, the standard part $b$ of $\chi$ is thus a process living in $B$, and $\mu_t$ is nothing but the measure $b_t$ induces on $B$. If $i : H \to B$ is the inclusion map, the triple $(i, H, B)$ is often referred to as an abstract Wiener space, and $b : \Omega \times [0, 1] \to B$ is a Brownian motion on $(i, H, B)$.

From the hyperfinite-dimensional random walk $\chi$ we have thus obtained a standard Brownian motion $b$ on $H$ simply by taking the standard part with respect to a measurable norm. More information about Brownian motions in abstract Wiener spaces can be found, e.g., in Kuo (1975).

From the nonstandard point of view, the last two paragraphs may be considered an unnecessary detour; the random walk $\chi$ is a perfectly legitimate mathematical object, and the existence of the standard part $b$ will play no part in our theory for stochastic integration.

B. Infinite Dimensional Stochastic Integrals
Our first task is to define a suitable class of integrands for stochastic integrals with respect to hyperfinite-dimensional random walks. Recall from Section 4.5 that if $\chi$ is an $n$-dimensional random walk, then the integrands are matrix-valued processes. In infinite dimensions one would expect the integrands to be operator valued, and it might be useful to take a brief look at operators between hyperfinite-dimensional spaces before turning to stochastic integration.

Let $E$ and $F$ be hyperfinite-dimensional inner product spaces, and let $(\cdot, \cdot)$ and $\|\cdot\|$ denote the inner products and norms in both $E$ and $F$. Fix two orthonormal generating sets $\{e_n\}_{n \in \mathbb{N}}$ and $\{f_n\}_{n \in \mathbb{N}}$ for $E$ and $F$, respectively. A linear map $A : E \to F$ is called S-bounded if the operator norm $\|A\|$ is finite; it is called nearstandard if $A(e_n) \in \mathrm{Pns}(F, \|\cdot\|)$ for all $n \in \mathbb{N}$. Thus, if $A$ is S-bounded and nearstandard, it maps pre-nearstandard points to pre-nearstandard points.
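Let $\{e_n\}_{n \le \gamma}$ and $\{f_n\}_{n \le \eta}$ be extensions of $\{e_n\}_{n \in \mathbb{N}}$, $\{f_n\}_{n \in \mathbb{N}}$ to internal orthonormal bases for $E$ and $F$. A linear map $A : E \to F$ is called a Hilbert-Schmidt operator if
$${}^\circ \sum_{n=1}^{\eta} \|A^* f_n\|^2 = \sum_{n=1}^{\infty} {}^\circ\|A^* f_n\|^2 < \infty,$$
where $A^*$ is the adjoint of $A$. The sum $\sum_{n=1}^{\eta} \|A^* f_n\|^2$ is independent of the choice of basis $\{f_n\}_{n \le \eta}$. Moreover,
$$\|A^* f_n\|^2 = \sum_{m=1}^{\gamma} (e_m, A^* f_n)^2 = \sum_{m=1}^{\gamma} (A e_m, f_n)^2,$$
and hence
$$\sum_{n=1}^{\eta} \|A^* f_n\|^2 = \sum_{m=1}^{\gamma} \sum_{n=1}^{\eta} (A e_m, f_n)^2 = \sum_{m=1}^{\gamma} \|A e_m\|^2,$$
which shows that $\sum_{n=1}^{\eta} \|A^* f_n\|^2$ is the square of the usual Hilbert-Schmidt norm for $A$. We let
$$\|A\|_{(2)} = \Bigl(\sum_{n=1}^{\eta} \|A^* f_n\|^2\Bigr)^{1/2}.$$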
All Hilbert-Schmidt maps are obviously S-bounded, and they are also nearstandard:

as $k \to \infty$ in $\mathbb{N}$. Letting $\mathcal{L}(E, F)$ denote the space of internal linear maps from $E$ to $F$, we can now define the set of stochastic integrands we shall be working with as follows.

4.7.4. DEFINITION. A process $X : \Omega \times T \to \mathcal{L}(E, F)$ is said to be in $SL^2(E, F)$ if it is nonanticipating [with respect to $(\Omega, \{\mathcal{A}_t\}, P)$] and

where an asterisk denotes the adjoint operator, and $\|\cdot\|_{(2)}$ is the Hilbert-Schmidt norm.

Notice the position of the standard parts in (4.7.4), inside both integrals on the left and outside both on the right. This guarantees that $X(\omega, s)$ is Hilbert-Schmidt for almost all $(\omega, s)$, and that $\|X(\omega, s)\|^2_{(2)}$ is S-integrable in the product measure on $\Omega \times T$.
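The basis independence of the Hilbert-Schmidt norm used above, namely that $\sum_n \|A^* f_n\|^2$ and $\sum_m \|A e_m\|^2$ agree for any orthonormal bases, is easy to test on a small real matrix. The following sketch is a finite-dimensional illustration only; the matrix `A`, the rotated bases and the helper names are hypothetical choices made here, and the adjoint is simply the transpose because the example is real.

```python
from fractions import Fraction


def transpose(A):
    return [list(row) for row in zip(*A)]


def apply(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def norm_sq(v):
    return sum(x * x for x in v)


def hs_sq_via_domain(A, basis_E):
    """Sum of ||A e_m||^2 over an orthonormal basis of the domain."""
    return sum(norm_sq(apply(A, e)) for e in basis_E)


def hs_sq_via_adjoint(A, basis_F):
    """Sum of ||A* f_n||^2 over an orthonormal basis of the target."""
    A_star = transpose(A)  # real case: the adjoint is the transpose
    return sum(norm_sq(apply(A_star, f)) for f in basis_F)


c, s = Fraction(3, 5), Fraction(4, 5)      # exact rotation by a 3-4-5 angle
A = [[1, 2, 0],
     [0, -1, 3]]                           # a map from a 3-dim E to a 2-dim F
standard_E = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
rotated_E = [(c, s, 0), (-s, c, 0), (0, 0, 1)]
standard_F = [(1, 0), (0, 1)]
rotated_F = [(c, s), (-s, c)]

values = {
    hs_sq_via_domain(A, standard_E),
    hs_sq_via_domain(A, rotated_E),
    hs_sq_via_adjoint(A, standard_F),
    hs_sq_via_adjoint(A, rotated_F),
}
```

All four computations give the sum of the squared matrix entries, here 15, so `values` collapses to a single element.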
4.7.5. DEFINITION. Let $\chi : \Omega \times T \to E$ be the random walk defined in (5), and let $X \in SL^2(E, F)$. The stochastic integral $\int X\, d\chi$ is the $F$-valued process given by
$$\Bigl(\int X\, d\chi\Bigr)(\omega, t) = \sum_{s=0}^{t} X(\omega, s)\bigl(\Delta\chi(\omega, s)\bigr).$$

This definition is the natural generalization of the ones given in 4.1.1 and equation (4.5.5). We shall show that by restricting the integrands to the class $SL^2(E, F)$, we have obtained an integral with decent properties:

4.7.6. THEOREM. Let $X \in SL^2(E, F)$. Then $\int X\, d\chi$ is pre-nearstandard and S-continuous almost everywhere.

PROOF Let $Y_k$ be the $k$th component of $Y = \int X\, d\chi$. We make a preliminary calculation:
By Doob's inequality, 4.2.8, we obtain
$$0 \le E\Bigl(\sup_{t \le 1} \sum_{k=m}^{\eta} Y_k(t)^2\Bigr) \le 4E\Bigl(\int_0^1 \sum_{k=m}^{\eta} \|X(\omega, s)^*(f_k)\|^2\, ds\Bigr).$$
Since $X \in SL^2(E, F)$, the right-hand side approaches zero as $m$ goes to infinity in $\mathbb{N}$, and hence $\int X\, d\chi$ is nearstandard almost everywhere.

To prove that $Y$ is S-continuous, we shall first prove that each component $Y_k$ is. We remind the reader that according to Theorem 4.2.16, a square-integrable martingale is continuous if and only if its quadratic variation is. To apply this result in the present setting, we use the following trick: construct a new time line $T'$ by dividing each interval $[k\,\Delta t, (k+1)\,\Delta t)$ of $T$ into $\gamma$ points (recall that $\gamma$ is the dimension of $E$). Introduce a new martingale $\tilde{Y}_k : \Omega \times T' \to {}^*\mathbb{R}$ which makes the jump $\omega_j(t)\sqrt{\Delta t}\,(X_t(e_j), f_k)$ at
time $t + ((j - 1)/\gamma)\,\Delta t$. Obviously, $T$ is a subline of $T'$, and $Y_k$ is the restriction of $\tilde{Y}_k$ to $T$. The point of this trick is that $\tilde{Y}_k$ has quadratic variation
$$[\tilde{Y}_k](t) = \sum_{s=0}^{t} \sum_{j=1}^{\gamma} (X_s(e_j), f_k)^2\, \Delta t = \int_0^t \|X_s^*(f_k)\|^2\, ds.$$
Since $\|X(\omega, s)^*(f_k)\|^2 \le \|X(\omega, s)\|^2_{(2)}$, and $(\omega, s) \mapsto \|X(\omega, s)\|^2_{(2)}$ is S-integrable, it follows that $s \mapsto \|X_s^*(f_k)\|^2$ is S-integrable for almost all $\omega$. Hence $[\tilde{Y}_k]$ is S-continuous almost everywhere, and by 4.2.16 so are $\tilde{Y}_k$ and $Y_k$.

It still remains to prove the S-continuity of $Y$. Let $\Omega'$ be a set of measure one such that for all $\omega \in \Omega'$, $Y(\omega, t)$ is pre-nearstandard for all $t$, and $Y_k(\omega, \cdot)$ is S-continuous for all $k \in \mathbb{N}$. Pick $\omega \in \Omega'$, and let $s, t \in T$ be infinitely close. By the continuity of the $Y_k$'s, we have
$$\sum_{k=1}^{n} \bigl(Y_k(t) - Y_k(s)\bigr)^2 \approx 0 \tag{8}$$
for all $n \in \mathbb{N}$. But then (8) holds for some $n = n_0 \in {}^*\mathbb{N} - \mathbb{N}$. Since $Y(\omega, t)$ and $Y(\omega, s)$ are pre-nearstandard,
$$\sum_{k=n_0+1}^{\eta} \bigl(Y_k(t) - Y_k(s)\bigr)^2 \approx 0. \tag{9}$$
Combining (8) and (9), we get $\|Y(t) - Y(s)\| \approx 0$, and hence $Y(\omega, \cdot)$ is S-continuous when $\omega \in \Omega'$.

We shall now show how the hyperfinite integrals $\int X\, d\chi$ can be used to define standard integrals $\int x\, db$, where $b = \mathrm{st}_{|\cdot|}\chi$. Let $\mathcal{B}_{(2)}(H, K)$ be the set of all Hilbert-Schmidt operators from $H$ to $K$; with the $\|\cdot\|_{(2)}$-norm this is a separable Hilbert space. As usual $(\Omega, \{\mathcal{B}_s\}, L(P))$ is the external filtration generated by $(\Omega, \{\mathcal{A}_s\}, P)$. The class of $b$-integrable processes is defined as follows:

4.7.7. DEFINITION. A process $x : \Omega \times [0, 1] \to \mathcal{B}_{(2)}(H, K)$ is in $L^2(H, K)$ if it is adapted to $(\Omega, \{\mathcal{B}_s\}, L(P))$ and
$$E\Bigl(\int_0^1 \|x(\omega, s)\|^2_{(2)}\, ds\Bigr) < \infty.$$
As usual $\int x\, db$ will be defined as the standard part of $\int X\, d\chi$ for a suitable lifting $X$ of $x$, but this time the notion of a lifting is a little more complicated. Recall that $\mathcal{L}(E, F)$ is the set of all internal linear maps from $E$ to $F$. If $T \in \mathcal{B}_{(2)}(H, K)$, we define $\hat{T} : E \to F$ by
$$\hat{T} = P_F\, {}^*T \restriction E,$$
where $P_F : {}^*K \to F$ is the orthogonal projection. Observe that $\hat{T}$ is an internal Hilbert-Schmidt operator. Pick an orthonormal basis $\{T_n\}_{n \in \mathbb{N}}$ for $\mathcal{B}_{(2)}(H, K)$, and let $\{\hat{T}_n\}_{n \in \mathbb{N}}$ be a standard generating set in $(\mathcal{L}(E, F), \|\cdot\|_{(2)})$. The map $h : \mathcal{B}_{(2)}(H, K) \to {}^\circ\mathcal{L}(E, F)_{\|\cdot\|_{(2)}}$ defined by
$$h(T) = {}^\circ(\hat{T})$$
is an isomorphism, and we shall identify $\mathcal{B}_{(2)}(H, K)$ and ${}^\circ\mathcal{L}(E, F)_{\|\cdot\|_{(2)}}$.

4.7.8. DEFINITION. An internal process $X : \Omega \times T \to \mathcal{L}(E, F)$ is called a 2-lifting of $x : \Omega \times [0, 1] \to \mathcal{B}_{(2)}(H, K)$ if $X \in SL^2(E, F)$ and $\mathrm{st}_{\|\cdot\|_{(2)}}(X(\omega, t)) = x(\omega, {}^\circ t)$ almost everywhere in $\Omega \times T$.

4.7.9. LEMMA. Any $x \in L^2(H, K)$ has a 2-lifting.

PROOF By 4.7.2 and 4.3.11 there is a nonanticipating process $Y : \Omega \times T \to \mathcal{L}(E, F)$ such that $\mathrm{st}_{\|\cdot\|_{(2)}}(Y(\omega, t)) = x(\omega, {}^\circ t)$ almost everywhere. We must modify $Y$ to make it an element of $SL^2(E, F)$. For each $m \in {}^*\mathbb{N}$, let $Y_m$ be defined by

Since $Y$ lifts $x$, we have for $m \in \mathbb{N}$

Thus there exists an $m_0 \in {}^*\mathbb{N} - \mathbb{N}$ such that

which implies that $\|Y_{m_0}(\omega, t)\|^2_{(2)}$ is S-integrable. Since $Y_{m_0}$ is a lifting of $x$, we get
which proves that $Y_{m_0}$ is a 2-lifting of $x$.

We shall leave it to the reader to check that if $X$ and $Y$ are two 2-liftings of $x$, then $\int X\, d\chi \approx \int Y\, d\chi$ almost everywhere. With these two facts in mind, we can make the following definition.

4.7.10. DEFINITION. Let $x \in L^2(H, K)$. Then the stochastic integral $\int x\, db$ is defined as the standard part of $\int X\, d\chi$ where $X$ is a 2-lifting of $x$.
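The hyperfinite integral whose standard part is taken in Definition 4.7.10 is a pathwise sum of operators applied to increments of the random walk, and in a finite setting such a sum can be evaluated outcome by outcome. The sketch below is an illustration under stated assumptions: the dimensions are tiny and finite, the sum runs over $s < t$, the matrix-valued `integrand` is a made-up nonanticipating example, and the identity being checked, $E\|\int X\, d\chi\|^2 = E \sum_s \|X_s\|^2_{(2)}\,\Delta t$, is the usual isometry for sums of this kind rather than a formula quoted from the text.

```python
from fractions import Fraction
from itertools import product


def integrand(past, s):
    """Hypothetical nonanticipating integrand: a 2x2 matrix that depends
    only on the coins tossed strictly before time index s."""
    k = sum(coins[0] for coins in past)   # uses the past of coordinate 1 only
    return [[1, k], [0, s + 1]]


def check_isometry(gamma=2, steps=3):
    """Return (E||Y_1||^2, E sum_s ||X_s||_HS^2 dt) by exact enumeration."""
    dt = Fraction(1, steps)
    lhs = rhs = Fraction(0)
    outcomes = list(product((-1, 1), repeat=gamma * steps))
    for flat in outcomes:
        omega = [flat[s * gamma:(s + 1) * gamma] for s in range(steps)]
        y = [0, 0]                         # Y divided by sqrt(dt)
        for s in range(steps):
            X = integrand(omega[:s], s)
            # X(s) applied to the increment of the walk, up to sqrt(dt)
            incr = [sum(X[i][j] * omega[s][j] for j in range(gamma))
                    for i in range(2)]
            y = [a + b for a, b in zip(y, incr)]
            rhs += dt * sum(x * x for row in X for x in row)
        lhs += dt * sum(c * c for c in y)
    n = len(outcomes)
    return lhs / n, rhs / n


lhs, rhs = check_isometry()
```

Both sides come out equal (to $20/3$ for this integrand), even though the integrand depends on the path. The equality relies only on `integrand` not looking at the coins of the current or later steps, which is the finite counterpart of the nonanticipation requirement in Definition 4.7.4.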
Notice that the stochastic integral is independent of which measurable norm was used in the construction of $b$; in fact, $b$ itself only enters the theory as a notational device in the expression $\int x\, db$.

We could now go on to prove that the stochastic integral defined in 4.7.10 agrees with the one obtained by standard methods in, e.g., Kuo (1975). This can be done very much along the same lines as the one-dimensional theory in Section 4.4, but we shall not try to carry it out here. The interested reader is referred to Lindstrøm (1983).

The nonstandard approach to the infinite-dimensional stochastic integral has two advantages. First, the integral has a simple, intuitive definition as a pathwise Stieltjes integral; and second, the Brownian motion has a natural construction as a random walk on a hyperfinite-dimensional linear space; no measurable norms or appeals to Gross's theorem are necessary. Also notice that although the Brownian motion $b$ is $B$-valued, the integrand $x$ takes values in $\mathcal{B}_{(2)}(H, K)$. In the standard approach [see, e.g., Kuo (1975)] this necessitates a rather messy study of the relationship between $H$ and $B$ and the linear maps on the two spaces. The nonstandard theory does not need $B$, and thus no such study is necessary.

C. A Remark on Stochastic Partial Differential Equations
Let us finally return to partial stochastic differential equations. We shall first take a look at a simplified, linearized version of Eq. (4):
$$du = \bigl(\nu\, \Delta u - (\bar{u} \cdot \nabla) u + h\bigr)\, dt + g\, dw. \tag{10}$$
Here $\bar{u}$ is a constant equilibrium solution of the deterministic Navier-Stokes equation, and $u(x, t)$ is the fluctuating velocity about this equilibrium due to the random perturbations $g\, dw$. As a model of physical reality, (10) is of limited interest, but it has the mathematical virtue of being exactly solvable. Observe that (10) is of the form
$$dx(t) = A x(t)\, dt + \xi(x(t))\, db(t), \tag{11}$$
where $A$ is an unbounded, linear operator on a Hilbert space $K$ [in (10), $K = L^2(D)$, where $D$ is the domain of the problem], and $b$ is an infinite-dimensional Brownian motion. We have the following general result:

4.7.11. THEOREM. Let $A : K \to K$ be the densely defined infinitesimal generator of a strongly continuous semigroup $\{T_t\}$. Let $\xi : [0, 1] \times K \to \mathcal{B}_{(2)}(H, K)$ be measurable, and assume that there is an $L \in \mathbb{N}$ such that $\|\xi(t, u) - \xi(t, v)\|_{(2)} \le L \cdot \|u - v\|$ for all $t, u, v$. Assume also that $\int_0^1 \|\xi(t, 0)\|^2_{(2)}\, dt < \infty$. Finally, let $x_0$ be a $\mathcal{B}_0$-measurable, square-integrable initial condition. Then the equation
$$dx(t) = A x(t)\, dt + \xi(t, x(t))\, db(t), \qquad x(0) = x_0, \tag{12}$$
has a continuous weakened and mild integral solution.

There are several ways of defining a “solution” to (12); we have focused on the following two: The process $x$ is called a weakened solution if
$$x(t) = x_0 + A \int_0^t x(s)\, ds + \int_0^t \xi(s, x(s))\, db(s), \tag{13}$$
and it is called a mild integral solution if
$$x(t) = T_t x_0 + \int_0^t T_{t-s}\, \xi(s, x(s))\, db(s), \tag{14}$$
where $\{T_t\}$ is the semigroup generated by $A$. Results like 4.7.11 are well known from the standard literature, and a general result [see Chojnowska-Michalik (1979)] states that a process is a weakened solution if and only if it is a mild integral solution. We shall restrict ourselves to an outline of the strategy for producing solutions of the latter kind.

Embed $H$ and $K$ in hyperfinite-dimensional spaces $E$ and $F$ in the usual way, and let $\chi$ be a random walk on $E$. Write $\{{}^*T_t\}_{t \in {}^*\mathbb{R}_+}$ for ${}^*(\{T_t\}_{t \in \mathbb{R}_+})$, and let $\hat{T}_{\Delta t} : F \to F$ be defined by
$$\hat{T}_{\Delta t} = P_F\, {}^*T_{\Delta t} \restriction F,$$
where $P_F : {}^*K \to F$ is the projection. Define an internal semigroup $\{\hat{T}_t\}_{t \in T}$ by $\hat{T}_{k\,\Delta t} = (\hat{T}_{\Delta t})^k$, and let
$$\hat{A} = (\hat{T}_{\Delta t} - I)/\Delta t.$$
Let $\Xi : T \times F \to \mathcal{L}(E, F)$ be a uniform lifting of $\xi$ with respect to the normalized counting measure $\lambda$ on $T$ [i.e., there is a $T' \subset T$ of Loeb measure one such that $\mathrm{st}_{\|\cdot\|_{(2)}}(\Xi(t, u)) = \xi({}^\circ t, {}^\circ u)$ whenever $t \in T'$ and $u$ is pre-nearstandard]. We choose $\Xi$ S-Lipschitz-continuous in the second variable,
and such that $\|\Xi(t, 0)\|^2_{(2)}$ is S-integrable over $T$. Finally, we let $X_0$ be an $\mathcal{A}_\delta$-measurable 2-lifting of $x_0$ for some $\delta \approx 0$. Corresponding to (14), we now have the hyperfinite difference equation
$$X_t = \hat{T}_t X_0 + \sum_{s=\delta}^{t} \hat{T}_{t-\Delta t-s}\, \Xi(s, X_s)\, \Delta\chi(s). \tag{15}$$
This equation obviously has a solution, but some honest work is required to establish the estimates needed to show that the solution is nearstandard and S-continuous [see Lindstrøm (1983)]. However, granted that $X$ is nearstandard and S-continuous, we see that $\hat{T}_{t-\Delta t-s}\, \Xi(s, X(s))$ is a 2-lifting of $T_{t-s}\, \xi({}^\circ s, {}^\circ X(s))$ on a set of measure one. Since $T_t x_0 = {}^\circ(\hat{T}_t X_0)$, we see that by putting $x({}^\circ t) = {}^\circ X(t)$, we transform (15) into
$$x(t) = T_t x_0 + \int_0^t T_{t-s}\, \xi(s, x(s))\, db(s); \tag{16}$$
i.e., $x$ is a continuous mild integral solution of (12). By induction, (15) is easily seen to be equivalent to
$$X_t = X_0 + \sum_{s=\delta}^{t} \hat{A}(X(s))\, \Delta t + \sum_{s=\delta}^{t} \Xi(s, X(s))\, \Delta\chi(s), \tag{17}$$
and this can be used to give a direct nonstandard proof of the fact that $x$ is also a weakened solution of (12) [see Lindstrøm (1983)].

We round off with a few remarks on the full Navier-Stokes equation (4). We are interested in space-time statistical solutions of both the Navier-Stokes equation and a stochastic version of the equation in a bounded domain. There is a vast literature on this topic; we mention in particular the work of Bensoussan and Temam (1973) and the contribution of the Vishik school; for a survey and full references see the paper by Vishik et al. (1979). Here we indicate how this problem can be discussed within the present nonstandard framework.

The method is to replace the "infinite-dimensional" Navier-Stokes system by its "finite-dimensional" Galerkin approximations. In finite dimensions it is not too difficult to construct space-time statistical solutions. And provided we can prove the necessary uniform estimates we can use Prokhorov's theorem to produce a solution of the infinite-dimensional system. In the nonstandard framework we can replace the given system by a hyperfinite-dimensional system; in the stochastic case the finite-dimensional approximations will be Itô-type stochastic equations, which we have studied in Section 4.5. We can then use the theory of Sections 3.4 and 3.5 to pass
from the finite-dimensional approximations to the infinite system, provided (and this is, of course, the mathematical crux of the argument) that we establish the necessary uniform (i.e., dimension-independent) estimates for the finite-dimensional approximations. We invite the reader to look at the theory from the present nonstandard perspective, which we believe offers a clean conceptual setting for this kind of problem. There are related questions concerning the Euler equation which also could be interpreted in the present framework; see Albeverio and Høegh-Krohn (1981) and Albeverio et al. (1979, 1985). For other interesting applications of infinite-dimensional stochastic differential equations, see Faris and Jona-Lasinio (1982), Jona-Lasinio and Mitter (1985) and Loges (1984).

4.8. WHITE NOISE AND LÉVY BROWNIAN MOTION
Let $(E, \mathcal{E}, m)$ be a measure space and let $\mathcal{E}_m$ be the sets in $\mathcal{E}$ with finite measure. Fix a probability space $(\Omega, \mathcal{B}, P)$. An $n$-dimensional white noise on $(E, \mathcal{E}, m)$ with respect to $(\Omega, \mathcal{B}, P)$ is a family
$$X = \{X(A)\}_{A \in \mathcal{E}_m} \tag{1}$$
of random vectors $X(A) : \Omega \to \mathbb{R}^n$ such that:
(i) $X(A)$ is Gaussian distributed with mean zero and covariance matrix $m(A) \cdot I$;
(ii) if $A \cap B = \emptyset$, then $X(A)$ and $X(B)$ are independent, and $X(A) + X(B) = X(A \cup B)$ $P$-a.e.
4.8.1. EXAMPLE. Let $(E, \mathcal{E}, m)$ be the unit interval with the Borel $\sigma$-algebra and Lebesgue measure. Let $b : \Omega \times [0, 1] \to \mathbb{R}^n$ be an $n$-dimensional Brownian motion, and define $X = \{X(A)\}$ by
$$X(A) = \int_0^1 1_A\, db,$$
where $1_A$ is the indicator function of the set $A \subset E$. Then $X$ is a white noise on $(E, \mathcal{E}, m)$.
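For readers who want to see why Example 4.8.1 satisfies conditions (i) and (ii), here is a short worked verification in the one-dimensional case $n = 1$. It is a sketch that relies on the standard Itô isometry for deterministic integrands, which is assumed here rather than taken from this section.

```latex
% Covariance of two values of X, by the Ito isometry for deterministic integrands:
\[
  E\bigl(X(A)\,X(B)\bigr)
  = E\Bigl(\int_0^1 1_A\,db \int_0^1 1_B\,db\Bigr)
  = \int_0^1 1_A(t)\,1_B(t)\,dt
  = m(A \cap B).
\]
% Taking B = A gives the variance required in (i):
\[
  E\bigl(X(A)^2\bigr) = m(A).
\]
% If A and B are disjoint the covariance vanishes; X(A) and X(B) are jointly
% Gaussian, so they are independent, and linearity of the integral together with
% 1_{A \cup B} = 1_A + 1_B gives the additivity in (ii):
\[
  X(A \cup B) = \int_0^1 (1_A + 1_B)\,db = X(A) + X(B).
\]
```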
By emphasizing the close relationship between white noise on $[0, 1]$ and Brownian motion, Example 4.8.1 introduces the theme of this section. Following Stoll (1982, 1985), we shall show that Anderson's construction of Brownian motion can easily be extended to a construction of white noise. Once we have white noise on $\mathbb{R}^d$, we shall reverse the procedure of Example 4.8.1 and obtain (Lévy) Brownian motion of $d$ “time” parameters as a
stochastic integral of white noise. Finally, we shall give a variant of a theorem of Stoll showing how this stochastic integral representation can be used to obtain a new invariance principle for Lévy Brownian motion.

A. Construction of White Noise
We shall first construct the nonstandard counterpart of white noise. Let $(Y, \mathcal{Y}, \mu)$ be an atomless hyperfinite measure space; i.e., $\mathcal{Y}$ is the set of all internal subsets of $Y$, and $\mu(\{x\}) \approx 0$ for all singletons $\{x\}$. If $\Omega$ is the set of all internal maps from $Y$ to $\{-1, 1\}$, let $\mathcal{A}$ be the set of all internal subsets of $\Omega$ and $P$ the uniform probability measure on $\Omega$. For each $A \in \mathcal{Y}$, define
$$\chi(A)(\omega) = \sum_{a \in A} \omega(a) \sqrt{\mu(a)}; \tag{2}$$
the map $A \mapsto \chi(A)$ is called S-white noise on $Y$. Equation (2) is the obvious generalization of the white noise representation discussed in Section 3.3, and $\chi$ will play the same part in this section as Anderson's random walk did in Section 3.3.

Before we define the standard part of $\chi$, we make a preliminary calculation. Let $A, B \in \mathcal{Y}$, and define $\varepsilon : A \,\Delta\, B \to \{-1, 1\}$ to be $1$ on $A - B$ and $-1$ on $B - A$. Then
$$E\bigl((\chi(A) - \chi(B))^2\bigr) = E\Bigl(\Bigl(\sum_{a \in A \Delta B} \varepsilon(a)\, \omega(a) \sqrt{\mu(a)}\Bigr)^2\Bigr) = \sum_{a \in A \Delta B} \mu(a) = \mu(A \,\Delta\, B), \tag{3}$$
where the next to last step uses the independence of $\omega(a)$ and $\omega(b)$ when $a \neq b$. Thus if $A$ and $B$ are equal $L(\mu)$ almost everywhere, ${}^\circ\chi(A)$ and ${}^\circ\chi(B)$ are equal $L(P)$ almost everywhere.

For each Loeb measurable $A \subset Y$, choose an $A' \in \mathcal{Y}$ such that $L(\mu)(A \,\Delta\, A') = 0$. Define the standard part of $\chi$ to be the family $X = \{X(A)\}$, where
$$X(A)(\omega) = {}^\circ\chi(A')(\omega). \tag{4}$$
It is not hard to check that $\{X(A)\}$ is a one-dimensional white noise on the Loeb space $(Y, L(\mathcal{Y}), L(\mu))$; part (ii) of the definition follows from (3), and part (i) is proved by a calculation of the Fourier transform similar to the one in the second proof of 3.3.5(ii).
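The preliminary calculation (3), which says that the mean square distance between $\chi(A)$ and $\chi(B)$ is the measure of the symmetric difference, holds exactly for any finite weighted set and can be confirmed by enumerating all sign patterns. In the sketch below the five-point space, its weights, and the function names are hypothetical; the weights are stored as $\sqrt{\mu(a)}$ and chosen rational so that the arithmetic stays exact. A finite model cannot express the atomless condition, so only the algebra of (3) is being illustrated.

```python
from fractions import Fraction
from itertools import product


def mean_square_difference(root_mu, A, B):
    """E((chi(A) - chi(B))^2) where chi(A)(omega) = sum_{a in A} omega(a) sqrt(mu(a)).

    root_mu maps each point a of the finite space Y to sqrt(mu(a)).
    """
    points = sorted(root_mu)
    total = Fraction(0)
    for signs in product((-1, 1), repeat=len(points)):
        omega = dict(zip(points, signs))
        chi_A = sum(omega[a] * root_mu[a] for a in A)
        chi_B = sum(omega[a] * root_mu[a] for a in B)
        total += (chi_A - chi_B) ** 2
    return total / 2 ** len(points)


def mu(root_mu, S):
    """Measure of a subset S of Y."""
    return sum((root_mu[a] ** 2 for a in S), Fraction(0))


root_mu = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 4),
           3: Fraction(1, 5), 4: Fraction(1, 6)}
A, B = {0, 1, 2}, {2, 3}
lhs = mean_square_difference(root_mu, A, B)
rhs = mu(root_mu, A ^ B)     # A ^ B is the symmetric difference
```

Here `lhs` and `rhs` are equal, and taking `B` empty recovers $E(\chi(A)^2) = \mu(A)$, the variance required in part (i) of the definition of white noise.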
Let $E$ be a Hausdorff space and $m$ a $\sigma$-finite Radon measure on $E$. Assume that if $x$ is an isolated point in $E$, then $m\{x\} = 0$. Fix a rich, hyperfinite subset $Y$ of ${}^*E$ [rich means that $\mathrm{st}(Y) = E$] and an atomless, internal measure $\mu$ on $Y$ such that $m = L(\mu) \circ \mathrm{st}^{-1}$. If $X$ is a white noise on $(Y, L(\mathcal{Y}), L(\mu))$, the equation
$$\tilde{X}(A) = X(\mathrm{st}^{-1}(A)) \tag{5}$$
induces a white noise $\tilde{X}$ on $(E, \mathcal{E}, m)$. Thus on any $\sigma$-finite Radon space that does not charge isolated points, there is a white noise induced by an internal S-white noise. Let us just remark that the condition given in the last sentence is stronger than what is strictly necessary, but since we shall be mainly concerned with white noise on $\mathbb{R}^d$, it is more than sufficient for our purposes.

Above we have only considered one-dimensional white noise, but it is trivial to extend the construction to $n$ dimensions; just let $n$ independent one-dimensional processes run along orthonormal axes.

B. Stochastic Integrals and the Continuity Theorem
We now turn to stochastic integration. Without filtrations and a concept of nonanticipation we can only expect to integrate deterministic functions, but it turns out that this is all we shall need in this section.

Let $\chi$ be an S-white noise as in (2), and let $F : Y \to {}^*\mathbb{R}$ be an internal function. The stochastic integral of $F$ with respect to $\chi$ is defined as
$$\Bigl(\int F\, d\chi\Bigr)(\omega) = \sum_{a \in Y} F(a)\, \chi(a)(\omega) = \sum_{a \in Y} F(a)\, \omega(a) \sqrt{\mu(a)}. \tag{6}$$
An easy calculation similar to (3) shows that
$$E\Bigl(\Bigl(\int F\, d\chi\Bigr)^2\Bigr) = \int F^2\, d\mu, \tag{7}$$
and thus a reasonable class of integrands is $SL^2(\mu)$. Notice that if $\mu$ is atomless and $F \in SL^2(\mu)$, then the measure $\mu_F(A) = \int_A F^2\, d\mu$ is also atomless, and the map $A \mapsto \int 1_A F\, d\chi$ is an S-white noise on $(Y, \mathcal{Y}, \mu_F)$. In particular, ${}^\circ(\int 1_A F\, d\chi)$ is Gaussian distributed with mean zero and variance ${}^\circ(\int_A F^2\, d\mu)$.

If $\tilde{X}$ is a white noise on a Radon space $(E, \mathcal{E}, m)$ induced by $\chi$, we can for each $f \in L^2(m)$ define a stochastic integral by
$$\int f\, d\tilde{X} = {}^\circ\Bigl(\int F\, d\chi\Bigr),$$
where $F \in SL^2(\mu)$ is a lifting of $f$. It follows from (7) that this integral is independent of the choice of lifting, and we leave it to the reader to check
that it agrees with the one obtained from the standard approach. [The standard definition is the usual one; do the obvious thing for simple functions, and extend continuously using the $L^2$-isometry corresponding to (7).]

Again the extension to higher dimensions is straightforward; if $\chi$ is an $n$-dimensional S-white noise and $F : Y \to {}^*\mathbb{R}^m \otimes {}^*\mathbb{R}^n$ is internal [recall that ${}^*\mathbb{R}^m \otimes {}^*\mathbb{R}^n$ is the space of all ${}^*\mathbb{R}$-valued $(m \times n)$ matrices], definition (5) extends to integrals of the form $\int F\, d\chi$ in the same way as before; put
$$\int F\, d\chi = \sum_{a \in Y} F(a) \cdot \chi(a),$$
where $\cdot$ is matrix multiplication.

The stochastic integrals we have defined in this section are just random variables; unlike integrals of martingales, they do not depend on an extra parameter. We have seen that we can remedy this somewhat by considering $(\int 1_A F\, d\chi)_{A \in \mathcal{Y}}$ as an S-white noise on $(Y, \mathcal{Y}, \mu_F)$, but it is often more interesting to integrate a kernel $F(x, y)$ of two variables, and study the resulting integral $\int F(x, y)\, d\chi(y)$ as a stochastic process parametrized by $Y$. Stoll's construction of Lévy Brownian motion is in terms of such integrals, and as a first illustration we shall take a look at what happens in the one-dimensional case.
4.8.2. EXAMPLE. Let $T = \{0, \Delta t, 2\Delta t, \dots, 1\}$ be a hyperfinite time line, and let $\mu$ be the normalized counting measure on $T$. If $\Omega$ is the set of all internal maps from $T$ to $\{-1, 1\}$ and $P$ is the uniform probability measure on $\Omega$, then
$$\chi(A) = \sum_{a \in A} \omega(a) \sqrt{\mu(a)} = \sum_{a \in A} \omega(a) \sqrt{\Delta t}, \qquad \omega \in \Omega,$$
is an S-white noise on $T$. Put
$$F(t, a) = \tfrac{1}{2}\, \mathrm{sgn}(t - a) + \tfrac{1}{2}; \tag{9}$$
then the process
$$B(t) = \int F(t, a)\, d\chi(a) \tag{10}$$
is Anderson's random walk. Hence we have reversed Example 4.8.1 and obtained Brownian motion as a stochastic integral of white noise on $[0, 1]$. Observe for future reference that
$$B(t) - B(s) = \tfrac{1}{2} \int \bigl(\mathrm{sgn}(t - a) - \mathrm{sgn}(s - a)\bigr)\, d\chi(a). \tag{11}$$
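The representation in Example 4.8.2 can be tried out on an ordinary finite grid. The sketch below is an illustration with several assumptions made here: time points are integer indices, all quantities are measured in units of $\sqrt{\Delta t}$, the walk is taken to be the sum of the coins strictly before $t$, and $\mathrm{sgn}(0) = 0$. Under that last convention the kernel equals $1/2$ on the diagonal, so the integral differs from the walk by half of one coin, which after scaling by $\sqrt{\Delta t}$ is infinitesimal in the hyperfinite setting.

```python
from fractions import Fraction


def sgn(u):
    """Sign function with the convention sgn(0) = 0 (an assumption here)."""
    return (u > 0) - (u < 0)


def kernel(t, a):
    """F(t, a) = sgn(t - a)/2 + 1/2 on integer time indices."""
    return Fraction(sgn(t - a) + 1, 2)


def integral_against_noise(t, omega):
    """Sum over a of F(t, a) * omega(a), in units of sqrt(dt)."""
    return sum(kernel(t, a) * w for a, w in enumerate(omega))


def walk(t, omega):
    """Sum of the coins strictly before t, in units of sqrt(dt)."""
    return sum(omega[:t])


omega = [1, -1, -1, 1, 1, -1, 1]          # one sample path of coins
gaps = [integral_against_noise(t, omega) - walk(t, omega)
        for t in range(len(omega))]
increment = integral_against_noise(5, omega) - integral_against_noise(2, omega)
half_sum = sum(Fraction(sgn(5 - a) - sgn(2 - a), 2) * w
               for a, w in enumerate(omega))
```

Each entry of `gaps` equals half of the coin at the current time, and `increment` agrees with `half_sum`, the grid version of (11).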
Our plan is to construct Lévy Brownian motion on $\mathbb{R}^d$ in the spirit of this example, but before we turn to this, we must describe the nonstandard theory of white noise on Euclidean spaces in more detail.

An internal subset $\Gamma \subseteq {}^*\mathbb{R}^d$ is called a hyperfinite lattice if
$$\Gamma = \bigl\{(k_1\, \Delta t, \dots, k_d\, \Delta t) \bigm| (k_1, \dots, k_d) \in {}^*\mathbb{Z}^d,\ \max_{1 \le i \le d} |k_i| \le N\bigr\}$$
for some $\Delta t \approx 0$ and some $N \in {}^*\mathbb{N}$ such that $N \cdot \Delta t$ is infinite. The uniform, internal measure $\mu$ on $\Gamma$ defined by
$$\mu(\{(k_1\, \Delta t, \dots, k_d\, \Delta t)\}) = (\Delta t)^d$$
induces Lebesgue measure on $\mathbb{R}^d$. As usual $\Omega$ is the set of all internal maps from $\Gamma$ to $\{-1, 1\}$, $P$ is the uniform probability measure on $\Omega$, and
$$\chi(A) = \sum_{a \in A} \omega(a) \sqrt{\mu(a)} = \sum_{a \in A} \omega(a)\, (\Delta t)^{d/2}$$
is S-white noise on $\Gamma$.

We are interested in internal processes $I : \Omega \times \Gamma \to {}^*\mathbb{R}$ that can be obtained as stochastic integrals $I(x) = \int F(x, y)\, d\chi(y)$. To have standard parts, processes of this kind must be reasonably regular; we shall use the following notion of S-continuity.
4.8.3. DEFINITION. An internal process $Z : \Omega \times \Gamma \to {}^*\mathbb{R}$ is S-continuous if for almost all $\omega$, $Z(\omega, x) \approx Z(\omega, y)$ whenever $x$ and $y$ are nearstandard and $x \approx y$.
Notice that we are only concerned with S-continuity at nearstandard points. A result we shall find useful in proving the S-continuity of various stochastic integrals is the following theorem.

4.8.4. THEOREM. Let $\Gamma$ be a hyperfinite lattice in ${}^*\mathbb{R}^d$, and let $F : \Gamma \times \Gamma \to {}^*\mathbb{R}$ be an internal function. Assume that there are positive real numbers $\alpha$ and $C$ such that
$$\int |F(x, a) - F(y, a)|^2\, d\mu(a) \le C \|x - y\|^{\alpha} \tag{12}$$
for all $x, y \in \Gamma$. Then the stochastic integral $x \mapsto \int F(x, a)\, d\chi(a)$ is S-continuous.
The main probabilistic ingredient needed for the proof is a hyperfinite version of Kolmogorov's continuity theorem:

4.8.5. PROPOSITION. Let $\Gamma$ be a hyperfinite lattice in ${}^*\mathbb{R}^d$, and let $Z : \Omega \times \Gamma \to {}^*\mathbb{R}$ be an internal process. If there exist positive real numbers $p$, $r$, $K$ such that
$$E(|Z(x) - Z(y)|^p) \le K \|x - y\|^{d+r} \tag{13}$$
for all $x, y \in \Gamma$, then $Z$ is S-continuous.
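Let us first show how 4.8.5 is used to prove 4.8.4.

PROOF OF THEOREM 4.8.4. Let $m \in \mathbb{N}$ be so large that $m\alpha > d$, say $m\alpha = d + r$. If $I(x) = \int F(x, a)\,d\chi(a)$, then by 4.8.5 it suffices to show that there is a $K \in \mathbb{R}$ such that

$$E((I(x) - I(y))^{2m}) \le K\,\|x - y\|^{m\alpha} \tag{14}$$

for all $x, y \in \Gamma$. We first observe that

$$E((I(x) - I(y))^{2m}) = \sum_{(a_1, \dots, a_{2m}) \in \Gamma^{2m}} E\Big(\prod_{i=1}^{2m} (F(x, a_i) - F(y, a_i))\,\omega(a_i)\sqrt{\mu(a_i)}\Big). \tag{15}$$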
The next thing to notice is that since $\omega(a_i)$ and $\omega(a_j)$ are independent when $a_i \ne a_j$, most terms in the last sum are zero; e.g., if $a_i \ne a_j$ when $i \ne j$, then $E(\prod_{i=1}^{2m} \omega(a_i)) = 0$. More than that is true; if there is an $i$ such that the set $\{j \le 2m \mid a_j = a_i\}$ is of odd cardinality, then $E(\prod_{i=1}^{2m} \omega(a_i)) = 0$. To make this precise, we introduce an equivalence relation $\sim$ on $\Gamma^{2m}$ by putting $(a_1, \dots, a_{2m}) \sim (b_1, \dots, b_{2m})$ if $a_i = a_j$ iff $b_i = b_j$. Let $\Delta$ be the set of equivalence classes of $\sim$. An equivalence class $\delta \in \Delta$ is called even if whenever $(a_1, \dots, a_{2m}) \in \delta$, then the sets $A_i = \{j \mid a_j = a_i\}$ all have an even number of elements. Split $\Delta$ into two parts; the even elements $\Delta_{\mathrm{even}}$ and the rest $\Delta_{\mathrm{odd}}$. We have seen that $\Delta_{\mathrm{odd}}$ makes no contribution to (15), and hence

$$E((I(x) - I(y))^{2m}) = \sum_{\delta \in \Delta_{\mathrm{even}}}\ \sum_{(a_1, \dots, a_{2m}) \in \delta} E\Big(\prod_{i=1}^{2m} (F(x, a_i) - F(y, a_i))\,\omega(a_i)\sqrt{\mu(a_i)}\Big).$$

To prove (14), it thus suffices to show that if $\delta \in \Delta_{\mathrm{even}}$, then

$$\sum_{(a_1, \dots, a_{2m}) \in \delta} E\Big(\prod_{i=1}^{2m} (F(x, a_i) - F(y, a_i))\,\omega(a_i)\sqrt{\mu(a_i)}\Big) \le K_\delta\,\|x - y\|^{m\alpha} \tag{16}$$

for some constant $K_\delta \in \mathbb{R}$.
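The vanishing of the odd classes can be confirmed by brute force on a small sign space. The sketch below is a finite stand-in, not the hyperfinite argument itself: it takes three lattice points and index tuples of length four (playing the role of $2m$), computes $E(\prod_i \omega(a_i))$ by enumerating all sign patterns, and checks that the expectation is 1 when every point occurs an even number of times and 0 otherwise. The function names are ours and purely illustrative.

```python
from collections import Counter
from itertools import product


def expected_sign_product(indices, num_points):
    """E(prod_i omega(a_i)) under the uniform measure on {-1, 1}^num_points."""
    total = 0
    for omega in product((-1, 1), repeat=num_points):
        term = 1
        for i in indices:
            term *= omega[i]
        total += term
    return total / 2 ** num_points


def is_even_class(indices):
    """True if every lattice point occurs an even number of times in the tuple."""
    return all(count % 2 == 0 for count in Counter(indices).values())


num_points, length = 3, 4  # toy stand-ins for the lattice size and for 2m
even_tuples = 0
for indices in product(range(num_points), repeat=length):
    expectation = expected_sign_product(indices, num_points)
    # Only tuples from even classes survive, and each contributes exactly 1.
    assert expectation == (1.0 if is_even_class(indices) else 0.0)
    even_tuples += is_even_class(indices)
```

Note that the loop runs over individual tuples rather than over equivalence classes; a class is even precisely when each of its tuples passes `is_even_class`.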
Fix $\delta \in \Delta_{\mathrm{even}}$, and let $2p_1, \dots, 2p_q$ be the number of elements in the various partition classes $A_i = \{j \mid a_j = a_i\}$; obviously $p_1 + p_2 + \cdots + p_q = m$. Since $\omega(a)^2 = 1$,

$$\sum_{(a_1, \dots, a_{2m}) \in \delta} E\Big(\prod_{i=1}^{2m} (F(x, a_i) - F(y, a_i))\,\omega(a_i)\sqrt{\mu(a_i)}\Big) = \sum_{(b_1, \dots, b_q) \in \Gamma^q}\ \prod_{r=1}^{q} (F(x, b_r) - F(y, b_r))^{2p_r}\,\mu(b_r)^{p_r}$$
For the convenience of the reader we include a proof of 4.8.5, although it is almost identical to the standard argument given in, e.g., Simon (1979). To simplify the notation, we shall assume that $\Delta t = 2^{-\eta}$ for some $\eta \in {}^*\mathbb{N} - \mathbb{N}$.

PROOF OF PROPOSITION 4.8.5. Without any loss of generality we prove the continuity only at points $x = (x_1, \dots, x_d)$ with $0 \le x_i \le 1$ for all $i$.
Fix a real number $\alpha$, $0 < \alpha < r/p$; then

$$P\{\omega \mid |Z(\omega, x) - Z(\omega, y)| \ge \|x - y\|^{\alpha}\} \le \|x - y\|^{-\alpha p}\,E(|Z(x) - Z(y)|^p) \le K\,\|x - y\|^{d + \varepsilon},$$

where $\varepsilon = r - \alpha p$. If $k = (k_1, \dots, k_d) \in {}^*\mathbb{Z}^d$, let $k^{(i)} = (k_1, \dots, k_i + 1, \dots, k_d)$. For all $n$ and all $k \in {}^*\mathbb{Z}^d$, we have

Given $n \in {}^*\mathbb{N}$, $n \le \eta$, let

$$H_n = \{(k_1, \dots, k_d) \in {}^*\mathbb{Z}^d \mid 0 \le k_i < 2^n \text{ for all } i\}.$$

For all $N \le \eta$, we have

Thus for almost all $\omega$ there is a number $\nu(\omega) \in \mathbb{N}$ such that

whenever $n \ge \nu(\omega)$. Consider two elements $k/2^n = (k_1/2^n, \dots, k_d/2^n)$ and $(k_1/2^n, \dots, t_i, \dots, k_d/2^n)$ in $\Gamma$ such that $k_i/2^n \le t_i \le (k_i + 1)/2^n$. Then

for some sequence $\{\gamma_j\}$ of zeros and ones. If $n \ge \nu(\omega)$, we thus have

Similarly, we get

This implies that if $\nu(\omega)$ is finite, $Z(\omega, t)$ is S-continuous along all lines parallel to one of the axes. Hence $Z$ is S-continuous.

REMARK. We were rather careless toward the end of this proof; with just a little more patience one could get that $Z$ is Hölder continuous with index $\alpha$.
C. Lévy Brownian Motion

At long last we can now turn to Lévy Brownian motion. Let us first explain what it is (Lévy, 1948).

4.8.6. DEFINITION. A Lévy Brownian motion on $\mathbb{R}^d$ is a stochastic process $L : \Omega \times \mathbb{R}^d \to \mathbb{R}$ such that:

(i) $L(0) = 0$ a.e.;
(ii) if $x_1, \dots, x_n \in \mathbb{R}^d$, the random vector $(L_{x_1}, \dots, L_{x_n})$ is Gaussian distributed with zero expectation;
(iii) $E((L_x - L_y)^2) = \|x - y\|$ for all $x, y \in \mathbb{R}^d$;
(iv) $x \mapsto L(\omega, x)$ is continuous for almost all $\omega$.
Notice that the covariance of the process can be calculated from (i)-(iii);

$$E(L_x L_y) = \tfrac{1}{2}\big(\|x\| + \|y\| - \|x - y\|\big),$$

and that Lévy Brownian motion thus induces a unique probability measure on $C(\mathbb{R}^d, \mathbb{R})$. Recall from Example 4.8.2 that one-dimensional Brownian motion can be obtained as a stochastic integral

$$\beta(t) - \beta(s) = \frac{1}{2}\int (\mathrm{sgn}(t - a) - \mathrm{sgn}(s - a))\,d\chi(a),$$

or, as we may also express it,
Stoll's idea was that equation (18) may be generalized to dimension $d$ by putting

and he showed that for the appropriate choice of the constant $C_d$, $\beta$ induces a Lévy Brownian motion on $\mathbb{R}^d$. We shall use a slightly different representation (also mentioned by Stoll) when $d > 1$; we shall study a process $\Lambda$ satisfying

and prove that the standard part of $\Lambda$ is a Lévy Brownian motion for the right choice of $k_d$.
There is no substantial difference between the two approaches (19) and (20); all arguments that apply to one process apply almost automatically to the other. Stoll's original representation (19) has the advantage of working also in the one-dimensional case; on the other hand it needs an $\mathbb{R}^d$-valued white noise. For the remainder of this section we assume $d > 1$, and use the convention $1/0 = 0$. Let us assume that $\Lambda$ satisfies (20), and compute

If $x, y \in \Gamma$, let $\Gamma_{x,y}$ be the lattice

and let $\mu_{x,y}$ be the uniform measure on $\Gamma_{x,y}$ given by

for all $b \in \Gamma_{x,y}$. Notice that if $\Delta t/\|x - y\| \approx 0$, then $\mu_{x,y}$ induces the Lebesgue measure on $\Gamma_{x,y}$. Defining a new variable $k$ by

$$k = \frac{a - x}{\|x - y\|},$$

where $w = (x - y)/\|x - y\|$ is a unit vector. We shall use the following lemma.

4.8.7. LEMMA. For all $x, y \in \Gamma$, the function

is in $SL^2(\mu_{x,y})$, and there exists a $K \in \mathbb{R}$ such that the integral of its square with respect to $\mu_{x,y}$ is less than $K$ for all $x, y \in \Gamma$. Moreover, if $x$ and $y$ are finite and not infinitesimally close, then
$$= \int \left(\frac{1}{\|z\|^{(d-1)/2}} - \frac{1}{\|z + e\|^{(d-1)/2}}\right)^{2} dm(z),$$

where $m$ is Lebesgue measure on $\mathbb{R}^d$, and $e$ is any unit vector. The proof is not hard but very tedious, and we leave it to the reader. One word of caution may be appropriate; the integrals

both diverge at infinity, and a little care is needed in showing that

converges. One consequence of (22) and 4.8.7 is that if $\Lambda$ satisfies (20), then

$$E((\Lambda(x) - \Lambda(y))^2) \le k_d^2\,K\,\|x - y\|$$

for all $x, y \in \Gamma$, and thus $\Lambda$ is S-continuous by 4.8.4. We can now define a process $\Lambda$ satisfying (20) and its standard part $L$; put

where

and let $L : \Omega \times \mathbb{R}^d \to \mathbb{R}$ be defined by

$$L(\omega, {}^\circ x) = {}^\circ\Lambda(\omega, x) \tag{25}$$

for all $x \in \Gamma$.
4.8.8. THEOREM. $L$ is a Lévy Brownian motion on $\mathbb{R}^d$.

PROOF. There is not much left to prove; 4.8.6(i) is immediate from (23); 4.8.6(ii) follows from the general theory for stochastic integration with respect to white noise; 4.8.6(iii) is a consequence of the choice of $k_d$ [recall (22), (24), and 4.8.7]; and since we have already observed that $\Lambda$ is S-continuous, 4.8.6(iv) is also satisfied.
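The covariance formula noted after Definition 4.8.6 and property 4.8.6(iii) determine each other, and this is easy to check numerically. The sketch below is only an illustration: the function names are ours, and the two sample points are arbitrary. It recovers the increment variance $\|x - y\|$ from the covariance $\frac{1}{2}(\|x\| + \|y\| - \|x - y\|)$.

```python
import math


def norm(v):
    """Euclidean norm of a vector given as a sequence of floats."""
    return math.sqrt(sum(c * c for c in v))


def levy_covariance(x, y):
    """E(L_x L_y) = (|x| + |y| - |x - y|) / 2 for Levy Brownian motion."""
    diff = [a - b for a, b in zip(x, y)]
    return 0.5 * (norm(x) + norm(y) - norm(diff))


def increment_variance(x, y):
    """E((L_x - L_y)^2), expanded in terms of the covariance."""
    return levy_covariance(x, x) + levy_covariance(y, y) - 2 * levy_covariance(x, y)


x, y = (3.0, 4.0), (0.0, 4.0)       # arbitrary sample points with |x - y| = 3
origin = (0.0, 0.0)
var_xy = increment_variance(x, y)   # should equal |x - y|
var_x0 = increment_variance(x, origin)  # should equal |x|, since L(0) = 0
```

The same expansion read backwards is the polarization step that produces the covariance from (i) and (iii).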
Notice that if $\tilde\chi$ is the white noise on $\mathbb{R}^d$ constructed from $\chi$,

is a standard representation of $L$ as a stochastic integral of white noise.

REMARK. Stoll's stochastic integral representation of Lévy Brownian motion is not the first one; Chentsov (1957) [see also Gangolli (1967), McKean (1963), and Takenaka (1977)] gave a representation in terms of white noise on projective space, and Cartier (1971) discovered a representation in terms of two-dimensional white noise on $\mathbb{R}^d$, using a kernel that is a little more complicated than the one in (26). Stoll points out that his representation is invariant under translations $T$ in a way Cartier's is not; if $\tilde\chi$ is the white noise in (26) and $\tilde\chi_T$ is the one obtained by $\tilde\chi_T(C) = \tilde\chi(T^{-1}(C))$, then $L_x - L_y = L^T_{T(x)} - L^T_{T(y)}$ almost everywhere, where $L^T$ is the Lévy Brownian motion obtained from $\tilde\chi_T$. A more important result is a Donsker-type invariance principle for Lévy Brownian motion.
D. Invariance Principles
Let us first construct finite approximations to $\Lambda$. Fix $N \in {}^*\mathbb{N} - \mathbb{N}$, and consider the lattices

for all $n$ less than some hyperfinite integer $\eta$. Let $\Omega$ be the set of all internal maps from $\Gamma_\eta$ into $\{-1, 1\}$, and let $P$ be the uniform probability measure on $\Omega$. The measure $\mu_n$ on $\Gamma_n$ is uniform, and $\mu_n(\{a\}) = (1/2^n)^d$ for each $a \in \Gamma_n$. By analogy with (23), we define $\Lambda_n : \Omega \times \Gamma_n \to {}^*\mathbb{R}$ by

Our goal is to show that the measures induced by the $\Lambda_n$'s on $C(\mathbb{R}^d, \mathbb{R})$ converge weakly to the measure induced by Lévy Brownian motion. Let us first explain how we get a measure on $C(\mathbb{R}^d, \mathbb{R})$ from $\Lambda_n$. If $x \in {}^*\mathbb{R}^d$, let $[x]_n$ be the half-open cube centered at $x$ with sides of length $2^{-n}$, i.e.,

$$[x]_n = \{(y_1, \dots, y_d) \in {}^*\mathbb{R}^d \mid x_i - 2^{-(n+1)} \le y_i < x_i + 2^{-(n+1)} \text{ for all } i\}.$$

If $F : \Gamma_n \to {}^*\mathbb{R}$ is an internal function, we extend it to a function $\bar F : {}^*\mathbb{R}^d \to {}^*\mathbb{R}$ by putting

$$\bar F(y) = F(x)$$
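if $x \in \Gamma_n$ and $y \in [x]_n$. The function $\bar F$ is not S-continuous, but we can turn it into an S-continuous function $\tilde F$ by putting

$$\tilde F(x) = {}^*m([x]_n)^{-1} \int_{[x]_n} \bar F(y)\,d\,{}^*m(y), \tag{28}$$

where $m$ is Lebesgue measure on $\mathbb{R}^d$. Notice that $\tilde F$ agrees with $F$ on $\Gamma_n$. If $n \in \mathbb{N}$, let $L_n : \Omega \times \mathbb{R}^d \to \mathbb{R}$ be the standard part of $\tilde\Lambda_n$, and define a measure $\nu_n$ on $C(\mathbb{R}^d, \mathbb{R})$ by

$$\nu_n(B) = L(P)\{\omega \mid L_n(\omega, \cdot) \in B\}$$

for all Borel sets $B$. If $V_n$ is the internal measure on ${}^*C(\mathbb{R}^d, \mathbb{R})$ given by

$$V_n(A) = P\{\omega \mid \tilde\Lambda_n(\omega, \cdot) \in A\},$$

then

$$\nu_n = L(V_n) \circ \mathrm{st}^{-1}. \tag{29}$$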
On the other hand, if $n \in {}^*\mathbb{N} - \mathbb{N}$, then according to Theorem 4.8.8, $L(V_n) \circ \mathrm{st}^{-1}$ is the measure induced on $C(\mathbb{R}^d, \mathbb{R})$ by Lévy Brownian motion. If we call this measure $\nu$, we see that for all $n \in {}^*\mathbb{N} - \mathbb{N}$ and all bounded, continuous functions $f : C(\mathbb{R}^d, \mathbb{R}) \to \mathbb{R}$,

Given an $\varepsilon \in \mathbb{R}_+$, there must be an $n_0 \in \mathbb{N}$ such that for all $n \in \mathbb{N}$, $n \ge n_0$,

and hence the sequence $\{\nu_n\}$ converges weakly to $\nu$. Before we formulate this result as a theorem, let us redefine $L_n$ and $\nu_n$ in entirely standard terms. Let $\hat\Gamma_n$ be the standard lattice consisting of the finite points in $\Gamma_n$, and let $\lambda_n : \Omega \times \hat\Gamma_n \to \mathbb{R}$ be defined by

Obviously, $\lambda_n(\omega, x) = {}^\circ\Lambda_n(\omega, x)$ for almost all $\omega$, and hence $L_n$ equals the extension $\tilde\lambda_n$ of $\lambda_n$ (notice that $\tilde\lambda_n$ can be defined from $\lambda_n$ in standard terms; the cube $[x]_n$ is now a standard cube with sides of length $2^{-n}$). We define $\nu_n$ from $L_n$ as before.
4.8.9. THEOREM. For each $n \in \mathbb{N}$ let $\hat\Gamma_n$ be the lattice

$$\hat\Gamma_n = \{(k_1/2^n, \dots, k_d/2^n) \mid (k_1, \dots, k_d) \in \mathbb{Z}^d\},$$

and define for $x \in \hat\Gamma_n$

where the $\omega(a)$'s are independent random variables taking the values $-1$ and $1$ with probability $\frac{1}{2}$. Let $L_n : \Omega \times \mathbb{R}^d \to \mathbb{R}$ be the continuous extension of $\lambda_n$ defined above, and let $\nu_n$ be the measure on $C(\mathbb{R}^d, \mathbb{R})$ induced by $L_n$. Then $\{\nu_n\}_{n \in \mathbb{N}}$ converges weakly to Lévy Brownian motion.

This is our version of Stoll's invariance principle (Stoll, 1982, 1985); the original result is more complicated to formulate as it uses general interpolation functions to turn $\lambda_n$ into a continuous process. The conditions we have put on the random variables $\omega(a)$ are obviously much stronger than necessary; all we need is that they induce a white noise in the limit.

REMARK. Nonstandard techniques for proving invariance principles were first introduced by Müller (1969) before the invention of Loeb measure. Anderson (1976) combined Müller's ideas with the Loeb construction and his own hyperfinite random walk to give an almost trivial proof of Donsker's invariance principle for Brownian motion. Later Keisler (1984) and Kosciuk (1982) proved related results for solutions of stochastic differential equations.
To emphasize that Lévy Brownian motion is not the only process that can be obtained as a stochastic integral of white noise, we close this section with a construction of the Yeh-Wiener process.

4.8.10. DEFINITION. A Yeh-Wiener process on $\mathbb{R}^d$ is a stochastic process $W : \Omega \times \mathbb{R}^d \to \mathbb{R}$ such that

(i) for all $z_1, \dots, z_n \in \mathbb{R}^d$, the random vector $(W_{z_1}, \dots, W_{z_n})$ is Gaussian distributed with mean zero;
(ii) if $x = (x_1, \dots, x_d)$ and $y = (y_1, \dots, y_d)$ are two points in $\mathbb{R}^d$, then $E(W_x W_y) = \prod_{i=1}^{d} |x_i| \wedge |y_i|$;
(iii) for almost all $\omega$, the path $x \mapsto W(\omega, x)$ is continuous.

Let $\Gamma$ be a hyperfinite lattice in ${}^*\mathbb{R}^d$ and define $F : \Gamma \times \Gamma \to {}^*\mathbb{R}$ by

$$F(x, a) = \begin{cases} 2^{-d/2} & \text{if } |a_i| < |x_i| \text{ for all } i, \\ 0 & \text{elsewhere,} \end{cases}$$
where $x = (x_1, \dots, x_d)$ and $a = (a_1, \dots, a_d)$. If $\chi$ is S-white noise on $\Gamma$, let $Y$ be the process

$$Y(x) = \int F(x, a)\,d\chi(a).$$

For $x = (x_1, \dots, x_d)$ and $y = (y_1, \dots, y_d)$ in $\Gamma$, we get

which shows that $Y$ has the right covariance. Also,

which shows that for each $K \in \mathbb{N}$, there is a constant $C_K$ such that for all $x, y \in \Gamma$ with $\|x\|, \|y\| \le K$,

By changing $F$ outside $\Gamma_K = \{x \in \Gamma \mid \|x\| \le K\}$ if necessary, we can use Theorem 4.8.4 to conclude that $Y$ is continuous on each $\Gamma_K$, and hence it is S-continuous. The standard part $W$ of $Y$ is obviously a Yeh-Wiener process. Reasoning as in the proof of 4.8.9, one gets a simple proof of Kuelbs' (1968, 1973) and Wichura's (1969) invariance principle for this process [see Stoll (1982) for the details].

REMARK. All the main results in this section are from Stoll (1982), but we have changed the exposition somewhat. Stoll first proves the representation (26) by standard methods, and then uses this result to prove Theorem 4.8.8. By introducing the Continuity Theorem 4.8.4, we have obtained a way of proving that $\Lambda$ is S-continuous, and this makes it possible to reverse Stoll's construction. Knowing that $\Lambda$ is continuous also helps to simplify the proof of 4.8.9. Stoll's treatment of the Yeh-Wiener process is much like the one we have given, although he uses a different argument to show that $Y$ is S-continuous. In the shorter and more recent exposition of his work, Stoll (1985) uses an approach closer to ours.
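The covariance claim for the Yeh-Wiener construction can be illustrated on an ordinary finite lattice. The sketch below rests on assumptions of ours: the kernel is taken to be the indicator of the box $\{|a_i| < |x_i|\}$ with the normalisation $2^{-d/2}$, the infinitesimal mesh is replaced by the step $1/64$, and all function names are hypothetical. Since $E(\omega(a)\omega(b))$ vanishes for $a \ne b$, the covariance of $Y$ reduces to $\sum_a F(x, a)F(y, a)\mu(a)$, which should be close to $\prod_i |x_i| \wedge |y_i|$.

```python
from itertools import product


def yeh_wiener_kernel(x, a):
    """Assumed kernel: 2^(-d/2) on the box |a_i| < |x_i| for all i, zero elsewhere."""
    d = len(x)
    inside = all(abs(ai) < abs(xi) for ai, xi in zip(a, x))
    return 2.0 ** (-d / 2) if inside else 0.0


def lattice_covariance(x, y, step, half_width):
    """Sum of F(x, a) F(y, a) mu(a) over a finite lattice with mu(a) = step^d."""
    d = len(x)
    n = round(half_width / step)
    axis = [k * step for k in range(-n, n + 1)]
    total = 0.0
    for a in product(axis, repeat=d):
        total += yeh_wiener_kernel(x, a) * yeh_wiener_kernel(y, a) * step ** d
    return total


def target_covariance(x, y):
    """The Yeh-Wiener covariance: product over i of min(|x_i|, |y_i|)."""
    result = 1.0
    for xi, yi in zip(x, y):
        result *= min(abs(xi), abs(yi))
    return result


x, y = (0.5, -0.75), (0.25, 1.0)   # arbitrary sample points in the plane
approx = lattice_covariance(x, y, step=1 / 64, half_width=1.0)
exact = target_covariance(x, y)
```

The discrepancy between `approx` and `exact` is of the order of the step size and shrinks as the lattice is refined, which is the finite shadow of the infinitesimal error in the hyperfinite computation.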
NOTES
Even in a chapter of this length it is not possible to give a complete survey of the nonstandard theory of stochastic processes, and the reader should consult Stroyan and Bayod (1985) and Cutland (1983c) for further information and references. The most serious of our omissions are probably Perkins's theory of local time (recall the remarks and references at the end of Section 3.3); Keisler's study of Markov processes (1984); and the now well-developed field of probability logic (see the references in Section 4.4.D). As a matter of fact, we shall have much to say about both Markov processes (Chapter 5) and local time (Sections 6.4 and 7.5), but from quite different perspectives than Keisler and Perkins. Probability logic is further from our chosen topic, and we just refer the reader to Keisler (1985) (probably the best starting point for a logician) and Hoover and Keisler (1984) (the most natural point of departure for the probabilist). Let us finally mention that Lawler's important and well-known work on self-avoiding random walks is reviewed briefly in Section 6.4. A completely standard, but very stimulating discussion of the various branches of stochastic analysis and their interconnections is given in Föllmer (1984).
REFERENCES

S. Albeverio and R. Høegh-Krohn (1981). Stochastic methods in quantum field theory and hydrodynamics. Phys. Rep. 77.
S. Albeverio, M. de Faria, and R. Høegh-Krohn (1979). Stationary measures for the Euler flow in two dimensions. J. Statist. Phys. 20.
S. Albeverio, R. Høegh-Krohn, and D. Merlini (1985). Euler flows, associated generalized random fields and Coulomb systems. In (S. Albeverio, ed.) Infinite Dimensional Analysis and Stochastic Processes, pp. 216-244, Res. Notes Math. Pitman, London.
R. M. Anderson (1976). A nonstandard representation for Brownian motion and Itô integration. Israel J. Math. 25.
R. M. Anderson (1982). Star-finite representations of measure spaces. Trans. Amer. Math. Soc. 271.
L. Arkeryd (1984). Loeb-Sobolev spaces with applications to variational integrals and differential equations (preprint). Chalmers Inst. Technol., Gothenborg, Sweden.
L. Arkeryd and J. Bergh (1985). Some properties of Loeb-Sobolev spaces (preprint). Chalmers Inst. Technol., Gothenborg, Sweden.
M. T. Barlow (1981). Construction of a martingale with given absolute value. Ann. Probab. 9.
M. T. Barlow (1982). One dimensional stochastic differential equation with no strong solution. J. London Math. Soc. 26.
M. T. Barlow and E. Perkins (1984). One-dimensional stochastic differential equation involving a singular increasing process. Stochastics 12.
A. Bensoussan and R. Temam (1972). Equations aux dérivées partielles stochastiques non linéaires I. Israel J. Math. 11.
A. Bensoussan and R. Temam (1973). Equations stochastiques du type Navier-Stokes. J. Funct. Anal. 13.
O. Berg and C. Blomberg (1976). Association kinetics with coupled diffusional flows. Special application to the Lac Repressor-Operator system. Biophys. Chem. 4.
O. Berg and C. Blomberg (1977). Association kinetics with coupled diffusion. An extension to coiled-chain macro-molecules applied to Lac Repressor-Operator system. Biophys. Chem. 7.
P. Billingsley (1968). Convergence of Probability Measures. Wiley, New York.
P. Cartier (1971). Introduction à l'étude des mouvements browniens à plusieurs paramètres. Sém. Probab. V. Lecture Notes in Math. 191. Springer-Verlag, Berlin and New York.
N. N. Chentsov (1957). Lévy Brownian motion for several parameters and generalized white noise. Theory Probab. Appl. 2.
A. Chojnowskaja-Michalik (1979). Stochastic differential equations in Hilbert spaces. In (Z. Ciesielski, ed.) Probability Theory, Banach Center Publications 5, PWN, Warsaw.
P.-L. Chow (1978). Stochastic partial differential equations in turbulence related problems. In (A. T. Bharucha-Reid, ed.) Probabilistic Analysis and Related Topics, Vol. 1. Academic Press, New York.
N. Christopheit (1983). Discrete approximations of continuous time stochastic control systems. SIAM J. Control Optim. 21.
K.-L. Chung and R. Williams (1983). Introduction to Stochastic Integration. Birkhäuser, Basel.
N. J. Cutland (1982). On the existence of solutions to stochastic differential equations on Loeb spaces. Z. Wahrsch. Verw. Gebiete 60.
N. J. Cutland (1983a). Internal controls and relaxed controls. J. London Math. Soc. 27.
N. J. Cutland (1983b). Optimal controls for partially observed stochastic systems: an infinitesimal approach. Stochastics 8.
N. J. Cutland (1983c). Nonstandard measure theory and its applications. Bull. London Math. Soc. 15.
N. J. Cutland (1985a). Partially observed stochastic controls based on cumulative digital readouts of the observations. In Proc. IFIP Work. Conf. Stochastic Differential Systems, 4th, Marseille-Luminy, 1984. Springer-Verlag, Berlin and New York.
N. J. Cutland (1985b). Infinitesimal methods in control theory: Deterministic and stochastic. Acta Appl. Math. (to appear).
N. J. Cutland (1985c). Simplified existence for solutions to stochastic differential equations. Stochastics 14.
M. H. A. Davis (1984). Lectures on Stochastic Control and Nonlinear Filtering. Springer-Verlag, Berlin and New York.
C. Doléans (1968). Existence du processus croissant naturel associé à un potentiel de classe (D). Z. Wahrsch. Verw. Gebiete 9.
C. Doléans-Dade (1976). On the existence and unicity of solutions of stochastic integral equations. Z. Wahrsch. Verw. Gebiete 36.
J. L. Doob (1953). Stochastic Processes. Wiley, New York.
J. L. Doob (1984). Classical Potential Theory and its Probabilistic Counterpart. Springer-Verlag, New York and Berlin.
R. J. Elliott (1982). Stochastic Calculus and Applications. Springer-Verlag, New York and Berlin.
R. J. Elliott and M. Kohlmann (1982). On the existence of optimal partially observed controls. Appl. Math. Optim. 9.
S. Fajardo (1984). Completeness theorems for the general theory of stochastic processes. Methods in Math. Logic, Lecture Notes in Math. 1030, Springer-Verlag, Berlin and New York.
S. Fajardo (1985). Probability logic with conditional expectation. Ann. Pure Appl. Logic 28.
W. G. Faris and G. Jona-Lasinio (1982). Large fluctuations for a non-linear heat equation with noise. J. Phys. A 15.
A. F. Filippov (1962). On certain questions in the theory of optimal control. J. Soc. Indust. Appl. Math., Ser. A, Control 1.
H. Föllmer (1984). Von der Brownschen Bewegung zum Brownschen Blatt: einige neuere Richtungen in der Theorie der stochastischen Prozesse. In (W. Jäger et al., eds.) Perspectives in Mathematics. Birkhäuser, Basel.
R. Gangolli (1967). Positive definite kernels on homogeneous spaces and certain stochastic processes related to Lévy's Brownian motion of several parameters. Ann. Inst. H. Poincaré Sect. B 3.
L. L. Helms and P. A. Loeb (1982). A nonstandard proof of the martingale convergence theorem. Rocky Mountain J. Math. 12.
C. W. Henson (1986). Banach space model theory (in preparation).
D. N. Hoover (1978). Probability logic. Ann. Math. Logic 14.
D. N. Hoover (1982). A normal form theorem for $L_{\omega_1 P}$ with applications. J. Symbolic Logic 47.
D. N. Hoover (1984). Synonymity, generalized martingales and subfiltrations. Ann. Probab. 12.
D. N. Hoover (1985). A probabilistic interpolation theorem (to appear).
D. N. Hoover and H. J. Keisler (1984). Adapted probability distributions. Trans. Amer. Math. Soc. 286.
D. N. Hoover and E. Perkins (1983a). Nonstandard construction of the stochastic integral and applications to stochastic differential equations I. Trans. Amer. Math. Soc. 275.
D. N. Hoover and E. Perkins (1983b). Nonstandard construction of the stochastic integral and applications to stochastic differential equations II. Trans. Amer. Math. Soc. 275.
N. Ikeda and S. Watanabe (1981). Stochastic Differential Equations and Diffusion Processes. North-Holland Publ., Amsterdam.
K. Itô (1944). Stochastic integral. Proc. Imp. Acad. (Tokyo) 20.
J. Jacod and J. Mémin (1981). Existence of weak solutions for stochastic differential equations with driving semimartingales. Stochastics 4.
G. Jona-Lasinio and P. K. Mitter (1985). On the stochastic quantization of field theory. Commun. Math. Phys. 101.
H. J. Keisler (1977). Hyperfinite model theory. In (R. O. Gandy and J. M. E. Hyland, eds.) Logic Colloq. 1976. North-Holland Publ., Amsterdam.
H. J. Keisler (1983). A non-tâtonnement process with infinitesimal traders (preliminary version). Univ. of Wisconsin, Madison.
H. J. Keisler (1984). An infinitesimal approach to stochastic analysis. Mem. Amer. Math. Soc. 291.
H. J. Keisler (1985). Probability quantifiers. In (J. Barwise and S. Feferman, eds.) Model Theoretical Logics. Springer-Verlag, Berlin and New York.
S. A. Kosciuk (1982). Nonstandard stochastic methods in diffusion theory. Ph.D. thesis. Univ. of Wisconsin, Madison.
S. A. Kosciuk (1983). Stochastic solutions to partial differential equations. In (A. E. Hurd, ed.) Nonstandard Analysis-Recent Contributions, pp. 113-119. Springer-Verlag, Berlin and New York.
N. V. Krylov (1972). On Itô's stochastic integral equation. Theory Probab. Appl. 14 (1969). [See also the correction in Theory Probab. Appl. 17.]
N. V. Krylov (1974). Some estimates of the probability density of a stochastic integral. Math. USSR-Izv. 8.
N. V. Krylov (1980). Controlled Diffusion Processes. Springer-Verlag, New York and Berlin.
J. Kuelbs (1968). The invariance principle for a lattice of random variables. Ann. Math. Statist. 39.
J. Kuelbs (1973). The invariance principle for Banach space valued random variables. J. Multivariate Anal. 3.
H. Kunita and S. Watanabe (1967). On square integrable martingales. Nagoya Math. J. 30.
H.-H. Kuo (1975). Gaussian Measures in Banach Spaces. Springer-Verlag, Berlin and New York.
P. Lévy (1948). Processus stochastiques et mouvement Brownien. Gauthier-Villars, Paris.
T. Lindstrøm (1980a). Hyperfinite stochastic integration I: The nonstandard theory. Math. Scand. 46.
T. Lindstrøm (1980b). Hyperfinite stochastic integration II: Comparison to the standard theory. Math. Scand. 46.
T. Lindstrøm (1980c). Hyperfinite stochastic integration III: Hyperfinite representations of standard martingales. Math. Scand. 46.
T. Lindstrøm (1983). Stochastic integration in hyperfinite dimensional linear spaces. In (A. E. Hurd, ed.) Nonstandard Analysis-Recent Developments. Springer-Verlag, New York and Berlin.
T. Lindstrøm (1985). The structure of hyperfinite stochastic integrals. Z. Wahrsch. Verw. Gebiete (to appear).
T. Lindstrøm (1986). A standard characterization of nonstandard stochastic integrals (in preparation).
V. Luges (1984). Girsanov's theorem in Hilbert space and an application to the statistics of Hilbert space-valued stochastic differential equations. Stoch. Proc. and Appl. 17.
H. P. McKean (1963). Brownian motion with a several dimensional time. Theory Probab. Appl. 8.
E. J. McShane (1967). Relaxed controls and variational problems. SIAM J. Control Optim. 5.
M. Métivier and J. Pellaumail (1980). Stochastic Integration. Academic Press, New York.
P. A. Meyer (1976). Un cours sur les intégrales stochastiques. In Sém. Probab. X, Lecture Notes in Math. 511, Springer-Verlag, Berlin and New York.
D. W. Müller (1969). Nonstandard proofs of invariance principles in probability theory. In (W. A. J. Luxemburg, ed.) Applications of Model Theory to Algebra, Analysis, and Probability. Holt, Rinehart, and Winston, New York.
J. Neveu (1975). Discrete Parameter Martingales. North-Holland Publ., Amsterdam.
H. Osswald (1984). On the existence of solutions to stochastic integral equations with respect to square integrable continuous martingales on Loeb spaces (preprint). Univ. of München.
H. Osswald (1985). Introduction to nonstandard measure theory I-II. Lecture notes, 1984-1985. Univ. of München.
R. L. Panetta (1978). Hyperreal probability spaces: some applications of the Loeb construction. Ph.D. thesis. Univ. of Wisconsin, Madison.
E. Perkins (1982). On the construction and distribution of a local martingale with a given absolute value. Trans. Amer. Math. Soc. 271.
E. Perkins (1983). Stochastic processes and nonstandard analysis. In (A. E. Hurd, ed.) Nonstandard Analysis-Recent Developments. Springer-Verlag, Berlin and New York.
P. E. Protter (1977a). On the existence, uniqueness, convergence, and explosions of solutions of systems of stochastic integral equations. Ann. Probab. 5.
P. E. Protter (1977b). Right continuous solutions of systems of stochastic integral equations. J. Multivariate Anal. 7.
P. H. Richter and M. Eigen (1974). Diffusion control reaction rates in spheroidal geometry. Application to repressor-operator association and membrane bound enzymes. Biophys. Chem. 2.
H. Rodenhausen (1982a). The completeness theorem for adapted probability logic. Doctoral dissertation. Univ. of Heidelberg.
H. Rodenhausen (1982b). A characterization of nonstandard liftings of measurable functions and stochastic processes. Israel J. Math. 43.
D. Ross (1984). Completeness theorem for probability logic with function symbols (preprint). Univ. of Iowa.
B. Simon (1979). Functional Integration and Quantum Physics. Academic Press, New York.
A. Stoll (1982). A Nonstandard Construction of Lévy Brownian Motion with Applications to Invariance Principles. Diplomarbeit, Freiburg.
A. Stoll (1985). A nonstandard construction of Lévy Brownian motion. Z. Wahrsch. Verw. Gebiete (to appear).
D. W. Stroock and S. R. S. Varadhan (1979). Multidimensional Diffusion Processes. Springer-Verlag, Berlin and New York.
K. D. Stroyan (1985). Previsible sets for hyperfinite filtrations. Z. Wahrsch. Verw. Gebiete (to appear).
K. D. Stroyan and J. M. Bayod (1985). Foundations of Infinitesimal Stochastic Analysis. North-Holland Publ. (to appear).
S. Takenaka (1977). On projective invariance of multi-parameter Brownian motion. Nagoya Math. J. 67.
M. J. Vishik, A. J. Komech, and A. V. Fursikov (1979). Some mathematical problems of statistical hydrodynamics. Russian Math. Surveys 34.
J. Warga (1967). Functions of relaxed controls. SIAM J. Control Optim. 5.
J. Warga (1972). Optimal Control of Differential and Functional Equations. Academic Press, New York and London.
M. J. Wichura (1969). Inequalities with applications to the weak convergence of random processes with multidimensional time parameters. Ann. Math. Statist. 40.
CHAPTER 5 HYPERFINITE DIRICHLET FORMS AND MARKOV PROCESSES

The interplay between methods from functional analysis and stochastic processes is one of the most important and exciting aspects of mathematical physics today. It is a highly technical and sophisticated theory based on decades of research in both areas. Our purpose in this chapter is to develop a more elementary approach where hyperfinite linear algebra and Markov chains replace functional analysis and continuous time Markov processes. In the first two sections we describe a general theory for non-negative quadratic forms on hyperfinite-dimensional spaces, but from Section 5.3 on we restrict ourselves to Markov forms and begin the analysis of the associated Markov processes. While Section 5.3 is devoted to the hyperfinite processes themselves and contains, e.g., the Beurling-Deny formula, Fukushima's decomposition theorem, and the Feynman-Kac formula, Sections 5.4 and 5.5 contain a study of their standard parts. The theory in Section 5.5 is rather complicated and has probably not yet reached its final form; therefore, the reader may find it more rewarding first to take a look at the applications in Section 5.6. Unless otherwise specified, all linear spaces in this chapter are over the reals or hyperreals.
5.1. HYPERFINITE QUADRATIC FORMS AND THEIR DOMAINS
We shall develop a hyperfinite theory of non-negative, symmetric, quadratic forms on infinite-dimensional spaces. It is well known that in the Hilbert space case the theory for closed forms of this kind is equivalent to the theory of non-negative, self-adjoint operators; in fact, there is a natural correspondence between forms $E$ and operators $A$ given by $E(u, v) = (A^{1/2}u, A^{1/2}v)$. We have chosen to present the theory in terms of forms and not operators for two reasons: partly because forms are real-valued, and this makes it simpler to take standard parts, but also because in most of our applications, the form is what is naturally given.

A. The Domain
Let us begin by recalling some results from Section 2.2. If $H$ is an internal, hyperfinite-dimensional linear space with an inner product $(\cdot, \cdot)$ generating a norm $\|\cdot\|$, we let $\mathrm{Fin}(H)$ be the set of all elements in $H$ with finite norm. By defining $x \approx y$ if $\|x - y\| \approx 0$, Proposition 2.2.1 tells us that the space

$${}^\circ H = \mathrm{Fin}(H)/\!\approx \tag{1}$$

is a Hilbert space with respect to the inner product $({}^\circ x, {}^\circ y) = \mathrm{st}(x, y)$, where ${}^\circ x$ denotes the equivalence class of $x$. We call $({}^\circ H, (\cdot, \cdot))$ the hull of $(H, (\cdot, \cdot))$. Given a non-negative, symmetric, bilinear form

$$\mathcal{E} : H \times H \to {}^*\mathbb{R},$$

we want to define its standard part $E$ as a bilinear form on ${}^\circ H$. If $\mathcal{E}$ is S-bounded, i.e., there exists a $K \in \mathbb{R}$ such that

$$|\mathcal{E}(u, v)| \le K\,\|u\|\,\|v\|$$

for all $u, v \in H$, then we can simply define $E$ by $E({}^\circ u, {}^\circ v) = {}^\circ\mathcal{E}(u, v)$.

If $\mathcal{E}$ is not S-bounded, we run into two difficulties; we no longer have that $\mathcal{E}(u, v) \approx \mathcal{E}(\tilde u, \tilde v)$ whenever $u \approx \tilde u$ and $v \approx \tilde v$, and there may be elements $u \in \mathrm{Fin}(H)$ such that $\mathcal{E}(\tilde u, \tilde u)$ is infinite for all $\tilde u \approx u$. The last problem should not surprise us; it is an immediate consequence of the fact that unbounded forms on Hilbert spaces cannot be defined everywhere; we shall solve it by simply letting $E({}^\circ u, {}^\circ u)$ be undefined when $\mathcal{E}(\tilde u, \tilde u)$ is infinite for all $\tilde u \in {}^\circ u$. The most natural solution to the first problem may be to define

$$E({}^\circ u, {}^\circ u) = \inf\{{}^\circ\mathcal{E}(v, v) \mid v \in {}^\circ u\}, \tag{2}$$
and then extend E to a bilinear form by the usual trick E(OU," v ) = f ( E ( " u + OU,
OU
+ "u) - E('u,
"u)- E("v,"0)).
The disadvantage of this approach is that it gives us very little understanding of how the infimum in (2) is obtained; for a n easier access to the regu1arit.y properties of 8 and E, we prefer a more indirect way of attack. Our plan is to define a subset 9[81 of Fin(H)-called the domain of 8-satisfying (3)
if
(4)
if u, v
O 8 (
u, u ) < m, there is a v E E
9[81 and
9[81 such that v = u ;
v, then O8(u, u ) = " $ ( v , v ) < 03.
u
We then define E by
(5)
E(OU,
" u ) = " 8 ( v ,u ) ,
when v E 9[81 n "u. It turns out that the two definitions (2) and ( 5 ) agree (see 5.1.14). Before giving the precise definition of 9[8],we shall introduce a few useful notions. Since we shall often be concerned with the relationship between the form if and the inner product ( . , . ), it will be convenient to work with forms incorporating both; for a E *R, a 2 0, we define (6)
ga(U, 0) = %(u, v )
+ a ( u , v).
Each of these forms generates a norm (possibly a seminorm in the case $\alpha = 0$):

(7)  $|u|_\alpha = \mathcal{E}_\alpha(u, u)^{1/2}$.
Remember that the original Hilbert space norm on $H$ is denoted by $\|\cdot\|$. Since $\mathcal{E}$ is a symmetric, non-negative form on the hyperfinite-dimensional space $H$, elementary linear algebra tells us that there is a unique, symmetric, non-negative definite operator $A : H \to H$ such that

(8)  $\mathcal{E}(u, v) = \langle Au, v\rangle$

for all $u, v \in H$. If $\|A\|$ is the operator norm of $A$, we fix an infinitesimal $\Delta t$ such that

(9)  $0 < \Delta t \le \dfrac{1}{\|A\|}$,

and define a new operator $Q^{\Delta t}$ by

(10)  $Q^{\Delta t} = I - \Delta t\,A$.

Notice that by (9), the operator $Q^{\Delta t}$ is non-negative, and since $A$ is non-negative, the operator norm of $Q^{\Delta t}$ is less than or equal to one.
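The construction in (8)-(10) can be tried out in an ordinary finite-dimensional setting. The following is only a sketch under stated assumptions: a small positive real `dt` stands in for the infinitesimal $\Delta t$, the operator $A$ is represented by an illustrative list of eigenvalues `a` (so everything is diagonal in an eigenbasis), and the helper names `form` and `apply_q` are hypothetical.

```python
# Standard finite-dimensional analogue (assumption): A is diagonal in its
# eigenbasis, and a small positive real dt replaces the infinitesimal step.
a = [0.0, 0.5, 3.0, 32.0]        # illustrative eigenvalues of A, all >= 0
dt = 1.0 / max(a)                # largest step allowed by (9): dt <= 1/||A||

# Eigenvalues of Q^{dt} = I - dt*A, as in (10).
q = [1.0 - dt * ai for ai in a]


def form(u, v):
    """E(u, v) = <Au, v> for coefficient vectors in the eigenbasis, as in (8)."""
    return sum(ai * ui * vi for ai, ui, vi in zip(a, u, v))


def apply_q(u, k=1):
    """Apply (Q^{dt})^k to the coefficient vector u."""
    return [(qi ** k) * ui for qi, ui in zip(q, u)]


# The two observations made after (10): Q^{dt} is non-negative and has
# operator norm at most one (for a symmetric operator, the largest |eigenvalue|).
q_is_nonnegative = all(qi >= 0.0 for qi in q)
q_norm = max(abs(qi) for qi in q)
```

Choosing `dt` strictly larger than `1 / max(a)` would make the last eigenvalue of `q` negative, which is exactly what condition (9) rules out.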
5. HYPERFINITE DIRICHLET FORMS AND MARKOV PROCESSES
Introduce a nonstandard time line $T$ by

(11)  $T = \{k\,\Delta t \mid k \in {}^*\mathbb{N}\}$,

and for each element $t = k\,\Delta t$ in $T$, define $Q^t$ to be the operator

(12)  $Q^t = (Q^{\Delta t})^k$.
The family $\{Q^t\}_{t \in T}$ is obviously a semigroup, and we shall call it the semigroup associated with $\mathcal{E}$ and $\Delta t$. Whenever we refer to $\mathcal{E}$, $A$, $T$, and $Q^t$ in the remainder of this section we shall assume that they are linked by (8)-(12). In applications, the primary object will often be the semigroup $\{Q^t\}$, and we can then define $A$ (and hence $\mathcal{E}$) by

(13)  $A = \dfrac{1}{\Delta t}\,(I - Q^{\Delta t})$.
The operator $A$ is called the infinitesimal generator of $\{Q^t\}$. Since $A$ and $Q^t$ are non-negative operators, they have unique non-negative square roots, which we denote by $A^{1/2}$ and $Q^{t/2}$, respectively. For each $t \in T$, we may define an approximation $A^{(t)}$ of $A$ by

(14)  $A^{(t)} = \dfrac{1}{t}\,(I - Q^t)$,

and from $A^{(t)}$ we get the form

(15)  $\mathcal{E}^{(t)}(u, v) = \langle A^{(t)}u, v\rangle$.
Be careful not to confuse $\mathcal{E}^{(t)}$ with the form $\mathcal{E}_t$ defined in (6). Notice that even when $\mathcal{E}$ is not S-bounded, $\mathcal{E}^{(t)}$ is S-bounded for all noninfinitesimal $t$. One of the motivations behind our definition of the domain $\mathcal{D}[\mathcal{E}]$ is that we want to single out the elements where $\mathcal{E}$ really is approximated by the bounded forms $\mathcal{E}^{(t)}$, $t \not\approx 0$, i.e., those $u \in H$ such that
We could have taken this to be our definition of $\mathcal{D}[\mathcal{E}]$, but for technical and expository reasons we have chosen another one which we shall soon (see Proposition 5.1.5) show to be equivalent to (16).

5.1.1. DEFINITION. Let $\mathcal{E}$ be a non-negative, symmetric quadratic form on a hyperfinite-dimensional linear space $H$. The domain $\mathcal{D}[\mathcal{E}]$ of $\mathcal{E}$ is the set of all $u \in H$ satisfying:

(i) ${}^\circ\mathcal{E}_1(u, u) < \infty$.
(ii) For all $t \approx 0$, $\mathcal{E}(Q^t u, Q^t u) \approx \mathcal{E}(u, u)$.
Let us try to convey the intuition behind this definition. Think of $A$ as a differential operator; then the elements of $\mathcal{D}[\mathcal{E}]$ are "smooth" functions, and $Q^t$ is a "smoothing" operator often given by an integral kernel. If an element $u$ is already smooth, then an infinitesimal amount of smoothing $Q^t$, $t \approx 0$, should not change it noticeably, and hence $\mathcal{E}(Q^t u, Q^t u) \approx \mathcal{E}(u, u)$. We shall give a partial justification of this rather crude image later, when we show that if ${}^\circ\mathcal{E}_1(u, u) < \infty$, then the "smoothed" elements $Q^t u$, $t \not\approx 0$, are all in $\mathcal{D}[\mathcal{E}]$ (Lemma 5.1.7; see also Corollary 5.1.9). Our first task will be to establish a list of alternative definitions of $\mathcal{D}[\mathcal{E}]$, among them (16). We begin with the following simple identity giving the relationship between $\mathcal{E}$ and $\mathcal{E}^{(t)}$:

5.1.2. LEMMA. For all $u \in H$ and $t \in T$,

$\mathcal{E}^{(t)}(u, u) = \dfrac{\Delta t}{t} \sum_{0 \le s < t} \mathcal{E}(Q^s u, u) = \dfrac{\Delta t}{t} \sum_{0 \le s < t} \mathcal{E}(Q^{s/2} u, Q^{s/2} u)$,

where the sums are over $s \in T$.

PROOF This is just an easy calculation:

$\mathcal{E}^{(t)}(u, u) = \dfrac{1}{t}\langle (I - Q^t)u, u\rangle = \dfrac{1}{t} \sum_{0 \le s < t} \langle (Q^s - Q^{s + \Delta t})u, u\rangle = \dfrac{\Delta t}{t} \sum_{0 \le s < t} \mathcal{E}(Q^s u, u) = \dfrac{\Delta t}{t} \sum_{0 \le s < t} \mathcal{E}(Q^{s/2} u, Q^{s/2} u)$.
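The identity of Lemma 5.1.2 is a telescoping sum, and it can be checked numerically in a standard finite-dimensional analogue. This is a sketch, not part of the hyperfinite argument: the eigenvalues `a`, the real step `dt`, the vector `u` and the function names are all illustrative assumptions.

```python
# Assumption: A is diagonal with eigenvalues a, dt is a small real step with
# dt <= 1/||A||, and u is given by its coefficients in the eigenbasis.
a = [0.0, 0.5, 3.0, 32.0]
dt = 1.0 / 32.0
q = [1.0 - dt * ai for ai in a]       # eigenvalues of Q^{dt} = I - dt*A
u = [1.0, -2.0, 0.5, 0.25]


def approx_form(k):
    """E^{(t)}(u, u) = (1/t) <(I - Q^t)u, u> for t = k*dt, as in (14)-(15)."""
    t = k * dt
    return sum((1.0 - qi ** k) * ui * ui for qi, ui in zip(q, u)) / t


def averaged_form(k):
    """(dt/t) * sum over s = j*dt with 0 <= j < k of E(Q^s u, u)."""
    t = k * dt
    total = 0.0
    for j in range(k):
        total += sum(ai * (qi ** j) * ui * ui for ai, qi, ui in zip(a, q, u))
    return dt / t * total


k = 50
lhs, rhs = approx_form(k), averaged_form(k)   # the two sides of Lemma 5.1.2
```

Up to rounding, `lhs` and `rhs` agree for every `k >= 1`; since each summand in `averaged_form` is non-negative, the sketch also illustrates why $\mathcal{E}^{(t)}$ is non-negative.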
Among other things, Lemma 5.1.2 tells us that $\mathcal{E}^{(t)}$ is non-negative.

5.1.3. LEMMA. Let $B, C : H \to H$ be non-negative, symmetric operators commuting with $A$ and each other. Then the functions

$t \mapsto \langle Q^t Bu, Cu\rangle$  and  $t \mapsto \mathcal{E}^{(s)}(Q^t Bu, Cu)$

are non-negative and decreasing for all $u \in H$ and $s \in T$.

PROOF We first notice that the $\mathcal{E}^{(s)}$ part follows from the other one since

$\mathcal{E}^{(s)}(Q^t Bu, Cu) = (1/s)\langle Q^t (I - Q^s)Bu, Cu\rangle$,

and the operator $B' = (I - Q^s)B$ is non-negative and commutes with $A$ and $C$. If $t > r$, then
since $\mathcal{E}^{(t-r)}$ is non-negative, and hence $t \mapsto \langle Q^t Bu, Cu\rangle$ decreases. For the positivity, observe that

$\langle Q^t Bu, Cu\rangle = \langle Q^{t/2} B^{1/2} C^{1/2} u, Q^{t/2} B^{1/2} C^{1/2} u\rangle \ge 0$.
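From 5.1.3 we may now obtain our main inequalities.

5.1.4. PROPOSITION. For all $u \in H$, $t \in T$:

(i) $0 \le \mathcal{E}(u, u - Q^t u) \le \mathcal{E}(u, u) - \mathcal{E}(Q^t u, Q^t u) \le 2\,\mathcal{E}(u, u - Q^t u)$.
(ii) $0 \le \mathcal{E}(Q^{\Delta t} u, Q^{\Delta t} u) - \mathcal{E}(Q^{2\Delta t} u, Q^{2\Delta t} u) \le \mathcal{E}(u, u) - \mathcal{E}(Q^{\Delta t} u, Q^{\Delta t} u)$.

PROOF (i) By trivial algebra

$\mathcal{E}(u, u) - \mathcal{E}(Q^t u, Q^t u) = \mathcal{E}(u, u - Q^t u) + \mathcal{E}(Q^t u, u - Q^t u)$.

Applying Lemma 5.1.3 with $B = I$, $C = I - Q^t$, we see that

$0 \le \mathcal{E}(Q^t u, u - Q^t u) \le \mathcal{E}(u, u - Q^t u)$,

and part (i) follows.

(ii) The non-negativity is immediate from (i), and as above we have

$\mathcal{E}(u, u) - \mathcal{E}(Q^{\Delta t} u, Q^{\Delta t} u) = \mathcal{E}(u, u - Q^{\Delta t} u) + \mathcal{E}(Q^{\Delta t} u, u - Q^{\Delta t} u)$.

Applying 5.1.3 to each of the last two terms, using $B = I$, $C = I - Q^{\Delta t}$ in the first case, and $B = Q^{\Delta t}$, $C = I - Q^{\Delta t}$ in the second, we get

$\mathcal{E}(u, u) - \mathcal{E}(Q^{\Delta t} u, Q^{\Delta t} u) \ge \mathcal{E}(Q^{2\Delta t} u, u - Q^{\Delta t} u) + \mathcal{E}(Q^{3\Delta t} u, u - Q^{\Delta t} u)$
$= \mathcal{E}(Q^{2\Delta t} u, u) - \mathcal{E}(Q^{2\Delta t} u, Q^{\Delta t} u) + \mathcal{E}(Q^{3\Delta t} u, u) - \mathcal{E}(Q^{3\Delta t} u, Q^{\Delta t} u)$
$= \mathcal{E}(Q^{\Delta t} u, Q^{\Delta t} u) - \mathcal{E}(Q^{2\Delta t} u, Q^{2\Delta t} u)$.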
The proposition is proved. The inequalities above are what we need to establish a reasonable theory for $\mathcal{D}[\mathcal{E}]$. We first give our promised list of alternative definitions of the domain of $\mathcal{E}$:

5.1.5. PROPOSITION. The following are equivalent:

(i) $u$ is in the domain of $\mathcal{E}$.
(ii) ${}^\circ\mathcal{E}_1(u, u) < \infty$, and for all $t \approx 0$, we have $\mathcal{E}(u, u - Q^t u) \approx 0$.
(iii) ${}^\circ\mathcal{E}_1(u, u) < \infty$, and for all $t \approx 0$, we have $\mathcal{E}(u - Q^t u, u - Q^t u) \approx 0$.
(iv) ${}^\circ\mathcal{E}_1(u, u) < \infty$, and for all $t \approx 0$, we have $\mathcal{E}^{(t)}(u, u) \approx \mathcal{E}(u, u)$.

PROOF (i) $\Leftrightarrow$ (ii). Follows immediately from 5.1.4(i).

(ii) $\Rightarrow$ (iii). We have

$0 \le \mathcal{E}(u - Q^t u, u - Q^t u) = \mathcal{E}(u, u - Q^t u) - \mathcal{E}(Q^t u, u - Q^t u)$,

and by 5.1.3 the term $\mathcal{E}(Q^t u, u - Q^t u)$ is positive.
(iii) $\Rightarrow$ (i). Recall that $|u|_0 = \mathcal{E}(u, u)^{1/2}$ is a seminorm. By 5.1.3 and the triangle inequality

$0 \le |u|_0 - |Q^t u|_0 \le |u - Q^t u|_0$.

Multiplying both sides by $|u|_0 + |Q^t u|_0$, we get

$0 \le |u|_0^2 - |Q^t u|_0^2 \le |u - Q^t u|_0\,(|u|_0 + |Q^t u|_0) \le 2\,|u|_0\,|u - Q^t u|_0$.

Hence if ${}^\circ\mathcal{E}_1(u, u) < \infty$ and $\mathcal{E}(u - Q^t u, u - Q^t u) \approx 0$, we have $\mathcal{E}(u, u) - \mathcal{E}(Q^t u, Q^t u) \approx 0$.

(ii) $\Rightarrow$ (iv). Follows at once from 5.1.2.

(iv) $\Rightarrow$ (ii). Follows from 5.1.2 and the fact that $s \mapsto \mathcal{E}(Q^s u, u)$ is decreasing.
The different characterizations of $\mathcal{D}[\mathcal{E}]$ are useful for different purposes; as an illustration we use 5.1.5(iii) to prove that the domain has the right linear structure.

5.1.6. COROLLARY. Let $u, v \in \mathcal{D}[\mathcal{E}]$, and assume that $\alpha \in {}^*\mathbb{R}$ is nearstandard. Then $\alpha u$ and $u + v$ are elements of $\mathcal{D}[\mathcal{E}]$.

PROOF The $\alpha u$ part is trivial. For $u + v$ we use 5.1.5(iii) and the triangle inequality:

$|(u + v) - Q^t(u + v)|_0 = |u - Q^t u + v - Q^t v|_0 \le |u - Q^t u|_0 + |v - Q^t v|_0$,

and the last two terms are infinitesimal when $t \approx 0$.
The second part of 5.1.4 informs us that $Q^t u$ is more likely to be in $\mathcal{D}[\mathcal{E}]$ than $u$ is. The next lemma pins this down more precisely.

5.1.7. LEMMA. Assume ${}^\circ\mathcal{E}_1(u, u) < \infty$. Then for all noninfinitesimal $t$, we have $Q^t u \in \mathcal{D}[\mathcal{E}]$.

PROOF By 5.1.3,

${}^\circ\mathcal{E}_1(Q^t u, Q^t u) \le {}^\circ\mathcal{E}_1(u, u) < \infty$.

To prove that 5.1.1(ii) is satisfied, notice that according to 5.1.4(ii), the function

$t \mapsto \mathcal{E}(Q^t u, Q^t u)$

is decreasing and convex, and hence

$0 \le \dfrac{1}{s}\bigl[\mathcal{E}(Q^t u, Q^t u) - \mathcal{E}(Q^{t+s} u, Q^{t+s} u)\bigr] \le \dfrac{1}{t}\bigl[\mathcal{E}(u, u) - \mathcal{E}(Q^t u, Q^t u)\bigr]$

for all $s > 0$. Multiplying through by $s$, we get

(17)  $0 \le \mathcal{E}(Q^t u, Q^t u) - \mathcal{E}(Q^{t+s} u, Q^{t+s} u) \le (s/t)\bigl[\mathcal{E}(u, u) - \mathcal{E}(Q^t u, Q^t u)\bigr]$.

For $s \approx 0$ and $t \not\approx 0$, the expression on the right is infinitesimal, and the lemma follows.
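The convexity estimate (17) can also be observed numerically. The sketch below works in a standard finite-dimensional analogue with illustrative data (the eigenvalue list `a`, the real step `dt`, the vector `u`, and the function name `energy` are assumptions, not objects from the text), and checks the inequality on a grid of times $t = k\,\Delta t$ and $s = j\,\Delta t$.

```python
# Assumption: diagonal A with eigenvalues a, real step dt <= 1/||A||.
a = [0.0, 0.5, 3.0, 32.0]
dt = 1.0 / 32.0
q = [1.0 - dt * ai for ai in a]       # eigenvalues of Q^{dt}
u = [1.0, -2.0, 0.5, 0.25]


def energy(k):
    """E(Q^t u, Q^t u) for t = k*dt, computed in the eigenbasis of A."""
    return sum(ai * (qi ** (2 * k)) * ui * ui for ai, qi, ui in zip(a, q, u))


tol = 1e-12  # slack for floating-point rounding only

# Inequality (17) with t = k*dt and s = j*dt, so that s/t = j/k.
holds = all(
    -tol <= energy(k) - energy(k + j) <= (j / k) * (energy(0) - energy(k)) + tol
    for k in range(1, 40)
    for j in range(1, 40)
)
```

Here `energy(k)` is a sum of terms $a_i u_i^2 q_i^{2k}$ with $0 \le q_i \le 1$, which is decreasing and convex in `k`; that is the only property the inequality uses.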
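We shall now strengthen the lemma above and show that if ${}^\circ\mathcal{E}_1(u, u) < \infty$, then there is an infinitesimal $t$ such that $Q^t u \in \mathcal{D}[\mathcal{E}]$. This is a special case of our next result. First some terminology: a subset $\mathcal{D}$ of $H$ is called $\mathcal{E}$-closed if for all sequences $\{u_n\}_{n \in \mathbb{N}}$ of elements from $\mathcal{D}$ such that ${}^\circ|u_n - u_m|_1 \to 0$ as $n, m \to \infty$, there exists an element $u$ in $\mathcal{D}$ such that ${}^\circ|u_n - u|_1 \to 0$ as $n \to \infty$.

5.1.8. PROPOSITION. $\mathcal{D}[\mathcal{E}]$ is $\mathcal{E}$-closed. Moreover, if $\{u_n\}_{n \in \mathbb{N}}$ is a $|\cdot|_1$-Cauchy sequence from $\mathcal{D}[\mathcal{E}]$, and $\{u_n\}_{n \in {}^*\mathbb{N}}$ is an internal extension, then there is a $\gamma \in {}^*\mathbb{N} - \mathbb{N}$ such that $u_\eta \in \mathcal{D}[\mathcal{E}]$ for all $\eta \le \gamma$.

PROOF Let $\{u_n\}_{n \in \mathbb{N}}$ be a $|\cdot|_1$-Cauchy sequence from $\mathcal{D}[\mathcal{E}]$, and let $\{u_n\}_{n \in {}^*\mathbb{N}}$ be an internal extension of it. There is an element $\gamma \in {}^*\mathbb{N} - \mathbb{N}$ such that $|u_n - u_m|_1 \approx 0$ whenever $n$ and $m$ are infinite and less than $\gamma$. Let $\eta \in {}^*\mathbb{N} - \mathbb{N}$, $\eta \le \gamma$. By choice of $\gamma$, ${}^\circ\mathcal{E}_1(u_\eta, u_\eta) < \infty$ and ${}^\circ|u_n - u_\eta|_1 \to 0$ as $n$ approaches infinity in $\mathbb{N}$. All that remains is to prove that $u_\eta \in \mathcal{D}[\mathcal{E}]$. Assume not; then by 5.1.5(iii) there is an $\varepsilon \in \mathbb{R}_+$ and a $t \approx 0$ such that

$|u_\eta - Q^t u_\eta|_0 > \varepsilon$.

Choose $m \in \mathbb{N}$ so large that

$|u_\eta - u_m|_0 < \varepsilon/4$;

then by 5.1.3

$|Q^t u_\eta - Q^t u_m|_0 < \varepsilon/4$.

Combining the three inequalities above,

$\varepsilon < |u_\eta - Q^t u_\eta|_0 \le |u_\eta - u_m|_0 + |u_m - Q^t u_m|_0 + |Q^t u_m - Q^t u_\eta|_0 \le \varepsilon/2 + |u_m - Q^t u_m|_0$,

but since $u_m \in \mathcal{D}[\mathcal{E}]$, the last term is infinitesimal by 5.1.5(iii). We have the contradiction we wanted.

5.1.9. COROLLARY.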
If ${}^\circ\mathcal{E}_1(u, u) < \infty$, there is a $t_0 \approx 0$ such that $Q^t u \in \mathcal{D}[\mathcal{E}]$ for all $t \ge t_0$.

PROOF First notice that if $Q^{t_0} u \in \mathcal{D}[\mathcal{E}]$, so is $Q^t u$ for all $t > t_0$. Put $u_n = Q^{1/n} u$. Then the sequence $\{|u_n|_1\}$ is increasing and bounded by $|u|_1$, and we can apply the proposition to it. The corollary follows.

REMARK. Proposition 5.1.8 is rather surprising since there exist standard forms that are not closed. In fact, there are numerous applications where the main difficulty is to show that the form constructed is closed, or at least can be extended to a closed form [see, e.g., Albeverio and Høegh-Krohn
(1976, 1977a,b, 1981a,b, 1982, 1984), Albeverio et al. (1977, 1980, 1981, 1984b, 1985), Carmona (1979), Fukushima (1980, 1985), Reed and Simon (1975), Röckner and Wielens (1985), and Wielens (1985)]. If we know that it comes from a hyperfinite form, this follows immediately from 5.1.8. In Chapter 6, we shall see various examples of how useful this observation is; for the time being we only remark that since we shall soon (Theorem 5.2.1) show that all standard, closed forms can be obtained from hyperfinite forms, the method is quite general. Notice that if we can show that whenever ${}^\circ\mathcal{E}_1(u, u) < \infty$, then $\|u - Q^t u\| \approx 0$ for all $t \approx 0$, Corollary 5.1.9 will imply the first part of our program, i.e., (3) above.

5.1.10. LEMMA. If ${}^\circ\mathcal{E}(u, u) < \infty$, then for all $t \approx 0$

$\|u - Q^t u\| \approx 0$.

PROOF

$\|u - Q^t u\|^2 = \langle u - Q^t u, u - Q^t u\rangle = t\,\mathcal{E}^{(t)}(u, u - Q^t u) = t\bigl[\mathcal{E}^{(t)}(u, u) - \mathcal{E}^{(t)}(u, Q^t u)\bigr] \le t\,\mathcal{E}(u, u) \approx 0$
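for $t \approx 0$. Let us turn our attention to our second main goal (4).

5.1.11. LEMMA. If $u, v \in \mathcal{D}[\mathcal{E}]$ and $u \approx v$, then

$\mathcal{E}(u, u) \approx \mathcal{E}(v, v)$.

PROOF It is obviously enough to show that if $u \in \mathcal{D}[\mathcal{E}]$ and $u \approx 0$, then $\mathcal{E}(u, u) \approx 0$. But if $u \in \mathcal{D}[\mathcal{E}]$, we know from 5.1.5(iv):

(18)  ${}^\circ\mathcal{E}(u, u) = \lim_{t \downarrow 0,\ t \not\approx 0} {}^\circ\mathcal{E}^{(t)}(u, u)$.

Also

$\mathcal{E}^{(t)}(u, u) = (1/t)\langle (I - Q^t)u, u\rangle = (1/t)\bigl(\langle u, u\rangle - \langle Q^t u, u\rangle\bigr) \le (1/t)\|u\|^2$,

which is infinitesimal for $t \not\approx 0$. Combining this with (18), the lemma follows.

We may now sum up our results on $\mathcal{D}[\mathcal{E}]$ in one statement.

5.1.12. THEOREM. Let $\mathcal{E}$ be a symmetric, non-negative quadratic form on a hyperfinite-dimensional linear space $H$: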
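(i) If $u, v \in \mathcal{D}[\mathcal{E}]$ and $\alpha$ is a finite element of ${}^*\mathbb{R}$, then $\alpha u,\ u + v \in \mathcal{D}[\mathcal{E}]$.
(ii) $\mathcal{D}[\mathcal{E}]$ is $\mathcal{E}$-closed.
(iii) If ${}^\circ\mathcal{E}_1(u, u) < \infty$, then there exists a $v \in \mathcal{D}[\mathcal{E}]$ with $\|u - v\| \approx 0$. Moreover,

${}^\circ\mathcal{E}(v, v) = \lim_{t \downarrow 0,\ t \not\approx 0} {}^\circ\mathcal{E}(Q^t u, Q^t u) = \lim_{t \downarrow 0,\ t \not\approx 0} {}^\circ\mathcal{E}^{(t)}(u, u)$.

(iv) If $u, v \in \mathcal{D}[\mathcal{E}]$ and $u \approx v$, then $\mathcal{E}(u, u) \approx \mathcal{E}(v, v)$.

The following definition now makes sense.

5.1.13. DEFINITION. The standard part of $\mathcal{E}$ is the quadratic form $E$ on ${}^\circ H$ defined by: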
(i) The domain $\mathcal{D}[E]$ of $E$ is the set of all equivalence classes ${}^\circ u \in {}^\circ H$ such that $\inf\{{}^\circ\mathcal{E}_1(v, v) \mid v \in {}^\circ u\} < \infty$.
(ii) If $x, y \in {}^\circ H$ are in the domain of $E$, let $E(x, y) = {}^\circ\mathcal{E}(u, v)$, where $u \in x$, $v \in y$ are in $\mathcal{D}[\mathcal{E}]$.

An $E_1$-Cauchy sequence is a sequence $\{x_n\}$ of elements from $\mathcal{D}[E]$ such that $E_1(x_n - x_m, x_n - x_m) \to 0$ as $n, m \to \infty$. We say that $E$ is closed if all $E_1$-Cauchy sequences converge in $E_1$-norm to an element in $\mathcal{D}[E]$. The next proposition follows immediately from 5.1.12 and the definition of $E$.

5.1.14. PROPOSITION. Let $E$ be the standard part of $\mathcal{E}$. Then $E$ is closed, and for all $x \in {}^\circ H$

(19)  $E(x, x) = \inf\{{}^\circ\mathcal{E}(u, u) \mid u \in x\}$,

where we take the value $\infty$ on the right to mean that the expression on the left is undefined. Notice that (19) is just our original suggestion (2) for the standard part of $\mathcal{E}$.

B. The Resolvent
Up to now we have only been interested in the relationship between the form $\mathcal{E}$ and the associated semigroup $\{Q^t\}$. For the remainder of the section, we turn our attention to the resolvent $\{G_\alpha\}$ of $\mathcal{E}$. The goal is to give a description of $\mathcal{E}$ and $\mathcal{D}[\mathcal{E}]$ in terms of $\{G_\alpha\}$, similar to the one we have just given using the semigroup. The main result (Theorem 5.1.19) will allow us to reconstruct a form from its resolvent, and it will play an essential part in our study of singular perturbations of operators in Chapter 6. The operator $G_\alpha$ is defined to be $(A - \alpha)^{-1}$ whenever this exists, and the formal calculation

(20)  $(A - \alpha)^{-1} = \bigl(I - (I - \Delta t(A - \alpha))\bigr)^{-1}\Delta t = \sum_{k=0}^{\infty} \bigl(I - \Delta t(A - \alpha)\bigr)^k \Delta t = \sum_{k=0}^{\infty} (Q^{\Delta t} + \alpha\,\Delta t)^k\,\Delta t$
tells us that $G_\alpha$ will exist if the series on the right-hand side converges. Since $Q^{\Delta t}$ is a non-negative, symmetric operator with norm at most one, all its eigenvalues must be between zero and one. Choosing $\alpha < 0$ such that $|\alpha|\,\Delta t \le \tfrac{1}{2}$, we get that the absolute value of all eigenvalues of $Q^{\Delta t} + \alpha\,\Delta t$ must be at most $1 + \alpha\,\Delta t$. Hence if $\alpha < 0$ and $|\alpha|\,\Delta t \le \tfrac{1}{2}$, the series in (20) converges, and we get the following proposition.

5.1.15. PROPOSITION. Let $\mathcal{E}$ be a non-negative, symmetric quadratic form, and let $\alpha \in {}^*\mathbb{R}$, $-1/(2\,\Delta t) \le \alpha < 0$. Then $G_\alpha$ exists and

(21)  $G_\alpha = \sum_{k=0}^{\infty} (Q^{\Delta t} + \alpha\,\Delta t)^k\,\Delta t$.

Moreover, in operator norm $\|G_\alpha\| \le 1/|\alpha|$.
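The series (21) is a Neumann series, and on each eigenvector of $A$ it reduces to a scalar geometric series. The following sketch checks this in a standard finite-dimensional analogue; the eigenvalues `a`, the real step `dt`, the value of `alpha`, the truncation length `terms` and the function name are illustrative assumptions.

```python
import math

# Assumption: diagonal A with eigenvalues a, real step dt <= 1/||A||, and a
# negative alpha with |alpha| * dt <= 1/2 as required in Proposition 5.1.15.
a = [0.0, 0.5, 3.0, 32.0]
dt = 1.0 / 32.0
alpha = -4.0                      # -1/(2*dt) = -16 <= alpha < 0


def resolvent_series(ai, terms=2000):
    """Truncated sum of (q_i + alpha*dt)^k * dt, the eigenvalue version of (21)."""
    r = 1.0 - dt * ai + alpha * dt        # eigenvalue of Q^{dt} + alpha*dt
    return math.fsum(r ** k for k in range(terms)) * dt


series = [resolvent_series(ai) for ai in a]
exact = [1.0 / (ai - alpha) for ai in a]  # eigenvalues of (A - alpha)^{-1}

# Operator norm of G_alpha for a symmetric operator: largest |eigenvalue|.
g_norm = max(abs(e) for e in exact)
```

With these values every ratio `r` lies in the interval from `alpha * dt` to `1 + alpha * dt`, so the geometric series converges and the truncated sums match `exact` to rounding accuracy; `g_norm` equals `1 / abs(alpha)` here because `A` has the eigenvalue zero.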
In the standard theory, the formula corresponding to (21) is

$G_\alpha = \int_0^\infty e^{-t(A - \alpha)}\,dt = \int_0^\infty e^{-tA} e^{\alpha t}\,dt$,

giving $G_\alpha$ as a weighted sum of the elements $e^{-tA}$ in the semigroup. Since $\int_0^\infty -\alpha\,e^{\alpha t}\,dt = 1$, it is convenient to multiply this equation by $-\alpha$ to obtain

(22)  $-\alpha G_\alpha = \int_0^\infty -\alpha\,e^{-tA} e^{\alpha t}\,dt$.

It is not quite obvious that this result carries over to the hyperfinite theory, since the equation $(Q^{\Delta t} + \alpha\,\Delta t)^k = Q^{k\Delta t}(1 + \alpha\,\Delta t)^k$ (corresponding to $e^{-t(A - \alpha)} = e^{-tA} \cdot e^{\alpha t}$) is false. But the next result shows that the two operators are close enough for our purposes:
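5.1.16. LEMMA. For $\alpha \in {}^*\mathbb{R}$, $-1/\sqrt{\Delta t} \le \alpha < 0$, and all $u \in H$ with ${}^\circ\mathcal{E}_1(u, u) < \infty$,

(23)  $\Bigl|\alpha G_\alpha u - \Bigl(\alpha \sum_{k=0}^{\infty} Q^{k\Delta t}(1 + \alpha\,\Delta t)^k\,\Delta t\Bigr)u\Bigr|_1 \approx 0$.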
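PROOF Let $\{e_i\}_{i \le N}$ be an orthonormal basis of eigenvectors for $A$, and let $a_i$ be the $i$th eigenvalue. Defining $b_i = a_i + 1$, we notice that if $u = \sum_{i=1}^{N} u_i e_i$, then

(24)  $\mathcal{E}_1(u, u) = \sum_{i=1}^{N} b_i u_i^2$.

Summing geometric series, we see that

$\alpha G_\alpha e_i = \dfrac{\alpha}{a_i - \alpha}\,e_i$

and similarly

$\Bigl(\alpha \sum_{k=0}^{\infty} Q^{k\Delta t}(1 + \alpha\,\Delta t)^k\,\Delta t\Bigr) e_i = \dfrac{\alpha}{a_i - \alpha + a_i \alpha\,\Delta t}\,e_i$,

which yields

$\alpha G_\alpha u - \Bigl(\alpha \sum_{k=0}^{\infty} Q^{k\Delta t}(1 + \alpha\,\Delta t)^k\,\Delta t\Bigr)u = \sum_{i=1}^{N} \dfrac{a_i\,\Delta t}{(1 - (a_i/\alpha))(1 - (a_i/\alpha) - a_i\,\Delta t)}\,u_i e_i$.

Taking the $|\cdot|_1$-norm of this, we get from (24)

$\Bigl|\alpha G_\alpha u - \Bigl(\alpha \sum_{k=0}^{\infty} Q^{k\Delta t}(1 + \alpha\,\Delta t)^k\,\Delta t\Bigr)u\Bigr|_1^2 = \sum_{i=1}^{N} b_i u_i^2\,\dfrac{a_i^2\,\Delta t^2}{(1 - (a_i/\alpha))^2 (1 - (a_i/\alpha) - a_i\,\Delta t)^2} \le \eta\,\mathcal{E}_1(u, u)$,

where

$\eta = \max_{1 \le i \le N} \dfrac{a_i^2\,\Delta t^2}{(1 - (a_i/\alpha))^2 (1 - (a_i/\alpha) - a_i\,\Delta t)^2}$.

All that remains is to show that $\eta$ is infinitesimal. First observe that since $\|A\|\,\Delta t \le 1$ [recall (9)], we have $a_i\,\Delta t \le 1$ for all $i$. Hence

$\eta \le \max_{1 \le i \le N} \dfrac{a_i^2\,\Delta t^2}{(1 - (a_i/\alpha))^2 (a_i/\alpha)^2}$.

Since $-1/\sqrt{\Delta t} \le \alpha < 0$, the last term is always infinitesimal, and the proof is finished.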
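The size of the discrepancy estimated in this proof can be inspected eigenvalue by eigenvalue. The sketch below is a standard finite-dimensional illustration with assumed data (`a`, `dt`, `alpha` and the function names are not from the text); it compares the coefficient of $\alpha G_\alpha$ with that of the factored series and checks the closed form of their difference together with the crude bound $|\alpha|\,\Delta t$ that follows from the final estimate on $\eta$.

```python
# Assumption: real step dt, eigenvalues a_i with a_i * dt <= 1, and a negative
# alpha with |alpha| <= 1/sqrt(dt), mirroring the hypotheses of Lemma 5.1.16.
dt = 1.0 / 1024.0
a = [0.0, 0.5, 8.0, 100.0, 1024.0]
alpha = -8.0                          # -1/sqrt(dt) = -32 <= alpha < 0


def exact_coeff(ai):
    """Coefficient of alpha*G_alpha on the eigenvector with eigenvalue ai."""
    return alpha / (ai - alpha)


def factored_coeff(ai):
    """Coefficient of alpha * sum_k Q^{k dt} (1 + alpha dt)^k dt on that eigenvector."""
    return alpha / (ai - alpha + ai * alpha * dt)


def predicted_gap(ai):
    """Closed form of the difference, as in the displayed sum of the proof."""
    return ai * dt / ((1.0 - ai / alpha) * (1.0 - ai / alpha - ai * dt))


gaps = [exact_coeff(ai) - factored_coeff(ai) for ai in a]
bound = abs(alpha) * dt               # each gap is at most |alpha| * dt
```

In the hyperfinite setting the bound $|\alpha|\,\Delta t \le \sqrt{\Delta t}$ is infinitesimal, which is the role played by the restriction on $\alpha$; in this real-number sketch it is merely small.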
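Equation (23) is the nonstandard counterpart of (22). Notice that

$\sum_{k=0}^{\infty} -\alpha(1 + \alpha\,\Delta t)^k\,\Delta t = 1$,

and that if $\alpha$ is infinite, then there is $t_1 \approx 0$ such that

(26)  $\sum_{0 \le k\Delta t \le t_1} -\alpha(1 + \alpha\,\Delta t)^k\,\Delta t \approx 1$.

On the other hand, if $\alpha$ is finite, then

(27)  $\sum_{t \le k\Delta t} -\alpha(1 + \alpha\,\Delta t)^k\,\Delta t \approx 1$

for all infinitesimal $t$. We can now begin our description of $\mathcal{E}$ and $\mathcal{D}[\mathcal{E}]$ in terms of $G_\alpha$.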
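5.1.17. LEMMA. If $-1/\sqrt{\Delta t} \le \alpha < 0$ and ${}^\circ\mathcal{E}_1(u, u) < \infty$, then

(i) If $\alpha$ is infinite, $\|-\alpha G_\alpha u - u\| \approx 0$.
(ii) If $\alpha$ is finite, $-\alpha G_\alpha u \in \mathcal{D}[\mathcal{E}]$.
(iii) There is an infinite $\alpha$ such that $-\alpha G_\alpha u \in \mathcal{D}[\mathcal{E}]$.

PROOF According to 5.1.16 it suffices to prove the statements we get after replacing $G_\alpha$ by

$\hat G_\alpha = \sum_{k=0}^{\infty} Q^{k\Delta t}(1 + \alpha\,\Delta t)^k\,\Delta t$.

(i) Let $t_1 \approx 0$ be as in (26); then

$\|-\alpha \hat G_\alpha u - u\| = \Bigl\|\sum_{k=0}^{\infty} (Q^{k\Delta t}u - u)(-\alpha)(1 + \alpha\,\Delta t)^k\,\Delta t\Bigr\| \approx \Bigl\|\sum_{0 \le k\Delta t \le t_1} (Q^{k\Delta t}u - u)(-\alpha)(1 + \alpha\,\Delta t)^k\,\Delta t\Bigr\| \approx 0$,

where the last step uses Lemma 5.1.10.

(ii) Choose $t \approx 0$ such that $Q^t u \in \mathcal{D}[\mathcal{E}]$. If $\{e_i\}_{i \le N}$ is an orthonormal basis of eigenvectors for $A$, and $a_i$ is the $i$th eigenvalue, we have

$\sum_{0 \le k\Delta t < t} Q^{k\Delta t}(1 + \alpha\,\Delta t)^k\,\Delta t\,e_i = \sum_{0 \le k\Delta t < t} (1 - a_i\,\Delta t)^k (1 + \alpha\,\Delta t)^k\,\Delta t\,e_i = \dfrac{1 - (1 + \alpha\,\Delta t)^{t/\Delta t}(1 - a_i\,\Delta t)^{t/\Delta t}}{a_i - \alpha + a_i \alpha\,\Delta t}\,e_i$.

If $u = \sum_{i=1}^{N} u_i e_i$, we get

$\Bigl|\sum_{0 \le k\Delta t < t} Q^{k\Delta t}u\,(1 + \alpha\,\Delta t)^k\,\Delta t\Bigr|_1^2 = \sum_{i=1}^{N} b_i u_i^2 \Bigl(\dfrac{1 - (1 + \alpha\,\Delta t)^{t/\Delta t}(1 - a_i\,\Delta t)^{t/\Delta t}}{a_i - \alpha + a_i \alpha\,\Delta t}\Bigr)^2$,

where $b_i = a_i + 1$. Since $\alpha$ is finite, it is easy to check that

$\dfrac{1 - (1 + \alpha\,\Delta t)^{t/\Delta t}(1 - a_i\,\Delta t)^{t/\Delta t}}{a_i - \alpha + a_i \alpha\,\Delta t} \approx 0$

for all $i$, and hence

(28)  $\Bigl|\sum_{0 \le k\Delta t < t} Q^{k\Delta t}u\,(1 + \alpha\,\Delta t)^k\,\Delta t\Bigr|_1 \approx 0$.

Observe that since $Q^{k\Delta t}u \in \mathcal{D}[\mathcal{E}]$ for all $k\,\Delta t \ge t$, we must have

$\sum_{t \le k\Delta t} Q^{k\Delta t}u\,(1 + \alpha\,\Delta t)^k\,\Delta t \in \mathcal{D}[\mathcal{E}]$.

But by (28),

$\Bigl|\hat G_\alpha u - \sum_{t \le k\Delta t} Q^{k\Delta t}u\,(1 + \alpha\,\Delta t)^k\,\Delta t\Bigr|_1 \approx 0$,

and hence $\hat G_\alpha u \in \mathcal{D}[\mathcal{E}]$.

(iii) Notice that

(29)  $\mathcal{E}_1(-\alpha G_\alpha u, -\alpha G_\alpha u)$

increases as $\alpha \to -\infty$, and is bounded by $\mathcal{E}_1(u, u)$. Applying 5.1.8 to the sequence $u_n = nG_{-n}u$, the lemma follows.

The next proposition adds two new characterizations of $\mathcal{D}[\mathcal{E}]$ to the list in 5.1.5.

5.1.18. PROPOSITION.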
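The following are equivalent:

(i) $u \in \mathcal{D}[\mathcal{E}]$.
(ii) ${}^\circ\mathcal{E}_1(u, u) < \infty$ and $\lim_{\alpha \to -\infty} {}^\circ\mathcal{E}_1(u + \alpha G_\alpha u, u + \alpha G_\alpha u) = 0$.
(iii) ${}^\circ\mathcal{E}_1(u, u) = \lim_{\alpha \to -\infty} {}^\circ\mathcal{E}_1(-\alpha G_\alpha u, -\alpha G_\alpha u) < \infty$.

PROOF. (i) $\Rightarrow$ (ii). Pick an infinite $\alpha$ such that $-\alpha G_\alpha u \in \mathcal{D}[\mathcal{E}]$. Then $u + \alpha G_\alpha u \in \mathcal{D}[\mathcal{E}]$, $u + \alpha G_\alpha u \approx 0$, and hence $\mathcal{E}_1(u + \alpha G_\alpha u, u + \alpha G_\alpha u) \approx 0$. Part (ii) follows.

(ii) $\Rightarrow$ (iii). By (29) and the triangle inequality

$0 \le |u|_1 - |-\alpha G_\alpha u|_1 \le |u + \alpha G_\alpha u|_1$,

and multiplying by $|u|_1 + |-\alpha G_\alpha u|_1 \le 2\,\mathcal{E}_1(u, u)^{1/2}$,

$0 \le |u|_1^2 - |-\alpha G_\alpha u|_1^2 \le 2\,\mathcal{E}_1(u, u)^{1/2}\,|u + \alpha G_\alpha u|_1$,

which shows that (ii) $\Rightarrow$ (iii).

(iii) $\Rightarrow$ (i). Pick an infinite $\alpha$ such that $-\alpha G_\alpha u \in \mathcal{D}[\mathcal{E}]$. Then $\|u + \alpha G_\alpha u\| \approx 0$ and $\mathcal{E}_1(u, u) \approx \mathcal{E}_1(-\alpha G_\alpha u, -\alpha G_\alpha u)$, and hence $u \in \mathcal{D}[\mathcal{E}]$.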
We have now reached the final theorem of this section. As we mentioned above, the result gives a way of reconstructing a form from its resolvent. In our study of singular perturbations in Chapter 6, we shall find it much easier to control the resolvent of the perturbed form than the form itself. Once we have a good grasp of the resolvent, Theorem 5.1.19 will give us the form.

5.1.19. THEOREM. Let $\mathcal{E}$ be a symmetric, non-negative, hyperfinite form on $H$, and let $E$ be its standard part. For all $x \in {}^\circ H$ and all $v \in x$

(30)  $E(x, x) = \lim_{\alpha \to -\infty} -{}^\circ\bigl(\alpha^2\langle G_\alpha v, v\rangle + \alpha\langle v, v\rangle\bigr)$.
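PROOF. Notice that since $G_\alpha$ is bounded, it does not matter which $v \in x$ we use. We split the proof into two cases.

(i) $x$ is not in the domain of $E$: Let $\{e_i\}_{i \le N}$ be an orthonormal basis of eigenvectors for $A$, and assume that the corresponding eigenvalues are in decreasing order. Pick $v = \sum_{i=1}^{N} v_i e_i$ in $x$. An easy calculation shows that

$-\bigl(\langle \alpha^2 G_\alpha v, v\rangle + \alpha\langle v, v\rangle\bigr) = \sum_{i=1}^{N} a_i v_i^2\,\dfrac{-\alpha}{a_i - \alpha}$.

Assume for contradiction that the limit in (30) is finite; then there is an infinite $\alpha$ such that

${}^\circ\sum_{i=1}^{N} a_i v_i^2\,\dfrac{-\alpha}{a_i - \alpha} < \infty$.

If $H$ is the largest integer such that $a_H > |\alpha|$, then

$\sum_{i=1}^{N} a_i v_i^2\,\dfrac{-\alpha}{a_i - \alpha} \ge -\dfrac{\alpha}{2}\sum_{i=1}^{H} v_i^2 + \dfrac{1}{2}\sum_{i=H+1}^{N} a_i v_i^2$,

and hence the last two terms are finite. But if $-(\alpha/2)\sum_{i=1}^{H} v_i^2$ is finite, $v$ is infinitely close to

$v' = \sum_{i=H+1}^{N} v_i e_i$,

and if in addition $\sum_{i=H+1}^{N} a_i v_i^2$ is finite, then ${}^\circ\mathcal{E}_1(v', v') < \infty$. This contradicts the assumption that $x \notin \mathcal{D}[E]$.

(ii) Assume that $x \in \mathcal{D}[E]$: Let $u \in x$ be such that

(32)  ${}^\circ\mathcal{E}_1(u, u) < \infty$.

Since $\mathcal{E}_{-\alpha}(G_\alpha u, w) = \langle u, w\rangle$, we have

$\mathcal{E}(-\alpha G_\alpha u, -\alpha G_\alpha u) = \mathcal{E}_{-\alpha}(\alpha G_\alpha u, \alpha G_\alpha u) + \alpha\langle \alpha G_\alpha u, \alpha G_\alpha u\rangle = \alpha^2\langle G_\alpha u, u\rangle + \alpha^3\langle G_\alpha u, G_\alpha u\rangle$.

The theorem will follow from 5.1.18(iii) if we can prove that for all $u$ satisfying (32),

$\lim_{\alpha \to -\infty} {}^\circ\bigl(\alpha^2\langle G_\alpha u, u\rangle + \alpha^3\langle G_\alpha u, G_\alpha u\rangle + \alpha^2\langle G_\alpha u, u\rangle + \alpha\langle u, u\rangle\bigr) = 0$.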
By simple algebra, this is the same as

$\lim_{\alpha \to -\infty} {}^\circ\bigl(\alpha\,\|\alpha G_\alpha u + u\|^2\bigr) = 0$.

Pulling $\alpha$ inside the norm and reformulating the problem in nonstandard terms, we see that what we have to prove is

(33)  $\bigl\|\,|\alpha|^{3/2} G_\alpha u - |\alpha|^{1/2} u\,\bigr\| \approx 0$

for all infinite, negative $\alpha$ of sufficiently small absolute value. If $u = \sum_{i=1}^{N} u_i e_i$ is the eigenvector expansion of $u$, we see that

(34)  $\bigl\|\,|\alpha|^{3/2} G_\alpha u - |\alpha|^{1/2} u\,\bigr\|^2 = \sum_{i=1}^{N} \dfrac{-\alpha\,a_i^2}{(a_i - \alpha)^2}\,u_i^2 = \sum_{i=1}^{N} a_i u_i^2\,\dfrac{1}{\rho_i + \rho_i^{-1} + 2}$,

where $\rho_i = -\alpha/a_i$. Notice that if $a_i$ is infinitesimal compared to $\alpha$ or $\alpha$ is infinitesimal compared to $a_i$, then $1/(\rho_i + \rho_i^{-1} + 2)$ is infinitesimal. To get the sum on the right-hand side of (34) to be infinitesimal, we only have to choose $\alpha$ such that the contributions from the terms satisfying neither of these requirements are infinitesimal. Assuming that the eigenvalues $\{a_i\}$ are given in descending order, we define

$\gamma = \sup\Bigl\{{}^\circ\sum_{i=1}^{k} a_i u_i^2 \Bigm| a_k \text{ is infinite}\Bigr\}$.

Since $\sum a_i u_i^2 \le \mathcal{E}(u, u)$ is finite by (32), $\gamma$ is a real number. Using saturation on the sets

$\Bigl\{j \Bigm| \sum_{i=1}^{j} a_i u_i^2 > \gamma - \dfrac{1}{n} \text{ and } a_j > n\Bigr\}$,

we find a hyperinteger $K$ such that $a_K$ is infinite and

${}^\circ\sum_{i=1}^{K} a_i u_i^2 = \gamma$.

We choose $|\alpha|$ to be infinitely large, but infinitesimal compared to $a_K$. For each $\varepsilon \in \mathbb{R}_+$, let
By choice of $\gamma$, the term $a_{M_\varepsilon}$ must be finite. But

where the first term is infinitesimal since each $\rho_i$ is; the second term is less than $2\varepsilon$ by choice of $M_\varepsilon$; and the last term is infinitesimal since each $\rho_i$ is infinite. Since $\varepsilon \in \mathbb{R}_+$ is arbitrary, the sum on the left must be infinitesimal. This proves (33), and hence the theorem.

REMARK. The hyperfinite part of the theory in this section is new, but most of the results correspond to well-known standard theorems [see, e.g., Fukushima (1980) and Reed and Simon (1975)].
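The eigenvalue computation in case (i) of the proof of Theorem 5.1.19 shows how the form is recovered from the resolvent: with $G_\alpha = (A - \alpha)^{-1}$, the quantity $-(\alpha^2\langle G_\alpha v, v\rangle + \alpha\langle v, v\rangle)$ equals $\sum_i a_i v_i^2\,(-\alpha)/(a_i - \alpha)$, which increases to $\sum_i a_i v_i^2$ as $\alpha \to -\infty$. The sketch below illustrates this in a standard finite-dimensional analogue; the data and function names are illustrative assumptions.

```python
# Assumption: diagonal A with eigenvalues a, and v given in the eigenbasis.
a = [0.0, 0.5, 3.0, 32.0]
v = [1.0, -2.0, 0.5, 0.25]


def form_value():
    """E(v, v) = <Av, v> = sum of a_i * v_i^2."""
    return sum(ai * vi * vi for ai, vi in zip(a, v))


def resolvent_expression(alpha):
    """-(alpha^2 <G_alpha v, v> + alpha <v, v>) with G_alpha = (A - alpha)^{-1}."""
    g = sum(vi * vi / (ai - alpha) for ai, vi in zip(a, v))
    norm_sq = sum(vi * vi for vi in v)
    return -(alpha * alpha * g + alpha * norm_sq)


# alpha = -10, -100, ..., -10^6: the values increase towards form_value().
values = [resolvent_expression(-(10.0 ** p)) for p in range(1, 7)]
```

In finite dimensions every vector lies in the domain, so the limit always exists; the subtlety treated in the theorem (the limit being infinite exactly when $x$ is outside the domain of $E$) only appears when infinite eigenvalues are present.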
5.2. CONNECTIONS TO STANDARD THEORY

We shall interrupt our development of the hyperfinite theory for a moment, and relate the results we have obtained so far to the standard theory. In the last section we proved that a quadratic form on a hyperfinite-dimensional space $H$ induces a closed form on the hull ${}^\circ H$ of $H$. For most applications the space ${}^\circ H$ is too large; what we really want is a form defined on a Hilbert space $K$ given in advance of the nonstandard construction. If $K$ can be identified with a subspace of ${}^\circ H$, we get the desired form by restricting the form on ${}^\circ H$ to $K$; notice that since $K$ is a closed subspace of ${}^\circ H$, the restricted form is also closed. The result we shall prove in this section states that any closed, symmetric, non-negative form on any Hilbert space $K$ can be obtained from a hyperfinite form in this way. There are two reasons for proving such a representation theorem; the first and most important is to obtain the "correct" relationship between already established standard theory on the one hand, and the new hyperfinite theory on the other. The second, closely related reason is to show that no generality is lost by working within the nonstandard framework.

Let $K$ be a standard Hilbert space. A hyperfinite-dimensional subspace $H$ of ${}^*K$ is called S-dense in ${}^*K$ if for all $x \in K$, there is a $y \in H$ such that $\|x - y\| \approx 0$. Recall that $\approx$ is the equivalence relation on $H$ given by $u \approx v$ if and only if $\|u - v\| \approx 0$, and that ${}^\circ u$ denotes the equivalence class of $u$ under $\approx$. If $H$ is S-dense in ${}^*K$, we can identify $K$ with a subspace of ${}^\circ H$ by identifying $x$ and ${}^\circ u$ whenever $\|x - u\| \approx 0$. If $\mathcal{E}$ is a quadratic, symmetric, non-negative form on $H$, we let $E$ denote the standard part of $\mathcal{E}$ as defined in 5.1.13, and we let $E_K$ be the restriction
of $E$ to $K$. As mentioned above, our goal is to show that all closed forms on $K$ can be obtained in this way, but before we prove this, we shall recall a few results from the standard theory of quadratic forms [Fukushima (1980, Section 1.3) is a convenient reference]. To each closed, symmetric, non-negative, densely defined form $F$ on $K$, there is associated a unique, strongly continuous semigroup $\{T_t\}_{t \in \mathbb{R}_+}$ of contraction operators. The form can be recovered from the semigroup by

(1)  $F(x, x) = \lim_{t \downarrow 0} \dfrac{1}{t}\langle (I - T_t)x, x\rangle$,

where the functions $t \mapsto \frac{1}{t}\langle (I - T_t)x, x\rangle$ are positive and decreasing. Equations (1) and (2) are just the standard counterparts of 5.1.12(iii). Let $\mathrm{st}_K$ be the standard part map from ${}^*K$ to $K$.

5.2.1. PROPOSITION. Let $F$ be a closed, densely defined, non-negative, symmetric form on a Hilbert space $K$. Let $\{T_t\}$ be the semigroup generated by $F$, and let $H$ be an S-dense, hyperfinite-dimensional subspace of ${}^*K$. Then there exists a non-negative, symmetric form $\mathcal{E}$ on $H$, associated with an internal time line $T$, such that

(3)  $F = E_K$.

Moreover, if $\{Q^s\}_{s \in T}$ is the semigroup generated by $\mathcal{E}$ and $T$, then for all $t \in \mathbb{R}_+$, $s \in T$, $u \in K$, $v \in H$ such that $t = {}^\circ s$, $u = \mathrm{st}_K(v)$, we have

(4)  $\mathrm{st}_K(Q^s v) = T_t u$.
PROOF Let $P$ be the projection of ${}^*K$ on $H$, and write $\{{}^*T_t\}_{t \in {}^*\mathbb{R}_+}$ for ${}^*(\{T_t\}_{t \in \mathbb{R}_+})$. Our plan is first to define the internal semigroup by putting $Q^{\Delta t} = P\,{}^*T_{\Delta t}$ for a carefully chosen infinitesimal $\Delta t$, and then let

$\mathcal{E}(u, v) = (1/\Delta t)\langle (I - Q^{\Delta t})u, v\rangle$.

Notice that if $u \in H$, then

$\langle Q^{\Delta t}u, u\rangle = \langle P\,{}^*T_{\Delta t}u, u\rangle = \langle {}^*T_{\Delta t}u, u\rangle \ge 0$,

which shows that $Q^{\Delta t}$ is positive on $H$. Also, since the operator norm of ${}^*T_{\Delta t}$ is less than or equal to one, so is the norm of $Q^{\Delta t}$, and hence conditions (5.1.8)-(5.1.12) are satisfied. We shall now choose $\Delta t$ such that (3) and (4) hold. If $u$ is nearstandard and ${}^\circ t < \infty$, then

$\|P\,{}^*T_t u - {}^*T_t u\| \approx 0$

since ${}^*T_t$ takes nearstandard elements to nearstandard elements, and $H$ is S-dense in ${}^*K$. By induction we get

(5)  $\|(P\,{}^*T_t)^n u - {}^*T_{nt}u\| \approx 0$

for all $n \in \mathbb{N}$. For each $u \in K$, let $\tilde u = Pu$. Then $\tilde u \in H$ and $\|u - \tilde u\| \approx 0$. Consider the sets

$A_u = \{n \in {}^*\mathbb{N} \mid \forall k \le 2^{2n}\ (\|(P\,{}^*T_{2^{-n}})^k \tilde u - {}^*T_{k2^{-n}}\tilde u\| \le 1/n)\}$.

By (5), this set contains $\mathbb{N}$, and since it is internal, it must contain an infinite internal segment $\{n \in {}^*\mathbb{N} \mid n \le n_u\}$. Using saturation, we find an infinite $n$ smaller than all the $n_u$'s. Next we consider the set
m
E
* N \ V k 5 22"()(2m/k)((I- (P*T2-rn)k)u,, v,>
For each $u \in K$, this set contains $\mathbb{N}$, and hence an initial segment $\{m \in {}^*\mathbb{N} \mid m \le m_u\}$. By saturation there is an infinite $m$ smaller than all the $m_u$'s, $u \in K$. We now take $\Delta t$ to be the largest of the two infinitesimals $2^{-n}$ and $2^{-m}$. Equation (4) follows immediately from the definition of $A_u$ and the choice of $n$. By the definition of $B_u$ and the choice of $m$, we see that

$\lim_{t \downarrow 0} \dfrac{1}{t}\langle (I - T_t)u, u\rangle = \lim_{t \downarrow 0,\ t \not\approx 0} {}^\circ\mathcal{E}^{(t)}(\tilde u, \tilde u)$.

The proposition follows from (1), (2), and Theorem 5.1.12(iii).

REMARK. Let us make a few comments on 5.2.1. The assumption that $F$ is densely defined is for convenience only; if it is not satisfied, we just apply the proposition to the closure of $\mathcal{D}[F]$. If $F$ is not closed, we obviously cannot obtain $F$ as $E_K$ for any hyperfinite form $\mathcal{E}$ since 5.1.14 tells us that $E_K$ is always closed. However, if $F$ is closable (i.e., there exists a closed form extending $F$), all closed extensions of $F$ can be represented as standard parts of hyperfinite forms. A natural representation for a closable form $F$ would be a representation of its smallest closed extension, the Friedrichs extension. If $F$ is not closable, no hyperfinite representation (in our sense) is possible; any representation we try will change some $F$ values, and restrict and extend $\mathcal{D}[F]$ in different directions in order to turn $F$ into a closed form. As we commented in Section 5.1, the fact that nonclosable forms do not have hyperfinite representations is more a blessing than a curse; in standard theory a lot of effort goes into showing that the forms one constructs
are closable; in the hyperfinite theory this is an immediate consequence of the construction. Let us finally remark that there is an equation between the resolvents of $F$ and $\mathcal{E}$ similar to (4); we leave the precise statement and the proof to the reader.

In Proposition 5.2.1, the space $H$ was just any S-dense, hyperfinite-dimensional subspace of ${}^*K$, but in applications we often want to choose special kinds of subspaces appropriate for the problems we have in mind. We shall take a look at the case where $K = L^2(X, m)$ for some Hausdorff space $X$. Let us make the following assumptions about the measure space $(X, \mathcal{B}, m)$: the measure $m$ is a completed Borel measure such that $m(K) < \infty$ for all compact sets $K$, and it is a Radon measure in the sense that for all $B \in \mathcal{B}$,

(6)  $m(B) = \sup\{m(K) \mid K \subset B,\ K \text{ compact}\}$,

and for all $B \in \mathcal{B}$ with $m(B) < \infty$,

(7)  $m(B) = \inf\{m(O) \mid B \subset O,\ O \text{ open}\}$.

A subset $Y$ of ${}^*X$ is called rich if it is hyperfinite and $\mathrm{st}(Y) = X$. If $Y$ is rich, it is easy to construct a hyperfinite measure $\mu$ on $Y$ such that $m = L(\mu) \circ \mathrm{st}^{-1}$. Let $H$ be the hyperfinite-dimensional space of all internal functions $f : Y \to {}^*\mathbb{R}$, given the inner product
$\langle f, g\rangle = \int f \cdot g\,d\mu$.

We want to show that all closed forms on $K = L^2(X, \mathcal{B}, m)$ can be represented as hyperfinite forms on $H$. One way of constructing a rich set $Y$ in ${}^*X$ and a measure $\mu$ on $Y$ representing $m$ is as follows (recall the proof of 3.4.10). If $O_1, \ldots, O_n$ are open sets in $X$, let $\mathcal{P}_{O_1, \ldots, O_n}$ be the collection of all hyperfinite partitions $\mathcal{P}$ of ${}^*X$ such that if $P$ is a partition class in $\mathcal{P}$, then $P$ is ${}^*$Borel and for all $i \le n$ either $P \subset {}^*O_i$ or $P \cap {}^*O_i = \emptyset$. Using saturation on the family $\{\mathcal{P}_{O_1, \ldots, O_n}\}$ we find a partition $\mathcal{P}$ which is in all these collections. Let $Y$ be an internal set containing one point from each partition class of $\mathcal{P}$. By construction of $\mathcal{P}$, this set must be rich in ${}^*X$. For each $y \in Y$, let $P_y$ be the corresponding equivalence class in $\mathcal{P}$; it is easy to see that if $y$ is nearstandard, then $P_y$ is contained in the monad of $y$. We define the internal measure $\mu$ on $Y$ by

(8)  $\mu(\{y\}) = {}^*m(P_y)$;

the equality $m = L(\mu) \circ \mathrm{st}^{-1}$ follows from (6), (7), and 3.4.8.
Let $\tilde H$ be the subspace of ${}^*K$ consisting of all functions constant on each class $P_y \in \mathcal{P}$. Since $H$ and $\tilde H$ obviously are isomorphic, Proposition 5.2.1 will give us a representation of closed forms on $K$ in terms of internal forms on $H$ if we can only show that $\tilde H$ is S-dense in ${}^*K$. From Section 3.2 we know that each function $f$ in $K$ has an S-square-integrable lifting $\tilde f$ in $\tilde H$ such that ${}^\circ\tilde f(x) = f({}^\circ x)$ for almost all nearstandard $x$. If we can show that $\|{}^*f - \tilde f\| \approx 0$, then $\tilde H$ is dense in ${}^*K$. By (6), it suffices to show this when $f$ is bounded and of compact support, and for such functions the statement is an immediate consequence of Anderson's nonstandard version of Lusin's theorem (Corollary 3.4.9). We have proved:

5.2.2. COROLLARY. Let $X$ be a Hausdorff space, $m$ a Radon measure on $X$, and $F$ a densely defined, closed, non-negative, symmetric form on $L^2(X, m)$. Then there exists a hyperfinite, rich subset $Y$ of ${}^*X$, an internal measure $\mu$ on $Y$, and a non-negative, symmetric form $\mathcal{E}$ on $L^2(Y, \mu)$ representing $F$ in the following sense: $m = L(\mu) \circ \mathrm{st}^{-1}$ and for all $u \in L^2(X, m)$

$F(u, u) = \inf\{{}^\circ\mathcal{E}(v, v) \mid v \text{ is a 2-lifting of } u\}$.

Moreover, if $\{T_t\}$ and $\{Q^s\}$ are the semigroups generated by $F$ and $\mathcal{E}$, respectively, then $Q^s v$ is a 2-lifting of $T_t u$ whenever $v$ is a 2-lifting of $u$ and $s \approx t$.

In the remaining sections of this chapter, we shall be mostly interested in quadratic forms generating Markov processes. A bounded operator $S$ on $L^2(X, m)$ is called a Markov operator if it maps non-negative functions to non-negative functions, and

for all bounded functions $f$. A quadratic form $F$ on $L^2(X, m)$ is called a Dirichlet form if it is closed, densely defined, symmetric, and non-negative, and generates a semigroup $\{T_t\}$ of Markov operators. Hyperfinite Dirichlet forms are defined analogously. We shall see in the next section that the Dirichlet forms are exactly the forms generating Markov processes; for the time being we only make the following simple observation.

5.2.3. COROLLARY. If the form $F$ in 5.2.2 is a Dirichlet form, we can also take $\mathcal{E}$ to be a Dirichlet form.

PROOF We shall prove that if $F$ is a Dirichlet form, then the $\mathcal{E}$ obtained from the proofs of 5.2.1 and 5.2.2 is a Dirichlet form. Observe first that it is enough to prove that $Q^{\Delta t}$ is a Markov operator. In the proof of 5.2.1, we defined $Q^{\Delta t}$ as $P\,{}^*T_{\Delta t}$ for a suitable infinitesimal $\Delta t$, where $P$ is the orthogonal projection of ${}^*K$ onto $H$. With the choice of $H$ made in the
proof of 5.2.2, this projection is just the conditional expectation with respect to the algebra generated by the partition $\mathcal{P}$. Since conditional expectations preserve nonnegativity and decrease the supremum norm, the corollary follows.

When are the standard forms generated by two hyperfinite forms different? The last result we shall prove shows that to answer this question, it is enough to check whether the forms have the same resolvents. Recall that in Theorem 5.1.19 we found a way of reconstructing a form from its resolvent. This representation will be used later to study singular perturbations of operators. Lemma 5.2.4 will be useful in checking that certain perturbations are nontrivial, i.e., that the perturbed form is different from the original one.

5.2.4. LEMMA. Let $K$ be a Hilbert space and $H$ an S-dense, hyperfinite-dimensional subspace of ${}^*K$. Let $\mathcal{E}$ and $\tilde{\mathcal{E}}$ be two nonnegative, symmetric forms on $H$ inducing $E_K$ and $\tilde E_K$, respectively, on $K$. Let $\{G_\alpha\}$ and $\{\tilde G_\alpha\}$ be the resolvents of $\mathcal{E}$ and $\tilde{\mathcal{E}}$. Assume that for some finite, noninfinitesimal $\alpha \in {}^*\mathbb{R}_-$, there is a $u \in H$ with ${}^\circ\mathcal{E}_1(u, u) < \infty$ such that $v = G_\alpha u$, $w = \tilde G_\alpha u$ are both nearstandard, but ${}^\circ\|v - w\| \ne 0$. Then $E_K \ne \tilde E_K$.

PROOF Assume for contradiction that $E_K = \tilde E_K$. Pick $\tilde v \approx v$, $\tilde w \approx w$ such that $\tilde v \in \mathcal{D}[\tilde{\mathcal{E}}]$, $\tilde w \in \mathcal{D}[\mathcal{E}]$, and notice that by 5.1.17, $v \in \mathcal{D}[\mathcal{E}]$, $w \in \mathcal{D}[\tilde{\mathcal{E}}]$. We have

$$\langle u, v - w\rangle \approx \langle u, v - \tilde w\rangle = \mathcal{E}_{-\alpha}(v, v - \tilde w), \tag{9}$$

and since $v$, $\tilde w$ are nearstandard and $E_K = \tilde E_K$,

$${}^\circ\mathcal{E}_{-\alpha}(v, v - \tilde w) = {}^\circ\tilde{\mathcal{E}}_{-\alpha}(\tilde v, \tilde v - w). \tag{10}$$

On the other hand,

$$\langle u, v - w\rangle \approx \langle u, \tilde v - w\rangle = \tilde{\mathcal{E}}_{-\alpha}(w, \tilde v - w). \tag{11}$$

Combining (9), (10), and (11), we see that

$$0 = {}^\circ\tilde{\mathcal{E}}_{-\alpha}(\tilde v - w, \tilde v - w) \ge {}^\circ\big(|\alpha|\,\|v - w\|^2\big) > 0,$$
and the lemma is proved. REMARK. The representation theorems in this section are intended for general theoretical purposes. When studying a particular problem concerning a specific operator, it is often more convenient to work with a hyperfinite form constructed directly from our intuitive insight into the problem, rather than with the one obtained by an appeal to Proposition 5.2.1. For example,
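if we are interested in the form $F$ generated by $-\Delta$ (where $\Delta$ is the Laplace operator) on $\mathbb{R}^d$, i.e., the closure of

$$F(f, g) = -\int \Delta f(x)\,g(x)\,dx,$$

a simple hyperfinite representation can be constructed as follows. Let

$$\Gamma = \{(n_1\varepsilon, \ldots, n_d\varepsilon) \mid (n_1, \ldots, n_d) \in {}^*\mathbb{Z}^d,\ \max_{1 \le i \le d} |n_i| \le \ldots\},$$

where $\varepsilon \approx 0$, be a hyperfinite lattice in ${}^*\mathbb{R}^d$, and define

$$\Delta f(x) = \sum_{|y - x| = \varepsilon} \frac{f(y) - f(x)}{\varepsilon^2}$$

for all internal functions $f: \Gamma \to {}^*\mathbb{R}$, using the convention that $f(y) = 0$ when $y \notin \Gamma$. The form

$$\mathcal{E}(f, g) = -\sum_{x \in \Gamma} \Delta f(x)\,g(x)\,\varepsilon^d$$

is a simpler and more intuitive representation of $F$ than the one obtained from Corollary 5.2.2. Hyperfinite representations of this kind will be important in Sections 5.6 and 7.5.

5.3. HYPERFINITE DIRICHLET FORMS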
For the remainder of this chapter we shall restrict our attention to the hyperfinite forms generated by Markov processes, the Dirichlet forms. The aim is to give a reasonably detailed account of the relationship between properties of these forms and the behavior of the associated processes.

Consider a particle which can be in $N + 1$ different states $s_0, s_1, \ldots, s_N$. Assume that if the particle is in the state $s_i$ at some instant $t$, then, independently of what its past history may be, the probability that it will be in state $s_j$ at the next instant $t + \Delta t$ is given by a fixed number $q_{ij}$. This is the familiar setting for the theory of stationary Markov chains with finite state spaces; see, e.g., Chung (1960) and Dynkin and Yushkevich (1969). We shall be interested in the case where $S = \{s_0, s_1, \ldots, s_N\}$ is a hyperfinite set, and $T = \{k\,\Delta t \mid k \in {}^*\mathbb{N}\}$ is a hyperdiscrete time line with $\Delta t \approx 0$ (for technical reasons it is convenient to have a *-infinite time line to work with in this chapter). The idea is to use the hyperfinite setup to reduce the highly sophisticated theory of continuous parameter Markov processes taking values in topological spaces [see Fukushima (1980) and Silverstein (1974, 1976)] to the much simpler theory of finite Markov chains. Thus we
shall have in mind the situation where $S$ is an S-dense subset of ${}^*Y$ for some Hausdorff space $Y$, and where a standard Markov process is to be defined as the standard part of our hyperfinite Markov chain.

A. Hyperfinite Markov Processes and the Definition of Dirichlet Forms

To make the assumptions a little more precise, let $Q = \{q_{ij}\}$ be an $(N + 1) \times (N + 1)$ matrix with non-negative entries, and assume that

$$\sum_{j=0}^{N} q_{ij} = 1 \tag{1}$$

for all $i$. Let $m$ be a hyperfinite measure on $S = \{s_0, s_1, \ldots, s_N\}$; we shall write $m_i$ for $m(\{s_i\})$, and, whenever it is convenient, $q_{s_i s_j}$ for $q_{ij}$. If $(\Omega, P)$ is an internal measure space, and $X: \Omega \times T \to S$ is an internal process, let

$$[\omega]_t = \{\omega' \in \Omega \mid X(\omega', s) = X(\omega, s) \text{ for all } s \le t\}. \tag{2}$$

If

$$P\{\omega \mid X(\omega, 0) = s_i\} = m_i \tag{3}$$

for all $i$, and whenever $X(\omega, t) = s_i$,

$$P\{\omega' \in [\omega]_t \mid X(t + \Delta t, \omega') = s_j\} = q_{ij}\,P([\omega]_t), \tag{4}$$

then we call $X$ a Markov process with initial distribution $m$ and transition matrix $Q$. Notice that we do not assume that $m$ and $P$ are probability measures; $P(\Omega)$ could be an infinite, hyperfinite number.

Given $Q$ and $m$, it is easy to construct an associated Markov process $X$; just let $\Omega$ be the set of all internal functions $\omega: T \to S$, let $X$ be the coordinate function $X(\omega, t) = \omega(t)$, and take $P$ to be the measure generated by

$$P([\omega]_{k\Delta t}) = m(\{\omega(0)\}) \prod_{n=0}^{k-1} q_{\omega(n\Delta t),\,\omega((n+1)\Delta t)}. \tag{5}$$

A special case is when the initial distribution is concentrated in one point $s_i$ and has mass one, i.e., $m_j = \delta_{ij}$ (where $\delta_{ij}$ is the Kronecker symbol); then

$$P_i([\omega]_{k\Delta t}) = \begin{cases} \prod_{n=0}^{k-1} q_{\omega(n\Delta t),\,\omega((n+1)\Delta t)} & \text{if } \omega(0) = s_i, \\ 0 & \text{otherwise} \end{cases} \tag{6}$$

generates a probability measure.

We next introduce a few regularity conditions. The state $s_0$ is a trap, i.e.,

$$q_{0i} = 0 \quad \text{for all } i \ne 0. \tag{7}$$
The initial measure $m$ and the transition matrix $Q$ satisfy the symmetry conditions

$$m_i q_{ij} = m_j q_{ji} \quad \text{for all } i \ne 0,\ j \ne 0. \tag{8}$$

Finally, we assume that

$$m_i \ne 0 \quad \text{for at least one } i \ne 0. \tag{9}$$
It is easy to find examples of transition matrices $Q$ such that no $m$ satisfies (8) and (9), and thus these assumptions may be regarded as conditions on $Q$. Notice that for most $i$, the transition probability $q_{i0}$ should be of order of magnitude $\Delta t$, since if not the process will die in infinitesimal time. A process $X$ satisfying (3), (4), (7), (8), and (9) is called a symmetric Markov process associated with $m$ and $Q$, and it is this class of processes we now shall study in some detail. The main tool will be the theory of quadratic forms and semigroups developed in Section 5.1. We shall first obtain the form from the process.

If

$$S_0 = \{s_1, s_2, \ldots, s_N\} \tag{10}$$

is the state space $S$ without the trap $s_0$, we let $H$ be the linear space of all internal functions $u: S_0 \to {}^*\mathbb{R}$ with the inner product

$$\langle u, v\rangle = \sum_{i=1}^{N} u(s_i)\,v(s_i)\,m_i. \tag{11}$$

Just as we usually write $m_i$ for $m(s_i)$, we shall write $u(i)$ or $u_i$ for $u(s_i)$. From time to time we shall identify $H$ with the set of all internal functions $u: S \to {}^*\mathbb{R}$ such that $u(s_0) = 0$. Our convention of letting the trap $s_0$ be the zeroth element is notationally convenient, but it does create certain pitfalls for the careless reader; e.g., failure to distinguish between sums of the forms $\sum_{i=0}^{N}$ and $\sum_{i=1}^{N}$ may cause problems.

For $t \in T$ and $u \in H$ we define a new function $Q^t u \in H$ by

$$Q^t u(i) = E_i(u(X(t))), \tag{12}$$

where $E_i$ is the expectation with respect to the measure $P_i$ defined in (6). Intuitively, $Q^t u(i)$ is the expected value of $u(X(t))$ for a particle starting in state $s_i$. Notice that

$$Q^{\Delta t} u = Q \cdot u, \tag{13}$$

where $\cdot$ is matrix multiplication, and that since $q^{(s+t)}_{ij} = \sum_k q^{(s)}_{ik} q^{(t)}_{kj}$, we must have

$$Q^{s+t} = Q^s \cdot Q^t \tag{14}$$
(here $q^{(t)}_{ij}$ is the transition probability given by $Q^t$). Hence the family $\{Q^t\}_{t \in T}$ is a semigroup of operators on $H$, and

$$Q^{\Delta t} u(i) = \sum_{j=1}^{N} u(j)\,q_{ij}. \tag{15}$$

The infinitesimal generator $A$ of this semigroup is given by

$$Au(i) = \frac{1}{\Delta t}\Big[u(i) - \sum_{j=1}^{N} u(j)\,q_{ij}\Big], \tag{16}$$

and the Dirichlet form associated with $Q$ and $m$ is defined to be

$$\mathcal{E}(u, v) = \langle Au, v\rangle = \sum_{i=1}^{N} Au(i)\,v(i)\,m_i. \tag{17}$$

Combining (16) and (17), we get

$$\mathcal{E}(u, v) = \frac{1}{\Delta t}\sum_{i=1}^{N}\Big[u(i) - \sum_{j=1}^{N} u(j)\,q_{ij}\Big]v(i)\,m_i. \tag{18}$$

We have already given a different definition of Dirichlet forms in Section 5.2, but we shall prove in Proposition 5.3.3 that if a form is a Dirichlet form with respect to some pair $m$, $Q$, then it is a Dirichlet form in the sense of Section 5.2 and vice versa. Until this equivalence is established, we shall use the definition above.

B. Alternative Descriptions of Dirichlet Forms
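Our first result gives an alternative way of expressing a Dirichlet form in terms of $m_i$ and $q_{ij}$; it is a nonstandard version of the Beurling and Deny (1959) formula, and is often more useful than (18):

5.3.1. LEMMA. Let $\mathcal{E}$ be the Dirichlet form of $Q$ and $m$. Then

$$\mathcal{E}(u, v) = \frac{1}{2\,\Delta t}\sum_{i,j=1}^{N} (u(i) - u(j))(v(i) - v(j))\,q_{ij} m_i + \frac{1}{\Delta t}\sum_{i=1}^{N} u(i)\,v(i)\,q_{i0} m_i. \tag{19}$$

PROOF Notice that

$$\begin{aligned}
\mathcal{E}(u, v) &= \frac{1}{\Delta t}\Big[\sum_{i=1}^{N} u(i)v(i)m_i - \sum_{i=1}^{N}\sum_{j=1}^{N} u(j)v(i)\,q_{ij} m_i\Big] \\
&= \frac{1}{\Delta t}\Big[\sum_{i=1}^{N}\sum_{j=0}^{N} u(i)v(i)\,q_{ij} m_i - \sum_{i=1}^{N}\sum_{j=1}^{N} u(j)v(i)\,q_{ij} m_i\Big] \\
&= \frac{1}{\Delta t}\Big[\sum_{i=1}^{N}\sum_{j=1}^{N} (u(i) - u(j))\,v(i)\,q_{ij} m_i + \sum_{i=1}^{N} u(i)v(i)\,q_{i0} m_i\Big],
\end{aligned}$$

where the first line is a trivial modification of (18); the second line follows from the first since $\sum_{j=0}^{N} q_{ij} = 1$; and the last line is just a rearrangement of the second. Fix a pair $(i, j)$, and consider the terms in the last expression above involving both $i$ and $j$. If $i = j$, there is only one such term, and that term is zero. If $i \ne j$, there are two terms to consider,

$$(u(i) - u(j))\,v(i)\,q_{ij} m_i \quad \text{and} \quad (u(j) - u(i))\,v(j)\,q_{ji} m_j,$$

and since $q_{ji} m_j = q_{ij} m_i$, their sum equals $(u(i) - u(j))(v(i) - v(j))\,q_{ij} m_i$. Summing over all pairs $(i, j)$, the lemma follows.

As an immediate consequence we have

5.3.2. COROLLARY.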
A Dirichlet form is symmetric and non-negative.
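A finite analogue makes Lemma 5.3.1 and Corollary 5.3.2 easy to check by hand. The sketch below is only an illustration under stated assumptions: it replaces the hyperfinite chain by a hypothetical three-state chain with a trap (index 0), an ordinary rational time step `dt`, and made-up values of `Q` and `m` chosen so that the rows of `Q` sum to one and the symmetry condition (8) holds. The function names `form_18` and `form_19` are ours, not the book's.

```python
from fractions import Fraction as Fr

def form_18(u, v, Q, m, dt):
    """E(u, v) computed from the generator, as in (18)."""
    N = len(m) - 1
    total = Fr(0)
    for i in range(1, N + 1):
        Qu_i = sum(Q[i][j] * u[j] for j in range(1, N + 1))
        total += (u[i] - Qu_i) * v[i] * m[i]
    return total / dt

def form_19(u, v, Q, m, dt):
    """E(u, v) computed from the Beurling-Deny formula (19)."""
    N = len(m) - 1
    states = range(1, N + 1)
    jump = sum((u[i] - u[j]) * (v[i] - v[j]) * Q[i][j] * m[i]
               for i in states for j in states)
    kill = sum(u[i] * v[i] * Q[i][0] * m[i] for i in states)
    return jump / (2 * dt) + kill / dt

# Hypothetical data: index 0 is the trap, every row of Q sums to one,
# and m_i q_ij = m_j q_ji for i, j >= 1.
dt = Fr(1, 10)
m = [Fr(0), Fr(1), Fr(2), Fr(1)]
Q = [
    [Fr(1), Fr(0), Fr(0), Fr(0)],
    [Fr(1, 10), Fr(1, 2), Fr(2, 5), Fr(0)],
    [Fr(0), Fr(1, 5), Fr(3, 5), Fr(1, 5)],
    [Fr(1, 5), Fr(0), Fr(2, 5), Fr(2, 5)],
]
u = [0, 1, -2, 3]   # u(s_0) = 0 by convention
v = [0, 2, 5, -1]

print(form_18(u, v, Q, m, dt) == form_19(u, v, Q, m, dt))
```

Exact rational arithmetic is used so that the two expressions can be compared with `==` rather than up to rounding; symmetry and non-negativity can then be read off from `form_19` directly.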
The next result gives us three ways of deciding whether a given form is a Dirichlet form without actually constructing an associated Markov process. But first a few definitions. If $u \in H$, the function $\tilde u = (0 \vee u) \wedge 1$ is called the unit contraction of $u$. A quadratic form $\mathcal{E}$ is said to have the Markov property if for all $u \in H$

$$\mathcal{E}(\tilde u, \tilde u) \le \mathcal{E}(u, u).$$

Recall that a Markov operator $T: H \to H$ is an operator which maps nonnegative functions to non-negative functions, and which never increases the supremum norm, i.e.,

$$\|Tu\|_\infty \le \|u\|_\infty \quad \text{for all } u \in H.$$
5.3.3. PROPOSITION. Let

$$\mathcal{E}(u, v) = \sum_{i,j=1}^{N} b_{ij}\,u(i)\,v(j)$$

be a nonzero, symmetric form on $H$, and let $\{Q^t\}$ be the associated semigroup. The following statements are equivalent:

(i) $\mathcal{E}$ is the Dirichlet form of some $Q$ and $m$.
(ii) $Q^{\Delta t}$ is a Markov operator.
(iii) $\mathcal{E}$ has the Markov property.
(iv) Whenever $i \ne j$, $b_{ij} \le 0$; but $b_{ii} \ge -\sum_{j \ne i} b_{ij}$ for all $i$.
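Condition (iv) is the one that can be checked mechanically from the coefficients alone. The following sketch, for an ordinary finite symmetric matrix `b` (a stand-in for the hyperfinite coefficient matrix; the example matrices `good` and `bad` are made up), tests (iv) directly and probes the Markov property (iii) on random vectors. The random probe is only evidence, not a proof, and all function names are ours.

```python
import random

def satisfies_iv(b):
    """Condition (iv): off-diagonal entries <= 0 and non-negative row sums."""
    n = len(b)
    off_diagonal_ok = all(b[i][j] <= 0
                          for i in range(n) for j in range(n) if i != j)
    row_sums_ok = all(sum(b[i]) >= 0 for i in range(n))
    return off_diagonal_ok and row_sums_ok

def form(b, u, v):
    """E(u, v) = sum_{i,j} b_ij u(i) v(j)."""
    n = len(b)
    return sum(b[i][j] * u[i] * v[j] for i in range(n) for j in range(n))

def unit_contraction(u):
    """The function (0 v u) ^ 1, taken coordinatewise."""
    return [min(1.0, max(0.0, x)) for x in u]

def markov_property_holds(b, trials=1000, seed=0):
    """Probe E(u~, u~) <= E(u, u) on random vectors u."""
    rng = random.Random(seed)
    n = len(b)
    for _ in range(trials):
        u = [rng.uniform(-2.0, 2.0) for _ in range(n)]
        c = unit_contraction(u)
        if form(b, c, c) > form(b, u, u) + 1e-12:
            return False
    return True

good = [[2, -1, 0], [-1, 3, -1], [0, -1, 1]]   # satisfies (iv)
bad = [[1.0, 0.5], [0.5, 1.0]]                 # positive off-diagonal entry

print(satisfies_iv(good), markov_property_holds(good))
print(satisfies_iv(bad))
```

For `bad`, the vector `u = [-0.5, 1.0]` is a concrete witness: contracting it to `[0.0, 1.0]` increases the value of the form, in line with the test function used in the proof of (iii) implies (iv).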
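PROOF We shall prove (i) $\Rightarrow$ (ii) $\Rightarrow$ (iv) $\Rightarrow$ (i) and (i) $\Rightarrow$ (iii) $\Rightarrow$ (iv).

(i) $\Rightarrow$ (ii). This follows immediately from (12).

(ii) $\Rightarrow$ (iv). Notice that

$$Q^{\Delta t} u(i) = u(i) - \Delta t \sum_{j=1}^{N} b_{ij}\,u(j).$$

For each $j \in S_0$, let $u_j$ be given by $u_j(i) = \delta_{ij}$. Since $Q^{\Delta t}$ is Markov, we have for all $i \ne j$

$$0 \le Q^{\Delta t} u_j(i) = -\Delta t\, b_{ij},$$

and thus $b_{ij} \le 0$. On the other hand, applying $Q^{\Delta t}$ to the function which is constant one and using that $Q^{\Delta t}$ cannot increase the supremum norm, we get

$$1 \ge Q^{\Delta t} 1(i) = 1 - \Delta t \sum_{j=1}^{N} b_{ij}.$$

It follows that $b_{ii} \ge -\sum_{j \ne i} b_{ij}$.

(iv) $\Rightarrow$ (i). We first observe that since $\mathcal{E}$ is symmetric

$$\begin{aligned}
\sum_{1 \le i < j \le N} b_{ij}(u(i) - u(j))^2 &= -\sum_{i \ne j} b_{ij}\,u(i)u(j) + \sum_{1 \le i < j \le N} b_{ij}\,u(i)^2 + \sum_{1 \le i < j \le N} b_{ij}\,u(j)^2 \\
&= -\mathcal{E}(u, u) + \sum_{1 \le i, j \le N} b_{ij}\,u(i)^2,
\end{aligned}$$

and thus

$$\mathcal{E}(u, u) = -\sum_{1 \le i < j \le N} b_{ij}(u(i) - u(j))^2 + \sum_{1 \le i, j \le N} b_{ij}\,u(i)^2. \tag{20}$$

The plan is to define $Q$ and $m$ by matching the two expressions (19) and (20) term by term. It turns out that we have one degree of freedom for each $i$; we may choose

$$0 \le q_{ii} < 1. \tag{21}$$

Comparing (19) and (20), we notice that we must have

$$\frac{1}{\Delta t}\,m_i q_{ij} = -b_{ij} \quad \text{for } j \ne 0, i \tag{22}$$

and

$$\frac{1}{\Delta t}\,m_i q_{i0} = \sum_{j=1}^{N} b_{ij}. \tag{23}$$

Summing (22) over $j \ne 0, i$, adding (23), and using (1), we get $m_i(1 - q_{ii}) = \Delta t\, b_{ii}$, which determines $m_i$ and then, through (22) and (23), the remaining $q_{ij}$.

This construction breaks down when $b_{ii} = 0$, but in that case $b_{ij} = 0$ for all $j$, and we get $m_i = 0$ and can choose the $q_{ij}$'s arbitrarily.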
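The step (iv) implies (i) is constructive, and a finite sketch shows the construction in action. The code below assumes the relations obtained by combining (22) and (23) with the requirement that each row of the transition matrix sums to one, namely m_i (1 - q_ii) = dt * b_ii; this is our reading of the matching argument, and the function names, the example matrix `b`, and the choice q_ii = 1/2 are all hypothetical.

```python
from fractions import Fraction as Fr

def chain_from_form(b, dt, q_diag):
    """Build (Q, m), indexed 0..N with 0 the trap, from coefficients b_ij."""
    N = len(b)
    m = [Fr(0)] * (N + 1)
    Q = [[Fr(0)] * (N + 1) for _ in range(N + 1)]
    Q[0][0] = Fr(1)                      # the trap is absorbing, cf. (7)
    for i in range(1, N + 1):
        bii = b[i - 1][i - 1]
        if bii == 0:
            Q[i][i] = Fr(1)              # degenerate case: m_i = 0, row is arbitrary
            continue
        Q[i][i] = q_diag[i - 1]          # the free choice (21)
        m[i] = dt * bii / (1 - Q[i][i])
        for j in range(1, N + 1):
            if j != i:
                Q[i][j] = -dt * b[i - 1][j - 1] / m[i]   # from (22)
        Q[i][0] = dt * sum(b[i - 1]) / m[i]              # from (23)
    return Q, m

def beurling_deny(u, v, Q, m, dt):
    """Right-hand side of (19)."""
    states = range(1, len(m))
    jump = sum((u[i] - u[j]) * (v[i] - v[j]) * Q[i][j] * m[i]
               for i in states for j in states)
    kill = sum(u[i] * v[i] * Q[i][0] * m[i] for i in states)
    return jump / (2 * dt) + kill / dt

b = [[2, -1, 0], [-1, 3, -1], [0, -1, 1]]        # satisfies (iv)
dt = Fr(1, 10)
Q, m = chain_from_form(b, dt, [Fr(1, 2)] * 3)
u = [0, 1, -2, 3]
v = [0, 2, 5, -1]
direct = sum(b[i][j] * u[i + 1] * v[j + 1] for i in range(3) for j in range(3))
print(beurling_deny(u, v, Q, m, dt) == direct)
```

The comparison at the end checks that the chain built from `b` gives back the same bilinear form through (19), which is the content of the implication.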
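We finally observe that since $\mathcal{E}$ is symmetric, (22) implies (8), and that the nontriviality of $\mathcal{E}$ implies (9).

(i) $\Rightarrow$ (iii). This follows immediately from Lemma 5.3.1.

(iii) $\Rightarrow$ (iv). Fix $k, l \in S_0$, $k \ne l$, let $\varepsilon \in {}^*\mathbb{R}_+$, and define $u$ by

$$u(i) = \begin{cases} -\varepsilon & \text{if } i = k, \\ 1 & \text{if } i = l, \\ 0 & \text{otherwise.} \end{cases}$$

If $\tilde u$ is the unit contraction of $u$, we get from (20)

$$\begin{aligned}
0 &\le \mathcal{E}(u, u) - \mathcal{E}(\tilde u, \tilde u) \\
&= -\sum_{i \ne k, l} b_{ik}\,\varepsilon^2 - \sum_{i \ne k, l} b_{il}\,1^2 - b_{kl}(1 + \varepsilon)^2 + \sum_{i \ne k, l} b_{il}\,1^2 + b_{kl} + \varepsilon^2 \sum_{j=1}^{N} b_{kj} \\
&= -\varepsilon^2 \sum_{i \ne k, l} b_{ik} - 2\varepsilon\, b_{kl} - b_{kl}\,\varepsilon^2 + \varepsilon^2 \sum_{j=1}^{N} b_{kj}.
\end{aligned}$$

Choosing $\varepsilon$ small enough, the term $-2\varepsilon\, b_{kl}$ dominates the rest, and it follows that $b_{kl} \le 0$.

To get the second half of (iv), we fix a $k \in S_0$ and consider the function

$$v(i) = \begin{cases} 1 + \varepsilon & \text{if } i = k, \\ 1 & \text{otherwise.} \end{cases}$$

If $\tilde v$ is the unit contraction of $v$,

$$0 \le \mathcal{E}(v, v) - \mathcal{E}(\tilde v, \tilde v) = -\varepsilon^2 \sum_{i \ne k} b_{ik} + 2\varepsilon \sum_{j=1}^{N} b_{kj} + \varepsilon^2 \sum_{j=1}^{N} b_{kj},$$

and choosing $\varepsilon$ small enough, we see that $\sum_{j=1}^{N} b_{kj}$ must be nonnegative. The proposition is proved.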
Notice that 5.3.3(ii) is the definition of Dirichlet forms we used in Section 5.2. Let us try to illustrate the theory by a simple example.

5.3.4. EXAMPLE. (Brownian motion on a circle). Let $N = (\Delta t)^{-1/2}$ be an even hyperfinite integer, and let $S_0 = \{s_1, \ldots, s_N\}$ be uniformly distributed on a circle of circumference one. If $i, j \in \{1, 2, \ldots, N\}$, let the transition probability $q_{ij}$ be $\tfrac12$ if $s_i$ and $s_j$ are neighbors, and 0 otherwise. The semigroup $\{Q^t\}$ is given by

$$Q^{\Delta t} u(i) = \tfrac12 u(i + 1) + \tfrac12 u(i - 1),$$

where the addition is modulo $N$ inside the $u$'s. The infinitesimal generator is

$$Au(i) = -\frac{u(i + 1) - 2u(i) + u(i - 1)}{2\,\Delta t}, \tag{27}$$

and if $m_i = 1/N$ for all $i$, the associated Dirichlet form $\mathcal{E}$ is given by

$$\mathcal{E}(u, v) = -\frac{N}{2}\sum_{i=1}^{N} [u(i + 1) - 2u(i) + u(i - 1)]\,v(i),$$

or, in the Beurling-Deny formulation,

$$\mathcal{E}(u, v) = \frac{N}{2}\sum_{i=1}^{N} [u(i + 1) - u(i)][v(i + 1) - v(i)]. \tag{28}$$
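The two expressions for the form on the circle can be compared numerically for a large but ordinary integer N, which serves here as a stand-in for the hyperfinite N of the example. This is a rough sketch: the sample functions `u` and `v` are arbitrary choices of ours, and agreement is only up to floating-point rounding.

```python
import math

def generator_form(u, v):
    """-(N/2) * sum [u(i+1) - 2u(i) + u(i-1)] v(i), indices taken modulo N."""
    N = len(u)
    return -N / 2 * sum((u[(i + 1) % N] - 2 * u[i] + u[i - 1]) * v[i]
                        for i in range(N))

def beurling_deny_form(u, v):
    """(N/2) * sum [u(i+1) - u(i)] [v(i+1) - v(i)], indices taken modulo N."""
    N = len(u)
    return N / 2 * sum((u[(i + 1) % N] - u[i]) * (v[(i + 1) % N] - v[i])
                       for i in range(N))

N = 400
u = [math.sin(2 * math.pi * i / N) for i in range(N)]
v = [math.cos(2 * math.pi * i / N) + 0.5 * math.sin(2 * math.pi * i / N)
     for i in range(N)]

print(abs(generator_form(u, v) - beurling_deny_form(u, v)))
print(beurling_deny_form(u, u), math.pi ** 2)
```

Passing from the first function to the second is the discrete summation by parts; for u(x) = sin(2πx) the value of the form at (u, u) should be close to one half of the integral of (u')² over the circle, which is π².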
Notice that by (27), the infinitesimal generator $A$ is a nonstandard version of the operator

$$Af = -\tfrac12 f'',$$

and by (28), the form $\mathcal{E}$ is a representation of

$$(f, g) \mapsto \frac12 \int_C f' g'\,dm,$$

where $m$ is the Lebesgue measure on the circle $C$, and all derivatives are taken along the circle. Passing from the original expression for $\mathcal{E}$ to the Beurling-Deny version amounts to an integration by parts.

C. Equilibrium Potentials
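We now turn to a closer study of the relationship between a Dirichlet form $\mathcal{E}$ and the associated process $X$. Recall that for $\alpha \in {}^*\mathbb{R}_+$, the form $\mathcal{E}_\alpha$ is defined by

$$\mathcal{E}_\alpha(u, v) = \mathcal{E}(u, v) + \alpha \int uv\,dm.$$

The first problem we consider is the following. Let $f: D_f \to {}^*\mathbb{R}$ be an internal function defined on an internal subset $D_f$ of $S_0$. We want to find the function $e_\alpha(f)$ agreeing with $f$ on $D_f$ and minimizing $\mathcal{E}_\alpha(e_\alpha(f), e_\alpha(f))$. We first observe that it suffices to find an extension $e_\alpha(f)$ of $f$ such that

$$\mathcal{E}_\alpha(e_\alpha(f), u) = 0 \tag{29}$$

for all $u$ which are zero on $D_f$. To see this, let $w$ be another extension of $f$; then $u = w - e_\alpha(f)$ is zero on $D_f$, and by (29)

$$\begin{aligned}
\mathcal{E}_\alpha(w, w) &= \mathcal{E}_\alpha(e_\alpha(f) + u, e_\alpha(f) + u) \\
&= \mathcal{E}_\alpha(e_\alpha(f), e_\alpha(f)) + 2\mathcal{E}_\alpha(e_\alpha(f), u) + \mathcal{E}_\alpha(u, u) \\
&= \mathcal{E}_\alpha(e_\alpha(f), e_\alpha(f)) + \mathcal{E}_\alpha(u, u),
\end{aligned} \tag{30}$$

and $\mathcal{E}_\alpha(u, u) \ge 0$.

It still remains to find a function $e_\alpha(f)$ satisfying (29). Let $\sigma_f$ be the stopping time

$$\sigma_f(\omega) = \min\{t \in T \mid X(\omega, t) \in D_f\}, \tag{31}$$

and define

$$e_\alpha(f)(i) = E_i\big((1 + \alpha\,\Delta t)^{-\sigma_f/\Delta t} f(X(\sigma_f))\big), \tag{32}$$

where $E_i$ is the expectation with respect to the measure $P_i$. We let $(1 + \alpha\,\Delta t)^{-\sigma_f/\Delta t} = 0$ on paths that never hit $D_f$. By the Markov property,

$$e_\alpha(f)(i) = (1 + \alpha\,\Delta t)^{-1} \sum_{j=1}^{N} e_\alpha(f)(j)\,q_{ij}$$

for all $i \notin D_f$. If $u$ is zero on $D_f$, then

$$\begin{aligned}
\mathcal{E}_\alpha(e_\alpha(f), u) &= \mathcal{E}(e_\alpha(f), u) + \alpha \sum_{i=1}^{N} e_\alpha(f)(i)\,u(i)\,m_i \\
&= \frac{1}{\Delta t}\sum_{i=1}^{N}\Big[(1 + \alpha\,\Delta t)\,e_\alpha(f)(i) - \sum_{j=1}^{N} e_\alpha(f)(j)\,q_{ij}\Big]u(i)\,m_i = 0,
\end{aligned} \tag{34}$$

since the expression in the bracket is zero when $i \notin D_f$, and $u(i) = 0$ when $i \in D_f$. We have proved

5.3.5. PROPOSITION. Let $\mathcal{E}$ be a hyperfinite Dirichlet form, and let $f: D_f \to {}^*\mathbb{R}$ be an internal function defined on a subset of $S_0$. The extension of $f$ having the smallest $\mathcal{E}_\alpha$ value is $e_\alpha(f)(i) = E_i\big((1 + \alpha\,\Delta t)^{-\sigma_f/\Delta t} f(X(\sigma_f))\big)$.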
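For a finite chain the equilibrium potential can be computed without simulating paths, by solving the one-step relation that makes the bracket in (34) vanish off the domain of f. The sketch below assumes a hypothetical three-state chain with a trap (index 0), an ordinary time step `dt`, and a plain fixed-point iteration as the solver; none of these choices comes from the text, and the function names are ours.

```python
def energy(u, v, Q, m, dt, alpha):
    """E_alpha(u, v) written as in (34): the bracket applied to u, tested on v."""
    N = len(m) - 1
    total = 0.0
    for i in range(1, N + 1):
        Qu_i = sum(Q[i][j] * u[j] for j in range(1, N + 1))
        total += ((1 + alpha * dt) * u[i] - Qu_i) * v[i] * m[i]
    return total / dt

def equilibrium_potential(f, Q, dt, alpha, sweeps=2000):
    """f maps state indices in D_f to values; returns e_alpha(f) on states 0..N."""
    N = len(Q) - 1
    e = [0.0] * (N + 1)                 # e(s_0) = 0 at the trap
    for i, value in f.items():
        e[i] = value
    for _ in range(sweeps):             # fixed-point iteration off D_f
        for i in range(1, N + 1):
            if i not in f:
                e[i] = sum(Q[i][j] * e[j] for j in range(1, N + 1)) / (1 + alpha * dt)
    return e

# Hypothetical symmetric chain: m_i q_ij = m_j q_ji, rows of Q sum to one.
dt, alpha = 0.1, 1.0
m = [0.0, 1.0, 2.0, 1.0]
Q = [
    [1.0, 0.0, 0.0, 0.0],
    [0.1, 0.5, 0.4, 0.0],
    [0.0, 0.2, 0.6, 0.2],
    [0.2, 0.0, 0.4, 0.4],
]
e = equilibrium_potential({1: 1.0}, Q, dt, alpha)   # f = 1 on D_f = {s_1}

bump = [0.0, 0.0, 0.3, -0.7]                        # zero on D_f and at the trap
w = [a + b for a, b in zip(e, bump)]
print(abs(energy(e, bump, Q, m, dt, alpha)))
print(energy(w, w, Q, m, dt, alpha) >= energy(e, e, Q, m, dt, alpha))
```

The two printed checks correspond to (29) and to the minimizing property in (30): the potential is orthogonal, in the sense of the form, to every perturbation vanishing on the domain of f, so any other extension has a larger value.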
If $f$ is constant one on its domain $A = D_f$, we write $e_\alpha(A)$ for $e_\alpha(f)$. Observe that by (29)

$$\mathcal{E}_\alpha(e_\alpha(A), e_\alpha(A)) = \mathcal{E}_\alpha(e_\alpha(A), 1),$$

and applying (19) to $\mathcal{E}_\alpha(e_\alpha(A), 1)$, we get

$$\mathcal{E}_\alpha(e_\alpha(A), e_\alpha(A)) = \frac{1}{\Delta t}\sum_{i=1}^{N} e_\alpha(A)(i)\,q_{i0} m_i + \alpha \sum_{i=1}^{N} e_\alpha(A)(i)\,m_i. \tag{35}$$

The functions $e_\alpha(f)$ are called equilibrium potentials, and they will serve as bridgeheads in our campaign to unify the analytic theory of Dirichlet forms and the probabilistic theory of Markov processes. The key to their importance is the observation that they have natural interpretations both in analytic and in probabilistic terms; on the one hand they minimize the forms under suitable side conditions; on the other they describe how the process $X$ hits subsets of $S_0$. For a fuller understanding of the probabilistic description, note that for finite $\alpha$

$${}^\circ e_\alpha(A)(i) = E_i({}^\circ e^{-\alpha\sigma_A}),$$

and that the function $\alpha \mapsto E_i({}^\circ e^{-\alpha\sigma_A})$ is the Laplace transform of $\sigma_A$. Hence the distribution of $\sigma_A$ can be completely recovered from the functions $e_\alpha(A)$. This fact will be of great importance in Section 5.5. Often it is of no consequence which $\alpha$ we work with, and we shall then choose $\alpha = 1$. To simplify notation, we write $e_A$ and $e_f$ for $e_1(A)$ and $e_1(f)$, respectively.
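As a first example of the use of equilibrium potentials, we prove the following proposition.

5.3.6. PROPOSITION. Let $X$ be a hyperfinite Markov process and $\mathcal{E}$ its Dirichlet form. Then for all $u \in H$, $t \in T$, and $\varepsilon \in {}^*\mathbb{R}_+$

$$P\{\omega \mid \exists s \le t\,(|u(X(\omega, s))| \ge \varepsilon)\} \le \frac{2}{\varepsilon^2}(1 + \Delta t)^{t/\Delta t}\,\mathcal{E}_1(u, u).$$

PROOF Let $A = \{i \in S_0 \mid u(i) \ge \varepsilon\}$, then

$$P\{\omega \mid \exists s \le t\,(u(X(\omega, s)) \ge \varepsilon)\} \le (1 + \Delta t)^{t/\Delta t} \int e_A(i)\,dm(i) \le (1 + \Delta t)^{t/\Delta t}\,\mathcal{E}_1(e_A, e_A),$$

where the last step uses (35). Since $\mathcal{E}$ has the Markov property, we get from 5.3.5 that

$$\mathcal{E}_1(e_A, e_A) \le \frac{1}{\varepsilon^2}\,\mathcal{E}_1(u, u),$$

and hence

$$P\{\omega \mid \exists s \le t\,(u(X(\omega, s)) \ge \varepsilon)\} \le \frac{1}{\varepsilon^2}(1 + \Delta t)^{t/\Delta t}\,\mathcal{E}_1(u, u).$$

Applying the same argument to $-u$, the proposition follows.

5.3.7. COROLLARY. Let $u, u_n$, $n \in \mathbb{N}$, be elements in $H$, and assume that ${}^\circ\mathcal{E}_1(u - u_n, u - u_n) \to 0$ as $n \to \infty$. There is a subsequence $\{u_{n_k}\}$ such that $u_{n_k}(X)$ converges uniformly to $u(X)$ on all S-bounded subsets of $T$ for a.a. $\omega$.

PROOF By the proposition and basic measure theory.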
D. Fukushima's Decomposition Theorem

So far we have not used the theory of domains from Section 5.1, but for the next theorem it will be needed. The condition (5.1.9) requiring that

$$\|A\|\,\Delta t \le 1, \tag{37}$$

or, equivalently, that $Q^{\Delta t}$ is positive, played a not unimportant technical role in Section 5.1, and by choosing $\Delta t$ sufficiently small, we can always assume that it is satisfied:

5.3.8. DEFINITION. A Dirichlet form $\mathcal{E}$ is normal (with respect to $\Delta t$) if for all $u \in H$

$$\langle Q^{\Delta t} u, u\rangle \ge 0. \tag{38}$$

With this definition we can apply the machinery of Section 5.1 to normal Dirichlet forms. As a first example of the power of the theory, we shall prove a hyperfinite version of Fukushima's (1979, 1980) decomposition theorem. For each $t \in T$, we let $\mathcal{A}_t$ be the internal algebra generated by the sets $[\omega]_t$ defined in (2), and we ask the reader to recall the notions of quadratic variation (Definition 4.2.4) and $\lambda^2$-martingale (Definition 4.2.6).

5.3.9. THEOREM. Let $\mathcal{E}$ be a normal, hyperfinite Dirichlet form associated to a Markov process $X: \Omega \times T \to S$. For each $u \in \mathcal{D}[\mathcal{E}]$ there exist two processes $N^u, M^u: \Omega \times T \to {}^*\mathbb{R}$ such that:

(i) $u(X(\omega, t)) = N^u(\omega, t) + M^u(\omega, t)$ for all $\omega$, $t$.
(ii) $M^u$ is a $\lambda^2$-martingale with respect to $(\Omega, \{\mathcal{A}_t\}, P)$.
(iii) $N^u$ is S-continuous, and $E([N^u](t)) \approx 0$ for all finite $t \in T$.

PROOF. Define $N^u$ by

$$N^u(\omega, 0) = u(X(\omega, 0)), \tag{39}$$

$$\Delta N^u(\omega, t) = Q^{\Delta t} u(X(\omega, t)) - u(X(\omega, t)). \tag{40}$$

By definition of $Q^{\Delta t}$, the process

$$M^u = u(X) - N^u \tag{41}$$

is an $\{\mathcal{A}_t\}$-martingale. To prove that $M^u$ is a $\lambda^2$-martingale, we observe that if $X(\omega, t) = s_i$ and $X(\omega, t + \Delta t) = s_j$, then

$$\Delta M^u(\omega, t) = u(j) - Q^{\Delta t} u(i). \tag{42}$$
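The definitions (39) to (42) translate directly into a path-by-path computation. The following sketch runs them on a simulated path of a hypothetical finite chain (trap at index 0, made-up `Q` and `u`, ordinary pseudo-random sampling in place of the internal measure) and checks two things: that the two parts add up to u(X), and that the increment (42) has mean zero from every state, which is the martingale property. The helper names are ours.

```python
import random

def Q_apply(u, Q, i):
    """Q^{dt} u (i) = sum_j q_ij u(j); the trap contributes nothing since u[0] = 0."""
    return sum(Q[i][j] * u[j] for j in range(len(u)))

def sample_path(Q, start, steps, rng):
    """Simulate the chain for the given number of steps."""
    path = [start]
    for _ in range(steps):
        path.append(rng.choices(range(len(Q)), weights=Q[path[-1]])[0])
    return path

def decompose(path, u, Q):
    """Return the values of N^u and M^u along the path, following (39)-(42)."""
    N_vals, M_vals = [u[path[0]]], [0.0]
    for i, j in zip(path, path[1:]):
        N_vals.append(N_vals[-1] + Q_apply(u, Q, i) - u[i])   # increment (40)
        M_vals.append(M_vals[-1] + u[j] - Q_apply(u, Q, i))   # increment (42)
    return N_vals, M_vals

def mean_increment_of_M(u, Q, i):
    """Expected value of the increment (42) for a particle sitting in state i."""
    return sum(Q[i][j] * (u[j] - Q_apply(u, Q, i)) for j in range(len(u)))

Q = [
    [1.0, 0.0, 0.0, 0.0],
    [0.1, 0.5, 0.4, 0.0],
    [0.0, 0.2, 0.6, 0.2],
    [0.2, 0.0, 0.4, 0.4],
]
u = [0.0, 1.0, -2.0, 3.0]
path = sample_path(Q, 1, 50, random.Random(1))
N_vals, M_vals = decompose(path, u, Q)
print(max(abs(n + mm - u[s]) for n, mm, s in zip(N_vals, M_vals, path)))
print([mean_increment_of_M(u, Q, i) for i in range(4)])
```

Nothing here addresses S-continuity of the first part or the quadratic variation estimate; those are genuinely infinitesimal statements and have no finite counterpart in this sketch.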
Since $P\{\omega \mid X(\omega, t) = s_i\} \le m_i$, we get from (42):

$$E(M^u(t)^2)^{1/2} \le (2t\,\mathcal{E}(u, u))^{1/2} + (t\,\mathcal{E}(u, u - Q^{\Delta t} u))^{1/2} < \infty, \tag{43}$$

and hence $M^u$ is a $\lambda^2$-martingale.

It remains to prove (iii). The quadratic variation part is easy:

$$\frac{1}{\Delta t} E(\Delta N^u(t)^2) \le \frac{1}{\Delta t}\sum_{i=1}^{N} (u(i) - Q^{\Delta t} u(i))^2 m_i \le \mathcal{E}(u, u - Q^{\Delta t} u),$$

which is infinitesimal by 5.1.5(ii). Hence $E([N^u](t)) \le t\,\mathcal{E}(u, u - Q^{\Delta t} u)$ is infinitesimal for all finite $t$.

For each $n \in \mathbb{N}$, let $u_n = Q^{1/n} u$. Our plan is first to show that $N^{u_n}$ is S-continuous for each $n$, and then use 5.3.7 to deduce that $N^u$ must also be S-continuous. We first recall that by (5.1.17)

$$\mathcal{E}(u_n, u_n) - \mathcal{E}(Q^{\Delta t} u_n, Q^{\Delta t} u_n) \le n\,\mathcal{E}(u, u)\,\Delta t,$$

and since according to 5.1.4(i)

$$0 \le \mathcal{E}(u_n, u_n - Q^{\Delta t} u_n) \le \mathcal{E}(u_n, u_n) - \mathcal{E}(Q^{\Delta t} u_n, Q^{\Delta t} u_n),$$

we get

$$\frac{1}{\Delta t} E(\Delta N^{u_n}(t)^2) \le \mathcal{E}(u_n, u_n - Q^{\Delta t} u_n) \le n\,\mathcal{E}(u, u)\,\Delta t.$$

It follows that

$$E\big((N^{u_n}(t) - N^{u_n}(s))^2\big) \le n\,\mathcal{E}(u, u)\,(t - s)^2$$

for all $t, s \in T$. By Kolmogorov's continuity theorem (this is Proposition 4.8.5 with $d = 1$; the proof can be read independently of the rest of Section 4.8), each $N^{u_n}$ is S-continuous.

If we can find a subsequence of $\{u_n\}$ such that $N^{u_{n_k}}$ converges uniformly to $N^u$ on compacts, $N^u$ is S-continuous. By 5.3.7 there is a subsequence such that $u_{n_k}(X) \to u(X)$ uniformly on compacts, and since $N^u = u(X) - M^u$,
it suffices to establish the corresponding convergence for a subsequence $\{M^{u_{n_k}}\}$. If $v_n = u - u_n$, we get from (43) and Doob's inequality

$$E\Big(\max_{s \le t}(M^u(s) - M^{u_n}(s))^2\Big)^{1/2} \le 2E\big((M^u(t) - M^{u_n}(t))^2\big)^{1/2} \le 2(2t\,\mathcal{E}(v_n, v_n))^{1/2} + 2(t\,\mathcal{E}(v_n, v_n - Q^{\Delta t} v_n))^{1/2},$$

which gives us the convergence we need. The theorem is proved.

We know from Chapter 4 that $\lambda^2$-martingales have left and right limits. Fukushima's decomposition theorem tells us that so has $u(X)$, and that the discontinuities of $u(X)$ are those of the martingale $M^u$. If our state space $S_0$ is embedded in the *-version of a topological space $Y$, and the domain $\mathcal{D}[\mathcal{E}]$ contains enough S-continuous functions, this will imply that $X$ itself has left and right limits. In the next section we shall use information about $u(X)$ and $\mathcal{D}[\mathcal{E}]$ to carry out a detailed analysis of the process $X$. Another application of Theorem 5.3.9 is to stochastic differential equations with singular drift coefficients; see Section 5.6.

Let us finally mention an alternative way of proving 5.3.9; instead of approximating with the functions $u_n = Q^{1/n} u$, we could have used

$$u_n = nG_{-n}u,$$

where $\{G_\alpha\}$ is the resolvent of $\mathcal{E}$. This approach avoids Kolmogorov's theorem, but uses instead the theory of resolvents from Section 5.1. The reader is invited to carry out the proof, using Fukushima (1980, Theorem 5.2.2) as a reference if needed.

E. The Hyperfinite Feynman-Kac Formula
The last result we shall prove in this section is a hyperfinite version of the Feynman-Kac formula, the cornerstone of the functional integration approach to quantum physics. Given the infinitesimal generator $A$ of a Markov process $X$, and a function $V$ on the state space, the Feynman-Kac formula gives a description of the semigroup $e^{-t(A+V)}$ in terms of $V$ and $X$. Much of the formula's importance in quantum mechanics stems from the fact that the Hamiltonian $H_0$ of a free particle is just the infinitesimal generator $-(\hbar^2/2m)\Delta$ of a Brownian motion, while the Hamiltonian $H$ of a particle moving in a potential $V$ is given by $H = H_0 + V$. The Feynman-Kac formula thus gives us a probabilistic method of approaching the Schrödinger operators $H = H_0 + V$. For systematic accounts of the remarkable success of this approach, the reader should consult the books by Simon (1979) and Glimm and Jaffe (1981); in the present work we shall only make
use of it on two occasions: during the discussion of polymer measures in Section 6.4 and when explaining the connection between polymer models and quantum fields in Section 7.5.

In the nonstandard setting, $A$ will be a hyperfinite Markov operator and $V$ an internal function on the state space. The idea of the proof is to show that the semigroup $\{T^t\}_{t \in T}$ generated by $A + V$ is infinitely close to the semigroup $\{S^t\}_{t \in T}$ given by

$$S^{\Delta t} = (1 - V\,\Delta t)(1 - A\,\Delta t). \tag{44}$$

Once this relation is established, the following easy lemma will give the probabilistic interpretation of $T^t$:

5.3.10. LEMMA. Let $A$ be a hyperfinite Markov operator and let $X: \Omega \times T \to S$ be the associated process. If $V: S_0 \to {}^*\mathbb{R}$ is an internal function, and $S^t$ is the semigroup given by (44), then

$$S^t u(i) = E_i\Big(u(X(t)) \exp\Big(-\sum_{0 \le s < t} V(X(s))\,\Delta t - \eta(t)\Big)\Big),$$

where $0 \le \eta(t) \le t\,\|V\|_\infty^2\,\Delta t$.

PROOF If $\{Q^t\}$ is the semigroup generated by $A$,

$$S^t = (1 - V\,\Delta t)Q^{\Delta t}(1 - V\,\Delta t) \cdots Q^{\Delta t}(1 - V\,\Delta t)Q^{\Delta t}.$$

Using the probabilistic interpretation of $Q^{\Delta t}$ repeatedly, we see that

$$S^t u(i) = E_i\Big[u(X(t)) \prod_{0 \le s < t} (1 - V(X(s))\,\Delta t)\Big].$$

But

$$\begin{aligned}
\prod_{0 \le s < t} (1 - V(X(\omega, s))\,\Delta t) &= \exp\Big(\sum_{0 \le s < t} \ln(1 - V(X(\omega, s))\,\Delta t)\Big) \\
&= \exp\Big(-\sum_{0 \le s < t} V(X(\omega, s))\,\Delta t - \frac12 \sum_{0 \le s < t} \Theta(\omega, s)^2\,\Delta t^2\Big),
\end{aligned}$$

where $\Theta(\omega, s)$ lies between 0 and $V(X(\omega, s))$. Hence

$$0 \le \frac12 \sum_{0 \le s < t} \Theta(\omega, s)^2\,\Delta t^2 \le t\,\|V\|_\infty^2\,\Delta t,$$

and the lemma follows.
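The first half of the proof, the identity between the operator product and the path expectation of u(X(t)) times the product of the factors 1 - V(X(s)) Δt, can be verified exactly on a small chain by enumerating every path. The sketch assumes a hypothetical four-state chain (index 0 is the trap, where both `u` and `V` are set to zero) and an ordinary time step; the function names are ours.

```python
from itertools import product

def apply_S(u, Q, V, dt, steps):
    """Apply ((1 - V dt) Q)^steps to u; vectors are indexed 0..N, 0 = trap."""
    n = len(u)
    w = list(u)
    for _ in range(steps):
        Qw = [sum(Q[i][j] * w[j] for j in range(n)) for i in range(n)]
        w = [(1 - V[i] * dt) * Qw[i] for i in range(n)]
    return w

def path_expectation(i, u, Q, V, dt, steps):
    """E_i[u(X(t)) * prod_{0 <= s < t} (1 - V(X(s)) dt)] by enumerating all paths."""
    n = len(u)
    total = 0.0
    for tail in product(range(n), repeat=steps):
        path = (i,) + tail
        prob, weight = 1.0, 1.0
        for a, b in zip(path, path[1:]):
            prob *= Q[a][b]              # transition probability of this step
            weight *= 1 - V[a] * dt      # factor picked up at the state being left
        total += prob * weight * u[path[-1]]
    return total

Q = [
    [1.0, 0.0, 0.0, 0.0],
    [0.1, 0.5, 0.4, 0.0],
    [0.0, 0.2, 0.6, 0.2],
    [0.2, 0.0, 0.4, 0.4],
]
u = [0.0, 1.0, -2.0, 3.0]
V = [0.0, 1.0, 0.0, 3.0]
dt, steps = 0.1, 5

operator_side = apply_S(u, Q, V, dt, steps)
print([abs(operator_side[i] - path_expectation(i, u, Q, V, dt, steps))
       for i in range(1, 4)])
```

Enumeration is only feasible for a handful of steps (here 4^5 paths); it is meant as a check of the bookkeeping, in particular of which states contribute a factor, and not as a way of computing the semigroup.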
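Note that if $\|V\|_\infty^2\,\Delta t \approx 0$, then the lemma says that $S^t u$ and $E_{(\cdot)}\big(u(X(t)) \exp[-\int_0^t V(X(s))\,ds]\big)$ are infinitely close in any reasonable sense. We are ready to prove a hyperfinite version of the Feynman-Kac formula:

5.3.11. THEOREM. Let $A$ be a hyperfinite Markov operator and let $X: \Omega \times T \to S$ be the associated process. Assume that $V: S_0 \to {}^*\mathbb{R}$ is an internal function such that

(a) there is a $\rho \in \mathbb{R}$ such that $\langle (A + V)u, u\rangle \ge \rho\,\|u\|^2$ for all $u$,
(b) $\|V\|_\infty / \ln(\Delta t) \approx 0$.

If $\{T^t\}$ denotes the semigroup generated by $A + V$, then

$$T^t u \approx E_{(\cdot)}\Big(u(X(t)) \exp\Big[-\int_0^t V(X(s))\,ds\Big]\Big)$$

for all finite $t$ and all $u$ with finite norm.

PROOF By the lemma it suffices to prove that $T^t u \approx S^t u$. Observe first that

$$\begin{aligned}
S^t &= S^{t - \Delta t}(1 - (A + V)\,\Delta t + VA\,\Delta t^2) \\
&= S^{t - \Delta t}T^{\Delta t} + S^{t - \Delta t}VA\,\Delta t^2 \\
&= S^{t - 2\Delta t}(1 - (A + V)\,\Delta t + VA\,\Delta t^2)T^{\Delta t} + S^{t - \Delta t}VA\,\Delta t^2 \\
&= S^{t - 2\Delta t}T^{2\Delta t} + S^{t - 2\Delta t}VAT^{\Delta t}\,\Delta t^2 + S^{t - \Delta t}VA\,\Delta t^2.
\end{aligned}$$

Continuing in this manner, we finally get

$$S^t = T^t + \sum_{0 \le s < t} S^{t - s - \Delta t}\,VA\,T^s\,\Delta t^2.$$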
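The telescoping identity just obtained is purely algebraic, so it can be confirmed with exact arithmetic on small matrices. In the sketch below, `A` is built from a hypothetical 3 x 3 sub-stochastic matrix as (1 - Q)/Δt, `V` is a made-up diagonal matrix, and t = k Δt for an ordinary integer k; the matrix helpers are written out by hand to keep the example self-contained, and all names are ours.

```python
from fractions import Fraction as Fr

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    return [[c * a for a in row] for row in X]

def identity(n):
    return [[Fr(int(i == j)) for j in range(n)] for i in range(n)]

def matpow(X, k):
    R = identity(len(X))
    for _ in range(k):
        R = matmul(R, X)
    return R

def both_sides(A, V, dt, k):
    """Return S^t and T^t + sum_{r<k} S^{k-1-r} (V A dt^2) T^r, where t = k*dt."""
    I = identity(len(A))
    S1 = matmul(matadd(I, scale(-dt, V)), matadd(I, scale(-dt, A)))   # one step of (44)
    T1 = matadd(I, scale(-dt, matadd(A, V)))                          # one step of T
    VA = scale(dt * dt, matmul(V, A))
    rhs = matpow(T1, k)
    for r in range(k):
        rhs = matadd(rhs, matmul(matmul(matpow(S1, k - 1 - r), VA), matpow(T1, r)))
    return matpow(S1, k), rhs

dt = Fr(1, 10)
Q0 = [[Fr(1, 2), Fr(2, 5), Fr(0)],
      [Fr(1, 5), Fr(3, 5), Fr(1, 5)],
      [Fr(0), Fr(2, 5), Fr(2, 5)]]
A = scale(1 / dt, matadd(identity(3), scale(-1, Q0)))   # A = (1 - Q) / dt
V = [[Fr(1), Fr(0), Fr(0)], [Fr(0), Fr(0), Fr(0)], [Fr(0), Fr(0), Fr(3)]]

lhs, rhs = both_sides(A, V, dt, 6)
print(lhs == rhs)
```

The identity holds for any matrices `A` and `V`; the substance of the theorem lies in showing that the correction sum is infinitesimal under conditions (a) and (b), which this finite check does not touch.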
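To show that the sum $\sum_{0 \le s < t} S^{t - s - \Delta t}\,VA\,T^s\,\Delta t^2$ is infinitesimal, we write it as

$$\sum_{0 \le s < t} S^{t - s - \Delta t}\,V(A + V)\,T^s\,\Delta t^2 - \sum_{0 \le s < t} S^{t - s - \Delta t}\,V^2\,T^s\,\Delta t^2. \tag{45}$$

Considering the second term first, we note that since

$$\|S^{t - s - \Delta t}\| \le e^{\|V\|_\infty (t - s)} \le e^{\|V\|_\infty t} \quad \text{and} \quad \|T^s\| \le e^{|\rho| s} \le e^{|\rho| t},$$

we have

$$\Big\|\sum_{0 \le s < t} S^{t - s - \Delta t}\,V^2\,T^s\,\Delta t^2\Big\| \le t\,e^{(\|V\|_\infty + |\rho|)t}\,\|V\|_\infty^2\,\Delta t,$$

which is infinitesimal by (b).

Turning our attention to the first term in (45), we claim that

$$\|(A + V)T^s\| \le \frac{M}{s + \Delta t} \tag{46}$$

for a finite constant $M$, but let us first show that this is sufficient to finish the proof: since

$$\sum_{0 \le s < t} \frac{\Delta t}{s + \Delta t} \le 1 + \ln\frac{t}{\Delta t},$$

we get

$$\Big\|\sum_{0 \le s < t} S^{t - s - \Delta t}\,V(A + V)\,T^s\,\Delta t^2\Big\| \le M\,e^{\|V\|_\infty t}\,\|V\|_\infty\Big(1 + \ln\frac{t}{\Delta t}\Big)\Delta t,$$

which is infinitesimal for all finite $t$ by (b).

It only remains to prove (46). Obviously, $\|(A + V)T^s\|$ equals the largest of the values $|\lambda(1 - \lambda\,\Delta t)^{s/\Delta t}|$ when $\lambda$ ranges over the eigenvalues of $A + V$. If $\lambda$ is negative, it is bounded from below by the $\rho$ in (a), and thus $|\lambda(1 - \lambda\,\Delta t)^{s/\Delta t}| \le |\rho|\,e^{|\rho| s}$. If $\lambda$ is positive, $\|(A + V)T^s\|$ cannot exceed the maximal value of the function $\lambda \mapsto \lambda(1 - \lambda\,\Delta t)^{s/\Delta t}$. Since

$$\frac{d}{d\lambda}\,\lambda(1 - \lambda\,\Delta t)^{s/\Delta t} = (1 - \lambda\,\Delta t)^{s/\Delta t - 1}\big(1 - \lambda(s + \Delta t)\big),$$

the maximum is attained at $\lambda = 1/(s + \Delta t)$, and we see that we can choose $M$ equal to the maximum of $|\rho|\,t\,e^{|\rho| t}$ and 1.

REMARK. If the proof above seems rather complicated, the reason is that the conditions are quite weak; for instance, the boundedness condition (b) only requires that $V$ has a certain infinite bound. This makes it possible to apply the theorem to potentials which are much too singular to be described by standard functions, a fact which will be important in Section 6.4.

The following integrated version of the Feynman-Kac formula will be needed in Section 7.5.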
5.3.12. COROLLARY. Assume that $A$ and $V$ are as in Theorem 5.3.11, and that there is a positive real number $c$ such that $V(s) \ge c$ for all $s \in S_0$. Then
for all $u$ with finite norm.

There are two difficulties to overcome in order to prove this corollary: the behavior of the integrand at zero and at infinity. Since the integrand decays as $e^{-ct}/t$ for large $t$, the behavior at infinity does not really create any problem. At the origin, we have a singularity of order $\ln(1/\Delta t)$, but since the error estimates in the proof of 5.3.11 allow an extra factor $\ln(1/\Delta t)$, the corollary follows. We leave the details to the reader.

REMARK. The systematic theory of Dirichlet forms was first studied by Beurling and Deny (1958, 1959), and the theory has since been developed by a number of authors; the reader should consult the books by Silverstein (1974, 1976) and Fukushima (1980) for further references. It is interesting to note that in their first papers [but see Deny (1970)] Beurling and Deny only considered finite Markov chains; in a certain sense the hyperfinite theory developed above is a return to the origins of the theory. The Feynman-Kac formula grew out of Kac's attempt to give a mathematical foundation for Feynman's ideas; for the probabilistic interpretation of the formula, see, e.g., Williams (1979).
5.4. STANDARD PARTS AND MARKOV PROCESSES
In this section we shall study the standard parts of hyperfinite Markov processes. In order to take standard parts we need a topology; we shall assume that with the exception of the trap $s_0$, the state space $S$ is embedded in the nonstandard version ${}^*Y$ of some Hausdorff space $Y$. If $X$ is a hyperfinite Markov process taking values in $S$, we want to find conditions that guarantee that the standard part of $X$ exists and is a $Y$-valued Markov process. It turns out that there are two difficulties we shall have to overcome; the first is that the paths of $X$ may be so irregular that no natural standard part process exists; the second is that even when a standard part does exist, there is no reason why it should automatically be a Markov process, since in taking standard parts we may lump together states that should be kept apart. We shall see that the theory of right standard parts that we developed in Chapter 4 is sufficient to solve the first of these problems. Thus most of the work in this section will be directed to the second problem under discussion. Before we delve into the technicalities, we shall discuss the problem informally in somewhat more detail.
Assume that $x: \Omega \times \mathbb{R}_+ \to Y$ is the standard part of $X$ and that we want to prove that $x$ is a Markov process with respect to the filtration it generates. Given that $x(t) = y$, there will in general be several states $s \in S_0$ such that $y = \mathrm{st}(s)$, and $X$ may be in any one of them. From the nonstandard point of view, these states are totally unrelated, and hence the past and the future of the process may differ widely from one state to the next. Observation of the past may indicate which states are the more likely, and thus influence our prediction of the future. This explains why in general $x$ is not a Markov process. Note, however, that if the process started at $s_i$ and the process started at $s_j$ have the "same" future whenever $s_i \approx s_j$, the above argument breaks down, and it is reasonable to expect that $x$ is Markov. One way of formulating this condition is to demand that
for all $t \in \mathbb{R}_+$ and all Borel sets $B$. But (1) as it stands turns out to be too strict for the applications we have in mind; instead of demanding that it holds for all infinitely close $s_i$, $s_j$, we shall only demand that it holds for all such $s_i$, $s_j$ outside an exceptional set (i.e., a set which the process hits with probability zero). It may at first seem that little is achieved by allowing a condition to fail on an exceptional set, but in fact the extra freedom and flexibility we gain will turn out to be very useful. The situation is reminiscent of measure theory; by Anderson's Lusin theorem 3.4.9, a measurable function is one that fails to be S-continuous on a set of measure zero, a condition similar to the weaker version of (1).

This section falls naturally into two halves. In the first part we develop the necessary theory for exceptional sets. In the second we show that if (1) plus some additional conditions are satisfied, the standard part is a strong Markov process. In the next section we shall translate these probabilistic conditions into the language of Dirichlet forms, and find conditions on forms which guarantee that the standard parts of the associated processes are Markov.

A. Exceptional Sets
By a hyperfinite Markov process we shall in this section understand a stationary hyperfinite Markov chain as described in Section 5.3. However, the symmetry condition (5.3.8), that $m_i q_{ij} = m_j q_{ji}$ for all $i, j \ne 0$, will not be needed. We shall instead assume that for each $i \in S_0$, the function

$$t \mapsto P\{\omega \mid X(\omega, t) = s_i\} \tag{2}$$

is decreasing.
We shall further assume that $S_0 \subset {}^*Y$ for some topological space $Y$, but that the trap $s_0$ is not an element of ${}^*Y$. To keep the measure-theoretic complications to a minimum, we introduce the following condition on $Y$:

5.4.1. DEFINITION. A Hausdorff space $Y$ is almost σ-compact if for each hyperfinite probability measure $P$ on a subset $S_0$ of ${}^*Y$, the set $\bar S_0 = S_0 \cap \mathrm{Ns}({}^*Y)$ is Loeb measurable, and

$L(P)(\bar S_0) = \sup\{L(P)(S_0 \cap \mathrm{st}^{-1}(K)) \mid K \text{ compact}\}.$

Note that if $Y$ is almost σ-compact, then any internal probability measure $P$ on $S_0$ has a standard part $L(P) \circ \mathrm{st}^{-1}$ (by Theorem 3.4.6); this standard part is Radon, but need not be a probability measure. In the new terminology Proposition 3.4.7 can be recast as follows:

5.4.2. PROPOSITION. All locally compact spaces and all complete metric spaces are almost σ-compact.

Throughout this section we shall assume that $Y$ is almost σ-compact. We have already defined

(3) $\bar S_0 = S_0 \cap \mathrm{Ns}({}^*Y),$

and we now introduce

(4) $\bar\Omega = \{\omega \in \Omega \mid X(\omega, 0) \in \bar S_0\}.$

Assuming that $\bar S_0$ and $\bar\Omega$ are $L(m)$- and $L(P)$-measurable, respectively, we define new measures $\bar m$ and $\bar P$ by

(5) $\bar m(A) = L(m)(A \cap \bar S_0),$

(6) $\bar P(B) = L(P)(B \cap \bar\Omega).$
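If $\delta \in T$, let $T_\delta$ be the subline

$T_\delta = \{0, \delta, 2\delta, \ldots\}$

and set

$T_\delta^r = \{t \in T_\delta \mid t \le r\}, \qquad T_\delta^{\mathrm{fin}} = \{t \in T_\delta \mid t \text{ is finite}\}.$

We shall write $X^{(\delta)}$ for the restriction $X \restriction T_\delta$.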
5.4.3. DEFINITION. A subset $A$ of $S_0$ is called δ-exceptional if for all $\varepsilon \in \mathbb{R}_+$, there is an internal set $B \supset A$ such that

(7) $\bar P\{\omega \mid \exists t \in T_\delta^1\,(X(\omega, t) \in B)\} < \varepsilon.$

A set is exceptional if it is δ-exceptional for some infinitesimal δ.

Note that if $A$ is δ-exceptional, then because of (2),

(8) $\bar P\{\omega \mid \exists t \in T_\delta^{\mathrm{fin}}\,(X(\omega, t) \in A)\} = 0.$
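The quantity bounded in (7) is a hitting probability over a bounded time window. For an ordinary finite Markov chain it can be computed exactly, and the sketch below does this as a finite-scale illustration only. The function name `hitting_probability` and its arguments are hypothetical, and a finite number of steps stands in for the hyperfinite time line $T_\delta^1$.

```python
from fractions import Fraction


def hitting_probability(q, init, target, steps):
    """Probability that a finite chain visits `target` at some time 0..steps.

    q[i][j] : one-step transition probability from state i to state j
    init[i] : initial probability of state i
    target  : set of state indices (plays the role of the internal set B in (7))
    """
    n = len(q)
    # alive[i] = probability of sitting at i without having visited target yet
    alive = [Fraction(0) if i in target else Fraction(init[i]) for i in range(n)]
    for _ in range(steps):
        nxt = [Fraction(0)] * n
        for i in range(n):
            if alive[i] == 0:
                continue
            for j in range(n):
                if j not in target:
                    # mass that moves i -> j and still avoids the target
                    nxt[j] += alive[i] * q[i][j]
        alive = nxt
    # whatever mass is no longer "alive" has hit the target
    return 1 - sum(alive)
```

In this finite picture a set is "small" when the returned value can be made as small as we please by choosing the target suitably, which is the role played by the internal sets $B \supset A$ in Definition 5.4.3.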
The larger δ is, the more sets are δ-exceptional. It is therefore convenient to be able to restrict to coarser time lines, and as long as we are only interested in the right standard part of $X$, we may do so without any loss of generality. Our first lemma contains two extremely simple but also very useful observations. The proof is left to the reader.

5.4.4. LEMMA. (i) All internal sets $B \subset S_0$ with $m(B) = 0$ are exceptional. (ii) The families of exceptional and δ-exceptional sets are closed under countable unions.
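Why did we use (7) and not (8) to define exceptional sets? Simply because there is no reason to believe that the set in (8) is measurable when $A$ is fairly complicated. Obviously, the set is measurable when $A$ is internal, but although

(9) $\{\omega \mid \exists t \in T_\delta^r\,(X(\omega, t) \in \bigcup_{n \in \mathbb{N}} A_n)\} = \bigcup_{n \in \mathbb{N}} \{\omega \mid \exists t \in T_\delta^r\,(X(\omega, t) \in A_n)\}$

for all sequences $\{A_n\}_{n \in \mathbb{N}}$ of subsets of $S_0$, the corresponding formula for intersections is false in general. It does hold, however, if the sequence is decreasing and consists of internal sets. This is the observation behind the next lemma.

5.4.5. LEMMA. Let $A \subset S_0$ and assume that there is a family $\{B_{m,n}\}_{m,n \in \mathbb{N}}$ of internal sets such that

(10) $A = \bigcup_{m \in \mathbb{N}} \bigcap_{n \in \mathbb{N}} B_{m,n}$

and for each $m$, the sequence $\{B_{m,n}\}_{n \in \mathbb{N}}$ is decreasing. Then

(11) $\{\omega \mid \exists t \in T_\delta^r\,(X(\omega, t) \in A)\} = \bigcup_{m \in \mathbb{N}} \bigcap_{n \in \mathbb{N}} \{\omega \mid \exists t \in T_\delta^r\,(X(\omega, t) \in B_{m,n})\}.$

PROOF. It suffices to show that

(12) $\{\omega \mid \exists t \in T_\delta^r\,(X(\omega, t) \in \bigcap_{n \in \mathbb{N}} B_{m,n})\} = \bigcap_{n \in \mathbb{N}} \{\omega \mid \exists t \in T_\delta^r\,(X(\omega, t) \in B_{m,n})\}$

for all $m$, since (11) then follows from (9). Also, it is immediately clear that the left-hand side of (12) is included in the right-hand side. To prove the opposite inclusion, choose

$\omega_0 \in \bigcap_{n \in \mathbb{N}} \{\omega \mid \exists t \in T_\delta^r\,(X(\omega, t) \in B_{m,n})\},$

and consider the set

$\{n \in {}^*\mathbb{N} \mid \exists t \in T_\delta^r\,(X(\omega_0, t) \in \tilde B_{m,n})\},$

where $\{\tilde B_{m,n}\}_{n \in {}^*\mathbb{N}}$ is some internal, decreasing extension of $\{B_{m,n}\}_{n \in \mathbb{N}}$. By choice of $\omega_0$, this set contains $\mathbb{N}$, and since it is internal, it must have an infinite member $\eta$. Thus there is a $t \in T_\delta^r$ with

$X(\omega_0, t) \in \tilde B_{m,\eta} \subset \bigcap_{n \in \mathbb{N}} B_{m,n},$

and the lemma is proved.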
An exceptional set $A$ is hit by a set of paths of $\bar P$-measure zero, but there may still be states $s_i \notin A$ such that a particle starting at $s_i$ hits $A$ with positive $L(P_i)$ probability.

5.4.6. DEFINITION. A δ-exceptional subset $A$ of $S_0$ is called properly δ-exceptional if there is a family $\{B_{m,n}\}_{m,n \in \mathbb{N}}$ of internal sets such that

(13) $A = \bigcup_{m \in \mathbb{N}} \bigcap_{n \in \mathbb{N}} B_{m,n}$

and for all $s_i \notin A$

(14) $L(P_i)\{\omega \mid \exists t \in T_\delta^{\mathrm{fin}}\,(X(\omega, t) \in A)\} = 0.$

A set is properly exceptional if it is properly δ-exceptional for some $\delta \approx 0$.
By the lemma above, the set in (14) is measurable. Also note that the classes of properly δ-exceptional and properly exceptional sets are closed under countable unions. The properly exceptional sets are the "nice" exceptional sets in two ways; they are impenetrable from the outside, and they are constructed from the internal sets in a simple and uniform fashion. The next lemma shows that the class is also large enough.

5.4.7. LEMMA. If $A \subset S_0$ is δ-exceptional, there is a properly δ-exceptional set $B \supset A$.
PROOF. Since $A$ is δ-exceptional, there is for each pair $(m, n)$ of natural numbers an internal set $B_{m,n}$ containing $A$ such that

$\bar P\{\exists t \in T_\delta^{m+1}\,(X(t) \in B_{m,n})\} \le 1/n^2 m.$

Define

$C_{m,n} = \{i \in S \mid P_i\{\exists t \in T_\delta^m\,(X(t) \in B_{m,n})\} \ge 1/nm\},$

and set

$A' = \bigcup_{m \in \mathbb{N}} \bigcap_{n \in \mathbb{N}} C_{m,n}.$

We first show that $A'$ is δ-exceptional. By definition of $C_{m,n}$,

$\bar P\{\omega \mid \exists t \in T_\delta^1\,(X(\omega, t) \in C_{m,n})\} \cdot 1/nm \le \bar P\{\omega \mid \exists t \in T_\delta^{m+1}\,(X(\omega, t) \in B_{m,n})\} \le 1/n^2 m,$

and thus

$\bar P\{\omega \mid \exists t \in T_\delta^1\,(X(\omega, t) \in C_{m,n})\} \le 1/n.$

This shows that for each $m$ the set $\bigcap_{n \in \mathbb{N}} C_{m,n}$ is δ-exceptional, and since the class of δ-exceptional sets is closed under countable unions, $A'$ is also δ-exceptional. From the definition of the family $\{C_{m,n}\}$, we see that if $s_i \notin A'$, then

(15) $L(P_i)\{\omega \mid \exists t \in T_\delta^{\mathrm{fin}}\,(X(\omega, t) \in A)\} = 0.$

We now iterate the construction above countably many times to get an increasing sequence

$A_0 \subset A_1 \subset A_2 \subset \cdots$

of δ-exceptional sets, where $A_0 = A$ and $A_{i+1} = A_i'$ for each $i$. Let

$B = \bigcup_{i \in \mathbb{N}} A_i;$

then $B$ is a δ-exceptional set of the form

$B = \bigcup_{m \in \mathbb{N}} \bigcap_{n \in \mathbb{N}} C'_{m,n}$

for internal sets $C'_{m,n}$, and it follows from (15) that if $s_i \notin B$, then

$L(P_i)\{\omega \mid \exists t \in T_\delta^{\mathrm{fin}}\,(X(\omega, t) \in B)\} = 0.$
The lemma is proved.

Most of the exceptional sets we encounter in the theory of Markov processes are sets we would like the process to avoid. From the standard point of view, this means that a point $y$ in $Y$ should be avoided if all its nonstandard representations $s_i \in \mathrm{st}^{-1}(y) \cap S_0$ are in the exceptional set. To study this relationship more closely, we define the inner standard part $A^\circ$ of a subset $A$ of $S_0$ by

(16) $A^\circ = \{y \in Y \mid \mathrm{st}^{-1}(y) \cap S_0 \subset A\}.$

We have already defined the standard part of $A$ by

(17) ${}^\circ A = \mathrm{st}(A) = \{y \in Y \mid \exists s \in A\,(\mathrm{st}(s) = y)\}.$
The inner standard part can also be defined in terms of the standard part operation:

(18) $A^\circ = \complement(\mathrm{st}(\complement A)),$

where the outer complement is with respect to $Y$, the inner with respect to $S_0$. It is trivial to check that standard parts commute with arbitrary unions. Using (18), we get that inner standard parts commute with arbitrary intersections. The other way around is less nice; standard parts and intersections do not commute, and neither do inner standard parts and unions. All we can say is the following:

5.4.8. LEMMA. If $\{A_n\}_{n \in \mathbb{N}}$ is a decreasing family of internal sets,

(19) ${}^\circ\bigl(\bigcap_{n \in \mathbb{N}} A_n\bigr) = \bigcap_{n \in \mathbb{N}} {}^\circ A_n.$

If $\{B_n\}_{n \in \mathbb{N}}$ is an increasing family of internal sets,

(20) $\bigl(\bigcup_{n \in \mathbb{N}} B_n\bigr)^\circ = \bigcup_{n \in \mathbb{N}} B_n^\circ.$
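PROOF. Obviously, the left-hand side of (19) is contained in the set on the right. To prove the converse, let

$x \in \bigcap_{n \in \mathbb{N}} {}^\circ A_n.$

For each $n \in \mathbb{N}$, pick $y_n \in A_n$ such that $x = \mathrm{st}(y_n)$. Extend $\{A_n\}_{n \in \mathbb{N}}$ to a decreasing internal family $\{A_n\}_{n \in {}^*\mathbb{N}}$ and $\{y_n\}_{n \in \mathbb{N}}$ to an internal sequence such that $y_n \in A_n$ for all $n \in {}^*\mathbb{N}$. For each neighborhood $O$ of $x$, consider the set

$N_O = \{n \in {}^*\mathbb{N} \mid y_n \in {}^*O\}.$

All these sets are internal and contain $\mathbb{N}$, and hence we can find an infinite integer $\eta$ that is in all of them. But then $x = \mathrm{st}(y_\eta)$, and since the family $\{A_n\}_{n \in {}^*\mathbb{N}}$ is decreasing, $y_\eta \in A_\eta \subset \bigcap_{n \in \mathbb{N}} A_n$. This proves (19). We now get (20) from (19) by using (18):

$\bigl(\bigcup_{n \in \mathbb{N}} B_n\bigr)^\circ = \complement\,\mathrm{st}\,\complement \bigcup_{n \in \mathbb{N}} B_n = \complement\,\mathrm{st} \bigcap_{n \in \mathbb{N}} \complement B_n = \complement \bigcap_{n \in \mathbb{N}} \mathrm{st}\,\complement B_n = \bigcup_{n \in \mathbb{N}} \complement\,\mathrm{st}\,\complement B_n = \bigcup_{n \in \mathbb{N}} B_n^\circ.$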
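The set-theoretic content of (16)-(18) does not depend on nonstandard analysis: it concerns the image and the "inner image" of a set under a partial map. The sketch below checks these relations on a small finite example. The dictionary `st`, the sets `S` and `Y`, and the helper names are all hypothetical stand-ins, and a finite example cannot of course exhibit the role of internality in Lemma 5.4.8.

```python
def image(st, A):
    """Finite analogue of the standard part (17): all st(s) with s in A."""
    # st is a partial map; points without a value mimic non-nearstandard points
    return {st[s] for s in A if s in st}


def inner_image(st, A, S, Y):
    """Finite analogue of the inner standard part (16):
    all y in Y whose whole fiber st^{-1}(y) (within S) lies inside A."""
    return {y for y in Y if all(s in A for s in S if st.get(s) == y)}


# A toy partial map: point 5 has no standard part, 'd' has an empty fiber.
S = {0, 1, 2, 3, 4, 5}
Y = {"a", "b", "c", "d"}
st = {0: "a", 1: "a", 2: "b", 3: "b", 4: "c"}

A = {0, 2, 3}
B = {1, 2, 3}

# (18): the inner image is the complement of the image of the complement.
formula_18_holds = inner_image(st, A, S, Y) == Y - image(st, S - A)

# Images commute with unions, inner images with intersections ...
union_ok = image(st, A | B) == image(st, A) | image(st, B)
inter_ok = inner_image(st, A & B, S, Y) == (
    inner_image(st, A, S, Y) & inner_image(st, B, S, Y)
)

# ... but not the other way around.
image_inter_fails = image(st, A & B) != image(st, A) & image(st, B)
inner_union_fails = inner_image(st, A | B, S, Y) != (
    inner_image(st, A, S, Y) | inner_image(st, B, S, Y)
)
```

Here both failures come from the fiber over "a", which is split between `A` and `B`; Lemma 5.4.8 says that for monotone families of internal sets this kind of splitting cannot survive in the limit.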
Since in general

$\bigl(\bigcup_{m \in \mathbb{N}} \bigcap_{n \in \mathbb{N}} B_{m,n}\bigr)^\circ \neq \bigcup_{m \in \mathbb{N}} \bigcap_{n \in \mathbb{N}} B_{m,n}^\circ,$

there is no obvious reason to believe that the inner standard part of a properly exceptional set is always Borel, but we shall now prove that it must at least be universally measurable.

5.4.9. LEMMA. Assume $A = \bigcup_{m \in \mathbb{N}} \bigcap_{n \in \mathbb{N}} B_{m,n}$ for a family $\{B_{m,n}\}_{m,n \in \mathbb{N}}$ of internal sets. For any completed Borel probability measure $\mu$ on $Y$, the inner standard part $A^\circ$ is $\mu$-measurable, and there exists a family $\{D_{m,n}\}_{m,n \in \mathbb{N}}$ of internal sets such that
and
PROOF. We may assume that the family $\{B_{m,n}\}$ is increasing in $m$ and decreasing in $n$. Note that

$A = \bigcup_{m \in \mathbb{N}} \bigcap_{n \in \mathbb{N}} B_{m,n} = \bigcap_{f \in \mathbb{N}^{\mathbb{N}}} \bigcup_{m \in \mathbb{N}} B_{m,f(m)}.$

Let $f|m$ be the sequence $(f(0), f(1), \ldots, f(m))$, and define

$C_{f|m} = \bigcup_{k \le m} B_{k,f(k)}.$

For fixed $f$, the sequence $\{C_{f|m}\}_{m \in \mathbb{N}}$ is increasing, and hence by 5.4.8 and the fact that inner standard parts and arbitrary intersections commute,

(23) $A^\circ = \bigcap_{f \in \mathbb{N}^{\mathbb{N}}} \bigcup_{m \in \mathbb{N}} C_{f|m}^\circ.$

Since $\complement C_{f|m}^\circ = \mathrm{st}\,\complement C_{f|m}$ is closed, the complement

$\complement A^\circ = \bigcup_{f \in \mathbb{N}^{\mathbb{N}}} \bigcap_{m \in \mathbb{N}} \complement C_{f|m}^\circ$

of $A^\circ$ can be derived from the closed sets by the Souslin operation, and hence $A^\circ$ is measurable with respect to any completed Borel measure [see Saks (1937), p. 50, for an easy proof].

It only remains to find the family $\{D_{m,n}\}$. From (23) it follows that for each $\varepsilon \in \mathbb{R}_+$ there is a function $f_\varepsilon : \mathbb{N} \to \mathbb{N}$ such that

$\mu\bigl(\bigcap_{g \le f_\varepsilon} \bigcup_{m \in \mathbb{N}} C_{g|m}^\circ - A^\circ\bigr) < \varepsilon,$

where $g \le f_\varepsilon$ means that $g(n) \le f_\varepsilon(n)$ for all $n \in \mathbb{N}$. Since our original sequences $\{B_{m,n}\}_{n \in \mathbb{N}}$ are decreasing,

$\bigcap_{g \le f_\varepsilon} \bigcup_{m \in \mathbb{N}} C_{g|m}^\circ = \bigcup_{m \in \mathbb{N}} C_{f_\varepsilon|m}^\circ.$

Putting $D_{m,n} = C_{f_{1/n}|m}$, the lemma follows.

REMARK. The argument above shows that a subset of a Hausdorff space can be derived from the closed sets by using the Souslin operation if and only if it is the standard part of a set derived from the internal sets by the same operation. This result, and the proof we have given, is due to Henson (1979).
We have now reached the last lemma we shall need before we can return to our Markov process. It will be used to pick hyperfinite representations of measures avoiding properly exceptional sets.

5.4.10. LEMMA. Let $D$ be a subset of $S_0$ of the form $D = \bigcup_{m \in \mathbb{N}} \bigcap_{n \in \mathbb{N}} D_{m,n}$ for a family $\{D_{m,n}\}$ of internal sets. If $\mu$ is a Radon probability measure on $Y$ with $\mu(D^\circ) = 0$, there exists an internal probability measure $\nu$ on $S_0$ such that $\mu = L(\nu) \circ \mathrm{st}^{-1}$ and $L(\nu)(D) = 0$.

PROOF. We may obviously assume that the family $\{D_{m,n}\}$ is increasing in $m$ and decreasing in $n$. Defining new measures $\mu_{m,n}$ by

(24) $\mu_{m,n}(B) = \mu(B - D_{m,n}^\circ),$

we have
The set $S_0 - D_{m,n}$ is S-dense in $Y - D_{m,n}^\circ$, and hence there is an internal measure $\nu_{m,n}$ concentrated on $S_0 - D_{m,n}$ such that

(26) $\mu_{m,n} = L(\nu_{m,n}) \circ \mathrm{st}^{-1}.$

We may choose these measures such that the family $\{\nu_{m,n}\}$ is decreasing in $m$ and increasing in $n$. Extending to an internal sequence $\{\nu_{m,n}\}_{m,n \in {}^*\mathbb{N}}$, we first pick a $\gamma \in {}^*\mathbb{N} - \mathbb{N}$ such that

${}^\circ\nu_{\gamma,n}(S_0) = \lim_{m \to \infty} {}^\circ\nu_{m,n}(S_0)$

for all $n \in \mathbb{N}$, and then an $\eta \in {}^*\mathbb{N} - \mathbb{N}$ such that

By using (25), (26), and the definitions of $\gamma$ and $\eta$, we see that $L(\nu_{\gamma,\eta})(D) = 0$ and $\mu = L(\nu_{\gamma,\eta}) \circ \mathrm{st}^{-1}$. The measure $\nu_{\gamma,\eta}$ has all the properties of the desired measure, except that it need not be a probability measure. However, it is clear that $\nu_{\gamma,\eta}(S_0) = 1 + \varepsilon$ for some $\varepsilon \approx 0$, and putting $\nu = (1 + \varepsilon)^{-1}\nu_{\gamma,\eta}$, the lemma is proved.
By combining 5.4.9 and 5.4.10 we get:

5.4.11. COROLLARY. Let $A \subset S_0$ be a properly exceptional set, and let $\mu$ be a completed Borel probability measure on $Y$ such that $\mu(A^\circ) = 0$. Then there is an internal probability measure $\nu$ on $S_0$ such that $\mu = L(\nu) \circ \mathrm{st}^{-1}$ and $L(\nu)(A) = 0$.
B. Strong Markov Processes and Modified Standard Parts
Having completed our study of exceptional sets, we now turn to the real subject matter of this section, an investigation of the standard parts of hyperfinite Markov processes. Let us first describe what kind of processes we would like to obtain as standard parts.

Given a topological space $Y$, we let $Y_\Delta$ be the set $Y \cup \{\Delta\}$ obtained by adding a new element $\Delta$ to $Y$, and we let $Y_\Delta$ have the σ-algebra $\mathcal{B}_\Delta$ generated by the Borel sets on $Y$ and the singleton $\{\Delta\}$. Our standard processes will be $Y_\Delta$-valued, and the new element $\Delta$ will serve as a trap.

Recall that a filtration $\{\mathcal{F}_t\}_{t \in \mathbb{R}_+}$ of σ-algebras is right-continuous if $\mathcal{F}_t = \bigcap_{s > t} \mathcal{F}_s$ for all $t \in \mathbb{R}_+$, and that a mapping $\sigma : \Omega \to [0, \infty]$ is a stopping time with respect to $\{\mathcal{F}_t\}$ if

$\{\omega \mid \sigma(\omega) \le t\} \in \mathcal{F}_t$

for all $t$. Each stopping time $\sigma$ introduces a σ-algebra $\mathcal{F}_\sigma$ by

(27) $\mathcal{F}_\sigma = \{A \in \mathcal{F}_\infty \mid \forall t\,(A \cap \{\sigma \le t\} \in \mathcal{F}_t)\},$

where $\mathcal{F}_\infty$ is the σ-algebra generated by the union of the $\mathcal{F}_t$'s.

Let $\mathcal{M}(Y)$ be the set of all Radon measures on $Y$ with finite mass. A set or a function is called universally measurable if it is measurable with respect to all $\mu \in \mathcal{M}(Y)$. If for each $y \in Y$ we are given a probability measure $P_y$ on a set $\Omega$, then for each $\mu \in \mathcal{M}(Y)$, we let $P_\mu$ be the measure defined by

(28) $P_\mu(A) = \int P_y(A)\,d\mu(y)$

[provided, of course, that this makes sense; i.e., $y \mapsto P_y(A)$ is $\mu$-measurable]. Finally, if $x$ is a $Y_\Delta$-valued process, we let the lifetime $\zeta$ of $x$ be the stopping time defined by $\zeta(\omega) = \inf\{t \in \mathbb{R}_+ \mid x(\omega, t) = \Delta\}$.

5.4.12. DEFINITION. A strong Markov process is a quadruple $(\Omega, \{\mathcal{F}_t\}_{t \in \mathbb{R}_+}, \{P_y\}_{y \in Y_\Delta}, x)$ where $\{\mathcal{F}_t\}$ is a right-continuous filtration on $\Omega$, each $P_y$ is a probability measure on $\mathcal{F}_\infty$, and $x : \Omega \times \mathbb{R}_+ \to Y_\Delta$ is a stochastic process on $(\Omega, \mathcal{F}_\infty, P_y)$ for each $y$. Moreover, the following conditions must be satisfied:

(i) For all $t \ge 0$ and all measurable $E \subset Y \cup \{\Delta\}$, the map $y \mapsto P_y\{x_t \in E\}$ is universally measurable.
(ii) For all $t > \zeta(\omega)$, we have $x(\omega, t) = \Delta$.
(iii) For each $y \in Y_\Delta$, the process $x$ is adapted to $(\Omega, \{\mathcal{F}_t\}, P_y)$ and is right-continuous with left limits at all $t < \zeta(\omega)$, $P_y$ a.e.
(iv) For all $\{\mathcal{F}_t\}$ stopping times $\sigma$, all measurable $E \subset Y_\Delta$, all $\mu \in \mathcal{M}(Y)$, and all $s \in \mathbb{R}_+$:

$P_\mu\{x_{\sigma+s} \in E \mid \mathcal{F}_\sigma\} = P_{x_\sigma}\{x_s \in E\}, \qquad P_\mu \text{ a.e.}$
In order to prove that a class of hyperfinite Markov chains have standard parts that are strong Markov processes, we shall have to overcome two main difficulties: the construction of the family of measures $\{P_y\}$, and the proof of the strong Markov property 5.4.12(iv). But first we must introduce the necessary regularity conditions on our nonstandard processes.

Recall that $\bar S_0 = S_0 \cap \mathrm{Ns}({}^*Y)$ and that $X^{(\delta)} = X \restriction T_\delta$. The lifetime $\zeta_\delta$ of $X^{(\delta)}$ is defined by

$\zeta_\delta(\omega) = \inf\{{}^\circ t \mid X^{(\delta)}(\omega, t) \notin \bar S_0\},$

and we define the right standard part ${}^\circ X^{(\delta)+}$ as follows. If $t < \zeta_\delta(\omega)$, let

${}^\circ X^{(\delta)+}(\omega, t) = S\text{-}\lim_{s \downarrow t,\ s \not\approx t} X^{(\delta)}(\omega, s)$

if this limit exists, and ${}^\circ X^{(\delta)+}(\omega, t) = \Delta$ else. If $t \ge \zeta_\delta(\omega)$, we always put ${}^\circ X^{(\delta)+}(\omega, t) = \Delta$.
5.4.13. DEFINITION. A subset $A$ of $S_0$ is called a set of irregularities of $X$ if there is a (positive) infinitesimal $\delta_0$ such that for $\delta \approx 0$, $\delta \ge \delta_0$:

(i) For all $s_i \in \bar S_0 - A$ and $L(P_i)$ a.a. $\omega$, the path $X^{(\delta)}(\omega, \cdot)$ has S-right and S-left limits at all $t < \zeta_\delta(\omega)$.
(ii) For all $s_i \in \bar S_0 - A$, the set

$\{\omega \mid \exists t \in T_\delta^{\mathrm{fin}}\,({}^\circ t > \zeta_\delta(\omega) \wedge X(\omega, t) \in \bar S_0)\}$

has $L(P_i)$ measure zero.
(iii) For all infinitely close $s_i, s_j \in \bar S_0 - A$,

$L(P_i)\{{}^\circ X^{(\delta)+}(\omega, t) \in B\} = L(P_j)\{{}^\circ X^{(\delta)+}(\omega, t) \in B\}$

for all finite $t \in T_\delta$ and all Borel sets $B$.

$X$ has exceptional irregularities if it has an exceptional set of irregularities.

The first condition above guarantees that $X$ has a reasonable standard part; the second says that "infinity" is a trap; the third is a version of (1). Putting $\mathrm{st}(s_i) = \Delta$ when $s_i \in S - \bar S_0$, we have the following definition of a modified standard part.

5.4.14. DEFINITION. Assume that $X$ has exceptional irregularities, and let $A$ be a properly $\delta_0$-exceptional set of irregularities (where $\delta_0$ is as in 5.4.13). Let $x : \Omega \times \mathbb{R}_+ \to Y_\Delta$ be defined by:

(i) if $X(\omega, 0) \notin A$, then $x(\omega) = {}^\circ X^{(\delta_0)+}(\omega)$;
(ii) if $X(\omega, 0) \in A$, then $x(\omega, t) = \mathrm{st}(X(\omega, 0))$ for all $t \in \mathbb{R}_+$.

Then $x$ is called a modified standard part of $X$.
Our aim is to prove that if $X$ has exceptional irregularities, then with the appropriate definition of the family $\{P_y\}$ of measures, all modified standard parts of $X$ are strong Markov processes. The first step toward the definition of $\{P_y\}$ is the following lemma, a "smeared out" version of condition 5.4.13(iii). If $\nu$ is an internal probability measure on $S$, let $P_\nu$ be the measure on $\Omega$ defined [in analogy with (28)] by

(28') $P_\nu(C) = \int P_i(C)\,d\nu(s_i).$
5.4.15. LEMMA. Recall that the space $Y$ is assumed to be almost σ-compact (Definition 5.4.1). Let $A$ be a properly exceptional set of irregularities of $X$, and let $x$ be the corresponding modified standard part of $X$. Let $\nu_1$, $\nu_2$ be two internal probability measures on $S_0$ such that $L(\nu_1)(A) = L(\nu_2)(A) = 0$ and $L(\nu_1) \circ \mathrm{st}^{-1} = L(\nu_2) \circ \mathrm{st}^{-1}$. Then for all $t \in \mathbb{R}_+$ and all Borel sets $B$

$L(P_{\nu_1})\{x(\omega, t) \in B\} = L(P_{\nu_2})\{x(\omega, t) \in B\}.$

PROOF. Let $\mu = L(\nu_1) \circ \mathrm{st}^{-1} = L(\nu_2) \circ \mathrm{st}^{-1}$. Choose $\tilde t \approx t$ so large that

${}^\circ X(\omega, \tilde t) = x(\omega, t) \qquad L(P_{\nu_1}) \text{ and } L(P_{\nu_2}) \text{ a.e.},$

and pick an internal set $\tilde B$ such that

$L(P_{\nu_i})(\{X(\omega, \tilde t) \in \tilde B\} \,\Delta\, \{{}^\circ X(\omega, \tilde t) \in B\}) = 0$

for $i = 1, 2$. Define a function $f : Y \to \mathbb{R}$ as follows: if $y \notin A^\circ$, let $f(y) = L(P_i)\{x(\omega, t) \in B\}$ for some (i.e., all) $s_i \in \mathrm{st}^{-1}(y) \cap \bar S_0 - A$; if $y \in A^\circ$, define $f(y)$ arbitrarily. The function $s_i \mapsto P_i\{X(\omega, \tilde t) \in \tilde B\}$ is a lifting of $f$ with respect to both $\nu_1$ and $\nu_2$. Hence

$L(P_{\nu_1})\{x(\omega, t) \in B\} = {}^\circ P_{\nu_1}\{X(\omega, \tilde t) \in \tilde B\}$
$= {}^\circ\!\int P_i\{X(\omega, \tilde t) \in \tilde B\}\,d\nu_1(s_i)$
$= \int f(y)\,d\mu(y)$
$= {}^\circ\!\int P_i\{X(\omega, \tilde t) \in \tilde B\}\,d\nu_2(s_i)$
$= {}^\circ P_{\nu_2}\{X(\omega, \tilde t) \in \tilde B\}$
$= L(P_{\nu_2})\{x(\omega, t) \in B\},$

and the lemma is proved.

5.4.16. LEMMA. Assume that $X$ has a properly exceptional set $A$ of irregularities, and let $x$ be the corresponding modified standard part. For all infinitely close $s_i, s_j \in \bar S_0 - A$; all finite sequences $t_1 < t_2 < t_3 < \cdots < t_n$ from $\mathbb{R}_+$; and all Borel sets $B_1, B_2, \ldots, B_n$ we have

$L(P_i)\{x(\omega, t_1) \in B_1 \wedge x(\omega, t_2) \in B_2 \wedge \cdots \wedge x(\omega, t_n) \in B_n\}$
$= L(P_j)\{x(\omega, t_1) \in B_1 \wedge x(\omega, t_2) \in B_2 \wedge \cdots \wedge x(\omega, t_n) \in B_n\}.$
PROOF. We shall prove this by induction on the length $n$ of the sequences $t_1 < t_2 < \cdots < t_n$, $B_1, B_2, \ldots, B_n$. The case $n = 1$ is part of Definition 5.4.13. Assume that the lemma holds for all sequences of length $n - 1$, and pick $\tilde t_1, \tilde t_2, \ldots, \tilde t_n$ such that $\tilde t_1 \approx t_1, \ldots, \tilde t_n \approx t_n$ and

${}^\circ X(\omega, \tilde t_l) = x(\omega, t_l) \qquad L(P_k) \text{ a.e.}$

for $l = 1, 2, \ldots, n$ and $k = i, j$. Choose internal sets $\tilde B_1, \tilde B_2, \ldots, \tilde B_n$ such that

$L(P_k)(\{X(\omega, \tilde t_l) \in \tilde B_l\} \,\Delta\, \{{}^\circ X(\omega, \tilde t_l) \in B_l\}) = 0$

for $l = 1, 2, \ldots, n$ and $k = i, j$. We define two measures $\nu_i$, $\nu_j$ on $S$ by putting

$\nu_k(s) = P_k\{\omega \mid X(\omega, \tilde t_{n-1}) = s \text{ and for all } l \le n - 1,\ X(\omega, \tilde t_l) \in \tilde B_l\}$

for all $s \in S$ and $k = i, j$. By the induction hypothesis

$L(\nu_i) \circ \mathrm{st}^{-1} = L(\nu_j) \circ \mathrm{st}^{-1},$

and since $A$ is properly exceptional, $L(\nu_i)(A) = L(\nu_j)(A) = 0$. Applying Lemma 5.4.15 with $t = t_n - t_{n-1}$ we get

$L(P_{\nu_i})\{x(t_n - t_{n-1}) \in B_n\} = L(P_{\nu_j})\{x(t_n - t_{n-1}) \in B_n\}.$
Since $X$ is Markov and time invariant, the lemma follows.

We now have what we need in order to define the missing ingredients $\{\mathcal{F}_t\}$ and $\{P_y\}$ of our Markov process. For each $t \in \mathbb{R}_+$, let $\mathcal{F}_t^0$ be the σ-algebra generated by the sets

$\{\omega \mid x(\omega, t_1) \in B_1 \wedge \cdots \wedge x(\omega, t_n) \in B_n\},$

where $0 \le t_1 < t_2 < \cdots < t_n \le t$ and $B_1, \ldots, B_n$ are Borel sets in $Y$. Define

$\mathcal{F}_t = \bigcap_{s > t} \mathcal{F}_s^0;$

the filtration $\{\mathcal{F}_t\}_{t \in \mathbb{R}_+}$ is obviously right-continuous. We define $\mathcal{F}_\infty$ to be the σ-algebra generated by all $\mathcal{F}_t$, $t \in \mathbb{R}_+$. It follows from Lemma 5.4.16 that for all $C \in \mathcal{F}_\infty$ and all infinitely close $s_i$, $s_j$ in $\bar S_0 - A$

(29) $L(P_i)(C) = L(P_j)(C).$

This observation makes possible the following definition of a family $\{P_y\}_{y \in Y_\Delta}$ of measures on $\mathcal{F}_\infty$. If $y \notin A^\circ$, let

(30) $P_y(C) = L(P_i)(C)$ for all $s_i \in \mathrm{st}^{-1}(y) - A$

for all $C \in \mathcal{F}_\infty$. If $y \in A^\circ \cup \{\Delta\}$, let

(31) $P_y(C) = 1$ if $C$ contains all constant paths $x(t) = y$, and $P_y(C) = 0$ else.

Observe that since a set $C \in \mathcal{F}_\infty$ contains either all or none of the constant paths $x(\omega, t) = y$, the set function $P_y$ is a measure. We have reached our goal:

5.4.17. THEOREM. Assume that $S_0$ is a hyperfinite subset of ${}^*Y$ for some almost σ-compact space $Y$, and let $X : \Omega \times T \to S$ be a hyperfinite Markov process with exceptional irregularities. If $x$ is a modified standard part of $X$, then $(\Omega, \{\mathcal{F}_t\}_{t \in \mathbb{R}_+}, \{P_y\}_{y \in Y_\Delta}, x)$ is a strong Markov process.

PROOF. Since we already know that $\{\mathcal{F}_t\}$ is right-continuous, all we have to do is to check conditions 5.4.12(i)-(iv). The second and the third of these conditions are immediate from the construction, and we can concentrate on (i) and (iv).
Given a Radon probability measure $\mu$ on $Y$, we define two new measures $\mu_0$ and $\mu_1$ on $Y$ by

(32) $\mu_0(B) = \mu(B - A^\circ),$

(33) $\mu_1(B) = \mu(B \cap A^\circ),$

where $A$ is the properly exceptional set of irregularities used in the construction of $x$. By Corollary 5.4.11, we can find an internal measure $\nu$ on $S_0$ such that

(34) $\mu_0 = L(\nu) \circ \mathrm{st}^{-1}$

and

(35) $L(\nu)(A) = 0.$
We first prove 5.4.12(i):

(i) Given $\alpha \in [0, 1]$, a Borel set $E$, and $t \in \mathbb{R}_+$, we must show that

$\{y \in Y \mid P_y\{x(t) \in E\} > \alpha\}$

is $\mu$-measurable for all finite Radon measures $\mu$. Since

$\{y \in A^\circ \mid P_y\{x(t) \in E\} > \alpha\} = A^\circ \cap E$

is universally measurable by 5.4.9, it suffices to prove that

(36) $\{y \notin A^\circ \mid P_y\{x(t) \in E\} > \alpha\}$

is $\mu$-measurable. In fact, by the definition of $\mu_0$ we only have to prove that (36) is $\mu_0$-measurable. Since $X$ has S-right limits a.e. with respect to the probability measure $P_\nu$ constructed from the $\nu$ in (34) by using (28'), we can find a $\tilde t \approx t$ such that

$L(P_\nu)(\{x(t) \in E\} \,\Delta\, \{{}^\circ X(\tilde t) \in E\}) = 0.$

There must be an internal set $\tilde E$ such that

$L(P_\nu)\{X(\tilde t) \in (\mathrm{st}^{-1}(E) \,\Delta\, \tilde E)\} = 0,$

and hence

$L(P_i)\{x(t) \in E\} = {}^\circ P_i\{X(\tilde t) \in \tilde E\}$

for $L(\nu)$ a.a. $s_i$. Combining this with the definition of $P_y$, we see that $y \mapsto P_y\{x(t) \in E\}$ has $s_i \mapsto P_i\{X(\tilde t) \in \tilde E\}$ as a $\nu$-lifting, and hence it is a $\mu_0$-measurable function. This proves that the set in (36) is $\mu_0$-measurable, and 5.4.12(i) follows.
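(iv) We must show that for all $\{\mathcal{F}_t\}$-stopping times $\sigma$, all sets $B \in \mathcal{F}_\sigma$, and all $s \in \mathbb{R}_+$, the equation

(37) $P_\mu\{\omega \in B \mid x_{\sigma+s} \in E\} = \int_B P_{x_\sigma}\{x_s \in E\}\,dP_\mu$

holds for all Radon probability measures $\mu$ on $Y$ and all Borel sets $E$. First notice that since the paths of $x$ are constant $P_{\mu_1}$ a.e., we have

$P_{\mu_1}\{\omega \in B \mid x_{\sigma+s} \in E\} = \int_B P_{x_\sigma}\{x_s \in E\}\,dP_{\mu_1},$

and it suffices to prove

(38) $P_{\mu_0}\{\omega \in B \mid x_{\sigma+s} \in E\} = \int_B P_{x_\sigma}\{x_s \in E\}\,dP_{\mu_0}.$

If $\nu$ is the nonstandard representation of $\mu_0$ given in (34)-(35), the hyperfinite counterpart of (38) is

(39) $P_\nu\{\omega \in C \mid X_{\tau+s} \in F\} = \int_C P_{X_\tau}\{X_s \in F\}\,dP_\nu,$

where $C$ and $F$ are internal sets and $C$ is measurable in the *-algebra generated by the internal stopping time $\tau$. Since $X$ is a time-invariant Markov process, (39) holds. Our plan is to reduce (38) to a version of (39).

We first pick an internal stopping time $\tau$ such that ${}^\circ\tau = \sigma$ $L(P_\nu)$ a.e., and such that there is a $\tau$-measurable set $C$ satisfying

(40) $L(P_\nu)(B \,\Delta\, C) = 0.$

If $P'$ is the internal measure on $\Omega$ given by

$P'(D) = \int P_{X_\tau}(D)\,dP_\nu,$

we observe that since $A$ is properly exceptional,

(41) $L(P')\{X_t \in A\} = 0$

for all $t \in T_{\delta_0}^{\mathrm{fin}}$ (where $\delta_0$ is the infinitesimal used in the construction of $x$; see Definition 5.4.14). Hence we can choose an $s' \in T_{\delta_0}$, $s' \approx s$, such that

(42) ${}^\circ X_{s'} = x_s \qquad L(P')$ a.e.,

(43) ${}^\circ X_{\tau+s'} = x_{\sigma+s} \qquad L(P_\nu)$ a.e.

Finally, we pick an internal set $F$ such that

(44) $L(P_\nu)\{X_{\tau+s'} \in \mathrm{st}^{-1}(E) \,\Delta\, F\} = 0,$

(45) $L(P')\{X_{s'} \in \mathrm{st}^{-1}(E) \,\Delta\, F\} = 0.$

We now have

$P_{\mu_0}\{\omega \in B \mid x_{\sigma+s} \in E\} = L(P_\nu)\{\omega \in C \mid X_{\tau+s'} \in F\}$
$= \int_C L(P_{X_\tau})\{X_{s'} \in F\}\,dL(P_\nu)$
$= \int_B L(P_{X_\tau})\{x_s \in E\}\,dL(P_\nu)$ (by (40), (42), (45))
$= \int_B P_{x_\sigma}\{x_s \in E\}\,dP_{\mu_0}$ (by def. of $\nu$, $\tau$, and $\{P_y\}$).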
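This proves (38) and the theorem.

We illustrate the use of 5.4.17 by the following, rather trivial, example. More worthwhile applications will be given in Section 5.6.

5.4.18. EXAMPLE. Let $\eta \in {}^*\mathbb{N} - \mathbb{N}$, and set

$S_0 = \{k/\sqrt{\eta} \mid k \in {}^*\mathbb{Z},\ |k| \le \eta\}.$

Define the transition probabilities $q_{s,s'}$ by

$q_{s,s'} = 1/2$ if $|s - s'| = 1/\sqrt{\eta}$, and $q_{s,s'} = 0$ otherwise,

if $s, s' \in S_0$, and if $s_0$ is the trap, let

$q_{s,s_0} = 1/2$ if $s = \pm\sqrt{\eta}$, and $q_{s,s_0} = 0$ otherwise.

Let $m$ be the internal measure on $S_0$ given by $m(s) = 1/\sqrt{\eta}$ for all $s \in S_0$, and let the time line be

$T = \{0, 1/\eta, 2/\eta, \ldots\}.$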
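The chain just described can be imitated at a finite scale by replacing the infinite $\eta$ with an ordinary positive integer. The sketch below is only such an imitation, under that assumption: the function name `anderson_walk` is hypothetical, positions are multiples of $1/\sqrt{\eta}$ in $[-\sqrt{\eta}, \sqrt{\eta}]$, each list entry corresponds to one time step of length $1/\eta$, and the trap is represented by `None`.

```python
import math
import random


def anderson_walk(eta, n_steps, rng, start=0):
    """Simulate n_steps of the walk on {k / sqrt(eta) : |k| <= eta}.

    eta   : finite stand-in for the infinite integer of the example
    rng   : a random.Random instance (passed in for reproducibility)
    start : integer index k of the starting point k / sqrt(eta)
    Returns the list of positions; None marks the trap.
    """
    dx = 1.0 / math.sqrt(eta)
    k = start
    path = [k * dx]
    for _ in range(n_steps):
        if k is not None:
            step = rng.choice((-1, 1))  # probability 1/2 each way
            if abs(k + step) > eta:
                k = None  # stepping outward from an endpoint leads to the trap
            else:
                k += step
        path.append(None if k is None else k * dx)
    return path
```

With `eta` large and `n_steps = eta`, the last entry is the position at time 1, and its distribution is close to a standard normal; this is the finite shadow of the statement that the standard part of the walk is Brownian motion.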
The process $X$ is Anderson's random walk with a uniform initial distribution corresponding to the Lebesgue measure. To prove that the standard part of $X$ is a strong Markov process, we show that the empty set is a set of irregularities of $X$. Since $X$ is S-continuous $L(P_i)$ a.e. for all nearstandard $s_i \in S_0$, the two first conditions in 5.4.13 are obviously satisfied. If $s_i \approx s_j$, the paths starting at $s_i$ look exactly like the paths starting at $s_j$ except for an infinitesimal translation. Hence they induce the same standard paths, and 5.4.13(iii) follows.

We have focused on the standard notion of a strong Markov process, but there is nothing canonical about this choice; in fact, from the point of view of the standard theory of Markov processes as presented, e.g., in Blumenthal and Getoor (1968) and Fukushima (1980), it would have been more natural to introduce conditions on $X$ guaranteeing that the modified standard part $x$ is a Hunt process. However, to be a Hunt process a strong Markov process has to satisfy several rather technical requirements, and since we feel that this section already has its fair share of technicalities, we leave these further developments to the reader.

The material in this section is new, but the hyperfinite theory for exceptional sets developed here follows the standard theory (Blumenthal and Getoor, 1968; Fukushima, 1980) closely. Keisler (1984) approached the nonstandard theory of Markov processes from a slightly different point of view, characterizing different kinds of standard Markov processes in terms of what kinds of liftings they allow, and giving applications to solutions of stochastic differential equations. Keisler's processes can be time dependent, but the conditions corresponding to (5.4.1) are on the sample space $\Omega$ and not on the state space $S_0$.

5.5. REGULAR FORMS AND MARKOV PROCESSES

Combining the results of the two previous sections, we shall now obtain conditions on hyperfinite Dirichlet forms which guarantee that all modified standard parts of the associated Markov chains are strong Markov processes. The method we shall apply is simple; we just use the relationship between forms and processes established in 5.3.5-5.3.9 to translate the conditions of Theorem 5.4.17 into the language of Dirichlet forms.

We need slightly stronger assumptions in this section than in the last. First we reintroduce the symmetry condition $m_i q_{ij} = m_j q_{ji}$ ($i, j \neq 0$) of formula 5.3.8, which in Section 5.4 was temporarily replaced by the weaker assumption in formula 5.4.2. We are thus back in the setting of Section 5.3.

Next we need stricter topological requirements. Recall that a Hausdorff space is regular (or $T_3$ if you like) if for all closed sets $F$ and all $x \notin F$,
there are disjoint open sets $O_1$, $O_2$ such that $x \in O_1$, $F \subset O_2$. A space is second countable when there is a countable base for the topology. In this section we shall assume that the state space $Y$ is second countable, regular, and almost σ-compact. These conditions are satisfied by the spaces we are most interested in: locally compact spaces with countable bases, and complete, separable metric spaces. As in Section 5.4 we assume that the hyperfinite state space $S_0$ is a subset of ${}^*Y$.

A. Separation of Compacts
One by one we shall reformulate the conditions of 5.4.13 in terms of Dirichlet forms. The following assumption will take care of 5.4.13(i).

5.5.1. DEFINITION. A Dirichlet form $\mathcal{E}$ separates compacts if there is a countable family $\pi$ of open sets such that for all disjoint compacts $K_1$, $K_2$, there are sets $O_1, O_2 \in \pi$ and an internal function $u : S_0 \to {}^*\mathbb{R}$ satisfying $K_1 \subset O_1$, $K_2 \subset O_2$, $u \restriction S_0 \cap {}^*O_1 = 1$, $u \restriction S_0 \cap {}^*O_2 = 0$, and

${}^\circ\mathcal{E}_1(u, u) < \infty.$

Note that since there are only countably many pairs $O_1$, $O_2$, we can choose all the functions above from a countable family. In fact, if $e_{O_1}^{O_2}$ is the function which has the smallest $\mathcal{E}_1$-value among those which are one on ${}^*O_1$ and zero on ${}^*O_2$, we can always choose $u$ from the family

$\{e_{O_1}^{O_2} \mid O_1, O_2 \in \pi \text{ are disjoint}\}.$

Such a collection of $u$'s is called a separating family for $\mathcal{E}$ and $\pi$. Note that by Proposition 5.3.5, each $e_{O_1}^{O_2}$ will only take values between zero and one.

Let $\zeta = \zeta_{\Delta t}$ be the lifetime of $X$, i.e.,

$\zeta(\omega) = \inf\{{}^\circ t \mid X(\omega, t) \notin \bar S_0\},$

where
5.5.2. LEMMA. Assume that 8 separates compacts and that 9is a separating family. If the path X ( w , . ) fails to have an S-left or S-right limit at some t < { ( w ) , then so does u(X(w, - ) ) for some u E 9. PROOF. Fix an w E R and a t E R,, t < l ( w ) . Given a sequence {fn}nEN from T such that the standard part increases strictly to t, we shall first show that the sequence Or,
{ " X ( w ,f " ) } " E N
has a cluster point. Assume not; then for each y integer n,, E N such that
E
Y there is a neighborhood 0, and an
OX(W,t n )
Oy
5.5. REGULAR FORMS AND MARKOV PROCESSES
283
when n 2 n,. Since Y is regular, we can find a neighborhood G, of y such that the closure G, is contained in O,, and h'ence X(w, t,) k-z *G, when n 2 n,. Extend {fn}ntN to an internal sequence (fn}nt*N of elements of T less than t, and consider the set A , = ( n ~ * N ( ) nn,orX(w,t,)k-z*G,}. s Since A, is internal and contains N, there is an r], E *N - N such that all r] 5 r], are elements of A,. By saturation there is an infinite 7 less than all 7., But then X(w, 4)) 6z *Gy for all y. This implies that X ( w , f,,) is not nearstandard, contradicting our assumption that t < { ( w ) . Let x be a cluster point of {"X(w, t)}. If x is not the S-left limit of X ( w , at t, there must be another sequence { s , , } , , ~ ~increasing to t such that x is not a cluster point for { " X ( w ,s,)}. Repeating the argument above, we see that { " X ( w ,s,)} must have a cluster point y. Let u E 9 be one on a neighborhood of x and zero on a neighborhood of y. Obviously, u(X(w, . )) does not have an S-left limit at t. This proves the S-left limit case of the lemma; the S-right limit case is similar. a )
We shall now use 5.5.2 and Fukushima's decomposition theorem 5.3.9 to show that if $\mathcal{E}$ separates compacts, then the associated Markov process $X$ satisfies 5.4.13(i). Recall that if $\delta \in T$, the subline $T_\delta$ is defined by

$T_\delta = \{0, \delta, 2\delta, \ldots\};$

$X^{(\delta)}$ is the restriction $X \restriction T_\delta$; and $\zeta_\delta$ is the lifetime of $X^{(\delta)}$,

$\zeta_\delta(\omega) = \inf\{{}^\circ t \mid X^{(\delta)}(\omega, t) \notin \bar S_0\}.$
5.5.3. PROPOSITION. Let $\mathcal{E}$ be a normal Dirichlet form which separates compacts. There exist a $\delta_0 \approx 0$ and an exceptional set $A_0$ such that for all $s_i \in \bar S_0 - A_0$ and all infinitesimal $\delta \ge \delta_0$, the restricted process $X^{(\delta)}$ has S-left and S-right limits at all $t < \zeta_\delta$, $L(P_i)$-a.e.

PROOF. To find $\delta_0$, we fix a separating family $\mathcal{F}$ for $\mathcal{E}$. Since for any $u \in \mathcal{F}$,

${}^\circ\mathcal{E}(u, u) < \infty,$

we can find a $\delta_u \approx 0$ such that $u \in \mathcal{D}[\mathcal{E}^{(\delta)}]$ for all infinitesimal $\delta \ge \delta_u$. By saturation there is a $\delta_0 \approx 0$ larger than all $\delta_u$.

Turning to $A_0$, we first observe that by Lemma 5.5.2 and the countability of $\mathcal{F}$, it suffices to show that for each $u \in \mathcal{F}$, there is an exceptional set $A$ such that for all $s_i \in \bar S_0 - A$, the process $u(X^{(\delta_0)})$ has S-right and S-left limits $L(P_i)$ a.e. We can then take $A_0$ to be the union of these $A$'s.

Since $u(X^{(\delta_0)})$ takes values between zero and one, the only way it can fail to have one-sided S-limits is by oscillating too wildly. We shall use stopping times to keep track of the oscillations. Given two rationals $p$, $q$, $0 \le p < q \le 1$, we define a sequence $\{\tau^n_{(p,q)}\}$ of stopping times as follows:

$\tau^0_{(p,q)} = \min\{t \in T_{\delta_0} \mid u(X^{(\delta_0)}(\omega, t)) \le p\},$
$\tau^{2n+1}_{(p,q)} = \min\{t \in T_{\delta_0} \mid t > \tau^{2n}_{(p,q)}(\omega) \wedge u(X^{(\delta_0)}(\omega, t)) \ge q\},$
$\tau^{2n}_{(p,q)} = \min\{t \in T_{\delta_0} \mid t > \tau^{2n-1}_{(p,q)}(\omega) \wedge u(X^{(\delta_0)}(\omega, t)) \le p\}.$

Let $A \subset S_0$ be defined by

$A = \bigcup_{(p,q)} \bigcup_{m \in \mathbb{N}} \bigcup_{k \in \mathbb{N}} \bigcap_{n \in \mathbb{N}} \{i \mid P_i\{\tau^n_{(p,q)} \le m\} \ge 1/k\}.$

If $X^{(\delta_0)}$ fails to have S-right or S-left limits with positive $L(P_i)$ probability, then $s_i \in A$. We must show that $A$ is exceptional. By Lemma 5.4.5,

$\{\omega \mid \exists t \in T_{\delta_0}^1\,(X(\omega, t) \in A)\} = \bigcup_{(p,q)} \bigcup_{m \in \mathbb{N}} \bigcup_{k \in \mathbb{N}} \bigcap_{n \in \mathbb{N}} \{\omega \mid \exists t \in T_{\delta_0}^1\,(P_{X^{(\delta_0)}(\omega,t)}\{\tau^n_{(p,q)} \le m\} \ge 1/k)\}.$

If $A$ is not $\delta_0$-exceptional, there must be a pair $(p, q)$ of rationals, integers $m, k \in \mathbb{N}$, and an infinite number $\eta \in {}^*\mathbb{N}$ such that

$L(P)\{\omega \mid \exists t \in T_{\delta_0}^1\,(P_{X^{(\delta_0)}(\omega,t)}\{\tau^\eta_{(p,q)} \le m\} \ge 1/k)\} > 0.$

This implies that with positive $L(P)$ probability, $u(X^{(\delta_0)})$ jumps back and forth between $p$ and $q$ more than $\eta$ times before time $m + 1$. Since $u \in \mathcal{D}[\mathcal{E}^{(\delta_0)}]$, Fukushima's decomposition theorem 5.3.9 tells us that

$u(X^{(\delta_0)}) = N + M,$

where $N$ is S-continuous $L(P)$ a.e., and $M$ is a $\lambda^2$-martingale. If $u(X^{(\delta_0)})$ jumps $\eta$ times between $p$ and $q$ before $t = m + 1$, there must be an infinitesimal interval where it jumps back and forth infinitely many times. Since $N$ is S-continuous, and hence almost constant on infinitesimal intervals, most of this jumping is done by $M$. Hence the quadratic variation of $M$ is infinite on a set of positive measure, contradicting the fact that it is a $\lambda^2$-martingale.

We can conclude that $A$ must be $\delta_0$-exceptional, and the proposition is thus proved.
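The stopping times $\tau^n_{(p,q)}$ used in the proof simply record the successive times at which the process $u(X)$ has swung down to level $p$ and back up to level $q$. The following sketch computes them for an ordinary finite sequence of values; the function name `oscillation_times` is hypothetical, and list indices play the role of the time line $T_{\delta_0}$.

```python
def oscillation_times(values, p, q):
    """Return [tau^0, tau^1, tau^2, ...] for a finite path.

    tau^0 is the first index with value <= p; odd-numbered times are the
    first later index with value >= q; even-numbered times the first later
    index with value <= p again. Requires p < q.
    """
    times = []
    want_low = True  # True while waiting for a value <= p
    for t, v in enumerate(values):
        if want_low and v <= p:
            times.append(t)
            want_low = False
        elif not want_low and v >= q:
            times.append(t)
            want_low = True
    return times


example_path = [0.5, 0.1, 0.3, 0.9, 0.95, 0.0, 0.5, 1.0]
example_times = oscillation_times(example_path, 0.2, 0.8)
```

The length of the returned list counts how often the path has crossed between the two levels; the proof shows that, outside an exceptional set of starting points, this count cannot reach an infinite value in finite time.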
B. Nearstandardly Concentrated Forms

Let $A$ and $B$ be two disjoint, internal subsets of $S_0$, and let $f : A \cup B \to {}^*\mathbb{R}$ be the function which is constant one on $A$ and constant zero on $B$. We shall write

(1) $e_A^B := e_1(f)$

for the equilibrium potential defined in (5.3.32). The notation $e_A$ for $e_A^\emptyset$ has already been introduced in Section 5.3. Recall that $|\cdot|_1$ denotes the norm generated by $\mathcal{E}_1$, i.e.,

$|u|_1 = \mathcal{E}_1(u, u)^{1/2}.$
The following condition on $\mathcal{E}$ implies 5.4.13(ii).

5.5.4. DEFINITION. A Dirichlet form $\mathcal{E}$ is nearstandardly concentrated if there exist a countable family $\mathcal{C}$ and an increasing sequence $\{B_n\}_{n \in \mathbb{N}}$, both consisting of internal subsets of $S_0$, such that:

(i) The sets $\bigcup_{n \in \mathbb{N}} B_n - \bar S_0$ and $\bar S_0 - \bigcup\{C \mid C \in \mathcal{C}\}$ are both exceptional.
(ii) For all $C \in \mathcal{C}$, $\lim_{n \to \infty} \lim_{m \to \infty} {}^\circ|e_{C \cap B_n}^{\complement B_m} - e_C|_1 = 0$.

REMARK. The first condition of 5.5.4 more or less asserts that

$$\bigcup_{n \in \mathbb{N}} B_n = \bar S_0 = \bigcup\{C \mid C \in \mathcal{C}\},$$

but for technical reasons it is more convenient to use the slightly weaker version we have given. Condition 5.5.4(ii) is often difficult to verify, and later in this section we shall introduce modifications of it that are easier to handle.

5.5.5. LEMMA. Let $X$ be the Markov process generated by a nearstandardly concentrated Dirichlet form $\mathcal{E}$. If $B = \bigcup_{n \in \mathbb{N}} B_n$, the set
$$D = \{ s_i \in \complement B \mid \overline{L(P_i)}\{\exists t \in T^{\mathrm{fin}}\,(X(t) \in \bar S_0)\} > 0 \}$$

is exceptional.

PROOF We have used the outer measure $\overline{L(P_i)}$ in the definition of $D$ since there is no obvious reason why

$$\{\exists t \in T^{\mathrm{fin}}\,(X(t) \in \bar S_0)\}$$

should be $L(P_i)$-measurable. However, since $\bar S_0 - \bigcup\{C \mid C \in \mathcal{C}\}$ is exceptional, it suffices to show that

$$D_C = \{ s_i \in \complement B \mid \overline{L(P_i)}\{\exists t \in T^{\mathrm{fin}}\,(X(t) \in C)\} > 0 \}$$

is exceptional for each $C \in \mathcal{C}$. Pick two increasing sequences $\{n_k\}_{k \in \mathbb{N}}$, $\{m_k\}_{k \in \mathbb{N}}$ such that
and define
=
For each K
E
2( 1 -I-At)''A' 8 k
N, we have Dc c
2
u:=,
?I
*
8k.
DF. By ( 2 ) this implies that for all
$K \in \mathbb{N}$,

$$P\{\omega \mid \exists t \le 1\,(X(\omega,t) \in D_C)\} \le 2e \sum_{k=K}^{\infty} \delta_k \le e\,2^{-K+2},$$

since $\delta_k < 2^{-k}$, and hence $D_C$ is exceptional.
5.5.6. PROPOSITION. If $\mathcal{E}$ is nearstandardly concentrated, there is an exceptional set $A_1$ such that for all $s_i \in \bar S_0 - A_1$,

$$L(P_i)\{\omega \mid \exists t \in T^{\mathrm{fin}}\,({}^\circ t > \zeta(\omega) \wedge X(\omega,t) \in \bar S_0)\} = 0.$$

PROOF Let $\{B_n\}_{n \in \mathbb{N}}$ be as in 5.5.4, and set $B = \bigcup_{n \in \mathbb{N}} B_n$. We let $A_1$ be any properly exceptional set containing $B - \bar S_0$ and the set $D$ in 5.5.5. Choose an internal, increasing sequence $\{B_n\}_{n \in {}^*\mathbb{N}}$ extending $\{B_n\}_{n \in \mathbb{N}}$, and define

$$\sigma_n(\omega) = \min\{t \in T \mid X(\omega,t) \notin B_n\}.$$

If

$$\sigma(\omega) = \inf\{{}^\circ t \mid X(\omega,t) \notin B\},$$

it is not hard to check that $\sigma(\omega) = \sup\{{}^\circ\sigma_n(\omega) \mid n \in \mathbb{N}\}$.
Given an $s_i \in \bar S_0 - A_1$, we can find an $\eta \in {}^*\mathbb{N} - \mathbb{N}$ such that

(3) $\sigma(\omega) = {}^\circ\sigma_\eta(\omega)$, $L(P_i)$-a.e.

It follows from Lemma 5.5.5 that $\bar S_0 - B \subset D \subset A_1$, and by definition of $A_1$, we have $B - \bar S_0 \subset A_1$. Since $A_1$ is properly exceptional and $s_i \notin A_1$, this implies that

(4) $\sigma(\omega) = \zeta(\omega)$, $L(P_i)$-a.e.

Combining (3) and (4), we get

(5) ${}^\circ\sigma_\eta(\omega) = \zeta(\omega)$, $L(P_i)$-a.e.

Since $s_i \notin A_1$, and $A_1$ contains $D$ and is properly exceptional, $L(P_i)\{X(\omega, \sigma_\eta(\omega)) \in D\} = 0$, and hence by definition of $D$

$$L(P_i)\{\exists t \in T^{\mathrm{fin}}\,(t \ge \sigma_\eta(\omega) \wedge X(\omega,t) \in \bar S_0)\} = 0.$$

The proposition follows from (5).

C. Quasi-Continuous Extensions
We shall begin our study of condition 5.4.13(iii) by introducing the notion of quasi-continuity. An internal function $f: D_f \to {}^*\mathbb{R}$ defined on a subset of $S_0$ is called quasi-continuous if there is an exceptional set $A$ such that for all infinitely close $s_i, s_j \in \bar S_0 \cap D_f \cap \complement A$, we have $f(s_i) \approx f(s_j)$.

REMARK. Recall that for an internal function $f: S_0 \to {}^*\mathbb{R}$ the following two definitions of S-continuity at a point $x \in Y$ agree:

(i) For all $y, z \in S_0$, $y \approx x$, $z \approx x$, we have $f(y) \approx f(z)$.
(ii) For all $\varepsilon \in \mathbb{R}_+$, there is a neighborhood $O$ of $x$ such that $|f(y) - f(z)| < \varepsilon$ for all $y, z \in {}^*O$.

When as above we restrict the domain of continuity to an external set $\complement A$, this equivalence breaks down. Clearly (ii) still implies (i), but it is not hard to find examples which show that the converse is false.

If $A$ is an internal subset of $S_0$ and $\delta \in T$, let

$$\sigma_A^{(\delta)}(\omega) = \min\{t \in T_\delta \mid X^{(\delta)}(\omega,t) \in A\}.$$

We shall write $e_\alpha^{(\delta)}(f)$ for the equilibrium potential of $f$ with respect to the form $\mathcal{E}_\alpha^{(\delta)}$, i.e.,

$$e_\alpha^{(\delta)}(f)(s_i) = E_i\big((1 + \alpha\delta)^{-\sigma_{D_f}^{(\delta)}/\delta}\, f(X^{(\delta)}(\sigma_{D_f}^{(\delta)}))\big).$$
Recall that we abbreviate $e_\alpha^{(\delta)}(f)$ by $e_\alpha(f)$ when $\delta = \Delta t$. An internal function $f: D_f \to {}^*\mathbb{R}$ has finite energy if

(6) ${}^\circ\mathcal{E}_1(e_1(f), e_1(f)) < \infty.$

Note that if (6) is satisfied,

$${}^\circ\mathcal{E}_\alpha^{(\delta)}(e_\alpha^{(\delta)}(f), e_\alpha^{(\delta)}(f)) < \infty$$

for all finite, positive $\alpha$ and $\delta$. An internal set $A \subset S_0$ has finite energy if

$${}^\circ\mathcal{E}_1(e_1(A), e_1(A)) < \infty.$$

5.5.7. DEFINITION. A Dirichlet form $\mathcal{E}$ generates quasi-continuous extensions if there is a $\delta_1 \approx 0$ such that for all infinitesimal $\delta \ge \delta_1$ and all quasi-continuous, S-bounded internal functions $f: D_f \to {}^*\mathbb{R}$ of finite energy, $e_\alpha^{(\delta)}(f)$ is quasi-continuous for all non-negative real numbers $\alpha$.
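We observed in Section 5.3 that since

$$e_\alpha(A)(s_i) = E_i\big((1 + \alpha\,\Delta t)^{-\sigma_A/\Delta t}\big),$$

the function

$$\alpha \mapsto {}^\circ e_\alpha(A)(s_i)$$

is the Laplace transform of the measure $\lambda_i$ defined on $\mathbb{R}$ by $\lambda_i(C) = L(P_i)\{\omega \mid {}^\circ\sigma_A(\omega) \in C\}$.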
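The identity between the equilibrium potential of a set and the discounted law of its hitting time can be checked numerically on a small finite chain. The sketch below is an illustration only: the five-state transition matrix `Q`, the function names, and the parameter values are assumptions made for the example, and an ordinary finite chain stands in for the hyperfinite process. One routine solves the fixed-point equation $u = 1$ on $A$, $u(i) = (1+\alpha\,\Delta t)^{-1}\sum_j q_{ij}u(j)$ off $A$; the other sums $(1+\alpha\,\Delta t)^{-k}$ against the probability that $A$ is first hit at step $k$.

```python
def potential_by_iteration(Q, A, alpha, dt, sweeps=2000):
    """Iterate u = 1 on A, u(i) = c * sum_j Q[i][j] u(j) off A."""
    c = 1.0 / (1.0 + alpha * dt)
    n = len(Q)
    u = [1.0 if i in A else 0.0 for i in range(n)]
    for _ in range(sweeps):
        u = [1.0 if i in A else c * sum(Q[i][j] * u[j] for j in range(n))
             for i in range(n)]
    return u


def potential_by_hitting_law(Q, A, alpha, dt, start, steps=2000):
    """Sum c**k * P(first hit of A happens at step k), starting from `start`."""
    if start in A:
        return 1.0
    c = 1.0 / (1.0 + alpha * dt)
    n = len(Q)
    mass = [0.0] * n          # law of the chain killed on hitting A
    mass[start] = 1.0
    total = 0.0
    for k in range(1, steps + 1):
        new = [0.0] * n
        for i in range(n):
            if mass[i]:
                for j in range(n):
                    new[j] += mass[i] * Q[i][j]
        total += (c ** k) * sum(new[j] for j in A)
        for j in A:
            new[j] = 0.0      # remove the paths that have just hit A
        mass = new
    return total


# Hypothetical example chain: nearest-neighbour walk on {0,...,4}, A = {0}.
Q = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.5, 0.5],
]
u = potential_by_iteration(Q, {0}, alpha=1.0, dt=0.1)
```

In this example the potential decreases with the distance from $A$, reflecting that points further away are hit later.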
If the functions $s_i \mapsto e_\alpha(A)(s_i)$ are quasi-continuous, then for all infinitely close $s_i, s_j$ outside an exceptional set, the particles starting at $s_i$ hit $A$ with the same time distribution as the ones starting at $s_j$. Hence quasi-continuity of equilibrium potentials implies a certain uniformity in the future behavior of the process. To prove condition 5.4.13(iii), we must extend this observation to also take into account what happens after the first time $X$ hits $A$.

Given a finite sequence $A = (A_1, A_2, \ldots, A_n)$ of internal sets and an integer $j \le n$, let

$$A_{\le j} = (A_1, A_2, \ldots, A_j)$$

and

$$A_{\ge j} = (A_j, A_{j+1}, \ldots, A_n).$$

We define an internal stopping time $\sigma_A$ by induction on the length $n$ of $A$ as follows: if $A = (A_1)$ has length one, $\sigma_A$ is just the first hitting time $\sigma_{A_1}$, i.e.,

$$\sigma_A(\omega) = \inf\{t \in T \mid X(\omega,t) \in A_1\}.$$

If $\sigma_B$ has been defined for all sequences $B$ of length less than $n$, let

$$\sigma_A(\omega) = \inf\{t \in T \mid t \ge \sigma_{A_{\le n-1}}(\omega) \wedge X(\omega,t) \in A_n\}.$$
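The inductive definition of the stopping time attached to a finite sequence of sets can be made concrete on a finite path. This is a minimal sketch with assumptions stated plainly: the path is a Python list, the sets are ordinary Python sets, the function name is hypothetical, and the search for each set starts at (not strictly after) the time at which the previous set was hit.

```python
def successive_hitting_time(path, sets):
    """Index at which the path has visited sets[0], then sets[1], ... in order.

    Returns None if the path ends before the last set is reached.
    Each search starts at the index where the previous set was hit.
    """
    t = 0
    for target in sets:
        while t < len(path) and path[t] not in target:
            t += 1
        if t == len(path):
            return None
    return t


path = ["a", "b", "c", "b", "a"]
sigma = successive_hitting_time(path, [{"b"}, {"a"}])
```

Here the path first reaches `"b"` at index 1 and then needs until index 4 to reach `"a"`, so the stopping time for the sequence `({"b"}, {"a"})` is 4.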
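When we apply this definition to a restriction $X^{(\delta)}$ of $X$, we write $\sigma_A^{(\delta)}$ for the resulting stopping time. For each finite sequence $A$ of internal sets, we define "generalized equilibrium potentials" $e_\alpha^{(\delta)}(A)$ by

(7) $e_\alpha^{(\delta)}(A)(s_i) = E_i\big((1 + \alpha\delta)^{-\sigma_A^{(\delta)}/\delta}\big).$

5.5.8. LEMMA. For all finite sequences $A$ of internal subsets of $S_0$, all positive $\alpha$ and $\delta$, and all $j$ less than the length of $A$, we have

(8) $e_\alpha^{(\delta)}(A_{\ge j}) = e_\alpha^{(\delta)}\big(e_\alpha^{(\delta)}(A_{\ge j+1}) \upharpoonright A_j\big),$

i.e., $e_\alpha^{(\delta)}(A_{\ge j})$ is the equilibrium potential with respect to $\mathcal{E}_\alpha^{(\delta)}$ of the restriction of $e_\alpha^{(\delta)}(A_{\ge j+1})$ to $A_j$. In particular, the function

(9)

is increasing.

PROOF That the function in (9) is increasing follows from (8) and 5.3.5. To prove (8), just observe that since $X$ is Markov

$$e_\alpha^{(\delta)}(A_{\ge j})(s_i) = E_i\big[(1 + \alpha\delta)^{-\sigma_{A_{\ge j}}/\delta}\big]$$
$$= E_i\big[(1 + \alpha\delta)^{-\sigma_{A_j}/\delta}(1 + \alpha\delta)^{-(\sigma_{A_{\ge j}} - \sigma_{A_j})/\delta}\big]$$
$$= E_i\big[(1 + \alpha\delta)^{-\sigma_{A_j}/\delta}\, e_\alpha^{(\delta)}(A_{\ge j+1})(X(\sigma_{A_j}))\big]$$
$$= e_\alpha^{(\delta)}\big(e_\alpha^{(\delta)}(A_{\ge j+1}) \upharpoonright A_j\big)(s_i).$$

The lemma is proved.

5.5.9. COROLLARY. Assume that $\mathcal{E}$ generates quasi-continuous extensions, and that $A = (A_1, \ldots, A_n)$ is a finite sequence of internal sets where $A_n$ has finite energy. Then $e_\alpha^{(\delta)}(A)$ is quasi-continuous and has finite energy for all finite, positive $\alpha$ and all $\delta$.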
PROOF Use induction on $k$ and the lemma to prove the statement for all $A_{\ge n-k}$, $0 \le k < n$.
We are now ready for 5.4.13(iii).

5.5.10. PROPOSITION. Let $\mathcal{E}$ be a nearstandardly concentrated, normal Dirichlet form which separates compacts and allows quasi-continuous extensions. There is a properly exceptional set $A$ of irregularities of $X$ such that if $x$ is the modified standard part with respect to $A$, then for all infinitely close $s_i, s_j \in \bar S_0 - A$, all $t \in \mathbb{R}_+$, and all Borel sets $B \subset Y$,

$$L(P_i)\{\omega \mid x(\omega,t) \in B\} = L(P_j)\{\omega \mid x(\omega,t) \in B\}.$$
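PROOF. We shall first construct the exceptional set $A$. Let $A_0$ and $A_1$ be as in 5.5.3 and 5.5.6, respectively. Let $\tau$ be the countable family in Definition 5.5.1, and observe that all $O \in \tau$ have finite energy. Let $Z$ be the set of all finite, alternating sequences

$$\mathcal{Y} = ({}^*O_1, {}^*O_2, {}^*O_1, {}^*O_2, \ldots, {}^*O_1, {}^*O_2)$$

where $O_1, O_2 \in \tau$. Fix an infinitesimal $\delta$ larger than the $\delta_0$ of 5.5.3 and the $\delta_1$ of 5.5.7, and let

$$\mathcal{G} = \{e_q^{(\delta)}(\mathcal{Y}) \mid q \in \mathbb{Q},\ \mathcal{Y} \in Z\}.$$

Since $\mathcal{G}$ is countable, Corollary 5.5.9 tells us that there is an exceptional set $A_2$ such that all functions in $\mathcal{G}$ are continuous off $A_2$. Let $A$ be a properly exceptional set containing $A_0 \cup A_1 \cup A_2$. The functions $\alpha \mapsto e_\alpha^{(\delta)}(\mathcal{Y})$ are obviously S-continuous, and thus

(10) ${}^\circ E_i\big((1 + \alpha\delta)^{-\sigma_{\mathcal{Y}}^{(\delta)}/\delta}\big) = {}^\circ E_j\big((1 + \alpha\delta)^{-\sigma_{\mathcal{Y}}^{(\delta)}/\delta}\big)$

for all infinitely close $s_i, s_j \in \bar S_0 - A$. Since

$${}^\circ E_i\big((1 + \alpha\delta)^{-\sigma_{\mathcal{Y}}^{(\delta)}/\delta}\big) = \int e^{-\alpha\,{}^\circ\sigma_{\mathcal{Y}}^{(\delta)}}\, dL(P_i)$$

for all $\alpha \in \mathbb{R}_+$, the function

$$\alpha \mapsto {}^\circ E_i\big((1 + \alpha\delta)^{-\sigma_{\mathcal{Y}}^{(\delta)}/\delta}\big)$$

is the Laplace transform of the distribution of ${}^\circ\sigma_{\mathcal{Y}}^{(\delta)}$ with respect to $L(P_i)$. Hence it follows from (10) that for all $s_i, s_j \in \bar S_0 - A$, $s_i \approx s_j$,

(11) $L(P_i)\{\omega \mid {}^\circ\sigma_{\mathcal{Y}_{\le m}}^{(\delta)}(\omega) \le t < {}^\circ\sigma_{\mathcal{Y}_{\le m+1}}^{(\delta)}(\omega)\} = L(P_j)\{\omega \mid {}^\circ\sigma_{\mathcal{Y}_{\le m}}^{(\delta)}(\omega) \le t < {}^\circ\sigma_{\mathcal{Y}_{\le m+1}}^{(\delta)}(\omega)\}$

for all $t \in \mathbb{R}_+$ and all $\mathcal{Y} \in Z$, $m <$ length $\mathcal{Y}$.

Assume that the proposition is false; then we can find $s_i, s_j \in \bar S_0 - A$ and a Borel set $B$ such that

(12) $L(P_i)\{\omega \mid x(\omega,t) \in B\} - L(P_j)\{\omega \mid x(\omega,t) \in B\} = \varepsilon > 0$

for some $t \in \mathbb{R}_+$. Pick $\tilde t \approx t$ so large that ${}^\circ X(\omega,\tilde t) = x(\omega,t)$ $L(P_i)$- and $L(P_j)$-a.e.; this is possible since $A_0, A_1 \subset A$. The measures $\mu_i$ and $\mu_j$ defined by

$$\mu_i(C) = L(P_i)\{\omega \mid {}^\circ X(\omega,\tilde t) \in C\}, \qquad \mu_j(C) = L(P_j)\{\omega \mid {}^\circ X(\omega,\tilde t) \in C\}$$

are Radon measures on $Y$, and hence there is a compact set $K \subset B$ such that

(13) $\mu_i(K) > \mu_j(K) + 3\varepsilon/4.$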
We can also find a compact set $K' \subset Y - K$ such that

$$\mu_i((Y - K) - K') < \varepsilon/4, \qquad \mu_j((Y - K) - K') < \varepsilon/4.$$

Since $\mathcal{E}$ separates compacts, we can find two disjoint open sets $O, O' \in \tau$ such that $K \subset O$, $K' \subset O'$. Let $m$ be so large that the $L(P_i)$ and $L(P_j)$ probabilities that a path should jump back and forth between ${}^*O$ and ${}^*O'$ more than $m$ times before time $t + 1$ are both less than $\varepsilon/4$. Let

$$\mathcal{Y} = ({}^*O, {}^*O', {}^*O, {}^*O', \ldots, {}^*O, {}^*O')$$

have length $2m$. Then
3E
> L ( P , ) { w I x ( w ,t ) E K } --, 4
contradicting (13). The proposition is proved.

REMARK. Note that we do not really have to know that $\mathcal{E}$ generates quasi-continuous extensions to carry through the proof of 5.5.10; all that is needed is that the functions $e_\alpha^{(\delta)}(\mathcal{Y})$ are quasi-continuous.
We can now combine the results of this section with Theorem 5.4.17.

5.5.11. THEOREM. Assume that $S_0$ is a hyperfinite subset of ${}^*Y$ for some second countable, regular, almost $\sigma$-compact space $Y$. Let $\mathcal{E}$ be a nearstandardly concentrated, normal Dirichlet form on $S_0$ that separates compacts and generates quasi-continuous extensions. Then the associated hyperfinite Markov process has a modified standard part which is a strong Markov process.

PROOF By 5.5.3, 5.5.6, and 5.5.10, the hyperfinite Markov process has exceptional irregularities, and thus the result follows from 5.4.17.

D. Regular Forms

The conditions in 5.5.11 are not in any sense canonical; there are other possibilities which are just as reasonable. So far we have tended to choose assumptions that would simplify the theory rather than those most suitable for applications; we shall now reformulate our hypotheses so that they will be easier to verify.
First, a small observation whose proof we leave to the reader. We say that $\mathcal{E}$ separates points if Definition 5.5.1 only applies to singletons $K_1 = \{x\}$, $K_2 = \{y\}$. It turns out that if $\mathcal{E}$ separates points it also separates compacts. Hence it suffices to prove that our forms separate points. As we have already mentioned, it is not always easy to verify that a form is nearstandardly concentrated; the problem is to prove that
Recall that this condition is needed in order to prove that the process does not return from "infinity." We shall indicate how these difficulties can be circumvented by turning more of the non-nearstandard points into traps. Let $\tilde S_0$ be an internal subset of $S_0$, and let $\tau$ be the first hitting time,

$$\tau(\omega) = \min\{t \in T \mid X(\omega,t) \notin \tilde S_0\}.$$

We define a process $\tilde X$ taking values in $\tilde S = \tilde S_0 \cup \{s_0\}$ by

(15) $\tilde X(\omega,t) = X(\omega,t)$ when $t < \tau(\omega)$, and $\tilde X(\omega,t) = s_0$ when $t \ge \tau(\omega)$.
According to the Beurling-Deny formula, 5.3.1 , the associated Dirichlet form @ is given by g(u, u ) =
at 1( u ( i ) - u(i))’q,,ml+ Z-u ( i ) z m , c - 4111. so
”
1.J
Let (.,
,€So
E
JES-So
1 ‘J
be the L2-inner product on
.>I
(u, u)- =
(so,m ) ;
1-u ( i ) u ( i ) m ( i ) , Itso
and let
&(u, u ) = q u , u ) + a ( u , u>If f : D j + *R is an internal function defined on a subset of function defined by
j: 0, u ( S o - so) -+ “R be the internal
so,let
Observe that if ZN(f) denotes the equilibrium potential o f f with respect to
gN,
For all internal u : (17)
so “R, +
we have
.
% Y U ( U ,u
)=
u’).
The next lemma is a slight variation of the definition of a nearstandardly concentrated form.

5.5.12. LEMMA. Let $\mathcal{E}$ be a hyperfinite Dirichlet form, and assume that there exist a countable family $\mathcal{C}$ and an increasing sequence $\{B_n\}_{n \in \mathbb{N}}$, both consisting of internal subsets of $S_0$, such that the sets $\bigcup_{n \in \mathbb{N}} B_n - \bar S_0$ and $\bar S_0 - \bigcup\{C \mid C \in \mathcal{C}\}$ are exceptional, and for all $C \in \mathcal{C}$

(18) $\lim_{n \to \infty} |e_{C \cap B_n}|_1 = |e_C|_1 < \infty,$

(19) $\lim_{m \to \infty} |e_{C \cap B_n}^{\complement B_m}|_1 = |e_{C \cap B_n}|_1$ for all $n \in \mathbb{N}$.

Then $\mathcal{E}$ is nearstandardly concentrated.

PROOF Recall that if $w: S_0 \to {}^*\mathbb{R}$ agrees with $f$ on $D_f$, then by (5.3.30)

$$\mathcal{E}_1(e_1(f) - w, e_1(f) - w) = \mathcal{E}_1(w,w) - \mathcal{E}_1(e_1(f), e_1(f)).$$

Applying this twice, we have
always exist. The problem is that they may be infinite, or be finite but fail to agree with (18) and (19). Granted the finiteness, we shall show how the second problem can be solved by restricting the form to a suitable subset $\tilde S_0$ of $S_0$ as above.

5.5.13. DEFINITION. A Dirichlet form $\mathcal{E}$ is called locally finite if there is a countable family $\{C \mid C \in \mathcal{C}\}$ and an increasing sequence $\{B_n\}_{n \in \mathbb{N}}$, both consisting of internal subsets of $S_0$, such that

(i) the sets $\bar S_0 \,\triangle \bigcup_{n \in \mathbb{N}} B_n$ and $\bar S_0 - \bigcup\{C \mid C \in \mathcal{C}\}$ are exceptional;
(ii) for each $C \in \mathcal{C}$, we have $\lim_{n \to \infty} \lim_{m \to \infty} {}^\circ|e_{C \cap B_n}^{\complement B_m}|_1 < \infty$.

5.5.14. LEMMA. If $\mathcal{E}$ is a locally finite Dirichlet form, there is an internal subset $\tilde S_0$ of $S_0$ such that $\bar S_0 - \tilde S_0$ is exceptional, and $\tilde{\mathcal{E}}$ is nearstandardly concentrated.
PROOF Extend {B,,},,cNto an increasing, internal sequence { B,,}ns~N. There is an infinite p E *N such that for all n E N,
o~eccBc;B,~l = lim oleCC:B,J,.
(20)
m-m
Moreover, there is a
uE
*N - N such that
so
Let = B p ; we shall show that @ is nearstandardly concentrated with respect to the families { B n } n e and N %' = { C n B, I C E %}. Note that since A UntN B, is exceptional, it suffices to prove the equivalents of (18) and (19). Let I ;1 be the norm generated by 8,.Using (16), (17), (20), and (21), we get for all C n B, E %':
so
lim
'(z(CnB,)nB,(;
=
n-tm O I ~ C C B X B= , I ~O I ~ C C B X = B ~OI I~G n B , I T
lim
O1hf&&jnBm(;
=
m+m
n +m
lim
< co
and m-m
lim OIeP:B,,\l= OIeCcBXB,II = o I ~ ( C n B , ) n B . I ; ,
which correspond to (18) and (19), respectively.

Since $\bar S_0 - \tilde S_0$ is exceptional, a modified standard part of the process $\tilde X$ generated by the form $\tilde{\mathcal{E}}$ in 5.5.14 is a perfectly adequate version of the standard part of $X$. There are still a few difficulties in connection with Definition 5.5.13. Although it should not be too hard to check that

$$\lim_{n \to \infty} \lim_{m \to \infty} {}^\circ|e_{C \cap B_n}^{\complement B_m}|_1 < \infty$$

once the families $\{B_n\}$ and $\mathcal{C}$ are chosen, it may be more problematic to prove that the sets

$$\bar S_0 \,\triangle \bigcup_{n \in \mathbb{N}} B_n \quad \text{and} \quad \bar S_0 - \bigcup\{C \mid C \in \mathcal{C}\}$$

are exceptional. However, when $Y$ is locally compact, we can usually pick the $B$'s and $C$'s such that both sets are empty.

5.5.15. PROPOSITION. Let $Y$ be a second countable, locally compact Hausdorff space, and let $\mathcal{E}$ be a hyperfinite Dirichlet form defined on a subset $S_0$ of ${}^*Y$. Assume that there is a covering $\{O_n\}_{n \in \mathbb{N}}$ of $Y$ consisting of open sets with compact closures, such that for each $n \in \mathbb{N}$ there is an $m \in \mathbb{N}$ with

Then there is an internal set $\tilde S_0$, such that $\bar S_0 \subset \tilde S_0 \subset S_0$ and $\tilde{\mathcal{E}}$ is nearstandardly concentrated.
PROOF Put $B_n = \bigcup_{i=1}^{n} {}^*O_i \cap S_0$, and let

$$\mathcal{C} = \{{}^*O_n \cap S_0 \mid n \in \mathbb{N}\}.$$

Since

$$\bigcup_{n \in \mathbb{N}} B_n = \bar S_0 = \bigcup\{C \mid C \in \mathcal{C}\},$$

the proposition follows from Lemma 5.5.14.

We shall finally take a look at quasi-continuous extensions.

5.5.16. LEMMA. Assume that for each S-bounded $u \in \mathcal{D}[\mathcal{E}]$ there is a sequence $\{u_n\}_{n \in \mathbb{N}}$ of quasi-continuous functions such that ${}^\circ|u - u_n|_1 \to 0$ as $n \to \infty$. Then all internal S-bounded functions $u$ with ${}^\circ\mathcal{E}_1(u,u) < \infty$ are quasi-continuous, and $\mathcal{E}$ generates quasi-continuous extensions.
PROOF If $v$ is S-bounded and ${}^\circ\mathcal{E}_1(v,v) < \infty$, there is an S-bounded $u \in \mathcal{D}[\mathcal{E}]$ such that $\|u - v\| \approx 0$. For all $k \in \mathbb{N}$

and thus this inequality also holds for some infinite $k$. This implies that the set

$$D = \{s_i \mid u(s_i) \not\approx v(s_i)\}$$

is contained in an internal set of infinitesimal measure. By 5.4.4(i), $D$ is exceptional, and hence $v$ is quasi-continuous if and only if $u$ is.

To prove that $u$ is quasi-continuous, we first observe that by passing to a subsequence if necessary, we may assume that

(22) $\mathcal{E}_1(u_{k+1} - u_k, u_{k+1} - u_k) < 2^{-4k}.$

Extend $\{u_n\}_{n \in \mathbb{N}}$ to an internal sequence $\{u_n\}_{n \le \eta}$, $\eta \in {}^*\mathbb{N} - \mathbb{N}$, such that (22) holds for all $k < \eta$. Define

$$G_k = \{s_i \in S_0 \mid |u_{k+1}(s_i) - u_k(s_i)| > 2^{-k}\},$$

and note that

$$\mathcal{E}_1(e_{G_k}, e_{G_k}) \le 2^{2k}\,\mathcal{E}_1(u_{k+1} - u_k, u_{k+1} - u_k) \le 2^{-2k}.$$

Let

$$F_k = \bigcup_{k \le l < \eta} G_l.$$

Since

$$|e_{F_k}|_1 \le \Big|\sum_{k \le l < \eta} e_{G_l}\Big|_1 \le \sum_{k \le l < \eta} |e_{G_l}|_1 \le 2^{-k+1},$$

we have $\mathcal{E}_1(e_{F_k}, e_{F_k}) \le 2^{-2k+2}$.
Thus

$$F = \bigcap_{k \in \mathbb{N}} F_k$$

is exceptional. Let $A$ be an exceptional set such that for all $n \in \mathbb{N}$, $u_n$ is S-continuous on $S_0 - A$. Since $\|u - u_\eta\| \approx 0$, we are done if we can only show that $u_\eta$ is S-continuous on the complement of $F \cup A$. Assume not, and pick $\varepsilon \in \mathbb{R}_+$, $s_i, s_j \in S_0 - (F \cup A)$ such that

(23) $|u_\eta(s_i) - u_\eta(s_j)| > \varepsilon$

but $s_i \approx s_j$. Choose $k \in \mathbb{N}$ so large that $s_i, s_j \notin F_k$ and $2^{-k+2} < \varepsilon$. Since $s_i, s_j \notin F_k$,

$$|u_k(s_i) - u_\eta(s_i)| \le 2^{-k+1}, \qquad |u_k(s_j) - u_\eta(s_j)| \le 2^{-k+1},$$

and since $u_k$ is S-continuous on $S_0 - A$,

$$|u_k(s_i) - u_k(s_j)| \approx 0.$$

Hence

$$|u_\eta(s_i) - u_\eta(s_j)| \le |u_\eta(s_i) - u_k(s_i)| + |u_k(s_i) - u_k(s_j)| + |u_k(s_j) - u_\eta(s_j)| \le 2^{-k+1} + |u_k(s_i) - u_k(s_j)| + 2^{-k+1} < \varepsilon,$$

contradicting (23). This completes the proof.

We shall say that an internal function $f: S_0 \to {}^*\mathbb{R}$ has compact support if there is a compact set $K$ such that $f(s_i) = 0$ for all $s_i \in S_0 - {}^*K$.

5.5.17. DEFINITION.
A hyperfinite Dirichlet form $\mathcal{E}$ is called regular if it is normal and

(i) for all S-continuous functions $u$ of compact support there is a sequence $\{u_n\}_{n \in \mathbb{N}}$ of internal functions of finite energy such that $\sup\{{}^\circ|u(s_i) - u_n(s_i)| \mid s_i \in S_0\} \to 0$ as $n \to \infty$;
(ii) for all S-bounded $u \in \mathcal{D}[\mathcal{E}]$, there is a sequence $\{u_n\}_{n \in \mathbb{N}}$ of quasi-continuous functions such that ${}^\circ|u - u_n|_1 \to 0$ as $n \to \infty$.

The last result we shall prove in this section is a nonstandard counterpart of a theorem of Fukushima stating that all regular (in a similar, standard sense) Dirichlet forms on second countable, locally compact spaces generate Hunt processes [see Fukushima (1971, 1980) and Silverstein (1974)].
5.5.18. THEOREM. Let $S_0$ be a hyperfinite subset of ${}^*Y$ for some second countable, locally compact Hausdorff space $Y$. If $\mathcal{E}$ is a regular Dirichlet form on $S_0$, there is an internal set $\tilde S_0$, $\bar S_0 \subset \tilde S_0 \subset S_0$, such that a modified standard part of the associated process $\tilde X$ is a strong Markov process.
PROOF It follows easily from 5.5.17(i) that $\mathcal{E}$ separates compacts and satisfies the conditions of 5.5.15. By 5.5.17(ii) and 5.5.16, $\mathcal{E}$ must generate quasi-continuous extensions. The theorem follows from 5.5.11.

REMARK. The theory in this chapter is new, but several of the fundamental ideas are taken from the standard theory, as described in Fukushima (1980) and Silverstein (1974, 1976). In particular, our proof of Proposition 5.5.10 is closely modeled on Silverstein's proof of the standard version of 5.5.18 [see Silverstein (1974) and Fukushima (1980)]. The existing standard theory flows smoothly for second countable, locally compact spaces, but gets more complicated when the space is not assumed to be locally compact [see Albeverio and Høegh-Krohn (1976, 1977a,b), Kusuoka (1982), Paclet (1977/78, 1979), and Takeda (1984)]. One of the advantages of the hyperfinite approach is that it allows a unified treatment of locally compact spaces and most other interesting spaces (e.g., separable Hilbert and Banach spaces). However, it cannot be denied that the nonstandard theory is also easier to apply in locally compact spaces (compare Proposition 5.5.15). In the next section we shall take a look at some typical applications of the theory developed in the last two sections.
5.6. APPLICATIONS TO QUANTUM MECHANICS AND STOCHASTIC DIFFERENTIAL EQUATIONS
We shall end this chapter by taking a brief look at some of the areas in which the theory of Dirichlet forms has been applied. The results we are going to present are not new [most of them are taken from Albeverio et al. (1977)], but they illustrate how Dirichlet forms provide a natural unification of ideas from analysis, probability, and mathematical physics. There are also strong reasons to believe that combined with the theory developed in the previous sections, these examples may serve as the starting point of important generalizations, especially in the infinite-dimensional case. As this section is primarily expository in nature, we do not strive for the greatest possible generality and shall feel free to omit proofs whenever we find it convenient.

A. Hamiltonians and Energy Forms
Let us begin by explaining the relationship between Dirichlet forms and quantum mechanics. Recall that a quantum mechanical particle moving in a potential $V$ is governed by its Hamiltonian

(1) $H = -\Delta + V,$

where $\Delta = \sum_{i=1}^{3} \partial^2/\partial x_i^2$ is the three-dimensional Laplace operator. If $0 = \inf \operatorname{spec}(H)$ is a simple eigenvalue corresponding to a strictly positive eigenfunction $\varphi$, it is well known that under reasonable conditions on $V$, the Hamiltonian $H$ is the infinitesimal generator of a symmetric Markov semigroup $e^{-tH}$; see, e.g., Simon (1979). The interactions we shall be interested in here are too singular for this approach, and we shall study the relationship between Markov processes and quantum physics from a slightly different point of view.

Recall that to fit into the general framework of quantum physics, $H$ must be interpreted as a self-adjoint operator on $L^2(\mathbb{R}^d, dx)$, where $dx$ denotes the Lebesgue measure. Another way of formulating this is to say that the form $F$ defined on a reasonable class of functions (e.g., $C_0^2(\mathbb{R}^d)$) by

(2) $F(f,g) = \int (-\Delta f + Vf)\,g\,dx$

is closable. Through a sequence of formal calculations we shall now find a function $\varphi$ and a Dirichlet form $E$ such that

(3) $E(f,g) = F(\varphi f, \varphi g).$
The calculations will be formal in the sense that we shall assume without justification that the operations we perform are legitimate. Later in this section we shall give several examples illustrating the importance of (3). Before we begin our computations, let us mention that the reader who is totally unfamiliar with quantum mechanics may find it helpful first to take a look at the introduction to Section 6.2.

We shall have to assume that there is a function $\varphi$ such that

(4) $(-\Delta + V)\varphi = 0.$

This is a rather innocent assumption; if $\varphi$ is a positive generalized [in the sense that it need not belong to $L^2(\mathbb{R}^3, dx)$] eigenfunction of $-\Delta + V$ corresponding to an eigenvalue $E_0$, then

(5) $(-\Delta + \tilde V)\varphi = 0,$

where $\tilde V = V - E_0$. Since potentials are only determined up to an additive constant, $V$ and $\tilde V$ describe the same physical situation, and we might just as well work with $\tilde V$ as with $V$. Assuming (4), we can rewrite (1) as

(6) $H = -\Delta + (\Delta\varphi/\varphi).$
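Applying $H$ to a function of the form $\varphi f$, we see that

(7) $H(\varphi f) = -\Delta(\varphi f) + (\Delta\varphi) f = -(\Delta\varphi) f - 2\nabla\varphi \cdot \nabla f - \varphi\,\Delta f + (\Delta\varphi) f = \varphi\big(-\Delta f - 2(\nabla\varphi/\varphi) \cdot \nabla f\big),$

and if $A$ is the operator

(8) $Af = -\Delta f - 2(\nabla\varphi/\varphi) \cdot \nabla f,$

equation (7) can be written as

(9) $H\varphi = \varphi A.$

In terms of the form $F$, we get

(10) $F(\varphi f, \varphi g) = \int H(\varphi f)\,\varphi g\,dx = \int (Af)\,g\,\varphi^2\,dx.$

Introducing a measure $\mu$ by

$$d\mu = \varphi^2\,dx,$$

it is natural to define a form on $L^2(\mathbb{R}^3, d\mu)$ by

(11) $E(f,g) = \int (Af)\,g\,d\mu = \int (Af)\,g\,\varphi^2\,dx = F(\varphi f, \varphi g).$

Comparing (11) and (3) we see that $E$ is the natural candidate for the Dirichlet form associated with $F$. To get $E$ on a simpler form we integrate by parts in the first term of

$$E(f,g) = -\int (\Delta f)\,g\,\varphi^2\,dx - 2\int \nabla\varphi \cdot \nabla f\,g\,\varphi\,dx.$$

Assuming that $f$ and $g$ and their partial derivatives vanish at infinity, we get

$$E(f,g) = \int \nabla f \cdot \nabla g\,\varphi^2\,dx + \int \nabla f \cdot \nabla(\varphi^2)\,g\,dx - 2\int \nabla\varphi \cdot \nabla f\,g\,\varphi\,dx = \int \nabla f \cdot \nabla g\,d\mu.$$

We shall refer to $E$ as the energy form of the measure $\mu$. To summarize, we have shown that

(12) $E(f,g) = F(\varphi f, \varphi g).$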
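The intertwining relation $H(\varphi f) = \varphi\,Af$ behind this computation can be sanity-checked numerically in one dimension. The sketch below is an illustration under assumptions that are not in the text: it takes $\varphi(x) = e^{-x^2/2}$ (so that $V = \varphi''/\varphi = x^2 - 1$), the test function $f = \sin$, and central finite differences with a hypothetical step size `h`.

```python
import math


def first_derivative(g, x, h=1e-3):
    """Central difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)


def second_derivative(g, x, h=1e-3):
    """Central difference approximation of g''(x)."""
    return (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)


def phi(x):
    return math.exp(-0.5 * x * x)


def V(x):
    return x * x - 1.0  # equals phi''/phi for the Gaussian phi above


f = math.sin


def H_applied(g, x):
    """(H g)(x) = -g''(x) + V(x) g(x)."""
    return -second_derivative(g, x) + V(x) * g(x)


def A_applied(g, x):
    """(A g)(x) = -g''(x) - 2 (phi'/phi)(x) g'(x)."""
    drift = first_derivative(phi, x) / phi(x)
    return -second_derivative(g, x) - 2.0 * drift * first_derivative(g, x)


def residual(x):
    """H(phi f)(x) - phi(x) (A f)(x); should vanish up to discretisation error."""
    return H_applied(lambda s: phi(s) * f(s), x) - phi(x) * A_applied(f, x)
```

Evaluating `residual` at a few points gives values at the level of the finite-difference error, which is consistent with (9).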
The connection between $\mu$ and $V$ is that $V = \Delta\varphi/\varphi$ and $d\mu = \varphi^2\,dx$.

There are several reasons for passing from $F$ to $E$. The first concerns the important technical question of closability; recall that to be accepted as a Hamiltonian form, $F$ must have a closed extension. As we shall now see, closability of $E$ implies closability of $F$, and we shall later show that it is quite easy to find conditions which guarantee that $E$ can be closed. If $\mathcal{D}_0$ is the domain of $E$, and $F$ is defined on its domain

$$\tilde{\mathcal{D}}_0 = \{\varphi f : f \in \mathcal{D}_0\}$$

by $F(\varphi f, \varphi g) = E(f,g)$, then $F$ is closable in $L^2(\mathbb{R}^d, dx)$ if $E$ is closable in $L^2(\mathbb{R}^d, d\mu)$. The reason is quite simply that if $E_1$ and $F_1$ are the forms

(15) $E_1(f,g) = E(f,g) + \int fg\,d\mu,$

(16) $F_1(f,g) = F(f,g) + \int fg\,dx,$

then

(17) $F_1(\varphi f, \varphi g) = E_1(f,g).$

This means that if $\{\varphi f_n\}$ is a Cauchy sequence with respect to the $F_1$ norm, then $\{f_n\}$ is a Cauchy sequence with respect to the $E_1$ norm, and from this the claim follows immediately.

The second reason for passing from $F$ to $E$ is perhaps more exciting. There are interactions which are so irregular that they cannot be conveniently modeled by operators of the form

(18) $H = -\Delta + V;$

e.g., the potential may be too singular to be represented by a function $V$. But using the heuristic calculations above, it is often possible to reformulate the problem in terms of energy forms in a way that makes perfectly good
sense. The point is that the measure $d\mu = \varphi^2\,dx$ is usually a much less singular object than the function $V = \Delta\varphi/\varphi$. An example of this phenomenon is provided by the theory of point interactions, which we shall study both below (Example 5.6.4) and in Chapter 6.

There is a third reason for studying energy forms which should appeal to probabilists as well as physicists. We have seen that the quadratic form $E$ on $L^2(\mathbb{R}^d, d\mu)$ is generated by the operator

(19) $A = -\Delta - 2(\nabla\varphi/\varphi) \cdot \nabla,$

which is the infinitesimal generator of the stochastic differential equation

(20) $dx(t) = f(x(t))\,dt + db(t),$

where $f = 2(\nabla\varphi/\varphi)$ and $b$ is a Brownian motion with variance parameter 2. If it exists, the Markov process associated with $E$ must be a solution to (20). Using the theory developed in the last section [or the corresponding standard theory of Fukushima (1980) and Silverstein (1974)], we shall see that $E$ generates a Markov process in situations where $\nabla\varphi/\varphi$ is too singular to exist as a function or even as a distribution. Hence energy forms can be used to study stochastic differential equations with generalized drifts. Note that since $f = 2(\nabla\varphi/\varphi) = \nabla \ln \varphi^2$, we must assume that $f$ is a gradient.

The stochastic differential equation (20) is worth a study in its own right, but it gains additional relevance from the part it plays in quantum mechanics. Not only is it a useful technical tool (Albeverio et al., 1977, 1980, 1981, 1984a,b) but it has attained conceptual importance through the attempts that have been made to provide quantum theory with a purely probabilistic foundation [see, e.g., Nelson (1967, 1985), Guerra (1981), Guerra and Morato (1983), Carlen (1984), Yasue (1981), Zheng and Meyer (1984), and contributions to S. Albeverio et al. (1986a)]. For applications to other fields of mathematical physics, see Albeverio and Høegh-Krohn (1981a,b, 1982, 1984), Albeverio et al. (1983, 1984, 1985), Fukushima (1985), and their references. In infinite dimensions Dirichlet forms and their associated processes have found applications in quantum field theory and hydrodynamics [see, e.g., Albeverio and Høegh-Krohn (1981a,b, 1982, 1984), Albeverio et al. (1985c), and Takeda (1984)]; the mathematical framework was developed in Albeverio and Høegh-Krohn (1977a,b), Kusuoka (1982), Paclet (1978, 1979), and Takeda (1984).
Although we shall not treat it in any detail here, it is in the infinite-dimensional theory that we have the highest hopes for the nonstandard approach; the reason is the ease with which we have been able to handle infinite-dimensional problems not only in Sections 5.4 and 5.5, but also in connection with Gross’s theorem (Section 3.5) and stochastic integration (Section 4.7). We shall return briefly to these ideas at the end of the section.
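The relation between the measure $d\mu = \varphi^2\,dx$ and the stochastic differential equation (20) can be illustrated by a small simulation. The following is a sketch only, under assumptions chosen for the example: dimension one, $\varphi(x) = e^{-x^2/2}$ so that the drift is $f(x) = -2x$, an Euler-Maruyama discretisation with a hypothetical step `dt`, and a fixed random seed. Since $\varphi^2(x) = e^{-x^2}$ is proportional to a Gaussian density with variance $1/2$, the long-run sample variance of the simulated path should be close to $0.5$.

```python
import math
import random


def simulate_variance(steps=200_000, dt=0.01, seed=0):
    """Euler-Maruyama for dx = -2x dt + db, with b of variance parameter 2.

    Returns the time-averaged sample variance of the path.
    """
    rng = random.Random(seed)
    noise_scale = math.sqrt(2.0 * dt)  # increments of b have variance 2*dt
    x = 0.0
    total = 0.0
    total_sq = 0.0
    for _ in range(steps):
        x += -2.0 * x * dt + noise_scale * rng.gauss(0.0, 1.0)
        total += x
        total_sq += x * x
    mean = total / steps
    return total_sq / steps - mean * mean


estimated_variance = simulate_variance()
```

The estimate is subject to both sampling and discretisation error, so only rough agreement with $1/2$ should be expected.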
B. Standard and Nonstandard Energy Forms

Before we can understand the applications above in detail, we need to know more about energy forms. It should come as no surprise to the reader that we are going to use a hyperfinite approach, studying hyperfinite forms $\mathcal{E}$ on a hyperfinite lattice $Y$. Let

(21) $Y = \{(k_1\,\Delta x, \ldots, k_d\,\Delta x) : \forall i \le d\ (k_i \in {}^*\mathbb{Z} \text{ and } |k_i| \le \Delta x^{-2})\}$

for some $\Delta x \approx 0$. If $e \in \mathbb{R}^d$ is a unit vector of the form

(22) $e = (0, \ldots, 0, \pm 1, 0, \ldots, 0),$

let $\operatorname{sgn}(e)$ be the sign of the nonzero component of $e$. We shall call the set of all such unit vectors $U$; note that $U$ splits naturally into a positive part $U^+$ and a negative part $U^-$. Given an internal function $f: Y \to {}^*\mathbb{R}$ and an element $e \in U$, we define

(23) $D_e f(y) = \operatorname{sgn}(e)\,\dfrac{f(y + \Delta x\,e) - f(y)}{\Delta x}$

when both $y$ and $y + \Delta x\,e$ belong to $Y$. If $y \in Y$ and $y + \Delta x\,e \notin Y$, we let $D_e f(y) = 0$. The resulting function $D_e f: Y \to {}^*\mathbb{R}$ is an internal version of the partial derivative in $e$'s direction.

We shall study hyperfinite energy forms given by

(24) $\mathcal{E}(f,g) = \dfrac{1}{2} \sum_{y \in Y} \sum_{e \in U} D_e f(y)\,D_e g(y)\,\nu(y),$

where $\nu(y)$ is an internal measure on $Y$. The factor $\frac{1}{2}$ is included to compensate for the fact that each direction is counted twice. Using that $D_{-e} f(y + e\,\Delta x) = D_e f(y)$, we can rewrite (24) as

(25) $\mathcal{E}(f,g) = \dfrac{1}{2} \sum_{y \in Y} \sum_{e \in U^+} D_e f(y)\,D_e g(y)\,\big(\nu(y) + \nu(y + e\,\Delta x)\big).$
The form $\mathcal{E}$ obviously has the Markov property, and hence it is a Dirichlet form by Proposition 5.3.3. To construct the transition matrix $Q$ and the invariant measure $m$ of $\mathcal{E}$ we could have used the procedure in the proof of 5.3.3, but it is just as easy to do this directly. First we fix the time line

(26) $T = \{0, \Delta t, 2\,\Delta t, \ldots\},$

where

(27) $\Delta t = \Delta x^2/2d.$
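Observe that this is the usual relation $\Delta x = \sqrt{\Delta t}$ between the space and the time scale, with the factor $1/2d$ added for technical convenience. If the Markov process has no trap, the hyperfinite Beurling-Deny formula 5.3.1 and relation (27) tell us that

(28) $\mathcal{E}(f,g) = \dfrac{d}{\Delta x^2} \sum_{y, y' \in Y} (f(y) - f(y'))(g(y) - g(y'))\,q_{y,y'}\,m(y).$

Let us assume that

(29) $q_{y,y'} = 0$ if $|y - y'| \ne \Delta x$,

i.e., the process can only jump to a neighboring site. We can then write (28) as

(30) $\mathcal{E}(f,g) = d \sum_{y \in Y} \sum_{e \in U} D_e f(y)\,D_e g(y)\,q_{y,y+e\Delta x}\,m(y).$

By the symmetry condition $q_{y,y+e\Delta x}\,m(y) = q_{y+e\Delta x,y}\,m(y + e\,\Delta x)$ and the fact that $D_{-e} f(y + e\,\Delta x) = D_e f(y)$, the last equation can be rewritten as

(31) $\mathcal{E}(f,g) = 2d \sum_{y \in Y} \sum_{e \in U^+} D_e f(y)\,D_e g(y)\,q_{y,y+e\Delta x}\,m(y).$

Comparing (25) and (31) we see that

(32) $q_{y,y+e\Delta x}\,m(y) = \dfrac{\nu(y) + \nu(y + e\,\Delta x)}{4d}.$

By (29) we must have $\sum_{e \in U} q_{y,y+e\Delta x} = 1$, and thus

(33) $m(y) = \sum_{e \in U} \dfrac{\nu(y) + \nu(y + e\,\Delta x)}{4d},$

or

(34) $m(y) = \dfrac{1}{2}\,\nu(y) + \dfrac{1}{4d} \sum_{e \in U} \nu(y + e\,\Delta x).$

Putting this into (32), we get

(35) $q_{y,y+e\Delta x} = \dfrac{\nu(y) + \nu(y + e\,\Delta x)}{4d\,m(y)} = \dfrac{\nu(y) + \nu(y + e\,\Delta x)}{2d\,\nu(y) + \sum_{e' \in U} \nu(y + e'\,\Delta x)}.$

The last two equations show how to construct $m$ and $Q$ from $\nu$. Observe that although in general $m$ is different from $\nu$, we always have $L(m) \circ \mathrm{st}^{-1} = L(\nu) \circ \mathrm{st}^{-1}$, and thus $m$ and $\nu$ induce the same standard measure on $\mathbb{R}^d$.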
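The passage from $\nu$ to the pair $(m, Q)$ can be tried out on a small finite lattice. The following sketch makes several assumptions that are not in the text: the dimension is $d = 1$, the lattice is periodic (so that boundary effects disappear), the difference quotient carries the factor $\operatorname{sgn}(e)$, and all function names are hypothetical. Exact rational arithmetic is used so that the identities can be compared without rounding.

```python
from fractions import Fraction


def build_chain(nu):
    """Return (m, q_right, q_left) on a periodic 1-D lattice, following (34), (35)."""
    n, d = len(nu), 1
    m, q_right, q_left = [], [], []
    for y in range(n):
        right, left = nu[(y + 1) % n], nu[(y - 1) % n]
        m_y = Fraction(1, 2) * nu[y] + Fraction(1, 4 * d) * (right + left)
        m.append(m_y)
        q_right.append((nu[y] + right) / (4 * d * m_y))  # jump in direction e = +1
        q_left.append((nu[y] + left) / (4 * d * m_y))    # jump in direction e = -1
    return m, q_right, q_left


def difference(f, y, e, dx):
    """D_e f(y) = sgn(e) (f(y + e dx) - f(y)) / dx on the periodic lattice."""
    return e * (f[(y + e) % len(f)] - f[y]) / dx


def form_from_nu(f, g, nu, dx):
    """The energy form (24): one half of the sum over y and e of D_e f D_e g nu."""
    total = Fraction(0)
    for y in range(len(nu)):
        for e in (1, -1):
            total += difference(f, y, e, dx) * difference(g, y, e, dx) * nu[y]
    return total / 2


def form_from_chain(f, g, m, q_right, q_left, dx):
    """The form (30) with d = 1: sum over y and e of D_e f D_e g q m."""
    total = Fraction(0)
    for y in range(len(m)):
        for e, q in ((1, q_right[y]), (-1, q_left[y])):
            total += difference(f, y, e, dx) * difference(g, y, e, dx) * q * m[y]
    return total


nu = [Fraction(k) for k in (1, 3, 2, 5, 4, 2)]
m, q_right, q_left = build_chain(nu)
```

On this example the jump probabilities at each site add up to one, the pair $(m, Q)$ is symmetric, $m$ and $\nu$ have the same total mass, and the two expressions for the form agree.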
Let us just remark that the procedure above cannot always be reversed; given m and Q such that both (29) and the symmetry condition (36)
myqYY' = mY'%'Y
hold, it may be impossible to find a measure v satisfying ( 3 2 ) . What is needed is the following generalization of ( 3 6 ) : if y , , y,, . . . ,y , is a closed path on Y (i.e., y , = yn and lyi - yi+ll = Ax for all i), then (37) i=l
Note that (36) is (37) applied to the path $y, y', y$. We shall not make use of (37) here, but would like to comment that it corresponds to the standard condition that $f$ in (20) must be a gradient.

The average value of the transition probabilities $q_{y,y+e\Delta x}$ is $1/2d$ and the order of magnitude of the deviation is $\Delta x$. It is therefore convenient to write $q_{y,y+e\Delta x}$ in the form

$$q_{y,y+e\Delta x} = \frac{1}{2d} + \operatorname{sgn}(e)\,\beta_e(y)\,\Delta x, \tag{38}$$

where the factor $\operatorname{sgn}(e)$ is included to facilitate comparison to the standard theory. Rewriting (35) as

$$q_{y,y+e\Delta x} = \frac{1}{2d} + \left[\frac{\nu(y) + \nu(y + e\,\Delta x)}{2d\,\nu(y) + \sum_{e' \in U} \nu(y + e'\,\Delta x)} - \frac{1}{2d}\right] = \frac{1}{2d} + \frac{\bigl[\nu(y + e\,\Delta x) - \nu(y)\bigr] + \bigl[\nu(y) - \frac{1}{2d}\sum_{e' \in U} \nu(y + e'\,\Delta x)\bigr]}{2d\,\nu(y) + \sum_{e' \in U} \nu(y + e'\,\Delta x)}, \tag{39}$$

we see, with $\Delta$ denoting the discrete Laplacian, that
By equation (5.3.16) the infinitesimal generator of $\mathcal{E}$ is
Using ( 2 7 ) and (38) we can rewrite this as
(44)
Under any reasonable conditions on $f$ and $\nu$, the last term in (44) is infinitesimal. Observe also that (44) is the nonstandard counterpart of (19).

We can now begin to explore the connections between the standard and the hyperfinite theory of energy forms. Let $\mu$ be a completed Borel measure on $\mathbb{R}^d$ such that $\mu(K) < \infty$ when $K$ is compact. Pick a hyperfinite representation $\nu$ of $\mu$ supported on the lattice $Y$. In order to make the last term in (44) vanish, we choose $\nu$ such that $[\Delta\nu(y)/m(y)]\,\Delta x \approx 0$. Let $E_0$ be the form defined on the set $C_0^1(\mathbb{R}^d)$ of continuously differentiable functions of compact support by (45)
We know from Sections 5.1 and 5.2 that $\mathcal{E}$ induces a closed standard form $E$ on $L^2(\mathbb{R}^d, \mu)$. A natural question to ask is when $E$ is an extension of $E_0$; this would imply that $E_0$ is closable. Note first that if $f \in C_0^1(\mathbb{R}^d)$, then

$${}^\circ\mathcal{E}({}^*f, {}^*f) = E_0(f, f). \tag{47}$$
But since E can be defined by
all we can conclude from (47) is the inequality

$$E(f, f) \le E_0(f, f). \tag{49}$$

To get equality in (49) we need a condition on $\nu$. It turns out that the "natural" one to use is
but that by working a little harder, we can get away with a slightly weaker, "local" version of (50). The extra generality will be helpful in our applications.

5.6.1. LEMMA. Assume that there is a closed set $N \subset \mathbb{R}^d$ of $\mu$-measure zero such that
whenever $O$ is an open set containing $N$ and $K$ is compact. Then ${}^*f \in \mathcal{D}[\mathcal{E}]$ for all $f \in C_0^1(\mathbb{R}^d)$ and hence $E$ is an extension of $E_0$.

PROOF.
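If ${}^*f \notin \mathcal{D}[\mathcal{E}]$, there must be an internal function $\tilde f$ such that

$$\int |{}^*f - \tilde f|^2 \, dm \approx 0, \tag{52}$$

but ${}^\circ\mathcal{E}({}^*f - \tilde f, {}^*f) > 0$. We can choose $\tilde f = Q^t\,{}^*f$ for a suitable infinitesimal $t$. By definition

$$\mathcal{E}({}^*f - \tilde f, {}^*f) = \frac{1}{2} \sum_{e \in U} \int D_e({}^*f - \tilde f)\, D_e{}^*f \, d\nu(y),$$

and since $N$ has $\mu$-measure zero and $f$ has compact support, there is a compact set $K_1$ and an open set $O_1 \supset N$ such that

$${}^\circ \sum_{e \in U} \int_{(Y \cap {}^*K_1) - {}^*O_1} D_e({}^*f - \tilde f)\, D_e{}^*f \, d\nu(y) > 0.$$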
Approximating $D\,{}^*f$ by differentiable functions, we can find a $g \in C_0^2(\mathbb{R}^d - N)$ (the set of all functions which vanish outside a compact subset of $\mathbb{R}^d - N$ and have continuous second derivatives) such that

$${}^\circ \sum_{e \in U} \int_{(Y \cap {}^*K_1) - {}^*O_1} D_e({}^*f - \tilde f)\, D_e{}^*g \, d\nu(y) > 0.$$

If we are a little careful how we choose $g$ outside $K_1 - O_1$, we can ensure that

Hence

Comparing (52) and (53), we see that we get a contradiction if we can only prove that $\int (A\,{}^*g)^2 \, d\nu < \infty$.

By (44), $A\,{}^*g$ consists of three terms, the first of which is square-integrable since $g \in C_0^2(\mathbb{R}^d)$, the second is square-integrable by condition (51) and the fact that $g$ vanishes outside a compact subset of $\mathbb{R}^d - N$, and the third is infinitesimal by our standing condition $\Delta\nu(y)\,\Delta x / m(y) \approx 0$. Thus the assumption that ${}^*f \notin \mathcal{D}[\mathcal{E}]$ leads to a contradiction. That $E$ is an extension of $E_0$ now follows from (47).
To translate 5.6.1 into standard terms, we assume that $d\mu = \varphi^2\,dx$ as above. What corresponds to the hyperfinite quantity $D_e\nu(y)/m(y)$ is then

$$\frac{\partial\varphi^2/\partial x_e}{\varphi^2} = \frac{2}{\varphi}\,\frac{\partial\varphi}{\partial x_e},$$

and condition (51) becomes

Hence we get:

5.6.2. COROLLARY. Let $d\mu = \varphi^2\,dx$, where $\varphi$ is locally square integrable. Assume that there is a closed set $N$ of Lebesgue measure zero such that the distribution $\nabla\varphi$ is in $L^2_{\mathrm{loc}}(\mathbb{R}^d - N)$. Then $E_0$ is closable.
By saying that a distribution $T$ is in $L^2_{\mathrm{loc}}(\mathbb{R}^d - N)$, we mean that for any $f \in C_0^\infty(\mathbb{R}^d - N)$ the distribution $f(x)T(x)$ is actually a function in $L^2(\mathbb{R}^d)$.

Let us take a look at two applications of 5.6.2, both due to Albeverio et al. (1977). The first example is very simple, but it illustrates vividly the importance of allowing an "exceptional" set $N$.
5.6.3. EXAMPLE. Let $D$ be an open subset of $\mathbb{R}^d$, and let $\varphi(x) = 1$ for $x \in D$ and $\varphi(x) = 0$ for $x \notin D$. The boundary $N = \partial D$ of $D$ has Lebesgue measure zero, and $\nabla\varphi \in L^2(\mathbb{R}^d - N)$. Hence the form

$$E_0(f, g) = \int_D \nabla f \cdot \nabla g \, dx$$

is closable by Corollary 5.6.2, and the associated operator is the Laplacian with Neumann boundary conditions.

In the next example we shall finally exploit the connection between quantum mechanics and Dirichlet forms developed at the beginning of this section.
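5.6.4. EXAMPLE. Let $d = 3$, and for each $m \in \mathbb{R}$, let

$$\varphi_m(x) = \frac{e^{-m|x|}}{|x|}.$$

It is easy to check that

$$\nabla\varphi_m(x) = -\Bigl(m + \frac{1}{|x|}\Bigr)\varphi_m(x)\,\frac{x}{|x|} \tag{54}$$

and

$$\Delta\varphi_m(x) = m^2\varphi_m(x) \tag{55}$$

when $x \neq 0$. From (54) we see immediately that $\nabla\varphi_m \in L^2_{\mathrm{loc}}(\mathbb{R}^3 - \{0\})$, and hence the form

$$E_m^0(f, g) = \int \nabla f \cdot \nabla g \, \varphi_m^2 \, dx$$

can be extended to a closed form $E_m$. Let $F_m$ be the closed form on $L^2(\mathbb{R}^3, dx)$ defined by

$$F_m(f, g) = E_m(\varphi_m^{-1} f, \varphi_m^{-1} g). \tag{56}$$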
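The two facts about $\varphi_m$ used in this example are quick to check numerically. The sketch below is an illustration only: it relies on the standard formula for the Laplacian of a radial function in three dimensions, $\Delta\varphi = \varphi'' + (2/r)\varphi'$, approximates the derivatives by central differences, and the sample values $m = 1.5$ and $r \in \{0.5, 1, 2\}$ as well as the helper names are arbitrary choices, not taken from the text.

```python
import math

def phi(r, m):
    """Radial profile of phi_m: exp(-m r) / r, for r > 0."""
    return math.exp(-m * r) / r

def radial_derivatives(f, r, h=1e-4):
    """Central-difference approximations of f'(r) and f''(r)."""
    d1 = (f(r + h) - f(r - h)) / (2.0 * h)
    d2 = (f(r + h) - 2.0 * f(r) + f(r - h)) / (h * h)
    return d1, d2

m = 1.5
grad_errors = []
laplace_errors = []
for r in (0.5, 1.0, 2.0):
    d1, d2 = radial_derivatives(lambda s: phi(s, m), r)
    value = phi(r, m)
    # Radial form of (54): phi'(r) = -(m + 1/r) phi(r).
    grad_errors.append(abs(d1 + (m + 1.0 / r) * value) / value)
    # Radial form of (55): phi'' + (2/r) phi' = m^2 phi for r != 0.
    laplacian = d2 + 2.0 * d1 / r
    laplace_errors.append(abs(laplacian - m * m * value) / (m * m * value))

max_grad_error = max(grad_errors)
max_laplace_error = max(laplace_errors)
```

Both relative errors are at the level of the finite-difference truncation error. The check says nothing about the origin, which is exactly where the singular behaviour discussed below enters.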
From (55) and the calculations we did at the beginning of this section, one would expect that

(57)

Going through the calculations once more, checking that each step is legitimate when $\varphi = \varphi_m$, we get that (57) does indeed hold as long as $f(0) = g(0) = 0$. Defining

we thus get a family of forms agreeing on functions vanishing at the origin. But for $m > 0$, $\varphi_m$ is an eigenfunction for the $m$th form of this family, with eigenvalue $-m^2$. This implies that the forms are different when $m \neq k$; if not, $\varphi_m$ and $\varphi_k$ would be two positive eigenfunctions corresponding to different eigenvalues and this is clearly impossible. Letting $-\Delta_m$ be the operator generating the $m$th form, we have a family of self-adjoint operators agreeing with the free Hamiltonian $-\Delta$ on functions vanishing at the origin.

We shall try to explain briefly why the family $\{-\Delta_m\}$ is of interest to physicists. Assume that we are trying to model forces of short range. If the source is at the origin and the range of the interaction is less than $\varepsilon$, the Hamiltonian is of the form
$$H_\varepsilon = -\Delta + V_\varepsilon, \tag{59}$$

where $V_\varepsilon$ vanishes outside the ball $B_\varepsilon = \{x : |x| < \varepsilon\}$. Another way of saying that $V_\varepsilon$ vanishes outside $B_\varepsilon$ is to demand that

$$H_\varepsilon f = -\Delta f \tag{60}$$

whenever $f$ is zero on $B_\varepsilon$. If the range of the interaction is extremely short (as, e.g., with nuclear forces), it is convenient to put $\varepsilon = 0$. In this case it is not at all obvious how to interpret (59), but with (60) there is no problem; what we want is a self-adjoint operator $H$ such that

$$Hf = -\Delta f \tag{61}$$

if $f$ vanishes at the origin. But this is exactly the kind of operator we have constructed above, and hence $-\Delta_m$ is a model of zero-range interactions (or point interactions, as they are also called).
When we return to point interactions at greater length in the next chapter, we shall see how to interpret (59) for $\varepsilon = 0$ as

$$H = -\Delta - \lambda\delta \tag{62}$$

for suitable infinitesimals $\lambda$. Although it may not be totally clear from our exposition, the same idea is behind the construction above. To realize this one has to take a closer look at how the forms $F_m$ are defined. Using Lemma 5.6.1, we first constructed $E_m$ as the standard part of

where $\tilde\varphi_m$ is a suitable nonstandard approximation to $\varphi_m$. $F_m$ is the standard part of the form

It follows that

Carrying out the calculations from the beginning of this section in the hyperfinite setting, we get that the operator generating the corresponding nonstandard form is of the form

$$-\Delta + B, \tag{65}$$

where $\Delta$ is here a hyperfinite version of the Laplacian and $Bg(x) = m^2 g(x)$ when $\mathrm{st}(x) \neq 0$. But when $x \approx 0$, the singularity $\tilde\varphi_m$ has at the origin causes the relation $Bg(x) = m^2 g(x)$ to fail; in fact, $B$ has a singularity which turns (65) into a version of (62) for a suitable infinitesimal $\lambda$ depending on $m$. The calculations are rather messy, and we leave them to the reader. For more information about point interactions, see Chapter 6 and the forthcoming monograph by Albeverio et al. (1986).

There are applications of Dirichlet forms to other kinds of singular potentials as well, but we shall not discuss them here. Instead we turn our attention to the Markov processes generated by energy forms.

C. Energy Forms and Markov Processes
Recall that by Theorem 5.5.18, a hyperfinite Dirichlet form on a locally compact space generates a standard strong Markov process if it is regular in the sense of Definition 5.5.17.

5.6.5. PROPOSITION. Let $\nu$ be an internal measure on $Y$ such that ${}^\circ\nu({}^*K) < \infty$ for all compact sets $K$ and $0 < {}^\circ\nu({}^*O)$ for all open, nonempty sets $O$. Put $\varphi^2(x) = \nu(x)/\Delta x^d$, and assume that there is a closed set $N \subset \mathbb{R}^d$ of Lebesgue measure zero such that for all $x \in \mathbb{R}^d - N$, there is an open neighborhood $O \subset \mathbb{R}^d$ of $x$ with $\varphi^{-1} \in SL^2({}^*O \cap Y, \Delta x^d)$. Then the form

$$\mathcal{E}(f, g) = \frac{1}{2} \sum_{y \in Y} \sum_{e \in U} D_e f(y)\, D_e g(y)\, \nu(y)$$
is regular.

SKETCH OF PROOF. The definition of regularity consists of two parts. To show that the first part is satisfied, we must prove that all S-continuous functions of compact support can be approximated by internal functions of finite energy in the supremum norm. This follows immediately from the fact that $C_0^1$ is dense in $C_0$.

To prove that the second condition in the definition of regularity is satisfied, it suffices to show that there is no function $f \in \mathcal{D}[\mathcal{E}] \cap SL^2(Y)$ such that ${}^\circ\mathcal{E}(f, f) \neq 0$ but $\mathcal{E}(f, g) \approx 0$ for all quasi-continuous, S-bounded $g \in \mathcal{D}[\mathcal{E}]$. Assume that such an $f$ exists; we shall first show that it cannot be nearstandard [in the sense that it is a 2-lifting of a function in $L^2(\mathbb{R}^d, L(\nu) \circ \mathrm{st}^{-1})$]. If $f$ is nearstandard, there is a sequence $\{f_n\}$ of elements of $\mathcal{D}[\mathcal{E}]$ such that $\|f - f_n\| \to 0$ and each $D_e f_n$ is nearstandard. Since both $f$ and $f_n$ belong to $\mathcal{D}[\mathcal{E}]$, it follows that $\mathcal{E}(f - f_n, f - f_n) \to 0$. Hence for each $e \in U$

implying that $D_e f$ is nearstandard. But if $D_e f$ is nearstandard, we cannot have $\mathcal{E}(f, g) \approx 0$ for all quasi-continuous, S-bounded $g$, unless $\mathcal{E}(f, f) \approx 0$.

Assume now that $f$ is not nearstandard. Then there is an internal set $A$ of infinitesimal measure such that the associated quantity $\alpha$

is noninfinitesimal for some $e \in U$. We shall leave this estimate to the reader; it is trivial for $d = 1$ but takes work in higher dimensions. We can choose $A \subset {}^*O$ for one of the open neighborhoods $O$ mentioned in the proposition. By Hölder's inequality

Since $\int_A \varphi^{-2}\,\Delta x^d$ is infinitesimal and $\alpha$ is not, $\mathcal{E}(f, f)$ must be infinite. But this is impossible since we have assumed $f \in \mathcal{D}[\mathcal{E}]$.
The conditions of Proposition 5.6.5 guarantee that the nonstandard energy form $\mathcal{E}$ generates a standard Markov process, but unless the conditions of Lemma 5.6.1 are also satisfied, we do not have a standard characterization of this process. To get a better grasp of the relationship between the form and the process, let us return to Example 5.6.4 and compute the infinitesimal generator of the process in this case.

5.6.6. EXAMPLE. Let

$$\varphi_m(x) = \frac{e^{-m|x|}}{|x|}$$

as in Example 5.6.4. It is easy to check that

$$\nabla\varphi_m(x) = -\Bigl[m + \frac{1}{|x|}\Bigr]\varphi_m(x)\,u_r,$$

where $u_r = x/|x|$ is the unit radial vector. Integrating by parts we see that if $f, g \in C_0^2$, then

$$E_m(f, g) = \int \nabla f \cdot \nabla g \, \varphi_m^2 \, dx = \int f \Bigl(-\Delta g + 2\Bigl[m + \frac{1}{|x|}\Bigr]\nabla g \cdot u_r\Bigr)\varphi_m^2 \, dx.$$

Since

$$A = -\Delta + 2\Bigl[m + \frac{1}{|x|}\Bigr]u_r \cdot \nabla$$

is the infinitesimal generator of the stochastic differential equation

(68)

the Markov process generated by $E_m$ is a solution of (68).

As we have already indicated, the theory of Dirichlet forms is capable of treating equations with more severe singularities than the one occurring in (68); we refer the reader to Albeverio et al. (1977, 1980) and Oshima (1982) [see also Portenko (1979a,b) for an alternative approach]. In what sense the Markov process generated by an energy form solves the associated stochastic differential equation was studied in Albeverio et al. (1980, 1984b), where singular forms and processes were approximated by smooth ones.

We have promised to say a few words about the infinite-dimensional theory. On the nonstandard side everything is as before; we are working with a hyperfinite lattice
$Y$ of points $(k_1\,\Delta x, \ldots, k_d\,\Delta x)$ with $k_i \in {}^*\mathbb{Z}$ and $|k_i|$ bounded as before, an internal measure $\nu$ on $Y$, and an energy form

The only difference is that we now assume that $d \in {}^*\mathbb{N} - \mathbb{N}$. Let $H$ be the standard Hilbert space generated by the orthonormal basis $\{e_i\}_{i \in \mathbb{N}}$, where $e_i = (0, 0, \ldots, 1, \ldots, 0) \in Y$ with the one occurring in the $i$th coordinate. With the proper identifications, we obviously have

$$H \subset Y \subset {}^*H.$$

From the examples we considered in Sections 3.5 and 4.7, we know that it is unnatural to assume that $\nu$ is supported on the nearstandard sets in the Hilbert space topology. Instead we introduce a Banach space $B$ in which $H$ is densely and continuously embedded. Letting $\mathrm{st}: Y \to H$ and $\mathrm{st}_B: Y \to B$ denote the standard part maps in the $H$ and $B$ topologies, we shall assume that

$$\mu = L(\nu) \circ \mathrm{st}_B^{-1}$$

is a nonzero Radon measure on $B$. By the theory of Section 5.1, the hyperfinite form $\mathcal{E}$ induces a closed form $E$ on $L^2(B, \mu)$. The question is what we can say about this form from reasonable conditions on $B$ and $\mu$.

It is not our purpose to develop an extensive nonstandard theory here, but to get a feeling for what to expect, we shall take a brief look at the results which have been obtained by standard methods. For simplicity we shall concentrate on Kusuoka's (1982) work; the contributions by Albeverio and Høegh-Krohn (1977a,b) and Paclet (1978, 1979) are of a similar nature.

For sufficiently smooth functions $u \in L^2(B, \mu)$, let $Du : B \to H$ denote the Gâteaux derivative of $u$. Kusuoka defines a quadratic form on $L^2(B, \mu)$
by

and shows that $E_0$ is closable under reasonable conditions when $\mu$ is a probability measure. Adding more conditions he then shows that $E_0$ generates a strong Markov process on $B$. These additional conditions basically say that the embedding of $H$ into $B$ is compact, and that the dual $B^*$ of $B$ is a sufficiently large subspace of $H$.

The basic idea of the proof is as follows. Fukushima's existence theorem (the standard version of 5.5.18) only applies to forms defined on locally compact spaces, and does not cover $E_0$. Kusuoka uses Gelfand's representation theorem to find a locally compact, separable metric space $M$ in which $B$ is densely and continuously embedded. He interprets $E_0$ as a form on $L^2(M, \tilde\mu)$, where $\tilde\mu$ is the probability measure induced by $\mu$ through the embedding from $B$ to $M$, and applies Fukushima's existence theorem to find an associated process on $M$. The argument is completed by showing that $M - B$ has capacity zero and that the process thus lives on $B$.

Our hope is that by using the hyperfinite theory developed in this chapter, it should be possible to refine and extend the results mentioned above. Instead of applying the trick of replacing $B$ by a locally compact space $M$, the results of Sections 5.4 and 5.5 should enable us to work directly with $B$-valued processes. We have not checked whether the theory of Section 5.5 is strong enough to prove extensions of Kusuoka's theorems, or whether further refinements are needed, but this is certainly a promising area for future research.

Before we take our leave of the infinite-dimensional theory, there is one more question we would like to discuss. Assume that $\nu$ is a uniform measure on $Y$; then the process associated with $\mathcal{E}$ is a hyperfinite-dimensional random walk. If $B$ is the completion of $H$ with respect to a measurable norm, we know from Gross's theorem and Section 4.7 that the standard part of this random walk is a Brownian motion $x$ living on $B$.
The problem is that $\nu$ does not have a standard part on $B$, and that, in fact, there is no Dirichlet form on $B$ associated with $x$. A natural question is whether it is possible to use well-defined hyperfinite forms such as $\mathcal{E}$ to study processes such as $x$ which are not generated by standard forms. This would require an extension of the theory developed in this chapter, as we have always assumed that the underlying measure has a standard part.

The contents of this chapter can be characterized as probabilistic potential theory. We have, however, spent most of the time discussing very general existence problems of a probabilistic nature, and no time at all on potential theoretic topics of central importance such as boundary problems. It is too late to redress the balance now, but we would like to make the reader aware of Loeb's (1976, 1980, 1982) work.

Let $W$ be a locally compact Hausdorff space which is connected and locally connected, but not compact. Let $\mathcal{H}$ be a family of harmonic functions such that $(W, \mathcal{H})$ is a Brelot space where 1 is superharmonic. These assumptions are satisfied if $\mathcal{H}$ consists of $C^2$ solutions of an elliptic differential equation of the form

$$\sum_{i,j} a_{ij}\,\frac{\partial^2 u}{\partial x_i\,\partial x_j} + \sum_i b_i\,\frac{\partial u}{\partial x_i} + cu = 0$$

on a region of $\mathbb{R}^d$, where $(a_{ij})$ is positive definite, $c \le 0$, and $a_{ij}$, $b_i$, and $c$ are locally Lipschitz. Another example is given by the solutions to

$$\Delta u + cu = 0$$

on an open Riemann surface, where $c \ge 0$ is a smooth density.

Working in a nonstandard model, Loeb extends Wiener's generalized solution of the Dirichlet problem to arbitrary compactifications of the harmonic space $W$, and gives a Martin-Choquet integral representation of positive harmonic functions. Under the additional assumption that there exist a positive potential and a bounded, nonzero harmonic function on $W$, Loeb constructs an ideal boundary $\Delta$ such that the points of $\Delta$ correspond to nonnegative harmonic functions. This boundary supports the maximal representing measures (with respect to the Choquet ordering) for positive bounded and quasi-bounded harmonic functions, and, as opposed to Martin's boundary, almost all the points of $\Delta$ are regular for the Dirichlet problem.

These results are all contained in Loeb (1976). A modification of the methods of that paper also gives a simple construction of the maximal representing measures for positive harmonic functions on $W$ as the weak* limits of finite sums of point masses on $[0, +\infty]$. In this case $W$ can also be a Bauer space; see Loeb (1980, 1982).

REFERENCES
S. Albeverio and R. Høegh-Krohn (1976). Quasi-invariant measures, symmetric diffusion processes and quantum fields. In Les Méthodes Mathématiques de la Théorie Quantique des Champs, Ed. CNRS, Paris.
S. Albeverio and R. Høegh-Krohn (1977a). Dirichlet forms and diffusion processes on rigged Hilbert spaces. Z. Wahrsch. Verw. Gebiete 40.
S. Albeverio and R. Høegh-Krohn (1977b). Hunt processes and analytic potential theory on rigged Hilbert spaces. Ann. Inst. H. Poincaré Sect. B 13.
S. Albeverio and R. Høegh-Krohn (1981a). Stochastic methods in quantum field theory and hydrodynamics. Phys. Rep. 77.
S. Albeverio and R. Høegh-Krohn (1981b). Some Markov processes and Markov fields in quantum theory, group theory, hydrodynamics and C*-algebras. In (D. Williams, ed.), Stochastic Integrals. Springer-Verlag, Berlin and New York.
S. Albeverio and R. Høegh-Krohn (1982). Some remarks on Dirichlet forms and their applications to quantum mechanics and statistical mechanics. In (M. Fukushima, ed.), Functional Analysis in Markov Processes. Springer-Verlag, Berlin and New York.
S. Albeverio and R. Høegh-Krohn (1984). Diffusion fields, quantum fields, fields with values in Lie groups. In (M. Pinsky, ed.), Stochastic Analysis and Applications, pp. 1-97. Dekker, New York.
S. Albeverio, R. Høegh-Krohn, and L. Streit (1977). Energy forms, Hamiltonians, and distorted Brownian paths. J. Math. Phys. 18.
S. Albeverio, R. Høegh-Krohn, and L. Streit (1980). Regularization of Hamiltonians and processes. J. Math. Phys. 21.
S. Albeverio, M. Fukushima, W. Karwowski, and L. Streit (1981). Capacity and quantum mechanical tunneling. Comm. Math. Phys. 81.
S. Albeverio, Ph. Blanchard, and R. Høegh-Krohn (1983). A stochastic model for the orbits of planets and satellites: An interpretation of Titius-Bode law. Expo. Math. 4.
S. Albeverio, F. Gesztesy, W. Karwowski, and L. Streit (1984a). On the connection between Schrödinger and Dirichlet forms. J. Math. Phys. 26.
S. Albeverio, S. Kusuoka, and L. Streit (1984b). Convergence of Dirichlet forms and associated Schrödinger operators (preprint). Math. Inst., Ruhr Univ., Bochum. (To appear in J. Funct. Anal., 1986.)
S. Albeverio, Ph. Blanchard, and R. Høegh-Krohn (1984c). Newtonian diffusions and planets, with remark on nonstandard Dirichlet forms and polymers. In (A. Truman and D. Williams, eds.) Proceedings of the LMS Symposium on Stochastic Analysis and Applications (Swansea, 1983), pp. 1-24. Lect. Notes Math. 1094, Springer-Verlag, Berlin and New York.
S. Albeverio, Ph. Blanchard, F. Gesztesy, and L. Streit (1985a). Quantum mechanical low energy scattering in terms of diffusion processes. In (S. Albeverio, Ph. Combe, and M. Sirugue-Collin, eds.) Stochastic Aspects of Classical and Quantum Systems, pp. 207-227. Lect. Notes Math. 1109. Springer-Verlag, New York and Berlin.
S. Albeverio, Ph. Blanchard, and R. Høegh-Krohn (1985b). Diffusions sur une variété riemannienne: Barrières infranchissables et applications. In Colloque en l'honneur de Laurent Schwartz. Astérisque, 132.
S. Albeverio, R. Høegh-Krohn, and H. Holden (1985c). Markov processes on infinite dimensional spaces, Markov fields and Markov cosurfaces. In (L. Arnold and P. Kotelenez, eds.) Stochastic Space-Time Models, Limit Theorems, pp. 11-40. Reidel Publ., Dordrecht.
S. Albeverio, G. Casati, and D. Merlini, eds. (1986a). Stochastic Processes in Classical and Quantum Systems. Proceedings First International Ascona-Como Meeting. Lect. Notes Phys., Springer-Verlag, Berlin and New York.
S. Albeverio, F. Gesztesy, R. Høegh-Krohn, and H. Holden (1986b). Solvable Models in Quantum Mechanics (in preparation).
A. Beurling and J. Deny (1958). Espaces de Dirichlet. Acta Math. 99.
A. Beurling and J. Deny (1959). Dirichlet spaces. Proc. Nat. Acad. Sci. 45.
R. M. Blumenthal and R. K. Getoor (1968). Markov Processes and Potential Theory. Academic Press, New York and London.
E. Carlen (1984). Conservative diffusions. Comm. Math. Phys. 94.
R. Carmona (1979). Regularity properties of Schrödinger and Dirichlet semigroups. J. Funct. Anal. 33.
K. L. Chung (1960). Markov Chains with Stationary Transition Probabilities. Springer-Verlag, Berlin and New York.
J. Deny (1970). Méthodes Hilbertiennes et théorie du potentiel. In (M. Brelot, ed.), Potential Theory. Edizioni Cremonese, Roma.
E. B. Dynkin and A. A. Yushkevich (1969). Markov Processes. Plenum Press, New York.
M. Fukushima (1971). Dirichlet spaces and strong Markov processes. Trans. Amer. Math. Soc. 162.
M. Fukushima (1979). A decomposition of additive functionals of finite energy. Nagoya Math. J. 74.
M. Fukushima (1980). Dirichlet Forms and Markov Processes. North-Holland Publ., Amsterdam.
M. Fukushima (1985). Energy forms and diffusion processes. In (L. Streit, ed.), Mathematics + Physics; Lectures on Recent Results. World Scientific Publ. Co., Singapore.
J. Glimm and A. Jaffe (1981). Quantum Physics: a Functional Integral Point of View. Springer-Verlag, New York and Berlin.
F. Guerra (1981). Structural aspects of stochastic mechanics and stochastic field theory. In (C. De Witt-Morette and K. D. Elworthy, eds.), New stochastic methods in physics. Phys. Rep. 77.
F. Guerra and L. Morato (1983). Quantization of dynamical systems and stochastic control theory. Phys. Rev. D 27.
C. W. Henson (1979). Analytic sets, Baire sets and the standard part map. Can. J. Math. 31.
J. G. Hooton (1979). Dirichlet forms associated with hypercontractive semigroups. Trans. Amer. Math. Soc. 253.
H. J. Keisler (1984). An infinitesimal approach to stochastic analysis. Mem. Amer. Math. Soc. 297.
S. Kusuoka (1982). Dirichlet forms and diffusion processes on Banach spaces. J. Fac. Sci. Univ. Tokyo Sect. 1A Math. 29.
T. Lindstrøm (1986). Nonstandard energy forms and diffusion on manifolds and fractals. In (S. Albeverio, G. Casati, and D. Merlini, eds.) Stochastic Processes in Classical and Quantum Systems. Proc. 1st Int. Ascona-Como Meet., Lect. Notes Phys., Springer-Verlag, Berlin and New York.
P. A. Loeb (1976). Applications of nonstandard analysis to ideal boundaries in potential theory. Israel J. Math. 25.
P. A. Loeb (1980). A regular metrizable boundary for solutions of elliptic and parabolic differential equations. Math. Ann. 251.
P. A. Loeb (1982). A construction of representing measures for elliptic and parabolic differential equations. Math. Ann. 260.
E. Nelson (1967). Dynamical Theories of Brownian Motion. Princeton Univ. Press, Princeton, New Jersey.
E. Nelson (1985). Quantum Fluctuations. Princeton Univ. Press, Princeton, New Jersey.
Y. Oshima (1982). Some singular diffusion processes and their associated stochastic differential equations. Z. Wahrsch. Verw. Gebiete 59.
P. Paclet (1977/1978). Espaces de Dirichlet et capacités fonctionnelles sur triplets de Hilbert-Schmidt. In Sém. Krée, Exp. 5, 4e année, 1977-78, Université Pierre et Marie Curie, Paris.
Ph. Paclet (1979). Espaces de Dirichlet en dimension infinie. C.R. Ac. Sci., Paris, Sér. A 288.
N. I. Portenko (1979a). Diffusion processes with generalized drift coefficients. Theory Probab. Its Appl. (Eng. Transl.) 24.
N. I. Portenko (1979b). Stochastic differential equations with generalized drift vector. Theory Probab. Its Appl. (Eng. Transl.) 24.
M. Reed and B. Simon (1975). Methods of Modern Mathematical Physics II: Fourier Analysis, Self-Adjointness. Academic Press, New York.
M. Röckner and N. Wielens (1985). Dirichlet forms: closability and change of speed measure. In (S. Albeverio, ed.) Infinite Dimensional Analysis and Stochastic Processes. Res. Notes Math., Pitman, London.
S. Saks (1937). Theory of the Integral, Monograf. Mat. 7, Polska Akademia Nauk, Warszawa.
M. L. Silverstein (1974). Symmetric Markov Processes. Springer-Verlag, Berlin and New York.
M. L. Silverstein (1976). Boundary Theory for Symmetric Markov Processes. Springer-Verlag, Berlin and New York.
B. Simon (1979). Functional Integration and Quantum Physics. Academic Press, New York.
M. Takeda (1984). On the uniqueness of Markovian extensions of diffusion operators on infinite dimensional spaces, BiBoS (preprint). Bielefeld.
M. Tomisaki (1982). Dirichlet forms associated with direct product diffusion processes. In (M. Fukushima, ed.), Functional Analysis in Markov Processes, pp. 76-119. Springer-Verlag, Berlin and New York.
N. Wielens (1985). The essential self-adjointness of generalized Schrödinger operators. J. Funct. Anal. 61.
D. Williams (1979). Diffusions, Markov Processes, and Martingales, Vol. I, Foundations. Wiley, New York.
K. Yasue (1981). Stochastic calculus of variations. J. Funct. Anal. 41.
W. Zheng and P. A. Meyer (1984). Quelques résultats de "Mécanique Stochastique". In (J. Azéma, ed.) Sém. Probab. XVII. Lect. Notes Math., Springer-Verlag, Berlin and New York.
CHAPTER 6
TOPICS IN DIFFERENTIAL OPERATORS
This chapter contains a mixed bag of applications. In Section 6.1 we discuss a singular Sturm-Liouville problem. In Section 6.2 we develop a general theory for singular perturbations of non-negative operators, which we apply in Section 6.3 to point interactions and in Section 6.4 to perturbations by local time functionals, in particular to polymer models. In Section 6.5 we present a nonstandard approach to the Boltzmann equation, proving among other things an existence result in the space-inhomogeneous case. In Section 6.6 we make a few remarks on a hyperfinite version of the Feynman path integral; we explain that there is in general no associated Loeb measure, but point out how the internal path space measure can be used to solve the Schrödinger equation and to discuss the classical limit.

Unless otherwise specified, all linear spaces in this chapter are over the reals or hyperreals. Where the physical examples require a complex setting, the results extend easily from the real case.

6.1. A SINGULAR STURM-LIOUVILLE PROBLEM

As an introduction to a nonstandard treatment of singular perturbation theory we shall discuss a singular Sturm-Liouville problem.
Let $\mu$ be a finite non-negative Borel measure on $[0,1]$. We shall consider the eigenvalue problem

$$-Y''(x) + \mu Y(x) = \lambda Y(x), \qquad 0 \le x \le 1, \tag{1}$$

where $\lambda$ is a real parameter and with boundary conditions

$$Y(0) = Y(1) = 0. \tag{2}$$
Equation (1) as it stands does not make classical sense. But there are several ways to give a standard interpretation of the equation. First one can consider the associated quadratic form defined by

(3)

where $\varphi$ is a continuously differentiable function on $[0,1]$ satisfying (2) and $dm$ denotes the Lebesgue measure on $[0,1]$. We shall see that $A$ has a countable family of "generalized eigenfunctions" $\{Y_n\}$ which behave in a way similar to the eigenfunctions of the classical Sturm-Liouville problem. A second way of interpreting (1) is to pass to the associated integral equation
$$Y(x) = xY'(0) + \int_0^x (x - s)\,Y(s)\,d\mu - \lambda \int_0^x (x - s)\,Y(s)\,dm. \tag{4}$$
Standard methods tell us that (4) has a continuous solution $Y_\lambda$ on $[0,1]$ for every $\lambda$. But it seems difficult to get precise information, e.g., to decide for which $\lambda$ we have $Y_\lambda(1) = 0$. We shall see that the sequence $Y_n$, $\lambda_n$ "solving" (3) also answers our questions in connection with (4).

We shall obtain the solution via an excursion into nonstandard territory. In our original work [Albeverio et al. (1979b)] we converted the measure $\mu$ into a nonstandard smooth function by replacing it with $\delta_\varepsilon * \mu$, where $\delta_\varepsilon$ is a ${}^*C^\infty$ delta function with support in $[-\varepsilon, \varepsilon]$, $\varepsilon$ being a positive infinitesimal. By transfer, 1.2.4, we could apply classical Sturm-Liouville theory to the nonstandard equation and in this way also obtain new information in the standard case.

There is a more radical nonstandard approach to the problem, namely to replace the standard differential equation (1) by a nonstandard, hyperfinite difference equation. This was done for the classical Sturm-Liouville problem by MacDonald (1976) and extended to the more general context by Birkeland (1980). We shall give a brief exposition of Birkeland's treatment.

The key idea is to replace the system (1) and (2) by the difference equation

$$N^2\Delta^2 y(k) + (\lambda - q(k))\,y(k) = 0, \qquad 0 < k < N, \tag{5}$$

with the boundary conditions

$$y(0) = y(N) = 0, \tag{6}$$

where $N$ is some large, but still finite, integer and $q = (q(0), \ldots, q(N))$ is a given vector such that $q(k) \ge 0$ for all $0 \le k \le N$. The difference operators are defined by

$$\Delta y(k) = y(k+1) - y(k), \qquad \Delta^2 y(k) = \Delta(\Delta y(k-1)) = y(k+1) - 2y(k) + y(k-1).$$
The system (5) and (6) is nothing but a system of $N - 1$ linear equations for the $N - 1$ unknowns $y(1), \ldots, y(N-1)$. The corresponding matrix is of the form $A + \lambda I$, where $I$ is the identity matrix and where

$$A_{ij} = 0 \ \text{ if } |i - j| > 1, \qquad A_{ij} = N^2 \ \text{ if } |i - j| = 1, \qquad A_{ii} = -2N^2 - q(i).$$

Since $A$ is a symmetric matrix it follows from elementary linear algebra that (5), (6) has $N - 1$ pairwise orthogonal real eigenvectors $y_1, \ldots, y_{N-1}$ and that the corresponding eigenvalues $\lambda_1, \ldots, \lambda_{N-1}$ are real and simple. This is standard and elementary. What is still elementary, but no longer so trivial, is to get sharp and uniform bounds on the eigenvectors and eigenvalues in order to obtain useful information about the system (1), (2) from the discrete problem (5), (6). We have the following result from Birkeland (1980).
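For a finite $N$ the discrete problem (5), (6) can be explored directly. The sketch below is only an illustration and is not Birkeland's method: it finds the lowest eigenvalue by shooting, i.e. by running the recurrence (5) from $y(0) = 0$, $y(1) = 1$ and bisecting on $\lambda$ until $y(N) = 0$. The function names, the bracket $[0, 30]$, the choice $N = 50$, and the discretization of a unit point mass at $x = 1/2$ as $q(N/2) = N$ (so that $\|q\|_1/N = 1$) are all assumptions made for the example. For $q = 0$ the result is compared with the closed form $4N^2\sin^2(\pi/2N)$, which follows from inserting $y(k) = \sin(\pi k/N)$ into (5).

```python
import math

def shoot(lam, q, N):
    """Run recurrence (5) from y(0) = 0, y(1) = 1 and return y(N)."""
    y_prev, y = 0.0, 1.0
    for k in range(1, N):
        # (5) solved for y(k+1): y(k+1) = 2 y(k) - y(k-1) - (lam - q(k)) y(k) / N^2
        y_prev, y = y, 2.0 * y - y_prev - (lam - q[k]) * y / N**2
    return y

def lowest_eigenvalue(q, N, lo=0.0, hi=30.0, iterations=80):
    """Bisect on lam for the first zero of y(N); [lo, hi] is an assumed bracket."""
    f_lo = shoot(lo, q, N)
    if f_lo * shoot(hi, q, N) >= 0:
        raise ValueError("bracket does not contain a sign change")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f_lo * shoot(mid, q, N) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

N = 50

# Classical case q = 0: compare with the exact discrete eigenvalue.
q_free = [0.0] * (N + 1)
lam_free = lowest_eigenvalue(q_free, N)
exact_free = 4 * N**2 * math.sin(math.pi / (2 * N)) ** 2

# Singular case: a unit point mass at x = 1/2, discretized as q(N/2) = N.
q_point = [0.0] * (N + 1)
q_point[N // 2] = float(N)
lam_point = lowest_eigenvalue(q_point, N)
```

Adding the non-negative point mass raises the lowest eigenvalue, as one expects from the quadratic form, while it stays below the second eigenvalue of the unperturbed problem.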
6.1.1. PROPOSITION. Let $\lambda_1 < \lambda_2 < \cdots < \lambda_{N-1}$ be the eigenvalues for the problem (5), (6) and let $y_1, \ldots, y_{N-1}$ be the corresponding eigenvectors normalized by setting $\|y_j\|_2^2 = N$. Define

$$m = \|q\|_1 / N$$

and

$$b = 2(1 + 12m).$$

Then for $\lambda_j \le 3N^2$, in particular if $j \le N/2$, we have the bounds

$$\|y_j\|_\infty \le b, \tag{7}$$

$$N \cdot \|\Delta y_j\|_\infty \le 2 \cdot \lambda_j^{1/2} \cdot b. \tag{8}$$
If $3N^2 < \lambda_j$ we have an inequality

(7') $\|y_j\|_\infty \le b'$,

where $b' = 2(1 + 12m')$ and $m' = \|q\|_\infty - m$. We should, perhaps, add the remark that the vector norms we use are the standard ones, $\|u\|_1 = \sum |u(k)|$, $\|u\|_2 = (\sum |u(k)|^2)^{1/2}$, and $\|u\|_\infty = \max |u(k)|$.

We give some brief remarks on the proof; the reader may consult Birkeland (1980) for full details on the various calculations involved. The first trick is to use the following identity, well known from the theory of ordinary differential equations. Let $y$ and $z$ be the solutions of the difference equations

(9) $N^2 \Delta^2 y(k) + (\lambda - q(k))\,y(k) = 0$

and

(10) $N^2 \Delta^2 z(k) + \lambda \cdot z(k) = 0$,

respectively, and suppose that $y(0) = z(0) = 0$. Then we have the identity

$y(k)z(1) = y(1)z(k) + N^{-2} \sum_{i=1}^{k-1} q(i)\,y(i)\,z(k-i)$.
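The identity relating solutions of (9) and (10) can be checked numerically for a genuinely finite $N$. The following is a minimal sketch in plain Python; the values of `N`, `lam`, the potential `q`, the starting values, and the helper name `solve_recurrence` are assumptions chosen only for illustration and do not come from Birkeland (1980).

```python
# Numerical check of the discrete variation-of-constants identity
#   y(k) z(1) = y(1) z(k) + N^{-2} * sum_{i=1}^{k-1} q(i) y(i) z(k-i)
# for solutions of (9) and (10) with y(0) = z(0) = 0.
# N, lam and q below are arbitrary illustrative values (assumptions).

def solve_recurrence(N, lam, q, first_value):
    """Solve N^2 * D2 y(k) + (lam - q(k)) y(k) = 0 with y(0) = 0, y(1) = first_value."""
    y = [0.0, first_value]
    for k in range(1, N):
        # y(k+1) - 2 y(k) + y(k-1) = -(lam - q(k)) y(k) / N^2
        y.append(2.0 * y[k] - y[k - 1] - (lam - q[k]) * y[k] / N**2)
    return y

N = 40
lam = 25.0
q = [0.5 * (k % 7) for k in range(N + 1)]        # some non-negative potential
zero = [0.0] * (N + 1)

y = solve_recurrence(N, lam, q, first_value=0.3)     # solves (9)
z = solve_recurrence(N, lam, zero, first_value=1.0)  # solves (10)

def identity_gap(k):
    """Absolute difference between the two sides of the identity at index k."""
    lhs = y[k] * z[1]
    rhs = y[1] * z[k] + sum(q[i] * y[i] * z[k - i] for i in range(1, k)) / N**2
    return abs(lhs - rhs)

max_gap = max(identity_gap(k) for k in range(1, N + 1))
print(max_gap)   # should be at the level of floating-point rounding
```

The check only uses the two recurrences and the common initial condition at $k = 0$, which is all the identity requires.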
To prove 6.1.1 we must obtain sufficiently strong estimates for the sum above; i.e., we want to obtain an inequality of the form

where $P$ is some real number. This inequality can be combined with explicit information about the eigenvectors and eigenvalues of (10) to obtain an estimate of the form

(12) $\|y\|_\infty \le 2(1 + 3P\lambda^{-1/2})\,\|y\|_2\,N^{-1/2}$.

It remains to choose a suitable $P$. If we were only interested in the classical Sturm-Liouville problem, we could get away with something less than (7) and (8). In fact, we could use the Cauchy-Schwarz inequality,

$\left|\sum_{i=1}^{k} q(i)\,y(i)\,z(k-i)\right| \le \|q\|_2\,\|y\|_2\,\|z\|_\infty$,

to choose $P = \|q\|_2 N^{-1/2}$, or even weaker $P = \|q\|_\infty$, to satisfy (11) and to obtain a sufficiently strong bound via (12). In the singular case we need better inequalities, hence a more delicate choice of $P$.
A somewhat lengthy analysis (Birkeland, 1980) shows that (11) is true with the choice $P = 4m\lambda^{1/2}$. This value of $P$ in (12) at once implies (7), recalling the normalization $\|y_j\|_2^2 = N$. The bound in (8) requires some further calculations; see Birkeland (1980) for details.

To complete the nonstandard program we need an analog of the classical Sturm-Liouville theory for difference equations, which is elementary but somewhat tedious to supply. Let us touch on one point: we need to estimate the number of zeros of a vector $y = (y(k))$. This requires the right definition of zero point; it may happen that for some $k$, both $y(k-1)$ and $y(k) \ne 0$ but $y(k-1) \cdot y(k) \le 0$. We use linear interpolation: if $y(k-1)\,y(k) \le 0$ and $y(k) \ne 0$, then the real number
is called a zero point or node for the vector $y$. With this notion we have the following result:

6.1.2. PROPOSITION. Let $y_j$ and $\lambda_j$ be the $j$th eigenvector and eigenvalue for the system (5), (6).

(i) $y_j$ has exactly $j + 1$ zero points in the closed interval $[0, N]$, and between two zero points for $y_j$ there is a zero point for $y_{j+1}$.
(ii) $\lambda_j$ satisfy the following inequality:

$(\pi^2 j^2)/6 \le \sigma_j \le \lambda_j \le (\sigma_j^{1/2} + m)^2$,

where $\sigma_j = 2N^2(1 - \cos(j\pi/N))$ is the $j$th eigenvalue of equation (10) with boundary conditions $z(0) = z(N) = 0$.

The proof is given in detail in Birkeland (1980). We remark that Birkeland (1980) also contains more explicit information about the system (5), (6).

We have touched upon a few facts needed to make the transition from (5) and (6) to the system (1) and (2). The time has come to choose a hyperfinite integer $N \in {}^*\mathbb{N} - \mathbb{N}$. We want to represent the measure $\mu$ by a vector $q \in {}^*\mathbb{R}^{N+1}$. We need a little bit of care, but it is possible to show that given any $P, N \in {}^*\mathbb{N} - \mathbb{N}$ such that $P/N \approx 0$, there is a vector $q$ such that for every continuous function $\varphi$ on $[0, 1]$

and such that $0 \le q(k) \le P$.
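For a genuinely finite $N$ the discrete problem (5), (6) is an ordinary symmetric eigenvalue problem, so the bounds in Propositions 6.1.1 and 6.1.2(ii) can be explored numerically. The following is a minimal sketch in Python with NumPy, not part of Birkeland's argument. The value `N = 200`, the particular `q` (a constant density plus one spike of total weight 1, with $q(0) = q(N) = 0$) and all variable names are assumptions made for illustration. The eigenvalues $\lambda_j$ are computed as those of $-N^2\Delta^2 + q$, in agreement with the difference equation (9).

```python
import numpy as np

# Finite-N sketch of the discrete problem (5), (6) and of the bounds in
# Propositions 6.1.1 and 6.1.2(ii). N and q are illustrative assumptions.
N = 200
q = np.full(N - 1, 0.5)                # q(1), ..., q(N-1): constant density part
q[N // 2 - 1] += 1.0 * N               # spike at k = N/2 carrying total weight 1

# Matrix of -N^2 * Delta^2 + q acting on (y(1), ..., y(N-1)), with y(0) = y(N) = 0.
minus_A = ((2.0 * N**2) * np.eye(N - 1) + np.diag(q)
           - N**2 * (np.eye(N - 1, k=1) + np.eye(N - 1, k=-1)))

lam, vecs = np.linalg.eigh(minus_A)    # ascending eigenvalues, orthonormal columns
y = vecs * np.sqrt(N)                  # normalize so that ||y_j||_2^2 = N

j = np.arange(1, N)
sigma = 2.0 * N**2 * (1.0 - np.cos(j * np.pi / N))   # eigenvalues of (10)
m = q.sum() / N                        # m = ||q||_1 / N
b = 2.0 * (1.0 + 12.0 * m)

# Proposition 6.1.2(ii): pi^2 j^2 / 6 <= sigma_j <= lam_j <= (sigma_j^{1/2} + m)^2
lower_ok = np.all(np.pi**2 * j**2 / 6 <= sigma + 1e-9) and np.all(sigma <= lam + 1e-6)
upper_ok = np.all(lam <= (np.sqrt(sigma) + m) ** 2 + 1e-6)

# Proposition 6.1.1, bounds (7) and (8), for the eigenvalues with lam_j <= 3 N^2
small = lam <= 3.0 * N**2
padded = np.vstack([np.zeros(N - 1), y, np.zeros(N - 1)])   # add y(0) = y(N) = 0
sup_y = np.abs(padded).max(axis=0)
sup_dy = np.abs(np.diff(padded, axis=0)).max(axis=0)        # max_k |Delta y_j(k)|
bound7_ok = np.all(sup_y[small] <= b)
bound8_ok = np.all(N * sup_dy[small] <= 2.0 * np.sqrt(lam[small]) * b)

print(lower_ok, upper_ok, bound7_ok, bound8_ok)
```

Such an experiment is of course no substitute for the uniform estimates; it only illustrates that the constants depend on $q$ through $m = \|q\|_1/N$ and not through the size of the individual entries $q(k)$.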
With this choice of $q$ consider the problems (5) and (6). By transfer, 1.2.4, we still have Propositions 6.1.1 and 6.1.2. In particular, it follows from the bound in (ii) of 6.1.2 that if $j$ is finite, then $\mathrm{st}(\lambda_j)$ exists and satisfies

(13) $\pi^2 j^2 \le \mathrm{st}(\lambda_j) \le (\pi j + \mu([0,1]))^2$,

since $\mu([0,1]) = \mathrm{st}(N^{-1} \sum q(k)) = \mathrm{st}(N^{-1} \cdot \|q\|_1)$. Next, from the inequalities in 6.1.1 we obtain

$|y_j(m) - y_j(l)| = \left|\sum_{k=l}^{m-1} \Delta y_j(k)\right| \le 2 \cdot \lambda_j^{1/2} \cdot b \cdot \frac{m - l}{N}$.
If $j$ is finite this implies that $\mathrm{st}(y_j(m)) = \mathrm{st}(y_j(l))$ whenever $(m - l)/N$ is infinitesimal. This means that the functions $Y_j(x) = \mathrm{st}(y_j(k))$, where $x = \mathrm{st}(k/N)$, are well defined. $Y_j$ is continuous, but not necessarily differentiable. But if $Y_j''(x)$ existed, then $Y_j''(x) = \mathrm{st}(N^2 \Delta^2 y_j(k))$, where $x = \mathrm{st}(k/N)$. This is the intuition connecting (5) and (6) and (1) and (2). In the general case we must settle for something less. By a suitable double summation we derive from (5) the equation

(15) $y_j(n) - y_j(l) = (n - l)\,\Delta y_j(l) + N^{-2} \sum_{k=l+1}^{n} (n - k)\,y_j(k)\,q(k) - \lambda N^{-2} \sum_{k=l+1}^{n} (n - k)\,y_j(k)$.
If $j$ is finite, we may take standard parts. This leads to the integral equation

(16) $Y_j(x) - Y_j(x_0) = (x - x_0) \cdot K + \int_{x_0}^{x} (x - t)\,Y_j(t)\,d\mu - \lambda \int_{x_0}^{x} (x - t)\,Y_j(t)\,dt$,

where $x = \mathrm{st}(n/N)$, $x_0 = \mathrm{st}(l/N)$, $t = \mathrm{st}(k/N)$, and $K = \mathrm{st}(N \Delta y_j(l))$. We are thus back to (4). It is also not difficult to extract (3) and to prove the following theorem from the corresponding results about the hyperfinite system (5), (6).

6.1.3. THEOREM. Let $\mu$ be a finite Borel measure on $[0,1]$ and define a quadratic form $A$ on the space $C_0^1[0,1]$ of continuously differentiable functions $\varphi$ on $[0,1]$ that satisfy $\varphi(0) = \varphi(1) = 0$ by

$A\varphi = \int_0^1 (\varphi')^2\,dm + \int_0^1 \varphi^2\,d\mu$.

Let $M = \mu([0,1])$ and $B = 2(1 + 12M)$. There exists a sequence $\{\tau_j\}$ of real numbers and a sequence $\{Y_j\}$ in $C_0^1[0,1]$ such that

(a) the following inequalities hold, $0 \le x \le 1$, $1 \le j < \infty$: $\pi^2 j^2 \le \tau_j \le (\pi j + M)^2$, $|Y_j(x)| \le B$;
(b) $\{Y_j\}$ is an orthonormal and complete sequence in $L^2[0,1]$;
(c) if $\varphi$ is twice continuously differentiable on $[0,1]$ and $\varphi(0) = \varphi(1) = 0$, then its orthogonal expansion in terms of $\{Y_j\}$ converges uniformly to $\varphi$;
(d) if $\varphi$ is continuously differentiable on $[0,1]$, $\varphi(0) = \varphi(1) = 0$, and $\sum_{j=1}^{\infty} d_j Y_j$ is its expansion, then $A\varphi = \sum_{j=1}^{\infty} \tau_j d_j^2$;
(e) the $Y_j$ are solutions of the integral equation (4) with $\lambda = \tau_j$;
(f) $Y_j$ has exactly $j + 1$ zeros in $[0,1]$, and between two zeros for $Y_j$ there is a zero for $Y_{j+1}$; and
(g) if $\mu(\{x\}) = 0$ or $Y_j(x) = 0$, $Y_j'(x)$ exists and is continuous at $x$; further, $|Y_j'(x)| \le 2B\tau_j^{1/2}$. If $\mu(\{x\})\,Y_j(x) \ne 0$, then $Y_j'$ has a jump discontinuity at $x$ of size $\mu(\{x\})\,Y_j(x)$. On intervals $I \subset [0,1]$ where $\mu$ is of the form $d\mu = g\,dm$, $g \in L^1(I)$, $Y_j'$ is absolutely continuous, and $Y_j$ satisfies $-Y_j'' + (g - \tau_j)Y_j = 0$ almost everywhere.
We have indicated a part of the proof of the theorem and round off our discussion by proving (b) and (c). First we remark that the orthogonality properties of the eigenvectors $y_j$ at once imply that the functions $\{Y_j\}$ are orthonormal over $[0,1]$. To prove the completeness part of (b) as well as property (c) we consider a twice continuously differentiable function $\Phi$ on $[0,1]$, with $\Phi(0) = \Phi(1) = 0$. Define for $0 \le k \le N$, $\varphi(k) = {}^*\Phi(k/N)$. Standard linear algebra and transfer tell us that

(17) $\varphi(k) = N^{-1} \sum_{j=1}^{N-1} c_j\,y_j(k)$,

where

$c_j = (\varphi, y_j) = \sum_{k=0}^{N} \varphi(k)\,y_j(k)$.

For finite $j$ we have

$d_j = \mathrm{st}(c_j/N) = \mathrm{st}\left(N^{-1} \cdot \sum \varphi(k)\,y_j(k)\right) = \int_0^1 \Phi\,Y_j\,dm$.
It follows that for any positive integer $M < \infty$

$\mathrm{st}\left(N^{-1} \sum_{j=1}^{M} c_j\,y_j(k)\right) = \sum_{j=1}^{M} d_j\,Y_j(\mathrm{st}(k/N))$.

To prove completeness it will thus be sufficient to control the tail part of the expansion (17); i.e., we must show that for any positive standard $\varepsilon$ there is an integer $M_\varepsilon < \infty$ such that

(18) $N^{-1} \sum_{j=M_\varepsilon}^{N-1} |c_j\,y_j(k)| < \varepsilon$, $\quad 0 \le k \le N$.
For $y_j$ we have inequalities from Proposition 6.1.1. We need to prove a "good" inequality for $c_j$. We calculate

(19) $c_j = (\varphi, y_j) = (\varphi, \lambda_j^{-1}(y_j q - N^2 \Delta^2 y_j))$.

Here we have used the fact that $y_j$, $\lambda_j$ is a solution of the system (5), (6). From our assumptions on $\Phi$ we have a finite real $p$ such that

$|\varphi(k)| \le p$ and $N^2 |\Delta^2 \varphi(k)| \le p$, $\quad 0 \le k \le N$.
This gives the following bounds in (19):

and, using summation by parts two times,

These estimates combined with (19) give the following inequality for $c_j$:

$|c_j| \le \lambda_j^{-1}\,p\,\|y_j\|_\infty\,(\|q\|_1 + N)$.

This implies the following bound for the sum in (18):

From Proposition 6.1.2 we know that $\pi^2 j^2/6 \le \lambda_j$ for all $j$, and from Proposition 6.1.1 we have bounds for $\|y_j\|_\infty$. This applied to (20) gives

Since $b$ is finite, ${}^\circ b = 2(1 + 12\mu([0,1]))$, the first sum can be made small by choosing $M_\varepsilon < \infty$ large enough. To complete the proof we need to show
that the last sum is infinitesimal. We have the following bound: 12 (b’)’
From 6.1.1 we easily see that $b' \le 24(\|q\|_\infty + 1)$. We can now use our freedom to choose a $P \in {}^*\mathbb{N} - \mathbb{N}$ satisfying $P/N \approx 0$ and $\|q\|_\infty \le P$ such that $(b')^2/N$ will be infinitesimal. And, indeed, any hyperinteger $P$ less than $N^{1/4}$ will do.

This ends our remarks on the proof of Theorem 6.1.3. We have focused attention on those aspects that are novel and interesting from the point of view of nonstandard methodology. The proofs of the remaining parts follow standard patterns [see Birkeland (1980) and MacDonald (1976)].
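The two finite-dimensional facts used in the proof of (b) and (c), namely the expansion of $\varphi$ in the eigenvectors $y_j$ and the estimate $|c_j| \le \lambda_j^{-1}\,p\,\|y_j\|_\infty(\|q\|_1 + N)$, can be checked for a concrete finite $N$. The following Python/NumPy sketch is only an illustration: `N`, the vector `q`, and the test function `phi` are assumptions, and the eigenvalues are again taken to be those of $-N^2\Delta^2 + q$.

```python
import numpy as np

# Finite-N sketch of the eigenvector expansion of phi and of the estimate
#   |c_j| <= p * ||y_j||_inf * (||q||_1 + N) / lam_j
# used in the proof. N, q and the test function are illustrative assumptions.
N = 120
x = np.arange(0, N + 1) / N
q = np.zeros(N + 1)
q[1:N] = 2.0                           # constant density part
q[N // 3] += 0.7 * N                   # spike at k = N/3 carrying total weight 0.7

minus_A = ((2.0 * N**2) * np.eye(N - 1) + np.diag(q[1:N])
           - N**2 * (np.eye(N - 1, k=1) + np.eye(N - 1, k=-1)))
lam, vecs = np.linalg.eigh(minus_A)

y = np.zeros((N + 1, N - 1))           # column j-1 holds y_j(0), ..., y_j(N)
y[1:N, :] = vecs * np.sqrt(N)          # ||y_j||_2^2 = N, y_j(0) = y_j(N) = 0

phi = x * (1.0 - x) * np.exp(x)        # samples of a smooth function vanishing at 0 and 1
c = y.T @ phi                          # c_j = (phi, y_j)
expansion_error = np.abs(phi - (y @ c) / N).max()   # phi(k) = N^{-1} sum_j c_j y_j(k)

d2phi = N**2 * (phi[2:] - 2.0 * phi[1:-1] + phi[:-2])   # N^2 * Delta^2 phi(k)
p = max(np.abs(phi).max(), np.abs(d2phi).max())
bound = p * np.abs(y).max(axis=0) * (q.sum() + N) / lam
coeff_ok = np.all(np.abs(c) <= bound + 1e-9)

print(expansion_error, coeff_ok)
```

The factor $\lambda_j^{-1}$ in the coefficient bound is what makes the tail of the expansion summable once $\lambda_j \ge \pi^2 j^2/6$ is used.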
6.1.4. REMARK. The results of this section have been discussed from a standard point of view, i.e., by using distribution theory and approximation techniques, in Persson (1981, 1984).
Is there a lesson to be learned from the nonstandard proof? We think so: replace the continuous by a hyperfinite discrete problem. Study the genuinely finite version of this problem and prove sharp and uniform inequalities. Then by transfer and standard parts you have a solution to the continuous problem. This is a theme which we shall meet more than once in applicable nonstandard analysis. And it does give nonstandard methods a certain concreteness and constructivity.

6.2. SINGULAR PERTURBATIONS OF NON-NEGATIVE OPERATORS
Let us consider a particle of mass $m$ moving in a potential $V$. In classical mechanics the trajectory of the particle is determined by Newton's equation

(1) $\frac{d^2x}{dt^2} = -\frac{1}{m}\nabla V(x)$

and the initial conditions

(2) $x(0) = x_0$,
(3) $\frac{dx}{dt}(0) = v_0$,
where $x(t)$ is the position at time $t$. In quantum mechanics this deterministic description is abandoned and all that is possible is a formulation in probabilistic terms. The probabilistic information is coded up in a complex-valued function $\psi(x, t)$ of space and time called the wave function. Knowledge of the particle's physical behavior can be obtained by certain mathematical operations on the wave function; e.g., the probability that the particle is in a subset $A$ of $\mathbb{R}^3$ at time $t$ is given by

$\int_A |\psi(x, t)|^2\,dx$.
Although the exact way the particle behaves cannot be predicted, the evolution of the probabilistic law governing it is deterministic and given by the Schrödinger equation

(4) $i\hbar\,\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\Delta\psi + V\psi$,

where $\hbar$ is Planck's constant divided by $2\pi$, and $\Delta = \sum_{i=1}^{3} \partial^2/\partial x_i^2$ is the Laplace operator. This equation plays the same central role in quantum mechanics as (1) does in classical mechanics.

To understand the Schrödinger equation, we must first get a good grasp on the operators occurring on its right-hand side. Since the constant $\hbar^2/2m$ is of no mathematical significance, we shall study operators of the form

(5) $H = -\Delta + V$,
where $V$ acts by multiplication. In order to get a reasonable mathematical theory we must interpret $H$ as a self-adjoint operator on $L^2(\mathbb{R}^3)$; how this can be done and what consequences follow are well known for a large class of functions $V$ [see, e.g., Kato (1976) and Simon (1982)].

The purpose of this and the next two sections is to extend the theory to some cases of considerable physical interest where the potential is no longer given by a function. The examples we have in mind have in common that the potential vanishes outside a set of measure zero. One example is point interactions, where the force is concentrated at a single point; for a number of purposes this is a good and convenient model for extremely short-range interactions, e.g., nuclear forces [see Albeverio et al. (1984a, 1986b) and Sections 5.6 and 6.3]. Lattices of point potentials have been used successfully to simulate crystals [Grossmann et al. (1980a, b); Høegh-Krohn et al. (1985); Albeverio et al. (1986b)]. In another example the potential is concentrated on a Brownian path modeling a polymer molecule; as we shall explain in Sections 6.4 and 7.5, the four-dimensional version of this model is important in quantum field theory.

Since the potentials vanish outside sets of measure zero, the corresponding Schrödinger operators cannot be of the form $-\Delta + V$ where $V$ is a function; we must perturb by more singular objects. Such perturbations are most easily described by quadratic forms. Recall that there is a one-to-one correspondence between non-negative self-adjoint operators $A$ and non-negative closed forms $E$ given by

(6) $E(f, g) = (A^{1/2}f, A^{1/2}g)$.
A perturbation of $A$ concentrated on a set $C$ induces a perturbed form

(7) $\tilde E(f, g) = (A^{1/2}f, A^{1/2}g) + \int_C V f g\,dx$.

If, as above, $C$ has measure zero, the last integral is zero and there is no perturbation. However, it is easy now to see how this can be fixed; just replace the function $V$ by a measure $\rho$ supported on $C$:

(8) $\tilde E(f, g) = (A^{1/2}f, A^{1/2}g) + \int_C f g\,d\rho$.

For technical reasons it is convenient to split $\rho$ into two parts: a probability measure $\tilde\rho$ on $C$ and a function $(-\tilde\lambda)$. Hence

(9) $\tilde E(f, g) = (A^{1/2}f, A^{1/2}g) - \int_C \tilde\lambda f g\,d\tilde\rho$.

The idea is to let the self-adjoint operator generated by $\tilde E$ be the desired perturbation of $A$. The problem, of course, is that $\tilde E$ need not generate a self-adjoint operator. Since $C$ has measure zero, the last term in (9) only makes sense for a subset of the domain of $A^{1/2}$, namely those functions that are continuous in a neighborhood of $C$. The question is whether $\tilde E$ can be extended to a closed form. Closability questions of this kind are often very intricate, but in Section 5.1 we found a general method for dealing with them; since the standard part of a hyperfinite form is always closed, all we have to do is find a hyperfinite representation of $\tilde E$.

The strategy is as follows: we shall assume that the underlying space $X$ is a Hausdorff space, that $C$ is a closed subset of $X$, and that $m$ and $\tilde\rho$ are Radon measures on $X$ and $C$, respectively, with $m(C) = 0$. We also assume that $m(K) < \infty$ for all compact sets $K$. Given a closed form $E$ on $L^2(X, m)$, let $\mathcal{E}$ be a hyperfinite representation of it defined on $L^2(Y, \mu)$ as in Corollary 5.2.2. Let $B$ be an internal subset of $Y$ with $C = \mathrm{st}(B)$, and let $\rho$ be an internal probability measure on $B$ such that $\tilde\rho = L(\rho) \circ \mathrm{st}^{-1}$. If $\lambda : B \to {}^*\mathbb{R}$ is internal, the nonstandard counterpart of $\tilde E$ is

(10) $\tilde{\mathcal{E}}(f, g) = \mathcal{E}(f, g) - \int \lambda f g\,d\rho$.

If $L$ is the operator defining $\mathcal{E}$, the operator generated by $\tilde{\mathcal{E}}$ is given by

There are three things to be done: we must show that under reasonable conditions and a suitable choice of $\lambda$ the form $\tilde{\mathcal{E}}$ is S-bounded from below
(such that the theory of Section 5.1 applies); that it differs from $\mathcal{E}$ in standard part; and that the perturbation is carried on $C$. In this section we shall study these problems in a general setting, while in the next two sections we shall take a look at what happens in the examples we mentioned above: potentials concentrated in points and along Brownian paths. The most surprising discovery we shall make concerns the choice of $\lambda$; we shall find examples where it is necessary to choose $\lambda$ infinitesimal, but where the standard form induced by $\tilde{\mathcal{E}}$ still is different from $E$. Hence there are perturbations of $E$ which cannot standardly be expressed by (9), but which nevertheless are induced by hyperfinite forms as in (10). This happens for point interactions in $\mathbb{R}^2$ and $\mathbb{R}^3$ and perturbations along Brownian paths in $\mathbb{R}^4$ and $\mathbb{R}^5$.
A. The Computation
We can now begin the work. Assume that $H$ is given by (11), and that $L$, $\lambda$, $\rho$, and $\mu$ all are non-negative. Our main tool will be the relationship between forms and resolvents established in Section 5.1, and we set out by computing the resolvent of $H$. If $G_\alpha = (L - \alpha)^{-1}$, we get

provided, of course, that the series on the right converges. Since $\|G_\alpha\| \le -1/\alpha$, there is a $z_1 \in {}^*\mathbb{R}_-$ such that it does converge when $\alpha < z_1$. There is no a priori reason why this $z_1$ should be finite, but we are not going to let this stop us; we just assume $\alpha < z_1$ and continue our calculations. Applying (12) to a function $f$, we get
where $G_\alpha(x, y)$ and $(H - \alpha)^{-1}(x, y)$ are the kernels (or matrices, if you want) of $(L - \alpha)^{-1}$ and $(H - \alpha)^{-1}$ with respect to $\mu$. Since an integral $\int \cdots d\mu(x_i)$ is in fact a hyperfinite sum $\sum \cdots \mu(x_i)$ (and similarly for integrals with respect to $\rho$), we get from (13):

To get (14) on a more transparent form, we introduce three new operators. First, let $\hat G_\alpha$ be the internal operator from $L^2(Y, \mu)$ to $L^2(B, \rho)$ given by

(15) $(\hat G_\alpha f)(x) = \int G_\alpha(x, y)\,f(y)\,d\mu(y)$,

and let $\hat G_\alpha^*$ be its adjoint:

(16) $(\hat G_\alpha^* g)(x) = \int_B G_\alpha(x, y)\,g(y)\,d\rho(y)$.

Finally, let $G'_\alpha$ be the operator mapping $L^2(B, \rho)$ to itself according to the formula:

(17) $(G'_\alpha g)(x) = \int_B G_\alpha(x, y)\,g(y)\,d\rho(y)$, $\quad x \in B$.

If we also let

(18) $v(x) = \lambda(x)^{1/2}$,

we can write (14) as

(19) $(H - \alpha)^{-1}f(x) = G_\alpha f(x) + \sum_{l=0}^{\infty} \hat G_\alpha^*\,v\,(v G'_\alpha v)^l\,v\,\hat G_\alpha f(x)$,

where $v$ acts by multiplication. There exists a $z_2 \in {}^*\mathbb{R}$ such that $\alpha < z_2$ implies

and for such $\alpha$

(21) $(1 - v G'_\alpha v)^{-1} = \sum_{l=0}^{\infty} (v G'_\alpha v)^l$.

Inserting this in (19), we have for $\alpha < \min(z_1, z_2)$

(22) $(H - \alpha)^{-1}f(x) = G_\alpha f(x) + \hat G_\alpha^*\left(\frac{1}{\lambda} - G'_\alpha\right)^{-1}\hat G_\alpha f(x)$.
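In finite dimensions formula (22) is an instance of the Woodbury matrix identity, and it can be verified directly. The sketch below (Python with NumPy) is only an illustration under explicit assumptions: $Y$ is a finite set with weights `mu`, $B$ is a three-point subset with weights `rho`, $L$ is a discrete Dirichlet Laplacian, and $H$ is taken to be $Hf = Lf - \lambda f\rho/\mu$ on $B$, which is the operator one obtains from the form (10) when integrals are finite sums. The names `G_hat`, `G_hat_adj`, and `G_prime` stand for the three operators introduced above, realized as the integral operators with kernel $G_\alpha(x, y)$.

```python
import numpy as np

# Finite-dimensional sketch of the resolvent formula (22). Everything here is an
# illustrative assumption: Y is a finite set with weights mu, B is a subset with
# weights rho, L is a discrete Dirichlet Laplacian (symmetric in L^2(Y, mu)),
# and H f = L f - lam * f * rho / mu on B is the operator of the form (10).
n = 30
mu = np.full(n, 1.0 / n)                         # reference measure on Y
B = np.array([7, 15, 22])                        # indices of the points in B
rho = np.array([0.2, 0.5, 0.3])                  # probability measure on B
lam = np.array([0.8, 1.5, 0.4])                  # coupling function on B

L = n**2 * (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))
H = L.copy()
H[B, B] -= lam * rho / mu[B]                     # perturb the diagonal entries on B

alpha = -50.0
I = np.eye(n)
G = np.linalg.inv(L - alpha * I)                 # G_alpha = (L - alpha)^{-1}
kernel = G / mu[None, :]                         # G_alpha(x, y), kernel w.r.t. mu

G_hat = G[B, :]                                  # L^2(Y, mu) -> L^2(B, rho)
G_hat_adj = kernel[:, B] * rho[None, :]          # adjoint, L^2(B, rho) -> L^2(Y, mu)
G_prime = kernel[np.ix_(B, B)] * rho[None, :]    # L^2(B, rho) -> L^2(B, rho)

middle = np.linalg.inv(np.diag(1.0 / lam) - G_prime)
rhs = G + G_hat_adj @ middle @ G_hat             # right-hand side of (22)
lhs = np.linalg.inv(H - alpha * I)               # left-hand side of (22)
err = np.abs(lhs - rhs).max()
print(err)   # should be at the level of floating-point rounding
```

The point of (22) is visible already here: the perturbation term only requires inverting an operator on $L^2(B, \rho)$, in this sketch a $3 \times 3$ matrix, rather than on the whole of $L^2(Y, \mu)$.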
Happily calculating within the nonstandard universe, we have been able to express the resolvent of $H$ as the sum of the resolvent of $L$ and a perturbation term. The calculations leading us to (22) are only valid for $\alpha < \min(z_1, z_2)$, but since all operators involved are analytic functions of $\alpha$, the final result must hold whenever the expression on the right is defined. This is the case when $(1/\lambda - G'_\alpha)$ is a strictly positive operator (recall 5.1.15), and according to the calculation
BB
it suffices to choose A such that
for some
E
E *R+
and all x
E
B. Notice also that for fixed f
increases as a decreases, and thus if (24) holds for a (I/A - Gb) is positive for all a < ao. From (22) we get
=
a,,, the operator
and thus $(H - \alpha)$ is positive if $((1/\lambda) - G'_\alpha)$ is. This implies that the form $\tilde{\mathcal{E}}$ generated by $H$ is bounded from below, and that its standard part $\tilde E$ can be defined by using the theory in Sections 5.1 and 5.2. Notice that since $\lambda$ and $\rho$ are positive and $\tilde{\mathcal{E}}$ is bounded from below, ${}^\circ|\tilde{\mathcal{E}}_1(u, u)| = \infty$ implies ${}^\circ|\mathcal{E}_1(u, u)| = \infty$, and hence the domain of $\tilde E$ is at least as large as that of the original form $E$. According to Theorem 5.1.19,

(26) $\tilde E(f, g) = -\lim_{{}^\circ\alpha \to -\infty} {}^\circ\left(\alpha^2((H - \alpha)^{-1}\hat f, \hat g) + \alpha\langle\hat f, \hat g\rangle\right)$,

where $\hat f, \hat g \in SL^2(Y, \mu)$ are liftings of $f$ and $g$. We can summarize our results as follows:

6.2.1. LEMMA. Assume that there is a finite $\alpha_0 \in {}^*\mathbb{R}_-$ such that

for some positive $\varepsilon \in \mathbb{R}$ and all $x \in B$. Then $\tilde{\mathcal{E}}$ is bounded from below by $\alpha_0$, and its standard part $\tilde E$ is a closed form given by (26). Moreover, $\mathcal{D}(\tilde E) \supset \mathcal{D}(E)$.
B. Nontriviality

The lemma above is just the starting point for our more serious investigations. Although it tells us when the perturbed form $\tilde E$ exists, it leaves the more important question of when the perturbation is nontrivial, i.e., $\tilde E \ne E$, unanswered, and it is to this problem that we now turn. Recall that according to Lemma 5.2.4, all we have to do is find $\alpha \in {}^*\mathbb{R}_-$ and an internal function $f$ satisfying the following four conditions:

(28) $-\infty < {}^\circ\alpha < {}^\circ\alpha_0$,
(29) ${}^\circ\mathcal{E}_1(f, f) < \infty$,
(30) $u = (L - \alpha)^{-1}f$ and $w = (H - \alpha)^{-1}f$ are nearstandard,
(31) ${}^\circ\|u - w\| \ne 0$.

Since $L$ is constructed from the standard operator $A$, the S-bounded operator $(L - \alpha)^{-1}$ maps nearstandard elements to nearstandard elements. By (22)

(32) $(H - \alpha)^{-1}f - (L - \alpha)^{-1}f = \hat G_\alpha^*(1/\lambda - G'_\alpha)^{-1}\hat G_\alpha f$

and thus it suffices to find a nearstandard $f$ satisfying (29) such that the right-hand side of (32) is nearstandard and noninfinitesimal.
We introduce the appropriate conditions in stages. First we shall give a quite abstract formulation in terms of the operators $G'_\alpha$, $\hat G_\alpha$, and $\hat G_\alpha^*$ and then, in the next subsection, a less lofty but much more useful one in terms of the integral kernel $G_\alpha(x, y)$ and the measures $\mu$ and $\rho$. But before the technical work begins, there is one more remark we would like to make. In the calculations above, we have used the $L^2$ norm on the space of internal functions on $B$. From now on we shall mainly use the $L^1$ norm. The reason for this is purely technical; although the two approaches yield almost exactly the same results, the $L^1$ case leads to simpler calculations both here and, especially, in Section 6.4.

We shall say that $\hat G_\alpha$ is $\rho$-dense if for all nearstandard $g \in L^1(B, \rho)$ and all $\varepsilon \in \mathbb{R}_+$, there is a nearstandard $u \in L^2(Y, \mu)$ such that $\hat G_\alpha u$ is S-bounded, ${}^\circ\mathcal{E}(u, u) < \infty$, and

(33) $\|g - \hat G_\alpha u\|_{L^1(B, \rho)} < \varepsilon$.

An internal operator $T : L^\infty(B, \rho) \to L^2(Y, \mu)$ is called compact if it maps all S-bounded functions to nearstandard ones (recall Propositions 2.1.15 and 2.2.4). If we instead consider $T$ as an operator from $L^1(B, \rho)$ to $L^2(Y, \mu)$, we say that $T$ is S-bounded off $B$ if whenever $K$ is a compact set such that $K \cap \mathrm{st}(B) = \emptyset$, then

$T : L^1(B, \rho) \to L^2(Y, \mu_K)$

is S-bounded, where $\mu_K$ is the internal measure $\mu_K(A) = \mu(A \cap {}^*K)$.
The operator formulation of the nontriviality result now reads as follows.

6.2.2. LEMMA. Assume that $G_\beta(\cdot, \cdot) \ge 0$ for all $\beta \le \alpha_0$ and that there is a finite $\alpha$, ${}^\circ\alpha < {}^\circ\alpha_0$, such that $\hat G_\alpha$ is $\rho$-dense and $\hat G_\alpha^*$ is compact as an operator from $L^\infty(B, \rho)$ to $L^2(Y, \mu)$ and S-bounded off $B$ as an operator from $L^1(B, \rho)$ to $L^2(Y, \mu)$. Assume further that there is an S-bounded function $h$ such that

(34) $g = (1/\lambda - G'_\alpha)h$

is nearstandard in $L^1(B, \rho)$ and

(35) ${}^\circ\|\hat G_\alpha^* h\|_{L^2(Y, \mu)} \ne 0$,

and, finally, that there is a positive $\varepsilon \in \mathbb{R}$ such that

for all $x \in B$. Then the standard form which $\tilde{\mathcal{E}}$ induces on $L^2(X, m)$ is different from $E$.
PROOF. As we have already observed, it suffices to find a nearstandard function $f$ satisfying (29) which makes the right-hand side of (32) nearstandard and noninfinitesimal. Note that by the resolvent equation

$G_{\alpha_0} - G_\alpha = (\alpha_0 - \alpha)\,G_\alpha G_{\alpha_0}$

and the non-negativity of the kernels $G_\alpha(x, y)$ and $G_{\alpha_0}(x, y)$, the function $\alpha \mapsto G_\alpha(x, y)$ decreases as $\alpha$ goes to $-\infty$ for all $x$ and $y$. Hence (36) holds with $\alpha_0$ replaced by $\alpha$, and consequently $(1/\lambda - G'_\alpha)$ is invertible and (34) can be rewritten as

(37) $h = ((1/\lambda) - G'_\alpha)^{-1}g$.

Since $h$ is S-bounded and $\hat G_\alpha^*$ is compact, the function

$u = \hat G_\alpha^* h = \hat G_\alpha^*((1/\lambda) - G'_\alpha)^{-1}g$

is nearstandard and noninfinitesimal in $L^2(Y, \mu)$. Although we do not know if $g = \hat G_\alpha f$ for some nearstandard $f$ in $L^2(Y, \mu)$, the $\rho$-denseness of $\hat G_\alpha$ implies that there is a sequence $\{f_n\}_{n \in \mathbb{N}}$ of nearstandard functions such that each $\hat G_\alpha f_n$ is S-bounded, ${}^\circ\mathcal{E}(f_n, f_n) < \infty$, and

$\|\hat G_\alpha f_n - g\|_{L^1(B, \rho)} \le 1/n$.

It is easy to check that $(1/\lambda - G'_\alpha)^{-1}$ has norm less than $1/\varepsilon$ both as an operator on $L^1(B, \rho)$ and as an operator on $L^\infty(B, \rho)$. To see this for the $L^\infty$ case, let
and assume that f achieves its maximum at x. Then
2
I ~ , ( x l f ( x ) l >Ellfllm,
where we have used the positivity of $G_\alpha(x, y)$. The $L^1$ case is left to the reader. Applying $(1/\lambda - G'_\alpha)^{-1}$ to the sequence $\{\hat G_\alpha f_n\}$, we see that
is a sequence of S-bounded functions converging to $h$ in $L^1(B, \rho)$. By the conditions on $\hat G_\alpha^*$, the functions

are nearstandard in $L^2(Y, \mu)$ and converge to $u$ in all the spaces $L^2(Y, \mu_K)$. It only remains to show that ${}^\circ\|u_n\|_{L^2(\mu)} \ne 0$ for some $n$, as we can then take $f$ to be the corresponding $f_n$. But this is easy: $m$ is a Radon measure with $m(\mathrm{st}(B)) = 0$ and $u$ is nearstandard; hence there must be a compact $K$ such that $K \cap \mathrm{st}(B) = \emptyset$ and ${}^\circ\|u\|_{L^2(\mu_K)} \ne 0$. Since $\{u_n\}$ converges to $u$ in $L^2(\mu_K)$, there must be an $n$ such that $0 < {}^\circ\|u_n\|_{L^2(\mu_K)} \le {}^\circ\|u_n\|_{L^2(\mu)}$.
REMARK. The condition $G_\beta(\cdot, \cdot) \ge 0$ for all $\beta \le \alpha_0$ is satisfied if, for instance,

$\mathcal{E}(f, g) = \mathcal{E}_0(f, g) + \int V f g\,d\mu$,

where $\mathcal{E}_0$ is a Dirichlet form and $V$ is a function bounded from below by $\alpha_0$.
C. The Main Result
The conditions of Lemma 6.2.2 are complicated and numerous, and the reader may have the impression that they have been chosen for the sole purpose of making the proof work. To show that this is not really the case, we shall now translate them into simple and natural conditions on the integral kernels $G_\alpha(x, y)$. Let us first fix the following standing conditions that are satisfied for the resolvent kernels of most interesting operators [see, e.g., Section B7 of Simon (1982) for the Schrödinger case]:

6.2.3. CONDITIONS. For all finite $\alpha$ less than $\alpha_0$, we have

(i) If $x, y, x', y'$ are nearstandard elements of $Y$ and $x \approx x'$, $y \approx y'$, but $x \not\approx y$, then $G_\alpha(x, y)$ is nearstandard and $G_\alpha(x, y) \approx G_\alpha(x', y')$.
(ii) If $x \in \mathrm{Ns}(Y)$ and $y \in Y - \mathrm{Ns}(Y)$, then $G_\alpha(x, y) \approx 0$.

An easy but important consequence of (i) and (ii) is:

(iii) If $K$ is compact, $B$ is internal, and $K \cap \mathrm{st}(B) = \emptyset$, then the restriction of $G_\alpha(x, y)$ to ${}^*K \times B$ is S-bounded.
In addition to 6.2.3 we need integrability conditions. As a first example we have:

6.2.4. LEMMA. Assume that $G_\alpha(\cdot, \cdot)$ satisfies Conditions 6.2.3 and that

$G_\alpha(x, y)\,G_\alpha(x, z) \in SL^1(\mu \times \rho \times \rho)$.

Then $\hat G_\alpha^*$ is compact as an operator from $L^\infty(B, \rho)$ to $L^2(Y, \mu)$ and S-bounded off $B$ as an operator from $L^1(B, \rho)$ to $L^2(Y, \mu)$.

PROOF. We first check the S-boundedness. Let $K$ be a compact set such that $K \cap \mathrm{st}(B) = \emptyset$. By (iii) above,
IGa(x, YII
for some N
E
N and all x
5
E
*K,y
E
5
N
B. Hence
N2p(*K)211fll:?p,,
and since p ( * K ) is finite, this proves that 62 is S-bounded off B. For the compactness part, we first note that since
5
I I ~ I LJ A x B x B IG,(~, y)~,(x,
Z>I
d ( p x p 2 ) ( x ,y, z)
for all internal f and A, the function 6:f is in SL2(p ) for all S-bounded f: Observe next that by Keisler's Fubini theorem, 3.2.14, (suitably extended to cover infinite measures), the function (38)
( Y , z)
-
Ga(x,Y ) G a ( & z)
is in S L ' ( p 2 ) for almost all x. If x is such an element which is not nearstandard, the right-hand side of 1 3 f ( x ) 1 2=
5
(1
G,(x,y)f(.Y) d P ( Y ) ) 2
llfll&
5
IGa(x, Y ) G a ( X , z)l dP2(Y,z)
is infinitesimal since Ga(x, y ) = 0 when y is nearstandard, and L ( p ) ( Y Ns( Y))= 0.
Finally, we pick two infinitely close, nearstandard points xl, x2 such that st(x,) st(B) and the function (38) is in S L ' ( p 2 )when x = x, or x = x,. Then I W X l ) ' - &f(Xd21 5
llfll2
I
IGm(X1,Y)Gm(Xl, 2 ) - GO(X2, Y)Ga(X2, 211 dP2(Y,z),
which is infinitesimal by 6.2.3(ii). We have shown that $\hat G_\alpha^* f$ is in $SL^2(\mu)$, that it vanishes at almost all non-nearstandard points, and that it is S-continuous outside a set of measure zero. Hence it is nearstandard in $L^2(Y, \mu)$, and the lemma is proved.

What we have just proved takes care of the requirements 6.2.2 puts on $\hat G_\alpha^*$, but we still have to consider the conditions on $(1/\lambda) - G'_\alpha$. The main difficulty is that we have to operate with two different parameters $\alpha_0$ and $\alpha$; while we are allowed to choose $\lambda$ such that

$(1/\lambda) - G'_{\alpha_0}$

satisfies the conditions of 6.2.1, we must prove that

$(1/\lambda) - G'_\alpha$

satisfies the conditions of 6.2.2. Since

$(1/\lambda) - G'_\alpha = ((1/\lambda) - G'_{\alpha_0}) + (G'_{\alpha_0} - G'_\alpha)$,

and the resolvent equation tells us that

$G_{\alpha_0} - G_\alpha = (\alpha_0 - \alpha)\,G_\alpha G_{\alpha_0}$,

the operator we need to study is given by
6.2.5. LEMMA. Assume that $G_\alpha$ satisfies 6.2.3 and that

(40) $G_\alpha(x, y)\,G_{\alpha_0}(x, z) \in SL^1(\mu \times \rho^2)$.

Then $(G'_{\alpha_0} - G'_\alpha)h$ is nearstandard in $L^1(B, \rho)$ for all S-bounded, internal functions $h$.

PROOF. By (39), (40), and Keisler's Fubini theorem, 3.2.14, we get

$(G'_{\alpha_0} - G'_\alpha)h \in SL^1(\rho)$.

By the same theorem there is a set $A \subset B$ of $L(\rho)$ measure one such that

(41) $(x, z) \mapsto G_\alpha(x, y)\,G_{\alpha_0}(x, z) \in SL^1(\mu \times \rho)$

for all $y \in A$. Let $y_1, y_2$ be infinitely close, nearstandard elements of $A$. Then
5 (a0 -
~ I I I ~ I I J,
B
J
I G , ( ~Y,A Y
- Ga(x,Y2)llG,"(x,z)l
dP(Z),
which is infinitesimal by (41) and Conditions 6.2.3. This completes the proof.

We are now ready for the main theorem.

6.2.6. THEOREM. Assume that there are two finite numbers $\alpha_0, \alpha \in {}^*\mathbb{R}$, ${}^\circ\alpha < {}^\circ\alpha_0 < 0$, such that $\hat G_\alpha$ is $\rho$-dense and $G_\alpha$ satisfies Conditions 6.2.3, and $G_\beta(\cdot, \cdot) \ge 0$ for all $\beta \le \alpha_0$. Assume further that

(42) $G_{\alpha_0}(x, y)\,G_{\alpha_0}(x, z) \in SL^1(\mu \times \rho^2)$,

and that $\lambda$ is chosen such that

(43)

is nearstandard in $L^1(B, \rho)$ and bounded from below by a positive real number. Then the standard part of $\tilde{\mathcal{E}}$ is a nontrivial perturbation of $E$.

PROOF. We shall use Lemma 6.2.2 with $h = 1$. Observe that since $\hat G_\alpha$ is positive and $\rho$-dense, ${}^\circ\|\hat G_\alpha^* 1\|_{L^2(\mu)} \ne 0$. Note also that since $G_\beta(\cdot, \cdot) \ge 0$, the function $\alpha \mapsto G_\alpha(x, y)$ decreases with $\alpha$ for all $x$ and $y$, and consequently (42) implies the integrability conditions in 6.2.4 and 6.2.5. By the first of these lemmas, the conditions 6.2.2 puts on $\hat G_\alpha^*$ are satisfied, and we only have to check that

(44) $g = ((1/\lambda) - G'_\alpha)(1) = ((1/\lambda) - G'_{\alpha_0})(1) + (G'_{\alpha_0} - G'_\alpha)(1)$

is nearstandard in $L^1(B, \rho)$. But $(G'_{\alpha_0} - G'_\alpha)(1)$ is nearstandard by 6.2.5 and

(45)

by choice of $\lambda$, and the theorem is proved.

In the introduction to this section we mentioned as the most interesting case the situation where $\lambda$ is infinitesimal, but where we still get a nontrivial perturbation. To see how this occurs, observe that we are forced to choose $\lambda$ infinitesimal when

(46)
is infinite. As $G_{\alpha_0}$ usually has a singularity on the diagonal, this is likely to happen when $\rho$ is concentrated on a "small" set. By our arguments above, it is not really the finiteness of (46) which determines when $\tilde E$ is a nontrivial perturbation, but rather the finiteness of the difference

(47) $\int G_{\alpha_0}(x, y)\,d\rho(y) - \int G_\alpha(x, y)\,d\rho(y)$.

At first glance it may seem improbable that this difference between infinite objects should be finite, but it is here that the resolvent equation enters the scene and reminds us that

where

The key point is that thanks to the integration with respect to $\mu$, the new kernel $G_\alpha G_{\alpha_0}(\cdot, \cdot)$ is usually much less singular than the old one $G_{\alpha_0}(\cdot, \cdot)$, and hence there is a fair chance that (47) will be finite even when (46) is not. It is in such cases that we get nontrivial perturbations from infinitesimal $\lambda$'s.

To illustrate the difference between $G_{\alpha_0}(\cdot, \cdot)$ and $G_\alpha G_{\alpha_0}(\cdot, \cdot)$, let us just quote the following results about the resolvent $R_\alpha$ of the Laplacian $-\Delta$ in $\mathbb{R}^d$ (see Lemma 6.3.1 for more detailed information): for $d = 1$, $R_\alpha(x, y)$ is continuous and bounded; for $d = 2$, it has a singularity on the diagonal proportional to $-\ln\|x - y\|$; and for $d > 2$, the singularity on the diagonal is proportional to $\|x - y\|^{2-d}$. The integral kernel $R_{\alpha_0}R_\alpha(x, y)$, on the other hand, is bounded and continuous for $d \le 3$; has a singularity of type $-\ln\|x - y\|$ for $d = 4$; and has a singularity of type $\|x - y\|^{4-d}$ for $d > 4$. As we shall see exemplified in the next two sections, a consequence of this behavior is that perturbations of the Laplacian induced by infinitesimal $\lambda$'s usually exist two dimensions higher than those coming from finite values of $\lambda$.

D. The Case of Standard $\lambda$'s
But let us take a closer look at what happens in situations where we can choose $\lambda$ noninfinitesimal. From the nonstandard point of view this case may seem less exciting than the other one, but the hyperfinite theory still has a significant contribution to make; it gives an efficient method for showing that a form

(48) $E(f, g) - \int \tilde\lambda(x)\,f(x)\,g(x)\,d\tilde\rho(x)$

is closable. To set the stage for the theorem, let us recall the basic ingredients of the problem: $E$ is a densely defined, closed, non-negative form on $L^2(X, m)$; the probability measure $\tilde\rho$ is supported on a closed set $C$ of $m$ measure zero, and $\tilde\lambda : C \to \mathbb{R}$ is a bounded Borel function. Let $E_0$ be the form defined on the bounded and continuous elements of $\mathcal{D}[E]$ by

(49) $E_0(f, g) = E(f, g) - \int_C \tilde\lambda(x)\,f(x)\,g(x)\,d\tilde\rho(x)$.
We want to know when $E_0$ can be extended to a closed form. For purely technical reasons we shall restrict ourselves to the case where $E$ is a Dirichlet form. We can then choose a hyperfinite representation $\mathcal{E}$ that is itself a Dirichlet form, and if $G_\alpha$ is the resolvent of $\mathcal{E}$, then the kernel $G_\alpha(x, y)$ is non-negative for all $x$, $y$, and $\alpha$, and

(50) $\|-\alpha G_\alpha(f)\|_\infty \le \|f\|_\infty$

for all $\alpha$ and $f$. In the remark which follows the proof of the theorem, we comment briefly on what happens when $E$ is not a Dirichlet form.

6.2.7. THEOREM. Let $\mathcal{E}$ be a hyperfinite representation of a Dirichlet form $E$ and $\rho$ a hyperfinite representation of $\tilde\rho$. Let $\lambda$ be an S-bounded lifting of $\tilde\lambda$ with respect to $\rho$. If $G_\alpha$ denotes the resolvent of $\mathcal{E}$, assume that

(51) $\lim_{{}^\circ\alpha \to -\infty} {}^\circ\left\|\int G_\alpha(\cdot, y)\,d\rho(y)\right\|_{L^1(\rho)} = 0$,

and that each $f \in \mathcal{D}[E_0]$ has an S-bounded $L^2(\mu + \rho)$-lifting $\hat f$ such that ${}^\circ\mathcal{E}(\hat f, \hat f) < \infty$ and

If there exist an $\varepsilon \in \mathbb{R}_+$ and a finite $\alpha_0 \in {}^*\mathbb{R}_-$ such that

(53) $\lambda(x) \le \left(\int G_{\alpha_0}(x, y)\,d\rho(y) + \varepsilon\right)^{-1}$

for all $x$, then the standard part of

$\tilde{\mathcal{E}}(f, g) = \mathcal{E}(f, g) - \int \lambda f g\,d\rho$

is a closed extension of $E_0$.
PROOF. That the standard part of $\tilde{\mathcal{E}}$ is closed follows immediately from (53) and Lemma 6.2.1. To prove the extension part, let $\tilde f, \tilde g \in \mathcal{D}[E_0]$ and pick liftings $f$ and $g$ according to (52). By (26) it suffices to prove that
X
(54)
E ~ ( g) X
= Flim
O(
a
2
o+-m
(
g )~+ &(A g>
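and since by 5.1.19
$$E(\tilde f, \tilde g) = -\lim_{{}^\circ\alpha \to -\infty} {}^\circ\big( \alpha^2 \langle G_\alpha f, g \rangle + \alpha \langle f, g \rangle \big), \tag{55}$$
this reduces to showing that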
Since by (52)
(56) will follow if we can prove that
Now
and since $G_\alpha(x, y) \ge 0$, we know that $(1/\lambda - G_\alpha^\rho)^{-1}$ has norm less than $1/\varepsilon$ as an operator on $L^\infty(B)$. By (50), the functions $(-\alpha G_\alpha) f$ and $(-\alpha G_\alpha) g$ are S-bounded, and hence (51) implies that the last term in (58) goes to zero.
REMARK. The only use we have made of the assumption that $\mathcal{E}$ is a Dirichlet form is in showing that $((1/\lambda) - G_\alpha^\rho)^{-1}(-\alpha G_\alpha) f$ and $(-\alpha G_\alpha) g$ are S-bounded; in the general case, all that is immediately clear is that they are in $L^2(B, \rho)$. This suffices if we replace (51) by the stronger assumption
(59)
but since the singularities of $G_\alpha(x, y)$ blow up when we square them, (59) is much too restrictive. Hence a satisfactory treatment of general forms requires finer estimates, and these we leave to the reader.

E. Translation into Standard Terms
The results above are phrased in terms of the resolvent $G_\alpha$ of the hyperfinite form $\mathcal{E}$. From a nonstandard point of view this is only natural, but in a standard setting a formulation using the resolvent $R_\alpha$ of the standard form $E$ is often more convenient. Since translating our results into standard terms is not quite as trivial as it may look at first glance, we shall give a brief sketch of what seems to be the best approach. The idea is simple: given a standard form $E$ satisfying standard counterparts of the conditions of Theorem 6.2.6 or 6.2.7, we shall show that it is the standard part of a hyperfinite form satisfying the original conditions. In Section 5.2, we used the associated semigroup to construct hyperfinite representations, but in the present context, where semigroups are less important than resolvents, an alternative line of attack is preferable.

We return to the setting of Section 5.2. Thus $K$ is a standard Hilbert space, and $H$ is a hyperfinite-dimensional, S-dense subset of ${}^*K$. Assume that $E$ is a symmetric, non-negative, densely defined form on $K$, and let $R_\alpha$ be its resolvent. Fix an $\alpha_0 \in \mathbb{R}_-$, and define
$$G_{\alpha_0} = P\,{}^*R_{\alpha_0} P, \tag{60}$$
where $P : {}^*K \to H$ is the orthogonal projection. Clearly, $G_{\alpha_0}$ is invertible as a linear map from $H$ to $H$, and
$$A = G_{\alpha_0}^{-1} + \alpha_0 \tag{61}$$
is a symmetric, non-negative operator on $H$. The form
$$\mathcal{E}(u, v) = \langle Au, v \rangle \tag{62}$$
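is the hyperfinite representation of $E$ we shall be working with. Let us check that $E$ really is the standard part of $\mathcal{E}$. If $G_\alpha$ is the resolvent of $\mathcal{E}$ and $f \in K$, the two functions from $\mathbb{R}_-$ to $\mathbb{R}$ given by
$$\alpha \mapsto \langle R_\alpha f, f \rangle \quad\text{and}\quad \alpha \mapsto {}^\circ\langle G_\alpha\,{}^*f, {}^*f \rangle \tag{63}$$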
are analytic. From the series expansions m
m
it is easy to see that they agree in a neighborhood of $\alpha_0$, and hence they are equal for all negative $\alpha$. If $F$ is the standard part of $\mathcal{E}$, we know from 5.1.19 that
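$$F(f, f) = -\lim_{{}^\circ\alpha \to -\infty} {}^\circ\big( \alpha^2 \langle G_\alpha\,{}^*f, {}^*f \rangle + \alpha \langle P\,{}^*f, P\,{}^*f \rangle \big), \tag{66}$$
and by the corresponding standard result [see, e.g., Lemma 1.3.4 in Fukushima (1980)], we get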
Since the two expressions are equal, $E$ is the standard part of $\mathcal{E}$.

Returning to the setting of this section, from now on we assume that $K = L^2(X, m)$, where $X$ is a Hausdorff space and $m$ is a Radon measure. The following conditions are the standard counterparts of 6.2.3(i)-(ii):

6.2.8. CONDITIONS. Assume that the resolvent $R_\alpha$ has a symmetric kernel $R_\alpha(\cdot,\cdot)$ satisfying:
(i) $R_\alpha(\cdot,\cdot)$ is continuous off the diagonal; i.e., at all points $(x, y)$ where $x \ne y$.
(ii) For all $x \in X$ and $\varepsilon \in \mathbb{R}_+$, there is a compact $K_\varepsilon$ such that $|R_\alpha(x, y)| < \varepsilon$ if $y \notin K_\varepsilon$.
REMARK. Note that these conditions imply that if $x$ has no compact neighborhood, then $R_\alpha(x, y) = 0$ for all $y \ne x$, and that they therefore may seem rather worthless in nonlocally compact spaces. However, in the infinite-dimensional case, the situation can often be saved by choosing a weaker topology than the one originally given.
Recall that $C$ is a closed subset of $X$ of $m$-measure zero, and that $\tilde\rho$ is a probability measure supported on $C$. We shall say that $R_\alpha$ is $\tilde\rho$-dense if for each $g \in L^1(C, \tilde\rho)$ and each $\varepsilon \in \mathbb{R}_+$, there is a $u \in \mathcal{D}[E]$ such that
$$\int R_\alpha(\cdot, y)\, u(y)\, dm(y)$$
is bounded on $C$ and differs from $g$ by less than $\varepsilon$ in $L^1(\tilde\rho)$ norm. Note that by the resolvent equation
$$R_\alpha = R_{\alpha_0}\big( I - (\alpha - \alpha_0) R_{\alpha_0} \big)^{-1},$$
all the $R_\alpha$'s are $\tilde\rho$-dense if and only if one of them is. The following definition makes precise what we are looking for.

6.2.9. DEFINITION. A symmetric, lower bounded form $\tilde E$ is a perturbation of $E$ supported on $C$ if
(i) $\mathcal{D}[E] \subseteq \mathcal{D}[\tilde E]$;
(ii) $\tilde E \ne E$;
(iii) whenever $f, g \in \mathcal{D}[E]$ and there is a neighborhood of $C$ where $f$ is continuous and $g$ vanishes, then $\tilde E(f, g) = E(f, g)$.
We can now translate 6.2.6 into the following theorem.

6.2.10. THEOREM. Let $E$ be a densely defined, non-negative, symmetric form on $L^2(X, m)$, where $X$ is a Hausdorff space and $m$ is a Radon measure, and let $\tilde\rho$ be a Radon probability measure supported on a closed set $C \subset X$ of $m$-measure zero. Assume that the resolvent kernel $R_\alpha(x, y)$ is nonnegative, $\tilde\rho$-dense, and satisfies 6.2.8. If in addition
$$\| \alpha R_\alpha f + f \|_{L^2(\tilde\rho)} \to 0 \quad\text{as } \alpha \to -\infty \tag{68}$$
for all $f \in \mathcal{D}[E]$ which are continuous in a neighborhood of $C$, and there is an $\alpha_0 \in \mathbb{R}_-$ such that
$$R_{\alpha_0}(x, y)\, R_{\alpha_0}(x, z) \in L^1(m \times \tilde\rho^2), \tag{69}$$
then $E$ has a perturbation supported on $C$.

PROOF. We begin by making a careful choice of hyperfinite representations $Y$, $\mu$, $B$, and $\rho$ of $X$, $m$, $C$, and $\tilde\rho$, respectively. Recall from the proof of Corollary 3.4.10 that there is a hyperfinite equivalence relation $\equiv$ on ${}^*X$ such that each equivalence class is a *Borel set, and the equivalence class of each nearstandard point is contained in its monad. Refine $\equiv$ to an equivalence relation $\sim$ by letting $x \sim y$ if and only if $x \equiv y$ and either none or both of $x$ and $y$ belong to ${}^*C$. For each equivalence class $p$ of $\sim$, choose an element $x_p \in p$, and define
$$Y = \{ x_p \mid p \text{ is an equivalence class} \},$$
$$B = \{ x_p \mid p \text{ is an equivalence class},\ p \subset {}^*C \}.$$
Let $\mu$ be the hyperfinite measure on $Y$ given by
$$\mu\{x_p\} = {}^*m(p)$$
and $\rho$ the measure on $B$ defined by
$$\rho\{x_p\} = {}^*\tilde\rho(p).$$
Note that $m = L(\mu) \circ \mathrm{st}^{-1}$ and $\tilde\rho = L(\rho) \circ \mathrm{st}^{-1}$, and that $\mu$ and $\rho$ have disjoint supports. If $P$ denotes the natural "projection" of ${}^*L^2(X, m)$ on $L^2(Y, \mu)$, let
$$G_{\alpha_0} = P\,{}^*R_{\alpha_0} P \tag{70}$$
as in (60). As already observed, $E$ is the standard part of the hyperfinite form $\mathcal{E}$ defined in (62). The key observation is that if $\tilde\nu = m + \tilde\rho$, then
$$G_{\alpha_0}(x_p, x_q) = {}^*\tilde\nu(p)^{-1}\, {}^*\tilde\nu(q)^{-1} \int_p \int_q {}^*R_{\alpha_0}(x, y)\, d\,{}^*\tilde\nu(x)\, d\,{}^*\tilde\nu(y) \tag{71}$$
is a kernel for $G_{\alpha_0}$ as an operator from $L^2(Y, \mu)$ to itself. Using this formula, it is easy to check that the $\tilde\rho$-denseness of $R_\alpha$ implies the $\rho$-denseness of $G_\alpha$, and that the assumption of $R_\alpha$ satisfying 6.2.8 implies that $G_\alpha$ satisfies 6.2.3. Let $\nu = \mu + \rho$. Then
$$G_{\alpha_0}(x, y)\, G_{\alpha_0}(x, z)\; d(\mu \times \rho^2)(x, y, z)$$
and thus (69) implies the S-integrability of $G_{\alpha_0}(x, y) G_{\alpha_0}(x, z)$. But then all the assumptions of 6.2.6 are satisfied, and the standard form $\tilde E = {}^\circ\tilde{\mathcal{E}}$ is a nontrivial perturbation of $E$. It only remains to check that the perturbation is supported on $C$ in the sense of 6.2.9(iii). Assume that there is a neighborhood of $C$ where $\tilde f$ is continuous and $\tilde g$ vanishes, and let $f, g \in L^2(Y, \nu)$ be liftings of $\tilde f$ and $\tilde g$ with respect to $\nu$. From (68) we get that
$${}^\circ\| \alpha G_\alpha f + f \|_{L^2(\rho)} \to 0 \tag{72}$$
and
$${}^\circ\| \alpha G_\alpha g \|_{L^2(\rho)} \to 0 \tag{73}$$
as ${}^\circ\alpha \to -\infty$. By (26)
(74)
&L
g > - E(L g ) = lim ( ( ( 1 ~ ) G;)-~(-&J~ 0,
+ -a-
(-&M)P(~),
where the right-hand side is zero by (72), (73), and the uniform boundedness of the operators $(1/\lambda - G_\alpha^\rho)^{-1}$. The theorem is proved.

REMARK. Note that $E$ will in general have many different perturbations supported on $C$ since we are free to choose the difference
$$\frac{1}{\lambda(x)} - \int G_{\alpha_0}(x, y)\, d\rho(y)$$
in 6.2.6, and also, to a certain extent, the measure $\tilde\rho$ on $C$.

Using the methods above, it is easy to prove the following standard version of 6.2.7:
6.2.11. THEOREM. Let $E$ be a Dirichlet form on $L^2(X, m)$, where $X$ is a Hausdorff space and $m$ is a Radon measure, and assume that the resolvent kernel $R_\alpha(x, y)$ satisfies 6.2.8. Given a closed set $C \subset X$ of $m$-measure zero, a Radon probability measure $\tilde\rho$ supported on $C$, and a bounded Borel function $\tilde\lambda : C \to \mathbb{R}$, let $E_0$ be the form defined on the bounded and continuous elements of $\mathcal{D}[E]$ by
$$E_0(f, g) = E(f, g) - \int \tilde\lambda f g\, d\tilde\rho. \tag{75}$$
Assume that
$$\lim_{\alpha \to -\infty} \iint R_\alpha(x, y)\, d\tilde\rho(x)\, d\tilde\rho(y) = 0; \tag{76}$$
that for all $f \in \mathcal{D}[E_0]$
$$\lim_{\alpha \to -\infty} \| f + \alpha R_\alpha f \|_{L^2(\tilde\rho)} = 0; \tag{77}$$
and that there exist $\alpha_0 \in \mathbb{R}_-$, $\varepsilon \in \mathbb{R}_+$, such that for all $x$
$$\tilde\lambda(x) \le \Big( \int R_{\alpha_0}(x, y)\, d\tilde\rho(y) + \varepsilon \Big)^{-1}; \tag{78}$$
then $E_0$ is closable.

Although there are no continuity conditions on the resolvent kernel in Theorem 6.2.7, we are now assuming that $R_\alpha$ satisfies Conditions 6.2.8. This is just a convenient way of guaranteeing sufficient regularity for (76)-(78) to make sense.
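As a hedged numerical illustration of condition (76), the following Python sketch evaluates the double integral for the free resolvent on the real line. Everything concrete here is an assumption made for illustration and not taken from the text: the explicit one-dimensional kernel $e^{-\sqrt{-\alpha}|x-y|}/(2\sqrt{-\alpha})$, the choice of $\tilde\rho$ as the uniform probability measure on a hypothetical finite set `points`, and the function names.

```python
import math

def free_resolvent_1d(alpha, x, y):
    """Kernel of (-d^2/dx^2 - alpha)^(-1) on the real line for alpha < 0.

    Assumed closed form: exp(-k|x - y|) / (2k) with k = sqrt(-alpha).
    """
    k = math.sqrt(-alpha)
    return math.exp(-k * abs(x - y)) / (2.0 * k)

def double_integral(alpha, points):
    """Integral of R_alpha(x, y) against rho x rho, where rho is the
    uniform probability measure on the finite set `points`."""
    n = len(points)
    total = sum(free_resolvent_1d(alpha, x, y) for x in points for y in points)
    return total / (n * n)

# Hypothetical support of rho: three points (a set of Lebesgue measure zero).
points = [0.0, 0.5, 2.0]

# alpha = -1, -10, -100, -1000, -10000
alphas = [-(10.0 ** j) for j in range(5)]
values = [double_integral(a, points) for a in alphas]

for a, v in zip(alphas, values):
    print(f"alpha = {a:10.1f}   double integral = {v:.6f}")
```

In this one-dimensional toy case the kernel is bounded by $1/(2\sqrt{-\alpha})$, so the double integral tends to zero as $\alpha \to -\infty$, which is the behavior (76) asks for.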
In our applications we shall be studying perturbations of the free Hamiltonian form
$$E(f, g) = \int (-\Delta f)\, g\, dm$$
in $\mathbb{R}^d$. Some of the assumptions of Theorems 6.2.10 and 6.2.11 are automatically satisfied for this form, and to make clear exactly what has to be checked, we state the following corollary.

6.2.12. COROLLARY. Let $\Delta$ be the Laplacian and $m$ the Lebesgue measure in $\mathbb{R}^d$. Let $E_0$ be the form defined on $C_0^\infty(\mathbb{R}^d)$ by
$$E_0(f, g) = \int (-\Delta f)\, g\, dm,$$
and let $E$ be its closure. Assume that $\tilde\rho$ is a Radon probability measure supported on a closed set $C$ of Lebesgue measure zero.
(a) If for some $\alpha_0 \in \mathbb{R}_-$
$$(R_{\alpha_0} R_{\alpha_0})(\cdot,\cdot) \in L^1(\tilde\rho^2), \tag{79}$$
then $E$ has a perturbation supported on $C$.
(b) If $\tilde\lambda : C \to \mathbb{R}$ is a bounded Borel function and there exist $\alpha_0 \in \mathbb{R}_-$, $\varepsilon \in \mathbb{R}_+$ such that $\tilde\lambda(x) \le \big( \int R_{\alpha_0}(x, y)\, d\tilde\rho(y) + \varepsilon \big)^{-1}$ and
$$R_{\alpha_0}(\cdot,\cdot) \in L^1(\tilde\rho^2), \tag{80}$$
then the form $E(f, g) - \int \tilde\lambda f g\, d\tilde\rho$ is closable.

PROOF. That $R_\alpha$ satisfies 6.2.8 is well known; see, e.g., the explicit formulas given in Lemma 6.3.1 below. If $f \in \mathcal{D}[E_0]$ is continuous in a neighborhood of $C$, then $-\alpha R_\alpha f$ converges uniformly to $f$ on $C$, and hence $\| f + \alpha R_\alpha f \|_{L^2(\tilde\rho)} \to 0$. Since $\{ R_\alpha u \mid u \in \mathcal{D}[E_0] \} \supset C_0^\infty(\mathbb{R}^d)$ is dense in $L^1(\tilde\rho)$, the resolvent $R_\alpha$ is $\tilde\rho$-dense. Finally, since $R_\alpha(x, y)$ decreases to zero at all nonsingular points, (80) implies (76). Hence all the conditions of Theorems 6.2.10 and 6.2.11 are satisfied, and the corollary is proved.
We shall take our leave of the general theory of singular perturbations here; as already announced, the next two sections will be dedicated to the study of two important special cases: point interactions and potentials supported on Brownian paths. The theory we have presented above is new, and, partly for that reason, this section is among the most open-ended in the book; although we have a fairly good understanding of when perturbations supported on null sets exist, we know almost nothing about their properties. Perhaps the work which has been done on point interactions (and which we review briefly in the next section) can serve as a guide for the general theory; among the topics people have studied are spectral properties, resonances, and convergence of approximating operators. It also seems evident that there must be a connection between singular perturbations and Hausdorff measure and dimension, which has not yet been exploited. Let us finally mention that in collaboration with Karwowski we have developed an alternative approach to Corollary 6.2.12 using Fourier transforms and ultraviolet cutoffs. We hope to give an account of this method in Albeverio et al. (1986a) [see also the announcement Albeverio et al. (1984b)].
6.3. POINT INTERACTIONS
Point interactions (i.e., perturbations of the free Hamiltonian supported on a discrete set) have a long and venerable history. Kronig and Penney (1931) used operators of the form
to model one-dimensional crystals, and a few years later, when Wigner (1933) showed that the diplon (the system consisting of a proton and a neutron) is held together by forces of extremely short range, a three-dimensional model based on point interactions was developed by Bethe and Peierls (1935). Fermi (1936) used a similar approach to study the motion of neutrons through hydrogenous substances, but although the work was continued and extended to N-body problems by, e.g., Huang, Lee, Luttinger, Yang, and Wu [see Huang and Yang (1957), Huang et al. (1957), Lee et al. (1957), and Wu (1959)] in the 1950s, no mathematical foundation for the theory existed until 1961. In that year, Berezin and Faddeev (1961) not only showed how to interpret one-point interactions as self-adjoint operators, but also gave a complete classification of the resulting Hamiltonians [see the excellent survey article by Flamand (1967) and the book by Demkov and Ostrovskii (1975) for accounts of this and related work]. Friedman (1971, 1972) showed that operators of this kind can be obtained as limits of operators defined by ordinary potentials, and in his AMS address on internal set theory, Nelson (1977) gave a nonstandard treatment of these results. Friedman and Nelson did not obtain the full classification of Berezin and Faddeev; the gap was filled by Albeverio et al. (1979b), who also extended the theory from perturbations in one point to perturbations in an arbitrary, finite number of points. The latest nonstandard contribution to the field is the thesis of Alonso y Coria (1978).

During the last few years, our understanding of point interactions and their properties has grown rapidly, and the rich and complex theory which has been created is described in detail in the monograph by Albeverio et al. (1986b). In this section our aim is much more modest; we just want to show you two nonstandard ways of constructing point interactions (recall that we have already taken a brief look at a third method in Section 5.6).

A. Application of the General Theory
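The first method is based on the theory of the previous section; we simply want to apply Corollary 6.2.12. To obtain the necessary information about the kernels $R_\alpha(\cdot,\cdot)$ and $R_\alpha R_\alpha(\cdot,\cdot)$, note that
$$R_\alpha = \int_0^\infty e^{\alpha t}\, T^t\, dt \quad\text{and}\quad R_\alpha R_\alpha = \int_0^\infty t\, e^{\alpha t}\, T^t\, dt,$$
where $T^t$ is the semigroup of $-\Delta$. Since $T^t$ is given by the Gaussian kernel
$$T^t(x, y) = (4\pi t)^{-d/2} \exp\Big( -\frac{\|x - y\|^2}{4t} \Big), \tag{1}$$
we get
$$R_\alpha(x, y) = \int_0^\infty (4\pi t)^{-d/2} \exp\Big( -\frac{\|x - y\|^2}{4t} \Big) \exp(\alpha t)\, dt \tag{2}$$
and
$$(R_\alpha R_\alpha)(x, y) = \int_0^\infty t\,(4\pi t)^{-d/2} \exp\Big( -\frac{\|x - y\|^2}{4t} \Big) \exp(\alpha t)\, dt. \tag{3}$$
Since $R_\alpha(x, y)$ and $(R_\alpha R_\alpha)(x, y)$ depend only on the distance $r = \|x - y\|$ between the points $x$ and $y$, we can treat them as one-dimensional functions $R_\alpha^{(d)}(r)$ and $(R_\alpha R_\alpha)^{(d)}(r)$, where we have introduced the superscript $d$ to emphasize the dependence on the dimension. From (2) and (3) it follows immediately that
$$(R_\alpha R_\alpha)^{(d)}(r) = \frac{1}{4\pi}\, R_\alpha^{(d-2)}(r). \tag{4}$$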
Our first lemma summarizes well-known asymptotic properties of $R_\alpha^{(d)}(r)$ as $r$ goes to zero and to infinity:

6.3.1. LEMMA. Let $R_\alpha^{(d)}(r)$ be as above, $\alpha \in \mathbb{R}_-$:
(i) If $d < 2$, then $R_\alpha^{(d)}(r)$ is continuous and bounded and converges uniformly to zero as $\alpha \to -\infty$.
(ii) When $d \ge 2$, $R_\alpha^{(d)}(r)$ is continuous off the diagonal, and the asymptotic expression as $r \to \infty$ is
$$R_\alpha^{(d)}(r) \sim \tfrac{1}{2}\,(2\pi)^{-(d-1)/2}\,(\sqrt{-\alpha})^{(d-3)/2}\, r^{-(d-1)/2}\, e^{-\sqrt{-\alpha}\, r}.$$
(iii) When $d = 2$, the asymptotic behavior as $r \to 0$ is
$$R_\alpha^{(2)}(r) \sim -\frac{1}{2\pi} \ln(\sqrt{-\alpha}\, r).$$
(iv) When $d > 2$, the asymptotic behavior as $r \to 0$ is
$$R_\alpha^{(d)}(r) \sim \frac{\Gamma(d/2 - 1)}{4\pi^{d/2}}\, r^{2-d}.$$

Most of the properties above should be known to the reader from the basic theory of partial differential equations (after all, $R_\alpha$ is nothing but the Green function), and we shall not prove the lemma here. To obtain the finer estimates, one may either integrate the right-hand side of (2) in terms of Bessel functions [see Gradshteyn and Ryshik (1965), Eq. 3.471.9], or use Fourier transforms as in Glimm and Jaffe (1981).

The important thing to notice about 6.3.1 is that the singularity at the origin gets worse as the dimension increases. By (4), the singularity of $(R_\alpha R_\alpha)$ grows two dimensions slower than the singularity of $R_\alpha$, and this is why 6.2.12(a) is a more powerful tool than 6.2.12(b). Let us take a look at what happens for point interactions.

6.3.2. PROPOSITION. Let $E$ be the closed form on $L^2(\mathbb{R}^d, m)$ (where $m$ is the Lebesgue measure) generated by $-\Delta$. If $d \le 3$ and $a_1, \ldots, a_n \in \mathbb{R}^d$, there is a nontrivial perturbation of $E$ supported on $\{a_1, \ldots, a_n\}$. If $d = 1$ and $\beta_1, \ldots, \beta_n \in \mathbb{R}$, the form defined on the continuous and bounded elements of $\mathcal{D}[E]$ by
is closable.

PROOF. By 6.2.12 combined with 6.3.1.
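The integral representation (2) and the dimension shift it implies can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a plain trapezoid rule after the substitution $t = e^s$ (the quadrature scheme, its parameters, and the function name `heat_resolvent` are choices made here, not taken from the text), and it compares against the closed forms $e^{-\sqrt{-\alpha} r}/(2\sqrt{-\alpha})$ for $d = 1$ and $e^{-\sqrt{-\alpha} r}/(4\pi r)$ for $d = 3$, which are standard facts quoted for the comparison. The extra factor $t$ in (3) satisfies $t\,(4\pi t)^{-d/2} = (4\pi)^{-1}(4\pi t)^{-(d-2)/2}$, so the kernel of $R_\alpha R_\alpha$ in dimension $d$ should equal $1/(4\pi)$ times the kernel of $R_\alpha$ in dimension $d - 2$.

```python
import math

def heat_resolvent(d, alpha, r, weight_power=0, lo=-30.0, hi=8.0, n=8000):
    """Trapezoid approximation of

        int_0^inf t^w (4 pi t)^(-d/2) exp(-r^2 / (4t)) exp(alpha t) dt,

    with w = weight_power, computed in the variable s = log(t).
    w = 0 corresponds to formula (2), w = 1 to formula (3).
    Requires alpha < 0 and r > 0 so that both tails vanish.
    """
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        s = lo + i * h
        t = math.exp(s)
        integrand = (t ** weight_power
                     * (4.0 * math.pi * t) ** (-d / 2.0)
                     * math.exp(-r * r / (4.0 * t))
                     * math.exp(alpha * t)
                     * t)                      # Jacobian dt = t ds
        total += integrand if 0 < i < n else 0.5 * integrand
    return total * h

alpha, r = -1.0, 1.0
k = math.sqrt(-alpha)

r1 = heat_resolvent(1, alpha, r)                    # formula (2), d = 1
r3 = heat_resolvent(3, alpha, r)                    # formula (2), d = 3
rr5 = heat_resolvent(5, alpha, r, weight_power=1)   # formula (3), d = 5

print(r1, math.exp(-k * r) / (2.0 * k))
print(r3, math.exp(-k * r) / (4.0 * math.pi * r))
print(rr5, r3 / (4.0 * math.pi))
```

The last comparison is the numerical counterpart of the statement that the singularity of $R_\alpha R_\alpha$ lags two dimensions behind that of $R_\alpha$.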
It is known that if $d \ge 4$, then $-\Delta$ is essentially self-adjoint even when restricted to functions vanishing in the vicinity of $\{a_1, \ldots, a_n\}$, and thus $E$ has no perturbations supported on finite sets. In this sense 6.3.2 is the best possible result. On the other hand, we can use Theorem 6.2.6 and formula 6.2.26 to obtain more detailed information about the possible perturbations of $E$; if we, for instance, choose the $\lambda$ and the $\rho$ in 6.2.6 uniform, the perturbation in each point $a_i$ is the same. But we shall leave the closer study of the properties of the perturbed operators to the reader; instead we shall take a look at a different nonstandard approach to point interactions.

B. An Alternative Approach
So far our strategy in this chapter has been to reformulate a given problem as a discrete problem in a hyperfinite setting. The technique is quite general and allows us to treat successfully a number of problems concerning singular coefficients and perturbations. It is often possible, however, to give a continuous nonstandard treatment; i.e., we may use all of ${}^*X$ as the underlying space rather than a hyperfinite, S-dense subset $Y$. In the discrete approach one is first led to a problem in linear algebra, whose solution is later pushed down to the standard space by the theory of Chapter 5 or some similar device. In the continuous approach a singular standard problem in $X$ is replaced by a sufficiently regular problem in ${}^*X$, and again a meaningful standard answer may be forthcoming if one is only able to prove the necessary smoothness of the nonstandard solution.

As an example, we shall see how point interactions in $\mathbb{R}^3$ can be treated by this method. If $\chi_\varepsilon$ is the indicator function of the interval $[0, \varepsilon]$, we first consider self-adjoint operators on the complex space $L^2(\mathbb{R}_+, dr)$ of the form
$$A = -\frac{d^2}{dr^2} + \lambda\chi_\varepsilon \tag{6}$$
with Dirichlet boundary conditions at $r = 0$, where $\lambda$ is a constant. What this means is the following. Let $E$ be the form defined on the functions in $C^\infty(\mathbb{R}_+)$ that vanish outside a compact subset of $(0, \infty)$, by
(7)
If $\bar E$ is the closure of $E$, then $A$ is the self-adjoint operator defined by $\bar E$. Similarly, we let $A_0$ be the self-adjoint operator on $L^2(\mathbb{R}_+, dr)$ given by
$$A_0 = -\frac{d^2}{dr^2} \tag{8}$$
and Dirichlet boundary conditions at $r = 0$. We shall choose $\varepsilon \approx 0$, and show that for certain infinitely large choices of $\lambda$, the nonstandard operator $A$ induces a standard perturbation of $A_0$.
In a sense that we shall explain later, this solves the "radial part" of the problem of a point interaction at the origin in $\mathbb{R}^3$. As in Section 6.2, our main tool will be the resolvent kernel of $A$. We fix an infinitesimal $\varepsilon > 0$, and let $v_1$, $v_2$ be solutions of the equation
$$-u'' + \lambda\chi_\varepsilon u - \alpha u = 0 \tag{9}$$
on ${}^*\mathbb{R}_+$, where $\alpha \in \mathbb{C} - \mathbb{R}_+$. If we choose $v_1$ and $v_2$ such that $v_1(0) = 0$ and $v_2(r) \to 0$ as $r \to \infty$, elementary Sturm-Liouville theory tells us that the quantity
$$K = v_1' v_2 - v_1 v_2' \tag{10}$$
is independent of $r$, and that the inverse of $(A - \alpha)$ is given by the integral kernel
$$G_\alpha(x, y) = \begin{cases} v_1(x)\, v_2(y)/K, & x \le y, \\ v_1(y)\, v_2(x)/K, & y \le x, \end{cases} \tag{11}$$
whenever $K \ne 0$. Since $\lambda\chi_\varepsilon$ is constant on each of the intervals $[0, \varepsilon]$ and $(\varepsilon, \infty)$, equation (9) can be solved explicitly. Assuming that $\mathrm{Re}(\sqrt{-\alpha}) > 0$, we see that $v_1$ and $v_2$ can be chosen as follows:

and

Since $v_1(r)$ and $v_1'(r)$ have to be continuous at $r = \varepsilon$, we get

The corresponding expressions for $c$ and $d$ are

Evaluating $K$ as the limit of $v_1' v_2 - v_1 v_2'$ as $r$ goes to infinity, we find that
$$K = 2a\sqrt{-\alpha}. \tag{18}$$
Combining (12), (13), and (18), we can turn (11) into an explicit formula for G,(x, y). Since A is a self-adjoint operator, G,(x, y) is symmetric, and we need only consider the case x 5 y: sin(-
x)
(c
e G Y +
d e - G Y ),
XlY<&
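On the other hand, the resolvent kernel of the unperturbed operator $A_0$ is
$$R_\alpha(x, y) = \frac{1}{2\sqrt{-\alpha}} \Big( e^{-\sqrt{-\alpha}\,(y - x)} - e^{-\sqrt{-\alpha}\,(x + y)} \Big), \qquad x \le y. \tag{20}$$
Our task is to find the $\lambda$'s which make $G_\alpha$ different from $R_\alpha$ in standard part. From (16) and (17), we get that if ${}^\circ\alpha \ne 0$, then $G_\alpha(x, y)$ is finite when $x = y$. Hence $G_\alpha(x, y)$ has no singularity at the origin, and comparing (20) and the third clause in (19), we see that the only way $G_\alpha$ may differ from $R_\alpha$ in standard part is by
$$\mathrm{st}(b/a) \ne -1. \tag{21}$$
We first observe that if $\lambda$ is finite, then

and

and thus $b/a \approx -1$. As for infinite $\lambda$, we see that if $\big(\sqrt{\alpha - \lambda}/\sqrt{-\alpha}\big) \cos(\sqrt{\alpha - \lambda}\,\varepsilon)$ is infinite, then again $b/a \approx -1$. This means that we have to choose $\lambda$ infinite, but such that
$$\frac{\sqrt{\alpha - \lambda}}{\sqrt{-\alpha}} \cos(\sqrt{\alpha - \lambda}\,\varepsilon) \tag{24}$$
is finite. For infinite $\lambda$

and thus it suffices to choose $\lambda$ such that the quantity
$$\tilde\beta = \sqrt{\alpha - \lambda}\, \cos(\sqrt{\alpha - \lambda}\,\varepsilon) \tag{26}$$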
355
is finite. Hence we must let E =
(27) for some k
E
( k + 5) T
*Z and some infinitesimal
1).
+7
Solving for A, we get
and combining (27) and (26), we see that (29)
6 = 2cos(
(k
+
E
1)
T
+ 7) + ( k +
i)
77 cos((k
T&
+ $)T + 7) 77
6
If we choose q / E nearstandard and k E N, then has a standard part which is equal to the standard part of the second term in (29):
p = st((-l)k6) = - ( k +$).rrst(q/E).
(30)
It turns out that without any loss of generality we may take ~ , I / E to be a real number y. Formulas (28) and (30) may then be rewritten as
(31)
A = -(k
+ $ ) ’ ( T * / E ’ ) + ( 2 / ~ ) -p y 2
p
+ $)TY.
and (32)
=
-(k
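The choice of the coupling constant can be made plausible with ordinary floating-point numbers, using a small positive `eps` as a stand-in for the infinitesimal $\varepsilon$. The Python sketch below is an illustration only, under the following assumptions that are not stated in this form in the text: the interior solution is taken to be $\sin(\sqrt{\alpha - \lambda}\, r)$ on $[0, \varepsilon]$; the exterior solution is written $a e^{\kappa r} + b e^{-\kappa r}$ with $\kappa = \sqrt{-\alpha}$, and the ratio $b/a$ is obtained here by matching value and derivative at $r = \varepsilon$; the finite constant term of the coupling is passed as a free parameter `delta`; all function names are hypothetical. For a boundary condition $u'(0) = \beta u(0)$ one expects $b/a = (\kappa - \beta)/(\kappa + \beta)$, which reduces to $-1$ in the Dirichlet limit.

```python
import math

def coupling(eps, beta, k=0, delta=0.0):
    """lambda = -(k + 1/2)^2 pi^2 / eps^2 + 2 beta / eps + delta."""
    return -((k + 0.5) ** 2) * math.pi ** 2 / eps ** 2 + 2.0 * beta / eps + delta

def log_derivative_at_eps(alpha, lam, eps):
    """v'(eps) / v(eps) for the interior solution v(r) = sin(omega r),
    omega = sqrt(alpha - lam)."""
    omega = math.sqrt(alpha - lam)
    return omega * math.cos(omega * eps) / math.sin(omega * eps)

def reflection_ratio(alpha, lam, eps):
    """b/a for the exterior solution a e^{kappa r} + b e^{-kappa r},
    obtained by matching value and derivative with sin(omega r) at r = eps."""
    kappa = math.sqrt(-alpha)
    omega = math.sqrt(alpha - lam)
    s = math.sin(omega * eps)
    c = (omega / kappa) * math.cos(omega * eps)
    return math.exp(2.0 * kappa * eps) * (s - c) / (s + c)

alpha, beta, eps = -1.0, 2.0, 1e-4
kappa = math.sqrt(-alpha)

results = {}
for k in (0, 1):
    lam = coupling(eps, beta, k=k)
    results[k] = (log_derivative_at_eps(alpha, lam, eps),
                  reflection_ratio(alpha, lam, eps))
    print(k, lam, results[k])

print("expected b/a:", (kappa - beta) / (kappa + beta))
```

With `eps` small, the logarithmic derivative at `eps` is close to `beta` for both values of `k`, and `b/a` is close to $(\kappa - \beta)/(\kappa + \beta)$ rather than to $-1$, which is the numerical shadow of condition (21).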
The functions $v_1$ and $v_2$ are S-continuous and have S-continuous derivatives on ${}^*\mathbb{R}_+$. Hence their standard parts $u_1$ and $u_2$ exist and satisfy the equation
$$-u'' - \alpha u = 0. \tag{33}$$
Moreover,
(34)
and since $v_1(\varepsilon) = \sin(\sqrt{\alpha - \lambda}\,\varepsilon)$ and $v_1'(\varepsilon) = \sqrt{\alpha - \lambda}\, \cos(\sqrt{\alpha - \lambda}\,\varepsilon)$, we get
(35)
From (25), (26), and the fact that $\sin(\sqrt{\alpha - \lambda}\,\varepsilon) \approx (-1)^k$, we see that
$$u_1'(0) = \beta u_1(0). \tag{36}$$
It follows that
(37)
is the resolvent of the operator $-d^2/dr^2$ on $\mathbb{R}_+$ with the boundary condition
$$u'(0) = \beta u(0). \tag{38}$$
If $\dot A$ is the restriction of $-d^2/dr^2$ to the $C^\infty$ functions which vanish outside a compact subset of $(0, \infty)$, it is known that the only self-adjoint extensions of $\dot A$ are the ones given by condition (38) when $\beta$ runs through $\mathbb{R}$ [see, e.g., Reed and Simon (1975)]. Thus we have proved:

6.3.3. PROPOSITION. Consider the self-adjoint operator
$$A_\beta = -\frac{d^2}{dr^2} + \lambda_\varepsilon(\beta)\,\chi_\varepsilon \tag{39}$$
with Dirichlet boundary conditions on the nonstandard Hilbert space ${}^*L^2(\mathbb{R}_+, dr)$, where $\lambda_\varepsilon(\beta)$ is a number of the form
$$\lambda_\varepsilon(\beta) = -(k + \tfrac{1}{2})^2 \frac{\pi^2}{\varepsilon^2} + \frac{2}{\varepsilon}\beta + \delta \tag{40}$$
with $\beta, \delta \in \mathbb{R}$, $\varepsilon$ a positive infinitesimal, $k \in \mathbb{N}$, and $\chi_\varepsilon$ the indicator function of $[0, \varepsilon]$. Then $A_\beta$ is nearstandard in the sense that its resolvent $(A_\beta - \alpha)^{-1}$ is nearstandard, and the standard part of $(A_\beta - \alpha)^{-1}$ is the resolvent of the self-adjoint operator $-d^2/dr^2$ with the boundary condition $u'(0) = \beta u(0)$. Thus when $\beta$ runs through $\mathbb{R}$, the standard part of $A_\beta$ runs through all self-adjoint extensions of $-d^2/dr^2$ independently of $k$ and $\delta$.

Comparing (31) and (40), we see that the term $(-\gamma^2)$, which depends on $\beta$ through (32), has been replaced by the independent term $\delta$, but it is easy to check that this makes no difference.

Let us now explain the relationship between 6.3.3 and point interactions in $\mathbb{R}^3$. If $\dot\Delta$ is the restriction of $\Delta = \sum_{i=1}^{3} \partial^2/\partial x_i^2$ to the $C^\infty$ functions that vanish outside a compact subset of $\mathbb{R}^3 - \{0\}$, we are interested in the self-adjoint extensions of $\dot\Delta$. In polar coordinates $\Delta$ takes the form
$$\Delta = \frac{\partial^2}{\partial r^2} + \frac{2}{r} \frac{\partial}{\partial r} + \frac{1}{r^2} B, \tag{41}$$
where $B$ is the Laplace-Beltrami operator on $L^2(S^2)$. But $B$ has discrete spectrum with eigenvalues $-l(l + 1)$, $l \in \mathbb{Z}_+$, of finite multiplicity, and thus it suffices to study the self-adjoint extensions of the restriction $\dot\Delta_l$ of
$$\Delta_l = \frac{d^2}{dr^2} + \frac{2}{r} \frac{d}{dr} - \frac{l(l + 1)}{r^2} \tag{42}$$
to those functions in $L^2(\mathbb{R}_+, r^2\, dr)$ which are $C^\infty$ and vanish outside compact subsets of $\mathbb{R}_+ - \{0\}$.
The map $f(r) \mapsto r f(r)$ is a unitary equivalence between $L^2(\mathbb{R}_+, r^2\, dr)$ and $L^2(\mathbb{R}_+, dr)$, which carries $\Delta_l$ into
$$\frac{d^2}{dr^2} - \frac{l(l + 1)}{r^2}, \tag{43}$$
and since it also maps the class of $C^\infty$ functions vanishing outside compact subsets of $\mathbb{R}_+ - \{0\}$ to itself, it suffices to study the self-adjoint extensions of (43). It is well known [see, e.g., Reed and Simon (1975)] that (43) is essentially self-adjoint for $l \ge 1$. We have therefore only to consider the self-adjoint extensions of $-d^2/dr^2$ on $L^2(\mathbb{R}_+, dr)$, and this is exactly what we did in 6.3.3.

We now fix a positive infinitesimal $\varepsilon$, and write $\Delta$ also for the self-adjoint Laplacian in the nonstandard space ${}^*H = {}^*L^2(\mathbb{R}^3)$. We want to know when the self-adjoint perturbation
$$A_\lambda = -\Delta + \lambda\,\chi_\varepsilon(\|x\|)$$
is nearstandard to a self-adjoint, nontrivial perturbation of the standard operator $-\Delta$. If we split ${}^*H$ into its rotationally symmetric part ${}^*H_s$ and the orthogonal complement ${}^*H_s^\perp$,
$${}^*H = {}^*H_s \oplus {}^*H_s^\perp, \tag{44}$$
we get immediately that the restriction of $A_\lambda$ to ${}^*H_s^\perp$ is nearstandard and that its standard part is $-\Delta$ (this is because we have already remarked that the restriction of $\Delta$ to the $C^\infty$ functions in $H_s^\perp$ that vanish outside a compact subset of $\mathbb{R}^3 - \{0\}$ is essentially self-adjoint). Hence we need only consider the restriction of $A_\lambda$ to ${}^*H_s$, and we have seen that this operator is unitarily equivalent to (43). Combining our results, we have the following theorem.

6.3.4. THEOREM. Consider the self-adjoint operator
$$A_\beta = -\Delta + \lambda_\varepsilon(\beta)\,\chi_\varepsilon \tag{45}$$
on ${}^*L^2(\mathbb{R}^3)$, where $\lambda_\varepsilon(\beta)$ is a number of the form
$$\lambda_\varepsilon(\beta) = -(k + \tfrac{1}{2})^2 \frac{\pi^2}{\varepsilon^2} + \frac{2}{\varepsilon}\beta + \delta \tag{46}$$
with $\beta, \delta \in \mathbb{R}$, $k \in \mathbb{N}$, $\varepsilon$ a positive infinitesimal, and $\chi_\varepsilon$ the indicator function of the ball $\{\|x\| \le \varepsilon\}$. Then the resolvent $(A_\beta - \alpha)^{-1}$ is nearstandard, and its standard part is independent of $k$ and $\delta$, but different for different choices of $\beta$. When $\beta$ runs through $\mathbb{R}$, the standard part of $A_\beta$ runs through all self-adjoint extensions of the restriction of $-\Delta$ to the $C^\infty$ functions supported on compact subsets of $\mathbb{R}^3 - \{0\}$.

Theorem 6.3.4 gives an alternative description of the one-point perturbations of $-\Delta$ to the one we obtained in 6.3.2 by using the methods of Section 6.2. In Albeverio et al. (1979b), a third approach based on approximations
by finite rank operators was also exploited, but we shall not discuss this here. Instead we shall give a short survey of recent contributions, standard as well as nonstandard, to the study of point perturbations. We have already mentioned the nonstandard contributions by Nelson (1977), Albeverio et al. (1979b), and Alonso y Coria (1978). Using standard methods, Albeverio et al. (1977, 1980) gave the alternative description in terms of Dirichlet forms, which we described in Section 5.6, and Grossmann et al. (1980a,b) used approximations by finite rank operators to initiate the study of the finer properties of the perturbations such as eigenvalues and bound states, resonances, and infrared convergence. The reader should also consult contributions by Thomas (1979, 1980) and Zorbas (1980). The extension from perturbations in a finite number of points to discrete, countable sets was made in Grossmann et al. (1980a) [see also Svendsen (1981) for related models]. It is interesting to observe that this theory is not covered by Section 6.2 (since $\tilde\rho$ is a probability measure and $\tilde\lambda$ is a bounded function), but we should add that the infinite theory is obtained from the finite by a rather direct limit argument. In Albeverio and Høegh-Krohn (1981) point interactions were characterized as limits of smooth, local potentials, and this approach has been refined and extended by Albeverio et al. (1982c), Holden (1981), and Holden et al. (1983, 1984). Connections to the fashionable field of random potentials have been established in, e.g., Albeverio et al. (1982a). These are just a few of the many recent contributions to the field; the reader is referred to Albeverio et al. (1986b) and Demkov and Ostrovskii (1975) for additional information and references.

6.4. PERTURBATIONS BY LOCAL TIME FUNCTIONALS
We now turn to the study of perturbations of the Laplacian induced by potentials supported on Brownian paths. Given a Brownian motion
$$b : \Omega \times [0, 1] \to \mathbb{R}^d, \tag{1}$$
we let $C_\omega$ denote the path
$$C_\omega = \{ b(\omega, t) \mid t \in [0, 1] \} \tag{2}$$
for each $\omega \in \Omega$. There is a natural measure on $C_\omega$ induced by $b$:
$$\tilde\rho_\omega(A) = m_1\{ t \in [0, 1] \mid b(\omega, t) \in A \}, \tag{3}$$
where $m_1$ is the one-dimensional Lebesgue measure. We shall use the theory of Section 6.2 to study operators on $L^2(\mathbb{R}^d, m)$ given by forms
$$\tilde E_\omega(f, g) = \int (-\Delta f)\, g\, dm - \int \lambda_\omega f g\, d\tilde\rho_\omega \tag{4}$$
(as usual, $m$ is the Lebesgue measure on $\mathbb{R}^d$). Using the definition of $\tilde\rho_\omega$, we may rewrite (4) as
$$\tilde E_\omega(f, g) = \int (-\Delta f)\, g\, dm - \int_0^1 \lambda_\omega(b(t))\, f(b(t))\, g(b(t))\, dt, \tag{5}$$
and if $\delta$ is the delta function on $\mathbb{R}^d$, a formal calculation "shows" that the associated operator is
$$H_\omega = -\Delta - V_\omega(\cdot), \tag{6}$$
where
$$V_\omega(x) = \lambda_\omega(x) \int_0^1 \delta(x - b(t))\, dt. \tag{7}$$
Comparing (7) and 3.3.8, we see that $H_\omega$ is a perturbation of $-\Delta$ by a local time functional. Although local time in the sense of 3.3.8 exists only in dimension one, we shall see that nontrivial perturbations of $-\Delta$ of the form (6) exist for $d \le 5$. For $d = 4, 5$ it is necessary to choose $\lambda_\omega$ infinitesimal.

Before we turn to the technical part of the theory, we would like to give a brief sketch of the contents of this section. After having applied Corollary 6.2.12 to the form (4) to obtain the results just mentioned, we shall try to explain why local time perturbations are of interest to physicists. In this section we shall concentrate on models of polymer molecules, but in Section 7.5 we shall take a look at some applications to four-dimensional quantum field theory. As models of polymers, Brownian paths have some unrealistic features; e.g., a Brownian path in $\mathbb{R}^3$ will intersect itself infinitely often, while a polymer molecule does not intersect itself at all. To remedy this deficiency attempts have been made to replace the Brownian motion by a self-avoiding modification, but the probabilistic behavior of the process then becomes extremely complicated and little has yet been achieved. We shall end the section with brief accounts of two of the most promising of these attempts: Westwater's (1980, 1981, 1982, 1985) work on Edwards's polymer model and Lawler's (1980) on self-avoiding random walks.

A. Application of the General Theory
Our first lemma will show that the condition of Corollary 6.2.12(a) is satisfied for almost all Brownian paths when the dimension $d$ is less than or equal to 5. The proof consists in reducing the problem to a question about convergence of certain multiple integrals, and the basic observation is always the same: that for all $r > 0$
$$\int_{\|x\| \le r} \|x\|^{-p}\, dx \quad\text{converges iff}\quad p < d, \tag{8}$$
$$\int_{\|x\| \ge r} \|x\|^{-p}\, dx \quad\text{converges iff}\quad p > d, \tag{9}$$
as is easily seen by changing to polar coordinates. 6.4.1.
LEMMA. Let $R_\alpha$ be the resolvent of $-\Delta$ in $\mathbb{R}^d$. If $d \le 5$ and $\alpha \in \mathbb{R}_-$, then
$$(R_\alpha R_\alpha)(\cdot,\cdot) \in L^1(\tilde\rho_\omega^2) \tag{10}$$
for almost all $\omega$.

PROOF. The integral kernel $(R_\alpha R_\alpha)(\cdot,\cdot)$ is described in detail by Lemma 6.3.1 and equation 6.3.4. When $d \le 3$, the kernel is bounded and there is nothing to prove. We shall leave the case $d = 4$ to the reader and concentrate on dimension 5. To see what goes wrong in higher dimensions, let us try to carry out the proof for a general $d > 4$. From Lemma 6.3.1 and formula 6.3.4 we know that there are positive constants $C$ and $\beta$ such that

To prove the lemma it suffices to show that

is finite. Introducing Gaussian kernels, we have

and substituting the new variable $S = (t - s)/\|u\|^2$ for $s$, this turns into

If we let
$$K = \int_{-\infty}^{\infty} (2\pi |S|)^{-d/2} \exp(-1/2|S|)\, dS$$
(the integral converges since $d > 2$), then

which according to (8) converges if and only if $2d - 6 < d$, i.e., $d < 6$. The proof is complete.

Combining the lemma and Corollary 6.2.12(a), we get:

6.4.2. THEOREM. Let $b : \Omega \times [0, 1] \to \mathbb{R}^d$ be a Brownian motion and $m$ the Lebesgue measure in $\mathbb{R}^d$. If $d \le 5$, then for almost all $\omega \in \Omega$, the form
(13)
E ( f ,8 ) =
I
(-Af)gdm
has a nontrivial, closed perturbation supported on the path
cu = {Ww, t ) /t E ro, 111The next problem we shall consider is when we can take the perturbation in the theorem to be of the form
for some standard function $\lambda_\omega : C_\omega \to \mathbb{R}$. Combining the following lemma with 6.2.12 (b), we see that this is the case when $d \le 3$.

6.4.3. LEMMA. Let $R_\alpha(\cdot,\cdot)$ be the resolvent kernel of $-\Delta$ in $\mathbb{R}^d$. If $d \le 3$ and $\alpha < 0$, then

(15) $R_\alpha(\cdot,\cdot) \in L^1(\tilde b_\omega^2)$

for almost all $\omega$.

PROOF. This is almost identical to the proof of the preceding lemma. By Lemma 6.3.1 there is nothing to prove for $d = 1$, and we shall leave the two-dimensional case to the reader. Assume that $d \ge 3$; then there are positive constants $C$ and $\beta$ such that for all $r > 0$
and it therefore suffices to find those d for which
is finite. Since b is a Brownian motion
where we have substituted $S = (t - s)/\|u\|^2$ for $s$. Since $d > 2$, the integral

$\int_{-\infty}^{\infty} (2\pi|S|)^{-d/2} \exp\left(-\frac{1}{2|S|}\right) dS$

converges, and hence
which converges if and only if $2d - 4 < d$, i.e., for $d < 4$. This proves the lemma.

By Corollary 6.2.12b the form $E_\omega$ in (14) is closable if for some $\alpha \in \mathbb{R}_-$, the inequality

holds for $\tilde b_\omega$-a.a. $x$. From the lemma we see that for almost all $\omega$, the function

is finite $\tilde b_\omega$-almost everywhere, and this implies the closability of $E_\omega$ for a large class of functions $\lambda_\omega$. But the lemma is too weak to give a closer description of this class; e.g., it cannot tell us whether we are allowed to choose $\lambda_\omega$ positive and constant, which is the most interesting choice from the physical point of view. To settle this problem we shall analyze the situation with more care. In dimension three the resolvent kernel of the Laplacian has a particularly simple form

$R_\alpha(x, y) = \frac{\exp(-\sqrt{-\alpha}\,\|x - y\|)}{4\pi\|x - y\|}.$
For each $\alpha \in \mathbb{R}_-$, we define a stochastic process $X^\alpha : \Omega \times [0,1] \to \mathbb{R}$ by

$X^\alpha(\omega, t) = \int_0^1 \frac{\exp(-\sqrt{-\alpha}\,\|b(t) - b(s)\|)}{\|b(t) - b(s)\|}\,ds,$

where the factor $4\pi$ is just for notational convenience. Observe that if we knew that for almost all $\omega$

(21) $X^\alpha(\omega, t) \to 0$ uniformly in $t$ when $\alpha \to -\infty$,
then we could conclude from Corollary 6.2.12b that the form (16) is closable for all bounded functions $\lambda_\omega$. To prove (21) we first observe that by Fatou's lemma

(22) $X^\alpha(\omega, t) \le \liminf_{s \to t} X^\alpha(\omega, s),$
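and thus the paths of $X^\alpha$ are lower semicontinuous. In a short while we shall establish the following estimate:

6.4.4. LEMMA. For each $\varepsilon \in \mathbb{R}_+$, there is a constant $C_\varepsilon$ such that for all $\alpha \in \mathbb{R}_-$ and all $s, t \in [0,1]$

(23) $E(|X^\alpha(t) - X^\alpha(s)|^3) \le C_\varepsilon |t - s|^{3/2 - \varepsilon}.$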
The conclusion of this lemma is the condition of Kolmogorov's continuity theorem [see, e.g., Simon (1979)], and hence the restriction of $X^\alpha$ to the dyadic rationals

$D = \{ k/2^n \mid k, n \in \mathbb{N},\ 0 \le k \le 2^n \}$

is continuous. We can define a continuous modification $Y^\alpha$ of $X^\alpha$ by

(24) $Y^\alpha(\omega, t) = \lim_{s \to t,\ s \in D} X^\alpha(\omega, s).$
Since the paths of $X^\alpha$ are lower semicontinuous,

(25) $X^\alpha(\omega, t) \le Y^\alpha(\omega, t)$

for all $t$, and the paths of $X^\alpha$ must be bounded. Assuming Lemma 6.4.4, we may now prove:

6.4.5. THEOREM. Let $\Delta$ be the Laplace operator in $\mathbb{R}^3$. If $b : \Omega \times [0,1] \to \mathbb{R}^3$ is a Brownian motion and $m_1$ is the Lebesgue measure on $[0,1]$, then for each $\omega \in \Omega$ let the measure $\tilde b_\omega$ be defined by

$\tilde b_\omega(A) = m_1\{ t \in [0,1] \mid b(\omega, t) \in A \}.$

The form

$E_\omega(f, g) = \int (-\Delta f)\, g \, dm - \int \lambda f g \, d\tilde b_\omega$
is closable for all bounded functions $\lambda$.

PROOF. By Corollary 6.2.12 it is enough to prove (21). In fact, by (24) and (25) it suffices to show that for almost all $\omega$

(26) $X^\alpha(\omega, t) \to 0$ uniformly for all $t \in D$ as $\alpha \to -\infty$.

Note that since the paths of $X^\alpha$ are bounded, the limit in (26) holds pointwise by Lebesgue's convergence theorem.
Assume that (26) does not hold; then there is an internal set $\Omega_1 \subset \Omega$ of noninfinitesimal measure $P(\Omega_1) = \delta$ such that

(27) $\forall \omega \in \Omega_1\ \forall \alpha \in \mathbb{R}_-\ \exists t \in D\ (X^\alpha(\omega, t) \ge \varepsilon)$

for some $\varepsilon \in \mathbb{R}_+$. For each $n \in \mathbb{N}$ let

$D_n = \{ k/2^n \mid k \in \mathbb{N},\ 0 \le k \le 2^n \}.$

It follows from (27) that for each $\alpha \in \mathbb{R}_-$ there must be an integer $n(\alpha)$ such that

(28) $P\{ \omega \mid \exists t \in D_{n(\alpha)}\ (X^\alpha(\omega, t) \ge \varepsilon) \} \ge \delta/2.$

Applying transfer, we see that for an infinitely large negative $\alpha$, (28) holds for some $n(\alpha) \in {}^*\mathbb{N} - \mathbb{N}$. Since (26) is true when uniform convergence is replaced by pointwise convergence, ${}^*X^\alpha(\omega, t)$ must be infinitesimal at all standard points $t$. By the *-version of Lemma 6.4.4, the restriction of ${}^*X^\alpha$ to $D_{n(\alpha)}$ satisfies the condition of the hyperfinite Kolmogorov theorem, 4.8.5. Hence ${}^*X^\alpha$ is S-continuous, and since it is also infinitesimal at all standard points, it cannot possibly satisfy (28). Thus (26) holds and we are done.
B. The Basic Estimate
It still remains to prove Lemma 6.4.4, and we warn the reader at the very beginning that this is going to be quite messy. Our main tool will be the following estimate.

6.4.6. LEMMA. There is a constant $C$ such that if $e \in \mathbb{R}^3$ has length $\tfrac12$,

for all $a \in \mathbb{R}^3$ and all $\beta \in \mathbb{R}_+$.

PROOF. Fix $a$, and introduce a new variable $y = \sqrt{\beta}\,k$:
When $\beta$ goes to infinity, the last integral goes to zero if $a \ne \pm e$, and it converges to

if $a = \pm e$. Hence $I(\beta, a)$ goes to zero at least as fast as $1/\beta$ when $\beta \to \infty$. To see what happens when $\beta \to 0$, we first observe that

by the triangle inequality, and that the last integral converges to

when $\beta \to 0$. Thus $I(\beta, a)$ goes to infinity as $1/\sqrt{\beta}$ when $\beta \to 0$. It follows that for each $a$ there is a constant $C(a)$ such that

and a little thought will make it clear that we can, in fact, choose $C(a)$ independent of $a$. In our applications the $\beta$ in Lemma 6.4.6 will be of the form $\beta = \|x\|^2/2t$, and we then have
We shall also use the algebraic identity

with $\alpha = \|x\|^2/2t_1$, $\beta = \|x\|^2/2t_2$:
PROOF OF LEMMA 6.4.4. This argument is related to the proof of Lemma 6.4.1, but it is much more complicated. We shall estimate the integral

(32) $I(s, t) = E(|X^0(t) - X^0(s)|^3).$

Since

$\left| \frac{e^{-\sqrt{-\alpha}\,\|x\|}}{\|x\|} - \frac{e^{-\sqrt{-\alpha}\,\|y\|}}{\|y\|} \right| \le \left| \frac{1}{\|x\|} - \frac{1}{\|y\|} \right|,$

it suffices to prove 6.4.4 for $\alpha = 0$ as the result is then obviously true for all negative $\alpha$; i.e., it suffices to show that

(33) $I(s, t) \le C_\varepsilon |t - s|^{3/2 - \varepsilon}.$
We shall prove (33) by introducing Gaussian kernels in (32). To treat the possible dependencies between the increments $b(t) - b(u_i)$, $b(s) - b(u_j)$, $1 \le i, j \le 3$, we must first split the domain of integration into several parts according to the ordering of $u_1$, $u_2$, $u_3$, $s$, and $t$. There are 120 different ways to order five elements, but many of these give rise to exactly the same calculations, and it is enough to consider the following six basic cases.

CASE 1. $u_1 < u_2 < u_3 < s < t$.

CASE 2. $u_1 < u_2 < s < u_3 < t$.

CASE 3. $u_1 < u_2 < s < t < u_3$.

CASE 4. $u_1 < s < u_2 < t < u_3$.

CASE 5. $u_1 < s < u_2 < u_3 < t$.

CASE 6. $s < u_1 < u_2 < u_3 < t$.
We cannot treat all the different cases here, but shall concentrate on the first and the last one. These are the extreme cases in the sense that in case 1 all the u's lie outside the interval (s, t ) , while in case 6 they all belong to the interval. For the intermediate cases 2-5 one needs only combine the methods used for the extreme cases in a rather straightforward way, and we leave this to the reader.
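The reduction from 120 orderings to six cases can be checked mechanically. The following is a minimal sketch, not the authors' argument: it assumes that the symmetries being used are relabeling of the $u_i$, the interchange of $s$ and $t$ (so that one may take $s < t$), and time reversal of the Brownian motion. The function names are hypothetical.

```python
from itertools import permutations

LABELS = ("u1", "u2", "u3", "s", "t")


def pattern(order):
    """Reduce an ordering to a word in the letters u, s, t with s before t.

    Assumption: the u_i are interchangeable and the roles of s and t
    may be swapped, since only |X(t) - X(s)| enters the estimate.
    """
    word = "".join("u" if name.startswith("u") else name for name in order)
    if word.index("t") < word.index("s"):
        word = word.translate(str.maketrans("st", "ts"))
    return word


def case_class(order):
    """Identify an ordering with its time reversal (assumed symmetry)."""
    return min(pattern(order), pattern(order[::-1]))


orderings = list(permutations(LABELS))
patterns = {pattern(o) for o in orderings}
classes = {case_class(o) for o in orderings}

print(len(orderings), len(patterns), len(classes))  # 120 10 6
print(sorted(classes))
```

Under these assumed symmetries the ten interleavings of $s < t$ with $u_1 < u_2 < u_3$ collapse to six classes, and each of the six cases listed above represents exactly one class.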
PROOF OF CASE 1. We must estimate

Note that $b(t) - b(s)$, $b(s) - b(u_3)$, $b(u_3) - b(u_2)$, $b(u_2) - b(u_1)$ are independent random variables. When we now introduce Gaussian kernels, $x$, $x_3$, $x_2$, and $x_1$ will be variables corresponding to these increments:

$\cdots \times \frac{\exp(-\|x_1\|^2/2(u_2 - u_1))}{[2\pi(u_2 - u_1)]^{3/2}}\,du_1\,du_2\,du_3\,dx_1\,dx_2\,dx_3\,dx.$

We introduce new variables as follows: new time variables are
and extend the domain of the time integrals to be the whole unit interval, we get

$\cdots \times \frac{\exp(-\|x\|^2/2(t - s))}{(t - s)^{3/2}}\,du_1\,du_2\,du_3\,dk_1\,dk_2\,dk_3\,dx.$

We now carry out the integration with respect to $k_1$, $k_2$, and $k_3$ in that order, using Lemma 6.4.6 in each case. The result is
By elementary calculus
and thus
Changing variables for the last time, we let
and get Il(s,t)
5
a 7r2
(-),It C
- sI3'*
Ilyl131n( 1 +
a
t - s IIY II
)3e-llyJ12/2dy.
It is easy to check that when $t - s$ goes to zero, the last integral goes to infinity more slowly than $|t - s|^{-\varepsilon}$ for any $\varepsilon > 0$. This completes the proof of case 1.
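The closing remark of case 1 rests on the elementary fact that a power of a logarithm is dominated by any negative power of $t - s$. The sketch below illustrates only this scalar fact. It assumes that the divergence comes from a factor of the type $\ln(1 + a/\sqrt{h})^3$ with $h = t - s$; the constant `a`, the exact argument of the logarithm, and the function name are placeholders rather than quantities taken from the proof.

```python
import math


def log_cubed_times_power(h, eps, a=1.0):
    """Return h**eps * ln(1 + a / sqrt(h))**3 (hypothetical model factor)."""
    return h ** eps * math.log1p(a / math.sqrt(h)) ** 3


# If the product tends to zero as h -> 0, the logarithmic factor grows
# more slowly than h**(-eps).
for h in (1e-10, 1e-50, 1e-100, 1e-200):
    print(h, log_cubed_times_power(h, eps=0.1))
```

The printed values decrease toward zero, which is the behavior needed to absorb the logarithm into the exponent $3/2 - \varepsilon$ of (33).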
PROOF OF CASE 6. We shall estimate

In this case the independent increments are $b(u_1) - b(s)$, $b(u_2) - b(u_1)$, $b(u_3) - b(u_2)$, $b(t) - b(u_3)$, and when we introduce Gaussian densities, they will be represented by the variables $x_1$, $x_2$, $x_3$, $x_4$:

$\cdots \times \frac{\exp(-\|x_4\|^2/2(t - u_3))}{[2\pi(t - u_3)]^{3/2}}\,du_1\,du_2\,du_3\,dx_1\,dx_2\,dx_3\,dx_4.$
We introduce new variables as follows: new time variables are $v_1 = u_1 - s$, $v_2 = u_2 - u_1$, $v_3 = u_3 - u_2$, and for notational convenience we shall write $v_4$ for the dependent variable

$v_4 = t - u_3 = t - (s + v_1 + v_2 + v_3).$

New space variables are $x$, $k_1$, $k_2$, $k_3$, given by

$x = x_1 + x_2 + x_3 + x_4,$

$x_1 = \|x\| k_1 + x/2,$

$x_1 + x_2 = \|x\| k_2 + x/2,$

$x_1 + x_2 + x_3 = \|x\| k_3 + x/2.$

Note that

$x_2 = \|x\|(k_2 - k_1), \quad x_3 = \|x\|(k_3 - k_2), \quad x_4 = x/2 - \|x\| k_3.$

Letting $e = x/2\|x\|$,
we get

$\cdots \times \frac{\exp(-\|x\|^2\|k_1 + e\|^2/2v_1)}{v_1^{3/2}} \cdot \frac{\exp(-\|x\|^2\|k_2 - k_1\|^2/2v_2)}{v_2^{3/2}} \cdot \frac{\exp(-\|x\|^2\|k_3 - k_2\|^2/2v_3)}{v_3^{3/2}} \cdot \frac{\exp(-\|x\|^2\|k_3 - e\|^2/2v_4)}{v_4^{3/2}}\,dv_1\,dv_2\,dv_3\,dk_1\,dk_2\,dk_3\,dx.$
To this point we have followed the proof of case 1 closely, but we cannot carry out the integration now as we did then. The problem is that each $k_i$ appears in two of the exponents. In case 1, the variable $k_i$ appeared in only one exponent, and this was what made it possible to apply Lemma 6.4.6. It is here that we need the algebraic identity (31). Applying it three times in succession we get:
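$\frac{\|x\|^2\|k_1 + e\|^2}{2v_1} + \frac{\|x\|^2\|k_1 - k_2\|^2}{2v_2} + \frac{\|x\|^2\|k_2 - k_3\|^2}{2v_3} + \frac{\|x\|^2\|k_3 - e\|^2}{2v_4}$

$= \frac{\|x\|^2(v_1 + v_2)}{2v_1v_2}\left\|k_1 - \frac{v_1k_2 - v_2e}{v_1 + v_2}\right\|^2 + \frac{\|x\|^2(v_1 + v_2 + v_3)}{2(v_1 + v_2)v_3}\left\|k_2 - \frac{(v_1 + v_2)k_3 - v_3e}{v_1 + v_2 + v_3}\right\|^2$

$\quad + \frac{\|x\|^2(t - s)}{2(v_1 + v_2 + v_3)v_4}\left\|k_3 - \frac{(v_1 + v_2 + v_3)e - v_4e}{t - s}\right\|^2 + \frac{\|x\|^2}{2(t - s)}.$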
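If we use the abbreviations

$c_1 = \frac{v_1k_2 - v_2e}{v_1 + v_2}, \quad c_2 = \frac{(v_1 + v_2)k_3 - v_3e}{v_1 + v_2 + v_3}, \quad c_3 = \frac{(v_1 + v_2 + v_3)e - v_4e}{t - s},$

the expression for $I_6(s, t)$ can be rewritten as

$\cdots \times \exp\left(-\frac{\|x\|^2}{2(t - s)}\right) dv_1\,dv_2\,dv_3\,dk_1\,dk_2\,dk_3\,dx.$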
We now evaluate the integrals with respect to $k_1$, $k_2$, and $k_3$ (in this order), using Lemma 6.4.6 and inequality (29):

We change variables again, letting

$y = \frac{x}{\sqrt{t - s}}, \quad w_1 = \frac{v_1}{t - s}, \quad w_2 = \frac{v_2}{t - s}, \quad w_3 = \frac{v_3}{t - s},$

and note that

$v_4 = t - s - (v_1 + v_2 + v_3) = t - s - (t - s)(w_1 + w_2 + w_3) = (t - s)(1 - (w_1 + w_2 + w_3)).$

The expression for $I_6(s, t)$ now becomes

$\cdots \times \exp\left(-\frac{\|y\|^2}{2}\right) \frac{dw_1\,dw_2\,dw_3}{[w_1w_2w_3(1 - w_1 - w_2 - w_3)]^{1/2}}\,dy,$
and since the integral converges, case 6 is proved (even for $\varepsilon = 0$). As we have already said, we shall not prove the remaining cases 2-5 here, but just repeat that they can be obtained by combining the methods of the two proofs we have given; the main point is to realize that one has to treat the $u_i$'s which lie in the interval $(s, t)$ differently from those which lie outside.

C. Models of Polymers
After all these tedious technical calculations, we shall end this section with a brief and informal introduction to the mathematical physics of polymers and its relationship to the operators we have been studying. Polymers are long and flexible chain molecules composed of many repeat units (called monomers) occurring, e.g., in polyethylene (chains of -CH2- units), rubbers, plastics, and certain biological substances (biopolymers). The number of monomers in one molecule may be extremely large, up to $10^5$ to $10^6$ units. We shall only be concerned with linear chains where the centers of the repeat units are connected by a polygonal path, and ignore the branching network structure common in more complex polymers. If one ignores details below a certain characteristic magnitude called the persistence length, real-life polymers are almost perfectly flexible. This suggests that they can be modeled by Markov chains where the angles between two consecutive links in the polygonal path joining the monomer centers have a certain distribution. The idea is that the smaller the typical angle, the larger is the persistence length. For a detailed analysis of this model see Flory (1969) and Westwater (1980, 1981, 1982, 1985). If we ignore the persistence length altogether, the simple random walk can be used as a model, and if we also let the distance between consecutive centers go to zero, we end up with a Brownian motion. Hence Brownian paths may serve as a first approximation to polymer molecules.

If we accept this model, the $E_\omega$'s we have constructed above are just Hamiltonian forms describing the behavior of a quantum mechanical particle moving in the vicinity of a polymer molecule and only interacting with it through forces of extremely short range. Recalling the heuristic expression (6)

(34) $H_\omega = -\Delta - \int_0^1 \lambda_\omega(\cdot)\,\delta(\cdot - b(t))\,dt$

for the operator generated by $E_\omega$, a formal application of the Feynman-Kac formula gives the expression
(35) $\cdots \times \exp\left(\int_0^t \int_0^1 \lambda_\omega(\tilde b(s))\,\delta(b(r) - \tilde b(s))\,dr\,ds\right) dP,$
where $\tilde b$ is a new Brownian motion independent of $b$, for the associated semigroup $T^t_\omega$. As it stands, this formula makes no sense, but using the hyperfinite version of the Feynman-Kac formula in Theorem 5.3.11, we can interpret and prove it. Let $\mathcal{E}$ be a hyperfinite representation of $E$ on $L^2(Y, \mu)$, and let $g : Y^2 \to {}^*\mathbb{R}$ be the "delta function"

$g(x, y) = 1/\mu(x)$ if $x = y$, and $g(x, y) = 0$ otherwise.

Choose $\Delta t$ so small that $\mu(x) \log \Delta t$ is infinite for all $x \in Y$, and let $T$ be the time line $\{0, \Delta t, 2\Delta t, \ldots\}$. If $X : \Omega \times T \to Y$ and $\tilde X : \tilde\Omega \times T \to Y$ are independent copies of the Markov process generated by $\mathcal{E}$, their standard parts $b$ and $\tilde b$ are Brownian motions. Consider the forms
(36)

If $\lambda_\omega$ is chosen according to Theorem 6.2.6, the hyperfinite Feynman-Kac formula, 5.3.11, tells us that

(note that condition 5.3.11 (b) is satisfied since we have chosen $\mu(x) \log \Delta t$ infinite), where $Q^t_\omega$ is the semigroup generated by $\mathcal{E}_\omega$. This immediately gives us our interpretation of (35).

Recall that according to Theorem 6.4.5, we can choose $\lambda$ to be any bounded function as long as the dimension is less than or equal to three. With this in mind, it should come as no surprise to learn that Westwater (1980) has given a natural interpretation of
(38) $T(t, s) = \int_0^t \int_0^s \delta(b_1(u) - b_2(v))\,du\,dv$

as a real-valued stochastic process when $d \le 3$; the idea of his construction is simply to replace $\delta$ by approximating $\delta$-functions $\delta_\varepsilon$ and then look at the limit as $\varepsilon \to 0^+$. When $d > 3$ this limit does not exist, and if we want to give a standard interpretation of (35), we must also let the approximating $\lambda_\varepsilon$'s go to zero as $\varepsilon$ goes to zero. Theorem 6.4.2 tells us that when $d = 4$ or 5, it is possible to find such a family of $\lambda_\varepsilon$'s, at least in the sense that $T^t_\omega$ is different from the free semigroup. What happens to the integral in the exponent of (35) in this case is an interesting question we shall leave to the reader to answer.

So far we have been using Brownian paths as models for polymers, and as far as the general influence of polymer molecules on their environment is concerned, this may not be totally unrealistic and uninteresting. But as an explanation of the polymers themselves, their structure and properties, the model is far too simple and idealized. Not only have we disregarded the persistence length of the molecules, but we have also not taken into account the fact that since two monomers cannot occupy the same part of space, a polymer molecule cannot intersect itself. This is usually referred to as the "excluded volume effect," and mathematically it has a grave impact on the class of reasonable models. If we think of the growth of a polymer molecule as a stochastic process, to require that the paths do not intersect themselves means that the possible positions at any time are dependent on the entire past of the process; the model can no longer be Markov. As a consequence, the analysis of the resulting process becomes extremely complicated, and very few mathematical results have been obtained in the physically interesting case $d = 3$. However, on the more heuristic level quite a lot of information has been obtained through computer simulation [see e.g., Barber and Ninham (1970), Domb (1969), Flory (1969), Freed (1981), and Hammersley and Morton (1954)].

In higher dimensions analytic methods have had greater success, and later in this section we shall give a brief introduction to Lawler's work on self-avoiding random walks. But first we want to discuss a slightly different approach, which was introduced by Edwards (1965, 1975) and given a solid mathematical basis by Westwater (1980, 1981, 1982, 1985). Also in Edwards's model the polymer molecule is described by a stochastic process, but instead of the dynamic picture suggested by the term "self-avoiding random walk," the description in this case is quite static; the process is given in terms of a measure on the space $C([0,1], \mathbb{R}^d)$. If $W$ is the Wiener measure on $\mathbb{R}^d$, this measure is formally described by
(39) $d\mu_\lambda = Z^{-1} \exp\left(\lambda \int_0^1 \int_0^1 \delta(x(u) - x(v))\,du\,dv\right) dW(x),$

where $\lambda$ is a negative constant called the coupling constant, and $Z$ is a normalization factor

(40) $Z = \int \exp\left(\lambda \int_0^1 \int_0^1 \delta(x(u) - x(v))\,du\,dv\right) dW(x).$
We shall not discuss here the physical assumptions underlying (39) [the survey article by Freed (1981) is strongly recommended for anybody who wants a better understanding of polymer models], but intuitively it seems quite reasonable; since we are interested in nonintersecting paths, we penalize them whenever they intersect themselves (remember that $\lambda$ is negative). In dimension one it is easy to give sense to (39) by making use of Brownian local time, and in two dimensions a solution was given by Varadhan in an appendix to Symanzik's (1969) paper. The case $d = 3$ required a new technique and was solved by Westwater (1980, 1982). His proof is too long and technical to be presented here, but we shall try to explain the main idea, which exploits the close relationship between (39) and (36), or more precisely between the exponents

and

(42) $\int_0^1 \int_0^1 \delta(x(u) - x(v))\,du\,dv.$
Recall that according to (38) we know how to make sense of (41), and the idea is to interpret (42) by somehow expressing it in terms of (41). The problem is to split the Brownian motion $x$ in (42) into two independent Brownian motions $b_1$ and $b_2$ as in (41).

Here is the trick. Instead of integrating at once over the entire square $[0,1] \times [0,1]$, we break the integral in (42) down into smaller pieces. First we divide $[0,1] \times [0,1]$ into four smaller squares $[0,\tfrac12] \times [0,\tfrac12]$, $[0,\tfrac12] \times [\tfrac12,1]$, $[\tfrac12,1] \times [0,\tfrac12]$, and $[\tfrac12,1] \times [\tfrac12,1]$, and integrate over the square in the upper left-hand corner $[0,\tfrac12] \times [\tfrac12,1]$ and the one in the lower right-hand corner $[\tfrac12,1] \times [0,\tfrac12]$. We repeat the procedure with each of the two remaining squares $[0,\tfrac12] \times [0,\tfrac12]$ and $[\tfrac12,1] \times [\tfrac12,1]$, split them into four pieces, and integrate over the upper left-hand and lower right-hand corners. Again we repeat the procedure with respect to the four unused subsquares, and so on to infinity. Figure 6.1 shows the domains of integration in each of the three first steps. The $n$th time we repeat the procedure we will be integrating over $2^n$ squares each of area $4^{-n}$. If we sum all these contributions we get the integral over the unit square $[0,1] \times [0,1]$.
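The bookkeeping in this decomposition is easy to verify by machine. The following minimal sketch (the function names are hypothetical and not part of Westwater's construction) lists the off-diagonal squares used at step $n$ and checks, in exact rational arithmetic, that there are $2^n$ of them, each of area $4^{-n}$, so that the areas covered after ten steps add up to $1 - 2^{-10}$.

```python
from fractions import Fraction


def step_squares(n):
    """Squares integrated over at step n >= 1 of the dyadic procedure.

    Each square is ((a0, a1), (b0, b1)), meaning [a0, a1] x [b0, b1].
    The unused diagonal squares left by step n - 1 have side 2**-(n - 1);
    each contributes its two off-diagonal quarters.
    """
    side = Fraction(1, 2 ** n)
    squares = []
    for j in range(2 ** (n - 1)):
        lo = 2 * j * side
        mid = lo + side
        hi = mid + side
        squares.append(((lo, mid), (mid, hi)))  # upper left-hand corner
        squares.append(((mid, hi), (lo, mid)))  # lower right-hand corner
    return squares


def area(square):
    (a0, a1), (b0, b1) = square
    return (a1 - a0) * (b1 - b0)


total = Fraction(0)
for n in range(1, 11):
    squares = step_squares(n)
    assert len(squares) == 2 ** n
    assert all(area(sq) == Fraction(1, 4 ** n) for sq in squares)
    total += sum(area(sq) for sq in squares)

print(total)  # 1023/1024
```

The uncovered remainder after $n$ steps is a union of $2^n$ diagonal squares of total area $2^{-n}$, which shrinks to the diagonal of the unit square.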
Figure 6.1
To understand the idea behind this procedure, let us see what happens to the integral over the first subsquare $[0,\tfrac12] \times [\tfrac12,1]$. The two processes $x_1, x_2 : \Omega \times [0,\tfrac12] \to \mathbb{R}^3$ defined by

(43) $x_1(s) = x(\tfrac12 - s) - x(\tfrac12)$

and

(44) $x_2(t) = x(\tfrac12 + t) - x(\tfrac12)$

are independent Brownian motions. By a formal change of variables

$\int_{1/2}^{1} \int_0^{1/2} \delta(x(s) - x(t))\,ds\,dt = \int_0^{1/2} \int_0^{1/2} \delta(x_1(s) - x_2(t))\,ds\,dt.$
Hence $\int_{1/2}^{1} \int_0^{1/2} \delta(x(s) - x(t))\,ds\,dt$ has the same distribution as the quantity $T(\tfrac12, \tfrac12)$ in (38). Similarly, the integral over any of the squares of area $4^{-n}$ occurring in the $n$th step of the procedure above will have the same distribution as $T(1/2^n, 1/2^n)$. This reduces the problem of interpreting (39) to a problem about the existence of the limit of a sum of random variables consisting of copies of $T(1/2^n, 1/2^n)$. Since the copies of $T(1/2^n, 1/2^n)$ are dependent, this is quite a difficult task, and we can only refer the reader to Westwater's papers for the solution.

Knowing that the measures $\mu_\lambda$ exist, we may ask for their properties. Despite the formal expression (39), $\mu_\lambda$ is singular with respect to Wiener measure when $\lambda < 0$; in fact, Westwater (1982) has shown that if $\lambda_1 \ne \lambda_2$, then $\mu_{\lambda_1}$ and $\mu_{\lambda_2}$ have disjoint support. The results of Kusuoka (1983) seem to imply that the paths still intersect for finite $\lambda$'s, but nothing is known about the limit as $\lambda$ goes to infinity. For other work on self-intersection, see Dynkin (1985), Lawler (1985), LeGall (1985), Yor (1985), and Rosen (1983).

By the work of Varadhan and Westwater we know that the measures $\mu_\lambda$ in (39) exist when $d \le 3$. Westwater's argument highlights the close relationship between (39) and (35), and since we have proved that (35) makes sense for $d \le 5$, it is natural to ask what happens to (39) in dimensions four and five. This question is not only the result of an idle wish to generalize, but is also of the utmost importance in mathematical physics; through Symanzik's (1969) program it is intimately connected to the existence of nontrivial quantum fields in dimension four (the physically important dimension). We shall explain this connection in Section 7.5. An attempt to generalize Westwater's argument to four dimensions will probably meet a number of obstacles; not only is the existence of (39) much more precarious in this case, but also the fact that we have to choose $\lambda$ positive may create new problems.
Westwater's limit theorem is so complicated that it may be a good idea first to give a nonstandard treatment of it in dimension three [for $d = 2$, this has already been done by Stoll (1985)]. It is the kind of limit construction that seems to lend itself naturally to nonstandard methods. Once this has been achieved, an attempt should be made in dimension four.

As promised above we shall end this section with a few remarks on self-avoiding random walks. A simple random walk of length $N$ is a sequence $\{x_1, \ldots, x_N\}$ of elements in $\mathbb{Z}^d$ such that $|x_i - x_{i-1}| = 1$ for all $i \le N$, and a self-avoiding random walk of length $N$ is a simple random walk $\{x_1, \ldots, x_N\}$ where $x_i \ne x_j$ when $i \ne j$. On the set $\Omega_N$ of all simple random walks of length $N$, the natural measure is the normalized counting measure

$P_N\{x_1, \ldots, x_N\} = (2d)^{-N}.$

It is not so obvious what is the "right" measure to put on the set $S_N$ of all self-avoiding random walks of length $N$, but also here it has been customary to use the uniform measure

$Q_N\{x_1, \ldots, x_N\} = 1/|S_N|.$

Precise results about the measures $Q_N$ have been scarce [see Barber and Ninham (1970), Brydges et al. (1982), Brydges and Spencer (1985), Domb (1969), Flory (1969), Freed (1981), Hammersley and Morton (1954), and Kesten (1963)] and a few years ago Lawler (1980) introduced a new and more tractable measure on $S_N$. His idea was to construct self-avoiding random walks from simple ones by erasing all the loops. The construction can be described more precisely as follows. Given a simple random walk $\{x_1, \ldots, x_N\}$, we define
$\varphi = \min\{ j \mid \exists i < j\ (x_i = x_j) \},$

$\tau = $ the $i < \varphi$ for which $x_i = x_\varphi$,

and let $\{y_1, \ldots, y_M\}$ be the path given by

$y_i = x_i$ if $i < \tau$, and $y_i = x_{i + \varphi - \tau}$ if $i \ge \tau$.

Observe that $M = N - (\varphi - \tau)$ and that $\{y_1, \ldots, y_M\}$ is the path obtained from $\{x_1, \ldots, x_N\}$ by erasing the loop $\{x_\tau, x_{\tau+1}, \ldots, x_{\varphi-1}\}$. If $\{y_1, \ldots, y_M\}$ is self-avoiding, we stop the process; if not, we repeat the loop-erasing operation till we get a self-avoiding path. Figure 6.2 illustrates the process.

Figure 6.2

Let $\Omega$ be the set of simple random walks of infinite length, and let $P$ be the natural measure on $\Omega$. If $d \ge 3$, the simple random walk is transient, i.e., $\|x_n\| \to \infty$ with probability one as $n \to \infty$. Hence we can apply the procedure above to an infinite random walk $\{x_i\}$, and the result is an infinite, self-avoiding path $\{z_i\}$. We shall use the notation

$D_N(\{x_i\}) = \{z_1, z_2, \ldots, z_N\}.$

Lawler's measure on $S_N$ is defined by

$q_N(\{z_1, \ldots, z_N\}) = P\{ \{x_i\} \mid \{z_1, \ldots, z_N\} = D_N\{x_i\} \}.$

Let $X_N : \Omega_N \times \{1, 2, \ldots, N\} \to \mathbb{Z}^d$ be simple random walk considered as a stochastic process on $(\Omega_N, P_N)$. If we choose $N$ infinite, we know that the standard part of the process $Y_N : \Omega_N \times \{0, 1/N, \ldots, 1\} \to {}^*\mathbb{R}^d$ defined by

is a Brownian motion. We can also consider the loop-erased, self-avoiding random walk as a stochastic process $X_N : S_N \times \{1, 2, \ldots, N\} \to \mathbb{Z}^d$ on $(S_N, q_N)$. Lawler's main result states that if $d \ge 5$, there is a positive, real number $K_d$ such that the standard part of
t)=
d
m
~
XN(W,
KdNf)
is a Brownian motion. On the basis of heuristic arguments and numerical evidence it has been conjectured that the limit distribution of a self-avoiding random walk is Gaussian when $d \ge 4$ but not when $d \le 3$, and the result above is the first successful step toward a proof of this claim. The original conjecture is concerned with the uniform measure $Q_N$ and not with Lawler's measure $q_N$, and in a later paper Lawler (1983) has studied the relationship between the two approaches. We would like to point out, however, that there is no reason to consider $Q_N$ any more natural than $q_N$; indeed, Lawler (1980) has given the following characterization of the loop-erased random walk, which is perhaps more intuitive than the definition. Given a subset $A \subset \mathbb{Z}^d$, we may define a random walk with taboo set $A$ by restricting our attention to those simple random walks which do not hit $A$. Lawler proved that given the $n$ first values $x(1), x(2), \ldots, x(n)$ of a loop-erased random walk, then $x(n + 1)$ has the same distribution as $y(n + 1)$, where $y$ is a random walk with taboo set $\{x(1), x(2), \ldots, x(n)\}$ and $y(n) = x(n)$. For more information about the importance of self-avoiding random walks in polymer physics and quantum field theory, see Freed (1981) and Nelson (1983).
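The loop-erasing construction described in this section translates directly into a short program. The sketch below is an illustration under stated assumptions, not Lawler's own formulation: paths are Python lists of integer tuples indexed from 0 and include the starting point, and the names `erase_first_loop`, `loop_erase`, and `simple_random_walk` are hypothetical.

```python
import random


def erase_first_loop(path):
    """Erase the first loop of a path, or return it unchanged if none exists.

    With phi the first index at which a site is revisited and tau the
    earlier index with path[tau] == path[phi], the sites
    path[tau], ..., path[phi - 1] are removed.
    """
    first_visit = {}
    for phi, site in enumerate(path):
        if site in first_visit:
            tau = first_visit[site]
            return path[:tau] + path[phi:]
        first_visit[site] = phi
    return path


def loop_erase(path):
    """Repeat the loop-erasing operation until the path is self-avoiding."""
    while True:
        shorter = erase_first_loop(path)
        if len(shorter) == len(path):
            return path
        path = shorter


def simple_random_walk(n_steps, dim, rng):
    """Nearest-neighbor walk on Z^dim started at the origin."""
    position = (0,) * dim
    path = [position]
    for _ in range(n_steps):
        axis = rng.randrange(dim)
        sign = rng.choice((-1, 1))
        position = position[:axis] + (position[axis] + sign,) + position[axis + 1:]
        path.append(position)
    return path


walk = simple_random_walk(200, dim=3, rng=random.Random(0))
erased = loop_erase(walk)
print(len(walk), len(erased), len(set(erased)) == len(erased))
```

Each erasure shortens the path by $\varphi - \tau$ sites, in agreement with $M = N - (\varphi - \tau)$, and the result still consists of nearest-neighbor steps because the site before the loop is adjacent to $x_\tau = x_\varphi$.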
6.5. APPLICATIONS OF NONSTANDARD ANALYSIS TO THE BOLTZMANN EQUATION

A. The Equation
The (nonlinear) Boltzmann equation (Boltzmann, 1867) is the basic equation of gas kinetics. The aim is to deduce the macroscopic behavior of gases from the microscopic model of kinetic gas theory, in which a gas is described as consisting of rigid spheres (molecules) interacting by collisions, according to the laws of elastic collisions of classical mechanics. It is a basic model for nonequilibrium phenomena and for the study of the approach to equilibrium (asymptotic Maxwell-Boltzmann distribution) in gases.

To discuss the equation we need some notation. Consider a gas of identical point molecules supposed first to interact by a potential of finite range $d$. The molecules move in a region $\Lambda$ of $\mathbb{R}^3$, with some suitable boundary conditions on the boundary of $\Lambda$, e.g., periodic ones. The velocities of the molecules can take all values in $\mathbb{R}^3$, so that the phase space of each molecule is $M = \Lambda \times \mathbb{R}^3$. Let $F$ be the density of the molecules in $M$, a quantity supposed a priori to exist and depend differentiably on time and space, so that an evolution equation can be set up. The number of molecules at time $t$ in a region $A$ of $M$ is then $\int_A F(x, v, t)\,dx\,dv$.
The Boltzmann equation expresses $\partial F/\partial t$ through a balance between the numbers of molecules entering a region of collision and leaving it. Let us consider two molecules of initial velocities $v_1$ and $v_2$, initially separated in space at a distance larger than the range $d$ of the potential. Let $v_1'$ and $v_2'$ be the respective velocities of the molecules after collision has taken place and the molecules again are separated by a distance larger than $d$. Conservation of momentum and conservation of energy give

$v_1 + v_2 = v_1' + v_2'$ and $v_1^2 + v_2^2 = (v_1')^2 + (v_2')^2.$

These equations are not sufficient, in our three-dimensional case, to completely specify the collision. For such a complete specification one can, e.g., introduce a plane $P$ orthogonal to $w = v_2 - v_1$ and at rest with respect to the first molecule. In this plane, $u$ is the vector from the first molecule to the point of intersection with the plane $P$ of the straight line from the second molecule at time $-\infty$ in the direction of $w$. Note that in the case of finite range $d$ one has collisions only if $|u| \le d$. We denote in general by $B$ the values of $u$ for which one has collisions.

Boltzmann, using the conservation laws and a statistical hypothesis (the famous Stosszahlansatz), computed the number of collisions occurring in a short interval of time in small regions $\Delta x$ of position space between molecules with incoming momenta in small regions $\Delta v_1$, $\Delta v_2$, respectively $\Delta v_1'$, $\Delta v_2'$, of velocity space. From this computation he then derived the equation

(1) $(\partial F/\partial t)(x, v_1, t) + v_1 \cdot \nabla_x F(x, v_1, t) = (QF)(x, v_1, t), \quad t > 0,$

where $\nabla_x$ is the gradient with respect to the position $x \in \Lambda$ and $Q$ is the so-called collision operator
(2) $(QF)(x, v_1, t) = \int_{\mathbb{R}^3} \int_B [F(x, v_1', t) F(x, v_2', t) - F(x, v_1, t) F(x, v_2, t)]\,|v_2 - v_1|\,du\,dv_2.$

This is the Boltzmann equation, which should be solved under an initial condition $F(x, v, 0)$. In the case where the range of the potential is infinite one has to take $B = \mathbb{R}^2$. This equation has been derived from classical mechanics without any statistical hypothesis, e.g., in the case of a purely hard-core potential, a model equivalent to having hard spheres of radius $d/2$. This was done by Lanford (1975), starting with a gas of $n$ such spheres in a fixed container $\Lambda$ and letting $n \to \infty$, $d \to 0$ in such a way that $nd^2 = 1$. He showed that if the initial condition $F(x, v, 0)$ is such that $F(x, v, 0) = \lim_{d \to 0} \lim_{n \to \infty} n^{-1} F^{(n,d)}(x, v, 0)$, the $F^{(n,d)}$ being initial values for the density function for the $n$ molecules of diameter $d$, then $\lim_{d \to 0} \lim_{n \to \infty} n^{-1} F^{(n,d)}(x, v, t)$ exists almost everywhere and the limit function $F(x, v, t)$ is a generalized solution of the Boltzmann equation, at least for sufficiently small times $t > 0$.

There is an extensive literature, both physical and mathematical, on the Boltzmann equation and its consequences; see, e.g., Truesdell and Muncaster (1980) and Cercignani (1975). Main mathematical problems which have been studied are the following:

(1) Existence and Uniqueness of Solutions. In the so-called spatially homogeneous case one looks at solutions which are independent of the space variable $x$. In this case existence and uniqueness of classical solutions were proved under suitable initial conditions by several workers, starting with Carleman (1933) and including Morgenstern, Wild, Truesdell, and Povzner. The strongest results in this case seem to be those obtained by Arkeryd (1972-1984) [for additional references see Truesdell and Muncaster (1980), Chapter 21]. We shall discuss this case further below, using methods of nonstandard analysis.
For solutions depending on space (i.e., in the spatially inhomogeneous case), existence and uniqueness of weak solutions for a finite interval of time have been proved in the case of potentials of finite range starting with Grad (1958), and including work by Ukai, Shizuta, Nishida and Imai, Glickson, and Kaniel and Shinbrot [see Truesdell and Muncaster (1980), Chapter 20]. Whether these solutions are classical is an open problem. Ukai, Shizuta, and Nishida and Imai have shown global in time existence near the equilibrium (of the Maxwell distribution). The only presently known global in time existence results far from equilibrium have been obtained recently by Arkeryd, using nonstandard analysis. We shall describe this below.
(2) Study of the Asymptotic Behaviour in Time of the Solutions of the Boltzmann Equation, Approach to Equilibrium. Proofs of the asymptotic approach to the Maxwell distribution under suitable initial conditions and assumptions on the potential are available in the spatially homogeneous case with finite-range potentials (Carleman, Arkeryd, and others). The first proof of convergence from initial conditions far from equilibrium and for potentials of infinite range has been obtained by Arkeryd (1982), using methods of nonstandard analysis. This will also be discussed further below. For finite-range potentials and particular boundary conditions convergence has been proven for $x$-dependent solutions near equilibrium by Ukai (1974), and then extended by Shizuta, Imai, and Nishida.
Let us introduce some notation that we shall need in our nonstandard study below. Together with the Boltzmann equation (1) it is convenient to consider the following modification of it:
where $Q_w$ is defined as the collision operator $Q$ with the quantity $|v_2 - v_1|$ replaced by a function $w(v_1, v_2, u)$ of $v_1$, $v_2$, and $u$. Below we shall consider together with $w$ its upper and lower truncated version $w_n^m$, defined to be equal to $w$ for $|u| \le m$ and $v_1^2 + v_2^2 \le n^2$, and equal to zero if either $|u| \le m$, $v_1^2 + v_2^2 > n^2$, or $|u| > m$. We remark at this point that the results we shall prove below for (1') with $w$ replaced by $w_n^m$ actually also hold in the case $w(v_1, v_2, u) = |v_2 - v_1|^\alpha$ with $0 \le \alpha < 2$. We shall consider for simplicity the case where the molecules have positions $x$ varying in $A$, with $A$ a vessel with periodic boundary conditions; i.e., we take $A$ to be the torus $\mathbb{R}^3/\mathbb{Z}^3$. However, completely similar techniques yield corresponding results in many cases where $A$ is a bounded region with other boundary conditions, and in the case where there is, in addition to the two-body forces, some external force acting upon the molecules.
B. Physically Natural Initial Conditions
Natural initial conditions for the Boltzmann equation are such that $F(x, v, 0) = F_0(x, v) \ge 0$ with no more severe restrictions than $F_0$, $v^2 F_0$, $F_0 \log F_0$ in $L^1(dx\,dv)$. In fact, $F_0 \ge 0$, $F_0 \in L^1(dx\,dv)$ expresses the fact that $F_0$ is a density, and $v^2 F_0 \in L^1$ comes from the fact that $\int v^2 F_0\,dx\,dv$ has the physical interpretation of the total energy of the molecules. Computing formally, one sees that $\int F_0\,dx\,dv = \int F\,dx\,dv$ and $\int v^2 F_0\,dx\,dv = \int v^2 F\,dx\,dv$ for all $t$, so that $\int F\,dx\,dv$ and $\int v^2 F\,dx\,dv$ are conserved quantities. In a similar way one sees that the famous $H$ quantity
$$H_F(t) = \int F \log F\,dx\,dv$$
decreases as $t$ increases, i.e., $H_F(t) \le H_F(0)$.
As is well known [see, e.g., Truesdell and Muncaster (1980) and Cercignani (1975)], $H_F$ can be interpreted as an entropy function. So the natural setting for studying the Boltzmann equation (1) is in a space of functions $F$ which are such that $F \ge 0$ and $F$, $v^2 F$, $F \log F \in L^1$. But there are serious difficulties due to the lack of linearity, boundedness, and continuity of Boltzmann's equation (1). This comes about in three different ways. First, $QF$ is not linear in $F$. Second, at least in the case $B = \mathbb{R}^2$, the two terms in $Q$ are not well defined when taken separately. Third, $Q$ as a function of $x$ contains the square of an $L^1$ function. In the space-homogeneous case, where one looks at solutions $F$ independent of $x$, the third problem, of course, disappears. This is the main reason why existence and uniqueness have been easier to obtain when $x$ is absent than otherwise. We shall present below the nonstandard way of doing this. In the space-inhomogeneous case ($x$ dependence) the only global existence results for the above physical conditions are for solutions in a Loeb $L^1$ setting by nonstandard analytical methods.
C. The Equation with a Truncated Collision Term
Before considering the full Eq. (1) we shall look first at the simpler case of the “truncated Boltzmann equation”
$$(d/dt)\,F(x + v_1 t, v_1, t) = (Q_n^m F)(x + v_1 t, v_1, t) \tag{3}$$
with
$$(Q_n^m F)(x, v_1, t) = \int_{\mathbb{R}^3 \times B} \bigl[(F \odot F)(x, v_1', v_2', t) - (F \odot F)(x, v_1, v_2, t)\bigr]\, w_n^m(v_1, v_2, u)\, dv_2\, du,$$
where $m$, $n$ are integers, and
$$(F \odot F) = \begin{cases} F \otimes F, & \text{if } |F \otimes F| \le n, \\ n\,\operatorname{sign}(F \otimes F), & \text{otherwise.} \end{cases}$$
$w_n^m$ was defined above in Section 6.5.A.
In this case we have the following
6.5.1. THEOREM. There exists a unique non-negative solution of (3) in $L^\infty$ when the initial value $F_0$ is in $L^\infty_+$.
SKETCH OF PROOF. It is easy to see from the definition of $(F \odot F)$ that for any functions $F$, $G$:
$$|(F \odot F)(x, v_1, v_2, t) - (G \odot G)(x, v_1, v_2, t)| \le |F(x, v_1, t)|\,|F(x, v_2, t) - G(x, v_2, t)| + |G(x, v_2, t)| \cdot |F(x, v_1, t) - G(x, v_1, t)|.$$
Since the integration in $Q_n^m$ is over bounded sets, this implies
$$\|Q_n^m F - Q_n^m G\|_\infty \le K\,(\|F\|_\infty + \|G\|_\infty)\,\|F - G\|_\infty$$
for some constant $K$. Thus $Q_n^m$ is locally Lipschitz in $L^\infty$ and there exists a unique local solution $F$ of the integrated version of (3)
$$F(x + v_1 t, v_1, t) = F_0(x, v_1) + \int_0^t Q_n^m F(x + v_1 s, v_1, s)\, ds.$$
But
$$\|Q_n^m F\|_\infty \le \pi m^2 \cdot 2n \cdot \pi n^3 \cdot 4/3 = K'$$
and so
$$\|F(t)\|_\infty \le \|F_0\|_\infty + tK'.$$
Thus $F$ exists for all $t > 0$. To complete the proof it remains to verify $F \ge 0$. It is easy to show that $F$ is the only solution of the equation
$$G(x + v_1 t, v_1, t) = \exp(-H(x, v_1, t))\,F_0(x, v_1) + \int_0^t \exp(-H(x, v_1, t) + H(x, v_1, s))\; Q_n^{m\prime} G(x + v_1 s, v_1, s)\, ds \equiv (\Phi G)(x, v_1, t), \tag{4}$$
where
and
- ( F Om x ,
u1, u 2 ,
s)Iwnm(u1,
u2,
u ) do2 du.
We note that $Q_n^{m\prime}$ has a structure similar to $Q_n^m$ but it has the nice property of being order preserving in the sense that $Q_n^{m\prime} G_1 \ge Q_n^{m\prime} G_2 \ge 0$ if $G_1 \ge G_2 \ge 0$. For small $t$, the solution $G$ of (4) is the limit of the increasing sequence $G_1 = 0$, $G_j = \Phi G_{j-1}$, $j \ge 2$. It follows that the solution of (4), hence of (3), satisfies $F \ge 0$ for small $t$. By a continuation argument we then get $F \ge 0$ for all $t > 0$, which completes the proof of the theorem; for more details consult Arkeryd (1972).
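The monotone iteration used in this positivity argument can be tried out on a toy problem. The sketch below is not the Boltzmann equation: as an illustrative assumption it replaces (3) by the scalar equation $G' = -G^2$, takes $H(t) = ht$ with a constant $h$, and uses the shifted source $hG - G^2$, which is order preserving for $0 \le G \le h/2$. The function name `monotone_iteration` and all parameters are hypothetical.

```python
import math


def monotone_iteration(g0=1.0, h=4.0, t_end=1.0, steps=200, sweeps=40):
    """Iterate G_1 = 0, G_j = Phi(G_{j-1}) for the toy equation G' = -G^2.

    Phi(G)(t) = exp(-h t) g0 + integral_0^t exp(-h (t - s)) q(G(s)) ds,
    with q(G) = h G - G^2, which is increasing on [0, h/2].
    """
    dt = t_end / steps
    times = [k * dt for k in range(steps + 1)]

    def q_shifted(g):
        return h * g - g * g

    iterates = [[0.0] * (steps + 1)]  # G_1 = 0
    for _ in range(sweeps):
        prev = iterates[-1]
        source = [q_shifted(g) for g in prev]
        new = []
        for k, t in enumerate(times):
            # trapezoid rule for the integral over [0, t]
            integral = 0.0
            for i in range(k):
                left = math.exp(-h * (t - times[i])) * source[i]
                right = math.exp(-h * (t - times[i + 1])) * source[i + 1]
                integral += 0.5 * dt * (left + right)
            new.append(math.exp(-h * t) * g0 + integral)
        iterates.append(new)
    return times, iterates


times, iterates = monotone_iteration()
# The exact solution of G' = -G^2, G(0) = 1 is 1 / (1 + t).
print("computed G(1):", iterates[-1][-1], " exact:", 1.0 / (1.0 + times[-1]))
```

Every iterate is non-negative and the sequence increases pointwise, so the limit inherits non-negativity; this mirrors, in a much simpler setting, how $F \ge 0$ is obtained above.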
6.5.2. REMARK. The solution of (3) given in Theorem 6.5.1 conserves mass, i.e., $\int F_0\,dv = \int F\,dv$, and energy, i.e., $\int v^2 F_0\,dv = \int v^2 F\,dv$, and the $H$ function introduced in Section 6.5.B is decreasing. The reason for these properties is that the simple formal argument mentioned above is actually rigorously valid if $F_0$, $v^2 F_0 \in L^1$. From this the general case follows, since
$$F = F_0 \quad \text{if } |v_1| > n,$$
and for $|v_1| \le n$, $F$ depends only on $F_0$ restricted to $|v_1| \le n$, and the latter function evidently is in $L^1$.
D. The Nonstandard Tool
We shall now study the Boltzmann equation (1) in a bounded region $A$ of $x$ space with initial condition $F_0 \in L^1_+(A \times \mathbb{R}^3)$, under the assumption of a two-body potential of finite or infinite range, following Arkeryd (1981a,b, 1984). Our starting point will be the truncated case of Theorem 6.5.1 but now within the enlarged universe $V({}^*\mathbb{R})$. Choose $n \in {}^*\mathbb{N} - \mathbb{N}$ and let
$$f_0(x, v) \equiv \min({}^*F_0(x, v), n) + n^{-1} \exp(-v^2).$$
It is often easy to show that the $V({}^*\mathbb{R})$ solution $f$ with initial value $f_0$ obtained by Theorem 6.5.1 defines a weak solution in $V(\mathbb{R})$
$$L_t : G \mapsto {}^\circ\!\!\int f\,{}^*G,$$
for $G$ belonging to some suitably nice space. In this way one can recover various results, previously proved directly in $V(\mathbb{R})$, i.e., by standard means. This approach has two positive aspects. On the one hand, our method for recovering these old results has advantages of a conceptual, expository, and pedagogical nature. On the other hand, and perhaps more importantly, new results are obtained in this way. The first proof of asymptotic convergence to the Maxwellian value of $F$ when $t \to \infty$ of solutions of (1) in the space-homogeneous case with no cutoff was obtained by Arkeryd, using the fact that a method from the standard cutoff case can be directly applied in the nonstandard setting to a cutoff in ${}^*\mathbb{N} - \mathbb{N}$. In Section 6.5.E we shall prove the existence of solutions and in Section 6.5.F asymptotic convergence. As another interesting development, in the space-inhomogeneous case far from equilibrium, the nonstandard technique has suggested a new solution concept, which in fact has yielded the only existence results so far. This will be discussed in Section 6.5.G.
Both these cases show once more that nonstandard analysis can be an efficient additional, rather than alternative, tool for working mathematicians. In the discussion of the two cases that now follows, some arguments are only sketched. For further details the reader is referred to the original papers (Arkeryd, 1981a, b, 1982, 1984).
E. The Space-Homogeneous Case: Global Existence of Solutions
We shall discuss the space-homogeneous Boltzmann equation (1); i.e., we look for initial data $F_0$ and solutions $F$ independent of the space variable $x$. The two-body potentials can be of finite or infinite range. To (1) we can associate the weak form
1
F ( 0 , t ) g ( v , t ) du
=
J
F o ( v ) g ( u ,s) dv
for $t > 0$ and all $g \in C^{1,\infty} = \{G \in C^1([0, \infty) \times \mathbb{R}^3);\ \|G\|_1 = \sup|G(v, t)| + \sup|\partial_t G(v, t)| + \sup|\nabla_v G(v, t)| < \infty\}$. Corresponding to the truncated version of Section 6.5.C we have a similar weak form. In the extended universe we can write this form as $\int f(v, t)\,g(v, t)\,{}^*dv =$
6,.
fo(v)g(v, 0 ) *du
where $Q_n = Q_n^n$. In (5) and future formulas of the same kind our notation is a little sloppy, as we should really have asterisks on all differentials, e.g., ${}^*dv\,{}^*ds$ and not just ${}^*dv\,ds$. We note that we have by change of variables
We can now state the following theorem.
6.5.3. THEOREM. Suppose that $F_0 \ge 0$ and $F_0$, $v^2 F_0$, $F_0 \log F_0 \in L^1$. Then for $m = n \in {}^*\mathbb{N} - \mathbb{N}$ the solution $f$ in $V({}^*\mathbb{R})$ of Theorem 6.5.1 with initial value
$$f_0(v) = \min({}^*F_0(v), n) + n^{-1} \exp(-v^2)$$
defines a mapping in $V(\mathbb{R})$
$$F : \mathbb{R}_+ \to L^1_+$$
through
$$\int_{\mathbb{R}^3} F(v, t)\,G(v)\,dv = {}^\circ\!\!\int_{{}^*\mathbb{R}^3} f(v, t)\,{}^*G(v)\,{}^*dv,$$
for all $G \in C^{1,\infty}$. The function $F$ is a weak solution of (1) in the sense that, with $F_t(v) = F(v, t)$:
J
c
F,(u)G(u,t ) dv R3
=
J
F , ( v ) G ( u , o )du R3
Here
and potentials of infinite range are required to be of inverse $k$th power type, $k > 2$, or suitable generalizations.
REMARK. For a fuller discussion of such potentials giving convergence of $(QF, G)$ see Arkeryd (1981a), p. 14. This paper also considers consequences of the present theorem not treated below, such as the behavior of higher moments of $F$.
PROOF. Since the real-valued mapping $L_t$ from the space $C_0(\mathbb{R}^3)$ of continuous functions on $\mathbb{R}^3$ with compact support defined by
$$L_t G = {}^\circ\!\!\int_{{}^*\mathbb{R}^3} f(v, t)\,{}^*G(v)\,{}^*dv$$
is linear with
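it follows that $L_t$ defines a measure $\mu_t$. Recall from Section 6.5.C that under the given assumptions on $F_0$, we have that $\int F_0\,dv = \int F\,dv$, $\int v^2 F_0\,dv = \int v^2 F\,dv$ (mass and energy conservation), and using this we get
$$\int_{\{|v| > j\}} f(v, t)\,{}^*dv \le (1 + j^2)^{-1} \int_{{}^*\mathbb{R}^3} (1 + v^2)\,f(v, t)\,{}^*dv = (1 + j^2)^{-1} \int_{{}^*\mathbb{R}^3} (1 + v^2)\,f_0(v)\,{}^*dv \approx 0,$$
for all $j \in {}^*\mathbb{N} - \mathbb{N}$. From this it follows that for any sequence $k \in \mathbb{N}$, $k \to \infty$ we have
$$\lim_{k \to \infty} \mu_t\{|v| > k\} = 0.$$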
From the remark at the end of Section 6.5.C we have by transfer
$$\int_{{}^*\mathbb{R}^3} f(v, t) \log f(v, t)\,{}^*dv \le \int_{{}^*\mathbb{R}^3} f_0(v) \log f_0(v)\,{}^*dv,$$
with the right member finite by assumption. Applying the elementary inequality $y \log y \ge -z + y \log z$ ($y \ge 0$, $z > 0$) to the case $y = f(v, t)$, $z = \exp(-v^2)$ and using the above inequality we have, with $\log^+ x \equiv \sup(0, \log x)$:
$$\int_{{}^*\mathbb{R}^3} f \log^+ f\,{}^*dv \le \int_{\mathbb{R}^3} F_0(v) \log F_0(v)\,dv + \int_{\mathbb{R}^3} \exp(-v^2)\,dv + \int_{\mathbb{R}^3} F_0(v)\,v^2\,dv = K.$$
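The elementary inequality invoked here follows in one line from $\log x \ge 1 - 1/x$. The short derivation below is a sketch added for convenience; it assumes only $y > 0$, $z > 0$ (the case $y = 0$ being trivial with the convention $0 \log 0 = 0$).

```latex
\begin{align*}
y\log y - y\log z
  &= y\log\frac{y}{z} \;\ge\; y\Bigl(1 - \frac{z}{y}\Bigr) = y - z \;\ge\; -z ,\\[2pt]
\text{so with } y = f(v,t),\ z = e^{-v^2}:\qquad
f\log f &\ge -e^{-v^2} - v^2 f .
\end{align*}
```

Hence the negative part of $f \log f$ is dominated by $e^{-v^2} + v^2 f$, which is why $\int f \log^+ f$ can be controlled by the entropy, a Gaussian integral, and the energy.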
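From this we get for *-measurable subsets $\Omega$ in ${}^*\mathbb{R}^3$ and for any $N > 1$:
$$\int_\Omega f(v, t)\,{}^*dv \le N \int_\Omega {}^*dv + (\log N)^{-1} \int_\Omega f(v, t) \log^+ f(v, t)\,{}^*dv$$
$$\le N \int_\Omega {}^*dv + (\log N)^{-1} \int_{{}^*\mathbb{R}^3} f(v, t) \log^+ f(v, t)\,{}^*dv$$
$$\le N \int_\Omega {}^*dv + K (\log N)^{-1}.$$
It follows that
$$\int_\Omega f(v, t)\,{}^*dv \approx 0 \quad \text{if} \quad \int_\Omega {}^*dv \approx 0. \tag{$\alpha$}$$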
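The first inequality in the chain with the parameter $N$ comes from splitting $\Omega$ according to the size of $f$. The sketch below spells this out, together with one way of reaching the infinitesimal conclusion; the particular choice $N = \varepsilon^{-1/2}$ is an illustrative assumption and is not taken from the source.

```latex
\begin{align*}
\int_\Omega f\,{}^*dv
 &= \int_{\Omega\cap\{f\le N\}} f\,{}^*dv + \int_{\Omega\cap\{f> N\}} f\,{}^*dv \\
 &\le N\int_\Omega {}^*dv
    + \int_{\Omega\cap\{f>N\}} f\,\frac{\log^+ f}{\log N}\,{}^*dv
    \qquad (\log^+ f > \log N > 0 \text{ on } \{f > N\}) \\
 &\le N\int_\Omega {}^*dv + \frac{K}{\log N}.
\end{align*}
```

If $\int_\Omega {}^*dv = \varepsilon \approx 0$ with $\varepsilon > 0$, one may take $N = \varepsilon^{-1/2}$: then $N\varepsilon = \varepsilon^{1/2} \approx 0$, and $K/\log N \approx 0$ because $N$ is infinite while $K$ is finite. The case $\varepsilon = 0$ is trivial.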
Now let $\Omega_k$ be a sequence of measurable sets in $\mathbb{R}^3$, which is such that
$$\lim_{k \to \infty} \int_{\Omega_k} dv = 0.$$
From ($\alpha$) above we then get, using also the definition of $\mu_t$, that
$$\lim_{k \to \infty} \mu_t(\Omega_k) = 0.$$
This implies that $\mu_t$ is absolutely continuous with respect to $dv$ and its Radon-Nikodym derivative is $F_t$, i.e., $d\mu_t = F_t\,dv$, with $F_t \in L^1_+(\mathbb{R}^3)$. This proves the first part of the theorem. To prove that $F$ is a weak solution of (1) in the sense stated in the theorem it suffices to remark that under the assumptions of the theorem Eq. (5) holds (rigorously, not only formally). Using this, one sees that $F$ is a weak solution by arguments close to those just used to prove the first part of the theorem.
F. The Space-Homogeneous Case: Asymptotic Convergence to Equilibrium
In this section we only treat forces of inverse $k$th power type, $k \ge 5$, thereby avoiding a discussion of the extra technical restrictions needed to include a wider range of the interactions of the previous section. The discussion is based on Arkeryd (1982). We consider again the space-homogeneous case with initial conditions $F_0 \in L^1_+$, $v^2 F_0 \in L^1_+$, $F_0 \log F_0 \in L^1$.
6.5.4. LEMMA. Assume that for some $s > 2$
$$(1 + |v|)^s F_0 \in L^1(\mathbb{R}^3).$$
Assume moreover that the force $K$ between the molecules is repulsive of the form $|K(x)| = \gamma/|x|^k$, for some $k \ge 5$. Then there exists a constant $C$, depending only on $s$ and on
such that P
where $F$ is the weak solution of the Boltzmann equation given by Theorem 6.5.3. Moreover, for $G = 1, v, v^2$ one has
$$\int_{\mathbb{R}^3} G(v)\,F(v, t)\,dv = \int_{\mathbb{R}^3} G(v)\,F_0(v)\,dv.$$
The proof of this result can be found in Elmroth (1983), to which we refer the reader, since the proof falls outside our present discussion of nonstandard analytical methods.
6.5.5. LEMMA. Let $F$ be the solution of Theorem 6.5.3 and let $t_n$ be a sequence such that for some $s \ge 2$ one has
$$\int (1 + |v|)^s F(t_n, v)\,dv < \infty;$$
then for $G(v) = (1 + |v|)^{s'}$, $2 < s' < s$, the sequence $GF(t_n)$ contains a subsequence converging weakly toward $GH$ for some $H \in L^1_+$.
SKETCH OF PROOF. The proof follows from the Dunford-Pettis (1940) theorem. In fact, weak relative compactness is equivalent to
(a) uniformly bounded total masses, i.e., $\sup_n \int \tilde F(t_n, v)\,dv < \infty$, with $\tilde F(t, v) = (1 + |v|)^{s'} F(t, v)$, which holds by assumption;
(b) uniformly small masses for large velocities, i.e., for all $\varepsilon > 0$ there is some $r_\varepsilon$ such that $\sup_n \int_{|v| > r_\varepsilon} \tilde F(t_n, v)\,dv < \varepsilon$, which holds by Lemma 6.5.4;
(c) uniformly small masses on small sets, i.e., for all $\varepsilon > 0$ there is some $\delta(\varepsilon) > 0$ such that $\sup_n \int_\Omega \tilde F(t_n, v)\,dv < \varepsilon$ if $\int_\Omega dv < \delta(\varepsilon)$, which holds as in the proof of Theorem 6.5.3, since
$$\int_{{}^*\mathbb{R}^3} f(v, t) \log f(v, t)\,{}^*dv \le \int_{{}^*\mathbb{R}^3} f_0(v) \log f_0(v)\,{}^*dv. \tag{6}$$
In the next lemma we give an estimate on the $H$ function $H_f$ from below, which goes back to Gibbs.
6.5.6. LEMMA. Consider the class $\mathcal{C}$ of functions $E \in L^1_+(\mathbb{R}^3)$ which satisfy
for some finite constants $A$, $B$, $C$. Let
$$E_0(v) = \exp(-a|v|^2 + b \cdot v + c),$$
with $a > 0$, $b \in \mathbb{R}^3$, $c \in \mathbb{R}$, be the unique exponential function satisfying the relations in (7) with equality also in the third relation. Then every $E \in \mathcal{C}$ satisfies
$$\int_{\mathbb{R}^3} E(v) \log E(v)\,dv \ge \int_{\mathbb{R}^3} E_0(v) \log E_0(v)\,dv,$$
and the equality holds only for $E = E_0$.
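A quick numerical illustration of this Gibbs-type bound may help. The sketch below is a one-dimensional analogue, which is an assumption made purely for simplicity (the lemma itself is stated on $\mathbb{R}^3$): it compares the entropy functional $\int E \log E\,dv$ of a box density with that of the Gaussian having the same mass, mean, and second moment. The helper names `entropy_functional`, `maxwellian`, and `box` are hypothetical.

```python
import math


def entropy_functional(density, lo, hi, cells=20_000):
    """Midpoint-rule approximation of the integral of density*log(density) on [lo, hi]."""
    width = (hi - lo) / cells
    total = 0.0
    for i in range(cells):
        e = density(lo + (i + 0.5) * width)
        if e > 0.0:  # convention 0 * log 0 = 0
            total += e * math.log(e) * width
    return total


def maxwellian(v):
    """Gaussian with mass 1, mean 0, second moment 1."""
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)


HALF_WIDTH = math.sqrt(3.0)  # box on [-sqrt(3), sqrt(3)]: mass 1, mean 0, second moment 1


def box(v):
    return 1.0 / (2.0 * HALF_WIDTH) if abs(v) <= HALF_WIDTH else 0.0


h_maxwell = entropy_functional(maxwellian, -12.0, 12.0)
h_box = entropy_functional(box, -HALF_WIDTH, HALF_WIDTH)
print("Gaussian:", h_maxwell, " box:", h_box)
```

In this toy comparison the Gaussian gives the smaller value of $\int E \log E\,dv$, in line with the statement that the exponential function $E_0$ minimizes the functional over the class with prescribed moments.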
SKETCH OF PROOF. It is enough to consider the case $b = 0$, since we can reduce to this case by translation. Set $q(E) = E \log E + a v^2 E - (c + 1)E$. The function $q$ attains its minimum at $E = E_0$, with $q(E_0) = -E_0$, which implies that $E \log E \ge -a v^2 E + (c + 1)E - E_0$, with equality iff $E = E_0$. From this the lemma follows by integration.
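The final integration step can be spelled out as follows. This is a sketch under the assumption that the relations (7), which are not legible above, fix the mass, $\int E\,dv = A = \int E_0\,dv$, and bound the energy, $\int |v|^2 E\,dv \le C = \int |v|^2 E_0\,dv$; with $b = 0$ the momentum relation plays no role.

```latex
\begin{align*}
\int E\log E\,dv
 &\ge -a\int |v|^2 E\,dv + (c+1)\int E\,dv - \int E_0\,dv \\
 &\ge -aC + (c+1)A - A \;=\; -aC + cA \qquad (a > 0),\\[2pt]
\int E_0\log E_0\,dv
 &= \int E_0\,\bigl(-a|v|^2 + c\bigr)\,dv \;=\; -aC + cA .
\end{align*}
```

Comparing the two lines gives $\int E \log E\,dv \ge \int E_0 \log E_0\,dv$.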
We can now study the asymptotic properties for $t \to \infty$ of the solution $F$ of Theorem 6.5.3. We show that we have weak convergence of $F$ to the equilibrium Maxwell distribution.
6.5.7. THEOREM. Assume that $F_0$ is $x$-independent and $F_0 \ge 0$, $F_0$, $v^2 F_0$, $F_0 \log F_0 \in L^1$, as well as $(1 + |v|)^s F_0 \in L^1$ for some $s > 2$. Then for $G(v) = (1 + |v|)^{s'}$, $0 \le s' < s$, the family $(GF(t))_{t \ge 0}$ converges as $t \to \infty$ in the weak $L^1$ sense to $GE_0$, where $E_0$ is the Maxwell distribution
$$E_0(v) = \exp(-a|v|^2 + b \cdot v + c), \qquad a > 0,$$
defined as in Lemma 6.5.6 by choosing the constants
$$A = \int F_0(v)\,dv, \qquad B = \int F_0(v)\,v\,dv, \qquad \text{and} \qquad C = \int F_0(v)\,|v|^2\,dv.$$
PROOF. In this proof we use $\log f$. This is well defined since $f_0(v) \ge n^{-1}\exp(-v^2) > 0$ implies, as we see from the solution formula (4) for $f$, that $f(t, v) > 0$ in the whole of ${}^*\mathbb{R}_+ \times {}^*\mathbb{R}^3$; i.e., $\log f$ is defined everywhere. But information about the domain of $\log F$ requires further arguments. We also need an estimate on $(QF, \log F)$. For this we actually would like $\log F \in C^{1,\infty}$, which does not hold in general. On the other hand, since $f$ is a solution in the sense of Theorem 6.5.1 [and not only in the sense of (5)], i.e., $f$ is a “strong nonstandard solution,” the corresponding quantity is well defined for $f$ (under the relevant nonstandard cutoff). The estimates for $f$ mentioned in Section 6.5.B can this time be carried through rigorously for $f$, not only formally. This, together with Lemma 6.5.6 with a Gaussian $E_0^n$ corresponding to $f_0$ (in the sense of Lemma 6.5.6), implies that
$$H_{E_0^n} \le H_f(t) = H_{f_0} + \int_0^t N_f(s)\,{}^*ds$$
with
$$(N_f)(t) = 4^{-1} \int_{{}^*\mathbb{R}^3 \times {}^*\mathbb{R}^3 \times {}^*B} \bigl[(f \odot f)(v_1', v_2', t) - (f \odot f)(v_1, v_2, t)\bigr]\, w_n^n(v_1, v_2, u) \times \log\bigl[f \otimes f(v_1, v_2, t) / f \otimes f(v_1', v_2', t)\bigr]\,{}^*dv_1\,dv_2\,du.$$
We recall that $(v_1', v_2')$ are determined by $v_1$, $v_2$, $u$ by the laws of collisions, as mentioned in Section 6.5.A.
We notice that the integrand in $N_f$ is nonpositive. Evidently $-N_f(s) > k^{-1} > 0$ at most on a set of measure $k(H_{f_0} - H_{E_0^n})$. So there is an increasing sequence $(t_j)_{j \in \mathbb{N}}$ of nearstandard reals, such that
By Lemmas 6.5.4 and 6.5.5 we can choose the sequence so that
$$(1 + |v|)^{s'} F(t_j) \to (1 + |v|)^{s'} E \tag{9}$$
weakly, with $E \in L^1_+$ satisfying
$$\int_{\mathbb{R}^3} G(v)\,E(v)\,dv = \int_{\mathbb{R}^3} G(v)\,F_0(v)\,dv,$$
for $G(v) = 1, v, v^2$. In particular,
Define
$$\Delta E(v_1, v_2, u) \equiv E \otimes E(v_1', v_2') - E \otimes E(v_1, v_2).$$
We will see in Lemma 6.5.8 that $\Delta E = 0$ for a.e. $(v_1, v_2, u) \in \mathbb{R}^3 \times \mathbb{R}^3 \times B$. Using this and a classical computation [see Arkeryd (1972)], one concludes that $E = E_0$. We shall now prove that $(1 + |v|)^{s'} E_0$, for $s' < s$, is the weak $L^1$ limit of $(1 + |v|)^{s'} F(t)$ as $t \to \infty$. In fact, make the ad absurdum assumption that $(1 + |v|)^{s'} F(t)$ does not converge weakly to $(1 + |v|)^{s'} E_0$ for some $s' < s$. We know by Lemma 6.5.5 that there is a sequence $(t_j')_{j \in \mathbb{N}}$ such that for some $E' \ne E_0$, $(1 + |v|)^{s'} F(t_j') \to (1 + |v|)^{s'} E'$ weakly. Take $G \in C^{1,\infty}$ such that
(This is possible since $E' \ne E_0$.) By Theorem 6.5.3
In particular, for some $\tau_0 > 0$ we have
7,
By the above inequality $H_{E_0^n} \le H_f(t) = H_{f_0} + \int_0^t N_f(s)\,{}^*ds$ we get for some finite $C$ and some $J_0 \in {}^*\mathbb{N} - \mathbb{N}$:
$$0 \le \sum_{j < J_0} \int_0^{\tau_0} (-N_f)(t_j' + \tau)\,{}^*d\tau \le C.$$
Since $N_f \le 0$, there are some $\tau' \le \tau_0$ and some finite $C'$ such that
$$\sum_{j \in \mathbb{N}} (-N_f)(t_j' + \tau') < C',$$
thus $\lim_{j \to \infty} {}^\circ N_f(t_j' + \tau') = 0$. As in the case of (8) and (9) above we can choose $(t_j')_{j \in \mathbb{N}}$ such that, moreover, $F(t_j' + \tau') \to E_0$ in the weak $L^1$ sense. Then
$$\lim_{j \to \infty} \int_{\mathbb{R}^3} [F(v, t_j' + \tau') - F(v, t_j')]\,G(v)\,dv = \int_{\mathbb{R}^3} [E_0(v) - E'(v)]\,G(v)\,dv. \tag{12}$$
This is $\ne 0$ according to (10), which, however, contradicts (11). Thus the ad absurdum assumption must be rejected and we have $(1 + |v|)^{s'} F(t) \to (1 + |v|)^{s'} E_0$ weakly for all $s' < s$. To complete the proof it remains to establish the following lemma.
6.5.8. LEMMA. With the notations in the proof of Theorem 6.5.7 we have
$$\Delta E = E \otimes E(v_1', v_2') - E \otimes E(v_1, v_2) = 0 \quad \text{for a.e. } v_1, v_2, u.$$
PROOF. Assume ad absurdum that $\Delta E \ne 0$ on a set of positive $dv_1\,dv_2\,du$ measure. Then there is a bounded measurable set $\Omega \subset \{|v_1| + |v_2| + |u| \le r_0\} \subset \mathbb{R}^3 \times \mathbb{R}^3 \times B$ of measure $m$ for some $r_0, m > 0$ such that
$$\int_\Omega \Delta E(v_1, v_2, u)\,|v_2 - v_1|\,dv_1\,dv_2\,du = C_0$$
for some real $C_0$, with $0 < C_0 < 1$. We shall now derive a contradiction of this with Eq. (8). Let $K > 1$ be given together with an arbitrary subset $\Omega' \subset \Omega$ of Lebesgue measure
I dv, du, du < K - , .
Then for the solution F of Theorem 6.5.3 we have, defining A F similarly to A E ,
I,,,
AF(U1, 0 2 ,
5
2r0
- v11
du, do2 du
f @ f ( v , ,u 2 , u, t ) * d u du, du, O
5
u, tIlu2
J
2ro(K-'
+ (l/log K)27rr;)
the latter estimate being similar to the one used in the proof of (c) in Lemma 6.5.5. By this estimate there is a standard KOsuch that
I,
AF(uI,
z)Z,
U,
f)l~,
-
uIJ
dul du2 du < co/4
I,.
if Q' c Q, t > 0, and dv, dv2 du < m i / K ; . The subset of R with F @ F ( u , , u2, t ) > Ki(I,1 f o ( v )du), = C, has measure smaller than r r i / K i , since r
r
Together with the argument in connection with (9) above, this implies that there is an index no, and given n > no a set Rj c R of measure larger than m - r r ; / K ; , such that for the sequence t, as in the proof of Theorem 6.5.7
on R, and (14)
O
I,, I,, MU,,
=
u2, u, t , ) ~ s - U'I *dU,
A F ( 01,
~
2
U, , t , ) l ~ 2-
011
du du, dU2 du > C0/2.
From this it follows, with Af+ = max(0, Af), by a proof of Arkeryd (1982) carried over to our present nonstandard setting, that there is a standard E > 0 and, for each n > no, a partition *R, = R,' u R,, such that
5
-min[(4rnC,ro)-', log(1 + ~ ) ] C i / 1 6 ,
which contradicts (8) above. This shows that the ad absurdum assumption has to be rejected, which proves the lemma.
6.5.9. REMARK. For other results in the space-homogeneous case see Arkeryd (1981a).
G. The Space-Inhomogeneous Case: A Loeb-Measure Approach
In this section we look for $x$-dependent solutions of the Boltzmann equation (1) in the case of periodic boundary conditions, i.e., $x \in A = \mathbb{R}^3/\mathbb{Z}^3$. We present essentially the results in the papers where Arkeryd (1981b, 1984) managed by nonstandard techniques to extend to the space-inhomogeneous case the strong results first obtained for the space-homogeneous case. We first consider Eq. (3) extended by transfer for $m \in \mathbb{N}$, $n \in {}^*\mathbb{N} - \mathbb{N}$. We remark that this corresponds to the sole assumption of having a cutoff in the impact parameter (which is justified in the case of finite-range potentials), but no cutoff in velocity space. Take as initial value
$$f_0(v) = \min({}^*F_0(v), n) + n^{-1} \exp(-v^2)$$
with $F_0$ satisfying $F_0 \ge 0$, $F_0$, $v^2 F_0$, $F_0 \log F_0 \in L^1$. We recall from Chapter 3 that an internal function $g$ from an internal measure space $(X, \mathcal{A}, \nu)$ is S-integrable if
(i) $g$ is $\mathcal{A}$-measurable;
(ii) $\int_X |g|\,d\nu$ is finite, thus nearstandard;
(iii) if $\Omega \in \mathcal{A}$ and $g(\Omega) \approx 0$ [in the sense that $g(a) \approx 0$ for all $a \in \Omega$], then $\int_\Omega |g|\,d\nu \approx 0$ (this is needed since $\nu$ is not finite);
(iv) if $\Omega \in \mathcal{A}$ and $\nu(\Omega) \approx 0$, then $\int_\Omega |g|\,d\nu \approx 0$.
Let $(X, L(\mathcal{A}), L(\nu))$ be the Loeb space associated with $(X, \mathcal{A}, \nu)$. If $g$ is an S-integrable function, then one has that ${}^\circ g : X \to \mathbb{R}$ is Loeb integrable, and that
$${}^\circ\!\!\int_\Omega g\,d\nu = \int_\Omega {}^\circ g\,dL(\nu)$$
for all $\Omega \in \mathcal{A}$. If, moreover, $h : X \to {}^*\mathbb{R}$ is an $\mathcal{A}$-measurable function and for some finite hyperreal $K$ we have
$$|h(x)| \le K\,|g(x)| \quad \text{for all } x \in X,$$
then $h$ is S-integrable.
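A standard example shows why a condition on sets of infinitesimal measure cannot be dropped. The internal function below is an illustration chosen here, not one taken from the text: it has finite integral, yet its mass sits on a set of infinitesimal measure, so the identity between the internal integral and the Loeb integral fails.

```latex
\begin{align*}
&X = {}^*[0,1],\quad \nu = \text{internal Lebesgue measure},\quad
 H \in {}^*\mathbb{N}\setminus\mathbb{N},\quad g = H\,\mathbf{1}_{[0,1/H]} :\\[2pt]
&\int_X |g|\,d\nu = 1 \ \text{(finite)},\qquad
 \nu([0,1/H]) = 1/H \approx 0 \ \text{ but } \int_{[0,1/H]} |g|\,d\nu = 1 \not\approx 0,\\[2pt]
&{}^\circ g = 0 \quad L(\nu)\text{-a.e.},\qquad
 \int_X {}^\circ g\,dL(\nu) = 0 \;\ne\; 1 = {}^\circ\!\!\int_X g\,d\nu .
\end{align*}
```

For this $g$ one has $\int_X g \log^+ g\,d\nu = \log H$, which is infinite. A finite bound on $\int g \log^+ g$, such as the one supplied by the $H$ function in the Boltzmann setting, is exactly what rules out this kind of concentration.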
6.5.10. LEMMA. Let $f$ be the solution of (3) described above, with initial condition $f_0$. Then ${}^\circ f$ is Loeb integrable with respect to the measure $L(dx\,dv)$ generated from the internal Lebesgue measure ${}^*dx\,dv$.
PROOF We have to verify (i)-(iv) above. But (i) and (ii) are immediate; (iv) follows as in the proof of Theorem 6.5.3. In fact, we need only replace in the reasoning starting from
5,
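all ${}^*dv$ integrations by ${}^*dx\,dv$ integrations and we arrive at $\int_\Omega f\,{}^*dv\,dx \approx 0$ for any internal measurable set $\Omega$ of infinitesimal measure in ${}^*A \times {}^*\mathbb{R}^3$, which proves (iv). To prove (iii) it suffices to realize that, as in the first part of the proof of Theorem 6.5.3,
$$\int_{{}^*A \times \{|v| > j\}} |f(x, v, t)|\,{}^*dv\,dx \le (1 + j^2)^{-1} \int_{{}^*A \times {}^*\mathbb{R}^3} (1 + v^2)\,f(x, v, t)\,{}^*dv\,dx$$
$$= (1 + j^2)^{-1} \int_{{}^*A \times {}^*\mathbb{R}^3} (1 + v^2)\,f_0(x, v)\,{}^*dv\,dx = (1 + j^2)^{-1} C,$$
with $C \in {}^*\mathbb{R}_+$ and nearstandard. Now let $\Omega$ be *Lebesgue measurable and suppose $f(\Omega) \approx 0$. Then it follows from the above inequality that $\int_\Omega f(x, v, t)\,{}^*dv\,dx \approx 0$, which proves (iii).
Let ns ${}^*\mathbb{R}^3$ denote the nearstandard points of ${}^*\mathbb{R}^3$.
6.5.11. LEMMA. For Loeb a.e. $(x, v_1) \in {}^*A \times \text{ns}\,{}^*\mathbb{R}^3$ the function $f(x + v_1\tau, v_2, \tau)\,w_n^m(v_1, v_2)$ is S-integrable in $(v_2, \tau)$, and for $|u| \le m$
$${}^\circ\!\!\int_0^t\!\!\int_{{}^*\mathbb{R}^3} f(x + v_1\tau, v_2, \tau)\,w_n^m(v_1, v_2)\,{}^*dv_2\,d\tau = \int_0^t\!\!\int_{\text{ns}\,{}^*\mathbb{R}^3} {}^\circ f(x + v_1\tau, v_2, \tau)\,{}^\circ|v_2 - v_1|\,L(dv_2\,d\tau).$$
PROOF. We find for finite $v_1$, using energy conservation, that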
5
(1 + j 2 ) - 1 ' 2
for some constant K O z .
fo( X ,
02)
( 1 + u:) * dx du2 dT,
+
This implies that for Loeb a.e. (x, u l ) E *A x ns *R3
o
=
I,J+ r
f ( x + V ~ Tu2, , 7 ) w T ( u 1 , u2>*du2d7.
Similarly, by the boundedness of the H function Hf we obtain for Loeb z.e. (x, 0 , ) E *A x ns *R3:
I
( P1
f(x
+ V ~ Tu2, , T)lU2
-
ull *du2dT = 0,
B,.,,
where ~ j , c ~ = { ( ~ 2 , 7 ) E { ~ 2 ( I ~ 2 1 ~ j } x * [ O , t I I f ( x + ~ 1 ~ > c, ~r )2, , 7 )
c' is a fixed number in *N - N. From ( a ) and ( p ) it follows that for Loeb a.e. (x, 21,) E *A x ns *R3 the function f ( x + u17, u2, T ) w ~ ( u ,u,) , is S-integrable in ( u 2 ,T). For these values of (x, u l ) we have that " f ( x + U ~ Tu2, , 'T)~(U ~ull is Loeb integrable in ( u 2 , 7 ) with O
II,, f
f ( x + 017, u 2 , T)w:(Ul,
=
lb I,.
" f ( x + ~ 1 7~,
*du2 d7
v2)
- ~ l l L ( d dT). ~2
2 T ,)
O ~ U ~
*A
X
This proves the lemma.
For Loeb a.e. (x, u , )
6.5.12. LEMMA. o
I,1.,
E
ns *R3 and IuI
5
m
r
( f O f ) ( x+ UIS,
=
I,'
1.,*d
U I , u2,
s)w,m(u1, 0 2 )
O f O " f ( x + U ~ S U,, , ~
*dv2 ds
2 S ), O I U ~
-
~ 2 1 L ( d ds). ~2
PROOF. The product of a bounded *Lebesgue-measurable function and an S-integrable one is S-integrable. From (4) it is easy to see that $f(x + v_1 s, v_1, s)$ is such a bounded *Lebesgue-measurable function, and from Lemma 6.5.11 we see that $f(x + v_1 s, v_2, s)\,w_n^m(v_1, v_2)$ is S-integrable in $(v_2, s)$ for Loeb a.e. $(x, v_1) \in {}^*A \times \text{ns}\,{}^*\mathbb{R}^3$. We conclude that $f \otimes f(x + v_1 s, v_1, v_2, s)\,w_n^m(v_1, v_2)$ is S-integrable in $(v_2, s)$ for Loeb a.e. $(x, v_1) \in {}^*A \times \text{ns}\,{}^*\mathbb{R}^3$. This also holds for $(f \odot f)\,w_n^m$, since $0 \le (f \odot f) \le (f \otimes f)$. It follows that ${}^\circ[(f \odot f)(x + v_1 s, v_1, v_2, s)\,w_n^m(v_1, v_2)]$ is Loeb integrable on $\{v_2;\ |v_2| \le j\} \times [0, t]$ for Loeb a.e. $(x, v_1) \in {}^*A \times \text{ns}\,{}^*\mathbb{R}^3$. Moreover,
But for a.e. $(x, v_1) \in {}^*A \times \text{ns}\,{}^*\mathbb{R}^3$ we have
$${}^\circ((f \odot f)\,w_n^m) = {}^\circ f \otimes {}^\circ f\;{}^\circ|v_2 - v_1|$$
for a.e. $(v_2, s) \in \text{ns}\,{}^*\mathbb{R}^3 \times [0, t)$, which is easy to see, since for a.e. $(x, v_1)$ each factor of $f \otimes f$ is finite a.e. in $\text{ns}\,{}^*\mathbb{R}^3 \times [0, t]$. This in turn is a consequence of the fact that $f$ is S-integrable. Thus for Loeb a.e. $(x, v_1) \in {}^*A \times \text{ns}\,{}^*\mathbb{R}^3$ we have from (7) the equality in the lemma.
6.5.13. LEMMA.
For Loeb a.e. (x, ul) E *A x ns *R3
(15)
PROOF The following discussion holds for Loeb a.e. (x, u,)- Let x be the characteristic function of any set u: + u: 5 k2. Since the integrals
I
xf(x
+ s u l , uj, s ) *du, du2 du ds,
j = 1,2,
both are finite, it follows that xf(x
+ su1, 4, s) and
Xf(X
+ sv,, 4, s)
are finite for Loeb a.e. ( u 2 , u, s). And so for Loeb a.e. ( u z , u, s) "(fOf)(x
+ SUI, v ; , v;, s)w,m =
O f 0
"f(X
+ m l , u ; , ui,
s)OIu,
- U2l.
To prove the S-integrability of ( f O f ) w ; ( u 2 , u, s) = ( f o f ) it suffices to show that for some finite j > 1 that ( f O f ' ) w ; is S-integrable on the set (x
+ sul, u : , u;, s) w ;
a,= { ( U z , 11, s ) l ( f O f )
rj(f@f)).
and it suffices to consider the S-integrability of the right member on R2. We only have to check (ii), (iii), and (iv). Now
f @ f ’ 2j f @ f on Rj for j > 1, and (16)
0 5Ja,
5
((fo~>-(fof>)w; *du2 duds
C/log j ,
with C finite. In particular this proves (ii), if we take j = 2. The possibility of majorizing a.e. by a finite C in the last member of (16) follows from the usual proof of the H theorem, more precisely from the positivity of the integrand, and from ((f@f>-(f@f>)w:
log(f@f’/f@f) *dxdvi dv, duds
(fologfo+exp(-v2)+u2f0) * d x d u when
A = * ( M x R 3x B x R + ) . To prove (iii) we let k E * N - N. It follows from fo>O and the Sintegrability of (fof)w that
::
Hence (f@f)w;
A
k-’ *du2 d u d s = 0.
As for the remaining part of R2, by (16) (f@f’)w:
This proves (iii).
A
k-’ *dv2 duds
To prove (iv) we let k
R
E
* N - N and set
= { ( v 2 , u, s) E
It follows from (ii) that
R21(f@f)wr > k } .
la
*dv2 du ds = 0,
and from fo > 0 together with (4), that
( f O f ) w ; *du2duds > 0
if
I,,
*dv2 duds > 0.
Then
J,
0 < j,'
since
( f @ f ) w r *du2 duds = 0,
(fof) is S-integrable. Hence
I
( f @ f ' ) w ; *duz duds
0.
As for the remaining part of R
by (16). This proves (iv) and so the S-integrability. Finally, to obtain (15) we shall also check that o r
J
lim
v-m
(f@f')wr
*dv2 duds = 0,
A"
where
A, Suppose the limit equals
= { ( u 2 ,U, s) E S Z ; E
Iu21 2 v}.
> 0. By (16) we can choose j
E
N such that
( f o f ) ) ~ , "duds < ~ / 2 .
( ( f o f '-)
* d ~ 2
O I a ,
This, however, leads to a contradiction, since by Lemma 6.5.12
or lim Y'm
J
( f O f ) w r *du2 duds
= 0.
A"
From the above Lemmas 6.5.10-13 we then get the following existence theorem for the space-inhomogeneous Boltzmann equation (1) on a torus $A = \mathbb{R}^3/\mathbb{Z}^3$ with a two-body potential of finite range, so that the impact parameter is in $B = \{u \in \mathbb{R}^2 \mid |u| \le m\}$, for some $m < \infty$. The initial conditions are the natural ones from the physical point of view, as in all theorems above.
6.5.14. THEOREM. Consider the Boltzmann equation (1) with initial condition $F_0 \ge 0$ Lebesgue a.e., and such that $F_0$, $v^2 F_0$, $F_0 \log F_0 \in L^1(A \times \mathbb{R}^3)$. Then the solution $f$ of the equation
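$$(d/dt)\,f(x + v_1 t, v_1, t) = (Q_n^m f)(x + v_1 t, v_1, t),$$
with $Q_n^m$ given in Section 6.5.C for $m \in \mathbb{N}$, $n \in {}^*\mathbb{N} - \mathbb{N}$, with initial condition
$$f_0(x, v) = \min({}^*F_0(x, v), n) + n^{-1} \exp(-v^2),$$
is nearstandard for Loeb a.e. $(x, v_1) \in {}^*A \times \text{ns}\,{}^*\mathbb{R}^3$, and its standard part ${}^\circ f$ is a solution of the Boltzmann equation (1) with initial value $F_0$ in the following sense: for a.e. $(x, v_1) \in {}^*A \times \text{ns}\,{}^*\mathbb{R}^3$
$${}^\circ f(x + v_1 t, v_1, t) = {}^{\circ *}F_0(x, v_1) + \int_0^t\!\!\int_{\text{ns}\,{}^*\mathbb{R}^3 \times {}^*B} {}^\circ f \otimes {}^\circ f(x + v_1 s, v_1', v_2', s)\,{}^\circ|v_2 - v_1|\,L(dv_2\,du\,ds)$$
$$- \int_0^t\!\!\int_{\text{ns}\,{}^*\mathbb{R}^3 \times {}^*B} {}^\circ f \otimes {}^\circ f(x + v_1 s, v_1, v_2, s)\,{}^\circ|v_2 - v_1|\,L(dv_2\,du\,ds).$$
PROOF. The integral form of (3) is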
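$$f(x + v_1 t, v_1, t) = f_0(x, v_1) + \int_0^t\!\!\int_{{}^*B \times {}^*\mathbb{R}^3} (f \odot f)(x + v_1 s, v_1', v_2', s)\,w_n^m(v_1, v_2)\,{}^*du\,dv_2\,ds$$
$$- \int_0^t\!\!\int_{{}^*B \times {}^*\mathbb{R}^3} (f \odot f)(x + v_1 s, v_1, v_2, s)\,w_n^m(v_1, v_2)\,{}^*du\,dv_2\,ds.$$
Taking the standard part and using Lemmas 6.5.12 and 6.5.13 for Loeb a.e. $(x, v_1) \in {}^*A \times \text{ns}\,{}^*\mathbb{R}^3$, we get
$${}^\circ f(x + v_1 t, v_1, t) = {}^{\circ *}F_0(x, v_1) + \int_0^t\!\!\int_{\text{ns}\,{}^*\mathbb{R}^3 \times {}^*B} {}^\circ f \otimes {}^\circ f(x + v_1 s, v_1', v_2', s)\,{}^\circ|v_2 - v_1|\,L(du\,dv_2\,ds)$$
$$- \int_0^t\!\!\int_{\text{ns}\,{}^*\mathbb{R}^3 \times {}^*B} {}^\circ f \otimes {}^\circ f(x + v_1 s, v_1, v_2, s)\,{}^\circ|v_2 - v_1|\,L(dv_2\,du\,ds).$$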
We shall now deduce some consequences of Theorem 6.5.14.
6.5.15. COROLLARY. The solution of Theorem 6.5.14 is $t$-continuous in the sense that for Loeb a.e. $(x, v_1)$, given $t \ge 0$ and $\varepsilon \in \mathbb{R}_+$, there is a $\delta \in \mathbb{R}_+$ such that
$$|{}^\circ f(x + v_1 t, v_1, t) - {}^\circ f(x + v_1 t', v_1, t')| < \varepsilon$$
if $|t - t'| < \delta$. For a discussion of the meaning of $t$ continuity see Truesdell and Muncaster (1980, p. 343).
PROOF. If a function $g$ on $\text{ns}\,{}^*\mathbb{R}_+$ is Loeb integrable, then by the definition of the Loeb integral we have
$$\int \chi_A(s)\,g(s)\,L(ds) = 0$$
for every characteristic function $\chi_A$ of a set $A \subset \text{ns}\,{}^*\mathbb{R}_+$ with infinitesimal *Lebesgue measure. From this and the equality in Theorem 6.5.14 the corollary follows easily.
We shall now compare standard solutions, in case they exist, with Loeb solutions. We have the following
6.5.16. THEOREM. If $F : \mathbb{R}_+ \to L^1_+(A \times \mathbb{R}^3)$ is a solution of the Boltzmann equation in the sense that
$$F(x + v_1 t, v_1, t) = F_0(x, v_1) + \int_0^t\!\!\int_{B \times \mathbb{R}^3} \bigl[F \otimes F(x + v_1 s, v_1', v_2', s) - F \otimes F(x + v_1 s, v_1, v_2, s)\bigr]\,|v_2 - v_1|\,du\,dv_2\,ds$$
for a.e. $(x, v_1) \in A \times \mathbb{R}^3$, then $\tilde f = F \circ \text{st}$ is a Loeb solution satisfying for Loeb a.e. $(x, v_1) \in {}^*A \times {}^*\mathbb{R}^3$:
$$\tilde f(x + v_1 t, v_1, t) = {}^{\circ *}F_0(x, v_1) + \int_0^t\!\!\int_{\text{ns}\,{}^*\mathbb{R}^3 \times {}^*B} \tilde f \otimes \tilde f(x + v_1 s, v_1', v_2', s)\,{}^\circ(|v_2 - v_1|)\,L(dv_2\,du\,ds)$$
$$- \int_0^t\!\!\int_{\text{ns}\,{}^*\mathbb{R}^3 \times {}^*B} \tilde f \otimes \tilde f(x + v_1 s, v_1, v_2, s)\,{}^\circ(|v_2 - v_1|)\,L(dv_2\,du\,ds).$$
PROOF. This depends essentially on the fact that if $f$ is Lebesgue integrable, then $\tilde f = f \circ \text{st}$ is Loeb integrable with $\int \tilde f\,L(dq) = \int f\,dq$.
6.5.17. REMARK. If a standard solution $F$ of the Boltzmann equation (1) exists and if uniqueness holds in our corresponding Loeb problem of finding an ${}^\circ f$ which satisfies Theorem 6.5.14, then by using Theorem 6.5.16 we conclude that the Loeb solution ${}^\circ f$ is equal to $\tilde f = F \circ \text{st}$ and is thus an extension to a “denser space” of the standard solution. For example, this situation arises in the spatially homogeneous case with finite fourth moments.
6.5.18. REMARK. The results obtained by nonstandard methods by Arkeryd have several consequences, for which we refer to Arkeryd's publications. In particular, as shown by Arkeryd (1981c), one can prove that the Loeb solutions of the Boltzmann equation discussed above are limits (in the weak sense) of such solutions for a Boltzmann equation with discretized time.
Summarizing, we have seen how nonstandard tools are quite powerful in handling a typical nonlinear Cauchy problem. In the space-homogeneous case the first proof, for potentials of infinite range, of convergence to the Maxwell distribution was by nonstandard methods. Moreover, the only existence proof so far for solutions in the space-inhomogeneous case far from equilibrium is also by nonstandard methods. Above all we regard this approach as very encouraging for handling other nonlinear problems. Nonstandard methods permit us to use, by transfer, conservation laws and monotone quantities such as the entropy, which would otherwise be difficult to exploit due to the nonlinearities and nonsmoothness involved. We would expect that this and similar ideas will turn out to be useful also in other problems of the theory of integral and differential equations.
6.6. A FINAL REMARK ON THE FEYNMAN PATH INTEGRAL AND OTHER MATTERS
In lecturing on nonstandard methods the question of the Feynman path integral is inevitably brought up. Is it possible that the “new numbers” of nonstandard theory can also make this notion precise? The Feynman path integral was introduced by Feynman to give an alternative formulation of the theory of quantum dynamics [see Feynman (1948) and Feynman and Hibbs (1965)]. The heuristic, but powerful, idea of Feynman has been given several mathematical expressions; see, e.g., the surveys in Albeverio et al. (1979a). Here we shall use nonstandard analysis to give an interpretation of Feynman’s ideas, in a spirit close to the original motivation. REMARK. The reader should not confuse the Feynman path integral with the “Euclidean” version, the Feynman-Kac integral; for the latter see Section 5.3.
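Let us first briefly recall Feynman’s heuristic argument. Consider a quantum mechanical particle of mass m moving in $\mathbb{R}^d$ under the action of a potential V(x). The dynamics is given by the operator $e^{-(i/\hbar)tH}$ in the Hilbert space $L^2(\mathbb{R}^d)$, where $\hbar$ is Planck’s constant divided by $2\pi$ and $H = H_0 + V$, where $H_0 = -(\hbar^2/2m)\Delta$. Using the Lie-Trotter formula we can write
\[ e^{-(i/\hbar)tH} = \lim_{n \to \infty} \left( e^{-(i/\hbar)(t/n)H_0}\, e^{-(i/\hbar)(t/n)V} \right)^n \tag{1} \]
[where the limit is taken in the sense of strong convergence of operators] if V is real-valued and $H_0 + V$ is essentially self-adjoint on $D(H_0) \cap D(V)$, i.e. the domains of $H_0$ and V as self-adjoint operators. The kernel of $\exp(-(i/\hbar)(t/n)H_0)$ is
\[ \left( \frac{2\pi i \hbar t}{mn} \right)^{-d/2} \exp\left( \frac{i}{\hbar}\, \frac{m}{2}\, \frac{|x - y|^2}{t/n} \right) \]
with proper determination of the power $-d/2$. Then, if $\varphi$ is a smooth function of compact support,
\[ \left( e^{-(i/\hbar)tH} \varphi \right)(x) = \lim_{n \to \infty} \int_{\mathbb{R}^{dn}} e^{(i/\hbar) S_t(x, x_{n-1}, \ldots, x_0)}\, \varphi(x_0) \prod_{j=0}^{n-1} \tilde{d}x_j, \tag{2} \]
where $\tilde{d}x_j = (2\pi i \hbar t / mn)^{-d/2}\, dx_j$, $dx_j$ being the Lebesgue measure on $\mathbb{R}^d$, and
\[ S_t(x_n, \ldots, x_0) = \sum_{j=0}^{n-1} \left[ \frac{m}{2}\, \frac{|x_{j+1} - x_j|^2}{(t/n)^2} - V(x_j) \right] \frac{t}{n}, \qquad x_n = x. \tag{3} \]
Here the integral on the right-hand side of (2) exists as an improper Lebesgue integral. Following Feynman, it is natural to look at (2) as an integral along piecewise linear continuous paths in the following way. Let $t_j = j(t/n)$, $j = 0, \ldots, n$, and let $\gamma_n(\tau)$ be the piecewise linear path $[0, t] \to \mathbb{R}^d$ which is equal to $x_j$ for $\tau = t_j$. Define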
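\[ S_t(\gamma_n) = \frac{m}{2} \int_0^t \dot{\gamma}_n(\tau)^2\, d\tau - \int_0^t \tilde{V}(\gamma_n(\tau))\, d\tau, \]
where $(m/2) \int_0^t \dot{\gamma}_n(\tau)^2\, d\tau$ is the kinetic energy associated with $\gamma_n$, and $\int_0^t \tilde{V}(\gamma_n(\tau))\, d\tau$, with $\tilde{V}(\gamma_n(\tau)) = V(\gamma_n(j(t/n)))$ for $\tau \in [j(t/n), (j+1)(t/n)]$, is the approximate potential energy associated with $\gamma_n$. Then $S_t(\gamma_n) = S_t(x, x_{n-1}, \ldots, x_0)$ and the right-hand side of (2) appears as the path integral
\[ \int_{\Gamma_n} e^{(i/\hbar) S_t(\gamma_n)}\, \varphi(\gamma_n(0))\, d\gamma_n \tag{4} \]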
with $\Gamma_n$ the space of all piecewise linear continuous paths $\gamma_n$ such that $\gamma_n(t) = x$ and $\gamma_n$ has discontinuities in the derivatives only at the points $j(t/n)$, $j = 1, \ldots, n-1$. Note that $\Gamma_n = \Gamma_0 + x$, where $\Gamma_0 = \mathbb{R}^{dn}$. The measure $d\gamma_n$ is given by
\[ d\gamma_n = \prod_{j=0}^{n-1} \left( \frac{2\pi i \hbar t}{mn} \right)^{-d/2} d\gamma_n(t_j) = \prod_{j=0}^{n-1} \tilde{d}x_j. \]
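Feynman’s idea is now to take n “big” and to look at (4) as an integral over “the space $\Gamma$ of all continuous paths ending at time t in x.” Formally (1), (2), and (4) give
\[ \left( e^{-(i/\hbar)tH} \varphi \right)(x) = \int_{\Gamma} e^{(i/\hbar) S_t(\gamma)}\, \varphi(\gamma(0))\, d\gamma. \tag{5} \]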
We have stressed “formally,” although the right-hand side can be interpreted in certain cases as the limit as $n \to \infty$ of (4). But there is no standard measure $d\gamma$ on the space of paths which gives the right-hand side of (5) as a genuine integral. This difficulty can, at least up to a point, be overcome by using nonstandard analysis. REMARK. In the literature on Feynman path integrals the definition by the above limiting procedure or related ones is sometimes called “sequential limit definition”; see, e.g., Albeverio et al. (1979a), Kallianpur and Bromley (1984) and references therein.
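The Lie-Trotter formula behind the sequential limit can be watched at work in finite dimensions. The following is only a sketch under stated assumptions: the 2x2 Hermitian matrices `H0` and `V` are arbitrary stand-ins chosen for illustration (they are not the operators of the text), $\hbar = 1$, and all helper names are hypothetical. It compares $(e^{-i(t/n)H_0} e^{-i(t/n)V})^n$ with $e^{-it(H_0+V)}$ for increasing n.

```python
# Sketch: Lie-Trotter product formula for small matrices, pure Python.
# H0 and V are illustrative stand-ins (assumptions), with hbar = 1.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]

def scale(a, c):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def expm(a, terms=60):
    """Matrix exponential by its power series (adequate for small norms)."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = scale(matmul(term, a), 1.0 / k)   # term = a^k / k!
        result = add(result, term)
    return result

def matpow(a, n):
    result = identity(len(a))
    for _ in range(n):
        result = matmul(result, a)
    return result

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def trotter_error(h0, v, t, n):
    """Largest entrywise gap between the n-step Trotter product and exp(-it(h0+v))."""
    exact = expm(scale(add(h0, v), -1j * t))
    step = matmul(expm(scale(h0, -1j * t / n)), expm(scale(v, -1j * t / n)))
    return max_diff(exact, matpow(step, n))

H0 = [[1.0, 0.5], [0.5, -1.0]]   # stand-in for the free Hamiltonian
V = [[0.0, 0.0], [0.0, 2.0]]     # stand-in for the potential (does not commute with H0)
errors = [trotter_error(H0, V, 1.0, n) for n in (4, 16, 64)]
```

In this toy setting the error shrinks as n grows, which is the finite-dimensional shadow of the limit in (1); nothing here addresses the domain questions that matter for unbounded operators.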
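Since $\Gamma_0 = \mathbb{R}^{dn}$ we can rewrite (4) as
\[ \int_{\mathbb{R}^{dn}} e^{(i/\hbar) S_t(\gamma_n + x)}\, \varphi(\gamma_n(0) + x)\, d\gamma_n. \tag{6} \]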
Extending from $n \in \mathbb{N}$ to an $n \in {}^*\mathbb{N}$ we can now introduce the internal Feynman path integral by exactly the same definition as for (6). In a very intuitive sense we now integrate over hyperfinite piecewise linear paths; i.e., we let $\gamma_n(\tau)$ be a piecewise linear path ${}^*[0, t] \to {}^*\mathbb{R}^d$, where we now have divided ${}^*[0, t]$ into a hyperfinite number of points $t_j = j(t/n)$, $j = 0, \ldots, n$, and $n \in {}^*\mathbb{N}$. Thus, on the ordinary scale $\gamma_n$ “looks continuous,” but on the infinitesimal scale it is a hyperfinite polygonal path. Thus our construction captures some of the formal properties of Feynman’s original proposal. But it is an internal quantity; $\exp((i/\hbar) S_t(\gamma_n + x))\, d\gamma_n$ is an unbounded, wildly oscillating ${}^*\mathbb{C}$-valued measure which cannot be turned into a Loeb measure by the techniques described in Chapter 3. And even if there were a “Loeb” construction we would have difficulties in passing from the hyperfinite and highly irregular path space to the continuous version. On the other hand, it is not difficult to exhibit classes of potentials V for which the internal Feynman path integral is nearstandard and its standard part solves the Schrödinger equation; i.e., the solution of the Schrödinger equation
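\[ i\hbar\, \frac{\partial}{\partial t} \psi = H\psi \]
with a suitable initial condition $\varphi \in L^2(\mathbb{R}^d)$ is given by
\[ \psi(x, t) = {}^\circ\!\int_{{}^*\mathbb{R}^{dn}} \exp\bigl( (i/\hbar) S_t(\gamma_n + x) \bigr)\, {}^*\varphi(\gamma_n(0) + x)\, d\gamma_n \tag{7} \]
for some $n \in {}^*\mathbb{N} - \mathbb{N}$.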
REMARK. This includes H which are strong resolvent limits as $k \to \infty$ of Hamiltonians $H_k$ for which the Lie-Trotter formula holds. For related work, in particular, on a “nonstandard Lie-Trotter” formula, see Sloan (1977, 1978).
In solving the Schrödinger equation we have assumed that $\hbar$ is standard positive. What happens in (7) when $\hbar \neq 0$ is infinitesimal? It was the heuristic insight of Dirac and Feynman that in order to recover classical mechanics from quantum mechanics one should exploit the fact that Planck’s constant is small, hence find the asymptotics of the solution of the Schrödinger equation for $\hbar \to 0$. Writing this solution as an oscillating integral as above suggests an application of a method of stationary phase, and formally this works since the stationary points of the phase are seen to be the points where the first variation of $S_t(\gamma + x)$ vanishes, and by Hamilton’s principle these are the classical orbits ending at time t.
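The link between stationary points and classical orbits can be made concrete on the time-sliced level. As a sketch only, assume a smooth potential V and the discretized action $S_t = \sum_{j=0}^{n-1} \bigl[ \tfrac{m}{2} |x_{j+1} - x_j|^2 / \varepsilon^2 - V(x_j) \bigr] \varepsilon$ with $\varepsilon = t/n$ and $x_n = x$ fixed (this particular discretization is an assumption made for illustration). Differentiating with respect to an interior point $x_j$ gives a discrete form of Newton's equation:

```latex
\[
\frac{\partial S_t}{\partial x_j}
  = \frac{m}{\varepsilon}\,(x_j - x_{j-1})
  - \frac{m}{\varepsilon}\,(x_{j+1} - x_j)
  - \varepsilon\, \nabla V(x_j) = 0
\quad\Longleftrightarrow\quad
m\, \frac{x_{j+1} - 2x_j + x_{j-1}}{\varepsilon^{2}} = -\nabla V(x_j),
\qquad 1 \le j \le n-1 .
\]
```

The left-hand side of the last equation is the second difference quotient of the polygonal path, so for large (or infinite) n the stationary polygons approximate solutions of $m\ddot{\gamma} = -\nabla V(\gamma)$.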
Whereas in finite dimensions the method of stationary phase is a classical tool, which has its origin in methods of Stokes and Kelvin and has enjoyed a steady refinement up to recent years [see, e.g., Guillemin and Sternberg (1977)], it took a long time before a corresponding method of stationary phase could be developed in infinite dimensions, sufficiently powerful to handle certain classes of Feynman path integrals. This was done in Albeverio and Høegh-Krohn (1977) and Rezende (1985) by using the definition of Feynman path integrals given in Albeverio and Høegh-Krohn (1976). But it is also possible, indeed even “natural,” to take (7) as a point of departure and use the explicit definition of the internal Feynman path integral in the calculations; i.e., we could start from
\[ \int_{{}^*\mathbb{R}^{dn}} \exp\bigl( (i/\hbar) S_t(\gamma_n + x) \bigr)\, {}^*\varphi(\gamma_n(0) + x)\, d\gamma_n \]
for $n \in {}^*\mathbb{N} - \mathbb{N}$ and $\hbar \approx 0$. We shall not pursue this approach further here, since it would essentially consist in reproducing the “hard core” of the standard approach of Hepp (1974) and Hagedorn (1980). We have only wanted to emphasize that the internal Feynman path integral is a precise, well-defined mathematical notion, which captures a significant part of the original heuristics and which can be effectively used in “hyperfinite” calculations. But one has to be careful; the integral is internal, and remains internal! REMARK. Harthong has developed another interesting nonstandard extension of the classical method of stationary phase for oscillatory integrals; see Harthong (1981, 1984). In particular, he has applied his results to a revealing discussion of certain optical phenomena (“moiré patterns”) and to the study of wave propagation.
REFERENCES
S. Albeverio and R. Høegh-Krohn (1976). Mathematical theory of Feynman path integrals. Lect. Notes Math. 523. Springer-Verlag, Berlin.
S. Albeverio and R. Høegh-Krohn (1977). Oscillatory integrals and the method of stationary phase in infinitely many dimensions, with applications to the classical limit of quantum mechanics, I. Invent. Math. 40.
S. Albeverio and R. Høegh-Krohn (1981). Point interactions as limits of short range interactions. J. Operator Theory 6.
S. Albeverio, R. Høegh-Krohn, and L. Streit (1977). Energy forms, Hamiltonians, and distorted Brownian paths. J. Math. Phys. 18.
S. Albeverio, Ph. Combe, R. Høegh-Krohn, G. Rideau, M. Sirugue-Collin, M. Sirugue, and R. Stora (eds.) (1979a). Feynman path integrals. Lect. Notes Phys. 106. Springer-Verlag, Berlin.
S. Albeverio, J. E. Fenstad, and R. Høegh-Krohn (1979b). Singular perturbations and nonstandard analysis. Trans. Amer. Math. Soc. 252.
S. Albeverio, R. Høegh-Krohn, and L. Streit (1980). Regularization of Hamiltonians and processes. J. Math. Phys. 21.
S. Albeverio, R. Høegh-Krohn, W. Kirsch, and F. Martinelli (1982a). The spectrum of the three-dimensional Kronig-Penney model with random point defects. Adv. Appl. Math. 3.
S. Albeverio, Ph. Blanchard, and R. Høegh-Krohn (1982b). Some applications of functional integration. In Mathematical Problems in Theoretical Physics. Lecture Notes in Phys. (R. Schrader, R. Seiler, D. A. Uhlenbrock, eds.), 153.
S. Albeverio, F. Gesztesy, and R. Høegh-Krohn (1982c). The low energy expansion in nonrelativistic scattering theory. Ann. Inst. H. Poincaré Sect. A 37.
S. Albeverio, F. Gesztesy, R. Høegh-Krohn, and H. Holden (1984a). Some exactly solvable models in quantum mechanics and the low energy expansions. In Proc. Leipzig Conf. Operator Algebras, 1983, Teubner, Stuttgart.
S. Albeverio, J. E. Fenstad, R. Høegh-Krohn, W. Karwowski, and T. Lindstrøm (1984b). Perturbations of the Laplacian supported by null sets, with applications to polymer measures and quantum fields. Phys. Lett. 104.
S. Albeverio, J. E. Fenstad, R. Høegh-Krohn, W. Karwowski, and T. Lindstrøm (1986a). (In preparation.)
S. Albeverio, F. Gesztesy, R. Høegh-Krohn, and H. Holden (1986b). Solvable Models in Quantum Mechanics (in preparation).
A. Alonso y Coria (1978). Shrinking potentials in the Schrödinger equation. Ph.D. thesis. Princeton Univ., Princeton, New Jersey.
L. Arkeryd (1972a). On the Boltzmann equation I, II. Arch. Rational Mech. Anal. 45, 1-34.
L. Arkeryd (1972b). An existence theorem for a modified space-inhomogeneous, non-linear Boltzmann equation. Bull. Amer. Math. Soc. 78, 610-614.
L. Arkeryd (1981a). Intermolecular forces of infinite range and the Boltzmann equation. Arch. Rational Mech. Anal. 77, 11-23.
L. Arkeryd (1981b). A non-standard approach to the Boltzmann equation. Arch. Rational Mech. Anal. 77, 1-10.
L. Arkeryd (1981c). A time-wise approximated Boltzmann equation. IMA J. Appl. Math. 27, 373-383.
L. Arkeryd (1982). Asymptotic behavior of the Boltzmann equation with infinite range forces. Comm. Math. Phys. 86, 475-484.
L. Arkeryd (1984). Loeb solutions of the Boltzmann equation. Arch. Rational Mech. Anal. 86.
M. N. Barber and B. W. Ninham (1970). Random and Restricted Walks. Gordon and Breach, New York.
F. A. Berezin and L. D. Faddeev (1961). A remark on Schrödinger’s equation with a singular potential. Soviet Math. Dokl. 2.
H. Bethe and R. Peierls (1935). Quantum theory of the diplon. Proc. R. Soc. London A 148.
B. Birkeland (1980). A singular Sturm-Liouville problem treated by nonstandard analysis. Math. Scand. 47.
D. Brydges and T. Spencer (1985). Self-avoiding walk in 5 or more dimensions. Comm. Math. Phys. 97.
D. Brydges, J. Fröhlich, and T. Spencer (1982). The random walk representation of classical spin systems and correlation inequalities. Comm. Math. Phys. 83.
C. Cercignani (1975). Theory and Application of the Boltzmann Equation. Scottish Academic Press.
Y. N. Demkov and V. N. Ostrovskii (1975). The Use of Zero-Range Potentials in Atomic Physics. Nauka, Moscow (in Russian).
C. Domb (1969). Self-avoiding walks on lattices. In K. E. Shuler (ed.), Stochastic Processes in Chemical Physics. Wiley, New York.
N. Dunford and B. J. Pettis (1940). Linear operations on summable functions. Trans. Amer. Math. Soc. 47, 323-392.
E. B. Dynkin (1985). Random fields associated with multiple points of the Brownian motion. J. Funct. Anal. 62.
T. Elmroth (1983). Global boundedness of moments of solutions of the Boltzmann equation for infinite range forces. Arch. Rational Mech. Anal. 82, 1-2.
S. F. Edwards (1965). The statistical mechanics of polymers with excluded volume. Proc. Phys. Soc. London 85.
S. F. Edwards (1975). A note on the convergence of perturbation theory in polymer problems. J. Phys. A 8.
E. Fermi (1936). Sul moto dei neutroni nelle sostanze idrogenate. Ric. Sci. 7.
R. P. Feynman (1948). Space-time approach to non-relativistic quantum mechanics. Rev. Mod. Phys. 20.
R. P. Feynman and A. R. Hibbs (1965). Quantum Mechanics and Path Integrals. McGraw-Hill, New York.
G. Flamand (1967). Mathematical theory of non-relativistic two- and three-particle systems with point interactions. In F. Lurcat (ed.), Cargese Lectures in Theoretical Physics 1967. Gordon and Breach, New York.
P. J. Flory (1969). Statistical Mechanics of Chain Molecules. Wiley (Interscience), New York.
K. F. Freed (1981). Polymers as self-avoiding random walks. Ann. Probab. 9.
C. N. Friedman (1971). Perturbations of the Schrödinger equation by potentials with small support, semigroup product formulas, and applications to quantum mechanics. Ph.D. thesis. Princeton Univ., Princeton, New Jersey.
C. N. Friedman (1972). Perturbations of the Schrödinger equation by potentials with small support. J. Funct. Anal. 10.
M. Fukushima (1980). Dirichlet Forms and Markov Processes. North-Holland Publ., Amsterdam.
J. Glimm and A. Jaffe (1981). Quantum Physics: A Functional Integral Point of View. Springer-Verlag, New York and Berlin.
I. S. Gradshteyn and I. M. Ryshik (1965). Table of Integrals, Series and Products. Academic Press, New York.
A. Grossmann, R. Høegh-Krohn, and M. Mebkhout (1980a). A class of explicitly soluble, local, many-center Hamiltonians for one-particle quantum mechanics in two and three dimensions. J. Math. Phys. 21.
A. Grossmann, R. Høegh-Krohn, and M. Mebkhout (1980b). The one-particle theory of periodic point interaction. Comm. Math. Phys. 77.
V. Guillemin and S. Sternberg (1977). Geometric Asymptotics. Amer. Math. Soc., Providence, Rhode Island.
G. A. Hagedorn (1980). Semiclassical quantum mechanics. Comm. Math. Phys. 71.
J. Hammersley and K. W. Morton (1954). Poor Man’s Monte Carlo. J. R. Statist. Soc. 16.
J. Harthong (1981). Le moiré. Adv. Appl. Math. 2.
J. Harthong (1984). Etudes sur la mécanique quantique. Astérisque 111.
K. Hepp (1974). The classical limit for quantum mechanical correlation functions. Comm. Math. Phys. 35.
H. Holden (1981). Konvergens mot punkt-interaksjoner, cand. real. thesis, Math. Inst., Univ. of Oslo.
H. Holden, R. Høegh-Krohn, and S. Johannesen (1983). The short range expansion. Adv. Appl. Math. 4.
H. Holden, R. Høegh-Krohn, and S. Johannesen (1984). The short-range expansion in solid state physics. Ann. Inst. H. Poincaré, Sect. A 41.
R. Høegh-Krohn, H. Holden, S. Johannesen, and T. Wentzel-Larsen (1986). The Fermi surface for point interactions. J. Math. Phys. 27.
K. Huang and C. N. Yang (1957). Quantum-mechanical many-body problem with hard-sphere interaction. Phys. Rev. 105.
K. Huang, C. N. Yang, and J. M. Luttinger (1957). Imperfect Bose gas with hard-sphere interaction. Phys. Rev. 105.
G. Kallianpur and G. Bromley (1984). Generalized Feynman integrals using analytic continuation in several complex variables. In Stochastic Analysis and Applications (M. Pinsky, ed.). Dekker, New York.
T. Kato (1976). Perturbation Theory for Linear Operators, 2nd ed., Springer-Verlag, Berlin and New York.
H. Kesten (1963). On the number of self-avoiding walks I. J. Math. Phys. 4.
R. D. L. Kronig and W. G. Penney (1931). Quantum mechanics of electrons in crystal lattices. Proc. R. Soc. London, Ser. A 130.
S. Kusuoka (1985a). On the path property of Edwards’ model for long polymer chains in three dimensions. In Proceedings of the USP Meeting on Stochastic Processes and Infinite Dimensional Analysis (S. Albeverio, ed.). Res. Notes Math., Pitman, London.
S. Kusuoka (1985b). Asymptotics of polymer measures in one dimension. In Proceedings of the Bielefeld Conference on Infinite Dimensional Analysis and Stochastic Processes (S. Albeverio, ed.). Res. Notes Math., Pitman, London.
O. E. Lanford, III (1975). Time evolution of large classical systems. In Dynamical Systems, Theory and Applications (J. Moser, ed.), pp. 1-111. Lecture Notes in Physics 38, Springer-Verlag, Berlin and New York.
G. F. Lawler (1980). A self-avoiding random walk. Duke Math. J. 47.
G. F. Lawler (1983). A connective constant for loop-erased self-avoiding random walk. J. Appl. Probab. 20.
G. F. Lawler (1985). Intersections of random walks in four dimensions II. Comm. Math. Phys. 97.
T. D. Lee, K. Huang, and C. N. Yang (1957). Eigenvalues and eigenfunctions of a Bose system of hard spheres and its low-temperature properties. Phys. Rev. 106.
J. F. LeGall (1985). Sur le temps local d’intersection du mouvement brownien plan, et la méthode de renormalisation de Varadhan. In (J. Azema and M. Yor, eds.) Sém. Prob. XIX, 83/84. Lect. Notes Math. 1123, Springer-Verlag, Berlin and New York.
A. L. MacDonald (1976). Sturm-Liouville theory via nonstandard analysis. Indiana Univ. Math. J. 25.
E. Nelson (1977). Internal set theory: A new approach to nonstandard analysis. Bull. Amer. Math. Soc. 83.
E. Nelson (1983). A remark on the polymer problem in four dimensions. Studies in applied mathematics, 1-5. Adv. Math. Suppl. Stud. 8, Academic Press, New York and London.
J. Persson (1981). Second order linear ordinary differential equations with measures as coefficients. Matematiche (Catania) 36.
J. Persson (1984). Linear distribution differential equations. Comment. Math. Univ. St. Pauli 33.
M. Reed and B. Simon (1975). Methods of Modern Mathematical Physics II: Fourier Analysis, Self-Adjointness. Academic Press, New York and London.
J. Rezende (1985). The method of stationary phase for oscillatory integrals on Hilbert space. Comm. Math. Phys. 101.
J. Rosen (1983). A local time approach to the self-intersections of Brownian paths in space. Comm. Math. Phys. 88.
B. Simon (1979). Functional Integration and Quantum Physics. Academic Press, New York and London.
B. Simon (1982). Schrödinger semigroups. Bull. Amer. Math. Soc. (N.S.) 7.
A. Sloan (1977). An application of the nonstandard Trotter product formula. J. Math. Phys. 18.
A. Sloan (1978). A note on the exponential of distributions. Pacific J. Math. 79.
A. Stoll (1985). Doctoral dissertation, Bochum.
A. Stoll (1986). Self-repellent random walks and polymer measures in two dimensions. In (S. Albeverio, Ph. Blanchard, and L. Streit, eds.) Proc. II Bibos Symp., Lect. Notes Math., Springer-Verlag, New York and Berlin.
E. C. Svendsen (1981). The effect of submanifolds upon essential self-adjointness and deficiency indices. J. Math. Anal. Appl. 80.
K. Symanzik (1969). Euclidean quantum field theory. In (R. Jost, ed.), Local Quantum Theory. Academic Press, New York and London.
L. E. Thomas (1979). Birman-Schwinger bounds for the Laplacian with point interactions. J. Math. Phys. 20.
L. E. Thomas (1980). Scattering from point interactions. In (De Santo, Saenz, and Zachary, eds.) Mathematical Methods and Applications of Scattering Theory. Proc. 1979. Springer-Verlag, Berlin and New York.
C. Truesdell and R. G. Muncaster (1980). Fundamentals of Maxwell’s Kinetic Theory of a Simple Monatomic Gas. Academic Press, New York.
S. Ukai (1974). On the existence of global solutions of mixed problem for non-linear Boltzmann equation. Proc. Jpn. Acad. 50, 179-184.
F. Wattenberg (1977). Nonstandard measure theory-Hausdorff measure. Proc. Amer. Math. Soc. 65.
J. Westwater (1980). On Edwards’ model for long polymer chains. Comm. Math. Phys. 72.
J. Westwater (1981). On Edwards’ model for polymer chains: II. The self-consistent potential. Comm. Math. Phys. 79.
J. Westwater (1982). On Edwards’ model for polymer chains: III. Borel summability. Comm. Math. Phys. 84.
J. Westwater (1985). On Edwards’ model for polymer chains. In (S. Albeverio and Ph. Blanchard, eds.) Trends and Developments in the Eighties. Proc. Bielefeld. Enc. Math. Phys. IV. World Scientific, Singapore.
E. Wigner (1933). On the mass defect of Helium. Phys. Rev. 43.
T. T. Wu (1959). Ground state of a Bose system of hard spheres. Phys. Rev. 115.
M. Yor (1985). Renormalisation et convergence en loi pour les temps locaux d’intersections du mouvement brownien dans $\mathbb{R}^3$. In (J. Azema and M. Yor, eds.) Sém. Prob. XIX 83/84. Lect. Notes in Math. 1123. Springer-Verlag, Berlin and New York.
J. Zorbas (1980). Perturbation of self-adjoint operators by Dirac distribution. J. Math. Phys. 21.
CHAPTER 7
HYPERFINITE LATTICE MODELS
The common thread of this chapter is the study of hyperfinite lattice models. In Sections 7.1-7.3 our lattices are hyperfinite, but the lattice spacing or distance between neighboring points in the lattice is kept fixed and finite. In Section 7.1 we study the stochastic evolution of lattice systems; in Section 7.2 we use the hyperfinite model to study the classical equilibrium theory; and in Section 7.3 we discuss the global Markov property under a variety of assumptions. In the second part of the chapter we introduce hyperfinite lattices with infinitesimal spacing as models for field theories. In Section 7.4 we discuss a number of models for quantum field theories, including some brief remarks on gauge fields; in Section 7.5 we investigate the connection between fields and polymers, extending the discussion of Section 6.4.
7.1. STOCHASTIC EVOLUTION OF LATTICE SYSTEMS
The study of the stochastic evolution of lattice systems was initiated in the case of the Ising model by Glauber. The idea is to look at the ferromagnetic system obtained by immersing the usual Ising model [see Faris (1979)]
in a heat bath, keeping the temperature fixed, but varying the energy by making the spins (elementary ferromagnets) flip up and down at a given rate. The observed motion is then described by a continuous Markov process, and the model is so constructed as to have the Gibbs states of the Ising system as equilibrium states; cf. Section 7.2. In this section we shall study the stochastic evolution of these systems, using a hyperfinite lattice system with noninfinitesimal spacing. As usual we start with a brief résumé of the standard theory. A d-dimensional system is specified by a set of sites $\Lambda \subseteq \mathbb{Z}^d$. At each site $i \in \Lambda$ there is a spin which can be either “up” (+1) or “down” (-1); i.e., associated with $\Lambda$ we have a configuration space $\Omega_\Lambda = \{-1, +1\}^\Lambda$. An element $q \in \Omega_\Lambda$ is called a configuration, and $q(i)$, which is equal to either +1 or -1, is called the spin at site i with respect to the configuration q. Often we shall write $q_i$ for $q(i)$. For $i, j \in \Lambda$ we let $|i - j|$ denote the usual Euclidean distance and we let $B_{i,r} = \{ j \in \Lambda : |i - j| \le r \}$ be the intersection of $\Lambda$ with the ball of radius r around i. We note that $\Omega_\Lambda$ is compact in the product topology and as usual we let $C(\Omega_\Lambda)$ stand for the set of real-valued continuous functions on $\Omega_\Lambda$. When $\Lambda = \mathbb{Z}^d$ we drop subscripts; i.e., we write $\Omega$ for $\{-1, +1\}^{\mathbb{Z}^d}$ and $C(\Omega)$ for the corresponding set of continuous functions. A function $f : \Omega \to \mathbb{R}$ is called tame if there is a finite $\Lambda \subseteq \mathbb{Z}^d$ such that $f(q) = f(q')$ for all $q, q' \in \Omega$ which agree on $\Lambda$, i.e., such that $q|_\Lambda = q'|_\Lambda$. We call $\Lambda$ a base for the tame function f. There is a simple identification between functions in $C(\Omega_\Lambda)$, $\Lambda$ finite, and tame functions in $C(\Omega)$ with base $\Lambda$; namely, to each $f \in C(\Omega_\Lambda)$ we define $f' \in C(\Omega)$ by the equation $f'(q) = f(q|_\Lambda)$. Finally, observe that the set of tame functions is dense in $C(\Omega)$ as a consequence of the Stone-Weierstrass theorem. We shall study certain diffusion or drift systems on the infinite lattice $\mathbb{Z}^d$.
We assume that the stochastic evolution of such systems is controlled by a speed function $c : \mathbb{Z}^d \times \Omega \to \mathbb{R}$ which satisfies the following two conditions:
(1) There is an $M \in \mathbb{R}$ such that $0 < c(i, q) \le M$ for all $i \in \mathbb{Z}^d$, $q \in \Omega$.
(2) There is an $L \in \mathbb{R}$ such that $c(i, q) = c(i, q')$ for all $i \in \mathbb{Z}^d$ and all $q, q'$ satisfying $q|_{B_{i,L}} = q'|_{B_{i,L}}$.
In standard terminology this means that we study systems with finite-range interaction. The prime example is the classical Ising model with speed function
i.e., we have a system with nearest-neighbor interaction and an external field h.
We want the speed function to govern the stochastic evolution in the following way: given that the system is in configuration q at time $t_0$, then the probability that the spin at a single site $i \in \mathbb{Z}^d$ will be reversed at time $t_0 + \Delta t$ shall be $c(i, q)\Delta t + o(\Delta t)$, while the probability that the spins will be reversed at two or more sites in the time interval $\Delta t$ shall be $o(\Delta t)$. Translated into the language of semigroups, this means that we want to construct a Markov semigroup of operators $T_t$, $t \ge 0$, on $C(\Omega)$ having an infinitesimal generator which is the extension of the following operator A defined on the set of tame functions in $C(\Omega)$:
\[ Af(q) = \sum_{i \in \mathbb{Z}^d} c(i, q)\, [f(q^{(i)}) - f(q)], \tag{4} \]
where $q^{(i)}$ is the configuration obtained from q by reversing the spin at site i. 7.1.1. REMARK. This approach to interacting particle systems is due to Spitzer (1970); however, a particular case had been discussed earlier by Glauber (1963). There is a large literature on this topic; for an introduction see Kindermann and Snell (1980), which has an extensive bibliography; see also Martin (1977). In this section we give an exposition of the hyperfinite approach of Helms and Loeb (1979); see also Helms and Loeb (1982).
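The informal description of the dynamics (site i flips at rate c(i, q), simultaneous flips being negligible) can be simulated directly on a small finite ring. The sketch below is illustrative only: the function `speed` is a hypothetical nearest-neighbour choice made for this example and is not claimed to be the speed function of the text, and the ring of six sites, `beta`, and all function names are assumptions.

```python
import math
import random

def flip(q, i):
    """Return q^(i): the configuration q with the spin at site i reversed."""
    return q[:i] + (-q[i],) + q[i + 1:]

def speed(i, q, beta=0.5, h=0.0):
    """Hypothetical finite-range speed function on a ring (an assumption):
    it depends only on q_i and its two nearest neighbours."""
    n = len(q)
    local_field = q[(i - 1) % n] + q[(i + 1) % n] + h
    return math.exp(-beta * q[i] * local_field)

def simulate(q, t_end, rng):
    """Run the spin-flip Markov chain up to time t_end.

    Site i flips at rate speed(i, q); returns the final configuration
    and the number of flips performed."""
    t, flips = 0.0, 0
    while True:
        rates = [speed(i, q) for i in range(len(q))]
        total = sum(rates)
        t += rng.expovariate(total)       # exponential waiting time
        if t > t_end:
            return q, flips
        # pick the flipping site with probability rates[i] / total
        u, acc = rng.random() * total, 0.0
        for i, r in enumerate(rates):
            acc += r
            if u < acc:
                break
        q = flip(q, i)
        flips += 1

rng = random.Random(0)
start = (1, -1, 1, 1, -1, 1)
final, n_flips = simulate(start, 2.0, rng)
```

With this choice the rates are bounded by $e^{2\beta}$ and depend only on nearest neighbours, so conditions (1) and (2) hold in the toy model with $M = e^{2\beta}$ and $L = 1$.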
With the standard approach there is no direct way to extend definition (4) from a tame to an arbitrary continuous function. Going from finite to hyperfinite we shall see how to preserve the explicit form of (4). This gives an internal bounded operator and, hence, a semigroup. Add the Loeb construction and some standard-part arguments and the semigroup $T_t$ on $C(\Omega)$ stands in front of you. We proceed to the details. Choose an $N \in {}^*\mathbb{N} - \mathbb{N}$ and let $\Gamma = B_{0,N}$ be a hyperfinite subset of ${}^*\mathbb{Z}^d$. We let $\Omega_\Gamma$ be the set of internal mappings from $\Gamma$ to $\{-1, +1\}$ and $C(\Omega_\Gamma)$ be the set of internal hyperreal-valued functions on $\Omega_\Gamma$ in the maximum norm. Note that for finite $\Lambda$, $C(\Omega_\Lambda)$ consists of all functions from $\Omega_\Lambda$ to $\mathbb{R}$. If f is any function on ${}^*\Omega$ and q is any internal configuration on ${}^*\mathbb{Z}^d - \Gamma$, we let $f_q$ be the function on $\Omega_\Gamma$ defined by $f_q(q') = f(q' \times q)$, $q' \in \Omega_\Gamma$. Following the usual terminology we call q an external configuration. This terminology may be confusing; "external" here means external relative to the hyperfinite set $\Omega_\Gamma$, not external as opposed to internal in the language of nonstandard theory. For later purposes we shall make use of the standard part map $\mathrm{st}_\Gamma$ defined by $\mathrm{st}_\Gamma q = q|_{\mathbb{Z}^d}$, for $q \in \Omega_\Gamma$. But first the hyperfinite construction. Fix an external configuration $q_0$ and set
\[ (A_{\Gamma,q_0} f)(q) = \sum_{i \in \Gamma} {}^*c(i, q \times q_0)\, [f(q^{(i)}) - f(q)], \tag{5} \]
where $f \in C(\Omega_\Gamma)$, $q \in \Omega_\Gamma$, and ${}^*c$ is the standard extension of the given
speed function c. Our first simple but basic observation is that $A_{\Gamma,q_0}$ is an internal bounded operator on $C(\Omega_\Gamma)$; in fact
\[ \|A_{\Gamma,q_0}\| \le 2M|\Gamma|, \]
where M is the constant specified in (1) and $|\Gamma|$ is the internal cardinality (or "volume") of $\Gamma$. We also note that $A_{\Gamma,q_0} 1 = 0$. By transfer of standard theory we may introduce an internal semigroup of operators $S_{\Gamma,q_0}(t)$, $t \ge 0$, on $C(\Omega_\Gamma)$ by the explicit formula
\[ S_{\Gamma,q_0}(t) = \exp(tA_{\Gamma,q_0}). \tag{6} \]
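For a genuinely finite lattice the objects just introduced are ordinary matrices, and the stated properties can be checked by hand or by machine. The following sketch uses a ring of three sites in place of $\Gamma$ (so no external configuration is needed) and a hypothetical nearest-neighbour speed function; both are assumptions made for illustration. It builds the generator, checks $A1 = 0$ and the bound $\|A\| \le 2M|\Gamma|$, forms $\exp(tA)$ from the power series, and compares it with the Euler-type product $(1 + A\,t/k)^k$.

```python
import itertools
import math

SITES = 3                                  # tiny ring standing in for Gamma
CONFIGS = list(itertools.product((-1, 1), repeat=SITES))
INDEX = {q: k for k, q in enumerate(CONFIGS)}
M = math.exp(1.0)                          # bound on speed() when beta = 0.5

def flip(q, i):
    return q[:i] + (-q[i],) + q[i + 1:]

def speed(i, q, beta=0.5):
    """Hypothetical nearest-neighbour speed function (an assumption)."""
    n = len(q)
    return math.exp(-beta * q[i] * (q[(i - 1) % n] + q[(i + 1) % n]))

def generator():
    """Matrix of (A f)(q) = sum_i c(i, q) [f(q^(i)) - f(q)]."""
    size = len(CONFIGS)
    a = [[0.0] * size for _ in range(size)]
    for q in CONFIGS:
        for i in range(SITES):
            c = speed(i, q)
            a[INDEX[q]][INDEX[flip(q, i)]] += c
            a[INDEX[q]][INDEX[q]] -= c
    return a

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[float(i == j) for j in range(n)] for i in range(n)]

def expm(a, t, terms=80):
    """exp(tA) from the power series sum_n t^n A^n / n!."""
    result, term = identity(len(a)), identity(len(a))
    for k in range(1, terms):
        term = [[t * x / k for x in row] for row in matmul(term, a)]
        result = [[x + y for x, y in zip(r, s)] for r, s in zip(result, term)]
    return result

def euler(a, t, doublings):
    """(1 + A t/k)^k with k = 2**doublings, by repeated squaring."""
    n, k = len(a), 2 ** doublings
    step = [[float(i == j) + t * a[i][j] / k for j in range(n)] for i in range(n)]
    for _ in range(doublings):
        step = matmul(step, step)
    return step

def sup_norm(a):
    """Operator norm on functions with the supremum norm: largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

A_mat = generator()
S = expm(A_mat, 0.3)
Q = euler(A_mat, 0.3, 12)
```

Each row of `S` is a probability vector, which is the finite counterpart of the semigroup defining a transition function; the closeness of `S` and `Q` illustrates why the choice between the two representations is a matter of convenience.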
The sought-for semigroup $T_t$, $t \ge 0$, will in a suitable sense be the standard part of the internal semigroup $S_{\Gamma,q_0}(t)$. 7.1.2. REMARK. We shall use the power-series definition of the exponential function
\[ \exp(tA_{\Gamma,q_0}) = \sum_{n=0}^{\infty} \frac{t^n}{n!}\, A_{\Gamma,q_0}^n. \]
Internally, the time evolution is governed by the semigroup
\[ Q(t) = (1 + A_{\Gamma,q_0} \Delta t)^{t/\Delta t} \]
(see Section 5.3), and it might have been more natural to use this representation for $\exp(tA_{\Gamma,q_0})$. However, $Q(t)$ and $S_{\Gamma,q_0}(t)$ have the same standard parts; we are free to choose and shall in this section follow the approach of Helms and Loeb (1979). The basic estimate is contained in the following lemma. 7.1.3. LEMMA.
Let $f \in C(\Omega)$ be a tame function with base $\Lambda$ and let $n \in {}^*\mathbb{N}$; then
\[ \|A_{\Gamma,q_0}^n\, {}^*f_{q_0}\| \le \|f\|\, 2^n M^n\, n!\, \exp[\,|\Lambda| + n(2L + 1)^d\,], \]
where $\|\cdot\|$ is the supremum norm.
This looks more impressive than it really is. Estimates of this form are well known from standard theory [see, e.g., Holley (1970, 1972)]; a detailed proof in the hyperfinite setting is given by Helms and Loeb (1979). Since there is nothing new in principle we omit the verification. We remind the reader of the notion of S-continuity on $\Omega_\Gamma$ that goes with the standard part map $\mathrm{st}_\Gamma$: f is S-continuous if it is internal and $f(q) \approx f(q')$ whenever $\mathrm{st}_\Gamma q = \mathrm{st}_\Gamma q'$. We now come to the basic result on finiteness and S-continuity. 7.1.4. PROPOSITION. If t is a finite non-negative hyperreal and $f \in C(\Omega_\Gamma)$ is a finite-valued S-continuous function on $\Omega_\Gamma$, then $S_{\Gamma,q_0}(t)f$ is finite-valued and S-continuous on $\Omega_\Gamma$.
The proposition is proved in several steps. First we show that it is true for functions of the form ${}^*f_{q_0}$, where f is tame, provided t is chosen such that $2Mt \exp((2L + 1)^d) < 1$. Since the tame functions are dense, we immediately extend to functions $f \in C(\Omega)$. Next we observe that if we can prove 7.1.4 with the above restriction on t, then 7.1.4 holds for all finite non-negative t by the semigroup property. To complete the proof we observe that given a finite-valued S-continuous internal function f on $\Omega_\Gamma$, the standard part ${}^\circ f$, defined by setting ${}^\circ f(q) = {}^\circ(f({}^*q|_\Gamma))$, $q \in \Omega$, is continuous on $\Omega$ and $\|{}^*({}^\circ f)_{q_0} - f\| \approx 0$. From this observation it is not difficult to infer the S-continuity of $S_{\Gamma,q_0}(t)f$ from the S-continuity of $S_{\Gamma,q_0}(t)\,{}^*({}^\circ f)_{q_0}$. And the finiteness part follows since $\|S_{\Gamma,q_0}(t)f\| \le \|f\|$ and $\|f\|$ is finite since by assumption the set $\{n \in {}^*\mathbb{N} \mid |f(q)| \le n, \text{ all } q \in \Omega_\Gamma\}$ contains all infinite integers, hence some finite $n_0 \in \mathbb{N}$. It remains to prove 7.1.4 for f tame with base $\Lambda$ and t satisfying the condition that $2Mt \exp((2L + 1)^d) < 1$. In fact, we shall prove something stronger. Let $q_0, q_0'$ be external configurations, let $q, q' \in \Omega_\Gamma$, and assume that $\mathrm{st}_\Gamma q = \mathrm{st}_\Gamma q'$; then
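\[ S_{\Gamma,q_0}(t)\,{}^*f_{q_0}(q) \approx S_{\Gamma,q_0'}(t)\,{}^*f_{q_0'}(q'). \tag{7} \]
Since f is tame we see from the explicit definition in (5) that $A_{\Gamma,q_0}^n\,{}^*f_{q_0}(q) = A_{\Gamma,q_0'}^n\,{}^*f_{q_0'}(q')$ for all $n \in \mathbb{N}$. By the internal definition principle there must be some $\omega \in {}^*\mathbb{N} - \mathbb{N}$ such that the equality is true for all $n \le \omega$. Since
\[ S_{\Gamma,q_0}(t)\,{}^*f_{q_0}(q) = \sum_{n=0}^{\omega} \frac{t^n}{n!}\, A_{\Gamma,q_0}^n\,{}^*f_{q_0}(q) + \sum_{n \ge \omega+1} \frac{t^n}{n!}\, A_{\Gamma,q_0}^n\,{}^*f_{q_0}(q), \]
we get
\[ S_{\Gamma,q_0}(t)\,{}^*f_{q_0}(q) - S_{\Gamma,q_0'}(t)\,{}^*f_{q_0'}(q') = \sum_{n \ge \omega+1} \frac{t^n}{n!} \bigl[ A_{\Gamma,q_0}^n\,{}^*f_{q_0}(q) - A_{\Gamma,q_0'}^n\,{}^*f_{q_0'}(q') \bigr]. \]
Using the estimate from Lemma 7.1.3, we see that the right-hand side is dominated by
\[ 2\|f\|\, e^{|\Lambda|} \sum_{n \ge \omega+1} t^n 2^n M^n \exp\bigl( n(2L + 1)^d \bigr). \]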
Using the restriction on t, we immediately conclude that the difference is infinitesimal, proving 7.1.4 for tame f. And, as we have explained, this suffices to prove all of 7.1.4. 7.1.5. REMARK. The proof of 7.1.4 tells us that the external configuration $q_0$ and the hyperfinite lattice $\Gamma$ have only an infinitesimal effect on $S_{\Gamma,q_0}(t)\,{}^*f_{q_0}$. This corresponds to the standard fact that the limit thermodynamic quantities are independent of how we go to the limit and how we choose the external configurations as long as we go to the limit in the sense of van Hove; see Section 7.2.
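The last step of the proof of 7.1.4 can be spelled out as a geometric-series estimate. As a sketch, write $r = 2Mt \exp((2L+1)^d)$ and assume, for simplicity, that r stays below 1 by a standard amount (for instance, t standard); the symbol r is introduced here only for this illustration.

```latex
\[
2\|f\|\, e^{|\Lambda|} \sum_{n \ge \omega+1} t^{n} 2^{n} M^{n} \exp\bigl( n(2L+1)^{d} \bigr)
  = 2\|f\|\, e^{|\Lambda|} \sum_{n \ge \omega+1} r^{n}
  = 2\|f\|\, e^{|\Lambda|}\, \frac{r^{\omega+1}}{1-r}
  \approx 0 ,
\]
```

since $\|f\|$, $|\Lambda|$, and $1/(1-r)$ are finite while $r^{\omega+1}$ is infinitesimal for infinite $\omega$.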
It remains to take the standard part of the semigroup $S_{\Gamma,q_0}(t)$. To this end we note that for each $t \in {}^*[0,\infty)$ and $q \in \Omega_\Gamma$, $S_{\Gamma,q_0}(t)$ defines a transition function $U_t^{q_0}(q,\cdot)$ which in turn gives us an internal probability measure $P(E) = U_t^{q_0}(q,E)$ on the algebra $\mathcal{A}$ of internal subsets of $\Omega_\Gamma$. Let $(\Omega_\Gamma, L(\mathcal{A}), L(P))$ be the associated Loeb space; see Section 3.2. Note that whenever $E \in \mathcal{B}(\Omega)$ then $\mathrm{st}_\Gamma^{-1}(E) \in L(\mathcal{A})$. Thus we can define a probability measure $P_s$ on $\mathcal{B}(\Omega)$ by the equation $P_s(E) = L(P)(\mathrm{st}_\Gamma^{-1}E)$; this is by now a familiar technique. We also know from Section 3.2 that if $f \in C(\Omega)$, then
$$\int_\Omega f\,dP_s = \int_{\Omega_\Gamma} {}^\circ({}^*f_{q_0})\,dL(P) = {}^\circ\!\int_{\Omega_\Gamma} {}^*f_{q_0}\,dP. \tag{8}$$
This general construction can be used to define a transition function $T_t^{q_0}$ in the following way. Let $q \in \Omega$ and $t \in [0,\infty)$; this gives a particular transition function $U_t^{q_0}({}^*q \restriction \Gamma, \cdot)$. The general construction then delivers a measure on $\mathcal{B}(\Omega)$; we let $T_t^{q_0}(q,\cdot)$ denote this measure. And the measure introduces an operator $T_t^{q_0}$ on the bounded Borel measurable functions on $\Omega$ by the equation
$$T_t^{q_0}f = \int_\Omega f(q')\,T_t^{q_0}(\cdot, dq').$$
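7.1.6. LEMMA. If $f \in C(\Omega)$, then $T_t^{q_0}f \in C(\Omega)$. Moreover,
$$T_{s+t}^{q_0}(q,E) = \int_\Omega T_t^{q_0}(q',E)\,T_s^{q_0}(q,dq')$$
for all $q \in \Omega$, $E \in \mathcal{B}(\Omega)$, and $s, t \in [0,\infty)$.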
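The proof is rather simple using general facts of the Loeb construction such as (8) and the corresponding semigroup properties of the hyperfinite entities. We show how to prove the continuity assertion. Let $f \in C(\Omega)$; now $\Omega$ is compact so $f$ is uniformly continuous on $\Omega$, hence ${}^*f_{q_0}$ is finite-valued and S-continuous on $\Omega_\Gamma$, which by 7.1.4 implies that $S_{\Gamma,q_0}(t)\,{}^*f_{q_0}$ is also finite-valued and S-continuous. Using (8), we have the following calculation:
$$T_t^{q_0}f(q) = \int_\Omega f\,dP_s = {}^\circ\!\int_{\Omega_\Gamma} {}^*f_{q_0}\,dP = {}^\circ\bigl(S_{\Gamma,q_0}(t)\,{}^*f_{q_0}({}^*q \restriction \Gamma)\bigr).$$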
We conclude that $T_t^{q_0}f \in C(\Omega)$. To prove the second part we use the corresponding property of the transition function $U_t^{q_0}(q,\cdot)$. Observe that we have dropped the $\Gamma$ as an index to the transition function $U_t^{q_0}$; using Remark 7.1.5, we may as well drop the $q_0$ too. In fact the transition function $T_t^{q_0}(q,\cdot)$ and the semigroup $T_t^{q_0}$, $t \ge 0$, are both independent of the set $\Gamma$ and the configuration $q_0$. Thus $T_t^{q_0}$ will from now on simply be denoted by $T_t$.
7.1.7. THEOREM. The family of operators $T_t$, $t \ge 0$, is the unique Feller semigroup whose infinitesimal generator is an extension of the operator $A$ defined in (4).
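To prove strong continuity we need only prove strong continuity at $t = 0$, and this is easy using the estimate in 7.1.3. We show that for $f$ tame
$$\lim_{t \to 0^+} \frac{T_t f - f}{t} = Af.$$
Let $f$ have the finite set $\Lambda \subseteq \mathbb{Z}^d$ as base. Then for any $q \in \Omega_\Gamma$ and small $t$
$$\left| \frac{S_{\Gamma,q_0}(t)\,{}^*f_{q_0}(q) - {}^*f_{q_0}(q)}{t} - A_{\Gamma,q_0}{}^*f_{q_0}(q) \right| \le \sum_{n=2}^{\infty} \frac{t^{n-1}}{n!} \bigl| A^n_{\Gamma,q_0}{}^*f_{q_0}(q) \bigr|.$$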
The right-hand side can be made less than any positive number by choosing $t$ sufficiently small. Uniqueness follows as in the standard case; see, e.g., Holley (1972).
We have thus completed our restricted task of constructing the semigroup $T_t$ via a hyperfinite extension of the infinite lattice system. The theory does not stop here; the next topic of interest is the existence of invariant measures for the $T_t$ semigroup and their connection to the equilibrium states to be considered in the next section; some references are Holley and Stroock (1976, 1977), Liggett (1985), Stroock (1978), and Sullivan (1975).
We have discussed the case of discrete fiber $\{-1,+1\}$. Extension to the case of compact fiber, i.e., to configuration spaces of the form $I^{\mathbb{Z}^d}$, where $I$ is some compact subset of $\mathbb{R}$, is rather immediate. The extension to the case of continuous fiber, i.e., to configuration spaces of the form $\mathbb{R}^\Lambda$, $\Lambda \subset \mathbb{Z}^d$, has also been studied; a reference is Faris (1979). In Section 7.3 we shall discuss both lattice systems with compact fibers and continuous fibers in connection with the global Markov property. But first we turn to classical equilibrium theory.
7.2. EQUILIBRIUM THEORY
In this section we shall give a brief exposition of a hyperfinite approach to classical equilibrium theory. But first a résumé of the classical theory for a finite system.
REMARK. For the standard theory consult Ruelle (1983), Israel (1979), Preston (1976), Gross (1982), and Sinai (1982).
An interaction $\Phi$ is a map from finite nonempty subsets $X \subset \mathbb{Z}^d$ to real-valued continuous functions on $\Omega_X$, i.e., $\Phi(X) \in C(\Omega_X)$. Let $\Lambda$ be a finite subset of $\mathbb{Z}^d$; the Hamiltonian or energy function for $\Lambda$ defined by the interaction $\Phi$ is the function
$$H_\Lambda^\Phi = \sum_{X \subseteq \Lambda} \Phi(X). \tag{1}$$
The classical Ising model is given by the following interaction: let $\Phi(X)(q) = k q_i q_j$ if $X$ consists of two sites $i, j$ with distance 1; let $\Phi(X)(q) = h q_i$ if $X$ consists of exactly one site $i$; otherwise let $\Phi(X)(q) = 0$. We then get
$$H_\Lambda^\Phi(q) = k \sum_{\{i,j\} \subseteq \Lambda,\ |i-j|=1} q_i q_j + h \sum_{i \in \Lambda} q_i. \tag{2}$$
In Section 7.1 we introduced the Ising model via a speed function $c(i,q)$; see formula (3) of 7.1. These are obviously equivalent procedures; in the present section we stick to interactions and their associated Hamiltonians.
The Hamiltonian introduced in (1) allows no interaction between spins inside $\Lambda$ and spins outside $\Lambda$; i.e., we have a situation with free boundary conditions. For many purposes other boundary conditions are important. We discuss one type: an external configuration for $\Lambda$ is an element in $\Omega_{\Lambda^c}$, where $\Lambda^c$ is the complement of $\Lambda$ in $\mathbb{Z}^d$. (As pointed out in Section 7.1, we are here in a terminological quandary. In nonstandard analysis "external" means not internal. In statistical mechanics "external" is defined with respect to the complement of a given set $\Lambda$; in this sense an external configuration may well be internal as an entity of nonstandard analysis. We trust that the reader can live with this dilemma.) An external configuration $q \in \Omega_{\Lambda^c}$ defines a Hamiltonian
$${}_qH_\Lambda^\Phi(q') = \sum_{X \cap \Lambda \neq \emptyset} \Phi(X)(q' \times q), \tag{3}$$
where $q'$ is any configuration in $\Omega_\Lambda$. Here $(q' \times q)(i) = q'(i)$ if $i \in \Lambda$ and $(q' \times q)(i) = q(i)$ if $i \in \Lambda^c$. A Hamiltonian of the form $H_\Lambda^\Phi$ always makes sense; the sum in (3) may diverge. But if $\Phi$ is a finite-range interaction, i.e., there exists some number
$l$ such that $\Phi(X) = 0$ if the diameter of $X$ is larger than $l$, then (3) is well defined. We shall in this brief exposition mostly restrict ourselves to finite-range interactions, but the theory has wider scope; see Ostebee et al. (1976) for results involving infinite-range interactions (Coulomb systems).
Let $\mathcal{B}_0$ be the linear space of finite-range interactions. Completing with respect to the norm
$$|||\Phi||| = \sum_{X \ni 0} \frac{\|\Phi(X)\|_\infty}{|X|}$$
we get a Banach space $\mathcal{B}$, and completing $\mathcal{B}_0$ with respect to
$$\|\Phi\|_\sim = \sum_{X \ni 0} \|\Phi(X)\|_\infty$$
we get a Banach space $\tilde{\mathcal{B}}$. We see that $\mathcal{B}_0 \subset \tilde{\mathcal{B}} \subset \mathcal{B}$ and that $\mathcal{B}_0$ is dense in both $\tilde{\mathcal{B}}$ and $\mathcal{B}$. Much of the theory extends from interactions in $\mathcal{B}_0$ to $\tilde{\mathcal{B}}$ or sometimes to all of $\mathcal{B}$. We see that ${}_qH_\Lambda^\Phi$ is finite if $\Phi \in \tilde{\mathcal{B}}$.
We shall also assume that we are dealing with translation-invariant interactions: given $i \in \mathbb{Z}^d$ there is a natural map $\tau_i : \Omega_{X+i} \to \Omega_X$ obtained by setting $(\tau_i q)(j) = q(j+i)$. This map can be lifted to a map $\tau_i : C(\Omega_X) \to C(\Omega_{X+i})$. We say that $\Phi$ is translation-invariant if $\Phi(X+i) = \tau_i\Phi(X)$ for all finite $X$ and all $i \in \mathbb{Z}^d$.
From interactions and Hamiltonians we pass to the important thermodynamic entities such as pressure, entropy, and mean energy. These are obtained by a suitable averaging process. Since our basic space is $\Omega_\Lambda = \Omega_0^\Lambda$, where $\Omega_0 = \{-1,+1\}$, a natural a priori measure on it is the normalized counting measure; i.e., each configuration $q \in \Omega_\Lambda$ is given the weight $2^{-|\Lambda|}$. Given any function $F \in C(\Omega_\Lambda)$, we let $\langle F \rangle_0$ denote the expected value of $F$ with respect to the normalized counting measure. The partition function associated with the interaction $\Phi$ and the finite set $\Lambda$ is the function
$$Z_\Lambda = \langle e^{-H_\Lambda^\Phi} \rangle_0. \tag{4}$$
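A state of the finite system is a probability measure on the space of configurations $\Omega_\Lambda$. The Gibbs state or equilibrium state is defined by giving each $q \in \Omega_\Lambda$ the weight
$$\rho_\Lambda(q) = Z_\Lambda^{-1} e^{-H_\Lambda^\Phi(q)}\, 2^{-|\Lambda|}. \tag{5}$$
The thermal average is the expected value with respect to the equilibrium state $\rho_\Lambda$. Thus given $F \in C(\Omega_\Lambda)$, the thermal average $\langle F \rangle_\Lambda$ is given by
$$\langle F \rangle_\Lambda = Z_\Lambda^{-1} \langle e^{-H_\Lambda^\Phi} F \rangle_0. \tag{6}$$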
REMARK. In the classical case with fiber $\Omega_0 = \{-1,+1\}$ it is more usual to use the counting measure rather than the normalized counting measure; i.e., we would have replaced $Z_\Lambda = \langle e^{-H_\Lambda^\Phi} \rangle_0$ by $Z_\Lambda = \sum_{q \in \Omega_\Lambda} e^{-H_\Lambda^\Phi(q)}$. This would not change the Gibbs measure. We have chosen the normalized counting measure in order to facilitate the comparison with recent standard expositions; see, e.g., Israel (1979).
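The remark can be checked directly on a small system. The following is a minimal sketch, assuming a one-dimensional free-boundary chain with the nearest-neighbour interaction described above; the function names and the values of k and h are illustrative choices, not taken from the text. It computes the partition function with and without the normalizing factor and compares the resulting Gibbs weights.

```python
import itertools
import math


def ising_energy(q, k=0.5, h=0.2):
    """Free-boundary chain Hamiltonian: k * sum q_i q_{i+1} + h * sum q_i."""
    pair_sum = sum(q[i] * q[i + 1] for i in range(len(q) - 1))
    return k * pair_sum + h * sum(q)


def gibbs_weights(n, normalized, k=0.5, h=0.2):
    """Return (Z, weights) for a chain of n sites.

    normalized=True uses the a priori weight 2^{-n} per configuration,
    normalized=False uses the plain counting measure.
    """
    configs = list(itertools.product((-1, 1), repeat=n))
    prior = 2.0 ** (-n) if normalized else 1.0
    boltzmann = [prior * math.exp(-ising_energy(q, k, h)) for q in configs]
    z = sum(boltzmann)
    return z, [b / z for b in boltzmann]


z_norm, w_norm = gibbs_weights(4, normalized=True)
z_count, w_count = gibbs_weights(4, normalized=False)
# The partition functions differ by the factor 2^n; the Gibbs weights agree.
max_weight_difference = max(abs(a - b) for a, b in zip(w_norm, w_count))
```

The two partition functions differ by the constant factor $2^{|\Lambda|}$, which cancels when the weights are normalized; this is why the choice of a priori measure only shifts the pressure by $\ln 2$.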
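We can now introduce the pressure and mean energy by
$$P_\Lambda(\Phi) = |\Lambda|^{-1} \ln Z_\Lambda, \qquad \langle H_\Lambda^\Phi \rangle_\Lambda = Z_\Lambda^{-1} \langle e^{-H_\Lambda^\Phi} H_\Lambda^\Phi \rangle_0. \tag{7}$$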
Here we have used a Hamiltonian with free boundary conditions; the same definitions extend to Hamiltonians ${}_qH_\Lambda^\Phi$ with external boundary conditions $q \in \Omega_{\Lambda^c}$. Let $\rho$ be a state for the finite system $\Lambda$; i.e., $\rho$ is a probability measure on the space $\Omega_\Lambda$ that gives a weight $\rho(q)$ to the point $q \in \Omega_\Lambda$. Let $\rho^{(\Lambda)}(q) = \rho(q)\, 2^{|\Lambda|}$. In fancy language $\rho^{(\Lambda)}$ is the Radon-Nikodym derivative of $\rho$ with respect to the normalized counting measure on $\Omega_\Lambda$. And this is the appropriate language to use when the fiber $\Omega_0$ is an arbitrary compact metric space provided with some a priori probability measure $\mu_0$. The entropy of $\rho$ in $\Lambda$ is defined as
$$S_\Lambda(\rho) = -\langle \rho^{(\Lambda)} \ln \rho^{(\Lambda)} \rangle_0. \tag{8}$$
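Since the quantities in (4) to (8) are finite sums, they can be evaluated by brute force for a small system. The sketch below assumes a free-boundary chain of four sites with illustrative parameters k and h (all names are hypothetical). It evaluates the entropy of the Gibbs state and of a randomly chosen state, and compares $S_\Lambda(\rho) - \rho(H_\Lambda^\Phi)$ with $\ln Z_\Lambda = |\Lambda| P_\Lambda(\Phi)$, which is the comparison made by the Jensen-type variational inequality for finite systems.

```python
import itertools
import math
import random


def hamiltonian(q, k, h):
    """Free-boundary chain energy: k * sum q_i q_{i+1} + h * sum q_i."""
    pairs = sum(q[i] * q[i + 1] for i in range(len(q) - 1))
    return k * pairs + h * sum(q)


def gibbs_system(n, k, h):
    """Energies, normalized partition function and Gibbs weights for n sites."""
    configs = list(itertools.product((-1, 1), repeat=n))
    energies = [hamiltonian(q, k, h) for q in configs]
    prior = 2.0 ** (-n)  # normalized counting measure
    z = sum(prior * math.exp(-e) for e in energies)
    gibbs = [prior * math.exp(-e) / z for e in energies]
    return energies, z, gibbs


def entropy(state):
    """S(rho) = -<rho' ln rho'>_0 with rho'(q) = rho(q) * 2^n, as in (8)."""
    size = len(state)  # equals 2^n
    return -sum(p * math.log(p * size) for p in state if p > 0.0)


def expectation(state, values):
    return sum(p * v for p, v in zip(state, values))


def random_state(size, rng):
    raw = [rng.random() for _ in range(size)]
    total = sum(raw)
    return [r / total for r in raw]


N_SITES, K, H = 4, 0.5, 0.2
energies, z, gibbs = gibbs_system(N_SITES, K, H)
log_z = math.log(z)  # equals |Lambda| * P_Lambda(Phi)

# For the Gibbs state the gap is zero; for any other state it is negative.
gibbs_gap = entropy(gibbs) - expectation(gibbs, energies) - log_z
other = random_state(len(energies), random.Random(0))
other_gap = entropy(other) - expectation(other, energies) - log_z
```

With this sign convention the entropy is never positive and never smaller than $-|\Lambda| \ln 2$, since it is minus the relative entropy of $\rho$ with respect to the normalized counting measure.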
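We see that $-|\Lambda| \ln 2 \le S_\Lambda(\rho) \le 0$, and for the Gibbs state in $\Lambda$ we obtain
$$S_\Lambda(\rho_\Lambda) = |\Lambda| P_\Lambda(\Phi) + \langle H_\Lambda^\Phi \rangle_\Lambda. \tag{9}$$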
We have the following variational principle.
7.2.1. PROPOSITION. For any state $\rho$ of the finite system $\Lambda$ with Hamiltonian $H_\Lambda^\Phi$,
$$S_\Lambda(\rho) - \rho(H_\Lambda^\Phi) \le |\Lambda| P_\Lambda(\Phi).$$
Equality holds if and only if $\rho$ is the Gibbs state $\rho_\Lambda$.
Here $\rho(F)$ is the expected value with respect to the measure $\rho$. The proof is a simple application of the well-known Jensen inequality.
We have now completed our sketch of the finite classical theory. But finite systems have limited physical interest; e.g., no finite system can exhibit the phenomenon of phase transition. In order to obtain a more meaningful physical model we must pass to the thermodynamic or bulk limit.
The configuration space of the infinite system will be $\Omega = \{-1,+1\}^{\mathbb{Z}^d}$. An interaction for the infinite system will, as for the finite systems, be a map $\Phi$ from finite subsets $X$ of $\mathbb{Z}^d$ to real-valued continuous functions on $\Omega_X$. For the rest we cannot give direct explicit formulas but have to resort
to limiting procedures; e.g., the pressure $P(\Phi)$ for the infinite system should be obtained as
$$P(\Phi) = \lim_{\Lambda \to \infty} P_\Lambda(\Phi) = \lim_{\Lambda \to \infty} |\Lambda|^{-1} \ln Z_\Lambda$$
as $\Lambda$ converges to $\mathbb{Z}^d$. The same should be true for entropy and mean energy.
The crucial notion is here the idea of limit in the sense of van Hove. For each positive integer $a$ we partition the space $\mathbb{Z}^d$ into a family $\mathcal{C}_a$ of cubes of the form
$$\{ i \in \mathbb{Z}^d \mid n_j a \le i_j < (n_j + 1)a,\ j = 1, \ldots, d \}$$
for integers $n_1, \ldots, n_d$. Let $\Lambda$ be a finite subset of $\mathbb{Z}^d$; we define:
$N_a^+(\Lambda)$ = number of cubes in $\mathcal{C}_a$ which intersect $\Lambda$,
$N_a^-(\Lambda)$ = number of cubes in $\mathcal{C}_a$ contained in $\Lambda$.
We then say that a sequence $(\Lambda_n)_{n \in \mathbb{N}}$ of finite subsets of $\mathbb{Z}^d$ converges to infinity in the sense of van Hove if for all $a$
(i) $N_a^-(\Lambda_n) \to \infty$,
(ii) $N_a^-(\Lambda_n)/N_a^+(\Lambda_n) \to 1$.
In order to formulate the basic convergence results we need the notion of a state of the infinite system: $\rho$ is a state of the infinite system if $\rho$ is a probability measure on the configuration space $\Omega$. Note that $\rho$ induces a state on every finite system $\Lambda$ since there is a natural embedding $\iota_\Lambda$ of $C(\Omega_\Lambda)$ into $C(\Omega)$, namely $(\iota_\Lambda f)(q) = f(q \restriction \Lambda)$, $q \in \Omega$. Thus the probability that the restriction of $\rho$ assigns to an element $q$ of the finite set $\Omega_\Lambda$ is simply $\rho(\iota_\Lambda \chi_q) = \rho(\{q \times q' \mid q' \in \Omega_{\Lambda^c}\})$, where $\chi_q$ is the characteristic function of the singleton $\{q\} \subseteq \Omega_\Lambda$. Given a state $\rho$ of the infinite system we have for each finite $\Lambda$ a well-defined entity $S_\Lambda(\rho)$, the entropy of the state $\rho$ in $\Lambda$. Since $\rho$ is a probability measure on $\Omega$ we have an immediate notion of translation-invariant state; we let $E^I$ denote the set of translation-invariant states for the infinite system.
The basic convergence results of the classical theory can now be stated.
7.2.2. PROPOSITION. (i) Let $\Phi \in \mathcal{B}$; then $\lim_{\Lambda_n \to \infty} P_{\Lambda_n}(\Phi)$ converges for all sequences $\Lambda_n$ that go to infinity in the sense of van Hove, and the limit is independent of the particular van Hove sequence.
(ii) Let $\rho \in E^I$; then $\lim_{\Lambda_n \to \infty} |\Lambda_n|^{-1} S_{\Lambda_n}(\rho)$ converges for all sequences $\Lambda_n$ that go to infinity in the sense of van Hove, and the limit is independent of the particular van Hove sequence.
We use $P(\Phi)$ to denote the infinite-volume pressure and $S(\rho)$ to denote the mean entropy of the infinite system in state $\rho$.
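The two cube counts used in the definition of van Hove convergence are easy to compute for concrete sets. The sketch below is an illustration under stated assumptions: the helper names `cube_counts` and `box` are hypothetical, and the sequence of centred boxes is just one convenient example of a sequence for which the ratio of the two counts approaches 1.

```python
import itertools


def cube_counts(sites, a):
    """Return (N_plus, N_minus) for a finite set of sites in Z^d.

    The cube with index (n_1, ..., n_d) consists of the points i with
    n_j * a <= i_j < (n_j + 1) * a.  N_plus counts cubes that intersect
    the set, N_minus counts cubes entirely contained in it.
    """
    sites = set(sites)
    dimension = len(next(iter(sites)))
    hits = {}
    for site in sites:
        # Floor division gives the correct cube index for negative coordinates too.
        index = tuple(coordinate // a for coordinate in site)
        hits[index] = hits.get(index, 0) + 1
    n_plus = len(hits)
    n_minus = sum(1 for count in hits.values() if count == a ** dimension)
    return n_plus, n_minus


def box(n, d=2):
    """The centred box {-n, ..., n}^d."""
    return itertools.product(range(-n, n + 1), repeat=d)


ratios = {}
for n in (5, 50):
    n_plus, n_minus = cube_counts(box(n), a=3)
    ratios[n] = n_minus / n_plus
```

For such boxes the cubes that intersect the set without being contained in it form a boundary layer whose share of all intersecting cubes shrinks as the box grows, which is the content of condition (ii).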
The next topic would be to discuss the appropriate notion of equilibrium state for the infinite system and to provide the necessary existence results. But here we shall take leave of the classical theory and turn to a hyperfinite version of the limit theory.
In the hyperfinite approach we choose an infinite integer $N \in {}^*\mathbb{N} - \mathbb{N}$ and choose a hyperfinite rectangle $\Gamma$ of diameter $2N$ as the set of sites for the hyperfinite limit system. We recall from Section 7.1 that $\Omega_\Gamma$ is the set of internal mappings from $\Gamma$ to $\{-1,+1\}$ and that $C(\Omega_\Gamma)$ is the set of all hyperreal-valued functions on $\Omega_\Gamma$ in the maximum norm. We shall for simplicity restrict our attention to interactions ${}^*\Phi$, where $\Phi$ is a standard finite-range interaction. Let $q$ be an external configuration for the hyperfinite system (in the sense explained in Section 7.1). We then have a Hamiltonian
$${}_qH_\Gamma^{{}^*\Phi}(q') = \sum_{\Gamma \cap X \neq \emptyset} {}^*\Phi(X)(q' \times q), \tag{10}$$
where $X$ is hyperfinite and $q' \in \Omega_\Gamma$. We also have a well-defined partition function
$$Z_{\Gamma,q} = \bigl\langle \exp\bigl(-{}_qH_\Gamma^{{}^*\Phi}\bigr) \bigr\rangle_0, \tag{11}$$
where we take the average with respect to the internal normalized counting measure on $\Omega_\Gamma$. An internal state of the hyperfinite system $\Gamma$ is an internal probability measure on the space of configurations $\Omega_\Gamma$. By analogy with the finite theory we can introduce:
7.2.3. DEFINITION. The internal equilibrium state with boundary condition $q_0$ is the internal probability measure
$$\rho_{q_0}(q) = Z_{\Gamma,q_0}^{-1} \exp\bigl(-{}_{q_0}H_\Gamma^{{}^*\Phi}(q)\bigr)\, 2^{-|\Gamma|}. \tag{12}$$
This is in complete analogy with definition (5) and introduces $\rho_{q_0}$ as an internal finitely additive probability measure on the algebra of internal subsets of $\Omega_\Gamma$. As usual we can use the Loeb construction to obtain a $\sigma$-additive probability structure $(\Omega_\Gamma, L(\mathcal{A}), L(\rho_{q_0}))$, where $\mathcal{A}$ denotes the algebra of internal subsets of $\Omega_\Gamma$.
7.2.4. DEFINITION. An equilibrium state with boundary condition $q_0$ for the infinite limit system $\Omega_\Gamma$ is any probability measure of the form $L(\rho_{q_0})$, where $\rho_{q_0}$ is the internal measure given by Definition 7.2.3.
Thus in the hyperfinite case the existence of equilibrium states for the limit systems is trivial. It remains to verify that they are well behaved with respect to the finite-dimensional approximations. We shall also discuss how the equilibrium states in 7.2.4 are related to the equilibrium states of the standard theory.
In the standard theory equilibrium states may be characterized as those states which satisfy the Dobrushin-Lanford-Ruelle equations. In our hyperfinite setting the definition is particularly simple. Let $\rho$ be an internal state of the system $\Omega_\Gamma$. Let $\Lambda$ be finite and $q \in \Omega_\Lambda$. The conditional probability for the configuration $q$ in $\Lambda$ given the external configuration $q_0 \in \Omega_{\Gamma-\Lambda}$ is
$$\rho(q \mid q_0) = \frac{\rho(q \times q_0)}{\sum_{q' \in \Omega_\Lambda} \rho(q' \times q_0)}. \tag{13}$$
Since $\Omega_\Lambda$ is finite, $\rho(q \mid q_0)$ is well defined as a hyperreal number.
7.2.5. DEFINITION. The internal state $\rho$ satisfies the DLR equations for the interaction $\Phi$ if for all finite $\Lambda \subset \mathbb{Z}^d$, all $q_1, q_2 \in \Omega_\Lambda$, and all $q_0 \in \Omega_{\Gamma-\Lambda}$ we have
$$\rho(q_1 \mid q_0) = \exp\bigl({}_{q_0}H_\Lambda^\Phi(q_2) - {}_{q_0}H_\Lambda^\Phi(q_1)\bigr)\,\rho(q_2 \mid q_0). \tag{14}$$
Since $\Phi$ is of finite range, ${}_{q_0}H_\Lambda^\Phi(q_i)$, $i = 1, 2$, are the classical entities. (To be pedantic we should have written $q_0$ ...)
7.2.6. PROPOSITION. Let $\rho_{q_0}$ be an internal equilibrium state for the infinite system. Then $\rho_{q_0}$ satisfies the DLR equations.
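The proof is very simple; given a finite $\Lambda$, $q \in \Omega_\Lambda$, and $q_0' \in \Omega_{\Gamma-\Lambda}$, we observe that
$${}_{q_0}H_\Gamma^{{}^*\Phi}(q \times q_0') = \sum_{\Lambda \cap X \neq \emptyset,\ \mathrm{diam}(X) \le l} {}^*\Phi(X)(q \times q_0') + K = {}_{q_0'}H_\Lambda^\Phi(q) + K,$$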
where $K$ is independent of $q$ and $l$ is such that if $\mathrm{diam}(X) \ge l$ then ${}^*\Phi(X) = 0$.
The definition of pressure and entropy for the hyperfinite system $\Gamma$ is straightforward. But whereas $Z_{\Gamma,q_0}$ is an infinite hyperreal number, we want the pressure and entropy to be finite. To this end we quote a few immediate inequalities of the finite theory. A simple calculation shows that for finite $\Lambda$
$$\|H_\Lambda^\Phi\|_\infty \le |\Lambda| \cdot |||\Phi|||. \tag{15}$$
Hence
$$|P_\Lambda(\Phi)| \le |\Lambda|^{-1} \|H_\Lambda^\Phi\|_\infty \le |||\Phi|||, \qquad |P_\Lambda(\Phi) - P_\Lambda(\Psi)| \le |||\Phi - \Psi|||. \tag{16}$$
If we use a Hamiltonian with external configuration $q_0$ we have
$$\|{}_{q_0}H_\Lambda^\Phi\|_\infty \le |\Lambda| \cdot \|\Phi\|_\sim, \tag{17}$$
from which we get the analog of (15) with $|||\cdot|||$ replaced by $\|\cdot\|_\sim$.
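Let $\Phi$ be a finite-range interaction, ${}^*\Phi$ its extension, and $q_0$ an external configuration for $\Gamma$. We note that $\|{}^*\Phi\|_\sim = \|\Phi\|_\sim$. By transfer on (17) we get $\|{}_{q_0}H_\Gamma^{{}^*\Phi}\| \le |\Gamma| \cdot \|{}^*\Phi\|_\sim$, which implies that $|{}_{q_0}P_\Gamma({}^*\Phi)| \le \|\Phi\|_\sim$; i.e., the nonstandard pressure is finite and thus has a standard part.
7.2.7. PROPOSITION. Let ${}_{q_0}P_\Gamma({}^*\Phi) = |\Gamma|^{-1} \ln Z_{\Gamma,q_0}$ be the nonstandard pressure in the hyperfinite system $\Gamma$ associated with the finite-range interaction $\Phi$ and the external configuration $q_0$. Then ${}_{q_0}P_\Gamma({}^*\Phi)$ is finite and
$$\mathrm{st}\bigl({}_{q_0}P_\Gamma({}^*\Phi)\bigr) = \lim_{\Lambda_n \to \infty} P_{\Lambda_n}(\Phi)$$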
for any sequence $\Lambda_n$ that converges to infinity in the sense of van Hove.
The proof is a simple adaptation of the classical convergence arguments as set out, e.g., in Israel (1979), Theorems I.2.3-5, and need not be repeated here. We emphasize that from our point of view we have a direct definition of the pressure for the hyperfinite system as $\mathrm{st}({}_{q_0}P_\Gamma({}^*\Phi))$; this is a finite real number which by Proposition 7.2.7 corresponds to the standard one.
The notion of entropy for the hyperfinite system also presents no difficulties; we simply use formula (8) above in order to define $S_\Gamma(\rho)$. And in a way similar to Proposition 7.2.7 we shall identify the nonstandard entropy with the usual van Hove limiting sequence. But recall that Proposition 7.2.2 requires translation-invariant states; we have the following result:
7.2.8. PROPOSITION. Let $\rho$ be a standard translation-invariant state on $\Omega$; then $|\Gamma|^{-1} S_\Gamma({}^*\rho)$ is a finite hyperreal number and
$$\mathrm{st}\bigl(|\Gamma|^{-1} S_\Gamma({}^*\rho)\bigr) = \lim_{\Lambda_n \to \infty} |\Lambda_n|^{-1} S_{\Lambda_n}(\rho)$$
for any sequence $\Lambda_n$ which converges to infinity in the sense of van Hove.
Finiteness is immediate from the fact that $-\ln 2 \le |\Gamma|^{-1} S_\Gamma({}^*\rho) \le 0$; see (8) above. The convergence result, which identifies the hyperfinite entity with the standard one, is an adaptation of the standard convergence result; see Theorem II.2.2 in Israel (1979). Note that $S(\rho)$ is well defined for any internal state on $\Omega_\Gamma$; it is for the convergence result that we require a state of the type ${}^*\rho$, where $\rho \in E^I$.
7.2.9. REMARK. Let us summarize what we have obtained so far. We have chosen a hyperfinite rectangle $\Gamma$ as the set of sites for the limit system. We have introduced the notion of an internal equilibrium state for the system $\Omega_\Gamma$, and we have verified that these states satisfy the DLR equations for finite sets $\Lambda \subseteq \mathbb{Z}^d$. We have further introduced pressure and entropy for the hyperfinite system and showed that they are equal to the limit entities of the standard theory. Thus we have a clean and satisfactory definition of the thermodynamic limit.
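The independence of the limiting pressure from the boundary condition can be observed numerically in one dimension, where the finite-volume partition function is computable by a transfer-matrix recursion. The following is a sketch only, assuming the chain interaction $k q_i q_{i+1} + h q_i$ with weights $e^{-H}$ and the normalized counting measure; the function names, the parameters `left` and `right` (boundary spins, 0 meaning free), and the numerical values are illustrative assumptions.

```python
import math

SPINS = (-1, 1)


def log_partition(n, k, h, left=0, right=0):
    """ln Z for an n-site chain with weights exp(-H) and a priori weight 2^{-n}.

    H = k * sum q_i q_{i+1} + h * sum q_i, plus the couplings k*left*q_1 and
    k*q_n*right to fixed boundary spins (0 means free boundary).
    """
    # vec[a]: total weight of partial chains whose last spin is SPINS[a]
    vec = [math.exp(-h * s - k * left * s) for s in SPINS]
    log_scale = 0.0
    for _ in range(n - 1):
        vec = [
            sum(vec[a] * math.exp(-k * SPINS[a] * t - h * t) for a in range(2))
            for t in SPINS
        ]
        norm = sum(vec)  # rescale to keep the numbers in floating-point range
        vec = [v / norm for v in vec]
        log_scale += math.log(norm)
    total = sum(vec[b] * math.exp(-k * right * SPINS[b]) for b in range(2))
    return log_scale + math.log(total) - n * math.log(2.0)


def pressure(n, k, h, left=0, right=0):
    return log_partition(n, k, h, left, right) / n


def limit_pressure(k, h):
    """ln(largest transfer-matrix eigenvalue) - ln 2."""
    lam = math.exp(-k) * math.cosh(h) + math.sqrt(
        math.exp(-2 * k) * math.sinh(h) ** 2 + math.exp(2 * k)
    )
    return math.log(lam) - math.log(2.0)


K, H = 0.5, 0.2
target = limit_pressure(K, H)
finite_volume = {
    bc: pressure(400, K, H, *bc) for bc in ((0, 0), (1, 1), (-1, -1), (1, -1))
}
```

The boundary spins change $\ln Z$ only by a bounded amount, so their effect on the pressure is of order $1/n$; in the hyperfinite setting this difference becomes infinitesimal.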
But the usual limit space would be $\Omega = \{-1,+1\}^{\mathbb{Z}^d}$. There is no problem, however, in using the standard part map $\mathrm{st}_\Gamma$ to obtain an equilibrium state for the infinite system $\Omega$ from an internal state $\rho_{q_0}$ on $\Omega_\Gamma$. This is a simple application of the Loeb construction. From $\rho_q$ we obtain a $\sigma$-additive measure $L(\rho_q)$ on $\Omega_\Gamma$ with respect to the $\sigma$-algebra $L(\mathcal{A})$. From $L(\rho_q)$ we introduce a state $\tilde\rho_q$ on $\Omega$ by setting
$$\tilde\rho_q(B) = L(\rho_q)(\mathrm{st}_\Gamma^{-1}B) \tag{18}$$
for Borel sets $B$ in $\Omega$. It is not difficult to show that $\tilde\rho_q$ will satisfy the DLR equations for finite $\Lambda$; thus $\tilde\rho_q$ is an equilibrium state for the infinite system $\Omega$.
Do we get all standard equilibrium states from our hyperfinite model? The answer is essentially yes, but we have to complicate Definitions 7.2.3 and 7.2.4 a bit. And for simplicity we stick to a nearest-neighbor interaction $\Phi$ for the remainder of this section.
7.2.10. DEFINITION. Let $\mu$ be a hyperfinite probability measure on $\Omega_{\partial\Gamma}$, where $\partial\Gamma$ is the boundary of $\Gamma$. The internal equilibrium state with boundary distribution $\mu$ is the internal probability measure
$$\rho_\mu = \sum_{q \in \Omega_{\partial\Gamma}} \mu(q)\,\rho_q, \tag{19}$$
where $\rho_q$ is given by (12) in Definition 7.2.3. From this we get the corresponding extension of Definition 7.2.4.
7.2.11. DEFINITION. An equilibrium state for the infinite system is any probability measure of the form $L(\rho_\mu)$, where $\rho_\mu$ is the internal measure given by Definition 7.2.10.
Using the format of (18) in Remark 7.2.9, we may associate with $\rho_\mu$ an equilibrium state $\tilde\rho_\mu$ on the classical limit system $\Omega$. And now we can argue that every standard equilibrium state $\rho$ on $\Omega$ can be written in the form $\tilde\rho_\mu$ for some suitable measure $\mu$ on $\Omega_{\partial\Gamma}$. This follows from the well-known fact that the restriction of the equilibrium state $\rho$ of $\Omega$ to a finite-volume lattice $\Lambda$ can be written in the form
$$\rho \restriction \Lambda = \sum_{q \in \Omega_{\partial\Lambda}} \mu(q)\,\rho_q, \tag{20}$$
where $\rho_q$ is the Gibbs measure on $\Omega_\Lambda$ with external configuration $q$ on $\Omega_{\partial\Lambda}$ and $\mu$ is a probability measure on $\Omega_{\partial\Lambda}$. Via transfer it follows that the given $\rho$ is indeed of the form $\tilde\rho_\mu$ for some suitable internal measure $\mu$ on $\Omega_{\partial\Gamma}$. The result is not restricted to nearest-neighbor interactions; see Hurd (1981), who gives a nonstandard version of the approach using projective families of specifications [consult Preston (1976) for the standard theory].
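7.2.12. REMARK. Combining our remarks above with formulas (18) and (19) we have the following representation of equilibrium states in terms of "external" states
$$\tilde\rho_\mu(B) = L\Bigl(\sum_{q \in \Omega_{\partial\Gamma}} \mu(q)\,\rho_q\Bigr)(\mathrm{st}_\Gamma^{-1}B); \tag{21}$$
i.e., given any standard equilibrium state $\rho$ we can find a measure $\mu$ on $\Omega_{\partial\Gamma}$ such that $\rho = \tilde\rho_\mu$ can be written as in (21).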
This would have been a "correct" representation in terms of (standard) extremal points if we knew that "extremal" or "pure" in the standard sense coincided with the notion of being of the form $\tilde\rho_q$ for some $q \in \Omega_{\partial\Gamma}$. Half of this statement is true; if $\rho$ is standard pure then the representation in (21) implies that $\rho$ must be of the form $\rho = \tilde\rho_q$ for some $q \in \Omega_{\partial\Gamma}$. There are examples, however, which show that the converse may fail. To understand the following argument the reader should refer ahead to our discussion of phase transitions at the end of this section.
It is known that in the two-dimensional Ising model with no external field there are exactly two pure states provided the "inverse temperature" $\beta$ is large enough, namely $\tilde\rho_+ = \tilde\rho_{+,\beta,0}$ and $\tilde\rho_- = \tilde\rho_{-,\beta,0}$. Let $\Gamma$ be a two-dimensional rectangle with an even number of sites on each edge (see Fig. 7.1). Let $q \in \Omega_{\partial\Gamma}$ be the configuration which assigns $+1$ to every second site on $\partial\Gamma$ and $-1$ to the rest as indicated by pluses and minuses in Fig. 7.1. Consider the state $\tilde\rho_q$. If it were standard pure it would have to be either $\tilde\rho_+$ or $\tilde\rho_-$. Assume that $\tilde\rho_q = \tilde\rho_+$. If we reflect about the line $L$ the effect is to interchange pluses and minuses on the boundary $\partial\Gamma$. This gives a new state $\tilde\rho_{q'}$, where $q'$ is obtained from $q$ by reversing the spin at each site in $\partial\Gamma$. Reversing spins changes $\tilde\rho_+$ into $\tilde\rho_-$, hence $\tilde\rho_{q'}$ would have to be equal to $\tilde\rho_-$. But the new state can also be obtained from the old one by a rotation, and $\tilde\rho_+$ is obviously unchanged by a rotation. Thus $\tilde\rho_{q'}$ would also have to be equal to $\tilde\rho_+$, a contradiction which shows that $\tilde\rho_q$ cannot be standard pure. In fact, it is not difficult to see that
$$\tilde\rho_q = \tfrac{1}{2}\tilde\rho_+ + \tfrac{1}{2}\tilde\rho_-.$$
[Figure 7.1: the rectangle $\Gamma$ with alternating $+$ and $-$ spins on the boundary $\partial\Gamma$ and the line of reflection $L$.]
The reader could also discuss this example in terms of finite approximations with appropriate boundary values.
We now state a variational principle for the hyperfinite theory. Let $\rho_q({}^*\Phi) = \mathrm{st}\bigl(|\Gamma|^{-1}\rho_q({}_qH_\Gamma^{{}^*\Phi})\bigr)$ for $\rho_q$ an internal state on $\Omega_\Gamma$.
7.2.13. PROPOSITION. Let $\Phi$ be a finite-range interaction and $\rho_q$ an internal equilibrium state for $\Omega_\Gamma$. Then
$$S(\rho_q) = P({}^*\Phi) + \rho_q({}^*\Phi).$$
The proof is simple. It follows by transfer from Proposition 7.2.1 that
$$S_\Gamma(\rho_q) = |\Gamma|\,{}_qP_\Gamma({}^*\Phi) + \rho_q\bigl({}_qH_\Gamma^{{}^*\Phi}\bigr).$$
And that is all we need to know. Note that both $S(\rho_q)$ and $\rho_q({}^*\Phi)$ are independent of $q$.
7.2.14. REMARK.
Macroscopic thermodynamics is to a large extent based on the relation
$$T\,dS = dE + P\,dV.$$
This is essentially the variational principle, 7.2.13, for the hyperfinite system. To make the proper identification we note that for any function $f$ such that $\lim_{x \to \infty} df(x)/dx$ exists, $\lim_{x \to \infty} f(x)/x$ exists, and the two limits are equal. When $\Gamma$ is hyperfinite we can thus identify
$$P = \frac{1}{\beta}\frac{\partial \ln Z}{\partial V} \ \text{and}\ P({}^*\Phi), \qquad \frac{dS}{dV} \ \text{and}\ S(\rho_q), \qquad \frac{dE}{dV} \ \text{and}\ \rho_q({}^*\Phi),$$
where in the hyperfinite system the temperature has been absorbed into the energy function.
We conclude our exposition of the equilibrium theory by a brief discussion of phase transitions. Phase transitions do not occur in finite lattice systems. We shall briefly explain how they may occur in hyperfinite systems as a kind of "change of scale" phenomenon, namely as a consequence of the distinction between internal or *-differentiability versus external or S-differentiability. We shall not develop a general theory but restrict attention to the classical Ising model, and in this we shall follow closely the exposition in Helms and Loeb (1979).
Let $\Gamma$ be a hyperfinite lattice, $q$ a configuration in $\Omega_\Gamma$, and $q_0$ an external boundary condition, i.e., $q_0 \in \Omega_{{}^*\mathbb{Z}^d - \Gamma}$. We recall the notation $q \times q_0$ for the combined configuration, i.e.,
$$(q \times q_0)(i) = \begin{cases} q(i) & \text{if } i \in \Gamma, \\ q_0(i) & \text{if } i \notin \Gamma. \end{cases}$$
In the Ising model the unique internal equilibrium state with boundary condition qo (see Definition 7.2.3) is given by
la-jl=l
Here $\beta$ is a positive real parameter, the inverse temperature, and $h$ is a real parameter representing the external field. We thus see that the Ising model on the hyperfinite lattice $\Gamma$ is specified by two parameters, $\beta$ and $h$. We write $\rho_{q_0,\beta,h}$ to indicate the dependence of $\rho_{q_0}$ on the parameters $\beta$ and $h$. We use $\tilde\rho_{q_0,\beta,h}$ to denote the corresponding state of the classical system. We may now prove the following.
(i) If $h \neq 0$, i.e., if there is an external field, then all the states $\tilde\rho_{q_0,\beta,h}$ coincide; that is, we have but one equilibrium state of the classical system. This means that there is only one pure phase of the system, and there is no possibility for a phase transition.
(ii) If there is no external field, i.e., if $h = 0$, then for sufficiently small $\beta$ all states $\tilde\rho_{q_0,\beta,0}$ coincide, but for sufficiently large $\beta$ there is more than one extremal state; i.e., phase transition does occur.
We shall indicate part of the proof for (i). Let $E^q_{\beta,h}$ denote the expected value with respect to the measure $\rho_{q,\beta,h}$ on $\Omega_\Gamma$. There are two boundary conditions that we single out for special attention, namely $q = +1$ and $q = -1$; we write $\rho^\pm_{\beta,h}$ for the associated measures and use a corresponding notation for the expectations. In the same way we write the partition function as $Z(q, \beta, h)$.
It is customary to regard the set $\Omega_\Gamma$ as a lattice, introducing the pointwise ordering relation $q \le q'$ iff $q(i) \le q'(i)$ for all $i \in \Gamma$. One may show [see Helms and Loeb (1979)] that if $f$ is increasing on $\Omega_\Gamma$ and $-1 \le q_1 \le q_2 \le +1$, then
$$E^-_{\beta,h}[f] \le E^{q_1}_{\beta,h}[f] \le E^{q_2}_{\beta,h}[f] \le E^+_{\beta,h}[f]. \tag{23}$$
We also note for further use the following lemma [see Helms and Loeb (1979), Lemma 71: 7.2.15. LEMMA.
If /3 and h are standard real numbers, then r
for all q
1
E
It is the behavior of the internal pressure ${}_qP(\beta, h)$ which determines whether or not we have a phase transition. Let us slightly adapt our notation and write
$$P_\pm(\beta, h) = |\Gamma|^{-1} \ln Z(\pm, \beta, h). \tag{24}$$
A simple estimate shows that $P_+(\beta, h) \approx P_-(\beta, h)$ for each $h \in {}^*\mathbb{R}$. It is also straightforward to show that
(25)
where we use Lemma 7.2.15 to obtain the last equality. It is now a basic fact that if $h \neq 0$ and $\beta$ is kept fixed, then $P_\pm$ are both S-differentiable; see Helms and Loeb (1979), who closely follow the standard expositions of Lebowitz and Martin-Löf (1972) and Ruelle (1983). This means that the standard function
$$G(h) = \mathrm{st}(P_\pm(h)) \tag{26}$$
is well defined; it will be convex and differentiable at any standard $h \neq 0$, and furthermore
$$G'(h) \approx P_\pm'(h)$$
for any standard $h \neq 0$. We may then call upon (25) to conclude that $E^+_{\beta,h}[q(0)] \approx E^-_{\beta,h}[q(0)]$ for all $h \neq 0$. By translation invariance we may conclude that $E^+_{\beta,h}[q(i)] \approx E^-_{\beta,h}[q(i)]$ for all standard $i \in \Gamma$. Using (23), we see that
$$E^{q_0}_{\beta,h}[q(i)] \approx E^+_{\beta,h}[q(i)] \approx E^-_{\beta,h}[q(i)]$$
for all external configurations $q_0$. From this we may conclude that $L(\rho_{q_0,\beta,h}) = L(\rho_{+,\beta,h}) = L(\rho_{-,\beta,h})$ for all external boundary conditions $q_0$ on $\partial\Gamma$ and all $h \neq 0$. Hence all states $\tilde\rho_{q_0,\beta,h}$ coincide and we have no phase transition.
If there is no external field, i.e., if $h = 0$, then the above argument collapses at one point. The functions $P_\pm$ will, since the lattice $\Gamma$ is hyperfinite, be *-differentiable, but now they turn out not to be S-differentiable for large enough $\beta$. We may as in (26) introduce a well-defined function $G = \mathrm{st}(P_\pm)$, but the non-S-differentiability of $P_\pm$ means that $G$ is not differentiable for $h = 0$ and sufficiently large $\beta$. And this suffices, using the standard arguments, to show that for large enough $\beta$, $\tilde\rho_{+,\beta,0} \neq \tilde\rho_{-,\beta,0}$. Since one may show for sufficiently small standard $\beta$ that all states $\tilde\rho_{q,\beta,0}$ coincide, we do have a phase transition as $\beta$ varies.
Our aim in this section has been to review a small part of classical equilibrium theory from a hyperfinite point of view, in particular to point out how a hyperfinite lattice gives a very natural model for the thermodynamic limit, a model which preserves faithfully all the explicit algebra and combinatorics of the finite models.
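The role of the boundary condition can be explored on a very small finite lattice, keeping in mind that no finite system exhibits a phase transition. The sketch below is an illustration under explicit assumptions: a 3 x 3 block of sites, a ferromagnetic nearest-neighbour weight of the form $\exp(\beta(\sum q_i q_j + h \sum q_i))$ (this sign convention is assumed here, not taken from the text), and hypothetical helper names. It computes the expected spin at the centre site under the all-plus, all-minus, and an alternating boundary condition.

```python
import itertools
import math

SIDE = 3
SITES = [(i, j) for i in range(SIDE) for j in range(SIDE)]
STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))
CENTER = (1, 1)


def log_weight(q, boundary, beta, h):
    """beta * (sum over bonds of spin products + h * sum of spins).

    Bonds between two inside sites are visited twice, hence the factor 0.5;
    bonds to outside sites use the prescribed boundary spin.
    """
    total = 0.0
    for (i, j) in SITES:
        spin = q[(i, j)]
        total += h * spin
        for di, dj in STEPS:
            neighbour = (i + di, j + dj)
            if neighbour in q:
                total += 0.5 * spin * q[neighbour]
            else:
                total += spin * boundary(neighbour)
    return beta * total


def center_mean(boundary, beta, h):
    """Expected value of the centre spin under the finite Gibbs measure."""
    numerator = denominator = 0.0
    for spins in itertools.product((-1, 1), repeat=len(SITES)):
        q = dict(zip(SITES, spins))
        w = math.exp(log_weight(q, boundary, beta, h))
        numerator += q[CENTER] * w
        denominator += w
    return numerator / denominator


def plus(site):
    return 1


def minus(site):
    return -1


def alternating(site):
    return 1 if (site[0] + site[1]) % 2 == 0 else -1


# Gap between the two extreme boundary conditions at zero field.
gap_high_temperature = center_mean(plus, 0.1, 0.0) - center_mean(minus, 0.1, 0.0)
gap_low_temperature = center_mean(plus, 1.0, 0.0) - center_mean(minus, 1.0, 0.0)
mixed = center_mean(alternating, 1.0, 0.0)
```

On a finite block the gap between the plus and minus expectations is always positive; what changes in the hyperfinite model is whether this gap at a standard site is infinitesimal (small $\beta$) or not (large $\beta$, $h = 0$). The value for the alternating boundary lies between the two extremes, in line with the monotonicity property (23).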
7.3. THE GLOBAL MARKOV PROPERTY
In this section we shall use the hyperfinite model $\Omega_\Gamma$ to discuss the global Markov property for the classical limit system $\Omega$. We recall that the importance of the global Markov property lies in the fact that it allows us to introduce a certain semigroup which yields a rather complete probabilistic description of the system; see, e.g., Albeverio and Høegh-Krohn (1984a,b) and Albeverio et al. (1981). We also remind the reader of the importance of the global Markov property in the case of Euclidean field theory, where the infinitesimal generator obtained from the semigroup gives us the Hamiltonian of the associated Gårding-Wightman theory; see, e.g., Nelson (1973), Simon (1974), and Albeverio and Høegh-Krohn (1984b).
7.3.1. REMARK. The reader should note that the semigroup discussed in connection with the global Markov property is not the same as the "external" semigroup introduced in Section 7.1 to describe the evolution of a stochastic lattice system.
In rough terms we can describe the Markov property in the following way. Let $\Phi$ be a finite-range interaction and let $\mathbb{Z}^d$ be written as a disjoint union
$$\mathbb{Z}^d = \Lambda_1 \cup C \cup \Lambda_2,$$
where $\Lambda_1$ and $\Lambda_2$ are subsets of $\mathbb{Z}^d$ with distance larger than the range of the interaction $\Phi$; i.e., $C$ "insulates" $\Lambda_1$ from $\Lambda_2$ with respect to the interaction $\Phi$. Let $\rho$ be an equilibrium state for the infinite system and let $E_\rho(\cdot)$ denote the expectation with respect to $\rho$. We let $E_\rho(\cdot \mid C)$ denote the conditional expectation with respect to the $\sigma$-algebra $\mathcal{B}_C$ generated by the projection maps $\pi_i(q) = q(i)$, $i \in C$. If for any finite $\Lambda_1$ one has that
$$E_\rho(f_1 f_2 \mid C) = E_\rho(f_1 \mid C)\,E_\rho(f_2 \mid C), \tag{1}$$
where $f_i$, $i = 1, 2$, are bounded $\mathcal{B}_{\Lambda_i}$-measurable functions, then the Gibbs state $\rho$ is said to have the local Markov property. We say that $\rho$ has the global Markov property if (1) holds for all $\Lambda_1$, $\Lambda_2$.
In the final part of this section we shall extend some of the above discussion to the case of unbounded fiber with $\mathbb{R}$ replacing $\{-1,+1\}$ in the definition of the configuration space.
A. Hyperfinite Markov Property
We shall start out by discussing the Markov property on a hyperfinite lattice r. The global Markov property for the classical system is intimately tied up with the “behavior at infinity” of the system. We believe that the hyperfinite approach, which allows us to see what happens at infinity, offers an interesting alternative to the standard limit approach. For simplicity we shall restrict attention to a nearest-neighbor interaction @. There are no difficulties in extending the results to an arbitrary finite-range interaction. Let r be a hyperfinite lattice and let pqo be a pure internal equilibrium state on with respect to the nearest-neighbor interaction *@, see Definition 7.2.3; here qo is a boundary or external condition on r‘ = * E d - r fixing the state. Let r be written as a disjoint union r = A, u C u A2, where C is a “curve” dividing A, and A,; i.e., C is the common boundary of At and A2 inside r. Let f,be an internal function on 0, which is independent of A2; i.e., if q and q‘ differ only on sites in A2, then f , ( q ) = f,(q’). Let f2 in a similar way be independent of A , . We let E ( .) denote the (hyperfinite) expectation with respect to the internal measure pso and let E ( C) denote the corresponding conditional expectation. We want to prove the following hyperfinite Markov property:
7.3.2. PROPOSITION. $E(f_1 f_2 \mid C) = E(f_1 \mid C)\, E(f_2 \mid C)$.
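We verify the proposition by a simple hyperfinite calculation. We start out by writing down an explicit formula for the conditional expectation. Let $q_C$ be a configuration on the dividing curve $C$, i.e., $q_C \in \Omega_C$. Let

$$q_C \times \Omega_{\Gamma - C}$$

denote the set of all configurations $q \in \Omega_\Gamma$ which agree with $q_C$ on $C$. We can write

$$\Omega_\Gamma = \bigcup_{q_C \in \Omega_C} q_C \times \Omega_{\Gamma - C},$$

where the union is disjoint. The conditional expectation can now be written

$$E(f \mid C) = \sum_{q_C \in \Omega_C} \Biggl( \frac{1}{\mu_{q_0}(q_C \times \Omega_{\Gamma - C})} \sum_{q \in q_C \times \Omega_{\Gamma - C}} f(q)\, \mu_{q_0}(q) \Biggr) \chi_{q_C \times \Omega_{\Gamma - C}}, \tag{3}$$

where $\chi_{q_C \times \Omega_{\Gamma - C}}$ is the characteristic function of the set $q_C \times \Omega_{\Gamma - C}$. Note that $E(f \mid C)$ as a function of $q \in \Omega_\Gamma$ only depends upon the behavior of $q$ on $C$; thus for any $q_C \in \Omega_C$ the expression $E(f \mid C)(q_C)$ is well defined, and we see that

$$E(f \mid C)(q_C) = \sum_{q \in q_C \times \Omega_{\Gamma - C}} f(q)\, \mu_{q_0}(q \mid q_C) = E_{q_C}(f), \tag{4}$$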
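where $\mu_{q_0}(q \mid q_C)$ is the conditional probability measure derived from $\mu_{q_0}$, and $E_{q_C}$ denotes expectation with respect to this measure. Any element $q \in \Omega_\Gamma$ can be written as $q = q_1 \times q_2 \times q_C$, where $q_1 \in \Omega_{\Lambda_1}$, $q_2 \in \Omega_{\Lambda_2}$, and $q_C \in \Omega_C$. The measure $\mu_{q_0}(\cdot \mid q_C)$ induces measures on $\Omega_{\Lambda_1}$ and $\Omega_{\Lambda_2}$ via the formulas

$$\mu_{\Lambda_1}(q_1) = \mu_{q_0}(q_1 \times \Omega_{\Lambda_2} \times q_C \mid q_C), \qquad \mu_{\Lambda_2}(q_2) = \mu_{q_0}(\Omega_{\Lambda_1} \times q_2 \times q_C \mid q_C), \tag{5}$$

where, e.g., $q_1 \times \Omega_{\Lambda_2} \times q_C$ is the set of all configurations $q$ in $\Omega_\Gamma$ such that $q \restriction (\Lambda_1 \cup C) = q_1 \times q_C$. An explicit calculation gives the following splitting property:

$$\mu_{q_0}(q_1 \times q_2 \times q_C \mid q_C) = \mu_{\Lambda_1}(q_1)\, \mu_{\Lambda_2}(q_2). \tag{6}$$

In order to prove the proposition it suffices by (4) to show that $E_{q_C}(f_1 f_2) = E_{q_C}(f_1)\, E_{q_C}(f_2)$ for all $q_C \in \Omega_C$. But this is a straightforward calculation using (6):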
$$\begin{aligned}
E_{q_C}(f_1 f_2) &= \sum_{q \in q_C \times \Omega_{\Gamma - C}} f_1(q)\, f_2(q)\, \mu_{q_0}(q \mid q_C) \\
&= \sum_{q_1 \in \Omega_{\Lambda_1}} \sum_{q_2 \in \Omega_{\Lambda_2}} f_1(q_1)\, f_2(q_2)\, \mu_{\Lambda_1}(q_1)\, \mu_{\Lambda_2}(q_2) \\
&= E_{q_C}(f_1)\, E_{q_C}(f_2).
\end{aligned}$$

This completes the proof of Proposition 7.3.2.

REMARK. The local Markov property for the limit space $\Omega$ is an immediate corollary of the hyperfinite Markov property 7.3.2. In the next section we shall discuss what the hyperfinite version has to say about the global Markov property.
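The splitting in Proposition 7.3.2 can be checked numerically on a small finite lattice. The following is only a sketch under stated assumptions: the lattice is a one-dimensional chain of five sites with a nearest-neighbor Ising interaction, a single site plays the role of the curve $C$, and the function names (`gibbs_chain`, `cond_exp`, `markov_gap`) and the parameter `beta` are hypothetical choices made for this illustration, not notation from the text.

```python
import itertools
import math


def gibbs_chain(n, beta, left, right):
    """Gibbs measure on {-1,+1}^n for a nearest-neighbor chain.

    `left` and `right` are fixed boundary spins (the analogue of q_0).
    The energy -sum s_k s_{k+1} is an assumed Ising-type interaction.
    """
    weights = {}
    for q in itertools.product((-1, 1), repeat=n):
        spins = (left,) + q + (right,)
        energy = -sum(spins[k] * spins[k + 1] for k in range(n + 1))
        weights[q] = math.exp(-beta * energy)
    z = sum(weights.values())
    return {q: w / z for q, w in weights.items()}


def cond_exp(mu, f, c, value):
    """E(f | C)(q_C) for the one-site curve C = {c}, computed as in (4)."""
    num = sum(f(q) * p for q, p in mu.items() if q[c] == value)
    den = sum(p for q, p in mu.items() if q[c] == value)
    return num / den


def markov_gap(mu, f1, f2, c, value):
    """|E(f1 f2 | C) - E(f1 | C) E(f2 | C)| at the configuration q_C = value."""
    lhs = cond_exp(mu, lambda q: f1(q) * f2(q), c, value)
    rhs = cond_exp(mu, f1, c, value) * cond_exp(mu, f2, c, value)
    return abs(lhs - rhs)


# Sites 0,1 lie on one side of C = {2}; sites 3,4 lie on the other side.
mu = gibbs_chain(5, beta=0.7, left=1, right=-1)
f1 = lambda q: q[0] * q[1] + q[1]       # independent of sites 3, 4
f2 = lambda q: q[3] - 2 * q[3] * q[4]   # independent of sites 0, 1
gaps = [markov_gap(mu, f1, f2, 2, v) for v in (-1, 1)]
```

Both entries of `gaps` vanish up to rounding error, whatever boundary spins are chosen, which is the finite counterpart of the fact that the proposition holds for every external condition $q_0$.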
B. Lifting and the Global Markov Property
We shall now explain how this simple hyperfinite calculation can be used to discuss the global Markov property on $\Omega$. We are now given a nearest-neighbor interaction $\Phi$ on $\Omega$. By transfer we have a nearest-neighbor interaction ${}^*\Phi$ on $\Omega_\Gamma$, where $\Gamma = \Gamma_\kappa = \{x \in {}^*\mathbb{Z}^d \mid |x_i| < \kappa,\ i = 1, \ldots, d\}$ for some $\kappa \in {}^*\mathbb{N} - \mathbb{N}$. Let $\mu_{q_0}$ be an internal state for $\Omega_\Gamma$, where $q_0 \in \Omega_{\Gamma^c}$. From $\mu_{q_0}$ we derive via the Loeb construction a state $L(\mu_{q_0})$. As explained in Section 7.2, $L(\mu_{q_0})$ induces a measure $\tilde\mu_{q_0}$ on $\Omega$ by setting

$$\tilde\mu_{q_0}(B) = L(\mu_{q_0})(\mathrm{st}_\Gamma^{-1} B), \tag{7}$$

for Borel sets $B$ in $\Omega$. $\tilde\mu_{q_0}$ is an equilibrium state of the classical system $\Omega$.
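Write $\mathbb{Z}^d$ as a disjoint union $\mathbb{Z}^d = \Lambda_1 \cup C \cup \Lambda_2$, where $C$ is the common boundary of $\Lambda_1$ and $\Lambda_2$. $C$ has an extension ${}^*C_\Gamma = {}^*C \cap \Gamma$ that splits $\Gamma$ in the form $\Gamma = \Lambda_1' \cup {}^*C_\Gamma \cup \Lambda_2'$ for suitable $\Lambda_1'$, $\Lambda_2'$, where there is no interaction between $\Lambda_1'$ and $\Lambda_2'$ with respect to ${}^*\Phi$. Let $f \in C(\Omega)$ and let ${}^*f_\Gamma$ be the extension to $\Gamma$ given by ${}^*f_\Gamma(q) = {}^*f(q \times q_0)$, $q \in \Omega_\Gamma$. As above let $E(\cdot \mid {}^*C_\Gamma)$ denote the internal hyperfinite conditional expectation with respect to the measure $\mu_{q_0}$ and the internal algebra $\mathcal{A}_{{}^*C_\Gamma}$ generated by the disjoint family of sets $q_{C_\Gamma} \times \Omega_{\Gamma - {}^*C_\Gamma}$, $q_{C_\Gamma} \in \Omega_{{}^*C_\Gamma}$. We shall let $E^{(L)}(\cdot \mid {}^*C_\Gamma)$ denote the conditional expectation with respect to the measure $L(\mu_{q_0})$ and the Loeb algebra $L(\mathcal{A}_{{}^*C_\Gamma})$. From Proposition 3.2.12 we know that $E({}^*f_\Gamma \mid {}^*C_\Gamma)$ is a lifting of $E^{(L)}({}^\circ({}^*f_\Gamma) \mid {}^*C_\Gamma)$, i.e.,

$${}^\circ E({}^*f_\Gamma \mid {}^*C_\Gamma) = E^{(L)}({}^\circ({}^*f_\Gamma) \mid {}^*C_\Gamma) \tag{8}$$

for a set of $L(\mu_{q_0})$ measure one. We have thus the picture shown in Fig. 7.2. Let $f$ and $g$ be tame functions with supports in $D_1$ and $D_2$, respectively. From Section 7.3.A we know that we have the following splitting: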
$$E({}^*f_\Gamma \cdot {}^*g_\Gamma \mid {}^*C_\Gamma) = E({}^*f_\Gamma \mid {}^*C_\Gamma) \cdot E({}^*g_\Gamma \mid {}^*C_\Gamma),$$

which by (8) implies a "global" Markov property on $\Gamma$ with respect to the curve ${}^*C_\Gamma$. But we are interested in the global Markov property on $\mathbb{Z}^d$ and it is not at all clear how the results on $\Gamma$ can be "pushed down" to $\mathbb{Z}^d$. If the interaction is strong enough there could be an influence between $D_1$ and $D_2$ through the infinite part. In technical terms, it is not at all clear that the conditional expectation $E^{(L)}(\cdot \mid {}^*C_\Gamma)$ on $\Omega_\Gamma$ is "essentially the same" as the conditional expectation $E(\cdot \mid C)$ on $\Omega$ with respect to the measure $\tilde\mu_{q_0}$ and the algebra $\mathcal{B}_C$. There are various ways of controlling the behavior "at infinity." We start by discussing the notion of lifting:
7.3.3. DEFINITION. The internal conditional expectation $E(\cdot \mid {}^*C_\Gamma)$ is called a $C$-lifting if there exists a set $N \subset \Omega_\Gamma$ which is $L(\mathcal{A}_{{}^*C_\Gamma})$-measurable and with $L(\mu_{q_0})(N) = 0$ such that if $q_\Gamma, q'_\Gamma \notin N$ and $\mathrm{st}_\Gamma q_\Gamma = \mathrm{st}_\Gamma q'_\Gamma$, then

$$E({}^*f_\Gamma \mid {}^*C_\Gamma)(q_\Gamma) \approx E({}^*f_\Gamma \mid {}^*C_\Gamma)(q'_\Gamma) \tag{9}$$

for all tame functions $f$ on $\Omega$.

[Figure 7.2: labels $\Gamma$ and "Infinite Part".]

We have to show that if $E(\cdot \mid {}^*C_\Gamma)$ is a $C$-lifting in the sense of Definition 7.3.3, then it is a lifting of the conditional expectation $E(\cdot \mid C)$ on $\Omega$. For $q \in \Omega$, let $\mu(q) = \{q_\Gamma \in \Omega_\Gamma \mid \mathrm{st}_\Gamma q_\Gamma = q\}$ be the monad of $q$. Let $h(q_\Gamma)$ be the function
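$$h(q_\Gamma) = E({}^*f_\Gamma \mid {}^*C_\Gamma)(q_\Gamma), \tag{10}$$

where $f \in C(\Omega)$ is tame. As remarked above, it follows from 3.2.12 that $h(q_\Gamma) \approx E^{(L)}({}^\circ({}^*f_\Gamma) \mid {}^*C_\Gamma)(q_\Gamma)$ for $L(\mu_{q_0})$-almost all $q_\Gamma \in \Omega_\Gamma$. We now assume that $E(\cdot \mid {}^*C_\Gamma)$ is a $C$-lifting in the sense of 7.3.3. We may then introduce a function $g$ on $\Omega$:

$$g(q) = {}^\circ h(q_\Gamma), \qquad q_\Gamma \in \mu(q) - N.$$

Because of (9) the value of $g(q)$ is independent of the choice of representative $q_\Gamma \in \mu(q) - N$.

7.3.4. PROPOSITION. The function $g$ is a version of the conditional expectation on $\Omega$ with respect to the measure $\tilde\mu_{q_0}$ and the algebra $\mathcal{B}_C$.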
We know that $E^{(L)}(\cdot \mid {}^*C_\Gamma)$ is measurable with respect to the Loeb algebra $L(\mathcal{A}_{{}^*C_\Gamma})$. As the reader may either directly verify or conclude via the general theory of Section 3.4, we have the following relationship between the $\sigma$-algebras $\mathcal{B}_C$ and $L(\mathcal{A}_{{}^*C_\Gamma})$:

(i) If $B \in \mathcal{B}_C$, then $\mathrm{st}_\Gamma^{-1}(B) \in L(\mathcal{A}_{{}^*C_\Gamma})$.
(ii) If $B \subseteq \Omega$ and $\mathrm{st}_\Gamma^{-1}(B) \in L(\mathcal{A}_{{}^*C_\Gamma})$, then $B \in \mathcal{B}_C$.
In order to show that $g$ is $\mathcal{B}_C$-measurable, it suffices to show that if $q_1, q_2 \in \Omega$ have the same restriction to the curve $C$, i.e., $q_1 \restriction C = q_2 \restriction C$, then $\mu(q_1) \subset N$ iff $\mu(q_2) \subset N$. The proof is simple. Let $\hat q_1 \in \mu(q_1) - N$; we must find some $\hat q_2 \in \mu(q_2) - N$. Since $\hat q_1 \in \mu(q_1)$ it follows from the internal definition principle that there is some hyperfinite $\Gamma_1 \subseteq \Gamma$ such that $\hat q_1 \restriction C_{\Gamma_1} = {}^*q_1 \restriction C_{\Gamma_1}$. But ${}^*q_1$ and ${}^*q_2$ are equal on $C_{\Gamma_1}$; hence by setting $\hat q_2$ equal to ${}^*q_2$ on $C_{\Gamma_1}$ and equal to $\hat q_1$ otherwise, we have constructed an element $\hat q_2 \in \mu(q_2) - N$. The rest goes by symmetry. For the final part of the proof of 7.3.4, let $B$ be any tame set in $\mathcal{B}_C$. Then
Thus we may take $g$ as a version of the conditional expectation $E(\cdot \mid C)$. Combining Propositions 7.3.2 and 7.3.4, we obtain a proof of the following theorem.

7.3.5. THEOREM. $\tilde\mu_{q_0}$ has the global Markov property with respect to $C$ if $E(\cdot \mid {}^*C_\Gamma)$ is a $C$-lifting.
This theorem was proved by Kessler (1984), generalizing a previous result, Theorem 7.3.7, where the lifting condition is replaced by the stronger condition of S-continuity. We will discuss this in the next subsection. We shall say that $\tilde\mu_{q_0}$ satisfies the lifting condition if $E(\cdot \mid {}^*C_\Gamma)$ is a $C$-lifting for all $C$. But before we continue let us remark that lifting is not a necessary condition for the global Markov property. Goldstein (1980) introduced a "condition C" which represents another way of controlling the behavior at infinity and which yields the global Markov property; see also Föllmer (1980). Kessler (1984) has shown that the lifting condition 7.3.3 and condition C are incomparable, i.e., neither implies the other.
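Kessler (1984) gave a hyperfinite version of condition C which we shall briefly discuss. Choose $\kappa, \lambda \in {}^*\mathbb{N} - \mathbb{N}$ such that $\kappa - \lambda \in {}^*\mathbb{N} - \mathbb{N}$ and consider the hyperfinite lattices $\Gamma = \Gamma_\kappa$ and $\Gamma' = \Gamma_\lambda$. We are in the "standard" situation; i.e., $\mathbb{Z}^d$ is written as a disjoint union $\mathbb{Z}^d = \Lambda_1 \cup C \cup \Lambda_2$, where $C$ is a "curve" separating $\Lambda_1$, $\Lambda_2$ with respect to the nearest-neighbor interaction $\Phi$. $\Omega_\Gamma$ is our basic configuration space and we let $\mu_{\Gamma, q_0}$ denote the internal equilibrium state on $\Omega_\Gamma$ determined by the boundary condition $q_0$. Let $(q_0, q')$ denote a configuration which is equal to $q'$ on ${}^*C_{\Gamma'} = {}^*C_\Gamma \cap \Gamma'$ and is equal to $q_0$ on $(\Gamma - \Gamma') \cap \Lambda_2'$, where, as above, $\Lambda_2'$ is the extension of $\Lambda_2$ to $\Gamma$. $\mu_{\Gamma' \cap \Lambda_2', (q_0, q')}$ is then the internal equilibrium measure on $\Omega_{\Gamma' \cap \Lambda_2'}$ with external condition $(q_0, q')$. We define:

where $B_2$ is $\mathcal{A}_{\Gamma' \cap \Lambda_2'}$-measurable and $B_1$ is $\mathcal{A}_{\Gamma - (\Gamma' \cap \Lambda_2')}$-measurable.

REMARK. Notice that if $B_1 = q'_{\Gamma - (\Gamma' \cap \Lambda_2')} \times \Omega_{\Gamma' \cap \Lambda_2'}$ and $B_2 = \Omega_{\Gamma - (\Gamma' \cap \Lambda_2')} \times q'_{\Gamma' \cap \Lambda_2'}$, then $B_1 \cap B_2$ is a singleton $q'$ and

$$\mu_{\Gamma, \Gamma', \Lambda_2', q_0}(q') = \mu_{\Gamma, q_0}\bigl(q'_{\Gamma - (\Gamma' \cap \Lambda_2')} \times \Omega_{\Gamma' \cap \Lambda_2'}\bigr) \times \mu_{\Gamma' \cap \Lambda_2', (q_0, q')}\bigl(q'_{\Gamma' \cap \Lambda_2'}\bigr). \tag{13}$$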
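Thus we see that $\mu_{\Gamma, \Gamma', \Lambda_2', q_0}$ is "close" to $\mu_{\Gamma, q_0}$, but represents a different way of controlling what happens at infinity. And if these measures do not differ "too much," they could have the same standard parts.

7.3.6. DEFINITION. Let $\Gamma = \Gamma_\kappa$ for some infinite $\kappa$ and let $\mu_{\Gamma, q_0}$ be an internal equilibrium measure on $\Omega_\Gamma$. $\tilde\mu_{\Gamma, q_0}$ satisfies condition C if there exists $\lambda \in {}^*\mathbb{N} - \mathbb{N}$ such that $\kappa - \lambda \in {}^*\mathbb{N} - \mathbb{N}$ and such that

$$\tilde\mu_{\Gamma, q_0} = \tilde\mu_{\Gamma, \Gamma', \Lambda_2', q_0},$$

where $\Gamma' = \Gamma_\lambda$.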
Goldstein showed that any measure $\tilde\mu_{\Gamma, q_0}$ satisfying condition C has the global Markov property. Kessler (1984, 1985) has constructed examples of measures of this form that do not satisfy the global Markov property and, hence, do not satisfy condition C. For related work see von Weizsäcker (1980) and Higuchi (1984). We shall give a brief indication why condition C implies the global Markov property. Even if we do not have a lifting in the sense of 7.3.3, we can show that for any tame $g$ there exist infinite $\eta$'s in ${}^*\mathbb{N}$, $\eta \le \kappa$, such that $E_{\mu_{\Gamma, q_0}}(g \mid {}^*C_{\Gamma_\eta})$ is a lifting (notice that the conditional expectation as a function depends only on sites in ${}^*C_{\Gamma_\eta} = {}^*C_\Gamma \cap \Gamma_\eta$), and that the set of such $\eta$'s is an initial segment.
Condition C can now be used (Kessler, 1984) to show that for any $\lambda_0 \in {}^*\mathbb{N} - \mathbb{N}$ there is some $\lambda \in {}^*\mathbb{N} - \mathbb{N}$, $\lambda \le \lambda_0$, such that we have the right kind of splitting

$$E_{\mu_{\Gamma, q_0}}(f_1 \cdot f_2 \mid {}^*C_{\Gamma_\lambda}) \approx E_{\mu_{\Gamma, q_0}}(f_1 \mid {}^*C_{\Gamma_\lambda}) \cdot E_{\mu_{\Gamma, q_0}}(f_2 \mid {}^*C_{\Gamma_\lambda}).$$

Putting these facts together and adding some measure theory yields the global Markov property; see Kessler (1984) for full details. And to obtain the right splitting from condition C we use the hyperfinite Markov property, 7.3.2, with an appropriate choice of the measures $\mu_{\Gamma, \Gamma_\delta, \Lambda_2', q_0}$, for some $\delta$.

C. S-Continuity, Dobrushin's Condition and the Global Markov Property
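We recall that the S-continuity of $E(\cdot \mid {}^*C_\Gamma)$ means that

$$E({}^*f_\Gamma \mid {}^*C_\Gamma)(q_\Gamma) \approx E({}^*f_\Gamma \mid {}^*C_\Gamma)(q'_\Gamma), \tag{14}$$

whenever $\mathrm{st}_\Gamma q_\Gamma = \mathrm{st}_\Gamma q'_\Gamma$. Thus S-continuity implies the lifting property and we have the following theorem.

7.3.7. THEOREM. $\tilde\mu_{q_0}$ has the global Markov property with respect to $C$ if $E(\cdot \mid {}^*C_\Gamma)$ is S-continuous.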
The converse is not true; the lifting property does not imply S-continuity; a counterexample can be found in Kessler (1984). The measures he constructs are not translation-invariant and it remains an interesting task to decide the relationship between lifting, condition C, and S-continuity in the translation-invariant case. Lifting and S-continuity are rather "abstract" properties of the measure $\mu_{q_0}$. We shall in this subsection give an example due to Dobrushin (1968a,b), where we infer the S-continuity from a condition imposed on the interaction $\Phi$. We continue the notation from the last subsection. Let $\Gamma' = \Gamma - {}^*C_\Gamma$ and fix a configuration $q_{{}^*C_\Gamma}$ on ${}^*C_\Gamma$. We shall consider internal equilibrium states $\mu'_{q_0}$ on $\Omega_{\Gamma'}$ with boundary conditions equal to $q_0$ on $\Gamma^c$ and $q_{{}^*C_\Gamma}$ on ${}^*C_\Gamma$. Any tame function $f \in C(\Omega)$ has an extension to an internal function on both $\Omega_\Gamma$ and $\Omega_{\Gamma'}$; denote these extensions by ${}^*f_\Gamma$ and ${}^*f_{\Gamma'}$, respectively. Note that for $q \in \Omega_{\Gamma'}$ we have ${}^*f_{\Gamma'}(q) = {}^*f_\Gamma(q \times q_{{}^*C_\Gamma}) = {}^*f(q \times q_{{}^*C_\Gamma} \times q_0)$. Observe further [see (4) above] that:

$$E_{\mu'_{q_0}}({}^*f_{\Gamma'}) = E({}^*f_\Gamma \mid {}^*C_\Gamma)(q_{{}^*C_\Gamma}). \tag{15}$$

We shall now describe a way to calculate the expected value $E_{\mu'_{q_0}}({}^*f_{\Gamma'})$ which, under suitable assumptions on the interaction potential $\Phi$, does not depend upon the behavior at infinity; thus we have the S-continuity of $E(\cdot \mid {}^*C_\Gamma)$.
7.3.8. REMARK. Uniqueness of Gibbs or equilibrium measures was first proved by Dobrushin (1968a,b). The connection between global Markov property and uniqueness has been remarked upon on several occasions [see, e.g., Albeverio et al. (1981) and Föllmer (1980)] and has been used to prove the global Markov property for lattice systems. The present exposition gives a somewhat different arrangement of the facts.
For each $i \in \mathbb{Z}^d - C$ and $q^* \in q_{{}^*C_\Gamma} \times \Omega_{\Gamma'}$, let $\mu_i(\cdot \mid q^*)$ be the equilibrium measure at site $i$ with boundary condition $q^*$. Further, let $t_i(f)(q^*)$ denote the expected value of $f$ with respect to the measure $\mu_i(\cdot \mid q^*)$. Fix an enumeration $i_1, i_2, \ldots$ of the sites in $\mathbb{Z}^d - C$. It is easy to see that for each $f \in C(\Omega_{\mathbb{Z}^d - C})$

$$Tf = \lim_{n \to \infty} t_{i_1} t_{i_2} \cdots t_{i_n} f$$

exists (and, indeed, will be independent of how the sites are enumerated). Consider the norm

$$\|f\| = \sum_{i \in \mathbb{Z}^d - C} \sup |f(q) - f(q')|,$$

where the sup is taken over all pairs $q, q' \in q_{{}^*C_\Gamma} \times \Omega_{\Gamma'}$ such that $q = q'$ off $i$, i.e., $q(j) = q'(j)$ for all $j \ne i$.

7.3.9. PROPOSITION. Let

$$\rho_{ij} = \tfrac{1}{2} \sup \{ \| \mu_i(\cdot \mid q) - \mu_i(\cdot \mid q') \|_{\mathrm{var}} \},$$

where the sup is taken over all pairs $q, q' \in q_{{}^*C_\Gamma} \times \Omega_{\Gamma'}$ such that $q = q'$ off $j$, and define

$$\alpha = \sup_i \sum_j \rho_{ij},$$

where $i, j \in \mathbb{Z}^d - C$. If $\alpha < 1$, then

$$\|Tf\| \le \alpha \|f\|.$$

The following simple observation is behind the inequality. Let $\mu_1$ and $\mu_2$ be probability measures on some common space and let $E_{\mu_1}$ and $E_{\mu_2}$ be the associated expectations; then

$$|E_{\mu_1}(f) - E_{\mu_2}(f)| \le \tfrac{1}{2} \|\mu_1 - \mu_2\|_{\mathrm{var}} \cdot (\sup f - \inf f).$$

The route from this inequality to the proposition can be found in Gross (1979). Iterating the $T$ transform shows that $\|T^n f\| \le \alpha^n \|f\|$. It is not too difficult [see Gross (1979)] to see that $T^n f$ will converge uniformly to some number $E(f)$.
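The elementary inequality behind Proposition 7.3.9 can be tested on finite probability spaces. The sketch below assumes that $\|\cdot\|_{\mathrm{var}}$ denotes the total variation norm $\sum_x |\mu_1(x) - \mu_2(x)|$ (so that it can be as large as 2); this normalization is an assumption made here because it is the one for which the factor $\tfrac12$ is sharp. The helper names are hypothetical.

```python
import random


def var_norm(mu1, mu2):
    """Total variation norm sum_x |mu1(x) - mu2(x)| (assumed convention)."""
    return sum(abs(a - b) for a, b in zip(mu1, mu2))


def expectation(mu, f):
    """Expectation of f (a list of values) under the probability vector mu."""
    return sum(p * v for p, v in zip(mu, f))


def random_prob(rng, n):
    """A random probability vector on n points."""
    w = [rng.random() for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]


def check_bound(trials=1000, n=6, seed=0):
    """Test |E1 f - E2 f| <= (1/2) ||mu1 - mu2||_var (sup f - inf f)."""
    rng = random.Random(seed)
    for _ in range(trials):
        mu1, mu2 = random_prob(rng, n), random_prob(rng, n)
        f = [rng.uniform(-3.0, 3.0) for _ in range(n)]
        lhs = abs(expectation(mu1, f) - expectation(mu2, f))
        rhs = 0.5 * var_norm(mu1, mu2) * (max(f) - min(f))
        if lhs > rhs + 1e-12:
            return False
    return True


# Two point masses at different points show that the bound is attained.
sharp_lhs = abs(expectation([1.0, 0.0], [0.0, 1.0]) - expectation([0.0, 1.0], [0.0, 1.0]))
sharp_rhs = 0.5 * var_norm([1.0, 0.0], [0.0, 1.0]) * (1.0 - 0.0)
```

The reason the bound holds is that subtracting the constant $(\sup f + \inf f)/2$ from $f$ does not change the difference of the two expectations, while it makes $|f|$ at most $(\sup f - \inf f)/2$.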
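We can now put the various bits together. Associated with $\mu'_{q_0}$ we have a measure $\tilde\mu'_{q_0}$ on $\Omega_{\mathbb{Z}^d - C}$ [see (7)] such that for any $f \in C(\Omega_{\mathbb{Z}^d - C})$,

$$E_{\mu'_{q_0}}({}^*f_{\Gamma'}) = E_{\tilde\mu'_{q_0}}(f). \tag{18}$$

Since $\mu'_{q_0}$ is an internal equilibrium measure it follows that for all $i \in \mathbb{Z}^d - C$

$$E_{\tilde\mu'_{q_0}}(t_i f) = E_{\tilde\mu'_{q_0}}(f). \tag{19}$$

From this we may conclude that

$$E_{\mu'_{q_0}}({}^*f_{\Gamma'}) = E_{\tilde\mu'_{q_0}}(T^n f), \tag{20}$$

and since $T^n f$ converges uniformly to the constant $E(f)$, it follows from (15) and (20) that

$$E({}^*f_\Gamma \mid {}^*C_\Gamma)(q_{{}^*C_\Gamma}) = E(f). \tag{21}$$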
Since $E(f)$ does not depend upon the infinite part of $q_{{}^*C_\Gamma}$, the S-continuity is proved under the assumption of Proposition 7.3.9.

7.3.10. REMARK. The condition $\alpha < 1$ is a condition on the interaction $\Phi$. Let us make this explicit. First recall that for each finite $X \subseteq \mathbb{Z}^d$, $\Phi(X) \in C(\Omega_X)$. Let us denote by $\|\Phi\|$ the norm

$$\|\Phi\| = \sup_i \sum_{X \ni i} (|X| - 1)\, \|\Phi(X)\|_\infty. \tag{22}$$

Next let us introduce a measure of the total strength of the interaction between sites $i$, $j$, $i \ne j$:

$$l(i, j) = \sup_q \Bigl| \sum_{X \ni i, j} \Phi(X)(q) \Bigr|. \tag{23}$$

Observe that

$$\sum_{j \ne i} l(i, j) \le \sum_{X \ni i} (|X| - 1)\, \|\Phi(X)\|_\infty.$$

The following inequality is simple but basic. Let $i \ne j$ and $q = q'$ off $j$; then

$$\| \mu_i(\cdot \mid q) - \mu_i(\cdot \mid q') \|_{\mathrm{var}} \le e^{4 l(i, j)} - 1.$$

From this it is not too difficult to conclude that

$$\rho_{ij} \le 2 e^{4\|\Phi\|}\, l(i, j).$$

Recalling the definition of $\alpha$ in Proposition 7.3.9, we see that

$$\alpha \le 2 e^{4\|\Phi\|}\, \|\Phi\|. \tag{24}$$

Note that if $\|\Phi\| < (2e)^{-1}$, then $\alpha < 1$ and the S-continuity of $E(\cdot \mid {}^*C_\Gamma)$ follows.
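The last step of Remark 7.3.10 is a one-line piece of arithmetic that can be checked directly. The sketch below assumes the bound in (24) has the form $2e^{4x}x$ with $x = \|\Phi\|$; the function name `alpha_bound` is a hypothetical label for that right-hand side.

```python
import math


def alpha_bound(phi_norm):
    """Right-hand side of (24) as read here: 2 * exp(4 * ||Phi||) * ||Phi||."""
    return 2.0 * math.exp(4.0 * phi_norm) * phi_norm


# The threshold (2e)^{-1} from the remark.
threshold = 1.0 / (2.0 * math.e)

# At the threshold the bound equals exp(2/e - 1), which is below 1.
value_at_threshold = alpha_bound(threshold)

# The bound is increasing in ||Phi||, so every smaller norm also gives alpha < 1.
samples = [alpha_bound(threshold * k / 100.0) for k in range(101)]
```

Since $2e^{4x}x$ is increasing in $x \ge 0$ and takes the value $e^{2/e - 1} < 1$ at $x = (2e)^{-1}$, the condition $\|\Phi\| < (2e)^{-1}$ is sufficient, though not sharp, for $\alpha < 1$.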
If the interaction $\Phi$ satisfies the inequality $\|\Phi\| < (2e)^{-1}$, then there is a unique equilibrium measure on the classical system $\Omega$. In this case the boundary condition $q_0$ in the measure $\mu_{q_0}$ on $\Omega_\Gamma$ has at most an infinitesimal action in the space $\Omega$; i.e., the measures $\tilde\mu_{q_0}$, $q_0 \in \Omega_{\Gamma^c}$, all coincide.

7.3.11. THEOREM. Let $\Phi$ be a finite-range interaction satisfying the inequality $\|\Phi\| < (2e)^{-1}$. Then the unique equilibrium measure on $\Omega$ satisfies the global Markov property.
The simple hyperfinite calculation of Proposition 7.3.2, the S-continuity which we have verified under the stated condition on the interaction $\Phi$, and Theorem 7.3.7 add up to a proof of the theorem.

7.3.12. REMARK. The assumption in Theorem 7.3.11 is a special case of the "strong uniqueness condition" for the interaction $\Phi$. Let $\alpha$ be an arbitrary product measure on $\Omega$ and let $\alpha_i$, $i \in \mathbb{Z}^d$, be the $i$th component of $\alpha$; i.e., $\alpha_i$ is a measure on a fiber $\{-1, +1\}$. $\Phi$ is said to have the strong uniqueness property if for all product measures $\alpha$ there exists at most one equilibrium state for the interaction $\Phi + \Psi_\alpha$, where $\Psi_\alpha$ is the interaction associated with the measure $\alpha$. Strong uniqueness implies S-continuity; for details see Kessler (1984).
D. Maximal and Minimal Gibbs States
We shall describe another situation where we have the global Markov property. The configuration space $\Omega = \{-1, +1\}^{\mathbb{Z}^d}$ has a natural ordering

$$q \le q' \ \text{ iff } \ q(i) \le q'(i), \ \text{all } i \in \mathbb{Z}^d. \tag{25}$$

We let $K_1$ denote the set of bounded increasing measurable functions on $\Omega$. $K_1$ is a convex cone and determines the order on $\Omega$ in the sense that

$$q \le q' \ \text{ iff } \ F(q) \le F(q'), \ \text{all } F \in K_1. \tag{26}$$

We shall need the following (standard) lemma.

7.3.13. LEMMA. Let $\mu_1$ and $\mu_2$ be probability measures on $\Omega$ and assume that $\mu_1(F) = \mu_2(F)$ for all $F \in K_1$. Then $\mu_1 = \mu_2$.

The proof is a simple density argument. A dual order is defined on the set of probability measures on $\Omega$ by

$$\mu_1 \le \mu_2 \ \text{ iff } \ \mu_1(F) \le \mu_2(F), \ \text{all } F \in K_1. \tag{27}$$

This order is in a suitable sense a "linear extension" of the order defined in (25). If $\mu_1 \le \mu_2$ and $\mu_2 \le \mu_1$, then Lemma 7.3.13 tells us that $\mu_1 = \mu_2$. Let $\Gamma$ be a hyperfinite rectangle. The above definitions immediately carry over to the space $\Omega_\Gamma$ and to internal probability measures on $\Omega_\Gamma$.
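On a finite configuration space the dual order (27) can be decided by brute force. The sketch below relies on a standard finite fact, used here as an assumption of the illustration: every increasing function on a finite partially ordered set is a constant plus a nonnegative combination of indicators of increasing (upward closed) sets, so it is enough to compare the two measures on those sets. The function names are hypothetical.

```python
import itertools


def configs(n):
    """All configurations in {-1,+1}^n."""
    return list(itertools.product((-1, 1), repeat=n))


def leq(q, qp):
    """The coordinatewise order (25)."""
    return all(a <= b for a, b in zip(q, qp))


def up_sets(n):
    """All increasing subsets of {-1,+1}^n (brute force, so keep n small)."""
    pts = configs(n)
    result = []
    for bits in itertools.product((False, True), repeat=len(pts)):
        s = {p for p, b in zip(pts, bits) if b}
        if all(qp in s for q in s for qp in pts if leq(q, qp)):
            result.append(s)
    return result


def product_measure(n, p):
    """Independent spins, each equal to +1 with probability p."""
    mu = {}
    for q in configs(n):
        w = 1.0
        for s in q:
            w *= p if s == 1 else 1.0 - p
        mu[q] = w
    return mu


def dominated(mu1, mu2, n):
    """mu1 <= mu2 in the sense of (27), tested on indicators of up-sets."""
    return all(
        sum(mu1[q] for q in u) <= sum(mu2[q] for q in u) + 1e-12
        for u in up_sets(n)
    )


low, high = product_measure(3, 0.3), product_measure(3, 0.6)
low_below_high = dominated(low, high, 3)
high_below_low = dominated(high, low, 3)
```

Here `low_below_high` is true and `high_below_low` is false: the increasing set consisting of the single all-plus configuration already has larger mass under `high`, so the order is genuinely one-directional for these two measures.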
Let $\mu_1$ and $\mu_2$ be internal probability measures on $\Omega_\Gamma$, $L(\mu_1)$ and $L(\mu_2)$ the associated Loeb measures, and $\tilde\mu_1$ and $\tilde\mu_2$ the "pull-backs" to $\Omega$ defined by (7) above. The following lemma is an immediate consequence of the Loeb construction.

7.3.14. LEMMA. Let $\mu_1$ and $\mu_2$ be internal probability measures on $\Omega_\Gamma$ and assume that $\mu_1 \le \mu_2$. Then $\tilde\mu_1 \le \tilde\mu_2$ on $\Omega$.

Let $\Phi$ be a finite-range interaction on $\Omega$ and ${}^*\Phi$ its extension by transfer. Let $\Gamma$ be a hyperfinite rectangle and $\mu_{\Gamma, \rho}$ the corresponding internal equilibrium state with boundary condition $\rho$; see Definitions 7.2.3 and 7.2.10.

7.3.15. DEFINITION. The interaction $\Phi$ is called attractive if the map $\rho \mapsto \mu_{\Gamma, \rho}$ is increasing for all $\Gamma$.

We have chosen a rather abstract form of the notion of an attractive interaction in order to highlight the general ideas and to avoid getting involved in too detailed analytical considerations at this point. The class of attractive interactions includes many important examples; e.g., the interaction defining the Ising model [see (22) of Section 7.2] is attractive for $\beta \ge 0$ and $h$ arbitrary.

7.3.16. THEOREM. If $\Phi$ is an attractive interaction of finite range on $\Omega$, then there is a unique maximal Gibbs state $\tilde\mu_+$ and a unique minimal Gibbs state $\tilde\mu_-$ on $\Omega$.

With our choice of definitions the proof is simple. Choose $\Gamma$ hyperfinite and let $\mu_{\Gamma, +}$ be the internal equilibrium state on $\Omega_\Gamma$ with respect to the interaction ${}^*\Phi$ and with boundary condition $q \equiv +1$. Since $\Phi$ is attractive, $\mu_{\Gamma, +}$ is maximal on $\Omega_\Gamma$. We will show that $\tilde\mu_+ = \tilde\mu_{\Gamma, +}$ is the unique maximal Gibbs state on $\Omega$. Indeed, if $\mu$ is another Gibbs state on $\Omega$, then $\mu$ would be of the form $\tilde\mu_{\Gamma, \rho}$ for an internal state $\mu_{\Gamma, \rho}$ on $\Omega_\Gamma$. But on $\Omega_\Gamma$ we have that $\mu_{\Gamma, \rho} \le \mu_{\Gamma, +}$; thus by Lemma 7.3.14, $\mu \le \tilde\mu_+$ on $\Omega$. This shows that $\tilde\mu_+$ is maximal; uniqueness follows from Lemma 7.3.13; see the remark following (27). Choosing the boundary condition $q \equiv -1$ we obtain the minimal Gibbs state $\tilde\mu_-$. We now aim toward the following theorem; for a related discussion see Föllmer (1980).

7.3.17. THEOREM. Let $\Phi$ be an attractive interaction of finite range. Then the unique maximal and minimal Gibbs states $\tilde\mu_+$ and $\tilde\mu_-$ both have the global Markov property.
To prove the appropriate lifting property we need the following basic estimate due to Kessler (1984).
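7.3.18. PROPOSITION. Let $\Gamma$ and $\Gamma'$ be hyperfinite rectangles, $\Gamma \subseteq \Gamma'$, and let $f$ be a positive increasing tame function. Then

$$L(\mu_{\Gamma', +})\bigl\{ E_{\mu_{\Gamma', +}}(f \mid {}^*C_{\Gamma'})(q_{\Gamma'}) \ll E_{\mu_{\Gamma, +}}(f \mid {}^*C_\Gamma)(q_\Gamma) \bigr\} = 0.$$

REMARK. $a \ll b$ means that $a < b$ but $a \not\approx b$.

The proof is based on the following two facts. First, we have for finite rectangles $\Gamma \subseteq \Gamma'$, positive increasing functions $f$ with support in $\Gamma$, and configurations $q \in \Omega_{\mathbb{Z}^d}$

$$E_{\mu_{\Gamma, +}}(f \mid C_\Gamma)(q_\Gamma) \ge E_{\mu_{\Gamma', +}}(f \mid C_{\Gamma'})(q_{\Gamma'}) \ge 0. \tag{28}$$

The idea behind this inequality is that when we "project" down from $\Gamma'$ to $\Gamma$ we get something smaller than what we obtain in the attractive case from the maximal state. By transfer (28) remains true for hyperfinite rectangles and standard tame functions; in fact, even for internal functions. Next we observe that for tame positive increasing functions the non-negative sequence of the $E_{\mu_{\Gamma, +}}(f)$ decreases when the size of $\Gamma$ increases. Thus we have

$$E_{\mu_{\Gamma, +}}(f) \approx E_{\mu_{\Gamma', +}}(f) \tag{29}$$

for hyperfinite $\Gamma \subseteq \Gamma'$. Assume now that 7.3.18 is false. Then there are $\varepsilon, \delta \in \mathbb{R}_+$ such that $\mu_{\Gamma', +}(B_\varepsilon) \ge \delta$, where $B_\varepsilon = C_\varepsilon \times \Omega_{\Gamma' - {}^*C_{\Gamma'}}$ and where

$$C_\varepsilon = \bigl\{ (q_1 \times q_2) \in \Omega_{{}^*C_{\Gamma'}} \bigm| E_{\mu_{\Gamma', +}}(f \mid {}^*C_{\Gamma'})(q_1 \times q_2) \le E_{\mu_{\Gamma, +}}(f \mid {}^*C_\Gamma)(q_1) - \varepsilon \bigr\}.$$

REMARK. We have written a configuration $q_{\Gamma'}$ in $\Omega_{\Gamma'}$ as $q_1 \times q_2 \times q_3$, where $q_1 \in \Omega_{{}^*C_\Gamma}$, $q_2 \in \Omega_{{}^*C_{\Gamma'} - {}^*C_\Gamma}$, and $q_3 \in \Omega_{\Gamma' - {}^*C_{\Gamma'}}$.

We have the following expression for the expected value $E_{\mu_{\Gamma', +}}(f)$:

$$E_{\mu_{\Gamma', +}}(f) = \sum_{q_1 \times q_2 \in \Omega_{{}^*C_{\Gamma'}}} E_{\mu_{\Gamma', +}}(f \mid {}^*C_{\Gamma'})(q_1 \times q_2)\, \mu_{\Gamma', +}(q_1 \times q_2 \times \Omega_{\Gamma' - {}^*C_{\Gamma'}}).$$

The sum splits in two, $q_1 \times q_2 \in C_\varepsilon$ or $q_1 \times q_2 \notin C_\varepsilon$, and using (27) and the definition of $C_\varepsilon$ we obtain the following chain of inequalities:

$$\begin{aligned}
E_{\mu_{\Gamma', +}}(f) &\le \sum_{q_1 \times q_2 \in C_\varepsilon} \bigl( E_{\mu_{\Gamma, +}}(f \mid {}^*C_\Gamma)(q_1) - \varepsilon \bigr)\, \mu_{\Gamma', +}(q_1 \times q_2 \times \Omega_{\Gamma' - {}^*C_{\Gamma'}}) \\
&\quad + \sum_{q_1 \times q_2 \notin C_\varepsilon} E_{\mu_{\Gamma, +}}(f \mid {}^*C_\Gamma)(q_1)\, \mu_{\Gamma', +}(q_1 \times q_2 \times \Omega_{\Gamma' - {}^*C_{\Gamma'}}) \\
&\le \sum_{q_1 \times q_2 \in \Omega_{{}^*C_{\Gamma'}}} E_{\mu_{\Gamma, +}}(f \mid {}^*C_\Gamma)(q_1)\, \mu_{\Gamma', +}(q_1 \times q_2 \times \Omega_{\Gamma' - {}^*C_{\Gamma'}}) - \varepsilon \cdot \delta \\
&= E_{\mu_{\Gamma', +}}\bigl( E_{\mu_{\Gamma, +}}(f \mid {}^*C_\Gamma) \bigr) - \varepsilon \cdot \delta \\
&\le E_{\mu_{\Gamma, +}}\bigl( E_{\mu_{\Gamma, +}}(f \mid {}^*C_\Gamma) \bigr) - \varepsilon \cdot \delta \\
&= E_{\mu_{\Gamma, +}}(f) - \varepsilon \cdot \delta,
\end{aligned}$$

but this contradicts (29), and the proposition is proved.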
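From Proposition 7.3.18 we can now prove that the conditional expectation is a $C$-lifting in the sense of Definition 7.3.3. And from this we can, as in Section 7.3.B, complete the proof of Theorem 7.3.17. It remains to verify the $C$-lifting property. Let $\kappa \in {}^*\mathbb{N} - \mathbb{N}$ be the size of $\Gamma$, and for $\lambda \le \kappa$ let $\Gamma_\lambda$ be the rectangle of size $\lambda$. For $\lambda < \kappa$ and $\varepsilon \in \mathbb{R}_+$ we define

$$N_{\lambda, \varepsilon} = \bigl\{ q \bigm| E_{\mu_{\Gamma, +}}(f \mid {}^*C_\Gamma)(q_\Gamma) \le E_{\mu_{\Gamma_\lambda, +}}(f \mid {}^*C_{\Gamma_\lambda})(q_{\Gamma_\lambda}) - \varepsilon \bigr\}.$$

Since the sets are internal, Proposition 7.3.18 implies that the least $\lambda \in {}^*\mathbb{N}$ (call it $\lambda_\varepsilon$) such that $\mu_{\Gamma, +}(N_{\lambda, \varepsilon}) \le \varepsilon$ is actually in $\mathbb{N}$; we also see that as $\lambda$ increases the sequence $\mu_{\Gamma, +}(N_{\lambda, \varepsilon})$ decreases; finally, we observe that as $\varepsilon$ decreases, $\lambda_\varepsilon$ increases. We conclude from the Borel-Cantelli lemma that the set

$$N = \bigcup_{n \in \mathbb{N}} \ \bigcap_{\lambda \ge \lambda_{1/2^n}} N_{\lambda, 1/2^n} \in L(\mathcal{A}_{{}^*C_\Gamma})$$

has $L(\mu_{\Gamma, +})$ measure 0. Now let $q$, $q'$ be two configurations in $\Omega_\Gamma$ and assume that $q, q' \notin N$ but $\mathrm{st}_\Gamma q = \mathrm{st}_\Gamma q'$. From the internality of $q$, $q'$ and the fact that they have the same standard part it follows that there is some $\lambda \in {}^*\mathbb{N} - \mathbb{N}$ such that $q_{\Gamma_\lambda} = q'_{\Gamma_\lambda}$. We then have

$$\begin{aligned}
E_{\mu_{\Gamma, +}}(f \mid {}^*C_\Gamma)(q_\Gamma) &\approx E_{\mu_{\Gamma_\lambda, +}}(f \mid {}^*C_{\Gamma_\lambda})(q_{\Gamma_\lambda}) \\
&= E_{\mu_{\Gamma_\lambda, +}}(f \mid {}^*C_{\Gamma_\lambda})(q'_{\Gamma_\lambda}) \\
&\approx E_{\mu_{\Gamma, +}}(f \mid {}^*C_\Gamma)(q'_\Gamma).
\end{aligned}$$

Here the two $\approx$-assertions follow from the fact that $q, q' \notin N$. The proof of the lifting property is now complete. This proves Theorem 7.3.17.

Before turning to the case of continuous fiber we summarize the situation so far in the accompanying diagram. The arrows in the diagram have all been discussed above except the one leading from strong uniqueness to extremality. A pure state is extremal if it is not a convex combination of other pure states; strong uniqueness thus trivially implies extremality.

[Diagram: implications among the lifting condition, S-continuity, the global Markov property, condition C, maximal state, strong uniqueness, extremality, and the Dobrushin condition.]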
Kessler (1984, 1985) has shown that in the nontranslation-invariant case, the arrows cannot be reversed. Contrary to a conjecture of Goldstein (1980), he has also shown that extremality does not imply the global Markov property. It also follows from Kessler's work that lifting and condition C are incomparable; he also has an example showing that S-continuity does not imply condition C. Some version of the hyperfinite Markov property and some adaptation of the lifting idea are behind the arrows above. We feel that the hyperfinite picture is an intuitive and effective way of organizing the material.

E. The Case of Unbounded Fiber
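In the final part of this section we discuss the unbounded, continuous case. More precisely, this means that we shall replace the fiber $\{-1, +1\}$ by $\mathbb{R}$ and study configuration spaces of the form $\Omega_\Lambda = \mathbb{R}^\Lambda$, where $\Lambda \subseteq \mathbb{Z}^d$. This goes beyond the theory of Section 7.2 and we shall, therefore, first give a brief résumé of the basic standard theory. An interaction $\Phi$ is a map from finite nonempty subsets $X \subseteq \mathbb{Z}^d$ to real-valued continuous functions on $\Omega_X$. Let $\Lambda$ be a finite subset of $\mathbb{Z}^d$; we introduce the Gibbs or equilibrium measure $\mu_\Lambda$ on $\Omega_\Lambda$ by the definition

$$d\mu_\Lambda(q) = Z_\Lambda^{-1} e^{-H_\Lambda^\Phi(q)}\, dq, \tag{30}$$

where

$$H_\Lambda^\Phi(q) = \sum_{X \subseteq \Lambda} \Phi(X)(q) \tag{31}$$

and

$$Z_\Lambda = \int_{\Omega_\Lambda} e^{-H_\Lambda^\Phi(q)}\, dq. \tag{32}$$

In this section we shall restrict attention to interactions $\Phi$ of finite range, i.e., interactions $\Phi$ for which there exists a natural number $l$ such that $\Phi(X) = 0$ if the diameter of $X \subseteq \mathbb{Z}^d$ is larger than $l$. With this restriction the following construction is well defined. Let $\Lambda$ be a finite subset of $\mathbb{Z}^d$. An external configuration for $\Lambda$ is an element in $\Omega_{\Lambda^c}$. Let $q_0$ be an external configuration for $\Lambda$; we introduce the Gibbs or equilibrium measure with boundary condition $q_0$ on $\Omega_\Lambda$ by

$$d\mu_{\Lambda, q_0}(q) = Z_{\Lambda, q_0}^{-1} \exp\bigl(-H_{\Lambda, q_0}^\Phi(q)\bigr)\, dq, \tag{33}$$

where

$$H_{\Lambda, q_0}^\Phi(q) = \sum_{X \cap \Lambda \ne \emptyset} \Phi(X)(q \times q_0) \tag{34}$$

and $Z_{\Lambda, q_0}$ is defined in the same way as $Z_\Lambda$; see (30) above. We note that since $\Lambda$ is finite and $\Phi$ has finite range the sum in (34) exists.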
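Let $E_{\mu_{\Lambda'}}(\cdot \mid \Lambda^c)$ denote the conditional expectation with respect to the measure $\mu_{\Lambda'}$ and the $\sigma$-algebra $\mathcal{B}_{\Lambda^c}$. For a large class of interactions the limit

$$E^q_{\Lambda^c}(F) = \lim_{\Lambda' \uparrow \mathbb{Z}^d} E_{\mu_{\Lambda'}}(F \mid \Lambda^c)(q) \tag{35}$$

exists. When $\Phi$ is of finite range we have the following explicit formula:

$$E^q_{\Lambda^c}(F) = Z_{\Lambda, q}^{-1} \int F(p) \exp\bigl(-H_{\Lambda, q}^\Phi(p)\bigr)\, dp. \tag{36}$$

Let $\mu$ be a probability measure on $\Omega = \mathbb{R}^{\mathbb{Z}^d}$; $\mu$ is called a Gibbs measure or equilibrium measure on $\Omega$ if

$$E_\mu(F \mid \Lambda^c)(q) = E^q_{\Lambda^c}(F) \tag{37}$$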
for all finite $\Lambda \subseteq \mathbb{Z}^d$. This concludes our brief sketch of the standard theory.

For the hyperfinite version we start with a hyperfinite rectangle $\Gamma$ and fix a boundary condition $q_0 \in \Omega_{\Gamma^c}$. By transfer we have an internal measure $\mu_{\Gamma, q_0}$ corresponding to (33). If $\nu$ is a probability measure on $\Omega_{\Gamma^c}$, we can introduce a measure $\mu_{\Gamma, \nu}$ by

$$d\mu_{\Gamma, \nu}(q) = \int d\mu_{\Gamma, q_0}(q)\, d\nu(q_0). \tag{38}$$

This is an internal measure, but by adjoining the Loeb construction and the standard part map we may introduce a measure $\mu$ on $\Omega$ by the equation

$$\mu(B) = L(\mu_{\Gamma, \nu})(\mathrm{st}^{-1} B), \tag{39}$$

where $B$ is a Borel set in $\Omega$. By a straightforward adaptation of the theory in Section 7.2 (see also the extension theory of Sections 3.4 and 3.5) we see that the measure $\mu$ of (39) is a Gibbs measure on $\Omega$ and that every Gibbs measure can be represented in this form for suitable $\Gamma$ and $\nu$. There is no difficulty in adapting previous definitions to the present case. So assume that $\Phi$ is of finite range and attractive (see Definition 7.3.15). In the unbounded case this is not enough to ensure the existence of a maximal and a minimal Gibbs measure; we need to impose a growth condition on the interaction $\Phi$.

7.3.19. DEFINITION. A probability measure $\mu$ on $\Omega$ is called tempered if there are integers $k, N \in \mathbb{N}$ such that

$$\mu(|q_i|) \le k(1 + |i|)^N \tag{40}$$

for all $i \in \mathbb{Z}^d$. An interaction $\Phi$ is called tempered if for all finite $\Lambda \subseteq \mathbb{Z}^d$ the measure $\mu_{\Lambda, \nu}$ is tempered whenever $\nu$ is tempered. $\Phi$ is called uniformly tempered if for any tempered $\nu$ the family $\{\mu_{\Lambda, \nu} : |\Lambda| < \infty\}$ is uniformly tempered in the sense that there are $k, N \in \mathbb{N}$ such that

$$\mu_{\Lambda, \nu}(|q_i|) \le k(1 + |i|)^N \tag{41}$$

for all finite $\Lambda \subseteq \mathbb{Z}^d$ and all $i \in \mathbb{Z}^d$.
We shall not in this exposition prove general results about tempered interactions; see Bellissard and Høegh-Krohn (1982) for detailed information. As an important example we mention the nearest-neighbor interaction (42) (a sum over pairs with $|i - j| = 1$), which is uniformly tempered under rather weak conditions on the potential $V$. This interaction satisfies the following important inequality:

$$E^q_{\{i\}^c}(|q_i|) \le a + \sum_{j \in \mathbb{Z}^d} r_{ij} |q_j|, \tag{43}$$

where $a > 0$ and $\sum_j r_{ij} \le c_1 < 1$. [We can always assume that $r_{ii} = 0$, and in the present simple case, (42), we need only sum over the nearest neighbors of $i$.] To the proof we remark that (43) follows from (42) by standard techniques for the evaluation of the asymptotic behavior of Laplace transforms; see Bellissard and Høegh-Krohn (1982). From (43) we can derive the following crucial inequality.

7.3.20. LEMMA. Let $\Phi$ be a uniformly tempered interaction of finite range satisfying the inequality (43). Then there is a constant $c$ such that

$$E_\mu(|q_i|) < c \tag{44}$$

for all $i \in \mathbb{Z}^d$ and all tempered Gibbs measures $\mu$ on $\Omega$.
Using the fact that $\mu$ is a Gibbs measure and taking expectations on both sides of (43), we obtain

$$E_\mu(|q_i|) \le a + \sum_j r_{ij} E_\mu(|q_j|),$$

which implies that

$$\sum_j (\delta_{ij} - r_{ij})\, E_\mu(|q_j|) \le a. \tag{45}$$

Let $(1 - r)$ be the matrix with entries $(1 - r)_{ij} = \delta_{ij} - r_{ij}$ and let $X$ be the vector defined by $X_j = E_\mu(|q_j|)$; then (45) can be restated as the set of inequalities $((1 - r)X)_i \le a$. And since $(1 - r)^{-1}_{ij} = \delta_{ij} + r_{ij} + \sum_k r_{ik} r_{kj} + \cdots$ has only positive terms and $\sum_j (1 - r)^{-1}_{ij}$ by assumption converges and is bounded by some finite number independent of $i$, it follows that there is some constant $c$ such that $E_\mu(|q_i|) < c$ for all $i \in \mathbb{Z}^d$ and all tempered Gibbs measures $\mu$.

We are now in a position to construct a maximal tempered Gibbs measure on $\Omega$. Our starting point is a uniformly tempered interaction $\Phi$ satisfying Lemma 7.3.20. To control the growth we fix two integers $N_1, N_2 \in \mathbb{N}$ and choose a configuration $q^+ \in \Omega_{{}^*\mathbb{Z}^d}$ such that

$$|i|^{N_1} \le |q_i^+| < |i|^{N_2}, \quad \text{all } i \in {}^*\mathbb{Z}^d, \tag{46}$$

and such that

$$\sum_{i \in \mathbb{Z}^d} \frac{1}{q_i^+} < \infty. \tag{47}$$

The choice of $N_1$ will depend on the dimension $d$ of $\mathbb{Z}^d$. Let $\Gamma$ be a hyperfinite rectangle and consider the measure $\mu_+$ on $\Omega$ defined by

$$\mu_+ = L(\mu_{\Gamma, q^+}) \circ \mathrm{st}^{-1}. \tag{48}$$
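The first thing to note is that this is a tempered Gibbs measure on $\Omega$; see (46) above. We must show that it is maximal, i.e.,

$$E_\mu(F) \le E_{\mu_+}(F) \tag{49}$$

for all $F \in K_1$ [see (27)] and all tempered Gibbs measures $\mu$ on $\Omega$. Equation (49) also implies the uniqueness of $\mu_+$. We make the following preliminary calculation:

$$E_\mu(F) = E_\mu\bigl( E_\mu(F \mid \Lambda^c)(q) \bigr) \le \int_{B_{\Lambda, q^+}} E^q_{\Lambda^c}(F)\, d\mu(q) + \|F\|_\infty\, \mu(B^c_{\Lambda, q^+}), \tag{50}$$

where

$$B_{\Lambda, q^+} = \{ q \in \Omega \mid q_i \le q_i^+, \ \text{all } i \in \Lambda^c \} = \bigcap_{i \in \Lambda^c} \{ q \mid q_i \le q_i^+ \}. \tag{51}$$

From the bound in Lemma 7.3.20 we obtain

$$\mu(B^c_{\Lambda, q^+}) \le \sum_{i \in \Lambda^c} \mu(\{ q \mid q_i > q_i^+ \}) \le \sum_{i \in \Lambda^c} \frac{1}{q_i^+} E_\mu(|q_i|) \le c \sum_{i \in \Lambda^c} \frac{1}{q_i^+}. \tag{52}$$

It follows from (47) that $\mu(B^c_{\Lambda, q^+})$ can be made arbitrarily small by choosing $\Lambda$ large.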
Since $\Phi$ is attractive we have on $B_{\Lambda, q^+}$ that $E^q_{\Lambda^c}(F) \le E^{q^+}_{\Lambda^c}(F)$, hence

$$\int_{B_{\Lambda, q^+}} E^q_{\Lambda^c}(F)\, d\mu(q) \le E^{q^+}_{\Lambda^c}(F).$$

We conclude that given any $\varepsilon > 0$ there is a "large enough" finite $\Lambda \subseteq \mathbb{Z}^d$ such that

$$E_\mu(F) \le E^{q^+}_{\Lambda^c}(F) + \varepsilon. \tag{53}$$

Passing to the hyperfinite picture we see that $E^{q^+}_{\Gamma^c}(F)$ is the internal expectation with respect to the measure $\mu_{\Gamma, q^+}$. From (48) and standard Loeb theory we conclude that

$$E_\mu(F) \le E_{\mu_+}(F) \tag{54}$$

for all tempered Gibbs measures $\mu$.

7.3.21. THEOREM. Let $\Phi$ be an attractive and uniformly tempered interaction of finite range satisfying the inequality of (43); then there is a unique maximal Gibbs measure $\mu_+$ and a unique minimal Gibbs measure $\mu_-$ with respect to $\Phi$ on $\Omega$.
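The matrix step in the proof of Lemma 7.3.20 can be illustrated on a finite index set. The sketch below is an illustration under assumptions, not the setting of the lemma itself: the index set is a ring of five sites instead of $\mathbb{Z}^d$, the matrix $r$ couples each site to its two neighbors with weight 0.3 (so that the row sums equal $c_1 = 0.6 < 1$), and the function name `neumann_bound` is hypothetical. It sums the series $a(1 + r + r^2 + \cdots)$ applied to the constant vector, which is the uniform bound on $X$ obtained from $((1 - r)X)_i \le a$.

```python
def neumann_bound(r, a, terms=200):
    """Truncated Neumann series: sum_{k < terms} r^k applied to (a, ..., a)."""
    n = len(r)
    total = [0.0] * n
    vec = [a] * n
    for _ in range(terms):
        total = [t + v for t, v in zip(total, vec)]
        # Next term of the series: multiply the current vector by r.
        vec = [sum(r[i][j] * vec[j] for j in range(n)) for i in range(n)]
    return total


# Nearest-neighbor coupling on a ring of n sites, with r_ii = 0.
n, weight, a = 5, 0.3, 1.0
r = [[weight if (i - j) % n in (1, n - 1) else 0.0 for j in range(n)]
     for i in range(n)]
c1 = max(sum(row) for row in r)

bound = neumann_bound(r, a)

# A vector satisfying X_i <= a + sum_j r_ij X_j, compared with the bound.
X = [2.0] * n
satisfies_43 = all(
    X[i] <= a + sum(r[i][j] * X[j] for j in range(n)) for i in range(n)
)
below_bound = all(x <= b + 1e-9 for x, b in zip(X, bound))
```

With these numbers every entry of `bound` equals $a/(1 - c_1) = 2.5$ up to rounding, which shows the general pattern: positivity of the terms of the series together with row sums at most $c_1 < 1$ gives a constant $c$ that does not depend on the site $i$.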
We conclude our story here. The global Markov property of $\mu_+$ has been proved by Zegarlinski (1984), using condition C; we invite the reader to look at this proof from the hyperfinite point of view; see Definition 7.3.6.

7.3.22. REMARK. The discussion of lattice models with "unbounded spins" has been largely motivated by the study of the corresponding problems for continuous quantum fields. In fact, existence, uniqueness, and the global Markov property were first proved for quantum field models (with trigonometric interactions) by Albeverio and Høegh-Krohn (1979). The case of lattice models with unbounded spins was analyzed under the Dobrushin uniqueness condition by Bellissard and Picco (1979). This was extended to more general settings by Bellissard and Høegh-Krohn (1982) and Zegarlinski (1984). For some recent work in the continuum quantum field case see Gielerak (1983), Zegarlinski (1985), and Röckner (1985). We invite the reader to look at this work in the context of the hyperfinite models with infinitesimal spacing that we construct in the next section.
7.4. HYPERFINITE MODELS FOR QUANTUM FIELD THEORY
In this section we will present the Euclidean quantum field theory as a continuous spin system on a hyperfinite lattice with infinitesimal spacing. We shall use this formulation in the next section to discuss the $\Phi^4_d$ polymer representation.
A. The Program
We shall start by outlining the standard probabilistic construction of the free Euclidean field. This will be a special kind of Gaussian random field.
7.4.1. DEFINITION. Let $(Q, \mathcal{B}, \mu)$ be a probability space and $H$ a real Hilbert space. The (unique) Gaussian random field indexed by $H$ is the map $\Phi : H \to L^2(Q, d\mu)$ satisfying
(i) $\Phi$ is linear;
(ii) $\{\Phi(v) \mid v \in H\}$ is full, i.e., generates the measure algebra;
(iii) each $\Phi(v)$ is a Gaussian random variable; and
(1)
where $a = \int f^2\,d\mu$ is the variance of $f$. (Note that we only consider Gaussian random variables with mean 0.) Finally, note that $\langle\,\cdot\,,\cdot\,\rangle_H$ is the inner product in $H$ and that $\langle \Phi(v)\Phi(w)\rangle_{L^2} = \int \Phi(v)\Phi(w)\,d\mu$.
We fix some terminology. Given the Hilbert space $H$ we know that the Gaussian random field indexed by $H$ is unique up to measure isomorphisms. We therefore write $Q_H$ for the underlying measure space and $d\mu_{0,H}$ for the measure.
It is now possible to give a quick description of the free Euclidean field. First let $N_m$ be the Hilbert space of all real distributions $f \in S'(\mathbb{R}^d)$ whose Fourier transforms are functions with finite norm, assuming $m > 0$ if $d = 1, 2$ and $m \ge 0$ if $d \ge 3$,
In the true physical case $d$, which is the number of space-time dimensions, is equal to 4.
7.4.2. DEFINITION. The free Euclidean field of mass $m$ is the Gaussian random process $\Phi_0$ indexed by $N_m$.
Let $d\mu_0$ denote the measure $d\mu_{0,N_m}$. The Schwinger functions $S_n$ associated with the free Euclidean field $\Phi_0$ are given by
(3) $S_n(f_1, \dots, f_n) = \int \Phi_0(f_1)\cdots\Phi_0(f_n)\,d\mu_0$.
In the free case where only the mean (which we always take to be zero) and the covariance matter, we may restrict our interest to the free two-point functions $S_2$. We note that
(4) $S_2(f, g) = \tfrac{1}{2}\langle f, g\rangle_{N_m} = (f, (-\Delta + m^2)^{-1} g)_{L^2}$.
From the Euclidean point of view the Schwinger functions are nothing but the moments of a certain random field. It may therefore not be unreasonable if we briefly explain how these entities arise from the physical theory of quantum fields.
Classically a field $\Phi_F$ is a real-valued function. Thus in the quantum case the basic object should be an operator-valued function. But $\Phi_F(x)$, the field at the space-time point $x$, may be too singular an object; we are therefore led to the following point of view. A Hermitian (boson) scalar quantum field is an operator-valued distribution. The state space is some suitable separable Hilbert space $H$ with a distinguished state $\Omega$, the vacuum. The observables are the operators $\Phi_F(f)$, $f \in S(\mathbb{R}^d)$. Various properties of relativistic invariance, microscopic causality, and regularity of the field must be assumed. It will suffice, however, for our coarse sketch just to introduce the dramatis personae but to leave their canonical text, i.e., the Wightman-Gårding axioms, unspoken. We shall only focus for a moment on the following objects of the physical theory, the Wightman distributions or the vacuum expectation values:
The point is that from the Wightman distribution we can pass to the Schwinger functions in the following way. A point $(z_1, \dots, z_n) \in \mathbb{C}^{dn}$ is called Euclidean if each $z_j$ is of the form
$z_j = (i s_j, \mathbf{x}_j)$,
where $s_j$ and the $\mathbf{x}_j$'s are real (points with pure imaginary time). With $z_j$ we associate $y_j = (s_j, \mathbf{x}_j) \in \mathbb{R}^d$ and we parameterize the Euclidean points in $\mathbb{C}^{dn}$ by vectors $y = (y_1, \dots, y_n) \in \mathbb{R}^{dn}$. We denote by $\mathcal{E}_n$ the set of noncoincident Euclidean points, i.e., Euclidean points $z = (z_1, \dots, z_n)$ such that $y_i - y_j \ne 0$, $i \ne j$, for the associated $y$'s. A fundamental result of the Wightman-Gårding theory is that the Wightman functions can be extended to a domain in $\mathbb{C}^{dn}$ which includes the set $\mathcal{E}_n$; or, in somewhat more precise terms, the Wightman distribution $W_n$ is the boundary value of an analytic function, which we also denote by $W_n$, defined on a domain in $\mathbb{C}^{dn}$ which includes the set $\mathcal{E}_n$.
7.4.3. DEFINITION. The restriction of $W_n$ to $\mathcal{E}_n$ is called the $n$-point Schwinger function.
We shall write $S_n$ for the Schwinger functions viewed as functions of the associated $y$'s. The main point of this story is that one can write down a set of properties for the Schwinger functions which suffices to reconstruct the underlying physical field; i.e., given a set $\{S_n\}$ of functions satisfying these properties there exists an essentially unique Wightman-Gårding field $\Phi_F$ such that the Schwinger functions of this field are precisely the given
functions $S_n$. And the story is completed by noting that the Schwinger functions as introduced in (3) satisfy the assumptions of the reconstruction theorem; thus the free Euclidean field via the Schwinger functions (3) gives us a model of a (noninteracting) scalar quantum field.
This is in outline one version of the general story; see Simon (1974) on how to fill in the details. But the Euclidean field can also be viewed in the following way; see, e.g., Glimm and Jaffe (1981). It follows from the Minlos theorem that it is possible to represent the free measure $\mu_0$ as a probability measure on the distribution space $S'(\mathbb{R}^d)$; i.e., we can always take $Q_{N_m} = S'(\mathbb{R}^d)$. In this case the field as a map $\Phi_0 : N_m \to L^2(Q_{N_m})$ can be represented as
(7) $\Phi_0(f)(T) = T(f)$,
where $f \in S(\mathbb{R}^d)$ and $T \in S'(\mathbb{R}^d)$. And we may equally well read off the properties of the measure $\mu_0$ in the generating functional
as in the Schwinger or moment functions
We are led to the following general definition of a Euclidean random field.
7.4.4. DEFINITION. A Euclidean field theory is given by a probability measure $\mu$ on (the $\sigma$-algebra generated by the cylinder sets of) $S'(\mathbb{R}^d)$ whose generating functional
$S\{f\} = \int e^{iT(f)}\,d\mu(T)$
satisfies the following properties:
ANALYTICITY. The functional $S\{f\}$ is entire analytic.
REGULARITY. For some $p$, $1 \le p \le 2$, some constant $c$, and all $f \in S(\mathbb{R}^d)$,
$|S\{f\}| \le \exp c\,(\|f\|_1 + \|f\|_p^p)$.
INVARIANCE. $S\{f\}$ is invariant under Euclidean symmetries (translations, rotations, and reflections).
OSTERWALDER-SCHRADER REFLECTION POSITIVITY. Let $\mathbb{R}^d_+$ be the set of points $(t, x) \in \mathbb{R}^d$ such that $t > 0$ and $\mathcal{A}_+$ the set of functions of the form
$A(T) = \sum_{j=1}^{n} c_j \exp(iT(f_j))$,
where $c_j \in \mathbb{C}$, $f_j \in C_0^\infty(\mathbb{R}^d_+)$, and $T \in S'(\mathbb{R}^d)$; then $\langle \theta A, A\rangle \ge 0$, where $\langle\,\cdot\,,\cdot\,\rangle$ is the $L^2(d\mu)$ inner product, $A \in \mathcal{A}_+$, and $\theta$ is time reflection, $\theta : (t, x) \mapsto (-t, x)$.
ERGODICITY. For all $L^1(d\mu)$ functions $F$ one has
$\int F\,d\mu = \lim_{t\to\infty} \frac{1}{t}\int_0^t F(T_s)\,ds$,
where $T_s(s', x) = T(s' + s, x)$, for all $(s', x) \in \mathbb{R}^d$.
The list of properties may look formidable, but each translates back to some physically meaningful property for the Wightman-Gårding field $\Phi_F$. Euclidean invariance translates into relativistic invariance of $\Phi_F$; ergodicity ensures the uniqueness of the vacuum state $\Omega$; and the reflection positivity allows us to reconstruct a Hamiltonian of the physical field. Finally, some regularity must be imposed to control the physical theory, hence the first two properties of the definition.
The free Euclidean field as constructed in 7.4.2 is easily seen to satisfy the properties of 7.4.4. But it is of rather limited physical interest; there is no particle interaction. To rectify this situation we must search for some non-Gaussian random field which in some suitable way incorporates an interaction potential $U$. We shall try to do this in the spirit of Definition 7.4.4 by keeping the process $\Phi_0$ and the underlying space $Q_{N_m}$ of the free Euclidean field but replacing the free measure $d\mu_0$ by an interaction measure
(10) $d\nu = e^{-U}\,d\mu_0 \Big/ \int e^{-U}\,d\mu_0$,
where $U$ is the added interaction potential. Provided $d\nu$ can be given a meaning, we can immediately write down the associated generating functional or the associated Schwinger functions and hope to reconstruct a physically nontrivial theory.
B. Free Scalar Fields
To construct a physically nontrivial theory is no small task; one way of approaching it would be to start with a free lattice field. Let $\delta > 0$ be a fixed positive real number and define the lattice $L_\delta$ with spacing or mesh $\delta$ to be the set
(11) $L_\delta = \{n\delta \mid n \in \mathbb{Z}^d\}$.
On this space we consider the function space $l^2(L_\delta)$ with the norm
(12) $\|f\|_\delta^2 = \delta^d \sum_{n \in \mathbb{Z}^d} |f(n\delta)|^2$.
With our hyperfinite background we understand the role of the normalization constant $\delta^d$; when $\delta$ is infinitesimal, $\|f\|_\delta^2 \approx \int |f|^2\,dx$. It follows from (4) that the free Euclidean field has a “covariance matrix” $(-\Delta + m^2)^{-1}$; for the lattice approximation we shall have to study the discretization of this operator. As is well known from previous chapters, $-\Delta_\delta$ on $l^2(L_\delta)$ is given by the expression
(13) $(-\Delta_\delta f)(n\delta) = \delta^{-2}\Big[2d\,f(n\delta) - \sum_{|n'-n|=1} f(n'\delta)\Big]$.
We introduce the matrix $C$ by the definition
(14) $C_{n,n'} = \delta^{-d}(-\Delta_\delta + m^2)^{-1}_{n,n'}$.
Let $T_\delta = [-\pi/\delta, \pi/\delta]^d$ be the dual space of $L_\delta$. We have the following explicit formula:
(15) $C_{n,n'} = (2\pi)^{-d}\int_{T_\delta} e^{ik\cdot(n-n')\delta}\,\mu_\delta(k)^{-2}\,dk$,
where
(16) $\mu_\delta(k)^2 = m^2 + \delta^{-2}\Big[2d - 2\sum_{i=1}^{d}\cos(k_i\delta)\Big]$.
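The relation between the lattice symbol and its continuum counterpart can be checked numerically for ordinary (standard) small spacings. The following is a minimal sketch, assuming the form $\mu_\delta(k)^2 = m^2 + \delta^{-2}[2d - 2\sum_i \cos(k_i\delta)]$; the function names and the sample values of $k$ and $m$ are illustrative choices, not part of the text.

```python
import math

def lattice_dispersion_sq(k, delta, m):
    """mu_delta(k)^2 = m^2 + delta^-2 * (2d - 2 * sum_i cos(k_i * delta)).

    k is a tuple of d real momentum components (assumed form, see lead-in).
    """
    d = len(k)
    return m * m + (2 * d - 2 * sum(math.cos(ki * delta) for ki in k)) / delta ** 2

def continuum_dispersion_sq(k, m):
    """mu(k)^2 = k^2 + m^2, the symbol of -Laplacian + m^2."""
    return m * m + sum(ki * ki for ki in k)

# Illustrative momentum and mass in d = 2; any finite k behaves the same way.
k, m = (1.0, -2.0), 0.5
errors = [abs(lattice_dispersion_sq(k, delta, m) - continuum_dispersion_sq(k, m))
          for delta in (0.1, 0.01, 0.001)]
```

The entries of `errors` shrink roughly like $\delta^2$, which is the standard shadow of the statement that the two symbols are infinitely close when $\delta$ is infinitesimal and $k$ is finite.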
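Note that for $\delta \approx 0$ and $k$ finite we have $\mu_\delta(k)^2 \approx \mu(k)^2 = k^2 + m^2$. The free lattice field can be defined with respect to the full lattice $L_\delta$. But from many points of view it is as natural to start with a “finite space cutoff.” Let $\Lambda$ be a bounded region in $\mathbb{R}^d$.
7.4.5. DEFINITION. With $\Lambda \subset \mathbb{R}^d$ we associate
(i) $\Lambda_\delta = \Lambda \cap L_\delta$;
(ii) $\Lambda_\delta^{\circ} = \{n\delta \in \Lambda_\delta \mid \forall m\,(|n - m| = 1 \to m\delta \in \Lambda_\delta)\}$; and
(iii) $\partial\Lambda_\delta = \Lambda_\delta - \Lambda_\delta^{\circ}$.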
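The three sets of Definition 7.4.5 are easy to enumerate for a concrete region. The sketch below is only an illustration: sites $n\delta$ are represented by their integer index tuples $n$, the region is passed as a membership predicate, and the names `lattice_region` and `in_square` are hypothetical helpers introduced here.

```python
from itertools import product

def lattice_region(inside, delta, index_range, d):
    """Return (sites, interior, boundary) as sets of integer index tuples n.

    inside      -- predicate on a point x = n * delta of R^d (the region Lambda)
    index_range -- range of integer indices to scan in each coordinate
    """
    sites = {n for n in product(index_range, repeat=d)
             if inside(tuple(c * delta for c in n))}

    def neighbours(n):
        # The 2d nearest neighbours m with |n - m| = 1.
        for i in range(d):
            for step in (-1, 1):
                yield n[:i] + (n[i] + step,) + n[i + 1:]

    # Interior: sites all of whose nearest neighbours also lie in the region.
    interior = {n for n in sites if all(p in sites for p in neighbours(n))}
    return sites, interior, sites - interior

def in_square(x):
    """Illustrative region: the closed square [-1, 1]^2."""
    return all(abs(c) <= 1.0 for c in x)

sites, interior, boundary = lattice_region(in_square, 0.5, range(-4, 5), 2)
```

For this square with $\delta = 1/2$ the lattice region has 25 sites, of which the 9 sites with $|n_i| \le 1$ are interior and the remaining 16 form the boundary.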
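If $M$ is any matrix indexed by $L_\delta$ we denote by $M_\Lambda$ its restriction to $\Lambda_\delta$. We say that a matrix $M$ is concentrated on $\partial\Lambda_\delta$ if $M_{m,n} \ne 0$ implies that $m\delta, n\delta \in \partial\Lambda_\delta$. For the matrix $C$ we have the following important result:
(17) $(C_\Lambda)^{-1} = (C^{-1})_\Lambda - B_{\partial\Lambda_\delta}$,
where $B_{\partial\Lambda_\delta}$ is a matrix concentrated on $\partial\Lambda_\delta$ with non-negative elements.
We shall now discuss a version of the free Euclidean field in the lattice $\Lambda_\delta$. Note that $\Lambda_\delta$ is a finite set; let $l = |\Lambda_\delta|$ be the number of elements in $\Lambda_\delta$. Our truncated measure space will be the finite product $Q_{\Lambda_\delta} = \mathbb{R}^{\Lambda_\delta}$ with the free measure
(18) $d\mu_{0,\Lambda_\delta}(q) = (2\pi)^{-l/2}[\det(C_\Lambda)]^{-1/2}\exp\Big(-\tfrac{1}{2}\sum_{n\delta, n'\delta \in \Lambda_\delta}(C_\Lambda)^{-1}_{n,n'}\,q_{n\delta}q_{n'\delta}\Big)\,dq$,
where $q \in Q_{\Lambda_\delta}$ and $dq = \prod_{n\delta \in \Lambda_\delta} dq_{n\delta}$.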
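7.4.6. DEFINITION. The random field $\Phi_\delta$ indexed by $\Lambda_\delta$ is the map
$\Phi_\delta : \Lambda_\delta \times Q_{\Lambda_\delta} \to \mathbb{R}$
given by $\Phi_\delta(n)(q) = q_{n\delta}$, $n\delta \in \Lambda_\delta$. It is called the free lattice field of mass $m$ in $\Lambda_\delta$.
Using formula (17) for $(C_\Lambda)^{-1}$, we get the expression
(19) $\sum_{n\delta, n'\delta \in \Lambda_\delta}(C_\Lambda)^{-1}_{n,n'}\,q_{n\delta}q_{n'\delta} = (2d\,\delta^{d-2} + m^2\delta^d)\sum_{n\delta \in \Lambda_\delta} q_{n\delta}^2 - \delta^{d-2}\sum_{n\delta, n'\delta \in \Lambda_\delta,\ |n-n'|=1} q_{n\delta}q_{n'\delta} - \sum_{n\delta, n'\delta \in \Lambda_\delta}(B_{\partial\Lambda_\delta})_{n,n'}\,q_{n\delta}q_{n'\delta}$.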
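The nearest-neighbor structure of this quadratic form can be made concrete in a small case. The following sketch makes several simplifying assumptions that are not in the text: $d = 1$, $\Lambda_\delta$ is a chain of $N$ consecutive sites, and the boundary matrix $B$ is set to zero, so the matrix is just the restriction of $\delta^d(-\Delta_\delta + m^2)$. It compares the matrix form with a sum of mass terms, squared differences along bonds inside the chain, and two leftover end terms.

```python
def restricted_inverse_covariance(N, delta, m):
    """Restriction of delta^d * (-Laplacian_delta + m^2) to a chain of N sites (d = 1, B = 0)."""
    d = 1
    diag = 2 * d * delta ** (d - 2) + m * m * delta ** d
    off = -delta ** (d - 2)
    return [[diag if i == j else (off if abs(i - j) == 1 else 0.0)
             for j in range(N)] for i in range(N)]

def quadratic_form(M, q):
    """Sum over n, n' of M[n][n'] * q[n] * q[n']."""
    size = len(q)
    return sum(q[i] * M[i][j] * q[j] for i in range(size) for j in range(size))

def ising_form(q, delta, m):
    """Same quantity written as mass term + bonds inside the chain + end terms."""
    d = 1
    mass = m * m * delta ** d * sum(x * x for x in q)
    bonds = delta ** (d - 2) * sum((q[i] - q[i + 1]) ** 2 for i in range(len(q) - 1))
    ends = delta ** (d - 2) * (q[0] ** 2 + q[-1] ** 2)  # couplings to the two missing neighbours
    return mass + bonds + ends

q = [0.3, -1.2, 0.7, 2.0, -0.5]
M = restricted_inverse_covariance(len(q), 0.25, 1.5)
gap = abs(quadratic_form(M, q) - ising_form(q, 0.25, 1.5))
```

The two expressions agree up to rounding, which displays the Gaussian "spins" coupled through bonds inside the region together with extra terms located at the boundary sites.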
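We see that the free measure $d\mu_{0,\Lambda_\delta}$ describes an Ising model with continuous Gaussian spins and nearest-neighbor interaction between spins inside $\Lambda_\delta$ but with extra couplings on the boundary. We shall now explain how the free lattice field of Definition 7.4.6 is related to the free Euclidean field constructed in 7.4.2. The trick is to represent the random variables $\Phi_\delta(n)$, $n\delta \in \Lambda_\delta$, on the space $Q_{N_m}$ by setting
(20) $\Phi_\delta(n) = \Phi_0(f_{n\delta})$,
where $f_{n\delta}$ is the function on $\mathbb{R}^d$ with Fourier transform satisfying
An explicit calculation shows that $\int \Phi_0(f_{n\delta})\,d\mu_0 = 0$ and
(22) $\langle \Phi_0(f_{n\delta})\Phi_0(f_{n'\delta})\rangle_{L^2} = \tfrac{1}{2}\langle f_{n\delta}, f_{n'\delta}\rangle_{N_m} = C_{n,n'}$,
with $L^2$ standing for $L^2(d\mu_0)$. Let $g \in C_0^\infty(\mathbb{R}^d)$ have support in the interior of $\Lambda \subset \mathbb{R}^d$. We set
(23) $\Phi_\delta(g) = \sum_{n\delta \in \Lambda_\delta}\delta^d g(n\delta)\,\Phi_\delta(n)$.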
Using the representation $\Phi_\delta(n) = \Phi_0(f_{n\delta})$ one may show [see Simon (1974)] that $\Phi_\delta(g)$ converges to $\Phi_0(g)$ as $\delta \to 0$ in each $L^p(Q_{N_m}, d\mu_0)$, $1 \le p < \infty$. Thus we can look upon the free lattice field as an approximation “from the inside” to the free Euclidean field.
We now pass to the hyperfinite picture, which will approximate the free Euclidean field “from the outside.” Let $\delta > 0$ be infinitesimal and let $\Lambda_\delta$ be a hyperfinite lattice in ${}^*\mathbb{R}^d$ with spacing $\delta$. We use transfer on Definition 7.4.6 to construct the hyperfinite free field.
7.4.7. DEFINITION. Let $({}^*\mathbb{R}^{\Lambda_\delta}, L(\mathcal{B}), L(\mu_{0,\Lambda_\delta}))$ be the Loeb space defined from the internal Borel algebra $\mathcal{B}$ on ${}^*\mathbb{R}^{\Lambda_\delta}$ and the internal measure $\mu_{0,\Lambda_\delta}$ obtained by transfer on (18). The hyperfinite lattice field is the random field
$\Phi_\delta : \Lambda_\delta \times {}^*\mathbb{R}^{\Lambda_\delta} \to {}^*\mathbb{R}$
given by $\Phi_\delta(n)(q) = q_{n\delta}$, for $n\delta \in \Lambda_\delta$ and $q \in {}^*\mathbb{R}^{\Lambda_\delta}$.
We have the following regularity result.
7.4.8. LEMMA. Let $g \in C_0^\infty(\mathbb{R}^d)$; then
(24) $\Phi_\delta({}^*g) = \sum_{n\delta \in \Lambda_\delta}\delta^d\,{}^*g(n\delta)\,\Phi_\delta(n)$
is nearstandard on a set of $L(\mu_{0,\Lambda_\delta})$-measure one in ${}^*\mathbb{R}^{\Lambda_\delta}$.
For the proof we observe that we have the following formula for the covariance with respect to the internal measure $\mu_{0,\Lambda_\delta}$:
(25) $E(\Phi_\delta({}^*g)\Phi_\delta({}^*g)) = \sum_{n\delta, n'\delta \in \Lambda_\delta}\delta^{2d}\,{}^*g(n\delta)\,{}^*g(n'\delta)\,C_{n,n'}$,
and this is finite since $g$ has compact support.
Let $D$ be a countable set of $C_0^\infty(\mathbb{R}^d)$ functions that is dense in $N_m$. It follows from the lemma that there is a set $\Omega \subset {}^*\mathbb{R}^{\Lambda_\delta}$ of $L(\mu_{0,\Lambda_\delta})$-measure one such that $\Phi_\delta({}^*g)(q)$ is nearstandard for all $g \in D$ and all $q \in \Omega$.
7.4.9. THEOREM. The random field $\Phi_\delta$ indexed by $\Lambda_\delta$, where $\delta$ is a positive infinitesimal and $\Lambda_\delta$ is hyperfinite, is a “realization” of the free Euclidean field on the measure space $({}^*\mathbb{R}^{\Lambda_\delta}, L(\mathcal{B}), L(\mu_{0,\Lambda_\delta}))$.
What we mean, quite precisely, is that $\mathrm{st}(\Phi_\delta({}^*g))$, for $g \in D$, is a set of Gaussian random variables with mean zero and with the correct covariance,
$\langle \mathrm{st}(\Phi_\delta({}^*g)), \mathrm{st}(\Phi_\delta({}^*g))\rangle_{L^2} = \tfrac{1}{2}\langle g, g\rangle_{N_m}$.
Since $D$ is dense in $N_m$, we can extend this equality to all of the latter space. Thus $\Phi_0(g)$ for $g \in N_m$ can be represented by $\mathrm{st}(\Phi_\delta({}^*g))$.
The proof consists in taking standard parts in formula (25); just as in Section 6.2 this gives us a standard integral with respect to the resolvent kernel of the Laplacian. Then use formula (4) above.
7.4.10. REMARK. We see that the representation in Theorem 7.4.9 is independent of the choice of infinitesimal $\delta$ and hyperfinite lattice $\Lambda_\delta$. This corresponds to the convergence of $\Phi_\delta(g)$ to $\Phi_0(g)$ as $\delta \downarrow 0$ in the case of finite lattice approximations; see the discussion in connection with formula (23).
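The passage from the lattice covariance sum to an integral against the resolvent kernel can be watched numerically for standard small $\delta$. The sketch below is an illustration under stated assumptions, not the argument of the text: it works in $d = 1$ on the full lattice, uses the closed form $C_{n,n'} = e^{-\kappa|n-n'|}/(m\sqrt{4 + m^2\delta^2})$ with $\kappa = 2\,\mathrm{arsinh}(m\delta/2)$ (worked out for this sketch from the one-dimensional lattice resolvent), compares it with the continuum kernel $e^{-m|x|}/(2m)$, and takes a tent function as test function for simplicity even though it is not smooth.

```python
import math

def lattice_covariance_1d(k, delta, m):
    """C_{n,n'} for |n - n'| = k on the full 1D lattice (closed form assumed in the lead-in)."""
    kappa = 2.0 * math.asinh(m * delta / 2.0)
    return math.exp(-kappa * k) / (m * math.sqrt(4.0 + (m * delta) ** 2))

def continuum_kernel_1d(x, m):
    """Kernel of (-d^2/dx^2 + m^2)^-1 on the real line."""
    return math.exp(-m * abs(x)) / (2.0 * m)

def tent(x):
    """Illustrative test function supported in [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def smeared_variances(delta, m=1.0, g=tent, half_width=1.0):
    """Return (lattice sum as in (25), same Riemann sum with the continuum kernel)."""
    N = int(round(half_width / delta))
    sites = range(-N, N + 1)
    lattice = continuum = 0.0
    for n in sites:
        for n2 in sites:
            weight = delta ** 2 * g(n * delta) * g(n2 * delta)  # delta^{2d} with d = 1
            lattice += weight * lattice_covariance_1d(abs(n - n2), delta, m)
            continuum += weight * continuum_kernel_1d((n - n2) * delta, m)
    return lattice, continuum

coarse = smeared_variances(0.2)
fine = smeared_variances(0.02)
```

The difference between the two entries of `fine` is much smaller than for `coarse`; in the hyperfinite picture the corresponding difference is infinitesimal, which is what taking standard parts in (25) expresses.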
We have thus constructed the free field on a hyperfinite lattice. Formula (24) gives the field as an internal construction. The noteworthy aspect here is that the field is pointwise defined; $\Phi_\delta(n)$ makes computational sense for all points $n\delta \in \Lambda_\delta$. This is not possible in the standard approach of Definition 7.4.2. One cannot give a coherent sense to $\Phi_0(f_{n\delta})$ for $\delta$ infinitesimal, for if $n\delta$ is chosen so that $x = \mathrm{st}(n\delta)$ exists as a point in $\mathbb{R}^d$, then $\Phi_0(f_{n\delta})$ would correspond to the field at the point $x$. But this is, in general, a nonexisting entity. Thus in the standard approach one is led to the concept of a random field indexed by some space of distributions. In the hyperfinite version we use the sites of the lattice $\Lambda_\delta$ as our primary index set and extend by using formula (24). We will still have “infinities,” but they can be controlled through a consistent algebra, hence lead to unambiguous and meaningful results.
REMARK. $({}^*\mathbb{R}^{\Lambda_\delta}, L(\mathcal{B}), L(\mu_{0,\Lambda_\delta}))$ is a hyperfinite realization of a generalized random field. Kessler (1984) has developed a general hyperfinite approach to distribution-valued random fields. It seems that the nonstandard approach offers certain technical advantages; in particular, when the underlying probability space is a Loeb space it turns out that the Gelfand and Urbanik characterizations of generalized fields of order $n$ are equivalent; see Kessler's paper for full details.
While we are digressing, let us also take the opportunity to insert a brief note on the status of generalized functions and distribution theory versus nonstandard analysis. That nonstandard analysis can be a suitable framework for the study of distributions was realized quite early; e.g., with $\varepsilon$ infinitesimal the Dirac $\delta$-function $\delta(x)$ can be represented by an internal function (Robinson, 1966; Laugwitz, 1978).
More generally, the definition of distributions as limits of sequences [Mikusinski and Sikorski (1973)] is easily taken over in nonstandard theory, replacing sequences and limits by infinitesimals and standard parts; see, e.g., Richter (1982). However, a systematic study, especially in connection with partial differential equations, is lacking [see, however, Li Bang-He (1978) for distributions defined in a nonstandard way through analytic functions]. In Section 6.5 we saw that generalized solutions
in the nonstandard sense might indeed give new insights in problems of partial differential equations, where generalized functions in the standard sense failed.
Kessler has developed a quite systematic approach to distributions by using a hyperfinite approach. Formula (24) above gives a hint; the distribution $\Phi_0(g)$, formally given by $\int \Phi_0(x)g(x)\,dx$, is realized as the standard part of
$\sum_{n\delta \in \Lambda_\delta}\delta^d\,{}^*g(n\delta)\,\Phi_\delta(n)$,
which makes perfect sense. More generally, Kessler replaces test functions, distributions, and differential operators by their hyperfinite analogs. In particular, elements of the distribution space $\mathcal{D}'$ are lifted to internal functions on a hyperfinite lattice $\Gamma$, S-dense in $\mathbb{R}^d$. The result mentioned above about $n$-order generalized random fields is but a special case of the nice control on generalized functions which can be achieved by using the hyperfinite picture.
REMARK. There have been other attempts at using nonstandard methods to discuss quantum fields, e.g., Blanchard and Tarski (1978), Fittler (1984), Kelemen and Robinson (1972), and Nagamachi and Nishimura (1984).
C. Interacting Scalar Fields
Enough has now been said about the free field; the time has come to return to interactions. We shall follow the program hinted at in connection with formula (10). One way of obtaining interactions is to construct suitable “local additive functionals” of the free field; ultimately we shall use our hyperfinite realization of Definition 7.4.7 to do this. But first we outline the standard procedure.
Let $\delta > 0$ be a standard real and consider the lattice $\Lambda_\delta$ obtained from a bounded domain $\Lambda \subset \mathbb{R}^d$. Let $g$ be a positive function with support in $\Lambda$ and let $u_\delta$ be any continuous real function. We will study interactions of the form
(26) $U_g^\delta = \lambda_\delta\sum_{n\delta \in \Lambda_\delta}\delta^d g(n\delta)\,u_\delta(\Phi_\delta(n))$,
where $\lambda_\delta$ is a real constant, the “coupling constant.” In accordance with (10) we introduce
(27) $d\mu_{g,\Lambda_\delta} = \exp(-U_g^\delta)\,d\mu_{0,\Lambda_\delta}\Big/\int\exp(-U_g^\delta)\,d\mu_{0,\Lambda_\delta}$.
We are in the finite case, and under a number of reasonable conditions $\mu_{g,\Lambda_\delta}$ turns out to be a well-defined probability measure. The function $g$ in (26) represents a kind of “space cutoff”; e.g., it could be the characteristic function of some domain $\Lambda_0 \subset \Lambda$. In order to obtain
a nontrivial field we now let $\delta$ tend to zero while at the same time letting $\Lambda_\delta \uparrow \mathbb{R}^d$. To remove the space cutoff we let $g$ converge to the constant function 1 on $\mathbb{R}^d$. In the hyperfinite version this means that we want to choose $\delta > 0$ infinitesimal, $\Lambda_\delta$ a hyperfinite lattice, and $g$ an internal function such that $g(n\delta) = 1$ for all nearstandard $n\delta \in \Lambda_\delta$. By transfer (26) and (27) still make sense. But we would like to extract from the internal construct $\mu_{g,\Lambda_\delta}$ a non-Gaussian measure satisfying the requirements of Definition 7.4.4.
We shall discuss this problem in several steps. First we make some calculations in the truly finite case. Next we choose $\delta > 0$ infinitesimal and $\Lambda_\delta$ hyperfinite but keep a cutoff function $g$ of compact support. We then retreat to the finite case to establish some inequalities, which in the final stage are used to remove the space cutoff in the hyperfinite model.
This is the general program which we now shall discuss in some detail in the case of exponential interaction where we choose $u(y) = \exp(\alpha y)$, independent of $\delta$, for some real parameter $\alpha$; see Albeverio and Høegh-Krohn (1974).
We start out by doing the following calculation, where $\delta \in \mathbb{R}_+$, $\Lambda_\delta$ is a finite lattice, and $g \ge 0$ has support in $\Lambda$:
(28) $\int (U_g^\delta)^2\,d\mu_{0,\Lambda_\delta} = \lambda_\delta^2\sum_{n\delta, n'\delta \in \Lambda_\delta}\delta^{2d} g(n\delta)g(n'\delta)\int e^{\alpha q_{n\delta}}e^{\alpha q_{n'\delta}}\,d\mu_{0,\Lambda_\delta}(q)$.
From well-known properties of Gaussian integrals we see that
(29) $\int e^{\alpha q_{n\delta}}e^{\alpha q_{n'\delta}}\,d\mu_{0,\Lambda_\delta}(q) = (A_\alpha^\delta)^2\,e^{\alpha^2 C^\delta_{n,n'}}$,
where we have now put a suffix $\delta$ on $C_{n,n'}$ [see (14)] to indicate the dependence on $\delta$ and where
(30) $A_\alpha^\delta = \exp(\tfrac{1}{2}\alpha^2 C^\delta_{n,n}) = \exp\Big(\tfrac{1}{2}\alpha^2(2\pi)^{-d}\int_{T_\delta}\mu_\delta(k)^{-2}\,dk\Big)$.
Thus
(31) $\int (U_g^\delta)^2\,d\mu_{0,\Lambda_\delta} = \lambda_\delta^2 (A_\alpha^\delta)^2\sum_{n\delta, n'\delta \in \Lambda_\delta}\delta^{2d} g(n\delta)g(n'\delta)\,e^{\alpha^2 C^\delta_{n,n'}}$.
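The Gaussian fact used here is that a centered Gaussian variable $X$ with variance $\sigma^2$ satisfies $E\,e^{\alpha X} = e^{\alpha^2\sigma^2/2}$, so that dividing $e^{\alpha X}$ by this factor gives a variable of mean one. The following sketch checks this by plain trapezoid quadrature; the helper name `gaussian_expectation` and the sample values of $\alpha$ and the variance are illustrative assumptions.

```python
import math

def gaussian_expectation(func, variance, half_width=12.0, steps=24000):
    """Trapezoid-rule approximation of E[func(X)] for X ~ N(0, variance)."""
    sigma = math.sqrt(variance)
    left = -half_width * sigma
    h = 2.0 * half_width * sigma / steps
    total = 0.0
    for i in range(steps + 1):
        x = left + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * func(x) * math.exp(-x * x / (2.0 * variance))
    return total * h / math.sqrt(2.0 * math.pi * variance)

alpha, variance = 1.3, 0.7  # variance plays the role of the diagonal covariance entry
plain = gaussian_expectation(lambda x: math.exp(alpha * x), variance)
renormalizing_factor = math.exp(0.5 * alpha ** 2 * variance)
renormalized_mean = plain / renormalizing_factor
```

Here `plain` reproduces `renormalizing_factor` and `renormalized_mean` is one up to quadrature error. When the variance grows without bound, as the diagonal entry of the covariance does for $d = 2$, the factor diverges, which is why a compensating choice of coupling constant becomes necessary.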
And since we are in the finite case everything is well defined.
Passing to the next step, we now choose $\delta \approx 0$ and $\Lambda_\delta$ as a hyperfinite lattice, but we keep a cutoff $g$ of compact support. By transfer (31) still makes sense, but it may be infinite. We must determine for which values of $d$, $\alpha$, and $\lambda_\delta$ the right-hand side of (31) is nearstandard. For $d = 1$, the sum
(32) $\sum_{n\delta, n'\delta \in \Lambda_\delta}\delta^{2d} g(n\delta)g(n'\delta)\,e^{\alpha^2 C^\delta_{n,n'}}$
is finite for all $\alpha$ since $C^\delta_{n,n'}$ is finite for all $n, n'$ and $g$ has compact support. For $d = 2$, $C^\delta_{n,n'}$ is, modulo finite terms, $-(1/2\pi)\ln|n\delta - n'\delta|$ when $|n\delta - n'\delta| \approx 0$. Thus if $\alpha^2 < 4\pi$ and $g$ is of compact support the sum (32) is finite. For $d \ge 3$, or $d = 2$ and $\alpha^2 \ge 4\pi$, the sum is not finite. [For more analytic details on this point the reader may consult Albeverio and Høegh-Krohn (1974).]
Thus in the case $d = 1$, or $d = 2$ and $\alpha^2 < 4\pi$, the right-hand side of (31) would be finite if $\lambda_\delta^2 (A_\alpha^\delta)^2$ were finite. We observe from (30) that $A_\alpha^\delta$ is infinite for $\delta \approx 0$ and $d = 2$. But we have freedom of choice; the coupling constant $\lambda_\delta$ has so far been left unspecified. So let us choose
(33) $\lambda_\delta = \lambda (A_\alpha^\delta)^{-1}$,
with $\lambda \in \mathbb{R}_+$ independent of $\delta$ and $\alpha$. With this choice the right-hand side of (31) is nearstandard. And it is not difficult to see that the standard part is
(34) $\lambda^2\int_{\mathbb{R}^d}\int_{\mathbb{R}^d} g(x)g(y)\,e^{\alpha^2 G(x-y)}\,dx\,dy$,
where $G(x - y)$ is the kernel of the operator $(-\Delta + m^2)^{-1}$; for similar arguments see Section 6.2 and Theorem 7.4.9. Thus we may conclude that $U_g^\delta$ is a positive function which has finite $L^2(d\mu_{0,\Lambda_\delta})$ norm for the above choice of $\lambda_\delta$; moreover, we easily see that $\mathrm{st}(U_g^\delta) \ne 0$. This shows that for $d = 1$ and all $\alpha \in \mathbb{R}$, or $d = 2$ and $\alpha^2 < 4\pi$, the function $\exp(-U_g^\delta)$ is nearstandard and not identically one. We thus have the following result:
7.4.11. THEOREM. For $d = 1$ and all $\alpha \in \mathbb{R}$, or $d = 2$ and $\alpha^2 < 4\pi$, the Loeb measure $L(\mu_{g,\Lambda_\delta})$ associated with the internal measure $\mu_{g,\Lambda_\delta}$ as defined in (27), and with coupling constant for $d = 2$ chosen as in (33), is absolutely continuous with respect to the free field measure $L(\mu_{0,\Lambda_\delta})$ for all $g$ of compact support, the Radon-Nikodym derivative being the $L^\infty$ function
$\mathrm{st}(\exp(-U_g^\delta))\Big/\int \mathrm{st}(\exp(-U_g^\delta))\,dL(\mu_{0,\Lambda_\delta})$.
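The dimension dependence behind this result can be seen in the diagonal covariance entry $C^\delta_{n,n} = (2\pi)^{-d}\int_{T_\delta}\mu_\delta(k)^{-2}\,dk$. The sketch below is a rough numerical illustration for standard $\delta$, under the assumption that $\mu_\delta$ has the form $m^2 + \delta^{-2}[2d - 2\sum_i\cos(k_i\delta)]$; after the substitution $\theta = k\delta$ the integral is approximated by a midpoint rule, and the function name and grid sizes are choices made here.

```python
import math
from itertools import product

def diagonal_covariance(delta, m, d, points=400):
    """Midpoint-rule approximation of C_nn.

    With theta = k * delta the integral becomes
    delta^(2-d) * (2*pi)^-d * integral over [-pi, pi]^d of
    d(theta) / (m^2 delta^2 + 2d - 2 * sum_i cos(theta_i)).
    """
    h = 2.0 * math.pi / points
    cosines = [math.cos(-math.pi + (j + 0.5) * h) for j in range(points)]
    total = 0.0
    for combo in product(cosines, repeat=d):
        total += 1.0 / ((m * delta) ** 2 + 2 * d - 2.0 * sum(combo))
    return delta ** (2 - d) * total * h ** d / (2.0 * math.pi) ** d

# d = 1: compare with the closed form 1 / (m * sqrt(4 + m^2 delta^2)), which tends to 1/(2m).
one_dim = diagonal_covariance(0.05, 1.0, 1, points=4000)
one_dim_exact = 1.0 / math.sqrt(4.0 + 0.05 ** 2)

# d = 2: the same quantity keeps growing as delta decreases.
two_dim = [diagonal_covariance(delta, 1.0, 2, points=200) for delta in (0.4, 0.2, 0.1)]
```

In one dimension the value stays bounded as $\delta$ decreases, so no renormalization is needed, whereas in two dimensions it grows (logarithmically in $1/\delta$ in the exact integral, which the coarse grid used here only captures qualitatively). This growth is what the factor $(A_\alpha^\delta)^{-1}$ in (33) compensates.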
We have thus completed the second stage of the program. Before we proceed we note that the interaction $U_g^\delta$ is more commonly written
$U_g^\delta = \lambda\sum_{n\delta \in \Lambda_\delta}\delta^d g(n\delta)\,{:}e^{\alpha\Phi_\delta(n)}{:}$,
where ${:}e^{\alpha\Phi_\delta(n)}{:} = (A_\alpha^\delta)^{-1}e^{\alpha\Phi_\delta(n)}$ is the so-called Wick renormalization of $e^{\alpha\Phi_\delta(n)}$. In the Wick notation the identity in (29) is nothing but an instance of the identity $\langle {:}\exp \alpha f{:}\ {:}\exp \beta g{:}\rangle = \exp(\alpha\beta\langle fg\rangle)$. We have chosen a more direct approach to highlight the choice of infinitesimal coupling constant in (33).
7.4.12. REMARK. One can obviously replace $U_g^\delta$ above by
$\tilde{U}_g^\delta = \int U_g^\delta\,d\nu(\alpha)$
for any positive finite measure $\nu$ [with support in $(-\sqrt{4\pi}, \sqrt{4\pi})$ for $d = 2$]. This is the case that was studied by standard methods in Albeverio and Høegh-Krohn (1974); see also the references in Albeverio and Høegh-Krohn (1984b).
We now return to the discrete case in order to establish certain inequalities of crucial importance for the final space cutoff removal. Let $\delta \in \mathbb{R}_+$, let $\Lambda_\delta$ be a finite lattice, and let $g$ have support in $\Lambda$. We shall use the interaction $\tilde{U}_g^\delta$, where in order to simplify matters we now choose a measure $\nu$ satisfying $\nu(\alpha) = \nu(-\alpha)$. The Schwinger functions associated with the measure $\mu_{g,\Lambda_\delta}$ are given by
$S_g^\delta(n_1\delta, \dots, n_k\delta) = \int \Phi_\delta(n_1)\cdots\Phi_\delta(n_k)\exp(-\tilde{U}_g^\delta)\,d\mu_{0,\Lambda_\delta}\Big/\int \exp(-\tilde{U}_g^\delta)\,d\mu_{0,\Lambda_\delta}$.
Let $g, g'$ both have support in $\Lambda$; assume that $\mathrm{supp}\,g \subseteq \mathrm{supp}\,g'$ and that $g = g'$ on $\mathrm{supp}\,g$. We want to show that
(36) $S_{g'}^\delta \le S_g^\delta$.
In order to do this, introduce
Note that $S_{g'}^\delta = S_{g',g,1}^\delta$ and $S_g^\delta = S_{g',g,0}^\delta$. Hence in order to establish (36) we must compute $\partial S_{g',g,\rho}^\delta/\partial\rho$ and show that $\partial S_{g',g,\rho}^\delta/\partial\rho \le 0$ for $0 \le \rho \le 1$. This will follow by an application of the Griffiths-Kelly-Sherman inequalities; e.g., see Simon (1974). In brief outline the argument goes as follows.
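Referring to the explicit form of the measure $d\mu_{g,\Lambda_\delta}$ [see (27) and (18)], we may immediately conclude from the GKS inequalities that
(38) $\int \Phi_\delta(n_1)\cdots\Phi_\delta(n_k)\,\Phi_\delta(n)^{2l}\,d\mu_{g,\Lambda_\delta} \ge \int \Phi_\delta(n_1)\cdots\Phi_\delta(n_k)\,d\mu_{g,\Lambda_\delta}\int \Phi_\delta(n)^{2l}\,d\mu_{g,\Lambda_\delta}$
for all $l$. Then, by series expansion and the fact that the measure $d\nu$ is even, one shows that
which is precisely what is needed to prove that $\partial S_{g',g,\rho}^\delta/\partial\rho \le 0$. In a similar way it also follows from the GKS inequalities that
(40) $S_g^\delta \ge 0$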
for all $g$ of support in $\Lambda$.
We are now ready for the final stage. Let us once more move to the hyperfinite model where $\delta > 0$ is infinitesimal and $\Lambda_\delta$ is hyperfinite. By transfer, the moments $S_g^\delta$ are well defined with respect to the appropriate internal measure. Let us first consider the case $g = 0$ on $\Lambda_\delta$. In this case we get the Schwinger functions of the free field. In more detail, let $n_1\delta$ and $n_2\delta$ be finite points and assume that $x = \mathrm{st}(n_1\delta) \ne \mathrm{st}(n_2\delta) = y$, $x, y \in \mathbb{R}^2$. Then for $d = 2$,
(41) ${}^{\circ}(S_0^\delta(n_1\delta, n_2\delta)) = \int {}^{\circ}(\Phi_\delta(n_1)\Phi_\delta(n_2))\,dL(\mu_{0,\Lambda_\delta}) = S_0(x, y)$.
And, as is well known in the free case, an arbitrary Schwinger function is a sum of products of the “two-point functions” given in (41). Thus if we have noncoincident points $x_1 = \mathrm{st}(n_1\delta) \ne x_2 = \mathrm{st}(n_2\delta) \ne \cdots \ne x_k = \mathrm{st}(n_k\delta)$, then ${}^{\circ}S_0^\delta(n_1\delta, \dots, n_k\delta)$ is well defined and finite.
Now let $g_n$ be the characteristic function of a cube centered at the origin and with sides of length $2n$. By transfer we have for all $n$ that
(42) $0 \le S_{g_n}^\delta \le S_0^\delta$.
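Let $g_\omega$ be an internal function which is the characteristic function of a cube of side length $2\omega$ and with $\mathrm{supp}\,g_\omega$ bounded by $\Lambda_\delta$; i.e., if $g_\omega(n\delta) = 1$, then $n\delta \in \Lambda_\delta$. We see that $S_{g_\omega}^\delta$, as an internal moment function with respect to the internal measure $\mu_{g_\omega,\Lambda_\delta}$, will also satisfy the inequalities (42). Furthermore, we notice that if $n_i\delta \approx n_i'\delta$, $i = 1, \dots, k$, and the points $\mathrm{st}(n_i\delta) \in \mathbb{R}^2$ for $i = 1, \dots, k$ are noncoincident, then $S_{g_\omega}^\delta(n_1\delta, \dots, n_k\delta) \approx S_{g_\omega}^\delta(n_1'\delta, \dots, n_k'\delta)$. This follows from the explicit definition of $S_{g_\omega}^\delta$ and the fact that $g_\omega(n_i\delta) = g_\omega(n_i'\delta)$, $i = 1, \dots, k$. Thus $S_{g_\omega}^\delta$ has a well-defined standard part, and it follows from the inequalities
(43) $0 \le S_{g_\omega}^\delta \le S_{g_n}^\delta$, all $n$,
that
(44) ${}^{\circ}(S_{g_\omega}^\delta(n_1\delta, \dots, n_k\delta)) = \int {}^{\circ}(\Phi_\delta(n_1)\cdots\Phi_\delta(n_k))\,dL(\mu_{g_\omega,\Lambda_\delta}) = \lim_{n\to\infty}\int {}^{\circ}(\Phi_\delta(n_1)\cdots\Phi_\delta(n_k))\,dL(\mu_{{}^*g_n,\Lambda_\delta}) = \lim_{n\to\infty}{}^{\circ}(S_{g_n}^\delta(n_1\delta, \dots, n_k\delta))$.
We may formulate this as a theorem.
7.4.13. THEOREM.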
Let $\delta > 0$ be infinitesimal and $\Lambda_\delta$ hyperfinite, and let $g_\omega$ be an internal function such that $g_\omega(n\delta) = 1$ for all finite $n\delta$. Then $L(\mu_{g_\omega,\Lambda_\delta})$ is a non-Gaussian probability measure on $({}^*\mathbb{R}^{\Lambda_\delta}, L(\mathcal{B}))$, where $\mathcal{B}$ is the internal Borel algebra on ${}^*\mathbb{R}^{\Lambda_\delta}$. The Schwinger functions are finite and S-continuous for finite arguments and satisfy the inequalities
$0 \le S_{g_\omega}^\delta \le S_0$,
where the $S_0$ are the (standard) Schwinger functions of the free Euclidean field.
REMARK. In the hyperfinite picture we seem to effect the double limit passage in a single step by choosing $\delta > 0$ infinitesimal (continuum limit) and $\Lambda_\delta$ hyperfinite (infinite-volume limit). Does this mean that in nonstandard analysis one need not worry about the order in which one takes iterated limits? Not at all, as our discussion in Section 1.3 showed; in the present model, however, we need not worry because of the strong monotonicity properties of the associated Schwinger functions.
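A toy illustration of the point about iterated limits, unrelated to the field model itself and with both double sequences chosen here purely as examples: for $a(m, n) = m/(m+n)$ the two iterated limits differ, whereas for a bounded double sequence that is monotone in each index they agree.

```python
def ratio(m, n):
    """a(m, n) = m / (m + n): the two iterated limits are 0 and 1."""
    return m / (m + n)

def monotone(m, n):
    """Increasing in each index and bounded by 1: both iterated limits equal 1."""
    return 1.0 - 0.5 ** m - 0.5 ** n

# "Inner limit first" is imitated by making the inner index vastly larger.
n_first = ratio(10 ** 3, 10 ** 9)   # n -> infinity first, then m
m_first = ratio(10 ** 9, 10 ** 3)   # m -> infinity first, then n
both_orders = (monotone(50, 1000), monotone(1000, 50))
```

In nonstandard terms, for infinite $M$ and $N$ the standard part of $M/(M+N)$ depends on how $M$ and $N$ compare, while the standard part of the monotone example does not; the Schwinger functions of the remark behave like the second case.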
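In Albeverio and Høegh-Krohn (1974) a nontrivial lower bound for $\mathrm{st}\,S_{g_\omega}^\delta$ is given. It remains to verify that the measure $L(\mu_{g_\omega,\Lambda_\delta})$ is non-Gaussian. We recall that if $\mu$ is Gaussian, then
(45) $\langle f_1\cdots f_{2n}\rangle_\mu = \sum \langle f_{i_1}f_{j_1}\rangle_\mu\cdots\langle f_{i_n}f_{j_n}\rangle_\mu$,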
where $\langle\,\cdot\,\rangle_\mu$ denotes expectation with respect to the measure $\mu$ and the sum is taken over all ways of writing $\{1, \dots, 2n\}$ as $i_1, \dots, i_n, j_1, \dots, j_n$ with $i_1 < i_2 < \cdots < i_n$, $i_1 < j_1, \dots, i_n < j_n$. To verify the non-Gaussian character of $\mu = L(\mu_{g_\omega,\Lambda_\delta})$, it suffices to choose $n = 2$, all the $f$'s equal to $\Phi_\delta({}^*h)$ for some suitable $h$, and by a direct computation see that the identity (45) is violated; for details in the standard case see Albeverio and Høegh-Krohn (1979).
It remains to discuss the axioms of Definition 7.4.4. There are two ways to proceed. In the standard approach, where one does not have the hyperfinite model but must work with the approximations given by the sequence $g_n$, one defines a family of generating functionals by the equations
(46) $S_n\{f\} = \int \exp(i\,\mathrm{st}\,\Phi_\delta(f))\,dL(\mu_{{}^*g_n,\Lambda_\delta})$, $n = 1, 2, \dots$,
where $f \in C_0^\infty(\mathbb{R}^2)$. One then shows that
(47) $S_\infty\{f\} = \lim_{n\to\infty} S_n\{f\}$
exists. An application of the Minlos theorem yields a probability measure $d\mu_\infty$ on $S'(\mathbb{R}^2)$, and one is back to the setting of Definition 7.4.4. But having the hyperfinite model and a well-defined measure $L(\mu_{g_\omega,\Lambda_\delta})$ on ${}^*\mathbb{R}^{\Lambda_\delta}$, we may introduce the required generating functional $S\{f\}$ directly; we set
(48) $S\{f\} = \int \exp(i\,\mathrm{st}\,\Phi_\delta(f))\,dL(\mu_{g_\omega,\Lambda_\delta})$,
where $f \in C_0^\infty(\mathbb{R}^2)$. Notice that Definition 7.4.4 makes perfect sense in this setting and we may proceed to verify the axioms.
7.4.14. THEOREM. The probability measure $L(\mu_{g_\omega,\Lambda_\delta})$ in the space $({}^*\mathbb{R}^{\Lambda_\delta}, L(\mathcal{B}))$ defines a non-Gaussian Euclidean field theory; i.e., the generating functional
$S\{f\} = \int \exp(i\,\mathrm{st}\,\Phi_\delta(f))\,dL(\mu_{g_\omega,\Lambda_\delta})$
satisfies the properties of 7.4.4 (analyticity, regularity, invariance, reflection positivity, and ergodicity).
We shall not establish this in detail; we note that analyticity and regularity easily follow from the bounds established in Theorem 7.4.13. For invariance we have to show that if $E$ is a Euclidean symmetry then $S\{f \circ E\} = S\{f\}$. But in the approximation $g_n$ we have assumed that $g_n$ is the characteristic function of some lattice $\Lambda_\delta^{(n)} = \Lambda^{(n)} \cap L_\delta$, where $\Lambda^{(n)}$ is a suitable bounded
domain in $\mathbb{R}^d$. Performing the symmetry $E$ on the domains $\Lambda^{(n)}$, we get a new sequence $g_n'$ and an infinite internal $g_\omega'$ which is easily seen to define the same measure as the given $g_\omega$; this follows from the general form of the inequalities in (36). The Osterwalder-Schrader reflection positivity follows from the fact that the internal limit measure $\mu_{g_\omega,\Lambda_\delta}$ satisfies the hyperfinite global Markov property; see Section 7.3. The bounds of Theorem 7.4.13 are again essential for the verification of ergodicity; however, the details are somewhat laborious, and we ask the reader to study the original reference [Albeverio and Høegh-Krohn (1974)].
We have treated a particularly simple example to illustrate how one can exploit the hyperfinite picture to construct models for quantum fields. Let us observe that the hyperfinite construction can be given in all space-time dimensions. From an interaction (49)
one can derive an internal measure $\mu_{g,\Lambda_\delta}$. The associated Loeb measure will always exist and some of the axioms, the translation and reflection property of Euclidean invariance and reflection positivity, will be seen immediately to be verified. As for the remaining axioms, ways can be found to cope with them; the real difficulty lies in the fact that the measure $L(\mu_{g,\Lambda_\delta})$ may turn out to be trivial, i.e., equal to the free measure; compare the discussion in Albeverio and Høegh-Krohn (1974, 1980, 1984b) and Albeverio et al. (1979).
We add some remarks on “fields with boundary conditions” which will be used in Section 7.5. Let $\mu^B_{0,\Lambda_\delta}$ be the Gaussian measure given by
with $(C^B)^{-1} = (C^{-1})_\Lambda - B_{\partial\Lambda}$, $C^{-1} = \delta^d(-\Delta_\delta + m^2)$, $(C^{-1})_\Lambda$ the restriction of $C^{-1}$ to $\Lambda_\delta$, and $B_{\partial\Lambda}$ a matrix concentrated on $\partial\Lambda_\delta$; see (17) above. We call $\mu^B_{0,\Lambda_\delta}$ the free lattice field measure with boundary condition $B$. Sometimes it is convenient to assume that $\Lambda$ is a “hypercube” centered at the origin and that $\Lambda_\delta = \Lambda \cap L_\delta$.
The free lattice field $\Phi_\delta^B$ with boundary condition $B$ indexed by $\Lambda_\delta$ is introduced as in 7.4.6 and 7.4.7. It is a Gaussian random field with mean zero and covariance
(51) $E(\Phi_\delta^B(n)\Phi_\delta^B(n')) = (C^B)_{n,n'}$,
where $E$ now denotes expectation with respect to $\mu^B_{0,\Lambda_\delta}$. In Section 7.5 we shall have a particular use of the case of “Neumann boundary conditions,” which corresponds to setting
( Ba A 1n, m = an. m i ,
1
if nS E As is a corner site of As, if nS E Ag is not a comer site of As, otherwise.
i =2
where
i =1
i
=0
Notice that this corresponds to dropping the terms $(q_{n\delta} - q_{n'\delta})^2$ across $\partial\Lambda_\delta$ in the expression for the density of $\mu_{0,\Lambda_\delta}$. Thus we have a discrete version of setting the normal derivative equal to zero. Another frequent choice is the "Dirichlet boundary conditions" obtained by setting $B_{\partial\Lambda} = 0$.

We can use $\mu^B_{0,\Lambda_\delta}$ in (26) and (27) to construct models of non-Gaussian Euclidean random fields. This has been carried out in a number of cases in constructive quantum field theory; see, e.g., Simon (1974) and Glimm and Jaffe (1981) and references therein.

We conclude this section with some remarks on the $\varphi^4_d$ model seen from our hyperfinite point of view. The $\varphi^4_d$ model has an interaction of the form (49) with $u_\delta$ a polynomial of fourth degree. We shall give an outline of an approach by Brydges et al. (1983) in the cases $d = 2, 3$ and indicate how their construction fits into the present framework. In fact, by using a hyperfinite lattice we shall overcome the "somewhat distasteful" construction using compactness and subsequences which the authors need to complete their program.

We start out with the finite-volume theory. Let $\Lambda_\delta$ be a finite lattice in $\mathbb{R}^d$ with spacing $\delta > 0$. On $\mathbb{R}^{\Lambda_\delta}$ we introduce the measure
$$d\mu_{\Lambda_\delta}(\varphi) = \prod_{x\in\Lambda_\delta} d\varphi_x\, \exp(-S_{\Lambda_\delta}(\varphi))/Z_{\Lambda_\delta}, \tag{52}$$
where $Z_{\Lambda_\delta}$ is a suitable normalization factor and $S_{\Lambda_\delta}$ is the lattice action (53), with $\lambda \ge 0$ and $a \in \mathbb{R}$. Here $\langle xy\rangle$ means that we sum over all nearest neighbors in $\mathbb{Z}^d_\delta$, setting $\varphi_x = 0$ for $x \notin \Lambda_\delta$ (which means that we impose Dirichlet boundary conditions).

REMARK. We follow the exposition of Brydges et al. (1983). Thus (52) contains both the "free" and the "interacting" part of the measure. The interacting part corresponding to (49) is
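$$\frac{1}{2}\,a \sum_{x\in\Lambda_\delta} \delta^d \varphi_x^2 + \frac{\lambda}{4}\sum_{x\in\Lambda_\delta} \delta^d\varphi_x^4.$$
The authors reverse the usual procedure and first take the infinite-volume limit. Thus, let $\langle F\rangle_{\Lambda_\delta}$ denote expectation with respect to the measure (52).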
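Known inequalities and uniform bounds [see Brydges et al. (1983)] imply that the infinite-volume limit $\langle F\rangle^{(\delta)}$ of $\langle F\rangle_{\Lambda_\delta}$ is well defined (54); note that we keep $\delta$ fixed. In our language this means that we pass to a hyperfinite lattice $\Gamma_\delta$ with finite spacing $\delta$ and that we have
$$\langle F\rangle^{(\delta)} = \mathrm{st}\,\langle F\rangle_{\Gamma_\delta}. \tag{55}$$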
It remains to take the continuum limit, i.e., to choose $\delta$ positive infinitesimal. To obtain something finite and nontrivial we need strong and uniform inequalities. As above, where we needed a special choice of $\lambda_\delta$ [see (33)] in order to cancel certain infinities when passing to the continuum limit ($\delta \approx 0$), we also need here to introduce suitable mass counterterms. In fact, choose
$$a(\delta) = \begin{cases} m_0^2 + \delta m_1^2(\delta), & d = 2,\\ m_0^2 + \delta m_1^2(\delta) + \delta m_2^2(\delta), & d = 3,\end{cases} \tag{56}$$
where $\delta m_1^2 = -3\lambda C^{(\delta)}(0)$ and $\delta m_2^2 = 6\lambda^2 \int C^{(\delta)}(0 - z)^3\, d^dz$, and $C^{(\delta)}(x, y)$ is the free two-point function in the infinite-volume limit, i.e., obtained by setting $\lambda = 0$ and $a = m_0^2$. Let $S^{(\delta)}(x, y) = \langle\varphi_x\varphi_y\rangle^{(\delta)}$ be the interacting two-point function in the infinite-volume limit, and let $|||\cdot|||$ denote the norm
$$|||f||| = \|f\|_1 + \|f\|_\infty. \tag{57}$$
The key to the successful passage to the continuum limit is the following inequality, which crucially depends upon the "renormalization" introduced in (56): there exist universal constants $\lambda_0 > 0$ and $c$ such that if $0 \le \lambda \le \lambda_0$, then
$$|||S^{(\delta)} - C^{(\delta)}||| \le c\lambda^2; \tag{58}$$
universal here, of course, means independent of $\delta$. As soon as this inequality is established the usual arguments allow us to complete the construction. In our setting we shall arrive at a nontrivial field theory based on the hyperfinite lattice $\Gamma_\delta$ with $\delta \approx 0$.

We conclude with a few remarks on (58). Three main ingredients go into the proof: (i) the field equation, which in the setting of $\Gamma_\delta$ is obtained by transfer from the finite-volume case; (ii) the so-called skeleton inequalities, which are proved by using the random-walk representation of a classical spin system [see Brydges et al. (1983)]; and (iii) the continuity of $|||S^{(\delta)} - C^{(\delta)}|||$ in the parameter $\lambda$.
There is no success story to recount in the case $d = 4$. Glimm and Jaffe (1981) have obtained some positive results. There have also been some speculations that $\varphi^4_4$ with a positive $\lambda$ is trivial, i.e., that the associated measure collapses to the free measure; the possibility of a nontrivial $\varphi^4_4$ with $\lambda < 0$ has also been discussed; see Albeverio et al. (1982, 1984a) and the references therein. In Section 7.5 we shall return to the discussion of the $\varphi^4_4$ model in connection with its "polymer representation."

D. Some Concluding Remarks on Gauge Fields
There has been great interest, both in mathematics and in physics, in gauge fields; e.g., see Frohlich (1980) and Seiler (1982) and references therein. Many physicists have expressed the hope that quantized gauge fields might provide the appropriate physical description of interactions in elementary particle physics, and mathematicians have studied gauge fields as possible candidates for nontrivial models of relativistic quantum fields, as well as for the rich mathematical structure they exhibit at the classical level. In this concluding part we shall indicate how one can describe in nonstandard terms the continuum gauge field in two dimensions, starting from the model in Sections 7.4.B and 7.4.C of a hyperfinite lattice with infinitesimal lattice spacing. Moreover, we get a construction of "Markov cosurfaces" and "Markov covector fields" which are natural extensions of Abelian gauge fields to arbitrary space-time dimension.

Let us start by recalling some of the standard theory. Let $\Gamma$ be a bounded domain in $\mathbb{R}^d$ and $\delta > 0$; as usual we set
$$\Gamma_\delta = \Gamma \cap \mathbb{Z}^d_\delta.$$
A cell $\Delta$ of $\Gamma_\delta$ is the hypercube obtained by translating $\{x \in \mathbb{R}^d \mid 0 \le x_i \le \delta\}$ by $n\delta$ for some $n \in \mathbb{Z}^d$. The cell as a point set is, of course, a subset of $\mathbb{R}^d$. It is determined by a set of vertices that are constrained to lie in $\Gamma_\delta$. We denote by $F_1, F_2, \ldots, F_{2d}$ the faces of $\Delta$, considered as $(d-1)$-dimensional hypersurfaces, oriented in such a way that the basis vector orthogonal to a face points outward. We then have, as sets, $\partial\Delta = \bigcup_{i=1}^{2d} F_i$, with $\partial\Delta$ denoting the boundary of $\Delta$. We shall call two cells adjacent if they have a common face, which, of course, will have opposite orientations as a face of the respective cells. For $d = 2$ we define the product $F_1F_2$ of two faces such that the endpoint of $F_1$ is the starting point of $F_2$ as the set $F_1 \cup F_2$ with the orientation being the one inherited from $F_1$ and $F_2$. In the same way, for $d > 2$ we define the product $F_1F_2$ for faces for which $F_1 \cap F_2$ is $(d-2)$-dimensional and on which the orientations inherited from $F_1$ and from $F_2$ are opposite to each other.
We define recursively
$$F_1 \cdot \ldots \cdot F_{n+1} = (F_1 \cdot \ldots \cdot F_n) F_{n+1}$$
and we let $\Sigma = \Sigma_{\Gamma_\delta}$ be the set of all such products. To each element $S \in \Sigma$ we associate an element $C(S)$ in a fixed group $G$, supposed to be Abelian if $d > 2$, but not necessarily Abelian for $d = 2$. We assume that the map $C$ satisfies
$$C(S_1S_2) = C(S_1)C(S_2)$$
whenever $S_1S_2$ is a well-defined element of $\Sigma$; the product on the right-hand side is the group operation in $G$. Further, we assume
$$C(S^{-1}) = C(S)^{-1},$$
where $S^{-1}$ denotes the same set as $S$ but with opposite orientation. We call the map $C$ a cosurface on $\Gamma_\delta$ with values in $G$; or, in different terminology, $C$ is a $G$-valued $(d-1)$-cochain. We denote by $\Gamma_G = \Gamma_{G,\Gamma_\delta}$ the set of all cosurfaces on $\Gamma_\delta$.

REMARK. In the case $d = 2$ the cells are squares having vertices in points of the lattice $\Gamma_\delta$ which are nearest or next-nearest neighbors. The faces are segments joining nearest-neighbor points. In the terminology of lattice gauge field theory the cells are "plaquettes" and the faces are "links" or "bonds."
Having established the basic geometric terminology, we now endow the group $G$ with a measurable structure. We may then introduce a notion of a stochastic cosurface as a measurable map from some probability space $(\Omega, \mathcal{A}, P)$ into $\Gamma_G$. For $d = 1$ there is no genuine product of faces and $\Sigma$ reduces to the set of lattice points $\Gamma_\delta$; thus in this case a stochastic cosurface is simply a $G$-valued discrete stochastic process indexed by $\Gamma_\delta$. We shall only marginally touch upon the notion of stochastic cosurface in these remarks; see Albeverio et al. (1984b, 1985) for a more extensive discussion.

We are now all set to introduce the basic notion of interaction, appropriate for lattice gauge field theories. Let $U$ be a real-valued function on $G$ such that $U(gh) = U(hg)$ for all $h, g \in G$. Let $\beta$ be a real constant. An interaction on $\Gamma_\delta$ given by $U$ and $\beta$ is a family of functions $\{W_\Lambda\}$, where $\Lambda$ runs over all finite unions of cells of $\Gamma_\delta$ and
$$W_\Lambda(C) \equiv -\beta \sum_{\Delta\subset\Lambda} U(C(\partial\Delta)),$$
the sum being over all cells $\Delta$ of $\Lambda$ and where $C$ is a cosurface in $\Gamma_{G,\Lambda}$;
actually, $W_\Lambda(C)$ depends only on cosurfaces of the form $C(\partial\Delta)$, $\Delta \subset \Lambda$. Also note that because of our assumption on $U$, $U(C(\partial\Delta)) = U(C(F_1)\cdots C(F_{2d}))$ is independent of the order in which we are given the faces $F_1, \ldots, F_{2d}$ of $\Delta$, provided that the order is compatible with the orientation. The group $G$ has a measure structure; more specifically, we assume that $G$ has a $G$-invariant probability measure $dg$, and we further assume that $\beta$, $U$ are such that the integral
$$c(\beta) \equiv \int_G e^{\beta U(g)}\,dg$$
exists. We can then introduce a probability measure $d\mu_\Lambda$ on $\Gamma_{G,\Lambda}$ by the formula
$$d\mu_\Lambda(C) = Z_\Lambda^{-1} \exp[-W_\Lambda(C)] \prod_{\Delta\subset\Lambda}\,\prod_{F\in\partial\Delta} dC(F),$$
where
$$Z_\Lambda \equiv \int \exp[-W_\Lambda(C)] \prod_{\Delta\subset\Lambda}\,\prod_{F\in\partial\Delta} dC(F),$$
the product being over all faces $F$ of cells $\Delta$ in $\Lambda$. Since $U(C(\partial\Delta))$ is invariant under cyclic permutations of the faces $F_1, \ldots, F_{2d}$ of $\partial\Delta$ and $W_\Lambda(C)$ is invariant under permutation of the cells, we see that the measure is independent of the order of the cells and of the order in which the faces in a cell are taken, provided this order is such that $\prod_{F\in\partial\Delta} C(F) = C(\partial\Delta)$.
REMARK. Some people may be confused by the notation $dC(F)$, but we may think of $C(F)$ as a variable taking values in the group $G$ and of $C(\partial\Delta)$ and, more generally, $C(S)$, $S \in \Sigma$, as composite terms built up according to the "defining relations" $C(S_1S_2) = C(S_1)C(S_2)$ and $C(S^{-1}) = C(S)^{-1}$.

For further reference let us note that
$$Z_\Lambda = c(\beta)^{N_\Lambda},$$
where $N_\Lambda$ is the number of cells in $\Lambda$. Let $f$ be a function on $\Gamma_{G,\Lambda}$ of the form
$$f(C) = f_0(g_1, \ldots, g_l),$$
where $f_0$ is a real-valued bounded continuous or nonnegative measurable function on $G^l$ and where each $g_j = C(F_j)$, with $F_j$ a face of some cell $\Delta$
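in $\Lambda$ such that $F_j \cap \partial\Lambda = \emptyset$. Thus $f$ is "supported" in the interior of $\Lambda$, and we write for simplicity $\operatorname{supp} f = (F_1, \ldots, F_l)$. For such $f$ we obtain the following formula for the expectation of $f$ with respect to the measure $\mu_{\Gamma_\delta}$ (recalling that $\Gamma_\delta$ itself is a finite union of cells):
$$E_{\mu_{\Gamma_\delta}}(f) = \int f \prod_{\Delta\subset\operatorname{supp} f}\left[\frac{e^{\beta U(C(\partial\Delta))}}{c(\beta)}\right] \prod_{\Delta\subset\operatorname{supp} f}\,\prod_{F\in\partial\Delta} dC(F). \tag{59}$$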
By transfer we can also let $\Gamma_\delta$ be a hyperfinite lattice with infinitesimal lattice spacing extending beyond standard space $\mathbb{R}^d$. This gives us a well-defined internal quantity $E_{\mu_{\Gamma_\delta}}(f)$ and an associated Loeb measure $L(\mu_{\Gamma_\delta})$, which we may take as our starting point for constructing models of gauge field theories.

REMARK 1. $\mu_{\Gamma_\delta}$ is a "Gibbs state" for the interaction $W$ on the hyperfinite lattice $\Gamma_\delta$, in the same spirit as the hyperfinite Gibbs states constructed in Section 7.2.

REMARK 2. For $d = 2$, $U(\gamma) = \chi(\gamma)$, where $\gamma \in G$, $G$ a compact Lie group, and $\chi$ is some character of an irreducible unitary representation of $G$, we can show that $L(\mu_{\Gamma_\delta})$ in the hyperfinite case realizes the probability measure describing a lattice gauge field theory (pure lattice Yang-Mills theory) on the lattice with spacing $\delta$. The Gibbs state given by $L(\mu_{\Gamma_\delta})$ will be invariant under gauge transformations in the sense that if $F$ is the face with vertices $x, y \in \Gamma_\delta$ and $\gamma(z) = \gamma_z$ is any measurable map from $\Gamma_\delta$ to $G$, then $\mu_{\Gamma_\delta}$ is invariant under the map $C(F) \mapsto \gamma_x C(F)\gamma_y^{-1}$. In common terminology $\exp(\beta U(C(\partial\Delta)))$ is called a Wilson loop and thus $\mu_{\Gamma_\delta}$ gives the distribution of products of Wilson loops.

Let $G = U(1)$ and let $A_\mu$, $\mu = 1, 2$, be smooth mappings from $\mathbb{R}^2$ into the Lie algebra of $G$. This determines a 1-form $A = \sum_\mu A_\mu\,dx_\mu$; physically one looks upon $A_\mu$ as "gauge fields" with "field strengths" $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, $\mu, \nu = 1, 2$. The relation of the 1-form $A$ with our cosurface is as follows. Let $\Delta$ be the cell of $\Gamma_\delta$ with vertices $v_1 = \delta z$, $v_2 = \delta(z + e_1)$, $v_3 = \delta(z + e_1 + e_2)$, $v_4 = \delta(z + e_2)$, with $z \in \mathbb{Z}^2$ and $e_1, e_2$ an orthogonal basis of $\mathbb{Z}^2$. Then
$$\int_{\partial\Delta} A = \int_{\partial\Delta} \sum_\mu A_\mu\,dx_\mu = \delta A_1(\tfrac12(v_1 + v_2)) - \delta A_1(\tfrac12(v_3 + v_4)) + \delta A_2(\tfrac12(v_2 + v_3)) - \delta A_2(\tfrac12(v_1 + v_4)),$$
where $\partial\Delta = (v_1, v_2) \cup (v_2, v_3) \cup (v_3, v_4) \cup (v_4, v_1)$, and $(v_i, v_j)$ denotes the link between $v_i$, $v_j$, oriented positively from $v_i$ to $v_j$. Now define the cosurface $C$ by setting
$$C((v_i, v_j)) = \exp\Big[-ig\int_{(v_i, v_j)} A\Big],$$
where $g$ is some real parameter. Then
$$C(\partial\Delta) = \exp\Big[-ig\int_{\partial\Delta} A\Big].$$
For $\delta$ infinitesimal we see that giving a probability distribution to the cosurfaces $C(\partial\Delta)$ amounts to giving a distribution to the flow $\int_{\partial\Delta} A$ of the 1-form $A$ across $\partial\Delta$. As we shall see below, our choice $\mu_{\Gamma_\delta}$ of distribution for $C(\partial\Delta)$ yields a distribution for $\int_{\partial\Delta} A$ that coincides with the usual physical one in the cases considered in the theory of gauge fields.

REMARK 3. For $d > 2$ the usual Yang-Mills lattice field theory studies random variables associated with two-dimensional plaquettes, whereas we study random variables associated with $(d-1)$-dimensional surfaces. Thus only for $d = 2$ do our models include the usual Yang-Mills models.
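The relation between the 1-form $A$ and the plaquette variable $C(\partial\Delta)$ described in Remark 2 can be checked numerically on a single cell. The following Python sketch is only an illustration under stated assumptions: the components `A(1, x, y) = sin y` and `A(2, x, y) = x^2`, the value of `G_COUPLING`, and the choice of link variables $\exp(-ig\,\delta A_\mu)$ evaluated at link midpoints are hypothetical and not taken from the text. It multiplies the four link variables around $\partial\Delta$ and allows a comparison with $\exp(-ig\,\delta^2 F_{12})$ at the center of the cell.

```python
import cmath
import math

G_COUPLING = 1.0  # the real parameter g (illustrative value, an assumption)


def A(mu, x, y):
    """Hypothetical smooth components A_1, A_2 of the 1-form A on R^2."""
    return math.sin(y) if mu == 1 else x * x


def field_strength(x, y):
    """F_12 = d_1 A_2 - d_2 A_1 for the two components defined above."""
    return 2.0 * x - math.cos(y)


def link_variable(p, q, delta):
    """U(1) element attached to the oriented link (p, q).

    A positively oriented link in direction mu carries
    exp(-i g delta A_mu(midpoint)); the reversed link carries the inverse.
    """
    (px, py), (qx, qy) = p, q
    mx, my = 0.5 * (px + qx), 0.5 * (py + qy)
    if abs(qx - px) > abs(qy - py):      # horizontal link, mu = 1
        sign = 1.0 if qx > px else -1.0
        return cmath.exp(-1j * G_COUPLING * sign * delta * A(1, mx, my))
    sign = 1.0 if qy > py else -1.0      # vertical link, mu = 2
    return cmath.exp(-1j * G_COUPLING * sign * delta * A(2, mx, my))


def plaquette(z, delta):
    """C(boundary of the cell) for the cell whose lower-left vertex is delta*z."""
    zx, zy = z
    v1 = (delta * zx, delta * zy)
    v2 = (delta * (zx + 1), delta * zy)
    v3 = (delta * (zx + 1), delta * (zy + 1))
    v4 = (delta * zx, delta * (zy + 1))
    c = 1.0 + 0.0j
    for p, q in ((v1, v2), (v2, v3), (v3, v4), (v4, v1)):
        c *= link_variable(p, q, delta)
    return c
```

Since the midpoint rule is exact up to higher-order terms in $\delta$, `plaquette(z, delta)` should agree with `cmath.exp(-1j * G_COUPLING * delta**2 * field_strength(xc, yc))`, where `(xc, yc)` is the center of the cell, up to an error that is small compared with $\delta^2$.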
It is easy to write down the quantity $E_{\mu_{\Gamma_\delta}}(f)$; it is far more difficult to understand what it means. We shall try to gain some insight by discussing some examples. A unit cell $\Delta$ of $\Gamma_1$ will, by a process of subdivision, split into a family of smaller cells. To be specific, let $\delta = 2^{-n}$ for some $n \in \mathbb{N}$. In $\Gamma_\delta$ the cell $\Delta$ will be divided into $2^{dn}$ smaller cells $\Delta^n_k$, $k = 1, \ldots, 2^{dn}$. A face $F_i$ of $\Delta$ will be divided into $2^{(d-1)n}$ faces $F_{i,1}, \ldots, F_{i,2^{(d-1)n}}$, each $F_{i,j}$ being the face of some cell $\Delta^n_k$; but, of course, not every face of a cell $\Delta^n_k$ will be of the form $F_{i,j}$. We shall use $F^n_{k,j}$, $j = 1, \ldots, 2d$, to enumerate the faces of $\Delta^n_k$. Let us for simplicity consider functions $f$ on $\Gamma_{G,\Gamma_\delta}$ of the special form
$$f(C) = f_0(C(F_1), \ldots, C(F_{2d})),$$
where $f_0$ is a bounded real-valued continuous function on $G^{2d}$ and each $F_i$ is a face of $\Delta$ in $\Gamma_1$. To be precise, if $C$ and $C'$ are cosurfaces in $\Gamma_{G,\Gamma_\delta}$ satisfying
$$C(F_{i,1}) \cdots C(F_{i,2^{(d-1)n}}) = C'(F_{i,1}) \cdots C'(F_{i,2^{(d-1)n}})$$
for $i = 1, \ldots, 2d$, then $f(C) = f(C')$. Using formula (59) to calculate $E_{\mu_{\Gamma_\delta}}(f)$, we obtain
$$E_{\mu_{\Gamma_\delta}}(f) = \int f_0(C(F_1), \ldots, C(F_{2d}))\,\frac{q_\beta^{*(2^{dn})}(C(\partial\Delta))}{c(\beta)^{2^{dn}}}\,\prod_{i=1}^{2d} dC(F_i), \tag{60}$$
where $q_\beta(g) = \exp[\beta U(g)]$ for $g \in G$ and $q_\beta^{*k}$ means $k$-fold convolution of $q_\beta$ with itself. Notice that this calculation reduces the given integral $E_{\mu_{\Gamma_\delta}}(f)$ to an integral of fixed dimensionality $2d$, in which the $C$ occurring in the reduced integral can be interpreted as a cosurface in $\Gamma_1$.

A calculation always extends by transfer. Thus let $n \in {}^*\mathbb{N} - \mathbb{N}$ and take $\delta = 2^{-n}$. In a hyperfinite lattice $\Gamma_\delta$ the unit cell $\Delta$ is subdivided into a hyperfinite number of cells $\Delta^n_k$. We assume that the group $G$ is compact and carries a normalized Haar measure. Let $f$ be an internal function of the form
$$f(C) = {}^*f_0(C(F_1), \ldots, C(F_{2d})), \tag{61}$$
where $C$ is a cosurface on $\Gamma_\delta$ with values in ${}^*G$ and $f_0$ is a bounded continuous real-valued function on $G^{2d}$. Notice that in this case $C(F_i)$ is the hyperfinite product
$$C(F_i) = C(F_{i,1}) \cdots C(F_{i,2^{(d-1)n}}),$$
which as a group element belongs to ${}^*G$. Since $G$ is compact, $C(F_i)$ is nearstandard in ${}^*G$. We may thus conclude that $f$ as a function on $\Gamma_{{}^*G,\Gamma_\delta}$ is S-integrable with respect to the internal measure $\mu_{\Gamma_\delta}$ and that we have
$$E_{L(\mu_{\Gamma_\delta})}({}^\circ f) = \mathrm{st}(E_{\mu_{\Gamma_\delta}}(f)), \tag{62}$$
where, to emphasize, $f$ is of the form (61) and $L(\mu_{\Gamma_\delta})$ is the Loeb measure on $\Gamma_{{}^*G,\Gamma_\delta}$ associated with $\mu_{\Gamma_\delta}$.
But (62) is of limited use unless we can determine the quantity $\mathrm{st}(E_{\mu_{\Gamma_\delta}}(f))$. A general answer is difficult; let us take $G = U(1)$ with $U(g) = \operatorname{Re}\chi(g)$, where $\chi$ is the character $\chi(e^{i\varphi}) = e^{i\varphi}$, and where $e^{i\varphi}$, $\varphi \in [0, 2\pi)$, gives the natural parameterization of $G$. Choosing $n \in {}^*\mathbb{N} - \mathbb{N}$, a short calculation shows that with the choice $\beta = 2^{dn}/\sigma^2$,
$$\frac{q_\beta^{*(2^{dn})}(C(\partial\Delta))}{c(\beta)^{2^{dn}}} \approx Q_1(C(\partial\Delta)), \tag{63}$$
where the $C$ on the right-hand side can be considered as a (standard) cosurface on $\Gamma_1$ and $Q_1$ is the standard entity
$$Q_1(e^{i\varphi}) = e^{-\varphi^2/2\sigma^2}.$$
If we define $Q_t(e^{i\varphi}) = \exp(t(-\varphi^2/2\sigma^2))$, we see that $Q_t$ is a semigroup on $U(1)$; in fact, $Q_t$ is the semigroup describing Brownian motion with diffusion coefficient $\sigma$ on $U(1)$, which "explains" why we added $\sigma^2$ to the choice of $\beta$. From (60), (62), and (63), we now conclude that
$$E_{L(\mu_{\Gamma_\delta})}({}^\circ f) = \int f_0(C(F_1), \ldots, C(F_{2d}))\,Q_1(C(\partial\Delta)) \prod_{i=1}^{2d} dC(F_i),$$
where the integral on the right-hand side is standard and $f$ is given as in (61). We have thus added considerable "insight" to (62): knowledge of the Haar measure on $U(1)$ and the explicit form of the semigroup $Q_t$ gives us, in principle, full control of the expectation $E_{L(\mu_{\Gamma_\delta})}$.

As a further illustration let $\Lambda$ be the union in $\Gamma_1$ of three unit cells $\Delta^1$, $\Delta^2$, $\Delta^3$ such that $\Delta^1$ has a common face with both $\Delta^2$ and $\Delta^3$, but $\Delta^2$ and $\Delta^3$ have no common face. Let $f$ be a function depending only on faces $F \in \partial\Lambda$, i.e.,
$$f(C) = {}^*f_0\big((C(F))_{F\in\partial\Lambda}\big);$$
notice that this is in complete analogy with (61). If we remark that for cells $\Delta^1$ and $\Delta^2$ which are adjacent,
$$Q_1(C(\partial\Delta^1)) \cdot Q_1(C(\partial\Delta^2)) = Q_2(C(\partial(\Delta^1 \cup \Delta^2))),$$
using the semigroup property of $Q_t$ and understanding the product in the sense of convolution, we see that
$$E_{L(\mu_{\Gamma_\delta})}({}^\circ f) = \int f_0\big((C(F))_{F\in\partial\Lambda}\big)\,Q_{|\Lambda|}(C(\partial\Lambda)) \prod_{F\in\partial\Lambda} dC(F), \tag{64}$$
where $|\Lambda|$ denotes the Cartesian volume of $\Lambda$, i.e., in this case $|\Lambda| = 3$.

The reasoning behind (64) can be extended, even to the hyperfinite case. Let $\Lambda$ in $\Gamma_\delta$ be a bounded hyperfinite union of cells; i.e., we assume that the set of standard points of $\Lambda$ is bounded. In this case we may think of $\partial\Lambda$ as a hyperfinite approximation to a "nice" closed curve in $\mathbb{R}^d$ with an interior of finite volume; see Fig. 7.3. For technical reasons we have added two "curves" $S_1$ and $S_2$ so that the resulting partition of $\Gamma_\delta$, $\Lambda$, $B_1$, $B_2$ consists of three connected and simply connected pieces.
Figure 7.3
Let $K = \{\partial\Lambda, S_1, S_2\}$ and $D_K = \{\Lambda, B_1, B_2\}$. $K$ is an example of what is called a regular saturated complex and $D_K$ is the associated partition of the underlying space. Generalizing (64), we can introduce a probability measure $\nu_K^\delta$ associated with $K$ and $D_K$ by formula (65), where $Q_t$ in our examples is given by (63) but could, in principle, be any Markovian semigroup on $G$ satisfying $Q_t(gh) = Q_t(hg)$. We further assume that $Q_{|B_i|} = 1$ if the volume $|B_i|$ is hyperfinite but not finite. Then, generalizing (64), we may write
$$E_{\mu_{\Gamma_\delta}}(f) \approx E_{\nu_K^\delta}(f), \tag{66}$$
assuming that $f$ as a function depends only on the "curve" $\partial\Lambda$, i.e.,
$$f(C) = {}^*f_0\big((C(F))_{F\in\partial\Lambda}\big).$$
The approximation in (66) will be exact if we add the Loeb construction. Thus we have our final interpretation of (60) for $\delta$ infinitesimal in terms of the semigroup $Q_t$, the volume measure $|B|$, and the Haar measure on $G$. Notice that in the standard approach the measure $L(\nu_K^\delta)$ would have to be obtained by a projective limit construction.

The Loeb measure $L(\mu_{\Gamma_\delta})$ can be taken to be the underlying probability measure of a field $C(S)$ of canonical random variables attached to smooth $(d-1)$-dimensional hypersurfaces on $\mathbb{R}^d$. For $d = 2$ and $S = \partial\Lambda$, this random field coincides essentially, by Remark 2 and (63), with the flow $\int_S A$ of the gauge field $A$ across $S$. The gauge group was chosen here to be $G = U(1)$, but the same is true in other examples, e.g., $G = SU(2)$ or $G = \mathbb{Z}_2$ [see Albeverio et al. (1984b)]. Thus in these situations the random field $C(S)$ is a realization of the "continuum limit" of "lattice gauge fields." For general $d > 2$ and $G$ Abelian one can show (Albeverio et al., 1986) that the random field $C(S)$ has a global Markov property and satisfies the $d$-dimensional analog of the axioms for Euclidean gauge fields (Seiler, 1982). For $G = T^n$ the Markov cosurfaces $C(S)$ can also be looked upon as solutions of stochastic differential equations $DA = \xi$ for covector fields $A$, with $D$ the covariant derivative and $\xi$ an infinitely divisible generalized random field over $\mathbb{R}^d$. This introduces new and interesting mathematical structures, but it would take us too far afield to explore these here. We hope to have convinced the reader that the Loeb construction has once more been put to good use to give a "geometric" construction of the "continuum" case by exploiting the underlying hyperfinite "discrete" model.
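The convergence of the normalized convolution powers of $q_\beta$ behind (63) can be observed for a large finite number of cells. The sketch below is a hedged numerical illustration, not the hyperfinite argument: it works with Fourier coefficients on $U(1)$ (convolution with respect to normalized Haar measure becomes multiplication of coefficients), and it reads the limit $Q_1$ as the Gaussian heat kernel whose $k$-th Fourier coefficient is $\exp(-k^2\sigma^2/2)$. That normalization, the grid size, and the function names are assumptions made for the example.

```python
import math


def fourier_coeff(beta, k, grid=8192):
    """k-th Fourier coefficient of the normalized density proportional to
    exp(beta * cos(phi)) on U(1), computed by the periodic trapezoid rule."""
    num = 0.0
    den = 0.0
    for j in range(grid):
        phi = 2.0 * math.pi * j / grid
        # Subtracting beta in the exponent avoids overflow and cancels
        # in the ratio num / den.
        w = math.exp(beta * (math.cos(phi) - 1.0))
        num += w * math.cos(k * phi)
        den += w
    return num / den


def convolved_coeff(n_cells, sigma, k):
    """k-th Fourier coefficient of the n_cells-fold convolution of the
    normalized q_beta with itself, for the choice beta = n_cells / sigma**2."""
    beta = n_cells / sigma ** 2
    return fourier_coeff(beta, k) ** n_cells
```

For `n_cells` large, `convolved_coeff(n_cells, sigma, k)` should be close to `math.exp(-k * k * sigma ** 2 / 2)`; in the hyperfinite setting the difference becomes infinitesimal.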
7.5. FIELDS AND POLYMERS
In this final section we shall study the close connections which exist between quantum fields and polymer measures. Since their discovery by Symanzik (1969) in the mid-1960s, these relations have played an important part in quantum field theory and helped strengthen the already strong bonds between probability and mathematical physics. As we shall soon see, the basic ingredient in the theory is the representation of the square of the free field $\Phi$ as a Poisson random field of local times of Brownian loops. If we add an interaction to $\Phi$, this is reflected in the probabilistic representation as a perturbation involving intersections and self-intersections of the Brownian paths, and we are thus led directly to the problems we studied in Section 6.4. Toward the end of this section, we shall use this relationship to shed new light on $\Phi^4$ fields, especially fields with infinitesimal coupling constants.

A. Poisson Fields of Brownian Bridges
Throughout this section Poisson fields of Brownian loops and bridges will appear in various guises, and it is convenient to begin by studying them in an abstract setting. Given an internal, *-bounded subset $\Lambda$ of ${}^*\mathbb{R}^d$ and a positive $\delta \in {}^*\mathbb{R}$, let $\Lambda_\delta$ be the lattice
$$\Lambda_\delta = \{n\delta \in \Lambda \mid n \in {}^*\mathbb{Z}^d\}.$$
For each $i \in \Lambda_\delta$, we shall let
$$N_i = \{j \in \Lambda_\delta \mid |i - j| = \delta\}$$
denote the set of nearest neighbors of $i$ in $\Lambda_\delta$, and write $|N_i|$ for the number of elements in $N_i$; obviously, $|N_i| = 2d$ unless $i$ is on the boundary of $\Lambda_\delta$. The discrete Laplacian in $\Lambda_\delta$ with Neumann boundary conditions is given by
$$\Delta_\delta f(i) = \delta^{-2}\Big[\sum_{j\in N_i} f(j) - |N_i|\,f(i)\Big].$$
We shall be interested in Markov processes on $\Lambda_\delta$ with infinitesimal generator
$-\tfrac12\Delta_\delta + m^2$, where the "mass" $m$ is a positive real number. These processes will have a hyperdiscrete time line $T = \{k\,\Delta t \mid k \in {}^*\mathbb{N}\}$ for some positive infinitesimal $\Delta t$ that we keep fixed all through the section (in contrast to $\delta$, which we shall allow to be standard on some occasions and infinitesimal on others). We shall have to add a "trap" or "cemetery state" $0$ to $\Lambda_\delta$; hence our state space will be
$$\bar\Lambda_\delta = \Lambda_\delta \cup \{0\}$$
(the trap $0$ plays exactly the same role here as did $s_0$ in Chapter 5). To describe the Markov process more precisely, assume that it is in a state $i \in \Lambda_\delta$ at time $t$. At time $t + \Delta t$, it will then be in the trap $0$ with probability $m^2\,\Delta t$; it will be in each one of the neighboring states $j \in N_i$ with probability $\Delta t/2\delta^2$; and it will remain in $i$ with probability $1 - ((|N_i|/2\delta^2) + m^2)\,\Delta t$. For this to make sense we need $((|N_i|/2\delta^2) + m^2)\,\Delta t \le 1$, i.e.,
$$\delta \ge \sqrt{\frac{|N_i|\,\Delta t}{2(1 - m^2\,\Delta t)}},$$
and we shall always assume that this is the case. If the process is in the trap $0$ at time $t$, it will remain in $0$ at all later times. Let $X : \Omega \times T \to \Lambda_\delta \cup \{0\}$ be a hyperfinite Markov process which fits the description we have just given. Computing its infinitesimal generator $A$, we get
$$Af(i) = -\frac12\,\delta^{-2}\Big[\sum_{j\in N_i} f(j) - |N_i|\,f(i)\Big] + m^2 f(i).$$
Thus $A = -\tfrac12\Delta_\delta + m^2$, exactly as we wanted. Note that when $\delta$ is infinitesimal, then in the interior of $\Lambda_\delta$, the standard part of $X$ behaves like a Brownian motion which is killed at rate $m^2$. Observe also that if $X$ is uniformly distributed at time zero, say
$$P\{X(0) = i\} = \delta^d,$$
then $X$ is uniformly distributed at all later times and
$$P\{X(t) = i\} = \delta^d(1 - m^2\,\Delta t)^{t/\Delta t}.$$
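The invariance of the uniform distribution can be checked directly on a small standard lattice. The sketch below is a plain Python illustration, not the hyperfinite construction itself: it takes $d = 1$, finitely many sites, and standard values of $\delta$, $\Delta t$, and $m$ (all illustrative assumptions), and the helper names `step` and `evolve` are hypothetical. Each step moves mass $\Delta t/2\delta^2$ to every neighbor inside the lattice (Neumann boundary), sends mass $m^2\Delta t$ to the trap, and keeps the rest in place.

```python
def step(p, delta, dt, mass):
    """One time step of the killed nearest-neighbour walk on a 1-d lattice
    with Neumann boundary. p[i] is the probability of being at site i;
    the probability lost from p sits in the trap."""
    n = len(p)
    hop = dt / (2.0 * delta ** 2)   # probability of jumping to one neighbour
    kill = mass ** 2 * dt           # probability of falling into the trap
    new = [0.0] * n
    for i in range(n):
        nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
        stay = 1.0 - (len(nbrs) * hop + kill)
        assert stay >= 0.0, "need (|N_i|/(2 delta^2) + m^2) dt <= 1"
        new[i] += stay * p[i]
        for j in nbrs:
            new[j] += hop * p[i]
    return new


def evolve(p, steps, delta, dt, mass):
    """Apply `step` repeatedly and return the distribution on the lattice."""
    for _ in range(steps):
        p = step(p, delta, dt, mass)
    return p
```

Starting from the uniform distribution, every entry of `evolve(p, k, delta, dt, mass)` should equal the initial value multiplied by `(1 - mass**2 * dt) ** k`, because the jump probabilities are symmetric and boundary sites simply jump less often.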
It was to achieve this effect that we chose to work with Neumann boundary conditions.

If $i \in \Lambda_\delta$, we let as usual $P_i$ be the probability measure of the process started at $i$. Thus if $P$ is the probability measure of the process started with uniform distribution $P\{X(0) = j\} = \delta^d$, we have $P_i(A) = \delta^{-d}P\{\omega \in A \mid X(\omega, 0) = i\}$. If we only count those paths which are not yet trapped at time $t$, we are led to the probability measures
$$P_i^t(A) = \frac{P_i\{\omega \in A \mid X(\omega, t) \ne 0\}}{(1 - m^2\,\Delta t)^{t/\Delta t}} = \frac{P\{\omega \in A \mid X(\omega, 0) = i \wedge X(\omega, t) \ne 0\}}{\delta^d(1 - m^2\,\Delta t)^{t/\Delta t}}.$$
Given two elements $i, j \in \Lambda_\delta$ which are connected by at least one path of length $t$, we shall use the measure
$$P^t_{i,j}(A) = \frac{P_i^t\{\omega \in A \mid X(\omega, t) = j\}}{P_i^t\{\omega \mid X(\omega, t) = j\}}$$
to discuss the set of all such Brownian bridges. If there is no such connection between $i$ and $j$, we let $P^t_{i,j}$ be the point measure on the constant path $X(\omega, t) = 0$. To denote expectations with respect to $P_i$, $P_i^t$, and $P^t_{i,j}$, we shall use $E_i$, $E_i^t$, and $E^t_{i,j}$, respectively.

So far we have only discussed the second half of our announced topic, "Poisson fields of Brownian bridges." Turning to the Poisson fields, we let $\alpha$ be an internal measure on the space $\Lambda_\delta^2 \times T$. If $\Sigma$ is the set of all internal maps
$$\sigma : \Lambda_\delta^2 \times T \to \{0, 1\},$$
$\alpha$ induces a measure $Q$ on $\Sigma$ as follows: let $q_{i,j,t}(\sigma) = \alpha(i,j,t)$ if $\sigma(i,j,t) = 1$ and $q_{i,j,t}(\sigma) = 1 - \alpha(i,j,t)$ otherwise, and put
$$Q(\sigma) = \prod q_{i,j,t}(\sigma), \tag{1}$$
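where the product is over all $(i, j, t) \in \Lambda_\delta^2 \times T$. It is easy to check that $(\Sigma, Q)$ is a Poisson random field with parameter $\alpha$ in the following sense.

7.5.1. LEMMA. Assume that $\alpha$ is nonatomic, i.e., $\alpha\{x\} \approx 0$ for all $x \in \Lambda_\delta^2 \times T$. If $C \subset \Lambda_\delta^2 \times T$ is Loeb measurable with $L(\alpha)(C) < \infty$, then for all $m \in \mathbb{N}$
$$L(Q)\Big\{\sigma \Bigm| \sum_{x\in C}\sigma(x) = m\Big\} = e^{-L(\alpha)(C)}\,\frac{L(\alpha)(C)^m}{m!}.$$

PROOF. It is clearly sufficient to show that if $C$ is internal and has finite but noninfinitesimal measure, then
$$Q\Big\{\sigma \Bigm| \sum_{x\in C}\sigma(x) = m\Big\} \approx e^{-\alpha(C)}\,\frac{\alpha(C)^m}{m!}.$$
Pick a subset $C_0 = \{x_1, x_2, \ldots, x_m\}$ of $C$ of cardinality $m$. The probability that $C_0$ is the subset of $C$ where $\sigma$ equals one is
$$\alpha(x_1)\alpha(x_2)\cdots\alpha(x_m)\prod_{x\in C - C_0}(1 - \alpha(x)).$$
Since $\alpha$ is nonatomic
$$\prod_{x\in C - C_0}(1 - \alpha(x)) = \exp\Big[\sum_{x\in C - C_0}\ln(1 - \alpha(x))\Big] \approx \exp\Big[-\sum_{x\in C - C_0}\alpha(x)\Big] \approx e^{-\alpha(C)},$$
where we have used the Taylor expansion of $\ln(1 - \alpha(x))$. The probability that $\sum_{x\in C}\sigma(x) = m$ is thus infinitely close to
$$e^{-\alpha(C)}\sum \alpha(x_1)\cdots\alpha(x_m),$$
where the sum is over all subsets $C_0 = \{x_1, \ldots, x_m\}$ of $C$ of cardinality $m$. Since each such subset can be ordered in $m!$ different ways, we get
$$\frac{e^{-\alpha(C)}}{m!}\sum \alpha(x_1)\cdots\alpha(x_m)$$
if we sum over all $m$-tuples $(x_1, \ldots, x_m)$ with distinct coordinates instead. However, since $\alpha$ is nonatomic and $C$ has finite measure, the result of allowing repeated coordinates is just to change the sum by an infinitesimal amount, and thus
$$Q\Big\{\sigma \Bigm| \sum_{x\in C}\sigma(x) = m\Big\} \approx \frac{e^{-\alpha(C)}}{m!}\Big(\sum_{x\in C}\alpha(x)\Big)^m = e^{-\alpha(C)}\,\frac{\alpha(C)^m}{m!},$$
which proves the lemma.

We shall now combine the Poisson field and the Brownian bridges in one structure. Let
$$\Theta = \Sigma \times \Omega^{\Lambda_\delta^2 \times T},$$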
where as usual $\Omega^{\Lambda_\delta^2 \times T}$ is the set of all internal maps from $\Lambda_\delta^2 \times T$ to $\Omega$. Let $\Xi$ be the internal product measure on $\Theta$ obtained by using the measure $Q$ on $\Sigma$ and the measure $P^t_{i,j}$ on the $(i,j,t)$-th component of the product. If $g : \Lambda_\delta \to {}^*\mathbb{R}_+$ is an internal function, define a random variable $T(g) : \Theta \to {}^*\mathbb{R}_+$ by
$$T(g)(\sigma, \omega) = \sum_{(i,j,t)\in\Lambda_\delta^2\times T} \sigma(i,j,t) \sum_{s=0}^{t} g(X(s, \omega_{i,j,t}))\,\Delta t, \tag{2}$$
where $\omega_{i,j,t}$ is the $(i,j,t)$-th component of $\omega$. We use the convention that $g(0) = 0$. We shall refer to $T$ as the Poisson field of Brownian bridges induced by $\alpha$.

It is easy to paraphrase the way in which $T$ operates. Each time we choose an element $\theta = (\sigma, \omega)$ in $\Theta$, the first component $\sigma$ picks out a set of points $(i_1, j_1, t_1), \ldots, (i_n, j_n, t_n)$ in $\Lambda_\delta^2 \times T$, and the second component $\omega$ determines a set of Brownian bridges connecting the $i_k$'s to the $j_k$'s in time $t_k$. The value of $T(g)$ is just the sum of the integrals of $g$ over these paths.
The fact that $\sigma$ is Poisson distributed makes it easy to compute the Laplace transform of $T(g)$:

7.5.2. LEMMA.
Let g be a non-negative, internal function such that
$$\sum_{(i,j,t)} \ln\big[1 + \alpha(i,j,t)\,E^t_{i,j}(\exp(-G_{i,j,t}) - 1)\big]$$
Using Taylor’s formula, we turn this into
where the remainder term is negative, and
But $G_{i,j,t} \le \|g\|_\infty t$, and thus the right-hand side of this expression is less than
$$\frac12\sum_{(i,j,t)}(1 - \exp(-t\|g\|_\infty))^2\,\alpha(i,j,t)^2,$$
which is infinitesimal by assumption. This proves the lemma.

We shall use this result to show that with the appropriate choice of the measure $\alpha$, the Poisson field $T$ has the same Laplace transform as the square of the free field on $\Lambda_\delta$, and that these two quantities can thus be identified. If the reader wonders why we have used the somewhat curious condition (3) in Lemma 7.5.2 and not, for example, the slightly stronger but more natural condition
$$\sum \alpha(i,j,t)^2 \approx 0,$$
it is because the representation theorem we just mentioned needs the stronger condition.

REMARK. The idea behind the hyperfinite approach to Poisson random fields is by no means new; in fact, the very first application of Loeb measures in probability theory was a similar construction of Poisson processes in Loeb's (1976) original paper [see also Stroyan and Bayod (1985)].
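The Poisson behavior asserted in Lemma 7.5.1 can be seen numerically for a large finite index set with small weights. The following Python sketch is an illustration under stated assumptions (a standard finite set in place of the hyperfinite one, and hypothetical function names): it computes the exact law of the number of points $x$ with $\sigma(x) = 1$ when the $\sigma(x)$ are independent with $P\{\sigma(x) = 1\} = \alpha(x)$, for comparison with the Poisson law of parameter $\alpha(C) = \sum_x \alpha(x)$.

```python
import math


def count_distribution(probs, m_max):
    """Exact law of the number of successes among independent Bernoulli
    trials with success probabilities `probs`, for counts 0..m_max."""
    dist = [1.0] + [0.0] * m_max
    for a in probs:
        # Update from high counts to low so dist[m - 1] is still the old value.
        for m in range(m_max, 0, -1):
            dist[m] = dist[m] * (1.0 - a) + dist[m - 1] * a
        dist[0] *= 1.0 - a
    return dist


def poisson_pmf(lam, m):
    """Poisson probability of the value m for parameter lam."""
    return math.exp(-lam) * lam ** m / math.factorial(m)
```

When every individual weight is small, `count_distribution(probs, m_max)[m]` should be close to `poisson_pmf(sum(probs), m)`; the discrepancy is controlled by the sum of the squared weights, which is the finite counterpart of the nonatomicity assumption.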
B. The Square of the Free Field as a Local Time Functional
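As we now turn to lattice fields, our first task is to compute the Laplace transform of the square of the free field in order to establish the identification referred to above. But before we can begin our calculations, we need to fix the terminology. Let $D_{\Lambda_\delta}$ be the kernel or matrix of $-\tfrac12\Delta_\delta + m^2$ as an operator from $l^2(\Lambda_\delta, \delta^d)$ to $l^2(\Lambda_\delta, \delta^d)$, i.e.,
$$\big(-\tfrac12\Delta_\delta + m^2\big)g(i) = \sum_{j\in\Lambda_\delta} D_{\Lambda_\delta}(i,j)\,g(j)\,\delta^d.$$
Denote the inverse matrix by $C_{\Lambda_\delta}$, and observe that $\delta^{-2d}C_{\Lambda_\delta}$ is the kernel of $(-\tfrac12\Delta_\delta + m^2)^{-1}$. Let $\{\tilde\Phi^N_\delta(i)\}_{i\in\Lambda_\delta}$ be a Gaussian random vector with mean zero and covariance matrix $C_{\Lambda_\delta}$, and let $\Phi^N_\delta = \delta^{-d}\tilde\Phi^N_\delta$ be its density with respect to the measure $\delta^d$ on $\Lambda_\delta$. Recall from the last section that $\Phi^N_\delta$ is the free lattice field of mass $m$ on $\Lambda_\delta$ with Neumann boundary conditions. We introduce the square of the free field $\Psi^N_\delta(i) = \Phi^N_\delta(i)^2$, and write
$$\Phi^N_\delta(g) = \sum_{i\in\Lambda_\delta} g(i)\,\Phi^N_\delta(i)\,\delta^d, \qquad \Psi^N_\delta(g) = \sum_{i\in\Lambda_\delta} g(i)\,\Psi^N_\delta(i)\,\delta^d$$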
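for all internal functions $g$. It will be useful to keep in mind that
$$E(\Phi^N_\delta(f)\,\Phi^N_\delta(g)) = \big(f, (-\tfrac12\Delta_\delta + m^2)^{-1}g\big),$$
where the inner product is in $l^2(\Lambda_\delta, \delta^d)$. We are now ready to compute $E\exp(-\tfrac12\Psi^N_\delta(f))$ (the factor $\tfrac12$ is included for technical convenience) for all positive, internal functions $f$. By definition of $\{\tilde\Phi^N_\delta\}$,
$$E\exp(-\tfrac12\Psi^N_\delta(f)) = (2\pi)^{-|\Lambda_\delta|/2}\,\mathrm{Det}(C_{\Lambda_\delta})^{-1/2}\int \exp\Big(-\tfrac12\,\delta^{-d}\sum_{i\in\Lambda_\delta} f(i)\,q_i^2\Big)\exp\big(-\tfrac12(D_{\Lambda_\delta}q, q)\big)\,dq.$$
Using the fact that $D_{\Lambda_\delta} = C_{\Lambda_\delta}^{-1}$ and that
$$\int_{\mathbb{R}^n}\exp(-\tfrac12(Aq, q))\,dq = (2\pi)^{n/2}\,\mathrm{Det}(A)^{-1/2} \tag{4}$$
for all positive, symmetric matrices $A$, we get
$$E\exp(-\tfrac12\Psi^N_\delta(f)) = \mathrm{Det}(D_{\Lambda_\delta})^{1/2}\,\mathrm{Det}(D_{\Lambda_\delta} + \delta^{-d}f)^{-1/2}. \tag{5}$$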
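Formulas (4) and (5) can be sanity-checked in the smallest nontrivial case. The sketch below is a hedged illustration only: it uses a standard two-site system, an arbitrary positive symmetric matrix in place of $D_{\Lambda_\delta}$, absorbs the lattice factor $\delta^{-d}$ into the diagonal perturbation, and evaluates the Gaussian integrals by a plain Riemann sum. The matrix entries and function names are assumptions chosen for the example.

```python
import math


def gaussian_integral(a11, a12, a22, half_width=10.0, h=0.05):
    """Riemann sum of exp(-(A q, q) / 2) over a square grid, for the
    symmetric positive matrix A = [[a11, a12], [a12, a22]]."""
    n = int(round(half_width / h))
    total = 0.0
    for i in range(-n, n + 1):
        x = i * h
        for j in range(-n, n + 1):
            y = j * h
            total += math.exp(
                -0.5 * (a11 * x * x + 2.0 * a12 * x * y + a22 * y * y)
            )
    return total * h * h


def det2(a11, a12, a22):
    """Determinant of the symmetric 2 x 2 matrix [[a11, a12], [a12, a22]]."""
    return a11 * a22 - a12 * a12
```

With `D = (2.0, -0.5, 1.5)` and a diagonal perturbation `(f1, f2)`, the ratio `gaussian_integral(2.0 + f1, -0.5, 1.5 + f2) / gaussian_integral(2.0, -0.5, 1.5)` plays the role of the left-hand side of (5) and should match `math.sqrt(det2(2.0, -0.5, 1.5) / det2(2.0 + f1, -0.5, 1.5 + f2))`.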
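To compute these determinants, we first translate the problem into the language of operators on $l^2(\Lambda_\delta, \delta^d)$. If we write $H_\delta$ for $-\tfrac12\Delta_\delta + m^2$, we have
$$\mathrm{Det}(D_{\Lambda_\delta}) = \delta^{-d|\Lambda_\delta|}\exp[\operatorname{tr}\log H_\delta]$$
and
$$\mathrm{Det}(D_{\Lambda_\delta} + \delta^{-d}f) = \delta^{-d|\Lambda_\delta|}\exp[\operatorname{tr}\log(H_\delta + f)],$$
where tr denotes the trace of an operator. Thus
$$E\exp(-\tfrac12\Psi^N_\delta(f)) = \exp\{-\tfrac12\operatorname{tr}[\log(H_\delta + f) - \log(H_\delta)]\}. \tag{6}$$
Our next task is to get rid of the logarithms. Applying the well-known series expansion
$$\log(x) = -\sum_{k=1}^{\infty}\frac1k(1 - x)^k$$
to a positive operator $A\,\Delta t$, we get $\log(A) + \log(\Delta t) = \log(A\,\Delta t) = -\sum_{k=1}^{\infty}(1/k)(I - A\,\Delta t)^k$. Hence
$$E\exp(-\tfrac12\Psi^N_\delta(f)) = \exp\Big(\frac12\operatorname{tr}\sum_{k=1}^{\infty}\frac1k\big[(I - (H_\delta + f)\,\Delta t)^k - (I - H_\delta\,\Delta t)^k\big]\Big). \tag{7}$$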
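To compute the trace, we shall use the Feynman-Kac formula and the orthonormal basis $\{e_i\}_{i\in\Lambda_\delta}$, where $e_i(j) = \delta^{-d/2}\delta_{ij}$. In order to have the conditions of the hyperfinite Feynman-Kac formula satisfied, it will be convenient to assume that $\|f\|_\infty \le (\ln(1/\Delta t))^{1/2}$. By Corollary 5.3.12, we get
$$\Big\langle\sum_{k=1}^{\infty}\frac1k[I - (H_\delta + f)\,\Delta t]^k e_i, e_i\Big\rangle = \sum_{k=1}^{\infty}\frac1k\,E\Big\{e_i(X(k\,\Delta t))\,e_i(X(0))\exp\Big(-\sum_{j=0}^{k} f(X(j\,\Delta t))\,\Delta t\Big)\Big\},$$
where $X$ is the Markov process generated by $H_\delta$. In terms of the conditional measures $P_i^t$, $P^t_{i,j}$, this can be rewritten as
$$\Big\langle\sum_{k=1}^{\infty}\frac1k[I - (H_\delta + f)\,\Delta t]^k e_i, e_i\Big\rangle = \sum_{t=\Delta t}^{\infty}\frac1t(1 - m^2\,\Delta t)^{t/\Delta t}\,P_i^t\{X(t) = i\}\,E^t_{i,i}\exp\Big(-\sum_{s=0}^{t} f(X(s))\,\Delta t\Big)\Delta t.$$
When $t$ is noninfinitesimal, $P_i^t\{X(t) = j\}$ is of order of magnitude $\delta^d$, and it is convenient to rescale $P_i^t$ by introducing
$$p(i, j, t) = \delta^{-d}P_i^t\{X(t) = j\}.$$
Note that $p(i, j, t)$ is nothing but the "kernel" of the discrete heat equation in $\Lambda_\delta$ with Neumann boundary conditions. We now get
$$\Big\langle\sum_{k=1}^{\infty}\frac1k[I - (H_\delta + f)\,\Delta t]^k e_i, e_i\Big\rangle = \sum_{t=\Delta t}^{\infty}\frac1t(1 - m^2\,\Delta t)^{t/\Delta t}\,p(i, i, t)\,E^t_{i,i}\exp\Big(-\sum_{s=0}^{t} f(X(s))\,\Delta t\Big)\delta^d\,\Delta t. \tag{8}$$
The trace in our expression (7) for $E\exp(-\tfrac12\Psi^N_\delta(f))$ is the sum of the left-hand side of (8) over all $i \in \Lambda_\delta$. Thus if $\Lambda_\delta$ is finite, we have
$$\operatorname{tr}\sum_{k=1}^{\infty}\frac1k\big[(I - (H_\delta + f)\,\Delta t)^k - (I - H_\delta\,\Delta t)^k\big] = \sum_{i\in\Lambda_\delta}\sum_{t=\Delta t}^{\infty}\frac1t(1 - m^2\,\Delta t)^{t/\Delta t}\,p(i, i, t)\Big[E^t_{i,i}\exp\Big(-\sum_{s=0}^{t} f(X(s))\,\Delta t\Big) - 1\Big]\delta^d\,\Delta t. \tag{9}$$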
But if this formula holds for all finite $\Lambda_\delta$, it must also hold for all hyperfinite $\Lambda_\delta$ with internal cardinality less than some hyperinteger $N_0 \in {}^*\mathbb{N} - \mathbb{N}$. For such $\Lambda_\delta$ we can combine (7) and (9) to get
$$E\exp(-\tfrac12\Psi^N_\delta(f)) = \exp\Big(\sum_{i\in\Lambda_\delta}\sum_{t\in T}\frac{1}{2t}(1 - m^2\,\Delta t)^{t/\Delta t}\,p(i, i, t)\Big[E^t_{i,i}\exp\Big(-\sum_{s=0}^{t} f(X(s))\,\Delta t\Big) - 1\Big]\delta^d\,\Delta t\Big) \tag{10}$$
(here, and in the remainder of this section, we use the convention that $1/0 = 0$). Comparing (10) and Lemma 7.5.2, we see that if we choose the measure $\alpha$ in 7.5.2 as
$$\alpha(i, j, t) = (\delta_{ij}/2t)(1 - m^2\,\Delta t)^{t/\Delta t}\,p(i, j, t)\,\delta^d\,\Delta t, \tag{11}$$
where $\delta_{ij}$ is the Kronecker symbol, then $\tfrac12\Psi^N_\delta$ and $T$ have the same distribution (provided, of course, that the condition in 7.5.2 is satisfied). Since $\alpha$ is concentrated on the diagonal $\{(i, i, t) \mid i \in \Lambda_\delta, t \in T\}$, the Brownian bridges from which $T$ is constructed will be Brownian loops starting and ending at the same point. We summarize our findings in the following theorem.

7.5.3. THEOREM. Let $\Psi^N_\delta$ be the square of the free lattice field on $\Lambda_\delta$ with Neumann boundary conditions, and let $T$ be the Poisson random field of Brownian loops induced by the measure $\alpha$ in (11). There is a hyperfinite integer $N \in {}^*\mathbb{N} - \mathbb{N}$ such that whenever $|\Lambda_\delta| \le N$, then
$$E\exp(-\tfrac12\Psi^N_\delta(f)) \approx E\exp(-T(f))$$
for all non-negative, internal functions $f : \Lambda_\delta \to {}^*\mathbb{R}$ with $\|f\|_\infty \le (\log(1/\Delta t))^{1/2}$.
PROOF. We have already carried out all the necessary calculations, and what remains is only a certain amount of bookkeeping; i.e., keeping track of the various conditions we have explicitly or implicitly assumed. As observed above, the condition $\|f\|_\infty\le(\log(1/\Delta t))^{1/2}$ justifies our use of the Feynman-Kac formula, and if we also choose $N$ no larger than the constant $N_0$ appearing in the calculations, formula (10) clearly holds.
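The rest of the argument is just an appeal to Lemma 7.5.2, and only requires that the condition
\[
\sum_{(i,j,t)}\big(1-\exp(-t\|f\|_\infty)\big)^2\alpha(i,j,t)^2\approx 0
\]
is satisfied. Using the fact that $1-\exp(-t\|f\|_\infty)\le t\|f\|_\infty$, and that the sum $\sum_{t\in T}(1-m^2\,\Delta t)^{2t/\Delta t}\,\Delta t$ is finite, we see that if we choose $N\le 1/\sqrt{\Delta t}$, then
\[
\sum_{(i,j,t)}\big(1-\exp(-t\|f\|_\infty)\big)^2\alpha(i,j,t)^2
\le\sum_{i\in\Lambda_\delta}\sum_{t\in T}t^2\|f\|_\infty^2\,\frac{1}{4t^2}(1-m^2\,\Delta t)^{2t/\Delta t}\,\Delta t^2\approx 0,
\]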
and the proof is complete.
Picking an element $k\in\Lambda_\delta$ and applying $T$ to the function $f_k(i)=\delta_{ik}$, we have
\[
T(k)(\sigma,\omega)=T(f_k)(\sigma,\omega)=\sum_{(i,j,t)}\sigma(i,j,t)\,\delta^{-d}\,\big|\{s<t\mid X(s,\omega_{i,j,t})=k\}\big|.
\tag{12}
\]
Since $|\{s<t\mid X(s,\omega_{i,j,t})=k\}|$ measures the time $X$ spends at $k$, Theorem 7.5.3 gives a representation of $\Phi^2$ as a Poisson random field of Brownian local times.
C. Local Time Representations for Interactions Which Are Functions of $\Phi^2$
The next step in our program is to use the representation we just found to study interacting scalar fields. An internal function $u:{}^*\mathbb{R}\to{}^*\mathbb{R}$ defines an interaction $U$ by
\[
U=\lambda\sum_{i\in\Lambda_\delta}u(\Phi_\delta^N(i))\,\delta^d,
\tag{13}
\]
where $\lambda\in{}^*\mathbb{R}$ is a coupling constant. If $\mu_0$ is the free measure (i.e., the probability measure we have been using all along and which makes $\Phi_\delta^N$ a Gaussian random vector with mean zero and covariance matrix $\delta^{-2d}C_{\Lambda_\delta}$), the interaction measure $\mu_U$ is given by
Recall from the last section that the interacting field is just $\Phi_\delta^N$ considered as a random variable with respect to $\mu_U$. To find the Laplace transform of the interacting field, we must compute
\[
E\exp(-\Phi_\delta^N(g))\exp(-U),
\tag{15}
\]
where the expectation is with respect to $\mu_0$. We shall not be able to do this for general interactions, but by using Theorem 7.5.3 we shall obtain the expression
\[
E\exp(-\Phi_\delta^N(g))\,F(\tfrac12\Psi_\delta^N)\approx E\big(F(T+T_g)\big)\exp\big(\tfrac12(H_\delta^{-1}g,g)\big),
\tag{16}
\]
where $F:{}^*\mathbb{R}^{\Lambda_\delta}\to{}^*\mathbb{R}$ is an internal functional, and $T$ and $T_g$ are independent Poisson fields of Brownian bridges (the measure $\alpha$ inducing $T_g$ depends on the function $g$). If the interaction is a function of $\Psi_\delta^N$, i.e.,
\[
U=\lambda\sum_{i\in\Lambda_\delta}u(\Psi_\delta^N(i))\,\delta^d,
\]
we can then apply (16) to $F(\tfrac12\Psi_\delta^N)=e^{-U}$ to obtain information about (15). In order to prove (16), we shall first consider the special case where $F(\tfrac12\Psi_\delta^N)=\exp(-\tfrac12\Psi_\delta^N(f))$ for some function $f$, and show that
\[
E\big(\exp(-\Phi_\delta^N(g))\exp(-\tfrac12\Psi_\delta^N(f))\big)\approx E\exp[-T(f)-T_g(f)]\,\exp\big(\tfrac12(g,H_\delta^{-1}g)\big).
\tag{17}
\]
Once this has been established, an easy Stone-Weierstrass-type argument will take care of general $F$'s. The main tool in computing (17) is simply the formula
\[
(2\pi)^{-n/2}\operatorname{Det}A^{1/2}\int_{\mathbb{R}^n}\exp\big(-\tfrac12(Ax,x)\big)\exp\big(-(y,x)\big)\,dx=\exp\big(\tfrac12(A^{-1}y,y)\big)
\tag{18}
\]
for the Laplace transform of a Gaussian measure. Recalling that $\Phi_\delta^N$ is Gaussian with covariance matrix $\delta^{-2d}C_{\Lambda_\delta}$, and that $D_{\Lambda_\delta}=C_{\Lambda_\delta}^{-1}$, we get
(19) E[exp(-@.,N(gN exp(-4*,N(f)>l r
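where all inner products are in $\mathbb{R}^{|\Lambda_\delta|}$, i.e., $(x,y)=\sum_{i\in\Lambda_\delta}x_iy_i$. Since we are elsewhere using the inner product in $l^2(\Lambda_\delta,\delta^d)$, i.e., $(f,g)=\sum_{i\in\Lambda_\delta}f(i)g(i)\,\delta^d$, it is convenient to rewrite (19) as
\[
E\big[\exp(-\Phi_\delta^N(g))\exp(-\tfrac12\Psi_\delta^N(f))\big]
=\operatorname{Det}D_{\Lambda_\delta}^{1/2}\,\operatorname{Det}(D_{\Lambda_\delta}+\delta^{-d}f)^{-1/2}\exp\big(\tfrac12(g,(H_\delta+f)^{-1}g)\big),
\tag{20}
\]
where the inner product is in $l^2(\Lambda_\delta,\delta^d)$. Recall that $H_\delta=-\tfrac12\Delta_\delta+m^2$. We have already evaluated the product of the two determinants in (20):
\[
\operatorname{Det}D_{\Lambda_\delta}^{1/2}\,\operatorname{Det}(D_{\Lambda_\delta}+\delta^{-d}f)^{-1/2}\approx E\exp(-T(f)),
\]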
where $T$ is the Poisson field of Brownian loops induced by the measure $\alpha$ in (11). To compute the exponential factor in (20), note that
\[
(H_\delta+f)^{-1}=\sum_{k=0}^{\infty}(I-(H_\delta+f)\,\Delta t)^k\,\Delta t
\tag{21}
\]
since $H_\delta+f$ is strictly positive. By the Feynman-Kac formula (22)
k =O
( ( 1 - (Ha +f)At)kg,g)
where $X$ is the Markov process generated by $H_\delta$. By introducing the conditional measures $P_{i,j}$ and their expectations $E_{i,j}$, the last formula can be written as
\[
\sum_{k=0}^{\infty}\big((I-(H_\delta+f)\,\Delta t)^k g,g\big)\,\Delta t
=\sum_{(i,j,t)\in\Lambda_\delta^2\times T}g(i)g(j)(1-m^2\,\Delta t)^{t/\Delta t}\,p(i,j,t)\,E_{i,j}\Big[\exp\Big(-\int_0^t f(X(s))\,ds\Big)\Big]\delta^{2d}\,\Delta t.
\tag{23}
\]
If we define an internal measure $\alpha_g$ on $\Lambda_\delta^2\times T$ by
\[
\alpha_g(i,j,t)=\tfrac12\,g(i)g(j)(1-m^2\,\Delta t)^{t/\Delta t}\,p(i,j,t)\,\delta^{2d}\,\Delta t,
\tag{24}
\]
we can combine (21)-(24) to get
(in fact, we have exact equality in this case), and thus
The reason for this last, rather mysterious maneuver is simply that if $T_g$ is the Poisson random field of Brownian bridges induced by $\alpha_g$, then
for all non-negative, internal functions $f$ and $g$ such that $\|f\|_\infty\le(\log(1/\Delta t))^{1/2}$ and $\|g\|^2\sqrt{\Delta t}\approx 0$.
PROOF. As in the proof of 7.5.3, we only have to check that the conditions are sufficient for the calculations we have already carried out. Clearly they are strong enough for the use we have made of Theorem 7.5.3, and it is easy to check that since $\|f\|_\infty\le(\log(1/\Delta t))^{1/2}$, our application of the Feynman-Kac formula in (22) is valid. Hence (27) holds. To justify the appeal to Lemma 7.5.2, it suffices to check that $\sum\alpha_g(i,j,t)^2\approx 0$. But
\[
\sum_{(i,j,t)}\alpha_g(i,j,t)^2
\le\sum_{(i,j,t)}\frac14\,g(i)^2g(j)^2(1-m^2\,\Delta t)^{2t/\Delta t}\,\delta^{2d}\,\Delta t^2
\le\Big(\sum_{i}g(i)^2\,\delta^d\Big)^2\frac{1}{8m^2-4m^4\,\Delta t}\,\Delta t\approx 0,
\]
and the lemma is proved.
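The two operator expansions used above, the Neumann series (21) for $(H_\delta+f)^{-1}$ and the logarithmic series inside the trace in (9), can be checked numerically on a small standard lattice. The sketch below is only an illustration under stated assumptions: a one-dimensional lattice with six sites, reflecting ends as a stand-in for Neumann boundary conditions, and hand-picked values of $\delta$, $m$, $\Delta t$ and $f$ that are not taken from the text. All function names are hypothetical.

```python
import numpy as np


def neumann_laplacian(n, delta):
    """Nearest-neighbour Laplacian on n sites with reflecting ends (assumed discretization)."""
    lap = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                lap[i, j] += 1.0
                lap[i, i] -= 1.0
    return lap / delta**2


def resolvent_series(a, dt, terms):
    """Partial sum of sum_k (I - a*dt)^k * dt, cf. (21)."""
    step = np.eye(len(a)) - a * dt
    power = np.eye(len(a))
    total = np.zeros_like(a)
    for _ in range(terms):
        total += power * dt
        power = power @ step
    return total


def trace_log_series(a, h, dt, terms):
    """Partial sum of tr sum_k (1/k)[(I - a*dt)^k - (I - h*dt)^k], cf. (9)."""
    ident = np.eye(len(a))
    step_a, step_h = ident - a * dt, ident - h * dt
    pow_a, pow_h = ident.copy(), ident.copy()
    total = 0.0
    for k in range(1, terms + 1):
        pow_a = pow_a @ step_a
        pow_h = pow_h @ step_h
        total += np.trace(pow_a - pow_h) / k
    return total


# Illustrative parameters (assumptions, not from the text).
n, delta, m, dt = 6, 0.5, 1.0, 0.05
H = -0.5 * neumann_laplacian(n, delta) + m**2 * np.eye(n)
f = np.linspace(0.0, 2.0, n)   # a non-negative "potential"
A = H + np.diag(f)             # H_delta + f as a matrix

series_inv = resolvent_series(A, dt, 3000)
exact_inv = np.linalg.inv(A)

# (1/2) tr sum_k ... should equal log( Det H^{1/2} Det (H+f)^{-1/2} ).
half_trace = 0.5 * trace_log_series(A, H, dt, 3000)
log_det_ratio = 0.5 * (np.linalg.slogdet(H)[1] - np.linalg.slogdet(A)[1])

print(np.max(np.abs(series_inv - exact_inv)))
print(half_trace, log_det_ratio)
```

Both comparisons rely on every eigenvalue of $(H_\delta+f)\,\Delta t$ lying strictly between 0 and 2, which is what makes the geometric and logarithmic series converge for the parameters chosen here.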
To generalize Lemma 7.5.4, it is necessary to show first that the special functionals $F_f$ span the set of all reasonable functionals in an appropriate sense. We shall use the following terminology. For each $H\in{}^*\mathbb{N}$, an $H$ functional is an internal function $F:l^2(\Lambda_\delta,\delta^d)\to{}^*\mathbb{R}$ such that
\[
|F(f)|\le H\exp(-\|f\|^2/H)
\tag{34}
\]
and
\[
|F(f)-F(g)|\le H\|f-g\|
\tag{35}
\]
for all $f,g\in l^2(\Lambda_\delta,\delta^d)$. A hyperfinite sequence $(a_n,f_n)$ consisting of numbers $a_n\in{}^*\mathbb{R}$ and non-negative, internal functions $f_n:\Lambda_\delta\to{}^*\mathbb{R}$ is called a $(K,\Delta t)$ sequence if $\sum|a_n|\le K$ and $\|f_n\|_\infty\le(\log(1/\Delta t))^{1/2}$ for all $n$.
7.5.5. LEMMA. For each $K\in{}^*\mathbb{N}-\mathbb{N}$, there is an $H\in{}^*\mathbb{N}-\mathbb{N}$ with the property that if $|\Lambda_\delta|\le H$, then for each $H$ functional $F:l^2(\Lambda_\delta,\delta^d)\to{}^*\mathbb{R}$, there is a $(K,\Delta t)$ sequence $(a_n,f_n)$ such that
\[
\Big|F(g)-\sum a_nF_{f_n}(g)\Big|\le\frac{1}{H}
\]
for all non-negative, internal $g:\Lambda_\delta\to{}^*\mathbb{R}$.
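PROOF. Assume first that $H$ is finite, and let $\bar{\mathbb{R}}_+^{\Lambda_\delta}=\mathbb{R}_+^{\Lambda_\delta}\cup\{\infty\}$ be the one-point compactification of $\mathbb{R}_+^{\Lambda_\delta}$. Since $\Lambda_\delta$ is finite, $\mathbb{R}_+^{\Lambda_\delta}$ is a subset of $l^2(\Lambda_\delta,\delta^d)$, and we can turn $F$ into a standard function $\tilde F:\bar{\mathbb{R}}_+^{\Lambda_\delta}\to\mathbb{R}$ by letting $\tilde F(g)={}^\circ F(g)$ for all $g\in\mathbb{R}_+^{\Lambda_\delta}$ and setting $\tilde F(\infty)=0$.
Note that the condition $|F(g)|\le He^{-\|g\|^2/H}$ implies that $\tilde F$ is continuous at infinity. Each strictly positive function $f:\Lambda_\delta\to\mathbb{R}$ defines a continuous function $\tilde F_f$ on $\bar{\mathbb{R}}_+^{\Lambda_\delta}$ by $\tilde F_f(g)=\exp(-(g,f))$ for $g\in\mathbb{R}_+^{\Lambda_\delta}$, and $\tilde F_f(\infty)=0$; the strict positivity is needed to get continuity at infinity. The algebra $\mathcal{A}$ consisting of all finite, linear combinations
\[
\beta_0+\sum_{i=1}^{n}\beta_i\tilde F_{f_i}
\]
separates points and contains the constant functions, and since $\bar{\mathbb{R}}_+^{\Lambda_\delta}$ is compact, the Stone-Weierstrass theorem tells us that we can find finite sequences $\{\beta_i\}$, $\{f_i\}$ such that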
\[
\Big|\tilde F(g)-\Big(\beta_0+\sum\beta_i\tilde F_{f_i}(g)\Big)\Big|\le\frac{1}{2H}
\tag{36}
\]
for all $g\in\bar{\mathbb{R}}_+^{\Lambda_\delta}$. Considering the $f_i$'s as elements of $l^2(\Lambda_\delta,\delta^d)$, define functionals $F_{f_i}:l^2(\Lambda_\delta,\delta^d)\to{}^*\mathbb{R}$ by $F_{f_i}(g)=\exp(-(g,f_i))$. Using the decay and continuity conditions (34), (35) on $F$, and the facts that $\beta_i$, $f_i$ are standard and $f_i$ is strictly positive, we get from (36) that
\[
\Big|F(g)-\Big(\beta_0+\sum\beta_iF_{f_i}(g)\Big)\Big|\le\frac{1}{H}
\]
for all non-negative, internal $g:\Lambda_\delta\to{}^*\mathbb{R}$. As $(\beta_i,f_i)$ is obviously a $(K,\Delta t)$ sequence, this proves the lemma for all finite $H$, and hence for all sufficiently small, infinite $H$.
REMARK. The condition $|F(f)|\le H\exp(-\|f\|^2/H)$, which we have used to govern $F$'s behavior at infinity, is not in any sense canonical. In fact, we can replace it by
\[
|F(f)|\le G(H,\|f\|)
\tag{37}
\]
for any standard function $G:\mathbb{N}\times\mathbb{R}_+\to\mathbb{R}_+$ such that $\lim_{x\to\infty}G(n,x)=0$ for all $n\in\mathbb{N}$.
Combining 7.5.4 and 7.5.5, we get the following theorem.
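7.5.6. THEOREM. There is an $H\in{}^*\mathbb{N}-\mathbb{N}$ such that if $|\Lambda_\delta|\le H$, then for all $H$ functionals $F:l^2(\Lambda_\delta,\delta^d)\to{}^*\mathbb{R}$ and all $g\in l^2(\Lambda_\delta,\delta^d)$ with ${}^\circ\|g\|<\infty$, $g\ge 0$,
\[
E\big[\exp(-\Phi_\delta^N(g))\,F(\tfrac12\Psi_\delta^N)\big]\approx E[F(T+T_g)]\exp\big(\tfrac12(H_\delta^{-1}g,g)\big)
\tag{38}
\]
and, provided ${}^\circ F$ is not identically zero,
PROOF. If $|\Lambda_\delta|$ is less than the constant $N_1\in{}^*\mathbb{N}-\mathbb{N}$ occurring in Lemma 7.5.4, there must be a $K\in{}^*\mathbb{N}-\mathbb{N}$ such that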
-
ELI afl&(7-+ Tg)lexp
for all $(K,\Delta t)$ sequences $(a_n,f_n)$ and all $g\in l^2(\Lambda_\delta,\delta^d)$ with finite norm. If $|\Lambda_\delta|$ is also less than the constant $H$ in Lemma 7.5.5, we can find a $(K,\Delta t)$ sequence $(a_n,f_n)$ such that the difference $F-\sum a_nF_{f_n}$ is less than $H^{-1}$ in supremum norm. Consequently,
and
Since the operator $H_\delta$ is bounded from below by $m^2$, we have $(H_\delta^{-1}g,g)\le\|g\|^2/m^2$,
and thus - E [ F ( T + T,)] exp 1
2
which proves (38). Turning to (39), note that if $H$ is finite, $E(F(\tfrac12\Psi_\delta^N))$ and $E(F(T))$ are infinitely close and noninfinitesimal. Hence (39) holds for all finite $H$, and the extension to sufficiently small, infinite $H$ is straightforward.
REMARK. The theorem above is a hyperfinite version of Dynkin's representation formula [Dynkin (1984a,b); see also Brydges et al. (1982)]. Note that the class of internal functionals $F$ that we allow is sufficiently rich to represent all standard functionals from $L^2(\Lambda,dx)$ to $\mathbb{R}$.
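Theorems 7.5.3 and 7.5.6 compare Gaussian quantities with Laplace transforms of a Poisson field, subject to a smallness condition of the form $\sum(1-e^{-t\|f\|_\infty})^2\alpha(i,j,t)^2\approx 0$. Lemma 7.5.2 is not reproduced in this section, so the following is only a plausible illustration of why a condition of this shape is natural. It assumes, as a simplification, that the field is built from independent 0/1 variables with success probabilities $\alpha(x)$, and compares their Laplace transform $\prod_x(1-\alpha(x)(1-e^{-h(x)}))$ with the Poisson formula $\exp(-\sum_x\alpha(x)(1-e^{-h(x)}))$. The cell weights, the test function $h$ and all names are hypothetical.

```python
import math
import random

random.seed(0)

# Hypothetical cell weights alpha(x) and test function h(x) on 10,000 cells.
cells = 10_000
alpha = [1e-3 * random.random() for _ in range(cells)]
h = [5.0 * random.random() for _ in range(cells)]

# u(x) = alpha(x) * (1 - exp(-h(x))) is the quantity entering both transforms.
u = [a * (1.0 - math.exp(-x)) for a, x in zip(alpha, h)]

# Logarithms of the two Laplace transforms.
log_bernoulli = sum(math.log1p(-ui) for ui in u)   # log prod (1 - u)
log_poisson = -sum(u)                              # log exp(-sum u)

# Since -u - log(1 - u) lies between 0 and u^2 for 0 <= u <= 1/2,
# the gap is controlled by sum u^2.
gap = log_poisson - log_bernoulli
bound = sum(ui * ui for ui in u)

print(gap, bound)
```

Under this assumption the two transforms agree up to a factor $\exp(O(\sum u^2))$, so they are infinitely close exactly when the sum of squares is infinitesimal.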
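D. $\Phi^4$ and Polymer Measures
Recall from Section 7.4 that $\Phi^4$ fields are given by interactions of the form
\[
U(\Phi_\delta^N)=\frac{\lambda}{4}\sum_{i\in\Lambda_\delta}\Phi_\delta^N(i)^4\,\delta^d+\frac{a}{2}\sum_{i\in\Lambda_\delta}\Phi_\delta^N(i)^2\,\delta^d
=\frac{\lambda}{4}\sum_{i\in\Lambda_\delta}\Psi_\delta^N(i)^2\,\delta^d+\frac{a}{2}\sum_{i\in\Lambda_\delta}\Psi_\delta^N(i)\,\delta^d,
\tag{40}
\]
where $\lambda$ and $a$ are constants. To see how this field can be represented in terms of Brownian local times, note that its generating functional can be written as (41)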
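\[
E\exp(-\Phi_\delta^N(g))\exp(-U)
=E\exp(-\Phi_\delta^N(g))\exp\Big(-\lambda\sum_{i\in\Lambda_\delta}\big(\tfrac12\Psi_\delta^N(i)\big)^2\delta^d-a\sum_{i\in\Lambda_\delta}\tfrac12\Psi_\delta^N(i)\,\delta^d\Big)
=E\exp(-\Phi_\delta^N(g))\,F(\tfrac12\Psi_\delta^N)
\]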
Before we discuss the terms of the last exponent, we must make our notation a little more explicit. Recall from (12) that $T$ is defined by
\[
T(k)(\sigma,\omega)=\sum_{(i,j,t)}\sigma(i,j,t)\sum_{s=0}^{t}\delta^{-d}\,1_{\{k\}}\big(X(\omega_{i,j,t},s)\big)\,\Delta t,
\tag{44}
\]
where $\sigma$ is the hyperfinite Poisson field induced by the measure $\alpha$ in (11), and $X$ is the Markov process generated by $H_\delta$. Similarly,
\[
T_g(k)(\sigma_g,\omega_g)=\sum_{(i,j,t)}\sigma_g(i,j,t)\sum_{s=0}^{t}\delta^{-d}\,1_{\{k\}}\big(X_g(\omega_{g,i,j,t},s)\big)\,\Delta t,
\tag{45}
\]
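where $\sigma_g$ is the hyperfinite Poisson field induced by the $\alpha_g$ in (24), and $X_g$ is another copy of the Markov process generated by $H_\delta$. Since $T_g$ is assumed to be independent of $T$, we let $\sigma$, $X$ and $\sigma_g$, $X_g$ be chosen independent. Quadratic terms in $T$ and $T_g$ can now be computed as follows:
\[
\begin{aligned}
\lambda\sum_{k\in\Lambda_\delta}T(k)T_g(k)\,\delta^d
&=\lambda\sum_{k\in\Lambda_\delta}\sum_{(i,j,t)}\sum_{(\tilde\imath,\tilde\jmath,\tilde t)}\sigma(i,j,t)\,\sigma_g(\tilde\imath,\tilde\jmath,\tilde t)\sum_{s=0}^{t}\sum_{\tilde s=0}^{\tilde t}\delta^{-d}\,1_{\{k\}}(X(s))\,1_{\{k\}}(X_g(\tilde s))\,\Delta t^2\\
&=\lambda\sum_{(i,j,t)}\sum_{(\tilde\imath,\tilde\jmath,\tilde t)}\sigma(i,j,t)\,\sigma_g(\tilde\imath,\tilde\jmath,\tilde t)\sum_{s=0}^{t}\sum_{\tilde s=0}^{\tilde t}\hat\delta\big(X(s)-X_g(\tilde s)\big)\,\Delta t^2,
\end{aligned}
\tag{46}
\]
where $\hat\delta$ is the hyperfinite version of the $d$-dimensional delta function; $\hat\delta(0)=\delta^{-d}$ and $\hat\delta(i)=0$ for $i\ne 0$. Similarly,
\[
\lambda\sum_{k\in\Lambda_\delta}T(k)^2\,\delta^d
=\lambda\sum_{(i,j,t)}\sum_{(\tilde\imath,\tilde\jmath,\tilde t)}\sigma(i,j,t)\,\sigma(\tilde\imath,\tilde\jmath,\tilde t)\sum_{s=0}^{t}\sum_{\tilde s=0}^{\tilde t}\hat\delta\big(X(\omega_{i,j,t},s)-X(\omega_{\tilde\imath,\tilde\jmath,\tilde t},\tilde s)\big)\,\Delta t^2,
\tag{47}
\]
where we have displayed the $\omega$'s on the right-hand sides to emphasize that unless $(i,j,t)=(\tilde\imath,\tilde\jmath,\tilde t)$, the processes $X(\omega_{i,j,t},s)$ and $X(\omega_{\tilde\imath,\tilde\jmath,\tilde t},\tilde s)$ are independent [and so are $X_g(\omega_{g,i,j,t},s)$ and $X_g(\omega_{g,\tilde\imath,\tilde\jmath,\tilde t},\tilde s)$].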
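The algebra behind (46) is a purely combinatorial identity: multiplying two local times site by site and summing against $\delta^d$ gives a double time sum of the lattice delta function. The following sketch checks this for two sample paths. It is an illustration under assumptions only: a one-dimensional lattice ($d=1$), clamped nearest-neighbour walks as a stand-in for the Markov process generated by $H_\delta$, a single path in place of each Poisson field, and hypothetical names and parameter values throughout.

```python
import math
import random

random.seed(1)


def reflecting_walk(n_sites, n_steps, start):
    """Nearest-neighbour walk on {0, ..., n_sites-1}; steps off the ends are clamped."""
    path = [start]
    for _ in range(n_steps):
        nxt = path[-1] + random.choice((-1, 1))
        path.append(min(max(nxt, 0), n_sites - 1))
    return path


def local_time(path, n_sites, delta, dt, d=1):
    """ell(k) = sum_s delta^{-d} 1_{k}(X(s)) dt, the single-path analogue of (44)."""
    ell = [0.0] * n_sites
    for site in path:
        ell[site] += dt / delta**d
    return ell


# Illustrative parameters (assumptions, not from the text).
n_sites, delta, dt, d = 8, 0.25, 0.01, 1
x = reflecting_walk(n_sites, 400, start=2)
x_g = reflecting_walk(n_sites, 300, start=5)

ell = local_time(x, n_sites, delta, dt, d)
ell_g = local_time(x_g, n_sites, delta, dt, d)

# Left-hand side: sum_k ell(k) * ell_g(k) * delta^d.
lhs = sum(a * b for a, b in zip(ell, ell_g)) * delta**d

# Right-hand side: double sum of the lattice delta function,
# delta_hat(0) = delta^{-d} and delta_hat(z) = 0 otherwise.
rhs = sum(dt * dt / delta**d for s in x for t in x_g if s == t)

print(math.isclose(lhs, rhs, rel_tol=1e-9))
```

The identity holds path by path, which is why the intersection local times of independent paths, and the self-intersection terms when the two triples coincide, appear in the quadratic terms.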
Linear terms in $T$ and $T_g$ can also be computed:
In this formula the last two terms [coming from (49) and (50)] are easy to handle, while the first three are similar to the ones we studied in Section 6.4.C. As long as $(i,j,t)\ne(\tilde\imath,\tilde\jmath,\tilde t)$, the processes occurring inside the same delta functions are independent, and we are in the situation discussed in formulas (6.4.35)-(6.4.37). When $(i,j,t)=(\tilde\imath,\tilde\jmath,\tilde t)$ (and note that this event has noninfinitesimal probability) we are dealing with the more complicated polymer measures described in (6.4.39) and (6.4.40). Thus the triviality or nontriviality of a $\Phi^4$ scalar field is intimately tied up with the behavior of polymer measures and perturbations of the Laplacian along Brownian paths.
It should be pointed out at this stage that it is not only our inability to control four-dimensional polymer measures which prevents us from saying anything definite about $\Phi_4^4$. Recall from Section 6.4 that we could only make standard sense of the expressions
\[
\exp\Big(-\int_0^1\!\!\int_0^1\lambda\,\delta\big(b_1(s)-b_2(t)\big)\,dt\,ds\Big)
\]
for certain negative, infinitesimal $\lambda$. Now it turns out that the result on which we have based our discussion, Theorem 7.5.6, does not hold for negative $\lambda$'s! In fact, both the numerator and the denominator on the left-hand side of (39) diverge when $\lambda$ is negative. However, since $T$ and $T_g$ are basically *-finite quantities, the right-hand side does make sense, and it is relatively easy to reinterpret the left-hand side by means of a truncation argument in such a way that the result extends to negative $\lambda$'s. A much more serious problem is caused by the fact that the infinitesimal $\lambda$'s we found in Section 6.4 were not constants, but functions of $\omega$ and $t$. It might, of course, be that one can choose them to be constant without violating the argument in Section 6.4, but we have not been able to prove this. On the other hand, there is no obvious way of making sense of the right-hand side of (51) as a quantum field when $\lambda$ is not constant.
To sum up, let us emphasize that although the computations above and the results of Section 6.4 may seem to suggest that $\Phi_4^4$ could be nontrivial for certain negative and infinitesimal choices of the coupling constant $\lambda$, there is a long way to go before such a claim can be either proved or disproved. Work is already in progress; at the time of writing, Andreas Stoll is developing a hyperfinite approach to polymer measures based on (selfrepellent) random walks [see Stoll (1985)]. For other discussions of infinitesimal coupling constants and related topics, such as triviality results for repulsive $\Phi^4$ models, see Aizenman (1981), Albeverio et al. (1982, 1984a,c), Fröhlich (1982), Gallavotti and Nicolò (1985), Sokal (1982), and Westwater (1980, 1985).
E. Fields and Local Time Perturbations
As we have just seen, the two major problems we encounter in trying to control $\Phi^4$ through its local time representation are the occurrence of polymer measures on the right-hand side of (51) and the need to choose $\lambda$ constant. We shall take a brief look at another kind of interacting quantum fields where the first problem is avoided, but where, unfortunately, the second still is present. What we have in mind are the so-called $(\Phi_1^2\Phi_2^2)_d$ models describing two $d$-dimensional quantum fields $\Phi_1$ and $\Phi_2$ interacting through a term
where $\lambda$ is a coupling constant. To describe the situation mathematically, let $m_1$ and $m_2$ be two positive, real numbers representing the masses of the two fields. Let $D_k$, $k=1$ or 2, be the matrix of $H_k=-\tfrac12\Delta_\delta+m_k^2$ as an operator from $l^2(\Lambda_\delta,\delta^d)$ to $l^2(\Lambda_\delta,\delta^d)$, and let $C_k$ be the inverse matrix. Fix a probability measure $\mu_0$ and two independent random vectors $\{\theta_1(i)\}_{i\in\Lambda_\delta}$, $\{\theta_2(i)\}_{i\in\Lambda_\delta}$ with covariance matrices $C_1$ and $C_2$, respectively. As in Section 7.5.B above, the quantum fields $\Phi_1$ and $\Phi_2$ are simply defined by $\Phi_k=\delta^{-d}\theta_k$ for $k=1,2$. With respect to $\mu_0$, $\Phi_1$ and $\Phi_2$ are clearly independent, free scalar fields. The interaction measure $\mu$ is defined by
\[
d\mu=\frac{1}{Z}\exp\Big(-\frac{\lambda}{4}\sum_{i\in\Lambda_\delta}\Phi_1^2(i)\,\Phi_2^2(i)\,\delta^d\Big)\,d\mu_0,
\tag{53}
\]
where
\[
Z=\int\exp\Big(-\frac{\lambda}{4}\sum_{i\in\Lambda_\delta}\Phi_1^2(i)\,\Phi_2^2(i)\,\delta^d\Big)\,d\mu_0
\]
is just the normalizing constant. Assume, for the time being, that $\lambda$ is non-negative. As usual we want to compute the Laplace transform of $\mu$:
Dealing with two interacting quantum fields, this problem does not fit into the framework discussed in Section 7.5.C, and we shall have to start our calculations from scratch. It turns out, however, that along the way we will be able to make use of some of the formulas we have already derived, and that this will simplify our task considerably. In particular, we shall use the formula
(55)
(2.rr)-"12 Det
for the Laplace transform of a Gaussian measure; the expression (56)
Det DYt Det(DA,
+ 8-df)-1/2
= exp(-a(Ag x T)) exp
obtained by combining ( 5 ) , (lo), and ( l l ) , and formula (25), i.e., (57)
We begin the computations by observing that since $\theta_1$ and $\theta_2$ are independent and Gaussian with covariance matrices $C_1$ and $C_2$,
(58)
k(f;g ) = Z F ( 1 g ) = (27r)-lA61Det C;'"
x
II
ieA6
dpi
62are
Det C;"'
FI 4,.
isA6
Let us carry out the integration with respect to the $p_i$ variables. Recalling that $D_k=C_k^{-1}$, $k=1$ or 2, we get from (55): (59)
F(f;g) =
Det D;12
(60)
Det D ; l 2Det = exp(-a,(hg x T))
where $X_1$ is the Markov process generated by $H_1=-\tfrac12\Delta_\delta+m_1^2$, and $\alpha_1$ is the measure obtained from (11) by substituting $m_1$ for $m$. Similarly, by (57)
where $\alpha_f$ is the measure in (24) with $g$ and $m$ replaced by $f$ and $m_1$. Substituting (60) and (61) in (59) and pulling everything which does not depend on $q$ outside the integral, we get (62)
g ( J g ) = ( 2 ~ ) - l ~ s lDet ’ ~ D;1’2
x exp(
1 -Tx Dz(i,j)qiqj) exp(-C
qig(i)) ieha II dqi.
Strictly speaking, we are cheating here; there are conditions on (56) and (57) which we ought to check, and we cannot just substitute (60) and (61) into (59) and be certain that the result is valid. However, it is not hard to prove that (62) holds when $|\Lambda_\delta|$ and $\delta^{-1}$ are finite, and hence when they are infinite and sufficiently small. Since we are trying to communicate an idea rather than prove a theorem, we shall leave all bookkeeping of this sort to the reader from now on.
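The Gaussian Laplace-transform formula, stated as (18) and used again as (55) to carry out the integrations above, can be sanity-checked by direct quadrature in low dimension. The sketch below does this for $n=2$; the matrix $A$, the vector $y$ and the integration grid are arbitrary illustrative choices, not values from the text.

```python
import numpy as np

# Illustrative positive definite matrix and vector (assumptions).
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
y = np.array([0.3, -0.4])
n = 2

# Uniform grid wide enough that the integrand is negligible at the edges.
grid = np.linspace(-12.0, 12.0, 801)
step = grid[1] - grid[0]
X1, X2 = np.meshgrid(grid, grid, indexing="ij")

# Integrand exp(-(1/2)(Ax, x)) * exp(-(y, x)) evaluated on the grid.
quad_form = A[0, 0] * X1**2 + 2.0 * A[0, 1] * X1 * X2 + A[1, 1] * X2**2
integrand = np.exp(-0.5 * quad_form - (y[0] * X1 + y[1] * X2))

# Left-hand side of (18): normalized integral, approximated by a Riemann sum.
lhs = (2.0 * np.pi) ** (-n / 2) * np.sqrt(np.linalg.det(A)) * integrand.sum() * step**2

# Right-hand side of (18): exp((1/2)(A^{-1} y, y)).
rhs = np.exp(0.5 * y @ np.linalg.solve(A, y))

print(lhs, rhs)
```

A plain Riemann sum is used because it converges very quickly for smooth, rapidly decaying integrands; any standard quadrature rule would serve equally well.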
To compute the $q_i$-integrals in (62), we first expand the exponential term in a power series
+
Let p; be the measure on A 2 x T obtained by multiplying a, fyr by itself n times. Assume that X : ( o , , . . . ,X ; ( w , , are independent copies of X , , and let a ) ,
a )
denote expectation with respect to the product measure n",, Pbk(Ok).We can rewrite (63) as
x exp( - 2 1 ,iS-2d
i Irkq ( X ; ( s ) ) ' d s ) dp;.
k=l
0
Putting this into (62) and interchanging the order of integration, we see that
x exp( A:-
KZd k=l
Jrk
o
q(X;(s))' ds)
IJ
iEAg
d q i ) dp:.
The idea is that we can now compute the innermost integral by using the formula for the Laplace transform of Gaussian measures. To see this, let $L_n$ be the "local time"
and observe that
and thus (65) becomes (67)
P ( f ,g) = e x p ( - a l ( h i
x T)) Det D;',
If $X_2$ is a copy of the Markov process generated by $H_2=-\tfrac12\Delta_\delta+m_2^2$, which is independent of all the $X_1^k$'s, we get from (56) that
(68) Det D;" Det( D2+ A S - d L n ) - 1 / 2 = exp(-a,(hi x T))
where the expectation is with respect to the measure governing $X_2$, and $\alpha_2$ is the measure (11) with $m$ replaced by $m_2$. Similarly, by (57) (69)
exp(; ( ( H 2+ A L J ' g , 9))
where $\alpha_g$ is the measure in (24) with $m$ replaced by $m_2$.
Substituting (68) and (69) into (67), and leaving the bookkeeping to the reader, we get (70)
f i ( g> ~ = exp(-(a, + a 2 ) ( G x T))
Recalling the definition of L,, we may rewrite this as
x g(x2(s)-x;(sk))
dSkdS
where, as before, $\hat\delta$ is the hyperfinite version of the delta function given by $\hat\delta(0)=\delta^{-d}$ and $\hat\delta(i)=0$ for $i\ne 0$. Formula (71) may seem curiously unsymmetric in $\Phi_1$ and $\Phi_2$, but if we expand the exponential term in a power series, we see that (72)
k(L 8 ) = exP(-(al+
~ z ) ( A ;X T ) )
where X i , X : , . . . are independent copies of X , , and where E%;:::;''';,,__,, IT, and are constructed from the Xi's and a , + ag in the same way as E:;;:::;:;,jl,,..,jn and p ; were constructed from the X i ' s and a 1+ a,. Since the normalization constant 2 in (58) equals E ( O , O ) , we finally arrive at the expression
6:
x $(Xk(il)- X : ( s k ) )dS,dsk) d a y d a r ) ,
provided $|\Lambda_\delta|$ and $\delta^{-1}$ are less than some infinite bound, and $f$ and $g$ are "reasonable" functions.
This, then, is our representation of $(\Phi_1^2\Phi_2^2)_d$ in terms of local time functionals. Note that since the copies of $X_1$ and $X_2$ are independent, the exponential terms are always of the kind studied in Section 6.4, and no polymer measures enter into the discussion. But we are still faced with two problems. The first is that while in 6.4 we were forced to choose $\lambda$ negative and infinitesimal in order to obtain something nontrivial, the calculations above apply only to non-negative $\lambda$'s. In fact, the interaction term is not integrable with respect to the free measure $\mu_0$ when $\lambda<0$, and our calculations thus make no sense for such choices of $\lambda$. But since the right-hand side of (73) always exists, it is not difficult to make sense of the left-hand side by a suitable truncation argument in such a way that (73) holds (or, more opportunistically, we can simply define the right-hand side to be the interpretation of the left-hand side when $\lambda<0$).
The second problem is more difficult, and of much greater importance: Is there a negative, infinitesimal $\lambda$ which makes (73) different from the Laplace transform of the free field; i.e., is $(\Phi_1^2\Phi_2^2)_d$ nontrivial for $d=4$ or $d=5$? The results of Section 6.4 indicate that the answer would be yes if $\lambda$ were allowed to vary with $\omega$ and $t$, but we simply do not have sufficient control to answer the question for constant $\lambda$'s. All we know from Chapter 6 is that the permissible choices of $\lambda$ depend on the local behavior of the
Brownian path, and that we thus are dealing with a question concerning self-similarity of Brownian motions. We shall leave the problem open-ended, only emphasizing once again what this section is meant to illustrate: the ease and flexibility with which hyperfinite methods can handle some of the intricate conceptual and computational questions of quantum field theory.
REMARK. As a further illustration let us remark that the exponential model discussed in Section 7.4.C recently has attracted much attention due to the connection of its zero mass version with the theory of relativistic strings. The massless exponential interaction model is called the Liouville model, since it was studied originally by Liouville as a classical field theory. The connection between the quantum version of this field theory and relativistic strings was observed by Polyakov in 1981. In fact, it turns out that the theory of relativistic strings in space-time dimension $D\le 13$ (corresponding to $a<6$; see 7.4.C for the choice of $a$) can be expressed in terms of expectations with respect to the pathspace measure of the Liouville model, see Albeverio et al. (1986a). The reduction of the string model to the (hyperfinite) exponential Euclidean model also reduces the "singularities" of the string approach to a discussion of singularities of the Euclidean model, and can thus be discussed by the hyperfinite methods developed in this chapter.
REFERENCES
M. Aizenman (1981). Proof of the triviality of $\Phi_d^4$ field theory and some mean field features of Ising models for $d>4$. Phys. Rev. Lett. 47.
S. Albeverio and R. Høegh-Krohn (1974). The Wightman axioms and the mass gap for strong interactions of exponential type in two-dimensional space-time. J. Funct. Anal. 16.
S. Albeverio and R. Høegh-Krohn (1979). Uniqueness and the global Markov property for Euclidean fields. The case of trigonometric interaction. Comm. Math. Phys. 68.
S. Albeverio and R. Høegh-Krohn (1980). Martingale convergence and the exponential interaction in $\mathbb{R}^n$. In (L. Streit, ed.), Quantum Fields-Algebras and Processes. Springer-Verlag, Berlin and New York.
S. Albeverio and R. Høegh-Krohn (1984a). Local and global Markov fields. Rep. Math. Phys. 19.
S. Albeverio and R. Høegh-Krohn (1984b). Diffusion fields, quantum fields and fields with values in Lie groups. In (M. Pinsky, ed.), Stochastic Analysis and Applications. Dekker, New York.
S. Albeverio, G. Gallavotti, and R. Høegh-Krohn (1979). Some results for the exponential interaction in two or more dimensions. Comm. Math. Phys. 70.
S. Albeverio, R. Høegh-Krohn, and G. Olsen (1981). The global Markov property for lattice systems. J. Multivariate Anal. 11.
S. Albeverio, Ph. Blanchard, and R. Høegh-Krohn (1982). Some applications of functional integration. In (R. Schrader, R. Seiler, and D. A. Uhlenbrock, eds.), Mathematical Problems in Theoretical Physics. Lect. Notes Phys. 153, Springer-Verlag, Berlin and New York.
S. Albeverio, J. E. Fenstad, R. Høegh-Krohn, W. Karwowski, and T. Lindstrøm (1984a). Perturbations of the Laplacian supported by null sets, with applications to polymer measures and quantum fields. Phys. Lett. 104.
S. Albeverio, R. Høegh-Krohn, and H. Holden (1984b). Markov cosurfaces and gauge fields. Acta Phys. Austriaca, Suppl. XXVI.
S. Albeverio, Ph. Blanchard, and R. Høegh-Krohn (1984c). Newtonian diffusions and planets, with a remark on nonstandard Dirichlet forms and polymers. In (A. Truman and D. Williams, eds.), Proceedings of the LMS-Symp. on Stochastic Analysis and Applications (Swansea, 1983). Springer-Verlag, Berlin and New York.
S. Albeverio, R. Høegh-Krohn, and H. Holden (1985). Markov processes in infinite-dimensional spaces, Markov fields, and Markov cosurfaces. In (L. Arnold and P. Kotelenez, eds.), Stochastic Space-Time Models, Limit Theorems. Reidel, Dordrecht.
S. Albeverio, R. Høegh-Krohn, and H. Holden (1986). Markov cosurfaces and quantum fields (in preparation).
S. Albeverio, R. Høegh-Krohn, S. Paycha, and S. Scarlatti (1986a). Pathspace measure for the Liouville quantum field theory and the construction of relativistic strings. Phys. Lett. (to appear).
J. Bellissard and R. Høegh-Krohn (1982). Compactness and the maximal Gibbs state for random Gibbs fields on a lattice. Comm. Math. Phys. 84.
J. Bellissard and P. Picco (1979). Lattice quantum fields: uniqueness and Markov property. (preprint) Marseille.
P. Blanchard and J. Tarski (1978). Renormalizable interactions in two dimensions and sharp-time fields. Acta Phys. Austriaca 19.
D. Brydges, J. Fröhlich, and T. Spencer (1982). The random walk representation of classical spin systems and their correlation inequalities. Comm. Math. Phys. 83.
D. C. Brydges, J. Fröhlich, and A. D. Sokal (1983). A new proof of the existence and nontriviality of the continuum $\varphi_2^4$ and $\varphi_3^4$ quantum field theories. Comm. Math. Phys. 91.
R. L. Dobrushin (1968a). Gibbsian random fields for lattice systems with pairwise interactions. Functional Anal. Appl. 2.
R. L. Dobrushin (1968b). Description of a random field by means of conditional probabilities and the conditions governing its regularity. Theory Probab. Its Appl. 13.
E. B. Dynkin (1984a). Gaussian and nongaussian random fields associated with Markov processes. J. Funct. Anal. 55.
E. B. Dynkin (1984b). Polynomials of the occupation field and related random fields. J. Funct. Anal. 58.
W. G. Faris (1979). The stochastic Heisenberg model. J. Funct. Anal. 32.
R. Fittler (1984). Some nonstandard quantum electrodynamics. Helv. Phys. Acta 57.
H. Föllmer (1975). Phase transition and Martin boundary. Sem. Probab., Strasbourg IX. Lecture Notes in Math. 465, Springer-Verlag, Berlin and New York.
H. Föllmer (1980). On the global Markov property. In (L. Streit, ed.), Quantum Fields-Algebras, Processes. Springer-Verlag, Berlin and New York.
J. Fröhlich (1980). Some results and comments on quantized gauge fields. In Recent Developments in Gauge Theory, Cargèse Summer Inst., 1979. Plenum Press, New York.
J. Fröhlich (1982). On the triviality of $\lambda\Phi_d^4$-theories and the approach to the critical point in $d\ge 4$ dimensions. Nucl. Phys. B 200.
G. Gallavotti and F. Nicolò (1985). Renormalization theory in four dimensional scalar fields II. Comm. Math. Phys. 101.
R. Gielerak (1983). Verification of the global Markov property in some classes of strongly coupled exponential interactions. J. Math. Phys. 24.
R. J. Glauber (1963). Time dependent statistics of the Ising model. J. Math. Phys. 4.
J. Glimm and A. Jaffe (1981). Quantum Physics, a Functional Integral Point of View. Springer-Verlag, Berlin and New York.
S. Goldstein (1980). Remarks on the global Markov property. Comm. Math. Phys. 74.
L. Gross (1979). Decay of correlation in classical lattice models at high temperature. Comm. Math. Phys. 68.
L. Gross (1982). Thermodynamics, statistical mechanics and random fields. In École d'Été de Probabilités de Saint-Flour X-1980. Lecture Notes Math. 929, Springer-Verlag, Berlin and New York.
L. L. Helms and P. A. Loeb (1979). Applications of nonstandard analysis to spin models. J. Math. Anal. Appl. 69.
L. L. Helms and P. A. Loeb (1982). Bounds on the oscillation of spin systems. J. Math. Anal. Appl. 86.
Y. Higuchi (1984). A remark on the global Markov property for the d-dimensional Ising model. Proc. Jpn. Acad. Ser. A 60.
R. Holley (1970). A class of interactions in an infinite spin system. Adv. Math. 5.
R. Holley (1972). Markovian interaction processes with finite range interactions. Ann. Math. Statist. 43.
R. Holley and D. W. Stroock (1976). $L_2$ theory for the stochastic Ising model. Z. Wahrsch. Verw. Gebiete 35.
R. Holley and D. W. Stroock (1977). In one and two dimensions every stationary measure for a stochastic Ising model is a Gibbs state. Comm. Math. Phys. 55.
A. E. Hurd (1981). Nonstandard analysis and lattice statistical mechanics: a variational principle. Trans. Amer. Math. Soc. 263.
R. B. Israel (1979). Convexity in the Theory of Lattice Gases. Princeton Univ. Press, Princeton, New Jersey.
P. J. Kelemen and A. Robinson (1972). The nonstandard $\lambda{:}\varphi_2^4(x){:}$ model. J. Math. Phys. 13.
C. Kessler (1984). Nonstandard methods in random fields. Thesis, Bochum.
C. Kessler (1985). Examples of extremal lattice fields without the global Markov property. Publ. RIMS 21.
R. Kindermann and J. L. Snell (1980). Markov Random Fields and their Applications. Amer. Math. Soc., Providence, Rhode Island.
D. Laugwitz (1978). Infinitesimalkalkül. Bibl. Inst., Mannheim.
J. L. Lebowitz and A. Martin-Löf (1972). On the uniqueness of the equilibrium state for Ising spin systems. Comm. Math. Phys. 25.
Li Bang-He (1978). Nonstandard analysis and multiplication of distributions. Sci. Sinica 21.
T. M. Liggett (1985). Interacting Particle Systems. Springer-Verlag, Berlin and New York.
P. A. Loeb (1976). Conversion from nonstandard to standard measure spaces and applications in probability theory. Israel J. Math. 25.
P. A. Martin (1977). On the stochastic dynamics of Ising models. J. Statist. Phys. 16.
J. Mikusinski and R. Sikorski (1973). Theory of Distributions. The Sequential Approach. Elsevier, Amsterdam.
S. Nagamachi and T. Nishimura (1984). Linear canonical transformations on Fermion Fock space with indefinite metric (preprint). Tokushima.
E. Nelson (1973). Construction of quantum fields from Markov fields. J. Funct. Anal. 12.
A. Ostebee, P. Gambardella, and M. Dresden (1976). A "nonstandard" approach to the thermodynamic limit. II. Weakly tempered potentials and neutral Coulomb systems. J. Math. Phys. 17.
C. Preston (1976). Random fields. Lecture Notes in Math. 534, Springer-Verlag, Berlin and New York.
M. M. Richter (1982). Ideale Punkte, Monaden und Nichtstandard-Methoden. Vieweg, Wiesbaden.
A. Robinson (1966). Nonstandard Analysis. North-Holland Publ., Amsterdam.
M. Röckner (1985). A Dirichlet problem for distributions and specifications for random fields. Mem. Amer. Math. Soc. 54.
D. Ruelle (1983). Statistical Mechanics: Rigorous Results. Benjamin, New York.
E. Seiler (1982). Gauge theories as a problem of constructive quantum field theory and statistical mechanics. Lecture Notes in Phys. 159, Springer-Verlag, Berlin and New York.
B. Simon (1974). The $P(\varphi)_2$ Euclidean (Quantum) Field Theory. Princeton Univ. Press, Princeton, New Jersey.
Y. Sinai (1982). Theory of Phase Transitions: Rigorous Results. Pergamon, Oxford.
A. Sokal (1982). An alternate constructive approach to the $\varphi_3^4$ quantum field theory, and a possible destructive approach to $\varphi_4^4$. Ann. Inst. H. Poincaré Sect. A 37.
F. Spitzer (1970). Interaction of Markov processes. Adv. Math. 5.
A. Stoll (1985). Selfrepellent random walks and polymer measures in two dimensions. Doctoral dissertation, Bochum.
D. W. Stroock (1978). Lectures on infinite interacting systems. Lectures in Math. 11.
K. Stroyan and J. Bayod (1985). Foundations of Infinitesimal Stochastic Analysis. North-Holland Publ., Amsterdam (to appear).
W. G. Sullivan (1975). Markov processes for random fields. Comm. Dublin Inst. Adv. Stud. Ser. A 23.
K. Symanzik (1969). Euclidean quantum field theory. In (R. Jost, ed.), Local Quantum Theory. Academic Press, New York and London.
H. von Weizsäcker (1980). A simple example concerning the global Markov property of lattice random fields. Proc. Winter School Abstr. Anal., 8th, Praha.
J. Westwater (1980). On Edwards' model for long polymer chains. Comm. Math. Phys. 72.
J. Westwater (1985). On Edwards' model for long polymer chains. In (S. Albeverio and Ph. Blanchard, eds.), Trends and Developments in the Eighties. Proc. Bielefeld Enc. Math. Phys. IV. World Scientific, Singapore.
B. Zegarlinski (1985). Uniqueness and the global Markov property for Euclidean fields: the case of general exponential interaction. Comm. Math. Phys.
B. Zegarlinski (1984). Extremality and the global Markov property, II: global Markov property for non-FKG maximal Gibbs measures. BiBoS (preprint), Bielefeld.
A
Absolute joint variation of martingales, 158
Absolutely continuous measure, 133
Abstract Wiener space, 197
Adapted, 111, 132
    almost surely, 112, 132
Almost σ-compact, 266
Anderson Lusin theorem, 93
Anderson random walk, 78, 159, 196
Ascoli lemma, 31
Atomless measure, 206, 479
B
Beurling-Deny formula, 250
Bilinear form, see Quadratic form
Bounded operator, 55, 197
Boltzmann equation, 380
    Loeb solution of, 401
Brownian bridges, 85
    Poisson field of, 480
Brownian local time, 85, 486
Brownian motion, 79, 155
    continuity of, 82, 97
    infinite dimensional, 197
    Lévy, 213
    modulus of continuity, 97
Brownian sheet, see Yeh-Wiener process
Burkholder-Davis-Gundy inequalities, 126
C
Canard, 36
Central limit theorem, 81
C-lifting, 434
Closed form, 234
Closed set
    in R, 10
    in topological space, 48
Cofinite set, 5
Compact
    map, 52
    operator, 55, 56, 334
    set in R, 10
    set in topological space, 48
    support, 296
Compact-open topology, 52
Condition C, 434
Conditional expectation, 73
Configuration, 413
    external, 414
    space, 413
Connective, 11
Continuity, 26, see also S-continuity
    uniform, 27, 51
*-continuity, 51
Control
    admissible, 173
    admissible relaxed, 173
    internal, 175
    optimal, 174
Cost, 173
    function, 173, 190
    internal, 177
Cosurface, 469
Cylinder set
    in C[0,1], 96
    in Banach space, 100
D
𝒟-dense, 334, 344
Derivative, 27
Differential, 27
Diffusion system, 413
Dirichlet form, 245, 250, 251, see also Quadratic form
    generating quasi-continuous extensions, 288
    locally finite, 293
    nearstandardly concentrated, 285
    normal, 258
    regular, 296
    separating compacts, 282
Distribution, 457
DLR equations (Dobrushin-Lanford-Ruelle), 424
DNA-protein reaction, 169
Dobrushin uniqueness condition, 441
Doléans measure, 142
Domain of quadratic form, 228, 230, 238
Doob inequality, 119
Drift system, 413
E
E (expectation), 71
Em, E(t) (quadratic forms), 227, 228
ea0, ea(A), ef: (equilibrium potentials), 255, 256, 285, 288
Edwards polymer model, 375
*-embedding, 7, 16
Energy form
    hyperfinite, 302
    standard, 299
Entropy, 383, 421
Ergodic theorem, 78
Euclidean field, 452
    free, 450
Exceptional set, 266
    properly, 268
Excluded volume effect, 375
Expectation, 71
    conditional, 73
Exponential interaction, 459
    nontriviality of, 464
    and relativistic string, 504
*-extension, 7
Extension principle, 16
External
    configuration, 414
    set, 20
F
Fast point, 33
Feynman integral, 406
Feynman-Kac formula, 262
Filter, 5, see also Ultrafilter
    free, 6
Filtration
    internal, 110, 116
    right continuous, 273
    stochastic, 111, 131, 152, 273
Fin(*E) (finite points in *E), 53
Finite, see also Hyperfinite
    energy, 288
    number, 9
    point in metric space, 51
Finite intersection property, 6, 45
Finitely representable, 60
Form, see Quadratic form
Formula
    in L(R), 11
    in L(V(R)), 18
Fukushima decomposition theorem, 258
G
Gauge field, 468
Gaussian random field, 450
Gibbs state, 420
Girsanov formula, 188
Gross theorem, 100
H
Hamiltonian, 298, 419
Hausdorff space, 48
Henry theorem, 101
Hermitian operator, 56
H-function (Boltzmann), 382
Hilbert-Schmidt operator, 99, 198
Hull, 226
Hyperfinite
    dimensional space, 55
    lattice, 209, 412
    probability space, 68
    random walk, 78
    set, 61
    stochastic process, 68, 108
    time line, 68, 115
Hyperreal, 7
I
Ideal boundary, 315
Independence, stochastic, 79
Infinite number, 8
Infinitesimal, 8
    generator, 228
    increment, 127
    in normed space, 58
    in *R, 8
Inner standard part, 269
Integral
    Riemann, 28
    Lebesgue, 72
    Loeb, 72
Interaction, 419
    attractive, 442
    finite range, 419
    for gauge fields, 469
    for quantum fields, 453, 458
    singular, 328
    tempered, 446
    translation invariant, 420
    zero range, 309, 328, 349
Internal, see also Hyperfinite
    cardinality, 67
    function, 22
    measure, 68
    set, 20
Internal definition principle, 21
Invariance principle, 171, 218
    for Lévy Brownian motion, 218
Invariant subspace problem, 58
Irregularities, set of, 274
    exceptional, 275
Ising model, 419
Itô integral, 111, see also Stochastic integral
Itô lemma, 113, 151
K
κ-saturation, 47
Keisler Fubini theorem, 75
Kolmogorov continuity theorem, 209
König lemma (weak), 31
Krylov inequality, 163
L
L(𝒜) (Loeb algebra of 𝒜), 66
L(P) (Loeb measure of P), 66
L(R) (first-order language), 11
L(V(R)) (first-order language), 18
Laplace operator, 247, 328
    and Brownian motion, 254, 260, 373, 478
    discrete, 247, 304, 454, 465, 466, 477
    and quantum mechanics, 260, 298, 328, 373, 404, 450
    resolvent kernel of, 351
    singular perturbation of, 348
        at a point, 308, 351, 357
        along Brownian paths, 361, 363, 371
Lattice field, 412
    free, 455
    hyperfinite, 456
Lebesgue measure, 70
Lévy Brownian motion, 213
Lévy modulus of continuity, 98
Lifetime of Markov process, 273
    internal, 274
Lifting, 69, 195
    C-lifting, 434
    essentially uniform, 138
    nonanticipating, 112, 134, 135
        and essentially uniform, 138
    of stochastic process, 112, 134
    2-lifting, 146
    uniform, 136
Local time, 85
    functional, 359
Localizing sequence of stopping times, 118
Locally finite Dirichlet form, 293
Loeb algebra, 66
Loeb measure, 66
Loeb space, 66
M
μ(x) (monad of x), 10, 47
Markov
    operator, 245, 251
    process, 248, 265
        strong, 273
        symmetric, 249
    property of random field
        global, 432, 444
        hyperfinite, 432
        local, 432
Martingale
    hyperfinite, 116
    λ², 118
    local λ², 118
    local SL², 143
    S-continuity of, 123
    SL², 143
    standard L², 141
        local L², 142
Mean energy, 420
Measurable
    norm, 99
    process, 111, 132
    rectangle, 132
Measure
    Doléans, 142
    extension of, 95, 101
    hyperfinite representation of, 94
    internal, 63
    product, 74
    Radon, 87, 244, 329
    regular, 87
Monad
    in R, 10
    in topological space, 47
N
ν_M (martingale measure), 122
Nearstandard, 48, 51, see also Pre-nearstandard
    function, 245
    operator, 197
Nearstandardly concentrated Dirichlet form, 258
Nonanticipating, 109, 116, 140, 175, 189
Nonstandard hull, 53, 226
Normal Dirichlet form, 258
Ns(*E), ns(*E) (nearstandard points in *E), 48, 396
O
Observation, 172
Observation space, 173
Open set
    in R, 10
    in topological space, 48
Osterwalder-Schrader positivity, 452
Overflow, 21
P
Partition function, 420
Peano existence theorem, 30
Persistence length, 372
Perturbation supported on a set, 345
Phase transition, 429
Pns(*E) (pre-nearstandard points in *E), 54, 195
Point interaction, 309, 328, 349
Poisson random field, 479
    of Brownian bridges, 480
    of Brownian local times, 486
Polymer, 372
    model, 375, 496
Polysaturation, 47
Pre-nearstandard, 54, 195, see also Nearstandard
Pressure, 421
Probability logic, 153, 220
Product measure, 74
Prohorov theorem, 96
Projection, 56
Projective system
    of measures, 96
    of topological spaces, 95
Q
Quadratic form, see also Dirichlet form
    hyperfinite, 226
        domain of, 228, 230, 236
        nontrivial perturbation of, 246, 334, 339, 341
        resolvent of, 234
        S-bounded, 226
        semigroup of, 227
        standard form induced by, 242
        standard part of, 234
    standard closed, 226, 232, 241
        hyperfinite representation of, 242, 245, 343
        perturbation of, 345, 347, 348
Quadratic variation
    hyperfinite, 117
    S-continuity of, 123
    standard, 148
    of well-behaved martingales, 150
Quantifier, 11, 18
Quantum field, 451, 452, 482, 486
    exponential interaction, 459, 464
        and relativistic string, 504
    euclidean, 452
    free, 450, 482
    q, 445, 493
        and polymer measures, 496
    0;a;, 496, 503
Quasi-continuity, 287
Quasi-continuous extensions, 288
R
*R, 7, 8
Radon measure, 87, 244, 329
Random field, see also Quantum field
    Gaussian, 450
    generalized, 457
    Lévy Brownian motion, 213
    Poisson, 479
    Yeh-Wiener (Brownian sheet), 218
Random walk
    hyperfinite, 78
    loop erased, 378
    self avoiding, 377
    simple, 377
    with taboo set, 378
Reflexive space, 60
Regular
    Dirichlet form, 296
    set function, 87
    space, 282
Relativistic string, 504
Resolvent, 234
Rich, 207, 244
Riemann sum, 28
S
S-bounded
    form, 226
    operator, 197
    off a set, 334
S-continuity, 51, 123, 209, 287, 415
S-continuous at zero, 143
S-dense, 241
S-integrable, 71, 77
S-left limit, 119
S-right continuous at zero, 143
S-right limit, 119
S-white noise, 206
Saturation, 46, 47
    countable, 46
    κ-saturation, 47
    polysaturation, 47
Schrödinger equation, 328
Schrödinger operator, 298, 328
Schwinger function, 450
SDJ-process, 149, see also Well-behaved process
Self-adjoint operator, 56
    and closed form, 226, 328
Semigroup, 228
Semimartingale, 170
Separate compacts, 282
Separating family, 282
Sequence, 23
    double, 24
    internal extension of, 47
    limit of, 23
Singular perturbation, 329, 345, 347, 348, 351, 357, 361, 363
Site, 413
SL(M) (class of stochastic integrands), 123
SL²(M) (class of stochastic integrands), 123
SL²-martingale, 143
    local, 143
Slow manifold, 34
Slow point, 34
Souslin operation, 271
Speed function, 413
Spin, 413
Stable point, 34
Standard generating set, 195
Standard part, 9, 48, 195
    of configuration, 414
    in compact-open topology, 52
    in Hausdorff space, 48
    of internal control, 176
    inner, 269
    left, 121
    modified, 274
    of quadratic form, 234
    in *R, 9
    right, 121
    of stochastic process, 121
Standard set, 20
State, 420, 422
    equilibrium, 420, 423, 424, 426
    Gibbs, 420
    internal, 423
Stochastic differential equation, 159, 202
    finite dimensional, 159, 170, 312
        strict solution, 161
        strong solution, 160
        weak solution, 160
    infinite dimensional, 202
        mild solution, 203
        weakened solution, 203
Stochastic filtration, see Filtration
Stochastic integral, 108
    hyperfinite, 108, 116, 160
        infinite dimensional, 199
        S-continuity of, 123, 199, 209
        of white noise, 207
    standard, 111, 142, 160
        infinite dimensional, 202
        of white noise, 208
Stochastic process, 68
    adapted, 111, 132
    measurable, 111, 132
    predictable, 132
Stopped process, 118
Stopping time, 118, 273
Stosszahlansatz, 380
Sturm-Liouville problem, 320
Submartingale, 117
Supermartingale, 117
Superreflexive, 60
Superstructure, 16
T
Tame function, 413
    base of, 413
Tempered interaction, 446
Thermal average, 420
Three-space property, 61
Transfer principle, 14, 18
Transition matrix, 248
U
Ultrafilter, 6, 49
Ultrapower, bounded, 16
Underflow, 21
Uniform lifting, 136
Unit contraction, 251
Universally measurable, 273
Unstable point, 34
V
V(S) (superstructure over S), 16, 22
Van der Pol equation, 34
Van Hove limit, 422
Vector-valued measure, 66
W
Well-behaved process, 148, 150, 151
White noise, 205, 206
Wick renormalization, 461
Wiener measure, 84, 98
    conditional, 85
Wightman distributions, 451
Wightman-Gårding axioms, 451
Wilson loop, 471
Y
Yang-Mills field, 471
Yeh-Wiener process, 218
Z
Zero divisor, 5
Zero range interaction, 309, 328, 349