CAMBRIDGE STUDIES IN ADVANCED MATHEMATICS 100
MARKOV PROCESSES, GAUSSIAN PROCESSES, AND LOCAL TIMES Written by two of the foremost researchers in the field, this book studies the local times of Markov processes by employing isomorphism theorems that relate them to certain associated Gaussian processes. It builds to this material through self-contained but harmonized “mini-courses” on the relevant ingredients, which assume only knowledge of measure-theoretic probability. The streamlined selection of topics creates an easy entrance for students and experts in related fields. The book starts by developing the fundamentals of Markov process theory and then of Gaussian process theory, including sample path properties. It then proceeds to more advanced results, bringing the reader to the heart of contemporary research. It presents the remarkable isomorphism theorems of Dynkin and Eisenbaum and then shows how they can be applied to obtain new properties of Markov processes by using well-established techniques in Gaussian process theory. This original, readable book will appeal to both researchers and advanced graduate students.
Cambridge Studies in Advanced Mathematics
Editorial Board: Béla Bollobás, William Fulton, Anatole Katok, Frances Kirwan, Peter Sarnak, Barry Simon, Burt Totaro
All the titles listed below can be obtained from good booksellers or from Cambridge University Press. For a complete series listing, visit http://www.cambridge.org/us/mathematics
Recently published:
71 R. Blei Analysis in Integer and Fractional Dimensions
72 F. Borceux & G. Janelidze Galois Theories
73 B. Bollobás Random Graphs 2nd Edition
74 R. M. Dudley Real Analysis and Probability 2nd Edition
75 T. Sheil-Small Complex Polynomials
76 C. Voisin Hodge Theory and Complex Algebraic Geometry I
77 C. Voisin Hodge Theory and Complex Algebraic Geometry II
78 V. Paulsen Completely Bounded Maps and Operator Algebras
79 F. Gesztesy & H. Holden Soliton Equations and Their Algebra-Geometric Solutions I
81 S. Mukai An Introduction to Invariants and Moduli
82 G. Tourlakis Lectures in Logic and Set Theory I
83 G. Tourlakis Lectures in Logic and Set Theory II
84 R. A. Bailey Association Schemes
85 J. Carlson, S. Müller-Stach & C. Peters Period Mappings and Period Domains
86 J. J. Duistermaat & J. A. C. Kolk Multidimensional Real Analysis I
87 J. J. Duistermaat & J. A. C. Kolk Multidimensional Real Analysis II
89 M. C. Golumbic & A. N. Trenk Tolerance Graphs
90 L. H. Harper Global Methods for Combinatorial Isoperimetric Problems
91 I. Moerdijk & J. Mrcun Introduction to Foliations and Lie Groupoids
92 J. Kollár, K. E. Smith & A. Corti Rational and Nearly Rational Varieties
93 D. Applebaum Lévy Processes and Stochastic Calculus
95 M. Schechter An Introduction to Nonlinear Analysis
96 R. Carter Lie Algebras of Finite and Affine Type
97 H. L. Montgomery & R. C. Vaughan Multiplicative Number Theory I
98 I. Chavel Riemannian Geometry
99 D. Goldfeld Automorphic Forms and L-Functions for the Group GL(n,R)
MARKOV PROCESSES, GAUSSIAN PROCESSES, AND LOCAL TIMES
MICHAEL B. MARCUS
City College and the CUNY Graduate Center
JAY ROSEN
College of Staten Island and the CUNY Graduate Center
cambridge university press
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo
Cambridge University Press, The Edinburgh Building, Cambridge cb2 2ru, UK
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521863001
© Michael B. Marcus and Jay Rosen 2006
This publication is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.
First published in print format 2006
ISBN-13 978-0-511-24696-8 eBook (NetLibrary)
ISBN-10 0-511-24696-X eBook (NetLibrary)
ISBN-13 978-0-521-86300-1 hardback
ISBN-10 0-521-86300-7 hardback
Cambridge University Press has no responsibility for the persistence or accuracy of urls for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
To our wives
Jane Marcus and
Sara Rosen
Contents

1 Introduction
  1.1 Preliminaries
2 Brownian motion and Ray–Knight Theorems
  2.1 Brownian motion
  2.2 The Markov property
  2.3 Standard augmentation
  2.4 Brownian local time
  2.5 Terminal times
  2.6 The First Ray–Knight Theorem
  2.7 The Second Ray–Knight Theorem
  2.8 Ray’s Theorem
  2.9 Applications of the Ray–Knight Theorems
  2.10 Notes and references
3 Markov processes and local times
  3.1 The Markov property
  3.2 The strong Markov property
  3.3 Strongly symmetric Borel right processes
  3.4 Continuous potential densities
  3.5 Killing a process at an exponential time
  3.6 Local times
  3.7 Jointly continuous local times
  3.8 Calculating u_{T_0} and u_{τ(λ)}
  3.9 The h-transform
  3.10 Moment generating functions of local times
  3.11 Notes and references
4 Constructing Markov processes
  4.1 Feller processes
  4.2 Lévy processes
  4.3 Diffusions
  4.4 Left limits and quasi left continuity
  4.5 Killing at a terminal time
  4.6 Continuous local times and potential densities
  4.7 Constructing Ray semigroups and Ray processes
  4.8 Local Borel right processes
  4.9 Supermedian functions
  4.10 Extension Theorem
  4.11 Notes and references
5 Basic properties of Gaussian processes
  5.1 Definitions and some simple properties
  5.2 Moment generating functions
  5.3 Zero–one laws and the oscillation function
  5.4 Concentration inequalities
  5.5 Comparison theorems
  5.6 Processes with stationary increments
  5.7 Notes and references
6 Continuity and boundedness of Gaussian processes
  6.1 Sufficient conditions in terms of metric entropy
  6.2 Necessary conditions in terms of metric entropy
  6.3 Conditions in terms of majorizing measures
  6.4 Simple criteria for continuity
  6.5 Notes and references
7 Moduli of continuity for Gaussian processes
  7.1 General results
  7.2 Processes on R^n
  7.3 Processes with spectral densities
  7.4 Local moduli of associated processes
  7.5 Gaussian lacunary series
  7.6 Exact moduli of continuity
  7.7 Squares of Gaussian processes
  7.8 Notes and references
8 Isomorphism Theorems
  8.1 Isomorphism theorems of Eisenbaum and Dynkin
  8.2 The Generalized Second Ray–Knight Theorem
  8.3 Combinatorial proofs
  8.4 Additional proofs
  8.5 Notes and references
9 Sample path properties of local times
  9.1 Bounded discontinuities
  9.2 A necessary condition for unboundedness
  9.3 Sufficient conditions for continuity
  9.4 Continuity and boundedness of local times
  9.5 Moduli of continuity
  9.6 Stable mixtures
  9.7 Local times for certain Markov chains
  9.8 Rate of growth of unbounded local times
  9.9 Notes and references
10 p-variation
  10.1 Quadratic variation of Brownian motion
  10.2 p-variation of Gaussian processes
  10.3 Additional variational results for Gaussian processes
  10.4 p-variation of local times
  10.5 Additional variational results for local times
  10.6 Notes and references
11 Most visited sites of symmetric stable processes
  11.1 Preliminaries
  11.2 Most visited sites of Brownian motion
  11.3 Reproducing kernel Hilbert spaces
  11.4 The Cameron–Martin Formula
  11.5 Fractional Brownian motion
  11.6 Most visited sites of symmetric stable processes
  11.7 Notes and references
12 Local times of diffusions
  12.1 Ray’s Theorem for diffusions
  12.2 Eisenbaum’s version of Ray’s Theorem
  12.3 Ray’s original theorem
  12.4 Markov property of local times of diffusions
  12.5 Local limit laws for h-transforms of diffusions
  12.6 Notes and references
13 Associated Gaussian processes
  13.1 Associated Gaussian processes
  13.2 Infinitely divisible squares
  13.3 Infinitely divisible squares and associated processes
  13.4 Additional results about M-matrices
  13.5 Notes and references
14 Appendix
  14.1 Kolmogorov’s Theorem for path continuity
  14.2 Bessel processes
  14.3 Analytic sets and the Projection Theorem
  14.4 Hille–Yosida Theorem
  14.5 Stone–Weierstrass Theorems
  14.6 Independent random variables
  14.7 Regularly varying functions
  14.8 Some useful inequalities
  14.9 Some linear algebra
References
Index of notation
Author index
Subject index
1 Introduction
We found it difficult to choose a title for this book. Clearly we are not covering the theory of Markov processes, Gaussian processes, and local times in one volume. A more descriptive title would have been “A Study of the Local Times of Strongly Symmetric Markov Processes Employing Isomorphisms That Relate Them to Certain Associated Gaussian Processes.” The innovation here is that we can use the well-developed theory of Gaussian processes to obtain new results about local times.
Even with the more restricted title there is a lot of material to cover. Since we want this book to be accessible to advanced graduate students, we try to provide a self-contained development of the Markov process theory that we require. Next, since the crux of our approach is that we can use sophisticated results about the sample path properties of Gaussian processes to obtain similar sample path properties of the associated local times, we need to present this aspect of the theory of Gaussian processes. Furthermore, interesting questions about local times lead us to focus on some properties of Gaussian processes that are not usually featured in standard texts, such as processes with spectral densities or those that have infinitely divisible squares. Occasionally, as in the study of the p-variation of sample paths, we obtain new results about Gaussian processes.
Our third concern is to present the wonderful, mysterious isomorphism theorems that relate the local times of strongly symmetric Markov processes to associated mean zero Gaussian processes. Although some inkling of this idea appeared earlier in Brydges, Fröhlich and Spencer (1982), we think that credit for formulating it in an intriguing and usable format is due to E. B. Dynkin (1983), (1984). Subsequently, after our initial paper on this subject, Marcus and Rosen (1992d), in which we use Dynkin’s Theorem, N. Eisenbaum (1995) found an unconditioned isomorphism that seems to be easier to use. After this, Eisenbaum, Kaspi,
Marcus, Rosen and Shi (2000) found a third isomorphism theorem, which we refer to as the Generalized Second Ray–Knight Theorem, because it is a generalization of this important classical result. Dynkin’s and Eisenbaum’s proofs contain a lot of difficult combinatorics, as does our proof of Dynkin’s Theorem in Marcus and Rosen (1992d). Several years ago we found much simpler proofs of these theorems. Being able to present this material in a relatively simple way was our primary motivation for writing this book.
The classical Ray–Knight Theorems are isomorphisms that relate local times of Brownian motion and squares of independent Brownian motions. In the three isomorphism theorems we just referred to, these theorems are extended to give relationships between local times of strongly symmetric Markov processes and the squares of associated Gaussian processes. A Markov process with symmetric transition densities is strongly symmetric. Its associated Gaussian process is the mean zero Gaussian process with covariance equal to its 0-potential density. (If the Markov process, say X, does not have a 0-potential density, one can consider $\widetilde{X}$, the process X killed at the end of an independent exponential time with mean 1/α. The 0-potential density of $\widetilde{X}$ is the α-potential density of X.)
As an example of how the isomorphism theorems are used and of the kinds of results we obtain, we mention that we show that there exists a jointly continuous version of the local times of a strongly symmetric Markov process if and only if the associated Gaussian process has a continuous version. We obtain this result as an equivalence, without obtaining conditions that imply that either process is continuous. However, conditions for the continuity of Gaussian processes are known, so we also know them for the joint continuity of the local times.
M. Barlow and J. Hawkes obtained a sufficient condition for the joint continuity of the local times of Lévy processes in Barlow (1985) and Barlow and Hawkes (1985), which Barlow showed, in Barlow (1988), is also necessary. Gaussian processes do not enter into the proofs of their results. (Although they do point out that their conditions are also necessary and sufficient conditions for the continuity of related stationary Gaussian processes.) This stimulating work motivated us to look for a more direct link between Gaussian processes and local times and led us to Dynkin’s isomorphism theorem. We must point out that the work of Barlow and Hawkes just cited applies to all Lévy processes, whereas the isomorphism theorem approach that we present applies only to symmetric Lévy processes. Nevertheless, our approach is not limited to Lévy processes and also opens up
the possibility of using Gaussian process theory to obtain many other interesting properties of local times.
Another confession we must make is that we do not really understand the actual relationship between local times of strongly symmetric Markov processes and their associated Gaussian processes. That is, we have several functional equivalences between these disparate objects and can manipulate them to obtain many interesting results, but if one asks us, as is often the case during lectures, to give an intuitive description of how local times of Markov processes and Gaussian processes are related, we must answer that we cannot. We leave this extremely interesting question to you. Nevertheless, there now exist interesting characterizations of the Gaussian processes that are associated with Markov processes. We say more about this in our discussion of the material in Chapter 13.
The isomorphism theorems can be applied to very general classes of Markov processes. In this book, with the exception of Chapter 13, we consider Borel right processes. To ease the reader into this degree of generality, and to give an idea of the direction in which we are going, in Chapter 2 we begin the discussion of Markov processes by focusing on Brownian motion. For Brownian motion these isomorphisms are old stuff, but because the local times of Brownian motion are related to the squares of independent Brownian motions, one does not really leave the realm of Markov processes. That is, we think that in the classical Ray–Knight Theorems one can view Brownian motion as a Markov process, which it is, rather than as a Gaussian process, which it also is.
Chapters 2–4 develop the Markov process material we need for this book. Naturally, there is an emphasis on local times. There is also an emphasis on computing the potential density of strongly symmetric Markov processes, since it is through the potential densities that we associate the local times of strongly symmetric Markov processes with Gaussian processes. Even though Chapter 2 is restricted to Brownian motion, there is a lot of fundamental material required to construct the σ-algebras of the probability space that enables us to study local times. We do this in such a way that it also holds for the much more general Markov processes studied in Chapters 3 and 4. Therefore, although many aspects of Chapter 2 are repeated in greater generality in Chapters 3 and 4, the latter two chapters are not independent of Chapter 2.
In the beginning of Chapter 3 we study general Borel right processes with locally compact state spaces but soon restrict our attention to strongly symmetric Borel right processes with continuous potential densities. This restriction is tailored to the study of local times of Markov
processes via their associated mean zero Gaussian processes. Also, even though this restriction may seem to be significant from the perspective of the general theory of Markov processes, it makes it easier to introduce the beautiful theory of Markov processes. We are able to obtain many deep and interesting results, especially about local times, relatively quickly and easily. We also consider h-transforms and generalizations of Kac’s Theorem, both of which play a fundamental role in proving the isomorphism theorems and in applying them to the study of local times.
Chapter 4 deals with the construction of Markov processes. We first construct Feller processes and then use them to show the existence of Lévy processes. We also consider several of the finer properties of Borel right processes. Lastly, we construct a generalization of Borel right processes that we call local Borel right processes. These are needed in Chapter 13 to characterize associated Gaussian processes. This requires the introduction of Ray semigroups and Ray processes.
Chapters 5–7 are an exposition of sample path properties of Gaussian processes. Chapter 5 deals with structural properties of Gaussian processes and lays out the basic tools of Gaussian process theory. One of the most fundamental tools in this theory is the Borell, Sudakov–Tsirelson isoperimetric inequality. As far as we know, this is stated without a complete proof in earlier books on Gaussian processes because the known proofs relied on the Brunn–Minkowski inequality, which was deemed too far afield to prove. We give a new, analytical proof of the Borell, Sudakov–Tsirelson isoperimetric inequality, due to M. Ledoux, in Section 5.4.
Chapter 6 presents the work of R. M. Dudley, X. Fernique and M. Talagrand on necessary and sufficient conditions for continuity and boundedness of sample paths of Gaussian processes. This important work has been polished throughout the years in several texts, Ledoux and Talagrand (1991), Fernique (1997), and Dudley (1999), so we can give efficient proofs. Notably, we give a proof, also due to Talagrand, of Talagrand’s necessary condition for continuity involving majorizing measures that is simpler than the one in Ledoux and Talagrand (1991). Our presentation in this chapter relies heavily on Fernique’s excellent monograph, Fernique (1997).
Chapter 7 considers uniform and local moduli of continuity of Gaussian processes. We treat this question in general in Section 7.1. In most of the remaining sections in this chapter, we focus our attention on real-valued Gaussian processes with stationary increments, {G(t), t ∈ R^1}, for which the increments’ variance, $\sigma^2(t-s) := E(G(t)-G(s))^2$, is relatively smooth. This may appear old-fashioned to the Gaussian purist but
it is exactly these processes that are associated with real-valued Lévy processes. (And Lévy processes with values in R^n have local times only when n = 1.) Some results developed in this section and its applications in Section 9.5 have not been published elsewhere.
Chapters 2–7 develop the prerequisites for the book. Except for Section 3.7, the material at the end of Chapter 4 relating to local Borel right processes, and a few other items that are referenced in later chapters, they can be skipped by readers with a good background in the theory of Gaussian and Markov processes.
In Chapter 8 we prove the three main isomorphism theorems that we use. Even though we are pleased to be able to give simple proofs that avoid the difficult combinatorics of the original proofs of these theorems, in Section 8.3 we give the combinatoric proofs, both because they are interesting and because they may be useful later on.
Chapter 9 puts everything together to give sample path properties of local times. Some of the proofs are short, simply a reiteration of results that have been established in earlier chapters. At this point in the book we have given all the results in our first two joint papers on local times and isomorphism theorems (Marcus and Rosen, 1992a, 1992d). We think that we have filled in all the details and that many of the proofs are much simpler. We have also laid the foundation to obtain other interesting sample path properties of local times, which we present in Chapters 10–13.
In Chapter 10 we consider the p-variation of the local times of symmetric stable processes of index 1 < p ≤ 2 (this includes Brownian motion). To use our isomorphism theorem approach we first obtain results on the p-variation of fractional Brownian motion that generalize results of Dudley (1973) and Taylor (1972) that were obtained for Brownian motion. These are extended to the squares of fractional Brownian motion and then carried over to give results about the local times of symmetric stable processes.
Chapter 11 presents results of Bass, Eisenbaum and Shi (2000) on the range of the local times of symmetric stable processes as time goes to infinity and shows that the most visited site of such processes is transient. Our approach is different from theirs. We use an interesting bound for the behavior of stable processes in a neighborhood of the origin due to Molchan (1999), which itself is based on properties of the reproducing kernel Hilbert spaces of fractional Brownian motions.
In Chapter 12 we reexamine Ray’s early isomorphism theorem for the h-transform of a transient regular symmetric diffusion, Ray (1963), and
give our own, simpler version. We also consider the Markov properties of the local times of diffusions.
In Chapter 13, which is based on recent work of N. Eisenbaum and H. Kaspi that appears in Eisenbaum (2003), Eisenbaum (2005), and Eisenbaum and Kaspi (2006), we take up the problem of characterizing associated Gaussian processes. To obtain several equivalencies we must generalize Borel right processes to what we call local Borel right processes. In Theorem 13.3.1 we see that associated Gaussian processes are just a little less general than the class of Gaussian processes that have infinitely divisible squares. Gaussian processes with infinitely divisible squares are characterized in Griffiths (1984) and Bapat (1989). We present their results in Section 13.2.
We began our joint research that led to this book over 19 years ago. In the course of this time we received valuable help from R. Adler, M. Barlow, H. Kaspi, E. B. Dynkin, P. Fitzsimmons, R. Getoor, E. Giné, M. Talagrand, and J. Zinn. We express our thanks and gratitude to them. We also acknowledge the help of P.-A. Meyer. In the preparation of this book we received valuable assistance and advice from O. Daviaud, S. Dhamoon, V. Dobric, N. Eisenbaum, S. Evans, P. Fitzsimmons, C. Houdré, H. Kaspi, W. Li, and J. Rosinski. We thank them also. We are also grateful for the continued support of the National Science Foundation and PSC–CUNY throughout the writing of this book.
1.1 Preliminaries
In this book Z denotes the integers, both positive and negative, and N denotes the positive integers including 0. R^1 denotes the real line and R_+ the positive half line (including zero). $\overline{R}$ denotes the extended real line [−∞, ∞]. R^n denotes n-dimensional space and | · | denotes Euclidean distance in R^n. We say that a real number a is positive if a ≥ 0. To specify that a > 0, we might say that it is strictly positive. A similar convention is used for negative and strictly negative.
Measurable spaces: A measurable space is a pair (Ω, F), where Ω is a set and F is a sigma-algebra of subsets of Ω. If Ω is a topological space, we use B(Ω) to denote the Borel σ-algebra of Ω. Bounded B(Ω) measurable functions on Ω are denoted by B_b(Ω).
Let t ∈ R_+. A filtration of F is an increasing family of sub σ-algebras F_t of F, that is, for 0 ≤ s < t < ∞, F_s ⊂ F_t ⊂ F with F = ∪_{0≤t<∞} F_t.
(Sometimes we describe this by saying that F is filtered.) To emphasize a specific filtration F_t of F, we sometimes write (Ω, F, F_t). Let M and N denote two σ-algebras of subsets of Ω. We use M ∨ N to denote the σ-algebra generated by M ∪ N.
Probability spaces: A probability space is a triple (Ω, F, P), where (Ω, F) is a measurable space and P is a probability measure on Ω. A random variable, say X, is a measurable function on (Ω, F, P). In general we let E denote the expectation operator on the probability space. When there are many random variables defined on (Ω, F, P), say Y, Z, . . ., we use E_Y to denote expectation with respect to Y. When dealing with a probability space, when it seems clear what we mean, we feel free to use E or even expressions like E_Y without defining them. As usual, we let ω denote the elements of Ω. As with E, we often use ω in this context without defining it.
When X is a random variable we call a number a a median of X if
\[ P(X \le a) \ge \tfrac{1}{2} \quad \text{and} \quad P(X \ge a) \ge \tfrac{1}{2}. \tag{1.1} \]
Note that a is not necessarily unique.
A stochastic process X on (Ω, F, P) is a family of measurable functions {X_t, t ∈ I}, where I is some index set. In this book, t usually represents “time” and we generally consider {X_t, t ∈ R_+}. σ(X_r; r ≤ t) denotes the smallest σ-algebra for which {X_r; r ≤ t} is measurable. Sometimes it is convenient to describe a stochastic process as a random variable on a function space, endowed with a suitable σ-algebra and probability measure.
In general, in this book, we reserve (Ω, F, P) for a probability space. We generally use (S, S, µ) to indicate more general measure spaces. Here µ is a positive (i.e., nonnegative) σ-finite measure.
Function spaces: Let f be a measurable function on (S, S, µ). The L^p(µ) (or simply L^p), 1 ≤ p < ∞, spaces are the families of functions f for which \int_S |f(s)|^p \, d\mu(s) < \infty, with
\[ \|f\|_p := \left( \int_S |f(s)|^p \, d\mu(s) \right)^{1/p}. \tag{1.2} \]
Sometimes, when we need to be precise, we may write \|f\|_{L^p(S)} instead of \|f\|_p. As usual we set
\[ \|f\|_\infty = \sup_{s \in S} |f(s)|. \tag{1.3} \]
These definitions have analogs for sequence spaces. For 1 ≤ p < ∞, \ell^p is the family of sequences \{a_k\}_{k=0}^{\infty} of real or complex numbers such that \sum_{k=0}^{\infty} |a_k|^p < \infty. In this case, \|a_k\|_p := (\sum_{k=0}^{\infty} |a_k|^p)^{1/p} and \|a_k\|_\infty := \sup_{0 \le k < \infty} |a_k|. We use \ell_n^p to denote sequences in \ell^p with n elements.
Let m be a measure on a topological space (S, S). By an approximate identity or δ-function at y, with respect to m, we mean a family \{f_{\epsilon,y};\ \epsilon > 0\} of positive continuous functions on S such that \int f_{\epsilon,y}(x)\, dm(x) = 1 and each f_{\epsilon,y} is supported on a compact neighborhood K_\epsilon of y with K_\epsilon \downarrow \{y\} as \epsilon \to 0.
Let f and g be two real-valued functions on R^1. We say that f is asymptotic to g at zero and write f ∼ g if \lim_{x \to 0} f(x)/g(x) = 1. We say that f is comparable to g at zero and write f ≈ g if there exist constants 0 < C_1 ≤ C_2 < ∞ such that C_1 ≤ \liminf_{x \to 0} f(x)/g(x) and \limsup_{x \to 0} f(x)/g(x) ≤ C_2. We use essentially the same definitions at infinity.
Let f be a function on R^1. We use the notation \lim_{y \uparrow\uparrow x} f(y) for the limit of f(y) as y increases to x, for all y < x, that is, the left-hand (or simply left) limit of f at x.
Metric spaces: Let (S, τ) be a locally compact metric or pseudo-metric space. A pseudo-metric has the same properties as a metric except that τ(s, t) = 0 does not imply that s = t. Abstractly, one can turn a pseudo-metric into a metric by making the zeros of the pseudo-metric into an equivalence class, but in the study of stochastic processes pseudo-metrics are unavoidable. For example, suppose that X = {X(t), t ∈ [0, 1]} is a real-valued stochastic process. In studying sample path properties of X it is natural to consider (R^1, | · |), a metric space. However, X may be completely determined by an L^2 metric, such as
\[ d(s,t) := d_X(s,t) := \left( E(X(s) - X(t))^2 \right)^{1/2} \tag{1.4} \]
(and an additional condition such as E X^2(t) = 1). Therefore, it is natural to also consider the space (R^1, d). This may be a pseudo-metric space since d need not be a metric on R^1. If A ⊂ S, we set
\[ \tau(s, A) := \inf_{u \in A} \tau(s, u). \tag{1.5} \]
We use C(S) to denote the continuous functions on S, C_b(S) to denote the bounded continuous functions on S, and C_b^+(S) to denote the positive bounded continuous functions on S. We use C_κ(S) to denote the continuous functions on S with compact support; C_0(S) denotes the functions on S that go to 0 at infinity. C_0^∞(S), however, denotes the infinitely differentiable functions on S with compact support (whenever S is a space for which this is defined). In all these cases we mean continuity with respect to the metric or pseudo-metric τ. We say that a function is locally uniformly continuous on a measurable set in (S, τ) if it is uniformly continuous on all compact subsets of (S, τ). We say that a sequence of functions converges locally uniformly on (S, τ) if it converges uniformly on all compact subsets of (S, τ).
Separability: Let T be a separable metric space, and let X = {X(t), t ∈ T} be a stochastic process on (Ω, F, P) with values in $\overline{R}^n$. X is said to be separable if there is a countable set D ⊂ T and a P-null set Λ ∈ F such that, for any open set U ⊂ T and closed set A ⊂ $\overline{R}^n$,
\[ \{X(t) \in A,\ t \in D \cap U\} \setminus \{X(t) \in A,\ t \in U\} \subset \Lambda. \tag{1.6} \]
If X is separable and U ⊂ T is an open set and Λ is as above, then ω ∉ Λ implies
\[ \sup_{t \in D \cap U} |X(t, \omega)| = \sup_{t \in U} |X(t, \omega)| \quad \text{and} \quad \inf_{t \in D \cap U} |X(t, \omega)| = \inf_{t \in U} |X(t, \omega)|. \tag{1.7} \]
If T is a separable metric space, every stochastic process X = {X(t), t ∈ T} with values in $\overline{R}^n$ has a separable version $\widetilde{X}$ = {$\widetilde{X}$(t), t ∈ T}, that is, P(X(t) = $\widetilde{X}$(t)) = 1 for all t ∈ T, and $\widetilde{X}$ is separable for some D and Λ. If X is stochastically continuous, that is, \lim_{t \to t_0} P(|X(t) - X(t_0)| > \epsilon) = 0 for every \epsilon > 0 and t_0 ∈ T, then any countable dense set V ⊂ T serves as the set D in the separability condition (sometimes called the separability set). The P-null set Λ generally depends on the choice of V.
Fourier transform: We often give results with precise constants, so we need to describe what version of the Fourier transform we are using. Let f ∈ L^2(R^1). Consistent with the standard definition of the characteristic function, the Fourier transform $\widehat{f}$ of f is defined by
\[ \widehat{f}(\lambda) = \int_{-\infty}^{\infty} e^{i\lambda x} f(x) \, dx, \tag{1.8} \]
where the integral exists in the L^2 sense. The inverse Fourier transform is given by
\[ f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i\lambda x} \widehat{f}(\lambda) \, d\lambda. \tag{1.9} \]
With this normalization, Parseval’s Theorem is
\[ \int_{-\infty}^{\infty} f(x) \overline{g(x)} \, dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} \widehat{f}(\lambda) \overline{\widehat{g}(\lambda)} \, d\lambda. \tag{1.10} \]
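The identity (1.10) is easy to check numerically. The following sketch, in Python with NumPy (the grid sizes and the pair of Gaussian test densities are illustrative choices, not part of the text), verifies (1.8) and (1.10) by direct quadrature:

```python
import numpy as np

# Grids for naive quadrature; the integrands below decay fast enough
# that truncating to finite intervals is harmless here.
x = np.linspace(-20.0, 20.0, 4000, endpoint=False)
dx = x[1] - x[0]
lam = np.linspace(-10.0, 10.0, 800, endpoint=False)
dlam = lam[1] - lam[0]

def fourier(vals):
    # \hat f(lam) = int e^{i lam x} f(x) dx, as in (1.8), via Riemann sum
    return (np.exp(1j * np.outer(lam, x)) * vals).sum(axis=1) * dx

f = np.exp(-x**2 / 2.0) / np.sqrt(2.0 * np.pi)   # N(0, 1) density
g = np.exp(-x**2 / 4.0) / np.sqrt(4.0 * np.pi)   # N(0, 2) density
fhat, ghat = fourier(f), fourier(g)

# For a N(0, t) density the transform is e^{-t lam^2 / 2}
print(np.abs(fhat - np.exp(-lam**2 / 2.0)).max())            # ~ 0

# Parseval (1.10): int f conj(g) dx = (1/2pi) int fhat conj(ghat) dlam
lhs = (f * g).sum() * dx                                     # g is real
rhs = (fhat * ghat.conj()).sum().real * dlam / (2.0 * np.pi)
print(lhs, rhs)                                              # agree
```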
2 Brownian motion and Ray–Knight Theorems
In this book we develop relationships between the local times of strongly symmetric Markov processes and corresponding Gaussian processes. This was done for Brownian motion over 40 years ago in the famous Ray–Knight Theorems. In this chapter, which gives an overview of significant parts of the book, we discuss Brownian motion, its local times, and the Ray–Knight Theorems with an emphasis on those definitions and properties which we generalize to a much larger class of processes in subsequent chapters. Much of the material in this chapter is repeated in greater generality in subsequent chapters.
2.1 Brownian motion
A normal random variable with mean zero and variance t, denoted by N(0, t), is a random variable with a distribution function that has density
\[ p_t(x) = \frac{e^{-x^2/2t}}{\sqrt{2\pi t}}, \qquad x \in R^1, \tag{2.1} \]
with respect to Lebesgue measure. (It is easy to check that a random variable with density pt (x) does have mean zero and variance t.) In anticipation of using pt as the transition density of a Markov process, we sometimes use pt (x, y) to denote pt (y − x). We give some important calculations involving pt .
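The parenthetical claim is easy to confirm numerically as well. A minimal sketch in Python with NumPy (the grid is an arbitrary illustrative choice):

```python
import numpy as np

t = 2.5
x = np.linspace(-60.0, 60.0, 12001)
dx = x[1] - x[0]
p = np.exp(-x**2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)   # p_t(x) of (2.1)

print((p * dx).sum())          # ~ 1: p_t is a probability density
print((x * p * dx).sum())      # ~ 0: mean zero
print((x**2 * p * dx).sum())   # ~ t: variance t
```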
Lemma 2.1.1
(1) The Fourier transform of p_t(x) is
\[ \widehat{p}_t(\lambda) := \int_{-\infty}^{\infty} e^{i\lambda x} p_t(x) \, dx = e^{-t\lambda^2/2}. \tag{2.2} \]
Equivalently, if ξ is N(0, t),
\[ E(e^{i\lambda \xi}) = e^{-t\lambda^2/2}. \tag{2.3} \]
(2) If ξ is N(0, t) and ζ is N(0, s) and ξ and ζ are independent, then ξ + ζ is N(0, s + t).
(3)
\[ \int_{-\infty}^{\infty} p_s(x, y)\, p_t(y, z) \, dy = p_{s+t}(x, z). \tag{2.4} \]
This equation is called the Chapman–Kolmogorov equation.
(4) For α > 0,
\[ u^{\alpha}(x) := \int_{0}^{\infty} e^{-\alpha t} p_t(x) \, dt = \frac{e^{-\sqrt{2\alpha}\,|x|}}{\sqrt{2\alpha}}. \tag{2.5} \]
(We see below that u^α is the α-potential density of standard Brownian motion.)

Proof For (2.2), we write iλx − x²/2t = −(x − iλt)²/2t − tλ²/2, so that
\[ \widehat{p}_t(\lambda) = e^{-t\lambda^2/2} \frac{1}{\sqrt{2\pi t}} \int_{-\infty}^{\infty} e^{-(x - i\lambda t)^2/2t} \, dx = e^{-t\lambda^2/2} \frac{1}{\sqrt{2\pi t}} \int_{-\infty}^{\infty} e^{-x^2/2t} \, dx = e^{-t\lambda^2/2}. \tag{2.6} \]
For the second equality note that $(2\pi t)^{-1/2} \int_{C_N} \exp(-z^2/(2t)) \, dz = 0$, where C_N is the rectangle in the complex plane determined by {x | −N ≤ x ≤ N} and {x − iλt | −N ≤ x ≤ N} (since exp(−z²/(2t)) is analytic), and then take the limit as N goes to infinity.
Equation (2.3) is simply a rewriting of (2.2). It immediately gives (2).
Equation (2.4), for z = 0, follows from (2) and the fact that the density of ξ + ζ, the sum of the independent random variables ξ and ζ, is given by the convolution of the densities of ξ and ζ. For general z we need only note that, by a change of variables, $\int_{-\infty}^{\infty} p_s(x, y)\, p_t(y, z) \, dy = \int_{-\infty}^{\infty} p_s(x - z, y)\, p_t(y, 0) \, dy$.
For a slightly more direct proof of (2.4), consider p_t(x, y) = p_t(y − x) as a function of y for some fixed x. The Fourier transform of p_t(y − x) is $e^{i\lambda x} \widehat{p}_t(\lambda)$. Similarly, the Fourier transform of p_t(z − y) is $e^{i\lambda z} \widehat{p}_t(\lambda)$. By Parseval’s Theorem (1.10), the left-hand side of (2.4) is
\[ \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i\lambda(x-z)}\, \widehat{p}_t(\lambda)\, \widehat{p}_s(\lambda) \, d\lambda = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i\lambda(x-z)}\, e^{-(t+s)\lambda^2/2} \, d\lambda = p_{s+t}(x, z) \tag{2.7} \]
by (1.8), (1.9), and (2.2).
To prove (2.5) we note that, by Fubini’s Theorem and (2.2), the Fourier transform of u^α(x) is
\[ \widehat{u}^{\alpha}(\lambda) = \int_{0}^{\infty} e^{-\alpha t}\, \widehat{p}_t(\lambda) \, dt = \frac{1}{\alpha + \lambda^2/2}. \tag{2.8} \]
Taking the inverse Fourier transform we have
\[ u^{\alpha}(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{-i\lambda x}}{\alpha + \lambda^2/2} \, d\lambda. \tag{2.9} \]
Evaluating this integral in the complex plane using residues we get (2.5). (For x ≥ 0, use the contour (−ρ, ρ) ∪ (ρe^{−iθ}, 0 ≤ θ ≤ π) in the clockwise direction, and for x < 0 use the contour (−ρ, ρ) ∪ (ρe^{iθ}, π ≤ θ ≤ 2π) in the counterclockwise direction.)

We define Brownian motion starting at 0 to be a stochastic process W = {W_t; t ∈ R_+} that satisfies the following three properties:
(1) W has stationary and independent increments.
(2) W_t \stackrel{law}{=} N(0, t) for all t ≥ 0. (In particular W_0 ≡ 0.)
(3) t → W_t is continuous.

Theorem 2.1.2 The three conditions defining Brownian motion are consistent, that is, Brownian motion is well defined.

Proof We construct a Brownian motion starting at 0. We first construct a probability P on R^{R_+}, the space of real-valued functions {f(t), t ∈ [0, ∞)} equipped with the Borel product σ-algebra B(R^{R_+}). Let X_t be the natural evaluation X_t(f) = f(t). We first define P on sets of the form {X_{t_1} ∈ A_1, . . . , X_{t_n} ∈ A_n} for all Borel measurable sets A_1, . . . , A_n in R and 0 = t_0 < t_1 < · · · < t_n by setting
\[ P(X_{t_1} \in A_1, \ldots, X_{t_n} \in A_n) = \int \prod_{i=1}^{n} p_{t_i - t_{i-1}}(z_{i-1}, z_i) \prod_{i=1}^{n} 1_{A_i}(z_i) \, dz_i. \tag{2.10} \]
Here 1_{A_i} is the indicator function of A_i and we set z_0 = 0. That this construction is consistent follows from the Chapman–Kolmogorov equation (2.4). The existence of P now follows from Kolmogorov’s Construction Theorem.
It is obvious, by (2.10), that the random variable (X_{t_1}, . . . , X_{t_n}) has probability density function \prod_{i=1}^{n} p_{t_i - t_{i-1}}(z_{i-1}, z_i). Hence, for 0 = t_0 < t_1 < · · · < t_n < v and measurable functions g and f,
\[
E(g(X_{t_1}, \ldots, X_{t_n}) f(X_v - X_{t_n}))
= \int g(z_1, \ldots, z_n) f(y - z_n) \prod_{i=1}^{n} p_{t_i - t_{i-1}}(z_{i-1}, z_i)\, p_{v - t_n}(z_n, y) \prod_{i=1}^{n} dz_i \, dy \tag{2.11}
\]
\[
= \int g(z_1, \ldots, z_n) f(y) \prod_{i=1}^{n} p_{t_i - t_{i-1}}(z_{i-1}, z_i)\, p_{v - t_n}(y) \prod_{i=1}^{n} dz_i \, dy
= E(g(X_{t_1}, \ldots, X_{t_n}))\, E(f(X_{v - t_n})).
\]
This shows that X_v − X_{t_n} is independent of X_{t_1}, . . . , X_{t_n} and is equal in law to X_{v − t_n}. It follows from this that {X_t; t ∈ R_+} satisfies property (1). It is obvious that it satisfies property (2).
We now show that {X_t; t ∈ R_+} has a continuous version. To do this we note that, by properties (1) and (2) and the fact that N(0, t − r) \stackrel{law}{=} |t − r|^{1/2} N(0, 1),
\[ E(|X_t - X_r|^n) = |t - r|^{n/2}\, E(|X_1|^n) \tag{2.12} \]
for all 0 ≤ r < t. (It is also easy to check that N(0, t) has moments of all orders.) Thus, by Kolmogorov’s Theorem (Theorem 14.1.1), for any fixed \epsilon > 0 and some random variable C(T), which is finite almost surely,
\[ |X_t - X_r| \le C(T)\, |t - r|^{1/2 - \epsilon} \tag{2.13} \]
for all dyadic numbers 0 ≤ r < t ≤ T.
We now define a version, $\widetilde{W}$ = {$\widetilde{W}_t$; t ∈ R_+}, of Brownian motion. Let D be the set of positive dyadic numbers. On the set $\cap_{N=1}^{\infty} \{C(N) < \infty\}$, we take $\widetilde{W}$ to be the continuous functions obtained by extending {X_t; t ∈ D} by continuity. On $(\cap_{N=1}^{\infty} \{C(N) < \infty\})^c$, which has probability 0, we take $\widetilde{W}$ to be identically 0. Obviously $\widetilde{W}$ satisfies property (3). Since {X_t; t ∈ R_+} satisfies properties (1) and (2), to see that $\widetilde{W}$ also satisfies properties (1) and (2) it suffices to note that, for each t, $\widetilde{W}_t$ = X_t almost surely. This follows from (2.12). This shows that Brownian motion starting at zero is a well-defined stochastic process.

Remark 2.1.3 {$\widetilde{W}_t$; t ∈ R_+} is a version of Brownian motion. We present another version that has useful properties. Let C[0, ∞) be the space of continuous real-valued functions {ω(t), t ∈ [0, ∞)} with the topology of uniform convergence on compact sets, and let B(C) be the Borel σ-algebra of C[0, ∞). Let (R^{R_+}, B(R^{R_+}), P) denote the probability space of $\widetilde{W}$. The map
\[ \widetilde{W} : (R^{R_+}, B(R^{R_+})) \to (C[0, \infty), B) \]
defined by $\widetilde{W}(f)(t) = \widetilde{W}_t(f)$ induces a map of the probability P on (R^{R_+}, B(R^{R_+})) to a probability P on (C[0, ∞), B). Let W_t := ω(t). {W_t; t ∈ R_+} is a Brownian motion on (C[0, ∞), B). This version of Brownian motion, as a probability on the space of continuous functions, is called the canonical version of Brownian motion.

We mention two classical local limit laws for Brownian motion, {W_t; t ∈ R_+}, starting at 0. These laws are not difficult to prove, but we shall not do so at this point since their proofs are included in the more general Theorems 7.2.15 and 7.2.14; see also Example 7.4.13. Khintchine’s law of the iterated logarithm states that
\[ \limsup_{\delta \to 0} \frac{W_{x+\delta} - W_x}{\sqrt{2\delta \log\log(1/\delta)}} = 1 \qquad a.s. \tag{2.14} \]
L´evy’s uniform modulus of continuity states that, for any T > 0, lim sup |y−z|→0
0≤y,z≤T
|Wy − Wz | 2|y − z| log(1/|y − z|)
=1
a.s.
(2.15)
It is remarkable that Khintchine’s law also gives an iterated log law for the behavior of Brownian motion at infinity, namely, lim sup √ t→∞
Wt =1 2t log log t
a.s.
(2.16)
This is obtained by applying (2.14) to $\overline{W}$ := {$\overline{W}$(t); t ∈ R_+}, where $\overline{W}$(t) := tW(1/t) for t ≠ 0 and $\overline{W}$(0) = 0, and {W(t); t ∈ R_+} is a Brownian motion. The point here is that $\overline{W}$ is also a Brownian motion. We show this in Remark 2.1.6.
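The construction in the proof of Theorem 2.1.2 suggests a simple simulation: sum independent N(0, dt) increments on a grid. The following Python/NumPy sketch (sample sizes and seed are illustrative assumptions) checks properties (1) and (2) empirically:

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, n_paths = 1000, 5000
dt = 1.0 / n_steps                            # paths on [0, 1]

# Build W from independent N(0, dt) increments, as in Theorem 2.1.2
incs = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
W = np.cumsum(incs, axis=1)                   # W[:, k-1] ~ W_{k dt}

W_half, W_one = W[:, n_steps // 2 - 1], W[:, -1]

print(W_one.var())                            # ~ 1.0: W_t is N(0, t)
print(np.cov(W_half, W_one - W_half)[0, 1])   # ~ 0: independent increments
print(W_half.var(), (W_one - W_half).var())   # both ~ 0.5: stationarity
```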
Brownian motion and Ray–Knight Theorems
16
Using the properties of Brownian motion one can easily show that, for fi ∈ Bb (R), i = 1, . . . , n, n n n x E fi (Wti ) = pt1 (x, z1 ) pti −ti−1 (zi−1 , zi ) fi (zi ) dzi i=1
i=2
i=1
(2.17) for any 0 = t0 < t1 < . . . < tn (compare this with (2.10)). For each t > 0 we define the transition operator Pt : Bb (R) → Bb (R) by Pt f (x) = E x (f (Wt )) = pt (x, z)f (z) dz (2.18) and take P0 = I, where I denotes the identity operator. The Chapman–Kolmogorov equation (2.4) shows that {Pt ; t ∈ R+ } is a semigroup of operators, that is, for all f ∈ Bb (R), Pt+s f (·) = Ps (Pt f )(·). (This fact is often abbreviated by simply writing Ps Pt = Ps+t .) For measurable sets A ∈ B(R), set Pt (x, A) = Pt IA (x), so that pt (x, z) dz. (2.19) Pt (x, A) = A
We interpret Pt (x, A) as the probability that a Brownian motion, starting at x at time zero, takes a value in A at time t. Because of (2.19) we call pt (x, z) the transition probability density function of Brownian motion. For each α > 0 we define the α-potential operator U α : Bb (R) → Bb (R) by ∞ ∞ α −αt x −αt U f (x) = e Pt f (x) dt = E e f (Wt ) dt α > 0. 0
0
(2.20) The second equality uses Fubini’s Theorem. To justify this we need to show that {Wt (ω) ; (t, ω) ∈ R+ × Ω} is jointly B(R+ ) × F 0 measurable. This is easy to see because Wt (ω) can be written as a limit of B(R+ )×F 0 measurable functions, that is, Wt (ω) = lim
n→∞
∞
1[j−1/n,j/n) (t)Wj/n (ω).
(2.21)
j=1
Let
uα (x) = 0
∞
e−αt pt (x) dt.
(2.22)
2.1 Brownian motion
17
Using (2.18) we have U α f (x) =
uα (x, z)f (z) dz.
(2.23)
As in the case of Pt , for measurable sets A ∈ B(R), set U α (x, A) = U α IA (x). Because an equation similar to (2.19) holds in this case also, we call uα the α-potential density of Brownian motion. The numerical value of uα is given in (2.5). In subsequent work we consider the 0-potential density √ of certain Markov processes. Since, for Brownian motion, pt (x) ∼ 1/ t at infinity, (2.22) is infinite when α = 0. Therefore, we say that the 0-potential density of Brownian motion is infinite. Remark 2.1.5 For later use we note that Brownian motion is recurrent, that is, it takes on all values infinitely often, or, as is often said, it hits all points infinitely often. To see this note that {W (t), t ∈ R+ } and {−W (t), t ∈ R+ } have the same law. Therefore, by (2.16) we see that Brownian motion crosses the boundaries (t log log t)1/2 and −(t log log t)1/2 infinitely often, and, since it is continuous, it hits all points in between infinitely often. Remark 2.1.6 Let W = {W (t); t ∈ R+ } be a Brownian motion and consider W = {W (t); t ∈ R+ }, where W (t) = tW (1/t) for t = 0 and W (0) = 0. For t > s we write W (t) = (W (t) − W (s)) + W (s). Then, using the fact that W has independent increments, we see that for all s, t ∈ R+ EW (t)W (s) = s ∧ t = EW (t)W (s).
(2.24)
We show in Section 5.1 that W and W are both mean zero Gaussian processes, and, since mean zero Gaussian processes are determined uniquely by their covariances, they are equivalent as stochastic processes (see, in particular, Example 5.1.10). Here we give a direct proof of this equivalence, which does not require us to explicitly mention Gaussian processes. Note that for any 0 < t1 < · · · < tn and a1 , · · · , an ∈ R1 , n j=1
aj W (tj ) =
n j=1
n k=j
ak (W (tj ) − W (tj−1 )) .
(2.25)
18
Brownian motion and Ray–Knight Theorems
Therefore, using the independence of the increments of W , we see that 2 n n n 2 λ E exp iλ aj W (tj ) = exp − ak (tj − tj−1 ) . 2 j=1 j=1 k=j
(2.26) This shows, by (2.3), that any linear combination of W (tj ), j = 1, . . . , n is a normal random variable and necessarily n aj W (tj ) (2.27) E exp iλ j=1
2 n λ aj W (tj ) exp − E 2 j=1
2
=
n
2
λ aj ak E(W (tj )W (tk )) 2 j,k=1 n 2 λ aj ak (tj ∧ tk ) . exp − 2 exp −
=
=
j,k=1
Applying this to
n
E exp iλ
j=1 n
aj W (tj ) =
n j=1
aj tj W (1/tj ) we obtain
aj W (tj )
(2.28)
j=1
n 2 λ aj tj ak tk E(W (1/tj )W (1/tk )) = exp − 2 j,k=1 n 2 λ aj ak E(W (tj )W (tk )) . = exp − 2
j,k=1
It follows from this and (2.24) that W and $\overline{W}$ have the same finite joint distributions. Consequently $\overline{W}$ satisfies the first two conditions in the definition of Brownian motion. The fact that $\overline{W}$ is continuous at any t ≠ 0 is obvious. The continuity at t = 0 follows from (2.12) and (2.13) with $\overline{W}$ instead of X. Hence $\overline{W}$ is a Brownian motion.

Remark 2.1.7 Let 0 < t_1 < · · · < t_n and let C denote the symmetric n × n matrix with C_{j,k} = t_j ∧ t_k, j, k = 1, . . . , n. Since
\[ \sum_{j,k=1}^{n} a_j a_k\, (t_j \wedge t_k) = E\Big( \sum_{j=1}^{n} a_j W(t_j) \Big)^2 > 0 \]
unless all a_i = 0, we have that C is strictly positive definite and hence invertible. In this case the distribution of (B_{t_1}, . . . , B_{t_n}) in R^n has density
\[ \frac{1}{(2\pi)^{n/2} \sqrt{\det(C)}}\; e^{-(x, C^{-1} x)/2} \tag{2.29} \]
with respect to Lebesgue measure. To see this we compute characteristic functions. By a change of variables,
\[
\int e^{i \sum_{j=1}^{n} \lambda_j x_j}\, \frac{1}{(2\pi)^{n/2} \sqrt{\det(C)}}\; e^{-(x, C^{-1} x)/2} \prod_{j=1}^{n} dx_j \tag{2.30}
\]
\[
= \frac{1}{(2\pi)^{n/2}} \int e^{i(\lambda, C^{1/2} x)}\, e^{-(x,x)/2} \prod_{j=1}^{n} dx_j
= \frac{1}{(2\pi)^{n/2}} \int e^{i(C^{1/2} \lambda, x)}\, e^{-(x,x)/2} \prod_{j=1}^{n} dx_j
\]
\[
= \prod_{j=1}^{n} \frac{1}{(2\pi)^{1/2}} \int e^{i(C^{1/2}\lambda)_j x_j}\, e^{-x_j^2/2} \, dx_j
= e^{-\sum_{j=1}^{n} (C^{1/2}\lambda)_j^2 / 2}
= e^{-(C^{1/2}\lambda,\, C^{1/2}\lambda)/2}
= e^{-(\lambda, C\lambda)/2}.
\]
Comparing this with (2.27) shows that (B_{t_1}, . . . , B_{t_n}) has probability density (2.29).
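As a numerical illustration of Remark 2.1.7 (a Python/NumPy sketch; the time points and sample size are arbitrary choices), one can sample (W_{t_1}, . . . , W_{t_n}) from independent increments and compare the empirical covariance matrix with C:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.array([0.3, 0.7, 1.2, 2.0])      # 0 < t_1 < ... < t_n
n, n_samples = len(t), 200000

# Sample (W_{t_1}, ..., W_{t_n}) via independent Gaussian increments
dt = np.diff(np.concatenate(([0.0], t)))
W = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(n_samples, n)), axis=1)

C = np.minimum.outer(t, t)              # C_{j,k} = t_j ^ t_k
print(np.abs(np.cov(W.T) - C).max())    # ~ 0 up to Monte Carlo error

np.linalg.cholesky(C)                   # succeeds: C is positive definite
```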
2.2 The Markov property
Let {W_t, t ∈ R_+} be a Brownian motion. Let F_t^0 denote the σ-algebra generated by {W_s, 0 ≤ s ≤ t}, so that F^0 = ∪_{t≥0} F_t^0.

Lemma 2.2.1 For all x ∈ R^1,
\[ E^x\big( f(W_{t+s}) \mid F_t^0 \big) = P_s f(W_t) \tag{2.31} \]
for all s, t ≥ 0 and f ∈ B_b(R).

Since P_s f(W_t) is W_t measurable, (2.31) implies that, for all f ∈ B_b(R),
\[ E^x\big( f(W_{t+s}) \mid F_t^0 \big) = E^x\big( f(W_{t+s}) \mid W_t \big), \tag{2.32} \]
which says that the future of the process {W_t; t ≥ 0}, given all its past values up to the present, say time t_0, only depends on its present value, W(t_0). When (2.31) holds for any stochastic process {W_t; t ≥ 0}, we say that {W_t; t ≥ 0} satisfies the simple Markov property.

Proof We give two proofs of (2.31). In the first we use the fact that if Z is A measurable and Y is independent of A, then E(f(Z + Y) | A) = E(f(z + Y))|_{z=Z}. Writing W_{t+s} = W_t + (W_{t+s} − W_t), we note that Y := W_{t+s} − W_t is independent of F_t^0 and, for any x ∈ R^1, under the measure P^x, we have Y \stackrel{law}{=} N(0, s). Therefore
\[
E^x\big( f(W_{t+s}) \mid F_t^0 \big) = E^x\big( f(W_t + Y) \mid F_t^0 \big)
= E(f(z + Y))\big|_{z=W_t} = E^{W_t}(f(W_s)) = P_s f(W_t). \tag{2.33}
\]
For the second proof we appeal to the definition of conditional expectation and show that, for all F_t^0 measurable sets A,
\[ \int_A f(W_{t+s}) \, dP^x = \int_A P_s f(W_t) \, dP^x. \tag{2.34} \]
Since F_t^0 is generated by cylinder sets, that is, sets of the form $\cap_{i=1}^{n} \{a_i \le W_{t_i} \le b_i\}$, where 0 = t_0 < t_1 < . . . < t_n ≤ t, it suffices to verify (2.34) for sets of this form. More generally we show that, for functions of the form $\prod_{i=1}^{n} f_i(W_{t_i})$ with f_i ∈ B_b(R), i = 1, . . . , n, and 0 = t_0 < t_1 < . . . < t_n ≤ t,
\[ E^x\Big( \prod_{i=1}^{n} f_i(W_{t_i})\, f(W_{t+s}) \Big) = E^x\Big( \prod_{i=1}^{n} f_i(W_{t_i})\, P_s f(W_t) \Big), \tag{2.35} \]
thus completing the second proof. By (2.17),
\[
E^x\Big( \prod_{i=1}^{n} f_i(W_{t_i})\, f(W_{t+s}) \Big) \tag{2.36}
\]
\[
= \int p_{t_1}(x, z_1) \prod_{i=2}^{n} p_{t_i - t_{i-1}}(z_{i-1}, z_i)\, p_{t+s-t_n}(z_n, z) \prod_{i=1}^{n} f_i(z_i) \, dz_i\, f(z) \, dz
\]
\[
= \int p_{t_1}(x, z_1) \prod_{i=2}^{n} p_{t_i - t_{i-1}}(z_{i-1}, z_i) \prod_{i=1}^{n} f_i(z_i)\, P_{t+s-t_n} f(z_n) \prod_{i=1}^{n} dz_i.
\]
Similarly, by (2.17) we have
\[
E^x\Big( \prod_{i=1}^{n} f_i(W_{t_i})\, P_s f(W_t) \Big) \tag{2.37}
\]
\[
= \int p_{t_1}(x, z_1) \prod_{i=2}^{n} p_{t_i - t_{i-1}}(z_{i-1}, z_i)\, p_{t-t_n}(z_n, z) \prod_{i=1}^{n} f_i(z_i) \, dz_i\, P_s f(z) \, dz
\]
\[
= \int p_{t_1}(x, z_1) \prod_{i=2}^{n} p_{t_i - t_{i-1}}(z_{i-1}, z_i) \prod_{i=1}^{n} f_i(z_i)\, P_{t-t_n} P_s f(z_n) \prod_{i=1}^{n} dz_i.
\]
It follows from the semigroup property of P_t that P_{t-t_n} P_s = P_{t+s-t_n}. Therefore (2.36) and (2.37) are equal, which is (2.35).

In the study of local times it is necessary to enlarge the σ-algebras F_t^0. We do this in the next section. In preparation, we introduce the following definition, which generalizes (2.32). Let {G_t; t ≥ 0} be an increasing family of σ-algebras with F_t^0 ⊆ G_t for all t ≥ 0. We say that the Brownian motion {W_t, t ∈ R_+} is a simple Markov process with respect to {G_t; t ≥ 0} if, for all x ∈ R^1,
\[ E^x\big( f(W_{t+s}) \mid G_t \big) = P_s f(W_t) \tag{2.38} \]
for all s, t ≥ 0 and f ∈ B_b(R).
Let {W_t, t ∈ R_+} be a Brownian motion on the probability space (Ω, F^0). We assume that there exists a family {θ_t, t ∈ R_+} of shift operators for W, that is, operators θ_t : (Ω, F^0) → (Ω, F^0) with
\[ \theta_t \circ \theta_s = \theta_{t+s} \quad \text{and} \quad W_t \circ \theta_s = W_{t+s} \qquad \forall\, s, t \ge 0. \tag{2.39} \]
For example, for the canonical version of Brownian motion, we can take the shift operators θ_t to be defined by
\[ \theta_t(\omega)(s) = \omega(t + s) \qquad \forall\, s, t \ge 0. \tag{2.40} \]
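On the canonical space, the shift (2.40) simply discards an initial segment of the path. A toy Python sketch, with the path discretized on a grid (a device of the illustration only), exhibits the identity W_t ∘ θ_s = W_{t+s} of (2.39):

```python
import numpy as np

dt = 0.01
n = 1000                                   # path on [0, 10)

def theta(s, omega):
    # shift operator (2.40) on a discretized path: drop the first s units
    return omega[int(round(s / dt)):]

def W(t, omega):
    # coordinate evaluation W_t(omega) = omega(t)
    return omega[int(round(t / dt))]

rng = np.random.default_rng(2)
omega = np.cumsum(rng.normal(0.0, np.sqrt(dt), n))   # a sample path

s, t = 1.5, 2.25
assert W(t, theta(s, omega)) == W(t + s, omega)      # W_t o theta_s = W_{t+s}
```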
In general, for any random variable Y on (Ω, F^0), Y ∘ θ_t := Y(θ_t).

Lemma 2.2.2 If {W_t, t ∈ R_+} is a simple Markov process with respect to {G_t; t ≥ 0}, then
\[ E^x\big( Y \circ \theta_t \mid G_t \big) = E^{W_t}(Y) \tag{2.41} \]
for all F^0 measurable functions Y and all x ∈ R^1.

Proof It suffices to prove (2.41) for Y of the form Y = $\prod_{i=1}^{n} g_i(W_{t_i})$, with 0 < t_1 < . . . < t_n and g_i ∈ B_b(S), i = 1, . . . , n. In this case Y ∘ θ_t = $\prod_{i=1}^{n} g_i(W_{t+t_i})$. We then have
\[ E^x\Big( \prod_{i=1}^{n} g_i(W_{t+t_i}) \,\Big|\, G_{t+t_{n-1}} \Big) = \prod_{i=1}^{n-1} g_i(W_{t+t_i})\, E^x\big( g_n(W_{t+t_n}) \mid G_{t+t_{n-1}} \big) \tag{2.42} \]
and, by (2.38),
\[ E^x\big( g_n(W_{t+t_n}) \mid G_{t+t_{n-1}} \big) = P_{t_n - t_{n-1}} g_n(W_{t+t_{n-1}}) = \int p_{t_n - t_{n-1}}(W_{t+t_{n-1}}, z_n)\, g_n(z_n) \, dz_n =: h(W_{t+t_{n-1}}). \tag{2.43} \]
Next we see that the conditional expectation of the left-hand side of (2.42) with respect to G_{t+t_{n-2}} is equal to
\[
\prod_{i=1}^{n-2} g_i(W_{t+t_i}) \int p_{t_{n-1} - t_{n-2}}(W_{t+t_{n-2}}, z_{n-1})\, g_{n-1}(z_{n-1})\, h(z_{n-1}) \, dz_{n-1}
\]
\[
= \prod_{i=1}^{n-2} g_i(W_{t+t_i}) \int p_{t_{n-1} - t_{n-2}}(W_{t+t_{n-2}}, z_{n-1})\, p_{t_n - t_{n-1}}(z_{n-1}, z_n)\, g_{n-1}(z_{n-1})\, g_n(z_n) \, dz_{n-1} \, dz_n.
\]
Continuing in this way we see that the left-hand side of (2.41) is equal to
\[ \int p_{t_1}(W_t, z_1) \prod_{i=2}^{n} p_{t_i - t_{i-1}}(z_{i-1}, z_i) \prod_{i=1}^{n} g_i(z_i) \, dz_i. \tag{2.44} \]
Clearly the right-hand side of (2.41) is also equal to this.
We now provide an alternative, more explicit proof for the important case in which G_t = F_t^0 for all t ≥ 0. Thus we show that for Y as above
\[ E^x\big( Y \circ \theta_t \mid F_t^0 \big) = E^{W_t}(Y). \tag{2.45} \]
To prove (2.45) it is enough to show that
\[ E^x(F\, Y \circ \theta_t) = E^x\big( F\, E^{W_t}(Y) \big) \tag{2.46} \]
for all F of the form F = $\prod_{j=1}^{m} f_j(W_{s_j})$, with 0 < s_1 < . . . < s_m ≤ t and f_j ∈ B_b(S), j = 1, . . . , m.
By (2.17),
\[
E^x(F\, Y \circ \theta_t) = E^x\Big( \prod_{j=1}^{m} f_j(W_{s_j}) \prod_{i=1}^{n} g_i(W_{t+t_i}) \Big) \tag{2.47}
\]
\[
= \int p_{s_1}(x, z_1) \prod_{j=2}^{m} p_{s_j - s_{j-1}}(z_{j-1}, z_j)\, p_{t+t_1-s_m}(z_m, y_1) \prod_{i=2}^{n} p_{t_i - t_{i-1}}(y_{i-1}, y_i) \prod_{j=1}^{m} f_j(z_j) \, dz_j \prod_{i=1}^{n} g_i(y_i) \, dy_i.
\]
Similarly,
\[
E^x\big( F\, E^{W_t}(Y) \big) = E^x\Big( \prod_{j=1}^{m} f_j(W_{s_j})\, E^{W_t}(Y) \Big) \tag{2.48}
\]
\[
= \int p_{s_1}(x, z_1) \prod_{j=2}^{m} p_{s_j - s_{j-1}}(z_{j-1}, z_j)\, p_{t-s_m}(z_m, y) \prod_{j=1}^{m} f_j(z_j) \, dz_j\, E^y(Y) \, dy
\]
and
\[
E^y(Y) = \int p_{t_1}(y, y_1) \prod_{i=2}^{n} p_{t_i - t_{i-1}}(y_{i-1}, y_i) \prod_{i=1}^{n} g_i(y_i) \, dy_i. \tag{2.49}
\]
Since $p_{t+t_1-s_m}(z_m, y_1) = \int p_{t-s_m}(z_m, y)\, p_{t_1}(y, y_1) \, dy$, we get (2.46).
In the development of the theory of Markov processes it is crucial to extend the simple Markov property, which holds for fixed times, to certain random times. Let (S, G, G_t, P) be a probability space, where G_t is an increasing sequence of σ-algebras and, as usual, G = ∪_{t≥0} G_t. A random variable T with values in [0, ∞] is called a G_t stopping time if {T ≤ t} is G_t measurable for all t ≥ 0.
Let T be a G_t stopping time and let G_T := {A ⊆ S | A ∩ {T ≤ t} ∈ G_t, ∀t}. It is easy to check that G_T is a σ-algebra. Note that if A ∈ G_T, then
\[
T(A)(\omega) :=
\begin{cases}
T(\omega) & \text{if } \omega \in A \\
\infty & \text{if } \omega \notin A
\end{cases}
\tag{2.50}
\]
is a G_t stopping time.
Set G_{t+} = ∩_{h>0} G_{t+h}. G_{t+} is an increasing sequence of σ-algebras. It is easy to check that T is a G_{t+} stopping time (that is, {T ≤ t} is G_{t+}
measurable for all t ≥ 0) if and only if {T < t} is Gt measurable for all t. Similar to GT we define GT + = {A ⊆ S | A ∩ {T ≤ t} ∈ Gt+ , ∀t}. One can check that GT + = {A ⊆ S | A ∩ {T < t} ∈ Gt , ∀t}. Remark 2.2.3 We wrote the last three paragraphs on stopping times without specifically mentioning Brownian motion because they are relevant for general Markov processes. We did this so that in Chapter 3 we can discuss stopping times without defining them again. Several concepts that are developed in this chapter for the study of Brownian motion are presented in this way. Let W = {W (t), t ∈ R+ } be a Brownian motion on (Ω, F 0 , Ft0 , P ). An interesting class of stopping times is the first hitting times of certain subsets A ⊆ R1 , that is, the random times TA = inf{t > 0 | Wt ∈ A}. Lemma 2.2.4 Let W = {W (t), t ∈ R+ } be a Brownian motion on (Ω, F 0 , Ft0 , P ). If A ⊆ R1 is open, then TA is an Ft0+ stopping time. If A ⊆ R1 is closed, then TA is an Ft0 stopping time. Proof Suppose that A ⊆ R1 is open. Then TA < t if and only if Ws ∈ A for some rational number 0 < s < t. Therefore {TA < t} ∈ Ft0 . Let d(x, A) := inf{|x − y|, y ∈ A}. Since d(x, A) is continuous, the sets An = {x ∈ R1 | d(x, A) < 1/n} are open, and if A ⊆ R1 is closed, An ↓ A. Using the facts that A is closed in the first equation and An is open in the last equation, we see that if t > 0, {TA ≤ t} = {Ws ∈ A for some 0 < s ≤ t} {Ws ∈ A for some 1/m ≤ s ≤ t} =
(2.51)
m
=
{Ws ∈ An for some 1/m ≤ s ≤ t}
m n
=
{Ws ∈ An for some rational 1/m ≤ s ≤ t}.
m n
We use the continuity of Brownian motion in the last line. Thus {TA ≤ t} ∈ Ft0 for t > 0. The case of t = 0 is immediate since, for a closed set A, {TA = 0} = {W0 ∈ A} ∈ F00 . The next lemma, which generalizes the simple Markov property of Brownian motion so that it holds for stopping times, is called the strong Markov property for Brownian motion.
2.2 The Markov property
25
Lemma 2.2.5 If the Brownian motion {Wt , t ∈ R+ } is a simple Markov process with respect to {Gt ; t ≥ 0}, then for any Gt+ stopping time T E x f (WT +s )1{T <∞} | GT + = Ps f (WT )1{T <∞} (2.52) for all s ≥ 0, x ∈ R1 , and bounded Borel measurable functions f. Proof We need to show two things: first that Ps f (WT )1{T <∞} is GT + measurable for each bounded Borel measurable function f and, second, that E x 1A f (WT +s )1{T <∞} = E x 1A Ps f (WT )1{T <∞} (2.53) for each A ∈ GT + and bounded Borel measurable function f . For both of these assertions it suffices to consider that f is also continuous. [2n T ] + 1 , that is, Tn = j/2n if and only if (j −1)/2n ≤ T < Let Tn = 2n n j/2 and check that, for each n, Tn is a Gt stopping time and Tn ↓ T . Furthermore, Tn < ∞ if and only if T < ∞. Using the continuity of {Wt , t ≥ 0} we have WT 1{T <∞} = lim
n→∞
∞
W(j−1)/2n 1{Tn =j/2n } .
j=1
Furthermore, it is easy to see that W(j−1)/2n 1{Tn =j/2n } is GT + measurable. Therefore the same is true for WT 1{T <∞} and consequently also for Ps f (WT )1{T <∞} . We proceed to verify (2.53). We claim that this equation holds with T replaced by Tn . Once this is established, (2.53) itself follows by taking the limit as n goes to infinity and using continuity. Replace T by Tn in the left-hand side of (2.53) and note that since A ∈ GT + we have A ∩ {Tn = j/2n } ∈ Gj/2n . Then, using the simple Markov property, we see that E x 1A f (WTn +s )1{Tn <∞} (2.54) ∞ = E x 1A 1{Tn =j/2n } f (WTn +s ) =
j=1 ∞
E x 1A∩{Tn =j/2n } f (Wj/2n +s )
j=1
=
∞ j=1
E x 1A∩{Tn =j/2n } Ps f (Wj/2n )
= E x 1A 1{Tn <∞} Ps f (WTn ) .
26
Brownian motion and Ray–Knight Theorems
This shows that (2.53) holds with T replaced by Tn . Remark 2.2.6 A similar proof shows that for any Gt stopping time T E x f (WT +s )1{T <∞} | GT = Ps f (WT )1{T <∞} (2.55) for all s ≥ 0, x ∈ R1 and bounded Borel measurable functions f. In the same way we obtained (2.41) using (2.38), we use (2.52) to obtain the next lemma. Lemma 2.2.7 The strong Markov property (2.52) for the Gt+ stopping time T can be extended to (2.56) E x Y ◦ θT 1{T <∞} | GT + = E WT (Y ) 1{T <∞} for all F 0 measurable functions Y and x ∈ R1 . Using (2.55), a proof similar to the proof of Lemma 2.2.7 shows that for the Gt stopping time T (2.57) E x Y ◦ θT 1{T <∞} | GT = E WT (Y ) 1{T <∞} for all F 0 measurable functions Y and x ∈ R1 . Remark 2.2.8 Since a constant time is an Ft0 stopping time, it follows from (2.56) and (2.57) that, for any t ≥ 0, E x Y ◦ θt | Ft0+ = E Wt (Y ) = E x Y ◦ θt | Ft0 (2.58) for all F 0 measurable functions Y and x ∈ R1 . This in turn implies that (2.59) E x Z | Ft0+ = E x Z | Ft0 for all F 0 measurable functions Z and x ∈ R1 . (To see this note that it suffices to verify it for Z of the form Z = (Y ◦ θt )V , where V ∈ Ft0 , since such functions generate F 0 . For Z of this form (2.59) follows immediately from (2.58).) It follows from (2.59) that Ft0+ and Ft0 differ merely by null sets. More precisely, let A ∈ Ft0+ and set A = {ω | E x (1A | Ft0 ) = 1}, which is clearly in Ft0 . By (2.59) applied to Z = 1A we see that 1A = 1A , P x almost surely, that is, P x (A∆A ) = 0. Remark 2.2.8 leads to the next important lemma. Lemma 2.2.9 (Blumenthal Zero–One Law for Brownian Motion) Let A ∈ F00+ . Then, for any x ∈ R1 , P x (A) is either zero or one.
2.2 The Markov property
27
Proof Note that F00 = σ(W0 ), the σ-algebra generated by W0 . Thus any set A ∈ F00 must be of the form A = W0−1 (B) for some Borel set B ⊆ R1 . Therefore P x (A ) = P x (W0 ∈ B) = 1B (x), so that P x (A ) is either zero or one. Applying the results stated in the second paragraph of Remark 2.2.8, in the case t = 0, we see that for A ∈ F00+ , P x (A∆A ) = 0 for some A ∈ F00 . Thus P x (A) is either zero or one. The following simple lemma is used in the next section. Lemma 2.2.10 Let (S, G, Gt , P ) be a probability space, where Gt is an increasing family of σ-algebras and let S and T be Gt stopping times. Then the sets {S ≤ T } , {S < T }, and {S = T } ∈ GT . Proof
(2.60)
To see this simply use the facts that
{S ≤ T } ∩ {T ≤ t} = {S ≤ t} ∩ {T ≤ t} ∩ {S ∧ t ≤ T ∧ t} ∈ Gt (2.61) and
{S < T } ∩ {T ≤ t} =
{S < r} ∩ {r < T ≤ t} ∈ Gt .
(2.62)
r
r rational
We use a stopping time argument to prove the following simple but useful result, known as the reflection principle. Lemma 2.2.11 Let {Wt , t ∈ R+ } be a Brownian motion. Then 0 sup Ws ≥ x = 2 P 0 (Wt ≥ x) . (2.63) P 0≤s≤t
Proof By Lemma 2.2.4, Tx = inf{t > 0 | Wt = x} is an Ft0 stopping time. For any α > 0, ∞ e−αt P 0 (Wt ≥ x) dt (2.64) 0 ∞ = E0 e−αt 1{[x,∞)} (Wt ) dt 0 ∞ 0 −αt e 1{[x,∞)} (Wt ) dt =E Tx ∞ = E 0 e−αTx e−αt 1{[x,∞)} (Wt ) ◦ θTx dt . 0
Brownian motion and Ray–Knight Theorems
28
Since for α > 0, e−αTx = e−αTx 1{Tx <∞} , and WTx = x, by the strong Markov property (2.57), ∞ e−αt 1{[x,∞)} (Wt ) ◦ θTx dt (2.65) E 0 e−αTx 0 ∞ = E 0 e−αTx E x e−αt 1{[x,∞)} (Wt ) dt 0
and clearly
E
x
∞
−αt
e 1{[x,∞)} (Wt ) dt ∞ 0 −αt =E e 1{[0,∞)} (Wt ) dt ∞ 0 1 . = e−αt P 0 (Wt ≥ 0) dt = 2α 0
(2.66)
0
A standard integration by parts shows that ∞ e−αt dP 0 (Tx ≤ t) = α E 0 e−αTx = 0
∞
e−αt P 0 (Tx ≤ t) dt.
0
Combining (2.64)–(2.67) we have ∞ 1 ∞ −αt 0 e−αt P 0 (Wt ≥ x) dt = e P (Tx ≤ t) dt. 2 0 0
(2.67)
(2.68)
Since this holds for all α > 0, and P 0 (Wt ≥ x) and P 0 (Tx ≤ t) are right continuous in t, we have P 0 (Wt ≥ x) =
1 2
P 0 (Tx ≤ t) .
The lemma follows from the fact that P 0 (Tx ≤ t) = P 0 sup Ws ≥ x .
(2.69)
(2.70)
0≤s≤t
2.3 Standard augmentation We initially defined Brownian motion on the probability space (Ω, F 0 , Ft0 , P x ). However, as we pointed out on page 21, to continue with the introduction of local times in the next section, we must enlarge the σalgebras Ft0 , t ≥ 0. We do this by using a construction referred to as the standard augmentation of Ft0 with respect to P x , or just standard augmentation, when it is clear to which σ-algebras and probability measures we are referring. Heuristically, Ft0 is augmented with null sets
2.3 Standard augmentation
29
that can depend on the whole trajectory of the path, not merely on the trajectory up to time t. An important task of this section is to show that the strong Markov property continues to hold on the standard augmentation of Ft0 . (Note that the standard augmentation is sometimes referred to as the “usual augmentation.”) Let B denote the Borel σ-algebra on R1 and let M denote the set of finite positive measures on (R1 , B). For each µ ∈ M and A ∈ F 0 , set ∞ P x (A) dµ(x). (2.71) P µ (A) = −∞
Let F be the P completion of F 0 and let Nµ be the collection of null sets for (Ω, F µ , P µ ). Let Ftµ = Ft0 ∨ Nµ and µ
µ
F = ∩µ∈M F µ
and
Ft = ∩µ∈M Ftµ .
(2.72)
We note that F µ = {B ⊆ Ω | B ⊆ B ⊆ B with B , B ∈ F 0 and P µ (B ) = P µ (B )}. (2.73) µ The fact that the right-hand side of (2.73) is contained in F is immediate. The opposite inclusion follows by showing that the right-hand side of (2.73) is a σ-algebra. {Ft , t ≥ 0} is the standard augmentation of Ft0 with respect to {P x , x ∈ R1 } or, more simply stated, the standard augmentation of Ft0 with respect to P x . One can check that F = ∪t≥0 Ft . Since F and Ft are enlargements of the σ-algebras F 0 and Ft0 , respectively, it is clear that a Brownian motion {Wt , t ≥ 0} on (Ω, F 0 , Ft0 , P x ) is also a Brownian motion on (Ω, F, Ft , P x ). Note that for each µ ∈ M E µ (f (Wt+s ) | Ftµ ) = Ps f (Wt )
(2.74)
for all s, t ≥ 0 and f ∈ Bb (R1 ). This follows since the two measures A → E µ (1A f (Wt+s )) and A → E µ (1A Ps f (Wt )) agree on Ft0 by the simple Markov property, and also on Nµ , where they are both zero. Therefore they agree on Ftµ , which gives (2.74). It is also clear that in (2.74) we may replace Ftµ by Ft , so that in particular {Wt , t ∈ R+ } is a simple Markov process with respect to {Ft ; t ≥ 0}. Hence, by Lemmas 2.2.5 and 2.2.7, for any Ft+ stopping time T we have E x Y ◦ θT 1{T <∞} | FT + = E WT (Y ) 1{T <∞} (2.75) for all F 0 measurable functions Y . A simple integration then shows that E µ Y ◦ θT 1{T <∞} | FT + = E WT (Y ) 1{T <∞} (2.76)
30
Brownian motion and Ray–Knight Theorems
for all F 0 measurable functions Y and µ ∈ M. Lemma 2.3.1 Ft+ = Ft for all t ≥ 0. Proof Just as we obtained (2.76) from (2.56), we can use (2.57) to see that for any Ft stopping time T we have E µ Y ◦ θT 1{T <∞} | FT = E WT (Y ) 1{T <∞} (2.77) for all F 0 measurable functions Y . Applying (2.76) and (2.77) to the Ft stopping time T = t we see that E µ (Y ◦ θt | Ft+ ) = E µ (Y ◦ θt | Ft )
(2.78)
for all F 0 measurable functions Y . By the argument in Remark 2.2.8, this implies that E µ (Y | Ft+ ) = E µ (Y | Ft )
(2.79)
for all F 0 measurable functions Y and hence for all F µ measurable functions Y . Suppose that Y is Ft+ measurable. Then, by (2.79), Y differs from an Ft measurable random variable by a set of µ measure zero. This shows that Ft+ ⊆ Ftµ for each µ ∈ M and consequently Ft+ = Ft . Lemma 2.3.2 Let {Wt , t ≥ 0} be a Brownian motion on (Ω, F, Ft , P x ). For any Ft stopping time T (2.80) E µ Y ◦ θT 1{T <∞} | FT = E WT (Y ) 1{T <∞} for all F measurable functions Y and µ ∈ M. Since Ft+ = Ft for all t, for any Ft stopping time T we have FT + = FT . Thus (2.80) is not more restrictive than (2.76) for F 0 measurable functions Y . Of course, the significance of this lemma is that it holds for all F measurable functions Y . Proof Fix µ ∈ M. It suffices to obtain (2.80) for Y = 1B , for B ∈ F, that is, to show that for all B ∈ F P WT (B) 1{T <∞} and
is FTµ measurable
E µ 1A 1B ◦ θT 1{T <∞} = E µ 1A P WT (B) 1{T <∞}
(2.81)
(2.82)
for each A ∈ FT . Let ν ∈ M be the measure defined by ν(C) = P µ (WT ∈ C , T < ∞) for C ∈ B. By (2.73) and the definition of F we can then find sets
2.4 Brownian local time
31
B , B ∈ F 0 with B ⊆ B ⊆ B and P ν (B ) = P ν (B) = P ν (B ). By the definition of ν we have (2.83) f (x) dν(x) = E µ f (WT )1{T <∞} . Hence P ν (B ) =
P x (B ) dν(x) = E µ P WT (B )1{T <∞}
(2.84)
with a similar expression for P ν (B ). Thus, since P ν (B ) = P ν (B ) and B ⊆ B ⊆ B , we see that P WT (B )1{T <∞} = P WT (B)1{T <∞} = P WT (B )1{T <∞}
P µ a.s. (2.85) (B) 1{T <∞} is FTµ
Using this in (2.76) with Y = 1B we see that P WT measurable. This is (2.81). Similarly, (2.76) with Y = 1B and Y = 1B , combined with (2.85), show that E µ (1B ◦ θT 1{T <∞} ) = E µ (1B ◦ θT 1{T <∞} ).
(2.86)
Since B ⊆ B ⊆ B , we see that 1B ◦ θT 1{T <∞} = 1B ◦ θT 1{T <∞} , P µ almost surely. Therefore (2.82) follows by using (2.76), with Y = 1B , and (2.85). We have thus established (2.80). Lemma 2.3.3
P µ (A) =
P x (A) dµ(x)
∀A ∈ F.
(2.87)
Proof Equation (2.87) holds for A ∈ F 0 by definition. To show that it continues to hold for A ∈ F we proceed as in the proof of Lemma 2.3.2. We find two sets A , A ∈ F 0 with A ⊆ A ⊆ A and P µ (A ) = P µ (A) = P µ (A ). Then, since P x (A) = P x (A ) almost surely with respect to µ, and since (2.87) holds with A replaced by A ∈ F 0 , it also holds for A.
2.4 Brownian local time In this section we establish the existence and continuity of a stochastic process {Lyt , (t, y) ∈ R+ × R1 } called the local time of Brownian motion. A primary objective of this book is to establish necessary and sufficient conditions for the continuity of local times of a wide class of Markov processes. For Brownian motion we can give a simple, direct proof.
32
Brownian motion and Ray–Knight Theorems
The general result, which of course also applies to Brownian motion, is obtained by completely different methods. Let f (x) be on a positive symmetric continuous function supported 1 [−1, 1] with f (x) dx = 1. For any > 0, set f (x) = f (x/) and f,y (x) = f (x − y). Thus, f,y is an approximate identity with respect to Lebesgue measure. Let t L,y = f,y (Ws ) ds, (2.88) t 0
where {Ws , s ≥ 0} is a Brownian motion on (Ω, F, Ft , P x ). converges uniLemma 2.4.1 For any numbers 0 < T, M < ∞, L,y t formly on (t, y) ∈ [0, T ] × [−M, M ] as → 0, almost surely and in Lp (P x ) for all p > 0. Before proving this lemma we note some of its consequences. First, it allows us to define a jointly continuous local time for Brownian motion. Theorem 2.4.2 Let {Wt ,t ≥ 0} be a Brownian motion on (Ω,F, Ft , P x ). in (2.88) There exists a set Ω ⊂ Ω with P x (Ω ) = 1 such that, for L,y t ,y lim→0 Lt (ω) ω∈Ω (2.89) Lyt (ω) := 0 ω∈ / Ω is continuous on R+ ×R1 . Lyt is called the local time of Brownian motion at y (up to time t). The continuity of Lyt on R+ × R1 is often referred to by saying that it is “jointly continuous.” Proof Let Ω denote the set on which L,y t converges locally uniformly 1 on (t, y) ∈ R+ × R . By taking T = M = n, n = 1, 2, . . . in Lemma 2.4.1 we see that P x (Ω ) = 1 for all x ∈ R1 . Note that for each > 0, f,y is simply a bounded continuous function with compact support. Consequently, one can use the Dominated Convergence Theorem to show 1 that {L,y t , (y, t) ∈ R+ × R } is continuous for all > 0. Therefore, the 1 local uniform convergence of {L,y t ; (t, y) ∈ R+ × R } on Ω implies that y 1 Lt is continuous on R+ × R . A family A = {At ; t ∈ R+ } of random variables on (Ω, F, Ft ) is said to be a continuous additive functional (CAF) if it satisfies the following three properties: (1) t → At is almost surely continuous and nondecreasing with A0 = 0; (2) At is Ft measurable;
2.4 Brownian local time
33
(3) At+s = At + As ◦ θt for all s, t ∈ R+ almost surely. It is immediately obvious that Lyt inherits from L,y the property of t being a continuous additive functional. Note that the fact that Ft is the standard augmentation of Ft0 is / Ft0 (see crucial in showing that Lyt satisfies property (2), since Ω ∈ (2.89)). Since Lyt is a continuous additive functional, it follows from property (1) that, for almost all ω ∈ Ω, Lyt (ω) defines a positive measure dLyt (ω). It is easy to see from (2.89) that dLyt (ω) is supported on the set {t | Wt (ω) = y}. It also follows from the almost sure locally uniform convergence in (2.89) that for any continuous function g with compact support t y f,y (Ws ) ds dy (2.90) g(y)Lt dy = lim g(y) →0
=
0
lim t
→0
t
(g ∗ f )(Ws ) ds 0
g(Ws ) ds.
= 0
Therefore, for any bounded Borel set A, t y Lt dy = 1A (Ws ) ds.
(2.91)
0
A
In particular, let A = [a, a + ]. Using the continuity of Lyt we have measure {0 ≤ s ≤ t ; a ≤ Ws ≤ a + } . (2.92) This gives an intrinsic definition of Lat as the derivative of an occupation measure. It also shows that Lat is independent of the approximating sequence f,y used in (2.88). This last fact also follows easily from the proof of Lemma 2.4.1 given below. To further emphasize the fact that the local time is a density we note that (2.92) implies that, for each a ∈ R1 , λ({t : W (t) = a}) = 0, where λ is Lebesgue measure. Lat = lim
→0
Lemma 2.4.1 gives rise to three important formulas, which we give as a separate lemma. Lemma 2.4.3
E x (Lyt ) =
t
ps (x, y) ds. 0
(2.93)
Brownian motion and Ray–Knight Theorems ∞ uα (x, y) = E x e−αt dLyt .
34
(2.94)
0
uα (x, y) = E x e−αTy uα (y, y).
(2.95)
y ,y 1 x x x Proof Since L,y t converges in L (P ), E (Lt ) = lim→0 E (Lt ). Using (2.88) we see that t ) = lim f,y (z)ps (x, z) dz ds (2.96) lim E x (L,y t →0 →0 0 t = ps (x, y) ds, 0
which gives (2.93). For (2.94) we note that E
∞
x
−αt
e
dLyt
lim E
=
T →∞
dLyt
(2.97)
0
e−αT LyT
lim E x
=
−αt
e
T →∞
0
T
x
T
+α
e−αt Lyt
dt
0
by integration by parts. Using (2.93) and integration by parts again we see that the last line of (2.97) equals T lim e−αT E x (LyT ) + α e−αt E x (Lyt ) dt (2.98) T →∞ 0 t T T ps (x, y) ds + α e−αt ps (x, y) ds dt = lim e−αT T →∞ α
0
0
0
= u (x, y). We now obtain (2.95). Since dLyt is supported on {t|W (t) = y} almost surely, Lyt = 0 for t ∈ [0, Ty ) almost surely. Therefore ∞ y α x −αt (2.99) u (x, y) = E e dLt 0 ∞
= Ex = E
x
Ty
E
x
e−αt dLyt
∞
−αt
e
dLyt FTy
.
Ty
Note that it follows from Remark 2.1.5 that Ty < ∞ almost surely. Making the change of variables t = s + Ty and then using the additivity
2.4 Brownian local time
35
property of local times, that is, that Lys+Ty = LyTy + Lys ◦ θTy , and the strong Markov property (2.80), we see that ∞ ∞ y x −αt e dLt FTy e−αs dLys ◦ θTy FTy E = e−αTy E x 0
Ty −αTy
= e
E
∞
WT y
= e−αTy E y
−αs
e
dLys
0 ∞
e−αs dLys
0
(2.100)
= e−αTy uα (y, y). (In the next to last line we use the fact that WTy = y and (2.94).) Substituting this into the last line of (2.99), we get (2.95). Proof of Lemma 2.4.1 We consider L,y as a stochastic process int 1 dexed by (t, y, ) ∈ R+ × R × (0, 1]. Let RT,M,1 = [0, T ] × [−M, M ] × (0, 1]. We show that for all 0 < γ < 1 and all integers n, 2n ,y Ex L,y ≤ C(T, M, n)|(t, y, )−(t , y , )|(1−γ)n (2.101) − L t t for all (t, y, ), (t , y , ) ∈ RT,M,1 . Given (2.101) it follows from Kolmogorov’s Theorem (Theorem 14.1.1), with h = 2n, d = 3, and r = n − γn − 3, that there exists a random variable Cn (ω) which is finite almost surely, such that
,y |L,y | ≤ Cn (ω)|(t, y, ) − (t , y , )|1/2−(γ/2+3/(2n)) t − Lt
(2.102)
for all dyadic numbers (t, y, ), (t , y , ) ∈ RT,M,1 . However, since L,y t is continuous on RT,M,1 almost surely, (2.102) holds for all (t, y, ), (t , y , ) ∈ RT,M,1 . Furthermore, since (2.102) holds for all n sufficiently large, we see that for any γ > 0 we can find a random variable C(ω) which is finite almost surely, such that
,y |L,y | ≤ C(ω)|(t, y, ) − (t , y , )|1/2−γ . t − Lt
(2.103)
Lemma 2.4.1 follows immediately from this. To obtain (2.101) we use the triangle inequality to treat the variation in t, y, and separately. By (2.88), we have 2n 2n 2n t t ,y = Lt i i ··· Ex fi ,yi (Wti ) dti (2.104) Ex 0
i=1
=
0
π
D2n,t (π)
i=1
E
x
2n i=1
i=1
fi ,yi Wtπi
2n i=1
dti ,
Brownian motion and Ray–Knight Theorems
36
where D2n,t (π) = {0 < tπ1 < · · · < tπ2n < t} and the sum runs over all permutations π of {1, . . . , 2n}. We decompose the integrals in this way so we can use (2.17), which requires that the ti are increasing. (The reader should note well this technique, which is used repeatedly in this book.) It now follows from (2.17), in which, following convention, we set z0 = x and t0 = 0, that (2.104) =
2n D2n,t
π
=
i=1
π
pti −ti−1 (zi−1 , zi )
2n
D2n,t
2n
fπi ,yπi (zi ) dzi dti
(2.105)
i=1
pti −ti−1 (πi−1 zi−1 + yπi−1 , πi zi + yπi )
i=1 2n
f (zi ) dzi dti ,
i=1
where we abbreviate D2n,t = D2n,t (I), in which I denotes the identity partition, that is, D2n,t (I) = {0 < t1 < . . . < t2n < t}. We also take π0 = 0, 0 = 1, and y0 = 0. Since, by (2.5), for any r ≤ T √ r ∞ − 2|x−y| T −s Te √ (2.106) ps (x, y) ds ≤ e e ps (x, y) ds = e 2 0 0 we see that the last term of (2.105) is uniformly bounded for t < T. (Ultimately we set all i = and yi = y. We introduce the subscripts to aid in following the next steps.) We use the notation ∆δ h(z) = h(z +δ)−h(z). When h is a function of several variables (z1 , . . . , z2n ) we write ∆δ,zi for the operator ∆δ applied to the variable zi . Recall that pt (x, x ) = pt (x − x ), so that the terms in pt are actually functions of a single variable. Proceeding exactly as in (2.104), we have 2n ,y +δ i ,yi x i i Lt (2.107) − Lt E i=1
=
π
=
2n D2n,t
2n π i=1
pti −ti−1 (zi−1 , zi )
i=1
∆δ,yπi
D2n,t
2n
2n
∆δ,yπi fπi ,yπi (zi ) dzi dti
i=1
pti −ti−1 (πi−1 zi−1 + yπi−1 , πi zi + yπi )
i=1 2n i=1
f (zi ) dzi dti .
2.4 Brownian local time
37
Without loss of generality it suffices to find a bound for the summand in which π is the identity I, that is, to obtain a bound for 2n 2n 2n ∆δ,yi pti −ti−1 (i−1 zi−1 + yi−1 , i zi + yi ) f (zi ) dzi dti . D2n,t
i=1
i=1
i=1
(2.108) It would be convenient to attach the operators ∆δ,yi to the individual p factors. We cannot do this because each yi , except y2n , occurs in two of the p factors. Instead we proceed as follows: We write (2.108) as the sum of the 22n terms obtained by expanding ∆δ,yi for each even i and using ∆δ,yi f g = (∆δ,yi f )g(yi + δ) + f ∆δ,yi g
(2.109)
for each odd i. Then we set i = and yi = y for all i. By doing this we get 22n terms of the form 2n 2n pti −ti−1 ((zi − zi−1 )) f (zi ) dzi dti , pt1 (y1 + z1 ) D2n,t
i=2
i=1
(2.110) where p is either p, p(· + δ), ∆δ p, or ∆(−δ) p. Furthermore, it follows from (2.109) that there are n terms of the form ∆±δ p. (Note that the reason only the first factor in (2.110) contains a yi , namely y1 , is that, since pt (x, x ) = pt (x − x ), all the other terms in the yi ’s drop out on setting yi = y for all i.) To bound the absolute value of (2.110) we replace ∆±δ p by |∆±δ p| and integrate successively with respect to the variables t2n , t2n−1 , . . . , t1 . The dti integral of a factor of the form pti −ti−1 ((zi −zi−1 )) is bounded by (2.106). To bound the dti integral of a factor of the form pti −ti−1 ((zi − zi−1 ) + δ) − pti −ti−1 ((zi − zi−1 )), we note that ps (x) < ps (y) for a single s ⇒ ps (x) < ps (y) for all s so that
0
(2.111)
r
|ps (x + δ) − ps (x)| ds ∞ T ≤e e−s |ps (x + δ) − ps (x)| ds 0 ∞ e−s (ps (x + δ) − ps (x)) ds| = eT | 0
√
≤ c eT |e−
2|x+δ|
√
− e−
2|x|
(2.112)
| ≤ c eT δ.
The same bound is obtained if δ is replaced by −δ. After bounding all
Brownian motion and Ray–Knight Theorems
38
the dti integrals in this manner, the only terms in zj that remain are in the integrals f (zj ) dzj = 1. Thus we have shown that 2n ,y ,y x Lt − L t ≤ C(T, M, n)|y − y |n . (2.113) E Essentially the same analysis shows that 2n ,y ,y x E Lt − Lt ≤ C(T, M, n)| − |n
(2.114)
for all (t, y, ), (t , y , ) ∈ RT,M,1 . (When carrying out the proof for {i } rather than {yi } we get factors of the form pti −ti−1 (i (zi − zi−1 )), as in the proof for {yi }, but the other n terms are of the form pti −ti−1 (i (zi − zi−1 ) + δ(zi − zi−1 )) − pti −ti−1 (i (zi − zi−1 )) because all the i are multiplied by (zi − zi−1 ). Nevertheless, we need only consider |zi | ≤ 1 because f is supported on [−1, 1] (see (2.110)). Thus (2.112) applies in this case also.) Finally we consider the variation in t. Let t < t . Analogous to (2.104) and (2.105), but taking all the {i } equal to and all the {yi } equal to y, we have ,y 2n − L } E x {L,y (2.115) t t 2n t f,y (Wsi ) dsi = Ex i=1
t
= (2n)! D2n,t
1{t≤ti ≤t ,∀i}
2n
pti −ti−1 ((zi − zi−1 ))
i=1 2n
≤ (2n)! Since
D2n,t
f (zi ) dzi dti
i=1
1{t≤ti ≤t ,∀i}
2n i=1
√
1 ti − ti−1
2n
f (zi ) dzi dti .
i=1
f (zj ) dzj = 1, the last line in (2.115) equals 2n 1 √ 1{t≤ti ≤t ,∀i} dt1 . . . dt2n (2n)! ti − ti−1 D2n,t i=1 ≤ (2n)! 1{D2n,t }
2n i=1
√
1 q 1{t≤ti ≤t ,∀i} q , ti − ti−1
where we use H¨older’s inequality with 1/q + 1/q = 1 and 1 < q < 2.
2.4 Brownian local time
39
It is easy to see that 1{D2n,t }
2n i=1
√
1 q ≤ C(T, n, q). ti − ti−1
(2.116)
Also, 1{t≤ti ≤t ,∀i} q
t
···
= t
2n t
1/q dti
(2.117)
t
i=1 (1−1/q)2n
= |t − t |
.
By taking q sufficiently close to 2 we see that, for all γ > 0, ,y 2n E x {L,y ≤ C(T, n, q)|t − t |(1−γ)n . − L } t t
(2.118)
The inequalities in (2.113), (2.114), and (2.118) yield (2.101), which is what we set out to show.
2.4.1 Inverse local time of Brownian motion Let
{Lxt , (x, t)
∈ R1 × R+ } be the local times of Brownian motion. Let τz (s) := inf{t > 0 | Lzt > s},
(2.119)
where inf{∅} = ∞. We refer to τz ( · ) as the (right continuous) inverse local time (of Brownian motion) at z. It is easy to see that, for each s, τz (s) is a stopping time and τz (s + t) = τz (s) + τz (t) ◦ θτz (s) .
(2.120)
We set τ (s)=τ0 (s). Lemma 2.4.4 For each s ∈ R1 , τ (s) < ∞ almost surely. Proof
Since L0t is continuous, it suffices to show that L0∞ := lim L0t = ∞ t→∞
a.s.
(2.121)
Define the sequence of stopping times T0 = 0 and Tn+1 = inf{t > Tn + 1 | Wt = 0}, n = 0, 1, . . .. It follows from Remark 2.1.5 that Tn < ∞ almost surely. Therefore, using the monotonicity and additivity of L0t , we see that L0∞ ≥
∞ n=0
∞ 0 L0Tn +1 − L0Tn = L1 ◦ θTn . n=0
(2.122)
Brownian motion and Ray–Knight Theorems
40
Clearly {L01 ◦ θTn }∞ n=0 are independent and identically distributed. By the strong Markov property, Lemma 2.3.2, and (2.93), E 0 (L01 ◦ θTn ) = E 0 (L01 ) > 0. Thus the right-hand side of (2.122) is a sum of independent identically distributed random variables that are nonnegative and have a positive probability of being larger than for some > 0. Consequently, (2.121) follows by the Borel–Cantelli Lemma. Lemma 2.4.5 Let W be a Brownian motion on (Ω, F, P 0 ). The positive increasing stochastic process {τ (s) , s ∈ R+ } has stationary and independent increments. Proof Let F ∈ Bb (Rn ) and note that, by (2.120) and Lemma 2.3.2, for all t > 0 E 0 F (τ (s1 + t) − τ (t), . . . , τ (sn + t) − τ (t))|Fτ (t) (2.123) = E 0 F (τ (s1 ), . . . , τ (sn )) ◦ θτ (t) |Fτ (t) = E Wτ (t) (F (τ (s1 ), . . . , τ (sn ))) = E 0 (F (τ (s1 ), . . . , τ (sn ))) since Wτ (t) = 0. Taking the expectation we get E 0 (F (τ (s1 + t) − τ (t), . . . , τ (sn + t) − τ (t)))
(2.124)
0
= E (F (τ (s1 ), . . . , τ (sn ))) , which completes the proof. Lemma 2.4.6 For any positive measurable function f (t) and any T ∈ [0, ∞] and z ∈ R1 we have ∞ T z f (t) dLt = f (τz (s))1{τz (s)
0
If Ht is a positive continuous Ft measurable function, F a positive F measurable function, and T a stopping time (possibly T ≡ ∞), then T T (2.126) Ht F ◦ θt dLzt = E z (F ) E x Ht dLzt . Ex 0
0
Proof To prove (2.125) it suffices to show it for functions f (t) of the form f (t) = 1[0,r] (t) for real numbers r. In this case (2.125) is simply Lzr∧T = µ{s : τz (s) ≤ r , τz (s) < T },
(2.127)
where µ is Lebesgue measure. Note that Lzt = µ{s : τz (s) ≤ t} = µ{s : τz (s) < t}.
(2.128)
2.4 Brownian local time
41
Using this we get (2.127). Using (2.125) we see that T x z E Ht F ◦ θt dLt 0
(2.129)
∞
Hτz (s) F ◦ θτz (s) 1{τz (s)
0
As we mentioned above, τz (s) is a stopping time. Therefore, by (2.60), {T ≤ τz (s)} ∈ Fτz (s) . Taking complements we see that {τz (s) < T } ∈ Fτz (s) . Using the methods of the third paragraph in the proof of Lemma 2.2.5 and the fact that Fτz (s)+ = Fτz (s) (see the paragraph following the statement of Lemma 2.3.2), it follows that Hτz (s) 1{τz (s)<∞} is Fτz (s) measurable. Hence, using the strong Markov property (2.130) E x Hτz (s) F ◦ θτz (s) 1{τz (s)
0
(2.131) Using (2.125) again gives (2.126). Since τ (s) ≥ 0 for each s ≥ 0, we can compute its Laplace transform. Lemma 2.4.7 E 0 (e−ατ (s) ) = e−s/u
α
(0,0)
√
= e−
2α s
.
(2.132)
Proof Let f (s) = E 0 (e−ατ (s) ). Note first that, by (2.120), the strong Markov property, and the fact that Wτ (s) = 0, as in the preceding proof, E 0 (e−ατ (s+t) )
= E 0 (e−ατ (s) e−ατ (t)◦θτ (s) ) 0
−ατ (s)
= E (e
0
−ατ (t)
) E (e
(2.133)
).
Thus f (s + t) = f (s)f (t) and since f (s) is right continuous we have f (s) = e−h(α)s for some h(α) > 0. To compute h(α) we use (2.125),
Brownian motion and Ray–Knight Theorems
42
(2.94), and (2.5) to get ∞ ∞ 1 f (s) ds = E 0 e−ατ (s) ds = h(α) 0 ∞ 0 1 0 −αt 0 = E e dLt = uα (0, 0) = √ . 2α 0
(2.134)
Thus we obtain (2.132). Remark 2.4.8 A canonical positive p-stable process {Z(t), t ∈ R+ }, Z(0) = 0, 0 < p < 1, is defined by its Laplace transform, E 0 e−λZ(t) = e−tλ . p
(2.135)
By Lemma 2.4.7 we have E 0 (e−λτ (s)/2 ) = e−s
√
λ
.
(2.136)
Thus we see that τ (s)/2 is a canonical positive 1/2–stable process.
2.5 Terminal times A stopping time T is called a terminal time if for every t T > t ⇒ T = t + T ◦ θt
a.s.
(2.137)
It is easy to see that any first hitting time is a terminal time. Set uT (x, y) = E x (LyT ) ,
(2.138)
which may be infinite. We evaluate uT (x, y) for Brownian motion when T = T0 , the first hitting time of 0. Since Brownian motion is continuous, it is obvious that uT0 (x, y) = 0 unless x and y have the same sign. Lemma 2.5.1 Let {Lxt , (x, t) ∈ R1 × R+ } be the local times of Brownian motion and let uT0 be given by (2.138). Then 2 (|x| ∧ |y|) xy > 0 (2.139) uT0 (x, y) = 0 xy ≤ 0. Proof
By (2.94), α
u (x, y)
= E
x
= Ex
∞
−αt
e 0 T0 0
dLyt
(2.140)
e−αt dLyt
+ Ex
∞
T0
e−αt dLyt .
2.5 Terminal times
43
Using the arguments in (2.99), (2.100), and (2.95), we see that T0 y x −αt = E e dLt + E x (e−αT0 ) uα (0, y) 0
= E
x
T0
−αt
e
dLyt
+
0
uα (x, 0) uα (0, y) . uα (0, 0)
Therefore, by (2.5), T0 uα (x, 0)uα (0, y) y x −αt E = uα (x, y) − e dLt uα (0, 0) 0 √
=
e−
2α|x−y|
√
√
− e− 2α|x| e− √ 2α
(2.141)
2α|y|
.
Taking the limit as α → 0 we obtain (2.139). Let A = {At , t ∈ R+ } be a continuous additive functional and let τA (s) := inf{t > 0 | At > s},
(2.142)
where inf{∅} = ∞. τA (s) is the right continuous inverse of At . It is easy to see that τA (s) is a stopping time but it is not a terminal time. However, we do have τA (s) > t ⇒ τA (s) = t + τA (s − At ) ◦ θt
a.s.
(2.143)
Lemma 2.5.2 Let A = {At , t ∈ R+ } be a continuous additive functional on (Ω, F, Ft , P x ) and let λ be an exponential random variable with mean 1/α that is independent of (Ω, F, Ft , P x ). Then, for all f ∈ Bb (R1 ), Eλ 1{λ>At } f (τA (λ)) = e−αAt Eλ (f (t + τA (λ) ◦ θt )) P x a.s., (2.144) where Eλ denotes expectation with respect to λ. Proof By (2.143), λ > At implies that τA (λ) = t + τA (λ − At ) ◦ θt . Therefore Eλ 1{λ>At } f (τA (λ)) (2.145) = Eλ 1{λ>At } f (t + τA (λ − At ) ◦ θt ) ∞ =α f (t + τA (y − At ) ◦ θt )e−αy dy At ∞ −αAt α f (t + τA (y) ◦ θt )e−αy dy, =e 0
which is the right-hand side of (2.144).
Brownian motion and Ray–Knight Theorems
44
Note that even though we do not state it explicitly, we are really using the “memoryless” property of λ, that is, the fact that, conditioned on λ > h, λ − h has the same law as λ. Let Pλx := P x × Pλ . Similar to (2.138), we define uτA (λ) (x, y) = Eλx LyτA (λ) . We note that
uτA (λ) (x, y)
=
Eλx
= Eλx = E = E
x
x
∞
0 ∞ 0 ∞
1{τA (λ)>t} dLyt
(2.147)
1{λ>At } dLyt Pλ (λ >
0 ∞
(2.146)
−αAt
e
At ) dLyt
dLyt
.
0
We now give a simple formula that plays a crucial role in obtaining isomorphism theorems for local times of Brownian motion and, in later chapters, much more general classes of processes. Theorem 2.5.3 (Kac’s Moment Formula) Let W be a Brownian motion and T a terminal time on (Ω, F, Ft , P x ). Let {Lyt , (y, t) ∈ R1 × R+ } be the local times of W . Then n y x i LT = uT (x, yπ1 ) · · · uT (yπn−1 , yπn ), (2.148) E i=1
π
where the sum runs over all permutations π of {1, . . . , n}. In particular, n n−1 E x (LyT ) = n! uT (x, y) (uT (y, y)) . (2.149) Let A = {At , t ∈ R+ } be a CAF on (Ω, F, Ft ) and λ an exponential random variable independent of (Ω, F, Ft , P x ). Then these equations are also valid with T replaced by τA (λ) and E x replaced by Eλx . If uT (x, yi ) = ∞ for some yi , both sides of (2.148) are infinite. Similarly when T is replaced by τA (λ) and E x by Eλx . Proof
We first consider the case when T is a terminal time. Note that n n y T yi x x i = E (2.150) E LT dLti i=1
i=1
0
2.5 Terminal times x = E
n
{0
π
(see (2.104)). Let yπi = zi , i = 1, . . . , n. Note that T n dLztii = {0
0
T
T
···
t1
n
tn−1 i=1
45
yπ
dLti i
dLztii .
(2.151)
Therefore, setting
n
F (t, T ) =
{t
we have x E
n
{0
dLztii
=E
dLztii ,
(2.152)
T
F (t, T ) dLzt 1
x
.
(2.153)
0
Using the terminal time property (2.137) for the second equality and the additivity of Lyt for the last equality, we can write n dLztii (2.154) F (t, T ) = {t
n
=
{t
dLztii
= F (0, T ) ◦ θt . Hence E
T
F (t, T ) dLzt 1
x
=E
0
T
F (0, T ) ◦
x
θt dLzt 1
ds.
(2.155)
0
By (2.126) with Ht ≡ 1 we have T z1 x F (0, T ) ◦ θt dLt E = E z1 (F (0, T )) E x (LzT1 ) .
(2.156)
0
(Since the right-hand side of (2.150) sums the terms in (2.151) over all permutations of (y1 , . . . , yn ), z1 in (2.156) can be any of the yi . Thus if uT (x, yi ) = ∞ for some yi , the left-hand side of (2.148) is infinite.) Combining (2.153), (2.155), and (2.156), we have n zi x (2.157) E dLti {0
46
Brownian motion and Ray–Knight Theorems n zi z1 = uT (x, z1 )E dLti . {0
Iterating this argument we get x E
n
{0
dLztii
(2.158)
= uT (x, z1 )uT (z1 , z2 ) · · · uT (zn−1 , zn ), which gives one of the terms in (2.148). Summing over π we get (2.148). We now note that (2.150)–(2.153) also hold with T replaced by τA (λ) and E x replaced by Eλx . Hence n zi x dLti Eλ (2.159) {0
τA (λ)
= Eλx 0
= Eλx
F (t, τA (λ)) dLzt 1
∞
0
1{λ>At } F (t, τA (λ)) dLzt 1
,
where, as in (2.152), F (t, τA (λ)) =
n
{t
dLztii .
Using Lemma 2.5.2 we see that ∞ 1{λ>At } F (t, τA (λ)) dLzt 1 Eλx 0 ∞ e−αAt F (t, t + τA (λ) ◦ θt ) dLzt 1 = Eλx 0 ∞ e−αAt F (0, τA (λ)) ◦ θt dLzt 1 , = Eλx
(2.160)
(2.161)
0
where the last equality uses the additivity of local times as in the last equality of (2.154). By (2.126) with T ≡ ∞ we have ∞ (2.162) e−αAt F (0, τA (λ)) ◦ θt dLzt 1 Eλx 0 ∞ e−αAt dLzt 1 = Eλz1 (F (0, τA (λ))) E x 0
= Eλz1 (F (0, τA (λ))) uτA (λ) (x, z1 ),
2.5 Terminal times
47
where, for the last equation, we use (2.147). Combining (2.159)–(2.162) we see that Eλx
n
{0
dLztii
=
uτA (λ) (x, z1 )Eλz1
Iterating this argument we get x Eλ
n
{0
n
{0
dLztii
.
dLztii
(2.163)
= uτA (λ) (x, z1 )uτA (λ) (z1 , z2 ) · · · uτA (λ) (zn−1 , zn ), which gives one of the terms in (2.148) with T replaced by τA (λ) and E x replaced by Eλx . Summing over π we get the other terms. We often suppress the λ in Eλx when the meaning is clear from the context. When the CAF A = {At , t ∈ R+ } is L0 = {L0t , t ∈ R+ }, the local time of Brownian motion at 0, τL0 (s) is the same as τ (s), the inverse local time of Brownian motion at 0, defined in Subsection 2.4.1. We use the latter notation. In order to apply Theorem 2.5.3 when A = L0 we calculate uτ (λ) in the next lemma. Lemma 2.5.4 Let W be a Brownian motion (Ω, F, Ft , P x ) and let {Lyt , (y, t) ∈ R1 × R+ } be the local times of W . Let λ be an exponential random variable with mean 1/α which is independent of (Ω, F, Ft , P x ). Then 1 (2.164) uτ (λ) (x, y) = Eλx (Lyτ (λ) ) = uT0 (x, y) + , α where uT0 (x, y) is given in (2.139). Proof Note that τ (λ) = T0 + τ (λ) ◦ θT0 , so that by the strong Markov property, as in the proof of (2.95), ∞ y β x −βt u (x, y) = E (2.165) e dLt 0 τ (λ) ∞ y y x −βt x −βt = Eλ e dLt + Eλ e dLt 0
T0 +τ (λ)◦θT0
48
Brownian motion and Ray–Knight Theorems τ (λ) ∞ y y x −βt x −βT0 0 −βt = Eλ e dLt + E (e )Eλ e dLt 0
τ (λ)
= Eλx
e−βt dLyt
0
= Eλx
τ (λ)
τ (λ)
+ E x (e−βT0 )Eλ0 (e−βτ (λ) )uβ (0, y)
e−βt dLyt
0
+
uβ (x, 0)uβ (0, y) 0 −βτ (λ) Eλ (e ), uβ (0, 0)
where for the last equality we again use (2.95). Consequently τ (λ) y x −βt uτ (λ) (x, y) = lim Eλ e dLt β→0
=
lim
β→0
0
uβ (x, y) −
uβ (x, 0)uβ (0, y) uβ (0, 0)
(2.166)
!
uβ (x, 0)uβ (0, y) (1 − Eλ0 (e−βτ (λ) )). β→0 uβ (0, 0)
+ lim
It follows from (2.141) and (2.139) that ! uβ (x, 0)uβ (0, y) β lim u (x, y) − = uT0 (x, y). β→0 uβ (0, 0)
(2.167)
Also, by (2.5), uβ (x, 0)uβ (0, y) 0 −βτ (λ) 1 − E e λ β→0 uβ (0, 0) √ e− 2β(|x|+|y|) √ 1 − Eλ0 e−βτ (λ) = lim β→0 2β 1 1 − Eλ0 e−βτ (λ) = lim √ β→0 2β α 1 = lim √ 1− √ β→0 2β 2β + α √ 1 1 2β = lim √ √ = , β→0 α 2β 2β + α lim
(2.168)
where we use (2.132) for the third equality.
2.6 The First Ray–Knight Theorem In addition to considering {Lxt , (x, t) ∈ R1 × R+ }, the local times of Brownian motion, which is a stochastic process on R1 × R+ , we often can get interesting descriptions of the stochastic process on R1 , or subsets A of R1 , given by {LxT , x ∈ A}, where T is a stopping time. The
2.6 The First Ray–Knight Theorem
49
classical First Ray–Knight Theorem is a relationship of this sort. It describes the behavior of the Brownian local times {LrT0 , r ∈ R+ } in terms of squares of associated Gaussian processes, which, in this case, are two independent Brownian motions. This theorem, which was proved independently by Ray (1963) and Knight (1969), is the earliest example of such a result that we are aware of. We refer to results of this sort as isomorphism theorems. To make matters transparent we initially state and prove this theorem in its simplest, restricted form (that is, we only consider a portion of the domain of LT· 0 ). Afterwards we prove the general theorem. There are many proofs of this theorem in the literature. We give a very simple proof which we generalize to a wider class of processes in Theorem 8.2.6. ¯t denote two independent Brownian motions starting at Let Bt and B ¯ 2 : t ∈ R+ } is called a second order 0. By definition, the process {Bt2 + B t squared Bessel process (see Section 14.2). Theorem 2.6.1 (First Ray–Knight Theorem, restricted form) Let x > 0. Then, under P x × PB,B¯ , law ¯ 2 : r ∈ [0, x]} {LrT0 : r ∈ [0, x]} = {Br2 + B r
(2.169)
on C([0, x]). Equivalently, under P x , between 0 and x, LrT0 has the law of a second order squared Bessel process starting at 0. The key to the proof of Theorem 2.6.1 is the following lemma which is a simple consequence of Kac’s moment formula (Theorem 2.5.3). Lemma 2.6.2 Let T be a terminal time with potential density uT (x, y) defined in (2.138). Let Σ be the matrix with elements Σi,j = uT (xi , xj ), i, j = 1, . . . , n. Let Λ be the matrix with elements {Λ}i,j = λi δi,j . For all λ1 , . . . , λn sufficiently small and 1 ≤ l ≤ n, n det(I − ΣΛ) xi xl λi LT = E exp , (2.170) det(I − ΣΛ) i=1 where i,j = Σi,j − Σl,j Σ
i, j = 1, . . . , n.
(2.171)
These equations also hold when T is replaced by τA (λ), for any CAF, A = {At , t ∈ R+ }, and E x is replaced by Eλx .
50 Proof
Brownian motion and Ray–Knight Theorems By (2.148), k n λi LxTi E xl
(2.172)
i=1
= k!
n
uT (xl , xj1 )λj1 uT (xj1 , xj2 )λj2 uT (xj2 , xj3 ) · · ·
j1 ,...,jk =1
uT (xjk−2 , xjk−1 )λjk−1 uT (xjk−1 , xjk )λjk = k!
n
{(ΣΛ)k }l,jk
jk =1
for all k. Therefore n n xi xl λi LT = {(I − ΣΛ)−1 }l,j = {(I − ΣΛ)−1 1t }l , (2.173) E exp i=1
j=1
where 1t denotes the transpose of the n-dimensional vector (1, . . . , 1). Consequently (I − ΣΛ)Y = 1t ,
(2.174) n xi where Y is the n-dimensional vector with components E xl e( i=1 λi LT ) , l = 1, . . . , n. Therefore, by Cramer’s Theorem, n det((I − ΣΛ)(l) ) xi xl E exp λi LT = , (2.175) det(I − ΣΛ) i=1 where (I − ΣΛ)(l) is the matrix obtained by replacing the l-th column of (I − ΣΛ) by a column with all of its elements equal to 1. Subtracting the l-th row of (I − ΣΛ)(l) from each of its other rows, we obtain a matrix A with its l-th column equal to δj,l , j = 1, . . . , n and with entries i,j for i, j = 1, . . . , n for i, j = l. Note that (I − ΣΛ) det((I − ΣΛ)(l) ) = det(A)
(2.176)
and expanding det(A) by the l-th column we see that det(A) = det(B),
(2.177)
where B is the (n − 1) × (n − 1) matrix obtained by deleting the l-th column and l-th row from A. Note that {(I − ΣΛ)} l,j = δl,l , so that expanding det(I − ΣΛ) by the l-th row we see that = det(C), det(I − ΣΛ)
(2.178)
where C is the (n − 1) × (n − 1) matrix obtained by deleting the l-th
2.6 The First Ray–Knight Theorem
51
column and l-th row from I − ΣΛ. Considering the description of the matrix A in the sentence preceding (2.176), we see that C = B. Thus det((I − ΣΛ)(l) ) = det((I − ΣΛ)).
(2.179)
Using this in (2.175) we get (2.170). Proof of Theorem 2.6.1 To prove that two continuous stochastic processes are equal in law on C([0, x]) it suffices to show that they have the same finite-dimensional distributions. We do this by showing that they have the same finite-dimensional moment generating functions. Let 0 < x1 < · · · < xn = x. Since T0 is a terminal time, we apply Lemma 2.6.2 with T = T0 and l = n. By (2.139), Σi,j = 2(xi ∧xj ), i, j = 1, . . . , n. Since i,j = Σi,j − Σn,j = 0 Σ
for j ≤ i ≤ n,
(2.180)
we see that (I − ΣΛ) is an upper triangular matrix with ones on the diagonal. Therefore = 1. det(I − ΣΛ)
(2.181)
Theorem 2.6.1 now follows from Lemma 2.6.2 and the fact that n 1 2 λi Bxi . (2.182) E exp = det(I − ΣΛ) i=1 This is a straightforward calculation. Note that Σ = 2C, where C is given on page 18. Using (2.29) we see that n n 2 2 −1 1 Ee i=1 λi Bxi = e i=1 λi yi −(y,C y)/2 dy n/2 1/2 (2π) (det C) n 2 −1 1 = e i=1 λi yi /2−(y,Σ y)/2 dy n/2 1/2 (2π) (det Σ) −1 1 = e−(y,(Σ −Λ)y)/2 dy. n/2 1/2 (2π) (det Σ) 1/2 det(Σ−1 − Λ)−1 = (det(I − ΣΛ))−1/2 . (2.183) = det Σ
In Theorem 2.6.1 we considered LrT0 for r ∈ [0, x]. We now allow r to be in R+ . Note that r < 0 is not possible because the process is killed when it hits 0. On the other hand, it is clear that the following theorem is also valid for r ∈ (−∞, 0], with obvious modifications.
Brownian motion and Ray–Knight Theorems
52
Theorem 2.6.3 (First Ray–Knight Theorem) Let x > 0. Then, under P x × PB,B¯ , law
2 ¯ 2 )1{r≥x} : r ∈ R+ } = {B 2 + B ¯ 2 : r ∈ R+ } +B {LrT0 + (Br−x r−x r r (2.184) on C(R+ ). Equivalently, under P x , LrT0 between 0 and x has the law of a second order squared Bessel process Y = {Yr ; 0 ≤ r ≤ x} with Y0 = 0, and then proceeds from x as a 0-th order squared Bessel process Z = {Zr ; x ≤ r < ∞} with Zx = Yx , where Z also has the property that, conditioned on Yx , it is independent of Y .
Proof The proof is just a slight generalization of the previous proof. Let 0 < x1 < · · · < xl = x < . . . < xn . It suffices to show that the mo¯x2 −x I{x >x } , i ∈ ment generating functions of {LxTi0 + Bx2i −xl I{xi >xl } + B i l i l 2 2 ¯ 1, . . . , n} and {Bxi + Bxi , i ∈ 1, . . . , n} are equal, for all x1 , . . . , xn in R+ , for all n. By Lemma 2.6.2, n det(I − ΣΛ) xi xl λi LT0 = E exp , (2.185) det(I − ΣΛ) i=1 where i,j = 2 ((xi ∧ xj ) − (xl ∧ xj )) Σ
i, j = 1, . . . , n.
(2.186)
Taking (2.182) into account we see that, to complete the proof, we need only show that n 1 2 λi Bxi −xl . (2.187) E exp =" i=l+1 det(I − ΣΛ) It is easy to see that i,j = 0 Σ
1≤j ≤i∧l
(2.188)
and i,j = 2((xj − xl ) ∧ (xi − xl )) Σ
l < i ∧ j ≤ n.
(2.189)
Thus = det(I − ΣΛ), det(I − ΣΛ)
(2.190)
i,j , l < i ∧ j ≤ n. where Σ is an (n − l) × (n − l) matrix with Σi,j = Σ Equation (2.187) follows from this and (2.182). The equivalency is proved in Remark 14.2.3.
2.7 The Second Ray–Knight Theorem
53
2.7 The Second Ray–Knight Theorem Recall that τ (t) = inf{s | L0s > t} is the right continuous inverse of the local time at 0 of a standard Brownian motion. Heuristically, it is the amount of time it takes for the local time at zero to be equal to t and, by the continuity and support properties of L0s , L0τ (t) = t. The Second Ray–Knight Theorem describes the behavior of the Brownian local times {Lrτ (t) , r ∈ R+ } in terms of squares of standard Brownian √ motion and Brownian motion starting from t. As for the First Ray– Knight Theorem, there are many proofs of this theorem in the literature. We give a simple proof along the lines of our proof of the First Ray– Knight Theorem. Theorem 2.7.1 (Second Ray–Knight Theorem) Let t > 0. Then, under the measure P 0 × PB , ! √ 2 law x 2 {Lτ (t) + Bx ; x ≥ 0} = Bx + t ; x ≥ 0 (2.191) on C(R+ ), where {Bx ; x ≥ 0} is a real-valued Brownian motion starting at 0, independent of the original Brownian motion (that is, the Brownian motion with local times Lrτ (t) ). Equivalently, under P 0 , {Lrτ (t) ; r ∈ R+ } has the law of a 0-th order squared Bessel process starting at t. Proof Like the First Ray–Knight Theorem, this theorem is also a simple application of Lemma 2.6.2, this time applied to τ (λ). Let ξ be an exponential random variable with mean 1/α. By Lemma 2.5.4, uτ (ξ) (x, y) = uT0 (x, y) + 1/α, where uT0 (x, y) = 2(x ∧ y). Let x1 = 0 and x2 , . . . , xn > 0. By Lemma 2.6.2, n det(I − ΣΛ) x1 xi λi Lτ (ξ) = , (2.192) Eξ exp det(I − ΣΛ) i=1 where Σi,j = 2(xi ∧ xj ) + 1/α
i, j = 1, . . . , n
(2.193)
and i,j = 2(xi ∧ xj ) Σ
i, j = 1, . . . , n
since uT0 (0, y) = 0 for all y ≥ 0. By (2.182), n 1 2 E exp . =" λi Bxi i=1 det(I − ΣΛ)
(2.194)
(2.195)
54
Brownian motion and Ray–Knight Theorems
Let ρ be a N (0, 1) independent of {Bx , x ≥ 0}. Using the fact that Brownian motion has stationary and independent increments, we can law see that {Bx + (2α)−1/2 ρ, x ≥ 0} = {Bx+(2α)−1 , x ≥ 0}. Hence it follows from (2.195) that n 2 1 λi Bxi + (2α)−1/2 ρ . (2.196) E exp = det(I − ΣΛ) i=1 Combining (2.192), (2.195), and (2.196), we see that under the measure P 0 × PB,B,ρ,ρ 2
{Lxτ(ξ) + Bx2 + B x ; x ≥ 0}
(2.197)
= {(Bx + (2α)−1/2 ρ)2 + (B x + (2α)−1/2 ρ)2 ; x ≥ 0}.
law
Let {Bx ; x ≥ 0}, {B x ; x ≥ 0} be independent Brownian motions and let λ1 , . . . , λn be vectors in R2 . Comparison with (2.27) shows that n n 1 (λj , (Bxj , B xj )) = exp − (λj , λk )(xj ∧ xk ) . E exp i 2 j=1 j,k=1
(2.198) It follows from this that if U is any orthogonal transformation in R2 , using the fact that U t = U −1 , we have n E exp i (λj , U (Bxj , B xj )) (2.199) j=1
= E exp i
n
(U −1 λj , (Bxj , B xj ))
j=1
n
1 = exp − (U −1 λj , U −1 λk )(xj ∧ xk ) 2 j,k=1 n 1 (λj , λk )(xj ∧ xk ) , = exp − 2 j,k=1
and consequently law
{U (Bx , B x ) ; x ≥ 0} = {(Bx , B x ) ; x ≥ 0}.
(2.200)
Let ρ be an independent copy of ρ, and let Uρ,ρ be the orthogonal transformation in R2 for which Uρ,ρ (ρ, ρ) = (0, (ρ2 + ρ2 )1/2 ).
(2.201)
2.7 The Second Ray–Knight Theorem
55
Using | · | to denote the Euclidean norm in R2 , it follows from (2.200) and the independence of B, B, ρ, ρ that {|(Bx , B x ) + (2α)−1/2 (ρ, ρ)| ; x ≥ 0} = {|Uρ,ρ (Bx , B x ) + (2α)
−1/2
(2.202) 2
2 1/2
(0, (ρ + ρ )
)| ; x ≥ 0}
= {|(Bx , B x ) + (2α)−1/2 (0, (ρ2 + ρ2 )1/2 )| ; x ≥ 0}.
law
law
Note that (2α)−1 (ρ2 + ρ2 ) = ξ (this follows easily using moment generating functions). Therefore, by (2.197) and (2.202), under the measure P 0 × Pξ × PB,B , 2
2
law
{Lxτ(ξ) + Bx2 + B x ; x ≥ 0} = { B x + (Bx +
ξ)2 ; x ≥ 0}.
(2.203)
By taking the moment generating function of each side of (2.203), it is easy to see that under the measure P 0 × Pξ × PB law {Lxτ(ξ) + Bx2 ; x ≥ 0} = {(Bx + ξ)2 ; x ≥ 0}. (2.204) This is actually the “Laplace transform” of (2.191). To see this note that by (2.204) n n x1 xi 2 λi Lτ (ξ) E exp λi Bxi Eξ exp (2.205) i=1
=E
exp
n
i=1
λi Bxi
2 + ξ
.
i=1
Since ξ is an exponential random variable with mean 1/α, we can write this as n n ∞ xi 2 x1 λi Bxi E exp λi Lτ (t) e−αt dt E exp i=1
∞
E
= 0
exp
0
i=1 n
λi Bxi
√ 2 + t
e−αt dt. (2.206)
i=1
n xi 2 Therefore the moment generating functions of i=1 λi Lτ (t) + Bxi √ 2 n and i=1 λi Bxi + t are equal for all (x1 , . . . , xn ) in R+ for all n for almost all t. However, since the first of these is increasing in t and the second is continuous in t, they are equal for all t. Thus we get (2.191). The equivalency follows from Theorem 14.2.2 as in Remark 14.2.3. Remark 2.7.2 We state the First and Second Ray–Knight Theorems in terms of Bessel processes because this is how they often appear in
56
Brownian motion and Ray–Knight Theorems
the literature (see, e.g., Revuz and Yor (1991, Chapter XI, Section 2)). Also, describing the local times of Brownian motion as a squared Bessel process gives a stochastic integral representation of the local times which can be used to study them further. On the other hand, the isomorphism theorems that we presented first are in a form that easily extends to a much wider class of Markov processes than Brownian motion. The primary purpose of this book is to use the extensions to study the local times of these processes.
2.8 Ray’s Theorem The isomorphisms for the local time of Brownian motion in the preceding two sections are with respect to the spatial variable. The local time itself is evaluated in Theorem 2.6.3 at the first time the process hits zero and in Theorem 2.7.1 at the first time the local time at zero is equal to t. Ray’s Theorem deals with the total accumulated local time of Brownian motion stopped (or, to be more precise, killed, as we explain in Section 3.5) at an independent exponential time. But it is more complicated than this. It actually is given for the local time of the h-transform of Brownian motion killed at an independent exponential time. We study h-transforms in Section 3.9 and obtain a simplified version of Ray’s Theorem for transient regular diffusions in Chapter 12. We proceed to explain Ray’s Theorem without justifying every statement. Let λ be an independent exponential random variable with mean 2 and set uλ (x, y) = Eλx (Lyλ ). To compute this expectation we use Theorem 2.5.3 with At = t, so that τA (λ) = λ. We then have λ y y x x Eλ (Lλ ) = Eλ (2.207) dLt = Eλx
0
∞
0 ∞
1{λ>t} dLyt
Pλ (λ > t) dLyt 0 ∞ e−t/2 dLyt = e−|x−y| , = Ex = Ex
0
where the last equality comes from (2.94) and (2.5). Let {Br,λ , r ∈ R+ } denote Brownian motion killed at the end of an independent exponential time with mean 2. {Br,λ , r ∈ R+ } is a Markov process with 0-potential density uλ (x, y). The calculation in (2.207) should make this believable. We provide a complete proof of this fact in
2.8 Ray’s Theorem
57
Section 3.5. In the rest of this section we set u(x, y) := u1/2 (x, y) = e−|x−y| .
(2.208)
Ray’s Theorem for Brownian motion describes the total accumulated local time of the h-transform of Bλ = {Br,λ , r ∈ R+ }, under the measure P a,b , in terms of squares of associated Gaussian processes. Heuristically, the process with measure P a,b is Bλ , starting at a and conditioned to hit b and die at its last exit from b. {Lr∞ , r ∈ R1 } in Theorem 2.8.1 is the total accumulated local time of this process. Let h(x) = u(x, b)/u(b, b). The potential density of the h-transform process is u(x, y)h(y) u(x, y)u(y, b) = h(x) u(x, b)
(2.209)
(see (3.216)). This follows because the process with measure P a,b is a Markov process with transition density e−t/2 pt (x, y)h(y)/h(x) with respect to Lebesgue measure on R1 , where pt (x, y) is the transition probability density of Brownian motion. The Gaussian process that enters into Ray’s Theorem is the Ornstein– Uhlenbeck process, a mean zero stationary Gaussian process with covariance u(x, y) = exp(−|x − y|), the same function as the 0-potential density of the exponentially killed Brownian motion. In Section 5.1 we discuss how Gaussian processes are determined by their covariances. At this point we define the Ornstein–Uhlenbeck process as a rescaled Brownian motion {Gx = e−x Be2x ; x ∈ R1 }, where {Bx ; x ≥ 0} is a Brownian motion starting at 0. For any s ∈ R1 we set Gx,s = Gx − e−|x−s| Gs .
(2.210)
This is the orthogonal complement of the projection of Gx onto Gs (see Remark 5.1.6). Theorem 2.8.1 (Ray’s Theorem) Let a < b. Under the measure P a,b × PG,G¯ , ¯2 ¯2 G G2r,b G G2r,a r,b r,a r + 1{r≤a} + + 1{r≥b} : r ∈ R1 } {L∞ + 2 2 2 2 law
= {
¯2 G2r G + r : r ∈ R1 } 2 2
(2.211)
¯ x ; x ∈ R1 } are independent Ornstein–Uhlenbeck on C(R1 ), where {Gx , G processes, independent of the Brownian motion.
58
Brownian motion and Ray–Knight Theorems
We call Theorem 2.8.1 Ray’s Theorem because it describes the same process considered in Ray (1963). Ray’s result actually looks quite different from (2.211). He describes Lr∞ separately in the three regions r ∈ (−∞, a], r ∈ [a, b], and r ∈ [b, ∞) as follows: Theorem 2.8.2 (Ray’s Theorem, original version) Let B = {Bt , t ≥ 0} denote a Brownian motion and let λ be an independent exponential random variable with mean 1/2. Let I = inf t≤λ Bt and S = (i) supt≤λ Bt , and let {Gx , x ∈ R1 }, i = 1, . . . , 4 be four independent Ornstein–Uhlenbeck processes, independent of B. Then, under the measure P a,b × PG(1) ,G(2) ,G(3) ,G(4) , {I , Lr∞ ; r ≤ a}
law
=
{I , 1{r>I}
4
(i)
2
Gr,I
/2 ; r ≤ a}
(2.212)
i=1
{Lr∞ ; a ≤ r ≤ b}
law
{S , Lr∞ ; r ≥ b}
law
= =
2 2 (2) { G(1) + G )/2 ; a ≤ r ≤ b} (2.213) r r {S , 1{r<S}
4
(i)
Gr,S
2
/2 ; r ≥ b}, (2.214)
i=1
where {I , Lr∞ ; r ≤ a} is a random variable on R1 × C((−∞, a]) and similarly for {S , Lr∞ ; r ≥ b}. The proofs of Theorems 2.8.1 and 2.8.2 follow from more general results which are given in Chapter 12.
2.9 Applications of the Ray–Knight Theorems The First and Second Ray–Knight Theorems and Ray’s Theorem have been used to obtain interesting results about Brownian local times. We mention several of them here without proof. Some of them are contained in more general results that are proved in later chapters. Joint continuity of Brownian local time. Knight (1969) used the First Ray–Knight Theorem (Theorem 2.6.3) to establish the joint continuity of Brownian local times {Lyt ; (t, y) ∈ R+ × R1 }. Ito and McKean (1974) obtain this result via Ray’s Theorem (Theorem 2.8.2). In Section 2.4 we obtained the joint continuity of Brownian local time by elementary methods. However, none of these methods can be extended to general Markov processes. In Chapter 9 we also use isomorphism theorems, but more general ones than those described in this chapter, to find necessary and sufficient conditions for strongly symmetric Markov processes to have jointly continuous local times.
2.9 Applications of the Ray–Knight Theorems
59
Exact moduli of continuity for Brownian local time. Ray (1963) used the First Ray–Knight Theorem to show that for all x ∈ R1 " |Lx+δ − LxT0 | lim sup T0 = 2 LxT0 δ log log(1/δ) δ→0
P x a.s.
(2.215)
and " |LyT0 − LzT0 | LyT0 lim sup = 2 sup |y − z|(1/ log |y − z|) |y−z|→0 0≤y≤x
P x a.s. (2.216)
0≤y,z≤x
Note the similarity between these equations and (2.14) and (2.15). This is not a coincidence but an instance of a general principle given in Theorems 9.5.1 and 9.5.5. Using Ray’s Theorem, Ray also obtained (2.215) and (2.216) with T0 replaced by an independent exponential time. Thus, by Fubini’s Theorem, they also hold for some fixed time t. Then, using Brownian law law scaling, that is, the fact that Wt = cWt/c2 and consequently Lyt = y/c cLt/c2 , it follows that (2.215) and (2.216) also hold with T0 replaced by any fixed t. Note that (2.215) is not very interesting when x = 0 since, in this case, LxT0 = 0. Indeed, using the First Ray–Knight Theorem (Theorem 2.6.3), it follows that lim sup δ↓0
LδT0 =1 2δ log log(1/δ)
P x a.s.
(2.217)
for x > 0. We consider moduli of continuity of local times in Section 9.5. Quadratic variation of Brownian local time. Using the Second Ray–Knight Theorem, Perkins (1982) showed that a xi−1 2 xi lim =4 Lyt dy (2.218) Lt − Lt n→∞
xi ∈π(n)
0
in probability, for any sequence of partitions π(n) = {0 = x0 < x1 < . . . < xn = a}. We consider the p-variation of local times in Chapter 10. Quadratic variation is the case when p = 2. Most visited site. Bass and Griffin (1985) used the Second Ray– Knight Theorem to study the most visited points for Brownian motion
60
Brownian motion and Ray–Knight Theorems
up to time t. Let Vt = {x ∈ R1 | Lxt = sup Lyt }. y
Heuristically, the more time Brownian motion spends at a point in its range, the greater the local time at that point. Thus V can be thought of as the points at which the Brownian motion path spends the most time, up to time t. We refer to these points as the most visited sites of Brownian motion or as the favorite sites of Brownian motion. For specificity we consider Vt = inf |x|
(2.219)
x∈Vt
the most visited point up to time t with the smallest absolute value. In Section 11.2 we show that, for any γ > 3, (log t)γ Vt = ∞ t→∞ t1/2 lim
P 0 a.s.
(2.220)
Following Bass and Griffin, we describe this phenomenon by saying that the most visited site of Brownian motion is transient. (Information on the rate of growth of V (t) is also obtained in Bass and Griffin (1985).) We also show in Section 11.2 that, for any γ > 3, | log t|γ Vt = ∞ t→0 t1/2 lim
P 0 a.s.
(2.221)
This shows, perhaps counterintuitively, that the most visited site by Brownian motion, as it starts from zero, stays away from zero. Eisenbaum and Shi (1999) used the Second Ray–Knight Theorem to analyze the rarely visited points of Brownian motion, the set of points x such that Lxt is greater than zero but is “small.” In Chapter 11 we study the most visited sites of symmetric β-stable processes, 1 < β ≤ 2. Random walks. Hu and Shi (1998) used the First Ray–Knight Theorem for Brownian motion to derive almost sure properties of a simple random walk in a random environment and of Brox’s diffusion, which is its continuous time analog. The Edwards model. van der Hofstad, den Hollander and Konig (1997) used the Ray–Knight Theorems to derive a central limit theorem for the Edwards model of polymers.
2.10 Notes and references
61
2.10 Notes and references Brownian motion is the pearl of probability theory. In this chapter we developed some of its properties that generalize to strongly symmetric Markov processes, our main concern in this book. It probably is not necessary for us to point out that there is much more to learn about Brownian motion. There are many excellent books on it. In particular, we mention Revuz and Yor (1991), Rogers and Williams (2000b), Port and Stone (1978), Chung (1982), and Knight (1981). For more information on the standard augmentation, we refer the reader to Rogers and Williams (2000b), Chung (1982), and Blumenthal and Getoor (1968). Recall that the standard augmentation is also referred to as the usual augmentation. Brownian local time was first studied in L´evy (1948). The joint continuity of Brownian local time is due to Trotter (1958). Our approach, which proves the existence and joint continuity of Brownian local time simultaneously, is based on ideas developed for studying intersection local times in Rosen (1986). Kac’s formula is discussed in great generality in Fitzsimmons and Pitman (1999). Lemma 2.6.2, which is a simple consequence of Kac’s Lemma, is from Eisenbaum, Kaspi, Marcus, Rosen and Shi (2000), although this result was probably recognized earlier. Considered along with Lemma 5.2.1, it makes it evident that there is a relationship between local times and squares of Gaussian processes. The Ray–Knight Theorems were proved independently by Ray (1963) and Knight (1969). For other proofs see Revuz and Yor (1991, Section XI.2). Ray’s Theorem first appeared in Ray (1963). It has been the subject of many investigations, reformulations, and new proofs. See, e.g., the work of Williams (1974), Sheppard (1985), Biane and Yor (1988), and Eisenbaum (1994).
3 Markov processes and local times
In this book we study the local times of Markov processes by means of isomorphism theorems that relate them to Gaussian processes. The natural class of Markov processes to consider are strongly symmetric Borel right processes with continuous potential densities. In this chapter we define these processes and study many of their properties. In Chapter 2 we focus on Brownian motion to give an overview of some of our goals in this book. However, even for such a regular process as Brownian motion, there are many complexities in the development of Markov process theory. This is particularly evident in Section 2.3 on standard augmentation. Much of the material presented in Chapter 2 is presented in a manner that also makes it relevant to strongly symmetric Borel right processes. We use this material freely in this chapter, especially in the first two sections.
3.1 The Markov property Let S be a locally compact space with a countable base. We use B = B(S) to denote the Borel σ-algebra of S. A function P = P (x, A) is called a sub-Markov kernel on (S, B) if for each x ∈ S, P (x, · ) is a positive measure on (S, B) of total mass less than or equal to 1, and for each A ∈ B, P ( · , A) is a B measurable function on S. Unless otherwise indicated, a sub-Markov kernel will refer to a sub-Markov kernel on (S, B). A sub-Markov kernel of total mass 1 is called a Markov kernel. For a sub-Markov kernel P and f ∈ Bb (S), we set P f (x) = f (y)P (x, dy) (3.1) S
and note that P maps bounded Borel measurable functions into bounded Borel measurable functions. A family {Pt ; t ≥ 0} of sub-Markov kernels 62
3.1 The Markov property
63
on (S, B) is called a sub-Markov semigroup on (S, B) if Pt+s f (x) = Pt (Ps f )(x)
(3.2)
for all s, t ≥ 0, x ∈ S, and f ∈ Bb (S) (bounded Borel measurable functions S). Unless otherwise indicated, we always take P0 = I. Semigroups such as {Pt ; t ≥ 0} that map Bb (S) to Bb (S) are referred to as Borel semigroups. Imagine a particle starting at x at time t = 0. Intuitively, Pt (x, A) can be thought of as the probability that the particle is in the set A at time t. The fact that we may have Pt (x, S) < 1 means that there is a chance that the particle has “disappeared” by time t. To deal with this mathematically we introduce a new state ∆, called the “cemetery state,” and set S∆ = S ∪ ∆. If S is already compact, we consider ∆ as an isolated point. If S is not compact, we consider ∆ as the one-point compactification of S, the “point at infinity.” A family {Pt ; t ≥ 0} of sub-Markov kernels that is a sub-Markov semigroup can be extended to become a family of Markov kernels on S∆ in a unique way. We define Pt (x, ∆) = 1 − Pt (x, S)
(3.3)
Pt (∆, ∆) = 1.
(3.4)
for all x ∈ S and
It is easily verified that the extended family {Pt ; t ≥ 0} forms a Markov semigroup on S∆ . We use the convention that any function f defined on S, but not defined on ∆, is automatically extended to S∆ by setting f (∆) = 0. Whenever we refer to a Markov semigroup {Pt ; t ≥ 0} on S∆ we assume that it satisfies (3.3) and (3.4). Let {Xt ; t ≥ 0} be a stochastic process on a probability space (Ω, G, P ) with values in (S∆ , B(S∆ )). Let {Gt ; t ≥ 0} be a filtration of G and {Pt ; t ≥ 0} a Markov semigroup on S∆ . {Xt ; t ≥ 0} is called a simple Markov process with respect to {Gt ; t ≥ 0} with transition semigroup {Pt ; t ≥ 0} if Xt is Gt measurable for each t ≥ 0 and E (f (Xt+s ) | Gt ) = Ps f (Xt )
(3.5)
for all s, t ≥ 0 and f ∈ Bb (S∆ ). We refer to S as the state space of {Xt ; t ≥ 0}, even though, strictly speaking, X has values in S∆ . Since Ps f (Xt ) is Xt measurable, (3.5) implies that E (f (Xt+s ) | Gt ) = E (E (f (Xt+s ) | Gt ) |Xt ) .
(3.6)
Markov processes and local times
64
Therefore, using a basic property of conditional expectation, we have E (f (Xt+s ) | Gt ) = E (f (Xt+s ) | Xt ) .
(3.7)
This is a more familiar way to express the simple Markov property which says that the future values of the process {Xt ; t ≥ 0}, given its past, depend only on its present values. Note that the semigroup property (3.2) is necessary for (3.5) to be consistent. Remark 3.1.1 To check that {Xt ; t ≥ 0} is a simple Markov process, it suffices to verify (3.5) for f ∈ Cb (S∆ ). Furthermore, since (3.5) holds trivially for constant functions, it suffices to check it for f ∈ C0 (S). Repeated use of the Markov property (3.5) leads to n n n fi (Xti ) = E Pt1 (X0 , dz1 ) Pti −ti−1 (zi−1 , dzi ) fi (zi ) E i=1
i=2
i=1
(3.8) for any 0 < t1 < . . . < tn and any fi ∈ Bb (S∆ ), i = 1, . . . , n. This shows that the finite-dimensional distributions of the simple Markov process {Xt ; t ≥ 0} are determined by its transition semigroup {Pt ; t ≥ 0} and its initial distribution, the distribution of X0 . Let {Pt ; t ≥ 0} be a Markov semigroup on S∆ . The collection X = (Ω, G, Gt , Xt , θt , P x ) is called a right continuous simple Markov process with transition semigroup {Pt ; t ≥ 0} if the following three conditions hold: (1) (Ω, G, Gt ) is a filtered measurable space, with Xt : (Ω, Gt ) → (S∆ , B(S∆ )), and {Xt ; t ≥ 0} is right continuous, that is, t → Xt (ω) is a right continuous map from R+ to S∆ . (2) {θt ; t ≥ 0} is a collection of shift operators for X, that is, θt : Ω → Ω with θt ◦ θs = θt+s
and
Xt ◦ θs = Xt+s
∀s, t ≥ 0.
(3.9)
(3) For each x ∈ S∆ , P x (X0 = x) = 1, and {Xt ; t ≥ 0} on (Ω, G, P x ) is a simple Markov process with respect to {Gt ; t ≥ 0} with transition semigroup {Pt ; t ≥ 0}. Let X be a right continuous simple Markov process with transition semigroup {Pt ; t ≥ 0}. It follows from (3.8) that n n n x E fi (Xti ) = Pt1 (x, dz1 ) Pti −ti−1 (zi−1 , dzi ) fi (zi ) i=1
i=2
i=1
(3.10)
3.1 The Markov property
65
for any 0 < t1 < . . . < tn and any fi ∈ Bb (S∆ ), i = 1, . . . , n. In particular, (3.11) E x (f (Xt )) = Pt f (x) = Pt (x, dz)f (z). Let A ∈ B(S∆ ). When f = IA , (3.11) takes the form P x (Xt ∈ A) = Pt IA (x) = Pt (x, A).
(3.12)
The process {Xs (ω) ; (s, ω) ∈ R+ × Ω} with values in (S∆ , B(S∆ )) is B(R+ ) × G measurable. This is easy to see since Xs (ω) = lim
n→∞
∞
1[(j−1)/2n ,j/2n ) (s)Xj/2n (ω),
(3.13)
j=1
which holds because of the right continuity of Xs . Consequently, f (Xt ) is B(R+ ) × G measurable for any f ∈ B(S∆ ). This allows us to define the α-potential operator ∞ ∞ α x −αt U f (x) = E e f (Xt ) dt = e−αt Pt f (x) dt (3.14) 0
0
for all f ∈ Bb (S∆ ), where the last equality follows from Fubini’s Theorem. We also use the definition (3.14) for α = 0 whenever the integral is defined, in particular when f is nonnegative, in which case U 0 f (x) may be infinite. Letting f = IA in (3.14) for A ∈ B(S∆ ) and using (3.12) yields another measure on B(S∆ ): ∞ e−αt Pt (x, A) dt. (3.15) U α (x, A) := U α IA (x) = 0
One can think of (αe−αt Pt (x, dy)) as a product probability measure and interpret αU α (x, A) as the probability that a particle, starting at x at time zero, is in the set A at the end of an independent exponential time with mean 1/α. Lemma 3.1.2 For any α > 0 and f ∈ Cb (S∆ ), lim αU α f (x) = f (x)
α→∞
∀ x ∈ S∆ .
(3.16)
Proof Using (3.11) and the fact that Xt is right continuous, we see that limt→0 Pt f (x) = f (x) for any f ∈ Cb (S∆ ). Therefore, by the Dominated Convergence Theorem, ∞ e−αt Pt f (x) dt (3.17) lim αU α f (x) = lim α α→∞
α→∞
0
66
Markov processes and local times ∞ = lim e−t Pt/α f (x) dt = f (x). α→∞
0
The statement in the next lemma is called the resolvent equation. Lemma 3.1.3 For any α, β > 0 and f ∈ Cb (S∆ ), U α f (x) − U β f (x)
=
(β − α)U α U β f (x).
=
(β − α)U U f (x). β
(3.18)
α
This also holds for α = 0 whenever U 0 f (x) is defined. Proof Consider first α, β > 0. We have U α f (x) − U β f (x) ∞ −αs = e − e−βs Ps f (x) ds 0 ∞ s = (β − α) e−(α−β)t dt e−βs Ps f (x) ds 0 0 ∞ ∞ e−(α−β)t e−βs Ps f (x) ds dt = (β − α) 0 ∞ t ∞ = (β − α) e−αt e−βs Ps+t f (x) ds dt 0 0 ∞ ∞ e−αt e−βs Pt Ps f (x) ds dt = (β − α) 0 0 ∞ ∞ = (β − α) e−αt Pt e−βs Ps f (x) ds dt 0 α
(3.19)
0
= (β − α)U U f (x). β
To get the second equation in (3.18), simply interchange α and β in the first equation. The case α = 0 is proved separately for the positive and negative parts of f . We let α decrease to 0 in (3.18) and use the Monotone Convergence Theorem. Let X = (Ω, G, Gt , Xt , θt , P x ) be a right continuous simple Markov process with transition semigroup {Pt ; t ≥ 0}. As in the discussion of standard augmentation in Section 2.3, let M denote the set of finite positive measures on (S∆ , B(S∆ )). For each µ ∈ M and A ∈ G, set P µ (A) = P x (A) dµ(x). (3.20)
3.2 The strong Markov property
67
Let G µ be the P µ completion of G and let Nµ be the collection of null sets for (Ω, G µ , P µ ). Let Gtµ = Gt ∨ Nµ and set G = ∩µ∈M G µ
and
G t = ∩µ∈M Gtµ .
(3.21)
We say that {Gt ; t ≥ 0} is augmented if Gt = G t for each t. Suppose that {Gt ; t ≥ 0} in the previous paragraph is not augmented. Nevertheless, we have that, for each µ ∈ M, E µ (f (Xt+s ) | Gtµ ) = Ps f (Xt )
(3.22)
for all s, t ≥ 0 and f ∈ Bb (S∆ ). To see this note that the two measures A → E µ (1A f (Xt+s )) and A → E µ (1A Ps f (Xt )) agree on Gt by the simple Markov property and they also agree on Nµ , where they are both 0. Therefore, they agree on Gtµ , which is (3.22). Clearly we can replace Gtµ by G t in (3.22). Hence, in dealing with a right continuous simple Markov process, we can always take {Gt ; t ≥ 0} to be augmented. Recall that {Gt ; t ≥ 0} is said to be right continuous if Gt = ∩s>t Gs for each t ≥ 0. A collection X = (Ω, G, Gt , Xt , θt , P x ) that satisfies the following three conditions is called a Borel right process with transition semigroup {Pt ; t ≥ 0}: (1) X is a right continuous simple Markov process with transition semigroup {Pt ; t ≥ 0}. (2) U α f (Xt ) is right continuous in t for all α > 0 and all f ∈ Cb (S∆ ), the set of bounded continuous functions on S∆ . (3) {Gt ; t ≥ 0} is augmented and right continuous. The name “Borel right process” refers to the requirements of right continuity and the fact that {Pt ; t ≥ 0} is a Borel semigroup.
3.2 The strong Markov property A right continuous simple Markov process X = (Ω, G, Gt , Xt , θt , P x ) with transition semigroup {Pt ; t ≥ 0} is said to have the strong Markov property if for each Gt+ stopping time T E x f (XT +s )1{T <∞} | GT + = Ps f (XT )1{T <∞} (3.23) for all s ≥ 0, x ∈ S∆ and all f ∈ Bb (S∆ ). The next theorem provides a criterion for determining when a simple
Markov processes and local times
68
Markov process has the strong Markov property, which is tailored to our needs in this book. Theorem 3.2.1 Let X = (Ω, G, Gt , Xt , θt , P x ) be a right continuous simple Markov process with transition semigroup {Pt ; t ≥ 0}. If U α f (Xt ) is almost surely right continuous in t for all f ∈ Cb (S∆ ), for all α > 0, the strong Markov property holds. In particular, a Borel right process has the strong Markov property. Proof As in the proof of Lemma 2.2.5, Ps f (XT )1{T <∞} is GT + measurable. Hence it suffices to show that (3.24) E x 1A f (XT +s )1{T <∞} = E x 1A Ps f (XT )1{T <∞} for each A ∈ GT + and bounded continuous function f . As we point out after (3.11), since {Xt ; t ≥ 0} is right continuous in t, Pt f (x) = E x (f (Xt )) is also right continuous in t for f ∈ Cb (S∆ ). Therefore we can obtain (3.24) by showing that their Laplace transforms are equal, that is, that ∞ E x 1A 1{T <∞} e−αs f (XT +s ) ds = E x 1A 1{T <∞} U α f (XT ) 0
(3.25) for each α > 0. [2n T ] + 1 , as in the proof of Lemma 2.2.5, and recall that Let Tn = 2n for each n, Tn is a Gt stopping time, Tn ↓ T , and T < ∞ if and only if Tn < ∞, for all n ≥ 1. We claim that (3.25) holds with T replaced by Tn . Once this is established, (3.25) itself follows by taking the limit as n → ∞ and using right continuity. Replace T by Tn in the left-hand side of (3.25) and note that since A ∈ GT + , we have A ∩ {Tn = j/2n } ∈ Gj/2n . Therefore, by the simple Markov property, ∞ e−αs f (XTn +s ) ds (3.26) E x 1A 1{Tn <∞} =
=
=
∞
E
x
j=1 ∞ ∞ j=1 0 ∞ ∞ j=1
0
0
1A 1{Tn =j/2n }
∞
−αs
e
f (XTn +s ) ds
0
e−αs E x 1A∩{Tn =j/2n } f (Xj/2n +s ) ds e−αs E x 1A∩{Tn =j/2n } Ps f (Xj/2n ) ds
3.2 The strong Markov property ∞ = e−αs E x 1A 1{Tn <∞} Ps f (XTn ) ds,
69
0
which gives the right-hand side of (3.25) with T replaced by Tn . The strong Markov property (3.23) implies that for each Gt+ stopping time T , (3.27) E x Y ◦ θT 1{T <∞} | GT + = E XT (Y ) 1{T <∞} for all x ∈ S and F 0 measurable functions Y . (Of course, if {Gt ; t ≥ 0} is right continuous, GT + = GT .) To prove (3.27) it suffices to prove n it for Y of the form Y = i=1 fi (Xti ) with 0 < t1 < . . . < tn and n fi ∈ Bb (S∆ ), i = 1, . . . , n. Then Y ◦ θT = i=1 fi (XT +ti ) and (3.27) follows by repeated use of (3.23), precisely as in the proof of Lemma 2.2.2. Indeed, both sides of (3.27) are equal to n n Pti −ti−1 (zi−1 , dzi ) fi (zi ). (3.28) 1{T <∞} Pt1 (XT , dz1 ) i=2
i=1
Note the similarity of (3.28) and (3.8). Remark 3.2.2 Let X = (Ω, G, Gt , Xt , θt , P x ) satisfy the first two conditions for a Borel right process. By Theorem 3.2.1, X has the strong Markov property (3.23). In particular, for the constant stopping time T = t we have that for each x ∈ S∆ , E x (f (Xt+s ) | Gt+ ) = Ps f (Xt )
(3.29)
for all s, t ≥ 0 and f ∈ Bb (S∆ ). This shows that X = (Ω, G, Gt+ , Xt , θt , P x ) is a right continuous simple Markov process with transition semigroup {Pt ; t ≥ 0}. As mentioned near the end of Section 3.1, by replacing Gt with G t , we can always assume that {Gt ; t ≥ 0} is augmented. It is easy to check that, in this case, {Gt+ ; t ≥ 0} is also augmented. Since {Gt+ , t ≥ 0} is clearly right continuous, (Ω, G, Gt+ , Xt , θt , P x ) is a Borel right process. Let X = (Ω, G, Gt , Xt , θt , P x ) satisfy the first two conditions for a Borel right process. As in the study of Brownian motion, we set 0 Ft0 = σ(Xs ; s ≤ t) and Ft = F t , that is {Ft , t ≥ 0} is the standard augmentation of {Ft0 , t ≥ 0}. {Ft0 , t ≥ 0} is the minimal σ-algebra with respect to which {Xs ; s ≤ t} the first two condi- is measurable. Therefore, Ft0 ⊆ Gt . Since X satisfies tions for a Borel right process, the same is true of Ω, F 0 , Ft0 , Xt , θt , P x and, by the discussion near the end of Section 3.1, of (Ω,F, Ft , Xt ,θt , P x ).
70
Markov processes and local times
As in the proof of Lemma 2.3.1, we can show that {Ft ; t ≥ 0} is right continuous. Thus, (Ω, F, Ft , Xt , θt , P x ) is a Borel right process. We summarize this discussion in the next lemma. Lemma 3.2.3 Let X = (Ω, G, Gt , Xt , θt , P x ) satisfy the first two conditions for a Borel right process. Let Ft0 = σ(Xs ; s ≤ t) and let {Ft , t ≥ 0} be the standard augmentation of {Ft0 , t ≥ 0}. Then (Ω, F, Ft , Xt , θt , P x ) is a Borel right process. As in the proof of Lemma 2.3.2, (3.27) extends to all F measurable functions Y . Lemma 3.2.4 Let X = (Ω, G, Gt , Xt , θt , P x ) be a Borel right process. Then, for each Gt stopping time T , E x Y ◦ θT 1{T <∞} | GT = E XT (Y ) 1{T <∞} (3.30) for all x ∈ S∆ and F measurable functions Y . In particular, taking Gt = Ft and using Lemma 3.2.3 we have that, for each Ft stopping time T , (3.31) E x Y ◦ θT 1{T <∞} | FT = E XT (Y ) 1{T <∞} for all x ∈ S and F measurable functions Y . Since the simple Markov property concerns the conditional expectation of a function of X, we can only obtain (3.27) and hence (3.30) for F measurable functions Y . On the other hand, because of later applications, we want to allow conditioning on a possibly larger σ-algebra. This is why we introduce the generic σ-field G in the definition of a Borel right process. Following the arguments in Remark 2.2.8 and Lemma 2.2.9, we get Lemma 3.2.5 (Blumenthal Zero–One Law) Let A ∈ F0 . Then for any x ∈ S∆ , P x (A) is either zero or one. We now discuss an important class of stopping times. For A ⊆ S we define the first hitting time of A as TA = inf{s > 0 | Xs ∈ A}.
(3.32)
Theorem 3.2.6 Let X = (Ω, G, Gt , Xt , θt , P x ) be a Borel right process. Then TA is an Ft stopping time for every Borel set A ⊆ S∆ .
3.2 The strong Markov property Proof
71
Since Ft+ = Ft , it suffices to show that for each fixed t > 0, {TA < t} ∈ Ft .
(3.33)
By the right continuity of Xs , we see that for any s ≤ t, n
X(s, ω) = lim
n→∞
2
1[(j−1)t/2n ,jt/2n ) (s)X(jt/2n , ω) + 1{t} (s)X(t, ω).
j=1
Hence X : ([0, t] × Ω, B([0, t]) × Ft ) → (S∆ , B(S∆ )) and therefore we have ([0, t) × Ω) ∩ X −1 (A) ∈ B([0, t]) × Ft . Let π denote the projection from R1 × Ω to Ω. Then, by the Projection Theorem (Theorem 14.3.1), for each t ≥ 0, π(B([0, t]) × Ft ) ⊆ Ft . Consequently, π{([0, t) × Ω) ∩ X −1 (A)} ∈ Ft .
(3.34)
DA = inf{s ≥ 0 | Xs ∈ A}.
(3.35)
{DA < t} ∈ Ft .
(3.36)
Set
We claim that
To see this, observe that {DA < t} = {ω | inf{s ≥ 0 | Xs (ω) ∈ A} < t} = π{([0, t) × Ω) ∩ X
−1
(3.37)
(A)}.
Therefore (3.36) follows from (3.34). Next note that {s + DA ◦ θs < t}, {TA < t} =
(3.38)
s>0
so the proof of (3.33) reduces to showing that for each s > 0 {s + DA ◦ θs < t} ∈ Ft .
(3.39)
We observe that {s + DA ◦ θs < t} = {DA ◦ θs < t − s} = θs−1 {DA < t − s}.
(3.40)
Therefore, by (3.36), to obtain (3.39) we need only show that θs−1 Ft−s ⊆ Ft .
(3.41)
Finally, considering how the augmented filtration is constructed, to obtain (3.41) it suffices to show that, for each B ∈ Ft−s and µ ∈ M, θs−1 B ∈ Ftµ . Let ν ∈ M be the measure defined by ν(C) = P µ (Xs ∈ C) for C ∈ B.
72
Markov processes and local times
0 By the definition of Ft−s , for each B ∈ Ft−s , we can find B ∈ Ft−s with ν P (B∆B ) = 0. Using the simple Markov property, as in (2.84), this is 0 ⊆ Ft0 , and equivalent to P µ (θs−1 B∆θs−1 B ) = 0. But clearly θs−1 Ft−s µ −1 therefore θs B ∈ Ft . This implies (3.41), which completes the proof of Theorem 3.2.6.
Set ζ := inf{t > 0 | Xt = ∆}. ζ is referred to, poetically, as the “death time” of X. Lemma 3.2.7 Let X = (Ω, G, Gt , Xt , θt , P x ) be a Borel right process. Then, conditional on ζ < ∞, Xt = ∆
for all t ≥ ζ
P x a.s.
(3.42)
Proof By the right continuity of Xt we have (3.43) P x (Xζ+t = ∆, ∀ t ≥ 0; ζ < ∞) x n = lim P Xζ+i/2n = ∆, i = 0, 1, . . . , n2 ; ζ < ∞ . n→∞
Since ζ is a stopping time, by the strong Markov property, P x Xζ+i/2n = ∆, i = 0, 1, . . . , n2n ; ζ < ∞ (3.44) x ∆ n = E I{ζ<∞} P Xi/2n = ∆, i = 0, 1, . . . , n2 = P x (ζ < ∞) , where we use the fact that P ∆ Xi/2n = ∆, i = 0, 1, . . . , n2n = 1,
(3.45)
which follows from (3.4) and (3.10). Remark 3.2.8 By removing a set of measure 0 from Ω we can assume that (3.42) holds for every sample path, that is, that Xt = ∆ on {t ≥ ζ} ∩ {ζ < ∞}. Let Ω be the space of right continuous S∆ –valued functions {ω (t), t ∈ [0, ∞)} such that ω (t) = ∆ for all {t ≥ ζ } ∩ {ζ < ∞}, where ζ = inf{s > 0 | ω (s) = ∆}. It is sometimes useful to work with a Borel right process with the canonical sample space Ω . Let Xt be the nat0 0 ural evaluation Xt (ω ) = ω (t), and let F and F t be the σ-algebras generated by {Xs , s ∈ [0, ∞)} and {Xs , s ∈ [0, t]}, respectively. The : (Ω, F 0 ) → (Ω , F 0 ) defined by X(ω)(t) := Xt (ω) induces a map X 0 map of the probabilities P x on (Ω, F 0 ) to probabilities P x on (Ω , F ).
3.3 Strongly symmetric Borel right processes
73
Augmenting the σ-algebras F and F t , as in Remark 3.2.2, we obtain a Borel right process X = (Ω , F , Ft , Xt , θt , P x ) with transition semigroup {Pt ; t ≥ 0}. X has the property that Xt = ∆ for all {t ≥ ζ } ∩ {ζ < ∞}. 0
0
3.3 Strongly symmetric Borel right processes Let X = (Ω, G, Gt , Xt , θt , P x ) be a right continuous simple Markov process with state space (S∆ , B(S∆ )). Let m be a σ-finite measure on (S, B(S)). (Note that m is a measure on S, not S∆ .) We say that X has α-potential densities with respect to m if, restricted to S, U α (x, ·) is absolutely continuous with respect to m,
∀x ∈ S. (3.46)
In this case we refer to m as a reference measure for X. Let uα (x, y), x, y ∈ S, denote a density for U α (x, ·) with respect to m. When α = 0 we often write u(x, y) instead of u0 (x, y). Lemma 3.3.1 Let X be a right continuous simple Markov process. If X has α-potential densities with respect to m for some α ≥ 0, then X has β-potential densities with respect to m for all β > 0, and for any x ∈ S and α, β > 0 α β u (x, y) − u (x, y) = (β − α) uβ (x, z)uα (z, y) dm(z) (3.47) for almost all y ∈ S, with respect to m. This also holds with β = 0 if U 0 (x, ·) is a σ-finite measure. Furthermore, U α f (x) = uα (x, z)f (z) dm(z) (3.48) for all f ∈ B(S), whenever either side exists. Proof Suppose that (3.46) holds for some α ≥ 0. Let A ∈ B(S) have m measure zero. Then U α IA ≡ 0 so U β U α IA (x) = 0. Therefore, by the resolvent equation (3.18), U β IA (x) = 0, that is, (3.46) holds with α replaced by β. Note that since U α ( · , A) is measurable with respect to B(S), we can choose uα (x, y) to be measurable with respect to B(S) × B(S) (see Dellacherie and Meyer (1980, Theorem V.58)). For any set A ∈ B(S), (3.18) gives us U α (x, A) − U β (x, A)
(3.49)
Markov processes and local times β = (β − α) u (x, z) uα (z, y) dm(y) m(dz) A β α u (x, z) u (z, y) dm(z) m(dy), = (β − α)
74
A
which gives (3.47). The rest of the proof is obvious. Theorem 3.3.2 (Kac’s Moment Formula: II) Let X = (Ω, G, Gt , Xt , θt , P x ) be a right continuous simple Markov process with 0-potential densities u(x, y) with respect to some reference measure m. Then, for any fi ∈ Bb+ (S), i = 1, . . . , n, n ∞ Ex fi (Xs ) ds (3.50) 0
i=1
=
u(x, y1 ) · · · u(yn−1 , yn )
π
n
fπi (yi ) dm(yi ),
i=1
where the sum runs over all permutations π of {1, . . . , n}. In particular, ∞ n Ex (3.51) f (Xs ) ds 0
u(x, y1 ) · · · u(yn−1 , yn )
= n!
n
f (yi ) dm(yi ).
i=1
Proof We have n Ex
fi (Xs ) ds
0
i=1
E
= n R+
=
π
∞
x
n
(3.52)
fi (Xsi )
i=1
n
dsi
i=1
Ex
{0≤s1 ≤...≤sn <∞}
n
fπi (Xsi )
i=1
n
dsi .
i=1
Using (3.10), the left-hand side of (3.52) can be written as n Pt1 (x, dy1 ) Pti −ti−1 (yi−1 , dyi ) {0≤s1 ≤...≤sn <∞}
π
n
=
π
{0≤s1 ≤...≤sn−1 <∞}
(3.53)
i=2
i=1
Pt1 (x, dy1 )
n−1 i=2
fπi (yi )
n
dsi
i=1
Pti −ti−1 (yi−1 , dyi )
3.3 Strongly symmetric Borel right processes ∞ n−1 n−1 fπi (yi ) dsi Ps (yn−1 , dyn )fπn (yn ) ds. i=1
i=1
75
0
By (3.14), ∞ Ps (yn−1 , dyn )fπn (yn ) ds = u(yn−1 , yn )fπn (yn ) dm(yn ). (3.54) 0
Iterating this procedure gives (3.50). Equation (3.51) is an obvious special case of (3.50). We say that X is symmetric with respect to m when its associated semigroup {Pt ; t ≥ 0} is symmetric with respect to m, that is, if (3.55) g(x) Pt f (x) dm(x) = Pt g(x) f (x) dm(x) for all nonnegative functions f, g ∈ B(S). By the Cauchy-Schwarz inequality, |Pt f (x)|2 = |E x (f (Xt ))|2 ≤ E x (|f (Xt )|2 ) = Pt |f |2 (x).
(3.56)
Therefore, by (3.55) and (3.56), Pt |f |2 (x) dm(x) (3.57) |Pt f (x)|2 dm(x) ≤ = |f (x)|2 Pt 1(x)dm(x) ≤ |f (x)|2 dm(x), where 1(x) is the constant function that takes the value 1. This shows that Pt can be extended to a contraction on L2 (m). With this extension (3.55) holds for all f, g ∈ L2 (m). When (3.55) holds, we also have that g(x) U α f (x) dm(x) = U α g(x) f (x) dm(x) (3.58) for all α > 0. (Let ( · , · ) be the canonical inner product on L2 (m). Equations (3.55) and (3.58) are simply the statements that (g, Pt f ) = (Pt g, f ) and (g, U α f ) = (U α g, f ), respectively.) We say that X is strongly symmetric with respect to m when X is symmetric with respect to m and has α-potential densities with respect to m for some (and consequently all) α > 0. In this case, by (3.58), uα (x, y) = uα (y, x)
for m a.e. x, y ∈ S and α > 0.
(3.59)
Therefore, when uα (x, y) is continuous on S × S, (3.59) holds for all x, y ∈ S.
Markov processes and local times
76
Lemma 3.3.3 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric right continuous simple Markov process with semigroup {Pt ; t ≥ 0} and continuous α-potential densities uα (x, y), α > 0, with respect to some reference measure m. Then uα (x, y) is positive definite. This also holds for α = 0 when u(x, y) in continuous on S. This elementary lemma plays a crucial role in in this book. It enables us to “associate” with the strongly symmetric Borel right process X a mean zero Gaussian process with covariance uα (x, y) which we refer to as an associated Gaussian process (see Remark 5.1.7). Proof Using the semigroup property and then symmetry, (3.55), we see that uα (x, y)f (x)f (y) dm(x) dm(y) (3.60) = f (x)U α f (x) dm(x) ∞ = f (x)e−αt Pt f (x) dm(x) dt 0 ∞ f (x)e−αt Pt/2 Pt/2 f (x) dm(x) dt = 0 ∞ e−αt |Pt/2 f (x)|2 dm(x) dt ≥ 0. = 0
Let f,z (x) be an approximate δ-function at z. For any ai ∈ R1 and n zi ∈ S, i = 1, . . . , n, set f (x) = i=1 ai f,zi (x). Then, by (3.60), n ai aj (3.61) uα (x, y)f,zi (x)f,zj (y) dm(x) dm(y) ≥ 0. i,j=1
Taking the limit as goes to zero we get n
ai aj uα (zi , zj ) ≥ 0.
(3.62)
i,j=1
Thus uα (x, y) is positive definite. We say that the semigroup {Pt ; t ≥ 0} has regular transition densities with respect to m when there exists a family of nonnegative functions {pt (x, y) ; (t, x, y) ∈ R+ × S × S} measurable with respect to B(R+ ) × B(S) × B(S) such that (3.63) Pt f (x) = pt (x, z)f (z) dm(z)
3.3 Strongly symmetric Borel right processes
77
for all t > 0 and all bounded B(S) measurable functions f , and pt+s (x, y) = ps (x, z)pt (z, y) dm(z) (3.64) for all s, t > 0. Note that the conditions on pt (x, y) apply only to x, y ∈ S; nothing is implied about ∆. Equation (3.64) is known as the Chapman–Kolmogorov equation, a special case of which is given in (2.4). A set of functions {pt (x, y) ; (t, x, y) ∈ R+ × S × S}, measurable with respect to B(R+ ) × B(S) × B(S), that satisfy (3.64) determines a semigroup, that is, (3.2) holds. When regular transition densities exist it follows from (3.14) that the α-potential densities exist and are given by ∞ e−αt pt (x, z) dt. (3.65) uα (x, z) = 0
The next lemma presents the resolvent equations for potential densities. Lemma 3.3.4 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric right continuous simple Markov process with semigroup {Pt ; t ≥ 0}, which has regular transition densities with respect to some reference measure m. Then, for α, β > 0, β α u (x, y) = u (x, y) + (α − β) uβ (x, z)uα (z, y) dm(z) (3.66) = uα (x, y) + (α − β) uα (x, z)uβ (z, y) dm(z) for all x, y ∈ S. This also holds for α = 0 when u(x, y) < ∞ for all x, y ∈ S. Proof We use (3.64) and (3.65) and mimic the proof of Lemma 3.1.3, with obvious modifications. We say that the family {pt (x, y) ; (t, x, y) ∈ R+ × S × S} of regular transition densities is symmetric when pt (x, y) = pt (y, x) for all (t, x, y) ∈ R+ × S × S. By (3.65), the symmetry of the transition densities pt (x, z) implies that of the potential densities uα (x, z). Remark 3.3.5 It follows from the above paragraph that a simple Markov process with a semigroup that has symmetric regular transition densities is strongly symmetric. In particular, this holds for a Borel right
Markov processes and local times
78
process. Conversely, Wittman (1986) showed that every strongly symmetric Borel right process has symmetric regular transition densities. We use this result from this point on and refer the reader to Wittman (1986) for the proof. To get around the lack of proof of this result, one can often prove that a particular Borel right process that is being considered has symmetric regular transition densities. In fact, in many cases, one of the steps in the proof that a particular process is a strongly symmetric Borel right process is to show that it has symmetric regular transition densities (see Section 4.2). Using the fact that a Borel right process has symmetric regular transition densities simplifies things considerably. Now we can simply define the α-potential densities by (3.65). Then the 0-potential density is also well defined as limα→0 uα (x, y), although it may be infinite. (Note that, by (3.65), uα (x, y) is increasing as α ↓ 0.) In the next lemma we show that when symmetric regular transition densities exist, it is not necessary to require that the α-potential densities uα (x, y) are continuous to prove that they are positive definite. Lemma 3.3.6 Let {pt (x, y) ; (t, x, y) ∈ R+ × S × S} be a family of symmetric regular transition densities with respect to some reference measure m. Then, for each t > 0, pt (x, y) is positive definite. Hence, whenever uα (x, y) exists, for any α ≥ 0, it is positive definite. Proof Using the Chapman–Kolmogorov equation (3.64) and then symmetry we get n n ai aj pt (xi , xj ) = ai aj pt/2 (xi , z)pt/2 (z, xj ) dm(z) i,j=1
i,j=1
=
n 2 ai pt/2 (xi , z) dm(z) ≥ 0.
(3.67)
i=1
Thus pt (x, y) is positive definite. Multiplying each side by e−αt and integrating completes the proof.
3.4 Continuous potential densities Lemma 3.4.1 Let X = (Ω, G, Gt , Xt , θt , P x ) be a right continuous simple Markov process with α-potential densities uα (x, y). If uα (x, y) is
3.4 Continuous potential densities
79
continuous on S × S for some α > 0, then U α f (x) is continuous on S for any f ∈ Cb (S∆ ). Proof Since uα (x, y) is continuous on S×S, U α f (x) is continuous on S for any continuous function f with compact support in S. Such functions are dense in C0 (S∆ ) in the uniform topology, where C0 (S∆ ) is the set of continuous functions on S∆ that vanish at ∆. Since U α 1(x) = 1/α for all x, we see that if fn → f uniformly, then U α fn (x) → U α f (x) uniformly. It follows that U α f (x) is continuous on S for any f ∈ C0 (S∆ ). Let f ∈ Cb (S∆ ) and let f∆ denote the constant function with value f (∆). Then f = f∆ + (f − f∆ ) with (f − f∆ ) ∈ C0 (S∆ ). The lemma now follows since U α (f − f∆ ) is continuous and U α f∆ (x) = f (∆)/α. Lemma 3.4.2 Let X = (Ω, G, Gt , Xt , θt , P x ) be a right continuous simple Markov process with Xt = ∆ for all t ≥ ζ. If X has continuous α-potential densities on S×S for each α > 0, then (Ω, F, Ft , Xt , θt , P x ) is a Borel right process. Proof By Lemma 3.4.1, U α f (Xt ) is right continuous in t < ζ for any f ∈ Cb (S∆ ) and α > 0. This also holds for all t ≥ ζ by our assumption that Xt = ∆ for all t ≥ ζ. This shows that X satisfies the first two conditions for a Borel right process. It then follows from Lemma 3.2.3 that (Ω, F, Ft , Xt , θt , P x ) is a Borel right process. We note the following remarkable properties of uα (x, y) (to appreciate our use of the word “remarkable,” see Remark 5.1.7). Lemma 3.4.3 Let X be a Borel right process with α-potential density uα (x, y). Assume that uα (x, y) is continuous on S × S for some α ≥ 0. Then uα (x, y) ≤ uα (y, y)
∀x, y ∈ S
(3.68)
and uα (x, y) is continuous in y uniformly in x ∈ S. Furthermore, for any compact K ⊆ S, sup
x∈S; y,y ∈K
|uα (x, y) − uα (x, y )| ≤
sup
x,y,y ∈K
|uα (x, y) − uα (x, y )|. (3.69)
Proof Fix y0 ∈ S and some open neighborhood G of y0 that has compact closure in S. If f is any bounded measurable function supported in G, then by (3.14) and the strong Markov property we have (3.70) uα (x, y) f (y) dm(y)
Markov processes and local times ∞ = Ex e−αs f (Xs ) ds 0 ∞ = Ex e−αs f (Xs ) ds I{TG <∞} T G ∞ = E x e−αTG I{TG <∞} e−αs f (XTG +s ) ds 0 ∞ e−αs f (Xs ) ds = E x e−αTG I{TG <∞} E XTG 0 = E x e−αTG I{TG <∞} uα (XTG , y) f (y) dm(y) .
80
Let {fn } be a sequence of approximate identities converging to δy0 . Using (3.70) with f = fn and taking the limit as n → ∞ we obtain uα (x, y0 ) = E x e−αTG I{TG <∞} uα (XTG , y0 ) . (3.71) Here we use the fact that uα (x, y) is uniformly continuous on G × G. Since (3.71) holds for all such neighborhoods G of y0 , we get (3.68). It follows from (3.71) that, for y, y ∈ G, |uα (x, y) − uα (x, y )| ≤ E x e−αTG |uα (XTG , y) − uα (XTG , y )| ≤
sup |uα (x, y) − uα (x, y )|,
(3.72)
¯ x∈G
which gives sup x∈S; y,y ∈G
|uα (x, y) − uα (x, y )| ≤
sup
|uα (x, y) − uα (x, y )|. (3.73)
x,y,y ∈G
The fact that uα (x, y) is continuous in y uniformly in x ∈ S follows since by hypothesis uα (x, y) is uniformly continuous on G × G. Finally, (3.69) follows from (3.73) since we can find a sequence Gn of open sets with compact closure such that K = ∩n Gn . Remark 3.4.4 Let X be a strongly symmetric Borel right process with α-potential density uα (x, y). An immediate consequence of Lemma 3.4.3 is that if uα (x, y) is continuous on S × S for some α ≥ 0, then it is also continuous for any other α > 0. This follows from the resolvent equation (3.66). Remark 3.4.5 Under certain conditions that often arise in practice, we can establish the conclusion of the last remark without assuming the existence of regular transition densities. Thus, assume that X is a strongly symmetric Borel right process with uα (x, y) continuous on
3.5 Killing a process at an exponential time
81
S × S for some α ≥ 0 and in addition that U β : Cb (S) → Cb (S) for some β = α. It follows from (3.47) and Lemma 3.4.3 that for each fixed x we can choose the density uβ (x, y) to be continuous in y, and this continuity is uniform in x ∈ S. With this choice, (3.47) holds for all y. Note once again from Lemma 3.4.3 that for fixed y ∈ S, hy (z) := uα (z, y) is bounded and continuous in z, and (3.47), which now holds for all y, can be written as uα (x, y) − uβ (x, y) = (β − α)U β hy (x). This shows that for each fixed y, uβ (x, y) is continuous in x. Together with what we have just shown, this implies that uβ (x, y) is continuous on S × S, and then from (3.59) that it is symmetric. We have a further consequence of the assumption that uα (x, y) is continuous on S × S: Lemma 3.4.6 Let X be a strongly symmetric Borel right process with α-potential density uα (x, y). If uα (x, y) is continuous on S × S for some α ≥ 0, then uα (y, y) > 0
∀ y ∈ S.
(3.74)
Proof Fix y0 ∈ S. If uα (y0 , y0 ) = 0, then by (3.68) and symmetry we would have that uα (y0 , x) = 0 , ∀x ∈ S. Now, since α uα (y0 , x) dm(x), (3.75) U (y0 , S) = S
it follows that U (y0 , S) = 0. This implies that P y0 (Xt ∈ S) = Pt (y0 , S) = 0 for almost every t. This contradicts the fact that P y0 (X0 = y0 ) = 1 and Xt is right continuous. α
3.5 Killing a process at an exponential time Let X = (Ω, G, Gt , Xt , θt , P x ) be a Borel right process with transition semigroup {Pt ; t ≥ 0}. For any α > 0 we construct a Borel right process on (S, B(S)) with transition semigroup {e−αt Pt ; t ≥ 0} . This new process is said to be obtained from the original process by killing it at an independent exponential time with mean 1/α. The new process is useful for the following reason. Suppose that the original process is strongly symmetric with continuous α-potential density uα (x, y) for some α > 0 but that its 0-potential density is infinite. The exponentially killed process we construct is strongly symmetric with continuous 0-potential density (which is the α-potential density of the original process). We use this approach in the next section in the construction of local times.
Markov processes and local times
82
= Ω × R+ with elements ω Let Ω = (ω, s). Define the family of measures x ∈ S∆ (3.76) Px = P x × αe−αs ds G × B(R+ )). Define λ(ω, s) = s. λ is an exponential random on (Ω, variable with mean 1/α under each of the measures Px . Let t<s Xt (ω) (3.77) Xt (ω, s) = ∆ t ≥ s. ∈ G such that Let G = G × B(R+ ). We define Gt to consist of all sets A A ∩ (Ω × (t, ∞)) = A × (t, ∞) for some A ∈ Gt . It is easy to check that Gt is an increasing family of σ-algebras. We introduce shift operators on r ◦ θt = X r+t . given by θt (ω, s) = (θt ω, (s − t) ∨ 0) and verify that X Ω x (f (X t )) and note that, for f ∈ Bb (S), Let Pt f (x) := E t x f (Xt ) 1{t<λ} = e−αt Pt f (x). (3.78) x f X =E Pt f (x) = E ∈ Gt , we have Then when A r+t ) 1 x f (X E A
x f (Xr+t )1{r+t<λ} 1A = E −α(r+t)
= e
(3.79)
x
E (f (Xr+t )1A )
= e−α(r+t) E x (Pr f (Xt )1A ) = e−αt E x Pr f (Xt )1A t )1 . x Pr f (X = E A Consequently,
x f (X r+t )|Gt = Pr f (X t ). E
(3.80)
=Ω one can see that {Pt ; t ≥ 0} is a semiAlso, using (3.79) with A group. Thus we see that X is a simple Markov process with respect to {Gt ; t ≥ 0} with transition semigroup {Pt ; t ≥ 0}. t is right continuous. Also, by (3.78), It is clear that the map t → X ∞ β f (x) := U Pt f (x) dt = U α+β f (x). (3.81) 0
Suppose that X is strongly symmetric with continuous α-potential density uα (x, y) for some α > 0. Then, by Remark 3.4.4 and Lemma 3.4.1, β f (x) is continuous on S for each f ∈ Cb (S∆ ) and β > 0. Thus U t ) is right continuous in t < ζ for any f ∈ Cb (S∆ ) and β > 0. β f (X U t = ∆ for all t ≥ ζ. Hence X But this also holds for all t ≥ ζ since X satisfies the second condition for Borel right processes.
3.6 Local times
83
satisfies the first two conditions that We have now established that X define a Borel right process. Then, as discussed in Remark 3.2.2, we augment the σ-algebras Gt to obtain a Borel right process, which we t , θt , Px ). = (Ω, G, Gt , X again denote by X
3.6 Local times Let X = (Ω, G, Gt , Xt , θt , P x ) be a Borel right process. A family A = {At ; t ≥ 0} of random variables on (Ω, G, Gt ) is called a continuous additive functional (CAF) of X if (1) t → At is almost surely continuous and nondecreasing, with A0 = 0 and At = Aζ , for all t ≥ ζ. (2) At is Ft measurable. (3) At+s = At + As ◦ θt for all s, t ∈ R+ a.s. This is just a slight generalization of the definition of a CAF of Brownian motion on page 32 that incorporates the lifetime of X. Let {At ; t ≥ 0} be a continuous additive functional of X, and let RA (ω) = inf{t | At (ω) > 0}.
(3.82)
We call {At ; t ≥ 0} a local time of X at y ∈ S if P y (RA = 0) = 1 and, for all x = y, P x (RA = 0) = 0. In the next theorem we give general conditions for the existence of local times. However, we first assume that {At ; t ≥ 0} is a local time of X at y and derive some important consequences. The first of these is that if X starts at y at time zero, it hits y infinitely often by time t for all t > 0, almost surely. Lemma 3.6.1 Let X be a Borel right process that has a local time at y. Then P y (Ty = 0) = 1.
(3.83)
Proof Let {At ; t ≥ 0} denote a local time of X at y. Since P y (RA = 0) = 1, P y (Ty > 0) = P y (RA = 0, Ty > 0).
(3.84)
If Ty > 0, then for some interval (0, t0 ] we have Xt = y for all 0 < t ≤ t0 . Let RA,r = r + RA ◦ θr . Then, since t → At is almost surely
84
Markov processes and local times
continuous and nondecreasing, with A0 = 0 and At > 0 for t > 0 (since {At , t ≥ 0} is local time), {XRA,r = y} (3.85) {RA = 0, Ty > 0} ⊆ r>0
rational
except, possibly, for a set of measure zero. Also, by definition, RA ◦ θRA,r = 0. Therefore, P y (RA = 0, Ty > 0) ≤ P y ( {RA ◦θRA,r = 0, XRA,r = y}). (3.86) r>0
rational
Clearly, for any r > 0, P y (RA ◦ θRA,r = 0, XRA,r = y) = E y P XRA,r (RA = 0)I{XRA,r =y} = 0
(3.87)
since P x (RA = 0) = 0 when x = y. Putting these equations together gives (3.83). Remark 3.6.2 By definition, a CAF {At ; t ≥ 0} of a Markov process is nondecreasing and consequently determines a (random) measure dAt on R1 . We now show that when {At ; t ≥ 0} is a local time of a Borel right process X at y, dAt is supported on Z := {t : Xt = y}, almost surely. Set I = {t : At+ − At > 0 for all > 0}.
(3.88)
We first show that I ⊆ Z, almost surely. This is because, by the right continuity of Xt , {XRA,r = y}, (3.89) {I ∩ Z c = ∅} ⊆ r>0
rational
and, arguing as in (3.86) and (3.87), this has P x measure zero for any x ∈ S. We now note that dAt is supported on J , where J = {t : At+ − At− > 0 for all > 0}.
(3.90)
It is also easy to check that J − I is countable. However, since t → At is continuous, dAt has no atoms. Therefore dAt is supported on I ⊆ Z, almost surely. We now show that when a Borel right process is strongly symmetric with continuous α-potential densities it has local times at all points in its state space.
3.6 Local times
85
Theorem 3.6.3 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential density uα (x, y). For each y ∈ S we can construct a local time of X at y, which we denote by {Lyt ; t ≥ 0}, such that, for each x ∈ S, ∞ Ex e−αt dLyt = uα (x, y). (3.91) 0
Let f,y be an approximate δ-function at y with respect to the reference measure m. There exists a sequence {n } tending to zero, such that almost surely t y fn ,y (Xs ) ds (3.92) Lt = lim n →0
0
uniformly for t ∈ [0, T ], where T is any finite time, which may be random. Proof Consider first the case of α > 0. Let λ be an exponential = random variable with mean 1/α that is independent of X. Let X x (Ω, G, Gt , Xt , θt , P ) be the Borel right process obtained by killing X at λ as described in Section 3.5. Note that the 0-potential density for is given by uα (x, y). Kac’s moment formula (3.50) shows that X ∞ ∞ x I, (x) := E f,y (Xs ) ds f ,y (Xt ) dt (3.93) 0 0 = uα (x, z1 )uα (z1 , z2 )f,y (z1 )f ,y (z2 ) dm(z1 ) dm(z2 ) + uα (x, z1 )uα (z1 , z2 )f ,y (z1 )f,y (z2 ) dm(z1 ) dm(z2 ). Since inf
z,z ∈K∨
uα (x, z)uα (z, z ) ≤ I, (x) ≤
sup
z,z ∈K∨
uα (x, z)uα (z, z )
(3.94) ∞ s ) ds converges in and uα (x, y) is continuous, we see that 0 f,y (X L2 (Px ) as → 0. (K · is part of the definition of approximate δ-function on page 8.) In fact, using the uniform continuity of uα (x, y) described in Lemma 3.4.3, we see that this convergence is uniform in x ∈ S. Define the right continuous X-martingale ∞ x Mt = E (3.95) f,y (Xs ) ds|Gt = 0
0 t
s ) ds + f,y (X
t , z)f,y (z) dm(z), uα (X
Markov processes and local times
86
where for the last equation we use (3.50) again. Note that the right continuity of Mt comes from the continuity of uα . It follows from Doob’s L2 inequality (see, ∞e.g., Rogers and Williams (2000b, II.70.2)) and the s ) ds that fact that M∞ = 0 f,y (X E x sup |Mt − Mt |2 (3.96) t≥0
≤ 4E x
∞
s ) ds − f,y (X
0
∞
2 f ,y (Xs ) ds .
0
Therefore, Mt converges in L2 (Px ) uniformly in t ∈ R+ and x ∈ S. Using the bounds and uniform continuity of uα (x, y) described in Lemma 3.4.3, we see that the last term in (3.95) also converges in L2 (Px ) uniformly in t ∈ R+ and x ∈ S. Consequently, we can find a sequence n → 0 such that t∧λ t s ) ds = fn ,y (X fn ,y (Xs ) ds (3.97) 0
0
converges uniformly in t ∈ R+ , Px almost surely for all x ∈ S. Since P (λ > v) = e−αv > 0, it follows from Fubini’s Theorem that, for this sequence {n }, the right-hand side of (3.92) converges uniformly for t ∈ [0, v], P x almost surely for all x ∈ S and v > 0. Let t = ω| fn ,y (Xs ) ds converges locally uniformly in t . (3.98) Ω 0
= 1, for all x ∈ S. For ω ∈ Ω we define We have shown that P x (Ω) y y Lt by (3.92); otherwise set Lt ≡ 0. It is easy to verify that Lyt is a continuous additive functional of X. ∞ s ) ds, n = 1, 2, . . . It follows from (3.93) that the sequence 0 fn ,y (X is uniformly integrable. Therefore ∞ x (Ly ) = lim E x E f ( X ) ds = uα (x, y) (3.99) n ,y s λ n→∞
0
by Theorem 3.3.2. Also, clearly x (Ly ) = αE x E λ
∞
e−αt Lyt dt .
(3.100)
0
Note that if the integral in the right-hand side of (3.100) is finite, then lim supt→∞ e−αt Lyt = 0. Suppose to the contrary that there exists an infinite sequence {tn } for which e−αtn Lytn ≥ for some > 0. Then e−αt Lyt ≥ /2 for t ∈ [tn , tn + (1/α) log 2], which contradicts the fact
3.6 Local times
87
that the integral is finite. Using this observation we see that (3.91) follows from (3.99), (3.100), and integration by parts. To complete the proof we must show that {Lyt ; t ≥ 0} is a local time for X at y. To do this we need only to show that R = RLy (see (3.82)) satisfies the conditions stated after (3.82). It follows immediately from (3.92) and the right continuity of X that P x (R = 0) = 0 for all x = y. We now show that P y (R = 0) = 1. Suppose that P y (R = 0) = 0, so that P x (R = 0) = 0 for all x ∈ S. Since, for any z ∈ S, P z (Lyt > 0) ≤ P z (R < t),
(3.101)
we would have lim P z (Lyt > 0) = 0
t→0
∀z ∈ S.
(3.102)
This is not possible. To see this note that it follows from the definition of R that, for any x and t > 0, (3.103) P x (R < ∞) = P x LyR+t > 0 , R < ∞ . It is easy to see that R is a stopping time. Therefore, using the additivity of Ly· and the Markov property, we have (3.104) P x (R < ∞) = E x P XR (Lyt > 0) I{R<∞} . Using (3.102) in (3.103) gives us P x (R < ∞) = 0 for all x, that is, that Ly· ≡ 0 almost surely, which contradicts (3.91). Thus P y (R = 0) > 0, and by the Blumenthal zero-one law (Lemma 3.2.5), P y (R = 0) = 1. When α = 0, (3.91) is E x (Ly∞ ) = u0 (x, y).
(3.105)
The proof of this theorem, in this case, is similar to the above, and in fact simpler, since we do not need to introduce the exponentially killed process. Remark 3.6.4 (1) The version of {Lxt ; t ∈ R+ } constructed in Theorem 3.6.3 is F 0 measurable. (2) It follows from the proof of Theorem 3.6.3 that, for any sequence { n } tending to zero, we can choose a subsequence {n } tending to zero, such that (3.92) holds. (3) It is easy to check that if {Lyt ; t ≥ 0} is a local time for X = (Ω, G, Gt , Xt , θt , P x ) at y ∈ S, then {Lyt∧λ ; t ≥ 0} is a local time = (Ω, G, Gt , X t , θt , Px ), (see Section 3.5). for X
88
Markov processes and local times
Theorem 3.6.5 Let X be a strongly symmetric Borel right process and assume that its α-potential density, uα (x, y), is finite for all x, y ∈ S. Let Lyt be a local time of X at y, with ∞ e−αt dLyt = uα (x, y). (3.106) Ex 0
Then uα (x, y) E x e−αTy = α u (y, y) and for every t
E
x
(Lyt )
(3.107)
t
ps (x, y) ds.
=
(3.108)
0
Furthermore, if u(x, x) and u(y, y) are finite, P x (Ty < ∞) =
u(x, y) . u(y, y)
(3.109)
Proof When x = y, (3.107) is just (3.83). When x = y, (3.107) follows from (3.106), the support property of dLyt , and the strong Markov property as follows: ∞ (3.110) uα (x, y) = E x e−αs dLys 0 ∞
= Ex
1{Ty <∞}
= E
x
−αTy
e
e−αs dLys
Ty
1{Ty <∞}
∞
0
= E x e−αTy 1{Ty <∞} E XTy = E x e−αTy uα (y, y)
−αs
e
◦ θ Ty ∞ e−αs dLys dLys
0
since XTy = y on {Ty < ∞}. We get (3.109) by taking the limit in (3.107) as α goes to zero. To obtain (3.108) we first note that if λ is an independent exponential random variable with mean 1/α, then ∞ ∞ (3.111) α e−αt Lyt dt = Eλx (Lyλ ) = Eλx 1{λ>t} dLyt 0 0 ∞ = Ex e−αt dLyt = uα (x, y). 0
3.6 Local times Therefore α
∞
e−αt E x (Lyt ) dt
0
89
∞
e−αt pt (x, y) dt (3.112) t ∞ = α e−αt ps (x, y) ds dt =
0
0
0
by integration by parts. Here we use the fact that t −αt ps (x, y) ds lim e t→∞ 0 t −αt/2 −αs/2 e ps (x, y) ds ≤ lim e t→∞
0 −αt/2 α/2
≤ lim e t→∞
u
(3.113)
(x, y) = 0.
Equation (3.112) holds as long as uα (x, y) is finite. Since uλ (x, y) is decreasing in λ, (3.112) holds for all λ ≥ α. By the uniqueness property of Laplace transforms, this implies that (3.108) holds for almost every t. Since both sides of (3.108) are continuous in t, it follows that (3.108) holds for every t. Remark 3.6.6 Fix some x, y ∈ S with P x (Ty < ∞) > 0. Equations (3.107) and (3.74) then imply that uα (x, y) > 0 and thus, by symmetry, that uα (y, x) > 0. This, in turn, implies that P y (Tx < ∞) > 0. Theorem 3.6.7 If {At ; t ≥ 0} is a local time of a Borel right process X at a point y and K > 0 is a constant, then {KAt ; t ≥ 0} is also a local time of X at y. Furthermore, if {A1t ; t ≥ 0} and {A2t ; t ≥ 0} are local times of X at y with ∞ i = 1, 2 (3.114) e−αt dAit < ∞, Ey 0
for some α > 0, then A1t = CA2t
∀t ≥ 0
a.s.
(3.115)
for some constant C > 0. The proof of the first statement in this theorem is simple; one just checks that {KAt ; t ≥ 0} satisfies the definition of a local time. To prove the second statement we develop some material on the α-potential operator of a CAF. Let A = {At ; t ≥ 0} be a continuous additive
90
Markov processes and local times
functional. We define its α-potential operator by ∞ UAα f (x) = E x e−αt f (Xt ) dAt
(3.116)
0 α α for f ∈ Bb+ (S). When f ≡ 1 we write uα A (x) = UA 1(x). uA (x) is called the α-potential of A. In particular, by (3.91), α uα Ly (x) = u (x, y).
(3.117)
If the α-potential of A is bounded, then in analogy with the resolvent equation, (3.18), for any α, β > 0, UAα f (x) − UAβ f (x) = (β − α)U α UAβ f (x).
(3.118)
The proof of (3.118) is similar to the proof of (3.18). We have UAα f (x) − UAβ f (x) (3.119) ∞ = Ex (e−αs − e−βs )f (Xs ) dAs 0 ∞ s x −(α−β)t −βs e dt e f (Xs ) dAs = (β − α)E 0 0 ∞ ∞ x −(α−β)t −βs e e f (Xs ) dAs dt = (β − α)E 0 ∞ ∞ t x −αt −βs e e f (Xs ) dAs ◦ θt dt = (β − α)E 0 0 ∞ ∞ x −αt Xt −βs = (β − α)E e E e f (Xs ) dAs dt = (β − α)U
α
0 β UA f (x).
0
Lemma 3.6.8 Let {A1t ; t ≥ 0} and {A2t ; t ≥ 0} be two continuous additive functionals. If A1 , A2 have bounded α-potentials for some α > 0 and UAα1 f (x) = UAα2 f (x) for all bounded measurable functions f and x ∈ S, then A1t = A2t for all t ≥ 0 almost surely. Proof It follows from (3.118) that UAα1 f (x) = UAα2 f (x) for all α > 0. Also, for any i, j = 1, 2 we have ∞ ∞ e−α(s+t) dAis dAjt Ex (3.120) 0 0 ∞ ∞ e−αs e−αt dAjt dAis = Ex 0 s∞ ∞ + Ex e−αt e−αs dAis dAjt 0
t
3.6 Local times ∞ e−2αs e−αt dAjt ◦ θs dAis 0 ∞0 ∞ + Ex e−2αt e−αs dAis ◦ θt dAjt .
= Ex
91
∞
0
0
Let τAi (s) be as defined in (2.142). For any measurable function f (t) we have the following analogue of (2.125): T ∞ f (t) dAit = f (τAi (s))1{τAi (s)
0
This is proved in exactly the same way that (2.125) is proved. Note that {τAi (s) < T } ∈ FτAi (s) . Using first (3.121) and then the strong Markov property we have ∞ ∞ j x −2αs −αt i (3.122) e e dAt ◦ θs dAs E 0 0 ∞ ∞ j x −2ατAi (s) −αt E e e dAt ◦ θτAi (s) 1{τAi (s)<∞} ds = 0 0 ∞ = E x e−2ατAi (s) uα Aj (XτAi (s) ) 1{τAi (s)<∞} ds. 0
Interchanging the order of integration and applying (3.121) again, we see that this ∞ x −2αs α i =E e uAj (Xs ) dAs = UA2αi uα (3.123) Aj (x). 0
Thus, by (3.120), ∞ ∞ 2α α e−α(s+t) dAis dAjt = UA2αi uα Ex Aj (x) + UAj uAi (x). (3.124) 0
0
By hypothesis, the right-hand side of (3.124) is the same for all i, j = 1, 2. Therefore 2 ∞ ∞ Ex = 0, (3.125) e−αs dA1s − e−αs dA2s 0
0
and consequently ∞ e−αs dA1s = 0
0
∞
e−αs dA2s
∀α > 0
a.s.
(3.126)
Since both A1s and A2s are continuous, the lemma follows (see, e.g., Feller (1971, Chapter XII)).
92
Markov processes and local times
Proof of Theorem 3.6.7 Let {At ; t ≥ 0} be a local time of X at y. The support property of dAt shows that ∞ α x −αt UA f (x) = E e f (Xt ) dAt = f (y)uα (3.127) A (x). 0
Furthermore, for any x = y, the support property of dAt also shows that ∞ x −αs uα (3.128) (x) = E e dA s A 0 ∞
= Ex
1{Ty <∞}
= E
x
−αTy
e
e−αs dAs
Ty
1{Ty <∞}
∞
0
= E x e−αTy 1{Ty <∞} E XTy = E x e−αTy uα A (y)
−αs
e
◦ θTy ∞ −αs e dAs dAs
0
since XTy = y on {Ty < ∞}. Let {A1t ; t ≥ 0} and {A2t ; t ≥ 0} be local times for X at y. Let i = A t
Ait uα Ai (y)
i = 1, 2.
(3.129)
It follows from (3.127) and (3.128) that UA1 = UA2 . Thus, by Lemma t t 2t . This gives (3.115). 1t = A 3.6.8, A Remark 3.6.9 Let X be a strongly symmetric Borel right process with local time at y given by Ly = {Lyt ; t ≥ 0} as constructed in Theorem y 3.6.3 and satisfying (3.91). By Theorem 3.6.7, L := {f (y) Lyt ; t ≥ 0} is also a local time of X at y, as long as f (y) > 0. In this case we have ∞ y x −αt E e dLt = f (y)uα (x, y). (3.130) 0
Therefore, when we speak of a local time for X we need to identify the left-hand side of (3.130). We refer to this as the normalization of the local time. Note that by Lemma 3.6.8, if {Lyt ; t ≥ 0} is a local time for X at y for y which (3.91) holds and {Lt ; t ≥ 0} is a local time for X at y for which (3.130) holds, then y
Lt = f (y)Lyt
∀t ≥ 0
a.s.
(3.131)
3.6 Local times
93
3.6.1 Inverse local time Let X be a strongly symmetric Borel right process with continuous αpotential density uα (x, y). Let 0 denote a fixed element in S. By Theorem 3.6.3 we can construct a local time {L0t ; t ≥ 0} for X at 0. Exactly as for Brownian motion in Subsection 2.4.1, we set τ (s) = inf{t > 0 | L0t > s}
(3.132)
with inf ∅ = ∞. We refer to τ ( · ) as the (right continuous) inverse local time (of X), at 0. For each s, τ (s) is a stopping time and τ (s + t) = τ (s) + τ (t) ◦ θτ (s) .
(3.133)
Lemma 3.6.10 For α > 0, E 0 (e−ατ (s) ) = e−s/u
α
(0,0)
.
(3.134)
Proof The proof proceeds exactly like the proof of Lemma 2.4.7 once we note that, for α > 0, E 0 (e−ατ (s) ) = E 0 e−ατ (s) 1{τ (s)<∞} . This allows us to use the strong Markov property as in the proof of Lemma 2.4.7. We now introduce the important concepts of transience and recurrence both for elements in the state space of X and for X itself. Let L := sup{s | Xs = 0}
(3.135)
with sup ∅ = 0. If L = ∞, P 0 almost surely, we say that 0 is recurrent (for X), while if L < ∞, P 0 almost surely, we say that 0 is transient (for X). Recall that L0∞ := limt→∞ L0t . Lemma 3.6.11 (1) If u(0, 0) = ∞, then 0 is recurrent for X (equivalently L = ∞) and L0∞ = ∞, P 0 almost surely. (2) If u(0, 0) < ∞, then 0 is transient for X (equivalently L < ∞) and L0∞ < ∞, P x almost surely, for each x ∈ S. Proof Suppose that u(0, 0) = ∞. Then, by (3.134), for each s < ∞ we have τ (s) < ∞, P 0 almost surely. Now suppose that L(ω) < ∞. Since L0t is continuous (and thus locally bounded) and grows only when Xt = 0, we would have L0L (ω) < ∞. Also, by the definition of inverse local time, τ (L0L + 1)(ω) = ∞. This contradiction proves that L = ∞, P 0 almost surely. The fact that L0∞ = ∞, P 0 almost surely then follows
Markov processes and local times
94
as in the proof of (2.121), where we now use the fact that L = ∞, P 0 almost surely as a substitute for Remark 2.1.5. Assume now that u(0, 0) < ∞. Define the sequence of stopping times T0 = 0 and Tn+1 = inf{t > Tn + 1 | Xt = 0}, n ≥ 0. Then L0∞ ≥
∞
∞ 0 L0Tn +1 − L0Tn 1{Tn <∞} = L1 ◦ θTn 1{Tn <∞} . (3.136)
n=0
n=0
By (3.91) and the strong Markov property, u(0, 0) ≥
∞
∞ E 0 1{Tn <∞} E 0 (L01 ) = E 0 (L01 ) P 0 (Tn < ∞). (3.137)
n=0
n=0
By the definition of local time L01 > 0, P 0 almost surely. Thus, obviously, E 0 (L01 ) > 0 and we have limn→∞ P 0 (Tn < ∞) = 0. Since the Tn are nondecreasing this implies that P 0 (∩∞ n=0 {Tn < ∞}) = 0, which in turn implies that L < ∞, P 0 almost surely. Since E 0 (L0∞ ) = u(0, 0) < ∞, we also get L0∞ < ∞, P 0 almost surely. Thus we get (2) for x = 0. To prove that L < ∞, P x almost surely for a general x ∈ S, note that {L > 0} = {T0 < ∞} and use the strong Markov property at T0 . To prove that L0∞ < ∞, P x almost surely for a general x ∈ S, note that E x (L0∞ ) = u(x, 0) ≤ u(0, 0) < ∞ using (3.68). Lemma 3.6.12 If P x (Ty < ∞) = 1 and P y (Tx < ∞) = 1, then x and y are recurrent. Proof As in (3.136), we define the sequence of stopping times T0 = 0, T1 = Ty + Tx ◦ θTy , and Tn+1 = Tn + T1 ◦ θTn , n = 1, 2, . . .. Using the monotonicity and additivity of Lxt , we see that Lx∞ ≥
∞
∞ x LxTn +1 − LxTn = LT1 ◦ θTn .
n=0
(3.138)
n=0
By the hypothesis and the strong Markov property, P x (T1 < ∞) = P x (Ty < ∞, Tx ◦ θTy < ∞) = 1,
(3.139)
and consequently Tn < ∞, P x almost surely for all n ≥ 1. Therefore, by the strong Markov property, u(x, x) = E x (Lx∞ ) ≥
∞ n=0
∞ E x LxT1 ◦ θTn = E x LxT1 .
(3.140)
n=0
of local Since T1 > 0, P x almost surely, it follows from the definition time that LxT1 > 0, P x almost surely and therefore E x LxT1 > 0. Thus,
3.6 Local times
95
by (3.140), we have that u(x, x) = ∞. It then follows from Lemma 3.6.11 that x is recurrent. The recurrence of y follows similarly. We say that the stochastic process X is recurrent (transient) if all the elements x ∈ S are recurrent (transient). The next lemma shows that if one element of the state space S is recurrent for X, then all the elements of S are recurrent for X, that is, that X is a recurrent process, and that X hits all the elements of S infinitely often. Lemma 3.6.13 If u(0, 0) = ∞ and P x (T0 < ∞) > 0 for all x ∈ S, then u(x, x) = ∞ and P x (Ty < ∞) = 1 for all x, y ∈ S, that is, X is a recurrent process. Proof
Fix some x ∈ S. Since lim E x (e−αT0 ) = P x (T0 < ∞),
α→0
(3.141)
we see from (3.107) that when P x (T0 < ∞) > 0 and limα→0 uα (0, 0) = u(0, 0) = ∞, then limα→0 uα (x, 0) = u(x, 0) = ∞. Since, by (3.68), uα (x, 0) ≤ uα (x, x), we see that limα→0 uα (x, x) = u(x, x) = ∞. We now show that P x (T0 < ∞) = 1. Since P x (T0 < ∞) > 0, we can find some s < ∞ for which p := P x (T0 ≤ s) > 0. Define the sequence of stopping times T0 = 0 and Tn+1 = inf{t > Tn + s | Xt = x}, n ≥ 0. Let Lx = sup{t | Xt = x} with sup ∅ = 0. By Lemma 3.6.11 we have Lx = ∞ almost surely, so that Tn < ∞ almost surely. By the strong Markov property, P x (T0 = ∞) ≤ P x (T0 > Tn ) n = Ex 1{T0 >Tj }
(3.142)
j=1
≤ (1 − p)n . Letting n → ∞, we see that P x (T0 < ∞) = 1. By Remark 3.6.6 and the assumption that P x (T0 < ∞) > 0, we have P 0 (Tx < ∞) > 0 and hence for any y ∈ S, P y (Tx < ∞) ≥ P y (Tx ◦ θT0 < ∞, T0 < ∞) = P y (T0 < ∞)P 0 (Tx < ∞) > 0. Using Remark 3.6.6 again we see that P x (Ty < ∞) > 0. Finally we note that 0 is just a generic point in S. We could just as well have taken y. Since, by what we have shown already, u(y, y) = ∞ and P x (Ty < ∞) > 0, our proof shows that P x (Ty < ∞) = 1.
Markov processes and local times
96
Remark 3.6.14 When L(ω) = 0, L(ω) is a left-limit point of zeros of Xt (ω). This is clear if L(ω) = ∞ almost surely. If L(ω) = ∞, then, by Lemma 3.6.11, L(ω) < ∞ almost surely. Suppose that L(ω) < ∞ almost surely and L(ω) is not a left-limit point of the zeros of Xt (ω). Then for some rational number r, L(ω) = r + T0 ◦ θr (ω). Note that T (r) := r + T0 ◦ θr is a stopping time and XT (r) = 0 on T (r) < ∞. Therefore, by the strong Markov property and (3.83), we have that, for any x ∈ S, P x (T (r) < ∞ , T0 ◦ θT (r) = ∞) = P x (T (r) < ∞)P 0 (T0 = ∞) = 0. (3.143) Since the event {ω |L(ω) < ∞, and L(ω) is not a left-limit point of the zeros of Xt (ω)} # is contained in r∈Q {T (r) < ∞} ∩ {T0 ◦ θT (r) = ∞} where Q is the set of positive rational numbers, we have a contradiction. Remark 3.6.15 We make the following observation for use in Theorem 13.1.2. Let B = {x1 , . . . , xp } ∈ S be a fixed set and define σ = inf{t ≥ 0 | Xt ∈ B ∩ {X0 }c }.
(3.144)
A proof similar to the proof of Lemma 3.6.12 shows that if P x (σ < ∞) = 1 for all x ∈ B, then all x ∈ B are recurrent. Here are the details of the proof. We define the sequence of stopping times T0 = 0, T1 = σ, and Tn+1 = Tn + σ ◦ θTn , n = 1, 2, . . .. Let At = x∈B Lxt . Using the monotonicity and additivity of At , we see that ∞ ∞ (ATn +1 − ATn ) = (Aσ ◦ θTn ) . (3.145) A∞ ≥ n=0
n=0
It follows from the assumption that P x (σ < ∞) = 1 for all x ∈ B and the strong Markov property that Tn < ∞, P x almost surely for all x ∈ B, for all n ≥ 1. Hence, by (3.105) and the strong Markov property, for any x ∈ B, y∈B
u(x, y) = E x (A∞ ) ≥
∞ n=0
E x (Aσ ◦ θTn ) =
∞
E x E XTn (Aσ ) .
n=0
(3.146) Since σ > 0, P y almost surely for all y ∈ B, it follows from the definition of local time that Lyσ > 0, P y almost surely for all y ∈ B and therefore
3.6 Local times
97
r(y) := E y (Aσ ) > 0 for all y ∈ B. Consequently, E XTn (Aσ ) ≥ inf r(y) > 0 y∈B
P x a.s.
(3.147)
Thus, by (3.146), we have y∈B u(x, y) = ∞, so that u(x, y0 ) = ∞ for some y0 ∈ B. By (3.68), u(x, x) = ∞, so that, by Lemma 3.6.11, x is recurrent P x almost surely. Analogous to Lemma 2.4.5, we have Lemma 3.6.16 If u(0, 0) = ∞, then {τ (s); s ∈ R+ } has stationary and independent increments under P 0 . Proof The proof is the same as the proof for Brownian motion, except that we now use Lemma 3.2.4. Remark 3.6.17 If u(0, 0) < ∞, a similar proof shows that, conditional on the event τ (s) < ∞, τ (t) ◦ θτ (s) is independent of Fτ (s) and has the same law, under P 0 , as τ (t). Let τ − (s) = inf{t > 0 | L0t ≥ s}
(3.148)
with inf ∅ = ∞. We refer to τ − ( · ) as the left continuous inverse local time (of X), at 0. For each s, τ − (s) is a stopping time. Lemma 3.6.18 For any t > 0, τ − (t) = τ (t) Proof
P0
a.s.
(3.149)
Clearly τ − (t) ≤ τ (t). Since {τ − (t) < τ (t)} = {τ − (t) < ∞ , RL0 ◦ θτ − (t) > 0}
and τ − (t) is a stopping time, by the strong Markov property, P 0 (τ − (t) < τ (t)) = E 0 I{τ − (t)<∞} P Xτ − (t) (RL0 > 0) .
(3.150)
It is clear from the definition that on {τ − (t) < ∞} we have that τ − (t) is in the support of dL0s . Therefore, by Remark 3.6.2, Xτ − (t) = 0 almost surely. The proof follows since, by the definition of local time, P 0 (RL0 > 0) = 0. Lemma 3.6.19 τ − (L0∞ ) = L
P0
a.s.
(3.151)
Markov processes and local times
98 Proof Also,
Since L0s cannot increase after time L, we have τ − (L0∞ ) ≤ L.
{τ − (L0∞ ) < L}
⊆
{τ − (L0∞ ) < r < L}
(3.152)
r rational
⊆
{T0 ◦ θr < ∞, RL0 ◦ θT0 ◦θr > 0}.
r rational
By the strong Markov property and the facts that XT0 ◦θr = 0 on {T0 ◦ θr < ∞} and P 0 (RL0 > 0) = 0, we have P 0 (T0 ◦ θr < ∞, RL0 ◦ θT0 ◦θr > 0) = E 0 I{T0 ◦θr <∞} P XT0 ◦θr (RL0 > 0) = E 0 I{T0 ◦θr <∞} P 0 (RL0 > 0) = 0. Therefore P τ − (L0∞ ) < L = 0.
(3.153)
3.7 Jointly continuous local times Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential density uα (x, y). In Section 3.6 we construct a local time Lyt for each y ∈ S. Now we consider the stochastic process L = {Lyt , (t, y) ∈ R+ × S}. We ask whether we can obtain a version of L that is continuous or, as is often stated, “jointly continuous.” This is a classical and important problem in the theory of Markov processes. One of the main results in this book is a necessary and sufficient condition for a strongly symmetric Borel right process to have jointly continuous local times. ¯ = {L ¯ yt , (t, y) ∈ R+ × S} is When we say that a stochastic process L a version of the local time of a Markov process X, we mean more than the traditional statement that one stochastic process is a version of the other. Besides this we also require that the version is itself a local time ¯ y· is a local time for X at y, as defined in for X, that is, for each y ∈ S, L Section 3.6. To be more specific, suppose that L = {Lyt , (t, y) ∈ R+ × S} is a local time for X. When we say that we can find a version of the local time that is jointly continuous on T × S, where T ⊂ R+ , we mean ¯ = {L ¯ yt , (t, y) ∈ R+ × S} that is that we can find a stochastic process L continuous on T × S for all x ∈ S and which satisfies, for each x, y ∈ S, ¯ yt = Lyt L
∀ t ∈ R+
a.s. P x
(3.154)
Following convention, we often say that a Markov process has a continuous local time when we mean that we can find a continuous version for the local time.
3.7 Jointly continuous local times
99
In Theorem 3.7.3 we give conditions that imply that X has a jointly continuous local time, but first we note a simple but useful consequence of joint continuity. Theorem 3.7.1 (Occupation Density Formula) If L = {Lyt , (t, y) ∈ R+ × S} is jointly continuous, then, for any f ∈ Cκ (S) and t ≥ 0, t f (Xs ) ds = f (y)Lyt dm(y) a.s. (3.155) 0
Proof Let
At =
f (y)Lyt dm(y).
(3.156)
It is easy to see that A = {At ; t ≥ 0} is a CAF, as defined on page 83. It is clear that A satisfies conditions (1) and (2). To check that it satisfies condition (3), additivity, note that for each y ∈ S, Lyt+s = Lyt + Lys ◦ θt
for all s, t ∈ R+ a.s.
(3.157)
Clearly (3.157) also holds on a countable dense set of S almost surely, and since L is jointly continuous, it holds for all y ∈ S almost surely. It follows from this that condition (3), in the definition of a CAF is also satisfied. Let g ∈ Bb (S). It follows from Remark 3.6.2 and (3.91) that ∞ y α x −αt e g(Xt ) dLt f (y)m(dy) (3.158) UA g(x) = E 0 ∞ = Ex e−αt g(y) dLyt f (y)m(dy) 0 α = u (x, y)g(y)f (y) dm(y). t
f (Xs ) ds is a CAF and ∞ e−αt g(Xt )f (Xt )dt UBα g(x) = E x 0 α = u (x, y)g(y)f (y) dm(y).
Clearly Bt =
0
(3.159)
Thus UAα g = UBα g for all g ∈ Bb (S). Equation (3.155) now follows from Lemma 3.6.8. Let Lyt be a local time for X at y. Since Lyt is increasing in t, Ly∞ = limt→∞ Lyt exists. We often refer to the process L∞ := {Ly∞ , y ∈ S} as the total accumulated local time of X. The isomorphism theorems that
Markov processes and local times
100
we develop, which relate local times to Gaussian processes, usually are obtained (at least initially) for L∞ , and so we require that Ly∞ < ∞ for some, or all, y ∈ S. Of course sometimes it is. When it is not we consider X killed at some stopping time and study the total accumulated local time of the killed process. For example, if Ly∞ is infinite, we could yt := Lt∧λ , the local time for the killed process X, which was consider L y∞ is finite almost surely (see Remark introduced in Section 3.5, since L 3.6.4 (3)). In this section we reduce the problem of showing that L = {Lyt , (t, y) ∈ R+ ×S} is jointly continuous to showing that {Ly∞ , y ∈ S} is continuous. Note that since S is a locally compact space with a countable base, we can always find a metric ρ compatible with the topology of S and consider (S, ρ) as a locally compact separable metric space. Lemma 3.7.2 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous 0-potential density u(x, y) and state space (S, ρ), where S is a locally compact separable metric space. Let L = {Lyt , (t, y) ∈ R+ × S} be local times for X. Suppose there exists a p > 1 such that, for any compact set K ⊆ S, p
E x sup (Ly∞ ) < ∞,
(3.160)
y∈D∩K
where D is countable dense subset of S. Then 1/p x E sup t≥0
|Lyt − Lzt |p
sup
(3.161)
ρ(y,z)≤δ
y,z∈D∩K
1/p
≤
p x E p−1
sup ρ(y,z)≤δ
|Ly∞ − Lz∞ |p
y,z∈D∩K
+
sup
|u(x, y) − u(x, z)|.
ρ(y,z)≤δ
x,y,z∈D∩K
Proof
Consider the martingale Ayt = E x (Ly∞ | Ft )
(3.162)
and note that Ly∞ = Lyt + Ly∞ ◦ θt . Therefore, Ayt = Lyt + E x (Ly∞ ◦ θt | Ft ) = Lyt + E Xt (Ly∞ )1{t<ζ} ,
(3.163)
3.7 Jointly continuous local times
101
where we use the Markov property. Since E x (Ly∞ ) = u(x, y), we have Ayt = Lyt + u(Xt , y)1{t<ζ} .
(3.164)
Let F be a finite subset of D ∩ K. Using (3.160) along with the fact that u(Xt , y) ≤ u(y, y), we see that E x supt≥0 supx∈F Axt < ∞ and 1/p
x y z p E sup sup |Lt − Lt | t≥0
(3.165)
ρ(y,z)≤δ
y,z∈F
1/p
≤ E x sup sup |Ayt − Azt |p t≥0
ρ(y,z)≤δ
y,z∈F
1/p
+ E x sup sup |u(Xt , y) − u(Xt , z)|p 1{t<ζ} t≥0
.
ρ(y,z)≤δ
y,z∈F
It follows from Lemma 3.4.3 and the continuity of u that the last term in (3.165) is dominated by the last term in (3.162). Furthermore, since Ayt is right continuous, we have that sup Ayt − Azt = sup |Ayt − Azt | ρ(y,z)≤δ
ρ(y,z)≤δ
y,z∈F
y,z∈F
(3.166)
is a right continuous nonnegative submartingale. Therefore, by Doob’s Lp inequality (see, e.g., Rogers and Williams (2000b, II.70.2)) and the · · = L∞ , we have fact that A∞ 1/p
x y z p E sup sup |At − At | t≥0
1/p
≤
ρ(y,z)≤δ
y,z∈F
p x y z p E sup |L∞ − L∞ | p−1 ρ(y,z)≤δ
,
y,z∈F
(3.167) which when substituted in (3.165) gives (3.162). Theorem 3.7.3 Suppose that, in addition to the hypotheses of Lemma 3.7.2, lim E x
δ→0
sup ρ(y,z)≤δ
|Ly∞ − Lz∞ |p = 0.
y,z∈D∩K
Then we can find a version of L that is continuous on R+ × S.
(3.168)
Markov processes and local times
102
Fix T < ∞ and a compact set K ⊆ S. We first show that
Proof
Ex
sup
sup
|s−t|≤h(δ)
ρ(y,z)≤δ
|Lys − Lzt |p ≤ H(δ),
(3.169)
s,t∈[0,T ] y,z∈D∩K
where both h(δ) and H(δ) decrease to zero as δ decreases to zero. Note that it follows from Lemma 3.7.2, (3.168), and the uniform continuity of u(x, y) on the compact set K × K that E x sup
|Lys − Lzs |p ≤ H (δ),
sup
s≥0
(3.170)
ρ(y,z)≤2δ
y,z∈D∩K
where H (δ) decreases to zero as δ decreases to zero. Let B(vj , δ), j = 1, . . . , N (δ) be a family of closed balls of radius δ in N (δ) the metric ρ, centered at vj ∈ D ∩ K such that D ∩ K ⊆ ∪j=1 B(vj , δ). Note that ρ(y, z) ≤ δ implies that both y and z are contained in B(vj , 2δ) for at least one j ∈ [1, N (δ)]. For each j ∈ [1, N (δ)], |Lys − Lzt |
sup
≤
sup y∈B(vj ,2δ)
ρ(y,z)≤δ
y,z∈B(vj ,2δ)
+
|Lys − Lvsj |
(3.171)
v
sup z∈B(vj ,2δ)
v
|Lt j − Lzt | + |Lvsj − Lt j |.
Consequently, sup
|Lys − Lzt |p =
ρ(y,z)≤δ
y,z∈D∩K
|Lys − Lzt |p
sup
sup
j∈[1,N (δ)]
ρ(y,z)≤δ
y,z∈B(vj ,2δ)∩D∩K
≤ Cp sup s≥0
|Lys − Lzs |p +
sup ρ(y,z)≤2δ
sup j∈[1,N (δ)]
(3.172)
v |Lvsj − Lt j |p .
y,z∈D∩K
We use (3.172) to obtain (3.169). The first term in the last line of (3.172) is controlled by (3.170). To control the second term in the last v line of (3.172), we note that for each vj , Lt j is uniformly continuous on [0, T ]. Therefore, it follows from (3.160) and the Dominated Convergence Theorem that lim E x
δ →0
v
sup |Lvsj − Lt j |p = 0.
(3.173)
|s−t|≤δ
s,t∈[0,T ]
Thus we can find a function h(δ) that goes to zero as δ goes to zero,
3.7 Jointly continuous local times
103
such that
N (δ)
Ex
sup
sup
j∈[1,N (δ)] |s−t|≤h(δ) s,t∈[0,T ]
v
|Lvsj −Lt j |p ≤
j=1
Ex
|Lvsj −Lt j |p ≤ H (δ). v
sup |s−t|≤h(δ)
s,t∈[0,T ]
(3.174) Thus we have (3.169). Let Kn be a sequence of compact subsets of S such that S = ∪∞ n=1 Kn y and let s > 0. It follows from (3.169) that Lt is uniformly continuous on [0, s] × (Kn ∩ D), P x almost surely. Let = {ω | Lyt (ω) is locally uniformly continuous on R+ × D} Ω (3.175) and note that c = Ω {ω | Lyt (ω) is not uniformly continuous on [0, s]×(Kn ∩D)}, s∈R
1≤n≤∞
(3.176) where R denotes the rational numbers. By the remark in the previous c ) = 0. Thus we have paragraph, P x (Ω =1 P x (Ω)
∀x ∈ S.
(3.177) % = L yt , (t, y) ∈ R+ × S We now construct a stochastic process L $ y let L t (ω), (t, y) ∈ that is continuous and is a version of L. For ω ∈ Ω, % y R+ × S be the continuous extension of {Lt (ω), (t, y) ∈ R+ × D} to c set R+ × S, and for ω ∈ Ω $
%
yt ≡ 0 L
$
∀ t, y ∈ R+ × S.
(3.178)
yt , (t, y) ∈ R+ × S is a well-defined stochastic process that, clearly, L satisfies (3.154). is jointly continuous on R+ × S. We now show that L We have been working with a countable dense set D. We could just as well have used D ∪ {y} and thus obtained (3.177) with D replaced by Therefore we can assume that {y} ∈ D. D ∪ {y} in the definition of Ω. , with all y ∈ D, be such that limi→∞ yi = y. We then Let {yi }∞ i i=1 have lim Lyt i = Lyt
i→∞
locally uniformly on R+ ,
P x a.s.
we also have By the definition of L yt lim Lyt i = L
i→∞
locally uniformly on R+ ,
P x a.s.
Markov processes and local times
104 This shows that
yt = Lyt L
∀t
P x a.s.,
(3.179)
which is (3.154). At first glance it might appear as though we require that X has a 0potential density to obtain conditions for X to have a jointly continuous local time. This is not the case, as we see from the following corollary of Lemma 3.7.2 and Theorem 3.7.3. Corollary 3.7.4 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential density uα (x, y) and state space (S, ρ), where S is a locally compact separable metric space. Let L = {Lyt , (t, y) ∈ R+ × S} be local times for X. Suppose there exists a p > 1 such that, for all compact sets K ⊆ S p
Eλx sup (Lyλ ) < ∞,
(3.180)
y∈D∩K
where D is a countable dense subset of S and λ is an exponential random variable independent of X. Then 1/p (3.181) Eλx sup sup |Lyt∧λ − Lzt∧λ |p t≥0
ρ(y,z)≤δ
y,z∈D∩K
≤
p x E p−1 λ
sup
|Lyλ − Lzλ |p
1/p +
sup
ρ(y,z)≤δ
ρ(y,z)≤δ
y,z∈D∩K
x,y,z∈D∩K
|u(x, y) − u(x, z)|.
If, in addition, lim Eλx
δ→0
sup
|Lyλ − Lzλ |p = 0,
(3.182)
ρ(y,z)≤δ
y,z∈D∩K
we can find a version of L that is continuous on R+ × S. = (Ω, G, Gt , X t , θt , Px ) be the strongly symmetric Borel Proof Let X right process obtained by killing X at an independent exponential time λ with mean 1/α, as described in Section 3.5. The 0-potential density is the α-potential density of X. Thus we have a Borel right process of X X with state space S and continuous 0-potential density uα (x, y). As we pointed out in Remark 3.6.4 (3) when Lyt is a local time for X, yt := Ly is a local time for X. L t∧λ yt is locally It follows from Lemma 3.7.2 and Theorem 3.7.3 that L uniformly continuous on R+ × D, almost surely with respect to µ × P x , where µ is the probability measure of λ and D is a countable dense subset of S. Equivalently, Lyt is locally uniformly continuous on [0, λ] × D,
3.8 Calculating uT0 and uτ (λ)
105
almost surely with respect to µ × P x . Using Fubini’s Theorem we can now see that Lyt is locally uniformly continuous on [0, qi ] × D for all qi ∈ Q, where Q is a countable dense subset of R+ . As in Theorem 3.7.3, we can extend Lyt so that it is jointly continuous on R+ × D almost surely with respect to P x . The rest of the proof, showing that the extended Lyt is a local time for X, is the same as in Theorem 3.7.3.
Remark 3.7.5 Suppose that we only assume (3.182) for a single compact set K ⊆ S. The proof of Corollary 3.7.4 shows that we can find a version of L that is continuous on R+ × K. This is easy to check.
3.8 Calculating uT0 and uτ (λ) Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential density uα (x, y). In this section we generalize the concepts introduced for Brownian motion in Section 2.5. We consider terminal times, as defined in the first paragraph of Section 2.5, and, for a CAF A = {At ; t ≥ 0} on (Ω, Ft , F), the inverse times τA as defined in the paragraph containing (2.143). We also consider uT and uτA (λ) as defined in (2.138) and (2.146). Note that Lemma 2.5.2 continues to hold. A particularly important example of a continuous additive functional is the local time itself. As a generalization of (3.132) we set τx (s) = inf{t > 0 | Lxt > s}, where inf{∅} = ∞. (We generally write τ (s) for τ0 (s).) Note that, just as in (3.133), τx (s + t) = τx (s) + τx (t) ◦ θτx (s) .
(3.183)
Recall that T0 denotes the first hitting time of 0, where we use 0 to indicate some fixed point in S. It is obvious that T0 is a terminal time. In the next two lemmas we obtain explicit descriptions of the potential densities uT0 and uτ (λ) . Lemma 3.8.1
uT0 (x, y) = lim
α→0
uα (x, 0)uα (0, y) u (x, y) − uα (0, 0) α
! ,
(3.184)
where the limit always exists. If P x (T0 < ∞) > 0 for all x ∈ S, then the limit in (3.184) is finite and positive definite.
106
Markov processes and local times
Note that uT0 (x, 0) = 0 for all x ∈ S. Proof We first note that, as in (3.128), ∞ ∞ y y x −αt x −αt e dLt e dLt E = E 1{T0 <∞} (3.185) T0 T0 ∞ y x −αT0 −αt 1{T0 <∞} e dLt ◦ θT0 = E e 0 ∞ y x −αT0 X T0 −αt 1{T0 <∞} E e dLt = E e 0 = E x e−αT0 uα (0, y) since XT0 = 0 on T0 < ∞. We use (3.185) and (3.107) precisely as in (2.140) to obtain T0 uα (x, 0)uα (0, y) y x −αt e dLt = uα (x, y) − E . (3.186) uα (0, 0) 0 Consequently, by the Monotone Convergence Theorem, ! uα (x, 0)uα (0, y) α . uT0 (x, y) = lim u (x, y) − α→0 uα (0, 0)
(3.187)
We now show that if P x (T0 < ∞) > 0 for all x ∈ S, then the limit in (3.187) is finite. Using (3.183) and the terminal time property, (2.137) for T0 , we have P x (LxT0 > s + t)
= P x (τx (s + t) < T0 )
(3.188)
= P (τx (s) < T0 , τx (t) ◦ θτx (s) < T0 ◦ θτx (s) ). x
Since Xτx (s) = x on {τx (s) < ∞}, the strong Markov property shows us that P x (LxT0 > s + t)
= P x (τx (s) < T0 )P x (τx (t) < T0 ) = P
x
(LxT0
> s)P
x
(LxT0
(3.189)
> t).
It follows from (3.189) that if there exists an s for which P x (LxT0 > s) < 1, then LxT0 is finite almost surely and is an exponential random variable under P x with mean uT0 (x, x) < ∞. On the other hand, LxT0 cannot infinite almost surely. To see this we note that uα (x, x) = ∞be−αt x e dLxt is finite and Lxt is increasing in t. Therefore E 0 ∞ e−αT0 LxT0 ≤ e−αt dLxt < ∞ P x a.s. (3.190) 0
By assumption P x (T0 < ∞) > 0, and when T0 < ∞ we see by (3.190) that LxT0 < ∞. Thus we see that uT0 (x, x) < ∞ for all x ∈ S.
3.8 Calculating uT0 and uτ (λ)
107
We know that uα (x, y) is positive definite by Lemma 3.3.3, and thereuα (x, 0)uα (0, y) α . fore, by Remark 5.1.6, so is uα (0) (x, y) := u (x, y) − uα (0, 0) Taking the limit as α goes to zero we see that uT0 (x, y) is positive definite. Since uT0 (x, x) < ∞ for all x ∈ S, the finiteness of uT0 (x, y) follows by the Cauchy–Schwarz inequality; see (5.27). Remark 3.8.2 In the course of the proof of Lemma 3.8.1 we show that when P x (T0 < ∞) > 0, LxT0 is an exponential random variable, under P x , with mean uT0 (x, x) < ∞. Remark 3.8.3 It follows from (3.184) that if X has continuous 0potential densities u(x, y), then uT0 (x, y) = u(x, y) −
u(x, 0)u(0, y) . u(0, 0)
(3.191)
In this case it is obvious that uT0 (x, y) is continuous and it follows from Remark 5.1.6 that it is positive definite. Lemma 3.8.4 Assume that P x (T0 < ∞) > 0 for all x ∈ S. Then 1 P x (T0 < ∞)P y (T0 < ∞), α + 1/u(0, 0) (3.192) where 1/α = E(λ) and 1/u(0, 0) = 0 if u(0, 0) = ∞. In fact, when u(0, 0) = ∞ we have P x (T0 < ∞) = 1 for all x ∈ S, so that uτ (λ) (x, y) = uT0 (x, y) +
1 . α
uτ (λ) (x, y) = uT0 (x, y) + Proof
(3.193)
We first note that ∞ y x −βt e dLt Eλ T0 +τ (λ)◦θT0
=
Eλx
1{T0 <∞}
=
−βT0
Eλx
e
(3.194)
∞
=E =E
−βT0
x
e
−βT0
1{T0 <∞}
e
∞
dLyt
∞
e
◦ θT 0
∞
−βt
e τ (λ)
−βt
e τ (λ)
τ (λ)
X 1{T0 <∞} Eλ T0
Eλ0
−βt
e
dLyt
T0 +τ (λ)◦θT0
x
−βt
dLyt
dLyt
Markov processes and local times
108
since XT0 = 0 on T0 < ∞. Similarly, ∞ y e−βt dLt Eλ0 τ (λ)
=
Eλx
1{τ (λ)<∞}
=
Eλx
−βτ (λ)
e
(3.195)
∞
−βt
dLyt
e τ (λ)
1{τ (λ)<∞}
∞
= Eλx e−βτ (λ) 1{τ (λ)<∞} E Xτ (λ) ∞ x −βτ (λ) 0 E e−βt dLyt = Eλ e
dLyt
◦ θτ (λ) ∞ y −βt e dLt 0
e
0
−βt
0
since Xτ (λ) = 0 on τ (λ) < ∞. Combining (3.194) and (3.195), we see that ∞ y x −βt e dLt = E x e−βT0 Eλx e−βτ (λ) uβ (0, y). Eλ T0 +τ (λ)◦θT0
(3.196) Using this and proceeding exactly as in (2.165)–(2.166), we see that ! uβ (x, 0)uβ (0, y) uτ (λ) (x, y) = lim uβ (x, y) − (3.197) β→0 uβ (0, 0) uβ (x, 0)uβ (0, y) (1 − Eλ0 (e−βτ (λ) )). β→0 uβ (0, 0)
+ lim By (3.184),
lim
β→0
uβ (x, y) −
uβ (x, 0)uβ (0, y) uβ (0, 0)
! = uT0 (x, y).
(3.198)
Also, by (3.107), uβ (x, 0)uβ (0, y) (1 − Eλ0 (e−βτ (λ) )) β→0 uβ (0, 0) lim
(3.199)
= lim E x (e−βT0 )E y (e−βT0 )uβ (0, 0)(1 − Eλ0 (e−βτ (λ) )) β→0
= P x (T0 < ∞)P y (T0 < ∞) lim uβ (0, 0)(1 − Eλ0 (e−βτ (λ) )), β→0
and, by (3.134), uβ (0, 0)(1 − Eλ0 (e−βτ (λ) )) = u (0, 0)(1 − Eλ (e β
−λ/uβ (0,0)
(3.200)
)) 1 α = uβ (0, 0)(1 − )= . β α + 1/u (0, 0) α + 1/uβ (0, 0)
3.9 The h-transform
109
Thus we get (3.192). The latter statement of the lemma follows from Lemma 3.6.13. We end this section with an extension of (3.68) (which is given by (3.201) with T = ∞). Lemma 3.8.5 Let X be a strongly symmetric Borel right process with state space S and continuous 0-potential densities u(x, y). Let T be a terminal time for X. Then uT (x, y) ≤ uT (y, y)
∀ x, y ∈ S.
(3.201)
Proof Using the additivity and support properties of Lyt and the fact that T is a terminal time, we have LyT = LyTy +T ◦θT = LyT ◦ θTy
on Ty < T.
y
(3.202)
Therefore, by the strong Markov property, uT (x, y) = E x (LyT )
= E x (LyT ◦ θTy , Ty < T ) x
= E (Ty < T )E
y
(LyT )
≤E
(3.203) y
(LyT )
= uT (y, y).
3.9 The h-transform Let X = (Ω, F, Ft , Xt , θt , P x ) be a strongly symmetric Borel right process with transition semigroup {Pt ; t ≥ 0} and continuous 0-potential density u(x, y). Let 0 denote some fixed element in S and assume that h(x) := P x (T0 < ∞) > 0 for all x ∈ S. By (3.109), h(x) =
u(x, 0) . u(0, 0)
(3.204)
In this section we construct another Borel right process, called the htransform of X, that is, essentially, X conditioned to hit 0 and die at its last exit from 0. Let L be as defined in (3.135). By Remark 3.2.8 we assume that Ω is the space of right continuous S∆ -valued functions {ω(t), t ∈ [0, ∞)} such that ω(t) = ∆ for all t ≥ ζ(ω). Furthermore, by Lemma 3.6.11 and Remark 3.6.14 and using the same procedure as in Remark 3.2.8, we assume that L(ω) < ∞ for all ω ∈ Ω and that L(ω) is a left limit point of zeros of ω(t) on {L(ω) > 0}. (Recall that we have Xt (ω) = ω(t).)
Markov processes and local times
110
We define the killing operator kL : Ω → Ω as ω(s) s < L(ω) kL (ω)(s) = ∆ s ≥ L(ω).
(3.205)
Let Ωh = kL (Ω). We then have Ωh = {ω ∈ Ω | ζ(ω) = L(ω)}.
(3.206)
Note that {L > 0} = {T0 < ∞} and, more generally, {L > s} = θs {T0 < ∞}. Therefore L is F measurable. Furthermore, for any f ∈ B(S∆ ), f (Xt ) ◦ kL := f (kL (X(t)) = f (Xt )1{t
(3.207)
which implies that f (Xt ) ◦ kL is F measurable. Thus, for any a < b, we −1 ({a ≤ f (Xt ) ≤ b}) = (f (Xt ) ◦ kL )−1 [a, b] ∈ F. Since sets of the have kL −1 : F 0 → F. form {a ≤ f (Xt ) ≤ b} generate F 0 , it follows that kL −1 0 Since kL : F → F, we can define the probability measures {P x/h ; x ∈ S} on (Ωh , F 0 ) by setting P x/h (A) =
1 −1 (A) ∩ {L > 0}) P x (kL h(x)
A ∈ F 0.
(3.208)
We then have E x/h (F ) =
1 E x F ◦ kL 1{L>0} h(x)
(3.209)
for any positive F 0 measurable function F . This follows from (3.208) by first considering F of the form F = 1A with A ∈ F 0 . Consider E x/h (F ) for F = f1 (Xt1 ) · · · fn (Xtn ) with t1 < · · · < tn , where fi ∈ B(S), i = 1, . . . , n, and recall our convention that functions on S are extended to S∆ by setting f (∆) = 0. Thus F ◦ kL = F 1{L>tn } = F 1{L>0} ◦ θtn .
(3.210)
It now follows from (3.209), (3.210), and the Markov property for X that 1 E x F 1{L>tn } (3.211) E x/h (F ) = h(x) 1 E x F P Xtn (L > 0) = h(x) 1 E x (F h(Xtn )). = h(x) Using the abbreviation Fn−1 = f1 (Xt1 ) · · · fn−1 (Xtn−1 ) in (3.211), we
3.9 The h-transform
111
have E x/h (F )
= = = =
1 E x (F h(Xtn )) (3.212) h(x) 1 E x (Fn−1 fn (Xtn )h(Xtn )) h(x) 1 E x (Fn−1 E Xtn−1 {fn (Xtn −tn−1 )h(Xtn −tn−1 )}) h(x) 1 E x (Fn−1 h(Xtn−1 )E Xtn−1 /h {fn (Xtn −tn−1 )}) h(x)
= E x/h (Fn−1 E Xtn−1 /h {fn (Xtn −tn−1 )}). Using (3.212) for the first equality and (3.211) for the second, we see that, for any tn−1 < tn and fn ∈ B(S), E x/h (fn (Xtn ) | Ft0n−1 ) (3.213) = E Xtn−1 /h fn (Xtn −tn−1 ) 1 = E Xtn−1 fn (Xtn −tn−1 )h(Xtn −tn−1 ) h(Xtn−1 ) 1 Pt −t f h(Xtn−1 ). = h(Xtn−1 ) n n−1 Let Qt f (x) :=
1 Pt f h(x). h(x)
(3.214)
Using this, (3.213), and Remark 3.1.1, it is easy to verify the following lemma. Lemma 3.9.1 X = Ωh , F 0 , Ft0 , Xt , θt , P x/h is a right continuous simple Markov process with transition semigroup {Qt ; t ≥ 0}. Let U
α
denote the α-potential of of X. Then, for any f ∈ Bb (S), ∞ α U f (x) = e−αt Qt f (x) dt (3.215) 0 ∞ 1 = e−αt Pt f h(x) dt h(x) 0 1 U α f h(x) = h(x) 1 uα (x, y)h(y)f (y) dm(y). = h(x)
Markov processes and local times
112 Thus
X has α-potential density
1 α u (x, y)h(y). h(x)
(3.216)
Let F h , Fth denote the standard augmentation of F 0 , Ft0 under {P x/h ; x ∈ S}. Statement (3.216) together with the continuity and strict positivity of h show that X has α-potential densities that are continuous on S ×S for each α. It then follows from Lemmas 3.9.1 and 3.4.2 that X is a Borel right process. As usual, let m denote the reference measure of the original Borel right process X. Considering (3.216), it is clear that X is not strongly symmetric with respect to m. Nevertheless, with a different reference measure, X is a strongly symmetric Borel right process. We refer to this process as X. = Ωh , F h , Fth , Xt , θt , P x/h is a Borel right proTheorem 3.9.2 X cess that is strongly symmetric with respect to the measure m(dy) := h2 (y) dm(y) and has α-potential densities u α (x, y) :=
uα (x, y) h(x)h(y)
(3.217)
with respect to the m(dy). is called the h-transform of X. X is a Borel right process. It follows from Proof We just showed that X α with respect to m. It (3.216) that u (x, y) is the potential density of X α , the potential with density u α , is symmetric with is easy to see that U respect to m. are the same Borel right process. They simply Remark 3.9.3 X and X have different α-potentials with respect to different reference measures. It is also customary to refer to X as the h-transform process. is a strongly symmetric Borel right process, by Theorem 3.6.3 Since X = {L yt , (y, t) ∈ S × R+ } that satisfies we know that it has a local time L ∞ yt = u α (x, y). e−αt dL (3.218) Ex 0
By Remark 3.6.9, y yt Lt := h2 (y)L
∀ t ∈ R+
or, equivalently, X. Furthermore, is also a local time for X ∞ 1 α y u (x, y)h(y) e−αt dLt = h2 (y) uα (x, y) = Ex h(x) 0
(3.219)
(3.220)
3.9 The h-transform
113
is the α-potential density of X. Remark 3.9.4 The role of the killing operator in the definition of the probability measures P x/h on (Ωh , F 0 ) in (3.208) justifies our interpre as the paths of X conditioned to hit 0 and die on their last tation of X exit from 0. For this reason P x/h is often written as P x,0 . We use both notations in this book. Let L = {Lxt ; (x, t) ∈ S × R+ } be the times of a strongly sym% $ local = L xt ; (x, t) ∈ S × R+ the local metric Borel right process X and L We consider the times of the corresponding h-transform process X. satisfying versions of L and L ∞ e−αt dLxt = uα (y, x) ∀α > 0 (3.221) Ey 0
and
E
y
∞
−αt
e 0
x dL t
=u α (y, x)
∀ α > 0.
(3.222)
These exist by Theorem 3.6.3. under P 0/h with L under P 0 . In the next lemma we compare L Lemma 3.9.5 For any countable subset D ⊆ S, $ 2 % % $ xt ; (x, t) ∈ D × R+ , P 0/h law h (x)L = Lxt ◦ kL ; (x, t) ∈ D × R+ , P 0 . (3.223) Proof Let f,x (y) be an approximate δ-function at x with respect to the reference measure m. By Theorem 3.6.3 there exists a sequence {n } tending to zero, such that, for t x := {ω ∈ Ω | Ω fn ,x (Xs ) ds converges locally uniformly in t}, 0
(3.224) x ) = 1 for all y ∈ S and {Lxt ; t ∈ R+ } defined by we have P y (Ω & t x limn →0 0 fn ,x (Xs ) ds ω ∈ Ω x (3.225) Lt = x 0 ω∈ /Ω is a version of the local time of X at x satisfying (3.221). Clearly, {Lxt ; t ∈ R+ } is F 0 measurable. Using (3.209), we see that for any 1 , and λ1 , . . . , λn ∈ R1 , x1 , . . . , xn ∈ S, t1 , . . . , tn ∈ R+ n n x x i λ L j i λ L j ◦k (3.226) E 0/h e j=1 j tj = E 0 e j=1 j tj L .
Markov processes and local times
114
Using the continuity of h(y) and the fact that h(y) > 0, we see that h−2 (x)f,x (y) is an approximate δ-function at x with respect to the reference measure m. Exactly as above, by Theorem 3.6.3, there exists a sequence {n } tending to zero, such that, for h,x := Ω (3.227) t h−2 (x)fn ,x (Xs ) ds converges locally uniformly in t , ω ∈ Ωh | 0
h,x ) = 1 for all y ∈ S and {L xt ; t ∈ R+ } defined by (Ω & t h,x limn →0 0 h−2 (x)fn ,x (Xs ) ds ω ∈ Ω x t = L (3.228) h,x 0 ω∈ /Ω
we have P
y/h
at x satisfying (3.222). By Remark is a version of the local time of X 3.6.4 (2) we can choose the same sequence {n } in (3.224) and (3.227). It follows that x h,x = Ωh ∩ Ω (3.229) Ω and that x Lxt = h2 (x)L t
on
h,x . Ω
(3.230)
h,x ) = 1, the lemma follows from (3.230) and (3.226). Since P 0/h (Ω Recall that τ (s) = inf{t > 0 | L0t > s} and τ − (s) = inf{t > 0 | L0t ≥ s}. 0 > s} (in all of these expressions, Similarly we set τ(s) = inf{t > 0 | L t inf{∅} = ∞). Lemma 3.9.6 For any countable subset D ⊆ S and t > 0, x 0 x ; x ∈ D , P 0/h law h2 (x)L . = L − (t∧L0 ) ; x ∈ D , P τ τ (t) ∞
(3.231)
Proof We continue the notation of the previous lemma. Note that t∧L t fn ,0 (Xs ) ds = 1[0,L] (s)fn ,0 (Xs ) ds (3.232) 0 0 t = fn ,0 (Xs ◦ kL ) ds, 0
where, according to our convention, fn ,0 (∆) = 0 for all n, so that the last two integrands agree except (possibly) for s = L Using (3.232) and (3.225) we see that L0t∧L = L0t ◦ kL , for all t P 0 almost surely. Also, by Remark 3.6.2 we have L0t = L0t∧L for all t, P 0 almost surely. Therefore, L0t = L0t ◦ kL ,
∀t ≥ 0
P0
a.s.
(3.233)
3.10 Moment generating functions of local times
115
It follows from (3.223) and the fact that h(0) = 1 that x ; (x, t) ∈ D × R+ , L 0 ; s ∈ R+ , P 0/h h2 (x)L (3.234) t s law = Lxt ◦ kL ; (x, t) ∈ D × R+ , L0s ◦ kL ; s ∈ R+ , P 0 . Therefore, using (3.233) we see that for any countable subset D ⊆ S x ; (x, t) ∈ D × R+ , L 0 ; s ∈ R+ , P 0/h (3.235) h2 (x)L t s law = Lxt ◦ kL ; (x, t) ∈ D × R+ , L0s ; s ∈ R+ , P 0 . 0r > s} This, together with {τ (s) < r} = {L0r > s}, { τ (s) < r} = {L and the right continuity of τ (s), τ(s) shows that xt ; (x, t) ∈ D × R+ , τ(s) ; s ∈ R+ , P 0/h h2 (x)L (3.236) law = Lxt ◦ kL ; (x, t) ∈ D × R+ , τ (s) ; s ∈ R+ , P 0 . Consequently, for any t > 0, x 0 x ; x ∈ D , P 0/h law . = L ◦ k ; x ∈ D , P h2 (x)L L τ (t) τ (t)
(3.237)
It only remains to show that, for any x ∈ S, Lxτ(t) ◦ kL = Lxτ− (t∧L0∞ ) ,
P0
a.s.
(3.238)
As in the beginning of the proof, we have Lxτ(t)∧L = Lxτ(t) ◦kL , P 0 almost surely, so we need only show that τ (t) ∧ L = τ − (t ∧ L0∞ )
P0
a.s.
(3.239)
This follows from Lemmas 3.6.18 and 3.6.19 and the fact that τ − (t) is monotonically increasing. Remark 3.9.7 By (3.232) with t = ∞, Lx∞ ◦ kL = LxL and by (3.151), LxL = Lxτ− (L0 ) . Thus we have ∞
Lx∞ ◦ kL = Lxτ− (L0∞ )
P0
a.s.
(3.240)
3.10 Moments and moment generating functions of local times In Section 3.8 we define terminal times and inverse times, that is, stopping times of the form τA (λ), where A = {At ; t ≥ 0} is a CAF and λ is an exponential random variable independent of A. In Theorem 2.5.3 we give Kac’s Moment Formula for Brownian motion, which holds for
Markov processes and local times
116
terminal times and inverse times. Clearly, the proof of Theorem 2.5.3 does not involve any specific properties of Brownian motion. As is easily seen, that proof suffices to prove the next, much more general result. Theorem 3.10.1 (Kac’s Moment Formula: III) Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process and assume that its α-potential density, uα (x, y), is finite for all x, y ∈ S. Let {Lyt , (t, y) ∈ R+ × S} be a local time of X with ∞ e−αt dLyt = uα (x, y). (3.241) Ex 0
Then, if T is a terminal time for X, n y x i E LT = uT (x, yπ1 ) · · · uT (yπn−1 , yπn ), i=1
(3.242)
π
where the sum runs over all permutations π of {1, . . . , n}. In particular, n n−1 . (3.243) E x (LyTi ) = n!uT (x, y) (uT (y, y)) Let A = {At , t ≥ 0} be a CAF on (Ω, G, Gt ) and λ an exponential random variable independent of (Ω, G, Gt , P x ). These equations are also valid for inverse times, that is, with T replaced by τA (λ) and E x replaced by Eλx . Using Theorem 3.10.1, the proof of Lemma 2.6.2 can be immediately extended to give the following important relationship for the moment generating function of local times evaluated at terminal times and inverse times. Lemma 3.10.2 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential density, and let T be a terminal time with potential density uT (x, y). Let Σ be the matrix with elements Σi,j = uT (xi , xj ), i, j = 1, . . . , n. Let Λ be the matrix with elements {Λ}i,j = λi δi,j . For all λ1 , . . . , λn sufficiently small and 1 ≤ l ≤ n, n det(I − ΣΛ) xi xl λi LT = E exp , (3.244) det(I − ΣΛ) i=1 where j,k = (Σj,k − Σl,k ) Σ
j, k = 1, . . . , n.
(3.245)
These equations also hold when T is replaced by τA (λ), for any CAF, A = {At , t ∈ R+ }, and E x is replaced by Eλx .
3.10 Moment generating functions of local times
117
Remark 3.10.3 We note for future reference that the proof of Lemma 3.10.2 gives the following analog of (2.173): n n xi xl E exp λi LT = {(I − ΣΛ)−1 }l,j = {(I − ΣΛ)−1 1t }l , (3.246) i=1
j=1
where 1t denotes the transpose of the n-dimensional vector (1, . . . , 1). The next lemma gives analogous results for the h-transform process described in Section 3.9. Lemma 3.10.4 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous 0-potential density u(x, y). Let 0 denote a fixed element of S. Assume that h(x) > 0 for all x ∈ S (see = (Ωh , F h , F h , Xt , θt , P x,0 ) denote the h-transform of (3.204)). Let X t y X, with 0-potential density u (x, y) (see (3.217)). Let L = {Lt ; (y, t) ∈ normalized so that S × R+ } denote the local time for X y u(x, y)h(y) . (3.247) E x,0 L∞ = h2 (y) u(x, y) = h(x) Then E
x,0
n
yi L∞
=
i=1
1 u(x, yπ1 ) · · · u(yπn−1 , yπn )h(yπn ), h(x) π
(3.248)
where the sum runs over all permutations π of {1, . . . , n}. Let Σ be the matrix with elements Σi,j = u(xi , xj ), i, j = 1, . . . , n. Let Λ be the matrix with elements {Λ}i,j = λi δi,j . For all λ1 , . . . , λn sufficiently small and 1 ≤ l ≤ n, n det(I − ΣΛ) xi xl ,0 , (3.249) E exp λi L∞ = det(I − ΣΛ) i=1 where j,k = Σ
h(xj )Σl,k Σj,k − h(xl )
j, k = 1, . . . , n.
(3.250)
Proof Let m be the reference measure for the original process X. We is not strongly show in Theorem 3.9.2 that the h-transform process X symmetric with respect to m but that it is a strongly symmetric Borel right process with respect to the reference measure m(dy) = h2 (y) dm(y) and has 0-potential density u (x, y) =
u(x, y) . h(x)h(y)
(3.251)
Markov processes and local times
118
= {L x ; (x, t) ∈ S × R+ } be the local time for the h-transform Let L t yt ) = u normalized so that E x,0 (L (x, y). Then, by Theorem process X 3.10.1, n y∞i = L u (x, yπ ) · · · u (yπ , yπ ), (3.252) E x,0 1
n−1
n
π
i=1
where the sum runs over all permutations π of {1, . . . , n}. (Obviously, T = ∞ is a terminal time.) By Remark 3.9.3, xi xt i . h−2 (xi )Lt = L
(3.253)
Substituting this in (3.252) gives (3.248). In the same way as the first equation in (2.173) follows from (2.148) we see that n n % 1 $ xi xl ,0 E (I − ΣΛ)−1 l,j h(xj ). (3.254) exp λi L∞ = h(xl ) j=1 i=1 Let H be the matrix with elements j )δj,k . Let Y be the Hj,k =xh(x i n xl ,0 1×n vector with elements E exp i=1 λi L∞ and h the 1×n vector with elements h(xi ), i = 1, . . . , n. It follows from (3.254) that (I − ΣΛ)HY = h.
(3.255)
Consequently, by Cramer’s Theorem, n det((I − ΣΛ)(l, h) ) xl ,0 xi exp λi L∞ = (HY )l = h(xl )E , det((I − ΣΛ)) i=1
(3.256)
where (I − ΣΛ)(l, h) is the matrix obtained by replacing the l-th column of (I − ΣΛ) by h. Thus n det((I − ΣΛ)(l, h) ) xl ,0 xi exp λi L∞ = E , (3.257) det((I − ΣΛ)) i=1 1 h. Note that hl = 1. h(xl ) Let B be the matrix obtained by subtracting the h(xj )/h(xl ) times the l-th row of (I − ΣΛ)(l, h) from the j-th row for each j = l. We see where h=
that Bj,l
= δl,l
Bj,k
=
j,k (I − ΣΛ)
(3.258) j, k = l.
3.11 Notes and references
119
Thus det((I − ΣΛ)(l, h) ) = det B = Ml,l ,
(3.259)
l,k = 0 Since by (3.250), Σ where Ml,l is the (l, l)-th minor of (I − ΣΛ). for all k, we also have det(I − ΣΛ) = Ml,l . Thus we get (3.249) from (3.257). Example 3.10.5 We can write (3.243) as uT (x, y) n n E x (LyTi ) = n! (uT (y, y)) , uT (y, y) from which it is easy to see that, for λ < 1/uT (y, y), uT (x, y) uT (x, y) 1 y x E exp (λLT ) = 1 − + . uT (y, y) uT (y, y) 1 − λuT (y, y)
(3.260)
(3.261)
Thus, under P x , we can write LyT = ξ1 ξ2 ,
(3.262)
where ξ1 and ξ2 are independent random variables, ξ1 = 1 with probability uT (x, y)/uT (y, y) and ξ1 = 0 otherwise, and ξ2 is an exponential random variable with mean uT (y, y). The same result holds for LyτA (λ) under Px (see (3.76)), if uT is replaced by uτ (λ) . Similarly, by (3.252), x under P x,0 , if uT is replaced by u the same result holds for L in (3.251). ∞
3.11 Notes and references The standard references on right processes are Sharpe (1988), Getoor (1975), Dellacherie and Meyer (1987) and Dellacherie, Maisonneuve and Meyer (1992). Other classics on the theory of Markov processes are Rogers and Williams (2000b), Chung (1982), and Blumenthal and Getoor (1968). Borel right processes on locally compact Hausdorff spaces are the simplest right processes. Our restriction to strongly symmetric processes is due to the fact that our main tools in the study of local times are isomorphism theorems that relate the local times of a Markov process to an associated Gaussian process. The association is that the covariance of the Gaussian process, which is necessarily symmetric, is the same as the potential density of the Markov process. Strongly symmetric Markov processes are treated in Fukushima, Oshima and Takeda (1994). Le Jan (1988) has an isomorphism theorem for nonsymmetric Markov processes that
120
Markov processes and local times
involves complex-valued measures. We have not been able to use his results to study local times of nonsymmetric processes. Beginning in Section 3.4 we assume that the potential densities are continuous. This is a very strong condition, which allows us to simplify many arguments concerning Markov processes. The justification for this assumption is that we are interested in processes with jointly continuous local times, and, as explained in Section 4.6, if a strongly symmetric Borel right process with finite potential densities has a jointly continuous local time, then the potential densities are continuous. We mention in Section 4.4 that a strongly symmetric Borel right process with continuous potential densities is in fact a Hunt process. We could have considered only Hunt processes from the onset, but instead we chose to work with Borel right processes, both to clarify the generality of our results and because we never explicitly use quasi left continuity. There is an extensive literature on the continuity of local times of Markov processes, beginning with the celebrated result of Trotter (1958) on the joint continuity of the local time of Brownian motion and the results of McKean (1962) and Ray (1963), which give the exact uniform modulus of continuity for the local times of Brownian motion. Boylan (1964) found a sufficient condition for the joint continuity of the local time for a wide class of Markov processes that inspired considerable efforts to obtain necessary and sufficient conditions. A variation of Boylan’s result is given in Blumenthal and Getoor (1968, Section 3, Chapter V) following an approach of Meyer (1966). Further improvements are given in Getoor and Kesten (1972) and Millar and Tran (1974). After a long hiatus, M. Barlow and J. Hawkes obtained necessary and sufficient conditions for the joint continuity of local times of L´evy processes. Their work: Barlow (1985), Hawkes (1985), Barlow and Hawkes (1985), and Barlow (1988) inspired our own, as we say in Chapter 1. Our treatment of local times in Section 3.6 is simplified by the assumption that the Markov processes are strongly symmetric Borel right processes with continuous potential densities. Theorem 3.7.3 is the core of our argument that gives a sufficient condition for the joint continuity local times, a condition that we also show is necessary in Chapter 9. Gaussian processes are not mentioned in this theorem. They come in later in the isomorphism theorems in Chapter 8. Theorem 3.7.3 combines several theorems in our paper Marcus and Rosen (1992d) , but with considerable simplifications due the simpler, more efficient isomorphism theorems that are now available to us. Kac’s formula is discussed in great generality in Fitzsimmons and Pitman (1999).
4 Constructing Markov processes
So far in this book, we have simply assumed that we are given a strongly symmetric Borel right process with continuous α-potential densities uα (x, y), α > 0 and also u(x, y) when the 0-potential exists. In general, constructing such processes is not trivial. However, given additional conditions on transition semigroups or potentials, we can construct special classes of Borel right processes. In this chapter we show how to construct Feller and L´evy processes. (For references to the general question of establishing the existence of Borel right processes, see Section 3.11.) In Sections 4.7–4.8, we show how to construct certain strongly symmetric right continuous processes with continuous α-potential densities that generalize the notion of Borel right processes and are used in Chapter 13. In Sections 4.4–4.5 we present certain material, on quasi left continuity and killing at a terminal time, which is of interest in its own right and is needed for Sections 4.7–4.10. In Section 4.6 we tie up a loose end by showing that if a strongly symmetric Borel right process has a jointly continuous local time, then the potential densities {uα (x, y), (x, y) ∈ S × S} are continuous. In Section 4.10 we present an extension theorem of general interest which is needed for Chapter 13.
4.1 Feller processes A Feller process is a Borel right process with transition semigroup {Pt ; t ≥ 0} such that, for each t ≥ 0, Pt : C0 (S) → C0 (S). Such a semigroup is called a Feller semigroup. We consider C0 (S) as a Banach space in the uniform or sup norm, that is, f = supx∈S |f (x)|. (Sometimes, to avoid confusion, we use · ∞ to denote the sup norm.) 121
122
Constructing Markov processes
A family {Pt ; t ≥ 0} of bounded linear operators on C0 (S) (that is, Pt : C0 (S) → C0 (S)) is called a strongly continuous contraction semigroup if (1) Pt Ps = Pt+s , ∀s, t ≥ 0 (2) Pt f ≤ f , ∀f ∈ C0 (S) and ∀t ≥ 0 (3) limt→0 Pt f − f = 0, ∀f ∈ C0 (S). If, in addition, for each t ≥ 0, Pt is a positive operator, that is, Pt f ≥ 0 when f ≥ 0, then each Pt can be associated with a sub-Markov kernel on S, which we also denote by Pt . {Pt ; t ≥ 0} is clearly a Borel semigroup. Theorem 4.1.1 Let S be a locally compact space with a countable base, and let {Pt ; t ≥ 0} be a strongly continuous contraction semigroup of positive linear operators on C0 (S). Then we can construct a Feller process X with transition semigroup {Pt ; t ≥ 0}. Proof As we did in Section 3.1, we extend {Pt ; t ≥ 0} to be a Borel semigroup of Markov kernels on the compact set S∆ . Since any function f ∈ Cb (S∆ ) can be written as the sum of a constant plus a function in Cb (S∆ ) that is zero at ∆, it is easy to check that the extended {Pt ; t ≥ 0} is a strongly continuous contraction semigroup on Cb (S∆ ) (recall that S∆ is compact, so that C0 (S∆ ) = Cb (S∆ )). We now construct a Borel right process with transition semigroup R {Pt ; t ≥ 0}. Let x ∈ S∆ . We first construct a probability Px on S∆+ , the space of S∆ -valued functions {f (t), t ∈ [0, ∞)} equipped with the Borel R product σ-algebra B(S∆+ ). Let Xt be the natural evaluation Xt (f ) = f (t). We define Px on sets of the form {Xt1 ∈ A1 , . . . , Xtn ∈ An } for all Borel measurable sets A1 , . . . , An in S∆ and 0 = t0 < t1 < · · · < tn by setting n n Px (Xt1 ∈ A1 , . . . , Xtn ∈ An ) = IA1 (zi ) Pti −ti−1 (zi−1 , dzi ), i=1
i=1
(4.1) where z0 = x. Here IAi is the indicator function of Ai . It follows from the semigroup property of {Pt ; t ≥ 0} that this construction is consistent. Therefore, by the Kolmogorov Construction Theorem, we R can extend Px to S∆+ . Let Ft0 = σ(Xs ; s ≤ t). As in the second proof of (2.31), it follows from (4.1) that f (Xt+s ) | Ft0 = Ps f (Xt ) E (4.2)
4.1 Feller processes
123
for all s, t ≥ 0 and f ∈ Bb (S∆ ). Thus, {Xt ; t ≥ 0} is a simple Markov process with respect to the filtration Ft0 = σ(Xs ; s ≤ t). As usual we set ∞ Uλ = e−λt Pt dt ∀ λ > 0. (4.3) 0
Using the fact that the {Pt ; t ≥ 0} are contractions on Cb (S∆ ), we also have U λ : Cb (S∆ ) → Cb (S∆ ) and λU λ ≤ 1 for each λ > 0. Furthermore, the fact that limt→0 Pt f − f = 0 for f ∈ Cb (S∆ ) and ∞ ∞ e−λt Pt f (x) dt = e−t Pt/λ f (x) dt (4.4) λU λ f (x) = λ 0
0
implies that lim λU λ f − f = 0
λ→∞
∀ f ∈ Cb (S∆ ).
(4.5)
We now show that {Xt ; t ≥ 0} has a right continuous modification. We begin by noting that for any α > 0 and f ∈ Cb+ (S∆ ), e−αt U α f (Xt ) is a Ft0 supermartingale for any Px . To see this note that by the simple Markov property, for any s < t, x e−αt U α f (Xt ) | Fs0 = e−αt Pt−s U α f (Xs ), (4.6) E and it is easily checked using (4.3) that Pt−s U α f ≤ eα(t−s) U α f . Since S∆ is compact we can find a sequence of functions F = {fn } with each fn ∈ Cb+ (S∆ ) that separate points of S∆ , that is, for each pair x, y in S∆ with x = y there exists an fn ∈ F such that fn (x) = fn (y). Since limα→∞ αU α fn = fn , we see that the countable collection G := {U α fn ; α, n = 1, 2, . . .} also separates points. Let Q be a countable dense set in R+ . By the convergence theorem for supermartingales, Revuz and Yor (1991, Chapter II, Theorem 2.5), g(Xt ) has right-hand limits along Q for all g ∈ G, almost surely. We now show that almost surely t → Xt has right-hand limits in S∆ along Q. Since S∆ is compact, it suffices to show that whenever {sn } and {s n } are sequences in Q such that sn ↓ t, s n ↓ t, limn→∞ Xsn = y, and limn→∞ Xsn = y , then y = y . However, for any g ∈ G, the continuity of g implies that limn→∞ g(Xsn ) = g(y) and limn→∞ g(Xsn ) = g(y ). Furthermore, since g(Xt ) has right-hand limits along Q, we must have g(y) = g(y ). Since G separates points, we see that y = y .
t (ω) = For all ω for which the following limit exists for all t, set X lims↓t,s∈Q Xs , and for those ω for which it does not exist for all t, set
· (ω) = ∆. Note that {X
t ; t ≥ 0} is right continuous. We claim X
124
Constructing Markov processes
t = Xt almost surely. To see this note that, for any that for each t, X f, g ∈ Cb (S∆ ), x (f (Xt )g(Xs )) x f (Xt )g(X
t ) E = lim E (4.7) s↓t,s∈Q
=
x (f (Xt )Ps−t g(Xt )) lim E
s↓t,s∈Q
x (f (Xt )g(Xt )) = E since lims↓t Ps−t g = g uniformly. Since this holds for all f, g ∈ Cb (S∆ ), the Monotone Class Theorem (see, e.g., Rogers and Williams (2000b, II.3.1)) shows that
t ) = E x (h(Xt , Xt )) , x h(Xt , X (4.8) E for all bounded Borel measurable functions h on S∆ × S∆ . In particular
t = Xt this holds for h(y, z) = I{(y,z) | y=z} . This shows that for each t, X almost surely. Let Ω be the space of right continuous S∆ -valued functions {ω (t), t ∈ 0 [0, ∞)}. Let Xt be the natural evaluation Xt (ω ) = ω (t), and let F 0 and F t be the σ-algebras generated by {Xs , s ∈ [0, ∞)} and {Xs , s ∈ : (S R+ , F 0 ) → (Ω , F 0 ) defined for [0, t]}, respectively. The map X ∆ R+
t (ω) induces a map of the probabilities Px on =X ω ∈ S∆ by X(ω)(t) R 0 (S∆+ , F 0 ) to probabilities P x on (Ω , F ). It is easy to check that X = 0 0 x (Ω , F , F t , Xt , θt , P ) is a right continuous simple Markov process with transition semigroup {Pt ; t ≥ 0}. Since U α : Cb (S∆ ) → Cb (S∆ ) for every α > 0, it is clear that X satisfies the second condition for a Borel right process. It then follows from Lemma 3.2.3 that we can find a Borel right process with transition semigroup {Pt ; t ≥ 0}. Remark 4.1.2 (1) In Theorem 4.1.1 we construct a Feller process with state space S∆ . Assume that the original family {Pt ; t ≥ 0} is a Markov semigroup on S. It follows immediately that, for each x ∈ S, P x (Xt ∈ S, ∀t ∈ Q) = 1, which implies that P x (ζ < ∞) = 0 (see Lemma 3.2.7). Thus, if the process starts in S, almost surely it never reaches ∆. Therefore, we can drop the cemetery state ∆ so that the corresponding Feller process has state space S. (2) The paths of the Feller process {Xt , t ≥ 0} constructed in Theorem 4.1.1 have left limits in S∆ almost surely. This follows from the proof of Theorem 4.1.1. As in (4.6), we see that, for
4.1 Feller processes
125
any α > 0 and f ∈ Cb+ (S∆ ), e−αt U α f (Xt ) is a right continuous supermartingale. It is well known that a right continuous supermartingale has left limits for all t ∈ [0, ∞) almost surely; see Revuz and Yor (1991, Chapter II, Theorem 2.8). Let G be as defined in the paragraph following (4.6). Following the argument of that paragraph, we can show that, almost surely, t → Xt has left-hand limits. As a simple consequence of Theorem 4.1.1 and the Hille–Yosida Theorem (Theorem 14.4.1) we can give conditions for the construction of Feller processes in terms of potential operators. Let X = (Ω, G, Gt , Xt , θt , P x ) be a Borel right process in a locally compact space S with continuous α-potential densities uα (x, y), α > 0. Recall the definition given in (3.14) of the potential operators U α on Cb (S). It is clear that U α is a positive operator. Also, we show in Lemma 3.4.1 that U α : C0 (S) → Cb (S), and, using the fact that Pt f ≤ f for each t ≥ 0, it is easy to see that we have the contraction property αU α f ≤ f
∀ f ∈ C0 (S)
(4.9)
(indeed, this holds for all f ∈ Cb (S)). In Lemma 3.1.3 we show that the family {U α ; α > 0} satisfies the resolvent equation U λ − U µ = (µ − λ)U λ U µ
∀ λ, µ > 0.
(4.10)
A family {U λ ; λ > 0} of operators on a Banach space (B, | · |) that satisfies (4.9) and (4.10) is called a contraction resolvent on B (we use the notation | · | to emphasize that we are referring to an arbitrary norm, not necessarily the sup norm). Note that for general Borel right processes with continuous α-potential densities, we do not have U α : C0 (S) → C0 (S). However, by (3.16), we do have lim λU λ f (x) = f (x)
λ→∞
∀ f ∈ C0 (S) and x ∈ S.
(4.11)
Theorem 4.1.3 Let S be a locally compact space with a countable base, and let {U λ ; λ > 0} be a family of positive linear operators on C0 (S) that is a contraction resolvent on C0 (S) and in addition satisfies (4.11). Then we can construct a Feller process X with potential operators {U λ ; λ > 0}. Proof By Theorems 14.4.2 and 14.4.1, we can find a strongly contin-
126
Constructing Markov processes
uous contraction semigroup {Pt ; t ≥ 0} on C0 (S) with ∞ λ e−λt Pt dt ∀λ > 0. U =
(4.12)
0
The construction of Pt in the proof of Theorem 14.4.1 and the fact that {U λ ; λ > 0} are positive operators implies that each Pt is a positive operator on C0 (S). The theorem now follows from Theorem 4.1.1. Corollary 4.1.4 The following are equivalent: (1) X is a Feller process. (2) X is a Borel right process with potentials U α with the property that U α : C0 (S) → C0 (S) for all α > 0. (3) X is a Borel right process with transition semigroup {Pt ; t ≥ 0}, which is a strongly continuous contraction semigroup on C0 (S). Proof Let {Pt ; t ≥ 0} be the transition semigroup for X. To show that (1) implies (2) we use the fact that {Pt ; t ≥ 0} are contractions from C0 (S) → C0 (S) along with the Dominated Convergence Theorem in (4.12) to see that U α : C0 (S) → C0 (S) for all α > 0. To show that (2) implies (3), we first note that X has a semigroup, say {Pt ; t ≥ 0}, associated with it, and the potential operators {Uα , α > 0} are related to {Pt ; t ≥ 0} by (4.12). Also, as we show in the discussion preceding Theorem 4.1.3, {U λ ; λ > 0} is a contraction resolvent on C0 (S) satisfying (4.11). Therefore, it follows from the proof of Theorem 4.1.3 that there exists a strongly continuous contraction semigroup, say {Pt ; t ≥ 0}, such that Pt : C0 (S) → C0 (S), and {Uα , α > 0} are also related to {Pt ; t ≥ 0} by (4.12) (with Pt replaced by Pt ). It follows from the uniqueness of the Laplace transform that the semigroups {Pt ; t ≥ 0} and {Pt ; t ≥ 0} are the same. Thus (2) implies (3). That (3) implies (1) follows from the definition of Feller processes. Remark 4.1.5 Let X be a Borel right process with continuous αpotential densities. As mentioned in the discussion preceding Theorem 4.1.3, we show in Lemma 3.4.1 that U α : C0 (S) → Cb (S). If S is compact, Cb (S) = C0 (S). Hence it follows from Corollary 4.1.4 that a Borel right process with continuous α-potential densities and compact state space is a Feller process. Theorem 4.1.3 gives conditions for a contraction resolvent on C0 (S) to be the potential operators of a Feller process. We next present a theorem of Hunt that provides conditions for a bounded operator on C0 (S) to be the 0-potential operator of a Feller process.
4.1 Feller processes
127
Let G : C0 (S) → C0 (S) be a positive operator. We say that G satisfies the positive maximum principle on C0 (S) if, for all h ∈ C0 (S) for which supx Gh(x) > 0, sup {x:h(x)>0}
Gh(x) ≤ a ⇒ sup Gh(x) ≤ a.
(4.13)
x∈S
Theorem 4.1.6 (Hunt’s Theorem) Let S be a locally compact space with a countable base and let G : C0 (S) → C0 (S) be a positive, bounded linear operator. The following are equivalent: (1) G is the 0-potential operator of a Feller process on S. (2) G satisfies the positive maximum principle on C0 (S), and the image of C0 (S) under G is dense in C0 (S) in the uniform norm. Proof (1) ⇒ (2) Assume that there exists a Feller process X on S with potential operators {U λ , λ ≥ 0} such that U 0 = G. Let f ∈ C0 (S). It follows from Corollary 4.1.4 that the transition semigroup of a Feller process is a strongly continuous contraction semigroup on C0 (S). Consequently, we see by (4.5) that lim αU α f − f ∞ = 0
α→∞
∀ f ∈ C0 (S).
(4.14)
Thus it suffices to show that any function of the form U α f is in the image of C0 (S) under U 0 = G. This follows from the resolvent equation, Lemma 3.1.3, which can be written as U α f (x) = U 0 f (x) − αU 0 U α f (x) = U 0 (f − αU α f ) (x).
(4.15)
We now show that G satisfies the positive maximum principle. Let h ∈ C0 (S), with supx Gh(x) > 0, and assume that for some a > 0, sup {x:h(x)>0}
Gh(x) ≤ a.
(4.16)
Note that since G is a positive operator, the condition supx Gh(x) > 0 ensures that the set A := {x : h(x) > 0} = ∅. Let HA f (x) := E x (f (XTA ) ; TA < ∞) and let x ∈ Ac . Using the strong Markov property we see that ∞ Gh(x) = E x h(Xt ) dt (4.17) 0 TA
= Ex = E
+ Ex
h(Xt ) dt 0
TA
x
h(Xt ) dt 0
∞
h(Xt ) dt TA
+ HA Gh(x).
128
Constructing Markov processes
The integral in the last line of (4.17) is less than or equal to zero because h(Xt ) ≤ 0 for t ∈ [0, TA ). Therefore, ∀x ∈ Ac .
Gh(x) ≤ HA Gh(x)
(4.18)
We now note that HA Gh(x) = E x (Gh(XTA ) ; TA < ∞) ≤ a.
(4.19)
To see this we first observe that Gh(y) ≤ a for all y ∈ A by (4.16) and the continuity of Gh, and then use the fact that XTA ∈ A when TA < ∞, by the right continuity of X. Using (4.19) and (4.18), we see that Gh(x) ≤ a
∀x ∈ Ac .
(4.20)
This shows that G satisfies the positive maximum principle. (2) ⇒ (1) In Lemma 4.1.9, below, we construct a family {U λ , λ ≥ 0} of positive linear operators on C0 (S) → C0 (S), with U 0 = G, satisfying λU λ ≤ 1
λ > 0,
(4.21)
where U λ denotes the operator norm of U λ as an operator on C0 (S), that is, U λ = sup{f ∈C0 (S),f ∞ ≤1} U λ f ∞ , and U r − U s = (s − r)U s U r
∀ r, s ≥ 0.
(4.22)
We refer to such a family {U λ , λ ≥ 0} as a contraction resolvent. This is the natural extension of the concept of contraction resolvent, introduced after (4.10), to the index set λ ≥ 0. Taking r = 0 in (4.22) and using (4.21), we see that for any f ∈ C0 (S), lim sU s Gf = Gf.
s→∞
(4.23)
Using (4.21) and the hypothesis that the image of C0 (S) under G is dense in C0 (S) in the uniform norm, it follows from (4.23) that, for any f ∈ C0 (S), lim sU s f = f.
s→∞
(4.24)
The theorem now follows by Theorem 4.1.3. To motivate the rather abstract construction of {U λ , λ ≥ 0}, we note that (4.22) shows that U p (1 + pG) = G.
(4.25)
Thus, if (1 + pG) is invertible, we would have U p = (1 + pG)−1 G
(4.26)
4.1 Feller processes
129
for p > 0. This suggests that when we are given only G and want to define {U λ , λ ≥ 0} with U 0 = G, we should use the right-hand side of (4.26). This is what is done in (4.38). The next two lemmas are used in the proof of Lemma 4.1.9. Lemma 4.1.7 Let S be a locally compact space with a countable base and let G : C0 (S) → C0 (S) be a positive, bounded linear operator that satisfies the positive maximum principle on C0 (S). Then (I + pG) is one–one. Proof It follows from the hypothesis that G satisfies the positive maximum principle on C0 (S) that for any p > 0 and f ∈ C0 (S), (I + pG)f ∞ ≥ pGf ∞ . To see this, note first that for any f ∈ C0 (S), 1 I + G f (x) sup Gf (x) ≤ sup {x:f (x)≥0} {x:f (x)≥0} p ' ' 1 ' ' ≤ ' I + G f' . p ∞
(4.27)
(4.28)
Therefore, since G satisfies the positive maximum principle on C0 (S), ' ' 1 ' ' (4.29) I + G f' . sup Gf (x) ≤ ' p ∞ x∈S Using (4.28) with f replaced by −f we see that (4.29) also holds with f replaced by −f . Since Gf ∞ = max (sup Gf (x), sup G(−f )(x)), x
(4.30)
x
we get (4.27). If (I + pG)f ∞ = 0, by (4.27) we also have pGf ∞ = 0. Consequently, f = (I + pG)f − pGf ≡ 0. Thus (I + pG) is one–one.
Lemma 4.1.8 Under the hypotheses of Lemma 4.1.7, let W be a bounded operator that commutes with G and satisfies G − W = pGW
or, equivalently,
(I + pG)W = G.
(4.31)
Then W is a positive operator and W ≤ 1/p.
(4.32)
130 Proof
Constructing Markov processes By straightforward manipulations (4.31) implies that (I + pG)(I − pW ) = I
G(I − pW ) = W.
and
(4.33)
Since G satisfies the positive maximum principle on C0 (S), it follows from the first line of (4.28) that 1 sup Gf (x) ≤ sup I + G f (x). (4.34) p x x Using this with f = (I − pW )g and (4.33), we see that sup pW g(x) ≤ sup g(x). x
(4.35)
x
Therefore, if g ≤ 0, the same is true for W g. This shows that W is a positive operator. We then have pW g∞ ≤ pW |g|∞ ≤ g∞ , so that (4.32) holds. We say that a positive linear operator H is symmetric with respect to m if g(x) Hf (x) dm(x) =
Hg(x) f (x) dm(x)
(4.36)
for all nonnegative functions f, g ∈ B(S), whenever both sides exist. Lemma 4.1.9 Let S be a locally compact space with a countable base, and let G : C0 (S) → C0 (S) be a positive, bounded linear operator that satisfies the positive maximum principle on C0 (S). Then there is a unique contraction resolvent {U λ , λ ≥ 0} on C0 (S), with U 0 = G. Furthermore, when G is symmetric, so is {U λ , λ ≥ 0}. Proof We construct the operators {U λ , λ ≥ 0}. For any < 1/G, I + G is invertible. In fact, (I + G)−1 =
∞
(−1)j j Gj .
(4.37)
j=0
To see this, note that the sum on the right-hand side converges in the operator norm so that when multiplied by I + G the sum converges to the identity. It follows from (4.37) that G and all the operators (I + G)−1 , for < 1/G, commute with each other. For p < 1/G, set U p = (I + pG)−1 G = G(I + pG)−1 .
(4.38)
It follows from this that for all p < 1/G, G − U p = pGU p = pU p G.
(4.39)
4.1 Feller processes
131
Therefore, U p has the properties of the operator W in Lemma 4.1.8. In particular, U p is a positive linear operator with U p ≤ 1/p.
(4.40)
We extend the definition of U p to all p ≥ 0 by induction. Assume that for some integer k and all p < k/G we have constructed operators U p satisfying (4.39) and (4.40). Fix a p < k/G. We repeat the procedure in the last paragraph starting from (4.37) but with G replaced by U p . For any < 1/U p , I + U p is invertible and all the operators G, U p and (I + U p )−1 commute with each other. For < 1/U p we mimic (4.38) and set U p, = (I + U p )−1 U p = U p (I + U p )−1 .
(4.41)
U p − U p, = U p U p, = U p, U p ,
(4.42)
pG(U p − U p, ) = pGU p U p, = (G − U p )U p,
(4.43)
Thus
so that by (4.39)
or, equivalently, pGU p + U p U p, = (p + )GU p, .
(4.44)
Using (4.39) and (4.42) again we find that G − U p, = (p + )GU p,
(4.45)
(I + (p + )G)U p, = G.
(4.46)
or, equivalently, that
Note that U p, and G commute, so that U p, also has the properties of the operator W in Lemma 4.1.8. Therefore, in particular, U p, is a positive linear operator with U p, ≤
1 . p+
(4.47)
How large can p+ be? We can take any p < k/G and any < 1/U p . Since U p ≤ 1/p, this implies that we can take any < p. Thus we can choose and p to achieve any number p + < 2k. We now define U p+ = U p, . However, we must check for consistency since U p+ already exists for p + < k/G. In this case, by (4.39), we have (I + (p + )G)U p+ = G,
(4.48)
132
Constructing Markov processes
and since I + (p + )G is one–one, we see by (4.46) and (4.48) that U p+ = U p, . Thus, we have constructed positive operators U p satisfying (4.39) and (4.40) for all p < 2k/G. We can proceed in this manner to define positive operators U p satisfying (4.39) and (4.40) for all p ≥ 0. By repeatedly using (4.39), which now holds for all p ≥ 0, we see that for any r, s ≥ 0, (I + rG)(U r − U s )
= G − (I + rG)U s =
(s − r)GU
=
(I + rG)(s − r)U r U s .
(4.49)
s
Since, by Lemma 4.1.7, I + rG is one–one, we obtain (4.22). Uniqueness follows from (4.48) and the fact that I+(p+)G is one–one. And, lastly, it follows from (4.37) and (4.41) that when G is symmetric, so is {U λ , λ ≥ 0}. Remark 4.1.10 If one reexamines the proof of Lemma 4.1.9, one sees that the only property of C0 (S) that is used is the fact that C0 (S) is a Banach space of functions on S. An almost identical proof shows that if G : Cb (S) → Cb (S) is a positive, bounded linear operator that satisfies the positive maximum principle on Cb (S), then we can construct a contraction resolvent {U λ ; λ ≥ 0} on Cb (S) with U 0 = G. Furthermore, when G is symmetric, so is {U λ , λ ≥ 0}. Remark 4.1.11 Let X and X be strongly symmetric Borel right processes on compact state spaces S and S , respectively, with continuous 0-potential densities u(x, y) and u (x, y). Assume that S ⊆ S and u (x, y) = u(x, y) on S × S . Let t = Xτ , X t where τ· is the right continuous inverse of t At = 1S (Xs ) ds.
(4.50)
(4.51)
0
t ; t ≥ 0} is right continuous with values in S . We claim Note that {X that t ; t ≥ 0} = {Xt ; t ≥ 0}. {X law
(4.52)
To see this, first note that it follows from the fact that τt+s = τt + τs ◦ θτt for any s, t and the strong Markov property for Xt that for any
4.1 Feller processes x ∈ S
t+s ) | Fτ E x h(X t
= E x h(Xτt+s ) | Fτt
133
(4.53)
= E (h(Xτs ) ◦ θτt | Fτt ) t h(X s ) = E Xτt (h(Xτ )) = E X x
s
for any h ∈ B(S ). Let
t ) . Pt h(x) := E x h(X
(4.54)
Taking the expectation E x of the first and last equations in (4.53) = we see that {Pt ; t ≥ 0} is a Borel semigroup on B(S ). Thus, X (Ω, Fτ∞ , Fτt , Xt , θτt ) is a right continuous simple Markov process with 0 denote the 0-potential operatransition semigroup {Pt ; t ≥ 0}. Let U tor for X. Let f ∈ C(S ) and x ∈ S . As in the proof of (2.125) (with T = ∞): ∞ 0 x f (Xτt ) dt (4.55) U f (x) = E 0 ∞ = Ex f (Xs ) dAs 0 ∞ x f (Xs )1S (Xs ) ds (4.56) = E 0 = U 0 f (x) = u(x, y)f (y) dy. is a right continuous simple Markov process in S with We see that X a symmetric, continuous 0-potential density, {u(x, y) ; x, y ∈ S }. This and X have the same 0-potential operators. We now use shows that X Lemma 4.1.9 to extend these to potential operators on S . These are the same by the uniqueness property in Lemma 4.1.9. It follows by (3.15) and the uniqueness of the Laplace transform that X and X have that same transition semigroup. Therefore we have (4.52). Remark 4.1.12 Not every Feller process has a finite 0-potential operator, and even when it does, its 0-potential operator may not be a bounded operator on C0 (S). Brownian motion is an example of the first assertion. An example of the second is given by Brownian motion killed the first time it hits 0. In that case, ∞ (x ∧ y)f (y) dy. (4.57) U 0 f (x) = 2 0
Constructing Markov processes
134
On the other hand, every Feller process has a 1-potential operator that is bounded as an operator on C0 (S). The next corollary deals with an important special case of this phenomenon. Corollary 4.1.13 Let S be a locally compact space with a countable base and let G : C0 (S) → C0 (S) be a positive linear operator with G1(x) = 1 for all x ∈ S. The following are equivalent: (1) G is the 1-potential operator of a Feller process on S. (2) G satisfies the positive maximum principle on C0 (S) and the image of C0 (S) under G is dense in C0 (S) in the uniform norm. Proof Since G : C0 (S) → C0 (S) is a positive linear operator with G1(x) = 1 for all x ∈ S, it follows that for any f ∈ C0 (S), Gf (x) ≤ Gf ∞ (x) = f ∞ .
(4.58)
This also holds with f replaced by −f , so we have G ≤ 1. Since the 1-potential operator of a Feller process X on S is the 0potential operator of the Feller process obtained by killing X at an independent exponential time with mean one, it follows from Theorem 4.1.6 that (1) implies (2). To see that (2) implies (1), we first use Theorem 4.1.6 to see that G is the 0-potential operator of a Feller process X on S. The resolvent equation (3.18) together with the fact that G1(x) = 1 for all x ∈ S shows that U α 1(x) = (1 + α)−1
(4.59)
for all x ∈ S and all α ≥ 0. Let {Pt ; t ≥ 0} be the strongly continuous contraction semigroup of positive linear operators on C0 (S) associated with the Feller process X. We define the linear operators {P t = et Pt ; t ≥ 0}. It is clear that {P t ; t ≥ 0} satisfies conditions (1) and (3) in the definition of a strongly continuous contraction semigroup of positive linear operators on C0 (S) on page 122. We show that it also satisfies condition (2). For any α ≥ 0, ∞ ∞ e−(1+α)t P t 1(x) dt = e−αt Pt 1(x) dt = U α 1(x) = (1 + α)−1 . 0
0
(4.60) By (3.15) and the uniqueness of the Laplace transform, this implies that for each x ∈ S, P t 1(x) = 1
for almost all t.
(4.61)
4.2 L´evy processes
135
Assume that for some x and t we have P t 1(x) > 1. Then we could find some f ∈ C0 (S) with 0 ≤ f ≤ 1 and P t f (x) > 1. However, by (4.61) we can find a sequence tn ↓ t with P tn 1(x) = 1 and consequently, P tn f (x) ≤ 1 for all n. This implies, by condition (3) on page 122, that P t f (x) ≤ 1. This contradiction shows that condition (2) on page 122 is satisfied. Thus we see that {P t ; t ≥ 0} is a strongly continuous contraction semigroup of positive linear operators on C0 (S). By Theorem 4.1.1 we can construct a Feller process X with transition λ semigroup {P t ; t ≥ 0}. If {U ; λ > 0} are the potential operators for X, we have ∞ ∞ 1 U f (x) = e−t P t f (x) dt = Pt f (x) dt = U 0 f (x) = Gf (x). 0
0
(4.62)
4.2 L´ evy processes Symmetric L´evy processes in R1 give us a rich class of examples of Borel right processes for which we can give explicit examples of many general results about local times. We define an Rn -valued L´evy process, starting at 0, to be a stochastic process X = {Xt ; t ∈ R+ } that satisfies the following two properties: (1) X has stationary and independent increments. (2) t → Xt is right continuous and has limits from the left. Let φt (λ) denote the characteristic function of Xt . It is easy to see that for all u, v rational, φu+v (λ) = φu (λ)φv (λ). It then follows that Eei(λ·Xt ) = e−tψ(λ)
∀t ≥ 0
(4.63)
for some continuous function ψ with Re ψ ≥ 0. The function ψ is called the L´evy exponent of X. A stochastic process Z = {Z(t); t ∈ S} is said to be infinitely divisible, if for each n there exists a stochastic process Zn = {Zn (t); t ∈ S} such that n law Zn,i , (4.64) Z = i=1
where Zn,i , i = 1, . . . , n are independent copies of Zn . It is easy to see that X is infinitely divisible and for each n the independent random processes for which (4.64) holds are those determined by (4.63) with
136
Constructing Markov processes
characteristic function exp(−tψ(λ)/n). Obviously, X1 is an infinitely divisible random variable. Infinitely divisible random variables are characterized by the L´evy– Khintchine Theorem, which states that ψ(λ) is of the form ψ(λ) = i(a · λ) + Q(λ) + 1 − ei(λ·x) + i(λ · x)1|x|<1 ν(dx), (4.65) where a ∈ Rn , Q(λ) is a positive semidefinite quadratic form and ν is a measure on Rn − {0} satisfying (1 ∧ |x|2 ) ν(dx) < ∞. (4.66) In the next theorem, starting with the L´evy–Khintchine Theorem, we use the material in Section 4.1 to prove that for all functions ψ satisfying (4.65)–(4.66) there exists a L´evy process with L´evy exponent ψ(λ). Theorem 4.2.1 Let ψ(λ), λ ∈ Rn be a function that satisfies (4.65)– (4.66). Then there exists a L´evy process with L´evy exponent ψ(λ). Proof The L´evy–Khintchine Theorem implies that for each t ≥ 0, there exists an Rn -valued random variable Zt with characteristic function exp(−tψ(λ)). Let µt be the probability measure on Rn induced by t (λ) = exp(−tψ(λ)) and µ t µ s = µ t+s . Consequently, Zt . It follows that µ µt ∗ µs = µs+t for all s, t ≥ 0. Set Pt f (x) = f (x + y) dµt (y) f ∈ Cb (Rn ). (4.67) Rn
Pt is a positive operator on Cb (Rn ) satisfying Pt f ≤ f for each t ≥ 0, where · is the uniform norm. It also follows from the fact that µt ∗ µs = µs+t for all s, t ≥ 0 that {Pt ; t ≥ 0} is a Markov semigroup of operators on Cb (Rn ). In addition, using the fact that |e−tψ(λ) | ≤ 1, we see that for f ∈ C0∞ (Rn ), 1 Pt f (x) = (4.68) e−iλx e−tψ(λ) f(λ) dλ. 2π Rn It follows from the Riemann-Lebesgue Lemma that Pt : C0∞ (Rn ) → C0 (Rn ). Using the fact that Pt f ≤ f , we then see that Pt : C0 (Rn ) → C0 (Rn ). Furthermore, for f ∈ C0∞ (Rn ), we see that 1 Pt f − f ≤ |e−tψ(λ) − 1| |f(λ)| dλ, (4.69) 2π Rn so that by the Dominated Convergence Theorem limt→0 Pt f − f = 0
4.2 L´evy processes
137
for f ∈ C0∞ (Rn ). As above, using the fact that Pt f ≤ f we then see that limt→0 Pt f − f = 0 for all f ∈ C0 (Rn ). Thus, {Pt ; t ≥ 0} is a strongly continuous contraction semigroup of positive linear operators on C0 (Rn ). We can now use Theorem 4.1.1 and Remark 4.1.2 (1) to construct a Feller process {Xt , t ≥ 0} with transition semigroup {Pt ; t ≥ 0} and state space Rn . It is easy to check that this Feller process is a L´evy process. Condition (2) on page 135 follows from Remark 4.1.2 (2). Also, from the proof of Theorem 4.1.1, we see that for 0 = t0 < t1 < . . . < tn , n (Xt1 , . . . , Xtn ) has probability measure j=1 Ptj −tj−1 (zj−1 , dzj ), where z0 = 0. Note that, by definition, for all f ∈ Cb (Rn ), f (a + y)Pt (x, dy) = f (a + y + x)µt (dy). Therefore, for any functions f, g ∈ Cb (Rn ), E(g(Xt1 , . . . , Xtn−1 )f (Xtn − Xtn−1 )) n Ptj −tj−1 (zj−1 , dzj ) = g(z1 , . . . , zn−1 )f (zn − zn−1 )
g(z1 , . . . , zn−1 )
=
j=1
f (zn − zn−1 )Ptn −tn−1 (zn−1 , dzn ) n−1
=
g(z1 , . . . , zn−1 )
(4.70)
Ptj −tj−1 (zj−1 , dzj )
j=1
f (zn )µtn −tn−1 (dzn ) n−1
Ptj −tj−1 (zj−1 , dzj )
j=1
= E(g(Xt1 , . . . , Xtn−1 ))E(f (Xtn −tn−1 )). As in (2.11), in the proof of the existence of Brownian motion, this shows that {Xt , t ≥ 0} satisfies condition (1) for L´evy processes on page 135.
We now restrict our attention to the symmetric L´evy processes that satisfy the additional condition 1 ∈ L1 (dλ). 1 + ψ(λ)
(4.71)
(We point out later in this section why this condition is fundamental
138
Constructing Markov processes
in this book.) The symmetry of X implies that ψ(λ) is a positive realvalued even function. Using this in (4.65), one can easily show that ψ(λ) = O(|λ|2 ) as |λ| → ∞. Thus (4.71) can only hold on R1 . In this case (4.65) takes the form ∞ (1 − cos λx) ν(dx) (4.72) ψ(λ) = Cλ2 + 2 0
for some constant C and measure ν on (0, ∞) such that ∞ (1 ∧ x2 ) ν(dx) < ∞.
(4.73)
0
L´evy processes with L´evy exponents ψ that satisfy (4.71) have continuous transition probability density functions. To see this note first that condition (4.71) implies that exp(−tψ(λ)) ∈ L1 (dλ) for each t > 0 (since for each t > 0 there exists a constant C such that exp(−tx) ≤ C/(1 + x) for all x ≥ 0). Hence, for each t > 0, ∞ 1 e−iλx e−tψ(λ) dλ (4.74) pt (x) = 2π −∞ is continuous. It is easily checked that {pt (x, y) := pt (y − x), t > 0} are regular transition densities with respect to Lebesgue measure. Let {Xt ; t ∈ R+ } be a symmetric L´evy process. Substituting (4.72) in (4.63) we see that we can write Xt as the sum of two independent stochastic processes, one of which is a constant times standard Brownian motion and the other which has characteristic function exp(−tψ(λ)), where ∞ (1 − cos λx) ν(dx). (4.75) ψ(λ) =2 0
The conditions on the L´evy measure ν in (4.73) impose restraints on the growth of ψ. Lemma 4.2.2 ψ(λ) = o(λ2 )
as
λ→∞
(4.76)
∞ ψ(λ) ≥ (0.7) (x2 ∧ 1) ν(dx). lim λ→0 λ2 0
(4.77)
lim ψ(λ) = ∞.
(4.78)
λ→∞
4.2 L´evy processes Proof
139
By (4.75) we have for λ > 1 ψ(λ)
∞ =
sin2
4
λx ν(dx) 2
(4.79)
0
1/λ ≤ λ x2 ν(dx) + 4ν[1/λ, ∞). 2
0
The first term in the last line of (4.79) is obviously o(λ2 ) as λ → ∞. Therefore, it remains to show that limλ→∞ ν[1/λ, ∞) = o(λ2 ) or, equivalently, that lim→0 2 ν[, ∞) = 0. For any δ > 0 and < δ, δ δ 2 ν[, δ) ≤ x2 ν(dx) ≤ x2 ν(dx). (4.80) 0
Therefore, since lim→0 ν[δ, ∞) = 0, we see that δ lim 2 ν[, ∞) ≤ x2 ν(dx). 2
→0
(4.81)
0
Thus lim→0 2 ν[, ∞) = 0 and we get (4.76). Equation (4.77) is elementary. Since sin x > (0.84)x for x ∈ [0, 1], 2/λ 2 2 ψ(λ) > (0.84) λ x2 ν(dx), (4.82) 0
which gives (4.77). Equation (4.78)is an immediate consequence of the Riemann–Lebesgue Lemma since Xt has a probability density function (see (4.74)), and so e−tψ(λ) is the Fourier transform of a function in L1 . It follows from (4.76) that the class of L´evy processes with characteris tic function exp(−t ψ(λ)) does not include Brownian motion. Therefore, L´evy processes for which C = 0 in (4.72) are often referred to as L´evy processes without a Gaussian component. Remark 4.2.3 Although it is not needed in this book, it is interesting to note that L´evy processes without a Gaussian component are pure jump processes. When (4.71) holds the paths of these processes are not of bounded variation on bounded intervals of time, although the sum of the squares of their jumps on a bounded interval is finite. A necessary and sufficient condition for the paths to be of bounded variation, on bounded intervals of time, is that ∞ (1 ∧ x) ν(dx) < ∞. (4.83) 0
140
Constructing Markov processes
It follows from (4.74) that the α-potential density of X is given by 1 ∞ cos λ(x − y) uα (x, y) = dλ. (4.84) π 0 α + ψ(λ) It follows from (4.71) that uα (x, y) is continuous for all x, y ∈ R1 . We also write uα (y −x) = uα (x, y). Here we see the significance of condition (4.71). It is necessary and sufficient for the existence of a continuous αpotential density for X, for α > 0. Obviously, u(0, 0) exists if and only if (ψ(λ))−1 ∈ L1 (dλ). Theorem 4.2.4 Suppose that 0 is recurrent for X (equivalently, u(0) = ∞). Then P x (T0 < ∞) = 1
∀x ∈ R1
(4.85)
and uT0 (x, y) = φ(x) + φ(y) − φ(x − y), where φ(x) :=
1 π
∞
0
1 − cos λx dλ ψ(λ)
x ∈ R1 .
(4.86)
(4.87)
Proof We first note that φ(x) is bounded and continuous. This follows by breaking the integral into two pieces. The integral over [1, ∞] is controlled by (4.71), and the integral over [0, 1] is controlled by (4.77). Note that 1 ∞ 1 − cos λx α α u (0) − u (x) = dλ (4.88) π 0 α + ψ(λ) := φα (x). Since φ(x) is bounded and u(0) = ∞ (see Lemma 3.6.11), we see that uα (x) = 1. α→0 uα (0) lim
Equation (4.85) now follows from (3.107). It follows from (4.88) that 1 ∞ 1 − cos(λx) α α lim (u (0) − u (x)) = dλ α→0 π 0 ψ(λ) = φ(x). Using the Markov property at time T0 shows that ∞ uα (x − y) = E x e−αt dLyt 0
(4.89)
(4.90)
(4.91)
4.2 L´evy processes T0 y x −αt x = E e dLt + E 0
141 ∞
e−αt dLyt
T0
−αT0 α x = uα e u (y). T0 (x, y) + E It follows from (4.91) that α uα T0 (x, y) − uT0 (x, 0)
(4.92) −αT0
= u (x − y) − u (x) − {u (y) − u (0)}E (e α
α
α
x
−αT0
= φα (x) + φα (y)E (e
α
x
)
) − φα (x − y).
Using (4.85) we see that limα→0 E x (e−αT0 ) = 1. Also, clearly, uα T0 (x, y) (x, 0) tend to u (x, y) and u (x, 0) as α decreases to zero. and uα T T 0 0 T0 Therefore we can take the limits in (4.92) and use (4.90) and the fact that uT0 (x, 0) = 0 (see Lemma 3.8.1) to obtain (4.86). An infinitely divisible stochastic process Z is called stable if Zn in (4.64) is equal to cn Z for some cn > 0. It follows from (4.63) that for this to happen we must have ψ(cn λ) = ψ(λ)/n. It is easy to see that this is the case when ψ(λ) = |λ|p
0
(4.93)
and cn = n−1/p (in fact, except for multiplication by a constant, this is the only way this can happen; see, e.g., Feller (1971, XVII.4)). We call a L´evy process with characteristic exponent ψ given by (4.93) a canonical stable or canonical p-stable process. It is obvious from (4.72) that ψ(λ) = λ2 is possible. To obtain ψ(λ) = |λ|p for 0 < p < 2, take ν(dx) = (πCp+1 xp+1 )−1 dx and C = 0 in (4.72), where 2 ∞ 1 − cos s ds. (4.94) Cp = π 0 sp
Example 4.2.5 We define the canonical symmetric stable process on R1 , of index β, to be a L´evy process with L´evy exponent ψ(λ) = |λ|β . Then, for 1 < β ≤ 2, by Theorem 4.2.4, φ(x) = and
Cβ β−1 |x| 2
uT0 (x, y) = Cβ |x|β−1 + |y|β−1 − |x − y|β−1 .
(4.95)
(4.96)
Note that condition (4.71) requires that β > 1 (Cβ is given by (4.94)) with β = p.
Constructing Markov processes
142
Remark 4.2.6 Note that the symmetric√stable process of index 2 is standard Brownian motion multiplied by 2. Because of this, when we give results for stable processes, those for 2-stable processes differ, by the effects of the square root, from known results for standard Brownian motion. In order to compare results we obtain with others in the literature, it is useful to have numerical values for Cp . We can do this easily for C2 . For standard Brownian motion (4.87) is φ(x) =
1 π
∞
0
1 − cos λx dλ λ2 /2
x ∈ R1 .
(4.97)
√ √ By a change of variables this is equivalent to φ x/ 2 = 2C2 |x|. Therefore, by Theorem 4.2.4 for Brownian motion, φ(x) = C2 |x| and uT0 (x, y)
= C2 (|x| + |y| − |x − y|) =
(4.98)
2 C2 (|x| ∧ |y|) 1[xy≥0] .
It follows from Lemma 2.5.1 that C2 = 1. We state the results for 1 < p < 2 and give references. By Ibragamov and Linnik (1971, 2.6.32, page 88) Cp = −
π 2 cos (p − 1) Γ(1 − p). π 2
(4.99)
Since Γ(p)Γ(1 − p) =
π sin πp
(see, e.g., Ahlfors (1966, (30), page 198)) and π π cos (p − 1) = sin p 2 2
(4.100)
(4.101)
we can also write Cp =
Γ(p) sin
2 π
2 (p
. − 1)
(4.102)
Condition (4.71) implies that the symmetric L´evy process X has a continuous α-potential density for all α > 0 and hence, by Theorem 3.6.3, that X has local times. In fact, (4.71) is a necessary condition for (3.83), so it is fundamental for us (see Bertoin (1996, Theorem 19, II.5)).
4.2 L´evy processes
143
4.2.1 L´ evy processes on the torus We obtain a large and interesting class of symmetric L´evy processes taking values in a torus, T 1 . One aspect of these processes that interests us is that their associated Gaussian processes are random Fourier series with normal coefficients. These Gaussian processes are relatively easy to work with. Let X be a symmetric L´evy process on R1 with characteristic function given by (4.63) and transition probability density pt (x, 0)=pt (x). Let π : R1 → T 1 denote the natural projection, π(x) = x(mod 2π), onto T 1 . Consider Yt = π(Xt ).
(4.103)
It is easy to see that Y = {Yt , t ∈ R+ } satisfies conditions (1) and (2) in the definition of a L´evy process on page 135 (here we extend the definition of a L´evy process to stochastic processes X = {Xt , t ∈ R+ } with values in an Abelian group). Also, it is easy to check that the transition probability density function of Y is given by qt (x, y) = 2π
∞
pt (x − y + 2πj)
∀ x, y ∈ T 1
(4.104)
j=−∞
with respect to the normalized Lebesgue measure dx/(2π) on T 1 . As usual we write qt (x, y) = qt (x − y). It is clear from (4.104) that qt is symmetric. We have 2π 1 eijx qt (x) dx (4.105) EeijYt = 2π 0 2π ∞ = eijx pt (x + 2πj) dx 0
j=−∞ ∞
= −∞
eijx pt (x) dx = e−tψ(j) .
The distribution of Y on T 1 is determined by its characteristic sequence {exp(−tψ(j))}. Since qt (x) is the density function for Yt , qt (x) =
∞
eijx e−tψ(j) ,
(4.106)
j=−∞
and the α-potential of Y is given by u α (x)
=
∞
eijx α + ψ(j) j=−∞
(4.107)
144
Constructing Markov processes =
∞ 1 cos jx +2 α α + ψ(j) j=1
α (0) = ∞. Thus every point in for all α > 0. Note that u (0) = limα→0 u 1 T is recurrent for Y , which should not be a surprise. As in Theorem 4.2.4, − φ(x − y), + φ(y) u T0 (x, y) = φ(x)
(4.108)
where φ(x) =2
∞ 1 − cos jx j=1
ψ(j)
.
(4.109)
4.3 Diffusions Let I ⊂ R1 be an interval that can be infinite. For the purpose of studying local times of diffusions we define a diffusion to be a Borel right process with state space I that has continuous paths. A diffusion is called regular and without traps if P x (Ty < ∞) > 0
∀x, y ∈ I.
(4.110)
Let X be a transient regular diffusion without traps on I. It is known that we can always find a positive σ-finite measure m on the state space I, called the speed measure, so that the 0-potential density u(x, y) of X, with respect to m, is symmetric and continuous. Furthermore, there exist two continuous positive functions p and q with p strictly increasing and q strictly decreasing such that for all x, y ∈ I p(x)q(y) x ≤ y u(x, y) = (4.111) p(y)q(x) y < x. See Ray (1963, (1.4), (1.6)) (see also Rogers and Williams (2000a, Theorem V, 50.7) for an exponentially killed diffusion). The reader unfamiliar with these facts can simply take them as assumptions. However, in the next lemma we show how to derive the representation in (4.111), for particular potential densities that we are interested in, from the facts that X has continuous paths and continuous symmetric potential densities. Lemma 4.3.1 Let X be a diffusion in I with continuous symmetric αpotential densities, α > 0, and with P x (Ty < ∞) > 0 for all x, y ∈ I.
4.3 Diffusions
145
Then, for some continuous functions p and q on I and x, y ∈ I, we have p(x)q(y) x ≤ y uα (x, y) = (4.112) p(y)q(x) y < x. Furthermore, p is strictly increasing and q is strictly decreasing. This is also true for α = 0 when u(x, y) exists. ∞ Proof By (3.91), uα (x, y) = E x 0 e−αt dLyt . Making use of the fact that X has continuous paths, we see that for x < v < y ∞ y α x −αt u (x, y) = E (4.113) e dLt Tv ∞ e−αt dLyt = E x e−αTv E XTv = E
x
e
−αTv
0
uα (v, y) =
uα (x, v)uα (v, y) . uα (v, v)
Here we also use (3.107). By Remark 3.6.6 and the assumption that P x (Ty < ∞) > 0, we have uα (x, y) > 0 for all x, y ∈ I. Consequently, the following functions are well defined: x ≤ x0 uα (x, x0 ) G(x) = uα (x, x)uα (x0 , x0 ) x > x0 uα (x, x0 ) and
H(x) =
uα (x, x) uα (x, x0 )
x ≤ x0
uα (x, x0 ) α u (x0 , x0 )
x > x0 .
Using (4.113) one can check that uα (x, y) = G(x ∧ y)H(x ∨ y).
(4.114)
This shows that uα (x, y) can be represented as in (4.112). Furthermore, by the continuity of X, E x (e−Tv ) < 1. Therefore, by (4.113), uα (x, y) is strictly less than uα (v, y). This shows that p is strictly increasing. A similar argument using (4.113) and the fact that uα (y, v) uα (v, y) = α = E y (e−αTv ) α u (v, v) u (v, v) shows that q is strictly decreasing.
(4.115)
Constructing Markov processes
146
It is clear that everything goes through with α = 0 when the 0potential density of X exists.
Lemma 4.3.2 Let X be a recurrent diffusion in R1 with continuous α-potential densities. Assume that P x (Ty < ∞) = 1 for all x, y ∈ R1 . Then uT0 (x, x) is strictly increasing on R+ and xy ≥ 0 uT0 (x, x) ∧ uT0 (y, y) uT0 (x, y) = (4.116) 0 xy < 0. Proof Let y > x > 0 and recall that uT0 (x, y) = E x (LyT0 ). By the support properties of Lyt and the strong Markov property, uT0 (x, y) = E x 1{Ty
(4.118)
Furthermore, using continuity of the paths of X and the strong Markov property, we have P y (T0 < ∞) = P y (Tx < T0 , T0 ◦ θTx < ∞) = P y (Tx < T0 )P x (T0 < ∞), (4.119) so by the assumption that P x (Ty < ∞) = 1 for all x, y ∈ R1 we see that P y (Tx < T0 ) = 1. Using this in (4.118) we see that uT0 (y, x) = uT0 (x, x).
(4.120)
Let τ 0 = 0 and for i = 1, 2, . . .. Define τi
=
inf{t ≥ τ i−1 : Xt = x}
τi
=
inf{t ≥ τi : Xt = y}.
Using continuity of the paths of X, the fact that T0 is a terminal time, and the strong Markov property we have P y (T0 < ∞)
= ≤
∞ i=1 ∞
P y (τi < T0 < τ i ) P y (T0 ◦ θTτi < Ty ◦ θTτi , τi < ∞)
i=1
=
∞ i=1
P x (T0 < Ty ),
(4.121)
4.4 Left limits and quasi left continuity
147
since P y (Tx < ∞) = 1. Since, by assumption, P y (T0 < ∞) = 1, we must have P x (T0 < Ty ) > 0. Furthermore, since P x (T0 < ∞)
= P x (T0 < Ty ) + P x (Ty < T0 , T0 ◦ θTy < ∞) = P x (T0 < Ty ) + P x (Ty < T0 ),
it follows that P x (Ty < T0 ) < 1. Using this in (4.117) we see that uT0 (x, y) < uT0 (y, y).
(4.122)
Finally, we note that by Lemma 3.8.1, uT0 (x, y) = uT0 (y, x). Combining this with (4.120) and (4.122) we get (4.116) when x, y ≥ 0. The extension to all x, y is obvious.
4.4 Left limits and quasi left continuity We continue to explore the sample path properties of strongly symmetric Borel right processes with continuous α-potential densities. We first show that X has left-hand limits. Theorem 4.4.1 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process. Then Xs− := limr↑↑s Xr exists in S for all s ∈ (0, ζ). Proof Let D ⊆ R+ denote the positive rational numbers . Since Xt is right continuous, it suffices to show that limr↑↑s, r∈D Xr exists in S for every s ∈ (0, ζ). Fix t ∈ D. Let {Pt ; t ≥ 0} denote the transition semigroup and m denote the reference measure for X. Then, using the fact that X is symmetric with respect to m, we have that, for any 0 < s1 < s2 < . . . < sn ∈ D ∩ (0, t) and fi ∈ Bb (S), i = 1, . . . , n, E
m
n
fi (Xt−si ) I{t<ζ}
(4.123)
i=1
Pt−sn (x, dyn )fn (yn )Psn −sn−1 (yn , dyn−1 )fn−1 (yn−1 ) · · ·
= =
· · · Ps2 −s1 (y2 , dy1 )f1 (y1 )Ps1 (y1 , dz)1S (z) dm(x) 1S (x)Pt−sn (fn Psn −sn−1 (fn−1 · · · Ps2 −s1 (f1 Ps1 1S )) · · ·)(x) dm(x).
On page 75 we express the symmetry condition in inner product notation. Using this notation we note that for f, g, h ∈ L2 (dm), (h, Pt f g) =
Constructing Markov processes
148
(Pt h, f g) = (g, f Pt h). Using this we see that the last line of (4.123) equals 1S (x)Ps1 (f1 Ps2 −s1 · · · (fn−1 Psn −sn−1 (fn Pt−sn 1S )) · · ·)(x) dm(x) n m (4.124) =E fi (Xsi ) I{t<ζ} . i=1
Hence, under P m ( · | I{t<ζ} ), {Xs ; s ∈ D ∩ (0, t)} has the same law as {Xt−s ; s ∈ D ∩ (0, t)}. Because X is right continuous, {Xt−s ; s ∈ D ∩ (0, t)} has left-hand limits in S, and therefore so does {Xs ; s ∈ D ∩ (0, t)}, P m ( · | I{t<ζ} ) almost surely. Since this is true for all t ∈ D, P m (Γ) = 0, where Γ := lim Xr fails to exist in S for some s ∈ (0, ζ) . (4.125) r↑↑s, r∈D
x x We now show that P (Γ) = 0 for all x ∈ S. Let f (x) = P (Γ) so that f (x) dm(x) = 0. Since, by hypothesis, X has α-potential densities, we have U α f (x) = 0 for all x ∈ S and α > 0. Assume that Γ ∈ F. Note that θt−1 Γ = {θt ω ∈ Γ}. Therefore, by the Markov property, for any t > 0
P x (θt−1 Γ)
= E x (1Γ ◦ θt )
(4.126)
= E (E (1Γ ◦ θt |Gt )) = E x E Xt (1Γ ) = Pt f (x) x
x
(see (3.27) and (3.11)). Since θt−1 Γ = lim Xr fails to exist in S for some s ∈ (t, ζ) , (4.127) r↑↑s, r∈D
we see that Pt f (x) ↑ f (x) as t → 0. Therefore, by the Monotone Convergence Theorem, ∞ α e−αt Pt f (x) dt (4.128) 0 = lim αU f (x) = lim α α→∞ α→∞ 0 ∞ e−t Pt/α f (x) dt = f (x), = lim α→∞
0
which completes the proof of this theorem. It remains to show that Γ ∈ F. We prove this first when S = R+ . In this case, we denote ∆, the one-point compactification of R+ by ∞ (as usual). Note that supR+ x = ∞. We write R+ = R+ ∪ {∞}. With these definitions set Z t = lim sup Xr r↑↑t, r∈D
Z t = lim inf Xr r↑↑t, r∈D
t ∈ (0, ∞).
(4.129)
4.4 Left limits and quasi left continuity
149
Suppose that the map (t, ω) → Z t (ω) from (0, ∞) × Ω to R+ is measurable from B((0, ∞)) × F to B(R+ ), and similarly for Z t . Then Γ is the projection on Ω of the B((0, ∞)) × F measurable set (t, ω) | Z t (ω) = ∞, (t, ω) | Z t (ω) = Z t (ω), Xt (ω) ∈ R+ Xt (ω) ∈ R+ and hence is in F by the Projection Theorem (Theorem 14.3.1). To prove that (t, ω) → Z t (ω) is measurable we note first that Z t = lim Ztn ,
(4.130)
n
where Ztn =
∞
1(k2−n ,(k+1)2−n ] (t)
k=0
sup {k2−n
Xr =:
∞
Ytk,n
(4.131)
k=0
with Ytk,n = 1(k2−n ,(k+1)2−n ] (t) sup{k2−n
= lim m
∞
k,n 1(j2−m ,(j+1)2−m ] (t)Yj2 −m .
(4.132)
j=0
Thus Ytk,n is measurable. Then Ztn is measurable, and consequently so is Z t . A similar analysis applies to (t, ω) → Z t (ω). This proves the theorem when S = R+ . Finally, let S be a general locally compact Hausdorff space and let d(x, y) be a metric compatible with the topology of S. Set d(x, ∆) = ∞ for each x ∈ S. Let x1 , x2 , . . . be a countable dense set in S. Then ∞ lim d(xn , Xr ) fails to exist in R+ for some s ∈ (0, ζ) . Γ= n=1
r↑↑s, r∈D
(4.133) Repeating the argument given when S = R+ applied to d(xn , Xr ), we see that each set in the union in (4.133) is in F. Thus Γ ∈ F. This completes the proof of Theorem 4.4.1. We say that X is quasi left continuous on [0, ζ) if, for any increasing sequence {Tn } of Gt stopping times with limit T , we have XTn → XT on {T < ζ} almost surely. Theorem 4.4.2 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential density uα (x, y). Then Xt is quasi left continuous on [0, ζ).
150
Constructing Markov processes
Proof Let {Tn } be an increasing sequence of stopping times with limit T and define L = limn XTn , which exists in S when T < ζ by Theorem 4.4.1. Let f, g ∈ Cb (S). Using the fact that L ∈ FT and the strong Markov property, we have that for any α > 0, E x f (L)U α g(XT ) I{T <ζ} (4.134) ∞ e−αt g(Xs ) ds = E x f (L) I{T <ζ} E XT 0 ∞ x −αt e g(XT +s ) ds = E f (L) I{T <ζ} 0 ∞ −αt = E x f (L) I{T <ζ} eαT e g(Xs ) ds T ∞ = lim E x f (XTn ) I{Tn <ζ} eαTn e−αt g(Xs ) ds , n→∞
Tn
by the Dominated Convergence Theorem. Here we use the fact that ∞ f (XTn ) I{T <ζ} eαTn e−αt g(Xs ) ds n Tn ∞ 1 αTn ≤ f ∞ g∞ e e−αt ds = f ∞ g∞ α Tn is uniformly bounded. Using the same procedure involving the strong Markov property we have that for any α > 0 E x f (L)U α g(XT ) I{T <ζ} = lim E x f (XTn )U α g(XTn ) I{Tn <ζ} . n→∞
(4.135) By Lemma 3.4.1, U α g is bounded and continuous on S and therefore limn U α g(XTn ) = U α g(L) on T < ζ. Hence, by (4.135) we get (4.136) E x f (L)U α g(XT ) I{T <ζ} = E x f (L)U α g(L) I{T <ζ} . By (3.16) and the Dominated Convergence Theorem, this implies that E x f (L)g(XT ) I{T <ζ} = E x f (L)g(L) I{T <ζ} . (4.137) The Monotone Class Theorem (see, e.g., Rogers and Williams (2000b, II.3.1)) then shows that (4.138) E x h(L, XT ) I{T <ζ} = E x h(L, L) I{T <ζ} for all bounded measurable functions h, in particular for h(x, y) = 1{x=y} , which yields P x (XT = L | T < ζ) = 1.
4.4 Left limits and quasi left continuity
151
A Borel right process is said to be a standard Markov process if it is quasi left continuous on [0, ζ). Thus we see that a strongly symmetric Borel right process with continuous α-potential density is a standard Markov process. In Theorem 3.2.6 we use the Projection Theorem to show that, for a Borel right process, the first hitting time TA = inf{s > 0 | Xs ∈ A} is a stopping time for every Borel set A. In this book all our results are for standard Markov processes, and we only consider sets A that are open or closed. In this case we can give a direct proof of Theorem 3.2.6 that does not use the Projection Theorem. Theorem 4.4.3 Let X = (Ω, G, Gt , Xt , θt , P x ) be a standard Markov process. Then TA is an Ft stopping time for every open or closed set A ⊆ S. Proof The proof for A open is exactly the same as for Brownian motion. We note that {Xr ∈ A} ∈ Ft (4.139) {TA < t} = 0
by the right continuity of X. Now let A be a closed set. Let An ⊇ A be a decreasing sequence of open sets such that An ⊇ An+1 and A = ∩n An . Note that A = ∩n An .
(4.140)
Recall the definition DB = inf{s ≥ 0 | Xs ∈ B} in (3.35). Clearly DAn ≤ DA is increasing in n. Set D = lim DAn ≤ DA .
(4.141)
By right continuity we have XDAn ∈ An . Note that DAn is a stopping time. This can be shown by using (4.139) but with the union taken over {0 ≤ r < t, r ∈ D}. Since {D ≤ t} = ∩n {DAn ≤ t}, we see that D is also a stopping time. Therefore, since X is quasi left continuous on [0, ζ), we have XD = lim XDAn ∈ ∩n An = A n
on D < ζ.
(4.142)
Hence D ≥ DA on {D < ζ}, so that by (4.141) we have D = DA on {D < ζ} and a fortiori on {DA < ζ}. Now, from the definitions, we have that if DA ≥ ζ, then DA = ∞, so that in fact D = DA on DA < ∞. Thus DA is an Ft stopping time. The proof is completed as it was in the proof of Theorem 3.2.6, starting at (3.38).
152
Constructing Markov processes
A Hunt process is a Borel right process that is quasi left continuous on [0, ∞), that is, for any increasing sequence {Tn } of Gt stopping times with limit T we have XTn → XT on {T < ∞} almost surely. It can be shown that a strongly symmetric Borel right process with continuous α-potential density is a Hunt process. The proof of this would take us too far afield, so we refer the interested reader to Theorem 3.8 in our paper Marcus and Rosen (1992d).
4.5 Killing at a terminal time Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with transition semigroup {Pt ; t ≥ 0} and continuous α-potential density uα (x, y). Let T be a terminal time. In this section we consider the process obtained by killing X at T . For f ∈ C(S), set Pt f (x) = E x (f (Xt )1{t
(4.143)
Using the terminal time property we show that {Pt ; t ≥ 0} is a semigroup, Ps+t f (x)
= E x (f (Xs+t )1{s+t
(4.144)
= Pt (Ps f )(x). Let t (ω) = X
Xt (ω) if t < T ∆ otherwise.
(4.145)
t is right continuous. Clearly, t → X Let S be a Gt stopping time and let H ∈ GS . Noting that {S < T } ∈ GS and using the strong Markov property of X, we see that S+t ) 1H E x f (X = E x (f (XS+t )1{S+t
θt (ω) if t < T (ω) ∆ otherwise.
(4.147)
4.5 Killing at a terminal time
153
= (Ω, Gt , G, X t , θt , Px ) satisfies all the conditions in the We see that X definition of a right continuous simple Markov process with transition semigroup {Pt ; t ≥ 0} on S∆ , with the possible exception of the condi0 = x) = 1 for all x ∈ S∆ . tion that Px (X We now restrict our attention to the case when T = TB , the first hitting time of a set B. If, for example, B is open, then it is clear that 0 = x) = 1 for x ∈ B, since in fact X 0 = ∆, Px we do not have Px (X a.s. We remedy this defect by restricting the state space. We begin by showing that the semigroup {Pt ; t ≥ 0} is symmetric for certain sets B. Lemma 4.5.1 Let {Pt ; t ≥ 0} be the semigroup defined in (4.143) with T = TB , the first hitting time of the set B. If B is open, (4.148) g(x)Pt f (x) dm(x) = f (x)Pt g(x) dm(x) for all t ≥ 0 and all f, g ∈ Bb (S). This also holds if B is closed and P x (TB = 0) = 1 for all x ∈ B. Proof Let B be open. By the right continuity of X and the symmetry of Pt , we have (4.149) g(x)Pt f (x) dm(x) = g(x)E x (f (Xt )1{t
= lim
n→∞
g(x0 )
n→∞
n→∞
n→∞
=
n
Pt/n (xi−1 , dxi )f (xn )
i=1
n
1{S∩B c } (xj ) dm(x0 )
j=0
g(x)1S∩B c (x)
= lim
= lim
1{S∩B c } (Xjt/n )) dm(x)
j=0
= lim
n
Pt/n 1S∩B c (Pt/n 1S∩B c · · · (Pt/n 1S∩B c f )) · · ·)(x) dm(x) f (x)1S∩B c (x) Pt/n 1S∩B c (Pt/n 1S∩B c · · · (Pt/n 1S∩B c g)) · · ·)(x) dm(x)
f (x)Pt g(x) dm(x).
Now let B be closed. Let Bn ⊇ B be a decreasing sequence of open sets such that Bn ⊇ Bn+1 and B = ∩n Bn . Let DB be as in the proof of Theorem 4.4.3. We show, in the proof of Theorem 4.4.3 that DBn ↑ D,
154
Constructing Markov processes
for some stopping time D, with D ≤ DB and D = DB on {D < ζ}. It is easy to see that DBn = TBn for open sets Bn . The condition that P x (TB = 0) = 1 for all x ∈ B implies that DB = TB as well. Therefore, TBn ↑ D with D ≤ TB and D = TB on {D < ζ}. Using this we see that E x (f (Xt )1{t
TB
α f (x) + E x e−αTB E XTB e−αt f (Xt ) dt =U 0 α f (x) + E x e−αTB =U uα (XTB , y)f (y) dm(y) . ∞
has an α-potential density means that we can find a To say that X α α f (x) = u (x, y)f (y) dy. Thus we see function u α (x, y) such that U from (4.151) that X has the α-potential density x, y ∈ S. (4.152) u α (x, y) = uα (x, y) − E x e−αTB uα (XTB , y) By Lemma 3.4.3 we see that u α (x, y) is continuous in y uniformly in x. By Lemma 4.5.1 we have u α (x, y) = u α (y, x)
m a.e. x, y ∈ S
(4.153)
4.5 Killing at a terminal time
155
and thus by (4.152) E x e−αTB uα (XTB , y) = E y e−αTB uα (XTB , x) ,
m a.e. x, y ∈ S. (4.154) We now show that this holds for all x, y ∈ S, and consequently (4.153) holds for all x, y ∈ S. Let h(x, y) = E x e−αTB uα (XTB , y) . By the strong Markov property E x ( h(Xt , y)) = E x e−αTB ◦θt uα (Xt+TB ◦θt , y) . (4.155) ¯ t , θ¯t , P¯ x ) be an independent copy of X. We ¯ = (Ω, ¯ G, ¯ G¯t , X Now let X have ¯ y E x h(Xt , X ¯s) = E ¯ y E x e−αTB ◦θt uα Xt+T ◦θ , X ¯s E . t B (4.156) Since t + TB ◦ θt ↓ TB as t → 0, using the right continuity of X and the continuity of uα (x, y), we have ¯ y E x h(Xt , X ¯ s ) = h(x, y). lim E (4.157) s,t→0
Fix x, y ∈ S. Let K ⊂ S be a compact set containing x, y. Let f : K × K → [0, 1] be a continuous symmetric function with f (x, y) = f (y, x) = 1. Then clearly (4.157) implies that ¯ y E x h(Xt , X ¯ s )f (Xt , X ¯ s ) = h(x, y). (4.158) lim E s,t→0
Note that by (3.68) and the fact that f (x , y ) is bounded and compactly supported we have that h(x , y )f (x , y ) is bounded. Let g ∈ Bb (S). Then it follows from the basic definitions in Section 3.1 that ∞ g(y )U β (y, dy ) = e−βt E y (g(Xt )) dt. (4.159) 0
Using this, we see that (4.160) h(x , y )f (x , y )U β (x, dx )U β (y, dy ) lim ββ β,β →∞ ∞ ∞ ¯y ββ e−βt e−β s E = lim β,β →∞ 0 0 x ¯ s )f (Xt , X ¯ s ) ds dt E h(Xt , X ∞ ∞ ¯y = lim e−t e−s E β,β →∞ 0 0 x ¯ s/β )f (Xt/β , X ¯ s/β ) ds dt E h(Xt/β , X = h(x, y)
Constructing Markov processes
156
by (4.158). It follows from (4.154) that h(x , y ) = h(y , x ) for m almost all x , y ∈ S. Therefore, since f (x , y ) is symmetric and the measures U β (x, dx ) and U β (y, dy ) are absolutely continuous with respect to m, we see that (4.161) h(x , y )f (x , y )U β (x, dx )U β (y, dy ) = h(y , x )f (y , x )U β (x, dx )U β (y, dy ). Applying the sequence of equalities in (4.160) to this last expression, we see that h(x, y) is symmetric, which in turn implies, as we pointed out α (x, y) is continuous above, that u α (x, y) is symmetric. Finally, since u in each variable uniformly in the other, we see that u α (x, y) is also jointly continuous. ¯ c we ¯ c is open, it is clear that for each x ∈ B We now note that since B x x x have P (TB = 0) = 0, so that P (X0 = x) = P (X0 = x) = 1. We next ¯ c , that is, that X, starting show that P x (TB¯ = TB ) = 1 for each x ∈ B c ¯ , does not leave S before being killed. This is a tautology in S = B when B is closed. When B is open, u α (x, y) = 0 for any x ∈ B; hence by continuity for any x ∈ ∂B and therefore for such x we have TB
Ex 0
e−αt 1S (Xt ) dt
u α (x, y) dm(y) = 0.
=
(4.162)
S
This implies that P x (TB > 0) = 0 for any x ∈ ∂B. Hence, by the strong Markov property, for any x ∈ S, (4.163) P x (TB > T∂B ) = E x P X∂B (TB > 0) = 0 and therefore P x (TB¯ = TB ) = 1. restricted We have obtained enough information to establish that X c ¯ to S = B is a right continuous simple Markov process with continuous α-potential densities. The theorem now follows from Lemma 3.4.2. Remark 4.5.3 Let X and X be strongly symmetric Borel right processes on compact state spaces S and S , respectively, with continuous 0-potential densities u0 (x, y) and u 0 (x, y). Assume that S ⊆ S and u 0 (x, y) = u0 (x, y) on S × S . Let B ⊆ S be an open set with compact be the process obtained in Theorem 4.5.2 closure contained in S . Let X by killing X at TB¯ c and X be the process obtained by killing X at TB¯ c . and X have the same distribution. This follows from Remark Then X 4.1.11. Using the notation of Remark 4.1.11, we see that {Xτt ; t ≥ 0}
4.5 Killing at a terminal time
157
has the same distribution as {Xt ; t ≥ 0}. It is clear that for t < TB¯ c , At = t, so that τt = t. Example 4.5.4 Let 0 be a point in the state space S. When T = T0 we can describe the potential densities u α (x, y) more explicitly than in 0 (4.152). Recall from (3.83) that P (T0 = 0) = 1, so the condition in Theorem 4.5.2 is satisfied. Using the fact that XT0 = 0, it follows from (4.152) that (4.164) u α (x, y) = uα (x, y) − E x e−αT0 uα (0, y). Therefore, by (3.107), we have u α (x, y) = uα (x, y) −
uα (x, 0)uα (0, y) . uα (0, 0)
(4.165)
If we consider uα (x, y) as the 0-potential density of X (in the notation of Theorem 4.5.2), killed at the end of an independent exponential time with mean 1/α, then, by Remark 3.8.3, the right-hand side of (4.165) is equal to uα α (x, y) = uα T0 (x, y). It is not surprising that u T0 (x, y). Example 4.5.5 By Theorem 4.5.2, Brownian motion killed at T0 is a strongly symmetric Borel right process. By (4.165) and (2.5), its αpotential density for α > 0 is √
u (x, y) = α
e−
2α|y−x|
√
2α
√
−
e−
√ 2α|x| − 2α|y|
√
e 2α
.
(4.166)
Because Brownian motion has continuous paths, the killed process naturally breaks up into two processes, one with state space (0, ∞) and the other with state space (−∞, 0). To be specific we deal with the process with state space (0, ∞). In this case, using (2.5) once more, we see that √
u (x, y) α
=
e−
=
2α|y−x|
√
∞
2α
√
−
e−
2α|x+y|
√
2α
(4.167)
e−αt (pt (y − x) − pt (y + x)) dt,
0
where pt (y − x) is the transition density for Brownian motion. By the uniqueness of the Laplace transform, we see that the Borel right process with state space (0, ∞), obtained by killing Brownian motion the first X time it hits 0, has transition densities pt (x, y) = pt (y − x) − pt (y + x)
(4.168)
with respect to Lebesgue measure. Note that (0, ∞) is locally compact.
Constructing Markov processes
158
A one-point compactification of (0, ∞) can be obtained by first embedding it in the compact set [0, ∞] and then considering 0 and ∞ as a single point. By considering the limits of u α (x, y), both as x → ∞ and is a Feller x → 0, we see from (4.167) and Corollary 4.1.4 (2) that X process. We note that the simple relation ∞ ∞ pt (x − y) dy + pt (x + y) dy = 1 (4.169) 0
0
together with (4.168), for standard Brownian motion Bt , show that P x (T0 ≤ t)
1 − P x (T0 > t) (4.170) ∞ ∞ = 1− pt (y − x) dy + pt (y + x) dy 0 ∞0 pt (y + x) dy = 2P 0 (Bt ≥ x). = 2 =
0
This gives an alternative derivation of the reflection principle, Lemma 2.2.11.
4.5.1 Brownian motion killed at T0 and the three-dimensional Bessel process The general class of Bessel processes is defined in Section 14.2. In (4.185) we obtain an explicit relationship between Brownian motion killed at T0 and the three-dimensional Bessel process. Let W (i) , i = 1 . . . , 3 be independent standard Brownian motions (1) (2) (3) and set Wt = (Wt , Wt , Wt ). We refer to {Wt , t ≥ 0} as threedimensional Brownian motion. {|Wt |, t ≥ 0} is a three-dimensional Bessel process. Let 1 qt (x, y) = (pt (y − x) − pt (y + x)) y x, y > 0. (4.171) x Note that
∞
qt (x, y) dy = 1.
(4.172)
0
This follows since ∞ y (pt (y − x) − pt (y + x)) dy 0 ∞ ∞ (s + x)pt (s) ds − (s − x)pt (s) ds = −x
x
(4.173)
4.5 Killing at a terminal time ∞ ∞ =x pt (s) ds + pt (s) ds = x.
−x
159
x
It is easy to check that qt (x, y) satisfies the Chapman–Kolmogorov equation (3.64) with respect to Lebesgue measure on the state space S = (0, ∞). We now show that qt (x, y) are the transition densities of {|Wt |, t ≥ 0}. A priori, {|Wt |, t ≥ 0} has state space [0, ∞), but the next lemma shows that we may omit {0} and take the state space to be S = (0, ∞). Lemma 4.5.6 Let {Wt , t ≥ 0} be three-dimensional Brownian motion. Then P x (T0 < ∞) = 0 for x = 0. In other words, starting from any point away from the origin, the three-dimensional Brownian motion will never hit the origin, almost surely. Proof
Fix 0 < a < b < ∞. We claim that {Wt = 0 for some a ≤ t < b} (4.174) & , 3 (i) ⊆ log n , |Wk/n | ≤ n m≥1 n≥m an≤k≤bn i≤3
except, possibly, for a set of P x measure 0, where x > 0. To see this we first note that {Wt = 0 for some a ≤ t < b} = {T0 ◦ θa < b}. Therefore, by the strong Markov property at T0 ◦ θa and Khintchine’s law of the iterated logarithm, (2.14), applied to each (1) (2) (3) component of Wt = (Wt , Wt , Wt ), we have - & (i) W P x T0 ◦ θa < b, =1 lim sup T0 ◦θa +δ 2δ log log(1/δ) δ→0 i≤3 = P x (T0 ◦ θa < b) .
(4.175)
Hence {Wt = 0 for some
a ≤ t < b} (4.176) & (i) W =1 , lim sup T0 ◦θa +δ ⊆ {T0 ◦ θa < b} 2δ log log(1/δ) δ→0 i≤3
except, possibly, for a set of P x measure 0, from which (4.174) follows. Using (4.174) we see that P x (Wt = 0 for some
a ≤ t < b)
(4.177)
Constructing Markov processes - & , 3 (i) log n ≤ Px |Wk/n | ≤ n m≥1 n≥m an≤k≤bn i≤3 - & , 3 (i) x log n |Wk/n | ≤ ≤ lim inf P n→∞ n
160
an≤k≤bn i≤3
≤ lim inf n→∞
bn
P
,
(1) |W1 |
x
≤
k=an
3 log n k
3
(log n)3/2 = 0, n→∞ n1/2 where C(a, b) is a constant depending only on a and b. The proof is completed by noting that that for any sequences {an } ↓ 0 and {bn } ↑ ∞, ≤ lim C(a, b)
{Wt = 0 for some t > 0} = {Wt = 0, for some an ≤ t < bn }
(4.178)
n
It follows from Lemma 4.5.6 that, starting from x = 0, the value of / S. f (|Wt |) is almost surely well defined for f ∈ Bb (S), even though 0 ∈ For notational convenience we extend f ∈ Bb (S) by setting f (0) = 0. Then, for x = 0, fi ∈ Bb (S), i = 1, . . . , n, and 0 < t1 < · · · < tn , n n n x fi (|Wti |) = p¯t1 (x, z1 ) p¯ti −ti−1 (zi−1 , zi ) fi (|zi |) d3 zi , E R3n
i=1
i=2
i=1
(4.179) where p¯t (x, y) are the transition probabilities of three-dimensional Brownian motion. We use the law of cosines to write 1 e−|x| /2t e−|y| −|y−x|2 /2t e = (2πt)3/2 (2πt)3/2 2
p¯t (x, y) =
2
/2t
e−|y||x| cos φ/t (4.180)
(φ is the angle between x and y). Using polar coordinates and integrating out the angular coordinates, we see that p¯t (x, y)f (|y|) d3 y (4.181) e−|x| /2t = 2π (2πt)3/2 2
∞
π
−r|x| cos φ/t
e 0
0
sin φ dφ e−r
2
/2t 2
r f (r) dr
2 2 e−|x| /2t ∞ t r|x|/t = 2π e − e−r|x|/t e−r /2t r2 f (r) dr 3/2 r|x| (2πt) 0 qt (|x|, r)f (r) dr, = S
4.5 Killing at a terminal time
161
where qt is given in (4.171). Thus we can write (4.179) as n n n x fi (|Wti |) = qt1 (|x|, z1 ) qti −ti−1 (zi−1 , zi ) fi (zi ) dzi . E i=1
Sn
i=2
i=1
(4.182) Note that E x depends only on |x|. Therefore, when dealing with {|Wt |, t ≥ 0} we use the notation P y with y ∈ S to denote any of the measures P y with y ∈ R3 and |y | = y. Writing {Yt , t ≥ 0} for {|Wt |, t ≥ 0} under the family of measures P x , x ∈ S, we see that {Yt , t ≥ 0} has state space S and we can write (4.182) as n n n Ex fi (Yti ) = qt1 (x, z1 ) qti −ti−1 (zi−1 , zi ) fi (zi ) dzi i=1
Sn
i=2
i=1
(4.183) for fi ∈ Bb (S), i = 1, . . . , n and 0 < t1 < · · · < tn . Thus we see that the qt (x, y) are the transition densities of {|Wt |, t ≥ 0} under the family of measures P x , x ∈ S. Let Ft0 = σ(Ys ; s ≤ t). As in the second proof of Lemma 2.2.1, it follows from (4.183) that E f (Yt+s ) | Ft0 = Qs f (Xt ) (4.184) for all s, t ≥ 0 and f ∈ Bb (S), where Qs f (x) = qs (x, y)f (y) dy. Thus, {Yt ; t ≥ 0} is a continuous simple Markov process with respect to the filtration Ft0 = σ(Ys ; s ≤ t). It is clear from from (4.166), (4.167), and (4.171) that {Yt ; t ≥ 0} has continuous α-potential densities for α > 0. Therefore, we see by Lemma 3.4.2 that, after augmentation, {Yt ; t ≥ 0} is a Borel right process. In what follows, when we refer to the three-dimensional Bessel process we mean the Borel right process {Yt ; t ≥ 0}. We now obtain an explicit connection between Brownian motion killed with state space at T0 and the three-dimensional Bessel process. Let X, S = (0, ∞), denote the Borel right process obtained by killing Brownian motion the first time it hits 0 (see Example 4.5.5). We see from (4.168) t and and (4.171) that the transition probability density functions of X Yt , the three-dimensional Bessel process, are closely related. Therefore, using (4.183) and (4.172), we see that for fi ∈ Bb (S), i = 1, . . . , n, and 0 < t1 < · · · < tn , n n t X n x x t ) E fi (Yti ) = E fi (X (4.185) i x i=1 i=1
Constructing Markov processes n t X x = E fi (Xti ) x i=1
162
t for any t ≥ tn . Note that in the last equality we use the fact that X is a martingale, which follows from the calculation in (4.173). Equation (4.185) is used in Subsection 8.1.1. Remark 4.5.7 Recall from (2.139) that u (x, y) = 2(x ∧ y). Although ∞ is not in S = (0, ∞), we have formally that u (x, ∞) = 2x. Comparing (4.185) with (3.211) we can say, heuristically, that the three-dimensional Bessel process has the law of Brownian motion killed at T0 , conditioned to die at ∞.
4.6 Continuous local times and potential densities In all the results in this book that prove the joint continuity of local times, we assume that the potential densities {uα (x, y), (x, y) ∈ S × S} are continuous. The next theorem shows that we lose nothing by this assumption. Theorem 4.6.1 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process and assume that its α-potential density, uα (x, y), is finite for all x, y ∈ S. Let Lyt = {Lyt , (t, y) ∈ R+ × S} be a local time of X with ∞ e−αt dLyt = uα (x, y). (4.186) Ex 0
{Lyt , y
∈ S} is continuous for all t ∈ R+ almost surely, then {uα (x, y), If (x, y) ∈ S × S} is continuous. Proof
We show first that for compact sets K ⊂ S, sup uα (y, y) < ∞.
(4.187)
y∈K
Suppose that (4.187) does not hold. Then we can find a sequence {yn }∞ n=1 in K and a y ∈ K such that limn→∞ yn = y and lim uα (yn , yn ) = ∞.
n→∞
(4.188)
It follows from the definition of local time that for any real number T > 0, P y (LyT > 0) = 1. Let Gn be a decreasing sequence of open sets
4.6 Continuous local times and potential densities
163
y such that yn ∈ Gn and ∩∞ n=1 Gn = y. The continuity of LT implies that
{LyT > 0} =
∞
{LzT > 0, ∀z ∈ Gn }.
(4.189)
n=1
Hence there exists an n0 such that for all n ≥ n0 P y (LzT > 0, ∀z ∈ Gn ) ≥ 12 .
(4.190)
Let Tn denote the first hitting time of yn by (X, P y ). It follows that P y (Tn ≤ T, ∀n ≥ n0 ) ≥ which implies that
1 2
,
(4.191)
E y e−αTn ≥ 12 e−αT .
(4.192)
uα (y, yn ) E y e−αTn = α u (yn , yn )
(4.193)
uα (y, z) = E z e−αTy ≤ 1. uα (y, y)
(4.194)
uα (y, y) uα (y, yn ) ≤ α . E y e−αTn = α u (yn , yn ) u (yn yn )
(4.195)
However, by (3.107),
and for any z
Hence
Therefore, by (4.188) and (4.195), the left-hand side of (4.192) goes to zero as n approaches infinity. This contradiction establishes (4.187). Now let λ be an exponential random variable with mean α that is independent of X and let µ be the probability measure of λ. It follows from Theorem 3.10.1 along with (4.194) and (4.187) that the random variables {Lyλ , y ∈ K} are uniformly bounded in Lp (P x × µ) for all 1 ≤ p < ∞. Therefore, both {Lyλ , y ∈ K} and {Lyλ Lzλ , y, z ∈ K} are uniformly integrable. Using this and the continuity of Lyt for all t ∈ R+ , we see that both y → E x (Lyλ ) = uα (x, y)
(4.196)
and (y, z) → E x (Lyλ Lzλ ) = (uα (x, y) + uα (x, z))uα (y, z) are continuous.
(4.197)
164
Constructing Markov processes
To see that {uα (x, y), (x, y) ∈ K × K} is continuous, let (yn , zn ) → (y0 , z0 ), where all {yn }, {zn }, y0 , and z0 are contained in K. We see from (4.196) and (4.197) that uα (y0 , yn ) → uα (y0 , y0 ),
uα (y0 , zn ) → uα (y0 , z0 ),
(4.198)
and (uα (y0 , yn ) + uα (y0 , zn ))uα (yn , zn ) → (uα (y0 , y0 ) + uα (y0 , z0 ))uα (y0 , z0 ). (4.199) Recall that by (3.74), uα (y0 , y0 ) > 0. Therefore, it follows from (4.198) and (4.199) that uα (yn , zn ) → uα (y0 , z0 ). Thus, {uα (x, y), (x, y) ∈ K × K} is continuous. Since this holds for all compact sets K ⊂ S, the theorem is proved.
4.7 Constructing Ray semigroups and Ray processes Until Chapter 13, Borel right processes seem to be sufficiently general for all the results we obtain using Gaussian process theory to study local times. However, in the important Theorems 13.1.2 and 13.3.1, Borel right process are too limited to enable us to establish the equivalencies we seek. For this we introduce local Borel right processes in Section 4.8. In this section we develop prerequisite material that is also interesting and important in its own right. Let S be a locally compact space with a countable base. Let {U λ ; λ ≥ 0} be a contraction resolvent on Cb (S), that is, {U λ ; λ ≥ 0} are positive bounded linear operators on Cb (S), and similar to (4.9) and (4.10) αU α f ≤ f
∀ f ∈ Cb (S) and α > 0
(4.200)
(clearly, this also holds for α = 0), and U λ − U µ = (µ − λ)U λ U µ
∀ λ, µ ≥ 0.
(4.201)
A function f ∈ Bb+ (S) is called a supermedian function (with respect to {U λ ; λ ≥ 0}), or just supermedian, if αU α f (x) ≤ f (x)
∀x ∈ S and α > 0.
(4.202)
Let M+ denote the set of supermedian functions. Set M := M+ − M+ and H := M+ ∩ Cb (S) − M+ ∩ Cb (S). If f ∈ M+ and λ > µ, it follows from the resolvent equation (4.201) that λU λ f (x) − µU µ f (x)
(4.203)
= (λ − µ)U f (x) + µ(U f (x) − U f (x)) λ
λ
µ
4.7 Constructing Ray semigroups and Ray processes
165
= (λ − µ)U λ f (x) − µ(λ − µ)U µ U λ f (x) = (λ − µ)U λ (f (x) − µU µ f (x)) ≥ 0. It follows from this that for f ∈ M+ λU λ f (x) ↑
as
λ → ∞.
(4.204)
Consequently, f(x) := lim λU λ f (x) λ→∞
(4.205)
exists. Furthermore, we see from (4.202) that f(x) ≤ f (x)
∀f ∈ M+ .
(4.206)
Using (4.205), (4.200), and then (4.201), we see that µU µ f(x)
= µU µ lim λU λ f (x) =
λ→∞ λ
(4.207)
lim λU µU µ f (x)
λ→∞
λµ (U µ f (x) − U λ f (x)) λ−µ = µU µ f (x) ≤ f(x), =
lim
λ→∞
so that f ∈ M+ . f is called the excessive regularization of f . Theorem 4.7.1 Let S be a locally compact space with a countable base, and let {U λ ; λ ≥ 0} be a contraction resolvent on Cb (S). Assume that H∩C0 (S) is dense in C0 (S) in the uniform norm. Then we can construct a sub-Markov semigroup {Pt ; t ≥ 0} on (S, B), without the additional requirement that P0 = I, such that for all f ∈ Cb (S), Pt f (x) is right continuous in t ∈ R+ and ∞ U λ f (x) = e−λt Pt f (x) dt. (4.208) 0
The sub-Markov semigroup {Pt ; t ≥ 0} constructed in Theorem 4.7.1 is called a Ray semigroup. We break the proof of Theorem 4.7.1 into a series of lemmas, Lemmas 4.7.2–4.7.5, that give the details of the construction. The reader should note that each of these lemmas employs the notation, hypotheses, and results of the previous ones. In all of these lemmas we assume that S is a locally compact space with a countable base and that there exists a contraction resolvent {U λ ; λ ≥ 0} on Cb (S). Moreover, we assume that H ∩ C0 (S) is dense in C0 (S) in the uniform norm.
Constructing Markov processes
166
Lemma 4.7.2 For each f ∈ M+ and x ∈ S we can find a positive function of t ∈ R+ , denoted by Pt (x, f ), that is decreasing, right continuous in t, and satisfies ∞ U λ f (x) = e−λt Pt (x, f ) dt. (4.209) 0
Proof of Lemma 4.7.2 Note that by the resolvent equation (4.201), for any f ∈ Bb (S) and λ > 0, ∂ λ U f (x) ∂λ
U λ+ f (x) − U λ f (x) →0 = − lim U λ+ U λ f (x) = −(U λ )2 f (x),
=
lim
(4.210)
→0
where for the last equality we use (4.201) and (4.200). Then by induction we have ∂k λ U f (x) = k!(−1)k (U λ )k+1 f (x) ∂λk
∀λ > 0.
(4.211)
Hence for any f ∈ Bb+ (S) and x ∈ S, U λ f (x) is completely monotone in λ, and consequently it is a Laplace transform (see, e.g., Feller (1971, XIII.4)). That is, there exists a positive measure µf,x on R+ such that ∞ λ U f (x) = e−λt dµf,x (t) ∀λ > 0. (4.212) 0
Using the differentiation formula Dk (f g) = (4.211) we have ∂k (λ U λ )f (x) ∂λk
k j=0
k j
Dj f Dk g and
∂ k−1 λ ∂k λ U f (x) + λ U f (x) (4.213) ∂λk−1 ∂λk = k!(−1)k−1 (U λ )k (I − λ U λ )f (x).
= k
Let f ∈ M+ . By (4.204)–(4.205) we see that f(x) − λ U λ f (x) ≥ 0. Furthermore, for any k ≥ 1 it follows from (4.213) and (4.202) that (−1)k
∂k ∂k (f (x) − λ U λ f (x)) = (−1)k−1 k (λ U λ )f (x) ≥ 0. k ∂λ ∂λ
Thus f(x) − λ U λ f (x) is completely monotone in λ, and consequently for some positive measure ρf,x on R+ we have ∞ λ f (x) − λ U f (x) = e−λt dρf,x (t) λ > 0. (4.214) 0
It follows from the resolvent equation (4.201) that U λ f (x) ≤ U 0 f (x) < ∞, so that limλ→0 λU λ f (x) = 0. Therefore, taking the limit in (4.214) as λ → 0 we see that ρf,x has total mass f(x) ≤ f (x) < ∞. Taking
4.7 Constructing Ray semigroups and Ray processes
167
the limit as λ → ∞ and using (4.205) shows us that ρf,x ({0}) = 0. Consequently, ∞
e−λt ρf,x ((t, ∞)) dt
∞
=
0
0
∞
e−λt
dρf,x (s) (t,∞)
−λt
e
= ∞
= 0
(4.215)
dρf,x (s)
[0,s)
0
dt
dt
1 − e−λs dρf,x (s) λ
1 f (x) − [f(x) − λ U λ f (x)] = U λ f (x), λ
=
where for the last line we use (4.214). Set Pt (x, f ) = ρf,x ((t, ∞)). It is easy to see that Pt (x, f ) is decreasing and right continuous in t. Lemma 4.7.3 For x and t fixed consider Pt (x, f ) as a functional on M+ . We can extend it to be a positive bounded linear functional on M, such that for fixed x ∈ S and f ∈ M, Pt (x, f ) is right continuous in t and for all f ∈ M ∞ e−λt Pt (x, f ) dt ∀ x ∈ S. (4.216) U λ f (x) = 0
Proof of Lemma 4.7.3 We show in Lemma 4.7.2 that t → Pt (x, f ) is decreasing and right continuous in t for f ∈ M+ . Therefore, using (4.205) and (4.209), we have f(x)
lim λU λ f (x) ∞ e−λt Pt (x, f ) dt = lim λ λ→∞ ∞0 e−t Pt/λ (x, f ) dt = P0 (x, f ). = lim =
λ→∞
λ→∞
(4.217)
0
Thus, by (4.206), P0 (x, f ) = f(x) ≤ f (x)
∀f ∈ M+ .
(4.218)
In particular, taking f ≡ 1 ∈ M+ and again using the fact that t → Pt (x, 1) is decreasing, we have Pt (x, 1) ≤ 1. These arguments also show that Pt (x, f ) is bounded.
(4.219)
168
Constructing Markov processes
Now let g, h ∈ M+ . It follows from (4.209) that ∞ e−λt (Pt (x, h) − Pt (x, g)) dt. U λ (h − g)(x) =
(4.220)
0
This has two consequences. First, if h−g = h −g with h, g, h , g ∈ M+ , then it follows from (4.220), using the uniqueness property of Laplace transforms and the right continuity of t → Pt (x, ·), that Pt (x, h) − Pt (x, g) = Pt (x, h ) − Pt (x, g ). Thus we can obtain a consistent extension of Pt (x, f ) to f ∈ M by defining Pt (x, f ) = Pt (x, h) − Pt (x, g)
(4.221)
whenever f = h − g with g, h ∈ M+ . With this definition, (4.216) holds for all f ∈ M. Using this, the uniqueness property of Laplace transforms, and the right continuity of t → Pt (x, ·), it is easy to see that Pt (x, · ) is linear. The second consequence of (4.220) comes from the fact that if f ∈ M with f = h − g ≥ 0 and g, h ∈ M+ , then by (4.212) we have that U λ f (x) is the Laplace transform of a positive measure. Using, once again, the uniqueness property of Laplace transforms, (4.221), and the right continuity of t → Pt (x, ·), we see that Pt (x, f ) ≥ 0. Lemma 4.7.4 For each t ∈ R+ we can extend the linear functionals f → Pt (x, f ), f ∈ M, to become a sub-Markov kernel Pt on (S, B) such that for all f ∈ Cb (S) and x ∈ S, t → Pt f (x), defined in (3.1), is right continuous in t and ∞ e−λt Pt f (x) dt. (4.222) U λ f (x) = 0
Proof of Lemma 4.7.4 To begin, fix x and t. By the previous lemma, f → Pt (x, f ) is a positive bounded linear functional on M. By the assumption that any function in C0 (S) is the limit in the uniform norm of functions in H ∩ C0 (S), we can extend f → Pt (x, f ) to be a positive bounded linear functional on C0 (S). By the Riesz Representation Theorem, there is a positive finite Borel measure Pt (x, ·) so that for all f ∈ C0 (S) (4.223) Pt (x, f ) = f (y)Pt (x, dy). Therefore we can extend Pt (x, ·) so that it is a positive finite Borel measure on (S, B). Obviously Pt (x, dy) = Pt (x, dy). Since, by Lemma 4.7.3, Pt (x, f ) is right continuous in t for f ∈ M , the extension t → Pt (x, f ) is right continuous and (4.216) or, equivalently,
4.7 Constructing Ray semigroups and Ray processes
169
(4.222) continues to hold for all f ∈ C0 (S). By (4.219), Pt (x, ·) is a subprobability measure, so that Pt f (x) := f (y)Pt (x, dy) (4.224) extends naturally to f ∈ Cb (S). We now show that t → Pt f (x) is right continuous. Without loss of generality we can assume that f ≥ 0. Fix t0 and > 0 and let K be a compact set with Pt0 (x, K c ) ≤ . Let 0 ≤ gK ≤ 1 be a continuous compactly supported function that is 1 on K and set hK = 1 − gK . Clearly Pt0 hK (x) ≤ . By (4.224), Pt hK (x) = Pt 1 − Pt gK (x), and since 1 ∈ M+ ∩ Cb (S), we have by Lemma 4.7.3 that t → Pt hK (x) is right continuous. Consequently, for some δ > 0, we have Pt hK (x) ≤ 2 for t ∈ [t0 , t0 + δ]. For these values of t, |Pt0 f (x) − Pt f (x)| ≤ |Pt0 (f (x)gK (x)) − Pt (f (x)gK (x))| + 3f ∞ . (4.225) The right continuity of t → Pt f (x) at t0 now follows from that of t → Pt0 (f (x)gK (x)), which holds since f gK ∈ C0 (S). Therefore t → Pt f (x) is right continuous for all f ∈ Cb (S). Using the Dominated Convergence Theorem we can further extend (4.222) from all f ∈ C0 (S) to all f ∈ Cb (S). It remains to show that for all A ∈ B, x → Pt (x, A) is Borel measurable. A simple monotone class argument shows that to prove this, it suffices to show that for all f ∈ Cb (S), x → Pt f (x) is Borel measurable. By hypothesis, U λ f ∈ Cb (S) for all f ∈ Cb (S) and λ ≥ 0. It follows ∞ that 0 ψ(t)Pt f (x) dt is Borel measurable for any ψ ∈ C0 (R+ ). This last remark follows from the fact that the linear span of {e−λt ; λ ≥ 0} is dense in C0 (R+ ), since supposing otherwise we obtain a contradiction: The Hahn–Banach Theorem and the Riesz Representation Theorem give a finite nonzero Borel measure whose Laplace transform is identically zero. To conclude the proof, for any fixed s ≥ 0, we take ψ,s (·) ∈ C0 (R+ ) to be an approximate identity supported on [s, s + ]. Then, using the ∞ right continuity of Pt f (x) in t, Ps f (x) = lim→0 0 ψ,s (t)Pt f (x)dt is Borel measurable in x for each s ≥ 0 and f ∈ Cb (S). Lemma 4.7.5 The family of sub-Markov kernels {Pt ; t ≥ 0} is a subMarkov semigroup on (S, B). Proof of Lemma 4.7.5: We only need to verify the semigroup prop-
170
Constructing Markov processes
erty, and for this it suffices to show that Ps Pt f (x) = Ps+t f (x)
(4.226)
for all s, t ≥ 0 and f ∈ Cb (S). By Lemma 4.7.4, t → Pt f (x) is right continuous and Ps (x, ·) is a bounded measure. Using these observations and the Dominated Convergence Theorem, it follows that t → Ps Pt f (x) is right continuous, and similarly for t → Ps+t f (x). Thus, by the uniqueness property of Laplace transforms, to prove (4.226), it suffices to show that the Laplace transforms of Ps Pt f (x) and Ps+t f (x) are equal or, equivalently, that ∞ Ps U λ f (x) = e−λt Ps+t f (x) dt (4.227) 0
for all λ > 0. By hypothesis, U λ f ∈ Cb (S), so that by the Dominated Convergence Theorem again, both sides of (4.227) are right continuous in s. Therefore, taking Laplace transforms again, we see that to prove (4.226), it suffices to show that ∞ ∞ µ λ U U f (x) = e−µs e−λt Ps+t f (x) ds dt (4.228) 0
0
for all µ, λ > 0. By the first five lines of (3.19) we see that ∞ ∞ µ λ U f (x) − U f (x) = (λ − µ) e−µs e−λt Ps+t f (x) ds dt, (4.229) 0
0
so that (4.228) follows from the resolvent equation (4.201) when λ = µ. By continuity, (4.228) also holds when λ = µ. Proof of Theorem 4.7.1 This follows from Lemmas 4.7.2–4.7.5. Let {Pt ; t ≥ 0} be a Ray semigroup on (S, B). Let δx denote the unit mass at x and set N = {x ∈ S ; P0 (x, dy) = δx (dy)}.
(4.230)
N is called the set of branch points for {Pt ; t ≥ 0}. Heuristically, one may think of a branch point in the following way: If {Xt ; t ≥ 0} is a process with semigroup {Pt ; t ≥ 0}, then a branch point is a point x that is never visited by the path X but may appear as a left limit point of the path, that is, we may have lims↑t Xs = x for some t, even though Xt = x. The path “branches” just before hitting x. This is clarified further by Lemma 4.7.9.
4.7 Constructing Ray semigroups and Ray processes
171
We set D = S − N . D is the set of points in S that are not branch points. We note from the definition of N that P0 (x, {x}) = P0 (x, S) = 1
∀x ∈ D.
(4.231)
Lemma 4.7.6 Under the hypotheses of Theorem 4.7.1, for all x ∈ S and for all t ≥ 0, the measures Pt (x, ·) are carried by D, that is, for all measurable sets A ⊂ S, Pt (x, A ∩ D) = Pt (x, A).
(4.232)
Proof Let fn be a sequence of functions in H ∩ C0 (S) that is dense in C0 (S) in the uniform norm. Each fn can be written as a difference fn = fn,1 − fn,2 of two functions in M+ ∩ Cb (S). Let {gn } be the set of functions {fn,1 , fn,2 , n = 1, . . .}. Since P0 gn (x) = gn (x) for all n implies that P0 f (x) = f (x) for all f ∈ C0 (S), we have D = ∩n {x ∈ S | P0 gn (x) = gn (x)}. Consequently, N = ∪n {x ∈ S | |gn (x) − P0 gn (x)| > 0}. Then, for any t ≥ 0 and y ∈ S, Pt (y, {x ∈ S | |gn (x) − P0 gn (x)| > 0}). Pt (y, N ) ≤
(4.233)
(4.234)
n
Using (4.218) we have that for any > 0, Pt (y, {x ∈ S | |gn (x) − P0 gn (x)| > }) 1 |gn (x) − P0 gn (x)| Pt (y, dx) ≤ 1 = (gn (x) − P0 gn (x)) Pt (y, dx) Pt gn (y) − Pt P0 gn (y) ≤ =0
(4.235)
since by the semigroup property, Pt P0 = Pt . Therefore, Pt (y, N ) = 0 and we get (4.232). On page 63 we show how to extend a sub-Markov semigroup with state space S so that it becomes a Markov semigroup, by introducing a cemetery state. We repeat this process here for Ray semigroups because there are some additional issues we need to consider. Let S be a locally compact space with a countable base and let {Pt ; t ≥ 0} be a Ray
Constructing Markov processes
172
semigroup on (S, B). Recall the definition of S∆ = S ∪ ∆ as the onepoint compactification of S, with ∆ the “point at infinity.” Recall also that {Pt ; t ≥ 0} can be extended to become a Markov semigroup on S∆ , which for clarity we denote in this section as {P t ; t ≥ 0}, in a unique way by setting P t (x, ∆) = 1 − Pt (x, S)
(4.236)
P t (∆, ∆) = 1.
(4.237)
for all x ∈ S and λ
Let {U ; λ ≥ 0} be the corresponding potentials. If f is a function defined on S, we let f be the extension of f to S∆ obtained by setting f (∆) = 0. Note that for functions f defined on S f ∈ Cb (S∆ ) ⇐⇒ f ∈ C0 (S).
(4.238)
If follows from the above definitions that P t f (x) = Pt f (x)
λ
U f (x) = U λ f (x),
+
(4.239) λ
so that if M denotes the supermedian functions for {U ; λ ≥ 0}, +
f ∈ M ⇐⇒ f ∈ M+ .
(4.240)
λ
Note that in general U f (x)is not continuous at ∆, even for compactly supported f . Lemma 4.7.7 Assume all the hypotheses of Theorem 4.7.1. Set D∆ = D∪∆. Then D∆ is the set of nonbranch points for the Markov semigroup {P t ; t ≥ 0}, and for all x ∈ S∆ and t ≥ 0, P t (x, ·) is carried by D∆ . Furthermore, let {gn , n = 1, 2, . . .} be the sequence of functions in M+ ∩ Cb (S) described in the proof of Lemma 4.7.6. We have D∆ = ∩n {x ∈ S∆ | P 0 g n (x) = g n (x)}.
(4.241)
Proof It is clear that if x ∈ D is not a branch point for {Pt ; t ≥ 0}, it is not a branch point for {P t ; t ≥ 0}, and by (4.237), ∆ is not a branch point for {P t ; t ≥ 0}. The remainder of the first paragraph follows immediately from Lemma 4.7.6. The result in the second paragraph follows from what is proved in the first paragraph and the proof of Lemma 4.7.6. Let {Pt ; t ≥ 0} be a Ray semigroup on (S, B). A Ray process is a collection X = (Ω, G, Gt , Xt , θt , P x ) that satisfies all the properties of a right continuous simple Markov process (see page 64) with semigroup
4.7 Constructing Ray semigroups and Ray processes
173
{Pt ; t ≥ 0}, with the exception of the requirement that P x (X0 = x) = 1, which holds only for x ∈ D. Theorem 4.7.8 Let S be a locally compact space with a countable base, and let {Pt ; t ≥ 0} be a Ray semigroup on (S, B) with {U λ ; λ ≥ 0} the associated contraction resolvent on Cb (S). Assume that H ∩ C0 (S) is dense in C0 (S) in the uniform norm. Then we can construct a Ray process X = (Ω, F 0 , Ft0 , Xt , θt , P x ) with state space S∆ and semigroup {Pt ; t ≥ 0} such that X has left limits in S∆ . Proof The proof of this theorem has much in common with the proof of Theorem 4.1.1. R Let x ∈ S∆ . We first construct a probability Px on S∆+ , the space of S∆ -valued functions {f (t), t ∈ [0, ∞)} equipped with the Borel product R σ-algebra B(S∆+ ). Let Xt be the natural evaluation Xt (f ) = f (t). We define Px on sets of the form {X0 ∈ A0 , Xt1 ∈ A1 , . . . , Xtn ∈ An } for all Borel measurable sets A0 , A1 , . . . , An in S∆ and 0 = t0 < t1 < · · · < tn by setting (4.242) Px (X0 ∈ A0 , Xt1 ∈ A1 , . . . , Xtn ∈ An ) n n = IA1 (zi )P 0 (x, dz0 ) P ti −ti−1 (zi−1 , dzi ). i=0
i=1
Here IAi is the indicator function of Ai and {P t , t ≥ 0} is the extension of {Pt , t ≥ 0} to S∆ defined in (4.236). It follows from the semigroup property of {P t ; t ≥ 0} that this construction is consistent. Therefore, R by the Kolmogorov Construction Theorem, we can extend Px to S∆+ . 0 Let Ft = σ(Xs ; s ≤ t). As in the second proof of (2.31), it follows from (4.242) that f (Xt+s ) | F 0 = P s f (Xt ) E (4.243) t for all s, t ≥ 0 and f ∈ Bb (S∆ ). Thus, {Xt ; t ≥ 0} is a simple Markov process with respect to the filtration Ft0 = σ(Xs ; s ≤ t). We now show that {Xt ; t ≥ 0} has a modification that is right continuous with left limits. We begin by noting that, for any f ∈ M+ ∩ Cb (S), it follows from (4.218) that P0 (x, f ) = f(x) ≤ f (x), and, using the fact that t → Pt (x, f ) is decreasing, we have Pt (x, f ) ≤ f (x). It then follows from (4.239) and (4.243) that for any f ∈ M+ ∩ Cb (S), f (Xt ) is a Ft0 supermartingale, for any Px . Let Q be a countable dense set in R+ . By the convergence theorem for supermartingales, Revuz and Yor (1991, Chapter II, Theorem 2.5), f (Xt ) has right- and left-hand limits along Q, almost surely. Let G =
174
Constructing Markov processes
{gn , n = 1, 2, . . .} be the sequence of functions in M+ ∩ Cb (S) described in the proof of Lemma 4.7.6. Since, by hypothesis, any function in C0 (S) can be approximated in the uniform norm by the difference of two functions in G, we have that f (Xt ) has right- and left-hand limits along Q for all f ∈ C0 (S), almost surely. Since {f | f ∈ C0 (S)} is a collection of continuous functions that separate points in S∆ , we see from the argument in the seventh paragraph of the proof of Theorem 4.1.1 that, almost surely, t → Xt has right-hand limits in S∆ along Q. The same argument shows that, almost surely, t → Xt has left-hand limits in S∆ along Q; see also Remark 4.1.2 (2).
t (ω) = For all ω for which the following limit exists for all t, set X
lims↓t,s∈Q Xs , and for those ω for which it does not exist set X· (ω) = ∆.
t ; t ≥ 0} is right continuous with left-hand limits. We Note that {X
t = Xt almost surely. To see this note that for claim that for each t, X + any f, g ∈ M ∩ Cb (S), x f (Xt )g(Xs )
t ) x f (Xt )g(X = lim E (4.244) E s↓t,s∈Q x f (Xt )P s−t g(Xt ) = lim E s↓t,s∈Q x f (Xt )P 0 g(Xt ) , = E since t → Pt g is bounded and right continuous. from It follows the x P 0 g(X
t ) = E x g(X
t ) . In semigroup property P t P 0 = P t that E addition, by (4.218) we have that P 0 (x, g) ≤ g(x). Therefore P 0 g(Xt ) = g(Xt )
a.s.
Using this, it follows from (4.244) that
t ) = E x f (Xt )g(Xt ) , x f (Xt )g(X E
(4.245)
(4.246)
and by our assumption that H ∩ C0 (S) is dense in C0 (S) we see that (4.246) continues to hold for all f, g ∈ C0 (S). The same proof works if the functions f and g are replaced by functions that are constant on S∆ . Consequently, (4.246) holds for all functions f, g ∈ Cb (S∆ ). As in the eighth paragraph of the proof of Theorem
t = Xt almost surely. 4.1.1, this shows that for each t, X The rest of the theorem follows as in the ninth paragraph of the proof of Theorem 4.1.1. Given a set Ω and a filtration Ft0 , we define the optional σ-algebra O to be the σ-algebra of subsets of R+ × Ω generated by all real-valued processes Yt that are right continuous with left limits and are adapted
4.7 Constructing Ray semigroups and Ray processes
175
to the filtration Ft0 . A set A ∈ O is called an optional set, or simply optional. A process Zt is said to be optional if the function (t, ω) → Zt (ω) is O measurable. Lemma 4.7.9 Let X = (Ω, F 0 , Ft0 , Xt , θt , P x ) be the Ray process constructed in Theorem 4.7.8. Then, almost surely, Xt ∈ D∆ for all t ≥ 0. Proof As in Lemma 4.7.7, let gn be the sequence of functions in M+ ∩ Cb (S) for which D∆ = ∩n {x ∈ S∆ | P 0 g n (x) = g n (x)}.
(4.247)
It suffices to show that for each n, P¯0 g¯n (Xt ) = g¯n (Xt ) for all t, almost surely. Since this equality is trivial if Xt = ∆, we need only show that P¯0 g¯n (Xt )1S (Xt ) = g¯n (Xt )1S (Xt ) for all t, almost surely. Let Km be a sequence of compact sets with ∪m Km = S, such that each Kn is the closure of its interior and Kn ⊂ int Kn+1 . Let fm be a continuous function that is supported on Km and equal to 1 on Km−1 . It suffices to show that for all n, m, P x P 0 (g n (Xt )fm (Xt )) = g n (Xt )fm (Xt ), for all t = 1. (4.248) Note that g n fm ∈ C0 (S) by the support property of fm . This implies that the process {g n (Xt )fm (Xt ), t ≥ 0} is optional. The same reasonλ ing shows that {U g n (Xt )fm (Xt ), t ≥ 0} is optional. Since, by (4.217), λ P 0 g n (x) = limλ→∞ λU g n (x), we have that {P 0 g n (Xt )fm (Xt ), t ≥ 0} is optional. Let A = {(t, ω) | P 0 (g n (Xt )fm (Xt )) = g n (Xt )fm (Xt )}.
(4.249)
A is an optional set. Therefore, by the Optional Section Theorem (Dellacherie and Meyer (1978, Theorem IV.84), Rogers and Williams (2000a, Theorem VI.5.1)), if (4.248) does not hold, we can find an Ft0 stopping time Tn,m with P x (Tn,m < ∞) > 0, such that (Tn,m (ω), ω) ∈ A on {Tn,m < ∞}. Set X∞ = ∆. Since P 0 g n (x) ≤ g n (x) by (4.218), we see that if (4.248) does not hold, P x XTn,m ∈ Km and P 0 g n (XTn,m ) < g n (XTn,m ) > 0. (4.250) Using once more the fact that P 0 g n (x) ≤ g n (x), it is clear that in order to prove (4.248), it suffices to show that E x P 0 g n (XTn,m ) = E x g n (XTn,m ) . (4.251) Let Tn,m,k =
[2k Tn,m ] + 1 , as in the proof of Lemma 2.2.5, and recall 2k
Constructing Markov processes
176
that for each k, Tn,m,k is an Ft0 stopping time, Tn,m,k ↓ Tn,m , and Tn,m < ∞ if and only if Tn,m,k < ∞, for all k ≥ 1. Since, by (4.245), P 0 g n (Xt ) = g n (Xt ) almost surely for each fixed t, and Tn,m,k is rational valued, we have P 0 g n (XTn,m,k ) = g n (XTn,m,k ),
a.s.
(4.252)
Recall that XTn,m ∈ Km . Thus g n (Xt ) is right continuous at t = Tn,m , so that (4.253) E x g n (XTn,m ) = lim E x g n (XTn,m,k ) . k→∞
Therefore, to establish (4.251) it suffices to show that E x P 0 g n (XTn,m ) = lim E x P 0 g n (XTn,m,k ) .
(4.254)
k→∞
λ
Note that by (4.217) and (4.204), P 0 g n (Xt ) = limλ→∞ λU g n (Xt ) for λ all t, where the limit is an increasing limit. Since U g n is continuous on S, it follows, as in (4.253), that λ λ (4.255) E x U g n (XTn,m ) = lim E x U g n (XTn,m,k ) . k→∞
Since
U λ gn − µU µ U λ gn = U λ (gn − µU µ gn ) ≥ 0
for all µ > 0, we see that U λ gn ∈ M+ ∩ Cb (S). Consequently, it follows λ from the fourth paragraph of the proof of Theorem 4.7.8 that U g n (Xt ) is a supermartingle. Therefore, since Tn,m,k is rational valued, the limit in (4.255) is also an increasing limit. Using the interchangeability of increasing limits we see that E x (P 0 g n (XTn,m ))
= = = =
λ
lim λE x (U g n (XTn,m ))
λ→∞
(4.256)
λ
lim λ lim E x (U g n (XTn,m,k ))
λ→∞
k→∞
λ
lim lim λE x (U g n (XTn,m,k ))
k→∞ λ→∞ x
lim E (P 0 g n (XTn,m,k )),
k→∞
which gives (4.254). We say that a function u(x, y) on S × S is strongly continuous, if u(x, y) is continuous on S × S, bounded in x for each fixed y ∈ S, and continuous in x uniformly in y ∈ S (to motivate this definition, recall Lemma 3.4.3, which states that a strongly symmetric Borel right process
4.7 Constructing Ray semigroups and Ray processes
177
with continuous potential densities has strongly continuous potential densities). Theorem 4.7.10 Let S be a locally compact space with a countable base, and let {U λ ; λ ≥ 0} be a symmetric contraction resolvent on Cb (S). Assume that U 0 has a symmetric strongly continuous density u(x, y) with u(y, y) > 0 for all y ∈ S. Assume also that H ∩ C0 (S) is dense in C0 (S) in the uniform norm. Then we can construct a strongly symmetric right continuous simple Markov process X = (Ω, F 0 , Ft0 , Xt , θt , P x ) in S with potential operators {U λ ; λ ≥ 0}. Proof Under these hypotheses, Theorem 4.7.1 gives a Ray semigroup {Pt ; t ≥ 0} with potentials {U λ ; λ ≥ 0}. Then, by Theorem 4.7.8 and Lemma 4.7.9, we can construct a Ray process X = (Ω, F 0 , Ft0 , Xt , θt , P x ) with state space S∆ such that Xt ∈ D∆ for all t ≥ 0 almost surely. We now use the assumption that u(x, y) is strongly continuous to show that N = ∅. This implies that P x (X0 ) = 1 for all x ∈ S∆ and completes the proof of this theorem. Let y ∈ N . Let f,y be an approximate δ-function at y with respect to some reference measure m. Following the proof of Theorem 3.6.3, we see that there exists a sequence {n } tending to zero, such that almost surely t fn ,y (Xs ) ds (4.257) Ayt := lim n →0
0
exists and the convergence is uniform in t. Furthermore, t → Ayt is almost surely increasing and continuous, and for each x ∈ S E x (Ay∞ ) = u(x, y).
(4.258)
(Note that in the proof of Theorem 3.6.3 we initially use the α-potential for α > 0. Consequently, we get uniform convergence in (3.92) only on finite intervals. As we note at the end of the proof Theorem 3.6.3, when the 0-potential exists, we can dispense with the exponentially killed process. This gives us uniform convergence in (4.257) for all t ∈ R+ .) Clearly dAyt is supported on J = {t : Ayt+ − Ayt− > 0 for all > 0}. It follows from the definition of and Xs ∈ D∆ for all s ≥ 0, that
Ayt
(4.259)
in (4.257), and the fact that y ∈ N
J ⊆ {t | Xt− = y and Xt = Xt− }.
(4.260)
Since X is right continuous, the set of discontinuities, which by (4.260)
178
Constructing Markov processes
includes J , must be countable almost surely. However, since t → Ayt is continuous, dAyt has no atoms, so Ayt ≡ 0 for all t. This implies, by (4.258), that u(x, y) = 0 for all x ∈ S. The hypothesis that u(y, y) > 0 for all y ∈ S then implies that N = ∅.
4.8 Local Borel right processes The strongly symmetric right continuous simple Markov process X constructed in Theorem 4.7.10 is not necessarily a Borel right process. To see this, let ζ := inf{t > 0 | Xt = ∆} and ζ := inf{t > 0 | Xs = ∆, ∀s ≥ t}.
(4.261)
By Lemma 3.2.7, if X is a Borel right process, ζ = ζ. To see that this need not hold for the strongly symmetric right continuous simple Markov process X constructed in Theorem 4.7.10, let {U λ ; λ ≥ 0} be the symmetric contraction resolvent on Cb (S) with λpotential densities √ e− 2(1+λ)|x−y| λ (4.262) u (x, y) = 2(1 + λ) with respect to Lebesgue measure on the locally compact space S = R1 − {0} = (−∞, 0) ∪ (0, ∞). uλ (x, y) is the λ-potential density with respect to Lebesgue measure of Brownian motion on R1 that is killed at the end of an independent exponential time with mean 1. However, in the state space S we remove 0 from R1 . The one-point compactification for S can be viewed as embedding S in [−∞, ∞] and identifying the three points {−∞, ∞, 0} as the single point ∆. By the uniqueness property of Laplace transforms it is immediate that the semigroup corresponding to uλ (x, y) is the semigroup {Pt ; t ≥ 0} for exponentially killed Brownian motion on R1 restricted to S. The construction in Theorem 4.7.8 proceeds by first using the semigroup {Pt ; t ≥ 0} together with the Kolmogorov Extension Theorem 1 to obtain a process Xt indexed by t in a countable dense set D ⊂ R+ , and then extending it by taking limits from the right. The process {Xt , t ∈ D} has the same distribution in S as exponentially killed Brownian motion Bt , but since P x (Bt = 0) = 0 for fixed t and D is countable, we see that, almost surely, exponentially killed Brownian motion does not hit 0 for all t ∈ D. It follows that the process constructed in Theorem 4.7.10 on S∆ , for uλ (x, y) as given in (4.262), is obtained from 1 exponentially killed Brownian motion by projecting R∆ onto S∆ , which
4.8 Local Borel right processes
179
simply identifies 0 and ∆. Since, with positive probability, exponentially killed Brownian motion hits 0 often before it is killed, we see that ζ < ζ, with positive probability. Therefore, in general, the processes constructed in Theorem 4.7.10 are not Borel right processes. Nevertheless, they are used in Theorem 4.8.4 to construct a class of Markov processes that have most of the nice properties of Borel right processes. We call these processes local Borel right process. A collection X = (Ω, G, Gt , Xt , θt , P x ) that satisfies the following three conditions is called a local Borel right process with state space S and transition semigroup {Pt ; t ≥ 0}: (1) X is a right continuous simple Markov process with transition semigroup {Pt ; t ≥ 0}. (2) U α f (Xt ) is right continuous at all t such that Xt ∈ S, for all α > 0 and all f ∈ Cb (S∆ ). (3) {Gt ; t ≥ 0} is augmented and right continuous. We emphasize that condition (1) requires that t → Xt (ω) is a right continuous map from R+ to S∆ . In fact, the definition of a local Borel right process differs from that of a Borel right process only in condition (2). For a Borel right process we must have that U α f (Xt ) is right continuous at all t such that Xt ∈ S∆ , which of course is all t ∈ R+ . To clarify this distinction we show that for the process considered at the beginning of this section, U α f (Xt ) is not right continuous for all f ∈ Cb (S∆ ) at values of t for which Xt = ∆. Let f ∈ C0+ (S) have support in [1, 2] and note that √ U α f (x) = ( 2(1 + λ))−1 e− 2(1+λ)|x−y| f (y) dy, (4.263) so that
√ lim U α f (x) = ( 2(1 + λ))−1 e− 2(1+λ)|y| f (y) dy > 0.
x→0
(4.264)
If limt↓t0 Yt = 0 for exponentially killed Brownian motion Yt , then for Xt , its projection in S, we have limt↓t0 Xt = ∆. Therefore, by (4.264), lim U α f (Xt ) = lim U α f (x) > 0,
t↓t0
x→0
(4.265)
but U α f (Xt0 ) = U α f (∆) = 0. On the other hand, since Xt is right continuous and U α f ∈ Cb (S), limt↓t0 U α f (Xt ) = U α f (Xt0 ) for all t0 for which Xt0 ∈ S.
180
Constructing Markov processes
Before presenting Theorem 4.7.10 we collect some facts about local Borel right processes. Lemma 4.8.1 If S is compact, then any local Borel right process with state space S is a Borel right process. Proof Let Xt = ∆. When S is compact, recall that ∆ is added as an isolated point. Since by assumption Xs is right continuous, we must have Xs = ∆ for all t ≤ s ≤ s + for some > 0, and the result follows.
Lemma 4.8.2 Let X be a local Borel right process with locally compact state space S. If, for some α > 0 and f ∈ Cb+ (S), we have that U α f ∈ C0 (S) and is strictly positive on S, then X is a Borel right process. Proof Recall that we show in the fifth paragraph of the proof of Theorem 4.1.1 that e−αt U α f (Xt ) is a supermartingale. Furthermore, since U α f ∈ C0 (S) and the paths of X are right continuous, we have that e−αt U α f (Xt ) is right continuous and, by hypothesis, U α f (Xt ) is 0 only when Xt = ∆. It then follows from a standard result about right continuous supermartingales that ζ¯ = ζ (see Dellacherie and Meyer (1980, Theorem VI.17)).
The proof of Lemma 3.2.7 shows that if X is right continuous and satisfies the strong Markov property, then ζ = ζ. Hence, the example considered at the beginning of this section, and consequently local Borel right processes in general, do not satisfy the strong Markov property. The next theorem provides a substitute for the strong Markov property that is sufficient for our needs. Lemma 4.8.3 Let X = (Ω, G, Gt , Xt , θt , P x ) be a right continuous simple Markov process. Assume that U α f (Xt ) is almost surely right continuous at all t such that Xt ∈ S, for all f ∈ Cb (S∆ ) and all α > 0. If T is a Gt+ stopping time such that XT ∈ S on T < ∞, almost surely, then E x f (XT +s )1{T <∞} | GT + = Ps f (XT )1{T <∞} (4.266) for all s ≥ 0, x ∈ S, and all f ∈ Bb (S∆ ). Furthermore, E x (f (Xt+s ) | Gt+ ) = Ps f (Xt ) for all s, t ≥ 0, x ∈ S, and all f ∈ Bb (S∆ ).
(4.267)
4.8 Local Borel right processes
181
Proof The proof of (4.266) follows exactly as in the proof of Theorem 3.2.1. Equation (4.267) does not follow immediately from that proof since we may have Xt = ∆ with positive probability. However, by the simple Markov property and (4.237), P x (Xr = ∆ and Xs = ∆) = 0 for any r < s and x ∈ S∆ . Therefore, if t is fixed and D is a countable dense set in R+ , P x (Xt = ∆ and Xs = ∆ for some s > t, s ∈ D) = 0. Consequently, using the right continuity of the paths of X, we have P x (Xt = ∆ and Xs = ∆ for some s > t) = 0.
(4.268)
It is clear from this that, for any fixed t, if Xt ∈ ∆, then U α f (Xs ) is right continuous at t, almost surely. Since by hypothesis U α f (Xs ) is almost surely right continuous at all t such that Xt ∈ S for all f ∈ Cb (S∆ ) and all α > 0, we see that for any fixed t, U α f (Xs ) is almost surely right continuous at t for all f ∈ Cb (S∆ ) and all α > 0. Using this, the proof of (4.267) follows exactly as in the proof of Theorem 3.2.1 (see also Remark 3.2.2). Theorem 4.8.4 Let S be a locally compact space with a countable base and let {U λ ; λ ≥ 0} be a symmetric contraction resolvent on Cb (S). Assume that U 0 has a symmetric strongly continuous density u(x, y) with u(y, y) > 0 for all y ∈ S. Assume also that H ∩ C0 (S) is dense in C0 (S) in the uniform norm. Then we can construct a strongly symmetric local Borel right process X = (Ω, F, Ft , Xt , θt , P x ) with state space S and potential operators {U λ ; λ ≥ 0}. Proof Let X = (Ω, F 0 , Ft0 , Xt , θt , P x ) be the strongly symmetric right continuous simple Markov process constructed in Theorem 4.7.10. It follows from the fact that U α f ∈ Cb (S) for all f ∈ Cb (S∆ ) and all α > 0 that condition (2) for a local Borel right process is satisfied. Let {Ft , t ≥ 0} be the standard augmentation of {Ft0 , t ≥ 0}. As in the proof of (2.75), it follows from (4.267) that E x (Y ◦ θt | Ft+ ) = E Wt (Y )
(4.269)
for all F 0 measurable functions Y . It then follows from the proof of Lemma 2.3.1 that Ft+ = Ft for all t ≥ 0. Using this, the theorem follows as in the proof of Lemma 3.2.3. As in the proof of Lemma 2.3.2, (4.266) extends to all F measurable functions Y .
Constructing Markov processes
182
Lemma 4.8.5 Let X = (Ω, F, Ft , Xt , θt , P x ) be a local Borel right process. If T is an Ft stopping time such that XT ∈ S on T < ∞, almost surely, then E x Y ◦ θT 1{T <∞} | FT = E XT (Y ) 1{T <∞} (4.270) for all x ∈ S and F measurable functions Y . We call the result of this lemma the local strong Markov property.
4.9 Supermedian functions Let S be a locally compact space with a countable base. Supermedian functions are functions in Bb+ (S) that satisfy (4.202), where {U λ ; λ ≥ 0} is a contraction resolvent. We use M+ to denote these functions and M to denote M+ − M+ . We now give some properties of M+ and M for use in the next section. Lemma 4.9.1 Let {fn } be a sequence of functions in M+ . Then (1) (2) (3) (4) (5)
fn ↑ f =⇒ f ∈ M+ . inf n fn ∈ M+ . lim inf n fn ∈ M+ . M is closed under ∧ and ∨. M contains the constant functions.
Proof Property (1) follows from the Monotone Convergence Theorem and (2) from the definition of supermedian functions. Property (3) follows from Fatou’s Lemma. Closure in (4) under ∧ follows from (2) and the fact that (f − g) ∧ (u − v)
=
[(f + v) − (g + v)] ∧ [(u + g) − (v + g)]
=
[(f + v) ∧ (u + g)] − (v + g).
Since both (f + v) ∧ (u + g) and v + g are in M+ , we see that (f − g) ∧ (u − v) ∈ M. Writing (f − g) ∨ (u − v) = −[(g − f ) ∧ (v − u)], the same proof gives closure under ∨. Property (5) follows from (4.200). A function f ∈ M+ is said to be excessive if λU λ f (x) ↑ f (x)
as λ → ∞
∀ x ∈ S.
(4.271)
Lemma 4.9.2 Assume that U 0 has a symmetric strongly continuous density u(x, y). Then (1) uv (x) := u(x, v) is excessive for each v ∈ S.
4.9 Supermedian functions (2)
183
U 0 ρ(x) :=
u(x, v) dρ(v)
(4.272)
is excessive for any finite positive measure ρ on S. Such functions are called potentials. (3) All excessive functions are the increasing limits of potentials. Proof To obtain (1), let f ∈ Cb+ (S) and let α < β. It follows from the resolvent equation (4.201) that αU α U 0 f (x) = U 0 f (x) − U α f (x)
(4.273)
βU β U 0 f (x) = U 0 f (x) − U β f (x)
(4.274)
and U α f (x) − U β f (x) = (β − α)U α U β f (x) ≥ 0.
(4.275)
Combining these equations, we see that αU α U 0 f (x) ≤ βU β U 0 f (x).
(4.276)
Since this is true for all f ∈ Cb+ (S), it follows from the strong continuity of u that αU α uv (x) ≤ βU β uv (x).
(4.277)
Consequently, h(v) := limα→∞ αU α uv (x) exists and, by the strong continuity of u, h ∈ Cb+ (S). Let f ∈ Cb+ (S) with compact support. By the Monotone Convergence Theorem, (4.273), and (4.200), αU α uv (x)f (v) dm(v) (4.278) h(v)f (v) dm(v) = lim α→∞
lim αU α U 0 f (x) = U 0 f (x) = uv (x)f (v) dm(v).
=
α→∞
Since this is true for all f ∈ Cb+ (S) with compact support, it follows from the continuity of h and uv that h = uv . This proves (1). (2) follows from (1). To obtain (3), we note that when f ∈ M+ is excessive, 0 ≤ f −λU λ f ↓ 0. Using the resolvent equation, we then have λU 0 (f −λU λ f ) = λU λ f ↑ f pointwise as λ → ∞.
184
Constructing Markov processes
4.10 An extension theorem for local Borel right processes All the results obtained about strongly symmetric Borel right process on S make perfect sense when S is finite (although much of the machinery we develop is not needed to handle this simple case). Also, since every finite set is compact, by Lemma 4.8.1 for such sets, there is no difference between Borel right processes and local Borel right processes. We now present a theorem that allows us to reduce the study of certain properties of strongly symmetric local Borel right processes, to processes with finite state spaces. This is used in Chapter 13. Whenever we consider a process on a finite state space we take the reference measure to be the measure that assigns mass 1 to each point in the state space. Such a measure is called a counting measure. Theorem 4.10.1 (Extension Theorem) Let S be a locally compact space with a countable base. Let Γ(x, y) be a continuous function on S × S. Assume that for every finite subset S ⊆ S, Γ(x, y) is the 0potential density of a strongly symmetric transient Borel right process X on S . Then Γ(x, y) is the 0-potential density of a strongly symmetric transient local Borel right process X on S. In particular, if S is compact, then Γ(x, y) is the 0-potential density of a strongly symmetric transient Borel right process X on S. Proof We can always find a probability measure µ on S that gives strictly positive mass to each open set of S. By Lemma 3.4.6 applied to all finite subsets S ⊆ S, Γ(y, y) > 0 for all y ∈ S. Set dm(y) = 1 dµ(y). By Lemma 3.4.3 again applied to all finite subsets S ⊆ S, Γ(y, y) Γ(x, y) ≤ Γ(y, y) for all x ∈ S. We define the operator Hf (x) = Γ(x, y)f (y)dm(y), f ∈ Cb (S). (4.279) S
We have
|Hf (x)|
≤
f ∞
Γ(x, y)dm(y)
(4.280)
S
= f ∞ S
Γ(x, y) dµ(y) ≤ f ∞ . Γ(y, y)
Hence H is a positive bounded linear operator on f ∈ Cb (S). We show that H is the 0-potential operator of a local Borel right process in S. It then follows from (4.279) that H has 0-potential density Γ(x, y) with respect to the measure m.
4.10 Extension Theorem
185
To begin, we show that H satisfies the positive maximum principle on Cb (S). Considering the form of H in (4.279), it is easy to see that it suffices to show that for each compact set K ⊆ S, the restriction of H to K, which we denote by HK , satisfies the positive maximum principle on C(K). n For each n write K as a finite disjoint union ∪m j=1 Kn,j of neighborhoods Kn,j with diameter less than or equal to 1/n. For each j choose a point xn,j ∈ Kn,j . Define the operators Hn h(xn,i ) =
mn
Γ(xn,i , xn,j )h(xn,j )m(Kn,j ).
(4.281)
j=1
We claim that for each n the operator Hn satisfies the positive maximum principle on the set En = {xn,j , 1 ≤ j ≤ mn }. It is easy to check that this claim follows once we show that mn H n h(xn,i ) = Γ(xn,i , xn,j )h(xn,j ) (4.282) j=1
satisfies the positive maximum principle on En . To prove this last assertion note that, by hypothesis, H n is the 0-potential operator of a strongly symmetric transient Borel right process on En . Obviously, since En is finite, this process is also a Feller process. Therefore, by Theorem 4.1.6, H n satisfies the positive maximum principle on En . We now show that this implies that HK satisfies the positive maximum principle on C(K). Suppose that it does not. Then there exists some a > 0 and a continuous function h on K with supx∈K HK h(x) > 0 for which sup {x:h(x)>0}
HK h(x) ≤ a
(4.283)
but for some x∗ ∈ K, with h(x∗ ) ≤ 0, and some b > 0 HK h(x∗ ) > a + b.
(4.284)
By replacing h(x) by h(x)− for > 0 sufficiently small, we may assume that h(x∗ ) < 0. Since h is continuous, for all sufficiently large n we can find some x∗∗ ∈ En for which h(x∗∗ ) ≤ 0. Using the continuity of Γ and h on the compact set K it follows from (4.283) and (4.284) that for all sufficiently large n, sup {x∈En :h(x)>0}
Hn h(x) ≤ a + b/2
(4.285)
and Hn h(x∗∗ ) > a + b/2.
(4.286)
186
Constructing Markov processes
This contradicts the assumption that Hn satisfies the positive maximum principle on En . Therefore, HK satisfies the positive maximum principle on C(K). By Remark 4.1.10 we can construct a symmetric contraction resolvent {U λ ; λ ≥ 0} on Cb (S) with U 0 = H. By Lemma 3.4.3 applied to all finite subsets S ⊆ S, we see that u(x, y) = Γ(x, y) is strongly continuous, and by Lemma 3.4.6 applied to all finite subsets S ⊆ S, we see that u(y, y) > 0 for all y ∈ S. Set uv (x) = u(x, v). It follows from Lemma 4.9.2 (1) that uv (x) is excessive for each v ∈ S and thus is a supermedian function on S for each v ∈ S. Let x, y ∈ S. By hypothesis, u(x, y) = Γ(x, y) is the 0-potential density of a strongly symmetric transient Borel right process on the two point set S = {x, y}. Therefore, it follows from (3.109) and Lemma 3.6.12 that u(x, y) < u(x, x) ∨ u(y, y)
∀ x, y ∈ S.
(4.287)
If u(x, y) < u(x, x), then ux (x) = ux (y), whereas if u(x, y) < u(y, y), then uy (x) = uy (y). This shows that the collection M+ ∩ Cb (S) of continuous bounded supermedian functions separates points of S. It follows from Lemma 4.9.1 (4) and (5) that M ∩ Cb (S) is a vector lattice that contains the constants. Suppose S is compact. Then it follows from Theorem 14.5.1 that M ∩ Cb (S) is dense in (Cb (S), · ∞ ). Therefore, when S is compact this theorem follows from Theorem 4.8.4. We now consider the general case in which S is a locally compact space with a countable base. We show below that for any open set B ⊆ S with compact closure and any x ∈ B, we can find a function vB,x ∈ H ∩C0 (S) that is supported in B and vB,x (x) > 0. Because S is locally compact, this shows that H ∩ C0 (S) separates points of S. Since H is a vector lattice that also contains the constants, it follows from Theorem 14.5.2 that H ∩ C0 (S) is dense in (C0 (S), · ∞ ). We can now use Theorem 4.8.4 to complete the proof of this theorem. We now construct the functions vB,x . Let {Kn } be an increasing sequence of compact subsets of S with ∪∞ n=1 Kn = S, such that each Kn is the closure of its interior and Kn ⊆ int Kn+1 . For f ∈ C(Kn ) and x ∈ Kn , we define the operator u0 (x, y)f (y)dm(y).
Vn f (x) = Kn
(4.288)
4.10 Extension Theorem
187
Since we have proved this theorem for compact state spaces, we know that Vn is the 0-potential of a strongly symmetric Borel right process (n) Xt in Kn with continuous potential densities. t(n) , By Theorem 4.5.2, for any open set B ⊆ int Kn , the process X (n) (n) obtained by killing Xt at TB c , is a strongly symmetric Borel right process in B with continuous potential densities, which we denote by u (n),α (x, y). By (4.152), ∀ x, y ∈ Kn . u (n),0 (x, y) = u0 (x, y) − E x u0 XT (n) , y 1T (n) <∞ B
c
B
c
(4.289) c It is clear that the right-hand side of (4.289) is zero for x ∈ B . Consec quently, by symmetry, it is zero for all for x, y ∈ B . Fix x ∈ B. Let (n),0 (x, y) y ∈ Kn u vx(n) (y) = 0 y ∈ Knc . (n)
vx (y) is clearly a continuous function on S with support in B. Furthermore, since u (n),0 (x, y) is the 0-potential density of a strongly symmetric (n) Borel right process, it follows from Lemma 3.4.6 that vx (x) > 0 for (n) x ∈ B. Consequently, vx (y) is not identically zero. Note that vx(n) (y) = u0 (x, y)1{Kn } (y) (4.290) − E x u0 XT (n) , y 1{Kn } (y) 1T (n) <∞ . B
c
B
c
Let n0 be such that that B ⊆ int Kn0 . It follows from Remark 4.5.3 (n) that vx is independent of n for n ≥ n0 . We denote this function by vB,x (y). It remains to show that {vB,x (y), y ∈ S} ∈ M. For n ≥ n0 we have, by (4.290), (4.291) vB,x (y) = u0 (x, y)1{Kn } (y) − E x u0 XT (n) , y 1{Kn } (y) 1T (n) <∞ . B
c
B
c
So it also holds if we take the limit as n → ∞, providing that the limit exists. Clearly u0 (x, y)1{Kn } (y) ↑ u0 (x, y) = u0x (y), and by Lemma 4.9.2 (1), u0x (y) ∈ M+ . Now set (4.292) hn (y) = E x u0 XT (n) , y 1{Kn } (y) 1T (n) <∞ . B
c
B
c
188
Constructing Markov processes
We see from (4.291) that hn (y) increases in n to some continuous, nonnegative limit that we denote by h∞ (y). We conclude the proof by showing that h∞ ∈ M+ . Consider the contraction resolvent {U (n),λ ; λ ≥ 0} on Cb (Kn ) with (n),0 = Vn . It follows from Lemma 4.9.2 (1) that u0v (y)1{Kn } (y) is exU cessive for this contraction resolvent. Therefore, using Lemma 4.9.2 (2) and (4.292), we see that h∞ (y) is excessive for this contraction resolvent. It now follows from Lemma 4.9.2 (3) that there is a sequence of nonnegative functions {gn,k }, k = 1, . . . on Kn with Vn gn,k ↑ h∞ or, equivalently, that U 0 (gn,k 1{Kn } ) ↑ h∞ pointwise on Kn . Let wn (y) := lim inf U 0 (gn,k 1{Kn } )(y) k→∞
y ∈ S.
(4.293)
By Lemma 4.9.2 (2), U 0 (gn,k 1{Kn } ) ∈ M+ . Consequently, by Lemma 4.9.1 (3), wn ∈ M+ . On the other hand, wn = h∞ on Kn . This implies that h∞ = lim inf n→∞ wn on S. Using Lemma 4.9.1 (3) again, we see that h∞ ∈ M+ .
4.11 Notes and references We refer the reader to Bertoin (1996), Kallenberg (2001), Khoshnevisan (2002), and Stroock (1993) for many different interesting proofs of the existence of L´evy processes. The material in Sections 4.4–4.5 is adapted from the classic references on Markov processes mentioned in Section 3.11. The content of Section 4.6, which comes from our paper Marcus and Rosen (1992d) is due to P. Fitzsimmons. Section 4.7 is adapted from Getoor (1975) and Dellacherie and Meyer (1987), both of which are based on Ray (1963). All these sources, however, only consider compact state spaces. In their treatment, a state space that is not compact is compactified by a procedure known as the Ray–Knight compactification. This compactification changes the topology and can substantially enlarge the state space. For the applications of the results in Sections 4.7 that we present in Chapter 13, it is necessary to deal directly with locally compact spaces. The basic idea for the extension theorem of Section 4.10 is due to P. Fitzsimmons.
5 Basic properties of Gaussian processes
5.1 Definitions and some simple properties A real-valued random variable X is a Gaussian random variable if it has characteristic function σ 2 λ2 (5.1) EeiλX = exp imλ − 2 for some real numbers m and σ. It follows from (5.1), by differentiating with respect to λ and then setting λ = 0, that E(X) = m
and
Var(X) = σ 2 .
(5.2)
An Rn -valued random variable ξ is a Gaussian random variable if (y, ξ) is a real-valued Gaussian random variable for each y ∈ Rn . By the above, this is equivalent to saying that it has characteristic function Var((y, ξ)) def i(y,ξ) = exp iE((y, ξ)) − φξ (y) = Ee (5.3) 2 for each y ∈ Rn . Let m = (m1 , . . . , mn ). Setting Eξj = mj
and
E(ξj − mj )(ξk − mk ) = Σj,k ,
we see that (5.3) can be written as yΣy t φξ (y) = exp imy t − , 2
(5.4)
(5.5)
where Σ = {Σj,k }nj,k=1 is a symmetric n×n matrix with real components. (We also consider a vector in Rn as a 1 × n matrix. Thus we can express (m, y) as my t ). We refer to m as the mean vector, or simply the mean, of ξ and Σ as the covariance matrix of ξ. We note that the distribution of an Rn -valued Gaussian random variable is completely determined by its mean vector and covariance matrix. The rank of Σ equals the dimension of the subspace of Rn , which supports the distribution of ξ. 189
190
Basic properties of Gaussian processes
A remarkable property of Gaussian random variables is that uncorrelated Gaussian random variables are independent. Lemma 5.1.1 Let ξ be an Rn -valued Gaussian random variable with mean vector m and assume that E(ξj − mj )(ξk − mk ) = 0
j = k.
(5.6)
Then ξ1 , . . . , ξn are independent. Proof
In this case
n
Σj,j yj2 φξ (y) = exp imj yj − 2 j=1
.
(5.7)
A matrix A = {Aj,k }nj,k=1 with real components is positive definite if n j,k=1 aj ak Aj,k ≥ 0 for all real numbers a1 , . . . , an .We say that the n positive definite matrix A is strictly positive definite if j,k=1 aj ak Aj,k = 0 implies that all the ai are zero. A function Σ(s, t) on T ×T is a positive definite function if, for all n ≥ 1 and t1 , . . . , tn in T , the n × n matrix Σ, defined by Σj,k = Σ(tj , tk ) is a positive definite matrix. A function {φ(u), u ∈ T } is positive definite if the function Σ(s, t) := φ(s − t) is positive definite. Both of these definitions hold similarly for strict positive definiteness. The covariance matrix Σ in (5.5) is positive definite. This is easy to see, since 2 n n aj ak Σj,k = E aj (ξj − mj ) . (5.8) j=1
j,k=1
Let ξ be an Rn -valued Gaussian random variable with mean m and covariance matrix Σ and let U be a p × n matrix. Then η = (U ξ t )t is an Rp -valued Gaussian random variable with mean (U mt )t and covariance matrix U ΣU t . This follows from (5.5) since φη (z)
t
t
= Eeiηz = Eeiξ(zU ) = φξ (zU ) zU Σ(zU )t t = exp im(zU ) − 2 zU ΣU t z t t t t = exp i(U m ) z − . 2
When Σ is strictly positive definite its rank is equal to n.
(5.9)
5.1 Definitions and some simple properties
191
Lemma 5.1.2 Let ξ be an Rn -valued Gaussian random variable with mean m and strictly positive definite covariance matrix Σ. Then the probability distribution of ξ is absolutely continuous with respect to Lebesgue measure on Rn and has probability density function 1 (x − m)Σ−1 (x − m)t f (x) = , (5.10) exp − 2 (2π)n/2 (det Σ)1/2 where det Σ denotes the determinant of Σ and x ∈ Rn . We also use the notation |Σ| for det Σ. Proof Since the characteristic function is unique, it suffices to show that 1 (x − m)Σ−1 (x − m)t i(y,x) exp − e dx = φξ (y), 2 (2π)n/2 (det Σ)1/2 (5.11) where φξ (y) is given in (5.5). Since Σ is a strictly positive definite symmetric matrix, it has a symmetric square root Σ1/2 . Using this we see that (x − m)Σ−1 (x − m)t dx (5.12) ei(y,x) exp − 2 xΣ−1 xt = ei(y,m) ei(y,x) exp − dx 2 1/2 xxt dx. = (det Σ)1/2 ei(y,m) ei(yΣ ,x) exp − 2 Let η be an Rn -valued Gaussian random variable with mean zero and covariance matrix I. Then xxt 1 i(yΣ1/2 ,x) dx = φη (yΣ1/2 ) exp − (5.13) e 2 (2π)n/2 yΣy t = exp − , 2 where, for the second equality we use (5.5). Using (5.13) to substitute for the integral in (5.12), we get (5.11). Let ξ be an Rn -valued Gaussian random variable with mean zero and covariance matrix Σ. Let a ∈ Rn . Then, by (5.10), in dimension one P (α ≤ (a, ξ) ≤ β) =
1 (2πσ 2 )1/2
β
α
e−u
2
/2σ 2
du,
(5.14)
192
Basic properties of Gaussian processes
where σ 2 = E((a, ξ))2 = aΣat . Using this we easily see that aΣat Ee(a,ξ) = exp . 2 We use the standard notation x 2 1 e−u /2 du Φ(x) = √ 2π −∞ φ(x) =
2 d 1 Φ(x) = √ e−x /2 . dx 2π
(5.15)
(5.16)
(5.17)
Lemma 5.1.3 Let ξ be a real-valued Gaussian random variable with mean zero with variance σ 2 . Then, for all a > 0, a2 P (|ξ| > a) ≤ exp − 2 , (5.18) 2σ and when a/σ ≥ 1, (σ/a)φ(a/σ) ≤ P (|ξ| > a) ≤ 2 (σ/a)φ(a/σ).
(5.19)
Proof Dividing |ξ| by σ we see that it suffices to verify (5.18) when σ = 1. In this case, ∞ 2 2 e−u /2 du (5.20) P (|ξ| > a) = √ 2π a ∞ 2 2 2 2 √ ≤ ue−u /2 du = √ e−a /2 . a 2π a a 2π √ we see This gives us (5.18) for a ≥ 2/ 2π. Also, taking derivatives, √ and since that exp(−a2 /2) − P (|ξ| > a) is increasing for a < 2/ 2π, √ this difference is zero when a = 0 we get (5.18) for a < 2/ 2π. To obtain the right-hand side of (5.19) we note that ∞ ∞ 2 2σ 2 −u2 /2 e du ≤ √ ue−u /2 du. (5.21) P (|ξ| > a) = √ 2π a/σ 2πa a/σ The left-hand side of (5.19) follows from Lemma 14.8.1. A sequence of independent Gaussian random variables with mean zero and variance one is called a standard normal sequence. Sometimes the word “normal” is a synonym for Gaussian. Sometimes we use the notation N (µ, σ 2 ) to indicate a normal random variable with mean µ and variance σ 2 . The following simple limit theorem plays an important role in this book.
5.1 Definitions and some simple properties
193
Theorem 5.1.4 Let {ξn } be a standard normal sequence. Then ξn =1 a.s. (5.22) (2 log n)1/2
Proof For > 0, let an = P ξn > ((2 + ) log n)1/2 and bn = P (ξn > an < ∞ and bn = ∞. The result (2 log n)1/2 ) and observe that follows by the Borel–Cantelli Lemmas. lim sup n→∞
Consider the function def
U(x) = (φ ◦ Φ−1 )(x)
x ∈ [0, 1]
(5.23)
with U(0) = U(1) = 0. We note the following important relationship. Lemma 5.1.5 For x ∈ (0, 1)
def
UU (x) = U(x)U (x) = −1.
(5.24)
Proof This follows simply, using the relations φ (x) = −xφ(x) and (Φ−1 (x)) = 1/U(x). We use γn , or simply γ, to denote the canonical Gaussian measure on Rn . This is the probability distribution on Rn that is induced by the standard normal sequence with n terms. Clearly the probability density function of γ is (2π)−n/2 exp(−xxt /2), x ∈ Rn . A real-valued stochastic process {X(t), t ∈ T } (T is some index set) is a Gaussian process if its finite-dimensional distributions are Gaussian. It is characterized by its mean function m and its covariance kernel Σ, which are given by m(t) = EX(t)
and
Σ(s, t) = E(X(t) − m(t))(X(s) − m(s)). (5.25)
It follows from (5.8) that Σ(s, t) is a positive definite function on T × T . Conversely, and this is the route we generally take, given a real-valued function m(t) on T and a positive definite function Σ(s, t) on T × T , we obtain a real-valued Gaussian process {X(t), t ∈ T }, with mean function m and covariance kernel Σ. This is because m, Σ, and (5.5) allow us to obtain a consistent family of finite-dimensional Gaussian distributions on the finite subsets of T . When T is the positive integers we write the Gaussian process {X(t), t ∈ T } as {ξj }∞ j=1 , or simply {ξj }, and refer to it as a Gaussian sequence. It follows from (5.14) that a real-valued Gaussian random variable has
194
Basic properties of Gaussian processes
moments of all orders. Thus, given a Gaussian process {X(t), t ∈ T }, we can define a natural L2 metric on T × T by 1/2
(5.26) d(s, t) := dX (s, t) = E(X(s) − X(t))2 =
1/2
(Σ(s, s) + Σ(t, t) − 2 Σ(s, t))
.
Remark 5.1.6 The fact that there is a one–one correspondence between mean zero Gaussian processes and positive definite functions allows us to very easily obtain many important properties of positive definite functions. Let Σ(s, t), s, t ∈ T be a positive definite function and let {X(t), t ∈ T } be a mean zero Gaussian process with covariance Σ. Then, because Σ(s, t) = EX(s)X(t), it follows from the Schwarz inequality that 1/2
Σ(s, t) ≤ (Σ(s, s)Σ(t, t))
.
(5.27)
Let 0 be a point in T . Then Σ(s, t) −
Σ(s, 0) Σ(t, 0) Σ(0, 0)
(5.28)
is also a positive definite function on T × T . This is because it is the covariance of Σ(t, 0) η(t) = X(t) − X(0). (5.29) Σ(0, 0) (It follows from the remarks in the paragraph containing (5.9) that η is a Gaussian process.) Let (Ω, P ) denote the probability space of a Gaussian process, {X(t), t ∈ T }. This process exists in L2 (P ) with the inner product given by (X(s), X(t)) = EX(s)X(t), s, t ∈ T . Note that η(t) is the projection of X(t), onto the orthogonal complement of X(0) with respect to L2 (P ). Remark 5.1.7 We show in Lemma 3.3.3 that the α-potential density of a strongly symmetric Borel right process with continuous α-potential density is positive definite. Therefore, to every such process we can associate a mean zero Gaussian process with covariance equal to its αpotential density. We call these Gaussian processes associated processes. Not all Gaussian processes are associated processes. In fact, associated processes have some stronger properties than Gaussian processes in general. Let {X(t), t ∈ T } be an associated process. In contrast with (5.27), it follows from Lemma 3.4.3 that EX(s)X(t) ≤ EX 2 (s) ∧ EX 2 (t)
∀ s, t ∈ T.
(5.30)
5.1 Definitions and some simple properties
195
Also, by Lemma 3.4.6, EX 2 (t) > 0 for all t ∈ T . In Chapter 13 we give several properties that characterize when a Gaussian process is an associated process.
5.1.1 Gaussian Markov processes We consider Gaussian processes {X(t), t ∈ R+ } that are also Markov processes. Lemma 5.1.8 Let p and q be positive functions on T ⊂ R1 with p/q strictly increasing. Set p(s)q(t) s ≤ t Σ(s, t) = (5.31) p(t)q(s) t < s and assume that p and q are such that Σ(s, t) > 0 for all s, t ∈ T . Then Σ(s, t) is a strictly positive definite function on T × T . Proof Let t1 < · · · < tn ∈ T and set pj = p(tj ) and qk = q(tk ). Using (5.31), we see that the matrix Dn := {Σ(tj , tk )}nj,k=1 is given by p1 q1 p1 q2 · · · p1 qn−1 p1 qn p1 q2 p2 q2 · · · p2 qn−1 p2 qn . .. .. .. . .. .. . . . . p1 qn p2 qn · · · pn−1 qn pn qn Multiply the (n − 1)-st column of Dn by −qn /qn−1 and add it to the last column to see that pn−1 qn2 . (5.32) det Dn = (det Dn−1 ) pn qn − qn−1 Since p/q is strictly increasing, the last term in (5.32) is greater than zero. Since det D1 > 0, it follows by induction that det Dj > 0 for all 1 ≤ j ≤ n. By Lemma 14.9.2, this implies that the matrix Dn is strictly positive definite. Lemma 5.1.9 Let I ∈ R1 be an interval, open or closed, and let X = {X(t), t ∈ I} be a mean zero Gaussian process with continuous strictly positive definite covariance Σ. Then X is a Gaussian Markov process, that is, for all increasing sequences t1 , . . . , tn ∈ R+ , for all n, E(X(tn )|X(tn−1 )) = E(X(tn )|X(tn−1 ), . . . , X(t1 )) if and only if Σ can be expressed as in (5.31).
(5.33)
196
Basic properties of Gaussian processes
Proof
We write Σ(tn , tn−1 ) Σ(tn , tn−1 ) X(tn ) = X(tn ) − X(tn−1 ) + X(tn−1 ) Σ(tn−1 , tn−1 ) Σ(tn−1 , tn−1 ) n + Yn−1 . := X (5.34)
n X(tj ) = 0, 1 ≤ j ≤ Suppose Σ can be expressed as in (5.31). Then E X n is independent of X(tj ), 1 ≤ j ≤ n − 1. Therefore, n − 1. Thus X obtaining (5.33) reduces to showing that E(Yn−1 |X(tn−1 )) = E(Yn−1 |X(tn−1 ), . . . , X(t1 )),
(5.35)
which is trivially true. Now suppose that X is a Gaussian Markov process. Let t1 < t2 < t3 and consider the Gaussian vector X(t1 ), X(t2 ), X(t3 ), which we also denote by X1 , X2 , X3 . Set (Xi , Xj ) = E(Xi Xj ), i, j = 1, 2, 3. Let 2 = X2 − (X1 , X2 ) X1 . We write X (X1 , X1 ) 2 ) 2 ) (X3 , X (X3 , X (X3 , X1 ) 2 + (X3 , X1 ) X1 . X2 − X X1 + X3 = X 3 − 2 , X 2 ) 2 , X 2 ) (X1 , X1 ) (X1 , X1 ) (X (X (5.36) The term in the square brackets is independent of X1 and X2 . Therefore, 2 ) 2 ) (X1 , X2 ) (X3 , X (X3 , X1 ) (X3 , X − X1 . E(X3 |X2 , X1 ) = X + 2 , X 2 ) 2 2 , X 2 ) (X1 , X1 ) (X1 , X1 ) (X (X (5.37) In order for (5.33) to hold, the term in the square bracket must be zero, that is, we must have 2 , X 2 ) = (X3 , X 2 )(X1 , X2 ). (X3 , X1 )(X
(5.38)
2
2 ) = (X2 , X2 ) − (X1 , X2 ) and (X 2 , X3 ) = (X2 , X3 ) − 2 , X Since (X (X1 , X1 ) (X1 , X2 )(X1 , X3 ) , we see that (5.38) holds if and only if (X1 , X1 ) (X1 , X3 ) =
(X1 , X2 )(X2 , X3 ) . (X2 , X2 )
(5.39)
Consequently, when X is a Gaussian Markov process, for u < s < t, Σ(u, t) =
Σ(u, s)Σ(s, t) . Σ(s, s)
(5.40)
We now show that (5.40) implies that, for any t0 ∈ I, Σ(t0 , t) > 0
∀ t ∈ I.
(5.41)
5.1 Definitions and some simple properties
197
Suppose this is false and Σ(t0 , t) = 0 for some t > t0 . Let s = inf{v > t0 |Σ(t0 , v) = 0}. Clearly, s > t0 because, by hypothesis, Σ is continuous and Σ(t0 , t0 ) > 0. Using (5.40) we see that, for t0 < s < s, 0 = Σ(t0 , s) =
Σ(t0 , s )Σ(s , s) . Σ(s , s )
(5.42)
By our choice of s, Σ(t0 , s ) > 0. Therefore, Σ(s , s) = 0 for all t0 < s < s. Since the covariance is assumed to be continuous, this implies that Σ(s, s) = 0, which is a contradiction. Therefore, Σ(t0 , t) > 0 for all t ≥ t0 . A similar proof gives this result for t ≤ t0 . The covariance Σ satisfies (5.40) and (5.41). These are precisely the properties satisfied by uα in Lemma 4.3.1 that are used to show that uα can be represented as in (4.112). This shows that Σ can be represented as in (5.31). Since Σ(s, t) is strictly positive definite, it follows from (5.32) that p/q is strictly increasing. Example 5.1.10 Let p(s) = s and q(t) = 1 in (5.31), where T = := {W (t), t ∈ R+ −{0}. This gives rise to a Gaussian Markov process W (0) = 0 and denote the R+ − {0}}. We extend it to R+ by setting W extended process by W = {W (t), t ∈ R+ }. By Lemma 5.1.1, it is easy to see that W has stationary independent increments and W (t) − W (s) is N (0, |t − s|). Thus by (2.12) and the argument following it, W has a continuous version. Thus, in a certain sense, W is a Brownian motion. We say in a certain sense because, in the definition of Brownian motion in Section 2.1, we require that all the paths of Brownian motion are continuous. In general, when we consider Gaussian processes we only require that a continuous version exists. Thus Brownian motion, as a Markov process, is considered differently from Brownian motion as a Gaussian process. This does not cause us any difficulties. Consider W := {W (t) ; t ∈ R+ }, for W (t) := tW (1/t) for t = 0 and W (0) = 0. Since a mean zero Gaussian process is determined by its covariance and since EW (t)W (s) = s ∧ t, W is also a Brownian motion (see page 15). Remark 5.1.11 Let I ∈ R1 be an interval, open or closed, and let X = {X(t), t ∈ I} be a mean zero Gaussian Markov process with continuous strictly positive definite covariance Σ. Then X is a simple modification of time changed Brownian motion. This follows from Lemma 5.1.9 because, for any functions p and q satisfying the conditions of Lemma 5.1.8 on I with Σ(s, t) continuous, {X(t), t ∈ I} and {q(t)W (p(t)/q(t)) ; t ∈ I}, where W is Brownian motion, are both mean
Basic properties of Gaussian processes
198
zero Gaussian processes with the same covariance. Furthermore, since Brownian motion is continuous, X has a version with continuous sample paths.
5.2 Moment generating functions Lemma 5.2.1 Let ζ = (ζ1 , . . . , ζn ) be a mean zero, n-dimensional Gaussian random variable with covariance matrix Σ. Assume that Σ is invertible. Let λ = (λ1 , . . . , λn ) be an n-dimensional vector and Λ an n×n diagonal matrix with λj as its j-th diagonal entry. Let u = (u1 , . . . , un ) be an n-dimensional vector. For all λi , i = 1 . . . , n with λi < for some > 0 sufficiently small, (Σ−1 − Λ) is invertible and n λi (ζi + ui )2 /2 (5.43) E exp i=1
1 exp = (det(I − Σ Λ))1/2
Λut ) (uΛΣ uΛut + 2 2
,
where def = (Σ−1 − Λ)−1 = (I − Σ Λ)−1 Σ Σ and u = (u1 , . . . , un ). Equivalently, n ζi2 E exp ui λi ζi + λi 2 i=1 1 = exp (det(I − Σ Λ))1/2
(5.44)
(5.45)
Λut uΛΣ 2
.
Proof We prove (5.45), which immediately gives (5.43). Using (5.10) twice we see that n ζj2 E exp uj λj ζj + λj (5.46) 2 j=1
ζ Σ−1 − Λ ζ t 1 t exp uΛζ − dζ = 2 (2π)n/2 (det Σ)1/2 =
1/2 (det Σ) 1 uΛξt = uΛξt , Ee Ee 1/2 1/2 (det Σ) (det(I − Σ Λ))
where ξ is an n-dimensional Gaussian random variable with mean zero and E is expectation with respect to the proband covariance matrix Σ
5.2 Moment generating functions
199
ability measure of ξ. Equation (5.45) now follows from (5.15).
We have the following immediate corollary of Lemma 5.2.1. Corollary 5.2.2 Let η = {ηx ; x ∈ S} be a mean zero Gaussian process and fx a real-valued function on S. It follows from Lemma 5.2.1 that for a2 + b2 = c2 + d2 , ηx + fx b)2 ; x ∈ S} {(ηx + fx a)2 + (
(5.47)
law
= {(ηx + fx c)2 + ( ηx + fx d)2 ; x ∈ S},
where η is an independent copy of η. Lemma 5.2.3 Let (ζ1 , ζ2 ) be an R2 -valued Gaussian random variable with mean zero. Then, for all s = 0, E(ζ1 exp(sζ2 )) = E (ζ1 ζ2 ) . sE (exp (sζ2 ))
(5.48)
Proof tζ1 +sζ2 is a mean zero Gaussian random variable with variance t2 E(ζ12 ) + 2tsE(ζ1 ζ2 ) + s2 E(ζ22 ). Therefore, E (exp (tζ1 + sζ2 ))
= exp t2 E(ζ12 )/2 + tsE(ζ1 ζ2 ) + s2 E(ζ22 )/2 .
(5.49)
Differentiating this with respect to t and then setting t = 0, we get
E (ζ1 exp (sζ2 )) = sE (ζ1 ζ2 ) exp s2 E(ζ22 )/2 , (5.50) which gives (5.48). Here is an alternative proof of Lemma 5.2.3. We write E (ζ1 ζ2 ) E (ζ1 ζ2 ) ζ ζ2 + ζ1 − ζ1 = 2 2 E (ζ2 ) E (ζ22 ) E (ζ1 ζ2 ) ζ2 . := ζ2⊥ + E (ζ22 )
(5.51) (5.52)
Observe that ζ2⊥ and ζ2 are orthogonal and hence independent. Since ζ2⊥ has mean zero, we have E (ζ1 exp(sζ2 )) =
E (ζ1 ζ2 ) E(ζ2 exp(sζ2 )). E (ζ22 )
(5.53)
200
Basic properties of Gaussian processes
By (5.15) we see that E(ζ2 exp(sζ2 ))
d E(exp(sζ2 )) ds
s2 E ζ22 d = exp ds 2
2 = sE ζ2 E (exp (sζ2 )) .
=
(5.54)
Using this in (5.53) we get (5.48). Remark 5.2.4 We define a probability measure P on Rn , in terms of by its expectation operator E,
n
2 , . . . , ζ ) exp λ ζ /2 E g(ζ 1 n i i i=1 n (5.55) E(g(ζ 1 , . . . , ζn )) = E (exp ( i=1 λi ζi2 /2)) for all measurable functions g on Rn . Under P, ζ = (ζ1 , . . . , ζn ) is a mean zero, n-dimensional Gaussian random variable with covariance given in (5.44). To see this, take g(ζ1 , . . . , ζn ) = exp(i(u, ζ)) matrix Σ in (5.55). Using (5.46) with uj replaced by iuj , j = 1, . . . , n, we get n t 1 λj ζj2 /2 = Eeiuξ , E exp(i(u, ζ)) exp 1/2 (det(I − Σ Λ)) j=1 (5.56) By where ξ is a mean zero normal random variable with variance Σ. t t )/2) and by (5.44), E(exp(n λj ζ 2 /2)) = (5.5), Eeiuξ = exp(−(uΣu j j=1 (det(I − Σ Λ))−1 . Thus we see that when g(ζ1 , . . . , ζn ) = exp(i(u, ζ)), t )/2), that is, we have the right-hand side of (5.55) is equal to exp(−(uΣu exp(i(u, ζ)) = exp(−(uΣu t )/2). E
(5.57)
Corollary 5.2.5 Let ζ = (ζ1 , . . . , ζn ) be an Rn -valued Gaussian random variable. We have
n E ζj ζk exp λi ζ 2 /2 j,k , n i=1 2 i (5.58) = {Σ} E (exp ( i=1 λi ζi /2)) and for all s = 0
n
2 E ζ1 exp i=1 λi (ζi + s) /2 t }1 , n = {ΣΛ1 sE exp ( i=1 λi (ζi + s)2 /2) where 1 denotes a vector with all its components equal to 1.
(5.59)
5.2 Moment generating functions
201
Proof Equation (5.58) follows immediately from Remark 5.2.4. To prove (5.59) we expand the squares (ζi + s)2 and cancel the terms in s2 to see that the left-hand side of (5.59) is equal to
n
n 2 (ζ1 exp (s n λi ζi )) E ζ1 exp (s i=1 λi ζi ) exp E i=1 λi ζi /2 i=1 n n = . (exp (s n λi ζi )) sE (exp (s i=1 λi ζi ) exp ( i=1 λi ζi2 /2)) sE i=1 (5.60) Using Lemma 5.2.3, we see that this last term is equal to n n 1,i = {ΣΛ1 t }1 . ζ1 λi ζi λi {Σ} (5.61) = E i=1
i=1
We end this section with an important moment identity for products of Gaussian random variables. Lemma 5.2.6 Let {gi }ki=1 be an Rk -valued Gaussian random variable with mean zero. Then, when k is even, k k/2 gi = cov(Di ), (5.62) E D1 ∪...∪Dk/2 ={1,...,k} i=1
i=1
where the sum is over all pairings (D1 , . . . , Dk/2 ) of {1, . . . , k}, that is, over all partitions of {1, . . . , k} into disjoint sets each containing two elements, and where cov({i, j}) := cov(gi , gj ) := E(gi gj ).
(5.63)
When k is odd, the left-hand side of (5.62) equals zero. Proof
By (5.15), E exp
k i=1
λi gi
1 = exp E 2
k
2 λi g i .
Clearly
k ∂ ∂ ... E exp λi g i ∂λ1 ∂λk i=1
Also,
(5.64)
i=1
=E λ1 =···=λk =0
2 k ∂ 1 ∂ ... exp E λi gi ∂λ1 ∂λk 2 i=1
k
gi
.
(5.65)
i=1
(5.66)
202
Basic properties of Gaussian processes n k ∞ k 1 ∂ ∂ = ... λi λj cov(gi , gj ) . n n! ∂λ 2 ∂λ 1 k n=0 i=1 j=1
It is easy to see that n k k ∂ ∂ ... λi λj cov(gi , gj ) ∂λ1 ∂λk i=1 j=1
(5.67) λ1 =...=λk =0
is zero when n = k/2. Thus, in particular, when k is odd the left-hand side of (5.62) is equal to zero. When n = k/2, (5.67) is not zero only for those terms in k/2 k k λi λj cov(gi , gj ) (5.68) i=1 j=1
in which each λi , i = 1, . . . , k appears only once. The terms with this property form a pairing, say (D1 , . . . , Dk/2 ), of {1, . . . , k}. For this k/2 pairing (4.36) is equal to i=1 cov(Di ). Finally, it is easy to see that there are 2k/2 (k/2)! terms in (5.67), corresponding to each pairing of {1, . . . , k}. This establishes (5.62).
5.2.1 Exponential random variables There are some simple relationships between exponential and normal random variables that play an important role in this book. An exponential random variable is a positive random variable λ with density exp(−x/γ)γ −1 on R+ for some γ > 0. Equivalently, P (λ > u) = exp(−u/γ) for all u ≥ 0. The importance of exponential random variables in the theory of Markov processes is due to the fact that an exponential variable is “memoryless,” that is, P (λ > t + s|λ > s) = P (λ > t).
(5.69)
It is elementary to check this as well as the facts that Eλ = γ and the moment generating function of λ is Eevλ =
1 . 1 − γv
(5.70)
Let ξ be a normal random variable with mean zero and variance γ, and let ξ be an independent copy of ξ. It follows from Lemma 5.2.1 that ξ 2 + (ξ )2 = 2λ. law
(5.71)
5.3 Zero–one laws and the oscillation function
203
The following immediate consequence of Corollary 5.2.2 is used in the proofs of isomorphism theorems for local times: Corollary 5.2.7 Let η = {ηx ; x ∈ S} be a mean zero Gaussian process and η be an independent copy of η. Let ξ, ξ , and λ be as above and be independent of η and η . Then √ law {(ηx + ξ)2 + (ηx + ξ )2 ; x ∈ S} = {ηx2 + (ηx + 2λ)2 ; x ∈ S}. (5.72) Remark 5.2.8 We record another fact for use in the proofs of isomorphism theorems. Let λ and ρ be two independent exponential random variables with Eλ = γ and Eρ = γ¯ . Then λ∧ρ is an exponential random variable with mean 1 E(λ ∧ ρ) = −1 . (5.73) γ + γ¯ −1 To prove this note that P (λ ∧ ρ > u) = P (λ > u)P (ρ > u). Finally we note another relationship that associates Gaussian random variables with exponential random variables, the identity 2p/2 p+1 p , (5.74) E|η| = √ Γ 2 π ∞ where Γ is the gamma function, Γ(t) = 0 xt−1 e−x dx. This is obtained by writing out the integral for E|η|p and making the change of variables √ y = 2v.
5.3 Zero–one laws and the oscillation function A Gaussian process with a continuous covariance can be expressed as an infinite series of independent continuous functions. This representation allows us to easily obtain several important zero–one laws for the process. Since everything in this section applies to a larger class of processes than Gaussian processes, we give the results for this larger class of processes. Let (T, d) be a separable metric space and let Γ be a positive definite function on T × T . We sometimes refer to Γ as a covariance kernel. Theorem 5.3.1 (Reproducing kernel Hilbert space) Let (T, d) be a separable metric space and let Γ be a continuous covariance kernel on T × T . Then there exists a separable Hilbert space H(Γ) of continuous real-valued functions on T such that Γ(t, · ) ∈ H(Γ)
t∈T
(5.75)
Basic properties of Gaussian processes
204
(f ( · ), Γ(t, · )) = f (t)
f ∈ H(Γ)
t ∈ T,
(5.76)
where ( · , · ) denotes the inner product on H(Γ). If H1 is a separable Hilbert space of continuous real-valued functions on T such that (5.75) and (5.76) hold with H(Γ) replaced by H1 and the inner product by the inner product on H1 , then H1 = H(Γ) as Hilbert spaces. H(Γ) is called the reproducing kernel Hilbert space of the covariance kernel Γ. Proof Let n S= aj Γ(tj , · ), a1 , . . . , an ∈ R1 , t1 , . . . , tn ∈ T, n ≥ 1 .
(5.77)
j=1
On S we define the bilinear form n m n m aj Γ(tj , · ), bk Γ(tk , · ) = aj bk Γ(tj , tk ). j=1
Note that if f (t) =
(5.78)
j=1 k=1
k=1
n j=1
aj Γ(tj , t),
f (t) = (f ( · ), Γ(t, · )).
(5.79)
If f ∈ S, then (f, f ) ≥ 0, because Γ is positive definite. Also, (f, f ) = 0 implies that f ≡ 0 since |f (t)|2 = |(f, Γ(t, · ))|2 ≤ (f, f )(Γ(t, · ), Γ(t, · )) = 0.
(5.80)
Thus we see that (5.78) defines an inner product on S. Let {fn } be a sequence of functions in S. Then |fn (t)−fm (t)|2 = |(fn ( · )−fm ( · ), Γ(t, · ))|2 ≤ fn −fm 2 Γ(t, t), (5.81) where f 2 := (f, f ) for f ∈ S and Γ(t, · ) 2 = Γ(t, t) by (5.78). This shows that if {fn } is a Cauchy sequence with respect to the inner product norm, then it is also a Cauchy sequence pointwise on T . We close S in the inner product norm and identify the limits that are not already in S with the pointwise limits in T . H(Γ) is this closure of S. It is a Hilbert space of real-valued functions. Let {tn } be a dense subset of (T, d). Since T is separable and the covariance kernel Γ is continuous, S1 =
k
aj Γ(sj , · ), a1 , . . . , ak rational, s1 , . . . , sk ∈ {tn }, k ≥ 1
!
j=1
(5.82)
5.3 Zero–one laws and the oscillation function
205
is dense in H(Γ). The reproducing property (5.79) immediately extends to H(Γ). Furthermore, |f (s)−f (t)| = |(f ( · ), Γ(s, · )−Γ(t, · ))| ≤ f Γ(s, · )−Γ(t, · ) , (5.83) which goes to zero as d(s, t) → 0, since Γ is continuous. Thus H(Γ) consists of continuous functions. This completes the proof of the statements in the first paragraph of this theorem. For the second part of the theorem note that (5.75) and (5.76) with H(Γ) replaced by H1 and the inner product by the inner product on H1 , let us call it ( · , · )1 , show that the set S in (5.77) is contained in H1 and ( · , · )1 agrees with the reproducing kernel Hilbert space norm ( · , · ) on S. Therefore, H(Γ) ⊂ H1 and the two inner products agree on / H(Γ). Then f can be written H(Γ). Let f ∈ H1 and suppose that f ∈ uniquely as f = f1 + f2 , where f1 ∈ H(Γ) ∩ H1 , f2 ∈ H1 and (g, f2 )1 = 0 for every g ∈ H(Γ). In particular (Γ(t, · ), f2 )1 = f2 (t) = 0 for all t ∈ T . Hence f2 ≡ 0 , so f = f1 ∈ H(Γ), which is a contradiction. Therefore H1 = H(Γ). Theorem 5.3.2 Let X = {X(t), t ∈ T }, T a separable metric space, be a stochastic process on (Ω, F, P ) with EX 2 (t) < ∞, for all t ∈ T . Let EX(t) = m(t) and assume that Γ(s, t) = EX(s)X(t) is continuous on T × T . Then ∞ X(t) = φj (t)Yj + m(t), (5.84) j=1
where {φj } are continuous functions on T , {Yj } is an orthonormal set in L2 (Ω, F, P ), and convergence and equality in (5.84) are in L2 (Ω, F, P ). Proof Let H(Γ) be the reproducing kernel Hilbert space of Γ and let X(t) = X(t) − m(t). We establish an isomorphism between H(Γ) and a = {X(t), t ∈ T }. Let subspace of L2 (Ω, F, P ) containing X " n = Closure of j ), a1 , . . . , an real, t1 , . . . , tn ∈ T, n ≥ 1 aj X(t L2 (X) j=1
(5.85) in L2 (Ω, F, P ). Consider the map n n j) aj Γ(tj , · ) = aj X(t ΘP j=1
(5.86)
j=1
ΘP is linear, one–one, and norm from S, given in (5.77), into L2 (X).
206
Basic properties of Gaussian processes
Furpreserving. It extends to all of H(Γ) with range equal to L2 (X). has mean zero. thermore, each element in L2 (X) Let {φj } be a complete orthonormal set in H(Γ) and set Yj = ΘP (φj ). Then {Yj }∞ j=1 is a complete orthonormal set in L2 (X), EYj = 0, and X(t) =
∞
E(X(t)Yj )Yj + m(t),
(5.87)
j=1
where the convergence is in L2 (Ω, F, P ). By the isometry ΘP , E(X(t)Yj ) = E(X(t)Y j ) = (Γ(t, · ), φj ( · )) = φj (t),
(5.88)
where ( · , · ) is the inner product in H(Γ) and the last equality is the reproducing property (5.76). Equations (5.87) and (5.88) give (5.84). Remark 5.3.3 One should be careful when interpreting the equalities in (5.84) and (5.87). The left-hand side, X(t), is a random variable that takes a unique value at each ω ∈ Ω. The right-hand side represents an equivalence class in L2 (Ω, F, P ). The equalities simply mean that X(t) belongs to this equivalence class. When the stochastic process in Theorem 5.3.2 is a Gaussian process one can be much more explicit about the series representation. Corollary 5.3.4 Let X = {X(t), t ∈ T }, T a separable metric space, be a Gaussian process with EX(t) = m(t). Then X(t) has a version given by ∞ φj (t)ξj + m(t), (5.89) X (t) = j=1
where {φj } are continuous functions on T , {ξj } is a standard normal sequence, and convergence in (5.89) is in L2 (Ω, F, P ). If X has a continuous version and T is compact, the series in (5.89) converges uniformly on T almost surely. The series in (5.89) is called the Karhunen–Lo´eve expansion of X. n j ) in the right-hand Proof Since X is a Gaussian process, j=1 aj X(t side of (5.86) is a Gaussian random variable. Since all random variables are limits in L2 (Ω, F, P ) of Gaussian random variables, each in L2 (X) Yj in the previous proof is a Gaussian random variable. Since {Yj } is also an orthonormal sequence, they must be independent. We label them {ξj } in this corollary. Thus (5.89) is just a statement of Theorem 5.3.2. ∞ The variance of X(t) is j=1 φ2j (t). Thus this sum must converge.
5.3 Zero–one laws and the oscillation function
207
It then follows from the three-series theorem that the sum in (5.89) converges almost surely for each fixed t ∈ T . Therefore, by Corollary 14.6.4, the series in (5.89) converges uniformly almost surely on T . We have the following simple application of Corollary 5.3.4, which we use in Chapter 9. Lemma 5.3.5 Let X = {X(t), t ∈ K}, K a compact separable metric space, be a mean zero Gaussian process with continuous sample paths. Then, for all > 0, we have P sup |X(t)| ≤ > 0. (5.90) t∈K
Proof
We represent X by its Karhunen–Lo´eve expansion X(t) =
∞
ξj φj (t)
t∈K
(5.91)
j=1
as in Corollary 5.3.4. Since X is continuous, this series converges uniformly almost surely on K. Hence, given any > 0, we can find an N ( ) such that ∞ P sup ξj φj (t) < /2 ≥ 1/2. (5.92) t∈K
j=N ()+1
Using independence, we see that ∞ P sup ξj φj (t) ≤ t∈K
(5.93)
j=1
() N ξj φj (t) ≤ /2 P sup ≥ P sup
t∈K
≥
j=1
N ()
t∈K
∞
ξj φj (t) < /2
j=N ()+1
1 ξj φj (t) ≤ /2 . P sup 2 t∈K j=1
It is easy to see that this last probability is strictly positive since the N () φj are bounded on K and the {ξj }j=1 are simply a finite collection of independent normal random variables with mean zero and variance one. Thus we get (5.90). The sample paths of a Gaussian process have interesting zero–one properties that follow because the Gaussian process can be represented by a series as in (5.89). Since these properties are shared by all processes
Basic properties of Gaussian processes
208
with such a representation, we present the next few results in this greater generality. Let (T, d) be a compact metric space. A stochastic process X = {X(t), t ∈ T } is said to be of class S, if there exist real-valued continuous functions {φj } on T and independent symmetric real-valued random variables {ξj } such that X(t) =
∞
φj (t)ξj
t ∈ T,
(5.94)
j=1
where the series converges almost surely for each fixed t ∈ T . In the rest of this section we take T to be a compact metric space. Let Z = {Z(t); t ∈ T } be a real (or complex)-valued stochastic process, where (T, d) is a separable metric space. Let t0 ∈ T . Z(t0 ) is a real (or complex)-valued random variable and hence is finite almost surely. We say that the process Z has a bounded discontinuity at t0 if 0 < lim
sup
δ→0 t∈Bd (t0 ,δ)
|Z(t) − Z(t0 )| < ∞,
(5.95)
where Bd (t0 , δ) is a closed ball of radius δ at t0 in the metric d. We say that the process Z has an unbounded discontinuity at t0 if lim
sup
δ→0 t∈Bd (t0 ,δ)
|Z(t) − Z(t0 )| = ∞.
(5.96)
Corollary 5.3.6 Let X = {X(t), t ∈ T } be a separable process of class S. Let t0 ∈ T . The following events have probability zero or one: (1) (2) (3) (4) (5) (6)
X X X X X X
is continuous at t0 ; has a bounded discontinuity at t0 ; is unbounded at t0 ; is continuous on T ; has a bounded discontinuity on T ; is unbounded on T .
Proof Let D be the separability set of T . It suffices to prove this corollary for {X(t), t ∈ D} (see page 9). There exists a P -null set Λ such that, for all ω ∈ / Λ, X(t, ω) =
∞
φj (t)ξj (ω)
t ∈ D,
(5.97)
j=1
where the series converges as a series of numbers. Also, since X is a
5.3 Zero–one laws and the oscillation function
209
process of class S, {ξj } are independent random variables and {φj } are continuous functions. For t ∈ D, ω ∈ / Λ let Xn (t, ω) =
∞
φj (t)ξj (ω).
(5.98)
j=n+1
n Note that Sn := { j=1 φj (t)ξj (ω), t ∈ T } is a continuous function on T for each ω and n ≥ 1. Consequently, the events in (1) to (6) are in σ({Xn (t), t ∈ D}) ⊂ σ({ξj , j > n}), for all n. The fact that these events have probability zero or one follows from the Kolmogorov zero–one law.
With a little more effort we can obtain a more precise description of the discontinuities of processes in class S. Let (T, d) be a separable metric space and f an extended real-valued function on T . Let Wf be the oscillation function of f , that is, Wf (t) = lim
sup
→0 u,v∈B (t,) d
|f (u) − f (v)|,
(5.99)
where Bd (t, ) is a closed ball of radius in (T, d). We also define Mf (t) = lim
sup
→0 u∈B (t,) d
f (u)
and
mf (t) = lim
inf
→0 u∈Bd (t,)
f (u) (5.100)
so that Wf (t) = Mf (t) − mf (t).
(5.101)
We take ∞ − ∞ = 0 and (−∞) − (−∞) = 0. Note that Wf (t) = 0 if and only if f is continuous at t. Theorem 5.3.7 Let X = {X(t), t ∈ T } be a separable process of class S defined on the probability space (Ω, F, P ). There exists an extended real-valued, upper semicontinuous function α on T , called the oscillation function of the process X, such that P (WX (t, ω) = α(t), t ∈ T ) = 1,
(5.102)
where WX (t, ω) := WX( · ,ω) (t). Furthermore, for all t ∈ T , P (MX (t, ω) = X(t, ω) + α(t)/2, mX (t, ω) = X(t, ω) − α(t)/2) = 1. (5.103) Proof
Let Λ be the P -null set in the first paragraph of the proof of
210
Basic properties of Gaussian processes
Corollary 5.3.6. For a closed subset F of T , define WX (F, ω) = lim lim
sup
n→∞ k→∞ d(s,t)≤1/k s,t∈Fn
|X(t, ω) − X(s, ω)|,
(5.104)
where Fn = {u ∈ T, d(u, F ) ≤ 1/n}. Because X is separable, we can choose Λ so that the supremum in (5.104) and in the definition of WX (t, ω) is achieved over s, t ∈ D, ω ∈ / Λ. It then follows that WX (t, · ) and WX (F, · ) are random variables. Let X(t, ω) and Xn (t, ω) be as given in (5.97) and (5.98). For any set F ⊂ T and ω ∈ / Λ, WX (F, ω) = WXn (F, ω). Furthermore, WXn (F, ω) ∈ σ({Xn (t), t ∈ D}) ⊂ σ({ξj , j > n}). Therefore, by the Kolmogorov zero–one law, P (WX (F, w) = α(F )) = 1
(5.105)
for some number α(F ). It is simple to check that for each ω ∈ Ω, WX (F, ω) = WX (t, ω)
when F = {t}
(5.106)
and lim WX (Fm , ω) = WX (F, ω).
m→∞
(5.107)
Let {On , n ≥ 1} be a countable basis for the separable metric space T . Let Jn be the closure of On and set J = {Jn , n ≥ 1}. By (5.105), P (WX (Jn , ω) = α(Jn ), n ≥ 1) = 1.
(5.108)
By (5.108) there exists a set Ω0 ⊂ Ω, with P (Ω0 ) = 1 such that WX (Jn , ω) = α(Jn ) for all ω ∈ Ω0 . Choose any t ∈ T . There exists a nested sequence {Jnk , nk ≥ 1} of sets in J such that ∩k Jnk = {t}. Therefore, for this t and ω ∈ Ω0 , by (5.107), WX (Jnk , ω) ↓ WX ({t}, ω). Also, since WX (Jnk , ω) = α(Jnk ), α(Jnk ) decreases, as k → ∞, to some number β, which is independent of ω. Consequently, WX ({t}, ω) = β. It follows from (5.106) and (5.105) that β = α({t}). Thus we get (5.102) where α(t) = α({t}), t ∈ T . Recall that, by definition, a function f on (T, d) is upper semicontinuous if lim sups→t f (s) = f (t) (see, e.g., McShane and Botts (1959, page 74)). Since, obviously, lim sups→t f (s) ≥ f (t), one can prove that a function is upper semicontinuous by showing that lim sups→t f (s) ≤ f (t). Thus, in general, the oscillation function of an extended real-valued function on a separable metric space is an upper semicontinuous function since lim sups→t Wf (s) ≤ Wf (t).
5.3 Zero–one laws and the oscillation function
211
To verify (5.103) note that lim sup (X(s) − X(t)) = lim sup (Xn (s) − Xn (t)).
s→t s∈D
(5.109)
s→t s∈D
Therefore, by Kolmogorov’s zero–one law, lim sup (X(s) − X(t)) = MX (t) − X(t) = γ(t)
a.s.
s→t s∈D
(5.110)
for some constant γ(t). Since X is symmetric, we also have lim sup (−X(s) + X(t)) = γ(t)
s→t s∈D
a.s.,
(5.111)
which implies that lim inf (X(s) − X(t)) = mX (t) − X(t) = −γ(t)
s→t s∈D
a.s.
(5.112)
It follows from (5.110) and (5.112) that α(t) = MX (t) − mX (t) = 2γ(t), which together with (5.110) and (5.112) again gives (5.103). Corollary 5.3.8 Let X = {X(t), t ∈ T } be a separable process of class S. Then (1) X has continuous paths almost surely if and only if it is continuous at each fixed t ∈ T almost surely, that is, if and only if # $ P lim X(s) = X(t) = 1 for each t ∈ T. (5.113) s→t
(2) The paths of X are either almost surely continuous on T or almost surely discontinuous on T . (3) Let K be a compact subset of T . If X is unbounded on K with positive probability, there exists a t0 ∈ T such that X is unbounded almost surely at t0 , that is α(t0 ) = ∞, where α is the oscillation function of X. Proof Statements (1) and (2) are immediate consequences of Theorem 5.3.7. We prove statement (3). Suppose that X is unbounded on K with probability greater than zero. Then, by Corollary 5.3.6, X is unbounded on K almost surely. For each j let { j }∞ j=1 be a decreasing sequence of N
j positive numbers satisfying limj→∞ j = 0. Let {B(zk,j , j )}k=1 be a cover of K. If
sup
|X(z)| < ∞
(5.114)
z∈B(zk,j ,j )
on a set of positive measure, then, by Corollary 5.3.6, it is finite almost
212
Basic properties of Gaussian processes
surely. Hence, if X is unbounded almost surely on K, there exists a zk,j ∈ K such that |X(z)| = ∞
sup
a.s.
(5.115)
z∈B(zk,j ,j )
Note that for each z ∈ K, X(z) is just a mean zero random variable with finite variance. Thus (5.115) implies that sup
|X(y) − X(z)| = ∞
a.s.
(5.116)
y,z∈B(zk,j ,j ) ∞ Since K is compact, there exists a subsequence {zk(i),j }∞ i=1 of {zk,j }k=1 such that limi→∞ zk(i),j = z0 for some z0 ∈ K. It is easy to see that
sup
|X(y) − X(z)| = ∞
∀ > 0.
a.s.
(5.117)
y,z∈B(z0 ,)
All mean zero Gaussian processes are in class S, so their sample paths have the properties described in the previous corollary. Note that, generally speaking, independent increment processes, say on [0, 1], are continuous almost surely for t fixed but not continuous on the whole interval. L´evy processes that have only a finite number of jumps on [0, 1] are continuous with positive probability that is less than 1 on all subintervals of [0, 1] (this is because the jump times are uniformly distributed on [0, 1]; see Sato (1999, Theorem 21.3)). In the next theorem we give conditions on the nature of the discontinuities of processes of class S. Theorem 5.3.9 Let X = {X(t), t ∈ T } be a stochastically continuous (see page 9) separable process of class S with oscillation function α. (1) Suppose that α(t) ≥ a > 0 on a dense subset S of an open set I ⊂ T . Then P (MX (t) = ∞, mX (t) = −∞, t ∈ I) = 1.
(5.118)
(2) The set {t ∈ T : a ≤ α(t) < ∞} is nowhere dense (i.e., its closure contains no open ball of T ). Proof Since X is stochastically continuous, we can assume that the separability set D is such that D ∩ I = S. For fixed t ∈ I, set Fn (t) = {u ∈ T : d(t, u) < 1/n}. Then, by (5.103), almost surely MX (t, w)
= =
lim sup X(s, ω)
s→t s∈T
lim
sup
n→∞ s∈D∩F (t) n
X(s, ω).
(5.119)
5.3 Zero–one laws and the oscillation function
213
Let s ∈ D ∩ Fn (t) and suppose that u ∈ D ∩ Fj (t) for some j ≥ n. Then u ∈ D ∩ Fn/2 (t). Therefore lim
sup
n→∞ s∈D∩F (t) n
X(s, ω) ≥ =
sup
X(u, ω))
lim
sup
( lim
lim
sup
(X(s, ω) + α(s)/2).
n→∞ s∈D∩F (t) j→∞ u∈D∩F (s) n j n→∞ s∈D∩F (t) n
Since α(s) ≥ a on D ∩ I, it follows from (5.119) that MX (t, ω) ≥ MX (t, ω) + a/2.
(5.120)
Therefore, since a > 0, P (MX (t, ω) = ∞, t ∈ D ∩ I) = 1.
(5.121)
Since D is dense in I we see that, for ω in a set of probability one, MX (t, ω) = ∞ for all t ∈ T . The result for mX (t, ω) follows similarly. To obtain (2) we note that, because α is upper semicontinuous, the sets Ta := {t : α(t) ≥ a} are closed. Suppose that there exists an 0 < a < ∞ such that Ta − T∞ contains a dense subset S of I. Then by part (1) of this theorem S ⊂ T∞ , contradicting the supposition. Therefore, for each a > 0, Ta − T∞ is nowhere dense. We now present the Belyaev dichotomy, which says that the sample paths of processes of class S with stationary increments are either continuous or completely irregular. Theorem 5.3.10 (Belyaev) Let X = {X(t), t ∈ T }, where T is an interval of Rn , be a stochastically continuous separable process of class S with stationary increments. Then either X has continuous paths almost surely on all open subsets of T or it is unbounded almost surely on all open subsets of T . Proof Since X has stationary increments, its oscillation function takes the same value on all open subsets of T . Therefore, by Theorem 5.3.9, the oscillation function can only be zero or infinity. One might ask whether further restrictions can be imposed on the oscillation function of processes of class S. Essentially the answer is no, as can be seen from the following theorem (a proof can be found in Ito and Nisio (1968b) or Jain and Kallianpur (1972)). Theorem 5.3.11 Let T = [0, 1]n . Let α be an extended real-valued upper semicontinuous function on T such that, for all a > 0, the set {t ∈ T : a ≤ α < ∞} is nowhere dense in T . There exists a mean
214
Basic properties of Gaussian processes
zero Gaussian process on T with continuous covariance and oscillation function α.
5.4 Concentration inequalities One of the most important results in the theory of Gaussian processes that is used repeatedly in this book is a consequence of an isoperimetric inequality that states that for all measurable sets in Rn with the same canonical Gaussian measure γ, half-spaces achieve the minimal surface measure with respect to γ. Let A be a Borel set in Rn . Define Ar = {x + rB, x ∈ A}, where B is the open unit ball in Rn . The surface measure of A is defined as γS (∂A) = lim inf r→0
1 (γ(Ar ) − γ(A)) . r
(5.122)
Let H be the half-space {x ∈ Rn : (x, u) < a} for some u ∈ Rn with |u| = 1 and a ∈ [−∞, ∞]. Let A ⊂ Rn be such that γ(A) = γ(H). The isoperimetric inequality just referred to is γS (∂A) ≥ γS (∂H).
(5.123)
It is convenient to express (5.123) only in terms of A. Since γ(H) = Φ(a) and γS (∂H) = φ(a), we can rewrite (5.123) as γS (∂A) ≥ φ(a) = φ ◦ Φ−1 (γ(A))
(5.124)
with equality when A = H. It is more useful for us to have an integrated version of this isoperimetric inequality, which we now state. Theorem 5.4.1 (Borell, Sudakov–Tsirelson) Let A be a measurable subset of Rn such that γ(A) = Φ(a)
− ∞ ≤ a ≤ ∞.
(5.125)
Then, for r ≥ 0, γ(A + rB) ≥ Φ(a + r),
(5.126)
where B is the open unit ball in Rn . Proof Let C denote the sets in Rn that are finite unions of open balls. We prove below that (5.124) holds for all C ∈ C. We now show that this implies (5.126) for all measurable sets A ⊆ Rn .
5.4 Concentration inequalities
215
To begin we first show that if (5.124) holds for C ∈ C and γ(C) = γ(H) = Φ(a), then γ(Cr ) ≥ γ(Hr ) = Φ(a + r),
(5.127)
which is (5.126) for C ∈ C. Note that for C ∈ C, the lim inf in (5.122) d is actually a limit, so that γ(Cr ) = γS (∂Cr ). Consider the function dr −1 v(r) = Φ (γ(Cr )), r ≥ 0. It follows from (5.124) that for Cr ∈ C, v (r) =
γS (∂Cr ) ≥ 1. φ ◦ Φ−1 (γ(Cr ))
(5.128)
r Therefore, 0 v (u) du ≥ r and consequently v(r) ≥ v(0) + r. Since v(0) = a, we get (5.127) for sets in C. Suppose that A ⊂ Rn is open. Then we can find an increasing sequence of sets Cn in C such that Cn ⊂ A and γ(Cn ) increases to γ(A). Applying (5.127) to each Cn and taking the limit, we see that (5.127) holds for open sets. Now let A be a measurable set in Rn . Let ρ > 0. Then Aρ is an open set and γ(Aρ+r ) = γ((Aρ )r ) ≥ Φ(aρ + r), where aρ is such that γ(Aρ ) = a +r), where Φ(aρ ). Taking the limit as ρ goes to zero we get γ(Ar ) ≥ Φ( a ≥ a and we get a is such that Φ( a) = limρ→0 γ(Aρ ) ≥ γ(A). Thus (5.127) in the general case. We show below that for sufficiently smooth functions f : Rn −→ [0, 1], % U 2 (f ) + | f |2 dγ, (5.129) U f dγ ≤ where |f | denotes the Euclidean length of the gradient of f (see (5.23)). We now show that (5.129) implies that (5.124) holds for all A ∈ C. Suppose that the set A in (5.124) is in C. This implies that γ(∂A) = 0. Suppose that (5.129) holds for all sufficiently smooth functions with values in [0, 1]; then it holds for all Lipschitz functions with values in [0, 1], that is, functions f : Rn → R1 with |f (x) − f (y)| ≤ M |x − y|, for some constant M (see Dudley (1989, Theorem 11.2.4)). In particular it holds for the functions + d(x, A) def r > 0, (5.130) fr (x) = 1 − r where d is the Euclidean distance on Rn (see also (1.5)). ¯ fr (x) = 1 for all r > 0. If x ∈ A¯c , Let A¯ be the closure of A. If x ∈ A, limr→0 fr (x) = 0. Thus limr→0 fr (x) = IA¯ (x) and limr→0 U(fr (x)) = 0, since U(0) = U(1) = 0. Moreover, | fr | = 0 on A and on the complement of the closure of Ar and | fr | ≤ 1/r everywhere.
Basic properties of Gaussian processes
216
We now use (5.129) with f = fr and take the limit inferior as r → 0. ¯ and γ(∂Ar ) = 0 for all r, we get Since γ(A) = γ(A) U(γ(A)) ≤ lim inf U(fr ) dγ + lim inf | fr | dγ (5.131) r→0 r→0 = lim inf | fr | dγ r→0
≤ lim inf r→0
1 (γ(Ar ) − γ(A)) = γS (∂A), r
which is (5.124). Since (5.129) implies (5.124), which implies the statement of this theorem, we complete the proof of this theorem by verifying (5.129). To do this we use the Ornstein–Uhlenbeck or Hermite semigroup with invariant measure γ. For f ∈ L1 (γ) set f (e−t x + (1 − e−2t )1/2 y) dγ(y) x ∈ Rn , t > 0. Pt f (x) = Rn
(5.132)
We note the following properties of {Pt , t ≥ 0}. Lemma 5.4.2 (1) The operators Pt are contractions on the function spaces Lp (γ), for all p ≥ 1. (2) For all sufficiently smooth integrable functions f and g and every t > 0, f Pt g dγ = gPt f dγ. (5.133) (3) {Pt , t ≥ 0} is a semigroup of operators, that is, Ps ◦ Pt = Ps+t . P0 is the identity and for any f ∈ L1 (γ), limt→∞ Pt f = f dγ. Proof For any 0 ≤ α ≤ 1, set α = (1 − α2 )1/2 . Then let α = e−t and Y be a standard normal random variable. We can write Pt f (x) = E(f (αx + α Y )). Let X be a standard normal random variable independent of Y and note that Z = αX + α Y is also a standard normal random variable. We have Pt f p
=
1/p
(EX |EY (f (αX + α Y ))|p )
1/p
≤ (EX EY |f (αX + α Y )|p ) =
1/p
(E|f (Z) |p )
= f p
which is (1) (in the last line we use the conditional H¨ older’s inequality).
5.4 Concentration inequalities
217
To obtain (2) we note that gPt f dγ = Eg(X)f (αX + α Y ) = Eg(X)f (Z). Here (X, Z) is a Gaussian random variable, EX 2 = EZ 2 = 1, and E(XZ) = α. Clearly then Eg(X)f (Z) = Eg(Z)f (X), which is (2). When β = e−s we have Y )) = E(f (βαx + β α Y + βX)). Pt ◦ Ps f (x) = E(Ps f (αx + α is normal with variance 1 − (αβ)2 , we see that Ps ◦ Pt = Since β α Y + βX Ps+t . The rest of (3) is immediate. Proof of Theorem 5.4.1 continued Let L be the infinitesimal operator for the semigroup {Pt , t ≥ 0}. That is, L satisfies d Pt f = Pt Lf = LPt f dt
(5.134)
for all sufficiently smooth functions f ∈ Rn . One can check that Lf (x) = f (x) − (x, f (x)).
(5.135)
It follows, by repeatedly integrating by parts on each component of Rn , that − f (x)(Lg(x)) dγ(x) = (f (x), g(x)) dγ(x). (5.136) Let 0 ≤ f ≤ 1 be a smooth function on Rn . We assume that 0 < Pt f < 1. This is the case unless f is equal to one or zero on a set of γ measure one, in which case (5.129) is satisfied. To verify (5.129) when 0 < Pt f < 1, it suffices to show that the function % F (t) = U 2 (Pt f ) + | Pt f |2 dγ (5.137) is nonincreasing in t ≥ 0, since if this is true, then F (∞) ≤ F (0), which implies (5.129) by property (3) of {Pt , t ≥ 0}. Showing that the derivative of F is less than or equal to zero is a straightforward, albeit tedious, calculation. However, since it is critical in the proof of this theorem, we go through the details. Using (5.134), we see that dF UU (Pt f )LPt f + ((Pt f ), (LPt f )) % dγ. (5.138) = dt U 2 (Pt f ) + | Pt f |2 To simplify the notation we set Pt f = h and U 2 (h) + | h|2 = K(h). In
Basic properties of Gaussian processes
218
this notation (5.138) becomes dF UU (h)Lh + (h, (Lh)) % dγ. = dt K(h) Let hi =
(5.139)
∂h . Note that ∂xi (h, (Lh)) =
n
hi Lhi − | h|2
(5.140)
K(h) = 2UU (h) h + | h|2 .
(5.141)
i=1
and
Also, by (5.136) and (5.140), dF | h|2 UU (h) % , h dγ − dγ = − % dt K(h) K(h) n hi % , hi dγ − K(h) i=1
(5.142)
= I + II + III. By Lemma 5.1.5, UU (h) (U )2 − 1 1 % (K(h), h) dγ (5.143) I=− | h|2 dγ + 2 K 3/2 (h) K(h) and by and (5.141), I + II
(5.144) (U (h))2 2 =− U (h) + | h|2 | h|2 dγ K 3/2 (h) 1 UU (h) (UU (h))2 2 dγ + | h| | h|2 , h dγ + 3/2 3/2 2 K (h) K (h) (U (h))2 (h) 1 UU 4 =− dγ + | h| | h|2 , h dγ 3/2 3/2 2 K (h) K (h) = I + II .
Also, III
= − = −
n i=1 n i=1
1
K 3/2 (h) 1 K 3/2 (h)
1 K(h)| hi | − hi (K(h), hi ) dγ 2 2
#
U 2 (h)| hi |2 + | h|2 | hi |2
5.4 Concentration inequalities
219
1 −hi UU (h)(h, hi ) − hi | h|2 , hi 2 = IV + V + V I + V II, where IV = −
n U 2 (h)| hi |2 dγ K 3/2 (h) i=1
$
dγ
(5.145)
and similarly for V , V I, and V II. n Note that i=1 hi hi = 12 | h|2 , so that V I = II . Therefore, n 2 (hi hj U (h) − hi,j U(h)) I + II + IV + V I = − dγ ≤ 0, (5.146) K 3/2 (h) i,j=1 ∂2h . Also, ∂xi ∂xj n n n 2 2 2 2 2 2 2 2 hi hj,k + hk hj,i + | h| | hi | = hi hj,i (5.147)
where hi,j :=
i
j=1
i=1
i>k
and 1 hi | h|2 , hi 2 i n
1 | h|2 , | h|2 (5.148) 4 n n 2hi hk hj,i hj,k + h2i h2j,i . =
=
j=1
This shows that V + V II ≤ 0. Thus
i=1
i>k
dF ≤ 0. dt
The next theorem, which is a consequence of Theorem 5.4.1, plays a critical role in this book. Theorem 5.4.3 Let X = {X(z), z ∈ T } be a real-valued mean zero Gaussian process where T is a countable set. Let a be a median of supz∈T X(z) and let σ := sup(EX 2 (z))1/2 < ∞.
(5.149)
z∈T
Then, for all t > 0, we have P sup X(z) > a − σt ≥ Φ(t),
(5.150)
z∈T
P
sup X(z) < a + σt z∈T
≥ Φ(t),
(5.151)
220
Basic properties of Gaussian processes
and
P sup X(z) − a ≥ σt ≤ 2(1 − Φ(t)).
(5.152)
z∈T
Furthermore, the median a is unique. Proof To begin, we take T to be a finite set and assume that the covariance matrix, say R, of {X(z), z ∈ T } is strictly positive definite. This implies that R is invertible and we can write R = P DP t , where P is an orthogonal matrix and D is a diagonal matrix with strictly positive entries. Let n = card{T } and let ρn denote the measure induced by {X(z), z ∈ T } on Rn . For C ⊂ Rn we have xR−1 xt dx exp − . (5.153) ρn (C) = n/2 |R|1/2 2 (2π) C Under the change of variables x = uD1/2 P t , we get t uu du exp − ρn (C) = n/2 2 (2π) −1/2 CP D
(5.154)
so that ρn (C) = γn (CP D−1/2 ).
(5.155)
Let f : Rn → R1 be given by f (x) = supk xk , where x = (x1 , . . . , xn ). Let Cb = {f ≤ b}. We see, by (5.153), that ρn (Cb ) is strictly increasing. This shows that the median of f is unique. Let a be the median of f , that is, ρn ({f ≤ a}) = 1/2, and set A = {f ≤ a}. By (5.155), since ρn (A) = 1/2, γn (AP D−1/2 ) = Φ(0). Therefore, by Theorem 5.4.1, γn (AP D−1/2 + tB) ≥ Φ(t),
(5.156)
where B is the unit ball in Rn , and by (5.155) ρn (A + tBD1/2 P t ) ≥ Φ(t).
(5.157)
A + tBD1/2 P t ⊂ {f ≤ a + σt},
(5.158)
1/2
(5.159)
We show that
where σ = sup(EX 2 (z))1/2 = sup Rk,k . z∈T
k
This along with (5.157) gives (5.151). To get (5.158) let x = α + tβD1/2 P t , where α ∈ A and β ∈ B. Then sup xk ≤ a + t sup |(βD1/2 P t )k |. k
k
(5.160)
5.4 Concentration inequalities Note that (βD1/2 P t )k =
n
βj (D1/2 P t )j,k .
221
(5.161)
j=1
Therefore, by the Schwarz inequality and the fact that |β| ≤ 1 1/2 n (5.162) |(βD1/2 P t )k | ≤ (D1/2 P t )2j,k . j=1
Also, n
(D1/2 P t )2j,k
(5.163)
j=1
=
n
(D1/2 P t )j,k (D1/2 P t )j,k =
j=1
= ((D
n
(D1/2 P t )tk,j (D1/2 P t )j,k
j=1 1/2
t t
P ) (D
1/2
P ))k,k = (P DP t )k,k = Rkk ≤ σ 2 . t
Combining the inequalities in this paragraph, we get (5.158) and hence (5.151). To obtain (5.150) consider the set A = {f ≥ a}. Since ρn (A ) = 1/2, we have, as in (5.157), ρn (A + tBD1/2 P t ) ≥ Φ(t).
(5.164)
(A + tBD1/2 P t ) ⊂ {f ≥ a − σt}
(5.165)
Also, we have since, if α ∈ A , β ∈ B and x = α − tβD1/2 P t , then sup xk ≥ a − t sup |(βD1/2 P t )k | ≥ a − σt k
(5.166)
k
by (5.162) and (5.163). The inequality in (5.150) now follows from (5.164) and (5.165). We now remove the condition that the covariance matrix of X is strictly positive definite. By Corollary 5.3.4 there is a version of X m that can be represented by X(z) = j=1 φj (z)ξj , t ∈ T , where {ξj } is a standard normal sequence. (Clearly, if the covariance matrix of X is not strictly positive definite, m is strictly less than the cardinality of T .) Note that xxt dx exp − , (5.167) P sup X(z) ≤ b = n/2 2 (2π) |R|1/2 z∈T Cb m where Cb = {x : supz∈T j=1 φj (z)xj ≤ b}. It follows from this that the
222
Basic properties of Gaussian processes
distribution function of supz∈T X(z) is continuous and strictly increasing. In particular, supz∈T X(z) has a unique median, which we denote by a. Let {gi }ni=1 be a standard normal sequence and set X = {X(z1 ) + g1 , . . . , X(zn ) + gn }. It is easy to check that the covariance matrix of X is strictly positive definite. Let a be the median of supt∈T X (t) and σ = supt∈T (EX2 (t))1/2 . By (5.150), P sup X (z) > a − σ t ≥ Φ(t). (5.168) z∈T
It is also easy to check that supz∈T X (z) converges to supz∈T X(z) almost surely as goes to zero. Clearly lim→0 σ = σ, and since the distribution function of supz∈T X(z) is continuous and strictly increasing, lim→0 a = a. Therefore we can take the limit in (5.168) to get (5.150) for X without the requirement that the covariance matrix of X is strictly positive definite. A similar argument gives (5.151) for finite sets T . Now let T be a countable index set. Choose an increasing sequence of finite sets {Tn }, Tn ⊂ T , such that limn→∞ Tn = T . Let an = median supz∈Tn X(z) and σn = supz∈Tn (EX 2 (z))1/2 . Note that both {an } and {σn } are increasing in n. Let α = limn→∞ an and σ = limn→∞ σn . We always have an < 2σn ≤ 2σ. Since we assume that σ is finite, we see that α is also finite. By (5.151), for any n, Φ(t) ≤ P sup X(z) < α + σt . (5.169) z∈Tn
Since the sets En = {supz∈Tn X(z) < α+σt} are decreasing and ∩n En = {supz∈T X(z) ≤ α + σt}, we can take the limit in (5.169) as n→ ∞ to get (5.170) Φ(t) ≤ P sup X(z) ≤ α + σt . z∈T
Let t > 0 and let {tn } be a sequence of positive real numbers, tn < t, that increases to t. Then, by (5.170), Φ(tn ) ≤ P sup X(z) < α + σt . (5.171) z∈T
Taking the limit as n → ∞, we see that Φ(t) ≤ P sup X(z) < α + σt . z∈T
(5.172)
5.4 Concentration inequalities We also note that by (5.170), 1 ≤P 2 By (5.150), for any n, Φ(t) ≤ P
sup X(z) ≤ α .
(5.173)
z∈T
sup X(z) > an − σt sup X(z) > an − σt .
(5.174)
z∈Tn
≤ P
223
z∈T
Since the sets Fn = {supz∈T X(z) > an −σt} are decreasing and ∩n Fn = {supz∈T X(z) ≥ α − σt}, we can take the limit in (5.174) as n→ ∞ to get Φ(t) ≤ P sup X(z) ≥ α − σt . (5.175) z∈T
As above, let t > 0 and let {tn } be a sequence of positive real numbers, tn < t, that increases to t. Then, by (5.175), Φ(tn ) ≤ P sup X(z) > α − σt (5.176) z∈T
and taking the limit as n → ∞ we see that Φ(t) ≤ P sup X(z) > α − σt .
(5.177)
z∈T
Letting t decrease to zero, we obtain 1 ≤ P sup X(z) ≥ α . 2 z∈T
(5.178)
We see from (5.173) and (5.178) that α is a median of supz∈T X(z). Thus we get (5.150) and (5.151) from (5.177) and (5.172). The inequality in (5.152) follows from (5.150) and (5.151). To see that the median is unique, note that it follows from (5.151) that for t > 0 P sup X(z) − median sup X(z) ≥ σt ≤ Ψ(t), (5.179) z∈T
z∈T
where Ψ(t) = 1 − Φ(t). Suppose that median supz∈T X(z) is equal to both a1 and a2 with a2 > a1 . Let t = (a2 − a1 ). We then have 1 1 ≤ P sup X(z) ≥ a2 = P sup X(z) ≥ a1 + t ≤ Ψ(t/σ 2 ) < . 2 2 z∈T z∈T (5.180) Therefore a1 = a2 .
224
Basic properties of Gaussian processes
Remark 5.4.4 It is useful to replace X by |X| in (5.152). Using (5.150) and (5.151) we get P sup |X(z)| − a ≥ σt ≤ 3(1 − Φ(t)). (5.181) z∈T
To see this note that, by symmetry, (5.151) also holds with X replaced by −X. Therefore (5.182) P sup |X(z)| < a + σt z∈T ! ! =P sup X(z) < a + σt ∩ sup −X(z) < a + σt z∈T
z∈T
≥ 2Φ(t) − 1. Using (5.182) and the fact that (5.150) holds with X replaced by |X|, we get (5.181). Statements like (5.152) and (5.181) are called concentration inequalities because they show that supz∈T X(z) and supz∈T |X(z)| are concentrated at the median of supz∈T X(z) when this median is large relative to σ. It is often useful to replace a in Theorem 5.4.3 by E supz∈T X(z). The median of a random variable Y is less than or equal to 2EY , but in general we can say nothing about bounding EY by the median of Y . However, for Gaussian processes, using Theorem 5.4.3, we can show that the mean and median of supz∈T X(z) can be very close. Corollary 5.4.5 Under the hypotheses and notation of Theorem 5.4.3, σ |a − E sup X(z)| ≤ √ . 2π z∈T Also,
σ ≤ 2 median
Proof
sup |X(z)| .
(5.183)
(5.184)
z∈T
It follows from (5.150) that ∞ 2 a − supz∈T X(z) 1 ≥t ≤ √ P e−u /2 du. (5.185) σ 2π t ∞ Since, for any random variable Y , EY ≤ 0 P (Y ≥ u) du, we see that ∞ ∞ 2 a − supz∈T X(z) 1 1 ≤√ E e−u /2 du dt = √ . (5.186) σ 2π 0 2π t
5.4 Concentration inequalities
225
Therefore σ a − E sup X(z) ≤ √ . 2π z∈T
(5.187)
Using (5.151) and essentially the same argument we get E supz∈T X(z)− √ a ≤ σ/ 2π and hence (5.183). For a fixed z ∈ T , (EX 2 (z))1/2 ≤ 2 median |X(z)| ≤ 2 median sup|X(z)|. z∈T
This implies (5.184). (Note that for the first inequality we use a table of the cumulative normal distribution.) Theorem 5.4.3 enables us to make some very strong statements about integrability properties of the supremum of Gaussian processes. These immediately extend to Banach space–valued Gaussian processes and we express them in this way. Let (B, · ) be a Banach space with the following property: There exists a countable subset D of the unit ball of B ∗ , the dual of B, such that x = supf ∈D |f (x)| for all x ∈ B. X is a B-valued Gaussian random variable if f (X) is measurable for all f ∈ D, and every finite linear combination i αi fi (X), αi ∈ R1 , fi ∈ D, is a Gaussian random variable. X has mean zero if E(f (X)) = 0 for all f ∈ D. Corollary 5.4.6 Let X be a (B, · )-valued Gaussian random variable with mean zero, and let σ := sup (E(f (X))2 )1/2 . f ∈D
(5.188)
Then lim
1
t→∞ t2
log P ( X ≥ t) = −
and, equivalently, 1 2 E exp X < ∞ 2α2
1 2 σ2
if and only if α > σ .
(5.189)
(5.190)
Proof By hypothesis on the Banach space (B, · ), there exists a countable set D in the unit ball of B ∗ such that X = supf ∈D |X(f )|. It follows from (5.181) that 1 1 lim log P sup |X(f )| ≥ t ≤ − 2 . (5.191) t→∞ t2 2 σ f ∈D
Basic properties of Gaussian processes
226
This gives us the upper bound in (5.189). The lower bound is trivial since P (supf ∈D |X(f )| > t) ≥ P (|X(f )| > t) for all f ∈ D. Using (5.189), we get sufficiency in (5.190). The reverse implication follows since 1 1 2 2 ≥ sup X E exp X (f ) E exp 2α2 2α2 f ∈D # σf2 $−1/2 = sup 1 − 2 , α f ∈D where σ 2 (f ) = EX 2 (f ). This is finite only when α2 > σ 2 . For a (B, · )-valued random variable X, let X p := (E X p )1/p . In the next corollary of Theorem 5.4.3 we show that all the moments of the norm of a mean zero (B, · )-valued Gaussian random variable are equivalent. Corollary 5.4.7 Let X be a (B, · )-valued Gaussian random variable with mean zero. Then all the moments of X exist. Furthermore, the moments are equivalent, that is, for all p, q > 0 there exist constants Kp,q such that X p ≤ Kp,q X q .
(5.192)
√ In particular, Kp,2 ≤ K p for p ≥ 1, where K is an absolute constant. Proof The fact that X has all its moments follows immediately from (5.189). To continue, as in the proof of Corollary 5.4.6, we obtain (5.192) with X replaced by supf ∈D |X(f )|. Let a = med (supf ∈D |X(f )|). By (5.181), for all p > 0, ∞ p E sup |X(f )| − a = P | sup |X(f )| − a| > t dtp f ∈D
0
f ∈D
∞ ∞ 3 ≤ √ σp exp(−u2 /2) du dtp 2π 0 t ∞ 3 = √ σp tp exp(−t2 /2) dt 2π 0 √ ≤ (K pσ)p (5.193)
for some absolute constant K. Therefore, 1/p E sup |X(f )|p f ∈D
√ ≤ a + K pσ.
(5.194)
For a real-valued random variable |ξ|, median |ξ| ≤ (2E|ξ|q )1/q for all
5.5 Comparison theorems
227
q > 0. Thus a ≤ (2E supf ∈D |X(f )|q )1/q . Also, by (5.184), σ ≤ 2a. Using these observations in (5.194) we get (5.192) and the comment following it.
5.5 Comparison theorems Gaussian processes are determined by their covariance functions. In this section we show the rather remarkable fact that, generally speaking, the smoother the covariance function, the better behaved are the sample paths of the processes. Lemma 5.5.1 (Slepian) Let ξ and ζ be mean zero Rn -valued Gaussian random variables such that Eξj2 = Eζj2
∀1≤j≤n
(5.195)
∀ 1 ≤ j, k ≤ n.
(5.196)
and Eξj ξk ≤ Eζj ζk
Then, for any real numbers λ1 , . . . , λn ,
P ∪nj=1 {ξj > λj } ≥ P ∪nj=1 {ζj > λj } .
(5.197)
Proof Let η be mean zero Rn -valued Gaussian random variable with a strictly positive definite covariance matrix Γ = {Γj,k }. Taking the inverse Fourier transform of the characteristic function of η, we can express the probability density function of η by (Γx, x) 1 exp −i(z, x) − g(z, Γ) = dx, (5.198) (2π)n 2 where x ∈ Rn . Differentiating the right-hand side of (5.198), one sees that ∂ 2 g(z, Γ) ∂g(z, Γ) = ∂Γj,k ∂zj ∂zk
∀ j, k = 1, . . . , n
Let Q(η, Γ) = P
∩nj=1 {ηj
≤ λj } =
λ1 −∞
···
j = k.
(5.199)
g(z, Γ) dz.
(5.200)
λn
−∞
For 1 ≤ j < k ≤ n, we see by (5.199) that λ1 λn ∂2 ∂Q(η, Γ) = ··· g(z, Γ) dz. ∂Γj,k −∞ −∞ ∂zj ∂zk
(5.201)
Basic properties of Gaussian processes
228
The right-hand side of (5.201) is an (n − 2)-fold integral of the function g, with the arguments zj and zk replaced by λj and λk and the domain of integration of the remaining variables the same as before. This shows us that ∂Q(η, Γ)/∂Γj,k > 0.
(5.202)
Let Σ = {Σj,k } and Z = {Zj,k } be the covariance matrices of ξ and ζ, respectively, and assume that both matrices are strictly positive definite. For 0 ≤ θ ≤ 1 set Γj,k (θ) = θΣj,k + (1 − θ)Zj,k and note that Γ(θ) = {Γj,k (θ)} is strictly positive definite for all 0 ≤ θ ≤ 1. Let q(θ) = 1 − Q(η, Γ(θ)). Using (5.195), (5.196), and (5.202), we see that dq(θ) dθ
= − = − = −
n ∂Q(η, Γ(θ)) dΓj,k (θ) ∂Γj,k dθ
j,k=1 n
(5.203)
∂Q(η, Γ(θ)) (Σj,k − Zj,k ) ∂Γj,k
j,k=1 n
j,k=1,j =k
∂Q(η, Γ(θ)) (Σj,k − Zj,k ) ≥ 0. ∂Γj,k
Since q is increasing, we get
q(1) = P ∪nj=1 {ξj > λj } ≥ q(0) = P ∪nj=1 {ζj > λj } .
(5.204)
This proves the lemma when both Σ and Z are strictly positive definite. Suppose this assumption is not satisfied. Let φ be a standard normal sequence with n components independent of ξ and ζ. Consider ξ = ξ + φ and ζ = ζ + φ. The hypotheses of this lemma hold for these Gaussian random variables and their covariance matrices are strictly positive definite for all > 0. Therefore the lemma holds for ξ and ζ . By considering their characteristic functions it is easy to see that ξ and ζ converge in distribution to ξ and ζ as → 0. Thus the lemma holds as stated. The following simple corollary of Lemma 5.5.1 is fundamental and used often in this book. Corollary 5.5.2 Let {Yj } be a Gaussian sequence such that EYj2 = 1
and
EYj Yk ≤ δ
j = k.
(5.205)
Then lim sup j→∞
Yj ≥ (1 − δ)1/2 (2 log j)1/2
a.s.
(5.206)
5.5 Comparison theorems Proof
229
Let {ξj } be a standard normal sequence and set Zj = δ 1/2 ξ1 + (1 − δ)1/2 ξj
j = 2, . . .
(5.207)
Clearly EYj Yk ≤ EZj Zk
j = k.
(5.208)
Therefore, by Lemma 5.5.1 and the Monotone Convergence Theorem, # $ 1/2 P ∪∞ ≥ (1 − δ)1/2 } (5.209) j=k {Yj /(2 log j) $ # 1/2 ≥ (1 − δ)1/2 } . ≥ P ∪∞ j=k {Zj /(2 log j) By Theorem 5.1.4, this last term is equal to one. Using the Monotone Convergence Theorem again we get (5.206). Lemma 5.5.3 Let ξ and ζ be mean zero Rn -valued Gaussian random variables such that E(ζj − ζk )2 ≤ E(ξj − ξk )2
∀ 0 ≤ j, k ≤ n.
(5.210)
Then E sup ζj ≤ E sup ξj
(5.211)
E sup |ζj − ζk | ≤ E sup |ξj − ξk |.
(5.212)
j
j
and j,k
j,k
A much simpler proof gives this result with the factor 2 on the righthand sides of (5.211) and (5.212) (see Chapter 15, Section 3 in Kahane (1985)). However, we need this degree of sharpness in the proof of Theorem 6.3.4. Proof
Note that (5.211) implies (5.212) because
E sup |ζj − ζk | = E sup(ζj − ζk ) = E(sup ζj + sup(−ζk )) = 2E sup ζj , j,k
j,k
j
k
j
(5.213) where the last step uses the fact that ζ is symmetric. Therefore it suffices to prove (5.211). The proof follows along the lines of Lemma 5.5.1 but is more complicated. We use the notation of Lemma 5.5.1. As we show in Lemma 5.5.1, it is enough to consider that Σ and Z are strictly positive definite. Also, without loss of generality, we can assume that ξ and ζ are independent. Consider ξ(θ) = θ1/2 ξ + (1 − θ)1/2 ζ, 0 ≤ θ ≤ 1. The covariance matrix
230
Basic properties of Gaussian processes
of ξ(θ) is Γ(θ), which is strictly positive definite for all 0 ≤ θ ≤ 1. We obtain (5.211) by showing that h(θ) = E sup ξj (θ)
(5.214)
j
is increasing in θ. The probability density function of ξ(θ) is given by 1 (Γ(θ)x, x) g(z, Γ(θ)) = dx. (5.215) exp i(z, x) − (2π)n 2 Also, dh(θ) = dθ
max(z1 , . . . , zn ) Rn
It follows from (5.215) that dg(z, Γ(θ)) 1 =− dθ (2π)n
and ∂ 2 g(z, Γ(θ)) 1 =− ∂zj ∂zk (2π)n
d(g(z, Γ(θ)) dz. dθ
(5.216)
n d Γj,k (θ) 1 xj xk (5.217) 2 dθ j,k=1 (Γ(θ)x, x) dx exp i(z, x) − 2
(Γ(θ)x, x) dx. (5.218) xj xk exp i(z, x) − 2
Using (5.217) and (5.218), we see that n 1 d Γj,k (θ) ∂ 2 g(z, Γ(θ)) dg(z, Γ(θ)) . = 2 dθ ∂zj ∂zk dθ
(5.219)
j,k=1
Using this in (5.216) gives n dh(θ) 1 d Γj,k (θ) ∂ 2 g(z, Γ(θ)) = max(z1 , . . . , zn ) dz. (5.220) dθ 2 dθ ∂zj ∂zk Rn j,k=1
We now show that the right-hand side of (5.220) is greater than or equal to zero. To this end let us consider ∂ 2 g(z, Γ(θ)) max(z1 , . . . , zn ) dz. (5.221) I1,2 := ∂z1 ∂z2 Rn Let u1 = max(z2 , . . . , zn ). We can write ∞ ∂g(z, Γ(θ)) d I1,2 := dz1 z1 (5.222) dz1 ∂z2 u1 Rn−1 u1 d ∂g(z, Γ(θ)) u1 + dz1 dz2 · · · dzn . dz1 ∂z2 −∞
5.5 Comparison theorems
231
It is easy to do the integration with respect to z1 , using integration by parts on the first integral, to get ∞ ∂g(z, Γ(θ)) dz1 dz2 · · · dzn . (5.223) I1,2 = − ∂z2 Rn−1 u1 Interchanging the order of integration between z1 and z2 and setting u1,2 = max(z3 , . . . , zn ), we get ∞ z1 ∂g(z, Γ(θ)) dz2 dz1 dz3 · · · dzn . (5.224) I1,2 = − ∂z2 Rn−2 u1,2 −∞ ∞ = − g((z1 , z1 , . . . , zn ), Γ(θ)) dz1 dz3 · · · dzn . Rn−2
u1,2
For 2 < k ≤ n we define I1,k as we defined I1,2 in (5.221) but with 2 replaced by k. Following the argument above we get ∞
I1,k = −
g((z1 , z2 , . . . , zk−1 , z1 , zk+1 , . . . , zn ), Γ(θ)) dz1 Rn−2
u1,k
dz2 · · · dzk−1 , dzk+1 dzn , where u1,k = max(z2 , . . . , zk−1 , zk+1 , . . . , zn ). Since g is a probability density function, it is clear that I1,k < 0 for all k = 2, . . . , n. Next, consider ∂ 2 g(z, Γ(θ)) I1,1 := max(z1 , . . . , zn ) dz. (5.225) ∂z12 Rn ∞ ∂g(z, Γ(θ)) dz1 dz2 · · · dzn = − ∂z1 n−1 u1 R = g((u1 , z2 , . . . , zn ), Γ(θ)) dz2 · · · dzn , Rn−1
where, for the second equality, we follow the computations in (5.222) and (5.223). Divide Rn−1 into n − 1 disjoint sets Bk = {z ∈ Rn−1 , zk > u1,k }. Thus n I1,1 = g((zk , z2 , . . . , zn ) dz2 · · · dzn (5.226) k=2
Bk
since u1 = zk on Bk . The integral over B2 is ∞
g((z2 , z2 , . . . , zn ), Γ(θ)) dz2 Rn−2
u1,2
dz3 · · · dzn = −I1,2 . (5.227)
Basic properties of Gaussian processes
232
The integrals over the other sets Bk are obtained similarly and we see that n I1,1 = − I1,k . (5.228) k=2
Define Ij,j and Ij,k as in (5.221) and (5.225). Rearranging the components of ξ and ζ, we obtain a result similar to (5.228) for Ij,j and Ij,k for all j and k = j. That is, Ij,k ∀ j = 1, . . . , n. (5.229) Ij,j = − 1≤k≤n,k =j
Also, as in the case j = 1, Ij,k < 0 for all j = k. Using this in (5.220) we see that dh(θ) dθ
n d Γj,k (θ) d Γj,j (θ) (5.230) Ij,k + Ij,j dθ dθ j=1 1≤j,k≤n,j =k 1 d Γj,k (θ) d Γj,j (θ) − Ij,k = 2 dθ dθ 1≤j,k≤n,j =k $ d# 1 Γj,j (θ) + Γk,k (θ) − 2Γj,k (θ) Ij,k . = − 2 dθ
=
1 2
1≤j,k≤n,j>k
Note that Γj,j (θ) + Γk,k (θ) − 2Γj,k (θ)
= E(ξj (θ) − ξk (θ))2
(5.231)
= θE(ξj − ξk ) + (1 − θ)E(ζj − ζk )2 . 2
The derivative of Γj,j (θ) + Γk,k (θ) − 2Γj,k (θ) is greater than or equal to zero by (5.210). Since Ij,k , j = k, is less than zero, we see that dh(θ)/dθ ≥ 0, which proves this lemma. It is worthwhile to give a simple equality that is used often in this chapter. Lemma 5.5.4 Let (ξ1 , ξ2 ) be an R2 -valued mean zero normal random variable with (E(ξ1 − ξ2 )2 )1/2 = D. Then D E (ξ1 ∨ ξ2 ) = √ . 2π
(5.232)
Proof Let a and b be real numbers. Then a ∨ b = 12 ((a + b) + |a − b|). Using this and the fact that ξ1 and ξ2 have mean zero, we see that E (ξ1 ∨ ξ2 ) = 12 E (|ξ1 − ξ2 |) = 12 DE(|ξ|), where ξ is N (0, 1). This is (5.232).
(5.233)
5.5 Comparison theorems
233
Lemma 5.5.5 Let ξ1 , . . . , ξn be a standard normal sequence. Then, for all n ≥ 1, E sup ξj ≥ 1≤j≤n
1 1/2 . 12 (log n)
(5.234)
√ Proof Since log 1 = 0 and, by (5.232), E(ξ1 ∨ξ2 ) = 1/ π, we need only consider n ≥ 3. Recall that, for any real-valued function f , f + = f ∨ 0 and f − = |f ∧ 0|, so that f = f + − f − . Let Mn = sup1≤j≤n ξn . Clearly, EMn = EMn+ + E(−Mn− ).
(5.235)
We show below that P (Mn > (log n)1/2 ) > 0.18. This implies that EMn+ ≥ (0.18)(log n)1/2 . Also, note that (−Mn− ) is increasing as n increases. Therefore, for n ≥ 3, E(−Mn− ) ≥ E(−M3− ) and E(−M3− )
= E(−M3− |M3 < 0)(1/8) ≥ E(−ξ1− |M3 < 0)(1/8) = E(−ξ1− |ξ1 < 0)(1/8) = −(2/π)1/2 (1/8),
where in the next to last equality we use the fact that {M3 < 0} = ∩3i=1 {ξi < 0}. Using these observations in (5.235) we get (5.234). To see that P (Mn > (log n)1/2 ) > 0.18, we note the elementary equality P (Mn > b) = 1 − (1 − P (ξ1 > b))n . Furthermore, by the right-hand side of (5.19) and symmetry, we see that for n ≥ 3 and α = 1/(8π)1/2 , $ # 1 α > . P ξ1 > (log n)1/2 ≥ (5.236) n (8πn log n)1/2 n −α , we get the desired estimate for Mn . Since (1 − α n) < e
Let (T, d) be a metric space. A set S ⊂ T is said to be u-distinguishable if, for all x, y ∈ S, x = y, d(x, y) > u. Whenever we consider a Gaussian process X = {Xt , t ∈ T }, unless otherwise stated, we take the metric d(s, t) = (E(X(s) − X(t))2 )1/2 . We sometimes refer to this as the standard L2 metric. Lemma 5.5.6 (Sudakov) Let X = {Xt , t ∈ T } be a mean zero Gaussian process. Let S ⊂ T be finite and u-distinguishable. Then E sup X(s) ≥ s∈S
1 u(log #S)1/2 , 17
(5.237)
where #S is the cardinality of the set S. If E supt∈T X(t) < ∞ and S ⊂ T is u-distinguishable, then 2 17E supt∈T X(t) #S ≤ exp . (5.238) u
Basic properties of Gaussian processes
234
Proof Let {s1 , . . . , sn } denote the points of S. Let ξ = (ξ1 , . . . , ξn ) be a standard normal sequence. Consider the Gaussian process {Y (s), s ∈ √ S}, where Y (si ) = (u/ 2)ξi , i ∈ [1, n]. By hypothesis, E(Y (si ) − Y (sj ))2 ≤ E(X(si ) − X(sj ))2
∀ 1 ≤ i, j ≤ n.
(5.239)
Therefore, by Lemmas 5.5.3 and 5.5.5, √ E sup X(s) ≥ E sup Y (s) = (u/ 2)E sup ξi ≥ u(log n)1/2 /17, s∈S
1≤i≤n
s∈S
(5.240) which is (5.237). The inequality in (5.238) follows from (5.237) when #S is finite. Moreover, we see from (5.237) that when E supt∈T X(t) < ∞, #S must be finite. Hence we get (5.238) as stated. We have the following sharpening of Lemma 5.5.6. It seems to be an obvious and minor improvement over Lemma 5.5.6. However, it is exactly what is needed in Talagrand’s elegant proof of Theorem 6.3.4. Lemma 5.5.7 (Talagrand) Let X = {Xt , t ∈ T }, where T is countable, be a mean zero Gaussian process. Let S ⊂ T be 4u-distinguishable, u > 0. Then 1 E sup X(t) ≥ sup X(t) , (5.241) u(log #S)1/2 + inf E s∈S 12 t∈T t∈B(s,u) where B(s, u) = {t ∈ T | d(s, t) ≤ u}. Proof We can assume that E supt∈T X(t) < ∞. This implies, by Lemma 5.5.6, that S = {sk , k ∈ [1, M ]} is a finite set, where, of course, M = #S. Let {Xk , k ∈ [1, M ]} be independent copies of X and {gk , k ∈ [1, M ]} be a standard normal sequence independent of {Xk , k ∈ [1, M ]}. Let T = ∪k∈[1,M ] B(sk , u). We now define a Gaussian process on T as follows: Y (t) = Xk (t) − Xk (sk ) + ugk
t ∈ B(sk , u).
(5.242)
This is possible because the balls B(sk , u) are disjoint. law
Let t, t ∈ B(sk , u). Then Y (t) − Y (t ) = X(t) − X(t ), so dY (t, t ) = dX (t, t ). If t, t are in disjoint balls, d2Y (t, t ) = E(Xk (t) − Xk (sk ))2 + E(Xk (t ) − Xk (sk ))2 + 2u2 for some k = k . Therefore dY (t, t ) ≤ 2u. On the other hand, since t, t are in disjoint balls, dX (t, t ) ≥ 2u. Thus we see that dY (s, t) ≤ dX (s, t) for all s, t ∈ T . It follows from Lemma 5.5.3 that E sup X(t) ≥ E sup X(t) ≥ E sup Y (t). t∈T
t∈T
t∈T
(5.243)
5.6 Processes with stationary increments
235
We proceed to find a lower bound for (5.243): sup Y (t)
t∈T
=
sup
Y (t)
sup
(5.244)
k∈[1,M ] t∈B(sk ,u)
=
ugk +
sup k∈[1,M ]
− Xk (sk ) .
Xk (t)
sup
t∈B(sk ,u)
Recall that {Xk , k ∈ [1, M ]} and {gk , k ∈ [1, M ]} are independent. First we hold {gk , k ∈ [1, M ]} fixed and take the expectation with respect to {Xk , k ∈ [1, M ]} to get E{Xk } sup Y (t) ≥ t∈T
sup
ugk + E
k∈[1,M ]
≥
Xk (t)
sup
(5.245)
t∈B(sk ,u)
sup ugk + k∈[1,M ]
inf
k∈[1,M ]
E
sup
Xk (t).
t∈B(sk ,u)
Taking the expectation with respect to {gk , k ∈ [1, M ]}, we get E sup Y (t) ≥ E sup ugk + t∈T
k∈[1,M ]
inf
k∈[1,M ]
E
sup
Xk (t).
(5.246)
t∈B(sk ,u)
The inequality in (5.241) now follows from (5.243) and Lemma 5.5.5.
5.6 Processes with stationary increments A stochastic process {Y (t), t ∈ T } is called a stationary process if, for all law
t1 , . . . , tn in T , for all n ≥ 1, and all τ ∈ T , (Y (t1 ), . . . , Y (t)) = (Y (t1 + τ ), . . . , Y (tn + τ )). A stochastic process {Z(t), t ∈ T } is called a prolaw
cess with stationary increments if (Z(t1 ) − Z(s1 ), . . . , Z(tn ) − Z(sn )) = (Z(t1 + τ ) − Z(s1 + τ ), . . . , Z(tn + τ ) − Z(sn + τ )) for all t1 , . . . , tn and s1 , . . . , sn in T , for all n ≥ 1, and all τ ∈ T . These definitions do not make sense unless T is an additive semigroup. For our purposes it suffices to take T = Rd or T = [0, 1]d . Rd is a locally compact Abelian group, and, when dealing with Gaussian processes that can be expressed as random Fourier series, we view [0, 1]d as Rd /Z d , which is a compact Abelian group. Most of the results we give about Gaussian processes on Rd and [0, 1]d , when considered to be compact or locally compact Abelian groups, are also valid on general compact or locally compact Abelian groups. Since a Gaussian process is determined by its finite-dimensional distributions, a necessary and sufficient condition for a Gaussian process to be stationary is that its covariance Σ(s, t) is a function of s − t. Thus we may talk about a stationary Gaussian process Y = {Y (t), t ∈ T } with
236
Basic properties of Gaussian processes
covariance φ(s), meaning that EY (t)Y (t + s) = φ(s). Note that φ must be an even positive definite function, with φ(0) > 0 (or else Y (t) ≡ 0 almost surely, for all t ∈ T ). A theorem of Bochner states that every continuous, even, positive definite function φ on Rd is the cosine transform of a finite positive symmetric measure on Rd (see page 184 in Donoghue (1969)). Therefore, each stationary Gaussian process on Rd has a covariance φ that can be represented as cos(2π(λ, u)) µ(dλ), (5.247) φ(u) = Rd
where µ is a finite positive symmetric measure on Rd . We now consider the representation of characteristic functions of Gaussian processes with stationary increments. To motivate this, let Z(t) = Y (t) − Y (0) for Y as given above, and set Z = {Z(t), t ∈ T }. Clearly, Z is a Gaussian process with stationary increments and obviously Z(0) = 0 almost surely. Because of this Z(t) − Z(s) is equal in law to Z(t − s) and 2 ψ(t − s) := EZ (t − s) = 2 (1 − cos(2π(λ, t − s))) µ(dλ). (5.248) Rd
We note that 1 (ψ(s) + ψ(t) − ψ(s − t)) , (5.249) 2 which is positive definite since it is the covariance of Z. We extend the class of functions ψ in (5.248) to integrals with respect to a larger family of measures on Rd , that is, we consider ψ(u) = 2 (1 − cos 2π(λ, u)) ν(dλ), (5.250) EZ(t)Z(s) =
Rd
where ν is a symmetric positive measure on Rd satisfying (1 ∧ |λ|2 ) ν(dλ) < ∞,
(5.251)
Rd
the same condition as for a symmetric L´evy measure. It is easy to see that ψ(s) + ψ(t) − ψ(s − t) is still positive definite since 1 − cos 2π(λ, s) − cos 2π(λ, t) + cos 2π(λ, s − t) i2π(λ,s)
= Re(e
−i2π(λ,t)
− 1)(e
(5.252)
− 1).
Thus ψ as given in (5.250) determines a mean zero Gaussian process with stationary increments with covariance given by the right-hand side of (5.249).
5.6 Processes with stationary increments
237
Let {G(t), t ∈ R+ } be a mean zero Gaussian process with stationary increments and G(0) ≡ 0. We can set G(t) = ξt + G1 (t), where ξ is a normal random variable with mean zero and G1 and ξ are independent (ξ ≡ 0 is possible). It follows from Chapter 11, Section 11 in Doob (1953) that the covariance of G1 (t) is given by (5.250) for some measure ν satisfying (5.251). Consider a real-valued stationary Gaussian processes on R1 . Using symmetry, we can set F (λ) = µ({0}) + 2µ((0, λ]) and integrate over [0, ∞) in (5.247). When f (λ) = d/dλ(F (λ)) exists on [0, ∞) we can write (5.247) as ∞ cos(2πλu)f (λ) dλ. (5.253) φ(u) = 0
In this case we can represent a real-valued stationary Gaussian process Y = {Yt , t ∈ R1 } with covariance φ(u) by ∞ ∞ Y (t) = cos(2πλt)f 1/2 (λ) dB(λ) + sin(2πλt)f 1/2 (λ) dB (λ), 0
0
(5.254) where B and B are independent Brownian motions on [0, ∞). Also, by (5.253), ∞ # $2 = 2 (1 − cos 2πλ(s − t))f (λ) dλ E Y (s) − Y (t) 0 ∞ (sin2 πλ(s − t))f (λ) dλ. (5.255) = 4 0
F is called the spectral distribution and f the spectral density of the stationary process Y . For Gaussian processes with stationary increments, using symmetry, we set H(λ) = 2ν([λ, ∞)) and integrate over [0, ∞) in (5.250). When h(λ) = −d/dλ(H(λ)) exists on [0, ∞), we can write (5.250) as ∞ ψ(u) = (1 − cos 2πλu) h(λ) dλ. (5.256) 0
In this case we can represent a real-valued Gaussian process {Z(t), t∈ 1 R } with stationary increments and covariance ψ(s) + ψ(t) − ψ(s − t) by ∞ (1 − cos 2πλt) h1/2 (λ) dB(λ) (5.257) Z(t) = 2 0 ∞ + sin(2πλt)h1/2 (λ) dB (λ) , 0
Basic properties of Gaussian processes
238
where B and B are independent Brownian motions on [0, ∞). Analogous to (5.255), ∞ # $2 − Z(t) =4 sin2 (πλ(s − t))h(λ) dλ. (5.258) E Z(s) 0
We stress (5.254) and (5.257) because Gaussian processes associated with L´evy processes that have local times can be expressed in these ways. Although it is not common to do so, we shall also refer to ν as the process the spectral distribution and h as the spectral density of Z, with stationary increments. Consider the characterization of the characteristic function of a stationary Gaussian process given in (5.247) and assume that µ is supported on Z d with µ({k}) = a2k , for k ∈ Z d . In this case its characteristic function is a2k cos 2π(u, k) u ∈ [0, 1]d (5.259) χ(u) = k∈Z d
(since χ(u) is periodic it suffices to consider it on [0, 1]d ). We can rept , t ∈ [0, 1]d } with resent a real-valued stationary Gaussian process {W covariance χ(u) by the random Fourier series (t) = W ak (gk cos 2π(t, k) + gk sin 2π(t, k)) t ∈ [0, 1]d , (5.260) k∈Z d
where {gk }k∈Z d and {gk }k∈Z d are independent standard normal sequences. Series like this for d = 1 and 2 are associated Gaussian processes for L´evy processes on the torus. In this context we are actually considering the processes on Rd /Z d rather than on [0, 1]d . The next lemma is a technical result due to Fernique, which shows that, given a Gaussian process with stationary increments, we can find a Gaussian random Fourier series that is close to it in a very important way. This may appear as a very specialized observation, but it greatly simplifies the proof of a fundamental result of Fernique, Theorem 6.2.2 in this book, that the finiteness of Dudley’s metric entropy integral is a necessary condition for the continuity and boundedness of a Gaussian process with stationary increments. Lemma 5.6.1 (Fernique) Let Z = {Z(t), t ∈ [0, 1]d } be a Gaussian process with stationary increments and covariance given by (5.249) where (1 − cos(2π(λ, t − s))) ν(dλ) (5.261) ψ(t − s) := EZ 2 (t − s) = 2 Rd
5.6 Processes with stationary increments
239
for a positive symmetric measure ν satisfying (5.251). We can find a Gaussian random Fourier series W = {W (t), t ∈ [0, 1]d } such that 1/2 2 (1 ∧ |λ| ) ν(dλ) , E sup Z(t) − E sup W (t) ≤ Cd t∈[0,1]d t∈[0,1]d Rd (5.262) where Cd is a constant depending only on the dimension d and ν is the measure in (5.250). Proof The covariance of Z is given in terms of ψ, which is an integral with respect to ν(dλ), λ ∈ Rd , where λ = (λ1 . . . , λd ). We partition Rd as follows: Let λ = sup1≤k≤d |λk |. For all n ∈ Z d , where n = (n1 , . . . , nd ), we set An = {λ ∈ Rd : (2nk − 1) ≤ λk < (2nk + 1), ∀ k = [1, d]}.
(5.263)
Note that λ ≤ 1 on A0 , where 0 = (0, . . . , 0), and λ ≥ 1 on An , n = 0. For t ∈ Rd we define 1 ν 1/2 (An ) (gn cos 2π(2n, t) + gn sin 2π(2n, t)) , W (t) = 2 d n∈Z /{0}
(5.264) where {gn }n∈Z d and {gn }n∈Z d are independent standard normal sequences independent of the Gaussian process Z. We define another Gaussian process U = {U (t), t ∈ Rd }, which is independent of both Z and W by U (t) = Rd
(5.265) 1/2 d d 2 1/2 1/2 2π d (1 ∧ |λ| ) ν(dλ) gk tk + (2π) Bk (tk ) , k=1
k=1
(g1 , . . . , gd )
is a standard normal sequence where t = (t1 , . . . , td ); g = and B1 , . . . , Bd are independent Brownian motions independent of g . d d Using the trivial inequality sup t∈[0,1]d k=1 fk (tk ) ≤ k=1 sup s∈[0,1] fk (s), we see that 1/2 E sup U (t) = Cd (1 ∧ |λ|2 ) ν(dλ) . (5.266) t∈[0,1]d
Rd
Using the inequalities 1 − cos θ ≤ θ2 /2 and | cos α − cos β | ≤ |α − β|, we see that 2 d 2 2 |sk − tk | λ ∈ A0 =⇒ (1 − cos 2π(λ, s − t)) ≤ 2π λ k=1
Basic properties of Gaussian processes
240 and
λ ∈ An , n = 0 =⇒ | cos 2π(λ, s − t) − cos 2π(2n, s − t)| ≤ 2π
d
|sk − tk |.
k=1
Using these two statements and the Schwarz inequality, we see that 2 dZ (s, t) − d2W (s, t) (5.267) (1 − cos 2π(λ, s − t)) ν(dλ) ≤ A0 |cos 2π(λ, s − t) − cos 2π(2n, s − t)| ν(dλ) + n =0
2 d λ ν(dλ) |sk − tk |
≤ 2π
An
2
2
A0
k=1
+2πν(Rd \ A0 )
d
|sk − tk |
k=1
≤ d2U (s, t). It now follows from Lemma 5.5.3 and (5.266) that 1/2 (1 ∧ |λ|2 ) ν(dλ) (5.268) E sup Z(t) ≤ E sup W (t) + Cd t∈[0,1]d
t∈[0,1]d
and
Rd
E sup W (t) ≤ E sup Z(t) + Cd t∈[0,1]d
t∈[0,1]d
1/2 (1 ∧ |λ| ) ν(dλ) , (5.269) 2
Rd
which give us (5.262).
5.7 Notes and references In this survey of aspects of the theory of Gaussian processes, we present the material that we need to obtain results about the local times of symmetric Markov processes. Basically this comes down to sample path properties of Gaussian processes such as continuity and limit laws. We have in mind a reader who wants proofs of the fundamental results but who is also eager to move on, to see how these results are used in the study of local times. To this end we have tried to put the most important results in the statements of theorems, corollaries, and occasional remarks, so that they serve as an outline of the basic results. In this way, the reader who wishes to can skip some of the proofs on a first
5.7 Notes and references
241
reading. In our attempt to streamline the presentation we have omitted many interesting sidelights that exhibit the depth and beauty of this subject. Indeed, this survey is not a substitute for the profound elaborations of this subject in Probability in Banach Spaces by Ledoux and Talagrand (1991) and Fonctions Al´eatoires Gaussiennes Vecteurs Al´eatoires Gaussiennes by Fernique (1997). In this presentation we rely most heavily on the two books mentioned above and the long article “Continuity of subgaussian process” by Jain and Marcus (1978) and Uniform Central Limit Theorems by Dudley (1999). Other expos´es that we know of that the reader might want to consult are Lifshits (1995) and Adler (1991). Also, the Saint Flour notes by Fernique (1975) are useful, since they present the subject in its early days when the proofs were less elegant but perhaps easier on a first reading. The same is true of Jain and Marcus (1978) and some of the expository material in Random Fourier Series with Applications to Harmonic Analysis by Marcus and Pisier (1981), which considers stationary Gaussian processes. Many people have contributed to the material presented in this chapter. Since this is only a survey, we cannot possibly give all the credit that is deserved. The book by Ledoux and Talagrand (1991) has a detailed and we think accurate history of the development of the theory of Gaussian processes in its chapter notes. Dudley (1999), who, like us, surveys some aspects of Gaussian process theory relevant to his presentation, has good historical notes on the material he treats in his notes on Chapter 2. Details about the moment generating function of Gaussian processes that we present in Section 5.2 are not usually considered in books on Gaussian processes. We use them in the proofs of the isomorphism theorems in Chapter 8 that relate Gaussian processes to local times. The material in Section 5.3 is taken from Jain and Marcus (1978). It depends on the results of Jain and Kallianpur (1972) and Ito and Nisio (1968b). Theorem 5.3.10 is given in Belyaev (1961). In Marcus and Rosen (1992d) we refer to Theorem 5.4.1 as Borell’s inequality; we learned it from Borell (1975). More recently, the attribution of this result has broadened. We quote Ledoux (1998) whom we consider an expert on this topic. “The Gaussian isoperimetric inequality has been established in 1974 independently by Borell (1975) and Sudakov and Tsirelson (1978) on the basis of the isoperimetric inequality on the sphere and a limiting procedure known as Poincar´e’s lemma. A proof using Gaussian symmetrizations was developed by Ehrhard (1983).” A complete proof of this fundamentally important result is not included
Basic properties of Gaussian processes
242
in earlier books on Gaussian processes because the known proofs depended on the Brun–Minkowski inequality. Fortunately, Ledoux (1998) has given a relatively simple and straightforward proof of this result based in part on a paper by Bobkov (1996) and his joint paper, Bakry and Ledoux (1996). We present Ledoux’s proof here. The argument about the uniqueness of the median in Theorem 5.4.3 is taken from Section 1.1 in Ledoux and Talagrand (1991). Corollary 5.4.6 is presented as a simple consequence of Theorem 5.4.3, but it predated Theorem 5.4.3. It was originally proved, without the best constant, independently by Landau and Shepp (1971) and Fernique (1970), and as given, independently by Marcus and Shepp (1972) and Fernique (1971). For a simple direct proof of Corollary 5.4.6 based on Fernique (1970) and Marcus and Shepp (1972), see Chapter II, Theorem 4.8 in Jain and Marcus (1978). Corollary 5.4.6 is very useful, but there are several critical points in this book where we must use Theorem 5.4.3. Lemma 5.5.1 is the famous result of Slepian (1962). It seems to be a very special property of Gaussian processes. The ideas behind it have been studied intensively, and it has been generalized in many directions. We refer the reader to Chapter 3, Section 3 in Ledoux and Talagrand (1991) for details and further references. See also Li and Shao (2002) for a reverse Slepian-type lemma. Lemma 5.5.3 is due to Fernique (1974). A slightly weaker result with an easier proof is due to Marcus and Shepp (1972) (see Corollary 3.14, Ledoux and Talagrand (1991)). Our treatment of Lemmas 5.5.1 and 5.5.3 is taken from Jain and Marcus (1978). Lemma 5.5.6 is by Sudakov (1973), and Lemma 5.5.7 is by Talagrand (1992). It is not necessary for F to be differentiable to get an expression like (5.254). When it is not we can write ∞ ∞ Y (t) = (cos 2πλt) dB(F (λ)) + (sin 2πλt) dB (F (λ)) (5.270) 0
0
and similarly in (5.257). We stress (5.254) and (5.257) because Gaussian processes associated with L´evy processes killed at different stopping times are of this form. Lemma 5.6.1 is taken from 3.2.9 Lemme in Fernique (1997).
6 Continuity and boundedness of Gaussian processes
According to legend, Kolmogorov proposed the problem of finding necessary and sufficient conditions for the continuity of the sample paths of a Gaussian process, in terms of its covariance function. This problem is well posed since a mean zero Gaussian process is determined by its covariance. One can tackle this question using Kolmogorov’s continuity theorem, which is given in Section 14.1, and get results that in some cases are pretty sharp. In fact, refinements of Kolmogorov’s approach actually give necessary and sufficient conditions for continuity when the covariance function is sufficiently regular. Nevertheless, to fully describe the criteria for the continuity and boundedness of Gaussian processes, we must abandon this approach totally. Here is the main point. Suppose we have a Gaussian process X = {X(t), t ∈ T }, where T has a natural topology. For example, suppose T = [0, 1]d , where, of course, the natural topology is the one induced by Euclidean distance. In the classical approach to studying continuity, one considers X(s) − X(t) when |s − t| is small. However, the Gaussian random variable X(s) − X(t) is small when dX (s, t) := (E(X(s) − X(t))2 )1/2
(6.1)
is small. Thus, one should prove the continuity of X on the metric or pseudometric space (T, dX ), not on (T, | · |). When X is continuous on (T, dX ) and dX is continuous on (T, | · |), X is also continuous on (T, | · |). Note that on a pseudometric space dX (s, t) = 0 does not necessarily imply that s = t. The continuity results we prove are with respect to pseudometric spaces. Many people have worked on the problem of finding necessary and sufficient conditions for the continuity of Gaussian processes. Nevertheless, 243
244
Continuity and boundedness of Gaussian processes
it seems appropriate to mention that the major results on this problem were obtained by R. M. Dudley, X. Fernique, and M. Talagrand.
6.1 Sufficient conditions in terms of metric entropy We obtain a sufficient condition for the continuity and boundedness of a Gaussian process X = {X(t), t ∈ T } on the metric or pseudometric space (T, dX ), where dX (s, t) = (E(X(s) − X(t))2 )1/2 . These conditions are given in terms of the metric entropy of T with respect to dX . Let (T, d) be a separable metric or pseudometric space with the topology induced by d. We use Bd (t, u) to denote a closed ball of radius u in T centered at t ∈ T . The diameter of T is the radius of the smallest ball with center in T that contains T . Often, when it is clear which metric is being used, we abbreviate Bd (t, u) by B(t, u). The following functions help describe the structure of T : (1) N (u) = N (T, d, u) is the minimum number of closed balls of radius u in the metric d with centers in T that cover T . The family {N (u) : u > 0} is called the metric entropy of T . (2) D(u) = D(T, d, u) is the maximum number of closed disjoint balls of radius u in the metric d with centers in T . The family {D(u) : u > 0} is called the packing numbers of T . (3) M (u) = M (T, d, u) is the maximum number of points of T such that each distinct pair of these points is u-distinguishable. Lemma 6.1.1 For each u > 0, N (2u) ≤ D(u) ≤ M (u) ≤ N (u/2).
(6.2)
Proof We consider the three inequalities in Lemma 6.1.1 in order. (1) Let v1 , . . . , vD(u) be the centers of the balls in D(u). Then T ⊂ D(u) ∪k=1 Bd (vk , 2u). To prove this, suppose the contrary, that there exists D(u) a w ∈ T that is not in ∪k=1 Bd (vk , 2u). Then Bd (w, u) is disjoint from all the balls in D(u), thus contradicting the definition of D(u). This follows because, for any k ∈ [1, D(u)], d(w, vk ) > 2u, whereas for any s ∈ Bd (w, u), d(s, w) ≤ u. Thus d(s, vk ) ≥ d(w, vk ) − d(s, w) > u. This D(u) shows that the D(u) balls ∪k=1 Bd (vk , 2u) cover T and therefore D(u) ≥ N (2u), since N (2u) is the cardinality of the smallest such covering. (2) This is much simpler since the centers of the balls in D(u) are u-distinguishable. (3) Let v1 , . . . , vM (u) be the u-distinguishable points in M (u). Let r1 , . . . , rN (u/2) be the centers of the balls in N (u/2). Each ball Bd (rk , u/2)
6.1 Sufficient conditions in terms of metric entropy
245
can contain only one of the points v1 , . . . , vM (u) . We see from this that N (u/2) ≥ M (u). We now come to the famous theorem of R. M. Dudley, the first major result in the modern theory of Gaussian processes. Theorem 6.1.2 (Dudley) Let X = {X(t), t ∈ T } be a mean zero Gaussian process and assume that D 1/2 (log N (T, dX , u)) du < ∞, (6.3) 0
where D denotes the diameter of T with respect to dX . Then there exists a version X of X with bounded uniformly continuous sample paths on (T, dX ) such that √ D/2 1/2 (log N (T, dX , u)) du (6.4) E sup X (t) ≤ 16 2 t∈T
and
0
E
X (t) − X (s) ≤ 99
sup
1/2
(log N (T, dX , u))
du.
(6.5)
0
s,t∈T
dX (s,t)≤
Proof If T contains only one point, then both sides of (6.4) are equal to zero. Thus we may assume that T contains at least two points. In this case, N (T, dX , u) ≥ 2 for u ∈ (0, D). Let S ⊂ T be finite and consider {X(s), s ∈ S}, along with the pseudometric space (S, dX ). For s ∈ S, let Fn (s) =
sup t∈B(s,D2−n )∩S
X(t) − X(s).
(6.6)
Clearly, supt∈B(s,D2−n )∩S (E(X(t) − X(s))2 )1/2 ≤ D2−n . Let Nn+1 = N (B(s, D2−n ), dX , D2−n−1 ) and let Sk , k = 1, . . . , Nn+1 , be a covering of B(s, D2−n ) by balls of radius D2−n−1 . Let sk denote the center of Sk . Set Fn+1 (s, sk ) =
sup t∈B(sk ,D2−n−1 )∩S
X(t) − X(s).
(6.7)
Clearly EFn (s) ≤ E sup Fn+1 (s, sk ).
(6.8)
EFn (s) = E
(6.9)
k≤Nn+1
Note that sup t∈B(s,D2−n )∩S
X(t)
Continuity and boundedness of Gaussian processes
246
and similarly for Fn+1 (s, sk ). Also, note that sup t∈B(sk ,D2−n−1 )∩S
(E(X(t) − X(s))2 )1/2 ≤ 2D2−n .
By (6.10) and (5.151) we see that
−n−1
P Fn+1 (s, sk ) − med Fn+1 (s, sk ) ≥ 4D2
1 u ≤√ 2π
(6.10)
∞
e−s
2
/2
ds,
u
(6.11) where med indicates the median. Consequently, P
−n−1
sup (Fn+1 (s, sk ) − med Fn+1 (s, sk )) ≥ 4D2
k≤Nn+1
1 ≤ 1 ∧ Nn+1 √ 2π
u
(6.12)
∞
−s2 /2
e
ds .
u
Integrating as in the proof of Corollary 5.4.5, we get E sup (Fn+1 (s, sk ) − med Fn+1 (s, sk )) k≤Nn+1
≤ 4D2−n−1
∞
0
≤ 4D2−n−1
1 1 ∧ Nn+1 √ 2π
(6.13) ∞
e−s
2
/2
ds
du
u
(2 log Nn+1 )1/2
1 du 0
∞ 1 −s2 /2 + Nn+1 √ e ds du 2π u (2 log Nn+1 )1/2 ∞ $ # 2 1 se−s /2 ds ≤ 4D2−n−1 (2 log Nn+1 )1/2 + Nn+1 √ 2π (2 log Nn+1 )1/2
∞
≤ 4D2−n−1 ((2 log Nn+1 )1/2 + (2π)−1/2 ).
√ By Corollary 5.4.5, med Fn+1 (s, sk ) ≤ EFn+1 (s, sk )+4D2−n−1 / 2π. Using this and (6.8), we see that EFn (s) ≤ E sup Fn+1 (s, sk )
(6.14)
k≤Nn+1
≤
−n−1
sup EFn+1 (s, sk ) + 4D2 k≤Nn+1
1/2
(2 log Nn+1 )
& 2 . + π
Finally, using (6.9) and (6.14) we get E
sup t∈B(s,D2−n )∩S
≤
sup E k≤Nn+1
X(t)
(6.15) sup
t∈B(sk ,D2−n−1 )∩S
X(t) + 8D2−n−1 (2 log Nn+1 )1/2 .
6.1 Sufficient conditions in terms of metric entropy
247
Let K(S, dX , u) = sups∈S N (B(s, 2u), dX , u) and set En = sup E
X(t).
(6.16)
En ≤ En+1 + 8D2−n−1 (2 log K(S, dX , D2−n−1 ))1/2
∀ n ≥ 0. (6.17)
s∈S
sup t∈B(s,D2−n )∩S
It follows from (6.15) that
Since S is finite, En is equal to zero for n sufficiently large. Thus we can sum the terms in (6.17) to obtain E sup X(t) = E0 ≤ 8 t∈S
∞
D2−n−1 (2 log K(S, dX , D2−n−1 ))1/2 . (6.18)
n=0
Using the fact that K(S, dX , D2−n−1 ) ≤ N (S, dX , D2−n−1 ), we can dominate the sum in (6.18) by an integral and obtain (6.4) for X, under the assumption that S is finite. Using the Monotone Convergence Theorem we see that (6.4) continues to hold for X when S is a countable dense subset, say S ∗ , of T . Therefore, since X is separable, we can find a version X of X for which (6.4) holds as stated. For all s ∈ S ∗ and > 0, we can apply (6.4) to the process {X (t), t ∈ B(s, )}. Using the fact that N (B(s, ), dX , u) ≤ N (T, dX , u), we see that √ 1/2 (log N (T, dX , u)) du E sup X (t) − X (s) ≤ 16 2 t∈S ∗
0
dX (s,t)≤
√ ≤ 32 2
/2
1/2
(log N (T, dX , u))
du.
(6.19)
0
It follows from (6.19) and Chebyshev’s inequality that X is continuous at s almost surely. We now obtain (6.5). Assume, to begin with, that S ⊂ S ∗ is finite. Set H = {(t, t ) : t, t ∈ S, dX (t, t ) ≤ }. Let Sk , k = 1, . . . , N (S, dX , η), be a covering of S by balls of radius η and let sk denote the center of Sk . Note that Gk := {(u, v), u ∈ B(sk , η), v ∈ B(sk , η + )}
k = 1, . . . , N (S, dX , η) (6.20)
is a covering of H . Therefore E(
sup
t,t ∈S dX (t,t )≤
≤E(
X (t) − X (t ))
(6.21)
sup
sup
k≤N (S,dX ,η)
t,t ∈S,dX (sk ,t)≤η dX (sk ,t )≤+η
X (t) − X (t )).
Continuity and boundedness of Gaussian processes
248
Following the argument used in (6.13)–(6.15), we see that the right-hand side of (6.21) is less than or equal to sup E ( k≤N (η)
sup
t,t ∈S,dX (sk ,t)≤η dX (sk ,t )≤+η
X (t) − X (t )) + 2(2η + )(2 log N (S, dX , η))1/2 . (6.22)
Furthermore, the expectation in (6.22) is less than or equal to E(
sup
X (t) − X (sk )) + E (
t∈S
dX (sk ,t)≤η
sup
t ∈S dX (sk ,t )≤+η
X (t ) − X (sk )). (6.23)
Using (6.23) in (6.21) and (6.22) and taking = η, we get E(
sup
t,t ∈S dX (t,t )≤
X (t) − X (t )) ≤ 2 sup E (
sup
s∈S
X (t) − X (s))
t∈S
dX (s,t)≤2
+ 6 (2 log N (S, dX , ))1/2 . (6.24) The inequality in (6.5) now follows from (6.19) and (6.24) when S is a finite set. As above, it also holds on S ∗ . Thus we get (6.5) as stated. The condition in (6.3) is also necessary for boundedness and continuity when the Gaussian process has stationary increments. We take this up in Section 6.2. Here we give an example that shows that (6.3) is not necessary in general. Example 6.1.3 Let T = {{tk }∞ k=4 ∪ 0}, where t4 = 1 and tk ↓ 0. Let X(tk ) := ξk /(2 log k log log k)1/2 , where {ξk }∞ k=4 is a standard normal sequence and X(0) = 0. It follows easily from the Borel–Cantelli Lemma that {X(t), t ∈ T } is continuous on (T, dX ) (we only need to check this at 0). Note that 1/2 1 1 + . (6.25) dX (tj , tk ) = 2 log j log log j 2 log k log log k Let uk = 1/(log k log log k)1/2 . It follows from (6.25) that BdX (tk , uk ) contains the points {tj }j≥k and 0. Therefore, N (T, dX , uk ) = k − 3. This implies that the metric entropy integral in (6.3) is infinite since it is essentially equivalent to
(log k)1/2 . k(log k)3/2 (log log k)1/2 k k (6.26) To make this an example in which X has continuous sample paths on ∞ [0, 1], simply define X(t) = k=4 ξk φk (t), where φk (t) is a positive (uk − uk+1 )(log N (T, dX , uk+1 ))1/2 ≈
6.1 Sufficient conditions in terms of metric entropy
249
continuous function supported on [(tk+1 + tk )/2, (tk + tk−1 )/2], which is zero at the endpoints of this interval and takes as its largest value 1/(2 log k log log k)1/2 , at tk . Another drawback of the metric entropy approach is that it does not distinguish between bounded and continuous processes, since when (6.3) holds we have 1/2 lim (log N (T, dX , u)) du = 0, (6.27) →0
0
so we also get (6.5). To appreciate the relevance of this comment see Theorem 6.3.1 and Example 6.3.7 . Here is an example that shows that, for certain metrics, Dudley’s metric entropy integral is not large. It is used in the proof of Theorem 6.2.2. Example 6.1.4 Consider the metric space ([0, 1]k , d), where d(s, t) ≤ K|s − t|α , s, t ∈ [0, 1]k , K is a constant, and 0 < α ≤ 1. Note that d(s, t) ≤ when |s − t| ≤ ( /K)1/α , so that Be (s, ( /K)1/α ) ⊆ Bd (s, ),
(6.28)
where Be (x, r) is the ball of radius r in the Euclidean metric centered at k x. A closed ball metric√covers a cube √in R of radius u in the Euclidean with sides 2u/ k. Therefore we can cover [0, 1]k with ( k/2u)k balls of radius u in the Euclidean metric. Consequently, taking u = ( /K)1/α and using (6.28), we see that Nd ([0, 1]k , ) ≤
k α/2 K 2α
k/α
:=
K
k/α .
(6.29)
Note that (6.29) implies that the diameter of [0, 1]k in the metric d is bounded by K . We have 1/2 K 1/2 K
1/2 k K k log Nd ([0, 1] , ) d ≤ d log α 0 0 1/2 1 k (α+1)/2 1 = log K du. u α1/2 2α 0 Here is an important example that satisfies the criteria of the previous paragraph. Let Y = {Y (t), t ∈ [0, 1]k } be a Gaussian process with stationary increments and covariance given by (5.249) for ψ as given in
250
Continuity and boundedness of Gaussian processes
(5.250). Assume that ν has support in [−T, T ]k . For s, t ∈ [−T, T ]k , we have 1/2
2 1/2 E(Y (s) − Y (t)) = (1 − cos 2π(λ, s − t)) ν(dλ) [−T,T ]k
≤ C
1/2 |λ| ν(dλ) 2
|s − t|
(6.30)
[−T,T ]k
for some constant C independent of k.
6.2 Necessary conditions in terms of metric entropy The principle result in this section is that (6.3) is also necessary for the continuity and boundedness of Gaussian processes with stationary increments. Before we get to this we give two necessary conditions that are valid for all Gaussian processes. Note that (6.3) and the monotonicity 1/2 of u → (log N (T, dX , u)) imply that 1/2
lim u (log N (T, dX , u))
u→0
=0
(6.31)
but, obviously, is not implied by it. We show in the next theorem that (6.31) is a necessary condition for the continuity of the Gaussian process X. Theorem 6.2.1 Let X = {Xt , t ∈ T } be a mean zero Gaussian process with bounded sample paths. Then E supt∈T X(t) < ∞, and for all u > 0, E sup X(t) ≥ t∈T
1 u(log M (T, dX , u))1/2 . 17
(6.32)
If X is also uniformly continuous on T , then 1/2
lim u (log M (T, dX , u))
u→0
= 0.
(6.33)
Note that by Lemma 6.1.1, (6.33) and (6.31) are equivalent. Proof The fact that E supt∈T |X(t)| < ∞ is proved in the paragraph containing (5.191). The inequality in (6.32) is simply a restatement of (5.237). Now suppose that X is uniformly continuous on T . Since EX(t) = 0, we have E sup X(s) s∈B(t,)
= E
sup (X(s) − X(t)) + X(t) s∈B(t,)
≤ E sup |X(s) − X(t)| s∈B(t,)
(6.34)
6.2 Necessary conditions in terms of metric entropy
251
≤ E sup |X(s) − X(t)|. d(s,t)≤
s,t∈T
It follows from the Dominated Convergence Theorem that this last term goes to zero as goes to zero. Consequently, for any h > 0 we can find an > 0 such that, for all t ∈ T , E sups∈B(t,) X(s) ≤ h/34. It follows from (6.32) that for all t ∈ T and u > 0, h ≥ u(log M (B(t, ), dX , u))1/2 . 2 N (T,)
For the for which (6.35) holds, let {B(sk , )}k=1 Clearly, for all u > 0,
(6.35) be a cover of T .
N (T,)
M (T, u) ≤
M (B(sk , ), u) ≤ N (T, ) sup M (B(t, ), u).
(6.36)
t∈T
k=1
Therefore, for all u > 0, u(log M (T, u))1/2 ≤ u(log N (T, ))1/2 + u sup(log M (B(t, ), u))1/2 . t∈T
(6.37) By (6.35), the last term in (6.37) is less than or equal to h/2 for all u > 0. Thus, for u ≤ h/(2(log N (T, ))1/2 ) (we assume that log N (T, ) > 0), the left-hand side of (6.37) is less than or equal to h. Since this holds for all h, we get (6.33). A second milestone in the theory of Gaussian processes is Fernique’s Theorem, which shows that Dudley’s metric entropy condition is necessary for Gaussian processes with stationary increments. Theorem 6.2.2 (Fernique) Let Z = {Z(t), t ∈ [0, 1]d } be a Gaussian process with stationary increments and covariance given by (5.249) for ψ as given in (5.250). Then D
1/2 log N ([0, 1]d , dZ , u) du ≤ (10 · 45 )E sup Z(t) (6.38) t∈[0,1]d
0
1/2 (1 ∧ |λ| ) ν(dλ) , 2
+ Cd Rd
where Cd is a constant depending only on the dimension d and ν is the measure in (5.250). Proof To begin, let X be a random Fourier series on [0, 1]d , that is, a process of the form of (5.260). In this case, dX (s, t) is a function of |s − t|. Given s, t ∈ [0, 1]d , s − t may or may not be in [0, 1]d . To make
Continuity and boundedness of Gaussian processes
252
sure it is, we define addition modulo [0, 1]d . That is, we consider [0, 1]d as a compact Abelian group. Therefore, since dX (s, t) is translation invariant, the number Ln = N (B(t, D4−n ), dX , D4−n−1 )
(6.39)
is independent of t. Let T = [0, 1]d . Let q = 1/4 and for n ≥ 0 set un = q n+3 D. For each s ∈ T consider the subset Tn (s) = B(s, q n−1 D) and a set Sn of 4un -distinguishable points in B(s, q n D). Note that if s ∈ Sn , then B(s , un ) ⊂ Tn (s). It now follows from Lemma 5.5.7 and the fact that Sn ⊂ T , that E sup X(t) t∈Tn (s)
(6.40)
≥ inf E
sup
s ∈Sn
t∈B(s ,un )
X(t)
+
1 n 1/2 . 12 un (log M (B(s, q D), dX , 4un )))
(In Lemma 5.5.7, the index set of the stochastic process is countable, but it is easy to see that (5.241) holds in this case as well.) Since X is stationary, E(supt∈B(s ,un ) X(t)) is independent of s . Thus (6.40) can be written more simply as E
X(t)
sup
(6.41)
t∈B(0,q n−1 D)
≥E
X(t)
sup
+
t∈B(0,q n+3 D)
1 n 1/2 . 12 un (log M (B(0, q D), dX , 4un )))
Note that by Lemma 6.1.1, M (B(0, q n D), dX , 4un ) ≥ Ln . Define En = E(supt∈B(0,qn−1 D) X(t)). We can write (6.41) as En ≥ En+4 +
1 n+3 D(log Ln )1/2 12 q
∀n ≥ 0.
(6.42)
Iterating this over n = 1, . . . , K and recognizing that T0 = T , we see that 4E sup X(t) ≥ t∈T
4 n=0
En ≥
K 1 n q D(log Ln )1/2 12 · 43 n=0
(6.43)
for all K. We next note that Ln N (T, dX , Dq n ) ≥ N (T, dX , Dq n+1 ). This is easy to see since a covering of T with balls of radius Dq n+1 can be achieved by covering each of the N (T, dX , Dq n ) balls in the covering of T by balls of radius Dq n with balls of radius Dq n+1 . By definition this can be done
6.2 Necessary conditions in terms of metric entropy with Ln balls. Also note that # y $1/2 1/2 1/2 ∀y ≥ x ≥ 1 log ≥ (log y) − (log x) . x
253
(6.44)
Therefore, ∞
q n D(log Ln )1/2
n=0
≥ = ≥
∞ n=0 ∞
(6.45)
$ # q n D (log N (T, dX , Dq n+1 ))1/2 − (log N (T, dX , Dq n ))1/2
q n D − q n+1 D (log N (T, dX , Dq n+1 ))1/2
n=0 ∞ qn D
(log N (T, dX , u))1/2 du =
q n+1 D
n=0
D
(log N (T, dX , u))1/2 du. 0
Thus we get
D
(3 · 4 )E sup X(t) ≥ 5
t∈[0,1]d
(log N ([0, 1]d , dX , u))1/2 du.
(6.46)
0
Let Z = {Z(t), t ∈ [0, 1]d } be a Gaussian process with stationary increments. We know that the covariance of Z can be expressed as in (5.249) for ψ as given in (5.250). Let ZN = {ZN (t), t ∈ [0, 1]d } be the Gaussian process with stationary increments but with ν in (5.250) replaced by νN , where νN (A) = ν(A ∩ [−N, N ]d ), for all measurable sets 2 1/2 (6.30) A ⊂ Rd . Let dZN (s, t) = (E (ZN (s) − ZN (t)) D ) . It follows from in Example 6.1.4 and Theorem 6.1.2 that 0 (log N (T, dZN , u))1/2 du < ∞ and E supt∈T ZN (t) < ∞. It follows from Lemma 5.6.1 and (6.46) that there exists a random Fourier series {W (t), t ∈ [0, 1]d } such that
1/2 (1 ∧ |λ|2 ) ν(dλ)
E sup ZN (t) + Cd t∈[0,1]d
Rd
≥ E sup W (t) ≥ t∈[0,1]d
1 3 · 45
D
(6.47)
1/2 log N ([0, 1]d , dW , u) du.
0
Furthermore, by (5.267), d2ZN (s, t) ≤ d2W (s, t) + d2U (s, t), where U is given in (5.265) with ν replaced by νN .
(6.48)
Continuity and boundedness of Gaussian processes
254
It follows from (6.48) that # √ $ N T, dZN , 2 2δ ≤ N (T, dW , δ)N (T, dU , δ)
∀ δ > 0.
(6.49)
To see why this holds, let wj , j = 1, . . . , N (T, dW , δ) denote the centers of the N (T, dW , δ) balls that cover T and let uk , k = 1, . . . , N (T, dU , δ) denote the centers of the N (T, dU , δ) balls that cover T . Let sj,k denote some point in B(wj , δ) ∩ B(uk , δ). Then, for any s ∈ B(wj , δ) ∩ B(uk , δ), by (6.48), d2ZN (sj,k , s) ≤ d2W (sj,k , s) + d2U (sj,k , s)
(6.50)
and dW (sj,k , s) ≤ dW (sj,k , wj ) + dW (wj , s) ≤ 2δ, and similarly for dU (sj,k , s). Thus we get (6.49). It follows from (6.49) that
D
1/2
(log N (T, dZN , u)) du D √ 1/2 (log N (T, dW , u)) du + ≤2 2
(6.51)
0
0
D
1/2
(log N (T, dU , u))
du .
0
Also, it is easy to see that dU (s, t) ≤ Cd
1/2 (1 ∧ |λ|2 ) ν(dλ) |t − s|1/2
(6.52)
Rd
(the constants Cd are not necessarily the same at each stage). Thus, by Example 6.1.4,
D
1/2
(log N (T, dU , u))
1/2 (1 ∧ |λ| ) ν(dλ) .
du ≤ Cd
2
(6.53)
Rd
0
Using (6.47), (6.51), and (6.53), we see that
D
1/2 log N ([0, 1]d , dZN , u) du
≤ (10 · 45 )E sup ZN (t)
0
(6.54)
t∈[0,1]d
+ Cd
1/2 (1 ∧ |λ|2 ) ν(dλ) .
Rd
It is easy to see that dZN +1 (s, t) ≥ dZN (s, t) for all s, t ∈ [0, 1]d . Consequently, both terms in (6.54) that depend on N are increasing in N (to see this for E supt∈[0,1]d ZN (t), we use Lemma 5.5.3). Taking the limit we get (6.38).
6.3 Conditions in terms of majorizing measures
255
Remark 6.2.3 We single out two special cases of Theorem 6.2.2. If the process Z is a random Fourier series on [0, 1]d , we showed in (6.46) that
D
(3 · 45 )E sup Z(t) ≥
(log N ([0, 1]d , dZ , u))1/2 du.
t∈[0,1]d
(6.55)
0
This same inequality is valid for a Gaussian process on any compact Abelian group. If the process Z is a stationary Gaussian process on Rd , {Z(t) − Z(0), t ∈ Rd } is a Gaussian process with stationary increments and the measure ν in (6.38) is finite. In fact, in this case, EZ 2 (t) = ν(Rd ) for all t ∈ [0, 1]d . Since E sup (Z(t) − Z(0)) = E sup Z(t) ≤ E sup |Z(t)| t∈[0,1]d
t∈[0,1]d
(6.56)
t∈[0,1]d
and E supt∈[0,1]d |Z(t)| ≥ (3/4)(EZ 2 (0))1/2 , we get E sup |Z(t)| ≥ Cd t∈[0,1]d
D
(log N ([0, 1]d , dZ , u))1/2 du
(6.57)
0
for some constant Cd depending only on d. Note that explicit values for the constants Cd can be determined both here and in (6.38).
6.3 Necessary and sufficient conditions in terms of majorizing measures In Example 6.1.3 we show that the metric entropy condition (6.3) is not a necessary condition for a Gaussian process to be continuous. In this section we give sufficient conditions for the continuity and boundedness of Gaussian processes that are also necessary. We show, in Example 6.3.7, that these conditions give the continuity and boundedness of the process considered in Example 6.1.3. Given a probability measure µ on a pseudometric space (T, dX ), we are often interested in µ(BdX (t, u)). Sometimes we suppress the dX when it is clear what metric we are referring to. The following is the best result so far on the continuity and boundedness of Gaussian processes. Theorem 6.3.1 Let X = {X(t), t ∈ T } be a mean zero Gaussian process, where (T, dX ) is a separable metric or pseudometric space with finite diameter D. Suppose there exists a probability measure µ on T
Continuity and boundedness of Gaussian processes
256 such that
D
sup t∈T
0
1 log µ(B(t, u))
1/2 du < ∞.
(6.58)
Then there exists a version X = {X (t), t ∈ T } of X such that
D
E sup X (t) ≤ 1056 sup t∈T
t∈T
If
lim sup
→0 t∈T
0
0
1 log µ(B(t, u))
1 log µ(B(t, u))
1/2 du.
(6.59)
1/2 du = 0,
(6.60)
then X is uniformly continuous on T almost surely and 1/2 δ 1 E sup |X (s) − X (t)| ≤ 1056 sup log du. µ(B(s, u)) s,t∈T s∈T 0 dX (s,t)≤δ
(6.61) It is customary to call µ a majorizing measure because it leads to an upper bound for E supt∈T X(t), that is (6.59) holds. Theorem 6.3.1 is a consequence of Theorem 6.3.3. The next lemma is used in the proof of Theorem 6.3.3, but it is also quite interesting in itself. Lemma 6.3.2 Let {X(t), t ∈ T } be a mean zero Gaussian process on the probability space (Ω, F, P ), with EX 2 (t) ≤ 1, for all t ∈ T . Let µ be a probability measure on T . Then there exists a random variable Z on (Ω, F, P ), with EZ < 5/2, such that, for all measurable functions f on T for which, 0 < |f (t)| µ(dt) < ∞ |X(ω, t)f (t)| µ(dt) (6.62) ≤ 3Z(ω)
|f (t)| log 1 +
|f (t)| |f (t)| µ(dt)
1/2 µ(dt).
Proof For all β < 1/2, supt∈T E exp(βX 2 (t)) < ∞. This implies, by Fubini’s Theorem, that for all β < 1/2, the set {ω : exp(βX 2 (ω, · )) ∈ L1 (T, µ)} has measure one. Let 2 " X (ω, t) Z(ω) = inf α > 0 : exp − 1 µ(dt) ≤ 1 . (6.63) α2
6.3 Conditions in terms of majorizing measures Since, by the Dominated Convergence Theorem 2 X (ω, t) lim − 1 µ(dt) = 0 exp α→∞ α2
a.s.
257
(6.64)
Z(ω) is a well-defined random variable. It follows from Young’s inequality, that for all x, y ≥ 0 and α, β > 0, 2 1/2 x y . (6.65) xy ≤ αβ exp − 1 + αy log 1 + α2 β It is simple to verify this directly. Dividing both sides of (6.65) by αβ we see that to obtain
(6.65) it suffices to verify it for α = β = 1. Next, setting y = exp u2 − 1, we see that it suffices to show that, for all x, u ≥ 0,
δ(x, u) = (x − u) exp u2 − 1 − exp x2 − 1 ≤ 0. (6.66) This is obvious for x ≤ u. It is also easy when x ∈ [u, u + 1] since, in this case,
(6.67) (x − u) exp u2 − 1 ≤ exp u2 − 1 ≤ exp x2 − 1 . ∂ δ(x, u) ≤ 0, which implies that δ(x, u) ≤ ∂x δ(u + 1, u), which we have just shown is less than or equal to zero. To simplify the notation we suppress ω in the rest of this proof. Apply (6.65) with x = |X(t)|, y = |f (t)|, and β = |f (t)| µ(dt) and then integrate with respect to µ(dt) and set α = Z to get |X(t)f (t)| µ(dt) (6.68) 2 X (t) ≤ α |f (t)| µ(dt) exp − 1 µ(dt) α2 1/2 |f (t)| µ(dt). + α |f (t)| log 1 + |f (t)| µ(dt) =Z |f (t)| µ(dt) 1/2 |f (t)| + |f (t)| log 1 + µ(dt) . |f (t)| µ(dt) For x ≥ u + 1, note that
1/2 and note that Let F (u) = u log 1 + u/ |f (t)| µ(dt) F |f (t)| µ(dt) = (log 2)1/2 |f (t)| µ(dt).
(6.69)
258
Continuity and boundedness of Gaussian processes 1/2
Also note that the function u (log (1 + u)) or, equivalently, the function u(log(1+u/C))1/2 is convex and increasing for all C > 0. Therefore, by Jensen’s inequality and (6.69), (6.70) (log 2)1/2 |f (t)| µ(dt) ≤ F (|f (t)|) µ(dt) =
|f (t)| |f (t)| log 1 + |f (t)| µ(dt)
1/2 µ(dt).
Combining (6.70) and (6.68), we get (6.62). We now get the estimate for EZ. By Chebyshev’s inequality followed by H¨ older’s inequality, for all u > 0 and p ≥ 1, 2 X (t) µ(dt) ≥ 2 P (Z > u) ≤ P exp u2 p 2 X (t) −p ≤ 2 E µ(dt) exp u2 pX 2 (t) µ(dt). ≤ 2−p E exp u2 Using Lemma 5.2.1 and the fact that EX 2 (t) ≤ 1, we see that P (Z > u) ≤ 2−p (1 − 2p/u2 )−1/2 . For u > (2 + 1/ log 2)1/2 , this last term is minimized by p = u2 /2 − 1/(2 log 2). Thus we see that for all u > (2 + 2 1/ log 2)1/2 , P (Z > u) ≤ (e log 2)1/2 u2−u /2 . Consequently EZ < 5/2.
The next, rather wonderful, theorem immediately proves Theorem 6.3.1 and also gives a modulus of continuity for all continuous Gaussian processes {X(t), t ∈ T } on the pseudometric space (T, dX ), as long as (T, dX ) has a finite diameter. Theorem 6.3.3 Let {X(t), t ∈ T } be a mean zero Gaussian process with probability space (Ω, F, P ), where (T, dX ) is a separable metric space with finite diameter D. For all probability measures µ on T there exists a version X of X and a positive random variable Z on (Ω, F, P ), with EZ ≤ 526, such that, for all ω ∈ Ω and s, t ∈ T , X (ω, s) − X (ω, t) d(s,t) ## log ≤ Z (ω) 0
$1/2 # 1 1 + log µ(B(s, u)) µ(B(t, u))
(6.71) $1/2 $ du ,
6.3 Conditions in terms of majorizing measures where d(s, t) = dX (s, t), and also X (ω, t) − X (ω, t) µ(dt) T D
≤ Z (ω) 2D + 0
259
(6.72)
1 log µ(B(t, u))
1/2
du .
Proof It suffices to prove this theorem for those elements s, t ∈ T for which the integrals on the right-hand sides of (6.71) and (6.72) are finite. Without loss of generality we assume that T contains at least two points. Let µk (t) = µ(B(t, D2−k )) and
ρk (t, · ) =
IB(t,D2−k ) ( · ) µk (t)
ρk (t, u)X(u) µ(du).
Mk (t) = T
Note that for all t ∈ T , B(t, D) = T , so that M0 (t) = T X(u) µ(du) does not depend on t. Note also that 1 E|X(t) − Mk (t)| ≤ d(u, t) µ(du) ≤ D2−k . (6.73) µk (t) B(t,D2−k ) This shows, by the Borel–Cantelli Lemma, that for all t ∈ T , limk→∞ Mk (t) = X(t) almost surely. We consider the Gaussian process {Y (u, v), u, v ∈ T × T } defined by X(u) − X(v) d(u, v) = 0 d(u, v) Y (u, v) = 0 otherwise. Note that for all n ≥ 1, (6.74) Mn (t) − Mn−1 (t) = ρn (t, u)X(u) µ(du) − ρn−1 (t, v)X(v) µ(dv) T T ρn (t, u)ρn−1 (t, v)(X(u) − X(v)) µ(du)µ(dv). = T ×T
Therefore |Mn (t) − Mn−1 (t)| (6.75) |Y (u, v)|d(u, v)ρn (t, u)ρn−1 (t, v) µ(du)µ(dv) ≤ T ×T
260
Continuity and boundedness of Gaussian processes −n ≤ 3D2 |Y (u, v)|ρn (t, u)ρn−1 (t, v) µ(du)µ(dv), T ×T
where, for the last inequality, we use the simple fact that d(u, v) ≤ 3D2−n , since u ∈ B(t, D2−n ) and v ∈ B(t, D2−(n−1) ). We now apply Lemma 6.3.2 to the last line of (6.75), where the process is {Y (u, v), (u, v) ∈ T × T }and the measure is the product measure µ × µ. Using the facts that T ×T ρn (t, u)ρn−1 (t, v) µ(du)µ(dv) = 1 and ρn (t, u)ρn−1 (t, v) ≤ (µn (t)µn−1 (t))−1 we get |Mn (ω, t) − Mn−1 (ω, t)| 1/2 ≤ Z(ω)9D2−n (log (1 + ρn (t, u)ρn−1 (t, v)))
≤ Z(ω)9D2−n
(6.76)
ρn (t, u)ρn−1 (t, v) µ(du)µ(dv) 1/2 1 log 1 + , µn (t)µn−1 (t)
where EZ < 5/2. Since (1 + ab) ≤ (1 + a)(1 + b) for all a, b ≥ 0, this implies that 1/2 √ 1 −n , |Mn (ω, t) − Mn−1 (ω, t)| ≤ Z(ω)9 2D2 log 1 + µn (t) (6.77) and for all m ≥ n |Mm (ω, t) − Mn−1 (ω, t)| n √ D/2 log 1 + ≤ Z(ω)18 2 0
1 µ(B(t, u))
(6.78)
1/2 du.
This shows us that there exists a set Ω1 ⊂ Ω with P (Ω1 ) = 1 on which Mn (ω, t) converges to a limit, say X (ω, t), for all t ∈ T . For ω = Ω1 , set X (ω, t) = 0. Since, as we stated above, Mn (ω, t) converges to X(ω, t), for each t ∈ T , {X (t), t ∈ T } is a version of {X(t), t ∈ T }. It follows from (6.78) that for all k ≥ 1, |Mk (ω, t) − M0 (ω, t)| √ D/2 log 1 + ≤ Z(ω)18 2 0
1 µ(B(t, u))
So, taking the limit as k → ∞, we get √ D/2 |X (ω, t) − M0 (ω, t)| ≤ Z(ω)18 2 log 1 + 0
(6.79)
1/2 du.
1 µ(B(t, u))
1/2 du. (6.80)
6.3 Conditions in terms of majorizing measures
261
To proceed, for any pair s, t ∈ T , choose k such that D/2k+1 < d(s, t) ≤ D/2k and, for ω ∈ Ω1 , consider |X (s) − X (t)| ≤ |Mk (s) − Mk (t)| + |X (s) − Mk (s)| + |X (t) − Mk (t)| (6.81) in which we suppress the ω. We use (6.78) again, taking the limit as m → ∞, to see that √ |X (s) − Mk (s)| ≤ Z18 2
D/2k+1
log 1 +
0
1 µ(B(s, u))
1/2 du,
(6.82) and similarly with s replaced by t. Similar to (6.75) and (6.76), we have −k |Y (u, v)|ρk (s, u)ρk (t, v) µ(du)µ(dv) |Mk (s) − Mk (t)| ≤ 3D2 T ×T
−k
≤ Z9D2
1 log 1 + µk (s)µk (t)
1/2 .
Consequently, |Mk (s) − Mk (t)| D/2k+1 ≤ Z18 log 1 + 0
(6.83)
1 µ(B(s, u)) 1/2 1 + log 1 + du. µ(B(t, u))
It now follows from (6.81) and (6.82) applied to both s and t that |X (s) − X (t)| D/2k+1 log 1 + ≤ Z54
(6.84)
1 µ(B(s, u)) 0 1/2 1 + log 1 + du µ(B(t, u)) k+1 1/2 √ D/2 1 du ≤ Z54 2 log 1 + µ(B(s, u)) ∧ µ(B(t, u)) 0 k+2 1/2 √ D/2 1 ≤ Z108 2 log 1 + du, µ(B(s, u)) ∧ µ(B(t, u)) 0
where, for the second inequality, we replace each measure by the infimum of both of them and in the last line we use the fact that, for positive
262
Continuity and boundedness of Gaussian processes
nonincreasing functions f on [0, a], a f (u) du ≤ 2 0
a/2
f (u) du.
0
Recall that d(s, t) > D/2k+1 . This implies that the balls B(s, u) and B(t, u), in the last line of (6.83), are disjoint and, consequently, that µ(B(s, u)) ∧ µ(B(t, u)) ≤ 1/2. Note that log(1 + x) ≤ (log 3)(log 2)−1 log x
x≥2
(6.85)
since they are equal when x = 2 and log x grows more rapidly than log(1 + x) for x ≥ 2. Therefore 1 log 1 + (6.86) µ(B(s, u)) ∧ µ(B(t, u)) log 3 1 ≤ log log 2 µ(B(s, u)) ∧ µ(B(t, u)) 1 1 log 3 log + log . ≤ log 2 µ(B(s, u)) µ(B(t, u)) √ We now use this last inequality in (6.84) to get (6.71) with Z = 108 2 1/2 1/2 √ log 3 log 3 Z, so that EZ ≤ 295 2 . log 2 log 2 Using (6.85) along with (6.80) when µ(B(t, u)) ≤ 1/2 and making allowance for the fact that µ(B(t, u)) may be larger than 1/2, we get (6.72). Proof of Theorem 6.3.1 It follows from (6.71) that 1/2 D 1 log du. E sup X (ω, s) ≤ EX (ω, t) + 1056 sup µ(B(s, u)) s∈T s∈T 0 (6.87) The inequality in (6.59) follows because EX (ω, t) = 0. To show that X is uniformly continuous, we use (6.71) to see that 1/2 δ 1 du. sup |X (ω, s) − X (ω, t)| ≤ 2Z (ω) sup log µ(B(s, u)) s,t∈T s∈T 0 d(s,t)≤δ
(6.88) It now follows from (6.60) that the limit of the left-hand side of (6.88) goes to zero as δ goes to zero. Clearly (6.61) follows immediately from (6.88). The shining stars in theory of Gaussian processes are the next two
6.3 Conditions in terms of majorizing measures
263
theorems by M. Talagrand. The first gives a necessary condition for the boundedness of Gaussian processes. Theorem 6.3.4 (Talagrand) Using the notation of Theorem 6.3.1, let {X(t), t ∈ T } be a Gaussian process with bounded sample paths. Then there exists a probability measure µ on T such that 1/2 ∞ 1 sup log du ≤ CE sup X(t), (6.89) µ(B(t, u)) t∈T 0 t∈T where C=835. Proof It suffices to take T to be countable and to suppose that the diameter of T is greater than 0. Under these stipulations and the fact that the theorem remains unchanged if we replace dX by DdX , we can suppose that the diameter of T is equal to 1. The crux of this proof is the construction of a nested sequence Pn of finite partitions of T , and for each n, functions τn : Pn → T and functions mn : Pn → N with the following properties, in which ρ = 1/4: (1) For all n ≥ 0 and all elements C ∈ Pn , C ⊂ B(τn (C), ρn ). (2) For all n ≥ 1, every set C ∈ Pn−1 , and any two sets D = D in Pn that are also in C, mn (D) = mn (D ), τn (D) and τn (D ) are in C, and d(τn (D), τn (D )) > ρn . With regard to (2), note that because Pn is a partition of T , all elements of Pn are disjoint, and, to elaborate further, we show below that the function mn counts the elements of Pn that cover C. The partitions Pn are defined recursively. P0 is the single set T itself, m0 (T ) = 1, and τ0 (T ) = a, where a can be chosen to be any point in T . Suppose now that Pj , j = 0, . . . , n has been chosen. Let C be an element of Pn . C is decomposed into the union of disjoint sets that are part of the partition Pn+1 in the following way: Let A0 = φ and W0 = C. Choose a point t1 ∈ C such that E(B(t1 , ρn+2 ) ∩ W0 ) ≥ sup E(B(t, ρn+2 ) ∩ W0 ) − ρn+2 E(T ),
(6.90)
t∈W0
where we use the notation E(U ) = E (supt∈U X(t)) for any U ⊂ T . We now consider the larger ball centered at t1 , B(t1 , ρn+1 ). If this ball covers C, we are done. The element in Pn+1 that covers C is C itself, and we set τn+1 (C) = t1 and mn+1 (C) = 1. If B(t1 , ρn+1 ) does not cover C, we consider C \ B(t1 , ρn+1 ) and repeat the procedure. To be more precise, for r ≥ 1, let Ar = ∪i≤r B(ti , ρn+1 ). If C is not
264
Continuity and boundedness of Gaussian processes
included in Ar , we choose the point tr+1 in Wr = C \ Ar such that E(B(tr+1 , ρn+2 ) ∩ Wr ) ≥ sup E(B(t, ρn+2 ) ∩ Wr ) − ρn+2 E(T ). (6.91) t∈Wr
Note that d(ti , tj ) > ρn+1 . Let M (C) be the smallest integer k for which C ⊂ Ak . We stop the construction at the k-th stage. We know that M (C) is finite because, for all with i = j,
d(ti , tj ) > ρn+1 , (6.92)
and thus M (C) ≤ exp (17E(T )2 )/ρ2(n+1) by Lemma 5.5.6. For each r ∈ [1, M (C)] let Dr := Dr (C) = B(tr , ρn+1 ) ∩ Wr−1 . We construct Pn+1 by taking {Dr (C), r ∈ [1, M (C)]} as the elements in Pn+1 that cover C. We define τn+1 (Dr ) = tr and mn+1 (Dr ) = r, r = 1, . . . , M (C). We do this for each element of Pn . This completes the description of the recursive construction. We see that (1) and (2) hold. Given this construction, the rest of the proof is pretty straightforward. Let s ∈ T and Cn (s) be the element of Pn that contains s. We now show that ∞ ρn (log mn (Cn (s)))1/2 ≤ 104E(T ). (6.93) i, j = 1, . . . , k
n=0
Set C = Cn (s). The element Cn+1 (s) of Pn+1 is a subset of C; indeed, it is one of the sets {Dr (C), r ∈ [1, M (C)]} that are the elements Pn+1 that cover C. Let us suppose that it is Dm := Dm (C), that is, Dm (C) = Cn+1 (s). Set m = mn+1 (Cn+1 (s))
tm = τ (Cn+1 (s)).
and
(6.94)
Consider the family of sets {Dr , r ∈ [1, m]}. Since they are contained in C and the points {τ (Dr ), r ∈ [1, m]} are ρn+1 –distinguishable, it follows from Lemma 5.5.7 that E(C) ≥
1 n+2 (log m)1/2 12 ρ
+ inf i≤m E(B(ti , ρn+2 ) ∩ Wi−1 )
(6.95)
(W · is as defined above (6.91).) Notice also that by (6.91), for each {i ∈ [1, m]}, ρn+2 E(T ) + E(B(ti , ρn+2 ) ∩ Wi−1 ) ≥ E(B(tm , ρ
) ∩ Wi−1 )
≥ E(B(tm , ρ
) ∩ Wm−1 )
n+2 n+2
(6.96)
6.3 Conditions in terms of majorizing measures
265
since tm ∈ Wi−1 . Combining (6.95) and (6.96), we get E(Cn (s)) = E(C) ≥
1 n+2 (log mn+1 (Cn+1 (s)))1/2 12 ρ + E(B(tm , ρn+2 ) ∩ Wm−1 )
(6.97) −ρ
n+2
E(T ).
Furthermore, since Cn+2 (s) ⊂ Cn+1 (s), τ (Cn+2 (s)) ∈ Cn+1 (s) = Dm . Since tm is maximal in the sense of (6.91), we see that E(B(tm , ρn+2 ) ∩ Wm−1 ) ≥ E(B(τ (Cn+2 (s)), ρn+2 ) ∩ Wm−1 ) − ρn+2 E(T ). (6.98) n+2 ) ∩ Wm−1 contains Cn+2 (s). Using It is clear that B(τ (Cn+2 (s)), ρ this in (6.98) along with (6.97), we see that E(Cn (s)) ≥
+ E(Cn+2 (s)) − 2ρn+2 E(T ). (6.99) For a fixed integer N , keeping in mind that ρ = 1/4, we get 1 n+2 (log mn+1 (Cn+1 (s)))1/2 12 ρ
N n=0
ρn+2 (log mn+1 (Cn+1 (s)))1/2 ≤ 12 E(C0 (s)) + E(C1 (s)) + 2 E(T )
(6.100) ∞
ρn+2
n=0
≤ 26 E(T ). ∞ Thus n=1 ρn (log mn (Cn (s)))1/2 ≤ (26/ρ)E(T ), which gives us (6.93), since C0 (s) = T and m0 (T ) = 1. We now define the probability measure µ in the left-hand side of (6.89). For each n ∈ N , let Q(n) = {τn (C), C ∈ Pn }. Recall that Q(0) = {a}. We set µ0 (a) = 1/2. We continue to define µn , n ≥ 1, recursively. Each µn is supported on Q(n) and for each t ∈ Q(n) −1 1 µn−1 (τn−1 (Cn−1 (t))) , (6.101) µn (t) = 2m2n (Cn (t)) m2n (Cn (u)) where the sum is taken over {u ∈ Q(n) : τn−1 (Cn−1 (u)) = τn−1 (Cn−1 (t))}. ∞ We set µ = n=0 t∈Q(n) µn (t)δt . For each s ∈ T , let φn (s) = τn (Cn (s)) and mn (s) = mn (Cn (s)). It is obvious that for each v ∈ Q(n − 1), (6.102) {µn (t) : t ∈ Q(n), φn−1 (t) = v} = µn−1 (v)/2. Therefore An = {µn (t) : t ∈ Q(n)} = 12 {µn−1 (v) : v ∈ Q(n − 1)}. That is, An = An−1 /2. Since A0 = µ(a) = 1/2, we see that An = 2−(n+1) . Thus µ is a probability measure.
Continuity and boundedness of Gaussian processes
266
For each u ∈ T and n ≥ 0, we have d(u, φn (u)) < ρn . Thus φn (u) ∈ B(u, ρn ) and consequently µ(B(u, ρn )) ≥ µn (φn (u)) ≥ 3π −2 µn−1 (φn−1 (u))/m2n (u).
(6.103)
For the second inequality we use (6.101) and the facts that φn−1 (φn (u)) = ∞ φn−1 (u) and m=1 1/m2 = π 2 /6. Iterating and noting that, µ0 (φ0 (u)) = 1/2 for all u, we see that, for all u ∈ T and all n ≥ 0, n 1 3 . (6.104) µ(B(u, ρn )) ≥ π2 2 m2n (u) m2n−1 (u) · · · m21 (u) Since the diameter of T = 1, we see that 1/2 ∞ 1 I(u, µ) := log dr µ(B(u, r)) 0 1/2 ∞ ρj−1 1 ≤ log dr µ(B(u, r)) j j=1 ρ 1/2 ∞ 1 j−1 log ρ . ≤ µ(B(u, ρj )) j=1
(6.105)
Using (6.104), we see that log
j 1 π2 ≤ j log + log 2 + 2 log mi (u). µ(B(u, ρj )) 3 i=1
(6.106)
Consequently, I(u, µ) ≤ 4
∞ j=1
≤
ρj
## j log
j $ π 2 $1/2 1/2 1/2 + (log 2) + (2 log mi (u)) 3 i=1
π 2 $1/2 4 16 # 1/2 log + (log 2) 9 3 3 √ ∞ 16 2 i 1/2 ρ (log mi (u)) . + 3 i=1
(6.107)
By (6.93), the last term in the second line of (6.107) is bounded by 785 E(T ). The sum of the first two terms in the second line of (6.107) is bounded by 3.06. Now, since the diameter of T is not less than b for any b < 1, there are two points in T that are 3/4-distinguishable. Therefore, by Lemma 5.5.6, 50 E(T ) > 3.06. Thus we obtain (6.89). We now obtain a modification of Theorem 6.3.4 that gives a necessary condition for the continuity of Gaussian processes.
6.3 Conditions in terms of majorizing measures
267
Theorem 6.3.5 (Talagrand) Using the notation of Theorem 6.3.1, let {X(t), t ∈ T } be a Gaussian process with bounded uniformly continuous sample paths. Then there exists a probability measure µ on T such that 1/2 ∞ 1 sup log du ≤ CE sup X(t) (6.108) µ(B(t, u)) t∈T 0 t∈T and
δ
lim sup
δ→0 t∈T
0
1 log µ(B(t, u))
1/2 du = 0.
(6.109)
Proof Let D be the diameter of T . By (6.33) and the sentence immediately following it, for all n ≥ 0 we can find a family of balls {B(s, D2−n ), s ∈ Sn } such that ∪s∈Sn B(s, D2−n ) = T and lim 2−n (log #Sn )
1/2
n→∞
= 0.
(6.110)
For each n and s ∈ Sn , set Tn,s = B(s, D2−n ). Consider {X(t), t ∈ Tn,s }. Obviously, this process is bounded and so, by Theorem 6.3.4, there exists a probability measure µn,s on Tn,s such that
D2−n
sup t∈Tn,s
0
1 log µn,s (B(t, u))
1/2 du ≤ CE sup X(t),
(6.111)
t∈Tn,s
where C = 835. We now construct a probability measure µ on T by setting ∞
µ=
2−(n+1)
n=0
µn,s . #Sn
(6.112)
s∈Sn
For each t ∈ T and k ≥ 0 one of the sets Tk,s , where s ∈ Sk , contains t. Therefore, by (6.111) with n replaced by k and the obvious inequality µ(B(t, u)) > 2−(k+1)
µk,s (B(t, u)) , #Sk
(6.113)
which holds for all s ∈ Sk ,
D2−k
log
0
1 µ(B(t, u))
1/2 du
≤ D2−k (log(#Sk ))
1/2
(6.114)
1/2 + D2−k log 2k+1 + CE sup X(t). t∈Tk,s
Consider (6.114) for k = 0. Since #S0 = 1 and, by (5.232), D <
268 Continuity and boundedness of Gaussian processes √ 2πE supt∈T X(t), we get 1/2 D
√ 1 log du ≤ 2 π + C E sup X(t), sup µ(B(t, u)) t∈T 0 t∈T
(6.115)
which gives us (6.108). Returning again to (6.114) and using (6.34), we see that 1/2 D2−k 1 1/2 log du ≤ D2−k (log(#Sk )) (6.116) sup µ(B(t, u)) t∈T 0
1/2 + D2−k log 2k+1 + CE sup |X(s) − X(t)|. d(s,t)≤D2−k
s,t∈T
Using (6.110) and the hypothesis that the process is uniformly continuous, we see that all the terms on the right-hand side of (6.116) go to zero as k goes to infinity. Thus we get (6.109). Considering the significance of having necessary and sufficient conditions for the boundedness and continuity of Gaussian processes, we can afford to be repetitious and restate them in the following theorem. Theorem 6.3.6 Let X = {X(t), t ∈ T } be a mean zero Gaussian process, where (T, dX ) is a separable metric or pseudometric space with finite diameter D. There exists a version of X with bounded sample paths if and only if there exists a probability measure µ on T such that 1/2 D 1 log sup du < ∞. (6.117) µ(B(t, u)) t∈T 0 Furthermore, there exists a version of X with bounded uniformly continuous sample paths if and only if there exists a probability measure µ on T such that (6.117) holds and in addition 1/2 1 du = 0. (6.118) lim sup log →0 t∈T 0 µ(B(t, u)) The conditions for continuity and boundedness in Theorem 6.3.6 are rather abstract, since there is no guidance as to what measure to use. Nevertheless, we see in the next example that in some cases it is possible to find these measures. Example 6.3.7 We return to the process studied in Example 6.1.3 and show that we can prove that it is continuous using Theorem 6.3.1. We also modify the example slightly to show how the majorizing measure
6.3 Conditions in terms of majorizing measures
269
condition can show that a process is bounded when it is not also continuous. As an extension of Example 6.1.3 let X = {X(t), t ∈ T }, where T = {{tk }∞ k=4 ∪ 0} with t4 = 1, tk ↓ 0, and X(tk ) := ξk /bk , ∞ where {ξk }k=4 is a standard normal sequence, and X(0) = 0. Thus 1/2
dX (tj , tk ) = 1/b2j + 1/b2k . We define a probability measure µ on T as follows: µ(0) = 1/2 and µ(tk ) = (Ck 2 )−1 with the appropriate constant C. Note that µ(B(tk , u)) = (Ck 2 )−1 when u < 1/bk , since tk is the only point in B(tk , u) for these values of u. When u ≥ 1/bk , µ(B(tk , u)) > 1/2 since 0 is now also in B(tk , u). Consequently, 1/2 1 1 (2 log k + log C)1/2 1/2 log du ≤ + (log 2) , µ(B(tk , u)) bk 0 (6.119) and for k ≥ j 1/2 1/bj 1 (2 log k + log C)1/2 log ≤ du (6.120) bk µ(B(tk , u)) 0 1/2
≤
(2 log k + log C)1/2 (log 2) + bk bj
.
Let bk = (2 log k log log k)1/2 as in Example 6.1.3. It follows from the second inequality in (6.120) and Theorem 6.3.1 that X is continuous at zero. In Example 6.1.3 we showed that this obvious fact does not follow from Theorem 6.1.2. Now let bk = (2 log k)1/2 . One knows from the Borel–Cantelli Lemmas that lim supk→∞ ξk /(2 log k)1/2 = 1 almost surely. The inequalities in (6.59) and (6.119) show that in this case X is bounded almost surely. We also see from the first inequality in (6.120) that (6.60) does not hold, which, of course, must be the case. It is interesting to note that we can actually use Theorem 6.3.4 to show that lim supk→∞ ξk /(2 log k)1/2 > 0. Let Y = {Y (t), t ∈ T }, where 1/2 . T = {{tk }∞ k=4 ∪ 0} with t4 = 1, tk ↓ 0, and X(tk ) := ξk /(2 log k) Every probability measure on T is of the form µ(tk ) = ak , µ(0) = a0 , and the measure for which (6.89) holds must assign a positive measure to all points of T . By (6.120), for k ≥ j, (log 1/ak )1/2 ≤ (2 log k)1/2
1/(2 log j)1/2 0
1 log µ(B(tk , u))
1/2 du.
(6.121)
∞ It is easy to see, since k=4 ak < 1, that the limit superior of the lefthand side of (6.121) does not go to zero. Consequently, by Theorem
270
Continuity and boundedness of Gaussian processes
6.3.4,
lim E
j→∞
ξk sup 1/2 k≥j (2 log k)
> 0.
(6.122)
However, since supk ξk /(2 log k)1/2 < ∞ almost surely, we can use Corollary 5.4.7 and the Dominated Convergence Theorem to see that lim supk→∞ ξk /(2 log k)1/2 > 0 with probability greater than zero. Since this is a tail event, it must occur with probability one.
6.4 Simple criteria for continuity We obtain both necessary and sufficient conditions for the continuity of Gaussian processes with stationary increments and nicely behaved covariance functions or spectral distributions. These results apply, for example, to Gaussian processes associated with symmetric stable processes and Brownian motion and, in fact, to symmetric L´evy processes in general. We achieve this by finding an integral expression that is equivalent to the metric entropy integral in (6.3) but which is easier to work with. Let σ(x, y) be a translation invariant metric or pseudometric on Rn . Since σ(x, y) = σ(0, x − y) = σ(0, y − x), we write σ(x, y) = σ(x − y). Let K ⊂ Rn be a compact symmetric neighborhood of zero. Define K ⊕ K = {x + y|x ∈ K, y ∈ K} and in a similar fashion define ⊕n K = {x1 + · · · + xn |x1 ∈ K, . . . , xn ∈ K}. Let µ denote Lebesgue measure on Rn and let µn = µ(⊕n K). Define mσ ( ) = µ({x ∈ K ⊕ K|σ(x) < })
(6.123)
and note that mσ is left continuous and nondecreasing. Set σ(u) = sup{y|mσ (y) < u}.
(6.124)
We see that σ is also left continuous and nondecreasing. Such a function is often referred to as the left continuous inverse of mσ . Also note that, by the left continuity of mσ , mσ (σ(s)) = s for all s for which σ(s) > σ(s − ) for all 0 < < s. Since 0 ≤ mσ ( ) ≤ µ2 and mσ (∞) = µ2 , we can restrict the domain of σ to [0, µ2 ]. Note that σ( · ) considered as a random variable on [0, µ2 ] has the same probability distribution with respect to normalized Lebesgue measure on [0, µ2 ] that σ( · ) has with respect to normalized Lebesgue measure on K ⊕ K (this statement is equivalent to the elementary observation that, given the probability distribution function of
6.4 Simple criteria for continuity
271
a random variable, one can find an increasing function on the unit interval with the same probability distribution function). In keeping with classical terminology, we call σ the nondecreasing rearrangement of σ (with respect to K ⊕ K). In terms of µ, σ, and K, we define µ2 σ(s) ds. (6.125) I(K, σ) = # µ4 $1/2 0 s log s We also recall the metric entropy integral in (6.3) applied to (K, σ), which we write as D 1/2 J(K, σ) = (log N (K, σ, u)) du, (6.126) 0
where D = supx,y∈K σ(x − y) := σ (. Lemma 6.4.1 1/2 1/2 µ4 1 µ2 σ log + I(K, σ) ≤ J(K, σ) ≤ 4ˆ + 2I(K, σ). −ˆ σ log µ1 2 µ2 (6.127) To prove this lemma we first explore the relationship between the metric entropy of K and the measure mσ on K ⊕ K. Lemma 6.4.2 Using the notation given above, we have µ4 N (K ⊕ K, σ, ) ≤ mσ ( /2) N (K, σ, ) ≥
µ1 ∨1 mσ ( )
(6.128) (6.129)
and N (K, σ, 2 ) ≤ N (K ⊕ K, σ, ). Proof
(6.130)
For all t ∈ K ⊕ K, we have µ({Bσ (t, ) ∩ ⊕4 K}) ≥ µ({Bσ (0, ) ∩ K ⊕ K}).
(6.131)
This inequality is elementary since for t ∈ K⊕K, t+{Bσ (0, )∩K⊕K} ⊂ {Bσ (t, ) ∩ ⊕4 K}. We also note that D(K ⊕ K, σ, /2) ≥ N (K ⊕ K, σ, ),
(6.132)
where D(K ⊕ K, σ, /2) is the maximum number of closed balls of radius /2 with centers in K ⊕ K that are disjoint in Rn . This inequality has
272
Continuity and boundedness of Gaussian processes
the same proof as the first inequality in Lemma 6.1.1. For later use let tj , 1 ≤ j ≤ D(K ⊕ K, σ, /2), denote the centers of these balls. Inequality (6.132) has the same proof as the first inequality in Lemma 6.1.1. Using (6.131) and (6.132), we see that D(K⊕K,σ,/2) {Bσ (tj , /2) ∩ ⊕4 K} (6.133) µ4 ≥ µ ∪j=1 ≥ D(K ⊕ K, σ, /2)µ ({Bσ (0, /2) ∩ K ⊕ K}) ≥ N (K ⊕ K, σ, )mσ ( /2), which is (6.128). To prove (6.129) we first note that, analogous to (6.131), for all t ∈ K, µ({Bσ (0, ) ∩ K ⊕ K}) ≥ µ({Bσ (t, ) ∩ K}).
(6.134)
N (K,σ,)
Bσ (tj , ) for some set of points {tj } in K (the Also, since K ⊂ ∪j=1 sets {tj } are not necessarily the same in each paragraph), # $ N (K,σ,) Bσ (tj , ) ∩ K} (6.135) µ1 = µ {∪j=1 ≤ N (K, σ, )µ({Bσ (0, ) ∩ K ⊕ K}), where we use (6.134) in the last step. This gives us (6.129). Since 0 ∈ K, K ⊂ K ⊕ K. Thus it may appear that (6.130) is trivial. However, the centers of the balls forming a cover of K ⊕ K may not be in K. Let {tj }, j = 1, . . . , N (K ⊕ K, σ, ) be such that N (K⊕K,σ,) K ⊕ K ⊂ ∪j=1 Bσ (tj , ). For each nonempty set Bσ (tj , ) ∩ K choose a point sj ∈ Bσ (tj , ) ∩ K. The balls {Bσ (sj , 2 )} cover K since, if not, there exists a u ∈ K such that σ(sj − u) > 2 . This implies that σ(tj − u) > , which contradicts the fact that the balls {Bσ (tj , )} cover K. Thus we get (6.130). Proof of Lemma 6.4.1 It follows from (6.129) and the fact that D = σ ˆ that 1/2 σˆ µ4 µ4 − log J(K, σ) ≥ log du (6.136) mσ (u) µ1 0 1/2 1/2 σˆ µ4 µ4 log du − σ ˆ log . ≥ mσ (u) µ1 0 By the change of variables u = σ(s) and the fact that σ(µ2 ) = σ ˆ, 1/2 σˆ µ2 # µ4 µ4 $1/2 log log du = dσ(s), (6.137) mσ (u) s 0 0
6.4 Simple criteria for continuity
273
where we also use the fact that mσ (σ(s)) = s for all s ∈ [0, µ2 ] for which σ(s) > σ(s − ) for all 0 < < s, as noted above. Since the right-hand integral in (6.137) only increases at these points, we get the equality as stated. Assume that J(K, σ) < ∞. Then, by (6.136) and (6.137), the integral on the right-hand side of (6.137) is finite. We now show that this implies that # µ4 $1/2 lim σ(s) log = 0. (6.138) s→0 s If σ(s) ≡ 0 is [0, δ], (6.138) is trivial. Assume that this is not the case and the limit in (6.138) is not zero. Then there exists an > 0 and a sequence {tk }∞ k=1 deceasing to zero such that
µ4 lim σ(tk ) log k→0 tk
1/2 ≥ .
(6.139)
∞ Consequently, for any subsequence {sk }∞ k=1 of {tk }k=1 with s1 < µ2 ,
µ2
0
#
µ4 $1/2 log dσ(s) ≥ s ≥
∞
sk
# log
k=1 sk+1 ∞
log
k=1
µ4 sk
µ4 $1/2 dσ(s) s
(6.140)
1/2 (σ(sk ) − σ(sk+1 )) .
We choose {sk }∞ k=1 so that σ(sk ) − σ(sk+1 ) ≥ σ(sk )/2. Here we use the fact that σ(sk ) > 0 and limk→∞ σ(sk ) = 0. Using this subsequence, it follows from (6.139) and (6.140) that the integral on the left-hand side of (6.140) is infinite. This contradiction establishes (6.138). Therefore, by integration by parts, 1/2 µ2 # µ4 $1/2 µ4 1 (6.141) log dσ(s) = σ ˆ log + I(K, σ). s µ 2 2 0 Using this in (6.136) we get the left-hand side of (6.127). Also, by (6.130) and (6.128), J(K, σ) ≤ 2J(K ⊕ K, σ) ≤ 4
σ ˆ
log
0
µ4 mσ (u)
1/2 du.
(6.142)
Using (6.137) and (6.141) in (6.142), we get the right-hand side of (6.127). Remark 6.4.3 If instead of Rn we took [−1/2, 1/2]n considered as an
274
Continuity and boundedness of Gaussian processes
Abelian group with addition modulo one, and took K = [−1/2, 1/2]n , then ⊕n K = K and µn = µ1 . In this setting (6.127) becomes 1 2 I(K, σ)
≤ J(K, σ) ≤ 2I(K, σ).
(6.143)
The next corollary follows immediately from Lemma 6.4.1. Corollary 6.4.4 Let X = {X(t), t ∈ [−1/2, 1/2]n } be a Gaussian process with stationary increments and let
1/2 σ(x − y) = E(X(x) − X(y))2 .
(6.144)
X has a version with continuous sample paths on ([−1/2, 1/2]n , σ) if and only if there exists a δ > 0 such that
δ
σ(s) 1/2 ds < ∞, 1 s log s
0
(6.145)
where σ(s) is the nondecreasing rearrangement of σ, as given in (6.124). The condition in (6.145) is appealing because it is a simple functional of the metric that defines X. However, it is illusionary since, in general, finding the nondecreasing rearrangement of an irregular function is neither easier nor harder than figuring out the metric entropy with respect to that function. Nevertheless, (6.145) implies conditions for continuity that can be very easy to verify. Example 6.4.5 Let X and σ be as in Corollary 6.4.4, and take n = 1. Let σ ∗ and σ∗ be a monotone majorant and minorant for σ on [0, δ/2] for some δ > 0, that is, σ∗ (u) ≤ σ(u) ≤ σ ∗ (u)
u ∈ [0, δ/2].
(6.146)
Since σ(u) = σ(−u), we have 2σ∗ (u/2) ≤ σ(u) ≤ 2σ ∗ (u/2)
u ∈ [0, δ].
(6.147)
Therefore, it is obvious, by (6.145), that
δ 0
σ ∗ (s) 1/2 ds < ∞ 1 s log s
(6.148)
6.4 Simple criteria for continuity
275
is a sufficient condition for the continuity of X on ([−1/2, 1/2], σ) and δ σ∗ (s) (6.149) 1/2 ds < ∞ 0 1 s log s is a necessary condition for the continuity of X on ([−1/2, 1/2], σ). Actually, (6.148) is a continuity condition for any Gaussian process on [−1/2, 1/2]n in the following sense. Lemma 6.4.6 Let X = {X(t), t ∈ [−1/2, 1/2]n } be a Gaussian process and let
1/2 σ + (u) := E(X(x) − X(y))2 sup . (6.150) |x−y|≤u
x,y∈[−1/2,1/2]n
If there exists a δ > 0 such that δ σ + (s) 1/2 ds < ∞, 0 1 s log s
(6.151)
X has a version with continuous sample paths on ([−1/2, 1/2]n , dX ). Proof Let dX be as given in (6.1) and note that σ + (x, y) := σ + (|x−y|) is itself a metric or pseudometric on K := [−1/2, 1/2]n . Therefore, by (6.128), N (K ⊕ K, dX , ) ≤ N (K ⊕ K, σ + , ) µ4 ≤ . mσ+ ( /2)
(6.152)
Using this we get the inequalities in (6.142), and, as in the line following (6.142), this implies √ (6.153) J(K, dX ) ≤ 4σ + ( n )(n log 2)1/2 + 2I(K, σ + ), which, by (6.151) and Theorem 6.1.2, implies that X has a version with continuous sample paths on ([−1/2, 1/2]n , dX ). Example 6.4.7 Consider (5.256) with h(λ) = λ−β , λ > 0, and 1 < β < 3. By a change of variables, it is easy to see that ψ(u) in (5.256) is equal to ∞ sin2 πs 4 ds |u|β−1 . (6.154) sβ 0
276
Continuity and boundedness of Gaussian processes
This shows that for all 0 < α < 1, we can obtain a Gaussian process X = {X(t), t ∈ R1 } with stationary increments satisfying
1/2 E(X(x) − X(y))2 = |x − y|α . (6.155) These processes are often referred to as fractional Brownian motion and for α = 1/2 give Brownian motion itself. It is easy to see that (6.148) holds for these processes, that is, they have continuous sample paths. (Strictly speaking, it shows that the paths are continuous on (R1 , |x − y|α ), but obviously this implies that they are also continuous with respect to the Euclidean metric on R1 . In what follows we may omit this comment when it is obvious.) The same argument applied to (5.247) with µ(dλ) = (λ−β ∧ 1) dλ, λ > 0, and 1 < β < 3 shows that for all 0 < α < 1 we can obtain = {X(t), stationary Gaussian processes X t ∈ R1 } for which # $1/2 2 − X(y)) ≤ C2 |x − y|α (6.156) C1 |x − y|α ≤ E(X(x) for x, y ∈ [−δ, δ] for some δ > 0 and constants C1 , C2 > 0, which may depend on α and, furthermore, that # $1/2 2 E(X(x) − X(y)) ∼ C|x − y|α (6.157) as |x − y| → 0 for some constant C (which depends on α). To complete this discussion we note that there are stationary Gaussian processes for which (6.156) and (6.157) also hold for α = 1. We now give continuity conditions in terms of the spectral distribution of Gaussian processes with stationary increments. We first consider stationary Gaussian processes. Lemma 6.4.8 Let X = {X(t), t ∈ [−1/2, 1/2]} be a stationary Gaussian process with 1/2
(6.158) σ(x − y) = E(X(x) − X(y))2 1/2 ∞ (sin2 πλ(x − y)) dF (λ) . = 0
We have
I([−1/2, 1/2], σ) ≤ 21 F
1/2
∞
(∞) + 1/(µ4 π)
(F (∞) − F (x))1/2 dx . x(log(2πx))1/2 (6.159)
6.4 Simple criteria for continuity
277
Proof Let σ ∗ (s) be the smallest monotone majorant of σ(s) for s ∈ [0, 1]. Then, as in (6.147), σ(s) ≤ 2σ ∗ (s/2). Therefore, by (6.125), µ2 /µ4 σ ∗ (µ4 s/2) (6.160) I([−1/2, 1/2], σ) ≤ 2 1/2 ds 0 1 s log s 1/2 ∗ σ (2s) = 2 1/2 ds 0 1 s log s since µ2 = 2 and µ4 = 4. Note that 1/2 ∞ σ ∗ (2s) σ ∗ (2−n+1 ) 1 . 1/2 ds ≤ (log 2)1/2 n1/2 0 1 n=1 s log s
(6.161)
By (6.158),
σ(2s)
# v $1/2 2π 0 ∞ 1/2 := (sin2 vs) dF(v) . ∞
=
(sin2 vs) dF
(6.162)
0
Therefore, σ ∗ (2−n+1 )
=
∞
sup s≤2−n+1
≤ 4·2
−n
0 2n
1/2 (sin2 vs) dF(v)
(6.163)
1/2 # $1/2 v dF (v) + F(∞) − F(2n ) 2
0 n−1 # $1/2 ≤ 4 · 2−n 4F(∞) + 22(j+1) t2j
#
j=1
$1/2 + F(∞) − F(2n ) , 2 where we set t2j = F(2j+1 ) − F(2j ) and use the fact that 0 v 2 dF(v) ≤ 4F(∞). To obtain a bound for the right-hand side of (6.161) we note that ∞ σ ∗ (2−n+1 ) n1/2 n=1
≤ 8F
(6.164)
1/2
(∞) + 8
∞ n=1
−n
2
n−1 j=1
j
2 tj +
∞ n=1
F(∞) − F(2n ) n
1/2
Continuity and boundedness of Gaussian processes
278
and, by (14.76), ∞
2−n
n=1
n−1
2j tj
1/2 ∞ 1 tj ≤ 2 t2j n n=1 j=1 j=n
∞
≤
j=1
∞
=2
∞
n=1
Since ∞
n=1
F(∞) − F(2n ) n
1/2
≤
# ∞
F(∞) − F(2n ) n
$ F(∞) − F(x) 1/2
1
x(log x)1/2
(6.165)
1/2 .
dx,
(6.166)
we can complete the proof of (6.159). Remark 6.4.9 When the spectral distribution F in (6.158) is supported on the integers it determines a Gaussian process that is periodic on [−1/2, 1/2] (it is of the form of (5.260)). In this case we consider [−1/2, 1/2] as a group and take µ4 = 1 in (6.159). Let Z = {Z(t), t ∈ [−1/2, 1/2]} be a Gaussian process with stationary increments. In this case we can write 1/2
(6.167) ψ(x − y) = E(Z(x) − Z(y))2 ∞ 1/2 = (sin2 πλ(x − y)) ν(dλ) , where
0
(1 ∧ |λ| )ν(dλ) < ∞. Therefore, 2
ψ(µ4 s) ≤ πµ4
1
1/2 λ ν(dλ) s+
∞
2
0
1/2 (sin πλµ4 s) dH(λ) , 2
1
(6.168) where H(λ) = ν([1, ∞))−ν([λ, ∞)). We already dealt with the last term in (6.168) in Lemma 6.4.8. Incorporating the first term into an estimate for I([−1/2, 1/2], ψ) is elementary. Since H(∞) − H(x) = ν([x, ∞)), we get 1/2 1 I([−1/2, 1/2], ψ) ≤ 5 µ4 λ2 ν(dλ) (6.169) 0
+ ν 1/2 ([1, ∞)) +
∞
1/(µ4 π)
where µ4 = 4. .
ν 1/2 ([x, ∞)) dx , x(log(µ4 πx))1/2
6.4 Simple criteria for continuity
279
The sufficient conditions for continuity in (6.159) and (6.170) are also necessary when F (∞)−F (x) or ν([x, ∞)) are convex on [x0 , ∞) for some x0 < ∞. Theorem 6.4.10 Let X = {X(t), t ∈ R1 } be a stationary Gaussian process with ∞ (sin2 πλ(x − y)) dF (λ), (6.170) σ 2 (x − y) = E(X(x) − X(y))2 = 0
where F (∞) − F (x) is convex on [x0 , ∞) for some x0 < ∞. Then ∞ (F (∞) − F (x))1/2 dx < ∞ (6.171) x(log(x))1/2 2 is necessary and sufficient for X to have a continuous version. Let Z = {Z(t), t ∈ R1 } be a Gaussian process with stationary increments satisfying ∞ 2 E(Z(x) − Z(y)) = (sin2 πλ(x − y)) ν(dλ), (6.172) 0
where ν([x, ∞)) is convex on [x0 , ∞) for some x0 < ∞. Then (6.171), with F (∞) − F (x) replaced by ν([x, ∞)), is necessary and sufficient for Z to have a continuous version. Proof Sufficiency is given in Lemma 6.4.8. For necessity we note that, by (6.170), ∞ σ 2 (s) ≥ (sin2 πλs) dF (λ). (6.173) 1/4s
Let k ≥ 0 be an integer. For λ satisfying k + 1/4 ≤ sλ ≤ k + 3/4, sin2 πλs ≥ sin2 (π/4) ≥ 1/2. Therefore, ∞
σ 2 (s) ≥
1 (F ((k + 3/4)/s) − F ((k + 1/4)/s)) . 2
(6.174)
k=0
For s ≤ 1/(4x0 ) it follows from the convexity of F (∞) − F (x) that, for all k ≥ 0, F ((k + 3/4)/s) − F ((k + 1/4)/s) ≥ F ((k + 5/4)/s) − F ((k + 3/4)/s). (6.175) Therefore, we can fill in the missing terms in (6.174) to see that σ 2 (s) ≥
1 4
(F (∞) − F (1/(4s))) .
(6.176)
It now follows from (6.149) that (6.171) is a necessary condition for
280
Continuity and boundedness of Gaussian processes
the continuity of X. Exactly the same proof shows that (6.171) is a necessary condition for the continuity of Z. Remark 6.4.11 The spectral distribution of a periodic Gaussian process is supported on the integers, say F (k)−F (k−) = a2k . Obviously, F (∞)− F (x) is not convex on [x0 , ∞) for some x0 < ∞. Nevertheless, if a2k is nonincreasing, (6.176) remains true. Thus we have 1/2 ∞ ∞ ∞ ∞ 2 1/2 1 2 k=n ak < ∞ or, equivalently, ak <∞ n n(log n)1/2 n=2 n=1 k=2n (6.177) is sufficient for the process to have a continuous version, and when a2k is nonincreasing it is also necessary. Some other simple conditions for the continuity of Gaussian random Fourier series are given in Section 1, Chapter VII in Marcus and Pisier (1981).
6.5 Notes and references This chapter is the crux of our survey of basic results on Gaussian processes. The necessary and sufficient conditions obtained for the continuity and boundedness of the sample paths of Gaussian processes, in many cases, are also necessary and sufficient conditions for the joint continuity and boundedness of the local times of symmetric Markov processes. We give the conditions in terms of metric entropy and majorizing measures. Even though the latter conditions imply those for metric entropy, it is useful to understand both criteria. We also hope that working through the metric entropy proofs makes the majorizing measure proofs more understandable. Theorem 6.1.2 is due to Dudley (1967); see also Dudley (1973). Our approach is based on 4.3.1 Th´eor`eme, 3.4.4 Lemme, and 3.4.6 Lemme in Fernique (1997). Theorem 6.2.2 is due to Fernique (1974). Proofs are given in Fernique (1975) and based on this in Jain and Marcus (1978). Our approach is based on 4.3.6 Th´eor`eme, 4.4.2 Th´eor`eme, and 4.4.4 Th´eor`eme in Fernique (1997). The ideas behind Theorem 6.3.1 originated in an important early paper by Garsia, Rodemich and Rumsey, Jr. (1970) and were developed further in Preston (1971, 1972), and Fernique (1975). Our proof is based on the treatment in Fernique (1997), his 5.2.6 Th´eor`eme, 5.2.7 Lemme, and related material. The shining stars, Talagrand’s Theorems 6.3.4 and 6.3.5, first appeared in Talagrand (1987). The proof in Ledoux and Talagrand (1991) is rather difficult. A simpler proof is given in Talagrand (1992). Again we follow Fernique’s treatment in Fernique (1997), which
6.5 Notes and references
281
is spread out in his Chapter 5, but see in particular 5.3.5 Lemme. Dudley (1973) also follows Fernique’s treatment but adds some simplifications. We benefited from them. We have not yet looked at Talagrand (2005), which was published as we were preparing this book for press. We are sure it has new insights into sample path properties of Gaussian processes. Lemmas 6.4.1 and 6.4.2 are taken from Chapter I in Marcus and Pisier (1981). Material on continuity conditions in terms of the spectral distribution is from Chapter IV, Section 3 in Jain and Marcus (1978) and Chapter VII, Section 1 in Marcus and Pisier (1981).
7 Moduli of continuity for Gaussian processes
7.1 General results Let (S, τ ) be a metric or pseudometric space and let K ⊂ S be compact. Under very general conditions, whenever a Gaussian process {G(u), u ∈ K} has continuous sample paths on (S, τ ), it also has both an exact uniform and an exact local modulus of continuity. To be more specific, we call ω : R+ → R+ , ω(0) = 0, an exact uniform modulus of continuity for {G(u), u ∈ K} on (S, τ ) if lim sup
δ→0 τ (u,v)≤δ u,v∈K
|G(u) − G(v)| =C ω(τ (u, v))
a.s.
(7.1)
for some constant 0 < C < ∞. We call ρ : R+ → R+ , ρ(0) = 0, an exact local modulus of continuity for {G(u), u ∈ S} at some fixed u0 ∈ (S, τ ) if |G(u) − G(u0 )| lim sup =C a.s. (7.2) δ→0 τ (u,u0 )≤δ ρ(τ (u, u0 )) u∈S
for some constant 0 < C < ∞. We use the expressions uniform and local moduli of continuity for functions ω and ρ for which the equal signs in (7.1) and (7.2) are replaced by “less than or equal to” signs. We use the terms lower uniform and local moduli of continuity for functions ω and ρ for which the equal signs in (7.1) and (7.2) are replaced by “greater than or equal to” signs. Actually we are only concerned with two pseudometric spaces: (S, d), where S is a general space, and d(u, v) = (E(G(u) − G(v))2 )1/2 (recall d := dG is defined in (6.1)), and (Rn , | · |), that is, the ordinary Euclidean metric on Rn . In our discussions of moduli of continuity, we always assume that {G(u), u ∈ K} is continuous on (S, d) and only consider functions ω and 282
7.1 General results
283
ρ, which are continuous on R+ and are strictly positive on (0, δ ] for some δ > 0. To avoid trivialities in what follows, whenever we consider the local modulus of continuity of G = {G(u), u ∈ S} at a point u0 ∈ S, we assume that G and (S, d) are such that sup
|G(u) − G(u0 )| > 0
∀δ > 0
a.s.,
(7.3)
d(u,u0 )≤δ
and whenever we consider the uniform modulus of continuity of G, we assume that sup G(u) − G(v) > 0
∀δ > 0
a.s.
(7.4)
d(u,v)≤δ
u,v∈K
Note that since (7.4) is symmetric in u, v, it is equivalent to sup |G(u) − G(v)| > 0
∀δ > 0
a.s.
(7.5)
d(u,v)≤δ
u,v∈K
We also set (G(u) − G(v))/d(u, v) = 0 when d(u, v) = 0. The next simple lemma establishes conditions for a Gaussian process to have a uniform modulus of continuity. Lemma 7.1.1 Let {G(u), u ∈ (S, τ )} be a mean zero Gaussian process. Assume that d is continuous on (S, τ ). Let ω : R+ → R+ and K ⊂ S be a compact set. For the following three statements: lim sup
δ→0 τ (u,v)≤δ u,v∈K
G(u) − G(v) ≤C ω(τ (u, v))
lim sup
δ→0 τ (u,v)≤δ u,v∈K
lim sup
δ→0 τ (u,v)≤δ u,v∈K
G(u) − G(v) = C ω(τ (u, v))
a.s. for some
d(u, v) =0 ω(τ (u, v))
a.s. for some
0≤C<∞
(7.6)
(7.7)
0 ≤ C ≤ ∞ (7.8)
we have that (7.6) implies (7.7) implies (7.8). (Obviously, if (7.6) holds then C < ∞ in (7.8) but (7.7) implies (7.8) even if the limit superior in (7.6) is not finite almost surely.) These results are also valid for the local modulus of continuity, that is, (7.6)–(7.8) hold with v replaced by u0 and with the supremum taken over u ∈ K.
Moduli of continuity for Gaussian processes
284
Proof To show that (7.6) implies (7.7), we show that if (7.7) does not hold, then (7.6) does not hold. Suppose that (7.7) does not hold but that (7.6) does. Then, since we assume that d is continuous on (S, τ ), there exists a sequence of pairs {(uk , vk )}∞ k=1 in K × K for which d(uk , vk ) ≥ ω(τ (uk , vk )), limk→∞ τ (uk , vk ) = 0, and limk→∞ d(uk , vk ) = 0. It follows from (7.6) that almost surely C ≥ lim sup
δ→0 τ (u,v)≤δ u,v∈K
G(u) − G(v) (G(uk ) − G(vk )) ≥ lim sup , ω(τ (u, v)) d(uk , vk ) k→∞
(7.9)
which gives lim sup k→∞
G(uk ) − G(vk ) C ≤ d(uk , vk )
a.s.
(7.10)
But for each k ≥ 1, ξk := (G(uk ) − G(vk ))/d(uk , vk ) is a normal random variable with mean 0 and variance 1, and (7.10) cannot be finite almost surely for such a sequence, because ∪k≥n {ξk > C/ } ⊃ {ξn > C/ } and the probability of this last set is greater than zero and independent of n since the ξn are all N (0, 1). Thus we see that (7.6) implies (7.7). To show that (7.7) implies (7.8), we express G in terms of its Karhunen– Lo`eve expansion, given in Corollary 5.3.2, G(u) =
∞
u ∈ S,
ξj φj (u)
(7.11)
j=1
where now {ξj }∞ j=1 are independent N (0, 1). Clearly, in this case, 1/2 ∞ d(u, v) = (φj (u) − φj (v))2 .
(7.12)
j=1
Set GN (u) =
N
ξj φj (u)
u∈S
(7.13)
j=1
and note that |GN (u) − GN (v)|
≤ ≤
N j=1 N j=1
|ξj | sup |φj (u) − φj (v)| (7.14) 1≤j≤N
|ξj | d(u, v).
7.1 General results
285
It follows from (7.7) and (7.14) that lim sup
δ→0 τ (u,v)≤δ u,v∈K
GN (u) − GN (v) =0 ω(τ (u, v))
a.s.
(7.15)
Therefore, the random variable lim sup
δ→0 τ (u,v)≤δ u,v∈K
G(u) − G(v) ω(τ (u, v))
(7.16)
is measurable with respect to the tail field of {ξj }∞ j=1 and hence is constant almost surely. Since, by symmetry, lim sup
δ→0 τ (u,v)≤δ u,v∈K
G(u) − G(v) ≥ 0, ω(τ (u, v))
(7.17)
this implies (7.8). The proofs are exactly the same for the local modulus. Just take v and vk equal to u0 . Using Theorems 6.3.3 and 6.3.6 and the above lemma, we can find a uniform modulus of continuity for continuous Gaussian processes. Theorem 7.1.2 Let G = {G(u), u ∈ T } be a mean zero Gaussian process, where (T, d) is a separable metric or pseudometric space with finite diameter D. Suppose that X has bounded uniformly continuous sample paths. Then there exists a probability measure µ on T such that 1/2 1 log lim sup ds = 0. (7.18) →0 t∈T 0 µ(B(t, s)) Suppose in addition that 1 sup →0 t∈T
lim
log 0
1 µ(B(t, s))
1/2 ds = ∞.
(7.19)
Then lim sup
δ→0 d(u,v)≤δ u,v∈T
supt∈T
G(u) − G(v) $1/2 = C d(u,v) # 1 log ds µ(B(t,s)) 0
a.s. (7.20)
for some constant 0 ≤ C < ∞. Proof The assertion in (7.18) is part of Theorem 6.3.6. Furthermore, by Theorem 6.3.3, the left-hand side of (7.20) is finite almost surely. Finally, (7.19) implies that the denominator of the left-hand side of
286
Moduli of continuity for Gaussian processes
(7.20) satisfies (7.7) with τ = d. Therefore, (7.20) follows from Lemma 7.1.1. Remark 7.1.3 We note the following points with regard to Theorem 7.1.2: (1) It follows from the proof that if (7.18) and (7.19) hold for any probability measure ν on T , then (7.20) holds with µ replaced by ν. The distinction here is that the continuity of G implies the existence of some µ for which (7.18) holds, whereas, if (7.18) holds for any probability measure ν on T , then G is continuous. δ 1 )1/2 (2) If the constant C in (7.20) is not zero, supt∈T 0 (log µ(B(t,u)) du is an exact uniform modulus of continuity for G. Unfortunately, it is not easy to see whether this is the case. (3) Suppose that (7.18) holds for some probability measure µ on T . Then a sufficient condition that (7.19) holds is that lim→0 M (T, d, ) = ∞ (or, equivalently, that lim→0 N (T, d, ) = ∞). To see this, let M (T, d, ) = K( ) and note that K( ) ≥ 1. Then, since µ is a probability measure, using the definition of M (T, d, ), we see that there exists a point, say t1 ∈ T , for which µ(B(t1 , )) ≤ 1/K( ). Consequently, 1/2 1 1/2 du ≥ (log K( )) . (7.21) log µ(B(t , u)) 1 0 Therefore, if lim→0 M (T, d, ) = ∞, (7.19) holds. If lim→0 M (T, d, ) < ∞, we can find a probability measure µ on T for which (7.18) holds but (7.19) does not. To see this note that under this condition there exists a K such that M (T, d, ) = K for all < , for some > 0. Reconsider (7.19) and note that it suffices to take the supremum over those points t, t ∈ T for which d(t, t ) > 0. Since M (T, d, ) = K for all < , there are only K points in T with this property. If we choose the probability measure µ to be uniform on these points, we see that for all ≤ , 1/2 1 1/2 log sup du ≤ (log K) . (7.22) µ(B(t, u)) t∈T 0 Thus, for this measure (7.19) does not hold, but, obviously (7.18) does. Note that when M (T, d, ) = K for all < , for some > 0, all the points in T are separated, so (7.20) has no meaning.
7.1 General results
287
We now present an abstract result that gives necessary and sufficient conditions for the existence of exact local and uniform moduli of continuity. We say abstract because it is difficult to express these moduli in a recognizable way. Theorem 7.1.4 Let G = {G(y), y ∈ K}, (K, d) a compact metric space, be a mean zero Gaussian process with continuous sample paths. For δ > 0 let E(δ) = E sup d(y,y0 )≥δ
|G(y) − G(y0 )| d(y, y0 )
(7.23)
y∈K
for some fixed y0 ∈ K and
G(x) − G(y) ( E(δ) = E sup . d(x, y) d(x,y)≥δ
(7.24)
x,y∈K
Then (1) If limδ→0 E(δ) = ∞, ρ(δ) = δE(δ) is an exact local modulus of continuity for G at y0 , that is, (7.2) holds with u0 replaced by y0 and ρ(d(u, u0 )) replaced by d(u, y0 )E(d(u, y0 )). ( ( = ∞, ω(δ) = δ E(δ) is an exact uniform modulus (2) If limδ→0 E(δ) of continuity for G, that is, (7.1) holds with ω(d(u, v)) replaced ( by d(u, v)E(d(u, v)). (3) If E(δ) is bounded, G does not have an exact local modulus of continuity at y0 . ( (4) If E(δ) is bounded, G does not have an exact uniform modulus of continuity. Because of (5.183), this theorem also holds with the expectation re( placed by the median, that is, with E(δ) replaced by m(δ) and E(δ) replaced by m(δ), ( where m(δ) = median of sup d(y,y0 )≥δ
|G(y) − G(y0 )| d(y, y0 )
(7.25)
G(x) − G(y) . d(x, y)
(7.26)
y∈K
for some fixed y0 ∈ K and m(δ) ( = median of sup d(x,y)≥δ
x,y∈K
We use the following general lemma, which is interesting in its own right, in the proof of Theorem 7.1.4.
288
Moduli of continuity for Gaussian processes
Lemma 7.1.5 Let (S, φ) be a separable metric or pseudometric space. Let S ⊂ S be such that there exists an s0 ∈ S − S with φ(s0 , S ) = 0. Assume that G is a mean zero Gaussian process continuous on S , with sups∈S EG2 (s) ≤ 1. Set M (u) = median of
sup
G(v).
(7.27)
φ(v,so )≥u v∈S
(1) Then limu→0 M (u) < ∞ if and only if sups∈S G(s) < ∞ almost surely. (2) If limu→0 M (u) = ∞, sup
lim
δ→0 φ(s,s0 )≤δ s∈S
G(s) =1 M (φ(s, s0 ))
a.s.
(7.28)
Proof Suppose that sups∈S G(s) < ∞ almost surely. Then, by Corollary 5.4.7, E(sups∈S G(s)) < ∞ and consequently, by (5.183), supu M (u) < ∞. Conversely, if supu M (u) < ∞, then, by (5.183), E supφ(s,so )≥u,s∈S G(s) < supu M (u) + 1. Therefore, by the Monotone Convergence Theorem, E (sups∈S G(s)) < ∞, which implies that sups∈S G(s) < ∞ almost surely. This verifies statement (1). Using the fact that M (u) is decreasing, we now show that to obtain (7.28) it suffices to show that lim
sup
δ→0 φ(s,s0 )≥δ s∈S
G(s) =1 M (δ)
a.s.
(7.29)
To see this note that (7.29) implies that, for all > 0, there exists a sequence {δk } decreasing to zero such that G(sk ) ≥ 1 − for some sk ∈ S, with φ(sk , s0 ) ≥ δk . M (δk )
(7.30)
Suppose that φ(sk , s0 ) = δk . Obviously, δk ≥ δk and since M is deceasing, G(sk )/M (δk ) ≥ 1 − . This implies that sup φ(s,s0 )≤δ k s∈S
G(s) ≥ 1 − . M (φ(s, s0 ))
(7.31)
Note that δk → 0. To see this, assume that for some subsequence kj and some > 0 we have δk j ≥ . Then, as in the first paragraph of this proof, supφ(s,s0 )≥ G(s) is bounded almost surely. However, since limu→0 M (u) = ∞, this implies that G(skj )/M (δkj ) = 0, which contradicts (7.30). Thus (7.29) implies the lower bound in (7.28). It is even easier to see that it also implies the upper bound.
7.1 General results
289
We now show that (7.29) holds. It is easy to see that by (5.152) and the Borel–Cantelli Lemma, we can find a sequence δk decreasing to zero such that lim
sup
k→∞ φ(s,s0 )≥δ k s∈S
G(s) =1 M (δk )
a.s.
(7.32)
This shows that 1 is a lower bound in (7.29). Showing that it is an upper bound is more delicate. By Lemma 5.4.3, M (u) is unique. Furthermore, M (u) is decreasing and is left continuous, with right limits (this can be proved using the ideas at the end of the proof Lemma 5.4.3, starting with the paragraph containing (5.169)). Consider also M (u) := median of
G(s),
sup
(7.33)
φ(s,so )>u s∈S
which is decreasing and is right continuous, with left limits. We have, for all u > 0, lim M (v) = M (u) ≤ M (u) = lim M (v). v↓u
v↑u
(7.34)
Also, for all integers n, there exist numbers {dn } such that {u > 0 : M (u) ≥ n} = (0, dn ]
and {u > 0 : M (u) ≤ n} = [dn , ∞). (7.35) Set Zn = supφ(s,so )≥dn ,s∈S G(s) and Z n = supφ(s,so )>dn ,s∈S G(s). It follows from (5.150) and (5.151), respectively, and the Borel–Cantelli Lemma that there exists an n0 (ω) such that, for all n ≥ n0 (ω), % % Zn ≥ M (dn ) − 1 − 2 log n ≥ n − 1 − 2 log n (7.36) and using the fact that M (u) ≤ M (u) % % Z n ≤ M (dn ) + 1 + 2 log n ≤ n + 1 + 2 log n.
(7.37)
For any u ∈ (0, dn0 ] there exists some integer n such that u ∈ (dn+1 , dn ]. For this u, M (u) ∈ [n, n + 1). Consequently, for this u, sup φ(s,so )≥u s∈S
G(s) ≥
sup φ(s,so )≥dn s∈S
G(s)
% 2 log n % ≥ M (u) − 2 − 2 log M (u) ≥ n−1−
(7.38)
Moduli of continuity for Gaussian processes
290 and
sup
G(s) ≤
φ(s,so )>u s∈S
sup
G(s)
(7.39)
φ(s,so )>dn+1 s∈S
% 2 log(n + 1) ) M (u) + 2 + 2 log(M (u) + 1).
≤ n+2+ ≤
Combining (7.38) and (7.39) we see that % M (u) − 2 − 2 log M (u) ≤ sup G(s)
(7.40)
φ(s,so )≥u s∈S
≤ lim
sup
v↑u φ(s,so )>v s∈S
G(s)
) ≤ lim M (v) + 2 + 2 log(M (v) + 1) v↑u % = M (u) + 2 + 2 log(M (u) + 1), which gives the upper bound in (7.29). Proof of Theorem 7.1.4 By (5.183) it is enough to prove this theorem ( with E(δ) replaced by m(δ) and E(δ) replaced by m(δ). ( We begin with (1). Consider Lemma 7.1.5 with (S, φ) = (K, d) and with G replaced by G , which we define as G (s) =
G(s) − G(s0 ) . d(s, s0 )
(7.41)
We take K = {d(s, s0 ) > 0}. The statement in (1) follows immediately from Lemma 7.1.5 applied to G . The statement in (2) also follows from Lemma 7.1.5, but applying it is more complicated. Let K := K × K and define the pseudometric 1/2
φ((s, t), (s , t )) = E[(G(s) − G(t)) − (G(s ) + G(t ))]2
(7.42)
on K. Let ∆ = {(s, s); s ∈ K} and define
S = {(s, t) : φ((s, t), ∆) > 0}.
(7.43)
For some s0 ∈ K consider the Gaussian process G(s, t) :=
G(s) − G(t) φ((s, t), (s0 , s0 ))
(7.44)
7.1 General results
291
on (S , φ). By Lemma 7.1.5 (2) applied to G, sup
lim
δ→0
φ((s,t),(s0 ,s0 ))≤δ
G((s, t)) =1 m(φ((s, ( t), (s0 , s0 )))
a.s.
(7.45)
(s,t)∈S
Since φ((s, t), (s0 , s0 )) = d(s, t), this is precisely statement (2). To obtain statement (4) let us assume that m(δ) ( is bounded but that G does have an exact uniform modulus of continuity ω. By Lemma 7.1.5 (1), the condition that m(δ) ( is bounded implies that sup (s,t)∈K
G((s, t)) <∞ φ((s, t), (s0 , s0 ))
a.s.
(7.46)
or, equivalently, that sup s,t∈K
G(s) − G(t) <∞ d(s, t)
a.s.
(7.47)
and hence that lim sup
δ→0 d(s,t)≤δ s,t∈K
G(s) − G(t) <∞ d(s, t)
a.s.
(7.48)
But now, writing lim sup
δ→0 d(s,t)≤δ s,t∈K
G(s) − G(t) G(s) − G(t) d(s, t) = lim sup δ→0 ω(d(s, t)) d(s, t) ω(d(s, t)) d(s,t)≤δ
(7.49)
x,y∈K
and using (7.48) and (7.7), which follows by our assumption that ω is an exact uniform modulus of continuity, we get lim sup
δ→0 d(s,t)≤δ s,t∈K
G(s) − G(t) =0 ω(d(s, t))
a.s.,
(7.50)
which contradicts the assumption that ω is an exact uniform modulus of continuity. Thus we have established statement (4). The proof of statement (3) is exactly the same as the proof of statement (4), except that we use Lemma 7.1.1 as it applies to the the local modulus of continuity.
7.1.1 m-modulus of continuity We now give another way to describe moduli of continuity that is also commonly used. In analogy with (7.1) we call ω : R+ → R+ , ω(0) = 0,
292
Moduli of continuity for Gaussian processes
an exact uniform m-modulus of continuity for {G(u), u ∈ K} on (S, τ ) if |G(u) − G(v)| lim sup sup =C a.s. (7.51) ω(δ) δ→0 τ (u,v)≤δ u,v∈K
for some constant 0 < C < ∞. We call ρ : R+ → R+ , ρ(0) = 0, an exact m-local modulus of continuity for {G(u), u ∈ S} at some fixed u0 ∈ (S, τ ) if lim sup
sup
δ→0
τ (u,u0 )≤δ
|G(u) − G(u0 )| =C ρ(δ)
a.s.
(7.52)
u∈S
for some constant 0 < C < ∞. We define a uniform and local m-modulus and a lower uniform and local m-modulus similarly to the definitions for uniform and local moduli on page 282. As with the moduli introduced on page 282, we assume that the m-moduli are continuous and strictly positive on (0, δ ] for some δ > 0. Lemma 7.1.6 lim sup sup δ→0
τ (u,v)≤δ
|G(u) − G(v)| ≤C ω(δ)
a.s.
(7.53)
u,v∈K
implies |G(u) − G(v)| ≤C ω(τ (u, v))
lim sup
δ→0 τ (u,v)≤δ u,v∈K
a.s.
(7.54)
Moreover, when ω is nondecreasing, (7.54) implies (7.53). Conversely, lim sup
δ→0 τ (u,v)≤δ u,v∈K
|G(u) − G(v)| ≥ C ω(τ (u, v))
a.s.
(7.55)
implies lim sup sup δ→0
τ (u,v)≤δ
|G(u) − G(v)| ≥ C ω(δ)
a.s.,
(7.56)
u,v∈K
and when ω is nondecreasing, (7.56) implies (7.55). Proof Suppose (7.54) fails. Then there exist a sequences {δk }, δk ↓ 0, and {(uk , vk )}, with τ (uk , vk ) = δk such that sup k
|G(uk ) − G(vk )| > C. ω(δk )
Consequently (7.53) fails. This shows that (7.53) implies (7.54).
(7.57)
7.1 General results
293
Now suppose that (7.54) holds and ω is nondecreasing. Then, since τ (u, v) ≤ δ implies that ω(τ (u, v)) ≤ ω(δ), we get (7.53). The converse follows immediately (we display it for later reference). The following simple consequences of Lemma 7.1.6 are used often. Lemma 7.1.7 (1) If (7.53) and (7.55) hold, with the same constant C, then ω is both an exact uniform modulus and an exact uniform m-modulus of continuity for G, with the same constant C. (2) If ω is nondecreasing, then it is an exact uniform modulus of continuity for G if and only if it is an exact uniform m-modulus of continuity for G. (3) The statements in (7.53)–(7.56) and (1) and (2) of this lemma remain valid when formulated for local moduli and local m-moduli. We use the expression m-modulus because whenever ω(δ) or ρ(δ) are exact m-moduli of continuity they can essentially be taken to be monotone. Set ω (δ) := inf δ≤x≤ ω(x). Obviously ω (δ) is an increasing function on [0, ]. We now show that if lim sup sup δ→0
τ (u,v)≤δ
|G(u) − G(v)| ≤C ω(δ)
a.s.,
(7.58)
|G(u) − G(v)| ≤C ω (δ)
a.s.
(7.59)
u,v∈K
then lim sup sup δ→0
τ (u,v)≤δ
u,v∈K
To see this suppose that ω (δ) < ω(δ) for some δ ∈ (0, ]. Then there (δ) = ω (δ ) = ω(δ ). Note that exists a δ > δ such that ω sup τ (u,v)≤δ
|G(u) − G(v)| ω (δ)
≤
u,v∈K
sup τ (u,v)≤δ
|G(u) − G(v)| ω (δ )
(7.60)
u,v∈K
=
sup τ (u,v)≤δ
|G(u) − G(v)| . ω(δ )
u,v∈K
Thus, if the left-hand side of (7.59) is greater than or equal to C, so is the left-hand side of (7.58). This shows that (7.58) implies (7.59). Using the facts that ω ( · ) ≤ ω( · ) and is nondecreasing and Lemma 7.1.7 (2), we get the following lemma.
Moduli of continuity for Gaussian processes
294
Lemma 7.1.8 If ω is an exact uniform m-modulus of continuity for G, then ω is both an exact uniform m-modulus of continuity for G and an exact uniform modulus of continuity for G and in all three cases the constant C is the same. The same result holds for local moduli of continuity. Lemma 7.1.1 is used often to prove zero–one laws for moduli of continuity. Mimicking the proof of Lemma 7.1.1, it is easy to see that the following version of it holds for m-moduli of continuity. Lemma 7.1.9 Let {G(u), u ∈ (S, τ )} be a mean zero Gaussian process. Assume that d is continuous on (S, τ ). Let ω : R+ → R+ and K ⊂ S be a compact set. For the following three statements: lim sup sup δ→0
τ (u,v)≤δ
G(u) − G(v) ≤C ω(δ)
a.s. for some
0≤C<∞
u,v∈K
(7.61) lim sup
δ→0 τ (u,v)≤δ u,v∈K
lim sup sup δ→0
τ (u,v)≤δ
G(u) − G(v) = C ω(δ)
d(u, v) =0 ω(δ)
a.s. for some
(7.62)
0 ≤ C ≤ ∞
u,v∈K
(7.63) we have that (7.61) implies (7.62) implies (7.63). (Obviously, if (7.61) holds, then C < ∞ in (7.63), but (7.62) implies (7.63) even if the limit superior in (7.61) is not finite almost surely.) These results are also valid for the local m-modulus of continuity, that is, (7.61)–(7.63) hold with v replaced by u0 and with the supremum taken over u ∈ K. The study of the local moduli of continuity of Gaussian processes is simplified by the following corollary of Slepian’s Lemma: Lemma 7.1.10 Let X and Y be Gaussian processes on (S, τ ) and assume that there exists a u0 ∈ S such that, for all s, t ∈ S in some neighborhood of u0 , E(Y (s) − Y (t))2 ≤ E(X(s) − X(t))2 .
(7.64)
7.1 General results
295
Suppose that ψ is a local m-modulus of continuity for X at u0 , that is, that |X(u) − X(u0 )| ≤C a.s. (7.65) lim sup sup ψ(δ) δ→0 τ (u,u0 )≤δ u∈S
Then (7.65) holds with X replaced by Y . Proof We can assume that X(u0 ) = Y (u0 ) = 0 since X(t) − X(u0 ) and Y (t) − Y (u0 ) also satisfy (7.64) in the same neighborhood of u0 , which we denote by Nδ = {t|τ (t, u0 ) ≤ δ}. Let t1 , · · · , tn ∈ Nδ and be different from u0 . Set ρ2 (δ) = sup EX 2 (tj )
(7.66)
1≤j≤n
and then set f 2 (tj )
= ρ2 (δ) − EX 2 (tj ) + EY 2 (tj ),
Y (tj )
= Y (tj ) + ηρ(δ),
X(tj )
= X(tj ) + η f (tj ),
(7.67)
where η and η are independent standard normal random variables independent of {X(tj )} and {Y (tj )}. Note that E(Y (tj ) − Y (tk ))2
= E(Y (tj ) − Y (tk ))2
(7.68)
≤ E(X(tj ) − X(tk )) ≤ E(X(tj ) − X(tk ))2 2
and EY (tj )2 = EX(tj )2 .
(7.69)
Consequently, EY (tj )Y (tk ) ≥ EX(tj )X(tk )
∀1 ≤ j, k ≤ n.
(7.70)
By convention, ρ(δ) > 0 for all δ > 0. Therefore, by Slepian’s Lemma, Lemma 5.5.1, (7.71) P max Y (tj ) > Cψ(δ) ≤ P max X(tj ) > Cψ(δ) . j
j
Equivalently, Y (tj ) ηρ(δ) X(tj ) ηf (tj ) P max + > C ≤ P max + >C . j j ψ(δ) ψ(δ) ψ(δ) ψ(δ) (7.72) We take limits to extend (7.72) to a countable set. Then, by separability, we can extend it further so that it holds for all t ∈ Nδ . We extend the
296
Moduli of continuity for Gaussian processes
definition of ρ(δ) so that ρ2 (δ) := supt∈Nδ EX 2 (tj ) and likewise extend the definition of f (tj ) in (7.67) so that it holds for all t ∈ Nδ . It follows from (7.65) and the fact that (7.61) implies (7.62), which also holds for the local m-modulus of continuity, that lim
δ→0
ρ(δ) = 0. ψ(δ)
(7.73)
Therefore, since supt∈Nδ |f (t)| ≤ ρ(δ), for any > 0 we can find a δ such that X(t) ηf (t) + ≤C + (7.74) sup ψ(δ) ψ(δ) t∈Nδ with probability greater than 1 − . Consequently, by (7.72), Y (t) ηρ(δ) + ≤C + sup ψ(δ) ψ(δ) t∈Nδ
(7.75)
with probability greater than 1 − . Using this and (7.73) we see that lim sup
sup
δ→0
τ (u,u0 )≤δ
Y (u) ≤C ψ(δ)
a.s.
(7.76)
u∈S
By Lemma 7.1.9 for the local m-modulus of continuity, and the symmetry of Y , we can replace Y by its absolute value to get that (7.65) holds with X replaced by Y . Remark 7.1.11 We generally use Lemma 7.1.10 in the contrapositive form, that is, if lim sup
sup
δ→0
τ (u,u0 )≤δ
|Y (u) − Y (u0 )| ≥C ψ(δ)
a.s.
(7.77)
|X(u) − X(u0 )| ≥C ψ(δ)
a.s.
(7.78)
u∈S
then lim sup
sup
δ→0
τ (u,u0 )≤δ
u∈S
Another implication of Lemma 7.1.10 is that if X and Y have stationary increments and 2 (h) ∼ σY2 (h) σX
(7.79)
(see (7.136)) and lim sup
sup
δ→0
τ (u,u0 )≤δ
u∈S
|X(u) − X(u0 )| =C ψ(δ)
a.s.,
(7.80)
7.2 Processes on Rn
297
then lim sup
sup
δ→0
τ (u,u0 )≤δ
|Y (u) − Y (u0 )| =C ψ(δ)
a.s.
(7.81)
u∈S
This is easy to see since we can use the lemma, first on Y and (1 + )X and then on X and (1 + )Y , for any > 0. The results in Lemma 7.1.10 and this remark are for the local mmodulus of continuity. It follows from Lemma 7.1.7 that when ψ( · ) is increasing on [0, δ ], they also apply to the local modulus of continuity. When ψ is not increasing one can replace ψ by its monotone minorant, as in Lemma 7.1.8, and then these results also hold for the local modulus of continuity. 7.2 Processes on Rn It is difficult to say anything more about general Gaussian processes that sheds light on many of the specific processes we are interested in. To obtain useful results we restrict ourselves to Gaussian processes with stationary increments on Rn and, more particularly, on R1 . Since one of our primary concerns in this book is to study local times of L´evy processes, and since their associated processes are Gaussian processes with stationary increments in R1 , this is not a significant restriction for us. We consider Gaussian processes G = {G(s), s ∈ [−T, T ]n }. As usual, 1/2
. We also consider monotone we take d(s, t) = E(G(s) − G(t))2 majorants for d, that is, strictly increasing functions φ, with φ(0) = 0, such that d(s, t) ≤ φ(|s − t|). Define
1/2
ω (δ) = ω (φ, δ) = φ(δ) (log 1/δ)
δ
+ 0
and
1/2
ρ(δ) = ρ(φ, δ) = φ(δ) (log log 1/δ)
+ 0
(7.82)
φ(u) du u(log 1/u)1/2
1/2
(7.83)
φ(δu) du. (7.84) u(log 1/u)1/2
We show that the functions ω (δ) and ρ(δ) are uniform or local moduli of continuity for Gaussian processes for which (7.82) holds. Whenever we use ω , we include the unstated hypothesis that φ is such that (δ) = 0, and similarly whenever we use ρ. It follows from limδ→0 ω Lemma 6.4.6 that the finiteness of the integral in (7.83) is a sufficient
Moduli of continuity for Gaussian processes
298
condition for the continuity of G, and we show in Lemma 7.2.5 that, for a given φ such that φ(2u) ≤ 2φ(u), the integrals in (7.83) and (7.84) are both finite or both infinite. In the next theorem we see that ω is a uniform m-modulus of continuity for G and ρ is a local m-modulus of continuity for G as defined in Subsection 7.1.1. Theorem 7.2.1 Let G be a Gaussian process satisfying (7.82). Set S = [−T, T ]n ; then lim sup sup δ→0
|u−v|≤δ
|G(u) − G(v)| ≤C ω (δ)
a.s.,
(7.85)
u,v∈S
and for each u0 ∈ S lim sup sup δ→0
|u−u0 |≤δ
|G(u) − G(u0 )| ≤ C ρ(δ)
a.s.,
(7.86)
u∈S
where 0 ≤ C, C < ∞. Before proving this theorem we note the following estimates, which are also interesting on their own. Lemma 7.2.2 Let G be a Gaussian process satisfying (7.82). Set S = [−T, T ]n . Then for all δ > 0 sufficiently small, E sup G(s) − G(t) ≤ Cn,T ω (δ),
(7.87)
|s−t|≤δ
s,t∈S
and for each t0 ∈ S
E sup |G(s) − G(t0 )| ≤ Cn |s−t0 |≤δ
φ(δ) + 0
1/2
φ(δu) du . (7.88) u(log 1/u)1/2
s∈S
Proof The inequality in (7.87) follows from (6.61) by finding an upper bound for its right-hand side. We take for the majorizing measure, µ = λ/(2T )n , where λ is Lebesgue measure on Rn . By (7.82), the ball Bd (s, ) of radius in the metric d contains a Euclidean ball of radius ρ ≥ φ−1 ( ). Taking into account the fact that a Euclidean ball of radius less than T with its center in S has at least 2−n of its volume in S, we see that for all s ∈ S, µ(Bd (s, )) ≥ (Dn,T φ−1 ( ))n , where Dn,T is a constant that depends only on n and T , so that for all s ∈ S, log
Cn,T 1 ≤ n log −1 µ(Bd (s, )) φ ( )
(7.89)
7.2 Processes on Rn
299
for some constant Cn,T ≥ 1, which depends only on n and T . Therefore, for all δ sufficiently small, 0
δ
1/2 1 du (7.90) log µ(Bd (s, u)) 1/2 δ √ Cn,T ≤ n log −1 du φ (u) 0 1/2 φ−1 (δ ) √ Cn,T log dφ(u) = n u 0 1/2 φ−1 (δ ) √ C φ(u) n,T ≤ n δ log −1 + du φ (δ ) u(log(Cn,T /u))1/2 0 1/2 φ−1 (δ ) 1 φ(u) + du . ≤ Cn,T δ log −1 φ (δ ) u(log(1/u))1/2 0
The constants are not necessarily the same at each step. (Note also that it is easy to show that when the integral in (7.83) is finite, limu→0 φ(u) $1/2 # C = 0. However we do not need to consider this when we do log n,T u the integration by parts in (7.90) because the contribution of this term is negative.) Consequently, by (6.61), E sup G(s) − G(t)
(7.91)
d(s,t)≤δ
s,t∈S
≤ Cn,T
δ log
1 φ−1 (δ )
1/2
+ 0
φ−1 (δ )
φ(u) du . u(log(1/u))1/2
Setting δ = φ(δ) and using (7.82), we get (7.87) (to be more precise, it follows from (7.82) that the subset of S × S for which d(s, t) ≤ φ(δ) contains the subset of S × S for which |s − t| ≤ δ). To obtain (7.88), we use (6.59) with X (s) = G(s) − G(t0 ). Choose δ > 0 such that φ−1 (δ ) ≤ T and let Sδ = {s : |s − t0 | ≤ φ−1 (δ )} ∩ S. 1/2
Consider the process {X (s), s ∈ Sδ }. Since E(X (s) − X (t))2 = d(s, t), it follows from (7.82) that the d-diameter of Sδ is less than or equal to 2δ . By (6.59), E sup |G(s) − G(t0 )| s∈Sδ
C sup ≤ 2 s∈Sδ
0
(7.92) 2δ
1 log µ(Bd (s, u) ∩ Sδ )
1/2 du
Moduli of continuity for Gaussian processes 1/2 δ 1 ≤ C sup log du, µ(Bd (s, u) ∩ Sδ ) s∈Sδ 0
300
where µ is any subprobability measure on Sδ . The second inequality in (7.92) is because the integrand is decreasing. (Note that because X is symmetric in (6.59), it also holds for |X | if the constant on the right-hand side is doubled.) We take µ = λ/λ(B(0, φ−1 (δ ))) in (7.92), where λ is Lebesgue measure on Rn and B(0, φ−1 (δ )) is a Euclidean ball of radius φ−1 (δ ) in Rn . Taking into account again the fact that a Euclidean ball of radius less than T with center in S has at least 2−n of its volume in S, we see that, for all s ∈ S, µ(Bd (s, ) ∩ Sδ ) ≥ (φ−1 ( )/(2φ−1 (δ )))n , so that for all s ∈ S and ≤ δ , 1 2φ−1 (δ ) ≤ n log −1 . µ(Bd (s, ) ∩ Sδ ) φ ( )
log
(7.93)
As in (7.90), but with CN,T replaced by 2φ−1 (δ ) followed by a change of variables, 1/2 δ 1 log du (7.94) µ(Bd (s, u)) ∩ Sδ ) 0 1/2 δ √ 2φ−1 (δ ) du ≤ n log −1 φ (u) 0 1 √ 1 φ(φ−1 (δ )u) 1/2 du . ≤ n δ (log 2) + 2 0 u(log 2/u)1/2 Consequently, by (6.59), E
sup |s−t0 |≤φ−1 (δ )
s∈S
|G(s) − G(t0 )|
1/2
≤ Cn δ (log 2) −1
Setting φ
1 + 2
(7.95)
1
0
φ(φ−1 (δ )u) du . u(log 2/u)1/2
(δ ) = δ and making a change of variables, we get
E sup G(s) − G(t)
(7.96)
|s−t0 |≤δ
s∈S
≤ Cn
1/2
φ(δ) (log 2)
1 + 2
0
1/2
φ(2δu) du . u(log 1/u)1/2
Let φ be any increasing function for which (7.82) holds (φ need not be strictly increasing). By the triangle inequality, − t|/2). d(s, t) ≤ d(s, (s + t)/2) + d((s + t)/2, t) ≤ 2φ(|s
(7.97)
7.2 Processes on Rn
301
Now take φ∗ to be the smallest increasing function satisfying (7.82), which again need not be strictly increasing. By (7.97), 2φ∗ (|s − t|/2) is also a monotone majorant for d(s, t). However, since φ∗ is the smallest such majorant, we have φ∗ (|s − t|) ≤ 2φ∗ (|s − t|/2) or, equivalently, that φ∗ (2|s − t|) ≤ 2φ∗ (|s − t|).
(7.98)
Taking limits, we see that (7.96) holds with φ replaced by φ∗ . Using (7.98) we get (7.88) with φ replaced by φ∗ (we absorb the factor 2 into the constant). Thus it also holds as stated since φ∗ ≤ φ. Proof of Theorem 7.2.1 By Theorem 6.3.3 and (7.90) with δ replaced by φ(δ), |G(u) − G(v)| lim sup sup <∞ a.s. (7.99) ω (δ) δ→0 d(u,v)≤φ(δ) u,v∈S
Since φ(δ) = o( ω (δ)), (7.99) is a tail random variable (as in the proof of Lemma 7.1.1). Consequently, the limit in (7.99) is equal to some finite constant. This implies (7.85) as we pointed out in the parenthetical remark following (7.91). We now obtain (7.86). Let δk := δk (θ) = θ−k , 1 < θ ≤ 2 and ak = median sup|u−u0 |≤δk ,u∈S G(u) − G(u0 ). We use Theorem 5.4.3, (5.151), and the Borel–Cantelli Lemma to see that, for all > 0, sup |u−u0 |≤δk ,u∈S
1/2
G(u)−G(u0 ) ≤ ak +φ(δk ) ((2 + ) log log 1/δk )
(7.100)
for all k ≥ k0 (ω) almost surely. By Corollary 5.4.5 and (7.88), 1/2 φ(δk u) du . (7.101) ak ≤ Cn φ(δk ) + u(log 1/u)1/2 0 Therefore, the right-hand side of (7.100) can be replaced by $ # 1/2 ρ((φ, δk ) := φ(δk ) ((2 + ) log log 1/δk ) + Cn (7.102) 1/2 φ(δk u) du . + Cn u(log 1/u)1/2 0 (We write it this way for use again in Corollary 7.2.3.) The inequality in (7.100) also holds with φ replaced by φ∗ , and by (7.98) we get ρ(φ∗ , δk+1 ). ρ((φ∗ , δk ) ≤ 2(
(7.103)
This enables us to interpolate for δk+1 < δ < δk and obtain (7.86) with φ replaced by φ∗ . But, since φ ≥ φ∗ , it also holds as stated.
Moduli of continuity for Gaussian processes
302
In the next corollary we add some conditions that enable us to sharpen Theorem 7.2.1 so that we get upper bounds that are best possible for certain classes of Gaussian processes (see Theorems 7.2.14 and 7.2.15). Corollary 7.2.3 Let G = {G(s), s ∈ [−T, T ]n } be a Gaussian process for which (7.82) holds. Suppose furthermore that φ is such that, for all > 0, there exists a θ > 1 for which φ(θu) ≤ (1 + )φ(u)
(7.104)
uniformly in [0, u0 ], for some u0 > 0. Then, if 1/2 $ # φ(δu) 1/2 , du = o φ(δ) (log log 1/δ) u(log 1/u)1/2 0
(7.105)
as δ → 0 |G(u) − G(u0 )|
lim sup
sup
δ→0
|u−u0 |≤δ u∈[−T,T ]n
1/2
(2φ2 (δ) log log 1/δ)
≤1
a.s.,
(7.106)
and if
1/2
0
$ # φ(δu) 1/2 , du = o φ(δ) (log 1/δ) u(log 1/u)1/2
(7.107)
as δ → 0 lim sup
sup
δ→0
|u−v|≤δ
u,v∈[−T,T ]n
|G(u) − G(v)| 1/2
(2nφ2 (δ) log 1/δ)
≤1
a.s.
(7.108)
Proof The statement in (7.106) follows immediately from (7.102) since we can use (7.104) in the interpolation of (7.102). We now obtain (7.108). Let {tk } = {tk (δ)}, t1 = 0, be the centers of a covering of [−T, T ]n by closed Euclidean balls of radius δ. The number of balls in this covering is bounded by CT,n /δ n , where CT,n is a constant depending on T and n. Let Sk (δ) = B(tk , 2δ) in the Euclidean metric. Let s, t ∈ [−T, T ]n be such that |s − t| ≤ δ. It is easy to see that both s and t must lie in one of the balls Sk (δ). Therefore, |G(u) − G(v)| = sup
sup
k
|u−v|≤δ
u,v∈[−T,T ]n
sup
, |G(u) − G(v)|,
(7.109)
|u−v|≤δ
u,v∈Sk (δ)
and consequently P
sup |u−v|≤δ
u,v∈[−T,T ]n
|G(u) − G(v)| ≥ a(δ)
(7.110)
7.2 Processes on Rn
CT ,n /δ n
≤
P
k=1
303
|G(u) − G(v)| ≥ a(δ) .
sup |u−v|≤δ
u,v∈Sk (δ)
Take δj = θ−j and let ak (δj ) = median
sup
|G(u) − G(v)| + (2(1 + )nφ2 (δj ) log 1/δj )1/2 .
|u−v|≤δj
u,v∈Sk (δj )
(7.111) It follows from (5.152) that CT ,n /δjn P sup
|u−v|≤δj
k=1
|G(u) − G(v)| ≥ ak (δj )
(7.112)
u,v∈Sk (δj )
is a term in a convergent sequence. We show below that median
|G(u) − G(v)| = o(2nφ2 (δ) log 1/δ)1/2 ,
sup
(7.113)
|u−v|≤δ
u,v∈Sk (δ)
so that a(δj ) := (2(1 + 2 )nφ2 (δj ) log 1/δj )1/2 ≥ ak (δj )
(7.114)
for all j sufficiently large. It now follows from (7.110) and (7.112) that, for all j ≥ j(ω), |G(u) − G(v)| ≤ a(δj ).
sup
(7.115)
|u−v|≤δj
u,v∈[−T,T ]n
We can now interpolate as in the first part of this proof to get (7.108). To complete the proof we verify (7.113). It suffices to show it for the mean. We have E
sup
|G(u) − G(v)| ≤ 2E
|u−v|≤δ
u,v∈Sk (δ)
sup |u−tk |≤2δ
|G(u) − G(tk )|
≤ Cn
φ(2δ) + 0
1/2
(7.116)
φ(2δu) du u(log 1/u)1/2
by (7.88). Therefore, by the hypothesis (7.107), we get (7.113). Let
Iloc,φ (δ) := 0
1/2
φ(δu) du u(log 1/u)1/2
(7.117)
304
Moduli of continuity for Gaussian processes
and
Iunif,φ (δ) := 0
δ
φ(u) du. u(log 1/u)1/2
(7.118)
As we have already seen, these integrals play an important role in describing local and uniform moduli of continuity for Gaussian processes on R1 . We examine some of their properties in the next two lemmas, after we introduce the concept of regularly varying functions. A function f is said to be regularly varying at zero with index α if lim
x→0
f (ux) = uα f (x)
(7.119)
for all u ≥0. If α = 0, f is also said to be slowly varying at 0. A regularly varying function f at zero with index α can be written in the form x (u) f (x) = xα β(x) exp du , (7.120) u 1 where limu→0 (u) = 0 and limx→0 β(x) = C for some constant C = 0. A function f is said to be regularly varying at infinity with index α if (7.119) holds at infinity. If α = 0, f is also said to be slowly varying at infinity. A regularly varying function f at infinity with index α can be written in the form ∞ (u) f (x) = xα β(x) exp du , (7.121) u x where limu→∞ (u) = 0 and limx→∞ β(x) = C for some constant C = 0. In general, there are no smoothness properties imposed on β and hence on f . A subclass of regularly varying functions that are easier to work with are those in which β ≡ C. A function f is called a normalized regularly varying function at zero with index α if it can be written in the form x (u) α f (x) = Cx exp du (7.122) u 1 for some constant C = 0. When α = 0, f is also called a normalized slowly varying function at zero (see Bingham, Goldie and Teugels (1987)). A similar definition applies at infinity. Let f (x) be a regularly varying function as represented in (7.120) and write β(x) = β(0)h(x). Then f (x) is a normalized regularly varying function if the function h(x) can be absorbed into the exponential term.
7.2 Processes on Rn
305 x Writing h(x) = exp( 1 h (u)/h(u) du), we see that this can be done as long as xh (x)/h(x) → 0 as x → 0 or, equivalently, as long as xβ (x) → 0
xL (x) →0 L(x)
or, equivalently,
as x → 0, where
x
L(x) = β(x) exp 1
(u) du . u
(7.123)
(7.124)
Note that if f (x) is a slowly varying function at zero, f (1/x) is slowly varying at infinity. Lemma 7.2.4 A regularly varying function at zero, or infinity, with index not equal to zero is asymptotic to a monotonic function at zero, or infinity. Proof Suppose f is regularly varying index p = 0.
xat infinity with Then we can write f (x) = xp β(x) exp 1 ( (u)/u) du . Clearly f (x) ∼
x f((x) := β(∞)xp exp 1 ( (u)/u) du du at infinity. We have x (f((x)) = xp−1 (p + (x)) exp ( (u)/u) du , (7.125) 1
which is strictly positive or negative according to the sign of p, for all x sufficiently large. The behavior at zero is handled in the same way. Other properties of regularly varying functions that we use in this book are summarized in Section 14.7. Lemma 7.2.5 Let φ, φ(0) = 0, be continuous and nondecreasing on [0, 1/2] and satisfy φ(2u) ≤ 2φ(u). Then, for all δ > 0 sufficiently small, Iunif,φ (δ) ≤ 2Iloc,φ (δ) $ # √ Iloc,φ (δ) ≤ 2 2 Iunif,φ (δ) + φ(δ)(log 1/δ)1/2 .
(7.126) (7.127)
If φ is regularly varying at zero with index 0 < α ≤ 1, Iloc,φ (δ) ≤ O(φ(δ)). Proof
(7.128)
For (7.126) we note that δ φ(u/2) Iloc,φ (δ) = du (7.129) u(log 1/u − log 1/2δ)1/2 0 1 δ φ(u) 1 ≥ du > Iunif,φ (δ). 1/2 2 0 u(log 1/u − log 1/2δ) 2
306
Moduli of continuity for Gaussian processes
For (7.127), we have δ Iloc,φ (δ) = 0
φ(u/2) du u(log 1/u − log 1/2δ)1/2
(7.130)
δ2
φ(u/2) du u(log 1/u − log 1/2δ)1/2 δ du +φ(δ) u(log 1/u − log 1/2δ)1/2 2 δ 2 δ √ φ(u/2) 1/2 . ≤ 2 du + 2φ(δ)(log 1/δ) u(log 1/u)1/2 0
≤
0
Here the estimate of the second integral in (7.130) follows because log 1/u ≥ 2 log 1/δ on [0, δ 2 ]. The third integral is easy to evaluate and bound (its integrand is the derivative of −2(log 1/u − log 1/2δ)1/2 ). Using the fact established at the end of the proof of Lemma 7.2.2, that we can replace φ(u/2) by 2φ(u), we get (7.127). By (7.120), if φ is regularly varying at zero with index 0 < α ≤ 1, δ φ(uδ) (u) α β(uδ) = u exp du (7.131) φ(δ) β(δ) uδ u β(uδ) (7.132) ≤ uα exp sup (s) log 1/u . β(δ) 0<s≤δ Therefore, given η > 0 for all δ > 0 sufficiently small, φ(uδ) ≤ Cuα−η φ(δ) Thus
Iloc,φ (δ) ≤ Cφ(δ) 0
1/2
∀ u ∈ [0, 1/2].
(7.133)
uα−η du. u(log 1/u)1/2
(7.134)
For any α > 0 we can choose η so that α − η > 0. Consequently, (7.128) follows from (7.134). Remark 7.2.6 It is not so simple to evaluate Iloc,φ . In Lemma 7.6.5 we give examples in which Iloc,φ (δ) grows faster than φ(δ)(log log 1/δ)1/2 and more slowly than φ(δ)(log 1/δ)1/2 , the familiar growth rates for the local and uniform moduli of continuity of many Gaussian processes. When φ(δ)(log 1/δ)1/2 = O(Iunif,φ (δ))
(7.135)
it follows from Lemma 7.2.5 that Iloc,φ (δ) and Iunif,φ (δ) are comparable.
7.2 Processes on Rn
307
In this case it is easier to see what Iloc,φ (δ) looks like since it is generally simpler to estimate Iunif,φ (δ). A function f on R+ with f (0) = 0 is said to be of type A if f is regularly varying with index 1 and f (x)/x is nonincreasing on (0, δ] for some δ > 0. Let G = {G(x), x ∈ [−1, 1]} be a Gaussian process with stationary increments. Set 2 (|x − y|) = E(G(x) − G(y))2 . σ 2 (|x − y|) := σG
(7.136)
We obtain information on the lower bounds for uniform moduli of continuity of G under the following conditions: σ 2 (u) is concave for 0 ≤ u ≤ h for some h > 0.
(7.137)
σ 2 (u) is a regularly varying function at 0 with index 0 ≤ α < 1 or is of type A .
(7.138)
σ 2 (u) is a normalized regularly varying function at 0 with index 0 ≤ α < 1 or is a normalized regularly varying function of type A. (7.139) This is because, under these conditions, the covariance of nonoverlapping increments of G can be made close to zero, and hence, by Slepian’s Lemma, the increments can be treated as though they are independent. We develop some technical lemmas that enable us to exploit this idea. Lemma 7.2.7 Let G = {G(x), x ∈ [0, 1]} be a Gaussian process with stationary increments for which (7.137) holds and let 0 < a < b < c < d ≤ h. Then E(G(d) − G(c))(G(b) − G(a)) ≤ 0. Proof
(7.140)
We have E(G(d) − G(c))(G(b) − G(a)) (7.141)
2 1 2 2 2 = 2 σ (d − a) + σ (c − b) − σ (d − b) − σ (c − a) .
The lemma follows from the concavity of σ 2 since d−a > d−b, c−b < c−a and the midpoint of the line segment between σ 2 (d − a) and σ 2 (c − b) has the same x-coordinate as the midpoint of the line segment between σ 2 (d − b) and σ 2 (c − a).
Moduli of continuity for Gaussian processes
308
Alternately, we can write E(G(d) − G(c))(G(b) − G(a)) (7.142)
2 1 2 2 2 = 2 {σ (d − a) − σ (d − b)} − {σ (c − a) − σ (c − b)} and use the fact that for a concave function the increments over intervals of the same length (here b − a) is decreasing. Lemma 7.2.8 Let G = {G(x), x ∈ [0, 1]} be a Gaussian process with stationary increments for which (7.138) holds. Let ξk =
k G( k+1 N ) − G( N ) 1 σ( N )
k = 0, . . . , qN
(7.143)
for some q = q(N ) > 0 such that qN is an integer. Then, for j = k, #1$ (7.144) Eξj ξk ≤ h(q) + sup |j − k + 1|α β −1 N j,k # |j − k + 1| $ (j−k)/N # |j − k| $ (u) −β du exp β N N u 1/N for some function h(q) = o(q) as q ↓ 0 for all N = N (q) sufficiently large. Proof
For j, k = 0, . . . , qN ; j = k we have
Eξj ξk =
σ 2 ( |(j−k)+1| ) + σ 2 ( |(j−k)−1| ) − 2σ 2 ( |j−k| N N N ) . 1 2 2σ ( N )
(7.145)
We write σ 2 (|u|) = |u|α L(|u|) and make use of the fact that |u|α is concave to see, as in the previous lemma, that j − k + 1 α j − k − 1 α j − k α (7.146) + − 2 ≤ 0. N N N For a = ±1 we write |(j − k) + a| 2 σ = N
* |(j − k) + a| α L |(j − k)| (7.147) N N + |(j − k)| |(j − k) + a| −L + L N N
and use (7.146) to see that, for j, k = 0, . . . , qN ; j = k, # # $ # $$ |j−k+a| − L |j−k| |j − k + a|α L N N . Eξj ξk ≤ 2L(1/N ) a=±1
(7.148)
7.2 Processes on Rn 309 # $ x We write L(x) = β(x) exp 1 (u) u du := β(x)L(x) (see (7.120)) and use the rearrangement $ # L(x) L(y) β(x)L(x) − β(y)L(y) − 1 + (β(x) − β(y) = β(x) . L(y) β(c)L(c) β(c)L(c) (7.149) to see that, for j > k and a = 1, # $ # $ L |j−k+1| − L |j−k| N N (7.150) L(1/N ) # |j−k+1|/N (u) $ $ # $ # |j − k + 1| $# −1 1 =β β exp du − 1 N N u |j−k|/N |j−k|/N # |j − k| $ # |j − k + 1| $ (u) −β exp du +β N N u 1/N := F + G. We note that
|j−k+1|/N
exp |j−k|/N
(u) du u
−1≤
∗ (q) , |j − k|
where ∗ (q) := sup0
∗ (q) . |j − k|1−α−∗ (q)
(7.151)
(7.152)
(7.153)
For α < 1 this goes to zero as q goes to zero. When α = 1, using the additional hypothesis, we see that the left-hand side of (7.152), which is L(|j − k|/N )/L(1/N ), is bounded by one for all N sufficiently large. Thus, in this case, |j − k + 1| F also goes to zero as q goes to zero. When a = −1 we consider k > j and get exactly the same terms as above. Thus we obtain (7.144). We can use Lemmas 7.2.7 and 7.2.8 to give conditions under which a Gaussian process with stationary increments has an exact uniform modulus of continuity. Theorem 7.2.9 Let G = {G(x), x ∈ [0, 1]} be a Gaussian process with stationary increments for which (7.137) or (7.138) holds. Then G has
Moduli of continuity for Gaussian processes
310
an exact uniform modulus of continuity on [0, 1] with respect to the Euclidean metric, that is, (7.1) holds for some function ω with K = [0, 1] and τ (u, v) = |u − v|. ( Proof We show that limδ→0 E(δ) = ∞ (see Theorem 7.1.4 (2)). By Lemma 5.5.6, we need only show that there is an unbounded number of 1-distinguishable points in {(G(u) − G(v))/d(u, v), u, v ∈ [0, 1]}. Let ξk be as given in Lemma 7.2.8. Let q(N ) = M/N for some fixed integer M . Since E(ξj − ξk )2 = 2(1 − Eξj ξk ),
(7.154)
we see that {ξk }M k=1 are certainly 1-distinguishable when the last term in (7.144) goes to zero as N → ∞. To see this, note that the left-hand ∗ side of (7.152) is bounded by M (q) and hence the last term in (7.144) is bounded by # |j − k + 1| $ # |j − k| $ α+∗ (q) −β C β . M N N For M fixed, this goes to 0 as N → ∞ because β(x) is continuous at 0. Since M can be taken as large as we like, the proof is completed in the case when (7.138) holds. However, under (7.137), by Lemma 7.2.7, Eξj ξk ≤ 0, so, of course, the proof also holds in this case. We now take up the question of finding conditions under which we have equality in (7.106) and (7.108). Theorem 7.2.10 Let G = {G(x), x ∈ [0, 1]} be a Gaussian process with stationary increments for which (7.137) holds or (7.139) holds. Then lim
δ→0
sup |u−v|≤δ
|G(u) − G(v)| ≥1 (2σ 2 (|u − v|) log 1/|u − v|)1/2
a.s.
(7.155)
u,v∈[0,1]
Proof We give the proof under the assumption that (7.139) holds. The proof under assumption (7.137) is essentially the same, but slightly easier. We consider the setup in Lemma 7.2.8. For some > 0 we take q to be a positive constant that is small enough so that the right-hand side of (7.144) is less than . This is easy because, by hypothesis, the complicated term in (7.144) is identically zero. Let {Zn } be a standard normal sequence. By Theorem 5.1.4, Zk (7.156) sup ≥1− =1 lim P 1/2 n→∞ 1≤k≤n (2 log n)
7.2 Processes on Rn
311
for all > 0. Then, by Slepian’s Lemma, as in the proof of Corollary 5.5.2, ξk 1/2 lim P sup ≥ (1 − ) (1 − ) = 1 (7.157) 1/2 N →∞ 1≤k≤qN (2 log qN ) for all , > 0. Therefore, by first taking q small enough and then taking N large enough, we see that for all > 0, ξk sup ≥ (1 − ) = 1. (7.158) lim P 1/2 N →∞ 1≤k≤qN (2 log N ) It follows from (7.158) that |G(u) − G(v)| ≥1 2 (|u − v|) log 1/|u − v|)1/2 (2σ |u−v|→0 lim sup
a.s.,
(7.159)
which is a different way of writing (7.155). Remark 7.2.11 Apropos of requiring that σ 2 (h) is of type A when it is a regularly varying function of index one, when σ 2 is concave on [0, δ], it also has the property that σ 2 (x)/x is nonincreasing on [0, δ]. Actually, we need something less than this for the proofs. Writing σ 2 (x) = xL(x), where L is slowly varying, we require that lim suph→0 suph≤x≤δ (L(x)/ L(h)) < ∞ for some δ > 0. Adding still more conditions on the slowly varying part of σ 2 , we can go beyond regularly varying functions of index one. For example, suppose that {G(x), x ∈ [0, 1]} is a Gaussian process for which σ 2 (h) = |h|α for some 1 < α < 2 (the existence of such a process is shown in Example 6.4.7). Define ξk =
k G( k +1 N ) − G( N ) 1 σ( N )
k = 0, . . . , qN
(7.160)
for some integer . Then, for j = k and sufficiently large, Eξj ξk ≤
α(α − 1) . (|j − k|)2−α
(7.161)
Obviously this can be made arbitrarily close to zero by taking large enough. Thus we get (7.157) with q replaced by q/. This does not affect the proof of Theorem 7.2.10, so (7.155) holds in this case also. We obtain information on lower bounds for local moduli of continuity
Moduli of continuity for Gaussian processes
312
of G under conditions that are slightly different from those given in (7.137)–(7.139). We use the following conditions: σ 2 (t + h) − σ 2 (t) ≤ σ 2 (h) for t, h > 0, and for all > 0 there exists θ > 0 such that σ 2 (θu) ≤ σ 2 (u) for all |u| ≤ u0 ;
(7.162)
2
σ (u) is a normalized regularly varying function at 0 with index 0 < α < 2.
(7.163)
Theorem 7.2.12 Let G = {G(x), x ∈ [0, 1]} be a Gaussian process with stationary increments for which (7.162) or (7.163) holds. Then lim sup
δ→0 u≤δ
|G(u) − G(0)| ≥1 (2σ 2 (u) log log 1/u)1/2
a.s.
(7.164)
Proof Suppose that (7.163) holds. Let 0 < θ < 1 and tk = θk /N , k = 0, . . . , ∞ and set G(tk ) − G(0) ξk = . σ(tk )
(7.165)
We show that E ξj ξk = o(θ, N ),
(7.166)
where o(θ, N ) is a function such that, for all δ > 0, o(θ, N ) ≤ δ for all θ > 0 sufficiently small and for N ≥ N0 (θ, δ) for some N0 (θ, δ) sufficiently large. We write σ 2 (tk ) − σ 2 (tk − tj ) − σ 2 (tj ) σ(tj ) + E ξj ξk = 2σ(tj )σ(tk ) σ(tk )
(7.167)
and represent σ 2 (h) = hα L(h), where L(h) is given as in (7.124) with β ≡ C. Note that for j > k, tk j) | (u)| L(t ≤ exp du (7.168) k) u L(t tj ≤ θ−(j−k)
∗
(tk )
,
where ∗ (tk ) = supu≤tk | (u)|. Therefore, for j > k we have ∗ σ(tj ) ≤ (θj−k )α/2− (tk )/2 . σ(tk )
(7.169)
Thus, for N sufficiently large, σ(tj ) ≤ θα σ(tk )
(7.170)
7.2 Processes on Rn
313
for some α > 0. Considering (7.170), to obtain (7.166) we need only consider σ 2 (tk ) − σ 2 (tk − tj ) σ(tj )σ(tk ) ≤
(7.171)
α k ) − L(t k − tj )| tα (tk − tj )α |L(t k − (tk − tj ) L(tk ) + σ(tj )σ(tk ) σ(tj )σ(tk )
for j > k. Note that for θ sufficiently small α−1 α . tα k − (tk − tj ) ≤ 2αtj tk
(7.172)
Using this and (7.168), we see that the first term on the right-hand side in (7.171) is less than or equal to 1−α/2 1/2
2α
tj
L
(tk )
1−α/2 1/2 L (tj ) tk
≤ 2αθ(j−k)(1−α/2−
∗
(tk ))
.
(7.173)
The second term on the right-hand side in (7.171) is less than or equal to σ(tk ) σ(tj )
k − tj ) L(t 1− k) L(t
≤ 2
σ(tk ) tj ∗ (tk ) σ(tj ) tk
= α since 1−
k − tj ) L(t k) L(t
≤ 2 ≤ 2
1−α/2 1/2
L
tj
(tk )
1−α/2 1/2 L (tj ) tk
tk
| (u)| du u
tk −tj tj ∗ (tk )
tk
∗ (tk )
(7.174)
.
Thus this term is even smaller than the right-hand side of (7.173). Since θ can be taken arbitrarily close to zero, we get (7.166). The hypotheses in (7.162) immediately give E ξj ξk ≤ o(θ)
(7.175)
as is easily seen, since they imply that the first term on the right-hand side of (7.167) is less than or equal to zero, and, for all > 0, θ can be chosen such that σ(tj )/σ(tk ) ≤ . The proof of this theorem now follows the proof of Theorem 7.2.10 since for {Zk } a standard normal sequence lim sup k→∞
Zk =1 (2 log log θk /N )1/2
a.s.
(7.176)
Moduli of continuity for Gaussian processes
314
for all N > 1 and 0 < θ < 1. We now give a lower bound for the local modulus of continuity that may seem artificial but which is very useful in the study of moduli of continuity of associated Gaussian processes in Section 7.4. Lemma 7.2.13 Let G = {G(u), u ∈ [0, 1]} be a Gaussian process and let f (u), u ∈ [0, 1] be a continuous real-valued function. Assume that, for all s, t > 0 sufficiently small, E(G(t) − G(s))2 ≥ c2 |f 2 (t) − f 2 (s)|
(7.177)
for some c > 0. Let f2 (u) := sup0≤s≤u |f 2 (s) − f 2 (0)|. Then lim sup δ↓0 u≤δ
Proof
|G(u) − G(0)| ≥c 2 (2f (u) log log 1/f2 (u))1/2
a.s.
Note that (7.177) can be written as
2 E(G(t) − G(s))2 ≥ E cB(f 2 (t)) − cB(f 2 (s)) ,
(7.178)
(7.179)
where B is standard Brownian motion. By Theorem 7.2.12, lim sup δ↓0 u≤δ
|B(f2 (u))| ≥1 (2f2 (u) log log 1/f(u))1/2
a.s.
(7.180)
Let Uδ denote the flat spots of f2 , that is, Uδ = {u ∈ (0, δ] : ∃ u < u, f2 (u ) = f2 (u)}. Let Tδ = (0, δ] − Uδ . The supremum in (7.180) is effectively taken over Tδ . To see this, suppose that v ∈ Uδ . Let v = inf{v : f2 (v) = f2 (v )}. Then v ∈ Tδ and f2 (v ) = f2 (v ). Consequently, (7.180) also holds if we replace supu≤δ by sup{u≤δ}∩Tδ . Consider f2 (u). For each u ∈ Tδ , either f 2 (u) > f 2 (0) or f 2 (u) <
f 2 (0). Let Tδ,1 denote those u ∈ Tδ for which f 2 (u) > f 2 (0), and let Tδ,2 = Tδ − Tδ,1 . It follows from the Blumenthal zero–one law for Brownian motion, Lemma 2.2.9, that lim sup δ↓0
u≤δ
u∈Tδ,i
|B(f2 (u))| = Ci (2f2 (u) log log 1/f(u))1/2
a.s.
i = 1, 2
(7.181)
for some constants C1 and C2 . Furthermore, by (7.180), at least one of the constants Ci is greater than or equal to one. Suppose C1 ≥ 1. In this case, f2 (u) = f 2 (u) − f 2 (0). Since law
{B(f 2 (u) − f 2 (0)); u ∈ Tδ,1 } = {B(f 2 (u)) − B(f 2 (0)); u ∈ Tδ,1 },
7.2 Processes on Rn
315
we have |B(f 2 (u)) − B(f 2 (0))| ≥1 (2f2 (u) log log 1/f(u))1/2
lim sup δ↓0
u≤δ
u∈Tδ,1
a.s.,
(7.182)
so that, obviously, lim sup δ↓0 u≤δ
|B(f 2 (u)) − B(f 2 (0))| ≥1 (2f2 (u) log log 1/f(u))1/2
a.s.
(7.183)
The same argument gives (7.183) when C2 ≥ 1. Finally we note that x log log 1/x is increasing on [0, δ ] for some δ > 0. Therefore by Remark 7.1.11 (see, in particular, the last paragraph), we get (7.178). We now list conditions under which we obtain exact uniform moduli and uniform m-moduli of continuity with the precise value of the constant. Theorem 7.2.14 Let G = {G(x), x ∈ [−1, 1]} be a Gaussian process with stationary increments and let σ 2 be as defined in (7.136). If any of the following three sets of conditions holds: (1) σ 2 (h) is a normalized regularly varying function at zero with index 0 < α < 1 or is a normalized regularly varying function of type A; (2) σ 2 (h) is a normalized slowly varying function at zero that is asymptotic to an increasing function near zero and 1/2 $ # σ(δu) 1/2 ; (7.184) du = o σ(δ) (log 1/δ) u(log 1/u)1/2 0 (3) σ 2 (u) is concave for u ≤ h for some h > 0, and (7.184) holds; then lim sup
sup
δ→0
|u−v|≤δ
|G(u) − G(v)| =1 (2σ 2 (δ) log 1/δ)1/2
a.s.
(7.185)
u,v∈[−1,1]
and lim
δ→0
sup |u−v|≤δ
|G(u) − G(v)| =1 (2σ 2 (|u − v|) log 1/|u − v|)1/2
a.s.
(7.186)
u,v∈[−1,1]
Proof The lower bound in (7.186) for each of the three sets of conditions is given in Theorem 7.2.10. To obtain the upper bound in (7.185) we first note that we can replace
316
Moduli of continuity for Gaussian processes
φ in (7.108) by (1+ )σ for any > 0. That we can do this in condition (1) follows from the fact that, by Lemma 7.2.4, a regularly varying function of index α > 0 is asymptotic to an increasing function near zero. In condition (2) it is an hypothesis and it is obvious in condition (3). Thus, to complete the proof, we need only show that σ satisfies (7.104) and (7.107). That (7.104) holds under conditions (1) and (2) follows from (7.119). It is easy to see that it holds for concave functions. That (7.107) holds under condition (1) follows from (7.128). Finally, note that (7.107) is (7.184), so we simply add this as a hypothesis for conditions (2) and (3). Using Lemma 7.1.7 (1), we get equality in both (7.185) and (7.186).
Theorem 7.2.15 Let G = {G(x), x ∈ [−1, 1]} be a Gaussian process with stationary increments and let σ 2 be as defined in (7.136). If either of the following two sets of conditions holds: (1) σ 2 (h) is a normalized regularly varying function at zero with index 0 < α < 2; (2) σ 2 (h) is asymptotic to an increasing function near zero; σ 2 (t + h) − σ 2 (t) ≤ σ 2 (h), for all t, h > 0 sufficiently small; for all > 0 there exists θ > 0 such that σ 2 (θu) ≤ σ 2 (u), |u| ≤ u0 ; and 1/2 $ # σ(δu) 1/2 ; (7.187) du = o σ(δ) (log log 1/δ) 1/2 u(log 1/u) 0 then lim sup
δ→0 |u|≤δ
and lim sup
δ→0 |u|≤δ
|G(u) − G(0)| (2σ 2 (δ) log log 1/δ)1/2 |G(u) − G(0)|
(2σ 2 (|u|) log log 1/|u|)1/2
=1
=1
a.s.
a.s.
(7.188)
(7.189)
Proof The lower bound in (7.189) for each of the sets of conditions is given in Theorem 7.2.12. The upper bound in (7.188) follows from (7.106) once we show that (7.104) and (7.105) hold. That (7.104) holds under condition (1) follows from (7.119) and the fact that a regularly varying function of index α > 0 is asymptotic to an increasing function near zero. That it holds under condition (2) can be seen by taking h = t and noting that σ 2 ((1 + )t) ≤ σ 2 (t) + σ 2 ( t) and σ 2 ( t) ≤ σ 2 (t). That (7.105) holds under condition (1) follows from (7.128). Finally, note that (7.187) is (7.105), so we simply add this as a hypothesis for condition (2).
7.3 Processes with spectral densities
317
The proof is completed using Lemma 7.1.7. Example 7.2.16 It is easy to see that Theorems 7.2.15 and 7.2.14 give (2.14) and (2.15) for standard Brownian motion.
7.3 Processes with spectral densities We often require that the increments variance of a Gaussian process with stationary increments, that is, σ 2 in (7.136), is a normalized regularly varying function at zero. In the next theorem we show that when the process has a spectral density, which is a normalized regularly varying function at infinity, σ 2 has this property. To be more explicit, we take 4 ∞ sin2 λh/2 2 σ (h) = dλ, (7.190) π 0 θ(λ) where
∞
0
1 ∧ λ2 dλ < ∞. θ(λ)
(7.191)
Theorem 7.3.1 When θ is a regularly varying function at infinity with index 1 < p < 3, σ 2 (h) ∼ Cp
1 h θ(1/h)
where Cp =
4 π
∞
0
as h → 0,
sin2 s/2 ds. sp
When θ is a regularly varying at infinity with index 1, 2 ∞ 1 2 dλ as h → 0, σ (h) ∼ π 1/h θ(λ)
(7.192)
(7.193)
(7.194)
and it is a slowly varying function at zero. Furthermore, if θ is a normalized regularly varying function at infinity with index 1 ≤ p < 3, then σ 2 is a normalized regularly varying function. It follows from (7.192) that when θ is a regularly varying function at infinity with index 1 < p < 3, σ 2 (h) is a regularly varying function at zero with index p − 1. Proof
To obtain the asymptotic result (7.192), it suffices to show that lim h θ(1/h)
h→0
4 π
∞ K
sin2 λh/2 dλ = Cp θ(λ)
(7.195)
Moduli of continuity for Gaussian processes
318
because the contribution of the integral from 0 to K is O(h2 ). We write θ(λ) = λp L(λ), where L is a slowly varying function at infinity. By a change of variables, ∞ 2 ∞ 2 4 sin λh/2 sin s/2 L(1/h) 4 dλ = ds (7.196) h θ(1/h) π θ(λ) π sp L(s/h) K
=
4 π
Kh ∞
sin2 s/2 L(1/h) I(Kh,∞) (s) ds. sp L(s/h)
0
By (7.124), for Kh < 1, L(1/h) L(s/h) I(Kh,∞) (s) =
1/h (u) β(1/h) exp du I(Kh,∞) (s) β(s/h) u s/h # ∗ $ ∗ ≤ 2 s + s− , (7.197)
where ∗ = supu>K | (u)|. We take K large enough so that p − ∗ > 1, p + ∗ < 3, and the ratio of the two β terms is bounded by two. For this choice of K we can use the Dominated Convergence Theorem in (7.196) to get (7.195). We now assume that θ is a normalized regularly varying function with index 1 < p < 3 and show that σ 2 (h) is a normalized regularly varying function. It follows from (7.123) that this is the case if and only if lim h↓0
We write σ 2 (h)
4 π
=
0
M
h(σ 2 (h)) = p − 1. σ 2 (h)
(7.198)
sin2 λh/2 4 dλ + θ(λ) π
∞
M
sin2 λh/2 dλ (7.199) θ(λ)
:= σ12 (h) + σ22 (h) and note that (σ12 (h))
2 = π
M 0
λ sin λh 2 dλ ∼ h θ(λ) π
0
M
λ2 dλ θ(λ)
as h ↓ 0. Using integration by parts, we write d 1 4 ∞ λ 2 uh 2 du dλ σ2 (h) = − sin π M 0 2 dλ θ(λ) 1 4 M uh du − sin2 θ(M ) π 0 2 := σ32 (h) + σ42 (h).
(7.200)
(7.201)
7.3 Processes with spectral densities Integrating we see that σ32 (h)
2 =− π
∞ M
sin λh λ− h
d 1 dλ. dλ θ(λ)
Differentiating under the integral, we get ∞ 2 d 1 dλ (σ32 (h)) = (λh cos λh − sin λh) πh2 M dλ θ(λ)
319
(7.202)
(7.203)
(we justify this step and show that this integral exists at the end of the proof). We express θ(λ) = λp L(λ), where L is a slowly varying function at infinity. We have d 1 p + δ(λ) = − p+1 , dλ θ(λ) λ L(λ)
(7.204)
where δ(λ) = λL (λ)/L(λ). Note that the hypothesis that θ is a normalized regularly varying function at infinity implies that limλ→∞ δ(λ) = 0 (see (7.123)). Using (7.204) and a change of variables in (7.203), we see that 2hp−2 ∞ (s cos s − sin s) (p + δ(s/h)) 2 ds. (7.205) (σ3 (h)) = π sp+1 L(s/h) Mh Therefore, by (7.192), h(σ32 (h)) lim h↓0 σ 2 (h)
∞ 2 (s cos s − sin s) (p + δ(s/h))L(1/h) = lim ds h→0 Cp π M h sp+1 L(s/h) ∞ (s cos s − sin s) 2p ds. (7.206) = Cp π 0 sp+1
That we can take the limit under the integral sign follows from the same argument used in (7.197) and the fact that, for s ≥ 0, |(s cos s − sin s)| = O(s3 ∧ s). We leave it to the reader to show that (σ42 (h)) = O(h), and we show in (7.200) that the same is true for (σ12 (h)) . Consequently, ∞ h(σ 2 (h)) (s cos s − sin s) 2p ds. (7.207) lim = h↓0 σ 2 (h) Cp π 0 sp+1 It is easy to see that the constant on the right-hand side of (7.207) is equal to p − 1 since, when θ(λ) = λp , σ 2 (h) = Cp hp−1 , and in this case h(σ 2 (h)) /σ 2 (h) ≡ p − 1. As for the differentiation under the integral sign in (7.202), to obtain (7.203), let F (λ, h) = λ − ((sin λh)/h). Using the facts that |(1 − cos x)/x| ≤ 1 and |(sin x)/x| ≤ 1, we see that |(F (λ, h+∆)−F (λ, h))/∆| ≤ 2λ/h. Using this and (7.204), we see that we can use the Dominated Convergence Theorem to justify differentiating under the integral.
320
Moduli of continuity for Gaussian processes
To obtain (7.194) we write 4 ∞ sin2 λh/2 4 M/h sin2 λh/2 σ 2 (h) = dλ + dλ π 0 θ(λ) π M/h θ(λ) := σ52 (h) + σ62 (h).
(7.208)
As above, we express θ(λ) = λL(λ), where L is a slowly varying function at infinity. By Theorem 14.7.1, M/h 2 M2 λ 2 2 dλ ∼ C (7.209) σ5 (h) ≤ Ch θ(λ) L(M/h) 0 and σ62 (h)
=
2 π
∞
M/h
2 1 dλ − θ(λ) π
∞
M/h
cos λh dλ θ(λ)
(7.210)
:= σ72 (h) + σ82 (h). We show below that, for any > 0, we can choose M sufficiently large so that |σ82 (h)| ≤ σ72 (h), for h = h( ) sufficiently small. Also, it follows from Theorem 14.7.2 that σ52 (h) = o(σ72 (h)) as h → 0. Thus we get (7.194) but with the lower limit of the integral replaced by M/h. It also follows from Theorem 14.7.2 that this integral is a slowly varying function of h at zero. Consequently, −1 2 ∞ 1 2 ∞ 1 →1 (7.211) dλ dλ π M/h θ(λ) π 1/h θ(λ) as h ↓ 0. Thus we get (7.194), as stated. Let M = (k0 + 1/2)π in σ82 , where k0 is odd. Then, by a change of variables, we have ∞ cos s 2 2 |σ8 | = ds (7.212) πh M θ(s/h) 2 (k+3/2)π cos s = ds πh (k+1/2)π θ(s/h) k=k0 (k+3/2)π ∞ θ(s/h) 2 | cos s| 1− ds. = πh (k+1/2)π θ(s/h) θ((s + π)/h) k=k0 , k odd
Writing θ(s/h)/θ((s + π)/h) = ((θ(s/h)/θ(1/h)) (θ(1/h)/θ((s + π)/h)), we see that limh↓0 θ(s/h)/θ((s+π)/h) = s/(s+π). Thus, for all h0 , > 0, we can take k0 = k0 (h0 , ) sufficiently large such that ∞ 2 (k+3/2)π 1 2 |σ8 | ≤ ds (7.213) πh (k+1/2)π θ(s/h) k=k0 , k odd
7.3 Processes with spectral densities 2 ∞ 1 < ds = σ72 (h) πh (k0 +1/2)π θ(s/h)
321 (7.214)
for all h ∈ [0, h0 ]. Finally, we show that (7.198) holds when θ is a normalized regularly varying function with index p = 1. We return to the paragraph containing (7.199)–(7.206) and replace M by M/h and p by one. The critical term in determining the asymptotic behavior of the derivative of σ 2 is (7.205). We have ∞ cos s 4 h(σ32 (h)) = (1 + δ(s/h)) ds (7.215) πh M θ(s/h) 4 ∞ sin s (1 + δ(s/h)) ds − π M s2 L(s/h) 2 (h). := σ92 (h) + σ10
We showed in the previous paragraph that |σ92 (h)| = o (σ 2 (h)). By the same argument used in the first paragraph of this proof we see that 2 (h)| = O(1/L(1/h)), which is o (σ 2 (h)) by Theorem 14.7.2. The |σ10 other terms are not difficult to deal with using ideas already contained in this proof. Theorem 7.3.1 allows us to extend the lower bounds for the moduli of continuity of Gaussian processes with stationary increments given in Theorems 7.2.10 and 7.2.12 to Gaussian processes that have regularly varying spectral densities. Theorem 7.3.2 Let G = {G(x), x ∈ [0, 1]} be a Gaussian process with stationary increments that has a spectral density that is regularly varying at infinity with index −p for 1 < p < 3. Then lim sup
δ→0 u≤δ
|G(u) − G(0)| (2σ 2 (u) log log 1/u)1/2
≥1
a.s.
(7.216)
Proof Let 1/θ(λ) denote the spectral density of G, that is, 1/θ(λ) = h(λ/π) in (5.256), and assume that (7.191) holds. The increments variance σ 2 is given by (7.190) and, as in (7.192), σ 2 (h) ∼ Cp
1 h θ(1/h)
as h → 0.
(7.217)
Since θ is regularly varying, it has the form 1 β(λ) = p L(λ), θ(λ) λ
(7.218)
Moduli of continuity for Gaussian processes
322
is a normalized slowly varying function at infinity. Define where L 1 β(λ0 ) β(λ) L(λ)I{λ≥λ0 } . = p L(λ)I {λ<λ0 } + λ λp θ(λ)
(7.219)
Clearly, for all > 0, we can take λ0 = λ0 ( ) sufficiently large so that (1 − )
1 1 1 ≤ ≤ (1 + ) θ(λ) θ(λ) θ(λ)
(7.220)
for all λ > 0. Note that θ is a normalized regularly varying function at infinity of index 1 < p < 3. Furthermore, since 1/θ(λ) is the spectral density of a Gaussian process with stationary increments, 1/θ can also be taken to be the spectral density of a Gaussian process with stationary increments, (this is because 1/θ is equal to 1/θ on [0, λ0 ) and is integrable on say G 2 be the increments variance of this process. By (7.192), [λ0 , ∞)). Let σ (1 − 2 ) σ 2 (h) ≤ σ 2 (h) ≤ (1 + 2 )σ 2 (h)
for all h ∈ [0, h0 ]
(7.221)
for some h0 > 0 sufficiently small. By Theorem 7.3.1, σ 2 is a normalized regularly varying function at zero with index 0 < α < 2. Therefore, by Theorem 7.2.12, lim sup
δ→0 u≤δ
|G(u) − G(0)| (2 σ 2 (u) log log 1/u)1/2
≥1
a.s.
(7.222)
The inequality in (7.216) now follows from (7.221), Lemma 7.1.10, and Remark 7.1.11. Theorem 7.3.3 Let G = {G(x), x ∈ [0, 1]} be a Gaussian process with stationary increments that has a spectral density that is regularly varying at infinity with index −p for 1 ≤ p < 2. Then lim
sup
δ→0 |u−v|≤δ
(2σ 2 (|u
|G(u) − G(v)| ≥1 − v|) log 1/|u − v|)1/2
a.s.
(7.223)
Let 1/θ(λ) denote the spectral density of G. If θ(λ) is regularly varying at infinity with index 2 and θ(λ)/λ2 is nonincreasing as λ → ∞, (7.223) continues to hold. Proof We use the notation in the first paragraph of the proof of Theorem 7.3.2. If β(λ) ≥ β(∞) for all λ ≥ λ0 , for some λ0 , we write β(λ) 1 β(∞) = (7.224) L(λ)I{λ<λ0 } + L(λ)I{λ≥λ0 } θ(λ) λp λp β(λ) − β(∞) L(λ)I{λ≥λ0 } . + λp
7.3 Processes with spectral densities
323
If there is no λ0 with this property, there exists a sequence of points {λk }, with limk→∞ λk = ∞, such that β(λ) − β(λk ) ≥ 0 for all λ ≥ λk . Choose one of these points, relabel it λ0 , and write 1 β(λ) − β(λ0 ) 1 L(λ)I{λ≥λ0 } . + = θ(λ) λp θ(λ)
(7.225)
We use this formulation because we want to write 1/θ(λ) as a sum of positive terms. We now see that whichever way we write 1/θ we can decompose it as 1/θ(λ) = 1/θ1 (λ) + 1/θ2 (λ), where both 1/θ1 (λ) and 1/θ2 (λ) can be spectral densities of independent Gaussian processes with stationary increments. Denote these process G1 and G2 . By checking their covariances, we see that law
{G(x), x ∈ [0, 1]} = {G1 (x) + G2 (x), x ∈ [0, 1]}.
(7.226)
Let σ12 and σ22 denote the increments variance of G1 and G2 . It is easy to see, using (7.192) and (7.194), that for all > 0 we can take λ0 such that (1− )σ 2 (h) ≤ σ12 (h) ≤ σ 2 (h)
for all h ∈ [0, h0 ]. (7.227) By Theorem 7.3.1, σ12 is a normalized regularly varying function at zero with index 0 ≤ α < 1. Therefore, by Theorem 7.2.10, lim
δ→0
sup |u−v|≤δ
u,v∈[0,1]
and σ22 (h) ≤ σ 2 (h)
|G1 (u) − G1 (v)| ≥ (1 − )1/2 (2σ12 (|u − v|) log 1/|u − v|)1/2
a.s. (7.228)
It follows from (7.228) that there exists an infinite sequence of points (uk , vk ) with limk→∞ |uk − vk | = 0 for which the inequality is realized. Clearly |G(uk ) − G(vk )| = |(G1 (uk ) − G1 (vk )) + (G2 (uk ) − G2 (vk ))|. (7.229) Since G1 and G2 are independent and symmetric, we see that, at least with probability 1/2, |G(uk ) − G(vk )| ≥ |G1 (uk ) − G1 (vk )| infinitely often. Consequently, (7.228) holds with probability at least 1/2 when G1 is replaced by G. Since can be taken arbitrarily small, we see that (7.223) holds with probability at least 1/2. It follows from Lemma 7.1.1 that (7.223) holds almost surely. We can handle the case when the spectral density is regularly varying of index 2 as long as σ12 (h) is also of type A. We leave it to the reader to show that this is the case when θ(λ)/λ2 is nonincreasing as λ → ∞.
324
Moduli of continuity for Gaussian processes
Remark 7.3.4 In Theorem 7.3.3, if one restricts the hypothesis to a Gaussian process G with stationary increments that has a spectral density that is regularly varying at infinity with index −p for 1 < p ≤ 2, there is equality in (7.223). This follows because, by (7.192) and Lemma 7.2.4, the increments variance of G is asymptotic to an increasing regularly varying function with index 0 < α ≤ 1. Since it is regularly varying, (7.104) is satisfied and, by (7.128), so is (7.107). Therefore, by Corollary 7.2.3, we get that 1 is an upper bound in (7.223).
7.4 Local moduli of associated processes In this section we obtain a lower bound for the local modulus of continuity of associated, and related, Gaussian processes that hold under weaker conditions than similar results for more general Gaussian processes. This is important for us because this is used in Section 9.5 to obtain local moduli of continuity of local times. An associated Gaussian process G = {G(t), t ∈ S} is a mean zero Gaussian process with a covariance that is the 0-potential density of a strongly symmetric Borel right process X with continuous 0-potential densities (S is the state space of X). By Lemma 3.4.3, the covariance of G has the remarkable property that EG(s)G(t) ≤ (EG2 (s)) ∧ (EG2 (t)).
(7.230)
It follows from (7.230) that E(G(t) − G(s))2
= EG2 (t) + EG2 (s) − 2EG(s)G(t) (7.231) ≥
|EG2 (t) − EG2 (s)|.
This inequality is not interesting when G is stationary since EG2 (t) = EG2 (s), but it is very interesting for processes with G(0) = 0 for some 0 ∈ S, in which case it can be written as 2 2 (t) − σG (s)|, E(G(t) − G(s))2 ≥ |σG
(7.232)
2 σG (t) := E(G(t) − G(0))2 .
(7.233)
where
Remark 7.4.1 Let X be a strongly symmetric Borel right process with state space S and assume that X has local times. Let T be a terminal time for X. Let uT (x, y) be as give in (2.138) and assume that it is continuous on S × S. Suppose G is a Gaussian process with covariance
7.4 Local moduli of associated processes
325
uT (x, y). Then, by (3.201) and symmetry, uT (x, y) ≤ uT (x, x) ∧ uT (y, y).
(7.234)
In other words, the covariance of G satisfies (7.230) and consequently (7.232). We see by (7.232) that the hypotheses of Lemma 7.2.13 are satisfied and we have a lower bound for the local modulus of continuity of G. Before pursuing this we show how to obtain a result similar to (7.232) for stationary associated Gaussian processes. ( = {G(t), ( Lemma 7.4.2 Let G t ∈ S}, where S is an Abelian group, be a mean zero associated stationary Gaussian process with covariance u(x, y) = u(x − y). Then 2 ( − G(s)) ( E(G(t) ≥
Proof
u(t) + u(s) 2 |σ ((t) − σ 2((s)|. G G 2u(0)
(7.235)
We begin by noting that σ 2((x) = 2(u(0) − u(x))
(7.236)
σ 2 (y) − σ 2 (x) ( ( . G u(x) − u(y) = G 2
(7.237)
u(x) ( ( G(0). G(x) := G(x) − u(0)
(7.238)
u(x)u(y) E G(x)G(y) = u(x − y) − . u(0)
(7.239)
G
so that
Consider
Note that
By Remark 3.8.3, G is the mean zero Gaussian process with covariance uT0 (x, y). Therefore, by Remark 7.4.1 and (7.237), 2
2
(7.240) E(G(x) − G(y))2 ≥ |E(G (x)) − E(G (y))| 1 (u(x) + u(y)) 2 |u2 (x) − u2 (y)| = |σ ((x) − σ 2((y)|. = G G u(0) 2u(0) Also, (u(x) − u(y))2 2 ( ( ≤ E(G(x)− G(y)) . u(0) (7.241) Using (7.240) and (7.241), we get (7.235).
2 ( ( E(G(x)−G(y))2 = E(G(x)− G(y)) −
326
Moduli of continuity for Gaussian processes
( be as in Lemma 7.4.2 and G be as defined in Remark 7.4.3 Let G (7.238). It follows from (7.237) and (7.235) that $2 # 2 2 2 (y) − σ (x) σ (u(x) − u(y)) ( ( G G = (7.242) u(0) 4u(0) $2 # 2 ( ( u(0) E(G(x) − G(y)) . ≤ 2 (u(x) + u(y)) Furthermore, by stationarity, 2 ( ( E(G(x) − G(y)) = σ 2((x − y). G
(7.243)
It follows from the equality in (7.241), (7.242), and (7.243) that u(0)σ 2 (x − y) ( 2 2 G (7.244) E(G(x) − G(y)) ≥ σ ((x − y) 1 − G (u(x) + u(y))2 and from (7.244) that E(G(t + x) − G(t))2 ∼ σ 2((x) G
as
|x| → 0.
(7.245)
We define G to be the class of Gaussian processes with a continuous covariance that has at least one of the following properties: (1) It is associated and stationary. (2) It has covariance of the form uT (x, y) for some terminal time T and is equal to zero at 0. When a Gaussian process has property (1), we sometimes say that it is Category (1) and similarly when it has property (2). For Gaussian processes in class G, we obtain the lower bound on the local modulus of continuity given in Theorem 7.2.12 more simply and with less stringent conditions on the covariance of the processes. We first note the following lemma. Lemma 7.4.4 Let G = {G(x), x ∈ R1 } be a Gaussian process in class G 2 (u) := sup|s|≤u σ 2 (|s|). Then and let σ 2 (x) := E(G(x) − G(0))2 . Let σ lim sup δ↓0 u≤δ
|G(u) − G(0)| ≥1 (2 σ 2 (u) log log 1/ σ 2 (u))1/2
a.s.
(7.246)
Proof Suppose that G is in Category (2). Then (7.177) holds with c = 1 by (7.231) and the fact that G(0) = 0. Thus, by Lemma 7.2.13, we get (7.246).
7.4 Local moduli of associated processes
327
When G is in Category (1), it follows from (7.235) and the continuity of u(x) that, for any > 0, we have (7.177) for some c = c( ) such that lim→0 c( ) ↑ 1. As above, this leads to (7.246) but with the right-hand side replaced with c( ). Since this is true for all > 0, we obtain (7.246).
Theorem 7.4.5 Let G = {G(x), x ∈ R1 } be a Gaussian process in class G and let σ 2 (x) := E(G(x) − G(0))2 . Suppose that σ(u) = O(|u|α ) for some α > 0. Then lim sup δ↓0 u≤δ
|G(u) − G(0)| ≥1 (2σ 2 (u) log log 1/u)1/2
a.s.
(7.247)
Proof This follows from Lemma 7.4.4 and the fact that u log log 1/u is increasing for u ∈ [0, δ] for some δ > 0. Therefore, since σ (u) ≥ σ(u), we can replace σ by σ in the denominator of (7.246) and then use the bound on σ in the hypothesis of this theorem to obtain (7.247). The advantage of this result over Theorem 7.2.12 is that here we have a simple bound on σ at zero rather than a smoothness condition. (One can weaken the condition on σ a bit more since we only require that log log 1/u < (1 + ) log log 1/σ(u) for all > 0.) Remark 7.4.6 All we used about processes G in Category (2) is (7.230) and the fact that G(0) = 0. It is not necessary for the covariance of G to be related to the potential density of a terminal time. We can also weaken the condition that G(0) = 0. It follows from Lemma 7.2.13 and (7.231) that when (7.230) holds lim sup δ↓0 u≤δ
|G(u) − G(0)| ≥1 (2 σ 2 (u, 0) log log 1/ σ (u, 0))1/2
a.s.,
(7.248)
where σ 2 (u, 0) = sup |EG2 (s) − EG2 (0)|
(7.249)
0≤s≤u
as long as σ 2 (u, 0) > 0 for u > 0. When this does not hold we can get interesting bounds by taking the limsup in (7.248) over the set where it does hold. It is important to understand how special are the Gaussian processes in class G. If G satisfies (7.230) we show in (7.231) that |EG2 (t) − EG2 (s)| ≤ E(G(t) − G(s))2 .
(7.250)
328
Moduli of continuity for Gaussian processes
Suppose that we don’t know whether G is an associated process. In this case all we can do is write G2 (t) − G2 (s) = (G(t) − G(s))(G(t) + G(s)) and use the Schwarz inequality to get |EG2 (t)−EG2 (s)| ≤ (E(G(t)−G(s))2 )1/2 (E(G(t)+G(s))2 )1/2 . (7.251) ( is an associated stationary Gaussian process with covariance When G u(t, s) = u(t − s), we have, by (7.235), |σ 2((t) − σ 2((s)| ≤ G
G
2u(0) σ 2((t − s). u(t) + u(s) G
(7.252)
( simply as a Gaussian process, we can apply the Schwarz Treating G ( ( ( inequality to (G(x) − G(y)) G(0) to obtain (G (t − s). |u(t) − u(s)| ≤ u1/2 (0) σ
(7.253)
By (7.237), this is equivalent to 2 2 (t) − σ (G (s)| ≤ 2u1/2 (0) σ (G (t − s). |( σG
(7.254)
Here are some more properties that distinguish the class G from general Gaussian processes. Lemma 7.4.7 Let G = {G(x); x ∈ R1 } be a Gaussian process with 2 (x) ≡ 0. Then stationary increments in class G for which σG 2 |x| = O(σG (x))
and for all > 0 √ σG (2x) ≤ 2 (1 + ) σG (x)
as
∀ |x|
x → 0,
(7.255)
sufficiently small.
(7.256)
2 (x)/|x| = 0, that is, (7.255) does not Proof Suppose that lim inf x→0 σG 1 2 2 (−y), we may assume that y > 0. hold. Fix y ∈ R . Since σG (y) = σG 2 2 (x)/x ≤ and sup0≤z≤x σG (z) ≤ Fix > 0 and choose x > 0 such that σG y. Write y = kx + z for some positive integer k and 0 ≤ z ≤ x. Now suppose that G is in Category (2) in the definition of G. By 2 2 2 2 (7.232), which implies that σG is subadditive, σG (y) ≤ kσG (x) + σG (z). Hence 2 2 2 σG kσG (y) (x) σG (z) ≤ + ≤ 2 . (7.257) y kx y 2 ≡ 0, which is a contraSince this is true for any > 0, we see that σG diction. Suppose now that G is in Category (1). We use the same construction
7.4 Local moduli of associated processes
329
as in the first paragraph of this proof, except that now we choose y ∈ R1 so that sup|z|≤|y| σ 2 (z) ≤ u(0). It follows from (7.235) and (7.236) that 2u(0) 2 2 2 (x) σG (y) ≤ σG ((k − 1)x + z) + σG u((k − 1)x + z) + u(x) 2 = σG ((k − 1)x + z) 2u(0) 2 + 2 ((k − 1)x + z) + σ 2 (x)) σG (x) 2u(0) − (1/2)(σG G 2 = σG ((k − 1)x + z) 1 2 + 2 ((k − 1)x + z) + σ 2 (x)) σG (x) 1 − (1/4u(0))(σG G 2 2 ((k − 1)x + z) + 2σG (x). ≤ σG
(7.258)
Hence, by induction, 2 2 2 σG (y) ≤ 2kσG (x) + σG (z).
(7.259)
As in (7.257), we see that = 0 for x ∈ [0, δ] for some δ > 0 (this is because of the restriction on y). However, since G has stationary 2 ≡ 0, again a contradiction. increments, we see that σG The inequality in (7.256) follows easily from (7.232) when G is in Category (2) and holds with = 0. To obtain (7.256) when G is in Category (1), let z = x and k = 1 in (7.258) and use the first equality to see that, for any > 0, 1 2 2 2 σG σG (2x) ≤ σG (x) + (x) (7.260) 1− 2 σG (x)
for all |x| sufficiently small. This gives (7.256). 2 For general Gaussian processes we can only say that |x|2 = O(σG (x)) as x → 0 and σG (2x) ≤ 2σG (x) for all x.
7.4.1 Gaussian processes associated with L´ evy processes Gaussian processes in class G, defined on page 326, can be obtained from L´evy processes with various stopping times. Let X = {X(t), t ∈ R+ } be a real-valued symmetric L´evy process, that is,
where
EeiλX(t) = e−tψ(λ) ,
(7.261)
∞ ψ(λ) = 2 (1 − cos λu)ν(du)
(7.262)
0
330
Moduli of continuity for Gaussian processes
for ν a symmetric L´evy measure, that is, ν is symmetric, and ∞ (1 ∧ x2 ) ν(dx) < ∞.
(7.263)
0
We assume that
∞
1
1 dλ < ∞. ψ(λ)
(7.264)
Let uα (x, y) denote the α-potential density of X, 1 ∞ cos λ(x − y) dλ uα (x, y) = π 0 α + ψ(λ)
(7.265)
(as given in (4.84)). For α ≥ 0, we define Gα = {Gα (x), x ∈ R1 } to α be a mean zero stationary Gaussian process with covariance 1 −1 u (x, y), with the understanding that we only consider G0 when 0 ψ (λ) < ∞. When α > 0, uα (x, y) is the 0-potential density of X killed at the end of an independent exponential time with mean 1/α, as in Section 3.5. When u0 (x, y) is finite, X is a transient process. Set σα2 (x − y)
:= E(Gα (x) − Gα (y))2 = =
(7.266)
u (x, x) + u (y, y) − 2u (x, y) ∞ 1 4 λ(x − y) dλ sin2 π 2 α + ψ(λ) α
α
α
0
and note that EG2α (x)
1 = π
∞
1 dλ. α + ψ(λ)
(7.267)
0
It follows from (4.77) that σ02 ( · ) exists for all L´evy processes satisfying (7.264). 0 = {G 0 (x), x ∈ R1 } to be a mean zero Gaussian process We define G with covariance
0 (y) = 1 σ 2 (x) + σ 2 (y) − σ 2 (x − y) , 0 (x)G (7.268) EG 0 0 0 2 (it follows from (5.252) that the right-hand side of (7.268) is positive 0 (x) − definite). Using (7.268), we see that G(0) = 0 and that E(G 0 has stationary increments. We only 0 (y))2 = σ 2 (x − y), so that G G 0 0 when 1 ψ −1 (λ) = ∞. In this case, X is recurrent and consider G 0 0 has covariance uT (x, y) (see (4.86) and note that σ 2 (x) = 2φ(x) in G 0 0 (4.87)).
7.4 Local moduli of associated processes
331
We also consider uα (x) Gα (0) uα (0)
Gα (x) = Gα (x) −
(7.269)
for α ≥ 0. When α = 0, X is transient, and, by Remark 3.8.3, G0 has covariance uT0 (x, y). When α > 0, by Section 3.5, uα is the 0-potential of X killed at the end of an independent exponential time with mean 1/α, so Gα has covariance uα T0 (x, y). It is clear that the processes Gα , α ≥ 0 are in Category (1) of G, and 0 are in Category (2) of G (note that Gα the processes Gα , α ≥ 0, and G does not have stationary increments). It follows from (7.235), that for α > 0, 2uα (0) σ 2 (x − y) (7.270) + uα (y) α 1 1 and also for α = 0 when 0 ψ −1 (λ) < ∞. When 0 ψ −1 (λ) = ∞, by 0 , (7.231) applied to G |σα2 (x) − σα2 (y)| ≤
uα (x)
20 (x) − E G 20 (y)| ≤ E(G 0 (x) − G 0 (y))2 = σ02 (x − y) |σ02 (x) − σ02 (y)| = |E G (7.271) and similarly 2
2
|EGα (x) − EGα (y)| ≤ E(Gα (x) − Gα (y))2 .
(7.272)
Lemma 7.4.8 For all α ≥ 0, |x| = o(σα2 (x))
as
x → 0.
(7.273)
For all > 0 and α > 0 there exists an h(α ) > 0 such that σα2 (x) ≤ σ02 (x) ≤ (1 + ) σα2 (x) and σα (2x) ≤
√
2 (1 + ) σα (x)
∀α ∈ [0, α ] and |x| ≤ h(α )
(7.274)
∀α ∈ [0, α ] and |x| ≤ h(α ).
(7.275)
To appreciate the significance of (7.275) compare it with (7.97). Also compare (7.273) with (7.255). Proof It suffices to prove (7.273) for α > 0. By (4.76) for all > 0 there exists a λ0 such that ψ(λ) ≤ λ2 for all λ ≥ λ0 . Let N ≥ λ0 be that α/N 2 ≤ . Using the fact that, for 0 ≤ x ≤ π/2, sin x = such x cos u du ≥ x cos x, we see that, for x < 1/N , 0 σα2 (x)
=
4 π
∞ sin2 0
1 λx dλ 2 α + ψ(λ)
(7.276)
Moduli of continuity for Gaussian processes
332
≥
x2 cos2 (1/2) π
1/x
λ2 dλ α + λ2
N
≥ ≥
x2 cos2 (1/2) π
1/x
1 dλ α/λ2 + N x2 cos2 (1/2) 1 −N , 2 π x
which implies (7.273). The first inequality in (7.274) is trivial. For the second we note that σ02 (x) = σα2 (x) +
4 π
∞ sin2
α λx dλ. 2 ψ(λ)(α + ψ(λ))
(7.277)
0
Let λ0 be such that ψ(λ) ≥ N0 for λ ≥ λ0 . This exists by (4.78) and (4.74). Then ∞
α λx dλ ≤ x2 sin 2 ψ(λ)(α + ψ(λ))
λ0
2
0
0
α 2 λ2 dλ + σ (x). ψ(λ) N0 α
(7.278)
The second inequality in (7.274) now follows from (7.277), (7.278), (4.77), and (7.273). The inequality in (7.275) follows from (7.270) and (7.271), each used in the different cases that we must consider. We also use the fact that σα (x) increases as α ↓ 0. In (7.274) we showed that the σα are asymptotically equivalent at zero. Using this and Theorem 7.3.1, we get the following useful estimates. Lemma 7.4.9 When ψ(λ) in (7.261) is regularly varying at infinity with index 1 < p ≤ 2, σα2 (h) ∼ Cp
1 h ψ(1/h)
for all 0 ≤ α < ∞, where 4 Cp = π
0
∞
as h → 0
sin2 s/2 ds. sp
(7.279)
(7.280)
When ψ(λ) in (7.261) is regularly varying at infinity with index 1, 2 ∞ 1 2 dλ as h → 0 (7.281) σα (h) ∼ π 1/h ψ(λ)
7.4 Local moduli of associated processes
333
for all 0 ≤ α < ∞. The next lemma, which is essentially a version of (7.192) and (7.194) of Theorem 7.3.1 at infinity, shows that we can find regularly varying L´evy exponents. Lemma 7.4.10 Let ψ(λ) be as given in (7.261) and (7.262), and let the L´evy measure ν be of the form ∞ 1 ν([u, ∞)) = dx, (7.282) θ(x) u where θ is a regularly varying function at zero with index 1 < p < 3. Then πCp as λ → ∞, (7.283) ψ(λ) ∼ λ θ(1/λ) where Cp is given in (7.280). When θ is a regularly varying at zero with index 3, 1/λ 2 u du as λ → ∞ (7.284) ψ(λ) ∼ 2 θ(u) 0 and is a slowly varying function at infinity. Furthermore, if θ is a normalized regularly varying function at infinity with index 1 < p ≤ 3, then ψ is a normalized regularly varying function. Proof
Using (7.262) and (7.283), we see that ∞ sin2 λu/2 ψ(λ) = 4 du. θ(u) 0
(7.285)
Compare this with (7.190). The proofs of (7.283) and (7.284) are essentially the same as the proofs of (7.192) and (7.194). We simply consider the behavior of θ at zero rather than at infinity. In Theorem 7.3.1, the critical case is when θ(λ) is regularly varying at infinity with index 1. Here, because of (7.284), it is when θ(λ) is regularly varying at zero with index 3. In this section we define Gα to be a stationary Gaussian process and 0 (0) = 0. 0 to be a Gaussian process with stationary increments, with G G We see in the next lemma, which we use in Section 9.5, that these processes are closely related. Lemma 7.4.11 For α > 0, let Gα be as defined on page 330. Set α (x) = {Gα (x) − Gα (0), x ∈ R1 } and let Hα = {Hα (x), x ∈ R1 } be a G
334
Moduli of continuity for Gaussian processes
mean zero Gaussian process with stationary increments, independent of α , with Hα (0) = 0, satisfying G 4 E(Hα (x) − Hα (y)) = π
∞ sin2
2
α λ(x − y) dλ. (7.286) 2 ψ(λ)(α + ψ(λ))
0
0 , as defined on page 330, Then, for G 0 (x), x ∈ R1 } = {G α (x), x ∈ R1 } + {Hα (x), x ∈ R1 } {G law
(7.287)
and E(Hα (x + h) − Hα (x))2 = o (σ02 (|h|))
as h → 0.
(7.288)
α , and Hα can be represented as in (5.257) 0 , G Proof The processes G with h(λ) equal, respectively, to 1/ψ(λ), 1/(α + ψ(λ)), α/(ψ(λ)(α + ψ(λ))), and other obvious minor modifications. The equality in law is then easily verified by computing the covariances of the three processes. By (7.287), the left-hand side of (7.288) is equal to σ02 (|h|) − σα2 (|h|), so (7.288) follows from (7.274). For the Gaussian processes associated with L´evy processes that we consider in this subsection, we get the following simplification of Theorem 7.4.5. Theorem 7.4.12 Let X be a real-valued symmetric L´evy process as defined in (7.261) satisfying (7.264). Let G denote any of the Gaussian 0 , processes associated with X considered in this subsection, that is, G Gα for α ≥ 0, and Gα for α ≥ 0. Let 4 σ (x) = π
∞ sin2
2
λ(x − y) 1 dλ 2 ψ(λ)
(7.289)
0 α
and assume that σ(u) = O|u| for some α > 0. Then lim sup δ↓0 u≤δ
|G(u) − G(0)| ≥1 (2σ 2 (u) log log 1/u)1/2
a.s.
(7.290)
Proof This theorem is a simple consequence of Theorem 7.4.5 since all these processes are in class G. The only difference between (7.247) and (7.290) is that in (7.247) the increments variance of the different Gaussian processes enters into the denominator. We get (7.290) because 2 they are all asymptotically equivalent at zero. Note that 1 σ−1(x) is the 0 or G0 , according to whether ψ (λ) = ∞ increments variance of G 0 or 0. Its equivalence with the increments variance of Gα for α > 0 is
7.4 Local moduli of associated processes
335
given in (7.274). Its equivalence with the increments variance of Gα , α ≥ 0, is given in (7.245). Combining Theorem 7.2.14 and Lemma 7.4.11, we can find uniform and local moduli of continuity for different Gaussian processes related to symmetric p-stable L´evy processes. Example 7.4.13 Consider the Gaussian processes of the type Gα , α > 0 0 with increments variance σ 2 , α ≥ 0 and with ψ(λ) = λp , 1 < p < and G α 2. For α > 0, processes of the type Gα are associated with symmetric p-stable processes killed at the end of an independent exponential time 0 have covariance uT (x, y). with mean 1/α. Processes of the type G 0 We show here that % |Gα (x) − Gα (0)| lim sup = 2Cp a.s. (7.291) δ→0 |x|≤δ (|x − y|p−1 log log 1/|x − y|)1/2 x∈I
and lim sup
δ→0 |x−y|≤δ x,y∈I
% |Gα (x) − Gα (y)| = 2Cp p−1 1/2 (|x − y| log 1/|x − y|)
a.s.,
(7.292)
0 . where Cp is given in (7.280) and similarly with Gα replaced by G We point out in Example 7.2.16 that (7.291) and (7.292) also hold for Brownian motion, in which case p = 2 (we show in Remark 4.2.6 that C2 = 1). This is because the Gaussian process associated with the 2-stable process, killed the first time it hits zero, is Brownian motion. It follows from Lemma 2.5.1 that the Gaussian process associated with Brownian motion, killed the first time it hits zero, is the 2-stable process. To verify these statements we first note that, by (7.268), σ 2 (h) = 0 G σ02 (h), and by Example 4.2.5, σ02 (h) = Cp |h|p−1 .
(7.293)
0 because σ 2 is It follows from Theorem 7.2.14 that (7.292) holds for G 0 concave. 0 , it follows Lemma 7.4.11 that it holds for Since (7.292) holds for G Gα (x) − Gα (y) + Hα (x) − Hα (y). It follows from Theorem 7.2.1 and (7.288) that the uniform modulus of continuity of Hα is “little o” of the uniform modulus of continuity of Gα . Thus, by the triangle inequality, (7.292) holds for Gα , for all α > 0. The proof of (7.291) is similar, except that we use Theorem 7.2.15. (Concave functions satisfy condition (2).)
Moduli of continuity for Gaussian processes
336
0 when 2 < p < 3. We also note that (7.291) and (7.292) hold for G For (7.292), the lower bound follows from Remark 7.2.11 and the upper bound, from Corollary 7.2.3. For (7.292), the upper bound also is given by Corollary 7.2.3, and one can easily check that the proof of Theorem 7.2.7 goes through in this special case.
7.5 Gaussian lacunary series The limit laws in (7.186) and (7.189) are direct generalizations of the moduli of continuity results for Brownian motion stated in (2.14) and (2.15). However, this is far from the whole story. Other classes of Gaussian processes exhibit different types of moduli of continuity. In particular, there are Gaussian processes with the same exact local and uniform moduli of continuity. We study Gaussian lacunary series to obtain examples that show how varied are the many types of moduli of continuity that can arise. But, more significantly, they allow us to find much larger lower bounds for the local moduli of continuity of certain Gaussian processes than those given in Theorem 7.2.12. A lacunary series is a trigonometric series, f (t) =
∞
t ∈ [0, 1],
(an cos 2πλn t + bn sin 2πλn t)
(7.294)
n=0
where λn ∈ N and λn+1 /λn ≥ q > 1, for all n ∈ N . The important property of lacunary series that we use (see, e.g., Katzenelson (1968, page 108)) is that Aq
∞
|an |2 + |bn |2
1/2
≤ sup |f (t)| ≤ t∈[0,1]
n=0
∞
|an |2 + |bn |2
1/2
,
n=0
(7.295) where Aq > 0 is a constant depending only on q. We restrict our attention to Gaussian lacunary series of the form X(t) =
∞
an (ηn cos 2π2n t + ηn sin 2π2n t)
t ∈ [0, 1],
(7.296)
n=0
where {ηn } and {ηn } are independent standard normal sequences. Since the Gaussian random variables are symmetric, without loss of generality we assume that the an ≥ 0. The reason we take λn = 2n is explained just before (7.334). Note that X := {X(t), t ∈ [0, 1]} is a stationary process. A necessary and sufficient condition for the continuity of X is that {an } ∈ 1 . We assume that this is the case. Under this condition, the
7.5 Gaussian lacunary series
337
series in (7.296) converges uniformly almost surely and thus gives a nice representation for X. By (7.295), Aq
∞
an (|ηn |2 + |ηn |2 )1/2 ≤ sup |X(t)| ≤ t∈[0,1]
n=0
∞
an (|ηn |2 + |ηn |2 )1/2 ,
n=0
(7.297) where Aq is a constant depending only on q. We assume that, for all m ∈ N, n i m ∞ 2 ain ≤ C ain i = 1, 2. (7.298) m 2 n=0 n=m+1 In particular, this requires that an > 0 infinitely often. For h > 0, let N (h) = max{n : 2n h ≤ 1}. Consider ,# Φ(h) := max
log N (h)
∞ n=N (h)+1
a2n
$1/2
an .
∞
,
(7.299)
n=N (h)+1
Theorem 7.5.1 Let X be a Gaussian lacunary series as described in (7.296) and assume that (7.298) holds. Then, if either n≥j an (1) lim supj→∞ # $1/2 > δ for some δ > 0 log j n≥j a2n or (2) an is nonincreasing as n → ∞ there exist constants 0 < C0 ≤ C1 < ∞ so that both lim sup
δ→0 |u|≤δ
|X(t + u) − X(t)| = C0 Φ(|u|)
a.s.
(7.300)
for all t ∈ [0, 1], and lim sup
δ→0
|t−s|≤δ
|X(t) − X(s)| = C1 Φ(|t − s|)
a.s.
(7.301)
s,t∈[0,1]
Furthermore, (7.300) and (7.301) hold with Φ(| · |) replaced by Φ(δ) and lim replaced by lim sup (and possibly different nonzero constants whem Φ is not nondecreasing). The fact that X has the same local and uniform moduli of continuity shows that processes such as these behave quite differently from those studied in the Section 7.2. To prove Theorem 7.5.1, we need several inequalities, which we incorporate into the following lemma:
Moduli of continuity for Gaussian processes
338
Lemma 7.5.2 (1)
lim sup j→∞
max{
an |ηn | ≤3 2 1/2 } n≥j an , (log j n≥j an ) n≥j
a.s. (7.302)
(2) j
lim sup j→∞
max{
an |ηn | ≤3 j 2 1/2 } n=0 an , (log j n=0 an ) n=0
j
(3)
n≥j
lim sup j→∞
an |ηn |
n≥j
an
≥C
a.s. (7.303)
a.s.
(7.304)
for some constant C > 0. (4) When an ↓ and (log j n≥j a2n )1/2 ≥ 2 n≥j an for all j sufficiently large, n≥j an ηn lim sup ≥C a.s. (7.305) 2 1/2 j→∞ (log j n≥j an ) for some constant C > 0. Let N1 = [0, . . . , j] and N2 = [j, . . . ∞] and set an , (log j a2n )1/2 } i = 1, 2. fi (j) = max{
Proof
n∈Ni
(7.306)
n∈Ni
By Chebyshev’s inequality, P αj an |ηn | ≥ cfi (j)αj n∈Ni
≤ exp(−cfi (j)αj )E
(7.307)
exp αj
an |ηn |
.
n∈Ni
Also, E
exp αj
an |ηn |
≤
n∈Ni
α2j a2n /2
e
n∈Ni
% 2/π
∞
e−(y−αj an )
2
/2
dy
0
(7.308) and
% 2/π 0
∞
e−(y−αj an )
2
/2
dy ≤ 1 +
% 2/παj an .
(7.309)
7.5 Gaussian lacunary series Consequently, P
αj
339
an |ηn | ≥ cfi (j)αj
(7.310)
n∈Ni
αj2 2 % an + 2/παj an ≤ exp −cfi (j)αj + 2 n∈Ni
.
n∈Ni
Let αj = log j/fi (j). Then, for c ≥ 3, the right-hand side of (7.310) is a term of a convergent series so that both (7.302) and (7.303) follow by the Borel–Cantelli Lemma. % To obtain (7.304), we note that since E|ηn | = 2/π, 2 2 2 E an |ηn | = an (7.311) π n≥j
n≥j
≥
2 2 E an |ηn | . π n≥j
Therefore, by the Paley–Zygmund Lemma (Lemma 14.8.2), for all 0 < λ < 1, % 2 (7.312) P an |ηn | ≥ λ 2/π an ≥ (1 − λ)2 , π n≥j
n≥j
which implies that P
lim sup
n≥j
an |ηn |
n≥j an
j→∞
% ≥ λ 2/π
2 ≥ (1 − λ)2 . π
(7.313)
Since the event in (7.313) is a tail event, we get (7.304). We now obtain (7.305). We define a subsequence of the integers as follows: Let g(x) = x + log x. For some integer m > 3, define the sequence n(0) = m, n(1) = [g(m)], . . . , n(i + 1) = [g(n(i))], . . .. We first show that 1/n(i) = ∞. Let Nk := 2k n(0). Observe that there are Nk − log Nk of the terms n(i) between Nk and Nk+1 = 2Nk . at least log 2Nk (To see this, note that the first term, n(i) ≥ Nk is less than or equal to Nk + log Nk . Each subsequent n(j) between Nk and Nk+1 is such that n(j) < n(j − 1) + log n(j − 1) ≤ n(j − 1) + log 2Nk .) Consequently, ∞ 1 n(i) i=0
≥
∞ 2k n(0) − log 2k n(0) k=0
log 2k+1 n(0)
2−k−1 n−1 (0)
(7.314)
Moduli of continuity for Gaussian processes
340
∞
≥
1 1 = ∞. 4 log 2k+1 n(0) k=0
Let
n≥n(j) an ηn Yn(j) := . ( n≥n(j) a2n )1/2
(7.315)
We have 2 =1 EYn(j)
EYn(j) Yn(k) ≤ 1/2
and
j = k.
(7.316)
The equality is obvious. To prove the inequality, it is enough to show that 2 1 n≥n(i+1) an ≤ . (7.317) 2 a 4 n≥n(i) n By the hypothesis and Lemma 14.8.3, 1/2 log n(i) a2n n≥n(i)
≥2
an >
n≥n(i)
>
#
∞ k=n(i)
2 n≥k an
> 2 log n(i)
$1/2
(k + 1 − n(i))1/2 1/2
n(i)+[log n(i)]
a2n
n≥n(i)+[log n(i)]
(7.318)
(k + 1 − n(i))−1/2
k=n(i)
1/2
a2n
,
n≥n(i)+[log n(i)]
which gives the inequality in (7.317). Let {ξi } be a standard normal sequence. By (7.316), as in the proof of Corollary 5.5.2, # √ $ 1/2 P ∪∞ {Y /(2 log n(j)) } ≥ 1/ 2 = 1. (7.319) n(j) j=k Using the Monotone Convergence Theorem, we get (7.305). We also use the following simple observation. Lemma 7.5.3 Suppose that {an } ∈ 1 , an ↓, and 1/2 ∞ ∞ 2 log j an ≥ an n=j
n=j
(7.320)
7.5 Gaussian lacunary series
341
for all j sufficiently large. Then log j
∞
a2n
(7.321)
n=j
is decreasing in j for all j sufficiently large. ∞ ∞ Proof The monotonicity of an implies that an ≥ n=m a2n / n=m an . Using this and (7.320), we see that ∞ ∞ log j( n=j a2n )2 2 ≥ a2n . (7.322) aj log j ≥ ∞ ( n=j an )2 n=j Also, log j
∞
∞
a2n ≥ log(j + 1)
n=j
a2n +log j a2j −log
n=j+1
∞ j+1 2 a . (7.323) j n=j+1 n
Using (7.322) in (7.323), we get (7.321). Proof of Theorem 7.5.1 In order to use Lemma 7.1.7, we prove the lower bound in (7.300) as given and the upper bound in (7.301) with lim replaced by lim sup and Φ(|t − s|) replaced by Φ(δ). We start with the upper bound. We have X(t + u) − X(t) (7.324) ∞ # = an [ ηn (cos 2π2n u − 1) + ηn sin 2π2n u] cos 2π2n t n=0
$ + [ ηn (cos 2π2n u − 1) − ηn sin 2π2n u] sin 2π2n t .
By (7.295), sup |X(t + u) − X(t)|
(7.325)
t∈[0,1]
|u|≤|h|
≤ sup 2 |u|≤|h|
1/2 2π2n u 2 an sin ηn + (ηn )2 2 n=0 ∞
N (h)
≤ 2π
an 2n h (|ηn | + |ηn |) + 2
n=0
:= I1 (h) + I2 (h).
∞ N (h)+1
an (|ηn | + |ηn |) .
Moduli of continuity for Gaussian processes
342 Let
Φ(h) := max
,#
N (h)
log N (h)
a2n
n=0
2n
2 $
,
2N (h)
N (h)
1/2
n=0
an
2n -
. 2N (h) (7.326)
By Lemma 7.5.2 (2) and the fact that h ≤ 2−N (h) , lim sup h→0
I1 (h) ≤ 12π Φ(h)
a.s.
(7.327)
Also, by Lemma 7.5.2 (1), I2 (h) ≤ 12 Φ(h)
lim sup h→0
a.s.
(7.328)
By (7.298), Φ(h) ≤ CΦ(h). Thus we have lim sup sup δ→0
|t−s|≤δ
|X(t) − X(s)| ≤C Φ(δ)
a.s.
(7.329)
s,t∈[0,1]
for some constant C < ∞. Using the relationship (7.53) implies (7.54), we see that we also have lim sup
δ→0
|t−s|≤δ
|X(t) − X(s)| ≤C Φ(|t − s|)
a.s.
(7.330)
s,t∈[0,1]
By Lemmas 7.1.1 and 7.1.9, we actually have equality in (7.329) and (7.330) for constants 0 ≤ C , C ≤ C. We next obtain a lower bound for the left-hand side of (7.300), which is greater than zero. Using this and the relationship (7.55) implies (7.56), which also holds for the local modulus of continuity, we see that C and C are both greater than zero, which completes the proof. By (7.324), X(u) − X(0) =
∞
an sin 2π2n u ηn +
n=0
∞
an (cos 2π2n u − 1) ηn . (7.331)
n=0
Since the two sums in (7.331) are independent and symmetric, on a set with probability greater than or equal to 1/2, sup 0≤u≤2−m
≥ =
|X(u) − X(0)| sup 0≤u≤2−m
sup 0≤u≤2−m
(7.332)
∞ an sin 2π2n u ηn n=0 ∞ m−1 an sin 2π2n u ηn + an sin 2π2n u ηn . n=0
n=m
7.5 Gaussian lacunary series
343
By the same argument, we see that on a set with probability greater than or equal to 1/4, sup 0≤u≤2−m
|X(u) − X(0)| ≥
sup 0≤u≤2−m
∞ an sin 2π2n u ηn .
(7.333)
n=m
Note that the random Fourier series in (7.333) has period 2−m . Thus the suprema taken over [0, 2−m ] and [0, 2π] in the next equation are the same (this is why we take λn = 2n in (7.296)). Consequently, by (7.297), sup 0≤u≤2−m
∞ an sin 2π2n u ηn
=
n=m
∞ sup an sin 2π2n u ηn
0≤u≤1 n=m ∞
≥ Aq
an |ηn |.
(7.334)
n=m
Suppose that condition (1) holds. Let {nj } be a sequence along which 1/2 anj ≥ δ log nj a2n . (7.335) n≥nj
n≥nj
Then, by (7.334), we have ∞ ∞ 1 P sup an sin 2π2n u ηn ≥ C an |ηn | ≥ 4 −m 0≤u≤2 n=n n=n j
(7.336)
j
for some constant C > 0, which implies, by Lemma 7.5.2 (3) that ∞ sup0≤u≤2−nj | n=nj an sin 2π2n u ηn | ∞ P lim sup ≥ C = 1 (7.337) j→∞ n=nj an for some constant C > 0, where we also use the fact that this is a tail event. ∞ ∞ Note that, for 2−nj ≤ u ≤ 2−nj +1 , Φ(u) = n=nj an . Also, n=nj an sin 2π2n u ηn is periodic with period 2−nj . Thus, for each path in the ∞ stochastic process n=nj an sin 2π2n u ηn there is a u ∈ (2−nj , 2−nj +1 ), depending on the path, for which ∞ (7.338) u = sup an sin 2π2n u ηn . 0≤u≤2−nj
n=nj
This shows that ∞ sup0≤u≤2−nj +1 | n=nj an sin 2π2n u ηn | P lim sup ≥ C = 1 (7.339) Φ(u) j→∞
Moduli of continuity for Gaussian processes
344
for some constant C > 0. Using (7.333), we see that, under condition (1) lim sup
δ→0 |u|≤δ
|X(u) − X(0)| ≥C Φ(|u|)
(7.340)
for some constant C > 0, with probability at least 1/4. We show in the next paragraph that the left-hand side of (7.340) is a tail event. Let XM (u)−XM (0) be given by the sum in (7.331) taken from n = 0 to n = M . It is easy to see that sup0≤u≤2π/2m (E(|XM (u)−XM (0)|)2 )1/2 ≤ CM 2−m . To show that the left-hand side of (7.340) is a tail event, it suffices to show that lim 2m Φ(2π/2m ) = ∞,
(7.341)
j→∞
which implies that the event in (7.337) is a tail event. By (7.298), 1/2 ∞ m −m m 2 log m an (7.342) 2 Φ(2 ) ≥ C2 n=m
≥ C2m
log m
≥
log m
m−1
1/2 a2n 2−2(m−n−1)
n=0 m−1
a2n 22(n−1)
1/2 .
n=0
Note that lim supn→∞ an 2n ≤ 1violates condition (1). When lim supn→∞ an 2n > 1, the last term in (7.342) goes to infinity as j → ∞. This verifies (7.341). Therefore, (7.340) holds almost surely when condition (1) is satisfied. When condition (1) is not satisfied 1/2 ∞ ∞ log j a2n >C an for all j ≥ j for some j (7.343) n=j
n=j
and some C > 0. Since multiplication by a constant only changes the constant in (7.300) and (7.301), we take C = 2. In this case we use Lemma 7.5.2 (4) in (7.336), which requires that {an } is monotonic, and proceed as above to obtain lim sup sup δ→0
0≤u≤δ
|X(t + u) − X(t)| ≥C Φ(δ)
a.s.
(7.344)
with the same constant C > 0. By Lemma 7.5.3, under (7.343), Φ(|u|)
7.5 Gaussian lacunary series
345
is increasing, and this implies that lim sup
δ→0 0≤u≤δ
|X(t + u) − X(t)| ≥C Φ(|u|)
a.s.
(7.345)
We have already shown that (7.345) holds under condition (1). Corollary 7.5.4 Let X be a Gaussian lacunary series as described in (7.296) and assume that (7.298) holds. Then ψ(|u|) =
∞
an
(7.346)
n=N (u)+1
is a lower bound for the local modulus of continuity of X, that is, (7.300) holds with a greater than or equal to sign when Φ(|u|) is replaced by ψ(|u|) and also when lim is replaced by lim sup and Φ(|u|) is replaced by ψ(δ). The significance of this result in relation to Theorem 7.5.1 is that we do not require that either condition (1) or condition (2) of Theorem 7.5.1 holds. Proof By (7.337) and the monotonicity of ψ(|u|), we see that it is a lower bound for the local modulus of continuity of X with probability 1/4. To replace 1/4 by 1, we need show that the event in (7.337) is a tail event. As in (7.341), this comes down to showing that lim 2N ψ(2−N ) = ∞.
N →∞
(7.347)
By (7.298), 2N ψ(2−N ) ≥ C
N
an 2n .
(7.348)
n=0
If lim inf N →∞ 2N ψ(2−N ) < ∞ for all > 0, an < /2n for all n ≥ N ( ). Therefore, ∞ an ≤ N () , (7.349) 2 n=N ()+1
and, consequently, by (7.298)
N ()
an 2n ≤ C .
n=0
That this cannot hold for all > 0 verifies (7.348).
(7.350)
Moduli of continuity for Gaussian processes
346
Example 7.5.5 We give a specific example of a Gaussian lacunary series that has the same local and uniform moduli of continuity and shows that there is no analog of Lemma 7.1.10 for the uniform modulus of continuity. Let X(t) =
∞
2−(n+1)/2 (ηn cos 2π2n t + ηn sin 2π2n t)
t ∈ [0, 1],
n=0
(7.351) where {ηn} and {ηn } are independent standard normal sequences. Clearly, E(X(t)X(s)) =
∞
2−(n+1) cos 2π2n (t − s).
(7.352)
n=0
It follows from Theorem 7.5.1 that (h log log 1/h)1/2 is both an exact local and exact uniform modulus of continuity for X (i.e., ω(|u − v|) = (|u−v| log log 1/|u−v|)1/2 in (7.1) and ρ(|u−uo |) = (|u−u0 | log log 1/|u− u0 |)1/2 in (7.2)), since log N (h) ∼ log log 1/h
(7.353)
and ∞
h/2 ≤
2−(n+1) ≤ h.
(7.354)
n=N (h)+1
We now show that π|t − s| ≤ E(X(t) − X(s))2 ≤ 16π|t − s|. 2
(7.355)
By (7.352), E(X(t) − X(s))2 = 4
∞
2−(n+1) sin2
n=0
2n+1 π|t − s| . 2
(7.356)
For 2−(k+1) < 2π|t − s| ≤ 2−k , 4
∞
2−(n+1) sin2
n=0
≤4
k+1 n=0
≤
8 2k+1
2n+1 π|t − s| 2
2−(n+1)
(7.357)
∞ 22n 2−2k 2−(n+1) +4 4 n=k+2
≤ 16π|t − s|,
7.6 Exact moduli of continuity
347
and, using the fact that sin2 θ ≥ θ2 /2 for |θ| ≤ 1, ∞
2n+1 π|t − s| 2k+1 π|t − s| ≥ 2−(k−1) sin2 ≥ π|t − s|/2. 2 2 n=0 (7.358) Let B be standard Brownian motion and set B1 = πB/2 and B2 = 16πB. We see that, for t ∈ [0, 2π], 4
2−(n+1) sin2
E(B1 (t) − B1 (s))2 ≤ E(X(t) − X(s))2 ≤ E(B2 (t) − B2 (s))2 . (7.359) X and B have the same exact local modulus of continuity, consistent with Lemma 7.1.10. But X and B do not have the same exact uniform modulus of continuity. This shows that Slepian’s Lemma does not generalize to cover uniform moduli of continuity.
7.6 Exact moduli of continuity The results on Gaussian lacunary series in the previous section may seem very specialized. However, coupled with the comparison lemma, Lemma 7.1.10, they enable us to obtain exact local moduli of continuity for a much wider class of Gaussian processes, including those that are associated with L´evy processes. Let G = {G(t), t ∈ R1 } be a mean zero Gaussian process. Some results in this section are expressed in terms of a monotone minorant for the increments variance of G. For t0 ∈ R1 and δ > 0, we consider functions ω such that ω 2 (h) = ω 2 (h; t0 , δ) ≤
min
|t−s|≥h
E(G(t) − G(s))2 .
(7.360)
s,t∈[t0 −δ,t0 +δ]
Restricting the definition of ω to some neighborhood of t0 enables us to avoid possible zeros of G(t) − G(s) when s = t. Theorem 7.6.1 Let G = {G(t), t ∈ R1 } be a mean zero Gaussian process and ω be as defined in (7.360) and satisfying ∞
1/2 ω 2 (2−n ) − ω 2 (2−n−1 ) < ∞.
(7.361)
n=1
Let N (h) = max{n : 2n h ≤ 1} and ψ(h) =
∞ n=N (h)+1
1/2 ω 2 (2−n ) − ω 2 (2−n−1 )
(7.362)
Moduli of continuity for Gaussian processes
348
and assume that sup|u|≤h E(G(t0 + u) − G(t0 ))2 = o(ψ2 (h)) for some t0 ∈ R1 and M
p/2 −p(M −n) ω 2 (2−n ) − ω 2 (2−n−1 ) 2 ≤ Cω p (2−p(M +1) )
p = 1, 2
n=k
(7.363) for some k sufficiently large and all M ≥ k. Then lim sup
δ→0 |u|≤δ
|G(t0 + u) − G(t0 )| ≥C ψ(u)
a.s.
(7.364)
for some constant C > 0. Proof
To simplify the notation we take t0 = 0. Consider
Y (t) :=
∞
an (ηn cos 2π2n t + ηn sin 2π2n t)
t ∈ [0, 1],
(7.365)
n=k
where a2n = ω 2 (2−n ) − ω 2 (2−n−1 ), {ηn } and {ηn } are independent standard normal sequences and k is such that (7.363) holds. It follows from Corollary 7.5.4 that (7.364) holds with G replaced by Y (it is easy to see that (7.363) implies (7.298)). Arguing as in (7.356) and (7.357) with 2−(n+1)/2 replaced by an and using the telescoping nature of a2n and (7.363), we see that E(Y (t) − Y (s))2 (7.366) N (|t−s|) a2n 2−2(N (|t−s|)−n) + ω 2 (2−N (|t−s|)−1 ) ≤C n=0
≤ Cω 2 (2−N (|t−s|)−1 ) ≤ Cω 2 (|t − s|) for |t − s| sufficiently small. Therefore, by (7.360), Lemma 7.1.10, and Remark 7.1.11, and the fact that ψ is increasing, we get (7.364). The next lemma gives a simple sufficient condition for (7.363). Lemma 7.6.2 If ω 2 (2−n ) ≤ 2α ω 2 (2−n−1 )
(7.367)
for some α < 2, for all n sufficiently large, ω satisfies (7.363). Proof
Let k to be large enough so that (7.367) holds for all n ≥ k.
7.6 Exact moduli of continuity
349
Then M
p/2 −p(M −n) ω 2 (2−n ) − ω 2 (2−n−1 ) 2
(7.368)
n=k
≤ (2α − 1)p/2
M
ω p (2−n−1 )2−p(M −n)
n=k
= (2α − 1)p/2
M −k
ω p (2−(M +1−j) )2−pj
j=0
≤ (2α − 1)p/2 ω p (2−(M +1) )
M −k
2−pj(1−α/2)
j=0
≤ Cω (2 p
−(M +1)
),
where we use (7.367) for the second and fourth inequalities. The next theorem relates the bounds in Theorem 7.6.1 to those in Theorem 7.2.1. Theorem 7.6.3 In addition to the hypotheses of Theorem 7.6.1, assume that ω 2 (2−n ) − ω 2 (2−n−1 ) is decreasing for all n sufficiently large. Then (7.364) holds with ψ(h) replaced by 1/2 ! ω(hu) 1/2 ( ψ(h) = max ω(h) (log log 1/h) , du . (7.369) u(log 1/u)1/2 0 Proof Consider {Y (t), t ∈ [0, 2π]} given in (7.365). Since {an } is nonincreasing for n sufficiently large, it follows from Theorem 7.5.1 that (7.364) holds with ψ(h) replaced by ,# $1/2 ψ(h) = max ω 2 (2−N (h)−1 ) log N (h) , (7.370) ∞
1/2 . ω 2 (2−n ) − ω 2 (2−n−1 )
n=N (h)+1
By Boas’s Lemma (Lemma 14.8.3), ∞ ω(2−N (h)−j ) √ ≤2 j j=1 −N (h)−j
Also, since ω(2
∞
1/2 ω(2−n ) − ω(2−n−1 ) .
(7.371)
n=N (h)+1
) ≥ ω(h2−j ),
∞ ω(2−N (h)−j ) √ j j=1
≥
∞ ω(h2−j ) √ j j=1
(7.372)
Moduli of continuity for Gaussian processes ∞ ω(h2−j ) √ dj ≥ j j=1 1/2 ω(hu) = C du. u(log 1/u)1/2 0
350
Using this in (7.370) justifies the presence of the integral in (7.369). The iterated logarithm term in (7.369) comes from the similar term in (7.370). Combining various results, we now get a complete description of the exact local moduli of continuity of a wide class of Gaussian processes with fairly regular increments variances. Theorem 7.6.4 Let G be a mean zero real-valued Gaussian process that is continuous in some neighborhood of t0 . Let φ(h) be an increasing function such that (1) 1/2
≤ C1 φ(|h|) C0 φ(|h|) ≤ E(G(t + h) − G(t))2
(7.373)
for all t in some neighborhood of t0 and all |h| sufficiently small; (2) φ2 (2−n ) − φ2 (2−n−1 ) is nonincreasing in n for all n sufficiently large; (3) φ(2−n ) ≤ 2α φ(2−n−1 ) for all n sufficiently large for some α < 1. Set 1/2
Φ1 (h) = φ(h) (log log 1/h)
1/2
+ 0
φ(hu) du. u(log 1/u)1/2
(7.374)
Then lim sup
δ→0 |u|≤δ
|G(t0 + u) − G(t0 )| =C Φ1 (u)
a.s.
(7.375)
and lim sup sup δ→0
|u|≤δ
|G(t0 + u) − G(t0 )| = C Φ1 (δ)
a.s.
(7.376)
a.s.
(7.377)
for constants 0 < C ≤ C < ∞. Proof Using Theorem 7.2.1, we obtain lim sup sup δ→0
|u|≤δ
|G(t0 + u) − G(t0 )| ≤ C Φ1 (δ)
7.6 Exact moduli of continuity
351
for some constant C < ∞. Let C be the smallest constant for which (7.377) holds. By Lemma 7.1.6, lim sup
δ→0 |u|≤δ
|G(t0 + u) − G(t0 )| ≤ C Φ1 (u)
It then follows from Lemma 7.1.1 that |G(t0 + u) − G(t0 )| =C lim sup δ→0 |u|≤δ Φ1 (u)
a.s.
(7.378)
a.s.
(7.379)
for some constant C ≤ C and from Theorem 7.6.3 that in fact C > 0. This gives us (7.375), and, since C ≥ C, we also get (7.376). (The event in (7.376) is a tail event.) We take up the question of when the constants C and C in Theorem 7.6.4, and in similar results, are equal, in Remark 7.6.7. Note that by Lemma 6.4.6 the condition that G is continuous implies that the integral in (7.374) is finite. Also, condition (2) is satisfied if φ2 (2−x ) is convex for all x sufficiently large. In order to give some concrete examples of the types of local moduli of continuity we get from Theorem 7.6.3, we need some estimates of Iloc,φ . The next lemma gives us an idea of what Iloc,φ looks like when φ is slowly varying but larger than powers of a logarithm. Lemma 7.6.5 Let φ(u) = exp(−g(log 1/u)), where g (s) is a normalized regularly varying function at infinity of index −γ, 0 < γ < 1. Then there exist constants 0 < C0 ≤ C1 < ∞ such that C0 φ(δ)(1/g (log 1/δ))1/2 ≤ Iloc,φ (δ) ≤ C1 φ(δ)(1/g (log 1/δ))1/2 , (7.380) that is, 1/δ), (7.381) Iloc,φ (δ) ≈ φ(δ)(log 1/δ)γ/2 L(log for all δ > 0 sufficiently small and (log 1/δ)1−γ L(log 1/δ) φ(δ) ≈ exp − 1−γ
(7.382)
for all δ > 0 sufficiently small, where L(u) is a normalized slowly varying function at infinity. Proof
By Theorem 14.7.1, g(s) ∼
s1−γ exp 1−γ
1
s
(t) dt t
(7.383)
352
Moduli of continuity for Gaussian processes
as s → ∞, where limt→∞ (t) = 0. This gives (7.382). To obtain (7.380), let log 1/δ = v so that φ(δu) = exp(−g((log 1/u) + v)). Then, making the change of variables s = (log 1/u) + v, we see that ∞ ds e−g(s) (7.384) Iloc,φ (δ) = (s − v)1/2 log 2+v log 2+1/g (v)+v ds ≤ e−g(s) (s − v)1/2 log 2+v ∞ +(g (v))1/2 e−g(s) ds. log 2+1/g (v)+v
Since g (s) is a normalized regularly varying function at infinity of negative index, by Lemma 7.2.4, it is decreasing for all s sufficiently large. Writing g as the integral of its derivative, we see that e−g(s) evaluated at the lower and upper limits of the second integral in (7.384) are both comparable to e−g(v) . Therefore, we can remove this term from the integral and integrate the remainder to see that the first integral after the second equal sign in (7.384) is comparable to e−g(v) 1/(g (v))1/2 at infinity. This gives the lower bound in (7.380). To get the upper bound in (7.380), we need only show that the last term in (7.384) is bounded above by a constant times e−g(v) 1/(g (v))1/2 at infinity. To see this we write ∞ ∞ e−g(α(v)) 1 −g(s) −g(s) d e ds = e + ds, (7.385) g (α(v)) ds g (s) α(v) α(v) where α(v) = log 2 + 1/g (v) + v. Using the fact that g (s) is a normalized regularly varying function at infinity of index −γ, we see that 1 d is dominated by a regularly varying function of index γ − 1. ds g (s) Thus, for v sufficiently large, the last integral in (7.385) is little “o” of the first one. We already remarked that e−g(α(v)) is comparable to e−g(v) . Since v < α(v) < 2v for v sufficiently large, 1/g (α(v)) is comparable to 1/g (v). Thus we get the desired upper bound for the last term in (7.384). Example 7.6.6 Here are some examples of functions φ that satisfy conditions (2) and (3) of Theorem 7.6.4 and the corresponding functions Φ1 for which (7.375) and (7.376) hold. The constants are not necessarily the same in each example. (1) φ is regularly varying with index 0 < a < 1 Φ1 (u) = φ(u)(log log 1/|u|)1/2 ;
7.6 Exact moduli of continuity
353
(2) φ(u)= exp(−(log 1/|u|)α (log log 1/|u|)β ), 0 < α < 1, −∞ < β < ∞ Φ1 (u) = φ(|u|)((log 1/|u|)1−α (log log 1/|u|)−β )1/2 ; (3) φ(u) = (log 1/|u|)−α , α > 1/2 Φ1 (u) = φ(|u|)(log 1/|u|)1/2 ; (4) φ(u) = ((log 1/|u|)(log log 1/|u|)β )−1/2 , β > 2 Φ1 (u) = φ(|u|)(log 1/|u|)1/2 log log 1/|u|. Proof The functions φ in (2)–(4) are explicit. One can check that they satisfy conditions (2) and (3) of Theorem 7.6.4. To check this when φ is regularly varying with index 0 < a < 1, use (7.119). To verify that Φ1 is as stated, we use Lemma 7.2.5, (7.128), for (1); Lemma 7.6.5 for (2); Lemma 7.2.5, (7.126), and (7.127) for (3); and Remark 7.2.6 for (4). Comparing Example 7.6.6 (1) with Theorem 7.2.15, we see that by giving up the precise value of the constant in (7.375) and (7.376) we can extend the result from normalized regularly varying σ 2 to regularly varying σ 2 . Remark 7.6.7 We point out in Lemma 7.1.7 that a nondecreasing function is an exact modulus if and only if it is an exact m-modulus. In this context, it is interesting to note that ω (δ) in (7.83) is increasing for δ in some interval [0, δ0 ], where δ0 > 0 (its derivative is positive). It is not clear whether ρ(δ) in (7.84) always has this property; it certainly does in the examples above (note that ρ(δ) = Φ1 (δ)). More generally, if φ in (7.84) satisfies φ(x) ≤ exp(−(log 1/x)1− ) for x in some interval [0, x0 ], for x0 > 0, for all > 0, then φ(δ) (log log 1/δ)
1/2
1/2
∼ φ(δ) (log log 1/φ(x))
(7.386)
at zero. Since the second term in (7.386) is increasing at zero, ρ(δ) is asymptotic to an increasing function at zero in this case (here we also use the obvious inequality φ(x) ≥ x3 in some interval [0, x0 ], x0 > 0). On the other hand, if φ(x) ≥ exp(−(log 1/x)1− ), for some 0 < < 1, then, loosely speaking, the integral term in ρ(δ) tends to dominate the iterated logarithm term (see Example 7.6.6 (2)), and, of course, the integral term is increasing. Because of Lemma 7.1.10, we can study the local modulus of continuity in terms of certain monotone minorants for its increments variance. At the end of Section 7.5 we showed that this technique cannot be used for the uniform modulus of continuity. That is why Theorem 7.2.10 is expressed in terms of the increments variance itself. We begin our discussion of exact uniform moduli of continuity by combining Theorems 7.2.1 and 7.2.10 and Lemma 7.1.1.
Moduli of continuity for Gaussian processes
354
Theorem 7.6.8 Let G = {G(x), x ∈ [0, 1]} be a Gaussian process with stationary increments for which (7.137) holds or (7.139) holds with α > 0. Assume furthermore that δ σ(u) du ≤ Cσ(δ)(log 1/δ)1/2 . (7.387) u(log 1/u)1/2 0 Then lim
δ→0
sup |u−v|≤δ
(2σ 2 (|u
|G(u) − G(v)| = C − v|) log 1/|u − v|)1/2
a.s.
(7.388)
u,v∈[0,1]
for some constant 0 < C < ∞. When α = 0 in (7.139), this result holds as long as σ is asymptotic to a monotonic function at zero. These statements also hold when |u − v| is replaced by δ in the denominator of the fraction in (7.388), lim is replaced by lim sup, and C by some constant C , where C ≤ C < ∞. Note that in Theorem 7.2.1 the modulus is given in terms of a monotonic majorant φ of σ (see (7.82)). When σ 2 is regularly varying at zero with index α > 0, it is asymptotic to an increasing function at zero, so we can replace φ by σ. When σ 2 is slowly varying at zero, we need to include the hypothesis that it is asymptotic to a monotonic function at zero. Note also that if the left-hand side of (7.387) is “little o” of the right-hand side, then C = C . We obtain this by showing that (7.388) holds with ω (|u − v|) in the denominator and then using Remark 7.6.7. The results in this section about the local modulus of continuity give us some free information about the uniform modulus of continuity, simply because a lower bound for the local modulus of continuity is also a lower bound for the uniform modulus of continuity. Combining Theorems 7.2.1 and 7.6.4, Remark 7.2.6, and Lemma 7.1.1, we get larger exact uniform moduli of continuity for certain Gaussian processes than is suggested by the analog of the result for Brownian motion. Theorem 7.6.9 Let G = {G(t), t ∈ [0, 1]} be a mean zero real-valued continuous Gaussian process. Let φ(h) be an increasing function that satisfies conditions (2) and (3) of Theorem 7.6.4 (with t0 ∈ [0, 1]) and is such that C0 φ(|h|) ≤ (E(G(t + h) − G(t)))
1/2
≤ C1 φ(|h|)
for all t ∈ [0, 1] and all |h| sufficiently small. Let h φ(u) Φ2 (h) = du 1/2 0 u(log 1/u)
(7.389)
(7.390)
7.6 Exact moduli of continuity
355
and assume that φ(h)(log 1/h)1/2 = O(Φ2 (h)). Then lim
δ→0
sup |u−v|≤δ
|G(u) − G(v)| =C Φ2 (|u − v|)
a.s.
(7.391)
u,v∈[0,1]
for some constant 0 < C < ∞. Furthermore, (7.391) holds with Φ2 (|u − v|) replaced by Φ2 (|δ|) and lim replaced by lim sup (with the same constant C). Example 7.6.10 We consider what the exact uniform modulus of continuity is in the four cases of Example 7.6.6. If (7.137) or (7.139) holds, σ(δ) is asymptotic to a monotone function, and σ(u) ≤ C(log 1/|u|)−α , α > 1/2, it follows from Theorem 7.6.8 that (7.388) holds. This is the case if σ = φ in Example 7.6.6 (2) and (3) since φ2 is concave in these cases. This is also the case in Example 7.6.6 (1) if φ also satisfies (7.137) or (7.139). If σ(u) = ((log 1/|u|)(log log 1/|u|)β )−1/2 , β > 2, it follows from Theorem 7.6.9 that (7.391) holds with Φ2 (u) = φ(|u|)(log 1/|u|)1/2 log log 1/|u|, as in Example 7.6.6 (4). Note that in cases (3) and (4) the Gaussian process has the same exact local and uniform moduli of continuity, although we do not know if the constants are the same. Remark 7.6.11 To put condition (7.367) of Lemma 7.6.2 in perspective, suppose that ω is the maximal monotone minorant for the increments variance, that is, there is equality in (7.360). Then, when G is a continuous Gaussian process with stationary increments, there exists an > 0 such that ω 2 (2h) ≤ 4ω 2 (h)
∀ h ≤ /2.
(7.392)
Thus (7.367) only requires a slight strengthening of the most general case. (In the paragraph containing (7.97) we show that the smallest monotone majorant of E(G(t) − G(s))2 also satisfies (7.392).) To obtain (7.392) let ρ2 (u) = E(G(u) − G(0))2 . Assume that ρ has a smallest strictly positive zero (if not, ω(h; 0, δ) ≡ 0 and there is nothing to prove). We first note that there is an > 0 for which 0 < ρ( /2) = min{ρ(u) : /2 ≤ u ≤ },
(7.393)
that is, the minimum of ρ on [ /2, ] is taken at /2. To see that such an exists, let z be the smallest strictly positive zero of ρ, or any positive number if no such zero exists, and let m(u) = min{ρ(x) : u ≤ x ≤ z}. Choose as the largest number less than or equal to z such that
Moduli of continuity for Gaussian processes
356
ρ( /2) = m(z/2). Such a number exists because 0 < m(z/2) ≤ ρ(z/2) and ρ goes to zero at zero. We now observe that 0 < ρ( /2) ≤ ρ(x) for /2 < x ≤ z/2, since this is how /2 was chosen. Also, ρ( /2) ≤ ρ(x) for z/2 ≤ x ≤ because ρ( /2) = m(z/2) and ≤ z. Thus (7.393) holds. By (7.360) (with equality), ω 2 (2h)
= ω 2 (2h; 0, ) = ≤ 4
min
2h≤|u|≤
min
2h≤|u|≤
ρ2 (u/2) ≤ 4
ρ2 (u)
(7.394)
min
h≤|s|≤/2
ρ2 (s)
≤ 4 min ρ2 (s) = 4ω 2 (h). h≤|s|≤
Here we use (7.97) for the first inequality and (7.393) for the last.
7.7 Moduli of continuity for squares of Gaussian processes Suppose that a Gaussian process G has a local or uniform modulus of continuity. We raise the question, what is the local or uniform modulus of continuity of G2 ? This question may seem artificial, but answering it allows us, in Section 9.5, to obtain local and uniform moduli of continuity for the local times of the Markov process associated with G (when G is an associated process). In order to extend the results of Section 7.1 so that they apply to squares of Gaussian processes, we slightly generalize the class of Gaussian processes that we consider. We drop the condition that the Gaussian process G = {G(y), y ∈ S} has mean zero and instead require that EG(y) = C
∀ y ∈ S.
(7.395)
where C is a constant. Obviously, dG (s, t), defined in (6.1), is invariant under the translation {G(y), y ∈ S} → {G(y) + a, y ∈ S} for some constant a, and Lemma 7.1.1 continues to hold for Gaussian processes satisfying (7.395). The result for the local modulus of continuity of squares of Gaussian processes is easy to obtain. Theorem 7.7.1 Let G = {G(y), y ∈ (S, τ )} be a continuous Gaussian process satisfying (7.395). Let y0 ∈ S. Assume that G satisfies (7.2) with C = 1. Then lim
sup
δ→0 τ (y,y0 )≤δ y∈S
|G2 (y) − G2 (y0 )| = |G(y0 )| 2ρ(τ (y, y0 ))
a.s.
(7.396)
7.7 Squares of Gaussian processes
357
Furthermore, if (7.2) holds with a less than or equal to sign, then (7.396) holds with a less than or equal to sign, and if (7.2) holds with a greater than or equal to sign, then (7.396) holds with a greater than or equal to sign. Proof
This is immediate since |G(y) − G(y0 )| 1 sup |G(y0 )| − sup |G(y) − G(y0 )| (7.397) ρ(τ (y, y0 )) 2 τ (y,y0 )≤δ τ (y,y0 )≤δ ≤
sup τ (y,y0 )≤δ
≤
sup τ (y,y0 )≤δ
|G2 (y) − G2 (y0 )| 2ρ(τ (y, y0 )) |G(y) − G(y0 )| 1 |G(y0 )| + sup |G(y) − G(y0 )| . ρ(τ (y, y0 )) 2 τ (y,y0 )≤δ
We see that (7.396) follows by continuity. It is also clear from (7.397) that the one-sided results also hold. Remark 7.7.2 When G(y0 ) ≡ 0, (7.396) is not very interesting since, obviously, lim
sup
δ→0 τ (y,y0 )≤δ y∈S
G2 (y) ρ2 (τ (y, y0 ))
=1
a.s.
(7.398)
It is also easy to find an upper bound for the uniform modulus of continuity of G2 . Theorem 7.7.3 Let G = {G(y), y ∈ S} be a continuous Gaussian process satisfying (7.395) and (7.1) with “ = C” replaced by “ ≤ 1.” Then lim sup
δ→0 τ (u,v)≤δ u,v∈K
G2 (u) − G2 (v) ≤ sup |G(u)| 2ω(τ (u, v)) u∈K
a.s.
(7.399)
Proof This is immediate since lim sup
δ→0 τ (u,v)≤δ u,v∈K
G2 (u) − G2 (v) 2ω(τ (u, v))
≤ lim sup
δ→0 τ (u,v)≤δ u,v∈K
(7.400)
|G(u) − G(v)| |G(u) + G(v)| lim sup . ω(τ (u, v)) δ→0 τ (u,v)≤δ 2 u,v∈K
The first term on the right side of the inequality in (7.400) is less than or equal to 1 by hypothesis. The rest is obvious.
Moduli of continuity for Gaussian processes
358
Finding an exact uniform modulus of continuity for G2 is more complicated. We begin with a technical lemma. Lemma 7.7.4 Let G = {G(y), y ∈ S} be a continuous Gaussian process satisfying (7.395) and (7.1) with “ = C” replaced by “ ≥ 1.” Assume that ω (in (7.1)) satisfies (7.7). Then there exists a u0 ∈ K such that, for all > 0, G(u) − G(v) ≥1 ω(τ (u, v))
sup
lim
δ→0
τ (u,v)≤δ
a.s.
(7.401)
u,v∈B(u0 ,)∩K
and lim sup
δ→0 τ (u,v)≤δ u,v∈K
Proof that
G2 (u) − G2 (v) ≥ |G(u0 )| 2ω(τ (u, v))
a.s.
(7.402)
We first show that for all n > 0 there exists a u0 ( n ) ∈ K such G(u) − G(v) ≥1 ω(τ (u, v))
sup
lim
δ→0
τ (u,v)≤δ
a.s.
(7.403)
u,v∈B(u0 (n ),n )∩K
Obviously, K ⊂ ∪x∈K Bτ (x, n /2). Let Bτ (x1 , n /2), . . . , Bτ (xm , n /2) be a finite cover of K. Note that if u, v ∈ K such that τ (u, v) < n /2, then both u and v are contained in Bτ (xj , n ) for some 1 ≤ j ≤ m. Thus, for 0 < δ < n /2, sup
sup
1≤j≤m
τ (u,v)≤δ
G(u) − G(v) G(u) − G(v) = sup . ω(τ (u, v)) ω(τ (u, v)) τ (u,v)≤δ
(7.404)
u,v∈K
u,v∈B(xj ,n )∩K
Suppose there exists an > 0 and a 1 ≤ j ≤ m such that lim
δ→0
G(u) − G(v) ≤ 1 − ω(τ (u, v))
sup τ (u,v)≤δ
(7.405)
u,v∈Bτ (xj ,n )∩K
on a set of positive measure. Then, since (7.7) holds on Bτ (xj , n )∩K, it follows from Lemma 7.1.1 that the event in (7.405) holds almost surely. If the event in (7.405) holds almost surely for all 1 ≤ j ≤ m, then, for any γ > 0, we can find a δ > 0 such that, for each j and δ ≤ δ , sup τ (u,v)≤δ
u,v∈Bτ (xj ,n )∩K
G(u) − G(v) ≤ 1 − /2 ω(τ (u, v))
(7.406)
7.7 Squares of Gaussian processes
359
with probability greater that 1 − γ. Therefore, by (7.404), sup τ (u,v)≤δ
G(u) − G(v) ω(τ (u, v))
(7.407)
u,v∈K
sup
= sup
τ (u,v)≤δ
1≤j≤m
G(u) − G(v) ≤ 1 − /2 ω(τ (u, v))
u,v∈Bτ (xj ,n )∩K
with probability greater that 1 − mγ. Consequently, by Lemma 7.1.1, lim sup
δ→0 τ (u,v)≤δ u,v∈K
G(u) − G(v) ≤ 1 − ω(τ (u, v))
a.s.,
(7.408)
which contradicts the fact that ω is a lower uniform modulus of continuity for G with constant greater than or equal to one. Thus the limit superior in (7.405) is greater than or equal to one almost surely for some 1 ≤ j ≤ m. We set u0 ( n ) = xj for some j for which this occurs and obtain (7.403). Now choose a sequence n → 0 and consider the balls Bτ (u( n ), n ) for which (7.403) holds. Since K is compact there exists a sequence {unk }∞ k=1 such that limk→∞ unk = u0 for some u0 ∈ K. It is easy to see that for all > 0 there exists a k0 ( ) such that, for k ≥ k0 ( ), Bτ (u( nk ), nk ) ⊂ Bτ (u0 , ). This observation and (7.403) give (7.401). Note that G2 (u) − G2 (v) 2ω(τ (u, v))
sup τ (u,v)≤δ
(7.409)
u,v∈Bτ (u0 ,)∩K
≥
sup τ (u,v)≤δ
u,v∈Bτ (u0 ,)∩K
|G(u) − G(v)| 2ω(τ (u, v))
inf τ (u,v)≤δ
|G(u) + G(v)|.
u,v∈Bτ (u0 ,)∩K
Taking the limit as δ → 0 and using (7.401), we get (7.402). We now introduce homogeneity conditions that allow us to obtain (7.402) for all u0 ∈ K. We say that a metric or pseudometric space (S, τ ) is locally homogeneous if any two points in S have isometric neighborhoods in the metric τ . Theorem 7.7.5 Let (S, dG ) be locally homogeneous and let K ⊂ S be a compact set that is the closure of its interior. Let G = {G(u), u ∈ S} be a continuous Gaussian process satisfying (7.395) and (7.1) with “ = C”
Moduli of continuity for Gaussian processes
360
replaced by “ ≥ 1.” Assume that ω (in (7.1)) satisfies (7.7). Then lim
δ→0 d
sup
G (u,v)≤δ
G2 (u) − G2 (v) ≥ sup |G(u)| 2ω(dG (u, v)) u∈K
a.s.
(7.410)
u,v∈K
Proof Since K is the closure of its interior, for every u0 ∈ K there exists an > 0 such that BdG (u0 , ) ⊂ K. It then follows from the homogeneity of (S, dG ) that (7.401) is satisfied for this u0 and this . In fact, for every u0 ∈ int K, (7.401) is satisfied for some > 0 and, consequently, so is (7.402). In particular, if {xj }nj=1 are points in the interior of K, then sup
lim
δ→0 d
G (u,v)≤δ
G2 (u) − G2 (v) ≥ sup |G(xj )| 2ω(dG (u, v)) 1≤j≤n
a.s.
(7.411)
u,v∈K
Since this inequality is independent of n, it also holds for {xj }∞ j=1 contained in a countable dense subset of K. Therefore, since G is uniformly continuous on K, we get (7.410). Combining Theorems 7.7.3 and 7.7.5, we get Theorem 7.7.6 Let (S, dG ) be locally homogeneous and let K ⊂ S be a compact set that is the closure of its interior. Let G = {G(y), y ∈ S} be a continuous Gaussian process satisfying (7.395) and (7.1) with C = 1. Then lim
δ→0 d
sup
G (u,v)≤δ
G2 (u) − G2 (v) = sup |G(u)| 2ω(dG (u, v)) u∈K
a.s.
(7.412)
u,v∈K
Note that by Lemma 7.1.1, the hypothesis of Theorem 7.7.5, that ω satisfies (7.7), follows from the hypotheses of this theorem. Remark 7.7.7 All the results in this section hold for m-moduli, in the sense that results for the m-moduli of G extend to the squares of G in analogy with the results in Theorems 7.7.1–7.7.6. The lower bounds follow because, by Lemma 7.1.6, the m-moduli are larger than the corresponding moduli considered in this section. It is easy to see that the proofs for the upper bounds in Theorems 7.7.1 and 7.7.3 also work for m-moduli. There is only one minor point to consider. The hypothesis of the analog of Theorem 7.7.5 requires that the uniform m-modulus satisfies (7.7). This is shown in Lemma 7.1.9.
7.8 Notes and references
361
7.8 Notes and references Much of the material in Sections 7.1–7.6 is summarized in Marcus and Rosen (1992a), which deals with local and uniform moduli of continuity of local times. We consider local and uniform moduli of continuity of local times in Section 9.5. Theorem 7.1.4 and Lemma 7.1.5 are constructed from material in Sections 6.2, 6.4, and 6.5 of Fernique (1997). Lemma 7.1.10 is by Marcus and Shepp (1972). The first half of Section 7.2 contains more modern proofs of some classical results that were obtained before the introduction of metric entropy and majorizing measures into the study of Gaussian processes. Many of the old papers containing these results are still interesting. In chronological order we mention some of them: Chung, Erd¨ os and Sirao (1959); Nisio (1967); Marcus (1968, 1970); Garsia, Rodemich and Rumsey, Jr. (1970); Kono (1970); and Sirao and Watanabe (1970). The material in Lemma 7.2.5 is taken from Marcus (1972). The continuity conditions involving regular variation and normalized regular variation develop an idea introduced in Marcus and Rosen (1992a). The material in Section 7.3 may appear to deal with too restricted a class of Gaussian processes but it is what we need when applying these results to study the moduli of continuity of local times in Section 9.5. Theorem 7.4.5 and its corollary Theorem 7.4.12, which are new, exploit special properties of the covariances of associated Gaussian processes. Theorem 7.4.5 may seem peculiar because it requires an upper bound on σ whereas we know that for larger σ the iterated log term in (7.247) is replaced by a larger function of u. The problem here is that we are using the local modulus of time changed Brownian motion as a lower bound. This technique is less effective for processes with increments variance much larger than the increments variance of Brownian motion. Sections 7.5 and 7.6 are taken from Marcus (1972). Section 7.7 is taken from Marcus and Rosen (1992a).
8 Isomorphism Theorems
The relationship between strongly symmetric Borel right processes X and their associated mean zero Gaussian processes G (G is the Gaussian process with covariance equal to the 0-potential of X) is given by several isomorphism theorems that relate squares of G to the local times of X.
8.1 The isomorphism theorems of Eisenbaum and Dynkin We begin with an isomorphism theorem due to N. Eisenbaum. This theorem plays an important role in this book. Theorem 8.1.1 (Eisenbaum Isomorphism Theorem) Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous 0-potential density u(x, y) and state space S. Let L = {Lyt ; (y, t) ∈ S × R+ } denote the local time for X normalized so that E x (Ly∞ ) = u(x, y). Let G = {Gy ; y ∈ S} denote the mean zero Gaussian process with covariance u(x, y). Then, for any countable subset D ⊆ S, any x ∈ S, and any s = 0, ! 1 Ly∞ + (Gy + s)2 ; y ∈ D , P x × PG (8.1) 2 ! 1 Gx law (Gy + s)2 ; y ∈ D , (1 + )PG . = 2 s Equivalently, for all x, x1 , . . . , xn in S and bounded measurable funcn , for all n, and all s = 0, tions F on R+ (Gxi + s)2 Gx (Gxi + s)2 x xi = EG F . 1+ E EG F L∞ + 2 s 2 (8.2) Here we use the notation F (f (xi )) := F (f (x1 ), . . . , f (xn )). 362
8.1 Isomorphism theorems of Eisenbaum and Dynkin Proof
We first show that n (Gxi + s)2 x xi E EG exp λ i L∞ + 2 i=1 n λi (Gx + s)2 Gx i 1+ = EG exp s 2 i=1
363
(8.3)
for all λ1 , . . . , λn sufficiently small, where x = x1 . We write (8.3) as n
n 2 E Gx exp i=1 λi (Gxi + s) /2 x xi n λi L∞ = 1 + E exp . (8.4) sE exp ( i=1 λi (Gxi + s)2 /2) i=1 Let Σ denote the covariance matrix {u(xi , xj )}ni,j=1 . It follows from (5.59) that
n 2 E Gx exp i=1 λi (Gxi + s) /2 t }1 , n = {ΣΛ1 (8.5) sE exp ( i=1 λi (Gxi + s)2 /2) = (I − ΣΛ)−1 Σ and 1t is the transpose of the row vector where Σ 1 = (1, . . . , 1). Therefore, to obtain (8.3) it suffices to show that n x xi t }1 . λi L∞ = 1 + {ΣΛ1 (8.6) E exp i=1
This is indeed the case since, by (3.246) with x = x1 , n x1 xi E exp = {(I − ΣΛ)−1 1t }1 λi L∞
(8.7)
i=1
= {I1t + (I − ΣΛ)−1 ΣΛ1t }1 1 + {(I − ΣΛ)−1 ΣΛ1t }1 . t }1 . = 1 + {ΣΛ1
=
n Let µ1 and µ2 be the measures on R+ defined by (Gxi + s)2 x xi F ( · ) dµ1 = E EG F L∞ + 2
and
F ( · ) dµ2 = EG
Gx 1+ s
F
(Gxi + s)2 2
(8.8)
(8.9)
n . The measure µ1 for all nonnnegative measurable functions F on R+ is determined by its moment generating function, the left-hand side of (8.3). Furthermore, (8.3) shows that the measures µ1 and µ2 are equal.
364
Isomorphism Theorems
Therefore, although it is not clear to begin with that µ2 is a positive measure, this argument shows that it is. ∞ equipped with the The two sides of (8.1) determine measures on R+ σ-algebra of cylinder sets. We have just seen that these measures have the same finite-dimensional distributions; hence they are equal. This gives (8.1). The isomorphism in (8.1) is given for the total accumulated local time of a strongly symmetric Borel right process with continuous 0-potential density. Exactly the same proof (see in particular the use of (3.246)) shows that it also holds for terminal times and inverse local times. Thus we have the following corollary of the proof of Theorem 8.1.1. Corollary 8.1.2 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential densities uα (x, y) and state space S. Let L = {Lyt ; (y, t) ∈ S × R+ } denote the local time for X normalized so that ∞ e−αt dLyt = uα (x, y), (8.10) Ex 0
and let T be a terminal time for X. Assume that there exists a mean zero Gaussian process G = {Gy ; y ∈ S} with covariance uT (x, y). Then, for any countable subset D ⊆ S, any x ∈ S, and any s = 0, ! 1 (8.11) LyT + (Gy + s)2 ; y ∈ D , P x × PG 2 ! 1 Gx law (Gy + s)2 ; y ∈ D , (1 + )PG . = 2 s This also holds when T is replaced by the inverse local time τA (λ), for any CAF, A = {At , t ∈ R+ }, and P x is replaced by Pλx . The first isomorphism theorem relating Markov local times and Gaussian processes is due to E. B. Dynkin. It relates local times for the htransform of a strongly symmetric Markov process X to the Gaussian process associated with X. Theorem 8.1.3 (Dynkin Isomorphism Theorem) Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous 0-potential density u(x, y). Let 0 denote a fixed element of S. Assume that u(x, 0) hx = P x (T0 < ∞) = >0 (8.12) u(0, 0)
8.1 Isomorphism theorems of Eisenbaum and Dynkin 365
= Ωh , F h , F h , Xt , θt , P x,0 denote the h-transform for all x ∈ S. Let X t y of X as described in Section 3.9 and let L = {Lt ; (y, t) ∈ S × R+ } normalized so that denote the local time for X, # y $ u(x, y)h(y) (8.13) E x,0 L∞ = h(x) (see (3.247)). Let G = {Gy ; y ∈ S} denote the mean zero Gaussian process with covariance u(x, y). Then, for any countable subset D ⊆ S, 1 ! ! y 1 Gx G0 law L∞ + G2y ; y ∈ D , P x,0 × PG = G2y ; y ∈ D , PG . 2 2 u(0, x) (8.14) Equivalently, for all x, x1 , . . . , xn in S and bounded measurable funcn , for all n, tions F on R+ 2 Gxi G2xi Gx G0 xi x,0 = EG F . (8.15) E EG F L∞ + 2 u(0, x) 2 Here we use the notation F (f (xi )) := F (f (x1 ), . . . , f (xn )). Proof
To prove this theorem it suffices to show that n
n
2 E Gx G0 exp xi i=1 λi Gxi /2 x,0
n u(0, x)E exp λi L∞ = 2 E exp i=1 λi Gxi /2 i=1 (8.16) for all λ1 , . . . , λn sufficiently small where x = x1 and 0 = xn . (As in Gx G0 PG is a positive the proof of Theorem 8.1.1, this implies that u(0, x) measure on the paths of {G2y ; y ∈ D}.) Let Σ denote the covariance matrix {u(xi , xj )}ni,j=1 . It follows from (5.58) that
n 2 E Gx1 Gxn exp i=1 λi Gxi /2 1,n
n = {Σ} (8.17) 2 /2 E exp λ G i x i=1 i = {(I − ΣΛ)−1 Σ}1,n n = {(I − ΣΛ)−1 }1,j u(xj , 0), j=1
= (I − ΣΛ)−1 Σ and 1t is the transpose of the row vector where Σ 1 = (1, . . . , 1). On the other hand, by (3.254), we see that n n xi x,0 u(0, x)E exp λi L∞ = {(I − ΣΛ)−1 }1,j u(xj , 0). (8.18) i=1
j=1
Comparing this with (8.17) gives (8.16).
366
Isomorphism Theorems
There is some similarity between this proof and the one in Rogers and Williams (2000b), which they refer to as a “caricature” of Dynkin’s Isomorphism Theorem. Remark 8.1.4 We now present a relationship that combines Theorems 8.1.1 and 8.1.3. To simplify the expressions we make the following change of notation. Let ν be a finite discrete measure on S of the form n i=1 λi δxi ( · ), where the λi are such that the relationships given exist. In this notation we write (8.16) as
E Gx Gy exp G2r dν(r)/2 r x,y
L∞ dν(r) = u(x, y)E exp E exp G2r dν(r)/2 (8.19) and (8.6) as 1,y ν(dy), Lr∞ dν(r) = 1 + Σ (8.20) E x exp where x=x1 . Therefore, by (8.19) and (8.17), E x exp Lr∞ dν(r) (8.21) =1+ E x,y exp Lr∞ dν(r) u(x, y) dν(y). Here we use {Ls∞ , s ∈ S} to indicate the total accumulated local time of the Markov process under consideration. Which process that is is indicated by the probability measure. Thus, on the left-hand side of (8.21) we are considering the total accumulated local times of a strongly symmetric Borel right process X with 0-potential density u(x, y), whereas on the right-hand side of (8.21) we are considering the total accumulated local times of a family of h-transforms of X with h(y) = u(x, y)/u(x, x).
8.1.1 The Generalized First Ray–Knight Theorem Theorem 8.1.1 applied to the local times of processes killed at T0 , where 0 is some fixed point in S, gives a generalization of the First Ray–Knight Theorem. Recall that, by Lemma 3.8.1, if P x (T0 < ∞) > 0 for all x ∈ S, then uT0 (x, y) is finite and positive definite. The next theorem is simply a special case of Corollary 8.1.2. Theorem 8.1.5 (Generalized First Ray–Knight Theorem) Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential density uα (x, y) and state space S. Assume that
8.1 Isomorphism theorems of Eisenbaum and Dynkin
367
P x (T0 < ∞) > 0 for all x ∈ S. Let L = {Lyt ; (y, t) ∈ S × R+ } denote ∞ the local times for X normalized so that E x ( 0 e−αt dLyt ) = uα (x, y). Let G = {Gy ; y ∈ S} denote the mean zero Gaussian process with covariance uT0 (x, y). Then, for any countable subset D ⊆ S, any x ∈ S, and any s = 0, ! 1 LyT0 + (Gy + s)2 ; y ∈ D , P x × PG 2 ! 1 Gx law = (Gy + s)2 ; y ∈ D , (1 + )PG . 2 s
(8.22)
Remark 8.1.6 We justify calling Theorem 8.1.5 the Generalized First Ray–Knight Theorem by showing that for Brownian motion it is equivalent to the classic First Ray–Knight Theorem (Theorem 2.6.3). Since both theorems describe the law of {LyT0 ; y ∈ R1 }, they must be equivalent, but we now show how the much simpler statement of Theorem 2.6.3 follows directly from (8.22). ∧ |y|) for xy > 0 and 0 Recall from (2.139) that uT0 (x, y) = 2(|x|√ otherwise. Thus G = {Gy ; y ∈ R+ } is { 2 By ; y ∈ R+ }, where { By ; y ∈ R+ } is standard Brownian motion. Then, using continuity we can write (8.22) for x > 0 and s > 0 as {LyT0 + (By + s)2 ; y ∈ R+ , P x × PB } Bx law )PB }. = {(By + s)2 ; y ∈ R+ , (1 + s
(8.23)
We simplify this expression by changing the measure. Let PBs denote the probability measure of Brownian motion starting at s > 0. Then we can rewrite (8.23) as law
{LyT0 + By2 ; y ∈ R+ , P x × PBs } = {By2 ; y ∈ R+ , (
Bx s )PB }. (8.24) s
As in Theorem 2.6.1, we first consider the restricted case where all y ∈ [0, x], so that (8.24) becomes law
{LyT0 + By2 ; y ∈ [0, x] , P x × PBs } = {By2 ; y ∈ [0, x] , (
Bx s )PB } (8.25) s
s > 0. We claim that in fact Bx s )PB }. s (8.26) To see this it suffices to show that for any 0 ≤ y1 < . . . < yn ≤ x and law
{LyT0 + By2 ; y ∈ [0, x] , P x × PBs } = {By2 ; y ∈ [0, x] , 1{T0 >x} (
Isomorphism Theorems
368
any λ1 , . . . , λn sufficiently small E
s
1{T0 <x} Bx exp
n
λi By2i
= 0.
(8.27)
i=1
Heuristically, this is an obvious consequence of the strong Markov property at T0 , since after hitting 0, Bx is symmetric about 0, and all the other terms in Byi are squared. We give the details at the conclusion of this remark. t denote Brownian motion killed at T0 and Ps denote the probLet X abilities with respect to this process. We can write (8.26) as {LyT0 + By2 ; y ∈ [0, x] , P x × PBs }
x X )Ps } s
law
2 ; y ∈ [0, x] , ( {X y
law
{Yy2 ; y ∈ [0, x] , PYs }, (8.28)
= =
where Yt is a three-dimensional Bessel process and for the last equality we use (4.185). We now note that under PBs , By2 is a BESQ1 (s2 ) process and under PYs , Yy2 is a BESQ3 (s2 ) process. Therefore, it follows from (8.28) and Theorem 14.2.2 that under P x , {LyT0 , y ∈ [0, x]} has the law of a BESQ2 (0) process. This gives us Theorem 2.6.1, the restricted form of the First Ray–Knight Theorem. To describe the evolution of LyT0 for y > x we note that, by (8.24), Bx s )PB }. s (8.29) ¯y , where B ¯y is an independent Brownian motion Writing Bx+y = Bx + B starting at 0, we can rewrite (8.29) as law
2 x s 2 {Lx+y T0 + Bx+y ; y ∈ R+ , P × PB } = {Bx+y ; y ∈ R+ , (
x s 0 ¯ 2 {Lx+y (8.30) ¯} T0 + (Bx + By ) ; y ∈ R+ , P × PB × PB law ¯y )2 ; y ∈ R+ , ( Bx )P s × P 0¯ } = {(Bx + B B B s ) law 2 x ¯y ) ; y ∈ R+ , P × P s × P 0¯ }, = {( LxT0 + Bx2 + B B B
where in the last step we use (8.29) again, but with y = 0. ¯y )2 is a BESQ1 (B 2 ) process Note that conditional on Bx , (Bx + B x $2 #) 0 x x 2 ¯y under PB¯ . Also conditional on Bx and LT0 , LT0 + Bx + B is a BESQ1 (LxT0 + Bx2 ) process under PB0¯ . Using these two observations in the first and third lines of (8.30) along with Theorem 14.2.2, we see 0 x that conditional on LxT0 , {Lx+y T0 , y ∈ R+ } has the law of a BESQ (LT0 ) x process under P . This together with the restricted form of the First Ray–Knight Theorem just proved gives us the equivalent formulation of
8.1 Isomorphism theorems of Eisenbaum and Dynkin
369
the First Ray–Knight Theorem given in the second paragraph of Theorem 2.6.3, except for the final statement about independence. This is obvious, as we point out in Remark 14.2.3. To prove (8.27) let Eωz denote the expectation operator for canonical Brownian motion on the space of continuous paths ω = {ω(t), t ≥ 0} starting at z. We first note that for any F ∈ FT0 × F
(8.31) Eωs F (ω, θT0 ω)1{T0 <∞} = Eωs Eω0 (F (ω, ω )) 1{T0 <∞} . When F = F1 F2 with F1 ∈ FT0 , F2 ∈ F, this is just the strong Markov property. Its extension to general functions F ∈ FT0 × F follows by a monotone class argument. Let f, g ∈ C(R+ ) and define T0 (f ) = inf{t | f (t) = 0}. Let 2 F (f, g) = exp 1{yi ≤T0 (f )} λi f (yi ) (8.32) i
g(x − T0 (f )) exp
1{yi >T0 (f )} λi g (yi − T0 (f )) . 2
i
Clearly F ∈ FT0 × F and F (B, θT0 B) = B(x) exp
n
2
λi B (yi ) .
(8.33)
i=1
Using this and (8.31) we see that the left-hand side of (8.27) is equal to s exp EB 1{yi ≤T0 } λi B 2 (yi ) H(x) , (8.34) i
where H(x)
# 0 (8.35) = EB B (x − T0 (B)) # $ $ exp 1{yi >T0 (B)} λi B 2 (yi − T0 (B)) 1{T0 (B)<x} . i
0 Using the fact that T0 (B), inside the expectation EB is a fixed constant, and the symmetry of Brownian motion starting at 0, we see that H(x) = 0, which gives (8.27).
Remark 8.1.7 (1) Dynkin’s Isomorphism Theorem (Theorem 8.1.3) cannot be used to study LT0 . This is because G(0), which appears on the righthand side of (8.14), is equal to 0.
370
Isomorphism Theorems
(2) The main result in Subsection 4.5.1, (4.185), is itself an isomort ; t ∈ (0, ∞)} dephism theorem. As in Subsection 4.5.1, let {X note Brownian motion, starting at x > 0, killed the first time it hits 0, and let {|Wt |; t ∈ (0, ∞)|} denote a three-dimensional Bessel process. Then it follows from (4.185) that ! ! law x u ; u ∈ (0, t], Xt Px , |Wu |; u ∈ (0, t], PW = X x
(8.36)
x where PW is the law of three-dimensional Brownian motion start starting at x. ing at x and Px is the law of X
8.2 The Generalized Second Ray–Knight Theorem Theorems 8.1.1, 8.1.3, and 8.1.5 are isomorphisms between processes but with changes of measure. It is much more convenient to have isomorphisms between processes in which the measures are the natural measures of the processes themselves. We can do this for processes that start and terminate at the same point in the state space. The results can be considered to be extensions of the classical Second Ray–Knight Theorem (Theorem 2.7.1) to general strongly symmetric Borel right processes. Recall that a strongly symmetric Borel right process X with continuous α-potential density for some (and hence all) α > 0 has local times {Lyt , (t, y) ∈ (R+ × S)} (see Theorem 3.6.3). Let T be a terminal time for X (see page 42) and τ the inverse local time of X at 0, (see (3.132)). Consider uT and uτ (λ) as defined in (2.138) and (2.146), where λ is an exponential random variable independent of X. In Lemma 3.10.2 we obtain moment generating functions for LT := {LxT , x ∈ S} and Lτ (λ) := {Lxτ(λ) , x ∈ S}. In the next lemma we show that when uT and uτ (λ) have a certain form we can obtain simple isomorphism theorems for LT and Lτ (λ) . Lemma 8.2.1 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential densities uα (x, y), α > 0. Let {Lyt , (t, y) ∈ (R+ ×S)} denote the local times of X and 0 denote some fixed element in S. Let T be a terminal time for X such that uT (x, y) = v(x, y) + γ,
(8.37)
where v(x, y) is a symmetric positive definite function with v(0, y) = 0 for all y ∈ S and γ > 0 is a constant. Let η = {ηx ; x ∈ S} denote the mean zero Gaussian process with covariance v(x, y) and Y be an exponential random variable with mean γ. Then, for any countable subset
8.2 The Generalized Second Ray–Knight Theorem
371
D ⊆ S, ! ! # √ $2 law LxT + 12 ηx2 ; x ∈ D, P 0 × Pη = 12 ηx + 2Y ; x ∈ D, Pη × PY . (8.38) Let τ be the inverse local time of X at 0 and let λ be an exponential random variable independent of (Ω, G, Gt , P x ). Then, if (8.37) holds with uT replaced by uτ (λ) ! 1 Lxτ(λ) + ηx2 ; x ∈ D, P 0 × Pη × Pλ (8.39) 2 ! 1 # $ 2 √ law ηx + 2Y ; x ∈ D, Pη × PY . = 2 Proof Let {η x ; x ≥ 0} be an independent copy of {ηx ; x ≥ 0} and let ξ and ξ be independent normal random variables with mean zero and variance γ that are independent of everything else. By (5.72), √ law {(ηx + ξ)2 + (η x + ξ)2 ; x ∈ D} = {ηx2 + (η x + 2Y )2 ; x ∈ D}. (8.40) Therefore, to obtain (8.38) it suffices to show that under the measure P 0 × Pη,η,ξ,ξ , . / 2
" 2 ηx + ξ η 2x ηx2 (ηx + ξ) law x LT + + ;x∈D = + ;x∈D . 2 2 2 2 (8.41) To prove (8.41) it suffices to show that for x1 = 0, and arbitrary x2 , . . . , xn , and sufficiently small λ1 , λ2 , . . . , λn , for all n, $ 2 n # n 2 E exp i=1 λi (ηxi + ξ) /2 . (8.42)
n E x1 exp λi LxTi = 2 /2 E exp λ η i x i=1 i i=1 By Lemma 5.2.1, the right-hand side of (8.42) is =
det(I − ΣΛ) , det(I − ΣΛ)
where Σi,j = v(xi , xj ), Σi,j = v(xi , xj ) + γ, and Λi,j = λi δi,j . By (8.37) and Lemma 3.10.2 n ( det(I − ΣΛ) xi x1 E exp , λi LT = det(I − ΣΛ) i=1
(8.43)
(8.44)
( i,j = Σi,j − Σ1,j = Σi,j since v(0, y) = 0 for all y ∈ S. Thus where Σ (8.43) and (8.44) are equal and (8.42) is established.
Isomorphism Theorems
372
The key step in the proof of (8.42) is Lemma 3.10.2, which also holds for uτ (λ) . Therefore the same proof as above gives (8.39). We prove our generalization of the Second Ray–Knight Theorem in two versions, first when 0 is recurrent for X and then when it is transient for X. Theorem 8.2.2 (Second Ray–Knight Theorem, recurrent case) Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential densities uα (x, y), α > 0. Let {Lyt , (t, y) ∈ (R+ × S)} denote the local times of X and 0 denote some fixed element in S. Let τ (t) = inf{s; L0s > t}. Assume that u(0, 0) = ∞ and that P x (T0 < ∞) > 0 for all x ∈ S. As usual, let uT0 (x, y) = E x (LyT0 ).
(8.45)
Let η = {ηx ; x ∈ S} denote the mean zero Gaussian process with covariance uT0 (x, y). Then, under the measure P 0 × Pη , for any t > 0 and countable subset D ⊆ S, ! ! √ 2 law Lxτ(t) + 12 ηx2 ; x ∈ D = 12 ηx + 2t ; x ∈ D . (8.46) Proof Let Y be an exponential random variable with mean γ that is independent of everything else. Since u(0, 0) = ∞, it follows from Lemma 3.8.4 that uτ (Y ) (x, y) = uT0 (x, y) + γ.
(8.47)
Consequently, we can apply(8.39) of Lemma 8.2.1 with v(x, y) = uT0 (x, y) to obtain that, for any countable subset D ⊆ S, # √ $2 law {Lxτ(Y ) + 12 ηx2 ; x ∈ D, PY0 × Pη } = { 12 ηx + 2Y ; x ∈ D, Pη × PY }. (8.48) This implies that EY0
Eη exp
n
λi
i=1
= EY Eη exp
i Lxτ (Y )
η2 + xi 2
n λi # i=1
2
ηxi +
(8.49) √
$2
2Y
for all (x1 , . . . , xn ) in S, for all n. Writing out the expectation with respect to Y , we get ∞ n ηx2i xi e−t/γ dt EP 0 Eη exp λi Lτ (t) + (8.50) 2 0 i=1
8.2 The Generalized Second Ray–Knight Theorem ∞ n √ $2 −t/γ λi # ηxi + 2t e = Eη exp dt. 2 0 i=1
373
Since this is true for all γ > 0, we see that n n # # # √ $2 $ η 2 $$ λi # i = L Eη exp ηxi + 2t L EP 0 Eη exp , λi Lxτ (t) + xi 2 2 i=1 i=1 (8.51) where L indicates Laplace transform. Using the fact that τ (t) and hence i are right continuous as a function of t, this shows that the moment Lxτ (t) generating functions of n n √ $2 η2 λi # i λi Lxτ (t) + xi and ηxi + 2t 2 2 i=1 i=1 are equal for all (x1 , . . . , xn ) in S, for all n. This gives us (8.46). Note that when 0 is transient for X, the 0-potential density of X is finite and u(x, 0)u(y, 0) (8.52) uT0 (x, y) = E x (LyT0 ) = u(x, y) − u(0, 0) by Lemma 3.8.1. The transient case is more subtle than the recurrent case because in the transient case L0∞ the total accumulated local time of X at 0 is finite. Consequently, L0τ (Y ) = L0∞ when Y > L0∞ . We deal with this by first considering the h-transform of X. Theorem 8.2.3 (Second Ray–Knight Theorem, transient case) Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous 0-potential densities u(x, y). Let {Lyt , (t, y) ∈ (R+ × S)} denote the local times of X and 0 denote some fixed element in S. Assume that hx = P x (T0 < ∞) > 0 for all x ∈ S. Let {ηx ; x ∈ S} denote the mean zero Gaussian process with covariance uT0 (x, y). Then, for any countable subset D ⊆ S, ! (8.53) Lxτ− (L0∞ ) + 12 ηx2 ; x ∈ D, P 0 × Pη ! √ 2 law = 12 ηx + hx 2ρ ; x ∈ D, Pη × Pρ , where ρ is an exponential random variable with mean u(0, 0) that is independent of {ηx ; x ∈ S}. Furthermore, for any t > 0, (8.54) Lxτ− (t∧L0∞ ) + 12 ηx2 ; x ∈ D, P 0 × Pη } ! # $ 2 % law = 12 ηx + hx 2(t ∧ ρ) ; x ∈ D, Pη × Pρ .
Isomorphism Theorems
374
denote the h-transform of X as described in Section 3.9. Proof Let X We are We use a tilde ( ) to distinguish objects associated with X. assuming that u(x, y) is the 0-potential density of X with respect to the reference measure m(y). Therefore, by (3.217), u (x, y) =
u(x, y) h(x)h(y)
(8.55)
with respect to the reference measure is the 0-potential density of X m(y), where dm(y) = h2 (y) dm(y). Recall that h(x) = u(x, 0)/u(0, 0); see (3.204). Consequently, u (x, 0) = u(0, 0) for all x ∈ S and u (x, 0) h(x) := Px (T0 < ∞) = = 1. u (0, 0)
(8.56)
Therefore, by Remark 3.8.3, u T0 (x, y)
= u (x, y) − = =
u (x, 0) u (0, y) u (0, 0)
(8.57)
u(x, y) − u(0, 0) h(x)h(y) uT0 (x, y) . h(x)h(y)
In particular, using (8.55) and the second line of (8.57), we obtain u (x, y) = u T0 (x, y) + u(0, 0).
(8.58)
Note that by (8.57), {ηx /h(x) ; x ∈ S} is a mean zero Gaussian process with covariance u T0 (x, y). Hence, by (8.38) with T ≡ ∞, x∞ + 1 (ηx /h(x))2 ; x ∈ D, P 0,0 × Pη } {L 2 √ 2 law 1 = { 2 ηx /h(x) + 2ρ ; x ∈ D, Pη × Pρ },
(8.59)
normalized so that xt ; (x, t) ∈ S × R+ } are the local times for X where {L ∞ y,0 −αt x e dLt = u (8.60) E α (y, x) ∀α ≥ 0. 0
Note that by Lemma 3.9.5, since L < ∞, P 0 almost surely, x ; x ∈ D, P 0,0 } law = {Lx∞ ◦ kL ; x ∈ D , P 0 }. {h2 (x)L ∞
(8.61)
Therefore, multiplying by h2 (x) in (8.59) and using Remark 3.9.7, we get (8.53).
8.2 The Generalized Second Ray–Knight Theorem
375
We now prove (8.54). Let Y be an exponential random variable with mean γ that is independent of X. By (3.192) we see that u (x, y) = u T0 (x, y) + γ , τ (Y )
(8.62)
where 1/γ = 1/γ + 1/u(0, 0). By Remark 5.2.8, Y ∧ ρ is an exponential random variable with mean γ . Note again that by (8.57), {ηx /h(x) ; x ∈ S} is a mean zero Gaussian process with covariance u T0 (x, y). Therefore, by (8.39) we see that x {L + 12 (ηx /h(x))2 ; x ∈ D, P 0,0 × Pη × PY } (8.63) τ (Y ) # $ 2 % law 1 = { 2 ηx /h(x) + 2(Y ∧ ρ) ; x ∈ D, Pη × PY × Pρ }. We can rewrite (8.63) as $2 # % law 1 1 2 x + η ; x ∈ D} = { + h(x) 2(Y ∧ ρ) ; x ∈ D}. {h2 (x)L η x 2 x 2 τ (Y ) (8.64) Therefore, as in the proof of Theorem 8.2.2, we have that for each t > 0, x + 1 ηx2 ; x ∈ D, P 0,0 × Pη } {h2 (x)L 2 τ (t) # $2 % law 1 = { 2 ηx + h(x) 2(t ∧ ρ) ; x ∈ D, Pη × Pρ }.
(8.65)
Using Lemma 3.9.6, we get (8.54). Remark 8.2.4 We have the following interesting variation of Theorem 8.2.2. As we pointed out in (8.41), an equivalent form of (8.48) is / .
2 " 2 ηx + ξ η 2x ηx2 (ηx + ξ) law x + ;x∈D = + ;x∈D , Lτ (Y ) + 2 2 2 2 (8.66) where {η x ; x ≥ 0} is an independent copy of {ηx ; x ≥ 0} and ξ and ξ are independent N (0, γ) random variables that are independent of everything else. By (5.72) and the strong Markov property, law
x
{Lxτ(Y ) ; x ∈ D, PY0 } = {Lxτ(ξ2 ) + Lτ (ξ2 ) ; x ∈ D, P 0 × Pξ,ξ },
(8.67)
where L and ξ are independent copies of L and ξ. Using this in (8.66), we get 2
ηx2 law (ηx + ξ) ; x ∈ D, P 0 × Pη × Pξ } = { ; x ∈ D, Pη × Pξ }. 2 2 (8.68) # % $2 2 law 2 Note that (ηx + ξ) = ηx + ξ . Taking the expectation in (8.68) {Lxτ(ξ2 ) +
Isomorphism Theorems
376
with respect to ξ 2 and considering it as a function of γ, we can derive (8.46) from (8.68). Example 8.2.5 Let X be a symmetric real-valued recurrent L´evy process with characteristic exponent ψ as in (4.63). We show in Theorem 4.2.4 that when X does not have a Gaussian component, uT0 (x, y) = φ(x) + φ(y) − φ(x − y), where φ(x) =
1 π
0
∞
1 − cos λx dλ ψ(λ)
x ∈ R1 .
(8.69)
(8.70)
We see in (5.256) that the mean zero Gaussian process η = {η(x), x ∈ R1 } with covariance uT0 (x, y) has stationary increments and η(0) = 0. When X is Brownian motion,√uT0 is as in (8.69) with φ(x) = 2|x|; see (2.139). In this case, η(x) = 2 B(x) + B(−x) , where B and B are independent standard Brownian motions on R+ . When ψ(λ) = |λ|p , 1 < p ≤ 2 in (4.63), X is a symmetric stable process and η is fractional Brownian motion (or Brownian motion when p = 2); see Examples 4.2.5 and 6.4.7. One can construct many examples of transient processes to apply Theorem 8.2.3. We might consider the processes just discussed but killed at the end of an independent exponential time with mean α. In this case, the 0-potential of the killed process is the α-potential of the original process 1 ∞ cos λx α u (x) = (8.71) dλ x ∈ R1 . π 0 α + ψ(λ) By Remark 3.8.3, u T0 (x, y) = uα (x − y) −
uα (x)uα (y) uα (0)
(8.72)
(we use to distinguish this from (8.69)). Let G = {G(x), x ∈ R1 } be a mean zero Gaussian process with covariance uα (x, y). Then η(x) = G(x) −
uα (x) G(0), uα (0)
(8.73)
the orthogonal compliment of the projection of G(x) on G(0). Here is an another example of a transient process that is a little more esoteric. Once again we consider the processes discussed in the first paragraph of this example but, this time it is killed the first time it
8.2 The Generalized Second Ray–Knight Theorem
377
hits some fixed element a in R1 , where a = 0. Following the proof of Theorem 4.2.4, we see that (4.85) holds with 0 = a and uTa (x, y) = φ(x − a) + φ(y − a) − φ(x − y)
(8.74)
for φ as given in (8.70). In this case, by Remark 3.8.3, as in (8.72), uT0 (x, y) = uTa (x, y) −
uTa (x, 0) uTa (y, 0) 2φ(a)
(8.75)
since uTa (0, 0) = 2φ(a). Let {η(x), x ∈ R1 } be a mean zero Gaussian process with covariance uT0 (x, y) as in the first paragraph. Then, clearly {η(a − x), x ∈ R1 } is a mean zero Gaussian process with covariance uTa (x, y). The mean zero Gaussian process with covariance uT0 (x, y) is η(x) = η(a − x) −
uTa (x, 0) η(a). 2φ(a)
(8.76)
Theorem 8.2.3 holds with η replaced by η and with h(x) = uTa (x, 0)/(2 φ(a)). Perhaps it is more useful in this case to give an equivalent form of (8.53), which is (η )2 η 2x + x ; x ∈ D, P 0 × Pη,η } (8.77) 2 2 2 (η )2 law ηa−x = { + a−x ; x ∈ D, Pη,η }, 2 2 where η and η are independent copies of η and η (see (8.41)). We remind the reader that we are considering L´evy processes without a Gaussian component, so these results do not apply to Brownian motion. It is easy to give corresponding results for the local times of Brownian motion. We include them in the next subsection, where we consider the more general case of recurrent diffusions. We consider transient diffusions in Chapter 12. One interesting difference between (8.77) and its counterpart for Brownian motion is that for Brownian motion, if a > 0, there no local time for x > a. That is not the case in (8.77) since we are considering pure jump process, which consequently can take values greater than a without hitting a (in (8.77), Laτ− (L0 ) = 0). {Lxτ− (L0∞ ) +
∞
8.2.1 Complete Second Ray–Knight Theorem for recurrent diffusions Diffusions, which are Borel right processes with continuous paths, are introduced in Section 4.3. For diffusions we can give a version of Theorem 8.2.2 that does not require that the process starts and stops at the
378
Isomorphism Theorems
same point in its state space. What is needed to extend Theorem 8.2.2 is to describe the local time starting at some point other than 0, and killed the first time it hits 0. Let X be a recurrent diffusion in R1 with continuous α-potential densities. Assume that P x (Ty < ∞) = 1 for all x, y ∈ R1 . As in (2.138), let uT0 (y, x) = E y (LxT0 ). We show in Lemma 4.3.2 that uT0 (x, x) ∧ uT0 (y, y) uT0 (x, y) = 0
(8.78)
xy ≥ 0 xy < 0,
(8.79)
where uT0 (x, x) is strictly increasing in |x|. Theorem 8.2.6 (First Ray–Knight Theorem for Diffusions ) Let {Lyt , (t, y) ∈ (R+ × S)} denote the local times of Xand let y > 0. Let ρ(x) = uT0 (x, x)/2. Let {Br , r ∈ R+ } and {B r , r ∈ R+ } be independent standard Brownian motions starting at 0. Then, under P y × PB,B , ! 2 2 (8.80) LrT0 + (Bρ(r)−ρ(y) + B ρ(r)−ρ(y) )1{r≥y} : r ∈ R+ ! 2 law 2 = Bρ(r) + B ρ(r) : r ∈ R+ on C(R+ ). Proof The proof is the same as the proof of Theorem 2.6.3 for Brownian motion. By Lemma 3.10.2 (which is an immediate generalization of Lemma 2.6.2), n ( det((I − ΣΛ)) xi xl E exp λi LT0 = , (8.81) det(I − ΣΛ) i=1 where, by (8.79), Σi,j = 2(ρ(xi ) ∧ ρ(xj )), i, j = 1, . . . n and ( i,j = 2 ((ρ(xi ) ∧ ρ(xj )) − (ρ(xl ) ∧ ρ(xj ))) Σ
i, j = 1, . . . , n. (8.82)
The proof proceeds exactly as the proof of Theorem 2.6.3 with ρ(xi ) replacing xi , i = 1, . . . , n. For diffusions we can generalize the Second Ray–Knight Theorem so that the starting point is not necessarily 0. We use the word “complete” to describe this. Theorem 8.2.7 (Complete Second Ray–Knight Theorem for Diffusions) Let {Lyt , (t, y) ∈ (R+ ×S)} denote the local times of X and
8.2 The Generalized Second Ray–Knight Theorem
379
let y > 0. Let {Br , r ∈ R+ } be independent standard Brownian motions starting at 0. Then, under P y × PB,B , 2
2 {Lrτ (t) + (Bρ(r)−ρ(y) )1{r≥y} + (B ρ(r)−ρ(y) )1{r≥y} : r ∈ R+ } (8.83) √ 2 law = {B ρ(r) + (Bρ(r) + t )2 : r ∈ R+ }
on C(R+ ) (recall that τ (t) := inf{s | L0s > t}). √ Note that { 2Bρ(r) ; r ∈ R+ } is a mean zero Gaussian process with covariance uT0 (x, y). Therefore, since ρ(0) = 0, this is Theorem 8.2.2 2 when y = 0 (cancel B ρ(r) from each side). To be more precise, it is Theorem 8.2.2 restricted to R+ . However, (8.80) also holds on the negative half-line with independent copies of everything. Combined, the two parts give Theorem 8.2.2. √ Proof By (2.139), { 2Bρ(r) , r ∈ R+ } is a mean zero Gaussian process with covariance √ uT0 (r, s). Therefore, (8.48) holds with {ηx , x ∈ 2D} replaced by { 2Bρ(r) , r ∈ R+ }. Adding an independent copy of Bρ(r) to each side of the equality in law, we see that under P 0 × PB,B,Y , √ 2 2 law 2 {Lrτ (Y ) + Bρ(r) + B ρ(r) : r ∈ R+ } = {B ρ(r) + (Bρ(r) + Y )2 : r ∈ R+ } (8.84) on C(R+ ), where Y is an exponential random variable with mean γ, which is independent of everything else. (We can pass from the countable set D to R+ because everything is continuous.) By the strong Markov property, this implies that under P y × PB,B,Y , √ 2 2 law 2 +B ρ(r) : r ∈ R+ } = {B ρ(r) +(Bρ(r) + Y )2 : r ∈ R+ }. {Lrτ (Y ) ◦θT0 +Bρ(r) (8.85) r r By the strong Markov property, LT0 and Lτ (Y ) ◦ θT0 are independent. We now substitute an independent copy of the left-hand side of (8.80) 2 2 for the term Bρ(r) + B ρ(r) on the left-hand side of (8.84) to get that, under P y × PB,B,Y , 2
2 {LrT0 + Lrτ (Y ) ◦ θT0 + (Bρ(r)−ρ(y) )1{r≥y} + (B ρ(r)−ρ(y) )1{r≥y} : r ∈ R+ } √ 2 law = {B ρ(r) + (Bρ(r) + Y )2 : r ∈ R+ }. (8.86)
Using the additivity of local time, Lrτ (Y ) = LrT0 + Lrτ (Y ) ◦ θT0 , we then have that under P y × PB,B,Y , 2
2 {Lrτ (Y ) + (Bρ(r)−ρ(y) )1{r≥y} + (B ρ(r)−ρ(y) )1{r≥y} : r ∈ R+ } √ 2 law = {B ρ(r) + (Bρ(r) + Y )2 : r ∈ R+ }. (8.87)
380
Isomorphism Theorems
Following the proof of Theorem 8.2.2, starting from (8.48), we get (8.83).
We can get a more of (8.83) by writing its right-hand % symmetric form% side as {(Bρ(r) + t/2 )2 + (B ρ(r) + t/2 )2 : r ∈ R+ }.
8.3 Combinatorial proofs Dynkin’s proof of Theorem 8.1.3 uses a combinatorial argument and a calculation of moments modeled on “Feynman diagrams.” Eisenbaum’s proof of Theorem 8.1.1 also follows this approach. Furthermore, the initial approach to the Generalized Second Ray–Knight Theorem was also combinatoric. In this section we present these proofs, both for their intrinsic interest and because understanding them may be fruitful in future research.
8.3.1 Dynkin Isomorphism Theorem We first show that n n 2 G2x G G G x 0 x x,0 x i i ¯ ∞i + E EG = EG L 2 u(x, 0) i=1 2 i=1 for any x1 , . . . , xn ∈ S, not necessarily distinct. Using Lemma 5.2.6 we see that n Gvi Gvi 1 EG Gx G0 = n 2 2 i=1
(8.88)
u(D1 ) · · · u(Dn+1 ),
D=(D1 ,...,Dn+1 )
(8.89) where the sum is over all pairings D = (D1 , . . . , Dn+1 ) of the 2n + 2 indices {vi }ni=1 , {vi }ni=1 , x, and 0 and, for example, u({vi , vj }) = u(vi , vj ) and u({vi , 0}) = u(vi , 0). We now rewrite the right-hand side of (8.89). The reader should keep in mind that, eventually, we will set vi = vi = xi and obtain the left-hand side of (8.88). Given a specific pairing D, we create an ordering of a certain subset of the elements of D, starting with the element of D that contains x and ending with the element of D that contains 0, in the following way: Let D(1) be the unique element of D that contains x. If x is paired with either vi or vi , set π(1) = i and define (yπ(1) , zπ(1) ) to be (vi , vi ) if x is paired with vi , but (vi , vi ) if x is paired with vi . Next let D(2) be the unique element of D that contains zπ(1) . If zπ(1) is paired with either vj or vj , set π(2) = j and define (yπ(2) , zπ(2) ) to be (vj , vj ) if zπ(1) is paired
8.3 Combinatorial proofs
381
with vj or (vj , vj ) if zπ(1) is paired with vj . We proceed in this manner, getting D(1) , . . . , D(l) until we get to D(l+1) , the unique element of D that contains zπ(l) and 0. Clearly l ≤ n, but it is important to note that l < n is often the case. Let C(D) = {π(1), . . . , π(l)} and D = (D(1) , . . . , D(l+1) ). Clearly D is a pairing of the 2l + 2 elements {vi }i∈C(D) , {vi }i∈C(D) , x, and 0. Let B(D) = {1, . . . , n}/C(D) and D = D/D . It is also clear that D is a pairing of the set of 2(n − l) indices consisting of {vi }i∈B(D) and {vi }i∈B(D) . Taking this into account, we see that we can rewrite (8.89) as n Gui Gvi (8.90) EG Gx G0 2 i=1 =
1 2n
×
B∪C={1,...,n}
pairings of
u(B1 ) · · · u(B|B| )
{vi }i∈B ∪{vi }i∈B
u(x, yπ(1) )u(zπ(1) , yπ(2) ) · · · u(zπ(i) , yπ(i+1) ) · · · u(zπ(|C|) , 0),
where the last sum is over all permutations (π(1), . . . , π(|C|)) of C, and ) to (yπ(i) , zπ(i) ). Of course there over all ways of assigning (vπ(i) , vπ(i) |C| are 2 ways to make these assignments. Thus, if we set vi = vi = xi , the last sum in (8.90) is u(x, xπ(1) )u(xπ(1) , xπ(2) ) · · · u(xπ(i) , xπ(i+1) ) · · · u(xπ(|C|) , 0), 2|C| π(C)
(8.91) where now the sum is over all permutations π of C. Using (5.62), we see that u(B1 ) · · · u(B|B| ) = EG Gvi Gvi .
(8.92)
i∈B
pairings of
{vi }i∈B ∪{vi }i∈B
Therefore, setting vi = vi = xi in (8.90) and using (8.91) and (8.92), we have n G2x G2xi i = (8.93) EG Gx G0 EG 2 2 i=1 i∈B B∪C={1,...,n} u(x, xπ(1) )u(xπ(1) , xπ(2) ) · · · u(xπ(|C|) , 0). π(C)
Isomorphism Theorems
382
The left-hand side of (8.88) is n G2xi x,0 xi ¯ E EG L∞ + 2 i=1 G2x x,0 x i ¯ ∞i , = L EG E 2 B∪C={1,...,n}
i∈B
(8.94)
i∈C
and by (3.248) we see that (8.93) and u(0, x) times (8.94) are identical. This establishes (8.88). n Let z1 , . . . , zn be fixed and let µ1 and µ2 be the measures on R+ defined by G2z1 G2zn x,0 z1 zn ¯ ¯ , . . . , L∞ + (8.95) F ( · ) dµ1 = E EG F L∞ + 2 2 and
F ( · ) dµ2 = EG
Gx G0 F u(x, 0)
G2z1 G2 , . . . , zn 2 2
(8.96)
n for all bounded measurable functions F on R+ . The measure µ1 is determined by its characteristic function n x,0 zi 2 ¯ λi (L∞ + Gzi /2 . (8.97) ϕ1 (λ1 , . . . , λn ) = E EG exp i i=1
For λ1 , . . . , λn fixed, ϕ1 (λ1 , . . . , λn ) is determined by the distribution n ¯ z∞i + G2 /2). function of the real-valued random variable ξ = i=1 λi (L zi ∞ k Let µ2k denote the 2k-th moment of ξ. If k=1 µ2k t /(2k)! converges for t ∈ [0, δ] for some δ > 0, then the distribution function of ξ is uniquely determined by its moments (see, e.g., Feller (1971, page 224)). Considn ering (8.88), we see that this sum converges if EG (exp( i=1 si G2zi )|Gx | |G0 |) < ∞ for sufficiently small si > 0, i = 1, . . . , n. By repeated use of the Schwarz inequality, it is easy to see that this is the case. Hence the measure µ1 is uniquely determined by the moments of ξ or, equivalently, by the terms in the left-hand side of (8.88). Set n Gx G0 exp i . (8.98) ϕ2 (λ1 , . . . , λn ) = EG λi G2zi u(x, 0) i=1 We see by (8.88) and the above argument that ϕ1 (λ1 , . . . , λn ) = ϕ2 (λ1 , . . . , λn ). Hence µ1 = µ2 , so (8.95) and (8.96) give (8.15). Note that, although it is not clear to begin with that µ2 is a positive measure, this argument shows that it is.
8.3 Combinatorial proofs
383
8.3.2 Eisenbaum Isomorphism Theorem As in the combinatorial proof of the Dynkin Isomorphism Theorem, it suffices to show that n (Gxi + s)2 x xi E EG (8.99) L∞ + 2 i=1 n Gx (Gxi + s)2 1+ = EG s 2 i=1 for any x1 , . . . , xn ∈ S, not necessarily distinct, and all n. We show below that n Gx (Gxi + s)2 (8.100) EG s i=1 2 (Gx + s)2 i = EG 2 B∪C={1,...,n}
C =∅
i∈B
u(x, xπ(1) )u(xπ(1) , xπ(2) ) · · · u(xπ(|C|−1) , xπ(|C|) ),
π(C)
where the last sum is over all permutations π of C (note the similarity of this equation and (8.93)). It follows immediately from (8.100) that n Gx (Gxi + s)2 EG 1+ (8.101) s 2 i=1 (Gx + s)2 i = EG 2 i∈B B∪C={1,...,n} u(x, xπ(1) )u(xπ(1) , xπ(2) ) · · · u(xπ(|C|−1) , xπ(|C|) ) π(C)
if we define the last sum to be identically 1 when C = ∅. The left-hand side of (8.99) is n (Gxi + s)2 x xi E EG (8.102) L∞ + 2 i=1 (Gx + s)2 i x xi E = EG L∞ , 2 B∪C={1,...,n}
i∈B
i∈C
and by (3.242) we see that the sum on the right-hand side of (8.102) is identical to (8.101). This establishes (8.99).
Isomorphism Theorems
384
To prove (8.100), we first expand n Gx EG (Gvi + s)(Gvi + s) = EG Gx Gr s2n−(|G|+1) , s i=1 G
r∈G
(8.103) where the sum runs over all possible subsets G of {vi , vi , i = 1, . . . , n}. The reader should keep in mind that, eventually, we will set vi = vi = xi . Note that |G| must be odd for a summand on the right-hand side of (8.103) to be nonzero. Let mG = (|G| + 1)/2. Using Lemma 5.2.6, we see that E G Gx Gr = u(D1 ) · · · u(DmG ), (8.104) r∈G
D=(D1 ,...,DmG )
where the sum is over all pairings D = (D1 , . . . , DmG ) of the 2mG indices G∪{x}, and for example, u({vi , vj }) = u(vi , vj ) and u({vi , x}) = u(vi , x). We now rewrite the right-hand side of (8.104). As in the combinatorial proof of the Dynkin Isomorphism Theorem, we divide each pairing into two sets. In the first set, we start with the pairing containing x and continue with the pairings as in the scheme of the previous proof, except now we stop according to a different rule, that is, let D(1) be the unique element of D that contains x. If x is paired with either vi or vi , set π(1) = i. If both vi and vi are in G, define (yπ(1) , zπ(1) ) to be (vi , vi ) if x is paired with vi , but (vi , vi ) if x is paired with vi . If only one of the indices vi and vi are in G, let yπ(1) be that index and proceed to the next paragraph. Otherwise, let D(2) be the unique element of D that contains zπ(1) . If zπ(1) is paired with either vj or vj , set π(2) = j. If both vj and vj are in G, define (yπ(2) , zπ(2) ) to be (vj , vj ) if zπ(1) is paired with vj , or (vj , vj ) if zπ(1) is paired with vj . If only one of vj and vj is in G, let yπ(2) be that index and proceed to the next paragraph. We proceed in this manner, getting D(1) , . . . , D(l) until our procedure stops. Clearly l ≤ mG , but it is important to note that l < mG is often the case. Let C = C(D) = {π(1), . . . , π(l)} and D = (D(1) , . . . , D(l) ). Clearly D is a pairing of the 2l elements x, yπ(i) , zπ(i) , i = 1, . . . , l − 1, and yπ(l) . Let G = {yπ(i) , zπ(i) , i = 1, . . . , l − 1} ∪ {yπ(l) }. Set G = G/G and D = D/D . Clearly D is a pairing of the set of indices in G . We see that we can rewrite (8.103) as n Gx (Gvi + s)(Gvi + s) (8.105) EG s i=1
8.3 Combinatorial proofs =
G
G ∪G =G G =∅
×
385
2n−(|G|+1) u(B1 ) · · · u(B|G |/2 ) s
pairings of
#
G
$ u(x, yπ(1) )u(zπ(1) , yπ(2) ) · · · u(zπ(l−1) , yπ(l) )
where the last sum is over all permutations (π(1), . . . , π(|C|)) of C and } to {yπ(i) , zπ(i) }. over all ways of assigning {vπ(i) , vπ(i) Let B = {1, . . . , n}/C. The sum in the large parentheses in (8.105) runs over all possible subsets G of {vi , vi , i ∈ B}. As in (8.103), we have (Gvi + s)(Gvi + s) (8.106) EG i∈B
=
EG
G
=
r∈G
EG
G
s2|B|−|G
Gr
|
Gr
s2n−(|G|+1) ,
r∈G
where, for the last line, we use the facts that |B| + |C| = n, |G | + |G | = |G| and |G | = 2|C| − 1. Therefore, using (5.62) again, we see that (Gvi + s)(Gvi + s) (8.107) EG i∈B
=
G
2n−(|G|+1) u(B1 ) · · · u(B|G |/2 ) . s
pairings of G
Thus (8.105) can be written as n Gx (Gvi + s)(Gvi + s) EG s i=1 = EG (Gvi + s)(Gvi + s) B∪C={1,...,n}
C =∅
×
#
(8.108)
i∈B
$ u(x, yπ(1) )u(zπ(1) , yπ(2) ) · · · u(zπ(l−1) , yπ(l) ) ,
where the last sum is over all permutations (π(1), . . . , π(|C|)) of C and over all ways of assigning {vπ(i) , vπ(i) } to {yπ(i) , zπ(i) }. Of course there
Isomorphism Theorems
386
are 2|C| ways to make these assignments. Thus, if we set vi = vi = xi , the last sum in (8.108) is u(x, xπ(1) )u(xπ(1) , xπ(2) ) · · · u(xπ(l−1) , xπ(l) ), (8.109) 2|C| π(C)
where now the sum is over all permutations π of C. Finally, setting vi = vi = xi in (8.108) establishes (8.100).
8.3.3 Generalized Second Ray–Knight Theorem (recurrent case) Under the hypotheses and notation of Theorem 8.2.2, it suffices to show that, for all t ≥ 0, n n √ (ηx + 2t)2 ηx2i xi i 0 E Eη ) = Eη (8.110) (Lτ (t) + 2 2 i=1 i=1 for an arbitrary sequence x1 , x2 , . . . , xn of (not necessarily distinct) elements in S and any n, as in the previous two proofs. To do this we need to develop some material on combinatorics. Fix n and a sequence x1 , x2 , . . . , xn of (not necessarily distinct) elements in S. By a 2+k chain on [n] := {1, 2, . . . , n} we mean an unordered pair {i1 , i2 } of integers in [n] referred to as the endpoints together with an unordered set {j1 , j2 , . . . , jk } of k integers in [n] referred to as the intermediate points. It is assumed that the 2+k elements (i1 , i2 , j1 , j2 , . . . , jk ) are distinct. We write such a chain as (i1 , i2 ; j1 , j2 , . . . , jk ). Here we allow k = 0, 1, . . . , n − 2. (Note that two chains are the same if they have the same endpoints and the same intermediate points.) In addition, we also refer to a 1–tuple (i) as a trivial chain. Let (i1 , i2 ; j1 , j2 , . . . , jk ) be a 2 + k chain. We define ch(i1 , i2 ; j1 , j2 , . . . , jk ) (8.111) =2 uT0 (xi1 , xjσ(1) )uT0 (xjσ(1) , xjσ(2) ) · · · uT0 (xjσ(k) , xi2 ) σ
=
uT0 (xi1 , xjσ(1) )uT0 (xjσ(1) , xjσ(2) ) · · · uT0 (xjσ(k) , xi2 )
σ
+
uT0 (xi2 , xjσ(1) )uT0 (xjσ(1) , xjσ(2) ) · · · uT0 (xjσ(k) , xi1 ),
σ
where the sum runs over all permutations σ of {1, 2, . . . , k}. For a trivial chain (i) we simply set ch(i) = 1. The use of trivial chains helps us with the “bookkeeping.”
8.3 Combinatorial proofs Lemma 8.3.1 (The Chain Decomposition) For all t ≥ 0, n |C| xi 0 E Lτ (t) = t|C| ch(Cj ), i=1
C∈C(1,...,n)
387
(8.112)
j=1
where the sum runs over the set C(1, . . . , n) of all partitions C = {C1 , . . . , C|C| } of [n] into an unordered collection C1 , . . . , C|C| of chains. Proof Let λ be an exponential random variable with mean γ that is independent of X. By Lemma 3.8.4, uτ (λ) (x, y) = uT0 (x, y) + γ.
(8.113)
By (3.242) we have n n ∞ 1 −t/γ 0 xi xi 0 (8.114) e E Lτ (t) dt = E Lτ (λ) γ 0 i=1 i=1 uτ (λ) (0, xπ(1) )uτ (λ) (xπ(1) , xπ(2) ) · · · uτ (λ) (xπ(n−1) , xπ(n) ) = π n
uT0 (xπ(j−1) , xπ(j) ) + γ γ = π
=
j=2
π
A,B
1∈B
uT0 (xπ(j−1) , xπ(j) ) γ |B| ,
j∈A
where we use the fact that uT0 (0, x) = 0 for all x. The last sum runs over all partitions A, B of [n] with 1 ∈ B. Fix π and a partition A, B of [n] with 1 ∈ B. With each l ∈ B we associate a chain C(l) as follows. If l is the largest element in B, set l = n. Otherwise, let l be the next largest element of B after l. If l = l + 1, we take C(l) to be the trivial chain C(l) = (π(l)). Otherwise, we set C(l) = (π(l), π(l − 1); π(l + 1), . . . , π(l − 2)). Thus we can write uT0 (xπ(j−1) , xπ(j) ) (8.115) j∈A
=
uT0 (xπ(l) , xπ(l+1) ) · · · uT0 (xπ(l −2) , xπ(l −1) ).
l∈B
C(l) nontrivial
In this way, to each permutation π and B ⊆ [n] with 1 ∈ B we associate a partition C = {C(l)}l∈B ∈ C(1, . . . , n). How many permutations π will give rise to the same partition C = {C1 , . . . , C|C| }? Clearly we can permute the endpoints and intermediate points in each chain Cj , but in addition we can also permute the |C| chains C1 , . . . , C|C| . Note
Isomorphism Theorems
388
that by (8.111), the sum of the right-hand side of (8.115) over all permutations of the endpoints and intermediate points in each chain C(l) |B| is l=1 ch(C(l)). Here we also use the fact that ch(C(j)) = 1 for a trivial chain. Finally, since there are |C|! ways of permuting the chains C1 , . . . , C|C| , we get n ∞ 1 −t/γ 0 xi (8.116) e E Lτ (t) dt γ 0 i=1
=
|C|!γ
|C|
C∈C(1,...,n)
∞
= 0
1 −t/γ e γ
|C|
ch(Cj )
j=1
t|C|
C∈C(1,...,n)
|C|
ch(Cj ) dt,
j=1
from which the lemma follows. Proof of Theorem 8.2.2 We use Lemma 8.3.1 to obtain (8.110). We begin by writing n √ ηxi + 2t 2 Eη (8.117) 2 i=1 n ηx2 √ i + ηxi 2t + t = Eη 2 i=1 ηx2 # √ $ i ηxj 2t t|D| , = Eη 2 A,B,D
i∈A
j∈B
where the sum runs over all partitions A∪B ∪D = {1, 2, . . . , n}. Because the odd moments of a mean zero normal random variable are all zero, a summand on the right-hand side of (8.117) is zero unless |B| is even. Thus we can write the right-hand side of (8.117) as ηx2 i Eη ηxj 2|B|/2 t|D|+|B|/2 . (8.118) 2 A,B,D |B| even
i∈A
j∈B
We claim that when |B| is even, ηx2 i ηxj 2|B|/2 Eη 2 i∈A
j∈B
(8.119)
=
8.3 Combinatorial proofs |B|/2 ηx2 i , ch(Cj ) Eη 2 j=1
A ,A C∈C (B; A )
389
i∈A
where the first sum runs over all partitions A ∪ A = A and the second sum runs over the collection C (B; A ) of all partitions C = {C1 , . . . , C|B|/2 } of A ∪ B into |B|/2 nontrivial chains C1 , . . . , C|B|/2 with endpoints from B and intermediate points from A . Indeed, when |B| = 2, this is precisely (8.93). The proof for general |B| is a straightforward extension of the proof of (8.93). Combining (8.117)–(8.119) we have n √ (ηx + 2t)2 i (8.120) Eη 2 i=1 |B|/2 ηx2 i = t|D|+|B|/2 ch(Cj ) Eη 2 j=1 A ,A ,B,D
|B| even
=
i∈A
C∈C (B; A )
F ⊆{1,...,n} C∈C(F )
|C|
t
|C| j=1
ch(Cj )Eη
ηx2 i 2 c
,
i∈F
where now the first sum runs over all subsets F ⊆ {1, . . . , n} and the second sum runs over the collection C(F ) of all partitions C = {C1 , . . . , C|C| } of F into chains C1 , . . . , C|C| (we include trivial chains that correspond to the elements in D). Using Lemma 8.3.1, we now see that n √ ηxi + 2t 2 Eη (8.121) 2 i=1 ηx2 xi 0 i = E Lτ (t) Eη 2 i∈F i∈F c F ⊆{1,...,n} n η2 i = E 0 Eη , Lxτ (t) + xi 2 i=1 which gives (8.110). Remark 8.3.2 Assume that instead of (8.113) we have uτ (λ) (x, y) = v(x, y) + γ,
(8.122)
where v(x, y) is a symmetric positive definite function with v(0, y) = 0 for all y ∈ S and λ is an exponential random variable with mean α
Isomorphism Theorems
390
that is independent of X. In this case the proof of Lemma 8.3.1 yields (8.116) but with γ on the left-hand side replaced by α. If we let Y be an exponential random variable with mean γ, this can be written as Eλ0
n
i Lxτ (λ)
= EY
i=1
Y |C|
C∈C(1,...,n)
|C|
ch(Cj ) .
(8.123)
j=1
Now let η = {ηx ; x ∈ S} be a mean zero Gaussian process with covariance v(x, y) that is independent of Y and λ. The proof of Theorem 8.2.2, just above, shows that Eλ0 Eη
n
i=1
i (Lxτ (λ) +
ηx2i 2
)
= Eη, Y
# √ $2 n η + 2Y x i , 2 i=1
(8.124)
which gives (8.39). The proof of the Generalized Second Ray–Knight Theorem in the transient case (8.54) uses only (8.39) and some facts about h-transforms.
8.4 Additional proofs We show how the Generalized Second Ray Knight Theorems can be obtained from Dynkin’s and Eisenbaum’s Isomorphism Theorems. To begin, we note the following lemma, which is simply a reformulation of (8.53). denote its Lemma 8.4.1 Let X be as given in Theorem 8.2.3 and let X x h-transform for h(x) = u(x, 0)/u(0, 0). Let L = {Lt , (x, t) ∈ S × R+ } normalized so that denote the local times of X # y $ u(x, y)h(y) E x L∞ = h(x)
(8.125)
(see Remark 3.9.3). Let G = {Gx , x ∈ S} be a mean zero Gaussian process with covariance u(x, y) and η = {ηx , x ∈ S} be a mean zero Gaussian process with covariance uT0 (x, y) = u(x, y) −
u(x, 0)u(y, 0) . u(0, 0)
(8.126)
8.4 Additional proofs
391
Let G and η be independent copies of G and η. Then, for any countable subset D ⊆ S, 2 ! ! G2 x η 2x Gx ηx2 law x 0,0 + ; x ∈ D, P × Pη,η = + ; x ∈ D, PG,G . L∞ + 2 2 2 2 (8.127) Equivalently, let ρ be an exponential random variable with mean u(0, 0). Then x ! ! (η + h(x)√2ρ )2 η2 law x ; x ∈ D, Pη × Pρ . L∞ + x ; x ∈ D, P 0,0 × Pη = 2 2 (8.128)
Proof We obtain (8.128) by multiplying each side of (8.59) by h2 (x) and using (3.219). We have often pointed out that (8.127) and (8.128) are equivalent. Lemma 8.4.1 is simply a restatement of (8.53). It the context of Section 8.2, (8.53) seems like one of several different formulations of the Generalized Second Ray–Knight Theorems. Actually, all the results in Section 8.2 can be obtained from (8.53) or, equivalently, Lemma 8.4.1. We do not pursue this point here. What we do is show how to derive (8.127) or, equivalently, (8.128) from the Dynkin and Eisenbaum Isomorphism Theorems. It is remarkable how easily (8.128) follows from the Dynkin Isomorphism Theorem. To see this, we note the following lemma. Lemma 8.4.2 Let G = {Gx , x ∈ S} be a mean zero Gaussian process with covariance u(x, y). Let η = {ηx , x ∈ S} be a mean zero Gaussian 3 process with covariance given by (8.126). Let χ = i=1 G20,i , where G20,i are independent identically distributed copies of G20 . Then n n G20 1/2 2 2 E exp λi (ηxi + h(xi )χ ) /2 = E λi Gxi /2 . exp u(0, 0) i=1 i=1 (8.129) Proof The ηxi , i = 1, . . . , n are independent of χ. Take the expectation of the argument in the left-hand side of (8.129) with respect to χ. In general, $ # Eχ f (χ) = Ef u1/2 (0, 0)(ξ12 + ξ22 + ξ32 )1/2 (8.130) # # $$ = E |ξ|2 f u1/2 (0, 0)|ξ| for any function f for which the first term exists. Here, ξ is N (0, 1) and ξi , i = 1, . . . , 3, are independent identically distributed copies of ξ.
392
Isomorphism Theorems
To obtain the second equality in (8.130), convert the integral to polar coordinates, integrate out the angle terms, and reinterpret the integral with respect to the radial term as the expectation of an N (0, 1) random variable. Using (8.130), we see that the left-hand side of (8.129) is n G20 2 E exp λi (ηxi + hxi |G0 |) /2 . (8.131) u(0, 0) i=1 The second equality in (8.129) follows from this since G(xi ) = ηxi + law
h(xi ) G(0), i = 1, . . . , n, and {(ηxi + h(xi ) G(0))2 }ni=1 = {(ηxi + h(xi ) |G(0)|)2 }ni=1 . Using Lemma 8.4.2 and Dynkin’s Isomorphism Theorem (Theorem 8.1.3), we see that ! ! (η + h(x)√χ )2 x G2x x law 0,0 L∞ + ; x ∈ D, P × PG = ; x ∈ D, Pη × Pχ . 2 2 (8.132) Thus we get a direct isomorphism theorem involving the natural measures of the processes. It is easy to see that (8.132) is equivalent to (8.128). Let η and G be independent copies of η and G. By Corollary 5.2.2 and (5.71), (η + h(x)√2ρ )2 G2 ! ! (η + h(x)√χ )2 η 2 x law x + x ;x∈D = + x;x∈D , 2 2 2 2 (8.133) where ρ is an exponential random variable with mean u(0, 0). We now add ηx2 /2 to each side of (8.132) and use (8.133) on the resulting righthand side. Then we cancel G2x /2 and G2 x /2 and get (8.128). We can also derive (8.127) from the Eisenbaum Isomorphism Theorem (Theorem 8.1.1), although it is not as simple as the above. To begin with, as stated, Theorem 8.1.1 does not apply to the h-transform process because this process does not have symmetric potential densities with respect to the reference measure. We can easily deal with this, as we did in the proof of Lemma 3.10.4, by first applying Theorem 8.1.1 to in Theorem 3.9.2, for which the associated process is G · /h( · ), and X then multiplying by h( · ) and using (3.219). We then get that, under the hypotheses of the Dynkin Isomorphism Theorem (Theorem 8.1.3), (Gxi + h(xi )s)2 xi x,0 (8.134) E EG F L∞ + 2 (Gxi + h(xi )s)2 Gx = EG F . 1+ h(x)s 2
8.4 Additional proofs
393
That is, (8.134) and (8.15) deal with the same processes (we are actually interested in this when x = 0 and use the fact that h(0) = 1). We now derive (8.127), which is equivalent to (8.128), from (8.134). We take the limit as s goes to zero in (8.134) and obtain G2xi xi 0,0 E EG F L∞ + (8.135) 2 2 Gxi (Gxi + h(xi )s)2 d , + E G0 F =E F 2 ds 2 s=0 2 Gx i where we use the fact that E G0 F = 0. As usual, we write 2 Gxi = ηxi + h(xi )G0 . We see from Lemma 5.2.1 that n 2 Eη exp λi (Gxi + h(xi )s) /2 i=1
= Eη exp = Eη exp
n i=1 n
(8.136) 2
λi (ηxi + h(xi )(G0 + s)) /2 λi ηx2i /2
exp(K(G0 + s)2 )/2),
i=1
Λ) = hΛht + hΛΣΛh t for Λ and Σ as defined in where K = K(h, Σ, Lemma 5.2.1 and h = (hx1 . . . , hxn ). One consequence of (8.136), which is obtained by setting s = 0 and taking the expectation with respect to G0 , is that n n 1 2 2 E exp λi Gxi /2 = E exp λi ηxi /2 . (1 − u(0, 0)K)1/2 i=1 i=1 (8.137) We also use (8.136) to see that n d 2 EG0 exp λi (Gxi + h(xi )s) /2 (8.138) ds i=1 n d 2 EG0 exp = λi (ηxi + h(xi )(G0 + s)) /2 ds i=1 n d 2 EG0 exp(K(G0 + s)2 )/2). λi ηxi /2 = Eη exp ds i=1 Furthermore,
d 2 EG0 exp(K(G0 + s) )/2) ds s=0
= KEG20 exp(KG20 /2)
Isomorphism Theorems
394
Ku(0, 0) . (8.139) (1 − Ku(0, 0))3/2
=
Therefore, by (8.137)–(8.139), we have n n d 2 2 EG0 exp λi Gxi /2 + λi (Gxi + h(xi )s) E exp ds i=1 i=1 s=0 n 1 = E exp λi ηx2i /2 . (8.140) (1 − u(0, 0)K)3/2 i=1 We now use (8.135) to compute the moment generating functions. It follows from (8.137) and (8.140) that n n xi 0,0 2 λi L∞ E exp λi Gxi /2 (8.141) E exp i=1
= E exp
n
λi ηx2i /2
i=1
which by (8.137) again gives n xi 0,0 E exp λi L∞ = i=1
i=1
1 , (1 − u(0, 0)K)3/2
1 . (1 − u(0, 0)K)
(8.142)
Therefore, by (8.137) again, we have n 2 n xi 0,0 2 λi L∞ λi ηxi /2 E exp E exp i=1
=
E exp
n
i=1
λi G2xi /2
(8.143)
2 ,
i=1
which gives (8.127).
8.5 Notes and references Dynkin’s Isomorphism Theorem (Theorem 8.1.3) is given in Dynkin (1984); see also Dynkin (1983). It was the starting point of our research on the connection between Gaussian processes and the local times of strongly symmetric Borel right processes and was used extensively in our initial paper on this subject, Marcus and Rosen (1992d). Eisenbaum’s Isomorphism Theorem (Theorem 8.1.1) is given in Eisenbaum (1995). After it appeared, we started using it instead of Dynkin’s Theorem. It seems to be easier to apply because one does not have to worry
8.5 Notes and references
395
about h-transform processes. Our proofs of Theorems 8.1.1 and 8.1.3 in Section 8.1 first appeared in Marcus and Rosen (2001). Their original proofs were combinatorial, similar to our proofs in Section 8.3. (Our proof of Dynkin’s Isomorphism Theorem and some generalizations in Marcus and Rosen (1992d) are also combinatoric.) The results in Section 8.2 first appeared in Eisenbaum, Kaspi, Marcus, Rosen and Shi (2000).
9 Sample path properties of local times
The concept of associated Gaussian process, defined on page 324, is fundamental in this book. Recall that, given a strongly symmetric Borel right process X with 0-potential density u(x, y), the associated Gaussian process G0 is the mean zero Gaussian process with covariance u(x, y). Even when X does not have a 0-potential density, it does have α-potential densities uα (x, y) for all α > 0. As we explain in Section 3.5, which is X killed at the end of an uα (x, y) is the 0-potential density of X, independent exponential time with mean 1/α. Therefore, the mean zero Gaussian process Gα with covariance uα (x, y) is the associated Gaussian process for X. In a slightly different way of looking at things, we may say that there is an infinite family of Gaussian processes associated with X, the Gaussian processes Gα , for all α > 0 and also G0 if X has a 0-potential density. We call these the family of Gaussian processes associated with X. Using the isomorphism theorems of Chapter 8, we show that the local times of X and the family of Gaussian processes associated with X have similar sample paths properties and zero–one laws. Our proofs are soft, in the sense that we do not give concrete conditions for determining whether the local times have certain properties. We merely say that the local times have certain properties if and only if the family of associated Gaussian processes does. But this is the strength of our approach, because in Chapters 5–7 we have a complete catalog of conditions for the Gaussian processes to have these properties.
9.1 Bounded discontinuities We give conditions for local times to be bounded that do not require that they are also continuous. We begin with a simple example of this 396
9.1 Bounded discontinuities
397
phenomenon which is an immediate consequence of the Second Ray– Knight Theorem, Theorem 8.2.2. Lemma 9.1.1 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential densities uα (x, y), α > 0, and state space (S, τ ), where S is a locally compact separable metric space. Let L = {Lyt ; (y, t) ∈ S × R+ } denote the local times of X normalized by (3.91). Let 0 denote some fixed element in S and let τ0 (t) = inf{s |L0s > t}. Assume that u(0, 0) = ∞ and that P x (T0 < ∞) > 0 for all x ∈ S. Let η = {ηx ; x ∈ S} denote the mean zero Gaussian process with covariance uT0 (x, y) and assume that uT0 (x, y) is continuous. Assume further that η has oscillation function 0 ≤ β(0) < ∞ at 0. Then, for any t > 0 and any countable dense set C ⊂ S, √ √ β(0) t β 2 (0) β(0) t x √ P 0 a.s., ≤ lim + sup (Lτ0 (t) − t) ≤ √ δ→0 x∈C∩B(0,δ) 8 2 2 (9.1) where B(0, δ) denotes a closed ball of radius δ at 0 in the metric τ . Note that the existence of uT0 follows from Lemma 3.8.1. Proof Since uT0 (x, y) is continuous, we can take C as a separability set for η . We note that η(0) ≡ 0. Therefore, the hypothesis that η has oscillation function β(0) at 0 implies, by Theorem 5.3.7, that lim
sup
δ→0 x∈C∩B(0,δ)
η(x) =
β(0) 2
a.s.
(9.2)
Consequently,
√ √ 2 1 β(0) t β 2 (0) η(x) + 2t = √ +t + δ→0 x∈C∩B(0,δ) 2 8 2 lim
sup
Therefore, by Theorem 8.2.2, √ β(0) t β 2 (0) η 2 (x) x = √ +t Lτ0 (t) + + lim sup δ→0 x∈C∩B(0,δ) 2 8 2
a.s.
(9.3)
P 0 × Pη
a.s.
(9.4) The inequalities in (9.1) follows immediately from this. The upper bound is obvious. For the lower bound, we use the triangle inequality and (9.3) with t = 0. We now obtain results in the form of (9.1) that hold for any strongly symmetric Borel right process that has a local time and with Lτ· (t) replaced by Lt· . To do this we first develop an important corollary of Theorem 8.1.1 that enables us to prove that almost sure events for members
398
Sample path properties of local times
of the family of Gaussian processes associated with a strongly symmetric Markov process X are also almost sure events for the local times of X. For any set C, let F (C) denote the set of real-valued functions f on C. Define the evaluations ix : F (C ) → R 1 by ix (f ) = f (x ). We use M(F (C)) to denote the smallest σ-algebra for which the evaluations ix are Borel measurable for all x ∈ C. M(F (C)) is generally referred to as the σ-algebra of cylinder sets in F (C). Lemma 9.1.2 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential densities uα (x, y), α > 0, and state space (S, τ ), where S is a locally compact separable metric space. Let L = {Lyt ; (y, t) ∈ S × R+ } denote the local time of X normalized by (3.91). Let Gα = {Gα (y) ; y ∈ S} denote a mean zero Gaussian process with covariance uα (x, y). Let C be a countable dense subset of S. Let B ∈ M(F (C)) and assume that, for some s = 0, 2 (9.5) P (Gα ( · ) + s) /2 ∈ B = 1. Let Leb denote Lebesgue measure on R+ . Then, for almost all (ω , t) ∈ ΩGα × R+ with respect to PGα × Leb and all x ∈ S, (Gα ( · , ω ) + s)2 x · ∈ B = 1, (9.6) Lt + P 2 and for almost all ω ∈ ΩGα with respect to PGα and all x ∈ S, (Gα ( · , ω ) + s)2 x · ∈ B for almost all t ∈ R+ = 1. (9.7) Lt + P 2 Also, we can choose a countable dense set Q ⊂ R+ such that, for almost all ω ∈ ΩGα with respect to PGα and all x ∈ S, (Gα ( · , ω ) + s)2 P x L·t + ∈ B for all t ∈ Q = 1. (9.8) 2 = (Ω, G, Gt , Proof Following Section 3.5, we construct the process X x t , θt , P ), which is X killed at the end of an independent exponential X is uα and Px = time with mean 1/α. The 0-potential density of X x −αu P × αe du. We apply Theorem 8.1.1 to X. We first note that IB ((Gα ( · ) + s)2 /2) = 1 almost surely, by (9.5). Therefore, since Gα has mean zero, we have E Gα (x) IB ((Gα ( · ) + s)2 /2) = 0. (9.9)
9.1 Bounded discontinuities
399
(To make this observation obvious, write IB = 1 − IB c and use the Schwarz inequality on E(Gα (x)IB c ).) Consequently, it follows from Theorem 8.1.1 and (9.5) that (9.10) Px × PGα Lλ· + 12 (Gα ( · ) + s)2 ∈ B = 1, where λ is an exponential random variable with mean 1/α. Writing out the expectation with respect to the independent exponential time, (9.10) becomes ∞ α P x × PGα Lt· + (Gα ( · ) + s)2 /2 ∈ B e−αt dt = 1, (9.11) 0
which shows that P x × PGα Lt· + (Gα ( · ) + s)2 /2 ∈ B = 1 for almost all t ∈ R+ . This gives both (9.6) and (9.7). Equation (9.8) follows immediately from (9.6). As an immediate application of Lemma 9.1.2, we relate the bounded discontinuities of the local times of X to the bounded discontinuities of the family of Gaussian processes associated with X. Theorem 9.1.3 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential densities uα (x, y), α > 0, and state space (S, τ ), where S is a locally compact separable metric space. Let L = {Lyt ; (y, t) ∈ S × R+ } denote the local time of X normalized by (3.91). Let Gα = {Gα (y) ; y ∈ S} be a real-valued Gaussian process with mean zero and covariance uα (x, y). Assume that Gα has oscillation function 0 ≤ β(x0 ) < ∞ at x0 . Then, for any countable dense set C ⊂ S, β(x0 ) Lxt 0 β 2 (x0 ) β(x0 ) Lxt 0 √ √ + ≤ lim (9.12) sup Lxt −Lxt 0 ≤ δ→0 x∈C∩B(x0 ,δ) 8 2 2 for all t ∈ R+ , P y almost surely for all y ∈ S, and
x 2 ) L 0 β(x (x ) β 0 0 √ t lim + sup |Lxt − Lxt 0 | ≤ δ→0 x∈C∩B(x0 ,δ) 4 2
(9.13)
for all t ∈ R+ , P y almost surely for all y ∈ S. Here B(0, δ) denotes a closed ball of radius δ at 0 in the metric τ . Since 0 < Lxt 0 , P x0 almost surely for all t > 0, we see from (9.12) and (9.13) that the local times of a Markov process associated with a Gaussian process that has a bounded discontinuity at x0 (a probability 0 or 1 event by Corollary 5.3.6) itself has a bounded discontinuity at x0 almost surely with respect to P x0 .
Sample path properties of local times
400
If the process is started at y = x0 , there could be paths for which Lxt 0 = 0 for sufficiently small t. In this case, both sides of (9.12) should be equal to zero. Consequently, we suspect that the term containing β 2 (x0 ) can be eliminated from (9.12) and (9.13), and also from (9.1), but we do not know how to do it. Proof Let C be a countable separating set for {Gα (x), x ∈ S} and note that 2
2
(Gα (x) + s) − (Gα (x0 ) + s)
(9.14)
2
= (Gα (x) − Gα (x0 )) + 2 (Gα (x0 ) + s) (Gα (x) − Gα (x0 )) . It follows from (9.14) and Theorem 5.3.7 that 2
(Gα (x) + s) − (Gα (x0 ) + s) δ→0 x∈C∩B(x0 ,δ) 2 lim
= Let
2
(9.15)
sup
β 2 (x0 ) β(x0 )|Gα (x0 ) + s| + 8 2
B = f ∈ F (C) lim
sup
δ→0 x∈C∩B(x0 ,δ)
= Then, by (9.15),
a.s
P Gα .
f (x) − f (x0 )
(9.16)
β 2 (x0 ) β(x0 )|f (x0 )|1/2 √ + . 8 2
2 P (Gα ( · ) + s) /2 ∈ B = 1.
(9.17)
It follows from Lemma 9.1.2, that for almost all ω ∈ ΩGα with respect to PGα , lim
sup
δ→0 x∈C∩B(x0 ,δ)
Lxt − Lxt 0 +
(Gα (x, ω ) + s)2 − (Gα (x0 , ω ) + s)2 (9.18) 2
β 2 (x0 ) β(x0 ) (Gα (x0 , ω ) + s)2 = + √ for all t ∈ Q, P y a.s. Lxt 0 + 8 2 2 for all y ∈ S, where Q is a countable dense subset of R+ . Therefore, for almost all ω ∈ ΩGα with respect to PGα , lim
sup
δ→0 x∈C∩B(x0 ,δ)
Lxt − Lxt 0
(9.19)
(Gα (x, ω ) + s)2 − (Gα (x0 , ω ) + s)2 δ→0 x∈C∩B(x0 ,δ) 2 β 2 (x0 ) β(x0 ) (Gα (x0 , ω ) + s)2 + √ for all t ∈ Q, P y a.s. ≥ Lxt 0 + 8 2 2 + lim
sup
9.1 Bounded discontinuities
401
for all y ∈ S. It follows from (9.17) that, for almost all ω ∈ ΩGα with respect to PGα , β(x0 ) x0 Lxt − Lxt 0 ≥ √ Lt δ→0 x∈C∩B(x0 ,δ) 2 lim
sup
−
β(x0 )|Gα (x0 , ω ) + s| 2
for all t ∈ Q,
P y a.s. (9.20)
By Lemma 5.3.5, for all > 0, we can find an ω ∈ ΩGa such that (9.20) holds and |Gα (x0 , ω )| < . Let us also take s = . For this ω we have β(x0 ) x0 √ lim sup Lxt − Lxt 0 ≥ √ Lt − 2
for all t ∈ Q δ→0 x∈C∩B(x0 ,δ) 2 (9.21) y P almost surely, and since this holds for all > 0, we get β(x0 ) x0 Lxt − Lxt 0 ≥ √ Lt δ→0 x∈C∩B(x0 ,δ) 2 lim
sup
for all t ∈ Q
P y a.s.
(9.22) y Let Ω, P (Ω) = 1, be the set on which (9.22) holds. For any t > 0, we have, by the choose a sequence ti ∈ Q such that ti ↑ t. For ω ∈ Ω, monotonicity of the local time, β(x0 ) Lxti0 (ω) x0 x √ sup Lt (ω) ≥ Lti (ω) + lim . (9.23) δ→0 x∈C∩B(x0 ,δ) 2 Since Lxt 0 is continuous in t, we see that lim
sup
δ→0 x∈C∩B(x0 ,δ)
Lxt (ω)
≥
Lxt 0 (ω)
β(x0 ) Lxt 0 (ω) √ , + 2
(9.24)
we get the lower bound in and since this is valid for all t and all ω ∈ Ω, (9.12). To obtain the upper bound in (9.12), we use (9.18) to immediately obtain lim
sup
δ→0 x∈C∩B(x0 ,δ)
Lxt − Lxt 0
β 2 (x0 ) β(x0 ) ≤ + √ 8 2
(9.25)
Lxt 0 +
(Gα (x0 , ω ) + s)2 (Gα (x0 , ω ) + s)2 + 2 2
for all t ∈ Q, P y almost surely, for almost all ω ∈ ΩGα with respect to PGα . As above, |Gα (x0 , ω )| and s can be made as small as we like and
Sample path properties of local times
402 we get lim
sup
δ→0 x∈C∩B(x0 ,δ)
Lxt
−
Lxt 0
β 2 (x0 ) β(x0 ) Lxt 0 √ ≤ + (9.26) 8 2 for all t ∈ Q, P y a.s.
For all t choose ti ∈ Q such that ti ↓ t. Following the argument given in (9.23) and (9.24), we get the upper bound in (9.12). To obtain (9.13) we repeat the above argument with a minor variation. Analogous to (9.16), we consider the set
B = f ∈ F (C) lim sup |f (x) − f (x0 )| (9.27) δ→0 x∈C∩B(x0 ,δ)
=
β 2 (x0 ) β(x0 )|f (x0 )|1/2 √ + . 8 2
Then, essentially the same argument used in (9.14) and (9.15) shows that 2 (9.28) P (Gα ( · ) + s) /2 ∈ B = 1. Therefore, analogous to (9.25), we have lim
sup
δ→0 x∈C∩B(x0 ,δ)
|Lxt − Lxt 0 |
(Gα (x0 , ω ) + s)2 2 |(Gα (x, ω ) + s)2 − (Gα (x0 , ω ) + s)2 | sup + lim δ→0 x∈C∩B(x0 ,δ) 2 (Gα (x0 , ω ) + s)2 β(x0 )|G(x0 ) + s|1/2 β 2 (x0 ) β(x0 ) √ Lxt 0 + + √ + = 4 2 2 2
β 2 (x0 ) β(x0 ) ≤ + √ 8 2
Lxt 0 +
for all t ∈ Q, P y almost surely, for almost all ω ∈ ΩGα with respect to PGα . As above, |Gα (x0 , ω )| and s can be made as small as we like and we get β 2 (x0 ) β(x0 ) Lxt 0 x0 x √ + lim (9.29) sup |Lt − Lt | ≤ δ→0 x∈C∩B(x0 ,δ) 4 2 for all t ∈ Q, P y a.s. We now show that this holds for all t ∈ R+ . It is trivial for t = 0. For t > 0, pick u, v ∈ Q, u < t < v so that, for a fixed ω, Lxv 0 − Lxt 0 < and Lxt 0 − Lxu0 < . Note that |Lxt − Lxt 0 | ≤ |Lxv − Lxu0 | ∨ |Lxu − Lxv 0 |. Then |Lxv − Lxu0 | ≤ |Lxv − Lxv 0 | + |Lxv 0 − Lxt 0 | + |Lxt 0 − Lxu0 |
(9.30)
9.2 A necessary condition for unboundedness
403
with a similar bound for |Lxu − Lxv 0 |. Using the fact that u, v ∈ Q and (9.29), we see that √ β 2 (x0 ) β(x0 ) Lxv 0 x0 x √ + . (9.31) lim sup |Lt − Lt | ≤ 4 + δ→0 x∈C∩B(x0 ,δ) 4 2 Since Lxv 0 ≤ Lxt 0 + and can be taken as small as we like, we get (9.13).
9.2 A necessary condition for unboundedness Let X be a strongly symmetric Borel right process with continuous αpotential density uα (x, y). To obtain a necessary condition for the local times of X to be unbounded almost surely, we first consider the process t , θt , Px ), introduced in Section 3.5, which is X killed = (Ω, G, Gt , X X at the end of an independent exponential time with mean 1/α, α > 0. is uα and Px = P x × αe−αu du. Let The 0-potential density of X y = {L t ; (y, t) ∈ S × R+ } denote the local time of X. L Consider X killed the first time it hits zero and denote its potential α density by u α T0 (x, y). Since the 0-potential density of X is u , it follows from Remark 3.8.3 that α α u α T0 (x, y) = u (x, y) − hα (x)hα (y)u (0, 0),
(9.32)
where hα (x) =
uα (x, 0) . uα (0, 0)
(9.33)
Let ηα (x) be a mean zero Gaussian process with covariance u α T0 (x, y). By Theorem 8.2.3, the Second Ray–Knight Theorem, under the measure P 0 × Pηα , for any t > 0 and countable subset D ⊆ S, x 1 2 (9.34) L 0∞ ) + 2 ηα (x); x ∈ D τ − (t∧L 2 law 1 = 2 ηα (x) + hα (x) 2(t ∧ ρ) ; x ∈ D where ρ is an exponential random variable with mean uα (0, 0) that is independent of ηα = {ηα (x) ; x ∈ S}. Theorem 9.2.1 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential densities uα (x, y), α > 0, and state space (S, τ ), where S is a locally compact separable metric space. Let L = {Lyt ; (y, t) ∈ S × R+ } denote the local time of X
Sample path properties of local times
404
normalized by (3.91). Let Gα = {Gα (y) ; y ∈ S} be a real-valued Gaussian process with mean zero and covariance uα (x, y), and let 0 denote a fixed point in S. Suppose there exists a countable dense set C ⊂ S for which sup
lim
δ→0 x∈C∩B(0,δ)
Gα (x) = ∞
a.s.
(9.35)
Then lim
sup
δ→0 x∈C∩B(0,δ)
Lxt = ∞
∀t > 0
a.s.
P 0.
(9.36)
Let ηα (x) = Gα (x) − hα (x)Gα (0). Proof We use (9.34) applied to X. α ηα (x) is a mean zero Gaussian process with covariance u T0 (x, y) given α in (9.32). Since Gα (0) is finite almost surely and u (x, y) is continuous, we see that lim
sup
δ→0 x∈C∩B(0,δ)
ηα (x) = ∞
(9.37)
and lim
sup
δ→0 x∈C∩B(0,δ)
Eηα2 (x)
= lim
sup
δ→0 x∈C∩B(0,δ)
(uα (x, 0))2 u (x, x) − uα (0, 0) α
= 0.
(9.38) Choose some t > 0. For δ > 0, which we choose below, let T ∈ C ∩ B(0, δ) be a finite set. Note that limx→0 hα (x) = 1. Let δ be such that inf x∈[0,δ ) hα (x) > 0.9. We require that δ ≤ δ . Let ρ be an exponential random variable with mean uα (0, 0) that is independent of ηα . We first note that it follows from (5.150) that Pηα sup ηα (x) + hα (x) 2(t ∧ ρ) ≥ a − sσT + 0.9 2(t ∧ ρ) x∈T
≥ 1 − Ψ(s), (9.39) where a is the median of supx∈T ηα (x), σT = supx∈T (Eηα2 (x))1/2 and Ψ(s) = 1 − Φ(s). We assume that a − sσT > 0. It is obvious, by (9.37) and (9.38), that whatever the value of δ and s, we can find finite sets T for which this is the case. Since ηα (0) = 0, both terms in the argument of Pηα in (9.39) are positive. Consequently,
(ηα (x) + hα (x) 2(t ∧ ρ) )2 (a − sσT + 0.9 2(t ∧ ρ))2 Pηα sup ≥ 2 2 x∈T ≥ 1 − Ψ(s).
(9.40)
9.2 A necessary condition for unboundedness 405 For any > 0, we choose k so that P (.9 2(t ∧ ρ) ≥ k) ≥ 1 − . Then
(ηα (x) + hα (x) 2(t ∧ ρ) )2 (a − sσT + k)2 ≥ ≥ 1 − Ψ(s) − . P sup 2 2 x∈T (9.41) It follows from (9.34) and the triangle inequality that ηα (x)2 (a − sσT + k)2 x P sup L − 0 ≥ − sup ≥ 1 − Ψ(s) − . 2 2 x∈T τ (t∧L∞ ) x∈T (9.42) By (5.151), since ηα (0) = 0, (a + sσT )2 ηα2 (x) P sup ≤ ≥ 1 − Ψ(s). (9.43) 2 2 x∈T Using this in (9.42) we see that k 0 x sup L − 0 ≥ a(k − 2sσ) + (k − 2sσ) ≥ 1 − 2Ψ(s) − , P 2 x∈T τ (t∧L∞ ) (9.44) where σ = supx∈C∩B(0,δ) (Eηα2 (x))1/2 ≥ σT . We now choose s sufficiently large so that Ψ(s) ≤ . Next, we note that by (9.38), we can choose 0 < δ < δ so that sσ ≤ k/8. With these choices we have ak 0 x P ≥ 1 − 3 . (9.45) sup L − 0 ≥ 2 x∈T τ (t∧L∞ ) Because of (9.37) we can take T to be large enough set so that ak/2 > M for any number M . Therefore, for any > 0,
0 x sup L − 0 = ∞ ≥ 1 − 3 , P (9.46) τ (t∧L∞ ) x∈C∩B(0,δ) and this holds for any t > 0. It also holds for all 0 < δ ≤ δ, because we can always restrict T to be contained in [0, δ ]. Therefore,
0 x lim L sup (9.47) P 0∞ ) = ∞ = 1 δ→0 x∈C∩B(0,δ) τ − (t∧L for all t > 0. Now choose any t > 0. By the definition of local time, we can find 0∞ ) < t ) > 1 − . Using this observation a t > 0 such that P 0 (τ − (t ∧ L and (9.47), we see that
0 x t = ∞ = 1. lim L sup (9.48) P δ→0 x∈C∩B(0,δ)
406
Sample path properties of local times
x ≤ Lx , we get (9.36). Since L t t
9.3 Sufficient conditions for continuity In (3.180) and (3.182) of Corollary 3.7.4 we give moment conditions that imply that the local times of a strongly symmetric Borel right process X with continuous α-potential density has a version that is jointly continuous. In the next two lemmas we show that these conditions are satisfied when some member of the family of Gaussian processes associated with X is continuous. Lemma 9.3.1 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential densities uα (x, y), α > 0, and state space (S, τ ), where S is a locally compact separable metric space. Let L = {Lyt , (t, y) ∈ R+ × S} be local times for X and let λ be an exponential random variable with mean 1/α that is independent of X. Let K be a compact subset of S and Gα = {Gα (y); y ∈ K} be a mean zero Gaussian process with covariance uα (x, y) such that sup |Gα (y)| < ∞
a.s.
(9.49)
y∈K
Let u∗α := sup uα (y, y),
(9.50)
y∈K
and let D be a countable subset of K. Then, for all x ∈ S, supy∈D Lyλ Eλx exp <∞ c u∗α
(9.51)
for all c > 1. Proof We use the Eisenbaum Isomorphism Theorem (Theorem 8.1.1). Let supy∈D | · | := · . It follows from (8.2) that · Lλ (9.52) Eλx exp c u∗α Gα (x) (Gα ( · ) + s)2 ≤E 1+ (9.53) exp s 2c u∗α p 1/p 1/q |Gα (x)| q (Gα ( · ) + s)2 ≤ E 1+ , E exp s 2c u∗α where 1/p + 1/q = 1.
9.3 Sufficient conditions for continuity
407
For real numbers, (a + b)2 ≤ (1 + a2 )(1 + b2 ). Using this we see that (Gα ( · ) + s)2
≤
(1 + s2 ) + (1 + s2 )G2α ( · )
≤ (1 + s ) + (1 + s 2
2
(9.54)
)G2α ( · ).
Consequently, q (Gα ( · ) + s)2 E exp ≤ exp 2c u∗α
q(1 + s2 ) Gα ( · )2 . 2c u∗α (9.55) By (5.190), the expectation on the right-hand side of (9.55) is finite for c/q(1 + s2 ) > 1. Since we can take q arbitrarily close to one and s arbitrarily close to zero, we get (9.51). q(1 + s2 ) 2c u∗α
E exp
For a stochastic process Z set Z∞ = sup |Z(y)| and Zδ = sup |Z(y) − Z(z)|. y∈D
(9.56)
τ (y,z)≤δ
y,z∈D
Lemma 9.3.2 Fix x ∈ S. Under the same hypotheses as in Lemma 9.3.1, but with the additional assumption that x ∈ D, we have that, for all p ≥ 1,
Eλx sup |Lyλ − Lzλ |p
1/p
≤ C pE(Gα δ )E(Gα ∞ ),
(9.57)
τ (y,z)≤δ
y,z∈D
where C is an absolute constant. Furthermore,
y z sup |L − L | τ (y,z)≤δ; y,z∈D λ λ <∞ Eλx exp C E(Gα δ )E(Gα ∞ )
(9.58)
for some C > 0. Proof
Since (Eλx (Lλ· pδ ))
1/p
≤ sup (Eλv (Lλ· pδ ))
1/p
,
(9.59)
v∈D
it follows from (8.2) that (Eλx (Lλ· pδ ))
1/p
≤
p 1/p (Gα + s)2 δ Gα (v) 1+ sup E s 2 v∈D p 1/p 2 (Gα + s) δ + E . (9.60) 2
Sample path properties of local times
408
By the Cauchy-Schwarz inequality, p 1/p Gα (v) (Gα + s)2 δ E 1+ (9.61) s 2
2 1/(2p) |Gα (v)| 1 2p E (Gα + s)2 δ E 1+ . ≤ 2 s Using this and applying the Cauchy-Schwarz inequality to the last term in (9.60), we see that 2 1/2 |G 1 (v)| 1/p α 1+ (Eλx (Lλ· pδ )) ≤ sup E + 1 s v∈D 2
E
(Gα + s) δ 2
2p
1/(2p) .
(9.62)
Since (Gα (y) + s)2 − (Gα (z) + s)2 = (Gα (y) + Gα (z) + 2s) (Gα (y) − Gα (z)) , (9.63) we get (Gα + s)2 δ ≤ 2Gα δ (Gα ∞ + s) .
(9.64)
Let s = EGα ∞ . It follows from Corollary 5.4.7 that 1/(2p) E (Gα + s)2 2p (9.65) δ 1/(4p) 1/(4p) E Gα 4p +s ≤ 2 E Gα 4p ∞ δ ≤ C pE(Gα δ )E(Gα ∞ ). Also, since E(|Gα (v)|2 ) ≤ (π/2)(E|Gα (v)|)2 , 2 |Gα (v)| E|Gα (v)|2 π E 1+ ≤3+ ≤3+ . s s2 2
(9.66)
Therefore,
2 1/2 |Gα (v)| π 1/2 1+ sup E ≤ 3+ . s 2 v∈D
(9.67)
Using (9.65) and (9.67) in (9.62), we get (9.57). To get (9.58) we write out the series expansion for the exponential and use (9.57).
9.3 Sufficient conditions for continuity
409
We can now show that (3.180) and (3.182) of Corollary 3.7.4 are satisfied and thus get a sufficient condition for the joint continuity of the local times. Theorem 9.3.3 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential density uα (x, y) and state space (S, ρ), where S is a locally compact separable metric space. Let L = {Lyt , (t, y) ∈ R+ ×S} be local times for X. Let Gα = {Gα (y); y ∈ S} be a mean zero Gaussian process with covariance uα (x, y). Suppose that Gα has continuous sample paths; then we can find a version of L that is continuous on R+ × S. Proof Let K ⊆ S be compact. To prove this theorem we need only verify that (3.180) and (3.182) of Corollary 3.7.4 are satisfied. That (3.180) holds follows immediately from (9.51). Considering (9.57), to show that (3.182) holds we need only show that E sup |Gα (y)| < ∞
(9.68)
lim E sup |Gα (y) − Gα (z)| = 0.
(9.69)
y∈K
and δ→0
ρ(y,z)≤δ
y,z∈K
Since Gα has a continuous version on K, it is bounded almost surely on K. Thus, (9.68) follows from Corollary 5.4.7. It follows from (9.68) that E sup |Gα (y) − Gα (z)| < ∞.
(9.70)
ρ(y,z)≤δ
y,z∈K
We take the limit of the left-hand side of (9.70) as δ → 0. Considering (9.68) we can use the Dominated Convergence Theorem to take the limit inside the expectation. This gives us (9.69) since continuous functions on compact sets are uniformly continuous. Remark 9.3.4 Note that (9.51), as stated, does not mention Gaussian processes. Similarly, using results from Chapter 6, we can write (9.57) is terms of majorizing measure or metric entropy conditions determined by the α-potential of X.
410
Sample path properties of local times 9.4 Continuity and boundedness of local times
The results in the first three sections of this chapter show that local times inherit continuity and boundedness properties from their associated Gaussian processes. We summarize these results in this section. There are some subtle points that require careful explanation. They center around the fact that a local time at a point does not begin to increase until the Markov process hits the point. Throughout this section, X = (Ω, G, Gt , Xt , θt , P x ) is a strongly symmetric Borel right process with continuous α-potential density uα (x, y), α > 0, and state space (S, τ ), where S is a locally compact separable metric space. L = {Lyt , (t, y) ∈ R+ × S} are local times for X and Gα = {Gα (y), y ∈ S} is a mean zero Gaussian process with covariance uα (x, y). We first state several results and then give their proofs. Most of the work has already been done, so the proofs are all short. To simplify statements, we use the expression “a process is continuous almost surely” to mean that we can find a continuous version of the process, and similarly for other properties of the process. Theorem 9.4.1 (Continuity of local times) (1) {Lyt , (t, y) ∈ R+ × S} is continuous almost surely if and only if {Gα (y); y ∈ S} is continuous almost surely; (2) Let K be a compact subset of S. {Lyt , (t, y) ∈ R+ × K} is continuous almost surely if and only if {Gα (y); y ∈ K} is continuous almost surely. (3) Let D ⊂ S be countable. Then, for any compact subset K of S, {Lyt , (t, y) ∈ [0, T ] × D ∩ K} is bounded for all T < ∞ almost surely if and only if {Gα (y), y ∈ D ∩K} is bounded almost surely. In the proof of statement (1), we show that when {Gα (y); y ∈ S} is continuous we can find a continuous version of {Lyt , (t, y) ∈ R+ × S}. As for the converse, we assume that {Lyt , (t, y) ∈ R+ × S} has a continuous version and show that its associated Gaussian process must be continuous. Of course, statement (1) implies that if the Gaussian process is continuous on S, then the local time of the associated Markov process is continuous almost surely for all compact subsets K of S. However, the relationship between G and L is a local one. Thus, for example, all we need to know is that G is continuous on K to determine that {Lyt , (t, y) ∈ R+ × K} is continuous, and conversely. Most of the results
9.4 Continuity and boundedness of local times
411
in this section emphasize the local nature of the relationship between the local time of a Markov process and the Gaussian process associated with the Markov process. As far as we know, there is no general theory that allows one to assume that the local time of a Markov process has a separable version. In the proof of statement (1) we construct such a version. However, we do not know how to do this, for example, for local times that are bounded but not necessarily continuous. For this reason, we consider the local time process {Lyt , (t, y) ∈ R+ × D}, where D is some countable subset of S in statement (3). Since D is arbitrary, the results obtained are still quite strong. Remark 9.4.2 To keep the statement of Theorem 9.4.1 from becoming even more cumbersome, it is given in terms of some fixed Gα . A more precise way to state (1) would be (1a) If {Lyt , (t, y) ∈ R+ ×S} is continuous almost surely, then {Gα (y); y ∈ S} is continuous almost surely for each α > 0. (1b) If {Gα (y); y ∈ S} is continuous almost surely for some α > 0, {Lyt , (t, y) ∈ R+ × S} is continuous almost surely. Clearly the same remark applies to statement (2) and to many of the statements made in the rest of this section. Theorem 9.4.1 (1) provides an interesting fact about the family of Gaussian processes associated with a strongly symmetric Markov process X, namely, if Gα is continuous for some α > 0, then the Gα are continuous for all α > 0. If X is a L´evy process, this follows easily from Gaussian considerations, using Lemma 5.5.3, since, by (7.274), σα2 (h) ∼ σβ2 (h) as h → 0 for all α, β ≥ 0. However, we do not know how to show this by Gaussian process theory in the general case. This remark also applies to many of the other path properties considered in this section. In the next theorem we consider the behavior of the local time at a fixed point of S. Theorem 9.4.3 Let D ⊂ S be countable and y0 ∈ D. Consider the processes {Lyt , (t, y) ∈ R+ × D} and {Gα (y); y ∈ D}. We have (1) Lyt is continuous at y0 for each t > 0, P y0 almost surely, if and only if Gα (y) is continuous at y0 almost surely. (2) Lyt has a bounded discontinuity at y0 for each t > 0, P y0 almost
Sample path properties of local times
412
surely, if and only if Gα (y) has a bounded discontinuity at y0 almost surely. (3) Lyt is unbounded at y0 for each t > 0, P y0 almost surely, if and only if Gα (y) is unbounded at y0 almost surely; and for each y0 ∈ D precisely one of these three cases holds. Furthermore, this theorem remains valid with the term “each t” replaced by “some t” in (1)–(3). Remark 9.4.4 By Corollary 5.3.6, continuity, boundedness, and unboundedness, both globally and locally, are probability 0 or 1 properties for Gaussian processes. Thus, by the above results, they are probability 0 or 1 properties for the local times of the associated Markov processes. However, a certain degree of care is necessary in expressing this phenomenon. For example, if a Gaussian process is unbounded almost surely on some compact set K ⊂ S, then there exists a point y0 ∈ K such that the process is unbounded almost surely at y0 . Roughly speaking, this implies that the local time of the associated Markov process will also be unbounded at y0 , but only if the Markov process hits y0 . Thus we can say that each of the events y (9.71) Lt is continuous at y0 for each t > 0
Lyt has a bounded discontinuity at y0 for each t > 0
Lyt is unbounded at y0 for each t > 0
(9.72) (9.73)
has P y0 probability 0 or 1. Furthermore, these statements are also true with the term “each t” replaced by “some t.” To clarify some of the implications of Theorems 9.4.1 and 9.4.3 we give some of their immediate consequences in the next theorem. Theorem 9.4.5 Let D ⊂ S be countable and let K ⊂ S be a compact set. Then: (1) Either {Lyt , (t, y) ∈ R+ × K} is continuous almost surely or else there exists an x0 ∈ K such that, for any countable dense set D ⊂ K, with x0 ∈ D, the event “ {Lyt , (t, y) ∈ R+ × D} is continuous” has P x0 measure zero. (2) Either {Lyt , (t, y) ∈ [0, T ] × D ∩ K} is bounded for each T < ∞ almost surely or else there exists an x0 ∈ D ∩ K such that the event “ {Lyt , (t, y) ∈ [0, T ] × D ∩ K} is bounded for some T < ∞” has P x0 measure zero.
9.4 Continuity and boundedness of local times
413
(3) If {Lyt , y ∈ K} is continuous for some t > 0 almost surely, then {Lyt , (t, y) ∈ R+ × K} is continuous almost surely. (4) If {Lyt , y ∈ D ∩ K} is bounded for some t > 0 almost surely, then {Lyt , (t, y) ∈ [0, T ] × D ∩ K} is bounded for each T < ∞, almost surely. The next theorem shows that the continuity of the local time at each point in the state space almost surely implies that the local time is jointly continuous. (This is not the case for many processes. Let {Zt ; t ∈ [0, 1]} be a L´evy process without continuous component. Zt is continuous almost surely for each t ∈ [0, 1], but except for some very simple examples, a L´evy process is discontinuous almost surely on [0,1].) Theorem 9.4.6 Lyt is continuous at y0 for each t > 0 almost surely for all y0 ∈ K if and only if {Lyt , (t, y) ∈ R+ × K} is continuous almost surely. Furthermore, this theorem remains valid with the term “each t” replaced by “some t.” The information we need to prove these results is given in the next lemma. We note again that, by Corollary 5.3.6, the Gaussian properties referred to in this lemma have probability 0 or 1. Lemma 9.4.7 Let D ⊂ S be countable and let K ⊂ S be a compact set. (1) If {Gα (y); y ∈ K} has a continuous version, we can find a version of L that is continuous on R+ × K. (2) If {Gα (y); y ∈ K} has a bounded version, then supy∈D∩K Lyt < ∞ for all t < ∞ almost surely. (3) If {Gα (y); y ∈ K} is unbounded, supy∈D∩K Lyt = ∞, for all t > 0, P x0 almost surely for some x0 ∈ K. (4) If Gα is continuous at x0 ∈ S, then {Lyt , y ∈ D ∪ {x0 }} is continuous at x0 for all t ≥ 0, P x0 almost surely. (5) If Gα has a bounded discontinuity at x0 ∈ S, then {Lyt , y ∈ D} has a bounded discontinuity at x0 for all t > 0, P x0 almost surely. (6) If Gα is unbounded at x0 ∈ S, then {Lyt , y ∈ D} is unbounded at x0 for all t > 0, P x0 almost surely. Proof Statement (1) follows from Theorem 9.3.3. By (9.51), if Gα = {Gα (y); y ∈ K} has a bounded version, then ∞ y x E (9.74) sup Lt e−αt dt < ∞. 0
y∈D∩K
414
Sample path properties of local times
Therefore, there exists a sequence ti ↑ ∞ for which E x (supy∈D∩K Lyti ) < ∞. Since Lt· is increasing, this gives (2). If Gα is unbounded on K, it follows from Corollary 5.3.8 (3) that there is a point x0 ∈ K for which lim
sup
δ→0 x∈C∩B(x0 ,δ)
Gα (x) = ∞
a.s.
(9.75)
Statement (3) now follows from Theorem 9.2.1. (To be more precise, in Theorem 9.2.1 we choose a generic point 0, which we take here to be x0 .) Statements (4) and (5) follow from (9.12) since β(x0 ) = 0 when Gα is continuous at x0 and 0 < β(x0 ) < ∞ when Gα has a bounded discontinuity at x0 . Also, Lxt 0 > 0, P x0 almost surely for all t > 0. Statement (6) is actually what is proved in the proof of (3). We now prove all the results in this section. Proof of Theorem 9.4.1 Suppose that {Gα (y); y ∈ K} is continuous almost surely; then, by Lemma 9.4.7, {Lyt , (t, y) ∈ R+ ×K} is continuous almost surely. Now suppose that {Gα (y); y ∈ K} is not continuous almost surely; then, by Theorem 5.3.7, there is an x0 ∈ K for which the oscillation function of {Gα (y); y ∈ K} is greater than zero. This implies, by (9.12), that Lyt is not continuous at x0 , P x0 almost surely for all t > 0. (Recall that the statement “{Lyt , (t, y) ∈ R+ × K} is continuous almost surely” means that it is P x continuous almost surely, for all x ∈ K.) Therefore, {Lyt , (t, y) ∈ R+ × K} continuous almost surely implies that {Gα (y); y ∈ K} is continuous almost surely. This proves statement (2). To obtain statement (1) suppose that {Gα (y); y ∈ S} is continuous almost surely; then, for all compact sets K ⊂ S, {Gα (y); y ∈ K} is continuous almost surely. Therefore, by statement (2) of this theorem, {Lyt , (t, y) ∈ R+ ×K} is continuous almost surely. Since this holds for all K ⊂ S, {Lyt , (t, y) ∈ R+ × S} is continuous almost surely. The converse has exactly the same proof. Statement (3) follows from statements (2) and (3) of Lemma 9.4.7. Proof of Theorem 9.4.3 All these statements follow from statements (4), (5), and (6) of Lemma 9.4.7. It is clear that if these results hold for some t, they hold for each t > 0, because everything used in the proof is valid for all t > 0. Proof of Theorem 9.4.5 (1) If {Lyt , (t, y) ∈ R+ × K} is not continuous almost surely, then
9.4 Continuity and boundedness of local times
415
by Lemma 9.4.7, Gα is not continuous almost surely. Since this is a probability zero–one property of Gα , we are in the situation covered by statement (5) or (6) of Lemma 9.4.7. (2) Same proof as (1) with obvious modifications. (3) We use Theorem 9.4.3 (1) with each t replaced by some t, to see that {Gα (y); y ∈ K} is continuous almost surely. The statement now follows from Theorem 9.4.1 (1). (4) We use both (1) and (2) of Theorem 9.4.3, with each t replaced by some t, to see that {Gα (y); y ∈ K} is bounded almost surely. The statement now follows from Theorem 9.4.1 (3). Proof of Theorem 9.4.6 By (9.12), the oscillation function of the associated Gaussian process is zero. Hence the associated Gaussian process is continuous and so is the local time. As corollaries of the results in this section we give some concrete continuity conditions for the local times of different strongly symmetric Borel right processes. Corollary 9.4.8 Let X = {X(t), t ∈ R+ } be a real-valued symmetric L´evy process with characteristic exponent ψ(λ) as given in (7.261) that satisfies (7.264). Let {Lxt (x, t); (x, t) ∈ R1 × R+ } denote the local times of X and let ∞ sin2 (x/2) 2 dλ. (9.76) σ (x) = ψ(λ) 0 Then {Lxt (x, t); (x, t) ∈ R1 × R+ } is continuous almost surely, if and only if, for some δ > 0, δ σ(u) du < ∞, (9.77) u(log 1/u)1/2 0 where σ(u) is the nondecreasing rearrangement of σ(u) on [0, δ]. Proof The α-potential of X is given by (7.265). Therefore, the increments variance of the associated Gaussian process, σα , is given by (7.266). It follows from Corollary 6.4.4 that δ σ α (u) du < ∞ (9.78) 1/2 0 u(log 1/u) for some δ > 0 is a necessary and sufficient condition for the continuity of the associated Gaussian process on (R1 , σα ). Furthermore, since σα is continuous in the Euclidean metric and is equal to zero only at zero,
Sample path properties of local times
416
(9.78) is a necessary and sufficient condition for the continuity of the associated Gaussian process on R1 in the Euclidean metric. It now follows from Theorem 9.4.1 (1) that (9.78) is a necessary and sufficient condition for the continuity of {Lxt (x, t); (x, t) ∈ R1 × R+ }. The proof is completed by noting that by (7.274), (9.77), and (9.78) are equivalent.
The next corollary is not as precise as Corollary 9.4.8 but is easier to verify when its hypotheses are satisfied. Corollary 9.4.9 Let X = {X(t), t ∈ R+ } be a real-valued symmetric L´evy process with characteristic exponent ψ(λ) as given in (7.261) that satisfies (7.264). Let {Lxt (x, t); (x, t) ∈ R1 × R+ } denote the local times of X. Then {Lxt (x, t); (x, t) ∈ R1 × R+ } is continuous almost surely, if there exists an x0 such that 1/2 ∞ ∞ −1 ψ (λ) dλ x dx < ∞. (9.79) x(log x)1/2 x0 Furthermore, if there exists an x0 such that ψ is increasing on [x0 , ∞), (9.79) is a necessary condition for the continuity of {Lxt (x, t); (x, t) ∈ R1 × R+ }. Proof
This follows from Theorem 6.4.10.
The next result is a simple sufficient condition for the continuity of local times of any Borel right process on Rn . Corollary 9.4.10 Let X = {X(t), t ∈ R+ } be a strongly symmetric Borel right process with state space Rn and α-potential density uα (x, y) for some α > 0. Let {Lxt (x, t); (x, t) ∈ [−T, T ]n × R+ } denote the local times of X on the interval [−T, T ]n . Let ρ(δ) :=
sup
1/2
(uα (x, x) + uα (y, y) − 2uα (x, y))
.
(9.80)
|x−y|≤δ
x,y∈[−T,T ]n
Then {Lxt (x, t); (x, t) ∈ [−T, T ]n × R+ } is continuous almost surely, if, for some δ > 0, δ ρ(u) du < ∞. (9.81) 1/2 0 u(log 1/u) Proof This corollary follows immediately from Lemma 6.4.6 and Theorem 9.4.1 (2).
9.5 Moduli of continuity
417
The next theorem gives our most general result on necessary and sufficient conditions for the continuity and boundedness of the local times of strongly symmetric Borel right processes. Theorem 9.4.11 Let X = {X(t), t ∈ R+ } be a strongly symmetric Borel right process with continuous α-potential density uα (x, y) for some α > 0 and a locally compact separable state space (S, dX ), where dX (x, y) = (uα (x, x) + uα (y, y) − 2uα (x, y))
1/2
.
(9.82)
Let K be a compact subset of S and let D denote the diameter of K. Let {Lxt ; (x, t) ∈ K × R+ } denote the local times of X on K × R+ . Let K ⊂ K be countable. Then {Lxt ; (x, t) ∈ K ∩ K × [0, T ]} is bounded almost surely for all T < ∞, if and only if there exists a probability measure µ on K such that 1/2 D 1 log du < ∞. (9.83) sup µ(BdX (y, u)) y∈K 0 Furthermore, {Lxt ; (x, t) ∈ K × R+ } has bounded uniformly continuous sample paths if and only if there exists a probability measure µ on K such that (9.83) holds and, in addition, 1/2 1 du = 0. (9.84) lim sup log →0 y∈K 0 µ(BdX (y, u)) Proof Let Gα = {Gα (x), x ∈ K} be a mean zero Gaussian process with (E(Gα (x) − Gα (y))2 )1/2 = dX (x, y). By Theorem 9.4.1, these statements hold if and only if Gα has bounded sample paths or Gα is continuous. It follows from Theorems 6.3.1 and 6.3.4 that (9.83) is a necessary and sufficient condition for Gα to be bounded and from Theorems 6.3.1 and 6.3.5 that (9.83) and (9.84) give necessary and sufficient conditions for Gα to be continuous (we also use Corollary 5.4.7).
9.5 Moduli of continuity In this section X = (Ω, G, Gt , Xt , θt , P x ) is a strongly symmetric Borel right process on a locally compact metric space (S, τ ) with continuous α-potential density uα (x, y). L = {Lyt ; (y, t) ∈ S × R+ } denotes the local times of X normalized by (3.91). Gα = {Gα (y) ; y ∈ S} is a real-valued Gaussian process with mean zero and covariance uα (x, y). We obtain local and uniform moduli of continuity for L in terms of the corresponding quantities for Gα .
Sample path properties of local times
418
Theorem 9.5.1 For α > 0, let ρα be an exact local modulus of continuity for Gα = {Gα (y), y ∈ S} at y0 ∈ S (i.e., (7.2) holds). Let C be a countable separating set for Gα . Then lim
sup
δ→0 τ (y,y0 )≤δ y∈C
√ |Lyt − Lyt 0 | = 2 Cα (Lyt 0 )1/2 ρα (τ (y, y0 ))
for almost all t a.s.,
(9.85) where Cα is the same constant as in (7.2). If ρα is simply a local modulus of continuity for Gα , (9.85) holds with the equality sign replaced by a “less than or equal to sign” and if ρα is a lower local modulus of continuity for Gα , (9.85) holds with the equality sign replaced by a “greater than or equal to sign.” Proof
We use Lemma 9.1.2. Let
|f (y) − f (y0 )| √
= 2 Cα |f (y0 )|1/2 . B = f ∈ F (C) lim sup δ→0 τ (y,y0 )≤δ ρα (τ (y, y0 )) y∈C
(9.86) Then, using Theorem 7.7.1 on Gα ( · ) + s, we see that 2 P (Gα ( · ) + s) /2 ∈ B = 1.
(9.87)
It follows from the Lemma 9.1.2 that for almost all ω ∈ ΩGα , with respect to PGα ,
Ly − Ly0 (Gα (y, ω ) + s)2 − (Gα (y0 , ω ) + s)2
t + lim sup t
(9.88) δ→0 τ (y,y0 )≤δ ρα (τ (y, y0 )) 2ρα (τ (y, y0 )) y∈C
=
1/2 √ (Gα (y0 , ω ) + s)2 2 Cα Lyt 0 + for almost all t ∈ R+ 2
P x almost surely, for all x ∈ S. Writing (Gα (y, ω ) + s)2 − (Gα (y0 , ω ) + s)2 =
G2α (y, ω )
−
G2α (y0 , ω )
(9.89)
+ 2s(Gα (y, ω ) − Gα (y0 , ω ))
and using Theorem 7.7.1 again, we see that for almost all ω ∈ ΩGα , with respect to PGα ,
(G (y, ω ) + s)2 − (G (y , ω ) + s)2
α α 0 lim sup (9.90)
δ→0 τ (y,y0 )≤δ 2ρα (τ (y, y0 )) y∈C
≤ Cα (|Gα (y0 , ω )| + 2s) . As in the proof of Theorem 9.1.3, we can take |Gα (y0 , ω )| arbitrarily close to zero on a set of positive probability, and, of course, we can
9.5 Moduli of continuity
419
also take s arbitrarily close to zero. Using this in (9.88) along with the triangle inequality, we get (9.85). The proof of the one-sided results when ρα is either a local or lower local modulus of continuity is simply a one-sided version of the proof of (9.85). Remark 9.5.2 In Theorem 9.5.1, if Gα is continuous on a compact set K ⊂ S that contains a neighborhood of y0 , it follows from Theorem 9.4.1 (2) that there is a version of L that is continuous on K. In this case, using the continuity of L, we obtain (9.85) with the lim sup taken on y ∈ K, provided that ρ(y, y0 ) is also continuous in y. Consequently, we can extend (9.85) from C to S. In particular, for real-valued L´evy processes, the associated Gaussian processes have stationary increments; therefore, if they are continuous at a point, they are continuous on all of R1 (see Theorem 5.3.10). Thus we can always make the above extension for the local times of L´evy processes. The hypotheses of the next three theorems imply that the Gaussian process is continuous on a compact set K ⊂ S. As we just remarked, it follows from Theorem 9.4.1 (2) that there is a version of L that is continuous on K. The theorems refer to this version. The proofs of the next three theorems are very similar to the proof of Theorem 9.5.1. Theorem 9.5.3 Let K ⊂ S be compact. For α > 0, let ωα be a uniform modulus of continuity for {Gα (y), y ∈ K} (i.e., (7.1) holds with a “less than or equal to” sign). Then lim sup
δ→0 τ (x,y)≤δ x,y∈K
√ |Lxt − Lyt | ≤ 2 Cα sup (Lyt )1/2 ωα (τ (x, y)) y∈K
for almost all t a.s., (9.91)
where Cα is the same constant as in (7.1). Proof As in the proof of Theorem 9.5.1, we use Lemma 9.1.2. Let C be a countable separating set for Gα . Let
|f (x) − f (y)| √
≤ 2 Cα sup |f (x)|1/2 . B = f ∈ F (C) lim sup δ→0 τ (x,y)≤δ ωα (τ (x, y)) x∈K x,y∈C
(9.92) Then, by Theorem 7.7.3, 2 P (Gα ( · ) + s) /2 ∈ B = 1.
(9.93)
Sample path properties of local times
420
It follows from the Lemma 9.1.2 that for almost all ω ∈ ΩGα , with respect to PGα ,
Lx − Ly (Gα (x, ω ) + s)2 − (Gα (y, ω ) + s)2
t + lim sup t
(9.94) δ→0 τ (x,y)≤δ ωα (τ (x, y)) 2ωα (τ (x, y)) x,y∈C
≤
√
1/2 (Gα (z, ω ) + s)2 2 Cα sup Lzt + for almost all t ∈ R+ 2 z∈K
P x almost surely, for all x ∈ S. Using Theorem 7.7.3 again, we see that for almost all ω ∈ ΩGα , with respect to PGα ,
(G (x, ω ) + s)2 − (G (y, ω ) + s)2
α α (9.95) lim sup
δ→0 d(x,y)≤δ 2ωα (τ (x, y)) x,y∈C
≤ Cα
sup |Gα (z, ω )| + 2s .
z∈K
By Lemma 5.3.5 we can take supz∈K |Gα (z, ω )| arbitrarily close to zero on a set of positive measure. This enables us to complete the proof.
The next theorem shows that if ωα is an exact uniform modulus of continuity for Gα , then it is the “best possible” in (9.91). Theorem 9.5.4 Let K ⊂ S be compact. For α > 0 let ωα be an exact uniform modulus of continuity for {Gα (y), y ∈ K} (i.e., (7.1) holds). Then there exists a y0 ∈ K such that lim sup
δ→0 τ (x,y)≤δ x,y∈K
√ |Lxt − Lyt | ≥ 2 Cα (Lyt 0 )1/2 ωα (τ (x, y))
for almost all t a.s., (9.96)
where Cα is the same constant as in (7.1). Proof We use the fact that (7.1) holds with a “greater than or equal to” sign and the ideas in the beginning of the proof of Theorem 9.5.3 along with Lemma 7.7.4 to get (analogously to (9.94)) that
Lx − Ly (Gα (x, ω ) + s)2 − (Gα (y, ω ) + s)2
t + lim sup t
(9.97) δ→0 τ (x,y)≤δ ωα (τ (x, y)) 2ωα (τ (x, y)) x,y∈C
≥
1/2 √ (Gα (y0 , ω ) + s)2 2 Cα Lyt 0 + for almost all t ∈ R+ 2
P x almost surely, for some y0 ∈ K. (We add the point y0 , which is designated by Lemma 7.7.4, to C.) We use the fact that (7.1) holds with
9.5 Moduli of continuity
421
a “less than or equal to” sign and Theorem 7.7.3 to control the term in Gα , in the left-hand side of (9.97). This enables us to complete the proof as in the previous two theorems. In order to get the result we want for the exact uniform modulus of continuity, we need (S, dGα ) to be locally homogeneous. Theorem 9.5.5 Assume that (S, dGα ) is a locally homogeneous metric space, and let K ⊂ S be a compact set that is the closure of its interior. For α > 0 let ωα be an exact uniform modulus of continuity for {Gα (y), y ∈ K} (i.e., (7.1) holds with τ (u, v) = dGα (u, v)). Then lim
δ→0 d
sup
Gα (x,y)≤δ
√ |Lxt − Lyt | = 2 Cα sup (Lyt )1/2 ωα (dGα (x, y)) y∈K
for almost all t a.s.,
x,y∈K
(9.98) where Cα is the same constant as in (7.1). Proof Let B be the set defined in (9.92) but with an equal sign. By Theorem 7.7.6, the equality in (9.93) holds for B and we get (9.94) with an equal sign. The rest of the proof is the same as the proof of Theorem 9.5.3. In dealing with local times it is natural to express the moduli with respect to Euclidean distance on R1 . We can do this when the Gα are Gaussian processes with stationary increments. Theorem 9.5.6 Assume that Gα is a real-valued Gaussian process with stationary increments. Let K ⊂ R1 be an interval. For α > 0 let ωα be an exact uniform modulus of continuity for {Gα (y), y ∈ K} (i.e., (7.1) holds with τ (u, v) = |u − v|). Then lim sup
δ→0 |x−y|≤δ x,y∈K
√ |Lxt − Lyt | = 2 Cα sup (Lyt )1/2 ωα (|x − y|) y∈K
for almost all t a.s., (9.99)
where Cα is the same constant as in (7.1). Proof By hypothesis, Theorem 7.7.4 holds with τ (u, v) = |u − v|. Because Gα has stationary increments, dGα (x, y) is a function of |x − y|. Therefore, the neighborhoods BdGα (u0 , ) are isometric for all u0 ∈ K. Consequently, the proof of Theorem 7.7.5 and hence of Theorem 7.7.6 goes through exactly as written. This is all we need, as in the proof of Theorem 9.5.5.
Sample path properties of local times
422
Remark 9.5.7 It follows from Remark 7.7.7 that if ωα or ρα is an mmodulus, Theorems 9.5.1–9.5.6 hold with ωα (τ ( · , · )) replaced by ωα (δ) and with limδ→0 replaced by lim supδ→0 , and similarly for ρα . Remark 9.5.8 The fact that we can only obtain Theorems 9.5.1–9.5.6 for almost all t, rather than for all t, is a weakness of our method. What we actually obtain in the critical Lemma 9.1.2 is (9.10), in which λ is an exponential random variable. In using Fubini’s Theorem we can only get (9.7) for almost all t ∈ R+ . Using Theorems 9.5.1–9.5.6 and material in Chapter 7, we now give some specific results about the modulus of continuity of local times. We begin with a general result for Borel right processes on R1 . For X and Gα as in the first paragraph of this section, we define the metric 1/2 (9.100) dα (x, y) := E(Gα (x) − Gα (y))2 =
1/2
(uα (x, x) + uα (y, y) − 2uα (x, y))
.
Theorem 9.5.9 Let X be a strongly symmetric Borel right process on R1 with α-potential density uα (x, y), α > 0 and local time L = {Lyt ; (y, t) ∈ R1 × R+ }. Let y0 ∈ R1 and φα ( · ) be an increasing function such that: (1) There exist constants 0 < C0 ≤ C1 < ∞ for which C0 φα (|h|) ≤ dα (y, y + h) ≤ C1 φα (|h|)
(9.101)
for all y in some neighborhood of y0 and all |h| sufficiently small. (2) φ2α (2−n ) − φ2α (2−n−1 ) is nonincreasing in n for all n sufficiently large. (3) φα (2−n ) ≤ 2β φα (2−n−1 ) for all n sufficiently large for some β < 1. Set
ρα (h) = φα (h) (log log 1/h)
1/2
1/2
+ 0
φα (hu) du u(log 1/u)1/2
(9.102)
and assume that the integral is finite for h ∈ [0, δ] for some δ > 0. Then lim
sup
δ→0 |y−y0 |≤δ y∈C
|Lyt − Lyt 0 | = C(y0 , α) (Lyt 0 )1/2 ρα (|y − y0 |)
for almost all t a.s.,
(9.103) where C(y0 , α) > 0 is a constant depending on y0 and α, and C is any countable separating set for Gα . Furthermore, (9.103) also holds with ρα (|y − y0 |) replaced by ρα (δ), limδ→0 replaced by lim supδ→0 , and C(y0 , α) replaced by a finite constant C (y0 , α) ≥ C(y0 , α).
9.5 Moduli of continuity
423
Proof It follows from Lemma 6.4.6 and (7.126) that G is continuous in a neighborhood of y0 . Therefore, by Theorem 7.6.4 and Remark 9.5.7, ρα is both an exact local modulus and an exact local m-modulus of continuity of the associated Gaussian process Gα . The results now follow from Theorem 9.5.1. Remark 9.5.10 By Remark 9.5.2, if L, or equivalently Gα , is continuous on a compact neighborhood of y0 , we can replace C by R1 in (9.103). Using (7.274), we can simplify Theorem 9.5.9 when X is a symmetric L´evy process. Recall the function σ02 (h), defined on page 330: σ02 (h)
4 = π
∞ sin2
λh 1 dλ. 2 ψ(λ)
(9.104)
0
Theorem 9.5.11 Let X be a symmetric L´evy process as defined in (7.261) satisfying (7.264) with local time L = {Lyt ; (y, t) ∈ R1 × R+ }. Let y0 ∈ R1 and φ( · ) be an increasing function such that: (1) There exist constants 0 < C0 ≤ C1 < ∞ for which C0 φ(|h|) ≤ σ0 (h) ≤ C1 φ(|h|)
(9.105)
for all y in some neighborhood of y0 and all |h| sufficiently small. (2) φ2 (2−n ) − φ2 (2−n−1 ) is nonincreasing in n for all n sufficiently large. (3) φ(2−n ) ≤ 2β φ(2−n−1 ) for all n sufficiently large for some β < 1. Set
1/2
ρ(h) = φ(h) (log log 1/h)
1/2
+ 0
φ(hu) du u(log 1/u)1/2
(9.106)
and assume that the integral is finite for h ∈ [0, δ] for some δ > 0. Then lim
sup
δ→0 |y−y0 |≤δ y∈R1
|Lyt − Lyt 0 | = C (Lyt 0 )1/2 ρ(|y − y0 |)
for almost all t a.s.
(9.107)
for some constant 0 < C < ∞. Furthermore, (9.107) also holds with ρ(|y − y0 |) replaced by ρ(δ), limδ→0 replaced by lim supδ→0 , and C replaced by a constant C ≤ C < ∞. Note that, in general, it is easier to estimate the integral that expresses σ0 than the corresponding integrals for σα , α > 0. Proof Consider X killed at the end of an independent exponential time with mean 1. If one could replace σ0 by σ1 in (9.105), this theorem
424
Sample path properties of local times
would follow immediately from Theorem 9.5.9. By (7.274), we can do precisely this. Example 9.5.12 The conditions on φ in Theorem 9.5.11 are the same as those in Theorem 7.6.4, and Φ1 in Example 7.6.6 is the same as ρ in (9.106). Thus, everything in Example 7.6.6 carries over to φ and ρ in Theorem 9.5.11. However, they are of no interest unless we can find functions σ0 for which (9.105) holds for these functions φ. In Example 7.6.6 (1), φ is a regularly varying function of index 0 < α < 1. By (7.279) and (7.283), we can find σ0 asymptotic to any regularly varying function of index 0 < α < 1. In Example 7.6.6 (2)–(4), ∞ φ is slowly varying at zero. Thus, by (7.281) it is asymptotic to C 1/h ψ −1 (λ) dλ at zero. By (7.283), we can take ψ asymptotic to any regularly varying function of index one at infinity. Consequently, we can find functions σ0 asymptotic at zero, to the φ in (2)–(4). We illustrate this in the case of Example 7.6.6 (2). Suppose that φ(h) = exp(−g(log 1/h)) for h ∈ [0, h0 ] for some h0 > 0, where g is a differentiable normalized regularly varying functions at infinity of index 0 < δ < 1. Set ∞ φ(h) = d exp(−g(log λ)) (9.108) 1/h ∞
=
1/h
g (log λ) exp(−g(log λ)) dλ. λ
By (7.283), we can take ψ asymptotic to
λ g (log λ)
exp(g(log λ)) at infinity.
The next theorem gives an iterated logarithm law for the local times of L´evy processes with the precise value of the constant under minimal conditions. Theorem 9.5.13 Let X be a symmetric L´evy process as defined in (7.261) satisfying (7.264), and let σ0 (u) be as given in (9.104). Let L = {Lyt ; (y, t) ∈ R1 × R+ } denote the local times of X. Let σ0∗ (u) = sups≤u σ0 (s). Assume that: (1) For all > 0 there exists θ > 1 such that σ0∗ (θu) ≤ (1 + )σ0∗ (u), |u| ≤ u0 for some u0 > 0 and 1/2 σ0∗ (δu) 1/2 ∗ . (9.109) du = o σ (δ) (log log 1/δ) 0 u(log 1/u)1/2 0 (2) σ0∗ (|u|) = O(|u|α ) for some α > 0.
9.5 Moduli of continuity
425
Then, for all y0 in R1 , lim
sup
δ→0 |y−y0 |≤δ y∈R1
|Lyt − Lyt 0 | σ0∗ (|y
1/2
− y0 |) (log log 1/|y − y0 |)
= 2 (Lyt 0 )1/2
(9.110)
for almost all t almost surely. Indeed, without condition (1), the lower bound in (9.110) still holds, and without requiring condition (2) and the first half of condition (1), replacing 2 by 4 gives an upper bound. Furthermore, this theorem remains valid with σ0∗ (|y − y0 |)(log log 1/ 1/2 and with limδ→0 |y − y0 |)1/2 in (9.110) replaced by σ0∗ (δ) (log log 1/δ) replaced by lim supδ→0 . Proof As in the proof of Theorem 9.5.11, we consider X killed at the end of an independent exponential time with mean 1 and the associated Gaussian process, which has increments variance σ12 . By (7.274), Lemma 7.4.4, and Theorem 9.5.1, we get the lower bound in (9.110) with σ0∗ replaced by σ1∗ . Using (7.274) again, we can replace σ1∗ by σ0∗ . We use the same procedure for the upper bound along with Lemma 7.1.6, Corollary 7.2.3, Theorem 9.5.1, and Remark 9.5.2. The statement about replacing 2 by 4 follows from the proof of Corollary 7.2.3 except that (7.103) is used in the interpolation of (7.102). The statement about the m-modulus also uses Lemma 7.1.6. Note that condition (1) is satisfied when σ0∗ is a regularly varying function of index 0 < β ≤ 1. Remark 9.5.14 A portion of Theorem 9.5.13 holds in great generality. Let X be a strongly symmetric Borel right process with continuous zero potential density u(x, y) and set σ ∗ (y, y0 ) = sup (u(s, s) − u(y0 , y0 )). y0 ≤s≤y
(9.111)
If σ ∗ (y, y0 ) > 0 for y > y0 , it follows from Remark 7.4.6 that lim δ↓0
sup y0
y∈C
|Lyt − Lyt 0 | 1/2
σ ∗ (y, y0 ) (log log 1/σ ∗ (y, y0 ))
≥ 2 (Lyt 0 )1/2
(9.112)
for almost all t ∈ R+ almost surely, for all countable sets C ⊂ R1 . We now consider uniform moduli of continuity. Theorem 9.5.15 Let X be a symmetric L´evy process as defined in (7.261) satisfying (7.264) and let σ0 (u) be as given in (9.104). Let
Sample path properties of local times
426
L = {Lyt ; (y, t) ∈ R1 × R+ } denote the local times of X. Consider the following conditions: (1) σ02 (h) is a normalized regularly varying function at zero with index 0 < α < 1 or a normalized regularly varying function of type A. (2) σ02 (h) is a normalized slowly varying function at zero that is asymptotic to an increasing function near zero and 1/2 σ0 (δu) 1/2 . (9.113) du = o σ0 (δ) (log 1/δ) 1/2 u(log 1/u) 0 (3) σ02 (u) is concave for 0 ≤ u ≤ h for some h > 0, and (9.113) holds. Let I be a closed interval in R1 . Then, if any of the conditions (1)–(3) hold lim sup
δ→0 |x−y|≤δ x,y∈I
|Lxt − Lyt | = 2 sup(Lyt )1/2 σ0 (|x − y|)(log 1/|x − y|)1/2 y∈I
(9.114)
for almost all t almost surely . Furthermore, this theorem remains valid with σ0 (|x − y|)(log 1/|x − y|)1/2 in (9.114) replaced by σ0 (δ)(log 1/δ)1/2 and with limδ→0 replaced by lim supδ→0 . Proof It follows from Theorem 7.2.14 that (2σ0 (|x−y|) log 1/|x−y|)1/2 0 is an exact uniform modulus of continuity for the Gaussian process G defined on page 330. By (7.287), 0 (y) law 1 (x) − G 1 (y) + H1 (x) − H1 (y) 0 (x) − G = G G
(9.115)
1 and H1 are defined in Lemma 7.4.11). as stochastic processes on I ×I (G By (7.288) and Theorem 7.2.1, the uniform modulus of continuity of H1 0 . Therefore, by is “little o” of the uniform modulus of continuity of G 1/2 is also an exact the triangle inequality, (2σ0 (|x − y|) log 1/|x − y|) 1 . Using this in Theorem 9.5.6, we uniform modulus of continuity for G get (9.114). The last statement in the theorem follows from Remark 9.5.7. Condition (3) is interesting. Every function that is concave near zero can be the increments variance of a Gaussian process with stationary increments, but it is not so easy to see whether associated Gaussian processes have this property. One class of associated processes, for which it is easy to see that σ0 is concave, is the class of symmetric stable processes of index 1 < p < 2. We point this out in Example 7.4.13. Because of the significance of stable processes, we give this special case
9.5 Moduli of continuity
427
of Theorems 9.5.13 and 9.5.15 in the next example. In the next section we introduce what we call stable mixtures, which also have associated Gaussian processes with concave σ0 . Example 9.5.16 Let X be a symmetric stable process of index 1 < p < 2 with local time L = {Lyt ; (y, t) ∈ R1 × R+ }. Then, for all y0 ∈ R1 , lim
sup
|Lyt − Lyt 0 |
δ→0 (|y−y0 |≤δ y∈R1
1/2
(|y − y0 |p−1 log log 1/|y − y0 |)
for almost all t ∈ R+ almost surely, and lim sup
δ→0 |x−y|≤δ x,y∈I
|Lxt − Lyt | 1/2
(|x − y|p−1 log 1/|x − y|)
= (2Cp Lyt 0 )1/2
1/2
=
(9.116)
2Cp sup Lyt y∈I
(9.117)
for almost all t ∈ R+ almost surely. If X is standard Brownian motion, both (9.116) and (9.117) hold with p = 2 (Cp is given in (7.280); see also Example 7.4.13). Furthermore, (9.117) remains valid with |x − y|p−1 (log 1/|x − y|)1/2 replaced by δ p−1 (log 1/δ)1/2 and similarly for (9.116) and with limδ→0 replaced by lim supδ→0 . Condition (1) of Theorem 9.5.15 is also difficult to work with. The hypothesis of the next corollary is easier to verify. Corollary 9.5.17 Let X be a symmetric L´evy process as defined in (7.261) with L´evy exponent ψ. Let L = {Lyt ; (y, t) ∈ R1 × R+ } denote the local times of X. Assume that ψ is regularly varying at infinity with index 1 < p < 2 or is regularly varying at infinity with index 2 and ψ(λ)/λ2 is nonincreasing as λ → ∞. Then (9.114) holds, with σ0 (u) as given in (9.104). Lemma 7.4.10, shows that ψ(λ) can be taken to be asymptotic to any regularly varying function of index 1 < p < 2 or of index 2 as long as ψ(λ)/λ2 is nonincreasing as λ → ∞. Proof Consider the stationary Gaussian process G1 defined on page 330, with spectral density (1 + ψ(λ))−1 . By Remark 7.3.4, (7.223), with equality, holds for G1 . Therefore the proof follows from Theorem 9.5.6 and (7.274). For a rather narrow class of processes with continuous local times a uniform modulus of the type given in (9.114) in not large enough. We consider this in the next theorem.
428
Sample path properties of local times
Theorem 9.5.18 Let X be a strongly symmetric Borel right process on R1 with α-potential density uα (x, y), α > 0 and local time L = {Lyt ; (y, t) ∈ R1 × R+ }. Let K ⊂ R1 be an interval. Let φα ( · ) be an increasing function such that: (1) There exist constants 0 < C0 ≤ C1 < ∞ for which C0 φα (|x − y|) ≤ dα (x, y) ≤ C1 φα (|x − y|)
(9.118)
for all x, y ∈ K and all |h| sufficiently small. (2) φ2α (2−n ) − φ2α (2−n−1 ) is nonincreasing in n for all n sufficiently large. (3) φα (2−n ) ≤ 2β φα (2−n−1 ) for all n sufficiently large for some β < 1. Let
Φ2,α (h) = 0
h
φα (u) du u(log 1/u)1/2
(9.119)
and assume that φα (h)(log 1/h)1/2 = O(Φ2,α (h)); then there exist constants 0 < Cα < ∞ such that lim sup
δ→0 |x−y|≤δ x,y∈K
|Lxt − Lyt | = Cα sup (Lyt )1/2 Φ2,α (|x − y|) y∈K
(9.120)
for almost all t almost surely. Furthermore, (9.103) also holds with Φ2,α (|x − y|) replaced by Φ2,α (δ) and with limδ→0 replaced by lim supδ→0 . Note that it follows from Lemma 7.2.5 that the hypotheses of this theorem are not satisfied if φ is regularly varying at zero with index α > 0. Proof
This follows immediately from Theorems 7.6.9 and 9.5.6.
Remark 9.5.19 Just as Theorem 9.5.9 gives a simplified version of Theorem 9.5.11 for L´evy processes, we can use (7.274) to simplify Theorem 9.5.18 when X is a symmetric L´evy process by replacing dα (x, y) by σ0 (|x − y|), φα by φ and Cα by C. (One can then denote the function Φ2,α in (9.119) by Φ2 .) Example 9.5.20 We consider examples of exact uniform moduli of continuity for local times that correspond to the four cases considered in Example 7.6.6. (1) Corresponding to this case we have Corollary 9.5.17. When ψ is regularly varying at infinity with index 1 < p < 2, σ02 is regularly varying at zero with index 0 < α < 1, by (7.279).
9.5 Moduli of continuity
429
(2) Using the argument in the paragraph containing (9.108) along with Lemma 7.4.10, we can actually take ψ to be a normalized regularly varying function of index 1 and asymptotic to F at infinity. Thus σ0 can be taken to be a normalized slowly varying function asymptotic to exp(−g(log 1/h)) at zero. It follows from Theorem 9.5.15 under condition (2) that (9.114) continues to hold. (3) In this case, by Lemma 7.2.5, Φ1 in (7.374) and Φ2,0 in (9.119) are comparable and we get (9.120) by Theorem 9.5.18 and Remark 9.5.19. (4) Same as (3). Remark 9.5.21 The key result in establishing the correspondences between limit laws for local times and their associated Gaussian processes given in Theorems 9.5.1–9.5.6 is Lemma 9.1.2. In the proof of this lemma which is X killed at the end of an independent we initially consider X, exponential time with mean 1/α and its 0-potential density uα . It is clear that we could have considered any Borel right process X with 0potential density u(x, y) and associated Gaussian process G and in place of (9.10) obtained that when (9.5) holds, (with Gα replaced by G) 1 · P x × PG L ∞ + (G( · ) + s)2 ∈ B = 1, (9.121) 2 where L∞ is the total accumulated local time of X. This is all we need to obtain Theorems 9.5.1–9.5.6 with L·t replaced by L·∞ , with the various moduli of continuity being those of G. As an example suppose that X is a Borel right process with continuous α-potential densities α > 0 that is killed the first time it hits 0. Then, using Corollary 8.1.2, we see that Theorems 9.5.1–9.5.6 hold with L·t replaced by L·T0 , with the various moduli of continuity being those of a mean zero Gaussian process with covariance uT0 . We give some of these results, more explicitly, for stable processes in the next example. Example 9.5.22 Let X be a symmetric stable process of index 1 < p < 2 with local time L = {Lyt ; (y, t) ∈ R1 × R+ }. Then (9.116) and (9.117) hold with L·t replaced by L·T0 . Let us focus on the local result lim
sup
δ→0 |y−y0 |≤δ y∈R1
|LyT0 − LyT00 | = (2Cp LyT00 )1/2 a.s. (|y − y0 |p−1 log log 1/|y − y0 |)1/2 (9.122)
Sample path properties of local times
430
Note that when y0 = 0, the right-hand side is zero. This is correct but not very interesting since for Brownian motion we have the much stronger result given in (2.217). We prove a generalization of (2.217) for diffusions in Chapter 12. We do not know what the left-hand side of (9.122) is when y0 = 0, for processes that do not have continuous paths. However, we can obtain an upper bound for it, which holds for all strongly symmetric Borel right processes and is probably the best possible. We do this in Subsection 9.5.1. Example 9.5.23 Here is a final example of the application of these ideas. Suppose that X is a Borel right process with continuous αpotential densities α > 0 that is killed at τ (λ). Here τ is the inverse local time at 0 and λ is an independent exponential random variable. Suppose also that 0 is recurrent for X. Then, using Corollary 8.1.2 and the Laplace transform technique, as in (9.11), we see that Theorems 9.5.1–9.5.6 hold with L·t replaced by L·τ (t) with the various moduli of continuity being those of a mean zero Gaussian process with covariance uT0 . (Actually, the covariance of the associated process, before passing from λ to t, is uτ (λ) , as given in (3.193), but the constant does not contribute anything to the moduli of G.) In this case, the local modulus of continuity has a neat form. For the same processes as in the previous example, we get lim sup
δ→0
|y|≤δ y∈R1
|Lyτ (t) − t| (t |y|p−1 log log 1/|y|)1/2
= (2Cp )1/2 for almost all t a.s.
(9.123) (actually, this result is also easily obtained from Theorem 8.2.2 and Example 7.4.13, which actually show that it holds for all t ≥ 0).
9.5.1 Local behavior of LT0 In Section 9.5, all the results on moduli of continuity of local times are lifted from corresponding results for the associated Gaussian processes. We cannot do this in general for the behavior of LT· 0 near zero because of the presence of the constant s in Theorem 8.1.1. Nevertheless, we can still use Theorem 8.1.1 by applying it to sup|y|≤δ |LyT0 | directly. Before we do this we consider the much simpler case of the local times of recurrent diffusions, since in this case we can use the First Ray–Knight Theorem for diffusions (Theorem 8.2.6) (we do not have a simple version of the first Ray–Knight Theorem for Markov processes without continuous sample paths).
9.5 Moduli of continuity
431
We begin with the following simple lemma. Lemma 9.5.24 Let {B(t), t ∈ R+ } and {B(t), t ∈ R+ } be independent standard Brownian motions. Then 2
lim sup t↓0
B 2 (t) + B (t) =1 2t log log(1/t)
a.s.
(9.124)
It follows from Khintchine’s law of the iterated logarithm, (2.15), that the left-hand side of (9.124) is between one and two. It is remarkable that the limit is actually one. Proof By (2.15) we need only obtain the upper bound. Let H(t) = 2 B 2 (t) + B (t), a(t) = 2t log log(1/t), and tk = θk . We take θ < 1 and note that for any > 0 we can choose θ sufficiently close to one so that 1 ≤ a(tk−1 )/a(tk ) ≤ 1 + . The left-hand side of (9.124) is bounded above by lim sup
sup
k→∞ tk ≤t≤tk−1
By (5.71),
H(t) − H(tk ) H(tk ) + lim sup . a(tk ) k→∞ a(tk )
x . P (H(t) ≥ x) = exp − 2t
(9.125)
(9.126)
Therefore, by the Borel–Cantelli Lemma, the second term in (9.125) is less than or equal to one. We now show that lim sup
sup
k→∞ tk ≤t≤tk−1
B 2 (t) − B 2 (tk ) =0 a(tk )
a.s.
(9.127)
which completes the proof of the lemma. The left-hand side of (9.127) is bounded by |B(t) + B(tk |) |B(t) − B(tk )| lim sup sup . a(tk ) a(tk ) k→∞ tk ≤t≤tk−1 (9.128) It is clear from (2.14) that the first lim sup in (9.128) is bounded almost surely by some constant. It is easy to see, again by the Borel–Cantelli Lemma, that the second lim sup in (9.128) is zero. This is because, by Lemma 2.2.11 for all k sufficiently large,
|B(t) − B(tk )| ≥δ (9.129) P sup a(tk ) tk ≤t≤tk−1 lim sup
sup
k→∞ tk ≤t≤tk−1
Sample path properties of local times
B(t) − B(tk ) ≤ 2P sup ≥δ a(tk ) tk ≤t≤tk−1
B(tk−1 ) − B(tk ) ≥δ = 4P a(tk ) δ 2 log k ≤ C(θ)(log k)1/2 exp − , (1/θ) − 1
432
where C(θ) depends only on θ (so that, for any δ > 0, we can choose θ such that the last term in (9.129) is a term of a convergent sequence). Thus we get (9.127). Theorem 9.5.25 Let X be a recurrent diffusion in R1 with continuous α-potential densities as in Subsection 8.2.1. Let LrT0 denote the local time of X starting at y > 0 and killed the first time it hits 0. Then lim sup x↓0
LxT0 =1 uT0 (x, x) log log(1/uT0 (x, x))
P y a.s.
(9.130)
Proof This follows immediately from Lemma 9.5.24 and Theorem 8.2.6, 2 2 )1{r≥y}+(B ρ(r)−ρ(y) )1{r≥y} since, in applying (8.80), the term (Bρ(r)−ρ(y) contributes nothing. In the general case we can only obtain upper bounds for the local behavior LT· 0 in the neighborhood of 0. We use the notation f · δ := sup|y|≤δ |fy | and g( · , · )δ := sup|y|≤δ |g(y, y)|, where f and g are continuous functions. Lemma 9.5.26 Let X = (Ω, G, Gt , Xt , θt , P x ) be a strongly symmetric Borel right process with continuous α-potential density uα (x, y) and state space R1 . Assume that P x (T0 < ∞) > 0 for all x ∈ S. Let L = {Lyt ; (y, t) ∈ R1 × R+ } denote the local time for X normalized so that ∞ Ex e−αt dLyt = uα (x, y). (9.131) 0
Let G = {Gy ; y ∈ S} denote the mean zero Gaussian process with covariance uT0 (x, y). Then, for all θ < 1 and γ > 0, lim sup sup n→∞ |y|≤θ n
LyT0 2
(EG · θn ) + γuT0 ( · , · )θn
log log 1/θn
≤
1+γ γ
a.s. (9.132)
The proof depends on an inequality for the moment generating function of the square of the supremum of a Gaussian process.
9.5 Moduli of continuity
433
Lemma 9.5.27 Let {G(y), y ∈ [0, δ]} be a continuous Gaussian process. Set σδ2 = EG2· δ and let a denote the median of G · δ . Then, for λ < 1/(2σδ2 ) and all > 0 sufficiently small, 2 2 (9.133) E exp λG · + a2δ ≤ eλ(1+) a 2 2 2 σδ aσδ λ(1 + ) a + 6λ + . exp 2 2 3/2 (1 − 2σδ λ) (1 − 2σδ λ) 1 − 2σδ2 λ Proof Let λ = λ/(1 + )2 . For any positive random variable X, by integration by parts, we have ∞ Eeλ X ≤ 1 + λ eλ y P (X > y) dy (9.134) 0 ∞ 2 λ eλ y P (X > y) dy = eλa + 2
= eλa + 2
((1+)a)2 ∞
2
λueλu P (X > ((1 + )u)2 ) du.
a
Take X = G · +
a2δ .
By (5.151) and (5.18), for u ≥ a,
P (X > ((1 + )u)2 ) ≤ P (G · δ − a ≥ (1 + )(u − a)) (9.135) (1 + )2 (u − a)2 ≤ exp − . 2σδ2 δ2 /(1 − 2 σδ2 λ). Using (9.135), we see Let σ δ2 := σδ2 /(1 + )2 and ρ2 := σ that the last integral in (9.134) is less than or equal to 2 ∞ a2 σδ2 λ) − 2ua u (1 − 2 du exp − u exp − 2λ 2 σδ2 2 σδ2 a 2 ∞ σδ2 a2 u − 2uaρ2 / = 2λ du exp − (9.136) u exp − 2ρ2 2 σδ2 a
2 2 2 ∞ u − aρ2 / σδ2 λa ρ du exp . u exp − = 2λ 2 2ρ σ δ2 a Note that
∞
2 α
√ (u − α)2 du = 2ρ2 + 2παρ. u exp − 2 2ρ
(9.137)
Using this in (9.136) along with (9.134), we get (9.133). Proof of Theorem 9.5.26 We use Theorem 8.1.1, as in the proof of Lemma 9.3.1, applied to {LxT0 , x ∈ [−δ, δ]}. We consider X under P x . Let Tδ denote the first hitting time of [−δ, δ] by X. Then, by the strong Markov property, E x exp wLT· 0 δ = E x E XTδ exp wLT· 0 δ . (9.138)
434
Sample path properties of local times
It follows from (8.2) that, for any s = 0, E XTδ exp wLT· 0 δ (9.139)
GXT δ exp w(G· + s)2 δ ≤E 1+ s p 1/p
GXT 1/q δ E exp(q w(G· + s)2 δ ) , ≤ E 1+ s √ where G = G/ 2 and 1/p + 1/q = 1. We now take s = a to see that the right-hand side of (9.139) is less than or equal to
p 1/p GXT 1/q δ E exp(q wG· + a2δ ) . (9.140) E 1+
a Using this and (5.184), we see that for any > 0, we can choose q sufficiently close to one such that E XTδ exp wLT· 0 δ ≤ Cp, E exp(w(1 + )G· + a2δ ). (9.141) Denote the right-hand side of (9.133) by H (a, λ, σδ ). Then, by Chebyshev’s inequality, λy P x LT· 0 δ > y ≤ Cp, H (a, λ, σδ ) exp − , (9.142) 1 + where σδ2 = E (G· ) δ = 2
uT0 ( · , · )δ . 2
(9.143)
Let λ = (a2 b(1 + )2 + 2σδ2 )−1 , where b > 0. Then σδ2 σδ (a2 + 2σδ2 )1/2 1/b H (a, λ, σδ ) ≤ e 1+3 . + 2 a2 b(1 + )2 a (b(1 + ))3/2 (9.144) c log n 2 Now let b = β/((1 + ) log n) and y = , where λ c = (1 + )2 +
(1 + )(1 + )2 . β
With these choices and (5.184) we see from (9.142) that 3/2 log n 1 . P x LT· 0 δ > y ≤ Cp, 1+ β n
(9.145)
(9.146)
Using this, the Borel–Cantelli Lemma, and (5.183) we get (9.132), in which γ = 1/β.
9.5 Moduli of continuity
435
Under some mild additional conditions we can write (9.132) in a more understandable form. Consider the following conditions imposed on the quantities in Lemma 9.5.26: 2
(1) (EG · δ ) = o (uT0 ( · , · )δ log log 1/δ) for all δ sufficiently small. 2 (2) (uT0 ( · , · )δ log log 1/δ) = o (EG · δ ) for all δ sufficiently small. (3) For all > 0 there exists a θ < 1 such that uT0 ( · , · )θn ≤ (1 + )uT0 ( · , · )θn+1 for all n sufficiently large. (4) For all > 0 there exists a θ < 1 such that EG · θn ≤ (1 + )EG · θn+1 for all n sufficiently large. Theorem 9.5.28 In addition to the hypotheses of Lemma 9.5.28: If conditions (1) and (3) hold, then lim sup sup δ→0
|y|≤δ
LyT0 ≤1 uT0 ( · , · )δ log log 1/δ
a.s.
(9.147)
If conditions (2) and (4) hold, then lim sup sup δ→0
|y|≤δ
LyT0 (EG · δ )
2
≤1
a.s.
(9.148)
If conditions (3) and (4) hold then, for all γ > 0, lim sup sup δ→0
|y|≤δ
LyT0 2
(EG · δ ) + γuT0 ( · , · )δ log log 1/δ
≤
1+γ γ
a.s.
(9.149) If G has stationary increments, then without conditions (3) and (4), (9.147)–(9.149) still hold if their right-hand sides are multiplied by four. Furthermore, all these results hold with δ replaced by y in the denominators of the fractions. The next to last statement of the theorem implies that (9.149), with the right-hand side multiplied by four, holds for all L´evy processes with continuous local times. Proof All these results are easy consequences of Lemma 9.5.26. Conditions (3) and (4) enable the interpolation between θn and θn+1 . Condition (1) enables us to ignore the expectation in the denominator of (9.132). We can then cancel the γ term in each denominator and take the remaining γ to be arbitrarily close to zero. Condition (2) enables us the ignore the uT0 in the denominator of (9.132). We then take γ arbitrarily large. Without condition (3), if G has stationary increments,
436
Sample path properties of local times
we still have sup uT0 (x, x) ≤ 8 sup EG2 (x/2)
|x|≤θ n
(9.150)
|x|≤θ n
=
8
sup |x|≤θ n /2
≤ 4
sup |x|≤θ n+1
EG2 (x) uT0 (x, x)
for θ sufficiently close to one. Without condition (4) the same argument works for EG · θn . Corollary 9.5.29 Let X be a symmetric L´evy process, without Gaussian component, with L´evy exponent ψ, and let σ0 be as defined in (7.266). Assume that σ0 is regularly varying at 0 with positive index. Let {Lxt , (x, t) ∈ R1 × R+ } denote the local times of X. Then lim sup sup δ→0
|y|≤δ
and lim sup
δ→0 |y|≤δ
LyT0 ≤1 σ02 (δ) log log 1/δ
LyT0 ≤1 σ02 (y) log log 1/y
a.s.
a.s.
(9.151)
(9.152)
Proof By (4.86) and (4.87), σ02 (x) = uT0 (x, x). Let G be the Gaussian process with covariance uT0 (x, y). It follows from (7.88) and (7.128) that condition (1), prior to Theorem 9.5.28, holds for G. The regular variation of σ0 implies that condition (3) is satisfied. Thus (9.151) follows from (9.147) and (9.152) follows from Lemma 7.1.6. Example 9.5.30 Using (2.139) and (9.147), we get the same upper bound for LT0 of standard Brownian motion as in (2.217). (It is clear that for Markov processes with continuous paths, if x > 0, LxT0 is nonzero only if the process starts above zero. Therefore, we must modify (9.147) by writing P x almost surely, x > 0 and similarly when x < 0.) For symmetric stable processes with index 1 < p < 2, by Corollary 9.5.29, we get lim sup
δ→0
|y|≤δ y∈R1
LyT0 ≤1 Cp |y|p−1 log log 1/|y|
a.s.
(9.153)
(We show in Remark 4.2.6 that C2 = 1, so there appears to be a discrepancy between (9.153)) and the result for standard Brownian motion. But remember that the L´evy exponent of standard Brownian motion
9.6 Stable mixtures
437
is |λ|2 /2, whereas the L´evy exponent of a symmetric p-stable process, 0≤ p ≤ 2, is |λ|p .) Finally, we note that there are many interesting cases in which condition (2) holds. It follows from (7.88) that for some constant C, C ρ(δ) (see (7.84)) is an upper bound for EG · δ . Using the arguments in the proof of Lemma 6.4.6, we can show that, up to a constant multiple greater than zero, it is also a lower bound when the integral term in (7.84) exceeds the other term and EG2 (x) is asymptotic to a monotonic function. Therefore, the examples in Example 7.6.6 are applicable and give many cases in which condition (2) is satisfied and an idea of what the denominator in (9.147) looks like.
9.6 Stable mixtures Let X = {X(t), t ∈ R+ } be a canonical stable process, that is, ψ(λ) = |λ|p
(9.154)
for some 0 < p ≤ 2 for ψ as defined in (4.63) (see also page 141). X has local times only for 1 < p ≤ 2. We restrict ourselves to this range for p. We call a L´evy process a stable mixture if its L´evy exponent ψ(λ) can be represented as 2 ψ(λ) = |λ|s dµ(s), (9.155) 1
where µ is a finite positive measure on (1, 2] such that 2 dµ(s) < ∞. 1 2−s
(9.156)
By the observations in the paragraph containing (4.94) we see that ∞ ψ(λ) = 2 0 (1 − cos(λu)) dν(u) with 2 dν(u) 1 = dµ(s). (9.157) s+1 du 2πC s+1 u 1 To verify that ψ is the L´evy exponent of a L´evy process it suffices, by (7.263), to show that ν is a L´evy measure, that is, 2 ∞ 1 (u2 ∧ 1) dµ(s) du < ∞. (9.158) s+1 u 0 1 This is simple to do by integrating first with respect to u. In the next two lemmas we show that the L´evy exponent of a stable mixture and the increments variance of its associated Gaussian process
Sample path properties of local times
438
have many of the smoothness properties that are hypotheses in our results on moduli of continuity of Gaussian processes and local times. Lemma 9.6.1 Suppose that µ in (9.155) is supported on [1, β] for some 1 < β ≤ 2. Then ψ is a normalized regularly varying function at infinity of index β. Proof Let 1 < β < β. Since the support of µ is [1, β], it puts positive mass on [b, β], where b = (β + β)/2. Thus β λs dµ(s) as λ → ∞. (9.159) ψ(λ) ∼ b
Taking β arbitrarily close to β shows that ψ(2λ) ∼ 2β ψ(λ) as λ → ∞, which, by definition, means that ψ is regularly varying at infinity with index β. By similar reasoning we see that β λψ (λ) ∼ sλs dµ(s) ∼ βψ(λ) as λ → ∞. (9.160) b
Since λψ (λ)/ψ(λ) → β as λ → ∞, we see that ψ is a normalized regularly varying function at infinity of index β. Remark 9.6.2 Continuing the argument that gives (9.160), we see that for n = 1, 2, . . ., λn ψ (n) (λ)/ψ(λ) → β(β − 1) . . . (β − n + 1)
as
λ → ∞.
(9.161)
This, by definition, shows that ψ is actually a “smoothly” varying function at infinity (see Bingham, Goldie and Teugels (1987, page 44)).
Lemma 9.6.3 Let ψ(λ) be given as in (9.155). Then ∞ 1 − cos λx σ 2 (x) = dλ ψ(λ) 0
(9.162)
is concave on [0, ∞]. Proof
Note that
∞ 2σ 2 (x) − σ 2 (x − h) − σ 2 (x + h) = (1 − cos v) (9.163) 0 1 1 2 − − dv xψ(v/x) (x + h)ψ(v/x + h) (x − h)ψ(v/x − h)
for all |h| ≤ |x|. Therefore, to show that σ 2 is concave on [0, ∞], it suffices to show that the term in the bracket is positive, that is, 1/(xψ(v/x))
9.6 Stable mixtures
439
is concave in x for all x > 0 and v > 0. Clearly this is equivalent to showing that g(x) = 1/(xψ(1/x)) is concave for x > 0. By definition, 1 g(x) = 2 , (9.164) 1−s x dµ(s) 1 so that
2
(s − 1)x−s dµ(s) 1 g (x) = 2 2 x1−s dµ(s) 1
(9.165)
and g (x) =
2
2 2 − 1)x−s dµ(s) s(s − 1)x−s−1 dµ(s) 1 − 3 2 . 2 1−s 2 1−s x dµ(s) x dµ(s) 1 1 2 (s 1
(9.166)
Thus, g ≤ 0 if
2
−s
(s − 1)x
2
2 dµ(s) ≤
1
2
−s−1
s(s − 1)x
2
x1−s dµ(s)
dµ(s)
1
1
(9.167) or, equivalently, if
2
−s
(s − 1)x
2
2 dµ(s) ≤
1
2
s(s − 1)x−s dµ(s)
1
2
x−s dµ(s).
1
(9.168) This follows from the Schwarz inequality applied to
2
−s
(s − 1)x
2 1
=2
2 1
2 dµ(s) 1/2
[x−s (s − 1)/s]
(9.169) [x−s (s − 1)s]
1/2
2 dµ(s)
since 2(s − 1)/s ≤ 1. Stable mixtures give rise to a large class of regularly varying characteristic exponents of L´evy processes. Lemma 9.6.4 Let ρ(s) be a bounded continuous increasing function on [0, β − 1], 1 < β < 2. Then we can find a stable mixture with characteristic exponent ψ(λ) = λβ ρˆ(log λ)
(9.170)
Sample path properties of local times
440
for all λ > 0, where
∞
ρˆ(v) =
e−vs dρ(s).
(9.171)
0
If, in addition,
1
0
dρ(s) < ∞, s
(9.172)
the above statements are also valid when β = 2. Proof Note that R(s) := ρ(β − 1) − ρ(β − s) is an increasing function on [1, β]. Let µ(s) in (9.155) be a measure with distribution function R. Then, for 1 < β < 2, β ψ(λ) = − λs dρ(β − s) (9.173) 1
β−1
= λβ
λ−s dρ(s)
0
β−1
= λβ
e−(log λ)s dρ(s)
0
= λβ ρˆ(log λ). When β = 2, (9.172) shows that µ satisfies (9.156). Corollary 9.6.5 Let h be any function that is regularly varying at infinity with positive index or is slowly varying at infinity and increasing. Then, for any 1 < β < 2, there exists a L´evy process for which σ 2 (x) in (9.162) is concave and satisfies σ 2 (x) ∼ |x|β−1 h(log 1/|x|) If, in addition,
0
1
as
x → 0.
dx < ∞, h(x)
(9.174)
(9.175)
the above statement is also valid when β = 2. Let h(x) = xp L(x), where p > 0 and L is slowly varying at sp infinity. Take ρ(s) ∼ , as s → 0+, in Lemma 9.6.4. L(1/s)Γ(1 + 1/p) (Since p > 0, we can take ρ(s) to be increasing.) By Theorem 14.7.6, ρˆ(x) ∼ 1/h(x) at infinity. Consider the stable mixture with characteristic exponent given by (9.170). By Lemma 7.4.9, Proof
σ 2 (x) ∼ Cβ |x|β−1
1 ρˆ(log 1/|x|)
as
x → 0.
(9.176)
9.7 Local times for certain Markov chains
441
This gives (9.174). If h is slowly varying at infinity and increasing and is equal to L(x), we take ρ(s) = 1/L(1/s) at 0, which is increasing, and the same proof suffices. When β = 2 we must make sure that (9.172) holds or, equivalently, that 1 ρ(s) ds < ∞. (9.177) s2 0 This in turn is equivalent to
∞
ρˆ(s) ds < ∞
(9.178)
1
from which we get (9.175).
9.7 Local times for certain Markov chains In Section 9.1 we show that when the Gaussian process associated with a strongly symmetric Borel right process X has a bounded discontinuity at some point, the local time of X has a bounded discontinuity at that point. Naturally we would like to give concrete examples of such processes. To do so we must leave the realm of L´evy processes, which have been our source of examples up to this point. All the various killed L´evy processes that we can think of have associated Gaussian process that have stationary increments or are minor variations of such Gaussian processes. It follows from Theorem 5.3.10 that such processes either have continuous paths almost surely on all open subsets of their domain or else are unbounded almost surely on all open subsets of their domain. To find examples of local times with bounded discontinuities we turn to a certain class of Markov chains with a single instantaneous state, which are variations of an example of Kolmogorov (1951). (An element x ∈ S is an instantaneous state for a Markov chain Xt with state space S, if inf{t | Xt = x} = 0, P x almost surely.) Focusing on Markov chains gives us an opportunity to discuss the local time of a Borel right process with a countable state space. Theorem 3.6.3, which establishes the existence of local times for Borel right processes X with continuous α-potential densities, is formulated with respect to approximate δ-functions with respect to the reference measure m of X at a point, say y, in the state space of X. When y is an isolated point in the state space of X, we take the δ-function at y to be a unit point mass at y divided by m({y}). Then the local time at y
Sample path properties of local times
442
(see (3.92)) is simply the amount of time that X spends at y divided by m({y}). We use Theorem 4.1.3 to define the Markov chains that we consider in this section by their α-potentials. The state space of the chains is the sequence S = { 12 , 13 , . . . , n1 , . . . , 0} with the topology inherited from the real line. Clearly S is a compact metric space with one limit point. Let ∞ {qn }∞ n=2 and {rn }n=2 be strictly positive real numbers such that ∞ qn <∞ r n=2 n
and
lim qn = ∞.
n→∞
(9.179)
We define a finite measure m on S by m(1/n) = qn /rn and m(0) = 1. We also define an α-potential {U α , α > 0} on C(S) in terms of its density uα (x, y), x, y ∈ S with respect to m. That is, for all f ∈ Cb (S), (9.180) U α f (x) = uα (x, y)f (y)m(dy), S
where uα (0, 0)
uα (0, 1/i) uα (1/i, 1/j)
=
1 , ∞ αqj α+ j=2 α + rj
(9.181)
ri , α + ri rj rj ri + uα (0, 0) = δij . qj (α + rj ) α + ri α + rj = uα (1/i, 0) = uα (0, 0)
It is clear that uα (x, y) is symmetric and continuous on S. Furthermore, one can check that U α : C(S) → C(S) and that αU α 1 αU α
U − U + (α − β)U U α
β
α
α
β
lim αU f (x)
α→∞
=
1,
(9.182)
≤ 1, =
0,
= f (x)
∀x ∈ S.
It is easy to verify the first equation in (9.182), for uα (0, · ). Then, using the fact that uα (1/j, 1/k) = δj,k
uα (0, 1/k) uα (0, 1/j)uα (0, 1/k) + , qk uα (0, 0) uα (0, 0)
(9.183)
it is easy to verify it for uα (1/j, · ), j = 2, . . .. The second equation in (9.182) follows immediately from the proof of the first one. For the third
9.7 Local times for certain Markov chains equation it suffices to show that
u (x, y) − u (x, y) = (β − α) α
443
β
uα (x, z)uβ (z, y)m(dz)
(9.184)
S
for all x, y ∈ S. This is easy to do for x = y = 0. Then, using (9.183) one can show it for all x, y ∈ S. To obtain the limit in the last equation in (9.182) we use the Dominated Convergence Theorem. It follows from Theorem 4.1.3 that {U α , α > 0} is the resolvent of a Feller process X with state space S. By Theorem 3.6.3, X has local 1/n times, which we denote by L = {Lt , (t, 1/n) ∈ R+ × S}. We now examine the behavior of the local time in the neighborhood of L0t . Note that by Example 3.10.5, under P 0 , L0∞ , the total amount of time that X spends at 0 is an exponential random variable with mean uα (0, 0). Theorem 9.7.1 Let X be a Markov chain as defined above, with state 1/n space S and α = 1. Let L = {Lt , (t, 1/n) ∈ R+ × S} denote the local times of X. Let 1/2 2 log n , (9.185) β = β({qn }) = lim sup qn∗ n→∞ ∞ where {qn∗ }∞ n=2 is a nondecreasing rearrangement of {qn }n=2 . Then 1/n
(2L0t )1/2 β ≤ lim sup Lt n→∞
− L0t ≤
β2 + (2L0t )1/2 β 2
(9.186)
for all t ≥ 0 almost surely. 1/n When β = 0, {Lt }∞ n=2 is continuous at 0 and 1/n
|L − L0t | 0 1/2 lim sup t 1/2 = 2(Lt ) n→∞ log n qn
for almost all
t≥0
a.s. (9.187)
When β = ∞ and {qn }∞ n=2 is nondecreasing, 1/k
lim sup sup n→∞ 1≤k≤n
Lt
log n qn
0 1/2 1/2 ≥ 2(Lt )
(9.188)
for all t ≥ 0 almost surely. To prove the theorem we first examine the behavior at 0 of the Gaussian process associated with X. The following lemma is used to study this process.
Sample path properties of local times
444
Lemma 9.7.2 Let {ξn }∞ n=1 be a sequence of mean zero normal random ∞ variables. Let {an }∞ n=1 and {vn }n=1 be strictly positive real numbers such that ∞ vn <∞ and lim vn = ∞. (9.189) n→∞ a n=1 n Suppose that Eξn2 =
1 vn
and
Eξn ξm =
1 an am
n = m.
(9.190)
∞ Let {vn∗ }∞ n=1 be a nondecreasing rearrangement of {vn }n=1 . Then 1/2 2 log n lim sup ξn = lim sup a.s. (9.191) vn∗ n→∞ n→∞
and lim sup n→∞
|ξn | 2 log n vn
1/2 = 1
a.s.
Furthermore, if vn is nondecreasing and 1/2 2 log n = ∞, lim sup vn n→∞
(9.192)
(9.193)
then, for any integer m and 0 < < 10−2 , there exists an n0 = n0 (m, ) such that, for all n ≥ n0 , 1/2 2 log n . (9.194) median sup ξk ≥ (1 − ) vn m≤k≤n Proof Let n∗ be a rearrangement of n such that {vn∗ }∞ n∗ =1 is nondecreasing. Clearly the sum in (9.189) remains unchanged if we sum on n∗ rather than on n. To simplify the notation, we take {vn }∞ n=1 to be be independent normal random variables nondecreasing. Let {ηn }∞ n=0 with mean 0 and variance 1. By (9.189), we can choose N sufficiently large such that (1/vn − 1/a2n ) > 0 for all n ≥ N . For n ≥ N define 1/2 1 1 1 η˜n = − 2 ηn + η0 . (9.195) vn an an ∞ We see by (10.12) that {˜ ηn }∞ n=N and {ξn }n=N are equivalent Gaussian sequences. Therefore, with bn := (1/vn − 1/a2n )1/2 , we have law
lim sup ξn = lim sup η˜n = lim sup bn ηn . n→∞
n→∞
n→∞
(9.196)
9.7 Local times for certain Markov chains
445
The last equality follows since (9.189) implies that lim 1/an = 0. n→∞ Using P ( sup
N ≤n≤N1
N1
bn ηn > λ) = 1 −
(1 − P (bn ηn > λ))
(9.197)
n=N
and (5.20) without the absolute value, along with the fact that, by (9.189), limn→∞ bn = 0, we see that lim sup ξn = λ∗ ,
(9.198)
n→∞
where
∗
λ = inf
λ:
∞
−λ2 /(2b2n )
e
<∞ .
(9.199)
n=1
Furthermore, we see by (9.189) that λ∗ does not change if we replace 1/b2n in (10.21) by vn for n ≥ 1. To complete the proof we need only show that 1/2 ∞ 2 log n −λ2 vn /2 = inf λ: e <∞ . (9.200) lim sup vn n→∞ n=1 Let δ = lim sup n→∞
2 log n vn
(9.201)
and assume that δ < ∞. Then, given ε > 0, there exists an N (ε) such that, for all n ≥ N (ε), vn δ ≥ (1 − ε/2) (2 log n)
(9.202)
and there exists an infinite subsequence {nj } of {n} such that vnj δ ≤ (1 + ε/2). 2 log nj
(9.203)
It follows from (9.202) that λ∗ ≤ ((1 + ε)δ)1/2 . To obtain a lower bound for λ∗ we recall the well-known fact that if {cn }∞ n=1 is a nonincreasing ∞ sequence of real numbers such that n=1 cn < ∞, then cn = o(1/n). If λ = ((1 − ε)δ)1/2 , it follows from (9.203) that e−λ
2
vnj /2
≥
1 1−ε/2 nj
,
(9.204)
which implies that λ∗ ≥ ((1 − ε)δ)1/2 . Since ε can be made arbitrarily small, we get (9.200) and hence (9.191) when it is finite.
Sample path properties of local times
446
If δ = ∞, for all λ > 0 there exists an infinite subsequence nj such that vn j 1 (9.205) ≤ . 2 log nj λ Thus e−λ
2
vnj /2
≥
1 , nj
(9.206)
which implies that λ∗ ≥ λ, and since this is true for all λ, we get λ∗ = ∞. Thus we have established (9.191) in this case also. It follows from Theorem 5.1.4 that lim sup n→∞
bn ηn =1 (2 b2n log n)1/2
a.s.
(9.207)
Thus (9.192) follows from (9.196). (It is easy to see that (9.192) holds for both {ξn } and {|ξn |}.) To show that (9.193) implies (9.194), we note that by (9.189) we can choose an m1 ≥ m such that vk
(9.208) sup 2 < . a 2 k≥m1 k Then, by (9.195) and the comment immediately following it, we see that
1/2 2 log n (9.209) P sup ξk ≥ (1 − ) vn m1 ≤k≤n 1/2 ≥P sup ηk ≥ (1 − )(2 log n) + |η0 | m1 ≤k≤n
for some > 0. Since {ηk } are independent normal random variables with mean 0 and variance 1, it is easy to check (see (9.197)) that the last term in (9.209) goes to 1 as n goes to infinity. This gives us (9.194).
Proof of Theorem 9.7.1 By Theorem 9.1.3 we obtain (9.186) once we show that 2β is the oscillation function of a mean zero Gaussian process {G(x), x ∈ S} with covariance u1 (x, y) given in (9.181). Consider ξn = G(1/n) − G(0)
n = 2, 3, . . .
(9.210)
Then for n ≥ 2 we have Eξn2
= u1 (1/n, 1/n) − 2u1 (1/n, 0) + u1 (0, 0) 2 rn rn = + u1 (0, 0) 1 − qn (1 + rn ) 1 + rn
(9.211)
9.8 Rate of growth of unbounded local times =
447
1 + α(n) , qn
where lim α(n) = 0. Also, for n = m, n, m ≥ 2, n→∞
Eξn ξm
rn = u (0, 0) 1 − 1 + rn u1 (0, 0) . = (1 + rn )(1 + rm )
1
rm 1− 1 + rm
(9.212)
For n ≥ 2, set qn 1 + α(n) 1 + rn . an = 1 (u (0, 0))1/2 vn =
(9.213)
We see that (9.189) is satisfied. It now follows from Lemma 9.7.2 that the oscillation function of G(x) at x = 0 is twice the right-hand side of (9.191), that is, it is equal to 2β. This gives us (9.186). To obtain (9.187) we note that it follows from (9.192) that lim sup n→∞
|G(1/n) − G(0)| 1/2 = 1 2(1 + α(n)) log n qn
a.s.
(9.214)
This is equivalent to lim
sup
δ→0 d(1/n,0)≤δ n≥2
|G(1/n) − G(0)| =1 (2d(1/n, 0) log n)1/2
a.s
(9.215)
(d := dG is defined in (6.1)). It now follows from Theorem 9.5.1 that √ |Lt − L0t | = 2(L0t )1/2 for almost all t ≥ 0 a.s., 1/2 δ→0 d(1/n,0)≤δ (2d(1/n, 0) log n) 1/n
lim
sup
n≥2
(9.216) which is equivalent to (9.187). Finally, (9.188) is an immediate application of Theorem 9.8.1 in the next section, (9.194), and Corollary 5.4.5.
9.8 Rate of growth of unbounded local times Suppose that the local time of X in Theorem 9.2.1 has an unbounded discontinuity at a point, say y0 , in its state space. We now examine
Sample path properties of local times
448
the rate of growth of the local time in the neighborhood of y0 . In the notation of Theorem 9.2.1, let G := G1 and set d(x, y) = (E(G(x) − G(y))2 )1/2 = (u1 (x, x) + u1 (y, y) − 2u1 (x, y))1/2 . (9.217) We make the following assumptions about u1 (x,y) or, equivalently, about G. Let Y ⊂ S be countable and let y0 ∈ Y . Assume that y0 is a limit point of Y with respect to d and G(y) < ∞
sup
a.s.
∀δ > 0
(9.218)
y∈Y
d(y,y0 )≥δ
and G(y) = ∞
sup
lim
δ→0
a.s.
(9.219)
y∈Y
d(y,y0 )≥δ
Let
a(δ) = E
G(y)
sup
(9.220)
y∈Y
d(y,y0 )≥δ
and note that by (9.219), limδ→0 a(δ) = ∞. Theorem 9.8.1 Let X and G be associated processes as described above, so that, in particular, (9.218) and (9.219) are satisfied on a countable subset Y of S. Let L = {Lyt , (t, y) ∈ R+ × S} be the local time of X. Then lim sup
sup
δ→0
y∈Y
Lyt ≥ (2Lyt 0 )1/2 a(δ)
∀ t ∈ R+
a.s.
(9.221)
d(y,y0 )≥δ
and lim sup
sup
δ→0
y∈Y
Lyt 2 a (δ)
≤1
∀ t ∈ R+
a.s.,
(9.222)
d(y,y0 )≥δ
where a(δ) is given in (9.220). Obviously there is a big gap between (9.221) and (9.222). This is caused by the same technical difficulty that gives the β 2 term in (9.12). We conjecture that the lower bound is the correct limit. which is X killed To prove this theorem we use Theorem 8.1.1 on X, at the end of an independent exponential time with mean one. The 0 is u1 (x, y). For convenience we denote y0 by 0. potential density of X Let h(y) = u1 (y, 0)/u1 (0, 0).
9.8 Rate of growth of unbounded local times
449
Set η(y) = G(y) − h(y)G(0). Let Y ⊂ Y be finite. Let
1/2 σ = sup η 2 (y) y∈Y
and
a = med
sup η(y) .
y∈Y
(9.223)
(9.224)
(9.225)
Set h1 = miny∈Y h(y). Assume that h1 > 0 and note that maxy∈Y h(y) ≤ 1. In preparation for the use of Theorem 8.1.1, we consider the probability distribution of A(y, G(0), s) := 12 (G(y) + s)2 − h2 (y)(G(0) + s)2 . (9.226) Let
(1 − h(y))s B(y, G(0) + s, s) = (1 − h(y))s h(y)(G(0) + s) + . 2 (9.227) Writing G(y) + s = η(y) + h(y)(G(0) + s) + (1 − h(y)),
(9.228)
we see that A(y, G(0), s) (9.229) 2 η (y) + η(y) (h(y)(G(0) + s) + (1 − h(y))s) + B(y, G(0) + s, s) = 2 η 2 (y) = + η(y) (h(y)G(0) + s) + B(y, G(0) + s, s). 2 Note that η = {η(y), y ∈ Y } and G(0) are independent and that, for any y ∈ Y , (1 − h1 )s (9.230) B(y, G(0) + s, s) ≤ (1 − h1 )s |G(0) + s| + 2 1 , |G(0) + s|, s). := B(h Lemma 9.8.2 Let t > 0 and assume that a − σt > 0. Then (a − σt)2 Pη sup A(y, G(0), s) > 2 y∈Y
(9.231)
Sample path properties of local times
450
1 , |G(0) + s|, s) + (a − σt) (h1 |G(0) + s| − (1 − h1 )s) − B(h ≥ 2Φ(t) − 1
and
Pη
sup A(y, G(0), s) <
y∈Y
(a + σt)2 2
(9.232)
+ (a + σt) (|G(0) + s| + (1 − h1 )s) + B(h1 , |G(0) + s|, s) ≥ 2Φ(t) − 1,
where Φ(t) is given in (5.16). Proof We first obtain (9.231). One sees from (5.150) that, on a set of measure greater than or equal to (2Φ(t) − 1), both supy∈Y η(y) > (a − σt) and supy∈Y −η(y) > (a − σt). Therefore, whatever the sign of (h(y)G(0) + s), sup η(y)(h(y)G(0) + s) ≥ inf (a − σt)|h(y)G(0) + s| y∈Y
y∈Y
(9.233)
= inf (a − σt)|h(y)(G(0) + s) + (1 − h(y))s| y∈Y
≥ (a − σt) (h1 |G(0) + s| − |(1 − h1 )s|) on a set of measure greater than or equal to (2Φ(t) − 1). Using this and (9.229) we get (9.231). The inequality in (9.232) is less subtle than (9.231) because we can replace η(z)(h(y)G(0) + s) by |η(z)(h(y)G(0) + s)| in taking the upper bound. Lemma 9.8.3 For the event (a − σt)2 C = sup A(y, G(0), s) ≤ 2 y∈Y
(9.234)
+ (a − σt)(h1 |G(0) + s| − (1 − h1 )s) − B(h1 , |G(0) + s|, s)
we have G(x) E 1+ 1C s
u1 (x, x) ≤ 1+ s2 := H(x, s, t).
1/2 (2(1 − Φ(t)) (9.235)
Proof
By the Schwarz inequality, 1/2 G(x) u1 (x, x) E 1+ 1C ≤ 1+ E(1 ) . C s s2
(9.236)
9.8 Rate of growth of unbounded local times
451
Since η and G(0) are independent, E(1C ) = EG(0) Eη (1C ). We see from (9.231) that Eη (1C c ) ≥ 2Φ(t) − 1.
(9.237)
Thus E(1C c ) ≥ 2Φ(t) − 1 and hence E(1C ) ≤ 2(1 − Φ(t)). Using this in (9.236) we get (9.235). Proof of Theorem 9.8.1 We use (8.2) with F (Gxi + s)2 /2 = 1C . By (9.235), the right-hand side of (8.2) is bounded by H(x, s, t). The left-hand side of (8.2) replaces (Gxi + s)2 /2 by (Gxi + s)2 /2 + Lx∞i and |G(0) + s| by (2((G(0) + s)2 /2 + L0∞ ))1/2 , which gives us
x y 2 0 P PG(0) Pη sup L∞ − h (y)L∞ + A(y, G(0), s) ≤ W ≤ H(x, s, t), y∈Y
where
|G(0) + s|2 (a − σt)2 √ (1 − h1 ) 0 L∞ + W = + 2h1 (a − σt) − √ 2 2 2h1 2 0 1/2 (9.238) − B(h1 , (2((G(0) + s) /2 + L∞ )) , s).
Obviously, x
P PG(0) Pη
sup Ly∞ − h2 (y)L0∞ + sup A(y, G(0), s) ≤ W
y∈Y
y∈Y
≤ H(x, s, t). Now note that for any random variables U, V, W, Z, P (U + V ≤ W )
(9.239)
= P (U + V ≤ W, Z ≤ V ) + P (U + V ≤ W, Z > V ) ≤ P (U + Z ≤ W ) + P (Z > V ). We apply this with U = supy∈Y Ly∞ − h21 L0∞ , (a + σt)2 1 , |G(0)+s|, s), +(a+σt) (|G(0) + s| + (1 − h1 )s)+ B(h 2 W as in (9.238), Z = supy∈Y A(y, G(0), s).
V =
Using (9.232) to handle the last term in (9.239), we obtain
x P PG(0) Pη sup Ly∞ − h2 (y)L0∞ y∈Y
(9.240)
452
Sample path properties of local times (a − σt)2 √ |G(0) + s|2 (1 − h1 )s 0 ≤ L∞ + + 2h1 (a − σt) − √ 2 2 2h1 1 , (2((G(0) + s)2 /2 + L0∞ ))1/2 , s) − B(h −
(a + σt)2 − (a + σt)(|G(0) + s| + (1 − h1 )s) 2
1 , |G(0) + s|, s) − B(h
≤ H(x, s, t) + 2(1 − Φ(t)).
Note that 1 , (2((G(0) + s)2 /2 + L0∞ ))1/2 , s) B(h (1 − h1 )2 s2 L0∞ + |G(0) + s| ≤ + 2(1 − h1 )s 2 := D(h1 , s, L0∞ , |G(0) + s|, s). (9.241) Using this in (9.240) we see that √ x P PG(0) sup Ly∞ − h21 L0∞ ≤ (a − σt) 2h1 L0∞ − (a + σt) y∈Y
|G(0) + s| − 2a(σt + (1 − h1 )s) − 2D(h1 , s,
L0∞ , |G(0) + s|, s)
≤ H(x, s, t) + 2(1 − Φ(t)).
(9.242)
Let 0 < < 1/2 and take s = . Then choose t so that the right-hand side of (9.242) is less than . We then note the following key point: We can choose a δ = δ( ) > 0 and Y ⊂ {d(y, 0) ≥ δ} such that, for all a(δ) sufficiently large, a − σt
≥ (1 − )a(δ),
a + σt
≤ (1 + )a(δ),
(9.243)
and h1 ≥ 1 − .
(9.244)
To accomplish this we first choose δ1 so that σ1 := (E supy∈S,d(y,0)≤δ1 η 2 (y))1/2 satisfies σ1 t < and (9.244) holds. We then, at the same time, choose δ > 0 sufficiently small and Y ⊂ {y|δ ≤ d(y, 0) ≤ δ1 } with enough elements so that a is sufficiently close to a(δ) and large enough so that the inequalities in (9.243) hold as stated. (We also use (5.183).) Using these inequalities in (9.242) we see that x PG(0) P sup Ly∞ − h21 L0∞ ≤ a(δ)(1 − )2 2L0∞ − a(δ)(1 + ) (9.245) y∈Y
9.8 Rate of growth of unbounded local times 453 |G(0) + | − 4 a(δ) − 2 + 4
L0∞ + |G(0) + | ≤ . PG(0) Px is a product measure on (R1 × Ω). Let us denote the elements of this space by (v, ω). We see from (9.245) that the set D(v, ω) := sup Ly∞ − h1 L0∞ > a(δ)(1 − )2 2L0∞ (9.246) y∈Y
− a(δ)(1 + )|G(0) + | − 4 a(δ) − 2 + 4
L0∞ + |G(0) + | − . has PG(0) Px measure greater than or equal to Px (Ω) Let p(x) denote the probability density function of |G(0)|. Let b = b( ) be such that b() p(x) dx = 1/2 . (9.247) 0
We claim that there exists a v ≤ b such that − 1/2 , ID (v, ω)Px (dω) ≥ Px (Ω) Ω
(9.248)
since, if not, by (9.247) we have ∞ ID (v, ω)Px (dω)p(v) dv (9.249) Ω 0 b ∞ = ID (v, ω)Px (dω)p(v) dv + ID (v, ω)Px (dω)p(v) dv Ω Ω 0 b − 1/2 ) + (1 − 1/2 )Px (Ω) = Px (Ω) − , < 1/2 (Px (Ω) which contradicts the fact that the set D has PG(z0 ) Px measure greater − . Therefore, than or equal to Px (Ω) Px sup Ly∞ − h21 L0∞ ≤ a(δ) (1 − )2 2L0∞ − (1 + )|b( ) + | y∈Y
−4 − 2 + 4
L0∞ + |b( ) + | ≤ 1/2 . (9.250)
Clearly, this holds for Y = {d(y, 0) ≥ δ = δ( )}, so that x sup Ly∞ − h21 L0∞ ≤ a(δ) (1 − )2 2L0∞ − (1 + )|b( ) + | P y∈Y
d(y,0)≥δ
−4 − 2 + 4
L0∞ + |b( ) + | ≤ 1/2 . (9.251)
Sample path properties of local times
454
Taking n = n−4 , δn = δ( n ), we see from the Borel–Cantelli Lemma that sup Ly∞ − h21 L0∞ ≤ a(δn ) (1 − n )2 2L0∞ − (1 + n )|b( n ) + n | y∈Y
d(y,0)≥δn
L0∞ + |b( n ) + n | ∀ n ≥ n(ω) − 4 n − 2n + 4 n
a.s.
Furthermore, since limδ→0 a(δ) = ∞ and limn→∞ b( n ) = 0, we see that Ly∞ a.s. (9.252) lim sup sup ≥ (2L0∞ )1/2 a(δ) y∈Y δ→0 d(y,0)≥δ
The almost sure in (9.252) is with respect to Px = P x Pλ , where P x is the probability for X and Pλ refers to an independent exponential random variable with mean one. Thus we can write (9.252) as ∞ y Lt ≥ (2L0t )1/2 e−t dt = 1. P x lim sup sup (9.253) a(δ) y∈Y δ→0 0 d(y,0)≥δ
This implies that lim sup sup δ→0
y∈Y
Lyt ≥ (2L0t )1/2 a(δ)
a.s.
(9.254)
d(y,0)≥δ
for almost all t ∈ R+ . Using the monotonicity of L·t and the continuity of L0t we get (9.221). The proof of (9.222) is much simpler; we leave it to the reader. Remark 9.8.4 It is clear that Theorem 9.8.1 implies Theorem 9.2.1, although the proof of Theorem 9.2.1 is much simpler. We could have used Theorem 9.2.1 to get a lower bound for the rate of growth of L·τ (t) , but it seems more natural to give it for L·t . Also note that the proof of Theorem 9.8.1 also goes through when limδ→0 a(δ) < ∞. In this case we get the lower bound in (9.12), although, obviously, the proof of (9.12) is much simpler. 9.9 Notes and references This chapter presents all the results of Marcus and Rosen (1992d) and (1992a). The proofs are simplified in Sections 9.1 and 9.3 by using Eisenbaum’s Isomorphism Theorem (Theorem 8.1.1), instead of Theorem 8.1.3, which was used in the original proofs. The simplification is
9.9 Notes and references
455
considerable in Section 9.2, in which the proof of Theorem 9.2.1, using the Generalized Second Ray–Knight Theorem, is much easier than the proof of this result in Marcus and Rosen (1992d). On the other hand, the more precise Theorem 9.8.1 is similar to the proof (of our Theorem 9.2.1) given in Marcus and Rosen (1992d), except that, once again, Theorem 8.1.1 is used in place of Theorem 8.1.3. Theorem 9.8.1 is not explicitly stated in Marcus and Rosen (1992d), although some remarks in Marcus and Rosen (1992d) about how it is obtained are given in the discussion of the Markov chains considered in our Section 9.7. Theorem 9.8.1 is given in Marcus (1991), which contains examples of how it can be applied to L´evy processes with unbounded local times. The necessary and sufficient condition for the continuity of local times of L´evy processes by Barlow and Hawkes, mentioned in the Introduction, that does not require that the L´evy processes are symmetric, is precisely our Corollary 9.4.8 with ψ(λ) replaced by Re (ψ(λ)). In Remark 9.5.8 we point out that we can only obtain Theorems 9.5.1– 9.5.6 for almost all t. In Barlow (1988), he obtains many of these limit laws for L´evy processes that hold for all t ∈ R+ . Stable mixtures are introduced in Marcus and Rosen (1993) and are considered again in Marcus and Rosen (1999). Several of the results on moduli of continuity of local times in Section 9.5 have not appeared elsewhere. The Markov chains considered in Section 9.7.1 are symmetrized versions of an extension by Reuter (1969) of Kolmogorov’s example. Kolmogorov’s example is treated extensively in Chung (1967), pages 278– 283, where one can find a lucid description of the sample paths of the Markov chains that we are considering. These processes have Q matrix 0 −∞ 1/2 r2 1/3 r3 1/4 r4 .. . . ..
q2 −r2 0 0
q3 0 −r3 0
q4 0 0 −r4
···
..
.
.
Walsh (1978) gives an example of a diffusion with a discontinuous local time in which the 1-potential density is discontinuous. Also, the process he considers is not symmetric. If one changes the speed measure to obtain a symmetric process, then both the 1-potential and the local time become continuous.
10 p-variation of Gaussian processes and local times
10.1 Quadratic variation of Brownian motion Interest in the p-variation of stochastic processes was initiated, no doubt, by the elegant result of L´evy (1940) on the quadratic, or 2-variation, of standard Brownian motion {B(t), t ∈ R+ }, lim
n→∞
n 2 −1
B
i=0
i 2n
−B
i+1 2n
2 =1
a.s.
(10.1)
Generalizations of this result lead to complications. Let π = {0 = x0 < x1 · · · < xkπ = a} denote a partition of [0, a], and let m(π) = sup1≤i≤kπ (xi −xi−1 ) denote the length of the largest interval in π. (m(π) is called the mesh of π.) Dudley (1973) showed that, forany sequence {π(n)} of partitions of [0, a] such that m(π(n)) = o log1 n , lim
n→∞
(B(xi ) − B(xi−1 ))2 = a
a.s.
(10.2)
xi ∈π(n)
(To clarify the notation in (10.2) and in all that follows, note that in the expression xi ∈π f (xi−1 , xi ), for some function f and partition π, we mean that the sum is taken over all the terms in which both xi−1 and xi are contained in π.) de la Vega (1973) showed that (10.2) holds if the condition no longer on m(π(n)) is relaxed to m(π(n)) = O log1 n . (In fact, he showed more, as we point out below.) It is natural to ask what happens if one considers taking the limit over all partitions. Let Qa (δ) = {partitions π of [0, a] | m(π) ≤ δ}. Taylor (1972) showed that ψ(|B(xi ) − B(xi−1 )|) = a a.s., (10.3) lim sup δ→0 π∈Qa (δ)
xi ∈π
456
10.2 p-variation of Gaussian processes 457 " where ψ(x) = |x/ 2 log+ log 1/x |2 and (log+ u ≡ 1 ∨ log u). It may be helpful to give a heuristic explanation of where the function ψ comes from. By (2.14) we may say that, for |xi − xi−1 | small, |B(xi ) − B(xi−1 )| ∼ (2|xi − xi−1 | log log(1/|xi − xi−1 |)1/2 .
(10.4)
Note that ψ(x) is, effectively, the inverse of (2|x| log log(1/|x|)1/2 . Thus, approximating |B(xi ) − B(xi−1 )| as in (10.4), we see that the sum on the left-hand side of (10.3) is just xi ∈π (xi − xi−1 ) = a. All these results are generally referred to as results about the quadratic or 2-variation of Brownian motion. It is a short step from results about the quadratic variation of Brownian motion to results about the quadratic variation of squares of Brownian motion. Then, employing techniques we have used repeatedly, we can obtain results about the quadratic variation of the local times of Brownian motion in the spatial variable. Furthermore, as in the preceding chapters, we can approach this question in much greater generality. This is done in Marcus and Rosen (1993). In this chapter, for technical reasons we discuss in Section 10.6, we restrict ourselves to studying the variation of local times of symmetric stable processes of index 1 < β ≤ 2 in the spatial variable. It turns out that the appropriate power to use to get nice limit theorems like (10.2) is 2/(β − 1). We refer to results like (10.2) about the variation of stochastic processes, with 2 replaced by p, as results about the p-variation of these processes. Our results on the p-variation of local times of symmetric stable processes follow from Theorem 8.1.1. In order to use it we need to know about the p-variation of squares of the associated Gaussian processes. As in the case of Brownian motion, these follow easily from results about the p-variation of the Gaussian processes themselves. We take this up in the next section.
10.2 p-variation of Gaussian processes We begin with some technical lemmas. Let B = (Bij )ni,j=1 be an n × n positive definite symmetric matrix and let B denote the operator norm of B as an operator from n2 → n2 , that is, B := sup Bx2 , x2 ≤1 x∈ n 2
where x2 denotes the norm of x in n2 .
(10.5)
p-variation
458
Lemma 10.2.1 Let B be as above and suppose that n
|Bij | ≤ C
∀ 1 ≤ i ≤ n.
(10.6)
j=1
Then B ≤ C. Proof Let x = (x1 , . . . , xn ). Then, using the fact that B is a symmetric matrix, we have 2 n n Bx22 = Bij xj (10.7) i=1
≤
n i=1
≤
n
i=1
≤ C
j=1 n j=1 n
|Bij |1/2
j=1
n n
j=1
|Bij | |xj |2
n n j=1
2 |Bij |1/2 |xj |
n |Bij | |Bij | |xj |2
i=1 j=1
= C
|Bij |
|xj |2 ≤ C 2 x22 .
i=1
Let {ak } ∈ p and {bk } ∈ q , where 1/p + 1/q = 1. It is well known that ∞ {ak }p = sup bk ak . (10.8) {bk }q ≤1 k=1
Using (10.8), we see that the operator norm of B defined in (10.5) can be written as B =
sup
n
{aj }2 ≤1,{bk }2 ≤1 j,k=1
aj bk Bj,k .
(10.9)
We now consider a slightly broader definition of partition than the one given in Section 10.1. We let π = {b0 = x0 < x1 · · · < xkπ = b1 } denote a partition of [b0 , b1 ], with the understanding that b0 and b1 can be different for the different partitions considered. For G = {G(x), x ∈
10.2 p-variation of Gaussian processes
459
R1 } a real-valued Gaussian process, we associate with a partition π the covariance matrix ρij (π) = E(G(xi ) − G(xi−1 ))(G(xj ) − G(xj−1 ))
i, j = 1, . . . , kπ . (10.10)
We denote the median of a real-valued random variable Z by med(Z). Lemma 10.2.2 Let G = {G(x), x ∈ R1 } be a real-valued Gaussian ∞ process and let {π(m)}∞ m=1 be partitions of {[b0 (m), b1 (m)]}m=1 . For p > 1 define 1/p |G(xi ) − G(xi−1 )|p . (10.11) |||G|||π(m),p = xi ∈π(m)
Then P
|||G||| − med sup |||G||| sup
π(m),p π(m),p m
m
2 2
> t ≤ 2e−t /(2ˆσ ) ,
(10.12)
where kπ(m)
σ = sup 2
m {{ak }:
sup
ai aj ρi,j (π(m))
(10.13)
|ak |q ≤1} i,j=1
and 1/p + 1/q = 1. If p ≥ 2, then σ 2 ≤ sup ρ(π(m)),
(10.14)
m
where ρ(π) denotes the operator norm of ρ as an operator from k2π → k2π . Also,
σ
(10.15)
E(sup |||G|||π(m),p ) − med sup |||G|||π(m),p ≤ √ . m m 2π Proof Let U = Π × Bq , where Bq is a countable dense subset of the unit ball of q and Π = {π(m)}∞ m=1 . For π(m) ∈ Π and {ai } ∈ Bq , set ai (G(xi ) − G(xi−1 )). (10.16) H(π(m), {ai }) = xi ∈π(m)
We see that
1/p
kπ(m)
sup (π(m),{ai })∈U
H(π(m), {ai }) =
sup π(m)∈Π
=
|G(xi ) − G(xi−1 )|p
i=1
sup |||G|||π(m),p . π(m)∈Π
p-variation
460 Furthermore, sup (π(m),{ai })∈U
E(H 2 (π(m), {ai }))
(10.17)
kπ(m)
=
sup
(π(m),{ai })∈U i,j=1
ai aj ρi,j (π(m)) kπ(m)
= sup m {{ai }:
sup
ai aj ρi,j (π(m)).
|ai |q ≤1} i,j=1
The statements in (10.12) and (10.13) now follow from (5.152). When p ≥ 2 we have q ≤ 2 and since, in this case, the unit ball of q is contained in the unit ball of 2 , we see that the last line of (10.17) is less than or equal to kπ(m)
sup m {{ai }:
sup
ai aj ρi,j (π(m))
(10.18)
|ai |2 ≤1} i,j=1
≤ sup ρ(π(m)) m
(see (10.9)). The statement in (10.15) follows from (5.183). Let {G(x), x ∈ R1 } be a mean zero Gaussian process with stationary increments. Recall the definition σ 2 (h) = E(G(x + h) − G(x))2 .
(10.19)
Theorem 10.2.3 Let {G(x), x ∈ R1 } be a mean zero Gaussian process with stationary increments and assume that σ 2 (h) is concave for h ∈ [0, δ], for some δ > 0, and satisfies limh→0 σ(h)/h1/p = α for some p ≥ 2 ∞ and 0 ≤ α < ∞. Let {π(n)}∞ n=1 be partitions of {[b0 (n), b1 (n)]}n=1 p/2 with [b0 (n), b1 (n)] ⊆ [0, δ] for all n such that m(π(n)) = o log1 n , limn→∞ b0 (n) = b0 , and limn→∞ b1 (n) = b1 , where b1 − b0 > 0. Then lim |G(xi ) − G(xi−1 )|p = E|η|p αp (b1 − b0 ) a.s., (10.20) n→∞
xi ∈π(n)
where η is a normal random variable with mean 0 and variance 1. Proof In order to use the concavity of σ 2 (h) on [0, δ], we initially take b1 − b0 < δ/2. We now consider Lemma 10.2.2 with the sequence of partitions {π(m)}∞ m=1 replaced by the single partition π = π(n). In this
10.2 p-variation of Gaussian processes case (10.12) and (10.14) tell us that 2 2 P ||||G|||π(n),p − med(|||G|||π(n),p )| > t ≤ 2e−t /(2ˆσn )
461
(10.21)
where σ n2 ≤ ρ(π(n)). We show below that
ρ(π(n)) = o
(10.22)
1 log n
(10.23)
as n → ∞. Assuming this, we see from (10.21), (10.22), and the Borel– Cantelli Lemma that lim (|||G|||π(n),p − med|||G|||π(n),p ) = 0
n→∞
a.s.
(10.24)
Let med(|||G|||π(n),p ) = Mn and note that (see page 224) Mn
≤ 2E(|||G|||π(n),p ) ≤ 2(E|||G|||pπ(n),p )1/p 1/p σ p (xi − xi−1 ) , = 2(E|η|p )1/p
(10.25) (10.26)
xi ∈π(n) law
where we use the fact that G(xi ) − G(xi−1 ) = σ(xi − xi−1 )η. It follows from the hypotheses on σ 2 that Mn ≤ C(E|η|p )1/p (b1 − b0 )1/p
∀n,
(10.27)
where C is an absolute constant. Choose some convergent subsequence ∞ {Mni }∞ i=1 of {Mn }n=1 and suppose that lim Mni = M .
(10.28)
i→∞
It then follows from (10.24) and (10.28), that lim |||G|||π(ni ),p = M
i→∞
a.s.
(10.29)
Let us also note that it follows from (10.12), (10.23) and (10.27) that for all r > 0 there exist finite constants C(r) such that E|||G|||rπ(n),p ≤ C(r)
∀ n ≥ 1.
(10.30)
Thus, in particular, {|||G|||pπ(n),p ; n = 1, . . .} is uniformly integrable. This, together with (10.29), shows that p
lim E|||G|||pπ(ni ),p = M .
i→∞
(10.31)
p-variation
462
Since it is obvious because of our assumption on σ 2 that lim E|||G|||pπ(n),p = (b1 − b0 )αp E|η|p ,
(10.32)
n→∞
we have p
M = (b1 − b0 )αp E|η|p .
(10.33)
Thus the bounded set {Mn }∞ n=1 has a unique limit point M . It now follows from (10.24) that lim |||G|||pπ(n),p = (b1 − b0 )αp E|η|p .
(10.34)
n→∞
We now obtain (10.23). This follows from Lemma 10.2.1 and our hypothesis on m(π(n)), once we show that |ρij (π(n))| ≤ 2σ 2 (xi − xi−1 ) ≤ 2 max σ 2 (xi − xi−1 ). (10.35) xi ∈π(n)
j
To obtain (10.35) we first note from (10.10) that ρij
= − 12 [σ 2 (xj−1 − xi−1 ) − σ 2 (xj−1 − xi )] +
1 2 2 [σ (xj
(10.36)
− xi−1 ) − σ (xj − xi )] 2
= −Aj−1,i + Aj,i , where Aj,i :=
1 2 2 [σ (xj
− xi−1 ) − σ 2 (xj − xi )].
(10.37)
Assume that j > i. Using the assumption that σ 2 is concave and monotonically increasing on [0, δ], we see that that Aj,i ≥ 0 and also that Aj−1,i ≥ Aj,i for all j ≥ i. Therefore, k
|ρij | =
j=i+1
k
(Aj−1,i − Aj,i ) = Ai,i − Ak,i ≤ Ai,i =
1 2 2 σ (xi
− xi−1 ),
j=i+1
(10.38) where k = kπ(n) is the number of partition points in π(n). When j < i, we rewrite (10.36) as ρij = Dj−1,i − Dj,i ,
(10.39)
where Dj,i = −Aj,i =
1 2 2 [σ (xi
− xj ) − σ 2 (xi−1 − xj )].
(10.40)
(For the last equality we use the fact that σ 2 (h) = σ 2 (−h) since the Gaussian process has stationary increments.) Using the monotonicity
10.2 p-variation of Gaussian processes
463
and concavity of σ 2 once more, we see that Dj,i ≥ 0 and also that Dj,i ≥ Dj−1,i for all j < i. Therefore, i−1
|ρij | =
j=1
i−1
(Dj,i −Dj−1,i ) = Di−1,i −D0,i ≤ Di−1,i =
1 2 2 σ (xi −xi−1 ),
j=1
(10.41) and, of course, ρi,i = σ 2 (xi − xi−1 ).
(10.42)
Using (10.38), (10.41), and (10.42) we obtain (10.35). Thus we get (10.20) under the assumption that b1 − b0 < δ/2. We now extend the result so that it holds for b1 − b0 = a, for any a < ∞. For clarity, for a given partition π, we write π = [0 = x0 (π) < · · · < xkπ (π) = a].
(10.43)
We divide [0, a] into m equal subintervals Ij,m (a):=[((j−1)/m)a, (j/m)a], j = 1, . . . , m, where m is chosen so that 1/m < δ/4. Using the partition points of π we define j j = 0, . . . , m. (10.44) xk(j) (π) = sup xk (π): xk (π) ≤ a m k Consider the subset of π given by the increasing sequence of points π (Ij,m (a)) = {xk(j−1) (π) < xk(j−1)+1 (π) < · · · < xk(j) (π)},
(10.45)
j = 1, . . . , m. We write xi ∈π
|G(xi ) − G(xi−1 )|p =
m
|G(xi ) − G(xi−1 )|p . (10.46)
j=1 xi ∈π(Ij,m (a))
Taking the limit as n goes to infinity and using our prior result, which holds on intervals of length less than δ/2, we get (10.20). We now show that Theorem 10.2.3 is best possible when p = 2 and also pretty strong when p > 2. Theorem 10.2.4 Let {G(x), x ∈ R1 } be as in Theorem 10.2.3. For any b > 0 we can find a sequence of partitions {π(n)}∞ n=1 with m(π(n)) ≤ b , for all n sufficiently large, such that (10.20) is false. log n Proof To simplify matters we take α = 1 and b1 − b0 = 1. That the proof is valid for all 0 < α < ∞ and b1 and b0 should be obvious from the proof in this case.
p-variation
464
For each integer q ≥ 1, let Πq be the set of those partitions of [0, 1], each of which contains for each integer k, 0 ≤ k ≤ 2q−1 − 1, either the interval Jqk = [2k/2q , (2k + 2)/2q ] or both intervals Iq2k = [2k/2q , (2k + q−1 1)/2q ] and Iq2k+1 = [(2k + 1)/2q , 2k + 2/2q ]. There are 22 partitions in Πq . One of them has mesh 2−q ; all the others have mesh 21−q . (Note that Πq ∩ Πq+1 consists of a single partition, the unique partition in Πq with mesh 2−q .) Consider the sequence of partitions {π(n)} constructed as follows: π(0) is the partition consisting of the interval [0, 1]. Then, for each q ≥ 0, set r r
Aq+1 = n 1 + 22 < n ≤ 1 + 22 . (10.47) 0≤r≤q−1
0≤r≤q
Since Aq+1 and Πq+1 have the same number of elements, we can choose a bijection π from Aq+1 onto Πq+1 . This defines π(n) for n ∈ Aq+1 . q For these values of n, m(π(n)) ≤ 2−q . Since n ≤ 21+2 for n ∈ Aq+1 , it follows that m(π(n)) ≤ 1/ log n as long as n > 10. For the Gaussian processes considered in Theorem 10.2.3 we define, for 0 ≤ k ≤ 2q−1 − 1, 2k + 1 2k 2k L(Iq ) = G −G , (10.48) 2q 2q 2k + 1 2k + 2 −G , L(Iq2k+1 ) = G 2q 2q 2k 2k + 2 L(Jqk ) = G −G , 2q 2q and Mqk = max{|L(Iq2k )|p + |L(Iq2k+1 )|p , |L(Jqk )|p }.
(10.49)
EMqk = σ p 2−q E max{|ξq |p + |ηq |p , |ξq + ηq |p },
(10.50)
We have
where ξq and ηq are normal random variables with mean zero, variance 1, and σ 2 (1/2q−1 ) Eξq ηq = − 1 − ; (10.51) 2σ 2 (1/2q ) see (10.36) with j = i + 1. We now show that lim E sup |||G|||pπ,p = (1 + cp )E|η|p
q→∞
π∈Πq
(10.52)
10.2 p-variation of Gaussian processes
465
for some cp > 0, where η is a normal random variable with mean zero and variance one. To see this note that E sup π∈Πq
|||G|||pπ,p
=
2q−1 −1
E(Mqk ),
(10.53)
k=1
and so, by (10.50) and the hypothesis on σ, lim E sup |||G|||pπ,p =
q→∞
π∈Πq
1 lim E 2 q→∞
max{|ξq |p + |ηq |p , |ξq + ηq |p }. (10.54)
To evaluate the right-hand side of (10.54), set hq = Eξq ηq and note that h := lim Eξq ηq = −(1 − 2(2/p)−1 ). q→∞
(ξq , ηq ) has covariance matrix
Aq =
1 hq
hq 1
(10.55)
.
(10.56)
Since |h| < 1, we see that for sufficiently large q, Aq is invertible and 1 1 −hq . (10.57) A−1 = q −hq 1 1 − h2q Let fq ( · , · ) be the joint density of ξq and ηq . Then, for sufficiently large q, −1 " fq (x, y) = 2π 1 − h2q exp −(x2 − hq 2xy + y 2 )/2(1 − h2q ) ≤ C exp −D(x2 + y 2 ) (10.58) for some constants C and D independent of q. (Here we again use the fact that |h| < 1.) Let g(x, y) = max{|x|p + |y|p , |x + y|p }. Then E max{|ξq |p + |ηq |p , |ξq + ηq |p } ∞ = fq (x, y) dx dy dλ 0
=
0
(10.59)
g(x,y)≥λ ∞
λ2/p
fq (λ1/p u, λ1/p v) du dv dλ. g(u,v)≥1
Note that f ( · , · ) := limq→∞ fq ( · , · ) is the joint density of normal random variables with mean zero and variance 1 and with covariance h. Clearly, f ( · , · ) is strictly positive. We use the Dominated Convergence Theorem to take the limit as q goes to infinity in (10.59) and get lim E max{|ξq |p + |ηq |p , |ξq + ηq |p }
q→∞
(10.60)
p-variation
466
∞ λ2/p f (λ1/p u, λ1/p v) dλ du dv.
= g(u,v)≥1
0
By the same reasoning we have 2E|η|p
=
lim E(|ξq |p + |ηq |p )
(10.61)
q→∞
∞
λ2/p f (λ1/p u, λ1/p v) dλ du dv.
= |u|p +|v|p ≥1
0
Because of the different areas of integration in the (u, v) plane we see that the right-hand side of (10.60) is equal to (1 + cp ) times the right-hand side of (10.61) for some cp > 0. Using this in (10.54) we get (10.52). We now show that lim sup |||G|||pπ,p = (1 + cp )E|η|p
(10.62)
q→∞ π∈Πq
for cp given in (10.52). To do this we use Lemma 10.2.2 exactly as it was used in the proof of Theorem 10.2.3, but with |||G|||π(n),p replaced by supπ∈Πq |||G|||π,p . Analogous to (10.24) we have
lim
q→∞
sup |||G|||π,p − med( sup |||G|||π,p )
π∈Πq
=0
a.s.
(10.63)
π∈Πq
because, in this case, for q fixed, σ ˆ 2 ≤ 41−q/p for all q sufficiently large (see (10.35)). )q = med supπ∈Π |||G|||π,p . By (10.50), Let M q αp Cp (10.64) 2q−1 for all q sufficiently large, where Cp is a constant depending only on p. Therefore, it follows from (10.53) that
)q ≤ 2E sup |||G|||π,p ≤ Cp M (10.65) EMqk ≤ Cp σ p (2−q ) ≤
π∈Πq
Cp ,
which is independent of q. Using this in (10.12), for some constant we see that there exist finite constants C(r, p) such that E sup |||G|||rπ,p ≤ C(r, p) π∈Πq
∀q ≥ 1.
(10.66)
Following the proof of Theorem 10.2.3, we can show that )qp . lim E sup |||G|||pπ,p = lim M
q→∞
π∈Πq
q→∞
(10.67)
10.3 Additional variational results for Gaussian processes
467
Using (10.52), (10.67), and (10.63), we get (10.62). We see from (10.62) that lim sup |||G|||pπ(n),p = (1 + cp )E|η|p .
(10.68)
n→∞
Considering the condition on (π(n)), we now have that (10.20) does not hold for a sequence of partitions {πn } for which m(πn ) ≤ 1/log n. It is easy to improve this to get that (10.20) does not hold for certain sequences of partitions {πn } for which m(πn ) ≤ b/log n for all b > 0. q−1 different partitions on [0, 2−j ] just as we did For q fixed we create 22 q j of size 2−(j+q) above on [0, 1]. To each of these we add * −j + 2 (2 −1) intervals q−1 to complete the partition on 2 , 1 . We still have 22 partitions, but −(j+q) and all the others have mesh 21−(j+q) . now one of them has mesh 2 −j Since the partitions on [2 , 1] are simply the dyadic partitions, the sum of the p-th powers of the increments of the Gaussian process converges (as q goes to infinity) to (1 − 2−j )E|η|p . However, on [0, 2−j ], as above, it converges to 2−j (1 + cp )E|η|p . Thus we still have a counterexample to (10.20), but now m(πn ) ≤ 2−j /log n. Since this holds for all j ≥ 1, we obtain Theorem 10.2.4, as stated.
10.3 Additional variational results for Gaussian processes We first consider the p-variation of squares of Gaussian processes. This may seem artificial at first thought, but it is precisely what we need to use the Eisenbaum Isomorphism Theorem to obtain results about the p-variation of local times of the Markov processes associated with these Gaussian processes. Theorem 10.3.1 Let {G(x), x ∈ R1 } be a mean zero Gaussian process with stationary increments and increments variance σ 2 as defined in (10.19). If σ 2 (h) is concave for h ∈ [0, δ], for some δ > 0, and satisfies limh→0 σ(h)/h1/p = α for some p ≥ 2 and 0 ≤ α < ∞, then, for any p/2 1 , sequence of partitions {π(n)} of [0, a] such that m(π(n)) = o log n a 2 2 p p p |G (xi ) − G (xi−1 )| = E|η| (2α) |G(x)|p dx a.s., lim n→∞
xi ∈π(n)
0
(10.69) where η is a normal random variable with mean 0 and variance 1. Furthermore, for any b > 0, there exist partitions of [0, a] with b for which (10.69) does not hold. m(π(n)) < log n
p-variation
468
Proof Using the notation introduced in the last paragraph of the proof of Theorem 10.2.3, in analogy with (10.46), we have |G2 (xi ) − G2 (xi−1 )|p (10.70) xi ∈π
=
m
|G2 (xi ) − G2 (xi−1 )|p
j=1 xi ∈π(Ij,m (a))
≤ 2p
m
|G(xi ) − G(xi−1 )|p
|G(x)|p .
sup xk(j−1) (π)≤x≤xk(j) (π)
j=1 xi ∈π(Ij,m (a))
(The clarification of notation given following (10.2) is particularly relevant in the last two lines of (10.70) as well as to some similar statements involving subpartions that are given below.) It follows from (6.148) that the Gaussian process G has continuous sample paths almost surely. Using this fact and Theorem 10.2.3, we can take the limit, as n goes to infinity, of the terms to the right of the inequality in (10.70) to obtain lim sup |G2 (xi ) − G2 (xi−1 )|p (10.71) n→∞
xi ∈π(n)
≤ E|η|p (2α)p
m a sup |G(x)|p m x∈Ij,m (a) j=1
a.s.
Taking the limit of the right-hand side of (10.71) as m goes to infinity, and using the definition of Riemann integration, we get the upper bound in (10.69). The argument that gives the lower bound is slightly more subtle. Let Bm (a) := {j|G(x) does not change sign on Ij,m (a)}. Similarly to the way we obtain (10.71) we get |G2 (xi ) − G2 (xi−1 )|p (10.72) lim inf n→∞
xi ∈π(n)
≥ E|η|p (2α)p
j∈Bm (a)
a inf |G(x)|p m x∈Ij,m (a)
a.s.
Taking the limit of the right-hand side of (10.72) as m goes to infinity, we get the lower bound in (10.69). (We know that the set of zeros of each path of G on [0, a] has measure zero. But we need not worry about this since this set, whatever its size, contributes nothing to the integral. This is because, by the uniform continuity of G, |G| is arbitrarily small on sufficiently small intervals containing its zeros.) As we remarked at the beginning of the proof of Theorem 10.2.4, the example we give also works for all 0 < α, a < ∞. Thus, if the partitions
10.3 Additional variational results for Gaussian processes
469
are imposed on [0, a] and limh→∞ σ(h)/h1/p = α in place of (10.52), we have lim sup |||G|||pπ,p = (1 + cp )αp aE|η|p .
(10.73)
q→∞ π∈Πq
By the comments made at the end of Theorem 10.2.4, there are partitions with mesh size less than b/ log n for which lim sup |||G|||pπ,p = (1 + cp,b )αp aE|η|p
a.s.
q→∞ π∈Πq
(10.74)
for some cp,b > 0, where cp,b is a constant depending on p and b. We now show that this implies that a |G(x)|p dx a.s., lim sup |||G2 |||pπ,p = (1 + cp,b )(2α)p E|η|p q→∞ π∈Πq
0
(10.75) which implies the statement in the second paragraph of this theorem. Let q and r be positive integers. For each 1 ≤ j ≤ 2r , let Πq ,j be the set of partitions of the form of Πq on the interval [(j −1)/2r , j/2r ] ≡ Ij,r (rather than on [0, 1] as we did in the proof of Theorem 10.2.4). Note that for q = q + r, Πq is the set of partitions of [0, 1] formed by putting r together one partition from each {Πq ,j }2j=1 . It follows from (10.74) that r
lim sup
q→∞ π∈Πq
|||G2 |||pπ,p
=
2 j=1
lim
sup |||G2 |||pπ,p
q→∞ π∈Π
(10.76)
q ,j
r
≤
2 j=1
lim
sup |||G|||pπ,p sup |2G(x)|p
q →∞ π∈Π
x∈Ij,r
q ,j
r
=
(1 + cp,b )(2α)p E|η|p
2
sup |G(x)|p
j=1 x∈Ij,r
1 . 2r
Passing to the limit as r → ∞, we get the upper bound in (10.75). Similarly, lim sup |||G2 |||pπ,p
(10.77)
q→∞ π∈Πq
r
≥
2 j=1
lim
sup |||G|||pπ,p inf |2G(x)|p
q →∞ π∈Π
x∈Ij,r
q ,j
r
p
p
= (1 + cp,b )(2α) E|η|
2 j=1
inf |G(x)|p
x∈Ij,r
1 . 2r
p-variation
470
Passing to the limit as r → ∞, we get the lower bound in (10.75). We now generalize (10.3) so that it holds for the Gaussian processes we are considering. Theorem 10.3.2 Let G = {G(x), x ∈ R1 } be a mean zero Gaussian process with stationary increments. If σ 2 (h) is concave for h ∈ [0, δ] for some δ > 0, and satisfies lim σ(h)/h1/p = α for some p ≥ 2 and "
p
h→0
0 ≤ α < ∞, then, for ϕ(x) = x/ 2 log+ log 1/x , lim
sup
δ→0 π∈Qa (δ)
ϕ (|G(xi ) − G(xi−1 )|) = αp a
Also, lim
sup
δ→0 π∈Qa (δ)
a.s.
(10.78)
xi ∈π
ϕ |G2 (xi ) − G2 (xi−1 )| = (2α)p
a
|G(x)|p dx
a.s.
0
xi ∈π
(10.79) Note that by Theorem 7.2.15 we can give the same heuristic explanation for the choice of ϕ as we give for the choice of ψ on page 456. The proof depends on the following slight generalization of the local iterated logarithm law for certain Gaussian processes. Lemma 10.3.3 Let G = {G(x), x ∈ R1 } be a mean zero Gaussian process with stationary increments. If lim σ(h)/h1/p = α for some p ≥ 2 h→0
and 0 ≤ α < ∞, then, for each t ∈ R1 , lim sup
δ→0 u,v∈Sδ
|G(t − u) − G(t + v)| |u + v|1/p (2 log log 1/(u + v))
1/2
≤α
a.s.,
(10.80)
where Sδ = {(u, v)|0 < u + v ≤ δ}. Proof For fixed t ∈ R1 , we consider the Gaussian process H(u, v) = G(t − u) − G(t + v). Let aδ = med sup H(u, v)
(10.81)
u,v∈Sδ
and σδ∗
1/2
=
sup E(G(t − u) − G(t + v))2
u,v∈Sδ
≤ (1 + δ )αδ 1/p ,
(10.82)
10.3 Additional variational results for Gaussian processes
471
where limδ→0 δ = 0. By Theorem 5.4.3 with δ = θn for some θ < 1 and all n sufficiently large, and the Borel–Cantelli Lemma, lim
|H(u, v) − aθn |
sup
n→∞ u,v∈S
θn
θn/p
(2 log log 1/θn )
1/2
≤α
a.s.
(10.83)
By stationarity and Lemma 7.2.2, ≤ E sup |H(u, v)|
αδ
(10.84)
u,v∈Sδ
≤ 4E sup |G(u) − G(t)| ≤ Cδ 1/p . |u−t|≤δ
Thus we see that the terms aθn are negligible in (10.83) and can be removed. Doing this and interpolating, we get lim sup
δ→0 u,v∈Sδ
|G(t − u) − G(t + v)| 1/2
δ 1/p (2 log log 1/δ)
≤α
a.s.
(10.85)
The inequality (10.80) follows from the fact that (7.53) implies (7.54) together with Lemma 7.1.7 (3). Proof of Theorem 10.3.2 Let ψ(x) denote the function that is the inverse of α|x|1/p (2 log log 1/x)1/2 on [0, h]. Note that ψ(x) ∼ ϕ(x)/αp at zero. It is easy to see that ψ σ(x)(log 1/x)1/2 ≤ x(log 1/x)p (10.86) for all x > 0 sufficiently small. For > 0 and t ∈ [0, a], set |G(t − u) − G(t + v)| ≤ (1 +
) . Aδ = (t, ω) : sup 1/2 u,v∈Sδ α|u + v|1/p (2 log log 1/(u + v)) (10.87) Note that Aδ is measurable. Let 1δ (t, ω) be the indicator function of Aδ . It follows from Lemma 10.3.3 that, for each t ∈ R1 , limδ→0 1δ (t, ω) = 1 almost surely. Hence, by Fubini’s Theorem, for almost every ω, we have limδ→0 1δ (t, ω) = 1 for almost every t ∈ [0, a]. Therefore, by the Dominated Convergence Theorem, a 1δ (t, ω) dt = a a.s. (10.88) lim δ→0
0
This means that there exists a set Ω of measure 1 in Ω such that, for any > 0, there exists δ0 = δ0 (ω, ) such that for all δ ≤ δ0 , a 1δ (t, ω) dt ≥ a(1 − ) ∀ω ∈ Ω . (10.89) 0
p-variation
472
Let π = {0 = x0 < x1 · · · < xkπ = a} denote a partition in Qδ (a). For a given path of G( · , ω), if the interval [xi−1 , xi ] contains a t such that (t, ω) ∈ Aδ , we have ψ(|G(xi , ω) − G(xi−1 , ω)|) (10.90) 1/2 1/p ≤ ψ (1 + )α|xi − xi−1 | (2 log log 1/(xi − xi−1 )) 1/2 ≤ ((1 + )p + δ )ψ α|xi − xi−1 |1/p (2 log log 1/(xi − xi−1 )) = ((1 + )p + δ )(xi − xi−1 ), where limδ→0 δ = 0. (Here we use the property that ψ is regularly varying at zero with index p.) This is clearly what we want since summing over all intervals of the partition π containing a t ∈ Aδ for this ω, and taking the limit as δ goes to zero, we get the upper bound in (10.78). Thus, for a given ω, we must show that the sum over the intervals of π that do not contain any values of t that are also in Aδ , does not contribute anything when we take the limit as δ goes to zero. Let Λ(ω) = i : there is no value of t ∈ (xi−1 , xi ) satisfying (t, ω) ∈ Aδ , (10.91) and for some constant D > p + 4 let (10.92) Λ (ω) = i : |G(xi , ω) − G(xi−1 , ω)| 1/2 > α|xi − xi−1 |1/p (2D log log 1/(xi − xi−1 )) . Let ω ∈ Ω and assume that δ is small enough so (10.89) holds for this ω. Then (xi − xi−1 ) < a . (10.93) i∈Λ(ω)
Therefore, for this ω ∈ Ω , ψ (|G(xi , ω) − G(xi−1 , ω)|) i∈Λ∩(Λ )c
≤ ≤
(10.94)
1/2 ψ α|xi − xi−1 |1/p (2D log log 1/(xi − xi−1 ))
i∈Λ
1/2 (Dp + δ )ψ α|xi − xi−1 |1/p (2 log log 1/(xi − xi−1 ))
i∈Λ
= (Dp + δ )
i∈Λ
|xi − xi−1 | < (Dp + δ )a .
10.3 Additional variational results for Gaussian processes
473
(We simplify the notation by writing Λ for Λ(ω) and similarly for Λ .) Thus the sum in (10.78) can be made arbitrarily small on Λ ∩ (Λ )c . To estimate i∈Λ∩Λ ψ (|G(xi , ω) − G(xi−1 , ω)|), we consider the random variable Zn (ω) := card j : sup |G(t, ω) − G(s, ω)| (10.95) t,s∈Jn,j
> αh1/p n (2D log log 1/hn )
1/2
,
where hn = e−n and Jn,j = [jhn /2, (j/2 + 1)hn ], 0 ≤ j ≤ 2en − 1. Let 1/2 . An,j := ω : sup |G(t, ω) − G(s, ω)| > αh1/p n (2D log log 1/hn ) t,s∈Jn,j
(10.96) By the same argument used in the proof of Lemma 10.3.3, we see that for all > 0 sufficiently small, (10.97) P (An,j ) ≤ Cn−(D−) ≤ Cn−(p+4) 2en if n = n( ) is sufficiently large. Since Zn = j=0 1An,j , we see from Chebyshev’s inequality that P Zn > en n−(p+3/2) ≤ Cn−2 (10.98) for all n = n(p) sufficiently large. Therefore, by the Borel–Cantelli Lemma, for all ω ∈ Ω , with P (Ω ) = 1 there exists an n0 (ω) such that, for all n ≥ n0 (ω), Zn (ω) ≤ en n−(p+3/2) .
(10.99)
We now order the partitions in π according to their size. Let m0 = [log 1/(2δ)]. For all m ≥ m0 set (10.100) Pm = i : hm+1 /2 ≤ xi − xi−1 < hm /2 . Note that for each i ∈ Pm there is a j for which (xi−1 , Λ(ω)) ⊂ Jm,j . We write ψ (|G(xi , ω) − G(xi−1 , ω)|) (10.101) i∈Λ∩Λ
≤
∞
ψ (|G(xi , ω) − G(xi−1,ω )|) .
m=m0 i∈Pm ∩Λ
For our purposes we can assume that δ is small enough so that for this ω, m0 ≥ n0 (ω) and also sup |s−t|≤hm /2
s,t∈[0,a]
|G(s, ω) − G(t, ω)| ≤ σ(hm /2) (2(1 + ) log 2/hm )
1/2
(10.102)
p-variation
474
for all m ≥ m0 (see (7.108)). Then, using (10.101), (10.102), (10.99), and (10.86), we have ψ (|G(xi , ω) − G(xi−1 , ω)|) (10.103) i∈Λ∩Λ ∞
≤
1/2 Zm (ω)ψ σ(hm /2) (2(1 + ) log 2/hm )
m=m0 ∞
≤ Cp ≤ Cp
m=m0 ∞
em m−(p+3/2) hm (log 1/hm )p m−3/2 .
m=m0
It follows from (10.94) and (10.103) that lim sup ψ (|G(xi ) − G(xi−1 )|) = 0 δ→0 π∈Qa (δ)
a.s.,
(10.104)
i∈Λ
which, together with the comments following (10.90), gives us the upper bound in (10.78). The proof of lower bound in (10.78) is easier. For fixed > 0 let Eδ = Eδ (ω) := s ∈ (0, a) : ψ(|G(s + h, ω) − G(s, ω)|) > (1 − )h for some h ∈ (0, δ) . It follows from Theorem 7.2.15 that for each fixed t and δ, P (t ∈ Eδ ) = 1. Therefore, by Fubini’s Theorem we see that P (|Eδ | = a) = 1,
(10.105)
where | · | indicates Lebesgue measure. Let E := ∩0<δ≤1 Eδ = ∩∞ n=1 E1/n . It follows from (10.105) that P (|E| = a) = 1.
(10.106)
Let Ω be the subset of measure 1 of the probability space of G on which |E| = a, and let ω ∈ Ω . Note that for each t ∈ E(ω ) there are arbtrarily small intervals [t, t + h] for which ψ (|G(t + h, ω ) − G(t, ω )|) > (1 − )h.
(10.107)
Consequently, we can find a finite set of disjoint intervals of length less than δ on which (10.107) holds, such that the sum of their lengths is
10.3 Additional variational results for Gaussian processes
475
greater than (1 − )a. Let π = π(ω ) be a partition in Q(δ) that includes all these intervals, which we label [tj , t + hj ]. We have ψ (|G(xi ) − G(xi−1 )|) ≥ ψ (|G(tj + hj ) − G(tj )|) xi ∈π
j
≥ (1 − )
hj = (1 − )2 a.
j
Therefore, for each , δ > 0 we have
P sup ψ |G(xi ) − G(xi−1 )| > (1 − )2 a = 1.
(10.108)
π∈Q(δ) x ∈π i
Letting first and then δ decrease to zero through a countable set gives the lower bound in (10.78). We now prove (10.79). Continuing the notation introduced in the last paragraph of the proof of Theorem 10.2.3, in addition to the subpartions of π given by π(Ij,m (a)), j = 1, . . . , m − 1, we define the larger partition j−1 j a < xk(j−1)+1 (π) < · · · < xk(j) (π) ≤ a , σ(π)(Ij,m (a)) = m m (10.109) j−1 j j = 1, . . . , m (note that a and a are points in the partition given m m in (10.109)). We then have ϕ(|G2 (xi ) − G2 (xi−1 )|) (10.110) xi ∈π
=
m
ϕ(|G2 (xi ) − G2 (xi−1 )|)
j=1 xi ∈π(Ij,m (a))
≤
m
ϕ(|G2 (xi ) − G2 (xi−1 )|)
j=1 xi ∈σ(π)(Ij,m (a))
+
m−1
ϕ(|G2 (xk(j) (π)) − G2 (xk(j)+1 (π))|).
j=1 m To get the inequality in (10.110) we added partition points at { j−1 m a}j=2 . These points are included in the first term after the inequality sign in (10.110). In the second term after the inequality sign we have written the partitions that were present that bracketed the added points. Fix v > 0. It is easy to see that, for any > 0, we can find a c( ) > 0 such that for all c ∈ [0, c( )],
ϕ(cb) ≤ (1 + )ϕ(c)|b|p
∀ b ∈ [0, 2v]
(10.111)
p-variation
476
(this comes down to noting that for any > 0, (log(1/cb))(1+ ) ≥ log(1/c) for all c sufficiently small). Let I(A) denote the indicator function of the set A. Since G(x) is uniformly continuous almost surely on [0, a], we can find a δ sufficiently small, depending on and ω, such that for all ω in a set of measure 1,
2 2 ϕ |G (xi ) − G (xi−1 )| I sup |G(x)| ≤ v (10.112) sup π∈Qa (δ) x ∈π i
x∈[0,a]
≤ (1 + )2p
m
sup
ϕ(|G(xi ) − G(xi−1 )|)
j=1 π∈Qa (δ) xi ∈σ(π)(Ij,m (a))
|G(x)|p + m
sup
ϕ (|G(x) − G(y)| 2v) .
sup |x−y|≤δ
x∈Ij,m (a)
x,y∈[0,a]
It follows from Theorem 7.2.15 that |G(x) − G(y)| =α (δ 2/p (log 1/δ))1/2
lim sup sup δ→0
|x−y|≤δ
a.s.
(10.113)
Thus the last term in (10.112) is almost surely o(δ r ) as δ goes to zero, for all r < 1. Using this fact and taking the limit as δ goes to 0 in (10.112), we get by (10.78) that
lim sup ϕ(|G2 (xi ) − G2 (xi−1 )|)I sup |G(x)| ≤ v δ→0 π∈Qa (δ)
x∈[0,a]
xi ∈π
≤ (1 + )(2α)p
m j=1
a sup |G(x)|p m x∈Ij,m (a)
a.s.
(10.114)
Finally, taking the limit as m goes to infinity, we get
2 2 lim sup ϕ(|G (xi ) − G (xi−1 )|)I sup |G(x)| ≤ v δ→0 π∈Qa (δ)
x∈[0,a]
xi ∈π
a
|G(x)|p dx
≤ (1 + )(2α)p
a.s.,
(10.115)
0
and since this holds for all > 0 and all v, we get (10.79) but with a less than or equal to sign. To get the opposite inequality we note that sup ϕ(|G2 (xi ) − G2 (xi−1 )|) ≥ sup (10.116) π∈Qa (δ) x ∈π i
xi ∈σ(π)(Ij,m (a))
j∈Bm (a)
ϕ |G(xi ) − G(xi−1 )|
π∈Qa (δ)
inf
x∈Ij,m (a)
|2G(x)|
10.3 Additional variational results for Gaussian processes
477
for Bm (a) as in (10.72). Without loss of generality we assume that supi |G2 (xi ) − G2 (xi−1 )| < e−1 , so that the iterated log term is well defined. Similarly to (10.111), it is easy to see that when bc < e−1 , for any u > 0, and > 0 sufficiently small, for c sufficiently small, we have ϕ(cb) ≥ (1 − )ϕ(c)|b|p for all b ≥ u (this comes down to noting that for any > 0, (log(1/cb))(1− ) ≤ log(1/c) for all c sufficiently small). Therefore, for any > 0 we can find a δ = δ( ) sufficiently small, such that the term to the right of the inequality in (10.116) is greater than or equal to sup ϕ(|G(xi ) − G(xi−1 )|) (1 − )2p j∈Bm (a)
inf
x∈Ij,m (a)
π∈Qa (δ)
|G(x)|p I
xi ∈σ(π)(Ij,m (a))
inf
x∈Ij,m (a)
|G(x)| ≥ u .
(10.117)
Taking the limit in (10.116) first as δ goes to zero and then as m goes to infinity, we get that the left-hand side of (10.79) is greater than or equal to a (1 − )(2α)
|G(x)|p I(G(x) ≥ u) dx.
p
(10.118)
0
Since this is true for all > 0 and all u > 0, we obtain (10.79) but with a greater than or equal to sign. 0,β = {G 0,β (x), x ∈ R1 } be a Gaussian process Example 10.3.4 Let G with stationary increments and increments variance 2 σ0,β (h)
=
4 π
∞
sin2 λh/2 dλ λβ
0 β−1
= h
4 π
∞
(10.119)
sin2 s/2 ds = Cβ hβ−1 , sβ
0
0 (see also (7.280)). as defined on page 330, where it is labeled simply G 0,β . Clearly, (10.20), (10.69), (10.78), and (10.79) hold for G In the next section we apply the variational results about Gaussian processes to obtain variational results about local times of symmetric β stable processes. In order to use Lemma 9.1.2, we need to extend the results of the last two sections to Gaussian processes associated with symmetric stable processes killed at the end of an indendent exponential time with mean 1. Let G1,β = {G1,β (x), x ∈ R1 } be a stationary
p-variation
478
Gaussian process with increments variance 2 σ1,β (h)
4 = π
∞
sin2 λh/2 dλ. 1 + λβ
(10.120)
0
These processes are first considered on page 330, where they are labeled G1 . The reason we cannot apply the results of the preceding two sections 2 directly is because we have not established that σ1,β (h) is concave in [0, δ] for some δ > 0. Theorem 10.3.5 Let G1,β = {G1,β (x), x ∈ R1 } be a mean zero stationary Gaussian processes with increments variance given by (10.120). These processes satisfy (10.20), (10.69), (10.78), and (10.79), where p = 2/(β − 1) and α = Cβ . Proof This follows from Lemma 7.4.11. Let Hβ be defined as in 1,β = {G1,β (x) − G1,β (0), x ∈ Lemma 7.4.11 with ψ(λ) = λβ , and let G R1 }. Then, in the notation that we are using now, we have 0,β (x), x ∈ R1 } = {G 1,β (x) + Hβ (x), x ∈ R1 }, {G law
(10.121)
1,β (x) and Hβ are independent. It follows from Example 10.3.4 where G 1,β (x) + Hβ satisfy (10.20), (10.69), (10.78), and (10.79). Furthat G thermore, it is easy to see that E(Hβ (x + h) − Hβ (x))2 = O(hγ )
(10.122)
as h goes to zero, for all γ < (2β − 1) ∧ 2. It follows from (10.122) and Theorem 7.2.1 that lim sup sup δ→0
|x−y|≤δ
|Hβ (x) − Hβ (y)| ≤C (δ γ (log 1/δ))1/2
a.s.
(10.123)
It is easy to see that this implies that limn→∞ |||Hβ |||π(n),p = 0 for the partitions π(n) defined in Theorem 10.2.3. Therefore it follows from the 1,β (x) + Hβ |||π(n),p that G 1,β satisfies triangle inequality applied to |||G (10.20). Therefore G1,β also satisfies (10.20). G1,β also satisfies (10.69), since the proof of (10.69) only requires that the Gaussian process is continuous and satisfies (10.20). 0,β satisfies We now show that G1,β satisfies (10.78). It is clear that G 1 ¯ = (10.78) and therefore so does G1,β (x) + Hβ . For x ∈ R define ϕ(x) ϕ(|x|). Clearly ϕ(x) ¯ is convex for x ∈ [−δ, δ] for some δ > 0. Therefore, for any > 0, for all |a| and |b| sufficiently small, depending on , we
10.4 p-variation of local times have
ϕ¯
a 1−
1
≥ ϕ¯ (a + b) − ϕ¯ 1−
1−
and
ϕ¯ (a) ≤ (1 − )ϕ¯
a+b 1−
+ ϕ¯
479
b
(10.124)
b .
(10.125)
We use these inequalities in (10.78) with ϕ replaced by ϕ¯ and with 1,β (xi ) − G 1,β (xi−1 ) and b = Hβ (xi ) − Hβ (xi−1 ). Since all these a=G processes are uniformly continuous, there is no problem in taking the terms arbitrarily small. It should be clear now that, in order to show 1,β satisfies (10.78), we need only show that for all > 0, that G |Hβ (xi ) − Hβ (xi−1 )| =0 a.s. (10.126) lim sup ϕ δ→0 π∈Qa (δ)
x ∈π i
This follows immediately from (10.123) since ((2β − 1) ∧ 2)(p/2) > 1. 1,β satisfies (10.78) and hence so does G1,β . This also Thus we see that G implies that G1,β satisfies (10.79) since in the proof of Theorem 10.3.2 we show that any uniformly continuous process that satisfies (10.78) satisfies (10.79). Remark 10.3.6 For later use let us note that (10.75) and the final comment in the above proof imply that, for any b > 0, we can find a sequence of partitions {πn } with m(πn ) < logb n such that a |G1,β (x)|p dx a.s. lim sup |||G21,β |||pπn ,p = (1 + cp,b )(4Cβ )p/2 E|η|p n→∞
0
(10.127)
for some cp,b > 0, where p = 2/(β − 1). 10.4 p-variation of local times
Theorem 10.4.1 Let X = {X(t), t ∈ R+ } be a real-valued symmetric stable process of index 1 < β ≤ 2 and let {Lxt , (t, x) ∈ R+ × R1 } be the local time of X. If {π(n)} is any sequence of partitions of [0, a] such 1/(β−1) , then, for all t ∈ R+ , that m(π(n)) = o log1 n a xi−1 2/(β−1) xi lim |Lt − Lt | = c(β) |Lxt |1/(β−1) dx a.s., n→∞
xi ∈π(n)
0
(10.128)
p-variation
480 where c(β)
p/2
E|η|p (10.129)
1/(β−1) 1 1 22/(β−1) 1 √ + . Γ β−1 2 π Γ(β) sin π2 (β − 1)
=
(2Cβ )
=
Proof In this section we obtain (10.128) for almost all t ∈ R+ . We complete the proof at the end of Section 10.5. Let G1,β be the Gaussian process associated with the symmetric stable process of index β killed at the end of an independent exponential time with mean 1. It follows from Theorem 10.3.5 that, under the condition on m(π(n)) given in Theorem 10.3.1, we have a 1/p p/2 2 p 1/p 2 p/2 lim |||G1,β /2|||π(n),p = (2Cβ ) (E|η| ) |G1,β (x)/2| dx n→∞
0
(10.130) almost surely, where Cβ is given in (10.119) and p = 2/(β − 1). The same proof but with a very minor modification gives lim |||(G1,β + s)2 /2|||π(n),p
(10.131)
n→∞
p/2
= (2Cβ )
1/p
a
|(G1,β (x) + s)2 /2|p/2 dx
(E|η|p )1/p
a.s.
0
for all s = 0. Therefore, by Lemma 9.1.2, for almost all ω ∈ ΩG1,β , (10.132) lim |||Lt + (G1,β (ω) + s)2 /2|||π(n),p a 1/p p/2 p 1/p x 2 p/2 = (2Cβ ) (E|η| ) |Lt + (G1,β (x, ω) + s) /2| dx
n→∞
0
for almost all t ∈ R+ . It follows that for almost all ω ∈ ΩG1,β , p/2
lim sup |||Lt |||π(n),p ≤ (2Cβ ) (E|η|p )1/p (10.133) n→∞
1/p a 1/p a x p/2 2 p/2 |Lt | dx + |(G1,β (x, ω) + s) /2| dx 0
0
+ lim sup |||(G1,β (ω) + s)2 /2|||π(n),p n→∞
for almost all t ∈ R+ . Using (10.131) on the last term in (10.133) we see that for almost all ω ∈ ΩG1,β , p/2
lim sup |||Lt |||π(n),p ≤ (2Cβ ) (E|η|p )1/p (10.134) n→∞
1/p a 1/p a x p/2 2 p/2 |Lt | dx +2 |(G1,β (x, ω) + s) /2| dx 0
0
10.4 p-variation of local times
481
for almost all t almost surely. And finally, since this holds for all s = 0, we get p/2
lim sup |||Lt |||π(n),p ≤ (2Cβ ) (E|η|p )1/p (10.135) n→∞
1/p a 1/p a x p/2 2 p/2 . |Lt | dx +2 |G1,β (x, ω)/2| dx 0
0
Since G1,β has continuous sample paths, it follows from Lemma 5.3.5 that for all > 0,
P
sup |Gβ (x)| ≤
> 0.
(10.136)
x∈[0,a]
Therefore we can choose ω in (10.135) so that the integral involving the Gaussian process can be made arbitrarily small. Thus a 1/p p/2 p 1/p x p/2 lim sup |||Lt |||π(n),p ≤ (2Cβ ) (E|η| ) |Lt | dx (10.137) n→∞
0
for almost all t almost surely. By the same methods we can obtain the reverse of (10.137) for the limit inferior. p/2 Since c(β) = (2Cβ ) E|η|p , we obtain (10.128). Using (5.74) and (4.102) we get (10.129). Theorem 10.4.2 Under the hypotheses of Theorem 10.4.1, for all b > 0 b we can find a sequence of partitions {π(n)}∞ n=1 with m(π(n)) ≤ log n such that x |Lxt i − Lt i−1 |2/(β−1) (10.138) lim sup n→∞
xi ∈π(n)
a
|Lxt |1/(β−1) dx
= (1 + δβ,b )c(β)
a.s.
0
for some constant δβ,p > 0. This shows that for Brownian motion, the restriction on m(π(n)) in Theorem 10.4.1 cannot be strengthened. Proof We follow the proof of Theorem 10.4.1 precisely, except that, instead of (10.131), we have the same expression as in (10.131), but lim is replaced by lim sup and with an additional factor of (1 + cp,b ) on the right-hand side. This follows from Remark 10.3.6. With this change the rest of the proof of Theorem 10.4.1 gives (10.137) with an additional factor of (1 + cp,b ) on the right-hand side. By essentially the
p-variation
482
same methods we can obtain (10.137) with a greater than or equal to sign.
10.5 Additional variational results for local times Theorem 10.5.1 Let X = {X(t), t ∈ R+ } be a real-valued symmetric x 1 stable process of index 1 < β ≤ "2 and let {Lt , (t, x) ∈ R+ × R } be the local time of X. If ϕ(x) = |x/ 2 log+ log 1/x |2/(β−1) , then lim
sup
δ→0 π∈Qa (δ)
ϕ(|Lxt i − Lt i−1 |) = c (β) x
a
|Lxt |1/(β−1) dx
(10.139)
0
xi ∈π
almost surely for each t ∈ R+ , where
c (β) =
Γ(β) sin
2 π
2 (β
1/(β−1) − 1)
.
(10.140)
Proof To simplify the notation we denote, for real-valued functions {τ (x), x ∈ R+ } and {f (x), x ∈ [0, a]}, Vτ,a (f ) = lim sup τ (|f (xi ) − f (xi−1 )|). (10.141) δ→0 π∈Qa (δ)
xi ∈π
The first part of the proof is basically exactly the same as the proof of Theorem 10.5.1. It follows from Theorem 10.3.5 that a p/2 |(G21,β (x)/2|p/2 dx a.s., (10.142) Vϕ,a G21,β /2 = (2Cβ ) 0
where p = 2/(β − 1). The same proof with minor modifications gives a p/2 |(G1,β (x) + s)2 /2|p/2 dx a.s. Vϕ,a (G1,β + s)2 /2 = (2Cβ ) 0
(10.143) for all s = 0. Therefore, by Lemma 9.1.2, for almost all ω ∈ ΩG1,β , Vϕ,a (Lt + (G1,β ( · , ω) + s)2 /2) (10.144) a p/2 = (2Cβ ) |Lxt + (G1,β (x, ω) + s)2 /2|p/2 dx 0
for almost all t almost surely. We note that by (10.111), for all c > 0, ϕ(c|x|) ≤ (1 + δ )cp ϕ(|x|) for any δ > 0, for all x sufficiently small. Using this and (10.125) we see
10.5 Additional variational results for local times
483
that for almost all ω ∈ ΩG1,β and 0 < ≤ 1/2, Vϕ,a (Lt ) ≤ (1 + δ )(1 − )1−p Vϕ,a ((Lt + (G1,β ( · , ω) + s)2 /2) +(1 + δ ) 1−p Vϕ,a ((G1,β ( · , ω) + s)2 /2) for almost all t almost surely. (We assume that the partition size δ is sufficiently small so that the increments of (Lt + (G1,β ( · , ω) + s))2 are also sufficiently small.) Therefore, by (10.143) and (10.144) for almost all ω ∈ ΩG1,β , Vϕ,a (Lt ) ≤ (1 + δ )(1 − )1−p (2Cβ ) + (1 + δ ) 1−p (2Cβ )
a
p/2
p/2
0
(10.145) x 2 p/2 |Lt + (G1,β (x, ω) + s) /2| dx
a
|(G1,β (x, ω) + s)2 /2|p/2 dx 0
for almost all t almost surely. Using (10.136), we can choose an ω and s such that supx∈[0,a] |G1,β (x, ω) + s| can be made arbitrarily small. It follows that a p/2 |Lxt |p/2 dx for almost all t a.s. (10.146) Vϕ,a (Lt ) ≤ (2Cβ ) 0
A similar argument gives (10.146) with a greater than or equal to sign. Thus we have a p/2 Vϕ,a (Lt ) = (2Cβ ) |Lxt |p/2 dx for almost all t a.s. (10.147) 0
To show that (10.139) holds for all t ∈ R+ , we need the following lemma on scaling the local times of symmetric stable processes. Lemma 10.5.2 Let X = {X(t), t ∈ R+ } be a real-valued symmetric stable process of index 1 < β ≤ 2 and let {Lxt , (t, x) ∈ R+ × R1 } be the local time of X. For fixed s, t ∈ R+ , law
x/s1/β
{Lxt ; x ∈ R1 } = {s1/β Lt/s
; x ∈ R1 },
(10.148)
where 1/β + 1/β = 1. Proof By (4.63) for fixed s ∈ R1 , law
{X(t); t ∈ R+ } = {s1/β X(t/s); t ∈ R+ }. Therefore, by (2.92), which also holds for L´evy processes, 1 t I({a ≤ s1/β Xu/s ≤ a + }) du Lat = lim →0 0
(10.149)
(10.150)
p-variation
484 =
s →0
t/s
lim
= s
1/β
0
1 lim →0
I({s−1/β a ≤ Xv ≤ s−1/β (a + )}) dv
t/s
I({s−1/β a ≤ Xv ≤ s−1/β a + }) dv.
0
This gives (10.148). Proof of Theorem 10.5.1 continued By (10.148) with s = t/t0 we have x(t0 /t)1/β
law
{Lxt ; x ∈ R1 } = {(t/t0 )1/β Lt0
; x ∈ R1 }.
(10.151)
Let Q be a countable dense subset of R+ . It follows from (10.147) and Fubini’s Theorem that there exists a t0 ∈ R+ such that b p/2 |Lxt0 |p/2 dx ∀b ∈ Q a.s. (10.152) Vϕ,b (Lt0 ) = (2Cβ ) 0
Therefore, since Vϕ,b (Lt0 ) is monotone in b, (10.152) holds for all b ∈ R+ almost surely. Furthermore, by (9.117) and Fubini’s Theorem, we can choose t0 so that {Lxt0 ; x ∈ [0, b]} is uniformly continuous almost surely for any b < ∞. By (10.151) we have c x (10.153) |Lyt |p/2 dy); x ∈ R1 (Lt , 0 c ¯ y(t /t)1/β p/2 law x(t /t)1/β = ((t/t0 )1/β Lt0 0 , |(t/t0 )1/β Lt0 0 | dy); x ∈ R1 0
x(t /t)1/β , (t/t0 )p/β = ((t/t0 )1/β Lt0 0
c(t0 /t)1/β
0
|Lzt0 |p/2 dz); x ∈ R1
where the last equality uses the change of variables z = y(t0 /t)1/β in the ¯ + 1/β = p/β. ¯ integral and the fact that p = 2/(β − 1), so that p/(2β) · Using the uniform continuity almost surely of Lt0 and the fact that ϕ is regularly varying at zero, we see that for any t ∈ R+ ¯
¯
Vϕ,c(t0 /t)1/β ((t/t0 )1/β Lt0 ) = (t/t0 )p/β Vϕ,c(t0 /t)1/β (Lt0 ) It then follows from (10.153) that c (Vϕ,c (Lt ), |Lyt |p/2 dy) 0 law
¯
= ((t/t0 )p/β Vϕ,c(t0 /t)1/β (Lt0 ), (t/t0 )p/β
a.s. (10.154)
(10.155) 0
c(t0 /t)1/β
|Lzt0 |p/2 dz).
10.5 Additional variational results for local times
485
It now follows from (10.152), which we have seen holds for all b ∈ R1 , that for any c > 0 c p/2 Vϕ,c (Lt ) = (2Cβ ) |Lxt |p/2 dx a.s. (10.156) 0
Thus we get (10.139); (10.140) follows from (4.102). Remark 10.5.3 We cannot use an argument similar to the one used to prove Theorem 10.5.1 to show that Theorem 10.4.1 holds for each t ∈ R+ almost surely. This is because in Theorem 10.4.1, as stated, the subsets of R+ of measure zero for which (10.128) may not hold could depend on the particular sequence of partitions {π(n)}. Thus we cannot use scaling because we do not know if a sequence of partitions for which (10.128) holds for L·t0 will allow (10.128) to hold when they are rescaled, as we did above, to consider L·t . We give one final result on the p-variation of local times that mixes properties of Theorems 10.4.1 and 10.5.1, in that it is the same as (10.128) but it holds over all partitions refining to zero. This is possible because we consider a weaker form of convergence. Theorem 10.5.4 Let X = {X(t), t ∈ R+ } be a real-valued symmetric stable process of index 1 < β ≤ 2 and let {Lxt , (t, x) ∈ R+ × R1 } be the local time of X. If {π(n)} is any sequence of partitions of [0, a] with limn→∞ m(π(n)) = 0, then a x |Lxt i − Lt i−1 |2/(β−1) = c(β) |Lxt |1/(β−1) dx (10.157) lim n→∞
0
xi ∈π(n)
in Lr uniformly in t on any bounded interval of R+ for all r > 0, where c(β) is given in (10.129). We cannot prove this theorem using any of the isomorphism theorems in this book, particularly the statement about uniformity in t. The proof follows from several lemmas on moments of the Lr norm of various functions of the local times. Lemma 10.5.5 Let X = {X(t), t ∈ R+ } be a symmetric stable process of index 1 < β ≤ 2 and let {Lxt , (t, x) ∈ R+ × R1 } be the local time of X. Then, for all x, y, z ∈ R1 , t ∈ R+ and integers m ≥ 1, m pt1 (x−z) p∆ti (0) dt, (10.158) E z ((Lxt )m ) = m! · · · 0
i=2
p-variation
486
where pt is the probability density function of X(t). Furthermore, 2m E z (Lxt − Lyt ) (10.159) = (2m)! · · · (pt1 (x − z) + pt1 (y − z)) 0
p∆ti (0) − (−1)2m−i p∆ti (x − y) dt,
i=2
where ∆ti = ti − ti−1 . Proof
We show that for any yi ∈ R1 ,
n y Lt i Ez
(10.160)
i=1
=
π
n
pti −ti−1 (yπi−1 , yπi )
0
n
dti
i=1
where t0 = π0 = 0, y0 = z and the sum is taken over all permutations π of {1, . . . , n}. Equation (10.158) follows immediately from this. Note that both sides of (10.160) are continuous functions of t (we show this for pt in the paragraph containing (4.74)). Consequently, to obtain (10.160) it suffices to show that both sides of the equation have the same Laplace transform. That is, it suffices to show that, for any α > 0,
n ∞ y −αt z i e E Lt α dt (10.161) 0
α = π
i=1 ∞ −αt
n
e
0
pti −ti−1 (yπi−1 , yπi )
0
n
dti dt.
i=1
By Fubini’s Theorem, the right-hand side of (10.161) equals ∞ n n αe−αt dt pti −ti−1 (yπi−1 , yπi ) dti π
0
=
π
=
i=1
e−αtn 0
n i=1
n
0
=
tn
pti −ti−1 (yπi−1 , yπi )
i=1 n
dti
i=1
e−α(ti −ti−1 ) pti −ti−1 (yπi−1 , yπi )
n
dti
i=1
(10.162)
10.5 Additional variational results for local times
487
where uα is the α potential density of X. Since the left-hand side of ,n (10.161) can be written as E z ( i=1 Lyλi ), where λ is an independent exponential random variable with mean 1/α, we see that (10.161) follows from Kac’s moment formula, Theorem 3.10.1. This completes the proof of (10.158). To prove (10.159) we use the notation ∆δ h(z) = h(z + δ) − h(z). When h is a function of several variables, say (z1 , . . . , z2n ), we write ∆δ,zi for the operator ∆δ applied to the variable zi . Recall that pt (x, x ) = pt (x − x ), so that the terms in pt are actually functions of a single variable. Using (10.160) we have
2m y xi z i E (Lt − Lt ) (10.163) i=1
2m
=
∆yi −xi ,xi
E
z
2m
i=1
2m
=
π
Lxt i
i=1
∆yπi −xπi ,xπi
i=1
2m
p∆ti (xπi−1 , xπi )
0
2m
dti ,
i=1
where t0 = π0 = 0, x0 = z and the sum is taken over all permutations π of {1, . . . , 2m} . Ultimately we set all the xi = x and all the yi = y, 1 ≤ i ≤ 2m. The subscripts are an aid to understanding the proof. We first consider the case when π is the identity permutation. We show that 2m 2m
∆yi −xi ,xi p∆ti (xi−1 , xi ) (10.164)
i=1
i=1
= (pt1 (x − z) + pt1 (y − z))
xi =x, yi =y
2m
p∆ti (0) − (−1)2m−i p∆ti (x − y) .
i=2
It is easy to see that we get the same result for all permutations π. Using this in (10.163) we get (10.159). Set Di = ∆yi −xi ,xi and consider the effect of D2m−1 D2m on the prod,2m uct i=1 p∆ti (xi−1 , xi ). Since it operates only on x2m−1 and x2m , we need only consider D2m−1 D2m p∆t2m−1 (x2m−2 , x2m−1 )p∆t2m (x2m−1 , x2m )
(10.165)
= D2m−1 p∆t2m−1 (x2m−2 , x2m−1 ) [ p∆t2m (x2m−1 , y2m ) − p∆t2m (x2m−1 , x2m )] = p∆t2m−1 (x2m−2 , y2m−1 )[ p∆t2m (y2m−1 , y2m ) − p∆t2m (y2m−1 , x2m )] − p∆t2m−1 (x2m−2 , x2m−1 )
488
p-variation [ p∆t2m (x2m−1 , y2m ) − p∆t2m (x2m−1 , x2m )].
Set x2m−1 = x2m = x and y2m−1 = y2m = y. The expression to the right of the last equal sign in (10.165) becomes p∆t2m−1 (x2m−2 , y)[ p∆t2m (0) − p∆t2m (y − x)]
(10.166)
− p∆t2m−1 (x2m−2 , x)[ p∆t2m (x − y) − p∆t2m (0)] = p∆t2m−1 (x2m−2 , y) + p∆t2m−1 (x2m−2 , x) [ p∆t2m (0) − p∆t2m (y − x)]. Since x0 = z, this gives (10.164) when m = 1. Assume now that m ≥ 2. One more application of the difference operator reveals the pattern that gives (10.164): D2m−2 p∆t2m−2 (x2m−3 , x2m−2 )[ p∆t2m−1 (x2m−2 , y) + p∆t2m−1 (x2m−2 , x)] = p∆t2m−2 (x2m−3 , y2m−2 )[ p∆t2m−1 (y2m−2 , y) + p∆t2m−1 (y2m−2 , x)] − p∆t2m−2 (x2m−3 , x2m−2 )[ p∆t2m−1 (x2m−2 , y) + p∆t2m−1 (x2m−2 , x)]. Setting x2m−2 = x and y2m−2 = y, the right-hand side of the above equation is (10.167) p∆t2m−2 (x2m−3 , y) − p∆t2m−2 (x2m−3 , x) [ p∆t2m−1 (0) + p∆t2m−1 (y − x)]
= D2m−2 p∆t2m−2 (x2m−3 , x2m−2 )
x2m−2 =x, y2m−2 =y
[ p∆t2m−1 (0) + p∆t2m−1 (y − x)]. Combining (10.166) and (10.167), we obtain 2m 2m
∆yi −xi ,xi p∆ti (xi−1 , xi ) (10.168)
i=1 i=1 xi =x, yi =y 2m−2 2m−2
∆yi −xi ,xi p∆ti (xi−1 , xi ) =
i=1 i=1 x =x, yi =y i p∆t2m−1 (0) + p∆t2m−1 (y − x) [ p∆t2m (0) − p∆t2m (y − x)]. It should be clear that by continuing this procedure we get (10.164). Let Z be a random variable on the probability space of the stable process X. We denote the Lr norm of Z with respect to P 0 by Zr .
10.5 Additional variational results for local times
489
Lemma 10.5.6 Let X = {X(t), t ∈ R+ } be a real-valued symmetric stable process of index 1 < β ≤ 2 and let {Lxt , (t, x) ∈ R+ × R1 } be the local time of X. Then, for all x, y ∈ R1 , s, t ∈ R+ and integers m ≥ 1, Lxt − Lyt 2m ≤ C(β, m)t(β−1)/(2β) |x − y|(β−1)/2 ,
(10.169)
Lxt − Lxs m ≤ C (β, m)|t − s|(β−1)/β ,
(10.170)
Lxt m ≤ C (β, m)t(β−1)/β ,
(10.171)
and
where C(β, m) and C (β, m) are constants depending only on β and m. It is clear that the inequality in (10.170) is unchanged if we take the norm with respect to P z for any z ∈ R1 . The same observation applies to (10.169) since it only depends on |x − y|. Proof Let pt (u) := pt (x, x + u) denote the transition probability densities of X. By Lemma 10.5.5 with z = 0 we see that y 2m x (pt1 (x) + pt1 (y)) (10.172) Lt − Lt 2m = (2m)! · · · 0
p∆ti (0) − (−1)2m−i p∆ti (x − y) dt,
i=2
where ∆ti = ti − ti−1 . Writing pt (x) as the Fourier transform of its characteristic function, it is easy to see that pt (x) ≤ pt (0)
and
ps (0) = dβ s−1/β
(10.173)
for some constant dβ . Thus we have that (10.172) is less than or equal to m−1 m pt1 (0) p∆t2j+1 (0) (10.174) ··· (2m)!2 0
i=2
j=1
p∆t2i (0) − (−1)2m−i p∆t2i (x − y) dt
≤ (2m)! 2
m
m
t
ps (0) ds 0
∞
m (pt (0) − pt (x − y)) dt
.
0
We obtain (10.169) from (10.174) by using (4.90), (4.95), and (10.173). To obtain (10.170) we note that Lxt − Lxs m = Lxt−s ◦ θs m = (E 0 {E Xs (Lxt−s )m })1/m ,
(10.175)
p-variation
490
where θ denotes the shift operator on the space of paths of X and without loss of generality we assume that t ≥ s. Note that by Lemma 10.5.5, for any z,
m z x m z x = m!E dLti E (Lt−s ) ··· 0
= m! · · ·
pt1 (x − z)
(10.176)
0
pt2 −t1 (0) · · · ptm −tm−1 (0) dt1 , . . . , dtm m pr (0)dr .
t−s
≤ m! 0
Using (10.175), (10.176), and (10.173), we get (10.170). The bound (10.171) follows from (10.170) by taking s = 0. Lemma 10.5.7 Let X = {X(t), t ∈ R+ } be a real-valued symmetric stable process of index 1 < β ≤ 2 and let {Lxt , (t, x) ∈ R+ × R1 } be the local time of X. Let p = 2/(β − 1). Then, for all partitions π of [0,a], s, t ∈ R+ , with s ≤ t, and integers m ≥ 1, |||Lt |||pπ,p − |||Ls |||pπ,p m ≤ C(β, m)t(p−1)(β−1)/(2β) |t − s|(β−1)/(2β) a. (10.177) In particular, (β−1)/(2β) 1/p |||Lt |||pπ,p 1/p a , m ≤ C (β, m)t
(10.178)
where C(β, m) and C (β, m) are constants depending only on β and m. Similarly, for any r ≥ 1, a - a x r |Lt | dx − |Lxs |r dx - ≤ D(β, r, m)t(r−1)(β−1)/β |t − s|(β−1)/β a. 0
m
0
In particular, - 0
a
(10.179)
|Lxt |r dx -
m
≤ D (β, r, m)tr(β−1)/β a,
(10.180)
where D(β, r, m) and D (β, r, m) are constants depending only on β, r and m. Proof
For a partition π we set x
∆Lxt i = Lxt i − Lt i−1
i = 1, . . . , kπ .
(10.181)
10.5 Additional variational results for local times
491
Suppose that u ≥ v ≥ 0. Writing up − v p as the integral of its derivative, we see that up − v p ≤ p(u − v)up−1 .
(10.182)
Therefore, it follows from (10.182) and the Schwarz inequality that |||Lt |||pπ,p − |||Ls |||pπ,p m |∆Lxt i |p − |∆Lxs i |p m ≤ xi ∈π
≤
(10.183)
p |∆Lxt i |p−1 2m + |∆Lxs i |p−1 2m ∆Lxt i − ∆Lxs i 2m .
xi ∈π
Let r be the smallest even integer greater than or equal to 2m(p − 1). Then, by H¨ older’s inequality and (10.169), we see that |∆Lxt i |p−1 2m
≤
∆Lxt i p−1 r
≤ D(β, m)t
(10.184)
(p−1)(β−1)/(2β)
(xi − xi−1 )
(p−1)(β−1)/2
,
where D(β, m) = (C(β, r))p−1 and C(β, r) is the constant in (10.169) (clearly this inequality also holds with t replaced by s). We also have ∆Lxt i − ∆Lxs i 2m
i = ∆Lxt−s ◦ θs 2m (10.185) 0 Xs xi 2m 1/2m . = E {E (∆Lt−s ) }
It follows from (10.169) and the remark immediately following the statement of Lemma 10.5.6 that, for all z ∈ R1 , 1/2m z i i )2m = ∆Lxt−s 2m (10.186) E (∆Lxt−s ≤ C(β, m)|t − s|(β−1)/(2β) |xi − xi−1 |(β−1)/2 . Combining (10.183)–(10.186) and using the fact that s ≤ t, we see that |||Lt |||pπ,p − |||Ls |||pπ,p m
(10.187)
≤ 2pD(β, m, p)C(β, m)t |t − s| (β−1)/2 (xi − xi−1 ) (xi − xi−1 )(p−1)(β−1)/2 . (p−1)(β−1)/(2β)
(β−1)/(2β)
xi ∈π
This gives (10.177) since the sum in (10.187) is equal to a. The statement in (10.178) follows from (10.177) by setting s = 0. To prove (10.179), we take s < t and note that a - a x r |L | dx − |Lxs |r dx (10.188) t m 0 0 a |Lxt |r − |Lxs |r m dx ≤ a sup |Lxt |r − |Lxs |r m . ≤ 0
x
p-variation
492
It follows from (10.182) with p replaced by r ≥ 1, followed by the Cauchy-Schwarz inequality, that |Lxt |r − |Lxs |r m ≤ r Lxt − Lxs 2m |Lxt |r−1 2m ,
(10.189)
and as in (10.184) we have that |Lxt |r−1 2m ≤ Lxt r−1 , q
(10.190)
where q is the smallest even integer greater than or equal to 2m(r − 1). The inequality in (10.179) now follows from (10.170) and (10.171). The statement in (10.180) follows from (10.179) by setting s = 0. Proof of Theorem 10.5.4 Although we are dealing with a weaker form of convergence than in Theorem 10.4.1, the only way that we know to prove this theorem is by using Theorem 10.4.1. Fix a > 0 and let P[0, a] denote the partitions of [0, a]. For π ∈ P[0, a], let a xi−1 2/(β−1) xi Hπ (t) = |Lt − Lt | − c(β) |Lxt |1/(β−1) dx, (10.191) 0
xi ∈π
where cβ is the constant in Theorem 10.4.1. By Lemma 10.5.7 we have that for any m, Hπ (t)m ≤ C(m, a, t) < ∞,
(10.192)
where C(m, a, t) is independent of π. In particular, for each t the collecAssume that {π (n)} tion {Hπ (t) ; π ∈ P[0, a]} is uniformly integrable. is a sequence in P[0, a] with m(π (n)) = o (log n)−1/(β−1) . It follows from Theorem 10.4.1 and Fubini’s Theorem that for some dense subset D ⊆ R+ , we have that for each t ∈ D, Hπ (n) (t) converges to 0 almost surely. Since {Hπ (t) ; π ∈ P[0, a]} is uniformly integrable, we have that for any r > 0, lim Hπ (n) (t)r = 0
n→∞
∀t ∈ D.
(10.193)
Fix t ∈ D and let {π(n)} be a sequence in P[0, a] for which we assume only that limn→∞ m(π(n)) = 0. If Hπ(n) (t)r did not converge to 0, we could find some > 0 and a subsequence nj → ∞ with Hπ(nj ) (t)r ≥
j = 1, 2, . . .
(10.194)
By taking a further subsequence, if necessary, which we again denote by nj , we can assume that m(π(nj )) = o (log n(j))−1/(β−1) . This gives a contradiction between (10.193) and (10.194). Thus we see that (10.193) holds whenever limn→∞ m(π (n)) = 0.
10.5 Additional variational results for local times
493
Now fix T > 0 and a sequence {π(n)} in P[0, a] with limn→∞ m(π(n)) = 0. By Lemma 10.5.7, we have that for any r > 0 and any > 0 we can find a δ > 0 such that sup Hπ(n) (s) − Hπ(n) (t)r ≤
∀ n ≥ 1.
(10.195)
0≤s,t≤T
|s−t|≤δ
Choose a finite set {t1 , . . . , tk } in D ∩ [0, T ] such that ∪kj=1 [tj − δ, tj + δ] covers [0, T ]. By (10.193) we can choose an N such that sup Hπ(n) (tj )r ≤
∀n ≥ N .
(10.196)
∀n ≥ N .
(10.197)
j=1,...,k
Combined with (10.195) this shows that sup Hπ(n) (s)r ≤ 2
0≤s≤T
Using Theorem 10.5.4, we can complete the proof of Theorem 10.4.1. Proof of Theorem 10.4.1 continued In Section 10.4 we show that (10.128) holds for almost all t ∈ R+ almost surely. Therefore, by Fubini’s Theorem, we can find a dense subset T ∈ R+ such that a x lim |Lxt i − Lt i−1 |2/(β−1) = c(β) |Lxt |1/(β−1) dx a.s., n→∞
0
xi ∈π(n)
(10.198) for all t ∈ T . Fix t > 0, and let sk , k = 1, . . . be a sequence in T with sk ↑ t and ∞ (t − sk )(β−1)/2β < ∞. (10.199) k=1
By the additivity of local times, Lxt − Lxsk = Lxt−sk ◦ θsk . Set p = 2/(β − 1) and note that
Ak := lim sup |||Lt |||π(n),p − |||Lsk |||π(n),p
(10.200)
(10.201)
n→∞
≤
lim sup |||Lt − Lsk |||π(n),p
=
lim sup |||Lt−sk ◦ θsk |||π(n),p .
n→∞ n→∞
Let X r = Xr+sk − Xsk , r ≥ 0. Then X = {X r ; r ≥ 0} is a copy of x {Xr ; r ≥ 0} that is independent of Xsk . Let {Lr ; (x, r) ∈ R1 × R+ }
p-variation
494
denote the local time for the process {X r ; r ≥ 0}. It is easy to check that x−Xs
Lxt−sk ◦ θsk = Lt−sk k .
(10.202)
Therefore, |||Lt−sk ◦ θsk |||π(n),p
·−Xs
= |||Lt−sk k |||π(n),p
(10.203)
x |||Lt−sk |||π(n)−Xsk ,p
=
,
where π(n) − Xsk is the partition of [0, a] − Xsk = [−Xsk , a − Xsk ] obtained by subtracting Xsk from every element of the partition π(n). For fixed Xsk , it follows from Theorem 10.5.4 that 1/2
lim |||Lt−sk |||π(n)−Xsk ,p = c(β)1/p Lt−sk p/2,[0,a]−Xs
n→∞
where Ls r,[a,b] =
b a
|Lxs |r
in L1X¯ , (10.204)
1/r dx
k
and
L1X¯
1
denotes the L space with
respect to X. Using (10.201)–(10.204) and H¨ older’s inequality, we have E(Ak | Xsk ) ≤
(10.205)
¯ t−s 1/2 c(β)1/p E(L k p/2,[0,a]−Xs k
| X sk )
1/p
¯ t−s p/2 | Xsk ) ≤ c(β)1/p E(L k p/2,[0,a]−Xs k 1/p ≤ c(β)D (β, p/2, 1)a (t − sk )(β−1)/2β , where in the last line we use (10.180). Therefore, E(Ak ) ≤ C (t − sk )(β−1)/2β
(10.206)
for some finite constant C independent of k. Then, by (10.199) and the Borel–Cantelli Lemma, Ak → 0
a.s.
(10.207)
The proof is completed by observing that, for each k, lim sup |||Lt |||π(n),p
(10.208)
n→∞
≤ lim sup |||Lsk |||π(n),p + Ak n→∞
1/2
= c(β)1/p Lsk p/2,[0,a] + Ak , and lim inf |||Lt |||π(n),p n→∞
≥ lim inf |||Lsk |||π(n),p − Ak n→∞
(10.209)
10.6 Notes and references
495
1/2
= c(β)1/p Lsk p/2,[0,a] − Ak . Therefore, lim lim |||Lsk |||π(n),p = lim |||Lt |||π(n),p
k→∞ n→∞
It is easy to see that a x p/2 |Lsk | dx = lim k→∞
0
a.s.
n→∞
(10.210)
a
|Lxt |p/2 dx
a.s.,
(10.211)
0
since {Lxs ; (s, x) ∈ [0, t] × [0, a]} is uniformly continuous almost surely. Since (10.198) holds with t = sk for all k, taking the limit as k goes to infinity we extend (10.128) to all t ∈ R+ .
10.6 Notes and references The material in this chapter is taken from Marcus and Rosen (1992c). Several earlier papers deal with the quadratic variation of Brownian motion and its local times and some with generalizations to larger classes of processes. Theorem 10.2.3, for Brownian motion, is due to Dudley (1973). Gin´e and Kline (1975) consider the quadratic variation of a larger class of Gaussian processes. Theorem 10.2.4, for Brownian motion, is due to de la Vega (1973). He only states the result for b ≥ 3; however, a minor modification of his proof gives all b > 0 (the description of the partitions used to obtain the examples that establish Theorem 10.2.4 is taken, almost verbatim, from de la Vega (1973)). Other results about various aspects of the p-variation of Gaussian processes appear in Kono (1969), Jain and Monrad (1983), and Adler and Pyke (1993). There is nothing special about p-variation. Essentially, the function that operates on the increments of the Gaussian process is the inverse of σ(h) in (10.19). Our approach to the study of p-variation is to express the p norm as the supremum of a linear functional on its dual space. This is simple because σ −1 in Theorem 10.2.3 is effectively a power. When it isn’t we can apply the same ideas used in the proof of Theorem 10.2.3, but we must deal with Orlicz spaces. This is done in Marcus and Rosen (1993), but it does not seem important to bother with the technicalities of Orlicz spaces in this book. Theorem 10.3.2, for Brownian motion, is due to Taylor (1972). It is generalized to a larger class of Gaussian processes, including those we consider, by Kawada and Kono (1973). In contrast to the technical difficulties encountered in extending Theorem 10.2.3, it is easy to see how to extend Theorem 10.3.2. The proof of the lower bound only requires the
p-variation
496
existence of an iterated logarithm law for local behavior of the Gaussian process at a point. Therefore, the lower bound holds under the hypotheses of Theorem 7.4.12. (It is true that this only applies to associated processes, but since we are ultimately concerned with applying these results to local times, this should not discourage us.) The upper bound should hold under the general conditions of Corollary 7.2.3 and some asymptotic monotonicity condition on σ at zero (see Theorem 9.5.13 for an example of how these conditions are combined). Actually one should be able to extend these results even further. We know from Theorem 7.6.4 that the local modulus of continuity does not always contain an iterated logarithm term. Theorem 10.3.2 ought to hold nevertheless, with the function φ being the inverse of the local modulus of continuity. (It may be difficult to get the correct constant in general. Also, we should emphasize that we have not worked out the details. There may be unanticipated difficulties along the way.) Some results along the lines of Theorem 10.5.4 were obtained prior to Marcus and Rosen (1992c). Note that when β = 1 + k1 , where k is an integer, then (10.157) is a x lim (Lxt i − Lt i−1 )2k = c(β) |Lxt |k dx, (10.212) n→∞
xi ∈π(n)
0
where the right-hand side is a k-fold self-intersection local time for intersections of the underlying stable process in [0, a]. In particular, for the local time of Brownian motion we have a x (Lxt i − Lt i−1 )2 = 4 Lxt dx (10.213) lim n→∞
xi ∈π(n)
0
(keep in mind the different characteristic functions of Brownian motion and 2-stable processes). Formula (10.213), but with convergence in probability, was obtained by Bouleau and Yor (1981) and Perkins (1982), and allows one to develop stochastic integration with respect to the space parameter of Brownian local time; see also Walsh (1983). Formula (10.212), with convergence in L2 , was established in Rosen (1993). Lemma 10.5.5 is given in Rosen (1991).
11 Most visited sites of symmetric stable processes
Let X be a strongly symmetric Borel right process with continuous αpotential densities uα (x, y), α > 0 and state space R1 . Let {Lyt , (t, y) ∈ (R+ × R1 )} denote the local times of X and let Vt = {x ∈ R1 | Lxt = sup Lyt }.
(11.1)
y
Following the discussion in Subsection 2.9, we refer to Vt as the most visited sites, or the favorite points, of X up to time t. Set Vt = inf |x|. x∈Vt
(11.2)
V = {Vt , t ∈ R+ } is a R+ valued stochastic process. We investigate the behavior of Vt as t → ∞ and also as t → 0. We restrict ourselves to the case in which X is a symmetric β-stable process, 1 < β ≤ 2. We discuss some generalizations to a larger class of processes in Section 11.7. In Section 11.2 we restrict our attention to Brownian motion because the problem is considerably simpler in that case. Indeed, the major inequality, Lemma 11.5.1, that permits us to get results for symmetric β-stable processes for 1 < β < 2 is a triviality for Brownian motion (see the comment following Lemma 11.5.1).
11.1 Preliminaries Many of the preliminary results we need to study V for Brownian motion are special cases of results for symmetric stable processes. To be efficient we make some preliminary observations that are valid for all these processes. The Gaussian processes associated with symmetric stable processes with index 1 < β ≤ 2 are fractional Brownian motions. Let η = {η(s), s ∈ R1 } be a fractional Brownian motion of index β −1 497
498
Most visited sites of symmetric stable processes
as defined on page 276, that is, η is a mean zero Gaussian process with stationary increments, η(0) = 0, and E |η(s) − η(t)|2 = Cβ |s − t|β−1 , (11.3) where β ∈ (1, 2] and Cβ is given in (7.280). We use this normalization because these are the Gaussian processes associated with the canonical symmetric stable process of index β for β ∈ (1, 2], that is, ψ(λ) = |λ|β in (7.262), killed the first time it hits zero (see Theorem 4.2.4, Remark 4.2.6, and Lemma 7.4.9). These processes are used in the isomorphism theorems later in this chapter. It follows from (11.3) that the covariance of fractional Brownian motion of index β − 1 is Cβ β−1 |s| (11.4) + |t|β−1 − |s − t|β−1 . 2 √ We point out in Remark 4.2.6 that when β = 2, η(s) = 2B(s), where B = {B(s), s ∈ R1 } is standard two-sided Brownian motion. It is easy to see by (11.4) that when β = 2 and st < 0, Γ(s, t) = 0, consistent with the fact that for two-sided Brownian motion, {B(s), s ∈ R+ } and {B(s), s ∈ (−∞, 0)} are independent. However, for 1 < β < 2, Γ(s, t) > 0 whenever s and t are not both equal to zero. Γ(s, t) =
It also follows from (11.3) that law s−1/2 η s1/(β−1) x , x ∈ R1 = η(x), x ∈ R1
(11.5)
for any s > 0. We refer to this as the scaling property of fractional Brownian motion. In Lemma 10.5.2 we give the scaling property of the local times of symmetric β-stable processes. This carries over to the total accumulated local time up to τ , the inverse local time at zero (see (3.132)). Lemma 11.1.1 Let X = {Xt ; t ≥ 0} be a symmetric stable process of index 1 < β ≤ 2 in R1 and denote its local times by L = {Lxt , (x, t) ∈ R1 × R+ }. Let τ ( · ) denote the inverse local time of L0· . Then, for all h > 0, law x/h1/(β−1) ; (x, r) ∈ R1 × R+ . Lxτ(hr) ; (x, r) ∈ R1 × R+ = hLτ (r) (11.6) Proof Using Lemma 10.5.2 with s = hβ/(β−1) , we see that Lxt ; (x, t) ∈ R1 × R+
(11.7)
11.1 Preliminaries law x/h1/(β−1) = hLt/hβ/(β−1) ; (x, t) ∈ R1 × R+ .
499
Under this equivalence in law it follows that τ (hr) = inf{t > 0 | L0t > hr}
(11.8)
law
= inf{t > 0 | hL0t/hβ/(β−1) > hr}
= inf{t > 0 | L0t/hβ/(β−1) > r} = hβ/(β−1) τ (r). Therefore, (11.9) (Lxt , τ (hr)) ; (x, t, r) ∈ R1 × R+ × R+ law x/h1/(β−1) = hLt/hβ/(β−1) , hβ/(β−1) τ (r) ; (x, t, r) ∈ R1 × R+ × R+ . The equivalence (11.6) follows from this. To study the most visited sites of a process we consider the supremum of its local time over all of its range. We also need to consider the range in which the local time is not too big, the less frequently visited sites of the process. We do this in the next lemma. We consider Lτ· (t) rather than Lt· in order to use the the Generalized Second Ray–Knight Theorem, Theorem 8.2.2. Lemma 11.1.2 Let X = {Xt ; t ≥ 0} be a symmetric stable process of index 1 < β ≤ 2 in R1 and denote its local times by L = {Lxt , (x, t) ∈ R1 × R+ }. Let 1/(β−1) t . (11.10) ha (t) = | log t|a Then, for all a > 0, lim
sup
t→∞ 0≤|x|≤h (t) a
| log t|a/2 (Lxτ(t) − t) t(log | log t|)1/2
≤2
" Cβ (1 + a/2)
a.s. (11.11)
a.s.
(11.12)
In particular, for all γ < a/2, lim
sup
t→∞ 0≤|x|≤h (t) a
| log t|γ (Lxτ(t) − t) t
=0
Furthermore, both (11.11) and (11.12) hold with limt→∞ replaced by limt→0 .
Most visited sites of symmetric stable processes
500
Proof It follows from Theorem 8.2.2 that, for t fixed and > 0 sufficiently small,
P
sup 0≤|x|≤ha (t)
Lxτ(t) − t ≥ (1 + )λ
(11.13)
2 (x) η ≥ (1 + )λ Lxτ(t) − t + ≤P sup 2 0≤|x|≤ha (t)
√ 2 ≤P sup η (x)/2 + 2tη(x) ≥ (1 + )λ
0≤|x|≤ha (t)
≤P
√
sup
2tη(x) ≥ λ
+P
sup
0≤|x|≤ha (t)
0≤|x|≤h(t)
Let σ 2 (t) =
Eη 2 (x) = Cβ |ha (t)|β−1 = Cβ
sup
η (x) ≥ 2 λ . 2
0≤|x|≤ha (t)
t | log t|a
(11.14)
and a(t) = median
sup
η(x).
(11.15)
0≤|x|≤ha (t)
It follows from Corollary 5.4.5 and (7.88) that a(t) ≤ Dβ |ha (t)|(β−1)/2 = Dβ σ(t) for finite constants Dβ and Dβ . We now use (5.151) to get
√ 2tη(x) ≥ λ P sup
(11.16)
(11.17)
0≤|x|≤ha (t)
λ η(x) − a(t) ≥ √ − a(t) =P sup 2t 0≤|x|≤ha (t) λ a(t) ≤1−Φ √ . − 2tσ(t) σ(t)
For > 0 and 0 < δ < 1, let 1/2 (1 + )4t σ 2 (t) log | log t| . λ= δ
(11.18)
Using this (11.14) and (11.16) we see that the last term in (11.17) is less than or equal to
| log t|−(1+ )/δ
(11.19)
11.1 Preliminaries
501
for some > 0 and all t sufficiently large, and also for all t sufficiently small. Similarly
P
η 2 (x) ≥ 2 λ
sup
(11.20)
0≤|x|≤ha (t)
≤ 2P
sup
η(x) − a(t) ≥
√
2 λ − a(t)
0≤|x|≤ha (t)
√ 2 λ a(t) ≤1−Φ − σ(t) σ(t) ≤ exp −| log t|a/2 for all t sufficiently large and sufficiently small. Let tk = exp(k δ ). When t = tk , the right-hand sides of (11.20) and (11.17) (see (11.19)) are terms in a sequence, indexed by k, which converges. Therefore, by (11.13), we see that for k ≥ k0 (ω) for k0 (ω) sufficiently large, | log tk |a/2 (Lxτ(tk ) − tk ) Cβ (11.21) sup sup ≤ (1 + )2 1/2 δ tk (log | log tk |) k≥k0 (ω) 0≤|x|≤ha (tk ) almost surely, where → 0 as k0 (ω) = k0 (ω, ) ↑ ∞. Let tk−1 < t ≤ tk . We have | log tk |a/2 (Lxτ(t) − t) Cβ (tk − tk−1 )| log tk |a/2 sup + ≤ (1 +
)2 1/2 δ tk (log | log tk |)1/2 0≤|x|≤ha (t) tk (log | log tk |) (11.22) for k ≥ k0 (ω), for k0 (ω) sufficiently large, almost surely. The last term in (11.22) is less than δ . k 1−δ(1+a/2) (log k)1/2
(11.23)
We take δ = (1+a/2)−1 and obtain that, for all > 0 and tk−1 < t < tk , sup 0≤|x|≤ha (t)
| log tk |a/2 (Lxτ(t) − t) tk (log | log tk |)1/2
" ≤ 2(1 + ) Cβ (1 + a/2)
(11.24)
for k ≥ k0 (ω), for k0 (ω) sufficiently large, almost surely. It is easy to extrapolate between tk and tk−1 on the left-hand side of (11.24) to get (11.11). The limit (11.12) is obvious. Repeating the argument of the preceding three paragraphs with tk = exp(−kδ ), we see that (11.11) and (11.12) also hold when the limit is taken as t goes to zero.
Most visited sites of symmetric stable processes
502
We use L0t as a reference for supx Lxt . The next lemma gives a lower bound for the rate of growth of L0t . Lemma 11.1.3 Let X = {Xt ; t ≥ 0} be a symmetric stable process of index 1 < β ≤ 2 in R1 and denote its local times by L = {Lxt , (x, t) ∈ R1 × R+ }. Then, for all > 0, lim
| log t|1+ L0t
t→∞
t1/β
=∞
a.s.,
(11.25)
where 1/β + 1/β = 1. Furthermore, (11.25) also holds with limt→∞ replaced by limt→0 . The proof of this lemma is given in the next subsection.
11.1.1 Inverse local time of symmetric stable processes We continue studying X = {Xt ; t ≥ 0}, a symmetric stable process in R1 of index 1 < β ≤ 2, and denote its local times by L = {Lxt , (x, t) ∈ R1 × R+ } and its inverse local time at zero by τ = {τ (r) , r ∈ R+ }. As in Lemma 2.4.5, which is proved for Brownian motion, we can show that τ , which is a positive increasing stochastic process, has stationary and independent increments. In particular, it is a L´evy process. We show in Lemma 3.6.10 that for all α > 0, α (11.26) E 0 e−ατ (r) = e−r/u (0) , where uα (0) =
1 π
(see (4.84)). Let Dβ =
1 π
1 dλ α + |λ|β
(11.27)
1 dλ. 1 + |λ|β
(11.28)
Thus uα (0) = Dβ α−1/β and we can write (11.26) as −1 1/β E 0 e−ατ (r) = e−rDβ α .
(11.29)
Lemma 11.1.4 Let X = {Xt ; t ≥ 0} be a symmetric stable process of index 1 < β ≤ 2 in R1 and denote its local times by L = {Lxt , (x, t) ∈ R1 × R+ }. Let τ denote the inverse local time of {L0t , t ∈ R+ }. Then, for any > 0, lim sup r→∞
τ (r) rβ (log r)β+
=0
P0
a.s.
(11.30)
11.1 Preliminaries
503
Also, lim inf
(log log r)1/(β−1) τ (r) rβ
r→∞
≥ Dβ−β
β/β 1 1 β β
P0
a.s. (11.31)
Proof We have that P 0 (τ (r) ≥ x)
= P 0 e−1 ≥ e−τ (r)/x −1 ≤ P 0 1 − e−x τ (r) ≥ 1 − e−1 e = E 0 1 − e−τ (r)/x e−1 −1 1/β e = 1 − e−rDβ /x , e−1
(11.32)
where, for the last equality, we use (11.29). Since 1 − e−y ≤ y for all y ≥ 0, we have e r 0 P (τ (r) ≥ x) ≤ . (11.33) e − 1 Dβ x1/β Consequently,
P 0 τ (r) ≥ rβ (log r)β+ e 1 ≤ , e − 1 Dβ (log r)(1+ )
(11.34)
where = / β. Let rn = en . It follows from the Borel–Cantelli Lemma that τ (rn ) lim sup =0 P 0 a.s. (11.35) β n→∞ r (log r )β+ n n Using the fact that τ (r) is increasing and > 0 is arbitrary, we obtain (11.30). To obtain (11.31) we use (11.29) again and Chebyshev’s inequality to see that for any λ > 0, (11.36) P 0 (τ (r) ≤ x) = P 0 e−λτ (r) ≥ e−λx −1
≤ eλx e−rDβ λ . β r This is minimized by taking λ = , which gives Dβ xβ P 0 (τ (r) ≤ x) ≤ e−dβ r
β
1/β
/xβ−1
,
(11.37)
Most visited sites of symmetric stable processes 1/β 1/β 1 1 1/β . Hence where dβ = Dβ−1 β β
1/(β−1) β (1 + )dβ r 1 0 P τ (r) ≤ ≤ . 1/(β−1) (log r)1+ (log log r)
504
(11.38)
Using the Borel–Cantelli Lemma as in (11.35) and the fact that τ is increasing, we get (11.31). Proof of Lemma 11.1.3 τ (t) is a generalized inverse of L0t . Therefore, when τ (t) < ρ(t) for some strictly increasing function ρ, L0t ≥ ρ−1 (t). We get (11.25) from (11.30) because
tβ (log t)β+
−1
>
t1/β (log t)1+
(11.39)
for all t sufficiently large. We get the last sentence of Lemma 11.1.3 by law noting that {L0t , t ∈ R+ } = {t2/β L01/t , t ∈ R+ } (see Lemma 10.5.2). Remark 11.1.5 The same argument used just above in the proof of Lemma 11.1.3 gives, for the symmetric β-stable process, lim sup t→∞
L0t t1/β (log | log t|)1/β
β ≤D
P0
a.s.,
(11.40)
where β = Dβ β 1/β β 1/β . D
(11.41)
Furthermore, by scaling, as in the proof of Lemma 11.1.3, (11.40) also holds as t → 0. In fact, in both cases, (11.40) holds with an equal sign. References are given in Section 11.7.
11.2 Most visited sites of Brownian motion We begin with a simple upper bound for Brownian motion. Lemma 11.2.1 Let {Wt , t ≥ 0} be a standard Brownian motion. Then 2 P sup Wt < λ ≤ λ. (11.42) π 0≤t≤1 Proof
By the reflection principle, Lemma 2.2.11, we have λ 2 2 P sup Wt < λ = e−s /2 ds, π 0≤t≤1 0
(11.43)
11.2 Most visited sites of Brownian motion
505
which gives (11.42). A key element in the proof of Lemma 11.2.3 is the following probability that Brownian motion, starting from zero, lies below a given line. Lemma 11.2.2 Let {Wt ; t ≥ 0} be a standard Brownian motion. Then, for all a, b > 0, P (Ws < a + bs; ∀s ≥ 0) = 1 − e−cab
(11.44)
for some constant 0 < c < ∞ independent of a and b. In fact c = 2, but we skip this point since in our applications of this lemma we do not need to know the value of c. Proof
Let Ta = inf{s | Ws ≥ a + s} and set
g(a) = P (Ws ≥ a + s; for some s ≥ 0) = P (Ta < ∞).
(11.45)
Let a1 , a2 > 0. A path that hits the line s + a1 + a2 must first hit the line s + a1 and then move up to meet a line with slope 1 that is a2 units above the point where it first hit the line s + a1 . This implies that Ta1 +a2 = Ta2 ◦ θTa1 .
(11.46)
It follows from the strong Markov property that for all a1 , a2 > 0, g(a1 + a2 )
= P (Ta1 +a2 < ∞)
(11.47)
= P (Ta1 < ∞, Ta2 ◦ θTa1 < ∞) = g(a1 )g(a2 ). Note that g(1) ≥ P (W (1) ≥ 2) > 0 and g(a) is a nonincreasing function in a. The only solution of (11.47) with these properties is g(a) = e−ca for some constant 0 < c < ∞. Therefore, P (Ws < a + s; ∀s ≥ 0) = 1 − g(a) = 1 − e−ca .
(11.48)
law
Since {Wt ; t ≥ 0} = {b−1 Wsb2 ; t ≥ 0}, we have P (Ws < a + bs, ∀s ≥ 0)
= P (b−1 Wsb2 < a + bs, ∀s ≥ 0) (11.49) 2 = P Wsb2 < ab + sb , ∀s ≥ 0 = P (Wt < ab + t, ∀t ≥ 0) ,
and (11.44) now follows from (11.48). We can now use the Second Ray–Knight Theorem to study the behavior of the local time of Brownian motion near zero.
Most visited sites of symmetric stable processes
506
Lemma 11.2.3 Let {Wt ; t ≥ 0} be a standard Brownian motion and denote its local times by L = {Lxt , (x, t) ∈ R1 × R+ }. Let τ ( · ) denote the inverse local time of L0· . Then
P0
sup Lxτ(1) ≤ 1 + λ
|x|≤1
≤ d2 λ2 | log λ|4
∀ λ > 0,
(11.50)
where d2 is a constant. Proof Let η = {η(x) ; x ∈ R1 } be a√two-sided Brownian motion independent of {Wt ; t ≥ 0}. Recall that 2η is the mean zero Gaussian process with covariance uT0 (x, y) given in Lemma 2.5.1. Since {η(x) ; x > 0} and {η(−x) ; x > 0} are independent and identically distributed, it follows from the Generalized Second Ray–Knight Theorem (Theorem 8.2.2) 2 that {Lxτ(1) + η 2 (x) ; x > 0} and {L−x τ (1) + η (−x) ; x > 0} are independent and identically distributed. This implies that {Lxτ(1) ; x > 0} and {L−x τ (1) ; x > 0} are independent and identically distributed (this can be seen easily using characteristic functions). Since L0τ (1) = 1, we see that to prove (11.50) it suffices to show that (11.51) P 0 sup Lxτ(1) ≤ 1 + λ ≤ dλ| log λ|2 . 0≤x≤1
Using (11.5) we have hλ
=
P
sup η 2 (x) ≤ cλ
:= P
(11.52)
0≤x≤λ
sup η 2 (x) ≤ c .
0≤x≤1
Therefore, for c > 1 sufficiently large, hλ = h1 > 2/3.
(11.53)
Also, by L´evy’s uniform modulus of continuity for Brownian motion (see (2.15) and Example 7.2.16) we can choose c sufficiently large so that P η 2 (x) ≤ cx | log λ|; λ < x ≤ 1 > 2/3. (11.54) Set
fλ (x) =
λ x | log λ|
0≤x≤λ λ < x ≤ 1.
Using (11.53) and (11.54) we have that, for all 0 < λ < 1, mλ := P η 2 (x) ≤ cfλ (x); 0 ≤ x ≤ 1 > 1/3.
(11.55)
(11.56)
11.2 Most visited sites of Brownian motion
507
Since L and η are independent, it follows from (11.56) that for all λ > 0 sufficiently small, 0 x sup Lτ (1) ≤ 1 + λ P (11.57) 0≤x≤1
0 x 2 = m−1 λ P × Pη Lτ (1) ≤ 1 + λ and η (x) ≤ cfλ (x); 0 ≤ x ≤ 1 ≤ 3 P 0 × Pη Lxτ(1) + η 2 (x) ≤ 1 + 2cfλ (x); 0 ≤ x ≤ 1 . (Note that the constant d in (11.50) is arbitrary. Without loss of generality we can take λ < 1/e so that λ ≤ fλ (x) on [0, 1].) It then follows from the Generalized Second Ray–Knight Theorem (Theorem 8.2.2) that (11.58) P 0 sup Lxτ(1) ≤ 1 + λ 0≤x≤1
√ 2 ≤ 3P η(x) + 1 ≤ 1 + 2cfλ (x); 0 ≤ x ≤ λ 2 ≤ 3P η (x) + 2η(x) ≤ 2cfλ (x); 0 ≤ x ≤ 1 ≤ 3P (η(x) < 2cfλ (x); 0 ≤ x ≤ 1) . √ (Note that η in√(8.46) is 2 times the η we are using here.) If η(λ) > − λ| log λ|, the event {η(x) < 2cfλ (x); 0 ≤ x ≤ 1} is contained in the event √ {η(x) < 2cλ; 0 ≤ x ≤ λ}∩{η(x)−η(λ) < 2cfλ (x)+ λ| log λ|;λ ≤ x ≤ 1}. (11.59) Since by (5.18) √ 2 P η(λ) > − λ| log λ| ≤ e−| log λ| /2 , (11.60) we have P (η(x) < 2cfλ (x); 0 ≤ x ≤ 1) 2 ≤ e−| log λ| /2 + P sup η(x) < 2cλ 0≤x≤λ
P η(x) − η(λ) < 2cx| log λ| +
√
(11.61)
λ| log λ|; λ ≤ x ≤ 1 .
It follows from Lemma 11.2.1 and scaling that √ λ P sup η(x) < 2cλ = O
(11.62)
0≤x≤λ
at zero. Finally we show below that √ √ P η(x) − η(λ) < 2cx| log λ| + λ| log λ|; λ ≤ x ≤ 1 ≤ d λ| log λ|2 (11.63)
508
Most visited sites of symmetric stable processes
for λ sufficiently small. Using this, (11.58), (11.61), and (11.62), we get (11.51). To obtain (11.63) we note that when λ < 1/2, the left-hand side of (11.63) equals √ P η(x − λ) < 2cx | log λ| + λ| log λ| ; λ < x ≤ 1 (11.64) √ ≤ P η(x) < 2cx | log λ| + 3c λ| log λ| ; 0 < x ≤ 1/2 . √ Let A denote the event {η(x) < 2cx | log λ| + 3c λ| log λ| ; 0 < x ≤ 1/2}. Note that P (A)P (η(x) − η(1/2) < (x − 1/2) | log λ| + 1 ; 1/2 < x)
(11.65)
= P (A ∩ {η(x) − η(1/2) < (x − 1/2) | log λ| + 1 ; 1/2 < x}) √ ≤ P A ∩ {η(x) < (x − 1/2 + c) | log λ| + 1 + 3c λ| log λ| ; 1/2 < x} √ because {η(1/2) ∈ A} implies that η(1/2) < c | log λ| + 3c λ| log λ|. We can absorb the constant 1 into the term c| log λ| so that the last line in (11.65) is less than or equal to √ (11.66) Pη η(x) < 4cx | log λ| + 4c λ| log λ| ; ∀x ≥ 0 √
a = 1 − e−d λ| log λ| √ ≤ d λ| log λ|2 ,
2
where we use Lemma 11.2.2 to get the second line. Lemma 11.2.2 also implies that Pη (η(x) − η(1/2) < (x − 1/2) | log λ| + 1 ; 1/2 < x) (11.67) = 1 − e−d| log λ| ≥ 1/2 for d > 2. The inequality (11.63) follows from (11.64)–(11.67). Lemma 11.2.4 Let {Wt ; t ≥ 0} be a standard Brownian motion and denote its local times by L = {Lxt , (x, t) ∈ R1 × R+ }. Let τ ( · ) denote the inverse local time of L0· . Then, for any > 0 and all v > 0,
P0
sup Lsτ (v) − v ≤ λv
|s|≤vλ
≤ d2 λ2 | log λ|4
for some constant d < ∞. Therefore, in particular, P 0 L∗τ (v) − v ≤ λv ≤ d2 λ2 | log λ|4 ,
(11.68)
(11.69)
11.2 Most visited sites of Brownian motion
509
where L∗· := supx Lx· is the maximal local time. Proof
By (11.6), law
Lxτ(1) =
1 xv . L v τ (v)
(11.70)
Using this in (11.50) we get (11.68). Lemma 11.2.5 Let {Wt ; t ≥ 0} be a standard Brownian motion and denote its local times by L = {Lxt , (x, t) ∈ R1 × R+ }. Let τ ( · ) denote the inverse local time of L0· . Then, for all b > 1, (log r)b supx Lxτ(r) − r =∞ (11.71) lim r→∞ r and | log r|b supx Lxτ(r) − r = ∞. (11.72) lim r→0 r To better appreciate these results, recall that r = L0τ (r) , so we are actually looking at the deviation of the maximal local time from its value at zero. Proof Let vn = en for 0 < a < 1 and λ = n−(1+)/2 in (11.69). It follows from the Borel–Cantelli Lemma that vn (11.73) L∗τ (vn ) − vn ≥ (1+)/2 n a
for all n ≥ n0 (ω) on a set of probability one, where ω denotes the path of Brownian motion. Let vn ≤ r ≤ vn+1 . Then, restricted to this set, vn (11.74) L∗τ (r) − r ≥ −(r − vn ) + (1+)/2 n for all n ≥ n0 (ω) and L∗τ (r) − r r
≥−
vn+1 − vn 1 vn + vn vn+1 n(1+)/2
(11.75)
for all n ≥ n0 (ω). It now follows by a simple calculation that L∗τ (r) − r r
≥−
2 (1 − δ) + (1+)/2 n1−a n
(11.76)
for all δ > 0 and for all n ≥ n0 (ω). It is easy to see that if 1 − a > 1/2, then for small enough, 1 − a > (1 + )/2. Thus L∗τ (r) − r r
≥
(1 − δ) n(1+)/2
(11.77)
510
Most visited sites of symmetric stable processes
for all n ≥ n0 (ω) and vn ≤ r ≤ vn+1 on a set of probability one. For r in this range, n ∼ (log r)1/a . Since we can take a arbitrarily close to 1/2, we get (11.71). a We get (11.155) from (11.158) by taking vn = e−n for 0 < a < 1 and repeating the rest of the above argument. We can now prove the “most visited sites” limit theorem for Brownian motion discussed in Section 2.9. Theorem 11.2.6 Let {Wt ; t ≥ 0} be a standard Brownian motion and denote its local times by L = {Lxt , (x, t) ∈ R1 × R+ }. Let Vt be given by (11.2). Then, for any γ > 3, we have lim
(log t)γ Vt = ∞ t→∞ t1/2
P 0 a.s.
(11.78)
| log t|γ Vt = ∞ t→0 t1/2
P 0 a.s.
(11.79)
and lim
In particular, (11.78) tells us that limt→∞ Vt = ∞, that is, the process {Vt , t ∈ R+ } is transient. Display (11.79) shows that although limt→0 Vt = 0, there is a positive lower bound on the rate at which it decreases to zero. We see that starting from zero, at least for a little while, zero is not the favorite point of a Brownian motion. Proof have
We first prove (11.78). Let t ∈ [τ (r− ), τ (r)]. By (11.71) we sup Lxt > r + r/(log r)b
(11.80)
x
for all r sufficiently large, for any b > 1. On the other hand, let ha be as defined in (11.10) with a = 2(1 + )b, for some > 0. It follows from (11.12) with γ = b and using L0τ (r) = r that sup 0≤|x|≤ha (L0t )
Lxt
≤
sup 0≤|x|≤ha (r)
<
Lxτ(r)
(11.81)
r + r/(log r)b
for all r sufficiently large. Comparing (11.80) and (11.81), we see that for all t sufficiently large, Vt > ha (L0t ).
(11.82)
11.3 Reproducing kernel Hilbert spaces
511
Using Lemma 11.1.3 and the fact that ha is increasing, this shows that
t1/2 (11.83) Vt > ha (1+) (log t) t1/2
≥
(log t)
(2b+1)(1+)
.
Since we can take b arbitrarily close to 1 and arbitrarily close to 0, we get (11.78). The main ingredients in the above proof, (11.71), (11.12), and Lemma 11.1.3 have corresponding versions as t → 0. Using them and following the above proof, we get (11.79).
11.3 Reproducing kernel Hilbert spaces of fractional Brownian motion Our goal is to obtain a version of Theorem 11.2.6 for the local times of symmetric stable processes of index 1 < β < 2. The main difficulty in extending the results of the preceding section is to find an analog of the elementary Lemma 11.2.1. This is not simple and requires a detour through reproducing kernel Hilbert spaces of fractional Brownian motion and the Cameron–Martin formula. In Theorem 5.3.1 we define and prove the existence of reproducing kernel Hilbert spaces. In Section 5.3 they are used to obtain the Karhunen– Lo´eve expansion of Gaussian processes. In this section we obtain explicit descriptions of the reproducing kernel Hilbert spaces of fractional Brownian motion. In order to motivate the work in this section it is useful to obtain the well-known explicit description of the reproducing kernel Hilbert space of Brownian motion, B = {B(t), t ∈ R+ }. Let L2 = L2 (R+ , λ) of realvalued functions, where λ indicates Lebesgue measure, and denote its norm by · 2 , that is, g ∈ L2 is a measurable function with ∞ 1/2 2 |g(λ)| dλ < ∞. (11.84) g2 = 0
For g ∈ L let 2
x
g(λ) dλ.
Ig(x) :=
(11.85)
0
It is easy to see by the Cauchy–Schwarz inequality that Ig exists and is uniformly continuous. Consider the linear space H = {Ig, g ∈ L2 } equipped with the norm
512
Most visited sites of symmetric stable processes
Ig := g2 . This norm is well defined since Ig ≡ 0 implies that g2 = 0. To see this, note that restricted to [0, x], for any x > 0, g is integrable so that Ig is absolutely continuous and has g as a derivative almost everywhere. This implies that g = 0 almost everywhere. In fact, (H, Ig) is a separable Hilbert space of real-valued continuous functions. To see that it is complete we note that if {Ign } is a Cauchy sequence in H, then {gn } is a Cauchy sequence in L2 . Therefore, there exist some g ∈ L2 such that gn → g in L2 . This means that Ign → Ig in H. Also, H is separable because L2 is separable. We now show that Γ(t, · ) ∈ H
∀ t ∈ R1
(11.86)
and (Ig( · ), Γ(t, · )) = Ig(t)
∀ Ig ∈ H,
where (·, ·) denotes the inner product on H. This is elementary since the covariance of B, x Γ(t, x) = t ∧ x = 1[0,t] (s) ds,
(11.87)
(11.88)
0
and obviously 1[0,t] (·) ∈ L2 . Furthermore, (Ig( · ), Γ(t, · )) = g( · ), 1[0,t] (·) 2 = Ig(t),
(11.89)
where (·, ·)2 denotes the inner product on L2 . Therefore, by Theorem 5.3.1, H = {Ig, g ∈ L2 } equipped with the norm Ig := g2 is the reproducing kernel Hilbert space of B. It is customary to refer to H as the space of continuous functions f , with f (0) = 0, that are absolutely continuous with respect to Lebesgue measure, equipped with the norm 1/2 , where f denotes the derivative of f . |f (s)|2 ds The key point in the above description of the reproducing kernel Hilbert space of Brownian motion is that the covariance of Brownian motion has a bounded derivative almost everywhere. This property is not shared by most of the Gaussian processes we are interested in. Nevertheless, we get a hint about what norm might define the reproducing kernel Hilbert space of Gaussian processes in general. We note that if . exists and, by Ig above is also in L2 (R+ ), then its Fourier transform Ig Parseval’s Theorem, ∞ ∞ 1 2 . |g(s)|2 ds = |λ|2 |Ig(s)| ds. (11.90) 2π −∞ 0
11.3 Reproducing kernel Hilbert spaces
513
This suggests that the reproducing kernel Hilbert space norms of Gaussian processes might be function spaces defined by different weightings of their Fourier transforms. We now develop this insight. Consider fractional Brownian motion as defined in (11.3). We give a function space description of its reproducing kernel Hilbert space. Consider the complex Hilbert space L2β = L2C (R1 , (|λ|β /(2π)) dλ), 1 < β < 2, and denote its norm by · 2,β , that is, g ∈ L2β is a measurable function with 1/2 1 <∞ (11.91) g2,β = |g(λ)|2 |λ|β dλ 2π (we use the subscript C to indicate that the functions in L2C are complex valued). If g ∈ L2β , let 1 (1 − e−ixλ )g(λ) dλ. Ig(x) := (11.92) 2π It follows from the Cauchy–Schwarz inequality that Ig exists and is uniformly continuous, since
1
(11.93) |Ig(x) − Ig(y)| = (e−iyλ − e−ixλ )g(λ) dλ 2π 1/2 −iyλ 1 − e−ixλ |2 |e ≤ g2,β dλ 2π λβ and
|e−iyλ − e−ixλ |2 dλ λβ
(|x − y|λ ∧ 1)2 dλ λβ ≤ Cβ |x − y|β−1 .
≤
(11.94)
Let f ∈ C0∞ and let f denote its Fourier transform. Recall that the Fourier transform of a C0∞ function is rapidly decreasing, that is, lim|λ|→∞ |λ|k |f(λ)| = 0 for all k > 0. Therefore, f ∈ L2β and 1 (1 − e−ixλ )f(λ) dλ = f (0) − f (x). I f(x) = (11.95) 2π Lemma 11.3.1 The linear space H = {Ig : g ∈ L2β }
(11.96)
IgH := g2,β
(11.97)
equipped with the norm
is a separable complex Hilbert space of continuous functions.
514
Most visited sites of symmetric stable processes
Furthermore, Γ(t, · ) ∈ H
∀ t ∈ R1
(11.98)
and (f ( · ), Γ(t, · ))H = f (t)
∀ f ∈ H,
(11.99)
where (·, ·)H denotes the inner product on H, and Γ, which was given in (11.4), is the covariance of fractional Brownian motion of index β − 1. Let Kβ−1 denote the reproducing kernel Hilbert space of fractional Brownian motion of order β − 1, 1 < β < 2, and let · Kβ−1 denote the norm on Kβ−1 . Then Kβ−1 = {f ∈ H | f = f¯}
(11.100)
f Kβ−1 = f H .
(11.101)
and for f ∈ Kβ−1
Proof We first show that I : L2β → C(R1 ) is one–one. Since I is linear, to show this it suffices to show that if (1 − e−ixλ )g(λ) dλ = 0, ∀x ∈ R1 , (11.102) then g2,β = 0. Given (11.102), it follows by (11.95) and Fubini’s Theorem that for all f ∈ C0∞ , 1 (1 − e−ixλ )f(λ)g(x) dλ dx (f (x) − f (0))g(x) dx = 2π = 0. (11.103) In particular, taking f (λ) = λh(λ) for h ∈ C0∞ , we have 1 λh(λ)g(λ) dλ = 0 2π
(11.104)
for all h ∈ C0∞ . This implies that λg(λ) = 0 almost everywhere, so that g2,β = 0. To see that H is complete we note that if {Ign } is a Cauchy sequence in H, then {gn } is a Cauchy sequence in L2β . Therefore, there exist some g ∈ L2β such that gn → g in L2β . This means that Ign → Ig in H. Also, H is separable because L2β is separable. Since |λ|β is symmetric we have 1 (1 − eisλ ) Cβ β−1 |s| dλ = , (11.105) β 2π |λ| 2
11.3 Reproducing kernel Hilbert spaces
515
where Cβ is given in (7.193). Consequently, by (11.4), Cβ β−1 (11.106) + |t|β−1 − |t − s|β−1 |s| 2 1 (1 − eisλ ) (1 − eitλ ) (1 − ei(s−t)λ ) = dλ + dλ − dλ 2π |λ|β |λ|β |λ|β (1 − eisλ )(1 − e−itλ ) 1 = dλ. 2π |λ|β
Γ(s, t) =
Let
1 − eisλ . gs (λ) = |λ|β
(11.107)
Clearly gs ∈ L2β and, by (11.106), Γ(s, t) = Igs (t). Thus we get (11.98). Furthermore, (Ig, Γ(s, ·))H
=
(Ig, Igs )H
(11.108)
(g, gs )2,β 1 = gs (λ)g(λ) |λ|β dλ 2π = Ig(s).
=
This gives us (11.99). Given (11.98), (11.99), and the fact that for any complex Hilbert space of functions H, {f ∈ H | f = f¯} is a real Hilbert space, the last part of the lemma follows from Theorem 5.3.1. The following Corollary of Lemma 11.3.1 is used in Section 11.5. Corollary 11.3.2 Let f ∈ C0∞ be a real-valued function and let f denote its Fourier transform. Then f − f (0) ∈ Kβ−1 for all 1 < β < 2 and 1/2 1 |λ|β |f(λ)|2 dλ f − f (0)Kβ−1 = < ∞. (11.109) 2π Proof
This follows from (11.95) and (11.101).
Remark 11.3.3 When 1< β < 2, 1 − cos(λy) 1 β |λ| = dy, π Cβ+1 R1 |y|1+β where Cβ+1 is given in (7.193). Consequently, 1 1 − cos(λy) 2 |λ|β |f(λ)|2 dλ = |f (λ)| dλ dy π Cβ+1 |y|1+β
(11.110)
(11.111)
516
Most visited sites of symmetric stable processes 1 1 = |(eiλy/2 − e−iλy/2 )f(λ)|2 dλ dy 2π Cβ+1 |y|1+β |f (y) − f (x)|2 1 dy dx, = Cβ+1 |y − x|1+β
where for the last equation we use Parseval’s Theorem and a change of variables. This shows that that the norm f2,β is equivalent to a smoothness condition on f .
11.4 The Cameron–Martin Formula The Cameron–Martin Formula is an isomorphism theorem that expresses a Gaussian process translated by an element of its reproducing kernel Hilbert space in terms of the untranslated process, but with a change of measure, as in the isomorphisms of Chapter 8. Let {Gt ; t ∈ T } be a Gaussian process on the probability space (Ω, F, P ) with continuous covariance Γ(s, t) = E(Gs Gt )
(11.112)
and set fs (t) = Γ(s, t). In the proof of Theorem 5.3.1 we show that the reproducing kernel Hilbert space H(Γ) is the closure of S, the span of the functions {fs ; s ∈ R1 } with respect to the inner product (fs , ft ) = Γ(s, t) = fs (t).
(11.113)
By (5.76) we have that, for any f ∈ H(Γ), f (s) = (f, fs ).
(11.114)
In the proof of Theorem 5.3.2 we show that the linear map ΘP : S → L2 (P ) defined by setting ΘP (fr ) = Gr extends to an isometry that maps H(Γ) onto the Hilbert space obtained by taking the closure of the span of {Gt ; t ∈ T } in L2 (Ω, F, P ) with respect to expectation. For any f ∈ H(Γ) let G(f ) := ΘP (f ). Thus EG2 (f ) = (f, f )
(11.115)
EG(f )Gs = (f, fs ) = f (s).
(11.116)
and, in particular
Theorem 11.4.1 (Cameron–Martin Formula) Let {Gt ; t ∈ T } be a mean zero Gaussian process with continuous covariance Γ and let H(Γ)
11.4 The Cameron–Martin Formula
517
be the reproducing kernel Hilbert space generated by Γ. Let f ∈ H(Γ). Then, for any measurable functional F (G· ), E(F (G· + f (·))) = e−f
2
E(F (G· ) eG(f ) ),
/2
(11.117)
where · is the norm on H(Γ). Proof It suffices to prove that for any finite subset A ⊂ T ,
−f 2 /2 G(f ) E (Gt + f (t)) = e E e Gt . t∈A
(11.118)
t∈A
Clearly
E (Gt + f (t)) =
t∈A
f (t) E
t∈B
B⊆A
Gt
. (11.119)
t∈A−B
On the other hand, by Lemma 5.2.6 and (11.116),
∞ 1 E eG(f ) = (11.120) E Gk (f ) Gt Gt k! t∈A t∈A k=0 ∞ k 1 k (k − j)! = E(Gj (f )) k! j=0 j k=0 B⊆A, |B|=k−j
f (t) E Gt . t∈B
t∈A−B
j (Note that Lemma apply here since E(G (f )) = 0 when j , 5.2.6 does is odd and E t∈A−B Gt = 0 when j is even and k is odd.) Since (2j)! E(G2j (f ))= f 2j , we can write (11.120) as j!2j Gt ) (11.121) E(eG(f ) t∈A
=
k/2 ∞ k=0 j=0
=
∞ ∞ f 2j j=0
=
1 E(G2j ) (2j)!
j!2j
j=0
j!2j
B⊆A, |B|=k−2j
k=2j B⊆A, |B|=k−2j
∞ f 2j B⊆A
t∈B
f (t) E
f (t) E
t∈B
f (t) E
t∈B
Gt
t∈A−B
Gt
t∈A−B
Gt
.
t∈A−B
Comparing (11.119) and (11.121), we see that we have obtained (11.118).
Most visited sites of symmetric stable processes
518
Alternative proof of Theorem 11.4.1 It suffices to prove that, for any finite subset {t1 , . . . , tn } ⊂ T and bounded continuous function F on Rn , E (F (Gt1 + f (t1 ), . . . , Gtn + f (tn ))) 2 = e−f /2 E eG(f ) F (Gt1 , . . . , Gtn )) . Assume first that f (t) = matrix
n i=1
(11.122)
ai Γ(ti , t) for ai ∈ R1 . Define the n × n
Ci,j = {C(n)}i,j = Γ(ti , tj ).
(11.123)
Without loss of generality we can assume that C −1 exists. It follows from (4.10) that E (F (Gt1 + f (t1 ), . . . , Gtn + f (tn ))) 1 √ F (z1 + f (t1 ), . . . , zn + f (tn )) = (2π)n/2 det C n n −1 − Ci,j zi zj /2 i,j=1 e dzi =
i=1
1 √
F (z1 , . . . , zn )
(2π)n/2 det C −
n
e Using the fact that f (tj ) = n
(11.124)
i,j=1
n i=1
−1 Ci,j (zi −f (ti ))(zj −f (tj ))/2
n
dzi .
i=1
Cj,i ai , we have
−1 Ci,j (zi − f (ti ))(zj − f (tj ))
(11.125)
i,j=1
=
n
−1 Ci,j zi zj
i,j=1
−2
n
zi ai +
i=1
n
Ci,j ai aj .
i,j=1
Note that by (5.76), f 2 = (f, f ) =
n i,j=1
ai aj (fi , fj ) =
n
Ci,j ai aj .
(11.126)
i,j=1
Combining the last three displays and using (4.10) once more, we have E (F (Gt1 + f (t1 ), . . . , Gtn + f (tn ))) 2 e−f /2 √ = F (z1 , . . . , zn ) (2π)n/2 det C
(11.127)
11.5 Fractional Brownian motion n n n −1 Ci,j zi zj /2 zi ai − i,j=1 i=1 e e dzi
519
i=1
=e
−f 2 /2
Since G(f ) = G
n . E F Gt1 , . . . , Gtn ) e i=1 ai Gti
n
ai Γ(ti , t)
i=1
=
n
ai Gti ,
(11.128)
i=1
n we see that (11.127) is (11.122) in the case when f (t) = i=1 ai Γ(ti , t). Furthermore, since n is arbitrary, we actually have (11.122) for any f (t) in the span of {Γ(t, s) ; s ∈ T }. To get (11.122) for general f ∈ H(Γ), note that we can find a sequence {fk } in the the span of {Γ(t, s) ; s ∈ T } such that fk → f in H(Γ). By (11.114), fk (tj ) → f (tj ) for j = 1, . . . , n. Also, since ΘP is an isometry, G(fk ) → G(f ) in L2 , and since {G(fk )} and G(f ) are Gaussian, this implies that exp(G(fk )) → exp(G(f )) in L2 . Of course, fk → f . Thus (11.122) for f follows from (11.122) for the fk , which we have proved.
11.5 A probability estimate for fractional Brownian motion We can now give the analogue of Lemma 11.2.1, which enables us to find a version of Theorem 11.2.6 for the local times of symmetric stable processes. Lemma 11.5.1 For fractional Brownian motion of index 0 < γ ≤ 1 we have that, for all > 0 sufficiently small,
P
sup η(x) < λ
|x|≤1
≤ dλ2(1−)/γ
(11.129)
for some constant d < ∞. Although this estimate also applies to two-sided Brownian motion, it is not strong enough to yield Theorem 11.2.6. In fact, for Brownian motion we get 2λ2 /π on the right-hand side of (11.129). This follows from Lemma 11.2.1 and the independence of Brownian motion on R+ and (−∞, 0). Proof We use the Cameron–Martin Formula. It is clear, since d is arbitrary, that we only need to prove (11.129) for all 0 < λ ≤ λ0 for some λ0 sufficiently small. Let f ∈ C0∞ be an even function supported
Most visited sites of symmetric stable processes
520
on [−1, 1] that is strictly decreasing on [0, 1] with f (0) = 1 and f (1) = 0. For 0 < h ≤ 1, set f (h) (x) = hf (x/h2/γ ) and f¯(h) (x) = h − f (h) (x) so that in particular f¯(h) (0) = 0. We have
P sup η(x) < h = P η(x) − f¯(h) (x) < f (h) (x); |x| ≤ 1 . (11.130) |x|≤1
Let Kγ denote the reproducing kernel Hilbert space for η. By the Cameron–Martin Formula (Theorem 11.4.1), (11.131) P η(x) − f¯(h) (x) < f (h) (x); |x| ≤ 1 = E 1[η(x)−f¯(h) (x)
(11.133)
for all h > 0. Therefore, by taking q = 1/(1 − ), we see that to obtain (11.129) we need only show that P η(x) < f (h) (x); |x| ≤ 1 ≤ d h2/γ (11.134) for some constant d < ∞. Consider the event {η(x) < f (h) (x); |x| ≤ 1}. For this to occur we must have that η(x) < 0 outside of (−h2/γ , h2/γ ), the support of f (h) (x). In addition we have η(0) = 0. Consequently, P η(x) < f (h) (x); |x| ≤ 1 (11.135)
≤P
sup η(y) > |y|
≤P
sup η(y) − η(−1) >
|y|
=P
η(y)
sup h2/γ ≤|y|≤1
sup |y−1|
η(y) >
sup
h2/γ ≤|y|≤1
sup h2/γ ≤|y−1|≤1
η(y) − η(−1)
η(y)
11.5 Fractional Brownian motion
=P
η(y) >
sup
521 η(y) ,
sup 0≤y≤1−h2/γ ,1+h2/γ ≤y≤2
1−h2/γ
where the next to last equality follows from the fact that η has stationary increments. Set y ∈ [0, 2] (11.136) F (y) = P sup η(x) > sup η(x) 0≤x
y≤x≤2
and note that the last term in (11.135) is equal to F 1 + h2/γ − F 1 − h2/γ .
(11.137)
Since F (y) is nondecreasing in y, we have that F (y) is differentiable almost everywhere on [0, 2]. Pick 3/2 < y0 < 2 such that F (y) is differentiable at y0 and set ψ(y0 ) = F (y0 ). Note that 0 ≤ ψ(y0 ) < ∞. We then have that, for some δ0 > 0, F ((1 + δ)y0 ) − F ((1 − δ)y0 ) ≤ (ψ(y0 ) + 1)4δ
δ ≤ δ0 .
(11.138)
By (11.5) we have that for any 0 ≤ z ≤ 2 and k > 0, F (z) = P sup η(x) > sup η(x) . 0≤x
(11.139)
kz≤x≤2k
Taking k = y0 we see that F (1 + δ) − F (1 − δ)
=P
sup (1−δ)y0 ≤x<(1+δ)y0
(11.140) η(x) >
≤P
sup (1−δ)y0 ≤x<(1+δ)y0
η(x) >
sup 0≤x<(1−δ)y0 ,(1+δ)y0 ≤x≤2y0
sup 0≤x<(1−δ)y0 ,(1+δ)y0 ≤x≤2
η(x)
η(x)
= F ((1 + δ)y0 ) − F ((1 − δ)y0 ) . Using (11.140) and (11.138), we see that (11.137) is bounded by dh2/γ for some constant d > 0. Thus (11.134) follows from (11.135). Lemma 11.5.2 Let f be an even function in C0∞ and let f (h) (x) = hf (x/h2/(β−1) ), 0 < h ≤ 1. Then f (h) − f (h) (0) Kβ−1 = f − f (0) Kβ−1 < ∞.
(11.141)
Most visited sites of symmetric stable processes
522
(h)(λ) = h1+2/(β−1) f(λ h2/(β−1) ), so that, by (11.109), Proof Note that f/
1 2+4/(β−1) h |λ|β |f(λ h2/(β−1) )|2 dλ 2π 1 h2+4/(β−1) = |λ|β |f(λ)|2 dλ 2π h(1+β)2/(β−1) (11.142) = f − f (0) 2Kβ−1 < ∞.
f (h) − f (h) (0) 2Kβ−1
=
Corollary 11.5.3 For fractional Brownian motion of index 0 < γ ≤ 1, we have that for any constants c, > 0,
P
≤ dλ(1−)/γ
η(x) < cλ
sup |x|≤λ1/γ
(11.143)
for some constant d < ∞. Proof
By changing the constant d in (11.129) we can write it as
P
sup η(x) < cλ
|x|≤1
≤ dλ2(1−)/γ .
(11.144)
Furthermore, by (11.5),
P
sup
η(x) < cλ
|x|≤λ1/γ
=P
1/2
sup η(x) < cλ
|x|≤1
.
(11.145)
Thus we get (11.143). Remark 11.5.4 Fractional Brownian motion was defined in (11.3). Consider the broader class ηC = {ηC (s), s ∈ R1 } of mean zero Gaussian processes with stationary increments and E |ηC (s) − ηC (t)|2 = C|s − t|β−1 ,
(11.146)
where β ∈ (0, 2] and C > 0. The scaling property (11.5) holds for all ηC . Furthermore, since d in (11.129) is arbitrary, it is clear that if (11.129) holds for some ηC , then it holds for all ηC , where, of course, the constant d depends on C. This also applies to (11.143).
11.6 Most visited sites of symmetric stable processes
523
11.6 Most visited sites of symmetric stable processes We dealt with this question for Brownian motion in Section 11.2. Therefore, although some of the results in this section apply to Brownian motion, in this section we are really only interested in symmetric stable processes with index 1 < β < 2. We begin with a probability estimate for the local times of symmetric stable processes. Lemma 11.6.1 Let X = {Xt ; t ≥ 0} be a symmetric stable process of index 1 < β ≤ 2 in R1 and denote its local times by L = {Lxt , (x, t) ∈ R1 × R+ }. Let τ ( · ) denote the inverse local time of L0· . Then, for any
> 0,
P0
sup |x|≤λ1/(β−1)
Lxτ(1) ≤ 1 + λ
≤ dλ(1−)/(β−1)
(11.147)
for some d < ∞. Proof By (11.5), the scaling property of fractional Brownian motion, we have
hλ
:= Pη
sup |x|≤λ1/(β−1)
η 2 (x) ≤ 2λ
=
Pη
(11.148)
sup η 2 (x) ≤ 2
|x|≤1
so that hλ = h1 > 0. Consequently,
P0
sup |x|≤λ1/(β−1)
1 0 P × Pη = h1 1 0 P × Pη ≤ h1
Lxτ(1) ≤ 1 + λ
(11.149)
sup
|x|≤λ1/(β−1)
Lxτ(1)
≤ 1 + λ and
Lxτ(1)
1 + η 2 (x) ≤ 1 + 3λ 2
sup |x|≤λ1/(β−1)
η (x) ≤ 2λ 2
sup |x|≤λ1/(β−1)
.
Therefore, by Theorem 8.2.2, the Generalized Second Ray–Knight Theorem,
P0
sup |x|≤λ1/(β−1)
Lxτ(1) ≤ 1 + λ
(11.150)
Most visited sites of symmetric stable processes
√ 2 ηx + 2 1 ≤ ≤ 1 + 3λ sup Pη h1 2 |x|≤λ1/(β−1)
1 η 2 (x) + η(x) ≤ 3λ sup ≤ Pη h1 2 |x|≤λ1/(β−1)
1 ≤ Pη η(x) ≤ 3λ . sup h1 |x|≤λ1/(β−1)
524
The inequality (11.147) now follows from Lemma 11.5.3. Lemma 11.6.2 Let X = {Xt ; t ≥ 0} be a symmetric stable process of index 1 < β ≤ 2 in R1 and denote its local times by L = {Lxt , (x, t) ∈ R1 × R+ }. Let τ ( · ) denote the inverse local time of L0· . Then, for any
> 0 and all v > 0,
P0
sup |s|≤(vλ)1/(β−1)
Lsτ (v) − v ≤ λv
≤ dλ(1−)/(β−1)
for some constant d < ∞. Therefore, in particular, P 0 L∗τ (v) − v ≤ λv ≤ dλ(1−)/(β−1) ,
(11.151)
(11.152)
where L∗· := supx Lx· is the maximal local time. Proof
By (11.6), law
Lxτ(1) =
1 xv1/(β−1) L v τ (v)
(11.153)
Using this in (11.147) we get (11.151). We can now give a version of Lemma 11.2.5 for the symmetric stable process of index 1 < β < 2. Lemma 11.6.3 Let X = {Xt ; t ≥ 0} be a symmetric stable process of index 1 < β < 2 in R1 and denote its local times by L = {Lxt , (x, t) ∈ R1 × R+ }. Let τ ( · ) denote the inverse local time of L0· . Then, for any β−1 , b> 2−β (log r)b (supx Lxτ(r) − r) lim =∞ (11.154) r→∞ r and (log 1/r)b (supx Lxτ(r) − r) = ∞. (11.155) lim r→0 r
11.6 Most visited sites of symmetric stable processes
525
The proof is essentially the same as the proof of Theorem 11.2.5, but since it is short and since this is an important result, we repeat the proof with the necessary modifications. Proof Let v = vn = en for 0 < a < 1 and λ = n−(β−1)(1+2) in (11.152). It follows from the Borel–Cantelli Lemma that vn L∗τ (vn ) − vn ≥ (β−1)(1+2) (11.156) n for all n ≥ n0 (ω) on a set of probability one, where ω denotes the paths of the stable process. Let vn ≤ r ≤ vn+1 . Then, restricted to this set, vn L∗τ (r) − r ≥ −(r − vn ) + (β−1)(1+2) (11.157) n for all n ≥ n0 (ω) and a
L∗τ (r) − r r
≥−
1 vn vn+1 − vn + (β−1)(1+2) vn vn+1 n
(11.158)
for all n ≥ n0 (ω). Then by a simple calculation we see that L∗τ (r) − r
(1 − δ) 2 (11.159) ≥ − 1−a + (β−1)(1+2) r n n for all δ > 0 and for all n ≥ n0 (ω). It is easy to see that if β − 1 < 1 − a, then, for small enough, (β − 1)(1 + 2 ) < 1 − a. Thus L∗τ (r) − r
≥
(1 − δ)
(11.160) r for all n ≥ n0 (ω) and vn ≤ r ≤ vn+1 on a set of probability one. For r in this range, n ∼ (log r)1/a . Since we can take a arbitrarily close to 2 − β, we get (11.154). a We get (11.155) from (11.158) by taking vn = e−n for 0 < a < 1 and repeating the rest of the above argument. n(β−1)(1+2)
We can now prove the “most visited sites” theorem for symmetric stable processes of index 1 < β < 2. Theorem 11.6.4 Let X = {Xt ; t ≥ 0} be a symmetric stable process of index 1 < β < 2 in R1 and denote its local times by L = {Lxt , (x, t) ∈ R1 ×R+ }. Let Vt be given by (11.2). Then, for any γ > β/(2−β)(β −1), we have (log t)γ lim Vt = ∞ P 0 a.s. (11.161) t→∞ t1/β and | log t|γ P 0 a.s. (11.162) lim 1/β Vt = ∞ t→0 t
526
Most visited sites of symmetric stable processes
In particular, (11.161) tells us that limt→∞ Vt = ∞, that is, the process {Vt , t ∈ R+ } is transient. Moreover, we can make a similar comment about (11.162) as we made for (11.79) following Theorem 11.2.6. Proof Here again the proof is a very minor modification of the proof of Theorem 11.2.6 for Brownian motion. Let t ∈ [τ (r− ), τ (r)] for r large. By (11.154) we have that sup Lxt > r + r/(log r)b
(11.163)
x
for all b > (β − 1)/(2 − β). On the other hand, let ha be as defined in (11.10) with a = 2(1 + )b for some > 0. It follows from (11.12) with γ = b and the fact that L0τ (r) = r, that sup 0≤|x|≤ha (L0t )
≤
Lxt
sup 0≤|x|≤ha (r)
<
Lxτ(r)
(11.164)
r + r/(log r)b
for all r sufficiently large. Comparing (11.163) and (11.164), we see that for all t sufficiently large, Vt > ha (L0t ).
(11.165)
Using Lemma 11.1.3 and the fact that ha is increasing, this shows that
¯ t1/β Vt > ha (11.166) (1+) (log t) ≥
t1/β (2b+1)(1+)
1/(β−1) .
(log t)
Since we can take b arbitrarily close to (β − 1)/(2 − β) and arbitrarily close to zero, we get (11.161). The proof of (11.162) is essentially the same as the proof of (11.79) in Theorem 11.2.6.
11.7 Notes and references The first investigation of the most visited sites question for L´evy processes was by Bass and Griffin (1985), who considered Brownian motion. They obtained (11.78) for all γ > 11. They also showed that lim inf t→∞
(log t)γ Vt = 0 t1/2
P0
a.s.
(11.167)
11.7 Notes and references
527
for all γ < 1. Many people think that (11.78) should hold for all γ > 1 but this has remained an open problem for the last 20 years. 9 Bass, Eisenbaum and Shi (2000) obtained (11.154) for all b > β−1 . This is better than our estimate for b when β is near 2 but not as good when β is near 1. We use a different method of proof in Theorem 11.6.3 than the one in Bass, Eisenbaum and Shi (2000). In that work, a clever use of Slepian’s Lemma allowed them to get a version of the critical Lemma 11.5.1, by considering the amount of time Brownian motion spends within a cone. We think it takes us far too afield to develop all the prerequisites needed to proof this result. (In Bass, Eisenbaum and Shi (2000), they referred to Ba˜ nuelos and Smits (1997).) Instead, we use a rather sharp direct estimate for fractional Brownian motion by Molchan (1999). Molchan’s estimate uses the Cameron–Martin Formula, which involves reproducing kernel Hilbert spaces, thus our Sections 11.3 and 11.4. For other results about the most visited sites of symmetric stable processes, see Eisenbaum (1997) and Eisenbaum and Khoshnevisan (2002). Both Bass, Eisenbaum and Shi (2000) and Molchan (1999) restrict their attention to symmetric stable process and hence can use the scaling property in Lemma 11.1.1. Furthermore, in applying the Ray–Knight Theorem or the Generalized Second Ray–Knight Theorem both in Bass, Eisenbaum and Shi (2000) and in our presentation in this chapter, the scaling property of fractional Brownian motion, (11.5), is used often. This is analytically convenient but may not be necessary. In Marcus (2001), the approach of Bass, Eisenbaum and Shi (2000) is extended to L´evy processes with regularly varying characteristic exponents. Extending the approach of this chapter beyond symmetric stable processes remains to be done. The most visited sites problem is only one of the many questions that one can ask about the behavior of local times {Lxt , (x, t) ∈ R1 × R+ } as t → ∞. We do not pursue these questions in this chapter because our guiding principle is to concentrate on results that employ the isomorphism theorems of Chapter 8. That is except for (11.40), which we mention as a complement of Lemma 11.1.3. A good starting point for considering these questions are Donsker and Varadhan (1977), Lacey (1990), Marcus and Rosen (1994a), Marcus and Rosen (1994b), Bertoin (1995), Bertoin and Caballero (1995), and Blackburn (2000).
The proofs in Section 11.3 extend easily to all Gaussian processes associated with L´evy processes and even to all Gaussian processes with
528
Most visited sites of symmetric stable processes
spectral densities. As an example consider the Gaussian processes associated with recurrent L´evy processes killed the first time they hit zero. Let X be a recurrent L´evy process with L´evy exponent ψ. Recall that, by Theorem 4.2.4, the 0-potential density of X killed the first time it hits zero is uT0 (x, y) = φ(x) + φ(y) − φ(x − y), where φ(x) :=
1 π
∞
0
1 − cos λx dλ ψ(λ)
x ∈ R.
(11.168)
(11.169)
Consider the complex Hilbert space L2ψ = L2C (R1 , (ψ(λ)/2π) dλ) and denote its norm by · ψ , that is, g ∈ L2ψ is a measurable function with 1/2 1 2 < ∞. (11.170) gψ = |g(λ)| ψ(λ) dλ 2π For g ∈ L2ψ , let Ig(x) :=
1 2π
(1 − e−ixλ )g(λ) dλ.
(11.171)
It follows from the Cauchy-Schwarz inequality that Ig exists and is uniformly continuous. The proof of the following theorem is a simple generalization of the proof of Theorem 11.3.1. Lemma 11.7.1 The linear space Hψ = {Ig : g ∈ L2ψ }
(11.172)
IgHψ := gψ
(11.173)
equipped with the norm
is a separable complex Hilbert space of continuous functions. Furthermore, uT0 (t, · ) ∈ Hψ
∀ t ∈ R1
(11.174)
∀ f ∈ Hψ ,
(11.175)
and (f ( · ), uT0 (t, · ))Hψ = f (t)
where (·, ·)Hψ denotes the inner product on Hψ . Let Kψ denote the reproducing kernel Hilbert space for the covariance uT0 and let · Kψ denote the norm of Kψ . Then Kψ = {f ∈ Hψ | f = f¯}
(11.176)
11.7 Notes and references
529
and, for f ∈ Kψ , f Kψ = f Hψ .
(11.177)
In a similar way one can consider the reproducing kernel Hilbert spaces for Gaussian processes associated with L´evy processes killed at the end of an independent exponential time.
12 Local times of diffusions
12.1 Ray’s Theorem for diffusions In Subsection 8.2.1 we show that we can generalize the Second Ray– Knight Theorem for recurrent diffusions so that the process can start at an arbitrary point in the state space. We do the same thing here in the transient case. The Second Ray–Knight Theorem in the transient case is given in Theorem 8.2.3. As one can see from the proof, it is actually a theorem about the local times of the h-transform of a process. (We pass from the h-transform to the original process in (8.61).) In Theorem 8.4.1, in particular in (8.127), we make this more explicit. In this section, for diffusions, we extend Theorem 8.4.1 to an h-transform process starting at x and terminating at its last exit from y for y = x. This is how Ray’s original result was given, although our result appears quite different from his. Let X be a transient regular diffusion on an interval I ⊂ R1 , as described in Section 4.3, with symmetric 0-potential density u(r, s). Recall that u(r, s) has the form given in (4.111), u(r, s) =
p(r)q(s) r ≤ s p(s)q(r) s < r,
(12.1)
where p and q are positive continuous functions, with p strictly increasing and q strictly decreasing. For hy (r) = u(r, y)/u(y, y), let X hy denote the hy -transform of X. = (Ωh , F hy , Fthy , Xt , θt , P x/hy ) in (X hy is an alternative notation for X y Theorem 3.9.2 that makes more explicit the dependence on the function hy .) y Let L = {Lt ; (y, t) ∈ I × R+ } denote the local time for X hy normal530
12.1 Ray’s Theorem for diffusions ized so that
r u(x, r)h (r) u(x, r)u(r, y) y E x,y L∞ = = hy (x) u(x, y)
531
(12.2)
(recall that we use P x,y to denote the law of X hy , starting at x; see also Remark 3.9.3). Let G = {Gr , r ∈ I} be a mean zero Gaussian process with covariance u(r, s) and let Gr,z denote the projection of Gr on the orthogonal complement of Gz , that is, Gr,z = Gr −
E(Gr Gz ) u(r, z) Gz = Gr − Gz . E(Gz Gz ) u(z, z)
(12.3)
In the next lemma we collect some facts about Gr,z that are used throughout this chapter. Lemma 12.1.1 q(z) Gy , and it is independent of G≤y , q(y) the σ-algebra generated by {Gr , r ≤ y}. p(y) Gz , and it is independent of G≥z , (2) If y < z, then Gy,z = Gy − p(z) the σ-algebra generated by {Gr , r ≥ z}. (3) If t < s < r, then (1) If y < z, then Gz,y = Gz −
Gr,t = Gr,s +
q(r) Gs,t , q(s)
(12.4)
and Gr,s and Gs,t are independent. q(z) u(y, z) = . Since Gz,y is the projection of u(y, y) q(y) Gz on the orthogonal complement of Gy , it is orthogonal to Gy . Indeed, a simple calculation using (12.1) shows that Gz,y is orthogonal to each Gr , r ≤ y. Consequently it is independent of G≤y . Proof
(1) If y < z, then
(2) The proof is similar to the proof of (1). (3) If t < s < r, then, by (1) Gr,t
q(r) q(r) (12.5) Gs − Gt q(s) q(t) q(r) q(s) q(r) = Gr,s + Gt = Gr,s + Gs,t . Gs − q(s) q(t) q(s) = Gr,s +
Since Gs,t ∈ G≤s , it is independent of Gr,s by (1).
Local times of diffusions
532
x
Theorem 12.1.2 (Ray’s Theorem for Diffusions) Let L = {Lt , (x, t) ∈ I × R+} denote the local times of X hy normalized as in (12.2). Let G be an independent copy of G. Let x ≤ y. Then, under P x,y × PG, G or P y,x × PG, G , r L∞ +
2
Gr,x G2r,x + 2 2
1{r≤x} + law
=
G2 r
2
2
Gr,y G2r,y + 2 2
1{r≥y} : r ∈ I
Gr :r∈I , 2
2
+
(12.6)
where I ⊂ R1 . Proof This proof is an easy application of Lemma 3.10.4. Using it and Lemma 5.2.1, all we need to do is verify that v(r, s) := u(r, s) −
hy (r)u(s, x) u(r, y)u(s, x) = u(r, s) − hy (x) u(x, y)
(12.7)
is the covariance of the Gaussian process Gr,x 1{r≤x} + Gr,y 1{r≥y} when x ≤ y, and similarly with x and y interchanged when y ≤ x. (It is not clear a priori that v(r, s) is the covariance of any Gaussian process, or even positive definite. This follows from the proof.) Suppose x ≤ y. Then, using (12.1), we see that for x ≤ r ≤ y, v(r, r) = p(r)q(r) −
p(r)q(y)p(x)q(r) = 0. p(x)q(y)
(12.8)
We leave it to the reader to check that for r < x ≤ s ≤ y and for x ≤ r ≤ y < s, v(r, s) = v(s, r) = 0. Suppose r, s < x. Then v(r, s)
u(r, x)u(s, x) p(r)u(s, x) = u(r, s) − p(x) u(x, x) = EGr,x Gs,x . (12.9)
= u(r, s) −
Similarly when r, s > y, v(r, s) = EGr,y Gs,y . Finally, we again leave it to the reader to check that, when r < x < y < s, v(r, s) = v(s, r) = 0, which is consistent with the fact that Gr,x 1{r≤x} and Gs,y 1{s≥y} are independent. (By Lemma 12.1.1 (2), Gs,y 1{s≥y} ∈ G≥x .) The same arguments work when y ≤ x. We now give an analog of Theorem 8.2.6 for transient diffusions. Corollary 12.1.3 Under the same hypotheses as Theorem 12.1.2, under
12.1 Ray’s Theorem for diffusions P y,0 × PG, G , y > 0, r LT0 +
2
G2r,y Gr,y + 2 2
law
=
G2
r,0
2
533
1{r≥y} : r ∈ I
(12.10)
Gr,0 :r∈I , 2 2
+
where I ⊂ R+ . r
r
r
Proof By the additivity property of local times, {L∞ = LT0 + L∞ ◦ r r r θT0 } and by the strong Markov property, LT0 and L ∞ := L∞ ◦ θT0 are independent. Thus, by (12.6), under P y,0 × P 0,0 × PG, G (noting that I ⊂ R+ ),
2 2 2 G2 r G G Gr r law r,y r,y r LT0 + L ∞ + + 1{r≥y} : r ∈ I = + :r∈I . 2 2 2 2 (12.11) Also, by (12.6), under P 0,0 × PG, G ,
2 2 G2 r G2r,0 Gr,0 G law r + :r∈I = + r :r∈I . (12.12) L∞+ 2 2 2 2 Substituting the left-hand side of (12.12) into the right-hand side of r (12.11) and canceling L ∞ , we get (12.10). Theorem 8.2.6 and Corollary 12.1.3 have the same form. On their right-hand sides we have the Gaussian process with covariance uT0 and on their left-hand sides we have the orthogonal compliment of the projection of this process onto its value at y. Remark 12.1.4 We come to a very interesting question. Is it possible that Theorem 8.2.6 holds for processes other than diffusions? The proof of the theorem holds the answer, which is no, at least in whatever cases we can think of. It is very easy to see this. In order for the theorem to hold we need that v(r, s) in (12.7) is the covariance of a Gaussian process. Unfortunately this function is not even symmetric in general. Consider the canonical symmetric p-stable process killed at the end of an exponential time with mean α. In this case, 1 ∞ cos λ(r − s) dλ (12.13) u(r, s) = π 0 α + λp as we show in (4.84). (For v(r, s) to be symmetric, we would need u(r, y)u(s, x) = u(s, y)u(r, x).)
Local times of diffusions
534
One obvious way to make v(r, s) symmetric is to take x = y, that is, to make the process start and stop at the same point. The results obtained by doing this already appear in Section 8.2. Example 12.1.5 Theorem 12.1.2 applied to Brownian motion killed at the end of an independent exponential time with mean 1/2 gives Theorem 2.8.1. As a somewhat more esoteric application of Theorem 12.1.2 we give an interesting modification of Theorem 8.2.6. We consider standard Brownian motion B = {B(r), r ∈ R+ } starting at x > 0 and killed the first time it hits 0. But now we use the h-transform hy ( · ) to condition this process to hit y > x and die at y, so that the process never does hit 0. The 0-potential√of B for x, y ≥ 0 is 2(x ∧ y)); see (2.139). So G in Theorem 12.1.2 is 2B. It follows from Theorem 12.1.2 that, under P x,y × PB,B , r
{L∞ + ((Br −
r r Bx )2 + (B r − B x )2 ))1{r≤x} x x 2
law
(12.14) 2
2 + B r−y )1{r≥y} : r ∈ R+ } = {Br2 + B r : r ∈ R+ }, + (Br−y ·
where L∞ is the total accumulated local time of the hy -transform of the killed Brownian motion. Note the two independent Brownian bridges between 0 and x.
12.2 Eisenbaum’s version of Ray’s Theorem Ray’s Theorem has been the subject of many investigations. It has been reformulated and reproved in different ways. References are given in Section 12.6. In all of these papers the law of {Lr∞ ; r ∈ I} is described piecewise, in three separate regions: r ≤ x, x ≤ r ≤ y, and r ≥ y, conditioned to agree at the endpoints. The version of Ray’s Theorem closest to Theorem 12.1.2 is in Eisenbaum (1994). We derive her result from Theorem 12.1.2. Let Z = {Zt (x) ; (x, t) ∈ R1 × R+ } be a zero-dimensional squared Bessel process starting at x (see Section 14.2), and let Z¯ = {Z¯t (x) ; (x, t) ∈ R1 × R+ } be an independent copy of Z. In addition, let Bt be a standard twodimensional Brownian motion, also called a planar Brownian motion, ¯ independent of Z and Z. For p and q as in (12.1), set τ (r) = p(r)/q(r) and φ(r) = q(r)/p(r), and note that τ (r) is increasing and φ(r) is decreasing.
12.2 Eisenbaum’s version of Ray’s Theorem Theorem 12.2.1 Let
r {L∞ ; r
535
∈ I} be as in Theorem 12.1.2. Then
r
law
{L∞ ; r ∈ I} = {Ψr ; r ∈ I},
(12.15)
2 1 2 x ≤ r ≤ y, 2 q (r)|Bτ (r) | 2 1 2 r 2 q (r)Zτ (r)−τ (y) (|Bτ (y) | ) 2 2 1 2 ¯ 2 p (r)Zφ(r)−φ(x) (φ (x)|Bτ (x) | )
(12.16)
where Ψr
=
Ψr
=
Ψr
=
≥ y,
(12.17)
r ≤ x.
(12.18)
Note that |Bt |2 is a two-dimensional squared Bessel process starting at 0 (a BESQ2 (0) process in the notation of Section 14.2). Proof To begin, we motivate the choice of Ψ(r) in (12.16)–(12.18). Let Wt denote a Brownian motion and let G be a mean zero Gaussian process with covariance u(r, s). By checking covariances it is easy to verify that law
law
{Gr ; r ∈ I} = {q(r)Wτ (r) : r ∈ I} = {p(r)Wφ(r) : r ∈ I}.
(12.19)
Then using Lemma 12.1.1 (1) and (2), we obtain law
{Gr,y ; r ≥ y} = {q(r)(Wτ (r) − Wτ (y) ) : r ≥ y},
(12.20)
law
{Gr,x ; r ≤ x} = {p(r)(Wφ(r) − Wφ(x) ) : r ≤ x}.
(12.21)
As we show in the proof of Theorem 12.1.2, {Gr,x ; r ≤ x} and {Gr,y ; r ≥ y} are independent. = {B t , t ∈ R+ } be two planar Brownian ¯ = {B ¯t , t ∈ R+ } and B Let B motions independent of each other. It follows from (12.19)–(12.21) that (12.6) is equivalent to r
{L∞ +
2 1 2 ¯ 2 p (r)|Bφ(r)−φ(x) | I{r≤x} τ (r)−τ (y) |2 I{r≥y} + 12 q 2 (r)|B law τ (r) |2 : r ∈ I} = { 12 q 2 (r)|B
(12.22) : r ∈ I}
law
¯φ(r) |2 : r ∈ I}, = { 12 p2 (r)|B
r ¯ and B. Using this we see where L = {L∞ , r ∈ I} is independent of B that law
2 1 2 2 q (r)|Bτ (r) |
x≤r≤y
2 1 2 2 q (r)|Bτ (r)−τ (y) | I{r≥y}
law
2 1 2 2 q (r)|Bτ (r) |
r≥y
2 1 2 ¯ 2 p (r)|Bφ(r)−φ(x) | I{r≤x}
law
2 1 2 ¯ 2 p (r)|Bφ(r) |
r ≤ x. (12.23)
r
L∞ r L∞ + r L∞ +
= = =
Local times of diffusions
536
The last two equalities can be written as r
L∞ +
2 1 2 2 q (r)|Bτ (r)−τ (y) | I{r≥y} law 1 2 τ (y) + B τ (r)−τ (y) |2 = 2 q (r)|B
(12.24) r≥y
and r
L∞ +
2 1 2 ¯ 2 p (r)|Bφ(r)−φ(x) | I{r≤x} law 1 2 ¯φ(x) + B ¯φ(r)−φ(x) |2 = 2 p (r)|B
(12.25) r ≤ x. law
By Remark 14.2.3, and using the fact that |Bφ(x) |2 = φ2 (x)|Bτ (x) |2 r in (12.25), we see that L∞ is equal in law to Ψr , separately, in each of the three regions (12.16)–(12.18). What is not yet clear is that (12.15) holds ¯ and B in (12.16)–(12.18) to be independent. globally when we take Z, Z, r To complete the proof we show that (12.22) also holds with L∞ replaced by Ψr , since, for example, by taking Laplace transforms, we can r law show that for all finite sets D ⊂ R1 , {L∞ ; r ∈ D} = {Ψr ; r ∈ D} and thus obtain (12.15). ¯ Let B = {Bt , t ∈ R+ } be a planar Brownian motion independent of B and B. Let Zn,t (x) denote an n-th order squared Bessel process starting at x. It follows from (12.18) and Remarks 14.2.1 and 14.2.3 that for r ≤ x, 2 1 2 ¯ 2 p (r)|Bφ(r)−φ(x) | = 12 p2 (r) Z0,φ(r)−φ(x) (φ2 (x)|Bτ (x) |2 ) + = 12 p2 (r) Z2,φ(r)−φ(x) (φ2 (x)|Bτ (x) |2 ) ¯φ(r)−φ(x) + φ(x)Bτ (x) |2 . = 12 p2 (r)|B
Ψr +
(12.26) Z2,φ(r)−φ(x) (0)
A similar argument, using (12.17), shows that for r ≥ y, 2 1 2 2 q (r)|Bτ (r)−τ (y) |
Ψr +
=
1 2 2 q (r)|Bτ (r)−τ (y)
+ Bτ (y) |2 .
(12.27)
Therefore, by (12.26), (12.27), and (12.16), we see that for r ∈ I, {Ψr +
2 1 2 ¯ 2 p (r)|Bφ(r)−φ(x) | I{r≤x}
+
2 1 2 2 q (r)|Bτ (r)−τ (y) | 1{r≥y} }
law
¯φ(r)−φ(x) + φ(x)Bτ (x) |2 1{r≤x} = { 12 p2 (r)|B (12.28) 2 2 1 2 1 2 + q (r)|Bτ (r) | 1{x
2
Since this last process is clearly Markovian, it suffices to show that it agrees in law with 12 q 2 (r)|Bτ (r) |2 separately in the regions {r ≤ x}, {x ≤ r ≤ y}, and {r ≥ y}. This is obvious for the latter two regions. As law
for the region {r ≤ x}, we again use the fact that q(r)Bτ (r) = p(r)Bφ(r)
12.3 Ray’s original theorem
537
to see that ¯φ(r)−φ(x) + φ(x)Bτ (x) |2 : r ≤ x} { 12 p2 (r)|B law
= { 12 p2 (r)|Bφ(r) |2 : r ≤ x}
law
=
{ 12 q 2 (r)|Bτ (r) |2
(12.29)
: r ≤ x}.
12.3 Ray’s original theorem The designation of Theorem 12.1.2 as Ray’s Theorem is honorific since Ray’s description of the local times of diffusions X hy is quite different from the one in Theorem 12.1.2. We give a relatively simple derivation of the theorem D. Ray proved in Ray (1963), using some of the ideas that go into the proof of Theorem 12.1.2. We use the same notation as in Theorem 12.1.2. In addition, we introduce the following random variables: h
J := inf Xt y 0
h
S := sup Xt y ,
and
(12.30)
0
where ζ is the lifetime of X hy . Recall that X hy starts at x and x ≤ y, so J ≤ x. Also, using the facts that X hy has continuous paths, ζ < ∞, and limt↑ζ X hy = y, it is clear that J and S are finite random variables and S ≥ y. It is useful to have a better description of J and S before we state Ray’s original theorem. We write u(r, s), the 0-potential of X as in (12.1), and, as in Theorem 12.2.1, we set τ (r) = p(r)/q(r). Lemma 12.3.1 For x ≤ y, P x,y (J ≤ z) =
τ (z) ∧ 1 := F (z) τ (x)
(12.31)
P x,y (S ≥ z) =
τ (y) ∧ 1 := H(z). τ (z)
(12.32)
and
Proof
By the strong Markov property, z
z
E x,y (L∞ ) = P x,y (Tz < ∞)E z,y (L∞ ).
(12.33)
Therefore, by (12.2), P x,y (Tz < ∞) =
u(x, z)u(z, y) , u(x, y)u(z, z)
(12.34)
Local times of diffusions
538
and using (12.1), for z ≤ x, we get τ (z) ∧ 1. τ (x)
(12.35)
τ (y) u(x, z)u(z, y) = ∧ 1. u(x, y)u(z, z) τ (z)
(12.36)
P x,y (Tz < ∞) = Similarly for z ≥ y, P x,y (Tz < ∞) =
Using the continuity of X hy it is easy to see that for z ≤ x, P x,y (J ≤ z) = P x,y (Tz < ∞).
(12.37)
This proves (12.31). Similarly, for z ≥ y, P x,y (S ≥ z) = P x,y (Tz < ∞), which gives (12.32). J and H are real-valued random variables. We use PJ and PH to denote their probability measures. Theorem 12.3.2 (Ray’s Theorem, original version) Let L = x {Lt , (x, t) ∈ I×R+ } denote the local times of X hy and G = {Gr , r ∈ R1 } (i) be as in Theorem 12.1.2. Let G(i) = {Gr , r ∈ R1 }, i = 1, . . . , 4 be four independent copies of G and let PGj := PG(1) ,...,G(j) . Let x < y. Then law J, Lr∞ ; r ≤ x, P x,y = J, Λr ; r ≤ x, PJ × PG4 (12.38) 2 2 1 r x,y law (1) (2) Gr ; x ≤ r ≤ y, PG2 , = + Gr L∞ ; x ≤ r ≤ y, P 2 (12.39) and law S, Lr∞ ; r ≥ y, P x,y = S, Γr ; r ≥ y, PS × PG4 , (12.40) where Λr =
1 2 1{r>J}
4
(i)
2
Gr,J
(12.41)
i=1
and Γr =
1 2 1{r<S}
4
(i)
Gr,S
2 (12.42)
i=1 (i)
(i)
(here, Gr,J and Gr,S are defined as in (12.3), with G replaced by G(i) ). Throughout this book we have been able to exploit relationships involving the moment generating functions of Gaussian processes. We
12.3 Ray’s original theorem
539
continue to explore these relationships for Gaussian processes with covariances of the form (12.1). (We know by Lemma 5.1.9 that these are Gaussian Markov processes.) Lemma 12.3.3 Let t < s ≤ rj ≤ · · · r2 ≤ r1 . Consider {Gri ,s , 1 ≤ i ≤ j} (see Lemma 12.1.1), and let Σs denote its covariance matrix s := (Σ−1 − Λ)−1 , and similarly for {Gr ,t , 1 ≤ i ≤ j}. Let and Σ s
i
τ (s) = p(s)/q(s). Then 1 det(I − Σs Λ) = , det(I − Σt Λ) 1 − 2K(q, Λ, Σs )(τ (s) − τ (t))
(12.43)
s Λq t ) for Λ as given in Lemma s ) = (1/2)(qΛq t + qΛΣ where K(q, Λ, Σ 5.2.1, with n = j and q = (q(r1 ), . . . , q(rj )). Furthermore, consider {Gri , 1 ≤ i ≤ j} and let Σ denote its covariance matrix. Then τ (s) det(I − Σt Λ) − τ (t) det(I − Σs Λ) = (τ (s) − τ (t)) det(I − Σ Λ). (12.44) Proof
For each i ≤ j, by Lemma 12.1.1 (3), we can write Gri ,t
q(ri ) Gs,t q(s) 1 1 Gs − Gt ) = Gri ,s + q(ri )( q(s) q(t) := Gri ,s + ρri ,s,t , =
Gri ,s +
(12.45)
where the two processes {Gri ,s , i = 1, . . . , j} and {ρri ,s,t , i = 1, . . . , j} are independent. Also, note that
2 1 u(t, s) 1 u(s, s) u(t, t) Gs − Gt + 2 −2 = τ (s)−τ (t). = 2 EG q(s) q(t) q (s) q (t) q(t)q(s) (12.46) It follows from (5.43) that (det(I − Σt Λ))−1/2 = (det(I − Σs Λ))−1/2 (12.47) s )( 1 Gs − 1 Gt )2 . EG exp K(q, Λ, Σ q(s) q(t) j The left-hand side is obtained by using (5.43) on i=1 G2ri ,t /2 and the j right-hand side by using (5.43) on i=1 (Gri ,s + ρri ,s,t )2 /2 and using the independence of {Gri ,s , i = 1, . . . , j} and {ρri ,s,t , i = 1, . . . , j}. Equation (12.43) now follows from (12.46) and (12.47).
Local times of diffusions
540
To obtain (12.44) we first consider det(I − Σs Λ). The entries of I − Σs Λ are of the form δl,m − (u(rl , rm ) − τ (s)q(rl )q(rm )) λm
1 ≤ l, m ≤ j.
(12.48)
For each l = 2,. . . , m, multiply the first row of det(I −Σs Λ) by q(rl )/q(r1 ) and subtract it from the l-th row. The resulting matrix has no terms in τ (s) in rows 2 through j. This implies that det(I − Σs Λ) is of the form A + τ (s)B, where neither A nor B contains terms in τ (s). It is also clear that A = det(I − Σ Λ) (to see this, think of what we get if τ (s) = 0). Thus we have shown that det(I − Σs Λ) = det(I − Σ Λ) + τ (s)B,
(12.49)
where B is not a function of τ (s). Exactly the same argument shows that det(I − Σt Λ) = det(I − Σ Λ) + τ (t)B
(12.50)
since the only difference in the entries of det(I − Σs Λ) and det(I − Σt Λ) is the terms τ (s) and τ (t). Equation (12.44) follows from (12.49) and (12.50). Lemma 12.3.4 Let t < s ≤ rj ≤ · · · r2 ≤ r1 ≤ x. Then, with Σs and Σt as defined in Lemma 12.3.3,
1 λi G2ri ,r 2 i=1 j
4
F (s) − F (t) , det(I − Σs Λ) det(I − Σt Λ) t (12.51) where F (r) is given in (12.31). In particular, when t = −∞, s
E exp
s
1 λi G2ri ,r 2 i=1 j
dF (r) =
4
F (s) , det(I − Σs Λ) det(I − ΣΛ) (12.52) j where Σ is the covariance matrix of {Gri }i=1 . E exp
−∞
dF (r) =
s ). Setting t = r in (12.43) and then inteProof Let K = K(q, Λ, Σ grating, it follows that 2 s 1 2 dF (r) (12.53) (det(I − Σs Λ)) det(I − Σr Λ) t 2 s 1 = dF (r). 1 − 2K(τ (s) − τ (r)) t
12.3 Ray’s original theorem
541
Set v = τ (r)/τ (s). The right-hand side of (12.53) is equal to 2 1 τ (s) 1 dv τ (x) τ (t)/τ (s) 1 − 2Kτ (s)(1 − v) τ (s) 1 1 = −1 τ (x) 2Kτ (s) 1 − 2K(τ (s) − τ (t)) (τ (s) − τ (t))/τ (x) = 1 − 2K(τ (s) − τ (t)) det(I − Σs Λ) , = (F (s) − F (t)) det(I − Σt Λ)
(12.54)
where, for the last equality, we use (12.43). Combining (12.53)–(12.54) we get 2 s 1 F (s) − F (t) , dF (r) = det(I − Σ Λ) det(I − Σs Λ) det(I − Σt Λ) r t (12.55) which is (12.51). We obtain (12.52) by taking the limit on both sides, as t ↓ −∞. The case r1 ≤ t < s ≤ x is a degenerate form of (12.51) in which there are no squares of Gaussian processes present. In this case, both sides of (12.51) are equal to F (s) − F (t). Proof of Theorem 12.3.2 Equation (12.39) follows immediately from (12.6). We proceed to the proof of (12.38). Let −∞ = rn+1 < rn ≤ · · · ≤ rj+1 ≤ t < s ≤ rj ≤ · · · r2 ≤ r1 ≤ x. It follows from Lemma 12.3.4 that
4
n 1 E x,y EG exp λi I{ri >J} G2ri ,J , t < J ≤ rj 2 i=1 =
F (rj ) − F (t) det(I − Σrj Λ) det(I − Σt Λ)
(12.56)
for Σ · as in Lemma 12.3.4. We now consider the local time process of X hy . It follows from the Markov property that
j ri x,y exp E λi L∞ , Tt < ∞ (12.57) i=1
=E
x,y
exp
j i=1
ri λi LTt
, Tt < ∞ E
t,y
exp
j i=1
ri λi L∞
542
Local times of diffusions
j ri exp λi LTt , Tt < ∞ det(I − ΣΛ),
= E x,y
i=1
where, in the last equality, we use Theorem 12.1.2 with x, y replaced by t, y and the fact that r1 , . . . , rj are between t and y. Using (3.209) with the element 0 replaced by y, so that {L > 0} = {Ty < ∞}, we see that
j ri x,y exp λi LTt , Tt < ∞ (12.58) E i=1
1 Ex = hy (x) 1 Ex = hy (x) 1 Ex = hy (x)
exp
exp
exp
ht (x) x,t = E hy (x)
j
ri λi LTt
i=1 j
1{Tt <∞} ◦ kL , Ty < ∞
ri λi LTt
i=1 j
exp
, Tt < ∞, Ty ◦ θTt < ∞
ri λi LTt
i=1
j
, Tt < ∞ P t (Ty < ∞)
ri λi LTt
P t (Ty < ∞),
i=1
where, for the last line, we use (3.209) with the element 0 replaced by t. By Corollary 12.1.3 with y,0 replaced by x,t and the fact that r1 , . . . , rj are between t and x,
j 1 r i = . (12.59) E x,t exp λi LTt det(I − Σt Λ) i=1 Also, by (3.109), ht (x) t τ (t) u(x, t) u(y, y) u(t, y) P (Ty < ∞) = = = F (t). hy (x) u(t, t) u(x, y) u(y, y) τ (x) Combining (12.57)–(12.60) we obtain
j ri x,y exp E λi L∞ , Tt < ∞ =
(12.60)
1 1 F (t) . det(I − Σ Λ) det(I − ΣΛ) t i=1 (12.61) It follows from (12.61) and (12.44) that
n x,y ri exp (12.62) E λi L∞ , t ≤ J < rj i=1
=E
x,y
exp
j i=1
λi Lr∞i
, t ≤ J < rj
12.4 Markov property of local times of diffusions 1 F (rj ) F (t) = − det(I − ΣΛ) det(I − Σrj Λ) det(I − Σt Λ) F (rj ) − F (t) = . det(I − Σrj Λ) det(I − Σt Λ)
543
Equation (12.38) follows from (12.56) and (12.62). The proof of (12.40) is an exact replication of the proof of (12.38). We consider y ≤ r1 ≤ · · · ≤ rj ≤ s < t and set u(ri , s) u(ri , t) Gri ,t = Gri ,s + Gs − Gt (12.63) u(s, s) u(t, t) 1 1 Gs − Gt ). = Gri ,s + p(ri )( p(s) p(t) We also note that P x,y (S ≤ z) = I(z), where I(z) = (1 − τ (y)/τ (z)) ∨ 0. We proceed to obtain analogies of Lemmas 12.3.3 and 12.3.4 and the proof of (12.38) (essentially, p( · ) replaces q( · ), 1/τ ( · ) replaces τ ( · ), and H( · ) replaces F ( · )). Example 12.3.5 Ray’s Theorem for Brownian motion killed at the end of an independent exponential time with mean 1/2 is given in Theorem 2.8.1. Ray’s original theorem allows us also to describe the local time in terms of fourth-order Bessel processes. We point out on page 58 that the associated Gaussian process in this case, which is an Ornstein–Uhlenbeck process, can be represented by e−x Be2x x ∈ R1 , where B is Brownian 2r )− motion. Consequently, for example, for r ≥ J, Gr = e−r (B(e 2J 2r 2J B(e )). This shows that for r ≥ J, Λr = (1/2)e Z4,e2r e , where Z4, · ( · ) is defined on page 536.
12.4 Markov property of local times of diffusions We present a very interesting property of the local times of transient regular diffusions that can be understood by realizing that the diffusions themselves are rescaled time changed Brownian motion. r
Theorem 12.4.1 The total accumulated local time process {L∞ , r ∈ I} in Theorem 12.1.2 is a Markov process under P x,y . This theorem is almost immediately obvious from Theorem 12.2.1 since, in each of the three regions considered, the local time is equal in law to a Bessel process, which itself is a Markov process. We make this more precise in the next proof.
Local times of diffusions
544
Proof of Theorem 12.4.1 using Theorem 12.2.1 It suffices to show that
E H(Ψs ) FrΨ = E H(Ψs ) Ψr (12.64) for each r < s and any measurable function H, where FtΨ denotes the σ-algebra generated by the stochastic process Ψu , u ≤ t. This is not difficult; for example, take y ≤ r < s. For each a ∈ R1 , the continuous process Z = {Zt (a) ; t ∈ R+ } induces a probability measure PZa (·) on C(R1 , R+ ). Let Xt denote the canonical coordinate process on C(R1 , R+ ). Then, under {PZa (·); a ∈ R1 }, Xt is a time-homogeneous strong Markov process. Using (12.17), independence, and the strong Markov property, we have
1 2 E H(Ψs ) FrΨ = E H q (s)Zτ (s)−τ (y) |Bτ (y) |2 FrΨ 2 1 2 Zτ (r)−τ (y) (|Bτ (y) |2 ) q (s)Xτ (s)−τ (r) H , = EZ 2 which implies (12.64) when y ≤ r < s. (We introduce X and PZa (·) to make it clear that we are not taking expectations with respect to B.) In a similar manner we can check that (12.64) holds for all r < s. The problem with the above proof, from the perspective of this book, is that we do not really cover Bessel processes or use stochastic differential equations. We give another proof of Theorem 12.4.1, based on some simple computations involving the isomorphisms in Theorem 12.1.2, which follows easily from the next lemma. Let G be a mean zero Gaussian process with covariance u(r, s) as in (12.1). For z ≤ w we define q(w) u(w, z) = u(z, z) q(z) u2 (w, z) = EG2w,z = u(w, w) − u(z, z) 2 1 λ 2 G = = E exp . 1 − λσ 2 2 w,z
c = σ2 K = Kλ
x
(12.65)
Lemma 12.4.2 Let {L∞ , x ∈ I} be as in Theorem 12.1.2. Let x1 < . . . < xn = z ≤ w and let H be a bounded continuous function on Rn . · x1 xn Let Hn (L∞ ) := H(L∞ , . . . , L∞ ). For all λ ∈ C sufficiently small the following hold:
12.4 Markov property of local times of diffusions
545
If x ≤ y ≤ z ≤ w, w · z · E x,y exp λL∞ Hn (L∞ ) = E x,y exp c2 λKλ L∞ Hn (L∞ ) . (12.66) If x ≤ z ≤ w ≤ y, w · z · E x,y exp λL∞ Hn (L∞ ) = Kλ E x,y exp c2 λKλ L∞ Hn (L∞ ) . (12.67) If z ≤ w ≤ x ≤ y, w · E x,y exp λL∞ Hn (L∞ )1{Lz =0} (12.68) ∞ z · = Kλ2 E x,y exp c2 λKλ L∞ Hn (L∞ )1{Lz =0} ∞
and
w · (12.69) E x,y exp λL∞ Hn (L∞ )1{Lz =0} ∞ 1 − F (w) + (F (w) − F (z))Kλ · = E x,y Hn (L∞ )1{Lz =0} . ∞ 1 − F (z)
We prove this lemma after we show how it immediately implies Theorem 12.4.1. Proof of Theorem 12.4.1 using Theorem 12.1.2 Fix x < y and let z ≤ w. To establish the Markov property we show that for every function f ∈ S, the Schwarz space of rapidly decreasing functions on R+ , we can find a bounded continuous function f¯ possibly depending on z and w, such that for any finite sequence of points x1 < . . . < xn = z ≤ w, w · z · (12.70) E x,y f (L∞ )Hn (L∞ ) = E x,y f¯(L∞ )Hn (L∞ ) . This implies that w r w z E x,y f (L∞ ) | σ(L∞ ; r ≤ z) = E x,y f (L∞ ) | σ(L∞ )
(12.71)
for every function f ∈ S and hence for all bounded measurable functions f , by the conditional Dominated Convergence Theorem. This establishes the Markov property. Refer to Lemma 12.4.2. It is easy to see that both sides of equations (12.66)–(12.69) are analytic in λ in the region Re λ < δ for some δ > 0. Therefore, they hold for λ purely imaginary. This gives us (12.70) for f (x) = eipx for any real p and also an explicit formula for f¯. This leads easily to (12.70) for any f ∈ S and completes the proof of Theorem 12.4.1 using Theorem 12.1.2.
Local times of diffusions
546
In preparation for the proof of Lemma 12.4.2, we make the following ¯ = (λ1 , . . . , λn ) and definitions. Let λ
n 1 def 2 (12.72) λi Gxi , BG = exp 2 i=1
n 1 def 2 BG,x = exp λi Gxi ,x 1{xi ≤x} , (12.73) 2 i=1
n 1 def 2 λi Gxi ,y 1{xi ≥y} , (12.74) BG,y = exp 2 i=1
n def ¯ = exp λi Lx∞i . ΠL = ΠL (λ) (12.75) i=1
We next note the following simple lemma. Lemma 12.4.3 Let x1 , . . . , xn < z ≤ w. Then λ 2 λ 2 E BG exp Gw c KG2z = K 1/2 E BG exp . 2 2 Proof
(12.76)
We write Gw = Gw,z + c Gz .
(12.77)
Since Gw,z is independent of Gxi , i = 1, . . . , n by Lemma 12.1.1 (1) and EG2w,z = σ 2 , we see from Lemma 5.2.1 that 2 2 λ2 c2 σ 2 G2z λ 2 λc Gz 1/2 E BG exp G + , = K E BG exp 2 w 2 2(1 − λσ 2 ) (12.78) from which we get (12.76). Proof of Lemma 12.4.2 We begin with (12.67). Using the same analyticity arguments as in the proof of Theorem 12.4.1, we see that in order to prove (12.67), it suffices to show that for all λ, λ1 , . . . , λn ∈ C sufficiently small, ¯ exp λLw ¯ exp c2 λKλ Lz = Kλ E x,y ΠL (λ) . E x,y ΠL (λ) ∞ ∞ (12.79) Since x ≤ z ≤ w ≤ y, 1{xi ≥y} , 1{w≥y} , and 1{w≤x} are all 0. Using this, Theorem 12.1.2, Lemma 12.4.3, and Theorem 12.1.2 again, we get w (12.80) E x,y EG,G¯ ΠL BG,x BG,x ¯ exp λL∞ 2 2 ¯ G Gw = EG,G¯ BG BG¯ exp λ + w 2 2
12.4 Markov property of local times of diffusions 2 ¯ 2 G Gz + z = KEG,G¯ BG BG¯ exp λc2 K 2 2 z x,y 2 = KE EG,G¯ ΠL BG,x BG,x , ¯ exp λc KL∞
547
from which we get (12.79), by canceling EG,G¯ BG,x BG,x from the first ¯ and last terms in (12.80), and consequently (12.67). (Actually, Lemma 12.4.3 is proved for moment generating functions. It is easy to see that it is also valid when λ, λ1 , . . . , λn are complex, as long as |Re λ| is sufficiently small. In the remainder of this proof we use other lemmas proved for real λ, λ1 , . . . , λn in this way.) We now prove (12.66). It follows from Theorem 12.1.2, Lemma 12.4.3, and the fact that BG,x and BG,y are independent that
¯ 2w,y G2w,y G x,y w E EG,G¯ ΠL BG,x BG,x + ¯ BG,y BG,y ¯ exp λ L∞ + 2 2 2 2 ¯ G Gw + w (12.81) = EG,G¯ BG BG¯ exp λ 2 2 2 ¯2 G Gz + z = KEG,G¯ BG BG¯ exp λc2 K 2 2 = KE x,y ΠL exp λc2 KLz∞
2
G2z,y 2 2 BG,y EG exp λc K (EG BG,x ) , 2 where, at the last step, we use Theorem 12.1.2 again. Also, obviously, ¯ 2 G G2w,y w,y w + E x,y EG,G¯ ΠL BG,x BG,x ¯ BG,y BG,y ¯ exp λ L∞ + 2 2 G2 2 w,y 2 B E exp λ = E x,y ΠL exp (λLw ) (E B ) . G G,x G G,y ∞ 2 (12.82) We claim that
EG
G2w,y BG,y exp λ 2
=K
1/2
EG
G2z,y BG,y exp λc K 2 2
.
(12.83) Substituting this in the right-hand side of (12.82) and comparing it to the last line of (12.81) gives (12.66). To establish (12.83), note that by Lemma 12.1.1, Gw,y = Gw,z + c Gz,y
(12.84)
548
Local times of diffusions
and that Gw,z is independent of both Gz,y and BG,y ∈ G≤z . Using (12.84) in place of (12.77) and continuing with the proof of Lemma 12.4.3, we get (12.83). To obtain (12.68) we use the additivity of local time, L·∞ = L·Tz + · L∞ ◦ θTz , and the strong Markov property to obtain E x,y ΠL exp (λLw ∞ ) 1{Tz <∞} z,y ΠL exp (λLw = E x,y exp λLw Tz 1{Tz <∞} E ∞ ) (12.85) since, in the first passage from x to z, ΠL = 1. It follows from (12.58)– (12.60) and Corollary 12.1.3 that E x,y exp λLw = E x,y exp λLw F (z) (12.86) Tz 1{Tz <∞} Tz = KF (z). Furthermore, since z ≤ w ≤ y, we can use (12.67) to get z,y E z,y ΠL exp (λLw ΠL exp c2 λKLz∞ . ∞ ) = KE
(12.87)
Using (12.85)–(12.87) we get 2 z,y E x,y ΠL exp (λLw ΠL exp c2 λKLz∞ . ∞ ) 1{Tz <∞} = K F (z)E (12.88) Also, by (12.85) and (12.86) with w replaced by z and λ replaced by c2 λK, we see that E x,y ΠL exp c2 λKLz∞ 1{Tz <∞} (12.89) 2 2 x,z z z,y z exp c λKLTz F (z)E ΠL exp c λKL∞ . =E Recognizing that the first expectation to the right of the equality sign in (12.89) is equal to 1 (since LzTz = 0), we can substitute (12.89) into (12.88) to get (12.68). To obtain (12.69) we note that xi z E x,y (exp(λLw ∞ )|L∞ , i = 1, . . . , n; L∞ = 0)
(12.90) = E x,y (exp (λLw ∞ ) |z ≤ J) P x,y (w ≤ J) + E x,y (exp(λLw ), z ≤ J ≤ w) ∞ . = P x,y (z ≤ J) By (12.31), P x,y (z ≤ J) = 1 − F (z) and by (12.62), E x,y (exp(λLw ∞ ), z ≤ J ≤ w)
= =
F (w) − F (z) det(I − Σw Λ) det(I − Σz Λ) (F (w) − F (z))K (12.91)
12.5 Local limit laws for h-transforms of diffusions
549
since Σw = 0 and det(I − Σz Λ) = 1/K. Consequently, E x,y (exp(λLw )|Lxi , i = 1, . . . , n; Lz∞ = 0) ∞ ∞ 1 − F (w) + (F (w) − F (z))Kλ . = 1 − F (z)
(12.92)
·
Multiplying by Hn (L∞ )1{Lz =0} and taking the expectation, we get ∞ (12.69).
12.5 Local limit laws for h-transforms of diffusions We obtain a version of Theorem 9.5.25 for the first hitting time of 0 a diffusion that is conditioned to die at its last exit from 0. Let X be a transient regular diffusion as in Section 12.1, with 0-potential density u(r, s) and local times {Lrt ; (r, t) ∈ R1 × R+ }. Recall that uT0 (r, s) := E r (LsT0 ) = u(r, s) −
u(r, 0)u(s, 0) . u(0, 0)
(12.93) r
Theorem 12.5.1 Let X h0 be the h0 -transform of X and let {Lt ; (r, t) ∈ R1 × R+ } denote its local times. For y > 0 we have x
lim sup x↓0
LT0 =1 uT0 (x, x) log log(1/uT0 (x, x))
P y,0 a.s.
(12.94)
Proof Let Gr,0 denote the Gaussian process with covariance uT0 (r, s). This theorem follows from Corollary 12.1.3 and the fact that 2
G2r,0 + Gr,0 lim sup =1 2uT0 (r, r) log log(1/r) t↓0
a.s.,
(12.95)
2
where Gr,0 is an independent copy of G2r,0 . Here we used the fact that 2 G2r,y Gr,y the term 1{r≥y} in (12.10) contributes nothing when we 2 + 2 take the limsup at 0. It is easy to see that (12.95) holds. By (12.20), law
{Gr,0 ; r ≥ 0} = {q(r) (B (τ (r) − τ (0))) ; r ≥ 0}.
(12.96)
Using Lemma 9.5.24 and the fact that uT0 (r, r) = q 2 (r) (τ (r) − τ (0)), we get (12.95). Example 12.5.2 For standard Brownian motion, uT0 (x, x) = 2x; see Lemma 2.5.1. Thus, for Brownian motion, (2.217) follows from Theorem 9.5.25. If we apply Theorem 12.5.1 to the h0 -transform of Brownian
550
Local times of diffusions
motion killed at the end of an independent exponential time with mean 1/2, we get uT0 (x, x) = 1 − e−2x ; see Section 2.8 and use (12.93). Thus we get the same result in (12.94) that we get in (9.130) for standard Brownian motion.
12.6 Notes and references Most of the results in this chapter are taken from Marcus and Rosen (2003). They are mostly consequences of Theorem 12.1.2, which we refer to as Ray’s Theorem. Ray’s original version of this result, which is given in Theorem 12.3.2, has been studied by Williams (1974), Sheppard (1985), Biane and Yor (1988), and Eisenbaum (1994). In some of these r works the description of {L∞ ; r ∈ I} looks quite different from the one given by Ray, as does ours in Theorem 12.1.2. We show in Section 12.4 that Theorem 12.4.1, which states that the total accumulated local time of a transient diffusion under P x,y is a Markov process in the spatial variable, is an immediate consequence of Theorem 12.2.1. In Ray (1963) and Sheppard (1985), this property is proved by computing the explicit conditional expectations that define the Markov property. In Theorem 12.4.2, we use Theorem 12.1.2 to simplify the computations in these papers. (To see that our (12.69) is the same as Sheppard (1985, (2.19)) note that q 2 (x)(τ (x) − τ (z)) = u(x, x) − (u2 (x, z)/u(z, z).) See Walsh (1983) for a proof of Theorem 12.4.1 using excursion theory.
13 Associated Gaussian processes
In Section 7.4 we develop and exploit some special properties of Gaussian processes that are associated with Borel right processes. In this chapter we consider the question of characterizing associated Gaussian processes. In order to present these results in their proper generality, we must leave the familiar framework of Borel right processes and consider local Borel right process, which are introduced in the final sections of Chapter 4. The reader should note that this is the first place in this book, after Chapter 4, that we mention local Borel right processes. We remind the reader that Borel right processes are local Borel right processes, and for compact state spaces, there is no difference between local Borel right processes and Borel right processes. Let S be a locally compact space with a countable base. A Gaussian process {Gx ; x ∈ S} is said to be associated with a strongly symmetric transient local Borel right process X on S, with reference measure m, if the covariance Γ = Γ(x, y) = E(Gx Gy ) is the 0-potential density of X for all x, y ∈ S. Not all Gaussian processes are associated. It is remarkable that some very elementary observations about the 0-potential density of a strongly symmetric transient Borel right process show what is special about associated Gaussian processes. One obvious condition is that Γ(x, y) ≥ 0 for all x, y ∈ S, since the 0-potential density of a strongly symmetric transient Borel right process is nonnegative (see Remark (3.3.5)). Also, since Γ is the 0-potential density of X, it follows from (3.109) that Γ(x, y) = P x (Ty < ∞)Γ(y, y).
(13.1)
Obviously this holds with x and y interchanged. Consequently, Γ(x, y) ≤ Γ(x, x) ∧ Γ(y, y).
(13.2)
Furthermore, since X is transient, it follows from Lemma 3.6.12 that we 551
Associated Gaussian processes
552
cannot have both P x (Ty < ∞) = 1 and P y (Tx < ∞) = 1. This means that Γ(x, y) < Γ(x, x) ∨ Γ(y, y).
(13.3)
Therefore, the 2 × 2 covariance matrix Γ(x, x) Γ(x, y) Γ= Γ(x, y) Γ(y, y) is invertible and Γ
−1
=
1 det Γ
Γ(y, y) −Γ(x, y) −Γ(x, y) Γ(x, x
(13.4) .
(13.5)
−1
Thus we see that the off-diagonal terms in Γ are negative. Further−1 more, by (13.2), the row sums of Γ are positive, and, as we just noted, to show that Γ is invertible, at least one of the row sums is strictly positive. Generalizations of these conditions on the covariance matrix of the associated Gaussian process to any p points in the state space of X lead to necessary and sufficient conditions for a Gaussian process to be associated with a local Borel right process. Here is another precursor of the material covered in this chapter. Let g be a mean zero normal random variable. It is easy to see that g 2 is infinitely divisible. This is because the Laplace transform of g 2 is (1 + 2λEg 2 )−1/2 , and it is easy to check that (1 + 2λEg 2 )−1/(2n) is a completely monotone function of λ. We show in Section 13.2 that all mean zero Gaussian vectors in R2 also have infinitely divisible squares and that this is no longer true in Rp , for p ≥ 3. Let G = {G(x), x ∈ S} be a mean zero Gaussian process. There is an intimate connection between G having infinitely divisible squares and G being an associated Gaussian process.
13.1 Associated Gaussian processes Let A = {ai,j }1≤i,j≤n be an n × n matrix. If ai,j ≥ 0 for all i, j, we say that A is a positive matrix and write A ≥ 0. If ai,j ≤ 0 for all i = j, we n say that A has negative off-diagonal elements. If j=1 ai,j ≥ 0 for all i = 1, . . . , n, we say that A has positive row sums. Theorem 13.1.1 Let S be a locally compact space with a countable base. Let G = {Gx ; x ∈ S} be a Gaussian process with continuous covariance Γ(x, y). G is associated with a strongly symmetric transient
13.1 Associated Gaussian processes
553
local Borel right process X on S if and only if, for every finite collection x1 , . . . , xn ∈ S, the matrix Γ = {Γ(xi , xj )}1≤i,j≤n is invertible and Γ−1 has negative off-diagonal elements and positive row sums. Theorem 13.1.1 is an immediate consequence of the following theorem: Theorem 13.1.2 Let S be a locally compact space with a countable base. Let Γ(x, y) be a continuous symmetric function on S × S. The following are equivalent: (1) Γ(x, y) is the 0-potential density of a strongly symmetric transient local Borel right process X on S. (2) For every finite subset S ⊆ S, Γ(x, y) restricted to S × S is the 0-potential density of a strongly symmetric transient Borel right process X on S . (3) For every finite collection x1 , . . . , xn ∈ S, the n × n matrix Γ = {Γ(xi , xj )}1≤i,j≤n is invertible and Γ−1 has negative off-diagonal elements and positive row sums. Proof
(1) ⇒ (3) We define the following stopping time: σ = inf{t ≥ 0 | Xt ∈ {x1 , . . . , xp } ∩ {X0 }c }
(13.6)
(note that σ may be infinite). Let {Lxt ; (x, t) ∈ S×R1 } be the local times of X. Since Γ(xi , xj ) is the 0-potential density of X, we can normalize the local time so that Γ(xi , xj ) = E xi (Lx∞j ) .
(13.7)
This result is proved in (3.108) for Borel right processes. We leave to the reader the verification that it also holds for local Borel right processes. The key point is that whenever we use the strong Markov property in the steps leading to (3.108), we can just as well use the local strong Markov property, Lemma 4.8.5, which is satisfied by local Borel right processes.. Using (13.7) and the local strong Markov property (Lemma 4.8.5), we see that Γ(xi , xj ) = E xi (Lxσj ) + E xi E Xσ (Lx∞j ); σ < ∞ (13.8) p hi,k Γ(xk , xj ), = bi,j + k=1 x E xi (Lσj )
and hi,k = P xi (Xσ = xk ). where bi,j = Let B = {bi,j }1≤i,j≤p and H = {hi,j }1≤i,j≤p . We can write (13.8) as Γ = B + HΓ
(13.9)
Associated Gaussian processes
554
so that (I − H)Γ = B. Moreover, B is a diagonal matrix with all its diagonal elements strictly positive. This follows because, starting from X0 = xi , σ > 0, which implies that each bi,i > 0. On the other hand, the process is killed the first time it hits any xj = xi . Thus, starting x from xi , Lσj = 0, j = i. Since B is invertible, both (I − H) and Γ are invertible and Γ−1 = B −1 (I − H).
(13.10)
It is clear that H ≥ 0. It follows from this that Γ−1 has negative off diagonal elements. Furthermore, p
hi,j = P xi (σ < ∞) ≤ 1
∀ i = 1 . . . , p,
(13.11)
j=1
from which it follows that Γ−1 has positive row sums. (3) ⇒ (2) To begin, we show that an n × n matrix that has negative off-diagonal elements and positive row sums can be used to construct a strongly symmetric Borel right process on S = {1, 2, . . . , n}. Let A = {ai,j }1≤i,j≤n be such a matrix. Let Pt (i, j) = {e−tA }i,j ,
(13.12)
that is, Pt (i, j) =
∞ (−t)k Aki,j k=0
(13.13)
k!
with A0 := I. We write this compactly as Pt = e−tA . We show that {Pt ; t ≥ 0} is a sub-Markov semigroup on (S, B). The semigroup property is clear. Furthermore, since A has negative off-diagonal elements, if we take c to be the largest diagonal element of A, then B := cI − A ≥ 0. Hence Pt = e−tA = e−tc etB ≥ 0.
(13.14)
To complete the proof that Pt is a sub-Markov semigroup on S, we must show that Pt (x, S) ≤ 1 for all x ∈ S. This comes down to showing that n
Pt (i, j) =
j=1
Let hi (t) =
−tA }i,j . j=1 {e
= −
{e−tA }i,j ≤ 1
∀ i ∈ S.
(13.15)
j=1
n
hi (t)
n
n j=1
Then
{e−tA A}i,j = −
n n j=1 p=1
{e−tA }i,p Ap,j
(13.16)
13.1 Associated Gaussian processes = −
n
{e−tA }i,p
p=1
n
555
Ap,j .
j=1
n Note that j=1 Ap,j ≥ 0, since A has positive row sums. It follows from (13.14) and (13.16) that hi (t) ≤ 0. Since hi (0) = 1, this gives us (13.15). It is easy to see that {Pt ; t ≥ 0} is a strongly continuous contraction semigroup on C(S). Here we are identifying C(S) with Rn . Properties (2) and (3) on page 122 follow from (13.14)–(13.15) and (13.13), respectively. Therefore, we can use Theorem 4.1.1 to construct a Feller process X on S with transition semigroup {Pt ; t ≥ 0}. If A is symmetric, then clearly {Pt ; t ≥ 0} is also symmetric. In this case, X is a strongly symmetric Borel right process. Here we use counting measure on S as the reference measure. We now show that if A is invertible, then X is transient and A−1 is the 0-potential of X. For any α > 0, ∞ ∞ e−αt Pt dt = e−αt e−tA dt = (α + A)−1 . (13.17) Uα = 0
0
To see this, note that since {Pt ; t ≥ 0} is a contraction semigroup, the integrals converge. Therefore, ∞ ∞ −αt −tA (α + A) e e dt = (α + A)e−t(α+A) dt (13.18) 0 0 ∞ d −t(α+A) e dt = I. = − dt 0 Taking the limit in (13.17) as α → 0, we see that U 0 = A−1 . Therefore, X is transient. (Strictly speaking, in Lemma 3.6.11 we show that a process is transient if the 0-potential density exists. But note that on a finite state space, with the counting measure as the reference measure, the 0-potential and the 0-potential density are the same.) (3) ⇒ (2) now follows on identifying x1 , . . . , xn with {1, 2, . . . , n} and the matrix Γ−1 with the matrix A. (2) ⇒ (1) This is the crux of the theorem and the reason that we must consider local Borel right processes rather than Borel right processes. The proof of this implication is Theorem 4.10.1. In preparation for Section 13.3, we continue to explore some properties of Borel right processes with finite-dimensional state spaces. Let {Pt , ; t ≥ 0} be a right continuous semigroup of n × n matrices with Pt ≥ 0 and P0 = I. (Note that when n = 1, this is equivalent to having a function f (t) ≥ 0 satisfying f (t + s) = f (t)f (s) for all s, t ≥ 0.)
556
Associated Gaussian processes
As in the case n = 1, one can verify that Pt = e−tA for some n×n matrix A. The matrix −A is called the generator of the semigroup {Pt , ; t ≥ 0}. If X is a Borel right processes with semigroup {Pt , ; t ≥ 0}, we also say that −A is the generator of X. We formulate our results in terms of A rather than the generator −A, because our focus is on potentials, and as we see in the proof of Theorem 13.1.1, if A is invertible, A−1 is the 0-potential of X. n Let A = {ai,j }1≤i,j≤n be an n × n matrix. If j=1 ai,j = 0 for all i = 1, . . . , n, we say that A has zero row sums. Remark 13.1.3 Note that a matrix A with positive row sums that is invertible cannot have zero row sums, that is, at least one of its row sums is strictly positive. This is obvious because if all the row sums of A are equal to 0, A1t = 0, where 1 denotes the vector in Rn with all its elements equal to 1. This is impossible if A is invertible. Therefore the matrix Γ−1 in Theorems 13.1.1 and 13.1.2 has at least one row sum that is strictly positive. Consequently, since the off-diagonal elements of Γ−1 are negative, there is at least one row of Γ−1 in which the diagonal element is greater than the sum of the absolute values of the off-diagonal elements. Such a matrix is sometimes said to be “weakly diagonally dominant.” (3) ⇒ (2) of Theorem 13.1.2 also implies that Γ is positive and positive definite, since U 0 is a zero potential. Furthermore, since it is invertible, it must be strictly positive definite. Thus we see that an invertible matrix (Γ−1 ) with negative off-diagonal elements and positive row sums is weakly diagonally dominant and has a positive, strictly positive definite inverse. Lemma 13.1.4 Let A be an n × n matrix. Pt = e−tA is the semigroup of a Borel right process X on S = {1, 2, . . . , n} if and only if A has negative off-diagonal elements and positive row sums. In this case, the α-potential U α = (α + A)−1
∀ α > 0.
(13.19)
Furthermore, A is symmetric if and only if X is symmetric. If, in addition, A has zero row sums, then X is recurrent. A is invertible if and only if X is transient, in which case U 0 = A−1 . (In this case, at least one of the row sums of A is strictly positive.) Proof Suppose that A has negative off-diagonal elements and positive row sums. The fact that Pt = e−tA is the semigroup of a Borel right
13.1 Associated Gaussian processes
557
process X on S = {1, 2, . . . , n} is contained in the proof of (3) ⇒ (2) of Theorem 13.1.2. Assume now that Pt = e−tA is the semigroup of a Borel right process X on {1, 2, . . . , n}. Since Pt (i, j) = {e−tA }i,j = 1{i=j} − Ai,j t + O(t2 )
as t → 0,
(13.20)
we immediately see from the positivity of Pt (i, j) that A must have negative off-diagonal elements. Summing over j we see that Pt (i, S) = 1 −
n
Ai,j t + O(t2 )
as t → 0,
(13.21)
j=1
n and since Pt (i, S) ≤ 1, we must have j=1 Ai,j ≥ 0 for each i = 1, . . . , n. Thus A has positive row sums. The representation of U α in (13.19) is given in (13.17), and, of course, the statement about symmetry is immediate. n Suppose that A also has zero row sums. Set hi (t) = j=1 {e−tA }i,j . We see from (13.16) that hi (t) ≡ 0. Consequently, Pt (i, S) = P0 (i, S) = 1 for all i ∈ S. Let U be the 0-potential of X. Then ∞ n U (i, j) = U (i, S) = Pt (i, S) dt = ∞ ∀i ∈ S. (13.22) 0
j=1
Using (3.68) we see that U (i, i) = ∞ for all i, which implies that X is recurrent. Finally, we know that X is transient if and only if U 0 exists. Taking the limit as α → 0 in (13.19), we see that this is equivalent to A being invertible and equal to U 0 . Let A be a symmetric n × n matrix that has negative off-diagonal elements and positive row sums. By Lemma 13.1.4 there exists symmetric Borel right process X on S = {1, 2, . . . , n} with semigroup Pt = e−tA . We show here how the elements of A are related to the probabilistic structure of X. Define the stopping time σ = inf{t ≥ 0 | Xt ∈ S ∩ {X0 }c }
(13.23)
(note that σ may be infinite). Since σ = s + σ ◦ θs on {σ > s}, it follows from the strong Markov property that P i (σ > t + s)
= P i (σ ◦ θs > t, σ > s) i
= P (P
Xs
(13.24) i
i
(σ > t); σ > s) = P (σ > t)P (σ > s),
where the last equality uses the fact that if σ > s, then Xs is still at its initial position. Since P i (σ > s) is right continuous, we see that σ is
Associated Gaussian processes
558
an exponential random variable. Therefore, P i (σ > s) = e−qi s for some qi ∈ R+ . Let hi,j = P i (Xσ = j)
(13.25)
and note that hi,i = 0 for all i. Lemma 13.1.5 qi = Ai,i ≥ 0 and, for i = j
hi,j =
0 Ai,j − Ai,i
(13.26)
Ai,i = 0 (13.27)
Ai,i > 0.
In the next paragraph we show that as t → 0,
Proof
P i (Xt = j) = qi hi,j t + O(t2 )
i = j
(13.28)
and P i (Xt = i) = 1 − qi t + O(t2 ).
(13.29)
Using these and (13.20) we obtain (13.26) and (13.27), We now obtain (13.28) and (13.29). When qi = 0 this is obvious, so we can assume that σ < ∞ almost surely. Let σ = σ ◦ θσ . By the Markov property, conditional on Xσ , we see that σ and σ are independent exponential random variables. We have P i (Xt = j) = P i (σ > t; Xt = j) + P i (σ ≤ t < σ + σ ; Xt = j) + P i (σ + σ ≤ t; Xt = j).
(13.30)
P i (σ > t; Xt = j) = e−qi t 1{i=j}
(13.31)
Clearly,
and P i (σ + σ ≤ t; Xt = j) ≤ P i (σ + σ ≤ t) = O(t2 ).
(13.32)
Also, when i = j P i (σ ≤ t < σ + σ ; Xt = j)
= P (σ ≤ t < σ + σ ; Xσ = j) = qi qj e−qi r e−qj s dr dsP i (Xσ = j) i
{r≤t
qi −qj t = e − e−qi t Pi,j = qi hi,j t + O(t2 ). q i − qj
(13.33)
13.1 Associated Gaussian processes
559
Using (13.31)–(13.33) in (13.30), we get (13.28) and (13.29). Lemma 13.1.6 Let X be a symmetric Borel right process on S = {1, 2, . . . , n} with semigroup Pt = e−tA (A is an n × n symmetric matrix with elements {Ai,j }1≤i,j≤n ). Let X be the symmetric Borel right process on S = {1, 2, . . . , n − 1} obtained by killing X the first time it hits {n}. Then X has semigroup Pt = e−tA , where A is the symmetric (n − 1) × (n − 1) matrix with elements {Ai,j }1≤i,j≤n−1 . Proof That X is a symmetric Borel right process follows from Theorem 4.5.2. We show below that (13.28)–(13.29) hold when P i (Xt = j) is replaced by P i (Xt = j) for i, j ∈ S . Therefore, by Lemma 13.1.5, Pt = e−tA is the semigroup for X . Note that P i (Xt = j) = P i (Xt = j, Tn > t).
(13.34)
= P i (σ > t; Xt = j, Tn > t)
(13.35)
Therefore, P i (Xt = j)
+ P (σ ≤ t < σ + σ ; Xt = j, Tn > t) i
+ P i (σ + σ ≤ t; Xt = j, Tn > t). It is easy to see that for i, j = n, P i (σ > t; Xt = j, Tn > t) = P i (σ > t; Xt = j) = e−qi t 1{i=j} (13.36) and P i (σ + σ ≤ t; Xt = j, Tn > t) ≤ P i (σ + σ ≤ t) = O(t2 ).
(13.37)
Also, the condition σ ≤ t < σ + σ means that there is precisely one jump up to time t. Therefore, as long as i, j = n, we have P i (σ ≤ t < σ + σ ; Xt = j, Tn > t) = P i (σ ≤ t < σ + σ ; Xt = j). (13.38) Comparing (13.35)–(13.38) with (13.30)–(13.32), we see that (13.28)– (13.29) hold when P i (Xt = j) is replaced by P i (Xt = j) for i, j ∈ S . In the next example we use Theorem 13.1.2 to obtain a nice smoothness condition for the increments variance of an associated Gaussian process that has stationary increments. Example 13.1.7 Let G := {G(x), x ∈ S} be an associated Gaussian process with covariance matrix u(x, y) = u(x − y). Let σ 2 (x − y) :=
Associated Gaussian processes
560
E(G(x) − G(y))2 . Consider G := (G(x1 ), G(x2 ), G(x3 )). By Theorem 13.1.2, the inverse of the covariance matrix of G has negative entries off the diagonal. Considering the (1, 3) entry of the inverse, we get u(x1 , x2 )u(x2 , x3 ) ≤ u(x2 , x2 )u(x1 , x3 ).
(13.39)
Since σ 2 (x) = 2(u(0) − u(x)), this gives σ 2 (x3 − x1 ) ≤ σ 2 (x2 − x1 ) + σ 2 (x3 − x2 ) −
σ 2 (x3 − x2 )σ 2 (x2 − x1 ) . 2 (13.40)
Set y = x3 − x1 and x = x2 − x1 ; we get σ 2 (y − x)σ 2 (x) 2u(0) 2 σ (y − x)u(x) . u(0)
σ 2 (y) − σ 2 (x) ≤ σ 2 (y − x) − =
(13.41)
In particular, |σ 2 (y) − σ 2 (x)| ≤ σ 2 (y − x).
(13.42)
This is stronger than the result in Lemma 7.4.2 and can be used to simplify some arguments in Section 7.4. (To appreciate the significance of this result, see (7.254).) Clearly this observation applies to the α-potential density of L´evy processes (see (4.84)) and compliments (7.232), which applies to L´evy processes killed the first time they hit zero (see Lemma 4.2.4).
13.2 Gaussian precesses with infinitely divisible squares In this section we show that associated Gaussian processes have infinitely divisible squares. In Section 13.3 we will show that, properly formulated, this property characterizes associated Gaussian processes. Let G = (G1 , . . . , Gp ) be an Rp -valued Gaussian random variable. G is said to have infinitely divisible squares if G2 := (G21 , . . . , G2p ) is infinitely divisible, that is, for any n we can find an Rp -valued random vector Zn such that law
G2 = Zn,1 + · · · + Zn,n ,
(13.43)
where {Zn,j }, j = 1, . . . , n are independent identically distributed copies of Zn . The Gaussian process G = {Gx , x ∈ S} is said to have infinitely
13.2 Infinitely divisible squares
561
divisible squares if, for every finite collection x1 , . . . , xp ∈ S, the Rp valued random variable (Gx1 , . . . , Gxp ) has infinitely divisible squares. We also express this by saying that G2 is infinitely divisible. Let A = {ai,j }1≤i,j≤n be an n × n matrix. We call A a positive matrix and write A ≥ 0 if ai,j ≥ 0 for all i, j. The matrix A is said to be an M -matrix if (1) ai,j ≤ 0 for all i = j. (2) A is nonsingular and A−1 ≥ 0. A diagonal matrix is called a signature matrix if its diagonal entries are either one or minus one. The following theorem characterizes Gaussian processes with infinitely divisible squares. Theorem 13.2.1 Let G = (Gx1 , . . . , Gxp ) be a mean zero Gaussian random variable with strictly positive definite covariance matrix Γ = {Γi,j } = {E(G(xi )G(xj ))}. Then G2 is infinitely divisible if and only if there exists a signature matrix N such that N Γ−1 N is an M -matrix. Proof For a p × p matrix A set |A| = det A. It follows from Lemma 5.2.1 that, for s1 , . . . , sp ∈ [0, 1),
p 1 1 2 E exp − a(1 − si )G (xi ) = 2 i=1 |I + Γa(I − S)|1/2 := P (s)
(13.44)
for all a > 0, where s = (s1 , . . . , sp ) and S is a diagonal matrix with diagonal entries s1 , . . . , sp . Assume that G2 is infinitely divisible. Let Yn be an Rp -valued random vector for which law
G2 = Yn,1 + · · · + Yn,n ,
(13.45)
where the Yn,j , j = 1, . . . , n, are independent identically distributed copies of Yn . Clearly we can take Yn ≥ 0. It follows from (13.44) that for each 1 ≤ j ≤ n,
p 1 E exp − a(1 − si )Yn,j (xi ) = P 1/n (s). (13.46) 2 i=1 Note that all the terms in the power series expansion of P 1/n (s) are
Associated Gaussian processes
562
positive. This follows since ∞ k 1 aYn,j (xi ) (asi Yn,j ) exp − a(1 − si )Yn,j (xi ) = exp − . k 2 2 2 k! k=0 (13.47) Obviously P (s) > 0, so we can take (13.48) log P (s) = lim n P 1/n (s) − 1 . n→∞
We show immediately below that log P (s) has a power series expansion. From (13.48) and the other observations above we see that, except for the constant term, all the terms in this expansion must be greater than or equal to zero. Let Q = I − (I + aΓ)−1 . Note that if λ is an eigenvalue of Γ, then aλ is an eigenvalue of Q, so that, in particular, all the eigenvalues 1 + aλ of Q are in [0, 1). Then P 2 (s)
= |I + aΓ − aΓS|−1 = |(I − Q)
−1
(13.49) −1
− ((I − Q) −1
= |I − Q||I − QS|
−1
− I)S|
.
Therefore, 2 log P (s)
log |I − Q| − log |I − QS| ∞ trace{(QS)n } . = log |I − Q| + n n=1
=
(13.50)
To understand the last term, let λ1 , . . . , λp be the eigenvalues of the symmetric matrix S 1/2 QS 1/2 . Considering the conditions on S and the fact that the eigenvalues of the symmetric matrix Q are all in [0, 1), we see that S 1/2 QS 1/2 ≤ Q ≤ 1 and (u, S 1/2 QS 1/2 u) = (S 1/2 u, QS 1/2 u) ≥ 0 for any u. It follows that all of the eigenvalues of S 1/2 QS 1/2 are also in [0, 1). Therefore, |I − QS| = |I − S 1/2 QS 1/2 | =
(13.51)
(1 − λ1 ) · · · (1 − λp ),
so that log |I − QS| = −
p ∞ λni . n i=1 n=1
(13.52)
Also, trace{(QS)n }
=
trace{(S 1/2 QS 12 )n }
(13.53)
13.2 Infinitely divisible squares =
p
563
λni .
i=1
Thus we get (13.50) and it is clear that log P (s) has a convergent power series. Let {i1 , . . . , ik } be any subset of {1, . . . , p} and denote the elements of Q by qj,k . We now show that for all k ≥ 2, qi1 ,i2 qi2 ,i3 · · · , qik−1 ,ik qik ,i1 ≥ 0.
(13.54)
This is trivial for k = 2 since the term in (13.54) is simply qi21 ,i2 (recall that Q is symmetric). Assume (13.54) holds for k = 3, . . . , m − 1. We show that it also holds when k = m. We first consider the case in which qij ,in = 0 when (ij , in ) = (ij , ij+1 ) or (im , i1 ). In this case, the expansion of trace{(QS)m } is considerably simplified and we see that qi1 ,i2 qi2 ,i3 · · · , qik−1 ,ik qim ,i1 is the coefficient of si1 · · · sim in trace{(QS)m }/m. Thus it is nonnegative because all the terms in the power series for log P ({s}), other than the constant term, are greater than or equal to zero. It remains to consider the cases in which there exists an (ij , in ) with qij ,in = 0, where either 2 ≤ j < m − 1 and j + 1 < n ≤ m or j = 1 and 2 < n < m. When this occurs we can write the product in (13.54), with k = m, as qi1 ,i2 qi2 ,i3 · · · qij−1 ,ij qij ,in qin ,in+1 · · · qim ,i1 ×qij ,ij+1 qij+1 ,ij+2 · · · qin−1 ,in qin ,ij ×
(13.55)
qi−2 . j ,in
(Note that we use qij ,in as a bridge between qij−1 ,ij and qin ,in+1 and place the removed terms on the second line in (13.55).) Thus we create two terms of the type (13.54), each containing at most m − 1 terms. Therefore, by the induction hypothesis, they both must be greater than or equal to zero. We can write −1 −1 (a I). (13.56) Q = I − a−1 I + Γ For an invertible matrix A, we use Ai,j to denote {A−1 }i,j . Also, let W = Γ−1 and set wi,j = {W }i,j . For i = j we have i,j −1 a . qi,j = − a−1 I + Γ
(13.57)
lim aqi,j = −wi,j
(13.58)
Consequently, a→∞
Associated Gaussian processes
564
Considering (13.54), we see that for all k ≥ 2 (−1)k wi1 ,i2 wi2 ,i3 · · · , wik−1 ,ik wik ,i1 ≥ 0.
(13.59)
We use (13.59) to show that there exists a signature matrix N such that N W N is an M -matrix. This is trivial when W has order p = 1. Suppose it holds when W has order p−1. Now let W have order p. Let U be the (p−1)×(p−1) matrix with entries Ui,j = Wi,j , i, j = 1, . . . , p−1. By the induction hypothesis, there exists a signature matrix E such that EU E is an M -matrix. To simplify the notation, we assume, without loss of generality, that U itself is an M -matrix. Let {1, . . . , p − 1} be partitioned into sets G1 , . . . , Gm . Let U [Gi , Gi ] denote the submatrix of U formed by rows and columns indexed by Gi . We choose the {Gi } so that each U [Gi , Gi ] is irreducible (see page 600). Note that U is the direct sum of U [Gi , Gi ], {1, . . . , m}. We claim that for all pairs j, l ∈ Gi , for each 1 ≤ i ≤ m, wj,p wl,p ≥ 0. By Lemma 14.9.3 there exists an r and different integers k1 , . . . , kr in {1, . . . , p−1} such that k1 = l, kr = j and the elements wk1 ,k2 , wk2 ,k3 , . . . , wkr−1 ,kr are all nonzero. This means they are all less than zero since U is an M -matrix. Now consider the cycle product wj,p wp,l wk1 ,k2 wk2 ,k3 · · · wkr−1 ,kr .
(13.60)
If wj,p wl,p < 0, one and only one of the terms in the product (13.60) is greater than zero, while the remaining r terms are less than zero. This contradicts (13.59), which implies that (−1)r+1 times the product (13.60) is nonnegative. This proves our claim. For i ∈ {1, . . . , m} we say that Gi is type 1 if wj,p ≥ 0 for all j ∈ Gi , and type 2 if wj,p ≤ 0 for all j ∈ Gi and wj,p < 0 for at least one j ∈ Gi . We define a p × p signature matrix D with diagonal {d1 , . . . , dp } as dj =
1
if j ∈ Gi and Gi is type 1
−1
if j ∈ Gi and Gi is type 2
−1
if j = p.
We now show that matrix DW D is an M -matrix. Using the fact that U is an M -matrix, it is easy to verify that {DW D}j,k ≤ 0 for j = k. Also, since U is an M -matrix and D2 = I, it follows from Lemmas 14.9.1 and 14.9.4 that the first p − 1 principle minors of DW D are strictly positive. Furthermore, since Γ is strictly positive definite, so is W . Therefore, det(DW D) > 0 also. Consequently, by Lemmas 14.9.1
13.2 Infinitely divisible squares
565
and 14.9.4 again, DW D is an M -matrix. This completes the “only if” part of this theorem. We step out of the proof for a while to make some observations about Laplace transforms. Let φ(λ) be the Laplace transform of a positive real-valued random variable X, that is, ∞ φ(λ) = e−λx dF (x) λ > 0, (13.61) 0
where F is the probability distribution function of X. Let s ∈ [0, 1) and consider φ(a(1−s)), with a > 0. Its power series, expanded about s = 0, is ∞ (−a)n φ(n) (a) n (13.62) φ(a(1 − s)) = s . n! n=0 It follows from (13.61) that φ is completely monotone (see, e.g., Feller (1971, XIII.4)). Consequently, the coefficients of the series in (13.62) are all positive. It is easy to check that φ(a(1 − e−λ/a )) is the Laplace transform of a discrete probability measure that puts mass (−a)n φ(n) (a)/n! on the point n/a, n = 0, 1, . . .. Furthermore, lim φ(a(1 − e−λ/a )) = φ(λ).
a→∞
(13.63)
Indeed, we can use these ideas in reverse to construct infinitely divisible Laplace transforms. Lemma 13.2.2 Let ψ : (R+ )n → (R+ )n be a continuous function. Let s ∈ (R+ )n and suppose that, for all a > 0 sufficiently large, log ψ(a(1 − s1 ), . . . , a(1 − sn )) has a power series expansion at s = 0 with all its coefficients positive, except for the constant term. Then ψ is the Laplace transform of an infinitely divisible random variable in (R+ )n . Proof We give the proof when n = 1. Exactly the same argument applies to random variables in (R+ )n , only the notation required to ∞ n mimic the proof is more complicated. Let n=0 bn (a)s denote the power series expansion of log ψ(a(1 − s)) at s = 0. Then ∞ n elog ψ(a(1−s)) = eb0 (a) e n=1 bn (a)s
(13.64)
has a power series expansion about zero with all its coefficients positive. Therefore, ψ(a(1 − e−λ/a )) is the Laplace transform of a discrete probability measure on the points k/a, k = 0, 1, . . .. It follows from the
566
Associated Gaussian processes
Extended Continuity Theorem (see Feller (1971, XIII.1, Theorem 2a)) that ψ(λ) = lim elog ψ(a(1−e
−λ/a
))
a→∞
(13.65)
is a Laplace transform. The same procedure applied to (log ψ(a(1 − s)))/n shows that ψ 1/n (λ) is a Laplace transform. Proof of Theorem 13.2.1 continued By Lemma 5.2.1, the Laplace transform of G2 is |I +ΓΛ|−1/2 . We want to show that this is the Laplace transform of an infinitely divisible random variable. Consistent with (13.44), set Pa (s) = |I + Γa(I − S)|−1/2 . Let {λi } denote the diagonal elements of Λ and let si = exp(−λi /a). (Recall that s = (s1 , . . . , sp ) and S is a diagonal matrix with diagonal entries s1 , . . . , sp .) Then lim Pa (s) = |I + ΓΛ|−1/2 .
a→∞
(13.66)
By Lemma 13.2.2 we see that, to show that G2 is infinitely divisible, it suffices to show that log Pa (s) has a power series expansion at zero, with all its coefficients positive, except for the constant term. By the hypothesis there exists a signature matrix D such that DW D is an M matrix. Therefore, by Lemma 14.9.4, DW D = λI − B,
(13.67)
where B ≥ 0 and λ is greater than the absolute value of any eigenvalue of B. Using (13.67) we have |I + Γa(I − S)| = |Γ(W + a(I − S))|
(13.68)
= |Γ||λI − B + aI − aS|
1
= |Γ|(λ + a) I − (B + aS) . λ+a Therefore, by (13.50), 2 log Pa (s) = − log |Γ| − log |λ + a| +
∞ trace{(B + aS)n } . n(λ + a)n n=1
(13.69)
Since B ≥ 0, the coefficients of the terms involving any combination of s1 , . . . , sp are all positive. Remark 13.2.3 Let G = (Gx1 , . . . , Gxp ) be a Gaussian vector with covariance matrix Γ. Let N be a signature matrix with diagonal ele := ments s(xi ). If N Γ−1 N is an M -matrix, N ΓN ≥ 0. Therefore, G 2 , (s(x1 )Gx1 , . . . , s(xp )Gxp ) has a positive covariance. Since G2 = G
13.2 Infinitely divisible squares
567
when considering Gaussian vectors with infinitely divisible squares we can always assume that the covariance of the vector is positive. There is nothing mysterious about the signature matrix in Theorem 13.2.1. It simply accounts for the fact that if G has an infinitely divisible for any choice of s(xi ) = ±1, i = 1, . . . , p. By square, then so does G, considering different signs for the s(xi ), one can see which configurations of covariance matrices with strictly negative terms are possible for Gaussian vectors with infinitely divisible squares. For example, when p = 3, the covariance matrix has either no strictly negative terms or exactly four strictly negative terms. We now give a remarkable property of associated Gaussian processes. Corollary 13.2.4 Let S be a locally compact space with a countable base. Let G = {Gx ; x ∈ S} be a Gaussian process with continuous covariance that is associated with a strongly symmetric transient Borel right process X on S. Then G has infinitely divisible squares. Proof By Theorem 13.1.2, for every finite set x1 , . . . , xn ∈ S, Γ = {Γ(xi , xj )}1≤i,j≤p is invertible and Γ−1 has negative off-diagonal elements. Also, since Γ(x, y) is the 0-potential of a strongly symmetric Borel right process, Γ ≥ 0. This follows from (3.65); see also Remark 3.3.5. Therefore, Γ is an M -matrix. Consequently, to apply Theorem 13.2.1 with N = I to show that G2 is infinitely divisible, it only remains for us to show that Γ is also strictly positive definite. It is a simple fact that an invertible symmetric positive definite matrix is strictly positive definite, since such a matrix has positive eigenvalues. Furthermore, the eigenvalues must be strictly positive since, if zero were an eigenvalue, the matrix would not be invertible. In Remark 13.2.3 we discussed the essential role that the signature matrices play in considering whether Gaussian vectors have infinitely divisible squares. Nevertheless, in Corollary 13.2.4 we can take the signature matrix to be the identity. This is generally the case for Gaussian processes with a continuous covariance defined on a reasonably nice space. A topological space S is said to be pathwise connected if, for any x, y ∈ S, we can find a continuous function r : [0, 1] → S with r(0) = x, r(1) = y. Corollary 13.2.5 Let S be a pathwise connected topological space. Let G = {Gx ; x ∈ S} be a Gaussian process with continuous covariance
Associated Gaussian processes
568
Γ = {Γ(x, y), x, y ∈ S} with the property that, for any x1 , . . . , xp , the covariance matrix Γ = {Γ}i,j = Γ(xi , xj ) is strictly positive definite. If −1 G has infinitely divisible squares, then Γ is an M-matrix. This holds for all x1 , . . . , xp and 1 ≤ p < ∞. Proof
We first show that Γ(x, y) ≥ 0
(13.70)
for all x, y ∈ S. Pick x, y ∈ S and a continuous function r : [0, 1] → S with r(0) = x, r(1) = y. By considering the Gaussian process {Gr(t) ; t ∈ [0, 1]}, it suffices to prove (13.70) with S = [0, 1]. We show below that for some function s : [0, 1] → {−1, 1}, s(x)s(y)Γ(x, y) = |Γ(x, y)|
∀ x, y ∈ [0, 1].
(13.71)
We show here that this implies (13.70). Since Γ is strictly positive definite for all x1 , . . . , xp ∈ S, it follows that Γ(x, x) > 0 for each x ∈ [0, 1]. Therefore, by continuity, Γ(x, y) > 0 for each x ∈ [0, 1] and all y in some neighborhood of x. Using the continuity of Γ(x, y) again, we see that (13.71) implies that s(y) is continuous. Consequently, s(y) is constant on [0, 1]. Using this in (13.71), we get (13.70). To construct the function s, let Dn = {k2−n ; 0 ≤ k ≤ 2n }. By Theorem 13.2.1 and the first three sentences of Remark 13.2.3, we can find a function sn : Dn → {−1, 1} with sn (x)sn (y)Γ(x, y) = |Γ(x, y)|
∀ x, y ∈ Dn .
(13.72)
For any x ∈ [0, 1], let rn (x) be the sum of the first n terms in the dyadic expansion of x. Clearly, rn (x) ∈ Dn . Then, using (13.72) and the continuity of Γ(x, y), we see that for any x, y ∈ [0, 1] with Γ(x, y) = 0, h(x, y) := lim sn (rn (x))sn (rn (y))
(13.73)
h(x, y)Γ(x, y) = |Γ(x, y)|.
(13.74)
n→∞
exists and We point out above that Γ(x, x) > 0 for each x ∈ [0, 1]. Therefore, it follows from continuity of Γ that there exists a δ > 0 such that Γ(x, y) > 0 for all x, y ∈ [0, 1] with |x − y| ≤ δ. For any x, y ∈ [0, 1] we can choose a finite sequence x = z0 , z1 , . . . , zp = y with |zj − zj−1 | ≤ δ, 1 ≤ j ≤ p. Then, by (13.73), for all x, y ∈ [0, 1], h(x, y) := lim sn (rn (x))sn (rn (y)) = lim n→∞
n→∞
p
sn (rn (zj−1 ))sn (rn (zj ))
j=1
(13.75)
13.2 Infinitely divisible squares
569
exists (since (sn (z))2 = 1 for any z). Since Γ is continuous, we see that (13.74) holds for all x, y ∈ [0, 1]. It follows from the definition of h that h(x, y) = h(y, x) ∈ {−1, 1} and h(x, y) = h(x, z)h(z, y)
∀ x, y, z ∈ [0, 1].
(13.76)
Pick z0 ∈ [0, 1] and set s(x) = h(x, z0 ). This gives us (13.71) and hence (13.70). Now choose any x1 , . . . , xp ∈ S. By Theorem 13.2.1 there exists a −1 signature matrix N such that N Γ N is an M -matrix. In particular, this implies that N ΓN ≥ 0. By (13.70) we have that Ni Nj = 1 whenever g(xi , xj ) > 0. Suppose g(xi , xj ) = 0. If there exists a sequence i = r(1), r(2), . . . , r(k) = j with r : [1, . . . , k] → [1, . . . , p] such that g(xr(l) , xr(l+1) ) > 0, then we again obtain Ni Nj = 1. If there is no such sequence, then Γ is reducible. We can write it as the direct sum of irreducible matrices and use the argument just given on each of them. −1 Thus Γ is an M -matrix. Example 13.2.6 The matrix Γ−1 in Theorem 13.1.2 (3) is an M -matrix. This follows from the proof of (3) ⇒ (2) of Theorem 13.1.2 in which we show that when Γ−1 is invertible, Γ is a 0-potential. As we point out in the proof of Corollary 13.2.4, this implies that Γ ≥ 0. To understand Section 13.3 it is important to note that an M -matrix need not have positive row sums. This can be seen from the following M -matrix A: 15 5 25 1 − 34 − 13 16 6 48 3 36 5 8 1 −1 1 . (13.77) A = −4 A = 1 −4 6 9 2 5 − 13
− 14
1
25 48
1 2
7 16
Example 13.2.7 Let g = {(g1 , g2 )} be a mean zero Gaussian vector. Let Γ be the covariance matrix of G. Then Γ and Γ−1 have the forms 1 a c b −c Γ= and Γ−1 = c b −c a ab − c2 c g1 . Thus a 2 G is one-dimensional and thus infinitely divisible, as we pointed out in the introduction to this chapter.) Whether c ≥ 0 or c < 0, it is easy to see that we can find a signature matrix N such that N Γ−1 N is an M -matrix. (We take the diagonal components of N equal when c ≥ 0 when ab > c2 (We always have ab ≥ c2 . If ab = c2 , g2 =
570
Associated Gaussian processes
and to have opposite signs when c < 0.) Thus all mean zero Gaussian vectors in R2 have infinitely divisible squares. The squares of Gaussian vectors in R3 need not be infinitely divisible. It is easy to find strictly positive definite matrices in R3 with all offdiagonal entries less than zero. We point out in Remark 13.2.3 that Gaussian vectors with such covariance matrices do not have infinitely divisible squares. The same argument applies to Rp for p > 3. We take up this point in greater detail in Example 13.3.4.
13.3 Infinitely divisible squares and associated processes In Corollary 13.2.4 we showed that if S is a locally compact space with a countable base and G = {Gx ; x ∈ S} is a Gaussian process with continuous covariance that is associated with a strongly symmetric transient local Borel right process on S, then G has infinitely divisible squares. We know that the converse is false because a Gaussian vector with a positive covariance matrix has infinitely divisible squares if the inverse of its covariance matrix is an M -matrix, whereas it is an associated process only if its inverse is an M -matrix with positive row sums. We show in Example 13.2.6 that M -matrices need not have positive row sums. Therefore, to get an equivalence between Gaussian processes with infinitely divisible squares and associated Gaussian processes, we must have more than simply the condition that the Gaussian process has infinitely divisible squares. We do this in the next theorem. Theorem 13.3.1 Let S be a locally compact space with a countable base. Let G = {Gx ; x ∈ S} be a Gaussian process with strictly positive definite continuous covariance Γ(x, y). The following are equivalent: (1) G is associated with a strongly symmetric transient local Borel right process X on S. (2) {(Gx + c)2 ; x ∈ S} is infinitely divisible for all c ∈ R1 . / S, is infinitely (3) {(Gx + bξ)2 ; x ∈ S ∪ {δ}}, G(δ) ≡ 0, where δ ∈ divisible, for some b = 0. Here, ξ is a standard normal random variable independent of G. Furthermore, if this holds for some b = 0, it holds for all b ∈ R1 . Proof It follows from Theorem 13.1.2 that (1) holds if and only if it holds for every finite subset S ⊆ S. Since infinite divisibility only involves finite-dimensional distributions, it suffices to prove this theorem for finite sets S. Without loss off generality we take S = {1, 2, . . . , n}. (1) ⇒ (2) We show below that, given any strongly symmetric transient
13.3 Infinitely divisible squares and associated processes
571
Borel right process X on a finite set S, we can find a strongly symmetric recurrent Borel right process Y on S ∪ {0} with P x (T0 < ∞) > 0 for all x ∈ S such that X is the process obtained by killing Y the first time it hits 0. Let Lxt denote the local time of Y . It follows from Theorem 8.2.2 that under P 0 × PG , 1 √ 2 1 law Gx + 2t ; x ∈ S Lxτ(t) + G2x ; x ∈ S = (13.78) 2 2 for all t ∈ R+ . We know from Corollary 13.2.4 that {G2x , x ∈ S} is infinitely divisible. Also, it follows from the additivity of local time that for any integer m, Lxτ(t) =
m
Lxτ(t/m) ◦ θτ (t(k−1)/m) .
(13.79)
k=1
Using the strong Markov property we see that under P 0 , law
{Lxτ(t/m) ◦ θτ (t(k−1)/m) , x ∈ S} = {Lxτ(t/m) , x ∈ S}
(13.80)
and the m processes {Lxτ(t/m) ◦ θτ (t(k−1)/m) , x ∈ S}, 1 ≤ k ≤ m, are independent. Thus {Lxτ(t) , x ∈ S} is also infinitely divisible. Combining these facts with (13.78) we have (2) of this theorem for c ≥ 0. However √ 2 law √ 2 since Gx + 2t = Gx − 2t , it holds for all c ∈ R1 . To simplify the notation we replace the state 0 by n + 1. To complete the proof of (1) ⇒ (2) we show that for any strongly symmetric transient Borel right process X on S, we can find a strongly symmetric recurrent Borel right process X on S = {1, 2, . . . , n + 1} such that X is the process obtained by killing X the first time it hits n + 1 and P x (Tn+1 < ∞) > 0 for all x ∈ S. To see this, recall that by Lemma 13.1.4 the semigroup for X is of the form Qt = e−tA with A an invertible symmetric n × n matrix that has negative off-diagonal elements and positive row sums. We define the symmetric (n + 1) × (n + 1) matrix A by setting Ai,j = Ai,j n Ai,n+1 = An+1,i = − j=1 Ai,j n n An,n = i=1 j=1 Ai,j .
1 ≤ i, j ≤ n 1≤i≤n
It follows that A has negative off-diagonal elements and zero row sums. Therefore, by Lemma 13.1.4, Pt = e−tA is the semigroup of a recurrent strongly symmetric Borel right process X on S, and by Lemma 13.1.6, X is the process obtained by killing X the first time it hits n + 1. Let P x denote the probability of X starting at some point x ∈ S. We now show that P x (Tn+1 < ∞) > 0 for all x ∈ S. Assume, to the
572
Associated Gaussian processes
contrary, that P x (Tn+1 < ∞) = 0 for one or more points x ∈ S. By relabeling these points, if necessary, there exists a 1 ≤ k ≤ n such that P i (Tn+1 < ∞) = 0 for 1 ≤ i ≤ k and P i (Tn+1 < ∞) > 0 for k + 1 ≤ i ≤ n. Therefore, by the strong Markov property, P i (Tj < ∞) = 0 for all 1 ≤ i ≤ k and k + 1 ≤ j ≤ n. Equivalently, Pt (i, j) = 0
∀ 1 ≤ i ≤ k, k + 1 ≤ j ≤ n.
(13.81)
It follows then from (13.27), (13.28), and the definition of A that Ai,j = 0 for all 1 ≤ i ≤ k and k + 1 ≤ j ≤ n. Also, since P i (Tn+1 < ∞) = 0 for 1 ≤ i ≤ k, Pt (i, n + 1) = 0, so that by the argument just given Ai,n+1 = 0 for all 1 ≤ i ≤ k. Therefore, by the definition of A, the first k row sums in A are equal to zero. Let v ∈ Rn be defined by vi = 1 for all 1 ≤ i ≤ k and vi = 0 for all k + 1 ≤ i ≤ n. Since the first k row sums in A are equal to zero, we see that Av = 0. Therefore A is not invertible. This contradicts the assumption that A is an M -matrix. Therefore P x (Tn+1 < ∞) > 0 for all x ∈ S. (2) ⇒ (3) Let Γ = {Γ(i, j)}1≤i,j≤n . Let Λ denote the diagonal matrix with diagonal entries λ1 , . . . , λn ∈ R+ and set G0 ≡ 0. By Lemma 5.2.1, for some function F (Λ, λ0 , Γ), n √ 2 1 E e− i=0 λi (Gi + nc) = exp nc2 F (Λ, λ0 , Γ) (13.82) det(I + ΓΛ) and
n 2 E e− i=0 λi (Gi +ξ) =
1 Eξ exp ξ 2 F (Λ, λ0 , Γ) . (13.83) det(I + ΓΛ)
λ0 , Γ) denote the left-hand side of (13.82). Then we have Let ψ(Λ, 1 exp c2 F (Λ, λ0 , Γ) . (13.84) 1/n (det(I + ΓΛ)) √ 2 √ 2 divisible. Then By (2), √ (G12 + nc)√, . . . ,2(Gn + nc)√ is 2infinitely clearly ( nc) , (G1 + nc) , . . . , (Gn + nc) is also infinitely divisible. Therefore, ψ1/n (Λ, λ0 , Γ) is the Laplace transform of a random variable for every n. We take the limit in (13.84) as n → ∞. Since (det(I+ΓΛ))−1/n → 1, it follows from the continuity theorem for Laplace transforms that exp c2 F (Λ, λ0 , Γ) is a Laplace transform for any c. Integrating, we see that EX exp (XF (Λ, λ0 , Γ)) is a Laplace transform for any nonnegative random variable X. Since ξ 2 is infinitely divisilaw ble, for any m we can write ξ 2 = X1 + · · · + Xm , where X1 , . . . , Xm are independent identically distributed nonnegative random variables. ψ1/n (Λ, λ0 , Γ) =
13.3 Infinitely divisible squares and associated processes 573 2 2 This shows that for any c = 0, Eξ exp c ξ F (Λ, λ0 , Γ) is the Laplace transform of an infinitely divisible random variable. By (2) with c = 0, we have that (det(I + ΓΛ))−1 is the Laplace transform of an infinitely divisible random variable. Therefore, it follows from (13.83) that (c2 ξ 2 , (G1 + cξ)2 , . . . , (Gn + cξ)2 ) is infinitely divisible for all c = 0. As we just pointed out, it follows from (2) that this is infinitely divisible when c = 0. Let Gbξ = {Gx + bξ ; x ∈ S ∪ {δ}}. Before going on to the proof of (3) ⇒ (1) we explore the relationship between the covariance matrices of G and Gbξ on S and S ∪ {δ}. Here we take δ = n + 1. Lemma 13.3.2 Let Γ be a symmetric strictly positive definite n × n matrix and let b ∈ R1 , b = 0. Consider the (n + 1) × (n + 1) symmetric matrix Γ defined by Γi,j = Γi,j + b2 , i, j = 1, . . . n and Γn+1,j = b2 , −1 j = 1, . . . n + 1. Then Γ exists and the following are equivalent: (1) Γ−1 has negative off-diagonal elements and positive row sums. (2) Γ Proof that
−1
has negative off-diagonal elements.
Since Γ is strictly positive definite, it is invertible. We first note
Γ Γ Γ
i,j
= Γi,j
n+1,j
= −
n+1,n+1
=
i, j = 1, . . . , n
n i=1
Γi,j
j = 1, . . . , n
(13.85)
n
1 + Γi,j , b2 i,j=1
where, for an invertible matrix A, we use Ai,j to denote {A−1 }i,j . To prove (13.85), we simply go through the elementary steps of taking the inverse of Γ. We begin with the array Γ1,1 + b2 .. .
...... .. .
Γ1,n + b2 .. .
b2 .. .
Γn,1 + b2 b2
. . . . . . Γn,n + b2 ...... b2
b2 b2
1 .. .
...... 0 .. .. . .
0 .. .
0 0
...... 1 ...... 0
0 1
Associated Gaussian processes
574
Next we subtract the last row from each divide the last row by b2 to get
Γ1,1 . . . . . . Γ1,n 0
1 .. .. .. .. .. . . .
. .
Γn,1 . . . . . . Γn,n 0
0 1 ...... 1 1 0
of the other rows and then ...... 0 .. .. . .
−1 .. .
...... 1 ...... 0
−1 1/b2
This shows that det(Γ) = b2 det(Γ) and consequently Γ is invertible if and only if Γ is invertible. We now work with the first n rows to get the inverse of Γ so that the array looks like
a1 1 . . . . . . 0 0
Γ1,1 . . . . . . Γ1,n .. .. .. .. .. .. .. . . . . . . .
. . .
0 . . . . . . 1 0
Γn,1 . . . . . . Γn,n an 1 ...... 1 1 0 ...... 0 1/b2 At this stage we do not know the aj , j = 1, . . . , n. Finally, we subtract each of the first n rows from the last row to obtain
1 . . . . . . 0 0
Γ1,1 ...... Γ1,n a1
.. . . . .. .. .. . . . . . . . . . .
. . .
...... Γn,n an 0 . . . . . . 1 0
Γn,1 n n i,1 0 . . . . . . 0 1 − i=1 Γ . . . . . . − i=1 Γi,n an+1 where an+1 = (1/b2 −
n
aj ).
j=1
Since the inverse matrix is symmetric, we see that aj = − j = 1, . . . , n. This verifies (13.85). Note the following interesting property of the row sums: n+1 i,j = 0 j = 1, . . . , n j=1 Γ n+1 j=1
n+1,j
Γ
=
1 . b2
(13.86) n i=1
Γi,j ,
(13.87) −1
This shows, in particular, that when Γ is invertible the row sums of Γ are positive. We can now complete the proof. When Γ−1 has negative off-diagonal elements and positive row sums, it is obvious from the first two lines of
13.3 Infinitely divisible squares and associated processes
575
−1
(13.85) that Γ has negative off-diagonal elements. It is equally obvious from the same two lines that (2) ⇒ (1). Proof of Theorem 13.3.1 continued (3) ⇒ (1) We show that (1) follows if (3) holds for some b = 0. Let Γ be the covariance matrix of {Gx +bξ, x ∈ S∪{n+1}} with Gn+1 ≡ 0. Consequently, Γ is a symmetric positive definite matrix. By hypothesis, Γ is strictly positive definite and therefore invertible. Therefore, by Lemma 13.3.2, Γ is invertible and, since it is positive definite, it must be strictly positive definite. Consequently, by Theorem 13.2.1, there exists a signature matrix N −1 such that N Γ N is an M -matrix. In particular, this implies that N Γ N ≥ 0. Since Γi,n+1 = b2 > 0 for all 1 ≤ i ≤ n + 1, we must have Ni Nn+1 = 1 for all 1 ≤ i ≤ n + 1, and therefore Ni Nj = 1 for all 1 ≤ i, j ≤ n + 1. Without loss of generality we can take N = I, the −1 identity matrix. Thus we see that Γ is an M -matrix, which implies, in particular, that it has negative off-diagonal elements. Therefore, by Lemma 13.3.2, Γ−1 has negative off-diagonal elements and positive row sums. Statement (1) now follows from Theorem 13.1.1. Theorem 13.3.3 Let S be a locally compact space with a countable base. Let G = {Gx ; x ∈ S} be a Gaussian process with continuous y). Assume that for every finite collection x1 , . . . , xp ∈ covariance Γ(x, = {Γ(x i , xj )}1≤i,j≤p is positive and strictly positive S, the matrix Γ x) > 0 for some element 0 ∈ S and all x ∈ S. definite. Assume that Γ(0, 2 If G is infinitely divisible, then y) = Γ(x, 0) u(x, y) Γ(y, 0) Γ(x, 0) 0) Γ(0, Γ(0,
∀x, y ∈ S,
(13.88)
where u(x, y) is the 0-potential density of a strongly symmetric transient local Borel right process X on S. 0) Γ(x, . By Theorem 13.1.1, it suffices to show that, 0) Γ(0, for every finite subset S = {x1 , . . . , xp } ⊆ S, the matrix i , xj ) Γ(x Γ= (13.89) h(xi )h(xj ) Proof
Let h(x) =
1≤i,j≤p
−1
is invertible and Γ has negative off-diagonal elements and positive row is sums. The fact that Γ is invertible follows from the hypothesis that Γ strictly positive definite.
576
Associated Gaussian processes
Suppose that 0 ∈ / S . Then we add it as an element xp+1 and consider the matrix defined in (13.89) for 1 ≤ i, j ≤ p + 1 and denote it by Γ . −1 Suppose we show that Γ is invertible and that ( Γ ) has negative offdiagonal elements and positive row sums. Then it follows from Lemma 13.1.4 that Γ is the 0-potential of a transient strongly symmetric Borel right process on S ∪ {xp+1 }. It then follows from Theorem 13.1.1 that −1 Γ has negative off-diagonal elements and positive row sums. Consequently, we may as well assume that 0 ∈ S . It is convenient to relabel the elements of S so that xp = 0. For x ∈ S , write Gx = ηx + h(x)Gxp so that ηxp = 0 and {ηxj ; j = 1, . . . , p − 1} is independent of Gxp . By hypothesis, {ηx + h(x)Gxp , x ∈ S } = {Gx , x ∈ S } has infinitely divisible squares. Therefore, η η x (13.90) + Gxp := + G xp , x ∈ S h h(x) η has infinitely divisible squares. Let S = {x1 , . . . , xp−1 } and := h η x , x ∈ S . h(x) We now use η the notation of Lemma 13.3.2. Let Γ denote the covariance matrix of and set b = EG2x1 . The matrix Γ of that lemma is h precisely the of matrix Γ in (13.89), which is the covariance matrix −1 η + Gx1 . By Lemma 13.2.1 and the assumption of positivity, Γ h is an M -matrix. In particular it has negative off-diagonal elements. Furthermore, as we point out following (13.87), it has positive row sums.
Example 13.3.4 Consider the M -matrix A in (13.77) that does not have positive row sums. To avoid confusion in the rest of this paragraph, relabel this matrix Γ−1 . Clearly Γ is strictly positive definite (the determinants of its principle minors are positive). Let η = (η1 , η2 , η3 ) be a mean zero Gaussian vector with covariance matrix Γ. Let ξ be a mean zero real-valued normal random variable independent of η. Then η has infinitely divisible squares, but ηξ := (η1 + ξ, η2 + ξ, η3 + ξ, ξ) does not, whatever the value of Eξ 2 > 0. To see this, let Γ be the covariance matrix of ηξ . The proof of (3) ⇒ (1) of Theorem 13.3.1 shows that if −1 ηξ2 is infinitely divisible, Γ is an M -matrix. Therefore, in particular, it has negative off-diagonal elements. Consequently, by Lemma 13.3.2, Γ−1 has positive row sums. This is a contradiction. On the other hand, one can do the arithmetic to check that η =
13.3 Infinitely divisible squares and associated processes
577
(η1 + ξ, η2 + ξ, η3 + ξ) does have infinitely divisible squares for any value of Eξ 2 . We know that if (η1 + ξ, η2 + ξ, η3 + ξ, ξ) has infinitely divisible squares then not only does (η1 , η2 , η3 ) have infinitely divisible squares, but the inverse of its covariance matrix also has positive row sums. This is true whatever the value of Eξ 2 , as long as it is not zero. The example just given shows that even if (η1 + ξ, η2 + ξ, η3 + ξ) has infinitely divisible squares for ξ of all variances including zero, it is not necessarily true that the inverse of the covariance matrix of (η1 , η2 , η3 ) has positive row sums. We consider a related question. When (η1 + ξ, η2 + ξ, η3 + ξ) has infinitely divisible squares for some normal random variable ξ independent of (η1 , η2 , η3 ), does this imply that (η1 , η2 , η3 ) has infinitely divisible squares? In the next example we show that this is not necessarily true. Suppose that the covariance matrix of (η1 + ξ, η2 + ξ, η3 + ξ) is given by
1+b
b−a b−a
b−a
b−a
1+b
b−a ,
b−a
1+b
where Eξ 2 = b and 0 < a < 1/2. By computing the determinants of its principle minors, we see that it is strictly positive definite. The inverse of this matrix has all its off-diagonal terms equal to −(b − a)(1 + a). Thus it is an M -matrix if and only if b ≥ a. In particular, when b = 0, it is the inverse of the covariance matrix of (η1 , η2 , η3 ). This leads to the following interesting example. Let η1
= ξ1 + cξ2 + cξ3
η2
= cξ1 + ξ2 + cξ3
η3
= cξ1 + cξ2 + ξ3 ,
(13.91)
where ξi , i = 1, . . . , 3 are independent normal random variables with mean zero and variance one. Then (η1 , η2 , η3 ) has infinitely divisible squares if and only if c ≥ 0. In fact, one can check that this example extends to n vectors, (η1 , . . . , ηn ), defined similarly, when |c| is sufficiently small, depending on n. (One term in the minors determining the offdiagonal terms of the inverse of the covariance matrix dominates, and these terms have the opposite sign of c.)
Associated Gaussian processes
578
13.4 Additional results about M -matrices −1
The matrices Γ−1 and Γ in Lemma 13.3.2 are both M -matrices. More significantly, so are the matrices in Lemma 13.1.2 (3), as we point out in Remark 13.1.3, although we do not refer to them as such. The proof of this fact in Lemma 13.1.2 uses Markov chains. It is desirable to have a proof using simple linear algebra. We do this in the next lemma, which −1 also explores further the relationship between Γ−1 and Γ . −1
be as given in Lemma 13.3.2, but Lemma 13.4.1 Let Γ−1 and Γ −1 is also assume only that Γ−1 is symmetric and invertible. Then Γ invertible and the following are equivalent: (1) Γ−1 has negative off-diagonal elements and positive row sums. (2) Γ−1 is an M -matrix with positive row sums. (3) Γ (4) Γ (5) Γ
−1 −1 −1
is an M -matrix with positive row sums. is an M -matrix. has negative off-diagonal elements.
Recall that we showed in Remark 13.1.3 that an invertible matrix with positive row sums has at least one of its row sums strictly greater than zero. Proof The only way we use the fact that Γ is strictly positive definite in the proof of Lemma 13.3.2 is to show that Γ is invertible. Since this is now a hypothesis, we can use this conclusion of Lemma 13.3.2 as well as two items mentioned in its proof. These are that Γ is invertible if and −1 only if Γ is invertible and that Γ always has positive row sums. (Thus (3) ⇐⇒ (4) is trivial.) Furthermore, since Γ is invertible, if it is positive definite, it must be strictly positive definite. We next show that Γ is strictly positive definite if and only if Γ is strictly positive definite, whatever the value of b = 0. Let u = (u1 , . . . , un , un+1 ) = (u, un+1 ) with u = (u1 , . . . , un ). We have
n+1 2 2 (u, Γu) = (u, Γu) + b ui . (13.92) i=1
When Γ is strictly positive definite, if (u, Γu) = 0, then u = 0. Therefore, n+1 n+1 if u = 0, i=1 ui = un+1 = 0, so that b2 ( i=1 ui )2 > 0. This shows that Γ is strictly positive definite. On the other hand, if Γ is not strictly positive definite, then (u, Γu) ≤ 0 for some u = (u1 , . . . , un ) = 0. Let u = (u, un+1 ) with un+1 =
13.5 Notes and references
579
n
− i=1 ui . We see from (13.92) that (u, Γu) ≤ 0 with u = 0. Thus, Γ is not strictly positive definite. −1 (1) ⇒ (2) Suppose that (1) holds. Then, by Lemma 13.3.2, Γ has negative off-diagonal elements for all b = 0. It is obvious that Γ ≥ 0 −1 for all |b| sufficiently large. By definition, for these values of |b|, Γ is −1 an M -matrix. By Lemma 14.9.4, for these values of |b|, Γ is strictly positive definite. Therefore, as we just showed, Γ−1 is strictly positive definite. So, by Lemma 14.9.4 again, Γ−1 is an M -matrix. −1 (2) ⇒ (3) By Lemma 13.3.2, Γ has negative off-diagonal elements. Also, Γ ≥ 0, since it is an M -matrix. Therefore, Γ ≥ 0, and hence is also an M -matrix; (3) ⇐⇒ (4) and (4) ⇒ (5) are trivial. (5) ⇐⇒ (1) is given in Lemma 13.3.2.
13.5 Notes and references This chapter is based on Eisenbaum (2003), Eisenbaum (2005), and Eisenbaum and Kaspi (2006). We introduce local Borel right processes to make the equivalences in Theorems 13.1.2 and 13.3.1 more concrete when the state space of the Markov process is locally compact. The inequality in (13.42) has been observed several times; see Marcus and Rosen (1992b) and the references therein. Theorem 13.2.1 combines the work of Griffiths (1984) and Bapat (1989). This problem has a long history and was originally posed as characterizing the infinite divisibility of the multivariate gamma distribution; see Eisenbaum and Kaspi (2006) for further references. Many of the arguments in Griffiths (1984) and Bapat (1989) are special cases of more general results found in the 1974 text Berman and Plemmons (1994) (the reference given is to the 1994 reprint); see in particular Theorems 1.3.20 and 2.2.1 and Section 3 of Chapter 6. Lemma 14.9.1 is taken from Horn and Johnson (1999). Lemmas 13.3.2 and 13.4.1 are new observations that shed light on the relationship between Gaussian processes with infinitely divisible squares and associated Gaussian processes, and simplify the proof of Theorem 13.3.1.
14 Appendix
14.1 Kolmogorov’s Theorem for path continuity Kolmogorov’s Theorem gives a simple condition for H¨ older continn a complete separable metric space. Since we only use it in Chapter 2 for processes on Rd , we simplify matters and consider only processes on [0, 1]d . This result is interesting from a historical perspective since it contains the germs of the much deeper continuity conditions obtained in Chapter 5 (actually, in Chapter 5, we only consider Gaussian processes, but the methods developed have a far larger scope, as is shown in Ledoux and Talagrand (1991)). Let Dm be the set of d-dimensional vectors in [0, 1]d with components of the form i/2m for some integer i ∈ [0, 2m ]. Let D denote the set of dyadic numbers in [0, 1]d , that is, D = ∪∞ m=0 Dm . Theorem 14.1.1 Let X = {Xt , t ∈ [0, 1]d } be a stochastic process satisfying ∀ s, t ∈ [0, 1]d (14.1) E |Xt − Xs |h ≤ c|t − s|d+r for constants c, r > 0 and h ≥ 1. Then, for any α < r/h, |Xt − Xs | ≤ C(ω)|t − s|α
∀ s, t ∈ D
(14.2)
for some random variable C(ω) < ∞ almost surely. Proof Let Nm be the set of nearest neighbors in Dm . This is the set of pairs s, t ∈ Dm with |s − t| = 2−m . Using (14.1) and the fact that |Nm | ≤ 2d2dm , we have
E sup |Xt − Xs |h ≤ E |Xt − Xs |h ≤ c2d2−mr . (s,t)∈Nm
(s,t)∈Nm
(14.3) 580
14.2 Bessel processes
581
For any s ∈ D, let sm be the vector in Dm with sm ≤ s that is closest to s. Then, for any m, we have that either sm = sm+1 or else the pair {sm , sm+1 } ∈ Nm+1 . Now let s, t ∈ D with |s − t| ≤ 2−m . Then either sm = tm or {sm , tm } ∈ Nm . In either case, Xt − Xs =
∞
(Xti+1 − Xti ) + Xtm − Xsm +
i=m
∞
(Xsi+1 − Xsi ). (14.4)
i=m
Using (14.3) we see that -
sup
∞ |Xt − Xs | - ≤ 3 (cd2−ir )1/h ≤ C2−mr/h . h
s,t∈D |s−t|≤2−m
(14.5)
i=m
Fix α < r/h. Since for any pair s, t ∈ D with s = t we can find some m such that 2−m−1 ≤ |s − t| ≤ 2−m , we have ∞ C2α(m+1) 2−mr/h < ∞. - sup |Xt − Xs |/|s − t|α - ≤ 2 h
s,t∈D
(14.6)
m=0
s=t
This gives (14.2).
14.2 Bessel processes In this book, Bessel processes are mainly used to give classical versions of some results that we obtain by other methods. Therefore, we only give a brief description of some of their properties. In this section we assume that the reader has some familiarity with stochastic calculus and stochastic differential equations. For an excellent introduction to Bessel processes, see Revuz and Yor (1991, Chapter XI). Consider the stochastic differential equation t Zt = x + 2 Zs dBs + δt (14.7) 0
for x ∈ R and δ ≥ 0, where Bs is a Brownian motion. This equation has a unique positive strong solution {Zt , t ∈ R+ } called the square of the δ-dimensional Bessel process started at x and denoted by BESQδ (x). We also refer to this process as a δ-th ordered squared Bessel process. The motivation for the designation BESQδ (x) comes from the case in which δ is an integer. Let 1
(1)
Wt = (Wt (1)
(n)
where (Wt , . . . , Wt
(n)
+ a1 , . . . , Wt
+ an ),
(14.8)
) is an n-dimensional Brownian motion, that is,
Appendix
582 (1)
(n)
Wt , . . . , Wt are n independent Brownian motions. It follows from Ito’s formula that
n t (14.9) Ws(i) dWs(i) + n t. |Wt |2 = |W0 |2 + 2 i=1
0
This equation can be written in the form of (14.7) by taking Zt = |Wt |2 and n t (i) Ws dWs(i) (14.10) Bt = |W | s 0 i=1 and noting that Bt is a Brownian motion. To prove this last assertion, note that |Wt | > 0 for almost all t and < B, B >t = t (this last term is the quadratic variation of B up to time t). )t = (Wt(1) , . . . , Wt(n) ). Remark 14.2.1 In the above notation, let W )t = W )t−s + W )s , we see by (14.9) that for t ≥ s, Writing W )s |2 + |W )t−s |2 . )t |2 = |W |W
(14.11)
Let Zn,t (|W (0)|) := |Wt |2 for Wt in (14.8). It follows from (14.9) and (14.11) that law
Zn,t (0) = Zn,t−s (|W (s)|2 ).
(14.12)
The following additivity property of squared Bessel processes is used in the proof of Theorem 2.6.3. Theorem 14.2.2 Let Z and Z be independent stochastic processes with Z a BESQδ (x) and Z a BESQδ (x ), δ, δ ≥ 0. Then Z + Z is a BESQδ+δ (x + x ). Proof
Z satisfies (14.7) and Z satisfies t Zs dBs + δ t, Zt = x + 2
(14.13)
0
where B is a Brownian motion independent of B. Therefore, Xt = Zt + Zt satisfies t ( Zs dBs + Zs dBs ) + (δ + δ )t. (14.14) Xt = x + x + 2 0
Set
Wt = 0
t
√
Zs dBs + Zs dBs √ Xs
(14.15)
14.3 Analytic sets and the Projection Theorem
583
and note that < W, W >t = t. Therefore, Wt isa linear Brownian motion t√ and the integral in (14.14) can be written as 0 Xs dWs . Remark 14.2.3 The assertion in the second paragraph of Theorem 2.6.3 follows immediately from (2.184) and Theorem 14.2.2. Note that ¯ 2 law = Z2,r−x (0) in the notation of Remark 14.2.1, for r ≥ x, B 2 + B r−x
r−x
law
¯r2 = Z2,r−x (Yx ), where and Br2 + B ¯ 2 law Yx = Bx2 + B x = Z2,x (0).
(14.16)
Therefore, by (2.184) and Theorem 14.2.2, for r ≥ x, LrT0 is BESQ0 (Yx ). Given this, it is clear that, conditioned on Bx , {LrT0 , r ≥ x} is independent of {Br , 0 ≤ r ≤ x}.
14.3 Analytic sets and the Projection Theorem We prove the Projection Theorem, which has a critical role in the proof of Theorem 3.2.6. Theorem 14.3.1 (The Projection Theorem) Let (Ω, F, P ) be a complete probability space, and let π denote the projection from R1 × Ω to Ω. Let B denote the Borel σ-algebra for R1 and let B × F denote the product σ-algebra. Then π(B × F) ⊆ F. The proof of the Projection Theorem follows from a sequence of lemmas and theorems including Lusin’s Theorem on analytic sets. We begin by defining these sets. Let O be a set and G a collection of subsets of O that contain the empty set. A Souslin scheme {A· } over G is a collection consisting of a set An1 ,...,nk ∈ G for each k ∈ IN and (n1 , . . . , nk ) ∈ INk . Let σ = (σ(1), σ(2), . . .) ∈ ININ . We set σ|n = (σ(1), σ(2), . . . , σ(n)). The kernel K of the Souslin scheme {A· } is given by 12 Aσ|n . K= σ
n
A set is called analytic over G if it is the kernel of some Souslin scheme over G. The collection of all analytic sets over G is denoted A(G). Trivially, G ⊆ A(G). We first show that A(G) is closed under countable unions. Let K (i) be (i) the kernel of the Souslin scheme {A· } and ψ(n) = (ψ1 (n), ψ2 (n)) be a bijection between IN and IN2 . Then ∪i K (i) is the kernel of the Souslin (ψ (n )) scheme given by An1 ,...,nk = Aψ21(n11),n2 ,...,nk . We next show that A(F) is closed under countable intersections. Let ψ(n) = (ψ1 (n), ψ2 (n)) be a
Appendix
584
bijection between IN and IN2 such that ψ2 (n) < ψ2 (n ) whenever n < n and ψ1 (n) = ψ1 (n ). For example, we can take ψ
n
j + k = (k, n − k + 1)
1 ≤ k ≤ n.
j=1 (i)
Then, if K (i) is the kernel of the Souslin scheme {A· }, ∩i K (i) is the ker(ψ (k)) nel of the Souslin scheme given by An1 ,...,nk = A<ψ12 (n1 ),...,ψ2 (nk ) |ψ1 (k)> , where < ψ2 (n1 ), . . . , ψ2 (nk ) |ψ1 (k) > is the ordered collection of those ψ2 (nj ), j ≤ k such that ψ1 (j) = ψ1 (k). Lemma 14.3.2 If Gc ∈ A(G) for every G ∈ G, then σ(G) ⊆ A(G), where σ(G) is the σ-algebra generated by G. is closed Proof Let A(G) = {K ∈ A(G) | K c ∈ A(G)}. Note that A(G) under countable unions and intersections. To see this, suppose that and ∪Kn = K. Then it follows from the fact that A(G) Kn ∈ A(G) is closed under countable unions and intersections and A(G) ⊆ A(G) c c that K ∈ A(G) and also that K = ∩Kn ∈ A(G). Thus K ∈ A(G). A similar argument works for intersections. Obviously A(G) is closed under complementation. Thus A(G) is a σ-algebra. Since G ∈ A(G), it follows by the hypothesis that G ⊆ A(G) and the lemma follows. Let (Ω, F, P ) be a probability space. For any B ⊆ Ω let P ∗ (B) = inf P (F ).
(14.17)
F ⊇B
F ∈F
It is easy to check that P ∗ (B) ≤ P ∗ (C) if B ⊆ C, and that P ∗ (∪n Bn ) ≤ ∗ ∗ n P (Bn ) for any sequence B1 , B2 , . . . of sets in Ω. Thus P is an outer measure on Ω. We observe that P ∗ (F ) = P (F ) for all F ∈ F. Furthermore, for any ∈ F with B ⊇ B such that P ∗ (B) = P (B). B ⊆ Ω, we can find a B To see this, let F1 , F2 , . . . be a minimizing sequence in (14.17). By taking intersections it is easily seen that we can assume that the sequence = ∩n Fn . F1 , F2 , . . . is decreasing, and then we can take B We also observe that P ∗ (∪n Bn ) = lim P ∗ (Bn )
(14.18)
n→∞
for any increasing sequence B1 , B2 , . . . of sets in Ω. To see this, note that k ) ≤ lim inf P (B n ) = lim P ∗ (Bn ). (14.19) P ∗ (∪n Bn ) ≤ P (∩n ∪k≥n B n→∞
n→∞
14.3 Analytic sets and the Projection Theorem
585
Finally, we note that if (Ω, F, P ) is a complete probability space and ⊇B if P ∗ (B) = 0 for some B ⊆ Ω, then B ∈ F. This follows since B = 0. and P (B) Theorem 14.3.3 (Lusin) space. Then A(F) = F.
Let (Ω, F, P ) be a complete probability
Proof Let K be the kernel of a Souslin scheme {A· } on F. Since P ∗ is ≤ P ∗ (K) + P ∗ (K − K). We show below that an outer measure, P ∗ (K) ≥ P ∗ (K) + P ∗ (K − K), P ∗ (K)
(14.20)
− K) = 0. By the observation immediately which implies that P ∗ (K − K ∈ F. Since K = K ∩ (K − K)c , preceding this theorem, we have K we also have K ∈ F, which proves the theorem. Thus it remains to obtain (14.20). To do this we first introduce some notation. For (h1 , h2 , . . . , hm ) ∈ INm set Kh1 ,h2 ,...,hm =
1
m 2
An1 ,n2 ,...,nj
n1 ≤h1 ,···,nm ≤hm j=1
and K
h1 ,h2 ,...,hm
=
∞ 2
1
An1 ,n2 ,...,nk ,
{ni }: n1 ≤h1 ,···,nm ≤hm k=1
where the sets A·,...,· are members of the Souslin scheme {A· }. Fix > 0 and let h ∈ IN. Since K h ↑ K and K h1 ,h2 ,...,hm ,h ↑ h1 ,h2 ,...,hm as h → ∞, we can inductively define a sequence of inteK gers h1 , h2 , . . . such that, for each m, ∩ K h1 ,h2 ,...,hm ) ≥ P ∗ (K) − . P ∗ (K
(14.21)
Then, since Kh1 ,h2 ,...,hm ⊃ K h1 ,h2 ,...,hm and Kh1 ,h2 ,...,hm ∈ F, we have P ∗ (K)
∩ Kh ,h ,...,h ) + P ∗ (K ∩ (Kh ,h ,...,h )c ) (14.22) = P ∗ (K 1 2 m 1 2 m ∗ ∗ c ≥ P (K) + P (K ∩ (Kh ,h ,...,h ) ) − . 1
2
m
Note that Kh1 ,h2 ,...,hm is decreasing in m. We show immediately below that the limit ∩∞ m=1 Kh1 ,h2 ,...,hm ⊆ K.
(14.23)
Therefore, (Kh1 ,h2 ,...,hm )c is increasing in m and the limit contains K c . Since > 0, we get (14.20).
586
Appendix
To prove (14.23), let x ∈ ∩∞ m=1 Kh1 ,h2 ,...,hm . We say that x is (r, k)– representable if x ∈ Ar ∩km=2 Ar,n2 ,...,nm for some ni ≤ hi , i = 2, . . . , k. Let Qr = {k ∈ IN | x is (r, k)–representable}. Clearly, since x ∈ ∩∞ m=1 Kh1 ,h2 ,...,hm , we have ∪r≤h1 Qr = IN, so that we can find some l1 ≤ h1 such that |Ql1 | = ∞. Thus Ql1 = IN. We then say that x is (l1 , s, k)–representable if x ∈ Al1 ,s ∩km=3 Al1 ,s,n3 ,...,nm for some ni ≤ hi ; i = 3, . . . , k and set Ql1 ,s = {k ∈ IN | x is (l1 , s, k)–representable}. As before, Ql1 = IN implies that ∪s≤h2 Ql1 ,s = IN, so that we can find some l2 ≤ h2 such that |Ql1 ,l2 | = ∞, so that in fact Ql1 ,l2 = IN. Proceeding inductively, we obtain a sequence li ≤ hi , i = 1, 2, . . . with x ∈ ∩∞ m=1 Al1 ,l2 ,...,lm ⊆ K. This completes the proof of (14.23). The Projection Theorem follows from Lusin’s Theorem and the following theorem: Theorem 14.3.4 Let (Ω, F, P ) be a complete probability space, and let π denote the projection from R1 × Ω to Ω. Let B denote the Borel σ-algebra of R1 and let B × F denote the product σ-algebra. Then π(B × F) ⊆ A(F). Proof Let K denote the collection consisting of all compact subsets of R1 together with the empty set. Recall that, as above, K ⊆ A(K) and A(K) is closed under countable unions. Let B ∈ K. Then, since the complement of any compact set in R1 is a countable union of compact sets, we have that B c ∈ A(K). It follows from this that if C ∈ K × F, then C c ∈ A(K×F). Therefore, by Lemma 14.3.2, σ(K×F) ⊆ A(K×F) and since K generates B, we have B × F ⊆ A(K × F).
(14.24)
Let H ∈ B × F. Then, by (14.24), H is the kernel of a Souslin scheme Kn1 ,...,nk × Fn1 ,...,nk over K × F. π(H) is then the kernel of a Souslin scheme Gn1 ,...,nk over F, where Gn1 ,...,nk = Fn1 ,...,nk if ∩kj=1 Kn1 ,...,nj = ∅, and Gn1 ,...,nk = ∅ otherwise. Here we use the fact that if ∩kj=1 Kn1 ,...,nj = ∅ for all k, then ∩∞ j=1 Kn1 ,...,nj = ∅, which holds because the sets Kn1 ,...,nj are compact.
14.4 Hille–Yosida Theorem
587
14.4 Hille–Yosida Theorem Let (B, · ) be a Banach space. When dealing with operators from B to B, we also use · to denote the operator norm. The material in this section is used in Section 4.1, in which one can find the definition of a strongly continuous contraction semigroup and contraction resolvent. If {Rλ , λ > 0} is a contraction resolvent and in addition ∀ f ∈ B,
lim λRλ f = f
λ→∞
(14.25)
{Rλ , λ > 0} is a called a strongly continuous contraction resolvent. Theorem 14.4.1 (Hille–Yosida Theorem) Let {Rλ ; λ > 0} be a strongly continuous contraction resolvent on B. Then we can find a strongly continuous contraction semigroup {Pt ; t ≥ 0} on B with ∞ e−λt Pt dt. (14.26) Rλ = 0
Here the integral is the Bochner integral. Proof Let us define Pt,λ = e−λt
∞
(λt)n (λRλ )n /n! ,
(14.27)
n=0
where (λRλ )0 = I. (We can express this as Pt,λ = exp (−λt(1 − λRλ )). Using this representation, it is easy to verify, at least heuristically, the many relationships given below.) Using the fact that λRλ ≤ 1, it is easy to check that Pt,λ ≤ 1.
(14.28) ∞
Furthermore, using the identity exp(a + b) = n,m=0 an bm /n!m!, one can show that {Pt,λ ; t ≥ 0} is a semigroup and that lim
t→0
Pt,λ − I = λ(λRλ − I). t
(14.29)
It follows from the resolvent equation that Rλ , Rµ , and, consequently, Pt,λ and Pt,µ commute. Therefore, for any n > 1, Pt,λ − Pt,µ =
n k=1
P (k−1)t ,λ P (n−k)t ,µ P nt ,λ − P nt ,µ n
(14.30)
n
(this is merely a telescoping sum). Using (14.28), we now see that Pt,λ f − Pt,µ f ≤ n P nt ,λ f − f − P nt ,µ f − f , (14.31)
Appendix
588 so that, by (14.29), we have
Pt,λ f − Pt,µ f ≤ tλ(λRλ − I)f − µ(µRµ − I)f .
(14.32)
Let R := the range of Rλ . It follows from the resolvent equation that R is the same for all λ > 0. Therefore, using the fact that limλ→∞ λRλ f = f , we see that R is dense in B. Furthermore, Rλ is one–one, since if Rλ h = 0, then from the resolvent equation we see that Rµ h = 0 for all µ, which implies that h = 0 since limµ→∞ µRµ h = h. Hence we can define Rλ−1 on R and check by the resolvent equation that G := λ − Rλ−1 = µ − Rµ−1
∀ λ, µ > 0.
(14.33)
Note that if f ∈ R, we can write λ(λRλ − I)f = λRλ (λ − Rλ−1 )f = λRλ Gf.
(14.34)
Thus for f ∈ R we see that limλ→∞ λ(λRλ − I)f = Gf , so that by (14.32) we see that Pt f := lim Pt,λ f λ→∞
(14.35)
converges locally uniformly in t. Since (14.32) implies that Pt,λ f is continuous, we have that Pt f is continuous in t for each f ∈ R. However, using (14.28) again together with the density of R, we see that (14.35) holds for all f ∈ B and that Pt ≤ 1 and Pt f is continuous in t. In other words, {Pt ; t ≥ 0} is a strongly continuous contraction semigroup. It remains to show (14.26). To see this, we note that by a direct calculation using (14.27), ∞ e−λt Pt,µ dt = ((λ + µ)I − µ2 Rµ )−1 . (14.36) 0
µλ Let α = . Using the resolvent equation for Rµ and Rα , it is easy λ+µ to show that ((λ + µ)I − µ2 Rµ )−1 =
µ2 1 I. Rα + 2 (λ + µ) (λ + µ)
It follows from the resolvent equation that λ2 Rα I − Rλ = R λ . λ+µ
(14.37)
(14.38)
Taking the limit of the left-hand side as u → ∞, we see that Rα = Rλ . Therefore, taking the limit on the right-hand side of (14.37), we obtain (14.26).
14.4 Hille–Yosida Theorem
589
The next theorem shows that the hypotheses of Theorem 4.1.3 are sufficient for the Hille–Yosida Theorem. Theorem 14.4.2 Let S be a locally compact space with a countable base and let C0 (S) denote the Banach space of continuous functions on S that vanish at infinity, with the uniform norm. Let {Rλ ; λ > 0} be a contraction resolvent on C0 (S) that, in addition, satisfies ∀f ∈ C0 (S) and x ∈ S.
lim λRλ f (x) = f (x)
λ→∞
(14.39)
Then, {Rλ ; λ > 0} is a strongly continuous contraction resolvent on C0 (S). Proof We show that the pointwise convergence of (14.39) implies the strong convergence ∀f ∈ C0 (S).
lim λRλ f = f
λ→∞
(14.40)
As in the previous proof, the resolvent equation implies that R the range of Rµ is the same for all µ > 0. Furthermore, rearranging the resolvent equation, we see that for any g ∈ C0 (S), (λRλ − I)Rµ g =
µ λ Rµ g − Rλ g. λ−µ λ−µ
(14.41)
In operator terminology, (4.9) states that λRλ ≤ 1. Therefore, it follows from (14.41) that, for any f ∈ R, limλ→∞ λRλ f = f . Using the property that λRλ ≤ 1 once again, we see that limλ→∞ λRλ f = f for any f ∈ R, the closure of R in C0 (S). It only remains to show that R = C0 (S). Suppose it does not. Then, by the Hahn-Banach Theorem, there is a nontrivial continuous linear functional h on C0 (S) that vanishes on R and a fortiori on R. Representing h by a finite signed measure ν on S, we therefore have Rλ f (x) dν(x) = 0 for all λ > 0 and f ∈ C0 (S). Then, by (14.39), the fact that λRλ ≤ 1, and the Dominated Convergence Theorem, we have that for all f ∈ C0 (S),
$$h(f) = \int f(x)\, d\nu(x) = \lim_{\lambda\to\infty} \int \lambda R_\lambda f(x)\, d\nu(x) = 0. \tag{14.42}$$
This contradicts the fact that $h$ is not trivial. Thus $\overline{\mathcal R} = C_0(S)$.
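For concreteness, one can watch (14.40) happen for the resolvent of one-dimensional Brownian motion, whose density is $u^\lambda(x,y) = e^{-\sqrt{2\lambda}\,|x-y|}/\sqrt{2\lambda}$. The sketch below is an added illustration; the grid, the test function, and the evaluation points are arbitrary choices.

    import numpy as np

    # Resolvent of 1-dim Brownian motion: R_lam f(x) = int u_lam(x,y) f(y) dy,
    # with density u_lam(x,y) = exp(-sqrt(2 lam)|x-y|) / sqrt(2 lam).
    y = np.linspace(-20, 20, 20001)
    dy = y[1] - y[0]
    f = np.exp(-y**2)                      # a test function in C_0(R)

    x = np.array([-1.0, 0.0, 0.5])
    for lam in (1.0, 10.0, 100.0):
        u = np.exp(-np.sqrt(2*lam) * np.abs(x[:, None] - y[None, :])) / np.sqrt(2*lam)
        Rf = (u * f).sum(axis=1) * dy      # R_lam f(x), by simple quadrature
        print(lam, np.abs(lam * Rf - np.exp(-x**2)).max())   # lam R_lam f -> f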
14.5 Stone–Weierstrass Theorems

A real vector space $H \subseteq C(S)$ is called a vector lattice if it is closed under $\vee$ and $\wedge$. $H$ is said to separate the points of $S$ if, for any $x, y \in S$ with $x \neq y$, we can find an $f \in H$ with $f(x) \neq f(y)$.

Theorem 14.5.1 (Stone–Weierstrass) Let $S$ be a compact space and let $H \subseteq C(S)$ be a vector lattice, containing the constants, which separates the points of $S$. Then $H$ is dense in $C(S)$ in the uniform topology.

This is a classical theorem; see, for example, Royden (1988, Chapter 9, Proposition 30).

Theorem 14.5.2 Let $S$ be a locally compact space and let $H \subseteq C(S)$ be a vector lattice, containing the constants, such that $H \cap C_0(S)$ separates the points of $S$. Then $H \cap C_0(S)$ is dense in $C_0(S)$ in the uniform topology.

Proof In order to apply the Stone–Weierstrass Theorem (Theorem 14.5.1), which requires a compact space, we consider $S_\Delta$, the one-point compactification of $S$, where $\Delta$ is the point at infinity. Let $C(S, \Delta)$ denote the continuous functions on $S$ that have a continuous extension to $S_\Delta$. It is clear that $C(S, \Delta)$ includes $C_0(S)$ and the functions that are constant on $S$. Under the hypothesis that $H \cap C_0(S)$ separates the points of $S$, we see that the continuous extension $L$ of $H \cap C(S, \Delta)$ to $S_\Delta$ separates the points of $S_\Delta$. Since $L$ is a vector lattice that also contains the constants, it follows from Theorem 14.5.1 that $L$ is dense in $(C(S_\Delta), \|\cdot\|_\infty)$.

Let $f \in C_0(S)$, which we consider as a function in $C(S_\Delta)$ by setting $f(\Delta) = 0$. Fix $\varepsilon > 0$. By what we have just shown, we can find $g \in L$ such that
$$\sup_{x \in S_\Delta} |g(x) - f(x)| \le \varepsilon. \tag{14.43}$$
Since $f(\Delta) = 0$, this shows that $|g(\Delta)| \le \varepsilon$. Then, by (14.43), we have
$$\sup_{x \in S_\Delta} |(g(x) - g(\Delta)) - f(x)| \le 2\varepsilon. \tag{14.44}$$
Denote the restriction of $g$ to $S$ again by $g$. We have $g - g(\Delta) \in C_0(S)$. Also, since both $g$ and the constant function $g(\Delta)$ are in $H$, we have $g - g(\Delta) \in H \cap C_0(S)$. Since $\varepsilon$ can be taken arbitrarily close to 0, we see from (14.44) that $H \cap C_0(S)$ is dense in $C_0(S)$ in the uniform norm.
14.6 Sums of independent symmetric random variables

Let $B$ be a real separable Banach space, $\{X_n,\, n \ge 1\}$ a sequence of independent symmetric $B$-valued random variables, and $S_n = \sum_{j=1}^n X_j$. The next lemma is a generalization of a well-known inequality of Lévy for real-valued random variables. Its proof is essentially the same as in the real-valued case (see, e.g., Ledoux and Talagrand (1991, Proposition 2.3)).

Lemma 14.6.1 (Lévy's inequality) Let $N$ be a measurable seminorm on $B$ and suppose that $S_n$ converges almost surely to a limit $S$. Then
$$P\left(\sup_{n\ge 1} N(S_n) > \lambda\right) \le 2\, P\left(N(S) > \lambda\right). \tag{14.45}$$
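The inequality is easy to test by simulation. The sketch below is an added illustration; the Rademacher summands, the horizon standing in for the almost sure limit, and the sample size are arbitrary choices. It compares the two sides of (14.45) for a symmetric random walk, with N the absolute value.

    import numpy as np

    rng = np.random.default_rng(0)
    n_steps, n_paths, lam = 200, 20_000, 10.0

    # Symmetric (Rademacher) summands; S_n are the partial sums, and the
    # value at the horizon stands in for the limit S.
    X = rng.choice([-1.0, 1.0], size=(n_paths, n_steps))
    S = X.cumsum(axis=1)

    lhs = (np.abs(S).max(axis=1) > lam).mean()   # P(sup_n |S_n| > lam)
    rhs = 2 * (np.abs(S[:, -1]) > lam).mean()    # 2 P(|S| > lam)
    print(lhs, rhs)                              # lhs <= rhs, up to Monte Carlo error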
We now show that, for sums of independent symmetric random variables, many different types of convergence are equivalent.

Theorem 14.6.2 (Lévy–Ito–Nisio Theorem) The following statements are equivalent:
(1) $\{S_n\}$ converges almost surely to a $B$-valued random variable $S$.
(2) $\{S_n\}$ converges in probability to a $B$-valued random variable $S$.
(3) Let $\mu_n$ denote the probability distribution of $S_n$. Then there exists a probability measure $\mu$ on $B$ such that $\mu_n \overset{dist}{\longrightarrow} \mu$.
(4) For $f \in B^*$,
$$\int e^{i(f,x)}\, d\mu_n \to \int e^{i(f,x)}\, d\mu, \tag{14.46}$$
where $(f, x)$ denotes the evaluation of $f$ at $x$.

Proof By the Banach–Mazur Theorem (see Wojtaszczyk (1991, Chapter II, B, Theorem 4)), $B$ is isometrically isomorphic to a closed subspace of $C[0,1]$, the space of continuous functions on $[0,1]$ with the sup norm, that is, for $f \in C[0,1]$, $\|f\| = \sup_{0\le t\le 1}|f(t)|$. We represent $B$ in this way.

The proof that (1) implies (2) is exactly the same as for real-valued random variables. To show that (2) implies (3), we must show that the finite-dimensional distributions of $\{\mu_n\}$ converge to the corresponding finite-dimensional distributions of $\mu$ and that $\{\mu_n\}$ is a tight family on $C[0,1]$. Tightness is equivalent to the following: Given $\varepsilon > 0$, there exist $a > 0$ and $\delta > 0$ such that
$$\sup_n\, \mu_n\left(\{x : |x(0)| > a\}\right) < \varepsilon \tag{14.47}$$
and
$$\sup_n\, \mu_n\left(\{x : \|x\|_\delta > \varepsilon\}\right) < \varepsilon, \tag{14.48}$$
where $\|x\|_\delta := \sup_{|s-t|\le\delta;\ s,t\in[0,1]} |x(s) - x(t)|$.

Suppose that (2) holds and let $\mu$ be the probability distribution of $S$. Given $t_1, \ldots, t_j$ in $[0,1]$, the vectors $(S_n(t_1), \ldots, S_n(t_j))$ converge in probability to $(S(t_1), \ldots, S(t_j))$. Thus the finite-dimensional distributions converge and, clearly, (14.47) holds. Also,
$$\mu_n\left(\{x : \|x\|_\delta > \varepsilon\}\right) = P\left(\|S_n\|_\delta > \varepsilon\right) \le P\left(\|S_n - S\|_\delta > \varepsilon/2\right) + P\left(\|S\|_\delta > \varepsilon/2\right) \le P\left(\|S_n - S\| > \varepsilon/4\right) + P\left(\|S\|_\delta > \varepsilon/2\right), \tag{14.49}$$
since $\|x\|_\delta \le 2\|x\|$. Since $S_n$ converges to $S$ in probability, given $\varepsilon > 0$ we can find an $n_0$ such that for $n \ge n_0$, the first probability in the last line of (14.49) is less than $\varepsilon/2$. Furthermore, since continuous functions on compact sets are uniformly continuous, we can take $\delta$ sufficiently small so that
$$\sup_{1\le j\le n_0} \mu_j\left(\{x : \|x\|_\delta > \varepsilon\}\right) < \varepsilon/2 \quad\text{and}\quad P\left(\|S\|_\delta > \varepsilon/2\right) < \varepsilon/2. \tag{14.50}$$
Therefore (2) implies (3). Statement (3) implies (4) by the definition of weak convergence of measures.

We now show that (4) implies (3) implies (2) implies (1). Suppose that (4) holds. By considering characteristic functions we see, in particular, that given $t \in [0,1]$, the real-valued random variables $S_n(t)$ converge in distribution. Therefore, by Lévy's Theorem in the real-valued case,
$$S_n(t) \to S(t) \qquad a.s., \tag{14.51}$$
where $S(t)$ denotes the limit random variable (here we use the fact that we are considering sums of independent random variables; see, e.g., Chung (1974, Theorem 9.5.5)). Consequently, for $t_1, \ldots, t_j$ in $[0,1]$ and $A$ a measurable set in $\mathcal B(R^j)$,
$$P\left((S(t_1), \ldots, S(t_j)) \in A\right) = \mu\left(\{x : (x(t_1), \ldots, x(t_j)) \in A\}\right). \tag{14.52}$$
µ is a probability measure on C[0, 1]. By (14.52) and Billingsley (1968, Theorem 9.2), there is a separable version of the process {S(t), t ∈ [0, 1]}
that is continuous almost surely and for which (14.51) and (14.52) continue to hold. Therefore we can consider $S$ as a $C[0,1]$-valued random variable. Furthermore, since the finite-dimensional distributions of $S_n$ and $S - S_n$ are independent and symmetric, $S_n$ and $S - S_n$ are independent, symmetric $C[0,1]$-valued random variables. Applying Lemma 14.6.1 to the series with two terms, $S_n$ and $S - S_n$, we see that
$$P\left(\|S_n\|_\delta > \varepsilon\right) \le 2\, P\left(\|S\|_\delta > \varepsilon\right). \tag{14.53}$$
This shows that $\{\mu_n\}$ is tight; (3) follows.

Assume (3). This implies that $\{S_n\}$ is a tight family and, as we have already shown, that (4) holds, where $S$ is a random variable in $C[0,1]$. Therefore, $\{S_n - S\}$ is also a tight family. Let $t_1, \ldots, t_j \in [0,1]$ be such that each $t \in [0,1]$ satisfies $|t - t_k| < \delta$ for some $1 \le k \le j$. By (4), given $\varepsilon > 0$, we can find an $n$ so that
$$P\left(\max_{1\le k\le j} |S_n(t_k) - S(t_k)| > \varepsilon/2\right) < \varepsilon/2. \tag{14.54}$$
Also, since $\{S_n - S\}$ is a tight family, for $\delta$ sufficiently small,
$$\sup_{n\ge 1} P\left(\|S_n - S\|_\delta > \varepsilon/2\right) < \varepsilon/2. \tag{14.55}$$
Noting that
$$\|S_n - S\| \le \|S_n - S\|_\delta + \max_{1\le k\le j} |S_n(t_k) - S(t_k)| \tag{14.56}$$
and using (14.54) and (14.55), we get (2).

By Lemma 14.6.1,
$$P\left(\max_{1\le p\le r} \|X_m + \cdots + X_{m+p}\| > \varepsilon\right) \le 2\, P\left(\|X_m + \cdots + X_{m+r}\| > \varepsilon\right). \tag{14.57}$$
The same argument that was used for real-valued random variables shows that (2) implies (1).

Remark 14.6.3 Statements (1), (2), and (3) in Theorem 14.6.2 hold without the assumption of symmetry. However, it is shown in Ito and Nisio (1968a) (see also Jain and Marcus (1978, II, Remark 3.5)) that (4) does not necessarily imply (3) without the assumption of symmetry.

We have the following important corollary of Theorem 14.6.2.

Corollary 14.6.4 Let $X = \{X(t),\, t \in T\}$ be a process of class S. Assume further that $X$ has a version with continuous sample paths. Then
the series in (5.94) converges uniformly on $T$ almost surely (and thus is a concrete version of $X$).

Proof Since $X$ has a version with continuous paths, there is a measure $\mu$ on $C(T)$, the space of continuous functions on $T$, that has the same finite-dimensional distributions as $X$. Also, the $X_j(t) := \phi_j(t)\xi_j$ are independent symmetric $C(T)$-valued random variables. For fixed $t \in T$, by hypothesis, $S_n(t)$ converges to $S(t)$ almost surely. This is all we used to show that (4) implies (1) in the proof of Theorem 14.6.2.
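Corollary 14.6.4 is what guarantees, for instance, that an orthogonal expansion of Brownian motion converges uniformly, not merely in $L^2$ at each $t$. The sketch below is an added illustration; the sine basis $\phi_k(t) = \sqrt 2\, \sin((k-\tfrac12)\pi t)/((k-\tfrac12)\pi)$, the grid, and the truncation levels are our choices, with a far-out truncation standing in for the limit.

    import numpy as np

    rng = np.random.default_rng(1)
    t = np.linspace(0, 1, 1001)
    K = 2000                                  # "infinite" series, truncated far out
    k = np.arange(1, K + 1)
    xi = rng.standard_normal(K)               # i.i.d. N(0,1) coefficients

    freq = (k - 0.5) * np.pi
    phi = np.sqrt(2) * np.sin(np.outer(freq, t)) / freq[:, None]
    S = xi @ phi                              # stand-in for the limit process

    for n in (10, 100, 1000):
        err = np.abs(xi[:n] @ phi[:n] - S).max()   # sup-norm distance to the limit
        print(n, err)                              # decreases: uniform convergence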
14.7 Regularly varying functions

Regularly varying functions are introduced in Section 7.2. In results involving these functions we use well-known integrability theorems. We state them here without proof. Essentially, these results say that regularly varying functions of index $p$ integrate like pure $p$-th powers. The only exception is the very interesting case of regularly varying functions of index minus one. For details and proofs we refer the reader to Bingham, Goldie and Teugels (1987), Feller (1971), and Pitman (1968), and to the seminal paper, Karamata (1930).

Theorem 14.7.1 If $L$ is slowly varying at infinity and is locally bounded on $[M, \infty)$ for some $0 < M < \infty$, then for all $p < 1$,
$$\int_M^x \frac{L(u)}{u^p}\, du \sim \frac{x^{1-p}\, L(x)}{1-p} \quad \text{as } x \to \infty. \tag{14.58}$$
If $L$ is slowly varying at infinity and $p > 1$, then $\int_M^\infty (L(u)/u^p)\, du < \infty$ for some $0 < M < \infty$ and
$$\int_x^\infty \frac{L(u)}{u^p}\, du \sim \frac{L(x)}{(p-1)\,x^{p-1}} \quad \text{as } x \to \infty. \tag{14.59}$$

If $L(u) \equiv 1$ in (14.58) or (14.59), these relationships are obvious. Therefore, one can think of these relationships as showing that slowly varying functions vary so slowly with respect to powers that they do not affect the rate of growth of the integrals of powers, except for regularly varying functions of index minus one, which is not included in the above theorem.
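As a numerical sanity check of (14.58) — an added illustration, with the slowly varying function $L = \log$, the index $p = 1/2$, and the grid all arbitrary choices — one can compare $\int_M^x u^{-p}\log u\, du$ with $x^{1-p}\log x/(1-p)$:

    import numpy as np

    # Karamata, index p < 1: int_M^x L(u)/u^p du ~ x^{1-p} L(x) / (1-p),
    # here with L(u) = log(u).
    p, M = 0.5, 2.0
    for x in (1e2, 1e4, 1e6, 1e8):
        u = np.geomspace(M, x, 200_001)
        y = np.log(u) / u**p
        integral = ((y[1:] + y[:-1]) / 2 * np.diff(u)).sum()   # trapezoid rule
        asymptotic = x**(1 - p) * np.log(x) / (1 - p)
        print(f"x={x:.0e}  ratio={integral / asymptotic:.4f}")  # ratio -> 1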
We treat this case next.

Theorem 14.7.2 If $L$ is slowly varying at infinity and is in $L^1([a,b])$ for all intervals $[a,b] \subset [M, \infty)$ for some $0 < M < \infty$, then $\int_M^x (L(u)/u)\, du$ is slowly varying and
$$\frac{1}{L(x)} \int_M^x \frac{L(u)}{u}\, du \to \infty \quad \text{as } x \to \infty. \tag{14.60}$$
If $L$ is slowly varying at infinity and $\int_M^\infty (L(u)/u)\, du < \infty$ for some $0 < M < \infty$, then $\int_x^\infty (L(u)/u)\, du$ is slowly varying and
$$\frac{1}{L(x)} \int_x^\infty \frac{L(u)}{u}\, du \to \infty \quad \text{as } x \to \infty. \tag{14.61}$$

If $L(x)$ is slowly varying at zero, then $L(1/x)$ is slowly varying at infinity. Substituting $L(1/x)$ into the above results gives comparable results as $x \to 0$. We give them for the convenience of the reader.

Theorem 14.7.3 If $L$ is slowly varying at zero and is bounded on $[a, \delta]$ for some $\delta > 0$ and all $0 < a < \delta$, then for all $p > 1$,
$$\int_x^\delta \frac{L(u)}{u^p}\, du \sim \frac{L(x)}{(p-1)\,x^{p-1}} \quad \text{as } x \to 0. \tag{14.62}$$
If $L$ is slowly varying at zero and $p < 1$, then $\int_0^\delta (L(u)/u^p)\, du < \infty$ for some $\delta > 0$ and
$$\int_0^x \frac{L(u)}{u^p}\, du \sim \frac{x^{1-p}\, L(x)}{1-p} \quad \text{as } x \to 0. \tag{14.63}$$

Theorem 14.7.4 If $L$ is slowly varying at zero and is bounded on $[a, \delta]$ for some $\delta > 0$ and all $0 < a < \delta$, then $\int_x^\delta (L(u)/u)\, du$ is slowly varying and
$$\frac{1}{L(x)} \int_x^\delta \frac{L(u)}{u}\, du \to \infty \quad \text{as } x \to 0. \tag{14.64}$$
If $L$ is slowly varying at zero and $\int_0^\delta (L(u)/u)\, du < \infty$ for some $\delta > 0$, then $\int_0^x (L(u)/u)\, du$ is slowly varying and
$$\frac{1}{L(x)} \int_0^x \frac{L(u)}{u}\, du \to \infty \quad \text{as } x \to 0. \tag{14.65}$$

The next result is the Monotone Density Theorem.

Theorem 14.7.5 Let
$$F(x) = \int_0^x f(u)\, du, \qquad x \in [0, M], \tag{14.66}$$
for $M > 0$, where $f$ is integrable on $[0, M]$. If $F(x) = x^p L(x)$ as $x \downarrow 0$,
where $p \ge 0$ and $L$ is slowly varying at $0$, and if $f$ is monotone on $[0, \delta]$ for some $\delta > 0$, then
$$f(x) \sim p\, x^{p-1}\, L(x). \tag{14.67}$$
It follows from Theorem 14.7.5 that functions that are concave on $[0, \delta]$ for some $\delta > 0$ and are regularly varying at $0$ are in fact normalized regularly varying functions; see (7.123).

Let $U$ be nondecreasing on $R_+$ with $U(0) = 0$. Let
$$\widehat U(s) := \int_0^\infty e^{-sx}\, dU(x) \tag{14.68}$$
for $s > \sigma$, for some $\sigma \ge 0$ for which the integral is finite. The following is from Feller (1971, Theorem 3, Chapter XIII, Section 5).

Theorem 14.7.6 Assume that $\widehat U(s) < \infty$ for all $s$ sufficiently large and let $L$ be a slowly varying function at infinity. For all $c \ge 0$ and $p \ge 0$, the following are equivalent:
$$U(x) \sim c\, x^p\, L(1/x)/\Gamma(1+p) \quad \text{as } x \to 0+, \tag{14.69}$$
$$\widehat U(s) \sim c\, s^{-p}\, L(s) \quad \text{as } s \to \infty. \tag{14.70}$$
When $c = 0$, (14.69) is interpreted as $U(x) = o\left(x^p L(1/x)\right)$, and similarly for (14.70).
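To illustrate Theorem 14.7.6 numerically — an added example with $L \equiv 1$ and the arbitrary choice $U(x) = x^p$, $p = 1/2$ — note that $dU = p\, x^{p-1} dx$ and $\widehat U(s) = \Gamma(1+p)\, s^{-p}$ exactly, so both (14.69) and (14.70) hold with $c = \Gamma(1+p)$:

    import numpy as np
    from math import gamma

    # Tauberian pair with U(x) = x^p: U-hat(s) = Gamma(1+p) s^{-p}.
    p = 0.5
    for s in (1e1, 1e3, 1e5):
        x = np.geomspace(1e-12 / s, 50 / s, 100_001)   # e^{-50} tail is negligible
        y = np.exp(-s * x) * p * x**(p - 1)
        Uhat = ((y[1:] + y[:-1]) / 2 * np.diff(x)).sum()   # trapezoid quadrature
        print(s, Uhat * s**p / gamma(1 + p))               # ratio -> 1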
14.8 Some useful inequalities

Lemma 14.8.1 Let $f(x)$ be a positive decreasing convex function on $[a, \infty)$ that is differentiable at $a$. Then
$$\int_a^\infty f(x)\, dx \ge -\frac{f^2(a)}{2 f'(a)}. \tag{14.71}$$

Proof The tangent line to $f(x)$ at $x = a$ is given by
$$y = f(a) + f'(a)(x - a) \tag{14.72}$$
(since it has slope $f'(a)$ and contains $(a, f(a))$). By convexity, $f(x) \ge f(a) + f'(a)(x - a)$. The line (14.72) intercepts the $x$-axis at $b = (a f'(a) - f(a))/f'(a)$. The right-hand side of (14.71) is $\int_a^b \left(f(a) + f'(a)(x - a)\right) dx$. (To evaluate this integral easily, note that it is the area of a right triangle with the sides that meet at the right angle having lengths $f(a)$ and $b - a = -f(a)/f'(a)$.) Since $f \ge 0$ and $f$ dominates the tangent line, $\int_a^\infty f(x)\, dx \ge \int_a^b \left(f(a) + f'(a)(x-a)\right) dx$, which gives (14.71).
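A quick check of (14.71) — an added example with the arbitrary choice $f(x) = e^{-x}$: here $f'(a) = -e^{-a}$, so the bound is $e^{-a}/2$, while the integral is $e^{-a}$.

    import numpy as np
    from scipy.integrate import quad

    # Check (14.71) for f(x) = exp(-x); the inequality holds with a factor
    # of 2 to spare.
    f = lambda x: np.exp(-x)
    for a in (0.0, 1.0, 3.0):
        lhs, _ = quad(f, a, np.inf)
        rhs = -f(a)**2 / (2 * (-f(a)))    # -f(a)^2 / (2 f'(a))
        print(a, lhs, rhs, lhs >= rhs)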
Lemma 14.8.2 (Paley–Zygmund Lemma) For a positive random variable $X$ and $0 < \lambda < 1$,
$$P\left(X \ge \lambda EX\right) \ge (1 - \lambda)^2\, \frac{(EX)^2}{EX^2}. \tag{14.73}$$

Proof This follows immediately from the following two elementary inequalities:
$$E\left(X I_{[X \ge \lambda EX]}\right) \le \left(EX^2\right)^{1/2} \left(P(X \ge \lambda EX)\right)^{1/2} \tag{14.74}$$
and
$$E\left(X I_{[X \ge \lambda EX]}\right) = EX - E\left(X I_{[X < \lambda EX]}\right) \ge (1 - \lambda)\, EX. \tag{14.75}$$
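The bound is easy to test by simulation. The following sketch is an added illustration; the exponential distribution and the sample size are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.exponential(scale=1.0, size=1_000_000)   # a positive random variable

    for lam in (0.25, 0.5, 0.75):
        lhs = (X >= lam * X.mean()).mean()                  # P(X >= lam EX)
        rhs = (1 - lam)**2 * X.mean()**2 / (X**2).mean()    # (1-lam)^2 (EX)^2 / EX^2
        print(lam, lhs, rhs, lhs >= rhs)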
Lemma 14.8.3 (Boas’s Lemma) Let aj ≥ 0, j ≥ 1, . . . and gn2 = ∞ 2 j=n aj . Then ∞
an ≤ 2
n=1
∞ g √n . n n=1
(14.76)
If, in addition, the {aj } are nonincreasing, ∞ ∞ g √n ≤ 2 an . n n=1 n=1
(14.77)
Proof Inequality (14.76) is a simple consequence of the Schwarz inequality,
∞ 1/2 ∞ ∞ ∞ ∞ ∞ 1 ak g √n . (14.78) ≤ an = gn ≤ 2 2 k k n n=1 n=1 n=1 n=1 k=n
k=n
To obtain (14.77) we note that, for all $m \ge n$,
$$a_m^2 = \left(\Big(\sum_{k=n}^m a_k^2\Big)^{1/2} - \Big(\sum_{k=n}^{m-1} a_k^2\Big)^{1/2}\right)\left(\Big(\sum_{k=n}^m a_k^2\Big)^{1/2} + \Big(\sum_{k=n}^{m-1} a_k^2\Big)^{1/2}\right) \ge 2\, a_m\, (m-n)^{1/2} \left(\Big(\sum_{k=n}^m a_k^2\Big)^{1/2} - \Big(\sum_{k=n}^{m-1} a_k^2\Big)^{1/2}\right), \tag{14.79}$$
since, the $\{a_j\}$ being nonincreasing, $\big(\sum_{k=n}^{m-1} a_k^2\big)^{1/2} \ge a_m (m-n)^{1/2}$. Consequently,
$$\sum_{m=n+1}^\infty \frac{a_m}{2\,(m-n)^{1/2}} \ge \sum_{m=n+1}^\infty \left(\Big(\sum_{k=n}^m a_k^2\Big)^{1/2} - \Big(\sum_{k=n}^{m-1} a_k^2\Big)^{1/2}\right) = g_n - a_n \tag{14.80}$$
or, equivalently,
$$g_n \le a_n + \frac{1}{2} \sum_{m=n+1}^\infty \frac{a_m}{(m-n)^{1/2}}. \tag{14.81}$$
Multiplying this by $n^{-1/2}$ and summing on $n$ gives
$$\sum_{n=1}^\infty \frac{g_n}{\sqrt n} \le a_1 + \sum_{m=1}^\infty a_{m+1} \left(\frac{1}{(m+1)^{1/2}} + \frac{1}{2} \sum_{n=1}^m \frac{1}{\left(n(m+1-n)\right)^{1/2}}\right), \tag{14.82}$$
where we also use a change of the order of summation. Inequality (14.77) follows by noting that
$$\sup_{m \ge 1}\left(\frac{1}{(m+1)^{1/2}} + \frac{1}{2} \sum_{n=1}^m \frac{1}{\left(n(m+1-n)\right)^{1/2}}\right) \le 2. \tag{14.83}$$
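Both directions of Boas's Lemma are easy to test numerically. The sketch below is an added illustration; the nonincreasing choice $a_j = j^{-2}$ and the truncation level standing in for the infinite sums are arbitrary.

    import numpy as np

    N = 200_000
    j = np.arange(1, N + 1)
    a = 1.0 / j**2                          # a_j >= 0 and nonincreasing

    # g_n^2 = sum_{k>=n} a_k^2, via a reversed cumulative sum.
    g = np.sqrt(np.cumsum((a**2)[::-1])[::-1])

    sum_a = a.sum()
    sum_g = (g / np.sqrt(j)).sum()
    print(sum_a, sum_g)
    print(sum_a <= 2 * sum_g, sum_g <= 2 * sum_a)   # (14.76) and (14.77)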
14.9 Some linear algebra

We collect some results about symmetric matrices that are used in this book.

Lemma 14.9.1 A symmetric matrix is strictly positive definite if and only if all its principal minors are strictly positive.

Proof Let $A$ be an $n \times n$ symmetric matrix with all of its principal minors strictly positive. We show by induction on $n$ that $A$ is strictly positive definite. This is clear if $n = 1$, so assume that $n \ge 2$ and that we have established this result for all symmetric $(n-1) \times (n-1)$ matrices. Let $u_1, \ldots, u_n$ be eigenvectors of $A$ that are an orthonormal basis for $R^n$. Let $\rho_1, \ldots, \rho_n$ denote their corresponding eigenvalues. It is clear that none of the eigenvalues are zero because $\det A = \prod_{j=1}^n \rho_j > 0$. Note that
$$\left(\sum_{j=1}^n b_j u_j,\; A \sum_{j=1}^n b_j u_j\right) = \sum_{j=1}^n \rho_j\, b_j^2. \tag{14.84}$$
Thus, to show that $A$ is strictly positive definite, we need to show that all the eigenvalues $\rho_1, \ldots, \rho_n$ are strictly greater than zero. Since $\det A > 0$, it is not possible for $A$ to have exactly one strictly negative eigenvalue. Assume then that $A$ has at least two strictly negative eigenvalues.
Relabeling the eigenvalues, if necessary, we can assume that $\rho_1$ and $\rho_2$ are strictly less than zero. Then, by (14.84), for any real number $r$,
$$\left((u_1 + r u_2),\; A(u_1 + r u_2)\right) = \rho_1 + r^2 \rho_2 < 0. \tag{14.85}$$
Choose $r$ so that the $n$-th component of $v = u_1 + r u_2$ is equal to zero. Consider $v$ as an element of $R^{n-1}$. Let $A_{n-1}$ denote the $(n-1) \times (n-1)$ matrix $\{A_{i,j}\}_{1\le i,j\le n-1}$. Then (14.85) states that $(v, A_{n-1} v) < 0$, which contradicts the induction hypothesis. This shows that all the eigenvalues of $A$ are strictly greater than zero, and hence $A$ is strictly positive definite.

For the other direction, let $A_p$ denote the $p \times p$ matrix $\{A_{i,j}\}_{1\le i,j\le p}$, $1 \le p \le n$. Since $A$ is strictly positive definite, so is $A_p$, for all $p = 1, \ldots, n$. This implies that all the eigenvalues of $A_p$ are strictly positive. Consequently, $\det A_p > 0$, for all $p = 1, \ldots, n$.
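Lemma 14.9.1 is Sylvester's criterion, and it is easy to spot-check numerically. The sketch below is an added illustration; the random matrix and the diagonal shift that makes it positive definite are arbitrary choices, and only the leading principal minors are checked, which suffices for symmetric matrices.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 5
    M = rng.standard_normal((n, n))
    A = (M + M.T) / 2 + n * np.eye(n)       # symmetric, shifted to be positive definite

    minors = [np.linalg.det(A[:p, :p]) for p in range(1, n + 1)]
    print(all(m > 0 for m in minors))            # leading principal minors > 0
    print(np.all(np.linalg.eigvalsh(A) > 0))     # True: strictly positive definite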
Lemma 14.9.2 Let $B = \{b_{i,j}\}_{1\le i,j\le p}$ be a symmetric matrix with $B \ge 0$ and let $\rho(B)$ denote the largest eigenvalue of $B$. Then we can find an eigenvector $x = (x_1, \ldots, x_p)$ with eigenvalue $\rho(B)$ such that $x_i \ge 0$ for all $1 \le i \le p$, with strict inequality for at least one $i$.

Proof Let $u_1, \ldots, u_p$ be eigenvectors of $B$ that form an orthonormal basis for $R^p$. Let $\rho_1, \ldots, \rho_p$ denote their corresponding eigenvalues. Note that any $y \in R^p$ with $|y| = 1$ can be written as $y = \sum_{j=1}^p a_j u_j$ with $\sum_{j=1}^p a_j^2 = 1$. Thus
$$(By, y) = \sum_{j=1}^p \rho_j\, a_j^2 \le \rho(B). \tag{14.86}$$
Let $x = (x_1, \ldots, x_p)$, $|x| = 1$, be an eigenvector of $B$ with eigenvalue $\rho(B)$. Then $(Bx, x) = \rho(B)$, that is,
$$(Bx, x) = \sum_{i,j=1}^p b_{i,j}\, x_i x_j = \rho(B). \tag{14.87}$$
Let $\bar x = (|x_1|, \ldots, |x_p|)$ and note that, trivially, $|\bar x|$ is also equal to 1. Since, by hypothesis, all the $b_{i,j} \ge 0$, (14.87) clearly implies that
$$(B \bar x, \bar x) = \sum_{i,j=1}^p b_{i,j}\, |x_i| |x_j| \ge \rho(B). \tag{14.88}$$
Therefore, by (14.86), $\bar x$ is an eigenvector of $B$ with eigenvalue $\rho(B)$.
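A numerical illustration, added here, with an arbitrary entrywise-nonnegative random symmetric matrix: taking absolute values of any top eigenvector again gives a top eigenvector.

    import numpy as np

    rng = np.random.default_rng(4)
    p = 6
    B = rng.random((p, p))
    B = (B + B.T) / 2                        # symmetric with B >= 0 entrywise

    w, V = np.linalg.eigh(B)                 # eigenvalues in increasing order
    rho, x = w[-1], V[:, -1]                 # largest eigenvalue and an eigenvector
    xbar = np.abs(x)
    print(np.allclose(B @ xbar, rho * xbar)) # True: |x| is also an eigenvector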
A $p \times p$ matrix $A = \{a_{i,j}\}_{1\le i,j\le p}$ is said to be reducible if there exists a permutation matrix $P$ such that
$$P A P^t = \begin{pmatrix} B & 0 \\ C & D \end{pmatrix}, \tag{14.89}$$
where $B$ and $D$ are square matrices. (A $p \times p$ matrix $P$ is a permutation matrix if exactly one entry in each row and column is equal to 1 and all the other entries are 0. Note that the effect of left multiplication by $P$ in (14.89) is to permute rows of $A$, whereas the effect of right multiplication by $P^t$ in (14.89) is to perform the same permutation on the columns of $A$.) A matrix that is not reducible is called irreducible.

If $A$ is a symmetric, reducible matrix, one can apply a sequence of these interchanges to bring $A$ to the form in which it can be expressed as a direct sum of irreducible matrices. When this is done, $P A P^t$ appears as a string of irreducible square matrices aligned along its diagonal, with 0 as the entries in the rows and columns outside these matrices.

There is a nice interpretation of irreducible matrices as graphs that is particularly simple for symmetric matrices. For a $p \times p$ matrix $A = \{a_{i,j}\}_{1\le i,j\le p}$, consider a graph with $p$ nodes. If $a_{j,k} \neq 0$, we say that one can go from node $j$ to node $k$ in the graph. If $A$ is symmetric, then, clearly, one can also go from node $k$ to node $j$. We can break the nodes up into disjoint subsets of nodes that communicate with each other but not with nodes in any other subset. Then we can reorder the indices of these nodes as
$$\{i_{1,1}, \ldots, i_{1,n_1},\; i_{2,1}, \ldots, i_{2,n_2},\; \ldots,\; i_{k,1}, \ldots, i_{k,n_k}\}, \tag{14.90}$$
where $n_1 + \cdots + n_k = p$ and where the elements in each set $i_{j,1}, \ldots, i_{j,n_j}$ communicate with each other. The permutation of $\{1, \ldots, p\}$ into the sequence in (14.90) is the permutation that defines the permutation matrix $P$ expressing $P A P^t$ as a direct sum of irreducible matrices.

Lemma 14.9.3 Let $A = \{a_{i,j}\}_{1\le i,j\le p}$ be a symmetric irreducible matrix, $A \neq 0$. Then, for all $l, j \in \{1, \ldots, p\}$, there exist an $r$ and distinct integers $k_1, \ldots, k_r$ in $\{1, \ldots, p\}$ such that $k_1 = l$, $k_r = j$, and
$$a_{k_1,k_2},\; a_{k_2,k_3},\; \ldots,\; a_{k_{r-1},k_r} \tag{14.91}$$
are all nonzero.

Proof Consider the graph $G$ with vertices $\{i \mid 1 \le i \le p\}$ and with an edge between $i$ and $j$ if and only if $a_{i,j} \neq 0$. The fact that $A$ is irreducible
means that $G$ is connected, that is, that each element of $G$ can be reached from any other element of $G$. Choose some path from $l$ to $j$ and remove any loops it may contain. This gives the distinct integers $k_1, \ldots, k_r$ in the sequence of elements of $A$ that connect $l$ to $j$ in (14.91).

$A = \{a_{i,j}\}_{1\le i,j\le p}$ is an $M$-matrix if $a_{i,j} \le 0$ for $i \neq j$, $A$ is nonsingular, and $A^{-1} \ge 0$; see page 561.

Lemma 14.9.4 Let $A = \{a_{i,j}\}_{1\le i,j\le p}$ be a symmetric matrix with $a_{i,j} \le 0$, $i \neq j$. Then the following are equivalent:
(1) $A$ is an $M$-matrix.
(2) For all $\lambda > 0$ sufficiently large, we can write $A = \lambda I - B$, where $B \ge 0$ and $\lambda$ is greater than the absolute value of any eigenvalue of $B$.
(3) $A$ is strictly positive definite.

Proof (1) $\Rightarrow$ (2): By choosing $\lambda$ sufficiently large we can always write $A = \lambda I - B$, where $B \ge 0$ and is strictly positive definite. This is because the larger we take $\lambda$, the larger the diagonal elements of $B$ must be. Since the off-diagonal elements of $B$ remain unchanged, by taking $\lambda$ sufficiently large, $B$ has strictly positive principal minors. Thus it is strictly positive definite and, in particular, all the eigenvalues of $B$ are positive. We write
$$A = \lambda\left(I - \frac{1}{\lambda}\, B\right) := \lambda(I - T). \tag{14.92}$$
By the hypothesis, $(I - T)^{-1}$ exists and $(I - T)^{-1} \ge 0$. Let $\rho(T)$ denote the largest eigenvalue of $T$. Since $T$ is symmetric and $T \ge 0$, it follows from Lemma 14.9.2 that there exists a vector $x = (x_1, \ldots, x_p)$ with $x_i \ge 0$ for all $1 \le i \le p$, and with strict inequality for at least one $i$, such that $T x = \rho(T) x$. The fact that $(I - T)^{-1}$ exists means that $\rho(T) \neq 1$. Thus
$$(I - T)x = (1 - \rho(T))x, \tag{14.93}$$
and since $(I - T)^{-1}$ exists,
$$(I - T)^{-1} x = \frac{1}{1 - \rho(T)}\, x. \tag{14.94}$$
Since $(I - T)^{-1} \ge 0$ and $x \ge 0$, $x \neq 0$, we see that $\rho(T) < 1$. Thus $\lambda$ is greater than the absolute value of any eigenvalue of $B$.

(2) $\Rightarrow$ (3): Let $x$ be an eigenvector of $A$ with eigenvalue $\xi$. Then, by (14.92), $\lambda - \xi$ is an eigenvalue of $B$. Since $B$ is symmetric, $\lambda - \xi$ is real.
Therefore, the hypothesis that $\lambda > |\lambda - \xi|$ implies that $\xi > 0$. Thus all the eigenvalues of $A$ are strictly greater than zero, and therefore $A$ is strictly positive definite.

(3) $\Rightarrow$ (1): Since $A$ is strictly positive definite, writing $A$ as in (14.92) shows that $\lambda$ is greater than any of the eigenvalues of $B$. Note that
$$\left(I - \frac{1}{\lambda}\, B\right) \sum_{j=0}^k \left(\frac{1}{\lambda}\, B\right)^j = I - \left(\frac{1}{\lambda}\, B\right)^{k+1}. \tag{14.95}$$
Therefore, taking the limit as $k \to \infty$, we see that
$$A^{-1} = \frac{1}{\lambda} \sum_{k=0}^\infty \left(\frac{1}{\lambda}\, B\right)^k, \tag{14.96}$$
where the sum converges in the operator norm. Since $B \ge 0$, we see that $A^{-1} \ge 0$. Thus $A$ is an $M$-matrix.
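The Neumann series (14.96) behind (3) $\Rightarrow$ (1) can be watched numerically. The sketch below is an added illustration; the particular $M$-matrix (a discrete Laplacian), the value of λ, and the truncation are arbitrary choices.

    import numpy as np

    # A symmetric, strictly positive definite matrix with nonpositive
    # off-diagonal entries; by Lemma 14.9.4 it is an M-matrix.
    A = np.array([[ 2.0, -1.0,  0.0],
                  [-1.0,  2.0, -1.0],
                  [ 0.0, -1.0,  2.0]])
    lam = 5.0                                # any lam > max eigenvalue of B works
    B = lam * np.eye(3) - A                  # B >= 0 entrywise

    # Partial sums of (1/lam) sum_k (B/lam)^k converge to A^{-1}, which is >= 0.
    S, term = np.zeros((3, 3)), np.eye(3)
    for _ in range(200):
        S += term / lam
        term = term @ (B / lam)
    print(np.allclose(S, np.linalg.inv(A)))      # True
    print((np.linalg.inv(A) >= -1e-12).all())    # inverse is entrywise nonnegative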
References
The numbers following each entry refer to the pages on which the entry is cited in this book.
Adler, R. J. (1991). An Introduction to Continuity, Extrema, and Related Topics for General Gaussian Processes. Hayward, CA: Institute of Mathematical Statistics. [241]
Adler, R. J. and Pyke, R. (1993). Uniform quadratic variation for Gaussian processes. Stochastic Process. Appl., 48, 191–210. [495]
Ahlfors, L. (1966). Complex Analysis (second ed.). New York: McGraw-Hill. [142]
Bakry, D. and Ledoux, M. (1996). Lévy–Gromov's isoperimetric inequality for an infinite dimensional diffusion generator. Invent. Math., 123, 259–281. [242]
Bañuelos, R. and Smits, R. G. (1997). Brownian motion in cones. Prob. Theory Related Fields, 108, 299–319. [527]
Bapat, R. (1989). Infinite divisibility of multivariate gamma distributions and M-matrices. Sankhya, 51, 73–78. [6, 579]
Barlow, M. (1985). Continuity of local times for Lévy processes. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 69, 23–35. [2, 120]
Barlow, M. (1988). Necessary and sufficient conditions for the continuity of local time of Lévy processes. Ann. Probab., 16, 1389–1427. [2, 120, 455]
Barlow, M. and Hawkes, J. (1985). Applications de l'entropie métrique à la continuité des temps locaux des processus de Lévy. C. R. Acad. Sc. Paris, 301, 237–239. [2, 120]
Bass, R. F., Eisenbaum, N., and Shi, Z. (2000). The most visited sites of symmetric stable processes. Prob. Theory Related Fields, 116, 391–404. [5, 527]
Bass, R. F. and Griffin, P. (1985). The most visited site of Brownian motion and simple random walk. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 70, 417–436. [60, 526]
Belyaev, Y. K. (1961). Continuity and Hölder's conditions for sample functions of stationary Gaussian processes. In Fourth Berkeley Sympos. Math. Statist. and Prob., volume 2 (pp. 23–33). Berkeley: University of California Press. [241]
Berman, A. and Plemmons, R. J. (1994). Nonnegative Matrices in the Mathematical Sciences. Philadelphia: Classics in Applied Mathematics, SIAM. [579]
Bertoin, J. (1995). Some applications of subordinators to local times of Markov processes. Forum Math., 7, 629–644. [527]
Bertoin, J. (1996). Lévy Processes. Cambridge: Cambridge University Press. [142, 188]
Bertoin, J. and Caballero, M. (1995). On the rate of growth of subordinators with slowly varying Laplace exponents. In Sem. de Prob. XXIX, volume 1613 of Lecture Notes Math (pp. 125–132). Berlin: Springer-Verlag. [527]
Biane, P. and Yor, M. (1988). Sur la loi des temps locaux browniens en un temps exponentiel. In Séminaire de Probabilités XXII, volume 1321 of Lecture Notes Math (pp. 454–466). Berlin: Springer-Verlag. [61, 550]
Billingsley, P. (1968). Convergence of Probability Measures. New York: Wiley. [592]
Bingham, N., Goldie, C., and Teugels, J. (1987). Regular Variation. Cambridge: Cambridge University Press. [304, 438, 594]
Blackburn, R. (2000). Large deviations of local times of Lévy processes. J. Theor. Probab., 18, 825–842. [527]
Blumenthal, R. and Getoor, R. (1968). Markov Processes and Potential Theory. New York: Academic Press. [61, 119, 120]
Bobkov, S. (1996). A functional form of the isoperimetric inequality for the Gaussian measure. J. Funct. Anal., 135, 39–49. [242]
Borell, C. (1975). The Brunn–Minkowski inequality in Gauss space. Invent. Math., 30, 207–216. [241]
Bouleau, N. and Yor, M. (1981). Sur la variation quadratique de temps locaux de certaines semi-martingales. C. R. Acad. Sc. Paris, 292, 491–492. [496]
Boylan, E. (1964). Local times for a class of Markov processes. Ill. J. Math., 8, 19–39. [120]
Brydges, D., Fröhlich, J., and Spencer, T. (1982). The random walk representation of classical spin systems and correlation inequalities. Comm. Math. Phys., 83, 123–150. [1]
Chung, K. L. (1967). Markov Chains with Stationary Transition Probabilities (second ed.). New York: Springer-Verlag. [455]
Chung, K. L. (1974). A Course in Probability Theory (second ed.). San Diego: Academic Press. [592]
Chung, K. L. (1982). Lectures from Markov Processes to Brownian Motion. New York: Springer-Verlag. [61, 119]
Chung, K. L., Erdős, P., and Sirao, T. (1959). On the Lipschitz's condition for Brownian motion. J. Math. Soc. Japan, 11, 263–274. [361]
de la Vega, F. W. (1973). On almost sure convergence of quadratic Brownian variation. Ann. Probab., 2, 551–552. [456, 495]
Dellacherie, C., Maisonneuve, B., and Meyer, P.-A. (1992). Probabilités et Potentiel, Chapitres XVII à XXIV. Paris: Hermann. [119]
Dellacherie, C. and Meyer, P.-A. (1978). Probabilities and Potential. Paris, Amsterdam: Hermann, North Holland. [175]
Dellacherie, C. and Meyer, P.-A. (1980). Probabilities and Potential B, Theory of Martingales. Amsterdam: North Holland. [73, 180]
Dellacherie, C. and Meyer, P.-A. (1987). Probabilités et Potentiel, Chapitres XII à XVI. Paris: Hermann. [119, 188]
Donoghue, W. (1969). Distributions and Fourier Transforms. New York: Academic Press. [236]
Donsker, M. and Varadhan, S. R. S. (1977). On laws of the iterated logarithm for local times. Comm. Pure Appl. Math., 30, 707–753. [527]
Doob, J. (1953). Stochastic Processes. New York: Wiley. [237]
Dudley, R. M. (1967). The sizes of compact subsets of Hilbert space and the continuity of Gaussian processes. J. Funct. Anal., 1, 290–330. [280]
Dudley, R. M. (1973). Sample functions of the Gaussian process. Ann. Probab., 1, 66–103. [5, 280, 281, 456, 495]
Dudley, R. M. (1989). Real Analysis and Probability. Belmont, CA: Wadsworth. [215]
Dudley, R. M. (1999). Uniform Central Limit Theorems. Cambridge: Cambridge University Press. [4, 241]
Dynkin, E. B. (1983). Local times and quantum fields. In Seminar on Stochastic Processes, volume 7 of Progress in Probability (pp. 64–84). Boston: Birkhäuser. [1, 394]
Dynkin, E. B. (1984). Gaussian and non-Gaussian random fields associated with Markov processes. J. Fcnl. Anal., 55, 344–376. [1, 394]
Ehrhard, A. (1983). Symétrisation dans l'espace de Gauss. Math. Scand., 53, 281–301. [241]
Eisenbaum, N. (1994). Dynkin's isomorphism theorem and the Ray–Knight theorems. Prob. Theory Related Fields, 99, 321–335. [61, 534, 550]
Eisenbaum, N. (1995). Une version sans conditionnement du théorème d'isomorphisme de Dynkin. In Séminaire de Probabilités XXIX, volume 1613 of Lecture Notes Math (pp. 266–289). Berlin: Springer-Verlag. [1, 394]
Eisenbaum, N. (1997). On the most visited sites by a symmetric stable process. Prob. Theory Related Fields, 107, 527–535. [527]
Eisenbaum, N. (2003). On the infinite divisibility of squared Gaussian processes. Prob. Theory Related Fields, 125, 381–392. [6, 579]
Eisenbaum, N. (2005). A connection between Gaussian processes and Markov processes. Elec. J. Probab., 10, 202–215. [6, 579]
Eisenbaum, N. and Kaspi, H. (2006). A characterization of the infinitely divisible squared Gaussian process. Ann. Probab., 34. [6, 579]
Eisenbaum, N., Kaspi, H., Marcus, M. B., Rosen, J., and Shi, Z. (2000). A Ray–Knight theorem for symmetric Markov processes. Ann. Probab., 28, 1781–1796. [1, 61, 395]
Eisenbaum, N. and Khoshnevisan, D. (2002). On the most visited sites of symmetric Markov processes. Stoch. Proc. Appl., 101, 241–256. [527]
Eisenbaum, N. and Shi, Z. (1999). Measuring the rarely visited sites of Brownian motion. J. Theor. Probab., 12, 595–613. [60]
Feller, W. (1971). An Introduction to Probability Theory and Its Applications, Vol. II. New York: Wiley. [91, 141, 166, 382, 565, 566, 594, 596]
Fernique, X. (1970). Intégrabilité des vecteurs Gaussiens. Comptes Rendus Acad. Sci. Paris Sér. A, 270, 1698–1699. [242]
Fernique, X. (1971). Régularité de processus Gaussiens. Invent. Math., 12, 304–320. [242]
Fernique, X. (1974). Des résultats nouveaux sur les processus Gaussiens. Comptes Rendus Acad. Sci. Paris Sér. A–B, 278, A363–A365. [242, 280]
Fernique, X. (1975). Régularité des trajectoires des fonctions aléatoires Gaussiennes. In École d'Été de Probabilités de Saint-Flour, IV–1974, volume 480 of Lecture Notes Math (pp. 1–96). Berlin: Springer-Verlag. [241, 280]
Fernique, X. (1997). Fonctions Aléatoires Gaussiennes, Vecteurs Aléatoires Gaussiens. Montreal: CRM. [4, 241, 242, 280, 281, 361]
Fitzsimmons, P. and Pitman, J. (1999). Kac's moment formula for additive functionals of a Markov process. Stochastic Process. Appl., 79, 117–134. [61, 120]
Fukushima, M., Oshima, Y., and Takeda, M. (1994). Dirichlet Forms and Symmetric Markov Processes. New York: Walter de Gruyter. [119]
Garsia, A. M., Rodemich, E., and Rumsey, Jr., H. (1970). A real variable lemma and the continuity of paths of some Gaussian processes. Indiana Math. J., 20, 565–578. [280, 361]
Getoor, R. (1975). Markov Processes: Ray Processes and Right Processes, volume 440 of Lecture Notes in Mathematics. New York: Springer-Verlag. [119, 188]
Getoor, R. and Kesten, H. (1972). Continuity of local times of Markov processes. Comp. Math., 24, 277–303. [120]
Giné, E. and Kline, R. (1975). On quadratic variation of processes with Gaussian increments. Ann. Probab., 3, 716–721. [495]
Griffiths, R. C. (1984). Characterizations of infinitely divisible multivariate gamma distributions. Jour. Multivar. Anal., 15, 12–20. [6, 579]
Hawkes, J. (1985). Local times as stationary processes. In From Local Times to Global Geometry, Control and Physics, volume 150 of Pitman Research Notes in Math. (pp. 1). New York: Longman. [120]
Horn, R. A. and Johnson, C. R. (1999). Matrix Analysis. Cambridge: Cambridge Univ. Press. [579]
Hu, Y. and Shi, Z. (1998). The limits of Sinai's simple random walk in random environment. Ann. Probab., 26, 1477–1521. [60]
Ibragimov, I. and Linnik, Y. (1971). Independent and Stationary Sequences of Random Variables. Groningen, Netherlands: Wolters-Noordhoff. [142]
Ito, K. and McKean, H. (1974). Diffusion Processes and Their Sample Paths. New York: Springer-Verlag. [59]
Ito, K. and Nisio, M. (1968a). On the convergence of sums of independent Banach space valued random variables. Osaka J. Math., 5, 35–48. [593]
Ito, K. and Nisio, M. (1968b). On the oscillation function of a Gaussian process. Math. Scand., 22, 209–233. [213, 241]
Jain, N. and Kallianpur, G. (1972). Oscillation function of a multiparameter Gaussian process. Nagoya Math. J., 47, 15–28. [213, 241]
Jain, N. and Marcus, M. B. (1978). Continuity of subgaussian processes. In Probability on Banach Spaces, volume 4 of Advances in Probability (pp. 81–196). New York: Marcel Dekker. [241, 242, 280, 281, 593]
Jain, N. and Monrad, D. (1983). Gaussian measures in Bp. Ann. Probab., 11, 46–57. [495]
Kahane, J.-P. (1985). Some Random Series of Functions. Cambridge: Cambridge University Press. [229]
Kallenberg, O. (2001). Foundations of Modern Probability (second ed.). New York: Springer-Verlag. [188]
Karamata, J. (1930). Sur un mode de croissance régulière des fonctions. Mathematica (Cluj), 4, 38–53. [594]
Katzenelson, Y. (1968). Harmonic Analysis. New York: Wiley. [336]
Kawada, T. and Kono, N. (1973). On the variation of Gaussian processes. In Proceedings of the Second Japan-USSR Symposium on Probability Theory, volume 330 of Lecture Notes Math (pp. 175–192). Berlin: Springer-Verlag. [495]
Khoshnevisan, D. (2002). Multiparameter Processes. New York: Springer-Verlag. [188]
Knight, F. (1969). Random walks and a sojourn density process of Brownian motion. Trans. Amer. Math. Soc., 143, 173–185. [49, 58, 61]
Knight, F. (1981). Essentials of Brownian Motion and Diffusion. Providence, RI: AMS. [61]
Kolmogorov, A. N. (1951). On the differentiability of the transition probabilities in homogeneous Markov processes with a denumerable number of states. Učhen. Zap. MGY, 148, 53–59. [441]
Kono, N. (1969). Oscillation of sample functions in stationary Gaussian processes. Osaka J. Math., 6, 1–12. [495]
Kono, N. (1970). On the modulus of continuity of sample functions of Gaussian processes. J. Math. Kyoto Univ., 10, 493–536. [361]
Lacey, M. (1990). Large deviations for the maximal local time of stable Lévy processes. Ann. Probab., 18, 1669–1674. [527]
Landau, H. J. and Shepp, L. A. (1971). On the supremum of a Gaussian process. Sankhyā Ser. A, 32, 369–378. [242]
Le Jan, Y. (1988). On the Fock space representation of functionals of the occupation field and their renormalization. J. Fcnl. Anal., 80, 88–108. [119]
Ledoux, M. (1998). A short proof of the Gaussian isoperimetric inequality. In High Dimensional Probability, volume 43 of Progress in Probability (pp. 229–232). Boston: Birkhäuser. [241, 242]
Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces. New York: Springer-Verlag. [4, 241, 242, 280, 580, 591]
Lévy, P. (1940). Le mouvement brownien plan. Amer. J. Math., 62, 487–550. [456]
Lévy, P. (1948). Processus stochastiques et mouvement brownien. Paris: Gauthier-Villars. [61]
Li, W. and Shao, Q. (2002). A normal comparison inequality and its applications. Prob. Theory Related Fields, 122, 494–508. [242]
Lifshits, M. A. (1995). Gaussian Random Functions. Dordrecht: Kluwer. [241]
Marcus, M. B. (1968). Hölder conditions for Gaussian processes with stationary increments. Trans. Amer. Math. Soc., 134, 29–52. [361]
Marcus, M. B. (1970). Hölder conditions for continuous Gaussian processes. Osaka J. Math., 7, 483–494. [361]
Marcus, M. B. (1972). Gaussian lacunary series and the modulus of continuity for Gaussian processes. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 22, 301–322. [361]
Marcus, M. B. (1991). Rate of growth of local times of strongly symmetric Markov processes. In Proceedings Seminar on Stochastic Processes, 1990, volume 24 (pp. 253–885). Boston: Birkhäuser. [455]
Marcus, M. B. (2001). The most visited sites of certain Lévy processes. J. Theor. Probab., 14, 867–886. [527]
Marcus, M. B. and Pisier, G. (1981). Random Fourier Series with Applications to Harmonic Analysis. Princeton, NJ: Princeton University Press. [241, 280, 281]
Marcus, M. B. and Rosen, J. (1992a). Moduli of continuity of local times of strongly symmetric Markov processes via Gaussian processes. J. Theor. Probab., 5, 791–825. [5, 361, 454]
Marcus, M. B. and Rosen, J. (1992b). Moment generating functions for the local times of symmetric Markov processes and random walks. In Probability in Banach Spaces 8, Proceedings of the Eighth International Conference, Progress in Probability series, volume 30 (pp. 364–376). Boston: Birkhäuser. [579]
Marcus, M. B. and Rosen, J. (1992c). p-variation of the local times of symmetric stable processes and of Gaussian processes with stationary increments. Ann. Probab., 20, 1685–1713. [495, 496]
Marcus, M. B. and Rosen, J. (1992d). Sample path properties of the local times of strongly symmetric Markov processes via Gaussian processes. Ann. Probab., 20, 1603–1684. [1, 2, 5, 120, 152, 188, 241, 394, 395, 454, 455]
Marcus, M. B. and Rosen, J. (1993). φ-variation of the local times of symmetric Lévy processes and stationary Gaussian processes. In Seminar on Stochastic Processes, 1992, volume 33 of Progress in Probability (pp. 209–220). Boston: Birkhäuser. [455, 457, 495]
Marcus, M. B. and Rosen, J. (1994a). Laws of the iterated logarithm for the local times of symmetric Lévy processes and recurrent random walks. Ann. Probab., 22, 626–658. [527]
Marcus, M. B. and Rosen, J. (1994b). Laws of the iterated logarithm for the local times of recurrent random walks on Z² and of Lévy processes and recurrent random walks in the domain of attraction of Cauchy random variables. Ann. Inst. H. Poincaré Prob. Stat., 30, 467–499. [527]
Marcus, M. B. and Rosen, J. (1999). Renormalized Self-intersection Local Times and Wick Power Chaos Processes, volume 675. Providence: Memoirs of the AMS. [455]
Marcus, M. B. and Rosen, J. (2001). Gaussian processes and the local times of symmetric Lévy processes. In Lévy Processes–Theory and Applications, volume 1 of Math (pp. 67–89). Boston: Birkhäuser. [395]
Marcus, M. B. and Rosen, J. (2003). New perspectives on Ray's theorem for the local times of diffusions. Ann. Probab., 31, 882–913. [550]
Marcus, M. B. and Shepp, L. A. (1972). Sample behavior of Gaussian processes. In Proceedings of the Sixth Berkeley Symposium Math. Statist. Prob., volume 2 (pp. 423–441). Berkeley: University of California Press. [242, 361]
McKean, H. P. (1962). A Hölder condition for Brownian local time. J. Math. Kyoto Univ., 1, 195–201. [120]
McShane, E. J. and Botts, T. A. (1959). Real Analysis. Princeton, NJ: Van Nostrand. [210]
Meyer, P.-A. (1966). Sur les lois de certaines fonctionnelles multiplicatives. Publ. Inst. Statist. Univ. Paris, 15, 295–310. [120]
Millar, P. W. and Tran, L. T. (1974). Unbounded local times. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 30, 87–92. [120]
Molchan, G. (1999). Maximum of fractional Brownian motion: Probabilities of small values. Comm. Math. Phys., 205, 97–111. [5, 527]
Nisio, M. (1967). On the extreme values of Gaussian processes. Osaka J. Math., 4, 313–326. [361]
Perkins, E. (1982). Local time is a semimartingale. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 60, 79–117. [59, 496]
Pitman, E. J. G. (1968). On the behavior of the characteristic function of a probability distribution in the neighbourhood of the origin. J. Australian Math. Soc. Series A, 8, 422–443. [594]
Port, S. C. and Stone, C. J. (1978). Brownian Motion and Classical Potential Theory. New York: Academic Press. [61]
Preston, C. (1971). Banach spaces arising from some integral inequalities. Indiana Math. J., 20, 997–1015. [280]
Preston, C. (1972). Continuity properties of some Gaussian processes. Ann. Math. Statist., 43, 285–292. [280]
Ray, D. (1963). Sojourn times of a diffusion process. Ill. J. Math., 7, 615–630. [5, 49, 58, 59, 61, 120, 144, 188, 537, 550]
Reuter, G. E. H. (1969). Remarks on a Markov chain example of Kolmogorov. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 13, 315–320. [455]
Revuz, D. and Yor, M. (1991). Continuous Martingales and Brownian Motion. New York: Springer-Verlag. [56, 61, 123, 125, 173, 581]
Rogers, L. C. G. and Williams, D. (2000a). Diffusions, Markov Processes, and Martingales. Volume Two: Itô Calculus. Cambridge: Cambridge University Press. [144, 175]
Rogers, L. C. G. and Williams, D. (2000b). Diffusions, Markov Processes, and Martingales. Volume One: Foundations. Cambridge: Cambridge University Press. [61, 86, 101, 119, 124, 150, 366]
Rosen, J. (1986). Tanaka's formula for multiple intersections of planar Brownian motion. Stochastic Process. Appl., 23, 131–141. [61]
Rosen, J. (1991). Second order limit laws for the local times of stable processes. In Séminaire de Probabilités XXV, volume 1485 of Lecture Notes Math (pp. 407–424). Berlin: Springer-Verlag. [496]
Rosen, J. (1993). p-variation of the local times of stable processes and intersection local time. In Seminar on Stochastic Processes, 1991, volume 33 of Progress in Probability (pp. 157–168). Boston: Birkhäuser. [496]
Royden, H. L. (1988). Real Analysis (third ed.). New York: Macmillan. [590]
Sato, K. (1999). Lévy Processes and Infinitely Divisible Distributions. Cambridge: Cambridge University Press. [212]
Sharpe, M. (1988). General Theory of Markov Processes. New York: Academic Press. [119]
Sheppard, P. (1985). On the Ray–Knight property of local times. J. London Math. Soc., 31, 377–384. [61, 550]
Sirao, T. and Watanabe, H. (1970). On the upper and lower class for stationary Gaussian processes. Trans. Amer. Math. Soc., 147, 301–331. [361]
Slepian, D. (1962). The one-sided barrier problem for Gaussian noise. Bell System Tech. J., 41, 463–501. [242]
Stroock, D. W. (1993). Probability Theory: An Analytic View. Cambridge: Cambridge University Press. [188]
Sudakov, V. N. (1973). A remark on the criterion of continuity of Gaussian sample functions. In Proceedings Second Japan-USSR Symposium on Probability Theory, Kyoto, 1972, volume 330 of Lecture Notes in Math. (pp. 444–454). Berlin: Springer-Verlag. [242]
Sudakov, V. N. and Tsirelson, B. S. (1978). Extremal properties of half-spaces for spherically invariant measures. J. Soviet Math., 9, 9–18. Translated from Zap. Nauch. Sem. L.O.M.I., 41, 14–24, 1974. [241]
Talagrand, M. (1987). Continuity of Gaussian processes. Acta Math., 159, 99–149. [280]
Talagrand, M. (1992). A simple proof of the majorizing measure theorem. Geom. Funct. Anal., 2, 118–125. [242, 280]
Talagrand, M. (2005). The Generic Chaining. Berlin: Springer-Verlag. [281]
Taylor, S. J. (1972). Exact asymptotic estimates of Brownian path variation. Duke Math. J., 39, 219–242. [5, 456, 495]
Trotter, H. (1958). A property of Brownian motion paths. Ill. J. Math., 2, 425–433. [61, 120]
van der Hofstad, R., den Hollander, F., and König, W. (1997). Central limit theorem for the Edwards model. Ann. Probab., 25, 573–597. [61]
Walsh, J. (1978). A diffusion with a discontinuous local time. Astérisque, 52, 37–46. [455]
Walsh, J. (1983). Stochastic integration with respect to local time. In Seminar on Stochastic Processes, 1982, volume 16 of Progress in Probability (pp. 237–302). Boston: Birkhäuser. [496, 550]
Williams, D. (1974). Path decomposition and continuity of local time for one-dimensional diffusion, I. Proc. London Math. Soc., 28, 738–768. [61, 550]
Wittman, R. (1986). Natural densities for Markov transition probabilities. Prob. Theory Related Fields, 73, 1–10. [78]
Wojtaszczyk, P. (1991). Banach Spaces for Analysts. Cambridge: Cambridge University Press. [591]
Index of notation
(Ω, F, P ), 7 B, 225 B(t, u), Bd (t, u), 244 BESQ, 368 BESQδ (x), 581 C(S), Cb (S), Cb+ (S), Cκ (S), C0 (S), C0∞ (S), 8 C0 (S∆ ), 79 Cp , 141 D(u), D(T, d, u), 244 d(s, t), dX (s, t), 8, 194 d(x, A), 24 dX , 243 EY , 7 Eλ , 43 f , f,y , 32 h, 109 K ⊕ K, ⊕n K, 270 K , 8 kL , 110 Lp , 7 Lyt , 32 L,y t , 32 M (u), M (T, d, u), 244 m, 189 m(t), 193 N (u), N (T, d, u), 244 N (µ, σ 2 ), 192 P x , 15 P x,0 , 113 611
P x/h , 110 R1 , R+ , R, Rn , 6 S∆ , 63 TA , 24, 70 U α , 16 UAα , 90 uα , 16 uα A , 90 uT , 42 Wf , 209 y ↑↑ x, 8 #S, 233 ∼, ≈, 8 ¯ yt , 98 L ¯ 215 A, ◦θt , 21 cov, 201 det Σ, 191 med, 24 p , np , 8 G, 23, 326 Gt , 21 γ, γn , 193 L, 93 M, 29, 66 F, Ft , 7 F 0 , 15 F 0 , Ft0 , 19 F h , Fth , 112 GT ,GT + , 23
612 Gt , 23 Gt+ , 23 H, 164 M, 29 M+ , M, 164 N , 561 N , IN, 6 σ(u), 270 Φ, φ, 192 Ψ, 404 φ ◦ Φ−1 , 193 ψ(λ), 135 Σ, 189 Σ(s, t), 190, 193 S, 208 σ, 270 σ( · ), 7
Index of notation σ ∗ , σ∗ , 274 2 , 307 σ 2 , σG τ (s, u),τ (s, A), 8 τA , 43 τz , 39 θt , 21 U, 193 f, 9 t , 82 X G, Gt , 82 82 Ω, θt , 82 f, 165 Px , 82 Z, 6 ζ, 72 B(Ω), Bb (Ω), 6
Author index
Adler, R. J., 241 Ahlfors, L., 142
Donsker, M., 527 Doob, J. L., 101, 237 Dudley, R. M., 4, 5, 215, 238, 241, 244, 245, 280, 456, 495 Dynkin, E. B., 394
Babkov, S., 242 Bakry, D., 242 Bapat, R. B., 6, 579 Barlow, M., 2, 120, 455 Bass, R. F., 5, 59, 526 Ba˜ nuelos, R., 527 Belyaev, Yu. K., 213, 241 Berman, N., 579 Bertoin, J., 142, 188, 527 Biane, P., 61, 550 Billingsley, P., 592 Bingham, N., 304, 438, 594 Blackburn, R., 527 Blumenthal, R., 61, 119, 120 Boas, R., 349 Borell, C., 214, 241 Botts, T. A., 210 Boylan, E., 120 Brydges, D., 1
Ehrhard, A., 241 Eisenbaum, N., 1, 2, 5, 6, 60, 61, 394, 527, 534, 550, 579 Erd¨ os, P., 361 Feller, W., 91, 141, 166, 382, 565, 566, 594, 596 Fernique, X., 4, 238, 241, 242, 244, 251, 280, 361 Fitzsimmons, P., 61, 120, 188 Fr¨ ohlich, J., 1 Fukushima, M. , 119 Garcia, A. M., 280, 361 Getoor, R., 61, 119, 120, 188 Goldie, C., 304, 438, 594 Griffin, P., 59, 526 Griffiths, R. C., 6, 579
Caballero, R., 527 Chung, K. L., 61, 119, 361, 455, 592
Hawkes, J., 2, 120, 455 Horn, R. A., 579 Hu, Y., 60
de la Vega, F., 495 Dellacherie, C., 73, 119, 180, 188 den Hollander, F., 60 Donoghue, W., 236
Ibragamov, I., 142 Ito, K., 58, 213, 241, 593 613
614
Author index
Jain, N., 213, 241, 242, 280, 593 Johnson, C. R., 579 Kac, M., 61 Kahane, J.-P., 229 Kallenberg, O., 188 Kallianpur, G., 213, 241 Karamata, J., 594 Kaspi, H., 2, 6, 61, 395, 579 Katzenelson, Y., 336 Kawada, T, 495 Kesten, H., 120 Khoshnevisan, D., 188, 527 Knight, F., 49, 58, 61 Kolmogorov, A. N., 243, 441, 455 Konig, W., 60 Kono, N., 361, 495 Lacey, M., 527 Landau, H. J., 242 Le Jan, Y., 119 Ledoux, M., 4, 241, 242, 280, 580, 591 Li, W., 242 Lifshits, M. A., 241 Linnik, Y., 142 L´evy, P., 61, 456 Maisonneuve, B., 119 Marcus, M. B., 1, 2, 5, 61, 120, 152, 188, 241, 242, 280, 361, 394, 454, 455, 457, 495, 527, 550, 579, 593 McKean, H., 58, 120 McShane, E. J., 210 Meyer, P.-A., 73, 119, 120, 180, 188 Millar, P. W., 120 Molchan, G., 5, 527 Nisio, M., 213, 241, 361, 593 Oshima, Y., 119
Perkins, E., 59, 496 Pisier, G., 241, 280, 281 Pitman, E. J. G., 594 Pitman, J., 61, 120 Plemmons, R. J., 579 Port, S. C., 61 Preston, C., 280 Ray, D., 5, 49, 58, 59, 61, 120, 144, 188, 537 Reuter, G. E. H., 455 Revuz, D., 56, 61, 123, 125, 173, 581 Rodemich, E., 280, 361 Rogers, L. C. G., 61, 86, 101, 119, 124, 144, 150, 175, 366 Rosen, J., 1, 2, 5, 61, 120, 152, 188, 361, 394, 454, 455, 457, 495, 496, 527, 550, 579 Royden, H. L., 590 Rumsey Jr., H., 280, 361 Sato, K, 212 Shao, Q., 242 Sharpe, M., 119 Shepp, L. A., 242, 361 Sheppard, P., 61, 550 Shi, Z., 2, 5, 60, 61, 395, 527 Sirao, T., 361 Slepian, D., 242 Smits, R. G., 527 Spencer, T., 1 Stone, C. J., 61 Stroock, D., 188 Sudakov, V. N., 214, 233, 241 Takeda, M., 119 Talagrand, M., 4, 234, 241, 242, 244, 263, 280, 580, 591 Taylor, S. J., 5, 456, 495 Teugels, J., 304, 438, 594 Tran, L., 120
Author index Trotter, H., 61, 120 Tsirelson, B. S., 214, 241 van der Hofstad, R. , 60 Varadhan, S. R. S., 527 Walsh, J., 455, 496, 550 Watanabe, H., 361
615
Williams, D., 61, 86, 101, 119, 124, 144, 150, 175, 366, 550 Wittman, R., 78 Wojtaszczyk, P., 591 Yor, M., 56, 61, 123, 125, 173, 550, 581
Subject index
Abelian group, 235 analytic set, 583 approximate δ-function, 8 approximate identity, 8 associated Gaussian process, 76, 194, 324, 396, 398, 551 associated process, 76, 194, 398, 551 family of, 396 asymptotic functions, 8 augmented filtration, 67
canonical version, 15 killed, 157 local time of, 31 planar, 534 quadratic variation of, 457 two dimensional, 534 CAF, see continuous additive functional Cameron–Martin Formula, 516 canonical Gaussian measure, 193, 214 canonical stable process, 141 Category (1), 326 cemetery state, 63 Chapman–Kolmogorov equation, 12, 13, 77 characteristic function, 189 class G, 326 class S, 208 comparable functions, 8 concentration inequality, 224 continuous additive functional, 32, 43, 83 potential of, 90 potential operator of, 90 contraction property, 125 contraction resolvent, 125, 126, 128 strongly continuous, 587
Banach–Mazur Theorem, 591 Belyaev dichotomy, 213 Bessel process, 158, 581 squared, 581 Blumenthal Zero–One Law, 26, 70 Boas’s Lemma, 597 Bochner integral, 587 Bochner’s Theorem, 236 Borel right process, 67 local, 179, 551 Borel semigroup, 63 Borell, Sudakov–Tsirelson Theorem, 214 bounded discontinuity, 208 branch point, 170 Brownian motion, 11, 13, 276 most visited sites of, 510 616
Subject index contraction semigroup strongly continuous, 122, 588 counting measure, 184 covariance kernel, 203 cylinder sets, 20, 398 death time, 72 diameter of a metric space, 244 diffusion, 144, 530 distinguishable set, 233 Doob’s Lp inequality, 86, 101 Dudley’s metric entropy condition, 238, 245, 249, 251 dyadic numbers, 580 Dynkin Isomorphism Theorem, 364, 380 Edwards model of polymers, 60 Eisenbaum Isomorphism Theorem, 362, 383, 454 excessive function, 182 excessive regularization, 165 exponential random variable, 202 favorite point, see most visited sites 497 Feller process, 121 Feller semigroup, 121 Fernique’s Lemma, 238 filtered, 7 filtration, 6 first hitting time, 24, 70, 151 Fourier transform, 9 fractional Brownian motion, 276, 497 scaling property, 498 Gaussian lacunary series, 336 Gaussian Markov process, 195 Gaussian metric, 233 Gaussian process, 193 p-variation of, 457
617
associated, 76, 194, 270, 398, 551 Banach space–valued, 225 boundedness, 255, 263 class G, 326 continuity, 255, 267 covariance kernel of, 193 mean function of, 193 metric for, 233 modulus of continuity for, 258, 283 periodic, 280 with infinitely divisible squares, 561 Gaussian random variable, 189 characteristic function of, 189 covariance matrix of, 189 mean vector of, 189 with infinitely divisible squares, 560 Gaussian sequence, 193 Generalized Second Ray–Knight Theorem, 455 generator, 556 group, 235 h-transform, 109, 112 h-transform process, 112 Hille–Yosida Theorem, 587 hitting time first, 24, 70, 151 measurability of, 70 Hunt process, 120, 152 Hunt’s Theorem, 127 increments variance, 307, 317 infinitely divisible process, 135 infinitely divisible squares, 560 inverse local time, 39, 97 of a symmetric stable process, 502
618
Subject index
inverse time, 105 irreducible matrix, 600 isomorphism theorems, 49 isoperimetric inequality, 214 Kac’s Moment Formula, 44, 49, 74, 116, 117 Karhunen–Lo´eve expansion, 206 Khintchine’s law of the iterated logarithm, 15, 431 killed process, 81, 152 killing operator, 110 Kolmogorov Construction Theorem, 14, 122, 173 Kolmogorov’s Theorem for path continuity, 580 lacunary series, 336 Laplace transform, 565 law of the iterated logarithm, 15, 424, 470 left continuous inverse, 270 left-hand limit, 8, 147 local time, 62, 83 continuity of, 410 Brownian, 31 inverse, 39, 93 joint continuity, 98 jointly continuous, 32, 58, 99 modulus of continuity for, 59 moment generating function of, 115 normalization of, 92 of a Markov process, 98 p-variation of, 479 quadratic variation of, 59 stochastic integral representation, 56 total accumulated, 56, 99 local uniform continuity, 9 local uniform convergence, 9
locally homogeneous, 421 locally homogeneous metric space, 359 Lusin’s Theorem, 585 L´evy exponent, 135 L´evy measure, 330 L´evy process, 135, 212 without a Gaussian component, 139 L´evy’s inequality, 591 L´evy’s uniform modulus of continuity, 15 L´evy–Ito–Nisio Theorem, 591 L´evy–Khintchine Theorem, 136 M –matrix, 561, 601 majorant, 274, 297 majorizing measure, 256 Markov kernel, 62 Markov process, 62 Gaussian, 195 right continuous simple, 64 simple, 63 standard, 151 Markov property, 19, 62 local strong, 182 simple, 20, 21, 64 strong, 24, 67 Markov semigroup, 63 measurable space, 6 median, 7 mesh, 456 metric entropy, 244, 271 minorant, 274 modulus of continuity, 15 exact local, 282 exact uniform, 282 for a Gaussian process, 258, 283 m-, 292 moment generating function, 115
Subject index Monotone Density Theorem, 595 most visited sites, 59, 497, 510, 525 negative, 6 negative off-diagonal elements, 552 non-decreasing rearrangement, 271 normal random variable, 192 normalized regularly varying function, 304 occupation density formula, 99 occupation measure, 33 operator norm, 457 optional σ-algebra, 174 optional process, 175 optional set, 175 Ornstein–Uhlenbeck process, 57 oscillation function, 209 packing number, 244 Paley–Zygmund Lemma, 597 Parseval’s Theorem, 10 partition, 456, 458 pathwise connected space, 567 permutation matrix, 600 positive, 6 positive definite, 76, 78 function, 190 matrix, 190 positive matrix, 552, 561 positive maximum principle, 127 positive operator, 122, 127 positive row sums, 552 potential, 183 of a continuous additive functional, 90 potential density, 17, 77, 105, 125, 530 potential operator, 16, 65, 125 of a continuous additive functional, 90
619
probability space, 7 Projection Theorem, 583 pseudo-metric, 8 pure jump process, 139 p-variation, 457 quadratic variation, 59, 457 quasi left continuity, 149 random Fourier series, 238 random variable, 7 random walk, 60 rapidly decreasing, 513 rapidly decreasing function, 545 Ray process, 172 Ray semigroup, 164, 165 Ray’s Theorem Eisenbaum’s version, 534 for Brownian motion, 56, 543 for diffusions, 532 original version, 538 Ray–Knight compactification, 188 Ray–Knight Theorem applications of, 58 first, 48, 366 second, 53, 372, 378, 386 recurrence of elements, 93 of processes, 95 recurrent process, 17 reducible matrix, 600 reference measure, 73 reflection principle, 27, 158 regular symmetric transition densities, 77 regular transition densities, 76 regularly varying function, 304, 594 reproducing kernel Hilbert space, 204 of fractional Brownian motion, 511
620
Subject index
resolvent equation, 66, 73, 77 scaling property of fractional Brownian motion, 498 semigroup property, 16 separability, 9 shift operator, 21 signature matrix, 561 Slepian’s Lemma, 227, 295 slowly varying function, 304, 594 smoothly varying function, 438 Souslin scheme, 583 spectral density, 237, 238 spectral distribution, 237, 238, 276 speed measure, 144 squared Bessel process, 581 stable mixture, 427, 437 stable process, 141 canonical, 42, 141 standard L2 metric, 233 standard augmentation, 28, 69 standard normal sequence, 192 state space, 63 stationary increments, process with, 235 stationary process, 235 stochastic continuity, 9 stochastic process, 7 Stone–Weierstrass Theorems, 590 stopping time, 23, 93 measurability of, 151 strictly negative, 6 strictly positive, 6 strictly positive definite matrix, 190, 598
strong continuity, 176 strongly symmetric process, 75 sub-Markov kernel, 62 sub-Markov semigroup, 63 supermedian function, 164, 182 surface measure, 214 symmetric operator, 130 symmetric process, 75 symmetric stable process inverse local time, 502 Talagrand’s Theorem, 263, 267 terminal time, 42, 105 tight family of measures, 591 transience of elements, 93 of processes, 95 transition operator, 16 transition probability density function, 16 transition semigroup, 63, 64 type A, 307 u-distinguishable, 233 unbounded discontinuity, 208 uniform norm, 121 usual augmentation, see standard augmentation vector lattice, 590 version, 98 weakly diagonally dominant, 556 zero row sums, 556