Lecture Notes in Mathematics 1979
Editors: J.-M. Morel, Cachan; F. Takens, Groningen; B. Teissier, Paris
Catherine Donati-Martin · Michel Émery · Alain Rouault · Christophe Stricker (Eds.)
Séminaire de Probabilités XLII
Editors

Catherine Donati-Martin
Laboratoire de Probabilités et Modèles Aléatoires, Université Paris VI, Boîte Courrier 188, 4 place Jussieu, 75252 Paris Cedex 05, France
[email protected]

Michel Émery
IRMA, Université de Strasbourg, 7 rue René Descartes, 67084 Strasbourg Cedex, France
[email protected]

Alain Rouault
Laboratoire de Mathématiques, Université de Versailles, 45 av. des États-Unis, 78035 Versailles Cedex, France
[email protected]

Christophe Stricker
Laboratoire de Mathématiques, Université de Franche-Comté, 16 route de Gray, 25030 Besançon Cedex, France
[email protected]
ISBN: 978-3-642-01762-9
e-ISBN: 978-3-642-01763-6
DOI: 10.1007/978-3-642-01763-6
Lecture Notes in Mathematics ISSN print edition: 0075-8434; ISSN electronic edition: 1617-9692
Library of Congress Control Number: 2009286035
Mathematics Subject Classification (2000): 60Gxx, 60Hxx, 60Kxx, 60J80, 81S25, 11M41, 39B72, 47D07, 93-02

© Springer-Verlag Berlin Heidelberg 2009

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover drawing by Anthony Phan. Cover design: SPi Publisher Services. Printed on acid-free paper.

springer.com
Photo by George Bergman.
Marc Yor, one of the most prominent members of the French probabilistic school, is turning 60. For the last 33 years, he has contributed to the Séminaire by his own articles and his counselling of other authors; his methods and style permeate the volumes of this series. He was a tireless member of the Rédaction during a quarter of a century, from Volume XIV to XXXIX, careful to maintain the highest quality, from broad mathematical ideas to the minutest details. We wish we were able to keep up with the high standards he has set! Since Volume XL, he is no longer an official rédacteur, but he keeps helping us with the editorial work and the refereeing.

Marc, nous te souhaitons un joyeux anniversaire, et nous sommes heureux de te dédier ce volume.

Catherine Donati-Martin, Michel Émery, Alain Rouault, Christophe Stricker
Preface
Nine volumes ago, in Séminaire de Probabilités XXXIII, a series of advanced courses was started; nine such courses have appeared since. Two of them are due to Antoine Lejay, including his Introduction to rough paths in volume XXXVII. This unrepentant recidivist now strikes again, with Yet another introduction to rough paths, which sheds a more algebraic light on the same matter.

The various contributions which constitute the rest of the volume exemplify the rôle the Séminaire intends to play on the probabilistic stage: junior authors go side by side with older contributors, with a predominance of French or francophile ones; short notes mix with real research articles; and the themes are well in the traditional spirit of the Séminaire, ranging over the broad spectrum of interest of its readership.

Catherine Donati-Martin, Michel Émery, Alain Rouault, Christophe Stricker
Contents
Yet Another Introduction to Rough Paths
Antoine Lejay ..... 1

Monotonicity of the Extremal Functions for One-dimensional Inequalities of Logarithmic Sobolev Type
Laurent Miclo ..... 103

Non-monotone Convergence in the Quadratic Wasserstein Distance
Walter Schachermayer, Uwe Schmock, and Josef Teichmann ..... 131

On the Equation μ = S_t μ ∗ μ_t
Fangjun Xu ..... 137

Shabat Polynomials and Harmonic Measure
Philippe Biane ..... 147

Radial Dunkl Processes Associated with Dihedral Systems
Nizar Demni ..... 153

Matrix Valued Brownian Motion and a Paper by Pólya
Philippe Biane ..... 171

On the Laws of First Hitting Times of Points for One-dimensional Symmetric Stable Lévy Processes
Kouji Yano, Yuko Yano, and Marc Yor ..... 187

Lévy Systems and Time Changes
P.J. Fitzsimmons and R.K. Getoor ..... 229

Self-Similar Branching Markov Chains
Nathalie Krell ..... 261
A Spine Approach to Branching Diffusions with Applications to L^p-convergence of Martingales
Robert Hardy and Simon C. Harris ..... 281

Penalisation of the Standard Random Walk by a Function of the One-sided Maximum, of the Local Time, or of the Duration of the Excursions
Pierre Debs ..... 331

Canonical Representation for Gaussian Processes
M. Erraoui and E.H. Essaky ..... 365

Recognising Whether a Filtration is Brownian: a Case Study
Michel Émery ..... 383

Markovian Properties of the Spin-Boson Model
Ameur Dhahri ..... 397

Statistical Properties of Pauli Matrices Going Through Noisy Channels
Stéphane Attal and Nadine Guillotin-Plantard ..... 433

Erratum to: "New Methods in the Arbitrage Theory of Financial Markets with Transaction Costs", in Séminaire XLI
Miklós Rásonyi ..... 449
List of Contributors
Stéphane Attal
Université Lyon 1, Institut Camille Jordan, 43 bld du 11 novembre 1918, 69622 Villeurbanne Cedex, France
[email protected]

Philippe Biane
CNRS, Laboratoire d'Informatique, Institut Gaspard Monge, Université Paris-Est, 5 bd Descartes, Champs-sur-Marne, 77454 Marne-la-Vallée Cedex 2, France
[email protected]

Pierre Debs
Institut Élie Cartan Nancy, B.P. 239, 54506 Vandœuvre-lès-Nancy Cedex, France
[email protected]

Nizar Demni
Fakultät für Mathematik, Universität Bielefeld, Postfach 100131, Bielefeld, Germany
[email protected]

Ameur Dhahri
Ceremade, UMR CNRS 7534, Université Paris Dauphine, Place de Lattre de Tassigny, 75775 Paris Cedex 16, France
[email protected]

Michel Émery
IRMA, Université de Strasbourg et C.N.R.S., 7 rue René Descartes, 67084 Strasbourg Cedex, France
[email protected]

M. Erraoui
Université Cadi Ayyad, Faculté des Sciences Semlalia, Département de Mathématiques, B.P. 2390, Marrakech, Maroc
[email protected]

E.H. Essaky
Université Cadi Ayyad, Faculté Poly-disciplinaire, Département de Mathématiques et d'Informatique, B.P. 4162, Safi, Maroc
[email protected]
P.J. Fitzsimmons Department of Mathematics 0112; University of California San Diego, 9500 Gilman Drive La Jolla, CA 92093–0112 USA
[email protected]
Laurent Miclo
Laboratoire d'Analyse, Topologie, Probabilités, UMR 6632 CNRS, 39 rue F. Joliot-Curie, 13453 Marseille Cedex 13, France
[email protected]
R.K. Getoor
Miklós Rásonyi
Computer and Automation Institute of the Hungarian Academy of Sciences
[email protected]
Nadine Guillotin-Plantard
Université Lyon 1, Institut Camille Jordan, 43 bld du 11 novembre 1918, 69622 Villeurbanne Cedex, France
[email protected] Robert Hardy Department of Mathematical Sciences, University of Bath Claverton Down, Bath BA2 7AY UK
[email protected] Simon C. Harris Department of Mathematical Sciences, University of Bath Claverton Down, Bath BA2 7AY UK
[email protected]

Nathalie Krell
Laboratoire de Probabilités et Modèles Aléatoires, Université Paris 6, 175 rue du Chevaleret, 75013 Paris, France
[email protected]

Antoine Lejay
Équipe-Projet TOSCA, Institut Élie Cartan (Nancy-Université, CNRS, INRIA), Campus scientifique, BP 239, 54506 Vandœuvre-lès-Nancy Cedex, France
[email protected]
Walter Schachermayer Vienna University of Technology Wiedner Hauptstrasse 8–10 1040 Vienna, Austria
[email protected] Uwe Schmock Vienna University of Technology Wiedner Hauptstrasse 8–10 1040 Vienna, Austria
[email protected] Josef Teichmann Vienna University of Technology Wiedner Hauptstrasse 8–10 1040 Vienna, Austria
[email protected] Fangjun Xu Department of Mathematics University of Connecticut 196 Auditorium Road Unit 3009, Storrs CT 06269-3009, USA
[email protected] Kouji Yano Department of Mathematics Graduate School of Science Kobe University, Kobe, Japan
[email protected]
Yuko Yano
Research Institute for Mathematical Sciences, Kyoto University, Kyoto, Japan

Marc Yor
Laboratoire de Probabilités et Modèles Aléatoires, Université Paris VI, Paris, France; Institut Universitaire de France; and Research Institute for Mathematical Sciences, Kyoto University, Kyoto, Japan
Yet Another Introduction to Rough Paths

Antoine Lejay
Équipe-Projet TOSCA, Institut Élie Cartan (Nancy-Université, CNRS, INRIA), Campus scientifique, BP 239, 54506 Vandœuvre-lès-Nancy Cedex, France
e-mail: [email protected]
Summary. This specialized course provides another point of view on the theory of rough paths, starting with simple considerations on ordinary integrals, and stressing the importance of the Green-Riemann formula, as in the work of D. Feyel and A. de La Pradelle. This point of view allows us to gently introduce the required algebraic structures and provides alternative ways to understand why the construction of T. Lyons et al. is a natural generalization of the notion of integral of differential forms, in the sense that it shares the same properties as integrals along smooth paths, when we use the “right notion” of a path.
Key words: Rough paths; integral of differential forms along irregular paths; controlled differential equations; Lie algebra; Lie group; Chen series; sub-Riemannian geometry
C. Donati-Martin et al. (eds.), Séminaire de Probabilités XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6_1, © Springer-Verlag Berlin Heidelberg 2009

1 Introduction

The theory of rough paths [42, 44, 52, 55] is now an active field of research, especially among the probabilistic community. Although this theory is motivated by stochastic analysis, it takes its roots in analysis and control theory, and is also connected to differential geometry and algebra.

Given a path x of finite p-variation with p ≥ 2 on [0, T] with values in R^d, or an α-Hölder continuous path with α ≤ 1/2, this theory allows us to define the integral ∫_x f of a differential form f along x, which is ∫_x f = ∫_0^T f(x_s) dx_s. Using a fixed point theorem, it is then possible to solve differential equations driven by x of the type

    y_t = y_0 + ∫_0^t g(y_s) dx_s.

The case 1 ≤ p < 2 (or α > 1/2) is covered by the Young integrals introduced by L.C. Young in [73]. Some of the most common stochastic processes, including Brownian motion, have trajectories that are of finite p-variation with
p > 2. So, being able to define almost surely an integral along such irregular paths is of great practical interest, both theoretically and numerically. Yet we know this is not possible in general, and integrals of Itô and Stratonovich type are defined only as limits in probability of Riemann sums.

Introduced in the 50's by K.-T. Chen (see for example [11]), the notion of iterated integrals provides an algebraic tool to deal with a geometrical object which is a smooth path, and allows us to manipulate controlled differential equations using formal computations (see for example [23, 39]). The main feature of the rough paths theory is then to assert that, if it is possible to consider not only a path x but a path 𝐱 which encodes the iterated integrals (which cannot be canonically defined if x is of finite p-variation with p ≥ 2), then one may properly define the integral z_t = ∫_0^t f(x_s) d𝐱_s and solve the differential equation y_t = y_0 + ∫_0^t g(y_s) d𝐱_s, provided that f and g are smooth enough. In addition, the maps 𝐱 ↦ z and 𝐱 ↦ y are continuous with respect to the topology induced by the p-variation distance. The dimension of the path 𝐱, or equivalently the number of "iterated integrals" to be considered, depends on the regularity of x.

For p ∈ [2, 3) (or α ∈ (1/3, 1/2]), one only has to consider the iterated integrals of x along itself. This can be justified by the first-order Taylor expansion of ∫_s^t f(x_r) dx_r:

    Σ_{i=1}^d ∫_s^t f_i(x_r) dx^i_r ≈ Σ_{i=1}^d f_i(x_s)(x^i_t − x^i_s) + Σ_{i,j=1}^d (∂f_i/∂x_j)(x_s) ∫_s^t (x^j_r − x^j_s) dx^i_r.

If x is α-Hölder continuous with α ∈ (1/3, 1/2] and one has succeeded in constructing K^{i,j}_{s,t}(x) = ∫_s^t (x^j_r − x^j_s) dx^i_r, then one can expect that |K^{i,j}_{s,t}(x)| ≤ C|t − s|^{2α}. So, to approximate ∫_0^T f(x_r) dx_r, we will use the sum

    Σ_{k=0}^{n−1} Σ_{i=1}^d f_i(x_{kT/n})(x^i_{(k+1)T/n} − x^i_{kT/n}) + Σ_{k=0}^{n−1} Σ_{i,j=1}^d (∂f_i/∂x_j)(x_{kT/n}) K^{i,j}_{kT/n,(k+1)T/n}(x)

and show its convergence as n → ∞. Hence, the integral will be defined not along a path x, but along 𝐱_{s,t} given by

    𝐱_{s,t} = (1, x^1_t − x^1_s, …, x^d_t − x^d_s, K^{1,1}_{s,t}(x), …, K^{d,d}_{s,t}(x)),

where the first component 1 is here for algebraic reasons. The element 𝐱 can be seen as an element of the truncated tensor space T(R^d) = R ⊕ R^d ⊕ (R^d ⊗ R^d). By analogy with what happens for the power series constructed from the iterated integrals (sometimes called the signature of the path), one has that, for all 0 ≤ s ≤ r ≤ t ≤ T,

    𝐱_{s,t} = 𝐱_{s,r} ⊗ 𝐱_{r,t},
where ⊗ is the tensor product on T(R^d) (in which tensor products of more than 2 terms are killed). In addition, it is possible to consider the formal logarithm of 𝐱, and, following also the properties of the Chen series, we look for paths 𝐱 such that log(𝐱_{s,t}) belongs to A(R^d) = R^d ⊕ [R^d, R^d], where [R^d, R^d] is the space generated by all Lie brackets between two elements of R^d. This algebraic property allows us to give proper definitions of rough paths and geometric rough paths from an algebraic point of view. The articles [44, 55] and the books [42, 52] use this point of view.

As first noted by N. Victoir, since (T_1(R^d), ⊗), the subset of T(R^d) whose elements have a first term equal to 1, is a Lie group, one may describe 𝐱_{s,t} by 𝐱_{s,t} = (𝐱_{0,s})^{−1} ⊗ 𝐱_{0,t}, and then, instead of considering the family (𝐱_{s,t})_{0≤s≤t≤T}, one considers the path t ↦ 𝐱_{0,t}.

In this article, for the sake of simplicity, we deal with paths with values in R², that is, d = 2. In addition, we restrict ourselves to α-Hölder continuous paths, which is not a stringent assumption at all, since a time change allows us to transform any path of finite p-variation into a path which is (1/p)-Hölder continuous.

Given a differential form, we wish to construct a map x ↦ ∫_x f which is continuous on the space C^α of α-Hölder continuous paths. If α > 1/2, the existence of ∫_x f is provided by the theory of Young integrals. We also get that x ↦ ∫_x f is continuous on C^α equipped with the α-Hölder norm. Yet we construct some sequence (x^n)_{n∈N} of functions in C^α that converges to x in C^β with β < 1/2, and such that ∫_0^T f(x^n_s) dx^n_s does not converge to ∫_x f, but to ∫_0^T f(x_s) dx_s + ∫_0^T [f, f](x_s) dϕ_s, where [f, f] is the Lie bracket of f and ϕ is an arbitrary function. This counter-example makes use of the
Green-Riemann formula, and one can see that, if one considers not a path x, but a path (x, ϕ) with values in R³, then one can extend the notion of integral to C^α with α ∈ (1/3, 1/2]. In some sense, the third component records the area enclosed between the path and its chord between times s and t. We can then provide an algebraic setting for describing such paths, still with a non-commutative operation. Then, we construct paths with values in A(R²), a space of dimension 3, where the first two coordinates correspond to an "ordinary" path in the Euclidean vector space R². The non-commutativity comes from the fact that the area enclosed between x · y (the concatenation of two paths x and y) and its chord is different from the area enclosed between y · x and its chord. The degree of freedom we gain comes from the fact that small loops allow us to move in the third direction while staying roughly at the same position in R².

Any α-Hölder continuous path with values in A(R²) (with the right distance) with α > 1/3 may be approximated by smooth paths lifted in A(R²) using their area. In addition, the convergence of paths with values in A(R²) in the α-Hölder topology implies that the corresponding integrals form a Cauchy sequence in C^β for any β < α. It is then possible to extend the notion of Young integrals to α-Hölder continuous functions with values in A(R²), and also to get the continuity result we need.

The basic idea to approximate some α-Hölder continuous path x taking its values in A(R²) with α > 1/3 consists in lifting paths x^n that take the same values as x on the points of a partition of [0, T] and that link two successive times by a loop and a straight line. The loop is a way to "encode the area". One may then be tempted to look for real geodesics. For this, we will interpret the space A(R²) as the subspace of the tangent space at any point of the tensor space T(R²), and we will look for simple curves linking two points in T(R²).
There are several possibilities. One consists in using tools from sub-Riemannian geometry [29, 32]. Another one consists in studying paths with values in a sub-manifold G(R²) of T(R²), which is also a subgroup of (T(R²), ⊗), and which is the Lie group whose Lie algebra may be identified with A(R²). We give another way to define the integral by extending the differential form f to a differential form on G(R²) and constructing curves that connect two points of G(R²). Hence, instead of considering paths with values in A(R²), we will consider paths with values in G(R²), and the difference between two points in A(R²) then corresponds to a direction. With this, we may redefine the integral as the limit of some Riemann sums (which is the original definition given by T. Lyons), but where the addition has been replaced by some tensor product. Moreover, it then becomes possible to extend the notion of integrals to paths living in the bigger space T_1(R²).

Consequently, using the concept of a path living in a non-commutative space, the rough paths theory provides a way to define an integral ∫ f(x_s) dx_s that shares the same properties as ordinary integrals:

(a) It is a limit of expressions similar to Riemann sums.
(b) It is a limit of integrals along approximations of the path obtained by sampling the path at finitely many points and connecting successive sample points by "simple" curves.

In addition, this map x ↦ ∫ f(x_s) dx_s is continuous from C^α([0, T]; T_1(R²)) to C^α([0, T]; T_1(R²)) and may be used to solve differential equations driven by x, still with a continuity property. The theory of rough paths turns out to be the natural extension of integrals on the space of α-Hölder continuous paths with α ∈ (1/3, 1/2], in the same way Young integrals are the natural notion of integral against α-Hölder continuous paths with α ∈ (1/2, 1].

Outline

In Section 2, we introduce our notations and recall some elementary facts about integrals of differential forms along smooth paths, as well as about Hölder continuous paths. In Section 3, we quickly present results about Young integrals, and thus show the properties of integrals along α-Hölder continuous paths with α > 1/2. In Section 4, we assume that one can integrate differential forms along α-Hölder continuous paths with α ∈ (1/3, 1/2], and we show how to transform this integral into a continuous one with respect to the path. In Section 5, we consider paths taking their values in A(R²), and show how to define the integral ∫_x f as a limit of ordinary integrals. In Section 6, we continue our analysis of the space A(R²) and introduce the tensor space T(R²). In Section 7, we give another definition of the integral of f along x, using an expression of Riemann sum type. This construction corresponds to the original one of T. Lyons [42, 52, 55]. In Section 8, we give some related results: the case of the d-dimensional space, Chen series, other constructions for paths with quadratic variation, and the link with stochastic integrals. In Section 9, we solve differential equations.
We end this article with an appendix on the Heisenberg group, and we recall a technical result about almost rough paths, on which the original construction of ∫_x f is based.

Acknowledgement. The author wishes to thank Laure Coutin, Lluís Quer y Sardanyons and Jérémie Unterberger, whose remarks helped to improve this article.
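Before turning to notations, here is a minimal numerical sketch (our own toy example, not from the article; all names below are ours) of the corrected Riemann sum from this introduction. For a smooth path the iterated integrals K^{i,j} can be computed by plain quadrature, and for a linear form f the first-order Taylor expansion is exact, so the enhanced sum recovers the integral up to quadrature error:

```python
import math

T = 1.0
x = lambda t: (math.cos(t), math.sin(t))  # a smooth path in R^2

# hypothetical differential form f = f1 dx^1 + f2 dx^2 with f1(x) = -x^2, f2(x) = x^1;
# since f is linear, the first-order Taylor expansion of the text is exact
f = (lambda p: -p[1], lambda p: p[0])
df = ((0.0, -1.0), (1.0, 0.0))  # df[i][j] = d f_i / d x_j (constant here)

def K(s, t, i, j, m=200):
    """K^{i,j}_{s,t} = int_s^t (x^j_r - x^j_s) dx^i_r, computed by a left-point
    Riemann-Stieltjes sum, which is possible here only because x is smooth."""
    h, acc = (t - s) / m, 0.0
    for k in range(m):
        r0, r1 = s + k * h, s + (k + 1) * h
        acc += (x(r0)[j] - x(s)[j]) * (x(r1)[i] - x(r0)[i])
    return acc

def enhanced_sum(n):
    """First-order Riemann sum plus the second-order correction term."""
    acc, h = 0.0, T / n
    for k in range(n):
        s, t = k * h, (k + 1) * h
        p, q = x(s), x(t)
        acc += sum(f[i](p) * (q[i] - p[i]) for i in range(2))
        acc += sum(df[i][j] * K(s, t, i, j) for i in range(2) for j in range(2))
    return acc

# along x_t = (cos t, sin t), f(x_t) . x'(t) = sin^2 t + cos^2 t = 1,
# so the integral over [0, 1] equals 1; enhanced_sum(n) is close even for small n
```

Even a coarse partition (n = 16) already reproduces the exact value closely, because the second-level terms K^{i,j} absorb the first-order error of the plain Riemann sum.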
2 Notations

2.1 Differential Forms

Let f_1, …, f_d be some functions from R^d to R^m. Consider the differential form

    f(x) = f_1(x) dx^1 + ··· + f_d(x) dx^d

on R^d.
Definition 1. For γ > 0, f is said to be γ-Lipschitz if the f_i's, i = 1, …, d, are of class C^{⌊γ⌋}(R^d; R^m) with bounded derivatives up to order ⌊γ⌋, and the derivatives of order ⌊γ⌋ of the f_i's are (γ − ⌊γ⌋)-Hölder continuous with a (γ − ⌊γ⌋)-Hölder constant H^i_γ(f). The class of γ-Lipschitz differential forms is denoted by Lip(γ; R^d → R^m). For f ∈ Lip(γ; R^d → R^m), define

    ∥f∥_Lip = max_{i=1,…,d} max{∥f_i^{(0)}∥_∞, …, ∥f_i^{(⌊γ⌋)}∥_∞, H^i_γ(f)},

which is a norm on Lip(γ; R^d → R^m).

Remark 1. If γ = 1, this definition is slightly different from the notion of Lipschitz functions, since this definition implies that f is of class C¹(R^d; R^m), while the definition that |f(x) − f(y)|/|x − y| is bounded as x → y for all y ∈ R^d only means that f is almost everywhere differentiable. Anyway, in our context, the case γ ∈ N is never considered.

Given a path x ∈ C¹([0, T]; R^d) and a continuous differential form f, define the integral of f along x by

    ∫_x f = ∫_0^T f(x_s) (dx_t/dt)|_{t=s} ds = Σ_{i=1}^d ∫_0^T f_i(x_s) (dx^i_t/dt)|_{t=s} ds.

Recall a few facts on such integrals, that will be heavily used:

(i) If ϕ : R⁺ → R⁺ is strictly increasing and continuous, then ∫_{x∘ϕ} f = ∫_x f. In other words, the integral of f along x does not depend on the parametrization of x.
(ii) If ϕ : [0, T] → [0, T] is ϕ(t) = T − t, then ∫_{x∘ϕ} f = −∫_x f. In other words, reversing time changes the sign of ∫_x f.
(iii) If x, y ∈ C¹_p([0, T]; R^d) (the class of functions from [0, T] to R^d which are piecewise C¹) and x · y is the concatenation of x and y, then ∫_{x·y} f = ∫_x f + ∫_y f. This is the Chasles relation.
(iv) If x ∈ C¹_p([0, T]; R²) is a closed loop in R², that is, x_T = x_0, then

    ∫_x f = ∬_{Surface(x)} [f, f](x^1, x^2) dx^1 dx^2,    (1)

where Surface(x) is the oriented surface surrounded by x and

    [f, f] = ∂f_2/∂x_1 − ∂f_1/∂x_2.

This is the Green-Riemann/Stokes/Gauss formula.
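Formula (1) is easy to check numerically; here is a small sketch (the differential form and the loop are our own choices, not the article's) comparing the line integral along the unit circle with the surface integral of [f, f] over the unit disk:

```python
import math

# closed loop: the unit circle, parametrised on [0, 2*pi]
def x(t):
    return (math.cos(t), math.sin(t))

# a hypothetical differential form f = f1 dx^1 + f2 dx^2
f1 = lambda p: -p[1]
f2 = lambda p: p[0] * p[1]
# its bracket [f, f] = d f2/d x1 - d f1/d x2 = x2 + 1
bracket = lambda p: p[1] + 1.0

def line_integral(n=4000):
    """Left-point Riemann-Stieltjes sum for the integral of f along the loop."""
    acc, h = 0.0, 2 * math.pi / n
    for k in range(n):
        p, q = x(k * h), x((k + 1) * h)
        acc += f1(p) * (q[0] - p[0]) + f2(p) * (q[1] - p[1])
    return acc

def surface_integral(n=400):
    """Midpoint rule for [f, f] over the unit disk, in polar coordinates."""
    acc, dr, dth = 0.0, 1.0 / n, 2 * math.pi / n
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            acc += bracket((r * math.cos(th), r * math.sin(th))) * r * dr * dth
    return acc

# both sides of the Green-Riemann formula agree (for this f, both equal pi)
```

The agreement of the two quadratures is exactly the content of (1), and it is this identity that the counter-example of Section 3.2 exploits through small loops.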
2.2 Paths of Finite p-Variation

Fix T > 0. Let x be a continuous path from [0, T] to R^d and Π = {t_i}_{i=0,…,k} be a partition of [0, T] with k elements. For p ≥ 1, define

    P(x; Π, p) = Σ_{i=0}^{k−1} |x_{t_{i+1}} − x_{t_i}|^p.

The p-variation of x on [s, t] ⊂ [0, T] is defined by

    Var_{p,[s,t]}(x) = sup_{Π partition of [0, T]} P(x|_{[s,t]}; Π ∩ [s, t], p)^{1/p}.

Definition 2. A function x : [0, T] → R^d is said to be of finite p-variation if Var_{p,[0,T]}(x) is finite.

If x is of finite p-variation, then we easily get

    Var_{q,[0,T]}(x) ≤ 2^{(q−p)/q} ∥x∥_∞^{(q−p)/q} (Var_{p,[0,T]}(x))^{p/q}    (2)

and then x is of finite q-variation for all q > p. Note that Var_{p,[0,T]}(x) defines a semi-norm on the space of functions of finite p-variation, but not a norm, since Var_{p,[0,T]}(x) = 0 only implies that x is constant. In addition, on the space of functions x with x_0 = 0 and Var_{p,[0,T]}(x) < +∞, Var_{p,[0,T]} defines a norm which is however not equivalent to the uniform norm ∥·∥_∞, and counter-examples are easily constructed.

Following a recent remark due to P. Friz [26], we may work with a more precise norm than the norm constructed from the p-variation. Indeed, to simplify our approach, we work only with Hölder continuous paths and the Hölder norm. If x is a path of finite p-variation and ϕ(t) = inf{s > 0 : Var_{p,[0,s]}(x)^p > t}, then ϕ is increasing and x ∘ ϕ is (1/p)-Hölder continuous. As the integral of a differential form keeps the same value under a continuous, increasing time change, there is no difficulty in considering the (1/p)-Hölder norm, which is simpler to use than the p-variation norm (for some results on the relationship between p-variation and (1/p)-Hölder continuity, see for example [9]). Yet for convergence problems, this is not the most general framework, and dealing with the p-variation norm allows us to obtain more complete results (for example, in [45, 46], we only prove convergence in p-variation although the path is α-Hölder continuous, and this is due to a singularity at 0 of some term).

Denote by H_α(x) the Hölder continuity modulus of a path x : [0, T] → R^d which is α-Hölder continuous, that is

    H_α(x) = sup_{0 ≤ s < t ≤ T} |x_t − x_s| / |t − s|^α.
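On a sampled path, P(x; Π, p) and a discrete version of the Hölder quotient are straightforward to evaluate. A sketch with made-up data (all names are ours); the test path x_t = √t is (1/2)-Hölder with constant 1, attained at s = 0:

```python
import math

T, N = 1.0, 500
grid = [T * k / N for k in range(N + 1)]
xs = [math.sqrt(t) for t in grid]  # x_t = sqrt(t): (1/2)-Hoelder with constant 1

def P(xs, p):
    """P(x; Pi, p) = sum of |x_{t_{i+1}} - x_{t_i}|^p over the partition."""
    return sum(abs(b - a) ** p for a, b in zip(xs, xs[1:]))

def holder_quotient(xs, grid, alpha):
    """max over grid pairs s < t of |x_t - x_s| / (t - s)^alpha,
    a lower bound for H_alpha(x) computed on the sampled points only."""
    return max(abs(xs[j] - xs[i]) / (grid[j] - grid[i]) ** alpha
               for i in range(len(xs)) for j in range(i + 1, len(xs)))

# x is increasing, so P(xs, 1) telescopes to x_T - x_0 = 1,
# and the discrete (1/2)-Hoelder quotient equals 1
```

Note that, in agreement with (2), the p-variation sums decrease as p grows, since every increment here is smaller than 1.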
Of course, an α-Hölder continuous path is also β-Hölder continuous for any β ≤ α. In addition, the equivalent of (2) is, for β ≤ α,

    H_β(x) ≤ 2^{1−β/α} ∥x∥_∞^{1−β/α} H_α(x)^{β/α}.    (3)

If H_α(x) = 0 then x is constant, and H_α defines only a semi-norm.

Notation 1. If x : [0, T] → R^d is α-Hölder continuous, then we set ∥x∥_α = |x_0| + H_α(x) and call C^α([0, T]; R^d) the subset of functions x in C([0, T]; R^d) such that ∥x∥_α is finite.

Equipped with ∥·∥_α, this space C^α([0, T]; R^d) is a Banach space. In addition, we get the following lemma, which is a consequence of the Ascoli theorem and (3).

Lemma 1. Let (x^n)_{n∈N} be such that x^n ∈ C^α([0, T]; R^d) and (∥x^n∥_α)_{n∈N} is bounded. Then there exist x in C^α([0, T]; R^d) and a subsequence of (x^n)_{n∈N} that converges to x with respect to ∥·∥_β for each β < α.

Remark 2. It is important to note that here, we used the ∥·∥_β norm for the space C^α([0, T]; R^d) with β < α. When equipped with this norm, (C^α([0, T]; R^d), ∥·∥_β) becomes a separable space, while (C^α([0, T]; R^d), ∥·∥_α) is not separable: see [61] for example.

The next corollary follows easily.

Corollary 1. Let Π be a partition of [0, T] and x^Π be the linear approximation of x ∈ C^α([0, T]; R^d) along Π. Then ∥x^Π∥_α ≤ 3^{1−α} ∥x∥_α. If (Π^n)_{n∈N} is a sequence of partitions of [0, T] whose meshes converge to 0, then (x^{Π^n})_{n∈N} converges to x in (C^α([0, T]; R^d), ∥·∥_β) for all β < α.

Proof. Let Π = {t_i}_{i=1,…,J}. For 0 ≤ s < t ≤ T, let s̄ = min Π ∩ [s, T] and t̄ = max Π ∩ [0, t]. If s ∉ Π (resp. t ∉ Π), denote by s̲ = max Π ∩ [0, s] (resp. t̲ = min Π ∩ [t, T]). As s̄, t̄ ∈ Π, if s, t ∉ Π,

    |x^Π_t − x^Π_s| ≤ |x^Π_t − x^Π_{t̄}| + |x^Π_{t̄} − x^Π_{s̄}| + |x^Π_{s̄} − x^Π_s|
        ≤ ((t − t̄)/(t̲ − t̄)) |x_{t̲} − x_{t̄}| + |x_{t̄} − x_{s̄}| + ((s̄ − s)/(s̄ − s̲)) |x_{s̄} − x_{s̲}|
        ≤ ∥x∥_α (t − t̄)^α + ∥x∥_α (s̄ − s)^α + ∥x∥_α (t̄ − s̄)^α
        ≤ 3^{1−α} ∥x∥_α (t − s)^α,

the last inequality coming from the Jensen inequality applied to x ↦ x^{1/α}. The case where s or t belongs to Π is treated similarly. This proves that ∥x^Π∥_α ≤ 3^{1−α} ∥x∥_α. The second part of this corollary is an immediate consequence of Lemma 1.
Remark 3. One may wonder whether it is possible to approximate a function x ∈ C^α([0, T]; R²) by piecewise linear functions that converge with respect to ∥·∥_α, and not with respect to ∥·∥_β for β < α. As shown in [61] (see also [18, § 4.3]), this is possible only if x belongs to the class of functions such that

    lim_{δ→0} sup_{0 ≤ s < t ≤ T, t−s ≤ δ} |x_t − x_s| / (t − s)^α = 0,

or, in other words, if |x(t+h) − x(t)| = o(h^α). Of course, this class of functions is strictly included in C^α([0, T]; R^d): the function f(x) = Σ_{k=0}^{+∞} c^{−kα} sin(c^k x) for c large enough yields a counter-example, as is easily proved using the results from [12].
3 Integrals Along α-Hölder Continuous Paths, α ∈ (1/2, 1]

For the sake of simplicity, consider d = 2. The construction of I on C^α([0, T]; R²) for α > 1/2 is first deduced from the Young integral.

3.1 Defining the Integrals

We recall here the construction of the integral of a β-Hölder continuous path driven by an α-Hölder continuous path, provided that α + β > 1. This theorem is due to L.C. Young [73] (see also [18] for example).

Theorem 1. Let α, β ∈ (0, 1] with α + β > 1. Then

    (x, y) ↦ (t ↦ ∫_0^t y_s dx_s)

is bilinear and continuous from C^α([0, T]; R) × C^β([0, T]; R) to C^α([0, T]; R).
Proof (Sketch of the proof). Fix n ∈ N* and set, for t^n_k = Tk/2^n,

    J^n = Σ_{k=0}^{2^n−1} y_{t^n_k} (x_{t^n_{k+1}} − x_{t^n_k}).

Then

    |J^{n+1} − J^n| = |Σ_{k=0}^{2^n−1} (y_{t^{n+1}_{2k+1}} − y_{t^{n+1}_{2k}})(x_{t^{n+1}_{2k+2}} − x_{t^{n+1}_{2k+1}})|
        ≤ Σ_{k=0}^{2^n−1} H_β(y) H_α(x) T^{α+β} 2^{−(n+1)(α+β)}
        ≤ 2^{−n(α+β−1)} T^{α+β} H_β(y) H_α(x).
As α + β − 1 > 0, we deduce that the series Σ_{n≥0} (J^{n+1} − J^n) converges, and thus that, if J := J^0 + Σ_{n≥0} (J^{n+1} − J^n), then

    |J − y_0 (x_T − x_0)| ≤ ζ(α + β − 1) T^{α+β} H_β(y) H_α(x),    (4)

where ζ(θ) = Σ_{n≥1} 1/n^θ. Of course, we define ∫_0^T y_s dx_s as J. From the last inequality, in which t is substituted for T and s for 0, this also proves that t ↦ ∫_0^t y_s dx_s is α-Hölder continuous. The other properties of the integral are easily, although tediously, deduced from this construction.

Remark 4. Indeed, using the argument of Lemma 2.2.1, p. 244 of [55], there is no need to consider dyadic partitions, but we keep them for simplicity. Note however that, especially when dealing with stochastic processes, some results in the rough paths theory do depend on the choice of a dyadic partition (see for example [13]).

One may then define, for 0 ≤ s ≤ t ≤ T,

    I(x; s, t) = ∫_{x|[s,t]} f = ∫_s^t f_1(x_r) dx^1_r + ∫_s^t f_2(x_r) dx^2_r    (5)

as Young integrals with y_t = f(x_t). Yet a global regularity condition is imposed on (x, y), which implies in particular that α > 1/2, and the minimal assumptions on the regularity of f also depend on α.

Notation 2. For a path x defined on the time interval [S, T], we will use I(x; s, t) to denote the integral ∫_{x|[s,t]} f when S ≤ s < t ≤ T, and I(x) to denote the function t ∈ [S, T] ↦ I(x; S, t).

The following corollaries follow from the construction of the Young integrals and (4): see in particular [44, 56].

Corollary 2. Fix α ∈ (1/2, 1] and f ∈ Lip(γ; R² → R^m) with γ > 1/α − 1. Then I defined in (5) is well defined as a Young integral on C^α([0, T]; R²) and is a locally Lipschitz map from (C^α([0, T]; R²), ∥·∥_α) to (C^α([0, T]; R^m), ∥·∥_α).

Corollary 3. Fix α ∈ (1/3, 1/2] and let f ∈ Lip(γ; R² → R^m) with γ > 1/α − 1. Then

    C^{2α}([0, T]; R) × C^α([0, T]; R²) → C^{2α}([0, T]; R^m),
    (ϕ, x) ↦ (t ↦ ∫_0^t [f, f](x_s) dϕ_s)

is well defined as a Young integral and is a locally Lipschitz map from (C^{2α}([0, T]; R), ∥·∥_{2α}) × (C^α([0, T]; R²), ∥·∥_α) to (C^{2α}([0, T]; R^m), ∥·∥_{2α}).
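The dyadic scheme J^n from the proof of Theorem 1 can be run directly. A sketch on a smooth toy example (our choice, not the article's), where y_s = x_s, so the limiting Young integral is (x_T² − x_0²)/2:

```python
import math

T = 1.0
x = lambda t: math.sin(t)
y = x  # integrand y_s = x_s, so the limit is (x_T^2 - x_0^2)/2

def J(n):
    """Dyadic sum J^n = sum over k < 2^n of y(t^n_k) (x(t^n_{k+1}) - x(t^n_k))."""
    acc = 0.0
    for k in range(2 ** n):
        s, t = T * k / 2 ** n, T * (k + 1) / 2 ** n
        acc += y(s) * (x(t) - x(s))
    return acc

limit = math.sin(T) ** 2 / 2

# |J^{n+1} - J^n| decays geometrically, as the estimate in the proof predicts
```

Here the geometric decay of the successive differences is visible numerically, which is exactly what inequality (4) quantifies.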
3.2 A Problem of Continuity

We have to take great care of the meaning of the continuity result in Corollary 2: the norm ∥·∥_α is not equivalent to the uniform norm. Convergence in C^α implies uniform convergence, but the converse is not true. The following counter-example is the cornerstone to understand how I will be defined so as to deal with irregular paths.

Let (x^n)_{n∈N} and x be continuous paths such that x^n converges to x in C^α([0, T]; R²) with α ∈ (1/2, 1]. Let ϕ be a function in C^β([0, T]; R) with β ∈ (2/3, 1]. Assume also that f belongs to Lip(γ; R^d → R) where (γ + 1)β > 2, which implies that 2 > γ > 1.

Let Π^n = {t^n_k}_{k=0,…,2^n} be the dyadic partition of [0, T] at level n, that is, t^n_k = Tk/2^n. For each n = 1, 2, …, denote by Φ^n = {y^n_k}_{k=0,…,2^n−1} a set of functions, piecewise of class C¹, such that for a fixed κ > 1,

    y^n_k : [t^n_k, t^n_{k+1}] → R² with y^n_k(t^n_k) = y^n_k(t^n_{k+1}) = x^n(t^n_k),    (6a)
    sup_{n=1,2,…, k=0,…,2^n} ∥y^n_k∥_{β/2} < +∞,    (6b)
    uniformly in n, k, |Area(y^n_k) − (ϕ(t^n_{k+1}) − ϕ(t^n_k))| ≤ C T^κ 2^{−nκ},    (6c)

where Area(y^n_k) is the algebraic area of the loop y^n_k defined by

    Area(y^n_k) = (1/2) ∫_{t^n_k}^{t^n_{k+1}} (y^{1,n}_k(s) − y^{1,n}_k(t^n_k)) dy^{2,n}_k(s) − (1/2) ∫_{t^n_k}^{t^n_{k+1}} (y^{2,n}_k(s) − y^{2,n}_k(t^n_k)) dy^{1,n}_k(s).
For such a sequence, we say that ϕ asymptotically encodes the areas of (Φ^n)_{n∈N}. Denote by x^n ⋈ Φ^n the path from [0, 2T] to R² defined by
x^n ⋈ Φ^n = y^n_0 · x^n|_{[t^n_0, t^n_1]} · y^n_1 · x^n|_{[t^n_1, t^n_2]} · ··· · y^n_{2^n−1} · x^n|_{[t^n_{2^n−1}, t^n_{2^n}]},
where x · y is the concatenation of two paths x and y (see Figure 1), and ⋈ denotes this insertion of the loops of Φ^n into x^n. This path x^n ⋈ Φ^n is defined on the time interval [0, 2T].
Fig. 1. The path x^n ⋈ Φ^n: each inserted loop y^n_k has area ≈ ϕ(t^n_{k+1}) − ϕ(t^n_k).
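The algebraic-area formula above is easy to check numerically on the simplest loops, circles, whose algebraic area is ±πr² depending on orientation. A small sketch (all names are ours):

```python
import math

def loop_area(pts):
    """Algebraic area of a closed polyline, i.e. the discretized version of
    Area(y) = 1/2 int (y1 - y1(0)) dy2 - 1/2 int (y2 - y2(0)) dy1."""
    x0, y0 = pts[0]
    a = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:]):
        a += 0.5 * (x1 - x0) * (y2 - y1) - 0.5 * (y1 - y0) * (x2 - x1)
    return a

def circle_loop(center, r, orientation, n=20000):
    # a circle of radius r starting and ending at `center`, run once
    cx, cy = center
    return [(cx + r * (math.cos(orientation * 2 * math.pi * k / n) - 1),
             cy + r * math.sin(orientation * 2 * math.pi * k / n))
            for k in range(n + 1)]

r = 0.3
print(loop_area(circle_loop((1.0, 2.0), r, +1)))   # close to  pi * r**2
print(loop_area(circle_loop((1.0, 2.0), r, -1)))   # close to -pi * r**2
```

For a closed polyline the discretized formula reduces to the shoelace formula, so it is exact on the polygon and converges to the area of the smooth loop as the discretization is refined.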
A. Lejay
Then, by the Chasles property of the integral,
I(x^n ⋈ Φ^n; 0, 2T) = I(x^n; 0, T) + Σ_{k=0}^{2^n−1} ∫_{t^n_k}^{t^n_{k+1}} f(y^n_k(s)) dy^n_k(s).
By the Green-Riemann formula (1),
∫_{t^n_k}^{t^n_{k+1}} f(y^n_k(s)) dy^n_k(s) = ∫∫_{Surface(y^n_k)} [f, f](x¹, x²) dx¹ dx².
The idea is now the following:
∫∫_{Surface(y^n_k)} [f, f](x¹, x²) dx¹ dx² ≈ [f, f](x_{t^n_k}) Area(y^n_k) ≈ [f, f](x_{t^n_k})(ϕ(t^n_{k+1}) − ϕ(t^n_k)).
To be more precise, using our hypotheses on f and Φ^n, with Δ_n t = T 2^{−n},
|∫∫_{Surface(y^n_k)} [f, f](x¹, x²) dx¹ dx² − [f, f](x_{t^n_k})(ϕ(t^n_{k+1}) − ϕ(t^n_k))|
≤ 2‖∇f‖_{γ−1} ‖y^n_k‖^{γ−1}_{β/2} (C Δ_n t^κ + ‖ϕ‖_β Δ_n t^β) Δ_n t^{(γ−1)β/2} + 2C‖∇f‖_∞ Δ_n t^κ
≤ 2‖∇f‖_{γ−1} ‖y^n_k‖^{γ−1}_{β/2} (C Δ_n t^{κ−β} + ‖ϕ‖_β) Δ_n t^{(γ+1)β/2} + 2C‖∇f‖_∞ Δ_n t^κ.  (7)
There are now 2^n such terms to sum. By hypothesis, (γ + 1)β/2 > 1 and κ > 1, so the sum of the right-hand sides of (7) vanishes as n → ∞. In addition, necessarily β + γα > 1, so ∫ [f, f](x_s) dϕ_s can be considered as a Young integral. Thus, we easily get that
Σ_{k=0}^{2^n−1} ∫_{t^n_k}^{t^n_{k+1}} f(y^n_k(s)) dy^n_k(s) −−→ ∫_0^T [f, f](x_s) dϕ_s as n → ∞.
In other words,
I(x^n ⋈ Φ^n; 0, 2T) −−→ I(x; 0, T) + ∫_0^T [f, f](x_r) dϕ_r as n → ∞.
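The mechanism behind this limit can be seen numerically in its purest form: a path that winds n times around a circle of radius n^{−1/2} converges uniformly to the constant path 0, while its algebraic area stays equal to π. A small sketch, in the spirit of the inserted loops (the construction and names are ours):

```python
import math

def signed_area(pts):
    # discretized algebraic area of a closed polyline (shoelace formula)
    x0, y0 = pts[0]
    a = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:]):
        a += 0.5 * (x1 - x0) * (y2 - y1) - 0.5 * (y1 - y0) * (x2 - x1)
    return a

def spinning_path(n, npts=200000):
    """A path winding n times around a circle of radius n**-0.5: it converges
    uniformly to the constant path 0, yet its total algebraic area stays pi."""
    r = n ** -0.5
    return [(r * (math.cos(2 * math.pi * n * k / npts) - 1),
             r * math.sin(2 * math.pi * n * k / npts))
            for k in range(npts + 1)]

for n in (4, 16, 64):
    pts = spinning_path(n)
    sup = max(max(abs(px), abs(py)) for px, py in pts)
    print(n, sup, signed_area(pts))   # sup -> 0 while the area stays near pi
```

This is exactly the discontinuity of the integral with respect to the uniform norm: the areas survive in the limit even though the paths do not move.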
It is important to note that here, (x^n ⋈ Φ^n)_{n∈N} is in general not bounded in C^α([0, 2T]; R²), but it is bounded in C^{β/2}([0, 2T]; R²). Remark that for t ∈ [0, 2T], if t/2 ∈ [t^{n+1}_k, t^{n+1}_{k+1}] and k is odd, then (x^n ⋈ Φ^n)(t) = x^n(t/2), while if k is even, then (x^n ⋈ Φ^n)(t) = y^n_{k/2}(t/2). Thus,
|x^n ⋈ Φ^n(t) − x^n ⋈ Φ^n(s)| is bounded by:
|x^n(t/2) − x^n(s/2)| if s/2 ∈ [t^{n+1}_{2k+1}, t^{n+1}_{2k+2}], t/2 ∈ [t^{n+1}_{2ℓ+1}, t^{n+1}_{2ℓ+2}];
|y^n_ℓ(t/2) − y^n_ℓ(t^n_ℓ)| + |x^n(t^n_ℓ) − x^n(s/2)| if s/2 ∈ [t^{n+1}_{2k+1}, t^{n+1}_{2k+2}], t/2 ∈ [t^{n+1}_{2ℓ}, t^{n+1}_{2ℓ+1}];
|y^n_ℓ(t/2) − y^n_ℓ(t^n_ℓ)| + |x^n(t^n_ℓ) − x^n(t^n_k)| + |y^n_k(t^n_k) − y^n_k(s/2)| if s/2 ∈ [t^{n+1}_{2k}, t^{n+1}_{2k+1}], t/2 ∈ [t^{n+1}_{2ℓ}, t^{n+1}_{2ℓ+1}], k ≠ ℓ;
|y^n_ℓ(t/2) − y^n_ℓ(s/2)| if s/2 ∈ [t^{n+1}_{2ℓ}, t^{n+1}_{2ℓ+1}], t/2 ∈ [t^{n+1}_{2ℓ}, t^{n+1}_{2ℓ+1}];
|x^n(t/2) − x^n(t^n_k)| + |y^n_k(t^n_k) − y^n_k(s/2)| if s/2 ∈ [t^{n+1}_{2k}, t^{n+1}_{2k+1}], t/2 ∈ [t^{n+1}_{2ℓ+1}, t^{n+1}_{2ℓ+2}].
Using the convexity inequality, one gets that for some constant C that depends only on α and β,
|x^n ⋈ Φ^n(t) − x^n ⋈ Φ^n(s)| ≤ C max{‖x‖_α, sup_{k=0,...,2^n−1} ‖y^n_k‖_{β/2}} max{(t − s)^{β/2}, (t − s)^α}.
Since β/2 ≤ α, it follows that (x^n ⋈ Φ^n)_{n∈N} is bounded in C^{β/2}([0, 2T]; R²), assuming of course that the y^n_k have a β/2-Hölder norm different from zero. As we required ϕ to be β-Hölder continuous, if we choose for the y^n_k circles with area ϕ(t^n_{k+1}) − ϕ(t^n_k), then their radii are
√(|ϕ(t^n_{k+1}) − ϕ(t^n_k)|/π) ≤ (‖ϕ‖_β Δ_n t^β / π)^{1/2},
and this is why we look for y^n_k's that are β/2-Hölder continuous.
This also means that when one considers a sequence (x^n)_{n∈N} of elements in C^α([0, T]; R²) and a path x of C^α([0, T]; R²) with α > 1/2, one has to consider the fact that (x^n)_{n∈N} may converge to x with respect to some β-Hölder norm with β ≤ 1/2. In addition, this counter-example ruins all hope of extending I naturally to C^α([0, T]; R²) for α < 1/2, since one may construct at least two bounded sequences (x^n)_{n∈N} and (z^n)_{n∈N} in C^α([0, T]; R²) with α < 1/2 converging uniformly to x—hence converging to x in C^β([0, T]; R²) for any β < α—such that I(x^n; 0, T) −−→ I(x; 0, T) and I(z^n; 0, T) −−→ I(x; 0, T) + ∫_0^T [f, f](x_s) dϕ_s as n → ∞, the latter being different from I(x; 0, T) unless [f, f] = 0 or ϕ is constant.
3.3 A Practical Counter-example in the Stochastic Setting
In [43, 48], we give a stochastic example of such a phenomenon coming from homogenization theory. Consider some coefficients σ from R^d to the space of
d×d-matrices and b : R^d → R^d smooth enough, both 1-periodic. Consider the SDE
X^ε_t = ∫_0^t σ(X^ε_s/ε) dB_s + (1/ε) ∫_0^t b(X^ε_s/ε) ds
for some Brownian motion B. It is well known from homogenization theory (see [7] for example) that X^ε converges as ε → 0 to σW for some Brownian motion W and a constant d×d-matrix σ, provided that the drift b satisfies some averaging property. One of the applications of this theory is to provide a tool to replace (for modelling or numerical computations) a PDE of type ∂_t u^ε(t, x) + L^ε u^ε(t, x) = 0, u^ε(T, x) = g(x), with
L^ε = Σ_{i,j=1}^d (1/2) a_{i,j}(·/ε) ∂²_{x_i x_j} + Σ_{i=1}^d (1/ε) b_i(·/ε) ∂_{x_i} and a = σσ^t,
by the simpler PDE ∂_t u(t, x) + L u(t, x) = 0 with L = Σ_{i,j=1}^d (1/2) a_{i,j} ∂²_{x_i x_j} and a = σσ^t. From the probabilistic point of view, this means that X^ε behaves—thanks to a functional Central Limit Theorem and the ergodic behavior of its projection on the torus R^d/[0, 1]^d—like a non-standard Brownian motion. However, one has to take care when using X^ε as the driver of some SDE, since for i, j = 1, ..., d,
A^{i,j}(X^ε; 0, t) −−→ A^{i,j}(σW; 0, t) + t c_{i,j} as ε → 0,
uniformly and in p-variation for p > 2, where (c_{i,j})_{i,j=1,...,d} is a d×d antisymmetric matrix that can be computed from a and b, and A^{i,j} is the Lévy area of (Y^i, Y^j), i.e.,
A^{i,j}(Y; 0, t) = (1/2) ∫_0^t (Y^i_s − Y^i_0) ∘ dY^j_s − (1/2) ∫_0^t (Y^j_s − Y^j_0) ∘ dY^i_s
for a d-dimensional semi-martingale Y. If b = 0, then c = 0, so this effect comes from the presence of the drift.
From the Wong-Zakai theorem (see for example [40]), the Stratonovich integral appears as the natural extension of I on the subset SM([0, T]; R²) of C^α([0, T]; R²), α < 1/2, that contains the trajectories of semi-martingales. Note however that for Y ∈ SM([0, T]; R²) and (f₁, f₂) = (1/2)(−x², x¹), I(Y; 0, t) = A^{1,2}(Y; 0, t) for t ∈ [0, T], if I is defined on SM([0, T]; R²) as the Stratonovich integral I(Y; 0, t) = ∫_0^t f(Y_s) ∘ dY_s. Since both X^ε and σW belong to SM([0, T]; R²), the previous example shows that I(X^ε; 0, t) does not converge in general to I(σW; 0, t). This proves that I cannot be continuous on SM([0, T]; R²) ⊂ C^α([0, T]; R²).
Counter-examples to the Wong-Zakai theorem (see [40, 59]) also rely on the construction of approximations of the Brownian trajectories by “perturbing” the piecewise linear approximation, which gives rise, in the limit, to a non-vanishing supplementary area and then, for the SDE, to a drift term. The theory of rough paths gives a better understanding of this phenomenon [48].
This problem of convergence may arise in a natural setting and then be of practical interest.
4 Integrals along α-Hölder Continuous Paths, α ∈ (1/3, 1/2]: Heuristic Considerations
We present in this section a construction of the integral which is not the best possible one, but which allows one to understand the main ideas and problems. The counter-example of Section 3.2 has yielded a few ideas: (1) We may use the Green-Riemann formula to deal with closed loops. (2) For some α > 1/2, we may add to our paths small loops whose radii are of order 2^{−nα/2} and thus whose areas are of order 2^{−nα}. (3) As many loops are added, the sum of the areas does not vanish and gives rise to an extra term. Our construction will now take these facts into account.
4.1 Construction of the Integral along a Subset of C^α([0, T]; R²)
As we wish our definition of the integral to be continuous, a naive construction is the following: Fix K > 0, α ∈ (1/3, 1/2] and f ∈ Lip(γ; R² → R) with γ > 1/α − 1 (and then γ > 1). Denote by Π^n the dyadic partition of [0, T] at level n, and by L^α([0, T]; R²) the set of functions x ∈ C^α([0, T]; R²) for which the piecewise linear approximations (x^{Π^n})_{n∈N} satisfy
I(x) := lim_{n→∞} I(x^{Π^n}) exists in C^α([0, T]; R)
and |I(x|_{[s,t]}) − I(x^{Π^n}|_{[s,t]})| ≤ K ‖x − x^{Π^n}‖_α |t − s|^α for 0 ≤ s < t ≤ T.
If K is large enough, it follows from Corollary 2 that L^α([0, T]; R²) contains subsets of C^β([0, T]; R²) for all β > 1/2 (this depends on f and on the choice of K, since from Corollary 2, x → I(x) is locally Lipschitz), and it is also known (but for this, we need a more complete theory) that it contains paths that are not β-Hölder continuous for β > 1/2, such as Brownian trajectories (see for example [13, 65]). Any element x of L^α([0, T]; R²) may be identified with the sequence (x^{Π^n})_{n∈N}.
Now, consider ϕ ∈ C^{2α}([0, T]; R) and (Φ^n)_{n∈N} a sequence of loops at each level n whose areas are asymptotically encoded by ϕ. Then, as previously,
I(x^{Π^n} ⋈ Φ^n) converges in C^α to I(x, ϕ) := I(x) + ∫ [f, f](x_s) dϕ_s.
For (x, ϕ) ∈ L^{1,α}([0, T]; R³) := L^α([0, T]; R²) × C^{2α}([0, T]; R), we may then define
I(x, ϕ) = lim_{n→∞} I(x^{Π^n} ⋈ Φ^n),
where ϕ asymptotically encodes the areas of (Φ^n)_{n∈N}. The space L^{1,α}([0, T]; R³) is naturally a Banach space when equipped with the norm ‖(x, ϕ)‖_{1,α} = ‖x‖_α + ‖ϕ‖_{2α}. The interesting point with this definition of the map (x, ϕ) → I(x, ϕ) is that its continuity follows naturally from its very construction.
Proposition 1. For all β < α with α ∈ (1/3, 1/2], the map I is continuous from (L^{1,α}([0, T]; R³), ‖·‖_{1,α}) to (C^α([0, T]; R), ‖·‖_β).
Proof. Let (x^n, ϕ^n)_{n∈N} be a sequence of paths converging to (x, ϕ) in the space L^{1,α}([0, T]; R³). By definition, I(x^n, ϕ^n; s, t) = I(x^n; s, t) + ∫_s^t [f, f](x^n_r) dϕ^n_r. From Corollary 3, we know that ∫_0^· [f, f](x^n_s) dϕ^n_s converges to ∫_0^· [f, f](x_s) dϕ_s in the space C^{2α}([0, T]; R). From the very definition of L^α([0, T]; R²),
‖I(x^{n,Π^m}) − I(x^n)‖_α ≤ K ‖x^{n,Π^m} − x^n‖_α.
But it is easily shown with Corollary 1 that for all β < α and some constant K₂, ‖x^{n,Π^m} − x^n‖_β ≤ K₂ ‖x^n‖_α / 2^{m(α−β)}, and thus (I(x^{n,Π^m}))_{m∈N} converges to I(x^n) in C^β([0, T]; R) at a rate which is uniform in n, since (‖x^n‖_α)_{n∈N} is bounded. It follows that for all β < α, I(x^{n,Π^m}) converges uniformly in n to I(x^n) in C^β([0, T]; R) as m → ∞.
For s < t fixed, there exist some integers i_m and j_m such that t^m_{i_m−1} ≤ s < t^m_{i_m} and t^m_{j_m} ≤ t < t^m_{j_m+1}. To simplify the notation, set t^m_{i_m−1} = s and t^m_{j_m+1} = t. For k = i_m − 1, ..., j_m + 1, denote by z^{n,m}_k the following path (see Figure 2):
z^{n,m}_k = x^{Π^m}|_{[t^m_k, t^m_{k+1}]} · S(x^{Π^m}_{t^m_{k+1}}, x^{n,Π^m}_{t^m_{k+1}}) · x^{n,Π^m}|_{[t^m_{k+1}, t^m_k]} · S(x^{n,Π^m}_{t^m_k}, x^{Π^m}_{t^m_k}),
where S(a, b) denotes the straight segment from a to b, and x^{n,Π^m}|_{[t^m_{k+1}, t^m_k]} is run backward in time; each z^{n,m}_k is thus a loop. Hence, with the previous convention on t^m_{i_m−1} and t^m_{j_m+1},
I(x^{Π^m}; s, t) − I(x^{n,Π^m}; s, t) = Σ_{k=i_m−1}^{j_m} ∫_{z^{n,m}_k} f + ∫_{S(x^{Π^m}_s, x^{n,Π^m}_s)} f + ∫_{S(x^{n,Π^m}_t, x^{Π^m}_t)} f.  (8)
Fig. 2. The paths z^{n,m}_k.
Note that
|∫_{z^{n,m}_k} f| = |∫∫_{Surface(z^{n,m}_k)} [f, f](x¹, x²) dx¹ dx²|
≤ (1/2) ‖f‖_Lip |x_{t^m_{k+1}} − x_{t^m_k}| × |x^n_{t^m_k} − x_{t^m_k}|
≤ (1/2) (t^m_{k+1} − t^m_k)^α ‖x‖_α ‖f‖_Lip ‖x^n − x‖_∞.
Using the convexity inequality with x → x^{1/α}, since there are at most 2^m terms in the series in the right-hand side of (8), we get
|Σ_{k=i_m−1}^{j_m} ∫_{z^{n,m}_k} f| ≤ 2^{m(1−α)} (Σ_{k=i_m−1}^{j_m} |∫_{z^{n,m}_k} f|^{1/α})^α ≤ (2^m/2) ‖f‖_Lip ‖x‖_α ‖x^n − x‖_∞ (t − s)^α.
On the other hand, setting Δ^n_r = x^{n,Π^m}_r − x^{Π^m}_r for r ∈ {s, t},
∫_{S(x^{Π^m}_s, x^{n,Π^m}_s)} f + ∫_{S(x^{n,Π^m}_t, x^{Π^m}_t)} f = ∫_{S(x^{Π^m}_s, x^{n,Π^m}_s)} f − ∫_{S(x^{Π^m}_t, x^{n,Π^m}_t)} f
= ∫_0^1 (f(x^{Π^m}_s + rΔ^n_s) − f(x^{Π^m}_t + rΔ^n_t)) Δ^n_s dr + ∫_0^1 f(x^{Π^m}_t + rΔ^n_t)(Δ^n_s − Δ^n_t) dr,
whose absolute value is at most
‖f‖_Lip |Δ^n_s| (‖x^{Π^m}‖_α + ‖x^{n,Π^m}‖_α)(t − s)^α + ‖f‖_Lip |Δ^n_t − Δ^n_s|.
But, for any δ ∈ [0, 1),
|Δ^n_t − Δ^n_s| ≤ |x^{Π^m}_t − x^{Π^m}_s − x^{n,Π^m}_t + x^{n,Π^m}_s|
≤ (|x^{Π^m}_t − x^{Π^m}_s|^δ + |x^{n,Π^m}_t − x^{n,Π^m}_s|^δ) · 2‖x^{Π^m} − x^{n,Π^m}‖^{1−δ}_∞
≤ (t − s)^{αδ} 2 max{‖x^{Π^m}‖^δ_α, ‖x^{n,Π^m}‖^δ_α} · 2‖x^{Π^m} − x^{n,Π^m}‖^{1−δ}_∞.
This proves convergence of I(x^{n,Π^m}) to I(x^{Π^m}) in C^β([0, T]; R) as n → ∞, for any m and any β < α. It is now possible to complete the diagram
I(x^{n,Π^m}) −−(‖·‖_β, n → ∞)−→ I(x^{Π^m})
(‖·‖_β, m → ∞, uniformly in n) ↓          ↓ (‖·‖_β, m → ∞)
I(x^n) −−(‖·‖_β, n → ∞)−→ I(x)
to obtain that I(x^n, ϕ^n) converges in C^β([0, T]; R) to I(x, ϕ).
Fig. 3. The paths x, x^{Π^n}, x^{Π^{n+1}} and the areas defined by Φ^{n,n+1} (in gray).
Moreover, the following stability result is easily proved.
Lemma 2. If ψ (resp. ϕ) is given in C^{2α}([0, T]; R) and if it asymptotically encodes the areas of (Ψ^n)_{n∈N} (resp. (Φ^n)_{n∈N}), then
lim_{n→+∞} I(x^{Π^n} ⋈ Φ^n ⋈ Ψ^n) = I(x, ϕ + ψ).
The function ϕ can be arbitrarily chosen, so we have gained a degree of freedom. In other words, to get a proper definition of I that respects continuity, we have to consider not a path with values in R² but a path with values in R³. Indeed, this construction is far from optimal, i.e., the set L^{1,α}([0, T]; R³) is not the largest one that can be considered. Yet it gives a proper understanding of the problem.
4.2 Is this Construction Natural?
Of course, the real question is to consider whether or not it is natural to extend I on (at least) a subset of C^α([0, T]; R²) with α ∈ (1/3, 1/2] by considering paths valued not in R² but in R³.
Consider a path x ∈ C^α([0, T]; R²). The piecewise linear path x^{Π^n} is an approximation of x, and for each m ≥ n, we may define
x̃^{Π^m} := (x^{Π^m}|_{[t^n_0, t^n_1]} · x^{Π^n}|_{[t^n_1, t^n_0]}) · x^{Π^n}|_{[t^n_0, t^n_1]} · ··· · (x^{Π^m}|_{[t^n_{2^n−1}, t^n_{2^n}]} · x^{Π^n}|_{[t^n_{2^n}, t^n_{2^n−1}]}) · x^{Π^n}|_{[t^n_{2^n−1}, t^n_{2^n}]}
on the time interval [0, 3T]. As we go back and forth on the segments composing x^{Π^n}, we get that I(x̃^{Π^m}; 0, 3T) = I(x^{Π^m}; 0, T). We then define y^{n,m}_k = x^{Π^m}|_{[t^n_k, t^n_{k+1}]} · x^{Π^n}|_{[t^n_{k+1}, t^n_k]}, which satisfies (6a)–(6b), and Φ^{n,m} = {y^{n,m}_k}_{k=0,...,2^n−1}. Since x̃^{Π^m} = x^{Π^n} ⋈ Φ^{n,m},
I(x^{Π^m}; 0, T) = I(x̃^{Π^m}; 0, 3T) = I(x^{Π^n} ⋈ Φ^{n,m}; 0, 3T).
If we now set for example m = n², then a priori nothing ensures, unless x ∈ L^α([0, T]; R²), that the areas of (Φ^{n,n²})_{n∈N} are asymptotically encoded by the function ϕ ≡ 0, nor that there exists a function ϕ ∈ C^{2α}([0, T]; R) that encodes the areas of (Φ^{n,n²})_{n∈N}. In the last two cases, how then is the limit of I(x^{Π^{n²}}) to be considered, since it may differ from the limit of I(x^{Π^n} ⋈ Φ^{n,n²})? Indeed,
I(x^{Π^n} ⋈ Φ^{n,n²}; 0, 3T) = I(x^{Π^n}; 0, T) + Σ_{k=0}^{2^n−1} I(y^{n,n²}_k; t^n_k, t^n_{k+1}).
Fig. 4. The area of some α-Hölder continuous path between times s and t is of order (t − s)^{2α}, its increments being of order (t − s)^α.
Yet with the Green-Riemann formula,
I(y^{n,n²}_k; t^n_k, t^n_{k+1}) ≈ [f, f](x_{t^n_k}) Area(y^{n,n²}_k).
As already seen, the function A on C^β([0, T]; R²), β > 1/2, defined by
A(x; s, t) = (1/2) ∫_s^t (x¹_r − x¹_s) dx²_r − (1/2) ∫_s^t (x²_r − x²_s) dx¹_r  (9)
is not continuous with respect to the uniform norm: One only has to take f(x) = (1/2) x¹ dx² − (1/2) x² dx¹ and to use the previous counter-examples. As Area(y^{n,n²}_k) = A(x^{Π^{n²}}; t^n_k, t^n_{k+1}) and although y^{n,n²}_k converges uniformly to 0, it may happen that A(x^{Π^{n²}}; t^n_k, t^n_{k+1}) is of order 2^{−2αn} (this is possible since the distance between x_{t^n_k} and x_{t^n_{k+1}} is roughly of order 2^{−αn} if x is α-Hölder continuous, see Figure 4). In this case, Σ_{k=0}^{2^n−1} I(y^{n,n²}_k) may have a limit different from 0, or no limit at all.
In other words, the area contained between a path x and its chord for all couples of times (s, t) is “hidden” in x and has to be determined in an arbitrary manner¹. For some (x, ϕ) ∈ L^{1,α}([0, T]; R³), which is identified with a sequence converging uniformly to x, the element ϕ means in some sense that some area has been chosen and then that our integral is properly determined. Once this choice of ϕ has been performed, Lemma 2 says how to construct different integrals by choosing other areas.
4.3 Justifications for a New Setting
The previous construction does not answer our main question: “How to construct an integral for paths in C^α([0, T]; R²) for α ∈ (1/3, 1]?”. Yet it yields
¹ Consider the case of Brownian trajectories, where the Lévy area is a natural choice, but not the only one, and was the first example of a stochastic integral [47]. In addition, it is then defined as a limit in probability.
the fact that one cannot define a map x → I(x) which extends the map x → ∫_x f from C^α([0, T]; R²) with α > 1/2 unless some extra information is added. Here, this information corresponds to the choice of a function ϕ, so that we consider indeed a subset of C^α([0, T]; R²) × C^{2α}([0, T]; R) (for α ≤ 1/2) such that, when equipped with the norm ‖(x, ϕ)‖ = ‖x‖_α + ‖ϕ‖_{2α}, the map I is continuous. We have also seen in Section 4.2 above that for defining an integral along a path in C^α([0, T]; R²) with α ∈ (1/3, 1/2], it is natural to consider the area contained between the path and its chord, although there is no way to define it canonically in general.
The drawback of our construction is that we assumed convergence of the integrals along piecewise linear approximations of x. The idea is now to construct directly a path in R³ that may be identified with the limit of a converging sequence of piecewise smooth paths in R² whose integrals also converge. This allows us to get rid of the loops themselves, since the only information we need is the asymptotic limit of the area, while keeping enough information to construct the integral. Besides, this proves that the choice of a converging subsequence does not depend on the choice of the differential form which is integrated.
5 Integrals along α-Hölder Continuous Paths, α ∈ (1/3, 1/2]: Construction by Approximations
It is time to turn to the full picture, now that the importance of knowing the area has been shown.
5.1 Motivations
The main idea in the previous approach was to replace an irregular path (x, ϕ) ∈ L^{1,α}([0, T]; R³) with a simpler path x^n ∈ C¹_p([0, T]; R²) which “approximates” x in the following sense: x^n_{t^n_k} = x_{t^n_k} at the dyadic points {t^n_k}_{k=0,...,2^n} of [0, T], and on [t^n_k, t^n_{k+1}], x^n is composed of a loop y^n_k : [t^n_k, t^n_k + T 2^{−n−1}] → R² followed by a segment joining x^n_{t^n_k} and x^n_{t^n_{k+1}}.
Once this family (x^n)_{n∈N} has been constructed, one may study the convergence of the ordinary integrals I(x^n), where the integrals of f on the loops have been transformed with the Green-Riemann formula into double integrals approximately given by the areas of the loops times the Lie brackets of f at the starting points of the loops. If x^n is defined on [0, T] with loops on [t^n_k, t^n_k + T 2^{−n−1}] and straight lines on [t^n_k + T 2^{−n−1}, t^n_{k+1}], a simple approximation of I(x) is then given by
J^n = Σ_{k=0}^{2^n−1} ( ∫_{t^n_k + T 2^{−n−1}}^{t^n_{k+1}} f(x^n_s) dx^n_s + [f, f](x_{t^n_k}) A(x^n; t^n_k, t^n_k + T 2^{−n−1}) ),  (10)
where A(x; s, t) has been defined by (9). Now, following the heuristic reasoning of Section 4.2, we replace the assumption
(H1) The path (x, ϕ) belongs to L^{1,α}([0, T]; R³).
by the assumption
(H2) There exists some function A(x; s, t) which is the limit of A(x^n; s, t) for all 0 ≤ s ≤ t ≤ T.
Note that the assumption (H1) implies (H2) if f is the differential form f(x) = (1/2)(x¹ dx² − x² dx¹). In (H2), there is no more reference to f, while a priori the set L^{1,α}([0, T]; R³) depends on f.
The assumption (H2) means that A(x^n; t^n_k, t^n_{k+1}) (which is equal to A(x^n; t^n_k, t^n_k + T 2^{−n−1})) is equivalent to A(x; t^n_k, t^n_{k+1}) as n → ∞. Hence, one may replace (10) by
J^n = Σ_{k=0}^{2^n−1} ( ∫_{t^n_k + T 2^{−n−1}}^{t^n_{k+1}} f(x^{Π^n}_s) dx^{Π^n}_s + [f, f](x_{t^n_k}) A(x; t^n_k, t^n_{k+1}) ).  (11)
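The mechanism of (10)–(11) — correct the integral along each chord by the bracket [f, f] times the area between the path and the chord — can be tried numerically on a smooth path. The 1-form, the path and all helper names below are our own illustrative choices:

```python
import math

def f(p):
    # a sample nonlinear differential form f = f1 dx^1 + f2 dx^2 (an assumption for the demo)
    return (-p[1], p[0] * p[1])

def bracket(p):
    # [f, f] = d f2/dx^1 - d f1/dx^2 evaluated at p
    return p[1] + 1.0

def x(t):
    return (math.cos(t), math.sin(t))

def chord_integral(p, q, m=50):
    # integral of f along the straight segment from p to q (midpoint rule)
    s = 0.0
    for j in range(m):
        u = (j + 0.5) / m
        z = (p[0] + u * (q[0] - p[0]), p[1] + u * (q[1] - p[1]))
        f1, f2 = f(z)
        s += (f1 * (q[0] - p[0]) + f2 * (q[1] - p[1])) / m
    return s

def area_between(t0, t1, m=200):
    # A(x; t0, t1): signed area between the path and its chord, from formula (9)
    p0 = x(t0)
    a = 0.0
    for j in range(m):
        z1, z2 = x(t0 + (t1 - t0) * j / m), x(t0 + (t1 - t0) * (j + 1) / m)
        a += 0.5 * (z1[0] - p0[0]) * (z2[1] - z1[1]) - 0.5 * (z1[1] - p0[1]) * (z2[0] - z1[0])
    return a

def corrected_sum(n):
    # chord integral plus [f,f](x_{t_k}) * A(x; t_k, t_{k+1}), summed over the partition
    return sum(chord_integral(x(k / n), x((k + 1) / n))
               + bracket(x(k / n)) * area_between(k / n, (k + 1) / n)
               for k in range(n))

reference = corrected_sum(512)          # a fine value of I(x; 0, 1)
print(corrected_sum(8), reference)      # already close at a coarse level
```

By the Green-Riemann formula, the error of each corrected term comes only from the variation of [f, f] over one subinterval, which is why the sum is accurate already on coarse partitions.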
This form has the following advantage over the previous one: Under (H2), one can study, as was done in the construction of the Young integral, the convergence of J^n by studying J^{n+1} − J^n, in order to prove that Σ_{n≥0} (J^{n+1} − J^n) converges and to define the integral of x as the limit of this series plus J^0. This method is central in the theory of rough paths. Still using some approximation, we change (11) into
J^n = Σ_{k=0}^{2^n−1} ∫_{t^n_k + T 2^{−n−1}}^{t^n_{k+1}} f(x^{Π^n}_s)(x_{t^n_{k+1}} − x_{t^n_k}) (ds/Δ_n t) + Σ_{k=0}^{2^n−1} ∫_{t^n_k}^{t^n_{k+1}} [f, f](x^{Π^n}_s) A(x; t^n_k, t^n_{k+1}) (ds/Δ_n t),
with Δ_n t = T 2^{−n}. We use this expression to motivate our introduction of some algebraic structures.
Our wish is then to interpret A(x; s, t) as some “vector”, in the same way as x_t − x_s can be seen, from a geometrical point of view, as the vector that links the two points x_s and x_t, with R² as some affine space. As will appear below, A(x; s, t) is in general different from A(x; 0, t) − A(x; 0, s). Hence, the Euclidean structure is not adapted. We will now construct some space A(R²) of dimension 3, that will play the role both of an affine and a vector space, and the kind of vectors we will consider will be (x¹_t − x¹_s, x²_t − x²_s, A(x; s, t)). Nevertheless, they will be constructed from the paths (x¹_t, x²_t, A(x; 0, t))_{t≥0} living in A(R²) seen as some affine space.
Firstly, we define this space A(R²), then we study the approximation of paths living in this space, and finally we define an integral as a limit of ordinary integrals using the previously constructed approximations.
5.2 What Happens to the Area?
For a continuous path x ∈ C^α([0, T]; R²) with α > 1/2, let y_t = A(x; 0, t) be the area enclosed between the curve x|_{[0,t]} and its chord x₀x_t, where A has been defined by (9). This path y is well defined by (9) and belongs to C^α([0, T]; R). As we have seen that x → A(x; 0, ·) is not continuous in general on C^α([0, T]; R²) for α ≤ 1/2, we are nonetheless willing to define the equivalent of the process y for an irregular path. This can be achieved using an algebraic setting.
Remark first that if x ∈ C^α([0, T]; R²) with α ∈ (1/2, 1],
A(x; s, t) = A(x; s, u) + A(x; u, t) + (1/2) (x_u − x_s) ∧ (x_t − x_u)  (12)
for all 0 ≤ s < u < t ≤ T (see Figure 5). Here, ∧ is the vector product between two vectors: a ∧ b = a¹b² − a²b¹.
5.3 Linking Points
We first consider, for a piecewise smooth path x, the path (x¹, x², A(x)) living in a three-dimensional space. If u belongs to R, then we set
C(x, u; t) = (x¹_t, x²_t, u + A(x; 0, t))  (13)
for t ∈ [0, T]. In the following, we may think of x as a 2-dimensional control trajectory of the position of a particle moving in R³. Given two points a = (a¹, a², a³) and b = (b¹, b², b³), we wish to construct a piecewise smooth path x from [0, 1] to R² such that the continuous path (x_t, a³ + A(x; 0, t)) from [0, 1] to R³ goes from a to b.
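Relation (12) is elementary to verify numerically for a smooth path; the helper names are ours:

```python
import math

def x(t):
    return (math.sin(t), t * t)  # any smooth path in R^2

def area(s, t, m=4000):
    # A(x; s, t) from (9), by a fine Riemann sum over a polyline approximation
    p = x(s)
    a = 0.0
    for j in range(m):
        z1, z2 = x(s + (t - s) * j / m), x(s + (t - s) * (j + 1) / m)
        a += 0.5 * (z1[0] - p[0]) * (z2[1] - z1[1]) - 0.5 * (z1[1] - p[1]) * (z2[0] - z1[0])
    return a

def wedge(a, b):
    return a[0] * b[1] - a[1] * b[0]

s, u, t = 0.2, 0.7, 1.5
lhs = area(s, t)
rhs = area(s, u) + area(u, t) + 0.5 * wedge(
    (x(u)[0] - x(s)[0], x(u)[1] - x(s)[1]),
    (x(t)[0] - x(u)[0], x(t)[1] - x(u)[1]))
print(lhs, rhs)  # the two sides of (12) agree
```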
Fig. 5. A geometrical illustration of (12): the areas A(x; s, u), A(x; u, t), A(x; s, t) and the triangle term (1/2)(x_u − x_s) ∧ (x_t − x_u).
Fig. 6. A simple path (x, y) from a to b = (x₁, a³ + A(x; 0, 1)), controlled by a path x in R².
Such a path is easily constructed. We give here a simple example, which serves as a prototype for our approach. Our choice, drawn in Figure 6, is
x_t = (a¹, a²) + √(|b³ − a³|/π) (cos(4πt) − 1, sgn(b³ − a³) sin(4πt)) if t ∈ [0, 1/2],
and x_t = (a¹, a²) + (2t − 1)(b¹ − a¹, b² − a²) if t ∈ [1/2, 1].
Given two points a and b in R³, consider two paths x and y in C¹_p([0, T]; R²) such that x₀ = y₀ = 0 and C(x, 0; T) = a, C(y, 0; T) = b. The concatenation x · y of x and y gives rise to a path that goes from 0 to π(a + b) through π(a), where π is the projection π(a¹, a², a³) = (a¹, a²). What can then be said of C(x · y; 0, 2T)? Due to (12), we get that C(x · y) is a path that goes from 0 to the point denoted by a ⋆ b and defined by
a ⋆ b = (a¹ + b¹, a² + b², a³ + b³ + (1/2) (a¹, a²) ∧ (b¹, b²)).
With this notation, ⋆ clearly defines an operation on R³, which is different from the usual addition (geometrically equivalent to some translation) in this space R³. In addition, C(x · y, 0) passes through the point a. As illustrated in Figures 6a–6d, this gives rise to a different path from the one obtained by concatenation of C(x, 0) and C(π(a) + y, a³), which ends at a + b.
5.4 The Space R³ as a Non-Commutative Group
We have now equipped R³ with an operation ⋆, which is easily proved to be associative. When equipped with this operation ⋆, we denote R³ by A(R²). We also set
[a, b] = a ⋆ b − b ⋆ a = (0, 0, (a¹, a²) ∧ (b¹, b²)).
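The prototype construction above (a circle of the prescribed area followed by a segment) can be checked numerically: the lifted path ends at b, with the circle supplying exactly the area b³ − a³. All names in this sketch are ours, and sgn(0) is taken as +1:

```python
import math

def control_path(a, b, t):
    """The circle-then-segment control of Section 5.3: the lifted path
    (x_t, a3 + A(x; 0, t)) goes from a = (a1, a2, a3) to b = (b1, b2, b3)."""
    r = math.sqrt(abs(b[2] - a[2]) / math.pi)
    sg = 1.0 if b[2] >= a[2] else -1.0
    if t <= 0.5:
        return (a[0] + r * (math.cos(4 * math.pi * t) - 1),
                a[1] + sg * r * math.sin(4 * math.pi * t))
    return (a[0] + (2 * t - 1) * (b[0] - a[0]),
            a[1] + (2 * t - 1) * (b[1] - a[1]))

def lift_endpoint(a, b, n=100000):
    # endpoint of (x_t, a3 + A(x; 0, t)) with A computed by a discretized (9)
    pts = [control_path(a, b, k / n) for k in range(n + 1)]
    x0, y0 = pts[0]
    area = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:]):
        area += 0.5 * (x1 - x0) * (y2 - y1) - 0.5 * (y1 - y0) * (x2 - x1)
    return (pts[-1][0], pts[-1][1], a[2] + area)

a, b = (0.0, 0.0, 0.0), (1.0, -2.0, 0.7)
print(lift_endpoint(a, b))  # close to b
```

The segment contributes no area because its increments are parallel to its position relative to the starting point, so only the circle feeds the third component.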
Fig. 6a. The path C(x, 0), ending at a.
Fig. 6b. The path C(π(a) + y, a³), going from a to a + b.
Fig. 6c. The path C(x, 0) · C(π(a) + y, a³), ending at a + b.
Fig. 6d. The path C(x · y, 0), ending at a ⋆ b.
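The operation just defined (written ⋆ in the text, `star` below) and the associated bracket are easy to exercise numerically; all names in this sketch are ours:

```python
def star(a, b):
    # the product a * b on A(R^2) = (R^3, *): usual sum plus an area correction
    wedge = a[0] * b[1] - a[1] * b[0]
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2] + 0.5 * wedge)

def bracket(a, b):
    # [a, b] = a*b - b*a, with component-wise subtraction in R^3
    p, q = star(a, b), star(b, a)
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])

def neg(a):
    return (-a[0], -a[1], -a[2])

a, b, c = (1.0, 2.0, 0.5), (-0.5, 3.0, 1.0), (2.0, -1.0, 0.25)

# associativity, neutral element, inverse
assert star(star(a, b), c) == star(a, star(b, c))
assert star(a, (0.0, 0.0, 0.0)) == a and star(neg(a), a) == (0.0, 0.0, 0.0)
# non-commutativity, concentrated in the third component
print(star(a, b), star(b, a), bracket(a, b))
# b*a = a*b*[b, a]
assert star(star(a, b), bracket(b, a)) == star(b, a)
```

The chosen coordinates are dyadic rationals, so the identities hold exactly in floating point here.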
This bracket [·, ·] is of course linked to the fact that (A(R²), ⋆) is a non-commutative group, called the Heisenberg group (see Section 6.3).
Lemma 3. The space (A(R²), ⋆) is a non-commutative group with 0 as the neutral element. The inverse of any element a = (a¹, a², a³) is −a = (−a¹, −a², −a³).
Proof. That the inverse of a is −a is easily verified, since
(−a¹, −a², −a³) ⋆ (a¹, a², a³) = −(1/2)[a, a] = 0.
The non-commutativity of ⋆ in general follows from b ⋆ a = a ⋆ b ⋆ [b, a]. The non-commutativity of ⋆ is illustrated in Figures 6e–6f. Of course, the first two components of a ⋆ b and b ⋆ a always coincide: the non-commutativity concerns only the third component. If x : [0, 1] → R² goes from a to b and y : [0, 1] → R² goes from b to c, then x · y goes from a to c, and (y − b + a) · (b − a + x) also goes from a to c. Yet the areas enclosed between these two paths and their chords are not the same.
It is now easy to remark that A(R²) is both a Lie algebra and a Lie group. For some introduction to these notions, see [17, 37, 67, 69, 71] among many other books.
Lemma 4. The space (A(R²), [·, ·]) is a Lie algebra.
Fig. 6e. The path C(x · y, 0), ending at a ⋆ b.
Fig. 6f. The path C(y · x, 0), ending at b ⋆ a.
Proof. Clearly, (a, b) → [a, b] is bilinear, [a, b] = −[b, a], and the Jacobi identity is easily satisfied: [a, [b, c]] + [b, [c, a]] + [c, [a, b]] = 0 for all a, b, c ∈ A(R²). This proves the lemma.
As for R³, A(R²) may be equipped with the multiplication by a scalar, which is λ · x := (λx¹, λx², λx³) for x = (x¹, x², x³) ∈ A(R²) and λ ∈ R. But unlike R³, this operation is not distributive with respect to ⋆, since λ · (x ⋆ y) = (λ · x) ⋆ (λ · y) + (λ(1 − λ)/2)[x, y]. Thus, (A(R²), ⋆, ·), where · denotes the multiplication by a scalar, is not a module.
Another external law equips A(R²) naturally, namely the dilation. Given λ ∈ R, set
δ_λ x = (λx¹, λx², λ²x³) for x = (x¹, x², x³) ∈ A(R²).  (14)
Note that δ_λ(x ⋆ y) = (δ_λ x) ⋆ (δ_λ y) and δ_λ δ_μ x = δ_{λμ} x for λ, μ ∈ R and x ∈ A(R²). However, we do not have δ_{λ+μ} x = (δ_λ x) ⋆ (δ_μ x). Hence, (A(R²), ⋆, δ) is not a module either.
This space A(R²) is equipped with a norm defined by
‖a‖ = max{|a¹|, |a²|, |a³|}  (15)
and a homogeneous norm defined by
|a| = max{|a¹|, |a²|, |a³|^{1/2}},  (16)
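The dilation and the homogeneous norm interact as stated; a quick numerical check on dyadic coordinates (δ written `dil`, the group operation `star`; all names are ours):

```python
def star(a, b):
    # the group operation on A(R^2)
    w = a[0] * b[1] - a[1] * b[0]
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2] + 0.5 * w)

def dil(lam, a):
    # the dilation (14): first-level components scale like lam, the area like lam^2
    return (lam * a[0], lam * a[1], lam * lam * a[2])

def hnorm(a):
    # the homogeneous norm (16)
    return max(abs(a[0]), abs(a[1]), abs(a[2]) ** 0.5)

a, b, lam, mu = (1.0, -2.0, 0.5), (0.25, 3.0, -1.0), 0.5, -2.0

assert dil(lam, star(a, b)) == star(dil(lam, a), dil(lam, b))  # dilations are group morphisms
assert dil(lam, dil(mu, a)) == dil(lam * mu, a)
assert hnorm(dil(lam, a)) == abs(lam) * hnorm(a)               # homogeneity
# but the dilation is not additive in lam:
print(dil(lam + mu, a), star(dil(lam, a), dil(mu, a)))
```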
which means that |a| = 0 if and only if a = 0, |δ_λ x| = |λ| · |x| for λ ∈ R and x ∈ A(R²), and |−x| = |x| for all x ∈ A(R²) (see also Section A). Remark that this choice ensures that |a ⋆ b| ≤ (3/2)(|a| + |b|). We will see below in Sections 5.9 and A that this homogeneous norm is equivalent to another homogeneous norm ‖·‖_CC, which allows us to define a distance between two points a and b in A(R²) by ‖(−a) ⋆ b‖_CC (with ‖·‖_CC, the triangle inequality is satisfied, which is not the case with |·|). Because of the square root in the definition of |·|, this distance is not equivalent to the one generated by the norm ‖·‖. Yet it generates the same topology.
Remark 5. Because |·| does not satisfy the triangle inequality, d : (a, b) → |(−a) ⋆ b| does not define a distance. However, it may be called a near-metric, because d(a, b) ≤ C(d(a, c) + d(c, b)) for some constant C > 0 and all a, c, b ∈ A(R²).
From this, we easily deduce that A(R²) is also a Lie group. We recall that a Lie group (G, ×) is a group with a differentiable manifold structure (and in particular a norm) such that (x, y) → x × y and x → x^{−1} are continuous (see for example [67, 71, 69, 37] and many other books).
Lemma 5. The space (A(R²), ⋆) is a Lie group.
Proof. The continuity of (x, y) → x ⋆ y and x → −x is easily proved.
5.5 Enhanced Paths and their Approximations
Of course, we have constructed the space A(R²) with the idea of considering paths living in A(R²), the third component giving all the information we need. Basically, a continuous path with values in A(R²) is a continuous path with values in the Euclidean space R³ (recall that the norm ‖·‖ we put on A(R²) is equivalent to the Euclidean norm). However, we will use the group operation of A(R²) in place of the translation by a vector in R³, and thus the paths we consider will be seen differently from usual paths. Recall that (R², +) is in some sense contained in (A(R²), ⋆), and then plays a special role.
Definition 3.
Given a continuous path x with values in R², a continuous path 𝐱 with values in A(R²) such that (𝐱¹, 𝐱²) = (x¹, x²) may then be called an enhanced path, or a path lying above x. Given a path x : [0, T] → R², a path 𝐱 : [0, T] → A(R²) which lies above x is called a lift of x.
Let x and y be two smooth paths lifted as 𝐱 = C(x, 0) and 𝐲 = C(y, 0), where C has been defined by (13). We have seen that the usual concatenation 𝐱 · 𝐲 of 𝐱 and 𝐲 seen as paths with values in R³ is different from the path
C(x · y, 0). We then introduce a new kind of concatenation, denoted ⊔, of two paths 𝐱 : [0, T] → A(R²) and 𝐲 : [0, S] → A(R²). This concatenation is defined by
(𝐱 ⊔ 𝐲)_t = 𝐱_t if t ∈ [0, T], and (𝐱 ⊔ 𝐲)_t = 𝐱_T ⋆ ((−𝐲₀) ⋆ 𝐲_{t−T}) if t ∈ [T, S + T],
and gives rise to a continuous path from [0, T + S] to A(R²) when 𝐱 and 𝐲 are continuous. In addition, 𝐱 ⊔ 𝐲 lies above x · y if 𝐱 (resp. 𝐲) lies above x (resp. y). Yet we have to be warned of an important fact: this concatenation is different from the usual concatenation in R³. If x : [0, T] → R² and y : [0, S] → R² are two piecewise smooth paths, then this concatenation satisfies C(x · y, 0) = C(x, 0) ⊔ C(y, 0).
For two points a and b in A(R²), let ψ_{a,b} ∈ C¹_p([0, 1]; R³) be a smooth path joining a and b and lying above some ζ_{a,b} : [0, 1] → R² (for example, we can use the one of Section 5.3). By definition of ζ_{a,b} and ψ_{a,b}, ψ_{a,b}(t) = C(ζ_{a,b}, a³; t). Moreover, for a, b, c in A(R²), ψ_{a,b} ⊔ ψ_{b,c} = C(ζ_{a,b} · ζ_{b,c}, a³). Thus, ψ_{a,b} ⊔ ψ_{b,c} is a path that goes from a to c through b.
Let 𝐱 be a continuous path from [0, T] to A(R²). It is then natural to look for an approximation of 𝐱 given by the sequence of paths
𝐱^n = ψ_{𝐱(t^n_0), 𝐱(t^n_1)} ⊔ ψ_{𝐱(t^n_1), 𝐱(t^n_2)} ⊔ ··· ⊔ ψ_{𝐱(t^n_{2^n−1}), 𝐱(t^n_{2^n})}.
The path 𝐱^n satisfies 𝐱^n(t) = 𝐱(t) at the dyadic times t at level n. In addition, 𝐱^n = C(ζ^n, 𝐱³(0)) with
ζ^n = ζ_{𝐱(t^n_0), 𝐱(t^n_1)} · ζ_{𝐱(t^n_1), 𝐱(t^n_2)} · ··· · ζ_{𝐱(t^n_{2^n−1}), 𝐱(t^n_{2^n})},
and it is easily proved that ζ^n converges uniformly to x, the path above which 𝐱 lies (see Figure 7). Now, there are two natural questions: (1) Provided 𝐱 is regular enough, does 𝐱^n converge to 𝐱, and in which sense? (2) Is it possible to construct I(𝐱) as the limit of the I(ζ^n)'s, which are then ordinary integrals?
5.6 Hölder Continuous Enhanced Paths
We have defined the space A(R²) as the space R³ with a special non-commutative group structure, whose operation differs from translation. Let x ∈ C^α([0, T]; R²) with α > 1/2 and x₀ = 0. Set 𝐱 = (x¹, x², A(x)). With (12),
(−𝐱_s) ⋆ 𝐱_t = (x¹_t − x¹_s, x²_t − x²_s, A(x; s, t)),
Fig. 7. Approximation of a path 𝐱 in A(R²) by 𝐱^n, lying above the approximation ζ^n of x.
which means that (−𝐱_s) ⋆ 𝐱_t can be constructed from the path x restricted to [s, t]. The same is true even if x₀ ≠ 0.
For a path 𝐱 from [0, T] to A(R²), 𝐱_{s,t} := (−𝐱_s) ⋆ 𝐱_t may then be interpreted as an “increment” of 𝐱, and indeed we get the trivial identity 𝐱_t = 𝐱_s ⋆ 𝐱_{s,t} for all 0 ≤ s ≤ t ≤ T, which is the equivalent of x_t = x_s + (x_t − x_s) in R². Note that in general 𝐱³_{s,t} is different from 𝐱³_t − 𝐱³_s, although 𝐱^i_{s,t} = 𝐱^i_t − 𝐱^i_s for i = 1, 2. Similarly, we may write the value of 𝐱_t at time t as a function of the values of 𝐱 at times s ≤ r ≤ t:
𝐱_t = 𝐱_s ⋆ 𝐱_{s,r} ⋆ 𝐱_{r,t}  (17)
for all 0 ≤ s ≤ r ≤ t ≤ T. When one sees 𝐱 as a geometric object, (17) yields, with the concatenation ⊔ of Section 5.5,
𝐱|_{[s,t]} = 𝐱|_{[s,r]} ⊔ 𝐱|_{[r,t]},  (18)
for all 0 s r t T . From now on, to take into account the fact that we work in A(R2 ), we have to think of paths from [0, T ] to A(R2 ) as continuous paths x satisfying (18), although this relation is satisfied by any continuous path from [0, T ] to R3 (which also means that there are infinitely many paths lying above a continuous path from [0, T ] to R2 ). But we will see below that if x lies above a smooth path x and is also quite regular (in a sense to be defined), then (18) and the regularity condition will impose some “constraint” on the path x. Lemma 6 ([55, Lemma 2.2.3, p. 250]). Let x and y be two continuous paths from [0, T ] to A(R2 ) such that (x1 , x2 ) = (y1 , y2 ). Then there exists a continuous path ϕ : [0, T ] → R such that y = (x1 , x2 , x3 + ϕ), which means that ((−ys ) yt )3 = ((−xs ) xt )3 + ϕt − ϕs for all 0 s t T .
(19)
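The identities above are statements about the group product on A(R²). A minimal numerical sketch, assuming (as in (12)) that the product is (a, b) ↦ (a^1 + b^1, a^2 + b^2, a^3 + b^3 + (a^1 b^2 − a^2 b^1)/2), so that the inverse of a is −a; the function names are mine:

```python
# Sketch of the group A(R^2) = R^3 with the product from (12):
# a * b = (a1+b1, a2+b2, a3+b3 + (a1*b2 - a2*b1)/2); the inverse of a is -a.
def mul(a, b):
    return (a[0] + b[0], a[1] + b[1],
            a[2] + b[2] + 0.5 * (a[0] * b[1] - a[1] * b[0]))

def inv(a):
    return (-a[0], -a[1], -a[2])

def increment(xs, xt):
    # the "increment" x_{s,t} = (-x_s) * x_t
    return mul(inv(xs), xt)

xs, xr, xt = (1.0, 2.0, 0.5), (0.0, -1.0, 2.0), (3.0, 1.0, -1.0)
# x_t = x_s * x_{s,t}: the analogue of x_t = x_s + (x_t - x_s) in R^2
assert mul(xs, increment(xs, xt)) == xt
# the refinement identity (17): x_t = x_s * x_{s,r} * x_{r,t}
assert mul(xs, mul(increment(xs, xr), increment(xr, xt))) == xt
```

Note that the first two components of an increment are plain differences, while the third one picks up the antisymmetric cross term, which is exactly why x^3_{s,t} ≠ x^3_t − x^3_s in general.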
Proof. It is sufficient to put ϕ_t = ((−y_0) y_t)^3 − ((−x_0) x_t)^3, which clearly satisfies (19).

Notation 3. Denote by C^α([0, T]; A(R²)) the set of continuous paths x : [0, T] → A(R²) such that

‖x‖_α = |x_0| + sup_{0≤s<t≤T} |(−x_s) x_t| / |t − s|^α

is finite. If x = (x^1, x^2) is a path in C^α([0, T]; R²) and x = (x^1, x^2, y) a path in C^α([0, T]; A(R²)), then we say that x lies above x.

Lemma 7. Let x ∈ C^α([0, T]; R²) with α > 1/2. Then x = (x, A(x; 0, ·)) belongs to C^α([0, T]; A(R²)). In addition, the map x ↦ x is Lipschitz continuous from (C^α([0, T]; R²), ‖·‖_α) to (C^α([0, T]; A(R²)), ‖·‖_α).

Proof. By construction, x is a path with values in A(R²). Note that (−x_s) x_t = (x^1_t − x^1_s, x^2_t − x^2_s, A(x; s, t)). From the construction of the Young integral (more specifically, from a variation of (4)),

|A(x; s, t)| ≤ ζ(2α − 1)(t − s)^{2α} ‖x‖²_α   (20)
and then the result is proved.

Note that in the previous proof, (20) does not mean that t ↦ A(x; 0, t) is 2α-Hölder continuous (in which case 2α > 1!). Indeed, t ↦ A(x; 0, t) is only α-Hölder continuous, since x is α-Hölder continuous. On the other hand, any path in C^α([0, T]; A(R²)) with α > 1/2 can be expressed as a path x ∈ C^α([0, T]; R²) lifted using its area A(x).

Lemma 8. Let x ∈ C^α([0, T]; A(R²)) with α > 1/2. Then x = C(x, x^3_0) = (x, x^3_0 + A(x)) with x = (x^1, x^2).

Remark 6. If for some α > 1/2, a sequence (x^n)_{n∈N} in C^α([0, T]; A(R²)) is composed of paths of type x^n = (x^n, A(x^n)) with x^n ∈ C^α([0, T]; R²) and x^n converges in C^α([0, T]; A(R²)) to some x, then x ∈ C^α([0, T]; A(R²)) is necessarily of type x = (x, A(x)) with x ∈ C^α([0, T]; R²). In Proposition 2 below, we will see how to construct a family of paths x^n in C^1([0, T]; R²) for which x^n = (x^n, A(x^n)) converges to x ∈ C^α([0, T]; A(R²)) with α > 1/3. Thus, if one considers a path with values in A(R²) which is not of type (x, A(x)) but which is piecewise smooth, one has to interpret it as a path in C^{1/2}([0, T]; A(R²)) in order to identify it with a family of converging paths.

Proof. From Lemma 7, y = C(x, x^3_0) belongs to C^α([0, T]; A(R²)), and from Lemma 6, there exists a function ϕ : [0, T] → R such that ((−x_s) x_t)^3 = ((−y_s) y_t)^3 + ϕ_t − ϕ_s for all 0 ≤ s ≤ t ≤ T. Hence, |ϕ_t − ϕ_s| ≤ ‖x‖_α |t − s|^α and then |ϕ_t − ϕ_s| ≤ ‖x‖²_α |t − s|^{2α}. As α > 1/2, necessarily ϕ is constant.
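For α > 1/2 the lift of Lemma 7 is computable: A(x; s, t) is the Lévy area ½∫_s^t (x^1_u − x^1_s) dx^2_u − (x^2_u − x^2_s) dx^1_u, a Young integral. A small numerical sketch (the sampling and the function name are mine), checked on a counter-clockwise unit circle, whose enclosed area is π:

```python
import math

def levy_area(points):
    # A(x; 0, T) = (1/2) * integral of (x1 - x1_0) dx2 - (x2 - x2_0) dx1,
    # evaluated exactly for the piecewise linear path through `points`
    # (trapezoid rule, which is exact on each linear segment).
    x0, y0 = points[0]
    a = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        a += 0.5 * ((x1 - x0) + (x2 - x0)) * 0.5 * (y2 - y1) \
           - 0.5 * ((y1 - y0) + (y2 - y0)) * 0.5 * (x2 - x1)
    return a

N = 2000
circle = [(math.cos(2 * math.pi * k / N), math.sin(2 * math.pi * k / N))
          for k in range(N + 1)]
# for a closed loop, the Lévy area is the enclosed algebraic area
assert abs(levy_area(circle) - math.pi) < 1e-4
# for a straight segment it vanishes, as for any chord
assert levy_area([(0.0, 0.0), (2.0, 3.0)]) == 0.0
```

This is the quantity whose Hölder-type bound (20) makes the map x ↦ (x, A(x; 0, ·)) well defined and Lipschitz when α > 1/2.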
As we saw earlier, one can add a path with values in R to the third component of a path with values in A(R²) to get a new path with values in A(R²). Although a path with values in R² which is regular enough can be naturally lifted as a path with values in R³, we gain one degree of freedom: there are infinitely many paths that lie above a path in R². The next lemma, whose proof is immediate, specifies the kind of paths we have to use to stay in C^α([0, T]; A(R²)).

Lemma 9. For α ≤ 1/2, let x ∈ C^α([0, T]; A(R²)) and ϕ ∈ C^{2α}([0, T]; R). Then y = (x^1, x^2, x^3 + ϕ) belongs to C^α([0, T]; A(R²)).

Any path in C^α([0, T]; A(R²)) can be seen as a limit of paths naturally constructed above a path of finite variation. Before proving this, we state a lemma on relative compactness, which is just an adaptation of Lemma 2.

Lemma 10. Let (x^n)_{n∈N} be a bounded sequence in C^α([0, T]; A(R²)). Then there exist x in C^α([0, T]; A(R²)) and a subsequence of (x^n)_{n∈N} that converges to x in (C^α([0, T]; A(R²)), ‖·‖_β) for each β < α.

We shall now prove the main result of this section: any path x in C^α([0, T]; A(R²)) with α ∈ (1/3, 1/2) may be identified as the limit of C(x^n, x^3_0), where the x^n are paths in C^∞_p([0, T]; R²). Paths taking their values in A(R²) are then objects that are easier to deal with than sequences of paths with loops, as we did previously.

Let x ∈ C^α([0, T]; A(R²)) with α ∈ (1/3, 1/2) lying above x. Denote by Π^n x the linear interpolation of x along the dyadic partition Π^n = {t^n_k}_{k=0,...,2^n} at level n, with t^n_k = Tk/2^n. Also define

θ^n_k = ((−x_{t^n_k}) x_{t^n_{k+1}})^3.   (21a)

Set Φ^n = {y^n_k}_{k=0,...,2^n−1} with y^n_k : [t^n_k, t^n_{k+1}] → R² and

y^n_k(t) = √(|θ^n_k|/π) ( cos(2π(t − t^n_k)/(t^n_{k+1} − t^n_k)) − 1, sgn(θ^n_k) sin(2π(t − t^n_k)/(t^n_{k+1} − t^n_k)) ).   (21b)

Finally, set

x^n_t = (x^{Π^n} ⊔ Φ^n)(t/2) for t ∈ [0, T] and x^n = (x^n, x^3_0 + A(x^n; 0, ·)),   (21c)

where ⊔ denotes concatenation: on each dyadic interval [t^n_k, t^n_{k+1}], the loop y^n_k is run on the first half and the chord of x^{Π^n} on the second half, each at double speed. This corresponds to joining the points of {x_{t^n_k}}_{k=0,...,2^n} by the simple paths constructed in Section 5.3 (see Figure 6).

Proposition 2. With the previous notations (21a)–(21c), (x^n)_{n∈N} is uniformly bounded in C^α([0, T]; A(R²)) and converges to x with respect to ‖·‖_β for all β < α.
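The whole point of (21b) is that y^n_k is a loop based at 0 whose algebraic enclosed area is exactly θ^n_k: a circle of radius √(|θ^n_k|/π), run once, counter-clockwise when θ^n_k > 0. A numerical sketch (discretization and names are mine):

```python
import math

def loop(theta, u):
    # the loop of (21b), at normalized time u in [0, 1]
    r = math.sqrt(abs(theta) / math.pi)
    return (r * (math.cos(2 * math.pi * u) - 1.0),
            r * math.copysign(1.0, theta) * math.sin(2 * math.pi * u))

def enclosed_area(points):
    # shoelace formula for the closed polygon through `points`
    return 0.5 * sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(points, points[1:]))

for theta in (0.7, -0.3):
    pts = [loop(theta, k / 4000) for k in range(4001)]
    # the loop starts and ends at 0 ...
    assert max(abs(c) for c in loop(theta, 1.0)) < 1e-12
    # ... and encloses the algebraic area theta (with its sign)
    assert abs(enclosed_area(pts) - theta) < 1e-4
```

So the interpolation x^n reproduces the increments of x along the dyadic points, and the loops restore the missing area θ^n_k on each interval.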
Remark 7. We have considered a path x in C^α([0, T]; A(R²)) above a path x ∈ C^α([0, T]; R²), but we have not shown how to construct such a path, except when α > 1/2. For that, we may either use the results in [54], which assert that it is always possible to do so, or study particular cases. For example, many trajectories of stochastic processes have been dealt with (Brownian motion [65], semi-martingales [13], fractional Brownian motion [14, 15, 62], Wiener process [50], Gaussian processes [30, 31], free Brownian motion [70], . . . The book [28] contains many such constructions). In general, these results are obtained in connection with an approximation of Wong-Zakai type. Choosing a path x above x corresponds to a determination of the limit of A(x^n; s, t) where x^n converges to x, and is then a slightly weaker hypothesis than (H2).

Proof (Proof of Proposition 2). Note first that x^n_{t^n_k} = x_{t^n_k}. For t ∈ [0, T), let M(t, n) be the largest integer such that t^n_{M(t,n)} ≤ t. Then, for 0 ≤ t < T,

|x^n_t − x_t| ≤ |x^n_t − x^n_{t^n_{M(t,n)}}| + |x_t − x_{t^n_{M(t,n)}}|
≤ max{ √(|θ^n_k|/π), |x_{t^n_{M(t,n)+1}} − x_{t^n_{M(t,n)}}| } + ‖x‖_α (t − t^n_{M(t,n)})^α
≤ 2 ‖x‖_α T^α 2^{−αn}.

This proves that x^n converges uniformly to x. Convergence in C^β([0, T]; A(R²)) follows from the uniform boundedness of the α-Hölder norm of x^n and Lemma 10. So, it remains to estimate the α-Hölder norm of x^n in A(R²). For 0 ≤ s < t ≤ T, let M(s, n) be the smallest integer such that s ≤ t^n_{M(s,n)}. Then, unless s and t belong to the same dyadic interval [t^n_k, t^n_{k+1}] for some k = 0, . . . , 2^n − 1,

x^n_{s,t} = x^n_{s, t^n_{M(s,n)}} x^n_{t^n_{M(s,n)}, t^n_{M(t,n)}} x^n_{t^n_{M(t,n)}, t}

for all 0 ≤ s < t ≤ T. In addition, x^n_{t^n_{M(s,n)}, t^n_{M(t,n)}} = x_{t^n_{M(s,n)}, t^n_{M(t,n)}} for any integer n. Since |·| is a homogeneous norm on A(R²), it follows that for some universal constant C_0,

|x^n_{s,t}| ≤ C_0 |x^n_{s, t^n_{M(s,n)}}| + C_0 |x_{t^n_{M(s,n)}, t^n_{M(t,n)}}| + C_0 |x^n_{t^n_{M(t,n)}, t}|
≤ C_0 |x^n_{s, t^n_{M(s,n)}}| + C_0 ‖x‖_α (t^n_{M(t,n)} − t^n_{M(s,n)})^α + C_0 |x^n_{t^n_{M(t,n)}, t}|.

Assume that we have proved that for some constant K,

|x^n_{s,t}| ≤ K(t − s)^α for all t^n_k ≤ s ≤ t ≤ t^n_{k+1}, k = 0, . . . , 2^n − 1;   (22)

then the boundedness of (‖x^n‖_α)_{n∈N} follows easily, as in the proof of Corollary 1, by applying (22) to s, t in the same dyadic interval, as well as to |x^n_{s, t^n_{M(s,n)}}| and to |x^n_{t^n_{M(t,n)}, t}|.
We now turn to the proof of (22). First, consider that for some k ∈ {0, . . . , 2^n − 1}, either s, t ∈ [t^n_k, t^n_k + T2^{−n−1}] or s, t ∈ [t^n_k + T2^{−n−1}, t^n_{k+1}]. In the latter case,

x^n_{s,t} def= (−x^n_s) x^n_t = ( T^{−1}2^{n+1}(t − s)(x^1_{t^n_{k+1}} − x^1_{t^n_k}), T^{−1}2^{n+1}(t − s)(x^2_{t^n_{k+1}} − x^2_{t^n_k}), 0 )

and then |x^n_{s,t}| ≤ ‖x‖_α |t − s|^α. In the former case, setting Δ_n t = T2^{−n},

x^n_{s,t} = ( √(|θ^n_k|/π) (cos(2π(t − t^n_k)/Δ_{n+1}t) − cos(2π(s − t^n_k)/Δ_{n+1}t)),
sgn(θ^n_k) √(|θ^n_k|/π) (sin(2π(t − t^n_k)/Δ_{n+1}t) − sin(2π(s − t^n_k)/Δ_{n+1}t)),
θ^n_k (t − s)/Δ_{n+1}t ).

Thus, for some universal constant C_1,

|x^n_{s,t}| ≤ C_1 2^{n+1} √(|θ^n_k|) (t − s)/T ≤ C_2 ‖x‖_α (t − s)^α,

where C_2 depends only on C_1 and T. Now, if t^n_k ≤ s ≤ t^n_k + T2^{−n−1} ≤ t ≤ t^n_{k+1}, we get by combining the previous estimates that

|x^n_{s,t}| ≤ C_0 C_2 ‖x‖_α ((t − t^n_k − T2^{−n−1})^α + (t^n_k + T2^{−n−1} − s)^α) ≤ 2^{1−α} C_0 C_2 ‖x‖_α (t − s)^α.

We have then proved (22), with a constant which is in addition proportional to ‖x‖_α.

Let us come back to Remark 6, which follows Lemma 8. For α ∈ (1/3, 1/2], consider x_t = (0, 0, ϕ_t) where ϕ ∈ C^{2α}([0, T]; R); then one can find x^n ∈ C^1_p([0, T]; R²) such that x^n converges uniformly to 0, x^n = (x^n, A(x^n; 0, ·)) is uniformly bounded in C^α([0, T]; A(R²)) and converges in C^β([0, T]; A(R²)) to x for any β < α. For this, one may simply consider (see Figure 8)

z^n_t = (1/(n√π)) (cos(2πtn²) − 1, sin(2πtn²)),

and then set x^n_t = z^n_{ϕ_t}.
Fig. 8. Moving freely in the third direction.
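The mechanism of Figure 8 can be observed numerically: taking ϕ_t = t, the loops z^n stay within distance 2/(n√π) of the origin, yet by time t they have wound tn² times around a circle of area 1/n², so the accumulated area A(z^n; 0, t) is close to t. A sketch (the sampling is my choice):

```python
import math

def z(n, t):
    # z^n_t = (1/(n*sqrt(pi))) * (cos(2*pi*t*n^2) - 1, sin(2*pi*t*n^2)):
    # by time t it winds t*n^2 times around a circle of area pi*r^2 = 1/n^2
    c = 1.0 / (n * math.sqrt(math.pi))
    return (c * (math.cos(2 * math.pi * t * n * n) - 1.0),
            c * math.sin(2 * math.pi * t * n * n))

def levy_area(path):
    # A(z; 0, t) = (1/2) * integral of z1 dz2 - z2 dz1 (trapezoid rule)
    return sum(0.25 * ((x1 + x2) * (y2 - y1) - (y1 + y2) * (x2 - x1))
               for (x1, y1), (x2, y2) in zip(path, path[1:]))

n, t = 10, 1.0
path = [z(n, t * k / 20000) for k in range(20001)]
# uniformly small: sup |z^n| <= 2/(n*sqrt(pi)), which tends to 0 ...
assert all(abs(u) <= 2.0 / (n * math.sqrt(math.pi)) + 1e-12 for p in path for u in p)
# ... while the accumulated area A(z^n; 0, t) stays close to t
assert abs(levy_area(path) - t) < 1e-2
```

So the paths collapse in the plane while their areas survive in the limit, which is exactly the degeneracy described in Remark 6.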
Thus, moving freely in the "third direction" is equivalent to accumulating areas of small loops. Using the language of differential geometry, which we develop below, this new degree of freedom comes from the lack of commutativity of (A(R²), ): a small loop of radius √ε around the origin in the plane R² is equivalent in some sense to a small displacement of length ε in the third direction. To rephrase Remark 6, even if ϕ ∈ C^1([0, T]; R), one has to see x as a path in C^{1/2}([0, T]; A(R²)) that may be approximated by paths in C^1_p([0, T]; A(R²)) (here, Lipschitz continuous paths with values in A(R²)) which converge to x only in ‖·‖_β for any β < 1/2. Hence, we recover the problem underlined in Section 3.2.

5.7 Construction of the Integral

If x ∈ C^α([0, T]; A(R²)) with α > 1/2, then from Lemma 8, x = (x, x^3_0 + A(x)) with x = (x^1, x^2). For a differential form f ∈ Lip(γ; R² → R) with γ > 1/α − 1, we set I(x) def= I(x) = ∫_{x|_{[0,·]}} f, which is well defined as a Young integral. The next proposition will be refined later.

Proposition 3. Let x ∈ C^α([0, T]; A(R²)) with α ∈ (1/3, 1/2] and f be a differential form in Lip(γ; R² → R) with γ > 1/α − 1. Let (x^n)_{n∈N} be constructed by (21a)–(21c). Then (I(x^n))_{n∈N} has a unique limit in (C^α([0, T]; R), ‖·‖_β) for all β < α, which we denote by I(x) (of course, the limit does not depend on β). Both the α-Hölder continuity modulus of I(x) and the rate of convergence with respect to ‖·‖_β depend only on T, α, γ, β, ‖x‖_α and ‖f‖_Lip.

Other properties of this map x ↦ I(x) will be proved below. Indeed, this map is obviously an extension of the one we have constructed beforehand on L^{1,α}([0, T]; R³), with a more convenient way to encode the loops.

Proof. Fix a dyadic level n. Remark first that for k ∈ {0, . . . , 2^n − 1} and t^n_k ≤ s < t ≤ t^n_{k+1},

I(x^n; s, t) =
• ∫_{Part_n(s,t)} [f, f](z^1, z^2) dz^1 dz^2 + ∫_{[x^n_s, x^n_t]} f, if t^n_k ≤ s ≤ t ≤ t^n_k + T2^{−n−1};
• ∫_{Part_n(s, t^n_k + T2^{−n−1})} [f, f](z^1, z^2) dz^1 dz^2 + ∫_{[x^n_s, x^n_{t^n_k + T2^{−n−1}}]} f + ∫_{t^n_k}^{t^n_k + 2(t − t^n_k − 2^{−n−1}T)} f(x^{Π^n}_r) dx^{Π^n}_r, if t^n_k ≤ s ≤ t^n_k + T2^{−n−1} ≤ t ≤ t^n_{k+1};
• ∫_{t^n_k + 2(s − t^n_k − 2^{−n−1}T)}^{t^n_k + 2(t − t^n_k − 2^{−n−1}T)} f(x^{Π^n}_r) dx^{Π^n}_r, if t^n_k + T2^{−n−1} ≤ s < t ≤ t^n_{k+1},
where Part_n(s, t) stands for the portion of the disk enclosed between the loop x^n|_{[t^n_k, t^n_k + T2^{−n−1}]} and the segment [x^n_s, x^n_t]. Of course, the integral of f over Part_n(t^n_k, t^n_k + T2^{−n−1}) is the integral of [f, f] over the surface of the loop x^n|_{[t^n_k, t^n_k + T2^{−n−1}]}.

If t^n_k ≤ s < t ≤ t^n_k + T2^{−n−1}, then the algebraic area of Part_n(s, t) is θ^n_k (t − s)2^{n+1}/T. In addition, the maximal distance between two points in Part_n(s, t) is smaller than 2√(|θ^n_k|)(t − s)2^{n+1}/T. As [f, f] is (γ − 1)-Hölder continuous, we deduce that for r ∈ [s, t], there exists a constant C that depends only on T such that

| ∫_{Part_n(s,t)} [f, f](z^1, z^2) dz^1 dz^2 − [f, f](x_s) θ^n_k (t − s)/(T2^{−n−1}) | ≤ C ‖f‖_Lip ‖x‖^{1+γ}_α (t − s)^{α(1+γ)}   (23)

since |θ^n_k| ≤ ‖x‖²_α 2^{−2nα}. We also deduce that for some constant C that depends only on T, ‖x‖_α and ‖f‖_Lip,

| ∫_{Part_n(s,t)} [f, f](z^1, z^2) dz^1 dz^2 | ≤ C (t − s)^{2α}.   (24)

In addition, since from Proposition 2, x^n is α-Hölder continuous with some constant that depends only on ‖x‖_α, there exists some constant C such that

| ∫_{[x^n_s, x^n_t]} f | ≤ ‖f‖_∞ C (t − s)^α.   (25)

If t^n_k + T2^{−n−1} ≤ s < t ≤ t^n_{k+1}, then

| ∫_{t^n_k + 2(s − t^n_k − 2^{−n−1}T)}^{t^n_k + 2(t − t^n_k − 2^{−n−1}T)} f(x^{Π^n}_r) dx^{Π^n}_r | ≤ ‖f‖_Lip ‖x‖_α (2^n/T)^{1−α} (t − s) ≤ ‖f‖_Lip ‖x‖_α (t − s)^α.   (26)

It follows from (24), (25) and (26) that for some constant C_1 that depends only on ‖f‖_Lip and ‖x‖_α,

|I(x^n; s, t)| ≤ C_1 (t − s)^α   (27)

for all t^n_k ≤ s ≤ t ≤ t^n_{k+1}, k = 0, . . . , 2^n − 1. Yet this is not sufficient to bound |I(x^n; s, t)| by C(t − s)^α for all 0 ≤ s < t ≤ T. We then use another computation.

First remark that t^{n+1}_{2k} = t^n_k, t^{n+1}_{2k+2} = t^n_{k+1} and that

I(x^{Π^{n+1}}; t^{n+1}_{2k}, t^{n+1}_{2k+1}) + I(x^{Π^{n+1}}; t^{n+1}_{2k+1}, t^{n+1}_{2k+2}) − I(x^{Π^n}; t^{n+1}_{2k}, t^{n+1}_{2k+2}) = ∫_{T^n_k} [f, f](z) dz,
where T^n_k = Triangle(x_{t^{n+1}_{2k}}, x_{t^{n+1}_{2k+1}}, x_{t^{n+1}_{2k+2}}) with area

Area(T^n_k) = −(1/2)(x_{t^{n+1}_{2k+1}} − x_{t^{n+1}_{2k}}) ∧ (x_{t^{n+1}_{2k+2}} − x_{t^{n+1}_{2k+1}}).

In addition,

I(x^n; t^n_k, t^n_k + T2^{−n−1}) = ∫_{Part_n(t^n_k, t^n_k + T2^{−n−1})} [f, f](z^1, z^2) dz^1 dz^2 = [f, f](x_{t^n_k}) θ^n_k + ζ^n_k,

where, from (23), |ζ^n_k| ≤ C_2 2^{−nα(1+γ)} for some constant C_2 that depends only on ‖x‖_α, ‖f‖_Lip and T. Recall that from (12),

θ^{n+1}_{2k} + θ^{n+1}_{2k+1} + (1/2)(x_{t^{n+1}_{2k+1}} − x_{t^{n+1}_{2k}}) ∧ (x_{t^{n+1}_{2k+2}} − x_{t^{n+1}_{2k+1}}) = θ^n_k.

Hence, we easily get

I(x^{n+1}; t^{n+1}_{2k}, t^{n+1}_{2k+1}) + I(x^{n+1}; t^{n+1}_{2k+1}, t^{n+1}_{2k+2}) − I(x^n; t^{n+1}_{2k}, t^{n+1}_{2k+2})
= ζ^{n+1}_{2k} + ζ^{n+1}_{2k+1} − ζ^n_k + ([f, f](x_{t^{n+1}_{2k+1}}) − [f, f](x_{t^{n+1}_{2k}})) θ^{n+1}_{2k+1} + ξ^n_k,

where

ξ^n_k = ∫_{T^n_k} [f, f](z^1, z^2) dz^1 dz^2 − [f, f](x_{t^{n+1}_{2k}}) Area(T^n_k).

As in (23),

|ξ^n_k| ≤ ‖f‖_Lip ‖x‖^{1+γ}_α Δ_n t^{α(γ+1)},

where Δ_n t = T2^{−n}. Thus, for some constant C_3 that depends only on ‖f‖_Lip and ‖x‖_α,

|I(x^{n+1}; t^{n+1}_{2k}, t^{n+1}_{2k+1}) + I(x^{n+1}; t^{n+1}_{2k+1}, t^{n+1}_{2k+2}) − I(x^n; t^{n+1}_{2k}, t^{n+1}_{2k+2})| ≤ C_3 2^{−nα(γ+1)}.   (28)

For m ≤ n and k ∈ {0, . . . , 2^m − 1},

I(x^n; t^m_k, t^m_{k+1}) − I(x^m; t^m_k, t^m_{k+1}) = Σ_{ℓ=m}^{n−1} ( I(x^{ℓ+1}; t^m_k, t^m_{k+1}) − I(x^ℓ; t^m_k, t^m_{k+1}) ).

As there are exactly 2^{ℓ−m} dyadic intervals of the form [t^ℓ_i, t^ℓ_{i+1}] contained in [t^m_k, t^m_{k+1}] for all ℓ ≥ m, we deduce from the Chasles relation and (28) that

|I(x^n; t^m_k, t^m_{k+1}) − I(x^m; t^m_k, t^m_{k+1})| ≤ C_3 Σ_{ℓ=m}^{n−1} 2^{ℓ−m} 2^{−ℓα(γ+1)} ≤ C_4 2^{−mα(γ+1)},   (29)

where C_4 depends on C_3 and on the choice of α and γ (note that our choice of α and γ ensures that the involved series converges as n → ∞).
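The convergence invoked in (29) is just a geometric series: since α(γ + 1) > 1, the level-ℓ defects 2^{ℓ−m}·2^{−ℓα(γ+1)} summed over ℓ ≥ m stay of order 2^{−mα(γ+1)}. A quick numerical check (the sample values α = 0.4, γ = 2.5 are my choice):

```python
# With a = alpha*(gamma+1) > 1, sum over l >= m of 2^(l-m) * 2^(-l*a)
# is bounded by 2^(-m*a) / (1 - 2^(1-a)), which plays the role of C_4 in (29).
alpha, gamma = 0.4, 2.5          # any choice with alpha*(gamma+1) > 1 works
a = alpha * (gamma + 1.0)        # here a = 1.4 > 1
C4 = 1.0 / (1.0 - 2.0 ** (1.0 - a))
for m in range(12):
    s = sum(2.0 ** (l - m) * 2.0 ** (-l * a) for l in range(m, m + 80))
    assert s <= C4 * 2.0 ** (-m * a) * (1.0 + 1e-12)
```

This is why the restriction γ > 1/α − 1 appears in Proposition 3: it is exactly the condition α(γ + 1) > 1.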
We now choose for m(0) the smallest integer such that there exists some k ∈ {0, . . . , 2^{m(0)} − 1} for which [t^{m(0)}_k, t^{m(0)}_{k+1}] ⊂ [t^n_{M(s,n)}, t^n_{M(t,n)}], where M(s, n) (resp. M(t, n)) is the smallest (resp. the largest) integer such that s ≤ t^n_{M(s,n)} (resp. t^n_{M(t,n)} ≤ t). From the Chasles relation,

I(x^n; t^n_{M(s,n)}, t^n_{M(t,n)}) = I(x^n; t^n_{M(s,n)}, t^{m(0)}_k) + I(x^n; t^{m(0)}_k, t^{m(0)}_{k+1}) + I(x^n; t^{m(0)}_{k+1}, t^n_{M(t,n)}).

By combining (27) and (29), we get |I(x^n; t^{m(0)}_k, t^{m(0)}_{k+1})| ≤ C_5 2^{−m(0)α} for some constant C_5 that depends only on T, α, γ, ‖f‖_Lip and ‖x‖_α. We may now find some integers m(1) and k(1) such that [t^{m(1)}_{k(1)}, t^{m(1)}_{k(1)+1}] is the biggest interval of this type contained in [t^n_{M(s,n)}, t^{m(0)}_k], in order to estimate I(x^n; t^n_{M(s,n)}, t^{m(0)}_k). Similarly, we can find some integers m′(1) and k′(1) such that [t^{m′(1)}_{k′(1)}, t^{m′(1)}_{k′(1)+1}] is the biggest interval of this type contained in [t^{m(0)}_{k+1}, t^n_{M(t,n)}], in order to estimate I(x^n; t^{m(0)}_{k+1}, t^n_{M(t,n)}). Note that necessarily, m(1) and m′(1) are strictly greater than m(0). Hence, proceeding recursively, we obtain with (23) and (29) that

|I(x^n; t^n_{M(s,n)}, t^n_{M(t,n)})| ≤ C_5 2^{−m(0)α} + Σ_{j∈J} C_5 2^{−m(j)α} + Σ_{j∈J′} C_5 2^{−m′(j)α},

where (m(j))_{j∈J} and (m′(j))_{j∈J′} are two finite increasing families of integers, bounded by n and greater than m(0). This kind of computation is the core of the proof of the Kolmogorov Lemma (see for example the Corollary of Theorem 4.5 in [40]) and is also an important tool in the theory of rough paths. It is also close to the one used in [21]. For some constant C_6, we then obtain that

|I(x^n; t^n_{M(s,n)}, t^n_{M(t,n)})| ≤ C_6 2^{−m(0)α}.

Note that T2^{−m(0)} ≤ t^n_{M(t,n)} − t^n_{M(s,n)} < T2^{−m(0)+1}. With (27) and the Chasles relation, we then obtain that

|I(x^n; s, t)| ≤ C_1 (t^n_{M(s,n)} − s)^α + C_6 2^{−m(0)α} + C_1 (t − t^n_{M(t,n)})^α ≤ 3 max{C_1, C_6/T^α} (t − s)^α.   (30)

Since I(x^n; 0) = 0, this proves that (I(x^n))_{n∈N} is uniformly bounded in (C^α([0, T]; R), ‖·‖_α). It follows that there exists a convergent subsequence in (C^α([0, T]; R), ‖·‖_β), whose limit, denoted by I(x), is also an α-Hölder continuous function.
We may however give more information on the limit. With (29) and (30), for some constant C_7 and any integers 0 ≤ m ≤ n and any 0 ≤ s ≤ t ≤ T with t − s > T2^{−m},

|I(x^n; s, t) − I(x^m; s, t)| ≤ C_7 (t^m_{M(s,m)} − s)^α + C_7 (t − t^m_{M(t,m)})^α + C_4 (M(t, m) − M(s, m)) 2^{−mα(γ+1)}.

As M(t, m) − M(s, m) ≤ 2^m and ε = α(γ + 1) − 1 > 0, it follows that

|I(x^n; s, t) − I(x^m; s, t)| ≤ C_7 (t^m_{M(s,m)} − s)^α + C_7 (t − t^m_{M(t,m)})^α + C_4 2^{−mε}.   (31)

Set

R_m(s, t; α, ε) = max{ C_7 (t^m_{M(s,m)} − s)^α, C_7 (t − t^m_{M(t,m)})^α, C_4 2^{−mε} }.

As R_m(s, t; α, ε) converges to 0 when m → ∞, the sequence (I(x^n; s, t))_{n∈N} is a Cauchy sequence for any 0 ≤ s ≤ t ≤ T, and so has a unique limit. Necessarily, this limit is I(x; s, t). Besides, we get from (31) that for some constant C_8 and any β < min{α, ε},

|I(x; s, t) − I(x^m; s, t)| ≤ C_8 (t − s)^β R_m(s, t; α − β, ε − β),

when m is large enough so that T2^{−m} < t − s. If T2^{−m} > t − s, then there is at most one point t^m_k such that s ≤ t^m_k ≤ t, and then for some constant C_9,

|I(x; s, t) − I(x^m; s, t)| ≤ |I(x; s, t)| + |I(x^m; s, t)| ≤ C_9 (t − s)^α ≤ C_9 T^{α∧ε−β} 2^{−m(α∧ε−β)} (t − s)^β.

We get that the whole sequence (I(x^n))_{n∈N} converges to I(x) in the space (C^α([0, T]; R), ‖·‖_β) for any β < α ∧ ε. Since (I(x^n))_{n∈N} is bounded in C^α([0, T]; R) and since C^α([0, T]; R) is contained in C^ε([0, T]; R) for ε < α, (I(x^n))_{n∈N} converges to I(x) in (C^α([0, T]; R), ‖·‖_β) for any β < α. The proposition is thus proved.

Corollary 4. Let (x^n)_{n∈N} be a sequence of paths converging to x in the space (C^α([0, T]; A(R²)), ‖·‖_α). Then for all β < α, I(x^n; 0, ·) converges to I(x; 0, ·) in (C^α([0, T]; R), ‖·‖_β).

Proof. The proof follows the same lines as the proof of Proposition 1. To simplify the notation, denote x by x^∞. Since x^n is convergent in C^α([0, T]; A(R²)), the sequence (‖x^n‖_α)_{n∈N} is bounded and then, from Proposition 3, (I(x^n))_{n∈N} is bounded in the space (C^α([0, T]; R), ‖·‖_α).
For n ∈ N ∪ {∞}, let (x^{n,m})_{m∈N} be the sequence of paths converging to x^n given by Proposition 2. We have seen in Proposition 3 that for any β < α, there exists some constant K^n that depends on ‖x^n‖_α such that ‖I(x^{n,m}) − I(x^n)‖_β ≤ K^n 2^{m(β−α)}. In addition, the sequence (K^n)_{n∈N} is bounded if (‖x^n‖_α)_{n∈N} is bounded. As I(x^{n,m}) is a Young integral, it follows from Corollary 2 that I(x^{n,m}) converges to I(x^{∞,m}) in (C^α([0, T]; R), ‖·‖_β). Hence, this is sufficient to prove that I(x^n) converges to I(x) in (C^α([0, T]; R), ‖·‖_β), as in the proof of Proposition 1.
Remark 8. Consider the following equivalence relation ∼ between two sequences (x^n)_{n∈N} and (y^n)_{n∈N} of paths converging in (C^α([0, T]; R²), ‖·‖_β) with α > 1/2 and β ∈ (1/3, 1]: (x^n)_{n∈N} ∼ (y^n)_{n∈N} if x def= lim_{n∈N} C(x^n, 0) = lim_{n∈N} C(y^n, 0) in (C^γ([0, T]; A(R²)), ‖·‖_β) for some γ > β. This implies that I(x^n; s, t) and I(y^n; s, t) converge to the same limit I(x; s, t). Hence, it is possible to identify (C^γ([0, T]; A(R²)), ‖·‖_γ) with the quotient space (C^α([0, T]; R²), ‖·‖_β)^N/∼, and two elements in the same equivalence class give rise to the same integral.

Here, we have used dyadic partitions², so one may ask whether I(x; s, t) is equal to I(x|_{[s,t]}). As this is true for ordinary integrals, we easily get the following result.

Lemma 11. Let x be in C^α([0, T]; A(R²)). Then, for all 0 ≤ s ≤ t ≤ T, I(x; s, t) = I(x|_{[s,t]}).

From this lemma, we deduce that if x ∈ C^α([0, T]; A(R²)) and y ∈ C^α([0, S]; A(R²)), then

I(x y; 0, t) = I(x; 0, t) if t ∈ [0, T], and I(x y; 0, t) = I(x; 0, T) + I(y; 0, t − T) if t ∈ [T, T + S].

Proof. This lemma means that the integral constructed using the dyadics on [0, T] but restricted to [s, t] corresponds to the integral constructed using the dyadics on [s, t]. One knows that such a relation holds for ordinary integrals, since the integral does not depend on the choice of the family of partitions on which approximations of the integrals are defined. Let (x^n)_{n∈N} be the approximation of x given by Proposition 2. Then I(x^n) is an ordinary integral. Hence I(x^n; s, t) = I(x^n|_{[s,t]}; 0, t − s) (the last integral means that T is replaced by t − s and thus that we consider the dyadic partitions of [0, t − s]). The result follows from passing to the limit.

Let us end this section with an important remark. Consider x in the space C^α([0, T]; A(R²)) with α ∈ (1/3, 1/2) and ϕ in C^{2α}([0, T]; R). We saw in Lemma 9 that y = (x^1, x^2, x^3 + ϕ) also belongs to C^α([0, T]; A(R²)).

² We will give below another construction of I for which a family of partitions different from the dyadic ones can be used.
Hence, we set y^n_t = (x^{Π^n} ⊔ Φ^n ⊔ Ψ^n)(t/3) for t ∈ [0, 3T], where Ψ^n = {z^n_k}_{k=0,...,2^n−1} with z^n_k : [t^n_k, t^n_{k+1}] → R² defined by

z^n_k(t) = √((ϕ_{t^n_{k+1}} − ϕ_{t^n_k})/π) ( cos(2π(t − t^n_k)/(t^n_{k+1} − t^n_k)) − 1, sin(2π(t − t^n_k)/(t^n_{k+1} − t^n_k)) ),

so that ϕ asymptotically encodes the area of (Ψ^n)_{n∈N}. Similarly as in Section 3.2, it is then easily shown that

I(y^n; 0, t) −−→_{n→∞} I(y; 0, t) = I(x; 0, t) + ∫_0^t [f, f](x_s) dϕ_s.

Hence, adding a path ϕ to the third component of x amounts to adding a term ∫_0^· [f, f](x_s) dϕ_s to I(x).

5.8 A Sub-Riemannian Point of View

Our definition of I consists in approximating a path x ∈ C^α([0, T]; A(R²)) by a family of paths (x^n)_{n∈N} in C^1([0, T]; A(R²)) such that I(x^n) converges with respect to the β-Hölder norm in C^α([0, T]; R) as n → ∞ for all β < α. The integral I(x) is then defined as the limit of I(x^n). In addition, necessarily, it follows from Lemma 8 that x^n = (x^{1,n}, x^{2,n}, x^3_0 + A(x^n)), where the x^n form a family of functions in C^1_p([0, T]; R²). The paths x^n were constructed by replacing x|_{[t^n_k, t^n_{k+1}]} with some paths obtained by combining loops and segments. Of course, other choices are possible, and a natural one consists in using geodesics.

Let a be a point in A(R²). How to find a path x : [0, 1] → A(R²) with x_0 = 0, x_1 = a and whose length (or whose energy) is minimal? Of course, one can use the segment y = (ta^1, ta^2, ta^3)_{t∈[0,1]} that goes from 0 to a, which is the natural geodesic in R³. But A(y^1, y^2; t) = 0, and thus y is not of type (y, A(y)) and does not belong to C^1([0, T]; A(R²)). We will use this point of view in Section 7.2, and this will help us to bridge our construction with another one of Riemann sum type. So, we may reformulate our question by imposing the condition that y is of type y = (y, A(y)), which means that y^3_t = A(y^1, y^2; 0, t) for t ∈ [0, 1]. This kind of problem is related to sub-Riemannian geometry: see [4, 5, 35, 60] for example. The notion of length we use is then the length of the path (y^1, y^2):

Length(y) = ∫_0^1 √((ẏ^1_s)² + (ẏ^2_s)²) ds.

Such a path, which will be characterized from the differentiable point of view in the next section, is called horizontal. It is then possible to introduce a distance between two points of A(R²) by

d(a, b) = inf{ Length(y) : y : [0, 1] → A(R²) horizontal, y_0 = a, y_1 = b },
which is called the Carnot-Carathéodory distance. We may then define ‖x‖_CC = d(0, x), which is a homogeneous sub-additive norm on A(R²) (see Section A), i.e., ‖x‖_CC = 0 if and only if x = 0 and, for all x, y ∈ A(R²) and λ ∈ R, ‖δ_λ x‖_CC = |λ| · ‖x‖_CC, ‖x^{−1}‖_CC = ‖x‖_CC and ‖x y‖_CC ≤ ‖x‖_CC + ‖y‖_CC, which is the sub-additive property. For any a ∈ A(R²), we succeeded in Section 5.3 in constructing a path that goes from 0 to a, so that ‖a‖_CC is finite. Of course, d(a, b) = ‖a^{−1} b‖_CC for all a, b ∈ A(R²).

If a^3 = 0, then the shortest horizontal path from 0 to a is the segment going from 0 to a. If a = (0, 0, a^3) with a^3 ≠ 0, this problem is equivalent to the isoperimetric problem, whose solution is known to be the circle. In the general case, this problem is called the Dido problem, and the solutions are known to be arcs of circles (see for example [60, 66]), but they are less practical to use than our construction with circles and loops (see below in the proof of Proposition 4). These solutions are not real geodesics in A(R²), but they are called sub-Riemannian geodesics. The sub-Riemannian geodesic that links a to b is then denoted by ψ_{a,b} and belongs to C^1([0, T]; A(R²)).

If we define the energy of a path by Energy(y) = (1/2) ∫_0^1 ((ẏ^1_s)² + (ẏ^2_s)²) ds, then ψ_{a,b} is also energy minimizing among all paths with constant speed Length(ψ_{a,b}).

To a path x in C^α([0, T]; A(R²)), we associate

x^n_t = ψ_{x_{t^n_k}, x_{t^n_{k+1}}}((t − t^n_k)/(t^n_{k+1} − t^n_k)) for t ∈ [t^n_k, t^n_{k+1}],   (32)

for n = 0, 1, 2, . . . .

Proposition 4. The sequence of paths (x^n)_{n∈N} constructed by (32) is a family of paths in C^1([0, T]; A(R²)) which converges to x in C^α([0, T]; A(R²)) with respect to ‖·‖_β for any β < α.

Proof. The proof is similar to that of Corollary 1 or of Proposition 2. Obviously, (x^n)_{n∈N} converges uniformly to x. Remark that x^n_{s,t} = x^n_{s,t^n_k} x^n_{t^n_k, t^n_{k+1}} x^n_{t^n_{k+1}, t} and that x^n_{t^n_k, t^n_{k+1}} = x_{t^n_k, t^n_{k+1}}. Using the same argument as in Corollary 1, the α-Hölder norm of x^n is then deduced from estimates on x^n_{s,t^n_k} and x^n_{t^n_{k+1}, t} for t ∈ [t^n_k, t^n_{k+1}], k = 0, . . . , 2^n − 1. After a translation, we would like to establish an estimate of the type |ψ_{0,x}(t)| ≤ Ct|x| for t ∈ [0, 1] for some constant C. If this holds, then for t ∈ [t^n_k, t^n_{k+1}],

|ψ_{x_{t^n_k}, x_{t^n_{k+1}}}(t/Δ_n t)| ≤ C (t/Δ_n t) |x_{t^n_k, t^n_{k+1}}| ≤ C (t Δ_n t^α/Δ_n t) ‖x‖_α ≤ C t^α ‖x‖_α.

We now give two proofs: one is done "by hand", and the second one uses the properties of the Carnot-Carathéodory distance.
◦ If x^3 = 0, then ψ_{0,x}(t) is a segment and for t ∈ [0, 1], |ψ_{0,x}(t)| ≤ |x|t, which gives the desired result. Now, if x^3 ≠ 0, observe first that for some constants a ≠ 0 and r, ϕ ∈ [0, 2π),

ψ^1_{0,x}(t) = a(cos(rt + ϕ) − cos(ϕ)),
ψ^2_{0,x}(t) = a(sin(rt + ϕ) − sin(ϕ)),
ψ^3_{0,x}(t) = a²rt,

since the minimizers lie above arcs of circles. Hence, a²r = x^3 and

(x^1)² + (x^2)² = ψ^1_{0,x}(1)² + ψ^2_{0,x}(1)² = 2a²(1 − cos(r)).

It is easily seen that one may find a and r so as to satisfy ψ_{0,x}(1) = x. If r ∈ [π/2, 3π/2], then 1 ≤ 1 − cos(r) ≤ 2, a² ≤ max{|x^1|², |x^2|²} and

max{|ψ^1_{0,x}(t)|, |ψ^2_{0,x}(t)|} ≤ 2πt max{|x^1|, |x^2|}

and |ψ^3_{0,x}(t)| ≤ 4π^{−1} t max{|x^1|, |x^2|}². This is sufficient to conclude. In the other case, since cos and sin are Lipschitz continuous and |a²r| ≤ |x|²,

ψ^1_{0,x}(t)² + ψ^2_{0,x}(t)² = 2a²(1 − cos(rt)) ≤ 2|x^3|t ≤ 2|x|²t.

Hence, |ψ_{0,x}(t)| ≤ √(2t)|x|. It follows that (x^n)_{n∈N} is bounded in C^α([0, T]; A(R²)) and this is sufficient to conclude.

◦ (Alternative proof). As the Carnot-Carathéodory norm is equivalent to any homogeneous norm (see Proposition 10 in Section A), it follows that for some universal constants C and C′,

∀t ∈ [0, 1], |ψ_{0,x}(t)| ≤ C ‖ψ_{0,x}(t)‖_CC = C t ‖x‖_CC ≤ C C′ t |x|,   (33)

since ψ_{0,x}(t) is a sub-Riemannian geodesic and then ‖ψ_{0,x}(t)‖_CC = t d(0, x). The inequalities (33) yield the result.

The point of view of sub-Riemannian geometry, which is natural in the context of Heisenberg groups, has been used by P. Friz and N. Victoir in [29] and [32].

5.9 A Sub-Riemannian Point of View: Differentiable Paths in A(R²)

We have introduced the set of paths C^α([0, T]; A(R²)) for α ∈ (1/3, 1/2], but we have seen that the value of α does not really refer to the regularity of the path x
in such a set, but to the norms to be used to approximate x by a family of paths x^n that are naturally lifted as x^n = (x^n, A(x^n)). It is then possible to consider paths x ∈ C^α([0, T]; A(R²)) with α < 1/2 that are differentiable: for example, if x is in C^1([0, T]; A(R²)) and ϕ in C^1([0, T]; R), then y_t = (x^1_t, x^2_t, x^3_t + ϕ_t) is almost everywhere differentiable, in the sense that

α^i(t) = lim_{ε→0} (y^i_{t+ε} − y^i_t)/ε, i = 1, 2, 3,   (34)

exists for almost every t. Another natural way of thinking of the derivative of y consists in setting

β^i(t) = lim_{ε→0} (1/ε) ((−y_t) y_{t+ε})^i, i = 1, 2, 3,   (35)

when this limit exists. If t ∈ [0, T] is such that (34) holds, then β^i(t) exists and

β(t) = α(t) − (1/2)[y_t, α(t)].

Conversely, if (35) holds, then (34) also holds and

α(t) = β(t) + (1/2)[y_t, β(t)].

Of course, (α^1(t), α^2(t)) = (β^1(t), β^2(t)) for all t at which y_t is differentiable. If the path y is of type (y, A(y)), then

(α^1(t), α^2(t)) = dy_t/dt and α^3(t) = (1/2) y_t ∧ (dy_t/dt) = (1/2) y_t ∧ (α^1(t), α^2(t)).

To each point a of A(R²), we associate the 2-dimensional vector space

Θ(a) = { (v^1, v^2, v^3) ∈ R³ : v^3 = (1/2)(a^1 v^2 − a^2 v^1) }

as well as the space Ξ(a) orthogonal to Θ(a) with respect to the usual scalar product in R³. The one-dimensional space Ξ(a) is generated by the vector (−a^2/2, a^1/2, 1)^T. It is easily seen that a ↦ (a, Ξ(a)) and a ↦ (a, Θ(a)) form two sub-bundles of the tangent bundle of A(R²). We then obtain the next result.

Lemma 12. A differentiable curve y is the natural lift (y, A(y)) of a differentiable curve y if and only if ẏ_t belongs to Θ(y_t) for each t ∈ [0, T].

For a differentiable path y : [0, T] → A(R²), let β(t) be given by (35). The condition that ẏ_t belongs to Θ(y_t) is equivalent to β(t) = (ẏ^1_t, ẏ^2_t, 0). More generally, if π_{Ξ(a)} is the projection from R³, identified with the tangent space of A(R²) at a, onto Ξ(a), then for t ∈ [0, T],
43
β(t) = (y˙ t1 , y˙ t2 , πΞ(yt ) (y˙ t )). Thus, a differentiable path y from [0, T ] to (A(R2 ), ) is necessarily of type (y, A(y) + ϕ) where y = (y1 , y2 ) and ϕ is differentiable, and β(t) = (y˙ t1 , y˙ t2 , ϕ˙ t ) for t ∈ [0, T ]. We will see in Section 6.12 how to interpret this condition.
6 Geometric and Algebraic Structures 6.1 Motivations Up to now, we have introduced a space A(R2 ) and considered paths in Cα ([0, T ]; A(R2 )). For a path x ∈ Cα ([0, T ]; A(R2 )), we have seen how to construct a sequence (xn )n∈N of paths converging to Cβ ([0, T ]; A(R2 )) with β < α such that xn = (x1,n , x2,n ) is piecewise smooth and x3,n = x30 + A(xn ). As xn lies above a piecewise smooth path xn , I(xn ) is well defined as a Young integral, and we have shown in Proposition 3 that the sequence (I(xn ))n∈N converges and its limit defines I(x). On the other hand, we may rewrite n
I(x ; 0, T ) =
n 2 −1
I(xn|[tn ,tn ] ) k k+1
and I(x; 0, T ) =
n 2 −1
k=0
I(x|[tnk ,tnk+1 ] ).
k=0
The path x^n|_{[t^n_k, t^n_{k+1}]} was constructed in Section 5.3 from the values of x_{t^n_k} and x_{t^n_{k+1}}. Hence, ∫_{t^n_k}^{t^n_{k+1}} f(x^n_s) dx^n_s is an approximation of I(x|_{[t^n_k, t^n_{k+1}]}), and I(x^n) is constructed from the values of {x_{t^n_k}}_{k=0,...,2^n−1} only.

We have proposed two constructions of integrals that rely on path approximation. We are now looking for a Riemann-sum-like expression, which consists in finding approximations of I(x; 0, T) and summing them over the dyadic partitions of [0, T]. First note that if x belongs to C^α([0, T]; A(R²)) with α > 1/2 and x^{Π^n} is the piecewise linear approximation of x along the dyadic partition Π^n, then

|I(x^{Π^n}; t^n_k, t^n_{k+1}) − I(x; t^n_k, t^n_{k+1})| ≤ ‖f‖_Lip |A(x; t^n_k, t^n_{k+1})| ≤ T^{2α} ‖f‖_Lip ‖x‖²_α 2^{−2nα}

and thus, since α > 1/2,

I(x; 0, T) = lim_{n→∞} Σ_{k=0}^{2^n−1} ∫_{t^n_k}^{t^n_{k+1}} f(x^{Π^n}_s) (x_{t^n_{k+1}} − x_{t^n_k})/(t^n_{k+1} − t^n_k) ds
= lim_{n→∞} Σ_{k=0}^{2^n−1} ∫_{t^n_k}^{t^n_{k+1}} f(x^{Π^n}_s) (dx^{Π^n}_s/ds) ds,   (36)

where x is the path above which x lies.
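When α > 1/2, (36) can be tested directly on a smooth path. The sketch below (the path, the differential form f and the discretization are my choices, not the text's) takes x_t = (cos t, sin t) on [0, 2π] and f = (−x²/2, x¹/2), for which f₁ dx¹ + f₂ dx² integrated over the closed circle gives its enclosed area π, and evaluates the dyadic sums of (36) exactly along the piecewise linear interpolation:

```python
import math

def riemann_sum(n):
    # dyadic Riemann sums of (36) for x_t = (cos t, sin t), t in [0, 2*pi],
    # against f(x) = (-x2/2, x1/2); then f1 dx1 + f2 dx2 = (x1 dx2 - x2 dx1)/2,
    # whose integral over the full circle is pi.
    T = 2 * math.pi
    total = 0.0
    for k in range(2 ** n):
        s, t = T * k / 2 ** n, T * (k + 1) / 2 ** n
        xs, xt = (math.cos(s), math.sin(s)), (math.cos(t), math.sin(t))
        # f is linear, so its integral along the chord is f(midpoint) . (xt - xs)
        mid = ((xs[0] + xt[0]) / 2, (xs[1] + xt[1]) / 2)
        total += -mid[1] / 2 * (xt[0] - xs[0]) + mid[0] / 2 * (xt[1] - xs[1])
    return total

# the dyadic sums of (36) converge to I(x; 0, T) = pi as the level n grows
assert abs(riemann_sum(12) - math.pi) < 1e-5
assert abs(riemann_sum(12) - math.pi) < abs(riemann_sum(6) - math.pi)
```

For rougher paths the neglected area term |A(x; t^n_k, t^n_{k+1})| no longer vanishes fast enough, which is precisely why the sums have to be corrected as in the next subsection.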
The first idea is then to find a formulation similar to (36), by looking for another way of drawing a piecewise differentiable path $\mathbf y^n$ lying above a path $y^n:[0,T]\to\mathbb R^2$ with $y^n(t^n_k)=x_{t^n_k}$ for $k=0,\dots,2^n$, and for which the expression $\xi^n_k=\int_{t^n_k}^{t^n_{k+1}} f(y^n_s)\,\frac{d\mathbf y^n_s}{ds}\,ds$ provides a good approximation of $I(x;t^n_k,t^n_{k+1})$, in the sense that for some $\theta>1$ and $C>0$,
\[
\big|\xi^n_k - I(x;t^n_k,t^n_{k+1})\big| \leq \frac{C}{2^{n\theta}}.
\]
The space in which $\mathbf y^n$ lives has to be specified, but it is natural to assume that $\frac{d\mathbf y^n_k(s)}{ds}$ belongs to $A(\mathbb R^2)$, and one then has to extend accordingly the definition of $f$ into a differential form on $A(\mathbb R^2)$.

The second idea is to get an expression of type $\sum_{k=0}^{2^n-1} f(x_{t^n_k})\Delta^n_k x$, where $\Delta^n_k x$ depends only on $x_{t^n_{k+1}}$ and $x_{t^n_k}$. As we deal with second-order calculus, things are not that simple: think of the difference between Stratonovich and Itô integrals for the Brownian motion.

6.2 Another Formulation for the Integral

We rewrite $I(x^n;t^n_k,t^n_{k+1})$ as
\[
I(x^n;t^n_k,t^n_{k+1}) = \int_{t^n_k}^{t^n_{k+1}} f(x^{\Pi^n}_s)\,dx^{\Pi^n}_s + \int_{\mathrm{Surface}(y^n_k)} [f,f](z)\,dz,
\]
where $y^n_k$ has been defined by (21b). Setting $x_{s,t}=(-x_s)\boxplus x_t$ and $\Delta_n t=T2^{-n}$, we have already seen that
\[
\Big|\int_{\mathrm{Surface}(y^n_k)} [f,f](z)\,dz - x^3_{t^n_k,t^n_{k+1}}\,[f,f](x_{t^n_k})\Big|
\leq \Delta_n t^{\alpha(1+\gamma)}\,\|f\|_{\mathrm{Lip}}\,\|x\|_\alpha^{1+\gamma}.
\]
On the other hand,
\[
\Big|x^3_{t^n_k,t^n_{k+1}}\,[f,f](x_{t^n_k})
- x^3_{t^n_k,t^n_{k+1}}\int_{t^n_k}^{t^n_{k+1}}[f,f](x^{\Pi^n}_s)\,\frac{ds}{\Delta_n t}\Big|
\leq \Delta_n t^{\alpha(1+\gamma)}\,\|f\|_{\mathrm{Lip}}\,\|x\|_\alpha^{1+\gamma}.
\]
Hence, this means that one can replace $I(x^n;t^n_k,t^n_{k+1})$ by
\[
\xi^n_k = x^1_{t^n_k,t^n_{k+1}}\int_{t^n_k}^{t^n_{k+1}} f_1(x^{\Pi^n}_s)\,\frac{ds}{\Delta_n t}
+ x^2_{t^n_k,t^n_{k+1}}\int_{t^n_k}^{t^n_{k+1}} f_2(x^{\Pi^n}_s)\,\frac{ds}{\Delta_n t}
+ x^3_{t^n_k,t^n_{k+1}}\int_{t^n_k}^{t^n_{k+1}} [f,f](x^{\Pi^n}_s)\,\frac{ds}{\Delta_n t},
\]
in the sense that $I(x;0,T)=\lim_{n\to\infty}\sum_{k=0}^{2^n-1}\xi^n_k$.
An Introduction to Rough Paths
Call $\{e_1,e_2,[e_1,e_2]\}$ the canonical basis of $A(\mathbb R^2)$, and $\{e_1^*,e_2^*,[e_1,e_2]^*\}$ its dual basis. For $z=(z^1,z^2,z^3)\in A(\mathbb R^2)$, define the differential form
\[
E_{A(\mathbb R^2)}(f)(z) = f_1(z^1,z^2)\,e_1^* + f_2(z^1,z^2)\,e_2^* + [f,f](z^1,z^2)\,[e_1,e_2]^*.
\tag{37}
\]
With $\mathbf x^{\Pi^n}=(x^{\Pi^n},0)$, the term $\xi^n_k$ may be put in a more synthetic form
\[
\xi^n_k = \int_{t^n_k}^{t^n_{k+1}} E_{A(\mathbb R^2)}(f)(\mathbf x^{\Pi^n}_s)\,x_{t^n_k,t^n_{k+1}}\,\frac{ds}{\Delta_n t}.
\]
Remark 9. We have to note the following point: using the same technique as in Corollary 1, one can show that for $x\in C^\alpha([0,T];A(\mathbb R^2))$, the path $x^n$ defined by
\[
x^n_t = x_{t^n_k}\boxplus\delta_{(t-t^n_k)/(t^n_{k+1}-t^n_k)}\big((-x_{t^n_k})\boxplus x_{t^n_{k+1}}\big)
\quad\text{for } t\in[t^n_k,t^n_{k+1}]
\]
converges to $x$ in $(C^\alpha([0,T];A(\mathbb R^2)),\|\cdot\|_\beta)$ for any $\beta<\alpha$ when the mesh of the partition $\{t^n_k\}_{k=0,\dots,2^n}$ converges to $0$. Here, $\delta_\cdot$ is the dilation operator introduced in (14). We then have that $I(x^n)$ converges to $I(x)$ in $(C^\alpha([0,T];A(\mathbb R^2)),\|\cdot\|_\beta)$ for any $\beta<\alpha$ if $\alpha\in(1/3,1]$.

Here, we consider the piecewise linear approximation
\[
\widetilde x^n_t = x_{t^n_k}\boxplus\frac{t-t^n_k}{t^n_{k+1}-t^n_k}\big((-x_{t^n_k})\boxplus x_{t^n_{k+1}}\big)
\quad\text{for } t\in[t^n_k,t^n_{k+1}],
\]
which is a piecewise smooth path with values in $A(\mathbb R^2)$. If $\alpha>1/2$, we may show that $(\widetilde x^n)_{n\in\mathbb N}$ is bounded in $C^\beta([0,T];A(\mathbb R^2))$ with $\beta=2\alpha-1$. We do not know whether or not $\widetilde x^n$ is bounded in $C^\beta([0,T];A(\mathbb R^2))$ for $\beta<\alpha$ when $\alpha<1/2$. However, we may define $I(x)$ using $(\widetilde x^n)_{n\in\mathbb N}$ by changing the definition of the integral.

The important point is the following: as we primarily want to focus on the increments of the paths, we leave the world of sub-Riemannian geometry, where paths in $A(\mathbb R^2)$ are basically seen as $2$-dimensional paths with a constraint on their areas. We are now willing to deal with paths that are seen directly as paths with values in $A(\mathbb R^2)$ (or other spaces that will be introduced later).

We are now looking for a curve $\mathbf y^n(t)$ on $[0,T]$ which is piecewise differentiable and such that
\[
\frac{d\mathbf y^n(t)}{dt} = \frac{1}{\Delta_n t}\,x_{t^n_k,t^n_{k+1}},\quad t\in(t^n_k,t^n_{k+1}).
\tag{38}
\]
Of course, from (38), such a path lies above $x^{\Pi^n}$. The problem is now to find the space in which $\mathbf y^n$ lives. Recall the results from Section 5.4: the space $(A(\mathbb R^2),\boxplus)$ is a non-commutative group, and it is also a Lie algebra when equipped with the brackets $[\cdot,\cdot]$.
We have already denoted the basis of $A(\mathbb R^2)$ by $\{e_1,e_2,[e_1,e_2]\}$. The choice of $[e_1,e_2]$ to denote the third component naturally follows from the bilinearity of $[\cdot,\cdot]$. The Lie algebra structure is particularly important here, since one knows that $A(\mathbb R^2)$ may be identified with the tangent space at any point of a Lie group. We will now construct such a Lie group.

6.3 Matrix Groups

We give here a very brief presentation of matrix groups. This part can also serve as a presentation of Lie groups, for which matrix groups are a prototype with the advantage of having an explicit coordinate system. For a more detailed insight, there are many books (see specifically [3, 68] or some books on Lie groups such as [17]).

Consider a matrix group $M$, that is, a subset of $d\times d$-matrices which is closed and such that for $p,q\in M$, $p\times q$ also belongs to $M$ and $p^{-1}$ belongs to $M$. This matrix group can be equipped with the topology induced by the set $M_d(\mathbb R)$ of $d\times d$-matrices. A general result is that a matrix group forms a smooth manifold [68, Theorem 7.17, p. 106], which means that around each point $p$ of $M$, there exist an open set $U_p$ in $\mathbb R^m$ (for some fixed $m$) and an open neighbourhood $V_p$ of $p$ in $M_d(\mathbb R)$ (seen as $\mathbb R^{d^2}$) such that there exists a map $\Phi_p$ which is a homeomorphism from $U_p$ to $V_p\cap M$. In addition, we require that for two points $p$ and $q$ of $M$ with $V_p\cap V_q\neq\emptyset$, $\Phi_p\circ\Phi_q^{-1}$ and $\Phi_q\circ\Phi_p^{-1}$ are smooth on their domain of definition. In other words, one can describe $M$ locally using a smooth one-to-one map from an open set of $\mathbb R^m$ (indeed, the dimension $m$ does not depend on the point around which the neighbourhood is considered) to $M$.

Example 1. Basic examples of Lie groups are given by the set of invertible matrices, the set of orthogonal matrices, etc.

Example 2. A particular example for us is the Heisenberg group $H$, which is the set of matrices
\[
H = \left\{\begin{bmatrix}1 & a & c\\ 0 & 1 & b\\ 0 & 0 & 1\end{bmatrix}\;:\;a,b,c\in\mathbb R\right\},
\]
which is easily seen to be stable under matrix multiplication.
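This stability under multiplication, and the stability under inversion, can be checked with a few lines of code; the sketch below is my own illustration, not part of the text.

```python
# Sanity check (my own, not the author's): products and inverses of
# Heisenberg matrices keep the unit upper-triangular shape, so H is a group.

def h(a, b, c):
    """The Heisenberg matrix with entries a, b, c."""
    return [[1.0, a, c], [0.0, 1.0, b], [0.0, 0.0, 1.0]]

def mat_mul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

p, q = h(1.0, 2.0, 3.0), h(-0.5, 4.0, 0.25)

# Closure: h(a,b,c) x h(a',b',c') = h(a+a', b+b', c+c'+a*b').
assert mat_mul(p, q) == h(0.5, 6.0, 3.25 + 1.0 * 4.0)

# Inverse: h(a,b,c)^(-1) = h(-a, -b, a*b - c), again an element of H.
inv_p = h(-1.0, -2.0, 1.0 * 2.0 - 3.0)
assert mat_mul(p, inv_p) == h(0.0, 0.0, 0.0)
```

The explicit closure formula shows how the corner entry picks up the cross term $a\,b'$, which is the source of non-commutativity.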
The Heisenberg group has been widely studied, and appears in sub-Riemannian geometry, quantum physics, etc. (see for example [4, 25, 60]).

For a given point $p$ in $M$, we can consider a smooth path $\gamma$ from $(-\varepsilon,\varepsilon)$ to $M\subset M_d(\mathbb R)$ for some $\varepsilon>0$ and with $\gamma(0)=p$. As $\gamma(t)=[\gamma_{i,j}(t)]_{i,j=1,\dots,d}$, we may consider its derivative $\gamma'(t)=[\gamma'_{i,j}(t)]_{i,j=1,\dots,d}$. As $\gamma$ moves only on $M$, $\gamma'(t)$ can only belong to a subspace of $M_d(\mathbb R)$ at each time. Denote by $T_pM$ the subset of $M_d(\mathbb R)$ given by all the derivatives $\gamma'(0)$ of the possible curves $\gamma$ as above. This is the tangent space, which is obviously a vector space.
Example 3. For the Heisenberg group, it is easily computed that the tangent space $T_pH$ at each point $p\in H$ is
\[
T_pH = \left\{\begin{bmatrix}0 & a & c\\ 0 & 0 & b\\ 0 & 0 & 0\end{bmatrix}\;:\;a,b,c\in\mathbb R\right\}.
\]

Consider now a map $\varphi$ from a matrix group $M$ to a matrix group $M'$. Let $p$ be a point of $M$ and set $p'=\varphi(p)$. Given two neighbourhoods $V_p$ and $V_{p'}$ of $p$ and $p'$ and the associated maps $\Phi_p$ and $\Phi_{p'}$ defined on open subsets of $\mathbb R^m$ and $\mathbb R^{m'}$, we assume that $(\Phi_{p'})^{-1}\circ\varphi\circ\Phi_p$ is smooth. We may then define the differential $d_p\varphi$ of $\varphi$ at $p$ as the linear map from $T_pM$ to $T_{\varphi(p)}M'$ given by
\[
d_p\varphi(v) = \frac{d\,\varphi\circ\gamma}{dt}\Big|_{t=0},
\]
where $\gamma:(-\varepsilon,\varepsilon)\to M$ is any smooth path such that $\gamma(0)=p$ and $\gamma'(0)=v$ for $v\in T_pM$.

Remark 10. The advantage with matrix groups is that $M_d(\mathbb R)$ gives a global system of coordinates for $M$ and for each tangent space. However, as usual in differential geometry, even if we may identify $T_pM$ with $T_qM$, they are really different spaces.

Two particular smooth maps are the following: for a given $p$ in $M$, set $R_p(q)=q\times p$ and $L_p(q)=p\times q$ for all $q\in M$. The differentials of $R_p:T_qM\to T_{q\times p}M$ and $L_p:T_qM\to T_{p\times q}M$ are easily computed: $d_qR_p(v)=v\times p$ and $d_qL_p(v)=p\times v$ for any $q\in M$, $v\in T_qM$. In particular, this implies that the left or right multiplication of an element of $T_qM$ by an element of $M$ gives an element in some tangent space of $M$. Using for $p$ the inverse $q^{-1}$ of $q\in M$, we deduce that the tangent space $T_qM$ at any $q$ is in bijection with the tangent space $T_{\mathrm{Id}}M$ at the identity matrix $\mathrm{Id}$ (which necessarily belongs to $M$). Hence, the dimension of $T_qM$ does not depend on $q$, and the dimension of $T_{\mathrm{Id}}M$ is then called the dimension of the matrix group $M$.

Denote by $TM$ the set $\cup_{p\in M}T_pM$, and call it the tangent bundle of $M$. This set has itself a manifold structure. A smooth vector field is an application that associates with any point $p$ of $M$ a tangent vector $X_p$ in $T_pM$ and such that the dependence is smooth (the precise definition uses local coordinates, as above).
An integral curve along $X$ is a smooth path $\gamma:[0,T]\to M$ such that $\gamma'(t)=X_{\gamma(t)}$.
Given two matrix groups $M$ and $M'$ with a smooth map $\varphi$ between them and two vector fields $X$ and $X'$ on $M$ and $M'$, we say that $X$ and $X'$ are related if $X'_{\varphi(p)}$ is equal to $d_p\varphi(X_p)$ at any point $p$ of $M$. In particular, this means that if $\gamma$ is an integral curve of $X$, then $\varphi\circ\gamma$ is an integral curve of $X'$.

A left-invariant vector field is a vector field $X$ such that $d_qL_p(X_q)=X_{L_p(q)}$. For a matrix group, this means that $p\times X_q=X_{p\times q}$. Using $q=\mathrm{Id}$, the value of a left-invariant vector field $X$ may be deduced from the value of $X$ at $\mathrm{Id}$, that is, from a vector in $T_{\mathrm{Id}}M$.

Let $\gamma$ be the integral curve of a left-invariant vector field $X$, with $\gamma(0)=p$ (and then $\gamma'(0)=X_p=p\times X_{\mathrm{Id}}$). We obtain that
\[
\gamma'(t) = X_{\gamma(t)} = \gamma(t)\times X_{\mathrm{Id}} = \gamma(t)\times p^{-1}\times X_p.
\]
When $p=\mathrm{Id}$ and $X_{\mathrm{Id}}=v$, we deduce that $\gamma'(t)=\gamma(t)\times v$, which we know how to solve: $\gamma(t)=\exp(tv)$ for $t\geq0$, where $\exp$ is the matrix exponential:
\[
\exp(v) = \mathrm{Id} + \sum_{k\geq1}\frac{1}{k!}v^k.
\]
As $\exp(-v)$ is the inverse of $\exp(v)$, one can extend $\gamma$ to $\mathbb R$. In addition, we also easily obtain that $\gamma(t+s)=\gamma(t)\times\gamma(s)$, so that $\gamma:\mathbb R\to M$ is a group homomorphism.

Proposition 5 (See for example [17, Proposition 1.3.4, p. 19]). There exist some open neighbourhood $U$ of $0$ in $T_{\mathrm{Id}}M$ and some neighbourhood $V$ of $\mathrm{Id}$ in $M$ such that the application $\exp$ is a $C^1$ diffeomorphism between $U$ and $V$.

Example 4. For the Heisenberg group $H$, we have that $P^3=0$ for $P\in T_{\mathrm{Id}}H$ (which means that $H$ is a step 2 nilpotent group) and then
\[
\exp(P) = \mathrm{Id} + P + \frac12 P^2.
\]
In addition, for $Q\in H$, $P=Q-\mathrm{Id}\in T_{\mathrm{Id}}H$ and one can define
\[
\log(\mathrm{Id}+P) = P - \frac12 P^2.
\]
Here, both $\exp:T_{\mathrm{Id}}H\to H$ and $\log:H\to T_{\mathrm{Id}}H$ are one-to-one maps that are inverse to each other, and $\exp$ is a global $C^1$ diffeomorphism.

More generally, the inverse of the exponential is also denoted by $\log$, and as it maps a neighbourhood $V_{\mathrm{Id}}$ of $\mathrm{Id}$ in $M$ to the vector space $T_{\mathrm{Id}}M$, this gives a local system of coordinates $\Psi_{\mathrm{Id}}:V_{\mathrm{Id}}\to\mathbb R^m$ (where $m$ is the dimension of the matrix group) by $\Psi_{\mathrm{Id}}=i\circ\log$, where $i:T_{\mathrm{Id}}M\to\mathbb R^m$ is the
map which naturally identifies $T_{\mathrm{Id}}M$ with $\mathbb R^m$. This function $\Psi_{\mathrm{Id}}:V_{\mathrm{Id}}\to\mathbb R^m$ is called the normal chart or the logarithmic chart. We then deduce a local system of coordinates in a neighbourhood $V_p$ of a point $p$ of $M$ by $\Phi_p:V_p\to\mathbb R^m$ with $\Phi_p(x)=i(\log(p^{-1}\times x))$ for $x\in V_p$.

Another map from $M$ to $M$ of interest is the adjoint defined by $\mathrm{Ad}(p)(q)=p\times q\times p^{-1}$ for $p,q\in M$. Of course, the interest of this map comes from the fact that in general, $M$ is not an Abelian group and then that $p\times q\neq q\times p$. It can be turned into a map from $T_{\mathrm{Id}}M$ to $T_{\mathrm{Id}}M$, still denoted by $\mathrm{Ad}(p)$, by setting $\mathrm{Ad}(p)(q)=p\times q\times p^{-1}$ for $q\in T_{\mathrm{Id}}M$. This new map $\mathrm{Ad}(p)$ is simply the differential at $\mathrm{Id}$ of $\mathrm{Ad}(p)$. Given some smooth path $\gamma:(-\varepsilon,\varepsilon)\to M$ with $\gamma(0)=\mathrm{Id}$ and $\gamma'(0)=p\in T_{\mathrm{Id}}M$,
\[
\mathrm{ad}(p)(q) \stackrel{\mathrm{def}}{=} \frac{d\,\mathrm{Ad}(\gamma(t))(q)}{dt}\Big|_{t=0} = p\times q - q\times p.
\]
For two matrices $p,q\in M_d(\mathbb R)$, denote by $[p,q]$ their bracket—called their Lie bracket—$[p,q]=p\times q-q\times p$. Hence, $\mathrm{ad}(p)(q)=[p,q]$, and we see from the definition of $\mathrm{ad}$ that $[p,q]$ belongs to $T_{\mathrm{Id}}M$ when $p,q\in T_{\mathrm{Id}}M$. The space $(T_{\mathrm{Id}}M,[\cdot,\cdot])$ then has a Lie algebra structure.

The Lie brackets are useful for the following property: let $p$ and $q$ be in $T_{\mathrm{Id}}M$, and let $t$ be small enough. Then
\[
\exp(tp)\times\exp(tq)
= \exp\Big(tp+tq+\frac{t^2}{2}[p,q]+\frac{t^3}{12}[p,[p,q]]+\frac{t^3}{12}[q,[q,p]]+\cdots\Big).
\tag{39}
\]
This is the Dynkin formula (also called the Baker-Campbell-Hausdorff formula), for which the complete (infinite) expansion may be given in terms of Lie brackets (see for example [17, § 1.7, p. 29]).
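Example 4 and formula (39) can be checked numerically in the Heisenberg case; since $P^3=0$ there, the expansion stops at the first bracket and the identities hold exactly. The encoding below (strictly upper-triangular $3\times3$ matrices for $T_{\mathrm{Id}}H$) is my own sketch, not the author's code.

```python
# Sketch (my own encoding): in T_Id H, P^3 = 0, so exp and log truncate and
# the Dynkin formula (39) is exact, not only an asymptotic expansion.

def tri(a, b, c):
    return [[0.0, a, c], [0.0, 0.0, b], [0.0, 0.0, 0.0]]

def mul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def add(p, q, s=1.0):
    """p + s*q, entry-wise."""
    return [[p[i][j] + s * q[i][j] for j in range(3)] for i in range(3)]

def sc(t, p):
    return [[t * v for v in row] for row in p]

ID = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def exp_h(p):                       # exp(P) = Id + P + P^2/2   (P^3 = 0)
    return add(add(ID, p), mul(p, p), 0.5)

def log_h(q):                       # log(Id + P) = P - P^2/2
    p = add(q, ID, -1.0)
    return add(p, mul(p, p), -0.5)

def bracket(p, q):
    return add(mul(p, q), mul(q, p), -1.0)

p, q = tri(1.0, 0.0, 0.0), tri(0.0, 1.0, 0.0)

assert log_h(exp_h(p)) == p                      # exp and log are inverse
# Exact Dynkin formula: exp(P) exp(Q) = exp(P + Q + [P,Q]/2).
assert mul(exp_h(p), exp_h(q)) == exp_h(add(add(p, q), bracket(p, q), 0.5))
# Commutator of flows, exactly: exp(eP)exp(eQ)exp(-eP)exp(-eQ) = exp(e^2 [P,Q]).
e = 0.25
lhs = mul(mul(exp_h(sc(e, p)), exp_h(sc(e, q))),
          mul(exp_h(sc(-e, p)), exp_h(sc(-e, q))))
assert lhs == exp_h(sc(e * e, bracket(p, q)))
```

The last assertion is the geometric interpretation of the bracket discussed next, with the $o(\varepsilon^2)$ term vanishing because the group is step 2 nilpotent.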
If we identify an element $p$ of the tangent space $T_{\mathrm{Id}}M$ with the flow $t\mapsto\exp(tp)$ it generates, a geometric interpretation of the Lie bracket follows from (39), as for $\varepsilon$ small enough,
\[
\exp(\varepsilon p)\times\exp(\varepsilon q)\times\exp(-\varepsilon p)\times\exp(-\varepsilon q)
= \exp\big(\varepsilon^2[p,q]+o(\varepsilon^2)\big),
\]
which means that if we follow the flow $t\mapsto\exp(tp)$ in the direction of $p$ up to a time $\varepsilon$, then the flow $t\mapsto\exp(tq)$ in the direction of $q$, before coming back in the direction of $-p$ and then of $-q$, always up to a time $\varepsilon$, we arrive close to a point given by the value of the flow $t\mapsto\exp(t[p,q])$ at time $\varepsilon^2$.

Example 5. For the Heisenberg group $H$, we easily obtain that the product of two matrices $P$ and $Q$ in $T_{\mathrm{Id}}H$ is of type
\[
P\times Q = \begin{bmatrix}0&0&c\\0&0&0\\0&0&0\end{bmatrix}\quad\text{for some } c\in\mathbb R,
\]
and then that the product of the matrices $P$, $Q$ and $R$ in $T_{\mathrm{Id}}H$ is equal to $0$. Then formula (39) becomes an exact formula
\[
\exp(P)\times\exp(Q) = \exp\Big(P+Q+\frac12[P,Q]\Big)
\]
and is true whatever the norms of $P$ and $Q$.

We now consider an element $x=(a,b,c)\in A(\mathbb R^2)$, and
\[
\Phi(x) = \begin{bmatrix}0&a&c\\0&0&b\\0&0&0\end{bmatrix}.
\tag{40}
\]
Clearly, $\Phi$ is a one-to-one map between $A(\mathbb R^2)$ and $T_{\mathrm{Id}}H$. In addition, it is easily obtained that $\Phi([x,y])=[\Phi(x),\Phi(y)]$ for all $x,y\in A(\mathbb R^2)$, or in other words, that $\Phi$ is a Lie algebra isomorphism between $(A(\mathbb R^2),[\cdot,\cdot])$ and $(T_{\mathrm{Id}}H,[\cdot,\cdot])$. With the exponential application $\exp$, we may then identify a path $x$ in $A(\mathbb R^2)$ with a path $y=\exp(x)$ living in the Heisenberg group. The path $x$ is valued in the vector space $A(\mathbb R^2)$ and $x_t$ gives the “direction” to follow so as to reach $y_t$ via integral curves of left-invariant vector fields.

6.4 Lie Groups

We have already seen that $(A(\mathbb R^2),\boxplus)$ is a Lie group, that is, a group $(G,\times)$ such that $(x,y)\mapsto x\times y$ and $x\mapsto x^{-1}$ are continuous. Denote by $1$ the neutral element of $G$. Here, we consider groups $(G,\times)$ that are finite-dimensional manifolds of class $C^2$ and such that $(x,y)\mapsto x\times y$ and $x\mapsto x^{-1}$ are also of class $C^2$. Any matrix group is a Lie group. We recall here some general results about $G$, which are merely a copy of the previous statements on matrix groups.

For $x\in G$, denote by $T_xG$ the tangent space at $x$. A vector field $X$ is a differentiable application $X:x\in G\mapsto X_x\in T_xG$. A left-invariant vector field $X$ is a vector field such that $X_{L_x(y)}=d_yL_x\,X_y$ for all $x,y\in G$, where $L_x(y)=x\times y$. It is easily shown that for such a vector field,
\[
X_x = d_1L_x\,X_1,\quad\forall x\in G,
\]
where $1$ is the neutral element of the Lie group $G$. In other words, a left-invariant vector field is fully characterized by the tangent vector $X_1$ in the tangent space $T_1G$ at the identity of $G$.
An integral curve of $X$ is a differentiable curve $\gamma:\mathbb R_+\to G$ such that
\[
\frac{d\gamma(t)}{dt} = X_{\gamma(t)}.
\]
A one-parameter subgroup of $G$ is a differentiable curve $\gamma:\mathbb R\to G$ such that $\gamma(t+s)=\gamma(t)\times\gamma(s)$ for all $s,t\in\mathbb R$ (note that $\gamma(-t)=\gamma(t)^{-1}$ for all $t\in\mathbb R$). This implies in particular that $\gamma(0)=1$. If $\gamma$ is an integral curve of a left-invariant vector field $X$, then $\gamma$ is deduced from the tangent vector $X_1\in T_1G$ at the identity $1$ of $G$. This vector $X_1$ is then called the generator of $\gamma$. Given a vector $v$ in $T_1G$, it is usual to denote by $(\exp(tv))_{t\in\mathbb R}$ the one-parameter subgroup of $G$ generated by $v$.

One may define a map $\mathrm{Ad}$ on $G$ such that $\mathrm{Ad}(x):y\mapsto x\times y\times x^{-1}$. Its differential $\mathrm{Ad}'(x)=d_1\mathrm{Ad}(x)$ at $1$ maps $T_1G$ to $T_1G$ and is linear. Hence, $x\mapsto\mathrm{Ad}'(x)$ can be seen as a map from $G$ to $L(T_1G,T_1G)$, the vector space of linear maps from $T_1G$ to $T_1G$, and its differential $\mathrm{ad}\stackrel{\mathrm{def}}{=}d_1\mathrm{Ad}'$ is a linear map from $T_1G$ to $L(T_1G,T_1G)$. Thus, $(x,y)\in T_1G^2\mapsto\mathrm{ad}(x)(y)$ is a bilinear map with values in $T_1G$, which is anti-symmetric: $\mathrm{ad}(y)(x)=-\mathrm{ad}(x)(y)$. We then define by $[x,y]\stackrel{\mathrm{def}}{=}\mathrm{ad}(x)(y)$ the Lie bracket of $x$ and $y$, and $(T_1G,[\cdot,\cdot])$ is a Lie algebra. This space is called the Lie algebra of $G$. For a matrix group, this Lie bracket corresponds to the Lie bracket of matrices.

6.5 Tensor Algebra

We have introduced matrix groups, and we have seen that $(A(\mathbb R^2),[\cdot,\cdot])$ is isomorphic to the Lie algebra $T_{\mathrm{Id}}H$ of the Heisenberg group. We will now construct a bigger space that will also contain the Heisenberg group.

Consider now the following tensor algebra
\[
T(\mathbb R^2) = \mathbb R\oplus\mathbb R^2\oplus(\mathbb R^2\otimes\mathbb R^2),
\]
where $\mathbb R^2\otimes\mathbb R^2$ is the tensor product of $\mathbb R^2$ with itself (on this notion see for example [20]). If $\{e_1,e_2\}$ is the canonical basis of $\mathbb R^2$, then $\mathbb R^2\otimes\mathbb R^2$ is the vector space of dimension 4 with basis $\{e_1\otimes e_1,e_1\otimes e_2,e_2\otimes e_1,e_2\otimes e_2\}$. For $x,y\in\mathbb R^2$,
\[
x\otimes y = (x^1e_1+x^2e_2)\otimes(y^1e_1+y^2e_2) = \sum_{i,j=1,2}x^iy^j\,e_i\otimes e_j,
\qquad
\lambda(x\otimes y) = (\lambda x)\otimes y = x\otimes(\lambda y),\ \forall\lambda\in\mathbb R.
\]
Any element $x\in T(\mathbb R^2)$ may be decomposed as $x=(x^0,x^1,x^2)$ where $x^0\in\mathbb R$, $x^1\in\mathbb R^2$ and $x^2\in\mathbb R^2\otimes\mathbb R^2$. This space $T(\mathbb R^2)$ is equipped with the term-wise addition $+$, and the multiplication $\otimes$ defined by the tensor product between two elements of $\mathbb R^2$ and
\[
x\otimes y = xy \ \text{ if } x\in\mathbb R,\ y\in T(\mathbb R^2),
\qquad
x\otimes y\otimes z = 0 \ \text{ if } x,y,z\in\mathbb R^2.
\]
The element $e_0=1=(1,0,0)$ is the neutral element of $T(\mathbb R^2)$ for $\otimes$, while $0=(0,0,0)$ is the neutral element of $+$. The space $(T(\mathbb R^2),+,\otimes)$ is an associative algebra, which is obtained by quotienting the tensor algebra $\mathbb R\oplus\mathbb R^2\oplus\mathbb R^2\otimes\mathbb R^2\oplus\cdots$ by the ideal formed by all the elements which belong to $(\mathbb R^2)^{\otimes3}\oplus(\mathbb R^2)^{\otimes4}\oplus\cdots$.

Remark 11. Consider the space $\mathbb R\langle X_1,X_2\rangle$ of polynomials with two non-commutative variables $X_1$ and $X_2$, as well as the equivalence relation $\sim$ on $\mathbb R\langle X_1,X_2\rangle$ defined by $P\sim Q$ if $P-Q$ is a sum of terms of total degree at least 3. Then there exists an isomorphism $\Phi$ between the associative algebras $(T(\mathbb R^2),+,\otimes)$ and $(\mathbb R\langle X_1,X_2\rangle/\!\sim,+,\times)$ such that $\Phi(e_i)=X_i$ for $i=1,2$. In other words, the elements of $T(\mathbb R^2)$ are manipulated as polynomials where only the terms of total degree at most 2 are kept.

For $\xi\in\{0,1\}$, denote by $T_\xi(\mathbb R^2)$ the subset of $T(\mathbb R^2)$ defined by
\[
T_\xi(\mathbb R^2) = \big\{(\xi,x^1,x^2)\ \big|\ x^1\in\mathbb R^2,\ x^2\in\mathbb R^2\otimes\mathbb R^2\big\}.
\]
Lemma 13. The space $(T_1(\mathbb R^2),\otimes)$ is a non-commutative group.

Proof. Clearly, if $x,y\in T_1(\mathbb R^2)$, then $x\otimes y\in T_1(\mathbb R^2)$. That $(T_1(\mathbb R^2),\otimes)$ is non-commutative follows from the very definition of $\otimes$. To show it is a group, it remains to compute the inverse of each element. If $x=(1,x^1,x^2)$, then $x^{-1}=(1,-x^1,-x^2+x^1\otimes x^1)$ is the inverse of $x$.

For $x,y\in T(\mathbb R^2)$, define the bracket of $x$ and $y$ by $[x,y]=x\otimes y-y\otimes x$. If $x=(x^0,x^1,x^2)$ and $y=(y^0,y^1,y^2)$ belong to $T(\mathbb R^2)$, then $[x,y]=[x^1,y^1]=(x^1\wedge y^1)[e_1,e_2]$. Note also that $[x,y]=-[y,x]$.

A natural sub-vector space of $(T_0(\mathbb R^2),+)\subset(T(\mathbb R^2),+)$ is then
\[
g(\mathbb R^2) = \big\{x\in T_0(\mathbb R^2)\ \big|\ x=x^1+x^a[e_1,e_2],\ x^1\in\mathbb R^2,\ x^a\in\mathbb R\big\}.
\]
Although $g(\mathbb R^2)$ is not stable under $\otimes$, it is stable under $[\cdot,\cdot]$: if $x=(x^1,x^a)$ and $y=(y^1,y^a)$ are in $g(\mathbb R^2)$, then $[x,y]=x^1\wedge y^1\,[e_1,e_2]\in g(\mathbb R^2)$. This space $g(\mathbb R^2)$ is of dimension 3.

For $x=x^1+x^a[e_1,e_2]$ and $y=y^1+y^a[e_1,e_2]$, set
\[
x\boxplus y = x^1+y^1+(x^a+y^a)[e_1,e_2]+\frac12[x^1,y^1]
= x^1+y^1+\Big(x^a+y^a+\frac12\,x^1\wedge y^1\Big)[e_1,e_2].
\]
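The group structure of Lemma 13 is easy to exercise in code. The tuple encoding $(x^0,x^1,x^2)$ of $T(\mathbb R^2)$ below is my own sketch (scalar, $2$-vector, $2\times2$ matrix), not a definition from the text.

```python
# Hypothetical encoding of T(R^2): x = (x0, x1, x2) with x0 a scalar, x1 a
# 2-vector, x2 a 2x2 matrix; terms of degree >= 3 are dropped by construction.

def tprod(x, y):
    """Truncated tensor product on T(R^2)."""
    x0, x1, x2 = x
    y0, y1, y2 = y
    return (x0 * y0,
            [x0 * y1[i] + y0 * x1[i] for i in range(2)],
            [[x0 * y2[i][j] + y0 * x2[i][j] + x1[i] * y1[j]
              for j in range(2)] for i in range(2)])

def inv(x):
    """Inverse in (T_1(R^2), tensor): (1, -x1, -x2 + x1 (x) x1)."""
    _, x1, x2 = x
    return (1.0, [-x1[0], -x1[1]],
            [[-x2[i][j] + x1[i] * x1[j] for j in range(2)] for i in range(2)])

def bracket2(x, y):
    """Second-level part of [x, y] = x (x) y - y (x) x."""
    xy, yx = tprod(x, y), tprod(y, x)
    return [[xy[2][i][j] - yx[2][i][j] for j in range(2)] for i in range(2)]

ONE = (1.0, [0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]])

x = (1.0, [1.0, 2.0], [[0.5, -1.0], [3.0, 0.0]])
y = (1.0, [-2.0, 0.5], [[1.0, 0.0], [0.0, 1.0]])

assert tprod(x, inv(x)) == ONE and tprod(inv(x), x) == ONE
assert tprod(x, y) != tprod(y, x)            # the group is non-commutative

# [x, y] = (x1 ^ y1) [e1, e2]: an antisymmetric second-level matrix.
wedge = x[1][0] * y[1][1] - x[1][1] * y[1][0]
assert bracket2(x, y) == [[0.0, wedge], [-wedge, 0.0]]
```

The last assertion is the identity $[x,y]=(x^1\wedge y^1)[e_1,e_2]$ stated above.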
Finally, define $i_{g(\mathbb R^2),A(\mathbb R^2)}$ by $i_{g(\mathbb R^2),A(\mathbb R^2)}(x)=(x^{1,1},x^{1,2},x^a)$ if $x=x^{1,1}e_1+x^{1,2}e_2+x^a[e_1,e_2]$. It is clear that $i_{g(\mathbb R^2),A(\mathbb R^2)}$ is one-to-one from $g(\mathbb R^2)$ to $A(\mathbb R^2)$, and a group homomorphism from $(g(\mathbb R^2),\boxplus)$ to $(A(\mathbb R^2),\boxplus)$. In addition, $[i_{g(\mathbb R^2),A(\mathbb R^2)}(x),i_{g(\mathbb R^2),A(\mathbb R^2)}(y)]=i_{g(\mathbb R^2),A(\mathbb R^2)}[x,y]$ for all $x,y\in g(\mathbb R^2)$, which means that $i_{g(\mathbb R^2),A(\mathbb R^2)}$ is also a Lie homomorphism. Hence, we identify the spaces $g(\mathbb R^2)$ and $A(\mathbb R^2)$. Lemmas 4 and 5 are then rewritten in the following way.

Lemma 14. The space $(g(\mathbb R^2),[\cdot,\cdot])$ is a Lie algebra, and $(g(\mathbb R^2),\boxplus)$ is a Lie group with $0$ as neutral element.

On $T_0(\mathbb R^2)$, define
\[
\exp(x) = 1+x^1+x^2+\frac12\,x^1\otimes x^1 \quad\text{for } x=(0,x^1,x^2).
\tag{41}
\]
This map $\exp$ is given by the first terms of the formal expansion of the exponential, since we are working in a truncated tensor algebra. Similarly, define on $T_1(\mathbb R^2)$
\[
\log(x) = x^1+x^2-\frac12\,x^1\otimes x^1 \quad\text{for } x=(1,x^1,x^2)\in T_1(\mathbb R^2).
\]
It is easily seen that $\exp\circ\log$ and $\log\circ\exp$ are equal to the identity respectively on $T_1(\mathbb R^2)$ and on $T_0(\mathbb R^2)$. If $x,y\in T_0(\mathbb R^2)$,
\[
\exp(x)\otimes\exp(y) = 1+x^1+y^1+x^2+y^2+\frac12\,x^1\otimes x^1+\frac12\,y^1\otimes y^1+x^1\otimes y^1
\]
and then
\[
\log(\exp(x)\otimes\exp(y)) = x\boxplus y
\tag{42}
\]
with
\[
x\boxplus y = x^1+y^1+x^2+y^2+\frac12\big(x^1\otimes y^1-y^1\otimes x^1\big) = x+y+\frac12[x,y].
\]
This is the truncated version of the Baker-Campbell-Hausdorff-Dynkin formula (see for example [37, 63]).

Lemma 15. If $G(\mathbb R^2)=\exp(g(\mathbb R^2))$, then $G(\mathbb R^2)$ is a subgroup of $(T_1(\mathbb R^2),\otimes)$ and $\exp$ is a group isomorphism from $(g(\mathbb R^2),\boxplus)$ to $(G(\mathbb R^2),\otimes)$.

Note that $\exp(-x)$ is the inverse of $\exp(x)$ in $G(\mathbb R^2)$, for all $x\in g(\mathbb R^2)$.

For a sub-vector space $V$ of $T(\mathbb R^2)$, $\pi_V$ denotes the projection onto $V$. If $V=\mathrm{Vect}(e)$ for some $e\in T(\mathbb R^2)$, then denote $\pi_{\mathrm{Vect}(e)}$ simply by $\pi_e$. For $x\in T(\mathbb R^2)$, set
\[
s(x) = \sum_{i,j=1,2}\frac12\big(\pi_{e_i\otimes e_j}(x)+\pi_{e_j\otimes e_i}(x)\big)\,e_i\otimes e_j,
\qquad
a(x) = \frac12\big(\pi_{e_1\otimes e_2}(x)-\pi_{e_2\otimes e_1}(x)\big)\,[e_1,e_2].
\]
If $x$ belongs to $\mathbb R^2\otimes\mathbb R^2$, then
\[
x = s(x)+a(x),
\tag{43}
\]
and $s(x)$ (resp. $a(x)$) corresponds to the symmetric (resp. anti-symmetric) part of $x$. Finally, note that for $x\in T(\mathbb R^2)$,
\[
s(x\otimes x) = \pi_{\mathbb R^2\otimes\mathbb R^2}(x\otimes x).
\tag{44}
\]
For $z=\exp(x)\in G(\mathbb R^2)$, we have
\[
s(z) = \frac12\,s(x\otimes x) = \frac12\,x\otimes x
\tag{45}
\]
and $a(z)=\pi_{[e_1,e_2]}(x)\,[e_1,e_2]$. Hence, for $x\in g(\mathbb R^2)$, one may rewrite
\[
\exp(x) = 1+\pi_{\mathbb R^2}(x)+\frac12\,x\otimes x+a(x)
\quad\text{and}\quad
x = \pi_{\mathbb R^2}(x)+a(x).
\tag{46}
\]
In particular, for $z\in G(\mathbb R^2)$, $a(\log(z))=a(z)$.

6.6 The Tensor Space as a Lie Group

It is possible to find a norm $|\cdot|$ on $\mathbb R^2\otimes\mathbb R^2$ such that $|x\otimes y|\leq|x|\cdot|y|$ for all $x,y\in\mathbb R^2$ (there are indeed several possibilities [64]). For $x=(1,x^1,x^2)\in T_1(\mathbb R^2)$ or for $x=(0,x^1,x^2)\in T_0(\mathbb R^2)$, set
\[
|x| = \max\{|x^1|,|x^2|\}
\quad\text{and}\quad
\|x\| = \max\big\{|x^1|,\sqrt{2|x^2|}\big\}.
\]
Then $\|\cdot\|$ is a homogeneous gauge for the dilation operator $\delta_t$ defined by $\delta_tx=(1,tx^1,t^2x^2)$, $t\in\mathbb R$, since $\|\delta_tx\|=|t|\cdot\|x\|$ (see Section A). Besides, $\|x\otimes y\|\leq(3/2)(\|x\|+\|y\|)$ for all $x,y\in T_1(\mathbb R^2)$.

We have introduced in Section 5.4 a dilation operator, also denoted by $\delta$, in a similar way. Note that for $x\in A(\mathbb R^2)$ and $t\in\mathbb R$, $\exp(\delta_tx)=\delta_t\exp(x)$. The next lemma is easily proved.
Lemma 16. With the norm $\|\cdot\|$, the spaces $(T_1(\mathbb R^2),\otimes)$ and $(G(\mathbb R^2),\otimes)$ are Lie groups, and $G(\mathbb R^2)$ is a closed subgroup of $T_1(\mathbb R^2)$.

For $x\in g(\mathbb R^2)$, $t\in\mathbb R\mapsto\gamma_x(t)\stackrel{\mathrm{def}}{=}\exp(tx)\in G(\mathbb R^2)$ is a one-parameter subgroup of $(G(\mathbb R^2),\otimes)$. The point $x$ is the tangent vector to $\gamma_x(t)$ at $t=0$:
\[
\frac{d\gamma_x}{dt}\Big|_{t=0} = x.
\]
Hence, $g(\mathbb R^2)$ may be identified with the tangent space of $G(\mathbb R^2)$ at the point $1$, and in fact at any point $y\in G(\mathbb R^2)$.

The bracket allows us to characterize the lack of commutativity of $G(\mathbb R^2)$, as follows from the next result, which is classical in the theory of Lie groups (see Figure 9): for $x,y\in g(\mathbb R^2)$ and for $t\geq0$, set
\[
\theta_{x,y}(t) = \gamma_x(\sqrt t)\otimes\gamma_y(\sqrt t)\otimes\gamma_x(-\sqrt t)\otimes\gamma_y(-\sqrt t).
\]
Then $\theta_{x,y}(0)=1$ and
\[
\frac{d\theta_{x,y}}{dt}\Big|_{t=0} = [x,y].
\]
In our case, it follows from the truncated version of the Baker-Campbell-Hausdorff-Dynkin formula (42) that $\theta_{x,y}(t)=\exp(t[x,y])$ for all $t\geq0$.

To any Lie group corresponds a Lie algebra, which is identified with the tangent space at the neutral element, and then at any point. Of course, $g(\mathbb R^2)\cong A(\mathbb R^2)$ has been constructed to be the tangent space of $G(\mathbb R^2)$ at any point.

Lemma 17. The tangent space of $G(\mathbb R^2)$ at any point may be identified with $A(\mathbb R^2)$, and the tangent space of $T_1(\mathbb R^2)$ at any point may be identified with $T_0(\mathbb R^2)$.
[Figure: the flows $\alpha$, $\alpha\otimes\beta$ and $\alpha\otimes\beta\otimes\alpha^{-1}$, ending at $\theta_{x,y}(t)$, are drawn against the axes $e_1$, $e_2$, $[e_1,e_2]$ and $e_1\otimes e_1$.]
Fig. 9. Illustration of the non-commutativity with $\alpha=\gamma_x(\sqrt t)$ and $\beta=\gamma_y(\sqrt t)$.
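The identities (41), (42) and (45), together with the homogeneity of the gauge, can be verified numerically. The sketch below uses my own encoding (vector plus $2\times2$ matrix) and assumes the gauge $\|x\|=\max\{|x^1|,\sqrt{2|x^2|}\}$ reconstructed above; only its homogeneity matters for the check.

```python
import math

# Illustrative check of (41)-(42), (45) and the gauge homogeneity; the
# encoding (2-vector, 2x2 matrix) and helper names are mine, not the text's.

def outer(u, v):
    return [[u[i] * v[j] for j in range(2)] for i in range(2)]

def exp_t(x1, x2):                  # (41): 1 + x1 + x2 + (x1 (x) x1)/2
    o = outer(x1, x1)
    return (1.0, x1[:], [[x2[i][j] + 0.5 * o[i][j] for j in range(2)]
                         for i in range(2)])

def log_t(z):                       # x1 + x2 - (x1 (x) x1)/2
    _, z1, z2 = z
    o = outer(z1, z1)
    return (z1[:], [[z2[i][j] - 0.5 * o[i][j] for j in range(2)]
                    for i in range(2)])

def tprod(x, y):                    # truncated tensor product on T_1(R^2)
    _, x1, x2 = x
    _, y1, y2 = y
    o = outer(x1, y1)
    return (1.0, [x1[i] + y1[i] for i in range(2)],
            [[x2[i][j] + y2[i][j] + o[i][j] for j in range(2)]
             for i in range(2)])

def bch(x1, x2, y1, y2):            # x [+] y = x + y + [x, y]/2
    w = 0.5 * (x1[0] * y1[1] - x1[1] * y1[0])
    return ([x1[i] + y1[i] for i in range(2)],
            [[x2[0][0] + y2[0][0], x2[0][1] + y2[0][1] + w],
             [x2[1][0] + y2[1][0] - w, x2[1][1] + y2[1][1]]])

a1, a2 = [1.0, -2.0], [[0.0, 3.0], [-3.0, 0.0]]   # a2 antisymmetric: a in g(R^2)
b1, b2 = [0.5, 4.0], [[0.0, -1.0], [1.0, 0.0]]

# (42): log(exp(x) (x) exp(y)) = x [+] y.
assert log_t(tprod(exp_t(a1, a2), exp_t(b1, b2))) == bch(a1, a2, b1, b2)

# (45): the symmetric part of z = exp(x) is (x1 (x) x1)/2 for x in g(R^2).
z = exp_t(a1, a2)
s = [[0.5 * (z[2][i][j] + z[2][j][i]) for j in range(2)] for i in range(2)]
assert s == [[0.5 * v for v in row] for row in outer(a1, a1)]

# Homogeneous gauge: ||delta_t x|| = |t| ||x|| with delta_t x = (1, t x1, t^2 x2).
gauge = lambda x1, x2: max(max(abs(v) for v in x1),
                           math.sqrt(2 * max(abs(v) for row in x2 for v in row)))
t = -3.0
assert math.isclose(gauge([t * v for v in a1],
                          [[t * t * v for v in row] for row in a2]),
                    abs(t) * gauge(a1, a2))
```

All three assertions hold exactly (the first two in exact floating point, the last up to rounding), mirroring the algebraic statements in the text.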
Remark 12. We have seen that $(A(\mathbb R^2),[\cdot,\cdot])$ is isomorphic to the Lie algebra $(T_{\mathrm{Id}}H,[\cdot,\cdot])$ of the Heisenberg group. Consider the map $\Psi:T(\mathbb R^2)\to H$ defined by
\[
\Psi(x) = \begin{bmatrix}1 & x^1 & x^{1,2}\\ 0 & 1 & x^2\\ 0 & 0 & 1\end{bmatrix}
\quad\text{for } x = x^0e_0+\sum_{i=1,2}x^ie_i+\sum_{i,j=1,2}x^{i,j}e_i\otimes e_j.
\]
Then note that $\Psi(x\otimes y)=\Psi(x)\times\Psi(y)$ for $x,y\in T(\mathbb R^2)$, so that $\Psi$ is a group homomorphism from $(T_1(\mathbb R^2),\otimes)$ or $(G(\mathbb R^2),\otimes)$ to $(H,\times)$. As $\Psi$ is linear, we easily get that $\Psi(\exp(x))=\exp(\Phi(x))$, where $\Phi$ is the Lie algebra isomorphism given by (40). We then deduce that $\Psi$ is indeed an isomorphism between $(G(\mathbb R^2),\otimes)$ and the Heisenberg group $(H,\times)$. The Heisenberg group is then a representation of the group $(G(\mathbb R^2),\otimes)$.

This section ends with a very useful lemma, whose proof is straightforward. The notion of Lipschitz functions on spaces with homogeneous gauges is similar to the usual notion of Lipschitz functions (see Definition 9 in Section A).

Lemma 18. The application $\exp$ is Lipschitz continuous from $(A(\mathbb R^2),\|\cdot\|)$ to $(G(\mathbb R^2),\|\cdot\|)$, and $\log$ is Lipschitz continuous from $(G(\mathbb R^2),\|\cdot\|)$ to the space $(A(\mathbb R^2),\|\cdot\|)$. The application $\exp$ is locally Lipschitz continuous from $(A(\mathbb R^2),|\cdot|)$ to $(G(\mathbb R^2),|\cdot|)$, and $\log$ is locally Lipschitz continuous from $(G(\mathbb R^2),|\cdot|)$ to $(A(\mathbb R^2),|\cdot|)$.

6.7 The Riemannian Structure on $T_1(\mathbb R^2)$ Induced by Euclidean Coordinates

A natural system of coordinates—which we call the Euclidean chart—follows from the identification of $T_1(\mathbb R^2)$ with the vector space $\mathbb R^2\oplus(\mathbb R^2\otimes\mathbb R^2)$. If $\gamma(t)=1+\sum_{i=1,2}\gamma_i(t)e_i+\sum_{i,j=1,2}\gamma_{i,j}(t)e_i\otimes e_j$ is a smooth path from $(-\varepsilon,\varepsilon)$ to $T_1(\mathbb R^2)$ with $\gamma(0)=x\in T_1(\mathbb R^2)$, then the derivative $\gamma'(0)$ of $\gamma$ at time $0$ may be simply expressed as
\[
\gamma'(0) = \sum_{i=1,2}\gamma_i'(0)\,e_i(x)+\sum_{i,j=1,2}\gamma_{i,j}'(0)\,e_{i,j}(x),
\]
where $e_i(x)\in T_xT_1(\mathbb R^2)$ is the tangent vector at $0$ of the path $\varphi^i(t)=x+te_i$ and $e_{i,j}(x)\in T_xT_1(\mathbb R^2)$ is the tangent vector at $0$ of the path $\varphi^{i,j}(t)=x+te_i\otimes e_j$. Introduce the natural attach map $A_x$ from $T_0(\mathbb R^2)$ to $T_xT_1(\mathbb R^2)$, which is linear and satisfies $A_x(e_i)=e_i(x)$ and $A_x(e_i\otimes e_j)=e_{i,j}(x)$ for $i,j=1,2$.
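The homomorphism property of Remark 12 can be tested directly; the tuple encoding of $T_1(\mathbb R^2)$ below is my own illustrative choice.

```python
# Numeric sketch of Remark 12 (encoding mine): Psi sends x in T_1(R^2) to the
# Heisenberg matrix built from the first-level coordinates x^1, x^2 and the
# coefficient x^{1,2} of e1 (x) e2.

def psi(x):
    _, x1, x2 = x
    return [[1.0, x1[0], x2[0][1]], [0.0, 1.0, x1[1]], [0.0, 0.0, 1.0]]

def tprod(x, y):
    _, x1, x2 = x
    _, y1, y2 = y
    return (1.0, [x1[i] + y1[i] for i in range(2)],
            [[x2[i][j] + y2[i][j] + x1[i] * y1[j] for j in range(2)]
             for i in range(2)])

def mat_mul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

x = (1.0, [1.0, 2.0], [[0.25, -1.0], [2.0, 0.5]])
y = (1.0, [-0.5, 3.0], [[1.0, 0.5], [0.0, -2.0]])

# Psi(x (x) y) = Psi(x) x Psi(y): Psi is a group homomorphism to (H, x).
assert psi(tprod(x, y)) == mat_mul(psi(x), psi(y))
```

The corner entry of the matrix product picks up exactly the cross term $x^1y^2$ that the truncated tensor product adds at the second level, which is why the homomorphism works.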
With this map, the derivative of $\gamma$ at $t=0$ is easily computed by
\[
\gamma'(0) = A_x\Big(\lim_{t\to0}\frac1t\big(\gamma(t)-\gamma(0)\big)\Big).
\tag{47}
\]
Hence, it is possible to endow $T_1(\mathbb R^2)$ with a Riemannian structure $\langle\cdot,\cdot\rangle$ by setting for $x\in T_1(\mathbb R^2)$,
\[
\langle e_i(x),e_j(x)\rangle_x = \delta_{i,j},\quad
\langle e_i(x),e_{j,k}(x)\rangle_x = 0,\quad
\langle e_{i,j}(x),e_{k,\ell}(x)\rangle_x = \delta_{i,k}\delta_{j,\ell}
\]
for $i,j,k,\ell=1,2$, where $\delta_{i,j}=1$ if $i=j$ and $\delta_{i,j}=0$ otherwise. We then define $\langle\cdot,\cdot\rangle_x$ as a bilinear form on $T_xT_1(\mathbb R^2)$.

6.8 The Left-Invariant Riemannian Structure on $T_1(\mathbb R^2)$

We have defined the logarithm map $\log$ as a map from $T_1(\mathbb R^2)$ to the vector space $T_0(\mathbb R^2)\cong\mathbb R^2\oplus(\mathbb R^2\otimes\mathbb R^2)$. Given a point $x\in T_1(\mathbb R^2)$, another system of coordinates $\Phi_x$ from $T_1(\mathbb R^2)$ to $\mathbb R^2\oplus(\mathbb R^2\otimes\mathbb R^2)$ around $x$ is given by
\[
\Phi_x(y) = i_{T_0(\mathbb R^2)\to\mathbb R^2\oplus(\mathbb R^2\otimes\mathbb R^2)}\big[\log(x^{-1}\otimes y)\big],
\]
where $i_{T_0(\mathbb R^2)\to\mathbb R^2\oplus(\mathbb R^2\otimes\mathbb R^2)}$ is the natural identification of $T_0(\mathbb R^2)$ with $\mathbb R^2\oplus(\mathbb R^2\otimes\mathbb R^2)$, for which we use the basis $\{e_i,e_j\otimes e_k\}_{i,j,k=1,2}$. For $y\in T_1(\mathbb R^2)$, we then set
\[
\Phi_x(y) = \sum_{i=1,2}\Phi^i_x(y)\,e_i+\sum_{i,j=1,2}\Phi^{i,j}_x(y)\,e_i\otimes e_j.
\]
This system of coordinates is called the normal chart or the logarithmic chart.

Let $\gamma:(-\varepsilon,\varepsilon)\to T_1(\mathbb R^2)$ be a smooth map with $\gamma(0)=x$. The derivative $\gamma'(0)$ of $\gamma$ at $0$ in this system of coordinates is then given by
\[
\gamma'(0) = \sum_{i=1,2}(\Phi^i_x\circ\gamma)'(0)\,\widetilde e_i(x)+\sum_{i,j=1,2}(\Phi^{i,j}_x\circ\gamma)'(0)\,\widetilde e_{i,j}(x),
\]
where $\widetilde e_i(x)$ (resp. $\widetilde e_{i,j}(x)$) is the tangent vector in $T_xT_1(\mathbb R^2)$ which is the derivative at $0$ of the path $\psi^i_x$ (resp. $\psi^{i,j}_x$) such that $(\Phi_x\circ\psi^i_x)'(0)=e_i$ (resp. $(\Phi_x\circ\psi^{i,j}_x)'(0)=e_i\otimes e_j$). These paths are easily computed: $\psi^i_x(t)=x\otimes\exp(te_i)$ for $i=1,2$ and $\psi^{i,j}_x(t)=x\otimes\exp(te_i\otimes e_j)$ for $i,j=1,2$. If we write $\gamma(t)=x\otimes\exp(\lambda(t))$ for $\lambda:(-\varepsilon,\varepsilon)\to T_0(\mathbb R^2)$ with $\lambda(0)=0$ and
\[
\lambda(t) = \sum_{i=1,2}\lambda_i(t)\,e_i+\sum_{i,j=1,2}\lambda_{i,j}(t)\,e_i\otimes e_j,
\]
then
\[
\gamma'(0) = \sum_{i=1,2}\lambda_i'(0)\,\widetilde e_i(x)+\sum_{i,j=1,2}\lambda_{i,j}'(0)\,\widetilde e_{i,j}(x).
\]
In the Euclidean structure, it follows from (47) that if $x=1+\sum_{i=1,2}x^ie_i+\sum_{i,j=1,2}x^{i,j}e_i\otimes e_j$, then
\[
\widetilde e_i(x) = A_x(x\otimes e_i) = e_i(x)+\sum_{j=1,2}x^je_{j,i}(x)
\tag{48}
\]
and $\widetilde e_{i,j}(x)=A_x(x\otimes(e_i\otimes e_j))=e_{i,j}(x)$.

Let $D_x$ (for detach) be the linear map from $T_xT_1(\mathbb R^2)$ which is the inverse of $A_x$, that is, which transforms $e_i(x)$ (resp. $e_{i,j}(x)$) into $e_i$ (resp. $e_i\otimes e_j$). For $x\in T_1(\mathbb R^2)$, let $L_x(y)=x\otimes y$ be the left multiplication on $T_1(\mathbb R^2)$. Its differential at point $y$ maps $T_yT_1(\mathbb R^2)$ to $T_{x\otimes y}T_1(\mathbb R^2)$ and is defined by $d_yL_x(v)=A_{x\otimes y}(x\otimes D_y(v))$. A left-invariant vector field $X$ on $T_1(\mathbb R^2)$ satisfies $X_x=d_1L_x(X_1)$ and then $X_x=A_x(x\otimes D_1(X_1))$. From (48), $\widetilde e_i(x)=d_1L_x(e_i(1))$ and $\widetilde e_{i,j}(x)=d_1L_x(e_{i,j}(1))$. In other words, the vector field $\widetilde e_i$ (resp. $\widetilde e_{i,j}$)—it is easily verified that they vary smoothly—is the left-invariant vector field generated by $e_i(1)$ (resp. $e_{i,j}(1)$) in the Lie group $(T_1(\mathbb R^2),\otimes)$.

We may then define another bilinear form $\langle\!\langle\cdot,\cdot\rangle\!\rangle_x$ at any point $x$ of $T_1(\mathbb R^2)$ by
\[
\langle\!\langle\widetilde e_i(x),\widetilde e_j(x)\rangle\!\rangle_x = \delta_{i,j},\quad
\langle\!\langle\widetilde e_i(x),\widetilde e_{j,k}(x)\rangle\!\rangle_x = 0,\quad
\langle\!\langle\widetilde e_{i,j}(x),\widetilde e_{k,\ell}(x)\rangle\!\rangle_x = \delta_{i,k}\delta_{j,\ell}
\]
for $i,j,k,\ell=1,2$. These bilinear forms induce another Riemannian structure $\langle\!\langle\cdot,\cdot\rangle\!\rangle$ on $T_1(\mathbb R^2)$. Note that for $v,w\in T_1T_1(\mathbb R^2)$ and $x\in T_1(\mathbb R^2)$,
\[
\langle\!\langle d_1L_x(v),d_1L_x(w)\rangle\!\rangle_x = \langle\!\langle v,w\rangle\!\rangle_1,
\]
which means that $\langle\!\langle\cdot,\cdot\rangle\!\rangle$ is a left-invariant metric. For a left-invariant vector field $X$, the norm $\sqrt{\langle\!\langle X_x,X_x\rangle\!\rangle_x}$ is constant.

Introduce the linear maps $\widetilde A_x:T_0(\mathbb R^2)\to T_xT_1(\mathbb R^2)$ and $\widetilde D_x:T_xT_1(\mathbb R^2)\to T_0(\mathbb R^2)$ such that $\widetilde A_x(e_i)=\widetilde e_i(x)$, $\widetilde A_x(e_i\otimes e_j)=\widetilde e_{i,j}(x)$ and $\widetilde D_x$ is the inverse of $\widetilde A_x$. If $(\cdot|\cdot)$ is the natural scalar product on $T_0(\mathbb R^2)$ for which $\{e_i,e_j\otimes e_k\}_{i,j,k=1,2}$ is orthonormal, then for $x\in T_1(\mathbb R^2)$ and $v,w\in T_xT_1(\mathbb R^2)$,
\[
\langle v,w\rangle_x = (D_x(v)|D_x(w))
\quad\text{and}\quad
\langle\!\langle v,w\rangle\!\rangle_x = (\widetilde D_x(v)|\widetilde D_x(w)).
\tag{49}
\]
To conclude this section, remark that it is very easy to express a vector $v\in T_xT_1(\mathbb R^2)$ in the basis $\{\widetilde e_i(x),\widetilde e_{j,k}(x)\}_{i,j,k=1,2}$ when we know its decomposition in $\{e_i(x),e_{j,k}(x)\}_{i,j,k=1,2}$: if $v=\sum_{i=1,2}v^ie_i(x)+\sum_{i,j=1,2}v^{i,j}e_{i,j}(x)$, then
\[
v = \widetilde A_x\big(x^{-1}\otimes D_x(v)\big),
\tag{50}
\]
so that with (49), $\langle\!\langle v,w\rangle\!\rangle_x=\big(x^{-1}\otimes D_x(v)\,\big|\,x^{-1}\otimes D_x(w)\big)$. Moreover, if $\gamma$ is a smooth path from $(-\varepsilon,\varepsilon)$ to $T_1(\mathbb R^2)$, then we get from (50) a simple expression for the derivative $\gamma'$ of $\gamma$ at time $t\in(-\varepsilon,\varepsilon)$ in the basis $\{\widetilde e_i(x),\widetilde e_{j,k}(x)\}_{i,j,k=1,2}$:
\[
\gamma'(t) = \widetilde A_{\gamma(t)}\Big(\lim_{h\to0}\frac1h\big(\gamma(t)^{-1}\otimes\gamma(t+h)-1\big)\Big).
\tag{51}
\]
6.9 The Exponential Map Revisited

Consider an integral curve $\gamma$ along a left-invariant vector field $X$ with $\gamma(0)=1$. If for $t\geq0$ the path $\gamma(t)$ is written
\[
\gamma(t) = 1+\sum_{i=1,2}\gamma_i(t)\,e_i+\sum_{i,j=1,2}\gamma_{i,j}(t)\,e_i\otimes e_j,
\]
then
\[
\gamma'(t) = X_{\gamma(t)} = d_1L_{\gamma(t)}(X_1) = A_{\gamma(t)}\big(\gamma(t)\otimes D_1(X_1)\big)
\]
and, if $X_1=\sum_{i=1,2}v_ie_i+\sum_{i,j=1,2}v_{i,j}e_i\otimes e_j$,
\[
\gamma_i'(t) = v_i,\qquad \gamma_{i,j}'(t) = v_{i,j}+\gamma_i(t)v_j
\]
for $i,j=1,2$. It follows that
\[
\gamma_i(t) = tv_i
\quad\text{and}\quad
\gamma_{i,j}(t) = tv_{i,j}+\frac{t^2}{2}v_iv_j,
\]
which means that $\gamma(t)=\exp(tX_1)$, where $\exp$ has been defined by (41). Note that $\exp(tX_1)\otimes\exp(sX_1)=\exp((t+s)X_1)$, since $(tX_1)\boxplus(sX_1)=(t+s)X_1$. Hence, the one-parameter subgroup of $T_1(\mathbb R^2)$ generated by $v$ is given by $t\in\mathbb R\mapsto\exp(tX_1)$.

In the system of left-invariant coordinates, we get
\[
\gamma'(t) = \widetilde A_{\gamma(t)}\big(\gamma(t)^{-1}\otimes D_{\gamma(t)}(\gamma'(t))\big)
= \widetilde A_{\gamma(t)}\big(\gamma(t)^{-1}\otimes\gamma(t)\otimes D_1(X_1)\big)
= \widetilde A_{\gamma(t)}\big(D_1(X_1)\big),
\]
which means that $\gamma'(t)$ is constant in the system of left-invariant coordinates. It follows that for any $y\in T_1(\mathbb R^2)$, it is always possible to construct an integral curve $\gamma$ along a left-invariant vector field that connects $x$ to $y$ and which is given by $(x\otimes\exp(tv))_{t\in[0,1]}$ with $v=\log(x^{-1}\otimes y)$.
6.10 Some Particular Curves for the Left-Invariant Riemannian Metric

For two points $a$ and $b$ in $T_1(\mathbb R^2)$ and a smooth path $\gamma$ from $[0,1]$ to $T_1(\mathbb R^2)$ with $\gamma(0)=a$ and $\gamma(1)=b$, define the energy $\mathrm{Energy}(\gamma)$ of the path $\gamma$ as
\[
\mathrm{Energy}(\gamma) \stackrel{\mathrm{def}}{=} \frac12\int_0^1\langle\!\langle\gamma'(s),\gamma'(s)\rangle\!\rangle_{\gamma(s)}\,ds,
\]
where $\langle\!\langle\cdot,\cdot\rangle\!\rangle$ is the left-invariant metric. For $t\in[0,1]$, set $\varphi(t)=\log(a^{-1}\otimes\gamma(t))$, so that $\gamma(t)=a\otimes\exp(\varphi(t))$ and $\varphi(0)=0$. The path $\varphi$ takes its values in $T_0(\mathbb R^2)$. With (51), we get
\[
\gamma'(t) = \widetilde A_{\gamma(t)}\Big(\lim_{h\to0}\frac1h\big(\exp((-\varphi(t))\boxplus\varphi(t+h))-1\big)\Big)
= \widetilde A_{\gamma(t)}\Big(\varphi'(t)+\frac12[\varphi'(t),\varphi(t)]\Big),
\]
where $\varphi(t)=\sum_{i=1,2}\varphi_i(t)e_i+\sum_{i,j=1,2}\varphi_{i,j}(t)e_i\otimes e_j$ and $\varphi'(t)=\sum_{i=1,2}\varphi_i'(t)e_i+\sum_{i,j=1,2}\varphi_{i,j}'(t)e_i\otimes e_j$. Thus, the energy of $\gamma$ is given by
\[
\mathrm{Energy}(\gamma) = \frac12\int_0^1\Big\|\varphi'(s)+\frac12[\varphi'(s),\varphi(s)]\Big\|_{\mathrm{Euc}}^2\,ds,
\]
where $\|\cdot\|_{\mathrm{Euc}}$ is the Euclidean norm of $T_0(\mathbb R^2)$ identified with $\mathbb R^6$.

We now consider the particular path $\gamma$ such that $\varphi(0)=0$, $\varphi(1)=\log(a^{-1}\otimes b)$ and $\varphi'(t)+\frac12[\varphi'(t),\varphi(t)]$ is constant over $[0,1]$. This means that $\varphi(t)=tv$ for some $v\in T_0(\mathbb R^2)$: the projection of $\varphi'$ on $\mathbb R^2$ is then constant and, since $[\varphi'(t),\varphi(t)]$ lives in $\mathbb R^2\otimes\mathbb R^2$, $[\varphi'(t),\varphi(t)]=0$ for $t\in[0,1]$. With the condition on $\varphi(1)$, $\varphi(t)=t\log(a^{-1}\otimes b)$ and $\gamma(t)=a\otimes\exp(t\log(a^{-1}\otimes b))$.

Let also $\psi:[0,1]\to T_0(\mathbb R^2)$ be a differentiable path with $\psi(0)=\psi(1)=0$. Set for $\varepsilon>0$, $\Gamma_\varepsilon(t)=a\otimes\exp(\varphi(t)\boxplus(\varepsilon\psi(t)))$, so that
\[
\Gamma_\varepsilon'(t) = \widetilde A_{\Gamma_\varepsilon(t)}\Big(\varphi'(t)+\varepsilon\psi'(t)+\varepsilon[\varphi'(t),\psi(t)]+\frac12[\varphi'(t),\varphi(t)]+\frac{\varepsilon^2}{2}[\psi'(t),\psi(t)]\Big).
\]
Thus, if $\varphi(t)=tv$ for some $v\in T_0(\mathbb R^2)$, we get
\[
\mathrm{Energy}(\Gamma_\varepsilon)
= \frac12\|v\|_{\mathrm{Euc}}^2
+ \varepsilon\int_0^1(v|\psi'(t))\,dt
+ \frac\varepsilon2\int_0^1(v|[v,\psi(t)])\,dt
+ \frac{\varepsilon^2}2\int_0^1\Big\|\psi'(t)+[v,\psi(t)]+\frac\varepsilon2[\psi'(t),\psi(t)]\Big\|_{\mathrm{Euc}}^2\,dt
+ \frac{\varepsilon^2}4\int_0^1(v|[\psi'(t),\psi(t)])\,dt.
\]
Since $\psi(0)=\psi(1)=0$, $\int_0^1(v|\psi'(t))\,dt=0$. But the term $\frac\varepsilon2\int_0^1(v|[v,\psi(t)])\,dt$ may be different from $0$, as well as $\frac{\varepsilon^2}4\int_0^1(v|[\psi'(t),\psi(t)])\,dt$. Hence, we see that $\gamma$ is not necessarily a path with minimal energy.

Remark 13. At first sight, this seems to contradict the result that $t\mapsto\exp(tv)$ is a path with a constant derivative in the left-invariant system of coordinates seen in Section 6.9 above. Indeed, the geodesics $\xi$ associated to the left-invariant Riemannian structure are those for which $\nabla_{\xi'(t)}\xi'(t)=0$, where $\nabla$ is the Levi-Civita connection associated to $\langle\!\langle\cdot,\cdot\rangle\!\rangle$. Since there exist some elements $x$, $y$ and $z$ such that $\langle\!\langle[z,x],y\rangle\!\rangle\neq\langle\!\langle x,[z,y]\rangle\!\rangle$ (consider for example $x=e_1$, $z=e_2$ and $y=e_1\otimes e_2$), this connection differs from the Cartan-Schouten $(0)$-connection $\nabla^{\mathrm{CS}}$, which is such that all paths of type $\gamma(t)=\exp(tv)$ are geodesics in the sense that $\nabla^{\mathrm{CS}}_{\gamma'(t)}(\gamma'(t))=0$. On this topic, see for example [58].

However, if $v$ belongs to $\mathrm{Vect}\{e_1,e_2\}$, then $(v|[v,\psi(t)])=0$ and $(v|[\psi'(t),\psi(t)])=0$, and so
\[
\mathrm{Energy}(\Gamma_\varepsilon) \geq \mathrm{Energy}(\gamma) = \frac12\|\log(a^{-1}\otimes b)\|_{\mathrm{Euc}}^2,\quad\forall\varepsilon>0,
\]
and thus $\gamma$ is a geodesic, that is, a curve with minimal energy. As usual, it can also be shown that it is a path with minimal length, and the length
\[
\mathrm{Length}(\gamma) \stackrel{\mathrm{def}}{=} \int_0^1\sqrt{\langle\!\langle\gamma'(s),\gamma'(s)\rangle\!\rangle_{\gamma(s)}}\,ds
\]
is then equal to $\|\log(a^{-1}\otimes b)\|_{\mathrm{Euc}}$.

Another simple situation is when $v\in\mathrm{Vect}\{e_i\otimes e_j\}_{i,j=1,2}$; in this case $[v,w]=0$ for all $w\in T_0(\mathbb R^2)$ and we also obtain that $\gamma$ is a geodesic. We also deduce that the length of the geodesic between $a$ and $b$ for $\langle\!\langle\cdot,\cdot\rangle\!\rangle$ is smaller than $\|\log(a^{-1}\otimes b)\|_{\mathrm{Euc}}$. Remark also that if $a$ and $b$ belong to $G(\mathbb R^2)$, then $\gamma(t)$ belongs to $G(\mathbb R^2)$ for $t\in[0,1]$.

Of course, if we see $T_1(\mathbb R^2)$ with its Euclidean structure $\langle\cdot,\cdot\rangle$, then the geodesics are simply $\gamma(t)=a+t(b-a)$. In this case, $\gamma(t)$ does not belong to $G(\mathbb R^2)$ in general when $a$ and $b$ are in $G(\mathbb R^2)$.

6.11 A Transverse Decomposition of the Tensor Space

We have introduced a subgroup $G(\mathbb R^2)$ of $T_1(\mathbb R^2)$. Is this subgroup strict or not? The tangent space of $T_1(\mathbb R^2)$ at any point may be identified with the vector space $(T_0(\mathbb R^2),+)$, which has dimension 6. We have also seen that the tangent space of $G(\mathbb R^2)$ at any point may be identified with $A(\mathbb R^2)$, and thus
62
A. Lejay
has dimension 3. Then, of course, G(R²) ≠ T1(R²). Indeed, we may be more precise on the decomposition of T1(R²). Denote by S(R²) the subset of T0(R²) defined by

  S(R²) = { x = (0, 0, x₂) ∈ T0(R²) : x₂ = λ e₁⊗e₁ + μ e₂⊗e₂ + ν(e₁⊗e₂ + e₂⊗e₁), λ, μ, ν ∈ R }.

In other words, an element of S(R²) belongs to R² ⊗ R² and is symmetric. Of course, S(R²) is stable under ⊗ and + (indeed, if x, y ∈ S(R²), then (1 + x) ⊗ (1 + y) = 1 + x + y), and is a vector space of dimension 3.

For an element e of the basis of T(R²), call π_e the projection from T(R²) to T(R²) such that

  x = π₁(x) + Σ_{i=1,2} π_{e_i}(x) e_i + Σ_{i,j=1,2} π_{e_i⊗e_j}(x) e_i⊗e_j.

The next result follows easily from the construction of the projection operators Υ_s : T0(R²) → S(R²) and Υ_a : T0(R²) → A(R²) defined by Υ_s(x) = s(x) and Υ_a(x) = π_{R²}(x) + a(x).

Proposition 6. The space T0(R²) is the direct sum of A(R²) and S(R²). This decomposition holds at the level of the tangent spaces at any point of T1(R²).

Proposition 7. Any element x of T1(R²) may be written as a sum x = y + z for some y ∈ G(R²) and z ∈ S(R²).

Proof. For x ∈ T(R²), set

  Υ_s(x) = s(x) − (1/2) π_{R²}(x) ⊗ π_{R²}(x)

and

  Υ_a(x) = 1 + π_{R²}(x) + a(x) + (1/2) π_{R²}(x) ⊗ π_{R²}(x).

With (43), Υ_a(x) + Υ_s(x) = x for all x ∈ T(R²). Also, thanks to (44) and (46), Υ_s(T1(R²)) ⊂ S(R²) and Υ_a(T1(R²)) ⊂ G(R²).

We have to note that, with the previous decomposition, G(R²) is not a linear subspace of T1(R²), and Υ_a and Υ_s are not linear projections, since they involve quadratic terms. This is why we do not write T1(R²) as the direct sum of G(R²) and S(R²). However, as the tangent space of S(R²) is S(R²) itself, if G(R²) and exp(S(R²)) = {1 + x : x ∈ S(R²)} are sub-manifolds of T1(R²), we get that G(R²) and exp(S(R²)) provide a transverse decomposition of T1(R²), in the sense that their tangent spaces at any point x provide an orthogonal decomposition (with respect to ⟨·,·⟩_x) of the tangent space of T1(R²) at x. Define a homogeneous norm ‖·‖_{G(R²)×S(R²)} by

  ‖x‖_{G(R²)×S(R²)} = max{ ‖Υ_a(x)‖, ‖Υ_s(x)‖^{1/2} }.   (52)

It is easily shown that this homogeneous norm is equivalent to the homogeneous gauge ‖·‖ on T1(R²).
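The decomposition of Proposition 7 can be checked numerically. The sketch below is illustrative only (all names are hypothetical, not from the text): an element x = 1 + v + M of T1(R²) is stored as the pair (v, M), and Υs, Υa are implemented as in the proof.

```python
# Sketch of Proposition 7: splitting x ∈ T1(R^2) as Υa(x) ∈ G(R^2) plus
# Υs(x) ∈ S(R^2). An element 1 + v + M is stored as (v, M); names illustrative.
import numpy as np

def upsilon_s(v, M):
    # s(x) - (1/2) v ⊗ v : a symmetric second-level element, i.e. in S(R^2)
    return 0.5 * (M + M.T) - 0.5 * np.outer(v, v)

def upsilon_a(v, M):
    # 1 + π_{R^2}(x) + a(x) + (1/2) v ⊗ v : an element of G(R^2)
    return v, 0.5 * (M - M.T) + 0.5 * np.outer(v, v)

rng = np.random.default_rng(0)
v, M = rng.normal(size=2), rng.normal(size=(2, 2))
w, N = upsilon_a(v, M)
Z = upsilon_s(v, M)
# the two pieces recover x: same vector part, second levels add up to M
assert np.allclose(w, v) and np.allclose(N + Z, M)
# Υa(x) lies in G(R^2): its symmetric second-level part is (1/2) v ⊗ v
assert np.allclose(0.5 * (N + N.T), 0.5 * np.outer(v, v))
# Υs(x) is symmetric
assert np.allclose(Z, Z.T)
```

The last two assertions are exactly the membership conditions Υ_a(x) ∈ G(R²) and Υ_s(x) ∈ S(R²) used in the proof.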
6.12 Back to the Sub-Riemannian Point of View

We now come back to the result of Section 5.9, in order to bring some precision to the sub-Riemannian geometric framework. We have already seen that (A(R²), ∗) is a Lie group (here, we no longer consider the space T(R²)). In addition, it is a vector space and then a smooth manifold, with a natural system of coordinates given by the decomposition of a ∈ A(R²) on the basis {e₁, e₂, e₃}, where e₃ corresponds to [e₁, e₂]. If φ_i(t; a) = a + t e_i for i = 1, 2, 3, t ∈ R and a ∈ A(R²), denote by e_i(a) the derivative φ′_i(0; a) at time 0 of φ_i(·; a). As in Section 6.7, define for a ∈ A(R²) two linear maps A_a and D_a by A_a(e_i) = e_i(a) and D_a = A_a⁻¹.

We now proceed as in Section 6.8. The left multiplication is L_a(y) = a ∗ y, and its differential d_b L_a : T_b A(R²) → T_{a∗b} A(R²) at any point b is given by

  d_b L_a(v) = A_{a∗b}( D_b(v) + (1/2)[a, D_b(v)] ).

Here [a, v] = (a¹v² − a²v¹) e₃ for a = a¹e₁ + a²e₂ + a³e₃. Thus, any left-invariant vector field (V_a)_{a∈A(R²)} satisfies V_a = d₀L_a(V₀). The left-invariant vector fields ê₁, ê₂ and ê₃ associated to e₁, e₂ and e₃ are given by

  ê₁(a) = e₁(a) − (1/2) a² e₃(a),  ê₂(a) = e₂(a) + (1/2) a¹ e₃(a)  and  ê₃(a) = e₃(a)

for a = a¹e₁ + a²e₂ + a³e₃. The space Θ(a) introduced in Section 5.9 is then the vector space generated by ê₁(a) and ê₂(a).

Let Ã_a be the linear map from A(R²) to T_a A(R²) defined by Ã_a(e_i) = ê_i(a). Then a vector v in T_a A(R²) is easily expressed in the left-invariant basis {ê₁(a), ê₂(a), ê₃(a)} by

  v = Ã_a((−a) ∗ D_a(v)).

Similarly, if γ : (−ε, ε) → A(R²) is a smooth path, then it is easily checked that

  γ′(t) = Ã_{γ(t)}( lim_{h→0} (1/h) ((−γ(t)) ∗ γ(t+h)) )
        = Ã_{γ(t)}( D_{γ(t)}(γ′(t)) + (1/2)[D_{γ(t)}(γ′(t)), γ(t)] ).

For a differentiable path y_t in A(R²), we have introduced in (34) and (35) some paths α and β that correspond indeed to the coordinates of the derivative of y in the bases {e₁, e₂, e₃} and {ê₁, ê₂, ê₃}, in the sense that

  dy_t/dt = Σ_{i=1}^{3} αⁱ(t) e_i(y_t) = Σ_{i=1}^{3} βⁱ(t) ê_i(y_t).
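The group operation on A(R²) is the step-2 Campbell-Hausdorff product. Assuming it reads a ∗ b = a + b + (1/2)[a, b] in the coordinates {e₁, e₂, e₃} (an assumption made explicit here; names in the sketch are illustrative), its group axioms can be verified directly:

```python
# Sketch of the group operation on A(R^2) ≅ R^3 in coordinates {e1, e2, e3},
# assuming a ∗ b = a + b + (1/2)[a, b] with [a, b] = (a1 b2 − a2 b1) e3.
import numpy as np

def bracket(a, b):
    return np.array([0.0, 0.0, a[0] * b[1] - a[1] * b[0]])

def star(a, b):
    return a + b + 0.5 * bracket(a, b)

a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])
c = np.array([0.5, -1.0, 2.0])
# associativity holds exactly: the algebra is nilpotent of step 2, so the
# Campbell-Hausdorff series stops at the first bracket
assert np.allclose(star(star(a, b), c), star(a, star(b, c)))
# the unit is 0 and the inverse of a is -a
assert np.allclose(star(a, -a), np.zeros(3))
```

This is the same nilpotency that makes e₃ = [e₁, e₂] central, so that ê₃(a) = e₃(a) above.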
7 Rough Paths and their Integrals

7.1 What are Rough Paths?

If x ∈ G(R²), then it is easily seen that, for some universal constants c and c′,

  c‖x‖ ≤ |log(x)| ≤ c′‖x‖,

where |·| is the homogeneous norm defined on A(R²) by (16).

Definition 4. A rough path is a continuous path x with values in T1(R²). Denote by Cα([0,T]; T1(R²)) the set of rough paths x : [0,T] → T1(R²) such that

  ‖x‖_α := sup_{0≤s<t≤T} ‖x_s⁻¹ ⊗ x_t‖ / |t − s|^α

is finite.
Fig. 10. From the tangent space A(R2 ) at point 1 (perpendicular to the vertical axis) to the manifold G(R2 ): the paths x (dashed) and log(x) (plain).
Lemma 19. A path x belongs to Cα([0,T]; G(R²)) if and only if log(x) belongs to Cα([0,T]; A(R²)).

A path y = (a(t), b(t), c(t))_{t∈[0,T]} with values in A(R²) is then transformed into a path x_t = exp(y_t) with values in G(R²) by the relation

  x_t = 1 + a(t)e₁ + b(t)e₂ + (1/2)a(t)² e₁⊗e₁ + (1/2)b(t)² e₂⊗e₂
      + ((1/2)a(t)b(t) + c(t)) e₁⊗e₂ + ((1/2)a(t)b(t) − c(t)) e₂⊗e₁.

Similarly, a path x with values in G(R²) is transformed into a path y with values in A(R²) by setting y_t = log(x_t).

In addition, note that x_{s,t} := x_s⁻¹ ⊗ x_t = exp((−y_s) ∗ y_t), and then

  s(x_{s,t}) = (1/2)(x̄_t − x̄_s) ⊗ (x̄_t − x̄_s),

where x̄_t = a(t)e₁ + b(t)e₂ denotes the path above which x lies.

Assume now that y belongs to Cα([0,T]; A(R²)) with α > 1/2. We have seen in Lemma 7 that necessarily c(t) = A(x̄; 0, t). Hence, from (46),

  x_t = 1 + x̄_t + A(x̄; 0, t)[e₁, e₂] + (1/2)(x̄_t − x̄_0) ⊗ (x̄_t − x̄_0).   (53)

As, for 0 ≤ s ≤ t ≤ T,

  A(x̄; s, t) = (1/2) ∫ₛᵗ (x̄¹_r − x̄¹_s) dx̄²_r − (x̄²_r − x̄²_s) dx̄¹_r
  and (1/2)(x̄ⁱ_t − x̄ⁱ_s)² = ∫ₛᵗ (x̄ⁱ_r − x̄ⁱ_s) dx̄ⁱ_r for i = 1, 2,

we may rewrite (53) as

  x_t = 1 + x̄_t + Σ_{i,j=1,2} ∫₀ᵗ (x̄ⁱ_r − x̄ⁱ_0) dx̄ʲ_r  e_i ⊗ e_j.   (54)
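For a piecewise linear path the iterated integrals in (54) reduce to exact sums over segments, so the lift and its increments can be computed directly. A minimal numerical sketch (all helper names are hypothetical), which also checks the multiplicative increment property x_{s,t} = x_{s,r} ⊗ x_{r,t}:

```python
# Sketch: level-2 lift of a piecewise linear path in R^2, as in (54).
# The second-level entries are the iterated integrals int (x^i - x^i_0) dx^j.
import numpy as np

def lift(points):
    """Return (increment, second_level) of the piecewise linear path
    through `points` (shape (n, 2)): a T1(R^2)-valued increment."""
    x = np.asarray(points, dtype=float)
    inc = np.zeros(2)
    area = np.zeros((2, 2))
    for k in range(len(x) - 1):
        d = x[k + 1] - x[k]
        # contribution of the k-th segment: past increment tensor d,
        # plus the within-segment term d ⊗ d / 2
        area += np.outer(inc, d) + 0.5 * np.outer(d, d)
        inc += d
    return inc, area

def tensor_product(a, b):
    # product in T1(R^2) of (1, a1, a2) and (1, b1, b2)
    return a[0] + b[0], a[1] + b[1] + np.outer(a[0], b[0])

pts = np.array([[0.0, 0.0], [1.0, 0.5], [0.3, 1.0], [1.2, 2.0]])
whole = lift(pts)
glued = tensor_product(lift(pts[:3]), lift(pts[2:]))
# the increments are multiplicative: x_{s,t} = x_{s,r} ⊗ x_{r,t}
assert np.allclose(whole[0], glued[0]) and np.allclose(whole[1], glued[1])
```

The symmetric part of `area` is (1/2)(x_T − x_0)⊗(x_T − x_0), in line with the remark before (53); only the antisymmetric part (the area) carries extra information.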
Note also that

  x_{s,t} := x_s⁻¹ ⊗ x_t = 1 + x̄_t − x̄_s + Σ_{i,j=1,2} ∫ₛᵗ (x̄ⁱ_r − x̄ⁱ_s) dx̄ʲ_r  e_i ⊗ e_j.
This means that the terms of x_t in R² ⊗ R² are the iterated integrals of x̄. When α < 1/2, the difficulty comes from the fact that these iterated integrals are not canonically constructed. As the iterated integrals have some nice algebraic properties (see Section 8.2), we replace them by an object—a rough path—which shares the same algebraic properties, and whose existence is not discussed in this article.

Let us end this section with a result on paths that are not geometric. If x belongs to Cα([0,T]; T1(R²)) (with α ∈ (0,1]) and x_t − 1 ∈ S(R²) for all t, then, writing x_t = (1, x¹_t, x²_t),

  ‖x_s⁻¹ ⊗ x_t‖ = |x²_t − x²_s|^{1/2} ≤ ‖x‖_α |t − s|^α.

This implies that x can be identified with a path in C^{2α}([0,T]; R³) (note that if α > 1/2, then x is constant).

7.2 Joining Two Points by Staying in G(R²)

We have seen that the integral of a differential form f along a path x : [0,T] → R² may be written as the limit of the following scheme: consider the family of dyadic partitions {t^n_k}_{k=0,…,2^n} of [0,T], and construct approximations x^n of x such that x_{t^n_k} = x^n_{t^n_k} for k = 0, …, 2^n, where two successive points x^n_{t^n_k} and x^n_{t^n_{k+1}} are linked by a path that depends only on these two points. The integral I(x) of f along x is then defined as the limit of the integrals of f along x^n.

When x is an α-Hölder continuous path with values in R² with α > 1/2, the "natural" family of approximations is given by the piecewise linear approximations. If α ∈ (1/3, 1/2], we have seen that we need to replace x by a path x̃ with values in A(R²) that projects onto x, and to construct x̃^n by joining two successive points x̃_{t^n_k} and x̃_{t^n_{k+1}} of x̃ with a sub-Riemannian geodesic computed from these two points. Such a path x̃^n is automatically the lift (x^n, A(x^n)) of a path x^n in Cα([0,T]; A(R²)), and the integral I(x̃) is defined as the limit of the I(x̃^n).
Computations in Sections 6.1 and 7.5 have shown that it may be advisable to work with piecewise linear approximations of paths of Cα([0,T]; A(R²)). For this, we have extended the differential form f to a differential form E_{A(R²)}(f) on A(R²). We have subsequently introduced the tensor space T(R²), as well as two Lie groups G(R²) and T1(R²) whose Lie algebras are A(R²) and T0(R²). We have also introduced in Section 6.8 an operator D̃_x : T_x T1(R²) → T0(R²) such that D̃_x(T_x G(R²)) ⊂ A(R²).

For a piecewise smooth path x : [0,T] → G(R²) whose values project onto a path in R², it is then natural to define

  L(x; 0, t) = ∫₀ᵗ E_{A(R²)}(f)(x_s) D̃_{x_s}(dx_s/ds) ds   (55)
for t ∈ [0,T], where E_{A(R²)}(f) has been defined in (37).

Remark 15. Note that here we use the operator D̃_x to transfer all computations to T0(R²), identified with the tangent space T₁T1(R²) at the point 1. If one wants to avoid this formulation, as we have seen in Sections 6.7 and 6.8, E_{A(R²)}(f) can be defined as the differential form

  E_{A(R²)}(f)(x) = f₁(x) ê₁*(x) + f₂(x) ê₂*(x) + [f, f](x) ê₃*(x),   (56)

where ê_i*(x) is the dual element of ê_i(x) in T_x T1(R²) for i = 1, 2, 3. Formula (55) may then be rewritten

  L(x; 0, t) = ∫₀ᵗ E_{A(R²)}(f)(x_s) (dx_s/ds) ds.
Now, given a path x ∈ Cα([0,T]; G(R²)), define the equivalent of the piecewise linear approximation x^n by using the curves constructed in Section 6.10 (see Figures 10a–10b for an illustration; note that unlike sub-Riemannian geodesics, x^n is not necessarily a smooth rough path, but it is a rough path which is smooth): set

  ϕ_{a,b}(t) = a ⊗ exp(t log(a⁻¹ ⊗ b)) for t ∈ [0,1],

and

  x^n_t = ϕ_{x_{t^n_k}, x_{t^n_{k+1}}}( (t − t^n_k)/(t^n_{k+1} − t^n_k) ) for t ∈ [t^n_k, t^n_{k+1}],   (57)

for n ∈ N* and t^n_k = Tk/2^n, k = 0, …, 2^n.
Fig. 10a. A sub-Riemannian geodesic in G(R2 ) as constructed from Section 5.9. e2⊗e1
e1⊗e2
e1⊗e1 e2
e1
exp(0)
exp(0) e1
e2
e2
e2 exp(0)
exp(0)
e2⊗e2
e1
Fig. 10b. The path ϕx,y with x = exp(0) and y = exp((1, 1, 1)).
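The curves ϕ_{a,b} of (57) involve only elementary tensor computations. The sketch below (names are illustrative, not from the text) represents 1 + v + M of T1(R²) as the pair (v, M), implements ⊗, inversion, exp and log in the truncated algebra, and checks the endpoint conditions ϕ_{a,b}(0) = a and ϕ_{a,b}(1) = b:

```python
# Sketch of ϕ_{a,b}(t) = a ⊗ exp(t log(a^{-1} ⊗ b)) in T1(R^2),
# elements stored as (v, M) for 1 + v + M. Names are illustrative.
import numpy as np

def mul(a, b):
    return a[0] + b[0], a[1] + b[1] + np.outer(a[0], b[0])

def inv(a):
    # (1 + v + M)^{-1} = 1 - v - M + v ⊗ v in the truncated algebra
    return -a[0], -a[1] + np.outer(a[0], a[0])

def exp_t(u):          # exp(x) = 1 + x + x⊗x/2 for π1(x) = 0
    return u[0], u[1] + 0.5 * np.outer(u[0], u[0])

def log_t(a):          # log(1 + x) = x - x⊗x/2
    return a[0], a[1] - 0.5 * np.outer(a[0], a[0])

def phi(a, b, t):
    u = log_t(mul(inv(a), b))
    return mul(a, exp_t((t * u[0], t * u[1])))

rng = np.random.default_rng(1)
a = (rng.normal(size=2), rng.normal(size=(2, 2)))
b = (rng.normal(size=2), rng.normal(size=(2, 2)))
p0, p1 = phi(a, b, 0.0), phi(a, b, 1.0)
assert np.allclose(p0[0], a[0]) and np.allclose(p0[1], a[1])
assert np.allclose(p1[0], b[0]) and np.allclose(p1[1], b[1])
```

Since the algebra is truncated at level 2, exp and log are exact inverses of each other, which is what makes ϕ_{a,b}(1) = b an identity rather than an approximation.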
Proposition 8. For x ∈ Cα([0,T]; G(R²)) with α > 1/3, let x^n be the path defined above by (57). Then

  I(log(x); 0, t) = lim_{n→∞} L(x^n; 0, t)
uniformly in t ∈ [0,T].

Proof. This follows from the computations of Sections 6.1 and 7.5, and from the definition of D̃_x, since we have seen in Section 6.10 that D̃_{ϕ_{a,b}(t)}(ϕ′_{a,b}(t)) = log(a⁻¹ ⊗ b) for t ∈ [0,1].

As there is an identification between log(x) and x, one can set, for x ∈ Cα([0,T]; G(R²)), I(x) = I(log(x)).

7.3 A Riemann Sum Like Definition

We are now willing to give another definition of the integral in the spirit of Riemann sums, to get rid of the integrals between the successive times t^n_k and t^n_{k+1} for k = 0, …, 2^n − 1. For this, we use the Taylor expansion of f: for x, y ∈ R² and i = 1, 2,

  f_i(y¹, y²) = f_i(x¹, x²) + Σ_{j=1,2} (∂f_i/∂x_j)(x¹, x²) zʲ + κ¹_i(z)

with |κ¹_i(z)| ≤ ‖f‖_Lip |z|^{1+γ} and z = y − x. In addition, [f, f](y¹, y²) = [f, f](x¹, x²) + κ²(z) with |κ²(z)| ≤ ‖f‖_Lip |z|^γ.

Let x ∈ Cα([0,T]; G(R²)) with α > 1/3, and let x^n be constructed as in Proposition 8. In addition, define x̄ and x̄^n by x̄ = π_{R²}(x) and x̄^n = π_{R²}(x^n). Remark that x̄^n is then the piecewise linear interpolation of x̄. For Δ_n t = T 2^{−n},
  | ∫_{t^n_k}^{t^n_{k+1}} [f, f](x̄^n_s) a(x_{t^n_k, t^n_{k+1}}) ds/Δ_n t − [f, f](x̄_{t^n_k}) a(x_{t^n_k, t^n_{k+1}}) |
    ≤ Δ_n t^{α(1+γ)} ‖x‖_α^{1+γ} ‖f‖_Lip.
In addition, with the Taylor formula,

  | Σ_{i=1,2} ∫_{t^n_k}^{t^n_{k+1}} f_i(x̄^n_s)(x̄ⁱ_{t^n_{k+1}} − x̄ⁱ_{t^n_k}) ds/Δ_n t − Σ_{i=1,2} f_i(x̄_{t^n_k})(x̄ⁱ_{t^n_{k+1}} − x̄ⁱ_{t^n_k})
    − (1/2) Σ_{i,j=1,2} (∂f_i/∂x_j)(x̄_{t^n_k}) π_{e_j}(x̄_{t^n_{k+1}} − x̄_{t^n_k}) π_{e_i}(x̄_{t^n_{k+1}} − x̄_{t^n_k}) |
  ≤ Δ_n t^{α(1+γ)} ‖f‖_Lip ‖x‖_α^{1+γ}.   (58)
If e_i*(x) is the dual element of e_i(x), denote by f(x) the linear operator f(x) = f₁(x) e₁*(x) + f₂(x) e₂*(x). If e_i*(x) ⊗ e_j*(x) is the dual element of e_i(x) ⊗ e_j(x) for i, j = 1, 2, denote by ∇f the linear operator

  ∇f(x) = Σ_{i,j=1,2} (∂f_i/∂x_j)(x) e_j*(x) ⊗ e_i*(x),

so that, with (45),

  (1/2) Σ_{i,j=1,2} (∂f_i/∂x_j)(x̄_{t^n_k}) π_{e_j}(x̄_{t^n_{k+1}} − x̄_{t^n_k}) π_{e_i}(x̄_{t^n_{k+1}} − x̄_{t^n_k}) = ∇f(x̄_{t^n_k}) s(x_{t^n_k, t^n_{k+1}}).

Hence, with (43), we deduce that

  ∫_{t^n_k}^{t^n_{k+1}} E_{A(R²)}(f)(x^n_s) D̃_{x^n_s}(dx^n_s/ds) ds
    = f(x_{t^n_k}) π_{R²}(x_{t^n_k, t^n_{k+1}}) + ∇f(x_{t^n_k}) π_{R²⊗R²}(x_{t^n_k, t^n_{k+1}}) + θ^n_k   (59)

with |θ^n_k| ≤ ‖f‖_Lip ‖x‖_α^{1+γ} Δ_n t^{α(1+γ)}. As α(1+γ) > 1, lim_{n→∞} Σ_{k=0}^{2^n−1} |θ^n_k| = 0.

We then define a differential form E_{T1(R²)}(f) on T1(R²) by

  E_{T1(R²)}(f)(x) = Σ_{i=1,2} f_i(π_{R²}(x)) e_i*(x) + Σ_{i,j=1,2} (∂f_i/∂x_j)(π_{R²}(x)) e_i*(x) ⊗ e_j*(x).

With (59) and the property of the θ^n_k's, we get that, after having identified I(x; 0, T) with I(log(x); 0, T) for x ∈ Cα([0,T]; G(R²)),

  I(x; 0, T) = lim_{n→∞} Σ_{k=0}^{2^n−1} E_{T1(R²)}(f)(x_{t^n_k}) x_{t^n_k, t^n_{k+1}},   (60)
which is a Riemann-sum-like expression. This also means that E_{T1(R²)}(f)(x_{t^n_k}) x_{t^n_k, t^n_{k+1}} is a "good" approximation of I(x; t^n_k, t^n_{k+1}).

7.4 Another Construction of the Integral

Assume now that the functions f₁, f₂ take their values in the space R^m with m > 1. For the sake of simplicity, assume that m = 2. The integral I(x) = (I₁(x), I₂(x)) then becomes a path in R², and we are interested in constructing its iterated integrals. If x belongs to Cα([0,T]; R²) with α > 1/2, then I(x) also corresponds to a Young integral and belongs to Cα([0,T]; R²). Hence, we use the natural lift (54), which means that we only need to define t ↦ A(I(x; 0, t); 0, t), or equivalently ∫ₛᵗ I_i(x; s, r) dI_j(x; s, r) for i, j = 1, 2.
Remark that, if x_r = x_s + (r − s)(t − s)⁻¹(x_t − x_s),

  ∫ₛᵗ ( ∫ₛʳ f_iᵏ(x_u) dxⁱ_u ) f_jˡ(x_r) dxʲ_r
   = f_iᵏ(x_s) f_jˡ(x_s) ∫ₛᵗ (xⁱ_r − xⁱ_s) dxʲ_r
   + ∫ₛᵗ ( ∫ₛʳ (f_iᵏ(x_u) − f_iᵏ(x_s)) dxⁱ_u ) f_jˡ(x_r) dxʲ_r
   + ∫ₛᵗ f_iᵏ(x_s)(xⁱ_r − xⁱ_s)(f_jˡ(x_r) − f_jˡ(x_s)) dxʲ_r.
This suggests to approximate

  y^{k,ℓ}_{t^n_k, t^n_{k+1}} = ∫_{t^n_k}^{t^n_{k+1}} I_k(x; t^n_k, s) dI_ℓ(x; t^n_k, s)

by the quantity

  Σ_{i,j=1,2} f_iᵏ(x_{t^n_k}) f_jˡ(x_{t^n_k}) x^{2,i,j}_{t^n_k, t^n_{k+1}},  k, ℓ = 1, 2.
With (60), we also set yⁱ_{t^n_k, t^n_{k+1}} = E_{T1(R²)}(fⁱ)(x_{t^n_k}) x_{t^n_k, t^n_{k+1}}, i = 1, 2. Let {e₁, e₂} be the canonical basis of R² and {ě₁, ě₂} be its dual basis, which we distinguish from {e₁, e₂} to refer to the space in which f takes its values. Then we introduce the differential form E_{T1(R²),T1(R²)}(f) with values in T1(R²) defined, for z ∈ T1(R²), by

  E_{T1(R²),T1(R²)}(f)(z) = 1 + E_{T1(R²)}(f¹)(z) ě₁ + E_{T1(R²)}(f²)(z) ě₂
      + Σ_{i,j=1,2} fⁱ(π_{R²}(z)) fʲ(π_{R²}(z)) ě_i ⊗ ě_j,

or, more concisely, E_{T1(R²),T1(R²)}(f)(z) = 1 + E_{T1(R²)}(f)(z) + f(π_{R²}(z)) ⊗ f(π_{R²}(z)) with f = f¹ě₁ + f²ě₂. Hence, in order to approximate I(x; s, t) together with its iterated integrals, we may set

  y_{s,t} := F(f, x; s, t) = E_{T1(R²),T1(R²)}(f)(x_s) x_{s,t}   (61)

and, for t ∈ (t^n_{M(t,n)−1}, t^n_{M(t,n)}] and s ∈ [t^n_{M(s,n)}, t^n_{M(s,n)+1}),

  I_n(x; s, t) := F(x; s, t^n_{M(s,n)}) ⊗ ( ⊗_{k=M(s,n)}^{M(t,n)−1} F(x; t^n_k, t^n_{k+1}) ) ⊗ F(x; t^n_{M(t,n)}, t).   (62)
Finally, set

  I(x; s, t) = lim_{n→∞} I_n(x; s, t)   (63)

when this limit exists. In the definition of F(x), we have assumed that x is a path with values in G(R²). In fact, this definition may be extended to paths with values in T1(R²). In addition, note that if x takes its values in G(R²), then F(x; s, t) ∈ G(R²). The analysis of F(x; s, t) for x in S(R²) is performed in Section 7.5.

We will see below that the integral defined by (62)-(63) satisfies the relation

  I(x; s, t) = I(x; s, r) ⊗ I(x; r, t), ∀ 0 ≤ s ≤ r ≤ t ≤ T,   (64)

which means that t ∈ [0,T] ↦ I(x; 0, t) is a path with values in T1(R²), and that I(x; s, t) represents its increments. But I_n(x) does not satisfy (64), unless s, r, t belong to {t^n_k}_{k=0,…,2^n−1}. The next results are borrowed from [52, Section 3.2, p. 40] or from [55, Section 3.1, p. 273].

Definition 6. A function y_{s,t} from Δ₊ = {(s,t) ∈ [0,T]² : 0 ≤ s ≤ t ≤ T} to T1(R²) is an almost rough path if there exist some constants C > 0 and θ > 1 such that

  ‖y_{s,t} − y_{s,r} ⊗ y_{r,t}‖ ≤ C|t − s|^θ, ∀ 0 ≤ s ≤ r ≤ t ≤ T,

where ‖·‖ is the norm defined by ‖x‖ = max{|x¹|, |x²|}. An almost rough path is the "basic brick" for constructing a rough path. We give a proof of the next theorem in Section C in the appendix.

Theorem 2. Let y : Δ₊ → T1(R²) be an almost rough path such that ‖y_{s,t}‖ ≤ C|t − s|^α for α ∈ (1/3, 1] and C > 0. Set
  y^n_{s,t} := y_{s, t^n_{M(s,n)}} ⊗ ( ⊗_{k=M(s,n)}^{M(t,n)−1} y_{t^n_k, t^n_{k+1}} ) ⊗ y_{t^n_{M(t,n)}, t}, ∀(s, t) ∈ Δ₊.
Then there exists a unique path z in Cα([0,T]; T1(R²)) and a sequence (K_n)_{n∈N} decreasing to 0 such that

  ‖z_{s,t} − y^n_{s,t}‖ ≤ K_n |t − s|^θ.

If y is an almost rough path in G(R²), then z is a weak geometric rough path with α-Hölder control. In addition, if y and y′ are both almost rough paths with

  |π_{R²}(y_{s,t} − y′_{s,t})| ≤ ε|t − s|^α,  |π_{R²⊗R²}(y_{s,t} − y′_{s,t})| ≤ ε|t − s|^{2α}

for all (s, t) ∈ Δ₊, then the corresponding rough paths z and z′ satisfy

  |π_{R²}(z_{s,t} − z′_{s,t})| ≤ K(ε)|t − s|^α,  |π_{R²⊗R²}(z_{s,t} − z′_{s,t})| ≤ K(ε)|t − s|^{2α}

for some function K(ε) decreasing to 0 as ε → 0 and depending only on T, α and θ.
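Theorem 2 can be visualized in a drastically simplified setting: take a smooth scalar driver, let y_{s,t} be the first-order germ f(x_s)(x_t − x_s) (so only the first level of the almost rough path is kept, which suffices for smooth paths), and glue it over dyadic partitions. The sketch below (names illustrative, not from the text) shows the glued products converging to the classical integral:

```python
# Sketch of the dyadic gluing behind (62) and Theorem 2, in a scalar,
# first-level-only setting: the germ f(x_s)(x_t - x_s) is an "almost"
# increment, and products over finer dyadic partitions converge.
import math

def germ(f, x, s, t):
    # first-order germ of ∫ f(x) dx
    return f(x(s)) * (x(t) - x(s))

def glue(f, x, s, t, n):
    # product over the level-n dyadic subdivision of [s, t]
    # (for scalar first-level terms, the product ⊗ just adds increments)
    ts = [s + (t - s) * k / 2**n for k in range(2**n + 1)]
    return sum(germ(f, x, a, b) for a, b in zip(ts, ts[1:]))

x, f = math.sin, math.cos
# exact value: ∫_0^1 cos(sin s) d(sin s) = sin(sin 1)
approx = glue(f, x, 0.0, 1.0, 12)
assert abs(approx - math.sin(math.sin(1.0))) < 1e-3
```

For α ≤ 1/2 the same gluing needs the second level of y_{s,t} as well; that is exactly what F(f, x; s, t) of (61) supplies.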
The existence of I(x) in (63) as a (weak geometric) rough path when x is a (weak geometric) rough path is then justified by the next proposition and an application of Theorem 2. Roughly speaking, the proof follows the same lines as for the Young integral; the reader is referred to [55, Section 3.2.2, p. 289], [52, Section 5.2, p. 117], [44, Section 3] or [53].

Proposition 9. For x ∈ Cα([0,T]; T1(R²)) with α ∈ (1/3, 1], the function (s, t) ∈ Δ₊ ↦ F(x; s, t) is an almost rough path. In addition, if x ∈ Cα([0,T]; G(R²)), then F(x; s, t) belongs to G(R²).

Hence, I(x) given by (63) exists and belongs to Cα([0,T]; T1(R²)) (resp. Cα([0,T]; G(R²))) if x ∈ Cα([0,T]; T1(R²)) (resp. Cα([0,T]; G(R²))).

We have already seen that the integral I(x) lies above the integral we constructed in Section 6 using some approximation of x. With Theorem 2, we not only have continuity of x ↦ I(x), but we also get that it is locally Lipschitz under a stronger assumption on f, and we are no longer bound to use the ‖·‖_β norm with β < α while working with α-Hölder paths. In addition, we may consider any family of partitions whose meshes decrease to zero (see Remark 4 or the proof of Theorem 5 in Appendix C).

We introduce a new norm ‖·‖_{*,α} on Cα([0,T]; T1(R²)), which is not equivalent to ‖·‖_α but which generates the same topology: for x ∈ Cα([0,T]; T1(R²)),

  ‖x‖_{*,α} = sup_{0≤s<t≤T} max{ |π_{R²}(x_{s,t})|/(t − s)^α, |π_{R²⊗R²}(x_{s,t})|/(t − s)^{2α} }.

Theorem 3. If f is as above with α(1 + γ) > 1 and α > 1/3, then the limit of (I_n(x; s, t))_{n∈N} in (63) exists and is unique. Besides, I maps (Cα([0,T]; T1(R²)), ‖·‖_{*,α}) continuously to (Cα([0,T]; T1(R²)), ‖·‖_{*,α}). If f is of class C²(R²; R²) with a κ-Hölder continuous second-order derivative, where α(κ + 2) > 1, then I is locally Lipschitz continuous. In addition, if x is a smooth rough path, then

  I(x; 0, t) = exp( ∫₀ᵗ f(x_s) dx_s + A(I(x; 0, t); 0, t)[ě₁, ě₂] )

and I(x) is also a smooth rough path.

Hence, for x ∈ C^{0,α}([0,T]; G(R²)), there exists a sequence of paths x^n ∈ C^∞_p([0,T]; G(R²)) convergent to x in ‖·‖_α; then I(x) = lim_{n→∞} I(x^n), each I(x^n) is a smooth rough path, and I(x) belongs to C^{0,α}([0,T]; G(R²)). Now, if x is only a weak geometric 1/α-rough path with Hölder control, then we have seen that x may be approximated by some smooth rough paths x^n in the β-Hölder norm ‖·‖_β with β < α. Hence, I(x^n) converges to I(x) in ‖·‖_β with β < α. In any case, I(x) belongs to Cα([0,T]; G(R²)).
We then deduce the following stability result. Corollary 5. If I is defined by Theorem 3, then I maps Cα ([0, T ]; G(R2 )) into Cα ([0, T ]; G(R2 )) and C0,α ([0, T ]; G(R2 )) into C0,α ([0, T ]; G(R2 )). We end this section with a lemma similar to Lemma 11. Lemma 20. For any x ∈ Cα ([0, T ]; T1 (R2 )), I(x; s, t) = I(x|[s,t] ) for all 0 s < t T. Proof. If x ∈ Cα ([0, T ]; G(R2 )), the proof of this Lemma is similar to the one of Lemma 11. If x ∈ Cα ([0, T ]; T1 (R2 )), the results at the end of Section 7.5 allow us to conclude in the same way.
7.5 Integral along a Path Living in the Tensor Space

Now, consider x ∈ Cα([0,T]; T1(R²)) with α ∈ (1/3, 1/2). What can be said about I(x)? From Proposition 7, one may decompose x_t as the sum x_t = y_t + z_t with y = Υ_a(x) and z = Υ_s(x). In addition, (y, z) belongs to Cα([0,T]; G(R²) × S(R²)), where the homogeneous norm on G(R²) × S(R²) has been defined by (52). In particular, this implies that π_{e_i⊗e_j}(z) belongs to C^{2α}([0,T]; R), i.e., each of its components is 2α-Hölder continuous.

In addition, for (s, t) ∈ Δ₊, E_{T1(R²),T1(R²)}(f)(x_s)(y_t − y_s) belongs to G(R²), while

  E_{T1(R²),T1(R²)}(f)(x_s)(z_t − z_s) = Σ_{k,i,j=1,2} (∂f_iᵏ/∂x_j)(x_s) π_{e_i⊗e_j}(z_t − z_s) ě_k
    + Σ_{k,ℓ,i,j=1,2} f_iᵏ(x_s) f_jˡ(x_s) π_{e_i⊗e_j}(z_t − z_s) ě_k ⊗ ě_ℓ.
Since π_{e_i⊗e_j}(z_t − z_s) = π_{e_j⊗e_i}(z_t − z_s), we get that E_{T1(R²),T1(R²)}(f)(x_s)(z_t − z_s) belongs to R² ⊕ S(R²). Besides, for t ∈ [0,T],

  lim_{n→∞} Σ_{k=0}^{M(t,n)} π_{R²}[ E_{T1(R²),T1(R²)}(f)(x_{t^n_k})(z_{t^n_{k+1}} − z_{t^n_k}) ]
   = Σ_{k=1,2} ě_k ∫₀ᵗ (∂f₁ᵏ/∂x₁)(x_s) dπ_{e₁⊗e₁}(z_s)
   + Σ_{k=1,2} ě_k ∫₀ᵗ (∂f₂ᵏ/∂x₂)(x_s) dπ_{e₂⊗e₂}(z_s)
   + Σ_{k=1,2} ě_k ∫₀ᵗ ( (∂f₁ᵏ/∂x₂)(x_s) + (∂f₂ᵏ/∂x₁)(x_s) ) dπ_{e₁⊗e₂}(z_s),

which we can more concisely write ∫₀ᵗ ∇f(x_s) dz_s.
A. Lejay
In addition, if {αk }k=0,...,m and {βk }k=0,...,m belongs to T0 (R2 ), then m (
m (
(1 + αk + βk ) =
k=0
+
(1 + αk ) +
k=1
βk1
⊗
k=0,...,m
αk1 ⊗
k=0,...,m
α 1
=k+1,...,m
β 1
=k+1,...,m
+
βk1 ⊗
k=0,...,m
β 1
=k+1,...,m
with αk1 = πR2 (αk ) and βk1 = πR2 (βk ). Remark that
βk1
⊗
k=0,...,m
and
βk1 ⊗
k=0,...,m
α 1
=k+1,...,m
m
=
αk ⊗ β ,
=1 k=0
m m 1 β 1 = βk ⊗ βk . 2
=k+1,...,m
k=0
k=0
We set 1 + α_k = F(f, y; t^n_k, t^n_{k+1}) and β_k = E_{T1(R²)}(f)(x_{t^n_k})(z_{t^n_{k+1}} − z_{t^n_k}). Furthermore, Σ_{k=0,…,M(t,n)} α¹_k → π_{R²}(I(x; 0, t)), while Σ_{k=0,…,M(t,n)} β¹_k converges to ∫₀ᵗ ∇f(x_s) dz_s. Remark also that, if β²_k = π_{R²⊗R²}(β_k), then

  Σ_{k=0}^{M(t,n)} β²_k  →_{n→∞}  ∫₀ᵗ f(x_s) ⊗ f(x_s) dz_s.
By combining all these facts and using techniques similar to those of [52, Section 3.3.3, p. 56] or of [53], since the components of ∫₀ᵗ ∇f(x_s) dz_s are 2α-Hölder continuous, we can get

  ⊗_{k=0}^{M(t,n)−1} F(f, x; t^n_k, t^n_{k+1})  →_{n→∞}  ∫₀ᵗ f(x_s) dy_s + K(y, z; 0, t)

with

  K(y, z; 0, t) = ∫₀ᵗ ∇f(y_s) dz_s + ∫₀ᵗ f(y_s) ⊗ f(y_s) dz_s
   + Σ_{k,ℓ=1,2} ě_k ⊗ ě_ℓ ∫₀ᵗ ∇fᵏ(y_s) ( ∫₀ˢ fˡ(y_u) dy_u ) dz_s
   + Σ_{k,ℓ=1,2} ě_k ⊗ ě_ℓ ∫₀ᵗ fᵏ(y_s) ( ∫₀ˢ ∇fˡ(y_u) dz_u ) dy_s
   + (1/2) ( ∫₀ᵗ ∇f(y_s) dz_s ) ⊗ ( ∫₀ᵗ ∇f(y_s) dz_s ).   (65)
In the previous expression, we have to remember that x and y live above the same path in R². Thus, if for each n ∈ N, z^n belongs to C^∞_p([0,T]; S(R²)) and converges to z, while y^n ∈ C^∞_p([0,T]; G(R²)) converges to y, one gets that x^n = y^n + z^n converges to x in Cα([0,T]; T1(R²)) and

  I(x) = lim_{n→∞} ( I(y^n) + K(y^n, z^n) ),

where the limit is in Cβ([0,T]; T1(R²)) for all β < α. Of course, both K(y^n, z^n) and I(y^n) correspond to integrals of differential forms along piecewise smooth paths, and hence to ordinary integrals. Yet the following fact has to be noted: if x ∈ Cα([0,T]; T1(R²)) but x ∉ Cα([0,T]; G(R²)), then it is not possible to find a family (x^n)_{n∈N} of smooth rough paths such that I(x^n) converges to I(x). This means that I(x) cannot be approximated by the ordinary integrals I(x^n). This motivates our definition of geometric rough paths. However, using the decomposition of T1(R²) as G(R²) × S(R²), it is then possible to interpret any α⁻¹-rough path as a geometric (1/α, 2/α)-rough path in the sense defined in [53].

7.6 On Geometric Rough Paths Lying Above the Same Path

We have seen in Lemma 6 that if x and y are two paths in Cα([0,T]; A(R²)) with α ∈ (1/3, 1/2) lying above the same path taking its values in R² (i.e., π_{R²}(x) = π_{R²}(y)), then there exists a path ϕ ∈ C^{2α}([0,T]; R) such that x = y + ϕ[e₁, e₂]. In addition, (−x_s) ∗ x_t = (−y_s) ∗ y_t + (ϕ_t − ϕ_s)[e₁, e₂].

Now, if we lift x and y to paths in Cα([0,T]; G(R²)) by x̃_t = exp(x_t) and ỹ_t = exp(y_t), we deduce that there exists ψ ∈ Cα([0,T]; T0(R²)) such that x̃_t = ỹ_t + ψ_t and, in addition, x̃_s⁻¹ ⊗ x̃_t = ỹ_s⁻¹ ⊗ ỹ_t + ψ_t − ψ_s. This path is given by ψ_t = ϕ_t e₁⊗e₂ − ϕ_t e₂⊗e₁ = ϕ_t [e₁, e₂]. Each component of ψ is 2α-Hölder continuous. Using the map K previously defined by (65), we get I(x̃) = I(ỹ) + K(ỹ, ψ). Finally, using the fact that ψ is anti-symmetric, and setting [f, f] = ě₁[f¹, f¹] + ě₂[f², f²], we get

  K(ỹ, ψ; s, t) =
  ∫ₛᵗ [f, f](ỹ_u) dϕ_u
   + Σ_{k,ℓ=1,2} ě_k ⊗ ě_ℓ ∫ₛᵗ [fᵏ, fᵏ](ỹ_u) ( ∫ₛᵘ fˡ(ỹ_r) dỹ_r ) dϕ_u
   + Σ_{k,ℓ=1,2} ě_k ⊗ ě_ℓ ∫ₛᵗ fˡ(ỹ_u) ( ∫ₛᵘ [fᵏ, fᵏ](ỹ_r) dϕ_r ) dỹ_u
   + (1/2) ( ∫ₛᵗ [f, f](ỹ_u) dϕ_u ) ⊗ ( ∫ₛᵗ [f, f](ỹ_u) dϕ_u )
for all 0 ≤ s ≤ t ≤ T. If [f, f] = 0, we deduce that K(ỹ, ψ) = 0 and then that I(ỹ) = I(x̃). In other words, any rough path lying above the same path x gives rise to the same integral. With the results in [54], which assert that it is always possible to lift a path x ∈ Cα([0,T]; R²) to a path x̃ ∈ Cα([0,T]; G(R²)) when α ∈ (1/3, 1/2], this means that if [f, f] = 0, one may define I directly on Cα([0,T]; R²) for α ∈ (1/3, 1/2]; but the continuity of I remains an open question.
8 Variations in the Construction of the Integral

8.1 Case of a Path Living in a d-Dimensional Space

The case of a space of dimension d is not harder to treat than the case d = 2; one only has to consider the areas between the components grouped in pairs. The tensor space then becomes T(R^d) = R ⊕ R^d ⊕ (R^d ⊗ R^d), whose basis is, if {e₁, …, e_d} is a basis of R^d,

  1, e₁, …, e_d, e₁⊗e₁, e₁⊗e₂, …, e_d⊗e_d;

hence T(R^d) is a space of dimension 1 + d + d². The space A(R^d) is a space of dimension d + d(d−1)/2, with basis

  {e_i : i = 1, …, d} ∪ {[e_i, e_j] : i < j, i, j = 1, …, d},

where [e_i, e_j] = e_i⊗e_j − e_j⊗e_i. The space A(R^d) is then R^d ⊕ [R^d, R^d], where [R^d, R^d] = Vect{[x, y] : x, y ∈ R^d}. The maps exp and log are defined as previously:

  exp(x) = 1 + x + (1/2) x ⊗ x for x ∈ A(R^d),
  log(1 + x) = x − (1/2) x ⊗ x for x ∈ T(R^d) with π₁(x) = 0.

The space G(R^d) = exp(A(R^d)) is a subgroup of (T1(R^d), ⊗), where T1(R^d) = {x ∈ T(R^d) : π₁(x) = 1}, and (A(R^d), [·,·]) is the Lie algebra of (G(R^d), ⊗). It may also be identified with its tangent space at any point.
A smooth path x in R^d is then lifted into a path x̃ in A(R^d) by

  x̃_t = x_t + Σ_{i,j=1,…,d, i<j} A((xⁱ, xʲ); 0, t)[e_i, e_j],

where (xⁱ, xʲ) is the two-dimensional path composed of the i-th and j-th components of x. Remark that A((xⁱ, xʲ)) = −A((xʲ, xⁱ)) and A((xⁱ, xⁱ)) = 0. The path x̃ is then lifted into a path x̂ with values in G(R^d) by x̂ = exp(x̃), and thus

  x̂_t = 1 + x_t + Σ_{i,j=1}^{d} ∫₀ᵗ (xʲ_s − xʲ_0) dxⁱ_s  e_j ⊗ e_i.

The symmetric part s(x̂) of π_{R^d⊗R^d}(x̂) is

  s(x̂_t) = (1/2)(x_t − x_0) ⊗ (x_t − x_0),

while the anti-symmetric part a(x̂) of π_{R^d⊗R^d}(x̂) is

  a(x̂_t) = Σ_{i,j=1}^{d} A((xⁱ, xʲ); 0, t) e_i ⊗ e_j = Σ_{i<j} A((xⁱ, xʲ); 0, t) [e_i, e_j].
Hence, all previous notions and results easily extend to this case. Finally, note that the theory of rough paths also applies in the infinite-dimensional case (see [50] for example).

8.2 Using Iterated Integrals

We saw in Sections 7.1 and 8.1 that a path x ∈ Cα([0,T]; R^d) with α > 1/2 may be naturally lifted to a path x̂ in G(R^d) with

  x̂_t = 1 + x_t + Σ_{i,j=1}^{d} ∫₀ᵗ (xⁱ_r − xⁱ_0) dxʲ_r  e_i ⊗ e_j.

The term K^{i,j}(x; 0, t) = π_{e_i⊗e_j}(x̂_t) is called an iterated integral of x. Fix d ≥ 1 and consider the tensor space T^∞(R^d) defined by

  T^∞(R^d) = R ⊕ R^d ⊕ (R^d ⊗ R^d) ⊕ (R^d ⊗ R^d ⊗ R^d) ⊕ ⋯,

and, for a smooth path x : [0,T] → R^d, the iterated integrals

  K^{i₁,…,i_ℓ}(x; 0, t) = ∫_{0 ≤ t₁ ≤ ⋯ ≤ t_ℓ ≤ t} dx^{i₁}_{t₁} ⋯ dx^{i_ℓ}_{t_ℓ}

for each integer ℓ and each (i₁, …, i_ℓ) ∈ {1, …, d}^ℓ. It was first noted by K.T. Chen in the 1950s [10, 11] that the formal power series
  Ψ(x; 0, t) = Σ_{ℓ≥0} Σ_{(i₁,…,i_ℓ)∈{1,…,d}^ℓ} K^{i₁,…,i_ℓ}(x; 0, t) e_{i₁} ⊗ ⋯ ⊗ e_{i_ℓ}

in T^∞(R^d) provides an algebraic way to encode the geometric object which is the path x. Therefore, Ψ(x; 0, t) is sometimes called the signature of the path. With the tensor product ⊗, T^∞(R^d) remains a group, and thus, if x : [0,T] → R^d and y : [0,S] → R^d are two smooth paths,

  Ψ(x · y; 0, T + S) = Ψ(x; 0, T) ⊗ Ψ(y; 0, S).

In addition, if x̃ denotes the time-reversed path x̃_t = x_{T−t}, then Ψ(x̃; T) = Ψ(x; T)⁻¹. The signature characterizes x in the sense that there is a one-to-one equivalence³ between the algebraic object Ψ(x) and the geometric object x in C^∞_p([0,T]; R^d) (see also [38] for some extension).

Let [x, y] be the Lie bracket [x, y] = x ⊗ y − y ⊗ x. Denote by A^∞(R^d) the subset of T^∞(R^d) defined by

  A^∞(R^d) = R^d ⊕ [R^d, R^d] ⊕ [R^d, [R^d, R^d]] ⊕ ⋯.

This subset is stable under the Lie bracket [·,·]. The tensor space T^∞(R^d) is the universal enveloping algebra of A^∞(R^d) (see [63] for example). One may then define two maps exp : A^∞(R^d) → T^∞₁(R^d) and log : T^∞₁(R^d) → A^∞(R^d), where T^∞₁(R^d) is the subset of T^∞(R^d) of elements x such that π_R(x) = 1. They are given by

  exp(x) = 1 + x + (1/2) x⊗x + (1/6) x⊗x⊗x + ⋯,
  log(1 + x) = x − (1/2) x⊗x + (1/3) x⊗x⊗x − ⋯.

In particular, if G^∞(R^d) = exp(A^∞(R^d)), then (G^∞(R^d), ⊗) is a closed subgroup of (T^∞₁(R^d), ⊗). In addition, exp is one-to-one from A^∞(R^d) to G^∞(R^d), and log is its inverse. One of the striking results of K.T. Chen, which uses some properties of the iterated integrals, is that Ψ(x; 0, t) belongs to G^∞(R^d), or equivalently, by the Baker-Campbell-Hausdorff-Dynkin formula, that log(Ψ(x; 0, t)) belongs to A^∞(R^d).

This approach proved to be very useful, since it allows one to consider equations driven by smooth paths, or differential equations, in an algebraic setting, and it allows formal computations. Numerous topics in control theory use this point of view (see for example [23, 39, 41]).
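Chen's multiplicative property can be checked at depth 2 by composing segment exponentials exp(d) = 1 + d + d⊗d/2 in the truncated tensor algebra. A minimal sketch (names illustrative, not from the text):

```python
# Sketch of Chen's relation for the depth-2 truncated signature Ψ2:
# the signature of a concatenation is the tensor product of the signatures.
import numpy as np

def seg_exp(d):
    # exp of a linear segment with increment d, truncated at level 2
    return np.asarray(d, float), 0.5 * np.outer(d, d)

def mul(a, b):
    # truncated tensor product of (1, v, M) elements
    return a[0] + b[0], a[1] + b[1] + np.outer(a[0], b[0])

def signature2(points):
    sig = (np.zeros(2), np.zeros((2, 2)))
    for p, q in zip(points, points[1:]):
        sig = mul(sig, seg_exp(np.asarray(q, float) - np.asarray(p, float)))
    return sig

xy = [[0, 0], [1, 0], [1, 1], [0, 2]]
full = signature2(xy)
glued = mul(signature2(xy[:2]), signature2(xy[1:]))
# Ψ2(x · y) = Ψ2(x) ⊗ Ψ2(y)
assert np.allclose(full[0], glued[0]) and np.allclose(full[1], glued[1])
```

At depth k > 2 the same scheme applies, with `seg_exp` replaced by the exponential series truncated at level k.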
It was also used in the stochastic context to deal with flows of stochastic differential equations (see for example [2, 8, 24, 72], or the book [4]). For some integer k, we may truncate Ψ by considering that e_{i₁} ⊗ ⋯ ⊗ e_{i_ℓ} = 0 for all ℓ > k. For such a truncated power series Ψ_k(x) we still get the relationship Ψ_k(x · y; 0, T + S) = Ψ_k(x; 0, T) ⊗ Ψ_k(y; 0, S). In particular, we deduce that
³ In fact, for this equivalence to be exactly one-to-one, one has to eliminate the paths such that, on some time interval, x goes from a point a to a point b and then back to a by reversing the path.
Ψ_k(x|_{[s,t]}; s, t) = Ψ_k(x|_{[s,r]}; s, r) ⊗ Ψ_k(x|_{[r,t]}; r, t) for all 0 ≤ s ≤ r ≤ t ≤ T. With k = 2, we get exactly that our natural lift x̂_t = Ψ₂(x; 0, t) satisfies the relationship x̂_{s,t} = x̂_{s,r} ⊗ x̂_{r,t}.

Thus, given a path x in T1(R²), one can think of π_{e_i⊗e_j}(x_t) as the iterated integral of xʲ against xⁱ. Of course, one knows that for irregular paths there is no canonical way to define them (think of Brownian motion trajectories). Anyway, for weak geometric rough paths, these iterated integrals are approximated by iterated integrals of some smooth paths.

We may now present another heuristic argument to derive the expression of F(f, x; s, t) and then (62). This argument is the historical one (see [42, 44, 52, 55]). Consider a smooth path x : [0,T] → R^d and a smooth function f = (f₁, …, f_d). Then, using a Taylor expansion, one gets
t
fi (xr ) dxir =
s
+
d
fi (xs )(xit − xis )
i=1
1 (i1 ,...,i )∈{1,...,d}
∂ fi (x0 )K i ,...,i1 ,i (x; s, t) ∂xi1 · · · ∂xi = ET∞ (Rd ) (f )(xs )Ψ (x; s, t)
with, for z ∈ Rd , ET∞ (Rd ) (f )(z) =
0 (i1 ,...,i ∈{1,...,d}
∂ fi (z)ei1 ⊗ · · · ⊗ ei . ∂xi1 · · · ∂xi
In the usual case, we keep only the first term Σ_{i=1}^d f_i(x_s)(x_t^i − x_s^i) as an approximation of Σ_{i=1}^d ∫_s^t f_i(x_r) dx_r^i, and we use it as the summand in a Riemann sum. Keeping higher-order terms has no influence in the limit, since |K^{i₁,...,i_ℓ}(x; s, t)| ≤ (1/ℓ!) ‖x′‖_∞^ℓ (t − s)^ℓ. The idea is then, if x is α-Hölder continuous and we are given an object x(k) having the same algebraic properties as Ψ_k(x; s, t) for some integer k, to keep enough terms for the Riemann sums to converge. In [52, 55], T. Lyons and his coauthors proved that the number of terms must be k = ⌊1/α⌋. In particular, from x(k), it is possible to reconstruct an object living in T^∞(R^d), equal to Ψ(x) when x is smooth, and possessing the same algebraic properties as Ψ(x). For k = 2 and using the path x as the object x(2), we get the expression (62).

8.3 Paths with Quadratic Variation

For Brownian motion or a semi-martingale, one knows how to construct several integrals, the major ones being the Itô and the Stratonovich integrals, whose difference comes from the fact that the trajectories have finite quadratic variation.
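The role of the quadratic variation in this discrepancy is already visible on Riemann sums. In the sketch below (our own illustration, with a deterministic ±√Δt walk standing in for a Brownian sample and f(x) = x), the trapezoidal (Stratonovich-type) and left-point (Itô-type) sums differ by exactly half the discrete quadratic variation:

```python
N, T = 256, 1.0
dt = T / N
step = dt ** 0.5                      # random-walk scale |increment| = sqrt(dt)
# a fixed sign pattern standing in for coin flips
incs = [step if (3 * k) % 5 < 3 else -step for k in range(N)]
B = [0.0]
for db in incs:
    B.append(B[-1] + db)

ito = sum(B[k] * (B[k + 1] - B[k]) for k in range(N))                 # left point
strat = sum(0.5 * (B[k] + B[k + 1]) * (B[k + 1] - B[k]) for k in range(N))  # trapezoid
qv = sum((B[k + 1] - B[k]) ** 2 for k in range(N))                    # = T here
```

Algebraically, strat − ito = qv/2: the finite-partition version of the familiar T/2 correction between the two integrals.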
With the theory of rough paths, we can indeed construct a pathwise equivalent of the Itô integral. For this, we need the path to have a quadratic variation.

Definition 7. Given α ∈ (1/3, 1/2], a path x ∈ C^α([0, T]; R²) has a quadratic variation if there exists a process Q(x) ∈ C^α([0, T]; S(R²)) such that Q(x; 0) = 0 and, writing z^{⊗2} instead of z ⊗ z for z ∈ R²,

Q_n(x; t) = ((t − t^n_{M(t,n)})/(t^n_{M(t,n)+1} − t^n_{M(t,n)})) (x_{t^n_{M(t,n)+1}} − x_{t^n_{M(t,n)}})^{⊗2} + Σ_{k=0}^{M(t,n)−1} (x_{t^n_{k+1}} − x_{t^n_k})^{⊗2}

and Q(x) = lim_{n→∞} Q_n(x), where the limit holds in C^α([0, T]; S(R²)).

Remark 16. Note that, with the norm we use, this means that the components of Q(x) are 2α-Hölder continuous.

Remark 17. If x ∈ C^α([0, T]; R²) with α > 1/2, then it is easily seen that necessarily Q(x; t) = 0 for t ∈ [0, T].

The trajectories of Brownian motion and of Hölder continuous martingales present this feature (see [13, 65]). Thus, a natural expression for the equivalent of the Itô integral consists in considering the path x^n defined in (57), and in setting
D(x^n; 0, t) = Σ_{k s.t. t^n_k ≤ t} ∫_{t^n_k}^{t^n_{k+1}} E_{A(R²)}(f)(x^n_{t^n_k}) (dx^n(t^n_k)/Δ_n t) ds,

where E_{A(R²)}(f)(x^n_{t^n_k}) has been defined by (55). This construction differs from (55), since

Σ_{k s.t. t^n_k ≤ t} ∫_{t^n_k}^{t^n_{k+1}} E_{A(R²)}(f)(x_{t^n_k}) (dx^n(t^n_k)/Δ_n t) ds = Σ_{k s.t. t^n_k ≤ t} ∫_{t^n_k}^{t^n_{k+1}} E_{A(R²)}(f)(x_{t^n_k}) log(x_{t^n_k,t^n_{k+1}}) ds.

Comparing with (59) leads to

D(x^n; 0, t) = Σ_{k s.t. t^n_k ≤ t} F(f, x, t^n_k, t^n_{k+1}) − ∇f(x_{t^n_k}) s(x_{t^n_k,t^n_{k+1}}).
If x has a quadratic variation Q(x), then the components of Q(x) are 2α-Hölder continuous. In addition, the components of ∇f belong to the space Lip(γ − 1; R² → R²). Hence, since Q_n(x) converges to Q(x) and
|∇f(x_{t^n_k}) s(x_{t^n_k,t^n_{k+1}}) − ∫_{t^n_k}^{t^n_{k+1}} (1/2) ∇f(x_s) d Q_n(x; s)| ≤ Δ_n t^{α(1+γ)} ‖f‖_{Lip} ‖x‖_α^{1+γ},
we easily get convergence of the last term to the Young integral (1/2) ∫_0^T ∇f(x_r) d Q(x; r).

Thus, the limit of D(x^n; 0, t) is I(x; 0, t) − ∫_0^t (1/2) ∇f(x_s) d Q(x; s) for t ∈ [0, T]. The integral D(x) thus constructed is the same at the first level as if we had used the (1/α, 2/α)-Hölder continuous rough path (x, −(1/2) Q(x)) (see [53]).

8.4 Link with Stochastic Integrals

Itô and Stratonovich integrals are defined as limits in probability of Riemann sums. On the other hand, the rough path theory gives a pathwise definition of the integral, but the price to pay is to add some supplementary information. Is there a link between the two kinds of integrals?

Let B be a d-dimensional Brownian motion (a semi-martingale may just as well be used). A natural way to construct a rough path B lying above B is to set

π_{e_i⊗e_j}(B_t) = ∫_0^t (B_r^i − B_0^i) ∘ dB_r^j
for i, j = 1, ..., d. For the construction of B as a rough path, see for example [13, 44, 52, 65]. The process log(B) is called the Brownian motion on the Heisenberg group, and has been widely studied (see the references in Section B). Continuity of the rough path integral and the Wong-Zakai theorem allow us to identify the integral I(B; 0, T) with the Stratonovich integral given by

∫ f(B_s) ∘ dB_s = lim_{n→∞} Σ_{k=0}^{2^n−1} (1/2)(f(B_{t^n_{k+1}}) + f(B_{t^n_k}))(B_{t^n_{k+1}} − B_{t^n_k}),
where the limit is a limit in probability. We will see here that there is another relationship between the two integrals which does not invoke this continuity result, and that the constructions of the Stratonovich and Itô integrals (although under stronger conditions on the function f than those required by the classical theory) can be deduced from the rough paths theory. The theory of rough paths also gives a better intuitive understanding of the counter-examples to the Wong-Zakai theorem (see [40, 59] for SDEs and [48] in the context of rough paths).

The projection on R^d of I(B; 0, T) is given by

π_{R^d}(I(B; 0, T)) = lim_{n→∞} Σ_{k=0}^{2^n−1} f(B_{t^n_k})(B_{t^n_{k+1}} − B_{t^n_k}) + ∇f(B_{t^n_k}) π_{R^d⊗R^d}(B_{t^n_k,t^n_{k+1}})   (66)
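The effect of the second-order term in a (66)-type sum can be seen already on a smooth path, where the level-2 part of the lift is the genuine iterated integral (in dimension 1, simply half the squared increment). The sketch below (our example; the driving path and integrand are assumptions) compares the plain left-point sum with the compensated one; the latter converges one order faster:

```python
import math

def f(u):
    return math.sin(u)

def fp(u):
    return math.cos(u)

def x(t):
    return t * t                       # a smooth driving path on [0, 1]

def riemann(n):
    """Left-point sum and (66)-type compensated sum for int f(x) dx."""
    first = second = 0.0
    for k in range(n):
        s, t = k / n, (k + 1) / n
        dx = x(t) - x(s)
        x2 = 0.5 * dx * dx             # level-2 part of the smooth lift (d = 1)
        first += f(x(s)) * dx
        second += f(x(s)) * dx + fp(x(s)) * x2
    return first, second

ref = 1.0 - math.cos(1.0)              # exact value: int_0^1 sin(u) du with u = x(t)
f1, f2 = riemann(200)
```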
which we rewrite using a and s as

π_{R^d}(I(B; 0, T)) = lim_{n→∞} Σ_{k=0}^{2^n−1} f(B_{t^n_k})(B_{t^n_{k+1}} − B_{t^n_k}) + ∇f(B_{t^n_k}) a(B_{t^n_k,t^n_{k+1}}) + ∇f(B_{t^n_k}) s(B_{t^n_k,t^n_{k+1}}).
But we have seen that

f(B_{t^n_k})(B_{t^n_{k+1}} − B_{t^n_k}) + ∇f(B_{t^n_k}) s(B_{t^n_k,t^n_{k+1}}) ≈ ∫_{t^n_k}^{t^n_{k+1}} f(B_s^{Π^n}) dB_s^{Π^n}

with B^{Π^n} the piecewise linear approximation of B along the dyadic partition Π^n, and ≈ meaning that the difference between the two terms is smaller than C 2^{−nθ} with θ > 1.

On the other hand, using f(y) − f(x) = ∫_0^1 ∇f(x + τ(y − x))(y − x) dτ and the change of variable τ ↦ 2^n τ, we get that for k = 0, ..., 2^n,

Σ_{i=1}^d (f_i(B_{t^n_{k+1}}) − f_i(B_{t^n_k}))(B^i_{t^n_{k+1}} − B^i_{t^n_k})
 = ∫_{t^n_k}^{t^n_{k+1}} Σ_{i,j=1}^d (∂f_i/∂x_j)(B_s^{Π^n})(B^j_{t^n_{k+1}} − B^j_{t^n_k})(B^i_{t^n_{k+1}} − B^i_{t^n_k}) 2^n ds
 ≈ ∇f(B_{t^n_k})(B_{t^n_{k+1}} − B_{t^n_k}) ⊗ (B_{t^n_{k+1}} − B_{t^n_k}).

With (45), s(B_{s,t}) = (1/2)(B_t − B_s) ⊗ (B_t − B_s). This implies that

Σ_{i=1}^d (1/2)(f_i(B_{t^n_{k+1}}) − f_i(B_{t^n_k}))(B^i_{t^n_{k+1}} − B^i_{t^n_k}) ≈ ∇f(B_{t^n_k}) s(B_{t^n_k,t^n_{k+1}}).
Now, remark that if M_k = a(B_{t^n_k,t^n_{k+1}}), then (Σ_{ℓ=0}^k M_ℓ)_{k=0,...,2^n} forms a martingale with respect to (F_k)_{k=0,...,2^n}, where (F_t)_{t≥0} is the filtration of the Brownian motion. In addition, E[(M_k)²] ≤ 6T²/2^{2n}. Hence,

E[(Σ_{k=0}^{2^n−1} ∇f(B_{t^n_k}) a(B_{t^n_k,t^n_{k+1}}))²] ≤ (6T²/2^n) ‖∇f‖²_∞,

and the latter term converges to 0, so the sum converges to 0 in probability. Convergence in probability of the Stratonovich integral follows from this last convergence and the almost sure convergence of the rough path approximation given in (66).
Regarding the Itô integral, we lift the Brownian motion B as a rough path B′ with π_{R^d}(B′) = B and

π_{e_i⊗e_j}(B′_t) = ∫_0^t (B_r^i − B_0^i) dB_r^j = ∫_0^t (B_r^i − B_0^i) ∘ dB_r^j − (1/2) δ_{i,j} t.

Note that the anti-symmetric part a(B′) is equal to the anti-symmetric part a(B). However, due to the Wong-Zakai theorem [40], B is a geometric rough path, while B′ is not a geometric rough path. From the previous computations, we easily get

π_{R^d}(I(B′; 0, T)) = π_{R^d}(I(B; 0, T)) − (1/2) Σ_{i=1}^d ∫_0^T (∂f_i/∂x_i)(B_s) ds = ∫_0^T f(B_s) dB_s,

and thus B′ gives rise to the Itô integral. The effect of the bracket terms t ↦ ⟨B^i, B^j⟩_t = δ_{i,j} t on I(B′) with respect to I(B) is studied in Section 7.5.
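Since the correction −(1/2)δ_{i,j} t only touches the diagonal of the second level, it is a symmetric perturbation, and the anti-symmetric part (the Lévy area) is indeed unchanged. A two-line check, with arbitrary numbers of our own choosing for the second level:

```python
t = 0.7
# second level of some lift above a 2-dimensional path (arbitrary sample values)
strat2 = [[0.40, 0.10],
          [0.35, 0.15]]
# Ito-type lift: subtract (1/2) delta_{ij} t on the diagonal only
ito2 = [[strat2[i][j] - (0.5 * t if i == j else 0.0) for j in range(2)]
        for i in range(2)]

def antisym(m):
    """Anti-symmetric part a(.) of a 2x2 second-level matrix."""
    return [[0.5 * (m[i][j] - m[j][i]) for j in range(2)] for i in range(2)]
```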
9 Solving Differential Equations

The theory of rough paths may be applied to solve differential equations, since one can transform an integral equation into a differential equation using a fixed point principle. Indeed, as noted in Section 8.2, most ideas of the rough path theory come from developments around iterated integrals as a way to deal formally with ordinary differential equations. Thus, the algebraic structures we used were introduced in the context of differential equations, not integrals (see for example [10, 57, 66], and also [2, 8, 24, 72] on Stratonovich stochastic differential equations). We wish now to consider the following differential equation:

y_t = y₀ + ∫_0^t g(y_s) dx_s,   (67)
where x is an irregular path. We assume that x lives in R^d and y lives in R^m. Denote by {e₁, ..., e_d} (resp. {e′₁, ..., e′_m}) the canonical basis of R^d (resp. R^m). If one wishes to interpret this integral in the rough path sense, one first has to transform the vector field

g(z) = Σ_{i=1,...,d; k=1,...,m} e′_k g_i^k(z) ∂/∂x_i

into a differential form h which is integrated along a path (x, y) living in R^d ⊕ R^m. For this, the natural extension is

h(z, z′) = Σ_{i=1,...,d; k=1,...,m} e′_k g_i^k(z′) e_i + Σ_{i=1}^d e_i · e_i,   z ∈ R^d, z′ ∈ R^m.
Hence, if x is smooth and (67) has a smooth solution y,

(x_t, y_t) = (x₀, y₀) + ∫_0^t h(x_s, y_s) d(x_s, y_s) = (x₀, y₀) + ∫_{(x,y)|_{[0,t]}} h.
In order to deal with an irregular path x, the last integral will be defined as a rough path integral, which means that we shall consider a rough path z living above (x, y), in the tensor space T₁(R^d ⊕ R^m). We also have to extend the differential form h. For (z, z′) ∈ R^d ⊕ R^m, define the linear form E_{T₁(R^d⊕R^m)}(h)(z, z′) on T₀(R^d ⊕ R^m) by

E_{T₁(R^d⊕R^m)}(h)(z, z′) = h(z, z′) + Σ_{i=1,...,d; k,ℓ=1,...,m} e′_k (∂g_i^k/∂x_ℓ)(z′) e′_ℓ ⊗ e_i
 + Σ_{k,ℓ=1,...,m; i,j=1,...,d} e′_k ⊗ e′_ℓ g_i^k(z′) g_j^ℓ(z′) e_i ⊗ e_j + Σ_{i,j=1,...,d} e_i ⊗ e_j · e_i ⊗ e_j
 + Σ_{k=1,...,m; i,j=1,...,d} e′_k ⊗ e_j g_i^k(z′) e_i ⊗ e_j + Σ_{k=1,...,m; i,j=1,...,d} e_i ⊗ e′_k g_j^k(z′) e_i ⊗ e_j.
Then, use Remark 15 to transform this linear form into a differential form on T₁(R^d ⊕ R^m).

The idea is now to apply a Picard iteration scheme. Denote by I the integral with respect to the differential form h. If z⁰ is a rough path in C^α([0, T]; T₁(R^d ⊕ R^m)) lying above (x, y⁰) for some path y⁰ ∈ C^α([0, T]; R^m) with π_{T₁(R^d)}(z⁰) = x, then set recursively z^{k+1} = I(z^k). The problem is to study the convergence of (z^k)_{k∈N}.

Definition 8. A solution of (67) is a rough path z living in T₁(R^d ⊕ R^m) with z₀ = (x₀, y₀, 0) such that I(z; s, t) = z_{s,t} for all 0 ≤ s ≤ t ≤ T and π_{T₁(R^d)}(z) = x.

Let us start our study with the following observation: from the choice of h, π_{T₁(R^d)}(z^k) is equal to x, whatever k. In addition, to compute z^{k+1}, we need x = π_{T₁(R^d)}(z^k), π_{R^m}(z^k) and π_{R^m⊗R^d}(z^k). If z^k lies above (x, y^k), the last term corresponds to the iterated integrals of y^k against x. For proofs, the reader is referred to [55, Section 4.1, p. 296], [52, Chapter 6, p. 148] and to [53].

Theorem 4. Let x be a rough path in C^α([0, T]; T₁(R^d)). Let g₁, ..., g_d be vector fields on R^m with derivatives that are bounded and κ-Hölder continuous, where α(2 + κ) > 1. Then there exists at least one solution to (67) in C^α([0, T]; T₁(R^d ⊕ R^m)).

If g₁, ..., g_d are vector fields on R^m that are twice differentiable, and if, for i, j, k = 1, ..., d, ∂_{x_j} g_i is bounded and ∂²_{x_k,x_j} g_i is bounded and κ-Hölder continuous where α(2 + κ) > 1, then the solution of (67) is unique and x ↦ z is continuous from (C^α([0, T]; T₁(R^d)), ‖·‖_α) to (C^α([0, T]; T₁(R^d ⊕ R^m)), ‖·‖_α).
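When the driver x is smooth, the scheme z^{k+1} = I(z^k) reduces to the classical Picard iteration y^{k+1}_t = y₀ + ∫_0^t g(y^k_s) dx_s. A minimal numeric sketch of this reduced picture, with our own example g(y) = y and x_t = t, whose fixed point is y_t = e^t:

```python
import math

N = 2000
dt = 1.0 / N

def picard_step(y):
    """One iteration t -> y0 + int_0^t g(y_s) dx_s with g(y) = y, x_t = t."""
    out = [1.0]                        # y0 = 1
    for k in range(N):
        out.append(out[-1] + 0.5 * (y[k] + y[k + 1]) * dt)   # trapezoid rule
    return out

y = [1.0] * (N + 1)                    # initial guess: the constant path y^0 = y0
for _ in range(30):                    # Picard error decays like 1/k!
    y = picard_step(y)

ref = math.exp(1.0)                    # exact solution at t = 1
```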
Remark 18. The map x ↦ z is called the Itô map. Its differentiability is studied in [51, 52], in [49] (for α > 1/2) and in [32].

Here again, because of the continuity of x ↦ z, we get that if x belongs to C^α([0, T]; G(R^d)), then z ∈ C^α([0, T]; G(R^d ⊕ R^m)), and if x belongs to C^{0,α}([0, T]; G(R^d)), then z ∈ C^{0,α}([0, T]; G(R^d ⊕ R^m)).

Finally, the solution of (67) may also be interpreted using an Euler scheme, as in [32], following [16]. In addition, A.M. Davie proved in [16] that there exists a unique solution if the g_i are of class C², and that the solution may fail to be unique if the g_i only have Hölder continuous derivatives.
A Carnot Groups and Homogeneous Gauges and Norms

Let (G, ×) be a Lie group, and (g, [·, ·]) be its Lie algebra. G is a Carnot group of step k [4, 60] if for some positive integer k, g = V₁ ⊕ V₂ ⊕ ··· ⊕ V_k, this decomposition being called a stratification, with [V₁, V_i] = V_{i+1} for i = 1, ..., k − 1 and [V₁, V_k] = {0}, where [V_i, V_j] = {[x, y] | x ∈ V_i, y ∈ V_j}.

A Carnot group is naturally equipped with a dilation operator δ_λ(x) = (λ^{α₁} x₁, ..., λ^{α_k} x_k) with x_i ∈ exp(V_i) and some positive real numbers α₁, ..., α_k, where exp is the map from g to G. This dilation operator must satisfy δ_λ(x × y) = δ_λ(x) × δ_λ(y). If the dimension of V₁ is finite, the real number N = α₁ dim(V₁) + ··· + α_k dim(V_k) is called the homogeneous dimension.

On G equipped with a dilation operator δ, a homogeneous gauge is a continuous function x ↦ ‖x‖ mapping G into the non-negative real numbers such that ‖x‖ = 0 if and only if x is the neutral element of G, and for all λ ∈ R, ‖δ_λ(x)‖ = |λ| · ‖x‖. A homogeneous gauge is a homogeneous norm if ‖x^{−1}‖ = ‖x‖ for all x ∈ G. In addition, a homogeneous norm is said to be sub-additive if ‖x × y‖ ≤ ‖x‖ + ‖y‖ for all x, y ∈ G.

If V₁ is of finite dimension, then a homogeneous norm always exists [27]. For this, equip the Lie algebra g with the Euclidean norm |·| and denote by exp the canonical diffeomorphism from g to G. For x ∈ g, let r(x) be the smallest positive real such that |δ_{r(x)} x| = 1, which exists since r ↦ |δ_r x| is increasing from [0, +∞) to [0, +∞). Then, for y ∈ G, ‖y‖ = 1/r(exp^{−1} y) defines a symmetric homogeneous norm.

Two homogeneous gauges ‖·‖ and ‖·‖′ are said to be equivalent if for some constants C and C′, C‖x‖ ≤ ‖x‖′ ≤ C′‖x‖ for all x ∈ G.

Proposition 10 ([34]). If the dimension of V₁ is finite, then all homogeneous gauges are equivalent. In addition, for a homogeneous gauge ‖·‖, there exist some constants C and C′ such that ‖x^{−1}‖ ≤ C‖x‖ and ‖x × y‖ ≤ C′(‖x‖ + ‖y‖) for all x, y ∈ G.
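The simplest non-abelian example, used throughout this text, can be coded directly: the step-2 group with Lie algebra R² ⊕ [R², R²], elements written (a₁, a₂, c) with c the [e₁, e₂]-coordinate, the Baker-Campbell-Hausdorff product (only one bracket survives in step 2), the dilation δ_λ and the homogeneous gauge max{|a|, |c|^{1/2}}. The sketch and names below are ours:

```python
import math

def mul(x, y):
    """BCH product in a step-2 algebra: (a, c)(b, e) has bracket term (1/2)[a, b]."""
    (a1, a2, c), (b1, b2, e) = x, y
    return (a1 + b1, a2 + b2, c + e + 0.5 * (a1 * b2 - a2 * b1))

def dil(lam, x):
    """Dilation: weight 1 on V1, weight 2 on V2 = [V1, V1]."""
    a1, a2, c = x
    return (lam * a1, lam * a2, lam ** 2 * c)

def gauge(x):
    """Homogeneous gauge max{|a|, |c|^(1/2)}."""
    a1, a2, c = x
    return max(math.hypot(a1, a2), math.sqrt(abs(c)))

x = (1.0, -0.5, 0.25)
y = (0.3, 2.0, -1.0)
lam = 3.0
```

The tests check that δ_λ is a group morphism, that the gauge is |λ|-homogeneous, and that (−a₁, −a₂, −c) is the group inverse.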
Proof. If exp^{−1}(x) is decomposed as (y₁, ..., y_k) with y_i ∈ V_i for x ∈ G, then set ‖x‖ = Σ_{i=1}^k |y_i|^{1/i}, where |·| denotes the Euclidean norm on each of the finite-dimensional vector spaces V_i. It is easily verified that x ↦ ‖x‖ is a homogeneous gauge.

Let ‖·‖′ be another homogeneous gauge. Set φ(x) = ‖x‖′/‖x‖. Then φ and 1/φ are continuous on G \ {1}, where 1 is the neutral element of G. As {x ∈ G | ‖x‖ = 1} is compact, we easily get that φ and 1/φ are bounded, and then that for some constants C and C′, C ≤ ‖x‖′ ≤ C′ when ‖x‖ = 1. This implies that ‖·‖′ and ‖·‖ are equivalent, by using the dilation δ_{1/‖x‖} for a general x. The other results are proved in a similar way, using φ(x) = ‖x^{−1}‖/‖x‖ and φ(x, y) = ‖x × y‖/(‖x‖ + ‖y‖).

It follows that any homogeneous gauge ‖·‖ can be transformed into an equivalent homogeneous norm by setting ‖x‖′ = ‖x‖ + ‖x^{−1}‖. The notion of a Lipschitz function is then extended to homogeneous gauges.

Definition 9. If (G, ×) and (G′, ×′) are two nilpotent Carnot groups with homogeneous gauges ‖·‖ and ‖·‖′, then f : G → G′ is said to be Lipschitz if for some constant C, ‖f(x)^{−1} ×′ f(y)‖′ ≤ C‖x^{−1} × y‖ for all x, y ∈ G.

The group A(R²) (and thus (G(R²), ⊗)) is obviously a Carnot group of step 2 with V₁ = R² and V₂ = [R², R²], and δ_λ(x) = (λx₁, λ²x₂). Its homogeneous dimension is 4. Homogeneous norms and gauges are easily constructed: it is sufficient to consider ‖x‖ = |x₁| + |x₂|^{1/2} or ‖x‖ = max{|x₁|, |x₂|^{1/2}}, either on A(R²) or G(R²). Of course, if ‖·‖ is a homogeneous gauge on A(R²), then ‖·‖′ defined by ‖x‖′ = ‖log(x)‖ is a homogeneous gauge on G(R²).
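Proposition 10 can be spot-checked on A(R²): the sum-type and max-type gauges just mentioned satisfy max ≤ sum ≤ 2·max pointwise, hence are equivalent with C = 1 and C′ = 2. A sketch with sample points of our own choosing:

```python
import math

def g_sum(x):
    """Gauge |x1| + |x2|^(1/2) on A(R^2), with x = (a1, a2, c)."""
    a1, a2, c = x
    return math.hypot(a1, a2) + math.sqrt(abs(c))

def g_max(x):
    """Gauge max{|x1|, |x2|^(1/2)} on A(R^2)."""
    a1, a2, c = x
    return max(math.hypot(a1, a2), math.sqrt(abs(c)))

samples = [(0.5, -1.2, 0.3), (2.0, 0.0, -4.0), (0.0, 0.1, 0.01), (1.0, 1.0, 1.0)]
```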
B Brownian Motion on the Heisenberg Group

We have seen in Section 8.4 that Brownian motion is naturally lifted as a rough path, and that the corresponding integrals are the usual Itô or Stratonovich integrals. The tangent space of A(R²) may be identified with A(R²), and we denote by ∂_x, ∂_y and ∂_z the basis of T_x A(R²) at a point x deduced from the canonical coordinates e₁, e₂ and [e₁, e₂]. Let V¹, V² and V³ be the left-invariant vector fields that coincide respectively with ∂_x, ∂_y and ∂_z at 0: for all a, x ∈ A(R²) and all smooth functions f on A(R²), V^i f(ax) = V^i(f ∘ L_a)(x), where L_a(x) = ax, for i = 1, 2, 3. We have seen in Section 6.12 that the V^i are decomposed in the basis {∂_x, ∂_y, ∂_z} as
V¹ = ∂_x − (1/2) y ∂_z,  V² = ∂_y + (1/2) x ∂_z  and  V³ = ∂_z.

Remark that [V¹, V²] = V³ and [V^i, V^j] = 0 in all other cases. The tangent space at any point of A(R²) is then equipped with a scalar product ⟨·, ·⟩ such that ⟨V^i, V^j⟩ = δ_{i,j} for i, j = 1, 2, 3, i.e., for which {V¹, V², V³} forms an orthonormal basis. With this scalar product, A(R²) becomes a Riemannian manifold.

Let B = (B¹, B²) be a two-dimensional Brownian motion, and B^n = (B^{n,1}, B^{n,2}) for n = 1, 2, ... be a family of piecewise linear approximations of B along a family of deterministic partitions whose meshes decrease to 0. We then consider the solution X of the Stratonovich SDE

X_t = ∫_0^t V¹(X_s) ∘ dB_s¹ + ∫_0^t V²(X_s) ∘ dB_s²

as well as the solutions X^n of the ordinary differential equations

X^n_t = ∫_0^t V¹(X^n_s) dB_s^{1,n} + ∫_0^t V²(X^n_s) dB_s^{2,n}.
Using the decomposition of the V^i in the coordinates {∂_x, ∂_y, ∂_z}, we get

X_t = B_t¹ e₁ + B_t² e₂ + A(B¹, B²; 0, t)[e₁, e₂],  where  A(B¹, B²; 0, t) = (1/2) ∫_0^t B_s¹ ∘ dB_s² − (1/2) ∫_0^t B_s² ∘ dB_s¹

is the Lévy area of (B¹, B²). As already mentioned in Section 8.4, the process X is the Brownian motion on the Heisenberg group. Similarly, we get

X^n_t = B_t^{1,n} e₁ + B_t^{2,n} e₂ + A(B^{1,n}, B^{2,n}; 0, t)[e₁, e₂],

and it is known from the Wong-Zakai theorem [40] that X^n converges in probability to X (with a dyadic partition, we get almost sure convergence in the α-Hölder norm for any α < 1/2 [13, 65]). Note that the piecewise smooth curves X^n are horizontal curves, so that in this case the natural approximation of X ∈ C^α([0, T]; A(R²)) is provided by the piecewise linear approximations of (B¹, B²), naturally lifted as paths in A(R²). Many processes share this property: see for example [13, 15, 45].

This is a special case of a Brownian motion in a Lie group. Its short time behavior and its density have already been widely studied: see for example [1, 2, 4, 6, 33]. From the Hörmander theorem, as {V¹, V², [V¹, V²]} spans the tangent space at any point, one knows that for any t > 0, X_t has a density on the three-dimensional space A(R²), although it is constructed from a two-dimensional Brownian motion. The infinitesimal generator of X is
L = (1/2)(V¹)² + (1/2)(V²)².

Since (V¹)² = ∂²_x − y ∂²_{xz} + (1/4) y² ∂²_z and (V²)² = ∂²_y + x ∂²_{yz} + (1/4) x² ∂²_z, this gives

L = (1/2) ∂²_x + (1/2) ∂²_y + (1/2) x ∂²_{zy} − (1/2) y ∂²_{zx} + (1/8)(x² + y²) ∂²_z.

This is a hypoelliptic generator.
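The Lévy area appearing in X has a simple geometric meaning that can be checked numerically: for a closed curve, (1/2)∮(x dy − y dx) is the enclosed (signed) area. Our sketch below uses the piecewise-linear interpolation of the unit circle on N chords, for which the chord-by-chord value of (1/2)(x dy − y dx) is exact, and recovers the area π of the disc:

```python
import math

N = 256
pts = [(math.cos(2 * math.pi * k / N), math.sin(2 * math.pi * k / N))
       for k in range(N + 1)]          # closed polygon: last point = first

area = 0.0
for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
    # exact value of (1/2)(x dy - y dx) along a straight chord
    area += 0.5 * (x0 * y1 - x1 * y0)
```

The inscribed polygon underestimates the area, which gives the second assertion below.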
C From Almost Rough Paths to Rough Paths

C.1 Theorems and Proofs

In this section we prove Theorem 2 on almost rough paths, which we rewrite in a more general setting than that of Hölder continuous norms. We set Δ₊ = {(s, t) ∈ [0, T]² | 0 ≤ s ≤ t ≤ T}. A control is a function ω : Δ₊ → R₊ such that ω is continuous, ω is super-additive, i.e.,

∀ 0 ≤ s < t < u ≤ T,  ω(s, t) + ω(t, u) ≤ ω(s, u),

and ω(t, t) = 0 for all t ∈ [0, T]. If ω is super-additive and θ ≥ 1, then ω^θ is also super-additive.

Recall that for x = (ξ, x¹, x²) in T_ξ(R^d) with ξ = 0 or ξ = 1, we have defined ‖x‖ = max{|x¹|, |x²|^{1/2}}. We also set ‖x‖′ = max{|x¹|, |x²|}. These two norms are not equivalent, but they define the same topology. For a continuous path x with values in T₁(R^d), introduce the norms

‖x‖_{p,ω} = sup_{0≤s<t≤T} ‖x_{s,t}‖/ω(s, t)^{1/p}

and

‖x‖_{∗,p,ω} = sup_{0≤s<t≤T} max{ |x¹_{s,t}|/ω(s, t)^{1/p}, |x²_{s,t}|/ω(s, t)^{2/p} }

with x¹ = π_{R^d}(x) and x² = π_{R^d⊗R^d}(x). Note that ‖x‖_{∗,p,ω} is finite if and only if ‖x‖_{p,ω} is finite. Hence, we denote by C^{p,ω}([0, T]; T₁(R^d)) the space of continuous paths with values in T₁(R^d) for which ‖x‖_{p,ω} (or equivalently ‖x‖_{∗,p,ω}) is finite. We rewrite the first part of Theorem 2 with a control ω.

Remark 19. The case of α-Hölder continuous paths corresponds to ω(s, t) = t − s and p = 1/α. All the results we gave about existence of the integral, solving a differential equation, etc., may be written using a general control ω(s, t) instead of ω(s, t) = t − s and the appropriate norms ‖·‖_{p,ω} and ‖·‖_{∗,p,ω}. Similarly, we are not bound to use dyadic partitions, although some results may be related to dyadic partitions (see for example [13] for an application to semi-martingales), and they are in general computationally simpler.
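The two defining properties of a control are easy to check numerically. The sketch below (our illustration) verifies super-additivity of ω(s, t) = t − s on a grid, together with the fact stated above that ω^θ remains super-additive for θ ≥ 1:

```python
def omega(s, t):
    """The basic control omega(s, t) = t - s."""
    return t - s

theta = 1.5            # any exponent theta >= 1 preserves super-additivity
grid = [i / 10 for i in range(11)]
checks = []
for s in grid:
    for t in grid:
        for u in grid:
            if s <= t <= u:
                checks.append(omega(s, t) + omega(t, u) <= omega(s, u) + 1e-12)
                checks.append(omega(s, t) ** theta + omega(t, u) ** theta
                              <= omega(s, u) ** theta + 1e-12)
```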
Theorem 5. Let (x_{s,t})_{(s,t)∈Δ₊} be a family of elements of T₁(R^d) such that for some θ > 1 and K > 0, ‖x‖_{p,ω} < +∞ and

‖x_{s,t} − x_{s,r} ⊗ x_{r,t}‖ ≤ K ω(s, t)^θ   (68)

for all 0 ≤ s ≤ r ≤ t ≤ T. We call such a family an almost rough path controlled by ω. Then there exists a rough path y in C^{p,ω}([0, T]; T₁(R^d)) such that

‖y_{s,t} − x_{s,t}‖ ≤ C ω(s, t)^θ   (69)
for some constant C that depends only on K, θ, p, ω(0, T) and ‖x‖_{∗,p,ω}. In addition, y is unique up to the value of y₀. Moreover, if x_{s,t} belongs to G(R^d) for any 0 ≤ s ≤ t ≤ T, then y is a weak geometric rough path with p-variation controlled by ω.

We give two proofs of this theorem. The first proof concerns the general case, and is taken from [55]. The other proof is a simpler one in the case ω(s, t) = t − s, adapted from [22]. For integrals, where x_{s,t} = f(z_s) z_{s,t} for some rough path z of finite p-variation, one can find some increasing, continuous function φ : [0, T] → R₊ such that z ∘ φ is Hölder continuous (see [9], and [13] for an example of application in the context of rough paths), so that in many cases one can indeed consider that ω(s, t) = t − s (as the integral of a differential form along a path is insensitive to a change of time).

Proof. Remark first that if α(n) = Π_{i=1}^n (1 + α_i) with α_i ∈ T₀(R^d), then

α(n) = 1 + Σ_{i=1}^n α_i + Σ_{i=1}^n Σ_{j=i+1}^n α_i ⊗ α_j.
Hence, if α′(n) = Π_{i=1}^n (1 + α′_i) with α′_k = α_k + ζ for some k ∈ {1, ..., n} and α′_i = α_i for i ≠ k, then

α′(n) = α(n) + ζ + ζ ⊗ Σ_{j=k+1}^n α_j + Σ_{i=1}^{k−1} α_i ⊗ ζ.   (70)
For a partition π = {t_k}_{k=1}^{n+1} of [s, t] with t₁ = s and t_{n+1} = t, set

x^{(π)}_{s,t} = Π_{k=1}^n x_{t_k,t_{k+1}}.   (71)

Put x^{1,(π)}_{s,t} = π_{R^d}(x^{(π)}_{s,t}) and x¹_{s,t} = π_{R^d}(x_{s,t}). Let t_i be some point of π (other than s and t), and set π̃ = π \ {t_i}. Then

x^{1,(π)}_{s,t} − x^{1,(π̃)}_{s,t} = x¹_{t_{i−1},t_i} + x¹_{t_i,t_{i+1}} − x¹_{t_{i−1},t_{i+1}}.
For a partition π = {t_i}_{i=1,...,n+1} with n + 1 points in [s, t] and t₁ = s, t_{n+1} = t, pick a point t_i such that ω(t_{i−1}, t_{i+1}) ≤ 2ω(s, t)/n. This is possible if n > 3, thanks to Lemma 2.2.1 from [55, p. 244]. Then

|x^{1,(π)}_{s,t} − x^{1,(π̃)}_{s,t}| ≤ K (2^θ/n^θ) ω(s, t)^θ.

If π has 3 elements {t₁, t₂, t₃} with t₁ = s and t₃ = t, then |x^{1,(π)}_{s,t} − x¹_{s,t}| ≤ Kω(s, t)^θ. Thus, by removing the points of π one by one, choosing carefully which element of the partition is suppressed at each step, we get that

|x^{1,(π)}_{s,t} − x¹_{s,t}| ≤ 2^θ ζ(θ) K ω(s, t)^θ   (72)
with ζ(θ) = Σ_{n≥1} 1/n^θ. This is true for any partition π, whatever its size.

Consider now a sequence of partitions π^n of [0, T] whose meshes decrease to 0. We set π^n[s, t] = (π^n ∩ [s, t]) ∪ {s, t}. Then for any (s, t) ∈ Δ₊, (x^{1,(π^n[s,t])}_{s,t})_{n∈N} has a convergent subsequence. One can extract a subsequence (n_k)_{k∈N} such that (x^{1,(π^{n_k}[s,t])}_{s,t})_{k∈N} converges for any (s, t) ∈ Δ₊ with s, t ∈ Q. Denote by y_{s,t} one of the possible limits for (s, t) ∈ Δ₊, s, t ∈ Q. With K₁ = K 2^θ ζ(θ) and with (72), we get that

|y_{s,t} − x¹_{s,t}| ≤ K₁ ω(s, t)^θ.   (73)
As ω is continuous and x¹_{s,t} converges to 0 as |t − s| → 0, we may extend y by continuity to Δ₊. In addition, for 0 ≤ s < r < t ≤ T with r ∈ π^n,

x^{1,(π^n[s,t])}_{s,t} = x^{1,(π^n[s,r])}_{s,r} + x^{1,(π^n[r,t])}_{r,t}.

Choosing the partitions π^n such that π^n ⊂ π^{n+1} and π^n ⊂ Q for each n, and passing to the limit for r ∈ π^{n_{k₀}} for some k₀ and s, t ∈ Q, we get y_{s,t} = y_{s,r} + y_{r,t}. Using the continuity of y, this is true for any 0 ≤ s < r < t ≤ T. Define y_t = y_{0,t} and remark that y_{s,t} = y_t − y_s.

Now, consider another function z on [0, T] with values in R^d satisfying |z_t − z_s − x¹_{s,t}| ≤ 2^θ ζ(θ) K ω(s, t)^θ for all (s, t) ∈ Δ₊. Since

|(y_t − y_s) − (z_t − z_s)| ≤ |(y_t − y_s) − x¹_{s,t}| + |(z_t − z_s) − x¹_{s,t}|

for (s, t) ∈ Δ₊, we get |(y_t − y_s) − (z_t − z_s)| ≤ 2K₁ ω(s, t)^θ. Thus, the difference ỹ_t = y_t − z_t is controlled by ω^θ with θ > 1, and is necessarily constant: along the dyadic partition {t^n_k} of [0, t],

|ỹ_t − ỹ₀| ≤ Σ_{k=0,...,2^n−1} |ỹ_{t^n_{k+1}} − ỹ_{t^n_k}| ≤ 2K₁ ω(0, t) sup_{k=0,...,2^n−1} ω(t^n_k, t^n_{k+1})^{θ−1},

and this converges to 0.
We now have to construct the second level of the rough path. For this purpose, set z_{s,t} = 1 + y_t − y_s + π_{R^d⊗R^d}(x_{s,t}), and, for a partition π with s and t as endpoints, define z^{(π)}_{s,t} as x^{(π)}_{s,t} in (71), with z in place of x. Note that z is also an almost rough path, since

z_{s,t} − z_{s,r} ⊗ z_{r,t} = x²_{s,t} − x²_{s,r} − x²_{r,t} − x¹_{s,r} ⊗ x¹_{r,t} − (z¹_{s,r} − x¹_{s,r}) ⊗ z¹_{r,t} − x¹_{s,r} ⊗ (z¹_{r,t} − x¹_{r,t}),

and therefore, with (73),

‖z_{s,t} − z_{s,r} ⊗ z_{r,t}‖ ≤ K₂ ω(s, t)^θ

for some constant K₂ that depends only on K, K₁, ‖x‖_{∗,p,ω} and ω(0, T). For 0 ≤ s < t ≤ T and a partition π = {t_i}_{i=1,...,n+1} of [s, t] with n + 1 points and t₁ = s, t_{n+1} = t, for some i ∈ {2, ..., n} and π̃ = π \ {t_i},

‖z^{(π)}_{s,t} − z^{(π̃)}_{s,t}‖ ≤ K₂ ω(t_{i−1}, t_{i+1})^θ.

One may choose t_i such that ω(t_{i−1}, t_{i+1}) ≤ 2ω(s, t)/n. Hence, as previously,

‖z^{(π)}_{s,t} − z_{s,t}‖ ≤ 2^θ ζ(θ) K₂ ω(s, t)^θ.   (74)
Then the same arguments apply, and one can show that for all (s, t) ∈ Δ₊ there exists y_{s,t} ∈ T₁(R^d) such that π_{R^d}(y_{s,t}) = y_t − y_s (where y is the function previously defined at the first level), y_{s,t} = y_{s,r} ⊗ y_{r,t} for all 0 ≤ s ≤ r ≤ t ≤ T, and ‖y_{s,t} − x_{s,t}‖ ≤ K₃ ω(s, t)^θ with K₃ = 2^θ ζ(θ) K₂. In particular, (s, t) ↦ y_{s,t} is continuous on Δ₊ and t ↦ y_{0,t} is a rough path in C^{p,ω}([0, T]; T₁(R^d)) lying above y.

Let ỹ be another rough path in C^{p,ω}([0, T]; T₁(R^d)) lying above y and such that ‖ỹ_{s,t} − x_{s,t}‖ ≤ K₃ ω(s, t)^θ. Hence

‖y_{s,t} − ỹ_{s,t}‖ ≤ |y²_{s,r} − z²_{s,r}| + |y²_{r,t} − z²_{r,t}| + |ỹ²_{s,r} − z²_{s,r}| + |ỹ²_{r,t} − z²_{r,t}|

for all 0 ≤ s ≤ r ≤ t ≤ T. As previously, it follows that y_{s,t} = ỹ_{s,t} for all (s, t) ∈ Δ₊. This proves that y is unique up to an additive constant.

The question is now to know whether or not y is also the limit of (x^{(π^n)})_{n∈N} for a family of partitions (π^n)_{n∈N} whose meshes decrease to 0. With the notation from the beginning of the proof, if {α_i}_{i=1,...,n} is a family of elements of T₀(R^d) and {η_i}_{i=1,...,n} belongs to R^d, then

Π_{i=1}^n (1 + α_i + η_i) = Π_{i=1}^n (1 + α_i) + Σ_{i=1}^n η_i + Σ_{i=1}^{n−1} η_i ⊗ Σ_{j=i+1}^n α_j + Σ_{i=1}^{n−1} α_i ⊗ Σ_{j=i+1}^n η_j + Σ_{i=1}^{n−1} η_i ⊗ Σ_{j=i+1}^n η_j.
Now, set α_i = x_{t_i,t_{i+1}} − 1 and η_i = y_{t_i,t_{i+1}} − x¹_{t_i,t_{i+1}} for some partition π = {t_i}_{i=1,...,n+1} of [s, t]. Then for some constant C₁,

|Σ_{i=1}^n η_i| ≤ C₁ Σ_{i=1}^n ω(t_i, t_{i+1})^θ ≤ C₁ ω(0, T) sup_{i=1,...,n} ω(t_i, t_{i+1})^{θ−1}.

This last term converges to 0 as the mesh decreases. Next, remark that

Σ_{i=1}^{n−1} α_i ⊗ Σ_{j=i+1}^n η_j = Σ_{j=2}^n (Σ_{i=1}^{j−1} α_i) ⊗ η_j.

But from (72), for k ∈ {2, ..., n},

|Σ_{i=1}^{k−1} α_i| = |x^{1,(π∩[s,t_k])}_{s,t_k}| ≤ K₁ ω(s, t)^θ + ‖x‖_{∗,p,ω} ω(s, t)^{1/p}.

It follows that for some constant C₂ depending only on ‖x‖_{∗,p,ω}, K₁, ω(0, T), θ and p,

|Σ_{j=2}^n (Σ_{i=1}^{j−1} α_i) ⊗ η_j| ≤ C₂ ω(s, t) sup_{i=2,...,n} ω(t_i, t_{i+1})^{θ−1}.

Similarly,

|Σ_{i=1}^{n−1} η_i ⊗ Σ_{j=i+1}^n α_j| ≤ C₂ ω(s, t) sup_{i=1,...,n−1} ω(t_i, t_{i+1})^{θ−1}

and

|Σ_{i=1}^{n−1} η_i ⊗ Σ_{j=i+1}^n η_j| ≤ K₁² ω(s, t)² sup_{i=1,...,n} ω(t_i, t_{i+1})^{2θ−2}.

It follows that for some constant C₃ depending on C₂, K₁, θ and ω(0, T),

‖x^{(π)}_{s,t} − z^{(π)}_{s,t}‖ ≤ C₃ ω(s, t) sup_{i=1,...,n} ω(t_i, t_{i+1})^{θ−1}.
This proves that if (π^n)_{n∈N} is a family of partitions whose meshes converge to 0 as n → ∞, then x^{(π^n)}_{s,t} converges to y_{s,t}. In addition, combined with (73) and (74), this gives (69). The last assertion of the theorem follows from the fact that x^{(π)} belongs to G(R^d) whenever each x_{s,t} belongs to G(R^d), since G(R^d) is a closed subgroup of T₁(R^d).

Proof (Alternative proof of Theorem 5 when ω(s, t) = K₁(t − s)). Define a distance on T₁(R^d) by d(x, y) = ‖x − y‖. Note that
d(x ⊗ z, y ⊗ z) ≤ d(x, y)(1 + ‖z‖)   (75)

and

d(z ⊗ x, z ⊗ y) ≤ d(x, y)(1 + ‖z‖)   (76)

for all x, y, z ∈ T₁(R^d). For 0 ≤ s ≤ t ≤ T, set r = (t + s)/2, x⁰_{s,t} = x_{s,t} and, recursively,

x^{n+1}_{s,t} = x^n_{s,r} ⊗ x^n_{r,t}.

By the triangle inequality,

d(x^{n+2}_{s,t}, x^{n+1}_{s,t}) ≤ d(x^{n+1}_{s,r} ⊗ x^{n+1}_{r,t}, x^{n+1}_{s,r} ⊗ x^n_{r,t}) + d(x^{n+1}_{s,r} ⊗ x^n_{r,t}, x^n_{s,r} ⊗ x^n_{r,t}).

With (75) and (76),

d(x^{n+2}_{s,t}, x^{n+1}_{s,t}) ≤ d(x^{n+1}_{r,t}, x^n_{r,t})(1 + ‖x^{n+1}_{s,r}‖) + d(x^{n+1}_{s,r}, x^n_{s,r})(1 + ‖x^n_{r,t}‖).   (77)
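The midpoint-refinement recursion just defined can be tried on the simplest almost-multiplicative functional there is: a scalar, level-1 object x_{s,t} = f(s)(t − s), for which ⊗ reduces to addition, the defect (68) is of order (t − s)², and the limit is ∫_s^t f. A sketch with an example f of our own choosing:

```python
import math

def x0(s, t):
    """Almost additive functional: defect x0(s,t) - x0(s,r) - x0(r,t) = O((t-s)^2)."""
    return math.sin(s) * (t - s)

def refine(x, s, t, n):
    """n midpoint-refinement steps of x on [s, t]; the product is just + here."""
    if n == 0:
        return x(s, t)
    r = 0.5 * (s + t)
    return refine(x, s, r, n - 1) + refine(x, r, t, n - 1)

approx = refine(x0, 0.0, 1.0, 12)       # 2^12 dyadic subintervals
exact = 1.0 - math.cos(1.0)             # the additive limit: int_0^1 sin
```

Each refinement level halves the defect of each subinterval, which is the mechanism behind the Cauchy estimate derived from (77).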
Set

V_n(τ) = sup_{0≤s≤t≤s+τ} d(x^{n+1}_{s,t}, x^n_{s,t})  and  h_n(τ) = sup_{0≤s≤t≤s+τ} ‖x^n_{s,t}‖.

From (77),

V_{n+1}(τ) ≤ (2 + h_n(τ/2) + h_{n+1}(τ/2)) V_n(τ/2).

Choose 2 < κ < 2^θ. As V₀(τ) ≤ K(K₁τ)^θ, the quantity

V(τ) = Σ_{n=0}^{+∞} κ^n V₀(τ/2^n)

is finite. Remark that h_{n+1}(τ) ≤ h_n(τ) + V_n(τ) ≤ h₀(τ) + V(τ). Fix τ₀ such that 1 + h₀(τ₀) + V(τ₀) < κ/2. This is possible since h₀(τ) and V(τ) converge to 0 as τ decreases to 0. Assume that

V_n(τ) ≤ κ^n V₀(τ/2^n) for τ ≤ τ₀.   (78)
For τ ≤ τ₀, 2 + h_n(τ/2) + h_{n+1}(τ/2) ≤ κ, and then V_{n+1}(τ) ≤ κ^{n+1} V₀(τ/2^{n+1}). Hence (78) is true for any n, and Σ_{n≥0} V_n(τ) ≤ V(τ) converges. This means that (x^n_{s,t})_{n∈N} is a Cauchy sequence for all (s, t) ∈ Δ₊ such that t − s ≤ τ. Denote by y_{s,t} the limit of (x^n_{s,t})_{n∈N}, which is continuous in s and t. This limit satisfies y_{s,t} = y_{s,r} ⊗ y_{r,t} with r = (t + s)/2. In addition, d(y_{s,t}, x_{s,t}) ≤ C(t − s)^θ for some constant C. We extend y_{s,t} to all (s, t) ∈ Δ₊ by setting y_{s,t} = y_{t^m_0,t^m_1} ⊗ ··· ⊗ y_{t^m_{2^m−1},t^m_{2^m}} for the partition t^m_i = s + i(t − s)2^{−m}, i = 0, ..., 2^m, when m is large enough so that (t − s) ≤ τ₀ 2^m. We easily
get that y is mid-point additive, that y does not depend on m, that y_{s,t} = y_{s,r} ⊗ y_{r,t} for r = (t + s)/2, and that d(y_{s,t}, x_{s,t}) ≤ C′(t − s)^θ for (s, t) ∈ Δ₊ with possibly another constant C′.

Let us now prove that y is unique. Let z be another function from Δ₊ to T₁(R^d) which satisfies z_{s,t} = z_{s,r} ⊗ z_{r,t} for r = (t + s)/2 and

d(z_{s,t}, x_{s,t}) ≤ C(t − s)^θ   (79)

for some C > 0 and any (s, t) ∈ Δ₊. For (s, t) ∈ Δ₊ and r = (t + s)/2,

d(y_{s,t}, z_{s,t}) ≤ d(y_{s,t}, y_{s,r} ⊗ z_{r,t}) + d(y_{s,r} ⊗ z_{r,t}, z_{s,t})
 ≤ d(y_{r,t}, z_{r,t})(1 + ‖y_{s,r}‖) + d(y_{s,r}, z_{s,r})(1 + ‖z_{r,t}‖) ≤ κ(τ/2) W(τ/2),

where W(τ) = sup_{s≤t≤s+τ} d(y_{s,t}, z_{s,t}) and

κ(τ) = 2 + sup_{0≤t−s≤τ} ‖y_{s,t}‖ + sup_{0≤t−s≤τ} ‖z_{s,t}‖.

Thus, W(τ) ≤ κ(τ) W(τ/2). Now, note that

W(τ) ≤ sup_{s≤t≤s+τ} ( d(y_{s,t}, x_{s,t}) + d(z_{s,t}, x_{s,t}) ) ≤ 2Cτ^θ.

Then, if τ ≤ τ₀ with κ(τ₀) < 2^θ,

W(τ) ≤ κ(τ₀)^n W(τ/2^n) ≤ 2C κ(τ₀)^n τ^θ / 2^{nθ} → 0 as n → ∞,

which means that W(τ) = 0 for τ ∈ [0, τ₀]. Using the fact that both y and z are mid-point additive, we get that y_{s,t} = z_{s,t} for all (s, t) ∈ Δ₊, and that y is unique.

Now, fix (s, t) ∈ Δ₊ and n ∈ N. Set
z_{s,t} = y_{t^n_0,t^n_1} ⊗ ··· ⊗ y_{t^n_{n−1},t^n_n} for t^n_i = s + (t − s)i/n. Note that z_{s,t} = z_{s,r} ⊗ z_{r,t} for r = (t + s)/2 and (s, t) ∈ Δ₊. It follows that

d(z_{s,t}, x_{s,t}) ≤ d(Π_{i=0}^{n−1} y_{t^n_i,t^n_{i+1}}, y_{t^n_0,t^n_1} ⊗ x_{t^n_1,t^n_n}) + d(y_{t^n_0,t^n_1} ⊗ x_{t^n_1,t^n_n}, x_{t^n_0,t^n_1} ⊗ x_{t^n_1,t^n_n}) + d(x_{t^n_0,t^n_1} ⊗ x_{t^n_1,t^n_n}, x_{t^n_0,t^n_n})
 ≤ d(Π_{i=1}^{n−1} y_{t^n_i,t^n_{i+1}}, x_{t^n_1,t^n_n})(1 + ‖y_{t^n_0,t^n_1}‖) + d(y_{t^n_0,t^n_1}, x_{t^n_0,t^n_1})(1 + ‖x_{t^n_1,t^n_n}‖) + K|t − s|^θ
 ≤ C₁ d(Π_{i=1}^{n−1} y_{t^n_i,t^n_{i+1}}, x_{t^n_1,t^n_n}) + K|t − s|^θ + C₂ |t − s|^θ/n^θ
for some constants C₁ and C₂ that depend only on T, K and K₁. Applying recursively the same computation leads to d(z_{s,t}, x_{s,t}) ≤ C₃|t − s|^θ for some constant C₃ that depends on K, T, K₁ and n. We have previously proved that any function z : Δ₊ → T₁(R^d) which satisfies (79) is equal to y, so y_{s,t} = Π_{i=0}^{n−1} y_{t^n_i,t^n_{i+1}}. Then, y_{s,t} = y_{s,s+p(t−s)} ⊗ y_{s+p(t−s),t} for all rational p ∈ [0, 1]. From the continuity of (s, t) ∈ Δ₊ ↦ y_{s,t}, we deduce that y_{s,r} ⊗ y_{r,t} = y_{s,t} for any r ∈ [s, t], (s, t) ∈ Δ₊.

Theorem 6. Let x and y be two almost rough paths, both satisfying (68) with the same constants K and θ.

(i) Assume that there exists an ε > 0 such that ‖x − y‖_{∗,p,ω} ≤ ε. Then there exists some function ε ↦ K(ε), depending only on K, θ, p, ω(0, T), ‖x‖_{∗,p,ω} and ‖y‖_{∗,p,ω}, such that the two rough paths x̂ and ŷ associated to x and y by Theorem 5 satisfy ‖x̂ − ŷ‖_{∗,p,ω} ≤ K(ε), with K(ε) → 0 as ε → 0.

(ii) If in addition, for all 0 ≤ s ≤ r ≤ t ≤ T,

‖x_{s,t} − x_{s,r} ⊗ x_{r,t} − (y_{s,t} − y_{s,r} ⊗ y_{r,t})‖ ≤ ε ω(s, t)^θ,

then K(ε) = K′ε for some constant K′ depending only on K, θ, p, ω(0, T), ‖x‖_{∗,p,ω} and ‖y‖_{∗,p,ω}.

Proof. We first prove statement (ii) of this theorem. We use the same notations as previously. For a partition π = {t_i}_{i=1,...,n+1} of [s, t] with t₁ = s, t_{n+1} = t, consider x^{(π)}_{s,t} and y^{(π)}_{s,t} as above. Pick a point t_i in π such that ω(t_{i−1}, t_{i+1}) ≤ 2ω(s, t)/n. For

ξ = x_{t_{i−1},t_i} ⊗ x_{t_i,t_{i+1}} − x_{t_{i−1},t_{i+1}} − (y_{t_{i−1},t_i} ⊗ y_{t_i,t_{i+1}} − y_{t_{i−1},t_{i+1}}),

we get with (70) that x^{(π)}_{s,t} − x^{(π̃)}_{s,t} − (y^{(π)}_{s,t} − y^{(π̃)}_{s,t}) may be expanded in terms of ξ, of y_{t_{i−1},t_i} ⊗ y_{t_i,t_{i+1}} − y_{t_{i−1},t_{i+1}}, and of the partial sums of the x¹_{t_j,t_{j+1}} and of the x¹_{t_j,t_{j+1}} − y¹_{t_j,t_{j+1}} for j = 1, ..., n, j ≠ i, where x¹_{s,t} (resp. y¹_{s,t}) is the projection of x_{s,t} (resp. y_{s,t}) on R^d. With (72), we get that, for some constant C that depends only on K, θ, p, ω(0, T), ‖x‖_{p,ω} and ‖y‖_{p,ω},
\[ \sum_{\substack{j=1,\dots,n\\ j\neq i}} \bigl|x^1_{t_j,t_{j+1}} - y^1_{t_j,t_{j+1}}\bigr| \leq \bigl|x^{1,(\pi)}_{s,t}\bigr| + \bigl|y^{1,(\pi)}_{s,t}\bigr| \leq \bigl(C\,\omega(s,t)^{\theta-1/p} + \|y\|_{p,\omega} + \|x\|_{p,\omega}\bigr)\,\omega(s,t)^{1/p}. \]
Thus, for some constant K′,
\[ \bigl\|x^{(\pi)}_{s,t} - x^{(\pi')}_{s,t} - \bigl(y^{(\pi)}_{s,t} - y^{(\pi')}_{s,t}\bigr)\bigr\| \leq \varepsilon K'\,\frac{\omega(s,t)^\theta}{n^\theta}. \]
It follows, by carefully removing all the points of π one after the other, that
\[ \bigl\|x^{(\pi)}_{s,t} - x_{s,t} - \bigl(y^{(\pi)}_{s,t} - y_{s,t}\bigr)\bigr\| \leq \varepsilon\,\zeta(\theta)\,K'\,\omega(s,t)^\theta. \]
As we have seen that x^{(π)} and y^{(π)} converge to x̂_{s,t} and ŷ_{s,t}, we deduce that
\[ \bigl\|\widehat x_{s,t} - x_{s,t} - \bigl(\widehat y_{s,t} - y_{s,t}\bigr)\bigr\| \leq \varepsilon\,\zeta(\theta)\,K'\,\omega(s,t)^\theta. \]
The result is then easily deduced.

Now, to prove statement (i), we just have to remark that, for some 1/θ < η < 1,
\begin{align*}
\|x_{s,t} - x_{s,r}\otimes x_{r,t} - y_{s,t} + y_{s,r}\otimes y_{r,t}\|
&\leq 2^{\eta-1}\bigl(\|x_{s,t} - x_{s,r}\otimes x_{r,t}\|^\eta + \|y_{s,t} - y_{s,r}\otimes y_{r,t}\|^\eta\bigr)\\
&\qquad\times\bigl(\|x_{s,t} - y_{s,t}\| + \|x_{s,r}\otimes x_{r,t} - y_{s,r}\otimes y_{r,t}\|\bigr)^{1-\eta}\\
&\leq C\,\omega(s,t)^{\eta\theta+(1-\eta)/p}\,\varepsilon^{1-\eta}
\end{align*}
for some constant C that depends only on η, θ and ω(0,T), and then to apply the result of (ii), replacing ε by ε^{1−η} and θ by ηθ.

C.2 An Algebraic Interpretation

We now give an algebraic interpretation of this construction, strongly inspired by the one given by M. Gubinelli in [36]. Consider the sets Δ1 ≔ [0,T], and
Δ2 ≔ {(s,t) : 0 ≤ s ≤ t ≤ T} and Δ3 ≔ {(s,r,t) : 0 ≤ s ≤ r ≤ t ≤ T},
and call C_i the set of functions from Δ_i to T1(R^d) for i = 1, 2, 3. Introduce the operator δ from C1 ∪ C2 to C2 ∪ C3 defined by
\begin{align*}
\delta(x)_{s,t} &= x_s^{-1}\otimes x_t, &&(s,t)\in\Delta_2,\ x\in C_1,\\
\delta(x)_{s,r,t} &= x_{s,t} - x_{s,r}\otimes x_{r,t}, &&(s,r,t)\in\Delta_3,\ x\in C_2;
\end{align*}
hence δ maps C_i to C_{i+1} for i = 1, 2. Note that if x ∈ C1, then δ(δ(x)) = 0, so the range Range(δ|_{C1}) of δ|_{C1} is contained in the kernel Ker(δ|_{C2}) of δ|_{C2}. In fact, a stronger result holds.
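As a sanity check (an illustration added here, not part of the original text), the inclusion Range(δ|_{C1}) ⊂ Ker(δ|_{C2}) can be verified numerically. The sketch below represents an element 1 + a¹ + a² of T1(R^d), truncated at level 2, as a pair (a¹, a²); the dimension d, the path length and the random increments are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 2  # dimension of R^d (an arbitrary illustrative choice)

def mul(a, b):
    # truncated tensor product: (1+a1+a2)(1+b1+b2) = 1 + (a1+b1) + (a2+b2+a1⊗b1)
    return (a[0] + b[0], a[1] + b[1] + np.outer(a[0], b[0]))

def inv(a):
    # (1+a1+a2)^{-1} = 1 - a1 - a2 + a1⊗a1, truncated at level 2
    return (-a[0], -a[1] + np.outer(a[0], a[0]))

# a path x in C1 with values in T1(R^d), sampled at times 0,...,5
x = [(np.zeros(d), np.zeros((d, d)))]
for _ in range(5):
    g = (0.1 * rng.standard_normal(d), 0.1 * rng.standard_normal((d, d)))
    x.append(mul(x[-1], g))

def delta1(s, t):
    # (delta x)_{s,t} = x_s^{-1} ⊗ x_t, an element of C2
    return mul(inv(x[s]), x[t])

def delta2(y, s, r, t):
    # (delta y)_{s,r,t} = y_{s,t} - y_{s,r} ⊗ y_{r,t}, taken componentwise
    a, b = y(s, t), mul(y(s, r), y(r, t))
    return (a[0] - b[0], a[1] - b[1])

for (s, r, t) in [(0, 2, 5), (1, 3, 4)]:
    z = delta2(delta1, s, r, t)
    assert np.allclose(z[0], 0.0) and np.allclose(z[1], 0.0)  # δ∘δ = 0
```

The identity δ(δ(x)) = 0 holds exactly (up to rounding) because x_r ⊗ x_r^{-1} is the unit of the truncated algebra, so x_s^{-1} ⊗ x_r ⊗ x_r^{-1} ⊗ x_t = x_s^{-1} ⊗ x_t.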
Lemma 21. The range of δ|_{C1} is equal to the kernel of δ|_{C2}, and δ is injective from C1(x) into C2, where C1(x) denotes the set of paths x in C1 with x_0 = x, for x ∈ T1(R^d). In particular, when restricted to Range(δ|_{C1(x)}), δ is invertible.

Proof. We have already seen the inclusion of Range(δ|_{C1}) in Ker(δ|_{C2}). Now, let x ∈ C2 belong to the kernel Ker(δ|_{C2}), and set y_t ≔ x_{0,t}. As δ(x)_{0,s,t} = y_t − y_s ⊗ x_{s,t} = 0, we get x_{s,t} = y_s^{-1} ⊗ y_t and thus x = δ(y). This proves the result. If two paths x and y are distinct in C1(x), then δ(x)_{0,t} = x^{-1} ⊗ x_t differs from δ(y)_{0,t} = x^{-1} ⊗ y_t for some t, so δ is injective from C1(x) into C2.

Given a rough path x, which then belongs to C1, and a differential form f, the integral I(x) = ∫ f(x) dx is also a path in C1(1). The idea is then to consider an approximation of I(x; s, t) for t − s small, and to project it on the range of δ|_{C1(x)}. Of course, the approximation of I(x; s, t) has to be close enough to the range of δ|_{C1(x)}. For p ≥ 1 and θ > 1, define the distance d_{θ,ω} on C2 by
\[ d_{\theta,\omega}(x,y) = \sup_{(s,t)\in\Delta_+} \frac{\|x_{s,t}-y_{s,t}\|}{\omega(s,t)^\theta}. \]
To simplify the notation, extend δ|_{C2} to a function defined on Δ2 × [0,T] by setting δ(x)_{s,r,t} = 1 if r ∉ [s,t]. For a fixed r ∈ [0,T], δ(x)_{·,r,·} is then a function in C2. Theorem 5 is then rewritten the following way.

Theorem 7. For K, K′ > 0 and θ > 1, denote by B(K, K′, θ, p, ω) the subset of functions x ∈ C2 for which ‖x‖_{p,ω} ≤ K and
\[ \sup_{r\in[0,T]} d_{\theta,\omega}\bigl(\delta(x)_{\cdot,r,\cdot},\, 0\bigr) \leq K'. \]
Then to any x in B(K, K′, θ, p, ω) is associated a unique element x̂ of Ker(δ|_{C2}). In addition, for some constants C1 and C2 that depend only on K, K′, θ, p and ω(0,T),
\[ \|\widehat x\|_{p,\omega} \leq C_1 \quad\text{and}\quad d_{\theta,\omega}(x,\widehat x) \leq C_2. \]
Moreover, if one defines a distance Θ_{p,θ,ω} on ∪_{K,K′>0} B(K, K′, θ, p, ω) by
\[ \Theta_{p,\theta,\omega}(x,y) = \max\bigl\{\|x-y\|_{p,\omega},\, d_{\theta,\omega}(x,y)\bigr\}, \]
then the map Π : x ↦ x̂ is locally Lipschitz with respect to Θ_{p,θ,ω}.

From the definition, an almost rough path x of p-variation controlled by ω belongs to ∪_{K,K′>0} B(K, K′, θ, p, ω). It is then "projected" on an element Π(x) of C2 in the kernel of δ|_{C2}, which is also equal to the image of δ|_{C1(1)}. The inverse image of Π(x) then gives a rough path in Cp,ω([0,T]; T1(R^d)).
Given an element f of Lip(γ; R^d → R^m) with γ > p − 1, the map F(f, x) defined by (61) is an element of C2. The integral I may then be defined as the composition of maps
\[ I = \delta|_{C_1(1)}^{-1} \circ \Pi \circ F(f,\cdot), \]
which corresponds to the construction given in Section 7.4.
References

1. R. Azencott. Géodésiques et diffusions en temps petit, vol. 84 of Astérisque. Société Mathématique de France, Paris, 1981.
2. G. Ben Arous. Flots et séries de Taylor stochastiques. Probab. Theory Related Fields, 81:1, 29–77, 1989.
3. A. Baker. Matrix groups: An introduction to Lie group theory. Springer Undergraduate Mathematics Series. Springer-Verlag London Ltd., London, 2002.
4. F. Baudoin. An introduction to the geometry of stochastic flows. Imperial College Press, London, 2004.
5. D. Burago, Y. Burago and S. Ivanov. A course in metric geometry, vol. 33 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 2001.
6. J.-M. Bismut. Large deviations and the Malliavin calculus, vol. 45 of Progress in Mathematics. Birkhäuser Boston Inc., Boston, MA, 1984.
7. A. Bensoussan, J.-L. Lions and G. Papanicolaou. Asymptotic Analysis for Periodic Structures. North-Holland, 1978.
8. F. Castell. Asymptotic expansion of stochastic flows. Probab. Theory Related Fields, 96:2, 225–239, 1993.
9. V.V. Chistyakov and O.E. Galkin. On maps of bounded p-variation with p > 1. Positivity, 2:1, 19–45, 1998.
10. K.-T. Chen. Integration of paths, geometric invariants and a generalized Baker-Hausdorff formula. Ann. of Math. (2), 65, 163–178, 1957.
11. K.-T. Chen. Integration of Paths—A Faithful Representation of Paths by Noncommutative Formal Power Series. Trans. Amer. Math. Soc., 89:2, 395–407, 1958.
12. Z. Ciesielski. On the isomorphisms of the spaces Hα and m. Bull. Acad. Polon. Sci. Sér. Sci. Math. Astronom. Phys., 8, 217–222, 1960.
13. L. Coutin and A. Lejay. Semi-martingales and rough paths theory. Electron. J. Probab., 10:23, 761–785, 2005.
14. L. Coutin. An introduction to (stochastic) calculus with respect to fractional Brownian motion. In Séminaire de Probabilités XL, vol. 1899 of Lecture Notes in Math., pp. 3–65. Springer, 2007.
15. L. Coutin and Z. Qian. Stochastic analysis, rough path analysis and fractional Brownian motions. Probab. Theory Related Fields, 122:1, 108–140, 2002. <doi: 10.1007/s004400100158>.
16. A.M. Davie. Differential Equations Driven by Rough Signals: an Approach via Discrete Approximation. Appl. Math. Res. Express. AMRX, 2, Art. ID abm009, 40, 2007.
17. J.J. Duistermaat and J.A.C. Kolk. Lie groups. Universitext. Springer-Verlag, Berlin, 2000.
18. R.M. Dudley and R. Norvaiša. An introduction to p-variation and Young integrals—with emphasis on sample functions of stochastic processes, 1998. Lecture given at the Centre for Mathematical Physics and Stochastics, Department of Mathematical Sciences, University of Aarhus. Available on the web site <www.maphysto.dk>.
19. H. Doss. Liens entre équations différentielles stochastiques et ordinaires. Ann. Inst. H. Poincaré Sect. B (N.S.), 13:2, 99–125, 1977.
20. C.T.J. Dodson and T. Poston. Tensor geometry: The geometric viewpoint and its uses, vol. 130 of Graduate Texts in Mathematics. Springer-Verlag, Berlin, 2nd edition, 1991.
21. D. Feyel and A. de La Pradelle. Curvilinear integrals along enriched paths. Electron. J. Probab., 11:35, 860–892, 2006.
22. D. Feyel, A. de La Pradelle and G. Mokobodzki. A non-commutative sewing lemma. Electron. Commun. Probab., 13, 24–34, 2008.
23. M. Fliess. Fonctionnelles causales non linéaires et indéterminées non commutatives. Bull. Soc. Math. France, 109:1, 3–40, 1981.
24. M. Fliess and D. Normand-Cyrot. Algèbres de Lie nilpotentes, formule de Baker-Campbell-Hausdorff et intégrales itérées de K.T. Chen. In Séminaire de Probabilités, XVI, vol. 920, pp. 257–267. Springer, Berlin, 1982.
25. G.B. Folland. Harmonic analysis in phase space, vol. 122 of Annals of Mathematics Studies. Princeton University Press, Princeton, NJ, 1989.
26. P. Friz. Continuity of the Itô-Map for Hölder rough paths with applications to the support theorem in Hölder norm. In Probability and Partial Differential Equations in Modern Applied Mathematics, vol. 140 of IMA Volumes in Mathematics and its Applications, pp. 117–135. Springer, 2005.
27. G.B. Folland and E.M. Stein. Hardy spaces on homogeneous groups, vol. 28 of Mathematical Notes. Princeton University Press, 1982.
28. P. Friz and N. Victoir. Multidimensional Stochastic Processes as Rough Paths. Theory and Applications. Cambridge University Press, 2009.
29. P. Friz and N. Victoir. A Note on the Notion of Geometric Rough Paths. Probab. Theory Related Fields, 136:3, 395–416, 2006. <doi: 10.1007/s00440-005-0487-7>.
30. P. Friz and N. Victoir. Differential Equations Driven by Gaussian Signals I. Cambridge University (preprint), 2007.
31. P. Friz and N. Victoir. Differential Equations Driven by Gaussian Signals II. Cambridge University (preprint), 2007.
32. P. Friz and N. Victoir. Euler Estimates of Rough Differential Equations. J. Differential Equations, 244:2, 388–412, 2008.
33. B. Gaveau. Principe de moindre action, propagation de la chaleur et estimées sous elliptiques sur certains groupes nilpotents. Acta Math., 139:1-2, 95–153, 1977.
34. R. Goodman. Filtrations and asymptotic automorphisms on nilpotent Lie groups. J. Differential Geometry, 12:2, 183–196, 1977.
35. M. Gromov. Carnot-Carathéodory spaces seen from within. In Sub-Riemannian geometry, vol. 144 of Progr. Math., pp. 79–323. Birkhäuser, 1996.
36. M. Gubinelli. Controlling rough paths. J. Funct. Anal., 216:1, 86–140, 2004.
37. B.C. Hall. Lie groups, Lie algebras, and representations, vol. 222 of Graduate Texts in Mathematics. Springer-Verlag, New York, 2003.
38. B.M. Hambly and T.J. Lyons. Uniqueness for the signature of a path of bounded variation and continuous analogues for the free group. Oxford University (preprint), 2006.
39. A. Isidori. Nonlinear control systems. Springer-Verlag, Berlin, 3rd edition, 1995.
40. N. Ikeda and S. Watanabe. Stochastic differential equations and diffusion processes, vol. 24 of North-Holland Mathematical Library. North-Holland Publishing Co., Second edition, 1989.
41. M. Kawski. Nonlinear control and combinatorics of words. In Geometry of feedback and optimal control, vol. 207 of Monogr. Textbooks Pure Appl. Math., pp. 305–346. Dekker, New York, 1998.
42. T. Lyons, M. Caruana and T. Lévy. Differential Equations Driven by Rough Paths. In École d'été de probabilités de Saint-Flour XXXIV—2004, edited by J. Picard, vol. 1908 of Lecture Notes in Math., Berlin, 2007. Springer.
43. A. Lejay. On the convergence of stochastic integrals driven by processes converging on account of a homogenization property. Electron. J. Probab., 7:18, 1–18, 2002.
44. A. Lejay. An introduction to rough paths. In Séminaire de probabilités XXXVII, vol. 1832 of Lecture Notes in Mathematics, pp. 1–59. Springer-Verlag, 2003.
45. A. Lejay. Stochastic Differential Equations driven by processes generated by divergence form operators I: a Wong-Zakai theorem. ESAIM Probab. Stat., 10, 356–379, 2006. <doi: 10.1051/ps:2006015>.
46. A. Lejay. Stochastic Differential Equations driven by processes generated by divergence form operators II: convergence results. ESAIM Probab. Stat., 12, 387–411, 2008. <doi: 10.1051/ps:2007040>.
47. P. Lévy. Processus stochastiques et mouvement brownien. Gauthier-Villars & Cie, Paris, 2e édition, 1965.
48. A. Lejay and T. Lyons. On the importance of the Lévy area for systems controlled by converging stochastic processes. Application to homogenization. In New Trends in Potential Theory, Conference Proceedings, Bucharest, September 2002 and 2003, edited by D. Bakry, L. Beznea, Gh. Bucur and M. Röckner, pp. 63–84. The Theta Foundation, 2006.
49. X.D. Li and T.J. Lyons. Smoothness of Itô maps and diffusion processes on path spaces. I. Ann. Sci. École Norm. Sup., 39:4, 649–677, 2006.
50. M. Ledoux, T. Lyons and Z. Qian. Lévy area of Wiener processes in Banach spaces. Ann. Probab., 30:2, 546–578, 2002.
51. T. Lyons and Z. Qian. Flow of diffeomorphisms induced by a geometric multiplicative functional. Probab. Theory Related Fields, 112:1, 91–119, 1998.
52. T. Lyons and Z. Qian. System Control and Rough Paths. Oxford Mathematical Monographs. Oxford University Press, 2002.
53. A. Lejay and N. Victoir. On (p, q)-rough paths. J. Differential Equations, 225:1, 103–133, 2006. <doi: 10.1016/j.jde.2006.01.018>.
54. T. Lyons and N. Victoir. An Extension Theorem to Rough Paths. Ann. Inst. H. Poincaré Anal. Non Linéaire, 24:5, 835–847, 2007. <doi: 10.1016/j.anihpc.2006.07.004>.
55. T.J. Lyons. Differential equations driven by rough signals. Rev. Mat. Iberoamericana, 14:2, 215–310, 1998.
56. T.J. Lyons and T. Zhang. Decomposition of Dirichlet Processes and its Application. Ann. Probab., 22:1, 494–524, 1994.
57. W. Magnus. On the exponential solution of differential equations for a linear operator. Comm. Pure Appl. Math., 7, 649–673, 1954.
58. R. Mahony and J.H. Manton. The Geometry of the Newton Method on Non-Compact Lie Groups. J. Global Optim., 23, 309–327, 2002.
59. E.J. McShane. Stochastic differential equations and models of random processes. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability (Univ. California, Berkeley, Calif., 1970/1971), Vol. III: Probability theory, pp. 263–294. Univ. California Press, 1972.
60. R. Montgomery. A tour of subriemannian geometries, their geodesics and applications, vol. 91 of Mathematical Surveys and Monographs. American Mathematical Society, 2002.
61. J. Musielak and Z. Semadeni. Some classes of Banach spaces depending on a parameter. Studia Math., 20, 271–284, 1961.
62. A. Millet and M. Sanz-Solé. Approximation of rough paths of fractional Brownian motion. In Seminar on Stochastic Analysis, Random Fields and Applications V, pp. 275–303, Progr. Probab., Birkhäuser, 2008.
63. C. Reutenauer. Free Lie algebras, vol. 7 of London Mathematical Society Monographs. New Series. Oxford University Press, 1993.
64. R.A. Ryan. Introduction to Tensor Products of Banach Spaces. Springer-Verlag, 2002.
65. E.-M. Sipiläinen. A pathwise view of solutions of stochastic differential equations. PhD thesis, University of Edinburgh, 1993.
66. R.S. Strichartz. The Campbell-Baker-Hausdorff-Dynkin formula and solutions of differential equations. J. Funct. Anal., 72:2, 320–345, 1987.
67. D.H. Sattinger and O.L. Weaver. Lie groups and algebras with applications to physics, geometry, and mechanics, vol. 61 of Applied Mathematical Sciences. Springer-Verlag, New York, 1993. Corrected reprint of the 1986 original.
68. K. Tapp. Matrix groups for undergraduates, vol. 29 of Student Mathematical Library. American Mathematical Society, 2005.
69. V.S. Varadarajan. Lie groups, Lie algebras, and their representations, vol. 102 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1984. Reprint of the 1974 edition.
70. N. Victoir. Lévy area for the free Brownian motion: existence and non-existence. J. Funct. Anal., 208:1, 107–121, 2004.
71. F.W. Warner. Foundations of differentiable manifolds and Lie groups, vol. 94 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1983. Corrected reprint of the 1971 edition.
72. Y. Yamato. Stochastic differential equations and nilpotent Lie algebras. Z. Wahrsch. Verw. Gebiete, 47:2, 213–229, 1979.
73. L.C. Young. An inequality of the Hölder type, connected with Stieltjes integration. Acta Math., 67, 251–282, 1936.
Monotonicity of the Extremal Functions for One-dimensional Inequalities of Logarithmic Sobolev Type

Laurent Miclo

Laboratoire d'Analyse, Topologie, Probabilités, UMR 6632, CNRS, 39, rue F. Joliot-Curie, 13453 Marseille cedex 13, France. e-mail: [email protected]

Summary. In various one-dimensional functional inequalities, the optimal constants can be found by considering only monotone functions. We study the discrete and continuous settings (and their relationships); we are interested in Poincaré or logarithmic Sobolev inequalities, and several variants obtained by modifying entropy and energy terms.

Keywords: Poincaré inequality, (modified) logarithmic Sobolev inequality, monotonicity of extremal functions, linear diffusions, birth and death process

MSC2000: 46E35, 46E39, 49R50, 26A48, 26D10, 60E15
1 Introduction and Result

On the Borel σ-field of R, let μ be a probability and ν a positive measure. We are interested in the logarithmic Sobolev constant C(μ, ν) defined (with the usual conventions 1/∞ = 0, 1/0 = ∞ and, most important, 0·∞ = 0) by
\[ C(\mu,\nu) := \sup_{f\in\mathcal C} \frac{\operatorname{Ent}(f^2,\mu)}{\nu[(f')^2]} \;\in\; \bar{\mathbb R}_+ \tag{1} \]
where C is the set of all absolutely continuous functions f on R; f′ denotes the weak derivative of f. Recall that in general the entropy of a positive, measurable function f with respect to a probability μ is defined as
\[ \operatorname{Ent}(f,\mu) := \begin{cases} \mu[f\ln(f)] - \mu[f]\ln(\mu[f]) & \text{if } f\ln(f) \text{ is } \mu\text{-integrable},\\ +\infty & \text{otherwise,} \end{cases} \]
and that this quantity belongs to R̄₊, as an immediate consequence of Jensen's inequality applied to the convex map R₊ ∋ x ↦ x ln(x) ∈ R.

C. Donati-Martin et al. (eds.), Séminaire de Probabilités XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6_2, © Springer-Verlag Berlin Heidelberg 2009
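For a discrete probability this entropy is immediate to compute. The following sketch (an added illustration, with arbitrary data, not from the text) checks the two facts just recalled: nonnegativity by Jensen's inequality, and Ent(c, μ) = 0 for constant functions c, using the convention 0 ln 0 = 0:

```python
import math

def entropy(f, mu):
    """Ent(f, mu) = mu[f ln f] - mu[f] ln(mu[f]) for f >= 0 and a
    discrete probability mu, with the convention 0 ln 0 = 0."""
    mf = sum(p * v for p, v in zip(mu, f))
    mflnf = sum(p * v * math.log(v) for p, v in zip(mu, f) if v > 0)
    return mflnf - (mf * math.log(mf) if mf > 0 else 0.0)

mu = [0.2, 0.3, 0.5]
f = [1.0, 2.0, 4.0]
assert entropy(f, mu) >= 0.0                     # Jensen: Ent(f, mu) >= 0
assert abs(entropy([3.0, 3.0, 3.0], mu)) < 1e-12  # constants have zero entropy
```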
One of our aims is to show that the above definition of C(μ, ν) is not modified when restricted to monotone functions:

Theorem 1. Calling D the cone in C consisting of all functions f such that f′ ≥ 0 a.e., one has
\[ C(\mu,\nu) = \sup_{f\in\mathcal D} \frac{\operatorname{Ent}(f^2,\mu)}{\nu[(f')^2]} \;\in\; \bar{\mathbb R}_+. \]
This can be illustrated by the most famous case of the logarithmic Sobolev inequality (due to Gross [10]), where μ = ν is a (non-degenerate) Gaussian distribution; then the maximising functions are exactly the exponentials R ∋ x ↦ exp(ax + b) with a ∈ R* and b ∈ R (see Carlen's article [4]).

We shall also be interested in the following discrete version of the preceding result. For a given N ∈ N*, consider the discrete segment E ≔ {0, 1, ..., N} as a linear non-oriented graph; call A ≔ {{l, l+1} : 0 ≤ l < N} the set of its edges. Denote by C the set of functions defined on E. If f ∈ C, its discrete derivative f′ is defined on A by
\[ \forall\, 0\le l<N,\qquad f'(\{l,l+1\}) := f(l+1) - f(l). \]
Let also be given a probability μ on E and a measure ν on A. These notations enable us to reinterpret (1) in this new setting, and, as above, our main concern will be to prove:

Theorem 2. In this discrete framework, one has
\[ C(\mu,\nu) = \sup_{f\in\mathcal D} \frac{\operatorname{Ent}(f^2,\mu)}{\nu[(f')^2]} \;\in\; \bar{\mathbb R}_+, \]
where D is the cone in C consisting of those functions with positive derivative.

In fact, using interlinks between the continuous and discrete contexts, one can pass from one result to the other. So we shall start with the discrete situation, which is more immediate and better illustrates our itinerary; then similar properties in the continuous framework will derive from the discrete ones. The discrete proof can also be directly translated, but precautions must be taken; more on this later.

These monotonicity properties will also be extended to some modified logarithmic Sobolev inequalities (discrete, as in Wu [18], or continuous, in the sense of Gentil, Guillin and Miclo [9]). More precisely, in the discrete case, one would like to replace the energy term ν[(f′)²] by the quantity E_ν(f², ln(f²)) defined for f ∈ C by
\[ \mathcal E_\nu(f^2,\ln(f^2)) := \sum_{\{l,l+1\}\in A} \nu(\{l,l+1\})\,\bigl[f^2(l+1)-f^2(l)\bigr]\,\bigl[\ln(f^2(l+1))-\ln(f^2(l))\bigr]; \]
observe that this quantity is quadratically homogeneous. This will be done in
Theorem 3. Consider the case E = Z, with the previous notations extended to this setting. One has
\[ \sup_{f\in\mathcal C} \frac{\operatorname{Ent}(f^2,\mu)}{\mathcal E_\nu(f^2,\ln(f^2))} = \sup_{f\in\mathcal D} \frac{\operatorname{Ent}(f^2,\mu)}{\mathcal E_\nu(f^2,\ln(f^2))}. \]
In the continuous framework, let H : R₊ → R₊ be a convex function such that H(0) = 0 and H′(0) = 1. We now wish to replace the energy term with the following quadratically homogeneous quantity:
\[ \forall\, f\in\mathcal C,\qquad \mathcal E_{H,\nu}(f) := \int H\!\left(\left(\frac{f'}{f}\right)^{\!2}\right) f^2\,d\nu, \]
where by convention the integrand equals (f′)² on the set where f vanishes. As before, one then has

Theorem 4. If μ is a probability on R and ν a measure on R, one has
\[ \sup_{f\in\mathcal C} \frac{\operatorname{Ent}(f^2,\mu)}{\mathcal E_{H,\nu}(f)} = \sup_{f\in\mathcal D} \frac{\operatorname{Ent}(f^2,\mu)}{\mathcal E_{H,\nu}(f)}. \]
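As a quick sanity check (an added illustration, not from the text): taking H(u) = u recovers the classical energy ν[(f′)²], and since any admissible H is convex with H(0) = 0 and H′(0) = 1, it lies above its tangent at 0, so H(u) ≥ u and hence E_{H,ν}(f) ≥ ν[(f′)²]. A discretised check, with an arbitrary smooth positive f on [0,1] and ν the Lebesgue measure:

```python
import math

# discretize [0, 1]; nu = Lebesgue measure, f a positive smooth function
n = 1000
h = 1.0 / n
xs = [(i + 0.5) * h for i in range(n)]
f = [2.0 + math.sin(2 * math.pi * x) for x in xs]
fp = [2 * math.pi * math.cos(2 * math.pi * x) for x in xs]  # f'

def energy(H):
    # Riemann sum for E_{H,nu}(f) = integral of H((f'/f)^2) f^2 dnu
    return sum(H((dp / v) ** 2) * v ** 2 * h for v, dp in zip(f, fp))

classical = sum(dp ** 2 * h for dp in fp)               # nu[(f')^2]
assert abs(energy(lambda u: u) - classical) < 1e-9      # H(u) = u recovers it
assert energy(lambda u: math.exp(u) - 1) >= classical   # H(u) >= u by convexity
```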
Similar results will be obtained when it is the entropy which is modified; for a precise statement, see sub-section 5.3. But our main motivation comes from the modified logarithmic Sobolev inequalities of Theorems 3 and 4, because we hope that the monotonicity properties we have established will eventually make it possible to apply Hardy inequalities. Indeed, the link between Hardy and modified logarithmic Sobolev inequalities is still poorly understood, whereas that between Hardy and Poincaré, or classical logarithmic Sobolev, inequalities is clear (see for instance Bobkov and Götze's article [3]). Besides, let us mention that similar results for the Poincaré constant have already been obtained, in the discrete case by Chen (in the proof of Theorem 3.2 in [7]) and in the continuous case by Chen and Wang (Proposition 6.4 in [6]; see also the end of the proof of Theorem 1.1 in Chen [8]), for diffusions which are regular enough. Their method partially rests on the equation satisfied by a maximising function (which then is an eigenvector associated to the spectral gap). But it does not clearly adapt to logarithmic Sobolev inequalities, nor even, in the case of the Poincaré constant, to the irregular situations considered above (see for instance the continuity hypothesis needed in the second part of Theorem 1.3 of Chen [8]); therefore we prefer another approach. In particular, we do not a priori deal with the problem of existence of a maximising function (which is crucial in the approach of Chen and Wang [6, 8]). Furthermore, it may be preferable to attack this existence question a posteriori, when the discussion is restricted to increasing functions; for rather regular situations, see also the last remark in Section 4. Still in the case of the Poincaré constant, observe that the equation giving the maximising functions (if they exist) is not easily exploited, for it
already involves the Poincaré constant, which is unknown in general. Moreover, if in this equation the constant is replaced by the inverse of an eigenvalue other than 0 and the spectral gap, the functions which satisfy this new equation are the corresponding eigenvectors, which are not monotone (under irreducibility hypotheses; see for instance [12]). Therefore we prefer to base our approach on Dirichlet forms rather than on the equation possibly satisfied by the maximising functions. Let us add that, at least in the case of the Poincaré constant, some monotonicity properties can also be obtained when the underlying graph is a tree; see [12] for a description of the eigenspace associated to the spectral gap (in the discrete case).

The outline of the article is as follows: the next section deals with monotonicity properties for the spectral gap; they have to be considered first, to treat the case when no extremal function exists in the above logarithmic Sobolev inequalities. The situations where one exists will then be studied in Section 3, still in the discrete setting. Section 4 will then extend the discussion to the continuous setting, in two different ways. The last section will be devoted to extensions with modified entropy or energy. Last, I wish to thank the referee, whose suggestions led to a better presentation.
2 Poincaré Inequality

In the discrete setting presented in the introduction, we consider the inverse of the spectral gap (also called the Poincaré constant) associated to μ and ν, defined by
\[ A(\mu,\nu) := \sup_{f\in\mathcal C} \frac{\operatorname{Var}(f,\mu)}{\nu[(f')^2]} \;\in\; \bar{\mathbb R}_+, \tag{2} \]
where we recall that the variance of a measurable function f with respect to a probability μ is defined by
\[ \operatorname{Var}(f,\mu) = \frac12\iint \bigl(f(y)-f(x)\bigr)^2\,\mu(dx)\,\mu(dy) \;\in\; \bar{\mathbb R}_+. \]
The interest for us of A(μ, ν) comes from Theorem 2.2.3 in Saloff-Coste's course [17], where a result due to Rothaus [13, 14, 15] is adapted to the continuous case (in a more general framework than our one-dimensional one). It says that either C(μ, ν) = 2A(μ, ν), or there exists a function f ∈ C such that C(μ, ν) = Ent(f², μ)/ν[(f′)²]. This alternative is shown by considering a maximising sequence in (1). So, keeping in mind the aim presented in the introduction, it is useful and instructive to start with its analogue for the spectral gap:
Proposition 1. Definition (2) is not changed when C is replaced with D, that is, when only monotone functions are considered.

But first observe that the supremum featuring in (2) is always achieved. To establish that, two situations will be distinguished.

a) The non-degenerate case where ν(a) > 0 for every a ∈ A. As the expressions Var(f, μ) and ν[(f′)²] are invariant when a constant is added to f, and as they are quadratically homogeneous, in (2) one may consider only functions f such that f(0) = 0 and ν[(f′)²] = 1. Let now (f_n)_{n∈N} be a maximising sequence for (2) which satisfies those two conditions. The hypothesis on ν clearly ensures boundedness in R^{1+N} of the sequence (f_n)_{n∈N}. Hence a convergent subsequence can be extracted, with limit a function f. This limit also verifies ν[(f′)²] = 1, wherefrom one easily deduces that A(μ, ν) = Var(f, μ)/ν[(f′)²], showing the existence of an extremal function for (2).

b) If ν({i, i+1}) = 0 for some {i, i+1} ∈ A, two sub-cases can be considered:
b1) If μ({0, ..., i}) > 0 and μ({i+1, ..., N}) > 0, putting f = 1_{{i+1,...,N}}, one has Var(f, μ) > 0 and ν[(f′)²] = 0, hence A(μ, ν) = +∞ and f is extremal.
b2) Else, one among μ({0, ..., i}) and μ({i+1, ..., N}) vanishes, and the problem can be restricted to the segment {0, ..., i} or {i+1, ..., N}, whichever has mass 1. By iteration, one is then back to one of the preceding cases.

Note that in the above case (b1), Proposition 1 is established; so we can henceforth assume that ν > 0 on A. This observation is also valid for the logarithmic Sobolev constant, and it almost makes it possible to assume the irreducibility hypothesis of Theorem 2.2.3 of Saloff-Coste [17], except that μ was a priori not supposed to be strictly positive on E. Yet, one can always revert to this situation: call 0 ≤ x₀ < x₁ < ··· < x_n ≤ N the elements of E with strictly positive μ-weight.
Given some real numbers y₀, y₁, ..., y_n, consider the affine sub-space of C consisting of those functions f such that f(x_i) = y_i for each 0 ≤ i ≤ n, and try minimising ν[(f′)²] therein. For fixed 0 ≤ i < n, this leads to looking for the functions g on {x_i, x_i+1, ..., x_{i+1}} which minimise
\[ \sum_{x_i\le x<x_{i+1}} \nu(\{x,x+1\})\,\bigl(g'(\{x,x+1\})\bigr)^2 \]
under the constraints g(x_i) = y_i and g(x_{i+1}) = y_{i+1}. By a simple application of the equality case in the Cauchy-Schwarz inequality, this optimisation problem admits the following unique solution:
\[ \forall\, x_i\le x\le x_{i+1},\qquad g(x) = y_i + \Bigl(\sum_{x_i\le y<x_{i+1}} \frac1{\nu(\{y,y+1\})}\Bigr)^{-1} \sum_{x_i\le y<x} \frac{y_{i+1}-y_i}{\nu(\{y,y+1\})}. \]
So, setting
\[ \forall\,0\le i\le n,\quad \mu'(i) := \mu(x_i), \qquad \forall\,0\le i<n,\quad \nu'(\{i,i+1\}) := \Bigl(\sum_{x_i\le y<x_{i+1}} \frac1{\nu(\{y,y+1\})}\Bigr)^{-1}, \tag{3} \]
one would be reduced to a situation where the underlying probability is everywhere strictly positive; moreover, using (3), one easily switches back and forth between extremal functions for both problems. This would also fully justify the reminder before Proposition 1. On the other hand, we shall also discard the trivial case when μ is a Dirac mass; this ensures that A(μ, ν) > 0.

We can now be a little more precise about the maximising functions in (2):

Lemma 1. Let f be a function realizing the maximum in (2). Assuming that ν > 0 on A and that μ is not a Dirac mass, every maximising function has the form af + b1, where a ∈ R*, b ∈ R and 1 denotes the constant function with value 1.

Proof. Clearly, if f is maximising and if a ∈ R* and b ∈ R, then af + b1 is also maximising in (2). Conversely, let g be maximising in (2); by subtracting a constant, we may suppose that μ[g] = 0. By variational calculus around g (i.e., by considering g + εh, with ε ∈ R and any h ∈ C, and taking a first-order expansion as ε → 0 of the ratio Var(g + εh, μ)/ν[((g + εh)′)²]), one easily sees that for each i ∈ E, g satisfies
\[ A(\mu,\nu)\,\bigl[\nu(\{i,i+1\})\bigl(g(i)-g(i+1)\bigr) + \nu(\{i-1,i\})\bigl(g(i)-g(i-1)\bigr)\bigr] = \mu(i)\,g(i), \]
with the conventions ν({−1, 0}) = 0 = ν({N, N+1}). Now, since A(μ, ν) > 0 and ν > 0 on A, starting from g(0) these equations inductively determine g(1), g(2), up to g(N). Note that g(0) ≠ 0, else we would end up with g ≡ 0, contradicting A(μ, ν) > 0. So there is at most one maximising function g for (2) which satisfies μ[g] = 0 and g(0) = 1. This is exactly what the lemma asserts.
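Returning to the reduction (3): it can be checked numerically that the minimiser above satisfies the boundary constraint and that its minimal energy equals ν′({i, i+1})(y_{i+1} − y_i)², with ν′ the harmonic-type weight of (3). This is an added illustration, not from the text; the edge weights and boundary values below are arbitrary:

```python
# one block [x_i, x_{i+1}] with x_i = 0, x_{i+1} = 4, boundary values
# yi, yi1; nu assigns arbitrary positive weights to the four edges
nu = [0.5, 2.0, 1.0, 4.0]
yi, yi1 = 1.0, 3.0
inv_sum = sum(1.0 / w for w in nu)

# the minimiser from the equality case in the Cauchy-Schwarz inequality:
# its increment over edge {y, y+1} is (yi1 - yi) / (nu({y,y+1}) * inv_sum)
g = [yi]
for w in nu:
    g.append(g[-1] + (yi1 - yi) / (w * inv_sum))
assert abs(g[-1] - yi1) < 1e-12           # the constraint g(x_{i+1}) = y_{i+1}

energy = sum(w * (g[l + 1] - g[l]) ** 2 for l, w in enumerate(nu))
# minimal energy = nu'({i,i+1}) (y_{i+1} - y_i)^2 with nu' as in (3)
assert abs(energy - (yi1 - yi) ** 2 / inv_sum) < 1e-12
```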
Given a maximising f for (2), our strategy to show its monotonicity will be as follows: supposing on the contrary that f is not monotone, we shall decompose f as f = f̄ + f̃, with f̄ (and hence also f̃) not belonging to the linear span Vect(1, f), and with
\[ \operatorname{Var}(f,\mu) = \operatorname{Var}(\bar f,\mu) + \operatorname{Var}(\tilde f,\mu), \qquad \nu[(f')^2] \geq \nu[(\bar f')^2] + \nu[(\tilde f')^2]. \]
Clearly, these two relations imply that f̄ and f̃ also are maximising for (2), a contradiction since f̄ and f̃ do not have the form required by Lemma 1.

So let f be maximising for (2) but not monotone. A point i ∈ E will be called a local maximum of f if for each j ∈ E verifying f(j) > f(i), the segment ⟦i, j⟧ (the sub-segment of E with endpoints i and j) contains an element k such that f(k) < f(i). By definition, a local minimum of f will be a local maximum of −f. We shall now construct f̄ by splitting f at a particular level. Replacing f by −f if necessary, we may choose a local maximum i in ⟦1, N−1⟧ such
that f has a local minimum in ⟦0, i⟧ and another one in ⟦i, N⟧. Among such local maxima i, choose one which minimises f(i), and call it i₀. Denote by i₁ (respectively i₋₁) the closest local minimum on the right (respectively on the left) of i₀. By possibly reversing the order of ⟦0, N⟧, one can suppose that f(i₋₁) ≤ f(i₁). Also, set
\[ i_2 := \max\{y\ge i_1 : \forall\, x\in\llbracket i_1,y\rrbracket,\ f(x)=f(i_1)\}. \]
For s ∈ [f(i₁), f(i₀)], let S_s ≔ ⟦a_s, b_s⟧ be the discrete segment whose ends are defined by
\[ a_s := \min\{x\in\llbracket i_{-1},i_0\rrbracket : f(x)\ge s\}, \qquad b_s := \min\{x\in\llbracket i_2,N\rrbracket : f(x)\ge s\} - 1 \]
(with the convention that b_s = N if the latter set is empty). By those choices, particularly by the minimality of i₀, one easily verifies that for any s ∈ [f(i₁), f(i₀)], f is increasing (this is always understood in the wide sense) on ⟦a_s, i₀⟧, decreasing on ⟦i₀, i₂⟧ and increasing on ⟦i₂, b_s+1⟧ (the reader is urged to draw a picture). Still for s ∈ [f(i₁), f(i₀)], set for x ∈ E
\[ \bar f_s(x) := f(x)\,1_{S_s^c}(x) + s\,1_{S_s}(x), \qquad \tilde f_s(x) := (f(x)-s)\,1_{S_s}(x). \]
One has indeed f = f̄_s + f̃_s, and the claimed decomposition will be obtained owing to the following two lemmas.

Lemma 2. For any s ∈ ]f(i₁), f(i₀)[, one has ν[(f′)²] ≥ ν[(f̄_s′)²] + ν[(f̃_s′)²].

Proof. An immediate calculation first gives
\[ \nu[(f')^2] = \nu[((\bar f_s + \tilde f_s)')^2] = \nu[(\bar f_s')^2] + \nu[(\tilde f_s')^2] + 2\nu[\bar f_s'\,\tilde f_s'] \]
and then
\[ \nu[\bar f_s'\,\tilde f_s'] = \nu(\{a_s-1,a_s\})\bigl(s-f(a_s-1)\bigr)\bigl(f(a_s)-s\bigr) + \nu(\{b_s,b_s+1\})\bigl(f(b_s+1)-s\bigr)\bigl(s-f(b_s)\bigr) \]
(still with the convention that ν({N, N+1}) = 0). Now, from the fact that s ∈ ]f(i₁), f(i₀)[, it appears that
\[ f(i_{-1}) \le f(a_s-1) < s \le f(a_s) \le f(i_0) \quad\text{and}\quad f(i_2) \le f(b_s) < s \le f(b_s+1), \]
which allows us to notice that ν[f̄_s′ f̃_s′] ≥ 0, wherefrom the claimed inequality derives.
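The two ingredients of this proof — the exact expansion of ν[(f′)²] and the sign of the cross term — can be checked on a small non-monotone example. The values of f, ν and the level s below are illustrative choices, not from the text (here i₋₁ = 1, i₀ = 2, i₁ = 3 and S_s = {2, 3}):

```python
# E = {0,...,4}; f is non-monotone; nu puts unit mass on every edge {l, l+1}
f = [1.0, 0.0, 2.0, 0.5, 3.0]
nu = [1.0, 1.0, 1.0, 1.0]
s = 1.0                       # a level in ]f(i_1), f(i_0)[ = ]0.5, 2[
S = {2, 3}                    # S_s = [a_s, b_s] for this f and s

fbar = [s if x in S else v for x, v in enumerate(f)]        # \bar f_s
ftil = [v - s if x in S else 0.0 for x, v in enumerate(f)]  # \tilde f_s
assert all(abs(a + b - c) < 1e-12 for a, b, c in zip(fbar, ftil, f))

def dirichlet(g):
    # nu[(g')^2] for the discrete derivative g'({l,l+1}) = g(l+1) - g(l)
    return sum(w * (g[l + 1] - g[l]) ** 2 for l, w in enumerate(nu))

cross = sum(w * (fbar[l + 1] - fbar[l]) * (ftil[l + 1] - ftil[l])
            for l, w in enumerate(nu))
assert cross >= 0.0            # the key sign observation in Lemma 2
assert abs(dirichlet(f) - (dirichlet(fbar) + dirichlet(ftil) + 2 * cross)) < 1e-12
assert dirichlet(f) >= dirichlet(fbar) + dirichlet(ftil)   # Lemma 2
```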
Lemma 3. There exists s₀ ∈ ]f(i₁), f(i₀)[ such that
\[ \operatorname{Var}(f,\mu) = \operatorname{Var}(\bar f_{s_0},\mu) + \operatorname{Var}(\tilde f_{s_0},\mu). \]
Proof. The difference between the left- and right-hand sides is but twice the covariance of f̄_s and f̃_s under μ, which equals
\begin{align}
\mu\bigl[(\bar f_s - \mu[\bar f_s])(\tilde f_s - \mu[\tilde f_s])\bigr] &= \mu\bigl[(\bar f_s - \mu[\bar f_s])\,\tilde f_s\bigr] \tag{4}\\
&= \bigl(s - \mu[\bar f_s]\bigr)\,\mu\bigl[(f-s)\,1_{S_s}\bigr]. \nonumber
\end{align}
Hence, it suffices to find an s ∈ ]f(i₁), f(i₀)[ such that μ[(f − s)1_{S_s}] = 0. Put i₃ ≔ b_{f(i₀)} + 1; from the increasingness of f on ⟦i₋₁, i₀⟧ and on ⟦i₂, i₃⟧, one is easily convinced that the map
\[ \Psi : [f(i_1), f(i_0)] \ni s \mapsto \mu\bigl[(f-s)\,1_{S_s}\bigr] \]
is continuous. Now, the pattern of f on ⟦i₋₁, i₃⟧ implies that Ψ(f(i₁)) > 0 and Ψ(f(i₀)) < 0, so there exists s₀ ∈ ]f(i₁), f(i₀)[ such that Ψ(s₀) = 0.

Notate f̄ ≔ f̄_{s₀} and f̃ ≔ f̃_{s₀}, where s₀ is chosen as in the preceding lemma. To finalize the proof of Proposition 1, it remains to see that f̄ is not in Vect(f, 1). To this end, notice that i₁ is no longer a local minimum for f̄ (this function may go down from i₁ to i₋₁, and yet f̄(i₁) = s₀ > f(i₁) ≥ f(i₋₁) = f̄(i₋₁)), and consequently f̄ cannot be written as af + b1 with a > 0 and b ∈ R. On the other hand, the inequalities f̄(i₋₁) < f̄(i₀) and f(i₋₁) < f(i₀) also show that f̄ cannot be written as af + b1 with a ≤ 0 and b ∈ R. Therefore the claimed result follows.
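Since Ψ is continuous and changes sign, the level s₀ can be located by bisection. The data below are an illustrative choice, not from the text (for them, S_s = {2, 3} for every s in the relevant range, and Ψ is affine with root 5/4):

```python
# a non-monotone f on E = {0,...,4} with uniform probability mu
f = [1.0, 0.0, 2.0, 0.5, 3.0]
mu = [0.2] * 5
S = {2, 3}     # S_s for every level s in ]0.5, 2[ with this f

def psi(s):
    # Psi(s) = mu[(f - s) 1_{S_s}]
    return sum(p * (v - s) for x, (p, v) in enumerate(zip(mu, f)) if x in S)

# bisection on ]f(i_1), f(i_0)[ = ]0.5, 2[, where Psi changes sign
lo, hi = 0.6, 1.9
assert psi(lo) > 0 > psi(hi)
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if psi(mid) > 0 else (lo, mid)
s0 = (lo + hi) / 2     # here s0 = 1.25 solves Psi(s0) = 0

fbar = [s0 if x in S else v for x, v in enumerate(f)]
ftil = [v - s0 if x in S else 0.0 for x, v in enumerate(f)]

def var(g):
    m = sum(p * v for p, v in zip(mu, g))
    return sum(p * (v - m) ** 2 for p, v in zip(mu, g))

assert abs(s0 - 1.25) < 1e-9
assert abs(var(f) - var(fbar) - var(ftil)) < 1e-9   # Lemma 3 at s_0
```

At s₀ the covariance of f̄ and f̃ vanishes exactly, since it equals (s₀ − μ[f̄])Ψ(s₀), which is why the variances add up.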
3 Splitting up the Entropy

Our aim here is to establish (2) in the discrete setting. According to the results from the preceding section, it suffices to consider the case when there exists a (non-constant) maximising $f$ for (1). For else, a maximising family for the logarithmic Sobolev inequality is $(1 + f/(n+1))_{n \in \mathbb N}$, where $f$ is a maximising function for the corresponding Poincaré inequality (and hence $f$ is monotone). Globally, the scheme of our proof will be similar to that of the previous section, most of whose notation will be kept in use. First of all, observe that one may from now on suppose that $f \geq 0$, by possibly replacing $f$ with $|f|$, since one has $\nu[(|f|')^2] \leq \nu[(f')^2]$. Assume now the hypothesis (to be refuted) that $f$ is not monotone. Two possibilities arise: either $f$ has a local maximum $i$ in $[\![1, N-1]\!]$ such that there is a local minimum in $[\![0, i]\!]$ and one in $[\![i, N]\!]$, or the same holds for $-f$. We shall consider the first case only; the second one is very similar and left to the reader (one has to work with the negatively valued function $-f$). As in section 2, $i_{-1}$, $i_0$, $i_1$, $i_2$ and $i_3$ are defined, then, for $s \in [f(i_1), f(i_0)]$, $S_s$, $\widehat f_s$ and $\widetilde f_s$. Our main task will consist in "splitting up" the entropy:

Lemma 4. There exists $s_1 \in\, ]f(i_1), f(i_0)[$ such that
\[
\mathrm{Ent}(f^2, \mu) = \mathrm{Ent}(\widehat f_{s_1}^2, \mu) + \mathrm{Ent}\big((s_1 + \widetilde f_{s_1})^2, \mu\big).
\]
Monotonicity of the Extremal Functions
111
Proof. First remark that for all $s \in [f(i_1), f(i_0)]$ and for all functions $F : \mathbb R_+ \to \mathbb R$, one has
\[
\mu[F(f)] = \mu[F(\widehat f_s)] + \mu[F(s + \widetilde f_s)] - F(s). \tag{5}
\]
Indeed, by definition, one can perform the following expansion:
\[
\mu[F(f)] = \mu[\mathbb 1_{S_s^c} F(\widehat f_s)] + \mu[\mathbb 1_{S_s} F(s + \widetilde f_s)]
= \mu[F(\widehat f_s)] - \mu[\mathbb 1_{S_s} F(s)] + \mu[F(s + \widetilde f_s)] - \mu[\mathbb 1_{S_s^c} F(s)]
= \mu[F(\widehat f_s)] + \mu[F(s + \widetilde f_s)] - F(s).
\]
In particular, applying this to $F : \mathbb R_+ \ni u \mapsto u^2 \ln(u^2)$, it appears that
\[
\mathrm{Ent}(f^2, \mu) - \mathrm{Ent}(\widehat f_s^2, \mu) - \mathrm{Ent}\big((s + \widetilde f_s)^2, \mu\big) = \varphi(y_s) + \varphi(\widetilde x_s) - \varphi(y) - \varphi(x_s)
\]
with $\varphi$ the convex map given by $\varphi : \mathbb R_+ \ni u \mapsto u \ln(u)$ and
\[
y_s := \mu[\widehat f_s^2], \qquad \widetilde x_s := \mu[(s + \widetilde f_s)^2], \qquad y := \mu[f^2], \qquad x_s := s^2.
\]
Resorting again to (5), but with $F(u) = u^2$, it appears that $x_s + y = \widetilde x_s + y_s$, which means that both segments $[x_s, y]$ and $[\widetilde x_s, y_s]$ have the same midpoint. So, by convexity of $\varphi$, the inequality $\varphi(x_s) + \varphi(y) \geq \varphi(\widetilde x_s) + \varphi(y_s)$ is equivalent to $|y - x_s| \geq |y_s - \widetilde x_s|$. Or also, if some $s_1 \in\, ]f(i_1), f(i_0)[$ happens to be such that $|y - x_s| = |y_s - \widetilde x_s|$, then the equality claimed in Lemma 4 holds (without even using the convexity of $\varphi$). Now one computes (still owing to (5) with $F(u) = u^2$) that
\[
y_s - \widetilde x_s = \mu[\widehat f_s^2] - \mu[(s + \widetilde f_s)^2] = \mu[f^2] + s^2 - 2\mu[(s + \widetilde f_s)^2] = \mu[f^2] - s^2 - 2\mu[\widetilde f_s^2] - 4s\mu[\widetilde f_s] = y - x_s - 2\mu\big[\widetilde f_s(\widetilde f_s + 2s)\big].
\]
Hence it suffices to find an $s \in\, ]f(i_1), f(i_0)[$ such that $\mu[\widetilde f_s(\widetilde f_s + 2s)] = 0$. But $\widetilde f_s + 2s$ is a positive function, whereas $\widetilde f_s$ is non-negative for $s = f(i_1)$ and takes negative values for $s = f(i_0)$. The claim follows by continuity of the application $[f(i_1), f(i_0)] \ni s \mapsto \mu[\widetilde f_s(\widetilde f_s + 2s)]$, which is positive at the first endpoint and negative at the second one.
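The level $s_1$ of Lemma 4 can also be located numerically, by bisection on the continuous map $s \mapsto \mu[\widetilde f_s(\widetilde f_s + 2s)]$; at the root, the entropy splits exactly. A self-contained sketch (the discrete data are again our own illustrative choices):

```python
import numpy as np

f = np.array([0.5, 0.2, 1.0, 1.6, 0.9, 0.4, 1.2, 2.0, 2.5])    # our example
mu = np.array([1.0, 2, 1, 3, 1, 2, 2, 1, 1]); mu /= mu.sum()   # arbitrary probability
i_m1, i0, i1, i2, N = 1, 3, 5, 5, len(f) - 1

def split(s):                        # returns (f_hat_s, f_til_s)
    a_s = next(x for x in range(i_m1, i0 + 1) if f[x] >= s)
    b_s = next((x for x in range(i2, N + 1) if f[x] >= s), N + 1) - 1
    S = np.zeros(N + 1, dtype=bool); S[a_s:b_s + 1] = True
    return np.where(S, s, f), np.where(S, f - s, 0.0)

def ent(h):                          # Ent(h, mu) for h > 0
    return float(np.sum(mu * h * np.log(h)) - np.sum(mu * h) * np.log(np.sum(mu * h)))

lo, hi = f[i1] + 1e-9, f[i0] - 1e-9  # the map is > 0 at lo and < 0 at hi
for _ in range(80):
    s = (lo + hi) / 2
    f_til = split(s)[1]
    if np.sum(mu * f_til * (f_til + 2 * s)) > 0:
        lo = s
    else:
        hi = s
s1 = (lo + hi) / 2
f_hat, f_til = split(s1)
gap = ent(f ** 2) - ent(f_hat ** 2) - ent((s1 + f_til) ** 2)
assert abs(gap) < 1e-6               # Lemma 4: the entropy splits at s1
```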
Besides, according to Lemma 2, one has for all $s \in\, ]f(i_1), f(i_0)[$
\[
\nu[(f')^2] \geq \nu[(\widehat f_s')^2] + \nu[(\widetilde f_s')^2] = \nu[(\widehat f_s')^2] + \nu\big[((s + \widetilde f_s)')^2\big].
\]
Using the notation and proof of that lemma again, one can even say a little more: equality can hold only if for all edges $a \in A$ one has $\widehat f_s'(a)\,\widetilde f_s'(a) = 0$, which in particular entails that $f(a_s) = s$. So, in case of equality for some $s \in\, ]f(i_1), f(i_0)[$, the discrete segment $S_s$ contains at least three different points, $a_s$, $i_0$ and $i_1$.
Now, what we saw just before implies that $\widehat f_{s_1}$ and $s_1 + \widetilde f_{s_1}$ also are maximising functions for (1), and that necessarily
\[
\nu[(f')^2] = \nu[(\widehat f_{s_1}')^2] + \nu\big[((s_1 + \widetilde f_{s_1})')^2\big],
\]
for else, one would have
\[
C(\mu,\nu) = \frac{\mathrm{Ent}(f^2,\mu)}{\nu[(f')^2]}
= \frac{\mathrm{Ent}(\widehat f_{s_1}^2,\mu) + \mathrm{Ent}\big((s_1 + \widetilde f_{s_1})^2,\mu\big)}{\nu[(f')^2]}
< \frac{\mathrm{Ent}(\widehat f_{s_1}^2,\mu) + \mathrm{Ent}\big((s_1 + \widetilde f_{s_1})^2,\mu\big)}{\nu[(\widehat f_{s_1}')^2] + \nu[((s_1 + \widetilde f_{s_1})')^2]}
\leq \max\!\left( \frac{\mathrm{Ent}(\widehat f_{s_1}^2,\mu)}{\nu[(\widehat f_{s_1}')^2]},\ \frac{\mathrm{Ent}\big((s_1 + \widetilde f_{s_1})^2,\mu\big)}{\nu[((s_1 + \widetilde f_{s_1})')^2]} \right)
\leq C(\mu,\nu)
\]
(the first inequality uses that by construction $\mathrm{Ent}(f^2,\mu) > 0$), a contradiction. Therefore there exist three successive points in $S_{s_1}$ where $\widehat f_{s_1}$ assumes the same value (namely, $s_1$) and we shall now verify that this is not possible, more precisely that this would imply constancy of $\widehat f_{s_1}$, which does not hold (for $\widehat f_{s_1}(i_{-1}) < \widehat f_{s_1}(i_0)$). Indeed, by variational calculus around a maximising function $f$, one sees that $f$ must verify for all $i \in E$ (with the usual conventions)
\[
C(\mu,\nu)\Big[ \nu(\{i, i{+}1\})\big(f(i) - f(i{+}1)\big) + \nu(\{i{-}1, i\})\big(f(i) - f(i{-}1)\big) \Big] = \mu(i)\, f(i) \ln\!\left( \frac{f^2(i)}{\mu[f^2]} \right).
\]
Recall that discussion has been reduced to the situation that $\mu$, $\nu$ and $C(\mu,\nu)$ are strictly positive (see before Lemma 1); so if $f$ takes the same value $v$ at three successive points $y-1$, $y$ and $y+1$, with $0 < y < N$, then the preceding equation taken at $i = y$ forces $v \ln(v^2/\mu[f^2]) = 0$, that is to say, $v = 0$ or $v^2 = \mu[f^2]$. Applying then the equation at $i = y+1$ instead, one obtains $f(y+2) = f(y+1) = v$, at least if $y \leq N-2$. Similarly, for $i = y-1$, $f(y-2) = v$ if $y \geq 2$. So the equality $f(i) = v$ propagates everywhere and $f$ is constantly equal to $v$. These arguments terminate the proof of (2) by replacing the recourse to Lemma 1. For even though the knowledge of $\mu[f^2]$ and of $f(0)$ determines a maximising function $f$ for (1) owing to the linear structure of the graph $E$ (still for fixed $\mu$ and $\nu$ verifying $C(\mu,\nu) > 0$ and $\nu > 0$ on $A$, as we were allowed to suppose in the preceding section), here this no longer implies Lemma 1, because the term $\mu(i)f(i)\ln(f^2(i)/\mu[f^2])$ above is not affine in $f(i)$. Besides, this lemma never holds in the context of logarithmic Sobolev inequalities. Indeed, let again $f$ be a positive function which maximises (1). Perturbing $f$ by a constant function and performing a variational computation, one obtains $\mu[f \ln(f^2/\mu[f^2])] = 0$. Set $F(t) := \mu[(f+t) \ln((f+t)^2/\mu[(f+t)^2])]$ for all $t \geq 0$. Differentiating twice this expression on $\mathbb R_+^*$, one obtains
\[
F''(t) = 2\left( \int \frac{1}{f+t}\, d\mu - \left( 2 - \frac{\mu[f+t]^2}{\mu[(f+t)^2]} \right) \frac{\mu[f+t]}{\mu[(f+t)^2]} \right).
\]
Using Jensen’s inequality μ[1/(f + t)] 1/μ[f + t] and the fact that the map [0, 1] x → x(2 − x) is bounded by 1, it appears that F is strictly positive on R∗+ if f is not μ-a.s. constant (consider the case when Jensen’s inequality is an equality). So, there may exist at most two t 0 such that F (t) = 0. Remark 1. The inequality μ[ff (i1 ) (ff (i1 ) + 2f (i1 ))] > 0 does not allow to deduce that Ent(f 2 , μ) < Ent(ff2(i1 ) , μ) + Ent((f (i1 ) + ff (i1 ) )2 , μ); this is true only under additional conditions concerning the signs of yf (i1 ) − xf (i1 ) and y − xf (i1 ) (a similar observation holds at s = f (i0 )). The possibility for ys − xs and y − xs to change sign when s ranges over[f (i1 ), f (i0 )] (the worst case is when such changes precisely occur where μ[fs (fs + 2s)] vanishes) is as much a nuisance as the the factor s − μ[fs ] which appeared in (4). Therefore we are a priori not sure of the existence of some s ∈ [f (i1 ), f (i0 )] making one of the functions fs and s + fs “strictly more maximising” than f . On the opposite, in the spectral gap case, this conclusion was nonetheless reachable, by using the extra fact that the map [f (i1 ), f (i0 )] s → s − μ[fs ] is increasing (more precisely, a further analysis easily shows that [f (i1 ), f (i0 )] s → s − μ[fs ] is increasing).
4 Continuous Situation

So we come back to the framework first considered in the introduction. We shall only deal with the case of the logarithmic Sobolev constant; the Poincaré constant can be treated in a very similar way. As already explained, the continuous situation will be reduced to the discrete one, thus giving the proof a slight probabilistic touch. We shall also consider the other possibility, to adapt the previous proofs, which leads to further analysing the (almost) maximising functions. But whichever way is chosen, the beginning of the proof appears to need some regularization as its first step. For $M > 0$, let $\mathcal C_{[-M,M]}$ (respectively $\mathcal D_{[-M,M]}$) be the subset of $\mathcal C$ (respectively of $\mathcal D$) consisting of the absolutely continuous functions with weak derivative a.e. null on $]-\infty, -M] \cup [M, +\infty[$. Also, put
\[
C_{[-M,M]}(\mu,\nu) := \sup_{f \in \mathcal C_{[-M,M]}} \frac{\mathrm{Ent}(f^2,\mu)}{\nu[(f')^2]},
\qquad
D_{[-M,M]}(\mu,\nu) := \sup_{f \in \mathcal D_{[-M,M]}} \frac{\mathrm{Ent}(f^2,\mu)}{\nu[(f')^2]}.
\]
One is easily convinced that these two quantities increase with $M > 0$ and that they respectively converge for large $M$ to $C(\mu,\nu)$ and
\[
D(\mu,\nu) := \sup_{f \in \mathcal D} \frac{\mathrm{Ent}(f^2,\mu)}{\nu[(f')^2]} \in \bar{\mathbb R}_+.
\]
Call $\nu_{[-M,M]}$ the restriction of $\nu$ to $[-M,M]$ (it vanishes outside this interval) and $\mu_{[-M,M]}$ the probability obtained by accumulating on the endpoints $-M$ and $M$ the mass outside $[-M,M]$; i.e., $\mu_{[-M,M]}$ is defined by
\[
\mu_{[-M,M]}(B) := \mu(B\, \cap\, ]-M,M[) + \mu(]-\infty, -M])\,\delta_{-M}(B) + \mu([M, +\infty[)\,\delta_M(B)
\]
for $B$ any Borel set in $\mathbb R$. The interest of these measures is that $C_{[-M,M]}(\mu,\nu) = C(\mu_{[-M,M]}, \nu_{[-M,M]})$ and $D_{[-M,M]}(\mu,\nu) = D(\mu_{[-M,M]}, \nu_{[-M,M]})$, so the convergences seen above allow restriction to the case that $\mu$ and $\nu$ are supported in the compact $[-M,M]$, where $M > 0$ is fixed from now on. We shall also content ourselves with only considering functions defined on $[-M,M]$. Denote by $\lambda$ the restriction of the Lebesgue measure to $[-M,M]$ and, by abuse of language, still call $\nu$ the Radon-Nikodym derivative of $\nu$ with respect to $\lambda$ (which exists without any restriction on $\nu$, provided the value $+\infty$ is allowed; see for instance [11]). As weak derivatives are only a.e. defined, it is well known that $C(\mu,\nu)$ (or $D(\mu,\nu)$) is not modified when $\nu$ is replaced with the measure having $\nu$ as density with respect to $\lambda$, which we henceforth assume. One can also without loss suppose the function $\nu$ to be bounded below by a strictly positive constant. Indeed, this derives from the fact that for any $f \in \mathcal C$, one has
\[
\lim_{\eta \to 0^+} \frac{\mathrm{Ent}(f^2,\mu)}{\int (f')^2\, (\eta \vee \nu)\, d\lambda} = \frac{\mathrm{Ent}(f^2,\mu)}{\nu[(f')^2]}
\]
and that this convergence is monotone. So, by exchanging suprema, equality is preserved in the limit. Hence $\eta > 0$ will be fixed in the sequel, so that $\nu \geq \eta$ everywhere on $[-M,M]$, i.e., a suitable version of $\nu$ is chosen; but beware, $\nu$ may still assume the value $+\infty$ (remark that obtaining the corresponding majorization of $\nu$ would be more delicate). The next procedure consists in modifying $\mu$ and is a little less immediate; a general preparation is needed:

Lemma 5. On some measurable space, let $\mu$ be a probability and $f$ and $g$ two bounded, measurable functions. Suppose that $\|g - f\|_\infty \leq \epsilon \leq 1$ (uniform norm) and that the oscillation of $f$ (i.e., $\mathrm{osc}(f) := \sup f - \inf f$) is majorized by $a$, where $\epsilon$ and $a$ are positive real numbers. Then there exists a number $b(a) \geq 0$, depending only upon $a$, such that
\[
\big| \mathrm{Ent}(g^2,\mu) - \mathrm{Ent}(f^2,\mu) \big| \leq b(a)\,\epsilon.
\]

Proof. Note that $|f|$ and $|g|$ fulfill the same hypotheses as $f$ and $g$; so no generality is lost by further supposing $f$ and $g$ to be positive. Two situations are then distinguished, according to $\mu[f]$ being "large" or "small". We shall start with the case when $\mu[f] \leq 2 + 2a$. This ensures that $f$ is majorized by $2 + 3a$ and $g$ by $3 + 3a$. Now, on the interval $[0, 3+3a]$, the
derivative of the map $t \mapsto t^2 \ln(t^2)$ is bounded by a finite quantity $b_1(a)$; this entails that
\[
\big| \mu[g^2 \ln(g^2)] - \mu[f^2 \ln(f^2)] \big| \leq \mu\big[ |g^2 \ln(g^2) - f^2 \ln(f^2)| \big] \leq b_1(a)\,\mu[|g-f|] \leq b_1(a)\,\epsilon.
\]
Similarly, the norm inequality $|\sqrt{\mu[g^2]} - \sqrt{\mu[f^2]}| \leq \sqrt{\mu[(g-f)^2]}$ in $L^2(\mu)$ yields
\[
\big| \mu[g^2] \ln(\mu[g^2]) - \mu[f^2] \ln(\mu[f^2]) \big| \leq b_1(a)\,\epsilon,
\]
wherefrom finally the claimed inequality with $b(a) = 2b_1(a)$. Consider now the case when $\mu[f] > 2 + 2a$. It seems more convenient to deal with the map $\mathbb R_+ \ni t \mapsto t \ln(t)$. Performing an expansion with first-order remainder, centred at $\mu[f^2]$, one finds a $\theta \in [0,1]$ such that $\mu[g^2] \ln(\mu[g^2])$ equals
\[
\mu[f^2] \ln(\mu[f^2]) + \Big(1 + \ln\big(\mu[f^2] + \theta(\mu[g^2] - \mu[f^2])\big)\Big)\big(\mu[g^2] - \mu[f^2]\big).
\]
The same operation performed pointwise yields another measurable function $\widetilde\theta$ with values in $[0,1]$ such that one has everywhere
\[
g^2 \ln(g^2) = f^2 \ln(f^2) + \Big(1 + \ln\big(f^2 + \widetilde\theta(g^2 - f^2)\big)\Big)(g^2 - f^2).
\]
Integrating this against $\mu$ and taking into account the preceding equality, it appears that
\[
\mathrm{Ent}(g^2,\mu) - \mathrm{Ent}(f^2,\mu) = \mu\Big[ \Big( \ln\big(f^2 + \widetilde\theta(g^2 - f^2)\big) - \ln\big(\mu[f^2] + \theta(\mu[g^2] - \mu[f^2])\big) \Big)(g^2 - f^2) \Big]. \tag{6}
\]
However, observe that
\[
f^2 + \widetilde\theta(g^2 - f^2) \geq f^2 \wedge g^2 \geq \big(\mu[f] - \mathrm{osc}(f) - 1\big)^2 \geq (\mu[f] - a - 1)^2 \geq \frac{\mu[f]^2}{4}
\]
and similarly
\[
\mu[f^2] + \theta\big(\mu[g^2] - \mu[f^2]\big) \geq \frac{\mu[f]^2}{4}.
\]
So one obtains the pointwise inequality
\[
\Big| \ln\big(f^2 + \widetilde\theta(g^2 - f^2)\big) - \ln\big(\mu[f^2] + \theta(\mu[g^2] - \mu[f^2])\big) \Big|
\leq 4\,\mu[f]^{-2} \Big| f^2 + \widetilde\theta(g^2 - f^2) - \mu[f^2] - \theta\big(\mu[g^2] - \mu[f^2]\big) \Big|.
\]
Let us look at the last absolute value. It can be majorized by
\[
\Big(f + \sqrt{\mu[f^2]}\Big)\Big|f - \sqrt{\mu[f^2]}\Big| + (f + g)|f - g| + \Big(\sqrt{\mu[g^2]} + \sqrt{\mu[f^2]}\Big)\Big|\sqrt{\mu[g^2]} - \sqrt{\mu[f^2]}\Big|
\leq (2\mu[f] + a)\,a + (2\mu[f] + 2a + 1)\,\epsilon + (2\mu[f] + 2a + 1)\,\epsilon
\leq (2\mu[f] + 2a + 1)(a + 2).
\]
On the other hand, one has as above
\[
|g^2 - f^2| \leq (2\mu[f] + 2a + 1)\,\epsilon,
\]
wherefrom, coming back to (6), it appears that
\[
\big| \mathrm{Ent}(g^2,\mu) - \mathrm{Ent}(f^2,\mu) \big| \leq \frac{4}{\mu[f]^2}\,(a+2)\,\big(2\mu[f] + 2a + 1\big)^2\,\epsilon
\]
and in that case the lemma holds with $b(a) = b_2(a)$, where
\[
b_2(a) := \sup_{t \geq 2+2a}\ \frac{4\,(a+2)(2t + 2a + 1)^2}{t^2} < +\infty.
\]
This technical result will be used to measure how certain modifications of $\mu$ influence $C(\mu,\nu)$. More precisely, for fixed $n \in \mathbb N^*$, for any $0 \leq i \leq n$ put $x_{n,i} := -M + i\,2M/n$ and introduce the probability
\[
\mu_n := \sum_{0 \leq i \leq n} \mu\big([x_{n,i}, x_{n,i+1}[\big)\, \delta_{x_{n,i}}
\]
with the convention that $x_{n,n+1} = +\infty$.

Lemma 6. With the notation of Lemma 5, for all $n \in \mathbb N^*$ one has
\[
\big| C(\mu_n,\nu) - C(\mu,\nu) \big| \leq b\!\left(\sqrt{2M/\eta}\right) \sqrt{\frac{2M}{n\eta}}.
\]

Proof. Calling $\mathcal C(\nu)$ the set of absolutely continuous functions $f$ such that $\nu[(f')^2] = 1$, one has
\[
C(\mu,\nu) = \sup_{f \in \mathcal C(\nu)} \mathrm{Ent}(f^2,\mu)
\]
and one also has a similar formula for $C(\mu_n,\nu)$. Thus, to obtain the claimed bound, it suffices to see that for all $f \in \mathcal C(\nu)$, one has
\[
\big| \mathrm{Ent}(f^2,\mu_n) - \mathrm{Ent}(f^2,\mu) \big| \leq b\!\left(\sqrt{2M/\eta}\right) \sqrt{\frac{2M}{n\eta}}.
\]
To that end, rewrite $\mathrm{Ent}(f^2,\mu_n)$ as $\mathrm{Ent}(f_n^2,\mu)$, where $f_n$ is the function which equals $f(x_{n,i})$ on $[x_{n,i}, x_{n,i+1}[$ for all $0 \leq i \leq n$. To apply Lemma 5, it remains to evaluate $\mathrm{osc}(f)$ and $\|f_n - f\|_\infty$. These estimates, and consequently also the claimed result, easily follow from the following application of the Cauchy-Schwarz inequality:
\[
\forall\ x, y \in [-M,M], \qquad
|f(y) - f(x)| = \left| \int_{[x,y]} f'\, d\lambda \right|
\leq \left( \int_{[x,y]} (f')^2\, d\nu \right)^{1/2} \left( \int_{[x,y]} \frac{1}{\nu}\, d\lambda \right)^{1/2}
\leq \eta^{-1/2} \sqrt{|y - x|},
\]
where the last estimate holds for any function belonging to $\mathcal C(\nu)$.
Evidently, the above proof also shows that
\[
\big| D(\mu_n,\nu) - D(\mu,\nu) \big| \leq b\!\left(\sqrt{2M/\eta}\right) \sqrt{\frac{2M}{n\eta}}\,;
\]
so, to get convinced of the equality $C(\mu,\nu) = D(\mu,\nu)$, it suffices to see that $C(\mu_n,\nu) = D(\mu_n,\nu)$ for all $n \in \mathbb N^*$. But this problem reduces to the discrete context. Indeed, as before Lemma 1, the values of the $f(x_{n,i})$ being fixed, one has to minimise the quantity $\int_{x_{n,i}}^{x_{n,i+1}} (f')^2\, \nu\, d\lambda$ for each given $0 \leq i < n$. This optimisation problem is simply solved; the minimal value is
\[
\left( \int_{x_{n,i}}^{x_{n,i+1}} \frac{1}{\nu}\, d\lambda \right)^{-1} \big( f(x_{n,i+1}) - f(x_{n,i}) \big)^2
\]
and is achieved by a function which is monotone on the segment $[x_{n,i}, x_{n,i+1}]$. Hence we are back to the discrete problem on $n+1$ points with the probability $\widetilde\mu_n$ and the measure $\widetilde\nu_n$ respectively defined by
\[
\forall\ 0 \leq i \leq n, \quad \widetilde\mu_n(i) := \mu_n(x_{n,i}),
\qquad
\forall\ 0 \leq i < n, \quad \widetilde\nu_n(\{i, i+1\}) := \left( \int_{x_{n,i}}^{x_{n,i+1}} \frac{1}{\nu}\, d\lambda \right)^{-1}.
\]
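The one-cell optimisation defining $\widetilde\nu_n$ can be checked by discretising a cell; in the sketch below the density $\nu$ on the cell and the boundary values are arbitrary choices of ours, and the optimal profile (slope proportional to $1/\nu$) is compared with the linear competitor:

```python
import numpy as np

K = 10_000                           # sub-cells of the interval [0, 1]
h = 1.0 / K
t = (np.arange(K) + 0.5) * h
nu = 1.0 + t ** 2                    # assumed density on the cell, >= 1
x, y = 0.3, 1.7                      # prescribed boundary values

claimed_min = (y - x) ** 2 / np.sum(h / nu)   # (int 1/nu dlambda)^{-1} (y - x)^2

def energy(dg):                      # discretised int (g')^2 nu dlambda
    return float(np.sum(nu * dg ** 2 / h))

dg_opt = (h / nu) / np.sum(h / nu) * (y - x)  # optimal increments, slope ~ 1/nu
dg_lin = np.full(K, (y - x) / K)              # linear competitor

assert abs(energy(dg_opt) - claimed_min) < 1e-9
assert energy(dg_lin) >= claimed_min - 1e-12  # any other profile costs more
```

Note that the optimal increments all have the sign of $y - x$, so the optimal profile is monotone, as asserted above.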
Sections 2 and 3 now allow to conclude. From a possibly more analytically-minded point of view, remark that Lemmas 5 and 6 could also allow to regularize $\mu$, which could be supposed to admit a $C^\infty$ density with respect to $\lambda$. Let us now mention another possible approach, directly inspired from the method of sections 2 and 3. A priori two problems arise in this perspective: on the one hand, whether a maximising function exists (even in the case of the Poincaré inequality), and on the other hand, when it exists, whether the set of its local minima and maxima can have infinitely many connected components
(this means, the function oscillates infinitely often; this is inconvenient for us, see the considerations before Lemma 2). These problems can be bypassed as follows. We put ourselves back in the framework preceding Lemma 5. First, the notion of local minimum or maximum introduced in section 2 will be extended to the continuous case, with discrete segments replaced by continuous ones. For $f \in \mathcal C$, $\mathcal M(f)$ will denote the set of local minima and maxima of $f$. For $p \in \mathbb N^*$, call $\mathcal C_p$ the set of functions $f \in \mathcal C$ such that $\mathcal M(f)$ has at most $p$ connected components. So one verifies that $\mathcal C_1$ (respectively $\mathcal C_2$) is the set of constant (respectively monotone) functions. Set also $\mathcal C_\infty := \cup_{p \in \mathbb N^*} \mathcal C_p$, for which one has the following preliminary result:

Lemma 7. One has
\[
C(\mu,\nu) = \sup_{f \in \mathcal C_\infty} \frac{\mathrm{Ent}(f^2,\mu)}{\nu[(f')^2]}.
\]
Proof. Let $\mathcal F$ denote the set of all measurable functions $g : [-M,M] \to \mathbb R$ belonging to $L^1([-M,M], \lambda)$ and for which one can find $n \in \mathbb N^*$ and $-M = x_0 < x_1 < \cdots < x_n = M$ such that for all $0 \leq i < n$, $g$ has a constant sign on $]x_i, x_{i+1}[$ ($0$ is considered as having at the same time a positive and negative sign). So $\mathcal C_\infty$ is nothing but the set of antiderivatives of elements of $\mathcal F$. It then suffices to verify that $\{g \in \mathcal F : \nu[g^2] \leq 1\}$ is dense, in the $L^2(\nu)$ sense, in the unit ball of this space. Indeed, let $f \in \mathcal C$ with $\nu[(f')^2] = 1$. According to the preceding property, there exists a sequence $(g_n)_{n \in \mathbb N}$ of elements of $\mathcal F$ converging to $f'$. Put for all $n \in \mathbb N$
\[
\forall\ x \in [-M,M], \qquad G_n(x) := f(-M) + \int_{-M}^x g_n(y)\, dy.
\]
Due to the minorization $\nu \geq \eta$, it is clear that the $G_n$ converge uniformly to $f$ for large $n$. And since $\mathrm{osc}(f) < +\infty$, Lemma 5 applies and shows that
\[
\lim_{n \to \infty} \mathrm{Ent}(G_n^2, \mu) = \mathrm{Ent}(f^2, \mu),
\]
wherefrom follows the equality in the lemma. To show the claimed density, take $g \in L^2(\nu)$ with $\nu[g^2] = 1$; for $n \in \mathbb N$, put $g_n := g\,\mathbb 1_{\{\nu \leq n,\, |g| \leq n\}}$. By dominated convergence, the sequence $(g_n)_{n \in \mathbb N}$ converges in $L^2(\nu)$ to $g$. Now, for fixed $n \in \mathbb N$, the measure $(\nu \wedge n)\, d\lambda$ is regular (in the sense of inner and outer approximation of Borel sets; see for instance Rudin's book [16]), so one can find a sequence $(\widetilde g_{n,m})_{m \in \mathbb N}$ in $\mathcal F$ such that
\[
\lim_{m \to \infty} \int (\widetilde g_{n,m} - g_n)^2\, (\nu \wedge n)\, d\lambda = 0.
\]
So, setting for all $m \in \mathbb N$, $\widehat g_{n,m} := \widetilde g_{n,m}\,\mathbb 1_{\{\nu \leq n,\, |g| \leq n\}}$, which still belongs to $\mathcal F$, one also has
\[
\lim_{m \to \infty} \int (\widehat g_{n,m} - g_n)^2\, d\nu = 0
\]
and the claimed density is established.

The lemma entails that
\[
C(\mu,\nu) = \lim_{p \to \infty}\ \sup_{f \in \mathcal C_p} \frac{\mathrm{Ent}(f^2,\mu)}{\nu[(f')^2]}.
\]
However, for $p \geq 3$ and $f \in \mathcal C_p \setminus \mathcal C_2$, the considerations from the preceding section applied to $f$ yield $\widehat f \in \mathcal C_{p-1}$ and $\widetilde f \in \mathcal C_4$ such that
\[
\nu[(f')^2] \geq \nu[(\widehat f')^2] + \nu[(\widetilde f')^2],
\qquad
\mathrm{Ent}(f^2,\mu) = \mathrm{Ent}(\widehat f^2,\mu) + \mathrm{Ent}(\widetilde f^2,\mu).
\]
Let us make this more precise. For $g \in \mathcal C$, a connected component of $\mathcal M(g)$ will be called internal if it contains neither $-M$ nor $M$. The union of the internal connected components of $\mathcal M(g)$ will be denoted by $\mathring{\mathcal M}(g)$. One then introduces a set $\widetilde{\mathcal C}_4$, with $\mathcal C_3 \subset \widetilde{\mathcal C}_4 \subset \mathcal C_4$, by imposing that $\widetilde{\mathcal C}_4 \cap (\mathcal C_4 \setminus \mathcal C_3)$ consists of the functions $g \in \mathcal C_4 \setminus \mathcal C_3$ such that $\min_{\mathring{\mathcal M}(g)} g \leq g(-M),\, g(M) \leq \max_{\mathring{\mathcal M}(g)} g$. The interest of this set $\widetilde{\mathcal C}_4$ will be twofold for us: on the one hand, in the above construction, one has $\widetilde f \in \widetilde{\mathcal C}_4$, and on the other hand, if $g \in \widetilde{\mathcal C}_4 \setminus \mathcal C_2$ then the $\widehat g$ obtained from the preceding procedure is monotone. However, the sole fact that $\widetilde f \in \mathcal C_4$ already showed that for $p \geq 5$, one has
\[
\sup_{f \in \mathcal C_p} \frac{\mathrm{Ent}(f^2,\mu)}{\nu[(f')^2]} = \sup_{f \in \mathcal C_{p-1}} \frac{\mathrm{Ent}(f^2,\mu)}{\nu[(f')^2]},
\]
and by induction, one ends up with the fact that this quantity is nothing but $\sup_{f \in \mathcal C_4} \mathrm{Ent}(f^2,\mu)/\nu[(f')^2]$. More precisely, the preceding observations even imply that
\[
C(\mu,\nu) = \sup_{f \in \widetilde{\mathcal C}_4} \frac{\mathrm{Ent}(f^2,\mu)}{\nu[(f')^2]}.
\]
So let $(f_n)_{n \in \mathbb N}$ be a sequence of elements from $\widetilde{\mathcal C}_4$ satisfying $\nu[(f_n')^2] = 1$ for all $n \in \mathbb N$ and $C(\mu,\nu) = \lim_{n \to \infty} \mathrm{Ent}(f_n^2,\mu)$. Two situations can be distinguished: either one can extract from $(f_n)_{n \in \mathbb N}$ a subsequence (still denoted $(f_n)_{n \in \mathbb N}$) such that $(f_n(0))_{n \in \mathbb N}$ converges in $\mathbb R$, or one has $\liminf_{n \to \infty} |f_n(0)| = +\infty$. The latter case corresponds to the equality $C(\mu,\nu) = A(\mu,\nu)/2$, whose treatment amounts to that of the Poincaré constant, left to the reader. Thus, from now on, we assume to be in the first situation described above. By weak compactness of the unit ball of $L^2(\nu)$, one can extract a subsequence of $(f_n)_{n \in \mathbb N}$,
such that $(f_n')_{n \in \mathbb N}$ is weakly convergent in $L^2(\nu)$. Together with the convergence of $(f_n(0))_{n \in \mathbb N}$, this weak convergence implies that the sequence $(f_n)_{n \in \mathbb N}$ converges pointwise on $[-M,M]$ to a function $f$ which has a weak derivative $f'$ satisfying $\nu[(f')^2] \leq 1$ (because the norm is weakly lower semi-continuous). However, the uniform continuity of the $f_n$ for $n \in \mathbb N$ (due to the majorization by $\eta^{-1/2}$ of their Hölder coefficient of order $1/2$) ensures, via Ascoli's theorem, that the convergence of the $f_n$ towards $f$ is in fact uniform on the compact $[-M,M]$. In particular, one obtains
\[
\mathrm{Ent}(f^2,\mu) = \lim_{n \to \infty} \mathrm{Ent}(f_n^2,\mu) = C(\mu,\nu).
\]
Discarding the trivial situation that $C(\mu,\nu) = 0$ (which corresponds to the cases when $\mu$ is a Dirac mass or $\nu = +\infty$ a.s. on the convex hull of the support of $\mu$), one then obtains
\[
\frac{\mathrm{Ent}(f^2,\mu)}{\nu[(f')^2]} \geq C(\mu,\nu),
\]
with strict inequality if $0 \leq \nu[(f')^2] < 1$, wherefrom necessarily $\nu[(f')^2] = 1$. So $f$ is a maximising function for (1), which, moreover, belongs to $\widetilde{\mathcal C}_4$, whereof one is easily convinced: at the cost of extracting a subsequence, one can require that the number (between $0$ and $2$) of internal connected components is the same for each $f_n$ and that there exists a point in each of these components which converges in $[-M,M]$ for large $n$, and this allows to see a posteriori that $f \in \widetilde{\mathcal C}_4$. If $f$ is not already monotone, the procedure of the preceding section can be applied again to construct $\widehat f$ and $\widetilde f$. As $f$ is maximising, so must these two functions be; now, owing to $f$ belonging to $\widetilde{\mathcal C}_4$, $\widehat f$ is necessarily monotone. So these arguments allow to conclude that $C(\mu,\nu) = D(\mu,\nu)$.

Remark 2. The latter proof rests partially on the existence of a maximising function for (1), but, contrary to the approach by Chen and Wang [6, 8] (in the case of the Poincaré constant), we have not tried to exploit the equation it fulfills. More generally, call $S(\mu)$ the convex hull of the support of $\mu$ and $[s_-, s_+]$ its closure in the compactified real line $\mathbb R \sqcup \{-\infty, +\infty\}$. Still denoting by $\nu$ the density of $\nu$ with respect to $\lambda$, assume that
\[
\int_{S(\mu)} \frac{1}{\nu}\, d\lambda < +\infty.
\]
One can then show that if $C(\mu,\nu) > A(\mu,\nu)/2$, a maximising function for (1) exists (but these two conditions are not sufficient, as can be seen by taking for $\mu$ and $\nu$ the standard Gaussian distribution). Indeed, fix $o \in S(\mu)$ and define
\[
\forall\ x \in S(\mu), \qquad F(x) := \int_o^x \frac{1}{\nu(y)}\, dy.
\]
By the preceding condition, $F$ is continuously extendable to $[s_-, s_+]$. Consider then an absolutely continuous function $f$ whose weak derivative satisfies $\int (f')^2\, d\nu \leq 1$. Applying as above a Cauchy-Schwarz inequality, one gets that
\[
\forall\ x, y \in S(\mu), \qquad |f(y) - f(x)| \leq \sqrt{|F(y) - F(x)|},
\]
and consequently, by Cauchy's criterion, $f$ too is continuously extendable to $[s_-, s_+]$. One can then repeat the preceding arguments on this compact (taking into account that $\nu^{-1}\mathbb 1_I \in L^2(S(\mu), \nu)$ for each segment $I \subset [s_-, s_+]$, this allowing to obtain pointwise convergence from the weak compactness of the unit ball of $L^2(S(\mu), \nu)$), and see that except when $C(\mu,\nu) = A(\mu,\nu)/2$, there exists a maximising function $f$ for (1) (and since it is known that dealing with monotone functions is sufficient, Ascoli's theorem can even be replaced with one of Dini's theorems). Performing a variational calculation around this function, one realizes that it satisfies two conditions:
\[
\int_{S(\mu)} f \ln\!\left( \frac{f^2}{\mu[f^2]} \right) d\mu = 0
\]
and
\[
\text{for a.a. } x \in S(\mu), \qquad C(\mu,\nu)\,\nu(x)\,f'(x) = \int_{[s_-, x]} f \ln\!\left( \frac{f^2}{\mu[f^2]} \right) d\mu. \tag{7}
\]
Obviously, if moreover the function $\nu$ is assumed to be absolutely continuous and $\mu$ absolutely continuous with respect to $\lambda$, a further differentiation yields a second-order equation (non-linear in the zeroth-order term) satisfied by $f$. Last, if in addition $[s_-, s_+] \subset \mathbb R$, $\nu(s_-) > 0$ and $\nu(s_+) > 0$, equation (7) allows to recover a Neumann condition for $f$, namely $f'(s_-) = f'(s_+) = 0$.
5 Extensions

We present here a few generalisations of the preceding results, corresponding to modifications of the quantities featuring in (1).

5.1 Modification of the Energy in the Discrete Case

We shall show here Theorem 3, whose context is now assumed, and we put
\[
E(\mu,\nu) := \sup_{f \in \mathcal C} \frac{\mathrm{Ent}(f^2,\mu)}{\mathcal E_\nu(f^2, \ln(f^2))}.
\]
Considering $\mathbb Z$ brings no further difficulty, since, as in section 4, one can without loss consider only the finite situation where $E = \{0, \dots, N\}$ with $N \in \mathbb N^*$, at the cost of accumulating mass on the endpoints and translating the obtained segment. However, we take this opportunity to point out the most
famous infinite example where the preceding constant is finite, namely the Poisson laws on $\mathbb N$: fix $\alpha > 0$ and take
\[
\forall\ l \in \mathbb N, \qquad \mu(\{l\}) := \frac{\alpha^l \exp(-\alpha)}{l!}, \qquad \nu(\{l, l{+}1\}) := \mu(\{l\}).
\]
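As a numerical sanity check of this example (our own sketch: the truncation level and the random test functions are arbitrary choices), one can verify on samples the inequality $\mathrm{Ent}(f^2,\mu) \leq \alpha\, \mathcal E_\nu(f^2, \ln(f^2))$ that corresponds to the value $\alpha$ of the constant quoted just below:

```python
import numpy as np
from math import factorial

alpha, L = 1.5, 60                   # alpha and truncation level: our choices
l = np.arange(L)
mu = alpha ** l * np.exp(-alpha) / np.array([float(factorial(int(k))) for k in l])
nu = mu[:-1]                         # nu({l, l+1}) = mu({l})

rng = np.random.default_rng(1)
for _ in range(200):
    h = rng.uniform(0.1, 5.0, size=L) ** 2        # h = f^2 > 0
    ent = np.sum(mu * h * np.log(h)) - np.sum(mu * h) * np.log(np.sum(mu * h))
    en = np.sum(nu * np.diff(h) * np.diff(np.log(h)))
    assert ent <= alpha * en + 1e-9
```

The truncation is harmless here: the Poisson($1.5$) mass beyond $l = 59$ is far below machine precision.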
It is then known (see for instance section 1.6 of the book [1] by Ané, Blachère, Chafaï, Fougères, Gentil, Malrieu, Roberto and Scheffer) that $E(\mu,\nu)$ equals $\alpha$. To get convinced of Theorem 3, one has to inspect again the three-step proof in sections 2 and 3.

• As in the case of the logarithmic Sobolev inequality, one is brought back, up to a multiplicative constant, to the problem of estimating the Poincaré constant when there exists a maximising sequence $(f_n)_{n \in \mathbb N}$ verifying, for all $n \in \mathbb N$,
\[
\mathcal E_\nu(f_n^2, \ln(f_n^2)) = 1, \qquad \lim_{n \to \infty} |f_n(0)| = +\infty.
\]
Indeed, it is well known (see for instance Lemma 2.6.6 in the book by Ané et al. [1]) that
\[
\forall\ f \in \mathcal C, \qquad \mathcal E_\nu(f^2, \ln(f^2)) \geq 4\nu[(f')^2];
\]
so the first condition above ensures that the oscillations of the $f_n$ are bounded in $n \in \mathbb N$ (the situation should have been beforehand reduced to the case when $\nu > 0$). This observation allows to perform finite order expansions showing the following equivalent for large $n$:
\[
\frac{\mathrm{Ent}(f_n^2,\mu)}{\mathcal E_\nu(f_n^2, \ln(f_n^2))} \sim \frac{\mathrm{Var}(f_n,\mu)}{8\nu[(f_n')^2]},
\]
wherefrom one easily deduces
\[
\sup_{f \in \mathcal C} \frac{\mathrm{Ent}(f^2,\mu)}{\mathcal E_\nu(f^2, \ln(f^2))} = \sup_{f \in \mathcal D} \frac{\mathrm{Ent}(f^2,\mu)}{\mathcal E_\nu(f^2, \ln(f^2))} = \frac{A(\mu,\nu)}{8}.
\]
Thus it suffices to consider the situations where there exists a maximising sequence $(f_n)_{n \in \mathbb N}$ such that
\[
\forall\ n \in \mathbb N, \qquad \mathcal E_\nu(f_n^2, \ln(f_n^2)) = 1, \qquad \limsup_{n \to \infty} |f_n(0)| < \infty,
\]
in which cases one can extract a subsequence that converges toward a maximiser for the supremum we are interested in.
• Calling f this maximiser, one is easily convinced that it cannot vanish, at least in the relevant situations where E(μ, ν) > 0. Performing then a variational computation around f shows it to verify for each i ∈ E the following equation:
\[
\mu(i)\, f(i) \ln\!\left( \frac{f^2(i)}{\mu[f^2]} \right)
= E(\mu,\nu) \left[ f(i) \Big( \nu(\{i, i{+}1\})\big(\ln(f^2(i)) - \ln(f^2(i{+}1))\big) + \nu(\{i{-}1, i\})\big(\ln(f^2(i)) - \ln(f^2(i{-}1))\big) \Big) + \frac{\nu(\{i, i{+}1\})\big(f^2(i) - f^2(i{+}1)\big) + \nu(\{i{-}1, i\})\big(f^2(i) - f^2(i{-}1)\big)}{f(i)} \right]
\]
(as usual, $\nu(\{-1, 0\}) = 0 = \nu(\{N, N{+}1\})$, hence the terms $f(-1)$ and $f(N{+}1)$ never show up). If $\mu$ does not vanish, the form of this equation enables to apply the arguments of the end of section 3, taking advantage of the fact that a maximising function for $E(\mu,\nu)$ cannot take the same value at three consecutive points, unless it is constant (which won't do either). Remark also that contrary to sections 2 and 3, this equation does not allow to recursively compute $f$ from the values of $f(0)$ and $\mu[f^2]$, for the right-hand side is not injective as a function of $f(i{+}1)$ (for $0 \leq i < N$), but only as a function of $f^2(i{+}1)$. But this could be foreseen, since the signs of the functions really play no role in the quantities considered here. There remain the cases when $\mu$ vanishes at some (interior) points; they cannot be discarded as before Lemma 1. The simplest is to bypass the argument of the three consecutive points with the same value, by adapting the second proof of the preceding section (by classifying the functions according to the maximal number of segments included in their set of local extrema); this is immediate enough.

• The last point to be verified, which is also the most important, is the possibility of modifying Lemma 2; namely, with the notations therein, is it true that for all $s \in\, ]f(i_1), f(i_0)[$,
\[
\mathcal E_\nu(f^2, \ln(f^2)) \geq \mathcal E_\nu\big(\widehat f_s^2, \ln(\widehat f_s^2)\big) + \mathcal E_\nu\big((s + \widetilde f_s)^2, \ln((s + \widetilde f_s)^2)\big) \tag{8}
\]
for any function $f$ with a constant sign (the situation should have been reduced to that case). This question amounts to asking if for all $0 \leq x \leq y \leq z$, one has
\[
\varphi_{x,z}(y) \leq (z - x)\big(\ln(z) - \ln(x)\big), \tag{9}
\]
where $\varphi_{x,z}$ is the function defined by
\[
\forall\ y \in [x,z], \qquad \varphi_{x,z}(y) := (y - x)\big(\ln(y) - \ln(x)\big) + (z - y)\big(\ln(z) - \ln(y)\big).
\]
Now, differentiating this function twice shows it to be strictly convex, and (9) then derives from the fact that $\varphi_{x,z}(x) = \varphi_{x,z}(z) = (z - x)(\ln(z) - \ln(x))$. One also derives therefrom that equality in (8) can hold only if $\widehat f_s'(a)\,\widetilde f_s'(a) = 0$ for every edge $a \in A$.
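Inequality (9) is easy to probe numerically on random triples (a quick sketch; the sampling range is an arbitrary choice of ours):

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    x, y, z = np.sort(rng.uniform(0.01, 10.0, size=3))
    phi = (y - x) * (np.log(y) - np.log(x)) + (z - y) * (np.log(z) - np.log(y))
    bound = (z - x) * (np.log(z) - np.log(x))
    assert phi <= bound + 1e-12      # inequality (9)
```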
The other arguments of section 3 are valid without modification, since they only involve entropy. Theorem 3 follows.

5.2 Modification of the Energy in the Continuous Case

Our aim here is to prove Theorem 4. Recall that $H : \mathbb R_+ \to \mathbb R_+$ is a convex function such that $H(0) = 0$ and $H'(0) = 1$ (besides these two equalities, we shall only use the bound $x \leq H(x)$, valid for all $x \geq 0$). In particular, it appears that
\[
\forall\ f \in \mathcal C, \qquad E_{H,\nu}(f) \geq \nu[(f')^2]. \tag{10}
\]
For $\mu$ a probability and $\nu$ a measure on $\mathbb R$, put
\[
F(\mu,\nu) := \sup_{f \in \mathcal C} \frac{\mathrm{Ent}(f^2,\mu)}{E_{H,\nu}(f)} \in \bar{\mathbb R}_+.
\]
In view of the second proof in the preceding section, the only non-immediate point in the proof of Theorem 4 concerns the cases that can be reduced to that of the Poincaré constant. Indeed, after having supposed without loss that $\mu$ is supported in $[-M,M]$ and that $\nu \geq \eta$, with $M, \eta > 0$, we have to see that if $(f_n)_{n \in \mathbb N}$ is a maximising sequence for $F(\mu,\nu)$ such that
\[
\forall\ n \in \mathbb N, \qquad E_{H,\nu}(f_n) = 1, \qquad \lim_{n \to \infty} |f_n(0)| = +\infty,
\]
then $F(\mu,\nu) = A(\mu,\nu)/2$. But, again, such a sequence will satisfy $\nu[(f_n')^2] \leq 1$ for all $n \in \mathbb N$ (by (10)), and the oscillations of the $f_n$ will be bounded, allowing to obtain for large $n$ the equivalent
\[
\mathrm{Ent}(f_n^2,\mu) \sim \frac{\mathrm{Var}(f_n,\mu)}{2}.
\]
By extracting a subsequence (first, by relative compactness of the $f_n - f_n(0)$, then, by Ascoli's theorem), one may suppose that the $f_n - f_n(0)$ converge uniformly to $f \in \mathcal C$, with $\nu[(f')^2] \leq 1$, wherefrom
\[
F(\mu,\nu) = \lim_{n \to \infty} \mathrm{Ent}(f_n^2,\mu) = \lim_{n \to \infty} \frac{\mathrm{Var}(f_n,\mu)}{2} = \frac{\mathrm{Var}(f,\mu)}{2} \leq \frac{\mathrm{Var}(f,\mu)}{2\nu[(f')^2]} \leq \frac{A(\mu,\nu)}{2}.
\]
However, the reverse inequality always holds. Indeed, note first that one may content oneself in only dealing, for the supremum defining $A(\mu,\nu)$, with functions having a weak derivative essentially bounded in the sense of the Lebesgue
measure on $[-M,M]$. This is because only functions such that $\nu[(f')^2] < +\infty$ need to be considered, and such functions can be approximated in the traditional way. Let $f \in \mathcal C$ with $f \geq 0$ and $f'$ bounded. For $n \in \mathbb N$, consider $f_n := n + f$. The oscillation of $f$ being finite, for large $n$ one has $\mathrm{Ent}(f_n^2,\mu) \sim \mathrm{Var}(f_n,\mu)/2 = \mathrm{Var}(f,\mu)/2$. On the other hand, since $H'(0) = 1$, one has by dominated convergence
\[
\lim_{n \to \infty} E_{H,\nu}(f_n) = \lim_{n \to \infty} \int H\!\left( \frac{(f')^2}{(n+f)^2} \right) (n+f)^2\, d\nu = \int (f')^2\, d\nu.
\]
It ensues therefrom that
\[
\frac{\mathrm{Var}(f,\mu)}{2\nu[(f')^2]} \leq F(\mu,\nu),
\]
then the claimed inequality, by taking the supremum over such functions $f$. Similar results hold when $\mathcal C$ is replaced with $\mathcal D$. It therefore suffices to deal with sequences $(f_n)_{n \in \mathbb N}$ maximising for $F(\mu,\nu)$, satisfying $E_{H,\nu}(f_n) = 1$ for all $n \in \mathbb N$, and such that $\lim_{n \to \infty} f_n(0)$ exists in $\mathbb R$. But in this situation, the arguments in the second proof in section 4 easily adapt (after one has noted that for each function $f \in \mathcal C$ which splits as $\widehat f + \widetilde f$, with $\widehat f, \widetilde f \in \mathcal C$ and $\widehat f'\,\widetilde f' = 0$ a.s., one trivially has $E_{H,\nu}(f) = E_{H,\nu}(\widehat f) + E_{H,\nu}(\widetilde f)$).

Remark 3. One may wonder if there is a link between the discrete modified logarithmic Sobolev inequalities and the continuous ones as above. As an attempt to shed light on such a link, consider again the approximation procedure used in the first proof of section 4. Thus we work with a probability $\mu$ of the form $\sum_{0 \leq n \leq N} \mu(n)\delta_n$. The constant $F(\mu,\nu)$ can then be rewritten
\[
\sup_{f \in \mathcal C} \frac{\mathrm{Ent}(f^2,\mu)}{E_J(f)} \tag{11}
\]
with, for each $f \in \mathcal C$ in the discrete context,
\[
E_J(f) := \sum_{0 \leq n < N} J_{n,n+1}\big(f(n), f(n+1)\big),
\]
and where the maps $(J_{n,n+1})_{0 \leq n < N}$ are defined by
\[
J_{n,n+1}(x,y) := \inf_{g \in \mathcal C([n,n+1])\,:\; g(n)=x,\ g(n+1)=y}\ \int_n^{n+1} H\!\left( \frac{(g')^2}{g^2} \right) g^2\, \nu\, d\lambda.
\]
Obviously, the supremum (11) is not changed by restricting it to monotone functions, since this “discrete” problem can be interpreted in the continuous context where this property has just been verified. But one could certainly also
show it directly; note in particular that for any $0 \leq n < N$ and all real numbers $x \leq y \leq z$, one has indeed $J_{n,n+1}(x,z) \geq J_{n,n+1}(x,y) + J_{n,n+1}(y,z)$ (it suffices to split any function going from $x$ to $z$ as the sum of two functions, the first one being its restriction going from $x$ to $y$ and remaining there). This leads to ponder on the possibility of rewriting $\mathcal E_\nu$ as an $E_J$, for a suitable choice of the continuous measure $\nu$ (the discrete one being given), and of the function $H$.

5.3 Modification of the Entropy

We now aim to change the entropy term in (1); this leads to logarithmic Sobolev inequalities modified in another sense (see for instance Chafaï [5]). This will give the opportunity to test the limits of the arguments in section 3. We shall content ourselves by treating the discrete case with the usual energy given by the quadratic form $\mathcal C \ni f \mapsto \nu[(f')^2]$, although one may think that similar considerations should allow to extend the following to the continuous situation or to energies modified as above. Let $\varphi : \mathbb R_+ \to \mathbb R$ be a convex function, of class $C^3$ on $]0, +\infty[$. The corresponding modified entropy is the functional which to any map $f \in \mathcal C$, $f \geq 0$, associates the quantity (positive by Jensen's inequality)
\[
E_\varphi[f] := \mu[\varphi(f)] - \varphi(\mu[f]).
\]
Unfortunately the expression $E_\varphi(f^2)$ is no longer quadratically homogeneous in $f$ (unless it is proportional to the usual entropy in $f^2$). To remedy this flaw, we shall need two additional hypotheses. Call $\psi$ the map defined by, for $x > 0$,
ψ(x) xϕ (x) − ϕ(x).
One says that ψ is asymptotically concave if for some R > 0 the function ψ remains below its tangents at points larger than R: ∀ y R, ∀ x > 0,
ψ(x) ψ(y) + ψ (y)(y − x).
This notably implies that ψ is concave on [R, +∞[ (which is not sufficient, but becomes sufficient if moreover limx→+∞ ψ(x) − xψ (x) = +∞). We shall first suppose ψ to be asymptotically concave. The second additional hypothesis states the existence of a constant η > 0 such that for any 0 < x < η, one has ϕ (x) + xϕ (x) 0 (if ϕ is C 3 on R+ , this is ensured by ϕ (0) > 0; more generally, if one does not even want to suppose ϕ to be of class C 3 on R∗+ , it can be seen that it suffices to suppose that the map x → xϕ (x) is increasing on some interval ]0, η[). An example of a function ϕ satisfying these conditions is R+ x → x ln ln(e + x) . Remark that ∀ x > 0,
ψ (x) = xϕ (x) 0
Monotonicity of the Extremal Functions
127
and that this quantity decreases for x R; hence it admits a limit L 0 at +∞. So ϕ (x) (1 + L)/x for x large, which shows that up to a constant factor, ϕ(x) is dominated by x ln(x). Somehow, the usual entropy is an upper bound for the modified entropies to be considered here. For μ a probability on E = {0, ..., N } and ν a measure on the corresponding set A of edges, we are interested in the quantity G(μ, ν) sup f ∈C
Eϕ (f 2 ) ν[(f )2 ]
and our aim here is to prove Proposition 2. One has as usual G(μ, ν) = sup
f ∈D
Eϕ (f 2 ) . ν[(f )2 ]
The main annoyance comes from the inhomogeneity of Eϕ , which a priori forbids to only consider maximising sequences for G(μ, ν) with energy bounded above and below by a strictly positive constant. To remedy to that, observe that nothing here hinders us from supposing μ and ν to be strictly positive on E. This property ensures the existence of a constant b1 > 0 such that ∀ g ∈ C,
ν[(g )2 ] = 1
⇒
μ[g 2 ] b1 .
Fix a function g satisfying ν[(g )2 ] = 1 and consider the function F : R∗+ t → Eϕ [tg 2 ]/t.
(12)
A computation gives its derivative as ∀ t > 0,
F (t) = t−2 μ[ψ(tg 2 )] − ψ(tμ[g 2 ]) .
So by our hypothesis that ψ is asymptotically concave, F is decreasing on [R/b1 , +∞[. This shows that G(μ, ν) =
sup f ∈C : ν[(f )2 ]R/b1
Eϕ (f 2 ) , ν[(f )2 ]
which enables us to only consider maximising sequences (fn )n∈N satisfying ν[(fn )2 ] R/b1 for all n√∈ N. One can also suppose that these functions fn are positive. Write fn = tn gn , with tn > 0 (discarding the trivial cases that a sub-sequence reduces tn = 0) and gn ∈ C satisfying ν[(gn )2 ] = 1. Extracting to the situation when the sequences (tn )n∈N and fn (0) n∈N are respectively ¯ + . Several cases will be distinguished: convergent in [0, R/b1 ] and R • If limn→∞ tn = 0, we shall verify that we may without loss suppose that limn→∞ fn (0) > 0. Indeed, our second hypothesis on ψ ensures that for g ∈ C, g 0, the function F defined in (12) is increasing on ]0, η/ max g 2 ]. This is
128
L. Miclo
obtained via a second-order expansion with remainder: for fixed t > 0, there exists a function θt : E → ]0, t max g 2 [ such that ψ(tg 2 ) = ψ(tμ[g 2 ]) + ψ (tμ[g 2 ])t(g 2 − μ[g 2 ]) +
ψ (θt ) 2 2 t (g − μ[g 2 ])2 . 2
When this inequality is integrated with respect to μ, it appears that F (t) is positive as soon as t max g 2 η. On the other hand, there exists a constant b2 > 0 such that if g satisfies ν[(g )2 ] = 1, then osc(g) b2 and hence, if moreover g is positive, max g 2 (g(0) + b2 )2 . Consequently, if one constructs a new sequence ( tn )n∈N by setting tn si tn (gn (0) + b2 )2 > η ∀ n ∈ N, tn 2 else, η/(gn (0) + b2 ) the sequence (fn )n∈N defined by fn tn gn for n ∈ N remains maximising for G(μ, ν). We consider from now on this sequence, still called (fn )n∈N . Then one has ∀ n ∈ N, tn (gn (0) + b2 )2 η, √ that is to say fn2 (0) + 2b2 tn fn (0) + b22 tn η, which prevents the convergence limn→∞ fn (0) = 0. One can now perform a second-order expansion with for Eϕ (fn2 ); √ remainder √ there exists a new function θn valued in [fn (0) − tn b2 , fn (0) + tn b2 ] and such that Eϕ (fn2 ) = μ ϕ (θn )(fn2 − μ[fn2 ])2 /2. First consider the case that l limn→∞ fn (0) is finite. Since l > 0, one has uniformly on E lim ϕ (θn )(fn + μ[fn2 ])2 /2 = 2l2 ϕ (l2 ). n→∞
If l2 ϕ (l2 ) > 0, one draws therefrom the equivalent for large n Eϕ (fn2 ) ∼ 2l2 ϕ (l2 )μ[(fn − μ[fn2 ])2 ] 2l2 ϕ (l2 ) Var(fn , μ), wherefrom Eϕ (fn2 ) Var(fn , μ) 2l2 ϕ (l2 ) lim sup 2l2 ϕ (l2 )A(μ, ν). )2 ] n→∞ ν[(fn ν[(fn )2 ] n→∞ lim
Similarly, one gets Eϕ (fn2 ) =0 )2 ] n→∞ ν[(fn lim
Monotonicity of the Extremal Functions
129
when l2 ϕ (l2 ) = 0. So it appears that one always has G(μ, ν) sup 2l2 ϕ (l2 )A(μ, ν) = sup 2l2 ϕ (l2 )A(μ, ν), l>0
l>η
where the latter equality comes from the map x → xϕ (x) being increasing on ]0, η]. Conversely the inequality G(μ, ν) supl>η 2l2 ϕ (l2 )A(μ, ν) is satisfied under all circumstances: for all l larger than some given η, in the supremum defining G(μ, ν), it suffices to consider functions of the form l + f , with f ∈ C and > 0 which is made to tend to 0. The above argument also holds if limn→∞ fn (0) = +∞, by existence and finiteness of L = limx→+∞ xϕ (x). Thus, in all cases, the convergence entails the equality G(μ, ν) = supl>η2 2lϕ (l)A(μ, ν). Then, one also has sup
f ∈D
Eϕ (f 2 ) Var(f, μ)
= sup = sup 2lϕ (l) sup 2lϕ (l) A(μ, ν), 2 ν[(f )2 ] f ∈D ν[(f ) ] l>η 2 l>η 2
the claimed identity (2) follows. • If limn→∞ tn ∈ ]0, R/b1 ], one is back in a more classical framework, and, as already happened several times, two sub-cases will be considered. - If limn→∞ fn (0) = +∞, the boundedness in n ∈ N of the oscillations of the fn and the convergence limt→+∞ xϕ (x) = L allow again to perform a second-order expansion with remainder, yielding for large n the equivalent Eϕ (fn2 ) ∼
L Var(fn , μ) 2
if L > 0. On the other hand, if L = 0, it appears that Eϕ (fn2 ) Var(fn , μ). Since A(μ, ν) < +∞, the latter possibility implies that one is in the trivial situation that G(μ, ν) = 0. If L > 0, one also obtains G(μ, ν) = LA(μ, ν)/2. So one is reduced to the case of the Poincar´e inequality. - If limn→∞ fn (0) exists in R, one easily shows existence of some minimising function. But the proof of Lemma 4 immediately adapts to this situation, in view of the form of the modified entropy Eϕ . Then the quickest way to conclude that (2) holds is to adapt the second proof of section 4.
References 1. C´ecile An´e, S´ebastien Blach`ere, Djalil Chafa¨ı, Pierre Foug`eres, Ivan Gentil, Florent Malrieu, Cyril Roberto and Gr´egory Scheffer. Sur les in´egalit´es de Sobolev logarithmiques, volume 10 of Panoramas et Synth`eses. Soci´et´e Math´ematique de France, Paris, 2000. With a preface by Dominique Bakry and Michel Ledoux.
130
L. Miclo
2. F. Barthe and C. Roberto. Sobolev inequalities for probability measures on the real line. Studia Math., 159(3):481–497, 2003. Dedicated to Professor Aleksander Pelczy´ nski on occasion of his 70th birthday (Polish). 3. S. G. Bobkov and F. G¨ otze. Exponential integrability and transportation cost related to logarithmic Sobolev inequalities. J. Funct. Anal., 163(1):1–28, 1999. 4. Eric A. Carlen. Superadditivity of Fisher’s information and logarithmic Sobolev inequalities. J. Funct. Anal., 101(1):194–211, 1991. 5. Djalil Chafa¨ı. Entropies, convexity, and functional inequalities. J. Math. Kyoto Univ., 44(2):325–363, 2004. 6. Mu-Fa Chen and Feng-Yu Wang. Estimation of spectral gap for elliptic operators. Trans. Amer. Math. Soc., 349(3):1239–1267, 1997. 7. Mufa Chen. Analytic proof of dual variational formula for the first eigenvalue in dimension one. Sci. China Ser. A, 42(8):805–815, 1999. 8. Mufa Chen. Variational formulas and approximation theorems for the first eigenvalue in dimension one. Sci. China Ser. A, 44(4):409–418, 2001. 9. Ivan Gentil, Arnaud Guillin and Laurent Miclo. Modified logarithmic Sobolev inequalities and transportation inequalities. Probab. Theory Related Fields 133 (2005), no. 3, 409–436. 10. Leonard Gross. Logarithmic Sobolev inequalities. Amer. J. Math., 97(4):1061– 1083, 1975. 11. Laurent Miclo. Quand est-ce que des bornes de Hardy permettent de calculer une constante de Poincar´e exacte sur la droite? Annales de la Facult´e des Sciences de Toulouse, S´er. 6, no. 17(1):121–192, 2008. 12. Laurent Miclo. On eigenfunctions of Markov processes on trees. Probab. Theory Related Fields, 142(3-4):561–594, 2008. 13. O. S. Rothaus. Logarithmic Sobolev inequalities and the spectrum of SturmLiouville operators. J. Funct. Anal., 39(1):42–56, 1980. 14. O. S. Rothaus. Diffusion on compact Riemannian manifolds and logarithmic Sobolev inequalities. J. Funct. Anal., 42(1):102–109, 1981. 15. O. S. Rothaus. 
Logarithmic Sobolev inequalities and the spectrum of Schr¨ odinger operators. J. Funct. Anal., 42(1):110–120, 1981. 16. Walter Rudin. Real and complex analysis. McGraw-Hill Book Co., New York, third edition, 1987. 17. Laurent Saloff-Coste. Lectures on finite Markov chains. In Lectures on probability theory and statistics (Saint-Flour, 1996), volume 1665 of Lecture Notes in Math., pages 301–413. Springer, Berlin, 1997. 18. Liming Wu. A new modified logarithmic Sobolev inequality for Poisson point processes and several applications. Probab. Theory Related Fields, 118(3):427–438, 2000.
Non-monotone Convergence in the Quadratic Wasserstein Distance Walter Schachermayer1 , Uwe Schmock2 , and Josef Teichmann3 1
2
3
Vienna University of Technology Wiedner Hauptstrasse 8–10, 1040 Vienna, Austria email: [email protected] Vienna University of Technology Wiedner Hauptstrasse 8–10, 1040 Vienna, Austria email: [email protected] Vienna University of Technology Wiedner Hauptstrasse 8–10, 1040 Vienna, Austria email: [email protected]
Summary. We give an easy counterexample to Problem 7.20 from C. Villani’s book on mass transport: in general, the quadratic Wasserstein distance between n-fold normalized convolutions of two given measures fails to decrease monotonically.
We use the terminology and notation from [5]. For Borel measures μ, ν on Rd we define the quadratic Wasserstein distance T (μ, ν) := inf E X − Y 2 (X,Y )
where · is the Euclidean distance on Rd and the pairs (X, Y ) run through all random vectors defined on some common probability space (Ω, F, P), such that X has distribution μ and Y has distribution ν. By a slight abuse of notation, we define T (U, V ) := T (μ, ν) for two random vectors U , V , such that U has distribution μ and V has distribution ν. The following theorem (see [5, Proposition 7.17]) is due to Tanaka [4]. Theorem 1. For a, b ∈ R and square integrable random vectors X, Y , X , Y such that X is independent of Y , and X is independent of Y , and E[X] = E[X ] or E[Y ] = E[Y ], we have T (aX + bY, aX + bY ) ≤ a2 T (X, X ) + b2 T (Y, Y ). For a sequence of i.i.d. random vectors (Xi )i∈N we define the normalized partial sums m 1 Xi , m ∈ N. Sm := √ m i=1 C. Donati-Martin et al. (eds.), S´ eminaire de Probabilit´ es XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6 3, c Springer-Verlag Berlin Heidelberg 2009
131
132
W. Schachermayer et al.
If μ denotes the law of X1 , we√write μ(m) for the law of Sm . Clearly μ(m) equals, up to the scaling factor m, the m-fold convolution μ ∗ μ ∗ · · · ∗ μ of μ. We shall always deal with measures μ, ν with vanishing barycenter. Given two measures μ and ν on Rd with finite second moments, we let (Xi )i∈N and (Xi )i∈N be i.i.d. sequences with law μ and ν, respectively, and denote by the corresponding normalized partial sums. From Theorem 1 we Sm and Sm obtain m ∈ N, T μ(2m) , ν (2m) ≤ T μ(m) , ν (m) , from which one may quickly deduce a proof of the central limit theorem (compare [5, Ch. 7.4] and the references given there). However, we can not deduce from Theorem 1 that the inequality T μ(m+1) , ν (m+1) ≤ T μ(m) , ν (m) (1) holds true for all m ∈ N. Specializing to the case m = 2, an estimate, which we can obtain from Tanaka’s theorem, is 1 T μ(3) , ν (3) ≤ 2T μ(2) , ν (2) + T (μ, ν) ≤ T (μ, ν). 3 This contains some valid information, but does not imply (1). It was posed as Problem 7.20 of [5], whether inequality (1) holds true for all probability measures μ, ν on Rd and all m ∈ N. The subsequent easy example shows that the answer is no, even for d = 1 and symmetric measures. We can choose μ = μn and ν = νn for sufficiently large n ≥ 2, as the following proposition (see also Remark 1) shows. 2n−1 Proposition 1. Denote by μn the distribution of i=1 Zi , and by νn the 2n 1 distribution of i=1 Zi with (Zi )i∈N i.i.d. and P(Z1 = 1) = P(Z1 = −1) = 2 . Then √ 2 (2) lim n T (μn ∗ μn , νn ∗ νn ) = √ , n→∞ 2π while T (μn ∗ μn ∗ μn , νn ∗ νn ∗ νn ) ≥ 1 for all n ∈ N. Remark 1. If one only wants to find a counterexample to Problem 7.20 of [5], one does not really need the full √ strength of Proposition 1, i.e., the estimate that T (μn ∗ μn , νn ∗ νn ) = O(1/ n). In fact, it is sufficient to consider the case n = 2 in order to contradict the monotonicity of inequality (1). Indeed, a direct calculation reveals that √ 2 2 2 T (μ2 ∗ μ2 ∗ μ2 , ν2 ∗ ν2 ∗ ν2 ). 
T (μ2 ∗ μ2 , ν2 ∗ ν2 ) = 0.625 < ≤ √ 3 3 Proof (of Proposition 1). We start with the final assertion, which is easy to show. The 3-fold convolutions of the measures μn and νn , respectively,
Non-monotone Convergence in the Quadratic Wasserstein Distance
133
are supported on odd and even numbers, respectively. Hence they have disjoint supports with distance 1 and so the quadratic transportation costs are bounded from below by 1. For the proof of (2), fix n ∈ N, define σn = μn ∗ μn and τn = νn ∗ νn , and note that σn and τn are supported by the even numbers. For k = −(2n − 1), . . . , (2n − 1) we denote by pn,k the probability of the point 2k under σn , i.e. 4n − 2 1 pn,k = . k + 2n − 1 24n−2 We define pn,k = 0 for |k| ≥ 2n. We have τn = σn ∗ρ, where ρ is the distribution giving probability 41 , 12 , 14 to −2, 0, 2, respectively. We deduce that for 0 ≤ k ≤ 2n − 2, 1 1 1 pn,k + pn,k+2 + pn,k+1 4 4 2 1 1 = (pn,k − pn,k+1 ) + (pn,k+2 − pn,k+1 ) + σn (2k + 2) (3) 4 4
pn,k+1 1 pn,k+2 1 + pn,k+1 = pn,k 1 − − 1 + σn (2k + 2). 4 pn,k 4 pn,k+1
τn (2k + 2) =
Notice that pn,k ≥ pn,k+1 for 0 ≤ k ≤ 2n−1. The term in the first parentheses is therefore non-negative. It can easily be calculated and estimated via 4n−2 pn,k+1 2k + 1 2k + 1 2n − k − 1 =1− 0≤1− = ≤ , = 1 − k+2n 4n−2 pn,k k + 2n 2n + k 2n k+2n−1 for 0 ≤ k ≤ 2n − 1. Following [5] we know that the quadratic Wasserstein distance T can be given by a cyclically monotone transport plan π = πn . We define the transport plan π via an intuitive transport map T . It is sufficient to define T for 0 ≤ k ≤ 2n − 1, since it acts symmetrically on the negative side. T moves mass 1 2k+1 4 pn,k 2n+k from the point 2k to 2k + 2 for k ≥ 1. At k = 0 the transport T 1 moves 8n pn,0 to every side, which is possible, since there is enough mass concentrated at 0. By equation (3) we see that the transport T moves σn to τn , since, for 1 ≤ k ≤ 2n−2, the first terms corresponds to the mass, which arrives from the left and is added to σn , and the second term to the mass, which is transported away: summing up one obtains τn . For k = 2n − 1, mass only arrives from the left. At k = 0 mass is only transported away. By the symmetry of the problem around 0 and by the quadratic nature of the cost function (the distance of the transport is 2, hence cost 22 ), we finally have T (σn , τn ) ≤ 2
2n−1 k=0
2n−1 22 2k + 1 2k + 1 pn,k ≤ . pn,k 4 2n + k n k=0
134
W. Schachermayer et al.
By the central limit theorem and uniform integrability of the function x → x+ := max(0, x) with respect to the binomial approximations, we obtain ∞ 2n−1 2 1 x √ e−x /2 dx. (2k)pn,k = lim √ n→∞ 2 n 2π 0 k=0 Hence
√
2 ≈ 0.79788 . n T (σn , τn ) ≤ √ n→∞ 2π In order to obtain equality we start from the local monotonicity of the respective transport maps on non-positive and non-negative numbers. It easily follows that the given transport plan is cyclically monotone and hence optimal (see [5, Ch. 2]). The subsequent equality allows also to consider estimates from below. Rewriting (3) yields
p
1 pn,k+1 1 n,k + σn (2k + 2) − 1 + pn,k+2 1 − τn (2k + 2) = pn,k+1 4 pn,k+1 4 pn,k+2 lim sup
for 0 ≤ k ≤ 2n − 3, and τn (2k + 2) =
p 1 n,k pn,k+1 − 1 + σn (2k + 2) 4 pn,k+1
for k = 2n − 2. Furthermore, 4n−2 pn,k 2k + 1 2k + 1 k + 2n 4n−2 − 1 = −1= ≥ − 1 = k+2n−1 pn,k+1 2n − k − 1 2n − k − 1 2n k+2n for 0 ≤ k ≤ 2n − 2. This yields by a reasoning similar to the above that T (σn , τn ) ≥
2n−2 k=0
hence lim inf n→∞
pn,k+1
2k + 1 , n
√ 2 n T (σn , τn ) ≥ √ . 2π
Remark 2. Let p ≥ 2 be an integer. By slight modifications of the proof of Proposition 1 we can construct sequences of measures (μn )n∈N and (νn )n∈N , such that the quadratic Wasserstein distances of k-fold convolutions are bounded from below by 1 for all k which are not multiples of p, while (p) lim T (μ(p) n , νn ) = 0.
n→∞
Remark 3. Assume the notations of [5]. In the previous considerations we can replace the quadratic cost function by any other lower semi-continuous cost function c : R2 → [0, +∞], which is bounded on parallels to the diagonal r and vanishes on the diagonal. For example, if we choose c(x, y) = |x − y| for 0 < r < ∞, then we obtain the same asymptotics as in Proposition 1 (with a different constant).
Non-monotone Convergence in the Quadratic Wasserstein Distance
135
Remark 4. We have used in the above proof that τn is obtained from σn by convolving with the measure ρ. In fact, this theme goes back (at least) as far as L. Bachelier’s famous thesis from 1900 on option pricing [2, p. 45]. Strictly speaking, L. Bachelier deals with the measure assigning mass 12 to −1, 1 and considers consecutive convolutions, instead of the above ρ. Hence convolutions with ρ correspond to Bachelier’s result after two time steps. Bachelier makes the crucial observation that this convolution leads to a radiation of probabilities: Each stock price x radiates during a time unit to its neighboring price a quantity of probability proportional to the difference of their probabilities. This was essentially the argument which allowed us to prove (1). Let us mention that Bachelier uses this argument to derive the fundamental relation between Brownian motion (which he was the first to define and analyse in his thesis) and the heat equation (compare e.g. [3] for more on this topic). Remark 5. Having established the above counterexample, it becomes clear how to modify Problem 7.20 from [5] to give it a chance to hold true. This possible modification was also pointed out to us by C. Villani. Problem 1. Let μ be a probability measure on Rd with finite second moment and vanishing barycenter, and γ the Gaussian measure with same first and second moments. Does (T (μ(n) , γ))n≥1 decrease monotonically to zero? When entropy is considered instead of the quadratic Wasserstein distance, the corresponding question on monotonicity was answered affirmatively in the recent paper [1]. One may also formulate a variant of Problem 7.20 as given in (1) by replacing the measure ν through a log-concave probability distribution. This would again generalize Problem 1. Acknowledgement. 
Financial support from the Austrian Science Fund under grant P 15889 and Y 328, from the Vienna Science and Technology Fund under grant MA 13, from the European Union under grant HPRN-CT-2002-00281 is gratefully acknowledged. Furthermore, this work was financially supported by the Christian Doppler Research Association (CDG) via PRisMa Lab (www.prismalab.at). The authors gratefully acknowledge a fruitful collaboration and continued support by Bank Austria and the Austrian Federal Financing Agency through CDG.
References 1. S. Artstein, K. M. Ball, F. Barthe and A. Naor, Solution of Shannon’s Problem on the Monotonicity of Entropy, Journal of the AMS 17(4), 2004, pp. 975–982. ´ 2. L. Bachelier, Th´eorie de la Sp´eculation, Annales scientifiques de l’Ecole Normale Sup´erieure S´erie 3, 17, 1900, pp. 21–86. Also available from the site http://www.numdam.org/
136
W. Schachermayer et al.
3. W. Schachermayer, Introduction to the Mathematics of Financial Markets, LNM 1816 - Lectures on Probability Theory and Statistics, Saint-Flour summer school 2000 (Pierre Bernard, editor), Springer-Verlag, Heidelberg, 2003, pp. 111–177. 4. H. Tanaka, An inequality for a functional of probability distributions and its applications to Kac’s one-dimensional model of a Maxwell gas, Zeitschrift f¨ ur Wahrscheinlichkeitstheorie und verwandte Gebiete 27, 1973, pp. 47–52. 5. C. Villani, Topics in Optimal Transportation, Graduate Studies in Mathematics 58, American Mathematical Society, Providence Rhode Island, 2003.
On the Equation μ = St μ ∗ μt Fangjun Xu∗ Department of Mathematics, University of Connecticut 196 Auditorium Road, Unit 3009, Storrs, CT 06269-3009, USA e-mail: [email protected]
Summary. We discuss solutions of equation μ = St μ∗μt and study their structure. The relationship with Ornstein-Uhlenbeck processes will also be considered.
Keywords: C0 -semigroup; Infinitely divisible; Mehler semigroup; OrnsteinUhlenbeck processes
1 Introduction Let E be a real separable Banach space and E ∗ its dual. We define P (E) and ID(E) to be the sets of all probability measures and infinitely divisible probability measures on E, respectively. Let OS(E) be the set of all probability measures on E which satisfy the following equation (1). For μ ∈ P (E), the Fourier transform of μ is ei<λ,x> dμ(x) for all λ ∈ E ∗ . μ (λ) = E
(λ) = 0}. Obviously, H(μ) is closed but may be Define H(μ) = {λ ∈ E ∗ : μ empty. Let (St , t ≥ 0) be a C0 -semigroup of linear operators acting on E with infinitesimal generator J. Using the notation St μ for the induced probability measure μ ◦ St−1 , we say that μ ∈ OS(E), if for each t ≥ 0, there exists μt ∈ P (E) such that (1) μ = St μ ∗ μt . As far as we know, there are several papers which studied solutions of the above equation. In these papers, under some given assumptions, solutions of equation (1) are called operator-selfdecomposable distributions and can be expressed as limit distributions. Moreover, integral expressions of operatorselfdecomposable solutions were found in some of these papers. For the case of ∗
Work carried out at Nankai University
C. Donati-Martin et al. (eds.), S´ eminaire de Probabilit´ es XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6 4, c Springer-Verlag Berlin Heidelberg 2009
137
138
F. Xu
infinite dimensional Banach space, see [BRS96], [Cho87], [JV83] and [Urb78]. For the case of finite dimensional Banach space, see [JM93] and [SY84]. A most recent article on operator-selfdecomposable distribution and its relationship with associated Ornstein-Uhlenbeck processes is [App07]. Throughout this article, if not mentioned otherwise, the topology we consider is the weak convergence topology. Proposition 1.1 If μ ∈ OS(E) and H(μ) = ∅, then μt+s = St μs ∗ μt
for all t, s ≥ 0.
(2)
Proof. We use similar arguments as in [JV83]. From equation (1), we have μ = St+s μ ∗ μt+s = St (Ss μ ∗ μs ) ∗ μt = St+s μ ∗ St μs ∗ μt . Thus, we obtain St+s μ ∗ μt+s = St+s μ ∗ St μs ∗ μt .
(3)
By the assumption, it can be easily concluded that H(St+s μ) = ∅. Taking Fourier transforms on both sides of equation (3), we can easily obtain μ t+s (λ) = S μt (λ) t μs (λ) for all λ ∈ E ∗ , i.e.,
μt+s = St μs ∗ μt . 2
Remark 1.1 In [JV83], lim e−t = 0 implies H(μ) = ∅. However, the assumpt→∞
tion H(μ) = ∅ here cannot be dropped. In fact, Theorem 5.1.1 in [Luk60] shows that the cancellation law cannot be applied in the convolution semigroup P (E). The semigroup (μt , t ≥ 0) satisfying equation (2) is called the Mehler semigroup (or (St )-skew convolution semigroup). For a recent study of the Mehler semigroup, we recommend [BRS96], [FR00], [SS01], [Jur04] and [Jur07]. In this paper we mainly consider solutions of the above equation (1), the structure of its solutions and its relationship with associated OrnsteinUhlenbeck processes.
2 The Structure of Solutions In this section, we first show that OS(E) is a closed sub semigroup of P (E). Then we will consider the existence of limits of St μ and μt as t tends to infinity and some related problems.
On the Equation μ = St μ ∗ μt
139
Proposition 2.1 OS(E) is a closed subsemigroup of P (E). Proof. For any μ and ν ∈ OS(E), it is obvious that μ ∗ ν ∈ OS(E). Suppose {μk , k ∈ N} ⊂ OS(E) and μk ⇒ μ (weak convergence as k → ∞). Then, for any k in N and all t ≥ 0, there exist μk,t in P (E) such that μk = St μk ∗ μk,t .
Since
f (x)dSt μk (x) = E
f (St x)dμk (x) E
for all f ∈ Cb (E), we have k→∞ f (x)dSt μk (x) −−−−→ f (St x)dμ(x) = f (x)dSt μ(x) E
E
E
for all f ∈ Cb (E). This means St μk ⇒ St μ as k → ∞. Theorem 2.1 in Chap.III of [Par67] shows that {μk,t , k ∈ N} is conditionally compact. Let μt be a cluster point of {μk,t , k ∈ N}. Then it is obvious that μ satisfies equation (1) and thus μ ∈ OS(E). Consequently, OS(E) is a closed subsemigroup of P (E). 2 Remark 2.1 δ0 is the identity in this closed subsemigroup. Thus, OS(E) is a closed monic subsemigroup of P (E). Proposition 2.2 If lim St x = 0 for each x ∈ E, then μ and μt are infinitely t→∞
divisible. Moreover, the n-th convolution root of μ also belongs to OS(E). Proof. Using the same arguments as in Lemma 4.4 of [Urb78], we know that μ is infinitely divisible. So H(μ) = ∅. Using Proposition 1.1, we have μt+s = St μs ∗ μt
for all t, s ≥ 0.
Then Proposition 1 in [SS01] implies that μt is infinitely divisible. Assume ∗n μ = μ∗n n and μt = μt,n . From equation (1), we have ∗n ∗n ∗n μ∗n n = St μn ∗ μt,n = (St μn ∗ μt,n ) .
This means
n ( μn (λ))n = (St μ n ∗ μt,n (λ))
for all λ ∈ E ∗ . Using the continuity of Fourier transform and the fact that H(μ) = ∅, we obtain μ n (λ) = St μ n ∗ μt,n (λ) for all λ ∈ E ∗ . This implies μn = St μn ∗ μt,n . Hence μn ∈ OS(E).
2
140
F. Xu
Remark 2.2 The above proposition is a generalization of Lemma 4.4 in [Urb78]. We say that μt is shift convergent as t tends to infinity, if there exists a family {xt , t ∈ R+ } ⊂ E such that μt ∗ δxt is convergent as t tends to infinity; we further say that ν is dominated by μ if ν is a factor of μ. For notational convenience, put OS0 (E) = {μ ∈ OS(E) : H(μ) = ∅}. Proposition 2.3 Suppose μ ∈ OS0 (E), then St μ and μt are shift convergent as t tends to infinity. Moreover, the shift limit of μt is infinitely divisible. Proof. Equation (1) shows that μt is dominated by μ; meanwhile, equation (2) shows that, for any 0 ≤ s1 < s2 , μs1 is dominated by μs2 . Hence, by Theorem 5.3 in Chap.III of [Par67], μt is shift convergent. Moreover, by Theorem 2.1 in Chap.III of [Par67], St μ is shift convergent as well. Proposition 1.1 and Proposition 1 in [SS01] show that μt (t ≥ 0) is infinitely 2 divisible. Thus, the shift limit of μt is also infinitely divisible. From Proposition 2.3, we can denote the shift convergent limits of St μ and μt by S∞ μ and μ∞ , respectively. From the definition of shift convergence and equation (1), we know that there exists a set {xt , t ∈ R+ } ⊂ E such that lim St μ ∗ δ−xt = S∞ μ
(4)
lim μt ∗ δxt = μ∞ .
(5)
t→∞
and t→∞
Lemma 2.1 For μ ∈ OS0 (E) and {xt , t ∈ R+ } mentioned above, we have lim xt+h − St xh exists for all t ≥ 0
h→∞
and then can put xt := lim xt+h − St xh . h→∞
Proof. For any fixed t ≥ 0, we have St S∞ μ = St lim Sh μ ∗ δ−xh h→∞
= lim St (Sh μ ∗ δ−xh ) h→∞
= lim St+h μ ∗ δ−St xh h→∞
= lim St+h μ ∗ δ−xt+h ∗ δxt+h −St xh . h→∞
(6)
By Proposition 2.3, (4) and Corollary 2.2.4 in [Hey04], we obtain the existence of lim xt+h − St xh . 2 h→∞
On the Equation μ = St μ ∗ μt
141
Theorem 2.1 For μ ∈ OS0 (E), we have (a) St S∞ μ = S∞ μ ∗ δxt for all t ≥ 0; (b) μ∞ satisfies μ∞ = St (μ∞ ) ∗ μt ∗ δxt
for all t ≥ 0;
(c) lim St (μ∞ ) ∗ δxt −xt = δ0 ; t→∞
(d) lim St (μs ∗ δxs ) ∗ δxt+s −St xs −xt = δ0 t→∞
for all s ≥ 0.
Proof. By (6), (a) holds. From equation (2), we have μt+s ∗ δxt+s = St (μs ∗ δxs ) ∗ (μt ∗ δxt ) ∗ δxt+s −St xs −xt
(7)
for all t, s ≥ 0. Letting s tend to infinity in both sides of equation (7), we have μ∞ = St (μ∞ ) ∗ (μt ∗ δxt ) ∗ δxt −xt .
(8)
This establishes (b). (c) follows from letting t tend to infinity in the right side of equation (8) and employing Corollary 2.2.4 in [Hey04] while (d) follows from letting t tend to infinity in both sides of equation (7) and using Corollary 2.2.4 in [Hey04]. 2 For each μ ∈ P (E), we define the adjoint μ− of μ and the symmetrization μ of μ by μ− (B) := μ(−B) for all B ∈ B(E) and μ := μ ∗ μ− , respectively. It is obvious that μ ∈ OS(E) − ∈ OS(E). Moreover, we refer to μ as symmetric if μ− = μ implies μ and μ and have the following proposition. Proposition 2.4 For μ ∈ OS0 (E), if μ is symmetric, then St μ and μt are convergent as t tends to infinity. Moreover, the limits of St μ and μt are also symmetric. Proof. By Proposition 2.3 and Theorem 2.2.20 in [Hey04], we obtain that μt is convergent as t tends to infinity. Then we employ Corollary 2.2.4 in [Hey04] to see that St μ is convergent as t tends to infinity. Moreover, lim St μ = μ (λ) t→∞ S t μ(λ) ∗
lim St μ− and lim μ t (λ) = lim
t→∞
t→∞
− (λ) μ t→∞ St μ− (λ)
= lim
t→∞
− = lim μ t (λ) = t→∞
lim μ t (λ) = lim μ t (λ) for all λ ∈ E . Therefore, the limits of St μ and μt are t→∞ t→∞ both symmetric. 2 From the above Proposition 2.4, the limits of St μ and μt can also be denoted by S∞ μ and μ∞ , respectively. Moreover, S∞ μ and μ∞ are symmetric. Theorem 2.2 For μ ∈ OS0 (E), if μ is symmetric, then we have (a) S∞ μ is the invariant measure of (St , t ≥ 0); (b) μ∞ satisfies μ∞ = St μ∞ ∗ μt for all t ≥ 0; (c) limt→∞ St (μ∞ ) = δ0 ; (d) limt→∞ St (μs ) = δ0 for all s ≥ 0.
142
F. Xu
Proof. Employing Proposition 2.4 and using similar arguments as in Theorem 2.1. 2 Theorem 2.3 For μ ∈ OS0 (E), μ can be expressed as the convolution of S∞ μ and μ∞ : μ = S∞ μ ∗ μ∞ , where S∞ μ = lim St μ ∗ δ−xt and μ∞ = lim μt ∗ δxt . Moreover, if μ is t→∞ t→∞ symmetric, then S∞ μ = lim St μ and μ∞ = lim μt . t→∞
t→∞
Proof. By Proposition 2.3 and Proposition 2.4.
2
Corollary 2.1 For μ ∈ OS0 (E), we have μ ∈ ID(E) if and only if S∞ μ ∈ ID(E). Proof. By Proposition 2.3 and Theorem 2.3.
2
In many cases, such as in the papers mentioned in the introduction, S∞ μ is degenerate and μ appears as the limit distribution for an infinite triangular array. Therefore, μ is infinitely divisible. Proposition 2.5 Let A be a linear operator on E. Suppose there exists μ ∈ P (E) such that μ = δ0 and μ satisfies μ = Aμ.
(9)
Then, we have A ≥ 1. Proof. From equation (9), we have μ = An μ. Moreover, if A < 1, then 2
A n → 0, which yields μ = δ0 . In the last part of this section, we only need to consider the nondegenerate symmetric μ of OS0 (E). Since δ0 is the trivial solution of equation (1) and we can consider the symmetrization of μ when μ is not symmetric. Case one: S∞ μ = δ0 , μ∞ = μ. By proposition 2.3, we see that μ is infinitely divisible. Example 2.1 1. (St , t ≥ 0) is stable, i.e., lim St x = 0 for each x ∈ E. t→∞
2. (St , t ≥ 0) is exponentially stable, i.e., lim St = 0. t→∞
Remark 2.3 In the above example, “stable” and “exponentially stable” imply H(μ) = ∅.
On the Equation μ = St μ ∗ μt
143
Case two: S∞ μ = μ, μ∞ = δ0 . From equation (2) and μ∞ = δ0 , it can be easily verified that μt = δ0 for all t ≥ 0. Therefore, in this case, we have μ = St μ for all t ≥ 0. Moreover, by Proposition 2.5, we have
St ≥ 1
for all t ≥ 0.
Example 2.2 (St = I, t ≥ 0). Case three: S∞ μ = δ0 , μ∞ = δ0 . By Theorem 2.2, we have S∞ μ = St S∞ μ for all t ≥ 0. Thus, Proposition 2.5 and the above equation imply
St ≥ 1
for all t ≥ 0.
3 Relationship with Ornstein–Uhlenbeck Processes

In this section, we assume that E is a Hilbert space and mainly consider the relationship between solutions of equation (1) and Ornstein–Uhlenbeck processes. Here we introduce the infinite-dimensional Langevin equation:

dY(t) = JY(t) dt + dX(t),   Y(0) = Y0 a.s.,   (10)

where X = (X(t), t ≥ 0) is an E-valued Lévy process (see [App07]). The Ornstein–Uhlenbeck process

Y(t) = St Y(0) + ∫₀ᵗ S_{t−s} dX(s)   (11)

is the unique weak solution to equation (10); see [Cho87]. Obviously, Y = (Y(t), t ≥ 0) is a Markov process. It induces a generalized Mehler semigroup (Tt, t ≥ 0) on Cb(E):

(Tt f)(x) = E(f(Y(t)) | Y0 = x) = ∫_E f(St x + y) μt(dy)   (12)

(cf. [App06]). The linear operators defined in (12) form a semigroup if and only if μt satisfies equation (2), i.e., μ_{t+s} = St μs ∗ μt for all t, s ≥ 0.
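The condition μ_{t+s} = St μs ∗ μt can be illustrated in the simplest one-dimensional Gaussian setting (an illustrative example, not taken from the paper): for dY(t) = −Y(t)dt + dW(t), one has St = e^{−t} and μt = N(0, (1 − e^{−2t})/2), so the condition reduces to an identity between variances, since St scales a variance by e^{−2t} and convolution adds variances.

```python
import math

def v(t):
    # Variance of mu_t for the 1-D Ornstein-Uhlenbeck example
    # dY = -Y dt + dW: mu_t = N(0, (1 - e^{-2t})/2).
    return (1.0 - math.exp(-2.0 * t)) / 2.0

# mu_{t+s} = S_t mu_s * mu_t  <=>  v(t+s) = e^{-2t} v(s) + v(t)
for t in (0.1, 0.5, 2.0):
    for s in (0.3, 1.7):
        assert abs(v(t + s) - (math.exp(-2.0 * t) * v(s) + v(t))) < 1e-12

# The stationary law N(0, 1/2) solves mu = S_t mu * mu_t for every t,
# i.e. 1/2 = e^{-2t} * 1/2 + v(t).
for t in (0.1, 0.5, 2.0):
    assert abs(math.exp(-2.0 * t) * 0.5 + v(t) - 0.5) < 1e-12
print("variance identities hold")
```

Here the invariance of N(0, 1/2) is the one-dimensional shadow of the invariant-measure condition discussed in the next section.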
The above Mehler semigroup is not, in general, continuous for the norm topology; see p. 111 of [DZ02]. We therefore introduce a mixed topology τm, and refer to [GK01] and [App07] for the definition of this topology. Theorem 4.1 in [App07] shows that (Tt, t ≥ 0) is strongly continuous on (Cb(E), τm). Thus, we can define the infinitesimal generator A, which is densely defined and closed with respect to τm.

Theorem 3.1 Suppose that H(μ) ≠ ∅. Then the following three conditions are equivalent:
(i) μ ∈ OS(E);
(ii) μ is an invariant measure for (Tt, t ≥ 0);
(iii) ∫_E Af(x) μ(dx) = 0 for all f ∈ D(A).

Proof. (ii) ⇒ (i): Since μ is an invariant measure, we have

∫_E (Tt f)(x) μ(dx) = ∫_E ∫_E f(St x + y) μt(dy) μ(dx) = ∫_E f(x) μ(dx)

for all f ∈ Cb(E). So μ = St μ ∗ μt.
(i) ⇒ (ii): Use Proposition 1.1 and arguments similar to those above to show that μ is an invariant measure.
(ii) ⇔ (iii): See [IW89], p. 292. □

Remark 3.1 The equivalence of (i) and (ii) is due to D. Applebaum (private communication).

Acknowledgement. This work resulted from communication with Professor David Applebaum. I am very grateful to him for reading early versions and giving many useful comments. I am also grateful to Professor Zbigniew J. Jurek for some good comments.
References

[App06] Applebaum, D.: Martingale-valued measures, Ornstein–Uhlenbeck processes with jumps and operator self-decomposability in Hilbert space. In Memoriam Paul-André Meyer, Séminaire de Probabilités XXXIX, ed. M. Émery and M. Yor, Lecture Notes in Math. 1874, 173–198. Springer-Verlag (2006)
[App07] Applebaum, D.: On the infinitesimal generators of Ornstein–Uhlenbeck processes with jumps in Hilbert space. Potential Analysis 26, 79–100 (2007)
[BRS96] Bogachev, V.I., Röckner, M., Schmuland, B.: Generalized Mehler semigroups and applications. Probab. Theory Relat. Fields 105, 193–225 (1996)
[Cho87] Chojnowska-Michalik, A.: On processes of Ornstein–Uhlenbeck type in Hilbert space. Stochastics 21, 251–286 (1987)
[DZ02] Da Prato, G., Zabczyk, J.: Second Order Partial Differential Equations in Hilbert Spaces. Cambridge University Press (2002)
[FR00] Fuhrman, M., Röckner, M.: Generalized Mehler semigroups: the non-Gaussian case. Potential Anal. 12, 1–47 (2000)
[GK01] Goldys, B., Kocan, M.: Diffusion semigroups in spaces of continuous functions with mixed topology. Journal of Differential Equations 173, 17–39 (2001)
[Hey04] Heyer, H.: Structural aspects in the theory of probability: a primer in probabilities on algebraic-topological structures. World Scientific (2004)
[IW89] Ikeda, N., Watanabe, S.: Stochastic Differential Equations and Diffusion Processes (Second Edition). North-Holland, Amsterdam (1989)
[Jur82] Jurek, Z.J.: An integral representation of operator-selfdecomposable random variables. Bull. Acad. Pol. Sci. 30, 385–393 (1982)
[JV83] Jurek, Z.J., Vervaat, W.: An integral representation for selfdecomposable Banach space valued random variables. Z. Wahrscheinlichkeitstheorie verw. Gebiete 62, 247–262 (1983)
[JM93] Jurek, Z.J., Mason, J.D.: Operator-Limit Distributions in Probability Theory. John Wiley and Sons, Inc. (1993)
[Jur04] Jurek, Z.J.: Measure valued cocycles from my papers in 1982 and 1983 and Mehler semigroups. www.math.uni.wroc.pl/~zjjurek (2004)
[Jur07] Jurek, Z.J.: Remarks on relations between Urbanik and Mehler semigroups. www.math.uni.wroc.pl/~zjjurek (2007)
[Luk60] Lukacs, E.: Characteristic functions. Griffin's Statistical Monographs and Courses 5. Charles Griffin, London (1960)
[Par67] Parthasarathy, K.R.: Probability measures on metric spaces. Academic Press (1967)
[SS01] Schmuland, B., Sun, W.: On the equation μ_{s+t} = μs ∗ Ts μt. Stat. Prob. Lett. 52, 183–188 (2001)
[SY84] Sato, K.-I., Yamazato, M.: Operator-selfdecomposable distributions as limit distributions of processes of Ornstein–Uhlenbeck type. Stochastic Processes and Their Applications 17, 73–100 (1984)
[Urb78] Urbanik, K.: Lévy's probability measures on Banach spaces. Studia Math. 63, 283–308 (1978)
Shabat Polynomials and Harmonic Measure

Philippe Biane

CNRS, Laboratoire d'Informatique, Institut Gaspard Monge, Université Paris-Est, 5 bd Descartes, Champs-sur-Marne, 77454 Marne-la-Vallée cedex 2. [email protected]
1 Introduction

This note is inspired by [BZ], which describes the "true shape" of a tree. Each planar tree (recall that a planar tree is a tree in which, for each vertex, the adjacent edges are cyclically ordered) has a distinguished embedding in the complex plane (up to similitude).

Theorem 1. For each finite planar tree Γ there exists a complex polynomial P, having at most 0 and 1 as critical values, such that the inverse image of [0, 1] by this polynomial is the union of the edges of a tree isomorphic to Γ, whose vertices are the inverse images of 0 and 1. This polynomial is unique up to a change of variable z → az + b or the substitution P → 1 − P.

Recall that the critical values of a polynomial P are the numbers P(w), where w ranges over the zeros of the derivative P′. The polynomial of the theorem is called the Shabat polynomial of the tree, and we shall call the corresponding embedding of the tree the Shabat embedding. The proof of the theorem uses Grothendieck's theory of dessins d'enfants. Some of these trees are depicted in [BZ] and in [LZ], page 89. In this short note I give a potential-theoretic characterization of the shape of these trees, which explains some features that can be observed in the pictures mentioned above, for example the respective sizes of the different branches of the trees, as well as their curvature. Along the way, I sketch a proof of this theorem, using some results on Löwner's equation. We start in Section 2 by recalling a well-known correspondence between planar trees and noncrossing partitions; then in Section 3 we study the conformal mapping of the exterior of a tree and give the sought-for interpretation of the Shabat polynomial. I would like to thank Alexander Zvonkin for his remarks and comments on the first version of this paper.
C. Donati-Martin et al. (eds.), S´ eminaire de Probabilit´ es XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6 5, c Springer-Verlag Berlin Heidelberg 2009
2 Planar Trees and Noncrossing Partitions

Let us consider, in the complex plane, the 2n-th roots of unity and the arcs of the unit circle joining them. Let π be a partition of these 2n arcs into n pairs which is noncrossing: if one draws the n segments joining the midpoints of the arcs belonging to the same pair of π, these segments do not cross. The quotient of the unit circle by the equivalence relation identifying the two arcs of each pair is a planar tree. Every planar tree can be obtained in this way, and the corresponding partition is unique up to a rotation of the circle. Here is an example, for n = 4, of a noncrossing partition and its associated planar tree.
Fig. 1. A noncrossing partition

Fig. 2. The planar tree
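The number of noncrossing pair partitions of the 2n arcs is the Catalan number, which is therefore also the number of planar trees with n edges. The short enumeration below (an illustration, not part of the paper) recovers this count for small n:

```python
from math import comb

def matchings(arcs):
    # Enumerate noncrossing perfect matchings of points 0..2n-1 in convex
    # position: the first point pairs with an odd-offset partner, which
    # splits the remaining points into two independent blocks.
    if not arcs:
        return [[]]
    rest = arcs[1:]
    result = []
    for i in range(1, len(arcs), 2):
        inside, outside = rest[:i - 1], rest[i:]
        for m1 in matchings(inside):
            for m2 in matchings(outside):
                result.append([(arcs[0], arcs[i])] + m1 + m2)
    return result

n = 4
count = len(matchings(list(range(2 * n))))
catalan = comb(2 * n, n) // (n + 1)
assert count == catalan  # 14 for n = 4
print(count, "noncrossing pair partitions for n =", n)
```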
Let us now assume that we identify each pair of arcs according to their natural length. This means that if we identify [2kπ/2n, (2k+1)π/2n] with [(2l−1)π/2n, 2lπ/2n] (for parity reasons, these are the only possible identifications), we have to match the points (2k+θ)π/2n and (2l−θ)π/2n, for θ ∈ [0, 1].
3 Conformal Mapping and Harmonic Measure

Proposition 1. Let π be a noncrossing pair partition of the unit circle. Then there exists a unique conformal mapping ψ from the outside of the unit disk to C, with a Laurent expansion at infinity of the form

ψ(z) = z + a0 + a1 z^{−1} + ⋯   (1)

which extends continuously to the boundary of the circle and such that the equivalence relation on the unit circle induced by this map is the noncrossing pair partition π. The image of the unit circle by this map is an embedding of the tree associated with π.

The equivalence relation on the circle in the proposition is the one for which two points are in the same class if they have the same image under the continuous extension.

Sketch of proof. The conformal mapping of the proposition can be constructed in the following way. Choose a leaf of the tree (a vertex with only one adjacent edge); it corresponds to some 2n-th root of unity whose adjacent arcs are in the same part of the partition π. We may assume, without loss of generality, that this root of unity is 1. Then the maps φθ, θ ∈ [0, π/n], given by

φθ(z) = ( z² + 1 + 2 sin²(θ/2) z + (z + 1) √(z² + 1 − 2z cos θ) ) / (2z),

glue the two arcs according to their natural length. These maps define a conformal mapping from the exterior of the disk to a domain which is the complement of the disk centered at 0 with radius cos²(θ/2) and of the segment [cos²(θ/2), (1 + sin(θ/2))²]. For θ = π/n the two arcs are completely glued.
Fig. 3. The domain
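Two sanity checks on the formula for φθ as reconstructed above: φ0 is the identity, and the leaf 1 is sent to the tip (1 + sin(θ/2))² of the slit. A quick numerical verification:

```python
import math, cmath

def phi(theta, z):
    # Candidate gluing map (reconstructed formula; principal square root,
    # which is the right branch for real z >= 1).
    s = cmath.sqrt(z * z + 1 - 2 * z * math.cos(theta))
    return (z * z + 1 + 2 * math.sin(theta / 2) ** 2 * z + (z + 1) * s) / (2 * z)

# phi_0 is the identity outside the unit disk
assert abs(phi(0.0, 2.0) - 2.0) < 1e-12

# The glued leaf z = 1 goes to the tip of the slit: (1 + sin(theta/2))^2
theta = 0.7
tip = (1 + math.sin(theta / 2)) ** 2
assert abs(phi(theta, 1.0) - tip) < 1e-12
print("checks passed")
```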
Let us map the partition π to the new circle C1 . This gives a noncrossing partition of n − 1 pairs, corresponding to the tree obtained by erasing the initial leaf from the original tree. These arcs are no longer identified through
their arc-length, since the transformation φπ/n does not preserve distances, but one can still find maps analogous to the φθ which glue two arcs coming from a leaf of the new tree. These maps can be constructed using Löwner's equation (see e.g. part 3 of [MR]), but the details of this construction are beyond the scope of this small note. One gets a conformal mapping from the exterior of C1 to the exterior of a smaller circle and an analytic arc coming out of this circle. Iterating the process, after n steps we obtain ψ. Since it is normalized at infinity by (1), this conformal mapping is unique up to a translation; furthermore, it maps the outside of the unit circle to the outside of a planar tree. We have thus associated, in a canonical way, to each planar tree a conformal mapping ψ and an embedding of the planar tree in the complex plane. This embedding can be characterized as in the following theorem. Before stating this result, I need to recall some facts about harmonic measure.

Let K be a non-polar compact set in the complex plane; there exists a unique probability measure μK on K which minimizes the logarithmic energy

E(μ) = ∫∫_{K×K} log(|z1 − z2|^{−1}) dμ(z1) dμ(z2)

(K is non-polar if there exists a probability measure μ such that E(μ) < +∞). The measure μK is called the harmonic measure of K. If the complement of K is connected, and if ϕ is a conformal map from the exterior of the disc to the exterior of K, this conformal mapping extends by continuity to the unit circle, and the image of the uniform probability measure on the circle by ϕ is μK; it can also be described as the law of the hitting position of K by a Brownian motion started from infinity. See e.g. [D] for more on these relations between classical potential theory and Brownian motion.

Consider a planar tree embedded in C, such that each edge is given by a C¹ arc. Each edge has two sides, and each of these sides carries a part of the harmonic measure, corresponding to the probability that Brownian motion hits the edge on this side. Using the conformal mapping given by Proposition 1, the vertices of the tree come from the 2n points on the circle, and each edge comes from 2 of the 2n arcs joining consecutive points. Each of these arcs corresponds to a side of the edge, and the harmonic measure on each side is the image of the uniform measure on the corresponding arc. Therefore we conclude:

Theorem 2. Given a planar tree with n edges, there exists an embedding of this tree in the complex plane such that
i) the harmonic measure of each edge is 1/n;
ii) the harmonic measures on the two sides of each edge coincide.
This embedding is unique up to a similarity transform.

The first property explains why, in the pictures of the trees, the vertices tend to accumulate around the leaves: this is a well-known electrostatic effect, which concentrates the harmonic measure towards extremities. The second property helps us understand why edges tend to curve so as to let Brownian motion approach both sides with the same probability.
It remains to check that the embedding is that of Shabat. Consider the map z → z^n, which wraps the unit circle n times over itself, sending the 2n-th roots of unity to −1 and 1, and the arcs between these roots to the two half-circles between −1 and 1. Then z → ¼(z + 1/z + 2) maps the exterior of the unit disc conformally onto the exterior of the segment [0, 1], and extends continuously to the circle, identifying points of the circle with the same abscissa. Let now η be the inverse map of ψ, defined on the exterior of the tree. The preceding considerations imply that P(z) = ¼(η(z)^n + η(z)^{−n} + 2) maps the exterior of the tree conformally onto the exterior of [0, 1]. Furthermore, from the construction of the tree, one can extend P continuously to the whole of C. Since P is analytic outside the tree and continuous everywhere, Morera's theorem (cf. [R]) implies that P is entire. Since P(z) = z^n + O(z^{n−1}) at infinity, it is a polynomial of degree n, and the embedded tree is P^{−1}([0, 1]). It is easy to check that the only critical values of this polynomial are 0 and 1.
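The segment map z → ¼(z + 1/z + 2) used above can be checked numerically: it identifies the two points of the unit circle with the same abscissa and sends them into [0, 1], at cos²(φ/2) (an illustrative check, not part of the paper):

```python
import cmath, math

def J(z):
    # The map z -> (z + 1/z + 2)/4 used above.
    return (z + 1 / z + 2) / 4

phi = 1.1
w1, w2 = cmath.exp(1j * phi), cmath.exp(-1j * phi)
# The two points of the circle with the same abscissa are identified...
assert abs(J(w1) - J(w2)) < 1e-12
# ...and land in the segment [0, 1], at cos^2(phi/2):
x = J(w1)
assert abs(x.imag) < 1e-12
assert abs(x.real - math.cos(phi / 2) ** 2) < 1e-12
print("ok")
```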
References

[BZ] Bétréma, J., Zvonkin, A.: La vraie forme d'un arbre. TAPSOFT '93: theory and practice of software development (Orsay, 1993), 599–612. Lecture Notes in Comput. Sci. 668, Springer, Berlin (1993)
[D] Doob, J.L.: Classical potential theory and its probabilistic counterpart. Grundlehren der Mathematischen Wissenschaften 262. Springer-Verlag, New York (1984)
[LZ] Lando, S., Zvonkin, A.: Graphs on surfaces and their applications. Encyclopaedia of Mathematical Sciences, Low dimensional topology II. Springer-Verlag, Berlin, Heidelberg (2004)
[MR] Marshall, D.E., Rohde, S.: The Löwner differential equation and slit mappings. J. Amer. Math. Soc. 18, no. 4, 763–778 (2005)
[R] Rudin, W.: Real and complex analysis. Third edition. McGraw-Hill, New York (1987)
Radial Dunkl Processes Associated with Dihedral Systems

Nizar Demni

Fakultät für Mathematik, Universität Bielefeld, Postfach 100131, Bielefeld, Germany. [email protected]

Summary. We are interested in radial Dunkl processes associated with dihedral systems. We write down the semi-group density and, as a by-product, the generalized Bessel function and the W-invariant generalized Hermite polynomials. Then a skew-product decomposition, involving only independent Bessel processes, is given, and the tail distribution of the first hitting time of the boundary of the Weyl chamber is computed.
1 A Quick Reminder

We refer the reader to [11] and [16] for facts on root systems and to [5], [20] for facts on radial Dunkl processes. Let R be a reduced root system in a finite Euclidean space (V, ⟨·, ·⟩) with positive system R+ and simple system S. Let W be its reflection group and C be its positive Weyl chamber. The radial Dunkl process X associated with R is a pathwise continuous Markov process valued in C̄ whose generator acts on C²(C)-functions as

Lk u(x) = (1/2) Δu(x) + Σ_{α∈R+} k(α) ⟨∇u(x), α⟩/⟨x, α⟩,

with ⟨∇u(x), α⟩ = 0 whenever ⟨x, α⟩ = 0, where Δ, ∇ respectively denote the Euclidean Laplacian and gradient, and k is a positive multiplicity function, that is, an ℝ₊-valued W-invariant function. The semi-group density of X with respect to the Lebesgue measure on V is given by

p^k_t(x, y) = (1/(ck t^{γ+m/2})) e^{−(|x|²+|y|²)/2t} D^W_k(x/√t, y/√t) ωk²(y),   x, y ∈ C,   (1)

where γ = Σ_{α∈R+} k(α) and m = dim V is the rank of R. The weight function ωk is given by

ωk(y) = Π_{α∈R+} ⟨α, y⟩^{k(α)}
C. Donati-Martin et al. (eds.), S´ eminaire de Probabilit´ es XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6 6, c Springer-Verlag Berlin Heidelberg 2009
and D^W_k is the generalized Bessel function. Since log ωk(x) = Σ_{α∈R+} k(α) log⟨α, x⟩, we have ∇ log ωk(x) = Σ_{α∈R+} k(α) α/⟨x, α⟩, so that Lk may be written as

Lk u(x) = (1/2) Δu(x) + ⟨∇u(x), ∇ log ωk(x)⟩.   (2)
2 Motivation

Several reasons motivated us to investigate radial Dunkl processes associated with dihedral root systems. First, the dihedral group is a Coxeter reflection group, yet in general not a Weyl group, and it covers an exceptional Weyl group known in the literature as G2, which is of particular interest ([1]). Second, the study of the Dunkl operators associated with dihedral root systems revealed a close relation to Gegenbauer and Jacobi polynomials, which have interesting geometrical interpretations as harmonics and as eigenfunctions of the radial part of the Laplacian on the sphere ([3], [11], [14]). The latter operator is a particular case of the Jacobi operator, which generates a diffusion known as the Jacobi process; this process may be represented, up to a random time change, by means of two independent Bessel processes ([21]). Since the norm of the radial Dunkl process is a Bessel process, we wanted to gather all these materials in the present work and to see how they interact. The last reason is that [7] and [8] emphasize the irreducible root systems of types A, B, C, D, which, together with the dihedral root systems, exhaust the infinite families of irreducible root systems associated with finite Coxeter groups.

The remaining part of the paper consists of five sections. In order to be self-contained, some needed facts on dihedral systems are collected in the next section. Then we write down the semi-group density, via a detailed analysis of the so-called spherical motion (see below). As a by-product, we deduce the generalized Bessel function. Once this is done, we express the W-invariant counterparts of the generalized Hermite polynomials ([20]) as products of univariate Laguerre and Jacobi polynomials. Next, we give a skew-product decomposition of the radial Dunkl process using only independent Bessel processes; this mainly follows from the skew-product decomposition of the Jacobi process derived in [21].
Finally, we compute the tail distribution of the first hitting time of ∂C.
3 Dihedral Groups and Dihedral Systems

The dihedral group, denoted by D2(n) for n ≥ 3, is defined as the group of orthogonal transformations that preserve a regular n-sided polygon in V = ℝ² centered at the origin. Without loss of generality, one may assume that the y-axis is a mirror for the polygon. The group contains n rotations through multiples of 2π/n and n reflections about the diagonals of the polygon. By a diagonal, we mean a line joining two opposite vertices or two midpoints of opposite sides if n is even, or a vertex to the midpoint of the opposite side if n is odd. The corresponding dihedral root system, I2(n), is characterized by its positive and simple systems, given by

R+ = {−i e^{iπl/n} := −i e^{iθ_l}, 1 ≤ l ≤ n},   S = {e^{iπ/n} e^{−iπ/2}, e^{iπ/2}},

so that the Weyl chamber is a wedge of angle π/n. The reader can check that, for instance, I2(3) (equilateral-triangle-preserving) is isomorphic to R = A2 and that I2(4) (square-preserving) is nothing but R = B2 (see [16] for the definitions of both root systems). However, it is a bit more delicate to see that I2(6) (hexagon-preserving) corresponds to the exceptional Weyl group G2 ([1]). When n = 2p for p ≥ 2, there are two orbits, so that k = (k0, k1) ∈ ℝ²₊; otherwise, there is only one orbit and k takes only one positive value.
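As a quick numerical illustration (not from the paper), one can verify that the set R = R+ ∪ (−R+) generated by the positive system above is stable under all the corresponding root reflections, as a root system must be:

```python
import cmath, math

n = 5  # works for any n >= 3
R_plus = [-1j * cmath.exp(1j * math.pi * l / n) for l in range(1, n + 1)]
R = R_plus + [-a for a in R_plus]

def reflect(v, a):
    # Reflection in the mirror orthogonal to the unit root a,
    # written with complex numbers: v - 2 <v, a> a.
    return v - 2 * (v.real * a.real + v.imag * a.imag) * a

def in_R(v):
    return any(abs(v - r) < 1e-9 for r in R)

# Each root reflection permutes the root system I2(n).
assert all(in_R(reflect(v, a)) for a in R_plus for v in R)
print("I2(%d): %d roots, closed under reflections" % (n, len(R)))
```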
4 Semi-Group Density

4.1 Spherical Motion

Recall the skew-product decomposition of the radial Dunkl process into a radial part and a spherical part ([5] p. 53):

(Xt)_{t≥0} = (|Xt| Θ_{At})_{t≥0},   At = ∫₀ᵗ ds/|Xs|²,

where |X| is a Bessel process of index γ ([18]) and Θ is the spherical motion of X, valued in the intersection of the unit sphere of the Euclidean space V with the closure C̄ of the positive Weyl chamber, and independent of |X|. For dihedral systems, Θ is valued in the unit circle and may be written as (cos θ, sin θ) for some process θ valued in the interval [0, π/n].

4.2 Semi-Group Density of θ

Let us first split the generator Lk of X into a radial and a spherical part. To proceed, we use (2) together with the expression of ωk in polar coordinates (up to a constant factor, [11] p. 205):

ωk(r, θ) = r^{nk} (sin nθ)^k   if n is odd,
ωk(r, θ) = r^{p(k0+k1)} [sin(pθ)]^{k0} [cos(pθ)]^{k1}   if n = 2p.

Thus

Lk = (1/2)( ∂r² + ((2γ + 1)/r) ∂r ) + (1/r²)( (1/2)∂θ² + nk cot(nθ) ∂θ )

when n is odd, where γ = nk, and

Lk = (1/2)( ∂r² + ((2γ + 1)/r) ∂r ) + (1/r²)( (1/2)∂θ² + p(k0 cot(pθ) − k1 tan(pθ)) ∂θ )

when n = 2p, where γ = p(k0 + k1). It follows that the generator of θ, say L^θ_k, acts on smooth functions as

L^θ_k = (1/2)∂θ² + nk cot(nθ) ∂θ   when n is odd,
L^θ_k = (1/2)∂θ² + p(k0 cot(pθ) − k1 tan(pθ)) ∂θ   when n = 2p.

Now it is easy to see that the process N defined by Nt := nθ_{t/n²} satisfies

dNt = dBt + k cot(Nt) dt   when n is odd,

while (Mt := pθ_{t/p²})_{t≥0} satisfies

dMt = dBt + [k0 cot(Mt) − k1 tan(Mt)] dt   when n = 2p,

B being a real Brownian motion. Let us first investigate the case of even n = 2p. The generator of M has a discrete spectrum, given by λj = −2j(j + k0 + k1), j ≥ 0, corresponding to the Jacobi polynomials P_j^{k1−1/2, k0−1/2}(cos(2θ)) (see [11], p. 201). It is known that this set of orthogonal eigenpolynomials is complete in the Hilbert space L²([0, π/2], μk(θ)dθ), where μk(θ) := ck sin(θ)^{2k0} cos(θ)^{2k1} for some constant ck. Accordingly, M has a semi-group density, say m^k_t(φ, θ), given by (we use orthonormal polynomials, [19] p. 29)

m^k_t(φ, θ) = Σ_{j≥0} e^{λj t} P_j^{k1−1/2, k0−1/2}(cos(2φ)) P_j^{k1−1/2, k0−1/2}(cos(2θ)) μk(θ)   (3)

for φ, θ ∈ [0, π/2]. It follows that the semi-group density of θ, say K^{k,p}_t, is given by

K^{k,p}_t(φ, θ) = p m^k_{p²t}(pφ, pθ),   φ, θ ∈ [0, π/(2p)].

A similar spectral description holds for odd n: the generator of N has a discrete spectrum given by λj = −2j(j + k), j ≥ 0, corresponding to P_j^{−1/2, k−1/2}(cos(2θ)).

4.3 Semi-Group Density of X

Let (r, θ) → f(r, θ) be a nice function and let P_{ρ,φ} denote the law of X starting at x = (ρ, φ) ∈ C. Then, using the independence of θ and |X| together with Fubini's theorem, one has

E_{ρ,φ}[f(|Xt|, θ_{At})] = E_{ρ,φ}[ E_{ρ,φ}[f(|Xt|, θ_{At}) | σ(|Xs|, s ≤ t)] ]
= Σ_{j≥0} ∫₀^{π/(2p)} E^γ_ρ[f(|Xt|, θ) e^{λj p² At}] P_j^{l1,l0}(cos(2pφ)) P_j^{l1,l0}(cos(2pθ)) μk(pθ) dθ,
where P^γ_ρ denotes the law of the Bessel process |X| starting at ρ and of index γ (and E^γ_ρ the corresponding expectation). Next, for every θ ∈ [0, π/(2p)],

E^γ_ρ[f(|Xt|, θ) e^{λj p² At}] = E^γ_ρ[ E^γ_ρ[f(|Xt|, θ) e^{λj p² At} | |Xt|] ] = ∫₀^∞ E^γ_ρ[e^{λj p² At} | |Xt| = r] f(r, θ) qt(ρ, r) dr,

where qt(ρ, r) is the semi-group density of the Bessel process |X| of index γ ([18]):

qt(ρ, r) = (1/t) (r/ρ)^γ r e^{−(ρ²+r²)/2t} Iγ(ρr/t),

where Iγ is the modified Bessel function of index γ ([17] p. 108). Moreover (see [22] p. 80),

E^γ_ρ[e^{λj p² At} | |Xt| = r] = I_{√(γ²−2λj p²)}(ρr/t) / Iγ(ρr/t),   λj = −2j(j + k0 + k1).
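In the last conditional expectation, the Bessel index simplifies: with λj = −2j(j + k0 + k1) and γ = p(k0 + k1), one has √(γ² − 2λj p²) = p(2j + k0 + k1) = 2jp + γ, which is where the orders I_{2jp+γ} in the next proposition come from. A quick numerical check (parameter values arbitrary):

```python
import math

p, k0, k1 = 3, 0.7, 1.2   # arbitrary admissible parameters
gamma = p * (k0 + k1)
for j in range(6):
    lam = -2 * j * (j + k0 + k1)
    assert abs(math.sqrt(gamma**2 - 2 * lam * p**2) - (2 * j * p + gamma)) < 1e-9
print("index identity verified")
```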
Thus, we have proved:

Proposition 1 The semi-group density of the radial Dunkl process associated with the even dihedral groups D2(2p) is given by

p^k_t(ρ, φ, r, θ) = (1/(ck t)) (r/ρ)^γ e^{−(ρ²+r²)/2t} sin^{2k0}(pθ) cos^{2k1}(pθ) Σ_{j≥0} I_{2jp+γ}(ρr/t) P_j^{l1,l0}(cos(2pφ)) P_j^{l1,l0}(cos(2pθ))

with respect to dr dθ, where ck is a normalizing constant, l0 = k0 − 1/2, l1 = k1 − 1/2, ρ, r ≥ 0 and 0 ≤ φ, θ ≤ π/(2p). For odd dihedral systems, one substitutes k1 = 0, k0 = k, p = n in the above formula.

Remarks 1
1/ The j-th Jacobi polynomial P_j^{k1−1/2, k0−1/2}(cos(2pθ)) can be replaced by the generalized Gegenbauer polynomial C_{2j}^{k1,k0}(cos(pθ)) (see [11], p. 27). For k1 = 0, C_{2j}^{k1,k0}(cos(pθ)) reduces to the Gegenbauer polynomial C_{2j}^{k0}(cos(pθ)).
2/ The heat kernel corresponding to a planar Brownian motion starting at x ∈ C and reflected on ∂C corresponds to k ≡ 0. Using the above formula, one deduces

p^0_t(ρ, φ, r, θ) = (1/(c0 t)) e^{−(ρ²+r²)/2t} Σ_{j≥0} I_{2jp}(ρr/t) Tj(cos(2pφ)) Tj(cos(2pθ)),

where Tj is the orthonormal j-th Tchebycheff polynomial, defined by Tj(cos θ) = cos(jθ), j ≥ 0.
Thus

p^0_t(ρ, φ, r, θ) = (1/(c0 t)) e^{−(ρ²+r²)/2t} Σ_{j≥0} I_{2jp}(ρr/t) cos(2jpφ) cos(2jpθ).
For k ≡ 1, one recovers the Brownian motion conditioned to stay in a wedge of angle π/n, which is the h = ω1-transform in Doob's sense of a planar Brownian motion killed when it first hits ∂C ([15]). More precisely, one has for n = 2p

p^1_t(ρ, φ, r, θ) = (ω1²(r, θ)/(c1 t)) (1/(rρ))^{2p} e^{−(ρ²+r²)/2t} Σ_{j≥0} I_{2(j+1)p}(ρr/t) Uj(cos(2pφ)) Uj(cos(2pθ)),

where Uj is the j-th Tchebycheff polynomial of the second kind, defined by

Uj(cos θ) = sin((j + 1)θ)/sin θ,   j ≥ 0,

and dj is a normalizing constant such that dj Uj has unit norm. Keeping in mind that ω1(r, θ) = c r^{2p} sin(2pθ) for some constant c, elementary computations yield

p^1_t(ρ, φ, r, θ) = (ω1(r, θ)/ω1(ρ, φ)) (e^{−(r²+ρ²)/2t}/(c1 t)) Σ_{j≥0} I_{2(j+1)p}(ρr/t) sin[2p(j + 1)φ] sin[2p(j + 1)θ],

which agrees with the ω1-transform property. Besides, one deduces that the semi-group density of a planar Brownian motion killed when it first hits ∂C is given by

p^C_t(ρ, φ, r, θ) = (e^{−(r²+ρ²)/2t}/t) Σ_{j≥0} I_{2(j+1)p}(ρr/t) sin[2p(j + 1)φ] sin[2p(j + 1)θ].
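The second-kind Tchebycheff identity Uj(cos θ) = sin((j + 1)θ)/sin θ used above is easy to verify numerically (an illustrative check, not part of the paper), generating Uj by its standard three-term recurrence:

```python
import math

def U(j, x):
    # Chebyshev polynomials of the second kind via the recurrence
    # U_0 = 1, U_1 = 2x, U_{j+1}(x) = 2x U_j(x) - U_{j-1}(x).
    u0, u1 = 1.0, 2.0 * x
    if j == 0:
        return u0
    for _ in range(j - 1):
        u0, u1 = u1, 2.0 * x * u1 - u0
    return u1

theta = 0.83
for j in range(8):
    assert abs(U(j, math.cos(theta)) - math.sin((j + 1) * theta) / math.sin(theta)) < 1e-10
print("U_j(cos t) = sin((j+1)t)/sin t verified")
```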
The expression of p^C_t should be compared with Lemma 1 in [4]. Writing p^k_t as

p^k_t(ρ, φ, r, θ) = (ωk²(r, θ)/(ck t^{γ+1})) (t/(rρ))^γ e^{−(ρ²+r²)/2t} Σ_{j≥0} I_{(2j+k0+k1)p}(ρr/t) P_j^{l1,l0}(cos(2pφ)) P_j^{l1,l0}(cos(2pθ)),

and considering (1), we are led to the following by-product.
Corollary 1 (Generalized Bessel function) For even dihedral groups, the generalized Bessel function is given by

D^W_k(ρ, φ, r, θ) = c_{p,k} (2/(rρ))^γ Σ_{j≥0} I_{2jp+γ}(ρr) P_j^{l1,l0}(cos(2pφ)) P_j^{l1,l0}(cos(2pθ)),

where γ = p(k0 + k1). For odd dihedral groups, one has

D^W_k(ρ, φ, r, θ) = c_{n,k} (2/(rρ))^γ Σ_{j≥0} I_{2jn+γ}(ρr) P_j^{−1/2,l}(cos(2nφ)) P_j^{−1/2,l}(cos(2nθ)),

where γ = nk and l = k − 1/2. The constants c_{p,k} and c_{n,k} are such that D^W_k(0, 0, r, θ) = |W|.
5 W-invariant Generalized Hermite Polynomials

In this section, we shall express the W-invariant counterparts of the so-called generalized Hermite polynomials by means of univariate Laguerre and Jacobi polynomials. This is done in three steps.

5.1 Generalized Hermite Polynomials

Recall from [20] that the generalized Hermite polynomials (Hτ)_{τ∈ℕ^m} are defined by Hτ(x) = [e^{−Δk/2} φτ](x), where Δk denotes the Dunkl Laplacian ([20]) and (φτ)_{τ∈ℕ^m} is a basis of homogeneous polynomials, orthogonal with respect to the pairing inner product defined in [13] (see also [20]):

[p, q]_k = ∫_V e^{−Δk/2} p(x) e^{−Δk/2} q(x) ωk²(x) dx

for two polynomials p, q (up to a constant factor). The family (Hτ)τ is then said to be associated with the basis (φτ)τ. Their W-invariant counterparts are defined by

H^W_τ(x) := Σ_{w∈W} Hτ(wx).
5.2 Mehler-type Formula

It is known that (Hτ)τ satisfies a Mehler-type formula ([11] p. 246; we use a normalization different from that of [11]):

Σ_{τ∈ℕ^m} Hτ(x) Hτ(y) r^{|τ|} = (1/(1 − r²)^{γ+m/2}) exp( −r²(|x|² + |y|²)/(2(1 − r²)) ) Dk( x, ry/(1 − r²) )

for 0 < r < 1 and x, y ∈ V. An analogous formula is satisfied by (H^W_τ)τ; it follows after summing twice over W and using Dk(wx, w′y) = Dk(x, w^{−1}w′y) ([20]):

Σ_{τ∈ℕ^m} H^W_τ(x) H^W_τ(y) r^{|τ|} = (|W|/(1 − r²)^{γ+m/2}) exp( −r²(|x|² + |y|²)/(2(1 − r²)) ) D^W_k( x, ry/(1 − r²) ).
5.3 Dihedral Systems

Let us express D^W_k through the hypergeometric function 0F1. This is done via the relation ([17])

Iν(z) = ((z/2)^ν / Γ(ν + 1)) 0F1(ν + 1, z²/4).

It follows that

D^W_k(ρ, φ, r, θ) = c_{p,k} Σ_{j≥0} ((ρr/2)^{2jp} / Γ(2jp + γ + 1)) 0F1( 2jp + γ + 1, ρ²r²/4 ) P_j^{l1,l0}(cos(2pφ)) P_j^{l1,l0}(cos(2pθ)).

Using the Mehler-type formula for univariate Laguerre polynomials ([2] p. 200)

Σ_{q≥0} (q!/(2jp + γ + 1)_q) L_q^{2jp+γ}(ρ²/2) L_q^{2jp+γ}(r²/2) z^{2q} = (1 − z²)^{−2jp−γ−1} e^{−z²(ρ²+r²)/[2(1−z²)]} 0F1( 2jp + γ + 1, z²ρ²r²/[4(1 − z²)²] ),

valid for |z| < 1, one gets

(1 − z²)^{−γ−1} e^{−z²(ρ²+r²)/[2(1−z²)]} D^W_k( ρ, φ, zr/(1 − z²), θ ) = c_{p,k} Σ_{j,q≥0} (q!/Γ(2jp + q + γ + 1)) (ρr/2)^{2jp} N^{k,W}_{j,p}(ρ, φ) N^{k,W}_{j,p}(r, θ) z^{2(q+jp)},

where

N^{k,W}_{j,p}(ρ, φ) := L_q^{2jp+γ}(ρ²/2) P_j^{l1,l0}(cos(2pφ)).
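The univariate Laguerre Mehler-type formula quoted above can be verified numerically. The sketch below (standard library only; the Laguerre polynomials are generated by their three-term recurrence and 0F1 by its power series; an illustration, not the paper's computation) checks the identity at arbitrary parameter values:

```python
import math

def laguerre(q, a, x):
    # Generalized Laguerre polynomial L_q^a(x) via the recurrence
    # (m+1) L_{m+1} = (2m+1+a-x) L_m - (m+a) L_{m-1}.
    l0, l1 = 1.0, 1.0 + a - x
    if q == 0:
        return l0
    for m in range(1, q):
        l0, l1 = l1, ((2 * m + 1 + a - x) * l1 - (m + a) * l0) / (m + 1)
    return l1

def hyp0f1(b, w, terms=60):
    # 0F1(b; w) = sum_m w^m / ((b)_m m!)
    s, t = 0.0, 1.0
    for m in range(terms):
        s += t
        t *= w / ((b + m) * (m + 1))
    return s

a, rho, r, z = 2.4, 0.9, 1.3, 0.35   # arbitrary test values, |z| < 1
x, y = rho**2 / 2, r**2 / 2

# Left side: q!/(a+1)_q = q! Gamma(a+1)/Gamma(a+1+q).
lhs = sum(math.factorial(q) * math.gamma(a + 1) / math.gamma(a + 1 + q)
          * laguerre(q, a, x) * laguerre(q, a, y) * z**(2 * q)
          for q in range(80))
u = z**2
rhs = ((1 - u)**(-a - 1) * math.exp(-u * (x + y) / (1 - u))
       * hyp0f1(a + 1, u * x * y / (1 - u)**2))
assert abs(lhs - rhs) < 1e-8 * abs(rhs)
print("Mehler-type Laguerre identity holds numerically")
```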
This suggests that the W-invariant generalized Hermite polynomials are given by

H^W_{τ1,τ2}(ρ, φ) = (q!/Γ(2jp + q + γ + 1)) (ρ²/2)^{jp} N^{k,W}_{j,p}(ρ, φ)

for τ1 = 2q (q ≥ 0), τ2 = 2jp (j ≥ 0), and zero otherwise. An elegant proof of this claim was given to us (private communication) by Professor C. F. Dunkl and is as follows: the j-th W-invariant harmonic is given by (see Proposition 3.15 in [12])

h^W_j(ρ, φ) = ρ^{2jp} P_j^{l1,l0}(cos(2pφ)),

so that, by Proposition 3.9 in [13],

e^{−Δk/2}[ρ^{2q} h^W_j(ρ, φ)] = e^{−Δ^W_k/2}[ρ^{2q} h^W_j(ρ, φ)] = (−2)^j j! L_q^{2jp+γ}(ρ²/2) P_j^{l1,l0}(cos(2pφ)).

Remark 1 A similar result holds for odd dihedral systems, with k1 = 0, k0 = k, p = n.
6 Skew-product Decomposition

In this section, we derive a skew-product decomposition for X using only independent Bessel processes. This is done by relating the process θ to a Jacobi process; some results on Jacobi processes are therefore collected below.

6.1 Facts on Jacobi Processes

The Jacobi process J of parameters d, d′ ≥ 0 is a [0, 1]-valued process which solves, whenever a solution exists ([21]),

dJt = 2√(Jt(1 − Jt)) dBt + (d − (d + d′)Jt) dt,   (4)

where B is a real Brownian motion. As for squared Bessel processes, (4) has a unique strong solution for all t ≥ 0 and all J0 ∈ [0, 1], since the diffusion coefficient is 1/2-Hölder and the drift term is Lipschitz ([18] p. 360). When d ∧ d′ ≥ 2 and J0 ∈ ]0, 1[, then J remains in ]0, 1[, while when d ∧ d′ > 0, J is valued in [0, 1] ([10] p. 135). Besides, J has the skew-product decomposition below ([21]):
( Z1²(t)/(Z1²(t) + Z2²(t)) )_{t≥0} = (J_{Ft})_{t≥0},   Ft = ∫₀ᵗ ds/(Z1²(s) + Z2²(s)),   (5)

where Z1, Z2 are two independent Bessel processes of dimensions d, d′ respectively, such that d + d′ ≥ 2. Moreover, J is independent of Z1² + Z2² (and thereby also of F).

6.2 Relating θ and J

Assume d ∧ d′ ≥ 1, and define (Ht := −cos 2Mt)_{t≥0}, where (Mt = pθ_{t/p²})_{t≥0}. Then an application of Itô's formula and of pathwise uniqueness for the above SDE shows that (Ht)_{t≥0} = (Y_{2t})_{t≥0}, where

dYt = √2 √(1 − Yt²) dBt − [(k1 − k0) + (k0 + k1 + 1)Yt] dt.

In fact, it is easy to see that (1 − Y_{2t})/2 = (1 − Ht)/2 = cos²(Mt) is a Jacobi process of parameters d = 2k1 + 1, d′ = 2k0 + 1. As a result, one gets

(θ_{At})_{t≥0} = ( (1/p) arccos √( J_{p²At} ) )_{t≥0},

where J is independent of X, and thereby of A.

6.3 Skew-product Decomposition

On the one hand, it is a well-known fact ([18]) that the sum of two independent squared Bessel processes of dimensions d = 2k1 + 1 and d′ = 2k0 + 1 is again a squared Bessel process of dimension d + d′; thus Z := Z1² + Z2² is a squared Bessel process of index k0 + k1. On the other hand, for conjugate exponents r, q and any Bessel process Rν of index ν > −1/q, there exists a Bessel process Rνq of index νq, defined on the same probability space, such that the following holds ([18]):

q² Rν^{2/q} = R²_{νq}( ∫₀^· Rν^{−2/r}(s) ds ).   (6)
Specializing (6) with ν = k0 + k1, q = p and Rν = √Z, there exists a Bessel process Rνq such that

Z_t^{1/p} = (1/p²) R²_{νq}(τt),   τt := ∫₀ᵗ ds/Z_s^{(p−1)/p},   r = p/(p − 1).   (7)
Let J be the Jacobi process defined in (5) with d = 2k1 + 1, d′ = 2k0 + 1, and define a radial Dunkl process X by

X := ( Rνq, (1/p) arccos √( J_{∫₀^· p² ds/R²νq(s)} ) ) = (|X|, θ_{A·}).

Let Lt := inf{s, τs > t} be the inverse of τ, so that Z_{Lt}^{1/p} = (1/p²)|Xt|². Then

p² A· = p² ∫₀^· ds/|Xs|² = ∫₀^{L·} dτu/Zu^{1/p} = ∫₀^{L·} du/( Zu^{(p−1)/p} Zu^{1/p} ) = ∫₀^{L·} du/Zu = F_{L·}.   (8)

As a result, when k0, k1 ≥ 0 and X0 = x ∈ C, one has

(θ_{At})_{t≥0} = ( (1/p) arccos √( J_{F_{Lt}} ) )_{t≥0} = ( (1/p) arccos √( (Z1²/(Z1² + Z2²))(Lt) ) )_{t≥0}.

Finally:

Proposition 2 Let k0, k1 ≥ 0 and define the time change τ by

τt := ∫₀ᵗ ds/[Z1²(s) + Z2²(s)]^{(p−1)/p},

where Z1, Z2 are two independent Bessel processes of dimensions d = 2k1 + 1 and d′ = 2k0 + 1, respectively. Then there exists a radial Dunkl process X associated with the even dihedral group D2(2p) such that Xτ is realized as the two-dimensional process

( p (Z1² + Z2²)^{1/2p}, (1/p) arccos √( Z1²/(Z1² + Z2²) ) ).

A similar representation holds when n is odd.
7 On the First Hitting Time of a Wedge

Let X0 = x ∈ C and let T0 := inf{t, Xt ∈ ∂C} be the first hitting time of ∂C. Recall that for the dihedral groups D2(2p), C is a wedge of angle π/(2p). Recall also from [9] that if the index function l := k − 1/2 takes one strictly negative value for some simple root α, then
164
N. Demni
< α, X > hits zero a.s. so that T0 < ∞ a.s (see also [5]). For even dihedral systems, two cases are to be considered: • 1/2 k0 , k1 1 with either k1 > 1/2 or k0 > 1/2 or equivalently 0 l0 , l1 1/2 with either l0 > 0 or l1 > 0: in that case, the radial Dunkl process with index function −l hits ∂C a.s. and we will use results from [5]. • One and only one of the index values is strictly negative while the other value is positive: in that case, the radial Dunkl process of index function l hits ∂C. We will follow a different strategy based on our representation of the angular process θ as a Jacobi process and on results from ([10]) on Jacobi processes. This strategy applies to the first case too. For odd dihedral systems, we can only have 1/2 < k 1 and computations are similar to the ones done in the first case for even dihedral systems. 7.1 Even Dihedral Groups: First Case We give two different approaches: while the first one has the advantage to be short, the second approach is shown to be efficient for both cases. In fact, the first approach uses the absolute-continuity relations for radial Dunkl processes derived in [5] and we are met with a complicated exponential functional when dealing with the second case (see [7] for more details). The second approach focuses on the angular process θ which was identified with a Jacobi process and uses absolute-continuity relations for Jacobi processes from [10]. First approach: write x = ρeiφ , ρ > 0, 0 < φ < π/(2p) and let 1/2 k0 , k1 1 with either k0 > 1/2 or k1 > 1/2. Using part (c) of Proposition 2. 15 in [5], p. 38, the tail distribution of T0 is given by: −2l(α) < α, Xt > −l l , Px (T0 > t) = Ex < α, x > α∈R+
where Plx (Elx ) denotes the law of a radial Dunkl process starting at x ∈ C and of index l (the corresponding expectation). From (1), one gets 2p(l0 +l1 ) 2 ρ ρ e−ρ /2t 2l0 2l1 √ √ (T > t) = sin (pφ) cos (pφ)g , φ , P−l 0 x ck t t where
∞
π/n
g(ρ, φ) = 0
e−r
2
/2
DkW (ρ, φ, r, θ)r2p+1 sin(2pθ)drdθ.
0
With regard to Corollary 1, it amounts to evaluate ∞ 2 e−r /2 Ibj (ρr)r2p+1−γ dr, S1 (j) = 0
S2 (j) =
0
π/2p
k −1/2,k0 −1/2
Pj 1
(cos(2pθ)) sin(2pθ)dθ
Radial Dunkl Processes
165
for every j 0, where bj := 2jp + γ. In order to evaluate S1 , we use the expansion ([17], p. 108) Ibj (ρr) =
q0
ρr 2q+bj 1 Γ (bj + q + 1) 2
and exchange the order of integration to get (p−γ)/2 Γ (aj
+ 1) Γ (bj + 1)
S1 (j) = 2
ρ √ 2
bj
1 F1
ρ2 aj + 1, bj + 1, 2
where
(2j + k0 + k1 )p + 2p − γ = (j + 1)p. 2 Using the variable change s = cos(2pθ), S2 (j) transforms to aj =
1 S2 (j) = 2p
1
−1
k −1/2,k0 −1/2 Pj 1 (s)ds
1 = p
0
1
k −1/2,k0 −1/2
Pj 1
(2s − 1)ds
which is easily computed using the expansion p. 21 in [11]. As a result, the tail distribution is given by Proposition 3 γ−2p ρ √ sin2l0 (pφ) cos2l1 (pφ) t 2jp ρ Γ (aj + 1) ρ2 √ S2 (j) Pjl1 ,l0 (cos(2pφ)) 1 F1 aj + 1, bj + 1, Γ (bj + 1) 2t 2t j0 γ−2p ρ 1 √ = sin2l0 (pφ) cos2l1 (pφ) ck t 2jp ρ Γ (aj + 1) ρ2 √ S2 (j) F − a , b + 1, − b Pjl1 ,l0 (cos(2pφ)) 1 1 j j j Γ (bj + 1) 2t 2t e−ρ /2t > t) = ck 2
P−l x (T0
j0
by Kummer’s transformation ([17]). Remark 2 The value k ≡ 1 corresponds to the first exit time of a Brownian motion from a wedge and our result fits the one in [4]. Moreover γ = 2p, bj = 2aj and one may use the duplication formula to simplify the above ratio of Gamma functions and use some argument simplifications for the confluent hypergeometric function ([17]). Second approach: recall that (θAt )t0 =
1 arccos( Jp2 At ) p t0
166
N. Demni
where J is a Jacobi process of parameters d = 2k1 + 1, d = 2k0 + 1 and is independent from |X| (thereby from A). Then, for an appropriate index function l (so that T0 < ∞), one has Plx (T0 > t) = Plx (0 < θAt < π/(2p)) = Plx (0 < Jp2 At < 1) = Plx (TJ > p2 At ) where TJ := inf{t, Jt = 0} ∧ inf{t, Jt = 1} is the first exit time from the interval [0, 1] by a Jacobi process. Now, recall the following absolute continuity relation between the laws of Jacobi processes of different set of parameters (Theorem 9.4.3. p.140 in [10] specialized to m = 1): denote the probability law of a Jacobi process starting at z ∈]0, 1[ let Pd,d z and of parameters d ∧ d > 0. Writing T for TJ , then d ,d1
Pz 1
=
|Ft ∩{T >t}
Jt z
κ
1 − Jt 1−z
β
exp − 0
t
u v d ,d ds c + + Pz 2 2 |Ft ∩{T >t} (9) Js 1 − Js
where (Ft )t is the natural filtration of J, di ∧ di > 0, i = 1, 2 and d − d2 d1 − d 2 ,β = 1 , 4 4 d − d2 d1 + d2 d1 − d 2 d 1 + d 2 − 2 ,v = 1 −2 , u= 4 2 4 2 d1 + d1 − d2 − d2 d1 + d1 + d2 + d2 c= 2− . 4 2
κ=
A corollary of the above theorem (Corollary 9.4.6. p. 1402 ) states that if d := 2k1 + 1 := 2(l1 + 1), d := 2k0 + 1 := 2(l0 + 1) with 0 l1 , l0 < 1, then l1 l z 1−z 0 l1 ,l0 −ct 1 ,−l0 (T > t) = E e . (10) P−l z z Jt 1 − Jt where Plz1 ,l0 (Elz1 ,l0 ) denotes the probability law of a Jacobi process of indices l1 , l0 and starting at z (the corresponding expectation). To recover the result in Proposition 3, proceed as follows. Let −1/2 −l0 , −l1 0 so that at least one value is strictly negative. Using the semigroup density of J = cos2 M which follows from (3): √ √ 1 mkt (arccos z, arccos s) pJt (z, s) = 2 s(1 − s) = ck eλj t Pjl1 ,l0 (2z − 1)Pjl1 ,l0 (2s − 1) sl1 (1 − s)l0 , λj j0
= −2j(j + k0 + k1 ), 2
The exponential factor e−ct is missing in [10].
(11)
Radial Dunkl Processes
167
for some constant ck and z, s ∈]0, 1[, together with the independence of J and A and (10), one gets l1 l0 z 1−z −l l1 ,l0 −cp2 At e Px (T0 > t) = Ez Jp2 At 1 − Jp2 At 1 2 = ck z l1 (1 − z)l0 Eγρ [e−p (c−λj )At ]Pjl1 ,l0 (2z − 1) Pjl1 ,l0 (2s − 1)ds, 0
j0
where Eγρ is the law of the Bessel process |X| of index γ = (2 − k0 − k1 )p = 2p − γ, corresponding to −l and c = 2(l0 +l1 ) = 2(k0 +k1 −1). Now, note that integral in the RHS is up to a constant S2 (j) defined in the previous section and that z l1 (1 − z)l0 = sin2l0 (pφ) cos2l1 (pφ). Next, use the formula
Eγρ [e−(c−λj )p
2
At
||Xt | = r] =
I√γ 2 +2(c−λ
2 j )p
(ρr/t)
Iγ (ρr/t)
,
and the semi-group density of |X| 1 qt (ρ, r) = t
γ
ρr 2 2 r re−(ρ +r )/2t Iγ ρ t
to deduce after elemantary computations that 2
ρr e−ρ /2t 2p−γ ∞ 2p−γ+1 −r2 /2t γ −(c−λj )p2 At ρ dr ]= r e I2jp+γ Eρ [e t t 0 γ−2p ∞ 2 2 ρ = e−ρ /2t √ r2p−γ+1 e−r /2 I2jp+γ (ρr) dr t 0 γ−2p 2 ρ = e−ρ /2t √ S1 (j) t In fact, γ 2 − 2(c − λj )p2 = [(2 − k0 − k1 )2 + 4[(k0 + k1 − 1) + j(j + k0 + k1 )]] p2 = [(k0 + k1 )2 − 4(k0 + k1 − 1) + 4(k0 + k1 − 1) + 4j(j + k0 + k1 )] p2 = (2j + k0 + k1 )2 p2 = (2jp + γ)2 . Finally, it only remains to relate the modified Bessel function Iν and the hypergeometric function 0 F1 via: Iν (z) =
1 z ν 0 F1 (ν + 1, z). Γ (ν + 1)
168
N. Demni
7.2 Even Dihedral Groups: Second Case We use the second approach developed above and we suppose for instance that k1 < 1/2 while k0 1/2. Take 1 d1 = d < 2, d1 = d 2 in (9) then perform the parameters change d2 = d1 = d 2, d2 = 4 − d1 = 4 − d = 3 − 2k1 > 2 so that the indices corresponding to the new parameters d2 , d2 are positive, wherefrom T = TJ = ∞ a.s. Moreover, one has β = u = v = 0, which yields
J 2 κ 2 p At e−cp At Plx (T0 > t) = P4−d,d z z where κ = (d − d2 )/2 = d/2 − 1 = l1 < 0 and c = −d (d − 2) = d (2 − d) > 0. Since the parameter d2 corresponds to the index value 1/2 − k1 = −l1 > 0 and multiplicity function 1 − k1 > 1/2, we get 2 1/2−k1 ,k0 −1/2 Eγρ [ep (λj −c)At ]Pj (2z − 1)F (j) Plx (T0 > t) = ck z 1−2k1 (1 − z)l0 j0
where λj = −2j(j + k0 + 1 − k1 ), γ = (k0 + k1 )p and F (j) = 0
1
1/2−k1 ,k0 −1/2
(1 − s)l0 Pj
(2s − 1)ds.
We leave the computations to the interested reader. A similar result holds when k0 < 1/2 and k1 1/2: substitute k1 , l1 by k0 , l0 respectively and s, z by 1 − s, 1 − z respectively. Acknowedgments: the author would like to give a special thank for Professor C. F. Dunkl for his intensive reading of the paper, for his fruitful remarks and for pointing to him the references [12], [13]. The author also thanks Professor M. Bozejko for his remarks and for stimulating discussions at the Wroclaw Institute of Mathematics.
References 1. J. C. Baez. The octonions. Bull. Amer. Math. Soc. (N. S.) 39, no. 2. 2002, 145–205. 2. T. H. Baker, P. J. Forrester. The Calogero-Sutherland model and generalized classical polynomials. Comm. Math. Phys. 188. 1997, 175–216. 3. D. Bakry. Remarques sur les semi-groupes de Jacobi. Hommage a ` P. A. Meyer et J. Neveu. Ast´erisque 236, 1996, 23–39. 4. R. Ba˜ nuelos, R. G. Smits. Brownian motions in cones. P. T. R. F. 108, 1997, 299–319.
Radial Dunkl Processes
169
5. O. Chybiryakov. Processus de Dunkl et relation de Lamperti. Th`ese de doctorat, Universit´e Paris VI, June 2005. 6. N. Demni, M. Zani. Large deviations for statistics of Jacobi process. To appear in S. P. A. 7. N. Demni. First hitting time of the boundary of a Weyl chamber by radial Dunkl processes. SIGMA Journal. 4, 2008, 074, 14 pages. 8. N. Demni. Generalized Bessel function of type D. SIGMA Journal. 4, 2008, 075, 7 pages. 9. N. Demni. Note on radial Dunkl processes. Submitted to Ann. I. H. P. 10. Y. Doumerc. Matrix Jacobi Process. Th`ese de doctorat, Universit´e Paul Sabatier, May 2005. 11. C. F. Dunkl, Y. Xu. Orthogonal Polynomials of Several Variables. Encyclopedia of Mathematics and Its Applications. Cambridge University Press. 2001. 12. C. F. Dunkl. Differential-difference operators associated to reflection groups. Trans. Amer. Math. Soc. 311. 1989, no. 1, 167–183. 13. C. F. Dunkl. Integral kernels with reflection group invariance. Canad. J. Math. 43. 1991, no. 6, 1213–1227. 14. C. F. Dunkl. Generating functions associated with dihedral groups. Special functions (Hong Kong 1999), World Sci. Publ. Rier Edge, NJ. 2000, 72–87. 15. D. J. Grabiner. Brownian motion in a Weyl chamber, non-colliding particles and random matrices. Ann. IHP. 35, 1999, no. 2. 177–204. 16. J. E. Humphreys. Reflections Groups and Coxeter Groups. Cambridge University Press. 29. 2000. 17. N. N. Lebedev. Special Functions and their Applications. Dover Publications, INC. 1972. 18. D. Revuz, M. Yor. Continuous Martingales and Brownian Motion, 3rd ed., Springer, 1999. 19. W. Schoutens. Stochastic Processes and Orthogonal Polynomials. Lecture Notes in Statistics, 146. Springer, 2000. 20. M. R¨ osler. Dunkl operator: theory and applications, orthogonal polynomials and special functions (Leuven, 2002). Lecture Notes in Math. Vol. 1817, Springer, Berlin, 2003, 93–135. 21. J. Warren, M. Yor. The Brownian Burglar: conditioning Brownian motion by its local time process. S´em. Probab. 
XXXII., 1998, 328–342. 22. M.Yor. Loi de l’indice du lacet brownien et distribution de Hartman-Watson, Zeit. Wahr. verw. Geb. 53, no.1, 1980, 71–95.
Matrix Valued Brownian Motion and a Paper by P´ olya Philippe Biane CNRS, Laboratoire d’Informatique Institut Gaspard Monge, Universit´e Paris-Est 5 bd Descartes, Champs-sur-Marne, 77454 Marne-la-Vall´ee cedex 2, France e-mail: [email protected]
1 Introduction This paper has two parts which are largely independent. In the first one I recall some known facts on matrix valued Brownian motion, which are not so easily found in this form in the literature. I will study three types of matrices, namely Hermitian matrices, complex invertible matrices, and unitary matrices, and try to give a precise description of the motion of eigenvalues (or singular values) in each case. In the second part, I give a new look at an old paper of G. P´ olya [14], where he introduces a function close to Riemann’s ξ function, and shows that it satisfies Riemann’s hypothesis. As put by Marc Kac in his comments on P´olya’s paper [11], “Although this beautiful paper takes you within a hair’s breadth of Riemann’s hypothesis it does not seem to have inspired much further work and reference to it in the mathematical litterature are rather scant”. My aim is to point out that the function considered by P´ olya is related in a more subtle way to Riemann’s ξ function than it looks at first sight. Furthermore the nature of this relation is probabilistic, since these functions have a natural interpretation involving Mellin transforms of first passage times for diffusions. By studying infinite divisibility properties of the distributions of these first passage times, we will see that they are generalized gamma convolutions, whose mixing measures are related to the considerations in the first part of this note.
2 Matrix Brownian Motions We will study three types of matrix spaces, and in each of these spaces consider a natural Brownian motion, and show that the motion of eigenvalues (or singular values) of this Brownian motion has a simple geometric description, using Doob’s transform. The following results admit analogues in more general complex symmetric spaces, but for the sake of simplicity, discussion will be restricted to type A symmetric spaces. Actually the interesting case for us C. Donati-Martin et al. (eds.), S´ eminaire de Probabilit´ es XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6 7, c Springer-Verlag Berlin Heidelberg 2009
171
172
P. Biane
in the second part will be the simplest one, of rank one, but I think that this almost trivial case is better understood by putting it in the more general context. Some references for results in this section are [2], [4], [5], [7], [8], [9], [10], [13], [15], [16]. 2.1 Hermitian Matrices Consider the space of n × n Hermitian matrices, with zero trace, endowed with the quadratic form A, B = T r(AB). Let M (t) be a Brownian motion with values in this space, which is simply a Gaussian process with covariance E[T r(AM (t))T r(BM (s))] = T r(AB)s ∧ t for A, B traceless Hermitian matrices. Let λ1 (t) ≥ λ2 (t) ≥ . . . ≥ λn (t) be the eigenvalues of M (t); they perform a stochastic process with values in the Weyl chamber C = {(x1 , . . . , xn ) ∈ Rn | x1 ≥ x2 ≥ . . . ≥ xn } ∩ Hn where
Hn =
n (x1 , . . . , xn ) ∈ Rn xi = 0 . i=1
p0t
Let be the transition probability semi-group of Brownian motion killed at the boundary of the cone C. This cone is a fundamental domain for the action of the symmetric group Sn , which acts by permutation of coordinates on Rn . Using the reflexion principle, one shows easily that (σ)pt (x, σ(y)) x, y ∈ C p0t (x, y) = σ∈Sn
where pt (x, y) = (2πt)−(n−1)/2 e−|x−y| /2t and (σ) is the signature of σ. Let h be the function (xi − xj ). h(x) = 2
i>j
Proposition 1. The function h is the unique (up to a positive multiplicative constant) positive harmonic function for the semigroup p0t , on the cone C, which vanishes on the boundary. The harmonic function h corresponds to the unique point at infinity in the Martin compactification of C. Consider now the Doob’s transform of p0t , which is the semigroup given by
Matrix Brownian Motion
qt (x, y) =
173
h(y) 0 p (x, y). h(x) t
It is a diffusion semigroup on C with infinitesimal generator 1 Δ + ∇ log h, ∇ · . 2 Proposition 2. The eigenvalue process of a traceless Hermitian Brownian motion is a Markov diffusion process in the cone C, with semigroup qt . We can summarize the last proposition by saying that the eigenvalue process is a Brownian motion in C, conditioned (in Doob’s sense) to exit the cone at infinity. 2.2 The Group SLn (C) This is the group of complex invertible matrices of size n × n, with determinant 1. Its Lie algebra is the space sln (C) of complex traceless matrices. Consider the Hermitian form A, B = T r(AB ∗ ) on sln (C) which is invariant by left and right action of the unitary subgroup SU (n). This Hermitian form determines a unique Brownian motion with values in sln (C). The Brownian motion gt , on SLn (C), is the stochastic exponential of this Brownian motion, solution to the Stratonovich stochastic differential equation dgt = gt dwt where wt is a Brownian motion in sln (C). There are two remarkable decompositions of SLn (C), the Iwasawa and Cartan decompositions. The first one is SLn (C) = N AK where K is the compact group SU (n), A is the group of diagonal matrices with positive coefficients, and determinant one, and N is the nilpotent group of upper triangular matrices with 1’s on the diagonal. Each matrix of SLn (C) has a unique decomposition as a product g = nak of elements of the three subgroups N, A, K. This can be easily inferred from the Gram-Schmidt orthogonalization process. If gt is a Brownian motion in SLn (C), one can consider its components nt , at , kt . In particular, denoting by (ew1 (t) , . . . , ewn (t) ) the diagonal components of at the following holds (cf [15]). Proposition 3. The process w1 (t), . . . , wn (t) is a Brownian motion with a drift ρ = (−n + 1, −n + 3, . . . , n − 1) in the subspace Hn . The other decomposition is the Cartan decomposition SLn (C) = KA+ K, where A+ is the part of A consisting of matrices with positive nonincreasing
174
P. Biane
coefficients along the diagonal. In order to get the Cartan decomposition of a matrix g ∈ SLn (C), take its polar decomposition g = ru with r positive Hermitian, and u unitary, then diagonalize r which yields g = vav with v and v unitary and a diagonal, with positive real coefficients which can be put in nonincreasing order along the diagonal. These coefficients are the singular values of the matrix g. This decomposition is not unique since the diagonal subgroup of SU (n) commutes with A, but the singular values are uniquely defined. Call (ea1 (t) , . . . , ean (t) ) the singular values of the Brownian motion gt , with a1 ≥ a2 ≥ . . . ≥ an . They form a process with values in the cone C. Let us mention that this stochastic process can also be interpreted as the radial part of a Brownian motion with values in the symmetric space SLn (C)/SU (n). We will now give for the motion of singular values a similar description as the one of eigenvalues of the Hermitian Brownian motion. For this, consider a Brownian motion in Hn , with drift ρ, killed at the exit of the cone C. This process has a semigroup given by ρ,y−x−tρ,ρ/2 0 p0,ρ pt (x, y). t (x, y) = e
Proposition 4. The function hρ (y) =
(1 − e2(yj −yi ) )
i>j
is a positive harmonic function for the semigroup p0,ρ t , in the cone C, and vanishes at the boundary of the cone. It is not true that this function is the unique positive harmonic function on the cone; indeed the Martin boundary at infinity is now much larger and contains a point for each direction inside the cone, see [8]. The Doob-transformed semigroup hρ (y) 0,ρ p (x, y) qtρ (x, y) = ρ h (x) t is a Markov diffusion semigroup in the cone C, with infinitesimal generator 1 Δ + ρ, · + ∇ log hρ , ∇·. 2 Note that it can also be expressed as 1 ˜ ρ , ∇· Δ + ∇ log h 2 with ˜ ρ (y) = h
sinh(yj − yi )
i>j
(see[10]). Proposition 5. The logarithms of the singular values of a Brownian motion in SLn (C) perform a diffusion process in the cone C, with semigroup qtρ .
Matrix Brownian Motion
175
As in the preceding case, we can summarize by saying that the process of singular values is a Brownian motion with drift ρ in the cone C, conditioned (in Doob’s sense) to exit the cone at infinity, in the direction ρ. 2.3 Unitary Matrices The Brownian motion with values in SU (n) is obtained by taking the stochastic exponential of a Brownian motion in the Lie algebra of traceless antiHermitian matrices, endowed with the Hermitian form A, B = −T r(AB). Let eiθ1 , . . . , eiθn be the eigenvalues of a matrix in SU (n), which can be chosen so that i θi = 0, and θ1 ≥ θ2 ≥ . . . ≥ θn , θ1 − θn ≤ 2π. These conditions determine a simplex Δn in Hn , which is a fundamental domain for the action ˜ is the of the affine Weyl group on Hn . Recall that the affine Weyl group W semidirect product of the symmetric group Sn , which acts by permutation of coordinates in Hn , and of the group of translations by elements of the lattice (2πZ)n ∩ Hn . One can use the reflexion principle again to compute the semigroup of Brownian moton in this simplex killed at the boundary. One gets an alternat˜, ing sum over the elements of W (w)pt (θ, w(ξ)). p0t (θ, ξ) = ˜ w∈W
The infinitesimal generator is 1/2 × the Laplacian in the simplex, with Dirichlet boundary conditions. It is well known that this operator has a compact resolvent, and its eigenvalue with smallest module is simple, with an eigenfunction which can be chosen positive. Consider the function (eiθj − eiθk ). hu (θ) = j>k
Proposition 6. The function hu is positive inside the simplex Δn , it vanishes on the boundary, and it is the eigenfunction corresponding to the Dirichlet eigenvalue with smallest module on Δn . This eigenvalue is λ = (n − n3 )/6. The Doob-transformed semigroup qtu (x, y) =
hu (y) −λt 0 e pt (x, y) hu (x)
is a Markov diffusion semigroup in Δn , with infinitesimal generator 1 Δ + ∇ log hu , ∇· − λ. 2
176
P. Biane
Proposition 7. The process of eigenvalues of a unitary Brownian motion is a diffusion with values in Δn with probability transition semigroup qtu . Again a good summary of this situation is that the motion of eigenvalues is that of a Brownian motion in the simplex Δn conditioned to stay forever in this simplex. 2.4 The Case of Rank 1 In the next section we will need the simplest case, that of 2 × 2 matrices. Consider first the case of Hermitian matrices. The process of eigenvalues is essentially a Bessel process of dimension 3, with infinitesimal generator 1 d2 1 d + 2 dx2 x dx
on
]0, +∞[,
obtained from Brownian motion killed at zero, of infinitesimal generator 1 d2 2 dx2 with Dirichlet boundary condition at 0, by a Doob transform with the positive harmonic function h(x) = x. In the case of the group SL2 (C), or the symmetric space SL2 (C)/SU (2), which is the hyperbolic space of dimension 3, the radial process has infinitesimal generator 1 d2 d + coth x 2 dx2 dx obtained from Brownian motion with a drift d 1 d2 + 2 2 dx dx with Dirichlet boundary condition at 0, by a Doob’s transform with the function 1 − e−2x . Finally the last case is Brownian motion in SU (2), where the eigenvalue process takes values in [0, π] and has an infinitesimal generator 1 d2 d + cot x 2 dx2 dx obtained by a Doob transform at the bottom of the spectrum from 1 d2 2 dx2 on [0, π] with Dirichlet boundary conditions at 0 and π, by the function sin(x). For these last two examples, we shall write a spectral decomposition of the generator Li , i = 1, 2, of the form
Matrix Brownian Motion
f (x) =
Φiλ (x) Φiλ (y)f (y) dmi (y) dνi (λ)
i = 1, 2
177
(1)
for every f ∈ L2 (mi ), where mi is measure for which Li is selfadjoint in L2 (mi ), and the functions Φiλ are solutions to Li Φiλ + λΦiλ = 0 and νi is a spectral measure for Li . d2 For L1 = 12 dx 2 on [0, π] with Dirichlet boundary conditions, m1 (dx) is Lebesgue measure on [0, π], and L1 is selfadjoint on L2 ([0, π)]. Furthermore √ Φ1λ (x) = sin( 2λx) and ν1 (dλ) = For L2 =
1 d2 2 dx2
+
∞ 1 δn2 /2 (dλ). π n=1
(2)
d dx
on [0, +∞[, the measure m2 (dx) = e2x dx, and √ Φ2λ (x) = e−x sin( 2λ − 1x) λ > 1/2.
The spectral measure is ν2 (dλ) =
1 √ dλ π 2λ − 1
λ > 1/2
(3)
on [1, +∞[. Of course formulas (1), (2), (3) are immediate consequences of ordinary Fourier analysis. Note that the spectral decompositions, and in particular the measures νi , depend on the normalisation of the fonctions Φλ . We have made a natural choice, but it does not coincide with the usual normalisation of WeylTitchmarsh-Kodaira theory, see [6].
3 MacDonald’s Function and Riemann’s ξ Function 3.1 P´ olya’s Paper In his paper [14], P´ olya starts from Riemann’s ξ function ξ(s) = s(s − 1)π −s/2 Γ (s/2)ζ(s) where ζ is Riemann’s zeta function. Then ξ is an entire function whose zeros are exactly the nontrivial zeros of ζ. Putting s = 1/2 + iz yields
∞ Φ(u) cos(zu)du ξ(z) = 2 0
178
P. Biane
with Φ(u) = 2πe5u/2
∞
(2πe2u n2 − 3)n2 e−πn
2 2u
e
(4)
n=1
and the function Φ is even, as follows from the functional equation for Jacobi θ function; furthermore Φ(u) ∼ 4π 2 e9u/2−πe
2u
u → +∞
so that Φ(u) ∼ 4π 2 (e9u/2 + e−9u/2 )e−π(e
2u
+e−2u )
u → ±∞.
This lead P´ olya to define a “falsified” ξ function
∞ 2u −2u ∗ 2 ξ (z) = 8π (e9u/2 + e−9u/2 )e−π(e +e ) cos(zu)du. 0
The main result of [14] is Theorem 1. The function ξ ∗ is entire, its zeros are real and simple. Let N (r), (resp. N ∗ (r)) denote the number of zeros of ξ(z) (resp. ξ ∗ (z)) with real part in the interval [0, r], then N (r) − N ∗ (r) = O(log r). Recall that the same assertion about the zeros of the function ξ (without the statement about simplicity, beware also that s = 1/2 + iz) is Riemann’s hypothesis. Recall also the well known estimate N (r) =
r r log(r/2π) − + O(1). 2π 2π
P´ olya’s results rely on the intermediate study of the function
∞ u −u G(z, a) = e−a(e +e )+zu du −∞
from which ξ ∗ is obtained by ξ ∗ (z) = 2π 2 (G(iz/2 − 9/4, π) + G(iz/2 + 9/4, π)) P´ olya shows that G(z, a) has only purely imaginary zeros, (as a function of z) and the number of these zeros with imaginary part in [0, r] grows as πr log ar − r ∗ π + O(1). The results on ξ are then deduced through a nice lemma which played a role in the history of statistical mechanics (the Lee-Yang theorem on Ising model), as revealed by M. Kac [11]. We shall now concentrate on ˜ G(z, a). In particular, for a = π, the function ξ(z) = G(iz/2, π) is another approximation of ξ which has many interesting structural properties.
Matrix Brownian Motion
179
3.2 MacDonald Functions The function denoted G(z, a) by P´ olya is actually a Bessel function. Indeed, MacDonald’s function, also called modified Bessel function (see e.g. [1]), given by
∞ −1 x tμ−1 e− 2 (t+t ) dt x > 0, μ ∈ C. Kμ (x) = 0
satisfies Kz (2x) = G(z, x). The function G(z, a) is therefore essentially a MacDonald function, as noted by P´ olya. MacDonald function is an even function of μ and satisfies 2μ Kμ (x) = Kμ+1 (x) − Kμ−1 (x) x
(5)
d Kμ (x) = Kμ+1 (x) + Kμ−1 (x) (6) dx The first of these equations is used by P´ olya in a very clever way to prove that the zeros (in z) of G(z, x) are purely imaginary. −2
3.3 Spectral Interpretation of the Zeros From (5), (6) ( μx −
d dx )Kμ
(− μx −
= Kμ+1
d dx )Kμ
= Kμ−1
from which one gets Kμ = (− μ+1 x − 2
d = ( dx 2 +
μ d dx )( x
1 d x dx
−
−
d dx )Kμ
μ2 x2 )Kμ
.
This differential equation will give us a spectral interpretation of the zeros of G(z, x). Change variable by ψμ (x) = Kμ (ex ) to get (−
d2 + e2x )ψμ = −μ2 ψμ dx2
(7)
Since Kμ vanishes exponentially at infinity, the spectral theory of SturmLiouville operators on the half-line (see e.g. [6], [12]) implies that the squares d2 2x of the zeros of μ → ψμ (y) are the eigenvalues of dx on the interval 2 − e [y, +∞[ with the Dirichlet boundary condition at y, the functions ψμ being the eigenfunctions. Since this operator is selfadjoint and negative the zeros are purely imaginary, and are simple. This spectral interpretation of the zeros of MacDonald function is well known [17], I do not know why P´ olya does not mention it.
180
P. Biane
3.4 H = xp Equation (7) can be put into Dirac’s form, indeed the equations d 1 x f = γg dx + 2 + e d − dx + imply
−
1 2
+ ex g = γf
d2 1 2x + e f = (γ 2 − )f. dx2 4
Using the change of variables u = ex , we get d u du + 12 + u f = γg d −u du +
+ u g = γf.
1 2
Remark that this Dirac system yields a perturbation of the Hamiltonian H = xp considered by Berry et Keating [3], in relation with Riemann’s zeta function. 3.5 Asymptotics of the Zeros General results on Sturm-Liouville operators allow one to recover the asymptotic behaviour of the spectrum, thanks to a semiclassical analysis, see e.g. [12]. One can get a more precise result using the integral representation of Kiμ . P´ olya gives the asymptotic estimate π y x iΦ y −x −iΦ 1 −π y+i π x 2 2 Kx+iy (2a) = √ e + e e + O(e− 2 y y |x|−3/2 ) a a 2πy in the strip |x| ≤ 1 uniformly as y → ∞, where Φ = y log
π y −y− . a 4
This estimate can be obtained by the stationary phase method, writing
∞ Kz (2a) = ezt−2a cosh(t) dt. −∞
Making a contour deformation we get Kz (2a) =
−A
π/2 ezt−2a cosh(t) dt + i 0 ez(−A+it)−2a cosh(−A+it) dt π/2 A π π π + −A ez(t+i 2 )−2ai sinh(t) dt − i 0 ez(A−it+i 2 )−2a cosh(A−it+i 2 ) dt ∞ + A ezt−2a cosh(t) dt
−∞
Matrix Brownian Motion
181
and P´ olya’s estimate can be obtained by standard methods, which give also estimates for the derivatives of MacDonald’s function. Finally the zeros of y → Kiy (2a) behave like the solutions to y log
π 1 y − y − = (n + )π a 4 2
n integer
The number of zeros with imaginary part in [0, T ] is thus
T π
log
T a
− Tπ + O(1).
4 Probabilistic Interpretations We will now give interpretations of the functions ξ and ξ˜ using first passage times of diffusions. 4.1 Brownian Motion with a Drift The first passage time at x > 0 of Brownian motion started at 0 follows a 1/2 stable distribution i.e., x2
e− 2t P (Tx ∈ dt) = x √ dt 2πt3 with Laplace transform E[e−λ
2
Tx /2
] = e−λx .
Adding a drift a > 0 to the Brownian motion gives a first passage distribution x2
e− 2t ax− a2 t 2 dt P (Tx ∈ dt) = x √ e 2πt3 a
with Laplace transform E a [e−λ
2
Tx /2
√
] = e−x
λ2 +a2 +ax
.
This is a generalized inverse Gaussian distribution. In particular, its Mellin transform is E a [Txs ] = (x/a)s
K−1/2+s (ax) = (x/a)s π/ax eax K−1/2+s (ax) K−1/2 (ax)
which gives a probabilistic interpretation of MacDonald’s function (as a function of s) as a Mellin transform of a probability distribution.
182
P. Biane
4.2 Three Dimensional Bessel Process There exists a similar interpretation of the ξ function, which is discussed in details in [4], [5], for example, Consider the first passage time at a > 0 of a three dimensional Bessel process (i.e., the norm of a three dimensional Brownian motion) starting from 0. The Laplace transform of this hitting time is E[e−
λ2 2
Sa
]=
λa . sinh λa
Let Sa be an independent copy of Sa , and let Wa = Sa + Sa ; then the density of the distribution of Wa is obtained by inverting the Laplace transform. One gets P (Wa ∈ dx) =
∞
(π 4 n4 x/a4 − 3π 2 n2 /a2 )e−π
2
n2 x/2a2
dx
n=1
from which one can compute the Mellin transform E[Was ] = 2(2a2 /π)s ξ(2s). The function 2ξ thus has a probabilistic interpretation, as Mellin transform of π2 W1 . 4.3 Infinite Divisibility The distributions of Tx and Wa are infinitely divisible. Indeed √ 2 log E a [exp(− λ2 Tx )] = −x λ2 + a2 + ax 2 ∞ −a t λ2 2 dt = x 0 (e− 2 t − 1) e√2πt 3 which shows that Tx is a subordinator with L´evy measure a2
e− 2 t √ dt. 2πt3 Similarly 2
log E[exp(− λ2 Wa )] = 2 log(λa/ sinh(λa)) ∞ ∞ 2 2 2 λ2 = 2 0 (e− 2 t − 1) n=1 e−π n t/a dt therefore the variable Wa has the distribution of a subordinator, with L´evy measure ∞ 2 2 2 2 e−π n t/a dt, n=1
taken at time 1. Observe however that the process (Wa )a≥0 is not a subordinator.
Matrix Brownian Motion
183
4.4 Generalized Gamma Convolution The gamma distributions are P (γω,c ∈ dt) =
c−ω ω−1 −t/c t e dt = Γω,c (dt) Γ (ω)
where ω and c are > 0 parameters. The Laplace transform is E[e−λγω,c ] = (1 + λ/c)−ω . The gamma distributions form a convolution semigroup with respect to the parameter ω, i.e., Γω1 ,c ∗ Γω2 ,c = Γω1 +ω2 ,c . The L´evy exponent of the gamma semigroup is
∞ e−ct ψc (λ) = log(1 + λ/c) = dt (1 − e−λt ) t 0 so that this is the semigroup of a subordinator with L´evy measure e−ct /t dt. The generalized gamma convolutions are the distributions of linear combinations, with positive coefficients, of independent gamma variables, and their weak limits. One can also characterize the generalized gamma convolutions as the infinitely divisible distributions with a L´evy exponent of the form
∞ ψc (λ)dν(c) ψ(λ) = 0
for some positive measure ν which integrates 1/c at ∞. This measure is called the Thorin measure of the generalized gamma distribution. The variables Tx and Wa of the preceding paragraph are generalized gamma convolutions. Indeed it is easy to check, using the computations of section 4.3, that Wa has a generalized gamma convolution as distribution, with Thorin measure ν(dc) = 2
∞
δn2 /a2 (dc) ;
(8)
n=1
whereas Tx is distributed as a generalized gamma convolution with Thorin measure dc c > a2 /2 (9) ν(dc) = π c − a2 /2 since
e−a t/2 √ = πt3 2
∞
dc . e−ct π c − a2 /2 a2 /2
184
P. Biane
4.5 Final Remarks We can now make a connection between the preceding considerations and those of the first part of the paper. Indeed, the Thorin measures associated with the variables Tx and Wa can be expressed as spectral measures associated with the generators of Brownian motion on matrix spaces. The hitting times of Brownian motion with drift are related with the radial part of Brownian motion in the symmetric space SL2 (C)/SU (2), whereas the hitting times of the Bessel three process are related with the Brownian motion on the unitary group SU (2). The precise relations are contained in formulas (2), (3), (8), (9). Thus the Riemann ξ function, which is the Mellin transform of a hitting time of the Bessel three process, as in section 4.2, and the Polya ξ˜ function from section 3.1, which appears as Mellin transform of hitting time of Brownian motion with drift, are related in this non obvious way.
On the Laws of First Hitting Times of Points for One-dimensional Symmetric Stable Lévy Processes

Kouji Yano¹, Yuko Yano², and Marc Yor³,⁴,²

¹ Department of Mathematics, Graduate School of Science, Kobe University, Kobe, Japan. E-mail: [email protected]
² Research Institute for Mathematical Sciences, Kyoto University, Kyoto, Japan.
³ Laboratoire de Probabilités et Modèles Aléatoires, Université Paris VI, Paris, France.
⁴ Institut Universitaire de France.
Summary. Several aspects of the laws of first hitting times of points are investigated for one-dimensional symmetric stable Lévy processes. Itô's excursion theory plays a key role in this study.
Keywords: Symmetric stable Lévy process, excursion theory, first hitting times.
1 Introduction

For one-dimensional Brownian motion, the laws of several random times, such as first hitting times of points and intervals, can be expressed explicitly in terms of elementary functions. Moreover, these laws are infinitely divisible (abbrev. as (ID)), and in fact, self-decomposable (abbrev. as (SD)). The aim of the present paper is to study various aspects of the laws of first hitting times of points and last exit times for one-dimensional symmetric stable Lévy processes. We shall put some special emphasis on the following objects:
(i) the laws of the ratio of two independent gamma variables, which, as is usual, we call beta variables of the second kind;
(ii) the harmonic transform of Itô's measure of excursions away from the origin.
The present study is motivated by a recent work [49] by the authors about penalisations of symmetric stable Lévy paths.

The organisation of the present paper is as follows. In Section 2, we recall several facts concerning beta and gamma variables and their variants. In Section 3, we briefly recall Itô's excursion theory and discuss last exit times. In Section 4, we consider harmonic transforms of symmetric stable Lévy processes, which play an important role in our study. In Section 5, we discuss the laws of first hitting times of single points and last exit times for symmetric stable Lévy processes. In Section 6, we discuss the laws of those random times for the absolute value of symmetric stable Lévy processes, which include reflecting Brownian motion as a special case.

C. Donati-Martin et al. (eds.), Séminaire de Probabilités XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6_8, © Springer-Verlag Berlin Heidelberg 2009
2 Preliminaries: Several Important Random Variables

2.1 Generalized Gamma Convolutions

For $a>0$, we write $G_a$ for a gamma variable with parameter $a$:
$$P(G_a\in dx) = \frac{1}{\Gamma(a)}\,x^{a-1}e^{-x}\,dx,\qquad x>0.\tag{2.1}$$
As a rather general framework, we recall the class of generalized gamma convolutions (abbrev. as (GGC)), which is an important subclass of (SD); namely,
$$\mathrm{(GGC)}\subset\mathrm{(SD)}\subset\mathrm{(ID)}.\tag{2.2}$$
A nice reference for details is the monograph [7] by Bondesson. A recent survey can be found in James–Roynette–Yor [26]. A random variable $X$ is said to be of (GGC) type if it is a weak limit of linear combinations of independent gamma variables with positive coefficients.

Theorem 2.1 (See, e.g., [7, Thm.3.1.1]) A random variable $X$ is of (GGC) type if and only if there exist a non-negative constant $a$ and a non-negative measure $U(dt)$ on $(0,\infty)$ with
$$\int_{(0,1]}|\log t|\,U(dt)<\infty \quad\text{and}\quad \int_{(1,\infty)}\frac{1}{t}\,U(dt)<\infty\tag{2.3}$$
such that
$$E[e^{-\lambda X}] = \exp\Bigl(-a\lambda-\int_{(0,\infty)}\log(1+\lambda/t)\,U(dt)\Bigr),\qquad\lambda>0.\tag{2.4}$$
In what follows we shall call $U(dt)$ the Thorin measure associated with $X$. By simple calculations, it follows that
$$E[e^{-\lambda X}] = \exp\Bigl(-a\lambda+\int_0^\infty\Bigl(\frac{1}{s+\lambda}-\frac{1}{s}\Bigr)U((0,s))\,ds\Bigr)\tag{2.5}$$
$$= \exp\Bigl(-a\lambda-\int_0^\infty\bigl(1-e^{-\lambda u}\bigr)\,\frac{1}{u}\Bigl(\int_{(0,\infty)}e^{-ut}\,U(dt)\Bigr)du\Bigr).\tag{2.6}$$
In particular, the following holds: the law of $X$ is of (ID) type and its Lévy measure has a density given by $n(u) := \frac{1}{u}\int_{(0,\infty)}e^{-ut}\,U(dt)$. Since $u\,n(u)$ is non-increasing, the law of $X$ is of (SD) type.
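For a discrete Thorin measure $U = \sum_i b_i\delta_{t_i}$ (and $a=0$), formula (2.4) simply says that $X$ is a sum of independent gamma variables $b_i$ with scales $1/t_i$; this can be verified by Monte Carlo, as in the following sketch (assuming NumPy; the parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)
# X = sum of independent gamma variables; Thorin measure U = sum_i b_i * delta_{t_i}
b = np.array([0.5, 1.5])   # masses of the Thorin measure
t = np.array([1.0, 3.0])   # Thorin points
n = 10**6
X = sum(rng.gamma(bi, 1 / ti, size=n) for bi, ti in zip(b, t))

lam = 0.8
mc = np.exp(-lam * X).mean()                       # Monte Carlo Laplace transform
exact = np.exp(-(b * np.log(1 + lam / t)).sum())   # (2.4) with a = 0
print(mc, exact)
```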
Theorem 2.2 (see, e.g., [7, Thm.4.1.1 and 4.1.4]) Suppose that $X$ is of (GGC) type and that $a=0$ and $b := U((0,\infty))<\infty$. Then $X$ may be represented as $X \stackrel{\mathrm{law}}{=} G_bY$ for some random variable $Y$ independent of $G_b$. The total mass of the Thorin measure is given by
$$b = \sup\Bigl\{p\ge0 : \lim_{x\to0+}\frac{\rho(x)}{x^{p-1}} = 0\Bigr\}\tag{2.7}$$
where $\rho$ is the density of the law of $X$ with respect to the Lebesgue measure:
$$\rho(x) = \frac{1}{\Gamma(b)}\,x^{b-1}\,E\Bigl[\frac{1}{Y^b}\exp\Bigl(-\frac{x}{Y}\Bigr)\Bigr].\tag{2.8}$$

Remark 2.3 For a given $X$, the law of the variable $Y$ which represents $X$ as in Theorem 2.2 is unique; in fact, the gamma distribution is simplifiable (see [11, Sec.1.12]).

Remark 2.4 We do not know how to characterise explicitly the class of possible $Y$'s which represent variables of (GGC) type as in Theorem 2.2. As a partial converse, Bondesson (see [7, Thm.6.2.1]) has introduced a remarkable class which is closed under multiplication of independent gamma variables.

2.2 Beta and Gamma Variables

We introduce notations and recall several basic facts concerning the beta and gamma variables. See [11, Chap.4] for details. For $a,b>0$, we write $B_{a,b}$ for a beta variable (of the first kind) with parameters $a,b$:
$$P(B_{a,b}\in dx) = \frac{1}{B(a,b)}\,x^{a-1}(1-x)^{b-1}\,dx,\qquad 0<x<1\tag{2.9}$$
where $B(a,b)$ is the beta function:
$$B(a,b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}.\tag{2.10}$$
Note that $B_{a,b}\stackrel{\mathrm{law}}{=}1-B_{b,a}$ for $a,b>0$ and that $B_{a,1}\stackrel{\mathrm{law}}{=}U^{1/a}$ for $a>0$ where $U$ is a uniform variable on $(0,1)$. The following identity in law is well-known: For any $a,b>0$,
$$(G_a,\,G_b) \stackrel{\mathrm{law}}{=} \bigl(B_{a,b}G_{a+b},\,(1-B_{a,b})G_{a+b}\bigr),\tag{2.11}$$
or equivalently,
$$\Bigl(G_a+G_b,\ \frac{G_a}{G_a+G_b}\Bigr) \stackrel{\mathrm{law}}{=} (G_{a+b},\,B_{a,b})\tag{2.12}$$
where on the left hand side $G_a$ and $G_b$ are independent and on the right hand side $B_{a,b}$ and $G_{a+b}$ are independent. The proof is elementary; it can be seen in [11, (4.2.1)], and so we omit it.

Using formula (2.11), we obtain another expression of the Thorin measure of a variable of (GGC) type.

Theorem 2.5 Under the same assumption as in Theorem 2.2, the total mass of the Thorin measure is given by
$$b = \inf\bigl\{c\ge0 : X\stackrel{\mathrm{law}}{=}G_cY_c \text{ for some } Y_c \text{ independent of } G_c\bigr\}.\tag{2.13}$$

Proof. Let us write $b'$ for the right hand side of (2.13).
By Theorem 2.2, we have $X\stackrel{\mathrm{law}}{=}G_bY$ for some random variable $Y$ independent of $G_b$. For any $c>b$, we have $G_b\stackrel{\mathrm{law}}{=}G_cB_{b,c-b}$ where $G_c$ and $B_{b,c-b}$ are independent, which implies that $c\ge b'$ for any such $c$. Hence we obtain $b\ge b'$.

Suppose that $b>b'$. Then we may take $c$ with $b>c>b'$ such that $X\stackrel{\mathrm{law}}{=}G_cZ$ for some random variable $Z$ independent of $G_c$. Then we have another expression of the density $\rho(x)$ as
$$\rho(x) = \frac{1}{\Gamma(c)}\,x^{c-1}\,E\Bigl[\frac{1}{Z^c}\exp\Bigl(-\frac{x}{Z}\Bigr)\Bigr].\tag{2.14}$$
By the monotone convergence theorem, this implies that
$$\lim_{x\to0+}\frac{\rho(x)}{x^{c-1}} = \frac{1}{\Gamma(c)}\,E\Bigl[\frac{1}{Z^c}\Bigr] > 0,\tag{2.15}$$
which shows that $c\ge b$ by formula (2.7). This leads to a contradiction. Therefore we conclude that $b'=b$.

2.3 Beta Variables of the Second Kind

Consider the ratios of two independent gamma variables, which are sometimes called beta variables of the second kind or beta prime variables. By identity (2.11), the following is obvious: For any $a,b>0$,
$$\frac{B_{a,b}}{1-B_{a,b}} \stackrel{\mathrm{law}}{=} \frac{G_a}{G_b}.\tag{2.16}$$
The law of the ratio $G_a/G_b$ is given as follows: For any $a,b>0$,
$$P\Bigl(\frac{G_a}{G_b}\in dx\Bigr) = \frac{1}{B(a,b)}\,\frac{x^{a-1}}{(1+x)^{a+b}}\,dx,\qquad x>0.\tag{2.17}$$
In spite of its simple statement, the following theorem is rather difficult to prove.

Theorem 2.6 (see, e.g., [7, Ex.4.3.1]) For any $a,b>0$, the ratio $G_a/G_b$ is of (GGC) type. Its Thorin measure has total mass $a$.

For the proof, see [7]. We omit the details.
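The beta–gamma algebra (2.12) underlying this subsection is easy to illustrate by simulation; a minimal sketch (NumPy assumed; parameters arbitrary) compares first moments and checks that the sum and the ratio are uncorrelated, as independence requires:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, n = 2.0, 3.0, 10**6
ga, gb = rng.gamma(a, size=n), rng.gamma(b, size=n)
s, r = ga + gb, ga / (ga + gb)

# (2.12): s has the law of G_{a+b}, r has the law of B_{a,b}, and they are independent
print(s.mean(), a + b)            # E[G_{a+b}] = a + b
print(r.mean(), a / (a + b))      # E[B_{a,b}] = a/(a+b)
print(np.corrcoef(s, r)[0, 1])    # approximately 0
```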
2.4 α-Cauchy Variables and Linnik Variables

It is well-known that the standard Cauchy distribution $\frac{1}{\pi}\frac{1}{1+x^2}\,dx$ and the bilateral exponential distribution $\frac{1}{2}e^{-|x|}\,dx$ satisfy the following relation:

(R) The characteristic function of any of these two distributions is proportional to the density of the other.

We shall introduce α-analogues of these two distributions which satisfy the relation (R). Let us introduce the α-analogue for $\alpha>1$ of the standard Cauchy variable $C$, which, as just recalled, is given by
$$P(C\in dx) = \frac{1}{\pi}\,\frac{1}{1+x^2}\,dx,\qquad x\in\mathbb{R}.\tag{2.18}$$
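For the classical pair, relation (R) can be checked numerically: the characteristic function of the Cauchy law is $e^{-|\theta|}$, which is proportional to the bilateral exponential density. A sketch (SciPy assumed; the cosine-weighted quadrature handles the oscillatory integral, and the test value of $\theta$ is arbitrary):

```python
import numpy as np
from scipy.integrate import quad

def cauchy_char(theta):
    # E[e^{i theta C}] for the standard Cauchy law, via cosine-weighted quadrature
    val, _ = quad(lambda x: (1 / np.pi) / (1 + x**2), 0, np.inf,
                  weight='cos', wvar=theta)
    return 2 * val  # the density is even, so the integral over R is twice (0, inf)

theta = 1.5
print(cauchy_char(theta), np.exp(-abs(theta)))  # both equal e^{-|theta|}
```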
Define the α-Cauchy variable $C_\alpha$ as follows:
$$P(C_\alpha\in dx) = \frac{\sin(\pi/\alpha)}{2\pi/\alpha}\,\frac{1}{1+|x|^\alpha}\,dx,\qquad x\in\mathbb{R}.\tag{2.19}$$
Note that $C_2\stackrel{\mathrm{law}}{=}C$. By a change of variables, the following is easy to see: For $\alpha>1$, let $\gamma = 1/\alpha\in(0,1)$. Then it holds that
$$C_\alpha \stackrel{\mathrm{law}}{=} \varepsilon\Bigl(\frac{G_\gamma}{G_{1-\gamma}}\Bigr)^{\gamma}\tag{2.20}$$
where $\varepsilon$ is a Bernoulli variable, $P(\varepsilon=1)=P(\varepsilon=-1)=1/2$, independent of $G_\gamma$ and $G_{1-\gamma}$. In particular,
$$|C_\alpha|^{\alpha} \stackrel{\mathrm{law}}{=} \frac{G_\gamma}{G_{1-\gamma}}.\tag{2.21}$$
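Identity (2.21) lends itself to a quick Monte Carlo check: the probability $P(|C_\alpha|\le1)$ computed from the density (2.19) must agree with $P(G_\gamma/G_{1-\gamma}\le1)$. A sketch (NumPy and SciPy assumed; $\alpha=1.5$ is an arbitrary test value):

```python
import numpy as np
from scipy.integrate import quad

alpha = 1.5
gamma_ = 1 / alpha
rng = np.random.default_rng(7)
n = 10**6
ratio = rng.gamma(gamma_, size=n) / rng.gamma(1 - gamma_, size=n)
# (2.21): |C_alpha|^alpha has the law of G_gamma / G_{1-gamma}
p_mc = (ratio <= 1.0).mean()                     # Monte Carlo P(|C_alpha| <= 1)
dens = lambda x: (np.sin(np.pi / alpha) / (2 * np.pi / alpha)) / (1 + x**alpha)
p_int = 2 * quad(dens, 0, 1)[0]                  # integral of (2.19) over [-1, 1]
print(p_mc, p_int)
```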
Note that the law of a standard Cauchy variable $C_2$ is of (SD) type. Moreover, the following theorem is known:

Theorem 2.7 (Bondesson [6]) For $1<\alpha\le2$, the law of $|C_\alpha|$ is of (ID) type.

It is easy to see that
$$|C_\alpha| \xrightarrow{\ \mathrm{law}\ } U \qquad\text{as }\alpha\to\infty\tag{2.22}$$
where $U$ is a uniform variable on $(0,1)$.

Theorem 2.8 (Thorin [42]) For $p>0$, the law of $U^{-p}$, which is called the Pareto distribution of index $p$, is of (GGC) type.
Remark 2.9 The following problems still remain open:
(i) Is it true that the law of $C_\alpha$ is of (SD) type (or of (ID) type at least)?
(ii) Is it true that the law of $|C_\alpha|$ is of (SD) type?
(iii) Is it true that the law of $|C_\alpha|^{-p}$ for $p>0$ is of (SD) type (or of (ID) type at least)?

Remark 2.10 Bourgade–Fujita–Yor ([8]) have proposed a new probabilistic method of computing special values $\zeta(2n)$ of the Riemann zeta function via the Cauchy variable. Fujita–Y. Yano–Yor [15] have recently generalized their method via the α-Cauchy variables and obtained a probabilistic method for computing special values of the complementary sum of the Hurwitz zeta function: $\zeta(2n,\gamma)+\zeta(2n,1-\gamma)$ for $0<\gamma<1$.

Following [12], we introduce the Linnik variable $\Lambda_\alpha$ of index $0<\alpha\le2$ as follows:
$$E[e^{i\theta\Lambda_\alpha}] = \frac{1}{1+|\theta|^\alpha},\qquad\theta\in\mathbb{R}.\tag{2.23}$$
It is easy to see that
$$\Lambda_\alpha \stackrel{\mathrm{law}}{=} X_\alpha(\mathbf{e})\tag{2.24}$$
where $X_\alpha = (X_\alpha(t): t\ge0)$ is the symmetric stable Lévy process of index $\alpha$ starting from 0 such that
$$P\bigl[e^{i\theta X_\alpha(t)}\bigr] = e^{-t|\theta|^\alpha},\qquad\theta\in\mathbb{R}\tag{2.25}$$
and $\mathbf{e}$ is a standard exponential variable independent of $X_\alpha$. Hence the laws of Linnik variables are of (SD) type. A Lévy process $(\Lambda_\alpha(t))$ with $\Lambda_\alpha(1)\stackrel{\mathrm{law}}{=}\Lambda_\alpha$ is called a Linnik process; its characteristic function is:
$$E\bigl[e^{i\theta\Lambda_\alpha(t)}\bigr] = \frac{1}{(1+|\theta|^\alpha)^t},\qquad\theta\in\mathbb{R}.\tag{2.26}$$
See James [25] for his study of Linnik processes. Note that the law of $\Lambda_\alpha$ has a continuous density $L_\alpha(x)$, i.e.,
$$P(\Lambda_\alpha\in dx) = L_\alpha(x)\,dx.\tag{2.27}$$
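For $\alpha=2$, representation (2.24) is especially concrete: since $X_2(t)\stackrel{\mathrm{law}}{=}\sqrt{2t}\,N$ with $N$ standard normal by (2.25), one has $\Lambda_2\stackrel{\mathrm{law}}{=}\sqrt{2\mathbf{e}}\,N$, whose characteristic function must be $1/(1+\theta^2)$. A Monte Carlo sketch (NumPy assumed; $\theta$ arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10**6
e = rng.exponential(size=n)
lam2 = np.sqrt(2 * e) * rng.standard_normal(n)   # X_2(e) = sqrt(2e) * N(0,1)
theta = 1.3
mc = np.cos(theta * lam2).mean()                 # the law is symmetric, so E[e^{i theta X}] is real
exact = 1 / (1 + theta**2)
print(mc, exact)
```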
Proposition 2.11 Suppose that $1<\alpha<2$. Then the α-Cauchy distribution and the Linnik distribution of index $\alpha$ satisfy the relation (R).

Proof. Note that the identities (2.23) and (2.27) show that
$$\int_{-\infty}^{\infty}e^{i\theta x}L_\alpha(x)\,dx = \frac{1}{1+|\theta|^\alpha},\qquad\theta\in\mathbb{R}.\tag{2.28}$$
By Fourier inversion, we obtain:
$$L_\alpha(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{e^{-ix\theta}}{1+|\theta|^\alpha}\,d\theta,\qquad x\in\mathbb{R}.\tag{2.29}$$
Hence:
$$E[e^{i\theta C_\alpha}] = \frac{\sin(\pi/\alpha)}{2\pi/\alpha}\cdot 2\pi\,L_\alpha(\theta) = \alpha\sin(\pi/\alpha)\,L_\alpha(\theta),\qquad\theta\in\mathbb{R}.\tag{2.30}$$
Now the proof is complete.
2.5 Log-gamma Processes and their Variants

We recall the classes of log-gamma processes, z-processes and Meixner processes. It is well-known (see, e.g., [41]) that the law of the logarithm of a gamma variable $\log G_a$ is of (SD) type. Let us introduce a Lévy process $(\eta_a(t): t\ge0)$ such that
$$\eta_a(1) \stackrel{\mathrm{law}}{=} \log G_a.\tag{2.31}$$
Following Carmona–Petit–Yor [10], we call the process $(\eta_a(t): t\ge0)$ the log-gamma process. Please be careful not to confuse with the convention that log-normal variables stand for exponentials of normal variables. In (2.31), we simply take the logarithm of a gamma variable. The Lévy characteristics of $(\eta_a(t): t\ge0)$ are given as follows.

Theorem 2.12 (see [10] and also [19]) For any $a>0$, the log-gamma process is represented as
$$\eta_a(t) \stackrel{\mathrm{law}}{=} t\,\Gamma'(1) + \sum_{j=0}^{\infty}\Bigl(\frac{t}{j+1}-\frac{\gamma^{(j)}(t)}{j+a}\Bigr)\tag{2.32}$$
where $\gamma^{(0)},\gamma^{(1)},\ldots$ are independent gamma processes. In particular, the Lévy exponent of $(\eta_a(t): t\ge0)$ defined by
$$E\bigl[e^{i\theta\eta_a(t)}\bigr] = \Bigl(\frac{\Gamma(a+i\theta)}{\Gamma(a)}\Bigr)^{t} = e^{t\varphi_a(\theta)}\tag{2.33}$$
admits the representation
$$\varphi_a(\theta) = \log\frac{\Gamma(a+i\theta)}{\Gamma(a)}\tag{2.34}$$
$$= i\theta\psi(a) + \int_{-\infty}^{0}\bigl(e^{i\theta u}-1-i\theta u\bigr)\,\frac{e^{-a|u|}}{|u|(1-e^{-|u|})}\,du\tag{2.35}$$
where $\psi(z) = \Gamma'(z)/\Gamma(z)$ is called the digamma function.
Let $(\eta_a(t): t\ge0)$ and $(\tilde\eta_b(t): t\ge0)$ be independent log-gamma processes. Then the difference $(\eta_a(t)-\tilde\eta_b(t): t\ge0)$ is called a generalized z-process (see [18]). In particular, we have
$$\eta_a(1)-\tilde\eta_b(1) \stackrel{\mathrm{law}}{=} \log\frac{G_a}{G_b}\tag{2.36}$$
and this law is called a z-distribution. Its characteristic function is given by
$$E\Bigl[\exp\Bigl(i\theta\log\frac{G_a}{G_b}\Bigr)\Bigr] = \frac{B(a+i\theta,\,b-i\theta)}{B(a,b)}\tag{2.37}$$
and the law itself is given by
$$P\Bigl(\log\frac{G_a}{G_b}\in dx\Bigr) = \frac{1}{B(a,b)}\,\frac{e^{ax}}{(1+e^{x})^{a+b}}\,dx.\tag{2.38}$$
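Formula (2.37) is easy to test by Monte Carlo, since `scipy.special.gamma` accepts complex arguments; a sketch (parameter values arbitrary):

```python
import numpy as np
from scipy.special import gamma as cgamma  # accepts complex arguments

a, b, theta = 1.2, 2.5, 0.9
rng = np.random.default_rng(5)
n = 10**6
z = np.log(rng.gamma(a, size=n) / rng.gamma(b, size=n))
mc = np.exp(1j * theta * z).mean()
beta = lambda u, v: cgamma(u) * cgamma(v) / cgamma(u + v)
exact = beta(a + 1j * theta, b - 1j * theta) / beta(a, b)   # (2.37)
print(abs(mc - exact))  # small Monte Carlo error
```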
2.6 Symmetric z-Processes

We now consider a particular case of symmetric z-processes, i.e.,
$$\sigma_a(t) = \frac{1}{\pi}\{\eta_a(t)-\tilde\eta_a(t)\},\qquad t\ge0.\tag{2.39}$$
We introduce a subordinator given by
$$\Sigma_a(t) = \frac{2}{\pi^2}\sum_{j=0}^{\infty}\frac{\gamma^{(j)}(t)}{(j+a)^2}\tag{2.40}$$
where $\gamma^{(0)},\gamma^{(1)},\ldots$ are independent gamma processes. The Lévy measure of $(\Sigma_a(t): t\ge0)$ may be obtained from the following:
$$E\bigl[e^{-\lambda\Sigma_a(t)}\bigr] = E\Bigl[\exp\Bigl(-\lambda\,\frac{2}{\pi^2}\sum_{j=0}^{\infty}\frac{\gamma^{(j)}(t)}{(j+a)^2}\Bigr)\Bigr]\tag{2.41}$$
$$= \prod_{j=0}^{\infty}\exp\Bigl(-t\int_0^\infty\Bigl(1-\exp\Bigl(-\frac{2\lambda u}{\pi^2(j+a)^2}\Bigr)\Bigr)\,\frac{e^{-u}}{u}\,du\Bigr)\tag{2.42}$$
$$= \exp\Bigl(-t\int_0^\infty\bigl(1-e^{-\lambda u}\bigr)\sum_{j=0}^{\infty}\exp\Bigl(-\frac{\pi^2(j+a)^2u}{2}\Bigr)\,\frac{du}{u}\Bigr)\tag{2.43}$$
$$= \exp\Bigl(-t\int_0^\infty\bigl(1-e^{-\lambda u}\bigr)\,n(u)\,du\Bigr)\tag{2.44}$$
where
$$n(u) = \frac{1}{u}\sum_{j=0}^{\infty}\exp\Bigl(-\frac{\pi^2(j+a)^2u}{2}\Bigr) = \frac{1}{u}\int_{(0,\infty)}e^{-ut}\,U(dt)\tag{2.45}$$
with
$$U = \sum_{j=0}^{\infty}\delta_{\pi^2(j+a)^2/2}.\tag{2.46}$$
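By (2.41)–(2.42), $E[e^{-\lambda\Sigma_a(1)}] = \prod_j\bigl(1+2\lambda/(\pi^2(j+a)^2)\bigr)^{-1}$. For $a=1/2$ and $\lambda=\theta^2/2$ this product equals $1/\cosh\theta$, by the classical infinite product for the hyperbolic cosine; a numerical sketch (the truncation level is arbitrary):

```python
import numpy as np

theta = 1.0
a = 0.5
lam = theta**2 / 2
j = np.arange(200_000)
# E[e^{-lam Sigma_a(1)}] = prod_j (1 + 2 lam / (pi^2 (j+a)^2))^{-1}
prod = np.prod(1 / (1 + 2 * lam / (np.pi**2 * (j + a) ** 2)))
print(prod, 1 / np.cosh(theta))  # truncated product vs. closed form
```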
Hence we conclude that the law of $\Sigma_a(t)$ for fixed $t$ is of (GGC) type.

The following theorem, due to Barndorff-Nielsen–Kent–Sørensen [1], connects the two Lévy processes $\Sigma_a$ and $\sigma_a$:

Theorem 2.13 ([1]; see, e.g., [10]) The process $(\sigma_a(t): t\ge0)$ may be obtained as the subordination of a Brownian motion $(B(u): u\ge0)$ with respect to the subordinator $(\Sigma_a(t): t\ge0)$:
$$\sigma_a(t) \stackrel{\mathrm{law}}{=} B(\Sigma_a(t)),\qquad t\ge0.\tag{2.47}$$
Proof. By (2.32), the process $(\sigma_a(t): t\ge0)$ is represented as
$$\sigma_a(t) = \frac{1}{\pi}\sum_{j=0}^{\infty}\frac{\gamma^{(j)}(t)-\tilde\gamma^{(j)}(t)}{j+a}\tag{2.48}$$
where $\gamma^{(0)},\gamma^{(1)},\ldots,\tilde\gamma^{(0)},\tilde\gamma^{(1)},\ldots$ are independent gamma processes. Note that
$$\gamma^{(j)}(t)-\tilde\gamma^{(j)}(t) \stackrel{\mathrm{law}}{=} \Lambda_2(t) \stackrel{\mathrm{law}}{=} \sqrt2\,B(\gamma(t))\tag{2.49}$$
where $(\Lambda_2(t): t\ge0)$ is a Linnik process, i.e., a Lévy process such that $\Lambda_2(1)\stackrel{\mathrm{law}}{=}\Lambda_2$. Now we obtain
$$B(\Sigma_a(t)) \stackrel{\mathrm{law}}{=} \frac{\sqrt2}{\pi}\sum_{j=0}^{\infty}\frac{\tilde B_j(\gamma^{(j)}(t))}{j+a}\tag{2.50}$$
$$\stackrel{\mathrm{law}}{=} \frac{1}{\pi}\sum_{j=0}^{\infty}\frac{\gamma^{(j)}(t)-\tilde\gamma^{(j)}(t)}{j+a}\tag{2.51}$$
$$\stackrel{\mathrm{law}}{=} \sigma_a(t)\tag{2.52}$$
where on the right hand side of (2.50) the $\tilde B_j$'s are independent of the $\gamma^{(j)}$'s. The proof is complete.

The characteristic function of $\sigma_a(t)$ is given by
$$E[e^{i\theta\sigma_a(t)}] = E\Bigl[\exp\Bigl(i\,\frac{\theta}{\pi}\log\frac{G_a}{\tilde G_a}\Bigr)\Bigr]^{t} = e^{-t\Phi_a(\theta)}\tag{2.53}$$
where
$$\Phi_a(\theta) = \int_{-\infty}^{\infty}\bigl(1-e^{i\theta u}\bigr)\,\frac{e^{-a\pi|u|}}{|u|(1-e^{-\pi|u|})}\,du\tag{2.54}$$
$$= 2\int_0^{\infty}(1-\cos\theta u)\,\frac{e^{-a\pi u}}{u(1-e^{-\pi u})}\,du.\tag{2.55}$$
For $t=1$, the law of $\sigma_a(1)$ is given by
$$P(\sigma_a(1)\in dx) = P\Bigl(\frac{1}{\pi}\log\frac{G_a}{\tilde G_a}\in dx\Bigr) = \frac{\pi}{B(a,a)}\,\frac{e^{a\pi x}}{(1+e^{\pi x})^{2a}}\,dx.\tag{2.56}$$
Example 2.14 When $a=1/2$, put
$$\hat C_t := \Sigma_{1/2}(t),\qquad C_t := B(\hat C_t) = \sigma_{1/2}(t).\tag{2.57}$$
The law of $C_1$ is called the hyperbolic cosine distribution:
$$E[e^{i\theta C_1}] = E\bigl[e^{-\frac12\theta^2\hat C_1}\bigr] = \frac{1}{\cosh\theta},\qquad P(C_1\in dx) = \frac{1}{\cosh\pi x}\,dx.\tag{2.58}$$
Consequently, $C_1$ and $\pi C_1$ satisfy the relation (R).

Example 2.15 When $a=1$, put
$$\hat S_t := \Sigma_1(t),\qquad S_t := B(\hat S_t) = \sigma_1(t).\tag{2.59}$$
The law of $S_1$ is called the logistic distribution:
$$E[e^{i\theta S_1}] = E\bigl[e^{-\frac12\theta^2\hat S_1}\bigr] = \frac{\theta}{\sinh\theta},\qquad P(S_1\in dx) = \frac{\pi}{(\cosh\pi x)^2}\,dx.\tag{2.60}$$
Consequently, $S_1$ and $\pi C_2$ satisfy the relation (R).

Let us introduce a subordinator $(\hat T_t)$ and then $(T_t)$ such that
$$T_t = B(\hat T_t)\tag{2.61}$$
and that
$$E[e^{i\theta T_t}] = E\bigl[e^{-\frac12\theta^2\hat T_t}\bigr] = \Bigl(\frac{\tanh\theta}{\theta}\Bigr)^{t}.\tag{2.62}$$
It is well-known that the law of $\hat T_1$ is of (ID) type and hence that such processes exist. Now it is obvious that
$$C_t \stackrel{\mathrm{law}}{=} T_t+S_t,\qquad \hat C_t \stackrel{\mathrm{law}}{=} \hat T_t+\hat S_t\tag{2.63}$$
where $(T_t)$ and $(S_t)$ are independent and so are $(\hat T_t)$ and $(\hat S_t)$. For further study of these processes $C_t$, $S_t$ and $T_t$, see Pitman–Yor [32]. By taking Laplace inversion, the density of the law of $\hat T_1$ can be obtained in terms of the theta function; see Knight [28, Cor.2.1] for details.
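At the level of the transforms (2.58), (2.60) and (2.62), the factorization (2.63) amounts to the elementary identity $\frac{\tanh\theta}{\theta}\cdot\frac{\theta}{\sinh\theta} = \frac{1}{\cosh\theta}$, which a one-line numerical sketch confirms:

```python
import numpy as np

theta = np.linspace(0.1, 5, 50)
lhs = 1 / np.cosh(theta)                                    # transform of C_1, cf. (2.58)
rhs = (np.tanh(theta) / theta) * (theta / np.sinh(theta))   # product of (2.62) and (2.60)
print(np.max(np.abs(lhs - rhs)))  # identically 0 up to rounding
```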
2.7 Meixner Processes

Let $\beta\in(-\pi,\pi)$ and let $(M_\beta(t): t\ge0)$ be a Lévy process such that
$$M_\beta(1) \stackrel{\mathrm{law}}{=} \frac{1}{2\pi}\log\frac{G_a}{G_{1-a}}\qquad\text{where }\beta = (2a-1)\pi.\tag{2.64}$$
The law of $M_\beta(t)$ for fixed $t$ is called a Meixner distribution because of its close relation to Meixner–Pollaczek polynomials (see [38], [39], [40] and [17]). The characteristic function of $M_\beta(t)$ is given by
$$E\bigl[e^{i\theta M_\beta(t)}\bigr] = \left(\frac{\cos\frac{\beta}{2}}{\cosh\frac{\theta-i\beta}{2}}\right)^{t} = e^{t\xi_\beta(\theta)}\tag{2.65}$$
where
$$\xi_\beta(\theta) = \frac{i\theta}{2\pi}\Bigl\{\psi\Bigl(\frac{\pi+\beta}{2\pi}\Bigr)-\psi\Bigl(\frac{\pi-\beta}{2\pi}\Bigr)\Bigr\} + \int_{-\infty}^{\infty}\bigl(e^{i\theta u}-1-i\theta u\bigr)\,\frac{e^{\beta u}}{u\sinh(\pi u)}\,du.\tag{2.66}$$
The law of $M_\beta(t)$ itself is given by
$$P(M_\beta(t)\in dx) = \frac{\bigl(2\cos\frac{\beta}{2}\bigr)^{t}}{2\pi\Gamma(t)}\,e^{\beta x}\,\Bigl|\Gamma\Bigl(\frac{t}{2}+ix\Bigr)\Bigr|^{2}\,dx\tag{2.67}$$
$$= \frac{\bigl(2\cos\frac{\beta}{2}\bigr)^{t}\,\Gamma(t/2)^2}{2\pi\Gamma(t)}\,e^{\beta x}\,e^{-\Phi_{t/2}(x)}\,dx\tag{2.68}$$
$$= \frac{\bigl(2\cos\frac{\beta}{2}\bigr)^{t}\,B(t/2,t/2)}{2\pi}\,e^{\beta x-\Phi_{t/2}(x)}\,dx\tag{2.69}$$
where $\Phi_a(x)$ has been defined in (2.55). We simply write $M_\beta$ for $M_\beta(1)$. Remark that this Meixner distribution is identical to that of the log of an α-Cauchy variable:
$$M_\beta \stackrel{\mathrm{law}}{=} \frac{\alpha}{2\pi}\log|C_\alpha|\qquad\text{where }\beta = \Bigl(\frac{2}{\alpha}-1\Bigr)\pi.\tag{2.70}$$
Remark also that the law of $M_\beta$ is symmetric if and only if $\beta=0$ (or $a=1/2$). Then the corresponding Meixner distribution is identical to the hyperbolic cosine distribution, up to the factor $1/2$; precisely:
$$M_0 \stackrel{\mathrm{law}}{=} \frac{1}{\pi}\log|C| \stackrel{\mathrm{law}}{=} \frac{1}{2\pi}\log\frac{G_{1/2}}{\tilde G_{1/2}} \stackrel{\mathrm{law}}{=} \frac{1}{2}\,C_1.\tag{2.71}$$
2.8 α-Rayleigh Distributions

For an exponential variable $\mathbf{e}$, the random variable
$$R = \sqrt{2\mathbf{e}}\tag{2.72}$$
is sometimes called a Rayleigh variable. We shall introduce an α-analogue of the Rayleigh variable. Let $0<\alpha\le2$ and let $t>0$. By Fourier inversion, we obtain from (2.25) that
$$P(X_\alpha(t)\in dx) = p_t^{(\alpha)}(x)\,dx\tag{2.73}$$
where
$$p_t^{(\alpha)}(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-ix\xi}e^{-t|\xi|^\alpha}\,d\xi = \frac{1}{\pi}\int_0^{\infty}\cos(x\xi)\,e^{-t\xi^\alpha}\,d\xi.\tag{2.74}$$
Note that
$$p_1^{(\alpha)}(0) = \frac{\Gamma(1/\alpha)}{\alpha\pi}.\tag{2.75}$$
Lemma 2.16 Let $0<\alpha\le2$. Then there exists a non-negative random variable $R_\alpha$ such that
$$P(R_\alpha>x) = \frac{p_1^{(\alpha)}(x)}{p_1^{(\alpha)}(0)},\qquad x>0.\tag{2.76}$$
In particular, $R_2 = \sqrt2\,R = 2\sqrt{\mathbf{e}}$.

We call $R_\alpha$ an α-Rayleigh variable and its law the α-Rayleigh distribution. For the proof of Lemma 2.16, we introduce some notations. For $0<\alpha<1$, denote by $T_\alpha$ the unilateral α-stable distribution:
$$E\bigl[e^{-\lambda T_\alpha}\bigr] = e^{-\lambda^\alpha},\qquad\lambda\ge0.\tag{2.77}$$
We denote by $\hat T_\alpha$ the $h$-size biased variable of $T_\alpha$ with respect to $h(x)=x^{-1/2}$:
$$E[f(\hat T_\alpha)] = \frac{E\bigl[(T_\alpha)^{-1/2}f(T_\alpha)\bigr]}{E\bigl[(T_\alpha)^{-1/2}\bigr]}\tag{2.78}$$
for any non-negative Borel function $f$. The following lemma proves Lemma 2.16.

Lemma 2.17 Suppose $0<\alpha<2$. Then the variable $R_\alpha$ is given by
$$R_\alpha = 2\sqrt{\mathbf{e}\hat T_{\alpha/2}}\tag{2.79}$$
where the variables $\mathbf{e}$ and $\hat T_{\alpha/2}$ are independent.
Proof of Lemma 2.17. Since we have
$$X_\alpha(1) \stackrel{\mathrm{law}}{=} \sqrt2\,B(T_{\alpha/2}),\tag{2.80}$$
we obtain the following expression:
$$p_1^{(\alpha)}(x) = E\Bigl[\frac{1}{2\sqrt{\pi T_{\alpha/2}}}\exp\Bigl(-\frac{x^2}{4T_{\alpha/2}}\Bigr)\Bigr].\tag{2.81}$$
Hence we obtain
$$\frac{p_1^{(\alpha)}(x)}{p_1^{(\alpha)}(0)} = E\Bigl[\exp\Bigl(-\frac{x^2}{4\hat T_{\alpha/2}}\Bigr)\Bigr].\tag{2.82}$$
Using an independent exponential variable $\mathbf{e}$, we have
$$\frac{p_1^{(\alpha)}(x)}{p_1^{(\alpha)}(0)} = E\Bigl[P\Bigl(\mathbf{e}>\frac{x^2}{4\hat T_{\alpha/2}}\Bigr)\Bigr] = P\Bigl(2\sqrt{\mathbf{e}\hat T_{\alpha/2}}>x\Bigr).\tag{2.83}$$
Now the proof is complete.

Remark 2.18 Lemmas 2.16 and 2.17 can also be found in Cordero [50, §1.2.2].
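For $\alpha=2$, Lemma 2.16 can be checked directly, since then $p_1^{(2)}(x)/p_1^{(2)}(0) = e^{-x^2/4}$, the tail of $R_2 = 2\sqrt{\mathbf{e}}$. A sketch (SciPy assumed; the evaluation point $x$ is arbitrary):

```python
import numpy as np
from scipy.integrate import quad

def p1(x, alpha=2.0):
    # p_1^{(alpha)}(x) from (2.74)
    val, _ = quad(lambda xi: np.cos(x * xi) * np.exp(-xi**alpha), 0, np.inf)
    return val / np.pi

x = 1.7
ratio = p1(x) / p1(0.0)          # P(R_alpha > x) by Lemma 2.16
print(ratio, np.exp(-x**2 / 4))  # for alpha = 2 this is e^{-x^2/4}
```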
3 Discussions from an Excursion-Theoretic Viewpoint

Recall Itô's excursion theory ([23] and [30]). See also the standard textbooks [22] and [35], as well as [33].

3.1 Itô's Measure of Excursions away from the Origin

We simply write $\mathcal{D}$ for the space $D([0,\infty);\mathbb{R})$ of càdlàg paths equipped with the Skorokhod topology. Let $X = (X(t): t\ge0)$ be a strong Markov process with paths in $\mathcal{D}$ starting from 0. Suppose that the origin is regular, recurrent and an instantaneous state. Then it is well-known (see [4, Thm.V.3.13]) that there exists a local time at the origin, which we denote by $L = (L(t): t\ge0)$, subject to the normalization:
$$E\Bigl[\int_0^\infty e^{-t}\,dL(t)\Bigr] = 1.\tag{3.1}$$
This is a choice made in this section; but later, we may make another choice, which will be indicated as $L^{(\alpha)}(t)$, $L(t)$ being always subject to (3.1). The local time process $L = (L(t): t\ge0)$ is continuous and non-decreasing almost surely. Thus its right-continuous inverse process
$$\tau(l) = \inf\{t>0 : L(t)>l\}\tag{3.2}$$
is strictly increasing. By the strong Markov property of $X$, we see that $\tau(l)$ is a subordinator. Call $D$ the (random) set of discontinuities of $\tau$:
$$D = \{l>0 : \tau(l)-\tau(l-)>0\}.\tag{3.3}$$
It is obvious that $D$ is a countable set. Now we define a point function $p$ on $D$ which takes values in $\mathcal{D}$ as follows: For $l\in D$,
$$p(l)(t) = \begin{cases} X(t+\tau(l-)) & \text{if } 0\le t<\tau(l)-\tau(l-),\\ 0 & \text{otherwise.}\end{cases}\tag{3.4}$$
We call $p = (p(l): l\in D)$ the excursion point process. Then the fundamental theorem of Itô's excursion theory is stated as follows.

Theorem 3.1 (Itô [23]; see also Meyer [30]) The excursion point process $p$ is a Poisson point process, i.e.:
(i) $p$ is σ-discrete almost surely, i.e., for almost every sample path, there exists a sequence $\{U_n\}$ of disjoint measurable subsets of $\mathcal{D}$ such that $\mathcal{D} = \cup_nU_n$ and $\{l\in D : p(l)\in U_n\}$ is a finite set for all $n$;
(ii) $p$ is renewal, i.e., $p(\cdot\wedge s)$ and $p(\cdot+s)$ are independent for each $s>0$.

For a measurable subset $U$ of $\mathcal{D}$, we define a point process $p_U : D_U\to\mathcal{D}$ as
$$D_U = \{l\in D : p(l)\in U\}\qquad\text{and}\qquad p_U = p|_{D_U}.\tag{3.5}$$
We call $p_U = (p_U(l): l\in D_U)$ the restriction of $p$ on $U$. The measure on $\mathcal{D}$ defined by
$$n(U) = E\bigl[\#\bigl((0,1]\cap D_U\bigr)\bigr]\tag{3.6}$$
is called Itô's measure of excursions.

Corollary 3.2 (Itô [23]) The following statements hold:
(i) Let $\{U_n\}$ be a sequence of disjoint measurable subsets of $\mathcal{D}$. Then the point processes $\{p_{U_n}\}$ are independent;
(ii) Let $U$ be a measurable subset of $\mathcal{D}$ such that $n(U)<\infty$. Then $(0,l]\cap D_U$ is a finite set for all $l>0$ a.s. Set
$$D_U = \{0<\kappa_1<\kappa_2<\cdots\},\qquad p_U(\kappa_n) = u_n,\ n=1,2,\ldots.\tag{3.7}$$
Then:
(ii-a) $\{\kappa_n-\kappa_{n-1},\,u_n : n=1,2,\ldots\}$ are independent where $\kappa_0=0$;
(ii-b) For each $n$, $\kappa_n-\kappa_{n-1}$ is exponentially distributed with mean $1/n(U)$, i.e., $P(\kappa_n-\kappa_{n-1}>l) = e^{-l\,n(U)}$ for $l>0$;
(ii-c) For each $n$, $P(u_n\in\cdot) = n(\cdot\cap U)/n(U)$;
(iii) Let $F(l,u)$ be a non-negative measurable functional on $(0,\infty)\times\mathcal{D}$. Then
$$E\Bigl[\exp\Bigl(-\sum_{l\in D}F(l,p(l))\Bigr)\Bigr] = \exp\Bigl(-\iint\bigl(1-e^{-F(l,u)}\bigr)\,dl\otimes n(du)\Bigr);\tag{3.8}$$
(iv) Let $F(t,u)$ be a non-negative measurable functional on $(0,\infty)\times\mathcal{D}$. Then
$$E\Bigl[\sum_{l\in D}F(\tau(l-),p(l))\Bigr] = \iint E[F(\tau(l),u)]\,dl\otimes n(du).\tag{3.9}$$
The proofs of Theorem 3.1 and Corollary 3.2 are also found in [22] and [35]. For $u\in\mathcal{D}$, define
$$\zeta(u) = \sup\{t\ge0 : u(t)\ne0\}.\tag{3.10}$$
For each excursion path $p(l)$, $l\in D$, $\zeta(p(l))$ is finite and called the lifetime of the path $p(l)$. For a measurable subset $U$ of $\mathcal{D}$, we set
$$\tau_U(l) = \sum_{k\in(0,l]\cap D_U}\zeta(p_U(k)).\tag{3.11}$$
Note that $\tau_D(l) = \tau(l)$, $l\ge0$. By Corollary 3.2 (iii), we see that the process $(\tau_U(l): l\ge0)$ is a subordinator with Laplace transform $E[e^{-\lambda\tau_U(l)}] = e^{-l\psi_U(\lambda)}$ given by
$$\psi_U(\lambda) = n\bigl(1-e^{-\lambda\zeta};\,U\bigr),\qquad\lambda>0.\tag{3.12}$$
Since $\psi(\lambda) := \psi_D(\lambda)<\infty$, we have $n(\zeta>t)<\infty$ for all $t>0$; in particular, we see that the measure $n$ is σ-finite.

3.2 Decomposition of the First Hitting Time Before and After the Last Exit Time

We denote the first hitting time of a closed set $F$ for $X$ by
$$T_F(X) = \inf\{t>0 : X(t)\in F\}.\tag{3.13}$$
In particular, if $F = \{a\}$, the closed set consisting of a single point $a\in\mathbb{R}$, $T_F(X)$ is nothing but the first hitting time of the point $a\in\mathbb{R}$ for $X$:
$$T_{\{a\}}(X) = \inf\{t>0 : X(t)=a\}.\tag{3.14}$$
The hitting time $T_{\{a\}}(X)$ may be decomposed at the last exit time from 0:
$$T_{\{a\}}(X) = G_{\{a\}}(X)+\Xi_{\{a\}}(X)\tag{3.15}$$
where $G_{\{a\}}(X)$ is the last exit time from 0 before $T_{\{a\}}(X)$, and where $\Xi_{\{a\}}(X)$ is the remaining time after $G_{\{a\}}(X)$, i.e.,
$$G_{\{a\}}(X) = \sup\{t\le T_{\{a\}}(X) : X(t)=0\}\tag{3.16}$$
and
$$\Xi_{\{a\}}(X) = T_{\{a\}}(X)-G_{\{a\}}(X).\tag{3.17}$$
The joint law of the random times $G_{\{a\}}(X)$ and $\Xi_{\{a\}}(X)$ is characterised by the following proposition:

Proposition 3.3 Let $a\ne0$. Then the random times $G_{\{a\}}(X)$ and $\Xi_{\{a\}}(X)$ are independent. Moreover, the law of $G_{\{a\}}(X)$ is of (ID) type. The Laplace transforms of $G_{\{a\}}(X)$ and $\Xi_{\{a\}}(X)$ are given as
$$E\bigl[e^{-\lambda G_{\{a\}}(X)}\bigr] = \Bigl(1+\frac{n\bigl(1-e^{-\lambda\zeta};\,T_{\{a\}}>\zeta\bigr)}{n(T_{\{a\}}<\zeta)}\Bigr)^{-1}\tag{3.18}$$
and
$$E\bigl[e^{-\lambda\Xi_{\{a\}}(X)}\bigr] = \frac{n\bigl(e^{-\lambda T_{\{a\}}};\,T_{\{a\}}<\zeta\bigr)}{n(T_{\{a\}}<\zeta)}.\tag{3.19}$$
Consequently, the Laplace transform of $T_{\{a\}}(X)$ is given as
$$E\bigl[e^{-\lambda T_{\{a\}}(X)}\bigr] = \frac{n\bigl(e^{-\lambda T_{\{a\}}};\,T_{\{a\}}<\zeta\bigr)}{n\bigl(1-e^{-\lambda\zeta}\cdot1_{\{T_{\{a\}}>\zeta\}}\bigr)}.\tag{3.20}$$
Proof. Set
$$U_a = \bigl\{u\in\mathcal{D} : T_{\{a\}}(u)<\infty\bigr\}.\tag{3.21}$$
By Corollary 3.2 (i), we see that $p_{U_a^c}$ and $p_{U_a}$ are independent. We remark that $n(U_a)<\infty$; in fact, if we supposed otherwise, then there would exist a sequence $\{t_n\}$ such that $t_n\to0$ decreasingly and that $X(t_n)=a$, which contradicts $X(0+)=X(0)=0$. Set
$$\kappa_a = \inf\{l>0 : p(l)\in U_a\}.\tag{3.22}$$
Then, by Corollary 3.2 (ii), we see that $\kappa_a$ and $p(\kappa_a)$ are independent. Since $\kappa_a = \inf D_{U_a}$ and $p(\kappa_a) = p_{U_a}(\kappa_a)$, they are measurable with respect to the σ-field generated by $p_{U_a}$. Hence we see that $\{p_{U_a^c},\kappa_a,p(\kappa_a)\}$ are independent. Note that
$$G_{\{a\}}(X) = \tau_{U_a^c}(\kappa_a)\qquad\text{and}\qquad \Xi_{\{a\}}(X) = T_{\{a\}}(p(\kappa_a)).\tag{3.23}$$
Thus we conclude that $G_{\{a\}}(X)$ and $\Xi_{\{a\}}(X)$ are independent. Moreover, we see that the law of $G_{\{a\}}(X)$ is of (ID) type; in fact, $\tau_{U_a^c}$ is a subordinator with Laplace exponent $\psi_{U_a^c}(\lambda)$ and $\kappa_a$ is an exponential variable with mean $1/n(U_a)$ independent of $\tau_{U_a^c}$. The law of $\Xi_{\{a\}}(X)$ is given by
$$P(\Xi_{\{a\}}(X)\in\cdot) = \frac{n\bigl(u\in\mathcal{D} : T_{\{a\}}(u)\in\cdot\bigr)}{n(U_a)}.\tag{3.24}$$
Now the proof is complete.
3.3 Excursion Durations

Consider the excursion straddling $t$. For a general study in the setup of linear diffusions, see [37]. Define the last exit time from 0 before $t$ and the first hitting time of the point 0 after $t$ as follows:
$$G_t(X) = \sup\{s\le t : X(s)=0\},\qquad D_t(X) = \inf\{s>t : X(s)=0\}.\tag{3.25}$$
Define
$$\Xi_t(X) = t-G_t(X),\qquad \Delta_t(X) = D_t(X)-G_t(X).\tag{3.26}$$
Recall (see (3.1)) that $L = (L(t): t\ge0)$ denotes the local time at 0 of $X$ and $\tau = (\tau(l): l\ge0)$ its right-continuous inverse. Then we have
$$G_t(X) = \tau(L(t)-),\qquad D_t(X) = \tau(L(t))\tag{3.27}$$
and
$$\Xi_t(X) = t-\tau(L(t)-),\qquad \Delta_t(X) = \tau(L(t))-\tau(L(t)-).\tag{3.28}$$
If the local time process $L$ has the self-similarity property with index $\gamma$:
$$\Bigl(\frac{L(ct)}{c^\gamma} : t\ge0\Bigr) \stackrel{\mathrm{law}}{=} (L(t): t\ge0),\qquad c>0,\tag{3.29}$$
then we have
$$\Bigl\{\Bigl(\frac{L(ct)}{c^\gamma} : t\ge0\Bigr),\ \Bigl(\frac{\tau(c^\gamma l)}{c} : l\ge0\Bigr)\Bigr\} \stackrel{\mathrm{law}}{=} \bigl\{(L(t): t\ge0),\,(\tau(l): l\ge0)\bigr\}\tag{3.30}$$
for any $c>0$; in particular, $\tau$ is a stable subordinator of index $\gamma$. Hence the index $\gamma$ must be in $(0,1)$. We now state two explicit results, the proofs of which are postponed until after some comments about these results.
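For one-dimensional Brownian motion the index is $\gamma=1/2$, and the fraction of time elapsed since the last zero follows the classical arcsine law, with mean $1/2$. A small random-walk simulation illustrates this (a sketch; the step and path counts are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n_steps, n_paths = 10_000, 2_000
steps = rng.choice([-1, 1], size=(n_paths, n_steps))
paths = np.cumsum(steps, axis=1)
# fraction of time after the last zero, a discrete proxy for Xi_1(X)
last_zero = np.zeros(n_paths)
for i in range(n_paths):
    zeros = np.nonzero(paths[i] == 0)[0]
    last_zero[i] = (zeros[-1] + 1) / n_steps if zeros.size else 0.0
xi1 = 1 - last_zero
print(xi1.mean())  # the arcsine law B_{1/2,1/2} has mean 1/2
```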
Theorem 3.4 Suppose that the local time process has the self-similarity property of index $0<\gamma<1$. Then
$$(\Xi_1(X),\,\Delta_1(X)) \stackrel{\mathrm{law}}{=} \Bigl(B_{1-\gamma,\gamma},\ \frac{B_{1-\gamma,\gamma}}{U^{1/\gamma}}\Bigr)\tag{3.31}$$
where $B_{1-\gamma,\gamma}$ is a beta variable of index $(1-\gamma,\gamma)$ and $U$ is an independent uniform variable on $(0,1)$.

The following is a special case of Winkel [45, Cor.1]:

Theorem 3.5 ([45]) Suppose that the local time process has the self-similarity property of index $0<\gamma<1$. Let $\mathbf{e}$ be an independent exponential time. Then
$$(G_{\mathbf{e}}(X),\,\Xi_{\mathbf{e}}(X),\,\Delta_{\mathbf{e}}(X)) \stackrel{\mathrm{law}}{=} \Bigl(G_\gamma,\ G_{1-\gamma},\ \frac{G_{1-\gamma}}{U^{1/\gamma}}\Bigr)\tag{3.32}$$
where $G_\gamma$ and $G_{1-\gamma}$, respectively, are independent gamma variables of indices $\gamma$ and $1-\gamma$, respectively, and $U$ is an independent uniform variable.

Generalizing a self-decomposability result of Bondesson (see [7, Ex.5.6.3]), Bertoin–Fujita–Roynette–Yor [3, Thm.1.1] and Roynette–Vallois–Yor [36, Thm.5] have recently proved the following:

Theorem 3.6 ([3] and [36]) For any $\gamma\in(0,1)$, the laws of
$$\frac{G_{1-\gamma}}{U^{1/\gamma}}\qquad\text{and}\qquad G_{1-\gamma}\Bigl(\frac{1}{U^{1/\gamma}}-1\Bigr)\tag{3.33}$$
are both of (GGC) type with their Thorin measures having total mass $1-\gamma$. Here $G_{1-\gamma}$ is a gamma variable of index $1-\gamma$ and $U$ is an independent uniform variable.

Example 3.7 For a symmetric stable Lévy process of index $\alpha$, it is well-known (see Kesten [27] and Bretagnolle [9]) that the origin is regular for itself if and only if $1<\alpha\le2$. Let $X_\alpha = (X_\alpha(t): t\ge0)$ be the symmetric stable Lévy process of index $1<\alpha\le2$. Then its local time process is given as
$$L(t) = \lim_{\varepsilon\to0+}\frac{C}{2\varepsilon}\int_0^t 1_{\{|X_\alpha(s)|<\varepsilon\}}\,ds\tag{3.34}$$
for some constant $C$. Since $X$ satisfies the self-similarity property with index $1/\alpha$, so does $L$ with index $1-1/\alpha$, and hence Theorems 3.4 and 3.5 hold with $\gamma = 1-1/\alpha$.

Example 3.8 For a Bessel process of dimension $d$, it is well-known that the origin is regular for itself if and only if $0<d<2$. Let $X = (X(t): t\ge0)$ be a reflecting Bessel process starting from 0 of dimension $d = 2-2\alpha$, $0<d<2$
(or $0<\alpha<1$) which is scaled so that it has natural scale and speed measure $m(0,x) = x^{\frac{1}{\alpha}-1}$. Then its local time process is given as
$$L(t) = \lim_{\varepsilon\to0+}\frac{C}{m(0,\varepsilon)}\int_0^t 1_{\{|X(s)|<\varepsilon\}}\,ds\tag{3.35}$$
for some constant $C$. Since $X$ satisfies the self-similarity property with index $\alpha$, so does $L$ with the same index $\alpha$, and hence Theorems 3.4 and 3.5 hold with $\gamma=\alpha$. For the relations among several choices in the literature, see [13].

Example 3.9 Let $\alpha>0$ and $0<\beta<\min\{1,1/\alpha\}$ and consider the process $X = X_{m^{(\alpha)},j^{(\beta)},0,0}$ given in [47, Ex.2.4.(b)]. Then $X$ satisfies the self-similarity property with index $\alpha$, but this property seems to have nothing to do with the local time $L$. Since its inverse local time process $\tau = \eta_{m^{(\alpha)},j^{(\beta)},0,0}$ satisfies the self-similarity property with index $1/(\alpha\beta)$, so does $L$ with index $\alpha\beta$, and hence Theorems 3.4 and 3.5 hold with $\gamma = \alpha\beta$.

Remark 3.10 The identities in law $\Xi_1(X_\alpha)\stackrel{\mathrm{law}}{=}B_{\gamma,1-\gamma}$ and $D_1(X_\alpha)-1\stackrel{\mathrm{law}}{=}G_\gamma/G_{1-\gamma}$ are found in Feller [14, XIV.3] as the long-time limit laws of similar random variables derived from random walks.

Let us prove Theorems 3.4 and 3.5 for completeness of this paper.

Proof of Theorem 3.4. Since $\tau(c^\gamma l)\stackrel{\mathrm{law}}{=}c\,\tau(l)$ for $c,l>0$, we have $\psi(c\lambda) = c^\gamma\psi(\lambda)$. Hence we obtain
$$n(\zeta\in dt) = C\,\frac{dt}{t^{\gamma+1}}\tag{3.36}$$
for some constant $C$. For $t>0$, the excursion straddling time $t$ is $p(L(t))$. Hence we have
$$G_t = \tau(L(t)-),\qquad \Xi_t = t-\tau(L(t)-),\qquad \Delta_t = \zeta(p(L(t))).\tag{3.37}$$
Let $p,q,r$ be positive constants. Then
$$E\Bigl[\int_0^\infty e^{-pt-q\Xi_t-r\Delta_t}\,dt\Bigr] = E\Bigl[\sum_{l\in D}\int_{\tau(l-)}^{\tau(l)}e^{-pt-q\Xi_t-r\Delta_t}\,dt\Bigr]\tag{3.38}$$
$$= E\Bigl[\sum_{l\in D}e^{-p\tau(l-)-r\zeta(p(l))}\int_0^{\zeta(p(l))}e^{-pt-qt}\,dt\Bigr]\tag{3.39}$$
$$= \int_0^\infty dl\int_{\mathcal{D}}n(du)\,E\bigl[e^{-p\tau(l)}\bigr]\,e^{-r\zeta(u)}\int_0^{\zeta(u)}e^{-pt-qt}\,dt\tag{3.40}$$
$$= \int_0^\infty e^{-l\psi(p)}\,dl\int_0^\infty C\,\frac{ds}{s^{\gamma+1}}\,e^{-rs}\int_0^s e^{-pt-qt}\,dt\tag{3.41}$$
$$= \frac{C}{\psi(p)}\int_0^\infty \frac{ds}{s^{\gamma+1}}\,e^{-rs}\int_0^s e^{-pt-qt}\,dt\tag{3.42}$$
$$= \frac{C}{\psi(p)}\int_0^\infty dt\,e^{-pt-qt}\int_t^\infty \frac{ds}{s^{\gamma+1}}\,e^{-rs}.\tag{3.43}$$
K. Yano et al.
Note that
\[ \frac{1}{\psi(p)} = \frac{1}{\psi(1)p^{\gamma}} = \frac{1}{\psi(1)\Gamma(\gamma)} \int_0^\infty t^{\gamma-1} e^{-pt}\,dt. \tag{3.44} \]
Hence we have
\begin{align}
(3.43) &= \frac{C}{\psi(1)\Gamma(\gamma)} \int_0^\infty dt\, e^{-pt} \int_0^t dv\,(t-v)^{\gamma-1} e^{-qv} \int_v^\infty \frac{ds}{s^{\gamma+1}}\, e^{-rs} \tag{3.45}\\
&= \frac{C}{\psi(1)\Gamma(\gamma)} \int_0^\infty dt\, e^{-pt} \int_0^1 dv\, v^{-\gamma}(1-v)^{\gamma-1} e^{-qvt} \int_1^\infty \frac{ds}{s^{\gamma+1}}\, e^{-rsvt} \tag{3.46}\\
&= C' \int_0^\infty dt\, e^{-pt}\, E\Big[\exp\Big(-qB_{1-\gamma,\gamma}\,t - r\,\frac{B_{1-\gamma,\gamma}}{U^{1/\gamma}}\,t\Big)\Big] \tag{3.47}\\
&= C'\, E\Big[\frac{1}{p + qB_{1-\gamma,\gamma} + rB_{1-\gamma,\gamma}U^{-1/\gamma}}\Big] \tag{3.48}
\end{align}
for some constant C'.
On the other hand, by the self-similarity property (3.30), we have (Ξt, Δt) = (tΞ1, tΔ1) in law for fixed t > 0, and hence we have
\[ E\Big[\int_0^\infty e^{-pt - q\Xi_t - r\Delta_t}\,dt\Big] = E\Big[\frac{1}{p + q\Xi_1 + r\Delta_1}\Big]. \tag{3.49} \]
Letting q, r → 0+ and comparing (3.48) and (3.49), we see that C' = 1. Therefore we obtain the desired identity in law (3.31) by the uniqueness property of the Stieltjes transform.
Proof of Theorem 3.5. Note that (Ge, Ξe, Δe) = (eG1, eΞ1, eΔ1) in law by the self-similarity property (3.30). We also note that (e(1 − B_{1−γ,γ}), eB_{1−γ,γ}) = (G_γ, G_{1−γ}) in law by the identity in law (2.11). Therefore we obtain the desired identity in law (3.32) as an immediate consequence of Theorem 3.4.
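The beta–gamma identity invoked in this proof can be illustrated numerically. The sketch below is ours, not part of the paper: it checks by Monte Carlo (with an arbitrary illustrative value γ = 0.4) that the product of a standard exponential variable and an independent beta(1 − γ, γ) variable has the Gamma(1 − γ) law, by comparing empirical and exact distribution functions at a few points.

```python
import numpy as np
from scipy.special import gammainc

rng = np.random.default_rng(7)
gamma_ = 0.4            # illustrative value of the index γ in (0, 1)
n = 200000
# e is standard exponential, B_{1-γ,γ} an independent beta(1-γ, γ) variable
prod = rng.exponential(size=n) * rng.beta(1 - gamma_, gamma_, size=n)
# compare the empirical CDF of e·B_{1-γ,γ} with the Gamma(1-γ, 1) CDF,
# gammainc being the regularized lower incomplete gamma function
errors = [abs(np.mean(prod <= x) - gammainc(1 - gamma_, x))
          for x in (0.2, 0.7, 1.5)]
print(max(errors))
```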
4 Harmonic Transforms of Symmetric Stable Lévy Processes

We keep the notation Xα = (Xα(t) : t ≥ 0) for the symmetric stable Lévy process of index α such that
\[ P\big[e^{i\theta X_\alpha(t)}\big] = e^{-t|\theta|^{\alpha}}, \qquad \theta \in \mathbb{R}. \tag{4.1} \]
Note that, with (4.1), we have X2(t) = √2 B(t) in law. We have
\[ P(X_\alpha(t) \in dx) = p_t^{(\alpha)}(x)\,dx \tag{4.2} \]
where
\[ p_t^{(\alpha)}(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-ix\xi}\, e^{-t|\xi|^{\alpha}}\,d\xi = \frac{1}{\pi}\int_0^{\infty} \cos(x\xi)\, e^{-t\xi^{\alpha}}\,d\xi. \tag{4.3} \]
We suppose that 1 < α ≤ 2. Then the Laplace transform
\[ u_q^{(\alpha)}(x) = \int_0^\infty e^{-qt}\, p_t^{(\alpha)}(x)\,dt = \frac{1}{\pi}\int_0^\infty \frac{\cos(x\xi)}{q + \xi^{\alpha}}\,d\xi \tag{4.4} \]
is finite. Define
\[ h_q^{(\alpha)}(x) = u_q^{(\alpha)}(0) - u_q^{(\alpha)}(x), \qquad q > 0,\ x \in \mathbb{R} \tag{4.5} \]
and
\[ h^{(\alpha)}(x) = \lim_{q\to 0+} h_q^{(\alpha)}(x) = \lim_{q\to 0+}\big\{u_q^{(\alpha)}(0) - u_q^{(\alpha)}(x)\big\}, \qquad x \in \mathbb{R}. \tag{4.6} \]
Lemma 4.1 (See also [29, Sec.4.2]) Suppose that 1 < α ≤ 2. Then the following assertions hold:
(i) u_q^{(α)}(0) = u_1^{(α)}(0)\,q^{1/α − 1} for any q > 0, where
\[ u_1^{(\alpha)}(0) = \frac{1}{\alpha\pi}\,\Gamma\Big(1-\frac{1}{\alpha}\Big)\,\Gamma\Big(\frac{1}{\alpha}\Big); \tag{4.7} \]
(ii) h^{(α)}(x) = h^{(α)}(1)|x|^{α−1} for any x ∈ R, where
\[ h^{(\alpha)}(1) = \Big\{ 2\Gamma(\alpha)\sin\frac{(\alpha-1)\pi}{2} \Big\}^{-1}; \tag{4.8} \]
(iii) lim_{q→0+} u_q^{(α)}(x)/u_q^{(α)}(0) = 1 for any x ∈ R.

Proof. The assertion (i) is obvious by definition. It is also obvious that
\[ h^{(\alpha)}(x) = \frac{1}{\pi}\int_0^\infty \frac{1-\cos(x\xi)}{\xi^{\alpha}}\,d\xi = h^{(\alpha)}(1)\,|x|^{\alpha-1}, \qquad x \in \mathbb{R}. \tag{4.9} \]
For the computation
\[ h^{(\alpha)}(1) = \frac{1}{\pi}\int_0^\infty \frac{1-\cos\xi}{\xi^{\alpha}}\,d\xi = \Big\{ 2\Gamma(\alpha)\sin\frac{(\alpha-1)\pi}{2} \Big\}^{-1}, \tag{4.10} \]
see Proposition 7.1 in the Appendix. Hence we obtain (ii). Assertion (iii) is obtained by noting that
\[ \frac{u_q^{(\alpha)}(x)}{u_q^{(\alpha)}(0)} = 1 - \frac{h_q^{(\alpha)}(x)}{u_q^{(\alpha)}(0)} \xrightarrow[q\to 0+]{} 1. \tag{4.11} \]
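Formula (4.7) is easy to confirm numerically from the integral representation (4.4). The following sketch (ours; the value of α is arbitrary) compares a quadrature evaluation of u_1^{(α)}(0) with the closed form.

```python
import math
import numpy as np
from scipy.integrate import quad

alpha = 1.5
# u_1^{(alpha)}(0) = (1/pi) ∫_0^∞ dξ/(1 + ξ^α), i.e. (4.4) with q = 1, x = 0
integral, _ = quad(lambda xi: 1.0/(1.0 + xi**alpha), 0, np.inf)
u_numeric = integral/math.pi
u_closed = math.gamma(1 - 1/alpha)*math.gamma(1/alpha)/(alpha*math.pi)   # (4.7)
print(u_numeric, u_closed)
```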
Let (L^{(α)}(t) : t ≥ 0) be the unique local time process such that
\[ L^{(\alpha)}(t) = \lim_{\varepsilon\to 0+} \frac{1}{2\varepsilon} \int_0^t 1_{\{|X_\alpha(s)|<\varepsilon\}}\,ds \quad \text{a.s.} \tag{4.12} \]
Then it is well-known (see [2, Lemma V.1.3]) that
\[ E\Big[\int_0^\infty e^{-t}\,dL^{(\alpha)}(t)\Big] = u_1^{(\alpha)}(0). \tag{4.13} \]
Let n^{(α)} denote Itô's measure for the process Xα corresponding to this normalisation of the local time (L^{(α)}(t) : t ≥ 0). Remark that
\[ L^{(\alpha)}(t) = u_1^{(\alpha)}(0)\,L(t) \quad (t \ge 0), \qquad n^{(\alpha)} = \frac{1}{u_1^{(\alpha)}(0)}\,n \tag{4.14} \]
where (L(t) : t ≥ 0) and n are as defined by (3.1) and (3.6), respectively.

Theorem 4.2 ([49] and [48]) Suppose that 1 < α ≤ 2. Then
\[ n^{(\alpha)}\big[h^{(\alpha)}(X(t));\ \zeta > t\big] = 1, \qquad t > 0. \tag{4.15} \]
Consequently, there exists a unique probability measure P^{h^{(α)}} on D such that
\[ E^{h^{(\alpha)}}[Z_t] = n^{(\alpha)}\big[Z_t\, h^{(\alpha)}(X(t));\ \zeta > t\big] \tag{4.16} \]
for any t > 0 and for any non-negative or bounded F_t-measurable functional Z_t.

The proof of Theorem 4.2 can be found in [49, Thm.4.7], so we omit it. See [48, Thm.1.2] for the proof of Theorem 4.2 for a fairly general class of one-dimensional symmetric Lévy processes. Several aspects of the law of the local time process will be discussed in Hayashi–K. Yano [20].

Example 4.3 In the case where α = 2, we have X2(t) = √2 B(t), and we have the following formulae:
\[ p_t^{(2)}(x) = \frac{1}{2\sqrt{\pi t}}\, e^{-\frac{x^2}{4t}}, \qquad t > 0,\ x \in \mathbb{R}, \tag{4.17} \]
\[ u_q^{(2)}(x) = \frac{1}{2\sqrt{q}}\, e^{-\sqrt{q}\,|x|}, \qquad q > 0,\ x \in \mathbb{R}, \tag{4.18} \]
\[ h^{(2)}(x) = \frac{1}{2}\,|x|, \qquad x \in \mathbb{R}. \tag{4.19} \]
The process (X(t)/√2 : t ≥ 0) under P^{h^{(2)}} is nothing but the symmetrised 3-dimensional Bessel process starting from the origin.
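Formula (4.18) can be cross-checked against the defining integral (4.4). The sketch below is ours (arbitrary parameter values); it evaluates the oscillatory cosine integral with SciPy's Fourier-weight quadrature and compares it with the closed form.

```python
import math
import numpy as np
from scipy.integrate import quad

q, x = 0.7, 1.3
# (4.4) with alpha = 2: u_q^{(2)}(x) = (1/pi) ∫_0^∞ cos(xξ)/(q + ξ²) dξ,
# computed with the QAWF routine (weight='cos', infinite upper limit)
integral, _ = quad(lambda xi: 1.0/(q + xi**2), 0, np.inf, weight='cos', wvar=x)
u_numeric = integral/math.pi
u_closed = math.exp(-math.sqrt(q)*abs(x))/(2*math.sqrt(q))   # (4.18)
print(u_numeric, u_closed)
```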
Theorem 4.4 ([48]) Let q > 0. Then the following assertions are valid:
(i) Suppose that 1 < α ≤ 2. Then it holds that
\[ \lim_{x\to 0} \frac{h_q^{(\alpha)}(x)}{h^{(\alpha)}(x)} = 1; \tag{4.20} \]
(iia) Suppose that 1 < α < 2. Let a ≠ 0. Then it holds that
\[ \lim_{x\to 0} \frac{u_q^{(\alpha)}(a-x) - u_q^{(\alpha)}(a)}{h^{(\alpha)}(x)} = 0; \tag{4.21} \]
(iib) Suppose that α = 2. Let a ≠ 0. Then it holds that
\[ \lim_{x\to \pm 0} \frac{u_q^{(2)}(a-x) - u_q^{(2)}(a)}{h^{(2)}(x)} = \pm e^{-\sqrt{q}\,|a|}. \tag{4.22} \]
The proof of claim (i) of Theorem 4.4 can be found in [48, Lem.4.4], so we omit it. The proof of claim (iib) of Theorem 4.4 is immediate from formulae (4.18) and (4.19), so we omit it, too. The proof of claim (iia) of Theorem 4.4 is immediate from the following estimate:

Lemma 4.5 ([48]) Suppose that 1 < α < 2. Let a, x ∈ R with 0 < 2|x| < |a|. Then there exists a constant C_q^{(α)} such that
\[ \big|u_q^{(\alpha)}(a-x) - u_q^{(\alpha)}(a)\big| \le \frac{C_q^{(\alpha)}\,|x|}{|a|}. \tag{4.23} \]
The proof of Lemma 4.5 can be found in [48, Lem.6.2, (i)] in a rather general setting, but we give it for the convenience of the reader.

Proof of Lemma 4.5. Integrating by parts, we have
\begin{align}
u_q^{(\alpha)}(a-x) - u_q^{(\alpha)}(a)
&= \frac{1}{\pi}\int_0^\infty \frac{\cos a\xi - \cos(a-x)\xi}{q+\xi^{\alpha}}\,d\xi \tag{4.24}\\
&= \frac{1}{\pi}\int_0^\infty \big\{\varphi(a\xi) - \varphi((a-x)\xi)\big\}\, \frac{\alpha\xi^{\alpha}\,d\xi}{(q+\xi^{\alpha})^2} \tag{4.25}
\end{align}
where
\[ \varphi(x) = \frac{\sin x}{x} \quad (x \ne 0), \qquad \varphi(0) = 1. \tag{4.26} \]
Since φ'(x) = \frac{\cos x}{x} - \frac{\sin x}{x^2} (x ≠ 0), we have
\[ \big|\varphi(a\xi) - \varphi((a-x)\xi)\big| \le \Big| \int_{(a-x)\xi}^{a\xi} |\varphi'(y)|\,dy \Big| \le \Big| \int_{(a-x)\xi}^{a\xi} \frac{2}{|y|}\,dy \Big|. \tag{4.27} \]
We change variables, y = uξ; then we have
\[ \big|\varphi(a\xi) - \varphi((a-x)\xi)\big| \le \Big| \int_{a-x}^{a} \frac{2}{|u|}\,du \Big| \le \frac{4|x|}{|a|}. \tag{4.28} \]
Plugging this bound into (4.25), we have proved the estimate (4.23).

Let us prove claim (iia) of Theorem 4.4.

Proof of claim (iia) of Theorem 4.4. Without loss of generality, we may suppose that 0 < 2|x| < |a|. Using the estimate (4.23), we obtain
\[ \Big| \frac{u_q^{(\alpha)}(a-x) - u_q^{(\alpha)}(a)}{h^{(\alpha)}(x)} \Big| \le \frac{C_q^{(\alpha)}}{|a|}\cdot\frac{|x|^{2-\alpha}}{h^{(\alpha)}(1)}, \tag{4.29} \]
which tends to zero as x → 0 since α < 2. Now the proof is complete.
5 First Hitting Time of a Single Point for Xα

5.1 The Case of One-dimensional Brownian Motion

Let B = (B(t) : t ≥ 0) denote the one-dimensional Brownian motion starting from 0. We consider the first hitting time of a ∈ R for B:
\[ T_{\{a\}}(B) = \inf\{t > 0 : B(t) = a\}. \tag{5.1} \]
It is well-known (see, e.g., [35, Prop.II.3.7]) that the law of the hitting time is of (SD) type, and its Laplace transform is given as follows:
\[ E\big[e^{i\theta \tilde B(T_{\{a\}}(B))}\big] = E\big[e^{-\frac{1}{2}\theta^2 T_{\{a\}}(B)}\big] = e^{-|a\theta|}, \qquad \theta \in \mathbb{R} \tag{5.2} \]
where B̃ = (B̃(t) : t ≥ 0) stands for an independent copy of B. Identity (5.2) can be expressed as
\[ \tilde B(T_{\{a\}}(B)) \overset{\text{law}}{=} |a|\,C, \qquad T_{\{a\}}(B) \overset{\text{law}}{=} 2a^2\, T_{1/2}. \tag{5.3} \]
Let a > 0. Consider the random times G_{\{a\}}(B) and Ξ_{\{a\}}(B). The following path decomposition is due to Williams (see [43] and [44]; see also Prop.VII.4.8 and Thm.VII.4.9 of [35]):

Theorem 5.1 ([43] and [44]) The process (B(t) : 0 ≤ t ≤ T_{\{a\}}(B)) is identical in law to the process (Y(t) : 0 ≤ t ≤ T) defined as follows:
\[ Y(t) = \begin{cases} B_1(t) & \text{for } 0 \le t < T_{\{M\}}(B_1); \\ B_2(T' - t) & \text{for } T_{\{M\}}(B_1) \le t < T'; \\ R(t - T') & \text{for } T' \le t \le T \end{cases} \tag{5.4} \]
where M, B_1, B_2 and R are independent, M is a uniform variable on (0, a), B_1 and B_2 are both identical in law to B, R is a 3-dimensional Bessel process starting at 0, and T' and T are random times defined as follows:
\[ T' = T_{\{M\}}(B_1) + T_{\{M\}}(B_2), \qquad T = T' + T_{\{a\}}(R). \tag{5.5} \]
From this path decomposition, we may compute the Laplace transform of G_{\{a\}}(B) as follows:
\[ E\big[e^{-qG_{\{a\}}(B)}\big] = \int_0^a \frac{dm}{a}\,\Big( E\big[e^{-qT_{\{m\}}(B)}\big] \Big)^2 = \frac{1 - e^{-2\sqrt{2q}\,a}}{2\sqrt{2q}\,a}, \qquad q > 0. \tag{5.6} \]
We may also compute the Laplace transform of Ξ_{\{a\}}(B) as follows:
\[ E\big[e^{-q\Xi_{\{a\}}(B)}\big] = E\big[e^{-qT_{\{a\}}(R)}\big] = \frac{\sqrt{2q}\,a}{\sinh(\sqrt{2q}\,a)}, \qquad q > 0. \tag{5.7} \]
In other words, we have
\[ \Xi_{\{a\}}(B) \overset{\text{law}}{=} T_{\{a\}}(R) \overset{\text{law}}{=} a^2 S_1. \tag{5.8} \]
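The first equality in (5.6) integrates the square of E[e^{-qT_{\{m\}}(B)}] = e^{-m√(2q)} over the uniform law of M. The quick midpoint-rule check below is our own sketch (arbitrary parameter values), confirming the closed form on the right of (5.6).

```python
import math

q, a = 0.9, 1.4
c = 2*math.sqrt(2*q)
n = 20000
h = a/n
# (1/a) ∫_0^a (e^{-m sqrt(2q)})² dm by the composite midpoint rule
left = sum(math.exp(-c*(i + 0.5)*h) for i in range(n)) * h / a
right = (1 - math.exp(-c*a)) / (c*a)   # closed form in (5.6)
print(left, right)
```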
Remark 5.2 The laws of first hitting times are known to be of (SD) type also for Bessel processes with drift (see Pitman–Yor [31]) and of (ID) type for one-dimensional diffusion processes (see Yamazato [46] and references therein).

5.2 The Law of T_{\{a\}}(X_\alpha)

Consider the first hitting time of a point a ∈ R for Xα of index 1 < α ≤ 2:
\[ T_{\{a\}}(X_\alpha) = \inf\{t > 0 : X_\alpha(t) = a\}. \tag{5.9} \]
It is well-known (see, e.g., [2, Cor.II.5.18]) that
\[ E\big[e^{-qT_{\{a\}}(X_\alpha)}\big] = \frac{u_q^{(\alpha)}(a)}{u_q^{(\alpha)}(0)}, \qquad q > 0. \tag{5.10} \]
Let X̃α = (X̃α(t) : t ≥ 0) be an independent copy of Xα. The following is a generalization of formulae (5.2) and (5.3).

Theorem 5.3 (See also Cordero [50, §1.2.2]) Suppose that 1 < α ≤ 2. Let a ∈ R. Then
\[ E\big[e^{i\theta \tilde X_\alpha(T_{\{a\}}(X_\alpha))}\big] = E\big[e^{-|\theta|^{\alpha} T_{\{a\}}(X_\alpha)}\big] = \frac{\sin(\pi/\alpha)}{2\pi/\alpha}\, L_\alpha(a\theta) \tag{5.11} \]
and
\[ \tilde X_\alpha(T_{\{a\}}(X_\alpha)) \overset{\text{law}}{=} |a|\,C_\alpha, \qquad T_{\{a\}}(X_\alpha) \overset{\text{law}}{=} \frac{|a|^{\alpha}}{(R_\alpha)^{\alpha}\, B_{1-\gamma,\gamma}} \tag{5.12} \]
where γ = 1/α.
We can recover (5.3) if we take α = 2, noting that
\[ T_{\{a\}}(B) = T_{\{\sqrt{2}a\}}(X_2) \overset{\text{law}}{=} \frac{2a^2}{(R_2)^2\, B_{1/2,1/2}} \overset{\text{law}}{=} \frac{a^2}{2\mathbf{e}\,B_{1/2,1/2}} \overset{\text{law}}{=} \frac{a^2}{2G_{1/2}} \overset{\text{law}}{=} 2a^2\, T_{1/2}. \tag{5.13} \]
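The identity T_{\{a\}}(B) = a²/(2G_{1/2}) in law can be illustrated by Monte Carlo (here for a = 1), since the Brownian hitting time of level 1 has the explicit distribution function P(T_{\{1\}}(B) ≤ t) = erfc(1/√(2t)). The sketch below is ours, not part of the paper.

```python
import math
import numpy as np

rng = np.random.default_rng(1)
n = 200000
# a²/(2 G_{1/2}) with a = 1, G_{1/2} a gamma variable of index 1/2
samples = 1.0/(2.0*rng.gamma(0.5, size=n))
# compare the empirical CDF with erfc(1/sqrt(2t)) at a few checkpoints
errors = [abs(np.mean(samples <= t) - math.erfc(1.0/math.sqrt(2*t)))
          for t in (0.5, 1.0, 4.0)]
print(max(errors))
```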
Proof of Theorem 5.3. If we take q = |θ|^α, then
\[ u_q^{(\alpha)}(x) = \frac{1}{\pi}\int_0^\infty \frac{\cos(x\xi)}{|\theta|^{\alpha} + |\xi|^{\alpha}}\,d\xi = \frac{|\theta|^{1-\alpha}}{\pi}\int_0^\infty \frac{\cos(\theta x\xi)}{1 + |\xi|^{\alpha}}\,d\xi, \qquad x \in \mathbb{R}. \tag{5.14} \]
Hence, by formula (5.10), we obtain
\[ E\big[e^{-|\theta|^{\alpha} T_{\{a\}}(X_\alpha)}\big] = E\big[\cos(\theta|a|C_\alpha)\big] = E\big[e^{i\theta|a|C_\alpha}\big]. \tag{5.15} \]
This shows (5.11) and the first identity of (5.12). To prove the second identity of (5.12), it suffices to prove the claim when a = 1; in fact, by the self-similarity property, we have
\[ T_{\{a\}}(X_\alpha) \overset{\text{law}}{=} |a|^{\alpha}\, T_{\{1\}}(X_\alpha). \tag{5.16} \]
Note that
\[ \int_0^\infty e^{-qt}\, P(T_{\{1\}}(X_\alpha) < t)\,dt = \frac{1}{q}\,E\big[e^{-qT_{\{1\}}(X_\alpha)}\big] = \frac{u_q^{(\alpha)}(1)}{q\,u_q^{(\alpha)}(0)}. \tag{5.17} \]
Since u_q^{(α)}(0) = u_1^{(α)}(0)\,q^{γ−1} where γ = 1/α, we have
\[ \frac{1}{q\,u_q^{(\alpha)}(0)} = \frac{1}{u_1^{(\alpha)}(0)}\, q^{-\gamma} = \frac{1}{u_1^{(\alpha)}(0)\,\Gamma(\gamma)} \int_0^\infty y^{\gamma-1} e^{-qy}\,dy. \tag{5.18} \]
Hence, by Laplace inversion, we obtain
\[ P(T_{\{1\}}(X_\alpha) < t) = \frac{1}{u_1^{(\alpha)}(0)\,\Gamma(\gamma)} \int_0^t (t-s)^{\gamma-1}\, p_s^{(\alpha)}(1)\,ds. \tag{5.19} \]
By the scaling property
\[ p_s^{(\alpha)}(x) = \frac{1}{s^{\gamma}}\, p_1^{(\alpha)}\Big(\frac{x}{s^{\gamma}}\Big), \qquad s > 0, \tag{5.20} \]
we obtain
\begin{align}
P(T_{\{1\}}(X_\alpha) < t) &= \frac{\Gamma(1-\gamma)\,p_1^{(\alpha)}(0)}{u_1^{(\alpha)}(0)} \int_0^t \frac{(t-s)^{\gamma-1}\, s^{-\gamma}}{\Gamma(\gamma)\Gamma(1-\gamma)} \cdot \frac{p_1^{(\alpha)}(1/s^{\gamma})}{p_1^{(\alpha)}(0)}\,ds \tag{5.21}\\
&= \int_0^1 \frac{(1-s)^{\gamma-1}\, s^{-\gamma}}{\Gamma(\gamma)\Gamma(1-\gamma)} \cdot \frac{p_1^{(\alpha)}(1/(ts)^{\gamma})}{p_1^{(\alpha)}(0)}\,ds \quad \text{(from (2.75) and (i) of Lemma 4.1)} \tag{5.22}\\
&= P\Big( R_\alpha > \frac{1}{(tB_{1-\gamma,\gamma})^{\gamma}} \Big) \quad \text{(from (2.76))} \tag{5.23}\\
&= P\Big( \frac{1}{(R_\alpha)^{\alpha}\, B_{1-\gamma,\gamma}} < t \Big). \tag{5.24}
\end{align}
Now the proof is complete.
5.3 Laplace Transform Formula for the First Hitting Time of Two Points

For later use, we prepare several important formulae concerning Laplace transforms for the first hitting time of two points. Denote the symmetric α-stable process starting from x ∈ R by X_α^x(t) = x + X_α(t). Suppose that 1 < α ≤ 2. Recall that the Laplace transform of the first hitting time of a single point is given by (see (5.10))
\[ \varphi^q_{x\to a} := E\big[e^{-qT_{\{a\}}(X_\alpha^x)}\big] = \frac{u_q^{(\alpha)}(x-a)}{u_q^{(\alpha)}(0)}. \tag{5.25} \]
The Laplace transform of the first hitting time of two points, i.e., of T_{\{a\}}(X_\alpha^x) ∧ T_{\{b\}}(X_\alpha^x), is given by the following formula:

Proposition 5.4 Suppose that 1 < α ≤ 2. Let x, a, b ∈ R. Then
\[ \varphi^q_{x\to a,b} := E\big[e^{-qT_{\{a\}}(X_\alpha^x)\wedge T_{\{b\}}(X_\alpha^x)}\big] = \frac{u_q^{(\alpha)}(x-a) + u_q^{(\alpha)}(x-b)}{u_q^{(\alpha)}(0) + u_q^{(\alpha)}(a-b)}. \tag{5.26} \]
Proof. For any closed set F, put
\[ T_F^x = T_F(X_\alpha^x) = \inf\{t > 0 : X_\alpha^x(t) \in F\}. \tag{5.27} \]
Following [2, p.49], we introduce the capacitary measure as
\[ \mu_F^q(A) = q \int E\big[e^{-qT_F^z};\ X_\alpha^z(T_F^z) \in A\big]\,dz, \qquad A \in \mathcal{B}(\mathbb{R}). \tag{5.28} \]
Now we apply Theorem II.2.7 of [2] and obtain
\[ \int_A \Big( \int u_q^{(\alpha)}(x-y)\,\mu_F^q(dy) \Big) dx = \int_A E\big[e^{-qT_F^x}\big]\,dx, \qquad A \in \mathcal{B}(\mathbb{R}), \tag{5.29} \]
where we have used the fact that the process considered is symmetric. This implies that
\[ E\big[e^{-qT_F^x}\big] = \int u_q^{(\alpha)}(x-y)\,\mu_F^q(dy). \tag{5.30} \]
By the definition of μ_F^q, we obtain
\[ E\big[e^{-qT_F^x}\big] = q \int E\big[e^{-qT_F^z}\, u_q^{(\alpha)}(x - X_\alpha^z(T_F^z))\big]\,dz. \tag{5.31} \]
Now we let F = {a, b}. Then we have T_F^x = T^x_{\{a\}} ∧ T^x_{\{b\}}. Noting that X_α^z(T_F^z) = a or b almost surely, we have
\[ E\big[e^{-qT_F^x}\big] = C^q_{a\prec b}\, u_q^{(\alpha)}(x-a) + C^q_{b\prec a}\, u_q^{(\alpha)}(x-b) \tag{5.32} \]
where
\[ C^q_{a\prec b} = q \int E\big[e^{-qT_F^z};\ T^z_{\{a\}} < T^z_{\{b\}}\big]\,dz. \tag{5.33} \]
Since T_F^a = T_F^b = 0 almost surely, we have
\[ 1 = C^q_{a\prec b}\, u_q^{(\alpha)}(0) + C^q_{b\prec a}\, u_q^{(\alpha)}(a-b), \tag{5.34} \]
\[ 1 = C^q_{a\prec b}\, u_q^{(\alpha)}(b-a) + C^q_{b\prec a}\, u_q^{(\alpha)}(0). \tag{5.35} \]
Hence we obtain
\[ C^q_{a\prec b} = C^q_{b\prec a} = \frac{1}{u_q^{(\alpha)}(0) + u_q^{(\alpha)}(a-b)}. \tag{5.36} \]
Combining this with (5.32), we obtain the desired result.

The Laplace transform of the first hitting time of the point a before hitting b is given by the following formula:

Proposition 5.5 Suppose that 1 < α ≤ 2. Let x, a, b ∈ R with a ≠ b. Then
\begin{align}
\varphi^q_{x\to a\prec b} &:= E\big[e^{-qT_{\{a\}}(X_\alpha^x)};\ T_{\{a\}}(X_\alpha^x) < T_{\{b\}}(X_\alpha^x)\big] \tag{5.37}\\
&= \frac{u_q^{(\alpha)}(0)\,u_q^{(\alpha)}(x-a) - u_q^{(\alpha)}(a-b)\,u_q^{(\alpha)}(x-b)}{\{u_q^{(\alpha)}(0)\}^2 - \{u_q^{(\alpha)}(a-b)\}^2}. \tag{5.38}
\end{align}
Proof. Keep the notations in the proof of Proposition 5.4. Noting that
\[ T^x_{\{a\}} = T^x_{\{b\}} + T^x_{\{a\}} \circ \theta_{T^x_{\{b\}}} \qquad \text{on } \{T^x_{\{a\}} > T^x_{\{b\}}\}, \tag{5.39} \]
we see, by the strong Markov property, that
\[ E\big[e^{-qT^x_{\{a\}}};\ T^x_{\{a\}} > T^x_{\{b\}}\big] = \varphi^q_{x\to b\prec a}\,\varphi^q_{b\to a}. \tag{5.40} \]
Thus we have
\[ \varphi^q_{x\to a} = \varphi^q_{x\to a\prec b} + \varphi^q_{x\to b\prec a}\,\varphi^q_{b\to a}. \tag{5.41} \]
Combining this with the trivial identity
\[ \varphi^q_{x\to a,b} = \varphi^q_{x\to a\prec b} + \varphi^q_{x\to b\prec a}, \tag{5.42} \]
we obtain
\[ \varphi^q_{x\to a\prec b} = \frac{\varphi^q_{x\to a} - \varphi^q_{b\to a}\,\varphi^q_{x\to a,b}}{1 - \varphi^q_{b\to a}}. \tag{5.43} \]
This proves the desired result.
Remark 5.6 Formula (5.38) can be written as
\begin{align}
&E\big[e^{-qT_{\{a\}}(X_\alpha^x)};\ T_{\{a\}}(X_\alpha^x) < T_{\{b\}}(X_\alpha^x)\big] \tag{5.44}\\
&\quad= \frac{u_q^{(\alpha)}(x-b)\,h_q^{(\alpha)}(a-b) + u_q^{(\alpha)}(0)\big\{h_q^{(\alpha)}(x-b) - h_q^{(\alpha)}(x-a)\big\}}{\big\{u_q^{(\alpha)}(0) + u_q^{(\alpha)}(a-b)\big\}\,h_q^{(\alpha)}(a-b)}. \tag{5.45}
\end{align}
Letting q → 0+, we obtain
\[ P\big( T_{\{a\}}(X_\alpha^x) < T_{\{b\}}(X_\alpha^x) \big) = \frac{1}{2}\Big( 1 + \frac{|x-b|^{\alpha-1} - |x-a|^{\alpha-1}}{|a-b|^{\alpha-1}} \Big), \tag{5.46} \]
which is a special case of Getoor's formula [16, Thm.6.5]. See also [48, Thm.6.1] for its application to Itô's measure for symmetric Lévy processes.

Remark 5.7 Let a < x < b. Then, as corollaries of Propositions 5.4 and 5.5, we recover the following well-known formulae (see, e.g., [24, Problem 1.7.6]) for the Brownian motion (B^x(t) = x + B(t) : t ≥ 0) starting from x:
\[ E\big[e^{-qT_{\{a\}}(B^x)\wedge T_{\{b\}}(B^x)}\big] = \frac{\cosh\big(\sqrt{2q}\,\big(x - \frac{b+a}{2}\big)\big)}{\cosh\big(\sqrt{2q}\cdot\frac{b-a}{2}\big)} \tag{5.47} \]
and
\[ E\big[e^{-qT_{\{a\}}(B^x)};\ T_{\{a\}}(B^x) < T_{\{b\}}(B^x)\big] = \frac{\sinh\big(\sqrt{2q}\,(b-x)\big)}{\sinh\big(\sqrt{2q}\,(b-a)\big)}. \tag{5.48} \]
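That (5.26) reduces to (5.47) for α = 2 can be checked exactly: since X2 = √2 B, the two-point formula applied to the points √2a, √2b with u_q^{(2)} from (4.18) must reproduce the cosh ratio. The sketch below is ours, with arbitrary parameter values.

```python
import math

q, a, x, b = 0.8, -1.0, 0.3, 2.0      # a < x < b, arbitrary values
s, c = math.sqrt(2), math.sqrt(2*q)
u = lambda y: math.exp(-math.sqrt(q)*abs(y))/(2*math.sqrt(q))   # (4.18)
# (5.26) for X_2 = sqrt(2) B, with the three points scaled by sqrt(2)
lhs = (u(s*(x - a)) + u(s*(x - b))) / (u(0.0) + u(s*(a - b)))
# (5.47) directly
rhs = math.cosh(c*(x - (a + b)/2)) / math.cosh(c*(b - a)/2)
print(lhs, rhs)
```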
5.4 The Laplace Transforms of G_{\{a\}}(X_\alpha) and Ξ_{\{a\}}(X_\alpha)

The following theorem generalises formulae (5.6) and (5.7):

Theorem 5.8 Suppose that 1 < α ≤ 2. Let a ≠ 0. Then it holds that
\[ E\big[e^{-qG_{\{a\}}(X_\alpha)}\big] = \frac{\{u_q^{(\alpha)}(0)\}^2 - \{u_q^{(\alpha)}(a)\}^2}{2h^{(\alpha)}(a)\,u_q^{(\alpha)}(0)} \tag{5.49} \]
and that
\[ E\big[e^{-q\Xi_{\{a\}}(X_\alpha)}\big] = \frac{u_q^{(\alpha)}(a)}{u_q^{(\alpha)}(0)} \cdot \frac{2h^{(\alpha)}(a)\,u_q^{(\alpha)}(0)}{\{u_q^{(\alpha)}(0)\}^2 - \{u_q^{(\alpha)}(a)\}^2}. \tag{5.50} \]
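As a consistency check (ours, not the paper's), for α = 2 formula (5.49) evaluated at the point √2a, with (4.18) and (4.19), should reduce to the Brownian formula (5.6) for the level a, since X2 = √2 B.

```python
import math

q, a = 1.1, 0.7
u = lambda y: math.exp(-math.sqrt(q)*abs(y))/(2*math.sqrt(q))   # (4.18)
h = lambda y: abs(y)/2                                          # (4.19)
ap = math.sqrt(2)*a     # the point sqrt(2)a for X_2 corresponds to level a for B
lhs = (u(0.0)**2 - u(ap)**2) / (2*h(ap)*u(0.0))                 # (5.49), alpha = 2
rhs = (1 - math.exp(-2*math.sqrt(2*q)*a)) / (2*math.sqrt(2*q)*a)  # (5.6)
print(lhs, rhs)
```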
Remark 5.9 The left hand sides of (5.49) and (5.50) are functions of q|a|^α since
\[ G_{\{a\}}(X_\alpha) \overset{\text{law}}{=} |a|^{\alpha}\, G_{\{1\}}(X_\alpha) \qquad\text{and}\qquad \Xi_{\{a\}}(X_\alpha) \overset{\text{law}}{=} |a|^{\alpha}\, \Xi_{\{1\}}(X_\alpha). \tag{5.51} \]
We may check that so are the right hand sides by the following formulae:
\[ u_q^{(\alpha)}(0) = |a|^{\alpha-1}\, u_{q|a|^{\alpha}}^{(\alpha)}(0), \qquad u_q^{(\alpha)}(a) = |a|^{\alpha-1}\, u_{q|a|^{\alpha}}^{(\alpha)}(1) \tag{5.52} \]
and
\[ h^{(\alpha)}(a) = |a|^{\alpha-1}\, h^{(\alpha)}(1). \tag{5.53} \]
For the proof of Theorem 5.8, we need the following proposition.

Proposition 5.10 Suppose that X = Xα with 1 < α ≤ 2. Let a ≠ 0 and q, r > 0. Then
\[ n^{(\alpha)}\big[e^{-qT_{\{a\}} - r(\zeta - T_{\{a\}})};\ T_{\{a\}} < \zeta\big] = \frac{u_r^{(\alpha)}(a)}{u_r^{(\alpha)}(0)} \cdot \frac{u_q^{(\alpha)}(a)}{\{u_q^{(\alpha)}(0)\}^2 - \{u_q^{(\alpha)}(a)\}^2}. \tag{5.54} \]
Consequently, it holds that
\[ n^{(\alpha)}\big[e^{-qT_{\{a\}}};\ T_{\{a\}} < \zeta\big] = \frac{u_q^{(\alpha)}(a)}{\{u_q^{(\alpha)}(0)\}^2 - \{u_q^{(\alpha)}(a)\}^2} \tag{5.55} \]
and that
\[ n^{(\alpha)}(T_{\{a\}} < \zeta) = \frac{1}{2h^{(\alpha)}(a)}. \tag{5.56} \]
=e−qε n(α) ϕqx→a≺0 x=X(ε) ϕr0→a ; ε < T{a} ∧ ζ (5.58) (α) ϕqx→a≺0 ; ε < T{a} . (5.59) =e−qε ϕr0→a E h h(α) (x) x=X(ε) Here we used Theorem 4.2. Note that ϕqx→a≺0 (α) 2 (α) 2 (0)} − {u (a)} · {u q q h(α) (x) (α)
=u(α) q (a)
(α)
(α)
hq (x) uq (x − a) − uq (a) · (α) − u(α) . q (0) · h (x) h(α) (x)
(5.60)
Suppose that 1 < α < 2. Then we see that the right hand side of (5.60) converges to u_q^{(α)}(a) as x → 0 by Theorem 4.4. Letting ε → 0+ in identity (5.57)–(5.59), we obtain formula (5.54) by the dominated convergence theorem.

Suppose that α = 2. We may assume without loss of generality that a > 0. Then we see that the quantity (5.59) is equal to
\[ \frac{1}{2}\, e^{-q\varepsilon}\, \varphi^r_{0\to a}\, E^{h^{(2)}}\Big[ \frac{\varphi^q_{x\to a\prec 0}}{h^{(2)}(x)}\Big|_{x=X(\varepsilon)};\ \varepsilon < T_{\{a\}},\ X(\varepsilon) > 0 \Big]; \tag{5.61} \]
in fact, P^{h^{(2)}} is nothing but the law of the symmetrisation of the 3-dimensional Bessel process starting from the origin. We also see that the right hand side of (5.60) converges to 2u_q^{(α)}(a) as x → +0 by Theorem 4.4. Hence, letting ε → 0+ in identity (5.57)–(5.61), we obtain formula (5.54) by the dominated convergence theorem. Now the proof is complete.

Now we proceed to prove Theorem 5.8.

Proof of Theorem 5.8. Using formulae (5.56) and (5.54) (with r = q), we have
\begin{align}
&n^{(\alpha)}\big[1 - e^{-q\zeta};\ T_{\{a\}} > \zeta\big] \tag{5.62}\\
&= n^{(\alpha)}\big[1 - e^{-q\zeta}\big] - n^{(\alpha)}(T_{\{a\}} < \zeta) + n^{(\alpha)}\big[e^{-q\zeta};\ T_{\{a\}} < \zeta\big] \tag{5.63}\\
&= \frac{1}{u_q^{(\alpha)}(0)} - \frac{1}{2h^{(\alpha)}(a)} + \frac{u_q^{(\alpha)}(a)}{u_q^{(\alpha)}(0)} \cdot \frac{u_q^{(\alpha)}(a)}{\{u_q^{(\alpha)}(0)\}^2 - \{u_q^{(\alpha)}(a)\}^2}. \tag{5.64}
\end{align}
Hence we obtain
\[ \frac{n^{(\alpha)}\big[1 - e^{-q\zeta};\ T_{\{a\}} > \zeta\big]}{n^{(\alpha)}(T_{\{a\}} < \zeta)} = \frac{2h^{(\alpha)}(a)\,u_q^{(\alpha)}(0)}{\{u_q^{(\alpha)}(0)\}^2 - \{u_q^{(\alpha)}(a)\}^2} - 1. \tag{5.65} \]
(5.66)
Combining this with formula (3.19), we obtain (5.50). 5.5 Overshoots at the First Passage Time of a Level For comparison with the description of the law of a first hitting time, we recall the law of the overshoot at the first passage time of a level. Let Xα = (Xα (t) : t ≥ 0) denote the symmetric stable L´evy process of index 0 < α ≤ 2 starting α from the origin such that E[eiλXα (t) ] = e−t|λ| .
218
K. Yano et al.
Consider the first passage time of level a > 0 for Xα : T[a,∞) (Xα ) = inf{t > 0 : Xα (t) ≥ a}.
(5.67)
The variable Xα (T[a,∞) (Xα )) − a is the overshoot at the first hitting time of level a. The following theorem is due to Ray [34], although he does not express his result like this: Theorem 5.11 ([34]) Suppose that 0 < α ≤ 2. Let a > 0. Then law
Xα (T[a,∞) (Xα )) − a = a
G1− α2 Gα
(5.68)
2
where G1− α2 and Gα2 are independent gamma variables of indices 1 − α 2 , respectively.
α 2
and
For its multidimensional analogue, see Blumenthal–Getoor–Ray [5].
6 First Hitting Time of a Single Point for |Xα| 6.1 The Case of One-dimensional Reflecting Brownian Motion We consider the first hitting time of a > 0 for the reflecting Brownian motion |B| = (|B|(t)): T{a} (|B|) = inf{t > 0 : |B(t)| = a} = T{a} (B) ∧ T{−a} (B).
(6.1)
It is well-known (see, e.g., [35, Prop.II.3.7]) that the law of the hitting time is of (SD) type where its Laplace transforms is given as follows: 1 2 1 E eiθB(T{a} (|B|)) =E e− 2 θ T{a} (|B|) = , θ ∈ R. (6.2) cosh(aθ) Identity (6.2) can be expressed as law {a} (|B|)) law B(T = aC1 = 2aM0 ,
law
T{a} (|B|) = a2 C1 .
(6.3)
Noting that ∞ 1 2e−a|θ| −a|θ| = = 2e (−1)n e−2na|θ| , cosh(aθ) 1 + e−2a|θ| n=0
(6.4)
we have the following expansion: ∞ E e−qT{a} (|B|) = 2 (−1)n E e−qT{(2n+1)a} (B) , n=0
q > 0.
(6.5)
First Hitting Times of Points
219
Consider the random times G{a} (|B|) and Ξ{a} (|B|). By means of random time-change, Williams’ path decomposition (Theorem 5.1) is also valid for the reflecting Brownian motion |B| instead of B. Hence we may compute the Laplace transforms of these variables as follows: √ a dm 2 tanh( 2qa) −qG{a} (|B|) −qT{m} (|B|) √ = E e E e = , q>0 a 2qa 0 (6.6) and E e−qΞ{a} (|B|) = E e−qT{a} (R) =
√
2q|a| √ , sinh( 2q|a|)
q > 0.
(6.7)
In other words, we have law
G{a} (|B|) = a2 T1 ,
law
law
Ξ{a} (|B|) = T{a} (R) = a2 S1 .
(6.8)
6.2 Discussions about the Laplace Transform of T{a} (|Xα |) Consider the first hitting time of point a > 0 for |Xα | of index 1 < α ≤ 2: T{a} (|Xα |) = inf{t > 0 : |Xα (t)| = a} = T{a} (Xα ) ∧ T{−a} (Xα ).
(6.9)
The following theorem generalises the Laplace transform formula (6.2) and the expansion (6.5). Theorem 6.1 Suppose that 1 < α ≤ 2. Let a ∈ R. Then E e−qT{a} (|Xα |) =
(α)
2uq (a)
(6.10) (α) (α) uq (0) + uq (2a) ∞ (1) (n) =2 (−1)n E e−q{T{a} (Xα )+T{2a} (Xα )+···+T{2a} (Xα )} n=0
(6.11) (1)
(n)
where Xα , . . . , Xα , . . . are independent copies of Xα . Proof. Applying Proposition 5.4 with x = 0 and b = −a, we obtain the first identity (6.10). Expanding the right hand side, we have n ∞ (α) 2u(α) (a) (2a) u q q E e−qT{a} (|Xα |) = (α) (−1)n . (6.12) (α) uq (0) n=0 uq (0) Using formula (5.25), we may rewrite the identity as E e
−qT{a} (|Xα |)
∞ n −qT{a} (Xα ) = 2E e (−1)n E e−qT{2a} (Xα ) . (6.13) n=0
This is nothing but the second identity (6.11).
220
K. Yano et al.
In the case of Brownian motion B = (B(t)) on one hand, we have law
T{a} (B) + T{2a} (B (1) ) + · · · + T{2a} (B (n) ) = T{(2n+1)a} (B)
(6.14)
where B (1) , . . . , B (n) are independent copies of B. In the case of symmetric α-stable process Xα = (Xα (t)) for 1 < α < 2 on the other hand, however, the law of the sum T{a} (Xα ) + T{2a} (Xα(1) ) + · · · + T{2a} (Xα(n) )
(6.15)
differs from that of T{(2n+1)a} (Xα ). In fact, we have the following theorem. Theorem 6.2 Suppose that 1 < α < 2. Let a ∈ R. Then, for any q > 0 and n ≥ 1, (1) (n) E e−q{T{a} (Xα )+T{2a} (Xα )+···+T{2a} (Xα )} < E e−qT{(2n+1)a} (Xα ) . (6.16) Proof. Set Dn = E e−qT{(2n+1)a} (Xα ) − E e−qT{(2n−1)a} (Xα ) E e−qT{2a} (Xα ) . (6.17) Then it suffices to prove that Dn > 0 for all n ≥ 1. Keep the notations in the proof of Proposition 5.4. Note that Dn = ϕq0→(2n+1)a − ϕq0→(2n−1)a ϕq0→2a .
(6.18)
Using formula (5.41) and translation invariance, we have ϕq0→(2n+1)a =ϕq0→(2n+1)a≺(2n−1)a + ϕq0→(2n−1)a≺(2n+1)a ϕq(2n−1)a→(2n+1)a (6.19) =ϕq0→(2n+1)a≺(2n−1)a + ϕq0→(2n−1)a≺(2n+1)a ϕq0→2a .
(6.20)
Using formula (5.41), translation invariance, and the symmetry, we have ϕq0→(2n−1)a = ϕq0→(2n−1)a≺(2n+1)a + ϕq0→(2n+1)a≺(2n−1)a ϕq0→2a . Hence we obtain
2 Dn = ϕq0→(2n+1)a≺(2n−1)a 1 − (ϕq0→2a ) ,
(6.21)
(6.22)
which turns out to be positive because both ϕq0→(2n+1)a≺(2n−1)a and ϕq0→2a are positive and less than 1. Now the proof is complete. Remark 6.3 The consistency of the two formulae (6.18) and (6.22) can be confirmed by formulae (5.25) and (5.38) as follows:
First Hitting Times of Points
221
2 ϕq0→(2n+1)a≺(2n−1)a 1 − (ϕq0→2a )
(6.23) ⎧ 2 ⎫ (α) (α) (α) (α) (α) uq (0)uq ((2n + 1)a) − uq (2a)uq ((2n − 1)a) ⎨ uq (2a) ⎬ = · 1 − (α) (α) (α) ⎩ ⎭ {uq (0)}2 − {uq (2a)}2 uq (0) (6.24) (α) (α) uq ((2n + 1)a) uq ((2n − 1)a) − = (α) (α) uq (0) uq (0) =ϕq0→(2n+1)a − ϕq0→(2n−1)a ϕq0→2a .
·
(α) uq (2a) (α) uq (0)
(6.25) (6.26)
6.3 The Laplace Transforms of G{a} (|Xα |) and Ξ{a} (|Xα |) Since |Xα | = (|Xα (t)| : t ≥ 0) is a strong Markov process, the arguments of Section 3.2 are valid for X = |Xα |. Let us compute the Laplace transforms of G{a} (|Xα |) and Ξ{a} (|Xα |). Theorem 6.4 Suppose that 1 < α ≤ 2. Let a > 0. Then it holds that E e−qG{a} (|Xα |) =
(α)
2Vq (α) {uq (0)
+
(a)
(α) uq (2a)}{4h(α) (a)
− h(α) (2a)}
(6.27)
and that u(α) (a){4h(α) (a) − h(α) (2a)} q E e−qΞ{a} (|Xα |) = (α) Vq (a)
(6.28)
2 (α) (α) (α) 2 Vq(α) (a) := {u(α) q (0)} + uq (0)uq (2a) − 2{uq (a)} .
(6.29)
where
For the proof of Theorem 6.4, we need a certain Laplace transform formula for the first hitting time of three points. Avoiding unnecessary generality, we are satisfied with the following special case: Proposition 6.5 Suppose that 1 < α ≤ 2. Let x, a ∈ R. Then x ϕqx→0,a,−a :=E e−qT{0,a,−a} (Xα )
(6.30)
q q q =C0≺a,−a uq (x) + Ca≺0,−a uq (x − a) + C−a≺0,a uq (x + a) (6.31)
where (α)
q = C0≺a,−a
(α)
(α)
uq (0) + uq (2a) − 2uq (a) (α)
Vq
(6.32)
(a)
and (α)
q q = C−a≺0,a = Ca≺0,−a
(α)
uq (0) − uq (a) (α)
Vq
(a)
.
(6.33)
222
K. Yano et al.
The proof of Proposition 6.5 is similar to that of Proposition 5.4 based on identity (5.31) with F = {0, a, −a}, so we omit it. Proposition 6.6 Suppose that 1 < α ≤ 2. Let x, a ∈ R with a = 0. Then x ϕqx→a,−a≺0 :=E e−qT{a,−a} (Xα ) ; T{a,−a} (Xαx ) < T{0} (Xαx ) (6.34) ϕqx→a,−a − ϕq0→a,−a ϕqx→0,a,−a (6.35) 1 − ϕq0→a,−a (α) (α) (α) (α) (α) uq (0) uq (x − a) + uq (x + a) − 2uq (a)uq (x) = . (6.36) (α) Vq (a)
=
The proof of Proposition 6.6 is similar to that of Proposition 5.5, so we omit it. Let m(α) denote Itˆo’s measure for |Xα | corresponding to the local time satisfying (4.12). The following proposition is crucial to the proof of Theorem 6.4. Proposition 6.7 Suppose that 1 < α ≤ 2. Let a > 0 and q, r > 0. Then u(α) (a) 2u(α) (a) r q m(α) e−qT{a} −r(ζ−T{a} ) ; T{a} < ζ = (α) · (α) . ur (0) Vq (a)
(6.37)
Consequently, it holds that 2u(α) q (a) m(α) e−qT{a} ; T{a} < ζ = (α) Vq (a)
(6.38)
and that m(α) (T{a} < ζ) =
4h(α) (a)
2 . − h(α) (2a)
Proof of Proposition 6.7. By definitions of n(α) and m(α) , we have m(α) e−qT{a} −r(ζ−T{a} ) ; T{a} < ζ =n(α) e−qT{a,−a} −r(ζ−T{a,−a} ) ; T{a,−a} < ζ . Let ε > 0. Then we have n(α) e−qT{a,−a} −r(ζ−T{a,−a} ) ; ε < T{a,−a} < ζ
=e−qε n(α) ϕqx→a,−a≺0 x=X(ε) · ϕra→0 ; ε < T{a,−a} ∧ ζ q ϕx→a,−a≺0 −qε r h(α) ; ε < T{a,−a} . =e ϕa→0 E h(α) (x) x=X(ε)
(6.39)
(6.40)
(6.41) (6.42) (6.43)
First Hitting Times of Points
223
Here we used Theorem 4.2. Noting that, by Theorem 4.4, we have (α)
(α)
(α)
uq (a − x) + uq (a + x) − 2uq (a) =0 x→0 h(α) (x) lim
(6.44)
in whichever case where 1 < α < 2 or α = 2. Hence, we use Proposition 6.6 and obtain (α) ϕqx→a,−a≺0 2uq (a) . = (α) x→0 h(α) (x) Vq (a)
(6.45)
lim
Thus, letting ε → 0+ in formula (6.43), we obtain (6.37) by dominated convergence. Letting r → 0+ in formula (6.37), we obtain (6.38). Noting that (α) (α) (α) (α) Vq(α) (a) = 2 u(α) (6.46) q (0) + uq (a) hq (a) − uq (0)hq (2a), we have (α)
lim
q→0+
2uq (a) (α) Vq (a)
=
2 . 4h(α) (a) − h(α) (2a)
(6.47)
Hence, by letting q → 0+ in formula (6.38), we obtain (6.39). Now the proof is complete. The proof of Theorem 6.4 is now completely parallel to that of Theorem 5.8; thus we omit it.
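The algebraic rewriting (6.46) of V_q^{(α)}(a) is easy to verify numerically in the explicit case α = 2. The sketch below is ours, with arbitrary parameter values.

```python
import math

q, a = 0.5, 0.9
sq = math.sqrt(q)
u = lambda y: math.exp(-sq*abs(y))/(2*sq)     # u_q^{(2)} from (4.18)
hq = lambda y: u(0.0) - u(y)                  # h_q^{(2)} from (4.5)
V = u(0.0)**2 + u(0.0)*u(2*a) - 2*u(a)**2            # (6.29)
V_alt = 2*(u(0.0) + u(a))*hq(a) - u(0.0)*hq(2*a)     # (6.46)
print(V, V_alt)
```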
7 Appendix: Computation of the Constant h^{(α)}(1)

Proposition 7.1 For 1 < α < 3, it holds that
\[ \frac{1}{\pi}\int_0^\infty \frac{1-\cos x}{x^{\alpha}}\,dx = \frac{1}{2\Gamma(\alpha)\sin\frac{\pi(\alpha-1)}{2}}. \tag{7.1} \]
As a check, formula (7.1) in the case α = 2 is equivalent, via integration by parts, to the well-known formula
\[ \int_0^\infty \frac{\sin x}{x}\,dx = \frac{\pi}{2}. \tag{7.2} \]
The proof of Proposition 7.1 can be found in Feller [14, XVII.3 (g)], but we give it for the convenience of the reader.

Proof. We start with the identity
\[ \int_0^\infty x^{\gamma-1} e^{-zx}\,dx = \Gamma(\gamma)\,z^{-\gamma} \tag{7.3} \]
for γ > 0 and Re z > 0. For 0 < α < 1, ε > 0 and λ ∈ R, we set γ = 1 − α and z = ε − iλ. Then we obtain ∞ iλx −εx e e dx = Γ (1 − α)(ε − iλ)α−1 . (7.4) xα 0 Using the identity Γ (2 − α) = (1 − α)Γ (1 − α) and subtracting (7.4) for λ = 0 from (7.4) for λ = λ, we obtain ∞ (1 − eiλx )e−εx εα−1 − (ε − iλ)α−1 . (7.5) dx = Γ (2 − α) · xα 1−α 0 Rewriting the right hand side, we obtain ∞ ε−iλ (1 − eiλx )e−εx dx = Γ (2 − α) z α−2 dz xα 0 ε
(7.6)
where integration on the right hand side is taken over a segment from {ε − il : l ∈ R}. Since both sides of (7.6) are analytic on 0 < Re α < 2, we see, by analytic continuation, that identity (7.6) remains true for 0 < α < 2. Let us restrict ourselves to the case when 1 < α < 2. Taking the limit ε → 0+ on both sides of identity (7.6), we obtain ∞ −iλ 1 − eiλx dx =Γ (2 − α) z α−2 dz (7.7) xα 0 0 (−iλ)α−1 (7.8) =Γ (2 − α) · α−1 where the branch of f (w) = wα−1 is chosen so that f (1) = 1. Hence we obtain ∞ 1 − eiλx λα−1 − π(α−1)i 2 e dx =Γ (2 − α) · . (7.9) α x α−1 0 Taking real parts on both sides, we obtain ∞ π(α − 1) 1 − cos λx λα−1 cos . dx =Γ (2 − α) · α x α − 1 2 0 Letting λ = 1, we obtain 1 ∞ 1 − cos x π(α − 1) Γ (2 − α) · cos . dx = π 0 xα π(α − 1) 2
(7.10)
(7.11)
(We may find formula (7.11) also in [21, pp.88].) By a simple computation, we have Γ (2 − α) sin π(α − 1) (RHS of (7.11)) = · (7.12) π(α − 1) 2 sin π(α−1) 2
1 1 · = π(α−1) (α − 1)Γ (α − 1) 2 sin 2 1 = . 2Γ (α) sin π(α−1) 2
(7.13) (7.14)
First Hitting Times of Points
225
Hence we have proved identity (7.1) when 1 < α < 2. By analytic continuation, identity (7.1) is proved to be valid also when 2 ≤ α < 3. Therefore the proof is complete.
References 1. O. Barndorff-Nielsen, J. Kent, and M. Sørensen. Normal variance-mean mixtures and z distributions. Internat. Statist. Rev., 50(2):145–159, 1982. 2. J. Bertoin. L´evy processes, volume 121 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge, 1996. 3. J. Bertoin, T. Fujita, B. Roynette, and M. Yor. On a particular class of selfdecomposable random variables: the durations of Bessel excursions straddling independent exponential times. Probab. Math. Statist., 26(2):315–366, 2006. 4. R. M. Blumenthal and R. K. Getoor. Markov processes and potential theory. Pure and Applied Mathematics, Vol. 29. Academic Press, New York, 1968. 5. R. M. Blumenthal, R. K. Getoor, and D. B. Ray. On the distribution of first hits for the symmetric stable processes. Trans. Amer. Math. Soc., 99:540–554, 1961. 6. L. Bondesson. On the infinite divisibility of the half-Cauchy and other decreasing densities and probability functions on the nonnegative line. Scand. Actuar. J., (3-4):225–247, 1987. 7. L. Bondesson. Generalized gamma convolutions and related classes of distributions and densities, volume 76 of Lecture Notes in Statistics. Springer-Verlag, New York, 1992. 8. P. Bourgade, T. Fujita, and M. Yor. Euler’s formulae for ζ(2n) and products of Cauchy variables. Electron. Comm. Probab., 12:73–80 (electronic), 2007. 9. J. Bretagnolle. R´esultats de Kesten sur les processus ` a accroissements ind´ependants. In S´eminaire de Probabilit´es, V (Univ. Strasbourg, ann´ee universitaire 1969-1970), pages 21–36. Lecture Notes in Math., Vol. 191. Springer, Berlin, 1971. 10. P. Carmona, F. Petit, and M. Yor. On the distribution and asymptotic results for exponential functionals of L´evy processes. In Exponential functionals and principal values related to Brownian motion, Bibl. Rev. Mat. Iberoamericana, pages 73–130. Rev. Mat. Iberoamericana, Madrid, 1997. 11. L. Chaumont and M. Yor. 
Lévy Systems and Time Changes

P.J. Fitzsimmons and R.K. Getoor

Department of Mathematics, 0112; University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093–0112, USA
e-mail: pfi[email protected]
Summary. The Lévy system for a Markov process X provides a convenient description of the distribution of the totally inaccessible jumps of the process. We examine the effect of time change (by the inverse of a not necessarily strictly increasing CAF A) on the Lévy system, in a general context. The key to our time-change theorem is a study of the "irregular" exits from the fine support of A that occur at totally inaccessible times. This permits the construction of a partial predictable exit system (à la Maisonneuve). The second part of the paper is devoted to some implications of the preceding in a (weak, moderate Markov) duality setting. Fixing an excessive measure m (to serve as duality measure) we obtain formulas relating the "killing" and "jump" measures for the time-changed process to the analogous objects for the original process. These formulas extend, to a very general context, recent work of Chen, Fukushima, and Ying. The key to our development is the Kuznetsov process associated with X and m, and the associated moderate Markov dual process X̂. Using X̂ and some excursion theory, we exhibit a general method for constructing excessive measures for X from excessive measures for the time-changed process.
Key words and phrases: Lévy system, exit system, time change, Markov process, continuous additive functional, excessive measure, Kuznetsov process.

2000 Mathematics Subject Classification. Primary: 60J55; Secondary: 60J40.
1 Introduction

Let X = (X_t, P^x) be a right Markov process with state space E. The Lévy system of X describes the intensity with which X makes totally inaccessible jumps of specified types. It consists of a continuous additive functional H and a kernel N on (E, E) such that

t ↦ Σ_{s≤t} Φ(X^r_{s−}, X_s) − ∫_0^t N(X_s, Φ) dH_s   (1.1)
C. Donati-Martin et al. (eds.), Séminaire de Probabilités XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6_9, © Springer-Verlag Berlin Heidelberg 2009
is a P^x-martingale for each x ∈ E, provided P^x ∫_0^t N(X_s, |Φ|) dH_s < ∞ for each t > 0. In (1.1), X^r_{s−} is the left limit of X at time s > 0 taken in a suitable Ray topology on E, Φ is a product measurable function on E × E with Φ(x, x) = 0 for all x ∈ E, and N(x, Φ) := ∫_E N(x, dy) Φ(x, y). Intuitively, the rate at which jumps from x ∈ E to Λ ∈ E occur, relative to the clock H, is N(x, Λ). The notion of a Lévy system, which is a far-reaching generalization of the Itô–Lévy description of the jumps of a Lévy process, is due to S. Watanabe [28]. He constructed Lévy systems for Hunt processes satisfying Meyer's hypothesis (L) (= the existence of a reference measure). Lévy systems for general right (and Ray) processes, without (L), were constructed by Benveniste and Jacod in [1].

Suppose that in addition to the right process X we have a CAF A with right continuous inverse τ. Let F denote the fine support of A; thus A increases when (and only when) X is in F. It is well known that the time-changed process X̃_t := X_{τ(t)}, t ≥ 0, is a right process with state space F. Our goal in this paper is to express the Lévy system (Ñ, H̃) of X̃ in terms of (N, H) and an exit system (P•_pr, C); the latter describes the relevant ways in which X exits the fine support F. The need for this second ingredient stems from the fact that A need not be strictly increasing: some of the totally inaccessible jumps of X̃ correspond to totally inaccessible jumps of X, while others are generated by the excursions of X from F. The first work in this direction of which we are aware is that of H. Gzyl [17]. The recent work of Chen, Fukushima, and Ying [3, 4] has been the direct inspiration for the present study. The effect of time change on symmetric Markov processes is considered in [3] (see also [21] for the case of "nearly symmetric" Hunt processes, and [14] for symmetric diffusions). The same issues are examined in [4] for a standard process in weak duality with a second standard process, under the condition that semipolar sets are m-polar (m being the duality measure). In trying to understand [4], we came to realize that neither duality nor a restriction on semipolar sets was crucial for the discussion. Rather, the key seemed to lie in coming to grips with the "irregular" exits of X from F that occur at totally inaccessible times.

In section 2 we describe the hypotheses that will be in force throughout the paper, and we recall the basic facts about exit systems and Lévy systems. In section 3, following Maisonneuve [22], we investigate the notion of predictable exit system, and construct a partial predictable exit system that is sufficient for our purposes. We also provide, in Theorem 3.4, several conditions equivalent to the existence of a complete predictable exit system. In a short section 4 we recall the definition of the time-changed process X̃ induced by a continuous additive functional A of the basic process X. Section 5 contains one of the main results of the paper. Namely, the Lévy system of X̃ is expressed in terms of the Lévy system of X and the partial predictable exit system describing the excursions of X away from the fine support F of A; see Theorem 5.2. In section 6 we assume the existence of an excessive measure m, and recall several constructs depending on m: in particular, the Kuznetsov measure Q_m
and associated processes Y and Y*. We introduce the "jump measure" J and the "killing measure" K associated with X and m, and we express them in terms of the Lévy system of X. Following [4] we then define the Feller measure Λ and the supplementary Feller measure δ associated with excursions from F. Formulas (6.24) and (6.25) relate Λ and δ explicitly to the partial predictable exit system. The main result of this section, Theorem 6.5, gives formulas for the jump measure J̃ and the killing measure K̃ of X̃ in terms of J, K, Λ, and δ. Section 7 introduces the left-continuous moderate Markov process X̂ in weak duality with X relative to m. Using this duality we extend some of the results about excursions from a regular point presented in [12] to excursions from a finely perfect nearly Borel set F for which a predictable exit system exists.

We close this introduction with a few words on notation. If (F, F, μ) is a measure space, then bF (resp. pF) denotes the class of bounded real-valued (resp. [0, ∞]-valued) F-measurable functions on F. For f ∈ pF we may use μ(f) or ⟨μ, f⟩ to denote the integral ∫_F f dμ; similarly, if D ∈ F then μ(f; D) denotes ∫_D f dμ. On the other hand f μ denotes the measure f(x)μ(dx) and μ|_D the restriction of μ to D. We write F* for the universal completion of F; that is F* = ∩_ν F^ν, where F^ν is the ν-completion of F and the intersection runs over all finite measures ν on (F, F). If (E, E) is a second measurable space and K = K(x, dy) is a kernel from (F, F) to (E, E) (i.e., x ↦ K(x, A) is F-measurable for each A ∈ E and K(x, ·) is a measure on (E, E) for each x ∈ F), then we write μK for the measure A ↦ ∫_F μ(dx) K(x, A) and Kf for the function x ↦ ∫_E K(x, dy) f(y). We shall use B to denote the Borel subsets of the real line R. If T is a stopping time, then ⟦T⟧ denotes the graph {(ω, t) ∈ Ω × [0, ∞[ : t = T(ω)}.
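For orientation, the simplest instance of a Lévy system is classical; the following example is a standard point of reference and is our addition, not a statement from the paper.

```latex
% Standard example: X a Levy process on R^d with Levy measure \nu.
% One may take
%     H_t = t,   N(x,\Lambda) = \nu(\Lambda - x),
% and (1.1) is then the classical compensation formula:
\[
  \sum_{s\le t}\Phi(X_{s-},X_s)
  \;-\;\int_0^t\!\!\int_{\mathbf{R}^d}\Phi\bigl(X_s,\,X_s+y\bigr)\,\nu(dy)\,ds
  \qquad\text{is a martingale,}
\]
% provided the expectation of the absolute integral is finite.
```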
2 Preliminaries

Throughout the paper X = (Ω, F, F_t, θ_t, X_t, P^x) will denote the canonical realization of a Borel right Markov process with state space (E, E). We shall use the standard notation for Markov processes as found, for example, in [2], [15], [7] and [26]. In short, X is a strong Markov process with right continuous sample paths, the state space E (with Borel σ-field E) is homeomorphic to a Borel subset of a compact metric space, and the transition semigroup (P_t)_{t≥0} of X preserves the class bE of bounded E-measurable functions. It follows that the resolvent operators U^q := ∫_0^∞ e^{−qt} P_t dt, q ≥ 0, also preserve Borel measurability. We shall write U for U^0. We allow the transition semigroup (P_t) to be subMarkovian: P_t 1(x) ≤ 1 for all x ∈ E and all t ≥ 0. To allow for the possibility P_t 1_E(x) < 1, an absorbing cemetery state Δ is adjoined to E as an isolated point, and the process is sent to Δ at its lifetime ζ. Thus X takes values in E_Δ := E ∪ {Δ} (endowed with the σ-field E_Δ := E ∨ {Δ}); until section 6, the cemetery state will play no special role.
We write E^e for the σ-algebra on E generated by the 1-excessive functions. Because the semigroup (P_t) is Borel, all 1-excessive functions are nearly Borel measurable; consequently, E^e is contained in the σ-algebra of nearly Borel sets.

One of our concerns will be the excursions induced by a CAF A; to this end we recall Maisonneuve's notion of optional exit system. The related notion of predictable exit system will be discussed in section 3. It is known that the stopping time τ(0) = inf{t : A_t > 0} is equal a.s. to the hitting time T_F := inf{t > 0 : X_t ∈ F} of the fine support F of A, defined by

F := {x ∈ E : P^x[τ(0) = 0] = 1}.   (2.1)
The set F is E^e-measurable and finely perfect in the sense that F = F^r, where F^r := {x ∈ E : P^x[T_F = 0] = 1} denotes the set of points regular for F. Consequently, the optional set {X ∈ F} := {(ω, t) : X_t(ω) ∈ F} has ω-sections that are right closed and without isolated points, almost surely. Let M be the (optional) subset of Ω × [0, ∞[ with ω-section M(ω) equal to the closure in [0, ∞[ of the visiting set {t ≥ 0 : X_t(ω) ∈ F}, for each ω ∈ Ω. The complement of M(ω) comprises a countable union of disjoint open intervals. We write G(ω) for the collection of strictly positive left endpoints of these "contiguous intervals". The associated random set G is progressively measurable, but not in general optional. More precisely, the "regular" part of G, given by

G^r := {(ω, s) ∈ G : X_s(ω) ∈ F},   (2.2)

has evanescent intersection with the graph of any stopping time, while the "irregular" part

G^i := {(ω, s) ∈ G : X_s(ω) ∉ F}   (2.3)

is a countable union of graphs of stopping times. According to Maisonneuve [22] there is an optional exit system consisting of an AF B with bounded 1-potential, and a kernel P•_op from (E_Δ, E*_Δ) to (Ω, F*) such that

P^x Σ_{s∈G} Z_s Φ∘θ_s = P^x ∫_0^∞ Z_s P^{X_s}_op[Φ] dB_s,   (2.4)

for all optional Z ≥ 0, Φ ∈ pF*, and x ∈ E_Δ. (P^Δ_op is the point mass at the dead path [Δ].) We can (and do) take the continuous part B^c of B to be the dual predictable projection of the raw AF

t ↦ Σ_{s≤t, s∈G^r} [1 − exp(−T_F)]∘θ_s + ∫_0^t 1_F(X_s) ds,

and the discontinuous part of B to be

B^d_t := Σ_{s≤t, s∈G^i} P^{X_s}[1 − exp(−T_F)].   (2.5)
Notice that B^c grows only when X is in F. In view of Motoo's theorem [26, (66.2)], there exists ℓ ∈ pE^e such that

∫_0^t 1_F(X_s) ds = ∫_0^t ℓ(X_s) dB^c_s = ∫_0^t ℓ(X_s) dB_s.   (2.6)

The second equality in (2.6) holds because we can (and do) take ℓ + P•_op[1 − e^{−T_F}] = 1 on E_Δ and ℓ = 0 on E_Δ \ F. Moreover,

P^x_op[Φ] = P^x[Φ] / P^x[1 − e^{−T_F}],   ∀x ∈ E_Δ \ F.

This choice of (P•_op, B) having been made, we have

U^1_B 1_{E_Δ}(x) = P^x[exp(−T_F)],   ∀x ∈ E_Δ.   (2.7)
A second key ingredient in our development is the Lévy system describing the totally inaccessible jumps of X. Recall that a stopping time T is totally inaccessible if P^x[T = S] = 0 for all x and all predictable S. Let (X^r_{t−})_{t>0} denote the left limit process of X, the limits being taken in some Ray–Knight compactification Ē of E_Δ; see [26, §17–18]. The set

J := {(ω, t) : X^r_{t−}(ω) ∈ E_Δ, X^r_{t−}(ω) ≠ X_t(ω)}   (2.8)

is the union ∪_n ⟦T_n⟧ of the graphs of a sequence of totally inaccessible stopping times. Indeed, a stopping time T is totally inaccessible if and only if ⟦T⟧ ⊂ J up to evanescence; see [26, (44.5)]. Also, if we write X_{t−} for the left limit of X at time t > 0 (in the original topology of E) whenever it exists, then

J ⊂ {(ω, t) : X^r_{t−}(ω) = X_{t−}(ω)}   (2.9)

up to evanescence; see [26, (46.3)]. The Lévy system consists of a kernel N_Δ from (E, E) to (E_Δ, E_Δ) such that N_Δ(x, {x}) = 0 for all x ∈ E, and a CAF H, such that

P^x Σ_{s∈J} Z_s Ψ(X_{s−}, X_s) = P^x ∫_0^∞ Z_s ∫_{E_Δ} Ψ(X_s, y) N_Δ(X_s, dy) dH_s,   (2.10)

for all predictable Z ≥ 0, Ψ ∈ p(E ⊗ E_Δ), and x ∈ E_Δ. We will often write N_Δ(x, Ψ) for ∫_{E_Δ} Ψ(x, y) N_Δ(x, dy); with this notation the right side of (2.10) collapses to P^x ∫_0^∞ Z_s N_Δ(X_s, Ψ) dH_s. Because X cannot jump out of Δ, we can (and do) assume that H_t = H_ζ for all t > ζ. Because H is a (finite) CAF, there is a strictly positive function g ∈ E^e such that sup_x P^x ∫_0^∞ e^{−t} g(X_t) dH_t < ∞. Therefore, at the cost of replacing H_t by ∫_0^t g(X_s) dH_s and N_Δ(x, dy) by g(x)^{−1} N_Δ(x, dy), we can arrange for H to have a bounded 1-potential.
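The compensation identity (2.10) can be checked numerically in the simplest special case. The sketch below is our own illustration, not code from the paper: for a compound Poisson process with jump rate lam and jump law mu, a Lévy system is given by H_t = t and N(x, dy) = lam·mu(dy − x), so the compensated jump sum is a mean-zero martingale.

```python
import random

# Hedged illustration (ours, not the authors'): compound Poisson process,
# jump rate lam, jumps Y ~ Uniform[0, 1], and Phi(x, y) = (y - x)^2.
# Then N(x, Phi) = lam * E[Y^2] = lam/3, so by (2.10) the process
#   sum_{s <= t} Phi(X_{s-}, X_s)  -  lam * t / 3
# has mean zero.  We check the mean at t = 1 by Monte Carlo.

def compensated_sum(t, lam, rng):
    """One sample of the compensated jump sum up to time t."""
    s, total = 0.0, 0.0
    while True:
        s += rng.expovariate(lam)          # next jump time
        if s > t:
            break
        y = rng.random()                   # jump size ~ Uniform[0, 1]
        total += y * y                     # Phi(x, x + y) = y^2
    return total - lam * t * (1.0 / 3.0)   # subtract the compensator

rng = random.Random(0)
n = 20000
mean = sum(compensated_sum(1.0, 2.0, rng) for _ in range(n)) / n
print(abs(mean) < 0.05)                    # martingale property: mean near 0
```

The tolerance 0.05 is roughly ten standard errors for this sample size, so the check is insensitive to the seed.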
3 Predictable Exit System

In section 9 of [22], Maisonneuve constructed a predictable exit system, assuming condition (iii) in Lemma 3.1 below. In what follows we shall refer to this as a complete predictable exit system. Our purposes are served by a modified construction, yielding a partial predictable exit system (P•_pr, C) that is more broadly applicable to the problem of Lévy systems and time changes. Before proceeding to the construction we introduce a supplementary hypothesis under which the partial exit system becomes a complete predictable exit system.

The set G^i, defined in (2.3), of irregular left endpoints of the intervals contiguous to M can be expressed as the disjoint union ∪_n ⟦T_n⟧ of graphs of stopping times. It turns out that only G^i ∩ J is germane to the present study. The following result captures the situation in which all of the irregular exits from F occur at totally inaccessible times. To state it define

D_t := inf{s > t : X_s ∈ F} = t + T_F∘θ_t,   t ≥ 0.

The process D is increasing and right continuous, and M \ G = {(ω, t) : D_t(ω) = t}.

Lemma 3.1 The following conditions are equivalent:
(i) G^i ⊂ J;
(ii) the dual predictable projection of the AF B^d (defined in (2.5)) is continuous;
(iii) the 1-potential φ_1 : x ↦ P^x[exp(−T_F)] is regular;
(iv) the process t ↦ D_t is quasi-left continuous.

Proof. (i)⇒(ii). Let B^{d,p} denote the dual predictable projection of B^d. If T is a predictable time, then

P^x[B^{d,p}_T − B^{d,p}_{T−}; 0 < T < ∞] = P^x ∫_{]0,∞[} 1_{⟦T⟧}(s) dB^{d,p}_s = P^x ∫_{]0,∞[} 1_{⟦T⟧}(s) dB^d_s = P^x[1 − φ_1(X_T); T ∈ G^i] = 0,

for all x ∈ E_Δ.

(ii)⇒(iii). The process t ↦ e^{−t} φ_1(X_t) is a positive right-continuous supermartingale. Indeed, from (2.7),

e^{−t} φ_1(X_t) = P^x[ ∫_{]t,∞[} e^{−s} dB_s | F_t ] = P^x[ ∫_{]t,∞[} e^{−s} dB^p_s | F_t ],   ∀x ∈ E_Δ,   (3.1)
where B is the AF component of the optional exit system for F and B^p is the dual predictable projection of B. The hypothesis (ii) implies that B^p is continuous, which in turn implies that φ_1 is regular, because of (3.1).

(iii)⇒(iv). Let (T_n) be an increasing sequence of stopping times with limit T, and set Υ := ↑lim_n D_{T_n} ≤ D_T. Then, for x ∈ E_Δ,

P^x[exp(−Υ)] = lim_n P^x[exp(−D_{T_n})] = lim_n P^x[e^{−T_n} φ_1(X_{T_n})] = P^x[e^{−T} φ_1(X_T)] = P^x[exp(−D_T)],

the third equality resulting from the assumed regularity of φ_1. It follows that Υ = D_T almost surely.

(iv)⇒(i). Let T be a stopping time with ⟦T⟧ ⊂ G^i. Then, on {T < ∞}, we have 0 < T = D_{T−} < D_T. The quasi-left-continuity of D now implies that T is totally inaccessible. Thus, by [26, (44.5)], ⟦T⟧ ⊂ J. □

Remark 3.2 In view of [26, (46.2)],

G^0 := G^r ∪ (G^i ∩ J) ⊂ {(ω, t) : X^r_{t−}(ω) = X_{t−}(ω) ∈ E_Δ}   (3.2)
up to evanescence. This observation will be used several times in the sequel. Define

B^J_t := Σ_{s≤t, s∈G^i∩J} P^{X_s}[1 − exp(−T_F)],   t ≥ 0.   (3.3)
As preparation for the construction of a (partial) predictable exit system, we express the dual predictable projection of B^J in terms of the Lévy system.

Lemma 3.3 We have, for predictable Z ≥ 0, Ψ ∈ pF*, and x ∈ E_Δ,

P^x Σ_{t∈G^i∩J} Z_t Ψ∘θ_t = P^x ∫_0^∞ 1_F(X_t) Z_t ∫_{F^c} N_Δ(X_t, dy) P^y[Ψ] dH_t.   (3.4)
Proof. Define I_t := lim sup_{s↑↑t} 1_F(X_s); this is a predictable process by the discussion on pp. 202–203 of [26] or by [6, T-IV90(a)]. Moreover, G^i ∩ J = J ∩ {(ω, t) : I_t(ω) = 1, X_t(ω) ∈ F^c}, up to evanescence. Therefore

P• Σ_{t∈G^i∩J} Z_t Ψ∘θ_t = P• Σ_{t∈G^i∩J} Z_t P^{X_t}[Ψ]
  = P• Σ_{t∈J} I_t Z_t 1_{F^c}(X_t) P^{X_t}[Ψ]
  = P• ∫_0^∞ I_t Z_t ∫_{F^c} N_Δ(X_t, dy) P^y[Ψ] dH_t   (3.5)
  = P• ∫_0^∞ 1_F(X_t) Z_t ∫_{F^c} N_Δ(X_t, dy) P^y[Ψ] dH_t.
We have used the fact that {(ω, t) : t > 0, I_t(ω) = 1, X_t(ω) ∉ F} ⊂ G, which implies that the sets {t : I_t = 1} and {t : X_t ∈ F} differ by at most a countable set, almost surely. This difference is not charged by H. □

Define ψ(x) := P^x[1 − exp(−T_F)] and then take Ψ = ψ(X_0) in (3.5) to see that the dual predictable projection of B^J is

∫_0^t 1_F(X_s) N_Δ(X_s, ψ) dH_s,   t ≥ 0.   (3.6)
Accordingly we define a CAF C by

C_t := B^c_t + ∫_0^t 1_F(X_s) N_Δ(X_s, ψ) dH_s,   t ≥ 0,   (3.7)
noting that the 1-potential of C is

U^1_C 1_{E_Δ}(x) = P^x ∫_0^∞ e^{−t} dC_t ≤ P^x ∫_0^∞ e^{−t} dB_t = P^x[exp(−T_F)],   (3.8)
for all x ∈ E_Δ. By Motoo's theorem there are positive E^e-measurable functions b and h such that

B^c_t = ∫_0^t b(X_s) dC_s   and   ∫_0^t 1_F(X_s) N_Δ(X_s, ψ) dH_s = ∫_0^t h(X_s) dC_s.   (3.9)

We may suppose that b + h = 1 on F and that b = h = 0 on E_Δ \ F. Finally, define

P^x_pr[Φ] := b(x) · P^x_op[Φ] + h(x) 1_F(x) · ( ∫_{F^c} N_Δ(x, dy) P^y[Φ] ) / N_Δ(x, ψ),   (3.10)

the ratio on the right being taken to be 0 when the denominator vanishes. Notice that

∫_0^t 1_F(X_s) ds = ∫_0^t γ(X_s) dC_s,   ∀t ≥ 0,   (3.11)

where γ := ℓ · b.

Theorem 3.4 (a) The pair (P•_pr, C) is a partial predictable exit system for F, in the sense that

P• Σ_{t∈G^0} Z_t Ψ∘θ_t = P• ∫_0^∞ Z_t P^{X_t}_pr[Ψ] dC_t,   (3.12)
for all predictable Z ≥ 0 and Ψ ∈ pF*, and C is a CAF with fine support contained in F.
(b) Suppose that G^i ⊂ J. Then (P•_pr, C) is a (complete) predictable exit system for F, in the sense that the fine support of C is all of F, and (3.12) holds with G^0 replaced by G.
(c) Conversely, if there is a complete predictable exit system for F, then the conditions listed in Lemma 3.1 hold.
237
(3.13)
0
t∈Gi ∩J
where Px0 [Ψ ]
:=
Fc
NΔ (x, dy)Py [Ψ ] , NΔ (x, ψ)
with the understanding that the ratio vanishes when the denominator is zero. Combining this with (2.4), (3.9), and (3.10), we obtain (3.12). It follows from (3.7) that the fine support of C is contained in F.

(b) Suppose that G^i ⊂ J. Clearly G = G^0 in this case. To see that C has fine support equal to F, we observe that the inequality in (3.8) becomes an equality, so P^x[exp(−T_F)] = U^1_C 1_{E_Δ}(x) for all x. Let R_C := inf{t : C_t > 0}. Clearly R_C ≥ T_F because C is carried by F. On the other hand

P^x[exp(−T_F)] = U^1_C 1_{E_Δ}(x) = P^x ∫_0^∞ e^{−t} dC_t = P^x ∫_{R_C}^∞ e^{−t} dC_t = P^x[ exp(−R_C) U^1_C 1_{E_Δ}(X_{R_C}) ] ≤ P^x[exp(−R_C)],   (3.14)

because U^1_C 1_{E_Δ} ≤ 1_{E_Δ}. Together with the previously noted inequality R_C ≥ T_F, (3.14) implies that R_C = T_F almost surely.

(c) Suppose, conversely, that (P•_pr, C) is a predictable exit system for F; that is, (3.12) holds with G^0 replaced by G. One readily checks that φ_1 := P•[e^{−T_F}] is the 1-potential of the CAF t ↦ ∫_0^t 1_F(X_s) ds + ∫_0^t P^{X_s}_pr[1 − e^{−T_F}] dC_s, which implies that φ_1 is regular. □
4 Time Change

Recall from section 2 that A is a CAF of X with fine support F. Thus F is finely perfect and the closed visiting set M has ω-sections that are perfect (or empty) almost surely. Let τ = (τ(t))_{t≥0} denote the right-continuous inverse of A:

τ_t = τ(t) := inf{s : A_s > t},   t ≥ 0.   (4.1)

Then τ is strictly increasing (while finite), and as t varies the path t ↦ τ(t) traces out M \ G. As is well known the time-changed process X̃ defined by

X̃_t := X_{τ(t)},   t ≥ 0,   (4.2)
(with the convention X̃_∞ = Δ) is a right process with state space F, though X̃ need not be a Borel right process. We note in passing that if a nearly Borel set L ⊂ F is X-polar (that is, P^x[X_t ∈ L for some t > 0] = 0 for all x ∈ E) then L is also X̃-polar. Conversely, if L ⊂ F is X̃-polar, then L is X-semipolar. In fact, if L is X̃-polar, then the visiting set {(ω, t) : X_t(ω) ∈ L} is contained in the graph of T_F, up to evanescence. Indeed, since L ⊂ F, it is clear that {(ω, t) : X_t(ω) ∈ L} ⊂ ⟦T_F, ∞⟦. In view of the observation made at the end of the first paragraph of this section, the X̃-polarity of L implies that {(ω, t) : X_t(ω) ∈ L} ⊂ G. In particular, {t > 0 : X_t(ω) ∈ L} is countable, a.s. Thus, if we fix an initial distribution μ, then

P^μ Σ_{s∈G} 1_L(X_s)[1 − exp(−T_F∘θ_s)] = P^μ Σ_{s∈G^r} 1_L(X_s)[1 − exp(−T_F∘θ_s)] = P^μ ∫_0^∞ 1_L(X_s) P^{X_s}_op[1 − exp(−T_F)] dB^c_s = 0,

because B^c is continuous. To see that, in general, L need not be X-polar, consider the example of X equal to uniform motion to the right on R with F = [0, ∞[ and L = {0}.
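The uniform-motion example can be made concrete on a grid. The following sketch is our own numerical illustration of (4.1) and (4.2), not code from the paper.

```python
# Sketch (our illustration): X is uniform motion to the right on R,
# F = [0, oo[, and A_t = int_0^t 1_F(X_s) ds.  Starting at x < 0, the
# clock A only runs once X enters F, and the time-changed process is
# X_tilde_t = max(x, 0) + t; it never visits L = {0} at a strictly
# positive time, so L is polar for the time-changed process although it
# is not X-polar.

def tau(a_path, dt, t):
    """Right-continuous inverse  tau(t) = inf{s : A_s > t}  on a grid."""
    for i, val in enumerate(a_path):
        if val > t:
            return i * dt
    return float("inf")

dt, n, x = 0.001, 5000, -1.0
xs = [x + i * dt for i in range(n)]              # X_t = x + t
a = [0.0]
for v in xs[:-1]:                                # A_t = int 1_{[0,oo[}(X_s) ds
    a.append(a[-1] + (dt if v >= 0 else 0.0))

for t in (0.5, 1.0, 2.0):
    xtilde = x + tau(a, dt, t)                   # X_tilde_t = X_{tau(t)}
    assert abs(xtilde - (max(x, 0.0) + t)) < 0.01
print("ok")
```

The grid error is of order dt, which is why the comparison uses a 0.01 tolerance.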
5 Lévy System for X̃

In this section we give an explicit description of the Lévy system of X̃ in terms of the Lévy system of X and the partial predictable exit system for F. The key observation is contained in Lemma 5.1 below. Before coming to its statement and proof, it is necessary to introduce some notation. Let (F̃_t) denote the filtration of the time-changed process X̃; that is, the usual augmentation of σ{X̃_s : 0 ≤ s ≤ t}, t ≥ 0. Let ρ be a metric on E_Δ compatible with the Ray topology induced there by X. When viewed as a process with values in the metric space (E_Δ, ρ), X is a right process; consequently, X̃ is a right process when viewed as a process with state space (F_Δ, ρ), where F_Δ := F ∪ {Δ}. The corresponding Ray–Knight compactification F̄ of F_Δ (determined by X̃) induces a topology on F_Δ; let ρ̃ be a metric compatible with that topology. We write X̃^{r̃}_{t−} for the left limit (in F̄) of X̃ at time t > 0. We shall write J̃ for the set of totally inaccessible jumps of X̃; thus,

J̃ = {(ω, t) : X̃^{r̃}_{t−}(ω) ∈ F_Δ, X̃^{r̃}_{t−}(ω) ≠ X̃_t(ω)},

and J̃ encompasses the totally inaccessible stopping times of the filtration (F̃_t). As in sections 2 and 3, we use X̃^r_{t−} and X^r_{t−} to denote left limits in the ρ-topology; these limits exist in Ē (the Ray compactification of E_Δ induced by X) for all t > 0 almost surely. Finally, X̃_{t−} and X_{t−} denote left limits taken in the original topology of E_Δ, whenever those limits exist.
We write Λ+ (resp. Λ−) for the set of points of right (resp. left) increase of A:

Λ+ := {(ω, t) : t ≥ 0, A_t(ω) < A_{t+ε}(ω), ∀ε > 0},   (5.1)
Λ− := {(ω, t) : t > 0, A_{t−ε}(ω) < A_t(ω), ∀ε > 0}.   (5.2)

The set Λ+ is progressively measurable; in fact,

Λ+ = M \ G.   (5.3)

Consequently, by the strong Markov property and Blumenthal's zero-one law, if T is a stopping time, then ⟦T⟧ ⊂ Λ+ if and only if X_T ∈ F, almost surely. Meanwhile, with I_t = lim sup_{s↑↑t} 1_{{X_s∈F}} as before,

Λ− = {(ω, t) : I_t(ω) = 1},   (5.4)

so that Λ− is predictable.

Lemma 5.1 (a) Defining

J^# := {(ω, τ_t(ω)) : (ω, t) ∈ J̃, τ_{t−}(ω) = τ_t(ω)},   (5.5)

we have J^# = J ∩ Λ− ∩ Λ+, up to evanescence.
(b) Recalling that G^0 := G^r ∪ (G^i ∩ J), we have

{(ω, τ_{t−}(ω)) : (ω, t) ∈ J̃, τ_{t−}(ω) < τ_t(ω)} = {(ω, s) ∈ G^0 : X^r_{s−}(ω) ≠ X_{D_s}(ω)},
up to evanescence, where D_s = s + T_F∘θ_s as before.

Proof. In what follows, equalities or inclusions between subsets of Ω × [0, ∞[ are understood to hold modulo evanescence. Also, if Γ ⊂ Ω × [0, ∞[ and S : Ω → [0, ∞], then we sometimes write S ∈ Γ instead of ⟦S⟧ ⊂ Γ.

(a) It is clear that J^# ⊂ Λ− ∩ Λ+. Moreover, since {(ω, t) ∈ J̃ : τ_{t−}(ω) = τ_t(ω)} is (F_{τ(t)})-optional and has countable sections, it can be expressed as the countable union ∪_n ⟦T̃_n⟧ of graphs of (F_{τ(t)})-stopping times. Fix n ∈ N and write T̃ for T̃_n and define S := τ(T̃), so that T̃ = A_S. In view of (2.9) applied to X̃,

ρ̃-lim_{u↑T̃} X̃_u = ρ-lim_{u↑T̃} X̃_u = X̃^{r̃}_{T̃−}.

But, because X has left limits in the ρ-topology and τ(T̃−) = τ(T̃),

ρ-lim_{u↑T̃} X̃_u = ρ-lim_{u↑T̃} X_{τ(u)} = X^r_{τ(T̃−)−} = X^r_{τ(T̃)−} = X^r_{S−}.

Hence X^r_{S−} = X̃^{r̃}_{T̃−} ≠ X̃_{T̃} = X_S, from which we deduce that S ∈ J. This proves that J^# ⊂ J ∩ Λ− ∩ Λ+.
For the reverse containment we begin by observing that Λ+ = M \ G is (F_t)-progressive, so J ∩ Λ+ ∩ Λ− is an (F_t)-progressively measurable subset of the (F_t)-optional set J ∩ Λ−, the latter set having countable sections. Thus, by [6, T-IV.88], there is a sequence (S_n) of (F_t)-stopping times such that J ∩ Λ+ ∩ Λ− = ∪_n ⟦S_n⟧ up to evanescence. The containment at issue will therefore be established once we show that if S is an (F_t)-stopping time with ⟦S⟧ ⊂ J ∩ Λ+ ∩ Λ− then ⟦S⟧ ⊂ J^#. Fix such a stopping time S and define T := A_S. Notice that τ(T−) = τ(T) = S almost surely on {S < ∞} = {S < ζ}, because ⟦S⟧ ⊂ Λ+ ∩ Λ−. Thus, to complete the proof it suffices to show that T ∈ J̃. Now

X̃^r_{T−} = ρ-lim_{u↑A_S} X_{τ(u)} = X^r_{τ(A_S)−} = X^r_{S−} ≠ X_S = X̃_T,

in which (i) the second, third, and last equalities hold because S ∈ Λ− ∩ Λ+ so that A_S is a continuity point of τ with τ(A_S) = S, and (ii) the fourth equality and the inequality follow from (2.9) because S ∈ J. Consequently, T is a discontinuity time of X̃ in the ρ-topology, and X̃^r_{T−} ∈ E_Δ. Applying [26, (46.2)] to X̃ (viewed as a right process in the ρ-topology on the state space F_Δ := F ∪ {Δ}), the sets {X̃^r_− does not exist in F_Δ} and {X̃^r_− exists in F_Δ but X̃^r_− ≠ X̃^{r̃}_−} are both predictably meager; that is, they are countable unions of graphs of (F̃_t)-predictable stopping times. To show that the intersection of ⟦T⟧ with the union of these sets is evanescent, it therefore suffices to show that ⟦T⟧ meets the graph of no (F̃_t)-predictable time. Suppose for the moment that this has been established. Then since X̃^r_{T−} ≠ X̃_T, the only remaining possibility is that X̃^{r̃}_{T−} exists in F_Δ and is equal to X̃^r_{T−}. But this forces T ∈ J̃.

It remains to show that if R is an (F̃_t)-predictable time then ⟦R⟧ ∩ ⟦T⟧ is evanescent. Fix such an R, and let (R_n) announce R. Then

{τ(R_n) < t} = {R_n < A_t} = ∪_{q∈Q} {R_n < q < A_t} = ∪_{q∈Q} {R_n < q, τ(q) < t} ∈ F_t,

since {R_n < q} ∈ F̃_q ⊂ F_{τ(q)} and τ(q) is an (F_t)-stopping time. Thus τ(R_n) is an (F_t)-stopping time. But t ↦ τ(t−) is strictly increasing on ]0, A_ζ] and identically infinite on ]A_ζ, ∞[. Therefore the sequence (τ(R_n) ∧ n) of (F_t)-stopping times increases to τ(R−) strictly from below. Consequently τ(R−) is a predictable (F_t)-stopping time. Recalling that T = A_S, we see that on the event {R = T < ∞} we have τ(R−) = S since S ∈ Λ− ∩ Λ+. This implies that P^x[R = T < ∞] ≤ P^x[S = τ(R−) < ∞] = 0 for all x ∈ E_Δ, because ⟦S⟧ ⊂ J and J meets the graph of no predictable (F_t)-stopping time. Thus ⟦R⟧ ∩ ⟦T⟧ is evanescent, and the proof of assertion (a) of Lemma 5.1 is complete.
(b) Since {t ∈ J̃ : τ(t−) < τ(t)} is (F_{τ(t)})-optional and has countable sections, it can be expressed as the countable union of graphs of (F_{τ(t)})-stopping times T̃_n, n ∈ N. (Only the F-measurability of these times is relevant in the subsequent discussion.) Fix n ∈ N and abbreviate T̃_n to T̃. Then T̃ ∈ J̃ and so X̃^{r̃}_{T̃−} = X̃^r_{T̃−} = X^r_{τ(T̃−)−} ≠ X̃_{T̃}, with X^r_{τ(T̃−)−} ∈ F_Δ ⊂ E_Δ by the equality of the second and fourth terms. Define S := τ(T̃−). Then S ∈ G and D(S) = D_S = τ(T̃). From the previous string of equalities it follows that

X̃_{T̃} = X_{D(S)} = X_{τ(T̃)} ≠ X^r_{S−},

and so it remains to show that S ∈ G^0. We consider two cases. First, if X^r_{S−} ≠ X_S then S ∈ G ∩ J = G^i ∩ J because X^r_{S−} ∈ F_Δ ⊂ E_Δ. On the other hand, if X^r_{S−} = X_S, then X_S ∈ F_Δ. If X_S ∈ F then S ∈ G^r. To rule out the remaining possibility, suppose X_S = Δ. Then X̃^{r̃}_{T̃−} = X_S = Δ. But this implies that T̃ is not a discontinuity of X̃ in its Ray topology, since X̃ is constantly equal to Δ on [S, ∞[, contradicting T̃ ∈ J̃.

For the opposite inclusion, recall that D_t = t + T_F∘θ_t is an (F_t)-stopping time for each t ≥ 0 and that t ↦ D_t is increasing and right continuous with G = {t > 0 : D_{t−} = t < D_t}. Therefore G, and hence G^0 = G^r ∪ (G^i ∩ J), is optional relative to the filtration (F_{D(t)}) and has countable sections. Consequently, G^0 = ∪_n ⟦S_n⟧, where each S_n is an (F_{D(t)})-stopping time. Fix n ∈ N and let S denote S_n. Then S ∈ G^0 and τ(A_S−) = S < D_S = τ(A_S). But

X̃^r_{A(S)−} = X^r_{τ(A(S)−)−} = X^r_{S−} ≠ X_{D(S)} = X̃_{A(S)},

in the ρ-topology, so A_S is a time of discontinuity of X̃. To complete the proof we must show that A_S ∈ J̃. This will follow once we show that X̃^{r̃}_{A(S)−} = X̃^r_{A(S)−} ∈ F_Δ, and (using [26, (46.2)] as in the proof of the second part of (a)) this will follow in turn once we show that ⟦A_S⟧ meets the graph of no (F̃_t)-predictable time. Let R be such a predictable time and suppose that P^x[R = A_S < ∞] > 0 for some x ∈ F_Δ. Then exactly as in the last paragraph of the argument for (a), τ(R−) is an (F_t)-predictable time, and τ(R−) = S ∈ G^0 on {R = A_S}. This is a contradiction since G^r meets the graph of no (F_t)-stopping time and J meets the graph of no (F_t)-predictable time. This completes the proof of Lemma 5.1. □
F := H t
τ (t)
1F (Xs ) dHs .
(5.6)
0 t 0
Because the fine supports of C and t ↦ ∫_0^t 1_F(X_s) dH_s are contained in F, both C̃ and H̃^F are CAFs of X̃. Now define

H̃_t := C̃_t + H̃^F_t,    t ≥ 0.    (5.7)
Another application of Motoo's theorem yields the existence of Ẽ^e-measurable densities c̃ and h̃ (vanishing off F) such that

C̃_t = ∫_0^t c̃(X̃_s) dH̃_s  and  H̃^F_t = ∫_0^t h̃(X̃_s) dH̃_s,    ∀ t ≥ 0,    (5.8)
242
P.J. Fitzsimmons and R.K. Getoor
almost surely. Finally, define a kernel Ñ_Δ on (F_Δ, E_Δ ∩ F_Δ) by

Ñ_Δ(x, dy) := 1_{F×F_Δ}(x, y) [ c̃(x) 1_{x≠y} P^x_pr[X_{T_F} ∈ dy] + h̃(x) N_Δ(x, dy) ].    (5.9)

Theorem 5.2 The pair (Ñ_Δ, H̃) is a Lévy system for the totally inaccessible jumps of X̃.

Proof. Fix an X̃-predictable process Z̃ ≥ 0, and a positive Borel function Φ on the product space F × F_Δ such that Φ(x, x) = 0 for all x ∈ F. Using [6, (IV.67.1)] it is not hard to check that s ↦ Z̃_{A(s)} is X-predictable. Then, using Lemma 5.1(a) for the first equality,

P̃• Σ_{t∈J̃, τ(t−)=τ(t)} Z̃_t Φ(X̃_{t−}, X̃_t)
 = P• Σ_{s∈J} 1_{Λ^−}(s) Z̃_{A(s)} Φ(X_{s−}, X_s) 1_{Λ^+}(s)
 = P• Σ_{s∈J} 1_{Λ^−}(s) Z̃_{A(s)} Φ(X_{s−}, X_s) 1_F(X_s)
 = P• ∫_0^∞ 1_{Λ^−}(s) Z̃_{A(s)} ∫_F N_Δ(X_s, dy) Φ(X_s, y) dH_s
 = P• ∫_0^∞ 1_F(X_s) Z̃_{A(s)} ∫_F N_Δ(X_s, dy) Φ(X_s, y) dH_s.
The second equality above follows from the discussion just after (5.3) because J is the disjoint union of graphs of stopping times, while the final equality holds because Λ^− differs from {X ∈ F} by a countable set not charged by H. Consequently,

P̃• Σ_{t∈J̃, τ(t−)=τ(t)} Z̃_t Φ(X̃_{t−}, X̃_t) = P̃• ∫_0^∞ Z̃_t ∫_F N_Δ(X̃_t, dy) Φ(X̃_t, y) dH̃^F_t.    (5.10)

On the other hand, using Lemma 5.1(b),

P̃• Σ_{t∈J̃, τ(t−)<τ(t)} Z̃_t Φ(X̃_{t−}, X̃_t)
 = P• Σ_{s∈G_0} Z̃_{A(s)} Φ(X_{s−}, X_{D_s})
 = P• ∫_0^∞ Z̃_{A(s)} ∫_Ω Φ(X_s, X_{T_F}(ω)) P^{X_s}_pr(dω) dC_s
 = P̃• ∫_0^∞ Z̃_t ∫_Ω Φ(X̃_t, X_{T_F}(ω)) P^{X̃_t}_pr(dω) dC̃_t.    (5.11)

Taken together, (5.10) and (5.11) imply that (Ñ_Δ, H̃) is a Lévy system for the totally inaccessible jumps of X̃.
L´evy Systems and Time Changes
243
6 Jump Measures and Feller Measures

We now fix an excessive measure m to serve as background measure. Thus m is a σ-finite measure on (E, E) such that mP_t(L) ≤ m(L) for all L ∈ E and t > 0. Because (P_t) is a right semigroup, we then have mP_t ↑ m (setwise) as t ↓ 0; see [7, XII 36–37]. Here and in the remainder of the paper the (absorbing) state Δ is viewed as a cemetery state; the stopping time ζ := inf{t : X_t = Δ} is the lifetime of X. Accordingly, functions (resp. measures) defined on E (resp. E) are extended to E_Δ (resp. E_Δ) by letting the value at Δ (resp. {Δ}) be 0.

Let R = (R_t)_{t≥0} be a raw (i.e., not necessarily adapted) additive functional (RAF) of X. The Revuz measure of R, relative to m, is defined by the monotone limit

ν^m_R(f) := ↑lim_{t↓0} t^{−1} P^m ∫_0^t f(X_s) dR_s,    f ∈ pE*.    (6.1)
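To make the limit in (6.1) concrete, here is an elementary discrete-state analogue (an illustration added here, not part of the paper): for a stationary finite-state chain with invariant law m and the additive functional R obtained by summing g along the path, the Revuz-type time average of f dR is exactly m(fg). All matrices and functions below are hypothetical.

```python
import numpy as np

# Hypothetical 3-state transition matrix P (rows sum to 1).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

# Invariant law m: left eigenvector of P for eigenvalue 1, normalized.
w, V = np.linalg.eig(P.T)
m = np.real(V[:, np.argmin(np.abs(w - 1.0))])
m = np.abs(m) / np.abs(m).sum()

f = np.array([1.0, 2.0, 3.0])  # test function, analogue of f in (6.1)
g = np.array([0.5, 0.1, 0.7])  # density of the additive functional R

# Discrete analogue of (6.1): t^{-1} E^m sum_{s<t} f(X_s) g(X_s).
t = 25
dist, lhs = m.copy(), 0.0
for _ in range(t):
    lhs += (dist * f * g).sum()  # E^m[f(X_s) g(X_s)] at step s
    dist = dist @ P              # propagate one step; m is invariant
lhs /= t

rhs = (m * f * g).sum()          # the "Revuz measure" nu_R^m applied to f
assert abs(lhs - rhs) < 1e-10
print(round(rhs, 6))
```

Stationarity makes every step contribute the same amount, so the time average is independent of the horizon t — the discrete shadow of the monotone limit in (6.1).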
If R is a CAF then the measure ν^m_R is σ-finite, and two CAFs with the same Revuz measure are P^m-indistinguishable. See [15], [10], and [11] for more details. The "local" formula (6.1) defining ν^m_R has a "global" counterpart expressed in terms of the Kuznetsov process ((Y_t)_{t∈R}, Q_m) associated with X and m. The sample space for Y is W, the space of all paths w : R → E_Δ := E ∪ {Δ} that are right continuous and E-valued on an open interval ]α(w), β(w)[ and take the value Δ outside of this interval. The dead path [Δ], constantly equal to Δ, corresponds to the interval being empty; by convention α([Δ]) = +∞, β([Δ]) = −∞. The σ-algebra G° on W is generated by the coordinate maps Y_t(w) = w(t), t ∈ R, and G°_t := σ(Y_s : s ≤ t). The Kuznetsov measure Q_m is the unique σ-finite measure on G° not charging {[Δ]} such that, for −∞ < t_1 < t_2 < · · · < t_n < +∞,
Qm (Yt1 ∈ dx1 ,Yt2 ∈ dx2 , . . . , Ytn ∈ dxn ) = m(dx1 ) Pt2 −t1 (x1 , dx2 ) · · · Ptn −tn−1 (xn−1 , dxn ).
(6.2)
Because the only times appearing on the right side of (6.2) are the differences t_k − t_{k−1}, the measure Q_m is invariant with respect to the shift operators σ_t, t ∈ R, defined by

σ_t w(s) = [σ_t w](s) := w(t + s),    s ∈ R;

that is,

Q_m[Φ∘σ_t] = Q_m[Φ],    ∀ Φ ∈ pG°, t ∈ R.
It will be convenient to take X = (Xt , Px ) to be the realization of (Pt ) described on p. 53 of [15]. The sample space for X is Ω := {α = 0, Yα+ exists in E} ∪ {[Δ]},
(6.3)
X_t is the restriction of Y_t to Ω for t > 0, and X_0 is the restriction of Y_{0+}. Moreover, F° := σ(X_t : t ≥ 0) is the trace of G° on Ω.

To discuss the strong Markov property of Y, as well as the moderate Markov property of Y when time is reversed, we recall the modified process Y* of [15, (6.12)]. Let d be a totally bounded metric on E compatible with the topology of E, and let D be a countable uniformly dense subset of the d-uniformly continuous bounded real-valued functions on E. Given a strictly positive h ∈ bE with m(h) < ∞ define W(h) ⊂ W by the conditions:

(i) α ∈ R;
(ii) Y_{α+} := lim_{t↓α} Y_t exists in E;
(iii) U^q g(Y_{α+1/n}) → U^q g(Y_{α+}) as n → ∞, for all g ∈ D and all rationals q > 0;
(iv) U h(Y_{α+1/n}) → U h(Y_{α+}) as n → ∞.

Evidently σ_t^{−1}(W(h)) = W(h) for all t ∈ R, and W(h) ∈ G°_{α+} since E is a Lusin space. We now define

Y*_t(w) := { Y_{α+}(w), if t = α(w) and w ∈ W(h);  Y_t(w), otherwise. }    (6.4)

(If h' is another function with the properties of h then Q_m(W(h) Δ W(h')) = 0.) The process Y* features in a maximal form of the strong Markov property, recorded in Proposition 6.1 below; for a proof see [15, (6.15)]. (This process will also be used in section 7 to define the moderate Markov dual of X with respect to m.) A clean statement of this result requires the "truncated shift" operators θ_t, t ∈ R, defined by

θ_t w(s) = [θ_t w](s) := { w(t + s), s > 0;  Δ, s ≤ 0. }
The filtration (G^m_t)_{t∈R} is obtained by augmenting (G°_t)_{t∈R} with the Q_m-null sets in the usual way.

Proposition 6.1 Let T be a (G^m_t)-stopping time. Then Q_m restricted to G^m_T ∩ {Y*_T ∈ E} is a σ-finite measure and

Q_m(F∘θ_T | G^m_T) = P^{Y*_T}(F),    Q_m-a.e. on {Y*_T ∈ E},    (6.5)

for all F ∈ pF°.

Now given an RAF R there is a uniquely determined (up to Q_m-evanescence) homogeneous random measure (HRM) κ_R such that

κ_R(]s, t]) = R_{t−s}∘θ_s    on {α < s < β}, Q_m-a.s.,

for all real s < t. The global counterpart to (6.1) that was alluded to earlier is this:
Q_m ∫_R f(Y_t, t) κ_R(dt) = ∫_E ∫_R f(x, t) dt ν^m_R(dx),    ∀ f ∈ p(E ⊗ B).    (6.6)
See [15, (8.21), (8.25)]. As an example, let us consider additive functionals related to the Lévy system (N_Δ, H) discussed at the end of section 2. As is customary, we now break N_Δ into two pieces

n(x) := N_Δ(x, {Δ}),    N(x, dy) := 1_E(y) N_Δ(x, dy),    (6.7)

and define the "killing rate" CAF K by

K_t := ∫_0^t n(X_s) dH_s,    t ≥ 0.    (6.8)

Taking Z = 1_{]0,t]} in (2.10) we find that

P^x Σ_{s∈J, s≤t} Ψ(X_{s−}, X_s) 1_{s<ζ} = P^x ∫_0^t N(X_s, Ψ) dH_s,    x ∈ E, t ≥ 0,    (6.9)

and

P^x[f(X_{ζ−}); ζ ≤ t, ζ ∈ J] = P^x ∫_0^t f(X_s) dK_s,    (6.10)
for Ψ ∈ p(E ⊗ E) and f ∈ pE. It follows from this discussion and (6.1) that

J(Ψ) := ↑lim_{t↓0} t^{−1} P^m Σ_{s∈J, s≤t} Ψ(X_{s−}, X_s) 1_{s<ζ} = ν^m_H(NΨ),    (6.11)

and

K(f) := ↑lim_{t↓0} t^{−1} P^m[f(X_{ζ−}); ζ ≤ t, ζ ∈ J] = ν^m_K(f) = ν^m_H(nf).    (6.12)

That is, the "jump measure" J is given on E × E by the formula

J(dx, dy) = ν^m_H(dx) N(x, dy),    (6.13)

while the "killing measure" K is given on E by

K(dx) = n(x) ν^m_H(dx).    (6.14)
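A concrete sanity check (illustrative only, not from the paper): for a conservative continuous-time chain with rate matrix Q and stationary law m, one may take H_t = t — so that ν^m_H = m — and N(x, {y}) = Q(x, y) for y ≠ x. Formula (6.13) then says the jump measure is J({x}, {y}) = m(x) Q(x, y), i.e. the long-run rate of x → y jumps under P^m. All numerical data below are hypothetical.

```python
import numpy as np

# Hypothetical conservative rate matrix Q (rows sum to 0).
Q = np.array([[-1.0,  0.6,  0.4],
              [ 0.5, -0.9,  0.4],
              [ 0.3,  0.7, -1.0]])

# Stationary law m solves m Q = 0 with sum(m) = 1.
A = np.vstack([Q.T, np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
m = np.linalg.lstsq(A, b, rcond=None)[0]
assert np.allclose(m @ Q, 0.0, atol=1e-10) and m.min() > 0

# Levy-system data: H_t = t (so nu_H^m = m), N(x,{y}) = Q(x,y) for y != x.
N = Q - np.diag(np.diag(Q))
J = m[:, None] * N          # jump measure as in (6.13): J(x,y) = m(x) Q(x,y)

# Under P^m the expected number of x -> y jumps in [0, t] equals
# int_0^t (m P_s)(x) Q(x, y) ds = t * m(x) Q(x, y), since m P_s = m.
t = 2.0
expected_jumps = t * J
# Row sums recover the total jump rate out of each state, m(x) * (-Q(x,x)).
assert np.allclose(J.sum(axis=1), -m * np.diag(Q))
print(expected_jumps.round(4))
```

The killing measure (6.14) would be obtained in the same way from a sub-conservative Q whose row defects give the killing rate n.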
Now let J* denote the set of totally inaccessible jumps of Y (defined as J was for X). Paralleling (2.9) we have (employing the obvious notation regarding left limits)

J* ⊂ {(w, t) ∈ W × R : α(w) < t < β(w), Y^r_{t−}(w) = Y_{t−}(w)}    (6.15)
up to Q_m-evanescence. Combining (6.6) with the version of (2.10) valid for Y we now obtain, for suitably measurable f ≥ 0,

Q_m Σ_{t∈J*} f(t, Y_{t−}, Y_t) = Q_m ∫_R ∫ f(t, Y_t, y) N(Y_t, dy) κ_H(dt)
 = ∫_R dt ∫_{E×E} f(t, x, y) J(dx, dy),    (6.16)

and

Q_m[f(β, Y_{β−}); β ∈ J*] = Q_m ∫_R f(t, Y_t) n(Y_t) κ_H(dt)
 = ∫_R dt ∫_E f(t, x) K(dx).    (6.17)
We record the analogous results for the optional and predictable exit systems for F. The "optional" version comes from [13] (see also [15, (11.6)]); the "predictable" version is proved in a similar manner. Let F be a finely perfect nearly Borel set. For w ∈ W let M*(w) be the closure in R of {t ∈ R : Y_t(w) ∈ F} and let G*(w) be the set of left endpoints (in ]α(w), ∞[) of the contiguous intervals of M*(w). It is readily verified that if α < s < t then t ∈ M* if and only if t − s ∈ M∘θ_s, and likewise for G* and G*_0, where G*_0 is defined over Y in analogy to G_0.

Proposition 6.2 Let (P•_op, B) and (P•_pr, C) be optional and partial predictable exit systems for F, respectively. Let ν_B = ν^m_B and ν_C = ν^m_C be the corresponding Revuz measures, with respect to m. Then ν_B and ν_C are σ-finite, and

Q_m Σ_{s∈G*} f(s, Y_s, θ_s) = ∫_R dt ∫_E ν_B(dx) P^x_op[f(t, x, ·)],    (6.18)

Q_m Σ_{s∈G*_0} f(s, Y_{s−}, θ_s) = ∫_R dt ∫_E ν_C(dx) P^x_pr[f(t, x, ·)],    (6.19)

provided f ∈ p(B ⊗ E*_Δ ⊗ F*).
Following [4] (see also [14] and [3]) we define the Feller measure

Λ(Γ) := ↑lim_{t↓0} t^{−1} P^m Σ_{s∈G_0} 1_{]0,t]}(s) 1_Γ(X_{s−}, X_{D_s}) 1_{D_s<∞},    Γ ∈ E ⊗ E,    (6.20)

and the supplementary Feller measure

δ(L) := ↑lim_{t↓0} t^{−1} P^m Σ_{s∈G_0} 1_{]0,t]}(s) 1_L(X_{s−}) 1_{D_s=∞},    L ∈ E.    (6.21)

Since X_{D_s} = X_{T_F}∘θ_s, the exit system formula (3.12) implies that

P^m Σ_{s∈G_0} 1_{]0,t]}(s) 1_Γ(X_{s−}, X_{D_s}) 1_{D_s<∞} = P^m ∫_0^t φ(X_s) dC_s,    (6.22)
where

φ(x) := P^x_pr[1_Γ(x, X_{T_F}); T_F < ∞],    x ∈ E.    (6.23)

Formula (6.22) yields the existence of the monotone limit in (6.20) and even identifies the limit as ν_C(φ). Hence,

Λ(Γ) = ν_C(φ) = ∫_F ν_C(dx) P^x_pr[1_Γ(x, X_{T_F}); T_F < ∞].    (6.24)

Similar considerations lead to the identification of the limit in (6.21) as

δ(L) = ∫_L ν_C(dx) P^x_pr[T_F = ∞].    (6.25)
The next two formulas follow immediately from Proposition 6.2 and formulas (6.24) and (6.25).

Proposition 6.3 For Φ ∈ p(B ⊗ E ⊗ E) and f ∈ p(B ⊗ E),

Q_m Σ_{t∈G*_0} Φ(t, Y_{t−}, Y_{D_t}) 1_{D_t<∞} = ∫_R dt ∫_{F×F} Φ(t, x, y) Λ(dx, dy),    (6.26)

and

Q_m Σ_{t∈G*_0} f(t, Y_{t−}) 1_{D_t=∞} = ∫_R dt ∫_F f(t, x) P^x_pr[T_F = ∞] ν_C(dx).    (6.27)
Recall from section 5 the CAF A, its inverse τ, and the time-changed process X̃ = X∘τ. We are now going to exhibit formulas for the jump and killing measures of X̃ in terms of the corresponding measures for X and the exit system for F. In the course of the proof we shall use the following result taken from section 6 of [9].

Proposition 6.4 Let m̃ denote ν^m_A, and let Z be a CAF of X with fine support contained in F (that is, Z_{T_F} = 0, a.s.). Then m̃ is X̃-excessive, the time change Z̃_t := Z_{τ(t)}, t ≥ 0, defines a CAF of X̃, and the Revuz measure ν^{m̃}_{Z̃} of Z̃, relative to m̃, is equal to ν^m_Z.

The following result extends [4, Thm. 5.6] to the context of this paper. We write (F × F)_0 for {(x, y) ∈ F × F : x ≠ y}.

Theorem 6.5 The jump measure J̃ and the killing measure K̃ for the time-changed process X̃, with respect to the X̃-excessive measure m̃ := ν^m_A, are given respectively by the formulas

J̃ = 1_{(F×F)_0}(J + Λ),    (6.28)
K̃ = 1_F K + δ.    (6.29)
Proof. We prove only (6.28), as the proof of (6.29) is quite similar. Let Γ be a Borel measurable subset of (F × F)_0, and let us begin with

P̃^x Σ_{s∈J̃, s≤t} 1_Γ(X̃^r_{s−}, X̃_s) = P̃^x ∫_0^t Ñ(X̃_s, 1_Γ) dH̃_s,    x ∈ F.    (6.30)

In view of (5.6)–(5.9), the right side of (6.30) is equal to

P̃^x ∫_0^t N(X̃_s, 1_Γ) dH̃^F_s + P̃^x ∫_0^t φ(X̃_s) dC̃_s,

where φ(x) := P^x_pr[1_Γ(x, X_{T_F}); T_F < ∞]. By (5.6) and Proposition 6.4 the Revuz measure of H̃^F (relative to m̃ := ν^m_A) is 1_F ν^m_H, while that of C̃ is ν^m_C. Therefore

J̃(Γ) = ν^{m̃}_{H̃}(Ñ 1_Γ)
 = ∫_Γ ν^m_H(dx) N(x, dy) + ∫_Γ ν_C(dx) P^x_pr[X_{T_F} ∈ dy, T_F < ∞].    (6.31)
7 Entrance Law

Because of the time-symmetry of the Markov property, the process (Y_t, Q_m) is a Markov process with respect to the reverse filtration Ǧ_t := σ{Y_s : s ≥ t}, t ∈ R. Unlike the situation in "forward" time, this process need not be a strong Markov process, but it is a moderate Markov process. To make this precise we define

Ω̂ := {β = 0} ⊂ W;    (7.1)
X̂_t(ω̂) := Y*_{−t}(ω̂),    t > 0, ω̂ ∈ Ω̂;    (7.2)
F̂°_t := σ{X̂_s : 0 < s ≤ t}, t > 0,    F̂° := σ{X̂_s : s > 0};    (7.3)
θ̌_t w(s) := { w(t − s), s > 0;  Δ, s ≤ 0. }    (7.4)

Then there is a Borel measurable family {P̂^x, x ∈ E} of probability measures on (Ω̂, F̂) under which (X̂_t)_{t>0} has the moderate Markov property:

P̂^x[f(X̂_{T+s}) | F̂_{T−}] = P̂^{X̂_T}[f(X̂_s)],    s > 0, f ∈ bE,    (7.5)

whenever T is an (F̂_t)-predictable stopping time. (As a matter of convention, P̂^x[X̂_0 = x] = 1 and F̂_{0−} = {∅, Ω̂}.) The measures P̂^x are uniquely
determined modulo an m-polar set. (A set L ∈ E^e is m-polar provided P^m[T_L < ∞] = 0.) The link between Y and X̂ is this: If T : W → [−∞, ∞] is (Ǧ_t)-predictable, then for Φ ∈ pF̂,

Q_m[Φ∘θ̌_T | Ǧ_{T−}] = P̂^{Y*_T}[Φ]    on {Y*_T ∈ E},    (7.6)

the σ-finiteness of Q_m on Ǧ_{T−} ∩ {Y*_T ∈ E} being part of the assertion. For more details see [16, §2], [8, §4], and [23].

It follows easily from (7.6) (with T a fixed time) that the transition semigroup (P̂_t) of X̂, defined by P̂_t f = P̂•[f(X̂_t)], is in duality with (P_t) with respect to m:

(P_t f, g) = (f, P̂_t g),    f, g ∈ E*, t > 0,    (7.7)

in which (f, g) := ∫_E f g dm provided the integral exists. Likewise, defining the associated resolvent

Û^λ f = ∫_0^∞ e^{−λt} P̂_t f dt = P̂• ∫_0^∞ e^{−λt} f(X̂_t) dt,    (7.8)

we have

(U^λ f, g) = (f, Û^λ g),    f, g ∈ E*, λ > 0.    (7.9)
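The duality relations (7.7)–(7.9) have a transparent finite-state analogue (added for illustration; the paper's setting is far more general): if P is the transition matrix of a chain with invariant law m, the time-reversed matrix P̂ := D^{−1} Pᵀ D, with D = diag(m), is again stochastic and satisfies (Pf, g)_m = (f, P̂g)_m for all f, g. A minimal sketch with hypothetical data:

```python
import numpy as np

# Hypothetical transition matrix and its invariant law m.
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.2, 0.3, 0.5]])
w, V = np.linalg.eig(P.T)
m = np.real(V[:, np.argmin(np.abs(w - 1.0))])
m = np.abs(m) / np.abs(m).sum()

# Dual (time-reversed) transition matrix: P_hat = D^{-1} P^T D, D = diag(m).
D = np.diag(m)
P_hat = np.linalg.inv(D) @ P.T @ D
assert np.allclose(P_hat.sum(axis=1), 1.0)  # P_hat is again stochastic

# Duality with respect to m, the analogue of (7.7): (P f, g)_m = (f, P_hat g)_m.
rng = np.random.default_rng(0)
f, g = rng.random(3), rng.random(3)
lhs = (m * (P @ f) * g).sum()
rhs = (m * f * (P_hat @ g)).sum()
assert abs(lhs - rhs) < 1e-12
print(round(lhs, 6))
```

The invariance of m plays the role of excessivity here; for a merely excessive m the reversed kernel would be sub-stochastic, mirroring the moderate Markov dual of the text.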
We usually omit the hat in those places where it is obviously required. For example, we write P̂•[f(X_t)] in place of P̂•[f(X̂_t)].

Before proceeding, we collect some facts about the moderate Markov dual process. Recall that a set L ∈ E^e is m-semipolar provided the visiting set {t > 0 : X_t ∈ L} is P^m-a.s. at most countable. Also, a property P(x) depending on x ∈ E is said to hold m-quasi-everywhere (m-q.e.) provided {x ∈ E : P(x) fails} is m-polar. Define

T̂_F := inf{t ∈ ]0, ζ̂[ : X̂_t ∈ F}.    (7.10)

Lemma 7.1 Let F be a finely perfect nearly Borel subset of E.
(i) {x ∈ F : P̂^x[T̂_F = 0] < 1} is m-semipolar.
(ii) {x ∈ E \ F : P̂^x[T̂_F = 0] > 0} is m-semipolar.
(iii) t ↦ X̂_t has right limits in E (with respect to the Ray topology) on [0, ∞[, P̂^x-a.s. for m-q.e. x ∈ E.

Remark 7.2 Neither statement (i) nor statement (ii) of the lemma can be improved, even if X is continuous with a strong Markov continuous dual process X̂, as the example of uniform motion on R shows.

Proof. (i) Let μ be a finite measure on E not charging m-semipolar sets. Then there is a diffuse optional copredictable HRM κ with Revuz measure μ; see
[8, (5.22)] or [12, (3.10)]. Let φ be a strictly positive Borel function on R with ∫_R φ(t) dt = 1. Since κ is copredictable,

Q_m ∫_R φ(t) 1_F(Y*_t) 1_{T̂_F∘θ̌_t = 0} κ(dt) = Q_m ∫_R φ(t) 1_F(Y*_t) P̂^{Y*_t}[T̂_F = 0] κ(dt)
 = ∫_F P̂^x[T̂_F = 0] μ(dx).    (7.11)

Let Z denote the closure of {t ∈ ]α, β[ : Y_t ∈ F}. Then Z ∩ {t : T̂_F∘θ̌_t > 0} = Z ∩ {t : ∃ ε > 0, ]t − ε, t[ ∩ Z = ∅}. Hence, Z ∩ {t : T̂_F∘θ̌_t > 0} is contained in the set of right endpoints of the contiguous intervals of Z. But there are only countably many such intervals, and κ is diffuse, so

Q_m ∫_R φ(t) 1_F(Y*_t) 1_{T̂_F∘θ̌_t = 0} κ(dt) = μ(F).    (7.12)

It follows that {x ∈ F : P̂^x[T̂_F = 0] < 1} has μ-measure equal to 0. Since μ was an arbitrary finite measure not charging m-semipolars, a result of Dellacherie [5, p. 70] tells us that {x ∈ F : P̂^x[T̂_F = 0] < 1} is m-semipolar.

(ii) Using the notation established in the proof of point (i),

Q_m ∫_R φ(t) 1_{T̂_F∘θ̌_t = 0} 1_{E\F}(Y*_t) κ(dt) = ∫_{E\F} P̂^x[T̂_F = 0] μ(dx).    (7.13)

If T̂_F∘θ̌_t = 0 then for every sufficiently small η > 0 the interval ]t − η, t[ contains times at which Y is in F; if also Y_t ∈ E \ F then t is an element of G* because E \ F is finely open. Since κ is diffuse, the above displayed integrals must vanish. Point (ii) now follows as did (i).

(iii) By considering f(X̂_t) as f runs through a countable dense subset of C(E) one sees that the set of (ω̂, t) such that s ↦ X̂_s(ω̂) fails to have a right limit in E at t is (F̂^P_t)_{t≥0}-progressively measurable. Here P is an arbitrary probability measure on (Ω̂, F̂°), and (F̂^P_t) is the usual right-continuous completion of (F̂°_t). See, for example, [6, IV-90]. It follows that the projection Π of the above-described set onto Ω̂ is an element of F̂* := ∩_P F̂^P. Note that Π is the set of ω̂ for which s ↦ X̂_s(ω̂) fails to have a right limit in E at some t > 0. Hence, f(x) := P̂^x[Π] is E*-measurable, and f is coexcessive. Since X̂_s∘θ̌_t = Y_{t−s} for s > 0, the set θ̌_t^{−1}Π is contained in the set of w ∈ W such that r ↦ Y_r(w) fails to have a left limit in E at some r ∈ ]−∞, t[, and so Q_m[θ̌_t^{−1}Π] = 0 for all t ∈ R. Now

m(f) = Q_m P̂^{Y*_0}[Π] = Q_m[θ̌_0^{−1}Π, α < 0 < β] = 0.
Hence f = 0, m-a.e., and therefore P̂^x[Π] = f(x) = 0 for m-q.e. x ∈ E; see [16, (2.11)].

We assume, for the remainder of this section, that G^i ⊂ J. This means that the partial predictable exit system constructed in section 3 is in fact "complete" in the sense of Theorem 3.4(b). Define, for λ ≥ 0 and f ∈ pE*,

P̂^λ_F f(x) := P̂^x[e^{−λT̂_F} f(X̂_{T̂_F})],    (7.14)
P̂^λ_{F+} f(x) := P̂^x[e^{−λT̂_F} f(X̂^r_{T̂_F+})],    (7.15)

with the understanding that exp(−0 · ∞) = 0 and f(Δ) = 0, so that P̂_F f := P̂^0_F f = P̂•[f(X̂_{T̂_F}); T̂_F < ∞]. Here X̂^r_{T̂_F+} denotes the right limit (in E^r with its Ray topology) of t ↦ X̂_t at T̂_F. In (7.15), f is extended to all of E^r by declaring f(x) = 0 for x ∈ E^r \ E. In the light of Lemma 7.1(iii), X̂^r_{T̂_F+} exists P̂^x-a.s. on {T̂_F < ∞} for m-q.e. x ∈ E. Thus both P̂^λ_F f and P̂^λ_{F+} f are uniquely determined m-q.e. and are E*-measurable.
Recall the optional and predictable exit systems (P•_op, B) and (P•_pr, C) for F. Since the measure m will remain fixed in the sequel, we shall write ν_B for ν^m_B and ν_C for ν^m_C. The balayage of m on F is the excessive measure R_F m defined by

R_F m(f) := Q_m[f(Y_t); T_F < t],    f ∈ pE.    (7.16)

Here T_F := inf{t ∈ ]α, β[ : Y_t ∈ F} extends the previously defined hitting time of F to all of W. Upon noting that the Q_m-expectation of f(Y_t)1_{{T_F < t}} does not depend on t (by shift invariance of Q_m), and then using (7.6) with T = 0, we obtain

R_F m(f) = Q_m f(Y_0) P̂^{Y_0}[T̂_F < ζ̂] = (P̂_F 1, f),

because P̂^x[T̂_F < ζ̂] = P̂_F 1(x) for all x ∈ E. It is important to note at this stage that

P̂_{F+} 1(x) = P̂_F 1(x)    for m-a.e. x ∈ E.    (7.17)

(In fact, for m-q.e. x, but the m-a.e. assertion is sufficient for our purposes.) To see this fix a strictly positive f ∈ bE with m(f) < ∞, and define g_0 := sup{t ≤ 0 : Y_t ∈ F}. Observe that 0 ≤ P̂_F 1(x) − P̂_{F+} 1(x) = P̂^x[T̂_F < ζ̂, X̂^r_{T̂_F+} ∉ E]. Using (7.6) with T = 0 we have

∫_E f(x) (P̂_F 1(x) − P̂_{F+} 1(x)) m(dx) = Q_m[f(Y_0); α < g_0, Y^r_{g_0−} ∉ E]
 = Q_m[f(Y_0); α < g_0, Y_{g_0−} = Δ] = 0,
the second equality following from Remark 3.2 reinterpreted for Y, and the third from the fact that Δ is isolated in E_Δ.

The following decomposition of R_F m (for general Borel F) appears in section 6 of [13]:

(P̂_F 1, f) = R_F m(f) = ν_B(ℓ f + V^0_op f),    f ∈ pE,    (7.18)

where V^0_op f(x) := P^x_op ∫_0^{T_F} f(X_t) dt. Our principal goal in the remainder of this section is to generalize this decomposition and to obtain its predictable analog.

Notation 7.3. (f, g)_0 := ∫_{E\F} f g dm.

Theorem 7.4 If f, g ∈ pE*, then
(i) (P̂^λ_F f, g)_0 = ν_B(f · V^λ_op g), and
(ii) (P̂^λ_{F+} f, g)_0 = ν_C(f · V^λ_pr g).
Here V^λ_op f = P•_op ∫_0^{T_F} e^{−λt} f(X_t) dt and V^λ_pr f is defined analogously with P•_pr replacing P•_op.

Proof. We shall prove only (ii), as the proof of (i) is similar but easier since no Ray limit is involved. From Lemma 7.1(ii), P̂^x[T̂_F > 0] = 1, m-a.e. on E \ F. Thus

(P̂^λ_{F+} f, g)_0 = Q_m[P̂^λ_{F+} f(Y_0) g(Y_0); Y_0 ∈ E \ F]
 = Q_m[e^{−λT̂_F∘θ̌_0} f(X̂^r_{T̂_F+})∘θ̌_0 g(Y_0); T̂_F∘θ̌_0 > 0, Y_0 ∈ E \ F],

because T̂_F∘θ̌_0 > 0, Q_m-a.e. on the event {Y_0 ∈ E \ F}. If s = −T̂_F∘θ̌_0 > α, then s ∈ G* (defined below (6.17)), and ]s, s + T_F∘θ_s[ is the unique interval contiguous to M* that contains 0. If s ∈ G*, then Y^r_{s−} = Y_{s−} by Remark 3.2. Therefore, using (6.19) for the second equality below,

(P̂^λ_{F+} f, g)_0 = Q_m Σ_{s∈G*, s<0} e^{λs} f(Y_{s−}) g(X_{−s})∘θ_s 1_{s+T_F∘θ_s>0}
 = ∫_0^∞ e^{−λt} dt ∫_F ν_C(dx) f(x) P^x_pr[g(X_t); t < T_F]
 = ν_C(f · V^λ_pr g).

Let (Q_t)_{t≥0} and (V^λ)_{λ≥0} denote the semigroup and resolvent for (X, T_F), the process X killed at time T_F:

Q_t f(x) := P^x[f(X_t); t < T_F],
V^λ f(x) := P^x ∫_0^{T_F} e^{−λt} f(X_t) dt = ∫_0^∞ e^{−λt} Q_t f(x) dt.
Let (X̂, T̂_F) denote X̂ killed at T̂_F, with corresponding semigroup (Q̂_t)_{t≥0} and resolvent (V̂^λ)_{λ≥0}. As is customary, V := V^0 and V̂ := V̂^0. It is known that (X, T_F) and (X̂, T̂_F) are dual processes, in the sense that (V^λ f, g)_0 = (f, V̂^λ g)_0 for all f, g ∈ pE* and λ ≥ 0. See [12, (A.7)]. We write Q^op_t f := P•_op[f(X_t); t < T_F], with an analogous definition for Q^pr_t.

Corollary 7.5 Fix f ∈ bpE*. Then the formulas

η^f(g) := (P̂_{F+} f, g)_0  and  ξ^f(g) := (P̂_F f, g)_0,    g ∈ pE,    (7.19)

define (σ-finite) purely excessive measures η^f and ξ^f for (X, T_F). Moreover, η^f_t := (f ν_C) Q^pr_t and ξ^f_t := (f ν_B) Q^op_t, t > 0, define entrance laws for (X, T_F) such that

η^f = ∫_0^∞ η^f_t dt  and  ξ^f = ∫_0^∞ ξ^f_t dt.

If, in addition, ν_C(f) < ∞ (resp. ν_B(f) < ∞), then η^f_t (resp. ξ^f_t) is a finite measure for each t > 0.

Proof. Clearly η^f and ξ^f are σ-finite measures on E \ F. If t > 0 then Q̂_t P̂_{F+} f = P̂•[f∘X̂^r_{T̂_F+}; t < T̂_F < ∞] ≤ P̂_{F+} f, and because of Lemma 7.1(ii) we have Q̂_t P̂_{F+} f ↑ P̂_{F+} f as t ↓ 0, m-a.e. on E \ F. Since f is bounded, Q̂_t P̂_{F+} f → 0 as t → ∞. Hence η^f is a purely excessive measure for (X, T_F). It follows from Theorem 7.4(ii) that η^f = ∫_0^∞ η^f_t dt. Using the fact that (X_t)_{t>0} under P^x_pr is Markovian with transition semigroup (P_t), one easily checks that η^f_{t+s} = η^f_t Q_s for t, s > 0. Recall that P•_pr[1 − exp(−T_F)] ≤ 1; see (3.10) and the sentence following (2.6). Now (1 − e^{−t}) ≤ (1 − e^{−T_F}) on {t < T_F}, and ν_C is σ-finite. This implies that η^f_t is a countable sum of finite measures for each t > 0. Fix g ∈ pE with 0 < g ≤ 1 on E \ F and η^f(g) < ∞. Then V g > 0 on E \ F, and we may use the Fubini theorem to conclude that

η^f_t(V g) = ∫_t^∞ η^f_s(g) ds ≤ η^f(g) < ∞.

Therefore η^f_t is in fact σ-finite for each t > 0. Consequently, (η^f_t)_{t>0} is an entrance law for (X, T_F). If, in addition, ν_C(f) < ∞, then η^f_t(1) ≤ ν_C(f) < ∞. The treatment of (ξ^f_t)_{t>0} is similar.

Corollary 7.6 For f, g ∈ E*,
(i) (P̂^λ_F f, g) = ν_B(ℓ f g) + ν_B(f V^λ_op g) = ν_{B^c}(ℓ f g) + ν_B(f V^λ_op g);
(ii) (P̂^λ_{F+} f, g) = ν_C(γ g P̂_{0+} f) + ν_C(f V^λ_pr g).
Here ℓ comes from (2.6), B^c is the continuous part of B, γ is defined just below (3.11), and P̂_{0+} f := P̂•[f(X̂^r_{0+})].
Proof. Since σ_t Q_m = Q_m for all t ∈ R,

∫_F P̂^λ_F f(x) g(x) m(dx) = Q_m[P̂^λ_F f(Y_0) g(Y_0) 1_F(Y_0)]
 = ∫_0^1 Q_m[P̂^λ_F f(Y_t) g(Y_t) 1_F(Y_t)] dt.

Also, (2.6) implies that 1_F(Y_t) dt = ℓ(Y_t) κ_B(dt), where κ_B is the HRM of Y that extends B; notice that κ_B has Revuz measure ν_B. See, for example, the discussion on pages 89–91 of [15]. Therefore, by [15, (8.21)],

∫_F P̂^λ_F f(x) g(x) m(dx) = Q_m ∫_0^1 P̂^λ_F f(Y_t) g(Y_t) ℓ(Y_t) κ_B(dt) = ν_B(ℓ g P̂^λ_F f).

But ℓ = 0 on E \ F and ℓ ν_B = ℓ ν_{B^c}. In view of Lemma 7.1(i), P̂^x[T̂_F = 0] = 1, ν_{B^c}-a.e., because ν_{B^c} doesn't charge m-semipolars. Hence ν_B(ℓ g P̂^λ_F f) = ν_{B^c}(ℓ g f), since P̂^x[X̂_0 = x] = 1 by convention. Combining this with Theorem 7.4(i) yields Corollary 7.6(i). A similar argument shows that

∫_F P̂^λ_{F+} f(x) g(x) m(dx) = ν_C(γ g P̂^λ_{F+} f) = ν_C(γ g P̂_{0+} f),

establishing Corollary 7.6(ii).
Proposition 7.7 If f, g ∈ pE*, then

(P̂^λ_{F+} f, P_F g) = ν_C(γ g P̂_{0+} f) + ν_C(f V^λ_pr P_F g) = (P̂_{F+} f, P^λ_F g).    (7.20)

Proof. The first equality is an immediate consequence of Corollary 7.6(ii) since P_F g = g on F. For the second equality, arguing as in the proof of Corollary 7.6, we have

∫_F P̂_{F+} f · P^λ_F g dm = ν_C(γ g P̂_{0+} f).

Also, as in the proof of Theorem 7.4,

(P̂_{F+} f, P^λ_F g)_0 = Q_m Σ_{s∈G*, s<0} f(Y_{s−}) e^{−λ(s+T_F∘θ_s)} g(X_{T_F})∘θ_s 1_{s+T_F∘θ_s>0}
 = ∫_F ν_C(dx) f(x) P^x_pr ∫_{−∞}^0 e^{−λ(s+T_F)} g(X_{T_F}) 1_{T_F>−s} ds
 = ∫_F ν_C(dx) f(x) P^x_pr ∫_0^{T_F} e^{−λ(T_F−s)} g(X_{T_F}) ds
 = ∫_F ν_C(dx) f(x) P^x_pr ∫_0^{T_F} e^{−λu} g(X_{T_F}) du
 = ∫_F ν_C(dx) f(x) P^x_pr ∫_0^{T_F} e^{−λu} g(X_{T_F})∘θ_u du
 = ν_C(f V^λ_pr P_F g).
Combining these observations yields the second equality in (7.20).
In the same way one has

(P̂^λ_F f, P_F g) = ν_{B^c}(ℓ f g) + ν_B(f V^λ_op g) = (P̂_F f, P^λ_F g).    (7.21)

Formulas (7.20) and (7.21) reduce to (3.9) of [12] when F is a singleton.

Let us suppose in this paragraph that R_F m = m (otherwise, replace m with R_F m). Let us make the special choice A = C in the preceding discussion. Then by (7.17) and Corollary 7.6(ii) with λ = 0 and f = 1,

m(g) = ν^m_C(γ g + V_pr g),    g ∈ pE,    (7.22)

where V_pr = V^0_pr. Notice that the right side of (7.22) depends on m only through the Revuz measure ν^m_C, which is excessive for the time-changed process X̃. Following up on earlier work ([18, 19, 20, 12, 27]) we use this formula to construct an excessive measure for X, given an excessive measure for X̃.

Proposition 7.8 Suppose that P^x[T_F < ∞] > 0 for all x ∈ E. Let ν be an excessive measure for X̃. Then

η(g) := ν(γ g + V_pr g),    g ∈ pE,    (7.23)

defines an excessive measure η for X such that (i) R_F η = η and (ii) ν^η_C = ν. The measure η is uniquely determined by these two conditions.

Proof. According to [9, (5.12), (5.13)], under the hypothesis of the proposition there is a uniquely determined X-excessive measure m = m_ν such that R_F m = m and ν^m_C = ν. By Corollary 7.6(ii) we have

m(g) = R_F m(g) = (P̂_{F+} 1, g) = ν^m_C(γ g + V_pr g) = ν(γ g + V_pr g).

The right side of (7.23) therefore defines an excessive measure for X with the stated properties.

Remark 7.9 If the measure ν is conservative for X̃, then (5.17) of [9] implies that η is conservative (hence invariant) for X. In particular, this is the case if ν is a finite invariant measure for X̃. This extends a result of H. Kaspi [20] to all Borel right processes.

Define a measure Θ on F × F × ]0, ∞[ by

Θ(dx, dy, dt) := ν_C(dx) P^x_pr[X_{T_F} ∈ dy, T_F ∈ dt].    (7.24)

Because ν_C is σ-finite and 0 < P•_pr[1 − exp(−T_F)] ≤ 1, it is easy to check that Θ is σ-finite. Notice that the Feller measure Λ is related to Θ by Λ(dx, dy) = Θ(dx, dy, ]0, ∞[). Intuitively, Θ(dx, dy, dt) is the rate at which excursions from F of duration t originate from x and terminate at y.
Proposition 7.10 Suppose f, g ∈ p(E* ∩ F) with ν_C(f) < ∞ and g bounded. Then Θ(f, g, dt) is a σ-finite measure on ]0, ∞[, and for λ > 0,

λ(P̂^λ_{F+} f, P_F g) = λ ν_C(γ g P̂_{0+} f) + ∫_0^∞ (1 − e^{−λt}) Θ(f, g, dt).    (7.25)

Proof. It is immediate that ∫_0^∞ (1 − e^{−t}) Θ(f, g, dt) is dominated by ‖g‖_∞ ν_C(f). Consequently, Θ(f, g, ·) is σ-finite. Then

∫_0^∞ (1 − e^{−λt}) Θ(f, g, dt) = ∫_F f(x) P^x_pr[g(X_{T_F})(1 − e^{−λT_F})] ν_C(dx)
 = λ ∫_F f(x) ν_C(dx) P^x_pr ∫_0^{T_F} P_F g(X_t) e^{−λt} dt
 = λ ν_C(f V^λ_pr P_F g),

and combining this with Corollary 7.6(ii) we obtain (7.25).
Remark 7.11 Analogous results hold for the optional exit system. For example, employing the obvious notation,

λ(P̂^λ_F f, P_F g) = λ ν_{B^c}(γ f g) + ∫_0^∞ (1 − e^{−λt}) Θ_op(f, g, dt).    (7.26)

As pointed out earlier, η^f_t(g) = ν_C(f Q^pr_t g) and ξ^f_t(g) = ν_B(f Q^op_t g) are entrance laws for (X, T_F) provided f ∈ bpE*. If f ≡ 1 then P̂^λ_{F+} 1 = P̂^λ_F 1 = P̂•[e^{−λT̂_F}], m-a.e., by the obvious variant of (7.17). Let us write ϕ for this last function when λ = 0. Theorem 7.4 implies that ν_C V^λ_pr = ν_B V^λ_op as measures on E \ F for all λ ≥ 0. In particular, η^1_t = ξ^1_t since either entrance law integrates to the measure ϕ m_0, which is purely excessive for (X, T_F). (Here m_0 := m|_{E\F}.)

We conclude by recording some additional extensions to the present context of some formulas obtained in [12] in the context of excursions from a point. First recall the definition of the energy functional L^0 of the killed process X^0 := (X, T_F); see [15, §3]. If ξ is an X^0-excessive measure and f is an X^0-excessive function, then

L^0(ξ, f) := sup{μ(f) : μV ≤ ξ},    (7.27)

in which μ ranges over the σ-finite measures on F. If ξ is purely excessive for X^0, then [15, (3.6)]

L^0(ξ, f) = lim_{λ→∞} λ⟨ξ − λξV^λ, f⟩ = lim_{t↓0} t^{−1}⟨ξ − ξQ_t, f⟩,    (7.28)

where ⟨μ, f⟩ := ∫ f dμ. (Both of the limits in (7.28) are monotone increasing.) Define

ψ := 1 − P_F 1 = P•[T_F = ∞].    (7.29)

It is easily checked that ψ is X^0-excessive. We fix f ∈ bpE. Then η^f := P̂_{F+} f · m_0 = ∫_0^∞ η^f_t dt is purely excessive for X^0.
Theorem 7.12 (i) If g is X^0-excessive then L^0(η^f, g) = lim_{t↓0} η^f_t(g).
(ii) L^0(η^f, ψ) = ∫_F f(x) P^x_pr[T_F = ∞, ζ > 0] ν_C(dx).
(iii) (Recall the definition (6.21) of the supplementary Feller measure δ.)

δ(f) = ∫_F f(x) P^x_pr[T_F = ∞] ν_C(dx) = L^0(η^f, ψ) + ∫_F f(x) P^x_pr[ζ = 0] ν_C(dx).    (7.30)

Proof. Abbreviate η = η^f and η_t = η^f_t during this proof. Suppose first that g ∈ pE* with η(g) < ∞. Then

⟨η − ηQ_t, g⟩ = η(g) − η(Q_t g) = ∫_0^t η_s(g) ds.

The extreme terms in this display are positive measures in g, so for general g ∈ pE* we deduce that

⟨η − ηQ_t, g⟩ = ∫_0^t η_s(g) ds.    (7.31)

If g is X^0-excessive then η_{t+s}(g) = η_t(Q_s g) ↑ η_t(g) as s ↓ 0. Thus t ↦ η_t(g) is right continuous and decreasing on ]0, ∞[. In particular, ↑lim_{t↓0} η_t(g) exists, though it may equal +∞. Therefore, by (7.31),

L^0(η, g) = lim_{t↓0} t^{−1} ∫_0^t η_s(g) ds = lim_{t↓0} ∫_0^1 η_{tu}(g) du = lim_{t↓0} η_t(g),

by monotone convergence, establishing (i).

Next, η_t(ψ) = ν_C(f Q^pr_t ψ) and Q^pr_t ψ = P•_pr[T_F = ∞; t < T_F ∧ ζ] ↑ P•_pr[T_F = ∞, ζ > 0] as t ↓ 0. Hence L^0(η, ψ) = P^{fν_C}_pr[T_F = ∞, ζ > 0]. But P^{fν_C}_pr[T_F = ∞, ζ = 0] = P^{fν_C}_pr[ζ = 0], proving both (ii) and (iii).

Remark 7.13 Intuitively, δ(f) = P^{fν_C}_pr[T_F = ∞] represents the rate (weighted by f) at which a final excursion of infinite length occurs, terminating M. Theorem 7.12 indicates that P^{fν_C}_pr[ζ = 0] is the weighted rate at which the process X is killed while in F; L^0(η, ψ) is the corresponding rate of occurrence of an excursion in which the process wanders away from F, never to return. Exactly the same argument establishes the analogous facts in the optional case.
References
1. Benveniste, A. and Jacod, J.: Systèmes de Lévy des processus de Markov, Invent. Math. 21 (1973) 183–198.
2. Blumenthal, R.M. and Getoor, R.K.: Markov Processes and Potential Theory. Academic Press, New York, 1968.
3. Chen, Z., Fukushima, M., and Ying, J.: Traces of symmetric Markov processes and their characterizations, Ann. Probab. 34 (2006) 1052–1102.
4. Chen, Z., Fukushima, M., and Ying, J.: Entrance law, exit system and Lévy system of time changed processes, Illinois J. Math. 50 (2006) 269–312.
5. Dellacherie, C.: Autour des ensembles semi-polaires. In Seminar on Stochastic Processes, 1987, pp. 65–92. Birkhäuser, Boston, 1988.
6. Dellacherie, C. and Meyer, P.-A.: Probabilités et Potentiel. Chapitres I à IV. Hermann, Paris, 1978.
7. Dellacherie, C. and Meyer, P.-A.: Probabilités et Potentiel. Chapitres XII–XVI. Théorie du potentiel associée à une résolvante. Théorie des processus de Markov. Hermann, Paris, 1987.
8. Fitzsimmons, P.J.: Homogeneous random measures and a weak order for the excessive measures of a Markov process, Trans. Amer. Math. Soc. 303 (1987) 431–478.
9. Fitzsimmons, P.J. and Getoor, R.K.: Revuz measures and time changes, Math. Zeit. 199 (1988) 233–256.
10. Fitzsimmons, P.J. and Getoor, R.K.: Smooth measures and continuous additive functionals of right Markov processes. In Itô's Stochastic Calculus and Probability Theory. Springer, Tokyo, 1996, pp. 31–49.
11. Fitzsimmons, P.J. and Getoor, R.K.: Homogeneous random measures and strongly supermedian kernels of a Markov process, Electronic Journal of Probability 8 (2003), Paper 10, 54 pages.
12. Fitzsimmons, P.J. and Getoor, R.K.: Excursion theory revisited, Illinois J. Math. 50 (2006) 413–437.
13. Fitzsimmons, P.J. and Maisonneuve, B.: Excessive measures and Markov processes with random birth and death, Probab. Th. Rel. Fields 72 (1986) 319–336.
14. Fukushima, M., He, P., and Ying, J.: Time changes of symmetric diffusions and Feller measures, Ann. Probab. 32 (2004) 3138–3166.
15. Getoor, R.K.: Excessive Measures. Birkhäuser, Boston, 1990.
16. Getoor, R.K.: Measure perturbations of Markovian semigroups, Potential Anal. 11 (1999) 101–133.
17. Gzyl, H.: Lévy systems for time-changed processes, Ann. Probab. 5 (1977) 565–570.
18. Harris, T.E.: The existence of stationary measures for certain Markov processes. In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, 1954–1955, vol. II, pp. 113–124, Berkeley, 1956.
19. Kaspi, H.: Excursions of Markov processes: an approach via Markov additive processes, Z. Wahrsch. verw. Gebiete 64 (1983) 251–268.
20. Kaspi, H.: On invariant measures and dual excursions of Markov processes, Z. Wahrsch. verw. Gebiete 66 (1984) 185–204.
21. Le Jan, Y.: Balayage et formes de Dirichlet, Z. Wahrsch. verw. Gebiete 37 (1977) 297–319.
22. Maisonneuve, B.: Exit systems, Ann. Probab. 3 (1975) 399–411.
23. Maisonneuve, B.: Processus de Markov: naissance, retournement, régénération. In Springer Lecture Notes in Math. 1541, pp. 263–292. Springer, Berlin, 1993.
24. Motoo, M.: The sweeping-out of additive functionals and processes on the boundary, Ann. Inst. Statist. Math. 16 (1964) 317–345.
L´evy Systems and Time Changes
259
25. Motoo, M.: Application of additive functionals to the boundary problem of Markov processes. L´evy’s system of U -processes. In Proc. Fifth Berkeley Sympos. Math. Statist. and Probability, vol. II, part II, pp. 75–110, Berkeley, 1966. 26. Sharpe, M.J.: General Theory of Markov Processes. Academic Press, Boston, 1988. 27. Silverstein, M.: Classification of coharmonic and coinvariant functions for a L´evy process, Ann. Probab. 8 (1980) 539–575. 28. Watanabe, S.: On discontinuous additive functionals and L´evy measures of a Markov process, Japan. J. Math. 34 (1964) 53–70.
Self-Similar Branching Markov Chains
Nathalie Krell
Laboratoire de Probabilités et Modèles Aléatoires, Université Paris 6, 175 rue du Chevaleret, 75013 Paris, France. e-mail: [email protected]
Summary. The main purpose of this work is to study self-similar branching Markov chains. First we construct such a process. Using the theory of self-similar Markov processes, we then prove a limit theorem for a randomly tagged individual. Finally, we obtain further results, in particular an Lp(P) limit theorem on the convergence of the empirical measure associated with the sizes of the fragments of the branching chain.
Key words: Branching process, Self-similar Markov process, Tree of generations, Limit theorems.
MSC 2000: 60J80, 60G18, 60F25, 60J27.
1 Introduction

This work is a contribution to the study of a special type of branching Markov chains. We will construct a continuous-time branching chain X which has a self-similar property and which takes its values in the space of finite point measures on R∗+. This type of process generalizes self-similar fragmentations (see [4]) and may apply to cases where the size models non-additive quantities, such as surface energy in aerosols. We will focus on the case where the self-similarity index α is non-negative, which means that bigger individuals reproduce faster than smaller ones. There is no loss of generality in considering this model, as the map x → x−1 on atoms of R∗+ transforms a self-similar process with index α into another one with index −α (and preserves the Markov property). In this article we choose to construct the process by hand. We extend the method used in [4] to deal with more general processes in which an individual may have a mass bigger than that of its parent. We will explain in the sequel which difficulties this new set-up entails. There are closely related articles about branching processes, among others [18], [19] by Kyprianou and [12], [13] by Chauvin. Notice however that here the splitting times of the process depend on the sizes of the atoms of the process.
C. Donati-Martin et al. (eds.), Séminaire de Probabilités XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6_10, © Springer-Verlag Berlin Heidelberg 2009
More precisely, we will first introduce a branching Markov chain as a marked tree, obtaining a process indexed by generations (it is simply a random mark on the tree of generations, see Section 2). Using a martingale associated to the latter and the theory of random stopping lines on the tree of generations, we will then define the process indexed by time. After having constructed the process, we will study the evolution of a randomly chosen branch of the chain, from which we shall deduce some limit theorems, relying on the theory of self-similar Markov processes. In an appendix we will consider the intrinsic process and give some properties in the spirit of the article of Jagers [15]. Along the way we will establish properties of the aforementioned martingale.
2 The Marked Tree

In this part we will introduce a branching Markov chain as a marked tree, which gives a genealogical description of the process that we will construct. This terminology comes from Neveu [21], even if the marked tree we consider here is slightly different. First we introduce some notations and definitions. A finite point measure on R∗+ is a finite sum of Dirac point masses s = Σ_{i=1}^n δ_{si}, where the si are called the atoms of s and n ≥ 0 is an arbitrary integer. We shall often write ⟨s⟩ = n = s(R∗+) for the number of atoms of s, and Mp(R∗+) for the space of finite point measures on R∗+. We also define, for f : R∗+ → R a measurable function and s ∈ Mp(R∗+),

⟨f, s⟩ := Σ_{i=1}^{⟨s⟩} f(si),

by taking the sum over the atoms of s repeated according to their multiplicity; and we will sometimes use the slight abuse of notation

⟨f(x), s⟩ := Σ_{i=1}^{⟨s⟩} f(si)
when f is defined as a function of the variable x. We endow the space Mp(R∗+) with the topology of weak convergence, which means that sn converges to s if and only if ⟨f, sn⟩ converges to ⟨f, s⟩ for all continuous bounded functions f. Let α ≥ 0 be an index of self-similarity and ν be some probability measure on Mp(R∗+). The aim of this work is to construct a branching Markov chain X = ((Σ_{i=1}^{⟨X(t)⟩} δ_{Xi(t)})_{t≥0}) with values in Mp(R∗+), which is self-similar with index α and has reproduction law ν. The index of self-similarity will play a part in the rate at which an individual reproduces, and the reproduction law ν will specify the distribution of the offspring. We stress that our setting includes the case when
ν(∃i : si > 1) > 0,   (1)
which means that with positive probability the size of a daughter can exceed that of her mother. To do that, exactly as described in Chapter 1, Section 1.2.1 of [4], we will construct a marked tree. We consider the Ulam–Harris labelling system

U := ∪_{n=0}^∞ N^n,
with the notation N = {1, 2, . . .} and N^0 = {∅}. In the sequel the elements of U are called nodes (or sometimes also individuals) and the distinguished node ∅ the root. For each u = (u1, . . . , un) ∈ U, we call n the generation of u and write |u| = n, with the obvious convention |∅| = 0. When n ≥ 0, u = (u1, . . . , un) ∈ N^n and i ∈ N, we write ui = (u1, . . . , un, i) ∈ N^{n+1} for the i-th child of u. We also define for u = (u1, . . . , un) with n ≥ 2, mu = (u1, . . . , un−1) the mother of u, and mu = ∅ if u ∈ N. If v = m^n u for some n ≥ 0 we say that u stems from v; for a set M of nodes, we say that a node stems from M if it stems from some u ∈ M, and that a line L stems from M if all x ∈ L stem from M. Here it will be convenient to identify the point measure s with the infinite sequence (s1, . . . , sn, 0, . . .) obtained by appending infinitely many 0's to the finite sequence of the atoms of s. In particular we say that a random infinite sequence (ξi, i ∈ N) has law ν if there is a (random) index n such that ξi = 0 ⇔ i > n and the finite point measure Σ_{i=1}^n δ_{ξi} has law ν.

Definition 1. Let (ξ^u, u ∈ U) and (eu, u ∈ U) be two independent families of i.i.d. variables indexed by the nodes of the tree, where for each u ∈ U, ξ^u = (ξ_i^u)_{i∈N} is distributed according to the law ν, and (e_{ui})_{i∈N} is a sequence of i.i.d. exponential variables with parameter 1. Define recursively, for some fixed x > 0,

a_∅ := 0,  ζ_∅ := x^{−α} e_∅,  ξ_∅ := x,

and for u ∈ U and i ∈ N:

ξ_{ui} := ξ_i^u ξ_u,  a_{ui} := a_u + ζ_u,  ζ_{ui} := ξ_{ui}^{−α} e_{ui}.
To each node u of the tree U, we associate the mark (ξu, au, ζu), where ξu is the size, au the birth-time and ζu the lifetime of the individual with label u. We call Tx = ((ξu, au, ζu)_{u∈U}) a marked tree with root of size x, and the associated law is denoted by Px. Let Ω̄ be the set of all possible marked trees.
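To fix ideas, the recursion of Definition 1 can be simulated directly. The sketch below is not from the paper (the helper names and the choice of ν are illustrative assumptions): it grows the marks (ξu, au, ζu) down to a fixed generation, with `sample_nu` standing in for one draw of the relative sizes from ν.

```python
import random

def sample_tree(x, alpha, sample_nu, max_gen=4):
    """Grow the marks (xi_u, a_u, zeta_u) of Definition 1 down to
    generation max_gen.  Nodes are tuples of integers, () being the root.
    sample_nu() returns the list of atoms of one draw from nu."""
    root = ()
    marks = {root: (x, 0.0, x ** (-alpha) * random.expovariate(1.0))}
    frontier = [root]
    for _ in range(max_gen):
        nxt = []
        for u in frontier:
            xi_u, a_u, zeta_u = marks[u]
            for i, rel in enumerate(sample_nu(), start=1):
                xi = rel * xi_u                  # xi_{ui} = xi_i^u * xi_u
                a = a_u + zeta_u                 # daughters are born when the mother dies
                zeta = xi ** (-alpha) * random.expovariate(1.0) if xi > 0 else 0.0
                marks[u + (i,)] = (xi, a, zeta)
                nxt.append(u + (i,))
        frontier = nxt
    return marks
```

For instance, `sample_tree(1.0, 1.0, lambda: [0.5, 0.5], max_gen=3)` grows three generations of a deterministic binary splitting; note that, as in the text, sisters share the same birth time, namely the death time of their mother.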
The sizes of the individuals (ξu, u ∈ U) define a multiplicative cascade (see the references in Section 3 of [5]). However the latter is not sufficient to construct the process X; in fact we also need the information given by ((au, ζu), u ∈ U). Another useful concept is that of a line. A subset L ⊂ U is a line if for every u, v ∈ L, u stems from v ⇒ u = v. The pre-L sigma-algebra is

HL := σ(ξ^u, eu ; ∃l ∈ L : l stems from u).

A random set of individuals J : Ω̄ → P(U) is optional if {L stems from J} ∈ HL for every line L ⊂ U, where P(U) is the power set of U. An optional line is a random line which is optional. For any optional set J we define the pre-J algebra by:

A ∈ HJ ⇔ ∀ line L ⊂ U : A ∩ {L stems from J} ∈ HL.

The first result is:

Lemma 1. The marked tree constructed in Definition 1 satisfies the strong Markov branching property: for J an optional line and ϕu : Ω̄ → [0, 1], u ∈ U, measurable functions, we have

E1( Π_{u∈J} ϕu ∘ T^{ξu} | HJ ) = Π_{u∈J} E_{ξu}(ϕu),
where T^{ξu} is the marked tree extracted from T1 at the node (ξu, au, ζu); more precisely T^{ξu} = ((ξ_{uv}, a_{uv} − a_u, ζ_{uv})_{v∈U}).
Proof. Thanks to the i.i.d. properties of the random variables (ξu, u ∈ U) and (eu, u ∈ U), the Markov property for deterministic lines is easily checked. In order to get the result for a general optional line, we use Theorem 4.14 of [15]. Indeed, the tree we have constructed is a special case of the tree constructed by Jagers in [15]. In our case, in Jagers's notation, the type of u ∈ U is the mass ξu of u, the birth time σu is au, and τu is here equal to ζ_{mu} (because a mother dies when giving birth to her daughters). We notice that sisters always have the same birth time, which means that for all u ∈ U and all i ∈ N, τ_{ui} is here equal to ζu.
3 Malthusian Hypotheses and the Intrinsic Martingale We introduce some notations to formulate the fundamental assumptions of this work:
p̲ := inf{ p ∈ R : ∫_{Mp(R∗+)} ⟨x^p, s⟩ ν(ds) < ∞ }

and

p∞ := inf{ p > p̲ : ∫_{Mp(R∗+)} ⟨x^p, s⟩ ν(ds) = ∞ }

(with the convention inf ∅ = ∞), and then for every p ∈ (p̲, p∞):

κ(p) := ∫_{Mp(R∗+)} (1 − ⟨x^p, s⟩) ν(ds).
Note that κ is a continuous and concave function (but not necessarily a strictly increasing one) on (p̲, p∞), as p ↦ ∫_{Mp(R∗+)} ⟨x^p, s⟩ ν(ds) is a convex application. By concavity, the equation κ(p) = 0 has at most two solutions on (p̲, p∞). When a solution exists, we denote by p0 := inf{p ∈ (p̲, p∞) : κ(p) = 0} the smallest, and call p0 the Malthusian exponent. We now make the fundamental:

Malthusian Hypotheses. We suppose that the Malthusian exponent p0 exists, that p0 > 0, and that κ(p) > 0 for some p > p0.   (2)

Furthermore we suppose that the integral

∫_{Mp(R∗+)} (⟨x^{p0}, s⟩)^p ν(ds)   (3)

is finite for some p > 1.

Throughout the rest of this article, these hypotheses will always be taken for granted. Note that (2) always holds when ν(si ≤ 1 for all i) = 1 (fragmentation case). We stress that κ may not be strictly increasing, and may not be negative when p is sufficiently large (see Subsection 6.1 for a consequence of this fact).

We will give one example based on the Dirichlet process (see Kingman's book [16]). Fix n ≥ 2, let (υ1, . . . , υn) be n positive real numbers and υ = Σ_{i=1}^n υi. Define the simplex Δn by
Δn := { (p1, p2, . . . , pn) ∈ R^n_+ : Σ_{j=1}^n pj = 1 }.

The Dirichlet distribution of parameter (υ1, . . . , υn) over the simplex Δn has density (with respect to the (n−1)-dimensional Lebesgue measure on Δn):

f(p1, . . . , pn) = (Γ(υ) / (Γ(υ1) · · · Γ(υn))) p1^{υ1−1} · · · pn^{υn−1}.
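The Dirichlet distribution above can be sampled by normalising independent Gamma variables, a standard representation; a minimal sketch (the helper name is an assumption, not notation from the paper):

```python
import random

def dirichlet(upsilons):
    """Sample (p1, ..., pn) from the Dirichlet distribution with
    parameter (v1, ..., vn): normalise independent Gamma(v_i, 1)'s."""
    g = [random.gammavariate(v, 1.0) for v in upsilons]
    total = sum(g)
    return [x / total for x in g]
```

The resulting vector is positive and sums to 1, i.e., it lies on the simplex Δn.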
Let a := υ(υ+1)/(Σ_{i=1}^n υi(υi+1)). Note that a is strictly larger than 1. Let the reproduction measure be the law of (aX1, . . . , aXn), where (X1, . . . , Xn) is a random vector with Dirichlet distribution of parameter (υ1, . . . , υn). Therefore

κ(p) = 1 − a^p (Γ(υ)/Γ(υ+p)) Σ_{i=1}^n Γ(p+υi)/Γ(υi),

p̲ = −υ, p0 = 1, and the Malthusian hypotheses are verified.

In this article we will call extinction the event that for some n ∈ N, all nodes u at the n-th generation have zero size, and non-extinction the complementary event. The probability of extinction is strictly positive whenever ν(s1 = 0) > 0, and equals zero if and only if ν(s1 = 0) = 0 (since we have supposed (3); see p. 28 of [4]). After these definitions, we introduce a fundamental martingale associated to (ξu, u ∈ U).

Theorem 1. The process

Mn := Σ_{|u|=n} ξ_u^{p0},  n ∈ N,
is a martingale in the filtration (H_{Ln}), with Ln the line associated to the n-th generation (i.e., Ln := {u ∈ U : |u| = n}). This martingale is bounded in Lp(P) for some p > 1, and in particular uniformly integrable. Moreover, conditionally on non-extinction, the terminal value M∞ is strictly positive a.s.

Remark 1. As κ is concave, the equation κ(p) = 0 may have a second root p+ := inf{p > p0 : κ(p) = 0}. This second root is less interesting: even though

M_n^+ := Σ_{|u|=n} ξ_u^{p+},  n ∈ N,

is also a martingale, it is easy to check that for all p > 1 the p-variation of M_n^+ is infinite, i.e., E(Σ_{n=0}^∞ |M_{n+1}^+ − M_n^+|^p) = ∞. We can notice that for all p ∈ (p0, p+), (M_n^{(p)})_{n∈N} := (Σ_{|u|=n} ξ_u^p)_{n∈N} is a supermartingale. Assumption (3) actually means that E(M_1^p) < ∞.

Proof. • We will use the fact that the empirical measure of the logarithms of the sizes of the fragments

Z^{(n)} := Σ_{|u|=n} δ_{log ξu}   (4)
can be viewed as a branching random walk (see the article of Biggins [8]) and use Theorem 1 of [8]. In order to do that we first introduce some notation: for θ > p̲, we define
m(θ) := E(∫ exp(θx) Z^{(1)}(dx)) = E(Σ_{|u|=1} ξ_u^θ) = 1 − κ(θ)

and

W^{(n)}(θ) := m(θ)^{−n} ∫ exp(θx) Z^{(n)}(dx) = (1 − κ(θ))^{−n} Σ_{|u|=n} ξ_u^θ.
Notice that Mn = W^{(n)}(p0). Therefore, in order to apply Theorem 1 of [8] and to get almost sure convergence and convergence in pth mean for some p > 1, it is enough to show that E(W^{(1)}(p0)^γ) < ∞ for some γ ∈ (1, 2] and
m(pp0 )/|m(p0 )|p < 1
for some p ∈ (1, γ]. The first condition is a consequence of the Malthusian assumption. Moreover the second follows from the identities m(pp0)/|m(p0)|^p = (1 − κ(pp0))/|1 − κ(p0)|^p = 1 − κ(pp0), which, by the definition of p0, is smaller than 1 for p > 1 well chosen.
• Finally, let us now check that M∞ > 0 a.s. conditionally on non-extinction. Define q = P(M∞ = 0); as E(M∞) = 1 we get q < 1. Moreover, an application of the branching property yields E(q^{Zn}) = q, where Zn is the number of individuals with positive size at the n-th generation. Notice that Zn = ⟨1, Z^{(n)}⟩. By construction of the marked tree and as ν is a probability measure, (Zn, n ∈ N) is a Galton–Watson process, and it follows that q is its probability of extinction. Since M∞ = 0 conditionally on extinction, the two events coincide a.s.
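A quick numerical sanity check of Theorem 1, under an assumed toy reproduction law rather than the paper's setting: for the conservative binary law ν = law of (V, 1 − V) with V uniform, one has κ(1) = 0, so p0 = 1 and M_n = Σ_{|u|=n} ξ_u is in fact constant equal to 1 along each path.

```python
import random

def martingale_path(n_gens, p0=1.0):
    """One path of M_n = sum over generation-n nodes of xi_u**p0, for the
    conservative binary reproduction nu = law of (V, 1 - V), V uniform.
    Here kappa(1) = 0, so p0 = 1 and the path should stay at 1."""
    sizes = [1.0]
    path = [sum(s ** p0 for s in sizes)]
    for _ in range(n_gens):
        nxt = []
        for s in sizes:
            v = random.random()
            nxt.extend([s * v, s * (1.0 - v)])
        sizes = nxt
        path.append(sum(s ** p0 for s in sizes))
    return path
```

For a non-conservative ν (or p0 ≠ 1) the path fluctuates, but its expectation stays at 1, which is the martingale property of Theorem 1.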
4 Evolution of the Process in Continuous Time

After having defined the process indexed by generations and having shown that the martingale Mn is bounded in Lp(P), we are now able to properly define the main object of this paper. In order to do this, when an individual labelled by u has positive size ξu > 0, call Iu := [au, au + ζu) the time period during which this individual is alive. Otherwise, i.e., when ξu = 0, we decide that Iu = ∅. With this definition, we set:

Definition 2. Define the process X = (X(t), t ≥ 0) by

X(t) = Σ_{u∈U} 1l_{{t∈Iu}} δ_{ξu},  t ≥ 0.   (5)
In particular one has, for any measurable function f : R+ → R,

⟨f, X(t)⟩ = Σ_{u∈U} f(ξu) 1l_{{t∈Iu}}.
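Definition 2 lends itself to a direct simulation: explore the marked tree depth-first and keep the sizes ξu whose life interval Iu contains t. The sketch below is a hypothetical helper (a dissipative binary ν is used for illustration, and a generation cap is added as a crude safeguard); it assumes α ≥ 0, as in the text.

```python
import random

def atoms_at(t, x=1.0, alpha=1.0, sample_nu=lambda: [0.4, 0.5], max_gen=50):
    """Atoms of X(t): sizes xi_u of the individuals u whose life interval
    I_u = [a_u, a_u + zeta_u) contains t.  Branches born after t are pruned."""
    alive = []
    stack = [(x, 0.0, 0)]  # (size, birth time, generation)
    while stack:
        size, birth, gen = stack.pop()
        if size <= 0 or birth > t:
            continue
        life = size ** (-alpha) * random.expovariate(1.0)
        if t < birth + life:
            alive.append(size)          # u is alive at time t
        elif gen < max_gen:
            for rel in sample_nu():     # u has reproduced before t
                stack.append((rel * size, birth + life, gen + 1))
    return alive
```

Each call simulates a fresh tree; at t = 0 only the root of size x is alive.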
For every x > 0, let Px be the law of the process X starting from a single individual with size x. For simplicity, write P for P1, and let (Ft)_{t≥0} be the natural filtration of the process (X(t), t ≥ 0). We use the notation (X1(t), . . . , X_{⟨X(t)⟩}(t)) for the sequence of atoms of X(t). In the following we will show that this sequence is almost surely finite. Of course the set (X1(t), . . . , X_{⟨X(t)⟩}(t)) is the same as the set ((ξu); t ∈ Iu), but sometimes it will be clearer to use the notation (Xi(t)). Define for u ∈ R+:

F(u) := ∫_{Mp(R∗+)} u^{⟨s⟩} ν(ds).
Notice that F(u) is the generating function of the Galton–Watson process (Zn, n ≥ 0) = (#{u ∈ U : ξu > 0 and |u| = n}, n ≥ 0). From now on, we will suppose that for every ε > 0
∫_{1−ε}^1 du/(F(u) − u) = ∞.   (6)

Of course if F′(1) = E(Z1) < ∞ this assumption is fulfilled. Therefore we get the first theorem about the continuous-time process:

Theorem 2. The process X takes its values in the set Mp(R∗+). It is a branching Markov chain; more precisely, the conditional distribution of X(t+r) given that X(r) = s is the same as that of the sum Σ_i X^{(i)}(t), where for each index i, X^{(i)}(t) is distributed as X(t) under P_{si} and the variables X^{(i)}(t) are independent. The process X also has the scaling property, namely for every c > 0, the distribution of the rescaled process (cX(c^α t), t ≥ 0) under P1 is Pc.

In the fragmentation case, the fact that the size of the fragments decreases with time entails that the process of the fragments of size larger than or equal to ε is Markovian, and this easily leads to Theorem 2. This property is lost in the present case.

Proof. • First we will check that for all t ≥ 0, X(t) is a (random) finite point measure. By Theorem 1 and Doob's Lp-inequality we get that for some p > 1:

sup_{n∈N} Mn = sup_{n∈N} Σ_{|u|=n} ξ_u^{p0} ∈ Lp(P).

As a consequence:

sup_{u∈U} ξ_u^{p0} ∈ Lp(P)   (7)

and then, by the definition of the process X, writing X1(t), . . . for the (possibly infinite) sequence of atoms of X(t),

sup_i sup_{t∈R+} Xi(t)^{p0} ∈ Lp(P).
Recall that p0 > 0 by assumption. Fix some arbitrarily large m > 0. We now work conditionally on the event that the sizes of all individuals are bounded by m, and we will show that the number of individuals alive at time t is almost surely finite for all t ≥ 0. As we are conditioning on the event {sup_{u∈U} ξu ≤ m}, by construction of the marked tree, the lifetime of an individual can be stochastically bounded from below by an exponential variable with parameter m^α. Therefore we can bound the number of individuals present at time t by the number of individuals of a continuous-time branching process (denoted by GW) in which each individual lives for a random time whose law is exponential with parameter m^α and the probability distribution of the offspring is the law of ns ∨ 1 under ν (we have taken the supremum with 1 to ensure the absence of death). For the Markov branching process GW, we are in the temporally homogeneous case, and we notice that
∫_{Mp(R∗+)} u^{ns∨1} ν(ds) = F(u) + (u − 1) ν(ns = 0),
therefore, as we have supposed (6), we can use Theorem 1, p. 105, of the book of Athreya and Ney [3] (proved in Theorem 9, p. 107, of the book of Harris [14]) and get that GW is non-explosive. As the number of individuals is bounded by that of GW, we get that the number of individuals at time t is a.s. finite. Therefore, conditioning on the event {sup_{u∈U} ξu ≤ m}, we have that for all t ≥ 0 the number of individuals at time t is a.s. finite, i.e., X(t) is a finite point measure.
• Second we will show the Markov property. Fix r ∈ R+. Let τr be equal to {u ∈ U : r ∈ Iu}. Notice that τr is an optional line: in fact, for all lines L ⊂ U we have {L stems from τr} = {r < au + ζu ∀u ∈ L} ∈ HL. By definition, we have the identity
Σ_{j=1}^{⟨X(t+r)⟩} 1l_{{Xj(t+r)>0}} δ_{Xj(t+r)} = Σ_{u∈U} 1l_{{t+r∈Iu}} δ_{ξu}.
Let X(r) = Σ_{i=1}^n δ_{ξ_{vi}} ∈ Mp(R∗+), with n = ⟨X(r)⟩ and (v1, . . . , vn) the corresponding nodes of U. Define for all i ≤ n

T̃^{(i)} := ((ξ_{vi u}, a_{vi u} − a_{vi}, ζ_{vi u} − 1l_{{u=∅}}(r − a_{vi}))_{u∈U}) = ((ξ̃_u^{(i)}, ã_u^{(i)}, ζ̃_u^{(i)})_{u∈U}),

Ĩ_u^{(i)} := [ã_u^{(i)}, ã_u^{(i)} + ζ̃_u^{(i)}) and
X^{(i)}(t) = Σ_{u∈U} 1l_{{t∈Ĩ_u^{(i)}}} δ_{ξ̃_u^{(i)}}.

Then

X(t+r) = Σ_{i=1}^n X^{(i)}(t).
By the lack of memory of the exponential variable, we have that for u ∈ U, given r ∈ Iu, the law of the marked tree T̃^{(i)} is the same as that of T^{ξ_{vi}} := ((ξ_{vi u}, a_{vi u} − a_{vi}, ζ_{vi u})_{u∈U}) =: ((ξ_u^i, a_u^i, ζ_u^i)_{u∈U}). Thus we have the equality in law:
Σ_{u∈U} 1l_{{t∈Ĩ_u^{(i)}}} δ_{ξ̃_u^{(i)}}  (d)=  Σ_{u∈U} 1l_{{t∈I_u^i}} δ_{ξ_u^i},
with I_u^i := [a_u^i, a_u^i + ζ_u^i). Let τ_r^i := {vi u ∈ U : r ∈ I_u^i}. Moreover, for all lines L ⊂ U we have {L stems from τ_r^i} = {r < a_{vi u} + ζ_{vi u} ∀ vi u ∈ L} ∈ HL. Therefore τ_r^i is an optional line, and by applying Lemma 1 to the optional line τ_r^i, we obtain that the conditional distribution of the point measure

Σ_{u∈U} 1l_{{t+r∈I_u^i}} δ_{ξ_u^i}
given H_{τr} is the law of X(t) under P_{xi}. Notice that H_{τr} = σ(ξ̃u, eu : au ≤ r) is the same filtration as Fr = σ(X(s′) : s′ ≤ r). Therefore (X^{(1)}, X^{(2)}, . . . , X^{(n)}) is a sequence of independent random processes, where for each i, X^{(i)}(t) is distributed as X(t) under P_{xi}. We have thus proven the Markov property.
• The scaling property is an easy consequence of the definition of the tree Tx.

Remark 2. For every measurable function g : R∗+ → R∗+, define a multiplicative functional such that for every s = Σ_{i=1}^{⟨s⟩} δ_{si} ∈ Mp(R∗+),

φg(s) = exp(−⟨g, s⟩) = exp(−Σ_{i=1}^{⟨s⟩} g(si)).
Then the generator G of the Markov process X(t) fulfills, for every y = Σ_{i=1}^{⟨y⟩} δ_{yi} ∈ Mp(R∗+):

Gφg(y) = Σ_{i=1}^{⟨y⟩} y_i^α exp(−Σ_{j≠i} g(yj)) ∫_{Mp(R∗+)} (e^{−⟨g(x yi), s⟩} − e^{−g(yi)}) ν(ds).
The intrinsic martingale Mn is indexed by the generations; it will also be convenient to consider its analogue in continuous time, i.e.,

M(t) := ⟨x^{p0}, X(t)⟩ = Σ_{u∈U} 1l_{{t∈Iu}} ξ_u^{p0}.
It is straightforward to check that (M(t), t ≥ 0) is again a martingale in the natural filtration (Ft)_{t≥0} of the process (X(t), t ≥ 0); more precisely, the argument of Proposition 1.5 in [4] gives:

Corollary 1. The process (M(t), t ≥ 0) is a martingale; more precisely M(t) = E(M∞|Ft), where M∞ is the terminal value of the intrinsic martingale (Mn, n ∈ N). In particular M(t) converges in Lp(P) to M∞ for some p > 1.

Proof. We will use the same argument as in the proof of Proposition 1.5 of [4], but we have to deal here with the fact that sup_{u∈U} ξu may be larger than 1; therefore we will have to condition. We know that Mn converges in Lp(P) to M∞ as n tends to ∞, so

E(M∞|Ft) = lim_{n→∞} E(Mn|Ft).
By Theorem 1, as already seen in (7), we have sup_{u∈U} ξ_u^{p0} ∈ Lp(P); so, fixing m > 0, we now work on the event Bm := {sup_{u∈U} ξu ≤ m}. By applying the Markov property at time t we easily get
E(Mn|Ft) = Σ_{i=1}^{⟨X(t)⟩} X_i^{p0}(t) 1l_{{ℓ(Xi(t)) ≤ n}} + Σ_{|u|=n} ξ_u^{p0} 1l_{{au+ζu ≤ t}},   (8)
where ℓ(ξv) stands for the generation of the individual v (i.e., ℓ(ξv) = |v|), and au + ζu is the instant when the individual corresponding to the node u reproduces. We can rewrite the latter as

au + ζu = ξ_{m^{|u|}u}^{−α} e_0 + ξ_{m^{|u|−1}u}^{−α} e_1 + · · · + ξ_u^{−α} e_{|u|},
where e_0, e_1, . . . is a sequence of independent exponential variables with parameter 1, which is also independent of ξu. As α is non-negative, and as we are working on the event Bm, we have ξ_{m^i u}^{−α} ≥ m^{−α}, so that for each fixed node u ∈ U, au + ζu is bounded from below by the
sum of |u| + 1 independent exponential variables with parameter m^α which are independent of ξu. Thus

Σ_{|u|=n} ξ_u^{p0} 1l_{{au+ζu ≤ t}} → 0 a.s. as n → ∞,

and therefore by (8), on the event Bm, we get that for all m > 0: E(M∞|Ft) 1l_{Bm} = M(t) 1l_{Bm}. The result is then obtained by letting m tend to ∞.
5 A Randomly Tagged Leaf

We will here (as in [4]) use a tagged leaf to define a tagged individual. We call leaf of the tree U an infinite sequence of integers l = (u1, . . .). For each n, ln := (u1, . . . , un) is the ancestor of l at generation n. We enrich the probabilistic structure by adding the information about a so-called tagged leaf, chosen at random as follows. Let Hn be the space of bounded functionals Φ which depend on the mark M and on the leaf l up to the first n generations, i.e., such that Φ(M, l) = Φ(M′, l′) if ln = l′n and M(u) = M′(u) whenever |u| ≤ n. For such functionals, we use the slightly abusive notation Φ(M, l) = Φ(M, ln). As in [4], for a pair (M, λ) where M : U → [0, 1] × R+ × R+ is a random mark on the tree and λ is a random leaf of U, the joint distribution denoted by P∗ (and by P∗x if the size of the first mark is x instead of 1) can be defined unambiguously by

E∗(Φ(M, λ)) = E(Σ_{|u|=n} Φ(M, u) ξ_u^{p0}),  Φ ∈ Hn.
Moreover since the intrinsic martingale (Mn , n ∈ Z+ ) is uniformly integrable (cf. Theorem 1), the first marginal of P∗ is absolutely continuous with respect to the law of the random mark M under P, with density M∞ . Let λn be the node of the tagged leaf at the n-th generation. We denote by χn := ξλn the size of the individual corresponding to the node λn and by χ(t) the size of the tagged individual alive at time t, viz. χ(t) := χn
if a_{λn} ≤ t < a_{λn} + ζ_{λn},
because in the case under consideration sup_{n∈N} a_{λn} = ∞. We stress that, in general, the process χ(t) is not monotonic. However, as in [4], Lemma 1.4 there becomes:

Lemma 2. Let k : R+ → R+ be a measurable function such that k(0) = 0. Then we have for every n ∈ N

E∗(k(χn)) = E(Σ_{|u|=n} ξ_u^{p0} k(ξu)),
and for every t ≥ 0

E∗(k(χ(t))) = E(⟨x^{p0} k(x), X(t)⟩).

Proposition 1.6 of [4] becomes:

Proposition 1. Under P∗,

Sn := ln χn,  n ∈ Z+,
is a random walk on R with step distribution P(ln χn − ln χ_{n+1} ∈ dy) = ν̄(dy), where the probability measure ν̄ is defined by

∫_{]0,∞[} k(y) ν̄(dy) = ∫_{Mp(R∗+)} ⟨x^{p0} k(ln(x)), s⟩ ν(ds).
Equivalently, the Laplace transform of the step distribution is given by

E∗(exp(pS1)) = E∗(χ_1^p) = 1 − κ(p + p0),  p ≥ 0.
Moreover, conditionally on (χn, n ∈ Z+), the sequence of lifetimes (ζ_{λ0}, ζ_{λ1}, . . .) along the tagged leaf is a sequence of independent exponential variables with respective parameters χ_0^α, χ_1^α, . . .
We now see that we can use this proposition to obtain a description of χ(t) via a Lamperti transformation. Let

ηt := S ∘ Nt,  t ≥ 0,
with N a Poisson process with parameter 1 which is independent of the random walk S; for probabilities and expectations related to η we use the notation P and E. The process (χ(t), t ≥ 0) is Markovian and enjoys a scaling property. More precisely, under P∗x we get

χ(t) (d)= exp(η_{τ(t x^{−α})}),  t ≥ 0,   (9)
where η is the compound Poisson process defined above and τ the time-change defined implicitly by

t = ∫_0^{τ(t)} exp(α ηs) ds,  t ≥ 0.   (10)
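The Lamperti representation (9)–(10) can be simulated with a simple Euler scheme: run the compound Poisson process η in its own time and advance calendar time at speed exp(αη). The sketch below is illustrative only (step size and step law are assumptions, not quantities from the paper):

```python
import math
import random

def lamperti_path(T, alpha, step_law, dt=1e-3):
    """Euler sketch of chi(t) = exp(eta_{tau(t)}), where eta is a
    compound Poisson process (rate-1 jumps drawn from step_law) and
    tau solves t = int_0^tau exp(alpha * eta_s) ds, as in (10)."""
    eta, internal, t = 0.0, 0.0, 0.0
    next_jump = random.expovariate(1.0)
    chi = []
    while t < T:
        chi.append(math.exp(eta))
        t += math.exp(alpha * eta) * dt   # dt/dtau = exp(alpha * eta)
        internal += dt
        if internal >= next_jump:         # eta jumps at unit rate
            eta += step_law()
            next_jump += random.expovariate(1.0)
    return chi
```

With a non-positive step law (the fragmentation case), η is non-increasing and the simulated path stays below its starting value 1; with steps of both signs one recovers the non-monotone behaviour discussed in the text.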
6 Asymptotic Behaviors

6.1 The Convergence of the Size of a Tagged Individual

Let

κ′(p0) = − ∫_{Mp(R∗+)} ⟨x^{p0} ln(x), s⟩ ν(ds)
denote the derivative of κ at the Malthusian parameter p0. In this part we focus on the asymptotic behavior of the size of a tagged individual. In this direction, the quantity exp(α ηt) plays an important role, as it appears in the time change of the Lamperti transformation (see (10)), as shown by the next proposition:

Proposition 2. Suppose that α > 0, that the support of ν̄ is not a discrete subgroup rZ for any r > 0, and that 0 < κ′(p0) < ∞. Then for every y > 0, under P∗y, t^{1/α} χ(t) converges in law as t → ∞ to a random variable Y whose law is specified by

E(k(Y^α)) = (1/(α m1)) E(k(I) I^{−1}),

for every measurable function k : R+ → R+, with I := ∫_0^∞ exp(α ηs) ds and m1 := E(η1) = −κ′(p0).
∞ xν − ((x, ∞))dx x ∞ < ∞, J := 1 + 0 dy y ν − ((−∞, −z))dz 1
(where ν − is the image of ν by the map u → −u and ν is defined in T Proposition 1) and E log+ 0 1 exp(−ηs )ds < ∞ (with Tz := inf{t : −ηt z}) hold; then, for any y > 0, under P∗y , t1/α χ(t) converges in law as t → ∞ to a random variable Y˜ whose law is specified by 1 for any bounded, continuous function k, E k(Y˜ α ) = lim E Iλ−1 k(Iλ ) , λ→0 λ ∞ where Iλ = 0 exp(αηs − λs)ds. The proof is the same as the previous one, but uses Theorem 1 and Theorem 2 from the work of Caballero and Chaumont [11] instead of [7].
Self-Similar Branching Markov Chains
275
6.2 Convergence of the Mean Measure and Lp -convergence We encode the configuration of masses X(t) = {(Xi (t))1iX(t) } by the weighted empirical measure
X(t)
σt :=
Xip0 (t)δt1/α Xi (t)
i=1
which has total mass M (t). The associated mean measure σt∗ is defined by the formula
∞ ∞ k(x)σt∗ (dx) = E k(x)σt (dx) 0
0
which is required to hold for all compactly supported continuous functions k. Since M(t) is a martingale, σ∗t is a probability measure. We are interested in the convergence of this measure. This convergence was already established in the case of binary conservative fragmentation (see the results of Brennan and Durrett [9] and [10]). A very useful tool for this is the renewal theorem, for which they needed the fact that the process χ(t) is decreasing; here we no longer have such a monotonicity property. See also Theorems 2 and 5 of [6], Theorem 1.3 of [4] and Proposition 4 of [17] for results about empirical measures which have the property ν(si ≤ 1 ∀i ∈ N) = 1. Nonetheless, with Proposition 2 and Lemma 2, we easily get:

Corollary 2. With the assumptions of Proposition 2:
1. The measures σ∗t converge weakly, as t → ∞, to the distribution of Y, i.e., for any continuous bounded function k : R+ → R+, we have:

E(⟨x^{p0} k(t^{1/α} x), X(t)⟩) →_{t→∞} E(k(Y)).
2. For all p+ > p > p0:

t^{(p−p0)/α} E(⟨x^p, X(t)⟩) →_{t→∞} E(Y^{p−p0}).
We now formulate a more precise convergence result concerning the empirical measure: Theorem 3. Under the same assumptions as in Proposition 2 we get that for every bounded continuous function k:
Lp − lim_{t→∞} ∫_0^∞ k(x) σt(dx) = M∞ E(k(Y)) = (M∞/(α m1)) E(k(I) I^{−1}),

for some p > 1.

Remark 4. A slightly different version of Corollary 2 and Theorem 3 also exists under the assumptions of Remark 3.
See also Asmussen and Kaplan [1] and [2] for a closely related result.
Proof. We follow the same method as in Section 1.4 of [4], and in this direction we use Lemma 1.5 therefrom: for (λ(t))_{t≥0} = (λi(t), i ∈ N)_{t≥0} a sequence of non-negative random variables such that, for some fixed p > 1,

sup_{t≥0} E((Σ_{i=1}^∞ λi(t))^p) < ∞

and

lim_{t→∞} E(Σ_{i=1}^∞ λ_i^p(t)) = 0,

and for (Yi(t), i ∈ N) a sequence of random variables which are independent conditionally on λ(t), we assume that there exists a sequence (Ȳi, i ∈ N) of i.i.d. variables in Lp(P), which is independent of λ(t) for each fixed t, and such that |Yi(t)| ≤ Ȳi for all i ∈ N and t ≥ 0. Then we know from Lemma 1.5 in [4] that

lim_{t→∞} Σ_{i=1}^∞ λi(t) (Yi(t) − E(Yi(t) | λ(t))) = 0.   (11)
Now, let k be a continuous function bounded by 1 and let At := ⟨x^{p0} k(t^{1/α} x), X(t)⟩. The Markov property at time t for A_{t+s} and the self-similarity property of the process X allow us to rewrite A_{t+s} as

Σ_{i=1}^{⟨X(t)⟩} λi(t) Yi(t, s),
where λi(t) := X_i^{p0}(t) and Yi(t, s) := ⟨x^{p0} k((t+s)^{1/α} Xi(t) x), X^{i,·}(s)⟩, with X^{1,·}, X^{2,·}, . . . a sequence of i.i.d. copies of X which is independent of X(t). By Theorem 1 we get that

sup_{t≥0} E((Σ_{i=1}^{⟨X(t)⟩} λi(t))^p) < ∞.

By the last corollary we also obtain that

E(Σ_{i=1}^{⟨X(t)⟩} λ_i^p(t)) ∼ t^{−((p−1)/α) p0} E(χ^{(p−1)p0}(1)) → 0, as t → ∞.
Moreover the variables Yi(t, s) are uniformly bounded by

Ȳi := sup_{s≥0} ⟨x^{p0}, X^{i,·}(s)⟩,

which are i.i.d. variables, bounded in Lp(P) thanks to Doob's inequality (as ⟨x^{p0}, X^{i,·}(s)⟩ is a martingale bounded in Lp(P)). Thus we may apply (11), which reduces the study to that of the asymptotic behavior of

Σ_{i=1}^{⟨X(t)⟩} λi(t) E(Yi(t, s) | X(t)),
as t tends to ∞. On the event {Xi(t) = y}, we get

E(Yi(t, s) | X(t)) = E(⟨x^{p0} k((t+s)^{1/α} y x), X(s)⟩).

Then by Lemma 2:

E(⟨x^{p0} k((t+s)^{1/α} y x), X^{i,·}(s)⟩) = E∗(k((t+s)^{1/α} y χ(s))).

With Proposition 2, we obtain

lim_{s→∞} E∗(k((t+s)^{1/α} y χ(s))) = E(k(Y)).
Moreover, recall from Corollary 1 that Σ_{i=1}^{⟨X(t)⟩} λi(t) converges to M∞ in Lp(P). Therefore we finally get, as t goes to infinity:

Σ_{i=1}^{⟨X(t)⟩} λi(t) E(Yi(t, s) | X(t)) ∼ E(k(Y)) Σ_{i=1}^{⟨X(t)⟩} λi(t) ∼ E(k(Y)) M∞.
Appendix: Further Results about the Intrinsic Process

We will give more general properties of the intrinsic process {MQ, Q ⊂ U}, MQ = Σ_{u∈Q} ξ_u^{p0}. By abuse of notation, let Mn stand for the process M_{Ln}, with Ln = {u ∈ U : |u| = n} the labels of the n-th generation. We introduce new definitions: we say that a line Q covers L if L stems from Q and any individual stemming from L either stems from Q or has progeny in Q. If Q covers the ancestor it may simply be called covering. Let C0 be the class of covering lines with finite maximal generation. We denote the generation of Q by |Q| = sup_{u∈Q} |u|. The origin of the intrinsic martingale comes from the real-time martingale of Nerman [20]. Also, for r ∈ R∗+, let ϑr be the structural measure:

ϑr(B) := Er(#{u ∈ U : ξu ∈ B}) =
∞ i=1
ν(rsi ∈ B) for B ⊂ B,
N. Krell
where $\mathcal{B}$ is the Borel algebra on $\mathbb{R}^*_+$. Let the reproduction measure μ on the sigma-field $\mathcal{B} \otimes \mathcal{B}$ be such that for every $r > 0$:
$$\mu(r, dv \times du) := r^{\alpha} \exp(-r^{\alpha} u)\, du\, \vartheta_r(dv),$$
and for any $\lambda \in \mathbb{R}$,
$$\mu_\lambda(r, dv \times du) := \exp(-\lambda u)\, \mu(r, dv \times du).$$
The composition operation ∗ denotes the Markov transition on the size space $\mathbb{R}_+$ and convolution on the time space $\mathbb{R}_+$, so that for all $A \in \mathcal{B}$ and $B \in \mathcal{B}$:
$$\mu^{*2}(s, A \times B) = \mu * \mu(s, A \times B) = \int_{\mathbb{R}_+\times\mathbb{R}_+} \mu\big(r, A \times (B - u)\big)\, \mu(s, dr \times du),$$
with the convention that the ∗-power 0 is $\mathbb{1}_{\{A\times B\}}(s, 0)$, which gives full mass to (s, 0). Define the renewal measure as
$$\psi_\lambda := \sum_{n=0}^{\infty} \mu_\lambda^{*n}.$$
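As a routine check (a sketch using only the definitions above, together with the fact, recalled further below, that the mean number of children is $m = \vartheta_r(\mathbb{R}^*_+)$), the total mass of $\mu_\lambda$ used in the next display is:

```latex
\mu_\lambda(r, \mathbb{R}_+ \times \mathbb{R}_+)
  = \int_{\mathbb{R}^*_+} \int_0^\infty e^{-\lambda u}\, r^{\alpha} e^{-r^{\alpha} u}\, du\, \vartheta_r(dv)
  = \vartheta_r(\mathbb{R}^*_+) \cdot \frac{r^{\alpha}}{r^{\alpha} + \lambda}
  = \frac{m\, r^{\alpha}}{r^{\alpha} + \lambda},
```

valid whenever $\lambda > -r^{\alpha}$; the time integral diverges otherwise.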
Let
$$\bar\alpha := \inf\{\lambda : \psi_\lambda(r, \mathbb{R}_+ \times \mathbb{R}_+) < \infty \text{ for some } r \in \mathbb{R}_+\}.$$
Moreover, as
$$\mu_\lambda(r, \mathbb{R}_+ \times \mathbb{R}_+) = \begin{cases} m r^{\alpha}/(r^{\alpha} + \lambda) & \text{if } \lambda > -r^{\alpha},\\ \infty & \text{else,}\end{cases}$$
we get that $\psi_\lambda(r, \mathbb{R}_+ \times \mathbb{R}_+) < \infty$ if and only if $r < (\lambda/(m-1))^{1/\alpha}$; therefore $\bar\alpha = 0$. For $A \in \mathcal{B}$, let
$$\pi(A) := \lim_{n\to\infty} \mu^{*n}(1, A \times \mathbb{R}_+) \qquad (12)$$
which is well defined, as $\mu^{*n}(1, A \times \mathbb{R}_+)$ is a decreasing function in n and nonnegative. Let $h(s) := s^{p_0}$ for all $s \in \mathbb{R}_+$ and $\beta := 1$. These objects correspond to those defined in [15]. Recall that the Galton-Watson process $(Z_n,\, n \geq 0)$ is equal to $(\#\{u \in U : \xi_u > 0 \text{ and } |u| = n\},\, n \geq 0)$. We suppose that $m := E(Z_1) < \infty$, i.e., $\int_{\mathcal{M}_p(\mathbb{R}^*_+)} \langle 1, \mathbf{s}\rangle\, \nu(d\mathbf{s}) < \infty$. This assumption is slightly stronger than (6); therefore we get
Proposition 3. 1. If $L \preceq Q$ are lines, then $E(M_Q \,|\, \mathcal{H}_L) \leq M_L$. If Q verifies $|Q| < \infty$ and covers L, then $E(M_Q \,|\, \mathcal{H}_L) = M_L$.
2. For all $s > 0$, $\{M_L ;\ L \in \mathcal{C}_0\}$ is uniformly $P_s$-integrable.
3. There is a random variable $M \geq 0$ such that for π-almost all $s > 0$,
$$M_L = E_s(M \,|\, \mathcal{H}_L) \quad\text{and}\quad M_L \to M \text{ in } L^1(P_s), \text{ as } L \in \mathcal{C}_0 \text{ filters}.$$
If $\varsigma_n \preceq \varsigma_{n+1} \in \mathcal{C}_0$ and to any $x \in U$ there is a $\varsigma_n$ such that x has progeny in $\varsigma_n$, then $M_{\varsigma_n} \to M$ as $n \to \infty$, also $P_s$-a.s.

A consequence of the first and second points, applied to $L_n = \{u \in U : |u| = n\}$ and $L_m = \{u \in U : |u| = m\}$ with $m \geq n \geq 0$, is that $(M_n)$ is a martingale and that this martingale is uniformly $P_s$-integrable. The third point, applied to the lines $\tau_t$, gives convergence of $M(t)$ in $L^1(P_s)$ and a.s.

Proof. First, the conditions of a Malthusian population, as defined by Jagers in [15], are fulfilled; thus by Theorem 5.1 therein we get the first point. Let
$$\bar\xi := \int_{\mathbb{R}_+\times\mathbb{R}_+} h(s)\, r^{\alpha} \exp(-t r^{\alpha})\, dt\, \vartheta_1(ds) = \sum_{|u|=1} \xi_u^{p_0},$$
and let $E_\pi$ be the expectation with respect to $\int_{\mathbb{R}_+} P_s(dw)\, \pi(ds)$. Therefore,
$$E_\pi(\bar\xi \log^+ \bar\xi) = \int_{\mathbb{R}_+} E_x\bigg(\sum_{i=1}^{\infty} \xi_i^{p_0}\ \log^+ \sum_{j=1}^{\infty} \xi_j^{p_0}\bigg)\, \pi(dx),$$
and it follows readily from the Malthusian hypotheses and the fact that $\sum_{|u|=n} \xi_u^{p p_0}$ is a supermartingale that this quantity is finite. So the assumptions of Theorem 6.1 of [15] hold, which gives the second point, and by Theorem 6.3 of [15] we get the third point.

Acknowledgements: I wish to thank J. Bertoin for his help and suggestions. I also wish to thank the anonymous referees of an earlier draft for their detailed comments and suggestions.
References 1. Asmussen S. and Kaplan N.: Branching random walks. I. Stochastic Process. Appl. 4, no. 1, 1-13 (1976). 2. Asmussen S. and Kaplan N.: Branching random walks. II. Stochastic Process. Appl. 4, no. 1, 15-31 (1976).
3. Athreya K. B. and Ney P. E.: Branching processes. Springer-Verlag, Berlin Heidelberg (1972).
4. Bertoin J.: Random fragmentation and coagulation processes. Cambridge University Press (2006).
5. Bertoin J.: Different aspects of a random fragmentation model. Stochastic Process. Appl. 116, 345-369 (2006).
6. Bertoin J. and Gnedin A. V.: Asymptotic laws for nonconservative self-similar fragmentations. Electron. J. Probab. 9, No. 19, 575-593 (2004).
7. Bertoin J. and Yor M.: The entrance laws of self-similar Markov processes and exponential functionals of Lévy processes. Potential Analysis 17, 389-400 (2002).
8. Biggins J. D.: Uniform convergence of martingales in the branching random walk. Ann. Probab. 20, No. 1, 131-151 (1992).
9. Brennan M. D. and Durrett R.: Splitting intervals. Ann. Probab. 14, No. 3, 1024-1036 (1986).
10. Brennan M. D. and Durrett R.: Splitting intervals. II. Limit laws for lengths. Probab. Theory Related Fields 75, No. 1, 109-127 (1987).
11. Caballero M. E. and Chaumont L.: Weak convergence of positive self-similar Markov processes and overshoots of Lévy processes. Ann. Probab. 34, No. 3, 1012-1034 (2006).
12. Chauvin B.: Arbres et processus de Bellman-Harris. Ann. Inst. Henri Poincaré 22, No. 2, 209-232 (1986).
13. Chauvin B.: Product martingales and stopping lines for branching Brownian motion. Ann. Probab. 19, No. 3, 1195-1205 (1991).
14. Harris T. E.: The theory of branching processes. Springer (1963).
15. Jagers P.: General branching processes as Markov fields. Stochastic Process. Appl. 32, 183-212 (1989).
16. Kingman J. F. C.: Poisson processes. Oxford Studies in Probability, 3. Oxford Science Publications. The Clarendon Press, Oxford University Press (1993).
17. Krell N.: Multifractal spectra and precise rates of decay in homogeneous fragmentations. To appear in Stochastic Process. Appl. (2008).
18. Kyprianou A. E.: A note on branching Lévy processes. Stochastic Process. Appl. 82, No. 1, 1-14 (1999).
19. Kyprianou A. E.: Martingale convergence and the stopped branching random walk. Probab. Theory Related Fields 116, No. 3, 405-419 (2000).
20. Nerman O.: The growth and composition of supercritical branching populations on general type spaces. Technical report, Dept. Mathematics, Chalmers Univ. Technology and Göteborg Univ. (1984).
21. Neveu J.: Arbres et processus de Galton-Watson. Ann. Inst. H. Poincaré Probab. Statist. 22, No. 2, 199-207 (1986).
A Spine Approach to Branching Diffusions with Applications to $L^p$-convergence of Martingales

Robert Hardy and Simon C. Harris

Department of Mathematical Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK. E-mail: [email protected]

Summary. We present a modified formalization of the 'spine' change of measure approach for branching diffusions in the spirit of those found in Kyprianou [40] and Lyons et al. [44, 43, 41]. We use our formulation to interpret certain 'Gibbs-Boltzmann' weightings of particles and use this to give an intuitive proof of a general 'Many-to-One' result which enables expectations of sums over particles in the branching diffusion to be calculated purely in terms of an expectation of one 'spine' particle. We also exemplify spine proofs of the $L^p$-convergence ($p \geq 1$) of some key 'additive' martingales for three distinct models of branching diffusions, including new results for a multi-type branching Brownian motion and discussion of left-most particle speeds.
1 Introduction

Consider a branching Brownian motion (BBM) with constant branching rate r and offspring distribution A, which is a branching process where particles diffuse independently according to a standard Brownian motion and at any moment undergo fission at a rate r to be replaced by a random number of offspring, $1 + A$, where A is an independent random variable with distribution $P(A = i) = p_i$, $i \in \{0, 1, \ldots\}$, such that $m := P(A) = \sum_{i=0}^{\infty} i\, p_i < \infty$. Offspring move off from their parent's point of fission, and continue to evolve independently as above, and so on. Let the configuration of this BBM at time t be given by the $\mathbb{R}$-valued point process $X_t := \{X_u(t) : u \in N_t\}$, where $N_t$ is the set of individuals alive at time t. Let the probabilities for this process be $\{P^x : x \in \mathbb{R}\}$, where $P^x$ is the law starting from a single particle at position x, and let $(\mathcal{F}_t)_{t\geq 0}$ be the natural filtration. It is well known that for any $\lambda \in \mathbb{R}$,
$$Z_\lambda(t) := e^{-rmt} \sum_{u\in N_t} e^{\lambda X_u(t) - \frac{1}{2}\lambda^2 t} = \sum_{u\in N_t} e^{\lambda X_u(t) - E_\lambda t} \qquad (1)$$
C. Donati-Martin et al. (eds.), Séminaire de Probabilités XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6_11, © Springer-Verlag Berlin Heidelberg 2009
where $E_\lambda := -\lambda c_\lambda := \frac{1}{2}\lambda^2 + rm$, defines a positive martingale, so $Z_\lambda(\infty) := \lim_{t\to\infty} Z_\lambda(t)$ exists and is finite almost surely under each $P^x$. See Neveu [46], for example. One of the central elements of the spine approach is to interpret the behaviour of a branching process under a certain change of measure. Chauvin and Rouault [9] showed that changing measure for BBM with the $Z_\lambda$ martingale leads to the following 'spine' construction:

Theorem 1.1 If we define the measure $Q^x_\lambda$ via
$$\frac{dQ^x_\lambda}{dP^x}\bigg|_{\mathcal{F}_t} = \frac{Z_\lambda(t)}{Z_\lambda(0)} = e^{-\lambda x}\, Z_\lambda(t), \qquad (2)$$
then under $Q^x_\lambda$ the point process $X_t$ can be constructed as follows:
• starting from position x, the original ancestor diffuses according to a Brownian motion on $\mathbb{R}$ with drift λ;
• at an accelerated rate $(1+m)r$ the particle undergoes fission producing $1 + \tilde{A}$ particles, where the distribution of $\tilde{A}$ is independent of the past motion but is size-biased:
$$Q_\lambda(\tilde{A} = i) = \frac{(i+1)\, p_i}{m+1}, \qquad i \in \{0, 1, \ldots\};$$
• with equal probability, one of these offspring particles is selected;
• this chosen particle repeats stochastically the behaviour of the parent with the size-biased offspring distribution;
• each other particle initiates, from its birth position, an independent copy of a $P^{\cdot}$ branching Brownian motion with branching rate r and offspring distribution given by A (which is without the size-biasing).
The chosen line of descent in such pathwise constructions of the measure, here $Q_\lambda$, has come to be known as the spine, as it can be thought of as the backbone of the branching process $X_t$ from which all particles are born. The phenomenon of size-biasing along the spine is a common feature of such measure changes when random offspring distributions are present. Although Chauvin and Rouault's work on the measure change continued in a paper co-authored with Wakolbinger [10], where the new measure is interpreted as the result of building a conditioned tree using the concepts of Palm measures, it wasn't until the so-called 'conceptual proofs' of Lyons, Kurtz, Peres and Pemantle published around 1995 ([44, 43, 41]) that the spine approach really began to crystallize. These papers laid out a formal basis for spines using a series of new measures on two underlying spaces of sample trees with and without distinguished lines of descent (spines). Of particular interest is the paper by Lyons [43], which gave a spine-based proof of the $L^1$-convergence of the well-known martingale for the Galton-Watson process. Here we first saw the spine decomposition of the martingale as the key to using the intuition provided by Chauvin and Rouault's pathwise construction of the new
measure – Lyons used this together with a previously known measure-theoretic result on Radon-Nikodym derivatives that allows us to deduce the behaviour of the change-of-measure martingale under the original measure by investigating its behaviour under the second measure. Similar ideas have recently been used by Kyprianou [40] to investigate the $L^1$-convergence of the BBM martingale (1), by Biggins and Kyprianou [4] for multi-type branching processes in discrete time, by Hu and Shi [33] for the minimal position in a branching random walk, by Geiger [16, 17] for Galton-Watson processes, by Georgii and Baake [19] to study ancestral type behaviour in a continuous time branching Markov chain, as well as Olofsson [47] for general branching processes. Also see Athreya [2], Geiger [15, 18], Iksanov [34], Rouault and Liu [42] and Waymire and Williams [49], to name just a few other papers where spine and size-biasing techniques have already proved extremely useful in branching process situations. For applications of spines in branching in random media see, for example, the survey by Engländer [13]. In this article¹, we present a modified formalization of the spine approach that attempts to improve on the schemes originally laid out by Lyons et al. [44, 43, 41] and later for BBM by Kyprianou [40]. Although the set-up costs of our spine formalization are quite large, at least in terms of definitions and notation, the underlying ideas are all extremely simple and intuitive. One advantage of this approach is that it has facilitated the development of further spine techniques, for example, in Hardy & Harris [23, 22], Git et al. [20] and J.W. Harris & S.C. Harris [27], where a number of technical problems and difficult non-linear calculations are by-passed with spine calculations enabling their reduction to relatively straightforward classical one-particle situations; this article also serves as a foundation for these and other works.
The basic concept of our approach is quite straightforward: given the original branching process, we first create an extended probability measure by enriching the process through (carefully) choosing at random one of the particles to be the so-called spine. Now, on this enriched process, changes of measure can easily be applied that only affect the behaviour along the path of this single distinguished 'spine' particle; in our examples, we add a drift to the spine's motion, increase the rate of fission along the path of the spine and size-bias the spine's offspring distribution. However, projecting this new enriched and changed measure back onto the original process filtration (that is, without any knowledge of the distinguished spine) brings the fundamental 'additive' martingales into play as a Radon-Nikodym derivative. The four probability measures, various martingales, extra filtrations and clear process constructions afforded by our setup, together with some other useful properties and tricks, such as the spine decomposition, provide a very elegant, intuitive and powerful set of techniques for analysing the process.
¹ Based on the arXiv articles [24, 25].
The reader who is familiar with the work of Lyons et al. [44] or Kyprianou [40] will notice significant similarities as well as differences in our approach. In the first instance our modifications correct a perceived weakness in the Lyons et al. scheme, where one of the measures they defined had a time-dependent mass and could not be normalized to be a probability measure in a natural way, hence lacked a clear interpretation in terms of any direct process construction; an immediate consequence of this improvement is that here all measure changes are carried out by martingales and we regain a clear intuitive construction. Another difference is in our use of filtrations and sub-filtrations, where Lyons et al. instead used marginalizing. As we shall show, this brings substantial benefits since it allows us to relate the spine and the branching diffusion through the conditional expectation operation, and in this way gives us a proper methodology for building new martingales for the branching diffusion based on known single-particle martingales for the spine. The conditional expectation approach also leads directly to simple proofs of some key results for branching diffusions. The first of these concerns the relation that becomes clear between the spine and the 'Gibbs-Boltzmann' weightings for the branching particles. Such weightings are well known in the theory of branching processes, for example see Chauvin & Rouault [7], or Harris [30], which also studies the continuous-typed branching diffusion example introduced later. In our formulation these weightings can be interpreted as a conditional expectation of a spine event, and we can use them to immediately obtain a new interpretation of the additive operations previously seen only within the context of the Kesten-Stigum theorem and related problems.
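The 'Many-to-One' result described in the Summary reduces expectations of sums over particles to a single one-particle expectation; its simplest instance is $P^x(|N_t|) = e^{rmt}$. Below is a seeded Monte-Carlo sanity check of this identity for binary branching ($A \equiv 1$, so $m = 1$, and the particle count is a Yule process); the rate, horizon and sample size are assumed for illustration:

```python
import random

def particle_count(t, r, rng):
    """Number of particles alive at time t in a binary-branching process
    started from one particle: when n particles are alive, the next
    fission occurs after an Exp(r * n) waiting time."""
    n, s = 1, 0.0
    while True:
        s += rng.expovariate(r * n)   # time of the next fission event
        if s > t:
            return n
        n += 1

rng = random.Random(42)
r, t, runs = 1.0, 1.0, 4000
mean = sum(particle_count(t, r, rng) for _ in range(runs)) / runs
# Many-to-One with f == 1 predicts E|N_t| = exp(r * m * t), here e = 2.718...
```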
Our approach also leads to a substantially easier proof of a more general form of the Many-to-One theorem that is so often useful in branching processes applications; for example, in Champneys et al. [5] or Harris and Williams [28], special cases of this theorem were a key tool in their more classical approaches to branching diffusions. As another application of spine techniques, we will analyze the $L^p$-convergence properties (for $p \geq 1$) of some fundamental positive 'additive' martingales for three different models of branching diffusions. Consider first the branching Brownian motion (BBM) with random family sizes. We recall that Kyprianou [40] used spine techniques to give necessary and sufficient conditions for $L^1$-convergence of the $Z_\lambda$ martingales:

Theorem 1.2 Let $\tilde\lambda := -\sqrt{2rm}$, so that $c_\lambda := -E_\lambda/\lambda$ attains its local maximum at $\tilde\lambda$. For each $x \in \mathbb{R}$, the limit $Z_\lambda(\infty) := \lim_{t\to\infty} Z_\lambda(t)$ exists $P^x$-almost surely, where:
• if $\lambda \leq \tilde\lambda$ then $Z_\lambda(\infty) = 0$ $P^x$-almost surely;
• if $\lambda \in (\tilde\lambda, 0]$ and $P(A \log^+ A) = \infty$ then $Z_\lambda(\infty) = 0$ $P^x$-almost surely;
• if $\lambda \in (\tilde\lambda, 0]$ and $P(A \log^+ A) < \infty$ then $Z_\lambda(t) \to Z_\lambda(\infty)$ almost surely and in $L^1(P^x)$.

(Note, without loss of generality (by symmetry) we will suppose $\lambda \leq 0$ throughout this article.)
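The critical value in Theorem 1.2 can be sanity-checked numerically: at $\tilde\lambda = -\sqrt{2rm}$ the speed function $c_\lambda = -E_\lambda/\lambda$ is stationary and equals $-\tilde\lambda = \sqrt{2rm}$. A small sketch, with r and m chosen arbitrarily for illustration:

```python
import math

def E_lam(lam, r, m):
    """E_lambda = lambda^2 / 2 + r * m, as in equation (1)."""
    return 0.5 * lam * lam + r * m

def c(lam, r, m):
    """Speed function c_lambda = -E_lambda / lambda (lambda < 0 here)."""
    return -E_lam(lam, r, m) / lam

r, m = 1.0, 2.0                    # assumed illustrative parameters
lam_t = -math.sqrt(2 * r * m)      # critical value lambda-tilde (= -2 here)

h = 1e-5
slope = (c(lam_t + h, r, m) - c(lam_t - h, r, m)) / (2 * h)
# slope ~ 0: c is stationary at lambda-tilde, where c equals -lambda-tilde.
```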
In fact, in many cases where the martingale has a non-trivial limit, the convergence will also be much stronger than merely in $L^1(P^x)$, as indicated by the following $L^p$-convergence result:

Theorem 1.3 For each $x \in \mathbb{R}$ and for each $p \in (1, 2]$:
• $Z_\lambda(t) \to Z_\lambda(\infty)$ almost surely and in $L^p(P^x)$ if $p\lambda^2 < 2mr$ and $P(A^p) < \infty$;
• $Z_\lambda$ is unbounded in $L^p(P^x)$, that is, $\lim_{t\to\infty} P^x(Z_\lambda(t)^p) = \infty$, if $p\lambda^2 > 2mr$ or $P(A^p) = \infty$.

We shall give a spine-based proof of this $L^p$-convergence theorem, but also see Neveu [46] for sufficient conditions in the special case of binary branching at unit rate using more classical techniques. Also see Harris [29] for further discussion of martingale convergence in BBM and applications. Iksanov [34] also uses similar spine techniques in the study of the branching random walk. For our second model, we look at a finite-type BBM model where the type of each particle controls the rate of fission, the offspring distribution and the spatial diffusion. First, we will extend Kyprianou's [40] approach to give the analogous $L^1$-convergence result for this multi-type BBM model. We will also briefly discuss the rate of convergence of the martingales to zero and the speed of the spatially left-most particle within the process. Next, we give a new result on $L^p$-convergence criteria, extending our earlier spine-based proof developed for the single-type BBM case. The third model we consider has a continuous type-space where the type of each particle moves independently as an Ornstein-Uhlenbeck process on $\mathbb{R}$. This branching diffusion was first introduced in Harris and Williams [28] and has also been investigated in Harris [30], Git et al. [20] and Kyprianou and Engländer [12]. Proofs for each of these models run along similar lines and the techniques are quite general, and it is a powerful feature of the spine approach that this is possible. For example, they have since been extended to more general branching diffusions in Engländer et al.
[14] and to fragmentation processes in Harris et al. [31]. More classical techniques based on the expectation semigroup are simply not able to generalize easily, since they often require either some a priori bounds on the semigroup or involve difficult estimates – for example, in Harris and Williams [28] their important bound of a non-linear term is made possible only by the existence of a good $L^2$ theory for their operator, and this is not generally available. Of course, to prove martingale convergence in $L^p$ for some $p > 1$ we use Doob's theorem, and therefore need only show that the martingale is bounded in $L^p$. The spine decomposition is an excellent tool here for showing boundedness of the martingale, since it reduces difficult calculations over the whole collection of branching particles to just the single spine process. We find the same conditions are also necessary for $L^p$-boundedness of the martingale when $p > 1$ by just considering the contributions along the spine at times of fission and observing when these are unbounded. Otherwise, to determine
whether the martingale is merely $L^1$-convergent or has an almost-surely zero limit, we determine whether the martingale is almost-surely bounded or not under its own change of measure – this was Kyprianou's [40] approach and relies on a measure-theoretic result that has become standard in the spine methodology since the important work of Lyons et al. [44, 43, 41]. There are a number of reasons why we may be interested in knowing about the $L^p$-convergence of a martingale: in Neveu's original article [46] it was a means to proving $L^1$-convergence of martingales, which can then be used to represent (non-trivial) travelling-wave solutions to the FKPP reaction-diffusion equation as well as in understanding the growth and spread of the BBM, whilst Git et al. [20] and Asmussen and Hering [1] have used it to deduce the almost-sure rate of convergence of the martingale to its limit. Of equal importance are the techniques that we use here. The convergence of other additive martingales can be determined with similar techniques, for example, see an application to a BBM with inhomogeneous breeding potential in J.W. Harris and S.C. Harris [27]. Similar ideas have also been used in proving a lower bound for a number of problems in the large-deviations theory of branching diffusions – we have used the spine decomposition with Doob's submartingale inequality to get an upper bound for the growth of the martingale under the new measure, which then leads to a lower bound on the probability that one of the diffusing particles follows an unexpected path. See Hardy and Harris [23] for a spine-based proof of a path large deviation result for branching Brownian motion, and see Hardy and Harris [22] for a proof of a lower bound in the model that we consider in Section 11. The layout of this paper is as follows.
In Section 2, we will introduce the branching models, describing a binary branching multi-type BBM that we will frequently use as an example, before describing a more general branching Markov process model with random family sizes. In Section 3, we introduce the spine of the branching process as a distinguished infinite line of descent starting at the initial ancestor, we describe the underlying space for the branching Markov process with spine and we also introduce various fundamental filtrations. In Section 4, we define some fundamental probability spaces, including a probability measure for the branching process with a randomly chosen spine. In Section 5, various martingales are introduced and discussed. In particular, we see how to use filtrations and conditional expectation to build 'additive' martingales for the branching process out of the product of three simpler 'one-particle' martingales that only depend on the behaviour along the path of the spine; used as changes of measure, one martingale will increase the fission rate along the path of the spine, another will size-bias the offspring distribution along the spine, whilst the other one will change the motion of the spine. Section 6 discusses changes of measure with these martingales and gives very important and useful intuitive constructions for the branching process with spine under both the original measure $\tilde{P}$ and the changed measure $\tilde{Q}$. Another extremely useful tool in the spine approach is the spine decomposition that we prove in Section 7; this gives an expression for the expectation of the
'additive' martingale under the new measure $\tilde{Q}$ conditional on knowing the behaviour all along the path of the spine (including the spine's motion, the times of fission along the spine and the number of offspring at each of the spine's fissions). In Section 8, we use the spine formulation to derive an interpretation for certain Gibbs-Boltzmann weights of $\tilde{Q}$, discussing links with theorems of Kesten-Stigum and Watanabe, in addition to proving a 'Many-to-One' theorem. Finally, in Sections 9, 10, and 11 we will prove the martingale convergence results for BBM, finite-type BBM and the continuous-type BBM models, respectively.
2 Branching Markov Models

Before we present the underlying constructions for spines, it will be useful to give the reader a further idea of the branching-diffusion models that we have in mind for applications. We first briefly introduce a finite-type branching diffusion (which will often serve as a useful example), before presenting a more general model that shall be used as the basis of our spine constructions in the following sections.

2.1 A Finite-type Branching Diffusion

Let θ be a strictly positive constant that can be considered as a temperature parameter. For some fixed $n \in \mathbb{N}$, define the finite type-space $I := \{1, \ldots, n\}$ and suppose that we are given two sets of positive constants $a(1), \ldots, a(n)$ and $R(1), \ldots, R(n)$.

A Single Particle Motion. Consider the process $(\xi_t, \eta_t)_{t\geq 0}$ moving on $J := \mathbb{R} \times I$ as follows: (i) the type location, $\eta_t$, of the particle moves as an irreducible, time-reversible Markov chain on the finite type-space I with Q-matrix θQ and invariant measure $\pi = (\pi_1, \ldots, \pi_n)$; (ii) the spatial location, $\xi_t$, moves as a driftless Brownian motion on $\mathbb{R}$ with diffusion coefficient $a(y) > 0$ whenever $\eta_t$ is in state y, that is,
$$d\xi_t = a(\eta_t)^{\frac{1}{2}}\, dB_t, \qquad (3)$$
where $B_t$ is a Brownian motion.
The formal generator of this process $(\xi_t, \eta_t)$ is therefore:
$$HF(x, y) = \frac{1}{2}\, a(y)\, \frac{\partial^2 F}{\partial x^2} + \theta \sum_{j\in I} Q(y, j)\, F(x, j), \qquad (F : J \to \mathbb{R}). \qquad (4)$$
A Typed Branching Brownian Motion. Consider a branching diffusion where individual particles move independently according to the single particle motion as described above, and any particle currently of type y will undergo binary fission at rate R(y) to be replaced by two particles at the same spatial
and type positions as the parent. These offspring particles then move off independently, repeating stochastically the parent's behaviour, and so on. Let the configuration of the whole branching diffusion at time t be given by the J-valued point process $X_t = \{(X_u(t), Y_u(t)) : u \in N_t\}$, where $N_t$ is the set of individuals alive at time t. Suppose probabilities for this process are given by $\{P^{x,y} : (x, y) \in J\}$, defined on the natural filtration $(\mathcal{F}_t)_{t\geq 0}$, where $P^{x,y}$ is the law of the typed BBM process starting with one initial particle of type y at spatial position x. This finite-type branching diffusion (with general offspring distribution) is investigated in Section 10 of this article; also see Hardy [21]. For now, we briefly introduce two fundamental positive martingales used to understand this model, the first based on the whole branching diffusion and the second based only on the single-particle model:
$$Z_\lambda(t) := \sum_{u\in N_t} v_\lambda(Y_u(t))\, e^{\lambda X_u(t) - E_\lambda t}, \qquad (5)$$
$$\zeta_\lambda(t) := e^{\int_0^t R(\eta_s)\, ds}\, v_\lambda(\eta_t)\, e^{\lambda \xi_t - E_\lambda t}, \qquad (6)$$
where $v_\lambda$ and $E_\lambda$ satisfy
$$\Big(\tfrac{1}{2}\lambda^2 A + \theta Q + R\Big)\, v_\lambda = E_\lambda\, v_\lambda,$$
where $A := \mathrm{diag}(a(y) : y \in I)$ and $R := \mathrm{diag}(R(y) : y \in I)$. That is, $v_\lambda$ is the (Perron-Frobenius) eigenvector of the matrix $\frac{1}{2}\lambda^2 A + \theta Q + R$, with eigenvalue $E_\lambda$. These two martingales should be compared with the corresponding martingales (1) and $e^{\lambda B_t - \frac{1}{2}\lambda^2 t}$ for BBM and a single Brownian motion respectively.

2.2 A General Branching Markov Process

The spine constructions in our formulation can be applied to a much more general branching Markov model, and we shall base the presentation on the following model, where particles move independently in a general space J as a stochastic copy of some given Markov process $\Xi_t$, and at a location-dependent rate undergo fission to produce a location-dependent random number of offspring that each carry on this branching behaviour independently.

Definition 2.1 (A General Branching Markov Process) We suppose that three initial elements are given to us:
• a Markov process $\Xi_t$ in a measurable space (J, B);
• a measurable function $R : J \to [0, \infty)$;
• for each $x \in J$, a random variable A(x) whose probability distribution on the numbers $\{0, 1, \ldots\}$ is $P(A(x) = k) = p_k(x)$, with mean $m(x) := \sum_{k=0}^{\infty} k\, p_k(x) < \infty$.
From these ingredients we can build a branching process in J according to the following recipe:
• Each particle of the branching process will live, move and die in this space (J, B), and if an individual u is alive at time t we refer to its location in J as $X_u(t)$. Therefore the time-t configuration of the branching process is a J-valued point process $X_t := \{X_u(t) : u \in N_t\}$, where $N_t$ denotes the collection of all particles alive at time t.
• For each individual u, the stochastic behaviour of its motion in J is an independent copy of the given process $\Xi_t$.
• The function $R : J \to [0, \infty)$ determines the rate at which each particle dies: given that u is alive at time t, its probability of dying in the interval $[t, t + dt)$ is $R(X_u(t))\, dt + o(dt)$.
• If a particle u dies at location $x \in J$ it is replaced by $1 + A_u$ particles all positioned at x, where $A_u$ is an independent copy of the random variable A(x). All particles, once born, progress independently of each other.

We suppose that the probabilities of this branching process are $\{P^x : x \in J\}$, where under $P^x$ one initial ancestor starts out at x. We shall first give a formal construction of the underlying probability space, made up of the sample trees of the branching process $X_t$ in which the spines are the distinguished lines of descent. Once built, this space will be filtered in a natural way by the underlying family relationships of each sample tree, the diffusing branching particles and the diffusing spine, and then in Section 4 we shall explain how we can define new probability measures $\tilde{P}^x$ that extend each $P^x$ up to the finest filtration that contains all information about the spine and the branching particles. Much of the notation that we use for the underlying space of trees, the filtrations and the measures is closely related to that found in Kyprianou [40].
Although we do not strive to present our spine approach in the greatest possible generality, our model already covers many important situations whilst still being able to clearly demonstrate all the key spine ideas. In particular, in all our models, new offspring always inherit the position of their parent, although the same spine methods should also readily adapt to situations with random dispersal of offspring. For greater clarity, we often use the finite-type branching diffusion of Section 2.1 to introduce the ideas before following up with the general formulation. For example, in this finite-type model we would take the process Ξt to be the single-particle process (ξt , ηt ) which lives in the space J := R × I and has generator H given by (4). The birth rate in this model at location (x, y) ∈ J will be independent of x and given by the function R(y) for all y ∈ I and, since only binary branching occurs in this case, we also have P (A(x, y) = 1) = 1 for all (x, y) ∈ J.
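To make the finite-type objects of Section 2.1 concrete, the Perron-Frobenius pair $(v_\lambda, E_\lambda)$ can be computed by power iteration on a shifted matrix, since $\frac{1}{2}\lambda^2 A + \theta Q + R$ plus a large multiple of the identity is a nonnegative irreducible matrix. The sketch below is our own illustration; all the two-type rates and parameters are assumed, not taken from the text:

```python
def leading_eigenpair(M, shift=10.0, iters=500):
    """Power iteration for the eigenvalue of M with maximal real part.
    M + shift*I is entrywise nonnegative for matrices of the form
    (1/2)lambda^2 A + theta Q + R, so Perron-Frobenius applies and the
    iteration converges to a strictly positive eigenvector."""
    n = len(M)
    v = [1.0] * n
    val = shift
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) + shift * v[i]
             for i in range(n)]
        val = max(w)              # sup-norm of w (entries stay positive)
        v = [x / val for x in w]
    return val - shift, v         # eigenvalue of M itself, and eigenvector

# Illustrative two-type data (all numbers assumed):
lam, theta = 0.5, 1.0
a = [1.0, 2.0]                    # diffusion coefficients a(y)
Rr = [1.0, 0.5]                   # branching rates R(y)
Q = [[-1.0, 1.0], [1.0, -1.0]]    # irreducible Q-matrix on I = {1, 2}
M = [[0.5 * lam ** 2 * a[i] * (i == j) + theta * Q[i][j] + Rr[i] * (i == j)
      for j in range(2)] for i in range(2)]

E_lam, v_lam = leading_eigenpair(M)   # eigenvalue E_lambda, eigenvector v_lambda
```

Power iteration is only one convenient choice here; any eigensolver returning the eigenvalue of maximal real part with its positive eigenvector would do.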
3 The Underlying Space for Spines 3.1 Marked Galton-Watson Trees with Spines The set of Ulam-Harris labels is to be equated with the set Ω of finite sequences of strictly-positive integers: Ω := ∅ ∪ (N)n , n∈N
where we take N = 1, 2, . . . . For two words u, v ∈ Ω, uv denotes the concatenated word (u∅ = ∅u = u), and therefore Ω contains elements like ‘213’ (or ‘∅213’), which represents ‘the 3rd child of the 1st child of the 2nd child of the initial ancestor ∅’. For two labels v, u ∈ Ω the notation v < u means that v is an ancestor of u, and u denotes the length of u. The set of all ancestors of u is equally given by v : v < u = v : ∃w ∈ Ω such that vw = u . Collections of labels, ie. subsets of Ω, will therefore be groups of individuals. In particular, a subset τ ⊂ Ω will be called a Galton-Watson tree if: 1. ∅ ∈ τ , 2. if u, v ∈ Ω, then uv ∈ τ implies u ∈ τ , 3. for all u ∈ τ , there exists Au ∈ 0, 1, 2, . . . such that uj ∈ τ if and only if 1 ≤ j ≤ 1 + Au , (where j ∈ N). That is just to say that a Galton-Watson tree: 1. has a single initial ancestor ∅, 2. contains all ancestors of any of its individuals v, 3. has the 1 + Au children of an individual u labelled in a consecutive way, and is therefore just what we imagine by the picture of a family tree descending from a single ancestor. Note that the ‘1 ≤ j ≤ 1 + Au ’ condition in 3 means that each individual has at least one child, so that in our model we are insisting that Galton-Watson trees never die out. The set of all Galton-Watson trees will be called T. Typically we use the name τ for a particular tree, and whenever possible we will use the letters u or v or w to refer to the labels in τ , which we may also refer to as nodes of τ or individuals in τ or just as particles. Each individual should have a location in J at each moment of its lifetime. Since a Galton-Watson tree τ ∈ T in itself can express only the family structure of the individuals in our branching random walk, in order to give them these extra features we suppose that each individual u ∈ τ has a mark (Xu , σu ) associated with it which we read as: •
• σu ∈ R+ is the lifetime of u, which determines the fission time of particle u as Su := Σ_{v≤u} σv (with S∅ := σ∅). The times Su may also be referred to as the death times;
A Spine Approach to Branching Diffusions
• Xu : [Su − σu, Su) → J gives the location of u at time t ∈ [Su − σu, Su).
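The Ulam-Harris conventions above are easy to make concrete. The following Python sketch (an illustration only, not part of the original text; the offspring law is a placeholder) represents labels as tuples of positive integers and grows a marked tree down to a fixed generation, recording the mark (Au, σu) of each node.

```python
import random

def grow_tree(max_gen, offspring_law, rate=1.0, rng=random):
    """Grow a Galton-Watson tree in Ulam-Harris labelling down to
    generation max_gen.  Each node u gets a mark (A_u, sigma_u):
    A_u >= 0 extra offspring (so 1 + A_u children) and an
    exponential lifetime sigma_u with parameter `rate`."""
    tree = {}                     # label (tuple) -> (A_u, sigma_u)
    frontier = [()]               # () plays the role of the empty word, i.e. the ancestor
    for _ in range(max_gen):
        next_frontier = []
        for u in frontier:
            A_u = offspring_law(rng)            # number of *extra* children
            sigma_u = rng.expovariate(rate)     # lifetime of u
            tree[u] = (A_u, sigma_u)
            # children are labelled u1, u2, ..., u(1 + A_u)
            next_frontier += [u + (j,) for j in range(1, 2 + A_u)]
        frontier = next_frontier
    return tree, frontier

def is_ancestor(v, u):
    """v < u in the Ulam-Harris partial order: v is a strict prefix of u."""
    return len(v) < len(u) and u[:len(v)] == v

random.seed(0)
tree, alive = grow_tree(3, offspring_law=lambda rng: rng.choice([0, 1, 2]))
assert () in tree                               # condition 1: a single initial ancestor
assert all(is_ancestor((), u) for u in alive)   # every individual descends from the root
```

Because each individual has 1 + Au ≥ 1 children, `frontier` never empties: the sketch reproduces the convention that these trees never die out.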
To avoid ambiguity, it is always necessary to decide whether a particle is in existence or not at its death time. Remark 3.1 Our convention throughout will be that a particle u dies ‘just before’ its death time Su (which explains why we have defined Xu : [Su − σu , Su ) → · for example). Thus at the time Su the particle u has disappeared, replaced by its 1 + Au children which are all alive and ready to go. We denote a single marked tree by (τ, X, σ) or (τ, M ) for shorthand, and the set of all marked Galton-Watson trees by T :
• T := {(τ, X, σ) : τ ∈ T and, for each u ∈ τ, σu ∈ R+, Xu : [Su − σu, Su) → J}.
For each (τ, X, σ) ∈ T, the set of particles that are alive at time t is defined as Nt := {u ∈ τ : Su − σu ≤ t < Su}. Where we want to highlight the fact that these values depend on the underlying marked tree we write e.g. Nt((τ, X, σ)) or Su((τ, M)). Any particle u ∈ τ that comes into existence creates a subtree made up from the collection of particles (and all their marks) that have u as an ancestor – and u is the original ancestor of this subtree.
• (τ, X, σ)uj, or (τ, M)uj for shorthand, is defined as the subtree growing from individual u's jth child uj, where 1 ≤ j ≤ 1 + Au.
This subtree is a marked tree itself, but when considered as a part of the original tree we have to remember that it comes into existence at the spacetime location (Xu (Su − σu ), Su − σu ) – which is just the space-time location of the death of particle u (and therefore the space-time location of the birth of its child uj). Before moving on there is a further useful extension of the notation: for any particle u we extend the definition of Xu from the time interval [Su − σu , Su ) to allow all earlier times t ∈ [0, Su ): Definition 3.2 Each particle u is alive in the time interval [Su − σu , Su ), but we extend the concept of its path in J to all earlier times t < Su :
Xu(t) := Xu(t) if Su − σu ≤ t < Su,    Xu(t) := Xv(t) if v < u and Sv − σv ≤ t < Sv.
Thus particle u inherits the path of its unique line of ancestors, and this simple extension will allow us later to write expressions like exp{∫₀ᵗ f(s) dXu(s)} whenever u ∈ Nt, without worrying about the birth time of u.
For any given marked tree (τ, M) ∈ T we can identify distinguished lines of descent from the initial ancestor: {∅, u1, u2, u3, . . .} ⊂ τ, in which u3 is a child of u2, which itself is a child of u1, which is a child of the original ancestor ∅. We'll call such a subset of τ a spine, and will refer to it as ξ:
• a spine ξ is a subset of nodes {∅, u1, u2, u3, . . .} in the tree τ that make up a unique line of descent. We use ξt to refer to the unique node in ξ that is alive at time t.
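The extension in Definition 3.2 is a simple recursive lookup: to evaluate the path of u at a time t before u's birth, walk back through u's ancestors until the one alive at time t is found. A minimal sketch (the data layout — dictionaries of birth times, death times and per-node path functions — is a hypothetical choice for illustration):

```python
def extended_path(u, t, birth, death, path):
    """Evaluate the extended path X_u(t) of Definition 3.2.

    birth[v], death[v] hold S_v - sigma_v and S_v; path[v] is node v's
    own path function on [birth[v], death[v]).  For t < birth[u] we
    defer to the unique ancestor of u alive at time t."""
    v = u
    while not (birth[v] <= t < death[v]):
        if not v:                       # reached the root without a match
            raise ValueError("t is not in [0, S_u)")
        v = v[:-1]                      # pass to the parent label
    return path[v](t)

# toy example: ancestor () alive on [0, 1), its child (1,) alive on [1, 2)
birth = {(): 0.0, (1,): 1.0}
death = {(): 1.0, (1,): 2.0}
path  = {(): (lambda t: t), (1,): (lambda t: 2.0 * t)}
assert extended_path((1,), 0.5, birth, death, path) == 0.5   # inherited from ()
assert extended_path((1,), 1.5, birth, death, path) == 3.0   # (1,)'s own path
```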
R. Hardy and S.C. Harris
In a more formal definition, which can for example be found in the paper by Rouault and Liu [42], a spine is thought of as a point on ∂τ, the boundary of the tree – in fact the boundary is defined as the set of all infinite lines of descent. This explains the notation ξ ∈ ∂τ in the following definition: we augment the space T of marked trees to become
• T̃ := {(τ, M, ξ) : (τ, M) ∈ T and ξ ∈ ∂τ},
the set of marked trees with distinguished spines.
It is natural to speak of the position of the spine at time t, which we think of as the position of the unique node that is in the spine and alive at time t:
• we define the time-t position of the spine as ξt := Xu(t), where u ∈ ξ ∩ Nt.
By using the notation ξt to refer to both the node in the tree and that node's spatial position we are introducing potential ambiguity. However, in practice the context will usually make clear which we intend, although if this is not the case we shall give the node a longer name:
• nodet((τ, M, ξ)) := u if u ∈ ξ is the node in the spine alive at time t, which may also be written as nodet(ξ).
Finally, it will later be important to know how many fission times there have been in the spine or, what is the same, to know which generation of the family tree the node ξt is in (where the original ancestor ∅ is considered to be the 0th generation).
Definition 3.3 We define the counting function nt = |nodet(ξ)|, which tells us which generation the spine node is in, or equivalently how many fission times there have been on the spine. For example, if nodet(ξ) = u2 (so that the spine contains ∅, u1 and u2 up to time t), then both ∅ and u1 have died and so nt = 2.

3.2 Filtrations

The reader who is already familiar with the Lyons et al. [41, 43, 44] papers will recall that they used two separate underlying spaces of marked trees, with and without the spines, then marginalized out the spine when wanting to deal only with the branching particles as a whole.
Instead, we are going to use the single underlying space T̃, but define four filtrations of it that will encapsulate different knowledge.

Filtration (Ft)t≥0. We define a filtration of T̃ made up of the σ-algebras:
Ft := σ( {(u, Xu, σu) : Su ≤ t}; {(u, Xu(s) : s ∈ [Su − σu, t]) : t ∈ [Su − σu, Su)} ).
Then, Ft knows everything that has happened to all the branching particles up to the time t, but does not know which one is the spine. Each of these σ-algebras will be a subset of the limit defined as
F∞ := σ( ⋃_{t≥0} Ft ).
Filtration (F̃t)t≥0. In order to know about the spine, we make this filtration finer, defining F̃t by adding into Ft the knowledge of which node is the spine at time t:
F̃t := σ( Ft, nodet(ξ) ),    F̃∞ := σ( ⋃_{t≥0} F̃t ).
Consequently, F̃t knows everything about the branching process and everything about the spine up to time t, including which nodes make up the spine, when they were born, when they died (i.e. the fission times Su), and their family sizes.

Filtration (Gt)t≥0. We define a filtration of T̃, (Gt)t≥0, generated only by the spatial motion of the spine:
Gt := σ( ξs : 0 ≤ s ≤ t ),    G∞ := σ( ⋃_{t≥0} Gt ).
Then, Gt knows only about the spine's motion in J up to time t, but does not actually know which line of descent in the family tree makes up the spine or anything about births along the spine.

Filtration (G̃t)t≥0. We augment Gt by adding in information on the nodes that make up the spine (as we did from Ft to F̃t), as well as the knowledge of when the fission times occurred on the spine and how big the families were that were produced:
G̃t := σ( Gt, (nodes(ξ) : s ≤ t), (Au : u < nodet(ξ)) ),    G̃∞ := σ( ⋃_{t≥0} G̃t ).
Then, G̃t knows about everything along the spine up until time t. We note the obvious relationships between these filtrations of T̃: Ft ⊂ F̃t and Gt ⊂ G̃t ⊂ F̃t. Trivially, we also note that Gt ⊄ Ft, since the filtration Ft does not know which line of descent makes up the spine.
4 Probability Measures

Having now carefully defined the underlying space for our probabilities, we remind ourselves of the probability measures:
Definition 4.1 For each x ∈ J, let P^x be the measure on (T̃, F∞) such that the filtered probability space (T̃, F∞, (Ft)t≥0, P^x) is the canonical model for Xt, the branching Markov process described in Definition 2.1.
For details of how the measures P^x are formally constructed on the underlying space of trees, we refer the reader to the work of Neveu [45] and Chauvin [8, 6]. Note, we could equally think of P^x as a measure on (T, F∞), but it is convenient to use the enlarged sample space T̃ for all our measure spaces, varying only the filtrations.
Our spine approach relies first on building a measure P̃^x under which the spine is a single genealogical line of descent chosen from the underlying tree. If we are given a sample tree (τ, M) for the branching process, it is easy to verify that, if at each fission we make a uniform choice amongst the offspring to decide which line of descent continues the spine ξ, then for u ∈ τ we have
Prob(u ∈ ξ) = ∏_{v<u} 1/(1 + Av).    (7)
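Formula (7) can be checked empirically: choose the spine by an independent uniform choice among the 1 + Av children at each fission, and compare the frequency with which a fixed label u is hit against the product ∏_{v<u} 1/(1 + Av). The sketch below (illustrative only; the small deterministic tree of marks is an arbitrary assumption) does exactly this.

```python
import random

def spine_hits_u(u, A, rng):
    """Follow a uniform spine choice down the tree and report whether
    it passes through the label u.  A[v] is the number of *extra*
    children of node v, so the spine picks uniformly among 1 + A[v]."""
    v = ()
    while v != u:
        child = v + (rng.randint(1, 1 + A[v]),)   # uniform choice of successor
        if child != u[:len(child)]:
            return False                           # spine has left u's ancestry
        v = child
    return True

# deterministic marks on the ancestors of u = (2, 1): A_() = 2, A_(2,) = 1
A = {(): 2, (2,): 1}
u = (2, 1)
product, v = 1.0, ()
for j in u:                        # iterate over the strict ancestors of u
    product *= 1.0 / (1 + A[v])    # the factor 1/(1 + A_v) of formula (7)
    v = v + (j,)
# product = (1/3) * (1/2) = 1/6

rng = random.Random(0)
n = 20000
freq = sum(spine_hits_u(u, A, rng) for _ in range(n)) / n
assert abs(freq - product) < 0.02   # Monte Carlo agreement with (7)
```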
Every f ∈ mF̃t can be written as
f = Σ_{u∈Nt} fu 1(ξt=u),    (8)
where fu ∈ mFt. As a simple example of this, in the case of the finite-typed branching diffusion of Section 2.1, such a representation would be:
e^{∫₀ᵗ R(ηs) ds} vλ(ηt) e^{λξt − Eλ t} = Σ_{u∈Nt} e^{∫₀ᵗ R(Yu(s)) ds} vλ(Yu(t)) e^{λXu(t) − Eλ t} 1(ξt=u).    (9)
Definition 4.3 Given the measure P^x on (T̃, F∞) we extend it to the probability measure P̃^x on (T̃, F̃∞) by defining
∫_{T̃} f dP̃^x := ∫_{T̃} Σ_{u∈Nt} ∏_{v<u} 1/(1 + Av) fu dP^x,    (10)
for each f ∈ mF̃t with representation like (8).
The previous approach to spines, exemplified in Lyons [43], used the idea of fibres to get a measure analogous to our P̃ that could measure the spine. However, a perceived weakness in this approach was that the corresponding measure had time-dependent total mass and could not be normalized to become a probability measure with an intuitive construction, unlike our P̃. Our idea of using the down-weighting term of (7) in the definition of P̃ is crucial in ensuring that we get a natural probability measure (look ahead to Lemma 4.9), and leads to the very useful situation in which all measure changes in our formulation are carried out by martingales.
Theorem 4.4 This measure P̃^x is an extension of P^x in that P^x = P̃^x|F∞.
Proof: If f ∈ mFt then the representation (8) is trivial and therefore by definition
∫_{T̃} f dP̃ = ∫_{T̃} f × Σ_{u∈Nt} ∏_{v<u} 1/(1 + Av) dP.
However, it can be seen that Σ_{u∈Nt} ∏_{v<u} 1/(1 + Av) = 1, since by (7) this is the sum over u ∈ Nt of the probabilities that the spine passes through u, and exactly one of the particles alive at time t is in the spine. Therefore ∫_{T̃} f dP̃ = ∫_{T̃} f dP.
Definition 4.5 The filtered probability space (T̃, F̃∞, (F̃t)t≥0, P̃) with (Xt, ξt) will be referred to as the canonical model with spines.
In the single-particle model of section 2.1 we assumed the existence of a separate measure P and a process (ξt, ηt) that behaved stochastically like a 'typical' particle in the typed branching diffusion Xt. In our formalization the spine is exactly the single-particle model:
Definition 4.6 We define the measure P on (T̃, G∞) as the restriction of P̃: P|Gt := P̃|Gt. Under the measure P the spine process ξt has exactly the same law as Ξt.
Definition 4.7 The filtered probability space (T̃, G∞, (Gt)t≥0, P) together with the spine process ξt will be referred to as the single-particle model.

4.1 An Intuitive Construction of P̃

As the name suggests, we should be able to think of the spine as the backbone of the branching process. This is made precise by the following decomposition:
Theorem 4.8 The measure P̃ on F̃t can be decomposed as:
dP̃(τ, M, ξ) = dP(ξ) dL^{(R(ξ))}(n) ∏_{v<ξt} 1/(1 + Av) ( ∏_{v<ξt} pAv(ξ_{Sv}) ∏_{j=1}^{Av} dP((τ, M)vj) ),    (11)
where L^{(R(ξ))} is the law of the Poisson (Cox) process with rate R(ξt) at time t, and we recall that n = (nt : t ≥ 0) is the counting process of fission times along the spine.
We can summarise a clear intuitive picture of this decomposition in the following lemma: Lemma 4.9 The decomposition of measure P˜ at (11) enables the following construction: • the spine’s motion is determined by the single-particle measure P; • the spine undergoes fission at time t at rate R(ξt ); • at the fission time of node v on the spine, the single spine particle is replaced by 1 + Av children, with Av being chosen independently and distributed according to the location-dependent random variable A(ξSv ) with probabilities (pk (ξSv ) : k = 0, 1, . . .); • the spine is chosen uniformly from the 1+Av children at the fission point v; • each of the remaining Av children gives rise to the independent subtrees (τ, M )vj , for 1 ≤ j ≤ Av , which are not part of the spine and which are each determined by an independent copy of the original measure P shifted to their point and time of creation.
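Lemma 4.9 is a direct recipe for sampling a spine path under P̃: run the single-particle motion, lay down fission times at rate R, draw family sizes from (pk), and choose the continuation uniformly. A minimal sketch follows (a Brownian spine motion and a constant rate r are illustrative assumptions, not part of the general model; subtrees off the spine are not grown).

```python
import random, math

def sample_spine_under_P_tilde(t_max, r, p, rng, dt=0.01):
    """Sample (spine position, fission count) up to t_max following the
    construction of Lemma 4.9, with a Brownian spine and constant
    branching rate r.  p is the offspring law, p[k] = P(A = k).
    The uniform choice among the 1 + A_v children needs no bookkeeping
    for the spine itself, and the other A_v children would simply
    root independent copies of the original process."""
    xi, n_t, t = 0.0, 0, 0.0
    next_fission = rng.expovariate(r)            # fission times at rate R = r
    while t < t_max:
        xi += math.sqrt(dt) * rng.gauss(0.0, 1.0)   # Euler step of the P-motion
        t += dt
        if t >= next_fission:
            A_v = rng.choices(range(len(p)), weights=p)[0]  # family size A_v ~ (p_k)
            n_t += 1                                         # one more fission on the spine
            next_fission = t + rng.expovariate(r)
    return xi, n_t

rng = random.Random(0)
samples = [sample_spine_under_P_tilde(1.0, 2.0, [0.5, 0.5], rng) for _ in range(2000)]
mean_fissions = sum(n for _, n in samples) / len(samples)
# under P-tilde the counter n_t is (approximately, after discretisation)
# Poisson(r t), so its mean here should be close to 2.0
assert abs(mean_fissions - 2.0) < 0.2
```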
5 Martingales

Starting with the single Markov process Ξt that lives in (J, B) we have built (Xt, ξt), a branching Markov process with spines, in which the spine ξt behaves stochastically like the given Ξt. In this section we are going to show how any given martingale for the spine leads to a corresponding additive martingale for the whole branching model. We have already seen an example of this for the finite-type model of section 2.1, when we introduced the two martingales:
Zλ(t) := Σ_{u∈Nt} vλ(Yu(t)) e^{λXu(t) − Eλ t},    ζλ(t) := e^{∫₀ᵗ R(ηs) ds} vλ(ηt) e^{λξt − Eλ t}.
Just from their very form it has always been clear that they are closely related. What we shall later be demonstrating in full generality in this section is that the key to their relationship comes through generalising the following F̃t-measurable martingale for the multi-type BBM model:
Definition 5.1 We define an F̃t-measurable martingale:
ζ̃λ(t) := ∏_{u<ξt} (1 + Au) × vλ(ηt) e^{λξt − Eλ t}.    (12)
An important result that we show in this article (Lemma 5.7) is that Zλ(t) and ζλ(t) are simply conditional expectations of this new martingale ζ̃λ. We emphasize that this relationship is only possible because of the construction of P̃ as a probability measure and the use of filtrations to capture the different knowledge generated by the spine and the branching particles. This idea of projection is also used in random fragmentation theory, where it corresponds to the notion of tagged fragment; see Bertoin [3], for example. Furthermore, in the general form that we present below it provides a consistent methodology for using well-known martingales for a single process ξt to get new additive martingales for the related branching process. In Hardy and Harris [23, 22] we use these powerful ideas to give substantially easier proofs of large-deviations problems in branching diffusions than have previously been possible.
Suppose that ζ(t) is a strictly positive (T̃, (Gt)t≥0, P̃)-martingale, which is to say that it is a Gt-measurable function that is a martingale with respect to the measure P̃. For example, in the case of our finite-type branching diffusion this could be the martingale ζλ(t), which is Gt-measurable since it refers only to the spine process (ξt, ηt).
Definition 5.2 We shall call ζ(t) a single-particle martingale, since it is Gt-measurable and thus depends only on the spine ξ.
Any such single-particle martingale can be used to define an additive martingale for the whole branching process via the representation (8):
Definition 5.3 Suppose that we can represent the martingale ζ(t) as
ζ(t) = Σ_{u∈Nt} ζu(t) 1(ξt=u),    (13)
for ζu(t) ∈ mFt, as at (8). We can then define an Ft-measurable process Z(t) as
Z(t) := Σ_{u∈Nt} e^{−∫₀ᵗ m(Xu(s))R(Xu(s)) ds} ζu(t),
and refer to Z(t) as the branching-particle martingale.
The martingale property of Z(t) will be established in Lemma 5.7 after first building another martingale, ζ̃(t), from the single-particle martingale ζ(t). First, for clarity, we take a moment to discuss this definition of the additive martingale and the terms like ζu(t).
If we return to our familiar martingales (5) and (6), it is clear that
ζλ(t) = e^{∫₀ᵗ R(ηs) ds} vλ(ηt) e^{λξt − Eλ t} = Σ_{u∈Nt} e^{∫₀ᵗ R(Yu(s)) ds} vλ(Yu(t)) e^{λXu(t) − Eλ t} 1(ξt=u).    (14)
The 'ζu' terms of (13) could here be replaced with a more descriptive notation ζλ[(Xu, Yu)](t), where
ζu(t) = ζλ[(Xu, Yu)](t) := e^{∫₀ᵗ R(Yu(s)) ds} vλ(Yu(t)) e^{λXu(t) − Eλ t}
can be seen to be essentially a functional of the space-type path (Xu(t), Yu(t)) of particle u. In this way the original single-particle martingale ζλ would be understood as a functional of the space-type path (ξt, ηt) of the spine itself, and we could write
ζλ(t) = ζλ[(ξ, η)](t) = Σ_{u∈Nt} ζλ[(Xu, Yu)](t) 1(ξt=u).
This is the idea behind the representation (13), and in those typical cases where the single-particle martingale is essentially a functional of the paths of the spine ξt, as is the case for our ζλ(t), we should just think of ζu as being that same functional but evaluated over the path Xu(t) of particle u rather than the spine ξt. The representation (13) can also be used as a more general way of treating other martingales that perhaps are not such a simple functional of the spine path. Finally, from (14) it is clear that the additive martingale being defined by definition 5.3 is our familiar Zλ(t):
Zλ(t) = Σ_{u∈Nt} e^{−∫₀ᵗ R(Yu(s)) ds} ζλ[(Xu, Yu)](t) = Σ_{u∈Nt} vλ(Yu(t)) e^{λXu(t) − Eλ t}.
Although definition 5.3 will work in general, in the main the spine approach is interested in martingales that can act as Radon-Nikodym derivatives between probability measures, and therefore we suppose from now on that ζ(t) is strictly positive, and therefore that the additive martingale Z(t) is strictly positive. The work of Lyons et al. [43, 41, 44], that of Chauvin and Rouault [9] and more recently of Kyprianou [40] suggests that when a change of measure is carried out with a branching-diffusion additive martingale like Z(t) it is typical to expect three changes: the spine will gain a drift, its fission times will be increased and the distribution of its family sizes will be size-biased. In section 6.1 we shall confirm this, but we first take a separate look at the martingales that could perform these changes, and which we shall combine to obtain a martingale ζ̃(t) that will ultimately be used to change the measure P̃.
Theorem 5.4 The expression
∏_{v<ξt} (1 + m(ξ_{Sv})) e^{−∫₀ᵗ m(ξs)R(ξs) ds}
is a P̃-martingale that will increase the rate at which fission times occur along the spine from R(ξt) to (1 + m(ξt))R(ξt):
dL_t^{((1+m(ξ))R(ξ))} / dL_t^{(R(ξ))} = ∏_{v<ξt} (1 + m(ξ_{Sv})) e^{−∫₀ᵗ m(ξs)R(ξs) ds},
where L^{(R(ξ))} is the law of the Poisson (Cox) process with rate R(ξt) at time t.
Theorem 5.5 The term
∏_{v<ξt} (1 + Av)/(1 + m(ξ_{Sv}))
is a P̃-martingale that will change the measure by size-biasing the family sizes born from the spine: if v < ξt, then
Prob(Av = k) = (1 + k) pk(ξ_{Sv}) / (1 + m(ξ_{Sv})).
The proof of these two results is left as an easy exercise for the reader. The product of these two martingales with the single-particle martingale ζ(t) will simultaneously perform the three changes mentioned above:
Definition 5.6 We define an F̃t-measurable martingale as
ζ̃(t) := ∏_{v<ξt} (1 + Av) e^{−∫₀ᵗ m(ξs)R(ξs) ds} × ζ(t)
      = ∏_{u<ξt} (1 + Au)/(1 + m(ξ_{Su})) × ∏_{v<ξt} (1 + m(ξ_{Sv})) e^{−∫₀ᵗ m(ξs)R(ξs) ds} × ζ(t).    (15)
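The size-biased law in Theorem 5.5 is easy to sample and to sanity-check: weight each family size k by (1 + k) and renormalize by 1 + m. A small sketch (the offspring law used is a toy example, not from the text):

```python
import random

def size_biased(p):
    """Size-bias an offspring law p (p[k] = P(A = k)) as in Theorem 5.5:
    P(A-tilde = k) = (1 + k) p[k] / (1 + m), where m = E[A]."""
    m = sum(k * pk for k, pk in enumerate(p))
    return [(1 + k) * pk / (1 + m) for k, pk in enumerate(p)]

p = [0.2, 0.5, 0.3]                  # P(A=0), P(A=1), P(A=2); m = E[A] = 1.1
q = size_biased(p)
assert abs(sum(q) - 1.0) < 1e-12     # a genuine probability law
assert abs(q[2] - 3 * 0.3 / 2.1) < 1e-12   # (1 + 2) p_2 / (1 + m)

# Monte Carlo check: accepting a draw A with probability (1+A)/(1+A_max)
# reproduces the size-biased law q by rejection sampling
rng = random.Random(0)
counts, trials = [0, 0, 0], 0
while trials < 30000:
    A = rng.choices([0, 1, 2], weights=p)[0]
    if rng.random() < (1 + A) / 3.0:          # rejection step, bound 1 + A_max = 3
        counts[A] += 1
        trials += 1
freqs = [c / trials for c in counts]
assert all(abs(f - qk) < 0.02 for f, qk in zip(freqs, q))
```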
Significantly, only the motion of the spine and the behaviour along the immediate path of the spine will be affected by any change of measure using this martingale. Also note, this martingale is the general form of ζ̃λ(t) that we defined at (12) for our finite-type model.
The real importance of the size-biasing and fission-time-increase operations is that they introduce the correct terms into ζ̃(t) so that the following key relationships hold:
Lemma 5.7 Both Z(t) and ζ(t) are projections of ζ̃(t) onto their filtrations: for all t ≥ 0,
• Z(t) = P̃( ζ̃(t) | Ft ),
• ζ(t) = P̃( ζ̃(t) | Gt ).
Proof: We use the representation (8) of ζ̃(t):
ζ̃(t) = Σ_{u∈Nt} ∏_{v<u} (1 + Av) e^{−∫₀ᵗ m(Xu(s))R(Xu(s)) ds} ζu(t) 1(ξt=u).    (16)
Since P̃( 1(ξt=u) | Ft ) = 1(u∈Nt) × ∏_{v<u} 1/(1 + Av), we obtain
P̃( ζ̃(t) | Ft ) = Σ_{u∈Nt} e^{−∫₀ᵗ m(Xu(s))R(Xu(s)) ds} ζu(t) = Z(t).
On the other hand, the martingale terms in (15) imply
P̃( ζ̃(t) | Gt ) = ζ(t) × P̃( ∏_{v<ξt} (1 + Av) e^{−∫₀ᵗ m(ξs)R(ξs) ds} | Gt ) = ζ(t).
6 Changing the Measures

For the finite-type model, the single-particle martingale ζλ(t) defined at (6) can be used to define a new measure for the single-particle model (as in [21]), via
dPλ/dP |Gt = ζλ(t)/ζλ(0).
We have now seen the close relationships between the three martingales ζλ, Zλ and ζ̃λ:
ζλ(t) = P̃( ζ̃λ(t) | Gt ),    Zλ(t) = P̃( ζ̃λ(t) | Ft ),
and in this section we show in a more general form how these close relationships mean that a new measure Q̃λ defined in terms of P̃ as
dQ̃λ/dP̃ |F̃t = ζ̃λ(t)/ζ̃λ(0)
will induce measure changes on the sub-filtrations Gt and Ft of F̃t whose Radon-Nikodym derivatives are given by ζλ(t) and Zλ(t) respectively. We will also give a useful intuitive construction of the measures P̃ and Q̃.
Definition 6.1 A measure Q̃ on (T̃, F̃∞) is defined via its Radon-Nikodym derivative with respect to P̃:
dQ̃/dP̃ |F̃t = ζ̃(t)/ζ̃(0).
As we did for the measures P and P̃ in Section 4, we can restrict Q̃ to the sub-filtrations:
Definition 6.2 We define the measure Q on (T̃, F∞, (Ft)t≥0) via Q := Q̃|F∞.
Definition 6.3 We define the measure P̂ on (T̃, G∞, (Gt)t≥0) via P̂ := Q̃|G∞.
A consequence of our new formulation in terms of filtrations and the equalities of Lemma 5.7 is that the changes of measure are carried out by Z(t) and ζ(t) on their subfiltrations:
Theorem 6.4
dQ/dP |Ft = Z(t)/Z(0),    and    dP̂/dP |Gt = ζ(t)/ζ(0).
Proof: These two results actually follow from a more general observation: if μ̃1 and μ̃2 are two measures defined on a measure space (Ω, S̃) with Radon-Nikodym derivative
dμ̃2/dμ̃1 = f,
and if S is a sub-σ-algebra of S̃, then the two measures μ1 := μ̃1|S and μ2 := μ̃2|S on (Ω, S) are related by the conditional expectation operation:
dμ2/dμ1 = μ̃1( f | S ).
Applying this general result and using the relationships between the general martingales given in Lemma 5.7 concludes the proof.

6.1 Understanding the Measure Q̃

The decomposition of P̃ given at (11) will allow us to interpret the measure Q̃ if we appropriately factor the components of the change-of-measure martingale ζ̃(t) across this representation. On F̃t,
t 0
R(ξs )ds
1 + m(ξSu ) ×
u<ξt
v<ξt
1 + Av × dP˜ 1 + m(ξSv )
ˆ dL((1+m(ξ))R(ξ)) (n) = dP(ξ) ×
u<ξt
Av 1 + Av 1 pAv (ξSv ) dP (τ, M )vj . 1 + Au 1 + m(ξSv ) j=1 v<ξt
(17)
Just as we did for P̃, we can offer a clear interpretation of this decomposition:
Lemma 6.5 Under the measure Q̃,
• the spine process ξt moves as if under the changed measure P̂;
• the fission times along the spine occur at an accelerated rate (1 + m(ξt))R(ξt);
• at the fission time of node v on the spine, the single spine particle is replaced by 1 + Av children, with Av being chosen as an independent copy of the random variable Ã(y), which has the size-biased offspring distribution ((1 + k)pk(y)/(1 + m(y)) : k = 0, 1, . . .), where y = ξ_{Sv} ∈ J is the spine's location at the time of fission;
• the spine is chosen uniformly from the 1 + Av particles at the fission point v;
• each of the remaining Av children gives rise to the independent subtrees (τ, M)vj, for 1 ≤ j ≤ Av, which are not part of the spine and evolve as independent processes determined by the measure P shifted to their point and time of creation.
7 The Spine Decomposition One of the most important results introduced in Lyons [43] was the so-called spine decomposition, which in the case of the additive martingale Zλ (t) = vλ (Yu (t))eλXu (t)−Eλ t , u∈Nt
from the finite-type branching diffusion would be: ˜ λ Zλ (t)|G˜∞ = vλ (ηSu )eλξSu −Eλ Su + vλ (ηt )eλξt −Eλ t . Q u
To prove this we start by decomposing the martingale as Zλ (t) = vλ (Yu (t))eλXu (t)−Eλ t + vλ (ηt )eλξt −Eλ t , u∈Nt ,u∈ξ /
(18)
which is clearly true since one of the particles u ∈ Nt must be in the line of descent that makes up the spine ξ. Recalling that the σ-algebra G̃∞ contains all information about the line of nodes that makes up the spine, all about the spine diffusion (ξt, ηt) for all times t, and also contains all information regarding the fission times and number of offspring along the spine, it is useful to partition the particles v ∈ {u ∈ Nt : u ∉ ξ} into the distinct subtrees (τ, M)u that were born at the fission times Su from the particles that made up the spine before time t – in other words, from the nodes in the set {u : u < ξt} of ancestors of the current spine node ξt. Thus:
Zλ(t) = Σ_{u<ξt} e^{λξ_{Su} − Eλ Su} Σ_{v∈Nt, v∈(τ,M)u} vλ(Yv(t)) e^{λ(Xv(t) − ξ_{Su}) − Eλ(t − Su)} + vλ(ηt) e^{λξt − Eλ t}.
If we now take the Q̃λ-conditional expectation of this, we find
Q̃λ( Zλ(t) | G̃∞ ) = Σ_{u<ξt} e^{λξ_{Su} − Eλ Su} Q̃λ( Σ_{v∈Nt, v∈(τ,M)u} vλ(Yv(t)) e^{λ(Xv(t) − ξ_{Su}) − Eλ(t − Su)} | G̃∞ ) + vλ(ηt) e^{λξt − Eλ t}.
We know from the decomposition (17) that under the measure Q̃λ the subtrees coming off the spine evolve as if under the measure P, and therefore
Q̃λ( Σ_{v∈Nt, v∈(τ,M)u} vλ(Yv(t)) e^{λ(Xv(t) − ξ_{Su}) − Eλ(t − Su)} | G̃∞ ) = P̃( Σ_{v∈Nt, v∈(τ,M)u} vλ(Yv(t)) e^{λ(Xv(t) − ξ_{Su}) − Eλ(t − Su)} | G̃∞ ) = vλ(η_{Su}),
ζλ (t) = e
0
R(ηs ) ds
vλ (ηt )eλξt −Eλ t ,
and therefore noting that (18) can alternatively be written as Su t ˜ λ Zλ (t)|G˜∞ = e− 0 R(ηs ) ds ζλ (Su ) + e− 0 R(ηs ) ds ζλ (t). Q u
Also, in the general model we are supposing that each particle u in the spine will give birth to a total of Au subtrees that go off from the spine – the one
remaining offspring is used to continue the line of descent that makes up the spine. This explains the appearance of Au in the general decomposition.
Theorem 7.1 (Spine decomposition) We have the following spine decomposition for the additive branching-particle martingale:
Q̃( Z(t) | G̃∞ ) = Σ_{u<ξt} Au e^{−∫₀^{Su} m(ξs)R(ξs) ds} ζ(Su) + e^{−∫₀ᵗ m(ξs)R(ξs) ds} ζ(t).
=e
ζ(t) +
e−
t 0
m(Xu (s))R(Xu (s)) ds
ζu (t).
u∈Nt ,u=ξt
The other individuals u ∈ Nt , u = ξt can be partitioned into subtrees created from fissions along the spine. That is, each node u in the spine ξt (so u < ξt ) has given birth at time Su to one offspring node uj (for some 1 ≤ j ≤ 1 + Au ) that was chosen to continue the spine whilst the other Au individuals go off to make the subtrees (τ, M )uj . Therefore, t Su Z(t) = e− 0 m(ξs )R(ξs ) ds ζ(t) + e− 0 m(ξs )R(ξs ) ds Zuj (Su ; t), j=1,...,1+Au uj ∈ξ /
u<ξt
(19) where for t ≥ Su ,
Zuj (Su ; t) :=
e−
t Su
m(Xv (s))R(Xv (s)) ds
ζv (t),
v∈Nt ,v∈(τ,M )u j
is, conditional on G˜∞ , a P˜ -martingale on the subtree (τ, M )uj , and therefore P˜ Zuj (Su ; t)|G˜∞ = ζ(Su ). ˜ Thus taking Q-conditional expectations of (19) gives ˜ Z(t)|G˜∞ = e− 0t m(ξs )R(ξs ) ds ζ(t) Q
Su + e− 0 m(ξs )R(ξs ) ds P˜ Zuj (Su ; t) G˜∞ u<ξt
= e−
t 0
m(ξs )R(ξs ) ds
ζ(t) +
u<ξt
which completes the proof.
j=1,...,1+Au uj ∈ξ / − 0Su m(ξs )R(ξs ) ds
e
Au ζ(Su ),
This representation was first used in the Lyons et al. [43, 41, 44] papers and has become the standard way to investigate the behaviour of Z under the measure Q̃. We also observe that the two measures P̃ and Q̃ for the general model are equal when conditioned on G̃∞, since this conditioning factors out their differences in the spine diffusion ξt, the family sizes born from the spine and the fission times on the spine. That is, P̃( Z(t) | G̃∞ ) = Q̃( Z(t) | G̃∞ ).
8 Spine Results

Having covered the formal basis for our spine approach, we now present some results that follow from our spine formulation: the Gibbs-Boltzmann weights, conditional expectations, and a simpler proof of the improved Many-to-One theorem.

8.1 The Gibbs-Boltzmann Weights of Q̃

The Gibbs-Boltzmann weightings in branching processes are well known; for example, see Chauvin and Rouault [7], where they consider random measures on the boundary of the tree, and Harris [30], which gives convergence results for Gibbs-Boltzmann random measures. They have previously been considered via the individual terms of the additive martingale Z, but the following theorem gives a new interpretation of these weightings in terms of the spine. We recall that
Z(t) = Σ_{u∈Nt} e^{−∫₀ᵗ m(Xu(s))R(Xu(s)) ds} ζu(t).
Theorem 8.1 Let u ∈ Ω be a given and fixed label. Then
Q̃( ξt = u | Ft ) = 1(u∈Nt) e^{−∫₀ᵗ m(Xu(s))R(Xu(s)) ds} ζu(t) / Z(t).
Proof: Suppose F ∈ Ft . We aim to show:
∫_F 1(ξt=u) dQ̃(τ, M, ξ) = ∫_F 1(u∈Nt) ( e^{−∫₀ᵗ m(Xu(s))R(Xu(s)) ds} ζu(t) / Z(t) ) dQ̃(τ, M, ξ).
First of all we know that dQ̃/dP̃ = ζ̃(t) on F̃t and therefore
LHS = ∫_F 1(ξt=u) ∏_{v<ξt} (1 + Av) e^{−∫₀ᵗ m(ξs)R(ξs) ds} ζ(t) dP̃(τ, M, ξ),
by definition of ζ̃(t) at (15). The definition 4.3 of the measure P̃ requires us to express the integrand with a representation like (8):
1(ξt=u) ∏_{v<ξt} (1 + Av) e^{−∫₀ᵗ m(ξs)R(ξs) ds} ζ(t) = 1(ξt=u) 1(u∈Nt) ∏_{v<u} (1 + Av) e^{−∫₀ᵗ m(Xu(s))R(Xu(s)) ds} ζu(t),
and therefore
LHS = ∫_F 1(u∈Nt) ∏_{v<u} (1 + Av) e^{−∫₀ᵗ m(Xu(s))R(Xu(s)) ds} ζu(t) 1(ξt=u) dP̃(τ, M, ξ)
    = ∫_F 1(u∈Nt) e^{−∫₀ᵗ m(Xu(s))R(Xu(s)) ds} ζu(t) dP(τ, M, ξ),
by definition 4.3. Since Z(t) > 0 a.s., we know that on Ft, dP/dQ = 1/Z(t), so
LHS = ∫_F 1(u∈Nt) e^{−∫₀ᵗ m(Xu(s))R(Xu(s)) ds} ζu(t) (1/Z(t)) dQ(τ, M, ξ),
and the proof is concluded.
The above result combines with the representation (8) to show how we take conditional expectations under the measure Q̃.
Theorem 8.2 If f(t) ∈ mF̃t, and f = Σ_{u∈Nt} fu(t) 1(ξt=u) with fu(t) ∈ mFt, then
Q̃( f(t) | Ft ) = Σ_{u∈Nt} fu(t) ( e^{−∫₀ᵗ m(Xu(s))R(Xu(s)) ds} ζu(t) / Z(t) ).    (20)
Proof: It is clear that
Q̃( f(t) | Ft ) = Σ_{u∈Nt} fu(t) Q̃( ξt = u | Ft ),
and the result follows from Theorem 8.1.
A corollary to this useful result also appears to go a long way towards obtaining the Kesten-Stigum result in more general models:
Corollary 8.3 If g(·) is a Borel function on J then
Σ_{u∈Nt} g(Xu(t)) e^{−∫₀ᵗ m(Xu(s))R(Xu(s)) ds} ζu(t) = Q̃( g(ξt) | Ft ) × Z(t).    (21)
Proof: We can write g(ξt) = Σ_{u∈Nt} g(Xu(t)) 1(ξt=u), and now the result follows from Theorem 8.2.
The classical Kesten-Stigum theorems of [37, 36, 38] for multi-dimensional Galton-Watson processes give conditions under which an operation like the left-hand side of (21) converges as t → ∞, and it is found that when it exists
the limit is a multiple of the martingale limit Z(∞). Also see Lyons et al. [41] for a more recent proof of this based on other spine techniques. Our spine formulation apparently gives a previously unknown but simple meaning to this operation in terms of a conditional expectation and, as we hope to pursue in further work, in many cases we would intuitively expect that Q̃( g(ξt) | Ft ) / Q̃( g(ξt) ) → 1 a.s., leading to alternative spine proofs of both Kesten-Stigum-like theorems and Watanabe's theorem in the case of BBM.

8.2 The Full Many-to-One Theorem

A very useful tool in the study of branching processes is the Many-to-One result that enables expectations of sums over particles in the branching process to be calculated in terms of an expectation of a single particle. In the context of the finite-type branching diffusion of section 2.1, the Many-to-One theorem would be stated as follows:
Theorem 8.4 For any measurable function f : J → R we have
P^{x,y}( Σ_{u∈Nt} f(Xu(t), Yu(t)) ) = P^{x,y}( e^{∫₀ᵗ R(ηs) ds} f(ξt, ηt) ).
Intuitively it is clear that the up-weighting term e^{∫₀ᵗ R(ηs) ds} incorporates the notion of the population growing at an exponential rate, whilst the idea of f(ξt, ηt) being the 'typical' behaviour of f(Xu(t), Yu(t)) is also reasonable. Existing results tend to apply only to functions of the above form that depend only on the time-t location of the spine, and existing proofs do not lend themselves to covering functions that depend on the entire path history of the spine up to time t. With the spine approach we have the benefit of being able to give a much less complicated proof of the stronger version that covers the most general path-dependent functions.
Theorem 8.5 (Many-to-One) If f(t) ∈ mF̃t has the representation
f(t) = Σ_{u∈Nt} fu(t) 1(ξt=u),
where fu (t) ∈ mFt , then
$$P\Big[\sum_{u\in N_t} f_u(t)\,e^{-\int_0^t m(X_u(s))R(X_u(s))\,ds}\,\zeta_u(t)\Big] = \tilde P\big[f(t)\,\tilde\zeta(t)\big] = \tilde\zeta(0)\,\tilde Q\big[f(t)\big].\qquad(22)$$
In particular, if g(t) ∈ mGt with $g(t) = \sum_{u\in N_t} g_u(t)\,\mathbf 1(\xi_t=u)$ where gu(t) ∈ mFt, then
$$P\Big[\sum_{u\in N_t} g_u(t)\Big] = P\Big[e^{\int_0^t m(\xi_s)R(\xi_s)\,ds}\,g(t)\Big] = \zeta(0)\,\hat P\Big[\frac{g(t)}{e^{-\int_0^t m(\xi_s)R(\xi_s)\,ds}\,\zeta(t)}\Big].\qquad(23)$$
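The content of these many-to-one identities can be sanity-checked in the simplest discrete setting: for a Galton-Watson tree with mean offspring number m and i.i.d. spatial displacements, the analogous identity reads E[∑_{u in generation n} f(S_u)] = mⁿ E[f(S_n)], with S_n a single random walk. The following sketch is our own illustration (a toy two-point offspring law and ±1 steps, not the paper's model), verifying the n = 2 case by exact enumeration rather than by assuming the identity:

```python
# Toy check of a discrete many-to-one identity (illustration only, not the
# paper's branching diffusion): for a Galton-Watson tree with offspring law p
# and i.i.d. +/-1 displacements, E[sum over generation-2 of f] = m^2 E[f(S_2)].

p = {0: 0.3, 2: 0.7}                  # assumed offspring distribution
steps = {-1: 0.5, 1: 0.5}             # assumed displacement distribution
m = sum(k * q for k, q in p.items())  # mean offspring number
f = lambda x: x * x

def gen2_expectation():
    """Direct E[sum over generation-2 particles of f(position)] by enumeration."""
    total = 0.0
    for k1, q1 in p.items():                 # root's offspring count
        # each of the k1 children contributes independently and identically
        child = 0.0
        for s, qs in steps.items():          # child's displacement
            for k2, q2 in p.items():         # child's own offspring count
                for t, qt in steps.items():  # grandchild's displacement
                    child += qs * q2 * qt * k2 * f(s + t)
        total += q1 * k1 * child
    return total

def single_particle():
    """m^2 * E[f(S_2)] for a single two-step random walk."""
    e = sum(qs * qt * f(s + t) for s, qs in steps.items()
            for t, qt in steps.items())
    return m * m * e
```

With these choices both sides equal m²·E[S₂²] = 1.4²·2 = 3.92, confirming the identity without using it in the computation.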
308
R. Hardy and S.C. Harris
Proof: Let f (t) ∈ mF˜t with the given representation. The tower property together with Theorem 8.2 gives
$$\tilde Q\big[f(t)\big] = \tilde Q\Big[\tilde Q\big(f(t)\,\big|\,\mathcal F_t\big)\Big] = Q\Big[\tilde Q\big(f(t)\,\big|\,\mathcal F_t\big)\Big] = Q\Big[\frac{1}{Z(t)}\sum_{u\in N_t} f_u(t)\,e^{-\int_0^t m(X_u(s))R(X_u(s))\,ds}\,\zeta_u(t)\Big].$$
From Theorem 6.4,
$$\frac{dQ}{dP}\bigg|_{\mathcal F_t} = \frac{Z(t)}{Z(0)},$$
and therefore we have
$$\tilde Q\big[f(t)\big] = P\Big[Z(0)^{-1}\sum_{u\in N_t} f_u(t)\,e^{-\int_0^t m(X_u(s))R(X_u(s))\,ds}\,\zeta_u(t)\Big].$$
On the other hand, since
$$\frac{d\tilde Q}{d\tilde P}\bigg|_{\tilde{\mathcal F}_t} = \frac{\tilde\zeta(t)}{\tilde\zeta(0)},$$
we have
$$\tilde Q\big[f(t)\big] = \tilde P\big[f(t)\times\tilde\zeta(t)\,\tilde\zeta(0)^{-1}\big].$$
Trivially noting that Z(0) = ζ(0) = ζ̃(0), as there is only one initial ancestor, we can combine these expressions to obtain (22). For the second part, given g(t) ∈ mGt, we can define
$$f(t) := e^{\int_0^t m(\xi_s)R(\xi_s)\,ds}\,g(t)\times\zeta(t)^{-1},$$
which is clearly Gt-measurable and satisfies $f(t) = \sum_{u\in N_t} f_u(t)\,\mathbf 1(\xi_t=u)$ with
$$f_u(t) = g_u(t)\,e^{\int_0^t m(X_u(s))R(X_u(s))\,ds}\,\zeta_u(t)^{-1} \in m\mathcal F_t.$$
When we use this f(t) in equation (22) and recall from Lemma 5.7 that P := P̃|F∞ from Definition 4.6 and that P̂ := Q̃|G∞ from Definition 6.3, we arrive at the particular case given at (23) in the theorem. In the further special case in which g = g(ξt) for some Borel-measurable function g(·), the trivial representation
$$g(\xi_t) = \sum_{u\in N_t} g\big(X_u(t)\big)\,\mathbf 1(\xi_t=u)$$
leads immediately to the weaker version of the Many-to-One result that was utilised and proven, for example, in Harris and Williams [28] and Champneys et al. [5] using resolvents and the Feynman-Kac formula, expressed in terms of our more general branching Markov process Xt : Corollary 8.6 If g(·) : J → R is B-measurable then
$$P\Big[\sum_{u\in N_t} g(X_u(t))\Big] = P\Big[e^{\int_0^t R(\xi_s)\,ds}\,g(\xi_t)\Big].$$
9 Branching Brownian Motion

We now return to the original BBM model where particles move as standard Brownian motions, branching at rate r with offspring distribution A, as in Section 1. Under the measure P̃^x, the spine diffusion ξt is a Brownian motion that starts at x, and we note that the martingale Zλ can be obtained as in Sections 5 & 6 by starting with the spine P̃^x-martingale
$$\tilde\zeta_\lambda(t) := e^{-mrt}\,(1+m)^{n_t}\times\prod_{v<\xi_t}\frac{1+A_v}{1+m}\times e^{\lambda\xi_t-\frac12\lambda^2 t}.$$
That is, we define the measure Q̃λ on (T̃, F̃∞) by
$$\frac{d\tilde Q^x_\lambda}{d\tilde P^x}\bigg|_{\tilde{\mathcal F}_t} := \frac{\tilde\zeta_\lambda(t)}{\tilde\zeta_\lambda(0)} = \prod_{v<\xi_t}(1+A_v)\;e^{\lambda(\xi_t-x)-E_\lambda t};\qquad(24)$$
then, under Q̃λ^x, the process Xt can be constructed as follows:
• starting from x, the spine ξt diffuses according to a Brownian motion with drift λ on R;
• at accelerated rate (1 + m)r the spine undergoes fission producing 1 + Ã particles, where Ã is independent of the spine's motion with size-biased distribution {(1 + k)pk/(1 + m) : k ≥ 0};
• with equal probability, one of the spine's offspring particles is selected to continue the path of the spine, repeating stochastically the behaviour of its parent;
• the other particles initiate, from their birth position, independent copies of a P^· branching Brownian motion with branching rate r and family-size distribution given by A, that is, {pk : k ≥ 0}.
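The size-biased distribution appearing in the second bullet can be written down mechanically for any offspring law. The sketch below is our own illustration (the example pmf is an assumption, not from the paper); it builds {(1 + k)pk/(1 + m)} and checks that it is again a probability distribution:

```python
import random

def size_biased(p):
    """Given an offspring pmf {k: p_k} with mean m = sum_k k p_k, return the
    size-biased pmf {k: (1+k) p_k / (1+m)} governing spine fission sizes."""
    m = sum(k * q for k, q in p.items())
    return {k: (1 + k) * q / (1 + m) for k, q in p.items()}

# assumed example offspring law (illustrative only)
p = {0: 0.2, 1: 0.5, 2: 0.2, 3: 0.1}
pb = size_biased(p)   # sums to (1+m)/(1+m) = 1, so it is a pmf

def sample(pmf, rng):
    """Inverse-CDF sampling from a finite pmf."""
    u, acc = rng.random(), 0.0
    for k in sorted(pmf):
        acc += pmf[k]
        if u <= acc:
            return k
    return max(pmf)

rng = random.Random(0)
draws = [sample(pb, rng) for _ in range(5)]
```

Note how size-biasing shifts mass towards larger families: here pb[1] = 2·0.5/2.2 while p gave weight 0.5 to one child.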
Further, ignoring information identifying the spine by setting Qλ^x := Q̃λ^x|F∞, we find
$$\frac{dQ^x_\lambda}{dP^x}\bigg|_{\mathcal F_t} = \frac{Z_\lambda(t)}{Z_\lambda(0)} = \sum_{u\in N_t} e^{\lambda(X_u(t)-x)-E_\lambda t}.\qquad(25)$$
Of course, this is all in full agreement with the equivalent definition of Qλ initially introduced in Theorem 1.1 via its pathwise construction.

9.1 Proof of Theorem 1.3

Just before we proceed to the proof, we recall the naturally occurring eigenvalue Eλ := ½λ² + mr, noting that under the symmetry assumption that λ ≤ 0 and for p ∈ (1, 2]:
$$pE_\lambda - E_{p\lambda} > 0 \quad\Longleftrightarrow\quad c_\lambda > c_{p\lambda} \quad\Longleftrightarrow\quad p\lambda^2 < 2mr,$$
and that this always holds for some p > 1 whenever λ ∈ (λ̃, 0], that is, when λ lies between the minimum of cλ found at λ̃ and the origin.
Proof of Part 1: We are going to prove that for every p ∈ (1, 2] the martingale Zλ is Lp(P)-convergent if pEλ − Epλ > 0. Furthermore, since P^x(Zλ(t)^p) = e^{pλx} P^0(Zλ(t)^p), we do not lose generality in supposing that x = 0; from now on this is implicit if we drop the superscript by simply writing P. From the change of measure at (25) it is clear that P(Zλ(t)^p) = P(Zλ(t)^{p−1} Zλ(t)) = Qλ(Zλ(t)^q), where q := p − 1. Our aim is to prove that Qλ(Zλ(t)^q) is bounded in t, since then Zλ(t) must be bounded in Lp(P) and Doob's theorem will then imply that Zλ is convergent in Lp(P). As we know from Theorem 7.1, the algebra G̃∞ gives us the very important spine decomposition of the martingale Zλ:
$$\tilde Q_\lambda\big[Z_\lambda(t)\,\big|\,\tilde{\mathcal G}_\infty\big] = \sum_{k=1}^{n_t} A_k\,e^{\lambda\xi_{S_k}-E_\lambda S_k} + e^{\lambda\xi_t-E_\lambda t},\qquad(26)$$
where Ak is the number of new particles produced from the fission at time Sk along the path of the spine, and the sum is taken to equal 0 if nt = 0. The intuition is quite clear: since the particles that do not make up the spine grow to become independent copies of Xt distributed as if under P, the fact that Zλ is a P-martingale on these subtrees implies that their contributions to the above decomposition are just equal to their immediate contribution on being born at time Sk at location ξSk. Note, we emphasize that here we must use Q̃λ, since Qλ cannot measure the algebra G̃∞ ⊄ F∞.
We can now use the conditional form of Jensen's inequality followed by the spine decomposition of (26), coupled with the simple inequality

Proposition 9.1 If q ∈ (0, 1] and u, v > 0 then (u + v)^q ≤ u^q + v^q,

to obtain
$$\tilde Q_\lambda\big[Z_\lambda(t)^q\,\big|\,\tilde{\mathcal G}_\infty\big] \le \tilde Q_\lambda\big[Z_\lambda(t)\,\big|\,\tilde{\mathcal G}_\infty\big]^q \qquad(27)$$
$$\le \sum_{k=1}^{n_t} A_k^q\,e^{q\lambda\xi_{S_k}-qE_\lambda S_k} + e^{q\lambda\xi_t-qE_\lambda t}.\qquad(28)$$
With the tower property of conditional expectations, and noting that Qλ and Q̃λ agree on Ft,
$$Q_\lambda\big[Z_\lambda(t)^q\big] = \tilde Q_\lambda\big[Z_\lambda(t)^q\big] = \tilde Q_\lambda\Big[\tilde Q_\lambda\big(Z_\lambda(t)^q\,\big|\,\tilde{\mathcal G}_\infty\big)\Big]\qquad(29)$$
$$\le \tilde Q_\lambda\Big[\sum_{k=1}^{n_t} A_k^q\,e^{q\lambda\xi_{S_k}-qE_\lambda S_k}\Big] + \tilde Q_\lambda\big[e^{q\lambda\xi_t-qE_\lambda t}\big],\qquad(30)$$
and the proof of Lp(P)-boundedness will be complete once we show this is bounded in t. As written, (30) is made up of two terms, and since they play a central role from here on we name them explicitly: on the far right we have the spine term $\tilde Q_\lambda\big[e^{q\lambda\xi_t-qE_\lambda t}\big]$, the other being the sum term $\tilde Q_\lambda\big[\sum_{k=1}^{n_t} A_k^q\,e^{q\lambda\xi_{S_k}-qE_\lambda S_k}\big]$.

The Spine Term: Changing from P̃ to Q̃λ gives the spine a drift of λ, and therefore the change of measure for just the spine's motion (i.e. on the algebra Gt) is carried out by the martingale e^{λξt − ½λ²t}, so
$$\tilde Q_\lambda\big[e^{q\lambda\xi_t-qE_\lambda t}\big] = \tilde P\big[e^{q\lambda\xi_t-qE_\lambda t}\times e^{\lambda\xi_t-\frac12\lambda^2 t}\big] = e^{\{\frac12(p\lambda)^2-\frac12\lambda^2\}t-qE_\lambda t}\,\tilde P\big[e^{p\lambda\xi_t-\frac12(p\lambda)^2 t}\big] = e^{-(pE_\lambda-E_{p\lambda})t}\,\tilde Q_{p\lambda}(1) = e^{-(pE_\lambda-E_{p\lambda})t}\qquad(31)$$
since the second term e^{pλξt − ½(pλ)²t} is also a P̃-martingale and ½(pλ)² − ½λ² = Epλ − Eλ.
The Sum Term: Conditioning on the motion of the spine (without knowledge of the fission times or family sizes) and appealing to intuitive results from Poisson process theory (see [35] for example) yields
$$\tilde Q_\lambda\Big[\sum_{k=1}^{n_t} A_k^q\,e^{q\lambda\xi_{S_k}-qE_\lambda S_k}\,\Big|\,\mathcal G_t\Big] = (1+m)r\int_0^t \tilde Q_\lambda\big[\tilde A^q\big]\,e^{q\lambda\xi_s-qE_\lambda s}\,ds.\qquad(32)$$
Taking expectations of both sides of (32) and using Fubini's theorem then gives
$$\tilde Q_\lambda\Big[\sum_{k=1}^{n_t} A_k^q\,e^{q\lambda\xi_{S_k}-qE_\lambda S_k}\Big] = (1+m)r\,\tilde Q_\lambda\big[\tilde A^q\big]\int_0^t \tilde Q_\lambda\big[e^{q\lambda\xi_s-qE_\lambda s}\big]\,ds = (1+m)r\,\tilde Q_\lambda(\tilde A^q)\int_0^t e^{-(pE_\lambda-E_{p\lambda})s}\,ds,$$
using (31).
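The step from the sum over fission times to an integral is an instance of Campbell's formula for a Poisson process: for arrivals at constant rate β, E[∑_{k: Sk ≤ t} h(Sk)] = β ∫₀ᵗ h(s) ds. A quick Monte Carlo sketch (our own illustration, with an assumed rate and a simple stand-in integrand) makes this concrete:

```python
import math
import random

beta, t = 2.0, 1.0
h = lambda s: math.exp(-s)  # stand-in for the integrand e^{q*lambda*xi_s - q*E_lambda*s}

def poisson_sum(rng):
    """One realisation of sum_{k: S_k <= t} h(S_k) for rate-beta arrivals."""
    s, total = 0.0, 0.0
    while True:
        s += rng.expovariate(beta)  # exponential inter-arrival time
        if s > t:
            return total
        total += h(s)

rng = random.Random(42)
n = 20000
estimate = sum(poisson_sum(rng) for _ in range(n)) / n
exact = beta * (1.0 - math.exp(-t))  # beta * integral_0^t e^{-s} ds
```

With 20000 seeded replications the empirical mean sits well within a few standard errors of the exact value β(1 − e^{−t}).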
Thus we have found an explicit upper bound (if pEλ ≠ Epλ):
$$P^x\big[Z_\lambda(t)^p\big] \le e^{p\lambda x}\Big(\frac{(1+m)r\,\tilde Q_\lambda(\tilde A^q)}{pE_\lambda-E_{p\lambda}}\big(1-e^{-(pE_\lambda-E_{p\lambda})t}\big) + e^{-(pE_\lambda-E_{p\lambda})t}\Big).\qquad(33)$$
Finally, we also observe:

Lemma 9.2 If p ∈ (1, 2] and q := p − 1, then Q̃λ(Ã^q) < ∞ if and only if P(A^p) < ∞,
since
$$\tilde Q_\lambda(\tilde A^q) = \sum_{i=1}^\infty i^q\,\frac{i+1}{m+1}\,p_i = \frac{P(A^p)+P(A^q)}{m+1} \le \frac{2P(A^p)}{m+1}.$$
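The identity in this display follows from i^q(i + 1) = i^p + i^q and can be checked mechanically for any concrete offspring law; the snippet below does so for an assumed example distribution of our own choosing:

```python
# Check of the size-biased moment identity behind Lemma 9.2 for a concrete
# (assumed, illustrative) offspring law p_i and p in (1, 2].
p = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.2}
m = sum(i * w for i, w in p.items())   # mean offspring number
pp = 1.5                               # p
q = pp - 1.0                           # q := p - 1

# left-hand side: q-th moment of the size-biased variable A~,
# i.e. sum_i i^q (i+1) p_i / (m+1)
lhs = sum((i ** q) * (i + 1) * w / (m + 1) for i, w in p.items())

# right-hand side: (P(A^p) + P(A^q)) / (m+1)
EAp = sum((i ** pp) * w for i, w in p.items())
EAq = sum((i ** q) * w for i, w in p.items())
rhs = (EAp + EAq) / (m + 1)
```

Since A is integer-valued, A^q ≤ A^p on {A ≥ 1}, which is what licenses the final bound by 2P(A^p)/(m + 1).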
Hence, if we have pEλ − Epλ > 0 in addition to P(A^p) < ∞, this implies that P^x(Zλ(t)^p) will remain bounded as t → ∞, which together with Doob's theorem completes the proof of the first part of Theorem 1.3.

Proof of Part 2: We seek to show that Zλ is unbounded in Lp(P^x) if either pEλ − Epλ < 0 or P(A^p) = ∞. Note that if Zλ is Lp(P^x)-bounded then
$$P^x\big[Z_\lambda(\infty)^p\big] = \lim_{t\to\infty} P^x\big[Z_\lambda(t)^p\big] < \infty,$$
hence Q̃λ^x(Zλ(∞)^q) < ∞ and Zλ(t)^q is a uniformly integrable Q̃λ^x-submartingale. In particular, for any stopping time T, Q̃λ^x(Zλ(∞)^q | F̃T) ≥ Zλ(T)^q, hence Q̃λ^x(Zλ(∞)^q) ≥ Q̃λ^x(Zλ(T)^q).
First, by considering only the contribution of the spine, Zλ(t) ≥ e^{λξt − Eλt} for all t ≥ 0, and recalling (31), we see that
$$\tilde Q^x_\lambda\big[Z_\lambda(t)^q\big] \ge \tilde Q^x_\lambda\big[e^{q\lambda\xi_t-qE_\lambda t}\big] = e^{q\lambda x-(pE_\lambda-E_{p\lambda})t}$$
and Zλ is therefore unbounded in Lp(P^x) if pEλ − Epλ < 0. Now, let T be any fission time along the path of the spine; then
$$Z_\lambda(T) \ge (1+\tilde A)\,e^{\lambda\xi_T-E_\lambda T}$$
where Ã is the number of additional offspring produced at the time of fission. Then,
$$\tilde Q^x_\lambda\big[Z_\lambda(T)^q\big] \ge e^{q\lambda x}\,\tilde Q^x_\lambda\big[(1+\tilde A)^q\big]\,\tilde Q^x_{p\lambda}\big[e^{-(pE_\lambda-E_{p\lambda})T}\big]$$
and so Zλ is unbounded in Lp(P^x) if Q̃λ^x[(1 + Ã)^q] = ∞, which is true iff P(A^p) = ∞. □
10 A Typed Branching Diffusion We move on to consider a general offspring distribution version of the typed branching diffusion introduced in Section 2.1. We will follow a similar notation and setup as before, but leave some details to the reader. Recall the single particle motion (Xt , Yt )t≥0 from Section 2.1, where the type Yt evolves as a Markov chain on I := {1, . . . , n} with Q-matrix θQ and
the spatial location, Xt, moves as a driftless Brownian motion on R with diffusion coefficient a(y) > 0 whenever Yt is in state y. Consider a typed branching Brownian motion where individual particles move independently according to the single-particle motion as above, and any particle currently of type y will undergo fission at rate R(y) to be replaced by a random number of offspring, 1 + A(y), where A(y) ∈ {0, 1, 2, . . .} is an independent random variable with distribution P(A(y) = i) = pi(y) for i ∈ {0, 1, . . .}, and mean M(y) := P(A(y)) < ∞ for all y ∈ I. At birth, offspring inherit the parent's spatial and type positions and then move off independently, repeating stochastically the parent's behaviour, and so on. We gather together the mean numbers of offspring in the matrix M := diag[M(1), . . . , M(n)], and also recall that R := diag[R(1), . . . , R(n)] and A := diag[a(1), . . . , a(n)]. As usual, let the configuration of the whole branching diffusion at time t be given by the J-valued point process Xt = {(Xu(t), Yu(t)) : u ∈ Nt}, where Nt is the set of individuals alive at time t. Let the probabilities for this process be given by {P^{x,y} : (x, y) ∈ J}, defined on the natural filtration (Ft)t≥0, where P^{x,y} is the law of the typed BBM process starting with one initial particle of type y at spatial position x. Recall that, under the extended measures {P̃^{x,y} : (x, y) ∈ J} where we identify a distinguished infinite line of descent starting from the initial particle, this spine (ξt, ηt)t≥0 will simply move like the single-particle motion above. It should be noted that the condition of time-reversibility on the Markov chain is not absolutely necessary, and is really just a simplifying assumption that gives us an easier L² theory for the matrices and eigenvectors; our aim is really to show how the spine techniques work – lessening the geometric complexity of the model serves a good purpose. Note, the special case of the 2-type BBM model was considered in Champneys et al.
[5] by different means. Also, in our model, at the time of fission a type-y individual can produce only type-y offspring. This is not the same as the case in which a type-y individual may produce a random collection of particles of different types – as considered in T.E. Harris's classic text [32], for example. Other forms of typed branching processes have also been dealt with by spine techniques; for example, see Lyons et al. [41] or Athreya [2] for discrete-time models in which a particle's type does not change during its life but a type-w individual can give offspring of any type according to some distribution. See also the remarkable work of Georgii and Baake [19] that uses spine techniques to study ancestral type behaviour in a continuous-time branching Markov chain where particles can give birth across all types. In principle, our spine methods will be robust enough to extend to all these other type behaviours (with added spatial diffusion).
10.1 The Martingale

Via the Many-to-One Theorem 8.5, it is easy to see that for any λ ∈ R, any function (vector) vλ : I → R and any number Eλ ∈ R, the expression
$$Z_\lambda(t) := \sum_{u\in N(t)} v_\lambda(Y_u(t))\,e^{\lambda X_u(t)-E_\lambda t}$$
will be a martingale if and only if vλ and Eλ satisfy
$$\Big(\frac12\lambda^2 A + \theta Q + MR\Big)v_\lambda = E_\lambda v_\lambda.\qquad(34)$$
That is, vλ must be an eigenvector of the matrix ½λ²A + θQ + MR, with eigenvalue Eλ.
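Equation (34) is a finite-dimensional eigenproblem, so the martingale weight vλ and eigenvalue Eλ can be computed numerically. The sketch below uses illustrative two-type parameters of our own choosing (not from the paper) and extracts the Perron eigenpair of ½λ²A + θQ + MR:

```python
import numpy as np

# assumed illustrative parameters for a 2-type model (not from the paper)
lam, theta = -0.5, 1.0
A = np.diag([1.0, 2.0])                   # diffusion coefficients a(y)
M = np.diag([1.0, 1.5])                   # mean offspring numbers M(y)
R = np.diag([0.7, 0.3])                   # branching rates R(y)
Q = np.array([[-1.0, 1.0], [2.0, -2.0]])  # conservative Q-matrix, reversible w.r.t. pi = (2/3, 1/3)

H = 0.5 * lam**2 * A + theta * Q + M @ R
eigvals, eigvecs = np.linalg.eig(H)
i = np.argmax(eigvals.real)               # Perron (rightmost) eigenvalue
E_lam = eigvals.real[i]
v_lam = eigvecs[:, i].real
v_lam = v_lam / v_lam[0]                  # Perron eigenvector can be taken strictly positive

# v_lam solves (34): residual of H v - E v should vanish
residual = np.linalg.norm(H @ v_lam - E_lam * v_lam)
```

Because the off-diagonal entries of H are positive and the matrix is irreducible, Perron-Frobenius guarantees the rightmost eigenvalue is real with a strictly positive eigenvector, exactly as used in the text.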
Definition 10.1 For two vectors u, v on I, we define
$$\langle u, v\rangle_\pi := \sum_{i=1}^n u_i v_i \pi_i,$$
which gives us a Hilbert space which we refer to as L²(π). We suppose that the eigenvector vλ is normalized so that $\|v_\lambda\|_\pi := \sqrt{\langle v_\lambda, v_\lambda\rangle_\pi} = 1$.
The fact that the Markov chain is time-reversible implies that the matrix ½λ²A + θQ + MR is self-adjoint with respect to this inner product. This in itself is enough to guarantee the existence of eigenvectors in L²(π), but the fact that we are dealing with a finite-state Markov chain means that we also have the Perron-Frobenius theory to hand, which allows us to suppose that vλ is a strictly positive eigenvector whose eigenvalue Eλ is real and the farthest to the right of all the other eigenvalues – see Seneta [48] for details. This implies a useful representation for the eigenvalue:

Theorem 10.2
$$E_\lambda = \sup_{\|v\|_\pi=1}\Big\langle\Big(\frac{\lambda^2}{2}A + \theta Q + MR\Big)v,\,v\Big\rangle_\pi,\qquad(35)$$
since it is the rightmost eigenvalue. A proof can be found in Kreyszig [39].

From this it is not difficult to show that Eλ is a strictly convex function of λ. Interestingly, it will be seen in our proofs that it is the geometry of the eigenvalue Eλ that determines the interval that gives rise to martingales Zλ(t) that are Lp-convergent.

Corollary 10.3 As a function of λ, Eλ is strictly convex and infinitely differentiable, with
$$E'_\lambda = \lambda\,\langle Av_\lambda, v_\lambda\rangle_\pi.\qquad(36)$$
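The derivative formula (36) is a Hellmann-Feynman-type identity for the self-adjoint family ½λ²A + θQ + MR in L²(π), and it can be checked against a finite difference. The sketch below reuses illustrative parameters of our own choosing (not from the paper), with vλ normalised so that ⟨vλ, vλ⟩π = 1:

```python
import numpy as np

theta = 1.0
A = np.diag([1.0, 2.0])
M = np.diag([1.0, 1.5])
R = np.diag([0.7, 0.3])
Q = np.array([[-1.0, 1.0], [2.0, -2.0]])  # reversible w.r.t. pi = (2/3, 1/3)
pi = np.array([2.0 / 3.0, 1.0 / 3.0])

def perron(lam):
    """Rightmost eigenvalue and pi-normalised positive eigenvector of (34)."""
    H = 0.5 * lam**2 * A + theta * Q + M @ R
    w, V = np.linalg.eig(H)
    i = np.argmax(w.real)
    v = np.abs(V[:, i].real)              # Perron vector taken positive
    v /= np.sqrt(np.sum(v * v * pi))      # enforce <v, v>_pi = 1
    return w.real[i], v

lam, h = -0.5, 1e-6
E_minus, _ = perron(lam - h)
E_plus, _ = perron(lam + h)
E, v = perron(lam)
numeric = (E_plus - E_minus) / (2 * h)              # finite-difference E'_lambda
formula = lam * np.sum(np.diag(A) * v * v * pi)     # lambda * <A v, v>_pi, as in (36)
```

The agreement of `numeric` and `formula` relies precisely on the reversibility of Q with respect to π, which makes the matrix self-adjoint in the π-inner product.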
If we define the speed function
$$c_\lambda := -E_\lambda/\lambda,\qquad(37)$$
then on (−∞, 0) the function cλ has just one minimum, at a single point λ̃(θ), either side of which cλ is strictly increasing to +∞ as either λ ↓ −∞ or λ ↑ 0. In particular, for each λ ∈ (λ̃(θ), 0] there is some p > 1 such that cλ > cpλ; on the other hand, if λ < λ̃(θ) there is no such p > 1. We refer to cλ as the speed function since it relates to the asymptotic speed of the travelling waves associated with the martingale Zλ(t); see Harris [29] or Champneys et al. [5] for details of the relationship between branching-diffusion martingales and travelling waves.
Since Zλ(t) is a strictly-positive martingale, it is immediate that Zλ(∞) := limt→∞ Zλ(t) exists and is finite almost surely under P^{x,y}. As before, by symmetry we shall assume that λ ≤ 0 and, without loss of generality, we also suppose that P(A(y) = 0) = 1 whenever r(y) = 0, to simplify statements. We shall prove necessary and sufficient conditions for L¹-convergence of the Zλ martingales:

Theorem 10.4 For each x ∈ R, the limit Zλ(∞) := limt→∞ Zλ(t) exists P^{x,y}-a.s., where:
• if λ ≤ λ̃(θ) then Zλ(∞) = 0 P^{x,y}-almost surely;
• if λ ∈ (λ̃(θ), 0] and P(A(y) log⁺A(y)) = ∞ for some y ∈ I, then Zλ(∞) = 0 P^{x,y}-a.s.;
• if λ ∈ (λ̃(θ), 0] and P(A(y) log⁺A(y)) < ∞ for all y ∈ I, then Zλ(t) → Zλ(∞) almost surely and in L¹(P^{x,y}).

Once again, in many cases where the martingale has a non-trivial limit, the convergence will be much stronger than merely in L¹(P^{x,y}), as indicated by the following new Lp-convergence result that we will prove by extending our earlier new spine approach:

Theorem 10.5 For each x ∈ R, and for each p ∈ (1, 2]:
• Zλ(t) → Zλ(∞) a.s. and in Lp(P^{x,y}) if pEλ − Epλ > 0 and P(A(y)^p) < ∞ for all y ∈ I.
• Zλ is unbounded in Lp(P^{x,y}), that is limt→∞ P^{x,y}(Zλ(t)^p) = ∞, if either pEλ − Epλ < 0 or P(A(y)^p) = ∞ for some y ∈ I.
Note, when λ ≤ 0, the inequality pEλ − Epλ > 0 is equivalent to cλ > cpλ and holds for some p ∈ (1, 2] if and only if λ ∈ (λ̃(θ), 0].

10.2 New Measures for the Typed BBM

As usual, we can define a measure Q̃λ via a Radon-Nikodym derivative with respect to P̃ by combining three simpler changes of measure that only affect behaviour along the spine.
First, we observe that for λ ∈ R,
$$v_\lambda(\eta_t)\,e^{\int_0^t MR(\eta_s)\,ds}\,e^{\lambda\xi_t-E_\lambda t}$$
is a P̃-martingale. This fact is easy to confirm with some classical 'one-particle' calculations, for example, using the Feynman-Kac formula, the generator (4) and noting the relation (34). We can obtain the Zλ martingale as in Sections 5 & 6 by using the P̃-martingale
$$\tilde\zeta_\lambda(t) := e^{-\int_0^t MR(\eta_s)\,ds}\prod_{u<\xi_t}(1+A_u)\times v_\lambda(\eta_t)\,e^{\int_0^t MR(\eta_s)\,ds}\,e^{\lambda\xi_t-E_\lambda t}.$$
That is, for each λ ∈ R we define a measure Q̃λ^{x,y} on (T̃, F̃∞) via
$$\frac{d\tilde Q^{x,y}_\lambda}{d\tilde P^{x,y}}\bigg|_{\tilde{\mathcal F}_t} := \frac{\tilde\zeta_\lambda(t)}{\tilde\zeta_\lambda(0)} = \prod_{v<\xi_t}(1+A_v)\,\frac{v_\lambda(\eta_t)}{v_\lambda(y)}\,e^{\lambda(\xi_t-x)-E_\lambda t},\qquad(38)$$
and then, ignoring information about the spine by defining Qλ^{x,y} := Q̃λ^{x,y}|F∞, we find that
$$\frac{dQ^{x,y}_\lambda}{dP^{x,y}}\bigg|_{\mathcal F_t} = \frac{Z_\lambda(t)}{Z_\lambda(0)} = v_\lambda(y)^{-1}\sum_{u\in N(t)} v_\lambda(Y_u(t))\,e^{\lambda(X_u(t)-x)-E_\lambda t}.\qquad(39)$$
We emphasise that, starting with the three simple 'spine' martingales, we have actually shown that Zλ must, in fact, be a martingale. This route offers a simple way of getting general 'additive' martingales for the branching process.

10.3 The Spine Process (ξt, ηt) Under Q̃λ

It remains to identify the behaviour of the spine under the change of measure. In the BBM model it was clear that the spine ξt received a drift under the measure Q̃λ, and something similar happens here:

Lemma 10.6 Under Q̃λ the spine process (ξt, ηt) has generator
$$\mathcal H_\lambda F(x,y) := \frac12 a(y)\frac{\partial^2 F}{\partial x^2} + a(y)\lambda\,\frac{\partial F}{\partial x} + \sum_{j\in I}\theta Q_\lambda(y,j)\,F(x,j),\qquad(40)$$
where Qλ is an honest Q-matrix:
$$\theta Q_\lambda(i,j) = \begin{cases} \theta Q(i,j)\,\dfrac{v_\lambda(j)}{v_\lambda(i)} & \text{if } i\ne j,\\[1ex] \theta Q(i,i) + \dfrac{\lambda^2}{2}a(i) + M(i)R(i) - E_\lambda & \text{if } i=j. \end{cases}$$
That is, under Q̃λ, ξt is a Brownian motion with instantaneous variance a(ηt) and instantaneous drift a(ηt)λ, and ηt is a Markov chain on I with Q-matrix θQλ and invariant measure πλ = vλ²π.
The form of the above generator Hλ can be obtained from the theory of Doob's h-transforms, due to the fact that on the algebra Gt the change of measure is given by
$$\frac{dQ^{x,y}_\lambda}{dP^{x,y}}\bigg|_{\mathcal G_t} = \frac{1}{v_\lambda(y)e^{\lambda x}}\,v_\lambda(\eta_t)\,e^{\int_0^t MR(\eta_s)\,ds}\,e^{\lambda\xi_t-E_\lambda t}.\qquad(41)$$
The long-term behaviour under Q̃λ of the spine diffusion ξt can now be retrieved from the generator (40) and the properties of Eλ stated in Corollary 10.3:

Corollary 10.7 Almost surely under Q̃λ^{x,y}, the long-term drift of the spine is given explicitly as
$$\lim_{t\to\infty} t^{-1}\xi_t = E'_\lambda \qquad\text{and hence}\qquad \xi_t + c_\lambda t \to \begin{cases} +\infty & \text{if } \lambda\in(\tilde\lambda(\theta),0],\\ -\infty & \text{if } \lambda<\tilde\lambda(\theta),\end{cases}\qquad(42)$$
whereas, if λ = λ̃, the process ξt + cλt will be recurrent on R under Q̃λ.

Proof: From the generator stated at (40) we can write
$$\xi_t = B\Big(\int_0^t a(\eta_s)\,ds\Big) + \lambda\int_0^t a(\eta_s)\,ds,$$
where B(t) is a Q̃λ-Brownian motion. Then by the ergodic theorem and the fact that πλ = vλ²π,
$$t^{-1}\xi_t \to \lambda\sum_{y\in I} a(y)\pi_\lambda(y) = \lambda\sum_{y\in I} a(y)v_\lambda^2(y)\pi(y) = \lambda\,\langle Av_\lambda, v_\lambda\rangle_\pi = E'_\lambda.$$
Direct calculation from (37) gives E′λ = −cλ − λc′λ, and therefore t^{-1}(ξt + cλt) → −λc′λ, whence whether we are to the left or right of the local minimum of cλ found at λ̃ determines the behaviour of ξt + cλt, as required. Lastly, when λ = λ̃, with the laws of the iterated logarithm in mind it is not difficult to see that both B(∫₀ᵗ a(ηs) ds) and ∫₀ᵗ (λa(ηs) + cλ) ds will fluctuate about the origin, hence ξt + cλt will be recurrent under Q̃λ. □

10.4 Construction of the Process under Q̃λ

Drawing together the elements from this section, we now present the pathwise construction of the new measure Q̃λ^{x,y}:
Theorem 10.8 Under Q̃λ^{x,y}, the process Xt evolves as follows:
• starting from (x, y), the spine (ξt, ηt) evolves as a Markov process with generator Hλ; that is, ηt evolves as a Markov chain on I with Q-matrix θQλ and ξt moves as a Brownian motion on R with variance coefficient a(ηt) and drift a(ηt)λ;
• whenever the type of the spine η is in state y ∈ I, the spine undergoes fission at an accelerated rate (1 + m(y))r(y), producing 1 + Ã(y) particles, where Ã(y) is independent of the spine's motion with size-biased distribution {(1 + k)pk(y)/(1 + m(y)) : k ≥ 0};
• with equal probability, one of the spine's offspring particles is selected to continue the path of the spine, repeating stochastically the behaviour of its parent;
• the other particles initiate, from their birth position, independent copies of P^{·,·} typed branching Brownian motions.

10.5 Proof of Theorem 10.4

The following proof is an extension of that given for BBM by Kyprianou [40]. The second part of the following theorem is the key element in using the measure change (38) to determine properties of the martingale Zλ:

Theorem 10.9 Suppose that P and Q are two probability measures on a space (Ω, F∞) with filtration (Ft)t≥0, such that for some positive martingale Zt,
$$\frac{dQ}{dP}\bigg|_{\mathcal F_t} = Z_t.$$
The limit Z∞ := lim sup_{t→∞} Zt therefore exists and is finite almost surely under P. Furthermore, for any F ∈ F∞,
$$Q(F) = \int_F Z_\infty\,dP + Q\big(F\cap\{Z_\infty=\infty\}\big),\qquad(43)$$
and consequently:
$$\text{(a)}\quad P(Z_\infty = 0) = 1 \iff Q(Z_\infty = \infty) = 1,\qquad(44)$$
$$\text{(b)}\quad P(Z_\infty) = 1 \iff Q(Z_\infty < \infty) = 1.\qquad(45)$$
A proof of the decomposition (43) can be found in Durrett [11], at page 241.
Suppose that λ ≤ λ̃ < 0. Ignoring all contributions except for the spine, it is immediate that
$$Z_\lambda(t) = \sum_{u\in N_t} v_\lambda(Y_u(t))\,e^{\lambda X_u(t)-E_\lambda t} \ge v_\lambda(\eta_t)\,e^{\lambda(\xi_t+c_\lambda t)}$$
where, from Corollary 10.7, under the measure Q̃λ the spine satisfies lim inf{ξt + cλt} = −∞ a.s. and vλ > 0, hence lim sup_{t→∞} Zλ(t) = ∞ almost surely under Q̃λ, yielding P(Zλ(∞) = 0) = 1.
Note that, for y ∈ I, P(A(y) log⁺A(y)) < ∞ ⟺ ∑_{k≥1} P(log⁺Ã(y) > ck) < ∞ for any c > 0, where we recall that Ã(y) has the size-biased distribution {(k + 1)pk(y)/(1 + m(y)) : k ≥ 0}. Then, for an IID sequence {Ãn(y)} of copies of Ã(y), Borel-Cantelli reveals that, almost surely,
$$\limsup_{n\to\infty}\, n^{-1}\log^+\tilde A_n(y) = \begin{cases} 0 & \text{if } P(A(y)\log^+A(y)) < \infty,\\ \infty & \text{if } P(A(y)\log^+A(y)) = \infty.\end{cases}\qquad(46)$$
Now suppose that λ ∈ (λ̃, 0] and P(A(y) log⁺A(y)) = ∞ for some y ∈ I (with r(y) > 0). Let Sk be the time of the kth fission along the spine producing Ãk(ηSk) additional particles; then
$$Z_\lambda(S_k) \ge \tilde A_k(\eta_{S_k})\,v_\lambda(\eta_{S_k})\,e^{\lambda(\xi_{S_k}+c_\lambda S_k)}$$
where (ξt + cλt)/t → −λc′λ > 0, ηt is ergodic so the event {ηSk = y} will occur for infinitely many k since r(y) > 0, and nt/t → ⟨Rvλ, vλ⟩π so Sk/k → ⟨Rvλ, vλ⟩π^{−1}; hence the super-exponential growth of the Ãk(y) from (46) gives lim sup_{t→∞} Zλ(t) = ∞, Q̃λ-almost surely, which then implies that P(Zλ(∞) = 0) = 1.
Finally, suppose that λ ∈ (λ̃, 0] and P(A(y) log⁺A(y)) < ∞ for all y ∈ I. Recall from (26):
$$\tilde Q_\lambda\big[Z_\lambda(t)\,\big|\,\tilde{\mathcal G}_\infty\big] = \sum_{k=1}^{n_t}\tilde A_k(\eta_{S_k})\,v_\lambda(\eta_{S_k})\,e^{\lambda(\xi_{S_k}+c_\lambda S_k)} + v_\lambda(\eta_t)\,e^{\lambda(\xi_t+c_\lambda t)}.\qquad(47)$$
In this case, the facts that (ξt + cλt)/t → −λc′λ > 0 and Sk/k → ⟨Rvλ, vλ⟩π^{−1}, together with the moment conditions and (46) implying that the Ãk(y)'s all have sub-exponential growth, mean that
$$\limsup_{t\to\infty}\,\tilde Q_\lambda\big[Z_\lambda(t)\,\big|\,\tilde{\mathcal G}_\infty\big] < \infty\qquad\tilde Q_\lambda\text{-a.s.}$$
Fatou's lemma then gives lim inf_{t→∞} Zλ(t) < ∞, Q̃λ-a.s., hence also Qλ-a.s. In addition, since Zλ(t)^{−1} is a positive Qλ-martingale (recall (39)) with an almost sure limit, this means that limt→∞ Zλ(t) < ∞, Qλ-a.s., and then (45) yields that P(Zλ(∞)) = 1, and so Zλ(t) converges almost surely and in L¹(P). □
Discussion of Rate of Convergence to Zero and Left-most Particle Speed.
Alternatively, when λ < λ̃ we can readily obtain the rate of convergence to zero with the following simple argument, adapted from Git et al. [20]. By Proposition 9.1,
$$Z_\lambda(t)^q \le \sum_{u\in N(t)} v_\lambda(Y_u(t))^q\,e^{q\lambda(X_u(t)+c_{q\lambda}t)}\,e^{q\lambda(c_\lambda-c_{q\lambda})t} \le K\,Z_{q\lambda}(t)\,e^{q\lambda(c_\lambda-c_{q\lambda})t}$$
where K := max_{y∈I} vλ(y)^q/vqλ(y) < ∞, since I is finite and vλ > 0. Recall that cλ has a minimum over λ ∈ (−∞, 0] at λ̃, with cλ̃ = −E′λ̃ = −λ̃⟨Avλ̃, vλ̃⟩π. Then, since Zqλ(t) is a convergent martingale, we can choose q such that qλ = λ̃, giving Zλ(t) decaying exponentially to zero at least at rate λ(cλ − cλ̃). Further, once we know that P and Qλ are equivalent for every λ ∈ (λ̃, 0], since the spine moves such that ξt/t → −cλ − λc′λ under Qλ, the left-most particle L(t) := inf_{u∈N(t)} Xu(t) must satisfy lim inf_t L(t)/t ≤ −cλ̃, P-a.s. On the other hand, the convergence of the Zλ P-martingales quickly gives the same upper bound on the fastest speed of any particle, leading to L(t)/t → −cλ̃, P-a.s. This result also reveals that the rate of exponential decay found above is actually best possible.

10.6 Proof of Theorem 10.5

Proof of Part 1: Suppose p ∈ (1, 2]; then with q := p − 1 a slight modification of the BBM proof arrives at
$$P^{x,y}\big[Z_\lambda(t)^p\big] = e^{\lambda x}v_\lambda(y)\,\tilde Q^{x,y}_\lambda\big[Z_\lambda(t)^q\big] \le e^{\lambda x}v_\lambda(y)\,\tilde Q^{x,y}_\lambda\Big[\sum_{k=1}^{n_t} A_k^q\,v_\lambda(\eta_{S_k})^q\,e^{q\lambda\xi_{S_k}-qE_\lambda S_k}\Big] + e^{\lambda x}v_\lambda(y)\,\tilde Q^{x,y}_\lambda\big[v_\lambda(\eta_t)^q\,e^{q\lambda\xi_t-qE_\lambda t}\big]$$
and the proof of Lp-boundedness will be complete once we show that this right-hand side is bounded in t.

The Spine Term. Since I is finite we note that ⟨vλ^p, vpλ⟩π < ∞. It is always useful to first focus on the spine term, since we can change the measure with (41) to get
$$\tilde Q^{x,y}_\lambda\big[v_\lambda(\eta_t)^q\,e^{q\lambda\xi_t-qE_\lambda t}\big] = \tilde P^{x,y}\Big[v_\lambda(\eta_t)^q\,e^{q\lambda\xi_t-qE_\lambda t}\times\frac{v_\lambda(\eta_t)\,e^{\int_0^t MR(\eta_s)\,ds}\,e^{\lambda\xi_t-E_\lambda t}}{v_\lambda(y)e^{\lambda x}}\Big] = e^{q\lambda x}\,\frac{v_{p\lambda}(y)}{v_\lambda(y)}\,g_t(y)\,e^{-(pE_\lambda-E_{p\lambda})t}\qquad(48)$$
where, for all y ∈ I,
$$g_t(y) := \tilde Q^{0,y}_{p\lambda}\Big[\frac{v_\lambda(\eta_t)^p}{v_{p\lambda}(\eta_t)}\Big] \to \langle v_\lambda^p, v_{p\lambda}\rangle_\pi$$
as t → ∞, and ⟨gt vpλ, vpλ⟩π = ⟨vλ^p, vpλ⟩π for all t ≥ 0, since ηt is a finite-state irreducible Markov chain under Q̃μ with invariant distribution πμ(y) = vμ(y)²π(y). It follows that in the long term the growth or decay of the spine term is determined by the sign of pEλ − Epλ.
The Sum Term. We now assume that pEλ − Epλ > 0. We know that, under Q̃λ and conditional on knowing η, the fission times {Sk : k ≥ 0} on the spine occur as a Poisson process of rate (1 + m(ηs))r(ηs), with the kth fission yielding an additional Ak offspring, each Ak being an independent copy of Ã(y), which has the size-biased distribution {(1 + k)pk(y)/(1 + m(y)) : k ≥ 0} where y = ηSk is the type at the time of fission. We also recall from Lemma 9.2 that
$$M_q(y) := \tilde Q_\lambda\big(\tilde A^q(y)\big) < \infty \iff P\big(A^p(y)\big) < \infty.$$
Therefore, if we condition on G̃t, which knows about (ξs, ηs) at all times 0 ≤ s ≤ t, we can transform the sum into an integral, use Fubini's theorem and the change of measure used in (48):
$$\tilde Q^{x,y}_\lambda\Big[\sum_{k=1}^{n_t} A_k^q\,v_\lambda(\eta_{S_k})^q\,e^{q\lambda\xi_{S_k}-qE_\lambda S_k}\Big] = \tilde Q^{x,y}_\lambda\Big[\tilde Q^{x,y}_\lambda\Big(\sum_{k=1}^{n_t} A_k^q\,v_\lambda(\eta_{S_k})^q\,e^{q\lambda\xi_{S_k}-qE_\lambda S_k}\,\Big|\,\tilde{\mathcal G}_t\Big)\Big]$$
$$= \tilde Q^{x,y}_\lambda\Big[\int_0^t (1+m(\eta_s))r(\eta_s)\,M_q(\eta_s)\,v_\lambda(\eta_s)^q\,e^{q\lambda\xi_s-qE_\lambda s}\,ds\Big]$$
$$= \int_0^t \tilde Q^{x,y}_\lambda\big[(1+m(\eta_s))r(\eta_s)\,M_q(\eta_s)\,v_\lambda(\eta_s)^q\,e^{q\lambda\xi_s-qE_\lambda s}\big]\,ds = e^{q\lambda x}\,\frac{v_{p\lambda}(y)}{v_\lambda(y)}\int_0^t h_s(y)\,e^{-(pE_\lambda-E_{p\lambda})s}\,ds = e^{q\lambda x}\,\frac{v_{p\lambda}(y)}{v_\lambda(y)}\times\frac{k_t(y)}{pE_\lambda-E_{p\lambda}}$$
where
$$h_s(y) := \tilde Q^{0,y}_{p\lambda}\Big[\tilde r(\eta_s)M_q(\eta_s)\,\frac{v_\lambda(\eta_s)^p}{v_{p\lambda}(\eta_s)}\Big],\qquad \tilde r(y) := (1+m(y))r(y),\qquad k_t(y) := E\big(h_U(y);\,U\le t\big)$$
with U an independent exponential of rate (pEλ − Epλ) > 0. Note that, for all y ∈ I, hs(y) → ⟨r̃Mq vλ^p, vpλ⟩π and kt(y) ↑ k∞(y) as t → ∞, where ⟨kt vpλ, vpλ⟩π = ⟨r̃Mq vλ^p, vpλ⟩π P(U ≤ t) ↑ ⟨k∞ vpλ, vpλ⟩π = ⟨r̃Mq vλ^p, vpλ⟩π. Then, since Mq(w) < ∞ ⟺ P(A(w)^p) < ∞ and I is finite, we are guaranteed that k∞(y) < ∞ for all y ∈ I as long as P(A(w)^p) < ∞ for all w ∈ I. Having dealt with both the spine term and the sum term, we have obtained the upper bound
$$P^{x,y}\big[Z_\lambda(t)^p\big] \le \frac{e^{p\lambda x}v_{p\lambda}(y)}{pE_\lambda-E_{p\lambda}}\Big(k_t(y)+g_t(y)\,(pE_\lambda-E_{p\lambda})\,e^{-(pE_\lambda-E_{p\lambda})t}\Big)$$
and since Zλ(t)^p is a P-submartingale, we find that
$$P^{x,y}\big[Z_\lambda(t)^p\big] \le \frac{e^{p\lambda x}v_{p\lambda}(y)\,k_\infty(y)}{pE_\lambda-E_{p\lambda}}\qquad(\forall t\ge 0)$$
and Zλ(t) will be bounded in Lp(P^{x,y}) if we have both pEλ − Epλ > 0 and P(A^p(w)) < ∞ for all w ∈ I.

Proof of Part 2: The earlier proof for BBM goes through with minor modification. Exactly as in the BBM case, looking only at the contribution of the spine shows that Zλ is unbounded in Lp(P^{x,y}) if pEλ − Epλ < 0. In addition, letting T be any fission time along the path of the spine,
$$Z_\lambda(T) \ge \big(1+\tilde A(\eta_T)\big)\,v_\lambda(\eta_T)\,e^{\lambda\xi_T-E_\lambda T}$$
where Ã(ηT) is the number of additional offspring produced at the time of fission. Then, with mq(y) := Q̃((1 + Ã(y))^q) < ∞ ⟺ P(A^p(y)) < ∞,
$$\tilde Q^{x,y}_\lambda\big[Z_\lambda(T)^q\big] \ge e^{q\lambda x}\,\tilde Q^{x,y}_\lambda\big[m_q(\eta_T)\,v_\lambda(\eta_T)^q\,e^{q\lambda\xi_T-qE_\lambda T}\big] = e^{q\lambda x}\,\tilde Q^{x,y}_{p\lambda}\Big[m_q(\eta_T)\,\frac{v_\lambda(\eta_T)^p}{v_{p\lambda}(\eta_T)}\,e^{-(pE_\lambda-E_{p\lambda})T}\Big]$$
and so Zλ will also be unbounded in Lp(P^{x,y}) if mq(y) = ∞ ⟺ P(A^p(y)) = ∞ for any y ∈ I (taking a fission time when also in state y). □

Remarks on Signed Martingales and Kesten-Stigum Type Theorems

In the multi-typed BBM, for each λ there will be other (signed) additive martingales corresponding to the different eigenvectors and eigenvalues obtained from solving (34); the Zλ martingale simply corresponds to the Perron-Frobenius, or ground-state, eigenvalue Eλ and (strictly positive) eigenvector vλ. Since |u + v|^q ≤ |u|^q + |v|^q for all u, v ∈ R, the above proof will also adapt to give convergence results for signed martingales. In fact, when there is a complete orthonormal set of eigenvectors, a Kesten-Stigum like theorem would then swiftly follow (for example, see Harris [30] in the context of the continuous-type model of the next section).
11 A Continuous-Typed Branching-Diffusion

The previous finite-type model was inspired by the model that we now turn to, originally laid out in Harris and Williams [28]. In this model the
type moves on the real line as an Ornstein-Uhlenbeck process associated with the generator
$$Q_\theta := \frac{\theta}{2}\Big(\frac{\partial^2}{\partial y^2} - y\,\frac{\partial}{\partial y}\Big),\qquad\text{with }\theta>0\text{ considered as the temperature},$$
which has the standard normal density as its invariant distribution:
$$\pi(y) := (2\pi)^{-\frac12}\,e^{-\frac12 y^2}.$$
for some fixed a > 0,
and fission of a particle of type y occurs at a rate R(y) := ry 2 + ρ,
where r, ρ > 0 are fixed,
to produce two particles at the same type-space location as the parent (we consider only binary splitting). The model has very different behaviour for low temperature values (i.e. low θ), but most studies have considered the high temperature regime where θ > 8r. Also, the parameter λ must be restricted to an interval (λmin , 0) in order for some of the model’s parameters to remain in R, where θ − 8r . λmin := − 4a Generally, unboundedness in a model’s rates is a serious obstacle to classical proofs since they often depend on the expectation semigroup of the branching process, and unbounded rates tend to lead to unbounded eigenfunctions. Here this is the case, but the existence of a spectral theory for their particular expectation operator allowed Harris and Williams to get a sufficiently good bound in particular for a non-linear term (see Theorem 5.1 of [28]), and therefore to prove Lp -convergence of the martingale. Other convergence results for various martingales and weighted sums over particles for this model also appear in Harris [30], again using more classical methods and requiring ‘nonlinear’ calculations. The spine approach we again adopt here is both simple and more generic in nature; requiring no such special ‘non-linear’ calculations, it elegantly produces very good estimates that only involve easy one-particle calculations. We use the same notation as previously, Xt = Xu (t), Yu (t) : u ∈ Nt to denote thepoint process of space-type locations in R×R, and suppose that the ˜ measures P˜ x,y : (x, y) ∈ R2 on the natural filtration with a spine (Ft )t≥0 are such that the initial ancestor starts at (x, y) and Xt , (ξt , ηt ) becomes the above-described branching diffusion with a spine.
324
R. Hardy and S.C. Harris
11.1 The Measure Change Although there are some significant differences, this model is similar in flavour to our finite-type model. There is a strictly-positive martingale Zλ defined as vλ (Yu (t))eλXu (t)−Eλ t Zλ (t) := u∈Nt
where vλ and Eλ are the eigenvector and eigenvalue associated with the selfadjoint (in L2 (π)) operator: 1 Qθ + λ2 A(y) + R(y). 2 The eigenfunction vλ is normalizable against the L2 (π) norm, and can be found explicitly as − 2 vλ (y) = eψλ y where
1 μλ 1 − , μλ := θ2 − θ(8r + 4aλ2 ), 4 2θ 2 are both positive for all λ ∈ (λmin , 0); another important parameter is ψλ+ := μλ 1 4 + 2θ . The eigenvalue Eλ is then given by ψλ− :=
Eλ = ρ + θψλ− . ˜ We again define the speed function cλ := −Eλ /λ, and λ(θ) < 0 is the unique point (on the negative axis) at which cλ hits its minimum c˜(θ) – further details are given in Harris and Williams [28]. We are going to use spines to ˜ and the necessary prove the following result, in which the critical case of λ = λ p conditions for L (P )-convergence are new results: Theorem 11.1 Suppose that λ ∈ (λmin , 0). 1. Let p ∈ (1, 2]. The martingale Zλ is Lp (P )-bounded if both pEλ − Epλ > 0 + ˜ and pψλ− < ψpλ . In particular, for all λ ∈ (λ(θ), 0], Zλ is a uniformlyintegrable martingale. + . 2. Zλ is unbounded in Lp (P ) if either pEλ − Epλ < 0 or pψλ− > ψpλ ˜ 3. Almost surely under P , Zλ (∞) = 0 if λ ≤ λ(θ). ˜ x,y on (T˜ , F˜∞ ) via Once again, for each λ ≤ 0 we define a measure Q λ x,y ˜ − dQ 1 λ := 2nt vλ (ηt )eλξt −Eλ t , λx x,y vλ (y)e dP˜ ˜t F ˜ λ |F we have so that with Qλ := Q ∞ dQx,y Zλ (t) Zλ (t) λ = = . dP x,y Ft Zλ (0) vλ (y)eλx ˜ λ: The facts are that under Q
(49)
A Spine Approach to Branching Diffusions
• the spine diffusion ξ_t has instantaneous drift $\lambda a\eta_t^2$;
• the type process η_t has generator $\frac{\theta}{2}\frac{\partial^2}{\partial y^2} - \mu_\lambda\, y\,\frac{\partial}{\partial y}$ and an invariant probability measure $\pi_\lambda := \langle v_\lambda, v_\lambda\rangle_\pi^{-1}\, v_\lambda^2\,\pi$, corresponding to a normal distribution $N\big(0, \frac{\theta}{2\mu_\lambda}\big)$;
• fission times on the spine occur at the accelerated rate 2R(η_t);
• all particles not in the spine behave as if under the original measure P.
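As a quick numerical sanity check on the quantities above (a sketch with illustrative parameter values of our own choosing, and assuming the reconstruction $v_\lambda(y) = e^{\psi_\lambda^- y^2}$ with π a standard normal): the tilted density $v_\lambda^2\,\pi$ is then Gaussian with variance exactly θ/(2μ_λ), matching the invariant law quoted for η, and λ_min is precisely the point where μ_λ ceases to be real.

```python
import math

# Illustrative parameter values of our own choosing, with theta > 8r and
# lambda in (lambda_min, 0); they are not taken from the paper.
theta, r, a = 10.0, 1.0, 1.0
lam_min = -math.sqrt((theta - 8 * r) / (4 * a))
lam = 0.5 * lam_min

mu = 0.5 * math.sqrt(theta**2 - theta * (8 * r + 4 * a * lam**2))
psi_minus = 0.25 - mu / (2 * theta)
psi_plus = 0.25 + mu / (2 * theta)

# With v_lambda(y) = exp(psi_minus * y^2) and pi = N(0, 1), the density
# v_lambda(y)^2 * exp(-y^2/2) is Gaussian with
# 1/(2 sigma^2) = 1/2 - 2 psi_minus = mu/theta, i.e. sigma^2 = theta/(2 mu),
# matching the invariant law N(0, theta/(2 mu_lambda)) quoted above.
sigma2 = 1.0 / (2 * (0.5 - 2 * psi_minus))
print(lam_min < lam < 0, mu > 0, psi_minus > 0, abs(sigma2 - theta / (2 * mu)) < 1e-12)
```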
We briefly comment that, along similar lines as discussed for the finite-typed BBM case, we could now give a straightforward spine proof that the asymptotic right-most particle speed in this continuous-typed BBM model is almost surely $\tilde c(\theta)$.

11.2 Proof of Theorem 11.1

Proof of Part 1: Suppose p ∈ (1, 2]. Then using the spine decomposition with Jensen's inequality and Proposition 9.1 we find
$$P^{x,y}\big(Z_\lambda(t)^p\big) \le e^{\lambda x}v_\lambda(y)\,\tilde Q_\lambda^{x,y}\Big[\sum_{u<\xi_t} v_\lambda(\eta_{S_u})^q\, e^{q\lambda\xi_{S_u} - qE_\lambda S_u}\Big] + e^{\lambda x}v_\lambda(y)\,\tilde Q_\lambda^{x,y}\big[v_\lambda(\eta_t)^q\, e^{q\lambda\xi_t - qE_\lambda t}\big].$$
Assume that $pE_\lambda - E_{p\lambda} > 0$ and $p\psi_\lambda^- < \psi_{p\lambda}^+$. As seen in Harris and Williams [28], we can do many calculations explicitly in this model, largely due to the fact that under $\tilde Q_{p\lambda}^{0,y}$
$$\eta_s \sim N\Big(e^{-\mu_{p\lambda}s}\,y,\; \frac{\theta\big(1 - e^{-2\mu_{p\lambda}s}\big)}{2\mu_{p\lambda}}\Big) \to N\Big(0, \frac{\theta}{2\mu_{p\lambda}}\Big)$$
and the eigenfunctions v_λ have such a simple exponential form. For example,
$$\tilde Q_{p\lambda}^{0,y}\Big[\frac{v_\lambda^p}{v_{p\lambda}}(\eta_s)\Big] = \tilde Q_{p\lambda}^{0,y}\Big[e^{(p\psi_\lambda^- - \psi_{p\lambda}^-)\eta_s^2}\Big] \tag{50}$$
can easily be seen to be finite and bounded for all s ≥ 0 if and only if $p\psi_\lambda^- - \psi_{p\lambda}^- - \frac{\mu_{p\lambda}}{\theta} = p\psi_\lambda^- - \psi_{p\lambda}^+ < 0$, and just as readily calculated explicitly.

In fact, more 'natural' conditions for Lp-convergence of the martingales would be that
$$\big\langle R M_q\, v_\lambda^p,\, v_{p\lambda}\big\rangle_\pi < \infty, \qquad \big\langle v_\lambda^p,\, v_{p\lambda}\big\rangle_\pi < \infty, \qquad\text{and}\qquad pE_\lambda - E_{p\lambda} > 0,$$
where $M_q(y) := \tilde Q\big(\tilde A(y)^q\big)$ with $\tilde A$ the size-biased offspring distribution (here, binary splitting means $\tilde A(y) \equiv 1$), and we present arguments below that are more generic in nature, at least in terms of adapting to other 'suitably' ergodic type motions and random family sizes. Note, the last condition above is related
to the natural convexity of E_λ and, in our specific model, both integrability conditions are guaranteed by $p\psi_\lambda^- - \psi_{p\lambda}^+ < 0$.

The Spine Term. On the algebra $\mathcal G_t$ the change of measure takes the form
$$\left.\frac{d\tilde Q_\lambda^{x,y}}{d\tilde P^{x,y}}\right|_{\mathcal G_t} = \frac{v_\lambda(\eta_t)}{v_\lambda(y)}\,\exp\Big(\int_0^t R(\eta_s)\,ds + \lambda(\xi_t - x) - E_\lambda t\Big),$$
which we can use on the spine term to arrive at
$$f_t(x,y) := e^{\lambda x}v_\lambda(y)\,\tilde Q_\lambda^{x,y}\big[v_\lambda(\eta_t)^q e^{q\lambda\xi_t - qE_\lambda t}\big] = e^{p\lambda x}v_{p\lambda}(y)\, g_t(y)\, e^{-(pE_\lambda - E_{p\lambda})t} \tag{51}$$
with $g_t(y) := \tilde Q_{p\lambda}^{0,y}\big[\big(v_\lambda^p/v_{p\lambda}\big)(\eta_t)\big]$. Under the assumption that $p\psi_\lambda^- < \psi_{p\lambda}^+$, it is easy to check that $\langle v_\lambda^p, v_{p\lambda}\rangle_\pi < \infty$, that is $v_\lambda^p/v_{p\lambda} \in L^1(\pi_{p\lambda})$, from which it follows that $g_t \in L^1(\pi_{p\lambda})$ for all t ≥ 0. Since η has equilibrium $\pi_{p\lambda}$ under $\tilde Q_{p\lambda}$, we find $\langle g_t v_{p\lambda}, v_{p\lambda}\rangle_\pi = \langle v_\lambda^p, v_{p\lambda}\rangle_\pi < \infty$ and $g_t(y) \to \langle v_\lambda^p, v_{p\lambda}\rangle_\pi\,\langle v_{p\lambda}, v_{p\lambda}\rangle_\pi^{-1} < \infty$ as t → ∞ for all y ∈ R.

We also note that since $g_t \in L^1(\pi_{p\lambda})$, we have $f_t \in L^1(\tilde\pi_{p\lambda})$, where $\tilde\pi_\mu := \langle 1, v_\mu\rangle_\pi^{-1}\, v_\mu\,\pi$, and then
$$\int_{y\in\mathbb R} \tilde\pi_{p\lambda}(y)\, f_t(x,y)\,dy = e^{p\lambda x}\,\frac{\langle v_\lambda^p, v_{p\lambda}\rangle_\pi}{\langle 1, v_{p\lambda}\rangle_\pi}\, e^{-(pE_\lambda - E_{p\lambda})t}.$$
The Sum Term. Note that under the parameter assumptions we have $\langle R v_\lambda^p, v_{p\lambda}\rangle_\pi < \infty$. As for the finite-type model, the fission times S_u on the spine occur as a Cox process and therefore
$$g_t(x,y) := e^{\lambda x}v_\lambda(y)\,\tilde Q_\lambda^{x,y}\Big[\sum_{u<\xi_t} v_\lambda(\eta_{S_u})^q\, e^{q\lambda\xi_{S_u} - qE_\lambda S_u}\Big] = e^{\lambda x}v_\lambda(y)\int_0^t \tilde Q_\lambda^{x,y}\big[2R(\eta_s)\, v_\lambda(\eta_s)^q\, e^{q\lambda\xi_s - qE_\lambda s}\big]\,ds$$
$$= e^{p\lambda x}v_{p\lambda}(y)\int_0^t \tilde Q_{p\lambda}^{0,y}\Big[2R(\eta_s)\,\frac{v_\lambda^p}{v_{p\lambda}}(\eta_s)\Big]\, e^{-(pE_\lambda - E_{p\lambda})s}\,ds = e^{p\lambda x}\, v_{p\lambda}(y)\, k_t(y),$$
where
$$k_t(y) := \int_0^t h_s(y)\, e^{-(pE_\lambda - E_{p\lambda})s}\,ds, \qquad h_s(y) := \tilde Q_{p\lambda}^{0,y}\Big[2R(\eta_s)\,\frac{v_\lambda^p}{v_{p\lambda}}(\eta_s)\Big],$$
and $h_t, k_t \in L^1(\pi_{p\lambda})$. Note, $k_t(y) \uparrow k_\infty(y) \in L^1(\pi_{p\lambda})$ as t → ∞, where
$$\langle k_t v_{p\lambda}, v_{p\lambda}\rangle_\pi = \big\langle 2R v_\lambda^p, v_{p\lambda}\big\rangle_\pi\,\frac{1 - e^{-(pE_\lambda - E_{p\lambda})t}}{pE_\lambda - E_{p\lambda}} \;\uparrow\; \frac{\big\langle 2R v_\lambda^p, v_{p\lambda}\big\rangle_\pi}{pE_\lambda - E_{p\lambda}} = \langle k_\infty v_{p\lambda}, v_{p\lambda}\rangle_\pi < \infty.$$
Note, $k_t \in L^1(\pi_{p\lambda})$ implies $g_t \in L^1(\tilde\pi_{p\lambda})$, with an explicit calculation again possible. Bringing together the results for the sum and spine terms, we have the upper bound
$$P^{x,y}\big(Z_\lambda(t)^p\big) \le e^{p\lambda x}v_{p\lambda}(y)\Big(g_t(y)\, e^{-(pE_\lambda - E_{p\lambda})t} + k_t(y)\Big) \in L^1(\tilde\pi_{p\lambda}), \tag{52}$$
and hence the submartingale property reveals that
$$P^{x,y}\big(Z_\lambda(t)^p\big) \le e^{p\lambda x}v_{p\lambda}(y)\, k_\infty(y) < \infty \qquad\text{for all } t \ge 0 \text{ and all } y \in \mathbb R.$$

Proof of Part 2: We need only dominate the martingale by the spine term at time t, yielding
$$\tilde Q_\lambda^{x,y}\big(Z_\lambda(t)^q\big) \ge \tilde Q_\lambda^{x,y}\big(v_\lambda(\eta_t)^q\, e^{q\lambda\xi_t - qE_\lambda t}\big) = e^{q\lambda x}\,\frac{v_{p\lambda}(y)}{v_\lambda(y)}\,\tilde Q_{p\lambda}^{0,y}\Big[\frac{v_\lambda^p}{v_{p\lambda}}(\eta_t)\Big]\, e^{-(pE_\lambda - E_{p\lambda})t}.$$
Hence Z_λ is unbounded in Lp(P^x) if either $pE_\lambda - E_{p\lambda} < 0$ or $\langle v_\lambda^p, v_{p\lambda}\rangle_\pi = \infty$.

Proof of Part 3: The proof that we have seen in the finite-type model will work here with little change: under $\tilde Q_\lambda$ the spatial motion is
$$\xi_t = B\Big(\int_0^t a(\eta_s)\,ds\Big) + \lambda\int_0^t a(\eta_s)\,ds,$$
and the type process η_s has invariant distribution $N\big(0, \frac{\theta}{2\mu_\lambda}\big)$, whence $t^{-1}\xi_t \to \lambda a\theta/(2\mu_\lambda) = E_\lambda'$. Therefore it follows that under $\tilde Q_\lambda$ the diffusion ξ_t + c_λ t drifts off to −∞ if $\lambda < \tilde\lambda(\theta)$. When $\lambda = \tilde\lambda$, it is also simple to check that ξ_t + c_λ t is recurrent, so has lim inf(ξ_t + c_λ t) = −∞. Whence, in either case, bounding Z_λ below by the spine's contribution as done before, we have $Z_\lambda(t) \ge v_\lambda(\eta_t)\, e^{\lambda(\xi_t + c_\lambda t)}$ and, since v_λ > 0 and η_t is recurrent, we see that $\limsup_{t\to\infty} Z_\lambda(t) = \infty$ almost surely under $\tilde Q_\lambda^{x,y}$.

Acknowledgement. We would like to thank the referees for their careful reading of the original articles and for their helpful suggestions.
References

1. S. Asmussen and H. Hering, Strong limit theorems for general supercritical branching processes with applications to branching diffusions, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 36 (1976), no. 3, 195–212.
2. K. B. Athreya, Change of measures for Markov chains and the L log L theorem for branching processes, Bernoulli 6 (2000), no. 2, 323–338.
3. J. Bertoin, Random fragmentation and coagulation processes, Cambridge University Press, 2006.
4. J. D. Biggins and A. E. Kyprianou, Measure change in multitype branching, Adv. in Appl. Probab. 36 (2004), no. 2, 544–581.
5. A. Champneys, S. Harris, J. Toland, J. Warren, and D. Williams, Algebra, analysis and probability for a coupled system of reaction-diffusion equations, Philosophical Transactions of the Royal Society of London 350 (1995), 69–112.
6. B. Chauvin, Arbres et processus de Bellman-Harris, Ann. Inst. H. Poincaré Probab. Statist. 22 (1986), no. 2, 209–232.
7. B. Chauvin and A. Rouault, Boltzmann-Gibbs weights in the branching random walk, Classical and modern branching processes (Minneapolis, MN, 1994), IMA Vol. Math. Appl., vol. 84, Springer, New York, 1997, pp. 41–50.
8. B. Chauvin, Product martingales and stopping lines for branching Brownian motion, Ann. Probab. 19 (1991), no. 3, 1195–1205.
9. B. Chauvin and A. Rouault, KPP equation and supercritical branching Brownian motion in the subcritical speed area. Application to spatial trees, Probab. Theory Related Fields 80 (1988), no. 2, 299–314.
10. B. Chauvin, A. Rouault, and A. Wakolbinger, Growing conditioned trees, Stochastic Process. Appl. 39 (1991), no. 1, 117–130.
11. R. Durrett, Probability: Theory and examples, 2nd ed., Duxbury Press, 1996.
12. J. Engländer and A. E. Kyprianou, Local extinction versus local exponential growth for spatial branching processes, Ann. Probab. 32 (2004), no. 1A, 78–99.
13. J. Engländer, Branching diffusions, superdiffusions and random media, Probab. Surveys 4 (2007), 303–364.
14. J. Engländer, S. C. Harris, and A. E. Kyprianou, Laws of large numbers for spatial branching processes, Ann. Inst. H. Poincaré Probab. Statist. (2009), to appear.
15. J. Geiger, Size-biased and conditioned random splitting trees, Stochastic Process. Appl. 65 (1996), no. 2, 187–207.
16. J. Geiger, Elementary new proofs of classical limit theorems for Galton-Watson processes, J. Appl. Probab. 36 (1999), no. 2, 301–309.
17. J. Geiger, Poisson point process limits in size-biased Galton-Watson trees, Electron. J. Probab. 5 (2000), no. 17, 12 pp. (electronic).
18. J. Geiger and L. Kauffmann, The shape of large Galton-Watson trees with possibly infinite variance, Random Structures Algorithms 25 (2004), no. 3, 311–335.
19. H. O. Georgii and E. Baake, Supercritical multitype branching processes: the ancestral types of typical individuals, Adv. in Appl. Probab. 35 (2003), no. 4, 1090–1110.
20. Y. Git, J. W. Harris, and S. C. Harris, Exponential growth rates in a typed branching diffusion, Ann. Appl. Probab. 17 (2007), no. 2, 609–653. doi:10.1214/105051606000000853
21. R. Hardy, Branching diffusions, Ph.D. thesis, University of Bath, Department of Mathematical Sciences, 2004.
22. R. Hardy and S. C. Harris, Some path large deviation results for a branching diffusion, (2007), submitted.
23. R. Hardy and S. C. Harris, A conceptual approach to a path result for branching Brownian motion, Stochastic Process. Appl. 116 (2006), no. 12, 1992–2013. doi:10.1016/j.spa.2006.05.010
24. R. Hardy and S. C. Harris, A new formulation of the spine approach for branching diffusions, (2006). arXiv:math.PR/0611054
25. R. Hardy and S. C. Harris, Spine proofs for Lp-convergence of branching-diffusion martingales, (2006). arXiv:math.PR/0611056
26. J. W. Harris, S. C. Harris, and A. E. Kyprianou, Further probabilistic analysis of the Fisher-Kolmogorov-Petrovskii-Piscounov equation: one-sided travelling waves, Ann. Inst. H. Poincaré Probab. Statist. 42 (2006), no. 1, 125–145.
27. J. W. Harris and S. C. Harris, Branching Brownian motion with an inhomogeneous breeding potential, Ann. Inst. H. Poincaré Probab. Statist. (2008), to appear.
28. S. C. Harris and D. Williams, Large deviations and martingales for a typed branching diffusion. I, Astérisque (1996), no. 236, 133–154, Hommage à P. A. Meyer et J. Neveu.
29. S. C. Harris, Travelling-waves for the FKPP equation via probabilistic arguments, Proc. Roy. Soc. Edinburgh Sect. A 129 (1999), no. 3, 503–517.
30. S. C. Harris, Convergence of a "Gibbs-Boltzmann" random measure for a typed branching diffusion, Séminaire de Probabilités, XXXIV, Lecture Notes in Math., vol. 1729, Springer, Berlin, 2000, pp. 239–256.
31. S. C. Harris, R. Knobloch, and A. E. Kyprianou, Strong law of large numbers for fragmentation processes, (2008). arXiv:0809.2958v1, submitted.
32. T. E. Harris, The theory of branching processes, Dover ed., Dover, 1989.
33. Y. Hu and Z. Shi, Minimal position and critical martingale convergence in branching random walks, and directed polymers on disordered trees, Ann. Appl. Probab. (2008), to appear.
34. A. M. Iksanov, Elementary fixed points of the BRW smoothing transforms with infinite number of summands, Stochastic Process. Appl. 114 (2004), no. 1, 27–50.
35. O. Kallenberg, Foundations of modern probability, Springer-Verlag, 2002.
36. H. Kesten and B. P. Stigum, Additional limit theorems for indecomposable multidimensional Galton-Watson processes, Ann. Math. Stat. 37 (1966), 1463–1481.
37. H. Kesten and B. P. Stigum, A limit theorem for multidimensional Galton-Watson processes, Ann. Math. Stat. 37 (1966), 1211–1223.
38. H. Kesten and B. P. Stigum, Limit theorems for decomposable multi-dimensional Galton-Watson processes, J. Math. Anal. Appl. 17 (1967), 309–338.
39. E. Kreyszig, Introductory functional analysis with applications, John Wiley and Sons, 1989.
40. A. E. Kyprianou, Travelling wave solutions to the K-P-P equation: alternatives to Simon Harris' probabilistic analysis, Ann. Inst. H. Poincaré Probab. Statist. 40 (2004), no. 1, 53–72.
41. T. Kurtz, R. Lyons, R. Pemantle, and Y. Peres, A conceptual proof of the Kesten-Stigum theorem for multi-type branching processes, Classical and modern
branching processes (Minneapolis, MN, 1994), IMA Vol. Math. Appl., vol. 84, Springer, New York, 1997, pp. 181–185.
42. Q. Liu and A. Rouault, On two measures defined on the boundary of a branching tree, Classical and modern branching processes (Minneapolis, MN, 1994), IMA Vol. Math. Appl., vol. 84, Springer, New York, 1997, pp. 187–201.
43. R. Lyons, A simple path to Biggins' martingale convergence for branching random walk, Classical and modern branching processes (Minneapolis, MN, 1994), IMA Vol. Math. Appl., vol. 84, Springer, New York, 1997, pp. 217–221.
44. R. Lyons, R. Pemantle, and Y. Peres, Conceptual proofs of L log L criteria for mean behavior of branching processes, Ann. Probab. 23 (1995), no. 3, 1125–1138.
45. J. Neveu, Arbres et processus de Galton-Watson, Ann. Inst. H. Poincaré Probab. Statist. 22 (1986), no. 2, 199–207.
46. J. Neveu, Multiplicative martingales for spatial branching processes, Seminar on Stochastic Processes (E. Çinlar, K. L. Chung, and R. K. Getoor, eds.), Birkhäuser, 1987, pp. 223–241.
47. P. Olofsson, The x log x condition for general branching processes, J. Appl. Probab. 35 (1998), no. 3, 537–544.
48. E. Seneta, Non-negative matrices and Markov chains, Springer-Verlag, 1981.
49. E. C. Waymire and S. C. Williams, A general decomposition theory for random cascades, Bull. Amer. Math. Soc. (N.S.) 31 (1994), no. 2, 216–222.
Penalisation of the Standard Random Walk by a Function of the One-sided Maximum, of the Local Time, or of the Duration of the Excursions

Pierre Debs

Institut Élie Cartan Nancy
B.P. 239, 54506 Vandœuvre-lès-Nancy Cedex, France
E-mail: [email protected]

Summary. Call (Ω, F_∞, P, X, F) the canonical space for the standard random walk on Z. Thus, Ω denotes the set of paths φ : N → Z such that |φ(n+1) − φ(n)| = 1; X = (X_n, n ≥ 0) is the canonical coordinate process on Ω; F = (F_n, n ≥ 0) is the natural filtration of X; F_∞ is the σ-field $\bigvee_{n\ge0}\mathcal F_n$; and P_0 is the probability on (Ω, F_∞) such that under P_0, X is the standard random walk started from 0, i.e., P_0(X_{n+1} = j | X_n = i) = 1/2 when |j − i| = 1. Let G : N × Ω → R⁺ be a positive, adapted functional. For several types of functionals G, we show the existence of a positive F-martingale (M_n, n ≥ 0) such that, for all n and all Λ_n ∈ F_n,
$$\frac{E_0[\mathbf 1_{\Lambda_n} G_p]}{E_0[G_p]} \longrightarrow E_0[\mathbf 1_{\Lambda_n} M_n] \qquad\text{when } p \to \infty.$$
Thus, there exists a probability Q on (Ω, F_∞) such that Q(Λ_n) = E_0[1_{Λ_n} M_n] for all Λ_n ∈ F_n. We describe the behavior of the process (Ω, X, F) under Q. The three sections of the article deal respectively with the three situations where G is a function:
• of the one-sided maximum;
• of the sign of X and of the time spent at zero;
• of the length of the excursions of X.
1 Introduction

Let $\big(\Omega, (X_t, \mathcal F_t)_{t\ge0}, \mathcal F_\infty, P_x\big)$ be the canonical one-dimensional Brownian motion. For several types of positive functionals Γ : R⁺ × Ω → R⁺, B. Roynette, P. Vallois and M. Yor show in [RVY06] that, for fixed s and for all Λ_s ∈ F_s,
$$\lim_{t\to\infty}\frac{E_x[\mathbf 1_{\Lambda_s}\Gamma_t]}{E_x[\Gamma_t]}$$
C. Donati-Martin et al. (eds.), S´ eminaire de Probabilit´ es XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6 12, c Springer-Verlag Berlin Heidelberg 2009
331
exists and has the form E_x[1_{Λ_s} M_s^x], where (M_s^x, s ≥ 0) is a positive martingale. This enables them to define a probability Q_x on (Ω, F_∞) by:
$$\forall \Lambda_s \in \mathcal F_s, \qquad Q_x(\Lambda_s) = E_x[\mathbf 1_{\Lambda_s} M_s^x];$$
moreover, they precisely describe the behavior of the canonical process X under Q_x. This they do for numerous functionals Γ, for instance a function of the one-sided maximum, or of the local time, or of the age of the current excursion (cf. [RVY06], [RVY]).

Our purpose is to study a discrete analogue of their results. More precisely, let Ω denote the set of all functions φ from N to Z such that |φ(n+1) − φ(n)| = 1, let X = (X_n, n ≥ 0) be the coordinate process on that space, F = (F_n, n ≥ 0) the canonical filtration, F_∞ the σ-field $\bigvee_{n\ge0}\mathcal F_n$, and P_x (x ∈ N) the family of probabilities on (Ω, F_∞) such that under P_x, X is the standard random walk started at x. For notational simplicity, we often write P for P_0. Our aim is to establish that, for several types of positive, adapted functionals G : N × Ω → N,

i) for each n ≥ 0 and each Λ_n ∈ F_n, the ratio
$$\frac{E_0[\mathbf 1_{\Lambda_n} G_p]}{E_0[G_p]}$$
tends to a limit when p tends to infinity;
ii) this limit is equal to E_0[1_{Λ_n} M_n], for some F-martingale M such that M_0 = 1.

Call Q(Λ_n) this limit. Assuming i) and ii), Q is a probability on each σ-field F_n; it extends in a unique way to a probability (still called Q) on the σ-field F_∞. This can be seen either by applying Kolmogorov's theorem on projective limits (knowing Q on the F_n amounts to knowing the finite marginal laws of the process X), or directly, since every finitely additive probability on the Boolean algebra $\mathcal A = \bigcup_n \mathcal F_n$ extends to a σ-additive probability on F_∞ (a Cantorian diagonal argument shows that every decreasing sequence (A_k) in A with limit $\bigcap_k A_k = \emptyset$ is stationary; hence every finitely additive probability on A is σ-additive on A). In short, Q is the unique probability on (Ω, F_∞) such that
$$\forall n \in \mathbb N, \quad \forall \Lambda_n \in \mathcal F_n, \qquad Q(\Lambda_n) = E_0[\mathbf 1_{\Lambda_n} M_n].$$
We will also study the process X under Q.

1) In the first section, G is a function of the one-sided maximum, i.e., G_p = ϕ(S_p), where S_p = sup{X_k, k ≤ p} and where ϕ is a function from N to R⁺ satisfying
$$\sum_{k=0}^{\infty}\varphi(k) = 1.$$
Penalisation of the Random Walk on Z
333
We will also need the function Φ : N → R⁺ given by
$$\Phi(k) := \sum_{j=k}^{\infty}\varphi(j).$$
The results of Section 1 are summarized in the following statement:

Theorem 1. 1. a) For each n ≥ 0 and each Λ_n ∈ F_n, one has
$$\lim_{p\to\infty}\frac{E[\mathbf 1_{\Lambda_n}\varphi(S_p)]}{E[\varphi(S_p)]} = E[\mathbf 1_{\Lambda_n} M_n^\varphi], \qquad\text{where } M_n^\varphi := \varphi(S_n)(S_n - X_n) + \Phi(S_n).$$
b) (M_n^φ, n ≥ 0) is a positive martingale, with M_0^φ = 1, not uniformly integrable; in fact, M_n^φ tends a.s. to 0 when n → ∞.
2. Call Q^φ the probability on (Ω, F_∞) characterized by
$$\forall n \in \mathbb N,\ \Lambda_n \in \mathcal F_n, \qquad Q^\varphi(\Lambda_n) = E[\mathbf 1_{\Lambda_n} M_n^\varphi].$$
Then
a) S_∞ is finite Q^φ-a.s. and satisfies, for every k ∈ N:
$$Q^\varphi(S_\infty = k) = \varphi(k). \tag{1}$$
b) Under Q^φ, the r.v. T_∞ := inf{n ≥ 0, X_n = S_∞} (which is not a stopping time in general) is a.s. finite and
i. (X_{n∧T_∞}, n ≥ 0) and (S_∞ − X_{T_∞+n}, n ≥ 0) are two independent processes;
ii. conditional on the r.v. S_∞, the process (X_{n∧T_∞}, n ≥ 0) is a standard random walk stopped when it first hits the level S_∞;
iii. (S_∞ − X_{T_∞+n}, n ≥ 0) is a 3-Bessel walk started from 0.
3. Put R_n = 2S_n − X_n. Under Q^φ, (R_n, n ≥ 0) is a 3-Bessel walk independent of S_∞.

The proofs of the second and third parts of this theorem rest largely upon a theorem due to Pitman (cf. [Pit75]) and on the study of the large-p asymptotics of P(Λ_n | S_p = k) for Λ_n ∈ F_n.
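Since $M_n^\varphi$ is a function of the pair (X_n, S_n) only (an up-step moves (x, s) to (x+1, s∨(x+1)), a down-step to (x−1, s)), its martingale property can be checked mechanically. The sketch below verifies the one-step identity exactly, for our own sample choice of weights ϕ(k) = 2^−(k+1):

```python
from fractions import Fraction as F

phi = lambda k: F(1, 2**(k + 1))   # sample weights: phi(k) = 2^-(k+1), summing to 1
Phi = lambda k: F(1, 2**k)         # Phi(k) = sum_{j >= k} phi(j) = 2^-k

def M(x, s):                       # M_n^phi as a function of (X_n, S_n)
    return phi(s) * (s - x) + Phi(s)

# one walk step: X -> X +/- 1 with probability 1/2; S -> max(S, X + 1) on an up-step
for s in range(0, 12):
    for x in range(-10, s + 1):    # reachable states satisfy X_n <= S_n
        avg = F(1, 2) * M(x + 1, max(s, x + 1)) + F(1, 2) * M(x - 1, s)
        assert avg == M(x, s)      # exact one-step martingale identity
print("martingale identity verified; M(0,0) =", M(0, 0))
```

Note that the only non-trivial case is x = s, where the identity uses Φ(s+1) − Φ(s) = −ϕ(s), exactly as in the proof given later in the article.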
We must now explain the precise meaning of the '3-Bessel walk' mentioned in the theorem and further in this article. In fact, two processes, which we call the 3-Bessel walk and the 3-Bessel* walk, will play a role in this work; they are identical up to a one-step space shift.

The 3-Bessel walk is the Markov chain (R_n, n ≥ 0), with values in N = {0, 1, 2, ...}, whose transition probabilities from x ≥ 0 are given by
$$\pi(x, x+1) = \frac{x+2}{2x+2}; \qquad \pi(x, x-1) = \frac{x}{2x+2}. \tag{2}$$
The 3-Bessel* walk is the Markov chain (R*_n, n ≥ 0), valued in N* = {1, 2, ...}, such that R* − 1 is a 3-Bessel walk. So its transition probabilities from x ≥ 1 are
$$\pi^*(x, x+1) = \frac{x+1}{2x}; \qquad \pi^*(x, x-1) = \frac{x-1}{2x}.$$
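Both kernels are simple rational formulas, so their basic properties — each row summing to 1, reflection at 0 (π(0, 1) = 1), and the one-step shift relating π* to π — can be verified exactly; a short sketch:

```python
from fractions import Fraction as F

def pi(x, y):        # transition kernel (2) of the 3-Bessel walk on N
    if y == x + 1:
        return F(x + 2, 2 * x + 2)
    if y == x - 1:
        return F(x, 2 * x + 2)
    return F(0)

def pi_star(x, y):   # kernel of the 3-Bessel* walk on N* = {1, 2, ...}
    if y == x + 1:
        return F(x + 1, 2 * x)
    if y == x - 1:
        return F(x - 1, 2 * x)
    return F(0)

for x in range(0, 30):
    assert pi(x, x + 1) + pi(x, x - 1) == 1     # rows sum to 1; in particular pi(0,1) = 1
for x in range(1, 30):
    # R* - 1 is a 3-Bessel walk: shifting the state by one maps pi* onto pi
    assert pi_star(x, x + 1) == pi(x - 1, x)
    assert pi_star(x, x - 1) == pi(x - 1, x - 2)
print("kernels consistent")
```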
2) In the second section, the functional G_p will be a function of the local time at 0 of the random walk. The local time is the process (L_n, n ≥ 0) such that L_n is the number of times that X was null strictly before time n. In other words,
$$L_n = \sum_{m<n}\mathbf 1_{X_m=0}.$$
Observe that L_n is also the sum of the number of up-crossings from 0 to 1 and of the number of down-crossings from 0 to −1, up to time n. Given two functions h⁺ and h⁻ from N* to R⁺ such that
$$\frac12\sum_{k=1}^{\infty}\big(h^+(k) + h^-(k)\big) = 1,$$
we consider the penalisation functional
$$G_p := h^+(L_p)\,\mathbf 1_{X_p>0} + h^-(L_p)\,\mathbf 1_{X_p<0}.$$
Putting
$$\Theta(x) := \frac12\sum_{k=x+1}^{\infty}\big(h^+(k) + h^-(k)\big),$$
we obtain the following penalisation theorem.

Theorem 2. 1. a) For each n ≥ 0 and each Λ_n ∈ F_n, one has
$$\lim_{p\to\infty}\frac{E[\mathbf 1_{\Lambda_n} G_p]}{E[G_p]} = E[\mathbf 1_{\Lambda_n} M_n^{h^+,h^-}], \qquad\text{where } M_n^{h^+,h^-} := X_n^+\, h^+(L_n) + X_n^-\, h^-(L_n) + \Theta(L_n). \tag{3}$$
b) $M_n^{h^+,h^-}$ is a positive, non uniformly integrable martingale; indeed, it tends to 0 when n tends to infinity.
2. Call $Q^{h^+,h^-}$ the probability on F_∞ whose restriction to F_n is given by
$$\forall \Lambda_n \in \mathcal F_n, \qquad Q^{h^+,h^-}(\Lambda_n) = E[\mathbf 1_{\Lambda_n} M_n^{h^+,h^-}].$$
This $Q^{h^+,h^-}$ has the following properties:
a) L_∞ is $Q^{h^+,h^-}$-a.s. finite and satisfies
$$\forall k \in \mathbb N^*, \qquad Q^{h^+,h^-}(L_\infty = k) = \frac12\big(h^+(k) + h^-(k)\big).$$
b) The r.v. g := sup{n ≥ 0, X_n = 0} is $Q^{h^+,h^-}$-a.s. finite and, under $Q^{h^+,h^-}$,
i. the processes (X_{g+u}, u ≥ 0) and (X_{u∧g}, u ≥ 0) are independent;
ii. with probability $\frac12\sum_{k=1}^{\infty} h^+(k)$, the process (X_{g+u}, u ≥ 1) is a 3-Bessel* walk started from 1; with probability $\frac12\sum_{k=1}^{\infty} h^-(k)$, the process (−X_{g+u}, u ≥ 1) is a 3-Bessel* walk started from 1;
iii. conditional on L_∞ = l, the process (X_{u∧g}, u ≥ 0) is a standard random walk stopped at its l-th passage at 0.

Our unusual choice for the definition of the local time at 0 will be helpful when proving the first point. The second part of the proof of this theorem rests essentially on an article by Le Gall (cf. [LeG85]) which enables us to establish, under specific conditions, that a 3-Bessel* walk for P is still a 3-Bessel* walk for $Q^{h^+,h^-}$.

3) In the third part, the penalisation functional G_p will be a function of the longest excursion completed until time p. Set g_n := sup{k ≤ n, X_k = 0}, d_n := inf{k ≥ n, X_k = 0}, and Σ_n := sup{d_k − g_k, d_k ≤ n}; for n ≥ 0, Σ_n is the duration of the longest excursion completed until time n. Fix an even integer x ≥ 0, and consider the penalisation functional G_p := 1_{Σ_p ≤ x}. To study penalisation by this G, we must also introduce A_n := n − g_n, which is the age of the current excursion, and A*_n := sup_{k≤n} A_k, which is the longest duration of a (complete or incomplete) excursion until n. We also call τ = inf{n > 0, X_n = 0} the first return time to 0, and we put
$$\theta(x) := E_0\big[\,|X_x|\ \big|\ \tau > x\,\big].$$
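As with Theorem 1, the martingale (3) is a function of the pair (X_n, L_n) (with L_{n+1} = L_n + 1_{X_n=0} under the article's convention for the local time), so its one-step martingale identity can be checked exactly. A sketch for our own sample choice h⁺(k) = h⁻(k) = 2^−k, for which Θ(x) = 2^−x in closed form:

```python
from fractions import Fraction as F

# Sample weights: h+(k) = h-(k) = 2^-k for k >= 1, so (1/2) sum_k (h+(k)+h-(k)) = 1.
hp = lambda k: F(1, 2**k)
hm = lambda k: F(1, 2**k)
Theta = lambda x: F(1, 2**x)   # closed form of (1/2) sum_{k > x} (h+(k) + h-(k)) here

def M(x, l):                   # martingale (3) as a function of (X_n, L_n)
    return max(x, 0) * hp(l) + max(-x, 0) * hm(l) + Theta(l)

for l in range(0, 20):
    for x in range(-8, 9):
        l2 = l + (1 if x == 0 else 0)   # L_{n+1} = L_n + 1_{X_n = 0}
        assert F(1, 2) * (M(x + 1, l2) + M(x - 1, l2)) == M(x, l)
print("martingale identity verified; M(0,0) =", M(0, 0))
```

The case x = 0 is the interesting one: it uses (1/2)(h⁺(l+1) + h⁻(l+1)) + Θ(l+1) = Θ(l), which is exactly the telescoping property built into Θ.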
Theorem 3. 1. a) For each n ≥ 0 and each Λ_n ∈ F_n:
$$\lim_{p\to\infty}\frac{E_0[\mathbf 1_{\Lambda_n}\mathbf 1_{\Sigma_p\le x}]}{P_0[\Sigma_p\le x]} = E_0[\mathbf 1_{\Lambda_n} M_n], \tag{4}$$
where
$$M_n := \Big(\frac{|X_n|}{\theta(x)} + \tilde P_{X_n}\big(\tilde T_0 > x - A_n\big)\Big)\,\mathbf 1_{A_n\le x}\,\mathbf 1_{\Sigma_n\le x}.$$
(In this expression and in similar ones, the meaning of $\tilde P$ and $\tilde T_0$ is to be interpreted as follows: $\tilde P_{X_n}(\tilde T_0 > x - A_n)$ stands for f(X_n, x − A_n), with f(y, z) = P_y(T_0 > z).)
b) Moreover, (M_n, n ≥ 0) is a positive martingale, not uniformly integrable; indeed, lim_{n→∞} M_n = 0 P-a.s.
2. Call Q^x the probability on F_∞ whose restriction to F_n is defined by
$$\forall \Lambda_n \in \mathcal F_n, \qquad Q^x(\Lambda_n) = E[\mathbf 1_{\Lambda_n} M_n].$$
Under Q^x, one has:
a) Σ_∞ ≤ x a.s. and, for all y ≤ x:
$$Q^x(\Sigma_\infty > y) = 1 - \frac{P(\tau > x)}{P(\tau > y)}.$$
b) A*_∞ = ∞ a.s.
c) The r.v. g := sup{n ≥ 0, X_n = 0} is a.s. finite. Moreover, if p = 2l or 2l + 1 with l ≥ 0,
$$Q^x(g > p) = \frac{1}{2^{2l}}\sum_{k=0}^{l\wedge\frac{x}{2}} C_{2l-2k}^{\,l-k}\, C_{2k}^{\,k}\,\Big(1 - \frac{P(\tau > x)}{P(\tau > 2k)}\Big).$$
d) For y such that 0 ≤ y ≤ x:
i. $(A_n, n \le T_y^A)$ has the same law under P and Q^x.
ii. $(A_n, n \le T_y^A)$ and $X_{T_y^A}$ are independent under P and under Q^x.
iii. Under Q^x, the law of $X_{T_y^A}$ is given by
$$Q^x\big(X_{T_y^A} = k\big) = \Big(\frac{|k|}{\theta(x)} + P_k\big(T_0 > x - y\big)\Big)\, P\big(X_y = k\ \big|\ \tau > y\big).$$
iv. $Q^x\big(g > T_y^A\big) = 1 - \dfrac{P(\tau > x)}{P(\tau > y)}$.
v. Under Q^x, $(A_n, n \le T_y^A)$ is independent of $\{g > T_y^A\}$.
3. Under Q^x,
a) the processes (X_{n∧g}, n ≥ 0) and (X_{g+n}, n ≥ 0) are independent;
b) with probability 1/2, the process (X_{g+n}, n ≥ 0) is a 3-Bessel* walk and, with probability 1/2, the process (−X_{g+n}, n ≥ 0) is a 3-Bessel* walk;
c) conditional on L_∞ = l, the process (X_{n∧g}, n ≥ 0) is a standard random walk stopped at its l-th return time to 0 and conditioned by {Σ_{τ_l} ≤ x}, where τ_l is the l-th return time to 0.

The proof of the first point of this theorem rests largely on a Tauberian theorem (cf. [Fel50]) which gives the large-p asymptotics of P(Σ_p ≤ x). And the study of the process X under Q^x rests on arguments similar to those used in the proof of Theorem 2.
2 Principle of Penalisation

Penalisation can intuitively be interpreted as a generalisation of conditioning by a null event. Consider the event A_∞ := {S_∞ ≤ a}, where a ∈ N. By recurrence of the standard walk, A_∞ is a P-null event. One way of conditioning by A_∞, which involves the filtration (F_n), is to consider the sequence of events A_p := {S_p ≤ a} and to study the limit
$$\lim_{p\to\infty}\frac{E\big[\mathbf 1_{\Lambda_n\cap\{S_p\le a\}}\big]}{E\big[\mathbf 1_{\{S_p\le a\}}\big]}, \tag{5}$$
for each n ∈ N and each Λ_n ∈ F_n. Simple arguments show that the limit in (5) exists and equals
$$E\Big[\mathbf 1_{\{\Lambda_n,\, S_n\le a\}}\,\frac{a+1-X_n}{a+1}\Big].$$
Put $M_n := \mathbf 1_{\{S_n\le a\}}\,\dfrac{a+1-X_n}{a+1}$. The process M is the martingale $\frac{a+1-X}{a+1}$ stopped when S first hits a + 1; so it is a positive P_0-martingale. Since M_0 = 1 and M_∞ = 0 a.s., M is not uniformly integrable. But a probability Q^{(n)} can be defined on F_n by
$$\frac{dQ^{(n)}}{dP|_{\mathcal F_n}} = M_n;$$
moreover, for m < n, Q^{(m)} and Q^{(n)} agree on F_m. By Kolmogorov's existence theorem (cf. [Bil] pp. 430–435), there exists a probability Q on (Ω, F_∞) whose restriction to each F_n is the corresponding Q^{(n)}; in other words, Q is characterized by
$$Q(\Lambda_n) := E\Big[\mathbf 1_{\{\Lambda_n,\, S_n\le a\}}\,\frac{a+1-X_n}{a+1}\Big]$$
for all n ∈ N and Λ_n ∈ F_n. When studying the behavior of (X_n, n ≥ 0) under the new probability Q, one obtains that S_∞ is a.s. finite and uniformly distributed on {0, 1, ..., a}. A more detailed study shows that:
• (X_{n∧T_∞}, n ≥ 0) and (S_∞ − X_{T_∞+n}, n ≥ 0) are two independent processes.
• Conditional on {S_∞ = k}, (X_{n∧T_∞}, n ≥ 0) is a standard random walk stopped when it reaches the value k.
• (S_∞ − X_{T_∞+n}, n ≥ 0) is a 3-Bessel walk started from 0, independent of (S_∞, T_∞).

This raises several natural questions: What happens when 1_{S_n≤a} is replaced with a more complicated function of the supremum? In that case, what does the limit (5) become? Can one still define a probability Q, and how is the behavior of (X_n, n ≥ 0) under Q influenced by this modification? This simple idea of replacing the indicator by a more complex function is the essence of penalisation. All this is evidently not limited to the case of the one-sided maximum, but extends to many other increasing, adapted functionals tending P-a.s. to +∞. There exist various examples of penalisation, and also a general principle (cf. [Deb07]), but this article is only devoted to three examples of penalisation functionals: the one-sided maximum, the local time at 0, and the maximal duration of the completed excursions.
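The introductory example can be confirmed by an exact computation: since M depends only on (X_n, S_n), a small dynamic program over the finitely many reachable states checks that E[M_n] = 1 for every n (equivalently, that each Q^{(n)} is a probability). A sketch with a = 4, a value of our own choosing:

```python
from fractions import Fraction as F

a = 4
def M(x, s):   # M_n = 1_{S_n <= a} * (a + 1 - X_n)/(a + 1), a function of (X_n, S_n)
    return F(a + 1 - x, a + 1) if s <= a else F(0)

dist = {(0, 0): F(1)}                 # exact law of (X_n, S_n) under P0, by DP
for n in range(30):
    assert sum(q * M(x, s) for (x, s), q in dist.items()) == 1   # E[M_n] = M_0 = 1
    new = {}
    for (x, s), q in dist.items():
        for step in (1, -1):
            y = x + step
            key = (y, max(s, y))
            new[key] = new.get(key, F(0)) + q * F(1, 2)
    dist = new
print("E[M_n] = 1 for n = 0, ..., 29")
```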
3 Penalisation by a Function of the One-sided Maximum: Proof of Theorem 1

1) We start by recalling a few facts. The next result is classical (cf. [Fel50] p. 75):

Lemma 1. For k ∈ Z and n ∈ N,
$$P_0(X_n = k) = \frac{1}{2^n}\, C_n^{\frac{n+k}{2}}.$$

Remark 1. In the sequel, we put p_{n,k} := P_0(X_n = k); observe that $p_{n,k} \ne 0$ if and only if n and k have the same parity and |k| ≤ n.
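Lemma 1 is easy to confirm by brute force, comparing the exact path counts with the binomial coefficients $C_n^{(n+k)/2}$ for small n:

```python
from itertools import product
from math import comb

for n in range(1, 9):
    counts = {}
    for steps in product((1, -1), repeat=n):   # all 2^n paths of length n
        k = sum(steps)
        counts[k] = counts.get(k, 0) + 1
    for k in range(-n, n + 1):
        # number of paths ending at k is 2^n * P0(Xn = k) = C(n, (n+k)/2)
        expected = comb(n, (n + k) // 2) if (n + k) % 2 == 0 else 0
        assert counts.get(k, 0) == expected
print("Lemma 1 verified for n <= 8")
```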
Lemma 2. For k in Z and n and r in N, one has
$$P_0(X_n = k,\ S_n \ge r) = \begin{cases} P(X_n = k) & \text{if } k > r;\\ P(X_n = 2r - k) & \text{if } k \le r. \end{cases} \tag{6}$$

Proof. This formula is trivial when k > r; when k ≤ r, it is Désiré André's well-known reflection principle (see for instance [Fel50] p. 72 and pp. 88–89).

From Lemma 2 and Remark 1, one easily derives the law of S:

Lemma 3. For n and r in N, one has
$$P_0(S_n = r) = p_{n,r} + p_{n,r+1} = p_{n,r} \vee p_{n,r+1}. \tag{7}$$

Proof. Summing (6) over all k ∈ Z gives
$$P(S_n \ge r) = \sum_{k>r} P(X_n = k) + \sum_{k\le r} P(X_n = 2r - k) = P(X_n > r) + P(X_n \ge r).$$
Consequently, P(S_n = r) = P(S_n ≥ r) − P(S_n ≥ r+1) = P(X_n = r+1) + P(X_n = r), and (7) follows by definition of p_{n,k} and by Remark 1.
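Both the reflection-principle formula (6) and the law (7) of S_n can likewise be verified exactly by enumerating all paths (recall S_n = max(0, X_1, ..., X_n)):

```python
from itertools import product, accumulate
from fractions import Fraction as F
from math import comb

def p(n, k):   # p_{n,k} = P0(Xn = k) from Lemma 1 (zero unless n, k have the same parity)
    return F(comb(n, (n + k) // 2), 2**n) if abs(k) <= n and (n + k) % 2 == 0 else F(0)

for n in range(1, 9):
    # record each of the 2^n paths as the pair (X_n, S_n), with S_n = max(0, X_1, ..., X_n)
    data = [(path[-1], max([0] + path))
            for path in (list(accumulate(w)) for w in product((1, -1), repeat=n))]
    for r in range(0, n + 1):
        for k in range(-n, n + 1):
            lhs = F(sum(1 for xn, sn in data if xn == k and sn >= r), 2**n)
            rhs = p(n, k) if k > r else p(n, 2 * r - k)   # reflection principle (6)
            assert lhs == rhs
        assert F(sum(1 for _, sn in data if sn == r), 2**n) == p(n, r) + p(n, r + 1)  # (7)
print("Lemmas 2 and 3 verified for n <= 8")
```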
2) We start showing point 1 of Theorem 1.

Lemma 4. For each k ≥ 0, the ratio
$$\frac{P(S_n = k)}{P(S_n = 0)}$$
is bounded above by 1 for all n ≥ 0 and tends to 1 when n → +∞.

Proof. The denominator is bounded below by P(X_1 = ... = X_n = −1) = 2^{−n}; so it does not vanish. Observe that, for even n and even k ≥ 2,
$$\frac{P(S_n = k)}{P(S_n = 0)} = \frac{P(S_n = k-1)}{P(S_n = 0)} = \frac{p_{n,k}}{p_{n,0}} = \frac{n-k+2}{n+2}\cdot\frac{n-k+4}{n+4}\cdots\frac{n}{n+k}\,;$$
and for odd n and odd k ≥ 1,
$$\frac{P(S_n = k)}{P(S_n = 0)} = \frac{P(S_n = k-1)}{P(S_n = 0)} = \frac{p_{n,k}}{p_{n,1}} = \frac{n-k+2}{n+1}\cdot\frac{n-k+4}{n+3}\cdots\frac{n+1}{n+k}\,.$$
Clearly, these products are smaller than 1 and tend to 1 when n goes to infinity.
Lemma 5. For all x ∈ N and y ∈ Z such that y ≤ x, the ratio
$$\frac{E\big[\varphi\big(x\vee(y+S_n)\big)\big]}{P(S_n = 0)}$$
is bounded above by (x − y)ϕ(x) + Φ(x) for all n ∈ N, and tends to (x − y)ϕ(x) + Φ(x) when n tends to infinity.

Proof. Write
$$\frac{E\big[\varphi\big(x\vee(y+S_n)\big)\big]}{P(S_n = 0)} = \varphi(x)\,\frac{P(y+S_n < x)}{P(S_n = 0)} + \sum_{k\ge x}\varphi(k)\,\frac{P(y+S_n = k)}{P(S_n = 0)} = \varphi(x)\sum_{k<x-y}\frac{P(S_n = k)}{P(S_n = 0)} + \sum_{k\ge x}\varphi(k)\,\frac{P(S_n = k-y)}{P(S_n = 0)}.$$
By Lemma 4, this sum is bounded above by $(x-y)\varphi(x) + \sum_{k\ge x}\varphi(k)$ and tends to this value by dominated convergence.
To establish point 1 of Theorem 1, observe first that M_n^φ = ϕ(S_n)(S_n − X_n) + Φ(S_n) is a positive martingale. Positivity is obvious: ϕ, Φ, and S − X are positive. To see that M^φ is a martingale, consider $M_{n+1}^\varphi - M_n^\varphi$. If S_{n+1} = S_n, the only thing that varies in the expression of M^φ when n is changed to n + 1 is X; so, in that case,
$$M_{n+1}^\varphi - M_n^\varphi = -\varphi(S_n)(X_{n+1} - X_n).$$
On the other hand, if $S_{n+1} \ne S_n$, one has S_{n+1} = S_n + 1 because each step of S is 0 or 1; one also has X_{n+1} = S_{n+1} because S can increase only when pushed up by X, and X_n = S_n because X_n must simultaneously be ≤ S_n and at distance 1 from X_{n+1}. So S_{n+1} − X_{n+1} = S_n − X_n = 0, giving
$$M_{n+1}^\varphi - M_n^\varphi = \Phi(S_{n+1}) - \Phi(S_n) = \Phi(S_n + 1) - \Phi(S_n) = -\varphi(S_n) = -\varphi(S_n)(X_{n+1} - X_n).$$
All in all, the equality $M_{n+1}^\varphi - M_n^\varphi = -\varphi(S_n)(X_{n+1} - X_n)$ holds everywhere; this entails that M^φ is a martingale, verifying
$$|M_n^\varphi - M_0^\varphi| \le n; \tag{8}$$
and since M_0^φ = Φ(0) = 1, one has E[M_n^φ] = 1.

We now proceed to prove 1.a of Theorem 1. For 0 ≤ n ≤ p, one can write $S_p = S_n \vee (X_n + \hat S_{p-n})$, where $\hat S$ is the maximal process of the standard random walk $(X_{n+k} - X_n)_{k\ge0}$, which is independent of F_n. Hence
$$E[\varphi(S_p)\,|\,\mathcal F_n] = \hat E\big[\varphi\big(S_n\vee(X_n + \hat S_{p-n})\big)\big],$$
where $\hat E$ integrates over $\hat S_{p-n}$ only, S_n and X_n being kept fixed. So, for Λ_n ∈ F_n,
$$\frac{E[\mathbf 1_{\Lambda_n}\varphi(S_p)]}{P(\hat S_{p-n} = 0)} = E\bigg[\mathbf 1_{\Lambda_n}\,\frac{\hat E\big[\varphi\big(S_n\vee(X_n + \hat S_{p-n})\big)\big]}{P(\hat S_{p-n} = 0)}\bigg].$$
When p tends to infinity, Lemma 5 says that the ratio in the right-hand side tends to M_n^φ and is dominated by M_n^φ, which is integrable by (8). Consequently,
$$\frac{E[\mathbf 1_{\Lambda_n}\varphi(S_p)]}{P(\hat S_{p-n} = 0)} \tag{9}$$
is bounded above by E[1_{Λ_n} M_n^φ] for all p ≥ n and tends to E[1_{Λ_n} M_n^φ] when p → ∞. Taking in particular Λ_n = Ω, one also has
$$\frac{E[\varphi(S_p)]}{P(\hat S_{p-n} = 0)} \to E[M_n^\varphi] = 1 \qquad\text{when } p \to \infty,$$
and to establish 1.a of Theorem 1, it suffices to take the ratio of these two limits.

Half of 1.b is already proven: we have seen above that M^φ is a positive martingale, with M_0^φ = 1. The proof that M_n^φ → 0 a.s. is postponed; we first establish 2.a. The set-function Q^φ defined on the Boolean algebra $\bigcup_n \mathcal F_n$ by Q^φ(Λ_n) = E[1_{Λ_n} M_n^φ] if Λ_n ∈ F_n, is a probability on each σ-field F_n. As recalled in the introduction, Q^φ automatically extends to a probability (still called Q^φ) on the σ-field F_∞.

For k and n in N, the event {S_n ≥ k} is equal to {T_k ≤ n}, where T_k = inf{m : X_m ≥ k} = inf{m : S_m ≥ k}. Now, by Doob's stopping theorem,
$$Q^\varphi(S_n \ge k) = Q^\varphi(T_k \le n) = E[\mathbf 1_{T_k\le n}\, M_n^\varphi] = E[\mathbf 1_{T_k\le n}\, M_{n\wedge T_k}^\varphi] = E[\mathbf 1_{T_k\le n}\, M_{T_k}^\varphi].$$
But P_0-a.s., $X_{T_k} = S_{T_k} = k$ and $M_{T_k}^\varphi = \Phi(k)$; wherefrom
$$Q^\varphi(S_n \ge k) = \Phi(k)\, P(S_n \ge k).$$
Fixing k, let now n tend to infinity. The events {S_n ≥ k} form an increasing sequence, with limit {S_∞ ≥ k}; hence
$$Q^\varphi(S_\infty \ge k) = \Phi(k)\, P(S_\infty \ge k) = \Phi(k).$$
This implies that S_∞ is Q^φ-a.s. finite, with Q^φ(S_∞ = k) = Φ(k) − Φ(k+1) = ϕ(k); so 2.a is established.
This also implies that the P-a.s. limit $M_\infty^\varphi$ of M^φ is null, by the following argument. Using Fatou's lemma, one writes
$$E[\mathbf 1_{S_\infty\ge k}\, M_\infty^\varphi] = E\big[\lim_n(\mathbf 1_{S_n\ge k}\, M_n^\varphi)\big] \le \liminf_n E[\mathbf 1_{S_n\ge k}\, M_n^\varphi] = \liminf_n Q^\varphi(S_n\ge k) = Q^\varphi(S_\infty\ge k) = \Phi(k);$$
then, by dominated convergence, one has
$$E[\mathbf 1_{S_\infty=\infty}\, M_\infty^\varphi] = E\big[\lim_k(\mathbf 1_{S_\infty\ge k}\, M_\infty^\varphi)\big] = \lim_k E[\mathbf 1_{S_\infty\ge k}\, M_\infty^\varphi] \le \lim_k \Phi(k) = 0,$$
and P(S_∞ = ∞) = 1 now implies $E[M_\infty^\varphi] = 0$. Point 1.b is proven.

3) Here are now a few facts on 3-Bessel walks, which will play an important role in the rest of the proof of Theorem 1.
Proposition 1. Let (R_n, n ≥ 0) be a 3-Bessel walk; put J_n = inf_{m ≥ n} R_m.
1. Conditional on F_n^R, the law of J_n is uniform on {0, 1, …, R_n}.
2. Suppose now R_0 = 0 (therefore J_0 = 0 too).
a) The process (Z_n, n ≥ 0) defined by Z_n = 2J_n − R_n is a standard random walk, and its natural filtration Z is also the natural filtration of the 2-dimensional process (R, J).
b) If T is a stopping time for Z such that R_T = J_T, then the process (R_{T+n} − R_T, n ≥ 0) is a 3-Bessel walk started from 0 and independent of Z_T.

Proof. 1. By the Markov property, it suffices to show that if R_0 = k, the r.v. J_0 is uniformly distributed on {0, …, k}. The function f(x) = 1/(1+x), defined for x ≥ 0, is bounded and verifies for x ≥ 1

f(x) = π(x, x−1) f(x−1) + π(x, x+1) f(x+1),

where π is the transition kernel of the 3-Bessel walk, given by (2). Thus f is π-harmonic except at x = 0, and f(R_{n∧σ_0}) is a bounded martingale, where σ_x denotes the hitting time of x by R. (This result is due to [LeG85] p. 449.) For 0 ≤ a ≤ k, by stopping, μ_n^a = f(R_{n∧σ_a}) is also a bounded martingale. A Borel–Cantelli argument shows that the paths of R are a.s. unbounded; hence lim inf_{n→∞} f(R_n) = 0 and μ_∞^a = f(a) 1_{J_0 ≤ a}. The martingale equality

f(a) P(J_0 ≤ a) = E[μ_∞^a] = E[μ_0^a] = f(k)

yields P(J_0 ≤ a) = (a+1)/(k+1), so the law of J_0 is uniform on {0, …, k}.

Part 2 of Proposition 1 depends only on the law of the process R, so we need not prove it for all 3-Bessel walks started at 0; it suffices to prove it for some particular 3-Bessel walk started at 0. Given a standard random walk Z′ with Z′_0 = 0 and its past maximum S′_n = sup_{m ≤ n} Z′_m, Pitman's theorem [Pit75] says that the process R′ = 2S′ − Z′ is a 3-Bessel walk started from 0, with future minimum J′_n = inf_{m ≥ n} R′_m given by J′ = S′. We shall prove 2.a and 2.b for this particular 3-Bessel walk R′.
The process 2J′ − R′ is also equal to 2S′ − R′ = Z′, so it is a standard random walk. Both J′ = S′ and R′ = 2S′ − Z′ are adapted to the filtration of Z′; conversely, Z′ = 2J′ − R′ is adapted to the filtration generated by R′ and J′. This proves 2.a.

To show 2.b, let T be a Z′-stopping time such that R′_T = J′_T. One has Z′_T = 2J′_T − R′_T = J′_T = S′_T. The process Ẑ defined by Ẑ_n = Z′_{T+n} − Z′_T is a standard random walk independent of Z′_T, started from 0, with past maximum

Ŝ_n = sup_{m ≤ n} Ẑ_m = S′_{T+n} − Z′_T = S′_{T+n} − S′_T.

By Pitman's theorem, R̂ = 2Ŝ − Ẑ is a 3-Bessel walk, and it is independent of Z′_T because so is Ẑ. Now,

R̂_n = 2Ŝ_n − Ẑ_n = 2(S′_{T+n} − S′_T) − (Z′_{T+n} − Z′_T) = R′_{T+n} − R′_T;

thus 2.b holds and Proposition 1 is established.
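Pitman's transform is easy to explore numerically. The sketch below (not part of the paper; the helper name is ours) builds R = 2S − Z from a simulated standard random walk and checks that R is nonnegative with unit steps, and that R_m = S_m + (S_m − Z_m) ≥ S_n for every m ≥ n, which is the pathwise half of the identity "future minimum of R equals S" used above:

```python
import random

def pitman_transform(steps):
    """From ±1 increments build the walk Z, its running maximum S, and R = 2S - Z."""
    Z, S, R = [0], [0], [0]
    for e in steps:
        Z.append(Z[-1] + e)
        S.append(max(S[-1], Z[-1]))
        R.append(2 * S[-1] - Z[-1])
    return Z, S, R

random.seed(1)
steps = [random.choice([-1, 1]) for _ in range(10_000)]
Z, S, R = pitman_transform(steps)

assert all(r >= 0 for r in R)                          # R stays nonnegative
assert all(abs(b - a) == 1 for a, b in zip(R, R[1:]))  # R moves by unit steps
# R_m >= S_m >= S_n for m >= n, hence inf_{m>=n} R_m >= S_n pathwise:
assert all(min(R[n:]) >= S[n] for n in range(0, 10_001, 500))
```

(The full Pitman theorem, that R is a 3-Bessel walk and inf_{m≥n} R_m = S_n a.s., is a distributional statement; only the deterministic inequalities are asserted here.)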
4) The next step is the proof of point 3 in Theorem 1. We start with a small computation:

Lemma 6. Let a r.v. U be uniformly distributed on {0, …, r}. Then

E[ϕ(U)(r − U) + Φ(U)] = 1.

Proof. It suffices to write

(r+1) E[1 − Φ(U)] = Σ_{i=0}^{r} (1 − Φ(i)) = Σ_{i=0}^{r} Σ_{j=0}^{i−1} ϕ(j) = Σ_{j=0}^{r−1} Σ_{i=j+1}^{r} ϕ(j) = Σ_{j=0}^{r−1} (r − j) ϕ(j) = (r+1) E[(r − U) ϕ(U)].
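Lemma 6 is a finite identity, so it can be verified exactly with rational arithmetic. The sketch below is ours; the particular finitely supported probability ϕ is an arbitrary choice:

```python
from fractions import Fraction

phi = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]  # sums to 1

def Phi(k):
    """Phi(k) = sum_{j >= k} phi(j)."""
    return sum(phi[k:], Fraction(0))

def lemma6_expectation(r):
    """E[phi(U)(r - U) + Phi(U)] for U uniform on {0, ..., r}."""
    return sum(phi[u] * (r - u) + Phi(u) for u in range(r + 1)) / (r + 1)

for r in range(len(phi)):
    assert lemma6_expectation(r) == 1   # exact equality, as the lemma claims
```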
The next proposition proves the first half of point 3 in Theorem 1.

Proposition 2. Under Q^ϕ, the process (R_n, n ≥ 0) given by R_n = 2S_n − X_n is a 3-Bessel walk started from 0.

Proof. According to Pitman's theorem [Pit75], under the probability P, the process (R_n, n ≥ 0) is a 3-Bessel walk with future infimum J_n = S_n. Call R the natural filtration of R. By Proposition 1.1, the conditional law of S_n given R_n is uniform on {0, …, R_n}; consequently Lemma 6 gives

E[M_n^ϕ | R_n] = E[ϕ(S_n)(R_n − S_n) + Φ(S_n) | R_n] = 1.
Now, let f be any bounded function on N^{n+1}. One has

E^{Qϕ}[f(R_0, …, R_n)] = E[f(R_0, …, R_n) M_n^ϕ] = E[f(R_0, …, R_n) E[M_n^ϕ | R_n]] = E[f(R_0, …, R_n)].

As n and f were arbitrary, R has the same law under Q^ϕ as under P, that is, Q^ϕ also makes R a 3-Bessel walk.

To finish proving point 3, it remains to establish that R is independent of S_∞ under Q^ϕ. This will easily follow from the next lemma, which decomposes Q^ϕ as a sum of measures carried by the level sets of S_∞.

Lemma 7. Call Q^{(k)} the probability Q^ϕ for ϕ = δ_k, that is, ϕ(k) = 1 and ϕ(x) = 0 for x ≠ k. Then Q^{(k)} is supported by the event {S_∞ = k}, and, for a general ϕ and for all Λ ∈ F_∞, one has

Q^ϕ(Λ) = Σ_{k≥0} ϕ(k) Q^{(k)}(Λ);
Q^ϕ(Λ | S_∞ = k) = Q^{(k)}(Λ)   for all k such that ϕ(k) > 0.

Proof. For Λ_n ∈ F_n, one can use formula (9) twice and write

Q^ϕ(Λ_n) = lim_p E[1_{Λ_n} ϕ(S_p)] / P(S_{p−n} = 0) = lim_p Σ_k ϕ(k) P(Λ_n ∩ {S_p = k}) / P(S_{p−n} = 0)
= Σ_k ϕ(k) lim_p P(Λ_n ∩ {S_p = k}) / P(S_{p−n} = 0) = Σ_k ϕ(k) Q^{(k)}(Λ_n),

where lim and Σ commute by dominated convergence, owing to the majoration in (9). So the probabilities Q^ϕ and Σ_k ϕ(k) Q^{(k)} coincide on ∪_n F_n; therefore they also coincide on F_∞.

Applying now equation (1) with ϕ = δ_k gives Q^{(k)}(S_∞ = k) = 1, that is, Q^{(k)} is supported by {S_∞ = k}. Consequently, for any Λ ∈ F_∞, one has Q^ϕ(Λ ∩ {S_∞ = k}) = ϕ(k) Q^{(k)}(Λ) because all other terms in the series vanish. Using (1) again, one may replace ϕ(k) with Q^ϕ(S_∞ = k); this proves Q^ϕ(Λ | S_∞ = k) = Q^{(k)}(Λ) whenever ϕ(k) > 0.

The proof of independence in Theorem 1.3 is now child's play: Proposition 2 says that the law of R under Q^ϕ is always the law of the 3-Bessel walk, whatever the choice of ϕ. We may in particular take ϕ = δ_k, so it is also true under Q^{(k)}. Since Q^{(k)} is also the conditioning of Q^ϕ by {S_∞ = k}, under Q^ϕ the law of R conditional on {S_∞ = k} does not depend upon k; thus R is independent of S_∞.

5) So far, all of Theorem 1 has been established, except 2.b, to which the rest of the proof will be devoted. Finiteness of T_∞ is due to X being integer-valued and its supremum S_∞ being finite.
Put U_n = X_{n∧T_∞} and V_n = S_∞ − X_{T_∞+n}. To prove 2.b.i and 2.b.iii we have to show that under Q^ϕ the process V is a 3-Bessel walk independent of the process U. Call ν the law of the 3-Bessel walk. For bounded functionals F and G, we must prove that

E^{Qϕ}[F∘U G∘V] = E^{Qϕ}[F∘U] ∫ G(v) ν(dv).

Replacing now Q^ϕ by Σ_k ϕ(k) Q^{(k)} (see Lemma 7), it suffices to show it when ϕ = δ_k. Similarly, 2.b.ii only refers to a conditional law given S_∞; by Lemma 7 again, we may replace Q^ϕ by Q^{(k)}. Finally, when proving 2.b, we may suppose ϕ = δ_k and Q^ϕ = Q^{(k)} for a fixed k ≥ 0. Hence the random time T_∞ becomes the stopping time T_k = inf{n ≥ 0, X_n = k}, and it remains to show that

• (X_{n∧T_k}, n ≥ 0) is a standard random walk stopped when it first hits the level k;
• (k − X_{T_k+n}, n ≥ 0) is a 3-Bessel walk started at 0;
• these two processes are independent.

By point 3 of Theorem 1, we know that R = 2S − X is a 3-Bessel walk; and as we are now working under Q^{(k)}, we have S_∞ = k a.s. Put J_n = inf_{m ≥ n} R_m. We shall first show that the processes J and S are equal on the interval [0, T_k]. Given n, call τ the first time p ≥ n when X_p = S_n, and observe that on the event {T_k ≥ n}, τ is finite because X_n ≤ S_n ≤ k = X_{T_k}. For all m ≥ n, one has R_m = S_m + (S_m − X_m) ≥ S_n + 0, with equality for m = τ; thus J_n = S_n on {τ < ∞} and a fortiori on {T_k ≥ n}.

We shall now apply Proposition 1.2 to the 3-Bessel walk R = 2S − X and its future infimum J. Part 2.a of this proposition says that Z = 2J − R is a standard random walk. We just saw that J = S on the random time-interval [0, T_k]; consequently, on this interval, Z = 2S − R = X. And as T_k is the first time when X = k, it is also the first time when Z = k. This proves that (X_{n∧T_k}, n ≥ 0) is a standard random walk stopped at level k, and also that the Z-stopping time T_k satisfies Z_{T_k} = F_{T_k}, where Z is the filtration of Z. Remarking that R_{T_k} = J_{T_k} = k, part 2.b of Proposition 1 can be applied to T_k; it says that (R_{T_k+n} − k, n ≥ 0) is a 3-Bessel walk independent of F_{T_k}, and hence also of the process (X_{n∧T_k}, n ≥ 0). But R_{T_k+n} = 2S_{T_k+n} − X_{T_k+n} = 2k − X_{T_k+n} since S_{T_k} = k = S_∞; so this 3-Bessel walk is nothing but (k − X_{T_k+n}, n ≥ 0). This concludes the proof of Theorem 1.
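As a sanity check on the martingale used throughout this proof, note that M_n^ϕ = ϕ(S_n)(R_n − S_n) + Φ(S_n) = ϕ(S_n)(S_n − X_n) + Φ(S_n) can be averaged exactly over all paths of a short walk; the expectation must equal M_0^ϕ = Φ(0) = 1. The sketch (ours, with the illustrative choice ϕ = δ_2) does this by brute-force enumeration:

```python
from fractions import Fraction
from itertools import product

# phi = delta at 2, so Phi(k) = 1 for k <= 2 and 0 otherwise
phi = lambda k: Fraction(k == 2)
Phi = lambda k: Fraction(k <= 2)

def M(path):
    """M_n^phi = phi(S_n)(S_n - X_n) + Phi(S_n) along a ±1 path started at 0."""
    X = S = 0
    for e in path:
        X += e
        S = max(S, X)
    return phi(S) * (S - X) + Phi(S)

n = 8
total = sum(M(p) for p in product([1, -1], repeat=n))
assert total == Fraction(2) ** n    # i.e. E[M_n^phi] = 1 exactly
```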
4 Penalisation by a Function of the Local Time: Proof of Theorem 2

Definition 1. Recall that the 3-Bessel* walk is the Markov chain (R*_n, n ≥ 0), valued in N* = {1, 2, …}, such that R* − 1 is a 3-Bessel walk. So its transition probabilities from x ≥ 1 are

π*(x, x+1) = (x+1)/(2x);   π*(x, x−1) = (x−1)/(2x).
1) We now prove point 1 of Theorem 2. First, (M_n^{h+,h−}, n ≥ 0) is a positive martingale. Positivity is obvious from the definitions of h^+, h^− and Θ. To see that M^{h+,h−} is a martingale, we shall verify that the increment M_{n+1}^{h+,h−} − M_n^{h+,h−} has the form (X_{n+1} − X_n) K_n, where K_n is F_n-measurable and |K_n| ≤ 1. There are three cases, depending on the value of X_n.

If X_n > 0, then X_{n+1} ≥ 0, so X_n^+ = X_n, X_{n+1}^+ = X_{n+1}, and L_{n+1} = L_n. Consequently, in that case, M_{n+1}^{h+,h−} − M_n^{h+,h−} = (X_{n+1} − X_n) h^+(L_n).

Similarly, if X_n < 0, one has X_n^− = −X_n, X_{n+1}^− = −X_{n+1}, L_{n+1} = L_n and M_{n+1}^{h+,h−} − M_n^{h+,h−} = −(X_{n+1} − X_n) h^−(L_n).

Last, if X_n = 0, then L_{n+1} = L_n + 1 and X_{n+1} = ±1. In that case,

M_{n+1}^{h+,h−} − M_n^{h+,h−} = 1_{X_{n+1}=1} h^+(L_n+1) + 1_{X_{n+1}=−1} h^−(L_n+1) + Θ(L_n+1) − Θ(L_n)
= h^{sgn(X_{n+1}−X_n)}(L_n+1) − (1/2)(h^+(L_n+1) + h^−(L_n+1))
= (1/2)(X_{n+1} − X_n)(h^+(L_n+1) − h^−(L_n+1)).

This establishes the claim; consequently, M^{h+,h−} is a martingale which satisfies |M_n^{h+,h−} − M_0^{h+,h−}| ≤ n, and, as M_0^{h+,h−} = 1, one has E[M_n^{h+,h−}] = 1.

To finish the proof of point 1 in Theorem 2, it remains to show formula (3). This will use the following lemma.

Lemma 8. For each integer k such that 0 < k < n/2,

P(L_n = k) / P(S_n = 0)

is bounded above by 2 and tends to 1 when n → ∞.
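The case analysis above can be cross-checked by brute force: enumerate every path of a short walk, evaluate M_n^{h+,h−} = X_n^+ h^+(L_n) + X_n^− h^−(L_n) + Θ(L_n), and average exactly. The toy choice of h^± below is ours (it makes Θ(0) = 1, so the exact expectation must be 1):

```python
from fractions import Fraction
from itertools import product

hp = lambda k: Fraction(k == 1)   # h+(1) = 1, zero elsewhere
hm = lambda k: Fraction(k == 1)   # h-(1) = 1, zero elsewhere
Theta = lambda l: Fraction(1, 2) * sum(hp(k) + hm(k) for k in range(l + 1, 3))

def M(path):
    """M_n = X_n^+ h+(L_n) + X_n^- h-(L_n) + Theta(L_n), where L_n counts
    visits to 0 among times 0,...,n-1 (X_0 = 0, so L_1 = 1)."""
    X, L = 0, 0
    for e in path:
        if X == 0:
            L += 1
        X += e
    return max(X, 0) * hp(L) + max(-X, 0) * hm(L) + Theta(L)

n = 8
total = sum(M(p) for p in product([1, -1], repeat=n))
assert total == Fraction(2) ** n    # E[M_n^{h+,h-}] = 1 exactly
```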
Remark 2. In the sequel, for h : N → R^+ such that Σ_{k=1}^∞ h(k) < ∞, we put M_n^{h,0} = X_n^+ h(L_n) + Θ(L_n) for n ≥ 0. When (1/2) Σ_{k=1}^∞ h(k) = 1, this notation is consistent with the one used so far; in general, M^{h,0} is a martingale too, for dividing it by the constant Θ(0) = (1/2) Σ_{k=1}^∞ h(k) reduces it to the previous case.

Lemma 9. Let h : N → R^+ be such that Σ_{k=1}^∞ h(k) < ∞. For a ≥ 0 and x ∈ Z,

E_x[h(L_n + a) 1_{X_n > 0}] / P(S_n = 0)

is bounded above by 2(h(a) x^+ + (1/2) Σ_{k ≥ a+1} h(k)) and converges to h(a) x^+ + (1/2) Σ_{k ≥ a+1} h(k) when n → ∞.
Proof of Lemma 8. Call γ_n = |{p ≤ n, X_p = 0}| the number of visits to 0 up to time n. Clearly, γ_n = L_{n+1} and P(L_n = k) = P(γ_{n−1} = k). We shall study the law of γ_n. Define a sequence (V_n, n ≥ 0) by V_0 = 0 and V_{n+1} = inf{k > 0, X_{V_n+k} = 0}, and put (X_n^{(k)})_{n≥0} = (X_{V_k+n})_{n≥0} and T_i^{(k)} = inf{n ≥ 0, X_n^{(k)} = i}. Owing to the symmetry of the random walk and the Markov property,

∀i ≥ 1   P(V_i = k) = P(T_1^{(i−1)} = k − 1).

So ∀i ≥ 1, V_i has the law of T_1^{(i−1)} + 1. Moreover, according to the strong Markov property, (X_n^{(1)}, n ≥ 0) is independent of F_{V_1} and hence

V_1 + V_2 has the law of T_1^{(0)} + T_1^{(1)} + 2.

Wherefrom, by induction,

V_1 + V_2 + … + V_k has the law of T_1^{(0)} + T_1^{(1)} + … + T_1^{(k−1)} + k.

So

P(γ_n = k) = P(V_1 + … + V_{k−1} ≤ n < V_1 + … + V_k)
= P(T_1^{(0)} + … + T_1^{(k−2)} + k − 1 ≤ n < T_1^{(0)} + … + T_1^{(k−1)} + k)
= P(T_{k−1} + k − 1 ≤ n < T_k + k)
= P(S_{n−k+1} ≥ k − 1, S_{n−k} < k)
= P(S_{n−k+1} = k − 1) + P(T_k = n − k + 1).

Taking inspiration from the proof of Lemma 4, it is easy to see that

P(S_{n−k+1} = k − 1) / P(S_n = 0)

is bounded above by 1 and tends to 1 when n tends to infinity. According to [Fel50] p. 89,

P(T_r = n) = (r/n) C_n^{(n+r)/2} (1/2)^n.

Appealing again to the proof of Lemma 4, it is easy to show that

P(T_k = n − k + 1) / P(S_n = 0)

is bounded above by 1 and tends to 0 when n goes to infinity. The proof is over.
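Lemma 8 can be probed with exact dynamic programming over the joint law of (X_n, L_n). The sketch below (ours) computes P(L_n = k) and P(S_n = 0) exactly for a moderate n and checks the bound of the lemma:

```python
from fractions import Fraction
from collections import defaultdict

def local_time_law(n):
    """Exact joint law of (X_n, L_n) for the standard random walk;
    L_n counts visits to 0 among times 0,...,n-1."""
    dp = {(0, 0): Fraction(1)}
    for _ in range(n):
        nxt = defaultdict(Fraction)
        for (x, l), p in dp.items():
            l2 = l + (x == 0)
            nxt[(x + 1, l2)] += p / 2
            nxt[(x - 1, l2)] += p / 2
        dp = dict(nxt)
    return dp

def p_max_zero(n):
    """P(S_n = 0): total mass of paths staying <= 0, by DP."""
    dp = {0: Fraction(1)}
    for _ in range(n):
        nxt = defaultdict(Fraction)
        for x, p in dp.items():
            if x - 1 <= 0: nxt[x - 1] += p / 2
            if x + 1 <= 0: nxt[x + 1] += p / 2
        dp = dict(nxt)
    return sum(dp.values())

n, k = 60, 4                     # 0 < k < n/2, as in the lemma
pl = sum(p for (x, l), p in local_time_law(n).items() if l == k)
ratio = pl / p_max_zero(n)
assert 0 < ratio <= 2            # the uniform bound of Lemma 8
```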
Remark 3. From the preceding result, one easily sees that

P_x(L_n = k, X_n > 0) / P(S_n = 0)

is bounded above by 1 and tends to 1/2 when n → ∞.
Proof of Lemma 9. Start from

E_x[h(L_n + a) 1_{X_n > 0}] = E_x[h(L_n + a) 1_{X_n > 0} (1_{T_0 > n} + 1_{T_0 ≤ n})].

One has

h(L_n + a) 1_{X_n > 0} 1_{T_0 > n} = 0 if x ≤ 0,   and   h(a) 1_{T_0 > n} if x > 0.

According to Lemma 4,

h(a) 1_{x > 0} P_x(T_0 > n) / P(S_n = 0)

is bounded above by x^+ h(a) and converges to x^+ h(a). Write

E_x[h(L_n + a) 1_{X_n > 0, T_0 ≤ n}] / P(S_n = 0) = Σ_{k ≥ 1} h(k + a) P_x(L_n = k, X_n > 0) / P(S_n = 0).

By Lemma 8, this sum is bounded above by Σ_{k ≥ 1} h(k + a) and converges to (1/2) Σ_{k ≥ 1} h(k + a) when n → ∞.

We shall now prove point 1.a in Theorem 2. For each 0 ≤ n ≤ p, one has L_p = L_n + L̃_{p−n}, where L̃ is the local time at 0 of the standard random walk (X_{n+k})_{k≥0} which, given X_n, is independent of F_n. So

E[h(L_p) 1_{X_p > 0} | F_n] = Ẽ_{X_n}[h(L_n + L̃_{p−n}) 1_{X̃_{p−n} > 0}],

where Ẽ integrates only L̃_{p−n} and X̃_{p−n}, and where L_n and X_n are fixed. Then, for all Λ_n ∈ F_n,

E[h(L_p) 1_{X_p > 0, Λ_n}] / P(S_{p−n} = 0) = E[ 1_{Λ_n} Ẽ_{X_n}[h(L_n + L̃_{p−n}) 1_{X̃_{p−n} > 0}] / P(S_{p−n} = 0) ].

When p → ∞, Lemma 9 says that the ratio in the right-hand side tends to M_n^{h,0} and is dominated by 2M_n^{h,0}, which is integrable. Consequently, when p → ∞,

E[h(L_p) 1_{X_p > 0, Λ_n}] / P(S_{p−n} = 0) → E[1_{Λ_n} M_n^{h,0}],
and taking Λ_n = Ω, one has

E[h(L_p) 1_{X_p > 0}] / P(S_{p−n} = 0) → E[M_n^{h,0}].

Taking the ratio of these two limits yields

E[h(L_p) 1_{X_p > 0, Λ_n}] / E[h(L_p) 1_{X_p > 0}] → E[1_{Λ_n} M_n^{h,0}] / E[M_n^{h,0}].

To finalize the proof of point 1.a, it now suffices to use the symmetry of the standard random walk and the fact that E[M_n^{h+,h−}] = 1.

2) Let us now show point 2 in Theorem 2. Put τ_l = inf{k ≥ 0, γ_k = l}. Then

Q^{h+,h−}(L_n ≥ l) = Q^{h+,h−}(τ_l ≤ n − 1) = E[1_{τ_l ≤ n−1} M_{τ_l}^{h+,h−}] = Θ(l−1) P(τ_l ≤ n − 1).

For fixed l, the sequence of events {L_n ≥ l} is increasing and tends to {L_∞ ≥ l}; so

Q^{h+,h−}(L_∞ ≥ l) = Θ(l−1) P(τ_l < ∞) = Θ(l−1).

Hence L_∞ is Q^{h+,h−}-a.s. finite, with

Q^{h+,h−}(L_∞ = l) = Θ(l−1) − Θ(l) = (1/2)(h^+(l) + h^−(l)),

and 2.a is established.

To show that the P-a.s. limit M_∞^{h+,h−} of M^{h+,h−} is null, it suffices to apply the same method as for M^ϕ, with L instead of S and M^{h+,h−} instead of M^ϕ.

The study of the process (X_n, n ≥ 0) under Q^{h+,h−} starts with the next three lemmas.

Lemma 10. Under P_1 and conditional on the event {T_p < T_0}, the process (X_n, 0 ≤ n ≤ T_p) is a 3-Bessel* walk started from 1 and stopped when it first hits the level p (cf. [LeG85]).

For typographical simplicity, call T_{p,n} := inf{k > n, X_k = p} the time of the first visit to p after n, and H_l := {T_{p,τ_l} < τ_{l+1}, X_{τ_l+1} = 1}, the event that the l-th excursion is positive and reaches level p.

Lemma 11. Under the law Q^{h+,h−} and conditional on the event H_l, the process (X_{n+τ_l}, 1 ≤ n ≤ T_{p,τ_l} − τ_l) is a 3-Bessel* walk started from 1 and stopped when it first hits the level p.

Lemma 12. Put Γ^+ := {X_{n+g} > 0, ∀n > 0} and Γ^− := {X_{n+g} < 0, ∀n > 0}. Then:

Q^{h+,h−}(Γ^+) = 1 − Q^{h+,h−}(Γ^−) = (1/2) Σ_{k=1}^{∞} h^+(k).
Proof of Lemma 11. Let G be a function from Z^n to R^+. Then, according to the definition of the probability Q^{h+,h−} and owing to Doob's stopping theorem,

K := Q^{h+,h−}[G(X_{τ_l+1}, …, X_{τ_l+n}) 1_{n+τ_l ≤ T_{p,τ_l}} | H_l]
= Q^{h+,h−}[G(X_{τ_l+1}, …, X_{τ_l+n}) 1_{τ_l+n ≤ T_{p,τ_l} < τ_{l+1}, X_{τ_l+1}=1}] / Q^{h+,h−}(H_l)
= E[G(X_{τ_l+1}, …, X_{τ_l+n}) 1_{τ_l+n ≤ T_{p,τ_l} < τ_{l+1}, X_{τ_l+1}=1} M_{τ_{l+1}}^{h+,h−}] / E[1_{H_l} M_{τ_{l+1}}^{h+,h−}].

Replacing M_{τ_{l+1}}^{h+,h−} by the constant Θ(l) and using the Markov property, one gets

K = E[G(X_{τ_l+1}, …, X_{τ_l+n}) 1_{τ_l+n ≤ T_{p,τ_l} < τ_{l+1}, X_{τ_l+1}=1}] / P(H_l) = E_1[G(X_0, …, X_{n−1}) 1_{n−1 ≤ T_p} | T_p < T_0],

and Lemma 10 allows to conclude.
Remark 4. By letting p → ∞, one deduces therefrom that, conditional on {g = τ_l, X_{τ_l+1} = 1}, (X_{n+g}, n ≥ 1) is a 3-Bessel* walk under Q^{h+,h−}.

Proof of Lemma 12. As g is Q^{h+,h−}-a.s. finite and as X_n ≠ 0 for n > g, one has

Q^{h+,h−}(Γ^+) = lim_{n→∞} Q^{h+,h−}(X_n > 0).   (10)

Now,

Q^{h+,h−}(X_n > 0) = E[1_{X_n > 0} M_n^{h+,h−}] = E[1_{X_n > 0} (Θ(L_n) + X_n^+ h^+(L_n))].

Since 1_{X_n > 0} Θ(L_n) ≤ Θ(L_n) ≤ 1, the dominated convergence theorem gives

E[1_{X_n > 0} Θ(L_n)] → 0 as n → ∞.

We already know that M^{h+,0} is a martingale. Consequently,

E[M_n^{h+,0}] = E[M_0^{h+,0}] = (1/2) Σ_{k=1}^{∞} h^+(k),

wherefrom

E[X_n^+ h^+(L_n)] = E[(1/2) Σ_{k=1}^{L_n} h^+(k)] ≤ (1/2) Σ_{k=1}^{∞} h^+(k).
By dominated convergence again,

lim_{n→∞} E[X_n^+ h^+(L_n)] = (1/2) Σ_{k=1}^{∞} h^+(k),

and so, according to (10), Q^{h+,h−}(Γ^+) = (1/2) Σ_{k=1}^{∞} h^+(k).

For F : Z^n → R^+,

E^Q[F(X_{g+1}, …, X_{g+n}) 1_{X_{g+1}=1}] = Σ_{l ≥ 1} E^Q[F(X_{g+1}, …, X_{g+n}) 1_{g=τ_l, X_{g+1}=1}]
= Σ_{l ≥ 1} E^Q[F(X_{g+1}, …, X_{g+n}) | g = τ_l, X_{τ_l+1} = 1] Q^{h+,h−}(g = τ_l, X_{τ_l+1} = 1)
= E_1[F(X_0, …, X_{n−1}) | T_0 = ∞] Σ_{l ≥ 1} Q^{h+,h−}(g = τ_l, X_{τ_l+1} = 1)
= E_1[F(X_0, …, X_{n−1}) | T_0 = ∞] Q^{h+,h−}(Γ^+).
This shows half of point 2.b.ii. The other half, when X_{g+1} = −1, is easily obtained using the symmetry of the walk.

To end the proof of Theorem 2, we shall show that, conditional on {L_∞ = l} and under the law Q^{h+,h−}, the process (X_u, u < g) is a standard random walk stopped at its l-th passage at 0. Let F be a function from Z^n to R^+ and l an element of N*. From the definition of Q^{h+,h−} and the optional stopping theorem,

E^Q[F(X_1, …, X_n) 1_{n < τ_l} | L_∞ = l] = E^Q[F(X_1, …, X_n) 1_{n < τ_l < ∞} 1_{τ_{l+1} = ∞}] / Q^{h+,h−}(L_∞ = l)
= (E^Q[F(X_1, …, X_n) 1_{n < τ_l < ∞}] − E^Q[F(X_1, …, X_n) 1_{n < τ_l < τ_{l+1} < ∞}]) / Q^{h+,h−}(L_∞ = l)
= (E[F(X_1, …, X_n) 1_{n < τ_l} M_{τ_l}] − E[F(X_1, …, X_n) 1_{n < τ_l} M_{τ_{l+1}}]) / Q^{h+,h−}(L_∞ = l)
= E[F(X_1, …, X_n) 1_{n < τ_l}] (Θ(l−1) − Θ(l)) / ((1/2)(h^+(l) + h^−(l))) = E[F(X_1, …, X_n) 1_{n < τ_l}].
5 Penalisation by the Length of the Excursions

5.1 Notation

For n ≥ 0, call g_n (respectively d_n) the last zero before n (respectively the first zero after n):

g_n := sup{k ≤ n, X_k = 0},   d_n := inf{k > n, X_k = 0}.
Thus d_n − g_n is the duration of the excursion that straddles n. Put

Σ_n = sup{d_k − g_k, d_k ≤ n},   (11)

so Σ_n is the longest excursion before g_n; remark that Σ_n = Σ_{g_n}.

Define (A_n, n ≥ 0), the “age process”, by A_n = n − g_n, and call A_n = σ(A_k, k ≤ n) the filtration generated by A. Set

A*_n = sup_{k ≤ n} A_k,   (12)

and observe that A*_n = (Σ_n − 1) ∨ (n − g_n), wherefrom

A*_{g_n} = Σ_{g_n} − 1.   (13)

In the sequel, γ_n := Σ_{k=0}^{n} 1_{X_k=0} is the number of passage times at 0 up to time n, τ = inf{n > 0, X_n = 0} is the first return time to 0, and a function θ is defined by θ(x) := E[|X_x| | τ > x].

5.2 Proof of Theorem 3

1) We start with point 1 of Theorem 3. To show formula (4), we need:

Proposition 3.

P(Σ_k ≤ x) ∼_{k→∞} (2/(πk))^{1/2} θ(x).
To establish this Proposition, we will use the following lemma:

Lemma 13. For every f : Z → R^+, every n ≥ 0 and every k > 0,

E[f(X_n) | A_n = k] = E[f(X_k) | τ > k],

and a Tauberian theorem:

Theorem 4 (Cf. [Fel71] p. 447). Given q_n ≥ 0, suppose that the series

S(s) = Σ_{n=0}^{∞} q_n s^n

converges for 0 ≤ s < 1. If 0 < p < ∞ and if the sequence {q_n} is monotone, then the two relations

S(s) ∼_{s→1−} C (1 − s)^{−p}   and   q_n ∼_{n→∞} (C / Γ(p)) n^{p−1},

where 0 < C < ∞, are equivalent.

Proof of Lemma 13. By the Markov property,

E[f(X_n) | A_n = k] = E[f(X_n) | n − g_n = k] = E[f(X_n) | X_{n−k} = 0, X_{n−k+1} ≠ 0, …, X_n ≠ 0] = E[f(X_k) | τ > k].

Proof of Proposition 3. Let δ_β be a geometric r.v. with parameter β, where 0 < β < 1, and such that δ_β is independent of the walk X. Then

P(Σ_{δ_β} ≤ x) = Σ_{k=1}^{∞} P(δ_β = k) P(Σ_k ≤ x) = Σ_{k=1}^{∞} (1−β)^{k−1} β P(Σ_k ≤ x).
d = P(δβ dTxA ) = 1 − P(δβ > dTxA ) = 1 − E (1 − β) TxA
T ◦θ TA = 1 − E (1 − β) x (1 − β) 0 TxA TA T = 1 − E (1 − β) x EXT A (1 − β) 0 . x
(14)
Definition 2. A stopping time T is said to be X-standard if T is a.s. finite and if the stopped process (Xn∧T , n 0) is uniformly integrable. According to [ALR04], if T is X-standard and if T is independent of XT , then
−1 ∀α ∈ R E ch(α)−T = E exp(αXT ) . (15) √ Recall that Arg ch(α) = ln α + α2 − 1 . When ch α = (1 − β)−1 , 1 + 2β − β 2 1 1 1 + . − 1 = ln α = Arg ch = ln 1−β 1−β (1 − β)2 1−β According to [ALR04], Tk and TxA satisfy these properties, hence −k 1 + 2β − β 2 T0 Tk = E0 (1 − β) = Ek (1 − β) 1−β ⎡ XT A ⎤−1 x 2 A 2β − β 1 + T ⎦ E (1 − β) x = E ⎣ 1−β
So, owing to the independence of T_x^A and X_{T_x^A} and the above formulae, writing c = (1 + √(2β − β²)) / (1−β),

P(Σ_{δ_β} ≤ x) = 1 − E[(1−β)^{T_x^A}] E[c^{−|X_{T_x^A}|}] = (1/2) (E[c^{|X_{T_x^A}|}] − E[c^{−|X_{T_x^A}|}]) / E[c^{X_{T_x^A}}].

For all k ∈ N,

c^k = ((1 + √(2β − β²)) / (1−β))^k ∼_{β→0+} 1 + k √(2β),

and consequently

P(Σ_{δ_β} ≤ x) ∼_{β→0+} E[|X_{T_x^A}|] √(2β).
Thus we have obtained

Σ_{k=1}^{∞} (1−β)^k P(Σ_k ≤ x) ∼_{β→0+} √(2/β) (1−β) E[|X_{T_x^A}|].

In order to apply Theorem 4, put α = 1 − β. This gives

Σ_{k=1}^{∞} α^k P(Σ_k ≤ x) ∼_{α→1−} (√2 / √(1−α)) E[|X_{T_x^A}|],

and now Theorem 4 with p = 1/2 and C = √2 E[|X_{T_x^A}|] gives

P(Σ_k ≤ x) ∼_{k→∞} (C / Γ(1/2)) k^{−1/2} = (2/(πk))^{1/2} E[|X_{T_x^A}|].
E |XTxA | = E |XTxA | ATxA = x = E |Xx | τ > x = θ(x). It is now possible to finalise the proof of point 1.a. Let T˜0 be the hitting time of 0 by the walk (Xn+k )k0 , and Σ be the maximal length of the excursions of the walk (Xk+n+T˜0 )k0 .
E 1Λn ,Σp x = E [1Λn ,Σn x,T0 ◦θn >p−n ] + E 1Λn ,Σn x,T0 ◦θn (p−n)∧(x−An ),Σ p−n−T0 ◦θn x = (1) + (2)
Penalisation of the Random Walk on Z
355
˜ the measure associated to the walk (Xn+k )k0 , Xn and An being fixed. Call P Then $ % 12 2 ˜ X T˜0 > p − n ∼ E 1Λn ,Σn x |Xn | (1) = E 1Λn ,Σn x P n p→∞ πp Call also P the measure associated to the walk (Xk+n+T˜0 )k0 , T˜0 being fixed. For p > n + x, (p − n) ∧ x − An = x − An ; consequently ˜ X (T˜0 x − An )P Σ x (2) ∼ E 1Λn ,Σn x,An x P ˜ n p−n−T0 p→∞ $ % 12 2 ˜ X (T˜0 x − An ) θ(x) . ∼ E 1Λn ,Σn x,An x P n p→∞ πp One derives therefrom ' E[1Λn 1Σp x ] |Xn | ˜ = E 1Λn + PXn (T˜0 x − An ) 1An x 1Σn x . p→∞ E[1Σp x ] θ(x) lim
˜ et T˜0 , or similar ones, will frequently occur in the Remark 5. These notations P sequel. We have not been completely rigorous when defining them; a rigorous ˜ ˜ (T˜0 x − An ) stands for f (Xn , x − An ) definition is possible as follows: P Xn where f (y, z) = Py (T0 z). We shall now see that (Mn , n 0) is indeed a martingale. The parity of n + 1 comes into play, so we shall consider two cases. Suppose first that n+1 is odd. In that case, Σn+1 = Σn and An+1 = An +1. Recall that x → |x| is harmonic except at 0 for the symmetric random walk. Hence, on the event {Xn = 0}, the only relevant term is ˜ X (T˜0 x − An+1 ), Cn+1 := 1{An+1 x,Σn x} P n+1 and on Xn = 0, it sufices to verify that, when conditioned by Fn , this quantity equals 1 − θ1 1Σn x . By the Markov property, if Xn = 0, ˜ ˜ (T˜0 x − An ) = 1 (P ˜ P (T˜0 x − An − 1) + P (T˜0 x − An − 1)). Xn Xn −1 2 Xn +1 So E[1Xn =0 Cn+1 |Fn ] = E[1Xn =0 (1Xn+1 =Xn +1 + 1Xn+1 =Xn −1 )Cn+1 |Fn ] 1 ˜ ˜ ˜ ˜ = 1Xn =0,Σn x,An x−1 [P Xn +1 (T0 x − An − 1) + PXn −1 (T0 x − An − 1)] 2 ˜ X (T˜0 x − An ) = 1Xn =0,Σn x,An x−1 P n
And, as 1_{X_n ≠ 0, A_n = x} P̃_{X_n}(T̃_0 ≤ x − A_n) = 0, one has

E[1_{X_n ≠ 0} C_{n+1} | F_n] = 1_{X_n ≠ 0} C_n.

It remains to show that

E[1_{X_n = 0} C_{n+1} | F_n] = 1_{X_n = 0, Σ_n ≤ x} (1 − 1/θ).   (16)

This will use the classical result ([Fel50] pp. 73–77)

P(X_1 > 0, …, X_{2n−1} > 0, X_{2n} = 2r) = (1/2)(p_{2n−1, 2r−1} − p_{2n−1, 2r+1}),   (17)

where p_{n,r} = 2^{−n} C_n^{(n+r)/2}. Using formula (17) with x = 2n, one can write

P(τ > x) θ(x) = P(τ > x) E[|X_x| | τ > x] = E[|X_x| 1_{τ > x}]
= E[X_x 1_{τ > x, X_x > 0}] − E[X_x 1_{τ > x, X_x < 0}] = 2 E[X_x 1_{τ > x, X_x > 0}]
= 2 Σ_{k > 0, k even} k P(X_x = k, τ > x) = 4 Σ_{ℓ > 0} ℓ P(X_{2n} = 2ℓ, τ > 2n)
= 2 Σ_{ℓ > 0} ℓ (p_{2n−1, 2ℓ−1} − p_{2n−1, 2ℓ+1}).

Now,

Σ_{ℓ > 0} ℓ (p_{2n−1, 2ℓ−1} − p_{2n−1, 2ℓ+1}) = (1/2)^{2n−1} Σ_{ℓ > 0} ℓ (C_{2n−1}^{n+ℓ−1} − C_{2n−1}^{n+ℓ})

and

Σ_{ℓ > 0} ℓ (C_{2n−1}^{n+ℓ−1} − C_{2n−1}^{n+ℓ}) = Σ_{ℓ=0}^{n−1} C_{2n−1}^{n+ℓ} = 2^{2n−2};

so we obtain

θ(x) P(τ > x) = 1.   (18)

On the other hand,

E[1_{X_n = 0} C_{n+1} | F_n] = 1_{X_n = 0, Σ_n ≤ x} (1/2)(P_1(T_0 ≤ x − 1) + P_{−1}(T_0 ≤ x − 1))
= 1_{X_n = 0, Σ_n ≤ x} P(τ ≤ x) = 1_{X_n = 0, Σ_n ≤ x} (1 − P(τ > x));   (19)

hence, considering (18) and (19), formula (16) is established.

We now consider the case that n + 1 is even. In that case, {A_n ≤ x} = {A_n ≤ x − 1}. Indeed, A_n = n − g_n is odd and x is even by hypothesis, so the event {A_n = x} is null. Moreover, if |X_n| ≥ 3, one has Σ_{n+1} = Σ_n. Last, if |X_n| = 1, there are two cases. Either X_{n+1} ≠ 0, and one always has Σ_{n+1} = Σ_n; or X_{n+1} = 0, and we must see that in that case

{Σ_{n+1} ≤ x} = {Σ_n ≤ x, n + 1 − g_n ≤ x} = {Σ_n ≤ x, A_n ≤ x − 1}.

So one is always on the event {Σ_n ≤ x, A_n ≤ x − 1}, and the same argument as when n + 1 was odd and X_n ≠ 0 shows that, conditional on F_n, M_{n+1} is equal to M_n. This shows that M is a martingale; positivity is immediate. The proof that M is not uniformly integrable is postponed until later in this section.
2) We now start studying the process Σ under Q^x. We shall first show that, for all y ≤ x, Q^x(Σ_∞ > y) = 1 − P(τ > x)/P(τ > y). Put T_y^Σ := inf{n ≥ 0, Σ_n > y}. Clearly, X_{T_y^Σ} = 0, and hence

Q^x(Σ_p > y) = Q^x(T_y^Σ ≤ p) = E[1_{T_y^Σ ≤ p} M_{T_y^Σ}]
= E[1_{T_y^Σ ≤ p} (|X_{T_y^Σ}|/θ(x) + P̃_{X_{T_y^Σ}}(T̃_0 ≤ x − A_{T_y^Σ}) 1_{A_{T_y^Σ} ≤ x}) 1_{Σ_{T_y^Σ} ≤ x}]
= P(T_y^Σ ≤ p, Σ_{T_y^Σ} ≤ x).

Letting p go to infinity, we obtain that Q^x(Σ_∞ > y) = P(Σ_{T_y^Σ} ≤ x). For y ≤ x, {Σ_{T_y^A} ≤ x} is a full event; so

{Σ_{T_y^Σ} ≤ x} = {Σ_{T_y^A} ≤ x} ∩ {T_0∘θ_{T_y^A} + y ≤ x} = {T_0∘θ_{T_y^A} + y ≤ x}.

By the Markov property and Lemma 13,

Q^x(Σ_∞ > y) = E[E[1_{T_0∘θ_{T_y^A} + y ≤ x} | A_{T_y^A}]] = E[P̃_{X_{T_y^A}}(T̃_0 ≤ x − y)]
= E[P̃_{X_y}(T̃_0 ≤ x − y) | τ > y] = 1 − E[P̃_{X_y}(T̃_0 > x − y) 1_{τ > y}] / P(τ > y)
= 1 − E[1_{T_0∘θ_y > x−y, τ > y}] / P(τ > y) = 1 − P(τ > x)/P(τ > y).

On the other hand, for all n ≥ 0, one has Q^x(Σ_n ≤ x) = 1. According to the definition of the probability Q^x,

Q^x(Σ_n ≤ x) = lim_{p→∞} P(Σ_n ≤ x, Σ_p ≤ x)/P(Σ_p ≤ x) = lim_{p→∞} P(Σ_p ≤ x)/P(Σ_p ≤ x) = 1.
3) We shall now describe several properties of g and (A_n, n ≥ 0) under Q^x.

a) We first show that g is Q^x-a.s. finite; this implies that A_∞ = ∞ Q^x-a.s.

Lemma 14. For all n ≥ 0 and k ≥ 0,

P(A_{2n} = 2k) = P(A_{2n+1} = 2k+1) = C_{2n−2k}^{n−k} C_{2k}^{k} 2^{−2n}.

Proof. According to [Fel50] p. 79, “Arcsine law for last visit”,

P(g_{2n} = 2k) = C_{2n−2k}^{n−k} C_{2k}^{k} 2^{−2n}.

Therefore

P(A_{2n} = 2k) = P(2n − g_{2n} = 2k) = P(g_{2n} = 2n − 2k) = C_{2n−2k}^{n−k} C_{2k}^{k} 2^{−2n};

and as A_{2n+1} = A_{2n} + 1, the proof is over.
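Lemma 14's discrete arcsine law can be verified exactly (the sketch is ours; `math.comb` requires Python ≥ 3.8): the weights must sum to 1 and be symmetric between k and n − k.

```python
from fractions import Fraction
from math import comb

def age_law(n):
    """P(A_{2n} = 2k) = C(2n-2k, n-k) C(2k, k) 2^{-2n}, k = 0, ..., n."""
    return [Fraction(comb(2*n - 2*k, n - k) * comb(2*k, k), 4**n)
            for k in range(n + 1)]

law = age_law(12)
assert sum(law) == 1          # a genuine probability law
assert law[0] == law[12]      # arcsine symmetry k <-> n - k
```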
The next lemma is instrumental in the sequel.

Lemma 15. For each p > 0,

Q^x(g > p | F_p) = P̃_{X_p}(τ̃ ≤ x − A_p) (1/M_p).

Proof. Recall that T_{0,p} := inf{n > p, X_n = 0} is the first zero after p, and remark that Σ_{T_{0,p}} = Σ_p ∨ (A_p + τ∘θ_p). Recall also that under Q^x, the event {Σ_p ≤ x} is almost sure. So, for every Λ_p ∈ F_p,

Q^x({Λ_p} ∩ {g > p}) = Q^x({Λ_p} ∩ {T_{0,p} < ∞}) = E[1_{Λ_p} M_{T_{0,p}}] = E[1_{Λ_p} 1_{Σ_{T_{0,p}} ≤ x}]
= E[1_{Λ_p, Σ_p ≤ x} P̃_{X_p}(τ̃ ≤ x − A_p)] = E[1_{Λ_p} M_p P̃_{X_p}(τ̃ ≤ x − A_p)/M_p] = E^{Q^x}[1_{Λ_p} P̃_{X_p}(τ̃ ≤ x − A_p)/M_p],

and consequently one has

Q^x(g > p | F_p) = P̃_{X_p}(τ̃ ≤ x − A_p) (1/M_p).

We now suppose that p = 2l, where l ≥ 0; when p = 2l + 1 the computation is similar, and we won't give it (see Lemma 14). According to Lemma 15,

Q^x(g > p) = E^{Q^x}[E^{Q^x}[1_{g > p} | F_p]] = E^{Q^x}[P̃_{X_p}(τ̃ ≤ x − A_p)(1/M_p)]
= E[P̃_{X_p}(τ̃ ≤ x − A_p)] = Σ_{k=0}^{l∧(x/2)} E[P̃_{X_p}(τ̃ ≤ x − A_p) 1_{A_p = 2k}]
= Σ_{k=0}^{l∧(x/2)} E[P̃_{X_p}(τ̃ ≤ x − A_p) | A_p = 2k] P(A_p = 2k)
= Σ_{k=0}^{l∧(x/2)} E[P̃_{X_{2k}}(τ̃ ≤ x − 2k) | τ > 2k] P(A_p = 2k)
= Σ_{k=0}^{l∧(x/2)} (E[P̃_{X_{2k}}(τ̃ ≤ x − 2k) 1_{τ > 2k}] / P(τ > 2k)) P(A_p = 2k)
= Σ_{k=0}^{l∧(x/2)} (1 − P(τ > x)/P(τ > 2k)) P(A_p = 2k)
= Σ_{k=0}^{l∧(x/2)} C_{2l−2k}^{l−k} C_{2k}^{k} 2^{−2l} (1 − P(τ > x)/P(τ > 2k)).
This gives the law of g under Q^x. Then, for p > 2, Q^x(g > p) ≤ E[1_{A_p ≤ x}]. Now, A_p tends to infinity P-a.s.; consequently,

Q^x(g = ∞) = lim_{p→∞} Q^x(g > p) ≤ lim_{p→∞} P(A_p ≤ x) = 0,

and g is Q^x-a.s. finite.

Remark 6. It is now easy to see that M is not uniformly integrable. Indeed, as g is finite, so is also L_∞, and the argument given earlier for M^ϕ and S immediately adapts to M and L.

b) To establish 2.d.i and 2.d.ii, we shall need:

Lemma 16. For all y ≤ x, one has E[M_{T_y^A}] = 1.

Proof of Lemma 16. Recall that the event {Σ_{T_y^A} ≤ x} has probability 1. By formula (18) and the proof of point 2.a,

E[M_{T_y^A}] = E[|X_{T_y^A}|/θ(x) + P̃_{X_{T_y^A}}(T̃_0 ≤ x − y)]
= θ(y)/θ(x) + E[P̃_{X_{T_y^A}}(T̃_0 ≤ x − y)] = P(τ > x)/P(τ > y) + 1 − P(τ > x)/P(τ > y) = 1.

Let F be a positive functional and G : R → R^+. Recall that after [ALR04], X_{T_y^A} and A_{T_y^A} are independent under P. On the other hand, as M_{T_y^A} is a function of X_{T_y^A}, one has

E^{Q^x}[F(A_n, n ≤ T_y^A) G(X_{T_y^A})] = E[F(A_n, n ≤ T_y^A) G(X_{T_y^A}) M_{T_y^A}] = E[F(A_n, n ≤ T_y^A)] E[G(X_{T_y^A}) M_{T_y^A}].   (20)

So, taking G ≡ 1 and using Lemma 16, one has

E^{Q^x}[F(A_n, n ≤ T_y^A)] = E[F(A_n, n ≤ T_y^A)],

which shows that (A_n, n ≤ T_y^A) has the same law under P and Q^x. Using again formula (20), one obtains

E^{Q^x}[F(A_n, n ≤ T_y^A) G(X_{T_y^A})] = E^{Q^x}[F(A_n, n ≤ T_y^A)] E^{Q^x}[G(X_{T_y^A})];

this shows that (A_n, n ≤ T_y^A) and X_{T_y^A} are independent under Q^x.
P. Debs
c) The rest of the proof of point 2 is quite easy, taking into account what has already been done:
x EQ G XTyA = E G XTyA MTyA = E E G XTyA MTyA ATyA |X | y ˜ X (T˜0 x − y) τ > y +P = E E G(Xy ) y θ(x) ' |Xy | ˜ ˜ = E G (Xy ) + PXy T0 x − y |τ >y θ(x) |X | y ˜ X (T˜0 x − y) τ > y = E G(Xy ) +P y θ(x) |k| + Pk (T0 x − y) P(Xy = k | τ > y). G(k) = θ(x) k
Consequently, the law of XTyA under Qx satisfies |k| Qx XTyA = k = + Pk (T0 x − y) P(Xy = k | τ > y). θ(x) (The quantity P(Xy = k | τ > y) is explicitly given in [Fel50] p. 77). We now compute Qx (g > TyA ): P ˜ X (˜ τ x − y) A
x x x Ty Qx (g > TyA ) = EQ EQ 1g>TyA FTyA = EQ MTyA
P (τ > x) ˜ X (˜ ˜ X (˜ . =E P τ x − y) = E P τ x − y) τ > y = 1 − y A Ty P (τ > y) Last, we now show that An , n TyA and g > TyA are independent under Qx ; we use again the independence of XTyA and ATyA under P.
x x x EQ F (An , n TyA )1g>T A = EQ F (An , n TyA ) EQ 1g>T A ATyA y
y
F (An , n T A ) P ˜ X (˜ τ x − y) y x TA
= EQ
y
MTyA
A ˜ X (˜ E P = E F An , n Ty τ x − y) A Ty
x = EQ F (An , n TyA ) Qx (g > TyA ). 4) To study the process (Xn , n 0) under Qx , we start with the law of the process (Xn , n g). Recall that Γ + = {Xn > 0, n > g} and Γ − = {Xn < 0, n > g}; these events Γ + and Γ − are symmetric under Qx0 : Lemma 17.
1 Qx Γ + = Qx Γ − = . 2
Penalisation of the Random Walk on Z
361
Proof. First remark that Qx (Γ + ) = lim Qx (Xn > 0), n→∞
Qx (Γ − ) = lim Qx (Xn < 0). n→∞
x
By definition of Q , |X | n ˜ X T˜0 x − An 1A x 1Σ x . +P Qx (Xn > 0) = E 1Xn >0 n n n θ(x) Owing to the symmetry of the walk under P, one has |X | n ˜ X T˜0 x − An 1A x 1Σ x Qx (Xn > 0) = E 1Xn <0 +P n n n θ(x) x = Q (Xn < 0). One also has limn→∞ Qx (Xn = 0) = 0 because g is Qx -a.s. finite; and as Qx (Xn > 0) + Qx (Xn < 0) + Qx (Xn = 0) = 2Qx (Xn > 0) + Qx (Xn = 0) = 1, taking limits when n tends to infinity, on obtains Qx (Γ + ) + Qx (Γ − ) = 2Qx (Γ + ) = 1.
We now describe the behavior of (Xn+g , n > 0) under Qx on Γ + (the other case is completely similar). Take a ∈ N∗ and p x, and set qa,a+1 := Q(Xn+1 = a + 1|Xn = a, n > g). qa,a+1 = Q(Xn+1 = a + 1|Xn = a, ∀i p Xn+i > 0) Q(Xn+1 = a + 1, Xn = a, ∀i p Xn+i > 0) = Q(Xn = a, ∀i p Xn+i > 0)
E 1Xn+1 =a+1, Xn =a, ∀ip Xn+i >0 Mp+n
= . E 1Xn =a, ∀ip Xn+i >0 Mp+n X
p+n 1Σn x ; hence we can condition the numerator (resp. the Here Mp+n = Θ(x) denominator) by Fn+1 (resp. Fn ). The Markov property gives
E 1Xn+1 =a+1, Xn =a, Σn x Ea+1 [Xp 1Xi >0,∀ip−1 ] . qa,a+1 = E [1Xn =a, Σn x Ea [Xp 1Xi >0,∀ip ]]
Clearly, (Xp 1Xi >0,∀ip )p0 is a martingale, wherefrom
(a + 1)E 1Xn+1 =a+1, Xn =a, Σn x qa,a+1 = . aE [1Xn =a, Σn x ] Last, conditioning the numerator by Fn one gets qa,a+1 =
a+1 , 2a
the transition probability of a 3-Bessel* walk.
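The limit transition probability (a+1)/(2a) is exactly the Doob h-transform of the simple walk by the harmonic function h(x) = x (killed at 0), which is one way to remember the 3-Bessel* kernel. A quick exact check (ours):

```python
from fractions import Fraction

def pi_star(x):
    """3-Bessel* transition probabilities from x >= 1."""
    return {x + 1: Fraction(x + 1, 2 * x), x - 1: Fraction(x - 1, 2 * x)}

for x in range(1, 50):
    row = pi_star(x)
    assert sum(row.values()) == 1                       # a genuine kernel
    # h-transform of the simple walk by h(x) = x: (1/2) h(x±1)/h(x)
    assert row[x + 1] == Fraction(1, 2) * Fraction(x + 1, x)
    assert row[x - 1] == Fraction(1, 2) * Fraction(x - 1, x)
```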
Recall the following notation:

γ_n := |{k ≤ n, X_k = 0}|,   γ_∞ := lim_{n→∞} γ_n,
τ_1 := T_0,   ∀n ≥ 2, τ_n := inf{k > τ_{n−1}, X_k = 0}.

It remains to show that, conditional on {γ_∞ = l}, (X_u, u ≤ g) is a standard random walk stopped at τ_l and conditioned by {Σ_{τ_l} ≤ x}. Let F be a functional on Z^n. Then

E^{Q^x}[F(X_1, …, X_n) 1_{n ≤ τ_l} | γ_∞ = l] = E^{Q^x}[F(X_1, …, X_n) 1_{n ≤ τ_l} 1_{γ_∞ = l}] / Q^x(γ_∞ = l)
= (E^{Q^x}[F(X_1, …, X_n) 1_{n ≤ τ_l < ∞}] − E^{Q^x}[F(X_1, …, X_n) 1_{n ≤ τ_l < τ_{l+1} < ∞}]) / E^{Q^x}[1_{τ_l < ∞} 1_{τ_{l+1} = ∞}]
= (E^{Q^x}[F(X_1, …, X_n) 1_{n ≤ τ_l < ∞}] − E^{Q^x}[F(X_1, …, X_n) 1_{n ≤ τ_l < τ_{l+1} < ∞}]) / (E^{Q^x}[1_{τ_l < ∞}] − E^{Q^x}[1_{τ_{l+1} < ∞}])
= (E[F(X_1, …, X_n) 1_{n ≤ τ_l < ∞} M_{τ_l}] − E[F(X_1, …, X_n) 1_{n < τ_{l+1} < ∞} M_{τ_{l+1}}]) / (E[1_{τ_l < ∞} M_{τ_l}] − E[1_{τ_{l+1} < ∞} M_{τ_{l+1}}]).

Under P, {τ_l < ∞} has probability 1, and so

M_{τ_l} − M_{τ_{l+1}} = 1_{Σ_{τ_l} ≤ x} (1 − 1_{τ_{l+1} − τ_l ≤ x}) = 1_{Σ_{τ_l} ≤ x, τ_{l+1} − τ_l > x}.

As τ_{l+1} − τ_l is independent of F_{τ_l}, one gets

E^{Q^x}[F(X_1, …, X_n) 1_{n ≤ τ_l} | γ_∞ = l]
= E[F(X_1, …, X_n) 1_{n ≤ τ_l, Σ_{τ_l} ≤ x, τ_{l+1} − τ_l > x}] / E[1_{Σ_{τ_l} ≤ x, τ_{l+1} − τ_l > x}]
= E[F(X_1, …, X_n) 1_{n ≤ τ_l, Σ_{τ_l} ≤ x}] E[1_{τ_{l+1} − τ_l > x}] / (E[1_{Σ_{τ_l} ≤ x}] E[1_{τ_{l+1} − τ_l > x}])
= E[F(X_1, …, X_n) 1_{n ≤ τ_l, Σ_{τ_l} ≤ x}] / E[1_{Σ_{τ_l} ≤ x}]
= E[F(X_1, …, X_n) 1_{n ≤ τ_l} | Σ_{τ_l} ≤ x].
References [ALR04] C. Ackermann, G. Lorang, and B. Roynette, Independence of time and position for a random walk, Revista Matematica Iberoamericana 20 (2004), no. 3, pp. 915–917. [Bil] P. Billingsley, Probability measures. [Deb07] Pierre Debs, P´enalisations de marches al´ eatoires, Th`ese de Doctorat, Ins´ Cartan, 2007. titut Elie
Penalisation of the Random Walk on Z
Canonical Representation for Gaussian Processes

M. Erraoui¹ and E.H. Essaky²

¹ Université Cadi Ayyad, Faculté des Sciences Semlalia, Département de Mathématiques, B.P. 2390, Marrakech, Maroc. E-mail: [email protected]
² Université Cadi Ayyad, Faculté Poly-disciplinaire, Département de Mathématiques et d'Informatique, B.P. 4162, Safi, Maroc. E-mail: [email protected]
Summary. We give a canonical representation for a centered Gaussian process which has a factorizable covariance function with respect to a positive measure. We also investigate this representation in order to construct a stochastic calculus with respect to this Gaussian process.
Mathematics Subject Classification (2000): 60G15, 60H05, 60H07, 46E22. Keywords: Volterra processes; Stochastic integrals; Reproducing kernel Hilbert space; Malliavin calculus; Girsanov theorem.
1 Introduction

Let $\{X_t,\ t\in[0,1]\}$ be a centered Gaussian process with covariance $K(t,s)=E(X_tX_s)$, and let $\mathcal F^X_t$ be the σ-field generated by $\{X_s;\ s\le t\}$. It is well known that the law of a centered Gaussian process is uniquely determined by the covariance function $K(t,s)$. The Gaussian process which has been studied most extensively is, of course, the Wiener process $\{W_t,\ t\in[0,1]\}$. Therefore it is natural to seek a representation of a centered Gaussian process in terms of Wiener processes. It is known that if a centered Gaussian process has a factorizable covariance function of the form
$$K(t,s) = \int_0^1 k(s,u)\,k(t,u)\,du,$$
then $X_t$ has a stochastic integral representation
$$X_t \overset{\text{law}}{=} \int_0^1 k(t,u)\,dW_u. \tag{1}$$
C. Donati-Martin et al. (eds.), S´ eminaire de Probabilit´ es XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6 13, c Springer-Verlag Berlin Heidelberg 2009
The fractional Brownian motion is the most important example of such processes, cf. [AMN01], [DU99] and the references therein. Most representations of type (1) for Gaussian processes hold only in law; no comparison between $\mathcal F^X_t$ and $\mathcal F^W_t$ is then available. So a natural question arises: can this representation hold strongly? In other terms, is it possible to construct a Brownian motion such that $\mathcal F^X_t=\mathcal F^W_t$? This question is closely related to the problem of canonical representation for Gaussian processes, which means that the process $X$ has the representation (1) and moreover $\mathcal F^X_t=\mathcal F^W_t$ for each $t$. The theory of canonical and noncanonical representations for Gaussian processes was initiated by Lévy [Lev65]. Since then this theory has been extended in several directions; see, for example, Hida [Hid60], Cramér [Cra61], Hitsuda [His68], Hibino-Hitsuda-Muraoka [HHM97] and Shepp [She66]. Nevertheless, it seems interesting to investigate the above question in order to construct a stochastic calculus with respect to a Gaussian process. An important class of Gaussian processes is that of Gaussian-Markov processes. It is well known, see for example [RY91] p. 81, that if $K$ is continuous and strictly positive then there exist a continuous function $u$ and a continuous strictly increasing function $v$ such that $K(t,s)=u(t)u(s)v(t\wedge s)$. It follows that there exists a positive measure $\rho$ given by $\rho([0,t])=v(t)$ such that $X$ has the representation in law $X_t=u(t)W_{\rho([0,t])}$, where $\{W_{\rho([0,s])},\ s\in[0,1]\}$ is the time-changed Brownian motion. Moreover, the Gaussian space generated by $X$ is isomorphic to the space $L^2(\rho)$. For example, when $X$ is the standard Brownian bridge on $[0,1]$, then $u(t)=1-t$ and $v(t)=t/(1-t)$; when $X$ is the Ornstein-Uhlenbeck process of parameter $\lambda>0$, $u(t)=\exp(\lambda t)$ and $v(t)=(1-\exp(-2\lambda t))/2\lambda$. It then seems interesting to consider Gaussian processes whose covariance functions are factorizable with respect to a positive measure.
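The Gaussian-Markov factorisation $K(t,s)=u(t)u(s)v(t\wedge s)$ can be sanity-checked numerically for the two pairs $(u,v)$ just quoted. The snippet below is only an illustration; it assumes that the "Ornstein-Uhlenbeck process of parameter $\lambda$" is the solution of $dX=\lambda X\,dt+dW$ with $X_0=0$, whose covariance is $\int_0^{t\wedge s}e^{\lambda(t-r)}e^{\lambda(s-r)}\,dr$, which matches the stated pair $(u,v)$:

```python
import math

def bridge_cov(t, s):
    # covariance of the standard Brownian bridge on [0,1]
    return min(s, t) - s * t

def markov_cov(u, v, t, s):
    # u(t) u(s) v(t ∧ s), the generic Gaussian-Markov factorisation
    return u(t) * u(s) * v(min(s, t))

lam = 0.7
for t in [0.1, 0.35, 0.8]:
    for s in [0.2, 0.5, 0.9]:
        # Brownian bridge: u(t) = 1 - t, v(t) = t / (1 - t)
        got = markov_cov(lambda r: 1 - r, lambda r: r / (1 - r), t, s)
        assert abs(got - bridge_cov(t, s)) < 1e-12
        # OU of parameter lam (assumed dX = lam*X dt + dW, X_0 = 0):
        # u(t) = exp(lam*t), v(t) = (1 - exp(-2*lam*t)) / (2*lam);
        # compare with the closed form of ∫_0^{t∧s} e^{lam(t-r)} e^{lam(s-r)} dr
        got = markov_cov(lambda r: math.exp(lam * r),
                         lambda r: (1 - math.exp(-2 * lam * r)) / (2 * lam), t, s)
        m = min(s, t)
        expected = math.exp(lam * (t + s)) * (1 - math.exp(-2 * lam * m)) / (2 * lam)
        assert abs(got - expected) < 1e-12
```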
In the sequel, we study a class of Gaussian processes having a factorizable covariance function of the form
$$K(t,s) = \int_0^1 k(s,u)\,k(t,u)\,\rho(du), \tag{2}$$
where $\rho$ is a probability measure on $[0,1]$. Let $\{X_t,\ t\in[0,1]\}$ be a centered Gaussian process with covariance of the form (2). Some natural questions are: is it possible to represent $X$ by means of linear transformations of the Wiener process? If so, can this representation be canonical? In answering those questions we construct a Brownian motion $W$ such that $X$ has the following representation:
$$X_t = \int_0^1 k(t,s)\,dW_{\rho([0,s])}; \tag{3}$$
moreover, $\mathcal F^X_{C_t}=\mathcal F^W_t$ for all $t\in[0,1]$, with $C_t=\inf\{s\ge 0,\ A_s:=\rho([0,s])>t\}\wedge 1$ and, as usual, $\inf(\emptyset)=+\infty$. It should be pointed out that, for $\rho(du)=du$ on $[0,1]$, we recover the result of [Hul03], where the representation (1) is given only in law.
The paper is organized as follows. In Section 2, we give a description of the reproducing kernel Hilbert space associated with the kernel $k(t,\cdot)$ which will be useful in the sequel. A canonical representation of the Gaussian process $X$ is given in Section 3. Section 4 is devoted to a stochastic calculus of variations with respect to the Gaussian process $X$; the divergences associated with the processes $X$ and $W$ are related by the formula $\delta^X(u)=\delta^W\big((K^*u)(C_\cdot)\big)$, where $K^*$ is the adjoint of the operator $K$ defined in Section 2. Finally, in Section 5, a Girsanov transformation is given.
2 Reproducing Kernel Hilbert Space

Let $\rho$ be a probability measure on $[0,1]$ and $K$ a kernel given by
$$K(t,s) = \int_0^1 k(s,u)\,k(t,u)\,\rho(du), \tag{4}$$
where $k:[0,1]\times[0,1]\longrightarrow\mathbb R$ satisfies
$$\sup_{t\in[0,1]}\|k(t,\cdot)\|_{L^2(\rho)} < +\infty. \tag{5}$$
Then $K$ is a bounded reproducing kernel. Let $\mathcal H$ be the reproducing kernel Hilbert space associated with $K$. Now we give a description of $\mathcal H$.

Proposition 1. (See Suquet [Suq95], Proposition 1.) Let $E$ be the closed subspace of $L^2(\rho)$ spanned by $\{k(t,\cdot),\ t\in[0,1]\}$.
1. A function $h:[0,1]\longrightarrow\mathbb R$ belongs to $\mathcal H$ if and only if there is a unique $g_h\in E$ such that
$$h(t) = \int_0^1 k(t,u)\,g_h(u)\,\rho(du). \tag{6}$$
2. The scalar product on $\mathcal H$ is given by $\langle h_1,h_2\rangle_{\mathcal H} = \langle g_{h_1},g_{h_2}\rangle_{L^2(\rho)}$, and its associated norm is denoted by $\|\cdot\|_{\mathcal H}$.
3. The representation (6) defines an isometry of Hilbert spaces: $\Psi:\mathcal H\longrightarrow E$, $h\mapsto g_h$.

Now, taking into account the integrability assumption (5) on $k$, we define a bounded operator in $L^2(\rho)$ by
$$(Kh)(t) = \int_0^1 k(t,u)\,h(u)\,\rho(du).$$
In particular, for $h=k(t,\cdot)$ we have
$$\big(Kk(t,\cdot)\big)(s) = \int_0^1 k(t,u)\,k(s,u)\,\rho(du) = K(t,s).$$
Having in mind that any $f\in\mathcal H$ has the representation (6), it follows that $\mathcal H=K(E)$ and $N(K)=E^{\perp}$, where $N$ denotes the null space of $K$. Note that as a vector space $\mathcal H$ is equal to $K(E)$, but the norms on these spaces are different, since for $h\in\mathcal H$
$$\|h\|_{\mathcal H} = \|g_h\|_{L^2(\rho)} \qquad\text{and}\qquad \|h\|_{L^2(\rho)} = \|Kg_h\|_{L^2(\rho)}.$$
It also follows that $\mathcal H$ is the closed subspace spanned by $\{K(t,\cdot),\ t\in[0,1]\}$ for the norm $\|\cdot\|_{\mathcal H}$. Moreover, for all $h\in E$ we have $\|Kh\|_{\mathcal H}=\|h\|_{L^2(\rho)}$, and for all $h\in E^{\perp}$ we get $Kh=0$, which means that $K$ is partially isometric with initial set $(E,\|\cdot\|_{L^2(\rho)})$ and final set $(\mathcal H,\|\cdot\|_{\mathcal H})$. This is equivalent to
$$\|Kh\|_{\mathcal H} = \|P_E h\|_{L^2(\rho)},$$
where $P_E$ is the orthogonal projection of $L^2(\rho)$ on $E$. On the other hand, we have $K^*h=0$ for $h\in(K(E))^{\perp}$, and $\|K^*h\|_{L^2(\rho)}=\|h\|_{\mathcal H}$ for $h\in K(E)$, where $K^*$ is the adjoint operator of $K$. As a consequence we obtain the following relations:
(i) For $f\in L^2(\rho)$, we have $K^*K(f)=P_E(f)$.
(ii) For $f\in\mathcal H$, we have $KK^*(f)=f$.
For more details on partially isometric operators see Chaleyat-Maurel and Jeulin [CJE85] and Kato [Kat95]. Moreover, if we assume that $K$ is injective, which is equivalent to $E=L^2(\rho)$, then $\mathcal H=K(L^2(\rho))$ and $\langle h,Kf\rangle_{\mathcal H} = \langle K^*h,f\rangle_{L^2(\rho)}$ for all $h\in\mathcal H$ and $f\in L^2(\rho)$. Hence the operator $K^*$ is an isometry between $\mathcal H$ and $L^2(\rho)$, that is,
$$\mathcal H = (K^*)^{-1}\big(L^2(\rho)\big). \tag{7}$$
In this case (i) and (ii) become:
(j) For $f\in L^2(\rho)$, we have $K^*K(f)=f$.
(jj) For $f\in\mathcal H$, we have $KK^*(f)=f$.
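The partial-isometry relations above have an elementary finite-dimensional analogue: replace $\rho$ by a discrete probability measure on $m$ points and $k$ by an $n\times m$ array. The sketch below is illustrative code (not from the paper); it checks that $N(K)=E^{\perp}$ and that the representer of $Kh$ is $P_E h$, with a deliberately rank-deficient kernel so that $E\ne L^2(\rho)$:

```python
import math
import random

random.seed(0)
m, n = 8, 5
rho = [random.random() for _ in range(m)]
s = sum(rho); rho = [r / s for r in rho]            # discrete probability measure
A = [[random.gauss(0, 1) for _ in range(m)] for _ in range(n)]
A[n - 1] = [A[0][j] + A[1][j] for j in range(m)]    # dependent rows: E != L2(rho)

def ip(f, g):                                       # L2(rho) inner product
    return sum(fj * gj * rj for fj, gj, rj in zip(f, g, rho))

def K(h):                                           # (Kh)(t_i) = ∫ k(t_i,u) h(u) rho(du)
    return [ip(row, h) for row in A]

basis = []                                          # orthonormal basis of E in L2(rho)
for row in A:
    w = list(row)
    for b in basis:
        c = ip(w, b)
        w = [wj - c * bj for wj, bj in zip(w, b)]
    nrm = math.sqrt(ip(w, w))
    if nrm > 1e-6:
        basis.append([wj / nrm for wj in w])

def P_E(h):                                         # orthogonal projection onto E
    out = [0.0] * m
    for b in basis:
        c = ip(h, b)
        out = [oj + c * bj for oj, bj in zip(out, b)]
    return out

for _ in range(20):
    h = [random.gauss(0, 1) for _ in range(m)]
    diff = [hj - pj for hj, pj in zip(h, P_E(h))]
    assert all(abs(x) < 1e-9 for x in K(diff))      # N(K) = E^perp
    # K h = K(P_E h): the representer g_{Kh} is P_E h, so ||Kh||_H = ||P_E h||
    assert all(abs(a - b) < 1e-9 for a, b in zip(K(h), K(P_E(h))))
```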
3 Strong Representation of Gaussian Processes

Let $\Omega = C_0([0,1],\mathbb R)$ be the Banach space of continuous functions, null at time 0, equipped with the sup-norm and the Borel σ-field $\mathcal F$. The canonical process $X=\{X_t:\ t\in[0,1]\}$ is defined by $X_t(\omega)=\omega(t)$, and the probability measure $P$ on $(\Omega,\mathcal F)$ is the unique measure such that $X$ is centered Gaussian with covariance given by
$$E(X_tX_s) = K(t,s) = \int_0^1 k(s,u)\,k(t,u)\,\rho(du). \tag{8}$$
Denote by $\{\mathcal F^X_t:\ t\in[0,1]\}$ the natural filtration generated by $X$. Let $H$ be the closure in $L^2(\Omega)$ of the space spanned by $\{X_t:\ t\in[0,1]\}$, equipped with the inner product $\langle\xi,\zeta\rangle_H=E(\xi\zeta)$ for $\xi,\zeta\in H$. The reproducing kernel Hilbert space $\mathbb H$ associated with $X$ is the space $R(H)=\{R(\xi):\ \xi\in H\}$ where, for any $\xi\in H$, $R(\xi)$ is the function $R(\xi)(t)=\langle\xi,X_t\rangle_H=E(\xi X_t)$. $\mathbb H$ has inner product $\langle F,G\rangle_{\mathbb H}=\langle R^{-1}F,R^{-1}G\rangle_H$. Since $K(s,\cdot)=R(X_s)$, we clearly have $K(s,\cdot)\in\mathbb H$, hence $\operatorname{span}\{K(t,\cdot),\ t\in[0,1]\}\subseteq\mathbb H$. More precisely, $\mathbb H$ is the closure of the space spanned by $\{K(t,\cdot):\ t\in[0,1]\}$ with respect to the norm associated to $\langle\cdot,\cdot\rangle_{\mathbb H}$. For more details on reproducing kernel Hilbert spaces we refer to Grenander [Gre81] or Janson [Jan97].

Remark 1. It should be pointed out that $\langle K(t,\cdot),K(s,\cdot)\rangle_{\mathbb H} = \langle K(t,\cdot),K(s,\cdot)\rangle_{\mathcal H} = K(t,s)$ for all $s,t\in[0,1]$, and then $\mathbb H=\mathcal H$.

Let $J_X$ be the isometry from $L^2([0,1],\rho)$ onto $H$ defined by $J_X=R^{-1}\circ K$. With this definition we see that $J_X$ maps $k(t,\cdot)\mapsto X_t$; for more details see [Hul03]. Now, consider a deterministic function $k:[0,1]\times[0,1]\longrightarrow[0,+\infty)$ satisfying the following hypotheses:
(H1) $k(0,s)=0$ for all $s\in[0,1]$, and $k(t,s)=0$ for $s>t$.
(H2) There are constants $C,\alpha>0$ such that for all $s,t\in[0,1]$
$$\int_0^1 \big(k(t,r)-k(s,r)\big)^2\,\rho(dr) \le C\,|t-s|^{\alpha}.$$
(H3) $K$ is injective as a transformation of functions in $L^2(\rho)$.
It is easily seen, using assumption (H2), that for sufficiently large $p$ there exists a constant $C_p$ such that
$$E|X_t-X_s|^p \le C_p\Big(\int_0^1 \big(k(t,u)-k(s,u)\big)^2\,\rho(du)\Big)^{p/2} \le C_p\,|t-s|^{\alpha p/2} \le C_p\,|t-s|^{1+\varepsilon},$$
for some $\varepsilon>0$. Hence, by Kolmogorov's continuity criterion, $X$ has a continuous modification. Henceforth, we will consider only this continuous modification.
3.1 Particular Case

In this subsection, we assume that $\rho(ds)=ds$. Hult has shown in [Hul03] that, under hypotheses (H1)-(H3), $X$ has the representation
$$X_t \overset{\text{law}}{=} \int_0^1 k(t,u)\,dB_u,$$
where $B$ is a standard Brownian motion. Our aim is to prove that the above representation holds in the pathwise sense, with a fixed standard Brownian motion constructed on $(\Omega,P)$, and that it is canonical. Following the work [Hul03], we define the Gaussian process $\{W_t:\ t\in[0,1]\}$ by $W_t := J_X(\mathbf 1_{[0,t]})$, $t\in[0,1]$. Now, denote by $H_t$ the closure in $L^2(\Omega)$ of the space spanned by $\{X_s:\ s\in[0,t]\}$ for $t\in[0,1]$. It follows that $J_X$ is an isometry from $L^2([0,t],\rho)$ onto $H_t$. Hence, for all $t\in[0,1]$, $W_t$ is $\mathcal F^X_t$-measurable; therefore $W$ is $\mathcal F^X_t$-adapted.

Lemma 1. (i) For all $s<t$, we have $J_X(\mathbf 1_{[s,t]})=W_t-W_s$.
(ii) $W_t$ is an $(\mathcal F^X_t,P)$-Brownian motion.

Proof. (i) Since $J_X$ is linear we have, for all $s<t$,
$$J_X(\mathbf 1_{[s,t]}) = J_X(\mathbf 1_{[0,t]}) - J_X(\mathbf 1_{[0,s]}) = W_t-W_s.$$
(ii) First, $E(W_tW_s) = \langle K(\mathbf 1_{[0,t]}),K(\mathbf 1_{[0,s]})\rangle_{\mathcal H} = \langle\mathbf 1_{[0,t]},\mathbf 1_{[0,s]}\rangle_{L^2(ds)} = s$ for all $s\le t$. To prove that $W$ is an $\mathcal F^X_t$-Brownian motion, it is sufficient to show that $W_t-W_s$ is independent of $\mathcal F^X_s$ for all $t\ge s$. This is a consequence of the fact that
$$E\big((W_t-W_s)X_r\big) = \langle K(\mathbf 1_{[s,t]}),K(k(r,\cdot))\rangle_{\mathcal H} = \langle\mathbf 1_{[s,t]},k(r,\cdot)\rangle_{L^2(ds)} = \int_s^t k(r,u)\,du = 0, \qquad \forall r\le s.$$
Remark 2. To prove that $W_t$ is an $\mathcal F^X_t$-Brownian motion, it is not sufficient to show only that $W$ is a Gaussian process with covariance $t\wedge s$. For instance, it is clear that the process $\beta$ defined by
$$\beta_t = B_t - \int_0^t \frac{B_u}{u}\,du = \int_0^t \Big(1+\ln\frac{u}{t}\Big)\,dB_u,$$
where $B$ is a Brownian motion, is an $\mathcal F^B_t$-adapted Gaussian process with covariance $t\wedge s$. On the other hand, if it were an $\mathcal F^B_t$-Brownian motion then we would have $\int_0^t \frac{B_u}{u}\,du = 0$ a.s., which is absurd. Moreover, it is easy to see that $E(\beta_s B_t)=0$ for all $s\in[0,t]$. It follows that $\mathcal F^{\beta}_t\subsetneq\mathcal F^B_t$ for all $t>0$. More precisely, for all $t>0$, the Gaussian space $\Gamma_t$ generated by $(\beta_s;\ s\le t)$ is given by
$$\Gamma_t = \Big\{\int_0^t f(u)\,dB_u;\ f\in L^2([0,t],du) \text{ and } \int_0^t f(u)\,du=0\Big\}.$$
For more details on this example, see Jeulin and Yor [JY90] or Yor [Yor92].

The following theorem gives the representation of $X$.

Theorem 1. The process $X$ satisfies
$$X_t = \int_0^t k(t,u)\,dW_u.$$
Furthermore, $\{\mathcal F^X_t:\ t\in[0,1]\} = \{\mathcal F^W_t:\ t\in[0,1]\}$.

Proof. Since $W$ is a standard Brownian motion, the following limit holds in $L^2(\Omega)$:
$$\int_0^t k(t,s)\,dW_s = \lim_{n\to+\infty}\ \sum_{i=1}^n \Big(n\int_{t_{i-1}}^{t_i} k(t,s)\,ds\Big)\,(W_{t_i}-W_{t_{i-1}}), \tag{9}$$
where $t_i=\frac in$, $0\le i\le n$. On the other hand,
$$\sum_{i=1}^n \Big(n\int_{t_{i-1}}^{t_i} k(t,s)\,ds\Big)\,\mathbf 1_{[t_{i-1},t_i[}$$
converges in $L^2([0,1])$ to $k(t,\cdot)$ as $n$ goes to infinity. Applying Lemma 1, we get
$$J_X\Big(\sum_{i=1}^n \Big(n\int_{t_{i-1}}^{t_i} k(t,s)\,ds\Big)\,\mathbf 1_{[t_{i-1},t_i[}\Big)
= R^{-1}\circ K\Big(\sum_{i=1}^n \Big(n\int_{t_{i-1}}^{t_i} k(t,s)\,ds\Big)\,\mathbf 1_{[t_{i-1},t_i[}\Big)
= \sum_{i=1}^n \Big(n\int_{t_{i-1}}^{t_i} k(t,s)\,ds\Big)\,(W_{t_i}-W_{t_{i-1}}).$$
Now, by continuity of $J_X$ and equality (9) we have
$$X_t = \int_0^1 k(t,u)\,dW_u.$$
Then $\{\mathcal F^X_t:\ t\in[0,1]\}\subset\{\mathcal F^W_t:\ t\in[0,1]\}$. The equality follows from (ii) of Lemma 1.
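The process $\beta$ of Remark 2 can be checked by quadrature: its Wiener-integral representation gives $E(\beta_s\beta_t)=\int_0^{s\wedge t}(1+\ln\frac us)(1+\ln\frac ut)\,du$ and $E(\beta_sB_t)=\int_0^s(1+\ln\frac us)\,du$ for $s\le t$. A crude midpoint rule (illustrative only; the logarithmic singularity at 0 is integrable, so the tolerance below is a pragmatic choice) confirms the values $t\wedge s$ and $0$:

```python
import math

def midpoint(fn, a, b, n=200_000):
    # composite midpoint rule on [a, b]
    h = (b - a) / n
    return h * sum(fn(a + (i + 0.5) * h) for i in range(n))

def f(u, t):
    # integrand of the representation beta_t = ∫_0^t (1 + ln(u/t)) dB_u
    return 1.0 + math.log(u / t)

for s, t in [(0.3, 0.8), (0.5, 0.5), (0.2, 1.0)]:
    s, t = min(s, t), max(s, t)
    # E[beta_s beta_t] = ∫_0^s (1+ln(u/s))(1+ln(u/t)) du  should equal s = s ∧ t
    cov = midpoint(lambda u: f(u, s) * f(u, t), 0.0, s)
    assert abs(cov - s) < 1e-2
    # E[beta_s B_t] = ∫_0^s (1+ln(u/s)) du  should vanish for s <= t
    cross = midpoint(lambda u: f(u, s), 0.0, s)
    assert abs(cross) < 1e-2
```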
Remark 3. By using arguments similar to the ones above, one can see that
$$\forall f\in L^2([0,1]),\qquad J_X(f) = \int_0^1 f(s)\,dW_s.$$
For $h\in\mathcal H$, we have $K^*h\in L^2([0,1])$. Then, it follows that
$$J_X(K^*h) = \int_0^1 (K^*h)(s)\,dW_s.$$
In particular, for $h=K(t,\cdot)$, we have
$$J_X\big(K^*(K(t,\cdot))\big) = J_X(k(t,\cdot)) = \int_0^1 K^*(K(t,\cdot))(s)\,dW_s = \int_0^1 k(t,s)\,dW_s = X_t.$$
3.2 General Case

Our aim in this subsection is to give a canonical representation of $X$ when it has a factorizable covariance with respect to a probability measure $\rho$. Set $A_t=\rho([0,t])$. It is well known that the variation of $A$ corresponds to the total variation of $\rho$, and the function
$$C_t = \inf\{s:\ A_s>t\}\wedge 1, \qquad t\in[0,1],$$
is increasing and right-continuous. Moreover $A_{C_t}\ge t$ and $C_{A_t}\ge t$ for every $t$, and $A_t=\inf\{s:\ C_s>t\}$. For every $t\in[0,1]$ we have
$$\int_0^t h(s)\,\rho(ds) = \int_0^t h(s)\,dA_s = \int_0^{A_t} h(C_s)\,ds.$$
Now we assume the following condition:
(H4) $A$ is continuous and $A_0=0$.
Note that, under condition (H4), we have
$$A_{C_t}=t,\qquad \rho([C_{t-},C_t[)=0 \qquad\text{and}\qquad \operatorname{supp}(\rho)=\{s\in[0,1]:\ s=C_{A_s}\}. \tag{10}$$

Lemma 2. The process $W_t=J_X(\mathbf 1_{[0,C_t]})$ is an $(\mathcal F^X_{C_t},P)$-Brownian motion. Moreover, for all $s<t$, we have $J_X(\mathbf 1_{[C_s,C_t]})=W_t-W_s$.

Proof. First, remark that $J_X$ is an isometry from $L^2([0,C_t],\rho)$ onto $H_{C_t}$, where $H_{C_t}$ is the closure in $L^2(\Omega)$ of the space spanned by $\{X_s:\ s\in[0,C_t]\}$ for $t\in[0,1]$. Hence, for all $t\in[0,1]$, $W_t$ is $\mathcal F^X_{C_t}$-measurable; therefore $W$ is $\mathcal F^X_{C_t}$-adapted.
Moreover, we have
$$E(W_tW_s) = \langle K(\mathbf 1_{[0,C_t]}),K(\mathbf 1_{[0,C_s]})\rangle_{\mathcal H} = \langle\mathbf 1_{[0,C_t]},\mathbf 1_{[0,C_s]}\rangle_{L^2(\rho)} = A_{C_s} = s,$$
for all $s\le t$. Since $J_X$ is linear we have, for all $s<t$,
$$J_X(\mathbf 1_{[C_s,C_t]}) = J_X(\mathbf 1_{[0,C_t]}) - J_X(\mathbf 1_{[0,C_s]}) = W_t-W_s.$$
In order to prove that $W$ is an $\mathcal F^X_{C_t}$-Brownian motion, it is sufficient to show that $W_t-W_s$ is independent of $\mathcal F^X_{C_s}$ for all $t\ge s$. This is a consequence of the fact that
$$E\big((W_t-W_s)X_r\big) = \int_{C_s}^{C_t} k(r,u)\,\rho(du) = 0, \qquad \forall r\le C_s,$$
where we have used assumption (H1).

The main result of this subsection is the following.

Theorem 2. The process $X$ satisfies
$$X_t = \int_0^1 k(t,s)\,dW_{A_s}.$$
Furthermore,
$$\mathcal F^X_t \subseteq \mathcal F^W_{A_t} \subseteq \mathcal F^X_{C_{A_t}}, \qquad \forall t\in[0,1],$$
which implies that $\mathcal F^X_{C_t}=\mathcal F^W_t$ for all $t\in[0,1]$.

Proof. Let $\tau_i=\frac in$, $0\le i\le n$, be a subdivision of $[0,1]$. First remark that
$$\int_0^1 k(t,s)\,dW_{A_s} = \int_0^{A_1} k(t,C_s)\,dW_s = \int_0^1 k(t,C_s)\,dW_s.$$
Since $W$ is a standard Brownian motion, the following limit holds in $L^2(\Omega)$:
$$\int_0^1 k(t,C_s)\,dW_s = \lim_{n\to+\infty}\ \sum_{i=1}^n \Big(\frac1{\tau_i-\tau_{i-1}}\int_{\tau_{i-1}}^{\tau_i} k(t,C_u)\,du\Big)\,(W_{\tau_i}-W_{\tau_{i-1}}). \tag{11}$$
On the other hand,
$$\sum_{i=1}^n \Big(\frac1{\tau_i-\tau_{i-1}}\int_{\tau_{i-1}}^{\tau_i} k(t,C_u)\,du\Big)\,\mathbf 1_{[\tau_{i-1},\tau_i[}(s)$$
converges in $L^2([0,1],ds)$ to $k(t,C_s)$ as $n$ goes to infinity. That is,
$$\int_0^1 \Big(\sum_{i=1}^n \Big(\frac1{\tau_i-\tau_{i-1}}\int_{\tau_{i-1}}^{\tau_i} k(t,C_u)\,du\Big)\,\mathbf 1_{[\tau_{i-1},\tau_i[}(s) - k(t,C_s)\Big)^2\,ds \longrightarrow 0,$$
as $n$ goes to infinity. On the other hand, using (10) we obtain
$$\int_0^1 \Big(\sum_{i=1}^n \Big(\frac1{\tau_i-\tau_{i-1}}\int_{\tau_{i-1}}^{\tau_i} k(t,C_u)\,du\Big)\,\mathbf 1_{[\tau_{i-1},\tau_i[}(s) - k(t,C_s)\Big)^2\,ds$$
$$= \int_0^1 \Big(\sum_{i=1}^n \Big(\frac1{\tau_i-\tau_{i-1}}\int_{\tau_{i-1}}^{\tau_i} k(t,C_u)\,du\Big)\,\mathbf 1_{[\tau_{i-1},\tau_i[}(A_s) - k(t,s)\Big)^2\,\rho(ds)$$
$$= \int_0^1 \Big(\sum_{i=1}^n \Big(\frac1{\tau_i-\tau_{i-1}}\int_{\tau_{i-1}}^{\tau_i} k(t,C_u)\,du\Big)\,\mathbf 1_{[C_{\tau_{i-1}-},C_{\tau_i}-[}(s) - k(t,s)\Big)^2\,\rho(ds)$$
$$= \int_0^1 \Big(\sum_{i=1}^n \Big(\frac1{\tau_i-\tau_{i-1}}\int_{\tau_{i-1}}^{\tau_i} k(t,C_u)\,du\Big)\,\mathbf 1_{[C_{\tau_{i-1}},C_{\tau_i}[}(s) - k(t,s)\Big)^2\,\rho(ds).$$
Hence $\displaystyle\sum_{i=1}^n \Big(\frac1{\tau_i-\tau_{i-1}}\int_{\tau_{i-1}}^{\tau_i} k(t,C_u)\,du\Big)\,\mathbf 1_{[C_{\tau_{i-1}},C_{\tau_i}[}$ converges in $L^2(\rho)$ to $k(t,\cdot)$ as $n$ goes to infinity. Applying Lemma 2, we get
$$J_X\Big(\sum_{i=1}^n \Big(\frac1{\tau_i-\tau_{i-1}}\int_{\tau_{i-1}}^{\tau_i} k(t,C_u)\,du\Big)\,\mathbf 1_{[C_{\tau_{i-1}},C_{\tau_i}[}\Big)
= R^{-1}\circ K\Big(\sum_{i=1}^n \Big(\frac1{\tau_i-\tau_{i-1}}\int_{\tau_{i-1}}^{\tau_i} k(t,C_u)\,du\Big)\,\mathbf 1_{[C_{\tau_{i-1}},C_{\tau_i}[}\Big)
= \sum_{i=1}^n \Big(\frac1{\tau_i-\tau_{i-1}}\int_{\tau_{i-1}}^{\tau_i} k(t,C_u)\,du\Big)\,(W_{\tau_i}-W_{\tau_{i-1}}).$$
Now, by continuity of $J_X$ and equality (11) we have
$$X_t = \int_0^1 k(t,C_s)\,dW_s.$$
Since
$$X_t = \int_0^1 k(t,C_s)\,dW_s = \int_0^{A_t} k(t,C_s)\,dW_s,$$
it follows that
$$\mathcal F^X_t \subseteq \mathcal F^W_{A_t}, \qquad \forall t\in[0,1].$$
On the other hand, since $W_t$ is an $\mathcal F^X_{C_t}$-Brownian motion, we have $\mathcal F^W_t\subseteq\mathcal F^X_{C_t}$ for all $t\in[0,1]$, and then
$$\mathcal F^W_{A_t} \subseteq \mathcal F^X_{C_{A_t}}, \qquad \forall t\in[0,1].$$
The proof is then finished by using (10).
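The change-of-variables identity $\int_0^1 h\,d\rho=\int_0^{A_1}h(C_s)\,ds$ and the relation $A_{C_t}=t$ used in the proof can be illustrated on a concrete measure with a flat stretch (so that $C$ jumps). The density and tolerances below are arbitrary choices for this illustration only:

```python
def A(t):
    # A_t = rho([0,t]) for the density 2 on [0,0.3] and 1 on [0.6,1]
    # (no mass on (0.3, 0.6): a flat stretch of A)
    return 2 * min(t, 0.3) + max(0.0, min(t, 1.0) - 0.6)

def C(t):
    # C_t = inf{s : A_s > t} ∧ 1, computed by bisection
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if A(mid) > t:
            hi = mid
        else:
            lo = mid
    return hi

def integral_rho(h, n=20_000):
    # ∫_0^1 h(s) rho(ds), via the density, by a midpoint rule
    out, step = 0.0, 1.0 / n
    for i in range(n):
        s = (i + 0.5) * step
        dens = 2.0 if s <= 0.3 else (1.0 if s >= 0.6 else 0.0)
        out += h(s) * dens * step
    return out

def integral_timechanged(h, n=20_000):
    # ∫_0^{A_1} h(C_s) ds
    a1 = A(1.0)
    step = a1 / n
    return sum(h(C((i + 0.5) * step)) for i in range(n)) * step

for t in [0.1, 0.45, 0.8, 1.0]:
    # A_{C_t} = t holds for every t in the range of A
    assert abs(A(C(A(t))) - A(t)) < 1e-9

for h in [lambda s: s, lambda s: s * s, lambda s: 1.0]:
    assert abs(integral_rho(h) - integral_timechanged(h)) < 2e-3
```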
Remark 4. By using a similar argument, one can see that
$$\forall f\in L^2(\rho),\qquad J_X(f) = \int_0^1 f(C_s)\,dW_s.$$
For $h\in\mathcal H$, we have $K^*h\in L^2(\rho)$. Then, it follows that
$$J_X(K^*h) = \int_0^1 (K^*h)(C_s)\,dW_s.$$
In particular, for $h=K(t,\cdot)$, we have
$$J_X\big(K^*(K(t,\cdot))\big) = J_X(k(t,\cdot)) = \int_0^1 K^*(K(t,\cdot))(C_s)\,dW_s = \int_0^1 k(t,C_s)\,dW_s = X_t.$$
Remark 5. It should be noted that if $A$ is continuous and strictly increasing, then $C$ is also continuous and strictly increasing, and we then have $C_{A_t}=A_{C_t}=t$. Hence
$$\mathcal F^X_t = \mathcal F^W_{A_t}, \qquad \forall t\in[0,1].$$

Example 1.
• The fractional Brownian motion (fBm) with Hurst parameter $H\in[\frac12,1)$ is a centered Gaussian process $B^H$ with covariance function
$$K^H(t,s) = \frac12\big(t^{2H}+s^{2H}-|t-s|^{2H}\big).$$
It is known (see [DU99]) that $B^H$ admits a canonical Volterra type representation
$$B^H_t = \int_0^t k^H(t,u)\,dW_u,$$
where $W$ is a standard Brownian motion, the kernel $k^H$ has the expression
$$k^H(t,s) = c_H\Big(H-\frac12\Big)\,s^{\frac12-H}\int_s^t (u-s)^{H-\frac32}\,u^{H-\frac12}\,du\ \mathbf 1_{[0,t]}(s),$$
and $c_H$ is a normalizing constant given by
$$c_H = \left[\frac{2H\,\Gamma\big(\frac32-H\big)}{\Gamma\big(H+\frac12\big)\,\Gamma(2-2H)}\right]^{1/2}.$$
We denote by $\mathcal S$ the set of step functions on $[0,1]$. Let $\mathcal H_{\mathcal S}$ be the Hilbert space defined as the closure of $\mathcal S$ with respect to the scalar product
$$\langle\mathbf 1_{[0,t]},\mathbf 1_{[0,s]}\rangle_{\mathcal H_{\mathcal S}} = K^H(t,s).$$
The mapping $\mathbf 1_{[0,t]}\mapsto B^H_t$ can be extended to an isometry between $\mathcal H_{\mathcal S}$ and the Gaussian space associated to $B^H$. We will denote this isometry by $f\mapsto\int_0^1 f(s)\,dB^H_s$. Let $f:[0,1]\longrightarrow[0,1]$ be an absolutely continuous function such that $f>0$ and $f'$ is locally square integrable. It is known that both processes $\int_0^t f(s)\,dB^H_s$ and $B^H_{\int_0^t f(s)^{1/H}ds}$ are locally equivalent (see Baudoin and Nualart [BN03]). Moreover,
$$B^H_{\int_0^t f(s)^{1/H}ds} = \int_0^t k^H\Big(\int_0^t f(u)^{\frac1H}\,du,\ \int_0^s f(u)^{\frac1H}\,du\Big)\,f(s)^{\frac1{2H}}\,dW_s.$$
Now, putting $A_s=\int_0^s f(u)^{\frac1H}\,du$, we find
$$B^H_{\int_0^t f(s)^{1/H}ds} = \int_0^t k^H(A_t,A_s)\,f(s)^{\frac1{2H}}\,dW_s
= \int_{C_0}^{C_t} k^H(A_t,s)\,f(C_s)^{\frac1{2H}}\,dW_{C_s}
= \int_0^1 k(t,s)\,dW_{\rho([0,s])},$$
with $k(t,s)=k^H(A_t,s)\,f(C_s)^{\frac1{2H}}\,\mathbf 1_{[C_0,C_t]}(s)$ and $\rho([0,s])=C_s$.

• For each $\gamma>-1$, the weighted process is defined as a Gaussian process $X$ with covariance function of the form $K(t,s)=s^{\gamma}t^{\gamma}(s\wedge t)$ and $X_0=0$. Observe that with $k(t,s)=t^{\gamma}\mathbf 1_{[0,t]}(s)$ and $\rho([0,s])=s$, the covariance $K$ takes the form
$$K(t,s) = \int_0^1 k(t,u)\,k(s,u)\,\rho(du).$$
Thanks to Theorem 2 we have the canonical representation of $X$ as follows:
$$X_t = \int_0^1 k(t,s)\,dW_{\rho([0,s])} = t^{\gamma}\,W_t.$$
• Let $X$ be a Gaussian-Markov process with continuous and strictly positive covariance function $K(t,s)$. It is well known that there exist a continuous function $u$ and a continuous strictly increasing function $v$ such that $K(t,s)=u(t)u(s)v(s\wedge t)$ (see [RY91], p. 81). It also follows from Theorem 4.3 in [RY91] that there exists a probability measure $\rho$ given by $\rho([0,t])=v(t)$. So, with $k(t,s)=u(t)\mathbf 1_{[0,t]}(s)$ and $\rho([0,s])=v(s)$, we have
$$K(t,s) = \int_0^1 k(t,u)\,k(s,u)\,\rho(du).$$
Then we obtain via Theorem 2 the following representation:
$$X_t = \int_0^1 k(t,s)\,dW_{\rho([0,s])} = u(t)\,W_{v(t)}.$$
It should be pointed out that the above representation holds not only in law but strongly.
4 Malliavin Calculus

First recall that the stochastic calculus of variations, or Malliavin calculus, is valid for an arbitrary Gaussian process (see Malliavin [Mal97] and Nualart [Nua95]). The first part of this section is devoted to the orthogonal chaos decomposition for square integrable functionals of our Gaussian process $X$. In the second part we establish relationships between the derivation operators and divergences associated with the processes $X$ and $W$. To the stochastic process $\{X_t,\ t\in[0,1]\}$ we associate the isonormal Gaussian process $\{X(f),\ f\in\mathcal H\}$, defined by $X(f)=J_X(K^*(f))$, $f\in\mathcal H$. Denote by $D^X$ and $\delta^X$ the Malliavin derivative and the Skorohod integral associated with the process $X$. Let $\mathcal S$ be the set of smooth and cylindrical random variables of the form
$$F = f\big(X(\varphi_1),\dots,X(\varphi_n)\big), \tag{12}$$
where $n\ge 1$, $f\in C^{\infty}_b(\mathbb R^n)$ ($f$ and all its derivatives are bounded), and $\varphi_1,\dots,\varphi_n\in\mathcal H$. Given a random variable $F$ of the form (12), we define its derivative as the $\mathcal H$-valued random variable given by
$$D^XF = \sum_{j=1}^n \frac{\partial f}{\partial x_j}\big(X(\varphi_1),\dots,X(\varphi_n)\big)\,\varphi_j.$$
The derivative operator $D^X$ is a closable unbounded operator from $L^p(\Omega)$ into $L^p(\Omega;\mathcal H)$ for any $p\ge 1$. In a similar way, the iterated derivative operator $D^{X,m}$ maps $L^p(\Omega)$ into $L^p(\Omega;\mathcal H^{\otimes m})$. For any positive integer $m$ and real $p\ge 1$, we denote by $\mathbb D^{X,m}_p$ the closure of $\mathcal S$ with respect to the norm defined by
$$\|F\|^p_{X,m,p} = \|F\|^p_{L^p(\Omega)} + \sum_{j=1}^m \big\|D^{X,j}F\big\|^p_{L^p(\Omega;\mathcal H^{\otimes j})}.$$
The domain of $\delta^X$ (denoted by $\operatorname{Dom}\delta^X$) in $L^2(\Omega)$ is the set of elements $u\in L^2(\Omega;\mathcal H)$ such that there exists a constant $c$ verifying
$$\big|E\langle D^XF,u\rangle_{\mathcal H}\big| \le c\,\|F\|_2,$$
for all $F\in\mathcal S$. If $u\in\operatorname{Dom}\delta^X$, $\delta^X(u)$ is the element of $L^2(\Omega)$ defined by the duality relationship
$$E\big(\delta^X(u)\,F\big) = E\langle D^XF,u\rangle_{\mathcal H}, \qquad F\in\mathbb D^{X,1}_2.$$
Let $V$ be a separable Hilbert space. We can similarly define the spaces $\mathbb D^{X,m}_p(V)$ of $V$-valued random variables. Recall that the space $\mathbb D^{X,1}_2(\mathcal H)$ of $\mathcal H$-valued random variables is included in the domain of $\delta^X$, and for any element $u$ in $\mathbb D^{X,1}_2(\mathcal H)$ we have
$$E\big(\delta^X(u)\big)^2 \le E\|u\|^2_{\mathcal H} + E\big\|D^Xu\big\|^2_{\mathcal H\otimes\mathcal H}.$$
A random variable $F$ of the form (12) is said to be a polynomial functional when $f$ is an element of the set of real polynomials in $n$ variables. We will denote by $\mathcal P$ the set of polynomial functionals. For a more complete presentation, see [Nua95]. Consider $\mathcal P_0=\mathbb R$ and, for $n\in\mathbb N^*$, define $\mathcal P_n$ as the closed space spanned in $L^2(P)$ by the elements of $\mathcal P$ of degree less than $n$. Set $C_0=\mathcal P_0$ and suppose that $C_1,\dots,C_n$ are defined. Then, we define $C_{n+1}$ as the orthogonal complement of $C_1\oplus\dots\oplus C_n$ in $\mathcal P_{n+1}$. As for all Gaussian spaces, we have the chaos decomposition:

Theorem 3. $L^2(P)=\underset{n\ge 0}{\oplus}\,C_n$.

This means that every $P$-square integrable functional from $\Omega$ to $\mathbb R$ can be written in a unique way as
$$F = \sum_{n\ge 0} J_nF,$$
where $J_n$ is the orthogonal projection of $L^2(P)$ onto $C_n$. Henceforth we will denote by $D^W$, $\delta^W$, $\mathbb D^{W,m}_p$ the operators and spaces associated with the Wiener process $W$. Now, remark from (jj) that for $\varphi\in\mathcal H$ and $t\in[0,1]$ we have
$$D^W_t\big(X(\varphi)\big) = D^W_t\int_0^1 (K^*\varphi)(C_s)\,dW_s = (K^*\varphi)(C_t).$$
Then, for $F=f(X(\varphi))$ and $t\in[0,1]$, we get $D^W_tF = f'(X(\varphi))\,(K^*\varphi)(C_t)$. It follows that
$$\|F\|^2_{W,1,2} = \|F\|^2_{L^2(\Omega)} + \big\|D^WF\big\|^2_{L^2(\Omega;L^2(dt))}
= \|F\|^2_{L^2(\Omega)} + E\Big(\big(f'(X(\varphi))\big)^2\int_0^1\big[(K^*\varphi)(C_s)\big]^2\,ds\Big)
= \|F\|^2_{L^2(\Omega)} + E\Big(\big(f'(X(\varphi))\big)^2\int_0^1\big[(K^*\varphi)(s)\big]^2\,\rho(ds)\Big).$$
As a consequence we obtain from equality (7) that
$$\mathbb D^{X,1}_2 = (K^*)^{-1}\big(\mathbb L^{W,1}_2\big), \tag{13}$$
where $\mathbb L^{W,1}_2 = \mathbb D^{W,1}_2\big(L^2(\rho)\big)$.

Proposition 2. For any smooth random variable $F$ and any $u\in L^2(\Omega;\mathcal H)$:
$$E\langle D^XF,u\rangle_{\mathcal H} = E\big\langle D^W_{\cdot}F,\ (K^*u)(C_{\cdot})\big\rangle_{L^2(dt)}.$$
Proof. It is sufficient to consider $F$ of the form $f(X(K\varphi))$. In this case we have
$$E\langle D^XF,u\rangle_{\mathcal H} = E\big(f'(X(K\varphi))\,\langle K\varphi,u\rangle_{\mathcal H}\big) = E\big(f'(X(K\varphi))\,\langle\varphi,K^*u\rangle_{L^2(\rho)}\big)$$
$$= E\int_0^1 f'(X(K\varphi))\,\varphi(t)\,(K^*u)(t)\,\rho(dt)
= E\int_0^1 f'(X(K\varphi))\,\varphi(C_t)\,(K^*u)(C_t)\,dt
= E\big\langle D^W_{\cdot}F,\ (K^*u)(C_{\cdot})\big\rangle_{L^2(dt)}.$$

The above proposition and equality (13) have the following consequence:

Corollary 1.
1. For any $\mathcal H$-valued random variable $u$ in $\operatorname{Dom}\delta^X$, we have
$$\delta^X(u) = \delta^W\big((K^*u)(C_{\cdot})\big).$$
2. $(K^*)^{-1}\big(\mathbb L^{W,1}_2\big)$ is included in the domain of $\delta^X$.

It should be noted that for $\rho(ds)=ds$ we obtain $\delta^X(u)=\delta^W(K^*u)$ for any $\mathcal H$-valued random variable $u$ in $\operatorname{Dom}\delta^X$. On the other hand,
$$(K^*)^{-1}\big(\mathbb L^{W,1}_2\big) = (K^*)^{-1}\big(\mathbb D^{W,1}_2(L^2(ds))\big)$$
is included in $\operatorname{Dom}\delta^X$.
5 Girsanov Transformation

For $h\in\mathcal H$, we define
$$\Lambda^h = \exp\Big(\delta^X(h) - \frac12\|h\|^2_{\mathcal H}\Big).$$
Let $h\in\mathcal H$ and let $\tau_h(P)$ be the translate of $P$ by $h$. In this section we look for the law of $\tau_h(P)$. Since $h\in\mathcal H$, there exists $g_h\in L^2(\rho)$ such that
$$h(t) = \int_0^1 k(t,s)\,g_h(s)\,\rho(ds).$$
So we define the following transformation:
$$(TX)_t = X_t + \int_0^1 k(t,u)\,g_h(u)\,\rho(du),$$
which has the law $\tau_h(P)$.
Proposition 3. The Gaussian measure $\tau_h(P)$ is equivalent to $P$ and the density is equal to $\Lambda^h$.

Proof. First we remark that $(TX)_{\cdot}$ has the representation
$$(TX)_t = \int_0^1 k(t,s)\,dW_{\rho([0,s])} + \int_0^1 k(t,s)\,g_h(s)\,\rho(ds).$$
Now we have
$$W_{\rho([0,t])} + \int_0^t g_h(s)\,\rho(ds) = \int_0^{A_t} dW_s + \int_0^{A_t} g_h(C_s)\,ds.$$
The classical Girsanov theorem asserts that the process
$$t \longmapsto W_t + \int_0^t g_h(C_s)\,ds$$
is a Brownian motion $\widetilde W$ under the law $\widetilde P$ defined by
$$\frac{dP}{d\widetilde P}\bigg|_{\mathcal F^W_t} = \exp\Big(\int_0^t g_h(C_s)\,dW_s - \frac12\int_0^t \big(g_h(C_s)\big)^2\,ds\Big).$$
It follows that the process $\int_0^{A_t}dW_s + \int_0^{A_t}g_h(C_s)\,ds$ is equivalent to $\widetilde W_{A_t}$ under the law
$$\frac{dP}{d\widetilde P}\bigg|_{\mathcal F^W_{A_t}} = \exp\Big(\int_0^{A_t} g_h(C_s)\,dW_s - \frac12\int_0^{A_t}\big(g_h(C_s)\big)^2\,ds\Big)
= \exp\Big(\int_0^t g_h(s)\,dW_{A_s} - \frac12\int_0^t \big(g_h(s)\big)^2\,dA_s\Big),$$
which for $t=1$ is $\Lambda^h$.

Corollary 2. If $E(\Lambda^h)=1$, then the law of the process
$$(TX)_t = \int_0^1 k(t,s)\,dW_{\rho([0,s])} + \int_0^1 k(t,s)\,g_h(s)\,\rho(ds),$$
under the probability $\widetilde P$ given by
$$\frac{dP}{d\widetilde P}\bigg|_{\mathcal F^W_{A_t}} = \exp\Big(\int_0^t g_h(s)\,dW_{\rho([0,s])} - \frac12\int_0^t \big(g_h(s)\big)^2\,\rho(ds)\Big),$$
is the same as the law of the canonical process $X_t$ under $P$.

Acknowledgments. The anonymous referee is acknowledged for suggestions on improving the presentation of the paper.
References

[AMN01] Alòs, E., Mazet, O., Nualart, D.: Stochastic calculus with respect to Gaussian processes. The Annals of Probability 29 (2), 766-801 (2001).
[BN03] Baudoin, F., Nualart, D.: Equivalence of Volterra processes. Stochastic Proc. Appl. 107, 327-350 (2003).
[CJE85] Chaleyat-Maurel, M., Jeulin, T.: Grossissement gaussien de la filtration brownienne. Lecture Notes in Math. 1118, Springer, 59-109 (1985).
[Cra61] Cramér, H.: On some classes of non-stationary processes. Proc. 4th Berkeley Sympos. Math. Statist. and Prob. 2, 57-77 (1961).
[DU99] Decreusefond, L., Üstünel, A. S.: Stochastic analysis of the fractional Brownian motion. Potential Anal. 10 (2), 177-214 (1999).
[Gre81] Grenander, U.: Abstract Inference. Wiley, New York (1981).
[Jan97] Janson, S.: Gaussian Hilbert Spaces. Cambridge University Press, Cambridge (1997).
[JY90] Jeulin, T., Yor, M.: Filtrations des ponts browniens, et équations différentielles stochastiques linéaires. Sém. Prob. XXIV, Lect. Notes in Maths 1426, Springer, Berlin, 227-265 (1990).
[HHM97] Hibino, Y., Hitsuda, M., Muraoka, H.: Construction of noncanonical representations of a Brownian motion. Hiroshima Math. J. 27, no. 3, 439-448 (1997).
[Hid60] Hida, T.: Canonical representation of Gaussian processes and their applications. Mem. Coll. Sci. Univ. Kyoto Ser. A 33, 109-155 (1960).
[His68] Hitsuda, M.: Representation of Gaussian processes equivalent to Wiener process. Osaka J. Math. 5, 299-312 (1968).
[Hul03] Hult, H.: Approximating some Volterra type stochastic integrals with applications to parameter estimation. Stochastic Processes and their Applications 105, 1-32 (2003).
[Kat95] Kato, T.: Perturbation Theory for Linear Operators. Reprint of the 1980 edition. Classics in Mathematics. Springer-Verlag, Berlin (1995).
[Lev65] Lévy, P.: Processus Stochastiques et Mouvement Brownien. (1948, second edition). Gauthier-Villars, Paris (1965).
[Mal97] Malliavin, P.: Stochastic Analysis. Springer, New York (1997).
[Nua95] Nualart, D.: The Malliavin Calculus and Related Topics. Springer, Berlin (1995).
[She66] Shepp, L.A.: Radon-Nikodym derivatives of Gaussian measures. Ann. Math. Statist. 37, 321-354 (1966).
[Suq95] Suquet, Ch.: Distances euclidiennes sur les mesures signées et application à des théorèmes de Berry-Esséen. Bull. Belg. Math. Soc. 2, 161-181 (1995).
[RY91] Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion. Springer, Berlin (1991).
[Yor92] Yor, M.: Some Aspects of Brownian Motion. Part I: Some Special Functionals. Lectures in Math. ETH Zürich, Birkhäuser Verlag, Basel (1992).
Recognising Whether a Filtration is Brownian: a Case Study

Michel Émery

IRMA, Université de Strasbourg et C.N.R.S., 7 rue René Descartes, 67084 Strasbourg Cedex, France. e-mail: [email protected]

[...] l'on s'introduit dans l'espace des signes. ([...] one enters the space of signs.)
R. Barthes, L'empire des signes.

Summary. A filtration on a probability space is said to be Brownian when it is generated by some Brownian motion started from 0. Recognising whether a given filtration $\mathcal F=(\mathcal F_t)_{t\ge 0}$ is Brownian may be a difficult problem; but when $\mathcal F$ is Brownian after zero, a necessary and sufficient condition for $\mathcal F$ to be Brownian is available, namely, the self-coupling property (ii) of Theorem 1 of [4]. ('Brownian after zero' means that for each $\varepsilon>0$, the shifted filtration $\mathcal F^{\varepsilon}=(\mathcal F_{\varepsilon+t})_{t\ge 0}$ is generated by its initial σ-field $\mathcal F_{\varepsilon}$ and by some $\mathcal F^{\varepsilon}$-Brownian motion.) In all concrete examples where this self-coupling criterion has been used to establish Brownianity, another, more constructive proof was also available. The situation presented below is different. We are interested in a certain process, introduced in 1991 by Beneš, Karatzas and Rishel; the natural filtration of this process turns out to be also generated by some Brownian motion, but we have not been able to exhibit such a generating Brownian motion; the general, non-constructive criterion is the only proof we know that this filtration is indeed Brownian.
The filtration to be studied is the one generated by the process $Z=(X,Y)$, where $X$ and $Y$ are two Brownian motions linked by the relation
$$\operatorname{sgn}X\,dX + \operatorname{sgn}Y\,dY = 0. \tag{1}$$
This process was first considered by Beneš, Karatzas and Rishel [1], to solve a partially-observed stochastic control problem that turns out not to admit a strict-sense optimal law. I thank Ioannis Karatzas for bringing this reference to my attention and raising the question of the nature of the filtration generated by $Z$. I am also grateful to the Minerva Foundation, which supported my visit to Columbia University, where most of this work was done.

C. Donati-Martin et al. (eds.), Séminaire de Probabilités XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6 14, © Springer-Verlag Berlin Heidelberg 2009
Notation and Conventions. A probability space $(\Omega,\mathcal A,P)$ is always complete; by a sub-σ-field of $\mathcal A$, we always mean an $(\mathcal A,P)$-complete sub-σ-field. A raw filtration $\mathcal F$ on $(\Omega,\mathcal A,P)$ is an increasing family $(\mathcal F_t)_{t\ge 0}$ of sub-σ-fields of $\mathcal A$ (so each of them is $(\mathcal A,P)$-complete); if furthermore $t\mapsto\mathcal F_t$ is right-continuous, that is, if $\mathcal F_t=\bigcap_{\varepsilon>0}\mathcal F_{t+\varepsilon}$ for each $t$, then $\mathcal F$ is simply called a filtration. If $\mathcal F^{\circ}$ is a raw filtration, the filtration $\mathcal F$ generated by $\mathcal F^{\circ}$, i.e., the smallest filtration containing $\mathcal F^{\circ}$, is given by $\mathcal F_t=\bigcap_{\varepsilon>0}\mathcal F^{\circ}_{t+\varepsilon}$. If $\mathcal F$ and $\mathcal G$ are two raw filtrations, $\mathcal F\overset{\circ}{\vee}\mathcal G$ denotes the raw filtration generated by $\mathcal F$ and $\mathcal G$; it is given by $(\mathcal F\overset{\circ}{\vee}\mathcal G)_t=\mathcal F_t\vee\mathcal G_t$ (the small circle is a reminder that $\mathcal F\overset{\circ}{\vee}\mathcal G$ is a raw filtration, not necessarily right-continuous). The filtration generated by $\mathcal F$ and $\mathcal G$, or equivalently by $\mathcal F\overset{\circ}{\vee}\mathcal G$, is denoted by $\mathcal F\vee\mathcal G$.

We will use the convention that a stochastic integral $\int V\,dU$ is always started from 0 (not from $V_0U_0$). $L^U$ will denote the local time at 0 of the continuous semimartingale $U$. Tanaka's well-known formula asserts that $\int\operatorname{sgn}U\,dU = |U|-|U_0|-L^U$; the process $|U|-L^U$ is the Lévy transform of $U$, which we shall denote by $\mathrm TU$. So (1) says that $\mathrm TX+\mathrm TY$ is constant (and hence equal to $|X_0|+|Y_0|$). If $T$ is a stopping time, the process $U^T$ is $U$ stopped at $T$; its value at time $t$ is $U_{t\wedge T}$. The process ${}^TU = U-U^T$ is null up to $T$ and varies as $U$ after $T$.

Before focusing on the filtration of the solution to (1), we shall first describe this process. The interesting case is when $X_0=Y_0=0$, but it will be helpful to consider the more general case when the initial value $Z_0$ is an arbitrary point $z_0=(x_0,y_0)$ in the plane.

Definition. A process $Z=(X,Y)$, defined on some filtered probability space $(\Omega,\mathcal A,P,\mathcal F)$ and taking its values in $\mathbb R^2$, is called a BKR process if $X$ and $Y$ are two $\mathcal F$-Brownian motions (not necessarily started from 0) linked by (1).
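Tanaka's formula and the Lévy transform have an exact discrete analogue on the simple random walk, where the local time counts visits to 0. The following sketch (illustrative only; the convention $\operatorname{sgn}0=0$ is chosen here because it makes the discrete identity exact) verifies $|U_n|=\sum_{k<n}\operatorname{sgn}(U_k)\,\Delta U_k + L_n$ pathwise:

```python
import random

random.seed(1)

def sgn(x):
    # discrete convention: sgn(0) = 0
    return (x > 0) - (x < 0)

# simple random walk U_0 = 0 with ±1 steps
steps = [random.choice((-1, 1)) for _ in range(10_000)]
U = [0]
for s in steps:
    U.append(U[-1] + s)

# discrete Tanaka: |U_n| = sum_{k<n} sgn(U_k) dU_k + L_n,
# where L_n counts the visits of U to 0 before time n
stoch_int, L = 0, 0
for k, s in enumerate(steps):
    stoch_int += sgn(U[k]) * s
    if U[k] == 0:
        L += 1
    assert abs(U[k + 1]) == stoch_int + L

# the discrete Lévy transform |U| - L is then exactly the stochastic integral
assert abs(U[-1]) - L == stoch_int
```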
The next proposition is borrowed from [1]; when z_0 ≠ 0, it entails existence and uniqueness in law of the BKR process Z and shows that the filtration of Z is Brownian.

Proposition 1 (Beneš, Karatzas and Rishel). Fix a point z_0 ≠ 0 in the plane. On some filtered probability space (Ω, A, P, F), let a BKR process Z = (X, Y) started from (X_0, Y_0) = z_0 be given. Define a Brownian motion B by

(2)   dB = − sgn Y dX = sgn X dY ;   B_0 = 0 .

The processes B and Z generate the same filtration. More precisely, there exists a functional Φ such that (X, Y) = Φ(z_0, B) whenever X, Y and B are any three F-Brownian motions satisfying (X_0, Y_0) = z_0 and (2). Conversely, given a Brownian motion B on some filtered probability space, with B_0 = 0, the process Z = (X, Y) defined by Z = Φ(z_0, B) is a BKR process started from z_0 and satisfying (2).
A Brownian Filtration
Proof (borrowed from [1]). Observe first that (2) ⇒ (1); more precisely, the second equality in (2) is equivalent to (1). (The convention for sgn 0 is irrelevant here, since if U and V are two Brownian motions, ∫ 1_{U=0} d⟨V, V⟩ = ∫ 1_{U=0} dt = 0.) The proof of the proposition consists in describing Φ as an algorithm yielding X and Y from the data x_0, y_0 and B.

Here is the first step of this algorithm. Put

S = inf{t ≥ 0 : B_t = x_0 sgn y_0 or B_t = −y_0 sgn x_0} ;

on [0, S], set X = x_0 − (sgn y_0) B and Y = y_0 + (sgn x_0) B. Notice that S is also the first time when X or Y vanishes. On [0, S], (1) and (2) hold; conversely, if X, Y and B are three Brownian motions satisfying (2) and respectively started from x_0, y_0 and 0, one must have X = x_0 − (sgn y_0) B and Y = y_0 + (sgn x_0) B on [0, S].

For the second step, observe that one of X_S and Y_S vanishes, and put

T_0 = S                                    if X_S = 0,
T_0 = inf{t ≥ S : T(^S B)_t = |X_S|}       if X_S ≠ 0 and Y_S = 0;

on the interval [S, T_0], set Y_t = (sgn X_S)(^S B)_t and X_t = X_S − (sgn X_S) T(^S B)_t; remark that T_0 is also the first time after S that X vanishes. On the interval [S, T_0] one has dY = sgn X_S dB and dX = −(sgn X_S)(sgn ^S B) dB = − sgn Y dB, so (2) and (1) hold. Conversely, if X, Y and B satisfy (2), and if T_0 denotes the first time from S on when X = 0, one must have Y = (sgn X_S) ^S B on [S, T_0], and also dX = −(sgn X_S)(sgn ^S B) dB = − sgn X_S dT(^S B) on this interval; consequently, if X_S ≠ 0, T_0 is the first time that T(^S B) = |X_S|.

The proof keeps proceeding in the same way, on successive intervals: suppose that the algorithm manufacturing X and Y from B has been constructed up to some time T_{2n} such that X_{T_{2n}} = 0, and that the so-obtained X and Y have been shown to be the only Brownian motions started from x_0 and y_0 and satisfying (2) on [0, T_{2n}]. Put

T_{2n+1} = inf{t ≥ T_{2n} : T(^{T_{2n}} B)_t = |Y_{T_{2n}}|} ;

and define X and Y on [T_{2n}, T_{2n+1}] by

X = −(sgn Y_{T_{2n}}) ^{T_{2n}} B ;   Y = Y_{T_{2n}} − (sgn Y_{T_{2n}}) T(^{T_{2n}} B) .

The same arguments as before show that the validity and the uniqueness of the construction extend up to T_{2n+1}, which is the first time after T_{2n} such that Y = 0. Define then

T_{2n+2} = inf{t ≥ T_{2n+1} : T(^{T_{2n+1}} B)_t = |X_{T_{2n+1}}|} ;

and on [T_{2n+1}, T_{2n+2}] set

Y = (sgn X_{T_{2n+1}}) ^{T_{2n+1}} B ;   X = X_{T_{2n+1}} − (sgn X_{T_{2n+1}}) T(^{T_{2n+1}} B) .

This extends the construction to [T_{2n+1}, T_{2n+2}], and T_{2n+2} is the first time that X vanishes after T_{2n+1}.
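The first step of this algorithm is easy to mock up on a grid. The sketch below is ours, not part of the paper; the initial point z_0 = (x_0, y_0) = (1, 0.5) is an arbitrary choice. It computes the exit time S of B and checks that, strictly before S, neither coordinate has vanished and the point (X, Y) moves on the side x + y = x_0 + y_0 of the square |x| + |y| = |x_0| + |y_0|, while at S one coordinate has just hit (or overshot) zero.

```python
import numpy as np

rng = np.random.default_rng(1)

# Driving Brownian motion B, started from 0, on a horizon long enough that
# it surely exits the relevant interval within the simulation.
n, dt = 500_000, 1e-4
B = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))))

# First step of the algorithm Φ:  S = inf{t : B_t = x0 sgn y0 or B_t = -y0 sgn x0}.
x0, y0 = 1.0, 0.5
exit_idx = np.where((B >= x0 * np.sign(y0)) | (B <= -y0 * np.sign(x0)))[0]
assert exit_idx.size > 0
S = exit_idx[0]

X = x0 - np.sign(y0) * B[: S + 1]
Y = y0 + np.sign(x0) * B[: S + 1]

# Strictly before S, neither coordinate has vanished, and Z = (X, Y) stays
# on the side x + y = x0 + y0 of the square |x| + |y| = |x0| + |y0|.
assert np.all(X[:-1] > 0) and np.all(Y[:-1] > 0)
assert np.allclose(np.abs(X[:-1]) + np.abs(Y[:-1]), abs(x0) + abs(y0))
# At S, one of the two coordinates has just hit (or overshot) zero.
assert X[-1] <= 0 or Y[-1] <= 0
```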
To complete the proof, it remains to show that T_∞ = lim_n T_n is a.s. infinite. On the event {T_∞ < ∞}, one has X_{T_∞} = lim X_{T_{2n}} = 0 and similarly Y_{T_∞} = lim Y_{T_{2n+1}} = 0. This is impossible, because Tanaka's formula gives

d|X| + d|Y| = sgn X dX + dL^X + sgn Y dY + dL^Y = dL^X + dL^Y ;

so |X| + |Y| is increasing, and consequently |X_{T_∞}| + |Y_{T_∞}| ≥ |x_0| + |y_0| > 0 on {T_∞ < ∞}.

This proof is more informative than the statement of the proposition; it gives an intuitive description of how the BKR process behaves. The point Z_t lives on the sides of the random square |x| + |y| = |x_0| + |y_0| + L^X_t + L^Y_t, whose vertices are on the axes. When Z is on one side of this square, it moves Brownianly on this side, which remains fixed. The square inflates only when Z is at a vertex of the square, or equivalently when Z is on one of the axes; at those times, the square expands according to the local time spent by Z on this axis. And dB measures the movement of Z, when the square is given the same orientation as the trigonometric circle.

Our aim is to study BKR processes issued from the origin; but we shall first state and prove a few elementary lemmas concerning the behaviour of filtrations. These lemmas will be needed later, to establish existence and uniqueness in law of BKR processes.

Definition. If F and G are two raw filtrations on (Ω, A, P), one says that F is immersed in G, and one writes F ⊂ᵐ G, if (F_t ⊂ G_t for each t ≥ 0 and) every F-martingale is a G-martingale.

Lemma 1. If F and G are two raw filtrations and if F_t ⊂ G_t for all t ≥ 0, the following four statements are equivalent:
(i) F is immersed in G;
(ii) for each t ≥ 0, the operators of conditional expectation satisfy E^{G_t} E^{F_∞} = E^{F_t};
(iii) for each t ≥ 0, the operators of conditional expectation satisfy E^{F_∞} E^{G_t} = E^{F_t};
(iv) for each t ≥ 0, the σ-fields F_∞ and G_t are conditionally independent given F_t.

Proof. First, (i) holds if and only if every uniformly integrable F-martingale is a G-martingale, that is, if and only if E^{G_t} J = E^{F_t} J for each J ∈ L¹(F_∞); this can be rewritten E^{G_t} E^{F_∞} = E^{F_t} E^{F_∞}, which is tantamount to (ii).

Equivalence between (ii), (iii) and (iv) stems immediately from the following classical exercise: if three sub-σ-fields B, C and D of A satisfy D ⊂ B ∩ C,
a necessary and sufficient condition for B and C to be conditionally independent given D is E^B E^C = E^D. Indeed, if conditional independence holds, for all B ∈ L^∞(B) and C ∈ L^∞(C) one has E^D[BC] = E^D[B] E^D[C] = E^D[B E^D C], whence E^B C = E^D C, and finally E^B E^C = E^D. Conversely, from E^B E^C = E^D one derives E^D[BC] = E^D[B E^B C] = E^D[B E^D C] = E^D[B] E^D[C].

Lemma 2. Let F° and G° be two raw filtrations; call F and G the filtrations respectively generated by F° and G°. If F° is immersed in G°, then F is immersed in G.

Proof. By Lemma 1 (ii), one has E^{G°_t} E^{F°_∞} = E^{F°_t} for all t. Replacing t by t + ε and letting ε ↓ 0 gives E^{G_t} E^{F_∞} = E^{F_t} by reverse martingale convergence. Then, by Lemma 1 (ii) again, one obtains F ⊂ᵐ G.

Lemma 3 (preservation of immersions by enlargements). Let F, G and S be three filtrations such that the final σ-fields S_∞ and G_∞ are conditionally independent given F_∞. If F is immersed in G, the raw filtration F ∨° S is immersed in the raw filtration G ∨° S, and the filtration F ∨ S is immersed in the filtration G ∨ S.

Proof. Every bounded, G_t-measurable r.v. G_t is conditionally independent of S_∞ (or of F_∞ ∨ S_∞) given F_∞; consequently E^{F_∞ ∨ S_∞} G_t = E^{F_∞} G_t. In turn, E^{F_∞} G_t = E^{F_t} G_t by immersion of F in G and Lemma 1 (iii); so E^{F_∞ ∨ S_∞} G_t = E^{F_t} G_t. If now S_t is any bounded, S_t-measurable r.v., one has E^{F_∞ ∨ S_∞}[G_t S_t] = S_t E^{F_∞ ∨ S_∞} G_t = S_t E^{F_t} G_t; as this is measurable in F_t ∨ S_t, one gets E^{F_∞ ∨ S_∞}[G_t S_t] = E^{F_t ∨ S_t}[G_t S_t]. A monotone class argument then gives E^{F_∞ ∨ S_∞} H_t = E^{F_t ∨ S_t} H_t for all H_t in L^∞(G_t ∨ S_t). Hence E^{F_∞ ∨ S_∞} E^{G_t ∨ S_t} = E^{F_t ∨ S_t} E^{G_t ∨ S_t} = E^{F_t ∨ S_t}, and F ∨° S is immersed in G ∨° S by Lemma 1 (iii). Immersion of F ∨ S in G ∨ S follows by Lemma 2.

Remark 1. Lemma 3 captures some situations where an immersion is preserved by an enlargement, but not all such situations. A trivial counterexample is obtained by taking S = G and F degenerate: F is then immersed in G and F ∨ S is immersed in G ∨ S, but S and G are not independent.

Two particular cases of Lemma 3 are noteworthy:

Corollary 1. Let F, G and S be three filtrations such that S_∞ ⊂ F_∞. If F is immersed in G, the raw filtration F ∨° S is immersed in the raw filtration G ∨° S, and the filtration F ∨ S is immersed in the filtration G ∨ S.

Proof. Given F_∞, the σ-field S_∞ is conditionally independent of anything.

Corollary 2. Let F, G′ and G″ be three filtrations such that F is immersed in G′ and in G″. If the terminal σ-fields G′_∞ and G″_∞ are conditionally independent given F_∞, the three filtrations F, G′ and G″ are immersed in the filtration G′ ∨ G″ generated by G′ and G″.
Proof. Lemma 3 with S = G″ and G = G′ gives G′ ⊂ᵐ G′ ∨ G″; similarly, one has G″ ⊂ᵐ G′ ∨ G″, and F ⊂ᵐ G′ ∨ G″ follows by transitivity of immersions.

We can now come back to the BKR process Z = (X, Y) started from 0. The behaviour of (|X|, |Y|) is an immediate consequence of a well-known property of the Lévy transform:

Proposition 2. If X and Y are two Brownian motions defined on some probability space (Ω, A, P), started from 0 and satisfying (1), one has

|X_t| = W_t − I_t   and   |Y_t| = S_t − W_t ,

where W is the Brownian motion defined by

W_t = ∫₀ᵗ sgn X_s dX_s = − ∫₀ᵗ sgn Y_s dY_s ,

and where I_t = inf_{s∈[0,t]} W_s and S_t = sup_{s∈[0,t]} W_s. Moreover, the three processes |X|, |Y| and W generate the same filtration.

Proof. The Lévy transform of X is the Brownian motion W_t = ∫₀ᵗ sgn X_s dX_s = |X_t| − L^X_t. Putting g_t = sup{s ∈ [0, t] : X_s = 0}, one has for s ∈ [0, t]

W_s = |X_s| − L^X_s ≥ 0 − L^X_t = |X_{g_t}| − L^X_{g_t} = W_{g_t} ,

wherefrom I_t = W_{g_t} = −L^X_t and W_t = |X_t| + I_t. The proof of W_t = S_t − |Y_t| is similar. As was observed by Lévy, the process W = S − |Y| is adapted to the filtration of |Y| (see for instance Remark 2.25 in Chapter 6 of [6]); conversely, |Y| = S − W is adapted to the filtration of W. Hence |Y| and W generate the same filtration; so does also |X|, for a similar reason.

Proposition 2 describes the process (|X|, |Y|) as a functional of the Brownian motion W, namely (|X|, |Y|) = Ψ(I, W, S), where Ψ(i, w, s) = (w − i, s − w). This implies in particular uniqueness in law of (|X|, |Y|): if (X′, Y′) and (X″, Y″) are any two solutions to (1) started from (0, 0), the processes (|X′|, |Y′|) and (|X″|, |Y″|) have the same law. For fixed time t, the joint law of (I_t, W_t, S_t) can be explicitly written (see formula 1.1.15.8 in Borodin and Salminen [2]); the law of (|X_t|, |Y_t|) can be derived therefrom.

Observe that Proposition 2 needs the Brownian motions X and Y to be linked by (1), but they do not have to form a BKR process, that is, to be Brownian motions for some common filtration. It is not difficult to see that all solutions (X, Y) to (1), started from (0, 0) and not constrained to be Brownian motions for some common filtration, are obtained (in law) by the following procedure: First, construct (|X|, |Y|) as Ψ(I, W, S) for some Brownian motion
W started from 0. Then, conditionally on (|X|, |Y |), construct X by choosing the signs of the excursions of X as an i.i.d. sequence uniform on {−, +}. Last, do the same for the signs of the excursions of Y . Conditional on (|X|, |Y |), each of the two sequences giving the signs of X and the signs of Y is i.i.d. and uniform on {−, +}, but their joint conditional law given (|X|, |Y |) is arbitrary: these two sequences may be correlated in any way. This assertion is an easy consequence of the following fact from excursion theory: A process U is a Brownian motion if and only if |U | is a Brownian motion reflected at the origin and, conditional on |U |, the signs of the excursions of U form an i.i.d. sequence uniform on {−, +}. To make this statement rigorous, we need a formal definition of the sequence of excursion signs of a process; for the sake of definiteness, we shall use the following one. Definition. Fix once and for all a dense sequence (rk ) in (0, ∞). If U is a Brownian motion, call Jk the (a.s. well defined) excursion interval of U straddling rk . Define a sub-sequence (Jn ) by deleting from the sequence (Jk ) any interval that already occurs earlier in that sequence; this sub-sequence is a.s. infinite. The sequence of excursion signs of U is ε = (εn ), where εn is the sign of U during Jn . Proposition 3. On a suitable filtered probability space (Ω, A, P, F), there exists a BKR process started from the origin. Proof. Start with an independent triple (W, ε, η), where W is a Brownian motion started from 0, and ε and η are two i.i.d. sequences, uniform on {−, +}. Define I and S from W as in Proposition 2, and (with an abuse of notation) put |X| = W − I and |Y | = S − W . Define X from |X| by choosing the sequence of excursion signs of X equal to ε and similarly define Y so that the signs of its excursions are given by η. As ε and η are independent of the reflected Brownian motions |X| and |Y |, both X and Y are Brownian motions. 
As W and W − I generate the same filtration, W is the martingale part of |X|, i.e., dW = sgn X dX; similarly, dW = − sgn Y dY, so (1) holds. It remains to see that X and Y are F-Brownian motions for some filtration F. Call W (resp. X, Y) the natural filtration of W (resp. X, Y). The process W = ∫ sgn X dX is an X-Brownian motion, whence W ⊂ᵐ X; similarly, W ⊂ᵐ Y. Now, X_∞ = σ(W, ε) and Y_∞ = σ(W, η) are conditionally independent given W_∞ = σ(W). So Corollary 2 applies, and X and Y are immersed in X ∨ Y = F; thus X and Y are F-Brownian motions.

Uniqueness in law, established in Proposition 1 for BKR processes started from z_0 ≠ 0, also holds when z_0 = 0. In other words, every solution to (1), where X and Y are F-Brownian motions started from 0, can be obtained as in the proof of Proposition 3. This will be established in Proposition 4; we first state a small lemma on the signs of Brownian excursions.
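The construction in the proof of Proposition 3 can be sketched numerically; the code below is ours, not part of the paper, and the grid sizes and seeds are arbitrary. From a single Brownian path W one forms |X| = W − I, which vanishes exactly at the times where W attains a new running minimum; its excursions are the maximal runs where it is positive, and an independent fair coin is flipped for the sign of each of them. A Monte Carlo check also compares the mean of W₁ − I₁ with E|N(0,1)| = √(2/π), which is what Lévy's theorem predicts for a reflected Brownian motion.

```python
import numpy as np

rng = np.random.default_rng(2)

# A discretised Brownian motion W on [0, 1], its running minimum I, and the
# candidate reflected path R = |X| = W - I.
n = 200_000
W = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(1.0 / n), n))))
I = np.minimum.accumulate(W)
R = W - I                      # vanishes exactly when W attains a new minimum
assert R.min() >= 0

# Excursions of R away from 0 are the maximal runs where R > 0; each one
# receives an independent uniform sign, playing the role of the sequence ε.
pos = R > 0
starts = pos & ~np.concatenate(([False], pos[:-1]))
label = np.cumsum(starts) * pos                # excursion number; 0 on zero set
signs = np.concatenate(([0.0], rng.choice([-1.0, 1.0], size=label.max())))
X = signs[label] * R

assert np.array_equal(np.abs(X), R)   # X has the prescribed absolute value |X|
assert label.max() > 50               # many excursions, each with its own sign

# Monte Carlo check of the law at t = 1:  W_1 - I_1 should be distributed as
# |N(0, 1)|, whose mean is sqrt(2/pi) ≈ 0.798.
m, steps = 20_000, 1_000
sqdt = np.sqrt(1.0 / steps)
Wm, Im = np.zeros(m), np.zeros(m)
for _ in range(steps):
    Wm += sqdt * rng.normal(size=m)
    np.minimum(Im, Wm, out=Im)
assert abs((Wm - Im).mean() - np.sqrt(2.0 / np.pi)) < 0.05
```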
Lemma 4. If X is an F-Brownian motion, then for 0 < s ≤ t the conditional law of sgn X_t, knowing F_s and the whole process |X|, is given by

L(sgn X_t | F_s ∨ σ(|X|)) = δ_{sgn X_s}       if X does not vanish between s and t,
L(sgn X_t | F_s ∨ σ(|X|)) = ½ (δ_− + δ_+)     if X vanishes between s and t.

Proof. Observe that the event {X does not vanish between s and t} belongs to σ(|X|). If F = X, where X is the natural filtration of X, the lemma is an easy consequence of the structure of X: given |X|, the excursion signs of X are i.i.d. and uniform on {−, +}.

In the general case, since X is an F-Brownian motion, its natural filtration X is immersed in F. Denote by S the constant filtration such that S_t = σ(|X|) for all t. Corollary 1 gives the immersion of X ∨° S in F ∨° S, which entails

L(sgn X_t | F_s ∨ σ(|X|)) = L(sgn X_t | X_s ∨ σ(|X|)) ;

so the result established for X carries over to F.

Proposition 4 (uniqueness in law of BKR processes started at 0). On a filtered probability space (Ω, A, P, F), let (X, Y) be a BKR process started from the origin. The process (|X|, |Y|) (whose law is given by Proposition 2), the sequence of excursion signs of X, and the sequence of excursion signs of Y are independent.

Proof. Recall from Proposition 2 that the processes |X|, |Y| and W generate the same filtration, and a fortiori the same σ-field. In particular, the excursion intervals of X and those of Y are σ(W)-measurable. To establish the proposition, it suffices to show that, conditional on σ(W), the excursion signs of X and the excursion signs of Y are uniformly distributed on {−, +} and independent. This amounts to verifying that, for each n, the conditional joint law

L(sgn X_{r_1}, …, sgn X_{r_n}, sgn Y_{r_1}, …, sgn Y_{r_n} | W)

makes all these signs uniform and independent or equal, with equality holding for, and only for, the signs of the same excursion of the same process. By chronologically re-ordering the r_k and conditioning, it suffices to check that, for fixed 0 < s < t, the conditional law L((sgn X_t, sgn Y_t) | F_s ∨ σ(W)) is the uniform law on

• the singleton {(sgn X_s, sgn Y_s)} if neither |X| nor |Y| vanishes between s and t;
• the doubleton {(−, sgn Y_s), (+, sgn Y_s)} if |X| vanishes between s and t, but |Y| does not;
• the doubleton {(sgn X_s, −), (sgn X_s, +)} if |X| does not vanish between s and t, but |Y| does;
• the 4-point set {−, +}² if both |X| and |Y| vanish between s and t.

The first case is trivial. Call g^X_t (resp. g^Y_t) the last zero of X (resp. of Y) on [0, t]; these random variables are σ(W)-measurable. Note that the event {g^X_t = g^Y_t} is negligible, since W cannot simultaneously reach its current maximum and its current minimum. The second case deals with the
σ(W)-event {g^Y_t < s < g^X_t}. On this event, sgn Y_t = sgn Y_s, and the conditional law L(sgn X_t | F_s ∨ σ(W)) is uniform on {−, +} by Lemma 4; the claim follows. The third case is similar to the second one, by exchanging X and Y. The fourth case takes place on the event {g^X_t > s and g^Y_t > s}. By symmetry, we may work on the event {s < g^Y_t < g^X_t}; and by considering all rational u, it suffices to work on {s < g^Y_t < u < g^X_t}; notice that this event is in σ(W). To compute E^{F_s ∨ σ(W)}[f(sgn X_t) g(sgn Y_t)] on that event, replace E^{F_s ∨ σ(W)} by E^{F_s ∨ σ(W)} E^{F_u ∨ σ(W)}; this separates the operations on X and Y, and applying Lemma 4 twice finishes the proof.
Another proof of Proposition 4 is possible: instead of establishing Lemma 4 in full generality, we only need to know it in the particular setting considered here, namely when F is generated by two Brownian motions linked by (1). As interest focuses on what happens between s and t, the behaviour at time 0 has no bearing on the result, so independence of the signs chosen at the beginning of all excursions can be deduced from Propositions 3 (existence) and 1 (uniqueness in law when z_0 ≠ 0). We have preferred a more general argument because we find Lemma 4 interesting in its own right, and because Lemma 3 was needed anyway in our proof of existence.

With Propositions 1, 3 and 4, we know existence and uniqueness in law of the BKR process started from any initial position in the plane. Here is an immediate consequence of Proposition 4:

Corollary 3. The law of the BKR process started from the origin is invariant under the eight transformations (x, y) ↦ (±x, ±y) and (x, y) ↦ (±y, ±x) (the planar isometries preserving or exchanging the axes). In particular, for fixed t > 0, the conditional law of (X_t, Y_t) given (|X_t|, |Y_t|) is uniform on the four points (±|X_t|, ±|Y_t|).

Together with the remark, made after Proposition 2, that for fixed t the joint law of (|X_t|, |Y_t|) can be explicitly computed, this corollary makes it possible to write the planar density of the joint law of (X_t, Y_t). Other consequences of Proposition 4 are scaling invariance, time-inversion invariance, and the Markov property:

Corollary 4. The BKR process started from the origin has the same scaling property as planar Brownian motion: for any real λ ≠ 0, the law of the process is invariant under the space-time change (x, y, t) ↦ (λx, λy, λ⁻²t).

Proof. If (X_t, Y_t)_{t≥0} is a BKR process started from (0, 0), so is also the process (λX_{t/λ²}, λY_{t/λ²})_{t≥0}; their laws are equal by Proposition 4.

Corollary 5. The law of the BKR process started from the origin is invariant under the space-time transformation (x, y, t) ↦ (tx, ty, 1/t).

Proof. The processes X′_t = t X_{1/t} and Y′_t = t Y_{1/t} are Brownian motions, and |X′| + |Y′| remains constant on any time-interval during which neither X′ nor
Y′ ever vanishes; so ∫ 1_{X′Y′≠0} (sgn X′ dX′ + sgn Y′ dY′) = 0. This entails that Z′ = (X′, Y′) is a BKR process, since the amount of time ∫ 1_{X′Y′=0} dt spent on the axes is null. Hence Z′ has the same law as Z.

Corollary 6. The BKR process Z is strong Markov:
if T is a stopping time and f a bounded functional, E[f(Z_{T+t}, t ≥ 0) | F_T] = ∫ f(w) ν^{Z_T}(dw), where, for each z ∈ R², ν^z denotes the law of the BKR process started from z.

Proof. If X and Y are two F-Brownian motions linked by (1), then (X_{T+t})_{t≥0} and (Y_{T+t})_{t≥0} are two (F_{T+t})_{t≥0}-Brownian motions linked by (1).

Corollary 7. Let F be some filtration. A given process Z is a BKR process for F if and only if its law is that of a BKR process and its natural filtration is immersed in F.

Proof. Call Z the natural filtration of Z. If Z is a BKR process for F, the conditional law L((Z_{t+h}, h ≥ 0) | F_t) equals ν^{Z_t} by Corollary 6; hence L(Z | F_t) is Z_t-measurable, and, conditional on Z_t, Z is independent of F_t. So Lemma 1 (iv) says that Z is immersed in F. Conversely, suppose Z to have the law of a BKR process and Z to be immersed in F. Then L((X, Y) | F_t) = L(Z | Z_t); hence X and Y are F-Brownian motions, and they satisfy Equation (1), which is a property of their joint law.

When the initial condition z_0 is not the origin, Proposition 1 says that the filtration generated by the BKR process Z is also generated by the Brownian motion B; so this filtration is Brownian. When z_0 = 0, formula (2) still defines a Brownian motion B, but B now contains strictly less information than Z; for instance, −Z is another BKR process, but changing Z to −Z does not change B. (More generally, the rotations with angle kπ/2 preserve the law of Z and do not change B. There probably exist uncountably many other path transforms which preserve the law of Z without changing B, but describing them seems to be difficult.) Anyway, our purpose is not to investigate the loss of information from Z to B, but to describe the filtration generated by Z when z_0 = 0. Call F this filtration; by hypothesis, X and Y are F-Brownian motions. Proposition 1 and Corollary 6 imply that for each s > 0, the shifted filtration (F_{s+t})_{t≥0} is generated by the σ-field F_s and the Brownian motion (B_{s+t} − B_s)_{t≥0}.

With the vocabulary of [4], this means that the filtration F is Brownian after zero. This is weaker than being Brownian, i.e., being generated by a Brownian motion started from 0; for instance, Brownianity after zero does not even imply that F_0 is degenerate (F_0 = ⋂_{t>0} F_t, since filtrations are right-continuous by definition). As it happens, Brownianity after zero plus degeneracy of F_0 are not sufficient to imply Brownianity; this was discovered by Vershik [8] in the early seventies, in the framework of discrete time with time ↓ −∞. When adapted to the continuous setting with time ↓ 0+, Vershik's theory gives a necessary and sufficient condition for a filtration which is Brownian after
zero to be Brownian. This condition is the self-coupling criterion (ii) from Theorem 1 of [4]; it is inspired by the concept of cosiness, introduced in 1997 by Tsirelson [7], and modified into I-cosiness in [5]. The precise phrasing of this criterion will be recalled in due time, in the proof of Proposition 5. We shall make use of the criterion to show that F is indeed Brownian; this will first need a coupling lemma, as the key ingredient. In other instances (see [5] and [3]), the coupling lemma opens the way to a constructive proof of Brownianity: it allows one to exhibit a generating Brownian motion without resorting to the criterion. This does not seem to apply here; we have not been able to bypass the criterion and to give a rigorous and constructive proof.

Lemma 5 (coupling lemma). Let (Z_0, Z′_0) be a random vector in R² × R², defined on some sufficiently rich probability space (Ω, A, P). There exist on (Ω, A, P) a filtration H, two BKR processes Z and Z′ for H, respectively started from Z_0 and Z′_0, and an H-stopping time T such that T < ∞ a.s. and Z = Z′ on [T, ∞).

Proof. By "sufficiently rich", we mean that besides the vector (Z_0, Z′_0), there also exist on (Ω, A, P) a linear Brownian motion x started from 0, and two i.i.d. sequences η and η′ uniform on {−, +}; these four ingredients are assumed to be independent. Put Z_0 = (X_0, Y_0) and Z′_0 = (X′_0, Y′_0). Define a filtration F by F_t = σ(Z_0, Z′_0, x^t). The process X = X_0 + x is a real F-Brownian motion; so is also X′_0 − x. These two processes meet at time S = inf{t : 2x_t = X′_0 − X_0}; define X′ to equal X′_0 − x on [0, S] and X on [S, ∞). This X′ is an F-Brownian motion too, equal to X from S on. Put W = ∫ sgn X dX and W′ = ∫ sgn X′ dX′, and set

R_t = |Y_0| ∨ sup_{s∈[0,t]} W_s − W_t   and   R′_t = |Y′_0| ∨ sup_{s∈[0,t]} W′_s − W′_t .
are two 1-Bessel for F; more precisely R (resp. R
) is the 1-Bessel with martingale part −W (resp. −W
) and started from |Y0 | (resp. |Y0
|). Remark that dW = dW
on S, ∞ ; consequently, for some random t0 , one has ∀ t t0
Wt = sup Ws
s∈ 0,t
⇔
Wt
= sup Ws
, s∈ 0,t
and hence the F-stopping time T = inf t : t S and Wt = sup Ws |Y0 | and Wt
= sup Ws
|Y0
| st
st
is a.s. finite. Observe that RT = RT
= 0, and that R = R
on T, ∞ . Call y (resp. y
) the Brownian motion with absolute value R (resp. R
)
and whose excursion signs are given by η (resp. η′). By Lemma 4, F is immersed in the filtration G generated by F and y, and y is a G-Brownian motion; and by the same lemma, G is in turn immersed in the filtration H generated by G and y′, and y′ is an H-Brownian motion. By immersion, X, X′ and y are also H-Brownian motions, and by definition of y and y′ one has sgn y dy = −dW = − sgn X dX and sgn y′ dy′ = −dW′ = − sgn X′ dX′.

To finish the proof, some innocuous modifications of y and y′ are needed, because their initial values are ±|Y_0| and ±|Y′_0| but not necessarily Y_0 and Y′_0, and because y = y′ at time T but not on the whole interval [T, ∞). It suffices to call τ (resp. τ′) the first zero of y (resp. y′), to observe that τ ≤ T and τ′ ≤ T because R_T = R′_T = 0, and to put

Y = sgn(y_0 Y_0) y on [0, τ],   Y = y on [τ, ∞) ;
Y′ = sgn(y′_0 Y′_0) y′ on [0, τ′],   Y′ = y′ on [τ′, T],   Y′ = Y on [T, ∞).

Y and Y′ are H-Brownian motions with the same absolute values as y and y′, so Z = (X, Y) and Z′ = (X′, Y′) are BKR processes for H. One has Z′ = Z after T by definition of X′ and Y′, and the initial values of Z and Z′ are the given vectors Z_0 and Z′_0.
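The mirror coupling used at the beginning of this proof is easy to visualise numerically. The sketch below is ours, not part of the paper; the initial points 0.3 and 0.2 are arbitrary choices. The motions X_0 + x and X′_0 − x meet when the driving path x first reaches (X′_0 − X_0)/2, and the primed motion is glued to the unprimed one from that hitting time on.

```python
import numpy as np

rng = np.random.default_rng(4)

X0, X0p = 0.3, 0.2
target = (X0p - X0) / 2.0       # here -0.05, below the starting point x_0 = 0
dt = 1e-4

# Extend the horizon until the driving path x hits the target (a.s. finite).
pieces, last = [], 0.0
while True:
    piece = last + np.cumsum(rng.normal(0.0, np.sqrt(dt), 100_000))
    pieces.append(piece)
    last = piece[-1]
    if piece.min() <= target:
        break
x = np.concatenate(([0.0], *pieces))
S = int(np.argmax(x <= target))  # first hitting index (x[0] = 0 > target)

X = X0 + x
Xp = np.where(np.arange(x.size) < S, X0p - x, X)  # mirror before S, glued after

assert S > 0 and np.array_equal(Xp[S:], X[S:])    # coupled from S on
assert Xp[S - 1] != X[S - 1]                      # not yet coupled just before S
assert abs((X0p - x[S]) - X[S]) < 0.1             # gap at S is only the overshoot
```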
Corollary 8. Fix α < 1. On a suitable filtered probability space (Ω, A, P, F), there exist two BKR processes Z and Z′ for F, such that
• Z_0 = Z′_0 = 0;
• the stopped processes Z^1 and Z′^1 are independent;
• for some deterministic time u > 0, one has P[∀ t ≥ u, Z_t = Z′_t] > α.

Proof. Run up to time 1 two independent BKR processes Z and Z′ started from the origin; this yields a random vector (Z_1, Z′_1) in R² × R². Starting from this random vector and using Lemma 5, run Z and Z′ after time 1 so that they couple at some finite time T > 1. The third property is obtained by choosing u large enough so that P[T ≤ u] > α.

Corollary 9. Fix α < 1 and ε > 0. There exist a filtered probability space (Ω, A, P, F) and two BKR processes Z and Z′ for F, such that
• Z_0 = Z′_0 = 0;
• for some s > 0, the stopped processes Z^s and Z′^s are independent;
• P[∀ t ≥ ε, Z_t = Z′_t] > α.

Proof. Immediate from the previous corollary by simultaneously scaling the processes and the filtration (Corollary 4): it suffices to change the time-scale so as to transform u into ε; 1 then becomes some s > 0.

Proposition 5. The filtration generated by a BKR process started from the origin is also generated by some real Brownian motion started from the origin.
Proof. Let Z = (X, Y) be a BKR process started from 0, and F its natural filtration. As in Proposition 1, define a real Brownian motion B by B = ∫ − sgn Y dX = ∫ sgn X dY and, for s > 0, define the shifted process B^s by B^s_t = B_{s+t} − B_s. The latter is an F^s-Brownian motion started from 0, where the shifted filtration F^s is defined by F^s_t = F_{s+t}. According to Proposition 1 and Corollary 6, the process (Z_{s+t})_{t≥0} is equal to Φ(Z_s, B^s); consequently, the filtration F^s is generated by the σ-field F_s and by the F^s-Brownian motion B^s. With the language of [4], one says that F is Brownian after zero.

For such a filtration, a (necessary and) sufficient condition to be Brownian is Condition (ii) in Theorem 1 of [4]. This criterion is stated in [4] for general filtrations; in the particular case considered here, using the fact that F is generated by a BKR process Z issued from 0, the statement of the criterion can be made slightly less obscure. Here it is:

For each integrable r.v. of the form f∘Z and each δ > 0, there exist a filtered probability space (Ω, A, P, G) and two BKR processes Z′ and Z″ for G, such that
• Z′_0 = Z″_0 = 0;
• for some s > 0, the stopped processes Z′^s and Z″^s are independent;
• E|f∘Z′ − f∘Z″| < δ.

Moreover, as remarked at the bottom of page 288 of [4], this need not be checked for all integrable functionals f∘Z, but only for those belonging to some dense subset of L¹(σ(Z)). So we shall verify that the above statement holds true with f∘Z = g(Z_t, t ≥ ε), for some ε > 0 and some bounded Borel functional g on continuous paths indexed by [ε, ∞). Fix such ε and g, as well as δ > 0; let M > 0 be a bound for |g|. Corollary 9 with α = 1 − δ/(2M) gives two BKR processes for some filtration, meeting the first two requirements of the criterion. The third one is met too, because the elementary estimate

|g(Z′_t, t ≥ ε) − g(Z″_t, t ≥ ε)| ≤ 2M 1_{∃ t≥ε : Z′_t ≠ Z″_t}

and the third point in Corollary 9 imply

E|g(Z′_t, t ≥ ε) − g(Z″_t, t ≥ ε)| < 2M(1 − α) = δ .

Corollary 10 (zero-one law). If Z is a BKR process with Z_0 deterministic and with natural filtration F, the σ-field F_0 is degenerate.

Proof. By Proposition 1 when Z_0 ≠ 0 and by Proposition 5 when Z_0 = 0, the natural filtration of Z is Brownian.

Proposition 5 asserts that a BKR process started from the origin has a Brownian filtration, but the proof does not explicitly exhibit any generating
Brownian motion. (Strictly speaking, the proof given in [4] is constructive, or rather can be made constructive; but from a practical point of view it is very far from effective.) A possible strategy to exhibit a generating Brownian motion would be to use the same ansatz as in Proposition 3 of [5] or in Theorem 8 of [3]; this approach needs a stronger version of the coupling lemma (Lemma 5), where one further demands that both BKR processes generate the same filtration.

Question. Let z′ and z″ be two points in the punctured plane R² \ {0}. Do there exist two BKR processes Z′ and Z″, generating the same filtration, started from Z′_0 = z′ and Z″_0 = z″, and such that their coupling time inf{t : Z′_t = Z″_t} is a.s. finite?

If the answer is positive, a constructive proof of Brownianity, inspired by [5] and [3], can be derived from this result; the more effectively Z″ is constructed in the filtration of Z′, the more explicit the generating Brownian motion will be.
References

1. Beneš, V.E., Karatzas, I., Rishel, R.W.: The separation principle for a Bayesian adaptive control problem with no strict-sense optimal law. In: Applied Stochastic Analysis (London, 1989), 121–156, Stochastics Monogr., 5, Gordon and Breach, New York (1991)
2. Borodin, A.N., Salminen, P.: Handbook of Brownian Motion – Facts and Formulae. Birkhäuser, Basel (first edition, 1996, or second edition, 2002)
3. Brossard, J., Leuridan, C.: Transformations browniennes et compléments indépendants : résultats et problèmes ouverts. In: Séminaire de Probabilités XLI, 265–278, Lecture Notes in Math. 1934, Springer-Verlag, Berlin (2008)
4. Émery, M.: On certain almost Brownian filtrations. Ann. Inst. H. Poincaré Probab. Statist. 41, 285–305 (2005)
5. Émery, M., Schachermayer, W.: A remark on Tsirelson's stochastic differential equation. In: Séminaire de Probabilités XXXIII, 291–303, Lecture Notes in Math. 1709, Springer-Verlag, Berlin (1999)
6. Karatzas, I., Shreve, S.E.: Brownian Motion and Stochastic Calculus. Second edition. Springer-Verlag, Berlin (1991)
7. Tsirelson, B.: Triple points: from non-Brownian filtrations to harmonic measures. Geom. Funct. Anal. 7, 1096–1142 (1997)
8. Vershik, A.M.: Theory of decreasing sequences of measurable partitions. English version in St. Petersburg Math. J. 6, 705–761 (1995)
Markovian Properties of the Spin-Boson Model

Ameur Dhahri

Ceremade, UMR CNRS 7534, Université Paris Dauphine, Place de Lattre de Tassigny, 75775 Paris Cedex 16, France
email: [email protected]
Summary. We systematically compare the Hamiltonian and Markovian approaches to quantum open system theory, in the case of the spin-boson model. We first give a complete proof of the weak coupling limit and we compute the Lindblad generator of this model. We study properties of the associated quantum master equation, such as decoherence, quantum detailed balance and return to equilibrium at inverse temperature 0 < β ≤ ∞. We further study the associated quantum Langevin equation and its interaction Hamiltonian. We finally give a quantum repeated interaction model describing the spin-boson system, for which the associated Markovian properties are satisfied without any assumption.
C. Donati-Martin et al. (eds.), Séminaire de Probabilités XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6_15, © Springer-Verlag Berlin Heidelberg 2009

1 Introduction

In the quantum theory of irreversible evolutions, two different approaches have usually been considered by physicists as well as by mathematicians: the Hamiltonian one and the Markovian one. The Hamiltonian approach consists in giving a full Hamiltonian model for the interaction of a simple quantum system with a quantum field (particle gas, heat bath, ...) and in studying the ergodic properties of the associated quantum dynamical system. The usual tools are then typically the modular theory of von Neumann algebras, KMS states, etc. (cf. [BR96], [DJP03], [JP96a], [JP96b]). The Markovian approach consists in giving up the idea of modeling the environment and concentrating on the effective dynamics of the small system. This dynamics is supposed to be described by a (completely positive) semigroup, and the studies concentrate on its Lindblad generator or on the associated quantum Langevin equation (cf. [F06], [F99], [F93], [FR06], [FR98], [P92], [HP84], [M95]). In this article we systematically compare the two approaches in the case of the well-known spin-boson model. The first step in relating the Hamiltonian and Markovian models is to derive the Lindblad generator from the Hamiltonian description, by means of the weak coupling limit. We indeed give a
complete proof of the convergence of the Hamiltonian evolution to a Lindblad semigroup in the van Hove limit, and we derive an explicit form for the generator in terms of the Hamiltonian data; this is treated in Section 3. In Sections 4, 5 and 6 we study the basic properties of the quantum master equation associated to the Lindbladian obtained in Section 3. We investigate the quantum decoherence property. We show that the quantum detailed balance condition is satisfied with respect to the thermodynamical equilibrium state of the spin system, and we prove the convergence to equilibrium in all cases. In Section 7 we consider the natural quantum Langevin equation associated to the Lindblad generator of the spin-boson system. We indeed introduce a natural unitary ampliation of the quantum master equation in terms of a Schrödinger equation perturbed by quantum noises. Such a quantum Langevin equation is actually a unitary evolution in the interaction picture; we compute the associated Hamiltonian, which we compare to the initial Hamiltonian. Finally, we give a quantum repeated interaction model which allows one to prove that the Markovian properties of the spin-boson system are satisfied without assuming any hypothesis.
2 The Model

2.1 Spin-boson System

The model we shall consider all along this article is the spin-boson model, that is, a two-level atom interacting with a reservoir modelled by a free Bose gas at thermal equilibrium at temperature $T = \frac{1}{k\beta}$ (the case of zero temperature, i.e., $\beta = \infty$, is also treated). Let us start by defining the spin-boson system at positive temperature. We first introduce the isolated spin and the free reservoir, and then we describe the coupled system. The Hilbert space of the isolated spin is $\mathcal{K} = \mathbb{C}^2$ and its Hamiltonian is $h_S = \sigma_z$, where
$$\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
The associated eigenenergies are $e_\pm = \pm 1$ and we denote the corresponding eigenstates by $\Psi_\pm$. The algebra of observables of the spin is $M_2$, the algebra of all complex $2\times 2$ matrices. At inverse temperature $\beta$, the equilibrium state of the spin is the normal state defined by the Gibbs Ansatz
$$\omega_S(A) = \frac{1}{Z}\,\mathrm{Tr}\big(\exp(-\beta\sigma_z)A\big), \quad \text{for all } A \in M_2,$$
where $Z = \mathrm{Tr}(\exp(-\beta\sigma_z))$. The dynamics of the spin is defined by
$$\tau_S^t(A) = e^{it\sigma_z}\,A\,e^{-it\sigma_z}, \quad \text{for all } A \in M_2,\ t \in \mathbb{R}.$$
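As a quick numerical illustration of the Gibbs Ansatz (the inverse temperature below is a hypothetical value, not taken from the text), one can check that $\omega_S$ gives the expected mean energy $\langle\sigma_z\rangle = -\tanh\beta$ and is invariant under the free dynamics $\tau_S^t$:

```python
import numpy as np

beta = 1.3  # hypothetical inverse temperature
sigma_z = np.diag([1.0, -1.0])

# Gibbs Ansatz: omega_S(A) = Tr(exp(-beta*sigma_z) A) / Z
rho = np.diag(np.exp(-beta * np.diag(sigma_z)))
Z = np.trace(rho)
rho = rho / Z

def omega_S(A):
    return np.trace(rho @ A)

# mean energy of the spin: <sigma_z> = -tanh(beta)
assert np.isclose(omega_S(sigma_z), -np.tanh(beta))

# invariance under the free dynamics tau_S^t(A) = e^{it sigma_z} A e^{-it sigma_z}
t = 0.7
U = np.diag(np.exp(1j * t * np.diag(sigma_z)))
A = np.array([[0.2, 1.0 - 0.5j], [1.0 + 0.5j, -0.3]])  # arbitrary observable
assert np.isclose(omega_S(U @ A @ U.conj().T), omega_S(A))
```

The invariance holds because $\rho_\beta$ commutes with $e^{it\sigma_z}$, which is exactly the statement that $\omega_S$ is a stationary state of $\tau_S^t$.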
The free reservoir is modelled by a free Bose gas, described by the symmetric Fock space $\Gamma_s(L^2(\mathbb{R}^3))$. If we call $\omega(k) = |k|$ the energy of a single boson with momentum $k \in \mathbb{R}^3$, then the Hamiltonian of the reservoir is given by the second differential quantization $d\Gamma(\omega)$ of $\omega$. In terms of the usual creation and annihilation operators $a^*(k)$, $a(k)$, we have
$$d\Gamma(\omega) = \int_{\mathbb{R}^3} \omega(k)\,a^*(k)a(k)\,dk.$$
The Weyl operator associated to an element $f \in L^2(\mathbb{R}^3)$ is the operator $W(f) = \exp(i\varphi(f))$, where $\varphi(f)$ is the self-adjoint field operator defined by
$$\varphi(f) = \frac{1}{\sqrt{2}}\int_{\mathbb{R}^3}\big(a(k)\bar f(k) + a^*(k)f(k)\big)\,dk.$$
Call $D_{loc}$ the space of $f \in L^2(\mathbb{R}^3)$ with compactly supported Fourier transform. It follows from [JP96b] that the Weyl algebra $\mathcal{A}_{loc} = W(D_{loc})$, the algebra generated by the set $\{W(f),\ f \in D_{loc}\}$, is a natural minimal set of observables associated to the reservoir. The equilibrium state of the reservoir at inverse temperature $\beta$ is given by
$$\omega_R(W(f)) = \exp\Big(-\frac{\|f\|^2}{4} - \frac{1}{2}\int_{\mathbb{R}^3}|f(k)|^2\rho(k)\,dk\Big),$$
where $\rho(k)$ is related to $\omega(k)$ by Planck's radiation law
$$\rho(k) = \frac{1}{e^{\beta\omega(k)} - 1}.$$
The dynamics of the reservoir is generated by $H_b = [d\Gamma(\omega), \cdot\,]$ and it induces a Bogoliubov transformation
$$\exp(it\,d\Gamma(\omega))\,W(f)\,\exp(-it\,d\Gamma(\omega)) = W(\exp(i\omega t)f).$$
The coupled system is described by the $C^*$-algebra $M_2 \otimes \mathcal{A}_{loc}$. The free dynamics is given by $\tau_0^t(A) = \tau_S^t \otimes \tau_R^t(A)$, for all $A \in M_2 \otimes \mathcal{A}_{loc}$.

2.2 Semistandard Representation

The semistandard representation of the coupled system (reservoir + spin) is the representation which is standard on its reservoir part, but not standard on the spin part (cf. [DF06]). Now, let us introduce the Araki-Woods representation of the couple $(\omega_R, \mathcal{A}_{loc})$, which is the triple $(\mathcal{H}_R, \pi_R, \Omega_R)$ defined by
• $\mathcal{H}_R = L^2(\Gamma_s(L^2(\mathbb{R}^3)))$, the space of Hilbert-Schmidt operators on $\Gamma_s(L^2(\mathbb{R}^3))$, which is naturally identified with $\Gamma_s(L^2(\mathbb{R}^3)) \otimes \Gamma_s(L^2(\mathbb{R}^3))$ and equipped with the scalar product $(X, Y) = \mathrm{Tr}(X^*Y)$,
• $\pi_R(W(f)) : X \longmapsto W((1+\rho)^{1/2}f)\,X\,W(\rho^{1/2}\bar f)$ for all $X \in \mathcal{H}_R$,
• $\Omega_R = |\Omega\rangle\langle\Omega|$, where $\Omega$ is the vacuum vector of $\Gamma_s(L^2(\mathbb{R}^3))$.
Moreover, a straightforward computation shows that $\omega_R(A) = (\Omega_R, \pi_R(A)\Omega_R)$, and the relation
$$\pi_R\big(\exp(it\,d\Gamma(\omega))\,A\,\exp(-it\,d\Gamma(\omega))\big) = \exp(it[d\Gamma(\omega), \cdot\,])\,\pi_R(A)\,\exp(-it[d\Gamma(\omega), \cdot\,])$$
defines a dynamics on $\mathcal{M}_R = \pi_R(\mathcal{A}_{loc})''$ whose generator is the operator $L_R = [d\Gamma(\omega), \cdot\,]$. The free semi-Liouvillean associated to the semistandard representation of the spin-boson system is defined by
$$L_0^{semi} = \sigma_z \otimes 1 + 1 \otimes L_R.$$
The full semi-Liouvillean is the operator
$$L_\lambda^{semi} = L_0^{semi} + \lambda\,\sigma_x \otimes \varphi_{AW}(\alpha),$$
where $\lambda \in \mathbb{R}$, and where $\alpha \in L^2(\mathbb{R}^3)$ is called the test function (or cut-off function); $\varphi_{AW}(\alpha)$ is the field operator of the Araki-Woods representation, which can be identified as
$$\varphi_{AW}(\alpha) \simeq \varphi\big((1+\rho)^{1/2}\alpha\big) \otimes 1 + 1 \otimes \varphi\big(\bar\rho^{1/2}\bar\alpha\big)$$
(see [JP96b], [DJ03] for more details), and
$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
The following proposition follows from [JP96b].

Proposition 2.1 If $(\omega + \omega^{-1})\alpha$ is in $L^2(\mathbb{R}^3)$, then the operator $L_\lambda^{semi}$ is essentially self-adjoint on $\mathbb{C}^2 \otimes D(d\Gamma(\omega)) \otimes D(d\Gamma(\omega))$ for all $\lambda \in \mathbb{R}$.

An immediate consequence of the above proposition is that
$$\tau_\lambda^t(A) = e^{itL_\lambda^{semi}}\,A\,e^{-itL_\lambda^{semi}}$$
defines a dynamics on $\mathcal{M} = M_2 \otimes \mathcal{M}_R$.
2.3 Reservoir 1-particle Space

After taking the Araki-Woods representation of the pair $(\omega_R, \mathcal{A}_{loc})$, we note that the reservoir state is a non-Fock state (i.e., it cannot be represented as a pure state on the Fock space over $L^2(\mathbb{R}^3)$), and this case is more complicated to treat. By using the identifications given in [DJ03] and [JP96a], we see that this state can nevertheless be represented as a pure state on a larger Fock space. Namely, we have
$$\Gamma_s(L^2(\mathbb{R}^3)) \otimes \Gamma_s(L^2(\mathbb{R}^3)) \simeq \Gamma_s\big(L^2(\mathbb{R}^3) \oplus L^2(\mathbb{R}^3)\big),$$
$$L_R \simeq d\Gamma(\omega \oplus -\omega), \qquad \varphi_{AW}(\alpha) \simeq \varphi\big((1+\rho)^{1/2}\alpha \oplus \bar\rho^{1/2}\bar\alpha\big), \qquad \Omega_R \simeq \Omega \otimes \Omega.$$
Therefore $\omega_R$ becomes a pure state on the Fock space $\Gamma_s(L^2(\mathbb{R}^3) \oplus L^2(\mathbb{R}^3))$. Moreover, we have the Bogoliubov transformation
$$e^{it\,d\Gamma(\omega\oplus-\omega)}\,\varphi_{AW}(\alpha)\,e^{-it\,d\Gamma(\omega\oplus-\omega)} = \varphi_{AW}(e^{it\omega}\alpha).$$
This simplifies our formulation.
3 Weak Coupling Limit of the Spin-Boson System

3.1 Abstract Theory of the Weak Coupling Limit

Let $\mathcal{Y}$ be a Banach space and $\mathcal{X}$ its dual, i.e., $\mathcal{X} = \mathcal{Y}^*$. Let $P$ be a projection on $\mathcal{X}$ and $e^{it\delta_0}$ a one-parameter group of isometries on $\mathcal{X}$ which commutes with $P$. Put $E = P\delta_0$. It is clear that $E$ is the generator of a one-parameter group of isometries on $\mathrm{Ran}\,P$. Consider a perturbation $Q$ of $\delta_0$ such that $D(Q) \supset D(\delta_0)$. We introduce the following assumptions:
(1) $P$ is a $w^*$-continuous projection on $\mathcal{X}$ whose norm is equal to one,
(2) $e^{it\delta_0}$ is a one-parameter group of $w^*$-continuous isometries ($C_0^*$-group) on $\mathcal{X}$,
(3) for $|\lambda| < \lambda_0$, $i\delta_\lambda = i\delta_0 + i\lambda Q$ is the generator of a one-parameter $C_0^*$-semigroup of contractions.
Consider now the operator
$$K_\lambda(t) = i\int_0^{\lambda^{-2}t} e^{-is(E+\lambda PQP)}\,PQ\,e^{is(1-P)\delta_\lambda(1-P)}\,QP\,ds.$$
For the proof of the following theorem we refer the interested reader to [DF06].

Theorem 3.1 Suppose that assumptions (1), (2) and (3) hold. Assume moreover that the following hypotheses are satisfied:
(4) $P$ is a finite rank projection and $PQP = 0$,
(5) for all $t_1 > 0$, there exists a constant $c$ such that
$$\sup_{|\lambda|<1}\ \sup_{0\le t\le t_1}\|K_\lambda(t)\| \le c,$$
(6) there exists an operator $K$ defined on $\mathrm{Ran}\,P$ such that
$$\lim_{\lambda\to 0}K_\lambda(t) = K \quad \text{for all } 0 < t < \infty.$$
Put
$$K^\natural = \sum_{e\in\mathrm{sp}\,E} 1_e(E)\,K\,1_e(E) = \lim_{T\to\infty}\frac{1}{T}\int_0^T e^{itE}\,K\,e^{-itE}\,dt.$$
Then we have
i) $e^{itK^\natural}$ is a semigroup of contractions,
ii) for all $t_1 > 0$,
$$\lim_{\lambda\to 0}\ \sup_{0\le t\le t_1}\big\|e^{-itE/\lambda^2}\,P\,e^{it(\delta_0+\lambda Q)/\lambda^2}\,P - e^{itK^\natural}\big\| = 0.$$
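The averaged operator $K^\natural$ retains exactly the blocks of $K$ connecting equal eigenvalues of $E$. On a finite-dimensional toy example (all matrices below are made up purely for illustration), the Cesàro time average can be compared with the direct block projection:

```python
import numpy as np

rng = np.random.default_rng(0)
eigs = np.array([2.0, -2.0, 0.0, 0.0])   # toy spectrum with a degenerate eigenvalue 0
K = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# block projection: (K_natural)_{jk} = K_{jk} if eigs[j] == eigs[k], else 0
K_nat = np.where(np.isclose(eigs[:, None], eigs[None, :]), K, 0.0)

# Cesàro mean (1/T) \int_0^T e^{itE} K e^{-itE} dt, computed entrywise since
# (e^{itE} K e^{-itE})_{jk} = e^{it(eigs_j - eigs_k)} K_{jk} for diagonal E
T = 500.0
ts = np.linspace(0.0, T, 50001)
phases = np.exp(1j * np.multiply.outer(ts, eigs[:, None] - eigs[None, :]))
avg = (phases * K).mean(axis=0)

# off-diagonal blocks are averaged out at rate O(1/T)
assert np.allclose(avg, K_nat, atol=2e-2)
```

The diagonal blocks survive exactly, while entries joining distinct eigenvalues oscillate and vanish in the average, which is the content of the spectral formula for $K^\natural$.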
3.2 Application to the Spin-boson System

Recall that in the semistandard representation of the spin-boson system, the free semi-Liouvillean is the operator
$$L_0^{semi} = \sigma_z \otimes 1 + 1 \otimes L_R,$$
and the full semi-Liouvillean is given by
$$L_\lambda^{semi} = L_0^{semi} + \lambda\,\sigma_x \otimes \varphi_{AW}(\alpha).$$
Set $V = \sigma_x \otimes \varphi_{AW}(\alpha)$. Put
$$\delta_\lambda = [L_\lambda^{semi}, \cdot\,] = \delta_0 + \lambda[V, \cdot\,],$$
with $\delta_0 = [L_0^{semi}, \cdot\,]$, the generator of the dynamics $\tau_\lambda^t$. For $B \otimes C \in \mathcal{M}$, we define the projection $P$ by
$$P(B \otimes C) = \omega_R(C)\,B \otimes 1_{\mathcal{H}_R}.$$
In particular we have $E = P\delta_0 = \delta_0 P = [\sigma_z, \cdot\,]P$ and $P[V, \cdot\,]P = 0$.
Set $P_1 = 1 - P$. Then it follows that
$$K_\lambda(t) = i\int_0^{\lambda^{-2}t} e^{-isE}\,P[V, \cdot\,]\,e^{isP_1[L_\lambda^{semi}, \cdot\,]P_1}\,[V, \cdot\,]P\,ds.$$
Note that $P[V, \cdot\,]P = 0$, that $P_1$ commutes with $[L_0^{semi}, \cdot\,]$, and that
$$e^{isP_1[L_0^{semi}, \cdot\,]P_1} = e^{is[L_0^{semi}, \cdot\,]}\,P_1 + P.$$
Thus, if we suppose that
$$K = i\int_0^\infty e^{-isE}\,P[V, \cdot\,]\,e^{isP_1[L_0^{semi}, \cdot\,]P_1}\,[V, \cdot\,]P\,ds$$
exists, we have
$$K = i\int_0^\infty e^{-isE}\,P[V, \cdot\,]\,e^{is[L_0^{semi}, \cdot\,]}\,[V, \cdot\,]P\,ds.$$
In the following we assume that $(\omega + \omega^{-1})\alpha \in L^2(\mathbb{R}^3)$, and we propose to show, under some conditions, that $K$ exists and that the operator $K_\lambda$ converges to $K$ when $\lambda \to 0$. Set
$$U_t^\lambda = e^{itP_1[L_\lambda^{semi}, \cdot\,]P_1}, \qquad U_t = e^{itP_1[L_0^{semi}, \cdot\,]P_1}.$$
We thus have
$$U_t^\lambda = U_t + i\lambda\int_0^t U_{t-s}\,P_1[V, \cdot\,]P_1\,U_s^\lambda\,ds.$$
Hence, the operator $U_{-t}U_t^\lambda$ satisfies the equation
$$U_{-t}U_t^\lambda = 1 + i\lambda\int_0^t \big(U_{-s}\,P_1[V, \cdot\,]P_1\,U_s\big)\big(U_{-s}U_s^\lambda\big)\,ds.$$
Therefore, we get the following series of iterated integrals:
$$U_{-t}U_t^\lambda = 1 + \sum_{n\ge 1}(i\lambda)^n\int_{0\le t_n\le\dots\le t_1\le t}\big(U_{-t_1}P_1[V, \cdot\,]P_1U_{t_1}\big)\cdots\big(U_{-t_n}P_1[V, \cdot\,]P_1U_{t_n}\big)\,dt_n\dots dt_1.$$
Note that the operator $U_{t_k}$ commutes with $P_1$. So, if we put $Q_k = U_{-t_k}[V, \cdot\,]U_{t_k}$, then
$$U_{-t}U_t^\lambda = 1 + \sum_{n\ge 1}(i\lambda)^n\int_{0\le t_n\le\dots\le t_1\le t}(P_1Q_1P_1)\cdots(P_1Q_nP_1)\,dt_n\dots dt_1,$$
and
$$K_\lambda(t) = i\int_0^{\lambda^{-2}t} e^{-isE}\,P[V, \cdot\,]\,e^{isP_1[L_0^{semi}, \cdot\,]P_1}\,[V, \cdot\,]P\,ds + i\sum_{n\ge 1}(i\lambda)^n\int_{0\le t_n\le\dots\le t_0\le\lambda^{-2}t} e^{-it_0E}\,P[V, \cdot\,]\,U_{t_0}(P_1Q_1P_1)\cdots(P_1Q_nP_1)\,[V, \cdot\,]P\,dt_n\dots dt_0. \qquad (1)$$
Put
$$R_n(t) = \int_{0\le t_n\le\dots\le t_0\le t} e^{-it_0E}\,P[V, \cdot\,]\,U_{t_0}(P_1Q_1P_1)\cdots(P_1Q_nP_1)\,[V, \cdot\,]P\,dt_n\dots dt_0.$$
Recall that $PU_{-t_0} = P$. Hence, if we set $Q_{n+1} = U_{-t_{n+1}}[V, \cdot\,]U_{t_{n+1}}$ with $t_{n+1} = 0$, we get
$$R_n(t) = \int_{0\le t_n\le\dots\le t_0\le t} e^{-it_0E}\,P\,Q_0\,(P_1Q_1P_1)\cdots(P_1Q_nP_1)\,Q_{n+1}\,P\,dt_n\dots dt_0. \qquad (2)$$
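The manipulations above rest on the interaction-picture integral equation satisfied by $U_{-t}U_t^\lambda$. A finite-dimensional sketch (with arbitrary Hermitian matrices standing in for the generators, an assumption made purely for illustration) checks the analogous identity numerically:

```python
import numpy as np

rng = np.random.default_rng(1)

def expm_herm(H, t):
    # matrix exponential e^{itH} for a Hermitian matrix H, via eigendecomposition
    w, U = np.linalg.eigh(H)
    return (U * np.exp(1j * t * w)) @ U.conj().T

n, lam = 4, 0.3
A = rng.standard_normal((n, n)); H0 = (A + A.T) / 2   # toy free generator
B = rng.standard_normal((n, n)); V = (B + B.T) / 2    # toy perturbation

def W(t):
    # interaction-picture propagator: W(t) = e^{-itH0} e^{it(H0 + lam V)}
    return expm_herm(H0, -t) @ expm_herm(H0 + lam * V, t)

# W solves W(t) = 1 + i*lam \int_0^t Vtilde(s) W(s) ds, Vtilde(s) = e^{-isH0} V e^{isH0}
t, N = 1.0, 2000
ss = np.linspace(0.0, t, N + 1)
integrand = np.stack([expm_herm(H0, -s) @ V @ expm_herm(H0, s) @ W(s) for s in ss])
weights = np.full(N + 1, t / N); weights[[0, -1]] /= 2   # trapezoid rule
integral = np.tensordot(weights, integrand, axes=1)
assert np.allclose(W(t), np.eye(n) + 1j * lam * integral, atol=1e-4)
```

Iterating this integral equation is exactly what produces the Dyson-type series of iterated integrals used in the text.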
Lemma 3.2
$$R_n(t) = \int_{0\le t_n\le\dots\le t_0\le t} P\big[\sigma_{x,0}\otimes\varphi_{AW}(e^{-it_0\omega}\alpha), \cdot\,\big]P_1\cdots P_1\big[\sigma_{x,n+1}\otimes\varphi_{AW}(e^{-it_{n+1}\omega}\alpha), \cdot\,\big]P\,dt_n\dots dt_0,$$
where $t_{n+1} = 0$ and $\sigma_{x,r} = e^{-it_r\sigma_z}\sigma_x e^{it_r\sigma_z}$.

Proof. Let us start by computing $P_1Q_rP_1$ for $r \ge 1$. We have
$$U_{t_r} = e^{it_r[\sigma_z, \cdot\,]}\,e^{it_r[L_R, \cdot\,]}\,P_1 + P, \qquad U_{t_r}P_1 = e^{it_r[\sigma_z, \cdot\,]}\,e^{it_r[L_R, \cdot\,]}\,P_1.$$
Therefore, it follows that
$$P_1U_{-t_r}[V, \cdot\,]U_{t_r}P_1 = P_1\,e^{-it_r[\sigma_z, \cdot\,]}e^{-it_r[L_R, \cdot\,]}\,[V, \cdot\,]\,e^{it_r[\sigma_z, \cdot\,]}e^{it_r[L_R, \cdot\,]}\,P_1.$$
Furthermore, we have
$$e^{-it_r[\sigma_z, \cdot\,]}e^{-it_r[L_R, \cdot\,]}\,[V, \cdot\,]\,e^{it_r[\sigma_z, \cdot\,]}e^{it_r[L_R, \cdot\,]}\,(B\otimes C) = \big[\sigma_{x,r}\otimes e^{-it_rL_R}\varphi_{AW}(\alpha)e^{it_rL_R}, \cdot\,\big](B\otimes C),$$
and
$$e^{-it_rL_R}\,\varphi_{AW}(\alpha)\,e^{it_rL_R} = \varphi_{AW}(e^{-it_r\omega}\alpha).$$
This gives
$$P_1Q_rP_1 = P_1\big[\sigma_{x,r}\otimes\varphi_{AW}(e^{-it_r\omega}\alpha), \cdot\,\big]P_1.$$
Besides, $Pe^{-it_0[\sigma_z, \cdot\,]} = Pe^{-it_0[\sigma_z, \cdot\,]}e^{-it_0[L_R, \cdot\,]}$, and
$$e^{-it_0E}\,P\,Q_0\,P_1 = Pe^{-it_0[\sigma_z, \cdot\,]}\,[V, \cdot\,]\,e^{it_0[\sigma_z, \cdot\,]}e^{it_0[L_R, \cdot\,]}\,P_1 = Pe^{-it_0[\sigma_z, \cdot\,]}e^{-it_0[L_R, \cdot\,]}\,[V, \cdot\,]\,e^{it_0[\sigma_z, \cdot\,]}e^{it_0[L_R, \cdot\,]}\,P_1 = P\big[\sigma_{x,0}\otimes\varphi_{AW}(e^{-it_0\omega}\alpha), \cdot\,\big]P_1.$$
Thus, from relation (2), the lemma holds. □

Lemma 3.3 $R_{2n+1}(t) = 0$.

Proof. Note that
$$P\big[\sigma_{x,0}\otimes\varphi_{AW}(e^{-it_0\omega}\alpha), \cdot\,\big]P_1\cdots P_1\big[\sigma_{x,2n+2}\otimes\varphi_{AW}(e^{-it_{2n+2}\omega}\alpha), \cdot\,\big]P$$
$$= P\big[\sigma_{x,0}\otimes\varphi_{AW}(e^{-it_0\omega}\alpha), \cdot\,\big](1-P)\big[\sigma_{x,1}\otimes\varphi_{AW}(e^{-it_1\omega}\alpha), \cdot\,\big](1-P)\cdots(1-P)\big[\sigma_{x,2n+2}\otimes\varphi_{AW}(e^{-it_{2n+2}\omega}\alpha), \cdot\,\big]P. \qquad (3)$$
Therefore, if we expand the right-hand side of equation (3), we get a sum of terms, each of which is a product of elements of the form
$$P\big[\sigma_{x,p_k}\otimes\varphi_{AW}(e^{-it_{p_k}\omega}\alpha), \cdot\,\big]\cdots\big[\sigma_{x,p_m}\otimes\varphi_{AW}(e^{-it_{p_m}\omega}\alpha), \cdot\,\big]P,$$
where $0 \le p_k \le \dots \le p_m \le 2n+2$. But in each such product there exists at least one element of the form
$$P\big[\sigma_{x,r_1}\otimes\varphi_{AW}(e^{-it_{r_1}\omega}\alpha), \cdot\,\big]\cdots\big[\sigma_{x,r_{2p+1}}\otimes\varphi_{AW}(e^{-it_{r_{2p+1}}\omega}\alpha), \cdot\,\big]P,$$
where $0 \le r_1 \le \dots \le r_{2p+1} \le 2n+2$. Furthermore, it is easy to show that
$$\big[\sigma_{x,r_1}\otimes\varphi_{AW}(e^{-it_{r_1}\omega}\alpha), \cdot\,\big]\cdots\big[\sigma_{x,r_{2p+1}}\otimes\varphi_{AW}(e^{-it_{r_{2p+1}}\omega}\alpha), \cdot\,\big]P(B\otimes C)$$
is a sum of terms, each of whose second (reservoir) component is a product of an odd number $2p+1$ of field operators. But the projection $P$ acts only on the second component, and the Gibbs state $\omega_R$ of the reservoir is a quasi-free state (see [BR96]), so its odd moments vanish. It then follows that
$$P\big[\sigma_{x,r_1}\otimes\varphi_{AW}(e^{-it_{r_1}\omega}\alpha), \cdot\,\big]\cdots\big[\sigma_{x,r_{2p+1}}\otimes\varphi_{AW}(e^{-it_{r_{2p+1}}\omega}\alpha), \cdot\,\big]P(B\otimes C) = 0,$$
and by Lemma 3.2, $R_{2n+1}(t) = 0$. □

Remark 2: From the proof of Lemma 3.3 we can deduce that $R_{2n}(t)$ is a sum of $2^n$ terms, each of which is a product containing only an even number of commutators of the form $[\sigma_{x,r}\otimes\varphi_{AW}(e^{-it_r\omega}\alpha), \cdot\,]$ between two successive projections $P$.

Theorem 3.4 Suppose that the following assumptions hold:
(i) $\|R_{2n}(t)\| \le c_nt^n$, where the series $\sum_{n\ge 1}c_nt^n$ has infinite radius of convergence.
(ii) There exist $0 < \varepsilon < 1$ and a sequence $d_n \ge 0$ such that $\|R_{2n}(t)\| \le d_nt^{n-\varepsilon}$.
Then
$$\lim_{\lambda\to 0}\ \sum_{n\ge 1}(i\lambda)^nR_{2n}(\lambda^{-2}t) = 0.$$
Proof. The proof of this theorem is a straightforward application of Lebesgue's dominated convergence theorem. □

Now, the aim is to introduce some conditions which ensure that assumptions (i) and (ii) of the above theorem are satisfied. Set
$$h(t) = \big\langle e^{-itL_R}\varphi_{AW}(\alpha)e^{itL_R}\,\varphi_{AW}(\alpha)\,\Omega_R,\ \Omega_R\big\rangle.$$
Recall that $L_R = [d\Gamma(\omega), \cdot\,] \simeq d\Gamma(\omega\oplus-\omega)$ and
$$e^{-itL_R}\,\varphi_{AW}(\alpha)\,e^{itL_R} = \varphi_{AW}(e^{-it\omega}\alpha).$$
Therefore we get
$$h(t) = \big\langle\varphi_{AW}(e^{-it\omega}\alpha)\,\varphi_{AW}(\alpha)\,\Omega_R,\ \Omega_R\big\rangle.$$
Moreover, a straightforward computation shows that
$$h(t-s) = \big\langle\varphi_{AW}(e^{-it\omega}\alpha)\,\varphi_{AW}(e^{-is\omega}\alpha)\,\Omega_R,\ \Omega_R\big\rangle.$$
Now, for any integer $n$ we define the set $\mathcal{P}_n$ of pairings as the set of permutations $\sigma$ of $(1, \dots, 2n)$ such that $\sigma(2r-1) < \sigma(2r)$ and $\sigma(2r-1) < \sigma(2r+1)$ for all $r$. Put
$$\langle\varphi_{AW}(\alpha_1)\cdots\varphi_{AW}(\alpha_n)\rangle = \omega_R\big(\varphi_{AW}(\alpha_1)\cdots\varphi_{AW}(\alpha_n)\big) = \big\langle\Omega_R,\ \varphi_{AW}(\alpha_1)\cdots\varphi_{AW}(\alpha_n)\Omega_R\big\rangle.$$
If $n = 2$, then $\langle\varphi_{AW}(\alpha_1)\varphi_{AW}(\alpha_2)\rangle$ is called the two-point correlation function. Besides, we have
$$\langle\varphi_{AW}(\alpha_1)\cdots\varphi_{AW}(\alpha_{2n})\rangle = \sum_{\sigma\in\mathcal{P}_n}\ \prod_{r=1}^n\big\langle\varphi_{AW}(\alpha_{\sigma(2r-1)})\,\varphi_{AW}(\alpha_{\sigma(2r)})\big\rangle, \qquad (4)$$
and
$$\langle\varphi_{AW}(\alpha_1)\cdots\varphi_{AW}(\alpha_{2n+1})\rangle = 0$$
(see [BR96], p. 40, for more details). The proof of the following lemma is similar to that of Lemma 3.2 in [D74].
Lemma 3.5 If $\|h\|_1 < \infty$, then for any permutation $\pi$ of $(0, 1, \dots, 2n+1)$ we have
$$\sum_{\sigma\in\mathcal{P}(0,1,\dots,2n+1)}\ \int_{0\le t_{2n}\le\dots\le t_0\le t}\ \prod_{r=0}^n\big|h\big(t_{\pi\sigma(2r)} - t_{\pi\sigma(2r+1)}\big)\big|\,dt_{2n}\dots dt_0 \le \frac{1}{2^{n+1}}\,\|h\|_1^{n+1}\,\frac{t^n}{(n+1)!},$$
with $t_{2n+1} = 0$.

We now prove the following.

Theorem 3.6 If $\|h\|_1 < \infty$, then
$$\|R_{2n}(t)\| \le 2^{2n+1}\,\|h\|_1^{n+1}\,\frac{t^n}{(n+1)!}.$$
Proof. Put
$$\Phi_r = \varphi_{AW}(e^{-it_r\omega}\alpha), \qquad \Phi_r^LC = \Phi_rC, \quad \Phi_r^RC = C\Phi_r, \qquad \sigma_{x,r}^LB = \sigma_{x,r}B, \quad \sigma_{x,r}^RB = B\sigma_{x,r},$$
let $\beta$ be a function from $\{0, 1, \dots, 2n+1\}$ to $\{L, R\}$, and set $k_\beta = \#\{r \in \{0, 1, \dots, 2n+1\}\ \text{such that}\ \beta(r) = R\}$. In the sequel, we simplify the notation $\sigma_{x,r}\otimes\Phi_r$ into $\sigma_{x,r}\Phi_r$. With these notations we have
$$[\sigma_{x,r}\Phi_r, \cdot\,] = \sigma_{x,r}^L\Phi_r^L - \sigma_{x,r}^R\Phi_r^R.$$
Recall that, by Remark 2 and Lemma 3.2, $R_{2n}(t)$ is a sum of $2^n$ terms, each of which is of the form
$$C_{2n,j}(t) = (-1)^j\int_{0\le t_{2n}\le\dots\le t_0\le t}\ \sum_\beta(-1)^{k_\beta}\,P\big(\sigma_{x,0}^{\beta(0)}\Phi_0^{\beta(0)}\big)\cdots\big(\sigma_{x,p_1-1}^{\beta(p_1-1)}\Phi_{p_1-1}^{\beta(p_1-1)}\big)\,P\,\big(\sigma_{x,p_1}^{\beta(p_1)}\Phi_{p_1}^{\beta(p_1)}\big)\cdots\big(\sigma_{x,p_j-1}^{\beta(p_j-1)}\Phi_{p_j-1}^{\beta(p_j-1)}\big)\,P\,\big(\sigma_{x,p_j}^{\beta(p_j)}\Phi_{p_j}^{\beta(p_j)}\big)\cdots\big(\sigma_{x,2n}^{\beta(2n)}\Phi_{2n}^{\beta(2n)}\big)\big(\sigma_{x,2n+1}^{\beta(2n+1)}\Phi_{2n+1}^{\beta(2n+1)}\big)\,P\,dt_{2n}\dots dt_0,$$
where $0 = p_0 < p_1 < p_2 < \dots < p_j < p_{j+1} = 2n+2$, each $p_k$ is an even number, and $j = N - 2$, where $N$ is the number of projections $P$ appearing in the expression of $C_{2n,j}(t)$. Hence we have
$$\|C_{2n,j}(t)(B\otimes C)\| \le \|B\otimes C\|\ \sum_\beta\int_{0\le t_{2n}\le\dots\le t_0\le t}\ \prod_{r=0}^j\big|\omega_R\big(\Phi_{p_r}^{\beta(p_r)}\cdots\Phi_{p_{r+1}-1}^{\beta(p_{r+1}-1)}\big)\big|\,dt_{2n}\dots dt_0$$
$$\le \|B\otimes C\|\ \sum_\beta\int_{0\le t_{2n}\le\dots\le t_0\le t}\ \prod_{r=0}^j\big|\big\langle\Phi_{p_r}^{\beta(p_r)}\cdots\Phi_{p_{r+1}-1}^{\beta(p_{r+1}-1)}\big\rangle\big|\,dt_{2n}\dots dt_0$$
$$\le \|B\otimes C\|\ \sum_\beta\int_{0\le t_{2n}\le\dots\le t_0\le t}\ \prod_{r=0}^j\big|\big\langle\Phi_{\pi(p_r)}\cdots\Phi_{\pi(p_{r+1}-1)}\big\rangle\big|\,dt_{2n}\dots dt_0,$$
where $\pi$ is a permutation which depends on $\beta$. Thus, from equation (4) and Lemma 3.5, we get
$$\|C_{2n,j}(t)\| \le \sum_\beta\ \sum_{\sigma\in\mathcal{P}(0,1,\dots,2n+1)}\ \int_{0\le t_{2n}\le\dots\le t_0\le t}\ \prod_{r=0}^n\big|\big\langle\Phi_{\pi(\sigma(2r))}\,\Phi_{\pi(\sigma(2r+1))}\big\rangle\big|\,dt_{2n}\dots dt_0 \le 2^{2n+2}\,\|h\|_1^{n+1}\,\frac{t^n}{2^{n+1}(n+1)!},$$
so that $\|C_{2n,j}(t)\|$ is dominated uniformly in $j$. Finally, since $R_{2n}(t)$ is a sum of $2^n$ such terms, this proves that
$$\|R_{2n}(t)\| \le 2^{2n+1}\,\|h\|_1^{n+1}\,\frac{t^n}{(n+1)!}. \qquad \square$$
The following theorem ensures that assumption (ii) of Theorem 3.4 holds.

Theorem 3.7 If
$$\int_0^\infty(1 + t^\varepsilon)\,|h(t)|\,dt < \infty$$
for some $0 < \varepsilon < 1$, then there exist $d_n > 0$ such that $\|R_{2n}(t)\| \le d_nt^{n-\varepsilon}$.

Proof. We have that $R_{2n}(t)$ is a sum of $2^n$ terms, each of which takes the form of a $C_{2n,j}$ as defined previously. In order to prove the theorem, we group those terms pairwise as follows:
$$(-1)^j\int\sum_\beta(-1)^{k_\beta}\,P\big(\sigma_{x,0}^{\beta(0)}\Phi_0^{\beta(0)}\big)\cdots\big(\sigma_{x,p_1-1}^{\beta(p_1-1)}\Phi_{p_1-1}^{\beta(p_1-1)}\big)P\cdots P\big(\sigma_{x,p_j}^{\beta(p_j)}\Phi_{p_j}^{\beta(p_j)}\big)\cdots\big(\sigma_{x,2n}^{\beta(2n)}\Phi_{2n}^{\beta(2n)}\big)\big(\sigma_{x,2n+1}^{\beta(2n+1)}\Phi_{2n+1}^{\beta(2n+1)}\big)P\,dt_{2n}\dots dt_0$$
$$+\ (-1)^{j+1}\int\sum_\beta(-1)^{k_\beta}\,P\big(\sigma_{x,0}^{\beta(0)}\Phi_0^{\beta(0)}\big)\cdots\big(\sigma_{x,p_1-1}^{\beta(p_1-1)}\Phi_{p_1-1}^{\beta(p_1-1)}\big)P\cdots P\big(\sigma_{x,p_j}^{\beta(p_j)}\Phi_{p_j}^{\beta(p_j)}\big)\cdots\big(\sigma_{x,2n-1}^{\beta(2n-1)}\Phi_{2n-1}^{\beta(2n-1)}\big)P\big(\sigma_{x,2n}^{\beta(2n)}\Phi_{2n}^{\beta(2n)}\big)\big(\sigma_{x,2n+1}^{\beta(2n+1)}\Phi_{2n+1}^{\beta(2n+1)}\big)P\,dt_{2n}\dots dt_0$$
$$=\ (-1)^j\int\sum_\beta(-1)^{k_\beta}\,P\big(\sigma_{x,0}^{\beta(0)}\Phi_0^{\beta(0)}\big)\cdots\big(\sigma_{x,p_1-1}^{\beta(p_1-1)}\Phi_{p_1-1}^{\beta(p_1-1)}\big)P\cdots\Big[P\big(\sigma_{x,p_j}^{\beta(p_j)}\Phi_{p_j}^{\beta(p_j)}\big)\cdots\big(\sigma_{x,2n-1}^{\beta(2n-1)}\Phi_{2n-1}^{\beta(2n-1)}\big)\big(\sigma_{x,2n}^{\beta(2n)}\Phi_{2n}^{\beta(2n)}\big)\big(\sigma_{x,2n+1}^{\beta(2n+1)}\Phi_{2n+1}^{\beta(2n+1)}\big)P - P\big(\sigma_{x,p_j}^{\beta(p_j)}\Phi_{p_j}^{\beta(p_j)}\big)\cdots\big(\sigma_{x,2n-1}^{\beta(2n-1)}\Phi_{2n-1}^{\beta(2n-1)}\big)P\big(\sigma_{x,2n}^{\beta(2n)}\Phi_{2n}^{\beta(2n)}\big)\big(\sigma_{x,2n+1}^{\beta(2n+1)}\Phi_{2n+1}^{\beta(2n+1)}\big)P\Big]\,dt_{2n}\dots dt_0,$$
where all the integrals run over $\{0\le t_{2n}\le\dots\le t_0\le t\}$. Therefore, the right-hand side of the above equation is dominated by
$$\sum_\beta\int_{0\le t_{2n}\le\dots\le t_0\le t}\ \prod_{k=0}^{j-1}\big|\big\langle\Phi_{p_k}^{\beta(p_k)}\Phi_{p_k+1}^{\beta(p_k+1)}\cdots\Phi_{p_{k+1}-1}^{\beta(p_{k+1}-1)}\big\rangle\big| \times \Big|\big\langle\Phi_{p_j}^{\beta(p_j)}\cdots\Phi_{2n}^{\beta(2n)}\Phi_{2n+1}^{\beta(2n+1)}\big\rangle - \big\langle\Phi_{p_j}^{\beta(p_j)}\cdots\Phi_{2n-1}^{\beta(2n-1)}\big\rangle\big\langle\Phi_{2n}^{\beta(2n)}\Phi_{2n+1}^{\beta(2n+1)}\big\rangle\Big|\,dt_{2n}\dots dt_0. \qquad (5)$$
Note that in the bracketed difference there is no product of two-point correlation functions in which $2n$ is paired with $2n+1$. Moreover, this difference is equal to
$$\sum_{\sigma\in\mathcal{P}(p_j,\dots,2n+1)}\ \prod_{r=p_j/2}^{n}\big\langle\Phi_{\sigma(\pi(2r))}\,\Phi_{\sigma(\pi(2r+1))}\big\rangle,$$
where $2n$ is not paired with $2n+1$ and $\pi$ is a permutation which depends on $\beta$. Thus, the term in equation (5) is dominated by
$$\sum_\sigma{}'\int_{0\le t_{2n}\le\dots\le t_0\le t}\ \prod_{r=0}^n\big|\big\langle\Phi_{\sigma(2r)}\,\Phi_{\sigma(2r+1)}\big\rangle\big|\,dt_{2n}\dots dt_0,$$
where $\sum_\sigma{}'$ indicates the sum over all pairings of $\{0, 1, \dots, 2n+1\}$ such that $2n$ is not paired with $2n+1$ (with $t_{2n+1} = 0$). But for each such pairing we have
$$\int_{0\le t_{2n}\le\dots\le t_0\le t}\ \prod_{r=0}^n\big|\big\langle\Phi_{\sigma(2r)}\,\Phi_{\sigma(2r+1)}\big\rangle\big|\,dt_{2n}\dots dt_0 = \int_{0\le t_{2n}\le\dots\le t_0\le t}\ \prod_{r=0}^n\big|h\big(t_{\sigma(2r)} - t_{\sigma(2r+1)}\big)\big|\,dt_{2n}\dots dt_0$$
$$\le\ \mathrm{cst}\,\|h\|_1^n\,t^k\int_0^t|h(s)|\,s^{n-k}\,ds\ \le\ \mathrm{cst}\,\|h\|_1^n\,t^{n-\varepsilon}\int_0^t|h(s)|\,s^\varepsilon\,ds,$$
with $0 \le k \le n-1$. This ends the proof of the theorem. □
Altogether, applying relation (1), Lemma 3.3 and Theorems 3.4 to 3.7, we have proved the following.

Theorem 3.8 Suppose that the following assumptions are satisfied:
(1) $(\omega + \omega^{-1})\alpha \in L^2(\mathbb{R}^3)$,
(2) $\int_0^\infty(1 + t^\varepsilon)\,|h(t)|\,dt < \infty$ for some $0 < \varepsilon < 1$.
Then
$$\lim_{\lambda\to 0}K_\lambda(t) = K$$
for all $t$. Moreover,
$$K^\natural = i\int_0^\infty\ \sum_{e\in\mathrm{sp}([\sigma_z, \cdot\,])} e^{-ise}\,P\,1_e([\sigma_z, \cdot\,])\,[V, \cdot\,]\,e^{is[L_0^{semi}, \cdot\,]}\,[V, \cdot\,]\,1_e([\sigma_z, \cdot\,])\,P\,ds.$$
3.3 Lindbladian of the Spin-boson System

Let $L = iK^\natural$. The aim of this subsection is to give an explicit formula for $L$. Moreover, we prove that this operator has the form of a Lindblad generator (or Lindbladian). Let us introduce the well-known formula of distribution theory
$$\int_0^\infty e^{\pm it\omega}\,dt = \frac{\pm i}{\omega \pm i0} = \pi\delta(\omega) \pm i\,\mathrm{Vp}\Big(\frac{1}{\omega}\Big), \qquad (6)$$
where
$$\frac{1}{x+i0} = \lim_{\varepsilon\to 0}\frac{1}{x+i\varepsilon}, \qquad \int f(x)\,\delta(x)\,dx = f(0),$$
$$\int f(x)\,\mathrm{Vp}\Big(\frac{1}{x}\Big)\,dx = \lim_{\varepsilon\to 0}\int_{|x|\ge\varepsilon}\frac{f(x)}{x}\,dx = PP\!\int\frac{f(x)}{x}\,dx, \qquad \int\frac{f(x)}{x+i0}\,dx = \lim_{\varepsilon\to 0}\int\frac{f(x)}{x+i\varepsilon}\,dx,$$
for all $f$ such that $\mathbb{R}\ni x\mapsto f(x)$ is a continuous function, and provided the integrals on the right are well defined and the limits exist. Note that the eigenvalues of $[\sigma_z, \cdot\,]$ are $2$, $-2$ and $0$, where $2$ and $-2$ are non-degenerate and $0$ has multiplicity two. Besides, the corresponding eigenvectors are respectively given by $|\Psi_+\rangle\langle\Psi_-|$, $|\Psi_-\rangle\langle\Psi_+|$ and $|\Psi_+\rangle\langle\Psi_+|$, $|\Psi_-\rangle\langle\Psi_-|$. Put
$$n_+ = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad n_- = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \quad \sigma_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad \sigma_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},$$
$$n_\pm^LX = n_\pm X, \qquad n_\pm^RX = Xn_\pm, \qquad N(\omega) = \frac{1}{e^{\beta\omega(k)} - 1}.$$
It is easy to check that
$$1_2([\sigma_z, \cdot\,]) = n_+^Ln_-^R, \qquad 1_{-2}([\sigma_z, \cdot\,]) = n_-^Ln_+^R, \qquad 1_0([\sigma_z, \cdot\,]) = n_+^Ln_+^R + n_-^Ln_-^R.$$
The explicit formula of the Lindbladian associated to the spin-boson system is given as follows.

Theorem 3.9 If the following assumptions are met:
i) $\int_0^\infty|h(t)|\,dt < \infty$,
ii) $\alpha$ is a $C^1$ function in a neighborhood of the sphere $B(0, 2) = \{k \in \mathbb{R}^3,\ |k| = 2\}$,
iii) $(1 + \omega)\alpha \in L^\infty(\mathbb{R}^3)$,
then for all $X \in M_2$,
$$L(X) = i\big(\mathrm{Im}(\alpha,\alpha)^-_+ - \mathrm{Im}(\alpha,\alpha)^+_-\big)[n_+, X] + i\big(\mathrm{Im}(\alpha,\alpha)^-_- - \mathrm{Im}(\alpha,\alpha)^+_+\big)[n_-, X] + \mathrm{Re}(\alpha,\alpha)^+_-\big(2\sigma_+X\sigma_- - \{n_+, X\}\big) + \mathrm{Re}(\alpha,\alpha)^-_-\big(2\sigma_-X\sigma_+ - \{n_-, X\}\big),$$
where
$$\mathrm{Im}(\alpha,\alpha)^+_+ = \int_{\mathbb{R}^3}\frac{N(\omega)+1}{\omega+2}\,|\alpha(k)|^2\,dk, \qquad \mathrm{Im}(\alpha,\alpha)^-_- = PP\!\int\frac{N(\omega)}{\omega-2}\,|\alpha(k)|^2\,dk,$$
$$\mathrm{Im}(\alpha,\alpha)^+_- = PP\!\int\frac{N(\omega)+1}{\omega-2}\,|\alpha(k)|^2\,dk, \qquad \mathrm{Im}(\alpha,\alpha)^-_+ = \int_{\mathbb{R}^3}\frac{N(\omega)}{\omega+2}\,|\alpha(k)|^2\,dk,$$
$$\mathrm{Re}(\alpha,\alpha)^+_- = \pi\,\frac{e^{2\beta}}{e^{2\beta}-1}\int_{\mathbb{R}^3}|\alpha(k)|^2\,\delta(\omega-2)\,dk, \qquad \mathrm{Re}(\alpha,\alpha)^-_- = \frac{\pi}{e^{2\beta}-1}\int_{\mathbb{R}^3}|\alpha(k)|^2\,\delta(\omega-2)\,dk.$$
Proof. A straightforward computation shows that, for all $X \in M_2$,
$$1_2([\sigma_z, \cdot\,])[V, \cdot\,]e^{is[L_0^{semi}, \cdot\,]}[V, \cdot\,]1_2([\sigma_z, \cdot\,])PX = \big(\langle\varphi_{AW}(\alpha)\varphi_{AW}(e^{is\omega}\alpha)\rangle + \langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle\big)\,n_+Xn_-,$$
$$1_{-2}([\sigma_z, \cdot\,])[V, \cdot\,]e^{is[L_0^{semi}, \cdot\,]}[V, \cdot\,]1_{-2}([\sigma_z, \cdot\,])PX = \big(\langle\varphi_{AW}(\alpha)\varphi_{AW}(e^{is\omega}\alpha)\rangle + \langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle\big)\,n_-Xn_+,$$
$$\begin{aligned}1_0([\sigma_z, \cdot\,])[V, \cdot\,]e^{is[L_0^{semi}, \cdot\,]}[V, \cdot\,]1_0([\sigma_z, \cdot\,])PX ={}&\big(e^{-2is}\langle\varphi_{AW}(\alpha)\varphi_{AW}(e^{is\omega}\alpha)\rangle + e^{2is}\langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle\big)\,n_+Xn_+\\&+\big(e^{2is}\langle\varphi_{AW}(\alpha)\varphi_{AW}(e^{is\omega}\alpha)\rangle + e^{-2is}\langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle\big)\,n_-Xn_-\\&-\big(e^{-2is}\langle\varphi_{AW}(\alpha)\varphi_{AW}(e^{is\omega}\alpha)\rangle + e^{2is}\langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle\big)\,\sigma_+X\sigma_-\\&-\big(e^{2is}\langle\varphi_{AW}(\alpha)\varphi_{AW}(e^{is\omega}\alpha)\rangle + e^{-2is}\langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle\big)\,\sigma_-X\sigma_+.\end{aligned}$$
Hence, for all $X \in M_2$, we have
$$\begin{aligned}\sum_{e\in\mathrm{sp}([\sigma_z, \cdot\,])}e^{-ise}\,P\,1_e([\sigma_z, \cdot\,])[V, \cdot\,]e^{is[L_0^{semi}, \cdot\,]}[V, \cdot\,]1_e([\sigma_z, \cdot\,])(X) ={}&e^{-2is}\big(\langle\varphi_{AW}(\alpha)\varphi_{AW}(e^{is\omega}\alpha)\rangle + \langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle\big)\,n_+Xn_-\\&+e^{2is}\big(\langle\varphi_{AW}(\alpha)\varphi_{AW}(e^{is\omega}\alpha)\rangle + \langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle\big)\,n_-Xn_+\\&-2\,\mathrm{Re}\big(e^{2is}\langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle\big)\big(\sigma_+X\sigma_- - n_+Xn_+\big)\\&-2\,\mathrm{Re}\big(e^{-2is}\langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle\big)\big(\sigma_-X\sigma_+ - n_-Xn_-\big).\end{aligned}$$
It follows that
$$\begin{aligned}L(X) ={}&-\int_0^\infty e^{-2is}\big(\langle\varphi_{AW}(\alpha)\varphi_{AW}(e^{is\omega}\alpha)\rangle + \langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle\big)\,ds\ \,n_+Xn_-\\&-\int_0^\infty e^{2is}\big(\langle\varphi_{AW}(\alpha)\varphi_{AW}(e^{is\omega}\alpha)\rangle + \langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle\big)\,ds\ \,n_-Xn_+\\&+2\,\mathrm{Re}\Big(\int_0^\infty e^{2is}\langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle\,ds\Big)\big(\sigma_+X\sigma_- - n_+Xn_+\big)\\&+2\,\mathrm{Re}\Big(\int_0^\infty e^{-2is}\langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle\,ds\Big)\big(\sigma_-X\sigma_+ - n_-Xn_-\big).\end{aligned}$$
But we have
$$\langle\varphi_{AW}(\alpha)\varphi_{AW}(e^{is\omega}\alpha)\rangle = \int_{\mathbb{R}^3}e^{is\omega}\big(N(\omega)+1\big)|\alpha(k)|^2\,dk + \int_{\mathbb{R}^3}e^{-is\omega}N(\omega)|\alpha(k)|^2\,dk = \overline{\langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle}.$$
Now, under assumptions i), ii) and iii) of the theorem, we can apply formula (6) to get
$$\int_0^\infty e^{-2is}\langle\varphi_{AW}(\alpha)\varphi_{AW}(e^{is\omega}\alpha)\rangle\,ds = \mathrm{Re}(\alpha,\alpha)^+_- + i\,\mathrm{Im}(\alpha,\alpha)^+_- - i\,\mathrm{Im}(\alpha,\alpha)^-_+,$$
$$\int_0^\infty e^{-2is}\langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle\,ds = \mathrm{Re}(\alpha,\alpha)^-_- + i\,\mathrm{Im}(\alpha,\alpha)^-_- - i\,\mathrm{Im}(\alpha,\alpha)^+_+,$$
$$\int_0^\infty e^{2is}\langle\varphi_{AW}(e^{is\omega}\alpha)\varphi_{AW}(\alpha)\rangle\,ds = \mathrm{Re}(\alpha,\alpha)^+_- - i\,\mathrm{Im}(\alpha,\alpha)^+_- + i\,\mathrm{Im}(\alpha,\alpha)^-_+,$$
$$\int_0^\infty e^{2is}\langle\varphi_{AW}(\alpha)\varphi_{AW}(e^{is\omega}\alpha)\rangle\,ds = \mathrm{Re}(\alpha,\alpha)^-_- + i\,\mathrm{Im}(\alpha,\alpha)^+_+ - i\,\mathrm{Im}(\alpha,\alpha)^-_-.$$
Therefore we obtain
$$\begin{aligned}L(X) ={}&-\Big(\mathrm{Re}(\alpha,\alpha)^+_- + \mathrm{Re}(\alpha,\alpha)^-_- + i\big(\mathrm{Im}(\alpha,\alpha)^+_- - \mathrm{Im}(\alpha,\alpha)^-_+\big) + i\big(\mathrm{Im}(\alpha,\alpha)^-_- - \mathrm{Im}(\alpha,\alpha)^+_+\big)\Big)\,n_+Xn_-\\&-\Big(\mathrm{Re}(\alpha,\alpha)^+_- + \mathrm{Re}(\alpha,\alpha)^-_- - i\big(\mathrm{Im}(\alpha,\alpha)^+_- - \mathrm{Im}(\alpha,\alpha)^-_+\big) - i\big(\mathrm{Im}(\alpha,\alpha)^-_- - \mathrm{Im}(\alpha,\alpha)^+_+\big)\Big)\,n_-Xn_+\\&+2\,\mathrm{Re}(\alpha,\alpha)^+_-\big(\sigma_+X\sigma_- - n_+Xn_+\big) + 2\,\mathrm{Re}(\alpha,\alpha)^-_-\big(\sigma_-X\sigma_+ - n_-Xn_-\big).\end{aligned}$$
Hence, we get
$$\begin{aligned}L(X) ={}&i\big(\mathrm{Im}(\alpha,\alpha)^-_+ - \mathrm{Im}(\alpha,\alpha)^+_-\big)\big(n_+Xn_- - n_-Xn_+\big) + i\big(\mathrm{Im}(\alpha,\alpha)^-_- - \mathrm{Im}(\alpha,\alpha)^+_+\big)\big(n_-Xn_+ - n_+Xn_-\big)\cdot(-1)\\&+\mathrm{Re}(\alpha,\alpha)^+_-\big(2\sigma_+X\sigma_- - 2n_+Xn_+ - n_+Xn_- - n_-Xn_+\big) + \mathrm{Re}(\alpha,\alpha)^-_-\big(2\sigma_-X\sigma_+ - 2n_-Xn_- - n_+Xn_- - n_-Xn_+\big).\end{aligned}$$
Note that we have
$$n_+Xn_- + n_-Xn_+ = \{n_+, X\} - 2n_+Xn_+ = \{n_-, X\} - 2n_-Xn_-, \qquad n_+Xn_- - n_-Xn_+ = [n_+, X], \qquad n_-Xn_+ - n_+Xn_- = [n_-, X].$$
This proves the theorem. □
4 Properties of the Quantum Master Equation

In this section we state some properties of the quantum master equation associated to the spin-boson system, such as quantum decoherence and the quantum detailed balance condition. Note that the log-Sobolev inequality, with an explicit computation of the optimal constants, is known in this context; we refer the interested reader to [C04].

4.1 Quantum Master Equation

Let $\rho \in M_2$ be a density matrix. Then the quantum master equation of the spin-boson system is given by
$$\frac{d\rho(t)}{dt} = i\big(\mathrm{Im}(\alpha,\alpha)^+_- - \mathrm{Im}(\alpha,\alpha)^-_+\big)[n_+, \rho(t)] + i\big(\mathrm{Im}(\alpha,\alpha)^+_+ - \mathrm{Im}(\alpha,\alpha)^-_-\big)[n_-, \rho(t)] + \mathrm{Re}(\alpha,\alpha)^+_-\big(2\sigma_-\rho(t)\sigma_+ - \{n_+, \rho(t)\}\big) + \mathrm{Re}(\alpha,\alpha)^-_-\big(2\sigma_+\rho(t)\sigma_- - \{n_-, \rho(t)\}\big).$$
Put $\rho(t) = \rho_{11}(t)\,n_+ + \rho_{12}(t)\,\sigma_+ + \rho_{21}(t)\,\sigma_- + \rho_{22}(t)\,n_-$. Then the above master equation is equivalent to the following system of ordinary differential equations:
$$\frac{d}{dt}\rho_{11}(t) = 2\,\mathrm{Re}(\alpha,\alpha)^-_-\,\rho_{22}(t) - 2\,\mathrm{Re}(\alpha,\alpha)^+_-\,\rho_{11}(t),$$
$$\frac{d}{dt}\rho_{12}(t) = \Big(-i\big(\mathrm{Im}(\alpha,\alpha)^-_+ + \mathrm{Im}(\alpha,\alpha)^+_+ - \mathrm{Im}(\alpha,\alpha)^+_- - \mathrm{Im}(\alpha,\alpha)^-_-\big) - \mathrm{Re}(\alpha,\alpha)^+_- - \mathrm{Re}(\alpha,\alpha)^-_-\Big)\,\rho_{12}(t),$$
$$\frac{d}{dt}\rho_{21}(t) = \Big(i\big(\mathrm{Im}(\alpha,\alpha)^-_+ + \mathrm{Im}(\alpha,\alpha)^+_+ - \mathrm{Im}(\alpha,\alpha)^+_- - \mathrm{Im}(\alpha,\alpha)^-_-\big) - \mathrm{Re}(\alpha,\alpha)^+_- - \mathrm{Re}(\alpha,\alpha)^-_-\Big)\,\rho_{21}(t),$$
$$\frac{d}{dt}\rho_{22}(t) = 2\,\mathrm{Re}(\alpha,\alpha)^+_-\,\rho_{11}(t) - 2\,\mathrm{Re}(\alpha,\alpha)^-_-\,\rho_{22}(t).$$
Hence, it is straightforward to show that the thermodynamical equilibrium state $\rho_\beta$ of the spin system is the only stationary solution of the above equation.

4.2 Quantum Decoherence of the Spin System

Definition 1 We say that the dynamical evolution of a quantum system describes decoherence if there exists an orthonormal basis of $\mathcal{H}_S$ such that the off-diagonal elements of its time-evolved density matrix in this basis vanish as $t \to \infty$.
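As a numerical sanity check of this master equation (with a hypothetical overall coupling weight $g$ standing in for the integral $\int|\alpha|^2\delta(\omega-2)\,dk$, which is not computed here), one can verify that $\rho_\beta$ is stationary and that the populations relax to it:

```python
import numpy as np

beta, g = 0.8, 0.5          # hypothetical inverse temperature and coupling weight
N = 1.0 / (np.exp(2 * beta) - 1.0)       # N(omega) at the Bohr frequency omega = 2
re_pm = np.pi * (N + 1) * g              # Re(alpha,alpha)^+_-  (decay rate)
re_mm = np.pi * N * g                    # Re(alpha,alpha)^-_-  (excitation rate)

n_p = np.diag([1.0, 0.0]); n_m = np.diag([0.0, 1.0])
s_p = np.array([[0.0, 1.0], [0.0, 0.0]]); s_m = s_p.T

def anti(a, b): return a @ b + b @ a

def master_rhs(rho):
    # dissipative part of the master equation; the Hamiltonian part is diagonal
    # and commutes with the (diagonal) states used in these checks
    return (re_pm * (2 * s_m @ rho @ s_p - anti(n_p, rho))
            + re_mm * (2 * s_p @ rho @ s_m - anti(n_m, rho)))

Z = np.exp(-beta) + np.exp(beta)
rho_beta = np.diag([np.exp(-beta), np.exp(beta)]) / Z

# the thermal state of the spin is stationary
assert np.allclose(master_rhs(rho_beta), 0.0)

# populations relax toward rho_beta: Euler integration from the excited state
rho, dt = n_p.astype(complex), 1e-3
for _ in range(20000):
    rho = rho + dt * master_rhs(rho)
assert np.allclose(rho, rho_beta, atol=1e-3)
```

Stationarity holds exactly because the rates satisfy $\mathrm{Re}(\alpha,\alpha)^-_-/\mathrm{Re}(\alpha,\alpha)^+_- = N/(N+1) = e^{-2\beta}$, which is the population ratio of $\rho_\beta$.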
From the system of ordinary differential equations introduced in the previous subsection, we have
$$\rho_{12}(t) = \rho_{12}(0)\,\exp\Big(-i\big(\mathrm{Im}(\alpha,\alpha)^-_+ + \mathrm{Im}(\alpha,\alpha)^+_+ - \mathrm{Im}(\alpha,\alpha)^+_- - \mathrm{Im}(\alpha,\alpha)^-_-\big)t\Big)\,\exp\Big(-\big(\mathrm{Re}(\alpha,\alpha)^+_- + \mathrm{Re}(\alpha,\alpha)^-_-\big)t\Big)$$
$$= \rho_{12}(0)\,\exp\Big(-i\big(\mathrm{Im}(\alpha,\alpha)^-_+ + \mathrm{Im}(\alpha,\alpha)^+_+ - \mathrm{Im}(\alpha,\alpha)^+_- - \mathrm{Im}(\alpha,\alpha)^-_-\big)t\Big)\,\exp\Big(-\pi\,\frac{e^{2\beta}+1}{e^{2\beta}-1}\Big(\int_{\mathbb{R}^3}|\alpha(k)|^2\,\delta(\omega-2)\,dk\Big)t\Big),$$
$$\rho_{21}(t) = \rho_{21}(0)\,\exp\Big(i\big(\mathrm{Im}(\alpha,\alpha)^-_+ + \mathrm{Im}(\alpha,\alpha)^+_+ - \mathrm{Im}(\alpha,\alpha)^+_- - \mathrm{Im}(\alpha,\alpha)^-_-\big)t\Big)\,\exp\Big(-\pi\,\frac{e^{2\beta}+1}{e^{2\beta}-1}\Big(\int_{\mathbb{R}^3}|\alpha(k)|^2\,\delta(\omega-2)\,dk\Big)t\Big).$$
Therefore, the spin system describes quantum decoherence if and only if
$$\int_{\mathbb{R}^3}|\alpha(k)|^2\,\delta(\omega-2)\,dk \ne 0.$$
Thus, the decoherence of the spin system is controlled by the cut-off function $\alpha$.

4.3 Quantum Detailed Balance Condition

The following definition is taken from [AL87].

Definition 2 Let $\Theta$ be a generator of a quantum dynamical semigroup, written as $\Theta = i[H, \cdot\,] + \Theta_0$, where $H$ is a self-adjoint operator. We say that $\Theta$ satisfies a quantum detailed balance condition with respect to a stationary state $\rho$ if
i) $[H, \rho] = 0$,
ii) $\langle\Theta_0(A), B\rangle_\rho = \langle A, \Theta_0(B)\rangle_\rho$ for all $A, B \in D(\Theta_0)$, where $\langle A, B\rangle_\rho = \mathrm{Tr}(\rho A^*B)$.

Actually, we prove the following.

Theorem 4.1 The generator $L$ of the quantum dynamical semigroup $T_t = e^{itK^\natural}$ satisfies a quantum detailed balance condition with respect to the thermodynamical equilibrium state of the spin system
$$\rho_\beta = \frac{e^{-\beta\sigma_z}}{\mathrm{Tr}(e^{-\beta\sigma_z})}.$$
Proof. Note that $L(A) = i[H, A] + L_D(A)$, with
$$H = \big(\mathrm{Im}(\alpha,\alpha)^-_+ - \mathrm{Im}(\alpha,\alpha)^+_-\big)\,n_+ + \big(\mathrm{Im}(\alpha,\alpha)^-_- - \mathrm{Im}(\alpha,\alpha)^+_+\big)\,n_-,$$
and
$$L_D(A) = \mathrm{Re}(\alpha,\alpha)^+_-\big(2\sigma_+A\sigma_- - \{n_+, A\}\big) + \mathrm{Re}(\alpha,\alpha)^-_-\big(2\sigma_-A\sigma_+ - \{n_-, A\}\big).$$
Therefore, it is clear that $H$ is a self-adjoint operator and that $[H, \rho_\beta] = 0$. Moreover, it is straightforward to show that $L_D$ is self-adjoint for the $\langle\,\cdot\,,\,\cdot\,\rangle_{\rho_\beta}$ scalar product. □
5 Return to Equilibrium for the Spin-boson System 5.1 Hamiltonian Case In this subsection we recall the results of return to equilibrium for the spinboson system proved in [JP96b]. For f ∈ L2 (R3 ) we define f˜ on R × S 2 by ˆ s < 0, −|s|1/2 f¯(|s|k), ˜ ˆ f (s, k) = ˆ s1/2 f (sk), s ≥ 0. Put
C(δ) = z ∈ C s.t |Imz| < δ ,
2 H (δ, η) = f : C(δ) → η s.t f H 2 (δ,η) = sup |a|<δ
+∞
−∞
f (x + ia)2η dx < ∞ ,
where η is a Hilbert space. Definition 3 Let M be a W ∗ −algebra, τ a dynamics on M and ω a faithful normal state on M. We say that the triple (M, τ, ω) has the property of return to equilibrium if for all A ∈ M and all normal state μ, we have lim μ(τ t (A)) = ω(A).
t→∞
Then, in the Hamiltonian approach of the spin-boson system, the following is proved in [JP96b]. Theorem 5.1 Assume that the following assumptions are satisfied: (i) (ω + ω −1 )α ∈ L2 (R3 ), ˆ 2 dσ(k) ˆ > 0, where dσ is the surface measure on S 2 , (ii) S 2 |α(2k)| ˜ ∈ H 2 (δ, L2 (S 2 )). (iii) There exists 0 < δ < 2π β such that α
Markovian Properties of the Spin-Boson Model
417
Then, for all β > 0 there exists a constant Λ(β) > 0 which depends only on the cut-off function α, such that the spin-boson system has the property of return to equilibrium for all 0 < |λ| < Λ(β). Remark: In the above theorem the authors show that for any fixed temperature β ∈ ]0, +∞[, the spectrum of the full-Liouvillean Lλ associated to the spin-boson system is absolutely continuous uniformly on λ ∈ ]0, Λ(β)[ and in particular for λ very small (weak coupling). Moreover they used the theory of perturbation of KMS-states for constructing the eigenvector of Lλ associated to the eigenvalue 0. Therefore, for any fixed β ∈ ]0, +∞[, the spin-boson system weakly coupled has the property of return to equilibrium. 5.2 Markovian Case We shall compare the above conditions for the return to equilibrium to the one we obtain in the Markovian approach. Let (Tt )t≥0 be a quantum dynamical semigroup on B(η) such that its generator has the form L∗k XLk , L(X) = G∗ X + XG + k≥1
1
where G = − 2 Put
k≥1
L∗k Lk − iH.
A(T ) = X ∈ B(η) s.t Tt (X) = X, for all t ≥ 0 ,
N (T ) = X ∈ B(η) s.t Tt (X ∗ X) = Tt (X ∗ )Tt (X) and
Tt (XX ∗ ) = Tt (X)Tt (X ∗ ), for all t ≥ 0 .
The following result is useful for the study of approach to equilibrium in the Markovian case. Theorem 5.2 (Frigerio-Verri) If T has a faithful stationary state ρ and N (T ) = A(T ), then w∗ − lim Tt (X) = T∞ (X), ∀X ∈ B(η), t→∞
where X → T∞ (X) is a conditional expectation. In particular the quantum dynamical semigroup T has the property of return to equilibrium. We state without proof the following result which is a special case of a theorem proved in [FR98]. Theorem 5.3 Suppose that (Tt )t is a norm continuous quantum dynamical semigroup which has a faithful normal stationary state and H is a self-adjoint
418
A. Dhahri
operator which has a pure point spectrum. Then (Tt )t has the property of return to equilibrium if and only if
Lk , L∗k , H, k ≥ 1 = Lk , L∗k , k ≥ 1 . Applying the above result, we now prove the following. Theorem 5.4 Suppose that the following assumptions are satisfied: i) Im(α, α)± ± are given by real numbers, ii) S 2 |α(2k)|2 dk > 0. Then the quantum dynamical semigroup of the spin-boson system at positive temperature has the property of return to equilibrium. Proof. Set + − + H = Im(α, α)− + − Im(α, α)− n+ + Im(α, α)− − Im(α, α)+ n− , 1/2 σ− , L1 = 2Re(α, α)+ − 1/2 σ+ , L2 = 2Re(α, α)− −
(7)
1 ∗ Lk Lk − iH. 2 2
G=−
k=1
Then the Lindbladian of the spin-boson system takes the form L(X) = G∗ X + XG +
2
L∗k XLk ,
k=1
for all X ∈ M2 . Note that the quantum dynamical semigroup T of the spin-boson system has the thermodynamical equilibrium state ρβ of the spin system as a faithful normal stationary state. Moreover H is a self-adjoint bounded operator which has a pure point spectrum and it is clear that
Lk , L∗k , H, k = 1, 2 = Lk , L∗k , k = 1, 2 = CI. Thus from the previous theorem, the quantum dynamical semigroup of the spin-boson system has the property of return to equilibrium. Note that compared to the Hamiltonian approach, we 5.4 a simplification of conditions for return to equilibrium system. So in this theorem we need only that assumptions isfied. Hypothesis i) ensures that Im(α, α)± ± exist and are are not vanishing. holds, then Re(α, α)± −
have in Theorem of the spin-boson i) and ii) are satfinite, while if ii)
Markovian Properties of the Spin-Boson Model
419
5.3 Spin-Boson System at Zero Temperature

In the Hamiltonian case, if the Liouvillean L of a quantum dynamical system has a purely absolutely continuous spectrum, except for the simple eigenvalue 0, then this system has the property of return to equilibrium (cf [JP96b]). At inverse temperature β (0 < β < ∞), using the perturbation theory of KMS-states (cf [DJP03]), we can give an explicit formula for the eigenstate of L associated to the eigenvalue 0. But this is not the case at zero temperature (β = ∞). On the other hand, the ground state of the spin system is not faithful, so Theorem 5.3 does not allow us to conclude.

Let us describe the spin-boson system at zero temperature. At zero temperature, the Hilbert space of the spin-boson system is H = C^2 ⊗ Γ_s(L^2(R^3)). The free Hamiltonian is defined as h_0 = σ_z ⊗ 1 + 1 ⊗ dΓ(ω), and the full Hamiltonian with interaction is the operator h_λ = h_0 + λ σ_x ⊗ ϕ(α), where α ∈ L^2(R^3) is a test function. The zero temperature equilibrium state of the spin system is the vector state corresponding to the ground state of σ_z; its density matrix is

ρ_∞ = |Ψ_-⟩⟨Ψ_-|.

The weak coupling limit of the spin-boson system at zero temperature can be obtained in the same way as for positive temperature. The associated Lindbladian can be deduced from the one at positive temperature by taking β = ∞, and it has the form

L_∞(X) = -iν_1 [n_+, X] - iν_2 [n_-, X] + ν_3 ( 2σ_+ X σ_- - {n_+, X} ),

where

ν_1 = P ∫_{R^3} |α(k)|^2 / (ω + 2) dk,
ν_2 = P ∫_{R^3} |α(k)|^2 / (ω - 2) dk,
ν_3 = π ∫_{R^3} |α(k)|^2 δ(ω - 2) dk,

and P denotes the principal value. Hence, for every density matrix ρ ∈ M_2, the associated quantum master equation is given by

dρ(t)/dt = iν_1 [n_+, ρ(t)] + iν_2 [n_-, ρ(t)] + ν_3 ( 2σ_- ρ(t) σ_+ - {n_+, ρ(t)} ) = L_∞^*(ρ(t)).

Now, in order to establish the property of return to equilibrium for the quantum dynamical semigroup associated to the spin-boson system at zero temperature, we have to show it by direct computation.
A. Dhahri
Theorem 5.5 Assume that:
i) ν_2 is a finite real number,
ii) ∫_{S^2} |α(2k)|^2 dk > 0.
Then the spin-boson system at zero temperature has the property of return to equilibrium. Moreover, we have

lim_{t→∞} Tr(e^{tL_∞^*} ρ A) = Tr(ρ_∞ A),

for all A ∈ M_2 and every density matrix ρ.

Proof. Consider the orthonormal basis of M_2 given by

{ |Ψ_+⟩⟨Ψ_+|, |Ψ_+⟩⟨Ψ_-|, |Ψ_-⟩⟨Ψ_+|, |Ψ_-⟩⟨Ψ_-| }.

Then in this basis we have

e^{tL_∞^*} =
( e^{-2tν_3}          0                          0                           0 )
( 0                   e^{-tν_3} e^{it(ν_1-ν_2)}  0                           0 )
( 0                   0                          e^{-tν_3} e^{-it(ν_1-ν_2)}  0 )
( 1 - e^{-2tν_3}      0                          0                           1 )

Therefore we get

lim_{t→∞} e^{tL_∞^*} = Π_∞^*,

where

Π_∞^* =
( 0 0 0 0 )
( 0 0 0 0 )
( 0 0 0 0 )
( 1 0 0 1 )

A direct computation gives

Π_∞^*(A) = σ_- A σ_+ + n_- A n_-, ∀A ∈ M_2.

Consider a density matrix ρ of the form

ρ = ( α  β ; β̄  1-α ),

with α ∈ [0, 1], β ∈ C. We have

Π_∞^*(ρ) = ( 0 0 ; 0 1 ) = |Ψ_-⟩⟨Ψ_-| = ρ_∞.

Therefore, it follows that

lim_{t→∞} Tr(e^{tL_∞^*} ρ A) = Tr(Π_∞^*(ρ) A) = Tr(ρ_∞ A), ∀A ∈ M_2.

This proves our theorem.
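As an illustration (an addition, not part of the original proof), the convergence e^{tL_∞^*}ρ → ρ_∞ can be checked numerically by integrating the master equation above with a simple Euler scheme; the rates ν_1, ν_2, ν_3 and the initial state below are arbitrary sample values, and the basis convention Ψ_+ = (1,0), Ψ_- = (0,1) is an implementation choice.

```python
import numpy as np

# Basis convention: Psi_+ = (1,0), Psi_- = (0,1).
sp = np.array([[0, 1], [0, 0]], dtype=complex)   # sigma_+ = |Psi_+><Psi_-|
sm = sp.conj().T                                 # sigma_-
n_plus = sp @ sm                                 # n_+ = |Psi_+><Psi_+|
n_minus = sm @ sp                                # n_- = |Psi_-><Psi_-|

nu1, nu2, nu3 = 0.7, 0.3, 0.5                    # arbitrary sample rates

def L_star(rho):
    """Right-hand side of the zero-temperature quantum master equation."""
    comm = lambda a, b: a @ b - b @ a
    return (1j * nu1 * comm(n_plus, rho) + 1j * nu2 * comm(n_minus, rho)
            + nu3 * (2 * sm @ rho @ sp - (n_plus @ rho + rho @ n_plus)))

# Euler integration from an arbitrary initial density matrix.
rho = np.array([[0.6, 0.2 + 0.1j], [0.2 - 0.1j, 0.4]], dtype=complex)
dt = 1e-3
for _ in range(20000):                           # evolve up to t = 20
    rho = rho + dt * L_star(rho)

rho_inf = np.diag([0.0, 1.0])                    # |Psi_-><Psi_-|
assert np.allclose(rho, rho_inf, atol=1e-4)      # return to equilibrium
assert np.isclose(np.trace(rho).real, 1.0)       # trace is preserved
```

The population of Ψ_+ decays at rate 2ν_3 and the coherences at rate ν_3, matching the matrix exponential displayed in the proof.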
6 Quantum Langevin Equation and Associated Hamiltonian

It is shown in [HP84] that any quantum master equation of a simple quantum system H_S can be dilated into a unitary quantum Langevin equation (quantum stochastic differential equation) on a larger space H_S ⊗ Γ, where Γ is a Fock space in which quantum noises naturally live. Note that it is also shown in the literature that natural quantum stochastic differential equations can be obtained by the stochastic limit of the full Hamiltonian system, as developed in [ALV02]. Let us now introduce some notations that we need in the sequel.

6.1 Basic Notations

Let Z be a Hilbert space in which we fix an orthonormal basis {z_k, k ∈ J}. We denote by Γ_s(R_+) the symmetric Fock space constructed over the Hilbert space Z ⊗ L^2(R_+). From the identifications

Z ⊗ L^2(R_+) ≃ L^2(R_+, Z) ≃ L^2(R_+ × J),

we get Γ_s(R_+) = Γ_sym(L^2(R_+ × J)). The space Z is called the multiplicity space and dim Z is called the multiplicity. The set J is equal to {1, ..., N} in the case of finite multiplicity N, and to N in the case of infinite multiplicity. We introduce another Hilbert space H, called the initial or system space, and we identify the tensor product

K(R_+) = H ⊗ Γ_s(R_+) = H ⊗ ⊕_{n=0}^{∞} L^2(R_+ × J)^{⊗n} = ⊕_{n=0}^{∞} H ⊗ L^2(R_+ × J)^{⊗n}

with the direct sum

⊕_{n=0}^{∞} H ⊗ L^2_sym((R_+ × J)^n) ≃ ⊕_{n=0}^{∞} L^2_sym((R_+ × J)^n, H),

consisting of the vectors Ψ = (Ψ_n)_{n≥0} such that Ψ_n ∈ L^2_sym((R_+ × J)^n, H) and

‖Ψ‖^2_{K(R_+)} = ∑_{n≥0} (1/n!) ‖Ψ_n‖^2_{L^2_sym((R_+×J)^n, H)} < ∞.

Note that for f ∈ L^2(R_+ × J), we define its associated exponential vector by

ε(f) = ∑_{n≥0} f^{⊗n} / √(n!).
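As a small numerical illustration (an addition, not in the original text): since ⟨f^{⊗n}, g^{⊗n}⟩ = ⟨f, g⟩^n, the definition of ε(f) above gives, for the unweighted Fock inner product, the classical identity ⟨ε(f), ε(g)⟩ = e^{⟨f,g⟩}. A truncated series reproduces it for finite-dimensional stand-ins for f and g (arbitrary sample vectors below).

```python
import numpy as np
from math import factorial

# Arbitrary sample vectors in C^3 standing in for f, g in L^2(R_+ x J).
f = np.array([0.3 + 0.1j, -0.2j, 0.5])
g = np.array([0.1, 0.4 - 0.2j, -0.3j])

ip = np.vdot(f, g)                   # <f, g> (vdot conjugates the first argument)

# <f^{(x)n}, g^{(x)n}> = <f,g>^n, so with eps(f) = sum_n f^{(x)n}/sqrt(n!)
# the (unweighted) Fock inner product is sum_n <f,g>^n / n! = exp(<f,g>).
series = sum(ip**n / factorial(n) for n in range(30))

assert np.isclose(series, np.exp(ip))
```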
6.2 Hudson-Parthasarathy Equation

Let H, R_k and S_kl, k, l ≥ 1, be bounded operators on H such that

∑_j S_jk^* S_jl = ∑_j S_kj S_lj^* = δ_kl,   H = H^*,   (8)

and the sum ∑_k R_k^* R_k is assumed to be strongly convergent to a bounded operator. Through H, R_k and S_kl we define the operators S ∈ U(H ⊗ Z), R ∈ B(H, H ⊗ Z), G ∈ B(H) by

Ru = ∑_k (R_k u) ⊗ z_k, ∀u ∈ H,
S = ∑_{kl} S_kl ⊗ |z_k⟩⟨z_l|,
G = -iH - (1/2) ∑_k R_k^* R_k = -iH - (1/2) R^* R.

The basic quantum noises are the processes

A_i(t) = A(1_{(0,t)} ⊗ z_i),
A_i^+(t) = A^+(1_{(0,t)} ⊗ z_i),
Λ_ij(t) = Λ(π_{(0,t)} ⊗ |z_i⟩⟨z_j|),

where i, j ∈ J, 1_{(0,t)} is the indicator function of (0, t), and π_{(0,t)} is the operator of multiplication by 1_{(0,t)} in L^2(R_+). The Hudson-Parthasarathy equation is defined as follows:

(HP)   dU(t) = ( ∑_k R_k dA_k^+(t) + ∑_{kl} (S_kl - δ_kl) dΛ_kl(t) - ∑_{kl} R_k^* S_kl dA_l(t) + G dt ) U(t),
       U(0) = 1.

Note that in order to have a unitary solution U of (HP), we need some conditions on the system operators. Actually the following theorem holds.

Theorem 6.1 Suppose that the system operators H, R_k, S_kl satisfy (8). Then there exists a unique strongly continuous unitary adapted process U(t) which satisfies equation (HP).

Proof. For the proof of this theorem we refer the reader to [P92].

Now, in order to associate a group V to the solution U of (HP), we first introduce the one-parameter strongly continuous unitary group θ on L^2(R, Z) and its associated second quantization Θ on Γ(R), defined by
(θ_t f)(r) = f(r + t), ∀f ∈ L^2(R, Z),
Θ_t ε(f) = ε(θ_t f), ∀f ∈ L^2(R, Z).   (9)

Note that Θ and U(t) can be extended to act on the space

K(R) = H ⊗ Γ_s(R_+) ⊗ Γ_s(R_-) = K(R_+) ⊗ Γ_s(R_-) = H ⊗ Γ_s(R),

by Θ_t = 1 ⊗ Θ_t in H ⊗ Γ_s(R), and U(t) = U(t) ⊗ 1 in K(R_+) ⊗ Γ_s(R_-).

Theorem 6.2 Let Θ be the one-parameter strongly continuous group defined by (9), and let U be the solution of the quantum stochastic differential equation (HP) with system operators satisfying (8). Then

U(t + s) = Θ_s^* U(t) Θ_s U(s), ∀s, t ≥ 0,

and the family V = {V_t}_{t∈R} given by

V_t = Θ_t U(t) for t ≥ 0,   V_t = U^*(|t|) Θ_t for t ≤ 0,

defines a one-parameter strongly continuous unitary group. Furthermore, the family of two-parameter unitary operators

U(t, s) = Θ_t^* V_{t-s} Θ_s = Θ_s^* U(t - s) Θ_s, ∀s ≤ t,

is strongly continuous in t and in s, and satisfies the composition law

U(t, s) U(s, r) = U(t, r), ∀r ≤ s ≤ t.

Proof. See [B06] for the proof of this theorem.

The group V defined above describes the reversible evolution of the small system coupled to the reservoir, which is modelled by the free Bose gas. The free evolution of the reservoir is represented by the group Θ, whose generator is formally given by

E_0 = dΓ(i ∂/∂x).

Note that U(t) = U(t, 0) = Θ_t^* V_t is the evolution operator giving the dynamics of the whole system from time 0 to time t in the interaction picture. Moreover, by the Stone theorem,

dΘ_t = -iE_0 Θ_t dt,   dV_t = -iK V_t dt.
The operators H and E_0 represent respectively the energies of the small system and of the reservoir. The operator K represents the total energy of the combined system in the interaction picture, and the system operators R_j, S_ij control this interaction. Besides, if we take R_j = 0, S_ij = δ_ij, then we get

U(t) = e^{itH},   V_t = e^{-itE_0} e^{-itH},

and K = E_0 + H, which is a self-adjoint operator defined on H ⊗ D(E_0). In [G01], Gregoratti gives an essentially self-adjoint restriction of the Hamiltonian K, which appears as a singular perturbation of E_0 + H.

6.3 Hamiltonian Associated to the Hudson-Parthasarathy Equation

Recall that the generators ∂_0 and E_0 of the groups θ on L^2(R, Z) and Θ on K are self-adjoint unbounded operators. In order to make their domains explicit, we introduce the Sobolev space

H((R × J)^n, H) = { u ∈ L^2((R × J)^n, H) such that ∂_k u ∈ L^2((R × J)^n, H), k = 1, ..., n },

where all the derivatives of u are taken in the sense of distributions in (R × J)^n (n ≥ 1), and

H((R × J)^0, H) = H.

Furthermore, H((R × J)^n, H) is a Hilbert space with respect to the scalar product

⟨u, v⟩_{H((R×J)^n, H)} = ⟨u, v⟩_{L^2((R×J)^n, H)} + ∑_{k=1}^{n} ⟨∂_k u, ∂_k v⟩_{L^2((R×J)^n, H)}.

Set

H_sym((R × J)^n, H) = H((R × J)^n, H) ∩ L^2_sym((R × J)^n, H).

We have D(∂_0) = H^1(R, Z) and ∂_0 u = iu′. Besides, the domain of E_0 is given by

D(E_0) = { Φ ∈ K s.t. Φ_n ∈ H_sym((R × J)^n, H), ∀n, and ∑_{n≥1} (1/n!) ∑_{k=1}^{n} ‖∂_k Φ_n‖^2 < ∞ },
and this operator acts on its domain by (E_0 Φ)_n = i ∑_{k=1}^{n} ∂_k Φ_n.

Set R^* = R \ {0}. Let us introduce the dense subspaces of K defined by

W = { Φ ∈ K s.t. Φ_n ∈ H_sym((R^* × J)^n, H), ∀n, and ∑_{n≥1} (1/n!) ∑_{k=1}^{n} ‖∂_k Φ_n‖^2_{L^2((R×J)^n, H)} < ∞ },

ν_s = { Φ ∈ W s.t. ∑_{n≥0} (1/n!) ‖Φ_{n+1}|_{r_{n+1}=s}‖^2_{Z⊗L^2((R×J)^n, H)} < ∞ },

ν_0^± = ν_0^- ∩ ν_0^+,

where Φ_{n+1}|_{r_{n+1}=s} is the trace (restriction) of the function Φ_{n+1} on the hyperplane {r_{n+1} = s}, for all s ∈ R^* ∪ {0^-, 0^+}. Clearly ν_0^± ⊆ W. Define the trace operator a(s) : ν_s → Z ⊗ K by

(a(s)Φ)_n = Φ_{n+1}|_{r_{n+1}=s}.

Note that ε(H^1(R^*, Z)) ⊂ ν_s and

a(s)(Ψ(u) ⊗ h) = u(s) ⊗ Ψ(u) ⊗ h, ∀u ∈ H^1(R^*, Z), h ∈ H,

where

Ψ(u) = (1, u, u^{⊗2}, ..., u^{⊗n}, ...).

Moreover, W ⊃ D(E_0), and E_0 can be extended to a non-symmetric unbounded operator E on W by

(EΦ)_n = i ∑_{k=1}^{n} ∂_k Φ_n.

The following theorem gives an essentially self-adjoint restriction of the Hamiltonian operator associated to (HP); it is proved in [G01].

Theorem 6.3 Let K be the Hamiltonian operator associated to the equation (HP), with system operators satisfying (8). Then
(1) D(K) ∩ ν_0^± = { Φ ∈ ν_0^± s.t. a(0^-)Φ = S a(0^+)Φ + RΦ },
(2) KΦ = ( H + E - iR^* a(0^-) + (i/2) R^* R ) Φ, ∀Φ ∈ D(K) ∩ ν_0^±,
(3) K|_{D(K)∩ν_0^±} is an essentially self-adjoint operator.

6.4 Hamiltonian Associated to the Stochastic Evolution of the Spin-Boson System

Recall that the quantum Langevin equation of the spin-boson system is defined on C^2 ⊗ Γ_s(L^2(R_+, C^2)) by
dU(t) = ( G dt + ∑_{k=1}^{2} L_k dA_k^+(t) - ∑_{k=1}^{2} L_k^* dA_k(t) ) U(t),
U(0) = I,

where G and L_k, k ∈ {1, 2}, are given by relation (7). Note that this equation belongs to the class of Hudson-Parthasarathy equations, with S_ij = δ_ij. Moreover, we have

S = I,
Ru = (2Re(α, α)_-^+)^{1/2} σ_- u ⊗ Ψ_+ + (2Re(α, α)_-^-)^{1/2} σ_+ u ⊗ Ψ_-, ∀u ∈ C^2,
R^*(u ⊗ ϕ) = ⟨Ψ_+, ϕ⟩ (2Re(α, α)_-^+)^{1/2} σ_+ u + ⟨Ψ_-, ϕ⟩ (2Re(α, α)_-^-)^{1/2} σ_- u, ∀u, ϕ ∈ C^2,
R^* R = 2Re(α, α)_-^+ n_+ + 2Re(α, α)_-^- n_-.

Therefore we get

ν_0^± ∩ D(K) = { Φ ∈ ν_0^± s.t. a(0^-)Φ = a(0^+)Φ + RΦ },

and

KΦ = ( H + E - iR^* a(0^-) + i ( Re(α, α)_-^+ n_+ + Re(α, α)_-^- n_- ) ) Φ,

for every Φ ∈ ν_0^± ∩ D(K).

Recall that the associated energy of the reservoir is given by E = dΓ(i ∂/∂x). By the spectral theorem, i ∂/∂x is unitarily equivalent to the operator of multiplication by a variable ω on R. Thus we get E = dΓ(ω), and E is the same as the usual free Hamiltonian. On the other hand, the operator

H = ( Im(α, α)_-^+ - Im(α, α)_+^+ ) n_+ + ( Im(α, α)_-^- - Im(α, α)_+^- ) n_-

describes the energy of the spin. Note that the constants Im(α, α)_±^± have an important physical interpretation: in some sense they contain all the physical information on the original Hamiltonian of the spin. The free evolution of the combined system is described by H_f = H + E, and the Hamiltonian K appears as a singular perturbation of H_f, where the operator R defined above controls the interaction between the spin and the reservoir.
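As a quick numerical sanity check (an addition, not in the original text), the algebraic identity R^*R = c_+ n_+ + c_- n_- implied by the block form of R above can be verified directly; the constants c_± below are arbitrary positive stand-ins for the quantities 2Re(α, α)_-^±.

```python
import numpy as np

# Arbitrary positive stand-ins for 2Re(alpha, alpha)^{+} and 2Re(alpha, alpha)^{-}.
c_plus, c_minus = 1.3, 0.4

sp = np.array([[0, 1], [0, 0]], dtype=complex)   # sigma_+
sm = sp.conj().T                                 # sigma_-
n_plus, n_minus = sp @ sm, sm @ sp

psi_plus = np.array([[1.0], [0.0]])              # Psi_+ as a column
psi_minus = np.array([[0.0], [1.0]])             # Psi_- as a column

# R : C^2 -> C^2 (x) C^2, with R u = sqrt(c+) (sigma_- u) (x) Psi_+
#                                  + sqrt(c-) (sigma_+ u) (x) Psi_-,
# represented as a 4 x 2 matrix in the (system (x) multiplicity) ordering.
R = (np.sqrt(c_plus) * np.kron(sm, psi_plus)
     + np.sqrt(c_minus) * np.kron(sp, psi_minus))

RstarR = R.conj().T @ R
assert np.allclose(RstarR, c_plus * n_plus + c_minus * n_minus)
```

The cross terms vanish because ⟨Ψ_+, Ψ_-⟩ = 0, which is exactly why R^*R is diagonal in the basis {Ψ_+, Ψ_-}.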
7 Repeated Quantum Interaction Model

In this section, we start by describing the repeated quantum interaction model (cf [AP06]). We prove that the quantum Langevin equation of the spin-boson system at zero temperature can be obtained as the continuous limit of a
Hamiltonian repeated interaction model. Moreover, we compare the Lindbladian of the spin-boson system at positive temperature to the one obtained by the method introduced in [AJ07].

Consider a small system H_0 coupled with a piece of environment H. The interaction between the two systems is described by a Hamiltonian H defined on H_0 ⊗ H. The associated unitary evolution during the time interval [0, h] is L = e^{-ihH}. After the first interaction, we repeat this coupling, this time between the same H_0 and a new copy of H. Therefore, the sequence of repeated interactions is described by the space

H_0 ⊗ ⊗_{N^*} H.

The unitary evolution of the small system in interaction with the n-th copy of H, denoted by H_n, is the operator L_n which acts as L on H_0 ⊗ H_n and as the identity on the copies of H other than H_n. The associated evolution equation of this model is defined on H_0 ⊗ ⊗_{N^*} H by

u_{n+1} = L_{n+1} u_n,   u_0 = I.   (10)

Let {X_i}_{i∈Λ∪{0}} be an orthonormal basis of H with X_0 = Ω, and let us consider the coefficients (L^i_j)_{i,j∈Λ∪{0}}, operators on H_0, of the matrix representation of L in the basis {X_i}_{i∈Λ∪{0}}.

Theorem 7.1 If

L^0_0 = I - h ( iH + (1/2) ∑_k L_k^* L_k ) + h ω^0_0,
L^0_j = √h L_j + √h ω^0_j,
L^i_0 = -√h ∑_k L_k^* S^k_i + √h ω^i_0,
L^i_j = S^i_j + h ω^i_j,

where H is a self-adjoint bounded operator, (S^i_j)_{i,j} is a family of unitary operators, the (L_i)_i are operators on H_0, and the terms ω^i_j converge to 0 as h tends to 0, then the solution (u_n)_{n∈N} of (10) is made of invertible operators which are locally uniformly bounded in norm. Moreover, u_{[t/h]} converges weakly to the solution U(t) of the equation

dU(t) = ∑_{i,j} L^i_j U(t) da^i_j(t),   U(0) = I,
where

L^0_0 = -( iH + (1/2) ∑_k L_k^* L_k ),
L^0_j = L_j,
L^i_0 = -∑_k L_k^* S^k_i,
L^i_j = S^i_j - δ_ij I.

Proof. See [AP06] for the proof of this theorem.

Now, let us put H_0 = H = C^2 and consider the dipole interaction Hamiltonian defined on C^2 ⊗ C^2 as

H = σ_z ⊗ I + I ⊗ H_R + (1/√h) ( σ_- ⊗ a^* + σ_+ ⊗ a ),

where

H_R = ( 0 0 ; 0 2 )

is the Hamiltonian of the piece of the reservoir, V = σ_-,

a = ( 0 1 ; 0 0 ),

and a^* is the adjoint of a. Fix an orthonormal basis {Ω, X} of C^2 given by

Ω = (1, 0)^T,   X = (0, 1)^T.

The unitary evolution during the time interval [0, h] is L = e^{-ihH}, whose coefficients satisfy

L^0_0 = ⟨Ω, LΩ⟩ = I - ih σ_z - (1/2) h σ_+ σ_- + o(h),
L^1_0 = ⟨Ω, LX⟩ = -i√h σ_+ + o(√h),
L^0_1 = ⟨X, LΩ⟩ = -i√h σ_- + o(√h),
L^1_1 = ⟨X, LX⟩ = I - ih σ_z - 2ih I - (1/2) h σ_- σ_+ + o(h).

Therefore, as h → 0, we obtain

(L^0_0 - I)/h → G_0 = -iσ_z - (1/2) σ_+ σ_-,
L^1_0 / √h → -L^* = -iσ_+,
L^0_1 / √h → L = -iσ_-.
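These asymptotics can be checked numerically by exponentiating the dipole Hamiltonian for a small h (an added illustration, not part of the original argument); H is diagonalized with numpy, and the basis ordering C^2_system ⊗ C^2_reservoir is an implementation choice.

```python
import numpy as np

h = 1e-6                                          # sample small interaction time
sp = np.array([[0, 1], [0, 0]], dtype=complex)    # sigma_+
sm = sp.conj().T                                  # sigma_-
sz = np.diag([1.0, -1.0]).astype(complex)
I2 = np.eye(2, dtype=complex)
a = np.array([[0, 1], [0, 0]], dtype=complex)
HR = np.diag([0.0, 2.0]).astype(complex)

# Dipole Hamiltonian on C^2 (x) C^2 (system (x) reservoir ordering).
H = (np.kron(sz, I2) + np.kron(I2, HR)
     + (np.kron(sm, a.conj().T) + np.kron(sp, a)) / np.sqrt(h))

# L = exp(-ihH) via diagonalization (H is Hermitian).
w, V = np.linalg.eigh(H)
L = V @ np.diag(np.exp(-1j * h * w)) @ V.conj().T

# Reservoir matrix-element blocks <X_i, L X_j>, acting on the system space.
block = lambda i, j: L[np.ix_([0 + i, 2 + i], [0 + j, 2 + j])]

G0 = -1j * sz - 0.5 * sp @ sm
assert np.allclose((block(0, 0) - I2) / h, G0, atol=1e-2)          # L^0_0
assert np.allclose(block(0, 1) / np.sqrt(h), -1j * sp, atol=1e-2)  # <Omega, L X>
assert np.allclose(block(1, 0) / np.sqrt(h), -1j * sm, atol=1e-2)  # <X, L Omega>
```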
Thus, by Theorem 7.1, the solution (u_n)_{n∈N} of the equation

u_{n+1} = L_{n+1} u_n,   u_0 = I,

is made of invertible operators which are locally uniformly bounded in norm, and in particular u_{[t/h]} converges weakly to the solution U(t) of the equation

dU(t) = ( G_0 dt + L dA^+(t) - L^* dA^-(t) ) U(t),   U(0) = I.

Theorem 7.2 The quantum dynamical semigroup of the repeated quantum interaction model associated to the spin-boson system at zero temperature converges to equilibrium.

Proof. The associated Lindbladian of the above equation is of the form

L(X) = i[σ_z, X] + 2σ_+ X σ_- - {n_+, X},

and the proof is similar to that of Theorem 5.5.

Now, at inverse temperature β, we suppose that the piece of the reservoir is described by the state

ρ = e^{-βH_R} / Tr(e^{-βH_R}) = ( β_0 0 ; 0 β_1 ).

The GNS representation of (C^2, ρ) is the triple (π, H̃, Ω_R) such that
• Ω_R = I,
• H̃ = M_2, the algebra of all complex 2 × 2 matrices, equipped with the scalar product ⟨A, B⟩ = Tr(ρ A^* B),
• π : M_2 → B(H̃) is such that π(M)A = MA, ∀M, A ∈ M_2.

Set

X_1 = (1/√β_1) ( 0 1 ; 0 0 ),
X_2 = (1/√β_0) ( 0 0 ; 1 0 ),
X_3 = (1/√(β_0 β_1)) ( β_1 0 ; 0 -β_0 ).

It is easy to show that (Ω_R, X_1, X_2, X_3) is an orthonormal basis of M_2. Now, if we put L̃ = π(L), which is defined on C^2 ⊗ M_2, then a straightforward computation shows that the coefficients (L̃^i_j)_{i,j}, operators on C^2, of the matrix representation of L̃ are given by
L̃^0_0 = I - ih σ_z - ih β_1 I - (1/2) h β_0 σ_+ σ_- - (1/2) h β_1 σ_- σ_+ + o(h),
L̃^0_1 = -i √(β_1 h) σ_+ + o(h^{3/2}),
L̃^0_2 = -i √(β_0 h) σ_- + o(h^{3/2}),
L̃^0_3 = o(h),
L̃^1_0 = -i √(β_1 h) σ_- + o(h^{3/2}),
L̃^2_0 = -i √(β_0 h) σ_+ + o(h^{3/2}),
L̃^3_0 = o(h),
L̃^1_1 = I + o(h),
L̃^2_2 = I + o(h),
L̃^3_3 = I + o(h),
L̃^2_1 = L̃^3_1 = L̃^1_2 = L̃^3_2 = L̃^1_3 = L̃^2_3 = 0.

Hence, as h → 0, we get

(L̃^0_0 - I)/h → L^0_0 = -iσ_z - iβ_1 I - (1/2) β_0 σ_+ σ_- - (1/2) β_1 σ_- σ_+,
L̃^0_1 / √h → L^0_1 = -i√β_1 σ_+,
L̃^0_2 / √h → L^0_2 = -i√β_0 σ_-,
L̃^1_0 / √h → L^1_0 = -i√β_1 σ_-,
L̃^2_0 / √h → L^2_0 = -i√β_0 σ_+,

and the other terms converge to 0 when h tends to 0. Thus the solution (ũ_n)_{n∈N} of the equation

ũ_{n+1} = L̃_{n+1} ũ_n,   ũ_0 = I,

is made of invertible operators which are locally uniformly bounded in norm, and in particular ũ_{[t/h]} converges weakly to the solution Ũ(t) of the equation

dŨ(t) = [ -( iσ_z + iβ_1 I + (1/2) β_0 σ_+ σ_- + (1/2) β_1 σ_- σ_+ ) dt
          - iσ_- ( √β_1 da^1_0(t) + √β_0 da^0_2(t) )
          - iσ_+ ( √β_1 da^0_1(t) + √β_0 da^2_0(t) ) ] Ũ(t),
Ũ(0) = I.
Proof. It suffices to observe that the associated Lindbladian of the above equation has the form

L(X) = i[σ_z, X] + (1/2) β_0 ( 2σ_- X σ_+ - {n_-, X} ) + (1/2) β_1 ( 2σ_+ X σ_- - {n_+, X} ).

Remark: Note that by using the repeated quantum interaction model we can prove that the Markovian properties of the spin-boson system hold without any further assumption.
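As a numerical illustration (an addition, not in the original text): evolving an arbitrary density matrix under the Schrödinger-picture dual of the Lindbladian above drives it to the diagonal state diag(β_0, β_1), which matches the reservoir state ρ. The weights β_0, β_1 below are arbitrary sample values with β_0 + β_1 = 1.

```python
import numpy as np

b0, b1 = 0.7, 0.3                                 # sample weights, b0 + b1 = 1
sp = np.array([[0, 1], [0, 0]], dtype=complex)    # sigma_+
sm = sp.conj().T                                  # sigma_-
sz = np.diag([1.0, -1.0]).astype(complex)
n_plus, n_minus = sp @ sm, sm @ sp

def L_star(rho):
    """Schrodinger-picture dual of the Lindbladian of Theorem 7.3."""
    comm = lambda a, b: a @ b - b @ a
    return (-1j * comm(sz, rho)
            + 0.5 * b0 * (2 * sp @ rho @ sm - (n_minus @ rho + rho @ n_minus))
            + 0.5 * b1 * (2 * sm @ rho @ sp - (n_plus @ rho + rho @ n_plus)))

rho = np.array([[0.2, 0.3j], [-0.3j, 0.8]], dtype=complex)  # arbitrary state
dt = 1e-3
for _ in range(30000):                            # evolve up to t = 30
    rho = rho + dt * L_star(rho)

assert np.allclose(rho, np.diag([b0, b1]), atol=1e-4)
```

The detailed-balance-type rates (gain β_0 from Ψ_-, loss β_1 from Ψ_+) make diag(β_0, β_1) the unique fixed point here.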
References

[AK00] L. Accardi, S. Kozyrev: Quantum interacting particle systems. Volterra International School (2000).
[AFL90] L. Accardi, A. Frigerio, Y.G. Lu: Weak coupling limit as a quantum functional central limit theorem. Comm. Math. Phys. 131, 537-570 (1990).
[ALV02] L. Accardi, Y.G. Lu, I. Volovich: Quantum Theory and its Stochastic Limit. Springer-Verlag, Berlin (2002).
[AL87] R. Alicki, K. Lendi: Quantum Dynamical Semigroups and Applications. Lecture Notes in Physics 286. Springer-Verlag, Berlin (1987).
[AJ07] S. Attal, A. Joye: The Langevin equation for a quantum heat bath. J. Funct. Anal. 247, 253-288 (2007).
[AP06] S. Attal, Y. Pautrat: From repeated to continuous quantum interactions. Ann. Henri Poincaré 7, 59-104 (2006).
[B06] A. Barchielli: Continual measurements in quantum mechanics. In: Quantum Open Systems. Vol. III: Recent Developments. Lecture Notes in Mathematics 1882. Springer-Verlag (2006).
[BR96] O. Bratteli, D.W. Robinson: Operator Algebras and Quantum Statistical Mechanics II. Springer-Verlag, New York, second edition (1996).
[C04] R. Carbone: Optimal log-Sobolev inequality and hypercontractivity for positive semigroups on M_2(C). Infin. Dimens. Anal. Quantum Probab. Relat. Top. 7, no. 3, 317-335 (2004).
[D74] E.B. Davies: Markovian master equations. Comm. Math. Phys. 39, 91-110 (1974).
[D76a] E.B. Davies: Markovian master equations II. Math. Ann. 219, 147-158 (1976).
[D80] E.B. Davies: One-Parameter Semigroups. Academic Press, London (1980).
[D76b] E.B. Davies: Quantum Theory of Open Systems. Academic Press, New York and London (1976).
[DJ03] J. Derezinski, V. Jaksic: Return to equilibrium for Pauli-Fierz systems. Ann. Henri Poincaré 4, 739-793 (2003).
[DJP03] J. Derezinski, V. Jaksic, C.A. Pillet: Perturbation theory of W*-dynamics, KMS-states and Liouvillean. Rev. Math. Phys. 15, 447-489 (2003).
[DF06] J. Derezinski, R. Fruboes: Fermi golden rule and open quantum systems. In: Quantum Open Systems. Vol. III: Recent Developments. Lecture Notes in Mathematics 1882. Springer-Verlag (2006).
[F06] F. Fagnola: Quantum stochastic differential equations and dilation of completely positive semigroups. In: Quantum Open Systems. Vol. II: The Markovian Approach. Lecture Notes in Mathematics 1881. Springer-Verlag (2006).
[F99] F. Fagnola: Quantum Markovian semigroups and quantum flows. Proyecciones, Journal of Math. 18, no. 3, 1-144 (1999).
[F93] F. Fagnola: Characterization of isometric and unitary weakly differentiable cocycles in Fock space. Quantum Probability and Related Topics VIII, 143 (1993).
[FR06] F. Fagnola, R. Rebolledo: Notes on the qualitative behaviour of quantum Markov semigroups. In: Quantum Open Systems. Vol. III: Recent Developments. Lecture Notes in Mathematics 1882. Springer-Verlag (2006).
[FR98] F. Fagnola, R. Rebolledo: The approach to equilibrium of a class of quantum dynamical semigroups. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 1, no. 4, 1-12 (1998).
[HP84] R.L. Hudson, K.R. Parthasarathy: Quantum Ito's formula and stochastic evolutions. Comm. Math. Phys. 93, no. 3, 301-323 (1984).
[G01] M. Gregoratti: The Hamiltonian operator associated with some quantum stochastic evolutions. Comm. Math. Phys. 222, 181-200 (2001).
[JP96a] V. Jaksic, C.A. Pillet: On a model for quantum friction II: Fermi's golden rule and dynamics at positive temperature. Comm. Math. Phys. 178, 627 (1996).
[JP96b] V. Jaksic, C.A. Pillet: On a model for quantum friction III: Ergodic properties of the spin-boson system. Comm. Math. Phys. 178, 627 (1996).
[M95] P.A. Meyer: Quantum Probability for Probabilists. Second edition. Lecture Notes in Mathematics 1538. Springer-Verlag, Berlin (1995).
[P92] K.R. Parthasarathy: An Introduction to Quantum Stochastic Calculus. Birkhäuser Verlag, Basel (1992).
[R06] R. Rebolledo: Complete positivity and open quantum systems. In: Quantum Open Systems. Vol. II: The Markovian Approach. Lecture Notes in Mathematics 1881. Springer-Verlag (2006).
Statistical Properties of Pauli Matrices Going Through Noisy Channels

Stéphane Attal and Nadine Guillotin-Plantard

Université Lyon 1, Institut Camille Jordan, 43 bld du 11 novembre 1918, 69622 Villeurbanne Cedex, France e-mail: [email protected]; [email protected]

Summary. We study the statistical properties of the triple (σ_x, σ_y, σ_z) of Pauli matrices going through a sequence of noisy channels, modeled by the repetition of a general, trace-preserving, completely positive map. We show a non-commutative central limit theorem for the distribution of this triple, which features in the limit a 3-dimensional Brownian motion with a non-trivial covariance matrix. We also prove a large deviation principle associated to this convergence, with an explicit rate function depending on the stationary state of the noisy channel.
1 Introduction

In quantum information theory, one of the most important questions is to understand and to control the way a quantum bit is modified when transmitted through a quantum channel. It is well-known that realistic transmission channels are not perfect and distort the quantum bit they transmit. This transformation of the quantum state is represented by the action of a completely positive map. These are the so-called noisy channels.

The purpose of this article is to study the action of the repetition of a general completely positive map on basic observables. Physically, this model can be thought of as the sequence of transformations of small identical pieces of noisy channels on a qubit. It can also be thought of as a discrete approximation of the more realistic model of a quantum bit going through a semigroup of completely positive maps (a Lindblad semigroup).

As basic observables, we consider the triple (σ_x, σ_y, σ_z) of Pauli matrices. Under the repeated action of the completely positive map, they behave as a 3-dimensional quantum random walk. The aim of this article is to study the statistical properties of this quantum random walk. Indeed, for any initial density matrix ρ_in, we study the statistical properties of the empirical average of the Pauli matrices in the successive states Φ^n(ρ_in), n ≥ 0, where Φ is some completely positive and trace-preserving map describing our quantum channel.

C. Donati-Martin et al. (eds.), Séminaire de Probabilités XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6_16, © Springer-Verlag Berlin Heidelberg 2009

The quantum Bernoulli random walks studied
by Biane in [1] correspond to the case where Φ is the identity map. Biane [1] proved an invariance principle for this quantum random walk when ρ_in = (1/2)I.

This article is organized as follows. In section two we describe the physical and mathematical setup. In section three we establish a functional central limit theorem for the empirical average of the quantum random walk associated to the Pauli matrices, generalizing Biane's result [1]. This central limit theorem involves a 3-dimensional Brownian motion in the limit, whose covariance matrix is non-trivial and depends explicitly on the stationary state of the noisy channel. In section four, we apply our central limit theorem to some explicit cases, in particular to the King-Ruskai-Szarek-Werner representation of completely positive and trace-preserving maps on M_2(C). This allows us to compute the limit Brownian motion for the best known quantum channels: the depolarizing channel, the phase-damping channel, the amplitude-damping channel. Finally, in the last section, a large deviation principle for the empirical average is proved.

2 Model and Notations

Let M_2(C) be the set of 2 × 2 matrices with complex coefficients. The set of 2 × 2 self-adjoint matrices forms a four-dimensional real vector subspace of M_2(C). A convenient basis B is given by the matrices

I = ( 1 0 ; 0 1 ),   σ_x = ( 0 1 ; 1 0 ),   σ_y = ( 0 -i ; i 0 ),   σ_z = ( 1 0 ; 0 -1 ),

where σ_x, σ_y, σ_z are the traditional Pauli matrices; they satisfy the commutation relation [σ_x, σ_y] = 2iσ_z and those obtained by cyclic permutations of σ_x, σ_y, σ_z.

A state on M_2(C) is given by a density matrix (i.e. a positive semi-definite matrix with trace one), which we will suppose to be of the form

ρ = ( α  β ; β̄  1-α ),

where 0 ≤ α ≤ 1 and |β|^2 ≤ α(1 - α). The noise coming from interactions between the qubit states and the environment is represented by the action of a completely positive and trace-preserving map Φ : M_2(C) → M_2(C).

Let M_1, M_2, ..., M_k, ... be infinitely many copies of M_2(C). For each given state ρ, we consider the algebra

M_ρ = M_1 ⊗ M_2 ⊗ ... ⊗ M_k ⊗ ...,

where the product is taken in the sense of W*-algebras with respect to the product state

ω = ρ ⊗ Φ(ρ) ⊗ Φ^2(ρ) ⊗ ... ⊗ Φ^k(ρ) ⊗ ... .

Our main hypothesis is the following. We assume that for any state ρ, the sequence Φ^n(ρ) converges to a stationary state ρ_∞, which we write as

ρ_∞ = ( α_∞  β_∞ ; β̄_∞  1-α_∞ ),

where 0 ≤ α_∞ ≤ 1 and |β_∞|^2 ≤ α_∞(1 - α_∞). Put

v_1 = 2 Re(β_∞),   v_2 = -2 Im(β_∞),   v_3 = 2α_∞ - 1.

For every k ≥ 1, we define

x_k = I ⊗ ... ⊗ I ⊗ (σ_x - v_1 I) ⊗ I ⊗ ...,
y_k = I ⊗ ... ⊗ I ⊗ (σ_y - v_2 I) ⊗ I ⊗ ...,
z_k = I ⊗ ... ⊗ I ⊗ (σ_z - v_3 I) ⊗ I ⊗ ...,

where each (σ_. - v_. I) appears in the k-th place. For every n ≥ 1, put

X_n = ∑_{k=1}^{n} x_k,   Y_n = ∑_{k=1}^{n} y_k,   Z_n = ∑_{k=1}^{n} z_k,

with initial conditions X_0 = Y_0 = Z_0 = 0. The integer part of a real t is denoted by [t]. To each process we associate a continuous-time normalized process

X_t^{(n)} = n^{-1/2} X_{[nt]},   Y_t^{(n)} = n^{-1/2} Y_{[nt]},   Z_t^{(n)} = n^{-1/2} Z_{[nt]}.
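For a concrete channel, the stationary state ρ_∞ and the vector (v_1, v_2, v_3) can be computed by simply iterating Φ. The sketch below (an addition, not from the text) uses the amplitude-damping channel in its standard Kraus form; the damping parameter p is an arbitrary sample value.

```python
import numpy as np

p = 0.3                                            # sample damping parameter

# Kraus operators of the amplitude-damping channel.
K0 = np.array([[1, 0], [0, np.sqrt(1 - p)]], dtype=complex)
K1 = np.array([[0, np.sqrt(p)], [0, 0]], dtype=complex)

def Phi(rho):
    return K0 @ rho @ K0.conj().T + K1 @ rho @ K1.conj().T

# Iterate Phi^n(rho) until it stabilizes at the stationary state.
rho = np.array([[0.5, 0.2 - 0.1j], [0.2 + 0.1j, 0.5]], dtype=complex)
for _ in range(200):
    rho = Phi(rho)

alpha_inf, beta_inf = rho[0, 0].real, rho[0, 1]
v = np.array([2 * beta_inf.real, -2 * beta_inf.imag, 2 * alpha_inf - 1])
assert np.allclose(v, [0.0, 0.0, 1.0], atol=1e-6)  # rho_inf = |0><0| here
```

For this channel the stationary state is pure, so v has unit length and the covariance matrix of the limit Brownian motion degenerates in the v direction.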
3 A Central Limit Theorem

The aim of our article is to study the asymptotic properties of the quantum process (X_t^{(n)}, Y_t^{(n)}, Z_t^{(n)}) when n goes to infinity. This process being truly non-commutative, there is no hope of obtaining an asymptotic behaviour in the classical sense. For any polynomial P = P(X_1, X_2, ..., X_m) of m variables, we denote by P̃ the totally symmetrized polynomial of P, obtained by symmetrizing each monomial in the following way:

X_{i_1} X_{i_2} ... X_{i_k} ⟶ (1/k!) ∑_{σ∈S_k} X_{i_{σ(1)}} ... X_{i_{σ(k)}},

where S_k is the group of permutations of {1, ..., k}.
Theorem 1. Assume that

(A)   Φ^n(ρ) = ρ_∞ + o(1/√n).   (1)

Then, for any polynomial P of 3m variables, for any (t_1, ..., t_m) such that 0 ≤ t_1 < t_2 < ... < t_m, the following convergence holds:

lim_{n→+∞} ω( P̃(X_{t_1}^{(n)}, Y_{t_1}^{(n)}, Z_{t_1}^{(n)}, ..., X_{t_m}^{(n)}, Y_{t_m}^{(n)}, Z_{t_m}^{(n)}) )
  = E[ P̃(B_{t_1}^{(1)}, B_{t_1}^{(2)}, B_{t_1}^{(3)}, ..., B_{t_m}^{(1)}, B_{t_m}^{(2)}, B_{t_m}^{(3)}) ],

where (B_t^{(1)}, B_t^{(2)}, B_t^{(3)})_{t≥0} is a three-dimensional centered Brownian motion with covariance matrix Ct, with

C = ( 1 - v_1^2   -v_1 v_2    -v_1 v_3 )
    ( -v_1 v_2    1 - v_2^2   -v_2 v_3 )
    ( -v_1 v_3    -v_2 v_3    1 - v_3^2 )

Remark: Theorem 1 has to be compared with the quantum central limit theorems obtained in [5] and [9]. In our case, the state under which the convergence holds does not need to be an infinite tensor product of states. We also give here a functional version of the central limit theorem. Finally, in [5] (see Remark 3, p. 131), the limit is described as a so-called quasi-free state in quantum mechanics. We prove in Theorem 1 that the limit is real Gaussian for the class of totally symmetrized polynomials.

Proof. Let m ≥ 1 and (t_0, t_1, ..., t_m) be such that t_0 = 0 < t_1 < t_2 < ... < t_m. The polynomial P̃(X_{t_1}^{(n)}, Y_{t_1}^{(n)}, Z_{t_1}^{(n)}, ..., X_{t_m}^{(n)}, Y_{t_m}^{(n)}, Z_{t_m}^{(n)}) can be rewritten as a polynomial function Q of the increments

X_{t_1}^{(n)}, Y_{t_1}^{(n)}, Z_{t_1}^{(n)}, X_{t_2}^{(n)} - X_{t_1}^{(n)}, Y_{t_2}^{(n)} - Y_{t_1}^{(n)}, Z_{t_2}^{(n)} - Z_{t_1}^{(n)}, ..., X_{t_m}^{(n)} - X_{t_{m-1}}^{(n)}, Y_{t_m}^{(n)} - Y_{t_{m-1}}^{(n)}, Z_{t_m}^{(n)} - Z_{t_{m-1}}^{(n)}.

A monomial of Q is a product of the form Q_{i_1} ... Q_{i_k} for some distinct i_1, ..., i_k in {1, ..., m}, where Q_i is a product depending only on the increments X_{t_i}^{(n)} - X_{t_{i-1}}^{(n)}, Y_{t_i}^{(n)} - Y_{t_{i-1}}^{(n)}, Z_{t_i}^{(n)} - Z_{t_{i-1}}^{(n)}. Since the Q_i's are commuting variables, the totally symmetrized polynomial of the monomial Q_{i_1} ... Q_{i_k} is equal to the product Q̃_{i_1} ... Q̃_{i_k}. Remark that, since one considers product states, the increments are independent; thus the expectations factorize, which allows us to reduce the proof of the theorem to the case of a single polynomial Q_i.

Fix i ≥ 1. For every ν_1, ν_2, ν_3 ∈ R, we begin by determining the asymptotic distribution of the linear combination
(ν_1^2 + ν_2^2 + ν_3^2)^{-1/2} ( ν_1 (X_{t_i}^{(n)} - X_{t_{i-1}}^{(n)}) + ν_2 (Y_{t_i}^{(n)} - Y_{t_{i-1}}^{(n)}) + ν_3 (Z_{t_i}^{(n)} - Z_{t_{i-1}}^{(n)}) ),   (2)

which can be rewritten as

(1/√n) ∑_{k=[nt_{i-1}]+1}^{[nt_i]} (ν_1 x_k + ν_2 y_k + ν_3 z_k) / √(ν_1^2 + ν_2^2 + ν_3^2).
Consider the matrix

A = (1/√(ν_1^2 + ν_2^2 + ν_3^2)) ( ν_1 (σ_x - v_1 I) + ν_2 (σ_y - v_2 I) + ν_3 (σ_z - v_3 I) )
  = (1/√(ν_1^2 + ν_2^2 + ν_3^2)) ( -ν_1 v_1 - ν_2 v_2 + ν_3 (1 - v_3)   ν_1 - iν_2 ; ν_1 + iν_2   -ν_1 v_1 - ν_2 v_2 - ν_3 (1 + v_3) ),

which we denote by

( a_1  a_3 ; ā_3  a_2 ),

with a_1, a_2 ∈ R, a_3 ∈ C. From assumption (A) we can write, for every n ≥ 0,

Φ^n(ρ) = ( α_∞ + φ_n(1)   β_∞ + φ_n(2) ; β̄_∞ + φ_n(3)   1 - α_∞ + φ_n(4) ),

where each sequence (φ_n(i))_n satisfies φ_n(i) = o(1/√n). Let k ≥ 1; the expectation and the variance of A in the state Φ^k(ρ) are respectively equal to Trace(AΦ^k(ρ)) and Trace(A^2 Φ^k(ρ)) - Trace(AΦ^k(ρ))^2. If both of the following conditions are satisfied:
∑_{k=[nt_{i-1}]+1}^{[nt_i]} Trace(AΦ^k(ρ)) = o(√n)   (3)

and

lim_{n→+∞} (1/n) ∑_{k=[nt_{i-1}]+1}^{[nt_i]} [ Trace(A^2 Φ^k(ρ)) - Trace(AΦ^k(ρ))^2 ] = a(t_i - t_{i-1}),   (4)

then (see Theorem 2.8.42 in [3]) the asymptotic distribution of (2) is the normal distribution N(0, a(t_i - t_{i-1})), a > 0.

Let us first prove (3). For every k ≥ 1, a simple computation gives

Trace(AΦ^k(ρ)) = a_1 α_∞ + a_3 β̄_∞ + ā_3 β_∞ + a_2 (1 - α_∞) + o(1/√n) = o(1/√n)

(the constant part vanishes because it equals Trace(Aρ_∞) = 0), hence

∑_{k=[nt_{i-1}]+1}^{[nt_i]} Trace(AΦ^k(ρ)) = ∑_{k=[nt_{i-1}]+1}^{[nt_i]} o(1/√n) = o(√n).

This gives (3).
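The key cancellation Trace(Aρ_∞) = 0, i.e. the Pauli combination in A is exactly centered in the stationary state, can be checked numerically for arbitrary admissible parameters (an added illustration, not part of the proof):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random admissible stationary state: 0 <= alpha <= 1, |beta|^2 <= alpha(1-alpha).
alpha = rng.uniform(0, 1)
r = np.sqrt(alpha * (1 - alpha)) * rng.uniform(0, 1)
beta = r * np.exp(1j * rng.uniform(0, 2 * np.pi))
rho_inf = np.array([[alpha, beta], [np.conj(beta), 1 - alpha]])

v1, v2, v3 = 2 * beta.real, -2 * beta.imag, 2 * alpha - 1

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.diag([1.0, -1.0]).astype(complex)
I2 = np.eye(2)

nu = rng.normal(size=3)                      # arbitrary coefficients nu_1,2,3
A = (nu[0] * (sx - v1 * I2) + nu[1] * (sy - v2 * I2)
     + nu[2] * (sz - v3 * I2)) / np.linalg.norm(nu)

# Tr(sigma_i rho_inf) = v_i, so each centered term has zero mean.
assert np.isclose(np.trace(A @ rho_inf), 0)
```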
Let us prove (4). Note that the sequence (Trace(AΦ^n(ρ)))_n converges to 0 as n tends to infinity. As a consequence, it is enough to prove that

(1/n) ∑_{k=[nt_{i-1}]+1}^{[nt_i]} Trace(A^2 Φ^k(ρ))

converges to a strictly positive constant. A straightforward computation gives

lim_{n→+∞} (1/n) ∑_{k=[nt_{i-1}]+1}^{[nt_i]} Trace(A^2 Φ^k(ρ))
  = ( a_1^2 α_∞ + a_2^2 (1 - α_∞) + |a_3|^2 + (a_1 + a_2)(a_3 β̄_∞ + ā_3 β_∞) ) (t_i - t_{i-1})
  = (1/(ν_1^2 + ν_2^2 + ν_3^2)) ( ν_1^2 (1 - v_1^2) + ν_2^2 (1 - v_2^2) + ν_3^2 (1 - v_3^2) - 2ν_1 ν_2 v_1 v_2 - 2ν_1 ν_3 v_1 v_3 - 2ν_2 ν_3 v_2 v_3 ) (t_i - t_{i-1}).
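This limit variance is exactly the quadratic form of the covariance matrix C of Theorem 1, which can be written compactly as C = I_3 - vv^T, evaluated at the unit vector ν/‖ν‖. A numerical check (an added illustration with arbitrary sample parameters):

```python
import numpy as np

rng = np.random.default_rng(1)

alpha = rng.uniform(0, 1)
beta = (np.sqrt(alpha * (1 - alpha)) * rng.uniform(0, 1)
        * np.exp(1j * rng.uniform(0, 2 * np.pi)))
rho_inf = np.array([[alpha, beta], [np.conj(beta), 1 - alpha]])

v = np.array([2 * beta.real, -2 * beta.imag, 2 * alpha - 1])
C = np.eye(3) - np.outer(v, v)               # covariance matrix of Theorem 1

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.diag([1.0, -1.0]).astype(complex)
sigmas = [sx, sy, sz]
I2 = np.eye(2)

nu = rng.normal(size=3)
A = sum(nu[i] * (sigmas[i] - v[i] * I2) for i in range(3)) / np.linalg.norm(nu)

# Variance of A in the stationary state equals nu^T C nu / |nu|^2.
var = np.trace(A @ A @ rho_inf) - np.trace(A @ rho_inf) ** 2
assert np.isclose(var.real, nu @ C @ nu / (nu @ nu))
assert abs(var.imag) < 1e-12
```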
This means that, for every ν_1, ν_2, ν_3 ∈ R and any p ≥ 1, the expectation

ω[ ( ν_1 (X_{t_i}^{(n)} - X_{t_{i-1}}^{(n)}) + ν_2 (Y_{t_i}^{(n)} - Y_{t_{i-1}}^{(n)}) + ν_3 (Z_{t_i}^{(n)} - Z_{t_{i-1}}^{(n)}) )^p ]

converges to

E[ ( ν_1 (B_{t_i}^{(1)} - B_{t_{i-1}}^{(1)}) + ν_2 (B_{t_i}^{(2)} - B_{t_{i-1}}^{(2)}) + ν_3 (B_{t_i}^{(3)} - B_{t_{i-1}}^{(3)}) )^p ],

where (B_t^{(1)}, B_t^{(2)}, B_t^{(3)}) is a 3-dimensional Brownian motion with the announced covariance matrix. The polynomial

( ν_1 (X_{t_i}^{(n)} - X_{t_{i-1}}^{(n)}) + ν_2 (Y_{t_i}^{(n)} - Y_{t_{i-1}}^{(n)}) + ν_3 (Z_{t_i}^{(n)} - Z_{t_{i-1}}^{(n)}) )^p

can be expanded as the sum

∑_{0≤p_1+p_2≤p} ν_1^{p_1} ν_2^{p_2} ν_3^{p-p_1-p_2} ∑_P S_1 S_2 ... S_p,

where the summation in the last sum runs over all partitions P = {A, B, C} of {1, ..., p} such that |A| = p_1, |B| = p_2, |C| = p - p_1 - p_2, with the convention

S_j = X_{t_i}^{(n)} - X_{t_{i-1}}^{(n)} if j ∈ A,   S_j = Y_{t_i}^{(n)} - Y_{t_{i-1}}^{(n)} if j ∈ B,   S_j = Z_{t_i}^{(n)} - Z_{t_{i-1}}^{(n)} if j ∈ C.

The expectation under ω of the above expression converges to the corresponding expression involving the expectation E[·] of the Brownian motion (B_t^{(1)}, B_t^{(2)}, B_t^{(3)}). As this holds for any ν_1, ν_2, ν_3 ∈ R, we deduce
that $w\big[\sum_P S_1S_2\cdots S_p\big]$ converges to the corresponding expectation for the Brownian motion. We can end the proof by noticing that $A_i$ can be written, modulo multiplication by a constant, as $\sum_P S_1S_2\cdots S_p$ for some $p$. $\square$

Let us discuss the class of polynomials for which Theorem 1 holds. In the particular case when the map $\Phi$ is the identity map and $\rho=\frac12 I$ (in that case $v_i=0$ for $i=1,2,3$ and $C=I$), Biane [1] proved the convergence of the expectations in Theorem 1 for any polynomial in $3m$ non-commuting variables. It is a natural question to ask whether our result holds for any polynomial $P$ instead of its symmetrized version $\widetilde P$, or at least for a larger class. Let us give an example of a polynomial for which the convergence in our setting does not hold. Take $P(X,Y)=XY$. From Theorem 1, the expectation under the state $\omega$ of $X^{(n)}_tY^{(n)}_t+Y^{(n)}_tX^{(n)}_t$ converges as $n\to+\infty$ to $2\,\mathbb E[B^{(1)}_tB^{(2)}_t]$. Since we have the following commutation relations:
$$[(\sigma_x-v_1I),(\sigma_y-v_2I)]=2i\sigma_z,\qquad [(\sigma_y-v_2I),(\sigma_z-v_3I)]=2i\sigma_x,\qquad [(\sigma_z-v_3I),(\sigma_x-v_1I)]=2i\sigma_y,\tag{5}$$
we deduce that
$$[X^{(n)}_t,Y^{(n)}_t]=2in^{-1/2}Z^{(n)}_t+2itv_3I,\qquad [Y^{(n)}_t,Z^{(n)}_t]=2in^{-1/2}X^{(n)}_t+2itv_1I,\qquad [Z^{(n)}_t,X^{(n)}_t]=2in^{-1/2}Y^{(n)}_t+2itv_2I.\tag{6}$$
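Relations (5) reduce to the bare Pauli algebra $[\sigma_x,\sigma_y]=2i\sigma_z$ (and cyclic permutations), since the scalar shifts $v_iI$ commute with everything. A quick numerical check (the values of $v_1,v_2,v_3$ below are arbitrary illustrative choices):

```python
import numpy as np

# Pauli matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def comm(a, b):
    return a @ b - b @ a

# the scalar parts drop out of the commutator, so the shifted operators
# satisfy exactly the relations (5)
v1, v2, v3 = 0.3, -0.7, 0.2
assert np.allclose(comm(sx - v1*I2, sy - v2*I2), 2j*sz)
assert np.allclose(comm(sy - v2*I2, sz - v3*I2), 2j*sx)
assert np.allclose(comm(sz - v3*I2, sx - v1*I2), 2j*sy)
print("relations (5) verified")
```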
Then the expectation under the state $\omega$ of
$$X^{(n)}_tY^{(n)}_t=\frac12\big(X^{(n)}_tY^{(n)}_t+Y^{(n)}_tX^{(n)}_t\big)+\frac12\,[X^{(n)}_t,Y^{(n)}_t]$$
converges to $\mathbb E[B^{(1)}_tB^{(2)}_t]+itv_3\neq\mathbb E[B^{(1)}_tB^{(2)}_t]$ if $v_3$ is non-zero. Furthermore, by considering the polynomial $P(X,Y)=XY^3+Y^3X$, it is possible to show that the convergence in Theorem 1 cannot be enlarged to the class of symmetric polynomials. A straightforward computation shows that $P(X,Y)$ can be rewritten as
$$\widetilde P(X,Y)+\frac34\,[X,Y](Y^2-X^2)+\frac12\big(Y[X,Y]Y-X[X,Y]X\big)+\frac34\,(Y^2-X^2)[X,Y],$$
where $\widetilde P$ denotes the corresponding symmetrized polynomial covered by Theorem 1, so the expectation $w[P(X^{(n)}_t,Y^{(n)}_t)]$ converges as $n$ tends to $+\infty$ to
$$\mathbb E[P(B^{(1)}_t,B^{(2)}_t)]+3iv_3t\,(v_1^2-v_2^2),$$
which is not equal to $\mathbb E[P(B^{(1)}_t,B^{(2)}_t)]$ if $v_3\neq0$ and $|v_1|\neq|v_2|$. In the following corollary we give a condition under which the convergence in Theorem 1 holds for any polynomial in $3m$ non-commuting variables.
S. Attal and N. Guillotin-Plantard
Corollary 1. In the case when $\rho_\infty$ is equal to $\frac12 I$, the convergence holds for any polynomial $P$ in $3m$ non-commuting variables, i.e. for every $t_1<t_2<\dots<t_m$, the following convergence holds:
$$\lim_{n\to+\infty} w\big[P\big(X^{(n)}_{t_1},Y^{(n)}_{t_1},Z^{(n)}_{t_1},\dots,X^{(n)}_{t_m},Y^{(n)}_{t_m},Z^{(n)}_{t_m}\big)\big]=\mathbb E\big[P\big(B^{(1)}_{t_1},B^{(2)}_{t_1},B^{(3)}_{t_1},\dots,B^{(1)}_{t_m},B^{(2)}_{t_m},B^{(3)}_{t_m}\big)\big],$$
where $(B^{(1)}_t,B^{(2)}_t,B^{(3)}_t)_{t\ge0}$ is a three-dimensional centered Brownian motion with covariance matrix $tI_3$.

Proof. We consider the polynomials of the form
$$S=\frac1N\sum_P S_1S_2\cdots S_{p_1+p_2+p_3},$$
where the summation is done over all partitions $P=\{A,B,C\}$ of the set $\{1,\dots,p_1+p_2+p_3\}$ such that $|A|=p_1$, $|B|=p_2$, $|C|=p_3$, with the convention
$$S_j=\begin{cases} X^{(n)}_{t_i}-X^{(n)}_{t_{i-1}} & \text{if } j\in A,\\ Y^{(n)}_{t_i}-Y^{(n)}_{t_{i-1}} & \text{if } j\in B,\\ Z^{(n)}_{t_i}-Z^{(n)}_{t_{i-1}} & \text{if } j\in C,\end{cases}$$
and $N$ is the number of terms in the sum. From Theorem 1, the expectation under the state $w$ of $S$ converges to
$$\mathbb E\Big[\prod_{j=1}^{3}\big(B^{(j)}_{t_i}-B^{(j)}_{t_{i-1}}\big)^{p_j}\Big].$$
Using the commutation relations (6) with all the $v_i$'s equal to zero, the monomials of $S$ differ from each other by $n^{-1/2}$ times a polynomial of total degree at most $(p_1+p_2+p_3)-1$. It is easy to conclude by induction. $\square$
4 Examples

4.1 King-Ruskai-Szarek-Werner's Representation

The set of $2\times2$ self-adjoint matrices forms a four-dimensional real vector subspace of $M_2(\mathbb C)$. A convenient basis of this space is given by $B=\{I,\sigma_x,\sigma_y,\sigma_z\}$. Each state $\rho$ on $M_2(\mathbb C)$ can then be written as
$$\rho=\frac12\begin{pmatrix} 1+z & x-iy\\ x+iy & 1-z\end{pmatrix},$$
where $x,y,z$ are reals such that $x^2+y^2+z^2\le1$. Equivalently, in the basis $B$,
$$\rho=\frac12\,(I+x\,\sigma_x+y\,\sigma_y+z\,\sigma_z)$$
with $x,y,z$ defined above. Thus, the set of density matrices can be identified with the unit ball in $\mathbb R^3$. The pure states, that is, the ones for which $x^2+y^2+z^2=1$, constitute the Bloch sphere.

The noise coming from interactions between the qubit states and the environment is represented by the action of a completely positive and trace-preserving map $\Phi:M_2(\mathbb C)\to M_2(\mathbb C)$. Kraus and Choi [2, 7, 8] gave an abstract representation of these particular maps in terms of Kraus operators: there exist at most four matrices $L_i$ such that for any density matrix $\rho$,
$$\Phi(\rho)=\sum_{1\le i\le4} L_i^*\rho L_i,\qquad\text{with}\quad \sum_i L_iL_i^*=I.$$
The matrices $L_i$ are usually called the Kraus operators of $\Phi$. This representation is unique up to a unitary transformation.

Recently, King, Ruskai et al. [10, 6] obtained a precise characterization of the completely positive and trace-preserving maps on $M_2(\mathbb C)$, as follows. The map $\Phi:M_2(\mathbb C)\to M_2(\mathbb C)$ being linear and trace-preserving, it can be represented as a unique $4\times4$ matrix in the basis $B$, given by
$$\begin{pmatrix} 1 & \mathbf 0\\ \mathbf t & T\end{pmatrix}$$
with $\mathbf 0=(0,0,0)$, $\mathbf t\in\mathbb R^3$ and $T$ a real $3\times3$ matrix. King, Ruskai et al. [10, 6] proved that, via changes of bases, this matrix can be reduced to
$$T=\begin{pmatrix} 1 & 0 & 0 & 0\\ t_1 & \lambda_1 & 0 & 0\\ t_2 & 0 & \lambda_2 & 0\\ t_3 & 0 & 0 & \lambda_3\end{pmatrix}.\tag{7}$$
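The identification of states with the unit ball can be checked directly: the eigenvalues of $\rho$ are $(1\pm r)/2$ with $r=\sqrt{x^2+y^2+z^2}$, so $\rho\ge0$ exactly when $(x,y,z)$ lies in the unit ball. A small sketch (the sample points are arbitrary):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def rho(x, y, z):
    """Density matrix with Bloch coordinates (x, y, z)."""
    return 0.5 * (I2 + x*sx + y*sy + z*sz)

for (x, y, z) in [(0.3, -0.2, 0.5), (0.6, 0.0, 0.8), (0.0, 0.0, 1.0)]:
    r = np.sqrt(x*x + y*y + z*z)
    ev = np.linalg.eigvalsh(rho(x, y, z))
    # eigenvalues are (1 - r)/2 and (1 + r)/2: positivity <=> r <= 1
    assert np.allclose(sorted(ev), [(1 - r) / 2, (1 + r) / 2])
    assert np.isclose(np.trace(rho(x, y, z)).real, 1.0)
print("Bloch-ball parameterization checked")
```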
Necessary and sufficient conditions under which the map $\Phi$ with reduced matrix $T$ satisfying $|t_3|+|\lambda_3|\le1$ is completely positive are (see [6])
$$(\lambda_1+\lambda_2)^2\le(1+\lambda_3)^2-t_3^2-(t_1^2+t_2^2)\,\frac{1+\lambda_3\pm t_3}{1-\lambda_3\pm t_3}\le(1+\lambda_3)^2-t_3^2,\tag{8}$$
$$(\lambda_1-\lambda_2)^2\le(1-\lambda_3)^2-t_3^2-(t_1^2+t_2^2)\,\frac{1-\lambda_3\pm t_3}{1+\lambda_3\pm t_3}\le(1-\lambda_3)^2-t_3^2,\tag{9}$$
$$\Big(1-(\lambda_1^2+\lambda_2^2+\lambda_3^2)-(t_1^2+t_2^2+t_3^2)\Big)^2\ge4\Big(\lambda_1^2(t_1^2+\lambda_2^2)+\lambda_2^2(t_2^2+\lambda_3^2)+\lambda_3^2(t_3^2+\lambda_1^2)-2\lambda_1\lambda_2\lambda_3\Big).\tag{10}$$

We now apply Theorem 1 in this setting. Let $\Phi$ be a completely positive and trace-preserving map with matrix $T$ given in (7), whose coefficients $t_i,\lambda_i$, $i=1,2,3$, satisfy conditions (8), (9) and (10). Moreover, we assume that $|\lambda_i|<1$, $i=1,2,3$. For every $n\ge0$,
$$\Phi^n(\rho)=\frac12\begin{pmatrix} 1+\phi_n(3) & \phi_n(1)-i\phi_n(2)\\ \phi_n(1)+i\phi_n(2) & 1-\phi_n(3)\end{pmatrix},$$
where the sequences $(\phi_n(i))_{n\ge0}$, $i=1,2,3$, satisfy the induction relations
$$\phi_n(i)=\lambda_i\,\phi_{n-1}(i)+t_i,$$
with initial conditions $\phi_0(1)=x$, $\phi_0(2)=y$ and $\phi_0(3)=z$. Explicit formulae can easily be obtained. We get, for every $n\ge0$,
$$\phi_n(1)=\Big(x-\frac{t_1}{1-\lambda_1}\Big)\lambda_1^n+\frac{t_1}{1-\lambda_1},\qquad \phi_n(2)=\Big(y-\frac{t_2}{1-\lambda_2}\Big)\lambda_2^n+\frac{t_2}{1-\lambda_2},\qquad \phi_n(3)=\Big(z-\frac{t_3}{1-\lambda_3}\Big)\lambda_3^n+\frac{t_3}{1-\lambda_3}.$$
Hence, for any state $\rho$ and any $n\ge1$, $\Phi^n(\rho)=\rho_\infty+O(|\lambda|_{\max}^n)$, where $|\lambda|_{\max}=\max_{i=1,2,3}|\lambda_i|$ and
$$\rho_\infty=\begin{pmatrix}\alpha_\infty & \beta_\infty\\ \overline{\beta_\infty} & 1-\alpha_\infty\end{pmatrix},\qquad\text{with}\quad \alpha_\infty=\frac12\Big(1+\frac{t_3}{1-\lambda_3}\Big)\quad\text{and}\quad \beta_\infty=\frac12\Big(\frac{t_1}{1-\lambda_1}-i\,\frac{t_2}{1-\lambda_2}\Big).$$
Theorem 1 applies with $v_i=\dfrac{t_i}{1-\lambda_i}$, $i=1,2,3$.

We now give some examples of well-known quantum channels. For each of them we give their Kraus operators, their corresponding matrix $T$ in the King-Ruskai-Szarek-Werner representation, as well as the vector $v=(v_1,v_2,v_3)$ and the covariance matrix $C$ obtained in Theorem 1. It is worth noticing that if $\Phi$ is a unital map, i.e. such that $\Phi(I)=I$, then the covariance matrix $C$ is equal to the identity matrix $I_3$.

1. The depolarizing channel. Kraus operators: for some $0\le p\le1$,
$$L_1=\sqrt{1-p}\,I,\qquad L_2=\sqrt{\tfrac p3}\,\sigma_x,\qquad L_3=\sqrt{\tfrac p3}\,\sigma_y,\qquad L_4=\sqrt{\tfrac p3}\,\sigma_z.$$
King-Ruskai-Szarek-Werner representation:
$$T=\begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1-\frac{4p}{3} & 0 & 0\\ 0 & 0 & 1-\frac{4p}{3} & 0\\ 0 & 0 & 0 & 1-\frac{4p}{3}\end{pmatrix}.$$
The vector $v$ is the null vector and the covariance matrix $C$ in this case is given by the identity matrix $I_3$.
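As an illustration (a sketch, not from the paper: the helper `krsw_matrix` and the value of `p` are ours), the matrix $T$ of the depolarizing channel can be recovered numerically by expanding $\Phi$ in the basis $B=\{I,\sigma_x,\sigma_y,\sigma_z\}$; since these Kraus operators are Hermitian, the two Kraus conventions agree:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
basis = [I2, sx, sy, sz]

def krsw_matrix(kraus):
    """4x4 matrix of Phi(rho) = sum_i L_i^* rho L_i in the basis {I, sx, sy, sz}.

    Coefficients are extracted via Tr(b_j b_k) = 2 delta_jk; they are real
    because each Phi(b_k) is Hermitian.
    """
    phi = lambda r: sum(L.conj().T @ r @ L for L in kraus)
    return np.array([[0.5 * np.trace(bj @ phi(bk)) for bk in basis]
                     for bj in basis]).real

p = 0.3
kraus = [np.sqrt(1 - p) * I2] + [np.sqrt(p / 3) * s for s in (sx, sy, sz)]
lam = 1 - 4 * p / 3
T = krsw_matrix(kraus)
assert np.allclose(T, np.diag([1.0, lam, lam, lam]))
print("depolarizing channel: T = diag(1, 1-4p/3, 1-4p/3, 1-4p/3) verified")
```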
2. Phase-damping channel. Kraus operators: for some $0\le p\le1$,
$$L_1=\sqrt{1-p}\,I,\qquad L_2=\sqrt p\begin{pmatrix}1 & 0\\ 0 & 0\end{pmatrix},\qquad L_3=\sqrt p\begin{pmatrix}0 & 0\\ 0 & 1\end{pmatrix}.$$
King-Ruskai-Szarek-Werner representation:
$$T=\begin{pmatrix}1 & 0 & 0 & 0\\ 0 & 1-p & 0 & 0\\ 0 & 0 & 1-p & 0\\ 0 & 0 & 0 & 1\end{pmatrix}.$$
The vector $v$ is the null vector and the covariance matrix $C$ in this case is given by $I_3$.

3. Amplitude-damping channel. Kraus operators: for some $0\le p\le1$,
$$L_1=\begin{pmatrix}1 & 0\\ 0 & \sqrt{1-p}\end{pmatrix},\qquad L_2=\begin{pmatrix}0 & \sqrt p\\ 0 & 0\end{pmatrix}.$$
King-Ruskai-Szarek-Werner representation:
$$T=\begin{pmatrix}1 & 0 & 0 & 0\\ 0 & \sqrt{1-p} & 0 & 0\\ 0 & 0 & \sqrt{1-p} & 0\\ p & 0 & 0 & 1-p\end{pmatrix}.$$
The vector $v$ is equal to $(0,0,1)$. The covariance matrix in this case is given by
$$C=\begin{pmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 0\end{pmatrix}.$$

4. Trigonometric parameterization. Consider the particular Kraus operators
$$L_1=\cos\tfrac u2\cos\tfrac v2\,I+\sin\tfrac u2\sin\tfrac v2\,\sigma_z \qquad\text{and}\qquad L_2=\sin\tfrac u2\cos\tfrac v2\,\sigma_x-i\cos\tfrac u2\sin\tfrac v2\,\sigma_y.$$
King-Ruskai-Szarek-Werner representation:
$$T=\begin{pmatrix}1 & 0 & 0 & 0\\ 0 & \cos u & 0 & 0\\ 0 & 0 & \cos v & 0\\ \sin u\sin v & 0 & 0 & \cos u\cos v\end{pmatrix}.$$
The vector $v$ is equal to $\Big(0,0,\dfrac{\sin u\sin v}{1-\cos u\cos v}\Big)$. The covariance matrix in this case is given by
$$C=\begin{pmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1-v_3^2\end{pmatrix},\qquad\text{with}\quad v_3=\frac{\sin u\sin v}{1-\cos u\cos v}.$$
4.2 CP Map Associated to a Markov Chain

With every Markov chain with two states and transition matrix
$$P=\begin{pmatrix} p & 1-p\\ q & 1-q\end{pmatrix},\qquad p,q\in(0,1),$$
is associated a completely positive and trace-preserving map, denoted by $\Phi$, with the Kraus operators
$$L_1=\frac{\sqrt p}{2}(I+\sigma_z)+\frac{\sqrt{1-p}}{2}(\sigma_x+i\sigma_y)=\begin{pmatrix}\sqrt p & \sqrt{1-p}\\ 0 & 0\end{pmatrix}$$
and
$$L_2=\frac{\sqrt q}{2}(I-\sigma_z)+\frac{\sqrt{1-q}}{2}(\sigma_x-i\sigma_y)=\begin{pmatrix}0 & 0\\ \sqrt q & \sqrt{1-q}\end{pmatrix}.$$
Let $\rho$ be the density matrix
$$\rho=\frac12\begin{pmatrix}1+z & x-iy\\ x+iy & 1-z\end{pmatrix},$$
where $x,y,z$ are real numbers such that $x^2+y^2+z^2\le1$. The map $\Phi$ transforms the density matrix $\rho$ into a new one given by $\Phi(\rho)=L_1^*\rho L_1+L_2^*\rho L_2$. By induction, for every $n\ge0$,
$$\Phi^n(\rho)=\begin{pmatrix} p_n & r_n\\ r_n & 1-p_n\end{pmatrix},$$
where the sequences $(p_n)_{n\ge0}$ and $(r_n)_{n\ge0}$ satisfy the recurrence relations: for every $n\ge1$,
$$p_n=p_{n-1}(p-q)+q \qquad\text{and}\qquad r_n=\sqrt{q(1-q)}+p_{n-1}\big(\sqrt{p(1-p)}-\sqrt{q(1-q)}\big),$$
with initial condition $p_0=(1+z)/2$. Assumption (A) is then clearly satisfied with
$$\rho_\infty=\frac{1}{1+q-p}\begin{pmatrix} q & \beta\\ \beta & 1-p\end{pmatrix},\qquad\text{where}\quad \beta=q\sqrt{p(1-p)}+(1-p)\sqrt{q(1-q)}.$$
Then, applying Theorem 1, if $P$ is a polynomial of $3m$ non-commuting variables, for every $0<t_1<t_2<\dots<t_m$, the following convergence holds:
$$\lim_{n\to+\infty} w\big[P\big(X^{(n)}_{t_1},Y^{(n)}_{t_1},Z^{(n)}_{t_1},\dots,X^{(n)}_{t_m},Y^{(n)}_{t_m},Z^{(n)}_{t_m}\big)\big]=\mathbb E\big[P\big(B^{(1)}_{t_1},B^{(2)}_{t_1},B^{(3)}_{t_1},\dots,B^{(1)}_{t_m},B^{(2)}_{t_m},B^{(3)}_{t_m}\big)\big],$$
where $(B^{(1)}_t,B^{(2)}_t,B^{(3)}_t)_{t\ge0}$ is a three-dimensional centered Brownian motion with covariance matrix $Ct$, where
$$C=\begin{pmatrix} 1-v_1^2 & 0 & -v_1v_2\\ 0 & 1 & 0\\ -v_1v_2 & 0 & 1-v_2^2\end{pmatrix},$$
with
$$v_1=\frac{2\big(q\sqrt{p(1-p)}+(1-p)\sqrt{q(1-q)}\big)}{1+q-p} \qquad\text{and}\qquad v_2=\frac{p+q-1}{1+q-p}.$$
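The convergence to $\rho_\infty$ can be checked by simply iterating the channel (the values of $p$, $q$ and the initial state below are arbitrary illustrative choices):

```python
import numpy as np

p, q = 0.7, 0.4
L1 = np.array([[np.sqrt(p), np.sqrt(1 - p)], [0, 0]])
L2 = np.array([[0, 0], [np.sqrt(q), np.sqrt(1 - q)]])

def phi(rho):
    # the paper's convention: Phi(rho) = L1* rho L1 + L2* rho L2
    return L1.conj().T @ rho @ L1 + L2.conj().T @ rho @ L2

# start from an arbitrary state and iterate; convergence is geometric
# with ratio |p - q|
x, y, z = 0.2, 0.0, 0.5
rho = 0.5 * np.array([[1 + z, x - 1j*y], [x + 1j*y, 1 - z]])
for _ in range(200):
    rho = phi(rho)

beta = q * np.sqrt(p * (1 - p)) + (1 - p) * np.sqrt(q * (1 - q))
rho_inf = np.array([[q, beta], [beta, 1 - p]]) / (1 + q - p)
assert np.allclose(rho, rho_inf)
print("rho_infty verified")
```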
5 Large Deviation Principle

Let $\Gamma$ be a Polish space endowed with the Borel $\sigma$-field $\mathcal B(\Gamma)$. A good rate function is a lower semi-continuous function $\Lambda^*:\Gamma\to[0,\infty]$ with compact level sets $\{x;\ \Lambda^*(x)\le\alpha\}$, $\alpha\in[0,\infty[$. Let $v=(v_n)_n\uparrow\infty$ be an increasing sequence of positive reals. A sequence of random variables $(Y_n)_n$ with values in $\Gamma$, defined on a probability space $(\Omega,\mathcal F,\mathbb P)$, is said to satisfy a Large Deviation Principle (LDP) with speed $v=(v_n)_n$ and good rate function $\Lambda^*$ if, for every Borel set $B\in\mathcal B(\Gamma)$,
$$-\inf_{x\in B^o}\Lambda^*(x)\le\liminf_n\frac1{v_n}\log\mathbb P(Y_n\in B)\le\limsup_n\frac1{v_n}\log\mathbb P(Y_n\in B)\le-\inf_{x\in\bar B}\Lambda^*(x).$$
For every $k\ge1$, we define
$$\bar x_k=I\otimes\dots\otimes I\otimes\sigma_x\otimes I\otimes\dots,\qquad \bar y_k=I\otimes\dots\otimes I\otimes\sigma_y\otimes I\otimes\dots,\qquad \bar z_k=I\otimes\dots\otimes I\otimes\sigma_z\otimes I\otimes\dots,$$
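On a chain of finitely many sites these operators are plain Kronecker products; a small sketch (3 sites, with our own helper `site_op`) checking that operators at different sites commute while a single site carries the Pauli algebra:

```python
import numpy as np
from functools import reduce

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def site_op(sigma, k, n):
    """sigma acting on the k-th site (1-based) of an n-site chain."""
    return reduce(np.kron, [sigma if j == k else I2 for j in range(1, n + 1)])

n = 3
xb = [site_op(sx, k, n) for k in range(1, n + 1)]
yb = [site_op(sy, k, n) for k in range(1, n + 1)]
zb = [site_op(sz, k, n) for k in range(1, n + 1)]

comm = lambda a, b: a @ b - b @ a
assert np.allclose(comm(xb[0], yb[0]), 2j * zb[0])   # same site: Pauli algebra
assert np.allclose(comm(xb[0], yb[1]), 0)            # different sites commute
print("site operators verified")
```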
where each $\sigma_\cdot$ appears in the $k$-th place. For every $n\ge1$, we consider the processes
$$\bar X_n=\sum_{k=1}^n\bar x_k,\qquad \bar Y_n=\sum_{k=1}^n\bar y_k,\qquad \bar Z_n=\sum_{k=1}^n\bar z_k,$$
with initial conditions $\bar X_0=\bar Y_0=\bar Z_0=0$. To each vector $\nu=(\nu_1,\nu_2,\nu_3)\in\mathbb R^3$ we associate the Euclidean norm $\|\nu\|=\sqrt{\nu_1^2+\nu_2^2+\nu_3^2}$, and $\langle\cdot,\cdot\rangle$ denotes the corresponding inner product.

Theorem 2. Let $\Phi$ be a completely positive and trace-preserving map for which there exists a state
$$\rho_\infty=\begin{pmatrix}\alpha_\infty & \beta_\infty\\ \overline{\beta_\infty} & 1-\alpha_\infty\end{pmatrix}$$
such that for any given state $\rho$, $\Phi^n(\rho)=\rho_\infty+o(1)$. For every $\nu=(\nu_1,\nu_2,\nu_3)\in\mathbb R^3\setminus\{0\}$, the sequence
$$\Big(\frac{\nu_1\bar X_n+\nu_2\bar Y_n+\nu_3\bar Z_n}{n}\Big)_{n\ge1}$$
satisfies an LDP with speed $n$ and good rate function
$$I(x)=\begin{cases}\dfrac12\Big[\Big(1+\dfrac{x}{\|\nu\|}\Big)\log\dfrac{\|\nu\|+x}{\|\nu\|+\langle\nu,v\rangle}+\Big(1-\dfrac{x}{\|\nu\|}\Big)\log\dfrac{\|\nu\|-x}{\|\nu\|-\langle\nu,v\rangle}\Big] & \text{if } |x|<\|\nu\|,\\[2mm] +\infty & \text{otherwise,}\end{cases}$$
where $v_1=2\operatorname{Re}(\beta_\infty)$, $v_2=-2\operatorname{Im}(\beta_\infty)$, $v_3=2\alpha_\infty-1$.

Proof. The matrix
$$B:=\nu_1\sigma_x+\nu_2\sigma_y+\nu_3\sigma_z=\begin{pmatrix}\nu_3 & \nu_1-i\nu_2\\ \nu_1+i\nu_2 & -\nu_3\end{pmatrix}$$
has two distinct eigenvalues $\pm\|\nu\|$. For every $n\ge0$, we can write
$$\Phi^n(\rho)=\begin{pmatrix}\alpha_\infty+\phi_n(1) & \beta_\infty+\phi_n(2)\\ \overline{\beta_\infty}+\phi_n(3) & 1-\alpha_\infty+\phi_n(4)\end{pmatrix},$$
where the four sequences $(\phi_n(i))_{n\ge0}$ satisfy $\phi_n(i)=o(1)$.
For any $k\ge1$, the expectation of $B$ in the state $\Phi^k(\rho)$ is equal to $\operatorname{Trace}\big(B\,\Phi^k(\rho)\big)=\langle\nu,v\rangle+\varepsilon_k$, with $\varepsilon_k=o(1)$. As a consequence, the distribution of $B$ in that state is given by
$$p_k(\|\nu\|)=\frac12\Big(1+\frac{\langle\nu,v\rangle+\varepsilon_k}{\|\nu\|}\Big)=1-p_k(-\|\nu\|).$$
Using the fact that $\nu_1\bar X_n+\nu_2\bar Y_n+\nu_3\bar Z_n$ is the sum of $n$ commuting matrices, we get that
$$\frac1n\log w\big(\exp t(\nu_1\bar X_n+\nu_2\bar Y_n+\nu_3\bar Z_n)\big)=\frac1n\sum_{k=1}^n\log\Big(e^{\|\nu\|t}\,p_k(\|\nu\|)+e^{-\|\nu\|t}\big(1-p_k(\|\nu\|)\big)\Big).$$
Since $\varepsilon_n=o(1)$, we obtain
$$\lim_{n\to+\infty}\frac1n\log w\big(\exp t(\nu_1\bar X_n+\nu_2\bar Y_n+\nu_3\bar Z_n)\big)=\log\Big(\cosh(\|\nu\|t)+\frac{\langle\nu,v\rangle}{\|\nu\|}\sinh(\|\nu\|t)\Big)=\log\cosh(\|\nu\|t)+\log\Big(1+\frac{\langle\nu,v\rangle}{\|\nu\|}\tanh(\|\nu\|t)\Big).$$
We denote this function of $t$ by $\Lambda(t)$. The function $\Lambda$ is finite and differentiable on $\mathbb R$; therefore, by the Gärtner-Ellis theorem (see [4]), the LDP holds with the good rate function
$$I(x)=\sup_{t\in\mathbb R}\{tx-\Lambda(t)\}.$$
A simple computation leads to the rate function given in the theorem. $\square$
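This last step can be checked numerically: the Legendre transform of $\Lambda$, computed on a grid, matches the closed form of $I$ given in the theorem (the values of $\nu$ and $v$ below are arbitrary illustrative choices with $|\langle\nu,v\rangle|<\|\nu\|$):

```python
import numpy as np

# illustrative values (assumptions): a direction nu and a mean vector v
nu = np.array([1.0, -2.0, 0.5])
v = np.array([0.2, 0.1, -0.3])
nrm = np.linalg.norm(nu)
a = np.dot(nu, v)  # <nu, v>, with |a| < ||nu||

def Lam(t):
    # limiting scaled cumulant generating function
    return np.log(np.cosh(nrm * t) + (a / nrm) * np.sinh(nrm * t))

def I_closed(x):
    # closed form of the rate function from the theorem, |x| < ||nu||
    return 0.5 * ((1 + x / nrm) * np.log((nrm + x) / (nrm + a))
                  + (1 - x / nrm) * np.log((nrm - x) / (nrm - a)))

def I_legendre(x, ts=np.linspace(-20, 20, 400001)):
    # sup_t { t x - Lam(t) } approximated on a grid
    return np.max(ts * x - Lam(ts))

for x in [-1.5, 0.0, 0.7, 2.0]:
    assert abs(I_closed(x) - I_legendre(x)) < 1e-4
print("rate function verified")
```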
References

1. Biane, P. Some properties of quantum Bernoulli random walks. Quantum probability & related topics, 193–203, QP-PQ, VI, World Sci. Publ., River Edge, NJ, 1991.
2. Choi, M. D. Completely positive linear maps on complex matrices. Linear Algebra and Appl. 10, 285–290 (1975).
3. Dacunha-Castelle, D. and Duflo, M. Probabilités et statistiques 2. Problèmes à temps mobile. Masson, Paris (1983).
4. Dembo, A. and Zeitouni, O. Large Deviations Techniques and Applications. Springer, (1998).
5. Giri, N. and von Waldenfels, W. An algebraic version of the central limit theorem. Z. Wahrscheinlichkeitstheorie Verw. Gebiete 42, 129–134 (1978). 6. King, C. and Ruskai, M.B. Minimal entropy of states emerging from noisy quantum channels. IEEE Trans. Inform. Theory 47, No 1, 192–209 (2001). 7. Kraus, K. General state changes in quantum theory. Ann. Physics, 64, 311–335 (1971). 8. Kraus, K. States, effects and operations. Fundamental notions of quantum theory. Lecture Notes in Physics, 190. Springer-Verlag, Berlin (1983). 9. Petz, D. An invitation to the algebra of canonical commutation relations. Leuven Notes in Mathematical and Theoretical Physics, Vol. 2 (1990). 10. Ruskai, M.B., Szarek, S. and Werner, E. An analysis of completely positive trace-preserving maps on M2 . Linear Algebra Appl. 347, 159–187 (2002).
Erratum to: "New Methods in the Arbitrage Theory of Financial Markets with Transaction Costs", in Séminaire XLI

Miklós Rásonyi*

Computer and Automation Institute of the Hungarian Academy of Sciences
email: [email protected]
Unfortunately, the proof of Lemma 4.6 in [1] needs an additional assumption. For a closed cone $C\subset\mathbb R^d$, let $C^*$ denote its positive dual cone (see [1]). It is erroneously claimed in the last line of page 460 that $(G_{T-l}^*\cap X)^*=G_{T-l}+X^*$, where $G_{T-l}=G_{T-l}(\omega)$ is a random closed cone in $\mathbb R^d$ and $X^*(\omega)=\{\alpha\xi(\omega):\alpha\le0\}$ with some $\mathbb R^d$-valued random variable $\xi$ (i.e. $X^*$ is a random ray in $\mathbb R^d$). The claimed identity holds if and only if $G_{T-l}+X^*$ is a closed cone in $\mathbb R^d$ a.s.; see Corollary 16.4.2 of [2]. Hence the following hypothesis must be added to the statements of Lemma 4.6 and the main Theorem 3.1 in [1]:

Assumption. For all $0\le t\le T$ and for almost all $\omega$, the cone $G_t(\omega)$ is such that $G_t(\omega)+\{\alpha x:\alpha\ge0\}$ is closed in $\mathbb R^d$ for each $x\in\mathbb R^d$.

The above Assumption is trivially satisfied when $G_t$ is a (random) polyhedral cone: a ray is, in particular, a polyhedral cone, and the sum of two polyhedral cones is polyhedral and hence closed. Although restricted in generality by the Assumption given above, Theorem 3.1 of [1] still covers the cases which are relevant to financial markets with proportional transaction costs. In those models the $G_t$ are assumed to be polyhedral; see the references of [1].
References

1. Rásonyi, M. (2008) New methods in the arbitrage theory of financial markets with transaction costs. Séminaire de Probabilités XLI, Lecture Notes in Mathematics 1934, 455–462, Springer, Berlin.
2. Rockafellar, R. T. (1970) Convex analysis. Princeton University Press, Princeton, N.J.

* I would like to thank Yuri M. Kabanov and Christophe Stricker for discussions.
C. Donati-Martin et al. (eds.), Séminaire de Probabilités XLII, Lecture Notes in Mathematics 1979, DOI 10.1007/978-3-642-01763-6_17, © Springer-Verlag Berlin Heidelberg 2009
Lecture Notes in Mathematics For information about earlier volumes please contact your bookseller or Springer LNM Online archive: springerlink.com
Vol. 1795: H. Li, Filtered-Graded Transfer in Using Noncommutative Gröbner Bases (2002) Vol. 1796: J.M. Melenk, hp-Finite Element Methods for Singular Perturbations (2002) Vol. 1797: B. Schmidt, Characters and Cyclotomic Fields in Finite Geometry (2002) Vol. 1798: W.M. Oliva, Geometric Mechanics (2002) Vol. 1799: H. Pajot, Analytic Capacity, Rectifiability, Menger Curvature and the Cauchy Integral (2002) Vol. 1800: O. Gabber, L. Ramero, Almost Ring Theory (2003) Vol. 1801: J. Azéma, M. Émery, M. Ledoux, M. Yor (Eds.), Séminaire de Probabilités XXXVI (2003) Vol. 1802: V. Capasso, E. Merzbach, B. G. Ivanoff, M. Dozzi, R. Dalang, T. Mountford, Topics in Spatial Stochastic Processes. Martina Franca, Italy 2001. Editor: E. Merzbach (2003) Vol. 1803: G. Dolzmann, Variational Methods for Crystalline Microstructure – Analysis and Computation (2003) Vol. 1804: I. Cherednik, Ya. Markov, R. Howe, G. Lusztig, Iwahori-Hecke Algebras and their Representation Theory. Martina Franca, Italy 1999. Editors: V. Baldoni, D. Barbasch (2003) Vol. 1805: F. Cao, Geometric Curve Evolution and Image Processing (2003) Vol. 1806: H. Broer, I. Hoveijn. G. Lunther, G. Vegter, Bifurcations in Hamiltonian Systems. Computing Singularities by Gröbner Bases (2003) Vol. 1807: V. D. Milman, G. Schechtman (Eds.), Geometric Aspects of Functional Analysis. Israel Seminar 20002002 (2003) Vol. 1808: W. Schindler, Measures with Symmetry Properties (2003) Vol. 1809: O. Steinbach, Stability Estimates for Hybrid Coupled Domain Decomposition Methods (2003) Vol. 1810: J. Wengenroth, Derived Functors in Functional Analysis (2003) Vol. 1811: J. Stevens, Deformations of Singularities (2003) Vol. 1812: L. Ambrosio, K. Deckelnick, G. Dziuk, M. Mimura, V. A. Solonnikov, H. M. Soner, Mathematical Aspects of Evolving Interfaces. Madeira, Funchal, Portugal 2000. Editors: P. Colli, J. F. Rodrigues (2003) Vol. 1813: L. Ambrosio, L. A. Caffarelli, Y. Brenier, G. Buttazzo, C. 
Villani, Optimal Transportation and its Applications. Martina Franca, Italy 2001. Editors: L. A. Caffarelli, S. Salsa (2003) Vol. 1814: P. Bank, F. Baudoin, H. Föllmer, L.C.G. Rogers, M. Soner, N. Touzi, Paris-Princeton Lectures on Mathematical Finance 2002 (2003) Vol. 1815: A. M. Vershik (Ed.), Asymptotic Combinatorics with Applications to Mathematical Physics. St. Petersburg, Russia 2001 (2003)
Vol. 1816: S. Albeverio, W. Schachermayer, M. Talagrand, Lectures on Probability Theory and Statistics. Ecole d’Eté de Probabilités de Saint-Flour XXX-2000. Editor: P. Bernard (2003) Vol. 1817: E. Koelink, W. Van Assche (Eds.), Orthogonal Polynomials and Special Functions. Leuven 2002 (2003) Vol. 1818: M. Bildhauer, Convex Variational Problems with Linear, nearly Linear and/or Anisotropic Growth Conditions (2003) Vol. 1819: D. Masser, Yu. V. Nesterenko, H. P. Schlickewei, W. M. Schmidt, M. Waldschmidt, Diophantine Approximation. Cetraro, Italy 2000. Editors: F. Amoroso, U. Zannier (2003) Vol. 1820: F. Hiai, H. Kosaki, Means of Hilbert Space Operators (2003) Vol. 1821: S. Teufel, Adiabatic Perturbation Theory in Quantum Dynamics (2003) Vol. 1822: S.-N. Chow, R. Conti, R. Johnson, J. MalletParet, R. Nussbaum, Dynamical Systems. Cetraro, Italy 2000. Editors: J. W. Macki, P. Zecca (2003) Vol. 1823: A. M. Anile, W. Allegretto, C. Ringhofer, Mathematical Problems in Semiconductor Physics. Cetraro, Italy 1998. Editor: A. M. Anile (2003) Vol. 1824: J. A. Navarro González, J. B. Sancho de Salas, C ∞ – Differentiable Spaces (2003) Vol. 1825: J. H. Bramble, A. Cohen, W. Dahmen, Multiscale Problems and Methods in Numerical Simulations, Martina Franca, Italy 2001. Editor: C. Canuto (2003) Vol. 1826: K. Dohmen, Improved Bonferroni Inequalities via Abstract Tubes. Inequalities and Identities of Inclusion-Exclusion Type. VIII, 113 p, 2003. Vol. 1827: K. M. Pilgrim, Combinations of Complex Dynamical Systems. IX, 118 p, 2003. Vol. 1828: D. J. Green, Gröbner Bases and the Computation of Group Cohomology. XII, 138 p, 2003. Vol. 1829: E. Altman, B. Gaujal, A. Hordijk, DiscreteEvent Control of Stochastic Networks: Multimodularity and Regularity. XIV, 313 p, 2003. Vol. 1830: M. I. Gil’, Operator Functions and Localization of Spectra. XIV, 256 p, 2003. Vol. 1831: A. Connes, J. Cuntz, E. Guentner, N. Higson, J. E. Kaminker, Noncommutative Geometry, Martina Franca, Italy 2002. Editors: S. 
Doplicher, L. Longo (2004) Vol. 1832: J. Azéma, M. Émery, M. Ledoux, M. Yor (Eds.), Séminaire de Probabilités XXXVII (2003) Vol. 1833: D.-Q. Jiang, M. Qian, M.-P. Qian, Mathematical Theory of Nonequilibrium Steady States. On the Frontier of Probability and Dynamical Systems. IX, 280 p, 2004. Vol. 1834: Yo. Yomdin, G. Comte, Tame Geometry with Application in Smooth Analysis. VIII, 186 p, 2004. Vol. 1835: O.T. Izhboldin, B. Kahn, N.A. Karpenko, A. Vishik, Geometric Methods in the Algebraic Theory
of Quadratic Forms. Summer School, Lens, 2000. Editor: J.-P. Tignol (2004) Vol. 1836: C. Nˇastˇasescu, F. Van Oystaeyen, Methods of Graded Rings. XIII, 304 p, 2004. Vol. 1837: S. Tavaré, O. Zeitouni, Lectures on Probability Theory and Statistics. Ecole d’Eté de Probabilités de Saint-Flour XXXI-2001. Editor: J. Picard (2004) Vol. 1838: A.J. Ganesh, N.W. O’Connell, D.J. Wischik, Big Queues. XII, 254 p, 2004. Vol. 1839: R. Gohm, Noncommutative Stationary Processes. VIII, 170 p, 2004. Vol. 1840: B. Tsirelson, W. Werner, Lectures on Probability Theory and Statistics. Ecole d’Eté de Probabilités de Saint-Flour XXXII-2002. Editor: J. Picard (2004) Vol. 1841: W. Reichel, Uniqueness Theorems for Variational Problems by the Method of Transformation Groups (2004) Vol. 1842: T. Johnsen, A. L. Knutsen, K3 Projective Models in Scrolls (2004) Vol. 1843: B. Jefferies, Spectral Properties of Noncommuting Operators (2004) Vol. 1844: K.F. Siburg, The Principle of Least Action in Geometry and Dynamics (2004) Vol. 1845: Min Ho Lee, Mixed Automorphic Forms, Torus Bundles, and Jacobi Forms (2004) Vol. 1846: H. Ammari, H. Kang, Reconstruction of Small Inhomogeneities from Boundary Measurements (2004) Vol. 1847: T.R. Bielecki, T. Björk, M. Jeanblanc, M. Rutkowski, J.A. Scheinkman, W. Xiong, Paris-Princeton Lectures on Mathematical Finance 2003 (2004) Vol. 1848: M. Abate, J. E. Fornaess, X. Huang, J. P. Rosay, A. Tumanov, Real Methods in Complex and CR Geometry, Martina Franca, Italy 2002. Editors: D. Zaitsev, G. Zampieri (2004) Vol. 1849: Martin L. Brown, Heegner Modules and Elliptic Curves (2004) Vol. 1850: V. D. Milman, G. Schechtman (Eds.), Geometric Aspects of Functional Analysis. Israel Seminar 20022003 (2004) Vol. 1851: O. Catoni, Statistical Learning Theory and Stochastic Optimization (2004) Vol. 1852: A.S. Kechris, B.D. Miller, Topics in Orbit Equivalence (2004) Vol. 1853: Ch. Favre, M. Jonsson, The Valuative Tree (2004) Vol. 1854: O. 
Saeki, Topology of Singular Fibers of Differential Maps (2004) Vol. 1855: G. Da Prato, P.C. Kunstmann, I. Lasiecka, A. Lunardi, R. Schnaubelt, L. Weis, Functional Analytic Methods for Evolution Equations. Editors: M. Iannelli, R. Nagel, S. Piazzera (2004) Vol. 1856: K. Back, T.R. Bielecki, C. Hipp, S. Peng, W. Schachermayer, Stochastic Methods in Finance, Bressanone/Brixen, Italy, 2003. Editors: M. Fritelli, W. Runggaldier (2004) Vol. 1857: M. Émery, M. Ledoux, M. Yor (Eds.), Séminaire de Probabilités XXXVIII (2005) Vol. 1858: A.S. Cherny, H.-J. Engelbert, Singular Stochastic Differential Equations (2005) Vol. 1859: E. Letellier, Fourier Transforms of Invariant Functions on Finite Reductive Lie Algebras (2005) Vol. 1860: A. Borisyuk, G.B. Ermentrout, A. Friedman, D. Terman, Tutorials in Mathematical Biosciences I. Mathematical Neurosciences (2005) Vol. 1861: G. Benettin, J. Henrard, S. Kuksin, Hamiltonian Dynamics – Theory and Applications, Cetraro, Italy, 1999. Editor: A. Giorgilli (2005)
Vol. 1862: B. Helffer, F. Nier, Hypoelliptic Estimates and Spectral Theory for Fokker-Planck Operators and Witten Laplacians (2005) Vol. 1863: H. Führ, Abstract Harmonic Analysis of Continuous Wavelet Transforms (2005) Vol. 1864: K. Efstathiou, Metamorphoses of Hamiltonian Systems with Symmetries (2005) Vol. 1865: D. Applebaum, B.V. R. Bhat, J. Kustermans, J. M. Lindsay, Quantum Independent Increment Processes I. From Classical Probability to Quantum Stochastic Calculus. Editors: M. Schürmann, U. Franz (2005) Vol. 1866: O.E. Barndorff-Nielsen, U. Franz, R. Gohm, B. Kümmerer, S. Thorbjønsen, Quantum Independent Increment Processes II. Structure of Quantum Lévy Processes, Classical Probability, and Physics. Editors: M. Schürmann, U. Franz, (2005) Vol. 1867: J. Sneyd (Ed.), Tutorials in Mathematical Biosciences II. Mathematical Modeling of Calcium Dynamics and Signal Transduction. (2005) Vol. 1868: J. Jorgenson, S. Lang, Posn (R) and Eisenstein Series. (2005) Vol. 1869: A. Dembo, T. Funaki, Lectures on Probability Theory and Statistics. Ecole d’Eté de Probabilités de Saint-Flour XXXIII-2003. Editor: J. Picard (2005) Vol. 1870: V.I. Gurariy, W. Lusky, Geometry of Müntz Spaces and Related Questions. (2005) Vol. 1871: P. Constantin, G. Gallavotti, A.V. Kazhikhov, Y. Meyer, S. Ukai, Mathematical Foundation of Turbulent Viscous Flows, Martina Franca, Italy, 2003. Editors: M. Cannone, T. Miyakawa (2006) Vol. 1872: A. Friedman (Ed.), Tutorials in Mathematical Biosciences III. Cell Cycle, Proliferation, and Cancer (2006) Vol. 1873: R. Mansuy, M. Yor, Random Times and Enlargements of Filtrations in a Brownian Setting (2006) Vol. 1874: M. Yor, M. Émery (Eds.), In Memoriam PaulAndré Meyer - Séminaire de Probabilités XXXIX (2006) Vol. 1875: J. Pitman, Combinatorial Stochastic Processes. Ecole d’Eté de Probabilités de Saint-Flour XXXII-2002. Editor: J. Picard (2006) Vol. 1876: H. Herrlich, Axiom of Choice (2006) Vol. 1877: J. 
Steuding, Value Distributions of L-Functions (2007) Vol. 1878: R. Cerf, The Wulff Crystal in Ising and Percolation Models, Ecole d’Eté de Probabilités de Saint-Flour XXXIV-2004. Editor: Jean Picard (2006) Vol. 1879: G. Slade, The Lace Expansion and its Applications, Ecole d’Eté de Probabilités de Saint-Flour XXXIV2004. Editor: Jean Picard (2006) Vol. 1880: S. Attal, A. Joye, C.-A. Pillet, Open Quantum Systems I, The Hamiltonian Approach (2006) Vol. 1881: S. Attal, A. Joye, C.-A. Pillet, Open Quantum Systems II, The Markovian Approach (2006) Vol. 1882: S. Attal, A. Joye, C.-A. Pillet, Open Quantum Systems III, Recent Developments (2006) Vol. 1883: W. Van Assche, F. Marcellàn (Eds.), Orthogonal Polynomials and Special Functions, Computation and Application (2006) Vol. 1884: N. Hayashi, E.I. Kaikina, P.I. Naumkin, I.A. Shishmarev, Asymptotics for Dissipative Nonlinear Equations (2006) Vol. 1885: A. Telcs, The Art of Random Walks (2006) Vol. 1886: S. Takamura, Splitting Deformations of Degenerations of Complex Curves (2006) Vol. 1887: K. Habermann, L. Habermann, Introduction to Symplectic Dirac Operators (2006)
Vol. 1888: J. van der Hoeven, Transseries and Real Differential Algebra (2006) Vol. 1889: G. Osipenko, Dynamical Systems, Graphs, and Algorithms (2006) Vol. 1890: M. Bunge, J. Funk, Singular Coverings of Toposes (2006) Vol. 1891: J.B. Friedlander, D.R. Heath-Brown, H. Iwaniec, J. Kaczorowski, Analytic Number Theory, Cetraro, Italy, 2002. Editors: A. Perelli, C. Viola (2006) Vol. 1892: A. Baddeley, I. Bárány, R. Schneider, W. Weil, Stochastic Geometry, Martina Franca, Italy, 2004. Editor: W. Weil (2007) Vol. 1893: H. Hanßmann, Local and Semi-Local Bifurcations in Hamiltonian Dynamical Systems, Results and Examples (2007) Vol. 1894: C.W. Groetsch, Stable Approximate Evaluation of Unbounded Operators (2007) Vol. 1895: L. Molnár, Selected Preserver Problems on Algebraic Structures of Linear Operators and on Function Spaces (2007) Vol. 1896: P. Massart, Concentration Inequalities and Model Selection, Ecole d’Été de Probabilités de SaintFlour XXXIII-2003. Editor: J. Picard (2007) Vol. 1897: R. Doney, Fluctuation Theory for Lévy Processes, Ecole d’Été de Probabilités de Saint-Flour XXXV-2005. Editor: J. Picard (2007) Vol. 1898: H.R. Beyer, Beyond Partial Differential Equations, On linear and Quasi-Linear Abstract Hyperbolic Evolution Equations (2007) Vol. 1899: Séminaire de Probabilités XL. Editors: C. Donati-Martin, M. Émery, A. Rouault, C. Stricker (2007) Vol. 1900: E. Bolthausen, A. Bovier (Eds.), Spin Glasses (2007) Vol. 1901: O. Wittenberg, Intersections de deux quadriques et pinceaux de courbes de genre 1, Intersections of Two Quadrics and Pencils of Curves of Genus 1 (2007) Vol. 1902: A. Isaev, Lectures on the Automorphism Groups of Kobayashi-Hyperbolic Manifolds (2007) Vol. 1903: G. Kresin, V. Maz’ya, Sharp Real-Part Theorems (2007) Vol. 1904: P. Giesl, Construction of Global Lyapunov Functions Using Radial Basis Functions (2007) Vol. 1905: C. Prévˆot, M. Röckner, A Concise Course on Stochastic Partial Differential Equations (2007) Vol. 1906: T. 
Schuster, The Method of Approximate Inverse: Theory and Applications (2007) Vol. 1907: M. Rasmussen, Attractivity and Bifurcation for Nonautonomous Dynamical Systems (2007) Vol. 1908: T.J. Lyons, M. Caruana, T. Lévy, Differential Equations Driven by Rough Paths, Ecole d’Été de Probabilités de Saint-Flour XXXIV-2004 (2007) Vol. 1909: H. Akiyoshi, M. Sakuma, M. Wada, Y. Yamashita, Punctured Torus Groups and 2-Bridge Knot Groups (I) (2007) Vol. 1910: V.D. Milman, G. Schechtman (Eds.), Geometric Aspects of Functional Analysis. Israel Seminar 2004-2005 (2007) Vol. 1911: A. Bressan, D. Serre, M. Williams, K. Zumbrun, Hyperbolic Systems of Balance Laws. Cetraro, Italy 2003. Editor: P. Marcati (2007) Vol. 1912: V. Berinde, Iterative Approximation of Fixed Points (2007) Vol. 1913: J.E. Marsden, G. Misiołek, J.-P. Ortega, M. Perlmutter, T.S. Ratiu, Hamiltonian Reduction by Stages (2007)
Vol. 1914: G. Kutyniok, Affine Density in Wavelet Analysis (2007) Vol. 1915: T. Bıyıkoˇglu, J. Leydold, P.F. Stadler, Laplacian Eigenvectors of Graphs. Perron-Frobenius and Faber-Krahn Type Theorems (2007) Vol. 1916: C. Villani, F. Rezakhanlou, Entropy Methods for the Boltzmann Equation. Editors: F. Golse, S. Olla (2008) Vol. 1917: I. Veseli´c, Existence and Regularity Properties of the Integrated Density of States of Random Schrödinger (2008) Vol. 1918: B. Roberts, R. Schmidt, Local Newforms for GSp(4) (2007) Vol. 1919: R.A. Carmona, I. Ekeland, A. KohatsuHiga, J.-M. Lasry, P.-L. Lions, H. Pham, E. Taflin, Paris-Princeton Lectures on Mathematical Finance 2004. Editors: R.A. Carmona, E. Çinlar, I. Ekeland, E. Jouini, J.A. Scheinkman, N. Touzi (2007) Vol. 1920: S.N. Evans, Probability and Real Trees. Ecole d’Été de Probabilités de Saint-Flour XXXV-2005 (2008) Vol. 1921: J.P. Tian, Evolution Algebras and their Applications (2008) Vol. 1922: A. Friedman (Ed.), Tutorials in Mathematical BioSciences IV. Evolution and Ecology (2008) Vol. 1923: J.P.N. Bishwal, Parameter Estimation in Stochastic Differential Equations (2008) Vol. 1924: M. Wilson, Littlewood-Paley Theory and Exponential-Square Integrability (2008) Vol. 1925: M. du Sautoy, L. Woodward, Zeta Functions of Groups and Rings (2008) Vol. 1926: L. Barreira, V. Claudia, Stability of Nonautonomous Differential Equations (2008) Vol. 1927: L. Ambrosio, L. Caffarelli, M.G. Crandall, L.C. Evans, N. Fusco, Calculus of Variations and NonLinear Partial Differential Equations. Cetraro, Italy 2005. Editors: B. Dacorogna, P. Marcellini (2008) Vol. 1928: J. Jonsson, Simplicial Complexes of Graphs (2008) Vol. 1929: Y. Mishura, Stochastic Calculus for Fractional Brownian Motion and Related Processes (2008) Vol. 1930: J.M. Urbano, The Method of Intrinsic Scaling. A Systematic Approach to Regularity for Degenerate and Singular PDEs (2008) Vol. 1931: M. Cowling, E. Frenkel, M. Kashiwara, A. Valette, D.A. Vogan, Jr., N.R. 
Wallach, Representation Theory and Complex Analysis. Venice, Italy 2004. Editors: E.C. Tarabusi, A. D’Agnolo, M. Picardello (2008) Vol. 1932: A.A. Agrachev, A.S. Morse, E.D. Sontag, H.J. Sussmann, V.I. Utkin, Nonlinear and Optimal Control Theory. Cetraro, Italy 2004. Editors: P. Nistri, G. Stefani (2008) Vol. 1933: M. Petkovic, Point Estimation of Root Finding Methods (2008) Vol. 1934: C. Donati-Martin, M. Émery, A. Rouault, C. Stricker (Eds.), Séminaire de Probabilités XLI (2008) Vol. 1935: A. Unterberger, Alternative Pseudodifferential Analysis (2008) Vol. 1936: P. Magal, S. Ruan (Eds.), Structured Population Models in Biology and Epidemiology (2008) Vol. 1937: G. Capriz, P. Giovine, P.M. Mariano (Eds.), Mathematical Models of Granular Matter (2008) Vol. 1938: D. Auroux, F. Catanese, M. Manetti, P. Seidel, B. Siebert, I. Smith, G. Tian, Symplectic 4-Manifolds and Algebraic Surfaces. Cetraro, Italy 2003. Editors: F. Catanese, G. Tian (2008)
Vol. 1939: D. Boffi, F. Brezzi, L. Demkowicz, R.G. Durán, R.S. Falk, M. Fortin, Mixed Finite Elements, Compatibility Conditions, and Applications. Cetraro, Italy 2006. Editors: D. Boffi, L. Gastaldi (2008)
Vol. 1940: J. Banasiak, V. Capasso, M.A.J. Chaplain, M. Lachowicz, J. Miękisz, Multiscale Problems in the Life Sciences. From Microscopic to Macroscopic. Będlewo, Poland 2006. Editors: V. Capasso, M. Lachowicz (2008)
Vol. 1941: S.M.J. Haran, Arithmetical Investigations. Representation Theory, Orthogonal Polynomials, and Quantum Interpolations (2008)
Vol. 1942: S. Albeverio, F. Flandoli, Y.G. Sinai, SPDE in Hydrodynamic. Recent Progress and Prospects. Cetraro, Italy 2005. Editors: G. Da Prato, M. Röckner (2008)
Vol. 1943: L.L. Bonilla (Ed.), Inverse Problems and Imaging. Martina Franca, Italy 2002 (2008)
Vol. 1944: A. Di Bartolo, G. Falcone, P. Plaumann, K. Strambach, Algebraic Groups and Lie Groups with Few Factors (2008)
Vol. 1945: F. Brauer, P. van den Driessche, J. Wu (Eds.), Mathematical Epidemiology (2008)
Vol. 1946: G. Allaire, A. Arnold, P. Degond, T.Y. Hou, Quantum Transport. Modelling, Analysis and Asymptotics. Cetraro, Italy 2006. Editors: N.B. Abdallah, G. Frosali (2008)
Vol. 1947: D. Abramovich, M. Mariño, M. Thaddeus, R. Vakil, Enumerative Invariants in Algebraic Geometry and String Theory. Cetraro, Italy 2005. Editors: K. Behrend, M. Manetti (2008)
Vol. 1948: F. Cao, J.-L. Lisani, J.-M. Morel, P. Musé, F. Sur, A Theory of Shape Identification (2008)
Vol. 1949: H.G. Feichtinger, B. Helffer, M.P. Lamoureux, N. Lerner, J. Toft, Pseudo-Differential Operators. Quantization and Signals. Cetraro, Italy 2006. Editors: L. Rodino, M.W. Wong (2008)
Vol. 1950: M. Bramson, Stability of Queueing Networks. École d'Été de Probabilités de Saint-Flour XXXVI-2006 (2008)
Vol. 1951: A. Moltó, J. Orihuela, S. Troyanski, M. Valdivia, A Non Linear Transfer Technique for Renorming (2009)
Vol. 1952: R. Mikhailov, I.B.S. Passi, Lower Central and Dimension Series of Groups (2009)
Vol. 1953: K. Arwini, C.T.J. Dodson, Information Geometry (2008)
Vol. 1954: P. Biane, L. Bouten, F. Cipriani, N. Konno, N. Privault, Q. Xu, Quantum Potential Theory. Editors: U. Franz, M. Schuermann (2008)
Vol. 1955: M. Bernot, V. Caselles, J.-M. Morel, Optimal Transportation Networks (2008)
Vol. 1956: C.H. Chu, Matrix Convolution Operators on Groups (2008)
Vol. 1957: A. Guionnet, On Random Matrices: Macroscopic Asymptotics. École d'Été de Probabilités de Saint-Flour XXXVI-2006 (2009)
Vol. 1958: M.C. Olsson, Compactifying Moduli Spaces for Abelian Varieties (2008)
Vol. 1959: Y. Nakkajima, A. Shiho, Weight Filtrations on Log Crystalline Cohomologies of Families of Open Smooth Varieties (2008)
Vol. 1960: J. Lipman, M. Hashimoto, Foundations of Grothendieck Duality for Diagrams of Schemes (2009)
Vol. 1961: G. Buttazzo, A. Pratelli, S. Solimini, E. Stepanov, Optimal Urban Networks via Mass Transportation (2009)
Vol. 1962: R. Dalang, D. Khoshnevisan, C. Mueller, D. Nualart, Y. Xiao, A Minicourse on Stochastic Partial Differential Equations (2009)
Vol. 1963: W. Siegert, Local Lyapunov Exponents (2009)
Vol. 1964: W. Roth, Operator-valued Measures and Integrals for Cone-valued Functions (2009)
Vol. 1965: C. Chidume, Geometric Properties of Banach Spaces and Nonlinear Iterations (2009)
Vol. 1966: D. Deng, Y. Han, Harmonic Analysis on Spaces of Homogeneous Type (2009)
Vol. 1967: B. Fresse, Modules over Operads and Functors (2009)
Vol. 1968: R. Weissauer, Endoscopy for GSp(4) and the Cohomology of Siegel Modular Threefolds (2009)
Vol. 1969: B. Roynette, M. Yor, Penalising Brownian Paths (2009)
Vol. 1970: M. Biskup, A. Bovier, F. den Hollander, D. Ioffe, F. Martinelli, K. Netočný, F. Toninelli, Methods of Contemporary Mathematical Statistical Physics. Editor: R. Kotecký (2009)
Vol. 1971: L. Saint-Raymond, Hydrodynamic Limits of the Boltzmann Equation (2009)
Vol. 1972: T. Mochizuki, Donaldson Type Invariants for Algebraic Surfaces (2009)
Vol. 1973: M.A. Berger, L.H. Kauffman, B. Khesin, H.K. Moffatt, R.L. Ricca, De W. Sumners, Lectures on Topological Fluid Mechanics. Cetraro, Italy 2001. Editor: R.L. Ricca (2009)
Vol. 1974: F. den Hollander, Random Polymers. École d'Été de Probabilités de Saint-Flour XXXVII-2007 (2009)
Vol. 1975: J.C. Rohde, Cyclic Coverings, Calabi-Yau Manifolds and Complex Multiplication (2009)
Vol. 1976: N. Ginoux, The Dirac Spectrum (2009)
Vol. 1977: M.J. Gursky, E. Lanconelli, A. Malchiodi, G. Tarantello, X.-J. Wang, P.C. Yang, Geometric Analysis and PDEs. Cetraro, Italy 2001. Editors: A. Ambrosetti, S.-Y.A. Chang, A. Malchiodi (2009)
Vol. 1978: M. Qian, J.-S. Xie, S. Zhu, Smooth Ergodic Theory for Endomorphisms (2009)
Vol. 1979: C. Donati-Martin, M. Émery, A. Rouault, C. Stricker (Eds.), Séminaire de Probabilités XLII (2009)
Recent Reprints and New Editions

Vol. 1702: J. Ma, J. Yong, Forward-Backward Stochastic Differential Equations and their Applications. 1999 – Corr. 3rd printing (2007)
Vol. 830: J.A. Green, Polynomial Representations of GLn, with an Appendix on Schensted Correspondence and Littelmann Paths by K. Erdmann, J.A. Green and M. Schocker. 1980 – 2nd corr. and augmented edition (2007)
Vol. 1693: S. Simons, From Hahn-Banach to Monotonicity (Minimax and Monotonicity 1998) – 2nd exp. edition (2008)
Vol. 470: R.E. Bowen, Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms. With a preface by D. Ruelle. Edited by J.-R. Chazottes. 1975 – 2nd rev. edition (2008)
Vol. 523: S.A. Albeverio, R.J. Høegh-Krohn, S. Mazzucchi, Mathematical Theory of Feynman Path Integrals. 1976 – 2nd corr. and enlarged edition (2008)
Vol. 1764: A. Cannas da Silva, Lectures on Symplectic Geometry. 2001 – Corr. 2nd printing (2008)
LECTURE NOTES IN MATHEMATICS
Edited by J.-M. Morel, F. Takens, B. Teissier, P.K. Maini

Editorial Policy (for Multi-Author Publications: Summer Schools/Intensive Courses)

1. Lecture Notes aim to report new developments in all areas of mathematics and their applications – quickly, informally and at a high level. Mathematical texts analysing new developments in modelling and numerical simulation are welcome. Manuscripts should be reasonably self-contained and rounded off. Thus they may, and often will, present not only results of the author but also related work by other people. They should provide sufficient motivation, examples and applications. There should also be an introduction making the text comprehensible to a wider audience. This clearly distinguishes Lecture Notes from journal articles or technical reports, which normally are very concise. Articles intended for a journal but too long to be accepted by most journals usually do not have this “lecture notes” character.

2. In general, SUMMER SCHOOLS and other similar INTENSIVE COURSES are held to present mathematical topics that are close to the frontiers of recent research to an audience at the beginning or intermediate graduate level, who may want to continue with this area of work, for a thesis or later. This makes demands on the didactic aspects of the presentation. Because the subjects of such schools are advanced, there often exists no textbook, and so, ideally, the publication resulting from such a school could be a first approximation to such a textbook. Usually several authors are involved in the writing, so it is not always simple to obtain a unified approach to the presentation. For prospective publication in LNM, the resulting manuscript should not be just a collection of course notes, each of which has been developed by an individual author with little or no coordination with the others, and with little or no common concept.
The subject matter should dictate the structure of the book, and the authorship of each part or chapter should take secondary importance. Of course the choice of authors is crucial to the quality of the material at the school and in the book, and the intention here is not to belittle their impact, but simply to say that the book should be planned to be written by these authors jointly, and not just assembled as a result of what these authors happen to submit. This represents considerable preparatory work (as it is imperative to ensure that the authors know these criteria before they invest work on a manuscript), and also considerable editing work afterwards, to get the book into final shape. Still, it is the form that holds the most promise of a successful book that will be used by its intended audience, rather than yet another volume of proceedings for the library shelf.

3. Manuscripts should be submitted either online at www.editorialmanager.com/lnm/ to Springer’s mathematics editorial, or to one of the series editors. Volume editors are expected to arrange for the refereeing, to the usual scientific standards, of the individual contributions. If the resulting reports can be forwarded to us (series editors or Springer), this is very helpful. If no reports are forwarded, or if other questions remain unclear in respect of homogeneity etc., the series editors may wish to consult external referees for an overall evaluation of the volume. A final decision to publish can be made only on the basis of the complete manuscript; however, a preliminary decision can be based on a pre-final or incomplete manuscript. The strict minimum amount of material that will be considered should include a detailed outline describing the planned contents of each chapter. Volume editors and authors should be aware that manuscripts which are incomplete or insufficiently close to their final form almost always result in longer evaluation times.
They should also be aware that parallel submission of their manuscript to another publisher while under consideration for LNM will in general lead to immediate rejection.
4. Manuscripts should in general be submitted in English. Final manuscripts should contain at least 100 pages of mathematical text and should always include
– a general table of contents;
– an informative introduction, with adequate motivation and perhaps some historical remarks: it should be accessible to a reader not intimately familiar with the topic treated;
– a global subject index: as a rule this is genuinely helpful for the reader.
Lecture Notes volumes are, as a rule, printed digitally from the authors’ files. We strongly recommend that all contributions in a volume be written in the same LaTeX version, preferably LaTeX2e. To ensure best results, authors are asked to use the LaTeX2e style files available from Springer’s web-server at ftp://ftp.springer.de/pub/tex/latex/svmonot1/ (for monographs) and ftp://ftp.springer.de/pub/tex/latex/svmultt1/ (for summer schools/tutorials). Additional technical instructions are available on request from: [email protected].

5. Careful preparation of the manuscripts will help keep production time short, besides ensuring a satisfactory appearance of the finished book in print and online. After acceptance of the manuscript, authors will be asked to prepare the final LaTeX source files and also the corresponding dvi-, pdf- or zipped ps-file. The LaTeX source files are essential for producing the full-text online version of the book. For the existing online volumes of LNM see: http://www.springerlink.com/openurl.asp?genre=journal&issn=0075-8434. The actual production of a Lecture Notes volume takes approximately 12 weeks.

6. Volume editors receive a total of 50 free copies of their volume to be shared with the authors, but no royalties. They and the authors are entitled to a discount of 33.3% on the price of Springer books purchased for their personal use, if ordering directly from Springer.

7. Commitment to publish is made by letter of intent rather than by signing a formal contract.
Springer-Verlag secures the copyright for each volume. Authors are free to reuse material contained in their LNM volumes in later publications: a brief written (or e-mail) request for formal permission is sufficient.

Addresses:

Professor J.-M. Morel, CMLA, École Normale Supérieure de Cachan, 61 Avenue du Président Wilson, 94235 Cachan Cedex, France
E-mail: [email protected]

Professor F. Takens, Mathematisch Instituut, Rijksuniversiteit Groningen, Postbus 800, 9700 AV Groningen, The Netherlands
E-mail: [email protected]
Professor B. Teissier, Institut Mathématique de Jussieu, UMR 7586 du CNRS, Équipe “Géométrie et Dynamique”, 175 rue du Chevaleret, 75013 Paris, France
E-mail: [email protected]
For the “Mathematical Biosciences Subseries” of LNM:
Professor P.K. Maini, Centre for Mathematical Biology, Mathematical Institute, 24-29 St Giles, Oxford OX1 3LP, UK
E-mail: [email protected]

Springer, Mathematics Editorial I, Tiergartenstr. 17, 69121 Heidelberg, Germany
Tel.: +49 (6221) 487-8259
Fax: +49 (6221) 4876-8259
E-mail: [email protected]