An Initiation to Logarithmic Sobolev Inequalities Gilles Royer
SMF/AMS TEXTS and MONOGRAPHS Volume 14
Cours Specialises Numero 5 1999
Translated by
Donald Babbitt
American Mathematical Society Societe Mathematique de France
Une Initiation aux Inégalités de Sobolev Logarithmiques, by Gilles Royer. Originally published in French by Société Mathématique de France.
Copyright © 1999 Société Mathématique de France. Translated from the French by Donald Babbitt.
2000 Mathematics Subject Classification. Primary 60-02; Secondary 35J85, 47B25, 47D07, 60J60, 82C99.
For additional information and updates on this book, visit
www.ams.org/bookpages/smfams-14
Library of Congress Cataloging-in-Publication Data
Royer, Gilles. [Initiation aux inégalités de Sobolev logarithmiques. English] An initiation to logarithmic Sobolev inequalities / Gilles Royer ; translated by Donald Babbitt. p. cm. (SMF/AMS texts and monographs, ISSN 1525-2302 ; v. 14) (Cours spécialisés, ISSN 1284-6090 ; 5) Includes bibliographical references.
ISBN-13: 978-0-8218-4401-4 (alk. paper) ISBN-10: 0-8218-4401-6 (alk. paper)
1. Ergodic theory. 2. Logarithmic functions. 3. Semigroups of operators. 4. Differential inequalities. I. Title.
QA313.R6913 2007
515'.48--dc22
2007060798
Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy a chapter for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for such permission should be addressed to the Acquisitions Department, American Mathematical Society, 201 Charles Street, Providence, Rhode Island 02904-2294, USA. Requests can also be made by e-mail to reprint-permission@ams.org. © 2007 by the American Mathematical Society. All rights reserved. The American Mathematical Society retains all rights except those granted to the United States Government. Printed in the United States of America.
The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.
Visit the AMS home page at http://www.ams.org/
Contents
Preface
Chapter 1. Self-Adjoint Operators
1.1. Symmetric operators
1.2. Spectral decomposition of self-adjoint operators
Chapter 2. Semi-Groups
2.1. Semi-groups of self-adjoint operators
2.2. Kolmogorov semi-groups
Chapter 3. Logarithmic Sobolev Inequalities
3.1. The Poincaré and Gross inequalities
3.2. An application to ergodicity
Chapter 4. Gibbs Measures
4.1. Generalities
4.2. An Ising model with real spin
Chapter 5. Stabilization of Glauber-Langevin Dynamics
5.1. The Gross inequality and stabilization
5.2. The case of weak interactions
5.3. Perspectives
Appendix A.
A.1. Markovian kernels
A.2. Bounded real measures
A.3. The topology of weak convergence
Bibliography
Preface
This book contains essentially the material covered in a course "de troisième cycle"^1 taught during the second semester of the 1996-1997 academic year at the Université d'Orléans. The goal of this course was to give an exposition of an example of the use of logarithmic Sobolev inequalities coming primarily from two papers by B. Zegarlinski [Zeg90, Zeg96]. The example is concerned with real spin models with weak interactions on a lattice where one can apply a classic method due to Dobrushin; see notably [Dob70]. For these models, we give a proof of the uniqueness of the Gibbs measure by showing the exponential stabilization of the stochastic evolution of an infinite dimensional diffusion process which generalizes the case of the Glauber dynamics for the Ising model. Although these models are technically more complicated than the Ising model, one still uses familiar techniques, e.g., using Itô's stochastic integral calculus to construct and study diffusion processes, as well as utilizing the well-known properties of self-adjoint differential operators on $\mathbb{R}^n$ and Sobolev and Poincaré inequalities in their original setting. These models also utilize in a natural way some elegant results on logarithmic Sobolev inequalities such as the Bakry-Émery and Herbst inequalities. Interestingly, these models are simplifications of the Nelson models of Euclidean fields where Gross first introduced logarithmic Sobolev inequalities.^2

In this book we introduce in a self-contained manner the basic notions of self-adjoint operators, diffusion processes, and Gibbs measures. The chapter on logarithmic Sobolev inequalities is enriched by adding applications to Markov chains so as not to remain in too special a setting. The reader will find indications of some recent applications of logarithmic Sobolev inequalities to statistical mechanics at the end of Chapter 5.

I would like to thank my colleagues S. Roelly and P. Maheux for very useful discussions as well as the students of the DEA d'Orléans, in particular, G. Salin.

Note added to the original Preface. The translation presented here differs from the French original by a small number of corrections. Since the original course was given, logarithmic Sobolev inequalities have been the subject of many articles. We recommend that readers interested in further study of the subjects treated here consult [Cor02, OR07] and their bibliographies.

^1 Translator's note: "Un cours de troisième cycle" is equivalent to an advanced graduate course in an American university.
^2 Translator's note: These are now also called Gross inequalities.
CHAPTER 1
Self-Adjoint Operators
We denote by $H$ a separable complex Hilbert space,^1 by $D$ a dense linear subspace of $H$, and by $A$ an operator from $D$ to $H$. The space $D$ is called the domain of the operator $A$ and is denoted $D(A)$. In contrast with bounded operators,^2 in particular operators on a finite dimensional Hilbert space, the mere symmetry of an operator does not lead to a theorem of spectral decomposition. We will introduce directly the notion of self-adjointness by utilizing spectral conditions, following an exposé of P. Cartier at l'École Polytechnique.
1.1. Symmetric operators

Definition 1.1.1. We say that the complex number $\lambda$ is in the resolvent set $\rho(A)$ of $A$ if $(\lambda\,\mathrm{Id} - A)$ is injective, its image $(\lambda\,\mathrm{Id} - A)D$ is dense in $H$, and the inverse operator $(\lambda\,\mathrm{Id} - A)^{-1}$ is a bounded operator from $(\lambda\,\mathrm{Id} - A)D$ to $H$. This operator is then uniquely extended to a bounded operator $R_\lambda$ on $H$ called the resolvent operator. We often abbreviate $A - \lambda\,\mathrm{Id}$ by $A - \lambda$.

Proposition 1.1.2 (Resolvent Equation). For all $\lambda, \mu \in \rho(A)$ we have:
$$R_\lambda - R_\mu = (\mu - \lambda) R_\mu R_\lambda.$$
Note that the Resolvent Equation implies that $\{R_\lambda\}$ is a commutative family of operators.

Definition 1.1.3. We say that $A$ is closed if $D$ is complete for the norm
$$\|\psi\|_A = \left(\|\psi\|^2 + \|A\psi\|^2\right)^{1/2}.$$
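The Resolvent Equation of Proposition 1.1.2 can be checked numerically in finite dimension. The following is a minimal sketch, assuming the convention $R_\lambda = (\lambda\,\mathrm{Id} - A)^{-1}$ of Definition 1.1.1, for which the identity reads $R_\lambda - R_\mu = (\mu - \lambda) R_\mu R_\lambda$; the matrix `A`, the points `lam` and `mu`, and all helper names are hypothetical choices made for illustration.

```python
# Sketch: check the Resolvent Equation for a 2x2 Hermitian matrix.
# Assumed convention: R(lam) = (lam*Id - A)^(-1), as in Definition 1.1.1.

def mat_mul(p, q):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_sub(p, q):
    """Entrywise difference p - q."""
    return [[p[i][j] - q[i][j] for j in range(2)] for i in range(2)]

def mat_scale(c, p):
    """Scalar multiple c * p."""
    return [[c * p[i][j] for j in range(2)] for i in range(2)]

def resolvent(a, lam):
    """Inverse of lam*Id - a, computed with the explicit 2x2 formula."""
    m = [[lam - a[0][0], -a[0][1]], [-a[1][0], lam - a[1][1]]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def max_abs_diff(p, q):
    """Largest modulus of an entry of p - q."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))

A = [[2, 1j], [-1j, 3]]          # hypothetical Hermitian example
lam, mu = 1 + 2j, -0.5 - 1j      # non-real points, hence in the resolvent set

R_lam, R_mu = resolvent(A, lam), resolvent(A, mu)
lhs = mat_sub(R_lam, R_mu)
rhs = mat_scale(mu - lam, mat_mul(R_mu, R_lam))
resolvent_gap = max_abs_diff(lhs, rhs)
commutator_gap = max_abs_diff(mat_mul(R_lam, R_mu), mat_mul(R_mu, R_lam))
print(resolvent_gap, commutator_gap)
```

Both printed quantities should vanish up to rounding error; the second one illustrates the commutativity of the family $\{R_\lambda\}$.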
Consider the graph of $A$:
$$G_A = \{(\psi, A\psi) \in H \times H : \psi \in D\}.$$
It is obvious that the projection of the graph of $A$, with the usual product norm of $H \times H$, onto $D$, with the norm $\|\cdot\|_A$, is an isometry. Thus it is clear that $A$ is closed if and only if its graph $G_A$ is closed in $H \times H$. For a closed operator $A$ one can express the resolvent set $\rho(A)$ in a simpler way.

^1 The scalar product is left linear and right anti-linear.
^2 Recall that an operator $B$ is bounded if there exists a constant $M$ such that $\|Bx\| \le M\|x\|$ for all $x$ in $D$.
Proposition 1.1.4. Let $A$ be a closed operator. In order for $\lambda$ to be in $\rho(A)$, it is necessary and sufficient that one of the two following conditions hold:
(1) The mapping $(\lambda - A)$ is a bijection of $D$ onto $H$.
(2) There exists a bounded operator $R_\lambda$ of $H$ such that:
$$R_\lambda \circ (\lambda - A) = \mathrm{Id}_D, \qquad (\lambda - A) \circ R_\lambda = \mathrm{Id}_H. \qquad (1.1.1)$$

PROOF. (1) In order to show the necessity of the condition, we need to show that if $\lambda \in \rho(A)$ then $\mathrm{Image}(\lambda - A) = H$. Since this image is dense, there exists for any $x \in H$ a sequence $y_n$ of elements in $D$ such that $x = \lim(\lambda y_n - A y_n)$. By applying the bounded operator $R_\lambda$, we can conclude that $y_n = R_\lambda(\lambda - A)y_n$ converges. Since both $y_n$ and $A y_n$ converge, and $G_A$ is closed, the limit $y$ of $y_n$ is in $D$ and $\lim(A y_n) = A y$, from which we conclude that $\lambda y - A y = x$. Since $x$ is arbitrary, we see that $(\lambda - A)D = H$. Now suppose $\lambda - A$ is a bijection. It is a continuous mapping from the Hilbert space $(D, \|\cdot\|_A)$ to the Hilbert space $H$. By Banach's open mapping theorem, the inverse mapping is continuous and obviously remains continuous if we equip $D$ with the weaker norm $\|\cdot\|_H$.
(2) We see that these conditions are equivalent to the initial definition if we take into account the fact that $\lambda - A$ is surjective if its image is dense. □

Self-adjoint operators are a special class of symmetric operators, where by a symmetric operator $A$ with a dense domain $D$ in $H$ we mean a linear operator $A : D \to H$ that satisfies:
$$\forall \varphi, \psi \in D \qquad (A\varphi, \psi) = (\varphi, A\psi).$$
They are often defined on natural domains that are too small for the operator $A$ to be closed. A basic example is the Laplacian $\Delta$ defined, say, on $D = C_c^\infty(\mathbb{R}^n)$, the space of infinitely differentiable functions on $\mathbb{R}^n$ with compact support. However these operators are easily seen to be closable in the following sense:
Proposition 1.1.5. The closure of the graph of the symmetric operator $A$ with domain $D$ as a subset of $H \times H$ is the graph of an operator $\bar{A}$ defined on a domain $\bar{D} \supset D$. Moreover the resolvent sets and the resolvent operators are the same for both operators. ($\bar{A}$ is called the closure of $A$.)

PROOF. We first show that $\overline{G_A}$ is the graph of a function from $H$ to $H$. We need to show that if $(\varphi, \psi) \in \overline{G_A}$ and $(\varphi, \psi') \in \overline{G_A}$ then $\psi = \psi'$. There exists a sequence $(\varphi_n, A\varphi_n)$ that converges to $(\varphi, \psi)$ and similarly a sequence $(\varphi'_n, A\varphi'_n)$ that converges to $(\varphi, \psi')$. Let $w$ be an arbitrary element of $D$. Then
$$(w, \psi) = \lim_{n\to\infty}(w, A\varphi_n) = \lim_{n\to\infty}(Aw, \varphi_n) = (Aw, \varphi),$$
and similarly $(w, \psi') = (Aw, \varphi)$. Since $(w, \psi) = (w, \psi')$ for all $w$ in the dense space $D$, one necessarily has $\psi = \psi'$.
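It is immediate by passing to the limit that the operator associated to the graph $\overline{G_A}$ remains symmetric and that it is closed. We establish that $\rho(A) \subset \rho(\bar{A})$ as follows: for $\lambda \in \rho(A)$ the only thing that needs to be justified is that $\lambda - \bar{A}$ is injective. In fact, if $\varphi \in \ker(\lambda - \bar{A})$, there exists a sequence $\varphi_n$ of elements of $D$ that converges to $\varphi$ and such that $A\varphi_n$ converges to $\bar{A}\varphi = \lambda\varphi$. Then $(\lambda - A)\varphi_n$ tends to 0, and applying the bounded operator $R_\lambda$ gives $\varphi_n = R_\lambda(\lambda - A)\varphi_n \to 0$, hence $\varphi = 0$.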
We now consider the inclusion in the other direction. By construction, $D$ is dense in $\bar{D}$ for the norm $\|\cdot\|_{\bar{A}}$ and $\bar{A}$ is continuous with respect to this norm. Using this, the property $\overline{(\lambda - A)D} = H$ follows from the fact that $(\lambda - \bar{A})\bar{D} = H$. The other properties characterizing $\rho(A)$ follow immediately from the fact that $\lambda$ belongs to $\rho(\bar{A})$. □
The following lemma allows us to study a priori the resolvent set and the spectrum $\sigma(A) := \mathbb{C} \setminus \rho(A)$ of a symmetric operator $A$.

Lemma 1.1.6. Let $A$ be a closed operator and let $\|\cdot\|$ be the uniform norm on bounded operators on $H$.
(1) If $\lambda \in \rho(A)$, then the open disk $D(\lambda, \|R_\lambda\|^{-1})$ in $\mathbb{C}$ is contained in $\rho(A)$. In particular $\rho(A)$ is open.
(2) If $A$ is symmetric and $\lambda \in \rho(A)$, then the disk $D(\lambda, |\Im(\lambda)|)$ is contained in $\rho(A)$.
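PROOF. (1) If $|\mu - \lambda| < \|R_\lambda\|^{-1}$, the series $S = \sum_{n=0}^{\infty} (\lambda - \mu)^n R_\lambda^n$ converges in the uniform norm on the algebra of bounded operators and $R_\lambda S$ is a bounded operator $R_\mu$ that satisfies the equations (1.1.1) of Proposition 1.1.4.
(2) If $\alpha$ and $\beta$ are two real numbers, we have:
$$\alpha + i\beta \in \rho(A) \text{ and } \beta \neq 0 \implies \|R_{\alpha + i\beta}\| \le |\beta|^{-1}. \qquad (1.1.2)$$
This follows from the coercivity inequality
$$\|(A - \alpha - i\beta)x\|^2 \ge \beta^2 \|x\|^2, \qquad (1.1.3)$$
which in turn follows from the calculation, for $x \in D$,
$$\|(A - \alpha - i\beta)x\|^2 = ((A - \alpha)x, (A - \alpha)x) + (\beta x, \beta x) + i(\beta x, (A - \alpha)x) - i((A - \alpha)x, \beta x) = \|(A - \alpha)x\|^2 + \beta^2\|x\|^2,$$
where the last two terms cancel because $A - \alpha$ is symmetric. □

Let $\mathbb{C}^{\pm} := \{\lambda \in \mathbb{C} : \pm\Im(\lambda) > 0\}$. Then we have the following proposition, the proof of which is left as an exercise for the reader.

Proposition 1.1.7. There are only four mutually exclusive possibilities for the spectrum of a symmetric operator $A$: $\sigma(A)$ is equal to $\overline{\mathbb{C}^+}$, $\overline{\mathbb{C}^-}$, $\mathbb{C}$, or it is included in $\mathbb{R}$.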
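Both parts of Lemma 1.1.6 can be illustrated on a diagonal operator, where every quantity is explicit. The sketch below assumes the hypothetical operator $A = \mathrm{diag}(1, 3)$ on $\mathbb{C}^2$ and the convention $R_\lambda = (\lambda - A)^{-1}$; it checks the bound (1.1.2) and sums the series $R_\lambda \sum_n (\lambda - \mu)^n R_\lambda^n$ from part (1) of the proof.

```python
# Sketch of Lemma 1.1.6 for the diagonal operator A = diag(1, 3) on C^2.
# For a diagonal operator, R(lam) is diagonal with entries 1/(lam - e).

EIGENVALUES = [1.0, 3.0]          # hypothetical spectrum, chosen for illustration

def resolvent_diag(lam):
    """Diagonal entries of R(lam) = (lam*Id - A)^(-1)."""
    return [1.0 / (lam - e) for e in EIGENVALUES]

def op_norm(diag):
    """Operator norm of a diagonal matrix: largest modulus on the diagonal."""
    return max(abs(d) for d in diag)

lam = 2j
mu = 0.5 + 2j
r_lam = resolvent_diag(lam)

# Part (2): the bound ||R(alpha + i*beta)|| <= 1/|beta|.
bound_ok = op_norm(r_lam) <= 1.0 / abs(lam.imag)

# Part (1): mu lies in the disk of radius 1/||R(lam)|| around lam, and
# R(lam) * sum_n (lam - mu)^n R(lam)^n reproduces R(mu).
in_disk = abs(mu - lam) < 1.0 / op_norm(r_lam)
series = [r * sum(((lam - mu) * r) ** n for n in range(60)) for r in r_lam]
series_error = max(abs(s - t) for s, t in zip(series, resolvent_diag(mu)))
print(bound_ok, in_disk, series_error)
```

The series is truncated at 60 terms, which is far more than needed here because the ratio $|\lambda - \mu|\,\|R_\lambda\|$ is well below 1.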
Definition 1.1.8. We say an operator $(A, D)$ is essentially self-adjoint if it is symmetric and if $\sigma(A) \subset \mathbb{R}$; if, in addition, it is closed we say it is self-adjoint. If $A$ is self-adjoint, a core for $A$ is any dense domain contained in the domain of $A$ on which the restriction of $A$ is essentially self-adjoint.

Proposition 1.1.9. Let $A$ be a symmetric (resp. symmetric and closed) operator on $D$. Then (1) below is a sufficient condition that $A$ be essentially self-adjoint (resp. self-adjoint) and (2) and (3) are necessary and sufficient conditions that it be essentially self-adjoint (resp. self-adjoint).
(1) There exists a real number $\lambda$ in $\rho(A)$.
(2) $\pm i$ are in $\rho(A)$.
(3) The images $(i - A)D$ and $(i + A)D$ are dense.

PROOF. The first two conditions allow us to eliminate the first three possibilities in Proposition 1.1.7. Finally, condition (3) is equivalent to (2). Indeed, the conditions of injectivity and of the continuity of the inverse, which are part of the definition of $R_i$ (or $R_{-i}$), are always satisfied in the symmetric case by applying the inequality (1.1.3). □

Exercise 1.1.10. Verify that for $A$ symmetric and $\lambda$ real the necessary condition of injectivity for $\lambda \in \rho(A)$ is a consequence of the density condition.
Exercise 1.1.11. Let $(W, \mathcal{F}, \mu)$ be a measure space and let $X$ be a real-valued measurable function on $W$. Show that the possibly unbounded operator $M_X$ on $H = L^2(\mu)$ defined on $D := \{\varphi \in H : \varphi X \in H\}$ by $\varphi \mapsto X\varphi$ is self-adjoint and that the resolvent operator $R_\lambda$, when it exists, is multiplication by $(\lambda - X)^{-1}$. Deduce from this that the spectrum of $M_X$ is "the essential image" of $X$, i.e., the support of the measure $X(\mu)$. By the support of a measure $\nu$ on $\mathbb{R}$, we mean the closed complement of the union of all open sets of $\nu$-measure zero.

Exercise 1.1.12. Let $A$ and $B$ be two self-adjoint operators such that $A$ extends $B$, that is to say, $D_A \supset D_B$ and $B$ is the restriction of $A$ to $D_B$. Show that $A = B$.

Exercise 1.1.13. Consider $H = L^2([0,1])$ and let
$$D_0 := \{ f \in C^1([0,1]) : f(0) = f(1) = 0 \}.$$
For $f \in D_0$, we set $Af = i f'$. Verify that the operator $A$ is symmetric on $D_0$. Verify that $(i - A)D_0 \perp u$, where $u(x) = e^{-x}$, and deduce from this that $(A, D_0)$ is not essentially self-adjoint. Show that any distribution $u$ that is a solution of the equation $u' = ku$ is equal to a function $c e^{kx}$. Let $a$ be a fixed complex number of modulus 1. Let $B$ be the operator defined by the same formula as $A$ but on the domain $D = \{ f \in C^1([0,1]) : f(1) = a f(0) \}$. Show that $B$ is essentially self-adjoint.
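The orthogonality relation in Exercise 1.1.13 can be observed numerically before it is proved. The sketch below is only an illustration under stated assumptions: it takes the hypothetical test function $f(x) = \sin(\pi x)$ in $D_0$, uses a hand-written Simpson rule (the helper `simpson` is not from any library), and evaluates the scalar product of $(i - A)f = i(f - f')$ with $u(x) = e^{-x}$.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

# Hypothetical test function in D_0: f(x) = sin(pi x), so f(0) = f(1) = 0.
f = lambda x: math.sin(math.pi * x)
df = lambda x: math.pi * math.cos(math.pi * x)
u = lambda x: math.exp(-x)

# (i - A) f = i f - i f' = i (f - f'); the scalar product with the real
# function u is i times the real integral computed here.
real_part = simpson(lambda x: (f(x) - df(x)) * u(x), 0.0, 1.0)
inner_product = 1j * real_part
print(abs(inner_product))
```

The printed modulus is zero up to quadrature error, in agreement with the fact that $(f - f')e^{-x}$ is the derivative of $-e^{-x} f(x)$, which vanishes at both endpoints.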
A particularly useful application of Proposition 1.1.9 is to symmetric operators that are bounded below. We say a symmetric operator $A$ is positive when:
$$\forall x \in D(A) \qquad (Ax, x) \ge 0,$$
and we say that $A$ is bounded below if there exists a constant $m$ such that $A - m\,\mathrm{Id}$ is positive. In this case we also say that $A$ is bounded below by $m$, i.e.,
$$\exists m \ \forall x \qquad (Ax, x) \ge m\|x\|^2.$$

Proposition 1.1.14. Let $A$ be a symmetric operator that is bounded below by $m$ and let $\lambda < m$. Then in order for $\lambda \in \rho(A)$, it is necessary and sufficient that $(\lambda - A)D$ is dense. Thus if $(\lambda - A)D$ is dense, $A$ is essentially self-adjoint on $D$.

PROOF. Since $(x, (A - \lambda)x) \ge (m - \lambda)\|x\|^2$, we see right away that the condition of injectivity and the condition of continuity of $R_\lambda$ are satisfied. Thus by the definition of $\rho(A)$, $\lambda \in \rho(A)$ if and only if $(\lambda - A)D$ is dense. By Proposition 1.1.9(1), $\lambda \in \rho(A)$ implies $A$ is essentially self-adjoint. □
Proposition 1.1.15. As an operator on the domain $D = C_c^\infty(\mathbb{R}^n)$ in the Hilbert space $L^2(\mathbb{R}^n)$, the operator $\Delta$ is essentially self-adjoint.

PROOF. It obviously is sufficient to show that $-\Delta$ is essentially self-adjoint on this domain. We first see that $-\Delta$ is symmetric and positive from Green's formula:
$$(-\Delta\varphi, \psi) = \int \nabla\varphi \cdot \overline{\nabla\psi}\, dx, \qquad \varphi, \psi \in D.$$
To show that it is essentially self-adjoint it is sufficient by Proposition 1.1.14 to show that $(\mathrm{Id} - \Delta)D$ is dense. Indeed if this were not the case, there would exist a function $f \neq 0$ in $L^2$ such that $(f, \varphi - \Delta\varphi) = 0$ for all $\varphi \in D$. Utilizing the Fourier transform on $L^2(\mathbb{R}^n)$, which is an isometry, we would have $(\hat{f}(p), (p^2 + 1)\hat{\varphi}(p)) = 0$. This in turn would imply that $f = 0$ since the subspace of functions
$$T = \{(p^2 + 1)\hat{\varphi}(p) : \varphi \in D\}$$
is dense in $L^2(\mathbb{R}^n)$. This contradicts the assumption that $f \neq 0$.

Here we recall a proof that $T$ is dense. Consider a function $\psi \in \mathcal{S}$ where $\mathcal{S}$ is the Schwartz space of rapidly decreasing functions on $\mathbb{R}^n$. One can find a sequence of functions $\varphi_n$ of $C_c^\infty$ such that for each order $\alpha$ the derivative $\varphi_n^{(\alpha)}$ tends to $\psi^{(\alpha)}$ in $L^2$. For example, $\varphi_n(x) = \varphi(x/n)\psi(x)$ where $\varphi \in C_c^\infty$ is equal to 1 in a neighborhood of 0. Thus $-\Delta\varphi_n + \varphi_n$ converges to $-\Delta\psi + \psi$ in $L^2$. Since the Fourier transform is an isometry on $L^2$, we see that $(p^2 + 1)\hat{\psi}$ is in the closure of $T$. But any function in $\mathcal{S}$ can be written in the preceding form. Hence $\overline{T}$ contains $\mathcal{S}$, and $\overline{\mathcal{S}} = L^2$. □
Corollary 1.1.16. The self-adjoint operator $\bar{\Delta}$ defined by the above proposition coincides with the Laplacian operator $\Delta$ defined in the sense of distributions on the following domain:
$$D := \{u \in L^2(\mathbb{R}^n) : \Delta u \in L^2(\mathbb{R}^n)\}.$$

PROOF. Let $u \in D$. We define the element $v$ of $L^2$ by $v := -\Delta u + u$ in the sense of distributions and consider the resolvent operator $R_1$ at the point 1 of $\bar{\Delta}$. Let $T' := \{\varphi - \Delta\varphi : \varphi \in C_c^\infty\}$. The Fourier transforms of the elements of $T'$ are exactly the elements of the set $T$ considered above, and thus $T'$ is dense in $L^2$. Since $\Delta\varphi$ and $\bar{\Delta}\varphi$ coincide on $C_c^\infty$, and upon letting $\psi := \varphi - \Delta\varphi$, we have:
$$(R_1 v, \psi) = (v, R_1\psi) = (v, \varphi) = (u, \psi).$$
Since $\psi$ is an arbitrary element in the dense subspace $T'$, we have $u = R_1 v$, thus $u$ is in the domain of $\bar{\Delta}$. In addition $u - \bar{\Delta}u = v = u - \Delta u$. The reader can easily check that $(\Delta, D)$ is closed and thus that $D(\bar{\Delta})$ does not properly contain $D$. □

Taking into account the preceding corollary, we continue to denote the self-adjoint operator defined in Proposition 1.1.15 by $\Delta$. The most celebrated self-adjoint operators in $L^2(dx)$ are the Schrödinger operators $-\Delta + V$, where $V$ designates multiplication by the function $V$. Here is a case where the precise definition of the operators is easy:
Theorem 1.1.17. Let $V \in L^2_{\mathrm{loc}}(\mathbb{R}^n)$ satisfy $V \ge 0$ almost everywhere. Then $-\Delta + V$ is essentially self-adjoint on $D = C_c^\infty(\mathbb{R}^n)$.

PROOF. Recall Kato's Lemma (see, for example, Reed & Simon [RS72]), which says that if $u$ is a real function in $L^1_{\mathrm{loc}}(\mathbb{R}^n)$ such that $\Delta u \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$, then one has, in the sense of distributions, that $\Delta|u| \ge \mathrm{sgn}(u)\Delta u$. We argue by contradiction. Suppose that $\mathcal{R} := (-\Delta + V + 1)D$ is not dense in $L^2$; then we can find a non-zero function $u$ in $L^2$ such that $(u, \varphi) = 0$ for all functions $\varphi$ in $\mathcal{R}$. Since $D$ is stable under complex conjugation, it is easy to see that we can assume that $u$ is real. We have that $(-\Delta + V + 1)u = 0$ in the sense of distributions. It follows immediately that $\Delta u \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$, which allows us to apply Kato's Lemma:
$$\Delta|u| \ge \mathrm{sgn}(u)\Delta u = (V + 1)|u| \ge |u|. \qquad (1.1.4)$$
We regularize $|u|$ with the aid of an infinitely differentiable positive function $e$, with compact support and integral equal to 1, as follows. Let $e_\delta(x) := \delta^{-n}e(x/\delta)$ and $w_\delta := |u| * e_\delta$. The regularized function $w_\delta$ is an infinitely differentiable square integrable function and, applying (1.1.4), we have:
$$\Delta w_\delta = \Delta|u| * e_\delta \ge |u| * e_\delta = w_\delta,$$
and hence
$$(\Delta w_\delta, w_\delta) \ge \|w_\delta\|^2. \qquad (1.1.5)$$
On the other hand, $\Delta w_\delta = |u| * \Delta e_\delta \in L^2$, which by utilizing Corollary 1.1.16 implies that the function $w_\delta$ is in the domain of the negative self-adjoint operator $\Delta$, thus $(w_\delta, \Delta w_\delta) \le 0$. Combining this with (1.1.5) we see that $w_\delta = 0$ for all $\delta$. Since $w_\delta \to |u|$ in $L^2$ when $\delta \to 0$ we get $u = 0$, which is a contradiction. □

Up to this point we have not explained why our notion of "self-adjoint" is the same as the more traditional one. This we do now.

Definition 1.1.18. The adjoint operator $A^*$ of $(D, A)$ is the operator defined on the space $D^*$ of vectors $g$ such that the linear form $f \mapsto (g, Af)$ is continuous, and where $A^*(g)$ is defined by
$$\forall f \in D \qquad (g, Af) = (A^*g, f).$$

Remark 1.1.19. The existence of a unique $A^*(g)$ satisfying the above equation follows from the Riesz Representation Theorem.
Proposition 1.1.20. Let $(D, A)$ be a symmetric operator in $H$. The operator $A$ is self-adjoint if and only if $D$ coincides with the set of vectors $g$ such that there exists a constant $c(g)$ satisfying
$$|(g, Af)| \le c(g)\|f\| \qquad (1.1.6)$$
for all $f \in D$.

PROOF. Suppose first that $A$ is self-adjoint. If $g \in D$, we have for all $f \in D$ the relation $(Af, g) = (f, Ag)$, which shows the continuity of $f \mapsto (Af, g)$ on $H$ as a function of $f$. Conversely the condition (1.1.6) implies the continuity of $f \mapsto (g, if - Af)$ and thus the existence of a vector $\psi \in H$ such that $(g, if - Af) = (\psi, f)$. In particular we set $f = R_i\varphi$ in the preceding relation. We then have $(g, \varphi) = (\psi, R_i\varphi) = (R_{-i}\psi, \varphi)$. Since $\varphi$ is arbitrary, we have $g = R_{-i}\psi$, which is an element of $D$.

Conversely, suppose that $D$ is characterized by the property (1.1.6). We will argue by contradiction to show that $A$ is self-adjoint. First of all, $A$ is closed. In fact, if $g_n \to g$ and $Ag_n \to h$ in $H$, then for $f \in D$, we have:
$$|(g, Af)| = \lim |(g_n, Af)| = \lim |(Ag_n, f)| = |(h, f)| \le \|h\|\,\|f\|,$$
which implies $g \in D$. Now suppose that $(i - A)D$ is not dense. Then there exists a non-zero vector $\psi \in H$ such that $(\psi, (i - A)\varphi) = 0$ for all $\varphi \in D$. Since the zero linear form is continuous, the condition (1.1.6) shows that $\psi \in D$, and by symmetry we have $((-i - A)\psi, \varphi) = 0$ for all $\varphi$, which would imply that $i\psi + A\psi = 0$, which by (1.1.3) leads to the contradiction $\psi = 0$. Therefore $(i - A)D$ is dense in $H$. The same argument works for $-i$, thus by Proposition 1.1.9 the operator $A$ is self-adjoint on $D$. □
Corollary 1.1.21. The domain $D$ of the closure of an essentially self-adjoint operator $(D_0, A)$ is the set of vectors $g$ in $H$ such that $(g, Af)$, for $f \in D_0$, is a continuous function of $f$ from $H$ to $\mathbb{C}$.
Exercise 1.1.22 (the Dirichlet Laplacian). Let $G$ be an open subset of $\mathbb{R}^d$ and $D_0$ the space $C_c^\infty(G)$ of $C^\infty$ functions with compact support in $G$. We equip $D_0$ with the norm
$$\|u\|_1 = \left(\int |\nabla u|^2 + |u|^2 \, dx\right)^{1/2}$$
and the corresponding scalar product $(\cdot\,, \cdot)_1$. The completion $D_1$ of $D_0$ with respect to this norm is injected into $H = L^2(G)$. Since any element $w$ of $D_1$ is the limit of a sequence $w_n$ of elements in $D_0$, it is sufficient to prove that if $w_n$ converges to 0 in $L^2$ then $w = 0$. To show this let $\varphi \in D_0$. Since
$$(w, \varphi)_1 = \lim (w_n, \varphi)_1 = \lim (w_n, -\Delta\varphi + \varphi)_{L^2} = 0$$
and $\varphi$ is arbitrary, we have $w = 0$. We denote by $D$ the set of functions $u$ in $D_1$ such that the linear form $\varphi \mapsto (\varphi, u)_1$ on $D_0$ is continuous with respect to the norm of $L^2$. With the aid of the Riesz Representation Theorem, this linear form can be written in a unique manner as $(\varphi, Au)_{L^2}$. Define $\Delta_d := -A + \mathrm{Id}$. The operator $\Delta_d$ defined on $D$ is called the Laplacian on $G$ with Dirichlet boundary conditions. It follows from Proposition 1.1.20 that it is self-adjoint.
Remark 1.1.23 (the Friedrichs extension). Note that in the preceding example we did not use the special properties of differential operators. The method extends immediately to the case of a positive operator $B_0$ defined on a domain $D_0$ in $H$. One just uses the norm $(\|u\|_1)^2 = (u, B_0 u) + \|u\|_H^2$ in place of the Dirichlet norm in the above example and argues just as before.
1.2. Spectral decomposition of self-adjoint operators
We fix a self-adjoint operator $(A, D)$ on $H$. The spectral measure of $A$ associated with a vector $\psi$ of norm 1 is a probability measure characterized mathematically by the formula (1.2.1) below. It is of fundamental importance in physics because it governs the quantum mechanical uncertainty when one measures the observable defined by $A$ in the state $\psi$. See, for example, [LLB83].
1.2.1. Construction of spectral measures. We denote by $\mathcal{C} \subset \mathbb{C}(X)$ the space of rational functions on $\mathbb{C}$ with no real poles and bounded at infinity. The elements of $\mathcal{C}$ are exactly those that can be written:
$$F(z) = a_0 + \sum_{k=1}^{p} \sum_{m=1}^{n_k} \frac{a_{k,m}}{(\lambda_k - z)^m} \qquad \text{with } \lambda_k \notin \mathbb{R},$$
and this representation is unique. We refer to the elements of $\mathcal{C}$ as fractions. For $F \in \mathcal{C}$, we define a bounded operator by:
$$F(A) = a_0\,\mathrm{Id} + \sum_{k=1}^{p} \sum_{m=1}^{n_k} a_{k,m} (R_{\lambda_k})^m.$$
We say that $F \ge 0$ if for all $z \in \mathbb{R}$ we have $F(z) \ge 0$. We define the fraction $F^*$ from the fraction $F$ by changing $a_{k,m}$ into $\bar{a}_{k,m}$ and $\lambda_k$ into $\bar{\lambda}_k$. In other words $F^*(z) = \overline{F(\bar{z})}$.

Lemma 1.2.1. The fraction $F \in \mathcal{C}$ is positive if and only if it can be factored in the form $GG^*$ with $G \in \mathcal{C}$.

PROOF. If we have this factorization, we have for $x \in \mathbb{R}$ that $F(x) = G(x)\overline{G(x)} \ge 0$. Conversely, since $F$ is real on the real axis, we have for all real $x$ that $F(x) = \overline{F(x)} = F^*(x)$, and thus by analyticity $F = F^*$ everywhere and formally in $\mathbb{C}(X)$. We deduce from this that $F$ is the quotient $P(X)/Q(X)$ of two real polynomials. The denominator does not vanish on the real axis and can be supposed positive. Hence $P$ and $Q$ are positive and thus they can be written $P_0P_0^*$ and $Q_0Q_0^*$. □
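A concrete instance of the factorization in Lemma 1.2.1 may help fix the notation $F^*$. The sketch below assumes the hypothetical choice $G(z) = 1/(i - z)$, for which $G^*(z) = 1/(-i - z)$ and $GG^*(z) = 1/(1 + z^2)$, and checks the identity at a few sample points away from the poles $\pm i$.

```python
def G(z):
    """Hypothetical fraction G(z) = 1/(i - z); its only pole is z = i."""
    return 1.0 / (1j - z)

def G_star(z):
    """G*(z) = conjugate(G(conjugate(z))), which equals 1/(-i - z)."""
    return G(z.conjugate()).conjugate()

def F(z):
    """Positive fraction F(z) = 1/(1 + z^2)."""
    return 1.0 / (1.0 + z * z)

# Sample points with imaginary part 0 or 0.3, so the poles +i and -i are avoided.
samples = [complex(x, y) for x in (-2.0, 0.0, 0.7, 3.0) for y in (0.0, 0.3)]
factor_error = max(abs(F(z) - G(z) * G_star(z)) for z in samples)
real_axis_positive = all(F(complex(x, 0.0)).real > 0 for x in (-5.0, 0.0, 5.0))
print(factor_error, real_axis_positive)
```

Here $F$ is real and positive on the real axis, and the product $G G^*$ agrees with $F$ off the real axis as well, as the analyticity argument of the proof predicts.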
We denote by $\mathcal{C}_r$ the set of fractions such that $F = F^*$. We have just seen that this means that $F$ is real valued on the real numbers. We will now make explicit the properties of the "functional calculus" $F \mapsto F(A)$. We define a norm on $\mathcal{C}$ by
$$\|F\|_\infty = \sup_{x \in \mathbb{R}} |F(x)|.$$
(1) For $F \in \mathcal{C}$ the operator $F(A)$ is bounded.
(2) $(FG)(A) = F(A)G(A)$, or more precisely the functional calculus is a homomorphism of the algebra $\mathcal{C}$ into the algebra of bounded operators.
(3) $F(A)^* = F^*(A)$, where "*" in the left member denotes the adjoint of a bounded operator. In particular, $F(A)$ is symmetric if $F \in \mathcal{C}_r$.
(4) $F \ge 0$ implies that $F(A) \ge 0$, in the sense of order on symmetric operators.
(5) The linear mapping $F \mapsto F(A)$ is continuous from $\mathcal{C}$ to the space of bounded operators with respect to the usual operator norm.

PROOF. The first assertion follows from the definition of the resolvent operators. The second is proved with the aid of the resolvent equation. By linearity, we are reduced to considering the simple cases
$$F(z) = \frac{1}{(a - z)^k}, \qquad G(z) = \frac{1}{(b - z)^h}, \qquad h, k \ge 0,$$
where we can suppose that $b \neq a$, the case of equality being trivial. In order to argue by induction on $k + h$ we note that:
$$(b - a)FG(z) = (a - z)^{1-k}(b - z)^{1-h}\left(\frac{1}{a - z} - \frac{1}{b - z}\right).$$
Thus utilizing the induction hypothesis and the resolvent equation we have:
$$(b - a)FG(A) = R_a^k R_b^{h-1} - R_a^{k-1}R_b^h = R_a^{k-1}R_b^{h-1}(R_a - R_b) = (b - a)R_a^k R_b^h.$$
The third assertion is proved by checking that $R_\lambda^* = R_{\bar{\lambda}}$ for all symmetric operators when $\lambda \in \rho(A)$ and $\bar{\lambda} \in \rho(A)$. The fourth follows from Lemma 1.2.1. In fact, if $F \ge 0$ we are able by the lemma to write $F = G^*G$ and thus $(F(A)x, x) = (G(A)x, G(A)x) \ge 0$. It is sufficient to establish the last assertion for all $F \in \mathcal{C}_r$ because we can write any $H \in \mathcal{C}$ as a linear combination of elements in $\mathcal{C}_r$:
$$H = \frac{H + H^*}{2} + i\,\frac{H - H^*}{2i}.$$
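We next note that for all $t \in \mathbb{R}$ we have $-\|F\| \le F(t) \le \|F\|$, which by (4) above implies that for $M := \|F\|$,
$$0 \le F(A) + M\,\mathrm{Id} \le 2M\,\mathrm{Id}.$$
By the Cauchy-Schwarz inequality for the positive symmetric form $(x, y) \mapsto ((F(A) + M\,\mathrm{Id})x, y)$, we get $|((F(A) + M\,\mathrm{Id})x, y)| \le 2M\|x\|\,\|y\|$.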
Finally, since $x$ and $y$ are arbitrary we have that $\|F(A) + M\,\mathrm{Id}\| \le 2M$, or $\|F(A)\| \le 3\|F\|$. In fact, the inequality without the factor 3 is still true and can be proved using the spectral theorem. □

We consider the compact space $\dot{\mathbb{R}} := \mathbb{R} \cup \{\infty\}$, which is obtained by adding a point at infinity to $\mathbb{R}$.^3 Since the functions of $\mathcal{C}_r$ have the same limit for $x \to \pm\infty$ they can be extended to functions on $\dot{\mathbb{R}}$ and by the Stone-Weierstrass Theorem we see that they form a dense subspace of $C(\dot{\mathbb{R}}, \mathbb{R})$. Let $\varphi$ be a vector in $H$. The linear form defined on $\mathcal{C}_r$ by $F \mapsto (F(A)\varphi, \varphi)$ is continuous by the last part of the preceding lemma and thus extends to a linear form on $C(\dot{\mathbb{R}}, \mathbb{R})$. This positive linear form is a bounded positive measure on $\dot{\mathbb{R}}$ and is denoted by $\mu_\varphi$. By linearity we extend the definition to functions in $\mathcal{C}$:
$$\forall F \in \mathcal{C} \qquad (F(A)\varphi, \varphi) = \int F(x)\,\mu_\varphi(dx). \qquad (1.2.1)$$

^3 For example, we can identify a circle minus a point with $\mathbb{R}$ by $\theta \mapsto \tan(\theta/2)$.
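In finite dimension the measure $\mu_\varphi$ of formula (1.2.1) is a sum of point masses at the eigenvalues. The sketch below is an illustration only: it assumes a hypothetical real symmetric $2 \times 2$ matrix with hard-coded eigenpairs, takes $F(z) = 1/(\lambda - z)$ so that $F(A) = R_\lambda$, and compares both sides of (1.2.1). The scalar product is taken linear in its first argument, as in the text.

```python
import math

A = [[2.0, 1.0], [1.0, 2.0]]                 # hypothetical symmetric matrix
s = 1.0 / math.sqrt(2.0)
EIGENPAIRS = [(1.0, (s, -s)), (3.0, (s, s))]  # eigenvalues and unit eigenvectors

def inner(x, y):
    """Scalar product, linear in x and anti-linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def apply_resolvent(lam, x):
    """Solve (lam*Id - A) y = x by Cramer's rule and return y."""
    m = [[lam - A[0][0], -A[0][1]], [-A[1][0], lam - A[1][1]]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return ((m[1][1] * x[0] - m[0][1] * x[1]) / det,
            (m[0][0] * x[1] - m[1][0] * x[0]) / det)

phi = (1.0 + 0.5j, -2.0j)
# Spectral measure of phi: point mass |(phi, e_k)|^2 at each eigenvalue.
weights = [(ev, abs(inner(phi, vec)) ** 2) for ev, vec in EIGENPAIRS]

lam = 0.3 + 1.5j
lhs = inner(apply_resolvent(lam, phi), phi)       # (F(A) phi, phi), F(z) = 1/(lam - z)
rhs = sum(w / (lam - ev) for ev, w in weights)    # integral of F against mu_phi
mass_gap = abs(sum(w for _, w in weights) - inner(phi, phi).real)
print(abs(lhs - rhs), mass_gap)
```

The second printed number checks that the total mass of the measure equals $\|\varphi\|^2$.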
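Lemma 1.2.2. If $\Re(\lambda)$ remains bounded and $\Im(\lambda)$ tends to infinity, $\lambda R_\lambda$ tends strongly to $\mathrm{Id}$.

PROOF. From the formula (1.1.2), we have $\|\lambda R_\lambda\| \le |\lambda|\,|\Im(\lambda)|^{-1}$; thus the operators $\lambda R_\lambda - \mathrm{Id}$ remain uniformly bounded. Because $D$ is dense it is sufficient to show the convergence of $\lambda R_\lambda\psi$ to $\psi$ for all $\psi \in D$. The latter results from:
$$\|(\lambda R_\lambda - \mathrm{Id})\psi\| = \|R_\lambda A\psi\| \le |\Im(\lambda)|^{-1}\|A\psi\|. \qquad \square$$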
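The estimate at the end of the proof of Lemma 1.2.2 can be observed on a diagonal operator. The sketch below assumes the hypothetical operator $A = \mathrm{diag}(1, 3)$ and a fixed vector `psi`; it compares $\|(\lambda R_\lambda - \mathrm{Id})\psi\|$ with the bound $|\Im(\lambda)|^{-1}\|A\psi\|$ as the imaginary part grows while the real part stays fixed.

```python
import math

EIGENVALUES = [1.0, 3.0]            # hypothetical diagonal operator A = diag(1, 3)
psi = [2.0, -1.0]

def norm(v):
    """Euclidean norm of a vector with real or complex entries."""
    return math.sqrt(sum(abs(c) ** 2 for c in v))

a_psi_norm = norm([e * c for e, c in zip(EIGENVALUES, psi)])

errors = []
for t in (1.0, 10.0, 100.0, 1000.0):
    lam = complex(0.5, t)           # real part bounded, imaginary part growing
    # (lam * R(lam) - Id) psi has entries (lam/(lam - e) - 1) * psi_k.
    diff = [(lam / (lam - e) - 1.0) * c for e, c in zip(EIGENVALUES, psi)]
    errors.append((t, norm(diff), a_psi_norm / abs(lam.imag)))

for t, err, bound in errors:
    print(t, err, bound)
```

Each line prints the imaginary part, the observed error, and the bound from the proof; the error stays below the bound and decays like $1/\Im(\lambda)$.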
Proposition 1.2.3. For any $\varphi \in H$ there exists a unique positive measure $\mu_\varphi$ on $\mathbb{R}$ satisfying (1.2.1). This measure has mass $\|\varphi\|^2$.

PROOF. The total mass formula follows by setting $F$ equal to 1 in (1.2.1). It remains to show that the measure $\mu_\varphi$ is supported by $\mathbb{R} = \dot{\mathbb{R}} \setminus \{\infty\}$. To show this we note that the function
$$F_a(t) = \frac{a^2}{t^2 + a^2}$$
converges to 1 when $a \to +\infty$, except for $t = \infty$. Consequently the dominated convergence theorem implies
$$\mu_\varphi(\mathbb{R}) = \lim_{a \to \infty} \int F_a\, d\mu_\varphi = \lim_{a \to \infty} (F_a(A)\varphi, \varphi).$$
But since $F_a(A) = \tfrac{1}{2}(ai\,R_{ai} - ai\,R_{-ai})$, the preceding lemma implies:
$$\lim_{a \to \infty} F_a(A) = \mathrm{Id}.$$
Thus we have $\mu_\varphi(\mathbb{R}) = \|\varphi\|^2$, which in turn implies that $\mu_\varphi$ is supported by $\mathbb{R}$. □
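The partial fraction decomposition behind $F_a(A) = \tfrac{1}{2}(ai\,R_{ai} - ai\,R_{-ai})$ can be checked on scalars, replacing $R_\lambda$ by the number $1/(\lambda - t)$. The following sketch does this at a few hypothetical sample points and also illustrates the pointwise convergence $F_a(t) \to 1$ for a fixed finite $t$.

```python
def F_a(a, t):
    """The fraction F_a(t) = a^2 / (t^2 + a^2)."""
    return a * a / (t * t + a * a)

def F_a_from_resolvents(a, t):
    """Scalar analogue of (1/2)(ai R_{ai} - ai R_{-ai}) with R_lam = 1/(lam - t)."""
    lam = 1j * a
    return 0.5 * (lam / (lam - t) - lam / (-lam - t))

points = [(a, t) for a in (0.5, 1.0, 10.0) for t in (-3.0, 0.0, 2.0)]
identity_error = max(abs(F_a(a, t) - F_a_from_resolvents(a, t)) for a, t in points)

# Pointwise convergence F_a(t) -> 1 as a -> +infinity, for a fixed finite t.
values_at_t2 = [F_a(a, 2.0) for a in (1.0, 10.0, 100.0, 1000.0)]
print(identity_error, values_at_t2)
```

Written this way, $F_a(A)$ is the average of $\lambda R_\lambda$ at $\lambda = ai$ and $\lambda = -ai$, which is why Lemma 1.2.2 applies to each term.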
1.2.2. Functional calculus. The simplest example of a self-adjoint operator is multiplication by a real-valued function on the space $L^2$. In fact, any self-adjoint operator $A$ on a Hilbert space $H$ is unitarily equivalent to this model:

Theorem 1.2.4 (spectral decomposition). There exists a $\sigma$-finite measure space $(W, \mathcal{F}, \mu)$, a measurable real-valued function $X$ on $W$, and a unitary operator $U$ from $H$ to $L^2(\mu)$ that transforms $A$ into the multiplication operator by $X$:
$$\varphi \in D(A) \iff X \cdot U\varphi \in L^2(\mu) \qquad \text{and} \qquad U(A\varphi) = X \cdot U\varphi. \qquad (1.2.2)$$
More precisely, we can realize $W$ as a countable union of spaces $\mathbb{R}_k$ that are disjoint images of $\mathbb{R}$ by bijections $b_k$, the measure $\mu$ as a sum of copies of spectral measures, i.e., $\mu = \sum_k b_k(\mu_{\psi_k})$, and $X$ as the identity function $X(x) = x$ on each $\mathbb{R}_k$.

PROOF. We begin by considering the case where there exists a cyclic vector $\psi_1$, that is to say, the "orbit" of $\psi_1$ defined by
$$\Omega_1 = \{F(A)\psi_1 : F \in \mathcal{C}\}$$
is dense in $H$. Note that any element of the orbit can be written in an essentially unique way in the form $F(A)\psi_1$.
In fact, if $F(A)\psi_1 = 0$ we also have $0 = G(A)F(A)\psi_1 = F(A)G(A)\psi_1$ for all $G \in \mathcal{C}$, and since the $G(A)\psi_1$ are dense we obtain $F(A) = 0$. This in turn implies that:
$$0 = (F(A)\psi_1, F(A)\psi_1) = (\psi_1, F^*(A)F(A)\psi_1).$$
Finally using (1.2.1) we see that $|F|(x) = 0$ for $\mu_{\psi_1}$-almost all $x \in \mathbb{R}$. Conversely, if $F = H$ $\mu_{\psi_1}$-almost everywhere, analogous calculations show that $F(A)\psi_1 = H(A)\psi_1$. Taking $\mu = \mu_{\psi_1}$ we can define $U$ on $\Omega_1$ by
$$U[F(A)\psi_1] := F.$$
The mapping $U$ is an isometry with dense image because:
$$\|F(A)\psi_1\|^2 = (F(A)\psi_1, F(A)\psi_1) = (\psi_1, F^*(A)F(A)\psi_1) = \int \overline{F(x)}F(x)\, d\mu_{\psi_1}.$$
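Thus it can be extended into an isometry of $H$ onto $L^2(\mu)$. It is clear by construction that any operator $G(A)$ is transformed by this isometry into the multiplication operator by $G$. In order to find the transform of $A$ under this isometry, we write $A = \lim_{\lambda \to +i\infty} G_\lambda(A)$ where $G_\lambda(x) = \dfrac{\lambda x}{\lambda - x}$. Since $G_\lambda(A) = \lambda R_\lambda A$, Lemma 1.2.2 implies that for all $\varphi \in D(A)$ the vector $G_\lambda(A)\varphi$ converges to $A\varphi$ when $\Re(\lambda)$ remains bounded and $\Im(\lambda)$ tends to infinity. It follows from this that $U(G_\lambda(A)\varphi) = G_\lambda \cdot U\varphi$ converges in $L^2(\mu)$ to $U(A\varphi)$; since $G_\lambda(x)$ tends to $x$ for every $x$, we obtain $U(A\varphi) = X \cdot U\varphi$.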
We also have the inclusion $U(D) \subset \{f \in L^2(\mu) : Xf \in L^2(\mu)\}$ where $X(x) = x$, and $U(D)$ contains all of the functions $F \in \mathcal{C}$. Since the transform of $A$ by $U$ is self-adjoint, and thus closed, it is easy to see that we have equality. In the case where there is a cyclic vector we have established the theorem with a single copy of $\mathbb{R}$ and a simple spectrum.

In the general case, we begin by choosing a non-zero vector $\psi_1$ in $D(A)$ and forming the orbit $\Omega_1$ generated as above using $\psi_1$. If $\psi_1$ is not cyclic we choose a non-zero vector $\psi_2 \in D(A) \cap H_1^{\perp}$, where $H_1$ is the closure of $\Omega_1$, and form the orbit $\Omega_2$ generated by $\psi_2$. In this way we construct by recurrence a (possibly denumerable) sequence of vectors $(\psi_n)$ which leads, since $H$ is separable, to the Hilbertian direct sum:
$$H = \bigoplus_n \overline{\{F(A)\psi_n : F \in \mathcal{C}\}}.$$
We then repeat the preceding construction for each orbit. □
Exercise 1.2.5. Show that the support of $\mu_\varphi$ is contained in $\sigma(A)$ and that
$$\sigma(A) = \overline{\bigcup_n \mathrm{supp}(\mu_{\psi_n})}.$$
Exercise 1.2.6. In the case where $H = \mathbb{C}^d$ with the usual scalar product and $A$ is given by a Hermitian matrix, describe the relation between the spectral theorem and ordinary diagonalization.

Exercise 1.2.7. Prove that a bounded symmetric operator is self-adjoint and that its norm is equal to its spectral radius, i.e.,
$$\|A\| = \sup_{\lambda \in \sigma(A)} |\lambda|.$$
Show that if $A$ is self-adjoint and bounded below, the best lower bound for the operator $A$ is $m = \inf(\sigma(A))$.

Exercise 1.2.8. Let $(a, a')$ and $(b, b')$ be two fixed non-zero elements of $\mathbb{R}^2$. Let $D$ be the space of $C^2$ functions $f$ on $[0,1]$ satisfying $a f'(0) - a' f(0) = b f'(1) - b' f(1) = 0$. Set $Af := -f''$ for $f \in D$. Show that $A$ is symmetric and bounded below. For the bounded below property first prove the following inequality: for any $\delta > 0$,
$$\Big|\, \|f\|_{L^2([0,\delta])} - \sqrt{\delta}\,|f(0)| \,\Big| \le \delta \Big(\int_0^\delta |f'(t)|^2\,dt\Big)^{1/2}.$$
Prove that $A$ is essentially self-adjoint. One will show the following elementary hypoellipticity property: any distribution solution of $u'' + \lambda u = 0$ on $]0,1[$ is a $C^\infty$ function on $]0,1[$ and, in fact, on $[0,1]$. Suppose from now on that $a = b = 0$ and $a' = b' = 1$, i.e., the Dirichlet boundary conditions hold. By letting $f(t) = \sin(\pi t)g(t)$, prove Wirtinger's inequality: if $f \in C^1([0,1])$ with $f(0) = f(1) = 0$:
$$\int_0^1 f'^2(t)\,dt \ge \pi^2 \int_0^1 f^2(t)\,dt.$$
Deduce from this the lower bound on the spectrum of $A$. Prove that the sequence of functions $\sqrt{2}\sin(n\pi x)$ with $n \in \mathbb{N}^*$ is an orthonormal basis for the Hilbert space $L^2([0,1])$ and reproduce the preceding result.

The definition of an arbitrary Borel measurable real-valued function of a self-adjoint operator $A$ is now very simple. Given such a function $f$ on $\mathbb{R}$, we define the self-adjoint operator $f(A)$ as follows: begin by utilizing the unitary operator $U$ of the spectral decomposition of $A$ to transform $A$ into the operator of multiplication by $X$ on the space $L^2(\mu)$. Then define $f(A)$ as the operator of multiplication by $f(X)$ on the domain $\{\varphi \in L^2 : f(X)\varphi \in L^2\}$ and transform it back using $U^{-1}$. One can show that this definition does not depend on the choice of $U$. See Exercise 1.2.10.
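As a minimal sketch of this definition in the simplest possible setting, assume $H = \mathbb{R}^2$ and that $A$ is a real symmetric $2 \times 2$ matrix, so that $U$ is a rotation to the eigenbasis and $f(A) = \sum_i f(\lambda_i)\, v_i v_i^T$. The function names and the example matrix are illustrative assumptions, not notation from the text.

```python
import math

def spectral_decomposition(a, b, d):
    """Eigenpairs of the real symmetric matrix [[a, b], [b, d]]."""
    # Rotation angle that diagonalizes a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    vectors = [(c, s), (-s, c)]  # orthonormal eigenvectors
    # Each eigenvalue is the quadratic form evaluated on its eigenvector.
    values = [a * x * x + 2.0 * b * x * y + d * y * y for x, y in vectors]
    return values, vectors

def apply_function(f, a, b, d):
    """Return f(A) as a nested list, f(A) = sum_i f(lambda_i) v_i v_i^T."""
    values, vectors = spectral_decomposition(a, b, d)
    result = [[0.0, 0.0], [0.0, 0.0]]
    for lam, v in zip(values, vectors):
        for i in range(2):
            for j in range(2):
                result[i][j] += f(lam) * v[i] * v[j]
    return result

def matmul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Hypothetical example: A = [[2, 1], [1, 3]], which is positive definite.
sqrt_A = apply_function(math.sqrt, 2.0, 1.0, 3.0)
square = matmul(sqrt_A, sqrt_A)
```

Here `square` should reproduce the original matrix up to rounding, which is the matrix form of the identity $(\sqrt{\cdot})^2 = \mathrm{id}$ under the functional calculus.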
Remark 1.2.9. There also exists another functional calculus in a different setting; see [DS63]. Let $A$ be a bounded operator on a Banach space $E$ and $f$ a holomorphic function on an open set $G$ containing $\sigma(A)$. We then define $f(A)$ using an extension of the Cauchy formula for a contour $C$ contained in $G$ and surrounding $\sigma(A)$ as follows:
$$f(A) = \frac{1}{2i\pi}\int_C f(z)R_z\,dz.$$

Exercise 1.2.10. Establish the following properties of the functional calculus $f \mapsto f(B)$ where $B$ is the operator of multiplication by $X$ on the space $L^2(\mu)$ and $f$ is a real-valued Borel-measurable function. If one restricts oneself to bounded functions then the mapping $f \mapsto f(B)$ is continuous from $L^\infty$ into the space of bounded operators on $L^2(\mu)$ with the usual operator norm. If $f_n$ is a sequence of Borel measurable functions that converge to $f$, then for all $\lambda \in \mathbb{C} \setminus \mathbb{R}$ and all $\psi \in L^2(\mu)$, there is convergence of the resolvents, i.e.,
$$\lim_{n \to \infty} R(\lambda, f_n(B))\psi = R(\lambda, f(B))\psi.$$
Deduce from this that if a self-adjoint A on H is transformed into two multiplication operators X and X' on L2(W, µ) and L2(W', µ') respectively, then the operators f (A) and (f (A))' defined by using the two spectral decompositions are the same. Carry out the proof in the following steps. Begin with the case where f E Cr, then where f is continuous and tends to 0 at infinity, then for f that are indicator functions for open sets, and finally utilize the monotone convergence theorem on monotone sequences of the preceding class of functions.
Exercise 1.2.11. Let $A$ be a positive self-adjoint operator that is invertible in $H$. Prove the inequality: $\|(A + w)^{-1}\| \le w^{-1}$ for $w > 0$. Prove the following formula for all $\psi \in D(A^{-1/2})$:
$$A^{-1/2}(\psi) = \frac{1}{\pi}\int_0^\infty w^{-1/2}(A + w)^{-1}\psi\,dw,$$
where the integral is the Bochner integral for functions with values in $H$.
Exercise 1.2.12. Go back to the construction in Remark 1.1.23 of the Friedrichs extension of a positive operator $B$. Show that $D_1$ is the domain of the self-adjoint operator $B^{1/2} + \mathrm{Id}$.
CHAPTER 2

Semi-Groups

2.1. Semi-groups of self-adjoint operators

The notion of a semi-group of operators on a Banach space is employed widely, in particular in the theory of the evolution in time of various phenomena. We will quickly restrict ourselves to the case of symmetric semi-groups on a Hilbert space.

2.1.1. Strongly continuous semi-groups of operators.

Definition 2.1.1. A family $S(t)$, $t \in \mathbb{R}_+$, of bounded operators on a Banach space $E$ is a semi-group if it satisfies:
$$S(0) = \mathrm{Id}, \qquad \forall s, t \ge 0 \quad S(t + s) = S(t)S(s).$$
We say that the semi-group $S(t)$ is strongly continuous if for each $x \in E$ the mapping $t \mapsto S(t)x$ is continuous from $\mathbb{R}_+$ to $E$.

Exercise 2.1.2. We denote the uniform norm on the bounded operators by $\|\cdot\|$. Show that if $S$ is a strongly continuous semi-group then
$$\sup_{t \in [0,a]} \|S(t)\| < \infty.$$

Definition 2.1.3. The infinitesimal generator $A$ of a semi-group $S$ is the operator defined by:
$$A\varphi := \lim_{h \to 0,\ h > 0} A_h\varphi,$$
where $A_h$ is the approximate infinitesimal generator $A_h := \frac{1}{h}(\mathrm{Id} - S(h))$ and where the domain $D(A)$ is the set of $\varphi \in E$ for which the limit defining $A$ exists.
Proposition 2.1.4. Let $\varphi$ be an element of the domain $D(A)$ of the generator $A$ of the strongly continuous semi-group $S$. Then for all $t \ge 0$ the vector $S(t)\varphi$ belongs to $D(A)$ and
$$(2.1.1)\qquad \frac{d}{dt}S(t)\varphi = -AS(t)\varphi = -S(t)A\varphi, \qquad \forall t \ge 0.$$
Moreover $x(t) = S(t)\varphi$ is the only function with values in $D(A)$ that is of class $C^1(]0, +\infty[, H)$, continuous at 0, and that satisfies:
$$(2.1.2)\qquad x(0) = \varphi, \qquad x'(t) = -Ax(t), \quad t > 0.$$
PROOF. We begin with the relations:
$$\frac{1}{h}(S(t) - S(t + h))\varphi = S(t)A_h\varphi = A_hS(t)\varphi.$$
By passing to the limit $h \to 0$, the first of the two equalities shows that $S(t)\varphi$ is differentiable and the latter one shows that $S(t)\varphi$ is in the domain of $A$ and that the equation (2.1.1) is satisfied. In particular, $t \mapsto S(t)\varphi$ is continuously differentiable from $\mathbb{R}_+$ into $H$ and on $D(A)$ we have commutativity: $AS(t) = S(t)A$.

For the uniqueness result, we fix a time $T > 0$ and define:
$$v(t) := S(T - t)x(t)$$
where $x(t)$ is the solution of equation (2.1.2). We then have by the rule for differentiating products (generalizing the differentiation of the product of two real functions) that for all $t \in [0, T[$:
$$v'(t) = AS(T - t)x(t) + S(T - t)x'(t) = AS(T - t)x(t) - S(T - t)Ax(t) = 0.$$
Thus $v(t)$ is constant on $[0, T]$, which implies: $x(T) = v(T) = v(0) = S(T)x(0) = S(T)\varphi$ for arbitrary $T > 0$. □
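The definition of the generator and the semi-group law can be checked numerically in a toy case. The sketch below assumes, purely for illustration, that $A$ acts on $\mathbb{R}^3$ as multiplication by a fixed list of real numbers (a finite-dimensional stand-in for a multiplication operator), so that $S(t)$ is multiplication by $e^{-tx}$; the names `spectrum`, `S` and `approximate_generator` are hypothetical.

```python
import math

# Hypothetical spectrum: A acts on R^3 as multiplication by these values.
spectrum = [-0.5, 1.0, 4.0]

def S(t, phi):
    """Semi-group S(t) = exp(-tA) in the representation where A is diagonal."""
    return [math.exp(-t * x) * p for x, p in zip(spectrum, phi)]

def approximate_generator(h, phi):
    """Approximate generator A_h phi = (phi - S(h) phi) / h."""
    return [(p - q) / h for p, q in zip(phi, S(h, phi))]

phi = [1.0, -2.0, 0.5]
A_phi = [x * p for x, p in zip(spectrum, phi)]

# Sup-norm error of the difference quotient for decreasing h.
errors = [max(abs(u - v) for u, v in zip(approximate_generator(h, phi), A_phi))
          for h in (1e-1, 1e-2, 1e-3)]

# Semi-group property S(t + s) = S(t) S(s), here with t = 0.3 and s = 0.4.
lhs = S(0.7, phi)
rhs = S(0.3, S(0.4, phi))
```

The list `errors` should shrink roughly linearly in $h$, and `lhs` and `rhs` should agree up to rounding.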
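Theorem 2.1.5. The domain of the infinitesimal generator $A$ of a strongly continuous semi-group $S(t)$ is dense and the operator is closed.

PROOF. For $\varphi \in E$, we set
$$\varphi_s := \frac{1}{s}\int_0^s S(t)\varphi\,dt.$$
Utilizing the continuity of the semi-group, it is easy to see that, when $s \to 0$, the vector $\varphi_s$ tends to $\varphi$, thus the set of the $\varphi_s$ is dense in $E$. We next show that $\varphi_s$ is in the domain of $A$. In fact we have:
$$-A_h\varphi_s = \frac{1}{hs}\int_0^s \big(S(t + h)\varphi - S(t)\varphi\big)\,dt = \frac{1}{s}\Big[\frac{1}{h}\int_s^{s+h} S(t)\varphi\,dt - \frac{1}{h}\int_0^h S(t)\varphi\,dt\Big],$$
and the right-hand side converges to $\frac{1}{s}(S(s)\varphi - \varphi)$ as $h \to 0$.

To show that $A$ is closed, we consider a sequence $\varphi_n \in D(A)$ such that $\varphi_n \to \varphi$ and $A\varphi_n \to \psi$. We then have:
$$A_h\varphi = \lim_{n \to \infty}\frac{1}{h}(\varphi_n - S(h)\varphi_n),$$
which, utilizing equation (2.1.1), implies:
$$\lim_{h \to 0} A_h\varphi = \lim_{h \to 0}\lim_{n \to \infty}\frac{1}{h}\int_0^h S(t)A\varphi_n\,dt = \lim_{h \to 0}\frac{1}{h}\int_0^h S(t)\psi\,dt = \psi.$$
Thus $\varphi \in D(A)$ and $A\varphi = \psi$. □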
We assume the following classical lemma, which is proved in, e.g., [DS63].
Lemma 2.1.6. Let $f$ be a real-valued function on $]0, +\infty[$ that is subadditive:
$$f(t + s) \le f(t) + f(s),$$
and bounded above in a neighborhood of 0. Then $f(t)/t$ has a limit in $\mathbb{R} \cup \{-\infty\}$ when $t \to \infty$ given by:
$$(2.1.3)\qquad \lim_{t \to \infty}\frac{f(t)}{t} = \inf_{t > 0}\frac{f(t)}{t}.$$

Proposition 2.1.7. For any semi-group of bounded operators $S(t)$, $\frac{1}{t}\log\|S(t)\|$ has a limit $\gamma_S$ in $\mathbb{R} \cup \{-\infty\}$ when $t \to \infty$. For any $\gamma > \gamma_S$, there exists a constant $a_\gamma$ such that $\|S(t)\| \le a_\gamma e^{\gamma t}$ for all $t \ge 0$.

PROOF. Set $f(t) := \log(\|S(t)\|)$. Since the operator norm satisfies $\|AB\| \le \|A\|\,\|B\|$, the function $f$ is subadditive and the above lemma applies. □

The number $\gamma_S$ is called the Lyapunov exponent of the semi-group.
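To illustrate the growth rate numerically, the following sketch assumes a hypothetical non-symmetric generator $G = \begin{pmatrix} 1 & 5 \\ 0 & 2 \end{pmatrix}$ on $\mathbb{R}^2$, for which $S(t) = \exp(-tG)$ has a closed form, and evaluates $\frac{1}{t}\log\|S(t)\|$, that is $f(t)/t$ with $f$ as in the proof above. The matrix and function names are assumptions made for the example only.

```python
import math

def semigroup(t):
    """Entries (p, q, r) of S(t) = exp(-tG) = [[p, q], [0, r]] for G = [[1, 5], [0, 2]]."""
    p = math.exp(-t)
    r = math.exp(-2.0 * t)
    # Off-diagonal entry of the exponential of an upper triangular matrix
    # with distinct diagonal entries 1 and 2 and off-diagonal entry 5.
    q = -5.0 * (p - r)
    return p, q, r

def operator_norm(p, q, r):
    """Largest singular value of [[p, q], [0, r]]."""
    trace = p * p + q * q + r * r   # trace of M^T M
    det = (p * r) ** 2              # determinant of M^T M
    return math.sqrt(0.5 * (trace + math.sqrt(trace * trace - 4.0 * det)))

def rate(t):
    """f(t)/t with f(t) = log ||S(t)||."""
    return math.log(operator_norm(*semigroup(t))) / t

rates = [rate(t) for t in (1.0, 10.0, 50.0)]
```

In this example the rate is positive at $t = 1$, so no bound of the form $\|S(t)\| \le e^{\gamma t}$ with constant 1 can hold for $\gamma$ close to the limit, and it decreases toward $-1$, the value determined by the smallest eigenvalue of $G$.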
2.1.2. The case of symmetric operators. Recall that a self-adjoint operator $A$ is bounded below if there exists a constant $m$ such that $(Ax, x) \ge m\|x\|^2$ for all $x$ in the domain of $A$. Using the spectral decomposition theorem, one can easily show that the best possible constant for this inequality, called the lower bound of $A$, is $m = \inf \sigma(A)$.
Proposition 2.1.8. Let $A$ be a self-adjoint operator on a Hilbert space $H$ that is bounded below. There then exists a unique strongly continuous semi-group $S(t)$ for which the infinitesimal generator is $A$. For $\Re(\lambda) < m$ we have:
$$(2.1.4)\qquad R_\lambda = -\int_0^\infty e^{\lambda t}S(t)\,dt.$$
Conversely the infinitesimal generator $A$ of a semi-group $S(t)$ of symmetric operators on $H$ is a self-adjoint operator that is bounded below.

PROOF. We begin by considering the case where $H = L^2(\mu)$ and where $A$ is the multiplication operator defined by $X$ on:
$$D(A) = \{f \in L^2(\mu) : fX \in L^2(\mu)\}.$$
Since $A$ is bounded below, there exists a constant $m$ such that $X \ge m$ almost everywhere. Thus for all $t \ge 0$ the function $e^{-tX}$ is bounded and
$$|1 - e^{-tX}| \le e^{|m|t}\,t|X|.$$
Multiplication by $e^{-tX}$ defines a bounded operator $S(t)$ for each $t$ and the family $S(t)$, $t \in \mathbb{R}_+$, is clearly a semi-group. If $f \in D(A)$, the bounded convergence theorem implies that:
$$\lim_{t \to 0}\frac{1}{t}(f - S(t)f) = \lim_{t \to 0}\frac{1}{t}(f - e^{-tX}f) = Xf.$$
Going in the other direction, if $f$ is in the domain of the infinitesimal generator of the semi-group $S(t)$, the sequence $n f(1 - \exp(-X/n))$ converges in $L^2$, and by extracting a subsequence that converges almost everywhere, we see that $Xf \in L^2$ and hence $f \in D(A)$. The uniqueness of the semi-group just constructed follows from the uniqueness result in Proposition 2.1.4 and the formula (2.1.4) follows from the formula:
$$\forall \lambda < x \qquad \frac{1}{\lambda - x} = -\int_0^\infty e^{-tx}e^{t\lambda}\,dt.$$
The construction of the semi-group in the general case reduces immediately to the preceding case because the spectral theorem allows us to transform $A$ into a multiplication operator defined by a function $X$. We do, however, have to check that two different spectral representations do not give different results. The easiest way to see this is to note that the formula (2.1.4) determines, for $g \in H$ and $f \in D(A)$, the Laplace transform of the numerical function $(S(t)f, g)$ on $]-\infty, m[$, and therefore $S(t)$ itself since $g$ is an arbitrary element of $H$ and $f$ is an arbitrary element from a dense subspace of $H$.

We now turn to the converse. For $c < -\gamma_S$, where $\gamma_S$ is the Lyapunov exponent of the semi-group $S$, the integral:
$$r_c = -\int_0^{+\infty} e^{ct}S(t)\,dt$$
is convergent and defines a bounded operator on $H$. We choose, in particular, $c = -\gamma_S - 1$. Since the operators $S_t$ are symmetric, the definition of $A$ implies that $A$ is symmetric on $D(A)$ and that $r_c$ is symmetric on $H$. To show that the infinitesimal generator $A$ is self-adjoint, it is sufficient to show the real number $c$ is in the resolvent set $\rho(A)$. For this it is sufficient to show that
$$(2.1.5)\qquad (cI - A) \circ r_c = I_H$$
(which includes $r_c(H) \subset D(A)$) since the relation $r_c \circ (cI - A) = I_{D(A)}$ follows from (2.1.5) by the symmetry of the operators. (See Proposition 1.1.4.) We now establish (2.1.5). For all $\psi \in H$, the relation:
$$S_hr_c\psi = -\int_0^\infty e^{ct}S_{t+h}\psi\,dt = e^{-ch}\Big(\int_0^h e^{ct}S(t)\psi\,dt + r_c\psi\Big)$$
implies:
$$(cI - A)r_c\psi = \lim_{h \to 0}\Big(c\,r_c\psi + \frac{1}{h}(S_h - I)r_c\psi\Big) = \lim_{h \to 0}\Big(\frac{e^{-ch} - 1 + ch}{h}\,r_c\psi + \frac{e^{-ch}}{h}\int_0^h e^{ct}S(t)\psi\,dt\Big).$$
The second term is a mean that tends to the value of the continuous function $t \mapsto e^{ct}S_t\psi$ at 0 as $h \to 0$, i.e., to $\psi$. The first term clearly goes to 0 as $h \to 0$, which establishes (2.1.5). □
Remark 2.1.9. The construction of a semi-group from a self-adjoint operator is just an example of the functional calculus for a self-adjoint operator and in this sense we have $S_t = \exp(-tA)$.

Exercise 2.1.10. Prove that if $A$ is bounded we have:
$$\exp(-tA) = \sum_{k=0}^{\infty}\frac{(-tA)^k}{k!}.$$

Exercise 2.1.11 (the heat semi-group). Show that the semi-group that is generated by the $\mathbb{R}^n$ Laplacian operator $-\frac{1}{2}\Delta$ acting on $L^2(dx)$ is given by:
$$\Big(\exp\Big(\frac{t}{2}\Delta\Big)\varphi\Big)(x) = \Big(\frac{1}{2\pi t}\Big)^{n/2}\int \exp\Big(-\frac{|x - y|^2}{2t}\Big)\varphi(y)\,dy.$$

Exercise 2.1.12 (the Ornstein-Uhlenbeck semi-group). Let $\gamma$ denote the Gaussian probability measure on $\mathbb{R}$ with mean 0 and variance 1/2, i.e., $\gamma(dy) = \frac{1}{\sqrt{\pi}}e^{-y^2}\,dy$. We consider the family of measures on $\mathbb{R}$ depending on $x \in \mathbb{R}$ and $t \ge 0$ defined by $N_0(x, dy) = \delta_x(dy)$ and for $t > 0$
$$N_t(x, dy) = (\pi(1 - e^{-2t}))^{-1/2}\exp\Big(-\frac{(y - xe^{-t})^2}{1 - e^{-2t}}\Big)\,dy.$$
Show that $N_t(\psi) := \int \psi(y)\,N_t(x, dy)$ defines a strongly continuous symmetric semi-group on $L^2(\gamma)$. Let $-L$ be its infinitesimal generator. Prove that: $Lf = \frac{1}{2}\Delta f - x\nabla f$, for $f \in C_c^\infty(\mathbb{R})$, and that this latter formula defines an essentially self-adjoint operator on $C_c^\infty(\mathbb{R})$.¹
2.2. Kolmogorov semi-groups

The goal of this section is the introduction of certain stochastic differential equations for which the associated semi-group is called a Kolmogorov semi-group.

2.2.1. Review of Brownian motion. A real Brownian motion starting at 0 at time 0 is a family of random variables $B_t$ with $t \in \mathbb{R}_+$ defined on a probability space $(\Omega, \mathcal{F}, P)$ that is a centered Gaussian process: for any finite sequence $t_k$, $1 \le k \le n$, of "times", the vector $(B_{t_1}, \ldots, B_{t_n})$ is a vector-valued centered Gaussian random variable, and $E(B_tB_s) = t \wedge s$. This data determines in a unique manner the joint distributions of the random vectors $(B_{t_1}, \ldots, B_{t_n})$. An important fact is that we can always choose versions of the random variables $B_t$ such that for almost all $\omega$, $t \mapsto B_t(\omega)$ is continuous on $\mathbb{R}_+$, although these random paths are almost surely nowhere differentiable. We will always choose such versions of Brownian motion.

¹ Note that we have made the choice of relating the semi-group and its infinitesimal generator by $S_t = \exp(-tA)$,
One can always have a more global vision of Brownian motion on the interval $[0, T]$ (resp. $[0, \infty[$). Let $W := C_0([0, T])$ (resp. $C_0([0, \infty[)$) be the space of continuous functions on $[0, T]$ (resp. $[0, \infty[$) which vanish at 0, with the topology defined by the usual "sup" norm (resp. with the topology defined by the family of semi-norms $\|f\|_N := \sup\{|f(x)|,\ x \in [0, N]\}$, $N = 1, 2, \ldots$). It is easy to show that the corresponding σ-field of Borel subsets is the same as the σ-field generated by the evaluation functions $\beta_t$ for $t \in [0, T]$, defined by $\beta_t(w) = w(t)$ for $w \in W$. The law on $C_0([0, T])$ of the Brownian motion is the image $Q_T$ of the probability measure $P$ induced by the mapping $\omega \mapsto B_\cdot(\omega)$. It is called the Wiener measure on $C_0([0, T])$ (so there is a Wiener measure for each $T$, and we define similarly the Wiener measure on $C_0([0, \infty[)$).

Definition 2.2.1. We call the process defined by the evaluation variables on the space $C_0([0, \infty[)$ with the Borel σ-field and Wiener measure the canonical version of Brownian motion.
Brownian motion possesses the Markov property. In fact, let $\mathcal{F}_{\le t}$ be the σ-field generated by $(B_s,\ 0 \le s \le t)$, called the σ-field of past events, let $\mathcal{F}_{\ge t}$ be the σ-field generated by $B_s$ for $s \ge t$, called the σ-field of future events, and let $\mathcal{F}_{\{t\}}$ be the σ-field generated by the single random variable $B_t$, called the σ-field of present events. Then the Markov property is the following: for any bounded $\mathcal{F}_{\ge t}$-measurable random variable $\psi$, we have: $E(\psi \mid \mathcal{F}_{\le t}) = E(\psi \mid \mathcal{F}_{\{t\}})$. The $\mathcal{F}_{\{t\}}$-measurable random variables are of the form $\varphi \circ B_t$ where $\varphi$ is a Borel-measurable function on $\mathbb{R}$. The heat semi-group appears in the description of the transition from $t$ to $t + h$ in the following way:
$$E(f(B_{t+h}) \mid \mathcal{F}_{\le t}) = [N_hf](B_t), \qquad [N_hf](x) = \frac{1}{\sqrt{2\pi h}}\int \exp\Big(-\frac{(x - y)^2}{2h}\Big)f(y)\,dy.$$
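The covariance structure $E(B_tB_s) = t \wedge s$ recalled in this review can be checked by simulation. The sketch below is a rough Monte Carlo illustration that builds the path from independent Gaussian increments; the seed, sample size and function name are arbitrary choices made for the example.

```python
import math
import random

def brownian_path(times, rng):
    """Sample (B_t) at increasing times using independent Gaussian increments."""
    path, b, previous = [], 0.0, 0.0
    for t in times:
        # The increment over [previous, t] is centered Gaussian with variance t - previous.
        b += rng.gauss(0.0, math.sqrt(t - previous))
        path.append(b)
        previous = t
    return path

rng = random.Random(0)  # fixed seed so that the run is reproducible
s, t, n_paths = 0.5, 1.0, 20000
cov = 0.0       # Monte Carlo estimate of E(B_s B_t)
var_incr = 0.0  # Monte Carlo estimate of E((B_t - B_s)^2)
for _ in range(n_paths):
    b_s, b_t = brownian_path([s, t], rng)
    cov += b_s * b_t / n_paths
    var_incr += (b_t - b_s) ** 2 / n_paths
```

Both estimates should be close to $0.5$, that is $s \wedge t$ and $|t - s|$ respectively, within Monte Carlo error.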
Exercise 2.2.2. Establish the following formula: $E(B_t - B_s)^2 = |t - s|$. Show that for all finite sequences $t_1 < t_2 <$ the random
Definition 2.2.3. A filtration $(\mathcal{F}_t)$ is an increasing family, $\mathcal{F}_s \subset \mathcal{F}_t$ when $s < t$, of sub-σ-fields of the σ-field $\mathcal{F}$. We say that a process $(X_t)$ is measurable if $(t, \omega) \mapsto X_t(\omega) \in \mathbb{R}^d$ is measurable with respect to $\mathcal{B}(\mathbb{R}) \otimes \mathcal{F}$ and that it is adapted if it is measurable and if $X_t$ is $\mathcal{F}_t$-measurable for all $t$.
Definition 2.2.4. We say that $(\Omega, \mathcal{F}, P, \mathcal{F}_t, W_t)$ is an $\mathbb{R}^d$-valued Brownian motion, filtered by the filtration $\mathcal{F}_t$, if:
(1) The process $W$ is adapted, and $t \mapsto W_t$ is almost surely continuous.
(2) For $s < t$, $W_t - W_s$ is a vector-valued Gaussian random variable with covariance $|t - s|I$ that is independent of all $\mathcal{F}_s$-measurable random variables.
This definition immediately implies the Markov property and the formula for the transition kernel, both written as above, except we replace $\mathcal{F}_{\le t}$ by $\mathcal{F}_t$, the heat semi-group by its $d$-dimensional analog, and $B_t$ by $W_t$.

Examples 2.2.5. We give three examples. We can always filter the Brownian motion $B_t$ introduced above by letting $\mathcal{F}_s = \mathcal{F}_{\le s}$; we can also complete this σ-field by adding all of the events of probability zero of $\mathcal{F}$, or consider $\mathcal{F}_{s+} := \bigcap_{h > 0}\mathcal{F}_{\le s+h}$ as is shown in Exercise 2.2.7 below; or extend a filtered Brownian motion $(W_t, \mathcal{F}_t)$ already defined to a product probability space $(\Omega \times E, \mathcal{F} \otimes \mathcal{E}, P \otimes Q, \mathcal{F}_t \otimes \mathcal{E})$, by setting $W_t(\omega, x) := W_t(\omega)$, for $x \in E$.

Definition 2.2.6. We say the filtration $(\mathcal{F}_s)$ is right-continuous if for any time $s$ we have: $\mathcal{F}_s = \bigcap_{h > 0}\mathcal{F}_{s+h}$.

Exercise 2.2.7. By utilizing the almost sure continuity of the paths, show that the σ-field $\mathcal{F}_{s+}$ defined above is a right-continuous family of σ-fields
as a filtration for the Brownian motion B. Exercise 2.2.8. Prove that the family of a-fields )/ .Ps UN, where N is the set of zero -probability events, is right-continuous. Begin by proving the orthogonal decomposition: L2(F , p) (P L2(G, U A(, P) = L2(F U N, p),
where ,P := a{Bt, t E R}, G, := a{(Bt - B,), t>, s}. The preceding exercise shows that we can find filtered Brownian motions which also satisfy the following condition:
(3) The filtration is right-continuous and each of its constituent σ-fields contains all of the probability-zero events in $\mathcal{F}$. We will always suppose that these two supplementary conditions are satisfied. The first simplifies the consideration of stopping times and the second that of stochastic integrals.
Remark 2.2.9. From now on we will utilize the notation $(B_t)$ to denote a filtered $d$-dimensional Brownian motion that starts at 0 at time 0 almost surely. If we set $B_t := W_t - W_0$, we obtain a Brownian motion starting at 0 and conversely, if $(B_t, \mathcal{F}_t)$ is a filtered Brownian motion starting at 0 and $W_0$ an arbitrary $\mathcal{F}_0$-measurable random variable, $(W_0 + B_t, \mathcal{F}_t)$ is a filtered Brownian motion.
Lemma 2.2.10. Let $B_t$ be a Brownian motion starting at 0 and $\psi$ a positive Borel-measurable mapping on $C([0, T])$. Then we have what is called the return-time property:
$$\int_{\mathbb{R}^d} E(\psi(x + B))\,dx = \int_{\mathbb{R}^d} E(\psi(y + \hat{B}))\,dy$$
where $\hat{B}$ is defined by $\hat{B}_s := B_{T-s}$ for $0 \le s \le T$.

PROOF. By standard measure theory it is sufficient to consider the case where $\psi(w) = f_0(w_{t_0})f_1(w_{t_1})\cdots f_n(w_{t_n})$ with $t_0 = 0$ and $t_n = T$, and where the functions $f_k$ are the indicators of Borel-measurable sets in $\mathbb{R}^d$. But in this case, each side of the equality can be computed explicitly with the aid of the Markov property and the formula for the transition kernel to yield the same result:
$$\frac{1}{Z}\int_{(\mathbb{R}^d)^{n+1}} \prod_{k=0}^{n} f_k(x_k)\exp\Big(-\frac{1}{2}\sum_{k=1}^{n}\frac{|x_k - x_{k-1}|^2}{t_k - t_{k-1}}\Big)\,dx_0\cdots dx_n,$$
with $Z = (2\pi)^{dn/2}\prod_{k=1}^{n}(t_k - t_{k-1})^{d/2}$. □
Definition 2.2.11. We say that a random variable $T$ with values in $\mathbb{R}_+$ is a stopping time if: $\{\omega : T(\omega) \le t\} \in \mathcal{F}_t$, $t \ge 0$.

Exercise 2.2.12. By utilizing the right-continuity of the filtration show that the preceding condition is equivalent to:
$$\{\omega : T(\omega) < t\} \in \mathcal{F}_t, \qquad t > 0.$$
Show that the time of first entry of the filtered Brownian motion into an open set $G$ of $\mathbb{R}^d$ defined by $T_G := \inf\{t : W_t \in G\}$ is a stopping time. To show this utilize the almost sure continuity of the Brownian motion.

Ito's formula. This is the analog of the formula for differentiating the composition of two functions but is more subtle because Brownian paths are almost surely nowhere differentiable with respect to time. We begin by looking at the case of real-valued Brownian motion. The first thing we need to do is to define stochastic integrals $\int_0^t e(s)\,dB_s$. We do this in the setting of a square-integrable process with given filtered Brownian motion $(B_t, \mathcal{F}_t)$.

Definition 2.2.13. We say that a process $(e(s))$ is a square-integrable stochastic integrand if it satisfies the following condition:
$$\text{(B2)}\qquad e(s) \text{ is adapted and } E\Big(\int_0^t e^2(s)\,ds\Big) < \infty \text{ for all } t.$$
We are then able to define the stochastic integral $M_t = \int_0^t e(s)\,dB_s$, which is characterized by the following properties:

(1) The process $M_t$ is almost surely continuous and is a square-integrable $(\mathcal{F}_t)$-martingale: $E(M_t \mid \mathcal{F}_u) = M_u$ for all $u < t$.
(2) It is linear in $e$ and the following isometry holds: $E(M_t^2) = E\big(\int_0^t e^2(s)\,ds\big)$.
(3) For any subdivision $0 = t_0 < t_1 < \cdots < t_{n-1} < t_n = t$ and any family of $\mathcal{F}_{t_k}$-measurable random variables $e_k$, we have:
$$\int_0^t \sum_{k=1}^{n} e_{k-1}\mathbb{1}_{]t_{k-1}, t_k]}(s)\,dB_s = \sum_{k=1}^{n} e_{k-1}(B_{t_k} - B_{t_{k-1}}).$$
We next generalize the above to the case of measurable and adapted stochastic integrands that satisfy the weaker condition:
$$\text{(BL2)}\qquad \int_0^t |e(s)|^2\,ds < \infty \quad \text{almost surely.}$$
An adapted process $t \mapsto \int_0^t e(s)\,dB_s$ can then be defined so that the following characteristic property is satisfied: for any stopping time $T$ such that $s \mapsto e(s)\mathbb{1}_{s \le T}$ satisfies (B2),
$$(2.2.1)\qquad \int_0^{t \wedge T} e(s)\,dB_s = \int_0^t e(s)\mathbb{1}_{s \le T}\,dB_s.$$
By considering the sequence of stopping times:
$$T_n = \inf\Big\{t : \int_0^t e_s^2\,ds > n\Big\}$$
and passing to the limit in the preceding formula, we see that the isometry given in (2) above extends to the case where both sides of the isometry formula are infinite.
Exercise 2.2.14. Subdivide the interval $[0, t]$ into $n$ subintervals of the same length and set:
$$A_n(t) = \sum_{k=1}^{n}\big(B_{kt/n} - B_{(k-1)t/n}\big)^2 - t.$$
Prove that $A_n(t)$ tends to 0 in $L^2(P)$ when $n \to \infty$. Deduce from this the formula:
$$2\int_0^t B_s\,dB_s = B_t^2 - t.$$
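The mechanism behind this exercise can be seen on a single simulated path. The sketch below, which is only an illustration with an arbitrary seed and grid size, accumulates the left-endpoint Riemann sum $\sum_k B_{(k-1)t/n}(B_{kt/n} - B_{(k-1)t/n})$ together with the sum of squared increments; the algebraic identity $b_k^2 - b_{k-1}^2 = 2b_{k-1}(b_k - b_{k-1}) + (b_k - b_{k-1})^2$ links the two.

```python
import math
import random

def ito_sum_and_quadratic_variation(t, n, rng):
    """Return (sum of B_{k-1} dB_k, sum of dB_k^2, B_t) on a uniform grid of n steps."""
    b, ito, qv = 0.0, 0.0, 0.0
    for _ in range(n):
        db = rng.gauss(0.0, math.sqrt(t / n))
        ito += b * db   # integrand evaluated at the left endpoint (Ito convention)
        qv += db * db   # contribution to the sum of squared increments
        b += db
    return ito, qv, b

rng = random.Random(1)  # fixed seed so that the run is reproducible
t, n = 2.0, 100000
ito, qv, b_t = ito_sum_and_quadratic_variation(t, n, rng)
```

On any path, `2 * ito` equals `b_t**2 - qv` up to rounding, and `qv` is close to `t` for large `n`, which is how the correction term $-t$ arises in the limit.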
To state Ito's formula we need to consider other integrands $f$ which are almost surely integrable, i.e., the condition (BL2) above is replaced by:
$$\text{(BL1)}\qquad \int_0^t |f(s)|\,ds < \infty \quad \text{almost surely,}$$
and an initial random variable $X_0$ that is measurable with respect to $\mathcal{F}_0$, e.g., a constant.
Theorem 2.2.15 (Ito). Let $e$ and $f$ satisfy respectively (BL2) and (BL1) and consider the process:
$$X_t = X_0 + \int_0^t e(s)\,dB_s + \int_0^t f(s)\,ds.$$
Then for any function $u(t, x) \in C^2(\mathbb{R} \times \mathbb{R})$, we have:
$$u(t, X_t) = u(0, X_0) + \int_0^t \frac{\partial}{\partial x}u(s, X_s)e(s)\,dB_s + \int_0^t \Big[\frac{\partial}{\partial t}u(s, X_s) + \frac{\partial}{\partial x}u(s, X_s)f(s) + \frac{1}{2}\frac{\partial^2}{\partial x^2}u(s, X_s)e^2(s)\Big]\,ds.$$
It is clear that the two preceding integrands satisfy, respectively, the conditions (BL2) and (BL1). The reader will find a proof of Ito's formula, for example, in [KS88]. When $u$ does not depend on time, say $u(t, x) = v(x)$, we often write Ito's formula in the symbolic form:
$$dv(X_s) = v'(X_s)e(s)\,dB_s + \Big[v'(X_s)f(s) + \frac{1}{2}v''(X_s)e^2(s)\Big]\,ds.$$
We will also need the multi-dimensional form of Ito's formula with the obvious extension of notation where the indices $i, j, k$ vary between 1 and $d$ and time is indexed by 0. The formula is thus written as follows:
$$u(t, X_t) = u(0, X_0) + \int_0^t \sum_{i,k}\partial_iu(s, X_s)e_{ik}(s)\,dB_{k,s} + \int_0^t \Big[\partial_0u(s, X_s) + \sum_i \partial_iu(s, X_s)f_i(s) + \frac{1}{2}\sum_{i,j,k}\partial_{ij}u(s, X_s)e_{ik}e_{jk}(s)\Big]\,ds.$$
Example 2.2.16 (local exponential martingale). Applying Ito's formula to the process
$$M_t := \exp\Big(\int_0^t e_s\,dB_s - \frac{1}{2}\int_0^t |e_s|^2\,ds\Big)$$
leads to the stochastic differential equation of exponential type: $dM_t = e_tM_t\,dB_t$. We can deduce from this that there exists an increasing sequence of stopping times $T_n$ tending to infinity such that $t \mapsto M_{t \wedge T_n}$ are martingales. We say that such an $M_t$ is a local martingale. A sufficient condition for $M_t$ to be a martingale, i.e., that it is integrable, is provided by the following result. (See [KS88] for a proof.)
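The computation behind $dM_t = e_tM_t\,dB_t$ in Example 2.2.16 can be sketched in the real-valued case as follows, writing $Y_t$ for the exponent (a notation introduced here only for this derivation) and applying the symbolic form of Ito's formula with $v(y) = e^y$.

```latex
% Assumption: Y_t is shorthand for the exponent of M_t; it is not notation from the text.
Y_t := \int_0^t e_s\,dB_s - \frac{1}{2}\int_0^t |e_s|^2\,ds,
\qquad dY_t = e_t\,dB_t - \frac{1}{2}|e_t|^2\,dt .
% Ito's formula with v(y) = e^y, stochastic integrand e_t and drift f(t) = -|e_t|^2/2,
% using v'(Y_t) = v''(Y_t) = M_t:
dM_t = dv(Y_t)
     = v'(Y_t)\,e_t\,dB_t
       + \Big[ v'(Y_t)\Big(-\frac{1}{2}|e_t|^2\Big) + \frac{1}{2}v''(Y_t)\,|e_t|^2 \Big]\,dt
     = e_t M_t\,dB_t .
```

The drift terms cancel exactly, which is why the factor $\frac{1}{2}$ appears in the exponent of $M_t$.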
Proposition 2.2.17 (Novikov). Suppose that $E\big(\exp\frac{1}{2}\int_0^t |e(s)|^2\,ds\big)$ is finite for all positive $t$. Then the local martingale
$$M_t = \exp\Big(\int_0^t e_s\,dB_s - \frac{1}{2}\int_0^t |e_s|^2\,ds\Big)$$
is a martingale.
2.2.2. The Kolmogorov process. Let $U \in C^2(\mathbb{R}^d)$ and let $(B_t, \mathcal{F}_t)$ be a filtered Brownian motion with values in $\mathbb{R}^d$ and starting at 0. This process can be viewed as a family of mutually independent real $\mathcal{F}_t$-filtered Brownian motions $(B_{1,t}, \ldots, B_{d,t})$. The following stochastic differential equation is called the Langevin equation:
$$(2.2.2)\qquad dX_t = dB_t - \nabla U(X_t)\,dt.$$
We call any (stochastic) solution of this equation, which we study below, a Kolmogorov process. Since the paths of a Brownian motion are almost surely nowhere differentiable we can only interpret the equation (2.2.2) in a symbolic sense. Its integral version, however, does make sense and it is in this context that we study a Kolmogorov process. In a slightly more general form, the integral equation is:
$$(2.2.3)\qquad X_t = X_0 + B_t + \int_0^t b(X(s))\,ds,$$
where $b$ is a locally Lipschitzian vector field $b : \mathbb{R}^d \to \mathbb{R}^d$ and $X_0$ is an $\mathcal{F}_0$-measurable random variable.
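A simple way to get a feeling for equation (2.2.3) is to discretize it. The sketch below uses the Euler-Maruyama scheme, which is a standard numerical method and not the construction used in the text, for the hypothetical one-dimensional potential $U(x) = x^2/2$, so that $b(x) = -x$. For this choice the long-time law of $X_t$ is Gaussian with density proportional to $e^{-2U(x)} = e^{-x^2}$, hence with variance $1/2$; step size, horizon, seed and sample size are arbitrary.

```python
import math
import random

def grad_U(x):
    """Gradient of the hypothetical potential U(x) = x^2 / 2."""
    return x

def euler_maruyama(x0, horizon, n_steps, rng):
    """Approximate X_T for dX_t = dB_t - grad U(X_t) dt by the Euler-Maruyama scheme."""
    h = horizon / n_steps
    x = x0
    for _ in range(n_steps):
        # Drift step plus a Gaussian increment of variance h.
        x += -grad_U(x) * h + rng.gauss(0.0, math.sqrt(h))
    return x

rng = random.Random(2)  # fixed seed so that the run is reproducible
samples = [euler_maruyama(0.0, 4.0, 200, rng) for _ in range(4000)]
second_moment = sum(x * x for x in samples) / len(samples)
```

The empirical second moment should be close to $0.5$, up to Monte Carlo error and a small discretization bias.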
Theorem 2.2.18. For $P$-almost every $\omega \in \Omega$ equation (2.2.3) has a unique solution $t \mapsto X_t(\omega)$ defined on a maximal interval $[0, T(\omega)[$ with $T(\omega) \in \overline{\mathbb{R}}_+$. The process $(X_t)$ is adapted and $T$ is a stopping time with values in $]0, +\infty]$. For all $\omega$ such that $T(\omega) < \infty$, we have:
$$\lim_{t \to T^-(\omega)} |X_t|(\omega) = \infty.$$

PROOF. We first point out that the proof of the first statement of the theorem only uses the fact that the paths of a Brownian motion are almost surely continuous. Indeed, we will seek a solution $f$ of the integral equation
$$(2.2.4)\qquad f(t) = w(t) + \int_0^t b(f(s))\,ds,$$
where $w$ is a fixed continuous function. We fix a time interval $[0, t_1]$ and consider the metric space $E = C([0, t_1], B)$ where $B$ is a compact neighborhood of $w(0)$. Let $M$ and $K$ respectively be the upper bound of $|b|$ and the Lipschitz constant for $b : B \to \mathbb{R}^d$. Define a mapping $F$ of $C([0, t_1], \mathbb{R}^d)$ into itself by:
$$Ff(t) = w(t) + \int_0^t b(f(s))\,ds.$$
For any $\varepsilon$ and $t_1$ sufficiently small, we have: $\sup_{[0, t_1]}|w| \le (1 + \varepsilon)|w(0)|$, and thus for $0 \le t \le t_1$ the inequality
$$|Ff(t)| \le (1 + \varepsilon)|w(0)| + Mt_1$$
is satisfied for $f \in E$. Hence for $t_1$ sufficiently small, $F$ maps $E$ into itself and we obtain a mapping, still denoted by $F$, of $E$ into $E$. This mapping is Lipschitzian for the uniform norm on $E$. More precisely,
$$\|Ff - Fg\|_\infty \le t_1K\|f - g\|_\infty.$$
Moreover by choosing $t_1 < K^{-1}$, we obtain a strict contraction of $E$ into itself, which thus has a unique fixed point. By considering $t_1$ as the origin of time, we extend this solution from $[0, t_1]$ to the interval $[0, t_2]$, with $t_2 > t_1$, by the same argument. By continuing this process we construct a solution on a maximal interval $[0, T[$. This reasoning is the same as in the classical construction of maximal solutions of a differential equation; see [Ch85].

To show the stopping time property, it is convenient to first consider the simpler case where $b$ is globally Lipschitzian on $\mathbb{R}^d$ with the corresponding constant $K$. In this case it is possible to construct a global solution with $T = \infty$, as follows: set $f_0 = w(0)$ and $f_n = F(f_{n-1})$; then we have
$$D_n(t) = |Ff_{n-1} - Ff_{n-2}|(t) \le K\int_0^t D_{n-1}(s)\,ds, \qquad \text{with } D_n = |f_n - f_{n-1}|.$$
By iteration, we then obtain by recurrence the following inequality:
$$D_n(t) \le |w(0)|\frac{K^nt^n}{n!},$$
which shows that $\{f_n\}$ is a Cauchy sequence in $C([0, t_1], \mathbb{R}^d)$ for an arbitrary time $t_1$. In this way, by taking the uniform limit we construct the restriction of $f$ to $[0, t_1]$ (which depends upon the restriction of $w$ to $[0, t_1]$). This shows that the stochastic process solution of the stochastic integral equation (2.2.3) is adapted and thus
$$T_{b,R}(\omega) = \inf\{t : |X_t|(\omega) > R\}$$
is a stopping time for any $R$. We can approach the general case of locally Lipschitzian $b$ by globally Lipschitzian vector fields $b_p$ by setting $b_p = \varphi_pb$ where $\{\varphi_p\}$ is a sequence of functions in $C_c^\infty$ such that $\varphi_p$ equals 1 on the ball $\{x : |x| \le p\}$. The exit times for balls of radius $R$ satisfy $T_{b,R} = T_{b_p,R}$ for $p \ge R$, and thus $T_{b,R}$ is a stopping time. Since the "time of explosion" $T$ is the limit of the $T_{b,R}$ when $R$ tends to infinity, it is also a stopping time. □

The absence of explosion for a Kolmogorov process will occur if $U$ satisfies suitable growth properties at infinity. We will use two different hypotheses although they are often satisfied simultaneously:
(2.2.5)
U(x)
+o0 when Ixl - 00 and IVU12 - 2AU is bounded below,
$$(2.2.6)\qquad \exists a \in \mathbb{R},\ b \in \mathbb{R}: \qquad x \cdot \nabla U(x) \ge -a|x|^2 - b.$$
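As a small worked example of hypothesis (2.2.6), consider the hypothetical one-dimensional double-well potential $U(x) = x^4/4 - x^2/2$. Then $x\,U'(x) = x^4 - x^2 = (x^2 - \tfrac{1}{2})^2 - \tfrac{1}{4}$, so (2.2.6) holds with $a = 0$ and $b = 1/4$. The sketch below checks this numerically on a grid, which is an illustration only and of course not a proof.

```python
def grad_U(x):
    """Derivative of the hypothetical double-well potential U(x) = x^4/4 - x^2/2."""
    return x ** 3 - x

a, b = 0.0, 0.25  # candidate constants in x * U'(x) >= -a x^2 - b
grid = [k / 100.0 for k in range(-500, 501)]
# Smallest value of x * U'(x) + a x^2 + b over the grid; (2.2.6) requires it to be >= 0.
margin = min(x * grad_U(x) + a * x * x + b for x in grid)
```

The margin is nonnegative and very small, which reflects that the constant $b = 1/4$ is sharp when $a = 0$.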
Theorem 2.2.19. Assume that hypothesis (2.2.5) or (2.2.6) is satisfied. Then for any $\mathcal{F}_0$-measurable random variable $X_0$ and for $P$-almost every $\omega \in \Omega$, the equation (2.2.3) has a unique continuous solution $t \mapsto X_t(\omega)$ defined on $\mathbb{R}_+$.

PROOF. We first assume (2.2.5) holds. We thus need to show that the explosion time $T$ is infinite. For this, we apply Ito's formula (Theorem 2.2.15) to the function $U$, with the stopping time $\inf\{t : U(X_t) > R\}$, which we denote by $T_R$:
$$U(X_{t \wedge T_R}) - U(X_0) = \int_0^{t \wedge T_R} \nabla U(X_s)\,dB_s - \int_0^{t \wedge T_R} |\nabla U(X_s)|^2\,ds + \frac{1}{2}\int_0^{t \wedge T_R} \Delta U(X_s)\,ds.$$
Since the first integral is a square integrable martingale we take the expectation of both sides and find:
$$(2.2.7)\qquad E\big(U(X_{t \wedge T_R}) - U(X_0)\big) = E\Big(-\int_0^{t \wedge T_R} |\nabla U(X_s)|^2\,ds + \frac{1}{2}\int_0^{t \wedge T_R} \operatorname{div}(\nabla U(X_s))\,ds\Big).$$
The function $-|\nabla U|^2 + \frac{1}{2}\Delta U$ is bounded above and $U$ is bounded below. Since the inequality:
$$\inf U \le U(X_t(\omega)) \le R$$
holds for $t < T_R(\omega)$, equation (2.2.7) implies the inequality:
RP(TR
oo, we have:
$P(T < t) \le \lim_{R \to \infty} P(T_R < t)$ and thus by the preceding inequality, $P(T < t) = 0$. Since $t$ is arbitrary, $T = +\infty$ almost surely. The proof under the hypothesis (2.2.6) is analogous. We apply Ito's formula to the function $|\cdot|^2$ and to the stopping time
T,, = inf{t : IXI2 > n}. This gives us: E(IXtATn 12) = x2 + E(-
rtATn
J0
X, V U(X4) ds) + t n T
rtAT,
x2 + a+ E (J
I Xe I2 ds) + (1 + b)+(t n T,,)
0
t x2 + (1 + b)+t + a+ JE(IXSATflI2)ds.
Then apply Gronwall's lemma to f (t) = E(IXtAT, 12), to obtain: E(I XtATn I2) < (x2 + (1 + b)+t) e0+t.
When $n \to \infty$, $|X_{t \wedge T_n}|^2$ converges to $+\infty$ on $\{T \le t\}$, thus Fatou's lemma implies that $\{T \le t\}$ has probability 0 for every $t$, i.e., $\{T < \infty\}$ has probability 0. □

The Cameron-Martin formula. We now wish to study the Radon-Nikodym derivative of the law of a transformed Brownian motion under a translation, with respect to the law of the Brownian motion itself.
Theorem 2.2.20 (Cameron-Martin). Let $b$ be a locally Lipschitzian and bounded vector field on $\mathbb{R}^d$. Denote by $Q_b$ the law on $C([0, T], \mathbb{R}^d)$ of the solution $t \mapsto X_t$ of the stochastic integral equation:
$$X_t = x + B_t + \int_0^t b(X_s)\,ds,$$
and let $Q$ be the canonical law on $C([0, T], \mathbb{R}^d)$ of the Brownian motion $W_t$ starting at $x$.² Then $Q_b$ is absolutely continuous with respect to $Q$ and the Radon-Nikodym derivative (density) is:
$$M_T := \exp\Big(\int_0^T b(W_s)\,dW_s - \frac{1}{2}\int_0^T |b|^2(W_s)\,ds\Big).$$
PROOF. We will only sketch the proof assuming Girsanov's Theorem. See, e.g., [Mt82]. We see from Novikov's condition (Proposition 2.2.17) that the process $M_t$ is a martingale. This implies that $\tilde{P} := M_TQ$ is a probability measure. If $t \le T$ we set:
$$\tilde{B}_t := W_t - x - \int_0^t b(W_s)\,ds$$
on the probability space $(\Omega, \mathcal{F}, \tilde{P}, \mathcal{F}_t)$. Girsanov's theorem states that the process $\tilde{B}_t$ is a martingale and even a Brownian motion. The process $W_t$ almost surely satisfies the stochastic differential equation (2.2.3) for this Brownian motion. Since the probability law associated with a Brownian motion is unique and the trajectory of $W$ is uniquely determined by the trajectory of $\tilde{B}$, the probability law for $W$ under the probability $\tilde{P}$ is $Q_b$. □
² We thus have $Q = Q_0$.

Lemma 2.2.21. Let $X_s$ be a Kolmogorov process that does not explode in finite time. Then for all $t$ the probability measure on $C([0, t], \mathbb{R}^d)$ determined by this stochastic process is absolutely continuous with respect to the probability measure determined by the Brownian motion $W_s$ starting from the same point. The Radon-Nikodym derivative (density) is:
$$F = \exp\Big(U(W_0) - U(W_t) - \frac{1}{2}\int_0^t [|\nabla U|^2 - \Delta U](W_s)\,ds\Big).$$

PROOF. We set $U_n = \chi_nU$ where $\chi_n$ is a $C^\infty$ function equal to 1 on the ball of radius $n$ centered at the origin and 0 outside the ball of radius $n + 1$. In this case we can apply the Cameron-Martin formula to obtain the Radon-Nikodym density:
$$F_n = \exp\Big(\int_0^t -\nabla U_n(W_s)\,dW_s - \frac{1}{2}\int_0^t |\nabla U_n(W_s)|^2\,ds\Big)$$
for the law on $C([0, t], \mathbb{R}^d)$ of the process satisfying:
$$dX_s^n = dB_s - \nabla U_n(X_s^n)\,ds.$$
Applying Ito's formula we obtain:
$$U_n(W_t) - U_n(W_0) = \int_0^t \nabla U_n(W_s)\,dW_s + \frac{1}{2}\int_0^t \Delta U_n(W_s)\,ds,$$
which establishes the lemma for the case of $U_n$.

To establish the general case we consider an arbitrary positive and bounded measurable function $\varphi$ on $C([0, t], \mathbb{R}^d)$ and apply the standard convergence theorems of measure theory to obtain the following: for almost every $\omega$ and any $s \le t$,
$$X_s^n(\omega) = X_s(\omega), \qquad U_n(W_s(\omega)) = U(W_s(\omega)),$$
as soon as $n > \sup_{s \in [0,t]}|X_s(\omega)| \vee \sup_{s \in [0,t]}|W_s(\omega)|$, and thus:
$$E(\varphi(W)F) = \lim_{n \to \infty} E\big(\mathbb{1}_{\|W\|_\infty \le n}\,\varphi(W)F\big) = \lim_{n \to \infty} E\big(\mathbb{1}_{\|W\|_\infty \le n}\,\varphi(W)F_n\big) = \lim_{n \to \infty} E\big(\mathbb{1}_{\|X\|_\infty \le n}\,\varphi(X)\big) = E(\varphi(X)).$$
□

2.2.3. Kolmogorov semi-groups. We fix a $C^2$ function $U$ such that the corresponding Kolmogorov process does not explode in finite time almost surely. For a bounded Borel measurable function $f$ on $\mathbb{R}^d$, we set:
N t f (x) := E(f (Xt )) = fRd Nt(x, dy)f(y), where Xf is the solution of equation (2.2.2) satisfying Xo = x. We thus have defined a family Nt (x, dy) of Markovian kernels on Rd. See the Appendix, Sec. A.1, for a discussion of this terminology.
Definition 2.2.22. Let $U$ be a function on $\mathbb{R}^d$ such that $e^{-2U}$ is integrable with respect to the Lebesgue measure. The Boltzmann measure associated to $U$ is the probability measure:
$$\mu(dx) := Z^{-1}\exp(-2U(x))\,dx,$$
where the constant $Z$ is chosen so that $\mu(\mathbb{R}^d) = 1$. In order not to restrict the generality when it is not useful, we will also call a Boltzmann measure the $\sigma$-finite measure $\mu(dx) := \exp(-2U(x))\,dx$ when $e^{-2U}$ is not integrable. In this case we set $Z = 1$. This will facilitate the statement of several results.
Lemma 2.2.23. For each $t$, the Boltzmann measure is reversible with respect to the kernel of $N_t$, i.e., $N_t(x, dy)\,\mu(dx) = N_t(y, dx)\,\mu(dy)$ as measures on $\mathbb{R}^d \times \mathbb{R}^d$.
PROOF. We denote by $X_t^x$ the Kolmogorov process associated to $U$ that starts at $x$ and by $(B_t)$ the Brownian motion that starts at 0. Let $f$ and $g$ be arbitrary positive Borel measurable functions and apply the preceding lemma to obtain:
$$Z\int_{\mathbb{R}^{2d}} f(x)g(y)\,N_t(x, dy)\,\mu(dx) = \int_{\mathbb{R}^d} \exp(-2U(x))\,f(x)\,E(g(X_t^x))\,dx$$
$$= \int_{\mathbb{R}^d} E\Big[f(B_0 + x)\,g(B_t + x)\,\exp\Big(-U(B_0 + x) - U(B_t + x) - \frac12\int_0^t \big[|\nabla U|^2 - \Delta U\big](B_s + x)\,ds\Big)\Big]\,dx.$$
Since the integrals are invariant under the mapping $s \mapsto t - s$, we apply Lemma 2.2.10 to obtain:
$$Z\int_{\mathbb{R}^{2d}} f(x)\,N_t(x, dy)\,g(y)\,\mu(dx) = \int_{\mathbb{R}^d} E\Big[f(B_t + y)\,g(B_0 + y)\,\exp\Big(-U(B_t + y) - U(B_0 + y) - \frac12\int_0^t \big[|\nabla U|^2 - \Delta U\big](B_s + y)\,ds\Big)\Big]\,dy$$
$$= Z\int_{\mathbb{R}^{2d}} g(y)f(x)\,N_t(y, dx)\,\mu(dy),$$
which is the result we wanted since $f$ and $g$ are arbitrary. $\square$
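Reversibility implies in particular that the Boltzmann measure is invariant for the process. The following is a minimal numerical sketch of that consequence, not part of the text's argument. It assumes the one-dimensional potential $U(x) = x^2/2$ (so that the Boltzmann measure $Z^{-1}e^{-x^2}dx$ is the centered Gaussian of variance $1/2$) and an Euler-Maruyama discretization of $dX = dB - \nabla U(X)\,dt$; the function names, step size and sample size are illustrative choices.

```python
import math
import random


def simulate_kolmogorov(grad_u, x0, t_final, dt, rng):
    """Euler-Maruyama approximation of dX = dB - grad U(X) dt; returns X at t_final."""
    x = x0
    n_steps = int(round(t_final / dt))
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        # Brownian increment ~ N(0, dt), drift -grad U(x) dt
        x += sqrt_dt * rng.gauss(0.0, 1.0) - grad_u(x) * dt
    return x


def empirical_variance(samples):
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


def ou_endpoint_variance(n_paths=4000, t_final=3.0, dt=0.01, seed=0):
    """Variance of X_t for U(x) = x^2/2 (assumed example), started at 0."""
    rng = random.Random(seed)

    def grad_u(x):
        return x  # gradient of U(x) = x^2 / 2

    samples = [simulate_kolmogorov(grad_u, 0.0, t_final, dt, rng)
               for _ in range(n_paths)]
    return empirical_variance(samples)


if __name__ == "__main__":
    # Should be close to 1/2, the variance of the measure proportional to exp(-x^2).
    print(ou_endpoint_variance())
```

For large $t$ the empirical variance should be close to $1/2$, up to discretization and sampling error.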
Lemma 2.2.24. The operators $N_t$ form a semi-group, i.e., $N_{t+s} = N_t N_s$.

PROOF. Let $t > 0$ be fixed. Since $h \mapsto B_{t+h} - B_t$ is a Brownian motion, we can write $N_h f(y) = E(f(Y_h^y))$ where the process $Y_h^y$ is the unique solution of the stochastic differential equation:
$$Y_h^y = y + B_{t+h} - B_t + \int_t^{t+h} b(Y_{\sigma - t}^y)\,d\sigma.$$
From the definition of $N_t$ and the Markov property of Brownian motion, we see that $N_t N_h f(x) = E f(Y_h^{X_t^x})$. We write the equation satisfied by the process $U_h := Y_h^{X_t^x}$, $h \ge 0$:
$$U_h = X_t^x + B_{t+h} - B_t + \int_t^{t+h} b(U_{\sigma - t})\,d\sigma = x + B_{t+h} + \int_0^t b(X_s^x)\,ds + \int_t^{t+h} b(U_{\sigma - t})\,d\sigma.$$
Making use of the "path by path" uniqueness of solutions of equation (2.2.3), we obtain $U_h = X_{t+h}^x$, which implies $N_t N_h f = N_{t+h} f$. This is what we wanted to prove. $\square$
It is easy to check that the two differential operators:
$$L\varphi = \frac12\Delta\varphi - \nabla U\cdot\nabla\varphi, \qquad L^*\psi = \frac12\Delta\psi + \operatorname{div}(\psi\,\nabla U),$$
are adjoints of each other in the sense that for all $C_c^\infty$ functions $\varphi, \psi$ we have:
$$\int_{\mathbb{R}^d} L\varphi\,\psi\,dx = \int_{\mathbb{R}^d} \varphi\,L^*\psi\,dx.$$

Theorem 2.2.25. The semi-group $N_t$ induces a strongly continuous semi-group of self-adjoint operators on $L^2(\mu)$, where $\mu$ is the Boltzmann measure. Let $A$ be its infinitesimal generator. Then, any $C^2$ function $\varphi$ such that $|\nabla\varphi|$ is bounded and $L\varphi \in L^2(\mu)$ is in $D(A)$ and satisfies $A\varphi = -L\varphi$. Any function $\psi$ that is in $D(A)$ satisfies $A\psi = -L\psi$ in the sense of distributions.

PROOF. Since $\mu$ is reversible with respect to $N_t$, it is also invariant, from which it is easy to see that $f \mapsto N_t f$ defines a self-adjoint semi-group on $L^2(\mu)$ (still denoted by $N_t$). Since
$$\|N_t f\|_{L^2(\mu)} \le \|f\|_{L^2(\mu)},$$
$N_t$ is also a semi-group of contractions. For $f$ bounded and continuous, the almost sure continuity of the paths of $X_t$ and the formula $N_t f(x) = E(f(X_t^x))$ imply the pointwise convergence of $N_{t+h} f$ to $N_t f$ when $h \to 0$. Then, by the dominated convergence theorem, we also have convergence in $L^2(\mu)$. The continuity of the semi-group in $L^2$ follows from the fact that the bounded and continuous functions are dense in $L^2$.
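The proof of the first assertion of the second paragraph is based on Itô's formula:
$$\varphi(X_t^x) - \varphi(x) = \int_0^t \nabla\varphi(X_s^x)\,dB_s - \int_0^t \nabla U(X_s^x)\cdot\nabla\varphi(X_s^x)\,ds + \frac12\int_0^t \Delta\varphi(X_s^x)\,ds.$$
We take the expectation of both sides of this equation and use the fact that the martingale part has zero expectation to obtain:
$$[N_t\varphi - \varphi](x) = E\Big(\int_0^t L\varphi(X_s^x)\,ds\Big) = \int_0^t N_s[L\varphi](x)\,ds.$$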
By dividing by $t$ and passing to the limit, we see that $\varphi \in D(A)$ and $A\varphi = -L\varphi$, which establishes the first assertion. We denote the duality between the space of distributions and test functions by $\langle\cdot,\cdot\rangle$. For $\psi \in D(A)$ and $\varphi \in C_c^\infty$, we have:
$$\langle A\psi, \varphi\rangle = (A\psi, \varphi)_{L^2(dx)} = Z\lim_{h\to 0}\frac1h\big((I - N_h)\psi,\ e^{2U}\varphi\big)_{L^2(\mu)} = Z\Big(\psi,\ \lim_{h\to 0}\frac1h (I - N_h)\,e^{2U}\varphi\Big)_{L^2(\mu)}$$
$$= Z\big(\psi,\ -L[e^{2U}\varphi]\big)_{L^2(\mu)} = Z\big(\psi,\ e^{2U}(-L^*\varphi)\big)_{L^2(\mu)} = \langle\psi, -L^*\varphi\rangle,$$
from which the second assertion follows since $\varphi$ is arbitrary. We finally remark that the above expression is a continuous function of $\varphi$ for the usual test function topology on $C_c^\infty$, which assures that $L\psi$ exists in the sense of distributions. $\square$
Remark 2.2.26. As we have seen, the operators $N_t$ are contractions on $L^2(\mu)$ and consequently $A$ is a positive operator. This can already be seen in the case of test functions because integration by parts immediately gives:
$$(2.2.8)\qquad \forall \varphi \in C_c^\infty,\ \psi \in C^2:\qquad (\varphi, L\psi)_{L^2(\mu)} = -\frac12\int_{\mathbb{R}^d} \nabla\varphi\cdot\nabla\psi\,d\mu.$$
The following theorem shows that the situation is as simple as it could be. To state the theorem we will utilize the Sobolev space $W^{2,2}_{\mathrm{loc}}$ of distributions for which the derivatives up to order 2 are in $L^2_{\mathrm{loc}}(\mathbb{R}^d)$.

Theorem 2.2.27. The infinitesimal generator $A$ is essentially self-adjoint on $C_c^\infty$. Its domain is precisely the set of $\psi \in L^2(\mu)$ that are in $W^{2,2}_{\mathrm{loc}}$ and are such that $L\psi \in L^2(\mu)$ in the sense of distributions.
PROOF. It suffices to show that $-1$ is in the resolvent set of the symmetric operator obtained by restricting the infinitesimal generator $A$ to $C_c^\infty$. Since it is positive, it is sufficient to show that $(A + \mathrm{Id})C_c^\infty$ is dense. We do this by showing that any function $g \in L^2(\mu)$ satisfying $(g, L\varphi - \varphi) = 0$ for all $\varphi \in C_c^\infty$ is the zero function. If the latter is the case, the function $g$ then satisfies the equation:
$$(L^* - 1)(e^{-2U}g) = 0$$
in the weak sense. We will only provide the proof given by Wielens ([Wi85]) in the case where $U \in C^\infty$. In this case, it is well known by hypoellipticity (see [Hor63]) that $g$ itself is a $C^\infty$ function and satisfies the preceding equation in the ordinary sense, which is equivalent to $Lg = g$ in the ordinary sense. Let $\rho$ be a $C_c^\infty$ function that equals 1 on the unit ball centered at 0 and set $\rho_n(x) = \rho(x/n)$. Taking into account that $Lg - g = 0$, we obtain:
$$L(g\rho_n) - g\rho_n = g\,L\rho_n + \nabla\rho_n\cdot\nabla g.$$
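Without loss of generality we can assume $g$ is real and formula (2.2.8) gives:
$$(\rho_n g, (-L + 1)(\rho_n g)) = \frac12\int \nabla[g^2\rho_n]\cdot\nabla\rho_n\,d\mu - \int \nabla\rho_n\cdot\nabla g\ \rho_n g\,d\mu = \frac12\int |\nabla\rho_n|^2 g^2\,d\mu.$$
Since $\rho_n$ is constant on the ball of radius $n$, we have:
$$(\rho_n g, \rho_n g) \le (\rho_n g, (A + \mathrm{Id})(\rho_n g)) \le \frac12\int_{|x| > n} |\nabla\rho_n|^2 g^2\,d\mu \le \frac{1}{2n^2}\,\|\nabla\rho\|_\infty^2 \int_{|x| > n} g^2\,d\mu.$$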
When $n \to \infty$, the left-hand side tends to $\|g\|^2_{L^2(\mu)}$ and the right-hand side tends to 0, and thus $g = 0$. In fact, the same proof is valid when $U \in C^2$ once we have shown that $g$ has first and second derivatives in the sense of distributions that belong to $L^2_{\mathrm{loc}}$; see J. Frehse [Fr77].

We show that $\psi \in D(A)$ if $\psi$ and $L\psi$ are both in $L^2(\mu)$. In this case, for all test functions $\varphi$, we can write:
$$(\psi, A\varphi)_{L^2(\mu)} = (\psi, -L\varphi)_{L^2(\mu)} = Z^{-1}(\psi, -e^{-2U}L\varphi)_{L^2(dx)} = Z^{-1}(\psi, -L^*[e^{-2U}\varphi])_{L^2(dx)} = Z^{-1}(-L\psi, e^{-2U}\varphi)_{L^2(dx)} = (-L\psi, \varphi)_{L^2(\mu)}.$$
Thus $\varphi \mapsto (\psi, A\varphi)_{L^2(\mu)}$ is continuous on $L^2(\mu)$ for all $C_c^\infty$ functions $\varphi$. Since the latter space is a core for the self-adjoint operator $A$, we have $\psi \in D(A)$ by Corollary 1.1.21.

It remains to show that $\psi$ and its first derivatives are locally in $W^{1,2}$, the space of square-integrable functions for which the first derivatives are also square-integrable. Let $\varphi_n$ be a sequence of $C_c^\infty$ functions such that $\varphi_n \to \psi$ and $A\varphi_n \to A\psi$. Then formula (2.2.8) shows that
$$\int_{\mathbb{R}^d} |\nabla\varphi_n|^2 \exp(-2U)\,dx$$
remains bounded. A fortiori, for all bounded open $G$, the restrictions of the sequence $\varphi_n$ to $G$ remain in a closed ball of the space $W^{1,2}(G)$. Since this ball is compact in $L^2(G, dx)$, the limit $\psi$ is also in $W^{1,2}(G)$. We can thus define in the sense of distributions the term $\nabla U\cdot\nabla\psi$ as an element of $L^2_{\mathrm{loc}}$. Since $L\psi \in L^2_{\mathrm{loc}}$, the functions $\psi$, $\Delta\psi$ and the vector field $\nabla\psi$ are in $L^2_{\mathrm{loc}}$ and we know, for example by utilizing the Fourier transform, that this implies that $\psi$ is in $W^{2,2}_{\mathrm{loc}}$. $\square$

The preceding theorem is true under either of the hypotheses (2.2.5) or (2.2.6), which are both satisfied, for example, for $d = 1$ and $U(x) = x^4 - x^2$.
Remark 2.2.28. There is an isomorphism $\mathcal{F}$ of the Hilbert space $L^2(\mu)$ onto $L^2(dx)$ that maps $\psi$ into $Z^{-1/2}\exp(-U)\psi$. When $U$ is infinitely differentiable, $\mathcal{F}$ preserves $C_c^\infty$ and this calculation immediately shows that $A$, restricted to $C_c^\infty$, is transformed into the Schrödinger operator defined on $C_c^\infty$ by:
$$B\varphi = -\frac12\Delta\varphi + V\varphi \quad\text{where}\quad V = \frac12\big(|\nabla U|^2 - \Delta U\big),$$
in other words, $B = \mathcal{F}\circ A\circ\mathcal{F}^{-1}$. Under the hypothesis that $V$ is bounded below, which is a slightly stronger condition than (2.2.5), the operator $B$ is essentially self-adjoint on $C_c^\infty$ by Theorem 1.1.17. Thus the preceding result is re-proved in the case where $U$ is infinitely differentiable.
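The calculation behind this remark can be written out as follows. This is only a sketch, assuming $\psi$ is smooth and writing $\varphi = e^{-U}\psi$ (the constant $Z^{-1/2}$ plays no role in the conjugation); it uses $A\psi = -\tfrac12\Delta\psi + \nabla U\cdot\nabla\psi$.

```latex
\begin{align*}
\nabla\psi &= e^{U}\bigl(\nabla\varphi + \varphi\,\nabla U\bigr),\\
\Delta\psi &= e^{U}\bigl(\Delta\varphi + 2\,\nabla U\cdot\nabla\varphi
              + \varphi\,\Delta U + \varphi\,|\nabla U|^{2}\bigr),\\
A\psi &= -\tfrac12\Delta\psi + \nabla U\cdot\nabla\psi
       = e^{U}\Bigl(-\tfrac12\Delta\varphi
         + \tfrac12\bigl(|\nabla U|^{2} - \Delta U\bigr)\varphi\Bigr).
\end{align*}
```

Multiplying by $e^{-U}$ gives $e^{-U}A(e^{U}\varphi) = -\tfrac12\Delta\varphi + V\varphi$ with $V = \tfrac12(|\nabla U|^2 - \Delta U)$: the first-order terms cancel.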
Fokker-Planck-Kolmogorov Equations. These equations are parabolic partial differential equations that describe in a more concrete fashion the evolution of the probability distribution of $X_t$. We will assume that $X_t$ is a Kolmogorov process that does not explode in finite time almost surely. In addition, to simplify things, we will assume that $U$ is $C^\infty$.

Theorem 2.2.29. There exists an infinitely differentiable function on $]0, +\infty[\,\times\mathbb{R}^{2d}$ denoted by $p(t, x, y)$ such that $N_t(x, dy) = p(t, x, y)\,dy$ and $p(t, x, y)$ satisfies both of the following equations:
$$\frac{\partial}{\partial t}p(t, x, y) = L_x p, \qquad \frac{\partial}{\partial t}p(t, x, y) = L_y^* p.$$
To distinguish them the first is called the "backward F-P-K equation" and the second the "forward F-P-K equation".

PROOF. We begin by constructing, for each fixed $x \in \mathbb{R}^d$, a solution $p_x$ of the forward equation as a distribution on $]0, \infty[\,\times\mathbb{R}^d$. We define $p_x$ by:
$$\langle p_x, v\rangle_{(t,y)} := \int_0^\infty\!\!\int N_t(x, dy)\,v(t, y)\,dt$$
for test functions $v(t, y) \in C_c^\infty(]0, \infty[\,\times\mathbb{R}^d)$. By reintegrating the second member against test functions $\theta(x)$ we obtain a distribution $p$ in three variables defined on $]0, \infty[\,\times\mathbb{R}^{2d}$. We now show that $p_x$ satisfies the forward F-P-K equation. For $T$ sufficiently large, Itô's formula gives us:
$$0 = E_x\big(v(T, X_T) - v(1/T, X_{1/T})\big) = \int_{1/T}^{T} E_x\Big(\Big[\frac{\partial v}{\partial t} + Lv\Big](t, X_t)\Big)\,dt = \int_0^\infty E_x\Big(\Big[\frac{\partial v}{\partial t} + Lv\Big](t, X_t)\Big)\,dt$$
$$= \int_0^\infty N_t\Big(x, \Big[\frac{\partial}{\partial t} + L_y\Big]v(t, y)\Big)\,dt = \Big\langle p_x, \Big[\frac{\partial}{\partial t} + L_y\Big]v\Big\rangle = \Big\langle\Big[-\frac{\partial}{\partial t} + L_y^*\Big]p_x, v\Big\rangle,$$
which shows that $p_x$ is a solution of the forward equation. It is then easy to see that $p$ as a distribution in $(t, x, y)$ is also a solution.
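We will next show that $p$ satisfies the backward F-P-K equation by utilizing the preceding result combined with the reversibility of the measure with density $\exp(-2U)$ with respect to $N_t$ (see Lemma 2.2.23). Thus for test functions of the form $\tau(t, x)\varphi(y)$, we have:
$$\Big\langle\frac{\partial}{\partial t}p,\ \tau(t, x)\varphi(y)\Big\rangle_{(t,x,y)} = \Big\langle p,\ -\frac{\partial}{\partial t}\tau(t, x)\,\varphi(y)\Big\rangle_{(t,x,y)} = \int\!\!\int \tau(t, x)\,\frac{\partial}{\partial t}[N_t\varphi](x)\,dx\,dt$$
$$= \int\!\!\int \tau(t, x)\,L[N_t\varphi](x)\,dx\,dt = Z\int \big(\exp(2U)\tau(t,\cdot),\ L[N_t\varphi]\big)_{L^2(\mu)}\,dt$$
$$= Z\int \big(L[\exp(2U)\tau(t,\cdot)],\ N_t\varphi\big)_{L^2(\mu)}\,dt = Z\int \big(\exp(2U)L^*\tau(t,\cdot),\ N_t\varphi\big)_{L^2(\mu)}\,dt$$
$$= \int \big(L^*\tau(t,\cdot),\ N_t\varphi\big)_{L^2(dx)}\,dt = \big\langle\varphi(y)\,L_x^*\tau(t, x),\ p\big\rangle_{(t,x,y)} = \big\langle\varphi(y)\tau(t, x),\ L_x p\big\rangle_{(t,x,y)}.$$
Since $p$ satisfies both equations we have:
$$2\frac{\partial}{\partial t}p = L_x p + L_y^* p.$$
Because $2\partial_t - L_x - L_y^*$ is hypoelliptic, $p$ is infinitely differentiable. $\square$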
Remark 2.2.30. For vector fields $b$ that are not gradients, it is possible to establish the Fokker-Planck equations in the form:
$$\frac{\partial}{\partial t}p(t, x, y) = L_y^* p, \qquad \frac{\partial}{\partial t}p(t, x, y) = L_x p,$$
where
$$L^*\varphi = \frac12\Delta\varphi - \operatorname{div}(\varphi\,b), \qquad L\varphi = \frac12\Delta\varphi + b\cdot\nabla\varphi.$$
The proof of the "backward equation" is more difficult in this more general setting. The interested reader can consult [M-K69].
CHAPTER 3
Logarithmic Sobolev Inequalities

In this chapter we mainly present logarithmic Sobolev inequalities for Kolmogorov semi-groups. We will also take some side excursions into the area of finite state spaces that will allow us to present similar ideas without the technical difficulties met in the case of Kolmogorov semi-groups.
3.1. The Poincaré and Gross inequalities

We consider a positive self-adjoint operator $A$ defined on $L^2(\mu)$, where $\mu$ is a probability on a measurable space $E$, such that $1 \in D(A)$ and $A1 = 0$. Since 1 is an eigenvector for $A$ with eigenvalue 0, the lower bound of the spectrum $\sigma(A)$ is zero. We denote by $\mathcal{E}$ the quadratic form associated with $A$. This is defined by:
$$\mathcal{E}(f, f) = (A^{1/2}f, A^{1/2}f).$$
Since $\sqrt{x} \le 1 + x/2$, the spectral theorem shows that $D(A^{1/2}) \supset D(A)$, which allows us to define $\mathcal{E}$ on $D(A)$ by $\mathcal{E}(f, f) = (f, Af)$. We set:
$$(3.1.1)\qquad D(\mathcal{E}) = D(A^{1/2}), \qquad \|f\|_{\mathcal{E}}^2 = \|f\|_2^2 + \mathcal{E}(f, f).$$
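Lemma 3.1.1. The domain $D(A)$ is dense in $D(\mathcal{E})$ with respect to the norm $\|\cdot\|_{\mathcal{E}}$.

PROOF. By the spectral theorem it suffices to consider the case of multiplication by a positive function $X$ in the Hilbert space $L^2(\Omega, \nu)$ where $\nu$ is $\sigma$-finite. This implies that $\Omega$ is the union of an increasing sequence $\Omega_k$ of sets of finite measure. If $f \in D(\mathcal{E})$ and if we set $f_k := f\,1_{|X| \le k}$, then $f_k \in D(A)$ and $f_k$ converges to $f$ in the norm $\|\cdot\|_{\mathcal{E}}$ by dominated convergence, since $\int (1 + X)|f|^2\,d\nu$ is finite. $\square$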
Definition 3.1.2. We say that $A$ satisfies a Poincaré inequality if there exists a constant $c$ such that:
$$f \in D(A) \text{ and } f \perp 1 \implies \|f\|_2^2 \le c\,\mathcal{E}(f, f).$$
Since $D(A)$ is dense in $D(\mathcal{E})$ this inequality obviously extends to $D(\mathcal{E})$. A square integrable function $f$ on a probability space $(E, \mu)$ can be considered as a random variable for which the second moment exists. The corresponding centered random variable $f - \int f\,d\mu$ is orthogonal to 1, thus the Poincaré inequality is written as
$$\forall f \in D(\mathcal{E}),\qquad \operatorname{var}_\mu(f) \le c\,\mathcal{E}(f, f).$$
Proposition 3.1.3. In order for $A$ to satisfy the Poincaré inequality, it is necessary and sufficient that the kernel of $A$ be of dimension 1 and that $A$ have a spectral gap:
$$g = \inf\{\lambda \mid \lambda > 0 \text{ and } \lambda \in \sigma(A)\} > 0.$$
The optimal constant in the Poincaré inequality is the inverse $g^{-1}$ of the spectral gap.

PROOF. Let $C$ be the vector subspace of constant functions in $L^2$. It is sufficient to note that we have the decomposition $L^2(\mu) = C \oplus C^\perp$ into orthogonal subspaces that are stable under $A$. The Poincaré inequality says that the lower bound of $A|_{C^\perp}$ is strictly positive. $\square$
Exercise 3.1.4. We consider the domain $D$ consisting of the class of $C^2$ functions on $[0, 2\pi]$ such that $f(0) = f(2\pi)$ and $f'(0) = f'(2\pi)$. We can extend these functions into periodic $C^1$ functions on $\mathbb{R}$. By utilizing Fourier series prove that for all $f \in D$ such that $\int_0^{2\pi} f(t)\,dt = 0$, one obtains:
$$\int_0^{2\pi} f'^2(t)\,dt \ge \int_0^{2\pi} f^2(t)\,dt.$$
Deduce the Poincaré inequality for the closure of the operator $f \mapsto -f''$ defined on $D$ into $L^2(\mu)$, where $\mu$ is the uniform distribution on $[0, 2\pi]$.
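As a quick numerical sanity check of this exercise (not a proof), the sketch below evaluates both integrals for a zero-mean trigonometric polynomial $f(t) = \sum_{k\ge1} a_k\cos(kt) + b_k\sin(kt)$. By Parseval's identity one expects $\int f^2 = \pi\sum(a_k^2 + b_k^2)$ and $\int f'^2 = \pi\sum k^2(a_k^2 + b_k^2)$, which makes the inequality visible term by term. The function name and the sample coefficients are illustrative assumptions.

```python
import math


def integrals_on_circle(a, b, n_grid=1024):
    """Return (int f^2, int f'^2) over [0, 2*pi] by a uniform Riemann sum.

    a[k-1] and b[k-1] are the coefficients of cos(k t) and sin(k t), k >= 1,
    so f has zero mean by construction.
    """
    h = 2.0 * math.pi / n_grid
    int_f2 = 0.0
    int_df2 = 0.0
    for j in range(n_grid):
        t = j * h
        f = 0.0
        df = 0.0
        for k in range(1, len(a) + 1):
            f += a[k - 1] * math.cos(k * t) + b[k - 1] * math.sin(k * t)
            df += k * (-a[k - 1] * math.sin(k * t) + b[k - 1] * math.cos(k * t))
        int_f2 += f * f * h
        int_df2 += df * df * h
    return int_f2, int_df2


if __name__ == "__main__":
    a = [1.0, 0.5, 0.0]
    b = [0.0, -2.0, 0.3]
    int_f2, int_df2 = integrals_on_circle(a, b)
    print(int_f2, int_df2, int_df2 >= int_f2)
```

Equality holds only when all coefficients with $k \ge 2$ vanish, which is consistent with 1 being the smallest non-zero eigenvalue of $f \mapsto -f''$.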
Exercise 3.1.5. Let $\mu$ be a probability measure on a finite set $E$ for which $\mu\{x\} > 0$ for all $x \in E$ and that is reversible for a transition kernel (matrix) $K$, i.e., $\mu\otimes K$ is symmetric. Define a positive self-adjoint operator on $L^2(\mu)$ by setting $A := I - K$. Show that $A$ satisfies a Poincaré inequality if and only if $K$ is irreducible, i.e., for some $n$, all of the elements of the matrix $\sum_{k=1}^n K^k$ are strictly positive. Let $a$ be the symmetric matrix $\mu\otimes K$ defined by $a(x, y) = \mu(x)K(x, y)$. Verify that:
$$\mathcal{E}(f, f) = \frac12\sum_{x,y\in E} (f(x) - f(y))^2\,a(x, y).$$
Let $\gamma$ be any path connecting $x$ and $y$: $\gamma_0 = x, \gamma_1, \ldots, \gamma_{p-1}, \gamma_p = y$, and define $e(\gamma)$ by $e(\gamma) := \mu(x)\mu(y)\sum_{k=1}^p a^{-1}(\gamma_{k-1}, \gamma_k)$. Then let $\gamma(x, y)$ be any path connecting $x$ and $y$ that minimizes $e$ among paths connecting $x$ and $y$. Prove that the Poincaré constant is bounded above by:
$$\max_{(u,v)}\sum_{x,y\in E} \chi(x, y, u, v)\,e(\gamma(x, y)),$$
where $\chi(x, y, u, v)$ equals 1 if $(u, v)$ is an edge of the path $\gamma(x, y)$ and 0 if not.

From now on $f$ and $g$ will denote real-valued functions.
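The identity for $\mathcal{E}(f, f)$ in Exercise 3.1.5 can be checked numerically on a small example. The sketch below is an illustration only: the three-state chain is a hypothetical example built from a symmetric matrix $a$ whose row sums define $\mu$, and the function names are not from the text.

```python
def dirichlet_form_operator(mu, K, f):
    """Compute (f, (I - K) f) in L2(mu)."""
    n = len(mu)
    total = 0.0
    for x in range(n):
        kf_x = sum(K[x][y] * f[y] for y in range(n))
        total += mu[x] * f[x] * (f[x] - kf_x)
    return total


def dirichlet_form_edges(mu, K, f):
    """Compute (1/2) * sum over x, y of (f(x) - f(y))^2 * mu(x) * K(x, y)."""
    n = len(mu)
    return 0.5 * sum((f[x] - f[y]) ** 2 * mu[x] * K[x][y]
                     for x in range(n) for y in range(n))


def example_chain():
    """Hypothetical reversible chain: a = mu (x) K is symmetric by construction."""
    a = [[0.1, 0.1, 0.0],
         [0.1, 0.2, 0.2],
         [0.0, 0.2, 0.1]]
    mu = [sum(row) for row in a]  # row sums give the reversible measure
    K = [[a[x][y] / mu[x] for y in range(3)] for x in range(3)]
    return mu, K


if __name__ == "__main__":
    mu, K = example_chain()
    f = [1.0, -2.0, 0.5]
    print(dirichlet_form_operator(mu, K, f), dirichlet_form_edges(mu, K, f))
```

Both expressions agree up to rounding, and both vanish on constant functions, in line with $A1 = 0$.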
Definition 3.1.6. The operator $A$ satisfies a logarithmic Sobolev inequality if there exists a constant $c$ such that, for all $f \ne 0$ in $L^2(\mu)$:
$$\text{(LS)}\qquad \int f^2\log\Big(\frac{|f|}{\|f\|_2}\Big)\,d\mu \le c\,\mathcal{E}(f, f),$$
with the following conventions: $\|\cdot\|_2$ denotes the norm of $L^2(\mu)$; we agree that the right-hand member equals $+\infty$ if $f \notin D(\mathcal{E})$ and that the left-hand member equals $+\infty$ if $f^2\log(|f|)$ is not integrable.

This inequality was developed by L. Gross in [Gr75]. In addition, there are more general logarithmic Sobolev inequalities that add a supplementary term $\varepsilon\|f\|_2^2$, with $\varepsilon > 0$, to the right-hand side. See, for example, the course of D. Bakry [Bk93]. We refer to the inequality (LS) as the strict logarithmic Sobolev inequality. In [Gr92], L. Gross gives a panoramic view of diverse applications of these inequalities. In this book, unless we state otherwise, a logarithmic Sobolev inequality, or simply Gross inequality, will always refer to an inequality of the form (LS).
Remark 3.1.7. Since the inequality (LS) is stable when $f$ is multiplied by $\lambda \ne 0$, we can restrict ourselves to the case when $\|f\|_2 = 1$. In this case twice the left-hand side of the (LS) inequality can be written as $\int_E \gamma(f^2)\,d\mu$ with $\gamma(x) = x\log(x)$. Since the function $\gamma$ is convex, Jensen's inequality implies the positivity of the first member of the Gross inequality:
$$\int_E \gamma(f^2)\,d\mu \ge \gamma\Big(\int_E f^2\,d\mu\Big) = 0,$$
where the inequality is strict except when the function $f^2$ is constant.
Proposition 3.1.8. If $L^\infty(\mu)\cap D(\mathcal{E})$ is dense in $D(\mathcal{E})$ for the norm $\|\cdot\|_{\mathcal{E}}$, then the Gross inequality implies the Poincaré inequality with the same constant.

PROOF. It suffices to prove the Poincaré inequality for bounded $g$ for which the integral is zero. We set $f := 1 + \varepsilon g$ and write the Gross inequality in the form:
$$\int (1 + \varepsilon g)^2\log(1 + \varepsilon g)\,d\mu = \frac32\varepsilon^2\int g^2\,d\mu + o(\varepsilon^2) \le c\,\varepsilon^2\,\mathcal{E}(g, g) + \frac12\varepsilon^2\int g^2\,d\mu + o(\varepsilon^2).$$
Then dividing both sides of this inequality by $\varepsilon^2$ and taking the limit $\varepsilon \to 0$ gives us the inequality asserted in the proposition. $\square$
Remark 3.1.9. This result does not say that the best Poincare constant is equal to the best Gross constant. In fact there are examples where these two constants are different. The simplest example is constructed on a space of two points; see [D-S96].
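To make the two-point example concrete, here is a small numerical sketch. It is an illustration under stated assumptions, not the computation of [D-S96]: it assumes the two-point space with $\mu = (p, 1-p)$ and the kernel $K(x, y) = \mu(y)$, so that $\mathcal{E}(f, f) = \operatorname{var}_\mu(f)$ and the best Poincaré constant is 1, and it estimates the best Gross constant by maximizing the ratio of the two sides of (LS) over a grid of functions $f = (a, 1)$. The function names and the grid are illustrative choices.

```python
import math


def gross_ratio(p, a, b=1.0):
    """Ratio [int f^2 log(|f| / ||f||_2) dmu] / E(f, f) for f = (a, b).

    Assumes the two-point space with mu = (p, 1 - p) and K(x, y) = mu(y),
    for which E(f, f) = var_mu(f) = p (1 - p) (a - b)^2.
    """
    q = 1.0 - p
    norm2 = p * a * a + q * b * b

    def v2_log_v(v):
        # v^2 log|v|, extended by 0 at v = 0
        return 0.0 if v == 0 else v * v * math.log(abs(v))

    lhs = p * v2_log_v(a) + q * v2_log_v(b) - 0.5 * norm2 * math.log(norm2)
    energy = p * q * (a - b) ** 2
    return lhs / energy


def best_gross_constant_on_grid(p, a_values):
    """Largest ratio over the grid; a lower estimate of the best Gross constant."""
    return max(gross_ratio(p, a) for a in a_values if a != 1.0)


if __name__ == "__main__":
    grid = [k / 100.0 for k in range(-2000, 2001)]
    print(best_gross_constant_on_grid(0.1, grid))  # noticeably larger than 1
    print(best_gross_constant_on_grid(0.5, grid))  # does not exceed 1 on this grid
```

For $p = 0.1$ the grid maximum is roughly 1.37, strictly larger than the Poincaré constant 1, so on this example the best Gross constant exceeds the best Poincaré constant.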
Remark 3.1.10. Assuming that a Poincaré inequality holds, it is possible to pass from a general Gross inequality:
$$\text{(LSG)}\qquad \int g^2\log\Big(\frac{|g|}{\|g\|_{L^2(\mu)}}\Big)\,d\mu \le c\,\mathcal{E}(g, g) + \varepsilon\,\|g\|^2_{L^2(\mu)}$$
to a strict inequality, i.e., such that $\varepsilon = 0$. For example this can be done by utilizing Deuschel's inequality, which is proved in [H-S88]. This inequality says that for any probability measure $\mu$, $f \in L^2(\mu)$ and with $\tilde f := f - \int f\,d\mu$, we have:
$$\int f^2\log\Big(\frac{|f|}{\|f\|_{L^2(\mu)}}\Big)\,d\mu \le \int \tilde f^2\log\Big(\frac{|\tilde f|}{\|\tilde f\|_{L^2(\mu)}}\Big)\,d\mu + \|\tilde f\|^2_{L^2(\mu)}.$$
To go in the other direction see Exercise 3.1.16, which will also show that the general Gross inequality can be obtained in the case of Kolmogorov semi-groups with the aid of the Sobolev inequality.
From now on we will essentially restrict ourselves to the case of Kolmogorov semi-groups. For the remainder of this chapter $U$ will be a $C^2$ function such that the Kolmogorov process does not explode in finite time almost surely and such that $\exp(-2U)$ is integrable. $A$ will denote the infinitesimal generator of the transition semi-group $N_t$ acting on $L^2(\mu)$, where $\mu$ is the Boltzmann measure. In this context, the quadratic form $\mathcal{E}$ is called the Dirichlet form for the transition semi-group of the Kolmogorov process.
Definition 3.1.11. Let $\nu$ be a measure on $\mathbb{R}^d$. We define $H^1(\nu)$ as the space of functions $f$ on $\mathbb{R}^d$ that are in $L^2(\nu)$ and such that $\nabla f \in L^2(\nu)$ in the sense of distributions. We define the norm on $H^1(\nu)$ by:
$$\Big(\int_{\mathbb{R}^d} \Big[\frac12|\nabla f|^2 + f^2\Big]\,d\nu\Big)^{1/2}.$$
The measure $\nu$ will be said to be regular if it has a locally bounded density with respect to Lebesgue measure.
Lemma 3.1.12. For any regular probability measure $\nu$ on $\mathbb{R}^d$, the space $C_c^\infty$ is dense in $H^1(\nu)$.

PROOF. Let $f \in H^1(\nu)$. Let $\rho$ and $\chi$ be two positive $C_c^\infty$ functions such that $\rho$ equals 1 on the unit ball and the integral of $\chi$ is 1. We set:
$$\rho_n(x) = \rho(x/n), \qquad \chi_p(x) = p^d\chi(px).$$
We truncate and regularize by setting:
$$f_n = \rho_n f, \qquad f_{n,p} = \chi_p * f_n.$$
It is easy to check that $f_{n,p} \to f_n$ as $p \to \infty$ and $f_n \to f$ as $n \to \infty$ in the space $H^1(\nu)$. In fact, the first limit is deduced from the regularization by ordinary convolution in $H^1(\mathbb{R}^d)$ because the topology of $H^1(\mathbb{R}^d)$ is finer than that of $H^1(\nu)$ on any set of distributions with support in a fixed bounded set. The second limit follows from the fact that $f$ and $\nabla f$ are in $L^2(\nu)$. We thus have found a sequence $\varphi_n = f_{n,p_n}$ of $C_c^\infty$ functions that converge to $f$ in $H^1(\nu)$. $\square$
Theorem 3.1.13. The domain of the Dirichlet form $\mathcal{E}$ associated to a Kolmogorov process is the space $H^1(\mu)$ and for $f \in H^1(\mu)$, we have:
$$\mathcal{E}(f, f) = \frac12\int_{\mathbb{R}^d} |\nabla f|^2\,d\mu.$$
The Gross inequality (LS) holds if it is established for any test function $f \in C_c^\infty$.
PROOF. The preceding lemma says that any function $f \in H^1(\mu)$ is the limit of a sequence $(f_n)$ of test functions. On $C_c^\infty$ the norm of $H^1(\mu)$ coincides with the norm $\|\cdot\|_{\mathcal{E}}$; thus the Cauchy sequence $(f_n)$ in $H^1(\mu)$ is also Cauchy in $(D(\mathcal{E}), \|\cdot\|_{\mathcal{E}})$ and thus converges to an element of $D(\mathcal{E})$ that can only be $f$. This proves that $H^1(\mu) \subset D(\mathcal{E})$. We now show the inverse inclusion. Since $A$ is essentially self-adjoint on $C_c^\infty$, this space is dense in $D(A)$ for the norm $\|f\|_2 + \|Af\|_2$ and thus for $\|f\|_{\mathcal{E}}$, which is the norm for $D(A^{1/2})$. The result then follows from Lemma 3.1.1.

To prove the Gross inequality for $f \in H^1(\mu)$ we argue as we did above by considering a sequence of test functions $(f_n)$ converging to $f$. Since $f_n$ converges in the $L^2(\mu)$ sense, Fatou's Lemma implies:
$$\int f^2\log(|f|)\,d\mu \le \liminf_n \int f_n^2\log(|f_n|)\,d\mu \le \lim_n\big[c\,\mathcal{E}(f_n, f_n) + \|f_n\|_2^2\log(\|f_n\|_2)\big] = c\,\mathcal{E}(f, f) + \|f\|_2^2\log(\|f\|_2). \qquad\square$$
Lemma 3.1.14. Let $u$ be a $C^2$ function on $[a, b]$ and $g$ an element of $L^2(\mu)$ such that $a \le g \le b$ almost surely. Let $g_t := N_t g$. Then $t \mapsto u(g_t) \in L^2$ is differentiable for $t > 0$ and continuous at 0, with derivative $-u'(g_t)Ag_t$, and
$$\frac{d}{dt}\int u(g_t)\,d\mu = -\frac12\int u''(g_t)\,|\nabla g_t|^2\,d\mu.$$

PROOF. It is clear that $a \le g_t \le b$ and the spectral theorem implies that $g_t$ is in the domain of $A$ for $t > 0$ with $\frac{d}{dt}g_t = -Ag_t$. There exists a constant $M$ such that for $\mu$-almost every $x$:
$$\big|u(g_{t+h})(x) - u(g_t)(x) - u'(g_t)(x)\,(g_{t+h}(x) - g_t(x))\big| \le M\,(g_{t+h}(x) - g_t(x))^2.$$
By passing to $L^2$ norms, we see that $u(g_t)$ is $L^2$ differentiable with derivative equal to $-u'(g_t)Ag_t$. Since $g_t$ is in the domain of $A$ it is in the domain of $A^{1/2}$, which coincides with $H^1(\mu)$. A direct calculation on the distributions shows that $u'(g_t)$ also belongs to $H^1(\mu)$ with $\nabla u'(g_t) = u''(g_t)\nabla g_t$. Thus, we have:
$$\frac{d}{dt}\int u(g_t)\,d\mu = -(u'(g_t), Ag_t) = -(A^{1/2}u'(g_t), A^{1/2}g_t) = -\mathcal{E}(u'(g_t), g_t) = -\frac12\int u''(g_t)\,|\nabla g_t|^2\,d\mu. \qquad\square$$
One of the most important consequences of the Gross inequality is Nelson's hypercontractivity property:

Theorem 3.1.15. Let $N_t$ be a Kolmogorov semi-group satisfying the Gross inequality with constant $c$. Then for all $p, q > 1$ and $t > 0$ such that $\frac{q - 1}{p - 1} = e^{2t/c}$, we have:
$$(3.1.2)\qquad \|N_t g\|_q \le \|g\|_p, \qquad g \in L^p(\mu).$$
Conversely the inequality (3.1.2) implies the Gross inequality.
PROOF. By a simple density argument it suffices to consider functions $g$ such that $\operatorname{Im}(g) \subset [a, b]$ with $a > 0$. In this case, $a \le g_t \le b$ for $g_t = N_t g$. From $g_t \in D(A) \subset H^1(\mu)$, we conclude that $g_t^{q/2} \in H^1(\mu)$. By applying the Gross inequality to this function, we obtain:
$$(3.1.3)\qquad \int g_t^q\log(g_t)\,d\mu - \frac1q\log\Big(\int g_t^q\,d\mu\Big)\int g_t^q\,d\mu \le \frac{cq}{4}\int g_t^{q-2}\,|\nabla g_t|^2\,d\mu.$$
By applying the previous lemma for $q > 0$, we see that the function
$$\Phi(q, t) = \int g_t^q\,d\mu$$
is differentiable with respect to the second variable $t$ on $\mathbb{R}_+$, and for $q > 1$ the following inequality holds:
$$\partial_2\Phi(q, t) = -q\int g_t^{q-1}Ag_t\,d\mu = -\frac{q(q - 1)}{2}\int g_t^{q-2}\,|\nabla g_t|^2\,d\mu \le \frac{2(q - 1)}{c}\Big[-\int g_t^q\log(g_t)\,d\mu + \frac1q\log\Big(\int g_t^q\,d\mu\Big)\int g_t^q\,d\mu\Big].$$
We then apply this to the case $q := q(t) = 1 + (p - 1)e^{2t/c}$ and, taking into account $q'(t) = \frac{2(q(t) - 1)}{c}$, we obtain the inequality:
$$\partial_2\Phi(q(t), t) + q'(t)\int g_t^{q(t)}\log(g_t)\,d\mu - \frac{q'(t)}{q(t)}\,\Phi(q(t), t)\log\Phi(q(t), t) \le 0.$$
If we set:
$$\Psi(t) = \Phi(q(t), t)^{1/q(t)} = \|g_t\|_{q(t)},$$
the preceding inequality is equivalent to $\Psi'(t) \le 0$. Since $\Psi(0) = \|g\|_p$ and $\Psi'(t) \le 0$, hypercontractivity follows.

Conversely, hypercontractivity implies the decay of the function $\Psi(t)$, as has been pointed out in [GZ98]. Indeed, the relation
$$(q(t + s) - 1) = e^{2s/c}(q(t) - 1)$$
implies that the operator $N_s$ is a contraction from $L^{q(t)}$ to $L^{q(s+t)}$ and we have:
$$\Psi(t + s) = \|N_s g_t\|_{q(t+s)} \le \Psi(t).$$
We choose $g$ of the form $g_a := (g^2 + a)^{1/2}$ where $g$ is a $C_c^\infty$ function and $a > 0$. Since Theorem 2.2.27 implies that $g_a$ is in $D(A)$, the above calculations of the derivatives are valid even for $t = 0$. The inequality $\Psi'(0) \le 0$ thus implies the inequality (3.1.3) for $t = 0$. Moreover by choosing $p = 2$, we obtain the logarithmic Sobolev inequality:
$$\int g_a^2\log(|g_a|)\,d\mu \le c\,\mathcal{E}(g_a, g_a) + \|g_a\|_2^2\log(\|g_a\|_2).$$
When $a \to 0$, one sees by dominated convergence that $\mathcal{E}(g_a, g_a)$ converges to $\mathcal{E}(g, g)$. By controlling the left-hand integral from below by Fatou's Lemma, we have verified the inequality we were looking for in the case where $g \in C_c^\infty$. This suffices for general $g$ by Theorem 3.1.13. $\square$
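The exponent $q(t) = 1 + (p - 1)e^{2t/c}$ that drives the proof is easy to tabulate. The following sketch is purely illustrative and the helper names are hypothetical; it evaluates $q(t)$, checks the relation $q(t + s) - 1 = e^{2s/c}(q(t) - 1)$ used in the converse part, and computes the time needed for $N_t$ to map $L^p$ into a given $L^q$.

```python
import math


def q_of_t(p, c, t):
    """Exponent q(t) = 1 + (p - 1) * exp(2 t / c) from the proof of Theorem 3.1.15."""
    return 1.0 + (p - 1.0) * math.exp(2.0 * t / c)


def time_to_reach(p, q, c):
    """Time t solving (q - 1) / (p - 1) = exp(2 t / c), for 1 < p <= q."""
    return 0.5 * c * math.log((q - 1.0) / (p - 1.0))


if __name__ == "__main__":
    p, c = 2.0, 1.5
    t, s = 0.3, 0.7
    lhs = q_of_t(p, c, t + s) - 1.0
    rhs = math.exp(2.0 * s / c) * (q_of_t(p, c, t) - 1.0)
    print(lhs, rhs)                 # the two values agree
    print(time_to_reach(2.0, 4.0, c))  # time after which N_t maps L^2 into L^4
```

A smaller Gross constant $c$ means that a given exponent $q$ is reached in a shorter time.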
Exercise 3.1.16 (the Sobolev inequality and Gross inequalities). The goal of this exercise is to show that the general logarithmic Sobolev inequality (Gross inequality) (LSG) for a Kolmogorov semi-group is a consequence of the usual Sobolev inequality ([Ros76], [Car79]). One form of the latter, valid for $d \ge 3$, is as follows: if $q$ is the exponent such that $\frac1q = \frac12 - \frac1d$, there exists a constant $k_d$ such that for any $\varphi \in C_c^\infty(\mathbb{R}^d)$:
$$\|\varphi\|_q \le (k_d/2)^{1/2}\Big(\int_{\mathbb{R}^d} |\nabla\varphi|^2\,dx\Big)^{1/2},$$
where $\|\cdot\|_q$ denotes the norm in $L^q(dx)$. The inequality $A \le B$ between two self-adjoint operators on $L^2(\nu)$ for which $C_c^\infty(\mathbb{R}^d)$ is a core means $(A\varphi, \varphi)_\nu \le (B\varphi, \varphi)_\nu$ for all $\varphi \in C_c^\infty$. For example $A \ge -2$ indicates that the operator $-2\,\mathrm{Id}$ is a lower bound for $A$. We will identify functions $F$ on $\mathbb{R}^d$ with the corresponding multiplication operators $M_F$ by $F$.

Let $d \ge 3$, and let $F$ be a real locally bounded function in $L^{d/2}(dx)$ acting by multiplication on $L^2(dx)$. Prove that
$$F \le -\frac12 k_d\,\|F\|_{d/2}\,\Delta.$$
Let $U$ be a $C^\infty$ function on $\mathbb{R}^d$ such that $\exp(-2U)$ is Lebesgue-integrable and denote the associated Boltzmann probability measure $\frac1Z\exp(-2U(x))\,dx$ by $\mu$. The operator defined by $A\varphi = -\frac12\Delta\varphi + \nabla U\cdot\nabla\varphi$, for $\varphi \in C_c^\infty$, extends uniquely into a self-adjoint operator on $L^2(\mu)$. We set $2V := |\nabla U|^2 - \Delta U$. Verify that the formula $\mathcal{F}(\psi) := Z^{-1/2}\exp(-U)\psi$ defines an isometry $\mathcal{F}$ of $L^2(\mu)$ onto $L^2(dx)$ that transforms $A$ into the Schrödinger operator $B$ defined on $C_c^\infty$ by
$$B\varphi = -\frac12\Delta\varphi + V\varphi.$$
We suppose that there exist positive constants $\delta, b, m$ such that
$$U \le \delta V + b, \qquad V \ge -m.$$
Let $f$ be a function of norm 1 in $L^2(\mu)$ such that $\operatorname{Im}(f) \subset [\alpha, \beta]$ with $\alpha > 0$. We set:
$$F = \Big(\log(f) - U - \frac12\log(Z)\Big)_+, \qquad l = \sup_t\ t^{-2}\log^{d/2}(t).$$
Prove that $\int F^{d/2}(x)\,dx \le l$ and from this deduce:
$$\log(f) \le \delta V + b + \frac12\log(Z) - \frac12 l^{2/d}k_d\,\Delta,$$
$$\log(f) \le cB + e \quad\text{in } L^2(dx),$$
with $c = k_d\,l^{2/d} + \delta$ and $e = \frac12\log(Z) + b + k_d\,l^{2/d}m$. Deduce from this that if, in addition, $f$ is in $D(\mathcal{E})$, then:
$$\log(f) \le cA + e \quad\text{in } L^2(\mu), \qquad \int f^2\log(f)\,d\mu \le c\,\mathcal{E}(f, f) + e.$$
Prove the general logarithmic Sobolev inequality (LSG) in the following 3 steps: first for $g = (\varepsilon^2 + \psi^2)^{1/2}$, where $\psi \in C_c^\infty$, then for $g = \psi$, and finally for $g \in H^1(\mu)$.

Show that the inequality of type (LSG) is still valid for $d = 1, 2$. One should start with the inequality for $d = 3$ and then consider $\mu\otimes\gamma$ where $\gamma$ is the canonical Gaussian measure on $\mathbb{R}^2$ or $\mathbb{R}$.

The following definition will be useful in view of Theorem 3.1.13:
Definition 3.1.17. We say that a probability measure $\mu$ on $\mathbb{R}^d$ satisfies the Gross inequality with constant $c$ if for any real-valued $f \in C_c^\infty(\mathbb{R}^d)$:
$$\text{(ls)}\qquad \int_{\mathbb{R}^d} f^2\log\Big(\frac{|f|}{\|f\|_{L^2(\mu)}}\Big)\,d\mu \le \frac{c}{2}\int_{\mathbb{R}^d} |\nabla f|^2\,d\mu.$$
If $\mu$ is a regular probability measure, Theorem 3.1.13 implies that the preceding inequality extends to $H^1(\mu)$. When $\mu$ is the Boltzmann probability measure on $\mathbb{R}^d$ associated with a potential that has good growth properties at infinity, this inequality is the logarithmic Sobolev inequality for the infinitesimal generator of the Kolmogorov semi-group. The more delicate question is whether the inequality (ls) has an analogous interpretation for less regular measures. This is studied in the paper of Röckner and Zhang [RZ94].

If we perturb the measure by multiplying it by a bounded function, the logarithmic inequality still holds if we are willing to accept a possibly much larger constant. In fact, let $V$ be a bounded function on $\mathbb{R}^d$ and $\operatorname{osc}(V) := \sup(V) - \inf(V)$. Then set:
$$\nu(dx) := Z^{-1}\exp(-V(x))\,\mu(dx) \quad\text{with}\quad Z := \int \exp(-V(x))\,\mu(dx).$$
We then have the following:

Proposition 3.1.18. If a probability measure $\mu$ on $\mathbb{R}^d$ satisfies the inequality (ls) with constant $c$, the modified probability measure $\nu$ satisfies (ls) with constant $c\exp(\operatorname{osc}(V))$.
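PROOF. Let $\rho$ be any probability measure on $\mathbb{R}^d$. It is easy to see that the function
$$t \mapsto \int \big(f^2\log(f^2) - f^2\log(t^2) - f^2 + t^2\big)\,d\rho$$
attains its minimum on $\mathbb{R}_+$ for $t = \|f\|_{L^2(\rho)}$, which implies for all $t > 0$:
$$(3.1.4)\qquad 2\int f^2\log\Big(\frac{|f|}{\|f\|_{L^2(\rho)}}\Big)\,d\rho \le \int \big(f^2\log(f^2) - f^2\log(t^2) - f^2 + t^2\big)\,d\rho.$$
In particular, for $\rho = \delta_x$, we obtain:
$$(3.1.5)\qquad f^2(x)\log(f^2(x)) - f^2(x)\log(t^2) - f^2(x) + t^2 \ge 0,$$
for all $t$ and $x$. We apply the inequality (3.1.4) for $\rho = \nu$ and $t = \|f\|_{L^2(\mu)}$ to obtain:
$$2\int f^2\log\Big(\frac{|f|}{\|f\|_{L^2(\nu)}}\Big)\,d\nu \le \int \big(f^2\log(f^2) - f^2\log\|f\|^2_{L^2(\mu)} - f^2 + \|f\|^2_{L^2(\mu)}\big)\,d\nu.$$
The inequality (3.1.5) implies that the integrand is positive and we are able to write:
$$2\int f^2\log\Big(\frac{|f|}{\|f\|_{L^2(\nu)}}\Big)\,d\nu \le \frac{e^{-\inf V}}{Z}\int \big(f^2\log(f^2) - f^2\log\|f\|^2_{L^2(\mu)} - f^2 + \|f\|^2_{L^2(\mu)}\big)\,d\mu$$
$$= 2\,\frac{e^{-\inf V}}{Z}\int f^2\log\Big(\frac{|f|}{\|f\|_{L^2(\mu)}}\Big)\,d\mu \le c\,\frac{e^{-\inf V}}{Z}\int |\nabla f|^2\,d\mu \le c\,e^{\operatorname{osc}(V)}\int |\nabla f|^2\,d\nu. \qquad\square$$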
It is convenient to give here a useful preliminary to the study of Gibbs measures. Let $\pi$ be a kernel from $\mathbb{R}^d$ to $\mathbb{R}^l$, which is given by a family $\pi(x, dy)$, $x \in \mathbb{R}^d$, of probability measures on $\mathbb{R}^l$, and $\nu$ a probability measure on $\mathbb{R}^d$. We denote by $\partial_1$ and $\partial_2$ the partial gradients on the product $\mathbb{R}^d\times\mathbb{R}^l$ and by $\operatorname{var}_\pi(g)$ the function on $\mathbb{R}^d$ for which the value at $x$ is the variance of $g_x(y) := g(x, y)$ with respect to $\pi_x(dy) := \pi(x, dy)$.

Proposition 3.1.19. Suppose that the following hypotheses are satisfied:

(1) the probability measures $\pi(x, dy)$ on $\mathbb{R}^l$ satisfy a Gross inequality with constant $\beta$ independent of $x$;

(2) for any $C_c^\infty$ function $g$, the function $\pi g$ is in $H^1(\nu)$ and there exists a constant $k$ such that:
$$|\partial_1\pi g^2| \le k\,(\pi g^2)^{1/2}\operatorname{var}_\pi^{1/2}(g);$$

(3) the probability measure $\nu$ is regular and satisfies the Gross inequality with constant $\alpha$.

Then the probability measures $\mu = \nu\otimes\pi$ on $\mathbb{R}^{d+l}$ and $\nu\pi$ on $\mathbb{R}^l$ satisfy Gross inequalities for the respective constants $\max\big[\alpha(1 + k),\ \beta\big(1 + \tfrac18(1 + k)k\alpha\big)\big]$ and $\beta\big(1 + \tfrac18 k^2\alpha\big)$.
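PROOF. We limit ourselves to the case of functions of the form $g(x)h(y)$ where $g \in C_c^\infty(\mathbb{R}^d)$ and $h \in C_c^\infty(\mathbb{R}^l)$, since linear combinations of such functions are dense in $C_c^\infty(\mathbb{R}^{d+l})$; see [Die74]. In order to avoid problems with differentiability, we will begin by considering the family of functions $f_p(x, y) := g_p(x)h(y)$, $p > 0$, with $g_p(x) = (p + g^2(x))^{1/2}$. Utilizing the first hypothesis, applying the Gross inequality for $\pi_x$ and multiplying by $g_p^2(x)$, we have:
$$(3.1.6)\qquad \pi\big(f_p^2\log(|f_p|)\big) \le \frac{\beta}{2}\,\pi|\partial_2 f_p|^2 + \frac12\log(\pi f_p^2)\,\pi f_p^2.$$
Set $F := (\pi f_p^2)^{1/2} = |g_p|\otimes(\pi h^2)^{1/2}$, so that $\|F\|_{L^2(\nu)} = \|f_p\|_{L^2(\mu)}$. Since $\nu$ is regular, the third hypothesis implies:
$$(3.1.7)\qquad \nu\big(F^2\log(F)\big) \le \frac{\alpha}{2}\,\nu|\partial_1 F|^2 + \|F\|^2_{L^2(\nu)}\log\|F\|_{L^2(\nu)}.$$
Combining (3.1.6) and (3.1.7) we then obtain:
$$(3.1.8)\qquad \mu\big(f_p^2\log(|f_p|)\big) \le \frac{\beta}{2}\,\mu|\partial_2 f_p|^2 + \frac{\alpha}{2}\,\nu|\partial_1 F|^2 + \|f_p\|^2_{L^2(\mu)}\log\|f_p\|_{L^2(\mu)}.$$
The second hypothesis implies:
$$|\partial_1(\pi h^2)^{1/2}| = \frac12(\pi h^2)^{-1/2}\,|\partial_1\pi h^2| \le \frac{k}{2}\operatorname{var}_\pi^{1/2}(h),$$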
thus by applying the Poincaré inequality for $\pi$ to $h$, which is a consequence of the Gross inequality, we obtain:
$$|\partial_1(\pi h^2)^{1/2}| \le 2^{-3/2}\beta^{1/2}k\,(\pi|\partial_2 h|^2)^{1/2},$$
$$(3.1.9)\qquad |\partial_1 F| \le |\partial_1 g_p|\,(\pi h^2)^{1/2} + 2^{-3/2}\beta^{1/2}k\,|g_p|\,(\pi|\partial_2 h|^2)^{1/2}, \qquad |\partial_1 F|^2 \le (1 + k)\Big(|\partial_1 g_p|^2\,\pi h^2 + \frac{k\beta}{8}\,g_p^2\,\pi|\partial_2 h|^2\Big),$$
having taken into account the general inequality:
$$(3.1.10)\qquad (u + kv)^2 \le (1 + k)(u^2 + kv^2).$$
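For completeness, here is one way to see the elementary inequality (3.1.10); this short derivation is an added sketch (assuming $k \ge 0$ and real $u, v$) based on the Cauchy-Schwarz inequality in $\mathbb{R}^2$.

```latex
\begin{align*}
(u + kv)^2
  &= \bigl(1\cdot u + \sqrt{k}\cdot\sqrt{k}\,v\bigr)^2 \\
  &\le \bigl(1^2 + (\sqrt{k})^2\bigr)\bigl(u^2 + (\sqrt{k}\,v)^2\bigr)
   = (1 + k)\,(u^2 + k\,v^2).
\end{align*}
```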
The inequality (3.1.9) combined with the inequality (3.1.8) gives us:
$$\int f_p^2\log(|f_p|)\,d\mu \le \frac{\beta}{2}\Big(1 + \frac{(1 + k)k\alpha}{8}\Big)\int |\partial_2 f_p|^2\,d\mu + \frac{\alpha}{2}(1 + k)\int |\partial_1 f_p|^2\,d\mu + \log\|f_p\|_{L^2(\mu)}\int f_p^2\,d\mu,$$
which almost completes the proof of the result for the case $\nu\otimes\pi$. To complete it we can argue, as in the proof of Theorem 3.1.15, by letting $p \to 0$. The same calculations provide the proof for $\nu\pi$ except one doesn't need (3.1.10). $\square$

In particular, if two probability measures satisfy (ls) with the same constant $c$ then (ls) is also true for their tensor product. The preceding calculations are simpler in this case because we can take $k = 0$, and, moreover, we do not need to assume that $\nu$ is regular. Thus if $\nu$ satisfies (ls), it is also true for $\nu^{\otimes n}$ for all $n$ with the same constant. This very important property is also shared by the Poincaré inequality but not by the Sobolev inequality. The following exercise gives an example which will be utilized in establishing formula (4.2.20).
Exercise 3.1.20. Let $U(x, y)$ be a $C^1$ function on $\mathbb{R}\times\mathbb{R}^d$ such that for all $x$ the Boltzmann probability measures $\pi_x := Z_x^{-1}\exp(-2U(x, y))\,dy$ exist and are such that the absolute values of the functions $\partial_x U(x, \cdot)\exp(-2U(x, \cdot))$ are, locally in $x$, bounded by an integrable function on $\mathbb{R}^d$. Prove the following formula for the differentiation of Boltzmann measures: for bounded $h$,
$$\frac{d}{dx}\int h(y)\,\pi_x(dy) = -2\operatorname{cov}_{\pi_x}(h, \partial_x U).$$
Prove that for any bounded function $\psi$, $|\operatorname{cov}(h^2, \psi)| \le 2\|h\|_2\operatorname{var}^{1/2}(h)\operatorname{osc}(\psi)$. By utilizing Theorem 3.1.29, prove that the kernel on $\mathbb{R}$:
$$\pi(x, dy) := Z_x^{-1}\exp\Big(-\varepsilon^{-1}\Big[\frac12 y^2 + \cos(x + \varepsilon y)\Big]\Big)\,dy,$$
where $Z_x$ is the normalization constant, satisfies the preceding hypotheses for $0 < \varepsilon < 1$. Deduce from this the inequality between the best logarithmic Sobolev constants $c(\nu\pi)$ and $c(\nu)$:
$$c(\nu\pi) \le \varepsilon(1 - \varepsilon^2)^{-1}\big(1 + 4\varepsilon^{-2}c(\nu)\big).$$
The following theorem is based on a method of Herbst for which Ledoux [Ld196, Ld296] has given interesting applications to the study of measures on spaces of large dimension or on manifolds. We will see an application to Gibbs' measures in Section 4.2.3.
Theorem 3.1.21. Let $\mu$ be a regular probability on $\mathbb{R}^d$ satisfying a Gross inequality with constant $c$. Then, if $\varphi$ is a Lipschitzian function for which the slope is bounded above by 1 and $r > 0$, we have:
\[ \mu\Big\{ \varphi \ge \int \varphi\,d\mu + r \Big\} \le e^{-r^2/c}. \tag{3.1.11} \]
In particular, if $a < c^{-1}$, then $\int \exp(a|x|^2)\,d\mu < \infty$.
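PROOF. We first consider the case when $\varphi$ is bounded and positive. We set
\[ F(t) := \int e^{t\varphi}\,d\mu, \qquad G(t) := \log(F(t)). \]
The function $G(t)$ is differentiable on $\mathbb{R}_+$ and, utilizing the Gross inequality for $f = e^{t\varphi/2}$, we have:
\[ tG'(t) = tF^{-1}(t) \int \varphi e^{t\varphi}\,d\mu = 2F^{-1}(t) \int f^2 \log(f)\,d\mu \le cF^{-1}(t) \int |\nabla f|^2\,d\mu + G(t) \]
\[ = \frac{c}{4}\,t^2 F^{-1}(t) \int |\nabla\varphi|^2 e^{t\varphi}\,d\mu + G(t) \le \frac{c}{4}\,t^2 + G(t). \]
Since $G(0) = 0$ and $G'(0) = \int \varphi\,d\mu$, the examination of this differential inequality (which says that $(G(t)/t)' \le c/4$) leads easily to:
\[ G(t) \le \frac{c}{4}\,t^2 + t \int \varphi\,d\mu, \qquad t \ge 0, \]
and Markov's inequality leads to:
\[ \mu\Big\{ \varphi \ge \int \varphi\,d\mu + r \Big\} \le e^{\frac{c}{4} t^2 - tr}, \qquad t, r \ge 0. \tag{3.1.12} \]
This inequality is stable under addition of a constant to $\varphi$ and thus is valid for any bounded $\varphi$. The optimization in $t$ of the right-hand member of the inequality (the minimum is attained at $t = 2r/c$) yields (3.1.11) for bounded $\varphi$.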
The general case is obtained by replacing $\varphi$ in (3.1.11) by the sequence of bounded functions
\[ \varphi_n = (\varphi \wedge n) \vee (-n) \]
and passing to the limit in (3.1.11). This will be a valid argument if $\varphi$ is integrable. In fact, arguing by contradiction, we will show that $\varphi$ is square integrable. Suppose this is not the case. Then the sequence $(k_n)$ of the norms $\|\varphi_n\|_2$ tends to infinity and thus $k_n^{-1}\varphi_n$ is a Lipschitzian function with slope bounded above by 1 for $n$ sufficiently large. We can then apply the inequality (3.1.11), which shows that the sequence $k_n^{-2}\varphi_n^2$ is uniformly integrable. This is a contradiction, since these functions have integrals equal to 1 and at the same time converge to 0 everywhere. When this inequality is applied to the case where $\varphi(x) = |x|$, we obtain:
\[ \int e^{a|x|^2}\,d\mu < \infty \quad \text{for all } a \text{ strictly less than } c^{-1}. \]
□
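The bound (3.1.11) can be checked by hand on the simplest example. The sketch below is an illustration only, under stated assumptions: it takes the standard Gaussian measure $\gamma(dy) = \pi^{-1/2} e^{-y^2}dy$ used in the exercises that follow, assumes the Gross constant $c = 1$ for it (the value that Exercise 3.1.22 leads to), and uses the Lipschitz function $\varphi(x) = x$, whose mean is 0. The helper names are hypothetical.

```python
import math


def gaussian_upper_tail(r: float) -> float:
    """gamma{x >= r} for gamma(dy) = pi^{-1/2} exp(-y^2) dy (mean 0, variance 1/2)."""
    return 0.5 * math.erfc(r)


def herbst_bound(r: float, c: float) -> float:
    """Right-hand side of (3.1.11): exp(-r^2 / c)."""
    return math.exp(-r * r / c)


def markov_bound(r: float, t: float, c: float) -> float:
    """Right-hand side of (3.1.12): exp(c t^2 / 4 - t r)."""
    return math.exp(c * t * t / 4.0 - t * r)


C_GAUSS = 1.0  # assumed Gross constant of gamma in the normalization of the text

for r in (0.0, 0.5, 1.0, 2.0, 3.0):
    # phi(x) = x has slope 1 and integral 0, so the left side of (3.1.11) is the tail.
    print(r, gaussian_upper_tail(r), herbst_bound(r, C_GAUSS))
```

For this example the exact tail is $\tfrac12\,\mathrm{erfc}(r)$, which stays below $e^{-r^2}$; the choice $t = 2r/c$ in `markov_bound` reproduces `herbst_bound`.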
In the following exercises we consider the Ornstein-Uhlenbeck process defined by the stochastic differential equation $dX_t = dB_t - X_t\,dt$, where $(B_t, \mathcal{F}_t)$ is a real filtered Brownian motion and $X_0$ is a random variable measurable with respect to $\mathcal{F}_0$. For $x \in \mathbb{R}$, we denote by $X_t^x$ the solution such that $X_0 = x$. $A$ will denote the infinitesimal generator of the semi-group defined by $N_t(f)(x) := \mathbb{E}(f(X_t^x))$, and $\gamma(dy) := \pi^{-1/2} e^{-y^2}\,dy$ will denote the standard Gaussian measure. The following calculation has been proposed by M. Ledoux.
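Exercise 3.1.22 (the Gross inequality for the standard Gaussian measure). Let $f(x)$ be a $C^\infty$ function on $\mathbb{R}$ such that $0 < a \le f \le b$ for constants $a$ and $b$. We set:
\[ F(t) := \int \frac{1}{N_t f} \Big( \frac{d}{dx} N_t f \Big)^2 d\gamma. \]
Here $N_t(x, f)$ tends, independently of $x$, to $\gamma(f)$ as $t \to \infty$. Show that
\[ \frac{d}{dx} N_t(x, f) = e^{-t} N_t\Big(x, \frac{df}{dx}\Big), \]
and then show that:
\[ F(t) \le e^{-2t} \int N_t\Big( \frac{1}{f} \Big( \frac{df}{dx} \Big)^2 \Big) d\gamma = e^{-2t} \int \frac{1}{f} \Big( \frac{df}{dx} \Big)^2 d\gamma. \]
Prove that for $t > 0$:
\[ \frac{d}{dt} \int N_t f \log(N_t f)\,d\gamma = -\frac{1}{2} F(t). \]
Deduce from this the logarithmic Sobolev (Gross) inequality for $\gamma$ in the form:
\[ \int f \log(f)\,d\gamma - \int f\,d\gamma\,\log\Big( \int f\,d\gamma \Big) \le \frac{1}{4} \int \frac{1}{f} \Big( \frac{df}{dx} \Big)^2 d\gamma, \]
and extend it to the case where $f = g^2$, where $g$ is in the domain of the Dirichlet form:
\[ \frac{1}{2} \int \Big( \frac{dg}{dx} \Big)^2(x)\,\gamma(dx). \]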
Exercise 3.1.23. With the aid of the Poincaré inequality prove that:
\[ \forall f \in L^2(\gamma), \qquad \mathrm{var}\,N_t(f) \le e^{-2t}\,\mathrm{var}(f). \]
Verify the formula: $X_t = X_0 e^{-t} + \int_0^t e^{-(t-s)}\,dB_s$. Deduce from this that if $X_0$ is a Gaussian random variable with mean 0 and variance $\tfrac{1}{2}$, then the random variable $X_t$ is also a Gaussian random variable with mean zero and variance $\tfrac{1}{2}$, and that the random vector $(X_0, X_t)$ is Gaussian with $\mathrm{cov}(X_0, X_t) = \tfrac{1}{2} e^{-t}$. Prove from the above that if $(X, Y)$ is a Gaussian random vector with mean $(0, 0)$ and covariance matrix $\tfrac{1}{2}\begin{pmatrix} 1 & \beta \\ \beta & 1 \end{pmatrix}$, with $\beta > 0$, then for all functions $f$ and $g$ such that $f(X)$ and $g(Y)$ are square-integrable with mean 0, one has:
\[ \big| \mathbb{E}(f(X)g(Y)) \big| \le \beta\,\big[ \mathbb{E}(f^2(X)) \big]^{1/2} \big[ \mathbb{E}(g^2(Y)) \big]^{1/2}. \]
Exercise 3.1.24 (Bretagnolle's inequality). Let $B^1$ and $B^2$ be two independent Brownian motions with the same filtration $(\mathcal{F}_t)$, let $X$ be an $\mathcal{F}_0$-measurable Gaussian random variable which has zero mean and variance $\tfrac{1}{2}$, and let $X_t^1$ and $X_t^2$ be the solutions to the Ornstein-Uhlenbeck equation associated to these two Brownian motions with initial conditions $X_0^1 = X_0^2 = X$ almost surely.
Prove that the random vector $(X_t^1, X_t^2)$ is Gaussian with mean $(0, 0)$ and establish the formulas:
\[ \mathbb{E}(X_t^1)^2 = \mathbb{E}(X_t^2)^2 = \frac{1}{2} \quad \text{and} \quad \mathbb{E}(X_t^1 X_t^2) = \frac{1}{2} e^{-2t}. \]
Prove the relations:
\[ \mathbb{E}\big( f(X_t^1) f(X_t^2) \big) = \int (N_t f)^2\,d\gamma \le \Big( \int |f|^{1 + e^{-2t}}\,d\gamma \Big)^{2/(1 + e^{-2t})}. \]
By introducing the indicator function $\mathbf{1}_{]-\infty, M]}$, deduce that for all Gaussian random vectors $(Y, Z)$ with mean $(0, 0)$ and covariance $\tfrac{1}{2}\begin{pmatrix} 1 & \beta \\ \beta & 1 \end{pmatrix}$, with $\beta > 0$, one has:
\[ P(Y \vee Z \le M) \le \Big( \frac{1}{\sqrt{\pi}} \int_{-\infty}^{M} e^{-x^2}\,dx \Big)^{2/(1 + \beta)}. \]
3.1.1. The Bakry-Emery inequality. This inequality is the Gross inequality for the infinitesimal generators of Kolmogorov processes on $\mathbb{R}^d$ associated to uniformly convex potentials $U$. $\mu$ will denote the Boltzmann measure corresponding to $U$, $N_t$ the Kolmogorov semi-group, and $U''(x)$ the symmetric matrix of second derivatives, or Hessian, at the point $x$. We assume the existence of a constant $m > 0$ such that for all $x \in \mathbb{R}^d$
\[ U''(x) \ge m\,\mathrm{Id}, \tag{$C_m$} \]
in the usual sense of order on the symmetric operators on $\mathbb{R}^d$. Diverse inequalities follow from this convexity. For suitable constants $b$, $b'$ these inequalities are: (3.1.13)
\[ (x - y) \cdot (\nabla U(x) - \nabla U(y)) \ge m|x - y|^2, \]
(3.1.14)
\[ x \cdot \nabla U(x) \ge m|x|^2 - b, \]
(3.1.15)
U(x) >
2
IxI2
- b'.
The first inequality follows directly from the formula:
\[ \nabla U(x) - \nabla U(y) = \int_0^1 U''(\xi(t))\,(x - y)\,dt, \]
where $\xi(t) = tx + (1 - t)y$. The second is deduced from the first by taking $y = 0$, and the third is deduced from the preceding by integration along lines starting at 0. The relation (3.1.14) shows that the Kolmogorov process does not explode in finite time and (3.1.15) shows that $\exp(-U)$ is integrable. The proof of the Bakry-Emery inequality, following the presentation given in [AKR95], will be based on the following stabilization result in the weak sense:
Proposition 3.1.25. We suppose that the hypothesis ($C_m$) is satisfied. Let $\psi$ be a Lipschitz function on $\mathbb{R}^d$ and $\psi_t := N_t(\psi)$. Then $\psi_t$ is a Lipschitz function, its Lipschitz slope tends to 0 exponentially, and for any Lipschitz function $g$, the integral $\int_{\mathbb{R}^d} g(\psi_t)\,d\mu$ converges to $g(\int \psi\,d\mu)$ when $t \to \infty$.
PROOF. Let $l$ denote the Lipschitz slope of $\psi$. On one hand, we have:
\[ |\psi_t(x) - \psi_t(y)| = \big| \mathbb{E}\big( \psi(X_t^x) - \psi(X_t^y) \big) \big| \le l\,\mathbb{E}\big( |X_t^x - X_t^y| \big). \tag{3.1.16} \]
On the other hand Langevin's equation implies that $X_t^x - X_t^y$ is differentiable in time, which leads to the following inequalities:
\[ \frac{d}{dt} |X_t^x - X_t^y|^2 = -2 (X_t^x - X_t^y) \cdot \big( \nabla U(X_t^x) - \nabla U(X_t^y) \big) \le -2m\,|X_t^x - X_t^y|^2, \]
\[ |X_t^x - X_t^y|^2 \le |x - y|^2 e^{-2mt}, \qquad |X_t^x - X_t^y| \le |x - y|\,e^{-mt}. \]
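By utilizing (3.1.16), we see that $\psi_t$ has a Lipschitz constant $l_t := l \exp(-mt)$. Let $k$ be the Lipschitz slope of $g$. The last convergence statement of the proposition follows from the sequence of inequalities (recall that $\int \psi_t\,d\mu = \int \psi\,d\mu$ by invariance of $\mu$):
\[ \Big| \int g(\psi_t)\,d\mu - g\Big( \int \psi\,d\mu \Big) \Big| = \Big| \int \Big[ g(\psi_t(x)) - g\Big( \int \psi_t(y)\,\mu(dy) \Big) \Big] \mu(dx) \Big| \]
\[ \le k \int \Big| \psi_t(x) - \int \psi_t(y)\,\mu(dy) \Big|\,\mu(dx) = k \int \Big| \int \big( \psi_t(y) - \psi_t(x) \big)\,\mu(dy) \Big|\,\mu(dx) \]
\[ \le k \iint |\psi_t(y) - \psi_t(x)|\,\mu(dy)\,\mu(dx) \le k\,l_t \iint |y - x|\,\mu(dy)\,\mu(dx). \]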
□

Exercise 3.1.26. Show that $\psi_t(x)$ converges to $\int \psi\,d\mu$ for any point of departure $x$ and deduce from this that the probability measure associated with $X_t^x$ tends weakly to $\mu$.¹
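The contraction estimate $|X_t^x - X_t^y| \le |x - y|\,e^{-mt}$ from the proof of Proposition 3.1.25 can be observed on a discretized Langevin equation in which both paths are driven by the same Brownian increments. The sketch below is an illustration under assumptions not taken from the text: the one-dimensional potential $U(x) = x^2/2 + \log\cosh(x)$ (so $1 \le U'' \le 2$ and $m = 1$), an explicit Euler scheme with step $h$, and hypothetical function names.

```python
import math
import random


def grad_U(x: float) -> float:
    """Gradient of the illustrative potential U(x) = x^2/2 + log(cosh(x)); U'' lies in [1, 2]."""
    return x + math.tanh(x)


def coupled_distances(x0: float, y0: float, t_end: float = 3.0, h: float = 0.01, seed: int = 0):
    """Euler scheme for dX = dB - grad U(X) dt started at x0 and y0 with shared noise.

    Returns a list of pairs (t, |X_t - Y_t|).
    """
    rng = random.Random(seed)
    x, y = x0, y0
    out = [(0.0, abs(x - y))]
    steps = int(round(t_end / h))
    for k in range(1, steps + 1):
        dB = rng.gauss(0.0, math.sqrt(h))  # the same increment drives both paths
        x, y = x + dB - grad_U(x) * h, y + dB - grad_U(y) * h
        out.append((k * h, abs(x - y)))
    return out


for t, dist in coupled_distances(2.0, -1.0)[::100]:
    print(f"t={t:.2f}  distance={dist:.6f}  bound={3.0 * math.exp(-t):.6f}")
```

With shared noise the Brownian term cancels in the difference, so each Euler step multiplies the distance by a factor in $[1 - 2h, 1 - h]$, which is at most $e^{-h}$; this is the discrete counterpart of the differential inequality used in the proof.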
Lemma 3.1.27. Let $f$ be a bounded function in $D(A)$. Then one can find a uniformly bounded sequence $\varphi_n$ in $C_c^\infty(\mathbb{R}^d)$ such that $\varphi_n \to f$ and $A\varphi_n \to Af$ in $L^2(\mu)$.
PROOF. We utilize the characterization of $D(A)$: $f$ is in $W_{\mathrm{loc}}$ and, in addition, $f$ as well as $\tfrac{1}{2}\Delta f - \nabla U \cdot \nabla f$ are in $L^2(\mu)$. The same procedure of truncation and regularization that was used in the proof of Lemma 3.1.12 gives us the result. □

To a function $\varphi$ we associate $\psi := \exp(\varphi)$ and we set:
\[ \mathcal{F}(\varphi) := \int_{\mathbb{R}^d} \frac{(A\psi)^2}{\psi}\,d\mu + \int_{\mathbb{R}^d} A\psi\,A\varphi\,d\mu. \]
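Lemma 3.1.28. Let $\varphi$ be a bounded function on $\mathbb{R}^d$. Then the function $\psi = \exp(\varphi)$ is in the domain of $A$ if and only if $\varphi$ itself is in the domain, and we have:
\[ \mathcal{F}(\varphi) \ge m \int \nabla\varphi \cdot \nabla\psi\,d\mu. \tag{3.1.17} \]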
PROOF. Suppose that $\varphi \in D(A)$. With the aid of the preceding lemma, we can approximate $\varphi$ by a sequence $\varphi_n$ of infinitely differentiable and uniformly bounded functions. Setting $\psi_n = \exp(\varphi_n)$, we see immediately that:
\[ A\psi_n = \psi_n \Big( A\varphi_n - \frac{1}{2} |\nabla\varphi_n|^2 \Big). \]
¹ See Appendix A.3 for the definition of weak or narrow convergence.
Since $\varphi_n$ and $A\varphi_n$ converge in $L^2(\mu)$, so does $A^{1/2}\varphi_n$ and, since the sequences $\psi_n$ and $A\psi_n$ also converge, the limit $\psi$ belongs to $D(A)$.
Conversely suppose that $\psi \in D(A)$. We then proceed in an analogous manner by approximating $\psi$ by regularized functions $\psi_n$ and writing:
\[ A\varphi_n = \frac{A\psi_n}{\psi_n} + \frac{|\nabla\psi_n|^2}{2\psi_n^2}. \]
It is necessary to be careful in approximating $\psi$ because of the denominators. We do this by noting that $\psi$ is bounded below by a strictly positive constant $m$. We then approximate $\psi - m$, as was done in Lemma 3.1.12, by test functions $g_n$, which are positive, and then set $\psi_n = m + g_n$. The approximation used for $\varphi$ proves that it suffices to show the inequality (3.1.17) in the case where $\varphi \in C_c^\infty$. This will allow us to justify all of the integrations by parts below. In this case the left-hand side of the inequality is written:
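\[ \mathcal{F}(\varphi) = 2 \int A\psi\,A\varphi\,d\mu - \frac{1}{2} \int |\nabla\varphi|^2 A\psi\,d\mu = \int \nabla A\varphi \cdot \nabla\psi\,d\mu - \frac{1}{2} \int |\nabla\varphi|^2 A\psi\,d\mu. \]
On the other hand we are able to express the commutator of $\nabla$ and $A$ in the form:
\[ (\nabla A\varphi)_i = \partial_i(A\varphi) = -\frac{1}{2} \sum_{j=1}^d \partial_i \partial_j^2 \varphi + \sum_{j=1}^d \partial_j U\,\partial_i \partial_j \varphi + \sum_{j=1}^d \partial_i \partial_j U\,\partial_j \varphi = A(\partial_i \varphi) + (U''\nabla\varphi)_i. \]
Thus inserting this in the expression for $\mathcal{F}(\varphi)$ gives
\[ \mathcal{F}(\varphi) = \sum_i \int A(\partial_i\varphi)\,\partial_i\psi\,d\mu - \frac{1}{2} \int |\nabla\varphi|^2 A\psi\,d\mu + \int (\nabla\psi, U''\nabla\varphi)\,d\mu \]
\[ = \frac{1}{2} \sum_{i,j} \int \partial_{ij}\varphi\,\partial_{ij}\psi\,d\mu - \frac{1}{2} \sum_{i,j} \int \partial_i\varphi\,\partial_{ij}\varphi\,\partial_j\psi\,d\mu + \int (\nabla\psi, U''\nabla\varphi)\,d\mu, \]
which, since $\partial_j\psi = \psi\,\partial_j\varphi$ and $\partial_{ij}\psi = \psi(\partial_{ij}\varphi + \partial_i\varphi\,\partial_j\varphi)$, can be put in the form:
\[ \mathcal{F}(\varphi) = \frac{1}{2} \sum_{i,j} \int \psi\,(\partial_{ij}\varphi)^2\,d\mu + \int \psi\,(\nabla\varphi, U''\nabla\varphi)\,d\mu. \tag{3.1.18} \]
Thus the convexity of $U$ establishes the lemma. □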
Theorem 3.1.29 (Bakry and Emery). Let $U$ be a $C^2$ function on $\mathbb{R}^d$ such that $U''(x) \ge m\,\mathrm{Id}$ uniformly in $x$. The corresponding Boltzmann probability measure $\mu$ satisfies the following logarithmic Sobolev inequality with constant $m^{-1}$: for any $f \in H^1(\mu)$, we have:
\[ \int f^2 \log(|f|)\,d\mu \le \frac{1}{2m} \int_{\mathbb{R}^d} |\nabla f|^2\,d\mu + \log \|f\|_2 \int f^2\,d\mu. \tag{3.1.19} \]
PROOF. We denote by $\gamma(x)$ the lower-bounded convex function $x \log(x)$. We begin by considering a strictly positive function $\psi$ of the form $a + f^2$ with $f \in C_c^\infty(\mathbb{R}^d)$ and $a > 0$, and we set $\psi_t = N_t\psi$, where $N_t$ is the semi-group associated to the Langevin equation $dX_t = dB_t - \nabla U(X_t)\,dt$. It is clear that $a \le \psi_t \le b$ with $b = \sup(\psi)$. Lemma 3.1.14 implies that the derivative
\[ r(t) := -\frac{d}{dt} \int \gamma(\psi_t)\,d\mu \]
exists and equals
\[ r(t) = \int_{\mathbb{R}^d} \gamma'(\psi_t)\,A\psi_t\,d\mu = (\varphi_t, A\psi_t), \tag{3.1.20} \]
with $\varphi_t = \log(\psi_t)$. Since the function $\log$ is in $C^2[a, b]$, the same lemma shows that $t \mapsto \varphi_t \in L^2$ is differentiable with derivative $-\psi_t^{-1} A\psi_t$. To differentiate $r(t)$ we proceed as follows. Since Lemma 3.1.28 implies that $\varphi_t \in D(A)$, we can write:
\[ h^{-1}\big( r(t + h) - r(t) \big) = \big( h^{-1}(\varphi_{t+h} - \varphi_t), A\psi_{t+h} \big) + \big( A\varphi_t, h^{-1}(\psi_{t+h} - \psi_t) \big), \]
which, after passing to the limit, shows that:
\[ r'(t) = -\mathcal{F}(\varphi_t). \]
Thus the inequality (3.1.17) gives us:
\[ r'(t) \le -m \int \nabla\varphi_t \cdot \nabla\psi_t\,d\mu = -2m \int \varphi_t\,A\psi_t\,d\mu = -2m\,r(t), \]
which implies the inequality:
\[ r(t) \le r(0)\,e^{-2mt} = \frac{1}{2} e^{-2mt} \int \nabla\varphi_0 \cdot \nabla\psi_0\,d\mu = 4 e^{-2mt}\,\mathcal{E}(\psi_0^{1/2}, \psi_0^{1/2}), \]
which, in turn, by integrating with respect to $t$, gives us:
\[ \int \gamma(\psi)\,d\mu - \int \gamma(\psi_t)\,d\mu = \int_0^t r(s)\,ds \le \frac{2}{m}\,\mathcal{E}(\psi^{1/2}, \psi^{1/2}), \qquad \forall t. \]
Utilizing Proposition 3.1.25 we let $t$ go to infinity to obtain:
\[ \int \gamma(\psi)\,d\mu \le \frac{2}{m}\,\mathcal{E}(\psi^{1/2}, \psi^{1/2}) + \gamma\Big( \int \psi\,d\mu \Big). \]
Setting $f_a = \sqrt{a + f^2}$, we then obtain the logarithmic Sobolev inequality:
\[ \int f_a^2 \log(|f_a|)\,d\mu \le \frac{1}{m}\,\mathcal{E}(f_a, f_a) + \|f_a\|^2 \log(\|f_a\|), \]
and we finish by passing to the limit $a \to 0$ as we did at the end of the proof of Theorem 3.1.15. □
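As a sanity check of (3.1.19), one can evaluate both sides by numerical quadrature in the simplest case. The sketch below is only an illustration under assumptions: it takes $d = 1$ and $U(x) = x^2/2$, so that $m = 1$ and $\mu$ is the Gaussian measure $\pi^{-1/2}e^{-x^2}dx$, uses a plain trapezoidal rule on $[-12, 12]$, and the function names are hypothetical.

```python
import math


def gamma_integral(g, lo: float = -12.0, hi: float = 12.0, n: int = 24000) -> float:
    """Trapezoidal approximation of the integral of g against pi^{-1/2} exp(-y^2) dy."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        y = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * g(y) * math.exp(-y * y)
    return total * h / math.sqrt(math.pi)


def lsi_sides(f, df, m: float = 1.0):
    """Return (lhs, rhs) of (3.1.19) for the Gaussian case U(x) = x^2/2, m = 1."""
    norm_sq = gamma_integral(lambda y: f(y) ** 2)
    lhs = gamma_integral(
        lambda y: f(y) ** 2 * math.log(abs(f(y))) if f(y) != 0.0 else 0.0
    )
    energy = gamma_integral(lambda y: df(y) ** 2)
    # log ||f||_2 = (1/2) log(norm_sq)
    rhs = energy / (2.0 * m) + 0.5 * math.log(norm_sq) * norm_sq
    return lhs, rhs


print(lsi_sides(lambda y: 1.0 + 0.5 * math.sin(y), lambda y: 0.5 * math.cos(y)))
print(lsi_sides(lambda y: math.exp(0.7 * y), lambda y: 0.7 * math.exp(0.7 * y)))
```

For exponential functions $f(x) = e^{\alpha x}$ a direct Gaussian computation gives $\alpha^2 e^{\alpha^2}$ for both sides, so the constant $m^{-1}$ cannot be improved in this example.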
Remark 3.1.30. These calculations can be analyzed in the more general setting of symmetric Markovian semi-groups, which in particular leads to the Gross inequality on Riemannian manifolds with uniformly positive curvature; see [Bk93].
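Exercise 3.1.31 (the Brascamp-Lieb inequality). Let $\mu$ be the Boltzmann measure associated with a $C^2$ function $v$ on $\mathbb{R}$ satisfying $v'' > 0$. Prove, for any $f \in C_c^\infty$, the inequality:
\[ \mathrm{var}_\mu(f) \le \frac{1}{2} \int_{\mathbb{R}} \frac{f'^2}{v''}\,d\mu. \]
To do this, utilize the equations:
\[ \mathrm{var}_\mu(f) = \frac{1}{2} \int_{\mathbb{R} \times \mathbb{R}} (f(x) - f(y))^2\,\mu(dx)\,\mu(dy), \qquad f(x) - f(y) = \int_y^x \frac{f'(t)}{v''(t)}\,v''(t)\,dt. \]
One will find a multi-dimensional analogue in [BL76].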
To treat potentials which are only convex at infinity one can use Proposition 3.1.18.
3.2. An application to ergodicity

3.2.1. The Poincaré inequality and stabilization. One can already obtain an ergodicity result in the strong sense, called stabilization towards the Boltzmann measure, with the aid of the Poincaré inequality for a Kolmogorov process $(X_t)$. It is a matter of showing that $\mathcal{L}(X_t^x)$ converges to the invariant measure, independently of the point of departure $x$. We denote by $\|p_1 - p_2\|_{vt}$ the total variation distance between two bounded measures $p_1$ and $p_2$. (See the Appendix for the definition.) The measure $m(t)$ will denote the probability measure $\mathcal{L}(X_t)$.

Theorem 3.2.1. Suppose that the Kolmogorov semi-group satisfies a Poincaré inequality with constant $c$.
(1) Then, for any function $f \in L^2(\mu)$, the function $N_t f$ tends in $L^2$ to the constant $\mu(f)$ when $t \to \infty$ or, more precisely:
\[ \mathrm{var}_\mu(N_t f) = \|N_t f - \mu(f)\|_2^2 \le e^{-2t/c}\,\mathrm{var}_\mu(f). \tag{3.2.1} \]
(2) If there exists $t_0$ such that $m(t_0)$ has an $L^2$ density with respect to $\mu$, then there exists a constant $a > 0$ such that for all $t$:
\[ \|m(t) - \mu\|_{vt} \le a \exp(-c^{-1} t). \]
PROOF. Let $g = c^{-1}$. In the decomposition $L^2(\mu) = C \oplus C^\perp$, the space of constants $C$ is left pointwise invariant by $N_t$, while the restriction of $N_t$ to $C^\perp$ has norm at most $e^{-gt}$, since the spectrum of the self-adjoint operator $A|_{C^\perp}$ is contained in $[g, +\infty[$ and, therefore, the spectrum of $N_t = \exp(-tA)$ on $C^\perp$ is contained in $[0, e^{-gt}]$. Thus
\[ \|\mu(f) - N_t(f)\|_2 \le e^{-gt}\,\|f\|_2, \]
which establishes the first result. To prove the second assertion, we utilize reversibility to establish that for any $t \ge t_0$, the probability measure $m(t)$ has a square-integrable density $f_t$ with respect to $\mu$ satisfying the relation $f_t = N_{t - t_0} f_{t_0}$. To prove this, it is sufficient to write:
\[ \int \varphi(x)\,m(t)(dx) = \mathbb{E}(\varphi(X_t)) = \mathbb{E}\big( \mathbb{E}(\varphi(X_t) \mid \mathcal{F}_{t_0}) \big) = \mathbb{E}\big( N_{t - t_0}\varphi(X_{t_0}) \big) \]
\[ = \int [N_{t - t_0}\varphi](x)\,f_{t_0}(x)\,\mu(dx) = \int \varphi(x)\,[N_{t - t_0} f_{t_0}](x)\,\mu(dx), \]
for any bounded measurable $\varphi$. The inequality we are looking for is deduced from
\[ \|m(t) - \mu\|_{vt} = \|f_t - 1\|_{L^1(\mu)} \le \|f_t - 1\|_{L^2(\mu)} \le e^{-g(t - t_0)}\,\|f_{t_0} - \mu(f_{t_0})\|_{L^2(\mu)} \]
for $t \ge t_0$ and, since the variation distance between two probability measures is bounded by 2, we can adjust $a$ to cover the case when $t \le t_0$.
Remark 3.2.2. In order to apply the preceding theorem, it is not only necessary to obtain a Poincare inequality but also prove a regularity condition on the distribution after time 0. In the examples where one can establish the Poincare inequality one also is often able to obtain a logarithmic Sobolev inequality and then use it to prove stabilization. One will still need a regularity condition, but one a little weaker, which makes a big difference when one considers certain extensions to infinite dimensions.
3.2.2. Utilization of the logarithmic Sobolev inequality. We will utilize the Kullback information, or relative entropy, of $\mathcal{L}(X_t)$ with respect to $\mu$. The relative entropy of a probability measure $P$ with respect to another probability measure $Q$ is defined by:
\[ I(P \mid Q) = \begin{cases} \int f \log(f)\,dQ & \text{if } dP = f\,dQ, \\ +\infty & \text{if } P \text{ does not have a density with respect to } Q. \end{cases} \tag{3.2.2} \]
We see right away by Jensen's inequality that $I$ is positive and that $I(P \mid Q) = 0$ if and only if $P = Q$. We will be able to use the relative entropy to study the convergence of measures with respect to the total variation norm since we have the inequality:
Proposition 3.2.3 (Pinsker). For two probability measures $P$ and $Q$ on a space $E$, we have:
\[ \|P - Q\|_{vt} \le \sqrt{2\,I(P \mid Q)}. \]
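PROOF. We begin with the elementary inequality:
\[ 3(\xi - 1)^2 \le (4 + 2\xi)\,(\xi \log(\xi) - \xi + 1), \]
for any $\xi \in \mathbb{R}_+$. We set $P = fQ$, since if $f$ doesn't exist, there is nothing to prove. By applying the preceding inequality with $\xi = f(x)$ for each $x \in E$, taking square roots, integrating with respect to $Q(dx)$ and using the Cauchy-Schwarz inequality, we obtain:
\[ 3\,\|P - Q\|_{vt}^2 = 3\,\|f - 1\|_{L^1(Q)}^2 \le \int (4 + 2f)\,dQ \int (f \log(f) - f + 1)\,dQ = 6\,I(P \mid Q). \tag{3.2.3} \]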
□

Exercise 3.2.4. Prove the variational formula for entropy, i.e., show that for two probability measures $\mu$ and $\nu$ on a space $E$, one has:
\[ I(\mu \mid \nu) = \sup\Big\{ \int_E f\,d\mu - \log\Big( \int_E \exp(f)\,d\nu \Big);\ f \in C_b(E) \Big\}. \]
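Pinsker's inequality (Proposition 3.2.3) is easy to test on a finite space, where both sides are finite sums. The sketch below is an illustration with hypothetical helper names; it takes the total variation norm to be the $L^1$ distance between the weights, in agreement with the remark above that the variation distance between two probability measures is bounded by 2.

```python
import math
import random


def relative_entropy(p, q) -> float:
    """I(P | Q) = sum_i p_i log(p_i / q_i); +inf if P has no density with respect to Q."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total


def total_variation_norm(p, q) -> float:
    """||P - Q||_vt computed as sum_i |p_i - q_i| (at most 2 for probabilities)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def random_probability(n: int, rng: random.Random):
    """A random probability vector with n strictly positive weights."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    s = sum(weights)
    return [w / s for w in weights]


rng = random.Random(0)
p, q = random_probability(5, rng), random_probability(5, rng)
print(total_variation_norm(p, q), math.sqrt(2.0 * relative_entropy(p, q)))
```

The first printed number should not exceed the second; equality occurs only when $P = Q$.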
Let $(X_t)$ be a Kolmogorov process that does not explode in finite time and let $\mu$ be the corresponding Boltzmann probability. Let $I(t) := I(\mathcal{L}(X_t) \mid \mu)$.
Theorem 3.2.5. The logarithmic Sobolev inequality with constant $c$ implies the inequality:
\[ I(t) \le I(0) \exp\Big( -\frac{2t}{c} \Big), \qquad t \ge 0. \tag{3.2.4} \]
PROOF. If $I(0) = +\infty$, there is nothing to prove. If not, the probability measure $m_0$ associated to $X_0$ has a density (Radon-Nikodym derivative) $f \in L^1$ with respect to $\mu$ and thus it is immediate that $\mathcal{L}(X_t)$ has the density $f_t := N_t f$, where $N_t$ acts on $L^1(\mu)$. We begin with the case where there exist constants $0 < a < b$ such that $a \le f \le b$. Since the semi-group $N_t$ is Markovian, the same inequality is valid at time $t$. In this case, $f \in L^2(\mu)$ and, thus, for any $t > 0$, $f_t$ is in the domain of the infinitesimal generator $A$ and we are able to write, as in (3.1.20), the following:
\[ \frac{dI(t)}{dt} = -(\log(f_t), A f_t). \]
Since $f_t$ is in the domain of $A$ it is in the domain of $A^{1/2}$, which, in turn, is $H^1(\mu)$. Since $\log$ and the square root are both in $C^1[a, b]$, we have:
\[ -I'(t) = \int_{\mathbb{R}^d} \log(f_t)\,A f_t\,d\mu = \mathcal{E}(\log(f_t), f_t) = \frac{1}{2} \int \nabla\log(f_t) \cdot \nabla f_t\,d\mu = \frac{1}{2} \int \frac{1}{f_t} |\nabla f_t|^2\,d\mu = 4\,\mathcal{E}(f_t^{1/2}, f_t^{1/2}). \]
The function $f_t^{1/2}$ is of norm one in $L^2(\mu)$ since $f_t$ is a probability density with respect to $\mu$. We can then apply the inequality (LS), which, for any $t > 0$, leads to:
\[ I'(t) \le -\frac{4}{c} \int_{\mathbb{R}^d} f_t \log(f_t^{1/2})\,d\mu = -\frac{2}{c}\,I(t). \]
In the case under consideration Lebesgue's Theorem shows right away that $I(t)$ is continuous at 0, which establishes (3.2.4) in this case. Now let $f$ be arbitrary. For $0 < a < b$, we set $z_a^b f_a^b = b \wedge (f \vee a)$, where the constant $z_a^b$ is chosen so that $f_a^b$ is a probability density. For a Kolmogorov process with initial distribution $f_a^b \mu$, the inequality (3.2.4) holds. Let $a$ tend to 0 and then let $b$ increase to $+\infty$. Since $x \log(x)$ is continuous and bounded below on $\mathbb{R}_+$, we first apply Lebesgue's Theorem and then Beppo Levi's Theorem to see that $I((N_t f_a^b)\mu \mid \mu)$ converges to $I((N_t f)\mu \mid \mu)$, and we are done.
The following exercise is borrowed from the theory of large deviations [DSt89].
Exercise 3.2.6. Show for $\varphi \in D(A)$ that $\mathcal{E}(\varphi, \varphi)$ is the limit when $t \to 0$ of
\[ \frac{1}{2t} \int (\varphi(x) - \varphi(y))^2\,a_t(dx, dy), \qquad a_t = \mu \otimes N_t, \]
and deduce from this that the functional $g \mapsto \mathcal{E}(g^{1/2}, g^{1/2})$ is convex on $L^1_+(\mu)$, agreeing that it takes the value $+\infty$ if $g^{1/2} \notin D(\mathcal{E})$.
The preceding result has the drawback that it requires the initial distribution to be absolutely continuous with respect to $\mu$. But by homogeneity in time $I(t)$ is bounded above by
\[ I(t_0) \exp\Big( -\frac{2}{c}(t - t_0) \Big) \]
for all $t \ge t_0 > 0$. Thus it suffices that the relative entropy $I(t_0)$ be finite for some $t_0$ in order for $\mathcal{L}(X_t)$ to converge to $\mu$. We are able to deduce from this the following:
Theorem 3.2.7. If the potential $U$ satisfies the following hypotheses:
(3.2.5a) $U = U_1 + U_2$ with $U_1$ uniformly convex and of class $C^2$, $U_2$ of class $C^2$ with bounded first derivatives,
(3.2.5b) $|\nabla U|^2 - \Delta U$ is bounded below,
then the Gross inequality is satisfied for a constant $c$ and, for any point of departure $x$ and $t_0 > 0$, we are able to find $a$ such that for $t \ge t_0$:
\[ I(\mathcal{L}(X_t^x) \mid \mu) \le a\,e^{-2t/c}. \]
PROOF. These potentials are in the class defined by (2.2.5), since $|\nabla U|$ tends to infinity as $|x|$ tends to infinity, as well as in the class defined by (2.2.6). Let $m > 0$ be such that $U_1'' \ge m$. First we prove, following L. Miclo, the existence of a decomposition of $U$ as the sum of a uniformly convex function, which gives rise to the Gross inequality for the associated Kolmogorov semi-group, and a bounded function. Proposition 3.1.18 then implies a Gross inequality corresponding to $U$. Let $\gamma_v$ be the Gaussian density $(2\pi v)^{-d/2} \exp(-|x|^2/2v)$ and $U_{2,v} = U_2 * \gamma_v$. It is easily seen that
\[ \|U_2 - U_{2,v}\|_\infty \le k (vd)^{1/2}, \qquad U_{2,v}'' \ge -(k/v)\,\mathrm{Id}, \]
where $k$ is the Lipschitz slope of $U_2$. So it suffices to write
\[ U = (U_1 + U_{2,v}) + (U_2 - U_{2,v}), \]
for $v > k/m$. It remains to show that for $t > 0$, the relative entropy $I(\mathcal{L}(X_t^x) \mid \mu)$ is finite. The Cameron-Martin formula says that the (Radon-Nikodym) density for the measure on $C([0, t])$ defined by the process $X^x$ with respect to the Wiener measure $P$ for the Brownian paths $W^x$ starting at $x$ is:
\[ F(w) = \exp\Big( U(x) - U(w(t)) - \frac{1}{2} \int_0^t \big[ |\nabla U|^2 - \Delta U \big](w(s))\,ds \Big). \]
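The density $g$ for $\mathcal{L}(X_t^x)$ with respect to $\mathcal{L}(W_t^x)$ is:
\[ g(y) = \mathbb{E}(F \mid W_t = y), \]
where $\mathbb{E}$ denotes the expectation associated with the Wiener measure on $C([0, t])$ for Brownian paths starting at $x$, and $\mathcal{L}(W_t^x)$ is the Gaussian measure $(2\pi t)^{-d/2} \exp\big( -\frac{(x - y)^2}{2t} \big)\,dy$ on $\mathbb{R}^d$. We write this Gaussian density as $\exp(-2v)$. The density $f$ of $\mathcal{L}(X_t)$ with respect to $\mu$ is thus given by:
\[ f(y) = Z\,\mathbb{E}\big( \exp(2U(y) - 2v(y))\,F \mid W_t = y \big). \]
Setting $\gamma(x) = x \log(x)$, we have:
\[ I(t) = Z^{-1}\,\mathbb{E}\big( \gamma[f(W_t)] \exp(2v(W_t) - 2U(W_t)) \big) \]
\[ = Z^{-1}\,\mathbb{E}\Big( \gamma\big[ Z\,\mathbb{E}\big( \exp(2U(W_t) - 2v(W_t))\,F \mid W_t \big) \big] \exp(2v(W_t) - 2U(W_t)) \Big), \]
which by Jensen's inequality gives us:
\[ I(t) \le Z^{-1}\,\mathbb{E}\Big[ \mathbb{E}\big( \gamma[ Z \exp(2U(W_t) - 2v(W_t))\,F ] \mid W_t \big) \exp(2v(W_t) - 2U(W_t)) \Big] \tag{3.2.6} \]
\[ = \mathbb{E}\Big( \Big[ \log(Z) + U(x) + U(W_t) - 2v(W_t) - \frac{1}{2} \int_0^t \big[ |\nabla U|^2 - \Delta U \big](W_s)\,ds \Big] F \Big). \tag{3.2.7} \]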
To see that this quantity is bounded we only have to check that the terms between the brackets are bounded above, since $F$ has expectation 1. This is obviously the case for $-v$ since:
\[ \exp(-2v(y)) = (2\pi t)^{-d/2} \exp\Big( -\frac{(x - y)^2}{2t} \Big). \]
It remains only to look at $\mathbb{E}^x(U(W_t) F)$. But this term is bounded above since
\[ \exp\Big( -\frac{1}{2} \int_0^t \big[ |\nabla U|^2 - \Delta U \big](w(s))\,ds \Big) \]
is uniformly bounded above on path space and $U_+ \exp(-U)$ is bounded above on $\mathbb{R}^d$.
Remark 3.2.8. The interest in this method is to control the rate of stabilization in the sense of entropy and thus in the sense of the total variation norm. We should also point out that the method of Harris chains (see, for example, [MT97]) allows us to still obtain the same exponential stabilization for total variation under the preceding hypotheses. Let us also note that the utilization of similar calculations in the theory of simulated annealing is treated by L. Miclo in [CM]. It is formally a simple extension of Theorem 3.2.5 but is technically more difficult in the case of $\mathbb{R}^d$. On the other hand, more stringent growth conditions on $U$ at infinity than above will imply that the Kolmogorov semi-group has stronger contraction properties than hypercontractivity; see [KKR93, Dav89, CKS87].

In fact, Theorem 3.2.5 can be generalized to most semi-groups associated to reversible processes. The simple case of jump processes in a finite state space is already instructive. Let $K$ be a kernel on $E$ with invariant probability $\mu$. We denote by $\mathcal{E}_K$ the Dirichlet form associated to the symmetric operator on $L^2(\mu)$ given by $\mathrm{Id} - \tfrac{1}{2}(K + K^*)$, i.e.:
\[ \mathcal{E}_K(f, f) := (f, (I - K)f)_{L^2(\mu)}. \]
Assuming that a logarithmic Sobolev inequality holds, we denote the corresponding constant of the inequality by $\beta(K)$. Clearly we have $\beta(K) = \beta(K^*)$. In the case when the support of $\mu$ is all of $E$ (we can always reduce the problem to this case), the adjoint $K^*$ of $K$ in $L^2(\mu)$ is given by the unique kernel:
\[ K^*(x, y) = \frac{\mu(y) K(y, x)}{\mu(x)}. \]
Then for any initial probability $m$, a jump process $X_t$ is defined such that $\mathcal{L}(X_0) = m$ and the associated semi-group $N_t$ satisfies:
\[ \frac{d}{dt} N_t f = (K - I) N_t f. \]
The density $g_t$ of $mN_t$ with respect to $\mu$ is then $N_t^* g_0$. We can then show (see [D-S96]) that:
\[ I(mN_t \mid \mu) \le I(m \mid \mu) \exp\Big( -\frac{2t}{c} \Big), \]
where one can take $c = \beta(K)$ in the reversible case, and $c = 2\beta(K)$ in the non-reversible case. The derivative calculations are the same as in Theorem 3.2.5, but the equality $\mathcal{E}(\log(f), f) = 4\,\mathcal{E}(f^{1/2}, f^{1/2})$ needs to be replaced by an inequality. In the reversible case, the constants do not change. We treat the reversible case as an exercise:
Exercise 3.2.9. Let $\mu$ be a reversible probability for the kernel $K$ on the finite set $E$. Prove the inequality $M_l(u, v) \le M_a(u, v)$ between the logarithmic mean
\[ M_l(u, v) = \frac{u - v}{\log(u) - \log(v)} \]
and the arithmetic mean. One can utilize the representations:
\[ \frac{1}{M_l(u, v)} = \int_0^\infty \frac{dt}{(t + u)(t + v)}, \qquad \frac{1}{M_a(u, v)} = \int_0^\infty \frac{dt}{\big( t + \frac{u + v}{2} \big)^2}. \]
By utilizing the formula in Exercise 3.1.5, deduce:
\[ \mathcal{E}_K(\log(f), f) \ge 4\,\mathcal{E}_K(f^{1/2}, f^{1/2}). \]

Remark 3.2.10. The validity of the Gross inequality in the case where the state space is finite is relatively trivial. For example, in the reversible case, the Poincaré inequality for $\mathrm{Id} - K$ is true if and only if the chain associated to $K$ is irreducible; see Exercise 3.1.5. The two terms of the Gross inequality are non-zero except for constant functions. On the unit sphere of $L^2(\mu)$, which is compact, the expressions $\sum_{x \in E} f^2(x) \log\big( |f|(x)/\|f\|_2 \big)\,\mu(x)$ and $\mathrm{var}_\mu(f)$ are two positive functions of $f$ that vanish only at 1. They are infinitely differentiable and the same calculation that was used in Proposition 3.1.8 shows that their differentials are the same at 1. They are thus comparable. However, finding good, let alone optimal, logarithmic Sobolev constants in the finite case requires a large number of methods, as can be seen in the work of P. Diaconis and L. Saloff-Coste [D-S96].
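The inequality of Exercise 3.2.9 can be tried out on a small reversible chain. The sketch below is an illustration under assumptions: the formula of Exercise 3.1.5 is not reproduced in this section, so the code uses the standard expression $\mathcal{E}_K(f, g) = \tfrac12 \sum_{x,y} \mu(x) K(x, y)(f(x) - f(y))(g(x) - g(y))$, valid when $\mu$ is reversible for $K$; the three-state kernel and all names are hypothetical.

```python
import math


def log_mean(u: float, v: float) -> float:
    """Logarithmic mean (u - v) / (log u - log v), extended by continuity at u = v."""
    return u if u == v else (u - v) / (math.log(u) - math.log(v))


def dirichlet(mu, K, f, g) -> float:
    """E_K(f, g) = 1/2 sum_{x,y} mu(x) K(x,y) (f(x)-f(y)) (g(x)-g(y)), mu reversible for K."""
    n = len(mu)
    return 0.5 * sum(
        mu[x] * K[x][y] * (f[x] - f[y]) * (g[x] - g[y])
        for x in range(n)
        for y in range(n)
    )


# A birth-death kernel on three states; mu satisfies mu(x) K(x,y) = mu(y) K(y,x).
K = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
mu = [0.25, 0.5, 0.25]
f = [0.2, 1.0, 3.5]

lhs = dirichlet(mu, K, [math.log(v) for v in f], f)
rhs = 4.0 * dirichlet(mu, K, [math.sqrt(v) for v in f], [math.sqrt(v) for v in f])
print(lhs, rhs)
```

The comparison holds term by term: applying $M_l \le M_a$ to the pair $(\sqrt{u}, \sqrt{v})$ gives $(u - v)(\log u - \log v) \ge 4(\sqrt{u} - \sqrt{v})^2$.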
In the case of discrete-time Markov chains there is a simple analog of the formula (3.2.4), due to L. Miclo [Mic96], for which the proof is more subtle. We do not need to assume that K is reversible since this wouldn't make the proof any easier.
Theorem 3.2.11. Let $\mu$ be an invariant probability for $K$. If the operator $\mathrm{Id} - KK^*$ satisfies a Gross inequality in $L^2(\mu)$ with constant $\beta(KK^*)$, then there is exponential stabilization in the following form:
\[ I(mK^n \mid \mu) \le \Big( 1 - \frac{1}{\beta(KK^*)} \Big)^n I(m \mid \mu). \tag{3.2.8} \]
PROOF. We note, first of all, that the Dirichlet form in question has a simple interpretation:
\[ \mathrm{var}_\mu(f) - \mathrm{var}_\mu(K^* f) = \int \mathrm{var}_{K^*_y}(f)\,\mu(dy), \]
where $K^*_y$ denotes the probability $K^*(y, \cdot)$. It follows from this that an admissible constant in the Poincaré inequality, and a fortiori in the Gross inequality, should be larger than 1. Let $f$ be the density of $m$ with respect to $\mu$. (If this does not exist there is nothing to prove.) It is clearly sufficient to consider the case $n = 1$. In this case the density of $mK$ is $K^* f$. By letting $\xi = (u/v)^{1/2}$ in the convexity inequality $\xi \log(\xi) - \xi + 1 \ge 0$, we see that:
\[ u(\log(u) - \log(v)) \ge [u - v] + (\sqrt{u} - \sqrt{v})^2, \qquad u, v > 0. \]
By replacing $u$ by $f(x)$ and $v$ by $K^* f(y)$ and summing with respect to the kernel
\[ (\mu \otimes K^*)(y, x), \]
the term between brackets disappears because of the $K^*$-invariance of $\mu$ and we see that:
\[ \sum_{y \in E} \mu(y)\,K^*(f \log(f))(y) - \sum_{y \in E} \mu(y)\,K^* f(y) \log(K^* f(y)) \ge \sum_{y \in E,\, x \in E} \mu(y)\,K^*(y, x) \big( \sqrt{f(x)} - \sqrt{K^* f(y)} \big)^2, \]
which can also be written as
\[ I(m \mid \mu) - I(mK \mid \mu) \ge \sum_{y \in E} \mu(y)\,K^*_y\Big[ \big( \sqrt{f} - \sqrt{K^*_y f} \big)^2 \Big] \ge \sum_{y \in E} \mu(y)\,\mathrm{var}_{K^*_y}(\sqrt{f}) = \mathcal{E}_{KK^*}(\sqrt{f}, \sqrt{f}) \]
\[ \ge \frac{1}{\beta(KK^*)}\,I(f\mu \mid \mu) = \frac{1}{\beta(KK^*)}\,I(m \mid \mu), \]
where the last inequality is just the Gross inequality. □
Remark 3.2.12. The preceding calculations would be valid in the case of an arbitrary kernel but this generality is illusory, since the validity of the logarithmic Sobolev inequality in this case requires that $\mu$ be a finite barycenter of Dirac measures. The Poincaré inequality $\mathrm{var}_\mu(f) \le a\,\mathcal{E}_{KK^*}(f, f)$ does not have the same defect and, furthermore, it immediately implies that:
\[ \mathrm{var}_\mu(K^{*n} f) \le \Big( 1 - \frac{1}{a} \Big)^n \mathrm{var}_\mu(f). \]
These methods are very simple but the real problem is to find good Poincaré constants.
CHAPTER 4
Gibbs Measures

4.1. Generalities

The theory of Gibbs measures arose in the study of certain statistical mechanical models. These models can be applied each time the system can be described by configurations belonging to a product $F^S$, where $S$ is a "large" finite or denumerable set called the set of sites and the set $F$ is called the set of elementary states or spins. We denote the set of configurations by $E$.
Examples 4.1.1. We give three examples.
(1) The prototype of all the statistical mechanical models that will interest us is the Ising model. Its purpose is to explain the phenomenon of magnetization. In this case $S := \mathbb{Z}^d$ with $d = 1, 2, 3, \ldots$ and $F = \{-1, 1\}$. Physically $S$ represents the sites of atoms in a crystal where each atom has an elementary magnetic moment, called the quantum mechanical spin, which can take the value of either $+1$ or $-1$ at any site. An immediate modification to this setting is to take for $F$ a finite set, or the unit sphere $S^k$ in $\mathbb{R}^{k+1}$, or $F = \mathbb{N}$.
(2) In order to represent the set of possible images on a screen consisting of $n$ rows and $p$ columns, we take $F = \{0, 1\}$, where 0 codes a black pixel and 1 a white pixel, and $S = \{1, \ldots, n\} \times \{1, \ldots, p\}$. If, say, $p = 640$, $n = 350$, the cardinality of $E$ will be the huge number $2^{224000}$. If the pixel can have several shades of gray, we will take $F = \{0, 1, \ldots, b\}$. If the pixels can be one of three fundamental colors, $F$ will be the cube of the preceding set.
(3) The space of trajectories (configurations) of a discrete time process with values in $F$ is $F^{\mathbb{N}}$ or $F^{\mathbb{Z}}$.
4.1.1. Gibbs measures. We denote the finite subsets of $S$ by $\mathcal{P}_f(S)$.

Definition 4.1.2. An interaction on $E$ consists of a $\sigma$-finite measure $\lambda$ on $F$ and a family $V_L$, $L \in \mathcal{P}_f(S)$, of functions on $E$ such that for each $L$, $V_L(x)$ only depends on the restriction $x_L$ of $x$ to $L$. The $V_L$ are called interaction potentials. From now on we will only consider local interactions, i.e., for any $i \in S$, there only exist a finite number of $L \in \mathcal{P}_f(S)$ such that $i \in L$ and $V_L \ne 0$. We denote by $X_L$ the canonical projection of $E$ onto $E_L := F^L$ defined by $X_L(x) = x_L$ and we define the reference measure on $E_L$ to be $\lambda_L := \lambda^{\otimes L}$.
For any probability measure $P$ on $E$, we denote by $P_L$ the projection of $P$ on $E_L$. Let $L$ be a proper subset of $S$ and $\zeta \in E_{S-L}$. We introduce the conditional probability:
\[ P(\,\cdot \mid X_{S-L} = \zeta), \]
which is a probability measure on $E$ (depending in reality on the version that is chosen for the random probability $P(\,\cdot \mid X_{S-L})$) and its projection
\[ P_{X_L}(\,\cdot \mid X_{S-L} = \zeta) \]
on $E_L$, which we will denote simply by $P_L(\,\cdot \mid \zeta)$. It is the probability governing the configurations on $L$ given the exterior condition $\zeta$. In the examples given above, $F$ is a Polish space,¹ which implies that these conditional probabilities exist. See [DM75]. Alternatively we can simply postulate their existence.

Case where $S$ is finite. We define the energy of a configuration $x$ by:
\[ U(x) = \sum_{L \subset S} V_L(x). \]
The Boltzmann-Gibbs measure associated to the interaction $V$ with temperature $\beta^{-1}$, where $\beta$ is a strictly positive parameter, is the unique probability of the form:
\[ P(dx) = Z^{-1} \exp(-\beta U(x))\,\lambda_S(dx), \tag{4.1.1} \]
where $Z$ is the normalization constant:
\[ Z = \int \exp(-\beta U(x))\,\lambda_S(dx), \]
which is assumed to be finite. We consider for any $L \subset S$ and $z \in E$ the unique probability measure $n_{L,z}$ proportional to:
\[ \exp\Big( -\beta \sum_{\substack{A \in \mathcal{P}_f(S) \\ A \cap L \ne \emptyset}} V_A(x, z_{L^c}) \Big)\,\lambda_L(dx). \tag{4.1.2} \]
(For the moment, the notation $\mathcal{P}_f(S)$ is redundant.) It is easy to see that for any $L$ a proper subset of $S$, we have:
\[ P_L(dx \mid z_{L^c}) = n_L(dx, z) \quad P\text{-almost surely in } z. \]
Indeed, the conditional probability, conditioned by $\zeta = z_{L^c}$, is:
\[ \frac{Z^{-1} \exp(-\beta U(x, \zeta))}{\int Z^{-1} \exp(-\beta U(x, \zeta))\,\lambda_L(dx)}, \]
and if we express $U$ as a function of the $V_A$ we see that the terms corresponding to sets $A$ disjoint from $L$ cancel out in the numerator and denominator and things are simplified.

¹ A separable topological space for which the topology can be defined by a complete metric.
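The identity $P_L(dx \mid z_{L^c}) = n_L(dx, z)$ for finite $S$ can be verified by brute-force enumeration on a very small system. The sketch below is an illustration under assumptions that are not taken from the text: four sites arranged on a cycle, spins in $\{-1, +1\}$ with the uniform reference measure, a one-body potential $H x_i$ and a nearest-neighbour potential $-J x_i x_j$ (the sign conventions and parameter values are hypothetical), and hypothetical function names.

```python
import itertools
import math

SITES = (0, 1, 2, 3)
EDGES = ((0, 1), (1, 2), (2, 3), (3, 0))  # a 4-cycle, chosen for illustration
SPINS = (-1, 1)


def energy(x, H=0.3, J=1.0):
    """U(x) = sum_i H x_i - J sum_{i~j} x_i x_j (assumed potentials)."""
    return H * sum(x) - J * sum(x[i] * x[j] for i, j in EDGES)


def gibbs_measure(beta, H=0.3, J=1.0):
    """Measure (4.1.1); the uniform reference measure only contributes a constant factor."""
    w = {x: math.exp(-beta * energy(x, H, J)) for x in itertools.product(SPINS, repeat=len(SITES))}
    Z = sum(w.values())
    return {x: v / Z for x, v in w.items()}


def local_energy(L, x_L, z, H=0.3, J=1.0):
    """Sum of the potentials V_A with A meeting L, evaluated at (x_L, z outside L)."""
    conf = list(z)
    for site, spin in zip(L, x_L):
        conf[site] = spin
    one_body = H * sum(conf[i] for i in L)
    two_body = -J * sum(conf[i] * conf[j] for i, j in EDGES if i in L or j in L)
    return one_body + two_body


def local_specification(L, z, beta, H=0.3, J=1.0):
    """n_L(z, .): probability on F^L proportional to (4.1.2)."""
    w = {x_L: math.exp(-beta * local_energy(L, x_L, z, H, J)) for x_L in itertools.product(SPINS, repeat=len(L))}
    Z = sum(w.values())
    return {x_L: v / Z for x_L, v in w.items()}


def conditional_from_gibbs(P, L, z):
    """P_L(. | z outside L) computed directly from the full measure P."""
    outside = [i for i in SITES if i not in L]
    num = {tuple(x[i] for i in L): p for x, p in P.items() if all(x[i] == z[i] for i in outside)}
    total = sum(num.values())
    return {k: v / total for k, v in num.items()}


P = gibbs_measure(beta=0.8)
z = (1, -1, -1, 1)
print(conditional_from_gibbs(P, (1, 2), z))
print(local_specification((1, 2), z, beta=0.8))
```

The two printed dictionaries should agree up to rounding: the potentials attached to sets disjoint from $L$ cancel between numerator and denominator, exactly as described above.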
Case where $S$ is infinite. Formula (4.1.1) no longer makes sense but, for $L$ finite, the sum figuring in (4.1.2) only consists of a finite number of terms because of the locality of the interaction. This leads us to make the following:

Definition 4.1.3. The energy of a configuration $x$ in $L$, given that $z \in E$, is:
\[ U_L(x, z) := \sum_{\substack{A \in \mathcal{P}_f(S) \\ A \cap L \ne \emptyset}} V_A((x, z_{L^c})) \quad \text{for } x \in E_L, \tag{4.1.3} \]
where $L^c := S - L$.

It is useful to note that this energy only depends on the coordinates $z_{L^c}$ of $z$ that are outside the "box" $L$. We suppose for any finite subset $L$ of $S$, any condition $z$, and any $\beta > 0$ that the function $\exp(-\beta U_L(\cdot, z))$ is integrable with respect to $\lambda_L$. We denote its integral by $Z_{L,z}$, which is called the partition function associated to the energy $U_L$. We can associate a family of Boltzmann probability measures to the energy $U_L$, where the family is parameterized by a new parameter $T$ called the temperature. In the earlier chapters we set this equal to $1/2$.

Definition 4.1.4. The Boltzmann-Gibbs probability measure on $E_L$ with temperature $T = \beta^{-1}$, energy $U_{L,z}$, and measure $\lambda_L$ is the probability measure defined by:
\[ n_L(z, dx) = Z_{L,z}^{-1} \exp(-\beta U_L(x, z))\,\lambda_L(dx). \tag{4.1.4} \]
This formula defines for each L a kernel nL from E to FL.
Definition 4.1.5. The system of kernels nL, L E Pf(S), given by (4.1.4) is called the system of local specifications associated to the potential V.
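One can easily verify the following compatibility property. For any subset $L$ of a finite subset $\Lambda$ of $S$ and any configuration $z$, we have:
$$\int_{E^\Lambda} \varphi(x_L, x_{\Lambda - L})\,\Pi_\Lambda(z, (dx_L, dx_{\Lambda - L})) = \int_{E^\Lambda}\Big[\int_{E^L} \varphi(x_L, y_{\Lambda - L})\,\Pi_L((y_\Lambda, z_{S-\Lambda}), dx_L)\Big]\,\Pi_\Lambda(z, (dy_L, dy_{\Lambda - L})). \tag{4.1.5}$$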
This property will become more transparent when we introduce other notation later on. Taking into account the properties of the composition of two conditioning operations we introduce the following key definition.
Definition 4.1.6. We say a probability measure $P$ on $E$ is a Gibbs measure for the interaction $V$ and temperature $\beta^{-1}$ if for any $L \in P_f(S)$, we have:
$$P_L(\cdot \mid z_{L^c}) = \Pi_L(z, \cdot) \quad \text{for } P\text{-almost every } z. \tag{4.1.6}$$
We also say that $P$ is a Gibbs state. The equations (4.1.6) are often called the Dobrushin-Lanford-Ruelle (D.L.R.) equations. Physically these equations say that the part of the system that is inside $L$ is in thermodynamic equilibrium at the given temperature with the rest of the system. The function $V_{\{i\}}$ is the internal energy of the atom at site $i$ and $V_{\{i,j\}}$ is the interaction energy between the atoms at sites $i$ and $j$. In physics it is usually only these two types of potentials that are considered, i.e., $V_L = 0$ for $\mathrm{card}(L) > 2$. If $k \geq 3$ atoms interact and a new phenomenon is created, then these potentials would be added to the sum of the $C_k^2$ pairwise interactions and the self-energies. It should also be pointed out that these multiple potentials are useful in the study of images; see [Gu92].

For the Ising model, $\lambda$ is the uniform probability on $\{-1, +1\}$ and we describe the energies in the following way: we consider $\mathbb{Z}^d$ as a graph by defining the $2d$ neighbors of the point $i = (i_1, \dots, i_d)$ as the points that are obtained by adding $\pm 1$ to one of the coordinates of $i$. We set $V_{\{i\}}(x) := Hx_i$ and, writing $i \sim j$ if $i$ and $j$ are neighbors, we set the interaction potentials to be:
$$V_{\{i,j\}}(x) := \begin{cases} Jx_ix_j & \text{if } i \sim j,\\ 0 & \text{if not,}\end{cases}$$
where $H$ is an arbitrary real number corresponding to an exterior magnetic field, $J > 0$ is a fixed positive number, and $V_L \equiv 0$ for $\mathrm{card}(L) > 2$.

We now describe the behavior of the Ising model. Let $\mathcal{G}_{T,H}$ be the set of Gibbs measures with temperature $T = \beta^{-1}$ and with $H$ and $J$ fixed. This is a convex set of probabilities that for $d = 1$ consists of a unique measure: $\mathrm{card}(\mathcal{G}_{T,H}) = 1$ for all $T$ and $H$. When $d > 1$, there are the following possibilities:

For $H \neq 0$, $\mathrm{card}(\mathcal{G}_{T,H}) = 1$.
For $H = 0$, there exists a temperature $T_c$ such that $\mathrm{card}(\mathcal{G}_{T,H}) = 1$ if $T > T_c$, and $\mathrm{card}(\mathcal{G}_{T,H}) > 1$ if $T < T_c$.

These results correspond to the following properties of the ferromagnetic fields. In the presence of an exterior field $H \neq 0$, the elementary magnets are mostly aligned in the same direction as $H$, creating an induced magnetization. In the absence of this field, there are two cases. At high temperature, i.e., greater than the Curie temperature $T_c$, there exists only one possible "phase" with a mean zero magnetization: $\int X_i\,dP = 0$ for all $i$. For temperatures lower than the Curie temperature, one is able to define, in particular, two phases $P^+$ and $P^-$ with non-zero mean magnetization, respectively positive and negative. For example, $P^+$ is the weak limit of the corresponding Gibbs measures when $H \to 0^+$. Thus the field is able to conserve a non-trivial magnetization giving rise to a permanent magnet. For a detailed discussion, see [Sp74].
In this book, we will study the case of real spins of dimension one and the case of high temperature.
4.1.2. Markov fields and Gibbs measures. Being able to define a multi-dimensional Markov process is one of the most significant properties of a Gibbs state. The characterization of the corresponding potentials has been described by G. R. Grimmett in [Grm73]. In order to simplify things in this subsection, except in the case of Exercise 4.1.10, we will restrict ourselves to the case where the elementary state space $F$ is finite. Since the characterization we are looking for does not depend upon the parameter $\beta$, we set it equal to 1. We assume that $S$ has the structure of a non-oriented graph without loops. We will write $i \sim j$ if $i$ is related to $j$. If $L \subset S$, the boundary $\partial L$ of $L$ is the set
$$\partial L = \{\, i \in S - L \mid \exists j \in L,\ i \sim j \,\}.$$
The graph structure is given by a subset of $S \times S$, called the set of relations, which is assumed symmetric and does not intersect the diagonal. We assume that the graph is locally finite, i.e., any point only has a finite number of neighbors.
Definition 4.1.7. A subset A of S is a clique if any two distinct points of A are neighbors.
For any boundary condition $u \in E^{\partial L}$ we will abbreviate the probability measure $P_{X_L}(\cdot \mid X_{\partial L} = u)$ on $E^L$ by $P_L(\cdot \mid u)$.

Definition 4.1.8. We say that a probability measure $P$ on $F^S$ possesses the Markov property if for all finite subsets $L$ of $S$, we have for $P_{S-L}$-almost every $\zeta$:
$$P_L(\cdot \mid \zeta) = P_L(\cdot \mid \zeta_{\partial L}).$$
The process $(P, X_L, L \in P_f(S))$, indexed by the finite subsets of $S$, is called a Markov field.

Exercise 4.1.9. Show that a stationary finite Markov chain with state space $F$ defines a Markov field on $F^{\mathbb{Z}}$.

Exercise 4.1.10. We denote by $\gamma(m, \sigma^2)$ the Gaussian measure on $\mathbb{R}$ with mean $m$ and variance $\sigma^2$. The Ornstein-Uhlenbeck semi-group $N_t$ has the kernels $N_t(x, \cdot) = \gamma(xe^{-t}, \tfrac12(1 - e^{-2t}))$ as measures on $\mathbb{R}$, with its invariant probability measure equal to $\gamma(dy) = \gamma(0, 1/2)$. Fix $t$ and consider the Markov process $Y$ indexed by time $\mathbb{Z}$, with transition kernel $N_t$, and initial probability measure (distribution) $\gamma$. Calculate the law (joint distribution) of the random vector $(Y_1, Y_2, \dots, Y_n)$, and deduce from this that the law $P$ on $\mathbb{R}^{\mathbb{Z}}$ defined by this process is a Gibbs measure corresponding to the following interaction potentials with respect to Lebesgue measure and temperature 1:
$$V_{\{i\}}(x) = \coth(t)\,x_i^2, \qquad V_{\{i,i+1\}}(x) = -2\sinh^{-1}(t)\,x_ix_{i+1}.$$
Do this by first considering the conditional probabilities corresponding to
the interval I conditioned by data on the set J \ I where J is an interval properly containing I.
Definition 4.1.11. We say that a probability measure $P$ on $E$ is full if, for any $L \in P_f(S)$, the marginal probability $P_L$ on $E^L$ has the property that $P_L(\{y\}) > 0$ for all $y$ in $E^L$.

Naturally the existence of a full probability measure implies that $F$ must be at most denumerable.

Theorem 4.1.12. Let $(S, \sim)$ be a locally finite graph, $F$ a finite or denumerable set, $\lambda$ the counting measure on $F$, and $P$ a full probability measure on $F^S$. In order for $P$ to be Markovian, it is necessary and sufficient that $P$ be a Gibbs measure for an interaction $V$ such that $V_L \equiv 0$ for all subsets $L$ that are not cliques.

That this condition is sufficient follows immediately from the formulas (4.1.3) and (4.1.4). We will first prove its necessity when $S$ is finite.
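Lemma 4.1.13 (Möbius). Let $S$ be a set and $f$ and $g$ two mappings from $P_f(S)$ into an Abelian group $(G, +)$. Then the following two relations are equivalent:
$$\forall A \subset S \quad f(A) = \sum_{B \subset A} g(B), \tag{4.1.7}$$
$$\forall A \subset S \quad g(A) = \sum_{B \subset A} (-1)^{\mathrm{card}(A \setminus B)} f(B). \tag{4.1.8}$$
PROOF. When $C \subset A$, set $a(C, A) = \sum_{C \subset B \subset A} (-1)^{\mathrm{card}(A \setminus B)}$. When $\mathrm{card}(A \setminus C) = n$, there are $C_n^k$ subsets $B$ such that $\mathrm{card}(A \setminus B) = k$ in this sum. Thus:
$$a(C, A) = \sum_{k=0}^{n} (-1)^k C_n^k = \begin{cases} 1 & \text{if } n = 0, \text{ i.e., } A = C,\\ (1-1)^n = 0 & \text{if } n \neq 0, \text{ i.e., } A \neq C.\end{cases}$$
Assuming that (4.1.7) is true, we have:
$$\sum_{B \subset A} (-1)^{\mathrm{card}(A \setminus B)} f(B) = \sum_{B \subset A}\ \sum_{C \subset B} (-1)^{\mathrm{card}(A \setminus B)} g(C) = \sum_{C \subset A} a(C, A)\,g(C) = g(A),$$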
which is (4.1.8). We can prove that (4.1.8) implies (4.1.7) by strictly parallel reasoning. Alternatively we make the following remark. Let $f$ be given; then by recurrence on $\mathrm{card}(A)$, we can construct a unique function $\phi$ satisfying $f(A) = \sum_{B \subset A} \phi(B)$. Then following the first part of the proof, we will have:
$$\phi(A) = \sum_{B \subset A} (-1)^{\mathrm{card}(A \setminus B)} f(B) = g(A).$$
□
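The equivalence of (4.1.7) and (4.1.8) is easy to test numerically on a small ground set. The sketch below is only an illustration under stated assumptions: the group $G$ is taken to be the integers, the function $g$ is drawn at random, and the helper names (`subsets`, `zeta`, `mobius`, `round_trip_error`) are hypothetical.

```python
import itertools
import random

def subsets(A):
    """All subsets of the finite set A, as frozensets."""
    A = tuple(A)
    return [frozenset(c) for r in range(len(A) + 1)
            for c in itertools.combinations(A, r)]

def zeta(g, S):
    """f(A) = sum of g(B) over B subset of A, as in (4.1.7)."""
    return {A: sum(g[B] for B in subsets(A)) for A in subsets(S)}

def mobius(f, S):
    """g(A) = sum of (-1)^card(A minus B) f(B) over B subset of A, as in (4.1.8)."""
    return {A: sum((-1) ** len(A - B) * f[B] for B in subsets(A))
            for A in subsets(S)}

def round_trip_error(n=4, seed=0):
    """Apply (4.1.7) then (4.1.8) to a random integer-valued g and compare."""
    rng = random.Random(seed)
    S = range(n)
    g = {A: rng.randint(-5, 5) for A in subsets(S)}
    recovered = mobius(zeta(g, S), S)
    return max(abs(recovered[A] - g[A]) for A in g)
```

Since the arithmetic is done in the integers, `round_trip_error()` is exactly zero whenever the inversion formula holds, with no rounding tolerance involved.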
Exercise 4.1.14. Let $I$ be an ordered finite set and $M$ a matrix indexed by $I \times I$ possessing the following property (T): $M_{ij} \neq 0$ only if $i \leq j$. Show that $M$ is invertible if and only if $M_{ii} \neq 0$ for all $i$ and in this case $M^{-1}$ possesses property (T). Interpret the Möbius lemma in this setting when $G = \mathbb{R}$.

Completion of the proof of Theorem 4.1.12. We fix a state, denoted by 0, in $F$ and denote by $0_L$ the configuration that is identically equal to 0 on $L$. Since $P$ is full the following conditional probabilities are uniquely defined and are not equal to zero. For $L \subset S$, $L \neq S$, $y \in E$, we set:
$$W_L(y) = -\log\left(\frac{P_L(\{y_L\} \mid 0_{S \setminus L})}{P_L(\{0_L\} \mid 0_{S \setminus L})}\right), \qquad W_S(y) = -\log\left(\frac{P(\{y\})}{P(\{0_S\})}\right).$$
We have thus defined a mapping $L \mapsto W_L$ of $P(S)$ into the additive group of functions on $E$. By the Möbius lemma, there exists a family of functions $V$ such that $W_L = \sum_{A \subset L} V_A$ for all $L$. Since $W_L$ only depends upon variables indexed by $L$, the Möbius inversion formula shows that the same is true for $V_L$. The probability measure $P$ is a Gibbs measure for $V$ since $P(\{y\})$ is proportional to $\exp\big(-\sum_{A \subset S} V_A(y)\big)$.
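Let $A$ be a set which is not a clique. We will show that $V_A = 0$. Let $i$ and $j$ be two distinct non-neighboring points belonging to $A$. We write the Möbius formula in the following manner:
$$V_A = \sum_{B \subset A \setminus \{i,j\}} (-1)^{\mathrm{card}(A \setminus B)}\big[W_B + W_{B \cup \{i,j\}} - W_{B \cup \{i\}} - W_{B \cup \{j\}}\big].$$
We will show for any $B$ that the sum of the four terms between the brackets is zero. Set $C := S - (B \cup \{i,j\})$. All of the probability measures that we will consider are conditioned by $\{X_C = 0_C\}$ and can be expressed with the aid of:
$$Q := P(\cdot \mid X_C = 0_C \text{ and } X_j = 0).$$
We denote by $B$ the event $\{X_B = y_B\}$, by $B_0$ the event $\{X_B = 0_B\}$, by $I$ the event $\{X_i = y_i\}$, and by $I_0$ the event $\{X_i = 0\}$. With these notations we can write:
$$W_B(y) = -\log\big(Q(B \text{ and } I_0)/Q(B_0 \text{ and } I_0)\big) = -\log\big(Q(B \mid I_0)/Q(B_0 \mid I_0)\big),$$
$$W_{B \cup \{i\}}(y) = -\log\big(Q(B \text{ and } I)/Q(B_0 \text{ and } I_0)\big),$$
which by taking the difference gives:
$$W_{B \cup \{i\}}(y) - W_B(y) = -\log\big(Q(B \text{ and } I)/Q(B \text{ and } I_0)\big) = -\log\big(Q(I \mid B)/Q(I_0 \mid B)\big)$$
$$= -\log\left(\frac{P(X_i = y_i \mid X_B = y_B \text{ and } X_j = 0 \text{ and } X_C = 0_C)}{P(X_i = 0 \mid X_B = y_B \text{ and } X_j = 0 \text{ and } X_C = 0_C)}\right).$$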
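By replacing $B$ by $B \cup \{j\}$ and $Q$ by $P(\cdot \mid X_C = 0_C)$ in the preceding calculations we obtain in the same way the following:
$$W_{B \cup \{i,j\}}(y) - W_{B \cup \{j\}}(y) = -\log\left(\frac{P(X_i = y_i \mid X_B = y_B \text{ and } X_j = y_j \text{ and } X_C = 0_C)}{P(X_i = 0 \mid X_B = y_B \text{ and } X_j = y_j \text{ and } X_C = 0_C)}\right).$$
Since $j \notin \partial\{i\}$, the Markov property implies that the measures $P_i(\cdot \mid (y_B, y_j, 0_C))$ and $P_i(\cdot \mid (y_B, 0, 0_C))$ are equal, which gives the relation
$$W_{B \cup \{i,j\}}(y) - W_{B \cup \{j\}}(y) = W_{B \cup \{i\}}(y) - W_B(y)$$
and proves that $V_A$ equals zero.

We now consider the case where $S$ is infinite. The formula
$$W_L(y) = -\log\left(\frac{P_L(\{y_L\} \mid X_{\partial L} = 0_{\partial L})}{P_L(\{0_L\} \mid X_{\partial L} = 0_{\partial L})}\right)$$
can be used in the same way as when $S$ is finite because of the Markov property. Since the boundary of $L$ is finite and $P$ is full we are allowed to utilize conditional expectations that are then defined unambiguously. We define the functions $V_L$ in terms of the $W_L$ by the Möbius formula and we consider, for any finite subset $\Lambda$ of $S$, the probability measure $P^\Lambda$ on $E^\Lambda$ obtained by conditioning $P$ by $\{X_{\partial\Lambda} = 0_{\partial\Lambda}\}$. We can then apply the preceding proof to $P^\Lambda$, which shows for $L \subset \Lambda$ that the potential $V_L$ is equal to zero if $L$ is not a clique. Let $\Lambda$ be a finite subset of $S$ such that $L \cup \partial L \subset \Lambda$ and set $\Lambda' := \Lambda \cup \partial\Lambda$. By the Markov property and the composition of conditional expectations we have for $P_{S \setminus L}$-almost every $\zeta$:
$$P_L(\cdot \mid X_{S \setminus L} = \zeta) = P_L(\cdot \mid X_{\Lambda' \setminus L} = \zeta_{\Lambda' \setminus L}) = P_L(\cdot \mid X_{\partial\Lambda} = 0_{\partial\Lambda} \text{ and } X_{\Lambda \setminus L} = \zeta_{\Lambda \setminus L}).$$
In other words, we can simply condition the measure $P^\Lambda$ by the condition $X_{\Lambda \setminus L} = \zeta_{\Lambda \setminus L}$ in order to calculate $P_L(\cdot \mid X_{S \setminus L} = \zeta)$. Since the interaction potentials for $P^\Lambda$ are the $V_A$, we find the proportionality relation:
$$P_L(\{x\} \mid X_{S \setminus L} = \zeta) \propto \exp\Big(-\sum_{A \in P_f(S),\ A \cap L \neq \emptyset} V_A((x, \zeta))\Big),$$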
which completes the proof.
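The finite case of the argument can be illustrated on the smallest non-trivial example. The sketch below makes several assumptions that are not in the text: $S = \{0, 1, 2\}$ is the path graph $0 \sim 1 \sim 2$, $F = \{0, 1\}$, and $P$ is the law of three steps of a two-state Markov chain with hypothetical initial law `MU` and transition matrix `TRANS`. It computes the functions $W_L$, recovers the potentials $V_A$ by the Möbius formula, and reports their size on the sets that are not cliques.

```python
import itertools
import math

STATES = (0, 1)
SITES = (0, 1, 2)                                   # path graph 0 ~ 1 ~ 2
MU = {0: 0.6, 1: 0.4}                               # hypothetical initial law
TRANS = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # hypothetical transitions

def prob(y):
    """P({y}) for the three-step chain; strictly positive, so P is full."""
    return MU[y[0]] * TRANS[y[0]][y[1]] * TRANS[y[1]][y[2]]

def W(L, y):
    """W_L(y) = -log( P(y on L, 0 elsewhere) / P(0 everywhere) )."""
    masked = tuple(y[i] if i in L else 0 for i in SITES)
    return -math.log(prob(masked) / prob((0, 0, 0)))

def subsets(A):
    A = tuple(A)
    return [frozenset(c) for r in range(len(A) + 1)
            for c in itertools.combinations(A, r)]

def V(A, y):
    """Potential obtained from the W_B, B subset of A, by Moebius inversion."""
    return sum((-1) ** len(A - B) * W(B, y) for B in subsets(A))

def max_abs_potential(A):
    """Largest value of |V_A(y)| over all configurations y."""
    A = frozenset(A)
    return max(abs(V(A, y)) for y in itertools.product(STATES, repeat=len(SITES)))
```

In this example the sets $\{0, 2\}$ and $\{0, 1, 2\}$ are not cliques and `max_abs_potential` vanishes on them up to rounding, while it is clearly non-zero on the edge $\{0, 1\}$.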
4.2. An Ising model with real spin

From now on the set of sites $S$ will be $\mathbb{Z}^d$ and the space of elementary states $F$ will be $\mathbb{R}$ with Lebesgue measure $\lambda(dx) = dx$. Thus $E = \mathbb{R}^{\mathbb{Z}^d}$. We set $|i| := \sup_s |i_s|$.

Hypotheses 4.2.1. We assume the potentials satisfy the following:
(1) The pairwise potentials have the form $V_{\{i,j\}}(x) = -J(i - j)x_ix_j$ for $i \neq j$, where $J$ is a symmetric function, $J(-i) = J(i)$, with compact support in $\mathbb{Z}^d$.²
(2) The self-interactions are of the form $V_{\{i\}}(x) = V(x_i)$ where $V$ is a non-constant polynomial that is bounded below on $\mathbb{R}$.

Thus the interactions are translation invariant and of finite range. More precisely we will assume that $J(i) = 0$ for $|i| > R$. The self-interaction $V$ is thus a polynomial of degree $2m$ with $m \geq 1$ and with the dominant coefficient strictly positive. This hypothesis is traditional and is already made for models of Euclidean fields when the models have been simplified by discretizing space and time; see [GRS75]. But, in fact, only a certain number of growth conditions of the self-interactions at infinity are necessary. These conditions will be satisfied by the hypothesis that $V$ is a polynomial. Similarly the form of the pairwise interactions can be generalized. However, with our assumptions, the energy in the region $L$ under the condition $z$ is:
$$U_{L,z}(x) = \sum_{i \in L} V(x_i) - \sum_{(i,j) \in L^2} \tfrac12 J(i - j)x_ix_j - \sum_{i \in L,\ j \notin L} J(i - j)x_iz_j.$$
We set the temperature $T = \beta^{-1}$ equal to 1/2 (that is, $\beta = 2$) so that the Boltzmann measures are the same as in the study of the Langevin equation. Our goal is to study the problems of uniqueness that are analogous to the case $T > T_c$ in the Ising model. If we take the measure $Z^{-1}\exp(-2V(x))\,dx$ as the basic measure on $\mathbb{R}$, the analogue of the uniform measure on $\{-1, +1\}$ for the Ising model, the self-interactions will be absorbed in this measure and only the pairwise interactions will remain. The condition of high temperature corresponds then to the case where the coefficients $J$ are small, i.e., the condition of weak interactions. But for now we will study the problems of existence in the general setting.
Exercise 4.2.2. Let $\lambda$ be Lebesgue measure on $F = \mathbb{R}$, $L$ a finite subset of $\mathbb{Z}^2$, and $P$ a real polynomial that is bounded below and of degree at least equal to 4. We set $V_{\{i\}}(x) := P(x_i)$ for any $i$ and $V_{\{i,j\}}(x) := \tfrac12(x_i - x_j)^2$ for all neighbors $i$ and $j$ in $\mathbb{Z}^2$, with the other interactions $V_L$ set equal to zero. Let $\Pi^\beta_{L,z}$ be the Gibbs measure on $\mathbb{R}^L$ with temperature $1/\beta$ that is associated to $V$ with the exterior condition $z$. $\Delta$ will denote the discrete Laplacian: $\Delta x(i) = \sum_{j \sim i}(x(j) - x(i))$. Prove for $\beta$ tending to $+\infty$ that the measure $\Pi^\beta_{L,z}$ is concentrated in the set $D$ of solutions of the following non-linear Dirichlet problem: find the functions $x$ from $L \cup \partial L$ into $\mathbb{R}$ such that
$$\begin{cases} x|_{\partial L} = z|_{\partial L},\\ [-\Delta x + P'(x)](i) = 0 \text{ for all } i \in L,\end{cases}$$
in other words, that $\Pi^\beta_{L,z}(D)$ tends to 1.

² We do not assume that $J$ has a constant sign. In the case where there is a constant sign we say the model is ferromagnetic if $J > 0$ and anti-ferromagnetic if $J < 0$.
4.2.1. Existence of Gibbs measures. From now on $L$ will always denote a finite subset of $S$. The construction of a Gibbs measure on $E$ will be accomplished by taking a limit point of the Gibbs measures on $E^L$ when $L$ tends to $\mathbb{Z}^d$. For this it is convenient to consider, for any $z \in E$, the probability measure on $E$, denoted by $\pi_L(z, dx)$, governing the configurations $x$ for which the value is frozen at $z$ outside of $L$, i.e., $x_{L^c} = z_{L^c}$, and where the restriction $x_L$ to $L$ is a random variable with distribution $\Pi_L(z, dx_L)$. Thus we have: $\pi_{L,z} = \Pi_{L,z} \otimes \delta_{z_{L^c}}$. When $z$ varies this family of probability measures defines a kernel on $E$ that we denote by $\pi_L$. We have:
$$\pi_L\psi(z) = \int_{E^L} \psi(y, z_{L^c})\,\Pi_L(z, dy),$$
for any bounded measurable function $\psi$ on $E$, and the compatibility condition (4.1.5) is written in terms of the composition of kernels:
$$\forall M \subset L \quad \pi_L = \pi_L\pi_M. \tag{4.2.1}$$
Similarly a probability measure $\mu$ on $E$ is a Gibbs measure when:
$$\forall L \in P_f(S) \quad \mu\pi_L = \mu. \tag{D.L.R.}$$
For any subset $\Lambda$ of $S$ we denote by $\mathcal{F}_\Lambda$ the $\sigma$-field of subsets of $E$ generated by the projection $X_\Lambda : E \to E^\Lambda$. Thus $f$ is measurable with respect to $\mathcal{F}_\Lambda$ if and only if $f = g \circ X_\Lambda$ where $g$ is a measurable function on $E^\Lambda$. The (D.L.R.) equations can be written as follows:
$$\pi_L(\psi) = \mathbb{E}(\psi \mid \mathcal{F}_{L^c}).$$
We set $p(i) := |J(i)|$ and $\sigma := \sum_{i \in S} p(i)$, agreeing that $p(0) = 0$.

Definition 4.2.3. We will denote by $S'$ the space of configurations with at most polynomial growth at infinity, i.e., $x \in S' \iff \exists n, a,\ \forall i,\ |x(i)| \leq a + |i|^n$.

Lemma 4.2.4. Suppose that $\sigma' > \sigma$. Then there exists a positive, symmetric, and summable function $\alpha$ on $\mathbb{Z}^d$ (which even has exponential decay) that is superharmonic with respect to the convolution kernel $\sigma'^{-1}p$, that is to say: $\alpha * p \leq \sigma'\alpha$.
PROOF. We utilize the following convergent series in the Banach algebra $\ell^1(\mathbb{Z}^d)$:
$$\alpha = \sum_{k=0}^{\infty} \sigma'^{-k}\,p^{*k}. \tag{4.2.2}$$
The property of being superharmonic follows immediately. From the fact that $p$ has finite support we can easily check the exponential decay since the support of $p^{*k}$ is contained in the set $\{i : |i| \leq kR\}$. □

Theorem 4.2.5. If $V$ is of degree strictly larger than two or if its second derivative strictly dominates $\sigma$, there exists at least one Gibbs measure on $E$ supported by $S'$.

PROOF. It is necessary to verify the existence of the measures $\Pi_{L,z}$, that is to say, the integrability of $\exp(-2U_{L,z})$ on $E^L$. For $V(x) = \tfrac{a}{2}x^2$, which has second derivative equal to $a$, and $z = 0$, the energy $U_{L,0}$ is the quadratic form
$$\sum_{i \in L} \frac{a}{2}x_i^2 - \frac12\sum_{i,j \in L} J(i - j)x_ix_j.$$
The condition $a > \sigma$ says that the matrix of this quadratic form has a dominant diagonal. Hadamard's Lemma implies that it is positive definite and that $\Pi_{L,0}$ exists and is a centered Gaussian measure. If we take a polynomial $V$ of second degree but otherwise arbitrary and the exterior condition $z$ also arbitrary we add a linear form to $U_{L,0}$ to obtain a Gaussian measure that is no longer centered. If $V$ is a polynomial of degree at least 4, then for any $a > \sigma$, we are able to find $b$ such that for any $x \in \mathbb{R}$, we have $V(x) \geq \tfrac{a}{2}x^2 - b$. The function $U_{L,z}$ dominates, up to a constant, the Gaussian case considered earlier and thus $\exp(-2U_{L,z})$ is integrable. Before continuing we will need some preliminary estimates.
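The weight of Lemma 4.2.4, which is used repeatedly in the estimates that follow, can be computed explicitly in a simple case. The sketch below is an illustration only: it assumes $d = 1$, a hypothetical nearest-neighbour coupling with $p(\pm 1) = 0.3$ (so that the sum of $p$ is 0.6), the value 1.0 for the constant of the lemma, and a truncation of the series (4.2.2) after a finite number of terms; the function names are invented for the example.

```python
def convolve(a, b):
    """Convolution of two finitely supported functions on Z, stored as dicts."""
    out = {}
    for i, ai in a.items():
        for j, bj in b.items():
            out[i + j] = out.get(i + j, 0.0) + ai * bj
    return out

def weight(p, sigma_prime, terms=80):
    """Partial sum of the series (4.2.2): sum over k of sigma'^(-k) p^{*k}."""
    alpha = {0: 1.0}
    power = {0: 1.0}                  # p^{*0} is the unit mass at 0
    for k in range(1, terms + 1):
        power = convolve(power, p)    # now power = p^{*k}
        for i, v in power.items():
            alpha[i] = alpha.get(i, 0.0) + v / sigma_prime ** k
    return alpha

def superharmonic_gap(alpha, p, sigma_prime):
    """Largest value of (alpha * p)(i) - sigma' alpha(i); should not be positive,
    up to the small error introduced by truncating the series."""
    ap = convolve(alpha, p)
    sites = set(ap) | set(alpha)
    return max(ap.get(i, 0.0) - sigma_prime * alpha.get(i, 0.0) for i in sites)
```

With `p = {-1: 0.3, 1: 0.3}` and `sigma_prime = 1.0`, the total mass of `weight(p, 1.0)` is close to 1.0 / (1.0 - 0.6) = 2.5, the values decay geometrically away from the origin, and `superharmonic_gap` is non-positive up to truncation error.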
Lemma 4.2.6. There exist constants $C_1$ and $C_2$ such that for any $z \in E$, $L \in P_f(\mathbb{Z}^d)$, we have:
$$\int \sum_{i \in \mathbb{Z}^d} \alpha(i)x_i^2\,\pi_L(z, dx) \leq C_1 + C_2\sum_{j \notin L} \alpha(j)z_j^2. \tag{4.2.3}$$
PROOF. By virtue of the hypotheses in Theorem 4.2.5, we can find constants $a > \sigma$ and $b$ such that:
$$\forall x \in \mathbb{R} \quad xV'(x) \geq ax^2 - b, \qquad \forall i \neq j \quad x_i\,\partial_iV_{\{i,j\}}(x_i, x_j) \geq -\tfrac12 p(i - j)(x_i^2 + x_j^2), \tag{4.2.4}$$
where $\partial_i := \partial/\partial x_i$. The second of these relations will be, in any case, evident for the interactions that we will consider. By integrating by parts we have for any $i \in L$ that:
$$Z_{L,z} = \int_{\mathbb{R}^L} 2x_i\,\partial_iU_{L,z}(x)\,e^{-2U_{L,z}(x)}\,dx.$$
Thus utilizing (4.2.4) and dividing by $Z_{L,z}$ we obtain:
$$\int \Big[2ax_i^2 - 2b - \sum_{j \in L} p(j - i)(x_i^2 + x_j^2) - \sum_{j \notin L} p(j - i)(x_i^2 + z_j^2)\Big]\,\pi_L(z, dx) \leq 1.$$
We now apply Lemma 4.2.4 for a value of $\sigma'$ such that $\sigma < \sigma' < a$. By summing the preceding inequality with respect to $\alpha(i)$, we obtain:
$$(2a - \sigma - \sigma')\int \sum_{i \in L} \alpha(i)x_i^2\,\pi_L(z, dx) \leq (1 + 2b)\sum_{i \in L} \alpha(i) + \sigma'\sum_{j \notin L} \alpha(j)z_j^2,$$
since $\sum_{i \in S} \alpha(i)p(j - i) \leq \sigma'\alpha(j)$. It suffices to take into account that $x_i = z_i$ for $i \notin L$ in order to obtain the inequality in the form announced.

Corollary 4.2.7. For any Gibbs measure $\mu$ supported by $S'$ and $i \in \mathbb{Z}^d$, we have:
$$\int_E x_i^2\,\mu(dx) \leq \frac{C_1}{\alpha(0)}.$$
PROOF. We argue as in [BHK82]. Let $(L_n)$ be an increasing sequence of boxes tending to $\mathbb{Z}^d$. Then the sequence $\sum_{j \notin L_n} \alpha(j)z_j^2$ tends towards 0 for almost all $z$ with respect to the measure $\mu$ and the relation (4.2.3) then shows that for any number $h$:
$$\limsup_n \int (x_0^2 \wedge h)\,\pi_{L_n}(z, dx) \leq \frac{C_1}{\alpha(0)}.$$
Since the function $x_0^2 \wedge h$ is bounded above, we can apply Fatou's Lemma to the integral with respect to $\mu(dz)$ and thus, taking into account the (D.L.R.) equations, we obtain:
$$\int (z_0^2 \wedge h)\,\mu(dz) = \limsup_n \iint (x_0^2 \wedge h)\,\pi_{L_n}(z, dx)\,\mu(dz) \leq \int \mu(dz)\,\limsup_n \int (x_0^2 \wedge h)\,\pi_{L_n}(z, dx) \leq \frac{C_1}{\alpha(0)}.$$
We get the same estimates at the site $i$ by considering the translated weights $\tilde\alpha(k) = \alpha(k - i)$. Finally, by letting $h$ tend to $+\infty$ and applying Beppo Levi's Theorem, we obtain the desired result.

Completion of the proof of Theorem 4.2.5. Let $K$ be the set of probability measures $\mu$ on $E$ that satisfy:
$$\int_E \phi\,d\mu \leq C_1, \qquad \phi(x) = \sum_{i \in \mathbb{Z}^d} \alpha(i)x_i^2.$$
The preceding lemma implies that the measures $\nu_L = \pi_L(0, \cdot)$ all belong to $K$. We will show that $K$ is compact in $\mathcal{M}_1(E)$ for the topology of weak convergence. Since $E$ is a complete separable metric space it suffices to show that $K$ is closed and satisfies the tightness criterion of Prokhorov, i.e., for
any $\varepsilon > 0$ there exists a compact set $H_\varepsilon \subset E$ such that for every $\mu \in K$, we have $\mu(E \setminus H_\varepsilon) \leq \varepsilon$. If we set
$$H_\varepsilon := \{x \in E;\ \phi(x) \leq C_1/\varepsilon\},$$
Markov's inequality implies that every probability measure $\mu \in K$ satisfies $\mu(E - H_\varepsilon) \leq \varepsilon$. But $H_\varepsilon$ is closed because the function $\phi$ from $E$ to $\mathbb{R}_+$ is lower semi-continuous as the sum of positive continuous functions, and $H_\varepsilon$ is bounded because for $x \in H_\varepsilon$, we have $|x_i| \leq (C_1/(\varepsilon\alpha(i)))^{1/2}$; thus the Prokhorov criterion is satisfied. Finally $K$ itself is closed since the mapping $\mu \mapsto \mu(\phi)$ is lower semi-continuous. To see the latter, note that $\mu \mapsto \mu(\phi)$ is, after Beppo Levi's Theorem, the upper envelope of the functions $\mu \mapsto \mu(\phi_n)$, where $\phi_n$ is the continuous function $n \wedge \sum_{|i| \leq n} \alpha(i)x_i^2$.

By compactness there exists a sequence $(L_n)$ of finite subsets of $\mathbb{Z}^d$ for which the union is $\mathbb{Z}^d$ and such that $\nu_{L_n}$ converges weakly to a measure $\nu$. We will show that $\nu$ is a Gibbs measure. For any fixed finite subset $M$ of $S$, we can write $\pi_{L_n}\pi_M = \pi_{L_n}$ as soon as $L_n \supset M$ and, in particular, $\nu_{L_n}\pi_M = \nu_{L_n}$. It is easy to see that for any bounded continuous function $\varphi$ on $E$ depending only on coordinates in $M$, the function $\pi_M\varphi$ is bounded and continuous on $E$. Thus by weak convergence, we obtain $\nu\pi_M\varphi = \nu\varphi$ and a monotone class argument implies $\nu = \nu\pi_M$ for any $M$.

It remains to show that $\nu$ is supported by $S'$. The equation (4.2.3) implies for any $L$ that:
$$\alpha(0)\int x_0^2\,\pi_L(0, dx) \leq C_1.$$
This inequality applied to the translated set $L - i$, where $i \in \mathbb{Z}^d$, is equivalent, since the interaction potentials are invariant, to:
$$\alpha(0)\int x_i^2\,\pi_L(0, dx) \leq C_1,$$
for any $L$ and $i$. Since the function $x \mapsto x_i^2$ is positive and lower semicontinuous on $E$ (it is even continuous, although not bounded), it follows by weak convergence that:
$$\int x_i^2\,\nu(dx) \leq \frac{C_1}{\alpha(0)}.$$
Let $\beta$ be a strictly positive summable function on $\mathbb{Z}^d$ decreasing as an inverse polynomial, for example, $\beta(i) = (1 + |i|)^{-(d+1)}$, and let $\psi(x) = \sum_i \beta(i)x_i^2$. The last inequality implies that:
$$\nu(\psi) \leq \alpha^{-1}(0)\,C_1\sum_i \beta(i).$$
Thus the function $\psi$ is $\nu$-almost surely finite, and for almost any configuration $x$, there exists a constant $c_x$ such that for any $i$, we have $|x_i| \leq c_x(1 + |i|)^{(d+1)}$.
Exercise 4.2.8. Show that the kernel $\pi_L$ is a Feller kernel, that is to say for any bounded continuous function $f$ on $E$, the function $\pi_Lf$ is bounded and continuous.
4.2.2. The Glauber-Langevin diffusion process. In finite dimensions, the Boltzmann measures appear as invariant measures of Kolmogorov processes and it is natural to construct their analogs on $E = \mathbb{R}^{\mathbb{Z}^d}$. These will be variants of the stochastic evolution of the Ising models that were introduced by Glauber. The calculations are close to those developed in [S-S80]. Let $(B_{i,t}, \mathcal{F}_t)$ be a family of real Brownian motions that start at 0, filtered by the same family $(\mathcal{F}_t)$, and independent in the sense that the corresponding random variables $B_i$ with values in $C(\mathbb{R}_+)$ are independent. We will denote by $B_t^L$ the $E^L$-valued Brownian motion with real-valued Brownian motion coordinates $(B_{i,t}, i \in L)$. Under the Main Hypotheses 4.2.1, it is easy to see that for any $L$ and $z$ one can find $a$ and $b$ such that:
$$x \cdot \nabla U_{L,z}(x) \geq a|x|^2 - b. \tag{4.2.5}$$
This inequality, even for $a < 0$, allows us to construct for any $x \in E$ and $z \in E$, an $E^L$-valued process $(X_t^{L,z,x}, t \in \mathbb{R}_+)$ that is a solution of the Langevin equation:
$$X_t^{L,z,x} = x_L + B_t^L - \int_0^t \nabla U_{L,z}(X_s^{L,z,x})\,ds. \tag{4.2.6}$$
By completing this definition by setting $X_{j,t}^{L,z,x} := z_j$ if $j \notin L$, we obtain an $E$-valued process that is frozen outside of $L$. It is important to remark that only the coordinates of $z$ outside of $L$ and the coordinates of $x$ inside $L$ intervene in the definition of the process. It is possible to control, in mean and uniformly in time, the $\ell^2(\alpha)$-norm of the process where $\alpha$ is the weight of Lemma 4.2.4. The sum $\sum_{i \in S} \alpha(i) = \sigma'/(\sigma' - \sigma)$ will be denoted by $|\alpha|$. For convenience, an expression such as $\alpha * z^2_{L^c}$ will designate convolution with the configuration that coincides with $z^2$ outside $L$ and that is zero inside $L$.
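To make the frozen-boundary dynamics (4.2.6) concrete, here is a minimal simulation sketch. It is not taken from the text: it assumes $d = 1$, an Euler-Maruyama time discretization (a swapped-in numerical scheme), a hypothetical self-interaction with derivative `dV`, a coupling dictionary `J` that maps an offset $k \neq 0$ to $J(k)$, and it treats exterior sites for which no value of $z$ is supplied as being at 0.

```python
import math
import random

def grad_energy(i, y, J, dV):
    """Partial derivative of U_{L,z} in x_i at the glued configuration y
    (x inside the box, z outside): V'(y_i) - sum over k of J(k) y_{i+k}."""
    coupling = sum(Jk * y.get(i + k, 0.0) for k, Jk in J.items())
    return dV(y[i]) - coupling

def simulate(box, z, x0, J, dV, t_max=1.0, dt=1e-3, seed=0):
    """Euler-Maruyama approximation of (4.2.6); sites outside the box stay at z."""
    rng = random.Random(seed)
    y = dict(z)                                  # frozen exterior condition
    y.update({i: x0[i] for i in box})            # initial condition inside the box
    for _ in range(int(round(t_max / dt))):
        # Drift is evaluated at the current state before any site is updated.
        drift = {i: -grad_energy(i, y, J, dV) for i in box}
        for i in box:
            y[i] += drift[i] * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return y
```

For instance, with `box = range(5)`, `z = {-1: 1.0, 5: -1.0}`, `J = {-1: 0.2, 1: 0.2}` and `dV = lambda x: x ** 3` (the derivative of the hypothetical potential $x^4/4$), the returned dictionary contains the simulated values on the box and the unchanged values of $z$ at the sites $-1$ and $5$. Because the driving noise is a standard Brownian motion, the target of this dynamics is the measure proportional to $\exp(-2U_{L,z})$, in line with the normalization used in this section.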
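Lemma 4.2.9. There exist constants $k$ and $c$ such that for any $t$, $L$, $z \in S'$, and $x \in S'$ we have:
$$\mathbb{E}\Big(\sum_{i \in S} \alpha(i)\sup_{0 \leq s \leq t}(X_{i,s}^{L,z,x})^2\Big) \leq e^{kt}\Big(\sum_{j \in L} \alpha(j)(x_j^2 + c) + \sum_{j \notin L} \alpha(j)z_j^2\Big), \tag{4.2.7}$$
$$\alpha(0)\,\mathbb{E}\Big(\sup_{0 \leq s \leq t}(X_{i,s}^{L,z,x})^2\Big) \leq e^{kt}\big([\alpha * x^2](i) + c|\alpha| + [\alpha * z^2_{L^c}](i)\big). \tag{4.2.8}$$
PROOF. We set $Y(t) := X_t^{L,z,x}$ and $u_j(t) := \mathbb{E}\big(\sup_{0 \leq s \leq t} Y_j^2(s)\big)$. Itô's formula says that for $i \in L$:
$$Y_i^2(t) = x_i^2 + 2M_i(t) - 2\int_0^t Y_i(s)\,\partial_iU_{L,z}(Y(s))\,ds + t, \tag{4.2.9}$$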
where $M_i(t)$ is the martingale $\int_0^t Y_i(s)\,dB_{i,s}$. Under the Main Hypotheses 4.2.1 we can find constants $a$ and $b$ such that the relation (4.2.4) is satisfied and thus:
$$-Y_i(s)\,\partial_iU_{L,z}(Y(s)) \leq a'\,Y_i^2(s) + \frac12\sum_{j \in S} p(j - i)Y_j^2(s) + b, \tag{4.2.10}$$
where we have set $a' := (\tfrac{\sigma}{2} - a)^+$ (although it is not important, one can arrange things under these hypotheses so that $a' = 0$, except if $V$ is of degree 2). We set $Z_t^* := \sup_{0 \leq s \leq t} |Z_s|$ for any real process $Z_t$. Since $a'$ is positive, by putting (4.2.10) into (4.2.9), we obtain:
$$Y_i^{*2}(t) \leq x_i^2 + 2M_i^*(t) + (1 + b)t + a'\int_0^t Y_i^{*2}(s)\,ds + \int_0^t \sum_j p(j - i)Y_j^{*2}(s)\,ds. \tag{4.2.11}$$
Doob's inequality [Mt82] applied to the submartingale $|M_i(t)|$ gives us:
$$\mathbb{E}(M_i^{*2}(t)) \leq 4\,\mathbb{E}(M_i^2(t)) = 4\,\mathbb{E}\int_0^t Y_i^2(s)\,ds \leq 4\,\mathbb{E}\int_0^t Y_i^{*2}(s)\,ds,$$
$$\mathbb{E}(M_i^*(t)) \leq 2\Big(\mathbb{E}\int_0^t Y_i^{*2}(s)\,ds\Big)^{1/2} \leq 4 + \frac12\,\mathbb{E}\int_0^t Y_i^{*2}(s)\,ds.$$
By putting this in (4.2.11), after taking the expectation on both sides of the inequality and writing $u_i(t) := \mathbb{E}(Y_i^{*2}(t))$, we obtain for $i \in L$ that:
$$u_i(t) \leq x_i^2 + 8 + (1 + b)t + (1 + a')\int_0^t u_i(s)\,ds + \int_0^t \sum_j p(j - i)u_j(s)\,ds.$$
For $i \notin L$ and replacing $x$ by $z$ we also have:
$$u_i(t) \leq z_i^2 + (1 + a')\int_0^t u_i(s)\,ds + \int_0^t \sum_j p(j - i)u_j(s)\,ds.$$
If we set $f(t) := \sum_{i \in S} \alpha(i)u_i(t)$ we have, taking into account Lemma 4.2.4, that:
$$f(t) \leq \sum_{i \in S} \alpha(i)\big[\mathbf{1}_{i \in L}(x_i^2 + 8 + (b + 1)t) + \mathbf{1}_{i \notin L}\,z_i^2\big] + \int_0^t (1 + a' + \sigma')f(s)\,ds.$$
Gronwall's Lemma then implies:
$$f(t) \leq \exp((1 + a' + \sigma')t)\sum_{i \in S} \alpha(i)\big[\mathbf{1}_{i \in L}(x_i^2 + 8 + (b + 1)t) + \mathbf{1}_{i \notin L}\,z_i^2\big],$$
which gives us the relation we were looking for in the case where $k > 1 + a' + \sigma'$.
We note that we have exactly the same relation when we replace $\alpha$ by a translate $\tilde\alpha(k) = \alpha(k - i)$ since $\tilde\alpha$ is also superharmonic. The term corresponding to $k = i$ of the left member leads to the second inequality.

To construct the infinite dimensional processes, we will utilize a comparison of two processes $X^{L,z,x}$ and $X^{M,z,x}$ for $M \subset L$. The result obtained will also prove useful later for obtaining stabilization properties when we introduce the property called finite velocity propagation, a concept that first appeared in [Zeg96].³ Before proceeding, we point out an inequality that will appear later in many places. When we say $V$ is sufficiently convex, we will mean that $V$ satisfies a uniform convexity inequality in the Euclidean space $\ell^2(\alpha)$ where $\alpha$ is a weight introduced in Lemma 4.2.4.

Lemma 4.2.10. For all points $\xi$ and $\eta$ in $\mathbb{R}^L$, independent of the box $L$ and the condition $z$, we have, by setting $m := \inf(V'')$, the following inequality:
$$\sum_{i \in L} \alpha_i(\xi_i - \eta_i)\big(\partial_iU(\xi) - \partial_iU(\eta)\big) \geq \Big(m - \frac{\sigma + \sigma'}{2}\Big)\sum_{i \in L} \alpha_i(\xi_i - \eta_i)^2.$$
PROOF. As in the inequality (3.1.13), we have:
$$(x - y)(V'(x) - V'(y)) \geq m(x - y)^2, \tag{4.2.12a}$$
$$(x_i - y_i)\big(\partial_iV_{\{i,j\}}(x_i, x_j) - \partial_iV_{\{i,j\}}(y_i, y_j)\big) \geq -\frac12 p(i - j)\big((x_i - y_i)^2 + (x_j - y_j)^2\big). \tag{4.2.12b}$$
By summing these equations for the weight $\alpha_i$ and using the fact that $\alpha$ is superharmonic, we see that the lemma follows.
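Proposition 4.2.11. We can find constants $k'$, $k''$, $k'''$, and $c'$ such that:
(1) For any time $t$, and any subset $M$ of the box $L$:
$$\alpha(0)\,\mathbb{E}\Big(\sum_{i \in S} \alpha(i)\sup_{0 \leq s \leq t}\big(X_{i,s}^{L,0,x} - X_{i,s}^{M,0,x}\big)^2\Big) \leq e^{k't}\sum_{j \in L \setminus M}\big([\alpha * \alpha](j)x_j^2 + c'|\alpha|\alpha(j)\big). \tag{4.2.13}$$
(2) For all $t \in \mathbb{R}_+$, $z, z' \in S'$:
$$\sum_{i \in S} \alpha(i)\big(X_{i,t}^{L,z,x} - X_{i,t}^{L,z',x}\big)^2 \leq e^{k''t}\sum_{j \notin L} \alpha(j)(z_j - z'_j)^2. \tag{4.2.14}$$
(3) For all $L$ and $z$:
$$\sum_{i \in S} \alpha(i)\big(X_{i,t}^{L,z,x} - X_{i,t}^{L,z,y}\big)^2 \leq \exp(k'''t)\sum_{i \in L} \alpha(i)(x_i - y_i)^2. \tag{4.2.15}$$

³ This was pointed out to us by S. Roelly.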
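PROOF. We reutilize the equations (4.2.12b). For any $i \in M$, we have:
$$\big(X_{i,t}^{L,0,x} - X_{i,t}^{M,0,x}\big)^2 \leq (-2m + \sigma)\int_0^t \big(X_{i,s}^{L,0,x} - X_{i,s}^{M,0,x}\big)^2\,ds + \int_0^t \sum_j p(j - i)\big(X_{j,s}^{L,0,x} - X_{j,s}^{M,0,x}\big)^2\,ds.$$
For $i \in L \setminus M$, we have an upper bound for $\mathbb{E}\big(\sup_{0 \leq s \leq t}(X_{i,s}^{L,0,x})^2\big)$ with the aid of Lemma 4.2.9. Setting
$$D_i(t) := \mathbb{E}\Big(\sup_{0 \leq s \leq t}\big(X_{i,s}^{L,0,x} - X_{i,s}^{M,0,x}\big)^2\Big),$$
we have in all cases that:
$$D_i(t) \leq \mathbf{1}_{L \setminus M}(i)\,e^{kt}\alpha^{-1}(0)\big([\alpha * x^2](i) + c|\alpha|\big) + \int_0^t \Big[(-2m + \sigma)^+D_i(s) + \sum_{j \in S} p(j - i)D_j(s)\Big]\,ds.$$
If we set $g(t) := \sum_{i \in S} \alpha(i)D_i(t)$ we then have, taking into account Lemma 4.2.4, that:
$$g(t) \leq e^{kt}\alpha^{-1}(0)\sum_{i \in L \setminus M}\big([\alpha * \alpha](i)x_i^2 + c|\alpha|\alpha(i)\big) + \int_0^t \big((-2m + \sigma)^+ + \sigma'\big)g(s)\,ds,$$
which gives us the result with $k' := k + (-2m + \sigma)^+ + \sigma'$, by Gronwall's Lemma. The proofs of the second and third assertions are analogous. In particular, the third is proved by utilizing the preceding lemma in an obvious way. In the second assertion we take $k'' := (-2m + \sigma)^+ + \sigma'$ and in the third assertion we take $k''' := -2m + \sigma + \sigma'$.

Remark 4.2.12. We have $\alpha * \alpha = \sum_{k=0}^{\infty} (k + 1)\sigma'^{-k}\,p^{*k}$ and thus $\alpha * \alpha$ is also exponentially decreasing at infinity on $\mathbb{Z}^d$.

For all configurations $x \in E$, we set:
$$U_i(x) := V_{\{i\}}(x_i) + \sum_{j \neq i} V_{\{i,j\}}(x_i, x_j),$$
and introduce the infinite system of Langevin stochastic differential equations:
$$X_{i,t} = x_i + B_{i,t} - \int_0^t \partial_iU_i(X_s)\,ds. \tag{4.2.16}$$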
Theorem 4.2.13. For any $x \in S'$ the Langevin system has a unique solution such that for any $i \in S$ the real process $X_{i,t}$ is almost surely continuous for any $t$ and the configuration $X_t$ is almost surely in $S'$. Any Gibbs measure on $S'$ is invariant and reversible for the associated semi-group of kernels.

PROOF. We consider the auxiliary Hilbert space $\ell^2(\alpha)$ where $\alpha$ was introduced in Lemma 4.2.4. The solutions on $[0, T]$ we are looking for belong to the Hilbert space $H_T$ formed of the continuous processes $X$ on $[0, T]$ with values in $\mathbb{R}^S$ such that the norm
$$\|X\|_T := \Big[\sum_{i \in S} \alpha(i)\,\mathbb{E}\Big(\sup_{0 \leq t \leq T} X_{i,t}^2\Big)\Big]^{1/2}$$
is finite. Let $L_n$ be an increasing sequence of finite sets for which the union is $S$. The upper bound given in (4.2.13) shows that the sequence $X^{L_n,0,x}$ is Cauchy in $H_T$ and therefore converges to a limit $X^x$. We can extract a subsequence $L_{n_p}$ such that $\sup_{0 \leq \cdots}$ … and $X^y$ are two solutions corresponding to the initial data $x$ and $y$, completely analogous calculations to those used in the proof of Proposition 4.2.11 show that:
$$\|X_t^x - X_t^y\|_{\ell^2(\alpha)} \leq \|x - y\|_{\ell^2(\alpha)}\exp(k'''t/2) \tag{4.2.17}$$
almost surely. Uniqueness then follows by setting $x = y$.

We will show that we can find a sequence of boxes that will allow us to approach the limiting process regardless of the point of departure. Let $D$ be a denumerable subset of $S'$ that is dense in $\ell^2(\alpha)$. Using a diagonal process, with $t$ fixed, we can refine the sequence into $L_{n_k}$ in such a way that for any $x \in D$, the configuration $X_t^{L_{n_k},0,x}$ converges almost surely to $X_t^x$ in $\ell^2(\alpha)$ when $k \to \infty$. But the relation (4.2.15), which shows that $x \mapsto X_t^{L_{n_k},0,x}$ is uniformly continuous on $\ell^2(\alpha)$, implies that the preceding property holds for any $x \in S'$. Let $f$ and $g$ be two functions on $E$ that only depend on a finite number of coordinates, or more precisely we assume that $f(x) = \tilde f(x_M)$, $g(x) = \tilde g(x_M)$ where $\tilde f$, $\tilde g$ are bounded continuous functions on $E^M$ and $L \supset M$. Since for any fixed $z$ in $S'$, the process $X_t^{L,z}$ is a Kolmogorov process in $E^L$, we know the reversibility of $\Pi_L(d\xi, z)$ for this process gives the reversibility of
$\pi_L(dx, z)$ with the transition kernel of the process $X_t^{L,z}$, i.e.,
$$\int_E \mathbb{E}\big(f(X_0^{L,z,x})\,g(X_t^{L,z,x})\big)\,\pi_L(z, dx) = \int_E \mathbb{E}\big(f(X_t^{L,z,x})\,g(X_0^{L,z,x})\big)\,\pi_L(z, dx).$$
Integrating this equation with respect to a Gibbs measure $\mu(dz)$ supported by $S'$ gives us, by utilizing the D.L.R. equations:
$$\int_E \mathbb{E}\big(f(X_0^{L,x,x})\,g(X_t^{L,x,x})\big)\,\mu(dx) = \int_E \mathbb{E}\big(f(X_t^{L,x,x})\,g(X_0^{L,x,x})\big)\,\mu(dx). \tag{4.2.18}$$
For any $i$, $t$, and $x \in S'$, there is almost sure convergence of $X_{i,t}^{L_{n_k},0,x}$ to $X_{i,t}^x$. The relation (4.2.14) shows that for all $z \in S'$ the process $X_{M,t}^{L_{n_k},z,x}$ converges to $X_{M,t}^x$ in $E^M$. The dominated convergence theorem allows us to pass to the limit in (4.2.18), which gives us:
$$\int \mathbb{E}\big(f(X_0^x)\,g(X_t^x)\big)\,\mu(dx) = \int \mathbb{E}\big(f(X_t^x)\,g(X_0^x)\big)\,\mu(dx). \tag{4.2.19}$$
An application of the theorem of monotone classes allows us to extend the preceding relation to arbitrary bounded measurable functions $f$ and $g$ on $S'$. □

Remark 4.2.14. It can be shown that the Gibbs measures are exactly the invariant reversible probability measures for the Glauber-Langevin process; see [DR78].

4.2.3. Uniqueness in dimension 1. The processes studied above are already useful for proving that there only exists one Gibbs measure supported by $S'$ for $d = 1$. We will indicate in the following chapter another approach. The case of a quadratic self-interaction corresponds to the Gaussian case, which has a direct proof that is treated in an exercise below. We are thus able to assume that the self-interaction potential is of degree greater than two.

Proposition 4.2.15. Suppose that $V$ is of degree at least four. Then, for any constant $k$, there exists a constant $b > 0$ such that for any Gibbs measure $\mu$ supported by $S'$, any $i \in \mathbb{Z}^d$, and any $r$, we have:
$$\mu\{x \in E : |x_i| > r\} \leq e^{-k([r - b]^+)^2}. \tag{4.2.20}$$
PROOF. We, first of all, show an analogous inequality for the measures $\pi_L(dx, z)$, $i \in L$. In view of its degree, the polynomial $V$ can, for any $m > 0$, be decomposed into the form:
$$V(x) = \frac{m}{2}x^2 + F(x) + R(x),$$
4. GIBBS MEASURES
84
where F(x) is a convex function and R(x) is infinitely differentiable with compact support on R. We begin by controlling the local specifications iiiL,z associated to the convex site potential V (x) = Imx2 + F(x). The quadratic part of the energy UL,z(x) is the function: iEL
2x? - i,jEL 2
J(i -
j)xixj
-
J(i - Axiz3,
iEL
I&L
for which, by Hadamard's criterion, the Hessian is positive definite form > a, and even bounded below by (m - a) Id. Thus by the Theorem of Bakry and Emery, the measure iL,z has, for m > a, a Gross constant that is smaller than (2m - 2a)-1, independently of L and z. The inequality (3.1.11), called the concentration inequality, for the function 1xil with i E L, is written: iiL,.(dx)]+)2},
nL,z{IxiI > r} < exp{-k([r - f Ixil
k = 2(m - a).
In order to deduce from this the concentration inequality for $\pi_{L,z}$, we begin by comparing the Kolmogorov processes $X$ and $\tilde X$ associated to these two measures and departing from the same point $x \in \mathbb{R}^L$. Utilizing Lemma 4.2.10, the Langevin equations, and the fact that $|R'|$ is bounded, say by $1/2$, we see that:

$$\begin{aligned}
\frac{d}{dt}\bigl(X_i(t) - \tilde X_i(t)\bigr)^2
&= -2\bigl(X_i(t) - \tilde X_i(t)\bigr)\bigl(\partial_i U_{L,z}(X(t)) - \partial_i \tilde U_{L,z}(\tilde X(t))\bigr) \\
&= -2\bigl(X_i(t) - \tilde X_i(t)\bigr)\bigl(\partial_i \tilde U_{L,z}(X(t)) - \partial_i \tilde U_{L,z}(\tilde X(t)) + R'(X_i(t))\bigr) \\
&\le -2\bigl(X_i(t) - \tilde X_i(t)\bigr)\bigl(\partial_i \tilde U_{L,z}(X(t)) - \partial_i \tilde U_{L,z}(\tilde X(t))\bigr) + |X_i(t) - \tilde X_i(t)|.
\end{aligned}$$

By summing the preceding inequality with respect to $\alpha$ and then applying Lemma 4.2.10 and the Schwarz inequality, we obtain:

$$\frac{d}{dt}\varphi^2(t) \le (-2m + \sigma + \delta)\,\varphi^2(t) + \Bigl(\sum_{i} \alpha(i)\Bigr)^{1/2} \varphi(t), \tag{4.2.21}$$

where $\varphi(t) = \|X^{L,z,x}(t) - \tilde X^{L,z,x}(t)\|_{\ell^2(\alpha)}$. If $m$ is chosen at the beginning, and it can be, such that the constant $2m - \sigma - \delta$, denoted by $\lambda$, is strictly positive, the analysis of this differential inequality shows right away that:

$$\varphi(t) \le \lambda^{-1} |\alpha|_1^{1/2},$$

and, in particular, we obtain a bound for the gap between the values of $X^{L,z,x}(t)$ and $\tilde X^{L,z,x}(t)$ at the site $0$. This estimate is valid for all translated weights $\alpha$ with the same constants, and finally for all $L$, $z$, $x$, $i$, $t$, we have:

$$|X_i^{L,z,x}(t) - \tilde X_i^{L,z,x}(t)| \le b_0 \qquad\text{with}\qquad b_0 = \lambda^{-1}|\alpha|_1^{1/2}\,\alpha^{-1/2}(0).$$
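The integration of the differential inequality can be sketched as follows. This is only a sketch, assuming that (4.2.21) has the form $(\varphi^2)' \le -\lambda\varphi^2 + C\varphi$ with $\lambda > 0$, that $\varphi(0) = 0$ because both processes start from the same point, and writing $C$ (a shorthand introduced here, not the book's notation) for $|\alpha|_1^{1/2}$.

```latex
% Sketch under the stated assumptions; C := |\alpha|_1^{1/2} is shorthand
% introduced here, and \varphi(0) = 0 since X and \tilde X start at the same x.
\[
  \frac{d}{dt}\varphi^2(t) \;\le\; -\lambda\,\varphi^2(t) + C\,\varphi(t)
  \;=\; \varphi(t)\,\bigl(C - \lambda\,\varphi(t)\bigr).
\]
% Whenever \varphi(t) > C/\lambda the right-hand side is strictly negative,
% so \varphi^2 is strictly decreasing there. Starting from \varphi(0) = 0,
% the continuous function \varphi can therefore never exceed the level C/\lambda:
\[
  \varphi(t) \;\le\; \frac{C}{\lambda} \;=\; \lambda^{-1}\,|\alpha|_1^{1/2}
  \qquad\text{for all } t \ge 0 .
\]
% Keeping only the term of index 0 in the weighted norm,
% \alpha(0)\,|X_0(t) - \tilde X_0(t)|^2 \le \varphi^2(t), which gives
\[
  |X_0(t) - \tilde X_0(t)| \;\le\; \lambda^{-1}\,|\alpha|_1^{1/2}\,\alpha(0)^{-1/2}.
\]
```

Applying the same argument to the translates of the weight $\alpha$ moves the distinguished site from $0$ to an arbitrary $i$, which is how a bound uniform in $i$ is obtained.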
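We apply the finite dimensional stabilization Theorem 3.2.7, which is applicable here, to the two processes $X$ and $\tilde X$ to obtain, for any positive $r$:

$$\begin{aligned}
\pi_{L,z}\{|x_i| > r\} &= \lim_{t \to \infty} P\bigl(|X_i^{L,z,x}(t)| > r\bigr) \le \lim_{t \to \infty} P\bigl(|\tilde X_i^{L,z,x}(t)| > r - b_0\bigr) = \tilde\pi_{L,z}\{|x_i| > r - b_0\} \\
&\le \exp\bigl(-k([r - b_0 - \tilde b_{L,z}]_+)^2\bigr),
\end{aligned} \tag{4.2.22}$$

where we have set $\tilde b_{L,z} = \int |x_i|\, \tilde\pi_{L,z}(dx)$.

The upper bound (4.2.3), applied to $\tilde\pi$, shows that there exist constants $b_1$ and $b_2$, independent of $i$, such that:

$$\tilde b_{L,z} \le b_1 + b_2 \sum_{j \notin L} \alpha(j)\, z_j^2.$$

Taking a sequence of boxes $L_n$ increasing to $\mathbb{Z}^d$ and using the D.L.R. equations, we then have:

$$\begin{aligned}
\mu\{|x_i| > r\} &= \lim_{n \to \infty} \int \pi_{L_n,z}\{|x_i| > r\}\, \mu(dz) \le \int \limsup_{n \to \infty} \pi_{L_n,z}\{|x_i| > r\}\, \mu(dz) \\
&\le \int \limsup_{n \to \infty} \exp\bigl(-k([r - b_0 - \tilde b_{L_n,z}]_+)^2\bigr)\, \mu(dz) \\
&\le \int \limsup_{n \to \infty} \exp\Bigl(-k\Bigl(\Bigl[r - b_0 - b_1 - b_2 \sum_{j \notin L_n} \alpha(j)\, z_j^2\Bigr]_+\Bigr)^2\Bigr)\, \mu(dz) \\
&\le \exp\bigl(-k([r - b_0 - b_1]_+)^2\bigr),
\end{aligned}$$

since, almost surely in $\mu$, the influence of $z$ tends to $0$. $\square$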
The preceding estimates will allow us to control the modifications that need to be made to a classic uniqueness proof for the case of compact spins [Ge88], modifications that we can abstract in the following manner. We call a sequence $(F_n, K_n)$, where $K_n$ is a kernel from $F_{n+1}$ to $F_n$, a projective system of kernels, and we say that a system $(\mu_n)$ of probability measures satisfying $\mu_n = \mu_{n+1} K_n$ is compatible with this projective system of kernels.

Proposition 4.2.16. Let $(F_n, K_n)$ be a projective system of kernels that has the following property: for any $n$ there exists an increasing and exhaustive sequence $M_{n,r}$, $r \in \mathbb{N}$, of subsets of $F_n$, a sequence of measures $\alpha_{n,r}$ on $F_n$ of total mass $a_r$, independent of $n$, and a sequence $(\delta_r)$ of positive numbers such that:

(1) the sequence $\delta_r / a_r$ converges to $0$;

(2) for any $r$, $n$, and $x \in M_{n+1,r}$, the probability measure $K_n(x, \cdot)$ dominates the measure $\alpha_{n,r}$.

There can only exist one system of probability measures $(\mu_n)$ that is compatible with these kernels and satisfies $\mu_n(M_{n,r}) \ge 1 - \delta_r$ for any $n$ and $r$.
PROOF. Let $(\mu_n)$ and $(\nu_n)$ be two compatible systems of probability measures for the projective system of kernels $(F_n, K_n)$. It suffices to show that:

$$\|\mu_n - \nu_n\|_{vt} \le 2\delta_r + (1 - a_r)\,\|\mu_{n+1} - \nu_{n+1}\|_{vt}, \tag{4.2.23}$$

since the iteration of this inequality leads to:

$$\|\mu_n - \nu_n\|_{vt} \le 2\delta_r\, \frac{1 - (1 - a_r)^p}{a_r} + (1 - a_r)^p\, \|\mu_{n+p} - \nu_{n+p}\|_{vt} \le \frac{2}{a_r}\,\delta_r + 2(1 - a_r)^p.$$

By letting $p$, and then $r$, tend to infinity we see that $\|\mu_n - \nu_n\|_{vt} = 0$. In order to demonstrate the relation (4.2.23), we note that the hypothesis on $K_n$ implies that:

$$\operatorname{osc}\bigl((K_n f)|_{M_{n+1,r}}\bigr) \le (1 - a_r)\operatorname{osc}(f),$$

for any bounded measurable function $f$ on $F_n$. It is easy to obtain the following decompositions of positive measures: $\mu_{n+1} = \mu'_{n+1} + \mu''_{n+1}$ and $\nu_{n+1} = \nu'_{n+1} + \nu''_{n+1}$, where $\mu'_{n+1}$ and $\nu'_{n+1}$ are measures supported by $M_{n+1,r}$ and with masses exactly equal to $1 - \delta_r$. Under these conditions we have:

$$\begin{aligned}
|\mu_n(f) - \nu_n(f)| &= |\mu_{n+1}(K_n f) - \nu_{n+1}(K_n f)| \\
&\le |\mu'_{n+1}(K_n f) - \nu'_{n+1}(K_n f)| + |\mu''_{n+1}(K_n f) - \nu''_{n+1}(K_n f)| \\
&\le \tfrac{1}{2}(1 - a_r)\operatorname{osc}(f)\,\|\mu_{n+1} - \nu_{n+1}\|_{vt} + \delta_r \operatorname{osc}(f).
\end{aligned}$$

Since $f$ is arbitrary, this implies (4.2.23). $\square$
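The iteration of (4.2.23) can be written out explicitly. The following is a sketch, with the abbreviations $x_n$ and $q$ introduced here for readability, assuming $0 < a_r \le 1$ and using only that a total variation distance between two probability measures is at most $2$.

```latex
% Abbreviations introduced for this sketch (not the book's notation):
%   x_n := \|\mu_n - \nu_n\|_{vt},   q := 1 - a_r \in [0,1).
% (4.2.23) reads x_n \le 2\delta_r + q\,x_{n+1}. Applying it p times:
\begin{align*}
  x_n &\le 2\delta_r + q\,x_{n+1}
       \le 2\delta_r\,(1 + q) + q^2\,x_{n+2}
       \le \dots
       \le 2\delta_r \sum_{k=0}^{p-1} q^k + q^p\,x_{n+p} \\
      &= 2\delta_r\,\frac{1 - q^p}{1 - q} + q^p\,x_{n+p}
       \;\le\; \frac{2\delta_r}{a_r} + 2\,(1 - a_r)^p ,
\end{align*}
% using 1 - q = a_r and x_{n+p} \le 2. Letting p \to \infty (r fixed) kills the
% second term; letting r \to \infty then kills the first, by hypothesis (1).
```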
Theorem 4.2.17. For $d = 1$ there exists a unique Gibbs measure supported by $S'$.

PROOF. Let $R$ be the range of the interaction and $\mu$ a Gibbs measure, which exists by Theorem 4.2.5. We construct a projective system of kernels and a compatible system of probability measures as follows: for $n > 0$, we set $F_n := \mathbb{R}^{\Delta_n}$ with $\Delta_n := \,]nR, (n+1)R] \,\cup\, ]-(n+1)R, -nR]$, and for $n = 0$, we set $F_0 := \mathbb{R}^{\{0\}}$. The probability measures $\pi_{]-nR,nR]}(z, \cdot)$, which depend only on the restriction of $z$ to $\Delta_n$, which we denote by $\zeta = z_{\Delta_n}$, define kernels from $F_n$ to $\mathbb{R}^{]-nR,nR]}$. The kernels of our system are then defined as the images of the $\pi_{]-nR,nR]}(z, \cdot)$ under the projections of $\mathbb{R}^{]-nR,nR]}$ onto $F_{n-1}$ and are denoted by $K_{n-1}(\zeta, \cdot)$. It is clear that the marginal probability measures $\mu_n$ of $\mu$ on $F_n$ form a compatible system of probability measures for our projective system. To complete the proof we need to show that the projective system just constructed satisfies the hypothesis of Proposition 4.2.16. Towards this goal, we set:

$$M_{n,r} = \{x \in F_n : |x| \le r\}, \qquad \alpha_{n-1} = e^{-q r^2}\, \mathbf{1}_{M_{n-1,r}}\, \beta_{n-1}, \tag{4.2.24}$$

where $\beta_{n-1}$ is the projection onto $F_{n-1}$ of the measure $\pi_{]-nR,nR]}(0, \cdot)$ with zero exterior condition and $q$ is a constant to be fixed later. The Radon-Nikodym density

$$\psi_{z,n} = \frac{\pi_{]-nR,nR]}(z, dx)}{\pi_{]-nR,nR]}(0, dx)}$$

is written:

$$\psi_{z,n}(x) = Z_{z,n}\, \varphi_{z,n}(x) \qquad\text{with}\qquad \varphi_{z,n}(x) = \exp\Bigl(-2 \sum_{j \in \Delta_n,\ i \in \Delta_{n-1}} J(i-j)\, x_i z_j\Bigr),$$

for a certain normalization constant $Z_{z,n}$. Since it does not contain coordinates of $x$ outside of $\Delta_{n-1}$, it is also the density of $K_{n-1}(\zeta, \cdot)$, with $\zeta = z_{\Delta_n}$, with respect to $\beta_{n-1}$. We utilize the concentration property (4.2.22) for $k = 2$. Since the terms $\tilde b_{L,0}$ appearing there are uniformly bounded, it is easy to see that the integral

$$\int_{F_{n-1}} \exp(|x|^2)\, d\beta_{n-1}$$

remains bounded independently of $n$. We denote this bound by $I$. We deduce from this, by writing

$$\varphi_{z,n}(x) := \exp(-2(x, J * z)),$$

an upper bound of the form: if $\zeta \in M_{n,r}$,

$$Z_{z,n}^{-1} \le I \exp(|J * z|^2) \le I \exp(R \sigma^2 r^2).$$

Similarly, if $\zeta \in M_{n,r}$ and $x \in M_{n-1,r}$:

$$\varphi_{z,n}(x) = \exp(-2(x, J * z)) \ge \exp(-2|x|\,|J * z|) \ge \exp(-2R^{1/2} \sigma r^2),$$

and finally there exists a constant $k'$ such that

$$\psi_{z,n}(x) \ge \exp(-k' r^2)\, \mathbf{1}_{M_{n-1,r}}(x), \qquad\text{if } \zeta \in M_{n,r}.$$

In order for $K_{n-1}(\zeta, \cdot)$ to dominate $\alpha_{n-1}$ for $\zeta \in M_{n,r}$, it suffices to choose the constant $q$, introduced above in (4.2.24), to be equal to $k'$. Having made this choice, we have, for $r$ sufficiently large, that $a_r \ge \tfrac{1}{2}\exp(-k' r^2)$, since the probability measures $\beta_n$ have uniformly bounded second moments, while we have by Proposition 4.2.15 that $\delta_r \le \exp(-k(r - b)^2)$. By taking $k$ larger than $k'$, we see that all of the hypotheses of Proposition 4.2.16 are satisfied. $\square$
This result can be carried over to the so-called $P(\varphi)_1$ model where the set of sites is $\mathbb{R}$. This is described in [CR75].
Exercise 4.2.18 (uniqueness in the Gaussian case). The dimension $d$ is arbitrary and we set $V(x) := \frac{a}{2}x^2$ with $a > \sigma$. We denote: (1) the space of configurations that decrease more rapidly than any negative power of the distance from the origin by $S$; (2) the dual of $S$ by $S'$; (3) the subspace of $S'$ consisting of configurations with finite support in $\mathbb{Z}^d$ by $T$. Note that the dual of $T$ is $E = \mathbb{R}^{\mathbb{Z}^d}$. For $\zeta \in T$ and $x \in E$ we set:

$$a(\zeta, x) = \exp\{-\langle 2x + \zeta,\ a\zeta - J * \zeta\rangle_{(T,E)}\}.$$

Prove that any Gibbs measure $\mu$ on $\mathbb{R}^{\mathbb{Z}^d}$ has the following quasi-invariance property: the translated measure $\tau_\zeta \mu$ satisfies

$$\tau_\zeta \mu(dx) = a(\zeta, x)\, \mu(dx).$$

If $\mu$ is supported by $S'$, extend this property to $\zeta \in S$. Show that $a\,\mathrm{Id} - [J * \cdot\,]$ is a permutation of $S$. Compute from this the Laplace and Fourier transforms of $\mu$ and conclude the uniqueness.
Remark 4.2.19. Up to now, the limitation to measures supported by $S'$ appears to be merely a technical tool for obtaining certain upper bounds. However, its necessity is made more apparent by the preceding exercise. Indeed, let $\mu_0$ be the Gaussian Gibbs measure on $S'$ associated to the interactions $V(x) = \frac{a}{2}x^2$, $J(i) = \mathbf{1}_{\{|i| = 1\}}$, while supposing that the constant $a - 2d$, denoted by $m^2$, is strictly positive. Then let $h$ be any solution on $\mathbb{Z}^d$ of the equation $(-\Delta + m^2)h = 0$. It is easy to check (see [Roy77]) that the translation of $\mu_0$ in $\mathbb{R}^{\mathbb{Z}^d}$ by $h$ is still a Gibbs measure. (There exist non-zero such $h$.) However, the measure obtained is not supported by $S'$, since $h$ does not belong to $S'$ unless it is the zero function. This is because its Fourier transform $g$ on the torus $\mathbb{T}^d$ will satisfy $\bigl(m^2 + 4\sum_{k=1}^{d} \sin^2(p_k/2)\bigr)\, g = 0$. The construction of analogous measures in the non-Gaussian case is not known.
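The Fourier multiplier appearing in this remark, and the existence of non-zero (non-tempered) solutions $h$, can be checked by a short computation. The sketch below assumes the nearest-neighbour discrete Laplacian and a particular sign convention for the Fourier series; both conventions are chosen here for illustration.

```latex
% Conventions assumed for this sketch:
%   (\Delta h)(i) = \sum_{|e| = 1} \bigl(h(i+e) - h(i)\bigr), \qquad
%   g(p) = \sum_{j \in \mathbb{Z}^d} h(j)\, e^{\mathrm{i}\, j \cdot p}, \quad p \in \mathbb{T}^d .
% A shift h(\cdot + e) multiplies g by e^{-\mathrm{i}\, e \cdot p}; summing over e = \pm e_k:
\[
  \widehat{(-\Delta + m^2) h}\,(p)
  = \Bigl(m^2 + \sum_{k=1}^{d} \bigl(2 - 2\cos p_k\bigr)\Bigr)\, g(p)
  = \Bigl(m^2 + 4 \sum_{k=1}^{d} \sin^2(p_k/2)\Bigr)\, g(p),
\]
% since 1 - \cos\theta = 2\sin^2(\theta/2). The multiplier is smooth and
% bounded below by m^2 > 0, so (multiplier) \cdot g = 0 forces g = 0.
%
% A non-zero solution that is not tempered, for d = 1: take h(j) = e^{\theta j}. Then
\[
  (-\Delta + m^2) h (j) = \bigl(m^2 + 2 - 2\cosh\theta\bigr)\, e^{\theta j} = 0
  \quad\Longleftrightarrow\quad \cosh\theta = 1 + \tfrac{m^2}{2},
\]
% which has a real solution \theta \neq 0; h then grows exponentially.
```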
CHAPTER 5

Stabilization of Glauber-Langevin Dynamics

As we will see, the Gross logarithmic Sobolev inequality can continue to play a role in the study of certain infinite dimensional models, unlike the ordinary Sobolev inequalities. This will allow us to prove various ergodic and stabilization properties for these models; in reality, the Poincaré inequality also has this characteristic. But the Gross inequality is more powerful since it conserves, in a weak form, the regularization property that the ordinary Sobolev inequality has.¹ The Glauber-Langevin stochastic dynamics of the Ising model with real spins will furnish a striking illustration of the possibilities of the Gross inequality.

5.1. The Gross inequality and stabilization

We consider the process $X^{L,z}$ associated to the energy in the box $L$ with exterior condition $z$ for an Ising model with real spins for which the interactions satisfy the hypotheses 4.2.1. Such a process on $E$ corresponds to a Kolmogorov process in $E_L$ associated to this energy, with the configuration outside of $L$ being frozen at $z$. It follows from this that such a process has as invariant measure the Boltzmann measure $\pi_L(z, \cdot)$, which is concentrated at $z$ outside of $L$. We denote the box $\{i : |i| < n\}$ by $L_n$ and by $\alpha$ a weight on $\mathbb{Z}^d$ as in (4.2.2).
Lemma 5.1.1. Let an instant in time $t > 0$ and an initial condition $x \in S'$ be fixed. The relative entropy at the instant $t$ with exterior condition $z$ depends in a tempered way on the size of the box $L_n$ and the exterior influence $z$:

$$I\bigl(\mathcal{L}(X_t^{L_n,z,x}) \mid \pi_{L_n,z}\bigr) \le K\Bigl(1 + n^p + \sum_{i \in L_n} [\alpha * z^2_{L_n^c}](i)\Bigr) \tag{5.1.1}$$

for suitable constants $p$, $K$.

PROOF. We recall the inequality:

$$I\bigl(\mathcal{L}(X_t^{L,z,x}) \mid \pi_{L,z}\bigr) \le \mathbb{E}\Bigl(\Bigl[\log(Z_{L,z}) + U_{L,z}(x_L) + U_{L,z}(W_t^L) - 2\nu_{t,L}(W_t^L) - \frac{1}{2}\int_0^t \bigl[|\nabla U_{L,z}|^2 - \Delta U_{L,z}\bigr](W_s^L)\, ds\Bigr] F\Bigr), \tag{5.1.2}$$

established in (3.2.6), where the Cameron-Martin density $F$ of the process with respect to the Brownian motion $W^L$ starting at $x_L$ in $E_L$ is given by:

$$F = \exp(-V(W^L)), \qquad V(W^L) = -U(x) + U(W_t) + \frac{1}{2}\int_0^t \bigl[|\nabla U|^2 - \Delta U\bigr](W_s)\, ds,$$

with $U = U_{L,z}$. The term $-2\nu$, which is the logarithm of the density of $W_t$ with respect to Lebesgue measure, is bounded above by $-\frac{\operatorname{card}(L)}{2}\log(2\pi t)$ and thus is at most of polynomial growth in $n$.

¹ A function in $L^2(\mathbb{R}^n)$ is also in $L^{2n/(n-2)}(\mathbb{R}^n)$ if its derivative is in $L^2(\mathbb{R}^n)$.

We next examine the constant $\log(Z_{L,z})$, beginning with the case where $z$ is zero and the self-interaction is of the Gaussian form $V_g(x) = \frac{a}{2}x^2$, with $a > \sigma$. In this case the energy $U_{L,0}$ is the quadratic form:

$$Q(x) = \frac{a}{2}\sum_{i \in L} x_i^2 - \frac{1}{2}\sum_{i,j \in L} J(i-j)\, x_i x_j.$$
Since the eigenvalues associated to 2Q are all greater than a - a, the calcu-
lation of the normalization of the Gaussian measure µL,,,0 gives us:
log(4ir(a - a)) = kind.
log(ZL,,,o) < 2
In the case where the exterior condition z is non-zero, we need to combine the preceding inequality with the following inequality: log{
1
ZL,,.o
JL
J(i - j)xizj - 2Q(xL)) (LxL}
exp(2 iEL j¢L
= 2 (J * zLr., Q' I (J * zLr.)) <
1
a
a I J * zLc l2 ,
with $[J * z_{L^c}](i) = \sum_{j \notin L} J(i-j)\, z_j$. Applying the Schwarz inequality gives us:

$$\sum_{i \in L} \bigl|[J * z_{L^c}](i)\bigr|^2 \le \sigma \sum_{i \in L} [\rho * z^2_{L^c}](i) \le K_1 \sum_{i \in L} [\alpha * z^2_{L^c}](i), \tag{5.1.3}$$

since the weight $\rho$ is bounded above by a multiple of the weight $\alpha$. To pass to the non-Gaussian case we add $\sum_{i \in L} \bigl(V(x_i) - V_g(x_i)\bigr)$ to the energy. Since each term is bounded below on $\mathbb{R}$, say by $-k_2$, this additional term can be controlled by multiplying $Z$ by $\exp(2k_2 \operatorname{card}(L))$, and thus the upper bound on $\log(Z_{L,z})$ conforms to (5.1.1). Taking into account the terms for which we already have an upper bound, it remains to find an upper bound for the sum of the following three terms for $L = L_n$:

$$\begin{aligned}
I_1 &= \mathbb{E}\bigl(2U_{L,z}(x_L)\exp(-V(W^L))\bigr), \\
I_2 &= -\mathbb{E}\Bigl(\exp(-V(W^L)) \int_0^t \bigl[|\nabla U_{L,z}|^2 - \Delta U_{L,z}\bigr](W_s^L)\, ds\Bigr), \\
I_3 &= \mathbb{E}\bigl(V(W^L)\exp(-V(W^L))\bigr).
\end{aligned}$$
The expectation $I_1$ is just $2U_{L,z}(x_L)$. For $z = 0$ it has an upper bound that is polynomial in $n$, having taken into account the polynomial growth of the self-interaction $V$ and the moderate growth of $x$ at infinity. By utilizing

$$-J(i-j)\, x_i z_j \le \tfrac{1}{2}\rho(i-j)\,(x_i^2 + z_j^2),$$

we see that the terms that include non-zero $z$ can be treated in the same way as in (5.1.3). Thus the upper bound on $I_1$ conforms to (5.1.1).

We now look for an upper bound for $I_2$. We first have that:

$$\begin{aligned}
\bigl[|\nabla U|^2 - \Delta U\bigr](y) &= \sum_{i \in L} \Bigl\{\Bigl(V'(y_i) - \sum_{j \in L} J(i-j)\, y_j - \sum_{j \notin L} J(i-j)\, z_j\Bigr)^2 - V''(y_i)\Bigr\} \\
&\ge \sum_{i \in L} \Bigl\{\tfrac{3}{4} V'^2(y_i) - 3\Bigl(\sum_{j \in L} J(i-j)\, y_j + \sum_{j \notin L} J(i-j)\, z_j\Bigr)^2 - V''(y_i)\Bigr\}.
\end{aligned}$$

Since $\tfrac{3}{4}V'^2 - V''$ is bounded below on $\mathbb{R}$, say by $-k_3$, the corresponding term in $I_2$ has $k_3\, t \operatorname{card}(L)$ as an upper bound. Concerning the other terms, the Schwarz inequality gives us, as in (5.1.3):

$$\Bigl(\sum_{j \in L} J(i-j)\, W_j + \sum_{j \notin L} J(i-j)\, z_j\Bigr)^2 \le K_2\bigl([\rho * (W^L)^2](i) + [\alpha * z^2_{L^c}](i)\bigr),$$

thus showing that the upper bound of $I_2$ is reduced to the estimation of:

$$\mathbb{E}\Bigl(\exp(-V(W^L)) \int_0^t \sum_{i \in L} (W^L_{s,i})^2\, ds\Bigr) = \mathbb{E}\Bigl(\int_0^t \sum_{i \in L} (X^{L,z,x}_{s,i})^2\, ds\Bigr),$$

which has been obtained in (4.2.8). Finally we note that since $\alpha$ is of rapid decrease on $\mathbb{Z}^d$ and $x^2$ is tempered, $\alpha * x^2$ is tempered. It remains to note that $I_3$ is bounded above, independently of all parameters, by the upper bound of $x \mapsto x e^{-x}$ on $\mathbb{R}$. $\square$
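The lower bound for $|\nabla U|^2 - \Delta U$ used in this proof rests on an elementary inequality between squares. A short verification, where $u$ and $v$ are placeholder names introduced here for $V'(y_i)$ and for the interaction sum at site $i$:

```latex
% Placeholders for this sketch: u stands for V'(y_i) and
% v for \sum_{j \in L} J(i-j) y_j + \sum_{j \notin L} J(i-j) z_j.
\[
  (u - v)^2 - \Bigl(\tfrac{3}{4}u^2 - 3v^2\Bigr)
  = \tfrac{1}{4}u^2 - 2uv + 4v^2
  = \Bigl(\tfrac{1}{2}u - 2v\Bigr)^2 \;\ge\; 0 ,
\]
% hence (u - v)^2 \ge \tfrac{3}{4}u^2 - 3v^2 for all real u, v.
% Summing over i \in L and subtracting V''(y_i) gives the stated lower bound.
```

The point of the splitting is that the term $\tfrac{3}{4}V'^2 - V''$ depends on a single site and is bounded below for a polynomial $V$ with positive leading coefficient, while the interaction part is quadratic and can be handled as in (5.1.3).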
Remark 5.1.2. In obtaining the upper bound of $I_1$ we have used the polynomial character of $V$ in a more fundamental way than usual. If we wanted to go to a more general setting, it would be necessary to abandon $S'$ for a configuration space having lesser growth at infinity; for example, $\{x : \sup_n\bigl(\operatorname{card}^{-1}(L_n)\, U_{L_n,0}(x_{L_n})\bigr) < \infty\}$ could be a possibility.

Theorem 5.1.3. Suppose that the Kolmogorov-Langevin process with exterior condition zero on $E_L = \mathbb{R}^L$ has a Gross inequality constant $c$ independent of the finite subset $L$ of $\mathbb{Z}^d$. Then there is a unique Gibbs measure $\mu$ on $S'$ and, for any $x \in S'$, the law $\mathcal{L}(X_t^x)$ tends to $\mu$ in the weak topology for measures on $\mathbb{R}^{\mathbb{Z}^d}$.
PROOF. We replace $L$ by $L_m$ and $M$ by $L$ in the relation (4.2.13) and we denote by $\|\cdot\|_\alpha$ the norm of $\ell^2(\alpha)$. Letting $m$ tend to infinity we have that:

$$\alpha(0)\, \mathbb{E}\|X_t^{L,0,x} - X_t^{x}\|_\alpha^2 \le e^{k't} \sum_{j \notin L} \bigl([\alpha * \alpha](j)\, x_j^2 + c\,|\alpha|_1\, \alpha(j)\bigr). \tag{5.1.4}$$

Let $f$ be a bounded Lipschitz function on $\ell^2(\alpha)$. Its Lipschitz semi-norm and its variation are denoted respectively by:

$$[f]_{\mathrm{lip}} := \sup_{x \ne y} \frac{|f(x) - f(y)|}{\|x - y\|_\alpha}, \qquad \operatorname{osc}(f) := \sup_{x,y} |f(x) - f(y)|.$$

We also denote by $N_t(x, dy)$ the law on $E$ of $X_t^x$ and by $N_t^L(x, dy)$ the law of $X_t^{L,0,x}$. The relation (5.1.4) implies the relation called "finiteness of the speed of propagation":

$$|N_t(x, f) - N_t^{L_n}(x, f)| \le e^{k't/2} A(x, n)\, [f]_{\mathrm{lip}}, \tag{5.1.5}$$

$$\alpha(0)\, A^2(x, n) = \sum_{j \notin L_n} \bigl\{[\alpha * \alpha](j)\, x_j^2 + c\,|\alpha|_1\, \alpha(j)\bigr\}.$$
We introduce an increasing sequence of boxes $L_{n(t)}$ such that $Ct \le n(t) \le C(t+1)$, where the constant $C > 0$ has been adjusted so that

$$\sum_{j \notin L_{n(t)}} [\alpha * \alpha](j) \le e^{-(k' + 2\lambda)t}.$$

The latter is possible for any $\lambda$, since $\alpha * \alpha$ is decreasing exponentially on $\mathbb{Z}^d$. The moderate growth of $x$ implies that, for any $x \in S'$, the function $e^{k't/2} A(x, n(t))$ tends to $0$ when $t$ tends to infinity, with exponential speed for any rate smaller than $\lambda$. We now apply the finite dimensional ergodicity inequality 3.2.4:

$$\|N_t^{L_{n(t)}}(x, \cdot) - \pi_{L_{n(t)},0}\|_{vt}^2 \le 2 I\bigl(N_t^{L_{n(t)}}(x, \cdot) \mid \pi_{L_{n(t)},0}\bigr) \le 2 e^{-2(t-1)/c}\, I\bigl(N_1^{L_{n(t)}}(x, \cdot) \mid \pi_{L_{n(t)},0}\bigr). \tag{5.1.6}$$

The preceding lemma shows that $2I\bigl(N_1^{L_{n(t)}}(x, \cdot) \mid \pi_{L_{n(t)},0}\bigr)$ is bounded above by a polynomial in $n(t)$, thus in $t$, so that:

$$\lim_{t \to \infty} \bigl|N_t^{L_{n(t)}}(x, f) - \pi_{L_{n(t)}}(0, f)\bigr| \le \tfrac{1}{2}\operatorname{osc}(f) \lim_{t \to \infty} \|N_t^{L_{n(t)}}(x, \cdot) - \pi_{L_{n(t)},0}\|_{vt} = 0,$$

and by putting this into (5.1.5), we obtain for any $x \in S'$ that:

$$\lim_{t \to \infty} \bigl(N_t(x, f) - \pi_{L_{n(t)}}(0, f)\bigr) = 0. \tag{5.1.7}$$
We now address the uniqueness of the Gibbs measure. We see that the measures $\pi_{L_n,0}$ remain in a compact subset of the space of probability measures on $E$ equipped with the weak topology and that the limit points are Gibbs measures. Consequently we can find a sequence of times $(t_k)$ and a Gibbs measure $\mu$ such that the sequence $\pi_{L_{n(t_k)}}(0, f)$ converges to $\mu(f)$ for all bounded continuous functions $f$ on $E$. If, in addition, $f$ is a Lipschitz function and $\nu$ is an arbitrary Gibbs measure on $S'$, then the invariance of $\nu$, equation (5.1.7), and the theorem of dominated convergence imply that:

$$\nu(f) = \nu(N_{t_k} f) = \nu\bigl(\lim_{k \to \infty} N_{t_k} f\bigr) = \nu\bigl(\lim_{k \to \infty} \pi_{L_{n(t_k)}}(0, f)\bigr) = \int \mu(f)\, d\nu = \mu(f).$$

In particular, this is applicable to $f$ in the space $T$ of functions depending on only a finite number of coordinates, bounded, and with bounded derivatives. This allows us to apply the theorem of monotone classes and conclude that $\nu = \mu$. This uniqueness implies the uniqueness of the limit points of the weakly relatively compact family $\pi_{L_{n(t)},0}$, and thus, by a property of compact sets in a metrizable space, the convergence of this family to $\mu$ follows. Thus equation (5.1.7) gives the convergence of $N_t(x, f)$ to $\mu(f)$ for any $f \in T$, which is the weak convergence stated in the last part of the theorem. For a discussion of weak convergence see the Appendix, A.3. $\square$
Let $c_{L,z}$ denote a Gross constant for the process in $E_L$ with exterior condition $z$. Finding a Gross constant $c$ satisfying the supposition of the preceding theorem is often done by obtaining an upper bound of $c_{L,z}$ that is uniform in $z$ and $L$. In this case, we can prove the following exponential stabilization property:

Corollary 5.1.4. Suppose that the upper bound $c$ of the constants $c_{L,z}$ with respect to the box $L$ and exterior condition $z$ is finite. Then for $x \in S'$, any $c' > c$, and any $\gamma$, we can find a constant $M$ such that for any bounded Lipschitz function $f$ on $\ell^2(\alpha)$, we have:

$$|N_t(x, f) - \mu(f)| \le M\bigl(\operatorname{osc}(f)\, e^{-t/c'} + [f]_{\mathrm{lip}}\, e^{-\gamma t}\bigr). \tag{5.1.8}$$

PROOF. We can easily take into account a non-zero exterior condition $z$ in (5.1.5) with the aid of (4.2.14); since $k'' < k'$, we have:

$$|N_t(x, f) - N_t^{L_{n(t)},z}(x, f)| \le e^{k't/2}\Bigl(A(x, n(t)) + \sum_{i \notin L_{n(t)}} \alpha(i)\, z_i^2\Bigr)[f]_{\mathrm{lip}}.$$

On the other hand, by utilizing Lemma 5.1.1 to obtain an upper bound for the entropy at time $1$ and replacing $L_{n(t)}$ by $L(t)$, the inequality (5.1.6) is written precisely as follows:

$$|N_t^{L(t),z}(x, f) - \pi_{L(t)}(z, f)| \le K e^{-t/c} \operatorname{osc}(f)\Bigl(1 + n^p(t) + \sum_{i \in L(t)} [\alpha * z^2_{L(t)^c}](i)\Bigr).$$

We combine this with the preceding to obtain:

$$|N_t(x, f) - \pi_{L(t)}(z, f)| \le e^{k't/2}\Bigl(A(x, n(t)) + \sum_{i \notin L(t)} \alpha(i)\, z_i^2\Bigr)[f]_{\mathrm{lip}} + K e^{-t/c}\Bigl(1 + n^p(t) + \sum_{i \in L(t)} [\alpha * z^2_{L(t)^c}](i)\Bigr)\operatorname{osc}(f).$$

By integrating this with respect to the unique Gibbs measure $\mu(dz)$ and introducing an upper bound $q$ for $\int z_j^2\, d\mu$, we obtain:

$$|N_t(x, f) - \mu(f)| \le e^{k't/2}\,[f]_{\mathrm{lip}}\Bigl(A(x, n(t)) + q \sum_{i \notin L(t)} \alpha(i)\Bigr) + K e^{-t/c} \operatorname{osc}(f)\Bigl(1 + n^p(t) + q \sum_{i \in L(t),\ j \notin L(t)} \alpha(i-j)\Bigr).$$

By choosing the constant $C$ of the growth of the box $L(t)$ sufficiently large, we obtain the result we were looking for. $\square$

The following convex case appears to be trivial in the context of mathematical physics, although it is already interesting as a mathematical model:
Corollary 5.1.5. Let $a := \inf_{x \in \mathbb{R}} V''(x)$. If $a > \sigma$, there is a unique Gibbs measure and an exponential stabilization of the Glauber-Langevin process towards this measure.

PROOF. We are able to write $V(x) = V_1(x) + V_2(x)$, where $V_1(x) = \frac{1}{2}ax^2$ and $V_2$ is convex. Thus $U_{L,z}$ is the sum of a convex function and the energy $U_{1,L,z}$ associated to $V_1$. $U_{1,L,z}$ is a quadratic form for which the Hessian is bounded below by $(a - \sigma) I_L$, for all $L$, by Hadamard's Lemma. The Theorem of Bakry and Emery shows the existence of a Gross constant independent of $L$ and $z$. $\square$
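The lower bound on the Hessian can be verified directly. The sketch below assumes that the quadratic part of $U_{1,L,z}$ has the form $\frac{a}{2}\sum_{i \in L} x_i^2 - \frac{1}{2}\sum_{i,j \in L} J(i-j)x_i x_j$ (so that its Hessian is $aI_L - (J(i-j))_{i,j \in L}$), that $J$ is symmetric, and that $\sigma = \sum_i |J(i)|$; these conventions are taken from the surrounding text as far as it can be read.

```latex
% Assumed form of the Hessian: H = a\,I_L - \bigl(J(i-j)\bigr)_{i,j \in L},
% with J(i) = J(-i) and \sigma = \sum_{i \in \mathbb{Z}^d} |J(i)|.
% For x \in \mathbb{R}^L, using |x_i x_j| \le \tfrac12 (x_i^2 + x_j^2):
\begin{align*}
  \Bigl|\sum_{i,j \in L} J(i-j)\, x_i x_j\Bigr|
  &\le \frac{1}{2} \sum_{i,j \in L} |J(i-j)|\,(x_i^2 + x_j^2)
   = \sum_{i \in L} x_i^2 \sum_{j \in L} |J(i-j)|
   \;\le\; \sigma \sum_{i \in L} x_i^2 , \\
  (x, Hx) &= a \sum_{i \in L} x_i^2 - \sum_{i,j \in L} J(i-j)\, x_i x_j
   \;\ge\; (a - \sigma) \sum_{i \in L} x_i^2 .
\end{align*}
% The bound does not depend on L, and the linear term in z does not affect H.
```

Since the bound is uniform in the box and in the exterior condition, the curvature criterion yields one Gross constant for all $(L, z)$, which is what the corollary needs.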
Remark 5.1.6. This result can also be shown by the much more direct method employed in Proposition 3.1.25; see [Roy78]. In addition, it can be extended to mildly perturbed situations where $U_{L,z}$ is no longer convex. This extension is difficult but can be carried out by a detour using a principle discussed at the end of the chapter.

Remark 5.1.7. In the setting of the next theorem one can evidently state the following infinite dimensional Gross inequality for $c = c_0(1 - \gamma)^{-2}$:

$$\forall f \in T: \qquad \int f^2 \log|f|\, d\mu \le \frac{c}{2} \int \sum_{i \in \mathbb{Z}^d} |\partial_i f|^2\, d\mu + \log\|f\|_{L^2(\mu)} \int f^2\, d\mu.$$

In the article of S. Albeverio, Yu. G. Kondratiev, and M. Röckner [AKR95], it is proved that this inequality can be extended to the domain of the infinitesimal generator of the transition semi-group of the process.
5.2. The case of weak interactions

By going more deeply into the ideas of Dobrushin, B. Zegarlinski has shown the existence of a Gross inequality for the Glauber-Langevin dynamical model by a site-by-site study when the interaction between the sites is sufficiently weak. Zegarlinski's study is applicable to the case when the space of elementary spins is a finite set, a compact manifold, or $\mathbb{R}$. From now on, we will only be interested in the last case. However, the theorem that we prove is valid without changes for an arbitrary set of sites $S$.

Let $\mathcal{C}$ be the space of bounded functions $f$ on $E = \mathbb{R}^S$ that have a bounded first derivative $\partial_i f$ with respect to each coordinate $x_i$. Zegarlinski's hypotheses are:

(H1) For each $i \in S$ and each $z \in E$, the measure $\pi_i(z, \cdot)$ satisfies a Gross inequality with constant $c_0$ independent of $i$ and of $z$. In particular:

$$\pi_i(f^2 \log|f|) \le \frac{c_0}{2}\, \pi_i(\partial_i f)^2 + \pi_i(f^2) \log\bigl((\pi_i f^2)^{1/2}\bigr),$$

for all $f$ in $\mathcal{C}$.

(H2) For any function $f$ in $\mathcal{C}$, the function $(\pi_i f^2)^{1/2}$ is in $\mathcal{C}$. In addition, there exists a matrix $C_{ji}$ and a number $\gamma < 1$ such that:

$$\forall i \quad \sum_{j \ne i} C_{ij} \le \gamma \qquad\text{and}\qquad \forall j \quad \sum_{i \ne j} C_{ij} \le \gamma,$$

$$\forall j \ne i: \qquad \bigl|\partial_j (\pi_i f^2)^{1/2}\bigr| \le \bigl(\pi_i(\partial_j f)^2\bigr)^{1/2} + C_{ji}\bigl(\pi_i(\partial_i f)^2\bigr)^{1/2}. \tag{5.2.1}$$
Theorem 5.2.1. Under the two hypotheses above, the local specification $\nu = \pi_\Lambda(z, \cdot)$ satisfies a logarithmic Sobolev inequality with constant $c_0(1 - \gamma)^{-2}$, which is independent of $\Lambda$ and $z$:

$$\nu(f^2 \log|f|) \le \frac{c_0}{2(1 - \gamma)^2} \sum_{i \in \Lambda} \nu(\partial_i f)^2 + \nu(f^2) \log\bigl(\nu(f^2)\bigr)^{1/2} \tag{5.2.2}$$

for all $f$ in $\mathcal{C}$.

PROOF. We fix $z$. The measure $\nu$ is invariant for the kernels $\pi_i$ for $i \in \Lambda$ by (4.2.1). We identify $\Lambda$ with $\{1, 2, 3, \ldots, N\}$ by fixing an enumeration of $S$, and we extend the definition of the operators $\pi_k$ and $\partial_k$ for $k > N$ by setting $\pi_{k+N} = \pi_k$ and $\partial_{k+N} = \partial_k$. We define the function $f_k$ recursively by $f_0 = f$ and $f_k = (\pi_k f_{k-1}^2)^{1/2}$.

Step 1: Utilizing (H1) and the invariance of $\nu$ with respect to the kernel $\pi_k$, we see that:

$$\begin{aligned}
\nu(f_{k-1}^2 \log|f_{k-1}|) &= \nu\bigl(\pi_k(f_{k-1}^2 \log|f_{k-1}|)\bigr) \\
&\le \tfrac{1}{2} c_0\, \nu\bigl(\pi_k(\partial_k f_{k-1})^2\bigr) + \nu\bigl(\pi_k(f_{k-1}^2) \log(\pi_k f_{k-1}^2)^{1/2}\bigr) \\
&= \tfrac{1}{2} c_0\, \nu(\partial_k f_{k-1})^2 + \nu(f_k^2 \log|f_k|).
\end{aligned}$$

By summing these inequalities we obtain:

$$\nu(f^2 \log|f|) \le \frac{c_0}{2} \sum_{k=1}^{n} \nu(\partial_k f_{k-1})^2 + \nu(f_n^2 \log|f_n|), \tag{5.2.3}$$

for any $n$.
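The summation in Step 1 is a telescoping argument. A sketch, with the abbreviations $A_k$ and $B_k$ introduced here for readability and assuming the recursion $f_k = (\pi_k f_{k-1}^2)^{1/2}$ and the constant $c_0/2$ from (H1):

```latex
% Abbreviations introduced for this sketch:
%   A_k := \nu\bigl(f_k^2 \log|f_k|\bigr), \qquad
%   B_k := \tfrac{c_0}{2}\, \nu\bigl((\partial_k f_{k-1})^2\bigr).
% Step 1 says A_{k-1} \le B_k + A_k for every k \ge 1, where one uses
%   \nu(\pi_k g) = \nu(g)                       (invariance of \nu under \pi_k),
%   \pi_k(f_{k-1}^2) = f_k^2                    (definition of f_k).
\[
  A_0 \;\le\; B_1 + A_1 \;\le\; B_1 + B_2 + A_2 \;\le\; \dots
  \;\le\; \sum_{k=1}^{n} B_k + A_n ,
\]
% which is (5.2.3), since A_0 = \nu(f^2 \log|f|).
```

The remaining steps then have two separate tasks: bound the sum of the $B_k$ by gradient terms of the original $f$ (Steps 2 to 4), and show that the remainder $A_n$ converges to $\nu(f^2)\log(\nu(f^2))^{1/2}$ (Step 5).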
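Step 2: The goal in this step is to find an estimate for $\partial_p f_q$ for $q < p$ by iteration of the property (H2). Let $\Gamma_{p,i}$ be the set of paths in $\mathbb{N}^*$ strictly decreasing from $p$ to $i$:

$$\gamma = \{i_0, \ldots, i_t : p = i_0 > i_1 > \cdots > i_t = i\},$$

and let $\Gamma^q_{p,i}$ be the subset of the paths such that $i_1 \le q$. We associate to the latter the constant:

$$\lambda^q_{p,i} = \sum_{\gamma \in \Gamma^q_{p,i}} C(\gamma) \qquad\text{where}\qquad C(\gamma) = \prod_{s=1}^{t} C_{i_{s-1}, i_s}.$$

We show by recurrence on $q$ that:

$$|\partial_p f_q| \le \bigl(\pi_q \pi_{q-1} \cdots \pi_1 (\partial_p f)^2\bigr)^{1/2} + \sum_{i} \lambda^q_{p,i} \bigl(\pi_q \pi_{q-1} \cdots \pi_1 (\partial_i f)^2\bigr)^{1/2}. \tag{5.2.4}$$

For $q = 1$, this follows from (H2); to pass from $q - 1$ to $q$ we write:

$$|\partial_p f_q| = \bigl|\partial_p (\pi_q(f_{q-1}^2))^{1/2}\bigr| \le \bigl(\pi_q(\partial_p f_{q-1})^2\bigr)^{1/2} + C_{p,q} \bigl(\pi_q(\partial_q f_{q-1})^2\bigr)^{1/2}.$$

By utilizing the recurrence hypothesis, we obtain a relation analogous to (5.2.4) but with the coefficient $\lambda'^{\,q}_{p,i} = \lambda^{q-1}_{p,i} + C_{p,q}\, \lambda^{q-1}_{q,i}$. By noting that a path in $\Gamma^q_{p,i}$ is obtained in two ways starting from a path in $\Gamma^{q-1}_{p,i}$, by introducing or not one step by $q$, we see that $\lambda'^{\,q}_{p,i} = \lambda^q_{p,i}$, which establishes (5.2.4).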
Step 3: We define what we will call an admissible path in $\mathbb{N}^*$: this is a strictly increasing sequence of integers such that the difference between two successive steps is strictly less than $N$ and for which the last element is at most $N$, or a path consisting of a single element $\le N$. We denote by $A_{p,i}$ the set of admissible paths from $p$ to $i$. We set $\lambda_{p,i} = 0$ if $A_{p,i}$ is empty, $\lambda_{i,i} = 1$ for $1 \le i \le N$, and otherwise

$$\lambda_{p,i} = \sum_{\gamma \in A_{p,i}} C(\gamma).$$

The goal of this step is to prove:

$$|\partial_p f_{p-1}| \le \sum_{i \in \Lambda} \lambda_{p,i} \bigl(\pi_{p-1} \pi_{p-2} \cdots \pi_1 (\partial_i f)^2\bigr)^{1/2}. \tag{5.2.5}$$

When $p \le N$ this follows right away by applying (5.2.4) for $q = p - 1$. To treat the case $p > N$, the idea is to note that $\partial_p \pi_{p-N} = \partial_p \pi_p = 0$, since $\pi_p(\cdot, z)$ does not depend on $z_p$. By utilizing the inequality (5.2.1) $N - 1$ times, starting from $i = p - 1$ with $j = p$, where we go from the $k$th step to the $(k+1)$st step by applying (5.2.1) to the first term of the $k$th inequality, and stopping the process when $\partial_p \pi_{p-N}$ is zero, we obtain:

$$|\partial_p f_{p-1}| \le \sum_{m = p-N+1}^{p-1} C_{p,m} \bigl(\pi_{p-1} \pi_{p-2} \cdots \pi_m (\partial_m f_{m-1})^2\bigr)^{1/2}. \tag{5.2.6}$$

Now $A_{p,i}$ is obtained from the disjoint union of the $A_{m,i}$ for $p - N + 1 \le m \le p - 1$. The inequality (5.2.6) then shows that (5.2.5) is true for a $p > N$ if it is true for the preceding $N - 1$ numbers. Thus the inequality (5.2.5) is proved.
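Step 4: We now consider a non-stationary path in $\Lambda$ in the usual sense, i.e., a sequence $\phi(s) \in \Lambda$ with $0 \le s \le t$ such that two consecutive elements are never equal and the path ends at $i$. It is clear that $\phi$ lifts in a unique manner to an admissible path $\gamma$ in $\mathbb{N}^*$ satisfying:

$$\gamma(t) = i, \qquad \gamma(s) \equiv \phi(s) \pmod N.$$

We introduce the matrix $\bar C_{i,j}$ defined by restricting the coefficients of $C_{ij}$ to $\Lambda$, agreeing that $\bar C_{i,i} = 0$, and we set $V := \sum_{r \ge 0} \bar C^{\,r}$. Thus $V_{i,j}$ is the sum of the weights of all of the paths $\phi$ going from $i$ to $j$. Therefore, by (H2), we have for each integer $p$ that:

$$\sum_{i \in \Lambda} \lambda_{p,i} \le \sum_{i \in \Lambda} V_{\bar p, i} \le (1 - \gamma)^{-1}, \qquad \sum_{p \ge 1} \lambda_{p,i} = \sum_{j \in \Lambda} V_{j,i} \le (1 - \gamma)^{-1},$$

where $\bar p$ denotes the integer in $\Lambda$ that is congruent to $p$ modulo $N$. By the Schwarz inequality, (5.2.5) implies:

$$(\partial_p f_{p-1})^2 \le (1 - \gamma)^{-1} \sum_{i \in \Lambda} \lambda_{p,i}\, \pi_{p-1} \pi_{p-2} \cdots \pi_1 (\partial_i f)^2,$$

which combined with (5.2.3) gives us:

$$\begin{aligned}
\nu(f^2 \log|f|) &\le \tfrac{1}{2} c_0 (1 - \gamma)^{-1} \sum_{k=1}^{n} \sum_{i \in \Lambda} \lambda_{k,i}\, \nu(\partial_i f)^2 + \nu(f_n^2 \log|f_n|) \\
&\le \tfrac{1}{2} c_0 (1 - \gamma)^{-2} \sum_{i \in \Lambda} \nu(\partial_i f)^2 + \nu(f_n^2 \log|f_n|).
\end{aligned} \tag{5.2.7}$$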
Step 5: If we set $\varphi_k := f_{kN}^2$, we obtain $\varphi_{k+1} = M\varphi_k$, where $M$ is the composition of kernels $M = \pi_N \pi_{N-1} \cdots \pi_1$. We consider the semi-norm:

$$[\varphi]_d = \sum_{i=1}^{N} \|\partial_i \varphi^{1/2}\|_\infty,$$

defined for any function $\varphi$ such that $|\varphi|^{1/2}$ is in $\mathcal{C}$, and we show that it strictly decreases under the action of $M$, i.e.,

$$[M\varphi]_d \le \gamma\, [\varphi]_d.$$

We take $\varphi$ to be positive to simplify things. Rather than describe this using complicated multiple indices, which is perfectly possible, we will proceed in an intuitive fashion. We think of $[\varphi]_d$ as the total weight of a pile of stones, where the weight of the part of the pile placed on $i$ is $\|\partial_i \varphi^{1/2}\|_\infty$. The hypothesis (H2) applied to $f = \varphi^{1/2}$ gives, in particular, for $i \ne j$ that:

$$\|\partial_j (\pi_i \varphi)^{1/2}\|_\infty \le \|\partial_j \varphi^{1/2}\|_\infty + C_{ji}\, \|\partial_i \varphi^{1/2}\|_\infty,$$

while evidently $\|\partial_i (\pi_i \varphi)^{1/2}\|_\infty = 0$. We can say that the stones placed on $i$ have all been broken by $\pi_i$ and the pieces placed on other sites $j$, proportionally to the coefficients $C_{ji}$, with the mass coming from this pile multiplied by $\sum_{j \in \Lambda} C_{ji} \le \gamma < 1$. It is easy to imagine that the missing mass corresponds to the blowing away by the wind of the powder created in breaking up the stones. Since the operation $M$ corresponds to a passage through all of the sites, no stone remains intact, most of them having been broken up several times. Thus it is clear that the total mass would have decreased at least proportionally to $\gamma$.

Thus $[\varphi_k]_d$ tends to $0$ exponentially. It is the same for the Lipschitz seminorm $[\varphi_k]_{\mathrm{lip}}$ associated to a norm $|\cdot|$ on $\mathbb{R}^\Lambda$, since these two semi-norms are equivalent. It follows from this that $\varphi_k$ tends to the mean with respect to $\nu$, i.e., to $\nu(f^2)$, since:

$$|\varphi_k(x) - \nu(\varphi_k)| = \Bigl|\int \bigl(\varphi_k(x) - \varphi_k(y)\bigr)\, \nu(dy)\Bigr| \le [\varphi_k]_{\mathrm{lip}} \int |x - y|\, \nu(dy),$$

where the last integral is finite. For $k$ sufficiently large the Lipschitz seminorm is less than $1$ and, for a suitable constant $a$, $f_{kN} = |\varphi_k(x)|^{1/2}$
Remark 5.2.2. For the last step we could also utilize stronger arguments such as those that appear in the theory of Harris Markov chains [MT97]. In particular, if we restrict the kernel $M$ to the space $\bar E = E_{\Lambda,z}$ of configurations that are equal to $z$ outside of $\Lambda$, the probabilities $\delta_\zeta M = M(\zeta, \cdot)$ are mutually equivalent for $\zeta \in \bar E$ and have $\nu$ as an invariant probability measure. In this case it is known that the sequence of probabilities $\delta_\zeta M^n$ converges in the sense of total variation to $\nu$ for any $\zeta$. Since $f$ is bounded, it follows that $M^n f^2$ converges everywhere to $\nu(f^2)$. However, this argument is limited to finite dimensions, while the argument we have employed is Dobrushin's original one and can be adapted to show directly the uniqueness of the Gibbs measures in the case of weak interactions without having to use evolutions in time.

We now look for conditions under which the preceding result is applicable. We suppose, as usual, that $V(x)$ is a polynomial of degree $2m$ that is bounded below, i.e., the leading coefficient is positive. Recall that $\sigma = \sum_{i \in \mathbb{Z}^d} \rho(i)$ with $\rho(i) = |J(i)|$.
Theorem 5.2.3. Fix the self-interaction $V$. Then, for $\sigma$ sufficiently small, there exists a Gross inequality independent of the box $\Lambda$ and the exterior condition $z$. Therefore there exist values of $\sigma$ for which there is a unique Gibbs measure $\mu$ on $S'$ and exponential stabilization of the Glauber-Langevin process to $\mu$.

PROOF. It is sufficient to establish Zegarlinski's hypotheses (H1) and (H2). To establish (H1) we note, for example, that we can decompose $V$ as the sum of two infinitely differentiable functions $V_1 + V_2$, where $k = \inf V_1''(x) > 0$ and $V_2$ has compact support. In fact, let $\chi$ be a positive function in $C_c^\infty$ that is strictly positive on the set $\{V'' \le 0\}$. Then set:

$$V_2''(x) := -2k\chi(x) + k\chi(x + h) + k\chi(x - h);$$

since $V_2''$ is an even function with zero integral, it is the second derivative of a function $V_2$ in $C_c^\infty$. Thus, by taking $k$, then $h$, sufficiently large, we have the decomposition that we wanted. We apply the Theorem of Bakry and Emery to the Boltzmann measure associated to an energy of the form $V_1(x) + \zeta x$, which corresponds to the energy at the site $i$ under the exterior condition, to obtain a Gross and Poincaré inequality with constant $c_1 = k^{-1}$, independent of $\zeta$. By perturbation we find a constant $c_1 e^{2\operatorname{osc} V_2}$, independent of the exterior configuration $z$, which then works for the full interaction.

We now establish (H2). By the formula for the derivatives of Boltzmann measures we obtain:

$$\begin{aligned}
\bigl|\partial_j (\pi_i f^2)^{1/2}\bigr| &= \bigl|(\pi_i f^2)^{-1/2}\, \pi_i(f\, \partial_j f) + J(i-j)\, (\pi_i f^2)^{-1/2} \operatorname{cov}_{\pi_i}(f^2, x_i)\bigr| \\
&\le \bigl(\pi_i(\partial_j f)^2\bigr)^{1/2} + \rho(i-j)\, (\pi_i f^2)^{-1/2}\, \bigl|\operatorname{cov}_{\pi_i}(f^2, x_i)\bigr|.
\end{aligned} \tag{5.2.8}$$
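The differentiation formula behind (5.2.8) can be sketched as follows. The sketch assumes that $\pi_i(z, dx_i)$ has density proportional to $\exp(-2U_{\{i\},z}(x_i))$ and that the only dependence of $U_{\{i\},z}$ on $z_j$ is through the term $-J(i-j)x_i z_j$; this normalization is inferred from the surrounding formulas and is an assumption of the sketch.

```latex
% Assumption: \pi_i g = \int g\, e^{-2U}\,dx_i \Big/ \int e^{-2U}\,dx_i,
% with \partial_{z_j}(-2U) = 2 J(i-j)\, x_i for j \neq i.
% Differentiating the quotient with respect to z_j:
\[
  \partial_j(\pi_i g)
  = \pi_i(\partial_j g) + 2J(i-j)\bigl(\pi_i(g\,x_i) - \pi_i(g)\,\pi_i(x_i)\bigr)
  = \pi_i(\partial_j g) + 2J(i-j)\,\operatorname{cov}_{\pi_i}(g, x_i).
\]
% Taking g = f^2 and using \partial_j \sqrt{u} = \partial_j u / (2\sqrt{u}):
\[
  \partial_j(\pi_i f^2)^{1/2}
  = (\pi_i f^2)^{-1/2}\Bigl(\pi_i(f\,\partial_j f) + J(i-j)\,\operatorname{cov}_{\pi_i}(f^2, x_i)\Bigr).
\]
% The Schwarz inequality |\pi_i(f\,\partial_j f)| \le (\pi_i f^2)^{1/2} (\pi_i(\partial_j f)^2)^{1/2}
% and |J(i-j)| = \rho(i-j) then give the inequality in (5.2.8).
```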
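After writing the covariance as a double integral, we apply the Schwarz inequality to see that:

$$\begin{aligned}
(\pi_i f^2)^{-1/2}\, \bigl|\operatorname{cov}_{\pi_i}(f^2, x_i)\bigr| &= \tfrac{1}{2} (\pi_i f^2)^{-1/2} \Bigl|\int_{\mathbb{R} \times \mathbb{R}} \bigl(f^2(x) - f^2(y)\bigr)(x - y)\, \pi_i(\cdot, dx)\, \pi_i(\cdot, dy)\Bigr| \\
&\le \Bigl(\int_{\mathbb{R} \times \mathbb{R}} \bigl(f(x) - f(y)\bigr)^2 (x - y)^2\, \pi_i(\cdot, dx)\, \pi_i(\cdot, dy)\Bigr)^{1/2}.
\end{aligned} \tag{5.2.9}$$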
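The Schwarz step for the covariance can be written out as below. This is a sketch for a generic probability measure $\pi$ on $\mathbb{R}$ standing for $\pi_i(\cdot, dx)$; the constant obtained this way is the natural one, though the book's normalization at this point could not be read with certainty.

```latex
% \pi is a probability measure on \mathbb{R} (placeholder for \pi_i(\cdot, dx)).
% Symmetrization of the covariance:
\[
  \operatorname{cov}_\pi(g, x)
  = \tfrac{1}{2} \iint \bigl(g(x) - g(y)\bigr)(x - y)\, \pi(dx)\,\pi(dy).
\]
% With g = f^2, factor f^2(x) - f^2(y) = (f(x) - f(y))(f(x) + f(y)) and apply Schwarz:
\[
  \bigl|\operatorname{cov}_\pi(f^2, x)\bigr|
  \le \tfrac{1}{2}
      \Bigl(\iint (f(x) + f(y))^2\, \pi(dx)\pi(dy)\Bigr)^{1/2}
      \Bigl(\iint (f(x) - f(y))^2 (x - y)^2\, \pi(dx)\pi(dy)\Bigr)^{1/2}.
\]
% The first factor is controlled by
\[
  \iint (f(x) + f(y))^2\, \pi(dx)\pi(dy) = 2\pi(f^2) + 2\pi(f)^2 \le 4\,\pi(f^2),
\]
% so that
\[
  \pi(f^2)^{-1/2}\, \bigl|\operatorname{cov}_\pi(f^2, x)\bigr|
  \le \Bigl(\iint (f(x) - f(y))^2 (x - y)^2\, \pi(dx)\pi(dy)\Bigr)^{1/2}.
\]
```

What remains is therefore to bound the weighted double integral by a multiple of $\pi(f'^2)$, which is the purpose of the change of variables that follows.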
After making the change of variables x = p + q and y = q - p, we consider the conditional laws for the measure Tri (. , dx)7ri (. , dy) with respect to q. With q being fixed, the conditional measure is the Boltzmann probability pq on IR associated to the potential: U{;},: (p + q) + U{i}.z(q - p).
In this formula, with the interactions J(i - j)zj(p + q) + J(i - j)zj(q - p) making a contribution independent of p, the measure pq is also a Gibbs measure for the potential: (5.2.10)
Vq = V (p + q) + V (q - p).
Our next goal is to utilize the Dirichlet form associated to $\rho_q$ to find, independently of $q$, an upper bound for the integral $\int_{\mathbb R} F_q^2(p)\,p^2\,\rho_q(dp)$ where:
$$F_q(p) = f(p+q) - f(q-p). \tag{5.2.11}$$
We begin by considering the region $|p| \ge 1$. By the isomorphism of the Hilbert space $L^2(\rho_q)$ onto $L^2(dp)$ defined by:
$$\varphi \mapsto Z_q^{-1/2}\exp(-V_q)\,\varphi,$$
we see that the operator $A_q\varphi := -\tfrac12\varphi'' + V_q'\varphi'$ is transformed into the Schrödinger operator:
$$H_q\psi := -\tfrac12\psi'' + W_q\psi \quad\text{where}\quad W_q = \tfrac12\bigl(V_q'^2 - V_q''\bigr)$$
with domain $C_c^\infty$. Let $D(x,y)$ be the polynomial of total degree $2m-2$ defined by:
$$V'(x) - V'(y) = (x-y)\,D(x,y).$$
Since the leading coefficient of $V$ is strictly positive, there exist constants $a > 0$ and $b$ such that $D^2(x,y) \ge 2^{-2m}a(x^2+y^2)^{2m-2} - b$, which implies for any $p$ and $q$ the lower bound:
$$V_q'^2(p) \ge a\,p^2(p^2+q^2)^{2m-2} - b.$$
Since $V_q''(p)$ is a polynomial in $p$ and $q$ of total degree $2m-2$, we can find a constant $b'$ such that:
$$\mathbb 1_{|p|\ge1}\,W_q(p) \ge \mathbb 1_{|p|\ge1}\bigl(a(p^2+q^2)^{2m-2} - b'\bigr) \ge \mathbb 1_{|p|\ge1}\,(a p^2 - b').$$
Since the second derivative operator is negative in $L^2(dp)$, $aH_q + b'\,\mathrm{Id}$ is an upper bound of the multiplication operator $p^2\,\mathbb 1_{|p|\ge1}$, and by utilizing the Hilbert space isomorphism above we see that $aA_q + b'\,\mathbb 1_{|p|\ge1}$ is an upper bound of the multiplication operator on $L^2(\rho_q)$. With the aid of the Dirichlet form
$$\mathcal E_q(F,F) = (F, A_qF)_{L^2(\rho_q)} = \tfrac12\int F'^2\,d\rho_q$$
this upper bound is written: for any $F$ in $H^1(\rho_q)$, we have
$$\int_{|p|\ge1} p^2F^2\,d\rho_q \le a\,\mathcal E_q(F,F) + b'\int_{|p|\ge1}F^2\,d\rho_q. \tag{5.2.12}$$
We can take into account the region $|p| \le 1$ by letting $b'' := b' \vee 1$, which implies:
$$\int p^2F^2\,d\rho_q \le a\,\mathcal E_q(F,F) + b''\int F^2\,d\rho_q. \tag{5.2.13}$$
The formulas (5.2.10) and (5.2.11) show that the function $F_q$ is odd and, therefore, its integral is zero with respect to the symmetric measure $\rho_q$. The Poincaré inequality will give us an upper bound for the second term on the right-hand side of (5.2.13). The decomposition of $V$ discussed earlier leads to a comparable decomposition of $V_q$ and we obtain in the same way a Gross constant, and thus a Poincaré constant, $c_2 = c_1\exp(2\operatorname{osc}V_2)$ independent of $q$. By putting this into (5.2.13), we obtain:
$$\int p^2\bigl(f(p+q) - f(q-p)\bigr)^2\,d\rho_q \le \tfrac12(a + c_2b'')\int\bigl(f'(p+q) + f'(q-p)\bigr)^2\,\rho_q(dp) \le (a + c_2b'')\int\bigl(f'^2(p+q) + f'^2(q-p)\bigr)\,\rho_q(dp).$$
This inequality is independent of $q$ and $z$; therefore, by integrating with respect to the law of $q = (x+y)/2$ and changing to the variables $x$ and $y$, we obtain:
$$\int_{\mathbb R\times\mathbb R}\bigl(f(x) - f(y)\bigr)^2(x-y)^2\,\pi_i(\cdot,dx)\,\pi_i(\cdot,dy) \le 2(a + b''c_2)\int f'^2(x)\,\pi_i(\cdot,dx),$$
which, taking into account (5.2.9) and (5.2.8), proves the hypothesis (H2) as soon as v is sufficiently small so that 2(a + b"c2)a2 < 1. O
5.3. Perspectives

The method of logarithmic Sobolev inequalities in statistical mechanics goes well beyond the results already discussed. The leitmotiv in these applications has been the proof of ergodicity results while controlling the Gross constant $c(L, z)$ of the local specification in the box $L$ with exterior condition $z$.
In the case where this constant is uniformly bounded in $L$ and $z$, Corollary 5.1.4 shows the stabilization with exponential speed. Beyond the case of weak interactions, D. Stroock and B. Zegarlinski have shown the existence of a uniform Gross inequality in other situations. The main idea is to iterate the kernels constructed with the aid of the local specifications of finite boxes rather than isolated sites. They have obtained (see [SZ92]) in the case of finite and compact spin models a uniform Gross inequality when there exists a de-correlation, uniform with respect to the measures $\pi_{L,z}$, between any two boxes A and A', which decays exponentially as a function of the distance between these two boxes. In the case of real spins with $x_ix_j$-interactions, Zegarlinski obtained in an analogous manner [Zeg96] the following result:
Theorem 5.3.1. If for certain constants $K$ and $\lambda > 0$ and sufficiently large $L$ we have:
$$\bigl|\operatorname{cov}_{\pi_{L,z}}(x_i, x_j)\bigr| \le K e^{-\lambda|i-j|} \tag{5.3.1}$$
uniformly in $z$, then the Gross inequality holds uniformly in $L$ and $z$.
It is rather easy to establish this exponential de-correlation in the one-dimensional case in the spirit of the proof of Theorem 4.2.17. In particular there is the following result which is proved in detail in [Zeg96]:
Theorem 5.3.2. For $d = 1$, there is exponential stabilization to the unique tempered Gibbs measure for the Ising model with real spins.

Another case where (5.3.1) is satisfied is the case of Dobrushin's uniqueness theorem. This is carried out by H. Künsch in [Ku82]. Dobrushin's uniqueness theorem is also valid for a Euclidean field on a lattice with an almost convex interaction; see [Roy77]. In particular, one ends up with the following result that is very close to Theorem 5.2.1:
Theorem 5.3.3. Consider the Ising model defined by the interaction potentials on Zd:
V{ij}(x)=
I (xi-xj)2,ij,
$V_{\{i\}}(x) = P + \lambda Q$,
where $P$ is a non-quadratic convex polynomial and $Q$ is a polynomial with the same or lesser degree. For $\lambda$ sufficiently small, there is exponential stabilization.
The extension of the hypercontractivity result of Stroock and Zegarlinski to the case of interactions with infinite range has been studied by E. Laroche in [La95].

To end our discussion we will give a quick glance at other important results in statistical mechanics that can be obtained with the aid of the Gross inequality, results which mainly apply to the case of finite or bounded spins. The corresponding proofs are often complex and utilize deep ideas that are tied to statistical mechanics; see e.g., [Sim93, Lig85, Spn91, Com90].
Concerning Ising models with random interactions $J(i)$ (spin glasses) we point out an interesting result of A. Guionnet and B. Zegarlinski [GZ97, GZ98] which says: for bounded interactions and a sufficiently high temperature there is a control of $c(L, z)$ of the form $\exp(p\log^{d-1}(n))$ that leads almost surely to a subexponential rate of stabilization $\exp(-t^{\theta})$ where $\theta < 1$.

In the case of ordinary Ising models at sufficiently low temperatures and $d > 1$, there no longer is a unique Gibbs measure but two extremal Gibbs measures $\mu_+$ and $\mu_-$ in the convex set of Gibbs measures. F. Martinelli [Mar97] has shown in dimension $d = 2$ with the aid of an estimate of the Poincaré constant of the order $\exp(o(n)n)$ that the Glauber semi-group acting on $L^2(\mu_+)$ converges to the projection on the constant functions with a rate of the order $t^{-a}$.

The method of logarithmic Sobolev inequalities is also well-developed for conservative random evolutions and, in particular, for exclusion processes. An example is the case where the spin space is $\{0,1\}$, with the value $x_i = 1$ being interpreted as the presence of a particle at site $i$ that can jump to neighboring sites at a suitable rate, provided that they are free. S.L. Lu and H.T. Yau have been able to find a control for the Gross constant of order $n^2$ that leads to a rate of convergence of the order $t^{-d/2}$ in the $L^2$-space of an invariant measure. Here the invariant measures are the Gibbs measures corresponding to the mechanism describing the jump process. One can consult [Y97, BZ97] and their bibliographies to find out more about this topic.
APPENDIX A
A.1. Markovian kernels

Definition A.1.1. Let $(E, \mathcal E)$ and $(F, \mathcal F)$ be two measurable spaces. A Markovian kernel from $E$ to $F$, or simply a kernel, is a mapping $N : E \times \mathcal F \to \mathbb R_+$ such that for any $x \in E$ the mapping $A \mapsto N(x, A)$ is a probability measure on $\mathcal F$, also denoted $N(x, dy)$, and $x \mapsto N(x, A)$ is a measurable function on $E$ for any $A \in \mathcal F$.

In the case where $E = F$, we will just say a kernel on $E$.
A.1.1. Generalities. Let $m$ be a positive measure on $E$ and $f$ a positive (bounded) measurable function on $F$ with values in $\mathbb R_+$. We define a positive measure $mN$ on $F$ and a positive (bounded) measurable function $Nf$ on $E$ by
$$mN(A) := \int_E N(x,A)\,m(dx), \qquad Nf(x) := \int_F f(y)\,N(x,dy).$$
We also employ the notation $N(x, f)$. If we denote the integral of $f$ with respect to $m$ by $(m, f)$, we have $(m, Nf) = (mN, f)$. In addition, $f \ge 0$ implies $Nf \ge 0$, $N1 = 1$, and if $m$ is a probability measure, so is $mN$. The tensor product $m \otimes N$, where $m$ is a measure on $E$, denotes the measure on $E \times F$ such that
$$(m \otimes N)(A \times B) = \int_A N(x, B)\,m(dx).$$
One can characterize $m \otimes N$ as the measure on $E \times F$ for which the projection onto the first factor is $m$ and which "is decomposed" relative to this first projection into the family of measures $x \mapsto N(x, \cdot)$ indexed by $E$. The proof of the existence and uniqueness of the tensor product of a measure and a kernel is analogous to the proof of Fubini's Theorem. In the case where $E = F$ we can iterate this process for the kernels $N_1, N_2, \dots, N_l$ to obtain a measure $m \otimes N_1 \otimes N_2 \otimes \cdots \otimes N_l$ characterized as follows: for any positive measurable $f$ (or bounded if $m$ is bounded) on $E^{l+1}$, we have:
$$(m \otimes N_1 \otimes N_2 \otimes \cdots \otimes N_l,\ f) = \int f(x_0, x_1, \dots, x_l)\,m(dx_0)\,N_1(x_0, dx_1)\cdots N_l(x_{l-1}, dx_l). \tag{A.1.1}$$
The product of two Markovian kernels $N_1$, $N_2$ on $E$ is the Markovian kernel $N_1N_2$ on $E$ defined by:
$$N_1N_2(x, A) := \int N_1(x, dy)\,N_2(y, A).$$
To be consistent with earlier notation the function to be integrated is placed after the measure. This product is associative.
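As an illustration only, the notation above can be tried out on a finite state space, where a kernel is simply a row-stochastic matrix, a measure is a row vector and a function is a column vector. The following sketch checks the duality $(m, Nf) = (mN, f)$ and the associativity of the product numerically; the three matrices, the measure `m` and the function `f` are arbitrary values chosen for the example, not data from the text.

```python
# Finite state space E = {0, 1, 2}: a kernel is a row-stochastic matrix.

def kernel_product(N1, N2):
    """(N1 N2)(x, z) = sum over y of N1(x, y) N2(y, z)."""
    n = len(N1)
    return [[sum(N1[x][y] * N2[y][z] for y in range(n)) for z in range(n)]
            for x in range(n)]

def measure_times_kernel(m, N):
    """(mN)(y) = sum over x of m(x) N(x, y)."""
    n = len(N)
    return [sum(m[x] * N[x][y] for x in range(n)) for y in range(n)]

def kernel_times_function(N, f):
    """(Nf)(x) = sum over y of N(x, y) f(y)."""
    return [sum(row[y] * f[y] for y in range(len(f))) for row in N]

def pairing(m, f):
    """(m, f) = integral of f with respect to m."""
    return sum(mi * fi for mi, fi in zip(m, f))

# Arbitrary illustrative data (assumptions of this sketch).
N1 = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
N2 = [[0.9, 0.1, 0.0], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4]]
N3 = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
m = [0.2, 0.3, 0.5]
f = [1.0, 2.0, 4.0]

# Duality (m, Nf) = (mN, f).
duality_gap = abs(pairing(m, kernel_times_function(N1, f))
                  - pairing(measure_times_kernel(m, N1), f))

# Associativity (N1 N2) N3 = N1 (N2 N3).
left = kernel_product(kernel_product(N1, N2), N3)
right = kernel_product(N1, kernel_product(N2, N3))
assoc_gap = max(abs(left[i][j] - right[i][j]) for i in range(3) for j in range(3))

# The product of two Markovian kernels is again Markovian: rows sum to 1.
row_sums = [sum(row) for row in kernel_product(N1, N2)]
```

Placing the measure on the left and the function on the right of the kernel is exactly the row-vector and column-vector convention of matrix calculus, which is why the product is associative.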
Definition A.1.2. We say that $m$ is an invariant measure for $N$ if $mN = m$.

Exercise A.1.3. Let $m$ be an invariant measure for the kernel $N$ and $f$ positive and measurable. Show for $p \ge 1$ that
$$\int [Nf]^p\,dm \le \int f^p\,dm.$$
Deduce from this that $N$ induces a contraction on $L^p(m)$ that we will continue to denote by $f \mapsto Nf$.
Exercise A.1.4. Let $E$ be finite and $A = (a_{ij})$ a matrix indexed by $E \times E$ such that:
$$\forall i,\ \sum_j a_{ij} = 0 \quad\text{and}\quad \forall i \ne j,\ a_{ij} \le 0. \tag{A.1.2}$$
Show that Nt := exp(-tA) is a semi-group of kernels on E. (Hint: The positivity is clear for small t, and extends to the general case by the semigroup property.) Construct in the general case of a measurable space E the analogous semi-group associated to A := Id -N where N is a Markovian kernel on E.
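The claim of Exercise A.1.4 can be checked numerically on a small example. The sketch below is not a proof: it takes one arbitrary $3 \times 3$ matrix `A` satisfying (A.1.2), computes $\exp(-tA)$ with a hand-written scaling-and-squaring routine (the helper names `mat_mul`, `mat_exp` and `semigroup` are hypothetical, introduced only for this illustration), and verifies that the result has nonnegative entries, rows summing to 1, and the semi-group property $N_sN_t = N_{s+t}$.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(M, squarings=10, terms=20):
    """exp(M) by scaling and squaring with a truncated Taylor series."""
    n = len(M)
    scale = 2.0 ** squarings
    small = [[M[i][j] / scale for j in range(n)] for i in range(n)]
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result, term = identity, identity
    for k in range(1, terms + 1):
        # term becomes small^k / k!
        term = [[v / k for v in row] for row in mat_mul(term, small)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def semigroup(A, t):
    """N_t = exp(-tA)."""
    return mat_exp([[-t * a for a in row] for row in A])

# Arbitrary matrix satisfying (A.1.2): zero row sums, off-diagonal entries <= 0.
A = [[1.0, -1.0, 0.0],
     [-0.5, 1.5, -1.0],
     [0.0, -2.0, 2.0]]

N_s, N_t, N_st = semigroup(A, 0.3), semigroup(A, 0.7), semigroup(A, 1.0)

row_sums = [sum(row) for row in N_st]          # each should be 1
min_entry = min(min(row) for row in N_st)      # should be >= 0
product = mat_mul(N_s, N_t)
semigroup_gap = max(abs(product[i][j] - N_st[i][j])
                    for i in range(3) for j in range(3))
```

The routine mirrors the hint of the exercise: for the small time $t/2^{10}$ the matrix $\exp(-tA/2^{10})$ is visibly positive, and repeated squaring, that is the semi-group property, carries positivity to time $t$.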
A.1.2. Markov chains. Let $(X_n)$ be a process indexed by the set $T$, which will denote either $\mathbb N$ or $\mathbb Z$, on a probability space $(\Omega, \mathcal F, P)$, with values in a measurable space $(E, \mathcal E)$. Let $\mathcal F_I$ be the $\sigma$-field generated by the random variables $X_n$, $n \in I \subset T$. In particular we write:
$$\mathcal F_n := \mathcal F_{\{n, n-1, \dots\}}, \qquad \mathcal F_n' := \mathcal F_{\{n, n+1, \dots\}}.$$
We say that the process possesses the Markov property if, for any $n$ and any bounded function $\psi$ measurable with respect to $\mathcal F_n'$, we have $P$-almost surely that:
$$E(\psi \mid \mathcal F_n) = E(\psi \mid \mathcal F_{\{n\}}).$$
By first considering $\psi$ of the form $\varphi_0(X_n)\varphi_1(X_{n+1})\cdots\varphi_k(X_{n+k})$ with $k > 0$, we see that this property is equivalent to the following apparently weaker property: for any bounded measurable function $f$ on $E$, we have almost surely that:
$$E(f(X_{n+1}) \mid \mathcal F_n) = E(f(X_{n+1}) \mid \mathcal F_{\{n\}}). \tag{A.1.3}$$
We will say that the process is a homogeneous Markov process with transition kernel $N$ if, in addition to (A.1.3) we have:
$$E(f(X_{n+1}) \mid \mathcal F_n) = Nf(X_n).$$
We can also express the Markov property in the following form: outside of a negligible set of points $(x, x_1, x_2, \dots)$ with respect to the law for $(X_n, X_{n-1}, \dots)$,
$$E(f(X_{n+1}) \mid X_n = x \text{ and } X_{n-1} = x_1 \text{ and } X_{n-2} = x_2 \text{ and} \dots) = Nf(x).$$
The law for $X_0$ is called the initial law for the process. Given any probability $m$ and any kernel $N$, one can show that it is always possible to construct processes $(X_n)$, $n \in \mathbb N$, with initial law $m$ and transition kernel $N$. The corresponding marginal laws $\mathcal L(X_0, \dots, X_n)$ are uniquely determined for any $n$ and, in fact, they are equal to $m \otimes N \otimes \cdots \otimes N$ where there are $n$ $N$'s. A process is said to be canonical when $\Omega = E^T$, the random variable $X_n$ is the $n$th coordinate of this product, and $\mathcal F$ is the $\sigma$-field generated by the $X_n$. From the image of the measure $P$ for an arbitrary process, one can construct a canonical process that will have the same marginal laws $\mathcal L(X_{t_1}, \dots, X_{t_k})$ for any finite set of times.

Stationary Markov chains. We will consider a canonical Markov chain: $(E^{\mathbb N}, \mathcal F, X_n, n \in \mathbb N)$. Let $T$ be the shift to the left operator on $E^{\mathbb N}$ onto itself defined by $T(x)_i := x_{i+1}$. Since $X_i \circ T = X_{i+1}$, the law of $(X_0, X_1, \dots, X_{n-1})$ for $T(P)$ is the law of $(X_1, X_2, \dots, X_n)$ for $P$. Since a probability measure on $E^{\mathbb N}$ is determined by its marginal laws we see that $P$ is $T$-invariant if and only if for any $k$ the law for $(X_i, X_{i+1}, \dots, X_{i+k})$ is independent of $i$. We say in this case that the process is stationary. We will now consider a homogeneous Markov chain with transition kernel $N$ and initial law $m$:
Proposition A.1.5. The process is stationary if and only if $m$ is an invariant measure for $N$.

PROOF. Suppose that the process is stationary. The law for the pair $(X_0, X_1)$ is $m(dx_0) \otimes N(x_0, dx_1)$ and thus the law for $X_1$ is $mN$. Since the process is stationary the law for $X_1$ is also $m$. Conversely, suppose $m$ is an invariant measure for $N$ and let $f$ be any bounded and measurable function. If we set:
$$\varphi(x_1) := \int N(x_1, dx_2)\int N(x_2, dx_3)\cdots\int f(x_1, x_2, \dots, x_{n+1})\,N(x_n, dx_{n+1}),$$
the invariance of $m$ gives us:
$$E(f(X_1, X_2, \dots, X_{n+1})) = \int m(dx_0)\int\varphi(x_1)\,N(x_0, dx_1) = \int\varphi(x_0)\,m(dx_0) = \int m(dx_0)\int N(x_0, dx_1)\cdots\int f(x_0, x_1, \dots, x_n)\,N(x_{n-1}, dx_n) = E(f(X_0, X_1, \dots, X_n)),$$
which is the relation we are looking for since f is arbitrary.
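For a finite state space the proposition can be observed directly. The following sketch, with a kernel `N` and measures chosen arbitrarily for illustration, computes the law of the pair $(X_i, X_{i+1})$, which is $(mN^i)(x)\,N(x, y)$, and compares it for $i = 0$ and $i = 3$: the two laws agree when the initial law is invariant and differ when the chain starts from a point mass.

```python
def step(m, N):
    """One step of the chain on laws: m -> mN."""
    n = len(N)
    return [sum(m[x] * N[x][y] for x in range(n)) for y in range(n)]

def pair_law(m, N, i):
    """Law of (X_i, X_{i+1}) for initial law m: (m N^i)(x) N(x, y)."""
    law_i = m
    for _ in range(i):
        law_i = step(law_i, N)
    n = len(N)
    return [[law_i[x] * N[x][y] for y in range(n)] for x in range(n)]

def max_gap(P, Q):
    """Largest entrywise difference between two laws on E x E."""
    n = len(P)
    return max(abs(P[x][y] - Q[x][y]) for x in range(n) for y in range(n))

# Illustrative kernel on E = {0, 1, 2} (an assumption of this sketch).
N = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]

invariant = [0.25, 0.5, 0.25]     # satisfies mN = m for this kernel
point_mass = [1.0, 0.0, 0.0]      # not invariant

stationary_gap = max_gap(pair_law(invariant, N, 0), pair_law(invariant, N, 3))
nonstationary_gap = max_gap(pair_law(point_mass, N, 0), pair_law(point_mass, N, 3))
```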
If $m$ is an invariant measure for $N$, we can construct chains that are indexed by times $\mathbb Z$. The important notion of reversibility of a process we are about to define is much stronger than invariance.

Definition A.1.6. We say that a measure $\lambda$ on $E$ is reversible for $N$ when $\lambda \otimes N$ is a symmetric measure on $E \times E$.

If $\lambda$ is reversible for $N$, we have, for all positive $f$ and $g$ that:
$$\int_E g(x)\,Nf(x)\,\lambda(dx) = \int_E Ng(x)\,f(x)\,\lambda(dx).$$
In particular, a reversible measure for $N$ is an invariant measure for $N$. To see this just set $f = 1$ in the above identity. If, in addition, $\lambda$ is $\sigma$-finite, the monotone class theorem will show that the preceding condition is also sufficient for the reversibility of $\lambda$. It can also be shown that the $\sigma$-finite measure $\lambda$ is reversible for $N$ if and only if $N$ induces a self-adjoint contraction on $L^2(\lambda)$. When $\lambda$ is a probability measure, the reversibility of $\lambda$ is equivalent to the following condition: let $\rho$ be the time reversal mapping of $E^{\mathbb Z}$ defined by $[\rho(x)]_n = x_{-n}$; then the law of the canonical stationary Markov process defined by the transition kernel $N$ and initial law $\lambda$ is invariant with respect to $\rho$. Verifying a criterion that is equivalent to reversibility is, in general, much simpler than directly establishing reversibility. For example, when $E$ is denumerable, reversibility can be written:
$$\text{for any } x \text{ and } y \text{ in } E, \quad \lambda_xN_{x,y} = \lambda_yN_{y,x}.$$

Example A.1.7. If we consider the values $W_n$ of the Brownian motion at integral times we obtain a Markov chain with transition kernel $N_1$. Lemma 2.2.10 shows that Lebesgue measure is reversible for $N_1$.
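The criterion $\lambda_xN_{x,y} = \lambda_yN_{y,x}$ for denumerable $E$ is easy to test by machine on a finite set. The sketch below uses two kernels chosen for illustration (they are not taken from the text): a birth-and-death kernel that is reversible for its invariant law, and a cyclic kernel for which the uniform law is invariant but not reversible, which shows concretely that reversibility is strictly stronger than invariance.

```python
def is_invariant(lam, N, tol=1e-12):
    """Check lam N = lam."""
    n = len(N)
    return all(abs(sum(lam[x] * N[x][y] for x in range(n)) - lam[y]) <= tol
               for y in range(n))

def is_reversible(lam, N, tol=1e-12):
    """Check lam_x N(x, y) = lam_y N(y, x) for all x, y."""
    n = len(N)
    return all(abs(lam[x] * N[x][y] - lam[y] * N[y][x]) <= tol
               for x in range(n) for y in range(n))

# Birth-and-death kernel: reversible for lam = (1/4, 1/2, 1/4).
birth_death = [[0.5, 0.5, 0.0],
               [0.25, 0.5, 0.25],
               [0.0, 0.5, 0.5]]
lam = [0.25, 0.5, 0.25]

# Cyclic kernel with a preferred direction of rotation: doubly stochastic,
# so the uniform law is invariant, but the two directions are not balanced.
cyclic = [[0.0, 0.75, 0.25],
          [0.25, 0.0, 0.75],
          [0.75, 0.25, 0.0]]
uniform = [1.0 / 3.0] * 3

results = {
    "birth_death invariant": is_invariant(lam, birth_death),
    "birth_death reversible": is_reversible(lam, birth_death),
    "cyclic invariant": is_invariant(uniform, cyclic),
    "cyclic reversible": is_reversible(uniform, cyclic),
}
```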
Jump processes. In the case when the state space $E$ is finite, the generalization of Markov chains to the case of continuous time is fairly simple. For example, the interested reader can consult Rozanov's book [Roz87]. We begin with a matrix $A$ satisfying the hypotheses (A.1.2) and the Markovian semi-group $N_t := \exp(-tA)$. Then there exists a Markov process $(X_t)$, $t \in \mathbb R_+$, with values in $E$ with transition semi-group $(N_t)$, i.e.:
$$E(f(X_{s+t}) \mid \mathcal F_s) = N_tf(X_s), \qquad \mathcal F_s = \sigma(X_u,\ u \in \mathbb R_+,\ u \le s). \tag{A.1.4}$$
We can also construct a version of this process for which the trajectories are piece-wise constant and right continuous. In the case where K is a transition matrix for which the diagonal elements are zero and A = I - K, it is possible to envision the process as follows: Xt is a body that sits at a point x of E
for a certain random waiting time governed by an exponential law with parameter 1, then jumps to another point y, with probability K(x, y), with the waiting times being independent for each site.
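The description above translates into a short simulation. The sketch below is only an illustration under stated assumptions: the two-point matrix `K`, the seed and the sample size are arbitrary choices, and the function names are hypothetical. For $K$ exchanging two states, $A = I - K$ has eigenvalues $0$ and $2$, so $N_t(0,0) = (1 + e^{-2t})/2$, which the empirical frequency should approximate.

```python
import math
import random

def simulate_jump_path(K, x0, horizon, rng):
    """Jump times and successive states of the process up to time `horizon`."""
    times, states = [0.0], [x0]
    t, x = 0.0, x0
    while True:
        t += rng.expovariate(1.0)      # exponential waiting time, parameter 1
        if t > horizon:
            return times, states
        # jump to y with probability K(x, y)
        x = rng.choices(range(len(K)), weights=K[x])[0]
        times.append(t)
        states.append(x)

def state_at(times, states, t):
    """Right-continuous evaluation: the state after the last jump time <= t."""
    current = states[0]
    for s, x in zip(times, states):
        if s <= t:
            current = x
        else:
            break
    return current

# Transition matrix with zero diagonal on E = {0, 1} (illustrative choice).
K = [[0.0, 1.0],
     [1.0, 0.0]]

rng = random.Random(0)
t, samples = 0.5, 20000
hits = 0
for _ in range(samples):
    times, states = simulate_jump_path(K, 0, t, rng)
    hits += state_at(times, states, t) == 0

empirical = hits / samples                      # estimate of N_t(0, 0)
exact = (1.0 + math.exp(-2.0 * t)) / 2.0        # exp(-t(I - K))(0, 0)
```

The trajectories produced this way are piecewise constant and right continuous, as in the version of the process mentioned above.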
A.2. Bounded real measures

Since we will only consider bounded measures we will, in general, omit explicit mention of this property.

Definition A.2.1. A real measure on a measurable space $(E, \mathcal E)$ is a mapping $\mu$ from $\mathcal E$ to $\mathbb R$ such that for any sequence $A_n$ of disjoint measurable sets, the series $\sum_{n=0}^{\infty}\mu(A_n)$ converges absolutely and its sum equals $\mu\bigl(\bigcup_n A_n\bigr)$.

Exercise A.2.2. Show that if we remove the word "absolutely", we obtain an equivalent definition.

The following theorem allows us to reduce results about measures to the special case of positive measures.

Theorem A.2.3 (Jordan-Hahn). Any real measure $\mu$ can be decomposed in a unique way as the difference of two mutually singular positive measures.
The qualification "mutually singular" says that there exists $A$ with $\mu_+(A) = 0$ and $\mu_-(A^c) = 0$. If the values taken by a measure are finite, it will automatically be bounded. In any case, it would be neither useful nor agreeable to consider signed measures taking infinite values. If we define $|\mu| := \mu_+ + \mu_-$ the theorem implies that $|\mu|(C) = \sup_{B \subset C}\bigl(\mu(B) - \mu(C \setminus B)\bigr)$ for any $C \in \mathcal E$. This measure is called the absolute value of $\mu$. If $f$ is a $|\mu|$-integrable function, we define $\int f\,d\mu$ to be $\int f\,d\mu_+ - \int f\,d\mu_-$.

Examples A.2.4. Here are three examples of signed measures.
(1) The difference of two bounded positive measures;
(2) for $f \in L^1(\nu)$ where $\nu$ is a positive $\sigma$-finite measure, we have the signed measure $f\nu$ defined by:
$$(f\nu)(A) = \int_A f\,d\nu;$$
(3) for any continuous linear form $l$ on $C(E)$ where $E$ is a compact metrizable space, there exists by the Riesz Representation Theorem a unique real measure $\mu$ such that:
$$\forall f \in C(E) \quad l(f) = \int_E f\,d\mu.$$
We denote by $\|\cdot\|_\infty$ the uniform norm of bounded functions, by $\mathcal M$ the vector space of all real measures on $E$ and we define the total variation of a measure $\mu$ by:
$$\|\mu\|_{vt} = \int_E d|\mu|.$$

Proposition A.2.5. Let $\mu$ be a bounded measure. Then the total variation of $\mu$ equals the norm of the linear functional defined by $\mu$ on the space of bounded measurable functions:
$$\|\mu\|_{vt} = \sup\Bigl\{\int f\,d\mu : f \in \mathcal L^\infty,\ \|f\|_\infty \le 1\Bigr\} \tag{A.2.1}$$
and the space $(\mathcal M, \|\cdot\|_{vt})$ is complete.

PROOF. Utilizing the decomposition of $\mu$ into its positive and negative parts, we see that:
$$\int f\,d\mu = \int f\,d\mu_+ - \int f\,d\mu_- \le \int |f|\,d\mu_+ + \int |f|\,d\mu_- = \int |f|\,d|\mu| \le \|f\|_\infty\int d|\mu|,$$
which proves that the left-hand member of (A.2.1) is greater than or equal to the right-hand member. Going in the opposite direction, we consider $f = \mathbb 1_{A^c} - \mathbb 1_A$ where $A$ is such that $\mu_+(A) = 0$ and $\mu_-(A^c) = 0$. The relation
$$\mu(f) = \mu_+(A^c) + \mu_-(A) = \mu_+(E) + \mu_-(E) = \int d|\mu|$$
shows that the sup bound on the right-hand side of (A.2.1) is attained. From the formula that we just established it follows that $\|\cdot\|_{vt}$ is a norm. In fact, $\mathcal M$ is identified with a subset of the dual of the space of bounded measurable functions with the uniform norm.

We now turn to proving completeness. Let $\mu_k$ be a Cauchy sequence in $\mathcal M$. Since the masses $\int d|\mu_k|$ are uniformly bounded, $\alpha := \sum_k 2^{-k}|\mu_k|$ is a bounded positive measure. Since all of the $\mu_k$ are absolutely continuous with respect to $\alpha$ we can apply the Radon-Nikodym Theorem to obtain $|\mu_k| = g_k\alpha$, from which we deduce that $\mu_k = f_k\alpha$ where $f_k := (\mathbb 1_{A_k^c} - \mathbb 1_{A_k})\,g_k$, and where the set $A_k$ is as above. It is easy to establish the formula
$$\|f\alpha\|_{vt} = \int |f|\,d\alpha = \|f\|_{L^1(\alpha)},$$
for any $f$ in $L^1(\alpha)$, which shows that the sequence $(f_k)$ is Cauchy in the Banach space $L^1(\alpha)$ and thus converges to a function $f$ in $L^1(\alpha)$. The same formula shows that $\mu_k$ converges to $f\alpha$. □

Footnote 1: Unfortunately, the measure $|\mu|$ is also called the total variation.
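On a finite set the identity (A.2.1) can be verified by brute force. In the sketch below a signed measure is a list of real weights (the values are arbitrary and chosen for illustration). Since $f \mapsto \int f\,d\mu$ is linear, its supremum over $\|f\|_\infty \le 1$ is reached at a function with values $\pm 1$, so it suffices to enumerate sign patterns; the maximiser is $+1$ where $\mu_+$ lives and $-1$ where $\mu_-$ lives, as in the proof.

```python
from itertools import product

# A signed measure on E = {0, 1, 2, 3}, given by its weights (illustrative).
mu = [0.5, -1.25, 0.0, 2.0]

# Jordan-Hahn decomposition: positive and negative parts.
mu_plus = [max(v, 0.0) for v in mu]
mu_minus = [max(-v, 0.0) for v in mu]

# Total variation: integral of d|mu| = mu_plus(E) + mu_minus(E).
total_variation = sum(mu_plus) + sum(mu_minus)

# Right-hand side of (A.2.1), restricted to functions with values +1 or -1.
sup_over_signs = max(sum(s * v for s, v in zip(signs, mu))
                     for signs in product((-1.0, 1.0), repeat=len(mu)))

# The maximiser used in the proof: +1 on the support of mu_plus, -1 elsewhere.
maximiser = [1.0 if v >= 0 else -1.0 for v in mu]
attained = sum(s * v for s, v in zip(maximiser, mu))
```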
To finish our discussion we recall that $\mathcal M$ is ordered, i.e., $\mu \le \nu$ if and only if $\mu(A) \le \nu(A)$ for any $A \in \mathcal E$ and that, for this order, there exists the upper and lower bound of two measures. For example, one can show that:
$$(\alpha \wedge \rho)(A) = \inf\Bigl\{\sum_{i \in I}\alpha(B_i) \wedge \rho(B_i) \ \Big|\ \text{for finite partitions } (B_i,\ i \in I) \text{ of } A\Bigr\}.$$
It is not necessary to worry about such formulas because given two measures we can always find a positive measure with respect to which the two measures are absolutely continuous and it suffices to take the lower envelope of the two
densities. We note that if $\nu$ and $\rho$ are positive, they are mutually singular if $\nu \wedge \rho = 0$.

Exercise A.2.6 (mixing kernels). Let a kernel $N$ on $E$ be such that there exists a positive measure $\alpha$ of mass $a < 1$ satisfying $\forall x\ \ \alpha(dy) \le N(x, dy)$. Prove for any two probability measures $m_1$ and $m_2$ on $E$ that:
$$\|m_1 - m_2\|_{vt} = 2\sup\Bigl\{\frac{|m_1(f) - m_2(f)|}{\operatorname{osc}(f)} : f \in \mathcal L^\infty \setminus C\Bigr\}$$
where $C$ is the vector space of constant functions. Deduce from this that $N$ defines a strictly contracting mapping with ratio $(1 - a)$ in the convex set of probability measures on $E$, that $N$ has a unique invariant probability measure $\mu$, and that:
$$\|\nu N^p - \mu\|_{vt} \le 2(1 - a)^p,$$
for any initial probability measure $\nu$. Prove that $N$ is an operator of norm less than $(1 - a)$ in the subspace $D$ of $\mathcal L^\infty$ of bounded measurable functions with zero mean with respect to $\mu$. With the aid of the Riesz-Thorin Interpolation Theorem (see, for example, [DS63]), prove that $N$ defines an operator on $L^2(\mu)$ with norm at most $\sqrt{1 - a}$. In the case where $E$ is finite, establish that the mixing condition is written $\max_y\min_x N(x, y) > 0$ and that then we can take $a = \sum_y\min_x N(x, y)$.
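The finite case of Exercise A.2.6 lends itself to a numerical check, given here as a sketch and not as a solution. The kernel `N` is an arbitrary example; the code computes $a = \sum_y\min_x N(x, y)$ and verifies the bound $\|\nu N^p - \mu\|_{vt} \le 2(1 - a)^p$ for the first few $p$, starting from a point mass.

```python
def step(m, N):
    """m -> mN for a measure m and a kernel N on a finite set."""
    n = len(N)
    return [sum(m[x] * N[x][y] for x in range(n)) for y in range(n)]

def total_variation(m1, m2):
    """||m1 - m2||_vt = integral of d|m1 - m2| on a finite set."""
    return sum(abs(u - v) for u, v in zip(m1, m2))

# Illustrative kernel on E = {0, 1, 2}.
N = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]

# Mixing constant a = sum over y of min over x of N(x, y).
a = sum(min(N[x][y] for x in range(3)) for y in range(3))

mu = [0.25, 0.5, 0.25]      # invariant probability measure of N
nu = [1.0, 0.0, 0.0]        # initial law: point mass at 0

bounds_hold = []
law = nu
for p in range(1, 11):
    law = step(law, N)
    bounds_hold.append(total_variation(law, mu) <= 2 * (1 - a) ** p + 1e-12)
```

For this kernel only the middle column has a positive minimum, so $a = 0.5$ and the distance to $\mu$ is at least halved at each step.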
A.3. The topology of weak convergence

Let $E$ be a Polish space, $C_b(E)$ the space of bounded real continuous functions on $E$, and $\mathcal M_b(E)$ the cone of positive Borel measures on $E$.

Definition A.3.1. The topology of weak convergence on $\mathcal M_b(E)$ is the least fine topology for which the mappings $\mu \mapsto \mu(f)$ from $\mathcal M_b(E)$ to $\mathbb R$ are continuous for all $f$ in $C_b(E)$.

It is easy to see that such a topology exists and that a basis for this topology is the collection consisting of the "elementary open sets" $\{\mu \mid a_1 < \mu(f_1) < b_1, \dots, a_n < \mu(f_n) < b_n\}$ where $n$ is any positive integer and where the $f_i$ are bounded continuous functions. The following result is technically very useful.
Proposition A.3.2. Let $I$ be a Hausdorff space. In order for a mapping $i \mapsto \mu_i$ of $I$ to $\mathcal M_b(E)$ with the weak topology to be continuous, it is necessary and sufficient that for any open set $U$ of $E$ the mapping $i \mapsto \mu_i(U)$ is lower semi-continuous and that $i \mapsto \mu_i(1)$ is continuous.

PROOF. We, first of all, construct a denumerable set $\mathcal F$ of positive bounded measurable functions on $E$ that is stable under the "sup" operation, i.e., $f, g \in \mathcal F$ implies $\sup(f, g) \in \mathcal F$, which has the property that the characteristic function of any open set $U$ in $E$ can be written $\mathbb 1_U = \sup_n f_n$ where the functions $f_n$ belong to $\mathcal F$. Let $x_i$ be a dense sequence in $E$ and let $\mathcal B$ be the denumerable set of open balls with rational radius whose centers belong to this sequence. Since any open set $V$ of $E$ can be written as a countable union of elements in $\mathcal B$, it suffices to check that $\mathcal F$ has the desired property for $U \in \mathcal B$. If $U = B(x_i, r)$ we have:
$$\mathbb 1_U = \sup_n g_{n,i,r} \quad\text{with}\quad g_{n,i,r}(x) = [n\,d(x, U^c)] \wedge 1;$$
and thus it suffices to form $\mathcal F$ as the set of all functions which are the "sup" of finite sets of the functions $g_{n,i,r}$.

Let $U$ be an open set. Then we can write $\mathbb 1_U = \sup_n f_n$, where the sequence $f_n$ is increasing in $\mathcal F$, since we can replace $f_n$ by $\sup(f_1, f_2, \dots, f_n)$ if necessary. Applying Beppo Levi's Theorem gives us:
$$\mu(U) = \sup_n \mu(f_n). \tag{A.3.1}$$
Since the mappings $i \mapsto \mu_i(f_n)$ are continuous for any $n$, the upper envelope $i \mapsto \mu_i(U)$ is lower semi-continuous. Conversely, we suppose that for any open set $U$, the preceding mappings are lower semi-continuous and let $f$ be a positive continuous function bounded by $b$. We denote by $\omega_p$ the open set $\{x \mid f(x) > p\}$. Fubini's Theorem then implies:
$$\mu(f) = \int_0^\infty \mu(\omega_p)\,dp \quad\text{since}\quad f(x) = \int_0^\infty \mathbb 1_{\omega_p}(x)\,dp.$$
We can restrict the integral to the interval $[0, b]$. Since the function $p \mapsto \mu_i(\omega_p)$ is decreasing, $\mu_i(f)$ is the increasing limit of the Riemann sums:
$$b\,2^{-n}\sum_{p=1}^{2^n}\mu_i(\omega_{bp2^{-n}}).$$
Since each of these sums is lower semi-continuous as a function of $i$, the same is true for $\mu_i(f)$. By replacing $f$ by $b - f$ in the previous argument we obtain the upper semi-continuity and therefore the continuity of $i \mapsto \mu_i(f)$. This extends to any bounded continuous function, since such a function can be decomposed into the difference of two bounded positive continuous functions. Finally, from the definition of weak convergence, we see that $i \mapsto \mu_i$ is continuous. □

Corollary A.3.3. The weak topology is metrizable.
PROOF. Let $\phi$ be the mapping $\mathcal M_b(E) \to \mathbb R^{\mathcal F}$ that associates the family $(\mu(f))_{f \in \mathcal F}$ to $\mu$, where $\mathcal F$ is defined in the proof above. It is easy to see that $\mathbb R^{\mathcal F}$ is metrizable since $\mathcal F$ is countable. We show that $\phi$ is injective. Let $\mu$ and $\nu$ be two measures such that $\phi(\mu) = \phi(\nu)$ and $U$ an open set. By the relation (A.3.1) we have:
$$\mu(U) = \sup_n \mu(f_n) = \sup_n \nu(f_n) = \nu(U), \tag{A.3.2}$$
i.e., the two measures coincide on the open sets and thus on the $\sigma$-field of Borel sets.

Since $\phi$ is injective we can identify $\mathcal M_b(E)$ with a subset of $\mathbb R^{\mathcal F}$ and since the latter space is metrizable with metric $\delta$, this induces a metric on $\mathcal M_b(E)$. Clearly since $\phi$ is continuous, $\delta$ defines a less fine topology than the weak topology. To show the equality of these topologies we utilize the preceding proposition that says that it is sufficient to show that the mappings $\mu \mapsto \mu(U)$ are lower semi-continuous with respect to the $\delta$ topology. But since $f_n \in \mathcal F$, the mapping $\mu \mapsto \mu(f_n)$ is continuous by construction of $\delta$, which completes the proof. □
Corollary A.3.4. Let $\mathcal B$ be a basis of open sets containing $E$ that is stable under finite unions and let $\mathcal G$ be a subset of $C_b(E)$ containing 1 such that the indicator (characteristic) function of any element in $\mathcal B$ is the limit of an increasing sequence of elements of $\mathcal G$. Any sequence of measures $(\mu_n)$ converges weakly to $\mu$ if and only if for any $f \in \mathcal G$ the sequence $\mu_n(f)$ converges to $\mu(f)$.

Example A.3.5. Consider $E = \mathbb R^S$ where $S$ is denumerable. We can take for $\mathcal B$ the collection of open sets depending on only a finite number of coordinates, i.e., sets of the form $\mathbb R^{S \setminus L} \times U$ where $U$ is an open set in $\mathbb R^L$ and $L$ is a finite subset of $S$, and for $\mathcal G$ the bounded continuous functions that depend on only a finite number of coordinates.

Exercise A.3.6. Show that any lower semi-continuous function $f : E \to \mathbb R_+$ is the limit of an increasing sequence of linear combinations of indicator functions of open sets with positive coefficients. Deduce from this that $\mu \mapsto \mu(f)$ is lower semi-continuous.

The interest of the metrizability result is that we will be able to study the weak topology with the aid of weakly convergent sequences.
Definition A.3.7. We say that a set $A$ of positive bounded measures on $E$ is tight if for any $\varepsilon > 0$, we can find a compact $K \subset E$ such that:
$$\forall \mu \in A \quad \mu(K^c) \le \varepsilon,$$
and that it is bounded if the set of total masses $\mu(E)$ is bounded.

A very important fact is that a set consisting of a single measure $\mu$ is tight, and therefore for any $\varepsilon > 0$, there is a compact set $K_\varepsilon$ that supports $\mu$ up to $\varepsilon$. This property of bounded measures on Polish spaces is all the more important because it can be extended to Souslin spaces (continuous images of Polish spaces), which include all of the usual separable spaces. The interested reader can consult [DM75] for this result which naturally leads to the following more elementary result:

Theorem A.3.8 (compactness criterion of Prokhorov). Any set $A$ of bounded positive measures that is tight and bounded is relatively compact in the weak topology.
PROOF. First of all, we will assume that the result is true when $E$ is compact. Let $\mu_n$ be a sequence of measures in $A$. Because of the tightness of $A$, we can define an increasing sequence $K_i$ of compact subsets of $E$ such that:
$$\sup\{\mu(E \setminus K_i) \mid \mu \in A\} \le 1/i.$$
We denote by $\mu_{n,i}$ the restriction of $\mu_n$ to $K_i$. By utilizing the result for the compact case we can extract a subsequence $k \mapsto \mu_{n(1,k),1}$ of the sequence $\mu_{n,1}$ that converges to a measure $\mu_{\infty,1}$ supported by $K_1$; next, we can extract from the sequence $\mu_{n(1,k),2}$ a subsequence $\mu_{n(2,k),2}$ that converges to a measure $\mu_{\infty,2}$ supported by $K_2$ and continuing in this way we can, for any positive integer $i$, find a subsequence $k \mapsto \mu_{n(i,k),i}$ converging to $\mu_{\infty,i}$, a measure supported by $K_i$.

It is easy to see, if we consider the "diagonal" sequence $(n(p,p))$ associated to the array $n(p,k)$, that the sequence of measures $p \mapsto \mu_{n(p,p),i}$ converges to $\mu_{\infty,i}$, for any $i$. By setting $\mu_\infty(B) := \lim_i \mu_{\infty,i}(B)$, for any Borel set $B$, we define a measure. The additivity of this set function is clear. The $\sigma$-additivity follows from the fact that the sequences $\mu_{\infty,i}(B)$ are increasing in $i$ combined with Beppo Levi's Theorem:
$$\lim_i \mu_{\infty,i}\Bigl(\bigcup_j B_j\Bigr) = \lim_i \sum_j \mu_{\infty,i}(B_j) = \sum_j \mu_\infty(B_j)$$
where $B = \bigcup_j B_j$. Clearly the inequality $\mu_\infty(K_i^c) \le 1/i$ continues to hold. Any bounded continuous function $f$ on $E$ whose uniform norm is bounded by $b$ satisfies:
$$|\mu_\infty(f) - \mu_{\infty,i}(f)| \le b/i,$$
since the difference $\mu_\infty - \mu_{\infty,i}$ is a positive measure for any $i$. Thus, for any $\varepsilon > 0$, there exists $i = i_0$ sufficiently large such that:
$$|\mu_\infty(f) - \mu_{\infty,i_0}(f)| < \varepsilon \quad\text{and}\quad b/i_0 < \varepsilon.$$
There also exists $p_0$ such that for $p \ge p_0$:
$$|\mu_{n(p,p),i_0}(f) - \mu_{\infty,i_0}(f)| < \varepsilon.$$
Finally since $|\mu_{n(p,p),i_0}(f) - \mu_{n(p,p)}(f)| \le b/i_0 < \varepsilon$, we obtain:
$$|\mu_\infty(f) - \mu_{n(p,p)}(f)| < 3\varepsilon,$$
which proves the convergence of the subsequence $(\mu_{n(p,p)})$. Since the sequence $(\mu_n)$ was arbitrary, the weak compactness of $A$ follows.
When $E$ is compact the tightness condition is automatically satisfied. If one is familiar with the notion of ultrafilters, the compactness of $A$ is immediate by the Riesz Representation Theorem. In fact, for any ultrafilter $\mathcal U$ on $\mathbb N$ and any continuous function $f$, $\lim_{\mathcal U}(\mu_n(f))$ exists since the sequence $\mu_n(f)$ is bounded in $\mathbb R$ and this limit as a function of $f$ is a positive linear functional on $C(E)$, thus a measure $\mu$. If one wishes to avoid the recourse to ultrafilters we can utilize the following lemma instead:

Lemma A.3.9. Let $F$ be a compact metric space. There exists a $\mathbb Q$-linear subspace $V$ of $C(F)$ which has the following properties: $V$ is denumerable, contains the function 1, is dense in $C(F)$ and $u \in V$ implies that $|u| \in V$. Then to each positive $\mathbb Q$-linear form $l$ on $V$ such that $l(1) = 1$, there corresponds a unique probability measure $\nu$ such that $l(\psi) = \int_F \psi\,d\nu$, for any $\psi \in V$.

PROOF. Let $y_i$ be a countable dense sequence in $F$ and let $\eta_i$ denote the function $x \mapsto d(x, y_i)$. Let $V$ be the set of functions that is obtained in the following way: choose a finite number of functions $\eta_i$ and the constant function 1 and apply an arbitrary finite sequence of operations of the following type: addition, multiplication by a rational number, take absolute values. Then $V$ has all of the properties stated in the lemma. In fact, the closure $\bar V$ of $V$ with respect to the uniform norm is an $\mathbb R$-linear subspace of $C(F)$ that is stable under taking absolute values, contains 1, and separates points of $F$. Thus the Stone-Weierstrass Theorem implies that $\bar V$ is equal to $C(F)$. Positivity of $l$ implies:
$$|l(\psi)| = |l(\psi_+) - l(\psi_-)| \le l(\psi_+) + l(\psi_-) = l(|\psi|) \le l(\|\psi\|_\infty) = \|\psi\|_\infty,$$
thus $l$ is uniformly continuous on $V$ and extends uniquely to $\bar V = C(F)$. The extension is $\mathbb Q$-linear but it is easy to see by passing to the limit that it is $\mathbb R$-linear. We complete the proof by applying the Riesz Representation Theorem. □

Completion of the proof of the theorem. Since $V$ is denumerable we can enumerate these elements obtaining a sequence $(f_i)$, with $i \in \mathbb N$. By utilizing a diagonal process analogous to the one we used above we can extract from any sequence $\mu_n$ in $A$ a subsequence $\mu_{n(p)}$ such that $\mu_{n(p)}(f_i)$ converges for any $i$. It is then immediate that the limit $l(f_i)$ is $\mathbb Q$-linear and we are able to apply the lemma.
Exercise A.3.10. Let $\phi$ be a function $X \to \mathbb R_+$ such that for any $a \in \mathbb R$, the set $\{x \mid \phi(x) \le a\}$ is compact. Show that $\{\mu \mid \mu(\phi) \le c\}$ is weakly compact for any $c \in \mathbb R$.
Titles in This Series
14 Gilles Royer, An initiation to logarithmic Sobolev inequalities, 2007
13 Olivier Biquard, Asymptotically symmetric Einstein metrics, 2006
12 Fabien Morel, Homotopy theory of schemes, 2006
11 Olivier Debarre, Complex Tori and Abelian varieties, 2005
10 Dominique Cerveau, Etienne Ghys, Nessim Sibony, and Jean-Christophe Yoccoz (with the collaboration of Marguerite Flexor), Complex dynamics and geometry, 2003
9 Xavier Buff, Jérôme Fehrenbach, Pierre Lochak, Leila Schneps, and Pierre Vogel, Moduli spaces of curves, mapping class groups and field theory, 2003
8 José Bertin, Jean-Pierre Demailly, Luc Illusie, and Chris Peters, Introduction to Hodge theory, 2002
7 Jean-Pierre Otal, The hyperbolization theorem for fibered 3-manifolds, 2001
6 Laurent Manivel, Symmetric functions, Schubert polynomials and degeneracy loci, 2001
5 Daniel Alpay, The Schur algorithm, reproducing kernel spaces and system theory, 2001
4 Patrice Le Calvez, Dynamical properties of diffeomorphisms of the annulus and of the torus, 2000
3 Bernadette Perrin-Riou, p-adic L-functions and p-adic representations, 2000
2 Michel Zinsmeister, Thermodynamic formalism and holomorphic dynamical systems, 2000
1 Claire Voisin, Mirror Symmetry, 1999
This book provides an introduction to logarithmic Sobolev inequalities with some important applications to mathematical statistical physics. Royer begins by gathering and reviewing the necessary background material on selfadjoint operators, semigroups, Kolmogorov diffusion processes, solutions of stochastic differential equations, and certain other related topics. There then is a chapter on log Sobolev inequalities with an application to a strong ergodicity theorem for Kolmogorov diffusion processes. The remaining two chapters consider the general setting for Gibbs measures including existence and uniqueness issues, the Ising model with real spins and the application of log Sobolev inequalities to show the stabilization of the Glauber-Langevin dynamic stochastic models for the Ising model with real spins. The exercises and complements extend the material in the main text to related areas such as Markov chains.
For additional information and updates on this book, visit
www.ams.org/bookpages/smfams-14
American Mathematical Society, www.ams.org
Societe Mathematique de France, smf.emath.fr
SMFAMS/14